An Thai Le

Assistant Professor @ VinUniversity · Visiting Professor @ TU Darmstadt

I work at the intersection of Robotics and Machine Learning. My research scales motion planning and policy learning to long-horizon, high-dimensional, and multimodal problems, mainly through tensor search (GPU-batched search and optimization over plan tensors) and by pairing algorithmic structure with generative models (diffusion, flow matching). Lately I focus mostly on humanoid loco-manipulation and vision-language-action policies that stay cheap to run and hold up on real hardware.

On weekends I write JAX/PyTorch simulators for curved spacetimes: Kerr black holes, the Penrose process, and warp-drive energy conditions, because the math is too beautiful to ignore! :)

Press: Báo Tin Tức · DFKI · Popular Mechanics

Research

I try to scale planning to settings classical methods struggle with: long horizons, high-dimensional state spaces, large plan sets, many agents. I do this by treating search as a batched tensor operation and by leaning on generative models where structure runs out. Most current work targets humanoid loco-manipulation and vision-language-action models that stay efficient and reliable enough to run on real robots.

Tensor Search & Batched Planning

Casting search and trajectory optimization as batched tensor operations on the GPU, the spine of my thesis and most of my recent planners.

Diffusion & Flow Matching for Motion

Using diffusion and flow matching as priors over trajectories and policies, especially when the solution landscape is multimodal and gradients alone are not enough. Lately, choosing the initial noise itself to make action chunking smoother and more robust.

Humanoid Loco-manipulation

Whole-body RL and model-based control for humanoids in contact-rich tasks: perceptive locomotion that holds up over rough terrain and under payload, and compliant control that yields safely during contact and cooperative carrying. Ongoing, and many things still fall over.

Vision-Language-Action Models

Making VLA policies cheaper to run and steadier on real robots: equivariant architectures, lighter finetuning, better grounding from fewer demos, and inference runtimes for the edge.

Optimal Transport & Gradient Flows

Borrowing entropic OT and gradient-flow machinery to design planners, blend policies, and train networks where standard gradients break down.

Numerical General Relativity

A weekend hobby: JAX/PyTorch CUDA simulators for processes in curved spacetime, including Kerr orbits, Penrose extraction, and warp-drive energy conditions.

Humanoid Demos

Selected humanoid locomotion and manipulation demonstrations from VinRobotics. These videos show sim-to-real RL policies, perceptive stair climbing, and whole-body control running on our own hardware stack.

Perceptive Stair Locomotion

VR-M3 (~60 kg) climbs unfamiliar staircases at 0.6 m/s with a 5 kg payload using onboard terrain sensing and learned locomotion, without LiDAR, mocap, or teleoperation.

Human-Level Walking Speed

VR-H3 (178 cm, 85 kg) reaches human-level walking speed via RL-based locomotion with gait reward design, domain randomization, and curriculum learning.

Built from the Motor Up

Custom high-torque-density actuators with real-time EtherCAT communication enable 1.5–1.8 m/s dynamic walking. Full native stack, fast iteration.

Real-World Football

A humanoid robot takes part in a real football game: passing, running alongside people, and celebrating goals in an unscripted outdoor environment.

Global Debut

Platform preview for Computex and ICRA 2026: whole-body teleoperation, dynamic payload handling, MPC + RL locomotion, and perception-action learning.

Selected Publications

* indicates co-first or co-last authors. See also my Google Scholar profile.

  • Self-Improving VLA Policies
    Self-Improving VLA Policies: Selected Diffusion Noise for Spurious-Robust Action Smoothing
    D.M. Nguyen, B.N. Dao, T.M. Luu, B.G. Nguyen, V. Tong, A. Liu, V.N. Duong, D.D. Le, D. Sonntag, T. Le, N. Le, J. Peters, An Thai Le, M.N. Vu, M. Niepert, K.D. Doan, D.M.H. Nguyen, V.A. Ngo
    NeurIPS 2026 (submitted)
  • Start Right, Arrive Right
    Start Right, Arrive Right: Asynchronous Execution via Initial Noise Selection
    T.B. Ho*, Q.T. Nguyen*, T.L. Ha*, G.B. Nguyen, V.T. Nguyen, L. Dinh, M.N. Vu, D.M.H. Nguyen, An Thai Le, V.A. Ngo
    CoRL 2026 (submitted)
  • Finetuning VLA Models Requires Fewer Layers Than You Think
    Finetuning Vision-Language-Action Models Requires Fewer Layers Than You Think
    G.B. Nguyen*, T.B. Ho, T.L. Ha, K. Vo, P.L. Møller, Q.T. Nguyen, L. Dinh, T.M. Luu, T. Dam, V. Duong, T. Le, N.D.Q. Bui, M. Vu, T.N. Le, An Thai Le, N. Le, D. Sonntag, J. Zou, J. Peters, D.M.H. Nguyen*, V.A. Ngo*
    CoRL 2026 (submitted)
  • StructSAM
    StructSAM: Structure- and Spectrum-Preserving Token Merging for Segment Anything Models
    D.M.H. Nguyen, T.A. Tran, D. Nguyen, S. Xie, T.Q. Nguyen, M.T.N. Truong, D. Palenicek, An Thai Le, M. Barz, T. Nguyen, T. Dam, N. Le, M. Vu, K. Doan, V. Ngo, P. Xie, J. Zou, D. Sonntag, J. Peters, M. Niepert
    NeurIPS 2026 (submitted)
  • CLOT
    CLOT: Multi-Robot Motion Planning Via Collaborative Optimal Transport under Signal Temporal Logic Tasks
    Y. Zhang, Y. Zhang, An Thai Le, M. Guo
    ICRA 2026
  • FOCA
    M.D. Nguyen, T.D. Nghiem, G.B. Nguyen, T.B. Ho, D.T. Le, Q.T. Nguyen, T.L. Ha, V.N. Tran, B. Thach, X.N. Tran, T.A. Tran, A. Habuda, P.L. Møller, N.L. Tran, D. Sonntag, M. Niepert, K.D. Doan, V.N. Duong, H. Ngo, M.N. Vu, D.M.H. Nguyen, An Thai Le*, V.A. Ngo*
    ICML 2026
  • Rarity of rocket-driven Penrose extraction in Kerr spacetime
  • Motion Planning Diffusion
    João Carvalho, An Thai Le, Philipp Jahr, Qiao Sun, Julen Urain, Dorothea Koert, Jan Peters
    IEEE T-RO 2025 · AAAI 2026
  • DoublyAware
    DoublyAware: Dual Planning and Policy Awareness for Temporal Difference Learning in Humanoid Locomotion
    Khang Nguyen, An Thai Le, Jan Peters, Minh Nhat Vu
    IEEE RA-L 2025
  • Machine Learning with Physics Knowledge
    Machine Learning with Physics Knowledge for Prediction: A Survey
    Joe Watson, Chen Song, Oliver Weeger, Theo Gruner, An Thai Le, Kay Hansel, Ahmed Hendawy, Alexander Arenz, William Trojak, Kyle Cranmer, C Alberto D'Eramo, Felix Buelow, Tanmay Goyal, Jan Peters, Marc W. Hoffmann
    TMLR 2025
  • FlowMP
    FlowMP: Learning Motion Fields for Robot Planning with Conditional Flow Matching
    Khang Nguyen, An Thai Le, T. Pham, M. Huber, Jan Peters, Minh Nhat Vu
    IROS 2025
  • Grasp Diffusion Network
    João Carvalho, An Thai Le, Philipp Jahr, Qiao Sun, Julen Urain, Dorothea Koert, Jan Peters
    2025
  • Structure-Aware E(3)-Invariant
    Structure-Aware E(3)-Invariant Molecular Conformer Aggregation Networks
    D.M.H. Nguyen*, N. Lukashina*, T. Nguyen, An Thai Le, T. Nguyen, N. Ho, Jan Peters, D. Sonntag, V. Zaverkin, M. Niepert
    ICML 2024
  • Dude
    An Thai Le*, D.M.H. Nguyen*, T.Q. Nguyen, N.T. Diep, T. Nguyen, D. Duong-Tran, Jan Peters, L. Shen, M. Niepert, D. Sonntag
    ACML 2024
  • Accelerating Motion Planning via Optimal
    An Thai Le, Georgia Chalvatzaki, Armin Biess, Jan Peters
    NeurIPS 2023
  • Motion Planning Diffusion
    João Carvalho, An Thai Le, Mark Baierl, Dorothea Koert, Jan Peters
    IROS 2023
  • Hierarchical Policy Blending
    An Thai Le, Kay Hansel, Jan Peters, Georgia Chalvatzaki
    L4DC 2023
  • Learning to reason over scene graphs
    Learning to reason over scene graphs: a case study of finetuning GPT-2 into a robot language model for grounded task planning
    Georgia Chalvatzaki, Ali Younes, Daljeet Nandha, An Thai Le, L.F.R. Ribeiro, I. Gurevych
    Frontiers in Robotics and AI 2023
  • Learning Implicit Priors for Motion Optimization
    An Thai Le*, Julen Urain*, Alexander Lambert*, Georgia Chalvatzaki, Byron Boots, Jan Peters
    IROS 2022
  • Learning forceful manipulation
    Learning forceful manipulation skills from multi-modal human demonstrations
    An Thai Le, Meng Guo, Niels Van Duijkeren, L. Rozo, R. Krug, A.G. Kupcsik, M. Buerger
    IROS 2021
  • Hierarchical Human-Motion Prediction
    Hierarchical Human-Motion Prediction and Logic-Geometric Programming for Minimal Interference Human-Robot Tasks
    An Thai Le, P. Kratzer, S. Hagenmayer, M. Toussaint, Jim Mainprice
    IEEE RO-MAN 2021

Experience

Visiting Professor

Oct 2025–Present · Darmstadt, Germany

Co-advising MSc and PhD students at IAS on robot learning research.

Assistant Professor

Oct 2025–Present · Hanoi, Vietnam

Building a research group on efficient learning and planning for robot loco-manipulation, with a focus on fundamental algorithms and methods.

Director of Foundation AI

Aug 2025–Present · Hanoi, Vietnam
  • RL stack for high-payload humanoid locomotion
  • Humanoid VLA architecture and training recipe
  • Model optimization and edge-deployment toolchain

Ph.D. in Computer Science

2022–2025 · Darmstadt, Germany

Thesis: Tensor Search Methods for Vectorizing Motion Planning, supervised by Prof. Jan Peters.

Research Intern

May 2020–Dec 2020 · Renningen, Germany

Worked on forceful imitation learning applied to E-bike assembly tasks, hosted by Dr. Meng Guo in the robotics team.

M.Sc. Information Technology

2019–2021 · Stuttgart, Germany

Thesis: Learning task-parameterized Riemannian motion policies, supervised by Dr. Jim Mainprice and Dr. Meng Guo. Graduated First class. Info-Preis for Best Diploma Award. Sony Research Award. Deutschlandstipendium.

Research Assistant

Nov 2019–Apr 2020 · Stuttgart, Germany

Implemented back-end functionalities in the DASH project; maintained and configured HPC systems.

B.Eng. Electrical Engineering and Information Technology

2015–2019 · Frankfurt, Germany

Thesis: Approaches to solve kidnapped robot problem. Graduated First class. DAAD Scholarship. AmCham Scholarship. eSilicon Scholarship.

Engineer Intern

Jan 2017–May 2017 · Ho Chi Minh City, Vietnam

Designed data analysis systems for high-volume manufacturing unit-test data; validated and reported on the quality of the Intel Thunderbolt product manufacturing line.

Teaching

  • Reinforcement Learning · TU Darmstadt · SS 2022
  • Statistical Machine Learning · TU Darmstadt · SS 2023, WS 2023/24, SS 2024, WS 2024/25
  • Probabilistic Methods for Computer Science · TU Darmstadt · WS 2024/25
  • Robot Learning Integrated Project / Expert Lab / Mechatronics · TU Darmstadt · WS 2024/25

People

Masters Students

Dinh Van The Long

VinRobotics Residents

Trinh Thi Cuc
Dang Truong Duy
Le Anh Chien
Nguyen Viet Duong
Ly Phuc Thanh
Phuong Tuan Dat
Do Tan Dung
Ho Thinh Hung
Ha Thien Loc
Nguyen Quang Tan
Ho Trong Bao
Nguyen Dang Khanh
Le Van Minh
Nguyen Doan Hoang
Nguyen Minh Hoang
Dao Duc Thinh

Alumni

Magnus Dierking
Caio Freitas
Qiao Sun
Denis Andrić
Sebastian Zach

Current Collaborators

Zachary Kingston · Purdue University
Meng Guo · Peking University
Ngo Anh Vien · VinRobotics
Jan Peters · TU Darmstadt
Georgia Chalvatzaki · TU Darmstadt
Viet T. Nguyen · University of Würzburg

Academic Service

Reviewer - Conferences & Area Chair

Area Chair: CoRL, RLC
Reviewer: IROS, ICRA, R:SS, L4DC, NeurIPS, ICML, ICLR, AAAI

Reviewer - Journals

IEEE RA-L, IEEE T-RO, Neurocomputing, TMLR, Frontiers in Robotics and AI