Figure: Humanoid-GPT architecture diagram showing the scaling approach from motion data to zero-shot tracking
Research · June 11, 2026 · Galbot Inc., Tsinghua University, Shanghai Jiao Tong University, Peking University

Galbot and Tsinghua Unveil Humanoid-GPT: 2B-Frame Corpus Enables Zero-Shot Motion Tracking

Galbot and Tsinghua University researchers introduce Humanoid-GPT, a GPT-style Transformer trained on a 2 billion frame motion corpus for whole-body humanoid control. Unlike prior shallow MLP trackers limited by data scarcity and the agility-generalization trade-off, Humanoid-GPT achieves unprecedented zero-shot generalization to unseen motions while tracking highly dynamic behaviors. The work demonstrates that scaling both data and model capacity is key to unlocking general-purpose humanoid motion tracking.

#humanoid-robot #motion-tracking #scaling #galbot #tsinghua #arXiv #transformer #zero-shot

Researchers at Galbot and Tsinghua University have introduced Humanoid-GPT, an approach to humanoid robot motion tracking that pairs a large-scale motion corpus with a GPT-style Transformer.

The Scaling Revolution in Motion Tracking

Humanoid-GPT is pre-trained on a 2 billion frame retargeted motion corpus that unifies the major motion capture datasets, including Lafan1, AMASS, Motion-X++, PHUMA, and MotionMillion, together with large-scale in-house recordings. The resulting corpus is over 200× larger than prior tracker training sets.
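To put the corpus figures in perspective, the short sketch below works through the arithmetic. The 2 billion frame count and the 200× factor come from the article. The 30 frames-per-second rate is purely an assumption for illustration, since the article does not state the corpus frame rate, and the helper names are hypothetical.

```python
CORPUS_FRAMES = 2_000_000_000  # corpus size stated in the article
SCALE_FACTOR = 200             # "over 200x larger", stated in the article
ASSUMED_FPS = 30               # ASSUMPTION: frame rate is not given in the article


def implied_prior_frames(corpus_frames: int, scale_factor: int) -> int:
    """Upper bound on prior training-set size implied by the scale factor."""
    return corpus_frames // scale_factor


def frames_to_hours(frames: int, fps: float) -> float:
    """Convert a frame count to hours of motion at a given frame rate."""
    return frames / fps / 3600


print(implied_prior_frames(CORPUS_FRAMES, SCALE_FACTOR))   # 10000000
print(round(frames_to_hours(CORPUS_FRAMES, ASSUMED_FPS)))  # 18519
```

In other words, "over 200× larger" implies prior tracker training sets of fewer than about 10 million frames, and at the assumed 30 fps the full corpus would correspond to roughly 18,500 hours of motion.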

Breaking the Agility-Generalization Trade-off

Prior motion trackers suffer from an agility-generalization trade-off: trackers excelling on agile motions often break on unseen styles, while generalizing trackers underfit complex dynamics. Humanoid-GPT breaks this trade-off through systematic scaling.

Key Results

  • Zero-shot generalization to unseen motions and control tasks
  • High-dynamic behavior tracking with unprecedented precision
  • Single generative Transformer replacing multiple specialized trackers
  • 96%+ success rate on real robot deployment tests

Architecture Highlights

The model uses GPT-style causal attention, in which each step attends only to current and past inputs, matching the constraints of online deployment and supporting real-time tracking. Unlike shallow MLPs, the Transformer architecture continues to improve as data and model size grow.
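The causal-attention constraint can be illustrated with a minimal single-head sketch in plain Python. This is not the Humanoid-GPT implementation: the function name, the toy 2-dimensional "frames", and the absence of learned projections are all simplifying assumptions made here for illustration. It only shows the property that matters for online tracking, namely that the output at step t depends on steps 0 through t and never on later ones.

```python
import math


def causal_attention(queries, keys, values):
    """Single-head scaled dot-product attention with a causal mask.

    Each argument is a list of T vectors (lists of floats). The output at
    step t is a weighted average of values[0..t] only.
    """
    dim = len(queries[0])
    outputs = []
    for t, q in enumerate(queries):
        # Score only steps 0..t; future steps are masked out by omission.
        scores = [
            sum(a * b for a, b in zip(q, keys[s])) / math.sqrt(dim)
            for s in range(t + 1)
        ]
        # Numerically stable softmax over the visible steps.
        peak = max(scores)
        weights = [math.exp(score - peak) for score in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Weighted sum of the visible value vectors.
        outputs.append([
            sum(w * values[s][d] for s, w in enumerate(weights))
            for d in range(len(values[0]))
        ])
    return outputs


# Toy check: changing a future frame leaves earlier outputs untouched.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
before = causal_attention(frames, frames, frames)
changed = frames[:2] + [[5.0, -3.0]]
after = causal_attention(changed, changed, changed)
assert before[:2] == after[:2]
```

Because no step reads future inputs, a controller built this way can emit an action as soon as the current observation arrives, which is the property online deployment requires.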

Open Source

Code and project page available at: https://github.com/GalaxyGeneralRobotics/Humanoid-GPT/

Source: arXiv:2606.03985