Researchers from KAIST have developed VOTP (Video-based Optimal Transport Preference labeling), a method that enables robots to learn human judgment criteria from just 10 labeled videos. The research was accepted to ICML 2026 and selected for an Oral presentation, a distinction given to only 168 papers out of 23,918 submissions (0.7%).

Current preference-based reinforcement learning requires hundreds or thousands of human comparisons to train reward functions. VOTP addresses this bottleneck by using optimal transport to infer preferences for unlabeled video pairs from the few pairs that humans have labeled.

In experiments, VOTP with only 10 labels outperformed policies trained with ground-truth rewards. On real tabletop robot tasks using a Rethink Sawyer arm, VOTP achieved an 80% success rate on LiftBanana and 70% on DrawerOpen with just 5-10 preference labels.

Professor Chang D. Yoo stated: "Since VOTP can learn human judgment criteria with only a small number of videos, it is a core technology that will accelerate the era of robots making human-like judgments."
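The article does not describe VOTP's actual formulation, so the following is only a generic illustration of the underlying idea: using an entropic optimal transport plan to carry preference labels from a few labeled items to unlabeled ones. Everything here is an assumption made for illustration, including the function names `sinkhorn_plan` and `propagate_preferences`, the use of fixed feature vectors to stand in for video pairs, the squared Euclidean cost, and the uniform marginals. It should not be read as the authors' algorithm.

```python
import numpy as np

def sinkhorn_plan(cost, reg=0.1, n_iters=200):
    """Entropic optimal transport plan between two uniform distributions."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)          # uniform mass on unlabeled items
    b = np.full(m, 1.0 / m)          # uniform mass on labeled items
    K = np.exp(-cost / reg)          # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):         # Sinkhorn scaling iterations
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]

def propagate_preferences(labeled_feats, labeled_prefs, unlabeled_feats, reg=0.1):
    """Soft preference labels for unlabeled items via the transport plan.

    labeled_prefs[j] is the (hypothetical) probability that the first video
    in labeled pair j is preferred.
    """
    diff = unlabeled_feats[:, None, :] - labeled_feats[None, :, :]
    cost = (diff ** 2).sum(axis=-1)              # squared Euclidean cost
    cost = cost / max(cost.max(), 1e-12)         # scale costs to [0, 1]
    plan = sinkhorn_plan(cost, reg)
    weights = plan / plan.sum(axis=1, keepdims=True)  # row-normalize the plan
    return weights @ labeled_prefs

# Toy data: two labeled pairs and two unlabeled pairs in a 2-D feature space.
labeled_feats = np.array([[0.0, 0.0], [1.0, 1.0]])
labeled_prefs = np.array([0.0, 1.0])
unlabeled_feats = np.array([[0.1, 0.0], [0.9, 1.0]])

soft_labels = propagate_preferences(labeled_feats, labeled_prefs, unlabeled_feats)
```

In this toy setup each unlabeled item sits close to one labeled item, so its soft label ends up near that neighbor's label. A real system would need learned video features and a cost suited to trajectories, neither of which the article specifies.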

Research · June 12, 2026 · Embodied Global Team
KAIST's VOTP Method Enables Robots to Learn Human Judgment from Just 10 Videos
KAIST's VOTP method achieves ICML 2026 Oral acceptance by enabling robots to learn human judgment from just 10 videos using optimal transport mathematics, dramatically reducing annotation costs.
#kaist #votp #icml-2026 #preference-learning #reinforcement-learning #embodied-ai