Reward-guided imitation learning for dexterous manipulation
A foundation model that enables humanoid robots to pick up unseen objects and perform real-world work. It generalizes to novel objects and scenes, including cases where prior state-of-the-art models achieve 0% success.
Humanoid robots can do backflips and kung fu, yet still struggle with economically useful work. Dexterous manipulation is the main bottleneck to useful physical labor.
We collect a compact set of human demonstrations and train a reward model that scores and improves the base model's grasps.
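The post does not specify the training objective, but one common way to combine a reward model with imitation learning is reward-weighted behavior cloning (as in advantage-weighted regression): demonstrations that the reward model scores highly get exponentially larger weight in the cloning loss. A minimal sketch, assuming a linear policy and scalar per-demonstration rewards; all names and hyperparameters here are illustrative, not the actual method:

```python
import numpy as np

def reward_weighted_bc(states, actions, rewards, temperature=1.0,
                       lr=0.1, steps=200):
    """Fit a linear policy a = W @ s by reward-weighted behavior cloning.

    Hypothetical sketch: each demonstration i is weighted by
    exp(reward_i / temperature), so high-reward grasps dominate the
    weighted least-squares fit and low-reward ones are down-weighted.
    """
    # Exponential weights, normalized so they sum to 1.
    w = np.exp(rewards / temperature)
    w = w / w.sum()

    n, ds = states.shape
    da = actions.shape[1]
    W = np.zeros((da, ds))
    for _ in range(steps):
        pred = states @ W.T            # (n, da) predicted actions
        err = pred - actions           # imitation error per demo
        # Gradient of the weighted squared error w.r.t. W.
        grad = (w[:, None] * err).T @ states
        W -= lr * grad
    return W
```

With a low temperature the weights concentrate almost entirely on the highest-reward demonstrations, so the fitted policy ignores noisy or failed grasps even when they make up a sizable fraction of the data.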
The result is strong generalization: up to 63% gains on difficult objects where the best baselines achieve 0%, while maintaining near-perfect performance on standard objects. Full details are available in our technical post.
Physical-world activity accounts for roughly 80% of global GDP. We believe this research represents progress toward a future in which AI systems are not limited to computer work.
We open-sourced our teleop stack!
https://github.com/GeneralTrajectory/dex-teleop
It uses vision + wrist trackers instead of data gloves → about $500 in hardware vs. ~$5,000.
If you are interested in AI for the physical world (scientific R&D, defense, etc.), please reach out!