MIT Researchers Develop Technique for Training General-Purpose Robots

In the animated series "The Jetsons," Rosie the robotic maid effortlessly switches between tasks like vacuuming, cooking, and taking out the trash. However, in reality, training a robot to handle a wide range of tasks remains a significant challenge.

Traditionally, engineers gather task-specific data for a particular robot in controlled environments to train it. This process is costly, time-consuming, and often results in robots that struggle to adapt to new environments or tasks they haven’t been trained for.

To address this, MIT researchers have developed a technique that integrates a vast amount of data from diverse sources to train robots more efficiently. Their approach combines data from different domains, including simulations and real-world robots, along with various modalities such as vision sensors and robotic arm encoders, into a unified “language” that a generative AI model can understand.

By synthesizing such a large volume of data, the method allows robots to be trained for various tasks without starting from scratch every time. The technique is faster and more cost-effective than traditional methods, and in both simulations and real-world experiments it outperformed training from scratch by more than 20 percent.

“In robotics, we often hear that there isn’t enough training data. But the bigger issue is that this data comes from many different domains, modalities, and robot hardware,” explains Lirui Wang, an EECS graduate student and lead author of the research. “Our work demonstrates how you can train a robot using all this diverse data.”

Wang’s co-authors include fellow graduate student Jialiang Zhao, Meta research scientist Xinlei Chen, and senior author Kaiming He, an associate professor in EECS and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL). The research will be presented at the Conference on Neural Information Processing Systems.

A robot’s “policy” maps sensor inputs, such as camera images or arm position data, to the actions the robot takes. Policies are typically trained through imitation learning, where a human demonstrates actions or operates the robot remotely. However, this method relies on a limited amount of task-specific data, which often causes robots to struggle when faced with new environments or tasks.
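To make the idea of imitation learning concrete, here is a minimal sketch in Python. It is a toy illustration, not the researchers’ method: the policy is a single linear function, the demonstration pairs are made up, and the names `fit_linear_policy` and `policy` are hypothetical.

```python
# Toy behavior cloning: fit a linear policy, action = w * observation + b,
# to demonstration pairs using ordinary least squares.
# All data and names here are illustrative assumptions.

def fit_linear_policy(demos):
    """Return (w, b) minimizing squared error over (observation, action) pairs."""
    n = len(demos)
    mean_obs = sum(o for o, _ in demos) / n
    mean_act = sum(a for _, a in demos) / n
    cov = sum((o - mean_obs) * (a - mean_act) for o, a in demos)
    var = sum((o - mean_obs) ** 2 for o, _ in demos)
    w = cov / var
    b = mean_act - w * mean_obs
    return w, b

def policy(observation, w, b):
    """Map a sensor reading to an action using the fitted parameters."""
    return w * observation + b

# Hypothetical demonstrations: (distance to target, commanded velocity).
demos = [(0.0, 0.0), (0.5, 0.25), (1.0, 0.5), (2.0, 1.0)]
w, b = fit_linear_policy(demos)
action = policy(1.5, w, b)
```

The fitted policy only reflects the range of observations the demonstrator covered, which is one way to see why a small, task-specific dataset can leave a robot unprepared for new situations.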

To improve this, the MIT team drew inspiration from large language models like GPT-4, which are pretrained on vast amounts of data and then fine-tuned with task-specific data. This pretraining enables them to perform a wide range of tasks effectively.

“In the language domain, data consists of sentences. But in robotics, the data is much more varied. To pretrain robots like language models, we need a different architecture,” says Wang.

Given the diverse nature of robotic data—ranging from camera images to language instructions and depth maps—and the mechanical differences between robots, the researchers developed a new architecture called Heterogeneous Pretrained Transformers (HPT). This architecture unifies data from different modalities and domains to train robots more efficiently.
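One way to picture what “unifying” modalities can mean is to convert inputs of different sizes into tokens of one shared width, so a single model can read them as one sequence. The Python sketch below is an assumption-laden toy, not the HPT implementation: random projections stand in for learned encoders, and `TOKEN_DIM`, `NUM_TOKENS`, `make_projection`, and `tokenize` are hypothetical names chosen for illustration.

```python
import random

TOKEN_DIM = 4    # shared token width (assumed value)
NUM_TOKENS = 2   # tokens emitted per modality (assumed value)

def make_projection(input_dim, rng):
    """Random weights shaped [NUM_TOKENS][TOKEN_DIM][input_dim].

    A stand-in for a learned, modality-specific encoder.
    """
    return [[[rng.uniform(-1, 1) for _ in range(input_dim)]
             for _ in range(TOKEN_DIM)]
            for _ in range(NUM_TOKENS)]

def tokenize(features, projection):
    """Map a flat feature vector of any length to NUM_TOKENS tokens of width TOKEN_DIM."""
    return [[sum(w * x for w, x in zip(row, features)) for row in token_weights]
            for token_weights in projection]

rng = random.Random(0)
camera_features = [0.1] * 12       # hypothetical flattened image features
arm_encoders = [0.3, -0.2, 0.7]    # hypothetical joint readings

vision_proj = make_projection(len(camera_features), rng)
proprio_proj = make_projection(len(arm_encoders), rng)

# Both modalities now share one token format and can be concatenated.
sequence = tokenize(camera_features, vision_proj) + tokenize(arm_encoders, proprio_proj)
```

Although the camera input has 12 values and the arm input has 3, both end up as tokens of the same width, which is the property a shared transformer needs in order to process them together.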
