Robot Training's Big Divide: Human Teachers vs. YouTube Binges

The race to build a capable humanoid robot is quickly becoming less about hardware and more about a fundamental philosophical question: what’s the best way to teach a machine? On one side, you have companies like Sunday, betting on an army of human teachers. On the other, giants like Tesla and Nvidia are hoping their robots can learn just by watching YouTube. This strategic split defines the entire field, and nobody agrees on the right answer.

Sunday is all-in on imitation learning, equipping 500 “memory developers” with special gloves to meticulously record high-quality data for every conceivable chore. The company claims this method allows it to train and evaluate a new task every one to two weeks, creating what it calls the “world’s fastest learning robot.” It’s a hands-on, artisanal approach to data collection, focused on quality over sheer quantity.

This human-centric model has variations. The Norwegian firm 1X Technologies also uses human guidance, but instead of gloves and curated sessions, it deploys its Neo robots directly into real-world scenarios to learn via teleoperation. It’s less of a classroom and more of an on-the-job apprenticeship. Meanwhile, Figure is building out physical “Neura Gyms,” structured environments where its robots can train on specific tasks, sometimes in partnership with companies like BMW.

Then there’s the “just watch videos” camp. Tesla has been vocal about its goal for the Optimus bot to learn tasks simply by observing videos of humans performing them. Nvidia, with its Cosmos platform, is also leveraging simulation and vast, internet-scale video data to train its foundation models for robotics. This method promises immense scale (there are more hours of “how-to” videos online than any team of memory developers could ever produce), but it struggles with context, embodiment, and the sheer noise of unstructured data.

Why is this important?

The schism in training methodology represents the single biggest hurdle to creating a truly general-purpose robot. The core of the debate is a classic quality versus quantity problem, amplified by the complexities of physical interaction.

Is a meticulously curated, high-quality dataset from human demonstrators, like the one Sunday is building, the key to reliable task execution? Or will the sheer, chaotic volume of internet data ultimately provide a more robust and scalable path to intelligence, as Tesla and Nvidia believe? The company that solves this scalable learning puzzle won’t just build a better robot; it will likely define the next decade of artificial intelligence and automation.