For a tech industry intoxicated by advances in artificial intelligence, the idea that fully humanoid robots will soon be stalking the earth hardly seems a stretch.
Elon Musk recently predicted a $10tn market for Optimus, Tesla’s attempt at an artificial human that can take over your household chores. Nvidia boss Jensen Huang said this would be “the largest technology industry the world has ever seen”.
And to judge by the surge of investment into robot start-ups and a flood of videos online of two-legged robots displaying impressive humanlike movements, it is easy to believe such a revolution is at hand. If large language models can tackle difficult reasoning tasks, then it might seem simple to implant a model in a robot and retrain it to navigate the world. Problem solved.
This seriously underestimates the difficulties. Thanks to decades of science fiction, many people “assume AI is inherently embodied”, points out Peter Barrett, a venture investor at Playground Global. In reality, bringing intelligence to the physical world is a much bigger leap.
It will require entirely new ways of training robot brains. When it comes to bringing powerful autonomous hardware systems and people into proximity, there will be no room for the kind of “hallucinations” that today’s LLMs are prone to. And that doesn’t begin to scratch the surface of the many problems that the robot makers need to overcome in constructing and controlling complex hardware systems designed to emulate the human body.
By raising expectations about the practicality of artificial humans, the robot makers are making things much harder for themselves than they need to. They also risk missing a nearer-term and very significant market that is opening up for robots that don’t have two legs or try to ape humankind in all its complexity.
On the artificial intelligence front, the robotics companies face several hurdles beyond those confronting today’s LLM-makers. While services like ChatGPT are based on models that have largely been trained on the internet, there isn’t a ready-made corpus of data describing the physical world.
Also, machines that interact with the world and manipulate objects face a far higher degree of difficulty than simpler autonomous machines like self-driving cars. Vehicles only need to move through the world without hitting anything; a robot has to be able to apply touch to achieve even its most basic task.
There’s also the question of “planning”, or deciding in real time on a course of action based on a flood of real-world sensory data — one of the hardest problems in robotics. Driverless cars may finally be appearing on city streets, but they have taken years longer to reach this stage than tech industry boosters predicted. Robots represent a far higher degree of difficulty.
At its annual tech conference in Silicon Valley this week, Nvidia took some of these issues head-on. Its Cosmos system has been developed to create virtual worlds that can be used to train robot brains — though it is unclear how far this synthetic data will go in substituting for the real thing. The chipmaker also said it had started work on developing a “physics engine” that can help a robot understand the properties of the many different things it might encounter, for instance, by distinguishing hard and soft objects. The work on the physics engine is being undertaken alongside Disney and Google DeepMind — an alignment of corporate interests that speaks volumes about the mix of deep technology and fantasy that is driving the robot revolution.
Nvidia is also releasing its nascent robot operating system as an open source project, potentially drawing in other developers. That could move the field forward faster — though it could sideline the efforts of many others that have rushed into the field. And laying out the development programme that is ahead is still a far cry from showing actual results.
Instead of emulating people, there might be more opportunities in creating boring machines made to handle single tasks or to work in environments adapted for their use, like warehouses and factories. They include machines such as the automated warehouse carts built by Robust.ai, a start-up founded by Rodney Brooks, a founder of the company behind the Roomba vacuum cleaner and a former AI professor at Massachusetts Institute of Technology. A dishwasher doesn’t need hands and arms to relieve humans of a tedious household chore. Applying the latest AI and low-cost hardware could yield a wave of useful robots — even if they look nothing like us.
richard.waters@ft.com