Meta’s AI chief says world models are key to ‘human-level AI’ — but it might be 10 years out

Are today’s AI models truly remembering, thinking, planning, and reasoning, just like a human brain would? Some AI labs would have you believe they are, but according to Meta’s chief AI scientist Yann LeCun, the answer is no. He thinks we could get there in a decade or so, however, by pursuing a new method called a “world model.”

Earlier this year, OpenAI released a new feature it calls “memory” that allows ChatGPT to “remember” your conversations. The startup’s latest generation of models, o1, displays the word “thinking” while generating an output, and OpenAI says the same models are capable of “complex reasoning.”

That all sounds like we’re pretty close to AGI. However, during a recent talk at the Hudson Forum, LeCun undercut AI optimists, such as xAI founder Elon Musk and Google DeepMind co-founder Shane Legg, who suggest human-level AI is just around the corner.

“We need machines that understand the world; [machines] that can remember things, that have intuition, have common sense, things that can reason and plan to the same level as humans,” said LeCun during the talk. “Despite what you might have heard from some of the most enthusiastic people, current AI systems are not capable of any of this.”

LeCun says today’s large language models, like those which power ChatGPT and Meta AI, are far from “human-level AI.” Humanity could be “years to decades” away from achieving such a thing, he later said. (That doesn’t stop his boss, Mark Zuckerberg, from asking him when AGI will happen, though.)

The reason why is straightforward: LLMs work by predicting the next token (usually a few letters or a short word), while today’s image and video models predict the next pixel. In other words, language models are one-dimensional predictors and image/video models are two-dimensional predictors. Both have become quite good at predicting in their respective dimensions, but neither really understands the three-dimensional world.
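The one-dimensional prediction loop described above can be sketched in a few lines. The tiny lookup table here is a hypothetical stand-in for a trained neural network; real LLMs learn probability distributions over tens of thousands of tokens, but the predict-and-append loop is the same shape.

```python
# Toy autoregressive generator: predict the next token, append it,
# repeat. NEXT is a hypothetical stand-in for a learned model.
NEXT = {"the": "cat", "cat": "sat", "sat": "on", "on": "the"}

def generate(prompt: str, steps: int) -> str:
    tokens = prompt.split()
    for _ in range(steps):
        # A real LLM would sample from a probability distribution here.
        tokens.append(NEXT.get(tokens[-1], "<end>"))
    return " ".join(tokens)

print(generate("the", 3))  # the cat sat on
```

Nothing in this loop models the world the tokens describe; it only continues a sequence, which is the limitation LeCun is pointing at.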

Because of this, modern AI systems cannot do simple tasks that most humans can. LeCun notes how humans learn to clear a dinner table by the age of 10, and drive a car by 17 – and learn both in a matter of hours. But even the world’s most advanced AI systems today, built on thousands or millions of hours of data, can’t reliably operate in the physical world.

To achieve more complex tasks, LeCun suggests, we need to build three-dimensional models that can perceive the world around them, centered on a new type of AI architecture: world models.

“A world model is your mental model of how the world behaves,” he explained. “You can imagine a sequence of actions you might take, and your world model will allow you to predict what the effect of that sequence of actions will be on the world.”

Consider the “world model” in your own head. For example, imagine looking at a messy bedroom and wanting to make it clean. You can imagine how picking up all the clothes and putting them away would do the trick. You don’t need to try multiple methods, or learn how to clean a room first. Your brain observes the three-dimensional space, and creates an action plan to achieve your goal on the first try. That action plan is the secret sauce that AI world models promise.

Part of the benefit here is that world models can take in significantly more data than LLMs. That also makes them computationally intensive, which is why cloud providers are racing to partner with AI companies.

World models are the big idea that several AI labs are now chasing, and the term is quickly becoming the next buzzword to attract venture funding. A group of highly regarded AI researchers, including Fei-Fei Li and Justin Johnson, just raised $230 million for their startup, World Labs. The “godmother of AI” and her team are also convinced world models will unlock significantly smarter AI systems. OpenAI also describes its unreleased Sora video generator as a world model, but hasn’t gotten into specifics.

LeCun outlined an idea for using world models to create human-level AI in a 2022 paper on “objective-driven AI,” though he notes the concept is over 60 years old. In short, a base representation of the world (such as video of a dirty room) and memory are fed into a world model, which predicts what the world will look like based on that information. You then give the world model objectives, including an altered state of the world you’d like to achieve (such as a clean room), as well as guardrails to ensure the model doesn’t harm humans in pursuit of an objective (don’t kill me in the process of cleaning my room, please). Finally, the world model finds an action sequence that achieves those objectives.
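As a rough illustration (not Meta’s code; the function names and the one-dimensional “world” are invented for this sketch), the loop LeCun describes (world model, objective, guardrail, action search) might look like:

```python
from itertools import product

# Hypothetical sketch of objective-driven planning. The "world" is a
# 1-D position, the world model predicts the next state for an action,
# and we search for the shortest action sequence that reaches the goal
# without ever entering a forbidden region (the guardrail).

def world_model(state: int, action: int) -> int:
    """Predict the next state: +1 moves right, -1 moves left."""
    return state + action

def violates_guardrail(state: int) -> bool:
    return state < 0  # forbidden region

def plan(start: int, goal: int, max_horizon: int = 4):
    for horizon in range(1, max_horizon + 1):
        for seq in product((-1, +1), repeat=horizon):
            state = start
            for action in seq:
                state = world_model(state, action)
                if violates_guardrail(state):
                    break  # guardrail violated; discard this sequence
            else:
                if state == goal:
                    return list(seq)  # first (shortest) valid plan
    return None

print(plan(0, 2))  # [1, 1]
```

A real system would replace the toy `world_model` with a learned predictor over video or sensor data, and the brute-force search with gradient-based or learned planning, but the structure, imagine, check, then act, is the same.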

Meta’s long-term AI research lab, FAIR (Fundamental AI Research), is actively working toward building objective-driven AI and world models, according to LeCun. FAIR used to work on AI for Meta’s upcoming products, but LeCun says the lab has shifted in recent years to focus purely on long-term AI research. LeCun says FAIR doesn’t even use LLMs these days.

World models are an intriguing idea, but LeCun says we haven’t made much progress on bringing these systems to reality. There are a lot of very hard problems to solve to get from where we are today, and he says the path is certainly more complicated than we think.

“It’s going to take years before we can get everything here to work, if not a decade,” said LeCun. “Mark Zuckerberg keeps asking me how long it’s going to take.”


Lisa Holden
Lisa Holden is a news writer for LinkDaddy News. She writes health, sport, tech, and more. Some of her favorite topics include the latest trends in fitness and wellness, the best ways to use technology to improve your life, and the latest developments in medical research.
