Nvidia releases its own brand of world models

Nvidia is getting into world models — AI models that take inspiration from the mental models of the world that humans develop naturally. 

At CES 2025 in Las Vegas, the company announced that it is making openly available a family of world models that can predict and generate “physics-aware” videos. Nvidia is calling this family Cosmos World Foundation Models, or Cosmos WFMs for short.

The models, which can be fine-tuned for specific applications, are available from Nvidia’s API and NGC catalogs, GitHub, and the AI dev platform Hugging Face.

“Nvidia is making available the first wave of Cosmos WFMs for physics-based simulation and synthetic data generation,” the company wrote in a blog post provided to TechCrunch. “Researchers and developers, regardless of their company size, can freely use the Cosmos models under Nvidia’s permissive open model license that allows commercial usage.”

Output from one of Nvidia’s Cosmos World Foundation Models. Image Credits: Nvidia

There are a number of models in the Cosmos WFM family, divided into three categories: Nano for low latency and real-time applications, Super for “highly performant baseline” models, and Ultra for maximum quality and fidelity outputs.

The models range in size from 4 billion to 14 billion parameters, with Nano being the smallest and Ultra being the largest. Parameters roughly correspond to a model’s problem-solving skills, and models with more parameters generally perform better than those with fewer parameters.

As a part of Cosmos WFM, Nvidia is also releasing an “upsampling model,” a video decoder optimized for augmented reality, and guardrail models to ensure responsible use, as well as fine-tuned models for applications like generating sensor data for autonomous vehicle development. These, as well as the other Cosmos WFM models, were trained on 9,000 trillion tokens from 20 million hours of real-world human interactions, environment, industrial, robotics, and driving data, Nvidia said. (In AI, “tokens” represent bits of raw data — in this case, video footage.)
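To put those training figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the two numbers Nvidia quoted (9,000 trillion tokens and 20 million hours) and assumes, purely for illustration, that tokens are spread evenly across the footage. Nvidia has not described its tokenization in this article, so the function names and the even-spread assumption are ours, not the company's.

```python
# Figures quoted by Nvidia in the article.
TOTAL_TOKENS = 9_000 * 10**12  # 9,000 trillion tokens
TOTAL_HOURS = 20 * 10**6       # 20 million hours of footage

SECONDS_PER_HOUR = 3600


def tokens_per_hour(total_tokens: int, total_hours: int) -> float:
    """Average tokens per hour of footage (assumes an even spread)."""
    return total_tokens / total_hours


def tokens_per_second(total_tokens: int, total_hours: int) -> float:
    """Average tokens per second of footage (assumes an even spread)."""
    return tokens_per_hour(total_tokens, total_hours) / SECONDS_PER_HOUR


per_hour = tokens_per_hour(TOTAL_TOKENS, TOTAL_HOURS)
per_second = tokens_per_second(TOTAL_TOKENS, TOTAL_HOURS)

print(f"{per_hour:,.0f} tokens per hour of footage")
print(f"{per_second:,.0f} tokens per second of footage")
```

Under that assumption, the quoted totals work out to about 450 million tokens per hour of footage, or about 125,000 tokens per second.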

Nvidia wouldn’t say where this training data came from, but at least one report — and lawsuit — alleges that the company trained on copyrighted YouTube videos without permission.

When reached for comment, an Nvidia spokesperson told TechCrunch that Cosmos “isn’t designed to copy or infringe any protected works.”

“Cosmos learns just like people learn,” the spokesperson said. “To help Cosmos learn, we gathered data from a variety of public and private sources and are confident our use of data is consistent with both the letter and spirit of the law. Facts about how the world works — which are what the Cosmos models learn — are not copyrightable or subject to the control of any individual author or company.”

Setting aside the fact that models like Cosmos don’t really learn like people learn, copyright experts say claims like Nvidia’s, which draw support from fair use legal doctrine, may not stand up to judicial scrutiny. Whether companies like Nvidia prevail will largely depend on whether courts decide that fair use, which allows copyrighted works to be used to make something new as long as the result is transformative, applies to AI training.

Nvidia claimed that Cosmos WFM models, given text or video frames, can generate “controllable, high-quality” synthetic data to bootstrap the training of models for robotics, driverless cars, and more.

Cosmos can simulate realistic environments like factory floors, according to Nvidia. Image Credits: Nvidia

“Nvidia Cosmos’ suite of open models means developers can customize the WFMs with data sets, such as video recordings of autonomous vehicle trips or robots navigating a warehouse,” Nvidia wrote in a press release. “Cosmos WFMs are purpose-built for physical AI research and development, and can generate physics-based videos from a combination of inputs, like text, image and video, as well as robot sensor or motion data.”

Nvidia said that companies including Waabi, Wayve, Foretellix, and Uber have already committed to piloting Cosmos WFMs for various use cases, from video search and curation to building AI models for self-driving vehicles.

“Generative AI will power the future of mobility, requiring both rich data and very powerful compute,” Uber CEO Dara Khosrowshahi said in a statement. “By working with Nvidia, we are confident that we can help supercharge the timeline for safe and scalable autonomous driving solutions for the industry.”

It’s important to note that Nvidia’s world models aren’t “open source” in the strictest sense. To abide by one widely accepted definition of “open source” AI, an AI model has to provide enough information about its design so that a person could “substantially” recreate it, and disclose any pertinent details about its training data, including the provenance and how the data can be obtained or licensed.

Nvidia hasn’t published Cosmos WFM training data details, nor has it made available all the tools needed to recreate the models from scratch. That’s probably why the tech giant is referring to the models as “open” as opposed to open source.

“We really hope [Cosmos will] do for the world of robotics and industrial AI what Llama … has done for enterprise,” Nvidia CEO Jensen Huang said onstage during a press event on Monday.




Lisa Holden
Lisa Holden is a news writer for LinkDaddy News. She writes health, sport, tech, and more. Some of her favorite topics include the latest trends in fitness and wellness, the best ways to use technology to improve your life, and the latest developments in medical research.
