Researchers created an open rival to OpenAI’s o1 ‘reasoning’ model for under $50

Date:

Share post:


AI researchers at Stanford and the University of Washington were able to train an AI “reasoning” model for under $50 in cloud compute credits, according to a new research paper released last Friday.

The model, known as s1, performs similarly to cutting-edge reasoning models, such as OpenAI’s o1 and DeepSeek’s R1, on tests measuring math and coding abilities. The s1 model is available on GitHub, along with the data and code used to train it.

The team behind s1 said they started with an off-the-shelf base model, then fine-tuned it through distillation, a process to extract the “reasoning” capabilities from another AI model by training on its answers.

The researchers said s1 is distilled from one of Google’s reasoning models, Gemini 2.0 Flash Thinking Experimental. Distillation is the same approach Berkeley researchers used to create an AI reasoning model for around $450 last month.

To some, the idea that a few researchers without millions of dollars behind them can still innovate in the AI space is exciting. But s1 raises real questions about the commoditization of AI models.

Where’s the moat if someone can closely replicate a multi-million-dollar model with relative pocket change?

Unsurprisingly, big AI labs aren’t happy. OpenAI has accused DeepSeek of improperly harvesting data from its API for the purposes of model distillation.

The researchers behind s1 were looking to find the simplest approach to achieve strong reasoning performance and “test-time scaling,” or allowing an AI model to think more before it answers a question. These were a few of the breakthroughs in OpenAI’s o1, which DeepSeek and other AI labs have tried to replicate through various techniques.

The s1 paper suggests that reasoning models can be distilled with a relatively small dataset using a process called supervised fine-tuning (SFT), in which an AI model is explicitly instructed to mimic certain behaviors in a dataset.

SFT tends to be cheaper than the large-scale reinforcement learning method that DeepSeek employed to train its competitor to OpenAI’s o1 model, R1.

Google offers free access to Gemini 2.0 Flash Thinking Experimental, albeit with daily rate limits, via its Google AI Studio platform.

Google’s terms forbid reverse-engineering its models to develop services that compete with the company’s own AI offerings, however. We’ve reached out to Google for comment.

S1 is based on a small, off-the-shelf AI model from Alibaba-owned Chinese AI lab Qwen, which is available to download for free. To train s1, the researchers created a dataset of just 1,000 carefully curated questions, paired with answers to those questions, as well as the “thinking” process behind each answer from Google’s Gemini 2.0 Flash Thinking Experimental.

After training s1, which took less than 30 minutes using 16 Nvidia H100 GPUs, s1 achieved strong performance on certain AI benchmarks, according to the researchers. Niklas Muennighoff, a Stanford researcher who worked on the project, told TechCrunch he could rent the necessary compute today for about $20.

The researchers used a nifty trick to get s1 to double-check its work and extend its “thinking” time: They told it to wait. Adding the word “wait” during s1’s reasoning helped the model arrive at slightly more accurate answers, per the paper.

In 2025, Meta, Google, and Microsoft plan to invest hundreds of billions of dollars in AI infrastructure, which will partially go toward training next-generation AI models.

That level of investment may still be necessary to push the envelope of AI innovation. Distillation has shown to be a good method for cheaply re-creating an AI model’s capabilities, but it doesn’t create new AI models vastly better than what’s available today.



Source link

Lisa Holden
Lisa Holden
Lisa Holden is a news writer for LinkDaddy News. She writes health, sport, tech, and more. Some of her favorite topics include the latest trends in fitness and wellness, the best ways to use technology to improve your life, and the latest developments in medical research.

Recent posts

Related articles

Avelios nabs $31M led by Sequoia to fix the ailing world of healthcare IT

The race is on to build a new generation of healthcare software to replace legacy hospital systems...

European VC firm Emblem raises $85 million for its initial fund

Emblem, a relatively new European VC firm based in Paris, is announcing the final closing of its...

Aiming to accelerate product design with AI, Trace.Space raises a seed round

Modern product engineering requires the production of highly accurate digital simulations, allowing engineers to create prototypes and...

These researchers used NPR Sunday Puzzle questions to benchmark AI ‘reasoning’ models

Every Sunday, NPR host Will Shortz, The New York Times’ crossword puzzle guru, gets to quiz thousands...

Boston Dynamics joins forces with its former CEO to speed the learning of its Atlas humanoid robot

Boston Dynamics Wednesday announced a partnership designed to bring improved reinforcement learning to its electric Atlas humanoid...

SoftBank could soon buy Renee James’ Ampere chip company for about $6.5B

Ampere, the semiconductor business founded by former Intel exec Renee James, is reportedly on the cusp of...

Sonos lays off 200 ahead of rumored set-top box release

Sonos announced that it has laid off 200 people in a letter posted to its site Wednesday....

Scout Motors sued over plan to sell EVs direct to consumers

Scout Motors’ plan to eschew traditional dealerships and sell EVs directly to consumers is running into legal...