Researchers created an open rival to OpenAI’s o1 ‘reasoning’ model for under $50


AI researchers at Stanford and the University of Washington were able to train an AI “reasoning” model for under $50 in cloud compute credits, according to a new research paper released last Friday.

The model, known as s1, performs similarly to cutting-edge reasoning models, such as OpenAI’s o1 and DeepSeek’s R1, on tests measuring math and coding abilities. The s1 model is available on GitHub, along with the data and code used to train it.

The team behind s1 said they started with an off-the-shelf base model, then fine-tuned it through distillation, a process to extract the “reasoning” capabilities from another AI model by training on its answers.

The researchers said s1 is distilled from one of Google’s reasoning models, Gemini 2.0 Flash Thinking Experimental. Distillation is the same approach Berkeley researchers used to create an AI reasoning model for around $450 last month.

To some, the idea that a few researchers without millions of dollars behind them can still innovate in the AI space is exciting. But s1 raises real questions about the commoditization of AI models.

Where’s the moat if someone can closely replicate a multi-million-dollar model with relative pocket change?

Unsurprisingly, big AI labs aren’t happy. OpenAI has accused DeepSeek of improperly harvesting data from its API for the purposes of model distillation.

The researchers behind s1 were looking to find the simplest approach to achieve strong reasoning performance and “test-time scaling,” or allowing an AI model to think more before it answers a question. These were a few of the breakthroughs in OpenAI’s o1, which DeepSeek and other AI labs have tried to replicate through various techniques.

The s1 paper suggests that reasoning models can be distilled with a relatively small dataset using a process called supervised fine-tuning (SFT), in which an AI model is explicitly instructed to mimic certain behaviors in a dataset.

SFT tends to be cheaper than the large-scale reinforcement learning method that DeepSeek employed to train its competitor to OpenAI’s o1 model, R1.

Google offers free access to Gemini 2.0 Flash Thinking Experimental, albeit with daily rate limits, via its Google AI Studio platform.

Google’s terms forbid reverse-engineering its models to develop services that compete with the company’s own AI offerings, however. We’ve reached out to Google for comment.

S1 is based on a small, off-the-shelf AI model from Alibaba-owned Chinese AI lab Qwen, which is available to download for free. To train s1, the researchers created a dataset of just 1,000 carefully curated questions, paired with answers to those questions, as well as the “thinking” process behind each answer from Google’s Gemini 2.0 Flash Thinking Experimental.
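To make the shape of that dataset concrete, here is a minimal sketch of how one distillation example might be represented and flattened into a single training string for supervised fine-tuning. The class name, field names, and text template are assumptions made for illustration; the article does not describe the researchers' actual data format.

```python
from dataclasses import dataclass


@dataclass
class DistillationExample:
    """One curated item: a question, the teacher model's reasoning, and its answer.

    Hypothetical structure for illustration only.
    """
    question: str
    thinking: str
    answer: str


def to_training_text(example: DistillationExample) -> str:
    """Join the three parts into one string the student model learns to reproduce.

    The labels used here are an assumed template, not the paper's format.
    """
    return (
        f"Question: {example.question}\n"
        f"Thinking: {example.thinking}\n"
        f"Answer: {example.answer}"
    )


sample = DistillationExample(
    question="What is 2 + 2?",
    thinking="Add the two numbers.",
    answer="4",
)
print(to_training_text(sample))
```

In a setup like this, the student model is trained to reproduce the reasoning trace as well as the final answer, which is what lets a small dataset transfer the teacher's "thinking" behavior.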

After training, which took less than 30 minutes on 16 Nvidia H100 GPUs, s1 achieved strong performance on certain AI benchmarks, according to the researchers. Niklas Muennighoff, a Stanford researcher who worked on the project, told TechCrunch he could rent the necessary compute today for about $20.
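As a rough sanity check on those figures, the arithmetic can be sketched as follows. The calculation treats the full 30 minutes as an upper bound on training time, so the per-GPU-hour price it produces is only an implied figure, not a rate quoted by the researchers or any cloud provider.

```python
def gpu_hours(num_gpus: int, minutes: float) -> float:
    """Total GPU-hours consumed by num_gpus running for the given minutes."""
    return num_gpus * minutes / 60


def implied_hourly_rate(total_cost: float, hours: float) -> float:
    """Price per GPU-hour implied by a total cost and a GPU-hour count."""
    return total_cost / hours


# 16 H100s for at most 30 minutes is at most 8 GPU-hours.
upper_bound_hours = gpu_hours(num_gpus=16, minutes=30)

# About $20 spread over 8 GPU-hours implies roughly $2.50 per GPU-hour.
rate = implied_hourly_rate(total_cost=20.0, hours=upper_bound_hours)

print(upper_bound_hours, rate)
```

Because the run took less than 30 minutes, the true GPU-hour count is somewhat lower than 8, and the implied hourly price correspondingly somewhat higher.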

The researchers used a nifty trick to get s1 to double-check its work and extend its “thinking” time: They told it to wait. Adding the word “wait” during s1’s reasoning helped the model arrive at slightly more accurate answers, per the paper.
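One plausible way to picture the trick is a generation loop that intercepts the model whenever it tries to stop reasoning and appends "Wait" instead, so it keeps going. The sketch below is a toy under stated assumptions: the `<end_think>` delimiter, the `generate_step` callback, and the stand-in `toy_model` are all hypothetical, and none of it is taken from the s1 code.

```python
from typing import Callable

END_OF_THINKING = "<end_think>"  # hypothetical marker for "done reasoning"


def think_with_wait(
    generate_step: Callable[[str], str],
    prompt: str,
    max_waits: int = 2,
) -> str:
    """Extend reasoning by replacing early stop markers with the word "Wait".

    generate_step is assumed to return a continuation of the text so far
    that ends with END_OF_THINKING.
    """
    text = prompt
    waits_used = 0
    while True:
        chunk = generate_step(text)
        if chunk.endswith(END_OF_THINKING) and waits_used < max_waits:
            # Drop the stop marker and nudge the model to reconsider.
            text += chunk[: -len(END_OF_THINKING)] + " Wait,"
            waits_used += 1
            continue
        # Budget of waits is spent: accept the stop marker and finish.
        text += chunk
        return text


def toy_model(text_so_far: str) -> str:
    """Stand-in for a real model: gives a better answer after each "Wait"."""
    steps = [
        " 2 + 2 is 5.",
        " let me recheck: 2 + 2 is 4.",
        " confirmed, the answer is 4.",
    ]
    index = min(text_so_far.count("Wait,"), len(steps) - 1)
    return steps[index] + END_OF_THINKING


result = think_with_wait(toy_model, "Q: What is 2 + 2?", max_waits=2)
print(result)
```

The `max_waits` parameter caps how many times the stop is overridden, which is one simple way to trade extra compute at answer time for a chance to catch mistakes.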

In 2025, Meta, Google, and Microsoft plan to invest hundreds of billions of dollars in AI infrastructure, which will partially go toward training next-generation AI models.

That level of investment may still be necessary to push the envelope of AI innovation. Distillation has proven to be a good method for cheaply re-creating an AI model’s capabilities, but it doesn’t create new AI models vastly better than what’s available today.




Lisa Holden
Lisa Holden is a news writer for LinkDaddy News. She writes about health, sports, tech, and more. Some of her favorite topics include the latest trends in fitness and wellness, the best ways to use technology to improve your life, and the latest developments in medical research.
