Hugging Face researchers are trying to build a more open version of DeepSeek’s AI ‘reasoning’ model


Barely a week after DeepSeek released its R1 “reasoning” AI model — which sent markets into a tizzy — researchers at Hugging Face are trying to replicate the model from scratch in what they’re calling a pursuit of “open knowledge.”

Hugging Face head of research Leandro von Werra and several company engineers have launched Open-R1, a project that seeks to build a duplicate of R1 and open source all of its components, including the data used to train it.

The engineers said they were compelled to act by DeepSeek’s “black box” release philosophy. Technically, R1 is “open” in that the model is permissively licensed, which means it can be deployed largely without restrictions. However, R1 isn’t “open source” by the widely accepted definition because some of the tools used to build it are shrouded in mystery. Like many high-flying AI companies, DeepSeek is loath to reveal its secret sauce.

“The R1 model is impressive, but there’s no open dataset, experiment details, or intermediate models available, which makes replication and further research difficult,” Elie Bakouch, one of the Hugging Face engineers on the Open-R1 project, told TechCrunch. “Fully open sourcing R1’s complete architecture isn’t just about transparency — it’s about unlocking its potential.”

Not so open

DeepSeek, a Chinese AI lab funded in part by a quantitative hedge fund, released R1 last week. On a number of benchmarks, R1 matches — and even surpasses — the performance of OpenAI’s o1 reasoning model.

Being a reasoning model, R1 effectively fact-checks itself, which helps it avoid some of the pitfalls that normally trip up models. Reasoning models take a little longer — usually seconds to minutes longer — to arrive at solutions compared to a typical non-reasoning model. The upside is that they tend to be more reliable in domains such as physics, science, and math.

R1 broke into the mainstream consciousness after DeepSeek’s chatbot app, which provides free access to R1, rose to the top of the Apple App Store charts. The speed and efficiency with which R1 was developed — DeepSeek released the model just weeks after OpenAI released o1 — has led many Wall Street analysts and technologists to question whether the U.S. can maintain its lead in the AI race.

The Open-R1 project is less concerned with U.S. AI dominance than with “fully opening the black box of model training,” Bakouch told TechCrunch. He noted that, because R1 wasn’t released with training code or training instructions, it’s challenging to study the model in depth, much less steer its behavior.

“Having control over the dataset and process is critical for deploying a model responsibly in sensitive areas,” Bakouch said. “It also helps with understanding and addressing biases in the model. Researchers require more than fragments … to push the boundaries of what’s possible.”

Steps to replication

The goal of the Open-R1 project is to replicate R1 in a few weeks, relying in part on Hugging Face’s Science Cluster, a dedicated research server with 768 Nvidia H100 GPUs.

The Hugging Face engineers plan to tap the Science Cluster to generate datasets similar to those DeepSeek used to create R1. To build a training pipeline, the team is soliciting help from the AI and broader tech communities on Hugging Face and GitHub, where the Open-R1 project is being hosted.

“We need to make sure that we implement the algorithms and recipes [correctly,]” von Werra told TechCrunch, “but it’s something a community effort is perfect at tackling, where you get as many eyes on the problem as possible.”

There’s a lot of interest already. The Open-R1 project racked up 10,000 stars in just three days on GitHub. Stars are a way for GitHub users to indicate that they like a project or find it useful.

If the Open-R1 project is successful, AI researchers will be able to build on top of the training pipeline and work on developing the next generation of open source reasoning models, Bakouch said. He hopes the Open-R1 project will yield not only a strong open source replication of R1, but also a foundation for better models to come.

“Rather than being a zero-sum game, open source development immediately benefits everyone, including the frontier labs and the model providers, as they can all use the same innovations,” Bakouch said.

While some AI experts have raised concerns about the potential for abuse of open source AI, Bakouch believes that the benefits outweigh the risks.

“When the R1 recipe has been replicated, anyone who can rent some GPUs can build their own variant of R1 with their own data, further diffusing the technology everywhere,” he said. “We’re really excited about the recent open source releases that are strengthening the role of openness in AI. It’s an important shift for the field that changes the narrative that only a handful of labs are able to make progress, and that open source is lagging behind.”




Lisa Holden
Lisa Holden is a news writer for LinkDaddy News. She writes about health, sports, tech, and more. Some of her favorite topics include the latest trends in fitness and wellness, the best ways to use technology to improve your life, and the latest developments in medical research.
