Hugging Face researchers are trying to build a more open version of DeepSeek’s AI ‘reasoning’ model

Barely a week after DeepSeek released its R1 “reasoning” AI model — which sent markets into a tizzy — researchers at Hugging Face are trying to replicate the model from scratch in what they’re calling a pursuit of “open knowledge.”

Hugging Face head of research Leandro von Werra and several company engineers have launched Open-R1, a project that seeks to build a duplicate of R1 and open source all of its components, including the data used to train it.

The engineers said they were compelled to act by DeepSeek’s “black box” release philosophy. Technically, R1 is “open” in that the model is permissively licensed, which means it can be deployed largely without restrictions. However, R1 isn’t “open source” by the widely accepted definition because some of the tools used to build it are shrouded in mystery. Like many high-flying AI companies, DeepSeek is loath to reveal its secret sauce.

“The R1 model is impressive, but there’s no open dataset, experiment details, or intermediate models available, which makes replication and further research difficult,” Elie Bakouch, one of the Hugging Face engineers on the Open-R1 project, told TechCrunch. “Fully open sourcing R1’s complete architecture isn’t just about transparency — it’s about unlocking its potential.”

Not so open

DeepSeek, a Chinese AI lab funded in part by a quantitative hedge fund, released R1 last week. On a number of benchmarks, R1 matches — and even surpasses — the performance of OpenAI’s o1 reasoning model.

Being a reasoning model, R1 effectively fact-checks itself, which helps it avoid some of the pitfalls that normally trip up models. Reasoning models take a little longer — usually seconds to minutes longer — to arrive at solutions compared to a typical non-reasoning model. The upside is that they tend to be more reliable in domains such as physics, science, and math.

R1 broke into the mainstream consciousness after DeepSeek’s chatbot app, which provides free access to R1, rose to the top of the Apple App Store charts. The speed and efficiency with which R1 was developed (DeepSeek released the model just weeks after OpenAI released o1) have led many Wall Street analysts and technologists to question whether the U.S. can maintain its lead in the AI race.

The Open-R1 project is less concerned about U.S. AI dominance than about “fully opening the black box of model training,” Bakouch told TechCrunch. He noted that, because R1 wasn’t released with training code or training instructions, it’s challenging to study the model in depth, much less steer its behavior.

“Having control over the dataset and process is critical for deploying a model responsibly in sensitive areas,” Bakouch said. “It also helps with understanding and addressing biases in the model. Researchers require more than fragments … to push the boundaries of what’s possible.”

Steps to replication

The goal of the Open-R1 project is to replicate R1 in a few weeks, relying in part on Hugging Face’s Science Cluster, a dedicated research server with 768 Nvidia H100 GPUs.

The Hugging Face engineers plan to tap the Science Cluster to generate datasets similar to those DeepSeek used to create R1. To build a training pipeline, the team is soliciting help from the AI and broader tech communities on Hugging Face and GitHub, where the Open-R1 project is being hosted.

“We need to make sure that we implement the algorithms and recipes [correctly,]” von Werra told TechCrunch, “but it’s something a community effort is perfect at tackling, where you get as many eyes on the problem as possible.”

There’s a lot of interest already. The Open-R1 project racked up 10,000 stars in just three days on GitHub. Stars are a way for GitHub users to indicate that they like a project or find it useful.

If the Open-R1 project is successful, AI researchers will be able to build on top of the training pipeline and work on developing the next generation of open source reasoning models, Bakouch said. He hopes the Open-R1 project will yield not only a strong open source replication of R1, but also a foundation for better models to come.

“Rather than being a zero-sum game, open source development immediately benefits everyone, including the frontier labs and the model providers, as they can all use the same innovations,” Bakouch said.

While some AI experts have raised concerns about the potential for open source AI abuse, Bakouch believes that the benefits outweigh the risks.

“When the R1 recipe has been replicated, anyone who can rent some GPUs can build their own variant of R1 with their own data, further diffusing the technology everywhere,” he said. “We’re really excited about the recent open source releases that are strengthening the role of openness in AI. It’s an important shift for the field that changes the narrative that only a handful of labs are able to make progress, and that open source is lagging behind.”



Lisa Holden
Lisa Holden is a news writer for LinkDaddy News. She writes about health, sports, tech, and more. Some of her favorite topics include the latest trends in fitness and wellness, the best ways to use technology to improve your life, and the latest developments in medical research.
