A Chinese lab has released a ‘reasoning’ AI model to rival OpenAI’s o1

Date:

Share post:


A Chinese lab has unveiled what appears to be one of the first “reasoning” AI models to rival OpenAI’s o1.

On Wednesday, DeepSeek, an AI research company funded by quantitative traders, released a preview of DeepSeek-R1, which the firm claims is a reasoning model competitive with o1.

Unlike most models, reasoning models effectively fact-check themselves by spending more time considering a question or query. This helps them avoid some of the pitfalls that normally trip up models.

Similar to o1, DeepSeek-R1 reasons through tasks, planning ahead and performing a series of actions that help the model arrive at an answer. This can take a while. Like o1, depending on the complexity of the question, DeepSeek-R1 might “think” for tens of seconds before answering.

Image Credits:DeepSeek

DeepSeek claims that DeepSeek-R1 (or DeepSeek-R1-Lite-Preview, to be precise) performs on par with OpenAI’s o1-preview model on two popular AI benchmarks, AIME and MATH. AIME uses other AI models to evaluate a model’s performance, while MATH is a collection of word problems. But the model isn’t perfect. Some commentators on X noted that DeepSeek-R1 struggles with tic-tac-toe and other logic problems. (O1 does, too.)

DeepSeek can also be easily jailbroken — that is, prompted in such a way that it ignores safeguards. One X user got the model to give a detailed meth recipe.

And DeepSeek-R1 appears to block queries deemed too politically sensitive. In our testing, the model refused to answer questions about Chinese leader Xi Jinping, Tiananmen Square, and the geopolitical implications of China invading Taiwan.

DeepSeek-R1
Image Credits:DeepSeek

The behavior is likely the result of pressure from the Chinese government on AI projects in the region. Models in China must undergo benchmarking by China’s internet regulator to ensure their responses “embody core socialist values.” Reportedly, the government has gone so far as to propose a blacklist of sources that can’t be used to train models — the result being that many Chinese AI systems decline to respond to topics that might raise the ire of regulators.

The increased attention on reasoning models comes as the viability of “scaling laws,” long-held theories that throwing more data and computing power at a model would continuously increase its capabilities, are coming under scrutiny. A flurry of press reports suggest that models from major AI labs including OpenAI, Google, and Anthropic aren’t improving as dramatically as they once did.

That’s led to a scramble for new AI approaches, architectures, and development techniques. One is test-time compute, which underpins models like o1 and DeepSeek-R1. Also known as inference compute, test-time compute essentially gives models extra processing time to complete tasks.

“We are seeing the emergence of a new scaling law,” Microsoft CEO Satya Nadella said this week during a keynote at Microsoft’s Ignite conference, referencing test-time compute.

DeepSeek, which says that it plans to open source DeepSeek-R1 and release an API, is a curious operation. It’s backed by High-Flyer Capital Management, a Chinese quantitative hedge fund that uses AI to inform its trading decisions.

One of DeepSeek’s first models, a general-purpose text- and image-analyzing model called DeepSeek-V2, forced competitors like ByteDance, Baidu, and Alibaba to cut the usage prices for some of their models — and make others completely free.

High-Flyer builds its own server clusters for model training, the most recent of which reportedly has 10,000 Nvidia A100 GPUs and cost 1 billion yen (~$138 million). Founded by Liang Wenfeng, a computer science graduate, High-Flyer aims to achieve “superintelligent” AI through its DeepSeek org.



Source link

Lisa Holden
Lisa Holden
Lisa Holden is a news writer for LinkDaddy News. She writes health, sport, tech, and more. Some of her favorite topics include the latest trends in fitness and wellness, the best ways to use technology to improve your life, and the latest developments in medical research.

Recent posts

Related articles

Sam Altman disputes Marc Andreessen’s description of AI meetings with Biden administration

Famed investor Marc Andreessen recently talked about meetings with Biden administration staff who gave him the impression...

EV startup Canoo places remaining employees on a ‘mandatory unpaid break’

Struggling electric van startup Canoo has placed its remaining employees on what it’s calling a “mandatory unpaid...

After causing outrage on the first day of Y Combinator, AI code editor PearAI lands $1M seed

On the first day of Y Combinator’s winter 2024 session – right after orientation and a photo...

Third member of LockBit ransomware gang has been arrested

U.S. prosecutors in New Jersey on Friday publicly announced charges against Rostislav Panev, 51, a dual Russian-Israeli...

Feds clear the way for robotaxis without steering wheels and pedals

The National Highway Traffic Safety Administration (NHTSA) on Friday proposed a new national framework that could make...

VCs pledge not to take money from Russia or China, and Databricks raises a humongous round

Welcome to Startups Weekly — your weekly recap of everything you can’t miss from the world of...

Nvidia clears regulatory hurdle to acquire Run:ai

Chip company Nvidia gets the green light from the European Union to complete its acquisition of Run:ai. The...

Google is expanding Gemini’s in-depth research mode to 40 languages

Google said Friday that the company is expanding Gemini’s latest in-depth research mode to 40 more languages. The...