DeepSeek: Everything you need to know about the AI chatbot app

DeepSeek has gone viral.

Chinese AI lab DeepSeek broke into the mainstream consciousness this week after its chatbot app rose to the top of the Apple App Store charts. DeepSeek’s AI models, which were trained using compute-efficient techniques, have led Wall Street analysts and technologists alike to question whether the U.S. can maintain its lead in the AI race and whether demand for AI chips will hold up.

But where did DeepSeek come from, and how did it rise to international fame so quickly?

DeepSeek’s trader origins

DeepSeek is backed by High-Flyer Capital Management, a Chinese quantitative hedge fund that uses AI to inform its trading decisions.

AI enthusiast Liang Wenfeng co-founded High-Flyer in 2015. Liang, who reportedly began dabbling in trading while a student at Zhejiang University, launched High-Flyer Capital Management in 2019 as a hedge fund focused on developing and deploying AI algorithms.

In 2023, High-Flyer started DeepSeek as a lab dedicated to researching AI tools separate from its financial business. With High-Flyer as one of its investors, the lab spun off into its own company, also called DeepSeek.

From day one, DeepSeek built its own datacenter clusters for model training. But like other AI companies in China, DeepSeek has been affected by U.S. export bans on hardware. To train one of its more recent models, the company was forced to use Nvidia H800 chips, a less powerful version of the H100, a chip available to U.S. companies.

DeepSeek’s technical team is said to skew young. The company reportedly recruits aggressively among AI researchers with doctorates from top Chinese universities. DeepSeek also hires people without any computer science background to help its tech better understand a wide range of subjects, per The New York Times.

DeepSeek’s strong models

DeepSeek unveiled its first set of models (DeepSeek Coder, DeepSeek LLM, and DeepSeek Chat) in November 2023. But it wasn’t until spring 2024, when the startup released its next-gen DeepSeek-V2 family of models, that the AI industry started to take notice.

DeepSeek-V2, a general-purpose text- and image-analyzing system, performed well in various AI benchmarks — and was far cheaper to run than comparable models at the time. It forced DeepSeek’s domestic competition, including ByteDance and Alibaba, to cut the usage prices for some of their models, and make others completely free.

DeepSeek-V3, launched in December 2024, only added to DeepSeek’s notoriety.

According to DeepSeek’s internal benchmark testing, DeepSeek-V3 outperforms both downloadable, openly available models like Meta’s Llama and “closed” models that can only be accessed through an API, like OpenAI’s GPT-4o.

Equally impressive is DeepSeek’s R1 “reasoning” model. DeepSeek claims that R1, released in January, performs as well as OpenAI’s o1 model on key benchmarks.

Being a reasoning model, R1 effectively fact-checks itself, which helps it to avoid some of the pitfalls that normally trip up models. Reasoning models take a little longer — usually seconds to minutes longer — to arrive at solutions compared to a typical non-reasoning model. The upside is that they tend to be more reliable in domains such as physics, science, and math.

There is a downside to R1, DeepSeek-V3, and DeepSeek’s other models, however. Because they are Chinese-developed AI, they’re subject to benchmarking by China’s internet regulator to ensure that their responses “embody core socialist values.” In DeepSeek’s chatbot app, for example, R1 won’t answer questions about Tiananmen Square or Taiwan’s autonomy.

A disruptive approach

If DeepSeek has a business model, it’s not clear what that model is, exactly. The company prices its products and services well below market value — and gives others away for free.

The way DeepSeek tells it, efficiency breakthroughs have enabled it to maintain extreme cost competitiveness. Some experts dispute the figures the company has supplied, however.

Whatever the case may be, developers have taken to DeepSeek’s models, which aren’t open source as the phrase is commonly understood but are available under permissive licenses that allow for commercial use. According to Clem Delangue, the CEO of Hugging Face, one of the platforms hosting DeepSeek’s models, developers on Hugging Face have created over 500 “derivative” models of R1 that have racked up 2.5 million downloads combined.

DeepSeek’s success against larger and more established rivals has been described as “upending AI” and ushering in “a new era of AI brinkmanship.” The company’s success was at least in part responsible for causing Nvidia’s stock price to drop by 18% on Monday, and for eliciting a public response from OpenAI CEO Sam Altman.

As for what DeepSeek’s future might hold, it’s not clear. Improved models are a given. But the U.S. government appears to be growing wary of what it perceives as harmful foreign influence.

