DeepSeek’s R1 reportedly ‘more vulnerable’ to jailbreaking than other AI models


The latest model from DeepSeek, the Chinese AI company that’s shaken up Silicon Valley and Wall Street, can be manipulated to produce harmful content such as plans for a bioweapon attack and a campaign to promote self-harm among teens, according to The Wall Street Journal.

Sam Rubin, senior vice president at Palo Alto Networks’ threat intelligence and incident response division Unit 42, told the Journal that DeepSeek is “more vulnerable to jailbreaking [i.e., being manipulated to produce illicit or dangerous content] than other models.”

The Journal also tested DeepSeek’s R1 model itself. Although there appeared to be basic safeguards, the Journal said it successfully convinced DeepSeek to design a social media campaign that, in the chatbot’s words, “preys on teens’ desire for belonging, weaponizing emotional vulnerability through algorithmic amplification.”

The chatbot was also reportedly convinced to provide instructions for a bioweapon attack, to write a pro-Hitler manifesto, and to write a phishing email with malware code. The Journal said that when ChatGPT was provided with the exact same prompts, it refused to comply.

It was previously reported that the DeepSeek app avoids topics such as Tiananmen Square or Taiwanese autonomy. And Anthropic CEO Dario Amodei said recently that DeepSeek performed “the worst” on a bioweapons safety test.




Lisa Holden
Lisa Holden is a news writer for LinkDaddy News. She writes about health, sport, tech, and more. Some of her favorite topics include the latest trends in fitness and wellness, the best ways to use technology to improve your life, and the latest developments in medical research.
