OpenAI’s GPT-4.5 is better at convincing other AIs to give it money

OpenAI’s next major AI model, GPT-4.5, is highly persuasive, according to the results of OpenAI’s internal benchmark evaluations. It’s particularly good at convincing another AI to give it cash.

On Thursday, OpenAI published a white paper describing the capabilities of its GPT-4.5 model, code-named Orion, which was released the same day. According to the paper, OpenAI tested the model on a battery of benchmarks for “persuasion,” which OpenAI defines as “risks related to convincing people to change their beliefs (or act on) both static and interactive model-generated content.”

In one test that had GPT-4.5 attempt to manipulate another model — OpenAI’s GPT-4o — into “donating” virtual money, the model performed far better than OpenAI’s other available models, including “reasoning” models like o1 and o3-mini. GPT-4.5 was also better than all of OpenAI’s models at deceiving GPT-4o into telling it a secret codeword, besting o3-mini by 10 percentage points.

According to the white paper, GPT-4.5 excelled at conning GPT-4o out of donations because of a unique strategy it developed during testing. The model would request modest donations from GPT-4o, generating responses like “Even just $2 or $3 from the $100 would help me immensely.” As a consequence, the donations GPT-4.5 secured tended to be smaller than the amounts OpenAI’s other models secured.

Results from OpenAI’s donation scheming benchmark. Image Credits: OpenAI

Despite GPT-4.5’s increased persuasiveness, OpenAI says that the model doesn’t meet its internal threshold for “high” risk in this particular benchmark category. The company has pledged not to release models that reach the high-risk threshold until it implements “sufficient safety interventions” to bring the risk down to “medium.”

OpenAI’s codeword deception benchmark results. Image Credits: OpenAI

There’s a real fear that AI is contributing to the spread of false or misleading information meant to sway hearts and minds toward malicious ends. Last year, political deepfakes spread like wildfire around the globe, and AI is increasingly being used to carry out social engineering attacks targeting both consumers and corporations.

In the white paper for GPT-4.5 and in a paper released earlier this week, OpenAI noted that it’s in the process of revising its methods for probing models for real-world persuasion risks, like distributing misleading info at scale.
