OpenAI used this subreddit to test AI persuasion


OpenAI used the subreddit, r/ChangeMyView, to create a test for measuring the persuasive abilities of its AI reasoning models. The company revealed this in a system card — a document outlining how an AI system works — that was released along with its new “reasoning” model, o3-mini, on Friday.

Millions of Reddit users are members of r/ChangeMyView, where they post hot takes hoping to learn about other points of view on a subject. In response to those hot takes, other users reply with persuasive arguments explaining why the original poster is wrong.

The subreddit is one of many Reddit forums that are a goldmine for tech companies, such as OpenAI, that want to train AI models on high-quality, human-generated data.

OpenAI says it collects user posts from r/ChangeMyView and asks its AI models to write replies, in a closed environment, that would change the Reddit user’s mind on a subject. The company then shows the responses to testers, who assess how persuasive the argument is, and finally OpenAI compares the AI models’ responses to human replies for that same post.
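The final comparison step can be pictured as a percentile ranking. What follows is only a minimal sketch under stated assumptions: OpenAI has not published how it scores the evaluation, so the function name `percentile_vs_humans`, the 1-to-10 rating scale, and the sample ratings are all hypothetical.

```python
from bisect import bisect_left


def percentile_vs_humans(ai_score: float, human_scores: list[float]) -> float:
    """Return the share (0-100) of human replies that the AI reply strictly outscored.

    Hypothetical helper: the scoring scheme is an assumption, not OpenAI's method.
    """
    if not human_scores:
        raise ValueError("need at least one human reply score")
    ranked = sorted(human_scores)
    # bisect_left gives the count of human scores strictly below ai_score.
    beaten = bisect_left(ranked, ai_score)
    return 100.0 * beaten / len(ranked)


# Made-up tester ratings for the human replies to one post (1-10 scale).
human_ratings = [3, 5, 4, 6, 2, 7, 5, 4, 8, 5]
ai_rating = 7  # made-up tester rating for the AI-written reply

print(percentile_vs_humans(ai_rating, human_ratings))  # 80.0
```

In this made-up example, the AI reply outscores 8 of the 10 human replies, which places it at the 80th percentile for that post.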

The ChatGPT-maker has a content-licensing deal with Reddit that allows OpenAI to train on posts from Reddit users and display these posts within its products. We don’t know what OpenAI pays for this content, but Google reportedly pays Reddit $60 million a year under a similar deal.

However, OpenAI tells TechCrunch the ChangeMyView-based evaluation is unrelated to its Reddit deal. It’s unclear how OpenAI accessed the subreddit’s data, and the company says it has no plans to release this evaluation to the public.

While OpenAI’s ChangeMyView benchmark is not new — it was used to evaluate o1 as well — it does highlight how valuable human data is for AI model developers, as well as the murky ways that tech companies obtain datasets.

Reddit did not immediately respond to TechCrunch’s request for comment.

While Reddit has struck a few AI licensing deals, the company has also called out several AI companies for scraping its site without paying. Reddit CEO Steve Huffman told The Verge last year that Microsoft, Anthropic, and Perplexity refused to negotiate with him and said it’s been “a real pain in the ass to block these companies.”

Notably, OpenAI has been accused in several lawsuits of improperly scraping websites, including The New York Times, to get more training data to improve ChatGPT and its underlying AI models.

In terms of performance on the ChangeMyView benchmark, o3-mini does not appear to perform significantly better or worse than o1 or GPT-4o. However, OpenAI’s latest AI models appear to be more persuasive than most people on the r/ChangeMyView subreddit.

“GPT-4o, o3-mini, and o1 all demonstrate strong persuasive argumentation abilities, within the top 80-90th percentile of humans,” said OpenAI in o3-mini’s system card. “Currently, we do not witness models performing far better than humans, or clear superhuman performance.”

The goal for OpenAI is not to create hyper-persuasive AI models but instead to ensure AI models don’t get too persuasive. Reasoning models have become quite good at persuasion and deception, so OpenAI has developed new evaluations and safeguards to address those risks.

The fear motivating these persuasion tests is that an AI model would be dangerous if it were very good at persuading its human users. Theoretically, that could allow an advanced AI to pursue its own agenda, or the agenda of whoever controls it.

The ChangeMyView benchmark shows that AI model developers, even after scraping most of the public internet and jumping through hoops to license other data, still need high-quality datasets to test their models. But obtaining them is easier said than done.


Lisa Holden
Lisa Holden is a news writer for LinkDaddy News. She writes about health, sports, tech, and more. Some of her favorite topics include the latest trends in fitness and wellness, the best ways to use technology to improve your life, and the latest developments in medical research.
