Will people really pay $200 a month for OpenAI’s new chatbot?

On Thursday, OpenAI released what’s effectively a $200-a-month chatbot — and the AI community didn’t know quite what to make of it.

The company’s new ChatGPT Pro plan grants access to “o1 pro mode,” which OpenAI says “uses more compute for the best answers to the hardest questions.” A souped-up version of OpenAI’s o1 reasoning model, o1 pro mode should answer questions relating to science, math, and coding more “reliably” and “comprehensively,” OpenAI says.

Almost immediately, people started asking it to draw unicorns:

I asked ChatGPT o1 Pro Mode to create an SVG of a unicorn.

(This is the model you get access to for $200 monthly) pic.twitter.com/h9HwY3aYwU

— Rammy (@rammydev) December 5, 2024

And design a “crab-based” computer:

Finally putting o1-pro to its ultimate use case. pic.twitter.com/nX4JAjx71m

— Ethan Mollick (@emollick) December 6, 2024

And wax poetic on the meaning of life:

I just subscribed to OpenAI’s $200/month subscription.
Reply with questions to ask it and I will repost them in this thread. pic.twitter.com/oTQxbPxnoP

— Garrett Scott 🕳 (@thegarrettscott) December 5, 2024

But many folks on X didn’t seem convinced that o1 pro mode’s answers were, well, $200-level.

“Have OpenAI shared any concrete examples of prompts that fail in regular o1 but succeed in o1-pro?” asked British computer scientist Simon Willison. “I want to see a single concrete example that shows its advantage.”

It’s a reasonable question; after all, this is the world’s most expensive chatbot subscription. The service comes with other benefits, like the removal of rate limits and unlimited access to OpenAI’s other models. But $2,400 per year isn’t chump change, and the value proposition of o1 pro mode in particular remains murky.

It didn’t take long to find failure cases. O1 pro mode struggles with Sudoku, and it’s tripped up by an optical illusion joke that’s obvious to any human.

o1 and o1-pro both failed here, probably still because of the vision limitations (the same with Sudoku puzzles) https://t.co/mAVK7WxBrq pic.twitter.com/O9boSv7ZGt

— Tibor Blaho (@btibor91) December 5, 2024

OpenAI’s internal benchmarks show that o1 pro mode performs only slightly better than the standard o1 on coding and math problems:

Image Credits: OpenAI

OpenAI ran a “stricter” evaluation on the same benchmarks to showcase o1 pro mode’s consistency: the model was only considered to have solved a question if it got the answer right four out of four times. But even in these tests, the improvements weren’t dramatic:

Image: OpenAI o1-pro-mode (Image Credits: OpenAI)
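To make the "four out of four" rule concrete, here is a minimal sketch of how such a stricter score might be computed next to an ordinary single-attempt score. This is not OpenAI's evaluation code: the function names and the sample results are made up for illustration, and the only detail taken from the description above is that a question counts as solved only when all four attempts are correct.

```python
from typing import Sequence


def first_attempt_accuracy(attempts: Sequence[Sequence[bool]]) -> float:
    """Fraction of questions whose first attempt was correct."""
    return sum(a[0] for a in attempts) / len(attempts)


def strict_accuracy(attempts: Sequence[Sequence[bool]], required: int = 4) -> float:
    """Fraction of questions answered correctly on all `required` attempts."""
    solved = sum(len(a) >= required and all(a[:required]) for a in attempts)
    return solved / len(attempts)


# Hypothetical results: one row per question, one boolean per attempt.
# These values are invented for illustration, not real benchmark data.
attempts = [
    [True, True, True, True],
    [True, False, True, True],
    [False, False, True, False],
    [True, True, True, True],
    [True, True, False, True],
]

print(first_attempt_accuracy(attempts))  # 0.8
print(strict_accuracy(attempts))         # 0.4
```

In this toy data the model looks right on four of five questions when judged by a single attempt, but only two of five survive the all-four-correct rule. That gap is why a stricter test is a measure of consistency rather than raw ability.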

OpenAI CEO Sam Altman, who once wrote that OpenAI was on a path “towards intelligence too cheap to meter,” was forced to clarify multiple times on Thursday that ChatGPT Pro isn’t for most people.

“Most users will be very happy with the o1 in the [ChatGPT] Plus tier!” he said on X. “Almost everyone will be best-served by our free tier or the Plus tier.”

So who is it for? Are there really people out there willing to pay $200 a month to ask toy questions like “Write a 3-paragraph essay on strawberries without using the letter ‘e’” or “solve this Math Olympiad problem”? Will they happily part with their hard-earned cash without much guarantee that the standard o1 can’t satisfactorily answer the same questions?

I asked Ameet Talwalkar, an associate professor of machine learning at Carnegie Mellon and a venture partner at Amplify Partners, his opinion. “It seems like a big risk to me to raise the price tenfold,” he told TechCrunch via email. “I think we’ll have a much better sense in just a few weeks as to the appetite for this functionality.”

UCLA computer scientist Guy Van den Broeck was more candid in his assessment. “I don’t know if the price point makes sense,” he told TechCrunch, “and if pricey reasoning models will be the norm.”

o1 is “better than most humans at most tasks” because, yes, humans exist exclusively in amnesic disembodied multi-turn chat interfaces https://t.co/zbLY2BG5pQ

— Aidan McLau (@aidan_mclau) December 6, 2024

A generous take is that it’s a marketing blunder. Describing o1 pro mode as best at solving “the hardest problems” doesn’t tell prospective customers much. Nor do vague statements about how the model can “think longer” and demonstrate “intelligence.” As Willison points out, without specific examples of this supposedly improved capability, it’s hard to justify paying more at all, let alone ten times the price.

So far as I can tell, experts in specialized fields are the intended audience. OpenAI says it plans to grant a handful of medical researchers at “leading institutions” free access to ChatGPT Pro, which will include o1 pro mode. Mistakes matter a lot in healthcare, and, as Bob McGrew, OpenAI’s former chief research officer, noted on X, better reliability is perhaps o1 pro mode’s chief unlock.

Been playing with o1 and o1-pro for bit.

They are very good & a little weird. They are also not for most people most of the time. You really need to have particular hard problems to solve in order to get value out of it. But if you have those problems, this is a very big deal.

— Ethan Mollick (@emollick) December 5, 2024

McGrew also mused that o1 pro mode is an example of what he calls “intelligence overhang”: users (and perhaps the model’s creators) not knowing how to get value from any “extra intelligence” due to fundamental limits of a simple, text-based interface. As with OpenAI’s other models, the only way to interact with o1 pro mode is through ChatGPT, and, to McGrew’s point, ChatGPT isn’t perfect.

It’s also true, though, that $200 sets expectations high. And judging by the early reception on social media, ChatGPT Pro is no slam dunk.
