OpenAI launches o3-mini, its latest ‘reasoning’ model

OpenAI on Friday launched o3-mini, the newest addition to the company's o family of AI "reasoning" models.

OpenAI first previewed the model in December alongside a more capable system called o3, but the launch comes at a pivotal moment for the company, whose ambitions — and challenges — are seemingly growing by the day.

OpenAI is battling the perception that it’s ceding ground in the AI race to Chinese companies like DeepSeek, which OpenAI alleges might have stolen its IP. It has been trying to shore up its relationship with Washington as it simultaneously pursues an ambitious data center project, and as it reportedly lays the groundwork for one of the largest funding rounds in history.

Which brings us to o3-mini. OpenAI is pitching its new model as both “powerful” and “affordable.”

“Today’s launch marks […] an important step toward broadening accessibility to advanced AI in service of our mission,” an OpenAI spokesperson told TechCrunch.

More efficient reasoning

Unlike most large language models, reasoning models like o3-mini thoroughly fact-check themselves before delivering results, which helps them avoid some of the pitfalls that normally trip up models. Reasoning models take a little longer to arrive at solutions, but the trade-off is that they tend to be more reliable, though not perfect, in domains like physics.

O3-mini is fine-tuned for STEM problems, specifically for programming, math, and science. OpenAI claims the model is largely on par with the o1 family, o1 and o1-mini, in terms of capabilities, but runs faster and costs less.

The company claimed that external testers preferred o3-mini’s answers over those from o1-mini more than half the time. O3-mini apparently also made 39% fewer “major mistakes” on “tough real-world questions” in A/B tests versus o1-mini, and produced “clearer” responses while delivering answers about 24% faster.

O3-mini will be available to all users via ChatGPT starting Friday, but users who pay for OpenAI's ChatGPT Plus and Team plans will get a higher rate limit of 150 queries per day. ChatGPT Pro subscribers will get unlimited access, and o3-mini will come to ChatGPT Enterprise and ChatGPT Edu customers in a week. (No word on ChatGPT Gov yet.)

Users with premium plans can select o3-mini using the ChatGPT drop-down menu. Free users can click or tap the new “Reason” button in the chat bar, or have ChatGPT “re-generate” an answer.

Beginning Friday, o3-mini will also be available via OpenAI’s API to select developers, but it initially will not have support for analyzing images. Devs can select the level of “reasoning effort” (low, medium, or high) to get o3-mini to “think harder” based on their use case and latency needs.
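For developers calling the API, reasoning effort is a per-request setting rather than a separate model. Below is a minimal sketch of what such a call could look like with OpenAI's official Python SDK, assuming the setting is exposed as a reasoning_effort parameter on chat completions (check the current API reference for the exact name and availability):

# Minimal sketch: requesting o3-mini with an explicit reasoning effort.
# Assumes the OpenAI Python SDK and that chat completions accept a
# "reasoning_effort" of "low", "medium", or "high" for o3-mini.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="o3-mini",
    reasoning_effort="high",  # trade extra latency for more "thinking"
    messages=[
        {"role": "user", "content": "Find all integer solutions of x^2 - y^2 = 17."}
    ],
)

print(response.choices[0].message.content)

Higher effort generally means more internal reasoning tokens, so responses at the "high" setting tend to be slower and somewhat costlier.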

O3-mini is priced at $0.55 per million cached input tokens and $4.40 per million output tokens, where a million tokens equates to roughly 750,000 words. That’s 63% cheaper than o1-mini, and competitive with DeepSeek’s R1 reasoning model pricing. DeepSeek charges $0.14 per million cached input tokens and $2.19 per million output tokens for R1 access through its API.
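For a rough sense of what those rates mean per request, here is a small, hypothetical Python sketch that estimates cost from the per-million-token prices quoted above (actual bills also depend on uncached input pricing and each provider's current price list):

# Back-of-the-envelope cost estimate using the per-million-token rates above.
# Prices in USD per 1M tokens; verify against the providers' pricing pages.
PRICES = {
    "o3-mini": {"cached_input": 0.55, "output": 4.40},
    "deepseek-r1": {"cached_input": 0.14, "output": 2.19},
}

def estimate_cost(model: str, cached_input_tokens: int, output_tokens: int) -> float:
    """Approximate USD cost of one request for the given token counts."""
    p = PRICES[model]
    return (cached_input_tokens * p["cached_input"] + output_tokens * p["output"]) / 1_000_000

# Example: a 2,000-token cached prompt that produces a 1,000-token answer.
for name in PRICES:
    print(f"{name}: ${estimate_cost(name, 2_000, 1_000):.4f}")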

In ChatGPT, o3-mini is set to medium reasoning effort, which OpenAI says provides “a balanced trade-off between speed and accuracy.” Paid users will have the option of selecting “o3-mini-high” in the model picker, which will deliver what OpenAI calls “higher intelligence” in exchange for slower responses.

Regardless of which version of o3-mini ChatGPT users choose, the model will work with search to find up-to-date answers with links to relevant web sources. OpenAI cautions that the functionality is a “prototype” as it works to integrate search across its reasoning models.

“While o1 remains our broader general-knowledge reasoning model, o3-mini provides a specialized alternative for technical domains requiring precision and speed,” OpenAI wrote in a blog post on Friday. “The release of o3-mini marks another step in OpenAI’s mission to push the boundaries of cost-effective intelligence.”

Caveats abound

O3-mini is not OpenAI’s most powerful model to date, nor does it leapfrog DeepSeek’s R1 reasoning model in every benchmark.

O3-mini beats R1 on AIME 2024, a benchmark of difficult competition math problems, but only with high reasoning effort. It also beats R1 on the programming-focused test SWE-bench Verified (by 0.1 point), but again, only with high reasoning effort. On low reasoning effort, o3-mini lags behind R1 on GPQA Diamond, which tests models with PhD-level physics, biology, and chemistry questions.

To be fair, o3-mini answers many queries at competitively low cost and latency. In the post, OpenAI compares its performance to the o1 family:

“With low reasoning effort, o3-mini achieves comparable performance with o1-mini, while with medium effort, o3-mini achieves comparable performance with o1,” OpenAI writes. “O3-mini with medium reasoning effort matches o1’s performance in math, coding and science while delivering faster responses. Meanwhile, with high reasoning effort, o3-mini outperforms both o1-mini and o1.”

It’s worth noting that o3-mini’s performance advantage over o1 is slim in some areas. On AIME 2024, o3-mini beats o1 by just 0.3 percentage points when set to high reasoning effort. And on GPQA Diamond, o3-mini doesn’t surpass o1’s score even on high reasoning effort.

OpenAI asserts that o3-mini is as “safe” or safer than the o1 family, however, thanks to red-teaming efforts and its “deliberative alignment” methodology, which makes models “think” about OpenAI’s safety policy while they’re responding to queries. According to the company, o3-mini “significantly surpasses” one of OpenAI’s flagship models, GPT-4o, on “challenging safety and jailbreak evaluations.”





Lisa Holden
Lisa Holden is a news writer for LinkDaddy News. She writes about health, sports, tech, and more. Some of her favorite topics include the latest trends in fitness and wellness, the best ways to use technology to improve your life, and the latest developments in medical research.
