OpenAI unveils GPT-4o mini, a smaller and cheaper AI model

OpenAI introduced GPT-4o mini on Thursday, its latest small AI model. The company says GPT-4o mini, which is cheaper and faster than OpenAI’s current cutting-edge AI models, is being released for developers, as well as through the ChatGPT web and mobile app for consumers, starting today. Enterprise users will gain access next week.

The company says GPT-4o mini outperforms industry-leading small AI models on reasoning tasks involving text and vision. As small AI models improve, they are becoming more popular with developers because they are faster and cheaper than larger models such as GPT-4 Omni or Claude 3.5 Sonnet. They’re a useful option for high-volume, simple tasks that developers might repeatedly call on an AI model to perform.

GPT-4o mini will replace GPT-3.5 Turbo as the smallest model OpenAI offers. The company claims its newest AI model scores 82% on MMLU, a benchmark to measure reasoning, compared to 79% for Gemini 1.5 Flash and 75% for Claude 3 Haiku, according to data from Artificial Analysis. On MGSM, which measures math reasoning, GPT-4o mini scored 87%, compared to 78% for Flash and 72% for Haiku.

Chart comparing small AI models from Artificial Analysis. Price here is a combination of input and output tokens.

Image Credits: Artificial Analysis

Further, OpenAI says GPT-4o mini is significantly more affordable to run than its previous frontier models, and more than 60% cheaper than GPT-3.5 Turbo. Today, GPT-4o mini supports text and vision in the API, and OpenAI says the model will support video and audio capabilities in the future.

“For every corner of the world to be empowered by AI, we need to make the models much more affordable,” said OpenAI’s head of Product API, Olivier Godement, in an interview with TechCrunch. “I think GPT-4o mini is a really big step forward in that direction.”

For developers building on OpenAI’s API, GPT-4o mini is priced at 15 cents per million input tokens and 60 cents per million output tokens. The model has a context window of 128,000 tokens, roughly the length of a book, and a knowledge cutoff of October 2023.
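
The per-token prices quoted above translate into a per-request cost with simple arithmetic. The following is a minimal sketch in Python, assuming only the two prices stated in this article; the function and constant names are illustrative and not part of OpenAI’s API, and actual billing may differ.

```python
# Prices as reported in this article (USD per one million tokens).
# These constants are assumptions taken from the text, not fetched from OpenAI.
INPUT_USD_PER_MILLION = 0.15
OUTPUT_USD_PER_MILLION = 0.60


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in US dollars.

    Hypothetical helper: multiplies each token count by its
    per-million price and sums the two parts.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a request with 10,000 input tokens and 2,000 output tokens.
    print(f"${estimate_cost_usd(10_000, 2_000):.4f}")
```

Under these figures, a workload of one million input tokens and one million output tokens would come to roughly 75 cents.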

OpenAI would not disclose exactly how large GPT-4o mini is, but said it’s roughly in the same tier as other small AI models, such as Llama 3 8B, Claude Haiku and Gemini 1.5 Flash. However, the company claims GPT-4o mini is faster, more cost-efficient and smarter than industry-leading small models, based on pre-launch testing in the LMSYS.org Chatbot Arena. Early independent tests seem to confirm this.

“Relative to comparable models, GPT-4o mini is very fast, with a median output speed of 202 tokens per second,” said George Cameron, Co-Founder at Artificial Analysis, in an email to TechCrunch. “This is more than 2X faster than GPT-4o and GPT-3.5 Turbo and represents a compelling offering for speed-dependent use-cases including many consumer applications and agentic approaches to using LLMs.”
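
As a rough illustration of what that throughput means in practice, the sketch below estimates how long a response of a given length would take to generate. It assumes generation runs at the quoted median of 202 tokens per second and ignores network latency and time to first token; the function name is hypothetical.

```python
# Median output speed quoted by Artificial Analysis in this article.
# Treated here as a constant rate, which is a simplifying assumption.
MEDIAN_TOKENS_PER_SECOND = 202.0


def estimate_generation_seconds(output_tokens: int) -> float:
    """Estimate seconds needed to generate `output_tokens` tokens.

    Hypothetical helper: divides the token count by the quoted
    median output speed. Real latency varies by load and request.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return output_tokens / MEDIAN_TOKENS_PER_SECOND


if __name__ == "__main__":
    # Example: a 1,010-token response.
    print(f"{estimate_generation_seconds(1_010):.1f} s")
```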

Separately, OpenAI announced new tools for enterprise customers on Thursday. In a blog post, the company introduced the Enterprise Compliance API to help businesses in highly regulated industries such as finance, healthcare, legal services and government comply with logging and audit requirements.

The company says these tools will allow admins to audit and take action on their ChatGPT Enterprise data. The API will provide records of time-stamped interactions, including conversations, uploaded files, workspace users and more.

OpenAI is also giving admins more granular control over workspace GPTs, custom versions of ChatGPT created for specific business use cases. Previously, admins could only fully allow or fully block GPT actions created in their workspace, but now workspace owners can create an approved list of domains that GPTs can interact with.
