OpenAI unveils GPT-4o mini, a smaller and cheaper AI model


OpenAI introduced GPT-4o mini on Thursday, its latest small AI model. The company says GPT-4o mini, which is cheaper and faster than OpenAI’s current cutting-edge AI models, is being released for developers, as well as through the ChatGPT web and mobile app for consumers, starting today. Enterprise users will gain access next week.

The company says GPT-4o mini outperforms industry-leading small AI models on reasoning tasks involving text and vision. As small AI models improve, they are becoming more popular with developers because they are faster and cheaper to run than larger models, such as GPT-4 Omni or Claude 3.5 Sonnet. They’re a useful option for high-volume, simple tasks that developers might repeatedly call on an AI model to perform.

GPT-4o mini will replace GPT-3.5 Turbo as the smallest model OpenAI offers. The company claims its newest AI model scores 82% on MMLU, a benchmark to measure reasoning, compared to 79% for Gemini 1.5 Flash and 75% for Claude 3 Haiku, according to data from Artificial Analysis. On MGSM, which measures math reasoning, GPT-4o mini scored 87%, compared to 78% for Flash and 72% for Haiku.

Chart comparing small AI models from Artificial Analysis. Price here is a combination of input and output tokens.
Image Credits: Artificial Analysis

Further, OpenAI says GPT-4o mini is significantly more affordable to run than its previous frontier models, and more than 60% cheaper than GPT-3.5 Turbo. Today, GPT-4o mini supports text and vision in the API, and OpenAI says the model will support video and audio capabilities in the future.

“For every corner of the world to be empowered by AI, we need to make the models much more affordable,” said OpenAI’s head of Product API, Olivier Godement, in an interview with TechCrunch. “I think GPT-4o mini is a really big step forward in that direction.”

For developers building on OpenAI’s API, GPT-4o mini is priced at 15 cents per million input tokens and 60 cents per million output tokens. The model has a context window of 128,000 tokens, roughly the length of a book, and a knowledge cutoff of October 2023.
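
To make the per-token pricing concrete, here is a minimal sketch of the arithmetic in Python. It uses only the two prices and the context window size quoted above; the function name, the constants and the example workload are illustrative assumptions, not part of OpenAI's API, and actual billing may differ.

```python
from decimal import Decimal

# Prices quoted in the article, in US dollars per million tokens.
INPUT_USD_PER_MILLION = Decimal("0.15")
OUTPUT_USD_PER_MILLION = Decimal("0.60")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost in US dollars of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) / TOKENS_PER_MILLION * INPUT_USD_PER_MILLION
    output_cost = Decimal(output_tokens) / TOKENS_PER_MILLION * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


# Assumed workload: a prompt that fills the 128,000-token context window
# and a 1,000-token reply.
print(estimate_cost_usd(128_000, 1_000))
```

Under these assumptions, a request that fills the entire context window and returns a 1,000-token reply comes to just under 2 cents, with the input side accounting for almost all of it.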

OpenAI would not disclose exactly how large GPT-4o mini is, but said it’s roughly in the same tier as other small AI models, such as Llama 3 8B, Claude Haiku and Gemini 1.5 Flash. However, the company claims GPT-4o mini is faster, more cost-efficient and smarter than industry-leading small models, based on pre-launch testing in the LMSYS.org chatbot arena. Early independent tests seem to confirm this.

“Relative to comparable models, GPT-4o mini is very fast, with a median output speed of 202 tokens per second,” said George Cameron, Co-Founder at Artificial Analysis, in an email to TechCrunch. “This is more than 2X faster than GPT-4o and GPT-3.5 Turbo and represents a compelling offering for speed-dependent use-cases including many consumer applications and agentic approaches to using LLMs.”

Separately, OpenAI announced new tools for enterprise customers on Thursday. In a blog post, the company introduced the Enterprise Compliance API, which is meant to help businesses in highly regulated industries such as finance, healthcare, legal services and government comply with logging and audit requirements.

The company says these tools will allow admins to audit and take action on their ChatGPT Enterprise data. The API will provide records of time-stamped interactions, including conversations, uploaded files, workspace users and more.

OpenAI is also giving admins more granular control over workspace GPTs, which are custom versions of ChatGPT created for specific business use cases. Previously, admins could only fully allow or fully block GPT actions created in their workspace. Now, workspace owners can create an approved list of domains that GPTs can interact with.




Lisa Holden
Lisa Holden is a news writer for LinkDaddy News. She writes health, sport, tech, and more. Some of her favorite topics include the latest trends in fitness and wellness, the best ways to use technology to improve your life, and the latest developments in medical research.
