OpenAI’s DevDay brings Realtime API and other treats for AI app developers

It’s been a tumultuous week for OpenAI, full of executive departures and major fundraising developments, but the startup is back at it, trying to convince developers to build tools with its AI models at its 2024 DevDay. The company announced several new tools Tuesday, including a public beta of its “Realtime API”, for building apps with low-latency, AI-generated voice responses. It’s not quite ChatGPT’s Advanced Voice Mode, but it’s close.

In a briefing with reporters ahead of the event, OpenAI chief product officer Kevin Weil said the recent departures of chief technology officer Mira Murati and chief research officer Bob McGrew would not affect the company’s progress.

“I’ll start with saying Bob and Mira have been awesome leaders. I’ve learned a lot from them, and they are a huge part of getting us to where we are today,” said Weil. “And also, we’re not going to slow down.”

As OpenAI undergoes yet another C-suite overhaul – a reminder of the turmoil following last year’s DevDay – the company is trying to convince developers that it still offers the best platform to build AI apps on. Leaders say the startup has more than 3 million developers building with its AI models, but OpenAI is operating in an increasingly competitive space.

OpenAI noted it had cut costs for developers to access its API by 99% in the last two years, though it was likely forced to by competitors such as Meta and Google continuously undercutting their prices.

One of OpenAI’s new features, dubbed the Realtime API, will give developers the chance to build nearly real-time, speech-to-speech experiences in their apps, with the choice of using six voices provided by OpenAI. These voices are distinct from those offered for ChatGPT, and developers can’t use third party voices, in order to prevent copyright issues. (The voice ambiguously based on Scarlett Johansson’s is not available anywhere.)

During the briefing, OpenAI’s head of developer experience, Romain Huet, shared a demo of a trip planning app built with the Realtime API. The application allowed users to verbally speak with an AI assistant about an upcoming trip to London, and get low-latency responses. The Realtime API also has access to a number of tools, so the app was able to annotate a map with restaurant locations as it answered.

At another point, Huet showed how the Realtime API could speak on the phone with a human to inquire about ordering food for an event. Unlike Google’s infamous Duo, OpenAI’s API can’t call restaurants or shops directly; however, it can integrate with calling APIs like Twilio to do so. Notably, OpenAI is not adding disclosures so that its AI models automatically identify themselves on calls like this, despite the fact that these AI-generated voices sounds quite realistic. For now, it seems to be the developers’ responsibility to add this disclosure, something that could be required by a new California law.

As part of its DevDay announcements, OpenAI also introduced vision fine-tuning in its API, which will let developers use images, as well as text, to fine-tune their applications of GPT-4o. This should, in theory, help developers improve the performance of GPT-4o for tasks involving visual understanding. OpenAI’s head of product API, Olivier Godement, tells TechCrunch that developers will not be able to upload copyrighted imagery (such as a picture of Donald Duck), images that depict violence, or other imagery that violates OpenAI’s safety policies.

OpenAI is racing to match what its competitors in the AI model licensing space already offer. Its prompt caching feature is similar to the feature Anthropic launched several months agoallowing developers to cache frequently used context between API calls, reducing costs and improve latency. OpenAI says developers can save 50% using this feature, whereas Anthropic promises a 90% discount for it.

Lastly, OpenAI is offering a model distillation feature to let developers use larger AI models, such as o1-preview and GPT-4o, to fine-tune smaller models such as GPT-4o mini. Running smaller models generally provides cost savings compare to running larger ones, but this feature should let developers improve the performance of those small AI models. As part of model distillation, OpenAI is launching a beta evaluation tool so developers can measure their fine-tune’s performance within OpenAI’s API.

DevDay may make bigger waves for what it didn’t announce – for instance, there wasn’t any news on the GPT Store announced during last year’s DevDay. Last we’ve heard, OpenAI has been piloting a revenue share program with some of the most popular creators of GPTs, but the company hasn’t announced much since then.

Also, OpenAI says it’s not releasing any new AI models during DevDay this year. Developers waiting for OpenAI o1 (not the preview or mini version) or the startup’s video generation model, Sora, will have to wait a little longer.

Source link

OpenAI’s DevDay brings Realtime API and other treats for AI app developers

Recent posts

iMac (M4) review: A mini upgrade to Apple’s entry-level all-in-one

India’s Star Health confirms data breach after cybercriminals post customers’ health data online

Vinted hits $5.4B valuation amid wave of secondary share sales in Europe

Apple stands by decision to terminate account belonging to WWDC student winner

Founders should seek sector alignment when looking for a family office investor

Huang and Zuckerberg swapped jackets at SIGGRAPH 2024 and things got weird

Jake Paul vs Mike Tyson fight shows Netflix still struggles with live events

LatticeFlow’s LLM framework takes a first stab at benchmarking Big AI’s compliance with EU AI Act

How Big Tech embraced nuclear power

4 days left to save big on TechCrunch Disrupt 2024 tickets

Dawn Aerospace’s rocket-propelled aircraft takes flight

Media talent app HUSSLUP shuts down as workers in Hollywood continue to face job slowdown

Google’s NotebookLM now lets you guide AI-generated audio conversations, launches business pilot

Tesla Cybertruck pushes past Ford Mach-E to become third best-selling EV in America

Amazon taps veteran to lead India business as competition intensifies

Related articles

Battery unicorn Northvolt files for bankruptcy, upending Europe’s industrial plan

Brave Search adds AI chat for follow-up questions after your initial query

Cruise fesses up, Pony AI raises its IPO ambitions, and the TuSimple drama dials back up

WhatsApp rolls out voice message transcripts

Threads adjusts its algorithm to show you more content from accounts you follow

Spotify tests a video feature for audiobooks as it ramps up video expansion

Candela brings its P-12 electric ferry to Tahoe and adds another $14M to build more

OneRail’s software helps solve the last-mile delivery problem

Company

Follow us