OpenAI’s DevDay brings Realtime API and other treats for AI app developers

Date:

Share post:


It’s been a tumultuous week for OpenAI, full of executive departures and major fundraising developments, but the startup is back at it, trying to convince developers to build tools with its AI models at its 2024 DevDay. The company announced several new tools Tuesday, including a public beta of its “Realtime API”, for building apps with low-latency, AI-generated voice responses. It’s not quite ChatGPT’s Advanced Voice Mode, but it’s close.

In a briefing with reporters ahead of the event, OpenAI chief product officer Kevin Weil said the recent departures of chief technology officer Mira Murati and chief research officer Bob McGrew would not affect the company’s progress.

“I’ll start with saying Bob and Mira have been awesome leaders. I’ve learned a lot from them, and they are a huge part of getting us to where we are today,” said Weil. “And also, we’re not going to slow down.”

As OpenAI undergoes yet another C-suite overhaul – a reminder of the turmoil following last year’s DevDay – the company is trying to convince developers that it still offers the best platform to build AI apps on. Leaders say the startup has more than 3 million developers building with its AI models, but OpenAI is operating in an increasingly competitive space.

OpenAI noted it had cut costs for developers to access its API by 99% in the last two years, though it was likely forced to by competitors such as Meta and Google continuously undercutting their prices.

One of OpenAI’s new features, dubbed the Realtime API, will give developers the chance to build nearly real-time, speech-to-speech experiences in their apps, with the choice of using six voices provided by OpenAI. These voices are distinct from those offered for ChatGPT, and developers can’t use third party voices, in order to prevent copyright issues. (The voice ambiguously based on Scarlett Johansson’s is not available anywhere.)

During the briefing, OpenAI’s head of developer experience, Romain Huet, shared a demo of a trip planning app built with the Realtime API. The application allowed users to verbally speak with an AI assistant about an upcoming trip to London, and get low-latency responses. The Realtime API also has access to a number of tools, so the app was able to annotate a map with restaurant locations as it answered.

At another point, Huet showed how the Realtime API could speak on the phone with a human to inquire about ordering food for an event. Unlike Google’s infamous Duo, OpenAI’s API can’t call restaurants or shops directly; however, it can integrate with calling APIs like Twilio to do so. Notably, OpenAI is not adding disclosures so that its AI models automatically identify themselves on calls like this, despite the fact that these AI-generated voices sounds quite realistic. For now, it seems to be the developers’ responsibility to add this disclosure, something that could be required by a new California law.

As part of its DevDay announcements, OpenAI also introduced vision fine-tuning in its API, which will let developers use images, as well as text, to fine-tune their applications of GPT-4o. This should, in theory, help developers improve the performance of GPT-4o for tasks involving visual understanding. OpenAI’s head of product API, Olivier Godement, tells TechCrunch that developers will not be able to upload copyrighted imagery (such as a picture of Donald Duck), images that depict violence, or other imagery that violates OpenAI’s safety policies.

OpenAI is racing to match what its competitors in the AI model licensing space already offer. Its prompt caching feature is similar to the feature Anthropic launched several months agoallowing developers to cache frequently used context between API calls, reducing costs and improve latency. OpenAI says developers can save 50% using this feature, whereas Anthropic promises a 90% discount for it.

Lastly, OpenAI is offering a model distillation feature to let developers use larger AI models, such as o1-preview and GPT-4o, to fine-tune smaller models such as GPT-4o mini. Running smaller models generally provides cost savings compare to running larger ones, but this feature should let developers improve the performance of those small AI models. As part of model distillation, OpenAI is launching a beta evaluation tool so developers can measure their fine-tune’s performance within OpenAI’s API.

DevDay may make bigger waves for what it didn’t announce – for instance, there wasn’t any news on the GPT Store announced during last year’s DevDay. Last we’ve heard, OpenAI has been piloting a revenue share program with some of the most popular creators of GPTs, but the company hasn’t announced much since then.

Also, OpenAI says it’s not releasing any new AI models during DevDay this year. Developers waiting for OpenAI o1 (not the preview or mini version) or the startup’s video generation model, Sora, will have to wait a little longer.



Source link

Lisa Holden
Lisa Holden
Lisa Holden is a news writer for LinkDaddy News. She writes health, sport, tech, and more. Some of her favorite topics include the latest trends in fitness and wellness, the best ways to use technology to improve your life, and the latest developments in medical research.

Recent posts

Related articles

Battery unicorn Northvolt files for bankruptcy, upending Europe’s industrial plan

Beleaguered Swedish battery manufacturer Northvolt announced today that it was filing for bankruptcy in the U.S., striking...

Brave Search adds AI chat for follow-up questions after your initial query

Brave announced on Thursday that it’s introducing an AI chat mode for follow-up questions based on initial...

Cruise fesses up, Pony AI raises its IPO ambitions, and the TuSimple drama dials back up

Welcome back to TechCrunch Mobility — your central hub for news and insights on the future of...

WhatsApp rolls out voice message transcripts

WhatsApp announced on Thursday it’s rolling out voice message transcripts. The Meta-owned company says the new feature...

Threads adjusts its algorithm to show you more content from accounts you follow

After several complaints about its algorithm, Threads is finally making changes to surface more content from people...

Spotify tests a video feature for audiobooks as it ramps up video expansion

Spotify is enhancing the audiobook experience for premium users through three new experiments: video clips, author pages,...

Candela brings its P-12 electric ferry to Tahoe and adds another $14M to build more

Electric passenger boat startup Candela has topped off its most recent raise with another $14 million, the...

OneRail’s software helps solve the last-mile delivery problem

Last-mile delivery, the very last step of the delivery process, is a common pain point for companies....