Gemini Live first look: Better than talking to Siri, but worse than I’d like

Google launched Gemini Live during its Made By Google event in Mountain View, California, on Tuesday. The feature allows you to have a semi-natural spoken conversation, rather than a typed one, with an AI chatbot powered by Google’s latest large language model. TechCrunch was there to test it out firsthand.

Gemini Live is Google’s answer to OpenAI’s Advanced Voice Mode, ChatGPT’s nearly identical feature that’s currently in a limited alpha test. While OpenAI beat Google to the punch by demoing the feature first, Google is the first to roll out the finalized feature.

In my experience, these low-latency voice features feel much more natural than texting with ChatGPT, or even talking with Siri or Alexa. I found that Gemini Live responded to questions in less than two seconds, and was able to pivot fairly quickly when interrupted. Gemini Live is not perfect, but it’s the best way to use your phone hands-free that I’ve seen yet.

How it works

Before speaking with Gemini Live, the feature lets you choose from 10 voices, compared to just three voices from OpenAI. Google worked with voice actors to create each one. I appreciated the variety there, and found each one to sound very humanlike.

In one example, a Google product manager verbally asked Gemini Live to find family-friendly wineries near Mountain View with outdoor areas and playgrounds nearby, so that kids could potentially come along. That’s a far more complicated task than I’d ask Siri — or Google Search, frankly — but Gemini successfully recommended a spot that met the criteria: Cooper-Garrod Vineyards in Saratoga.

That said, Gemini Live leaves something to be desired. It seemed to hallucinate a nearby playground called Henry Elementary School Playground that is supposedly “10 minutes away” from that vineyard. There are other playgrounds nearby in Saratoga, but the nearest Henry Elementary School is more than a two-hour drive from there. There’s a Henry Ford Elementary School in Redwood City, but it’s 30 minutes away.

Google liked to show off how users can interrupt Gemini Live mid-sentence, and the AI will quickly pivot. The company says this allows users to control the conversation. In practice, this feature doesn’t work perfectly. Sometimes Google’s product managers and Gemini Live talked over each other, and the AI didn’t seem to pick up on what was said.

Notably, Google is not allowing Gemini Live to sing or mimic any voices outside of the 10 it provides, according to product manager Leland Rechis. The company is likely doing this to avoid run-ins with copyright law. Further, Rechis said Google is not focused on getting Gemini Live to understand emotional intonation in a user’s voice, something OpenAI touted during its demo.

Overall, the feature seems like a great way to dive deeply into a subject more naturally than you would with a simple Google Search. Google notes that Gemini Live is a step along the way to Project Astra, the fully multimodal AI model the company debuted during Google I/O. For now, Gemini Live is only capable of voice conversations; in the future, however, Google wants to add real-time video understanding.
