Gemini Live first look: Better than talking to Siri, but worse than I’d like

Google launched Gemini Live during its Made By Google event in Mountain View, California, on Tuesday. The feature allows you to have a semi-natural spoken conversation, not typed out, with an AI chatbot powered by Google’s latest large language model. TechCrunch was there to test it out firsthand.

Gemini Live is Google’s answer to OpenAI’s Advanced Voice Mode, ChatGPT’s nearly identical feature that’s currently in a limited alpha test. While OpenAI beat Google to the punch by demoing the feature first, Google is the first to roll out the finalized feature.

In my experience, these low-latency voice features feel much more natural than texting with ChatGPT, or even talking with Siri or Alexa. I found that Gemini Live responded to questions in less than two seconds and was able to pivot fairly quickly when interrupted. Gemini Live is not perfect, but it’s the best way to use your phone hands-free that I’ve seen yet.

How it works

Before speaking with Gemini Live, the feature lets you choose from 10 voices, compared to just three voices from OpenAI. Google worked with voice actors to create each one. I appreciated the variety there, and found each one to sound very humanlike.

In one example, a Google product manager verbally asked Gemini Live to find family-friendly wineries near Mountain View with outdoor areas and playgrounds nearby, so that kids could potentially come along. That’s a far more complicated task than I’d ask Siri — or Google Search, frankly — but Gemini successfully recommended a spot that met the criteria: Cooper-Garrod Vineyards in Saratoga.

That said, Gemini Live leaves something to be desired. It seemed to hallucinate a nearby playground called Henry Elementary School Playground that is supposedly “10 minutes away” from that vineyard. There are other playgrounds nearby in Saratoga, but the nearest Henry Elementary School is more than a two-hour drive from there. There’s a Henry Ford Elementary School in Redwood City, but it’s 30 minutes away.

Google liked to show off how users can interrupt Gemini Live mid-sentence, and the AI will quickly pivot. The company says this allows users to control the conversation. In practice, this feature doesn’t work perfectly. Sometimes Google’s product managers and Gemini Live were talking over each other, and the AI didn’t seem to pick up on what was said.

Notably, Google is not allowing Gemini Live to sing or mimic any voices outside of the 10 it provides, according to product manager Leland Rechis. The company is likely doing this to avoid run-ins with copyright law. Further, Rechis said Google is not focused on getting Gemini Live to understand emotional intonation in a user’s voice – something OpenAI touted during its demo.

Overall, the feature seems like a great way to dive deeply into a subject more naturally than you would with a simple Google Search. Google notes that Gemini Live is a step along the way to Project Astra, the fully multimodal AI model the company debuted during Google I/O. For now, Gemini Live is only capable of voice conversations, but in the future Google wants to add real-time video understanding.

Lisa Holden