Google’s and Microsoft’s chatbots are making up Super Bowl stats

If you needed more evidence that GenAI is prone to making stuff up, Google’s Gemini chatbot, formerly Bard, thinks that the 2024 Super Bowl already happened. It even has the (fictional) statistics to back it up.

Per a Reddit thread, Gemini, powered by Google’s GenAI models of the same name, is answering questions about Super Bowl LVIII as if the game wrapped up yesterday — or weeks before. Like many bookmakers, it seems to favor the Chiefs over the 49ers (sorry, San Francisco fans).

Gemini embellishes creatively, in at least one case giving a player stats breakdown suggesting that Kansas City Chiefs quarterback Patrick Mahomes ran 286 yards for two touchdowns and an interception versus Brock Purdy’s 253 running yards and one touchdown.

Image Credits: /r/smellymonster

It’s not just Gemini. Microsoft’s Copilot chatbot, too, insisted that the game had ended and provided citations (albeit erroneous) to back up the claim. But (perhaps reflecting a San Francisco bias!) it said the 49ers, not the Chiefs, emerged victorious “with a final score of 24-21.”

Image Credits: Kyle Wiggers / TechCrunch

It’s all rather silly — and possibly fixed by now, given that this reporter had no luck replicating the Gemini responses in the Reddit thread. But it also illustrates the major limitations of today’s GenAI — and the dangers of placing too much trust in it.

GenAI models have no real intelligence. Fed an enormous number of examples usually sourced from the public web, AI models learn how likely data (e.g. text) is to occur based on patterns, including the context of any surrounding data.

This probability-based approach works remarkably well at scale. But while choosing words by their probabilities is likely to produce text that makes sense, that outcome is far from certain. LLMs can, for instance, generate something that’s grammatically correct but nonsensical, like the claim about the Golden Gate. Or they can spout mistruths, propagating inaccuracies in their training data.
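To make the probability idea concrete, here is a minimal sketch of a toy word-pair ("bigram") model in Python. It is an illustration only, not how Gemini or Copilot are built: real LLMs use neural networks trained on vastly more data. The four-sentence corpus and the function names (train_bigrams, next_word_probs, generate) are invented for this example.

```python
import random
from collections import Counter, defaultdict

# Toy training text, invented for illustration (not real model data).
CORPUS = (
    "the chiefs won the game . "
    "the 49ers won the game . "
    "the chiefs lost the game . "
    "the 49ers lost the title ."
).split()


def train_bigrams(tokens):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_probs(counts, prev):
    """Turn the follow-up counts for `prev` into probabilities."""
    total = sum(counts[prev].values())
    return {word: n / total for word, n in counts[prev].items()}


def generate(counts, start, max_words, rng):
    """Sample one word at a time. Nothing here checks whether the result is true."""
    words = [start]
    while len(words) < max_words and words[-1] != ".":
        probs = next_word_probs(counts, words[-1])
        if not probs:  # no known continuation for this word
            break
        choices, weights = zip(*probs.items())
        words.append(rng.choices(choices, weights=weights)[0])
    return " ".join(words)


if __name__ == "__main__":
    counts = train_bigrams(CORPUS)
    print(next_word_probs(counts, "chiefs"))  # "won" and "lost" are equally likely
    print(generate(counts, "the", 8, random.Random()))
```

Because the sketch only knows which word tends to follow which, it can stitch together a fluent sentence such as "the chiefs won the title ." even though no sentence in its training text says so. That is the same failure mode, in miniature, as a chatbot confidently reporting the result of a game that has not been played.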

It’s not malicious on the LLMs’ part. They don’t have malice, and the concepts of true and false are meaningless to them. They’ve simply learned to associate certain words or phrases with certain concepts, even if those associations aren’t accurate.

Hence Gemini’s Super Bowl falsehoods.

Google and Microsoft, like most GenAI vendors, readily acknowledge their GenAI isn’t perfect and is, in fact, prone to making mistakes. But these acknowledgements come in the form of small print I’d argue could easily be missed.

Super Bowl misinformation certainly isn’t the most harmful example of GenAI going off the rails. That distinction probably lies with endorsing torture or writing convincingly about conspiracy theories. It is, however, a useful reminder to double-check statements from GenAI bots. There’s a decent chance they’re not true.
