AWS’ new service tackles AI hallucinations


Amazon Web Services (AWS), Amazon’s cloud computing division, is launching a new tool to combat hallucinations — that is, scenarios where an AI model behaves unreliably.

Announced at AWS’ re:Invent 2024 conference in Las Vegas, the service, Automated Reasoning checks, validates a model’s responses by cross-referencing customer-supplied info for accuracy. AWS claims in a press release that Automated Reasoning checks is the “first” and “only” safeguard for hallucinations.

But that’s, well… putting it generously.

Automated Reasoning checks is nearly identical to the Correction feature Microsoft rolled out this summer, which also flags AI-generated text that might be factually wrong. Google also offers a tool in Vertex AI, its AI development platform, to let customers “ground” models by using data from third-party providers, their own data sets, or Google Search.

In any case, Automated Reasoning checks, which is available through AWS’ Bedrock model hosting service (specifically the Guardrails tool), attempts to figure out how a model arrived at an answer and to discern whether the answer is correct. Customers upload info to establish a ground truth of sorts, and Automated Reasoning checks creates rules that can then be refined and applied to a model.

As a model generates responses, Automated Reasoning checks verifies them, and, in the event of a probable hallucination, draws on the ground truth for the right answer. It presents this answer alongside the likely mistruth so customers can see how far off-base the model might’ve been.
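To make the idea concrete, here is a minimal Python sketch of checking a model's claims against customer-supplied rules and showing the ground truth next to the suspect answer. It is not AWS' API or its actual method, which the company has not detailed here; the `Rule`, `Finding`, and `check_response` names and the refund-policy data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical ground-truth rule: a named fact and the value the customer supplied."""
    name: str
    expected: str


@dataclass
class Finding:
    """A probable hallucination: the model's value shown alongside the ground truth."""
    rule: str
    model_value: str
    ground_truth: str


def check_response(claims: dict[str, str], rules: list[Rule]) -> list[Finding]:
    """Compare the claims extracted from a model response against each rule.

    Assumes the claims have already been extracted into name/value pairs;
    that extraction step is out of scope for this sketch.
    """
    findings = []
    for rule in rules:
        value = claims.get(rule.name)
        # Only flag claims the model actually made and that contradict the rule.
        if value is not None and value != rule.expected:
            findings.append(Finding(rule.name, value, rule.expected))
    return findings


# Illustrative data: the customer's policy says 30 days, the model said 90.
rules = [Rule("refund_window_days", "30"), Rule("ships_internationally", "no")]
claims = {"refund_window_days": "90", "ships_internationally": "no"}

for finding in check_response(claims, rules):
    print(f"{finding.rule}: model said {finding.model_value}, "
          f"ground truth is {finding.ground_truth}")
```

A real system would have to derive the rules from uploaded documents and pull claims out of free-form text, which is where most of the difficulty lies.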

AWS says PwC is already using Automated Reasoning checks to design AI assistants for its clients. And Swami Sivasubramanian, VP of AI and data at AWS, suggested that this type of tooling is exactly what’s attracting customers to Bedrock.

“With the launch of these new capabilities,” he said in a statement, “we are innovating on behalf of customers to solve some of the top challenges that the entire industry is facing when moving generative AI applications to production.” Bedrock’s customer base grew by 4.7x in the last year to tens of thousands of customers, Sivasubramanian added.

But as one expert told me this summer, trying to eliminate hallucinations from generative AI is like trying to eliminate hydrogen from water.

AI models hallucinate because they don’t actually “know” anything. They’re statistical systems that identify patterns in a series of data and predict which data comes next based on previously seen examples. It follows that a model’s responses aren’t answers, then, but predictions of how questions should be answered, within a margin of error.
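A toy example shows what "prediction within a margin of error" means. The Python sketch below counts which word follows which in a tiny made-up corpus and then predicts the most frequent follower. Real models are vastly larger and work on tokens with neural networks rather than raw counts, so treat this as an illustration of the principle only; the function names and corpus are invented.

```python
from collections import Counter, defaultdict


def train_bigrams(text: str) -> dict[str, Counter]:
    """Count, for every word, how often each other word follows it."""
    counts: dict[str, Counter] = defaultdict(Counter)
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts: dict[str, Counter], word: str):
    """Return (most likely next word, its estimated probability), or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    best, n = followers.most_common(1)[0]
    return best, n / sum(followers.values())


corpus = "the sky is blue the sky is blue the sky is green"
model = train_bigrams(corpus)

# The model does not "know" the sky's color; it reports the likeliest follower
# of "is" in its training data, along with how often that pattern occurred.
print(predict_next(model, "is"))
```

Here the prediction for "is" comes out as "blue" with a probability of two-thirds, because one of the three training examples said "green". Nothing in the procedure checks whether either is true.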

AWS claims that Automated Reasoning checks uses “logically accurate” and “verifiable reasoning” to arrive at its conclusions. But the company volunteered no data showing that the tool is itself reliable.

In other Bedrock news, AWS this morning announced Model Distillation, a tool to transfer the capabilities of a large model (e.g. Llama 405B) to a small model (e.g. Llama 8B) that’s cheaper and faster to run. An answer to Microsoft’s Distillation in Azure AI Foundry, Model Distillation provides a way to experiment with various models without breaking the bank, AWS says.

Image Credits: Frederic Lardinois/TechCrunch

“After the customer provides sample prompts, Amazon Bedrock will do all the work to generate responses and fine-tune the smaller model,” AWS explained in a blog post, “and it can even create more sample data, if needed, to complete the distillation process.”

But there are a few caveats.

Model Distillation only works with Bedrock-hosted models from Anthropic and Meta at present. Customers have to select a large and small model from the same model “family” — the models can’t be from different providers. And distilled models will lose some accuracy — “less than 2%,” AWS claims.

If none of that deters you, Model Distillation is now available in preview, along with Automated Reasoning checks.

Also available in preview is “multi-agent collaboration,” a new Bedrock feature that lets customers assign AI to subtasks in a larger project. A part of Bedrock Agents, AWS’ contribution to the AI agent craze, multi-agent collaboration provides tools to create and tune AI for tasks like reviewing financial records and assessing global trends.

Customers can even designate a “supervisor agent” to break up and route tasks to the AIs automatically. The supervisor can “[give] specific agents access to the information they need to complete their work,” AWS says, and “[determine] what actions can be processed in parallel and which need details from other tasks before [an] agent can move forward.”

“Once all of the specialized [AIs] complete their inputs, the supervisor agent [can pull] the information together [and] synthesize the results,” AWS wrote in the post.
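The scheduling part of that description, deciding which tasks can run in parallel and which must wait, can be sketched with a dependency graph. The Python below uses the standard library's `graphlib.TopologicalSorter` to group tasks into batches. This is an assumption about how such a supervisor might plan work, not a description of Bedrock's implementation, and the task names are hypothetical.

```python
from graphlib import TopologicalSorter


def plan_batches(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into batches; every task in a batch can run in parallel.

    `dependencies` maps each task to the set of tasks it needs results from.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if tasks depend on each other in a loop
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose prerequisites are all done
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished so dependents become ready
    return batches


# Hypothetical project: two independent analyses, then a synthesis step.
tasks = {
    "review_financials": set(),
    "assess_trends": set(),
    "synthesize_results": {"review_financials", "assess_trends"},
}

for number, batch in enumerate(plan_batches(tasks), start=1):
    print(f"batch {number}: {batch}")
```

In this example the two analysis tasks land in the first batch and the synthesis step waits for both, mirroring the supervisor pulling the results together at the end.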

Sounds nifty. But as with all these features, we’ll have to see how well it works when deployed in the real world.




Lisa Holden
Lisa Holden is a news writer for LinkDaddy News. She writes about health, sport, tech, and more. Some of her favorite topics include the latest trends in fitness and wellness, the best ways to use technology to improve your life, and the latest developments in medical research.
