Google releases tech to watermark AI-generated text

Google is making SynthID Text, its technology that lets developers watermark and detect text generated by generative AI models, generally available.

SynthID Text can be downloaded from the AI platform Hugging Face and Google’s updated Responsible GenAI Toolkit.

“Today, we’re open sourcing our SynthID Text watermarking tool,” the company wrote in a post on X. “Available freely to developers and businesses, it will help them identify their AI-generated content.”

So how does it work?

Given a prompt like “What’s your favorite fruit?,” text-generating models predict which “token” most likely follows another — one token at a time. Tokens are the building blocks a generative model uses to process information. They can be a single character, word, or part of a phrase.
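To make the token-by-token idea concrete, here is a minimal sketch. The lookup table `TOY_MODEL` and its weights are invented for illustration. A real model computes its scores with a neural network over a vocabulary of many thousands of tokens, and it conditions on the whole preceding text rather than only the previous token.

```python
import random

# Hypothetical toy "model": for each previous token, the candidate next
# tokens and the weight (relative likelihood) of each. Invented values.
TOY_MODEL = {
    "<start>": {"My": 0.9, "I": 0.1},
    "My": {"favorite": 1.0},
    "I": {"like": 1.0},
    "favorite": {"fruit": 1.0},
    "fruit": {"is": 1.0},
    "is": {"mango": 0.5, "lychee": 0.3, "papaya": 0.2},
    "like": {"mango": 0.5, "lychee": 0.3, "papaya": 0.2},
}


def generate(rng, max_tokens=10):
    """Build a reply one token at a time by sampling from the weights."""
    tokens = []
    prev = "<start>"
    # Stop when the last token has no entry in the table or the cap is hit.
    while prev in TOY_MODEL and len(tokens) < max_tokens:
        candidates = TOY_MODEL[prev]
        choices = list(candidates)
        weights = [candidates[t] for t in choices]
        prev = rng.choices(choices, weights=weights, k=1)[0]
        tokens.append(prev)
    return tokens


print(" ".join(generate(random.Random(0))))
```

Each pass through the loop is one prediction step: look up the candidates, weigh them, pick one, and repeat.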

The model assigns each possible token a score that reflects how likely it is to appear next in the output. SynthID Text inserts additional data in this token distribution by “modulating the likelihood of tokens being generated,” Google says.

“The final pattern of scores for both the model’s word choices combined with the adjusted probability scores are considered the watermark,” the company wrote in a blog post. “This pattern of scores is compared with the expected pattern of scores for watermarked and unwatermarked text, helping SynthID detect if an AI tool generated the text or if it might come from other sources.”
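The general idea of nudging token likelihoods and then checking for the resulting pattern can be sketched with a toy scheme. This is not SynthID Text's actual algorithm. It is a simplified keyed-bias watermark, assuming a secret key, a hash-derived bit per token (`g_value`), and random numbers standing in for a real model's scores. All names and parameters here are hypothetical.

```python
import hashlib
import math
import random

VOCAB = [f"tok{i}" for i in range(50)]  # toy vocabulary, invented


def g_value(key, prev_token, token):
    """Keyed pseudo-random bit (0 or 1) for a (previous token, token) pair."""
    digest = hashlib.sha256(f"{key}|{prev_token}|{token}".encode()).digest()
    return digest[0] & 1


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_text(rng, length, key=None, bias=2.0):
    """Sample tokens; if a key is given, boost tokens whose keyed bit is 1."""
    tokens = ["<start>"]
    for _ in range(length):
        # Stand-in for a real model: random scores at every step.
        scores = [rng.gauss(0.0, 1.0) for _ in VOCAB]
        if key is not None:
            scores = [s + bias * g_value(key, tokens[-1], t)
                      for s, t in zip(scores, VOCAB)]
        probs = softmax(scores)
        tokens.append(rng.choices(VOCAB, weights=probs, k=1)[0])
    return tokens[1:]


def detect_score(tokens, key):
    """Fraction of tokens whose keyed bit is 1 (about 0.5 if unwatermarked)."""
    if not tokens:
        raise ValueError("need at least one token")
    prev, hits = "<start>", 0
    for t in tokens:
        hits += g_value(key, prev, t)
        prev = t
    return hits / len(tokens)


rng = random.Random(0)
watermarked = sample_text(rng, 200, key="secret-key")
plain = sample_text(rng, 200)
print("watermarked:", detect_score(watermarked, "secret-key"))
print("plain:      ", detect_score(plain, "secret-key"))
```

In this sketch, text sampled with the key should score well above one half, while text sampled without it should hover near one half. A detector that holds the key can compare the two without access to the model itself.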

Google claims that SynthID Text, which has been integrated with its Gemini models since this spring, doesn’t compromise the quality, accuracy, or speed of text generation, and works even on text that’s been cropped, paraphrased, or modified.

But the company also admits that its watermarking technology has limitations.

For example, SynthID Text doesn’t perform as well with short text, with text that’s been rewritten or translated from another language, or with responses to factual questions. “On responses to factual prompts, there are fewer opportunities to adjust the token distribution without affecting the factual accuracy,” explains the company. “This includes prompts like ‘What is the capital of France?’ or queries where little or no variation is expected like ‘recite a William Wordsworth poem.’”
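A rough numerical illustration of the factual-prompt limitation: when one token dominates the distribution, the same nudge moves far less probability than it does when several tokens are about equally likely. The scores, the bias value, and the choice of "favored" tokens below are made up for illustration and do not come from Google.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def favored_mass(scores, favored, bias):
    """Total probability on the favored token indices after adding the bias."""
    adjusted = [s + (bias if i in favored else 0.0)
                for i, s in enumerate(scores)]
    probs = softmax(adjusted)
    return sum(probs[i] for i in favored)


# Open-ended prompt: four candidate tokens, all about equally likely.
open_scores = [0.0, 0.0, 0.0, 0.0]
# Factual prompt: token 0 (think "Paris") dominates the other three.
factual_scores = [10.0, 0.0, 0.0, 0.0]

favored = {1, 2}  # assumption: the watermark happens to favor tokens 1 and 2
bias = 2.0        # assumption: size of the nudge added to favored scores

open_shift = (favored_mass(open_scores, favored, bias)
              - favored_mass(open_scores, favored, 0.0))
factual_shift = (favored_mass(factual_scores, favored, bias)
                 - favored_mass(factual_scores, favored, 0.0))

print("shift on open-ended prompt:", open_shift)
print("shift on factual prompt:   ", factual_shift)
```

With these toy numbers the shift on the open-ended prompt is hundreds of times larger than on the factual one, which is the sense in which a near-certain answer leaves little room for a watermark.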
