Google DeepMind has released an open-source tool, SynthID Text, designed to identify text generated by artificial intelligence. It is part of SynthID, a broader suite of watermarking technologies for generative AI outputs.
To assess the tool’s effectiveness, the company ran an extensive live experiment involving millions of users of its Gemini application, whose feedback on responses was used to evaluate the tool’s performance. The text watermark is a recent addition to DeepMind’s SynthID arsenal, which already includes watermarks for AI-generated images and videos.
Google announced the integration of SynthID into its Gemini app and chatbots in May; the tool is now also accessible on Hugging Face, a platform for AI datasets and models. Watermarking has become crucial in helping users discern AI-generated content, which can mitigate issues like misinformation.
How SynthID Works
Given a prompt like “What’s your favourite colour?”, a text-generating model predicts which “token” most likely follows another, one token at a time. Tokens, which can be a single character or a word, are the building blocks a generative model uses to process information. The model assigns each candidate token a score: the probability that it will be the next token in the output. SynthID Text embeds additional information in this token distribution by “modulating the likelihood of tokens being generated.”
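To make that concrete, here is a minimal, self-contained sketch of one generic way to modulate token probabilities with a secret key, in the spirit of published logit-biasing watermark schemes. It is not DeepMind’s actual algorithm (SynthID’s real method is more sophisticated), and every name, key, and probability below is invented for illustration:

```python
# Toy keyed-bias watermark over a hand-written next-token distribution.
# Illustrative only: SynthID's real scheme differs; SECRET_KEY and all
# probabilities here are invented.
import hashlib
import math
import random

SECRET_KEY = b"demo-key"  # hypothetical watermarking key

def prf(context: str, token: str) -> float:
    """Keyed pseudorandom score in [0, 1) for a (context, token) pair."""
    digest = hashlib.sha256(SECRET_KEY + context.encode() + token.encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def watermarked_sample(context: str, probs: dict[str, float], strength: float = 1.0) -> str:
    """Nudge each token's log-probability by its keyed score, renormalise, sample."""
    scored = {t: math.log(p) + strength * prf(context, t) for t, p in probs.items()}
    total = sum(math.exp(v) for v in scored.values())
    tokens = list(scored)
    weights = [math.exp(scored[t]) / total for t in tokens]
    return random.choices(tokens, weights=weights)[0]

# Toy distribution for the prompt "What's your favourite colour?"
probs = {"blue": 0.45, "green": 0.30, "red": 0.20, "teal": 0.05}
print(watermarked_sample("My favourite colour is", probs))
```

Because the bias is a function of a secret key, anyone holding the key can later test whether a text’s token choices lean the way the key predicts.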
“The final pattern of scores for both the model’s word choices combined with the adjusted probability scores are considered the watermark,” the company wrote in a blog post. “This pattern of scores is compared with the expected pattern of scores for watermarked and unwatermarked text, helping SynthID detect if an AI tool generated the text or if it might come from other sources.”
Pushmeet Kohli, vice president of research at Google DeepMind, explained that SynthID modifies the generation process by adjusting the probability that certain tokens are selected. To determine whether a text was generated by an AI tool, SynthID compares its observed pattern of scores with the expected patterns for watermarked and unwatermarked text.
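Continuing the toy scheme above, detection can be sketched as averaging the keyed scores over a text’s tokens: watermarked text should score noticeably above the roughly 0.5 mean expected of unwatermarked text. The helper reuses the hypothetical prf function from the previous sketch, and the threshold is likewise an invented placeholder, not SynthID’s real detector:

```python
# Toy detector for the keyed-bias sketch above (assumes prf is defined).
# The 0.55 threshold is an arbitrary illustration, not a tuned value.
def watermark_score(tokens: list[str]) -> float:
    """Mean keyed score across successive (context, token) pairs."""
    if len(tokens) < 2:
        return 0.5  # too short to say anything
    scores = [prf(" ".join(tokens[:i]), tokens[i]) for i in range(1, len(tokens))]
    return sum(scores) / len(scores)

def looks_watermarked(text: str, threshold: float = 0.55) -> bool:
    return watermark_score(text.split()) > threshold
```

In practice such a decision is statistical: longer texts yield more confident verdicts, which is also why very short passages are hard to classify.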
Testing Phase
In extensive testing, Google DeepMind found that employing SynthID did not negatively affect the quality, accuracy, creativity, or speed of the generated text. That conclusion came from a large-scale live experiment in which millions of people interacted with SynthID-enabled Gemini products and rated responses with thumbs-up or thumbs-down feedback.
Kohli and his team analysed approximately 20 million responses from both watermarked and unwatermarked chatbots and found no significant differences in perceived quality or utility. Currently, SynthID only works with content generated by Google’s models, but open-sourcing it is intended to broaden its compatibility across other platforms and models.
Current Limitations
Despite its strengths, SynthID has limitations. While the watermark can withstand some forms of tampering, such as cropping or light editing, it becomes less effective when AI-generated text is rewritten or translated. It also struggles with factual prompts, where there are fewer opportunities to adjust token probabilities without altering factual accuracy.
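A rough way to see why, under the toy framing used earlier: a near-deterministic answer carries almost no entropy, so there is little probability mass to shift between tokens without risking a different (possibly wrong) answer. The distributions below are invented for illustration:

```python
# Entropy of two invented next-token distributions: a creative prompt
# leaves room to bias token choice; a factual one does not.
import math

def entropy_bits(probs: dict[str, float]) -> float:
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)

creative = {"blue": 0.45, "green": 0.30, "red": 0.20, "teal": 0.05}
factual = {"Paris": 0.999, "Lyon": 0.001}  # "What is the capital of France?"
print(f"creative prompt: {entropy_bits(creative):.2f} bits")  # ~1.72 bits
print(f"factual prompt:  {entropy_bits(factual):.2f} bits")   # ~0.01 bits
```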
Soheil Feizi, an associate professor at the University of Maryland who has researched AI watermarking vulnerabilities, said, “Achieving reliable and imperceptible watermarking of AI-generated text is fundamentally challenging, especially in scenarios where LLM outputs are near deterministic, such as factual questions or code generation tasks.”
Feizi praised Google DeepMind’s decision to make its watermarking method open-source as a significant advancement for the AI community.
“It allows the community to test these detectors and evaluate their robustness in different settings, helping to better understand the limitations of these techniques,” he added.
João Gante, a machine-learning engineer at Hugging Face, highlighted another advantage of open-sourcing SynthID—it enables developers to integrate watermarking into their models freely. This enhances privacy since only the model owner will possess the cryptographic keys associated with the watermark.
“With better accessibility and the ability to confirm its capabilities, I want to believe that watermarking will become the standard, which should help us detect malicious use of language models,” Gante noted.
However, Irene Solaiman, also from Hugging Face, cautioned that watermarks are not a comprehensive solution.
“Watermarking is one aspect of safer models in an ecosystem that needs many complementing safeguards. As a parallel, even for human-generated content, fact-checking has varying effectiveness,” she said.