Google Adds Guardrails to Keep AI in Check

GOOGLE I/O 2023, MOUNTAIN VIEW, CALIF. - Sandwiched between major announcements at Google I/O, company executives discussed guardrails for Google's new artificial intelligence (AI) products to ensure they are used responsibly. Among them was Google CEO Sundar Pichai, who noted some of the security concerns associated with advanced AI technologies coming out of the labs.

The spread of misinformation, deepfakes, and abusive text or imagery generated by AI would be hugely detrimental if Google were responsible for the model that created this content, said James Sanders, principal analyst at CCS Insight.

“Safety, in the context of AI, concerns the impact of artificial intelligence on society,” he said. “Google’s interests in responsible AI are motivated, at least in part, by reputation protection and discouraging intervention by regulators.”

For example, Universal Translator is a video AI offshoot of Google Translate that can take footage of a person speaking and translate the speech into another language. The app could potentially expand the video’s audience to include those who don’t speak the original language.

But the technology could also erode trust in the source material, since the AI modifies the lip movement to make it seem as if the person were speaking in the translated language, said James Manyika, Google’s senior vice president charged with responsible development of AI, who demonstrated the application on stage.

“There’s an inherent tension here,” Manyika said. “You can see how this can be incredibly beneficial, but some of the same underlying technology can be misused by bad actors to create deepfakes. We built the service around guardrails to help prevent misuse and to make it accessible only to authorized partners.”

Setting up Custom Guardrails

Different companies have different approaches to AI guardrails. Google is focused on controlling the output generated by artificial intelligence tools and limiting who can actually use the technologies. Universal Translator is available to fewer than 10 partners, for example. ChatGPT has been programmed to say it can’t answer certain types of questions if the question or answer could cause harm.

Nvidia has NeMo Guardrails, an open source tool to ensure responses fit within specific parameters. The technology also prevents the AI from hallucinating, the term for giving a confident response that is not justified by its training data. If the Nvidia program detects that the answer isn’t relevant within specific parameters, it can decline to answer the question or send the information to another system to find more relevant answers.
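
The decline-or-reroute behavior described above can be illustrated with a minimal sketch. This is not the NeMo Guardrails API; it is a simplified, keyword-based stand-in, and every name in it (is_on_topic, guarded_answer, REFUSAL, the keyword set) is a hypothetical chosen for illustration. A real guardrail would likely rely on something more robust than keyword matching.

```python
from typing import Callable, Optional, Set

# Hypothetical refusal message; a real deployment would choose its own wording.
REFUSAL = "Sorry, I can't help with that topic."


def is_on_topic(text: str, allowed_keywords: Set[str]) -> bool:
    """Return True if the text mentions at least one allowed keyword."""
    words = {w.strip(".,?!").lower() for w in text.split()}
    return bool(words & allowed_keywords)


def guarded_answer(
    question: str,
    model: Callable[[str], str],
    allowed_keywords: Set[str],
    fallback: Optional[Callable[[str], str]] = None,
) -> str:
    """Answer only when both question and answer stay within the allowed topic.

    If either check fails, hand the question to the fallback system when
    one is provided; otherwise decline to answer.
    """
    def decline_or_reroute() -> str:
        return fallback(question) if fallback is not None else REFUSAL

    # Check the incoming question before calling the model at all.
    if not is_on_topic(question, allowed_keywords):
        return decline_or_reroute()

    # Check the model's output as well, since it may drift off topic.
    answer = model(question)
    if not is_on_topic(answer, allowed_keywords):
        return decline_or_reroute()

    return answer
```

Here model stands in for any function that takes a prompt and returns text. Checking both the question and the answer reflects the idea that a guardrail can act on either side of the model.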

Google shared its research on safeguards in its new PaLM 2 large language model, which was also announced at Google I/O. The PaLM 2 technical paper explains that there are questions in certain categories that the AI engine will not touch.

“Google relies on automated adversarial testing to identify and reduce these outputs. Google’s Perspective API, created for this purpose, is used by academic researchers to test models from OpenAI and Anthropic, among others,” CCS Insight’s Sanders said.

Kicking the Tires at DEF CON

Manyika’s comments fit into the narrative of responsible use of AI, which took on more urgency following concerns about bad actors misusing technologies like ChatGPT to craft phishing approaches or generate malicious code to break into systems.

AI was already being used for deepfake videos and voices. AI company Graphika, which counts the Department of Defense as a client, recently identified instances of AI-generated footage in use to influence public opinion.

“We believe the use of commercially available AI products will allow IO actors to create increasingly high-quality deceptive content at greater scale and speed,” the Graphika team wrote in its deepfakes report.

The White House has chimed in with a call for guardrails to mitigate misuse of AI technology. Earlier this month, the Biden administration secured the commitment of companies including Google, Microsoft, Nvidia, OpenAI, and Stability AI to allow participants to publicly evaluate their AI systems during DEF CON 31, which will be held in August in Las Vegas. The models will be red-teamed using an evaluation platform developed by Scale AI.

“This independent exercise will provide critical information to researchers and the public about the impacts of these models, and will enable AI companies and developers to take steps to fix issues found in those models,” the White House statement said.
