AI threats in the wild: The current state of prompt injections on the web

At Google, our Threat Intelligence teams are dedicated to staying ahead of real-world adversarial activity, proactively monitoring emerging threats before they can impact users. Right now, Indirect Prompt Injection (IPI) is a top priority for the security community, anticipating it as a primary attack vector for adversaries to target and compromise AI agents. But while the danger of IPI is widely discussed, are threat actors actually exploiting this vector today – and if so, how?

To answer these questions and to uncover real-world abuse, we initiated a broad sweep of the public web to monitor for known indirect prompt injection patterns. This is what we found. 

The threat of indirect prompt injection

Unlike a direct injection where a user “jailbreaks” a chatbot, IPI occurs when an AI system processes content—like a website, email, or document—that contains malicious instructions. When the AI reads this poisoned content, it may silently follow the attacker’s commands instead of the user’s original intent.

This is not a new area of concern for us and Google has been working tirelessly to combat these threats. Our efforts involve cross-functional collaboration between researchers at Google DeepMind (GDM) and defenders like the Google Threat Intelligence Group (GTIG). We have previously detailed our work in this area and researchers have further highlighted the evolving nature of these vulnerabilities.

Despite this collective focus, a fundamental question remains: to what degree are real-world malicious actors currently operationalizing these attacks?

Proactive monitoring at Google

The landscape of IPI on the web

There are many channels through which attackers might try to s

[…]
Content was cut in order to protect the source.Please visit the source for the rest of the article.

This article has been indexed from Google Online Security Blog

Read the original article: