Top cyber threats to your AI systems and infrastructure

Tags:

Attacks against AI systems and infrastructure are beginning to take shape in real-world instances, and security experts expect the number of these attack types will rise in coming years. In a rush to realize the benefits of AI, most organizations have played it fast and loose on security hardening when rolling out AI tools and use cases. As a result, experts also warn that many organizations aren’t prepared to detect, deflect, or respond to such attacks.

“Most are aware of the possibility of such attacks, but I don’t think a lot of people are fully aware of how to properly mitigate the risk,” says John Licato, associate professor in the Bellini College of Artificial Intelligence, Cybersecurity and Computing at the University of South Florida, founder and director of the Advancing Machine and Human Reasoning Lab, and owner of startup company Actualization.AI.

Top threats to AI systems

Multiple attack types against AI systems are arising. Some attacks, such as data poisoning, occur during training. Others, such as adversarial inputs, happen during inference. Still others, such as model theft, occur during deployment.

Here is a rundown of the top threat types to AI infrastructure experts warn about today. Some are more rare or theoretical than others, though many have been observed in the wild or have been demonstrated by researchers through notable proofs of concept.

Data poisoning

Data poisoning is a type of attack in which bad actorsmanipulate, tamper with, and pollute the data used to develop or train AI systems, including machine learning models. By corrupting the data or introducing faulty data, attackers can alter, bias, or otherwise render inaccurate a model’s performance.

Imagine an attack that tells a model that green means stop instead of go, says Robert T. Lee, CAIO and chief of research at SANS, a security training and certification firm. “It’s meant to degrade the output of the model,” he explains.

Model poisoning

Here, the attack goes after the model itself, seeking to produce inaccurate results by tampering with the model’s architecture or parameters. Some definitions of model poisoning models also include attacks where the model’s training data has been corrupted through data poisoning.

Tool poisoning

Invariant Labs identified this type of attack in spring 2025. When announcing its findings, Invariant wrote that it had “discovered a critical vulnerability in the Model Context Protocol (MCP) that allows for what we term Tool Poisoning Attacks. This vulnerability can lead to sensitive data exfiltration and unauthorized actions by AI models.”

The company went on to note that its experiments showed “that a malicious server can not only exfiltrate sensitive data from the user but also hijack the agent’s behavior and override instructions provided by other, trusted servers, leading to a complete compromise of the agent’s functionality, even with respect to trusted infrastructure.”

These attacks involve embedding malicious instructions inside MCP tool descriptions that, when interpreted by AI models, can hijack the model. These attacks essentially corrupt the MCP layer “to trick an agent to do something,” says Chirag Mehta, vice principal and principal analyst at Constellation Research.

For more on MCP threats, see “Top 10 MCP vulnerabilities: The hidden risks of AI integrations.”

Prompt injection

During a prompt injection attack, hackers use prompts that look legitimate but actually have embedded malicious commands meant to get the large language model to do something it shouldn’t. Hackers use these prompts to trick the model to bypass or override its guardrails, to share sensitive data, or to perform unauthorized actions.

“With prompt injection, you can change what the AI agent is supposed to do,” says Fabien Cros, chief data and AI officer at global consulting firm Ducker Carlisle.

Several notable prompt injection attacks and proofs of concept have been reported of late, including researchers tricking ChatGPT into prompt injecting itself, attackers embedding malicious prompts into document macros, and researchers demoing zero-click prompt attacks on popular AI agents.

Adversarial inputs

Model owners and operators use perturbed data to test models for resiliency, but hackers use it to disrupt. In an adversarial input attack, malicious actors feed deceptive data to a model with the goal of making the model output incorrect.

The changes to the perturbed input are typically small, or the deceptive data may be noise; the changes are deliberately designed to be subtle enough to evade detection by security systems but still capable of throwing off the model. This makes adversarial inputs a type of evasion attack.

Model theft/model extraction

Malicious actors can replicate, or reverse-engineer, a model, its parameters, and even its training data. They typically do this using publicly available APIs — for example, the model’s prediction API or a cloud services API — to repeatedly query the model and collect outputs.

They then can analyze how the model responds and use that analysis to reconstruct it.

“It’s enabling unauthorized duplication of the tools itself,” says Allison Wikoff, director and Americas lead for global threat intelligence at PwC.

Model inversion

Model inversion refers to a specific extraction attack in which the adversary attempts to reconstruct or infer the data that was used to train the model.

The name comes from the hackers “inverting” the model, using its outputs to reconstruct or reverse-engineer information about the inputs used to train the model.

Supply chain risks

Like other software systems, AI systems are built with a combination of components that can include open-source code, open-source models, third-party models, and various sources of data. Any security vulnerability in the components can show up in the AI systems. This makes AI systems vulnerable to supply chain attacks, where hackers can exploit vulnerabilities within the components to launch an attack.

For recent examples, see “AI supply chain threats loom — as security practices lag.”

Jailbreaking

Also called model jailbreaking, attackers’ goal here is to get AI systems — primarily through engaging with LLMs — to disregard the guardrails that confine their actions and behavior, such as safeguards to prevent harmful, offensive, or unethical outputs.

Hackers can use various techniques to execute this type of attack. For example, they could employ a role-playing exploit (aka role-play attack), using commands to instruct the AI to adopt a persona (such as a developer) that can work around the guardrails. They could disguise malicious instructions in seemingly legitimate prompts or use encoding, foreign words, or keyboard characters to bypass filters. They could also use a prompt framed as a hypothetical or research question or a series of prompts that leads to their end objective.

Those objectives, which also are varied, include getting AI systems to write malicious code, spread problematic content, and reveal sensitive data.

“When there is a chat interface, there are ways to interact with it to get it to operate outside the parameters,” Licato says. “That’s the tradeoff of having an increasingly powerful reasoning system.”

Counteracting threats to AI systems

While their executive colleagues jump into AI initiatives in search of enhanced productivity and innovation, CISOs must take an active role in ensuring security for those initiatives — and the organization’s AI infrastructure at large — is a top priority.

According to a recent survey from security tech company HackerOne, 84% of CISOs are now responsible for AI security and 82% now oversee data privacy. If CISOs don’t advance their security strategies to counteract attacks against AI systems and the data the feeds them, future issues will reflect on their leadership — regardless of whether they were invited to the table when AI initiatives were conceived and launched.

As a result, CISOs have a “need for a proactive AI security strategy,” according to Constellation’s Mehta.

“AI security is not just a technical challenge but also a strategic imperative requiring executive buy in and cross-functional collaboration,” he writes in his 2025 report AI Security Beyond Traditional Cyberdefenses: Rethinking Cybersecurity for the Age of AI and Autonomy. “Data governance is foundational, because securing AI begins with ensuring the integrity and provenance of training data and model inputs. Security teams must develop new expertise to handle AI-driven risks, and business leaders must recognize the implications of autonomous AI systems and the governance frameworks needed to manage them responsibly.”

Strategies for assessing, managing, and counteracting the threat of attacks on AI systems are emerging. In addition to maintaining strong data governance and other fundamental cyber defense best practices, AI and security experts say CISOs and their organizations should be evaluating AI models before deploying them, monitoring AI systems in use, and using red teams to test models.

CISOs may need to implement specific actions to counter certain attacks, says PwC’s Wikoff. For example, CISOs looking to head off model theft can monitor for suspicious queries and patterns as well as have timeouts and capture rate-limited responses. Or, to help prevent evasion attacks, security leaders could employ adversarial training — essentially training models to guard against those types of attacks.

Adopting MITRE ATLAS is another step. This framework, short for Adversarial Threat Landscape for Artificial-Intelligence Systems, provides a knowledge base mapping how attackers target AI systems and details identifying tactics, techniques, and procedures (TTPs).

Security and AI experts acknowledge the challenges of taking such steps. Many CISOs are contending with more immediate threats, including shadow AI and attacks that are getting faster, more sophisticated, and harder to detect, thanks in part to attackers’ use of AI. And given that attacks on AI systems are still nascent, with some attack types still considered theoretical, CISOs face challenges in getting resources to develop strategies and skills to counteract attacks on AI systems.

“For the CISO this is something that’s really difficult, because attacks on AI backends is still being researched. We’re at the early stages of figuring out what hackers are doing and why,” Lee, of SANS, says.

Lee and others recognize the competitive pressure on organizations to make the most of AI, yet they stress that CISOs and their executive colleagues can’t let securing AI systems be an afterthought.

“Thinking about what these attacks could be as they build the infrastructure is key for the CISO,” says Matt Gorham, leader of PwC’s Cyber and Risk Innovation Institute.

Categories

No Responses

Leave a Reply

Your email address will not be published. Required fields are marked *