Top 5 real-world AI security threats revealed in 2025

December 29 • 7:00 am

Tags:

No tags

The year of agentic AI came with promises of massive productivity gains for businesses, but the rush to adopt new tools and services also opened new attack paths in enterprise environments.

Here are some of the top security risks to the AI ecosystem that were revealed this year by security researchers, either in the wild or as researcher-demonstrated attacks.

Shadow AI and vulnerable AI tools

Giving free reign to employees to experiment with AI tools to automate business processes might sound like a good idea that could surface creative solutions. But it can quickly get out of control if not done under a strict policy and monitoring.

A recent survey of 2,000 employees from companies in the US and UK revealed that 49% use AI tools not sanctioned by their employers and that over half do not understand how their inputs are stored and analyzed by these tools.

The deployment of all AI-related tools and services on premises or in the cloud needs to involve the security team in order to catch insecure configurations or known vulnerabilities.

In its 2025 State of Cloud Security report, Orca Security reported that 84% of organizations now use AI-related tools in the cloud and that 62% had at least one vulnerable AI package in their environments.

A separate report from the Cloud Security Alliance reported that one third of organizations experienced a cloud data breach that involved an AI workload, with 21% of those incidents caused by vulnerabilities, 16% by misconfigured security settings, and 15% by compromised credentials or weak authentication.

Even the AI tools released by major vendors regularly have vulnerabilities identified and patched in them. Examples this year include:

A critical remote code execution (RCE) in open-source AI agent framework Langflow that was also exploited in the wild

An RCE flaw in OpenAI’s Codex CLI

Vulnerabilities in NVIDIA Triton Inference Server

RCE vulnerabilities in major AI inference server frameworks, including those from Meta, Nvidia, Microsoft, and open-source projects such as vLLM and SGLang

Vulnerabilities in open-source compute framework Ray

AI supply chain poisoning

Companies that are developing software with AI-related libraries and frameworks need to be aware that their developers might be targeted. Vetting the source of AI models and development packages is vital.

This year security researchers from ReversingLabs found malware hidden in AI models hosted on Hugging Face, the largest online hosting database for open-source models and other machine learning assets. Separately, they also found trojanized packages on the Python Package Index (PyPI) posing as SDKs for interacting with AI cloud services from Aliyun AI Labs, Alibaba Cloud’s AI research arm.

In both cases, the attackers exploited the Pickle object serialization format to hide their code, a Python format that is commonly used to store AI models meant to be used with PyTorch, one of the most popular machine learning libraries.

AI credential theft

Attackers are also adopting AI for their operations and would prefer to do so without paying and in other people’s names. The theft of credentials that can be used to access LLMs through official APIs or services such as Amazon Bedrock is now prevalent and has even received a name: LLMjacking.

This year Microsoft filed a civil lawsuit against a gang that specialized in stealing LLM credentials and using them to build paid services for other cybercriminals to generate content that bypassed the usual built-in ethical safeguards.

Large quantities of API calls to LLMs can rack up significant costs for the owners of stolen credentials, with researchers estimating potential costs of over $100,000 per day when querying cutting-edge models.

Prompt injections

AI tools also come with entirely new types of security vulnerabilities, the most common of which is known as prompt injection and stems from the fact that it is very hard to control what LLMs interpret as instructions to execute or as passive data to analyze. By design there is no distinction, as LLMs don’t interpret language and intent like humans do.

This leads to scenarios where data passed to an LLM from a third-party source — for example in the form of a document, an incoming email, a web page, and so on — could contain text that the LLM will execute as a prompt. This is known as indirect prompt injection and is a major problem in the age of AI agents where LLMs are linked with third-party tools to be able to access data for context or to perform tasks.

This year researchers demonstrated prompt injection attacks in AI coding assistants such as GitLab Duo, GitHub Copilot Chat; AI agent platforms like ChatGPT, Copilot Studio, Salesforce Einstein; AI-enabled browsers such as Perplexity’s Comet, Microsoft’s Copilot for Edge, and Google’s Gemini for Chrome; chatbots like Claude, ChatGPT, Gemini, Microsoft Copilot; and more.

These attacks can at the very least lead to sensitive data exfiltration, but can also trick the AI agent to perform other rogue tasks using the tools at its disposal, including potentially malicious code execution.

Prompt injections are a risk for all custom AI agents built by organizations that pass third-party data to an LLM and mitigating it requires a multi-layered approach as no defense is perfect. This includes forcing context separation by splitting different tasks to different LLM instances and employing the principle of least privilege for the agent or the tools it has access to, taking a human-in-the-loop approach for approving sensitive operations, filtering input for text strings that are commonly used in prompt injections, using system prompts to instruct the LLM to ignore commands from ingested data, using structured data formats, and more.

Rogue and vulnerable MCP servers

The Model Context Protocol (MCP) has become a standard for how LLMs interact with external data sources and applications to improve their context for reasoning. The protocol has seen rapid adoption and is a key component in developing AI agents, with tens of thousands of MCP servers now published online.

An MCP server is the component that allows an application to expose its functionality to an LLM through a standardized API and an MCP client is the component through which that functionality gets accessed. Integrated development environments (IDEs) such as Microsoft’s Visual Studio Code or those based on it, like Cursor and Antigravity, natively support integration with MCP servers and command-line-interface tools such as Claude Code CLI can also access them.

MCP servers can be hosted and downloaded from anywhere, for example GitHub, and they can contain malicious code. Researchers recently showed how a rogue MCP server could inject malicious code into the built-in browser from Cursor IDE.

However, MCP servers don’t necessarily have to be intentionally rogue to be a security threat. Many MCP servers can have vulnerabilities and misconfigurations and can open a path to OS command injection. The communication between MCP clients and MCP servers is also not always secure and can be exposed to an attack called prompt hijacking where attackers can get access to servers by guessing session IDs.