Beef up AI security with zero trust principles

Many CSOs worry about their firm’s AI agents spitting out advice to users on how to build a bomb, or citing non-existent legal decisions. But those are the least of their worries, said a security expert at this week’s Black Hat security conference in Las Vegas. Systems using large language models (LLMs) that connect to enterprise data contain other vulnerabilities that will be leveraged in dangerous ways unless developers and infosec leaders tighten security.

One example that David Brauchler, NCC Group’s technical director and head of AI and machine learning security, showed the conference was how easy it was for penetration testers to pull passwords from a customer’s AI system.

“This organization didn’t properly tag the trust levels associated with their data and gave the AI access to their entire organization’s data lake,” Brauchler said in an interview after his presentation. “Because they didn’t have the proper permissions assigned to the data and the proper permissions assigned to the user, they had no fine-grained access control to assign what types of information my user level was able to interact with.”
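
In concrete terms, that kind of fine-grained access control means filtering data against the caller’s permissions before anything reaches the model’s context window. A minimal sketch of the idea, assuming invented Record and User types and a numeric sensitivity scale (this is illustrative, not NCC Group’s code):

```python
# Illustrative sketch (hypothetical types): enforce per-user access control
# before any record reaches the model's context window, rather than hoping
# the model withholds data it has already been shown.
from dataclasses import dataclass

@dataclass
class Record:
    text: str
    sensitivity: int   # e.g. 0 = public, 1 = internal, 2 = restricted

@dataclass
class User:
    name: str
    clearance: int     # highest sensitivity level this user may read

def retrieve_for_prompt(user: User, records: list[Record]) -> list[str]:
    # Only records at or below the caller's clearance are eligible for the LLM.
    return [r.text for r in records if r.sensitivity <= user.clearance]

if __name__ == "__main__":
    store = [
        Record("Public product FAQ", sensitivity=0),
        Record("Internal pricing sheet", sensitivity=1),
        Record("Password rotation schedule", sensitivity=2),
    ]
    # A low-clearance user's prompt can only ever be grounded in public data.
    print(retrieve_for_prompt(User("guest", clearance=0), store))
```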

Guardrails alone aren’t enough

Guardrails, such as word or content filters applied to a model’s output, just aren’t enough to lower risk for today’s AI systems, his presentation stressed. In fact, he added in the interview, “when we see our customers say ‘We need stronger guardrails,’ what they’re saying is ‘We are accepting an application with known vulnerabilities and just hoping a threat actor doesn’t decide to target us.’”

“Mature AI security isolates potentially malicious inputs from trusted contexts,” he told the conference. Developers and CSOs have to bring the principles of zero trust to the AI landscape with tactics like assigning trust labels to all application data.
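
What assigning trust labels could look like in practice is attaching an origin-based label to every piece of data that can reach a model’s context, and treating the whole context as no more trustworthy than its least trusted item. A minimal sketch, assuming an illustrative Trust scale and ContextItem wrapper rather than any standard API:

```python
# Illustrative sketch: every item that can land in a model's context carries
# an explicit, origin-based trust label, so downstream code can decide what
# the model is allowed to do while that item is in scope.
from dataclasses import dataclass
from enum import IntEnum

class Trust(IntEnum):
    UNTRUSTED = 0   # e.g. web pages, customer reviews, inbound email
    INTERNAL = 1    # data produced by the organization's own systems
    TRUSTED = 2     # vetted prompts and configuration written by developers

@dataclass
class ContextItem:
    content: str
    trust: Trust

def context_trust(items: list[ContextItem]) -> Trust:
    # The context window is only as trustworthy as its least trusted item.
    return min((item.trust for item in items), default=Trust.TRUSTED)

if __name__ == "__main__":
    ctx = [ContextItem("developer system prompt", Trust.TRUSTED),
           ContextItem("pasted web page", Trust.UNTRUSTED)]
    print(context_trust(ctx).name)   # -> UNTRUSTED
```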

“Right now we’re seeing organizations implementing these language models into their applications — usually because the shareholders demand some sort of AI these days — and the developers really don’t understand how to do it in a secure manner,” he told CSO.

“Those who are architecting AI systems don’t understand some of the implications this has on their environments,” he noted, adding, “CSOs don’t know what lessons to bring back to their teams.”

Almost every AI system that NCC Group has done an assessment on has been vulnerable to security attacks, he pointed out: “We have been able to use large language models to compromise database entries, get code execution in environments, take over your cloud.”

“Businesses are ignorant of how their risk is augmented by the introduction of AI,” he said. Large language models are manipulated by the inputs they receive. As soon as an AI agent is exposed to data that has a lower level of trust than the user whose account is running that model, there’s the potential for that untrusted data to manipulate the language model’s behavior and access trusted functionality or sensitive resources.

Imagine, he said, a retailer with an AI system that lets online buyers ask a chatbot to summarize customer reviews of a product. If the system is compromised by a crook, the prompt [query] can be ignored in favor of automatically purchasing a product the threat actor wants.
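
One way to express that defense, sketched here with invented tool names such as purchase_item rather than any real retailer’s integration, is to shrink the set of tools exposed to the model the moment untrusted content like customer reviews enters its context:

```python
# Illustrative sketch: once untrusted content (customer reviews) is in the
# context window, the model loses access to side-effecting tools such as
# purchasing, no matter what an injected instruction asks for.
def allowed_tools(context_has_untrusted_data: bool) -> set[str]:
    read_only = {"summarize_reviews", "lookup_product"}
    privileged = {"purchase_item", "update_order"}
    # Privileged tools are only exposed when no untrusted data is in scope.
    return read_only if context_has_untrusted_data else read_only | privileged

if __name__ == "__main__":
    # Summarizing reviews puts untrusted text in the context, so a hidden
    # "buy 100 units" instruction has no purchase tool to call.
    print(allowed_tools(context_has_untrusted_data=True))
```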

Trying to eliminate prompt injections, such as, “show me all customer passwords,” is a waste of time, Brauchler added, because an LLM is a statistical algorithm that spits out an output. LLMs are intended to replicate human language interaction, so there’s no hard boundary between inputs that would be malicious and inputs that are trusted or benign. Instead, developers and CSOs need to rely on true trust segmentation, using their current knowledge.

“It’s less a question of new security fundamentals and more a question of how do we apply the lessons we have already learned in security and apply them in an AI landscape,” he said.

Strategies for CSOs

Brauchler offered three AI threat modeling strategies CSOs should consider:

Trust flow tracking: tracking the movement of data throughout an application and monitoring the level of trust associated with that data. It’s a defense against an attacker who manages to get untrusted data into an application to control its behavior and abuse trust;

Source-sink mapping: A data source is any system whose output goes into the context window of an LLM. A sink is any system that consumes the output of an LLM (such as a function call or another downstream system). The purpose of mapping sources and sinks is to discover whether there is an attack path through which a threat actor can get untrusted data into a source that reaches a sink the threat actor doesn’t already have access to (a simple sketch of this mapping follows the list);

Models as threat actors: Look at your threat model landscape and replace any LLMs with a threat actor. There’s a vulnerability if the theoretical threat actor at those points can access something they normally couldn’t. “Your team should make absolutely certain there is no way for the language model at that vantage point to be exposed to untrusted data,” he said. “Otherwise you risk critical level threats within your application.”
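
As a rough illustration of the second strategy, assuming hypothetical Source and Sink types rather than Brauchler’s own tooling, source-sink mapping boils down to flagging every path where attacker-influenced input could flow through a model into a privileged sink:

```python
# Illustrative sketch: enumerate every source that feeds a model and every
# sink that consumes its output, then flag paths where attacker-controlled
# input could reach a privileged sink through the model.
from dataclasses import dataclass

@dataclass
class Source:
    name: str
    attacker_controlled: bool   # can a threat actor influence this input?

@dataclass
class Sink:
    name: str
    privileged: bool            # does it act on model output with side effects?

def risky_paths(sources: list[Source], sinks: list[Sink]) -> list[tuple[str, str]]:
    # Any attacker-controlled source that can flow through the model into a
    # privileged sink is a path worth treating as a vulnerability.
    return [(src.name, snk.name)
            for src in sources if src.attacker_controlled
            for snk in sinks if snk.privileged]

if __name__ == "__main__":
    sources = [Source("support tickets", True), Source("system prompt", False)]
    sinks = [Sink("SQL query executor", True), Sink("chat reply to user", False)]
    print(risky_paths(sources, sinks))   # -> [('support tickets', 'SQL query executor')]
```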

“If we implement these security control primitives, we can begin to eliminate attack classes that right now we are seeing in every AI system we test,” he said.

One of the most critical strategies, Brauchler said, comes down to segmentation: LLMs that run in high-trust contexts should never be exposed to untrusted data, and models exposed to untrusted data should never have access to high-privilege functionality. “It’s a matter of segmenting those models that are operating in high trusted zones, and those operating with low trusted data.”
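
Expressed as a check a team could run against its own architecture, with invented role names and no claim to be Brauchler’s own method, the rule is simply that no single model should combine both properties:

```python
# Illustrative sketch: a model may see untrusted data, or it may hold
# privileged functionality, but never both within the same trust zone.
from dataclasses import dataclass

@dataclass
class ModelRole:
    name: str
    sees_untrusted_data: bool
    has_privileged_tools: bool

def segmentation_violations(roles: list[ModelRole]) -> list[str]:
    # Flag any model that mixes untrusted input with privileged capability.
    return [r.name for r in roles
            if r.sees_untrusted_data and r.has_privileged_tools]

if __name__ == "__main__":
    roles = [
        ModelRole("review summarizer", sees_untrusted_data=True,  has_privileged_tools=False),
        ModelRole("order assistant",   sees_untrusted_data=False, has_privileged_tools=True),
        ModelRole("do-everything bot", sees_untrusted_data=True,  has_privileged_tools=True),
    ]
    print(segmentation_violations(roles))   # -> ['do-everything bot']
```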

In addition, CSOs should approach AI defense beginning with their architecture teams. “AI security is not something you can add as a patch-on solution,” he said. “You can’t add layers of guardrails, you can’t add something in the middle to make your application magically secure. Your teams need to be developing your systems with security from the ground up. And the encouraging aspect is, this isn’t a new lesson. Security and its fundamentals are still applying in the same way we’ve seen in the last 30 years. What’s changed is how they’re integrated into environments that leverage AI.”

He also referred CSOs and developers to:

the ISO 42001 standard for establishing, implementing, and maintaining an Artificial Intelligence Management System;

the MITRE ATLAS knowledge base of adversary tactics and techniques against AI-enabled systems;

the OWASP Top 10 Risks and Mitigations for LLMs.
