{"id":8437,"date":"2026-06-10T09:00:00","date_gmt":"2026-06-10T09:00:00","guid":{"rendered":"https:\/\/cybersecurityinfocus.com\/?p=8437"},"modified":"2026-06-10T09:00:00","modified_gmt":"2026-06-10T09:00:00","slug":"ai-red-teaming-comes-of-age","status":"publish","type":"post","link":"https:\/\/cybersecurityinfocus.com\/?p=8437","title":{"rendered":"AI red teaming comes of age"},"content":{"rendered":"<div>\n<div class=\"grid grid--cols-10@md grid--cols-8@lg article-column\">\n<div class=\"col-12 col-10@md col-6@lg col-start-3@lg\">\n<div class=\"article-column__content\">\n<div class=\"container\"><\/div>\n<p>When <a href=\"https:\/\/www.linkedin.com\/in\/rssk\/\">Ram Shankar Siva Kumar<\/a> launched Microsoft\u2019s AI red team in 2019, the discipline barely existed.<\/p>\n<p>\u201cThe running joke used to be that people who used to work in AI red teaming, you can round them up in a 14-foot catamaran,\u201d he tells CSO.<\/p>\n<p>At the time, Microsoft\u2019s approach looked familiar to anyone in cybersecurity: Attack machine learning systems the same way security teams attacked everything else. Identify weaknesses, emulate adversaries, and uncover vulnerabilities before products reach customers.<\/p>\n<p>Then GPT-4 arrived. \u201cThe tool that we had changed; actually, it broke,\u201d Siva Kumar says. The attacks his team had developed against earlier machine learning systems no longer worked against large language models. The tools had to be rebuilt. The methodologies had to be newly devised. Even the definition of the job had to be rebuilt.<\/p>\n<p>\u201cWe had to retool completely, and we also had to rethink what it means to red team an AI system,\u201d he says.<\/p>\n<p>That rethinking is still under way. Today, AI red teaming has become one of the fastest-growing specialties in cybersecurity, with dedicated teams at Microsoft, Anthropic, OpenAI, Google, and Nvidia. 
But the field is grappling with a more fundamental question than which tools to use: What exactly is the job?<\/p>\n<h2 class=\"wp-block-heading\">Not your father\u2019s penetration test<\/h2>\n<p>The most basic difference between testing traditional software and testing AI reshapes everything else: AI is not deterministic; it\u2019s probabilistic.<\/p>\n<p>\u201cThe same attack might only work one time out of 100 times or 10 times out of 100 times or 90 times out of 100 times,\u201d <a href=\"https:\/\/www.linkedin.com\/in\/dane-sherrets-7a049973\/\">Dane Sherrets<\/a>, staff innovation architect at HackerOne, tells CSO. That changes how security teams evaluate risk. Instead of asking only whether a vulnerability exists, they must also determine how frequently it appears, under what conditions, and whether it can be reliably reproduced.<\/p>\n<p><a href=\"https:\/\/www.linkedin.com\/in\/pebryan\">Pete Bryan<\/a>, technical lead of the AI red team at Microsoft, thinks the probabilistic nature of AI systems fundamentally changes the testing process. Systems must be evaluated repeatedly, under varying conditions, to understand how they behave and whether risky outputs emerge consistently.<\/p>\n<p>The challenge is not only that AI behaves differently from traditional software. It is also capable of things traditional software could never do.<\/p>\n<p><a href=\"https:\/\/www.linkedin.com\/in\/tomgillis1\/\">Tom Gillis<\/a>, SVP\/GM of the infrastructure and security group at Cisco, points to frontier models discovering vulnerabilities in complex software systems at a pace that would have seemed implausible a few years ago. \u201cThey\u2019re able to find weird interdependencies,\u201d he tells CSO. 
\u201cI change the state of this little piece, which changes the state of that piece, which changes the state of this piece, which leads to a memory overflow.\u201d<\/p>\n<p>Modern models can analyze enormous codebases and identify chains of interaction that eventually lead to exploitable conditions \u2014 relationships human researchers miss even after years of scrutiny.<\/p>\n<p>That capability cuts both ways. The same reasoning power that makes AI useful for security testing makes AI systems themselves a new kind of target, one that requires different methods to probe.<\/p>\n<h2 class=\"wp-block-heading\">\u2018Teenager with a potty mouth\u2019<\/h2>\n<p>Traditional red teams spend most of their time modeling sophisticated adversaries: nation-states, cybercriminal groups, advanced persistent threats. AI red teams still care about those actors \u2014 but the roster of relevant threat actors has grown considerably.<\/p>\n<p>\u201cOne of the enduring personas that we also focus on is what my team lovingly likes to call a teenager with a potty mouth,\u201d Microsoft\u2019s Siva Kumar says.<\/p>\n<p>The phrase captures one of the defining realities of the generative AI era. Many of the most significant jailbreaks and prompt injection attacks were not discovered by elite offensive operators. They were found by curious users experimenting with prompts \u2014 people who had no particular expertise but plenty of creativity and time.<\/p>\n<p>\u201cIn 2019, if we had had this interview, I\u2019d have said, \u2018Hey, my job is to emulate nation-state adversaries and to emulate advanced persistent threats,\u2019\u201d Siva Kumar says.<\/p>\n<p>Those adversaries still matter. 
But AI systems can fail in response to ordinary users asking unexpected questions, creatively manipulating prompts, or simply interacting with the technology in ways its developers never anticipated.<\/p>\n<p><a href=\"https:\/\/www.linkedin.com\/in\/ianswanson\/\">Ian Swanson<\/a>, AI security leader at Palo Alto Networks, sees this reflected in how enterprises think about the problem. \u201cWhat that really means is we need to behaviorally test AI for security, safety, and maybe even brand reputational type risks,\u201d he tells CSO.<\/p>\n<p>The question is no longer simply whether an attacker can break into a system. It is whether the system itself can behave in ways that create risk \u2014 regardless of who is doing the asking.<\/p>\n<h2 class=\"wp-block-heading\">Safety moves in alongside security<\/h2>\n<p>That reframing has expanded AI red teaming well beyond its cybersecurity origins.<\/p>\n<p>When Microsoft\u2019s team launched in 2019, its focus was largely on the confidentiality, integrity and availability of machine learning systems \u2014 the <a href=\"https:\/\/www.csoonline.com\/article\/568917\/the-cia-triad-definition-components-and-examples.html\">traditional CIA triad<\/a>. Generative AI dramatically enlarged that mandate. Trust and safety concerns now sit alongside conventional security ones. Misinformation, dangerous knowledge domains, manipulation risks, and questions about autonomous AI behavior all fall within the remit of many AI red teams today.<\/p>\n<p>\u201cThe composition of my team has commensurately increased to kind of meet the AI moment,\u201d Siva Kumar says. His team now includes a psychologist, a linguist, and a specialist in bioweapons \u2014 expertise that would have seemed out of place in a traditional security organization.<\/p>\n<p>Bryan sees the expansion as a natural consequence of AI\u2019s role in society. \u201cAI red teaming has a much broader scope,\u201d he says. 
\u201cWe\u2019re worried about those engineering technical elements, but we also encompass the socio-technical risks of the safety side.\u201d<\/p>\n<p>That expanded set of concerns means evaluating harms that traditional cybersecurity teams rarely encountered: misinformation amplification, psychosocial risk, and content that can cause harm without any attacker ever being involved.<\/p>\n<p>\u201cWe need skillsets that are much broader \u2014 people who think deeply about psychosocial harms or misinformation amplification \u2014 to cover the full remit of AI safety and security,\u201d Bryan says.<\/p>\n<p>AI red teaming\u2019s growing remit has even attracted Washington\u2019s attention. President Biden\u2019s 2023 <a href=\"https:\/\/www.federalregister.gov\/documents\/2023\/11\/01\/2023-24283\/safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence\">executive order<\/a> formally defined AI red teaming and required safety testing results for the most powerful models to be shared with the government before deployment. President Trump later <a href=\"https:\/\/www.whitehouse.gov\/presidential-actions\/2025\/01\/initial-rescissions-of-harmful-executive-orders-and-actions\/\">revoked<\/a> the order, leaving standards development largely to industry and voluntary frameworks.<\/p>\n<h2 class=\"wp-block-heading\">Red teaming the whole car<\/h2>\n<p>One of the most common mistakes organizations make when they begin testing AI systems is focusing exclusively on the model.<\/p>\n<p>HackerOne\u2019s Sherrets uses a car analogy. The model is the engine. But the AI system is everything connected to it \u2014 the databases, the APIs, the customer records, the payment systems, the internal workflows. \u201cWhat I encourage people to do is red team the entire car,\u201d he says. 
\u201cWe need to understand not only the engine, but also all of the other pieces that connect to that engine and how they operate together, because how they connect and operate together could also have vulnerabilities.\u201d<\/p>\n<p>Weaknesses often emerge not from the model itself but from the interactions between components. Sherrets points to an <a href=\"https:\/\/www.forbes.com\/sites\/marisagarcia\/2024\/02\/19\/what-air-canada-lost-in-remarkable-lying-ai-chatbot-case\/\">Air Canada case<\/a> to make the point.<\/p>\n<p>The airline\u2019s customer service chatbot invented a bereavement refund policy that did not exist. A customer relied on it. The airline ended up in court. Nobody had hacked the system. Nobody had exploited a vulnerability in the conventional sense. The chatbot behaved incorrectly \u2014 and the organization was held responsible for what its AI said on its behalf.<\/p>\n<p>As organizations deploy AI assistants across customer service, sales, HR, and internal operations, that kind of failure becomes an increasingly significant risk category. The system does not need to be attacked to cause harm. It needs only to be wrong, at the wrong moment, in front of the wrong person.<\/p>\n<h2 class=\"wp-block-heading\">The agent problem<\/h2>\n<p>For much of the generative AI era, red teamers worried primarily about outputs. Would the model hallucinate? Would it leak sensitive information? Would it generate harmful content?<\/p>\n<p>Agents introduce a different category of risk entirely.<\/p>\n<p>Agentic AI systems do not just generate text. They retrieve information. They invoke APIs. They process refunds. They access databases. They perform tasks on behalf of users with real-world consequences. A vulnerability that causes a chatbot to say something wrong is a communications problem. 
A vulnerability in an agent that <a href=\"https:\/\/www.csoonline.com\/article\/4047974\/agentic-ai-a-cisos-security-nightmare-in-the-making.html\">executes business processes is an operational one<\/a>.<\/p>\n<p>The shift extends beyond testing AI systems themselves. Cisco\u2019s Gillis argues that increasingly capable AI models are accelerating the pace of change across enterprise environments, making static security approaches obsolete. \u201cThis idea of hardening your infrastructure and then hoping it never changes for 18 months, that is over, permanently dead, gone in this post-Mythos environment,\u201d he tells CSO.<\/p>\n<p>The implication is that security testing can no longer be a periodic exercise. As AI systems become more autonomous, organizations must continuously evaluate how those systems behave in production environments. \u201cWe need to test the behavior to make sure agents are doing the right things,\u201d Swanson says.<\/p>\n<p>Microsoft\u2019s Bryan believes agentic systems are forcing a convergence between traditional cybersecurity red teams and AI red teams that will define the field\u2019s next phase. At Microsoft, the two teams remain separate organizations \u2014 but they work increasingly closely together, because the systems they now test combine conventional software risks with AI-specific safety concerns in ways that neither team can address alone.<\/p>\n<p>\u201cAgentic AI is really the intersection of all of the cybersecurity risks that come with traditional software systems along with all of the AI security and safety risks,\u201d he says.<\/p>\n<h2 class=\"wp-block-heading\">AI is a team sport, too<\/h2>\n<p>Bryan points to Microsoft\u2019s decision to open-source AI safety testing tools as a recognition that AI risk is not a problem model providers can solve on behalf of their customers. Enterprises deploying AI need their own testing capabilities. 
Not every organization will maintain a specialized AI red team \u2014 but every organization deploying AI needs to understand its risks.<\/p>\n<p>\u201cLike cybersecurity, which has always kind of been a team sport, AI safety and security is really a community-driven piece,\u201d Bryan says. \u201cEveryone has their role and responsibility.\u201d<\/p>\n<p>Bryan also sees the long-term trajectory of the field bending toward a different kind of convergence. \u201cI think there will just become a point where having the AI for red teaming almost kind of becomes redundant, and that just is the red teaming,\u201d he says. \u201cEveryone is using AI to improve their work regardless of the area.\u201d<\/p>\n<p>What will remain distinct is the challenge of testing AI systems themselves \u2014 probabilistic systems that expand in scope with each new capability and that can cause harm without anyone intending them to.<\/p>\n<p>Five years ago, AI red teaming was a niche specialty practiced by a handful of researchers. Today, it encompasses cybersecurity, safety, misinformation, autonomy, and governance. Tomorrow it will look different again \u2014 shaped by whatever the next generation of AI systems turns out to be capable of.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>","protected":false},"excerpt":{"rendered":"<p>When Ram Shankar Siva Kumar launched Microsoft\u2019s AI red team in 2019, the discipline barely existed. \u201cThe running joke used to be that people who used to work in AI red teaming, you can round them up in a 14-foot catamaran,\u201d he tells CSO. 
At the time, Microsoft\u2019s approach looked familiar to anyone in cybersecurity: [&hellip;]<\/p>\n","protected":false},"author":0,"featured_media":8438,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3],"tags":[],"class_list":["post-8437","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-education"],"_links":{"self":[{"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=\/wp\/v2\/posts\/8437"}],"collection":[{"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=8437"}],"version-history":[{"count":0,"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=\/wp\/v2\/posts\/8437\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=\/wp\/v2\/media\/8438"}],"wp:attachment":[{"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=8437"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=8437"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=8437"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}