{"id":4482,"date":"2025-08-21T18:47:53","date_gmt":"2025-08-21T18:47:53","guid":{"rendered":"https:\/\/cybersecurityinfocus.com\/?p=4482"},"modified":"2025-08-21T18:47:53","modified_gmt":"2025-08-21T18:47:53","slug":"ai-hacking-teams-now-autonomously-exploit-zero-day-vulnerabilities-research-reveals","status":"publish","type":"post","link":"https:\/\/cybersecurityinfocus.com\/?p=4482","title":{"rendered":"AI Hacking Teams Now Autonomously Exploit \u201cZero-Day\u201d Vulnerabilities, Research Reveals"},"content":{"rendered":"<p><strong>A groundbreaking new study demonstrates that teams of LLM agents can find and exploit previously unknown security flaws without human guidance, as experts warn enterprise AI assistants are wide open to a devastating \u201czero-click\u201d takeover.<\/strong><\/p>\n<p><strong><em>LAS VEGAS, NV \u2013 March 2025<\/em><\/strong> \u2013 The era of autonomous AI-powered cyberattacks is no longer a theoretical future threat\u2014it is a present-day reality. New research from the University of Illinois Urbana-Champaign has demonstrated for the first time that coordinated teams of artificial intelligence agents can successfully discover and exploit real-world, \u201czero-day\u201d vulnerabilities without any prior knowledge of the flaw.<\/p>\n<p>The study, titled <em>\u201cTeams of LLM Agents can Exploit Zero-Day Vulnerabilities,\u201d<\/em> introduces a multi-agent system called HPTSA (Hierarchical Planning and Task-Specific Agents) that marks a significant leap in the offensive capabilities of AI.<\/p>\n<p>\u201cThis resolves an open question in the security community,\u201d said lead researcher Daniel Kang. 
\u201cWe\u2019ve shown that a more complex, structured AI setup can effectively exploit vulnerabilities that are completely new to the AI, moving far beyond the simple, scripted attacks of yesterday.\u201d<\/p>\n<h3 class=\"wp-block-heading\">How the AI Hacking Teams Work<\/h3>\n<p>Traditional single AI agents struggle with the long-range planning and exploration needed to find unknown vulnerabilities. They often get stuck in dead ends and cannot efficiently backtrack.<\/p>\n<p>The HPTSA system overcomes this by creating a hierarchy of specialized agents:<\/p>\n<p><strong>A Planning Agent:<\/strong> Acts as a supervisor, exploring a target website to map its structure and identify potential weak points.<\/p>\n<p><strong>A Team Manager:<\/strong> Receives instructions from the planner and decides which specialized expert agent to deploy.<\/p>\n<p><strong>Task-Specific Expert Agents:<\/strong> A team of specialists (e.g., for SQL injection, Cross-Site Scripting, CSRF attacks) equipped with custom tools and documentation to exploit specific vulnerability types.<\/p>\n<p>This structure allows the AI system to methodically probe a target, switch strategies when one fails, and ultimately exploit flaws it was given no prior description of.<\/p>\n<h3 class=\"wp-block-heading\">Alarming Success Rate Against Critical Flaws<\/h3>\n<p>The researchers tested HPTSA on a benchmark of 14 real-world, zero-day vulnerabilities, all published after the knowledge cutoff date of the AI models used, ensuring they were truly \u201cunknown\u201d to those models.<\/p>\n<p>The results were stark:<\/p>\n<p>The GPT-4-powered HPTSA system successfully exploited <strong>42%<\/strong> of the vulnerabilities when given five attempts per target.<\/p>\n<p>It outperformed a single, non-specialized AI agent by a factor of <strong>4.3x<\/strong>.<\/p>\n<p>It performed within <strong>1.8x<\/strong> of an AI that was <em>given a description of the vulnerability<\/em> ahead of time, a significant milestone.<\/p>\n<p>Popular 
open-source vulnerability scanners like OWASP ZAP and Metasploit achieved a <strong>0%<\/strong> success rate against the same targets.<\/p>\n<p>The exploited vulnerabilities were not minor; they included multiple flaws classified as <strong>\u201cCritical\u201d<\/strong> (CVSS score 9.0+), which could lead to full system compromise.<\/p>\n<h3 class=\"wp-block-heading\">The Enterprise AI Attack Surface is Already Open<\/h3>\n<p>While the academic study demonstrates capability, a separate, alarming presentation at Black Hat USA 2025 shows how exposed enterprises already are <em>today<\/em>.<\/p>\n<p>In a Dark Reading interview, Michael Bargury, CTO of security firm Zenity, detailed his \u201cAgentFlayer\u201d research, which uncovered a critical <strong>\u201czero-click\u201d exploit<\/strong> method targeting enterprise AI assistants such as Microsoft Copilot, Google Gemini, and Salesforce Einstein.<\/p>\n<p>\u201cModern AI assistants have grown arms and legs,\u201d Bargury explained. \u201cThey are integrated with your email, documents, and calendars and can perform actions on your behalf. The problem is that an external attacker needs nothing but a user\u2019s email address to completely take over these agents.\u201d<\/p>\n<p>\u201cZero-click\u201d means the attack requires no interaction from the user. Once compromised, the AI agent, which users trust as an adviser, can be turned into a powerful tool for data theft, internal manipulation, and espionage.<\/p>\n<p>\u201cAttackers can use these agents to manipulate you as a human,\u201d Bargury warned. 
\u201cThe trusted adviser can also guide you off a cliff.\u201d<\/p>\n<h3 class=\"wp-block-heading\">A Call for a New Security Paradigm<\/h3>\n<p>Both studies converge on a critical conclusion: the current approach to AI security is fundamentally broken.<\/p>\n<p>Bargury criticized the industry\u2019s focus on preventing \u201cprompt injection\u201d attacks through simple guardrails and blocklists, comparing it to the naive security of the 1990s.<\/p>\n<p>\u201cThe solution is not something new. We need to assume breaches,\u201d he stated. \u201cApply defense in depth. Apply the lessons we\u2019ve learned, and stop trying to build the perimeter.\u201d<\/p>\n<p>The authors of the academic paper concur, noting that their findings suggest cybersecurity\u2014both offensive and defensive\u2014will rapidly accelerate. They hope their work pushes LLM providers and enterprises to think more carefully about deployment and safeguards.<\/p>\n<p><strong>The Bottom Line:<\/strong> Organizations adopting AI agents must immediately move from expecting vendors to \u201cfix\u201d the problem to creating dedicated, managed security programs focused on defense-in-depth. The agents are no longer just tools; they are a new, and highly vulnerable, attack surface.<\/p>","protected":false},"excerpt":{"rendered":"<p>A groundbreaking new study demonstrates that teams of LLM agents can find and exploit previously unknown security flaws without human guidance, as experts warn enterprise AI assistants are wide open to a devastating \u201czero-click\u201d takeover. 
LAS VEGAS, NV \u2013 March 2025 \u2013 The era of autonomous AI-powered cyberattacks is no longer a theoretical future threat\u2014it [&hellip;]<\/p>\n","protected":false},"author":0,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-4482","post","type-post","status-publish","format-standard","hentry","category-blog"],"_links":{"self":[{"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=\/wp\/v2\/posts\/4482"}],"collection":[{"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=4482"}],"version-history":[{"count":0,"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=\/wp\/v2\/posts\/4482\/revisions"}],"wp:attachment":[{"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=4482"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=4482"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=4482"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}