Anthropic’s Claude Code Security rollout is an industry wake-up call

When Anthropic launched a “limited research preview” of its Claude Code Security offering on Friday, Wall Street investors sent the stocks of the largest cybersecurity vendors plunging.

But did the Anthropic rollout warrant such a reaction? 

After all, those companies, including CrowdStrike, Zscaler, Palo Alto Networks and Okta, are preparing their own agentic capabilities, and even if they weren’t, the code-checking capabilities promised by Anthropic are not initially a replacement for their functionality.

“Code security is a vital piece of a cybersecurity program and overall tech stack, but far from the only one,” Justin Greis, CEO of consulting firm Acceligence, pointed out. “There’s no doubt that improving code security and enhancing the Secure Software Development Lifecycle (SDLC) and Product Development Lifecycle (PDLC) will strengthen an organization’s security posture, but it will not eliminate the need for tools and services like EDR/MDR, IAM, threat intel, and data protection.”

He added, “However, this is a clear signal that the AI companies are going to continue to expand their use cases, analyze more and more data and code, and bring real insight and action to security organizations. The pace of their innovation is staggering and unprecedented.”

Keeps a human in the loop

However, Greis offered a warning to CISOs: “For those who blindly rely on any code scanning tool, AI or otherwise, to replace the fundamentals of good security practices and secure coding, this is your red blinking light to not outsource the very expertise that protects the value proposition of the product or service you’re developing. We must keep qualified humans in the loop and ensure we use AI as an accelerator, not a replacement for expertise.”

Anthropic’s announcement stated, “Claude Code Security, a new capability built into Claude Code on the web” will “[scan] codebases for security vulnerabilities and suggest targeted software patches for human review, allowing teams to find and fix security issues that traditional methods often miss.”

The rollout is limited, at least initially, Anthropic said. “We’re releasing it as a limited research preview to Enterprise and Team customers, with expedited access for maintainers of open-source repositories.” 

The company did not respond to a request for an interview.

Anticipating concerns that the code-checker will take over security functions rather than augment them, Anthropic stressed that it wants to keep humans in the loop. 

“Rather than scanning for known patterns, Claude Code Security reads and reasons about your code the way a human security researcher would: understanding how components interact, tracing how data moves through your application, and catching complex vulnerabilities that rule-based tools miss,” the announcement said. “Every finding goes through a multi-stage verification process before it reaches an analyst. Claude re-examines each result, attempting to prove or disprove its own findings and filter out false positives.”

It noted that validated findings appear in the Claude Code Security dashboard, where teams can review them, inspect the suggested patches, and approve fixes. But, it said, “because these issues often involve nuances that are difficult to assess from source code alone, Claude also provides a confidence rating for each finding. Nothing is applied without human approval: Claude Code Security identifies problems and suggests solutions, but developers always make the call.”

Anchors security posture to the model

However, those assurances didn’t make all concerns evaporate. 

“The moment those vibe coders plug a foundation model into their CI pipeline, their entire security posture is no longer anchored only to the company’s code,” I-Gentic AI CEO Zahra Timsah pointed out.

“It is anchored to the current behavior of that model. Anthropic can update weights, adjust reasoning heuristics, refine safety layers, or change how semantic patterns are interpreted. None of that requires your approval. None of that triggers your internal change control. Your pipelines stay green. Your dashboards stay stable. But the engine defining what counts as a vulnerability has changed,” she said.

“Anthropic is in full control. That means your secure codebase today could be evaluated under a different vulnerability boundary tomorrow without you touching a single line. This is outsourcing part of your security definition to an upstream probabilistic system you do not control.”
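One way teams address the change-control gap Timsah describes is to pin the exact model version their pipeline calls and treat any change as a reviewable event rather than a silent update. The sketch below is illustrative only: the allowlist contents and function names are hypothetical, and the dated-snapshot identifier is an assumed example of Anthropic's versioned model naming, not a recommendation of a specific model.

```python
# Hypothetical CI guard: stop the pipeline when the security-scanning
# model identifier is not on the team's approved allowlist, so a model
# change passes through internal change control instead of landing
# silently. Allowlist entries and names here are illustrative.

APPROVED_MODELS = {
    "claude-sonnet-4-20250514",  # example dated snapshot, assumption
}

def model_change_requires_review(model_id: str,
                                 approved: set[str] = APPROVED_MODELS) -> bool:
    """Return True when a human should review the pipeline's model choice."""
    return model_id not in approved

if __name__ == "__main__":
    # A pinned, approved snapshot proceeds without intervention.
    assert not model_change_requires_review("claude-sonnet-4-20250514")
    # A floating alias or unreviewed snapshot halts for review.
    assert model_change_requires_review("claude-sonnet-latest")
```

This does not prevent the vendor from changing the model behind a pinned identifier, but it makes the pipeline's dependency explicit and auditable, which is the substance of the change-control objection.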

Outsourcing dependence is nothing new

But others have suggested that security outsourcing has been happening gradually for years, starting with cloud operations and SaaS, moving next to cybersecurity firms that took increasing control of enterprise cyber operations, and finally to genAI and agentic vendors.

Flavio Villanustre, CISO for the LexisNexis Risk Solutions Group, applauded the fact that Anthropic is at least paying lip service to human oversight of the process, but, he noted, “this doesn’t mean that people will not cut corners in some cases and add yet another LLM with non-deterministic behavior to the existing problem of code generation by an LLM with non-deterministic behavior too.”

An ever-present concern about both agentic and generative AI systems is their tendency to hallucinate, in addition to having other reliability challenges. But several cybersecurity specialists said that is nothing new, in that large security systems always have their fair share of false positives and false negatives. 

Cybersecurity consultant Brian Levine, executive director of FormerGov, said the Wall Street reaction to Anthropic’s announcement could signal that investors “are recalibrating around the idea that AI‑native security might compress or even reorder parts of the stack. Whether that’s justified or just reflexive fear of disruption, it suggests that people now believe a foundation model could meaningfully compete with, or be more helpful than, traditional detection and analysis engines.”

A different category of analysis

If Anthropic can continue to deliver, it could mean an even more fundamental shift, he noted.

“If a model can reason across sprawling codebases, correlate patterns that static tools miss, and do it continuously, that’s not incremental improvement, it may be a whole different category of analysis. It suggests a world where vulnerability discovery becomes less about signature libraries and more about adaptive interpretation,” Levine said.

But he, like Timsah, is concerned about changes in the model impacting an organization’s security posture. “That’s the tradeoff,” he said. “Unprecedented analytical power paired with a new kind of dependency that security leaders will have to evaluate with clear heads.”

A single point of trust and a single point of failure

Joshua Woodruff, CEO of MassiveScale.AI, said he found the Anthropic move problematic, but not for what it might do to other security companies. He is mostly worried about the benefits to cyber attackers. 

“If Anthropic’s model found 500+ unknown high-severity vulns in open source projects, that means any attacker running a similar model can find those same vulns right now. Only no one’s reporting them. They’re exploiting them,” Woodruff said. “Vulnerability discovery just went asymmetric. Defenders get a tool that suggests patches for human review. Attackers get a tool that finds zero-days at machine speed with no review step.”

There’s another issue, he added: “If an AI agent finds the bug and suggests the fix, who’s checking the patch? You’re trusting the same model to be both auditor and repair crew. No security team would ever let the same person find the vulnerability and write the fix without some sort of independent review. But that’s exactly what happens if teams treat human review as a rubber stamp. The fix becomes the new attack surface.”

Ravid Circus, CPO at Seemplicity, agreed with Woodruff that the potential circular use of AI to both find the holes and fix them is a concern. 

“When the same AI writes the code, finds the vulnerabilities, and proposes the fix, you’ve created a single point of trust and a single point of failure. Compromise that and you don’t just introduce bugs, you potentially manufacture backdoors at scale,” Circus said. “I worry we’re about to see ‘We use Claude Security’ become the new checkbox, like SOC 2 badges or Zero Trust branding. The real question isn’t which AI you use. It’s whether your organization has the operational maturity to validate and govern what it tells you. ‘Claude said we’re secure’ cannot become a security posture.”

To be sure, Anthropic has had its own issues with cybersecurity recently, but few disagreed that what it has been delivering for code examination is impressive. The question is whether it will ultimately deliver better pricing, scalability, and reliability than established security vendors, and how soon that could happen.

In fact, another cyber executive, Gadi Evron, CEO of Knostic, argues that because innovation is moving far faster than most in the industry have ever seen, some organizations may not be re-evaluating AI offerings often enough.

“It is moving so fast. People who tried [Anthropic’s offering] two months ago don’t understand how well it works now,” Evron said.

And, said Rock Lambros, director of AI security at Zenity, “as long as genAI remains non-deterministic, secure-at-generation will always have gaps and you’ll always need post-generation validation for something that can’t guarantee the same output twice. The real problem is that nobody is staffed, funded, or even scoped to govern the autonomous systems that are already deployed.”
