Frontier AI models offer sneak peek of seismic cyber shifts ahead

June 11 • 9:00 am

The advent of Claude Mythos, combined with the release of OpenAI’s GPT-5.5, has changed the threat model for CISOs.

The arrival of those frontier AI models — and the ones soon to follow — makes it much easier to discover and chain vulnerabilities at a speed and scale that will require most cyber departments to rethink their strategies and operations.

Experts polled by CSO on the impact of these capabilities say defenders should assume AI will make initial compromise more likely and that they should focus less on trying to patch everything perfectly and more on limiting blast radius through stronger identity controls, least privilege, and internal segmentation.

Wild frontier

Although access to Mythos remains restricted to a limited number of trusted partners, comparable AI-based vulnerability discovery platforms are in the works, and few experts think access to sufficiently capable AI models will be kept from attackers for long. Anthropic itself has now released to the public the “Mythos-class” Fable 5 AI model, with extra cybersecurity guardrails.

Noe Ramos, vice president of AI operations at Agiloft, says CISOs should operate on the assumption that attackers will get access to frontier AI-style capabilities within months if not sooner.

“Whether through jailbreaks, fine-tuned open-weight derivatives, or purpose-built black-hat versions, determined threat actors are resourceful and motivated,” says Ramos. “Frontier AI capabilities tend to diffuse faster than the security community expects and slower than the headlines suggest. Defenders should plan for the former.”

Rather than jailbreaking frontier models, attackers are more likely to gain access to capable vulnerability discovery platforms by fine-tuning open-weight models on offensive security data and running them locally.

“We see people out there that are starting to work on replicating the results of Mythos with existing infrastructure and open source models that they don’t have to run through the clouds,” Martin Roesch, lead developer of the Snort intrusion detection system turned head of cloud at security startup Vectra AI, tells CSO.

“This kind of industrial-scale vulnerability discovery and potential exploit generation is not something that most of the world is really prepared for in terms of the downstream implications of the effects that it’ll have on the defendability of organizations,” Roesch concludes.

Will Barker, cybersecurity advisor at managed detection and response vendor Huntress, agrees that research is showing that AI-driven vulnerability discovery is no longer something only frontier models can do.

“Smaller open-weights models are already finding the same types of zero-days and exploit chains,” says Barker.

These findings imply that the model itself is not always the biggest differentiator.

“The real value comes from everything around it: how the work is orchestrated, how findings are validated, how noise is filtered, and how quickly humans can turn those findings into action,” Barker says.

Vulnerability discovery compressed

A junior security researcher with API access to a frontier model can find vulnerabilities without the reverse-engineering work that used to take an experienced team.

“Logic flaws are where this hits hardest,” says Nik Kale, principal engineer and member of the Coalition for Secure AI (CoSAI). “Traditional scanners never caught them well because the code isn’t broken, just strategically wrong. A frontier LLM reads a hardcoded trust assumption like it’s reading a paragraph. That’s the gap that opened, and it isn’t closing.”

Frontier AI has meaningfully compressed discovery time for well-understood vulnerability classes: SQL injection variants, common misconfigurations, things that pattern-match against known CVEs.

Raphael Peyret, a former product manager at Google turned startup advisor at SHA/RP, argues that the barrier to creating a reliable exploit from a vulnerability has been lowered rather than removed.

“In many cases, finding the weakness is no longer the bottleneck,” says Peyret. “But novel zero-days in hardened targets are a genuinely different problem, and that still takes human expertise.”

Matthew Bidwell, founder at Newzino.com, backs up this assessment. “The binding constraint for attackers has shifted from finding bugs to operationalizing them: turning a hypothetical flaw into a working exploit, chaining it against a real target, evading detection, [and] persisting,” he says.

The more meaningful shift in the vulnerability discovery landscape is economic rather than technical, according to several experts.

“Attackers are running roughly the same playbook they always ran,” Peyret notes. “What’s changed is the unit cost of running a credible campaign, and it’s dropped substantially.”

Other experts agreed that AI is turning vulnerability discovery from a scarce human craft into a scalable computational problem.

“Mythos-class systems compress reconnaissance, target triage, payload customization, and social engineering into minutes,” says Noah M. Kenney, founder and principal consultant at Digital 520. “Jailbreaks and black-hat forks will happen, but the bigger risk is legitimate enterprise AI being turned against the enterprise that deployed it.”

Attackers do not need Mythos itself; they need Mythos-like vulnerability discovery workflows, says Mudit Sinha, AI Lead at Lineaje.

“Mythos may be expensive and restricted today, but the gap is closing fast through frontier models, specialized cyber models, and black-hat harnesses around general-purpose AI,” he says.

Exploit pathways

The historical bottleneck in offensive cyber operations was finding novel weaknesses. AI-native cyber systems are automating code reasoning, attack-path identification, and variant analysis at machine speed, according to Kai CISO Alfredo Hickman.

“The constraint is shifting from ‘Can we find bugs?’ to ‘Can we reliably weaponize and scale them?’” he says.

Louis Leung, a software developer and co-founder at InFlow Inventory, believes attackers’ real challenge remains turning a discovered weakness into a stable, stealthy, repeatable capability that survives modern defensive controls and produces operational impact.

“The hard part is turning the bug into a stable working exploit that functions across real-world production environments, which come with modern defenses, monitoring, and patching solutions,” he says. “Attackers increasingly need to chain multiple weaknesses together in SaaS environments — like inventory and warehouse systems — more than they need to identify the first point of weakness.”

Still, frontier AI models are likely to accelerate the ability to chain those weaknesses together, said Jon Yeoh, chief scientific officer at the Cloud Security Alliance, at the recent CSO Cybersecurity Awards and Conference.

“We’re looking at taking like maybe three or four CVEs that were very low-level and chaining those to become something that’s high or critical,” he said. “That’s something we haven’t seen — just what the models themselves do with a simple prompt.”

Opening Pandora’s Box

Independent security experts, however, were keen to avoid blaming Anthropic for opening a Pandora’s Box full of vulnerability discoveries.

“I do think Anthropic is trying to do the right thing by getting organizations involved early, letting them battle-test, harden, and build some understanding of what this looks like in the wild before it’s widely available,” says Melissa Bischoping, senior director of security and product design research at Tanium. “It’s not a perfect solution, but the spirit and intent are well-placed.”

Bischoping, a SANS Technology Institute board member, warns that it is unclear whether organizational change control can move fast enough to act on what Mythos finds before Mythos is out in the wild.

“Agentic patch workflows are possible and can match pace with adversarial AI in a lot of cases, but [organizational] politics and change control don’t run at the speed of AI today,” says Bischoping.

Countermeasures

For defenders, the answer to the challenge posed by frontier AI models is faster vulnerability remediation.

“Security teams need to stop treating vulnerability discovery as the hard part and start fixing aggressively,” argues Lineaje’s Sinha. “Known CVEs are the easiest place to begin: prioritize, validate exploitability, patch, test, and verify continuously. The same frontier models that can detect vulnerabilities often have some capacity to remediate them, but they need a harness around them: asset context, SBOMs, exploitability validation, patch generation, CI/CD checks, sandboxed testing, and human approval for risky changes.”
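
The triage step Sinha describes (prioritize, then validate exploitability before patching) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Finding` fields, the weighting factors, and the placeholder identifiers are hypothetical and do not come from Lineaje or any other vendor's product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One known vulnerability on one asset (hypothetical schema)."""
    finding_id: str           # placeholder label, not a real CVE identifier
    cvss: float               # base severity score, 0.0 to 10.0
    exploit_validated: bool   # a sandboxed check confirmed exploitability
    internet_facing: bool     # asset context, e.g. from an inventory or SBOM


def priority(finding: Finding) -> float:
    """Weight raw severity by exploitability and exposure.

    The multipliers are illustrative assumptions, not an industry standard.
    """
    score = finding.cvss
    if finding.exploit_validated:
        score *= 2.0
    if finding.internet_facing:
        score *= 1.5
    return score


def remediation_queue(findings: list[Finding]) -> list[Finding]:
    """Return findings ordered from most to least urgent."""
    return sorted(findings, key=priority, reverse=True)


findings = [
    Finding("EXAMPLE-A", cvss=9.8, exploit_validated=False, internet_facing=False),
    Finding("EXAMPLE-B", cvss=6.5, exploit_validated=True, internet_facing=True),
    Finding("EXAMPLE-C", cvss=7.0, exploit_validated=True, internet_facing=False),
]

ordered_ids = [f.finding_id for f in remediation_queue(findings)]
print(ordered_ids)
```

In this toy data the medium-severity but validated, internet-facing finding outranks the critical-rated one that nobody has shown to be exploitable, which is the kind of reordering that exploitability validation and asset context are meant to produce.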

Agiloft’s Ramos adds: “If AI surfaces vulnerabilities at a rate that outpaces human remediation, and Mythos suggests it will, then the strategic priority has to shift toward containment and resilience.”

“Assume breach. Shrink blast radius,” Ramos concludes.
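
One rough way to put a number on blast radius is to count how many systems an attacker could reach from a single compromised machine, given which connections the network allows. The Python sketch below does that with a breadth-first search. The host names and the two connection maps are invented for illustration, and a real assessment would also need to account for identity and privilege, not just network paths.

```python
from collections import deque


def blast_radius(allowed: dict[str, list[str]], start: str) -> set[str]:
    """Return every host reachable from `start` over allowed connections."""
    seen = {start}
    queue = deque([start])
    while queue:
        host = queue.popleft()
        for neighbor in allowed.get(host, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen - {start}


# Hypothetical flat network: a laptop can talk to almost everything.
flat = {
    "laptop": ["fileserver", "hr-db", "build-server"],
    "fileserver": ["hr-db"],
    "build-server": ["prod-db"],
}

# The same hosts after segmentation: the laptop reaches only the file server.
segmented = {
    "laptop": ["fileserver"],
    "build-server": ["prod-db"],
}

flat_reach = blast_radius(flat, "laptop")
segmented_reach = blast_radius(segmented, "laptop")
print(len(flat_reach), len(segmented_reach))
```

In this made-up example, a compromised laptop exposes four other systems on the flat network and one after segmentation. The initial compromise is identical in both cases; only the damage it can lead to changes.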