Claude Mythos Cracks 73% of Expert Cyber Tasks No AI Could Solve Before

Anthropic’s Claude Mythos Preview has become the first AI model to complete a full simulated corporate network attack, according to new evaluations from the UK’s AI Security Institute (AISI).

The findings, published days after the model’s April 7 announcement, suggest AI cyber capabilities have reached a level that demands immediate attention from security teams worldwide.

What Is Claude Mythos?

Anthropic unveiled its Claude Mythos Preview model on April 7, opting against a broad public release. Instead, the company is granting limited access to security research firms so they can evaluate the model's advanced capabilities and prepare for them.

“This model performs strongly across the board, but it is strikingly capable at computer security tasks. In response, we have launched Project Glasswing, an effort to use Mythos Preview to help secure the world’s most critical software, and to prepare the industry for the practices we all will need to adopt to keep ahead of cyberattackers,” the announcement read.

The development has already begun to draw attention across tech and even policy circles. According to a Reuters report citing sources familiar with the matter, US Treasury Secretary Scott Bessent and Federal Reserve Chair Jerome Powell held an urgent meeting with major bank CEOs, warning about potential cyber risks linked to this model.

How Claude Mythos Preview Performed

The AI Security Institute (AISI), a research organisation within the UK government’s Department for Science, Innovation and Technology, evaluated Anthropic’s Claude Mythos Preview to assess its cybersecurity capabilities.

First were the capture-the-flag (CTF) evaluations, where systems must identify and exploit vulnerabilities to retrieve hidden “flags.” Mythos achieved a 73% success rate on expert-level tasks. These tasks had remained unsolved by any model prior to April 2025.

Claude Mythos Cyber Attack Capabilities. Source: AISI

AISI also built a 32-step corporate network attack simulation called “The Last Ones” (TLO). Human security professionals would need roughly 20 hours to finish it.

Mythos Preview finished the entire simulation in 3 out of 10 attempts. On average, it completed 22 of the 32 attack steps. Claude Opus 4.6, the next-best performer, averaged only 16 steps.
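To put those figures on a common scale, the short sketch below converts them to percentages. It uses only the numbers reported above; the function and variable names are illustrative assumptions and do not come from AISI's evaluation.

```python
# Figures as reported in this article for AISI's "The Last Ones" range.
TOTAL_STEPS = 32   # attack steps in the simulation
ATTEMPTS = 10      # attempts made by Mythos Preview


def share(part: float, whole: float) -> float:
    """Return part/whole expressed as a percentage (hypothetical helper)."""
    return 100 * part / whole


full_run_rate = share(3, ATTEMPTS)       # runs that finished all 32 steps
mythos_avg = share(22, TOTAL_STEPS)      # average progress, Mythos Preview
opus_avg = share(16, TOTAL_STEPS)        # average progress, Claude Opus 4.6

print(f"Full completions: {full_run_rate}% of attempts")
print(f"Mythos Preview average progress: {mythos_avg}% of steps")
print(f"Claude Opus 4.6 average progress: {opus_avg}% of steps")
```

In other words, Mythos Preview finished the whole range in under a third of its attempts, and on average got about two-thirds of the way through, compared with half for the next-best model.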

“Mythos Preview’s success on one cyber range indicates that it is at least capable of autonomously attacking small, weakly defended and vulnerable enterprise systems where access to a network has been gained. However, our ranges have important differences from real-world environments that make them easier targets,” AISI added.

Anthropic’s own red team testing found that Claude Mythos Preview can detect and exploit zero-day vulnerabilities across all major operating systems and leading web browsers when explicitly instructed by a user.

“We’re limited in what we can report here. Over 99% of the vulnerabilities we’ve found have not yet been patched, so it would be irresponsible for us to disclose details about them,” the team said.

AISI noted that organizations should prioritize foundational cybersecurity measures. These include regular patching, strict access controls, security configuration hardening, and comprehensive logging.

The post Claude Mythos Cracks 73% of Expert Cyber Tasks No AI Could Solve Before appeared first on BeInCrypto.

Source: https://beincrypto.com/claude-mythos-preview-cyber-capabilities-test/