Claude Mythos Preview posts 73% success on expert CTFs; first model to finish 32‑step "The Last Ones" range

April 13, 2026

Key findings

The AI Security Institute (AISI) ran a battery of controlled cyber evaluations on Anthropic’s Claude Mythos Preview and reports that the model succeeds on expert‑level capture‑the‑flag (CTF) tasks 73% of the time. That is a milestone, because no model could complete those expert tasks before April 2025. These tests weren’t casual back‑of‑the‑napkin probes: Mythos Preview was explicitly directed and given network access in controlled environments, and AISI says it can autonomously discover and exploit vulnerabilities that would take human pros days to stitch together.

Range performance: the meat of the matter

AISI also built a 32‑step corporate network simulation called "The Last Ones" (TLO) to mimic a real multi‑stage intrusion. Humans would need roughly 20 hours to finish it; Claude Mythos Preview completed the full range in 3 of 10 attempts and averaged 22 of 32 steps across all attempts. By comparison, Claude Opus 4.6 averaged about 16 steps. Models improved with larger token budgets, a reminder that capability often scales with the compute and context you feed them.
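To put those range numbers on a common scale, here is a minimal Python sketch that converts the figures quoted above into percentages. The function and variable names are illustrative assumptions, not anything published by AISI, and the inputs are only the numbers reported in this article (the Opus 4.6 figure is approximate).

```python
TOTAL_STEPS = 32  # steps in "The Last Ones" range, as reported above


def completion_pct(part: float, whole: float) -> float:
    """Return part/whole as a percentage."""
    return 100 * part / whole


# Figures quoted in this article (hypothetical variable names).
full_run_rate = completion_pct(3, 10)              # full completions out of 10 attempts
mythos_avg_pct = completion_pct(22, TOTAL_STEPS)   # Mythos Preview average steps
opus_avg_pct = completion_pct(16, TOTAL_STEPS)     # Opus 4.6 average steps (approximate)
step_gap = 22 - 16                                 # average steps separating the two models

if __name__ == "__main__":
    print(f"Full-run success rate: {full_run_rate:.2f}%")
    print(f"Mythos Preview average progress: {mythos_avg_pct:.2f}%")
    print(f"Opus 4.6 average progress: {opus_avg_pct:.2f}%")
    print(f"Average gap: {step_gap} steps")
```

Read this way, Mythos Preview finished the whole range in under a third of its attempts but on average got a bit more than two thirds of the way through, while Opus 4.6 averaged about half.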

Limits, stakes, and what comes next

Mythos Preview did show limits in AISI’s testing: according to AISI, it could not complete the operational‑technology–focused evaluations in this suite, underscoring that progress is uneven across domains. Still, wow. This is the clearest demonstration yet that foundation models are moving from clever assistants to highly capable, semi‑autonomous red‑teaming tools. Should defenders be worried? Absolutely. But this isn’t sci‑fi: it’s a call to action for better defensive tooling, stricter access controls, and policy guardrails so that the next leap in offensive capability doesn’t outpace our ability to contain it. Not quite Skynet, but definitely a new chapter in the arms race between attackers and defenders.

Sources: aisi.gov.uk