Has Mythos just broken the deal that kept the internet safe?

What happened?
Anthropic's research preview, Mythos, can reportedly generate working exploits against a Firefox JavaScript shell automatically in 72.4% of trials, a dramatic jump from under 1% for a previous model, Opus 4.6. If that number holds up under wider scrutiny, it's a big deal. Sandboxes, the layers of containment that let you click strangers' links without inviting catastrophe, only work if exploit chains are hard to find. What happens when an AI finds them for you?
Why it matters
The immediate fear is clear: LLMs are lowering the bar for discovering and chaining the bugs that break sandboxes. Mythos is reportedly a very large model, and Anthropic may be compute-constrained; rumours even point to a GPT-4.5-scale architecture. Leaked pricing for Mythos is allegedly high ($125 per million output tokens), which might limit broad deployment for now. But smaller models are already catching up (look at the strides in open-weight models like Gemma 4), and newer chips will make serving big models cheaper. In short: today's boutique capability can become tomorrow's commodity.
What's next?
Anthropic released this as a research preview, which buys time for analysis, mitigation, and patches. The evaluations reportedly used a standalone JS shell (SpiderMonkey), not a full browser, so the caveats matter: an exploit that works in the shell has not necessarily escaped a full browser's sandbox. Still, browser vendors, OS vendors, and cloud operators will need to treat this as urgent: better hardening, faster bug-bounty turnaround, and possibly new architectural defenses. And policymakers? They should be asking hard questions about access, disclosure, and responsible release. The internet's deal (click, run, be safe) depended on sandboxes holding. For the first time in a long while, that bargain looks frayed. What do we do next?
Sources: martinalderson.com, Hacker News