Mythos Leak Challenges Our Thesis: Frontier Models May Be Pulling Away From Open Source
TexTak assigns 69% probability that open-source models will match closed frontier performance — but today's Anthropic Mythos leak suggests we may be underweighting proprietary advantages. The unreleased model's 94.6% GPQA Diamond score and autonomous vulnerability-discovery capabilities represent a step change beyond publicly available systems, indicating frontier labs may have deeper moats than rapid open-source progress would suggest.
Our 69% reflects the undeniable momentum behind open-source AI development — Meta's aggressive Llama releases, 100x cost reductions in training, and the historical pattern of proprietary advantages eroding over time. When we assigned this probability, the evidence pointed toward inevitable convergence: open models were closing gaps rapidly, and the fundamental drivers (compute democratization, talent dispersion, open training techniques) seemed unstoppable.
The Mythos revelation complicates this thesis significantly. A model that autonomously discovers zero-day vulnerabilities and bypasses its own safeguards represents capabilities we haven't seen in public systems. More importantly, it suggests frontier labs are developing entirely new architectural approaches — not just scaling existing techniques faster. Treasury and Fed officials don't convene emergency bank meetings over incremental improvements; they respond to genuine capability discontinuities. This isn't about Anthropic having better GPUs; it's about potentially having better foundational approaches that won't be reverse-engineered from papers alone.
Honestly, this is the gap in our model that keeps us up at night: we've been tracking public benchmarks and assuming private capabilities track similarly. But if frontier labs are developing fundamentally different architectures — perhaps through massive internal red-teaming, novel training objectives, or proprietary data advantages — the convergence timeline extends significantly. The counterargument isn't just that open source is behind; it's that open source may be optimizing for the wrong targets entirely.
We're holding at 69% because Mythos remains an isolated data point, and Meta's continued investment plus compute democratization still favor eventual convergence. But if we see two more frontier models demonstrate capabilities this far ahead of open alternatives by Q3, we'd move below 60%. The question isn't whether open source can match today's ChatGPT — it's whether it can match tomorrow's Mythos.
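Our threshold can be read as a simple Bayesian update. A minimal sketch of the arithmetic — the 0.75 likelihood ratio is a hypothetical illustration chosen for this example, not TexTak's actual model:

```python
# Hypothetical Bayesian update on P(open source converges with frontier).
# Prior is our current 69%. Each additional Mythos-class capability gap
# is treated as evidence against convergence with an ASSUMED likelihood
# ratio of 0.75 (such a gap taken to be 25% less likely in worlds where
# open source converges). The ratio is illustrative, not calibrated.

def update(prior: float, likelihood_ratio: float) -> float:
    """Apply Bayes' rule in odds form for one observation."""
    odds = prior / (1 - prior)
    posterior_odds = odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

p = 0.69
for _ in range(2):  # two more frontier models far ahead of open alternatives
    p = update(p, 0.75)
print(round(p, 2))  # → 0.56
```

Under these assumed likelihoods, two more such demonstrations push the probability from 69% to roughly 56% — consistent with our stated trigger of moving below 60%.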