63% · 1 pts · by Dec 2026
high

AI reasoning model scores in top 1% on bar exam without special training

Reasoning models are approaching expert-level performance on professional exams. A top-1% bar exam score from a general-purpose model would mark a significant capability threshold.

RESOLUTION CRITERIA

True if a publicly available AI model achieves a score in the top 1% of human test-takers on the Uniform Bar Exam, as reported by the developer or independent evaluation. Must be a general-purpose system not fine-tuned exclusively for legal tasks.
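The criteria above can be read as a simple conjunction of conditions. The following is a minimal sketch, not an official resolution procedure: the `BarExamResult` record, its field names, and the reading of "top 1%" as a percentile of 99 or higher are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class BarExamResult:
    """One reported Uniform Bar Exam result for an AI model (hypothetical record)."""
    model_name: str
    percentile: float          # percentile among human test-takers, 0-100
    publicly_available: bool   # the model can be accessed by the public
    legal_only_finetune: bool  # fine-tuned exclusively for legal tasks
    source: str                # "developer" or "independent"


def resolves_true(result: BarExamResult, threshold: float = 99.0) -> bool:
    """Return True only if every stated criterion holds.

    Assumption: "top 1%" is interpreted as percentile >= 99.
    """
    return (
        result.publicly_available
        and not result.legal_only_finetune
        and result.source in {"developer", "independent"}
        and result.percentile >= threshold
    )


# Hypothetical example: a 90th-percentile result does not resolve the question.
example = BarExamResult(
    model_name="example-model",
    percentile=90.0,
    publicly_available=True,
    legal_only_finetune=False,
    source="developer",
)
print(resolves_true(example))  # False
```

Under this reading, a 90th-percentile result fails on the percentile condition alone, whatever its source or availability.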

▲ FOR

GPT-4 scored in the 90th percentile on the bar exam in 2023

Reasoning models show step-change improvements on structured exams

Two years of capability advancement since GPT-4's bar exam result

Chain-of-thought reasoning models now consistently occupy top ranks across benchmark leaderboards in 2026

o3 achieved 45.1% on ARC-AGI and set new standards across math, coding, and science benchmarks

SubQ's 12M-token architecture and Gemini 3.5 Flash's frontier-level performance indicate continued capability expansion

▼ AGAINST

Top 1% requires near-perfect performance, a much higher bar than 90th percentile

Bar exam includes subjective essay components that benchmark dominance doesn't address

Benchmark gaming concerns: models may train on exam-adjacent data

Benchmark saturation is now confirmed: MMLU gaps between frontier models are within measurement noise, so standard capability metrics can no longer distinguish them. A top-1% bar exam score requires a genuine edge, not average frontier performance

Rapid model churn (255 releases in Q1) creates verification timing risk: which model, tested when, by whom?

RECENT SIGNALS (5)
Claude Mythos Preview Leads Frontier Reasoning Benchmarks with 71.2 Score on Multi-Step Inference
LLM Stats
Claude Mythos Dominates Reasoning Benchmarks, Frontier Model Competition Intensifies
LLM Stats
Claude Fable 5 reaches 95% on SWE-bench Verified, reclaims frontier coding lead
MorphLLM
GPT-5.6 Imminent: OpenAI Chief Scientist Signals 'Meaningful Improvement' Over GPT-5.5
TechTimes, The Information, explainx.ai
OpenAI's GPT-5.5 Scores Perfect 100% on AIME 2026 Math Benchmark
LLM Stats Leaderboard