AI reasoning model scores in top 1% on bar exam without special training

Reasoning models are approaching expert-level performance on professional exams. A top-1% bar exam score from a general-purpose model would mark a significant capability threshold.

RESOLUTION CRITERIA

Resolves true if a publicly available AI model achieves a score in the top 1% of human test-takers on the Uniform Bar Exam, as reported by the developer or an independent evaluation. The model must be a general-purpose system, not one fine-tuned exclusively for legal tasks.

▲ FOR

GPT-4 scored around the 90th percentile on a simulated Uniform Bar Exam in 2023, as reported by OpenAI

Reasoning models show step-change improvements on structured exams

Two years of capability advancement since GPT-4's bar exam result

▼ AGAINST

Top 1% requires near-perfect performance

Bar exam includes subjective essay components

Benchmark gaming concerns: models may train on exam-adjacent data

RECENT SIGNALS (3)

↑

Open Source AI Models Achieve Frontier Performance in April 2026 Releases

Fazm Blog

↕

UK AI Security Institute Report Shows Frontier Models Surpassing Expert Baselines in Multiple Domains

UK AI Security Institute

↑

Anthropic Launches Claude Mythos Preview with Record Benchmark Performance

LLM Stats