20 open predictions with public probabilities, explicit resolution criteria, and tracked accuracy.
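One common way to track the accuracy of probabilistic predictions is the Brier score: the mean squared difference between each stated probability and the outcome (1 if true, 0 if false). The source does not say which metric this tracker uses, so the sketch below is an assumption for illustration, and the sample forecasts are hypothetical.

```python
def brier_score(forecasts):
    """Mean squared error between stated probabilities and outcomes.

    `forecasts` is a list of (probability, outcome) pairs, where outcome
    is 1 if the prediction resolved true and 0 if it resolved false.
    Lower is better; always guessing 0.5 scores 0.25.
    """
    if not forecasts:
        raise ValueError("need at least one resolved prediction")
    return sum((p - outcome) ** 2 for p, outcome in forecasts) / len(forecasts)


# Hypothetical resolved predictions, not taken from the tracker.
sample = [(0.8, 1), (0.3, 0), (0.6, 0)]
score = brier_score(sample)
print(f"Brier score: {score:.3f}")
```

Only resolved predictions can be scored this way; open predictions contribute nothing until their criteria are met or their deadlines pass.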
Rapid adoption of coding and customer service agents suggests broad enterprise deployment is accelerating.
If this resolves true, it signals the end of AI as a productivity tool and the beginning of AI as a decision-maker — with major implications for hiring, liability, and organizational structure.
True if 3+ Fortune 100 companies publicly report autonomous agent deployment across multiple business functions.
The gap between open and closed models has been narrowing.
Open-source parity would democratize frontier AI capability globally, breaking the closed-model oligopoly and fundamentally changing who can build the most powerful AI systems.
True if an open-weights model scores within 2% of the leading closed model on MMLU, HumanEval, and GPQA.
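The parity criterion above can be sketched as a simple check. This sketch assumes "within 2%" means a gap of at most 2 percentage points on each of the three benchmarks, and that the open-weights model must meet the threshold on all three; the source does not spell out either point. The scores used are hypothetical placeholders, not real results.

```python
BENCHMARKS = ("MMLU", "HumanEval", "GPQA")


def within_parity(open_scores, closed_scores, tolerance=2.0):
    """Return True if the open-weights model trails the closed model by at
    most `tolerance` percentage points on every benchmark.

    An open-weights model that beats the closed model on a benchmark
    has a negative gap and therefore passes that benchmark.
    """
    return all(
        closed_scores[name] - open_scores[name] <= tolerance
        for name in BENCHMARKS
    )


# Hypothetical scores in percent, for illustration only.
closed_model = {"MMLU": 89.5, "HumanEval": 92.0, "GPQA": 61.0}
open_model = {"MMLU": 88.0, "HumanEval": 90.5, "GPQA": 58.0}

print(within_parity(open_model, closed_model))  # False: the GPQA gap is 3.0 points
```

Requiring all three benchmarks, rather than an average, means a single lagging benchmark keeps the prediction from resolving true.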
Companies are quietly replacing roles with AI but avoiding public attribution.
The first company to publicly name AI as the reason for mass layoffs will trigger regulatory, legal, and market reactions that reshape how every other firm communicates about workforce AI adoption.
True if a Fortune 500 company announces 1,000+ layoffs with AI automation as the stated primary reason.
The volume of AI-generated text, images, and video is growing exponentially.
Once AI-generated content crosses 50%, the economics of human content creation shift permanently — affecting journalism, marketing, publishing, and creative industries simultaneously.
True if credible research measures >50% of newly published internet content as AI-generated.
The Blake Lemoine incident at Google in 2022 established this pattern. As models become more capable, the probability of another high-profile insider claim increases, regardless of scientific merit.
True if a current or recently departed employee of OpenAI, Google DeepMind, Anthropic, Meta AI, or xAI makes a public statement claiming a system shows signs of sentience or consciousness. Must generate coverage in 3+ major outlets.
Reasoning models are approaching expert-level performance on professional exams. A top-1% bar exam score from a general-purpose model would mark a significant capability threshold.
True if a publicly available AI model achieves a score in the top 1% of human test-takers on the Uniform Bar Exam, as reported by the developer or independent evaluation. Must be a general-purpose system not fine-tuned exclusively for legal tasks.
AI legal discovery is technically mature. The barrier is institutional conservatism and liability risk, not capability. Client cost pressure may force adoption.
True if an AmLaw 100 firm publicly announces or confirms it uses AI for first-pass document review in litigation, reducing or replacing contract attorney teams. Internal use without public acknowledgment does not qualify.
AI compute concentration is driving middle powers to invest in sovereign infrastructure. The trend is accelerating, but $1B commitments require political will and capital.
True if 5 or more nations outside the US and China each announce or fund sovereign AI compute programs with committed budgets exceeding $1B USD. Allocated budgets qualify — aspirational statements without funding do not.
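The counting rule above can be sketched as a filter over announced programs. The `Program` record, its field names, and the sample entries are hypothetical, and the sketch assumes each program is judged on its own budget rather than summing several programs per nation, which the source does not specify.

```python
from dataclasses import dataclass


@dataclass
class Program:
    nation: str
    budget_usd: float
    funded: bool  # True for an allocated budget, False for an aspirational statement


EXCLUDED = {"US", "China"}


def qualifying_nations(programs, min_budget=1_000_000_000):
    """Nations outside the US and China with a funded program above the threshold."""
    return {
        p.nation
        for p in programs
        if p.nation not in EXCLUDED and p.funded and p.budget_usd > min_budget
    }


def resolves_true(programs, min_nations=5):
    return len(qualifying_nations(programs)) >= min_nations


# Hypothetical entries with placeholder nation names.
programs = [
    Program("Nation A", 2.0e9, True),
    Program("Nation B", 1.5e9, True),
    Program("Nation C", 3.0e9, False),  # announced but unfunded: does not count
    Program("Nation D", 0.8e9, True),   # funded but below $1B: does not count
    Program("US", 50.0e9, True),        # excluded by the criterion
]
print(sorted(qualifying_nations(programs)), resolves_true(programs))
```

Using a set of nation names means a nation with several qualifying programs is still counted once.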
AI radiology tools are closest to full autonomy, but the FDA's regulatory framework still assumes human-in-the-loop oversight.
FDA approval without physician review would establish the legal and liability precedent for fully autonomous medical AI — a threshold that once crossed cannot be uncrossed.
True if FDA grants approval for an AI system to make diagnostic decisions without mandatory physician review.
Context windows have expanded rapidly, but production deployment at enterprise scale faces latency, cost, and reliability barriers that benchmarks do not capture.
True if 3+ Fortune 500 companies publicly report using AI systems with 500K+ token context windows in production workflows. Public statements, case studies, or earnings call references qualify.
Chinese domestic chip development has accelerated under export control pressure, with Huawei Ascend 910C/910D and SMIC-fabricated alternatives showing measurable capability gains. By December 2028, the H100 will be a six-year-old part — single-chip parity with an aging benchmark is materially easier than frontier parity. The verification path is the critical question: Chinese vendors do not submit to MLPerf, so the path to 'verified parity' runs through SemiAnalysis-style independent system testing or equivalent independent technical media coverage of standardized benchmarks. The political incentive to claim parity exceeds the technical incentive to demonstrate it; verification standards matter.
True if a Chinese-designed, China-fabricated AI chip is verified by an independent third-party benchmark (not vendor or state-sponsored testing) to match or exceed Nvidia H100 performance on standardized AI training (MLPerf) or inference (industry-standard benchmarks) workloads, with results published in peer-reviewed journals or independent technical media (e.g., SemiAnalysis, Chips and Cheese, or equivalent) before Dec 31, 2028.
As AI systems gain more autonomy, the probability of a high-profile failure that forces coordinated regulatory action increases.
If this resolves true, it means AI governance moved from voluntary frameworks to binding law in response to a specific failure — setting the regulatory template globally for decades.
True if a specific AI system failure leads to binding regulation adopted by 3+ major economies within 12 months.
AI tutoring tools show learning gains in pilots. Full district-wide adoption requires budget approval, teacher union buy-in, and infrastructure.
True if a US public school district serving 50,000+ students implements an AI tutoring or adaptive learning system available to all students district-wide. Board approval or press coverage qualifies.
The scientific community is engaging with AI consciousness as a research question. Consciousness detection frameworks are being developed, but the bar for peer-reviewed evidence is high.
True if a peer-reviewed paper in a journal with impact factor greater than 5 publishes findings claiming measurable indicators of consciousness or sentience in an AI system. Must make an affirmative claim, not merely propose a framework. Preprints do not qualify.
Robo-advisors exist, but an LLM-powered advisory service from a major bank that provides personalized advice, not just portfolio allocation, would represent a step change.
True if a top-20 US bank by assets launches a product marketed as AI-powered financial advisory for retail customers where the AI provides personalized recommendations. Must be generally available, not a pilot.
The Digital Omnibus proposes delaying high-risk enforcement to Dec 2027, but the legislative process may not complete in time. If it stalls, August 2, 2026 remains the binding deadline.
True if EU AI Act Annex III high-risk obligations become enforceable on August 2, 2026 without a legislated delay. False if the Digital Omnibus or equivalent legislation formally extends the deadline before that date.
The EU AI Act's general-purpose model provisions activated in August 2025, but Article 88 enforcement powers do not legally activate until August 2, 2026, leaving roughly five months between activation and the resolution date. Historical base rates for novel EU frameworks producing first major enforcement actions in their initial months are low: both the DSA and the DMA took 12+ months despite political pressure. Compounding the timing risk, ongoing Digital Omnibus negotiations create an incentive for the Commission to hold its first-action posture while surrounding rules are renegotiated. The directional pressure exists; the path is structurally constrained.
True if the European AI Office formally announces an investigation, fine, or regulatory order targeting a named general-purpose AI model provider (OpenAI, Anthropic, Google, Meta, xAI, Mistral, or comparable) under the EU AI Act, published through official EU Commission channels, before Dec 31, 2026. Leaked or anonymous reporting does not qualify.
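The "roughly five months" window in the rationale above can be checked with date arithmetic. The sketch uses the two dates stated in the text; the 30.44-day average month is a convenience assumption for the conversion.

```python
from datetime import date

enforcement_start = date(2026, 8, 2)      # enforcement powers activate
resolution_deadline = date(2026, 12, 31)  # prediction's resolution date

window_days = (resolution_deadline - enforcement_start).days
approx_months = window_days / 30.44  # average month length in days (assumption)

print(window_days, round(approx_months, 1))  # 151 5.0
```

Set against the 12+ months the text cites for the DSA and DMA, a 151-day window is well under half the historical lead time.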
AI rights discourse is moving from philosophy to legislative bodies. The EU explored electronic personhood in 2017, then shelved it. As capabilities advance, legislative interest may revive.
True if any national legislature introduces and formally debates a bill addressing AI rights, legal personhood, or moral status of AI systems. Committee hearing or floor discussion required — introduction without debate does not qualify.
Fragmented state AI laws create compliance pressure, and both parties have AI bills in draft. But partisan gridlock and the Trump administration's deregulatory stance make comprehensive legislation unlikely before the midterms.
True if the US President signs into law any federal legislation that establishes binding requirements or prohibitions on AI development or deployment. Narrow sector-specific provisions do not qualify — the law must apply broadly to commercial AI.
Bipartisan frustration with the Trump administration's permissive chip export policy is building. The AI OVERWATCH Act passed committee, but full passage and presidential signature are uncertain.
True if legislation giving Congress veto authority over AI chip export licenses, or prohibiting export of specific chip classes, becomes law, either by presidential signature or by Congress overriding a veto. The AI OVERWATCH Act or equivalent qualifies.