Large language models (LLMs) have come closest to matching human superforecasters, a small group of trained generalists who hold the best track record on questions such as: Will China blockade Taiwan during 2026? Will the IMF declare a global recession in H1 2026? Will global temperatures breach 1.5°C in 2026? LLMs are expected to surpass the best human teams by 2027.
Machine intelligence trained to reason about cause and effect scales expert judgment to every high-stakes decision. When that judgment prices risk more accurately than the market, the system that produces it becomes the most profitable capital allocator in the world. Governments, hedge funds, and Fortune 500 corporations are all, at their core, capital allocators.
We need to overcome three challenges to scale the accuracy and coverage of machine forecasting.
- 1. Clean evaluation and training environments. The quality of a forecasting model is bounded by the quality of the environment it trains in. High-fidelity environments (historically locked corpora, leakage-free evaluation, and well-constructed question pipelines) produce models whose results transfer directly to live performance. This infrastructure barely exists today.
- 2. Building credibility in our forecasts. Markets are continuous benchmarks in which adversarial actors try to out-predict your agent. The model that gets widely adopted will be the one that demonstrably prices risk better than prediction markets and niche private markets such as political-risk insurance. All forecasts and recommendations need to be explainable to regulators and risk teams.
- 3. Compute-efficient architectures for learning causal representations. The current frontier predicts the next token from embeddings spanning text, pixels, and frequencies. Scaling parameters and embedding efficiency can let language models learn causal models of the world as a by-product of next-token prediction, but training at this scale has diminishing returns, and little real-world data is left in the public domain. Despite three quarters of model improvement, the gap between LLMs and human superforecasters on live Metaculus competitions has not narrowed. Scaling alone is not closing it. We need architectures that learn causal structure more efficiently.
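The first two challenges can be made concrete with a minimal Python sketch. It is an illustration under stated assumptions, not a description of any production pipeline: the `ResolvedQuestion` record, the `leakage_free` and `brier_score` helpers, and all numbers are hypothetical. The sketch keeps only questions that resolved after a model's training cutoff (one simple guard against leakage) and then compares the model with market-implied probabilities using the Brier score, a standard accuracy measure for probabilistic forecasts.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ResolvedQuestion:
    # Hypothetical record layout, for illustration only.
    text: str
    resolved_on: date
    outcome: int        # 1 if the event happened, 0 otherwise
    model_prob: float   # model forecast recorded before resolution
    market_prob: float  # market-implied probability at the same moment


def leakage_free(questions, training_cutoff):
    """Keep questions that resolved strictly after the training cutoff."""
    return [q for q in questions if q.resolved_on > training_cutoff]


def brier_score(probs, outcomes):
    """Mean squared error between probabilities and 0/1 outcomes (lower is better)."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("need equal-length, non-empty inputs")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


# Made-up toy data, not real forecasts or market prices.
questions = [
    ResolvedQuestion("toy question A", date(2024, 1, 1), 1, 0.9, 0.7),
    ResolvedQuestion("toy question B", date(2025, 6, 1), 1, 0.8, 0.6),
    ResolvedQuestion("toy question C", date(2025, 9, 1), 0, 0.1, 0.3),
]

held_out = leakage_free(questions, training_cutoff=date(2025, 1, 1))
outcomes = [q.outcome for q in held_out]
model_brier = brier_score([q.model_prob for q in held_out], outcomes)
market_brier = brier_score([q.market_prob for q in held_out], outcomes)
print(f"model {model_brier:.3f} vs market {market_brier:.3f}")
```

Question A is dropped because it resolved before the assumed cutoff, so the model could have seen its answer during training. A real pipeline would also need to check when each question was opened and which documents were retrievable at forecast time; the date filter here is only the simplest version of that idea.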
The modern world is too fast for human-driven research and decision-making. Rising insurance premiums and volatile commodity markets are symptoms of our inability to predict what happens next.
- 1. Specialty insurance. Insurance that covers aviation war risk, cyber attacks, political violence, and climate catastrophes is one of the fastest-growing segments in financial services, on track to double to over $250B in annual premiums by the end of the decade. The segment is growing because insurers keep getting the pricing wrong. In 2024, as wars broke out and supply chains were disrupted, losses exceeded what premiums had budgeted for by the largest margin on record. Premiums rise to compensate, and the extra cost flows through to every economic actor, from central banks in Japan to retailers in Brazil.
- 2. Commodity markets. When Russian energy supply to Europe was suspended, it triggered over a trillion dollars in margin calls across European commodity markets. Commodity volatility is at its highest level in 50 years, and the drivers (geopolitical fragmentation, climate instability, and supply-chain concentration) are structural. They take capital, coordination, and time to fix.
Tetlock has consistently shown that trained generalists who update their beliefs faster out-predict domain experts with more information. Machine intelligence takes this further, learning historical base rates and updating across thousands of questions simultaneously. It detects early indicators of crises by recognising aberrations from historical patterns and produces calibrated probabilities before the rest of the market has processed the signal. The goal is to build world models that can understand how the future unfolds, and act on it.
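The idea of starting from a historical base rate and updating on new evidence can be sketched with Bayes' rule in odds form. This is a textbook stand-in, not the method a learned forecasting model would actually use: the `update_probability` function, the 5% base rate, and the likelihood ratio of 4 are all assumptions chosen for illustration.

```python
def update_probability(prior, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    if likelihood_ratio <= 0.0:
        raise ValueError("likelihood ratio must be positive")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Assumed inputs: a 5% historical base rate, and an early indicator that is
# four times as likely to appear when the event is coming as when it is not.
base_rate = 0.05
signal_lr = 4.0
posterior = update_probability(base_rate, signal_lr)
print(f"{base_rate:.2f} -> {posterior:.3f}")
```

Under these assumptions the forecast moves from 5% to roughly 17%: a meaningful update, but far from certainty, which is the kind of measured revision that calibration rewards.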
We work with specialty insurers, commodity traders, and risk consulting firms to apply machine forecasting to scenario modelling, portfolio stress-testing, and real-time risk pricing. If you underwrite geopolitical or climate risk, manage large ($50M+) commodity exposures, or lead risk research at an institution where better forecasts change how capital is allocated, reach out at contact@anthral.com.
