Anthropic’s recently previewed Claude Mythos model now leads the Humanity’s Last Exam leaderboard at 64.7 percent accuracy, ahead of OpenAI’s GPT-5.4 Pro at 58.7 percent. This frontier benchmark comprises 2,500 expert-vetted, multi-modal questions spanning advanced mathematics, physics, and specialized academic domains, where earlier Claude Opus 4.6 variants scored around 36–40 percent. The rapid progress stems from targeted scaling of reasoning capabilities and expanded context windows, though model timelines remain fluid and further gains before June 30 would likely require another major training run or architectural refinement. Traders are monitoring Anthropic’s release cadence and any credible reports of internal test results that could shift the implied probability of Claude claiming the top spot by the deadline.
Experimental AI-generated summary referencing Polymarket data. This is not trading advice and plays no role in the resolution of this market.
Claude’s score on Humanity’s Last Exam by June 30?
$283,400 Vol.
45%+: 18%
50%+: 9%
55%+: 4%
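The three outcomes are nested thresholds, so their prices can be differenced to get an implied chance for each score band. The sketch below assumes that each listed percentage is the "Yes" price for "score at or above the threshold" and that this price can be read as a probability; neither is stated on the page. The function name `bucket_probabilities` and the band labels are hypothetical, chosen only for illustration.

```python
def bucket_probabilities(cumulative):
    """Convert nested threshold prices into per-band implied chances.

    cumulative maps a score threshold (in percent) to the listed price
    (in percentage points) for "score >= threshold". Reading prices as
    probabilities is an assumption, not something the market states.
    """
    thresholds = sorted(cumulative)
    prices = [cumulative[t] for t in thresholds]

    # Nested thresholds only make sense if the price never rises
    # as the threshold gets harder to reach.
    if any(a < b for a, b in zip(prices, prices[1:])):
        raise ValueError("prices must not increase with the threshold")

    buckets = {f"below {thresholds[0]}%": 100 - prices[0]}
    for lo, hi in zip(thresholds, thresholds[1:]):
        buckets[f"{lo}% to under {hi}%"] = cumulative[lo] - cumulative[hi]
    buckets[f"{thresholds[-1]}% or higher"] = prices[-1]
    return buckets


# Prices as listed: 45%+ at 18, 50%+ at 9, 55%+ at 4.
buckets = bucket_probabilities({45: 18, 50: 9, 55: 4})
for band, chance in buckets.items():
    print(f"{band}: {chance}%")
```

Under those assumptions the listed prices imply roughly 82% for a score below 45%, 9% for 45% to under 50%, 5% for 50% to under 55%, and 4% for 55% or higher. Integer percentage points are used to keep the subtraction exact.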
The resolution source will be the official Humanity’s Last Exam leaderboard https://scale.com/leaderboard/humanitys_last_exam.
Market opened: Jan 30, 2026, 12:00 AM ET
Resolver
0x65070BE91...
Frequently asked questions