As of mid-May 2026, OpenAI’s GPT-5.5 Pro leads the FrontierMath leaderboard at 52.4 percent, while the latest Claude Opus 4.7 variant sits at 43.8 percent on Epoch AI’s set of unpublished, research-level math problems authored by expert mathematicians. Claude models have consistently shown relative weakness on pure mathematical reasoning benchmarks compared with their stronger performance on software-engineering tasks, according to Epoch’s Domain-specific Capabilities Index released this month. With June 30 approaching, traders are watching for any new model updates or extended-thinking techniques from Anthropic that could close the gap before resolution. Epoch AI’s ongoing human review of FrontierMath problems and upcoming open-problems workshops through early June may also produce revised scoring data that influences final outcomes.
Experimental AI-generated summary using Polymarket data. This is not trading advice and plays no role in the resolution of this market. · Updated
Anthropic's Claude score on the FrontierMath Benchmark by June 30?
$61,944 Vol.
50%+
52%
This market will resolve according to Epoch AI's FrontierMath benchmarking leaderboard (https://epoch.ai/frontiermath) for Tier 1-3. Studies that are not included in the leaderboard (e.g. https://x.com/EpochAIResearch/status/1945905796904005720) will not be considered.
The primary resolution source will be information from Epoch AI; however, a consensus of credible reporting may also be used.
Market Opened: Jan 30, 2026, 12:00 AM ET
Resolver
0x65070BE91...
Beware of external links.
Frequently Asked Questions