xAI's Grok models currently sit at 12-14% accuracy on Epoch AI's FrontierMath Tiers 1-3, a set of 300 unpublished, research-level math problems designed to resist data contamination and require hours or days of expert effort per question. This places them well behind leaders like OpenAI's o-series variants and GPT-5 iterations, which have posted scores in the mid-20s to low-50s in recent independent evaluations. With only days remaining until the June 30, 2026 resolution deadline and no confirmed Grok updates or capability jumps announced in the past month, trader sentiment reflects the narrow window for any rapid improvement. Competitive dynamics in advanced reasoning benchmarks continue to favor labs with stronger demonstrated tool use and scaling on math-specific tasks, though xAI's focus on unique problem-solving strengths has occasionally yielded novel solves on FrontierMath.
Experimental AI-generated summary using Polymarket data. This is not trading advice and plays no role in the resolution of this market. · Updated
$24,234 Vol.
25%+: Yes
30%+: Yes
40%+: Yes
50%+: No
This market will resolve according to Epoch AI's FrontierMath benchmarking leaderboard (https://epoch.ai/frontiermath) for Tiers 1-3. Studies that are not included in the leaderboard (e.g. https://x.com/EpochAIResearch/status/1945905796904005720) will not be considered.
The primary resolution source will be information from Epoch AI; however, a consensus of credible reporting may also be used.
Market Opened: Jan 30, 2026, 12:01 AM ET
Resolver
0x65070BE91... Proposed outcome: Yes
No dispute
Final outcome: Yes
Be careful with external links.
Frequently Asked Questions