xAI's Grok models currently sit at 12-14% accuracy on Epoch AI's FrontierMath Tiers 1-3, a set of 300 unpublished, research-level math problems designed to resist data contamination and require hours or days of expert effort per question. This places them well behind leaders like OpenAI's o-series variants and GPT-5 iterations, which have posted scores in the mid-20s to low-50s in recent independent evaluations. With only days remaining until the June 30, 2026 resolution deadline and no confirmed Grok updates or capability jumps announced in the past month, trader sentiment reflects the narrow window for any rapid improvement. Competitive dynamics in advanced reasoning benchmarks continue to favor labs with stronger demonstrated tool use and scaling on math-specific tasks, though xAI's focus on unique problem-solving strengths has occasionally yielded novel solves on FrontierMath.
Experimental AI-generated summary based on Polymarket data. This is not trading advice and plays no role in the resolution of this market.
$24,109 Vol.
40%+: 93%
50%+: <1%
This market will resolve according to Epoch AI's FrontierMath benchmark leaderboard (https://epoch.ai/frontiermath) for Tiers 1-3. Studies that are not included in the leaderboard (e.g. https://x.com/EpochAIResearch/status/1945905796904005720) will not be considered.
The primary resolution source will be information from Epoch AI; however, a consensus of credible reporting may also be used.
Market opened: Jan 30, 2026, 12:01 AM ET
Resolver
0x65070BE91...
Frequently asked questions