xAI's latest Grok 4 models score 12-14% on Epoch AI's FrontierMath benchmark of exceptionally challenging, original math problems—including unsolved research questions—trailing leaders like OpenAI's GPT-5.4 at 47.6%. A May 11 update flagged fatal errors in roughly one-third of Tier 1-4 problems via AI-assisted review, with human-vetted scores forthcoming that could recalibrate standings. xAI's rapid iteration, exemplified by efficient Grok 4.3 topping instruction-following and vibe-coding benchmarks, alongside training of 10-trillion-parameter Grok 5 on Colossus superclusters, drives optimism for math capability leaps. Traders monitor announcements, independent evals, or releases before June 30 amid intensifying frontier AI competition from Anthropic and Google.
Experimental AI-generated summary based on Polymarket data. This is not trading advice and has no bearing on how this market resolves.
$20,870 Vol.
25% or higher: 57%
30% or higher: 48%
40% or higher: 40%
50% or higher: 18%
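Because the outcomes are nested thresholds, the listed prices can be differenced into mutually exclusive score ranges. The following is a minimal sketch, assuming each price approximates the implied probability that the score reaches at least the stated threshold. That reading is an interpretation and not part of the market rules, and the function name `bucket_probabilities` is hypothetical.

```python
# Prices copied from the outcome list above, in whole percent.
# Assumption: each price approximates the implied probability that the
# score reaches at least the given threshold.
at_least = {25: 57, 30: 48, 40: 40, 50: 18}


def bucket_probabilities(at_least):
    """Convert 'at least X' percentages into mutually exclusive buckets."""
    thresholds = sorted(at_least)
    # Everything not covered by the lowest threshold falls below it.
    buckets = {f"below {thresholds[0]}%": 100 - at_least[thresholds[0]]}
    # Each middle bucket is the difference between adjacent thresholds.
    for lo, hi in zip(thresholds, thresholds[1:]):
        buckets[f"{lo}% to below {hi}%"] = at_least[lo] - at_least[hi]
    # The top bucket is the price of the highest threshold itself.
    buckets[f"{thresholds[-1]}% or higher"] = at_least[thresholds[-1]]
    return buckets


buckets = bucket_probabilities(at_least)
for label, pct in buckets.items():
    print(f"{label}: {pct}%")
```

Under this reading, the largest single range is "below 25%" at 43%, followed by "40% to below 50%" at 22%. Real order books have spreads and fees, so these figures are rough.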
This market will resolve according to Epoch AI's FrontierMath benchmark leaderboard (https://epoch.ai/frontiermath) for Tier 1-3. Studies that are not included in the leaderboard (e.g. https://x.com/EpochAIResearch/status/1945905796904005720) will not be considered.
The primary resolution source will be information from Epoch AI; however, a consensus of credible reporting may also be used.
Market opened: Jan 30, 2026, 12:01 AM ET
Resolver
0x65070BE91...
Frequently Asked Questions