xAI's latest Grok 4 models score 12-14% on Epoch AI's FrontierMath benchmark of exceptionally challenging, original math problems—including unsolved research questions—trailing leaders like OpenAI's GPT-5.4 at 47.6%. A May 11 update flagged fatal errors in roughly one-third of Tier 1-4 problems via AI-assisted review, with human-vetted scores forthcoming that could recalibrate standings. xAI's rapid iteration, exemplified by efficient Grok 4.3 topping instruction-following and vibe-coding benchmarks, alongside training of 10-trillion-parameter Grok 5 on Colossus superclusters, drives optimism for math capability leaps. Traders monitor announcements, independent evals, or releases before June 30 amid intensifying frontier AI competition from Anthropic and Google.
Experimental AI summary referencing Polymarket data. This is not trading advice and has no effect on how this market resolves.
xAI Grok score on FrontierMath Benchmark by June 30?
$20,870 Vol.
25%+: 57%
30%+: 48%
40%+: 42%
50%+: 18%
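The listed outcomes are nested thresholds (a score of 40%+ is also 30%+ and 25%+), so differences between adjacent quotes give a rough implied chance for each score band. The following is a minimal sketch of that arithmetic, assuming the quoted percentages can be read as probabilities and ignoring spreads, fees, and the fact that each threshold trades separately, so the quotes need not be mutually consistent. The function name `band_probabilities` is hypothetical and used only for illustration.

```python
# Quoted "Yes" prices in percent, keyed by score threshold in percent.
# Values are copied from the market listing; treating them as
# probabilities is an assumption, not something the page states.
quotes = {25: 57, 30: 48, 40: 42, 50: 18}


def band_probabilities(quotes):
    """Convert nested 'score >= threshold' quotes into per-band chances.

    Each band covers scores from one threshold up to (but not including)
    the next. Integer percentages are used to avoid float rounding noise.
    """
    thresholds = sorted(quotes)
    bands = {}
    # Chance the score lands below the lowest listed threshold.
    bands[f"<{thresholds[0]}%"] = 100 - quotes[thresholds[0]]
    # Chance the score lands between each pair of adjacent thresholds.
    for lo, hi in zip(thresholds, thresholds[1:]):
        bands[f"{lo}-{hi}%"] = quotes[lo] - quotes[hi]
    # Chance the score reaches the highest listed threshold.
    bands[f"{thresholds[-1]}%+"] = quotes[thresholds[-1]]
    return bands


if __name__ == "__main__":
    for band, chance in band_probabilities(quotes).items():
        print(f"{band}: {chance}%")
```

A negative value in any band would indicate that a higher threshold is quoted above a lower one, which nested outcomes should not allow.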
This market will resolve according to Epoch AI's FrontierMath benchmark leaderboard (https://epoch.ai/frontiermath) for Tier 1-3. Studies that are not included in the leaderboard (e.g. https://x.com/EpochAIResearch/status/1945905796904005720) will not be considered.
The primary resolution source will be information from EpochAI; however, a consensus of credible reporting may also be used.
Market opened: Jan 30, 2026, 12:01 AM ET
Resolver
0x65070BE91...
Frequently asked questions