Google's Gemini 3.1 Pro Preview currently sits at 44.7% on Humanity's Last Exam (HLE), a 2,500-question benchmark testing expert-level reasoning across math, the sciences, and the humanities. That trails Anthropic's leading Claude variants, which stand at 53.3%. The 7.2-point gain over the 37.5% posted by Gemini 3 Pro in late 2025 reflects iterative improvements in reasoning capabilities and test-time scaling, yet Anthropic holds its edge through stronger adaptive techniques on this frontier benchmark. With the June 30 deadline approaching, any unannounced Gemini preview or official update could shift scores, and competitive pressure from OpenAI's GPT-5.4 series adds uncertainty about further leaderboard movement before resolution. Traders are monitoring Google DeepMind announcements for signs of rapid iteration that might close the gap.
Experimental AI-generated summary based on Polymarket data. This is not trading advice and has no bearing on how this market resolves.
$318,389 volume
50%+: 3%
55%+: 3%
60%+: 2%
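To put the outcome thresholds in context, the following is a minimal sketch of the arithmetic, not part of the market's resolution rules. It assumes the 50%+, 55%+, and 60%+ outcomes refer to Gemini's HLE score and that the score is plain accuracy over all 2,500 questions; neither is confirmed on this page. The function names `points_needed` and `questions_needed` are hypothetical helpers introduced for illustration.

```python
import math

# Figures quoted in the summary on this page (percent accuracy on HLE).
GEMINI_SCORE = 44.7
LEADER_SCORE = 53.3
TOTAL_QUESTIONS = 2500

# Assumption: the market's outcome thresholds apply to Gemini's HLE score.
THRESHOLDS = [50.0, 55.0, 60.0]


def points_needed(current: float, target: float) -> float:
    """Percentage points still required to reach target (0 if already there)."""
    return round(max(target - current, 0.0), 1)


def questions_needed(current: float, target: float,
                     total: int = TOTAL_QUESTIONS) -> int:
    """Additional correct answers required, assuming score = correct / total."""
    extra = points_needed(current, target) * total / 100
    # Round away float noise before taking the ceiling.
    return math.ceil(round(extra, 6))


if __name__ == "__main__":
    for target in THRESHOLDS:
        print(f"{target:.0f}%+: {points_needed(GEMINI_SCORE, target)} points, "
              f"about {questions_needed(GEMINI_SCORE, target)} more questions")
    print(f"Gap to leader: {points_needed(GEMINI_SCORE, LEADER_SCORE)} points")
```

Under these assumptions, even the lowest threshold requires a gain of more than five percentage points over the currently listed score.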
The resolution source will be the official Humanity’s Last Exam leaderboard https://scale.com/leaderboard/humanitys_last_exam.
Market opened: Jan 29, 2026, 12:50 PM ET
Resolver
0x65070BE91...
Beware of external links.
Frequently Asked Questions