Google's Gemini 3.1 Pro Preview currently sits at 44.7% on Humanity's Last Exam (HLE), a 2,500-question benchmark testing expert-level reasoning across math, sciences, and humanities, trailing Anthropic's leading Claude variants at 53.3%. Recent gains from the 37.5% posted by Gemini 3 Pro in late 2025 reflect iterative improvements in reasoning capabilities and test-time scaling, yet Anthropic's edge stems from stronger adaptive techniques on this saturated frontier benchmark. With the June 30 deadline approaching, any unannounced Gemini preview or official update could shift scores, while competitive pressure from OpenAI's GPT-5.4 series adds uncertainty around further leaderboard movement before resolution. Traders monitor Google DeepMind announcements for signs of rapid iteration that might close the gap.
Experimental AI-generated summary referencing Polymarket data. This is not trading advice and has no bearing on how this market resolves. · Updated
$318,389 volume
50%+: 3%
55%+: 3%
60%+: 3%
The resolution source will be the official Humanity’s Last Exam leaderboard https://scale.com/leaderboard/humanitys_last_exam.
Market opened: Jan 29, 2026, 12:50 PM ET
Resolver
0x65070BE91...
Frequently asked questions