Google DeepMind’s Gemini 3.1 Pro Preview currently leads the Humanity’s Last Exam leaderboard at 44.7–46.4% accuracy on the 2,500-question PhD-level benchmark, following its February 2026 release that lifted performance well above Gemini 3 Pro’s 37.5%. This edge over OpenAI’s GPT-5.4 and Anthropic’s Claude models reflects ongoing gains in chain-of-thought reasoning and multimodal capabilities. A May 7 API update introducing Gemini 3.1 Flash-Lite, alongside reports of enhanced “thinking” modes and a possible Gemini 4 preview, signals continued rapid iteration before the June 30 cutoff. Traders weigh these incremental releases against risks of benchmark saturation and evaluation variance when assessing whether any Gemini variant will clear higher score thresholds.
Experimental AI-generated summary based on Polymarket data. This is not trading advice and does not affect how this market resolves.
$312,098 volume
50%+: 61%
55%+: 25%
60%+: 6%
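The three quoted outcomes (50%+ at 61%, 55%+ at 25%, 60%+ at 6%) are nested thresholds, so the chance of the score landing in each range can be read off by subtraction. The following is a minimal sketch, not part of the market page. It assumes each quoted percentage is the market-implied probability that the best Gemini score reaches at least that threshold by the cutoff, and it ignores fees and bid-ask spread. The function name `bucket_probabilities` is hypothetical.

```python
from typing import Dict, List, Tuple


def bucket_probabilities(thresholds: Dict[float, float]) -> List[Tuple[str, float]]:
    """Split nested 'score >= t' probabilities into disjoint score ranges.

    `thresholds` maps a score threshold (in percent) to the assumed
    market-implied probability of reaching at least that score.
    """
    items = sorted(thresholds.items())  # ascending by threshold
    probs = [p for _, p in items]

    # Nested events: a higher threshold can never be more likely than a lower one.
    if any(a < b for a, b in zip(probs, probs[1:])):
        raise ValueError("probabilities must not increase with the threshold")

    # Everything below the lowest threshold.
    buckets = [(f"below {items[0][0]:g}%", round(1.0 - probs[0], 4))]

    # Ranges between consecutive thresholds: P(>= lo) - P(>= hi).
    for (lo, p_lo), (hi, p_hi) in zip(items, items[1:]):
        buckets.append((f"{lo:g}% to {hi:g}%", round(p_lo - p_hi, 4)))

    # Everything at or above the highest threshold.
    buckets.append((f"{items[-1][0]:g}% or higher", round(probs[-1], 4)))
    return buckets


# Prices quoted on the page, read as probabilities (an assumption).
quoted = {50: 0.61, 55: 0.25, 60: 0.06}
for label, p in bucket_probabilities(quoted):
    print(f"{label}: {p:.0%}")
```

Under these assumptions the quoted prices imply roughly 39% for a score below 50%, 36% for 50% to 55%, 19% for 55% to 60%, and 6% for 60% or higher.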
The resolution source will be the official Humanity’s Last Exam leaderboard https://scale.com/leaderboard/humanitys_last_exam.
Market opened: Jan 29, 2026, 12:50 PM ET
Resolver
0x65070BE91...
Frequently Asked Questions