Google’s Gemini 3.1 Pro Preview currently leads the Humanity’s Last Exam leaderboard at 46.4% accuracy, up from the 37.5% that Gemini 3 Pro scored in late 2025. The benchmark tests frontier large language models on 2,500 expert-level questions in math, science, and the humanities, where even top systems remain well below human PhD performance. Recent API updates, including enhanced reasoning modes and the May rollout of Gemini 3.1 Flash variants, underscore Google DeepMind’s rapid iteration cycle. Traders are watching for potential Gemini 4 previews or further “thinking” optimizations ahead of the June 30 deadline, while noting that benchmark saturation and evaluation variance could cap gains. Intense rivalry with OpenAI’s GPT-5.4 and GPT-5.5 models keeps the competitive landscape fluid.
Experimental AI-generated summary referencing Polymarket data. This is not trading advice and plays no role in how this market resolves.
$312,088 volume
50%+: 66%
55%+: 27%
60%+: 6%
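The outcome prices listed above can be turned into rough per-range probabilities. The sketch below is illustrative only and rests on two assumptions not stated on the page: that each percentage is the market-implied chance of the score reaching that threshold, and that the thresholds are nested (a score of 60%+ also counts as 55%+ and 50%+). The function name `bucket_probabilities` is hypothetical.

```python
def bucket_probabilities(prices):
    """Convert nested threshold prices (in percent) into per-range chances.

    `prices` maps a score threshold (e.g. 50 for "50%+") to the
    market-implied chance, in percent, of reaching that threshold.
    Assumes the thresholds are cumulative, which is an assumption
    about how this market is structured.
    """
    thresholds = sorted(prices)
    buckets = {}
    # Chance that the score stays under the lowest threshold.
    buckets[f"below {thresholds[0]}%"] = 100 - prices[thresholds[0]]
    # Chance of landing between two adjacent thresholds.
    for lo, hi in zip(thresholds, thresholds[1:]):
        buckets[f"{lo}% to under {hi}%"] = prices[lo] - prices[hi]
    # Chance of reaching the highest threshold.
    buckets[f"{thresholds[-1]}%+"] = prices[thresholds[-1]]
    return buckets


# Prices as shown on the page: 50%+ at 66%, 55%+ at 27%, 60%+ at 6%.
implied = bucket_probabilities({50: 66, 55: 27, 60: 6})
for label, chance in implied.items():
    print(f"{label}: {chance}%")
```

Under those assumptions, the most likely single range is 50% to under 55%, and the four ranges sum to 100%.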
The resolution source will be the official Humanity’s Last Exam leaderboard https://scale.com/leaderboard/humanitys_last_exam.
Market opened: Jan 29, 2026, 12:50 PM ET
Resolver
0x65070BE91...
Beware of external links.
Frequently asked questions