Google's Gemini 3.1 Pro Preview currently leads key Humanity's Last Exam (HLE) leaderboards, scoring 44.7% to 51.4% on verified evaluations from Scale Labs and Artificial Analysis. In some rankings it surpasses rivals such as OpenAI's GPT-5.5 (43-57%) and Anthropic's Claude Opus 4.x (36-54%). This reflects rapid scaling of reasoning capabilities in the Gemini 3 series since its late-2025 releases, bolstered by Deep Think modes and agentic enhancements such as April's Deep Research Max, which hit 54.6% on HLE. Trader sentiment hinges on further gains amid competitive pressure: Google I/O on May 19 could bring Gemini updates or new model previews that push scores higher before the June 30 deadline. Discrepancies between benchmarks and variations in tool use leave the resolution uncertain.
Experimental AI-generated summary referencing Polymarket data. This is not trading advice and plays no role in how this market resolves. · Updated
$312,073 Vol.
50%+: 64%
55%+: 31%
60%+: 6%
The resolution source will be the official Humanity’s Last Exam leaderboard https://scale.com/leaderboard/humanitys_last_exam.
Market opened: Jan 29, 2026, 12:50 PM ET
Resolver
0x65070BE91...
Beware of external links.
Frequently Asked Questions