Google DeepMind’s Gemini 3.1 Pro Preview currently leads the Humanity’s Last Exam leaderboard at 46.4 percent accuracy on its 2,500 expert-crafted, PhD-level questions, edging out OpenAI’s GPT-5.4 Pro at 44.3 percent. This edge stems from the February 2026 release that delivered a clear jump over Gemini 3 Pro’s late-2025 score of 37.5 percent, followed by the May 7 rollout of the lighter Gemini 3.1 Flash-Lite variant that further refines reasoning chains. Traders are watching for any pre-June 30 preview of Gemini 4 or expanded “thinking” modes that could push scores higher before the deadline, while noting risks from benchmark saturation and minor evaluation variance across frontier large language models.
Experimental AI-generated summary referencing Polymarket data. This is not trading advice and plays no role in how this market resolves. · Updated
Google Gemini score on Humanity’s Last Exam by June 30?
$312,088 Vol.
50%+: 62%
55%+: 27%
60%+: 6%
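The listed outcomes appear to be nested thresholds, so the prices can be turned into implied probabilities for disjoint score ranges. The sketch below assumes that each percentage is the market-implied probability that the Gemini score lands at or above that threshold; the page does not state this, and the function name `bucket_probabilities` is purely illustrative.

```python
# Assumption: each listed percentage is the market-implied probability that the
# score is at or above that threshold, and the thresholds are nested.
thresholds = [(50, 0.62), (55, 0.27), (60, 0.06)]


def bucket_probabilities(nested):
    """Convert nested 'at least X' probabilities into disjoint score ranges."""
    buckets = {}
    first_level, first_prob = nested[0]
    # Everything not covered by the lowest threshold falls below it.
    buckets[f"below {first_level}%"] = round(1.0 - first_prob, 2)
    # Each middle range is the difference between adjacent thresholds.
    for (level, prob), (next_level, next_prob) in zip(nested, nested[1:]):
        buckets[f"{level}% to under {next_level}%"] = round(prob - next_prob, 2)
    last_level, last_prob = nested[-1]
    buckets[f"{last_level}% or higher"] = round(last_prob, 2)
    return buckets


implied = bucket_probabilities(thresholds)
for label, p in implied.items():
    print(f"{label}: {p:.0%}")
```

Under that reading, the most likely single range is a score below 50%, with a score of 50% to under 55% close behind.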
The resolution source will be the official Humanity’s Last Exam leaderboard https://scale.com/leaderboard/humanitys_last_exam.
Market Opened: Jan 29, 2026, 12:50 PM ET
Resolver
0x65070BE91...
Beware of external links.
Frequently Asked Questions