Trader sentiment on Claude achieving a milestone score on Humanity’s Last Exam—a frontier benchmark of 2,500 expert-level questions testing AI reasoning across math, science, and humanities—hinges on the Scale Labs leaderboard, which evaluates models without tools at temperature 0.0. Anthropic's Claude Opus 4.7, released April 16, 2026, scores around 36-47% there, trailing recent frontrunners like xAI's Grok 4 at 50.7% as of early May, while self-reported aggregator scores inflate Claude Mythos Preview (unreleased) to 64.7%. Rapid 2026 releases from Anthropic signal potential for Opus 4.8 or Claude 5 before June 30, but leaderboard verification lags and competitive pressure from OpenAI's GPT-5.5 and Google's Gemini 3.1 keep probabilities modest amid slipping timelines and benchmark discrepancies.
Experimental AI-generated summary referencing Polymarket data. This is not trading advice and plays no role in how this market resolves.
$283,400 Vol.
45%+: 18%
50%+: 9%
55%+: 4%
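The three outcome prices listed above can be read as a rough probability ladder. The sketch below is illustrative only: it assumes each price is the market-implied probability that Claude's leaderboard score ends at or above that threshold, that the thresholds are nested (a 50%+ result also counts as 45%+), and that fees and bid/ask spreads can be ignored. None of this is stated on the market page, and the function name `bucket_probabilities` is hypothetical.

```python
# Hypothetical sketch: threshold labels and percentages are copied from the
# listing; treating them as nested "score at or above X" probabilities is an
# assumption, not something the market page states.
thresholds = [(45, 0.18), (50, 0.09), (55, 0.04)]


def bucket_probabilities(levels):
    """Turn nested 'at least X' probabilities into disjoint score ranges."""
    levels = sorted(levels)
    buckets = {}
    # Probability of finishing below the lowest threshold.
    buckets[f"below {levels[0][0]}%"] = 1.0 - levels[0][1]
    # Each middle range is the difference between adjacent thresholds.
    for (lo, p_lo), (hi, p_hi) in zip(levels, levels[1:]):
        buckets[f"{lo}% to under {hi}%"] = p_lo - p_hi
    # The top threshold is already an open-ended range.
    buckets[f"{levels[-1][0]}% or higher"] = levels[-1][1]
    return buckets


implied = bucket_probabilities(thresholds)
for label, p in implied.items():
    print(f"{label}: {p:.0%}")
```

Under these assumptions most of the implied probability sits below the lowest threshold, which matches the summary's description of the odds as modest.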
The resolution source will be the official Humanity’s Last Exam leaderboard https://scale.com/leaderboard/humanitys_last_exam.
Market opened: Jan 30, 2026, 12:00 AM ET
Resolver
0x65070BE91...
Beware of external links.
Frequently Asked Questions