Anthropic’s recently previewed Claude Mythos model now leads the Humanity’s Last Exam leaderboard at 64.7 percent accuracy, ahead of OpenAI’s GPT-5.4 Pro at 58.7 percent. This frontier benchmark comprises 2,500 expert-vetted, multi-modal questions spanning advanced mathematics, physics, and specialized academic domains, where earlier Claude Opus 4.6 variants scored around 36–40 percent. The rapid progress stems from targeted scaling of reasoning capabilities and expanded context windows, though model timelines remain fluid and further gains before June 30 would likely require another major training run or architectural refinement. Traders are monitoring Anthropic’s release cadence and any credible reports of internal test results that could shift the implied probability of Claude claiming the top spot by the deadline.
Experimental AI-generated summary referencing Polymarket data. This is not trading advice and plays no role in how this market resolves. · Updated
$283,400 Vol.
45% or higher: 18%
50% or higher: 9%
55% or higher: 4%
The resolution source will be the official Humanity’s Last Exam leaderboard https://scale.com/leaderboard/humanitys_last_exam.
Market opened: Jan 30, 2026, 12:00 AM ET
Resolver
0x65070BE91...
Beware of external links.
Frequently asked questions