Anthropic’s recently previewed Claude Mythos model now leads the Humanity’s Last Exam leaderboard at 64.7 percent accuracy, ahead of OpenAI’s GPT-5.4 Pro at 58.7 percent. This frontier benchmark comprises 2,500 expert-vetted, multi-modal questions spanning advanced mathematics, physics, and specialized academic domains, where earlier Claude Opus 4.6 variants scored around 36–40 percent. The rapid progress stems from targeted scaling of reasoning capabilities and expanded context windows, though model timelines remain fluid and further gains before June 30 would likely require another major training run or architectural refinement. Traders are monitoring Anthropic’s release cadence and any credible reports of internal test results that could shift the implied probability of Claude claiming the top spot by the deadline.
Experimental AI-generated summary based on Polymarket data. This is not trading advice and does not affect how this market resolves.

Claude's score on Humanity's Last Exam by June 30?
$283,400 Volume
45%+: 18%
50%+: 9%
55%+: 4%
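The thresholds and percentages listed above can be read as implied probabilities for disjoint score ranges. The following is a minimal sketch of that arithmetic, assuming each percentage is the market-implied probability that the score lands at or above its threshold and that the thresholds are nested; the page itself does not confirm either point, and the function name `bucket_probabilities` is hypothetical.

```python
# Hypothetical illustration. Assumption: each listed percentage is the
# market-implied probability that the score is at or above the threshold.
thresholds = [(45, 18), (50, 9), (55, 4)]  # (score threshold in %, implied probability in %)


def bucket_probabilities(levels):
    """Convert nested 'at or above' probabilities into disjoint score ranges."""
    levels = sorted(levels)
    buckets = {}
    # Whatever the lowest threshold does not cover falls below it.
    buckets[f"below {levels[0][0]}%"] = 100 - levels[0][1]
    # Each middle range is the difference between adjacent thresholds.
    for (lo, p_lo), (hi, p_hi) in zip(levels, levels[1:]):
        buckets[f"{lo}% to {hi}%"] = p_lo - p_hi
    # The highest threshold keeps its own probability.
    buckets[f"{levels[-1][0]}%+"] = levels[-1][1]
    return buckets


buckets = bucket_probabilities(thresholds)
for label, pct in buckets.items():
    print(f"{label}: {pct}%")
```

Integer percentages are used instead of floats so the differences stay exact. Under the stated assumption, the ranges sum to 100 percent.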
The resolution source will be the official Humanity’s Last Exam leaderboard https://scale.com/leaderboard/humanitys_last_exam.
Market opened: Jan 30, 2026, 12:00 AM ET
Resolver
0x65070BE91...
Do not trust external links.
Frequently asked questions