Recent releases of GPT-5.5 Instant from OpenAI and Grok 4.3 from xAI have kept those companies at the forefront of benchmark leadership through mid-May 2026, while Anthropic’s Claude Opus 4.7 and Google’s Gemini 3.1 Pro continue to post top scores in reasoning, coding, and multimodal tasks. The four major labs now sit within a narrow performance band on key evaluations such as GPQA, SWE-Bench, and arena leaderboards, with incremental gains in context windows and agentic capabilities driving the tight race. Traders are watching for any late-May updates ahead of Google I/O and potential July model drops that could shift the #1 spot by the June 30 cutoff.
Experimental AI summary referencing Polymarket data. This is not trading advice and does not affect how this market is resolved.

$1,563,416 Vol.

OpenAI: 11%
xAI: 5%
Mistral: 4%
Meta: 4%
Z.ai: 2%
DeepSeek: 2%
Alibaba: 2%
Nvidia: 2%
Baidu: 2%
Meituan: 1%
Results from the "Arena Score" section on the Leaderboard tab of https://lmarena.ai/ with the style control unchecked will be used to resolve this market.
If a listed model ties for #1 Arena score, it will suffice to resolve this market to "Yes."
The resolution source for this market is the Chatbot Arena LLM Leaderboard found at https://lmarena.ai/. If this resolution source becomes unavailable, the market will remain open until it is accessible again. If it becomes permanently unavailable, resolution will be based on another credible source.
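The tie rule above can be illustrated with a minimal Python sketch. It assumes the leaderboard has been copied by hand into a list of rows; the `Row` layout, the `resolves_yes` function, the mapping of models to organizations, and the sample scores are all hypothetical and are not taken from lmarena.ai or Polymarket.

```python
from typing import Iterable, NamedTuple


class Row(NamedTuple):
    model: str          # model name as shown on the leaderboard (hypothetical)
    organization: str   # company the model is attributed to (assumed mapping)
    arena_score: float  # "Arena Score" with style control unchecked


def resolves_yes(rows: Iterable[Row], organization: str) -> bool:
    """Return True if any model from `organization` holds or ties the top score."""
    rows = list(rows)
    if not rows:
        raise ValueError("leaderboard is empty")
    top = max(r.arena_score for r in rows)
    # A tie for #1 is enough: every row whose score equals the maximum counts.
    return any(
        r.organization == organization and r.arena_score == top for r in rows
    )


# Made-up scores, for illustration only.
sample = [
    Row("model-a", "LabOne", 1500.0),
    Row("model-b", "LabTwo", 1500.0),
    Row("model-c", "LabThree", 1490.0),
]
print(resolves_yes(sample, "LabOne"))
print(resolves_yes(sample, "LabTwo"))
print(resolves_yes(sample, "LabThree"))
```

In this made-up sample, the two organizations tied at the top score would both qualify, while the third would not.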
Market opened: Dec 22, 2025, 5:28 PM ET
Resolver
0x65070BE91...
Frequently Asked Questions