Last Week in AI

Last Week in AI #308 - The Leaderboard Illusion, ChatGPT Glazing, Qwen 3, Ernie X1

OpenAI undoes its glaze-heavy ChatGPT update; Alibaba unveils Qwen 3, a family of ‘hybrid’ AI reasoning models; Baidu's ERNIE X1 and 4.5 Turbo boast high performance at low cost

Last Week in AI
May 02, 2025

Top News

The Leaderboard Illusion

The authors of this paper argue that over-reliance on a single leaderboard can lead to overfitting and gaming of the system rather than genuine technological advancement. They conducted a systematic review of Chatbot Arena, analyzing data from 2 million battles, 42 providers, and 243 models over a fixed time peri…

© 2025 Skynet Today