Meta’s AI Benchmarks May Not Tell the Full Story

Meta recently introduced a new flagship AI model called Maverick, which quickly climbed to the second spot on LM Arena, a popular benchmark where human raters compare AI model responses. However, there’s a catch: the version of Maverick used on LM Arena isn’t the same as the one available to developers.

According to Meta’s announcement, the Maverick deployed on LM Arena is an “experimental chat version.” In addition, a chart on the official Llama website shows that the LM Arena testing used a version of Llama 4 Maverick specifically optimized for conversation.

This raises concerns, especially since LM Arena has never been the most reliable measure of AI model performance. Even so, most companies don’t tweak their models just to score better on it, or at least they don’t admit to doing so.

The main issue here is transparency. When a company fine-tunes a model to shine on a benchmark but releases a more basic version to the public, it becomes hard for developers to know what to expect. It also gives a skewed view of how strong the model is across various real-world tasks.

Several researchers on X (formerly Twitter) have noticed big differences between the Maverick model on LM Arena and the one available for download. The Arena version, for example, uses lots of emojis and gives long-winded answers, behavior not seen in the downloadable version.

“Okay Llama 4 is def a little cooked lol, what is this yap city” — @natolambert, April 6, 2025

“For some reason, the Llama 4 model in Arena uses a lot more emojis. On together.ai, it seems better.” — @techdevnotes, April 6, 2025

Written by Hajra Naz
