OpenAI employees publicly accuse xAI of publishing misleading benchmark results for its latest AI model, Grok3

2025-02-23 11:03:04

ChainCatcher news: according to Jinshi, an OpenAI employee has publicly accused Elon Musk's company xAI of publishing misleading benchmark results for its latest AI model, Grok3. In response, xAI co-founder Igor Babuschkin insisted that the company did nothing wrong.

xAI's chart shows two versions of Grok3, Grok3 Reasoning Beta and Grok3 mini Reasoning, outperforming OpenAI's strongest currently available model, o3-mini-high, on AIME 2025. However, OpenAI employees quickly pointed out on the X platform that xAI's chart omitted o3-mini-high's AIME 2025 score under the "cons@64" condition.
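For readers unfamiliar with the term, "cons@64" is generally understood to mean consensus over 64 samples: the model answers each problem 64 times and the most frequent answer is the one that gets graded. The sketch below illustrates why this can score higher than a single attempt. The function names and the sample data are hypothetical and are not taken from either company's evaluation code.

```python
from collections import Counter


def single_attempt_accuracy(samples, correct):
    """Fraction of individual samples that match the correct answer."""
    return sum(s == correct for s in samples) / len(samples)


def cons_at_k(samples, correct, k=64):
    """Grade only the most frequent answer among the first k samples."""
    votes = Counter(samples[:k])
    majority_answer, _ = votes.most_common(1)[0]
    return majority_answer == correct


# Hypothetical problem: the model gives the right answer ("42")
# in 44 of 64 attempts and a wrong answer ("17") in the other 20.
samples = ["17"] * 20 + ["42"] * 44

print(single_attempt_accuracy(samples, "42"))  # 0.6875
print(cons_at_k(samples, "42"))                # True
```

In this made-up case a single attempt is right about 69% of the time, yet the majority vote counts the problem as fully solved, which is why comparing one model's single-attempt score with another's cons@64 score can flatter the latter.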

Babuschkin argued on the X platform that OpenAI had itself previously published similarly misleading benchmark charts, although those charts compared the performance of OpenAI's own models.
