OpenAI employees publicly accuse xAI of publishing misleading benchmark results for its latest AI model, Grok 3
ChainCatcher news: according to a Jinshi report, an OpenAI employee has publicly accused Elon Musk's company xAI of publishing misleading benchmark results for its latest AI model, Grok 3. In response, xAI co-founder Igor Babuschkin insisted that the company did nothing wrong. xAI's chart shows two versions of the model, Grok 3 Reasoning Beta and Grok 3 mini Reasoning, outperforming o3-mini-high, OpenAI's strongest currently available model, on AIME 2025. However, OpenAI employees quickly pointed out on the X platform that xAI's chart omitted o3-mini-high's AIME 2025 score under the "cons@64" condition (consensus@64, in which the model is given 64 attempts at each problem and its most frequent answer is taken as final). Babuschkin countered on the X platform that OpenAI had itself previously published similarly misleading benchmark charts, although those charts compared the performance of OpenAI's own models.