RE: LeoThread 2025-02-18 09:48

in LeoFinance, 2 months ago

xAI claims Grok 3 beats GPT-4o on benchmarks including AIME (which evaluates a model’s performance on a sampling of math questions) and GPQA (which assesses models using PhD-level physics, biology, and chemistry problems). xAI also says an early version of Grok 3 scored competitively in Chatbot Arena, a crowdsourced test that pits different AI models against each other and has users vote on their preferred responses.