RE: LeoThread 2025-02-18 09:48

taskmaster4450le (81)in LeoFinance • 2 months ago

xAI claims Grok 3 beats GPT-4o on benchmarks including AIME (which evaluates a model’s performance on a sampling of math questions) and GPQA (which assesses models using PhD-level physics, biology, and chemistry problems). An early version of Grok 3 also scored competitively in Chatbot Arena, a crowdsourced test that pits different AI models against each other and has users vote on their preferred responses, according to xAI.
