RE: LeoThread 2024-10-22 21:22

in LeoFinance, 4 months ago

Model Improvements: Raising the Bar

Anthropic has released two updated models: Claude 3.5 Sonnet (new) and Claude 3.5 Haiku. The new Sonnet improves on its predecessor across the board, with particularly notable gains in coding, an area where Claude was already considered an industry leader.

Benchmark results paint a compelling picture of the new Claude 3.5 Sonnet's capabilities:

  • Graduate-level reasoning (GPQA): Improved from 59% to 65%
  • MMLU Pro: Advanced from 75% to 78%
  • Math problem solving: Substantial jump from 71% to 78%
  • High school math competitions: 16% using zero-shot chain of thought, nearly double the previous version's score
  • Agentic coding: Significant increase from 33% to 49%
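
To put the jumps above on a common footing, here is a minimal Python sketch that turns each before/after pair into a percentage-point gain and a relative gain. The scores are the rounded figures quoted in this post; the `gains` helper and the dictionary labels are just illustrative names. The high school math item is left out because the post does not state the previous score.

```python
# Benchmark scores (percent) as quoted in the post: (previous, new).
# The high school math competition result is omitted because the post
# gives no exact previous score for it.
scores = {
    "Graduate-level reasoning": (59, 65),
    "MMLU Pro": (75, 78),
    "Math problem solving": (71, 78),
    "Agentic coding": (33, 49),
}


def gains(prev, new):
    """Return (percentage-point gain, relative gain in percent)."""
    points = new - prev
    relative = points / prev * 100
    return points, relative


for name, (prev, new) in scores.items():
    points, relative = gains(prev, new)
    print(f"{name}: +{points} points ({relative:.1f}% relative)")
```

On these rounded numbers, agentic coding stands out: a 16-point gain is roughly a 48% relative improvement, while the other three benchmarks move by about 4% to 10% in relative terms.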