RE: LeoThread 2024-10-11 15:52

mightpossibly (68)in LeoFinance • 7 months ago

Success in Simple Tasks: o1-preview achieved perfect scores on some simpler tasks, such as the "Blocks World" challenge, significantly outperforming both GPT-4 and o1-mini.
Struggles with Complexity: Despite these improvements, all models, including o1-preview, struggled with more complex spatial reasoning tasks. The "Floor Tile" and "Termes" challenges, which involve multi-dimensional planning, proved particularly difficult.
