RE: Advanced Large Language Models Are Capable And Prone to In-Context Scheming

gadrian (75)in The Mindful AI • 8 days ago

Anthropic's reasoning models have also been caught ignoring certain safeguards and intentionally lying when they judged that doing so was the best way to avoid being updated during the post-training phase.

$0.00

1 vote

markkujantunen (77) 8 days ago

Yeah, the author of the video shows a table of the ways various models cheat. It's quite an interesting read.

$0.04

1 vote

gadrian (75) 7 days ago

I should watch it then! I thought he only talked about o1.

$0.01

1 vote