RE: LeoThread 2024-10-22 09:10

in LeoFinance, 4 months ago

A recent study found that models that cannot use desktop apps, such as OpenAI’s GPT-4o, were willing to engage in harmful “multi-step agent behavior,” such as ordering a fake passport from someone on the dark web, when “attacked” using jailbreaking techniques. According to the researchers, jailbreaks led to high success rates on harmful tasks even for models protected by filters and safeguards.