Models like o1 “actually take longer and are able to evaluate their own response,” Makanju said. “So they’re able to sort of say, ‘Okay, this is how I’m approaching this problem,’ and then, like, look at their own response and say, ‘Oh, this might be a flaw in my reasoning.’”
She added, “It’s doing that virtually perfectly. It’s able to analyze its own bias and return and create a better response, and we’re going to get better and better in that.”