RE: LeoThread 2025-02-01 10:54

The team observed the same pattern with Einstein’s riddle: GPT-3 failed when asked to solve versions of the puzzle larger than the ones it had been fine-tuned on. “It’s mimicking something that it has seen, but it doesn’t have full understanding of it,” Dziri said.

As Dziri and her co-authors were finalizing their results, a different team was taking another approach to understanding why LLMs struggle with compositional tasks. Binghui Peng, at the time a doctoral student at Columbia University, was working with one of his advisers, Christos Papadimitriou, and colleagues to understand why LLMs “hallucinate,” or generate factually incorrect information. Peng, now a postdoctoral researcher at Stanford University, suspected this was because transformers seem to lack the “capability of composition.”