RE: LeoThread 2025-02-01 10:54

To be clear, this is not the end of LLMs. Wilson of NYU points out that despite such limitations, researchers are beginning to augment transformers to help them better deal with, among other problems, arithmetic. For example, Tom Goldstein, a computer scientist at the University of Maryland, and his colleagues added a twist to how they presented numbers to a transformer that was being trained to add, embedding extra “positional” information in each digit. As a result, the model could be trained on 20-digit numbers and still reliably (with 98% accuracy) add 100-digit numbers, whereas a model trained without the extra positional embedding was only about 3% accurate. “This suggests that maybe there are some basic interventions that you could do,” Wilson said. “That could really make a lot of progress on these problems without needing to rethink the whole architecture.”
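
To make the idea concrete, here is a minimal sketch of what “extra positional information in each digit” could look like in practice: each digit token gets a second embedding that encodes its place value (ones, tens, hundreds, ...), counted from the least-significant digit, so the model can align digits of the same place regardless of how long the numbers are. This is an illustration of the general technique, not the Goldstein team's actual code; the names `DigitEmbedding`, `d_model`, and `max_digits` are assumptions made for the example.

```python
import torch
import torch.nn as nn

class DigitEmbedding(nn.Module):
    """Sketch: digit-token embedding plus a per-digit place-value embedding."""

    def __init__(self, d_model: int, max_digits: int = 128):
        super().__init__()
        self.digit_embed = nn.Embedding(10, d_model)          # digit tokens 0-9
        self.place_embed = nn.Embedding(max_digits, d_model)  # place value: ones, tens, ...

    def forward(self, digits: torch.Tensor) -> torch.Tensor:
        # digits: (batch, n_digits), written least-significant digit first
        places = torch.arange(digits.size(1), device=digits.device)
        places = places.unsqueeze(0).expand_as(digits)
        # Sum the two embeddings so every digit carries both its value
        # and its position within the number.
        return self.digit_embed(digits) + self.place_embed(places)

# Example: embed the number 305, written least-significant-first as [5, 0, 3]
emb = DigitEmbedding(d_model=64)
x = torch.tensor([[5, 0, 3]])
print(emb(x).shape)  # torch.Size([1, 3, 64])
```

Because the place index is tied to each digit rather than to the overall sequence length, a model trained on short numbers sees the same place-value codes it would need for much longer ones, which is the intuition behind the length generalization described above.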