I'm sure it isn't too late for specialized models. For generalized models, the battle at the top is very tough. Training on other models' outputs introduces another level of approximation, because it bakes those models' hallucinations in as facts, and then your own model hallucinates on top of that. So you can end up with hallucinations stacked on hallucinations. But the bigger limitations are the cost and availability of chips and of inference (the process of producing an answer given a prompt).