As someone pretty close to this space, this is pretty accurate, but regarding the point around 5:20 about models costing more to train, I should point out that pushing the limits of the state of the art is expensive, while catching up to the status quo is actually becoming much cheaper as the SOTA begins to plateau. Meta's Llama 3.3 70B beats GPT-4 and GPT-4o, for instance, and it cost a fraction as much to train. Similar results can be achieved with smaller, faster, and less expensive models today. Outside of that point, most of the video was generally correct.