You are referring to data degradation, often called model collapse: the idea that each generation of training on model outputs decays the value of the data, since it keeps reinforcing what you described.
That theory is hotly debated. To start, algorithm design can counter it, especially through how the data is weighted. We are also looking at the ability to keep enhancing the data with human responses and feedback.
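To make the weighting idea concrete, here is a minimal sketch of up-weighting human-written data when sampling training batches, so each generation does not drift toward purely model-generated text. The `source` field, the 3:1 weights, and the example records are illustrative assumptions, not any particular system's design.

```python
import random

# Hypothetical example records; the "source" tag is an assumption for illustration.
dataset = [
    {"text": "human-written post", "source": "human"},
    {"text": "model-generated reply", "source": "synthetic"},
    {"text": "human comment with feedback", "source": "human"},
]

# Assumed weights: human data counts 3x as much as synthetic data.
WEIGHTS = {"human": 3.0, "synthetic": 1.0}

def sample_batch(data, batch_size):
    """Draw a training batch, biased toward human-sourced examples."""
    weights = [WEIGHTS[example["source"]] for example in data]
    return random.choices(data, weights=weights, k=batch_size)

batch = sample_batch(dataset, batch_size=2)
print([example["source"] for example in batch])
```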
I have seen some who have trained models entirely on synthetic data. A layer of curation code has to go over the top.
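As a purely illustrative example of that kind of layer, here is a small filter pass that drops low-quality synthetic samples before they re-enter training. The length and repetition heuristics are assumptions; real curation pipelines are more involved.

```python
def repetition_ratio(text: str) -> float:
    """Fraction of words that are duplicates; high values suggest degenerate text."""
    words = text.lower().split()
    if not words:
        return 1.0
    return 1.0 - len(set(words)) / len(words)

def keep_sample(text: str, min_words: int = 5, max_repetition: float = 0.5) -> bool:
    """Keep a synthetic sample only if it passes basic quality heuristics."""
    return len(text.split()) >= min_words and repetition_ratio(text) <= max_repetition

samples = [
    "The network processes each token once per layer.",
    "data data data data data data",
]
curated = [s for s in samples if keep_sample(s)]
print(curated)  # the repetitive sample is filtered out
```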
That's interesting. So, as much as we can, we should comment on these AI responses we post.
Without a doubt. Adding more information to the post helps with context.
It is also why I run the synthetic text through Leolinker to add the links. That feeds the knowledge graph that is set up.
Leolinker? What's that?
It instantly places Leoglossary links into whatever text is entered. That helps with context.
https://leolinker.site/
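For illustration only, here is a tiny sketch of the general idea behind glossary linking: scan text for known terms and wrap the first occurrence of each in a link. The `GLOSSARY` entries and URLs are hypothetical placeholders, not Leolinker's actual data or behavior.

```python
import re

# Hypothetical glossary; the terms and URLs are made up for this sketch.
GLOSSARY = {
    "blockchain": "https://example.org/glossary/blockchain",
    "knowledge graph": "https://example.org/glossary/knowledge-graph",
}

def add_links(text: str) -> str:
    """Replace the first occurrence of each glossary term with a markdown link."""
    for term, url in GLOSSARY.items():
        pattern = re.compile(re.escape(term), re.IGNORECASE)
        text = pattern.sub(f"[{term}]({url})", text, count=1)
    return text

print(add_links("This post covers the blockchain and the knowledge graph."))
```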
That's cool!!!