Quality data is a very important metric, but I would also add that it is then minable by any other tool, without the extensive GPU processing needed to check a video, which can also be deleted later on.
But like @mightpossibly said, there are many perspectives.
Either way, "text" is a very cheap resource on #hive, and if anyone wants to generate more data, they will consume more RC, and that in itself should bring more value to the chain, just from using the chain.
You're right. That's a perfect system.