There is one more thing. I went through past top lists, and a pattern emerges: most of the top 10 posts are about the Steemit platform or cryptocurrencies. This is, of course, not surprising. These are popular tags, so just by posting in these categories you can already expect a higher payout than in other categories.
Maybe @trufflepig should balance the rewards out. It should look for good content, not so much for popular tags. So my idea is to slightly penalize posts with popular tags and promote posts with less popular tags by rescaling each post's payout by the average payout of its tag.
For example, let's say we have post A with tag `dog` being paid 10 SBD and post B with tag `cat` being paid 8 SBD. Moreover, on average `dog` yields 6 SBD per post, but `cat` only 4 SBD. Regardless of tag, a post is awarded 5 SBD on average. Next, we compensate for the popular and unpopular tags by dividing each payout by the ratio between the average payout per tag and the total average payout. For instance, post A's reward is rescaled to 10 * 5/6 = 8.3 SBD and post B's to 8 * 5/4 = 10 SBD.
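The arithmetic of this example can be written down as a small Python sketch. The function name `rescale_payout` and its parameters are hypothetical and only illustrate the idea; they are not taken from the TrufflePig code base, and the sketch assumes one tag per post, as in the example.

```python
def rescale_payout(payout, tag_average, overall_average):
    """Rescale a payout by the ratio of overall average to tag average.

    Hypothetical helper: posts with above-average tags are scaled down,
    posts with below-average tags are scaled up.
    """
    if tag_average <= 0:
        raise ValueError("tag_average must be positive")
    return payout * overall_average / tag_average


# Numbers from the dog/cat example (all values in SBD).
overall_average = 5.0
post_a = rescale_payout(10.0, tag_average=6.0, overall_average=overall_average)
post_b = rescale_payout(8.0, tag_average=4.0, overall_average=overall_average)

print(f"post A: {post_a:.1f} SBD")
print(f"post B: {post_b:.1f} SBD")
```

After rescaling, post B ends up ahead of post A, even though A received the higher raw payout.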
I have two options to include this in the algorithm:
1. Directly rescaling rewards in the training set. Hence, TrufflePig would directly predict already rescaled rewards for new posts.
2. Keep predicting the original expected reward, but adjusting the top list according to the rescaled rewards to promote less popular tag posts in the daily truffle picks.
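As a rough illustration of the second option, the sketch below leaves the predicted rewards untouched and only uses the rescaled values to order the top list. The post dictionaries, the `predicted_reward` field, and the function `rank_by_rescaled_reward` are assumptions made for this example, not the actual TrufflePig data model.

```python
def rank_by_rescaled_reward(posts, tag_averages, overall_average, top_n=10):
    """Sort posts by their tag-rescaled predicted reward, highest first.

    Hypothetical sketch: the predictions themselves are not modified,
    only the ordering of the top list changes.
    """
    def rescaled(post):
        return post["predicted_reward"] * overall_average / tag_averages[post["tag"]]

    return sorted(posts, key=rescaled, reverse=True)[:top_n]


# Reusing the dog/cat numbers from the example above (values in SBD).
posts = [
    {"title": "A", "tag": "dog", "predicted_reward": 10.0},
    {"title": "B", "tag": "cat", "predicted_reward": 8.0},
]
tag_averages = {"dog": 6.0, "cat": 4.0}

top = rank_by_rescaled_reward(posts, tag_averages, overall_average=5.0)
print([post["title"] for post in top])
```

With these numbers post B is listed before post A, while each post still reports its original predicted reward.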
What do you think about this?
Currently experimenting with version 2...
... and merged into master. Still, lots of posts about Steemit appear in the top list. But if the community loves posts about itself so much, I cannot help it ;-)