Hi, yeah, I'm currently in the mode of tweaking it, which is even more intricate than the machine learning stuff :-D
Let me briefly address your points:
Regarding 1: Yes, Steem has its flaws, but I currently do not know of any better or more direct way to measure whether people like something than the votes and rewards paid out on this platform. I'm less worried about the judgment of the community than about abuse through bid bots and bought rewards. However, I am currently trying to mitigate this by filtering out rewards and votes provided by bid bots like @upme and vote services such as @smartsteem.
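To give a rough idea of what such filtering could look like, here is a minimal sketch. The vote layout (dicts with `voter` and `rshares` keys), the helper names, and the example numbers are assumptions made up for illustration, and the blacklist only contains the two accounts mentioned above; it is not necessarily how @trufflepig does it.

```python
# Accounts named above; a real blacklist would be considerably longer.
BID_BOTS = {"upme", "smartsteem"}


def filter_bot_votes(votes, bots=BID_BOTS):
    """Return only the votes that were not cast by a known bot account."""
    return [vote for vote in votes if vote["voter"] not in bots]


def was_promoted(votes, bots=BID_BOTS):
    """True if at least one vote on the post came from a known bot account."""
    return any(vote["voter"] in bots for vote in votes)


# Hypothetical votes on a single post (numbers are invented).
votes = [
    {"voter": "alice", "rshares": 1200},
    {"voter": "upme", "rshares": 500000},
    {"voter": "bob", "rshares": 800},
]

organic_votes = filter_bot_votes(votes)
organic_voters = [vote["voter"] for vote in organic_votes]
```

With the bot votes removed, the remaining votes are a somewhat cleaner signal of what people actually liked.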
Regarding 2: The Machine Learning model is not linear, meaning there does not necessarily exist a relationship like reward = x * flesch_kincaid_index. The Flesch-Kincaid index is just one of about 150 dimensions that describe a post. The Machine Learning model tries to infer by itself how to make use of this index in order to predict the reward.
Let's take your example. Suppose we have a corpus with William Faulkner, Ernest Hemingway, as well as texts by 11-year-old Marc who likes ponies. The former two get large rewards, but score low on the Flesch-Kincaid index. On the other hand, little Marc doesn't get much for his texts about his beloved ponies, yet he does achieve high scores on the index due to his rather short and simple sentences.
Accordingly, the Machine Learning model will see this data and, consequently, come up with a rule like: IF Flesch-Kincaid index IS low THEN high reward. Hence, the value of the index itself is not proportional to the reward. There are more intricate and non-linear ways in which the index determines the payout, and all of these are learned or inferred from actual data (i.e. previous Steemit posts).
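To make the IF/THEN idea a bit more concrete, here is a toy sketch of how such a rule can be learned from data. It fits a single decision stump (one threshold on the index), which is a heavily simplified stand-in for the trees inside a random forest, not the actual model. The numbers are invented for illustration, and `fit_stump` and `predict` are hypothetical helper names.

```python
from statistics import mean

# Invented toy data: (readability index, reward).
posts = [
    (30.0, 120.0),  # dense prose, high reward
    (35.0, 95.0),
    (40.0, 110.0),
    (85.0, 2.0),    # short, simple sentences, low reward
    (90.0, 1.0),
    (95.0, 3.0),
]


def fit_stump(samples):
    """Find the threshold on the index that minimises the squared error."""
    xs = sorted({x for x, _ in samples})
    best = None
    for lo, hi in zip(xs, xs[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in samples if x <= threshold]
        right = [y for x, y in samples if x > threshold]
        left_mean, right_mean = mean(left), mean(right)
        error = sum((y - left_mean) ** 2 for y in left)
        error += sum((y - right_mean) ** 2 for y in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return threshold, left_mean, right_mean


def predict(stump, index):
    """Apply the learned rule: IF index <= threshold THEN left mean ELSE right mean."""
    threshold, low_side, high_side = stump
    return low_side if index <= threshold else high_side


stump = fit_stump(posts)
```

On this toy data the learned rule splits the posts between the two groups and predicts a high reward for a low index, without anyone ever telling the model in which direction the index should point.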
By the way, the Flesch-Kincaid index is not the only measure of readability @trufflepig looks at. The others are: the Gunning Fog index, the SMOG index, the Automated Readability Index, the Coleman-Liau index, and the first four moments of the syllable distribution, i.e. mean, variance, skew, and kurtosis of the number of syllables per word. Fun fact: looking at the random forest's feature importances, @trufflepig bases his decision much more on the latter raw representation of word complexity than on the carefully crafted readability indices listed before :-D.
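For the curious, the four syllable moments could be computed roughly like this. This is only a sketch under a few assumptions: the syllable counter is a crude vowel-group heuristic (it miscounts silent letters, for instance), the moments use the population formulas, the kurtosis is reported as excess kurtosis, and the function names are made up here.

```python
import re


def count_syllables(word):
    """Crude heuristic: one syllable per group of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def syllable_moments(text):
    """Return (mean, variance, skew, excess kurtosis) of syllables per word."""
    counts = [count_syllables(w) for w in re.findall(r"[A-Za-z]+", text)]
    if not counts:
        raise ValueError("text contains no words")
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / n
    if variance == 0:
        # All words have the same length; higher moments are undefined,
        # so report them as zero in this sketch.
        return mean, 0.0, 0.0, 0.0
    std = variance ** 0.5
    skew = sum(((c - mean) / std) ** 3 for c in counts) / n
    kurtosis = sum(((c - mean) / std) ** 4 for c in counts) / n - 3
    return mean, variance, skew, kurtosis


moments = syllable_moments("I like my pony")
```

These four numbers then simply become four more of the dimensions describing a post, and the model decides on its own what to do with them.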
I also looked at the influence of bid bots. It's quite large:
In the training set, 17% of all articles were promoted with bots. In total, users spent more than 3700 STEEM and 69000 SBD on these bots!