A larger sample size would of course be more intuitive, but the Steem API did not seem to permit larger sizes.
Could you use the https://steemdata.com/ service to get a larger sample size and more accurate results?
Do you plan to share this algorithm or the Python source code? It would be interesting to see, and maybe we could improve it together.