I'm not entirely comfortable with the square root of the number of characters. A short comment can actually carry more weight than a long comment filled with gibberish. We need both quantity and quality, and the 125 characters/20 words filter seems enough, imo. Furthermore, scoring on the square root of the number of characters might encourage spamming or plagiarism, and how many people have the time to check comments for plagiarism?
Interesting thoughts! Really!
I am not sure people will really try to spam or plagiarise to increase their score, as this will bring them nothing, at least for now. Moreover, people finishing high on the list will automatically be scrutinised, and blacklisted from the script if needed (so it may work... maybe once :D). I don't know here... Let's continue discussing this.
Back to the square root now. I chose it because it decreases the relative importance of very long comments. Moreover, with the 125-character threshold, the smallest score a comment can bring is 3 points, so comments that are too short are not really impacted, as they simply don't count.
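To make the damping effect concrete, here is a minimal sketch. It is not the actual script: the function name `comment_score`, the hard cutoff, and the plain unscaled square root are all assumptions for illustration, so the absolute numbers will not necessarily match the real ranking.

```python
import math

# Threshold taken from the discussion above; everything else is hypothetical.
MIN_CHARS = 125


def comment_score(text: str, min_chars: int = MIN_CHARS) -> float:
    """Hypothetical score: 0 below the length threshold, sqrt(length) otherwise."""
    n = len(text)
    if n < min_chars:
        # Comments that are too short simply do not count.
        return 0.0
    # The square root grows slowly, so very long comments gain relatively little.
    return math.sqrt(n)


if __name__ == "__main__":
    for length in (100, 125, 500, 2000):
        print(length, round(comment_score("x" * length), 2))
```

With this form, a comment four times longer only scores twice as much, which is the point of using the square root instead of the raw length.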
First, as suggested by @abh12345 below, I have tried to extract the scores without the word count / length limit. Not much changes in terms of the top-20. We have some permutations, but actually nothing more. I may actually remove this filter. I am still thinking about it.
Second, I agree the sqrt may not be ideal. @borislavzlatanov proposed including the number of votes a comment brings. However, I don't know how to do this in practice. Any ideas for another metric?