Somehow I missed that you also changed the way the scoring works; from looking at the GIFs I thought it was still the same. Since I haven't looked at the code, I don't exactly understand how it differs from the current system (you say you use a weighting, but I don't really get it yet). In my opinion it would be better if you kept the behaviour the same and included the sliders as per my suggestion; otherwise I think it could cause problems if some people are using your version while others aren't. But like I said, I haven't looked at the code, so I could be completely wrong about this.
It did get me thinking about something else, though. Maybe it would also be better (not just for your extension, but in general) if, once the sliders have been implemented, the results page included the actual score given per question instead of the answer text. For example:
Q: How would you describe the formatting, language and overall presentation of the post?
A: 8.5/10
instead of
Q: How would you describe the formatting, language and overall presentation of the post?
A: The post is of decent quality.
This way the answers can still be used as a guideline for the reviewer, while the contributor gets a clearer overview of where their contribution was lacking (or excelling). Contributors would also get a better understanding of where we put the emphasis when scoring a contribution (e.g. code quality carries a much higher weighting than the quality of the commit messages), which I've seen some complaints about.
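Just to make the idea a bit more concrete, a rough sketch of what such a results entry could look like (the data shapes and names are purely hypothetical, not taken from any existing code):

```ts
// Hypothetical result entry: instead of echoing the chosen answer text,
// the results page would show the score that answer contributed.
interface QuestionResult {
  question: string;
  score: number;    // e.g. on a 0-10 scale, as set by the slider
  maxScore: number;
}

function renderResult(results: QuestionResult[]): string {
  return results
    .map((r) => `Q: ${r.question}\nA: ${r.score}/${r.maxScore}`)
    .join("\n\n");
}

console.log(
  renderResult([
    {
      question:
        "How would you describe the formatting, language and overall presentation of the post?",
      score: 8.5,
      maxScore: 10,
    },
  ]),
);
// Q: How would you describe the formatting, language and overall presentation of the post?
// A: 8.5/10
```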
The problem with the review result page is that there is no persistence: once the questionnaire is changed, all previous reviews become invalid. The extension does not solve this issue (yet), as it only generates a link to the review page and thus has to follow the rules of their questionnaire implementation.
I'm not sure how feasible this is, because the starting score is 100 and I multiply it by a "weight" in [0, 1] per answer (excluding bonus points), so there is no direct indication of how many points a particular question contributes.
For example (A is the current model, B is the model used in the extension; a short code sketch of both follows the example), take 3 questions with 3 answers each.
A) In the current system, you start at 100 and subtract points when something is not up to par. The deductions per question are (1: 0, -25, -50; 2: 0, -25, -50; 3: 0, -25, -50), and the selected answers are (1, 3, 2).
The resulting score is therefore 100 - 0 - 50 - 25 = 25
B) The starting point is also 100, but each answer has a weight by which the score is multiplied. I adjusted the weights experimentally to correspond to the current scoring.
The weights per question are (1: 1, 0.75, 0.5; 2: 1, 0.75, 0.5; 3: 1, 0.75, 0.5), and the selected answers are the same as in A.
The final score is 100 * 1 * 0.5 * 0.75 = 37.5
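To illustrate the difference, here is a minimal sketch of both models using the example values above (the names and structure are mine, not taken from either implementation):

```ts
// Example values from above: three questions, three answers each.
// Model A (current system): per-answer point deductions from 100.
const penalties: number[][] = [
  [0, -25, -50], // question 1
  [0, -25, -50], // question 2
  [0, -25, -50], // question 3
];

// Model B (extension): per-answer weights the score is multiplied by.
const weights: number[][] = [
  [1, 0.75, 0.5], // question 1
  [1, 0.75, 0.5], // question 2
  [1, 0.75, 0.5], // question 3
];

// `selected` holds 1-based answer indices, e.g. [1, 3, 2].
function scoreSubtractive(selected: number[]): number {
  return selected.reduce((score, a, q) => score + penalties[q][a - 1], 100);
}

function scoreWeighted(selected: number[]): number {
  return selected.reduce((score, a, q) => score * weights[q][a - 1], 100);
}

console.log(scoreSubtractive([1, 3, 2])); // 100 - 0 - 50 - 25 = 25
console.log(scoreWeighted([1, 3, 2]));    // 100 * 1 * 0.5 * 0.75 = 37.5
```

The multiplicative form also shows why there is no fixed point value per question: each weight scales whatever score remains after the other weights have been applied.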
The model is experimental and can be updated at any time.