Having spent a fair chunk of my blogging time yesterday talking about rating scales, I found this Financial Times piece an eye-opener.
“Practice does not help. Neither, surprisingly, does varying the gaps in the scale: it’s no easier to distinguish five sounds between “very loud” and “very quiet” than between “fairly loud” and “fairly quiet”. Some people have perfect pitch and can transcend these limits when it comes to musical tones, but there seem to be few other exceptions. No wonder so many reviews use a scale of one to five stars.”
If true, this would explain not only why so many reviews use a scale of one to five stars, but also why – when presented with a wider scale – reviewers tend to cluster in the middle or at one end of it. Sadly the FT is somewhat vague about citing its sources in this piece.
Here is my own experiential contribution to scale research – which bears this out to some extent. On Popular, as you know, I have a 1 to 10 scale, and there’s a fair amount of discrimination within it. But my internal method of allotting marks tends to be:
1. Go on instinct as to whether a record is good (6-10) or not good (1-5).
2. Discriminate within those ranges.
So I’m still using the five-degrees rule; I’m just chunking records into subcategories before applying it. This is also what I do when ordering end-of-year lists, by the way (yes, we’re getting deep into Hornby territory here!): I put everything into 4 or 5 baskets, then sort within each basket until I get granularity.
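If you wanted to write the basket trick down as code – purely for illustration, the record names and the gut_rating stand-in are invented, since in real life step one is instinct – it might look something like this:

```python
# Hypothetical sketch of the "baskets first, fine sorting second" approach.
# gut_rating() and the record names are made up for illustration.

def gut_rating(record):
    """Instinctive first pass: roughly where does it sit on 1-10?"""
    instinct = {"Record A": 8, "Record B": 3, "Record C": 7, "Record D": 9}
    return instinct[record]

def rank_records(records):
    # Step 1: chunk into coarse baskets (good vs not good).
    baskets = {"good": [], "not good": []}
    for r in records:
        baskets["good" if gut_rating(r) >= 6 else "not good"].append(r)
    # Step 2: discriminate only within each basket.
    for label in baskets:
        baskets[label].sort(key=gut_rating, reverse=True)
    # Good basket first, then the rest: a full ordering without ever
    # having to distinguish more than a handful of things at once.
    return baskets["good"] + baskets["not good"]

print(rank_records(["Record A", "Record B", "Record C", "Record D"]))
# ['Record D', 'Record A', 'Record C', 'Record B']
```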
I suspect this is an iterative process – i.e. the Pitchfork 101-point scale LOOKS ridiculous, but not if the reviewers use a series of decisions to differentiate. Is this good, Y/N? Is it a 6/7/8/9/10? Is it a high or low 7? Is it a 7.1, 7.2 or 7.3? OK, it’s still ridiculous. I love using it though!
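For the curious, here is the same series-of-decisions idea written out as a (very much hypothetical) little function – the three answers are assumed inputs, since in practice a reviewer supplies them by ear:

```python
# Hypothetical sketch of the "series of decisions" reading of a 101-point scale.
# Each stage only asks for a small number of distinctions at a time.

def pitchfork_style_score(is_good, integer_part, tenth):
    # Stage 1: good or not decides which half of the scale we're in.
    # Stage 2: pick the integer within that half.
    # Stage 3: pick the tenth within that integer.
    assert (6 <= integer_part <= 10) if is_good else (0 <= integer_part <= 5)
    assert 0 <= tenth <= 9
    return round(integer_part + tenth / 10, 1)

print(pitchfork_style_score(True, 7, 2))  # "good, a 7, a lowish one" -> 7.2
```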