6 Mar 09

Which is more broken: music criticism or metacritic?

Here’s a post from Hitsville raising an interesting point about Metacritic scores. The case looks pretty clear-cut: movie critics are harsher and more discerning than music ones.

Harsher, yes. More discerning? Well. Allow me to get all research wonkish for a minute (or don’t, and scroll down to the section headed “WHY?”). If you look at the lists of current movies and music on Metacritic, it’s true that the average ratings are much higher for music, and it’s also true that the movie reviews “use” a wider range of scores. But not that much wider. The music reviews use a 53-point range: just over half the scale. The movie ones use a 67-point range: two thirds of the scale. That’s still a lot of scale going unused!

This is a problem of rating scales. If you have a 10-point rating scale, then ten points means good and one point means bad, so six means “OK”. Right? Not necessarily. What actually happens is a cultural skew effect: nearly always, you’ll find marks clustering at the top end of the scale, and in some places the bottom end is hardly used.

A direct comparison of Dutch and Chinese rating scale answers, for instance, wouldn’t be a lot of use: Dutch respondents use the lower end a whole lot more. This type of comparison is exactly what Metacritic is doing, though: its red-yellow-green overlay straddles the middle of the scale, which the movie critics use a lot and the music critics don’t. So the colour scheme exaggerates the extent to which the film guys use the whole scale.

So how do you deal with comparisons across cultures – those Dutch and Chinese respondents, for instance? Partly, you look at the distribution as well as the mean, which lets you normalise the results.

As an experiment I took all the scores – 47 for movies, 123 for music – on the front pages of Metacritic, took the average of each list, and indexed every score against that average. The indices go higher and lower at their extremes for movies – as you’d expect, since the movie critics use more of the scale. But this gives you a better at-a-glance reading of an album or film’s relative standing within its art form than the raw rating data does.
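
To make that concrete, here’s a minimal Python sketch of the indexing step – assuming “indexed on that average” just means each score divided by its group’s mean and rebased to 100; the numbers below are made-up placeholders, not the actual front-page data:

```python
def index_scores(scores):
    """Rebase a list of raw Metacritic scores so that 100 = the group average."""
    mean = sum(scores) / len(scores)
    return [round(100 * s / mean, 1) for s in scores]

# Placeholder numbers, NOT the real front-page lists
movie_scores = [84, 71, 62, 55, 38, 27]
music_scores = [82, 78, 74, 70, 66, 61]

print(index_scores(movie_scores))  # indices spread well away from 100
print(index_scores(music_scores))  # indices huddle close to 100
```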

But what about the colour scheme – which is an even better at-a-glance guide? Well, the magic of stats lets us reimpose red-yellow-green in a way that works better. Take the standard deviation of the index scores, give red (bad) to anything more than 1 SD below 100 and green (good) to anything more than 1 SD above it, and you’ve got yourself a method which makes the colours a bit more meaningful.
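
And here, with the same caveat, is a sketch of the re-colouring: the only bit taken from the paragraph above is the ±1 standard deviation cut-off around 100 – the function name, the use of the sample standard deviation and the example indices (carried over from the indexing sketch) are just for illustration:

```python
import statistics

def traffic_light(indices):
    """Red for anything more than 1 SD below 100, green for anything
    more than 1 SD above, yellow for everything in between."""
    sd = statistics.stdev(indices)
    colours = []
    for i in indices:
        if i < 100 - sd:
            colours.append("red")
        elif i > 100 + sd:
            colours.append("green")
        else:
            colours.append("yellow")
    return colours

# Indices carried over from the indexing sketch above (still placeholder data)
movie_indices = [149.6, 126.4, 110.4, 97.9, 67.7, 48.1]
print(list(zip(movie_indices, traffic_light(movie_indices))))
```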

What does this do to the scores? Here’s an Excel spreadsheet to demonstrate! Around two-thirds of both films and music end up in yellow: music critics are slightly less likely to push an album into green but the distribution is pretty similar across both media. The movies list becomes better at discriminating between true stinkers and populist mediocrities: Beverly Hills Chihuahua and Paul Blart: Mall Cop sit at the bottom end of yellow, while Bride Wars remains firmly in red. The music list on the other hand improves its discrimination between fine and excellent records – of the four artists Hitsville raises its eyebrow over, only Chris Isaak remains in green (so maybe he should check him out!).

WHY?

None of which answers Hitsville’s original question of why music critics seem to have a different culture than movie reviewers. Hitsville blames it on our old friend “popism” – which he characterises as journalists giving favourable reviews to popular stuff because that’s where the money is. I’m not sure about this, because it doesn’t look to me like that many “big” and popular releases are getting glowing metascores here. The answer for me lies in the huge imbalance between the number of movies and the number of albums that are released.

As I understand it, most ‘tenured’ film critics are expected to see and review a high proportion of films on general release: this just isn’t the case for music critics, who tend to work by pitching or being assigned stuff they’re interested or expert in. Therefore they are far less likely to involuntarily encounter a record they think is crap – disappointment rather than rage is their primary negative emotion. And this is reflected in the skew of their marks.

There’s another important question though, which is: do readers care about the skew? My hunch is that a regular reader of a review source will have a pretty good idea of what each grade means. At Pitchfork, for instance, when I give a record less than 7.0 I will often get links or mail or tweets treating it as effectively a pan – even though a P4K score of 6.1-6.9 is green (good) on the Metacritic scale! The readers know where the average is, even if they haven’t sat down and calculated it.

Comments

  1. marc h. on 6 Mar 2009

    could another factor in the discrepancy also be that there are a lot fewer movies per movie fan than there are albums per music fan? so film reviewers are often critiquing things their audiences have already heard of, via ads or whatever, while in many cases as a music critic even if you’re writing about the biggest pop hits these days (at least in the us) i still assume the average person hasn’t heard it. so oftentimes most of the people reading a music review, particularly of an obscure indie band, are the people who like that band– when you give negative reviews, your readers think you’re a jerk (hi!). and there’s still a definite exchange of money and time involved with movies anymore (although i’m sure downloading is on the rise), whereas with albums what exactly we’re critiquing has become more vague– are we saying this is worth your download time? your hard drive space? or (sad chuckle) your money? when i rave about an album, i try to make it an album that you can lose yourself in for a while, make a part of your life. but that’s pretty nebulous, too! etc. etc.

  2. Tom on 6 Mar 2009

    Yeah, I wanted to make a point like that but I honestly don’t KNOW if people still pay to see films (I don’t!) (though that’s cos of toddlers not taste).

    Another possibility of course is that it’s a lot easier to make a quite enjoyable 6-out-of-10 record than a quite enjoyable 6-out-of-10 film. Maybe music is just better! ;)

  3. Matthew Perpetua on 6 Mar 2009

    You know, I’m inclined to say that there are just more great albums and songs than good movies. These days, a good film seems like a minor miracle.

  4. Stevie T on 6 Mar 2009

    According to this: http://moviecitynews.com/voices/2009/090302_critics.html
    there are only 126 full-time movie critics in the USA… and I bet most of them have to review most of the movies opening in their city that week – of which they probably won’t like very many (though I imagine there is advertising pressure from the local cinemas on the surviving local alt mags and critics to give more good reviews). Pitching reviews to a monthly mag, you generally pitch the albums you like or that are in your area of specialism. So… what Tom said.

  5. Tom on 6 Mar 2009

    http://idolator.com/5097276/the-rainbow-connection-are-music-critics-too-tolerant

    Good discussion covering this same ground via Mike Barthel (I THOUGHT I’d remembered this conversation coming round before!)

  6. Dave on 6 Mar 2009

    On the Pfork scale, always look for the 7.3-7.9’s if you want an interesting and relatively hypeless listen, or a good chance to be a contrarian and proclaim it to be a masterpiece (an 8.3 over a 7.9 is a world of difference!).

    I seriously wouldn’t underestimate how sophisticated – if makeshift – the audience’s understanding of a given publication’s music reviewing system is, though. This is different from film criticism, where, when there is a score, it’s almost always somewhere on the “thumbs up”/“thumbs down” spectrum: e.g. in a list of capsule reviews for films in a given major paper, a few of them might have a special star next to them to note “go see this,” or the difference between a 2 (bad), 3 (good) and 4 (great) star review.

  7. a tanned rested and unlogged lørd sükråt wötsît on 6 Mar 2009

    my boss at s&s when i was a sub there used to distinguish very sharply between a reviewer and a critic — my gloss on this was that when i wrote a review where you got to think about lots of stuff but did NOT learn my actual opinion of the film’s goodness or otherwise, then i was functioning as a critic (cz who gives a fuck ab my opinion, i’m just some guy who knows stuff abt stuff, you might like it even if i hate it)

  8. lonepilgrim on 6 Mar 2009

    It’s interesting to read the Mike Barthel post you linked to, Tom. It made more sense to me than talk of percentages, etc.
    Whereas there seems to be a shared language around the mainstream films that are likely to be reviewed (despite genre and, to a lesser extent, aesthetic), music tends to be more tribal and reviews tend to reflect this.
    The music press in either format largely focuses on music (duh) and so a) relies on creating a buzz of excitement amongst their readers and b) relies on advertising and access to interviews, etc. from the record companies. They have a vested interest in not upsetting them too much.
    Wasn’t there an incident with Rattle & Hum back in the day? ;-)
    Movie reviews on the other hand tend to form a sub-section of the mainstream press – whether newspaper or magazine (both print and online versions). Reviewers can afford to be more critical if they feel like it because the paper/mag they are writing for is less dependent on the movie companies’ largesse.

    I seem to recall a recent letter in one of the music mags pointing out how Oasis had got a 5 star review for their album early in the year and then failed to make the top 50 albums at year’s end.

    …any road, more of the pop soon please

  9. a tanned rested and unlogged lørd sükråt wötsît on 6 Mar 2009

    i think it’s a big (and rather strange) mistake to assume that the MAN remotely wants consumers to be indiscriminate — discrimination is exactly how a lot of (non-essential) purchase is driven; the issue is predictable and exploitable discrimination

    a genuine popism — a critical movement that argued that everything was good and you were culturally deprived if you hadn’t heard EVERYTHING — would be a weird and a radicalising movement i think: not least because it would generate an anger that this deprivation was enforced (by pricing; by the fact that we have to work some hours of the week; or sleep; or etc)

  10. Tom on 6 Mar 2009

    Just to backlink – Hitsville’s replied to this post (and I’ve replied in the comments)

    http://www.hitsville.org/2009/03/06/what-hath-popism-wrought-ii/

  11. Pete on 7 Mar 2009

    The elephant in this room seems to be videogame criticism, where frankly the range seems to go from 65% (bad) to 100% (pretty good). Individual outlets have their own rules of thumb and, as you know from Pfork, Tom, will jiggery-poke a score due to the history and received wisdom of existing scores.

    The other thing to consider with scoring systems is the indices (which you have talked about elsewhere). Marking out of five stars gives a midground of three stars, which as a percentage score is 60%. Out of ten, 5 will be the midground (though of course 5 seems like an insult, and is too low for an average movie, say, because an average movie should be professional and good enough to be enjoyed).

    But the sheer weight of reviewing takes its toll. Along with the experience differential – you cannot write a review of a narrative movie in a screening theatre, whereas it is quite easy to write a review of the new U2 album while you are listening to it (and indeed do enough prep that you have your thesaurus open on the “rubbish” page).

  12. Tom on 7 Mar 2009

    Metacritic do actually acknowledge how completely mental the scale on videogame reviews has got by putting their traffic light boundaries at different scores.

  13. Mark M on 9 Mar 2009

    In a sense, the work done by the film critics at the national newspapers is more like how singles reviews functioned on the inkies back in the day than album reviews: whether composed as a single column or supposedly individual reviews, you’re looking at the week’s releases across the board and across genres and weighing up the big release of the week against the more obscure but possibly more exciting offerings.

    When I was writing features about (not reviewing) films for the papers, people would always be genuinely shocked if I hadn’t seen that week’s big release, something that happened much less with music, where people accept the idea that you might be a specialist (not that I really was, but…)

    Metacritic’s film list is dominated by big US papers and general magazines like the New Yorker, while its music list is much more skewed to music magazines/websites.

    Of course, on the monthly movie mags things work in a fairly similar sort of way to the music monthlies: there’ll be movies at Empire that will always be “better get Kim Newman to do that” or at S&S, “do you think Tony Rayns should do that?”, in the same way as assorted writers colonise genres at Q or Uncut.

    Re: 3 – I think that’s a matter of personal feeling. I’d be disappointed if there weren’t twenty films I really wanted to see a year, and shocked if there were more than five albums I actually wanted to buy.

  14. Pete Baran on 9 Mar 2009

    Yr cultural knowledge point is well made. When I was doing my film masters, it seemed like a badge of honour for most of the class that they DIDN’T go to the cinema at all! They had their specialisms and cast-iron certainties and seemed a wee bit scared that these might be damaged by liking something new.

    As mentioned in the conversation on this subject on The Lollards Radio Show, the word Average has a different meaning as applied in criticism than as applied in mathematics. So we should not be surprised when odd results come out of applying mathematics to criticism.

  15. Nitsuh on 9 Mar 2009

    I’ve always interpreted the numbers issue this way:

    GENERAL-AUDIENCE MOVIE REVIEW SECTIONS: Many of these assume there’s only a handful of films opening this weekend, most of which you’ve seen ads for, all of which are already clearly segmented for different audiences/genres, and therefore they will (a) deem a couple worth watching and shrug or grumble about the rest, plus (b) maybe allow one critic to rave about a foreign or independent film now and then.

    GENERAL-AUDIENCE MUSIC REVIEW SECTIONS: Many of these assume there are a billion albums coming out this week, most by artists the reader has never heard of, and that it’s the publication’s job to assemble them into little taste-packages for the imagined reader — therefore they will (a) pick a handful of releases to praise, creating like a 10-album “listening station” designed for whatever tastes/lifestyle is the publication’s imagined brand, plus (b) pan a few things, either to round out the other end of that “brand” (what we DON’T like), or because it’s from an act that’s well-known to the audience.

    I.e., movie sections discern between options, music sections package and offer a little nugget of recommendations. This actually seems really natural to me, and part of it can be expressed in a much simpler way: people go out to a movie and have 4-5 convenient choices and have an actual need for being warned off bad ones via critical pans — people looking for something to listen to, though, are in no danger of buying stuff they’ve never even heard of; they’re facing endless options in a store or on a computer and are better served by hearing what’s good.
