Last week a question occurred to me: what interesting things can you find out by playing around with Last.FM listening data? Last.FM themselves offer a fair bit of extra analysis to users in their “Playground” section, but it’s all to do with individual listeners or their networks (or “neighbourhoods”). I wanted to see how much LFM data could tell us about specific artists, and how people listen to them.

So using the most topline, publicly available data possible – the artist pages and charts of most-played tracks – what can we find out? I created a few metrics which I could generate (by hand! no programmer I!) in 20 seconds or so for each artist and set to work populating a mini database out of the artists on the overall LFM charts, then the ones on my personal charts, then anyone I thought might be interesting. The results are this series of three – somewhat wonkish – posts: the conclusions will be in Part III so if you don’t fancy seeing me crunch numbers (albeit very EASY numbers) wait around for that.

Here’s what I came up with!

Obviously the sample I pulled from the LFM database is non-representative – these are, for the most part, famous bands with upwards of a million plays each. And as you’ll realise, LFM is itself probably highly unrepresentative of music listeners in general, though probably a better reflection of “committed listeners to music via computer”. Even so, some interesting patterns emerged.

The first metric I created was the incredibly obvious plays per listener (PPL) – a division of the number of plays by the number of listeners. The figures LFM gives for both of these are, as best I can tell, total: i.e. in the lifetime of Last, just north of 3 million users have listened to Coldplay, and they’ve racked up about 160 million plays, so their PPL is 53 (which is very high, as it happens).
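For anyone who would rather script this than do it by hand, here is a minimal sketch of the PPL sum. The function name is made up for illustration, and the Coldplay figures are the rounded ones quoted above (about 160 million plays, roughly 3 million listeners), not exact values from the site.

```python
def plays_per_listener(total_plays: int, total_listeners: int) -> float:
    """Plays per listener (PPL): lifetime plays divided by lifetime listeners."""
    if total_listeners <= 0:
        raise ValueError("total_listeners must be positive")
    return total_plays / total_listeners


# Rounded Coldplay figures from the post: ~160 million plays, ~3 million listeners.
coldplay_ppl = plays_per_listener(160_000_000, 3_000_000)
print(round(coldplay_ppl))  # 53
```

Since the listener count is "just north of" 3 million, the true figure would sit a shade under what this prints.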

The highest PPL I found was 116, for The Beatles (see the conclusions post for my thoughts on that). The lowest was 4, for Gloria Gaynor. Obviously the PPL has something to do with how deep into an artist’s catalogue fans will go, but there’s an element of loyalty and repeat plays in there too.

A high PPL seems to be 45+ – bands with that kind of score are generally long-serving, serious groups with global appeal: Radiohead, Metallica, NIN, Led Zep, The Smiths. But there are a bunch of much newer bands in there too who gain their high PPL simply by having a smaller, but obsessive fanbase: Paramore and the XX both have PPLs over 60, implying very heavy repeat plays of their smaller number of tracks.

This certainly doesn’t apply to all new bands: Wavves has a PPL of 27, Sleigh Bells have 24. Maybe people are trying those groups and not returning? And not all of the ‘rock canon’ do so well either – Springsteen has 28, Kate Bush 22, critical touchstones New Order and the Beach Boys have PPLs of 21 each. For soul legends the figures get worse still: Stevie Wonder on 15, Marvin Gaye on 13.

In the 15-or-below zone we have a bunch of older acts who are best known for one or two songs: Human League, Soft Cell, Glen Campbell, Donna Summer. We also have an awful lot of dancehall, R&B and hip-hop acts on under 15 PPL, which is partly a reflection of LFM’s audience demographics but partly because those musics are still hit-promoted as much as album-driven.

The next metric I put together I’m calling Top Track Incidence (TTI) – this is the number of listeners for an artist’s top track as a percentage of that artist’s total number of listeners. This one’s a lot less useful than PPL, because listeners per track are counted on a 6-month basis whereas overall listeners are counted across the site’s whole history. So it’s really just the percentage of an act’s all-time listeners who listened to its most popular track over the last 6 months. Probably everyone who has ever listened to Outkast has listened to “Hey Ya!” but only 6.4% of them did it within the last 6 months (which might be good news for Outkast: it suggests they’re not a one-hit wonder).
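The TTI sum is just as simple, and a rough sketch might look like this. The function name is invented for illustration, and the listener counts in the example are hypothetical round numbers chosen to land on a 6.4% result like Outkast's; they are not real Last.FM figures.

```python
def top_track_incidence(top_track_listeners: int, total_listeners: int) -> float:
    """Top Track Incidence (TTI), as a percentage.

    top_track_listeners: listeners of the artist's top track (6-month count).
    total_listeners: the artist's listeners across the site's whole history.
    """
    if total_listeners <= 0:
        raise ValueError("total_listeners must be positive")
    return 100 * top_track_listeners / total_listeners


# Hypothetical counts, NOT real data: 64,000 recent top-track listeners
# out of 1,000,000 all-time listeners.
example_tti = top_track_incidence(64_000, 1_000_000)
print(f"{example_tti:.1f}%")  # 6.4%
```

Because the two inputs cover different time windows, the result is better read as a rough "how much does the big hit still dominate" indicator than as a true share of listeners.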

The highest TTIs I found were for the XX (“Crystalised”), and Ke$ha (have a guess) – both with almost 52%. There must, you would think, be higher ones out there. These were flukes, though – on the whole a TTI of over 10% was pretty unusual.

Down at the lower end – under 5% – were acts with broad catalogues and no one single smash: yes, “My Way” is Frank Sinatra’s top track, but only 3% of his listeners played it in the last 6 months – with a catalogue as big as his, there’s plenty of other stuff to play. The lowest TTI I found – and I looked at him because I knew he’d be low – was for Muslimgauze, whose CDs are notoriously a) numerous and b) similar to one another. His top track – “Dharam Hinduja”, if you’re interested – was played by only 1.4% of his fans. Perhaps more surprisingly, “Imperial March (Darth Vader Theme)” got a TTI of only 1.9% for John Williams – he has done many other soundtracks and it’s unlikely listeners are buying them for HIM rather than for the films.

In part II I’ll look more deeply at metrics dealing with artists’ depth of catalogue: the ratios between levels of listeners to given tracks – and there will be a CHART. In part III I’ll link to my little hand-made dataset (though expect more examples in the comments before then!) and also draw a few conclusions.