So what was that poll all about then? (This poll, the one I linked on Twitter and Tumblr – a basic tick-the-box job on the best-selling music acts of last year)
Well, the truth is it was me trying something out for my day job. I wanted to try out DIY split-testing tool Optimizely and see how easy it was to run basic experiments.
In this case there were actually two different polls – the one you saw should have been random.* One of them asked the question as follows:
“This is a list of the best-selling acts of 2011. Please tick any acts that you enjoy.”
And the other asked it this way:
“This is a list of the best-selling musicians of 2011. Please tick any artists that you enjoy.”
So the idea** was to see if phrasing the question using more loaded words – “musicians” and “artists” – would have any aggregate impact on the results, perhaps making people more careful in their judgement*** or favouring particular acts.
And did it? Well, maybe. As a researcher, I’d only report a result as significant at the 95% confidence level, and none of the differences hit that.**** But two acts showed different voting patterns at the 90% confidence level (which roughly means that if the wording made no real difference, a gap this big would turn up by chance less than 1 time in 10; to be ‘significant’ it needs to be less than 1 time in 20).*****
These were Adele and Amy Winehouse. Amy beat Adele in both polls, but in the ‘acts’ poll Adele got 46% and Amy got 54%. Change the terms of reference to ‘artists’ and the gap between them widened enormously: Adele ended up with 35% (11 points lower) and Amy got 64% (10 points higher).
This is interesting enough that I wouldn’t mind running the test again on a wider population with a better designed survey.
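For anyone who wants to check a gap like Adele's for themselves, here is a rough sketch of the standard two-proportion z-test in Python. It isn't necessarily what Optimizely does under the hood, and the function name and the sample sizes are my own assumptions: the post doesn't give the number of respondents per poll, so the 140 is a loose guess from the Ed Sheeran footnote (10 votes for a 7% share) and the 100 is purely a placeholder.

```python
import math


def two_proportion_z_test(ticks_a, n_a, ticks_b, n_b):
    """Two-sided z-test for a difference between two tick rates.

    ticks_a / n_a: ticks and respondents in poll A (e.g. the "acts" wording)
    ticks_b / n_b: ticks and respondents in poll B (e.g. the "artists" wording)
    Returns (z, p_value).
    """
    p_a = ticks_a / n_a
    p_b = ticks_b / n_b
    # Pooled rate: the best estimate if the wording made no difference at all
    pooled = (ticks_a + ticks_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided p-value from the normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# HYPOTHETICAL sample sizes, not taken from the real poll data
n_acts, n_artists = 140, 100
adele_acts = round(0.46 * n_acts)        # 46% ticked Adele in the "acts" poll
adele_artists = round(0.35 * n_artists)  # 35% ticked Adele in the "artists" poll

z, p_value = two_proportion_z_test(adele_acts, n_acts, adele_artists, n_artists)
print(f"z = {z:.2f}, p = {p_value:.3f}")
print("clears the 90% level:", p_value < 0.10)
print("clears the 95% level:", p_value < 0.05)
```

A p-value under 0.10 corresponds to the 90% level and under 0.05 to the 95% level. With real respondent counts swapped in, the same function would work for any act on the list.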
Of course the trouble with A/B tests is that they can give you a result but they don’t tell you why. I was expecting Amy to do better once you started talking about ‘artists’ – she’s a recently canonised dead musician – but I’m really surprised Adele’s vote dropped (whereas Gaga and Rihanna held up fine – so it’s not a ‘triggering yr latent rockism’ thing). Maybe enjoying ‘acts’ implies less of a commitment, so Adele picked up more ‘she’s OK’ ticks on that poll. I don’t know!
Even differences at the 90% confidence level were stronger than I expected, though, and the “is this tool usable” element of the test worked fine, so if you voted, thank you very much for doing so!
The final combined rankings, incidentally:
Lady Gaga – 60%
Rihanna – 60%
Amy – 57%
Adele – 42%
Coldplay – 22%
Bruno Mars – 9%
Jessie J – 8%
Ed Sheeran – 4%
Michael Buble – 2%
Olly Murs – 2%
*It may be that there were browser issues in some cases, or that some of you have plug-ins which block JavaScript nonsense like the Optimizely code. For whatever reason, a lot more people ended up filling in the “acts” poll than the “artists” one, but I don’t think this was a ‘result’.
**Beyond just trying out Optimizely.
***This is why I REALLY should have put a “none of the above” in!
****Except poor old Ed Sheeran, who got 10 votes in the “acts” poll and NONE AT ALL in the “artists” one – apparently this is significant, but since even his 10 votes only got him a 7% share I don’t think it is really.
*****The whole poll ought to be very bad research anyway, since the sample is opt-in and very skewed (people who follow me on Twitter don’t have the same music tastes as the general population, it’s fair to say.) BUT the great thing about split tests is that this doesn’t matter in terms of examining the split, since the sample is identically dodgy on both tests!