The Potential and Perils of Election Prediction Using Social Media Sources


Federico Nanni and Josh Cowls
University of Mannheim / Comparative Media Studies, MIT

Reasons to be cheerful
+ Social media data is (often) cheap
+ Phone response rates are in decline
+ More granularity available?

[Figure: cost and utility of the traditional inferential model versus the social media model]

Reasons to be doubtful
- Myriad reliability issues...
  – Difficult to establish the meaning of latent messages
  – Platform-specific behaviours (e.g. hashtags, likes) are not always understood
  – Political discourse is often laced with e.g. sarcasm
- The ethics of collecting and using social media data

Results to date have been mixed...
• A meta-analysis found little evidence that using Twitter to predict elections is better than chance in the aggregate (Gayo-Avello, 2013)

• Nonetheless, social media can provide an ‘early warning system’ for a candidate’s momentum (Jensen and Anstead, 2013)

• Big problem: what’s in a name?

Our approach: intention over attention

• Most models count references to candidates’ or parties’ names – measuring attention

• Other models use sentiment analysis, seeking to ascertain emotional responses to candidates

• We built an intention model, collecting instances of vote declarations for specific candidates
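The difference between the two kinds of model can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' code: the `shares` helper, the candidate labels and all counts below are made up for the example. Both models turn per-candidate counts into forecast shares; what differs is what gets counted (every mention for attention, only vote declarations for intention).

```python
from collections import Counter


def shares(counts):
    """Convert raw per-candidate counts into percentage shares."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: 100 * n / total for name, n in counts.items()}


# Toy, made-up counts for two hypothetical candidates (NOT the study's data).
mentions = Counter({"Candidate A": 600, "Candidate B": 400})    # attention: any mention
declarations = Counter({"Candidate A": 30, "Candidate B": 70})  # intention: vote declarations

attention_forecast = shares(mentions)
intention_forecast = shares(declarations)
```

In this toy case the much-discussed candidate leads on attention but trails on intention, which is the kind of divergence an intention model is meant to pick up.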

Case study
• Context: Labour and the Lib Dems required new leaders in 2015 (after a polling fail!)
• Leadership elections conducted in summer 2015
  – Lib Dems: two candidates (Tim Farron, Norman Lamb)
  – Labour: four candidates (Jeremy Corbyn, Andy Burnham, Yvette Cooper, Liz Kendall)

Advantages of our case
• Primary candidates’ names easier to isolate than ambiguous party names (“Labour”, “Liberal”)

• Party elections are a minority sport – better signal to noise ratio?

• Start and end dates clear; postal vote system ensured greater period of decision-making

Method
• Wrote Python scripts to collect tweets which:
  – Mentioned the name of a candidate
  – Included a specific declaration to vote (“I’ll vote for...”, “I’m voting for” etc.)
• Cleaned data
  – Removed non-declarations (“I’m not voting for...”)
  – Ascertained preferred candidate in ambiguous cases
• Final dataset: 1361 valid declarations for Lib Dem race and 17617 for Labour
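The filtering steps above could look roughly like the following. This is a hedged sketch, not the scripts used in the study: the regular expressions, the surname-only matching and the helper names (`normalise`, `declared_candidate`, `count_declarations`) are all assumptions made for illustration. Tweets naming more than one candidate are simply skipped here, whereas the study resolved such ambiguous cases.

```python
import re
from collections import Counter

# Surnames from the Labour race; matching on surname alone is an assumption
# and would also catch unrelated people who share a surname.
CANDIDATES = ["Corbyn", "Burnham", "Cooper", "Kendall"]

# Assumed declaration and negation phrasings; the original scripts may differ.
DECLARATION = re.compile(
    r"\bI(?:'ll| will) vote for\b|\bI(?:'m| am) voting for\b", re.IGNORECASE
)
NEGATION = re.compile(
    r"\bI(?:'m| am) not voting for\b|\bI (?:won't|will not) vote for\b",
    re.IGNORECASE,
)


def normalise(text):
    """Replace curly apostrophes so the patterns above match."""
    return text.replace("\u2019", "'")


def declared_candidate(tweet):
    """Return the single candidate a tweet declares a vote for, or None."""
    text = normalise(tweet)
    if NEGATION.search(text) or not DECLARATION.search(text):
        return None
    mentioned = [
        c for c in CANDIDATES if re.search(rf"\b{c}\b", text, re.IGNORECASE)
    ]
    # Several names in one tweet is ambiguous; this sketch skips such tweets.
    return mentioned[0] if len(mentioned) == 1 else None


def count_declarations(tweets):
    """Tally valid declarations per candidate."""
    return Counter(c for c in map(declared_candidate, tweets) if c is not None)
```

Calling `count_declarations` on a list of tweet texts returns a `Counter` of declarations per candidate, which can then be turned into forecast shares.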

Analysis (1)

Analysis (2)

Key successes
• ‘Intention’ model beat out ‘Attention’ model in 5 out of 6 races, and in both races overall

• Lib Dem prediction accuracy close to traditional margin of error (MOE = 3.5)

• Caught Corbyn’s success to a high degree of accuracy (MOE = 2)

Reflections and future work
• Tough to generalise successes – specific cases, particular platform. (How) would this work for:
  – Multi-state process (e.g. US primaries)?
  – General elections?

• Despite ongoing challenges, social media will surely play a key role in the future of accurate election prediction
