Richard Deswarte: Interrogating the Archived UK Web



Digital History seminar 4 November 2014 Live Stream: http://ihrdighist.blogs.sas.ac.uk/2014/10/28/tuesday-4-november-interrogating-the-archived-uk-web-historians-and-social-scientists-research-experiences/



The Search for Meaning: Exploring Euroscepticism in the UK Web Archive

IHR Digital History Seminar

IHR, 23 April 2014

Richard Deswarte

School of History, UEA


• Intro to ‘Revealing British Euroscepticism in the UK Web Domain and Archive’

• Searching

• Meaning

• Provisional thoughts so far


Searching

• Eurosceptic, Euroscepticism, UKIP, EU, Referendum Party

• Searched 0.5% of the domain; then 12%; then a near-complete (‘fullish’) dataset

• Limited but numerous results

• UK Web Archive – Eurosceptic 312 returns; 5,604 returns; approx. 14,000 returns

• UK Government Web Archive – Eurosceptic 3420

• Ordering of results – currently crawl date

• Strange ‘false’ returns – Yorkshire Post sports pages, Morning Advertiser

• Results/Filters – crawl year, hosts, suffixes, postcode, sentiment, content type, language


Meaning

• Making sense of and analysing results – research validity

• Dirty data – Yorkshire Post

• Misleading data – UKIP

• Qualitative – needle in a haystack

• Added value tools – sentiment analysis

• Quantitative – completeness of data and crawls

• Tools

• Downloading/exporting


Sentiment Analysis

Morning Advertiser forum postings, 14 Feb 2007 (IA Wayback Machine)

Neutral (14)

Very Positive (10)

Very Negative (5)

Mildly positive (3)

Positive (1)
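
The breakdown above is easier to compare as proportions. The following is a minimal sketch, assuming the bracketed figures are counts of postings per sentiment label; the names `sentiment_counts` and `sentiment_shares` are illustrative and not part of any UK Web Archive tool.

```python
# Counts copied from the slide (assumed to be postings per sentiment label).
sentiment_counts = {
    "Neutral": 14,
    "Very Positive": 10,
    "Very Negative": 5,
    "Mildly positive": 3,
    "Positive": 1,
}


def sentiment_shares(counts):
    """Return each label's share of all classified postings, as a percentage."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


total_postings = sum(sentiment_counts.values())
shares = sentiment_shares(sentiment_counts)

for label, pct in shares.items():
    print(f"{label}: {sentiment_counts[label]} of {total_postings} ({pct}%)")
```

On these figures, the largest single category is Neutral, which is one reason to sample and read the underlying postings before drawing conclusions from the labels.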


[Chart: Keyword search returns 1996-2010. X-axis: year (1996 to 2009); y-axis: number of returns (0 to 1400). Series: Euroscepticism, Eurosceptic, UKIP, EU (x100), Referendum Party.]


Preliminary Thoughts

• Search focus

• Unstructured big data (uncatalogued)

• Access to & understanding ‘full’ data

• Understanding meaning – sampling

• ‘False returns’ & ‘clean data’

• Tools

• Exporting results

• Interpreting results - sampling

• A unique but problematic primary source (anything & everything almost)


Thank you. Comments and questions welcome.

Richard Deswarte

School of History

UEA
