2014 02-21 media-open_day_talk_slides

Newspapers as data Dr James Baker Curator, Digital Research @j_w_baker

Talk entitled 'Newspapers as Data' delivered at the Media, Cultural Studies and Journalism Doctoral Open Day, British Library, 24 February 2014. Notes supporting these slides can be found on GitHub Gist https://gist.github.com/drjwbaker/9184318

Transcript of 2014 02-21 media-open_day_talk_slides

Page 1: 2014 02-21 media-open_day_talk_slides

Newspapers as data

Dr James Baker

Curator, Digital Research

@j_w_baker

Page 2: 2014 02-21 media-open_day_talk_slides

www.bl.uk 2

More than resource discovery…

“The emergence of the new digital humanities [and social sciences] isn’t an isolated academic phenomenon. The institutional and disciplinary changes are part of a larger cultural shift, inside and outside the academy, a rapid cycle of emergence and convergence in technology and culture”

Steven E Jones, Emergence of the Digital Humanities (2013)

Page 3: 2014 02-21 media-open_day_talk_slides

Raging torrent of data

Page 5: 2014 02-21 media-open_day_talk_slides

New Discoveries

disciplinecamp and camps sentence

Page 6: 2014 02-21 media-open_day_talk_slides

New Understanding

• Study of 11M social media posts from China
– King, Pan, Roberts (2013)
– Chinese government is not censoring speech but is censoring “attempts at collective action, whether for or against the government”
– Automated text analysis

• Quantitative Analysis of Culture Using Millions of Digitized Books
– New competition for telling stories about change over time.
– Michel, Aiden et al (2010)

• NSA, GCHQ, Big Data…
– Just because they use big data, should we?
– What does/doesn’t it represent?
– Ethics, use of technology

Page 7: 2014 02-21 media-open_day_talk_slides

“Reading individual works is as irrelevant as describing the architecture of a building from a single brick, or the layout of a city from a single church”

Franco Moretti, Stanford

Page 8: 2014 02-21 media-open_day_talk_slides

Adam Crymble

Page 9: 2014 02-21 media-open_day_talk_slides

Bob Nicholson, ‘Counting Culture; or, How to Read Victorian Newspapers from a Distance’, Journal of Victorian Culture 17:2 (2012)

“Faced with this mountain of print, we have two choices: to continue subjecting tiny fragments of Victorian culture to close reading, or to supplement this approach by exploring a much larger proportion of the archive through ‘distant reading’.”

Page 10: 2014 02-21 media-open_day_talk_slides

Newspaper Man photograph courtesy of Flickr user Ed Stevenson / Creative Commons Licensed

Page 11: 2014 02-21 media-open_day_talk_slides

Page 12: 2014 02-21 media-open_day_talk_slides

“If each paragraph in the infinite archive, all the trillions of words, is simply a collection of data, it immediately becomes something that can be tied to a series of other things – to any other bit of data. A name, a date, a selection of words, or a phrase […] defined as a polygon on the surface of the earth. In other words, the texts that form the basis for western history can now be geo-referenced and tied directly to a historical / geographical understanding of spatial distribution, which can in turn be cross analysed with any other series of measures of text – textmining makes text available for embedding within a geographical frame.”

Tim Hitchcock, ‘Place and the Politics of the Past’ (2012)

Page 13: 2014 02-21 media-open_day_talk_slides

“Literary scholars and historians have in the past been limited in their analyses of print culture by the constraints of physical archives and human capacity. A lone scholar cannot read, much less make sense of, millions of newspaper pages. With the aid of computational linguistics tools and digitized corpora, however, we are working toward a large-scale, systemic understanding of how texts were valued and transmitted during this period”

David A. Smith, Ryan Cordell, and Elizabeth Maddock Dillon, ‘Infectious Texts: Modeling Text Reuse in Nineteenth-Century Newspapers’ (2013) http://www.ccs.neu.edu/home/dasmith/infect-bighum-2013.pdf

Page 15: 2014 02-21 media-open_day_talk_slides

Andrew Hobbs, ‘The Deleterious Dominance of The Times in Nineteenth-Century Scholarship’, Journal of Victorian Culture 18:4 (2013)

“Newspaper digitization has made good practice easier and ‘the availability of a swathe of nineteenth-century newspaper titles means that The Times should never again appear as an isolated authority for historical events or trends’ [… But] Now that a small but representative sample of provincial newspapers has been digitized, it is puzzling that so little scholarship based on them has been published. This sector of the press has many unique features that can enrich our research.”

Page 16: 2014 02-21 media-open_day_talk_slides

• Spatial Humanities (Lancaster)

• Asymmetrical Encounters: E-Humanity Approaches to Reference Cultures in Europe, 1815–1992 (Utrecht, Trier, UCL)

• Europeana Newspapers

• Welsh Newspapers Online - National Library of Wales

• …and you?

Page 17: 2014 02-21 media-open_day_talk_slides

Task time

For the next few minutes, break into pairs or groups and consider one or all of the following questions:

– What changes when you turn news media into data?
• What could you do with digital sources that you couldn’t do with physical sources (and vice-versa)?

– What hypothetical analytical tool(s) would improve your research?
• And what are the barriers to you using them?

– What are the ethical considerations when using digital data?
• Can data offend?

Be prepared to offer a short response!

Page 18: 2014 02-21 media-open_day_talk_slides

Thank you!

@j_w_baker

Follow the Digital Scholarship Blog: http://britishlibrary.typepad.co.uk/digital-scholarship/

Contact us at: [email protected]