How and why study big cultural data

Lev Manovich [email protected] softwarestudies.com


Lev Manovich. How and why study big cultural data. Presentation at Data Mining and Visualization for the Humanities symposium, NYU, March 19, 2012. softwarestudies.com

Transcript of How and why study big cultural data

Page 1: How and why study big cultural data

How and why study big cultural data

Lev Manovich [email protected]

Page 2: How and why study big cultural data

New York Times (November 16, 2010): “The next big idea in language, history and the arts? Data.”

NEH/NSF Digging into Data competition (2009): “How does the notion of scale affect humanities and social science research? Now that scholars have access to huge repositories of digitized data—far more than they could read in a lifetime—what does that mean for research?”

Page 3: How and why study big cultural data

Why study big cultural data?

Page 4: How and why study big cultural data

1 study societies through social media traces - social computing (but do we study society or only social media itself?)

2 more inclusive understanding of history and present (using much larger samples)

3 detect large-scale cultural patterns

4 the best way to follow global professionally produced digital culture; understand newly developed cultural fields (“X” design)

5 map cultural variability and diversity

Page 5: How and why study big cultural data

Data: 3,724 18th century volumes, using 10,000 most frequent words (excluding proper nouns). Ted Underwood. The Differentiation of Literary and nonliterary diction, 1700-1900.

Page 6: How and why study big cultural data

Growth of a global culture space after 1990: Cumulative number of new art biennales, 1895-2008.


Page 7: How and why study big cultural data

modern (19th-20th centuries) social and cultural theory: describe what is similar (classes, structures, types) / statistics (reduction)

computational humanities and social science should focus on describing what is different / variability / diversity

not “from data to knowledge” but from (incomplete) knowledge to actual cultural data

Page 8: How and why study big cultural data

We are no longer interested in the conformity of an individual to an ideal type; we are now interested in the relation of an individual to the other individuals with which it interacts... Relations will be more important than categories; functions, which are variable, will be more important than purposes; transitions will be more important than boundaries; sequences will be more important than hierarchies.

Louis Menand on Darwin, 2001.

Page 9: How and why study big cultural data

Visualization: Thinking without “large” categories

“The ontological status of assemblages, large and small, is always that of unique, singular individuals.” “Unlike taxonomic essentialism in which genus, species and individuals are separate ontological categories, the ontology of assemblages is flat since it contains nothing but differently scaled individual singularities.” Manuel DeLanda. A New Philosophy of Society.

Page 10: How and why study big cultural data

Bruno Latour:

The “whole” is now nothing more than a provisional visualization which can be modified and reversed at will, by moving back to the individual components, and then looking for yet other tools to regroup the same elements into alternative assemblages.

Page 11: How and why study big cultural data

How to study big cultural data?

how to explore massive visual collections (exploratory media analysis)?

which data analysis and visualization techniques are appropriate for non-technical users? How to democratize data analysis?

Page 12: How and why study big cultural data

Our approach:

media visualization (visualizing media directly rather than only using abstract infovis language)

Page 13: How and why study big cultural data

visualizing large non-visual data using abstraction

Page 14: How and why study big cultural data

media visualization: showing visual data directly

Every cover of Time magazine, 1923-2009 (4535 images). X-axis = publication date. Y-axis = saturation mean.
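The layout behind an image plot like this one can be sketched in a few lines. This is only an illustration, not the software shown in the talk: the function name `plot_positions` and the canvas size are assumptions, and each image is reduced to an (x value, y value) pair such as (publication year, saturation mean).

```python
def plot_positions(points, width, height):
    """Map (x_value, y_value) pairs to pixel positions on a canvas.

    Hypothetical helper: values are min-max scaled to the canvas, and the
    y-axis is flipped so that larger values appear higher up.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]

    def scale(value, low, high, size):
        if high == low:
            return size / 2  # all values equal: place in the middle
        return (value - low) / (high - low) * size

    return [
        (
            round(scale(x, min(xs), max(xs), width)),
            round(height - scale(y, min(ys), max(ys), height)),
        )
        for x, y in points
    ]


# Three made-up covers: (publication year, saturation mean in 0..1)
covers = [(1923, 0.25), (1966, 0.5), (2009, 0.75)]
positions = plot_positions(covers, width=1000, height=500)
```

Each image would then be drawn at its computed position, so the collection itself forms the plot.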

Page 15: How and why study big cultural data

our media visualization software on 287 megapixel display (image: 1 million manga pages)

Page 16: How and why study big cultural data

our software on new display wall with thin bezels (data: 4535 Time magazine covers)

Page 17: How and why study big cultural data

Our methods:

1. media visualization using existing metadata - show complete collection

2. media visualization using existing metadata - use samples to better reveal patterns

3. digital image processing + media visualization (use simple image features which have direct perceptual meaning - and gradually introduce humanists to image processing)
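As a rough sketch of what "simple image features with direct perceptual meaning" could look like in code, the example below computes median hue, mean saturation, and mean brightness from a list of RGB pixels using only the Python standard library. The function name `image_features` and the pixel-list input are assumptions made for illustration; a real pipeline would read pixels from image files with an imaging library.

```python
import colorsys
from statistics import mean, median


def image_features(pixels):
    """Compute simple color features from (r, g, b) pixels with 0-255 channels.

    Hypothetical helper. Note that hue is an angle, so a plain median
    ignores the wraparound between 360 and 0 degrees.
    """
    hsv = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for r, g, b in pixels]
    hues = [h for h, _, _ in hsv]
    saturations = [s for _, s, _ in hsv]
    values = [v for _, _, v in hsv]
    return {
        "median_hue": median(hues) * 360,      # degrees, 0..360
        "mean_saturation": mean(saturations),  # 0..1
        "mean_brightness": mean(values),       # 0..1
    }


# A tiny made-up "image": pure red, green, blue, and white pixels
features = image_features([(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 255)])
```

Features of this kind are easy to explain to non-technical users because each one corresponds to something visible in the image.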

Page 18: How and why study big cultural data

1. media visualization / existing metadata: montage
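A montage of this kind amounts to sorting the collection by a metadata field and placing the images on a grid. The sketch below shows one possible layout computation; `montage_layout`, the tile size, and the record fields are illustrative assumptions rather than the tool used in the talk.

```python
def montage_layout(items, key, columns, tile_width, tile_height):
    """Sort items by a metadata key and assign each a grid position.

    Hypothetical helper: returns (item, x, y) tuples, filling rows
    left to right and top to bottom.
    """
    ordered = sorted(items, key=key)
    return [
        (item, (index % columns) * tile_width, (index // columns) * tile_height)
        for index, item in enumerate(ordered)
    ]


# Made-up records: file name plus publication year as existing metadata
records = [
    {"file": "c.jpg", "year": 1950},
    {"file": "a.jpg", "year": 1923},
    {"file": "b.jpg", "year": 1931},
]
layout = montage_layout(records, key=lambda r: r["year"],
                        columns=2, tile_width=100, tile_height=140)
```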

Page 19: How and why study big cultural data

2. media visualization / existing metadata / sample

Page 20: How and why study big cultural data
Page 21: How and why study big cultural data

Image plots of selected paintings by six impressionist artists. X-axis = mean saturation. Y-axis = median hue. Megan O’Rourke, 2012.

3. digital image processing + media visualization

Page 22: How and why study big cultural data
Page 23: How and why study big cultural data

Advantages:

replacing discrete categories with continuous attributes

Page 24: How and why study big cultural data

1. from timelines to curves

2. better represent analog cultural attributes

3. understand cultural landscapes (fuzzy / overlapping / hard clusters?)

4. visualize cultural variability

5. discover new groupings

Page 25: How and why study big cultural data

1. from timelines to curves

Page 26: How and why study big cultural data

2. better represent analog attributes

Page 27: How and why study big cultural data

3. our maps of cultural landscapes reveal fuzzy/overlapping clusters - rather than discrete categories with hard boundaries

Page 28: How and why study big cultural data

4. visualize cultural variability

Page 29: How and why study big cultural data

5. discover new groupings

Page 30: How and why study big cultural data

Studying large cultural data challenges our existing theoretical concepts and assumptions

example: what is “style”?

Page 31: How and why study big cultural data

one million manga pages

Page 32: How and why study big cultural data
Page 33: How and why study big cultural data
Page 34: How and why study big cultural data

single short manga series (>1000 pages)

Page 35: How and why study big cultural data

776 Vincent van Gogh paintings

Page 36: How and why study big cultural data

Selected current projects:

7000 year old stone arrowheads (with UCSD anthropologist and CS postdoc at University of Washington)

comparing Art Now & Graphic design Flickr groups (340,000 images) (with CS collaborator from Lawrence Berkeley National Laboratory)

One million images (+ metadata) from deviantArt (with an art historian / DH collaborator from Netherlands Academy of Arts and Sciences)

Page 37: How and why study big cultural data

4.7 million newspaper pages from Library of Congress (UCSD undergraduate students)

virtual world / game analytics (NSF EAGER, with UCSD Experimental Games Lab)

SEASR tools and workflows for working with image and video data (with NCSA at University of Illinois, Urbana-Champaign)

Page 38: How and why study big cultural data

Conclusion: Computational humanities vs. digital humanities

Page 39: How and why study big cultural data

“The capacity to collect and analyze massive amounts of data has transformed such fields as biology and physics. But the emergence of a data-driven 'computational social science' has been much slower. Leading journals in economics, sociology, and political science show little evidence of this field. But computational social science is occurring in Internet companies such as Google and Yahoo, and in government agencies such as the U.S. National Security Agency.” “Computational Social Science.” Science, vol. 323, no. 6, February 2009.

Digital humanities: scholars are mostly working with digitized historical cultural archives which were created by libraries and universities with funding from NEH and other institutions.

Page 40: How and why study big cultural data

Computational humanities: Analyzing massive amounts of cultural content and people's conversations, opinions, and cultural activities online - personal and professional web sites, general and specialized social media networks and sites. This data offers us unprecedented opportunities to understand cultural processes and their dynamics and develop new concepts and models which can also be used to better understand the past.

Current players in computational humanities:

- Google, Facebook, YouTube, Bluefin Labs, Echo Nest, and many other companies which analyze social media signals (blogs, Twitter, etc.) and the content of media on social networks.

- Computer scientists who are working with this data.

Page 41: How and why study big cultural data

[email protected]

www.softwarestudies.com

Page 42: How and why study big cultural data

Appendix:

visualizing video collections

use media visualization with a set of keyframes

automatic selection of keyframes (for example, using free shot detection software)
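To make the idea of automatic keyframe selection concrete, here is a minimal sketch of one common approach: compare brightness histograms of consecutive frames and start a new shot when they differ strongly. It is a stand-in for illustration, not the shot detection software mentioned above; the function names, the threshold, and the representation of a frame as a non-empty list of grayscale values (0-255) are all assumptions.

```python
def histogram(frame, bins=8):
    """Normalized brightness histogram of a frame (list of 0-255 values)."""
    counts = [0] * bins
    for value in frame:
        counts[min(value * bins // 256, bins - 1)] += 1
    return [count / len(frame) for count in counts]


def select_keyframes(frames, threshold=0.5, bins=8):
    """Return indices of frames that start a new shot.

    Hypothetical helper: a shot boundary is declared when the summed
    absolute histogram difference to the previous frame exceeds the
    threshold. The first frame always counts as a keyframe.
    """
    keyframes = []
    previous = None
    for index, frame in enumerate(frames):
        current = histogram(frame, bins)
        if previous is None or sum(
            abs(a - b) for a, b in zip(current, previous)
        ) > threshold:
            keyframes.append(index)
        previous = current
    return keyframes


# Made-up video: two dark frames, two bright frames, then a dark frame
dark, bright = [10] * 16, [240] * 16
keyframe_indices = select_keyframes([dark, dark, bright, bright, dark])
```

The selected keyframes can then be laid out with the same media visualization techniques used for still image collections.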

Page 43: How and why study big cultural data
Page 44: How and why study big cultural data