Beyond Preservation: Situating Archaeological Data in Professional Practice

Eric C. Kansa (@ekansa), UC Berkeley D-Lab & Open Context; 2014-2015 Harvard Center for Hellenic Studies & German Archaeological Institute Research Fellow. Unless otherwise indicated, this work is licensed under a Creative Commons Attribution 3.0 License <http://creativecommons.org/licenses/by/3.0/>

description

I presented this lecture at the German Archaeological Institute (DAI) in Berlin on Nov. 6, 2014 (see: http://www.dainst.org/termin/-/event-display/ogNX4Gtxkd87/342513). The lecture focuses on how archaeological data fits into professional practice. It looks at scholarly communications, government policies toward the sciences and humanities, and professional reward structures. The lecture then shows examples of how Open Context publishes archaeological data, including editorial processes that promote data quality and relate contributed data to the 'Web of Data' using Linked Open Data methods. Research applications of Open Context and linked archaeological data include the Digital Index of North American Archaeology (DINAA) project (see: http://ux.opencontext.org/blog/archaeology-site-data/) and a data integration study exploring the development and dispersal of animal husbandry economies in Epipaleolithic - Chalcolithic Anatolia (see: http://dx.doi.org/10.1371/journal.pone.0099845). The lecture concludes by arguing that archaeologists need to invest more intellectually in the method and theory of modeling and creating data. It also looks at how concepts and expectations of publishing static artifacts need to be revised (using techniques like version control) to enable continued and more transparent revision of data to fix problems, implement new standards, and meet new research goals.

Transcript of Beyond Preservation: Situating Archaeological Data in Professional Practice

Page 1: Beyond Preservation: Situating Archaeological Data in Professional Practice

Eric C. Kansa (@ekansa)

UC Berkeley D-Lab & Open Context

2014-2015 Harvard Center for Hellenic Studies & German Archaeological Institute Research Fellow

Beyond Preservation: Situating Archaeological Data in Professional Practice

Unless otherwise indicated, this work is licensed under a Creative Commons Attribution 3.0 License <http://creativecommons.org/licenses/by/3.0/>

Page 2: Beyond Preservation: Situating Archaeological Data in Professional Practice

Data Sharing as Publication
• Started in 2007
• Open data (mainly CC-By)
• Archiving by California Digital Library
• Part of a broader reform movement in scholarly communications

Page 3: Beyond Preservation: Situating Archaeological Data in Professional Practice

Introduction

Visions for Digital Data in Archaeology
1. “Optimizing the status quo”
2. Opportunity for fundamentally better ways to conduct and communicate research

Page 4: Beyond Preservation: Situating Archaeological Data in Professional Practice

Introduction

Digital Data in Archaeology
1. Why discuss data?
2. Data in (bad) institutional contexts
3. Open Context's approach
4. Need for more & wider intellectual investment

Page 5: Beyond Preservation: Situating Archaeological Data in Professional Practice

Introduction

Digital Data in Archaeology
1. Why discuss data?
2. Data in (bad) institutional contexts
3. Open Context's approach
4. Need for more & wider intellectual investment

Page 6: Beyond Preservation: Situating Archaeological Data in Professional Practice

Data source: Arif Jinha (2010). Article 50 million: an estimate of the number of scholarly articles in existence. Learned Publishing, 23 (3), 258-263. DOI: 10.1087/20100308.

Image Source: http://www.cs.cmu.edu/~comar/open-science/

Page 7: Beyond Preservation: Situating Archaeological Data in Professional Practice

Paper and paper-like digital files (PDFs) do not scale well:
● Discovery
● Reuse

Image Credit: Wikimedia Commons (Public Domain) http://commons.wikimedia.org/wiki/File:Archives_entreprises.jpg

Page 8: Beyond Preservation: Situating Archaeological Data in Professional Practice

Image Credit: Wikimedia Commons (CC-BY-SA) http://commons.wikimedia.org/wiki/File:BigData_2267x1146_white.png

Page 9: Beyond Preservation: Situating Archaeological Data in Professional Practice

Lots of investment in “Big Data”
● Corporate
● Government
● 'STEM' academia

Page 10: Beyond Preservation: Situating Archaeological Data in Professional Practice

Lots of investment in “Big Data”
● Corporate
● Government
● 'STEM' academia

Page 11: Beyond Preservation: Situating Archaeological Data in Professional Practice
Page 12: Beyond Preservation: Situating Archaeological Data in Professional Practice

Image Credit: 'gin soak' (CC-BY-NC-ND) https://www.flickr.com/photos/gin_soak/2215398726

Structured Data – Creativity
1. New forms of communication
2. New forms of collaboration
3. New research opportunities

Page 13: Beyond Preservation: Situating Archaeological Data in Professional Practice

'Mash-ups' (informal integrations)

Open Context & Arachne

Page 14: Beyond Preservation: Situating Archaeological Data in Professional Practice

Experiment in open, distributed post-publication peer-review

Page 15: Beyond Preservation: Situating Archaeological Data in Professional Practice

Text-mining literature to identify references to ancient places

2010 (renewed 2012) Google Digital Humanities Awards: with Elton Barker, Leif Isaksen, Kate Byrne, Nick Rabinowitz

Page 16: Beyond Preservation: Situating Archaeological Data in Professional Practice

Project limited to public domain (pre-1920) resources

Page 17: Beyond Preservation: Situating Archaeological Data in Professional Practice

Introduction

Digital Data in Archaeology
1. Why discuss data?
2. Data in (bad) institutional contexts
3. Open Context's approach
4. Need for more & wider intellectual investment

Page 18: Beyond Preservation: Situating Archaeological Data in Professional Practice

Commercial interests and public policy

Conditions of academic labor

Neoliberalism: (Loosely associated ideologies / assumptions / interests)

Page 19: Beyond Preservation: Situating Archaeological Data in Professional Practice

Source: The Occasional Pamphlet - Harvard University (http://blogs.law.harvard.edu/pamphlet/2013/01/29/why-open-access-is-better-for-scholarly-societies/)

Page 20: Beyond Preservation: Situating Archaeological Data in Professional Practice
Page 21: Beyond Preservation: Situating Archaeological Data in Professional Practice
Page 22: Beyond Preservation: Situating Archaeological Data in Professional Practice

Conditions of academic labor

Neoliberalism: (Loosely associated ideologies / assumptions / interests)

Page 23: Beyond Preservation: Situating Archaeological Data in Professional Practice

Neoliberalism: Taylorism, “Audit Culture” and fierce job/grant competition

Data contributions don’t count!

Image Credit: Wikimedia Commons (Public Domain) http://en.wikipedia.org/wiki/Frederick_Winslow_Taylor#mediaviewer/File:Frederick_Winslow_Taylor_crop.jpg

Page 24: Beyond Preservation: Situating Archaeological Data in Professional Practice

Ironies of data: Publications counted as data, but data don’t count!

Page 25: Beyond Preservation: Situating Archaeological Data in Professional Practice

☹Frowns at

Many researchers (esp. junior scholars) lack academic freedom

Page 26: Beyond Preservation: Situating Archaeological Data in Professional Practice

My Precious Data

Image Credit: “Lord of the Rings” (2003, New Line), All Rights Reserved Copyright

Page 27: Beyond Preservation: Situating Archaeological Data in Professional Practice
Page 28: Beyond Preservation: Situating Archaeological Data in Professional Practice

Data sharing as compliance

Page 29: Beyond Preservation: Situating Archaeological Data in Professional Practice
Page 30: Beyond Preservation: Situating Archaeological Data in Professional Practice

Need more carrots!
1. Citation, credit, intellectually valued
2. Research outcomes (new insights from data reuse!)

Page 31: Beyond Preservation: Situating Archaeological Data in Professional Practice

Need more carrots!
1. Citation, credit, intellectually valued
2. Research outcomes (new insights from data reuse!)

Page 32: Beyond Preservation: Situating Archaeological Data in Professional Practice

Adapt Academic Taylorism:
● DataCite (metadata, citation for datasets)
● Alt-metrics (social media, view counts, download counts, etc.)

Make data count!

Page 33: Beyond Preservation: Situating Archaeological Data in Professional Practice

Need more carrots!
1. Citation, credit, intellectually valued
2. Research outcomes (new insights from data reuse!)

Page 34: Beyond Preservation: Situating Archaeological Data in Professional Practice

Introduction

Digital Data in Archaeology
1. Why discuss data?
2. Data in (bad) institutional contexts
3. Open Context's approach
4. Need for more & wider intellectual investment

Page 35: Beyond Preservation: Situating Archaeological Data in Professional Practice

Data Sharing as Publication
• Started in 2007
• Open data (mainly CC-By)
• Archiving by California Digital Library
• Part of a broader reform movement in scholarly communications

Page 36: Beyond Preservation: Situating Archaeological Data in Professional Practice

Publishing Workflow

Improve / Enhance
1. Consistency
2. Context (intelligibility, interoperability)

Page 37: Beyond Preservation: Situating Archaeological Data in Professional Practice
Page 38: Beyond Preservation: Situating Archaeological Data in Professional Practice
Page 39: Beyond Preservation: Situating Archaeological Data in Professional Practice
Page 40: Beyond Preservation: Situating Archaeological Data in Professional Practice
Page 41: Beyond Preservation: Situating Archaeological Data in Professional Practice
Page 42: Beyond Preservation: Situating Archaeological Data in Professional Practice
Page 43: Beyond Preservation: Situating Archaeological Data in Professional Practice
Page 44: Beyond Preservation: Situating Archaeological Data in Professional Practice
Page 45: Beyond Preservation: Situating Archaeological Data in Professional Practice
Page 46: Beyond Preservation: Situating Archaeological Data in Professional Practice

Digital Index of North American Archaeology (DINAA)
1. Rich metadata (cultures, chronology, site-types)
2. Reduced precision location data (site security, legal)
3. Data modeling challenges (using GeoJSON-LD, CIDOC-CRM, event models)
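
One way to picture the reduced-precision locations in point 2 is to snap each site to the center of a coarse grid cell before publication, so that many nearby sites share one published coordinate. The sketch below is illustrative only: the function names, the 0.2 degree cell size, the `location_precision_deg` property, and the site identifier are assumptions, not DINAA's actual procedure or schema.

```python
import math


def coarsen(lat, lon, cell_deg=0.2):
    """Snap a point to the center of the grid cell that contains it.

    cell_deg is an assumed cell size in decimal degrees; a real project
    would choose it to satisfy its own site-security and legal rules.
    """
    def snap(value):
        return (math.floor(value / cell_deg) + 0.5) * cell_deg

    return round(snap(lat), 6), round(snap(lon), 6)


def site_feature(site_id, lat, lon, cell_deg=0.2):
    """Build a GeoJSON Feature that exposes only the coarsened location."""
    c_lat, c_lon = coarsen(lat, lon, cell_deg)
    return {
        "type": "Feature",
        "id": site_id,
        # GeoJSON positions are ordered longitude, then latitude.
        "geometry": {"type": "Point", "coordinates": [c_lon, c_lat]},
        # Hypothetical property recording how coarse the location is.
        "properties": {"location_precision_deg": cell_deg},
    }


# "site-0001" is a made-up identifier used only for this example.
feature = site_feature("site-0001", 35.123, -84.567)
```

Recording the cell size alongside the point lets later users see that the coordinate is deliberately approximate rather than a precise find spot.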

Page 47: Beyond Preservation: Situating Archaeological Data in Professional Practice
Page 48: Beyond Preservation: Situating Archaeological Data in Professional Practice

Using site file data to examine the impacts of sea level rise

In 100 years, 19,676 sites will be covered!

Page 49: Beyond Preservation: Situating Archaeological Data in Professional Practice

Digital Index of North American Archaeology (DINAA)
1. ~ 500,000 site records curated by state officials
2. Key (Linked Data!) reference for N. American archaeology
3. PIs/Co-PIs: David G. Anderson, Joshua Wells, Eric Kansa, Sarah Kansa, Stephen Yerka

Page 50: Beyond Preservation: Situating Archaeological Data in Professional Practice

Stable Web URI: Reference this to disambiguate between “Alexandria” (Egypt) and other places called “Alexandria” (many of which are also ancient)
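
As a rough illustration of why a stable URI disambiguates better than a name, the sketch below groups records by the place URI they reference instead of by the place-name string. The URIs under `example.org`, the record layout, and the function name are placeholders invented for this example; they are not real gazetteer identifiers.

```python
from collections import defaultdict

# Placeholder URIs for two different ancient places that share a name.
ALEXANDRIA_EGYPT = "https://example.org/places/alexandria-egypt"
ALEXANDRIA_TROAS = "https://example.org/places/alexandria-troas"

# Hypothetical records from different collections. The name strings are
# ambiguous or inconsistently spelled, but the URIs are not.
records = [
    {"id": "rec-1", "place_name": "Alexandria", "place_uri": ALEXANDRIA_EGYPT},
    {"id": "rec-2", "place_name": "Alexandria", "place_uri": ALEXANDRIA_TROAS},
    {"id": "rec-3", "place_name": "Alexandreia", "place_uri": ALEXANDRIA_EGYPT},
]


def group_by_place(recs):
    """Group record IDs by the place URI they reference."""
    groups = defaultdict(list)
    for rec in recs:
        groups[rec["place_uri"]].append(rec["id"])
    return dict(groups)


by_place = group_by_place(records)
```

Grouping on the name alone would merge rec-1 with rec-2 and miss rec-3; grouping on the URI keeps the two places apart and tolerates spelling variants.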

Page 51: Beyond Preservation: Situating Archaeological Data in Professional Practice

Pelagios: Heat map of museum collections, archives, databases referencing places in Pleiades (PIs Leif Isaksen, Elton Barker)

Page 52: Beyond Preservation: Situating Archaeological Data in Professional Practice

Web of Data (2011)

Need Archaeology on the Map

Contributions should not be isolated from other communities

Page 53: Beyond Preservation: Situating Archaeological Data in Professional Practice

Linked Data: Annotations to community vocabularies part of Open Context editorial process

Page 54: Beyond Preservation: Situating Archaeological Data in Professional Practice

Introduction

Digital Data in Archaeology
1. Why discuss data?
2. Data in (bad) institutional contexts
3. Open Context's approach
4. Need for more & wider intellectual investment

Page 55: Beyond Preservation: Situating Archaeological Data in Professional Practice

“I just started using an Excel spreadsheet that has sort of slowly gotten bigger and bigger over time with more variables or columns…I've added …color coding…I also use…a very sort of primitive numerical coding system, again, that I inherited from my research advisers…So, this little book that goes with me of codes which is sort of odd, but …we all know that a 14 is a sheep.” (CCU13)

Need to do more than “Optimize the Status Quo”
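
The private code book in the quote above ("a 14 is a sheep") can be pictured as an explicit, shareable lookup table that travels with the data. The sketch below is a minimal illustration under stated assumptions: only the pairing of 14 with sheep comes from the quote, while the second code, the `example.org` vocabulary URIs, the field names, and the function name are invented for the example.

```python
# Explicit code book. Only 14 -> sheep comes from the quoted interview;
# the goat entry and both URIs are hypothetical placeholders.
CODE_BOOK = {
    14: {"label": "sheep", "uri": "https://example.org/taxa/sheep"},
    15: {"label": "goat", "uri": "https://example.org/taxa/goat"},
}


def annotate(rows, code_book=CODE_BOOK):
    """Attach a readable label and a vocabulary URI to each coded row.

    Rows whose code is missing from the code book are not guessed at;
    their codes are returned separately so an editor can follow up.
    """
    annotated = []
    unknown = set()
    for row in rows:
        entry = code_book.get(row["taxon_code"])
        if entry is None:
            unknown.add(row["taxon_code"])
            continue
        annotated.append(
            {**row, "taxon_label": entry["label"], "taxon_uri": entry["uri"]}
        )
    return annotated, sorted(unknown)


# Made-up spreadsheet rows for illustration.
rows = [
    {"specimen": "a", "taxon_code": 14},
    {"specimen": "b", "taxon_code": 99},
]
annotated, unknown = annotate(rows)
```

Reporting unknown codes instead of silently dropping or guessing them keeps the decoding step transparent to whoever reuses the data.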

Page 56: Beyond Preservation: Situating Archaeological Data in Professional Practice

Raw Data Can Be Unappetizing

Page 57: Beyond Preservation: Situating Archaeological Data in Professional Practice

Sometimes data is better served cooked

Page 58: Beyond Preservation: Situating Archaeological Data in Professional Practice

Large scale data sharing & integration for exploring the origins of farming. Funded by EOL / NEH

Page 59: Beyond Preservation: Situating Archaeological Data in Professional Practice

1. 300,000 bone specimens
2. Complex: dozens, up to 110 descriptive fields
3. 34 contributors from 15 archaeological sites
4. More than 4 person years of effort to create the data!

Page 60: Beyond Preservation: Situating Archaeological Data in Professional Practice

7000 BC (many pigs, cattle)

7500 BC (sheep + goat dominate, few pigs, few cattle)

6500 BC (few pigs, mixing with wild animals?)

8000 BC (cattle, pigs, sheep + goats)

• Not a neat model of progress to adopt a more productive economy. Very different, sometimes piecemeal adoption in different regions.

Arbuckle BS, Kansa SW, Kansa E, Orton D, Çakırlar C, et al. (2014) Data Sharing Reveals Complexity in the Westward Spread of Domestic Animals across Neolithic Turkey. PLoS ONE 9(6): e99845. doi:10.1371/journal.pone.0099845

Page 61: Beyond Preservation: Situating Archaeological Data in Professional Practice

Easy to Align
1. Animal taxonomy
2. Skeletal elements
3. Sex determinations
4. Side of the animal
5. Fusion (bone growth, up to a point)

Page 62: Beyond Preservation: Situating Archaeological Data in Professional Practice

Hard to Align (poor modeling, recording)
1. Tooth wear (age)
2. Fusion data
3. Measurements

Despite common research methods!!

Page 63: Beyond Preservation: Situating Archaeological Data in Professional Practice

“Under the hood” exposure and reuse attempts critical! Fundamental method & theory issues in data modeling!

Page 64: Beyond Preservation: Situating Archaeological Data in Professional Practice

Investing in Data is a Continual Need
1. Data and code co-evolve. New visualizations, analysis may reveal unseen problems in data.
2. Data and metadata change routinely (revised stratigraphy requires ongoing updates to data in this analysis)
3. Problems, interpretive issues in data (and annotations) keep cropping up.
4. Is publishing a bad metaphor implying a static product?
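
If data keep changing after release, one practical consequence is that each release should be comparable with the previous one. The sketch below shows one simple way to report what changed between two releases of a dataset keyed by record ID. The record layout, the field names, and the function name are assumptions made for illustration; this is not how Open Context tracks revisions.

```python
def diff_releases(old, new):
    """Summarize record-level changes between two dataset releases.

    Both arguments map a record ID to a dict of field values.
    """
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = {}
    for key in sorted(old.keys() & new.keys()):
        # Compare every field present in either version of the record.
        fields = {
            name: (old[key].get(name), new[key].get(name))
            for name in sorted(set(old[key]) | set(new[key]))
            if old[key].get(name) != new[key].get(name)
        }
        if fields:
            changed[key] = fields
    return {"added": added, "removed": removed, "changed": changed}


# Made-up example: a revised stratigraphic phase for one bone record.
release_1 = {
    "b1": {"taxon": "sheep", "phase": "II"},
    "b2": {"taxon": "pig", "phase": "I"},
}
release_2 = {
    "b1": {"taxon": "sheep", "phase": "III"},
    "b3": {"taxon": "goat", "phase": "I"},
}
summary = diff_releases(release_1, release_2)
```

A summary like this could accompany each release as a change note, in the same spirit as release notes for open source software.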

Page 65: Beyond Preservation: Situating Archaeological Data in Professional Practice
Page 66: Beyond Preservation: Situating Archaeological Data in Professional Practice

Data sharing as publication

Data sharing as open source release cycles?

Page 67: Beyond Preservation: Situating Archaeological Data in Professional Practice

Data sharing as publication

Data sharing as open source release cycles?

Page 68: Beyond Preservation: Situating Archaeological Data in Professional Practice

Data sharing as publication

AND

Data sharing as open source release cycles

Page 69: Beyond Preservation: Situating Archaeological Data in Professional Practice

Go beyond Optimization of the Status Quo

More to data than 'compliance'

Data require intellectual investment, methodological and theoretical innovation.

New professional roles needed, but who will pay for it?

Page 70: Beyond Preservation: Situating Archaeological Data in Professional Practice

Thank you!

Special Thanks!

Harvard Center for Hellenic Studies & the German Archaeological Institute (DAI)