Beyond Preservation: Situating Archaeological Data in Professional Practice
Transcript of Beyond Preservation: Situating Archaeological Data in Professional Practice
Eric C. Kansa (@ekansa)
UC Berkeley D-Lab
& Open Context
2014-2015 Harvard Center for Hellenic Studies & German
Archaeological Institute Research Fellow
Beyond Preservation: Situating Archaeological Data in
Professional Practice
Unless otherwise indicated, this work is licensed under a Creative Commons Attribution 3.0 License <http://creativecommons.org/licenses/by/3.0/>
Data Sharing as Publication
• Started in 2007
• Open data (mainly CC-By)
• Archiving by California Digital Library
• Part of a broader reform movement in scholarly communications
Introduction
Visions for Digital Data in Archaeology
1. “Optimizing the status quo”
2. Opportunity for fundamentally better ways to conduct and communicate research
Introduction
Digital Data in Archaeology
1. Why discuss data?
2. Data in (bad) institutional contexts
3. Open Context's approach
4. Need for more & wider intellectual investment
Data source: Arif Jinha (2010). Article 50 million: an estimate of the number of scholarly articles in existence. Learned Publishing, 23 (3), 258-263. DOI: 10.1087/20100308.
Image Source: http://www.cs.cmu.edu/~comar/open-science/
Paper and paper-like digital files (PDFs) do not scale well:
● Discovery
● Reuse
Image Credit: Wikimedia Commons (Public Domain) http://commons.wikimedia.org/wiki/File:Archives_entreprises.jpg
Image Credit: Wikimedia Commons (CC-BY-SA) http://commons.wikimedia.org/wiki/File:BigData_2267x1146_white.png
Lots of investment in “Big Data”
● Corporate
● Government
● 'STEM' academia
Image Credit: 'gin soak' (CC-BY-NC-ND) https://www.flickr.com/photos/gin_soak/2215398726
Structured Data – Creativity
1. New forms of communication
2. New forms of collaboration
3. New research opportunities
'Mash-ups' (informal integrations): Open Context & Arachne
Experiment in open, distributed post-publication peer review
Text-mining literature to identify references to ancient places
2010 (renewed 2012) Google Digital Humanities Awards: with Elton Barker, Leif Isaksen, Kate Byrne, Nick Rabinowitz
Project limited to public domain (pre-1920) resources
Introduction
Digital Data in Archaeology
1. Why discuss data?
2. Data in (bad) institutional contexts
3. Open Context's approach
4. Need for more & wider intellectual investment
Commercial interests and public policy
Conditions of academic labor
Neoliberalism: (loosely associated ideologies / assumptions / interests)
Source: The Occasional Pamphlet - Harvard University (http://blogs.law.harvard.edu/pamphlet/2013/01/29/why-open-access-is-better-for-scholarly-societies/)
Neoliberalism: Taylorism, “Audit Culture” and fierce job/grant competition
Data contributions don’t count!
Image Credit: Wikimedia Commons (Public Domain) http://en.wikipedia.org/wiki/Frederick_Winslow_Taylor#mediaviewer/File:Frederick_Winslow_Taylor_crop.jpg
Ironies of data: Publications counted as data, but data don’t count!
☹Frowns at
Many researchers (esp. junior scholars) lack academic freedom
My Precious Data
Image Credit: “Lord of the Rings” (2003, New Line), All Rights Reserved Copyright
Data sharing as compliance
Need more carrots!
1. Citation, credit, intellectually valued
2. Research outcomes (new insights from data reuse!)
Adapt Academic Taylorism:
● DataCite (metadata, citation for datasets)
● Alt-metrics (social media, view counts, download counts, etc.)
Make data count!
Introduction
Digital Data in Archaeology
1. Why discuss data?
2. Data in (bad) institutional contexts
3. Open Context's approach
4. Need for more & wider intellectual investment
Data Sharing as Publication
• Started in 2007
• Open data (mainly CC-By)
• Archiving by California Digital Library
• Part of a broader reform movement in scholarly communications
Publishing Workflow
Improve / Enhance
1. Consistency
2. Context (intelligibility, interoperability)
Digital Index of North American Archaeology (DINAA)
1. Rich metadata (cultures, chronology, site-types)
2. Reduced precision location data (site security, legal)
3. Data modeling challenges (using GeoJSON-LD, CIDOC-CRM, event models)
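The "reduced precision location data" point can be illustrated with a minimal sketch. It assumes one possible approach, snapping each coordinate to the centre of a coarse grid cell before publishing it as a GeoJSON point. The function names, the 0.2 degree cell size and the `location_precision_deg` property are illustrative assumptions, not DINAA's documented method.

```python
import math

def reduce_precision(lat, lon, cell_deg=0.2):
    """Snap a coordinate to the centre of a square grid cell cell_deg degrees wide."""
    def snap(value):
        return (math.floor(value / cell_deg) + 0.5) * cell_deg
    # Round away floating-point noise so published values are stable.
    return round(snap(lat), 6), round(snap(lon), 6)

def to_feature(site_id, lat, lon, cell_deg=0.2):
    """Build a GeoJSON Feature that exposes only the reduced-precision location."""
    safe_lat, safe_lon = reduce_precision(lat, lon, cell_deg)
    return {
        "type": "Feature",
        "id": site_id,
        # GeoJSON orders coordinates as [longitude, latitude].
        "geometry": {"type": "Point", "coordinates": [safe_lon, safe_lat]},
        # Hypothetical property recording how coarse the location is.
        "properties": {"location_precision_deg": cell_deg},
    }

# Made-up coordinates, used only to show the behaviour.
print(to_feature("example-site-1", 35.9612, -83.9207))
```

Every site inside the same cell receives the same published coordinate, so the record remains usable for regional mapping without revealing where the site actually is.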
Using site file data to examine the impacts of sea level rise
In 100 years, 19,676 sites will be covered!
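A count like the one above depends on site records that can be queried together. The sketch below is a deliberately simplified illustration, assuming each record carries an elevation in metres and that a site counts as "covered" when its elevation is at or below a projected rise. The records, field names and threshold are made up; this is not the analysis behind the figure on the slide.

```python
def sites_at_risk(sites, rise_m):
    """Return the IDs of sites whose elevation is at or below rise_m metres."""
    return [site["id"] for site in sites if site["elevation_m"] <= rise_m]

# Made-up records; real site files would come from state databases.
sites = [
    {"id": "site-a", "elevation_m": 0.4},
    {"id": "site-b", "elevation_m": 2.5},
    {"id": "site-c", "elevation_m": 0.9},
]

at_risk = sites_at_risk(sites, rise_m=1.0)
print(len(at_risk), at_risk)
```

A realistic study would also draw on elevation models, distance to shore and local subsidence, none of which is modelled here.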
Digital Index of North American Archaeology (DINAA)
1. ~ 500,000 site records curated by state officials
2. Key (Linked Data!) reference for N. American archaeology
3. PIs/Co-PIs: David G. Anderson, Joshua Wells, Eric Kansa, Sarah Kansa, Stephen Yerka
Stable Web URI: Reference this to disambiguate between “Alexandria” (Egypt) and other places called “Alexandria” (many of which are also ancient)
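A minimal sketch of why the URI matters: records that cite a place by URI can be grouped reliably, whereas the bare name "Alexandria" matches more than one place. The gazetteer entries and the example.org URIs below are placeholders invented for illustration; they are not real gazetteer identifiers.

```python
from collections import defaultdict

# Hypothetical gazetteer: the example.org URIs are placeholders, not real identifiers.
GAZETTEER = {
    "https://example.org/places/alexandria-egypt": {"name": "Alexandria", "region": "Egypt"},
    "https://example.org/places/alexandria-troas": {"name": "Alexandria", "region": "Troad"},
}

def uris_for_name(name):
    """List every URI whose place name matches; more than one result means the name is ambiguous."""
    return sorted(uri for uri, place in GAZETTEER.items() if place["name"] == name)

def group_by_place(records):
    """Group record IDs by the place URI they cite, not by the place name."""
    groups = defaultdict(list)
    for record in records:
        groups[record["place_uri"]].append(record["id"])
    return dict(groups)

# Made-up collection records.
records = [
    {"id": "coin-1", "place_uri": "https://example.org/places/alexandria-egypt"},
    {"id": "lamp-7", "place_uri": "https://example.org/places/alexandria-troas"},
    {"id": "coin-2", "place_uri": "https://example.org/places/alexandria-egypt"},
]

print(uris_for_name("Alexandria"))
print(group_by_place(records))
```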
Pelagios: Heat map of museum collections, archives, databases referencing places in Pleiades (PIs Leif Isaksen, Elton Barker)
Web of Data (2011)
Need Archaeology on the Map
Contributions should not be isolated from other communities
Linked Data: Annotations to community vocabularies part of Open Context editorial process
Introduction
Digital Data in Archaeology
1. Why discuss data?
2. Data in (bad) institutional contexts
3. Open Context's approach
4. Need for more & wider intellectual investment
“I just started using an Excel spreadsheet that has sort of slowly gotten bigger and bigger over time with more variables or columns…I've added …color coding…I also use…a very sort of primitive numerical coding system, again, that I inherited from my research advisers…So, this little book that goes with me of codes which is sort of odd, but …we all know that a 14 is a sheep.” (CCU13)
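The "little book of codes" in this quote is knowledge that never travels with the spreadsheet. A minimal sketch of making it explicit, assuming a column named `taxon_code` and a placeholder vocabulary URI: only the 14 = sheep pairing comes from the quote, and every other name here is hypothetical.

```python
# Code book kept alongside the data. Only 14 = sheep comes from the quote above;
# the example.org concept URI is a placeholder, not a real vocabulary entry.
CODE_BOOK = {
    14: {"label": "sheep", "concept_uri": "https://example.org/vocab/taxa/sheep"},
}

def decode(rows, code_book=CODE_BOOK):
    """Attach a readable label and concept URI to each row; flag codes the book lacks."""
    decoded = []
    for row in rows:
        entry = code_book.get(row["taxon_code"])
        decoded.append({
            **row,
            "taxon_label": entry["label"] if entry else None,
            "taxon_uri": entry["concept_uri"] if entry else None,
            # Unknown codes are flagged for a person to check, never guessed.
            "needs_review": entry is None,
        })
    return decoded

# Made-up specimen rows.
rows = [
    {"specimen": "b-001", "taxon_code": 14},
    {"specimen": "b-002", "taxon_code": 99},
]
for row in decode(rows):
    print(row)
```

Publishing the code book with the data means a later reader does not need to have shared the original adviser to know what a 14 is.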
Need to do more than “Optimize the Status Quo”
Raw Data Can Be Unappetizing
Sometimes data is better served cooked
Large scale data sharing & integration for exploring the origins of farming. Funded by EOL / NEH
1. 300,000 bone specimens
2. Complex: dozens, up to 110 descriptive fields
3. 34 contributors from 15 archaeological sites
4. More than 4 person years of effort to create the data!
7000 BC (many pigs, cattle)
7500 BC (sheep + goat dominate, few pigs, few cattle)
6500 BC (few pigs, mixing with wild animals?)
8000 BC (cattle, pigs, sheep + goats)
• Not a neat model of progress to adopt a more productive economy. Very different, sometimes piecemeal adoption in different regions.
Arbuckle BS, Kansa SW, Kansa E, Orton D, Çakırlar C, et al. (2014) Data Sharing Reveals Complexity in the Westward Spread of Domestic Animals across Neolithic Turkey. PLoS ONE 9(6): e99845. doi:10.1371/journal.pone.0099845
Easy to Align
1. Animal taxonomy
2. Skeletal elements
3. Sex determinations
4. Side of the animal
5. Fusion (bone growth, up to a point)
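For the "easy" cases such as animal taxonomy, alignment can amount to a crosswalk from each contributor's local label to a shared term. The sketch below assumes a hand-built crosswalk and made-up records; it is an illustration of the idea, not the project's actual editorial workflow.

```python
from collections import Counter

# Hypothetical crosswalk from local labels (lower-cased) to a shared taxon name.
CROSSWALK = {
    "sheep": "Ovis aries",
    "ovis aries": "Ovis aries",
    "goat": "Capra hircus",
    "capra hircus": "Capra hircus",
}

def align_taxon(local_label):
    """Map a contributor's label to the shared term, or None when no match is known."""
    return CROSSWALK.get(local_label.strip().lower())

def count_taxa(datasets):
    """Count specimens per shared taxon across datasets and collect unmatched labels."""
    counts = Counter()
    unmatched = set()
    for records in datasets:
        for record in records:
            taxon = align_taxon(record["taxon"])
            if taxon is None:
                unmatched.add(record["taxon"])
            else:
                counts[taxon] += 1
    return counts, unmatched

# Made-up records from two contributors with different recording habits.
project_a = [{"taxon": "Sheep"}, {"taxon": "goat"}]
project_b = [{"taxon": "Ovis aries "}, {"taxon": "caprine?"}]

counts, unmatched = count_taxa([project_a, project_b])
print(dict(counts), unmatched)
```

Labels that fail to match are returned separately so that an editor can decide how to treat them, instead of being silently dropped or forced into a category.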
Hard to Align (poor modeling, recording)
1. Tooth wear (age)
2. Fusion data
3. Measurements
Despite common research methods!!
“Under the hood” exposure and reuse attempts critical! Fundamental method & theory issues in data modeling!
Investing in Data is a Continual Need
1. Data and code co-evolve. New visualizations, analysis may reveal unseen problems in data.
2. Data and metadata change routinely (revised stratigraphy requires ongoing updates to data in this analysis)
3. Problems, interpretive issues in data (and annotations) keep cropping up.
4. Is publishing a bad metaphor implying a static product?
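If data keep changing, it helps to be able to say exactly what changed between one release and the next, much as software projects do. A minimal sketch, assuming each release is a dictionary keyed by a stable record ID; the records and field names are made up.

```python
def diff_releases(old, new):
    """Compare two releases keyed by record ID; report added, removed and changed IDs."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(key for key in old.keys() & new.keys() if old[key] != new[key])
    return {"added": added, "removed": removed, "changed": changed}

# Made-up releases: a revised stratigraphic phase changes one record,
# one record is withdrawn and one is new.
release_1 = {
    "b-001": {"taxon": "sheep", "phase": "II"},
    "b-002": {"taxon": "pig", "phase": "II"},
}
release_2 = {
    "b-001": {"taxon": "sheep", "phase": "III"},
    "b-003": {"taxon": "cattle", "phase": "I"},
}

print(diff_releases(release_1, release_2))
```

A change summary of this kind could accompany each release so that analyses built on an earlier version can be rechecked.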
Data sharing as publication
Data sharing as open source release cycles?
Data sharing as publication
AND
Data sharing as open source release cycles
Go beyond Optimization of the Status Quo
More to data than 'compliance'
Data require intellectual investment, methodological and theoretical innovation.
New professional roles needed, but who will pay for it?
Thank you!
Special Thanks!
Harvard Center for Hellenic Studies & the German Archaeological Institute (DAI)