Annotating Research Datasets
john-kunze
Annotating Research Datasets
11 April 2013
University of California Curation Center
California Digital Library
Term skew
Annotation: The act of adding a note by way of comment or explanation.
Genome annotation: The process of attaching biological information to sequences. E.g.,
• Protein Data Bank annotation manual: 247 pages
Research data annotation: (?!) Adding to opaque data to make it visible, sensible, and valuable.
The Long Tail
[Figure: long-tail plot. Axis label: Size of dataset. Further labels, added build by build: # datasets, # researchers, # grants, grant ($). Annotations: "With data managers and fancy tools" and "Do-it-yourself tools".]
From Flickr By puck90
UGLY TRUTH
Many researchers…
• have limited funding for data services
• are not taught data management
• don’t know what metadata or data centers are
• don’t share data publicly or store it in an archive
• aren’t convinced they should share data
The research data problem
Journal article | Research data
Uniquely and persistently identified | Nope
Concept of “publish” | Not really
Multiple copies | Typically one
Easily findable | Difficult
Impact metrics, etc. | Nope
Curation funding | Barely
Research data is ripe for crowd-sourced annotation