Cynthia Parr @cydparr US Department of Agriculture National Agricultural Library 30 September 2015 Ag Data Commons Adding value to open agricultural research data


Page 1

Ag Data Commons
Adding value to open agricultural research data

Cynthia Parr @cydparr
US Department of Agriculture
National Agricultural Library
30 September 2015

Page 2

Federal directives: Public access to open, machine-readable data

Page 3

The problems in agricultural data

• Broad subject areas
• Journals not integrated with repositories like Dryad
• Too many existing databases & web distribution points
• Lack of infrastructure for long-tail data
• Lack of a neutral, sustainable solution for long-term multi-institutional projects

Page 4

A proposed solution

Ag Data Commons Prototyping FY 2015

• Supports Public Access mandates
• Holds agricultural research data
• Primary audience: researchers
• Holds metadata for data held elsewhere
• Starting with USDA data but will broaden
• Both human and machine access
• Can include unpublished data that is ready for release

Page 5

[Diagram: AG DATA COMMONS. Flow labels: INGESTION, DISSEMINATION. Labeled components: Search & Knowledge Discovery; Thesaurus & Indexing; Ag Data Commons Repository; Organization & Curation; Grant management systems; PubAg; Dataset Submission; Analytics & Tools; Data.gov; Ag Data Commons Catalog; Distributed repositories; Forest Service; Geospatial. Legend: Building, Adapting, Existing.]

Page 6

Adding value

• Metadata + data package
• DOI
• Links
• Thesaurus tags
• Idiosyncratic data dictionary
• Search, services, compliance checking

Page 7

DKAN http://nucivic.com/dkan/

PRO
• Open source community
• Drupal modules for basic CMS functions
• Integrated CKAN catalog
• Feeds Data.gov
• Basic metadata already supported

CON
• Not designed for scientific data or scientists
• No links to literature
• No Digital Object Identifiers
• Doesn’t handle dataset relationships
• Metadata inadequate for compliance checking & re-use

Page 8

Metadata Standards

Core Metadata Schema
POD 1.1 (Project Open Data) https://project-open-data.cio.gov/

Related Scientific Metadata & Data Standards (e.g.)
ISO 19115 (GIS Data, FGDC) https://www.iso.org
Darwin Core (Biodiversity standards) http://rs.tdwg.org/dwc/
EML (Ecological Metadata Language) https://knb.ecoinformatics.org/#tools/eml
MiXS GSC (Genomic Standards Consortium) http://gensc.org/projects/mixs-gsc-project/
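To make the core schema concrete, here is a minimal Python sketch of what a POD 1.1 dataset entry might look like, with a check for the fields the schema lists as always required. The record itself is hypothetical: the title, URLs, and contact details are placeholders, and the helper name `missing_required` is an assumption made for illustration, not part of any Ag Data Commons code.

```python
import json

# Fields that POD v1.1 lists as always required for a dataset entry.
# (Federal agencies additionally supply bureauCode and programCode.)
REQUIRED_FIELDS = (
    "title", "description", "keyword", "modified",
    "publisher", "contactPoint", "identifier", "accessLevel",
)


def missing_required(record: dict) -> list[str]:
    """Return the required POD fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


# Hypothetical record: every value below is a placeholder, not real data.
record = {
    "@type": "dcat:Dataset",
    "title": "Example soil moisture observations",
    "description": "Placeholder description for an illustrative dataset.",
    "keyword": ["soil moisture", "example"],
    "modified": "2015-09-30",
    "publisher": {"@type": "org:Organization", "name": "Example Publisher"},
    "contactPoint": {
        "@type": "vcard:Contact",
        "fn": "Example Contact",
        "hasEmail": "mailto:contact@example.org",
    },
    "identifier": "https://example.org/dataset/soil-moisture",
    "accessLevel": "public",
}

print(missing_required(record))      # fields still to be filled in
print(json.dumps(record, indent=2))  # the entry as it would appear in JSON
```

A check like this covers only the presence of fields; the compliance checking mentioned in the slides would presumably need more than that.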

Page 9

Controlled Vocabularies

Relevant for Agriculture
• NALT – National Agricultural Library Thesaurus http://agclass.nal.usda.gov
  GACS Global Agricultural Concept Scheme
• Taxonomy
• Gene Ontology (GO) http://geneontology.org/
• ENVO, ecological, economic, etc.

• Help create a semantic web
• SKOS (Simple Knowledge Organization System): W3C recommendation, or RDF

Credit: AIMS--FAO
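As an illustration of how a thesaurus term can be expressed in SKOS, the following Python sketch serializes one concept to Turtle using `skos:prefLabel`, `skos:altLabel`, and `skos:broader`. The concept URIs under `example.org` and the `Concept` class are assumptions made for this example; they are not actual NALT or GACS identifiers.

```python
from dataclasses import dataclass, field

SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."


@dataclass
class Concept:
    uri: str
    pref_label: str
    alt_labels: list[str] = field(default_factory=list)
    broader: list[str] = field(default_factory=list)  # URIs of parent concepts


def to_turtle(concept: Concept, lang: str = "en") -> str:
    """Serialize one concept as a Turtle statement.

    Labels are assumed to contain no double quotes or backslashes;
    a real serializer would have to escape them.
    """
    lines = [f"<{concept.uri}> a skos:Concept"]
    lines.append(f'    skos:prefLabel "{concept.pref_label}"@{lang}')
    for label in concept.alt_labels:
        lines.append(f'    skos:altLabel "{label}"@{lang}')
    for parent in concept.broader:
        lines.append(f"    skos:broader <{parent}>")
    return " ;\n".join(lines) + " ."


# Hypothetical concept with a synonym and one broader term.
maize = Concept(
    uri="https://example.org/vocab/maize",
    pref_label="maize",
    alt_labels=["corn"],
    broader=["https://example.org/vocab/cereals"],
)

print(SKOS_PREFIX)
print(to_turtle(maize))
```

Plain string formatting keeps the sketch dependency-free; an RDF library would be the more robust choice for real vocabularies.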

Page 10

https://data.nal.usda.gov/

Launching next week

Page 11

Page 12

Adding even more value

Structured methods metadata

Shared data dictionary

Semantic data dictionary
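The slide does not define these terms. One possible reading is that a data dictionary becomes "semantic" once its variables are linked to concepts in a controlled vocabulary. The Python sketch below follows that reading; the `Variable` class, the column names, and the `example.org` concept URI are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variable:
    name: str                          # column name as it appears in the data file
    description: str
    unit: Optional[str] = None
    concept_uri: Optional[str] = None  # link to a vocabulary term, if any


def unlinked(dictionary: list[Variable]) -> list[str]:
    """Names of variables that have no vocabulary concept attached."""
    return [v.name for v in dictionary if v.concept_uri is None]


# Hypothetical dictionary: one variable is linked to a concept, one is not.
dictionary = [
    Variable("yld", "Grain yield at harvest", "kg/ha",
             "https://example.org/vocab/crop-yield"),
    Variable("sm10", "Soil moisture at 10 cm depth", "m3/m3"),
]

print(unlinked(dictionary))  # variables still lacking a concept link
```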

Page 13

Adding even more value

Assist application launch

Find related data

Integrate/link related data

= help build the knowledge graph
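One simple way to find related data, assuming datasets carry thesaurus tags, is to compare their tag sets. The sketch below ranks datasets by Jaccard similarity of their tags. It illustrates the idea only and is not the method used by Ag Data Commons; the dataset names, tags, and threshold are hypothetical.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def related(target: str, tags: dict[str, set[str]],
            threshold: float = 0.25) -> list[tuple[str, float]]:
    """Other datasets whose tag overlap with `target` meets the threshold, best first."""
    scores = [
        (other, jaccard(tags[target], other_tags))
        for other, other_tags in tags.items()
        if other != target
    ]
    kept = [pair for pair in scores if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical datasets and thesaurus tags.
tags = {
    "dataset-a": {"maize", "crop yield", "irrigation"},
    "dataset-b": {"maize", "crop yield", "fertilizers"},
    "dataset-c": {"forest inventory", "geospatial data"},
}

print(related("dataset-a", tags))
```

Each pair that clears the threshold could be recorded as a link between two datasets, which is one way such links might accumulate into a graph.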

Page 14

Acknowledgements

[email protected]

Susan McCarthy, NAL – KSD
Ursula Pieper, NAL – ISD
Qing Qu, NAL – KSD contractor
Jeff Campbell, NAL – KSD
Jaylen Nathwani, NAL – student intern
NüCivic, Angry Cactus Team
Jocelyn McNamara, NAL – KSD contractor
Kerry Huller, UMD graduate fellow
Erin Antognoli, UMD graduate fellow