Summary of RDA Outputs so far
dr. ir. Herman Stehouwer
22 September 2015
Intangible Outcomes
• Neutral forum for discussing issues
• Generates global discussions
• Very diverse: roles, disciplines
-> Increased insight (e.g. Jamie)
-> Increased needs alignment (e.g. Antonio)
RDA Working Groups
Form the Foundation for RDA Community Impact!
Working Groups are envisioned as accelerants to data-sharing practice and infrastructure in the short term, with the overarching goal of advancing global data-driven discovery and innovation.
RDA Working Group profile:
Short-term: 12-18 months
Focused efforts with specific actions adopted by specific communities
International participation
Open, voluntary, consensus-driven
Complementary to effective efforts elsewhere
Potential outcomes / deliverables:
• New data standards or harmonization of existing standards.
• Greater data sharing, exchange, interoperability, usability and re-usability.
• Greater discoverability of research data sets.
• Better management, stewardship, and preservation of research data.
RDA Interest Groups
• An Interest Group (IG) can be established prior to a Working Group for community discussion of issues and areas that facilitate data-driven research.
• IGs are longer-term groups defining common issues and interests.
• WGs and IGs are collaborating intensively with groups in comparable initiatives such as IETF, CODATA, WDS, W3C.
Possible functions: create new WGs; communication/coordination
Domains, themes, external parties (WDS, CODATA, etc.)
First Four Outputs
Presented at P4 in Amsterdam
Far along in adoption, all have ratified recommendations
1. DFT (Data Foundation & Terminology)
2. DTR (Data Type Registry)
3. PIT (PID Information Types)
4. PP (Practical Policies)
RDA Results I: common data model
• PIDs at the beginning of the trust chain
• need a worldwide, independent and robust PID system
• metadata are essential in an anonymous data world
taken from RDA WG Data Foundation & Terminology
RDA Results II: Data Type Registry
• result: a registry for data types
• simple example: you get an unknown file, pull it onto the DTR, and its content is visualized
• the DTR can also be used to describe and re-use semantic content
• no free lunch: someone needs to register and define each type
• PIT demo already working with the DTR
Figure: a Federated Set of Type Registries supports a Domain of Services (visualization, data processing, dissemination, interpretation, agreeing to terms and rights) that sits between a Data Set and Human or Machine Consumers; the steps are numbered 1 to 4.
RDA Results III: PID Information Types
• result: a generic API and a set of basic attributes
• a PID record is like a passport (number, photo, expiry date, etc.)
• if all PID service providers agree on one API and talk the same language (registered terms), software development will become easy
• test installation in operation together with the DTR
LOC location, path
CKSM checksum
CKSM_T checksum type
RoR owning repository
MD path to MD
Figure: a Big Data Process (consuming many digital objects from different repositories) holds a list of PIDs (PID 1, PID 2, PID 3, ... PID k) and requests the checksum for all PIDs found from the PID Resolution Systems. The checksum attribute (cksm) is defined in the Data Type Registry, and the process makes use of the DTR definition.
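As a rough illustration of the passport analogy, the sketch below models a PID record with the five attributes listed above (LOC, CKSM, CKSM_T, RoR, MD) and shows how a consuming process might check each object's checksum after resolving its PID. The `PidRecord` class, the `RESOLVER` dictionary, the `register` and `verify` helpers, and all PIDs and paths are hypothetical stand-ins, not the API recommended by the Working Group.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PidRecord:
    """Hypothetical PID record carrying the attributes named on the slide."""
    loc: str     # LOC: location, path
    cksm: str    # CKSM: checksum (hex digest)
    cksm_t: str  # CKSM_T: checksum type, e.g. "sha256"
    ror: str     # RoR: owning repository
    md: str      # MD: path to metadata


# Assumption: a plain dict stands in for a PID resolution system.
RESOLVER = {}


def register(pid, data, loc, ror, md, cksm_t="sha256"):
    """Create a record for `data` and store it under `pid`."""
    digest = hashlib.new(cksm_t, data).hexdigest()
    record = PidRecord(loc=loc, cksm=digest, cksm_t=cksm_t, ror=ror, md=md)
    RESOLVER[pid] = record
    return record


def verify(pid, data):
    """Resolve `pid` and compare the stored checksum with one computed from `data`."""
    record = RESOLVER[pid]
    return hashlib.new(record.cksm_t, data).hexdigest() == record.cksm


# Two objects from different (made-up) repositories, using different checksum types.
register("pid/1", b"temperature,21.5\n", loc="repoA/obj1.csv", ror="repoA", md="repoA/obj1.md.xml")
register("pid/2", b"temperature,19.0\n", loc="repoB/obj2.csv", ror="repoB", md="repoB/obj2.md.xml",
         cksm_t="sha512")

# The consuming process fetched these bytes; the second object was altered in transit.
fetched = {"pid/1": b"temperature,21.5\n", "pid/2": b"temperature,99.9\n"}
results = {pid: verify(pid, data) for pid, data in fetched.items()}
print(results)
```

Because the checksum type travels with the record, the consumer does not need to know in advance which algorithm each repository uses.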
RDA Results IV: Practical Policies
• due to unforeseen circumstances, more time is needed, until P5
• Practical Policies = executable workflow statements
• result at P5: a set of Best Practice PPs for a number of typical DM/DP tasks (integrity check, replication, etc.)
• currently a large collection of PPs, which is being evaluated
• huge simplification for data stewards
• finally feasible quality checks and certification
• huge step in trust improvement
Figure: a data manager works from a Policy Inventory (replication policy X, replication policy Y, integrity policy A, integrity policy B, integrity policy C, md extraction policy l, md extraction policy k, etc.) for a repository, in three steps: selection, implementation, execution.
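To make "executable workflow statements" concrete, here is a minimal sketch in which each policy is a plain function and a data manager selects policies from an inventory and executes them against a repository. The repository layout (a dict of objects with `data` and `cksm` fields), the function names and the two example policies are assumptions for illustration only; the actual Practical Policies are defined by the Working Group.

```python
import hashlib


def sha256(data):
    """Hex SHA-256 digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def integrity_policy(repo, replica):
    """Return the names of objects whose stored checksum no longer matches their content."""
    return sorted(name for name, obj in repo.items() if sha256(obj["data"]) != obj["cksm"])


def replication_policy(repo, replica):
    """Copy every object missing from the replica and return the names copied."""
    copied = sorted(name for name in repo if name not in replica)
    for name in copied:
        replica[name] = dict(repo[name])
    return copied


# Hypothetical policy inventory, keyed by the kind of names shown on the slide.
POLICY_INVENTORY = {
    "integrity policy A": integrity_policy,
    "replication policy X": replication_policy,
}


def execute(selected, repo, replica):
    """Run the selected policies in order and collect a report per policy."""
    return {name: POLICY_INVENTORY[name](repo, replica) for name in selected}


repo = {
    "obj1": {"data": b"abc", "cksm": sha256(b"abc")},
    "obj2": {"data": b"corrupted", "cksm": sha256(b"original")},  # content changed after ingest
    "obj3": {"data": b"xyz", "cksm": sha256(b"xyz")},
}
replica = {"obj1": dict(repo["obj1"]), "obj2": dict(repo["obj2"])}

report = execute(["integrity policy A", "replication policy X"], repo, replica)
print(report)
```

Keeping selection (the list of names) separate from implementation (the functions) mirrors the selection, implementation and execution steps of the slide.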
Second Group of Outcomes
Presented at the last Plenary in San Diego
Working on Adoption / Recommendations
1. Citation of Dynamic Data
2. DDRI
3. Metadata Standards Directory
4. Wheat Data Interoperability
RDA Results V: Citation of Dynamic Data
We have: Data + Means-of-access
Dynamic Data Citation: cite data dynamically via query!
Steps / Principles:
1. Data versioned (history, with time-stamps)
Researcher creates working-set via some interface:
2. Access: assign PID to “QUERY”, enhanced with
- Time-stamping for re-execution against versioned DB
- Re-writing for normalization, unique-sort, mapping to history
- Hashing result-set: verifying identity/correctness
leading to landing page
Many prototypes and pilot implementations
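The steps above can be sketched as follows: data is versioned with time-stamps, a query gets a PID together with its execution time and a hash of its sorted result set, and resolving the PID re-executes the query against the history and verifies the hash. Everything here is a simplifying assumption: the in-memory `ROWS` store, integer logical time-stamps, the `min_value` filter standing in for a real query, and a PID derived from a hash of the query. It is not any of the pilot implementations.

```python
import hashlib
import json

# Versioned store: each row version records when it was valid.
# valid_to is None while the version is current.
ROWS = []
CITATIONS = {}  # assumption: a dict stands in for a PID service


def insert(key, value, now):
    """Add a new version of `key`, closing the previous one instead of overwriting it."""
    for row in ROWS:
        if row["key"] == key and row["valid_to"] is None:
            row["valid_to"] = now
    ROWS.append({"key": key, "value": value, "valid_from": now, "valid_to": None})


def run_query(min_value, as_of):
    """Return (key, value) pairs with value >= min_value as the data stood at `as_of`."""
    hits = [
        (r["key"], r["value"])
        for r in ROWS
        if r["valid_from"] <= as_of
        and (r["valid_to"] is None or as_of < r["valid_to"])
        and r["value"] >= min_value
    ]
    return sorted(hits)  # unique sort gives a deterministic order for hashing


def result_hash(result):
    """Hash the result set so a later re-execution can be checked."""
    return hashlib.sha256(json.dumps(result).encode("utf-8")).hexdigest()


def cite(min_value, as_of):
    """Assign a (hypothetical) PID to the time-stamped, normalized query."""
    query = {"min_value": min_value, "as_of": as_of}
    digest = hashlib.sha256(json.dumps(query, sort_keys=True).encode("utf-8")).hexdigest()
    pid = "query/" + digest[:12]
    CITATIONS[pid] = {"query": query, "hash": result_hash(run_query(**query))}
    return pid


def resolve(pid):
    """Re-execute the cited query and report whether the result still matches its hash."""
    entry = CITATIONS[pid]
    result = run_query(**entry["query"])
    return result, result_hash(result) == entry["hash"]


insert("a", 10, now=1)
insert("b", 20, now=1)
pid = cite(min_value=15, as_of=2)

insert("b", 5, now=3)   # later update to "b"
insert("c", 30, now=3)  # later addition

cited_result, verified = resolve(pid)
current_result = run_query(min_value=15, as_of=4)
print(cited_result, verified, current_result)
```

The cited result stays reproducible after the database changes because old versions are closed rather than deleted, while the same query at a later time returns the current data.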
RDA Results VI: DDRI
Enabling cross-platform discovery between research data registries
• Interoperability projects between ANDS, CERN, Dryad based on DataCite and ORCID services
• Research Data Switchboard
• Interoperability between da-ra and DataPASS based on Dataverse
• De-duplication project; a collaboration between Data Curation Unit and DANS
This infrastructure enables anyone to query and find links between registries. It can be used by universities, repositories, registries and funders.
RDA Results VII: Metadata Standards Directory
• Standards are a good thing, but they only work when people use the same standards
• Too few standards -> people do their own thing
• Too many standards -> fragmentation
Goal: develop a directory listing metadata standards
• Comprehensive
• Easy to contribute to
• Extend the DCC Metadata Directory
• Make it community-updatable
RDA Results VIII: Wheat
• Wheat is a major food staple
• Need data interoperability to increase production
Encourage interoperability by:
• Creating an interoperability framework
• Providing guidelines on wheat data (cookbook)
• Repository of linked vocabularies
Adoption by WheatIS (Wheat Initiative), FAO, etc.
Currently in community validation