Research Data Management in South Africa
Anwar Vahed
July 2017
Outline
• SA investment in research data
• NICIS and DIRISA
• South African National Data Infrastructure & Services (SANDIS)
• Figshare as data repository: Background
© CSIR, 2017 2 July 2017
• SAEON: Environmental
• HSRC: Human Sciences and Humanities
• DataFirst: Survey and administrative
• Agincourt: Social, health demographics
• SANSA: Earth observation
• SADA: Survey & related
• SAAO: Astronomy
• Govt departments: StatsSA, DWA...
• Academia & research councils: Meteorology, EO, Climate Change, Water, Energy, Health…
National Data Investment
SKA projected budget
€2 billion to 2020
€650 million (Phase 1)
SA so far: R2 billion
Core services
Networked resources
Skills & expertise
National Integrated Cyberinfrastructure System (NICIS)
• Amalgamated physically distributed cyber platform for e-research
• Overarching national coordination
• Priority and cross-cutting domains
Computing Services (CHPC)
Networking Services (SANReN)
Data Services (DIRISA)
Data intensive research environments (Cloud)
Materials & Manufacturing
Energy
Earth & Environment
Physical Sciences & Engineering
Humans & Society
Health, Bio & Food
DIRISA Objectives
Robust infrastructure & services
Sustained, federated and trusted repositories
Enabling environments (VREs/Gateways)
Sound data management
RDM policies and standards
Data sharing and reuse
Capacity & expertise
Data intensive research & management
Data Science “technologists”
Advocacy & outreach
Data sharing
Stakeholder engagement
Coordination & strategy
National data intensive research activities
Inform strategic agenda
Research Ecosystems: cross & multi disciplinary research
Data Services: harmonised data management
Federated Data Infrastructure: observations (models and measurements)
Data Lifecycle Management
PIDs: People, Objects, Activities (DONA, DataCite; ORCID)
Tiered harvesting with Trusted Repository Certification: WDS DSA
SANDIS Services
DSubscribe
• Register as DIRISA user
DataDrop
• Deposit and store data reliably
FindGet
• Discover, download data sets
SafeShare
• Safely share data with users
DataStage
• Prepare data for processing
User documentation
Help & support
Core services (DMP, DOI)
Phase 2: Collaborative Research Environments
• My data management plans
• My workflows
• My data sets and outputs
• My communities
Community driven
South African National Data Infrastructure and Services (SANDIS)
Collaborators
RDA
CODATA
WDS
DCC
EUDAT
ANDS
UKDA
Data.gov
So far... (Current status)
• Robust infrastructure and services
– Regional Tier 2 data node (Western Cape): Astronomy & Bioinformatics
– Server cluster, DMP tool, DONA and DataCite Services
• Data management
– Policies (Open and not Open Data): Regulatory (POPI Act), Ethical
– Guidelines, SA DMP tool, User subscription
• Capacity & expertise
– National MSc in eScience
– Data fundamentals courses (Data & Software Carpentry; IBM)
• Advocacy & outreach
– SARDA, USAf, ASSAF, NRF (OA), DHET
– RDA, CODATA, WDS, SKA
– IDW 2018?
• Coordination & strategy
– National data intensive research strategy
– SADC Cyberinfrastructure framework
National Research Data Management
• Western Cape Regional (T2) Data Node
– Consortium: UCT, UWC, US, CPUT, Sol Plaatje, SKA
– Figshare for data management
• Research Data Management “Systems”
– Figshare; Islandora; Open Science Framework, Dataverse,…
• Figshare: User / Institutional uptake?
– 6 month pilot
Thank you
Discussion
• Which is “best”?
• Institutional commitment?
• How to recover costs?
www.dirisa.ac.za