Creating a Data Management Plan - KCGG · 2018-09-04 · Aims of today’s workshop •provide you...


Creating a Data Management Plan (or how to get started with RDM)

Myriam Mertens | Ghent University Library

Nele Pauwels | Knowledge Centre for Health Ghent


Aims of today’s workshop

• provide you with a basic understanding of data management planning and why it’s important

• give you a broad overview of some key issues & topics involved in research data management (not just about storage)

• help you get started with writing your own data management plan

• point you to existing resources for further information, training and advice


Workshop outline

• Introduction: what is a data management plan and why create one?

• Key topics to cover

• Putting it all together: create your own data management plan


Introduction

What is a Data Management Plan, and why create one?


Data management plan (DMP)

document outlining how data will be handled during and after a project

increasingly required by research funders/institutions

good practice even if not required, because…


Planning is first step towards good research data management (RDM)

“[RDM] ensure[s] that data are of a high quality, are well organized, documented, preserved, sustainable, accessible and reusable.” (Corti et al. 2014)


Data in the digital age

• data “explosion”

- navigating and using data is the challenge

• digital data are fragile, e.g. because of

- hardware/software failure

- human error

- natural disasters

- malicious attacks

- passage of time! (changing technology, loss of information…)

- …


Why you should care about RDM

• increases research efficiency

• facilitates collaboration

• encourages data reuse (increased visibility!)

• minimizes risk of data loss

• supports research integrity & quality


RDM is a crucial part of good research practice

Expectations regarding data include:

• secure preservation for a reasonable period

• access: as open as possible, as closed as necessary

• FAIR principles (Findable, Accessible, Interoperable & Reusable data)

• data = legitimate & citable products of research


Research data are valuable scientific output!

Research data lifecycle


DMPs – a mere administrative burden?

- takes time and effort upfront, but…

+ saves time and problems later on

+ helps consider whole range of RDM activities/issues

+ makes expectations, procedures & responsibilities explicit

+ leads to more informed decisions about data

+ helps identify resources required (& obtain funding)


Key DMP topics

“[…] plans typically state what data will be created and how, and outline the plans for sharing and preservation, noting what is appropriate given the nature of the data and any restrictions that may need to be applied.” (DCC website)

1. description of data to be generated or used

2. methods, standards for collecting/creating & documenting data

3. ethics & legal issues

4. plans for data sharing

5. strategy for preserving data beyond project end

Also see: http://www.dcc.ac.uk/resources/how-guides/develop-data-plan


1. What types of data will you use or produce?


‘Research data’ can mean a lot…

any information collected/created for the purposes of analysis to generate scientific claims

For example:

• content: numerical, textual, audiovisual, multimedia... data

• data format/object: spreadsheets/tabular data, field notes, databases, images, audio recordings, marked up texts, surveys, instrument readings…

• mode of data collection: experimental, observational, simulation, derived/compiled… data

• digital or non-digital data

• primary or secondary data

• raw, processed or analyzed data


File formats

• Potential problems

- not interoperable: other people cannot open file

- obsolescence: file may be difficult to open at a later date

• Formats for long-term access

- non-proprietary: no specific (version of) software required to open file

- open, documented standard

- widely used

- uncompressed (or lossless compression)


Examples of recommended formats

• be aware of potential errors/information loss when converting

• consider saving data in both proprietary and open format

Type of data    Formats
tabular data    .csv; .tab; .por (SPSS portable format); .xml
textual data    .rtf; .txt; .xml
image data      .tif (TIFF 6.0 uncompressed)
audio data      .flac

Source: https://www.ukdataservice.ac.uk/manage-data/format/recommended-formats


Data volume

• how much data will you produce? how fast will it grow?

• where will you store data? do you have enough storage capacity?

• what about back-ups?


3-2-1 backup rule

• have at least 3 copies of important files, on at least 2 different types of storage media, with at least 1 offsite copy

• back up regularly & automatically
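The 3-2-1 rule can be expressed as a small check. The following is only a minimal sketch: the `Copy` record and the function name are hypothetical and introduced purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One copy of a dataset (hypothetical record, for illustration only)."""
    medium: str    # e.g. "laptop disk", "network share", "cloud"
    offsite: bool  # True if the copy is stored away from the primary location


def satisfies_3_2_1(copies: list[Copy]) -> bool:
    """Return True if the copies meet the 3-2-1 rule."""
    enough_copies = len(copies) >= 3                     # at least 3 copies
    enough_media = len({c.medium for c in copies}) >= 2  # at least 2 media types
    has_offsite = any(c.offsite for c in copies)         # at least 1 offsite
    return enough_copies and enough_media and has_offsite


copies = [
    Copy("laptop disk", offsite=False),
    Copy("network share", offsite=False),
    Copy("cloud", offsite=True),
]
print(satisfies_3_2_1(copies))
```

A check like this only describes where copies are kept; it does not replace regular, automated backups.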


Don’t rely on local storage devices (alone)

• Central storage options (DICT):

- shared network drives (‘shares’): secure, regular backups

- OneDrive for Business (http://onedrive.ugent.be): 1 TB; no confidential data unless encrypted

- Sharepoint (http://sharepoint.ugent.be): no confidential data unless encrypted

[Table: requirements versus DICT storage options. Labels: very large datasets, active data, sharing outside of UGent, confidential data, local copy, HPC, archival shares, ACL shares, Sharepoint, OneDrive for Business. Cell values (Yes/No, checkmarks, “Encrypt”, “case-by-case”, “consider deposit”) are not recoverable.]


Activity:

• Introduce yourself to your neighbour and describe the types of data you produce/use

- e.g. mode of data collection, digital/non-digital, formats, volume…


2. What methods, standards will you use to create & document your data?


Organizing data


Have a logical system for organizing data (files)

• should be meaningful to you and your colleagues

• should allow you to find files/data easily

• develop standards early in project & use consistently

• don’t forget non-digital materials

For example:

• hierarchical folder structure

• database for large, complex datasets

• non-digital data: filing system, labels to identify content of data folders
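A hierarchical folder structure can be set up once at the start of a project and reused consistently. The sketch below is one possible layout, not a prescribed one: the folder names and the function name are assumptions chosen for illustration.

```python
import tempfile
from pathlib import Path

# Hypothetical layout; adapt the folder names to your own project.
FOLDERS = [
    "data/raw",
    "data/processed",
    "documentation",
    "analysis",
    "output",
]


def create_project_skeleton(root: Path) -> list[Path]:
    """Create the folder hierarchy under `root` and return the created paths."""
    created = []
    for name in FOLDERS:
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)  # safe to run more than once
        created.append(folder)
    return created


# Demonstration in a temporary directory, so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    created = create_project_skeleton(Path(tmp))
    all_exist = all(path.is_dir() for path in created)
print(all_exist)
```

Keeping raw and processed data in separate folders makes it harder to overwrite the original files by accident.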


Organizing data

• which example looks better, and why?


File naming

• would you know what these files are in 3 years’ time?


Use file naming conventions

• names should

- uniquely identify and reflect content of a file

- be consistent

- contain no special characters or spaces

- not be too long

- use date conventions (YYYYMMDD or YYYY-MM-DD)

For example:

• 19991021_WesternBlot0

- western blot experiment number 7 from 21 Oct 1999

• Int024_MP_2008-06-05.doc

- interview with participant 24, interviewed by Marc Peters on 5 June 2008
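A naming convention is easier to apply consistently when names are generated rather than typed by hand. The helper below is a minimal sketch under assumed rules (date first in YYYYMMDD form, parts joined by underscores, only letters, digits and hyphens inside a part); the function name and the exact pattern are hypothetical.

```python
import re
from datetime import date

# Letters, digits and hyphens only; underscores are reserved as separators.
SAFE_PART = re.compile(r"^[A-Za-z0-9-]+$")


def build_file_name(day: date, *parts: str, extension: str = "") -> str:
    """Build a name of the form YYYYMMDD_part1_part2[.ext]."""
    for part in parts:
        if not SAFE_PART.match(part):
            raise ValueError(f"unsafe characters in name part: {part!r}")
    stem = "_".join([day.strftime("%Y%m%d"), *parts])
    return f"{stem}.{extension}" if extension else stem


name = build_file_name(date(2008, 6, 5), "Int024", "MP", extension="doc")
print(name)  # 20080605_Int024_MP.doc
```

Putting the date first in YYYYMMDD form means that sorting files alphabetically also sorts them chronologically.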


Versioning

• would you know which version of the data to use? Or how the versions differ from each other?


Have a strategy for file versioning

• record changes made to files

• identify different versions of files

• decide which versions to keep & how to organize them (e.g. master & working copies)

Possible strategies:

• dates or version numbers in file names (v1.0, v1.1, v2.0...)

• keep a log or file history table

• use version control software (e.g. Git), or version control features in software
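Version numbers and a file history table can be kept with very little tooling. The following sketch assumes version labels of the form `vMAJOR.MINOR` and a CSV log with hypothetical column names; the file name, author initials and change description are invented for the example.

```python
import csv
import io
from datetime import date


def next_version(current: str, major: bool = False) -> str:
    """Return the next label, e.g. v1.0 -> v1.1 (minor) or v2.0 (major)."""
    major_no, minor_no = (int(n) for n in current.lstrip("v").split("."))
    if major:
        return f"v{major_no + 1}.0"
    return f"v{major_no}.{minor_no + 1}"


def add_history_row(writer, file_name, version, author, change, day):
    """Append one row to the file history table."""
    writer.writerow([file_name, version, day.isoformat(), author, change])


buffer = io.StringIO()  # stands in for a real log file on disk
writer = csv.writer(buffer)
writer.writerow(["file", "version", "date", "author", "change"])

version = next_version("v1.0")
add_history_row(writer, "survey_data.csv", version, "MP",
                "recoded missing values", date(2018, 9, 4))
print(buffer.getvalue())
```

For code and text-based data, version control software such as Git records this history automatically; a manual log is mainly useful where such tools are not practical.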


Documenting data


Data documentation

• any descriptive or contextual information necessary to find, assess, understand & properly use your data

• start as soon as you start collecting data


Levels of data documentation

• study level

- published article may not be enough!

- conditions of data collection (e.g. study description, protocols, instruments & software/hardware used…)

- any changes made to collected data (e.g. processing & analysis procedures)

- overall structure of files

• data level

- information about individual data items/elements within data files (e.g. variable names & descriptions, value codes, within-file annotations,…)


Capturing data documentation

• in separate files, e.g.

- readme.txt files

- lab notebooks

- data dictionaries/codebooks

• embedded within data files

• using metadata

- highly structured, machine-readable format for describing data

- elements from a controlled list, as defined by a metadata schema

- useful for searching through large amounts of data, to facilitate exchange & comparison of data

Metadata example for image data, based on the Dublin Core metadata schema (reproduced from Briney 2015)
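To make the idea of structured, machine-readable metadata concrete, the sketch below writes a few Dublin Core elements as XML. It is not the example from Briney 2015: the field values are invented, the `metadata` wrapper element and the function name are assumptions, and only the Dublin Core element namespace is taken from the standard.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields: dict[str, str]) -> str:
    """Serialize a flat dict of Dublin Core elements to an XML string."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for element, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


record = dublin_core_record({
    "title": "Microscope image of sample 12",  # hypothetical values
    "creator": "Doe, Jane",
    "date": "2018-09-04",
    "format": "image/tiff",
})
print(record)
```

Because the element names come from a controlled list, records produced this way can be searched and compared across datasets.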


Don’t reinvent the wheel: use existing standards!

• Standard = an agreed way of doing something. A standard provides the requirements, specifications, guidelines or characteristics that can be used for the description, interoperability, citation, sharing, publication, or preservation of all kinds of digital objects such as data, code, algorithms, workflows, software, or papers.

• Examples:

- minimal reporting standards for publications (e.g. systematic review, qualitative research, diagnostic tests, etc.)

- minimal reporting standards for biomedical investigations (e.g. Flow Cytometry, quantitative Real-Time PCR)

- standard vocabularies, ontologies (Human Proteome Organization's Standards Initiative)

- standard data structures/formats

• look for standards via

- FAIRsharing.org

- RDA metadata standards directory


Activity

• Which reporting standards are recommended by my funder or publisher?

• Which standards do I need to use to prepare datasets?

• Which minimal information should I report when performing my experiment(s)?

=> Search FAIRsharing.org


FAIRsharing.org

• A web-based, searchable portal of three interlinked registries (standards, databases and policies), all following the FAIR principles

- STANDARDS: from formal Standards Developing Organizations (SDOs), such as the American National Standards Institute (ANSI) and the International Organization for Standardization (ISO) [de jure standards], and from grass-roots efforts [de facto standards]

- DATABASES: databases that implement standards

- POLICIES: policies referring to standards and databases


3. How will you handle ethics & legal issues?


Personal data

• any information relating to an identified/identifiable natural person

• handling is regulated by European and Belgian privacy/data protection legislation

- e.g. certain requirements for lawful processing of personal data, such as data subject’s informed consent

• sensitive personal data

- info about racial or ethnic origin, political opinions, religious/philosophical beliefs, trade-union membership, health or sex life, criminal offences…

- benefit from stronger protection


Otherwise confidential data, e.g.

• you have signed a non-disclosure agreement/contract with a confidentiality clause

• data are otherwise sensitive

- i.e. disclosure may harm endangered species, vulnerable sites or groups, national security…

- also see ethical standards governing confidentiality in research

• data have economic valorisation potential

- duty to report to TechTransfer office before disclosing anything!


Some things to consider

• don’t collect more personal/confidential data than needed

• seek the permissions required to handle these data

- informed consent, ethical approval…

- permission for data collection, but also for processing, archiving, sharing…

• pseudonymize/anonymize personal data to protect privacy

• pay attention to data security to prevent unauthorized access & disclosure

- physical security (e.g. lock labs, offices…)

- security of computer systems and files (e.g. passwords, encryption, up-to-date software, controlled access to files/folders, …)

- network security (e.g. firewall protection, no confidential data on servers/computers connected to external network…)

familiarize yourself with UGent Information Security Policy & Guidelines!
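One common way to pseudonymize direct identifiers, as mentioned in the list above, is to replace them with a keyed hash. The sketch below uses HMAC-SHA256 from the Python standard library; the key, the records and the function name are hypothetical. Note that this is pseudonymization, not anonymization: anyone holding the key can re-link records, and the remaining fields may still identify a person.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes, length: int = 12) -> str:
    """Replace an identifier with a keyed hash, truncated for readability."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:length]


# Hypothetical key; in practice keep it secret and store it apart from the data.
key = b"replace-with-a-secret-key-stored-separately"

records = [{"name": "Jane Doe", "score": 7}, {"name": "John Roe", "score": 5}]
pseudonymized = [
    {"participant": pseudonymize(r["name"], key), "score": r["score"]}
    for r in records
]
print(pseudonymized)
```

The same identifier always maps to the same pseudonym under a given key, so records belonging to one participant can still be linked within the dataset.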


Intellectual property (IP)

• IP issues can affect your ability to (re)use, archive and/or share data

• data may be protected by IP rights

- e.g. copyright, database right

- permission from rights owner(s) required, e.g. to reproduce or publicly communicate data

• your data may have economic valorisation potential

- UGent normally owns rights in such research results

- sharing may not be possible (to protect confidential knowhow), or only after an embargo period (to seek patent protection first) – always check with TT office!


Intellectual property (IP)

• (Rights to) data may be (co-)owned by a third party

- e.g. when you re-use existing data

- e.g. when research is funded by or conducted in collaboration with an external partner

- third-party permissions required


4. What are your plans for externally sharing data?


Sharing data with others

• easier than ever in the digital age

• nothing new in certain domains + changing culture among researchers

• increasingly required by research funders and publishers

- funder policies on access to research data (e.g. European Commission – H2020)

- journal data availability policies (e.g. PLoS, Nature, Science…)


Sharing does not necessarily mean “open data”

• “As open as possible, as closed as necessary” approach

• Open: “anyone can freely access, use, modify, and share for any purpose” (opendefinition.org)

• Possible to share data under more restricted conditions

- e.g. only a subset of the data

- only with certain (types of) users

- only for certain types of use

- after an embargo period…

“Data Tree” by Auke Herrema – Het Bouwteam, licensed under CC-BY


How to share data?

• Email data “upon request”

- but often fails in practice (e.g. Wicherts et al. 2006)

• Disseminate via a project or personal website

http://ajp.psychiatryonline.org/doi/full/10.1176/ajp.2007.164.6.942


How to share data?

• Make data available via a trusted database or data repository

- domain-specific, catch-all or institutional

- helps make data citable & FAIR

- assigns persistent identifier (e.g. DOI) to dataset, which resolves to a landing page

- provides online access to metadata (always public), data & documentation files (open or more restricted access)

- states data reuse rights (via licenses)

- uses standards to promote interoperability

Dataset record from the 4TU.ResearchData repository


How to share data?

• Publish a data paper

- extensive dataset description, published in a journal

- link to data deposited in repository

- paper and data are peer-reviewed

- cited like a traditional article

- format offered by regular journals (e.g. PLoS ONE), and dedicated data journals (e.g. Scientific Data)

http://www.nature.com/articles/sdata201697


Licensing research data

• Use licenses to make reuse rights clear

• Many repositories use international, standard (rather than bespoke) licenses

- Creative Commons Licenses

- Open Data Commons Licenses

- But less suitable for restricted data

• See EUDAT license selector for help with selecting an appropriate license

• More info in How to License Research Data (DCC)

Licenses conformant with Open Definition principles. From “Conformant Licenses” by Open Definition, licensed under CC-BY


Reusing data

• Find existing datasets via repositories’ data catalogues

• Check data reuse rights

- stated in licenses, contracts…

- see SURF’s guide on reusing research data

• Cite any datasets you reuse in your publications

- minimum elements:

Author, Publication Year, Title, Publisher, Location (usually PID + resolver service)

- more info in How to Cite Datasets (DCC)

Example

Benkman C (2016). Data from: Matching habitat choice in nomadic crossbills appears most pronounced when food is most limiting. Dryad Digital Repository. http://dx.doi.org/10.5061/dryad.dg41r
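The minimum citation elements listed above can be assembled mechanically. The sketch below reproduces the Dryad example from those five elements; the function name and the exact punctuation pattern are assumptions, and individual journals or repositories may prescribe a different citation style.

```python
def format_citation(author: str, year: int, title: str,
                    publisher: str, location: str) -> str:
    """Join the minimum dataset citation elements into one string."""
    return f"{author} ({year}). {title}. {publisher}. {location}"


citation = format_citation(
    author="Benkman C",
    year=2016,
    title=("Data from: Matching habitat choice in nomadic crossbills "
           "appears most pronounced when food is most limiting"),
    publisher="Dryad Digital Repository",
    location="http://dx.doi.org/10.5061/dryad.dg41r",
)
print(citation)
```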


“Building a culture of data citation” by ANDS & NCRIS, licensed under CC-BY


Activity

• Can you think of 3 arguments against data sharing (and possible counterarguments)?

• Can you think of 3 good arguments for data sharing?


Sharing data

Common arguments for not sharing

1. Impossible because of privacy or IP

2. Fear of being “scooped”

3. Fear of errors being exposed

4. Fear of misinterpretation or misuse

5. Too much effort/too costly

6. Data not of interest to anyone else

7. Lack of reward

Possible counterarguments

1. Valid concern, but not black & white

2. Sole use period is acceptable + you know your data best

3. Isn’t getting it right important?

4. Proper documentation should lead to proper understanding & use

5. If RDM is planned & integral part of research, it should save time & money, make your life easier

6. Think again! Difficult to predict, but may become essential for future research or teaching

7. Increasing recognition for data as valuable scientific output

Adapted from Corti et al. 2014


Sharing data

Common arguments for sharing

• Uncovers errors, fraud, irreproducible results

• Avoids duplication (greater “return on investment”)

• Public access to publicly funded research

• Citations

- advantage for publications with shared datasets (e.g. Piwowar et al. 2013)

- citations when others reuse your data

• Opportunities for new collaborations, co-authorships

• Advances science, accelerates discovery

• Returning the favour (already reusing other people’s data)

“Data Sharing” by Auke Herrema – Het Bouwteam, licensed under CC-BY


5. How will your data be preserved for long-term access & use?


Preserving data

• What happens to research data once a project is completed?

http://www.nature.com/news/scientists-losing-data-at-a-rapid-rate-1.14416


Preserving data

• Don’t keep everything (indefinitely)!

- do you need to keep all versions of your data/data from failed experiments/data that you won’t use again…?

• What to keep and how long? Select based on e.g.

- obligations to keep data (e.g. institutional/funder policies, legal obligations)

- what is needed to verify & validate your publications

- what cannot be recreated (e.g. data captured in real time) or is too expensive to recreate

- potential re-use value

- scientific, historical, cultural significance

- …

• Also see Checklist for Appraising Research Data (DCC)


Preserving data

• Don’t simply expect stored data to still be there after 10 years

• Keeping data files readable and usable over time requires appropriate strategies, e.g.

- prepare data for preservation (e.g. convert to sustainable file formats, create metadata)

- also keep appropriate documentation with the data

- move files to new storage hardware every 3 to 5 years

- backups are still necessary

- monitor for file corruption using checksums

“A Domesday system at the Vintage Computer Festival 2010, Bletchley, UK” by Regregex, licensed under CC-BY 3.0
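Monitoring for file corruption with checksums, as suggested above, means recording a checksum when a file is stored and recomputing it later: if the two values differ, the file has changed. The sketch below uses SHA-256 from the Python standard library; the function names and the demonstration file are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path: Path, recorded_checksum: str) -> bool:
    """Return True if the file still matches the checksum recorded earlier."""
    return sha256_of(path) == recorded_checksum


# Demonstration with a temporary file (hypothetical name and content).
with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "measurements.csv"
    data_file.write_text("id,value\n1,0.5\n")
    recorded = sha256_of(data_file)            # store this alongside the data
    intact = is_unchanged(data_file, recorded)
    data_file.write_text("id,value\n1,0.6\n")  # simulate a corrupted file
    corrupted = not is_unchanged(data_file, recorded)
print(intact, corrupted)
```

A checksum only detects that a file has changed; restoring it still depends on having an intact backup.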


If possible, outsource preservation

• For example, to a trusted external data repository

- suitable for publicly shareable data that need longer retention periods

- check explicit commitment to preservation (e.g. preservation policy, certificate, statement on how long data will be supported...)

• Confidential data may need to stay in-house

• See Where to Keep Research Data (DCC) for more details and options


Find a data repository via Re3data.org

https://www.openaire.eu/opendatapilot-repository


Things to consider when choosing a repository

• Does it

- provide a persistent & unique identifier to your dataset?

- provide a landing page for each dataset, with metadata?

- help you track usage (e.g. access & download statistics)?

- have a certificate to indicate trustworthiness (e.g. DSA)?

- match your data needs (e.g. your type of data are accepted)?

- meet legal requirements in terms of data protection and allowing reuse without unnecessary licensing conditions?

- provide guidance on how to cite data?

- charge for its services? (see https://www.openaire.eu/briefpaper-rdm-infonoads)

(adapted from https://www.openaire.eu/opendatapilot-repository)


Don’t forget non-digital materials

• E.g. biological materials (BCCM slides to be integrated)


Activity

• Can you find a repository suitable for your research data via Re3data.org?


Example – data preservation

“Raw data and documentation files will be offered for deposit to 4TU.ResearchData, which is a DSA-certified data repository

accepting research data in the field of engineering and preserves them for a

minimum of 15 years. Files will be offered in the repository’s preferred formats (.txt,

.xml and JCAMP), and as the volume of data does not exceed 10GB the repository

will not charge for the deposit.”

More on how to select a data repository: https://www.openaire.eu/opendatapilot-repository
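Before writing a statement like the example above, it can help to check your planned deposit against the repository's stated conditions. The sketch below is illustrative only: the function and variable names are hypothetical, the `.jdx` and `.dx` extensions are assumed here to stand for JCAMP files, and 10GB is read as decimal gigabytes. Always confirm the actual formats and limits with the repository itself.

```python
from pathlib import PurePath

# Preferred formats named in the example; .jdx/.dx are an assumption for JCAMP.
PREFERRED_EXTENSIONS = {".txt", ".xml", ".jdx", ".dx"}

# Free-deposit threshold from the example, assumed to mean decimal gigabytes.
FREE_DEPOSIT_LIMIT_BYTES = 10 * 1000**3


def check_deposit(files: dict[str, int]) -> tuple[list[str], bool]:
    """Check a planned deposit.

    `files` maps each file name to its size in bytes. Returns the files that
    are not in a preferred format, and whether the total volume stays within
    the free-deposit limit.
    """
    wrong_format = sorted(
        name
        for name in files
        if PurePath(name).suffix.lower() not in PREFERRED_EXTENSIONS
    )
    within_limit = sum(files.values()) <= FREE_DEPOSIT_LIMIT_BYTES
    return wrong_format, within_limit


# Hypothetical file inventory for a small project.
planned = {
    "readme.txt": 4_000,
    "spectra/run01.jdx": 2_500_000_000,
    "metadata.xml": 12_000,
    "raw/run01.xlsx": 800_000_000,
}

print(check_deposit(planned))  # (['raw/run01.xlsx'], True)
```

Here the spreadsheet would need converting to a preferred format before deposit, while the total volume stays under the threshold.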


Putting it all together: Create your own DMP


Use an online planning tool: DMPonline.be


DMPonline.be

• Local instance of open source software developed by DCC (UK)

• Launched as a pilot at UGent in 2015

• Now hosted on BELNET servers, and currently accessible for researchers from institutions within the DMPbelgium consortium


How the tool works

https://dmponline.be

Log in with:

- institutional credentials (BELNET Federation)

- local account

- ORCID (if profile linked to ORCID)


1. Viewing existing plans

Click the ‘View plans’ button to see the list of plans you have created, and/or plans that others have shared with you


2. Creating a new plan

• Select funder to get its template

• Select institution to get local guidance, as well as institutional template(s) if funder not applicable

• Choose additional optional guidance


2. Creating a new plan: answering questions

Click the ‘+’ sign to open up a section and see its questions


2. Creating a new plan: features

• Progress indicator

• Section

• Question

• Write down your answer here

• Leave a comment for collaborators

• Custom guidance from funder, university, group…


3. Sharing your plan

• Manage collaborators

• Add a collaborator by entering their email address

• Select permission level


4. Exporting a plan

• Select export format

• Adjust export settings as needed


5. Finding help

Click the ‘Help’ button for guidance


Further tips for writing a DMP

• check applicable data policies

• keep it simple, but be as specific as possible

• justify your decisions

• consider it a ‘living’ document

• have a look at example DMPs

• familiarize yourself with RDM terminology & best practices (for your field)


Example plans

• examples on the Digital Curation Centre (DCC) website

http://www.dcc.ac.uk/resources/data-management-plans/guidance-examples

• examples in the Zenodo repository

https://zenodo.org/search?page=1&size=20&q=data%20management%20plans

• public DMPs on the DMPTool website

https://dmptool.org/public_dmps

• DMPs published in RIO (Research Ideas and Outcomes OA journal)

http://riojournal.com/browse_user_collection_documents?collection_id=3


Online RDM training resources

• FOSTER training portal

• OpenAIRE webinars

• EUDAT training materials

• Digital Curation Centre How-to Guides & Checklists

• UK Data Archive ‘Create & Manage Data’ webpages

• MANTRA – Research Data Management Training

• ‘Research Data Management and Sharing’ MOOC on Coursera

• Data Management Training Clearinghouse


Thank you for listening!


Credits

• slides adapted from S. Jones (2016), ‘What is a Data Management Plan?’, licensed under CC-BY 4.0

• images [slide 2]: ‘Writing’ by Aiconica, licensed under CC0 1.0

[slide 3]: ‘Database’ by Jørgen Stamp, attribution: digitalbevaring.dk, licensed under CC-BY 2.5 DK

[slide 5]: From ‘Analyzing DMPs to inform Research Data Services’ by A. L. Whitmire, licensed under CC-BY 4.0

[slide 6]: ‘Day 10: Lost’ by Dave Hill, licensed under CC-BY-NC-SA 2.0; ‘A Domesday system at the Vintage Computer Festival 2010, Bletchley UK’ by Regregex, licensed under CC-BY 3.0

[slide 7]: ‘Publications and Data’ by Auke Herrema, licensed under CC-BY 4.0; T. H. Vines et al., ‘The availability of Research Data Declines Rapidly with Article Age’, Current Biology 24 (2014) 1: 94-97. http://doi.org/10.1016/j.cub.2013.11.014

[slide 8]: ‘How Science goes wrong’, The Economist, 19 Oct 2013. http://www.economist.com/printedition/2013-10-19; N.L. Yozwiak, ‘Data sharing: Make outbreak research open access’, Nature 518 (2015) 7540: 477-479. https://doi.org/10.1038/518477a

[slide 9]: From ‘RDM: An Overview’ by Research Support Team, IT Services (University of Oxford), licensed under CC-BY-NC-SA 4.0

[slide 10]: The European Code of Conduct for Research Integrity. Revised Edition (ALLEA – ALL European Academies, 2017).

[slide 12]: ‘Planning’ by Jørgen Stamp, attribution: digitalbevaring.dk, licensed under CC-BY 2.5 DK

[slide 15]: ‘Metadata’ by Jørgen Stamp, attribution: digitalbevaring.dk, licensed under CC-BY 2.5 DK

[slide 19]: V. Van den Eynden et al., Managing and Sharing Data. Best practice for Researchers (UK Data Archive, 2009), licensed under CC-BY-NC-SA 3.0

[slide 20]: ‘Preservation plan’ by Jørgen Stamp, attribution: digitalbevaring.dk, licensed under CC-BY 2.5 DK

[slide 22]: ‘Tools’ by Jørgen Stamp, attribution: digitalbevaring.dk, licensed under CC-BY 2.5 DK

[slide 42]: ‘Knowledge’ by Jørgen Stamp, attribution: digitalbevaring.dk, licensed under CC-BY 2.5 DK

[slides 24-40]: Screenshots from dmponline.be and dmponline.kuleuven.be


• Slide 8: “Data Ocean” by Auke Herrema – Het Bouwteam, licensed under CC-BY

• Slide 11: https://www.deic.dk/sites/default/files/uploads/PDF/SUND_policy_for_research_data_management.pdf

• Slide 16: https://www.freepik.com/free-vector/file-formats-collection_801555.htm#term=file%20formats&page=1&position=1

• Slide 18: “Day 10: Lost” by Dave Hill, licensed under CC-BY-NC-SA 2.0

• Slide 19: From “Research Data Management: An Overview - 2014-05-12” by Research Support Team, IT Services, University of Oxford, licensed under CC-BY-NC-SA 4.0

• Slide 26: from PRE_RDMWorkshopSTEM_V3_20161206 (Cambridge/Marta Teperek)

• Slide 28: From “Introduction to Research Data Management” by A. Whitmire & S. Van Tuyl, licensed under CC-BY

• Slide 30: From “Data Handling: Documentation, Organization and Storage” by Sebastian Netscher, licensed under CC-BY

• Slide 31: From “Research Data Management: An Overview - 2014-05-12” by Research Support Team, IT Services, University of Oxford, licensed under CC-BY-NC-SA 4.0

• Slide 32: “Loss of Data” by Auke Herrema – Het Bouwteam, licensed under CC-BY

• Slide 37: http://www.thebluediamondgallery.com/wooden-tile/p/privacy.html

• Slide 38: https://commons.wikimedia.org/wiki/File:Lorenzo_Federici_2.jpg


Acknowledgements

This presentation draws heavily on and/or adapts materials from the following sources:

K. Briney (2015), Data Management for Researchers: Organize, Maintain and Share your Data for Research Success (Pelagic Pub Ltd).

L. Corti, V. Van den Eynden & M. Woollard (2014), Managing and Sharing Research Data. A Guide to Good Practice (Sage).

S. Jones, “Research Data Management”, University of East London, May 1, 2013. Licensed under CC-BY.

T. Ross-Hellauer & S. Jones, “Research Data Management: An Introductory Webinar from OpenAIRE and EUDAT”. Licensed under CC-BY 4.0.

S. Netscher, “Data Handling: Documentation, Organization and Storage”, GESIS-Leibniz Institute for the Social Sciences 2015. Licensed under CC-BY 4.0.
