Exploring Scientists’ Research Data Management Practices and Perspectives

USING A MIXED-METHODS RESEARCH APPROACH VIA AN ADAPTED DATA ASSET FRAMEWORK (DAF) METHODOLOGY

Plato L. Smith II, FSU CCI – School of Information, Florida’s iSchool

University of Maryland’s iSchool Lecture, February 20, 2014

description

This presentation was delivered to the University of Maryland iSchool via Skype as part of a campus interview for an assistant professor position. It includes some results, conclusions, and recommendations for funding stemming from my dissertation research.

Transcript of Exploring Scientists’ Research Data Management Practices and Perspectives

Page 1: Exploring Scientists’ Research Data Management Practices and Perspectives

USING A MIXED-METHODS RESEARCH APPROACH VIA AN ADAPTED DATA ASSET FRAMEWORK (DAF) METHODOLOGY

Exploring Scientists’ Research Data Management Practices and Perspectives

Plato L. Smith II, FSU CCI – School of Information, Florida’s iSchool

University of Maryland’s iSchool Lecture

February 20, 2014

Page 2: Exploring Scientists’ Research Data Management Practices and Perspectives

Table of Contents

1. Background & Significance
2. Research Purpose
3. Research Questions
4. Research Opportunity
5. Research Design & Methodology
6. Target Population & Purposive Sampling
7. Findings
8. Implications
9. Conclusions

Page 3: Exploring Scientists’ Research Data Management Practices and Perspectives

Background & Significance

A. Data management and curation (DMC) is a research data management (RDM) concept that comprises four key concepts: (1) data management planning, (2) data curation, (3) digital curation, and (4) digital preservation. These concepts focus on the lifecycle management of data.

B. These key RDM concepts have sometimes been expressed as competing models and frameworks in the literature and in practice, leaving theory underdeveloped.

C. This project seeks to combine two data curation models into a DMC Framework while adapting a conceptual framework to explore research data management within and across multiple scientific disciplines.

Page 4: Exploring Scientists’ Research Data Management Practices and Perspectives

Background & Significance – Data Management & Curation (DMC) Framework

• Metadata
• Archived data
• Level 2 Curation

• Trusted repository
• Technical & strategic storage actions
• Level 3 Curation

• Data creation
• Representation over lifecycle
• Level 1 Curation

• DMP (i.e., NSF)
• RDM Policies
• DCC Curation Lifecycle Model

Data Management Planning

Data Curation

Digital Curation

Digital Preservation

Page 5: Exploring Scientists’ Research Data Management Practices and Perspectives

Research Purpose

The purpose of this research project is to investigate scientists’ current DMC practices across multiple disciplines and explore opportunities for improving data management activities where applicable.

Page 6: Exploring Scientists’ Research Data Management Practices and Perspectives

Research Questions

① What types of data do scientists create?

② How do scientists manage, store, and preserve research data?

③ What are some of the types of theories, practices, or methods disciplines use in research data management?

④ How can multiple disciplines’ perspectives on data management and curation (DMC) practices within and across disciplinary domains contribute to building underdeveloped DMC theory?

Page 7: Exploring Scientists’ Research Data Management Practices and Perspectives

Research Opportunity

What does this research want to discover?
• Investigate how scientists manage, store, & preserve research data

Why are the research questions important?
• Address funding agencies’ data management requirements
• Educate, articulate, and promote scientists’ need for improved DMC

How is this research going to answer the research questions?
• Discover, map, and correlate data management synergies across disciplines
• Introduce/share data management concepts & models across disciplines

Page 8: Exploring Scientists’ Research Data Management Practices and Perspectives

Research Design & Methodology

DAF Survey – Phase 1
• Qualtrics survey
• 25 questions
• DMC – data management & curation practices

DAF Interview – Phase 2
• Online (45 Qs)
• Exemplar project
• DMC experiences, paradigms, & perspectives

Findings – Conclusions
• Descriptive
• QUAN → qual
• Findings
• Explain results
• Future Research

Page 9: Exploring Scientists’ Research Data Management Practices and Perspectives

Target Population & Purposive Sampling

Data Asset Framework (DAF)

                   Surveys    Interviews
Starts             129        7
Completes          107        6
Completion Rate    83%        86%

Six research labs at Florida State University

Multidisciplinary (52%)
Interdisciplinary (25%)
Other (23%)

Multiple Disciplinary Domains

National Science Foundation (NSF) EarthCube project
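The completion rates reported on this slide follow from the start and complete counts. A minimal Python sketch of that arithmetic is given here; the function name `completion_rate` is illustrative and does not come from the study, and rounding to a whole percentage is an assumption about how the slide's figures were produced.

```python
def completion_rate(starts: int, completes: int) -> int:
    """Return completes/starts as a whole-number percentage."""
    if starts <= 0:
        raise ValueError("starts must be positive")
    return round(100 * completes / starts)


# Counts taken from the slide: 129 survey starts, 107 completes;
# 7 interview starts, 6 completes.
survey_rate = completion_rate(129, 107)
interview_rate = completion_rate(7, 6)
print(f"Survey: {survey_rate}%  Interview: {interview_rate}%")
```

Under that rounding assumption, the computed values agree with the 83% and 86% shown on the slide.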

Page 10: Exploring Scientists’ Research Data Management Practices and Perspectives

Target Population & Purposive Sampling – Multiple Disciplinary Domains

① Biology and Oceanography

② Boundary-layer Meteorology and Biogeochemical cycles of water and carbon

③ Computer Science

④ Condensed Matter Physics

⑤ Marine Ecology, Fisheries Science

⑥ Materials Science & Physics

⑦ Meteorology

Page 11: Exploring Scientists’ Research Data Management Practices and Perspectives

Findings – DAF Survey (About You)

What is your primary research role?

Senior Researcher: 23
Principal Investigator: 29
Research Assistant: 26
Research Technician: 3
Research Support: 3
Research Student: 10
Other: 7

Page 12: Exploring Scientists’ Research Data Management Practices and Perspectives

Findings – DAF Survey (General Data Management)

What is the data type of your primary data?

Primary Data Type
No primary data: 3
Computer code: 48
Derived: 58
Experimental: 74
Observational: 42
Reference: 26
Other: 2

Page 13: Exploring Scientists’ Research Data Management Practices and Perspectives

Findings – DAF Survey (General Data Management)

What is the data type of your secondary data?

[Bar chart: Secondary Data Type. Legible category labels include audio tapes, data (computer), digital audio files, Excel sheets, images/scans/photos, MS Access, MS Word, SPSS/statistical files, and websites; the counts cannot be matched to categories in this transcript.]

Page 14: Exploring Scientists’ Research Data Management Practices and Perspectives

Findings – DAF Survey (General Data Management)

Where do you store your data (excluding backup copies)? [Select all that apply]

[Bar chart; category labels were not captured in this transcript.]

Page 15: Exploring Scientists’ Research Data Management Practices and Perspectives

Findings – DAF Survey (Barriers to Research Data Management)

What are some barriers for you with regards to managing and storing your research data?

Budget/funding: 22%
Infrastructure/resources: 31%
Stakeholders: 8%
Storage/technology: 25%
Other: 13%

Page 16: Exploring Scientists’ Research Data Management Practices and Perspectives

Findings – DAF Survey (Barriers to Research Data Management)

Which of the following data management issues have you experienced? [Please select all that apply]

RDM Issues
Finding files/folder structure: 50
Locating where data files are stored: 54
Non standard file formats: 29
Legal issues arising from transfer of data: 6
Problems establishing ownership of data: 4
Finding or accessing research data: 45
Security and protection of files: 18
Other: 6

Page 17: Exploring Scientists’ Research Data Management Practices and Perspectives

Findings – DAF Survey (Your Data Assets)

Who is responsible for managing your research data (select all that apply)?

RDM Responsibility
Project manager: 20
Research assistant: 17
Research groups: 16
National data center: 8
You: 70
Other: 6

Page 18: Exploring Scientists’ Research Data Management Practices and Perspectives

Findings – DAF Survey (Your Data Assets)

What is the estimated amount of electronic research data you currently hold/maintain?

< 1 Gigabyte: 9%
1 - 50 Gigabyte: 30%
50 - 100 Gigabyte: 10%
100 - 500 Gigabyte: 9%
500 Gigabyte - 1 Terabyte: 7%
1 - 50 Terabyte: 22%
50 - 100 Terabyte: 2%
100 Terabyte - 1 Petabyte: 1%
Don't know: 8%

Page 19: Exploring Scientists’ Research Data Management Practices and Perspectives

Findings – DAF Interviews (Q7)

The DCC Curation Lifecycle Model lists some major stages in data management that encompass the four key concepts of data management and curation (DMC).

The DCC Curation Lifecycle Model (DCC, 2007/2014)

Page 20: Exploring Scientists’ Research Data Management Practices and Perspectives

Findings – DAF Interviews (Q8)

1. Level 1 Curation – traditional academic information flow

2. Level 2 Curation – information flow with data archiving

3. Level 3 Curation – information flow with data curation (Lord & Macdonald, 2003)

Level 3 Curation – information flow with data curation (Lord & Macdonald, 2003, p. 45)

Page 21: Exploring Scientists’ Research Data Management Practices and Perspectives

Findings – DAF Interviews (Q9)

“The framework reflects the basic philosophical presuppositions or metatheoretical assumptions underlying scientific inquiry” (Solem, 1993, p. 595).

Burrell & Morgan (1979); Morgan & Smircich (1980); Morgan (1983); Solem (1993, p. 595); Smith II (2013)

Page 22: Exploring Scientists’ Research Data Management Practices and Perspectives

Findings – DAF Interviews (How does your discipline look at and understand reality?)

① “Mostly geological based: all measurements are of a physical reality.” – P1

② “Most of the data in my domain are spatio-temporally organized.” – P2

③ “It analyzes reality by making observations…” – P3

④ “In meteorology, we seek patterns in a chaotic system. Through organization and classification, patterns emerge that subsequently support understanding of underlying physical relationships…” – P4

⑤ “Physics is the study of reality so a supposition that there is an objective reality is the core of the discipline.” – P5

⑥ “We use experimental methods to reveal the true reality.” – P6

Page 23: Exploring Scientists’ Research Data Management Practices and Perspectives

Findings – DAF Interviews (How does your discipline learn about reality?)

① “I don’t participate in this.” – P1

② “My discipline uses numerical models, field sampling and controlled experiments to learn about reality.” – P2

③ “I guess from the sensors it employs.” – P3

④ “Primarily through observation and modeling. Meteorology is based on the physical observation of our world. Through observation patterns emerge that support the development of conceptual models for atmospheric systems…” – P4

⑤ “Experiments and observations.” – P5

⑥ “Via carefully controlled experiment.” – P6

Page 24: Exploring Scientists’ Research Data Management Practices and Perspectives

Implications – Boyer’s Model of Scholarship (Nibert, 2008)

Discovery
Integration
Application
Teaching

Research
Practical
Societal

Page 25: Exploring Scientists’ Research Data Management Practices and Perspectives

Implications

INTEGRATION
• Facilitate the use & interpretation of research data management practices across disciplines
• Broaden literature review contribution & application
• Collaborate on core or RDM special topics course design & delivery

DISCOVERY
• Improve data description, representation, & publication
• Allow new research based on accessible & discoverable data
• Promote current & future use of data

Page 26: Exploring Scientists’ Research Data Management Practices and Perspectives

Implications

TEACHING
• Explore multiple disciplines’ teaching models & practices
• Advance DMC theory development via classroom research, teaching, & learning
• Mentor faculty, post docs, students & professionals
• Design, implement, & assess program success metrics

APPLICATION
• Enable the profession and society to address data management plan issues
• Facilitate funding agency data management requirements
• Serve as RDM model for organizations & students

Page 27: Exploring Scientists’ Research Data Management Practices and Perspectives

Conclusions

FUTURE RESEARCH, TEACHING, & LEARNING
• Stimulate community, departmental, campus & consortium collaborations
• Promote creativity, project outputs, intellectual merits & broader impacts
• Advance diversity, curriculum development, special topics, & profession

RESEARCH PROJECT LEARNING OUTCOMES
• Research data management (RDM) practices vary across departments & disciplines
• Organizational & social structures impact RDM
• Scientists’ data management practices exhibited integration, differentiation, & fragmentation perspectives (Martin, 1992)

Page 28: Exploring Scientists’ Research Data Management Practices and Perspectives

Potential Grant Ideas for UMD iSchool

① National Science Foundation (NSF) Alliances for Graduate Education and the Professoriate (14-505)

② NSF Advanced Cyberinfrastructure (ACI) & Faculty Early Career Development (CAREER) (13-092)

③ NSF Advancing Digital Biological Collections (13-569)

④ NSF EarthCube (13-529)

⑤ NSF Research Coordination Networks (RCN) (13-520)

⑥ NSF Grant Opportunities for Academic Liaison with Industry (GOALI) (12-513)

⑦ NEH Digital Humanities Advanced Topics

Page 29: Exploring Scientists’ Research Data Management Practices and Perspectives

ACKNOWLEDGEMENTS

Dr. Paul Marty (FSU), Dr. Diane Barlow, Dr. John Bertot, Diane Travis, Dr. Katy Lawley & David Baugh (UMD), UMD iSchool Search Committee & UMD Faculty, Students, & Staff

THANK YOU!

Questions, comments, and/or feedback.

[email protected] | http://platosmith.com