Management of Data Collections

Data Collections Bernadette Duffy and Abraham de Jesus LIBR 580 Louise Broadley October 5, 2011
Date posted: 18-Oct-2014

Transcript of Management of Data Collections

Page 1: Management of Data Collections

Data Collections

Bernadette Duffy and Abraham de Jesus

LIBR 580

Louise Broadley

October 5, 2011

Page 2: Management of Data Collections

What are Data Collections?

• Data from surveys, opinion polls, climate data

• Numeric data in machine-readable form

• To make use of the data files, users need codebooks and other supporting files
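To illustrate why a codebook matters, here is a minimal Python sketch of reading a fixed-width data file. The variable names, column positions, and sample records are hypothetical and do not come from any real data set; an actual codebook from a supplier would define its own layout.

```python
import csv
import io

# Hypothetical codebook: variable name -> (start column, end column),
# 1-indexed and inclusive, as column positions are often listed in codebooks.
CODEBOOK = {
    "respondent_id": (1, 4),
    "age": (5, 6),
    "region": (7, 7),
}


def parse_record(line, codebook):
    """Slice one fixed-width line into named fields using the codebook."""
    return {
        name: line[start - 1:end].strip()
        for name, (start, end) in codebook.items()
    }


def fixed_width_to_csv(lines, codebook):
    """Convert fixed-width records to CSV text with a header row."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(codebook), lineterminator="\n")
    writer.writeheader()
    for line in lines:
        if line.strip():  # skip blank lines
            writer.writerow(parse_record(line, codebook))
    return out.getvalue()


if __name__ == "__main__":
    raw_records = ["0001341", "0002572"]  # made-up sample records
    print(fixed_width_to_csv(raw_records, CODEBOOK))
```

Without the codebook, a record such as "0001341" is just a string of digits; with it, the same record becomes a respondent ID, an age, and a region code.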

Page 3: Management of Data Collections

Data Lifecycle, from DataONE: https://www.dataone.org/content/education

Page 4: Management of Data Collections

Libraries and Data Collections

• Important in academic and special libraries

• Used by researchers and policy analysts

• Academic libraries are starting to get involved in preserving research data from their own institutions

Page 5: Management of Data Collections

UBC Library Data Services: http://data.library.ubc.ca/

Page 6: Management of Data Collections

Data suppliers - UBC

• Statistics Canada (http://www.statcan.gc.ca/): Canadian Census, labour, health, income, trade

• The Roper Center for Public Opinion Research at the University of Connecticut (http://www.ropercenter.uconn.edu/): opinion polls

• Inter-university Consortium for Political and Social Research (ICPSR) at the University of Michigan (http://data.library.ubc.ca/gen/icpsr.html): social sciences data

Page 7: Management of Data Collections

abacus

Page 8: Management of Data Collections

abacus - data set Part 1

Page 9: Management of Data Collections

abacus - data set Part 2

Page 10: Management of Data Collections

Data file

Page 11: Management of Data Collections

Challenge - Cost

Strategies to reduce cost for subscription data sets

• Collaborative purchase with several departments (UC Berkeley)

• University consortium (UBC, SFU, UVic, and UNBC combined to form the BC Research Libraries’ Data Services consortium – abacus: http://abacus.library.ubc.ca/)

Page 12: Management of Data Collections

Challenge - Selection

Decisions are based on:

• Collection policy

• Knowledge of what is available

• Understanding user need

• Cost

• Individual patron need

• Whether the data would be useful to multiple users

Page 13: Management of Data Collections

Challenge - Supporting Access

• Make visible in Library Catalogue

• Convert file formats for use in statistical programs

• Outreach / education in use of data collection and statistical tools

• Workshops on data literacy

• Create a Data Lab

• Become embedded in courses requiring use of data collections
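As one example of the format conversion mentioned above, the following Python sketch turns tab-delimited text into CSV using only the standard library. The function name and the sample data are illustrative assumptions; real conversions between statistical package formats usually need dedicated tools.

```python
import csv
import io


def tab_to_csv(tab_text):
    """Re-encode tab-delimited text as comma-separated text.

    Fields that contain commas are quoted automatically by csv.writer.
    """
    reader = csv.reader(io.StringIO(tab_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


if __name__ == "__main__":
    sample = "id\tname\n1\tSmith, J\n"  # made-up sample data
    print(tab_to_csv(sample))
```

Using the csv module rather than a plain string replace keeps values such as "Smith, J" intact as a single field.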

Page 14: Management of Data Collections

Infrastructure

• Data sets can be highly variable in size.

• This creates certain infrastructural challenges for storage, the institution’s systems, and the institution itself.

Page 15: Management of Data Collections

Storage

• Scalability: “the ability of a system, network, or process, to handle growing amounts of work in a graceful manner or its ability to be enlarged to accommodate that growth.” (Wikipedia)

• Location: Does your institution expect to host the data produced by researchers at that institution?

Page 16: Management of Data Collections

Systems Support

• Network: Can the network handle downloading of large datasets?

• Hardware: Can the systems support computation over disparate data sets?

• Software: Do you have statistical programs (like SPSS or R) available for your users?

• Flexibility: Can your system handle the wide variety of data formats, sizes, and uses?

• Example of a good system: http://www.devinfo.info/genderinfo/
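For the network question above, a rough back-of-the-envelope estimate can help. This Python sketch computes an idealized download time from file size and link speed; the function name, the efficiency parameter, and the example figures are assumptions for illustration, and real transfers are typically slower because of protocol overhead and shared links.

```python
def transfer_seconds(size_bytes, bandwidth_bits_per_second, efficiency=1.0):
    """Estimate the seconds needed to move size_bytes over a link.

    efficiency is the assumed fraction of nominal bandwidth actually
    achieved (1.0 = ideal link, 0.5 = half of the nominal speed).
    """
    if bandwidth_bits_per_second <= 0:
        raise ValueError("bandwidth must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    # 8 bits per byte; divide total bits by usable bits per second.
    return size_bytes * 8 / (bandwidth_bits_per_second * efficiency)


if __name__ == "__main__":
    one_gigabyte = 10**9           # hypothetical data set size
    hundred_megabit = 100 * 10**6  # hypothetical campus link speed
    print(transfer_seconds(one_gigabyte, hundred_megabit))
```

Running the same calculation for the largest data sets a library expects to host gives a quick sense of whether on-demand downloading is practical.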

Page 17: Management of Data Collections

UN Gender Info

Page 18: Management of Data Collections

Institutional Support

• Workflows: Can your data collections be integrated into the larger collections management framework?

• Faculty Partnerships: Will faculty work with the library to create data management plans?

• Mandate: Does your institution consider data collections a priority?

Page 19: Management of Data Collections

Preservation

• Best practices for data preservation mean that preservation concerns enter in at the earliest point in the data management cycle: creation.

Page 20: Management of Data Collections

Criteria for Preservation

• Obligation

• Value

• Uniqueness

• Verification

• Other Cultural Reasons

Page 21: Management of Data Collections

Metadata

• Plagued by a lack of standards.

• No international metadata standard for data sets.

• Needs to give enough context for the data to be understandable.

• No clear citation practice has emerged for data sets.

• Data Documentation Initiative (DDI)
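One way to picture "enough context" is a simple completeness check on a metadata record. The Python sketch below is only an illustration: the field names are hypothetical, chosen for this example, and are not taken from DDI or any other standard.

```python
# Hypothetical context fields; a real project would take these from
# whatever metadata scheme the institution adopts.
REQUIRED_FIELDS = ("title", "creator", "date_collected", "methodology", "variables")


def missing_context(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or empty in record."""
    return [field for field in required if not record.get(field)]


if __name__ == "__main__":
    record = {
        "title": "Example Opinion Survey",  # made-up record
        "creator": "Example Research Group",
        "variables": [],
    }
    print(missing_context(record))
```

A record that reports missing fields is a prompt to ask the data producer for more documentation before the data set is added to the collection.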

Page 22: Management of Data Collections

Wrap-Up

• What is a data collection? A collection of the data resulting from research.

• They have unique challenges for selection, access, infrastructure, and preservation.

• Data curation is an up-and-coming field in librarianship.

• Librarians are uniquely poised to be involved in the recent surge of interest in data.