ICPSR Data Management

ICPSR AT 50: Facilitating Research and Data Sharing. Part III: Data Management. IASSIST, Vancouver, BC, May 31, 2011

description

This is Part III of a workshop presented by ICPSR at IASSIST 2011. This section focuses on data management including data management plans, secure computing environments, and restricted data contract management.

Transcript of ICPSR Data Management

Page 1: ICPSR Data Management

ICPSR AT 50: Facilitating Research and Data Sharing

Part III: Data Management

IASSIST Vancouver, BC

May 31, 2011

Page 2: ICPSR Data Management

Data Management begins at 11:45

Page 3: ICPSR Data Management

Data Management Agenda

• Data Management Plans

• Computing & Data Sharing in Secure Environments

• Managing Restricted Contracts

Page 4: ICPSR Data Management

The Statement Heard Round the Research World:

• The National Science Foundation has released a new requirement for proposal submissions regarding the management of data generated using NSF support. Starting in January 2011, all proposals must include a data management plan (DMP).

• The plan should be short, no more than two pages, and will be submitted as a supplementary document. The plan will need to address two main topics:

– What data are generated by your research?

– What is your plan for managing the data?

Page 5: ICPSR Data Management

Data Management in Demand

ICPSR conducts webinars on data management plans:

• November 8, 2010: 134 attend

• January 12, 2011: 535 attend

• February 17, 2011: 71 attend

Page 6: ICPSR Data Management

ICPSR’s DMP Web Site

www.icpsr.umich.edu/ICPSR/dmp/

Page 7: ICPSR Data Management

Guidelines for Download

Page 8: ICPSR Data Management

ICPSR’s DMP Blog - FAQs

http://datamanagementplans.blogspot.com/

Page 9: ICPSR Data Management

ICPSR’s DMP Statistics

• January 2011: 3,984 views

• January – April 2011: 7,802 views

• Where are they coming from?

– 5,527 Direct (bookmarked, etc.)

– 3,370 from Google search

– 878 from NSF

Page 10: ICPSR Data Management

Improving Data Management

• Potential increase in demand for data management services as a result of grant/contract requirements

• Increase in demand for processing, analysis, and distribution of sensitive data

• Both trends prompted improvements focused on secure computing and data sharing environments at ICPSR

Page 11: ICPSR Data Management

Three Angles of Security

• Secure Ingest

• Secure Computing in the Cloud

• Secure Online Application & Tracking

Page 12: ICPSR Data Management

ICPSR Secure Data Services

We'd tell you more, but then we'd have to kill you.

Page 13: ICPSR Data Management

Two services; one platform

Secure Data Environment

• Serves ICPSR staff

• Protects against accidental data leakage

• Uses firewalls, virtualized workstations to access content

• Keeps the bad guys out

Virtual Data Enclave

• Serves ICPSR users

• Protects against accidental data leakage

• Uses firewalls, virtualized workstations to access content

• Keeps the bad guys out

Page 14: ICPSR Data Management

One technology platform to rule them all

Page 15: ICPSR Data Management

Technology components

• Needed to stand up the services quickly and with little working capital for investment

• Selected a strategy of investing in storage, and "renting" access and security services

• EMC NS 120 Network Attached Storage device

• University of Michigan "desktop virtualization" product, the Virtual Desktop Infrastructure (VDI) service

• University of Michigan "firewall virtualization" product, the Virtual Firewall service

Page 16: ICPSR Data Management

EMC NAS

• Leverages existing infrastructure at ICPSR and experience with EMC products

• Two NAS units (NS 120 model)

o Private NAS - home to all secure data

o Semi-Private NAS - home to all other content, such as web site content, downloadable files, etc.

• Each unit is attached to a different virtual network (VLAN); more on this later

Page 17: ICPSR Data Management

Staff install EMC fiber-channel-attached storage

Page 18: ICPSR Data Management

Virtual Desktop Infrastructure Service

• University of Michigan service

o Information Technology Services is the provider

o Virtualization as a Service (VaaS)

• ICPSR was a pilot user

• Enables access to content on the Private NAS via virtualized environment

o Easier to update

o Easier to secure

o Enables more secure remote access

• Uses the UMich Active Directory system for authentication, authorization, and accounting

• Priced comparably to Amazon's cloud (EC2)

Page 19: ICPSR Data Management

Staff access secure data through the SDE

Page 20: ICPSR Data Management

Network topology

• Former network topology was flat; every device had a routable IP address

• New topology is highly segmented; seven VLANs

• Physical systems - three VLANs

o Public

o Semi-Public

o Private

• Virtual systems - four VLANs

o SDE

o VDE

o Summer Program virtual lab

o Web site testing
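
As an illustration only, the seven-VLAN layout listed above could be written down as a small lookup table. The identifiers `VLANS` and `vlans_by_kind` are hypothetical, and the slides give no VLAN IDs or addressing, so none are shown.

```python
# Hypothetical model of the segmented topology described on this slide.
# VLAN names come from the slide; the data structure is an assumption.
VLANS = {
    "Public": "physical",
    "Semi-Public": "physical",
    "Private": "physical",
    "SDE": "virtual",
    "VDE": "virtual",
    "Summer Program virtual lab": "virtual",
    "Web site testing": "virtual",
}


def vlans_by_kind(kind):
    """Return the VLAN names that host the given kind of system."""
    return [name for name, k in VLANS.items() if k == kind]
```

Calling `vlans_by_kind("physical")` returns the three VLANs for physical systems, and `vlans_by_kind("virtual")` returns the four for virtual systems.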

Page 21: ICPSR Data Management
Page 22: ICPSR Data Management

Secure Data Environment

• Content enters via our Deposit System

• Content exits via one of two mechanisms

o turnover for content entering Archival Storage and/or Dissemination systems

o data airlock for other stuff

• Both exit points can be monitored, controlled, reviewed, audited, etc.

• Technology and strategic direction may be moving faster than culture
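
Because content can leave the SDE only through the two exit points named above, each exit can be captured as an audit record. The following is a minimal sketch under that assumption; `ExitEvent` and `record_exit` are hypothetical names, and the slides do not describe how ICPSR actually logs or reviews these events.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# The two exit mechanisms named on the slide.
EXIT_MECHANISMS = {"turnover", "data airlock"}


@dataclass
class ExitEvent:
    """Hypothetical audit record for content leaving the SDE."""
    item: str
    mechanism: str
    staff_id: str
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def record_exit(log, item, mechanism, staff_id):
    """Append an audit entry, rejecting any unknown exit path."""
    if mechanism not in EXIT_MECHANISMS:
        raise ValueError(f"unknown exit mechanism: {mechanism}")
    event = ExitEvent(item, mechanism, staff_id)
    log.append(event)
    return event
```

Rejecting any mechanism outside the known set mirrors the idea that there are exactly two controlled ways out, so every exit leaves a reviewable trace.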

Page 23: ICPSR Data Management

Staff react to new restrictions

Page 24: ICPSR Data Management

Virtual Data Enclave

• Not suitable for "enclave-only" data

• Highly suitable for data ordinarily shared via a restricted-use agreement

o Alternative to shipping out sensitive data on removable media and hoping that nothing goes wrong

• Does shift cost burden (virtual workstation, storage) and risk burden (data security) from data analyst to data provider

o Who pays?

o How?

Page 25: ICPSR Data Management

I have used the ICPSR VDE, and it is fantastic.

Oz Noori - Detroit 1-8-7 

This is a paid celebratory endorsement

Page 26: ICPSR Data Management

Restricted Use Contracting System (RCS)

Purpose

• Enables internal data processors to set up contracts for restricted data, with terms of use and contract behavior preferences

• Enables end-users to apply for restricted data online & track progress

• Enables ICPSR user support to manage contracts and track end-users
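
The progress tracking described above can be pictured as an application moving through an ordered list of stages. This is only a sketch: the stage names in `STAGES` and the `Application` class are hypothetical, since the slides do not list the actual steps or describe the RCS implementation.

```python
# Hypothetical application stages; the slides do not list the real ones.
STAGES = ["submitted", "under review", "approved", "data released"]


class Application:
    """Tracks one end-user's request for a restricted dataset."""

    def __init__(self, applicant, dataset):
        self.applicant = applicant
        self.dataset = dataset
        self.history = [STAGES[0]]  # every application starts as submitted

    @property
    def status(self):
        """Current stage, i.e. the most recent entry in the history."""
        return self.history[-1]

    def advance(self):
        """Move to the next stage and return the new status."""
        i = STAGES.index(self.status)
        if i + 1 >= len(STAGES):
            raise ValueError("application is already at the final stage")
        self.history.append(STAGES[i + 1])
        return self.status
```

Keeping the full `history` rather than only the current status lets both the applicant and user support staff see how a request has progressed.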

Page 27: ICPSR Data Management

Overview of ICPSR’s RCS

Page 28: ICPSR Data Management

Application Steps

Page 29: ICPSR Data Management

50 Years of Research Data

• Data Exploration

• Data Sharing

• Data Management

Page 30: ICPSR Data Management

Presenter Contact Information

• Peter Granda – [email protected]

• Linda Detterman – [email protected]

• Sanda Ionescu – [email protected]

• Elizabeth Moss – [email protected]

• Steve Burling – [email protected]

Page 31: ICPSR Data Management

Enjoy Vancouver & IASSIST 2011!