Magle data curation in libraries
Transcript of Magle data curation in libraries
Why would I let a librarian anywhere near my research data?
C. Tobin Magle, Biomedical Sciences Research Support Specialist
http://www.slideshare.net/CTobinMagle/magle-data-curation-in-libraries
Updates from the library
• 24/7 badge access
• Website refresh
• Research support pages
• Free Interlibrary loan for all!
• New off campus login
• Water bottle filling stations
Questions
• Why should I care about data management?
• What do libraries have to do with it?
• What is Tobin up to in this area?
Why should I care about data management?
Rinehart, AK. “Getting emotional about data” College & Research Libraries News September 2015 vol. 76 no. 8 437-440
Anonymous researcher’s submission to “Day of Data” at Brown University
But, more people are doing research
https://nexus.od.nih.gov/all/2012/06/27/what-weve-learned-about-graduate-students/
There is more data than ever before
See arXiv:1402.4578 for details
17% of data is lost per year after publication
doi:10.1016/j.cub.2013.11.014
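To get a feel for what a 17% annual loss means, the sketch below compounds that rate over several years. It assumes, for illustration only, that the slide's figure applies as a constant fraction lost each year; the function name is hypothetical, and the cited paper should be consulted for how the rate was actually measured.

```python
def fraction_remaining(annual_loss: float, years: int) -> float:
    """Fraction of datasets still available after `years`,
    assuming a constant annual loss rate (an illustrative assumption)."""
    return (1.0 - annual_loss) ** years


# Compound the slide's 17% figure over a few time horizons.
for years in (1, 5, 10, 20):
    share = fraction_remaining(0.17, years)
    print(f"{years:>2} years after publication: {share:.1%} remaining")
```

Under this simplified model, losses that look modest in any single year add up quickly, which is the motivation for curating data early.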
The majority of research data aren’t curated
doi:10.1353/lib.0.0036
<22% of NIH grants require a Data Sharing Plan
Thus, we are losing a lot of data that could be repurposed
Research funding is tight
From: The Anatomy of Medical Research: US and International Comparisons. JAMA. 2015;313(2):174-189. doi:10.1001/jama.2014.15939
[Figure: research funding by source: NIH, Pharma, Med. Device Companies, Biotech, State/local, Private funds, Other Fed.]
Funders want to do more with less
Hence, data sharing
http://figshare.com/blog/2015_The_year_of_open_data_mandates/143
NSF post-award requirements
“Investigators are expected to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants. Grantees are expected to encourage and facilitate such sharing.”
http://www.nsf.gov/pubs/policydocs/pappguide/nsf11001/aag_6.jsp#VID4
2003 NIH data sharing policy
“The NIH endorses the sharing of final research data to serve these and other important scientific goals. The NIH expects and supports the timely release and sharing of final research data from NIH-supported studies for use by other researchers.” http://grants.nih.gov/grants/guide/notice-files/NOT-OD-03-032.html
NIH Genomic Data Sharing (2014)
“To promote robust sharing of human and non-human data from a wide range of genomic research and to provide appropriate protections for research involving human data, the National Institutes of Health (NIH) issued the NIH Genomic Data Sharing Policy (GDS Policy) on August 27, 2014 in the NIH Guide Grants and Contracts.”
http://grants.nih.gov/grants/guide/notice-files/NOT-OD-14-124.html
White House OSTP memorandum (2013)
“The Obama Administration is committed to the proposition that citizens deserve easy access to the results of research their tax dollars have paid for. That’s why, in a policy memorandum released today, OSTP Director John Holdren has directed Federal agencies with more than $100M in R&D expenditures to develop plans to make the results of federally funded research freely available to the public—generally within one year of publication.”
http://www.whitehouse.gov/blog/2013/02/22/expanding-public-access-results-federally-funded-research
I’m going to have to curate my data…
But it can also be good for your career
Make Lemonade:
• Reduce data loss
• More data usage
• More exposure to collaborators
• More competitive grant applications
How are libraries getting involved?
• We don’t make the rules
• We want to provide guidance
• Research data management services
• NLM Administrative Supplements
But libraries don’t have anything to do with research data…
“The biggest challenge that libraries face in building data management services is the researchers’ perception that librarians do not understand research data and have no role to play in data management.”
J Med Libr Assoc. 2015 Jul;103(3):131-5. doi: 10.3163/1536-5050.103.3.005.PMID: 26213504
Libraries are changing:
Strength: organizing and finding information
• Old role: Finding and cataloging books
• New role: Finding and cataloging electronic resources
• Informationist’s role: Finding datasets for data repurposing and helping researchers curate their own
Libraries are providing data services
Research Data Lifecycle
From: https://www.lib.msu.edu/rdmg/servcat/
DMPs
Software
Data cleaning
Deposit in Repository
Metadata
Search for datasets
We care what you think
Help us help you!
Data Management Challenges
Basic Science
• Lack of standards and procedures
• “Reinventing the wheel”
• Disconnect among data types
• Imaging and tabular data stored separately
• Staff leave with their data
Clinical
• Data quality
• Collection inconsistent among staff
• Changing data format among statistical programs
J Med Libr Assoc. 2015 Jul;103(3):131-5. doi: 10.3163/1536-5050.103.3.005.PMID: 26213504
Do researchers want to share data?
Basic Science
• Not really (27%)
• Concerns
• Negative experiences
• Privacy
• “My data is too specialized to be of use to others”
• Lack of infrastructure
• Curation takes time
Clinical
• Mostly (58%)
• Concerns
• As long as they know who’s using it.
• Interest in using other researchers’ data
J Med Libr Assoc. 2015 Jul;103(3):131-5. doi: 10.3163/1536-5050.103.3.005.PMID: 26213504
Librarians are receiving grant funding
Informationist Funding
NLM Administrative Supplements for Informationist Services
Purposes:
(1) To enhance collaborative, multi-disciplinary basic and clinical research by integrating an information specialist into the research team in order to improve the capture, storage, organization, management, integration, presentation and dissemination of biomedical research data
(2) To assess and document the value and impact of the informationist’s participation.
http://www.nlm.nih.gov/ep/AdminSupp.html
Project background
Dr. Kechris’s R01 proposal generated miRNA expression data from the LXS recombinant inbred mouse panel as a resource for the research community.
Planned to share data in PhenoGen database
NLM Informationist Awards
Aims:
1. Make data and code publicly available with appropriate metadata
2. Create tutorials to facilitate data reuse
3. Assess efficacy of Aims 1 + 2
Aim 1: Make data/code/metadata public
• Deposit raw miRNA data in public repositories
• NCBI (SRA/GEO/BioProject/BioSample)
• PhenoGen (new functionality to support NGS data)
• Standardize and apply metadata
• Make analysis workflows (R code) available in GitHub
• Repository entry to link all materials from this project
• Including tutorials from Aim 2
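As one illustration of what "standardize and apply metadata" can involve before deposit, the sketch below checks sample records for missing fields. It is not the project's actual workflow: the field names, the `missing_fields` helper, and the sample values are all hypothetical placeholders, and real repositories define their own required attributes.

```python
# Hypothetical required fields; a real repository publishes its own list.
REQUIRED_FIELDS = ("sample_id", "strain", "tissue", "data_type")


def missing_fields(record: dict) -> list:
    """Return the required fields that are absent or blank in `record`."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Made-up example records, for illustration only.
records = [
    {"sample_id": "S1", "strain": "example-strain", "tissue": "example-tissue", "data_type": "miRNA-seq"},
    {"sample_id": "S2", "strain": "", "tissue": "example-tissue", "data_type": "miRNA-seq"},
]

for record in records:
    gaps = missing_fields(record)
    status = "complete" if not gaps else "missing: " + ", ".join(gaps)
    print(record["sample_id"], status)
```

Running a check like this before submission catches incomplete records while the people who know the answers are still on the project.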
Aim 2: Facilitate data reuse with tutorials
Variety of formats:
• Video Tutorials: Adobe Captivate
• Written tutorials: Blog
• https://hslnews.wordpress.com/category/bioinformatics-bites/
• Guide on the Side:
• http://hslibrarytraining.ucdenver.edu
Aim 3: Assess efficacy of Aims 1 and 2
• Monitor data usage
• Citation
• Downloads (Google Analytics)
• Surveys and assessments about tutorials
• Are the tutorials helping others use the data?
Acknowledgements
HSL Faculty and Staff
• Melissa Desantis
• Lisa Traditi
• Kristen Desanto
• John Jones
• Lilian Hoffecker
• Ben Harnke
• Ruby Nugent
Kechris Group/PhenoGen
• Katerina Kechris
• Boris Tabakoff
• Laura Saba
• Spencer Mahaffey
• Pam Russell
Contact Information
[email protected]
Phone: 303-724-2114
Twitter: @tobinmagle
http://orcid.org/0000-0003-3185-7034