Research resources: curating the new eagle-i discovery system
Nicole Vasilevsky1, Tenille Johnson2, Karen Corday2, Carlo Torniai1, Matthew Brush1, Scott Hoffmann1, Erik Segerdell1, Melanie L. Wilson1, Christopher J. Shaffer1, David Robinson1, and Melissa A. Haendel1**
1 Oregon Health & Science University, Library, Portland, Oregon
2 Harvard Medical School, Center for Biomedical Informatics, Cambridge, Massachusetts
www.eagle-i.net
Open source software available at: https://open.med.harvard.edu/display/eaglei/Software
eagle-i Ontology GoogleCode: http://code.google.com/p/eagle-i/
Acknowledgements
**We, the authors, represent the members and leaders of the eagle-i Curation team, and describe some of the efforts and products of all teams involved in the development of the eagle-i discovery system. We would like to thank the Resource Navigation team, led by Richard Pearse; the Software Build team, led by Daniela Bourges; and the Project Management team, led by Julie McMurry. We would also like to thank Jackie Wirz. We gratefully acknowledge NIH award #U24RR029825.
Semantic Web Entry and Editing Tool
Components of the eagle-i annotation tool, known by the acronym SWEET, are generated directly from the eagle-i ontology. The SWEET contains both annotation fields that are auto-populated using the ontology (purple box) and free text fields (orange box). The Entrez Gene ID field links out to the NCBI database (red box). Fields in the SWEET can also link records to other records in the repository, such as related publications or documentation (blue box). Users can request that new terms be added to the ontology using the Term Request field.
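The idea of generating annotation fields from an ontology can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy `ONTOLOGY` dictionary, the property names, and the `build_form` helper are hypothetical and are not taken from the eagle-i software. Only the NCBI Gene URL pattern is real.

```python
from dataclasses import dataclass, field

# Hypothetical toy ontology: resource type -> {property: (kind, allowed terms)}.
# Kinds mirror the field types described above:
#   "term"        values are drawn from the ontology
#   "text"        free text
#   "entrez_gene" identifier that links out to NCBI
#   "record"      link to another record in the repository
ONTOLOGY = {
    "Antibody": {
        "host organism": ("term", ["Mus musculus", "Oryctolagus cuniculus"]),
        "description": ("text", None),
        "target gene": ("entrez_gene", None),
        "related publication": ("record", None),
    }
}


@dataclass
class FormField:
    label: str
    kind: str
    choices: list = field(default_factory=list)


def build_form(resource_type):
    """Derive the list of form fields for a resource type from the ontology."""
    properties = ONTOLOGY[resource_type]
    return [
        FormField(label, kind, list(choices or []))
        for label, (kind, choices) in properties.items()
    ]


def entrez_gene_url(gene_id):
    """Build the NCBI Gene page URL for a numeric Entrez Gene ID."""
    return f"https://www.ncbi.nlm.nih.gov/gene/{int(gene_id)}"


if __name__ == "__main__":
    for form_field in build_form("Antibody"):
        print(form_field.label, form_field.kind, form_field.choices)
    print(entrez_gene_url(7157))
```

Because the form is derived from the ontology rather than hard-coded, adding a property to the ontology in this sketch adds a field to the form without any change to the form-building code.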
Ontological modeling of research resources
Data Curation at eagle-i
Development of data curation practices at eagle-i depended on the Resource Navigation team for data collection, the Curation team for ontology development and data QA, and the Software team for user interface design in an iterative process. Tools and documentation were developed to assist users and team members with each of these processes.
Lessons Learned
• Balance the data you need with the data you can get
• Documentation and quality assurance are iterative
• Tools and technology choices depend on the above
Denotes required annotations.
Denotes questions eliciting information for annotation.
Denotes redirection to a different decision tree.
Denotes higher value/priority annotations.
Denotes medium value/priority annotations.
Denotes lower value/priority annotations.
Denotes drop down or annotation field examples.
Decision trees assist with data entry and annotation of resources
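The distinction between required annotations and higher, medium, and lower priority annotations in the decision tree legend can be expressed as a simple completeness check. This is a sketch only: the field names, the `SPEC` table, and both helper functions are hypothetical and do not reproduce the actual eagle-i decision trees.

```python
# Hypothetical annotation spec: field name -> priority level from the legend.
PRIORITY_ORDER = {"required": 0, "high": 1, "medium": 2, "low": 3}

SPEC = {
    "resource name": "required",
    "resource type": "required",
    "contact person": "high",
    "manufacturer": "medium",
    "catalog number": "low",
}


def _is_blank(value):
    """Treat None, empty strings, and whitespace-only strings as unfilled."""
    return value is None or not str(value).strip()


def missing_annotations(record, spec=SPEC):
    """Return the unfilled fields of a record, most important first."""
    missing = [name for name in spec if _is_blank(record.get(name))]
    return sorted(missing, key=lambda name: (PRIORITY_ORDER[spec[name]], name))


def is_complete(record, spec=SPEC):
    """A record is complete when every required field has a value."""
    return not any(
        _is_blank(record.get(name))
        for name, priority in spec.items()
        if priority == "required"
    )


if __name__ == "__main__":
    record = {"resource name": "anti-p53 antibody", "manufacturer": " "}
    print(missing_annotations(record))
    print(is_complete(record))
```

Sorting the missing fields by priority lets a curator address required annotations first and treat the lower priority ones as optional follow-up.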
The Ideal Scholarly Research Cycle
During the course of collecting information about research resources, which many laboratories were willing to share, we discovered that while larger core facilities routinely have resource and workflow organization strategies, primary research labs very rarely do. This creates barriers to reproducing experiments as well as to publishing and sharing resources. Giving labs organizational tools can help address these issues.
Provide scientists with the tools they need to record their resources during the course of research
How can we make this cycle more efficient?
o Researchers produce data and resources that lead to publications.
o Published data informs researchers of new experimental designs.
o Information about researchers, resources, data, and published papers is stored in various public repositories.
The goal of eagle-i is to make scientific research resources more visible via a federated network of institutional repositories. Using an ontology-driven approach for biomedical resource annotation and discovery, the Network currently includes resources from 23 institutions.
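A federated network of institutional repositories can be pictured as one search fanned out to each repository, with the results merged and tagged by source. The sketch below makes strong simplifying assumptions: each repository is an in-memory list rather than a remote service, matching is a plain substring test rather than an ontology-driven query, and the institution names, record fields, and function names are all hypothetical.

```python
def search_repository(records, term):
    """Return records whose label contains the search term (case-insensitive)."""
    needle = term.lower()
    return [record for record in records if needle in record["label"].lower()]


def federated_search(repositories, term):
    """Query every institutional repository and merge the tagged results."""
    hits = []
    for institution, records in repositories.items():
        for record in search_repository(records, term):
            # Tag each hit with its source so users know where the resource lives.
            hits.append({"institution": institution, **record})
    return hits


if __name__ == "__main__":
    # Hypothetical repositories and records, for illustration only.
    repositories = {
        "Institution A": [
            {"label": "Confocal microscope", "type": "Instrument"},
            {"label": "Anti-p53 antibody", "type": "Antibody"},
        ],
        "Institution B": [
            {"label": "Anti-GFP antibody", "type": "Antibody"},
        ],
    }
    for hit in federated_search(repositories, "antibody"):
        print(hit["institution"], hit["label"])
```

Keeping the data in each institution's own repository and merging only at query time is what distinguishes a federated design from a single central database.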
New initiatives with eagle-i
NCATS has funded two new projects that leverage eagle-i to further translational science. The first project aims to expand the breadth, quality, and discoverability of data about people and resources by harmonizing the ontologies of VIVO, eagle-i, and ShareCenter (www.ctsaconnect.org). The second project aims to expand the eagle-i platform to new CTSA institutions, and to publish resources as Linked Open Data.
Figure labels (Data Curation at eagle-i): Biocuration; Data collection; User interface design; Ontology development; Curation guidelines; SPARQL query tool for QA; Ontology Browser; SWEET; Search application; Decision trees; Google code.
Figure labels (The eagle-i workflow): Search application; Annotation tool; Institutional repositories; Biocurator; Ontology; Request new terms; Request resources; eagle-i participating lab.
Figure labels (The Ideal Scholarly Research Cycle, steps 1 to 3): Researcher; Resources and data; Publications; Public repositories (eagle-i, MODs, NIF, Entrez Gene...); Public repositories (PubMed, Google Scholar, Mendeley…); Professional networking (VIVO, Harvard Profiles, LinkedIn…).
Major eagle-i resource types are shown as dark boxes. Persons and laboratories play a central role in eagle-i. Classes and properties are reused from pre-existing ontologies or created de novo. Examples of some of the relations between the classes are indicated.