The Role of Ontologies in Improved Scholarly Communication
Philip E. Bourne
University of California San Diego
[email protected]
http://www.sdsc.edu/pb
My Perspective …
• Ontology Developer (years ago – mmCIF - Bioinformatics 2002 18: 1280-128)
• Database Developer – RCSB PDB
• Supporter of open access (provided there is a business model) - editor in chief of PLoS Computational Biology
• Co-founder - SciVee Inc.
• I am becoming increasingly interested in scholarly communication
• I use ontologies to support this work
Objective Today
• Describe how we are using ontologies to try and improve scholarly communication
• Motivate you towards thinking about ontologies that should be developed
• Learn from you where we might spend our efforts
First Consider What Motivates Us to Improve Scholarly Communication
We Cannot Possibly Read a Fraction of the Papers We Should
Drivers of Change
Renear & Palmer 2009 Science 325:828-832
Hence We Are Scanning More, Reading Less
Renear & Palmer 2009 Science 325:828-832
The Truth About the Scientific eLaboratory
• I have ?? mail folders!
• The intellectual memory of my laboratory is in those folders
• This is an unhealthy hub and spoke mentality
The Truth About the Scientific eLaboratory
• I generate way more negative than positive data, but where is it?
• Content management is a mess
  – Slides, posters…
  – Data, lab notebooks…
  – Collaborations, journal clubs…
• Software is open, but where is it?
• Farewell is for the data too
Computational Biology Resources Lack Persistence and Usability. PLoS Comp. Biol. 4(7): e1000136
Data and the Publication Are Disjoint
• PubMed contains 18,792,257 entries
• ~100,000 papers indexed per month
• In Feb 2009:
  – 67,406,898 interactive searches were done
  – 92,216,786 entries were viewed
• 1078 databases reported in NAR 2008
• MetaBase http://biodatabase.org reports 2,651 entries edited 12,587 times
Biosciences Data as of April 14, 2009
Publishing Limitations
• A paper is an artifact of a previous era
• It is not the logical end product of eScience, hence:
  – Work is omitted
  – Article vs supplement is a mess
  – Visualization may be limited
  – Interaction and enquiry are non-existent
  – Rich media can help, but are rarely used
We Need to do Better & The Game is Afoot
It is being driven from the top down and the bottom up
Ontologies & Semantic Tagging
BioLit Data Extraction/Storage
[Diagram labels: Database IDs, ontology terms, text excerpts, other…; BioLit; MySQL database; XML; XML, meta-data; <web services>; web; external databases]
Semantic Tagging
Tagging of PubMed Central
• Ontologies read from OBO files
• Words converted to tree structures
• Matched to every non-trivial word in the paper
• Matches tagged
• A long paper can be matched to GO in less than 30 seconds
http://biolit.ucsd.edu
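The tagging steps on this slide can be illustrated with a minimal sketch. It is not the BioLit implementation: it assumes the "tree structures" are a word-level trie, reads only the id, name and synonym lines of OBO [Term] stanzas, and uses a small hand-written OBO fragment. The names parse_obo, build_trie and tag are hypothetical.

```python
import re
from typing import Dict, List, Tuple

# Hand-written OBO fragment for illustration; real OBO files hold thousands of stanzas.
OBO_TEXT = """\
[Term]
id: GO:0006915
name: apoptotic process
synonym: "apoptosis" EXACT []

[Term]
id: GO:0005634
name: nucleus
"""

END = "$id"  # trie key marking a complete term; cannot collide with a token


def tokenize(text: str) -> List[str]:
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def parse_obo(text: str) -> Dict[str, str]:
    """Map each term name and quoted synonym (lower-cased) to its term id."""
    labels: Dict[str, str] = {}
    in_term, term_id = False, None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            in_term, term_id = (line == "[Term]"), None
        elif not in_term:
            continue
        elif line.startswith("id:"):
            term_id = line[3:].strip()
        elif term_id and line.startswith("name:"):
            labels[line[5:].strip().lower()] = term_id
        elif term_id and line.startswith("synonym:"):
            quoted = re.search(r'"([^"]+)"', line)
            if quoted:
                labels[quoted.group(1).lower()] = term_id
    return labels


def build_trie(labels: Dict[str, str]) -> dict:
    """Turn every label into a path of words in a nested-dict trie."""
    root: dict = {}
    for label, term_id in labels.items():
        node = root
        for word in tokenize(label):
            node = node.setdefault(word, {})
        node[END] = term_id
    return root


def tag(text: str, trie: dict) -> List[Tuple[str, str]]:
    """Scan the text once, keeping the longest term that starts at each word."""
    words = tokenize(text)
    hits: List[Tuple[str, str]] = []
    i = 0
    while i < len(words):
        node, j, best = trie, i, None
        while j < len(words) and words[j] in node:
            node = node[words[j]]
            j += 1
            if END in node:
                best = (j, node[END])
        if best:
            hits.append((" ".join(words[i:best[0]]), best[1]))
            i = best[0]
        else:
            i += 1  # word starts no known term: skip it
    return hits


if __name__ == "__main__":
    trie = build_trie(parse_obo(OBO_TEXT))
    print(tag("Apoptosis was observed in the nucleus.", trie))
```

Because each word is looked up in a dictionary, the cost of a pass grows with the length of the paper rather than with the size of the ontology, which is one plausible reason a tree structure would be chosen here.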
ICTP Trieste, December 10, 2007
http://biolit.ucsd.edu
Provision of web services to this tagging may be the most valuable contribution.
Database & Literature Integration
www.rcsb.org/pdb/explore/literature.do?structureId=1TIM
Context
BMC Bioinformatics 2010 11:220
Semantic Tagging of Database Content
http://www.pdb.org
PLoS Comp. Biol. 6(2) e1000673
Automatic Knowledge Discovery for Those with No Time to Read
[Diagram labels: Immunology Literature; Cardiac Disease Literature; Shared Function]
This is Literature Post-processing
Better to Get the Authors Involved
• Authors are the absolute experts on the content
• More effective distribution of labor
• Add metadata before the article enters the publishing process
BMC Bioinformatics 2010 11:103
Word 2007 Add-in for Authors
• Allows authors to add metadata as they write, before they submit the manuscript
• Authors are assisted by automated term recognition
  – OBO ontologies
  – Database IDs
• Metadata are embedded directly into the manuscript document via XML tags, OOXML format
  – Open
  – Machine-readable
• Open source, Microsoft Public License
http://www.codeplex.com/ucsdbiolit
Word 2007 Add-in Example of What it Looks Like - Ontologies
• Inline Recognition, Highlighting, and Mark-up of Informative Terms
  – A recognized term will have a dotted, purple underline
  – Hovering generates a Smart Tag above the term
    • add mark-up for this term
    • ignore this term
    • view the term in the ontology browser
    • If a recognized term appears in more than one ontology, all instances of that term will be listed
  – Hovering over a marked-up term
    • option to apply mark-up to all recognized instances of term
    • stop recognizing a term
  – Pass ontology terms back to provider
BMC Bioinformatics 2010 11:103
• Built-in Knowledge of Ontologies and Databases
  – Add-in provides a list of biomedical ontologies to download
  – and a list of databases for ID recognition (GenBank/RefSeq, UniProt, Protein Data Bank)
  – A user may also supply a URL to download other ontologies
• Ontology Browser
  – allows a user to select an ontology and then navigate through it to view terms and their relationships
BMC Bioinformatics 2010 11:103
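Database ID recognition of the kind listed on this slide could be approximated with regular expressions. The sketch below is an assumption about how such recognition might work, not the add-in's code. The UniProt pattern follows the accession format UniProt publishes; the RefSeq and PDB patterns are simplified; GenBank accessions are left out because their shape overlaps with UniProt accessions. The name find_database_ids is hypothetical.

```python
import re
from typing import List, Tuple

# Simplified accession patterns (assumptions for illustration only).
PATTERNS = [
    # RefSeq: two-letter prefix, underscore, digits, optional version suffix.
    ("RefSeq", re.compile(r"\b[A-Z]{2}_\d{6,9}(?:\.\d+)?\b")),
    # UniProt: accession format as documented by UniProt.
    ("UniProt", re.compile(
        r"\b(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
        r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})\b")),
    # PDB: four characters, a digit 1-9 followed by three alphanumerics.
    ("PDB", re.compile(r"\b[1-9][A-Za-z0-9]{3}\b")),
]


def find_database_ids(text: str) -> List[Tuple[str, str]]:
    """Return (database, identifier) pairs in order of appearance."""
    hits = []
    for database, pattern in PATTERNS:
        for match in pattern.finditer(text):
            token = match.group(0)
            # All-digit tokens such as years look like PDB IDs; drop them.
            if database == "PDB" and token.isdigit():
                continue
            hits.append((match.start(), database, token))
    hits.sort()
    return [(database, token) for _, database, token in hits]


if __name__ == "__main__":
    sentence = ("The structure 1TIM and the protein P04637 appear with "
                "transcript NM_000546.6 in a 2010 paper.")
    print(find_database_ids(sentence))
```

Pattern matching alone will still produce false positives (any four-character token that starts with a digit and contains a letter looks like a PDB ID), which is one reason to let the author confirm or ignore each recognized term.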
Custom Metadata
• Ontologies do not contain all usages of a concept
• Add-in allows user to assign custom metadata
• Human Disease Ontology term: Leukemia, T-Cell, HTLV-II-Associated
• Synonym: Atypical hairy cell leukemia (disorder)
• Actual use in literature:
  – hairy cell leukemia
  – hairy-cell leukemia
  – hairy T cell leukemia
  – T cell hairy leukemia
BMC Bioinformatics 2010 11:103
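A rough sketch of how custom metadata might link the literature variants on this slide to the ontology term. The CustomMetadata class and normalize function are hypothetical and not part of the add-in; the only normalization assumed is lower-casing and treating hyphens as spaces.

```python
import re
from typing import Dict, Optional

ONTOLOGY_TERM = "Leukemia, T-Cell, HTLV-II-Associated"


def normalize(phrase: str) -> str:
    """Lower-case, treat hyphens as spaces, and collapse whitespace."""
    return " ".join(re.sub(r"[-\s]+", " ", phrase.lower()).split())


class CustomMetadata:
    """Author-assigned links from free-text phrases to ontology terms."""

    def __init__(self) -> None:
        self._terms: Dict[str, str] = {}

    def assign(self, phrase: str, term: str) -> None:
        self._terms[normalize(phrase)] = term

    def lookup(self, phrase: str) -> Optional[str]:
        return self._terms.get(normalize(phrase))


if __name__ == "__main__":
    metadata = CustomMetadata()
    for variant in ("hairy cell leukemia", "hairy-cell leukemia",
                    "hairy T cell leukemia", "T cell hairy leukemia"):
        metadata.assign(variant, ONTOLOGY_TERM)
    print(metadata.lookup("Hairy-Cell  Leukemia"))
```

Normalization folds the hyphenated and unhyphenated spellings into a single key, while reordered forms such as "T cell hairy leukemia" still need an explicit assignment by the author.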
Synonym mapping, disambiguation
• Inclusion of an additional set of synonyms for a term that reflect its use in natural language
  – Automated finding of synonyms in extant literature
  – Gather synonyms from term-mapping databases
• Incorporate a more sophisticated term recognition approach into the add-in
BMC Bioinformatics 2010 11:103
Challenges
• Author use
  – Familiarity with ontologies, terms
  – Agreement between co-authors
• End-use of semantically enriched manuscript
• Need to combine with NLM XML standard
BMC Bioinformatics 2010 11:103
Challenges: Author Use
If one or more publishers fast-tracked a paper that had semantic markup, I would argue it would catch on in no time
BMC Bioinformatics 2010 11:103
Where we Need {Better} Ontologies
1. To Support Mashups Between Different Types of Scholarly Output
Post-publication of Video and Paper
www.scivee.tv
Pubcast – Video Integrated with the Full Text of the Paper
Pubcasts - A Unique Technology
Don’t understand what you are reading? Click and have the author pop-up and explain it!
See the scientists and the experiments behind the research papers and textbooks
Pubcasts - A Blend of Video, text, tables, figures, PowerPoints, comments, ratings…ALL SYNCHRONIZED FOR RAPID LEARNING
Mashups – www.scivee.tv
Where we Need {Better} Ontologies
2. To Support Tagging of all Aspects of the Scholarly Product
Consider Today’s Academic Workflow
[Diagram labels: Research [Grants]; Journal Article; Conference Paper; Poster Session; Feds; Societies; Publishers; Reviews; Blogs; Community Service/Data Curation]
What Should be Done?
Consider Tomorrow’s Academic Workflow
[Diagram labels: Research [Grants]; Journal Article; Conference Paper; Poster Session; Feds; Societies; Publishers; Reviews; Blogs; Community Service/Data Curation; Ideas, Data, Hypotheses]
What Should be Done?
Maybe The Line is Somewhere Else?
[Diagram labels: Scientist; Idea; Experiment; Data; Conclusions; Publish; Laboratory; Publisher]
Maybe The Line is Somewhere Else?
[Diagram labels: Scientist; Idea; Experiment; Data; Conclusions; Publish; Laboratory; Publisher; Institution; Lab Notebook]
What Should We Do?
Crowd Sourcing the Electronic Printing Press (aka Workshop: Beyond the PDF)
• Proposal to the US National Science Foundation:
• Aims:
  – Define user requirements
  – Establish a specification document
  – Open source the development effort
  – Have a commitment from a publisher to publish a research object using the system
  – Act as an exemplar for what can be done
Question: What if Everyone Had An Electronic Printing Press?
• Peer review might change?
• Bibliometrics might change?
• Business models will likely change?
• What happens to the database/literature divide?
• Societies might do more self publishing?
• We might have improved the dissemination of science, but will we have improved the comprehension?
General References
• What Do I Want from the Publisher of the Future PLoS Comp Biol http://www.sdsc.edu/pb
• Fourth Paradigm: Data Intensive Scientific Discovery http://research.microsoft.com/enus/collaboration/fourthparadigm/
References to Exemplars
• Semantic Biochemical Journal - 2010: Using Utopia
• Article of the Future, Cell, 2009:
• Prospect, Royal Society of Chemistry, 2009:
• Adventures in Semantic Publishing, Oxford U, 2009:
• The Structured Digital Abstract, Seringhaus/Gerstein, 2008
• CWA Nanopublications – 2010
Acknowledgements
• BioLit Team
  – Lynn Fink
  – Parker Williams
  – Marco Martinez
  – Rahul Chandran
  – Greg Quinn
• Microsoft Scholarly Communications
  – Pablo Fernicola
  – Lee Dirks
  – Savas Parastatidis
  – Alex Wade
  – Tony Hey
• wwPDB team
• SciVee Team
  – Apryl Bailey
  – Tim Beck
  – Leo Chalupa
  – Lynn Fink
  – Marc Friedman (CEO)
  – Ken Liu
  – Alex Ramos
  – Willy Suwanto
http://www.scivee.tv
http://biolit.ucsd.edu
http://www.pdb.org
http://www.codeplex.com/ucsdbiolit