SeaLife Simon Jupp
A Semantic Grid Browser for the Life Sciences Applied to the Study of Infectious Diseases
SeaLife
Simon Jupp
SeaLife
Conception and realisation of a Semantic Grid Browser, which links the current Web to the emerging eScience infrastructure
• Partners: Manchester, Dresden, Edinburgh, London, Inria Sophia-Antipolis, Scionics
• Objectives:
– Many grids, few users: make Web servers and services accessible to end users
– Semantic Hyperlinks: use ontologies and background knowledge to map web contents to services
– Shopping cart: Service composition and enactment module
• Application: from cells, via tissue to patients
– Evidence-based medicine
– Patent and literature mining
– Molecular biology
• Implementations:
– COHSE
– GoPubMed
– CORESE
Objective
• We have a World Wide Web of data
• We have e-science and a grid of bioinformatics services
• We have text-mining tools, ontologies, web services and W3C standards
Evidence based medicine
"Ribavirin with or without alpha interferon for chronic hepatitis C"
• Background Knowledge: MeSH, Disease Ontology, SNOMED…
• UK-based Resources:
– National Institute for Health and Clinical Excellence (NICE)
– National Electronic Library of Infection (NeLI)
– Health Protection Agency (HPA)
Molecular Biology
“Rabaptin-5 interacts with the small GTPase Rab5 and is an essential component of the fusion machinery for targeting endocytic vesicles to early endosomes”
• Background Knowledge:
– Rabaptin-5 and Rab5 are proteins
– endocytosis as GO biological process
– early endosome as GO cellular component
• Resources:
– Get sequences, execute alignment service
– Add proteins to “shopping cart” Rab5
– PubMed query for relevant abstracts
A SeaLife browser
• Definition: A SeaLife browser is any web browser that can identify domain concepts in web documents via text-mining or use of background knowledge, and that provides context-based links to related services and resources on the web or grid.
• Several exist: COHSE, GoPubMed, Magpie, PiggyBank, KIM, Concept Web Linker…
Implementations
• COHSE - Conceptual Open Hypermedia Service
– Dynamic linking system for WWW documents
– Uses background knowledge (ontologies) to identify domain concepts
– Service module for navigating to relevant documents on the Web
• GoPubMed
– Ontology-based search engine: query expansion and results filtering
– Supports What, Who, Where, When
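Ontology-based query expansion and results filtering can be sketched roughly as follows. This is a minimal illustration, not GoPubMed's actual implementation: the `NARROWER` table, the term names in it, and the functions `expand_query` and `filter_results` are all hypothetical.

```python
# Hypothetical toy ontology: term -> directly narrower terms.
# The entries are illustrative and not taken from GoPubMed or a real GO release.
NARROWER = {
    "endocytosis": ["receptor-mediated endocytosis", "phagocytosis"],
    "phagocytosis": ["phagocytosis, engulfment"],
}


def expand_query(term, narrower=NARROWER):
    """Return the term plus all transitively narrower terms, breadth-first."""
    seen = [term]
    queue = [term]
    while queue:
        current = queue.pop(0)
        for child in narrower.get(current, []):
            if child not in seen:
                seen.append(child)
                queue.append(child)
    return seen


def filter_results(abstracts, terms):
    """Keep abstracts that mention at least one term (case-insensitive)."""
    lowered = [t.lower() for t in terms]
    return [a for a in abstracts if any(t in a.lower() for t in lowered)]
```

A query for "endocytosis" would then also retrieve abstracts that only mention a narrower term such as "phagocytosis", which is the point of expanding the query through the ontology.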
Web Navigation
• The Semantic Web is still a Web to be used by humans
– A collection of linked nodes
• Navigation is still an important aspect of information gathering on the Web
– Serendipitous information retrieval
• Problem
– Links are typically embedded
– Hard coded
– Difficult to author
– Ownership
– Unary
– Legacy resources
– Offer little in the way of semantics
• Approach
– Exploit Semantic Web components to add links dynamically to documents
– Exploit knowledge structure to drive navigation
Web Navigation with COHSE
• Knowledge Service
– Text processor and background knowledge identify concepts in a page
• Resource Manager
– Finds link targets for concepts found in the page
• DLS (Distributed Link Service)
– Dynamically adds the links to the page and manages requests to the resource manager
• Can be run as a browser plugin or through a proxy
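The three components above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not COHSE's code: the `LEXICON` and `TARGETS` tables, the `example.org` URLs, and the function names are hypothetical, and the sketch treats the page as plain text (no escaping, no existing markup).

```python
import re

# Hypothetical background knowledge: label -> concept identifier.
LEXICON = {
    "tuberculosis": "concept:Tuberculosis",
    "tb": "concept:Tuberculosis",
    "isoniazid": "concept:Isoniazid",
}

# Hypothetical link targets per concept (placeholder URLs).
TARGETS = {
    "concept:Tuberculosis": ["https://example.org/tb"],
    "concept:Isoniazid": ["https://example.org/isoniazid"],
}


def identify_concepts(text, lexicon=LEXICON):
    """Knowledge service: return sorted (start, end, concept) whole-word matches."""
    hits = []
    for label, concept in lexicon.items():
        pattern = r"\b" + re.escape(label) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((match.start(), match.end(), concept))
    return sorted(hits)


def find_targets(concept, targets=TARGETS):
    """Resource manager: look up link targets for a concept."""
    return targets.get(concept, [])


def add_links(text, lexicon=LEXICON, targets=TARGETS):
    """Link service: wrap each recognised concept in a link to its first target."""
    out = []
    pos = 0
    for start, end, concept in identify_concepts(text, lexicon):
        if start < pos:
            continue  # skip matches that overlap an already linked span
        links = find_targets(concept, targets)
        if not links:
            continue  # concept recognised but nothing to link to
        out.append(text[pos:start])
        out.append('<a href="%s">%s</a>' % (links[0], text[start:end]))
        pos = end
    out.append(text[pos:])
    return "".join(out)
```

The sketch uses only the first target for each concept; a fuller version would offer all targets for a concept (for example in a pop-up), since the links are added at browse time rather than authored into the page.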
NeLI use case
• National Electronic Library of Infection, London, UK
– Evidence-based, quality-tagged resource for public and clinical health records
– Diverse set of users: GPs, clinicians, molecular biologists, general public
– Many documents, few hyperlinks
• Can COHSE provide useful links to relevant external documents?
– Evaluation is underway
• Searching for guidelines on the use of "Ribavirin with or without alpha interferon for chronic hepatitis C"
– Clinicians need up-to-date, authoritative information
COHSE-NeLI Demo
http://www.cs.man.ac.uk/~sjupp/downloads/COHSE-NELI-2009-demo.mov
Background knowledge
• What semantics do we need for the background knowledge to drive navigation?
• Richer and more granular knowledge is better for navigation.
• The type of background knowledge needed varies with the type of user and the task at hand.
– E.g. nurses, doctors, medics, the public
[Figure: ontologies grouped by domain]
• Sequence: Sequence types and features; Genetic context
• Gene products: Molecule role; Molecular function; Biological process; Cellular component
• Proteins: Protein covalent bond; Protein domain; UniProt taxonomy
• Pathways: Pathway ontology; Event (INOH pathway ontology); Systems Biology; Protein-protein interaction
• Development: Arabidopsis development; Cereal plant development; Plant growth and developmental stage; C. elegans development; Drosophila development; Human developmental anatomy, abstract version; Human developmental anatomy, timed version
• Anatomy: Mosquito gross anatomy; Mouse adult gross anatomy; Mouse gross anatomy and development; C. elegans gross anatomy; Arabidopsis gross anatomy; Cereal plant gross anatomy; Drosophila gross anatomy; Dictyostelium discoideum anatomy; Fungal gross anatomy; Plant structure; Maize gross anatomy; Medaka fish anatomy and development; Zebrafish anatomy and development
• Phenotype: NCI Thesaurus; Mouse pathology; Human disease; Cereal plant trait; PATO attribute and value; Mammalian phenotype; Habronattus courtship; Loggerhead nesting; Animal natural history and life history
• Also shown: Transcript; Cell type; BRENDA tissue / enzyme source; Plasmodium life cycle; eVOC (Expressed Sequence Annotation for Humans)
Knowledge representation
[Figure: concept graph centred on Tuberculosis]
• Concepts: Tuberculosis, TB, Infectious Disease, Bacteria, Lung, Mycobacterium bovis, Coughing, Chest X-ray, BCG vaccine, Isoniazid
• Relations: abbreviation, is a, vaccine, caused by, drug, affects, similar to, symptom, diagnosis/detection
• Can’t make these close links with strict semantics!
SKOS conversions
[Figure: the same concept graph with its relations mapped to SKOS properties]
• Concepts: Tuberculosis, TB, Infectious Disease, Bacteria, Lung, Mycobacterium bovis, Coughing, Chest X-ray, BCG vaccine, Isoniazid
• SKOS properties used: skos:altLabel, skos:broader, skos:narrower, skos:related
• We need “something to do with” semantics for navigation
• SKOS provides a standard for common representation with “enough” semantics
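The “something to do with” idea can be sketched with plain subject-property-object triples. This is a minimal illustration: the concept names come from the tuberculosis example, but which SKOS property joins which pair is an assumption (the pairing is not recoverable from the figure), and `TRIPLES`, `labels` and `neighbours` are hypothetical names.

```python
# Assumed triples, loosely based on the tuberculosis example.
# The choice of property for each pair is a guess made for illustration.
TRIPLES = [
    ("Tuberculosis", "skos:altLabel", "TB"),
    ("Tuberculosis", "skos:broader", "Infectious Disease"),
    ("Tuberculosis", "skos:related", "Lung"),
    ("Tuberculosis", "skos:related", "Coughing"),
    ("Tuberculosis", "skos:related", "Isoniazid"),
    ("Bacteria", "skos:narrower", "Mycobacterium bovis"),
]

# Properties that connect two concepts and can therefore drive navigation.
NAVIGABLE = {"skos:broader", "skos:narrower", "skos:related"}


def labels(concept, triples=TRIPLES):
    """Return the concept name followed by its alternative labels."""
    alt = [o for s, p, o in triples if s == concept and p == "skos:altLabel"]
    return [concept] + alt


def neighbours(concept, triples=TRIPLES):
    """Return concepts one navigable hop away, following triples in both directions."""
    found = []
    for s, p, o in triples:
        if p not in NAVIGABLE:
            continue
        if s == concept and o not in found:
            found.append(o)
        elif o == concept and s not in found:
            found.append(s)
    return found
```

For navigation it does not matter whether a link means "symptom of" or "treated by"; `neighbours` treats every broader, narrower or related concept as a candidate link target, while `labels` supplies the strings to look for in a page.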
COHSE and e-science
• Enhancements to COHSE, working prototype available
– Addition of a text-mining component that identifies genes, proteins and chemicals in text
• Query service repositories
– E.g. myExperiment, BioCatalogue, BioMoby
– Execute services and workflows within the browser
• Edinburgh-developed shopping cart and argumentation services
– Shop online for your genes, proteins, sequences etc.
– Shop online for services and workflows
– All from within your web browser!
– But that’s the future…
Summary
• Range of Semantic Web browsers under development
• Semi-automated addition of semantic content to existing resources is the only viable option in many cases
• What are we waiting for?
– More background knowledge
– Semantic web services description