![Page 1: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/1.jpg)
Ontologies: Making Computers Smarter to Deal with Data
Kei Cheung, PhD
Yale Center for Medical Informatics
CBB752, February 9, 2015, Yale University
![Page 2: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/2.jpg)
Dealing with data
Science 11 February 2011: 692-729
![Page 3: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/3.jpg)
Examples of Big Data
• Genomics and proteomics data (e.g., next generation sequencing and mass spectrometry)
• Earth science data (e.g., satellite images)
• Electronic health records
• Social network data (e.g., Facebook, YouTube, …)
![Page 4: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/4.jpg)
Big Data in Genome Sciences
![Page 5: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/5.jpg)
Can Google answer every question?
![Page 6: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/6.jpg)
Problem with Keyword Search
Find the most recent image of the person “Kei Hoi Cheung”
• Kei (Hoi) Cheung (>20 years ago)
• Kei (Hoi) Cheung (more recent)
• Kei (Hui) Cheung: Not me!
• I’m NOT a company!
![Page 7: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/7.jpg)
![Page 8: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/8.jpg)
Data Science
• Extraction of knowledge from data (and metadata)
• Machine learning
• Natural language processing
• High performance computing
![Page 9: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/9.jpg)
Knowledge Bases
• Artificial Intelligence
• Machine-readable (reasonable) knowledge representation
• Ontologies
• Semantic web
![Page 10: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/10.jpg)
Data Science & Knowledge Base
Data Science Knowledge Base
![Page 11: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/11.jpg)
What is an ontology?
• An ontology is a specification of a conceptualization
• It is a description of the concepts and their relationships that exist for a particular domain
![Page 12: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/12.jpg)
Knowledge Web
![Page 13: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/13.jpg)
Knowledge Web Data Integration
![Page 14: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/14.jpg)
Semantic Web: Web 3.0
• The Semantic Web provides a common machine-readable ontology framework that allows data to be represented, shared and reused across application, enterprise, and community boundaries
– The Semantic Web is a knowledge web of data
• The Semantic Web is about two things
– It is about common formats for identification, representation, and integration of data drawn from diverse sources
– It is also about languages for describing how the data relates to real world objects
![Page 15: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/15.jpg)
Layers of the Semantic Web
![Page 16: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/16.jpg)
Web 3.0: Semantic Web (Cont’d)
• Global identifying scheme (URI)
• Standard data modeling languages (RDF, RDFS, OWL)
• Standard query languages (SPARQL)
• Enabling tools/technologies (e.g., Protégé, Jena, triplestores, etc.)
![Page 17: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/17.jpg)
Resource Description Framework (RDF)
• It is a standard data model (a directed labeled graph, which may contain cycles) for representing information (metadata) about resources in the World Wide Web
• In general, it can be used to represent information about “things” or “resources” that can be identified (using URIs) on the Web
• It is intended to provide a simple way to make statements (descriptions) about Web resources
![Page 18: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/18.jpg)
Uniform Resource Identifiers (URIs)
• A URI is a string of characters used to identify or name a resource on the Internet.
• URLs (Uniform Resource Locators) are a particular type of URI, used for resources that can be accessed on the WWW (e.g., web pages)
• In RDF, URIs typically look like “normal” URLs, often with fragment identifiers to point at specific parts of a document:
– http://www.semantic-systems-biology.org/SSB#CCO_B0000000 (id for “core cell cycle protein” in Cell Cycle Ontology)
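The fragment-identifier convention can be illustrated by splitting the example URI with Python's standard library. This is only a small sketch; nothing in it is specific to the Cell Cycle Ontology tooling.

```python
from urllib.parse import urldefrag, urlparse

# Example URI taken from the slide above.
uri = "http://www.semantic-systems-biology.org/SSB#CCO_B0000000"

# urldefrag separates the document part from the fragment identifier.
document, fragment = urldefrag(uri)
parts = urlparse(uri)

print(document)   # the namespace / document part
print(fragment)   # the local identifier within that document
print(parts.scheme, parts.netloc, parts.path)
```

The part before `#` typically acts as a shared namespace, and the fragment names one term inside it.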
![Page 19: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/19.jpg)
RDF Triple/Graph
• The basic information unit in RDF is an RDF statement in the form of
– (subject, property, object)
• Each RDF statement can be modeled as a graph comprising two nodes connected by a directed arc
• A triple example
• A set of such triples can jointly form a directed labeled graph (DLG) that can in theory model a significant part of domain knowledge.
• An RDF graph can be represented in different formats (XML, Turtle, N3, …)
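A minimal sketch of the triple model, assuming plain Python tuples stand in for RDF statements. The `ex:` names are hypothetical and chosen only for illustration. The helper `to_ntriples_like` writes one statement per line in the spirit of N-Triples, but it is not a conforming serializer.

```python
from collections import defaultdict

# Hypothetical triples: (subject, property, object).
triples = [
    ("ex:cdc2", "ex:is_a", "ex:Protein"),
    ("ex:cdc2", "ex:interacts_with", "ex:cdc13"),
    ("ex:cdc13", "ex:is_a", "ex:Protein"),
]

def build_graph(statements):
    """Index triples as subject -> list of (property, object) arcs."""
    graph = defaultdict(list)
    for subject, prop, obj in statements:
        graph[subject].append((prop, obj))
    return graph

def to_ntriples_like(statements):
    """Write one statement per line, terminated by ' .'."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in statements)

graph = build_graph(triples)
print(graph["ex:cdc2"])
print(to_ntriples_like(triples))
```

Each tuple is one directed arc labeled by its property; the set of tuples is the directed labeled graph.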
![Page 20: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/20.jpg)
Linking data of the same type from multiple sources
is a
![Page 21: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/21.jpg)
Linking data across different types
![Page 22: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/22.jpg)
Named Graph
[Figure labels: located in; interacts with; biordf:P05067]
Meta Statement
biordf:P05067, created by, foaf:kei_cheung
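One way to picture a named graph is as a set of quads, where a fourth element names the graph each triple belongs to. The sketch below assumes that `biordf:P05067` is the graph name, as the figure labels suggest; the statements inside the graph and all `ex:` names are hypothetical placeholders.

```python
# Assumption: biordf:P05067 names the graph, as the figure labels suggest.
GRAPH_NAME = "biordf:P05067"

# Quads are (graph, subject, property, object). The subjects and objects
# inside the graph are hypothetical placeholders.
quads = [
    (GRAPH_NAME, "ex:proteinA", "ex:located_in", "ex:membrane"),
    (GRAPH_NAME, "ex:proteinA", "ex:interacts_with", "ex:proteinB"),
]

# A meta statement is an ordinary triple whose subject is the graph itself.
meta_statements = [(GRAPH_NAME, "ex:created_by", "foaf:kei_cheung")]

def statements_in(graph_name):
    """Return the triples that belong to one named graph."""
    return [(s, p, o) for g, s, p, o in quads if g == graph_name]

def provenance_of(graph_name):
    """Return the meta statements about a graph as property -> value."""
    return {p: o for s, p, o in meta_statements if s == graph_name}

print(len(statements_in(GRAPH_NAME)), provenance_of(GRAPH_NAME))
```

Naming the graph lets provenance (who created it) be stated once about the whole set of triples rather than about each triple.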
![Page 23: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/23.jpg)
Cell Cycle Ontology (CCO) (Antezana et al, 2009, Genome Biology)
http://genomebiology.com/2009/10/5/R58
![Page 24: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/24.jpg)
RDF Graph Match (SPARQL)
BASE <http://www.semantic-systems-biology.org/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ssb: <http://www.semantic-systems-biology.org/SSB#>
SELECT ?protein_label
WHERE {
  GRAPH <cco_S_pombe> {
    ?protein ssb:is_a ssb:CCO_B0000000 .
    ?protein rdfs:label ?protein_label
  }
}
core cell cycle protein
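To show what "graph match" means, here is a toy pattern matcher in plain Python that evaluates the same two triple patterns over a handful of quads. It is a sketch of the idea, not a SPARQL engine. The data is invented; only `ssb:CCO_B0000000` and the graph name `cco_S_pombe` come from the query above.

```python
# Hypothetical quads: (graph, subject, property, object).
QUADS = [
    ("cco_S_pombe", "ssb:protein_1", "ssb:is_a", "ssb:CCO_B0000000"),
    ("cco_S_pombe", "ssb:protein_1", "rdfs:label", "example protein A"),
    ("cco_S_pombe", "ssb:protein_2", "ssb:is_a", "ssb:other_class"),
    ("cco_S_pombe", "ssb:protein_2", "rdfs:label", "example protein B"),
    ("other_graph", "ssb:protein_3", "ssb:is_a", "ssb:CCO_B0000000"),
    ("other_graph", "ssb:protein_3", "rdfs:label", "example protein C"),
]

def match(pattern, triple, bindings):
    """Unify one triple pattern with one triple; variables start with '?'."""
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Bind the variable, or reject if it is already bound differently.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result

def select(graph_name, patterns, quads):
    """Return all variable bindings that satisfy every pattern in one graph."""
    triples = [(s, p, o) for g, s, p, o in quads if g == graph_name]
    solutions = [{}]
    for pattern in patterns:
        extended_solutions = []
        for bindings in solutions:
            for triple in triples:
                extended = match(pattern, triple, bindings)
                if extended is not None:
                    extended_solutions.append(extended)
        solutions = extended_solutions
    return solutions

patterns = [
    ("?protein", "ssb:is_a", "ssb:CCO_B0000000"),
    ("?protein", "rdfs:label", "?protein_label"),
]
labels = [b["?protein_label"] for b in select("cco_S_pombe", patterns, QUADS)]
print(labels)
```

Because `?protein` appears in both patterns, the second pattern only matches labels of subjects that already satisfied the first one, and the `GRAPH` restriction keeps `protein_3` out of the result.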
![Page 25: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/25.jpg)
Linked Data Cloud
![Page 26: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/26.jpg)
RDF Schema (RDFS)
• RDF Schema terms:
– Class
– Property
– type
– subClassOf
– range
– domain
• Example:
<DNASequence, type, Class>
<Promoter, subClassOf, DNASequence>
<Protein, type, Class>
<TranscriptionFactor, subClassOf, Protein>
<Bind, type, Property>
<Bind, domain, TranscriptionFactor>
<Bind, range, Promoter>
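The schema above lets a program infer types that were never stated. The sketch below applies three simplified rules (domain, range, and type inheritance through subClassOf) to the slide's triples plus one hypothetical instance statement, `tf1 Bind promoter1`. It covers only a subset of RDFS entailment.

```python
# The schema triples from the slide.
schema = {
    ("DNASequence", "type", "Class"),
    ("Promoter", "subClassOf", "DNASequence"),
    ("Protein", "type", "Class"),
    ("TranscriptionFactor", "subClassOf", "Protein"),
    ("Bind", "type", "Property"),
    ("Bind", "domain", "TranscriptionFactor"),
    ("Bind", "range", "Promoter"),
}
# Hypothetical instance data, not from the slide.
data = {("tf1", "Bind", "promoter1")}

def rdfs_closure(triples):
    """Apply domain, range, and subClassOf rules until nothing new appears."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                if p2 == "domain" and s2 == p:
                    new.add((s, "type", o2))   # subject gets the domain class
                if p2 == "range" and s2 == p:
                    new.add((o, "type", o2))   # object gets the range class
                if p == "type" and p2 == "subClassOf" and s2 == o:
                    new.add((s, "type", o2))   # instances inherit superclasses
        if new <= inferred:
            return inferred
        inferred |= new

closure = rdfs_closure(schema | data)
print(sorted(t for t in closure if t[0] in {"tf1", "promoter1"}))
```

From a single `Bind` statement the closure concludes that `tf1` is a TranscriptionFactor and a Protein, and that `promoter1` is a Promoter and a DNASequence.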
![Page 27: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/27.jpg)
Relational table -> RDF -> RDFS ontology
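As a rough sketch of the first arrow (relational table to RDF), each row can become a subject and each non-key column a property. The table, column names, and the `http://example.org/` base are all hypothetical, and this mapping is one simple convention rather than the method used in the lecture.

```python
# Hypothetical table of genes: column names and values are illustrative.
rows = [
    {"id": "g1", "symbol": "geneA", "chromosome": "1"},
    {"id": "g2", "symbol": "geneB", "chromosome": "2"},
]

def rows_to_triples(table_name, key, table_rows, base="http://example.org/"):
    """Map each row to a subject URI and each non-key column to a property."""
    triples = []
    for row in table_rows:
        subject = f"{base}{table_name}/{row[key]}"
        # The table name plays the role of a class for every row.
        triples.append((subject, "rdf:type", f"{base}{table_name}"))
        for column, value in row.items():
            if column != key:
                triples.append((subject, f"{base}{table_name}#{column}", value))
    return triples

gene_triples = rows_to_triples("Gene", "id", rows)
for triple in gene_triples:
    print(triple)
```

The class and property URIs produced this way are the raw material that an RDFS ontology can then describe with subClassOf, domain, and range statements.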
![Page 28: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/28.jpg)
Web Ontology Language (OWL)
• It is more semantically expressive than RDF and RDFS, but it is syntactically the same as RDF
– Relationship constraints such as cardinality, sameAs, etc.
• It has three species: OWL Lite, OWL DL, OWL Full
![Page 29: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/29.jpg)
OWL DL Representation (Subsumption)
:Nucleus a owl:Class ;
  rdfs:subClassOf [
    a owl:Restriction ;
    owl:onProperty :part_of ;
    owl:someValuesFrom :Cell
  ] .
Necessary but not sufficient condition: every nucleus is part of some cell, but something that is part of a cell is not necessarily a nucleus
![Page 30: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/30.jpg)
OWL Reasoning
• Which proteins participate in “mitosis”?
:Protein a owl:Class ;
  rdfs:subClassOf [
    a owl:Restriction ;
    owl:onProperty :participates_in ;
    owl:someValuesFrom :Mitosis
  ] .
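The question on this slide can be approximated with a simple lookup: find individuals typed as Protein that have at least one `participates_in` value belonging to Mitosis or one of its subclasses. This is a closed-world sketch over invented individuals (`p1`, `p2`, `e1`, `e2`) and an invented class hierarchy. A real OWL reasoner such as Pellet works under open-world semantics and does considerably more.

```python
# Hypothetical class hierarchy and individuals, for illustration only.
subclass_of = {"Prophase": "Mitosis", "Mitosis": "CellCycleProcess"}
types = {"p1": "Protein", "p2": "Protein", "e1": "Prophase", "e2": "Interphase"}
participates_in = [("p1", "e1"), ("p2", "e2")]

def is_a(cls, target):
    """True if cls equals target or is a (transitive) subclass of it."""
    while cls is not None:
        if cls == target:
            return True
        cls = subclass_of.get(cls)
    return False

def proteins_participating_in(target):
    """Proteins with at least one participates_in value of the target class."""
    return sorted({
        subject
        for subject, obj in participates_in
        if is_a(types.get(subject), "Protein") and is_a(types.get(obj), target)
    })

print(proteins_participating_in("Mitosis"))
```

Here `p1` qualifies because Prophase is declared a subclass of Mitosis, which is the kind of subsumption step a keyword search would miss.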
![Page 31: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/31.jpg)
Semantic Web Rule Language (SWRL = OWL + Rules)
hasParent(?x1,?x2) ∧ hasBrother(?x2,?x3) ⇒ hasUncle(?x1,?x3)
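The uncle rule can be read as a join on the shared variable `?x2`. The sketch below applies it once to a few invented facts; the names `alice`, `bob`, and `carl` are hypothetical, and this is plain Python rather than a SWRL engine.

```python
# Hypothetical facts as (subject, object) pairs.
has_parent = {("alice", "bob")}
has_brother = {("bob", "carl"), ("dave", "ed")}

def infer_uncles(parents, brothers):
    """hasParent(x1, x2) AND hasBrother(x2, x3) IMPLIES hasUncle(x1, x3)."""
    return {
        (x1, x3)
        for x1, x2 in parents
        for y2, x3 in brothers
        if x2 == y2  # the two atoms must agree on the shared variable
    }

has_uncle = infer_uncles(has_parent, has_brother)
print(has_uncle)
```

The `dave`/`ed` pair produces nothing because no `hasParent` fact links anyone to `dave`.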
![Page 32: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/32.jpg)
SW Enabling Technologies
• Ontology editor (e.g., Protégé)
• Triple store (e.g., Virtuoso)
• OWL reasoner (e.g., Pellet)
• SWRL reasoner (e.g., Protégé plug-in)
![Page 33: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/33.jpg)
Biomedical ontologies available in RDF/OWL format
• UniProt
• Gene Ontology
• NCI Metathesaurus
• Cell Ontology
• Sequence Ontology
• Protein Ontology
• These and many more ontologies are available in ontology repositories such as the NCBO BioPortal (http://bioportal.bioontology.org/)
![Page 34: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/34.jpg)
Applications of Ontologies
![Page 35: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/35.jpg)
Siri
![Page 36: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/36.jpg)
Google Knowledge Graph
![Page 37: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/37.jpg)
![Page 38: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/38.jpg)
Semantic Medline
![Page 39: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/39.jpg)
![Page 40: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/40.jpg)
![Page 41: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/41.jpg)
![Page 42: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/42.jpg)
Questions to be answered
• Patient P has a tumor recurrence with new mutations X and Y – which drugs should be used?
• In estradiol-treated SKBR3 cells, which nuclear protein complexes have the greatest change in phosphorylation?
• What is the largest number of genes one can knock out of Mycoplasma for it to remain viable?
![Page 43: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/43.jpg)
![Page 44: Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.](https://reader030.fdocuments.net/reader030/viewer/2022032706/56649dde5503460f94ad7b47/html5/thumbnails/44.jpg)
The End