The Royal Society of Chemistry and its adoption of semantic web technologies for chemistry at the epoch of a federated world

Antony Williams, Valery Tkachenko, Ken Karapetyan, Alexey Pshenichnov

ACS, 248th National Meeting, San Francisco, CA, August 11th 2014

Description

Semantic web technologies have quickly penetrated all areas of traditional and new database systems and have become the de facto standard in information exchange and communication. The Royal Society of Chemistry has built a new chemistry data repository with the semantic web at the core of the system. Every module of the data repository contains a semantic web layer and is able to interact internally and externally using standard approaches and formats including RDF, appropriate ontologies, SPARQL querying and so on. In this presentation we will review the challenges associated with developing this new system based on semantic web technologies and how the approach that we have taken offers distinct advantages over the original data model designed to produce the ChemSpider database. Its advantages include extensibility, an ontological underpinning, federated integration and the adoption of modern standards rather than the constraints of a standard SQL model.

Transcript

Page 1: The royal society of chemistry and its adoption of semantic web technologies for chemistry at the epoch of a federated world

The Royal Society of Chemistry and its adoption of semantic web technologies for chemistry at the epoch of a federated world

Antony Williams, Valery Tkachenko, Ken Karapetyan, Alexey Pshenichnov

ACS, 248th National Meeting

San Francisco, CA

August 11th 2014

Page 2

Who is involved?

29 partners

Page 3

Research questions

Page 4

Research questions

ChEMBL, DrugBank, Gene Ontology, WikiPathways, UniProt, ChemSpider, UMLS, ConceptWiki, ChEBI, TrialTrove, GVKBio, GeneGo, TR Integrity

“Find me compounds that inhibit targets in NFkB pathway assayed in only functional assays with a potency <1 μM”

“What is the selectivity profile of known p38 inhibitors?”

“Let me compare MW, logP and PSA for known oxidoreductase inhibitors”
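
The first question above amounts to a conjunctive filter over linked pathway, target and assay data. A minimal sketch of that idea in Python follows. The record layout, compound and target labels, and potency values are all made up for illustration; this is not Open PHACTS data and not its API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    compound: str      # hypothetical compound label
    target: str        # hypothetical target label
    pathway: str       # pathway the target is assigned to
    assay_type: str    # "functional" or "binding"
    potency_um: float  # potency in micromolar


# Made-up toy records, for illustration only.
ACTIVITIES = [
    Activity("cmpd-A", "target-1", "NFkB", "functional", 0.2),
    Activity("cmpd-B", "target-1", "NFkB", "binding", 0.05),
    Activity("cmpd-C", "target-2", "NFkB", "functional", 3.0),
    Activity("cmpd-D", "target-3", "p38", "functional", 0.1),
]


def potent_functional_hits(activities, pathway, max_potency_um=1.0):
    """Return compounds with a functional-assay potency below the
    threshold against any target in the given pathway."""
    hits = {
        a.compound
        for a in activities
        if a.pathway == pathway
        and a.assay_type == "functional"
        and a.potency_um < max_potency_um
    }
    return sorted(hits)


print(potent_functional_hits(ACTIVITIES, "NFkB"))
```

The hard part in practice is not the filter itself but getting pathway, target and assay records from different sources to agree on identifiers, which is what the rest of the talk addresses.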

Page 5

Discovery Platform

• Compound-protein interactions
• Physicochemical properties
• Gene information
• Biological pathways

Open PHACTS Explorer: web-based searching interface (explorer.openphacts.org)

Open PHACTS API (dev.openphacts.org): applications can query the pharmacological data within Open PHACTS

Open PHACTS applications: external bespoke applications using the Open PHACTS API (chembionavigator.org, pharmatrek.org)

Workflow tools: Pipeline Pilot, KNIME, R

Page 6

Open PHACTS UI: http://explorer.openphacts.org/

Page 7
Page 8

ChemBioNavigator

Page 9

Open PHACTS API: https://dev.openphacts.org/

Page 10

KNIME

Page 11

OpenPHACTS Architecture

Page 12

Micro-article

Compounds

Reaction

Analytical Data

Text and References

Page 13

Technical view - unification

Page 14

Chemistry Validation and Standardization Platform

Page 15

DrugBank dataset (6516 records)

J. Brecher, IUPAC, Graphical Representation of Stereochemical Configuration, Section ST-1.1.10

DB06287

Page 16

PubChem, DrugBank, ChemSpider

Imatinib

Mesylate

What Is Gleevec?

Ambiguities

Page 17

How is this a semantic web problem? Why can’t people just be clear?

People may be working with faulty data.

Salts, say, may make little difference to the effects of an active ingredient.

People may assume a one-to-one mapping between a gene and the gene product (protein, ncRNA) that it codes for.

Page 18

What’s in a lens?

• Identifier
• Title (dct:title)
• Description (dct:description)
• Documentation link (dcat:landingPage)
• Creator (pav:createdBy)
• Timestamp (pav:createdOn)
• Equivalence rules (bdb:linksetJustification)
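
The lens fields listed on this slide map naturally onto a small record type. The sketch below is only an illustration of that mapping: the `Lens` class, its `to_triples` method, and every example.org URI are hypothetical, and only the prefixed property names (dct:title and so on) come from the slide.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Lens:
    identifier: str    # URI identifying the lens (hypothetical below)
    title: str
    description: str
    landing_page: str  # documentation link
    created_by: str
    created_on: datetime
    justifications: list = field(default_factory=list)  # equivalence rules

    def to_triples(self):
        """Emit (subject, predicate, object) tuples using the
        prefixed property names shown on the slide."""
        s = self.identifier
        triples = [
            (s, "dct:title", self.title),
            (s, "dct:description", self.description),
            (s, "dcat:landingPage", self.landing_page),
            (s, "pav:createdBy", self.created_by),
            (s, "pav:createdOn", self.created_on.isoformat()),
        ]
        triples += [
            (s, "bdb:linksetJustification", j) for j in self.justifications
        ]
        return triples


# All values below are placeholders, not real Open PHACTS lenses.
strict = Lens(
    identifier="http://example.org/lens/strict",
    title="Strict",
    description="Only links justified by identical structure",
    landing_page="http://example.org/docs/lens/strict",
    created_by="http://example.org/people/editor",
    created_on=datetime(2014, 8, 11, tzinfo=timezone.utc),
    justifications=["http://example.org/justification/inchi"],
)

for triple in strict.to_triples():
    print(triple)
```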

Page 19

Equivalence rules

The BridgeDb vocabulary adds metadata that provides a justification for treating two URIs alike, allowing researchers to decide whether a given equivalence fits their circumstances.

owl:sameAs ≤ skos:exactMatch ≤ skos:closeMatch ≤ rdfs:seeAlso

The ChEBI and CHEMINF ontologies provide a rich set of relations (many of which were developed for this project) to relate one molecule to another.
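
One way to put the chain of link predicates to work is as a strictness threshold. The sketch below reads the chain as running from the strictest predicate (owl:sameAs) to the loosest (rdfs:seeAlso), which is an interpretation of the slide rather than something it states. The function names and the `ex:` identifiers are hypothetical.

```python
# Predicate order taken from the chain above, read strictest first
# (an assumption about what the chain means).
STRICTNESS = ["owl:sameAs", "skos:exactMatch", "skos:closeMatch", "rdfs:seeAlso"]


def at_least_as_strict(predicate, threshold):
    """True if `predicate` is as strict as, or stricter than, `threshold`."""
    return STRICTNESS.index(predicate) <= STRICTNESS.index(threshold)


def filter_links(links, threshold):
    """Keep only links whose predicate meets the strictness threshold.

    Each link is a (source, predicate, target) tuple.
    """
    return [link for link in links if at_least_as_strict(link[1], threshold)]


# Hypothetical links between made-up identifiers.
LINKS = [
    ("ex:a", "owl:sameAs", "ex:b"),
    ("ex:a", "skos:exactMatch", "ex:c"),
    ("ex:a", "skos:closeMatch", "ex:d"),
    ("ex:a", "rdfs:seeAlso", "ex:e"),
]

print(filter_links(LINKS, "skos:exactMatch"))
```

A threshold on the predicate alone is coarse; the justification metadata described above is what lets a researcher distinguish two close matches made for different reasons.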

Page 20

ChEBI (http://www.ebi.ac.uk/chebi)

• has part
• is tautomer of

CHEMINF (http://code.google.com/p/semanticchemistry/)

• has component with uncharged counterpart
• has counterpart molecular entity
• has normalized counterpart
• has OPS normalized counterpart
• has PubChem normalized counterpart
• has uncharged counterpart
• similar to
• similar to by PubChem 2D similarity algorithm
• similar to by PubChem 3D similarity algorithm
• has same connectivity as
• is isotopologue of
• is stereoisomer of
• subClassOf (standard relation in RDF)
• has isotopically unspecified parent
• has stereoundefined parent

Page 21

Link: skos:closeMatch; Reason: non-salt form

Link: skos:exactMatch; Reason: drug name

Page 22

Strict Relaxed

Analysing Browsing

skos:exactMatch(InChI)

Page 23

Strict Relaxed

Analysing Exploring

skos:closeMatch(Drug Name)

skos:closeMatch(Drug Name)

skos:exactMatch(InChI)
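
The strict and relaxed views on these two slides can be sketched as the same link set seen through different sets of accepted justifications. In the example below, the record identifiers, the link list, and the choice to follow accepted links transitively are all assumptions made for illustration; the talk does not describe how the platform expands identifiers.

```python
from collections import deque

# Hypothetical links: (record, record, justification).
LINKS = [
    ("cs:imatinib", "pc:imatinib", "InChI"),
    ("db:gleevec", "cs:imatinib", "Drug Name"),
    ("db:gleevec", "cs:imatinib-mesylate", "Drug Name"),
]

STRICT = {"InChI"}                # analysing: identical structure only
RELAXED = {"InChI", "Drug Name"}  # exploring: also match by drug name


def equivalents(start, links, accepted):
    """Records reachable from `start` over links whose justification
    is accepted by the lens (links are treated as undirected)."""
    neighbours = {}
    for a, b, reason in links:
        if reason in accepted:
            neighbours.setdefault(a, set()).add(b)
            neighbours.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen)


print(equivalents("cs:imatinib", LINKS, STRICT))
print(equivalents("cs:imatinib", LINKS, RELAXED))
```

Under the strict lens the free base is linked only to its structurally identical record; under the relaxed lens the drug-name links pull in the mesylate salt as well, which is convenient for browsing but would distort a structure-based analysis.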

Page 24

What does the Open PHACTS Chemistry Registration System do?

Takes in structures from ChEMBL, ChEBI, DrugBank, PDB, Thomson Reuters.

Normalizes structures according to rules based on FDA guidelines.

Generates counterpart molecules: without charge, fragments
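
The fragment counterparts mentioned above can be illustrated with plain string handling, since SMILES uses a dot to separate disconnected components. This is a toy sketch, not the platform's method: the function names are hypothetical, string length is only a crude stand-in for fragment size, and neither normalization nor charge removal is attempted (both would need a cheminformatics toolkit). The SMILES string is imatinib mesylate as commonly written and should be checked against a database before reuse.

```python
def split_fragments(smiles):
    """Split a SMILES string into its disconnected components."""
    return [part for part in smiles.split(".") if part]


def largest_fragment(smiles):
    """Pick the longest component as a rough proxy for the parent
    molecule (assumption: longer string means larger fragment)."""
    return max(split_fragments(smiles), key=len)


# Imatinib mesylate: the drug plus methanesulfonic acid.
IMATINIB_MESYLATE = (
    "CC1=C(C=C(C=C1)NC(=O)C2=CC=C(C=C2)CN3CCN(CC3)C)"
    "NC4=NC=CC(=N4)C5=CN=CC=C5"
    ".CS(=O)(=O)O"
)

fragments = split_fragments(IMATINIB_MESYLATE)
parent = largest_fragment(IMATINIB_MESYLATE)
print(fragments)
print(parent)
```

Registering the salt, the parent and the counter-ion as separate but related records is what makes links such as "non-salt form" possible.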

Page 25

Chemistry Validation and Standardization Platform

Page 26

Input pipeline

Page 27

Compounds domain

Page 28

Navigation in chemical space

Page 29

Navigation in chemical space

Page 30

Reactions domain

Page 31
Page 32

Analytical data domain

Page 33

Crystallography domain

Page 34

Standards

Page 35

Share in a “proper way”

Page 36

APIs, endpoints and widgets

Page 37
Page 39

Handling complex content

• What’s the structure?
• Are they in our file?
• What’s similar?
• What’s the target?
• Pharmacology data?
• Known pathways?
• Working on now?
• Connections to disease?
• Expressed in right cell type?
• Competitors?
• IP?

Page 40
Page 41

Machine learning

Page 42
Page 43

Thank you

Email: [email protected]

Slides: http://www.slideshare.net/valerytkachenko16