Dynamic Semantic Metadata in Biomedical Communications

Tim Clark, Harvard Medical School & Massachusetts General Hospital. April 12, 2011. Copyright 2010 Massachusetts General Hospital. All rights reserved.


1st Annual Conference of the Pistoia Alliance. Keynote talk, Apr 12 2011, by Tim Clark

Transcript of Dynamic Semantic Metadata in Biomedical Communications

Page 1: Dynamic Semantic Metadata in Biomedical Communications

Tim Clark

Harvard Medical School & Massachusetts General Hospital

April 12, 2011

Copyright 2010 Massachusetts General Hospital. All rights reserved.

Page 2: Dynamic Semantic Metadata in Biomedical Communications

Information sharing and integration requirements for curing complex disorders.

Web 3.0 and semantic metadata.

Integrating ontologies, documents, data.

Annotation Ontology & Annotation Framework.

Page 3: Dynamic Semantic Metadata in Biomedical Communications

Yearly mortality (U.S.) = 642,00 people

Yearly costs (U.S.) = $676 B / 4.7% GDP

Prevalence = 5.3 M + 76 M + 14.4 M = 95.7 M people

Page 4: Dynamic Semantic Metadata in Biomedical Communications

create hypothesis

design experiment

run experiment

collect data

interpret data

share interpretations

synthesize knowledge

Page 5: Dynamic Semantic Metadata in Biomedical Communications

MCI progressors / non-progressors

PET imaging of PIB (radiolabelled compound binds amyloid beta A4 protein)

MRI imaging of brain structure showing loss of hippocampal volume

Brain. 2010 Nov;133(Pt 11):3336-3348.

= 218 subjects +

Page 6: Dynamic Semantic Metadata in Biomedical Communications

dopaminergic pathway

α-synuclein, β-amyloid

α-synuclein, Tau

chr 16p11.2 CNV

chr 16p11.2 CNV

CRF, glutamatergic system, dopamine, amygdala …

Alzheimer Disease

Parkinson’s Disease

Schizophrenia

Autism

Bipolar Disorder

Drug Addiction

Huntington’s Disease

ALS

Depression

SIRT2

Page 7: Dynamic Semantic Metadata in Biomedical Communications

1. We want to organize all the known facts in neurobiology so we can mash them up.

2. There are no “facts” in neurobiology, except uninteresting ones.

3. All we have are assertions supported by evidence of varying quality.

Page 8: Dynamic Semantic Metadata in Biomedical Communications

1667: Printing Press

2010: Web

Page 9: Dynamic Semantic Metadata in Biomedical Communications

We scientists do not attend professional meetings to present our findings ex cathedra, but in order to argue.

John Polanyi, FRS, Nobel Laureate, University of Manchester

Page 10: Dynamic Semantic Metadata in Biomedical Communications

Social Web (Web 2.0, read/write)

+

Shared annotation with controlled terminology systems (Sem Web)

Page 11: Dynamic Semantic Metadata in Biomedical Communications

Information sharing within communities or tasks via Social Web (Web 2.0), wikis and forums

Information “permeability” across pharma R&D projects / domains / pipeline stages via shared metadata (semantic annotation)

Web 3.0 improves cross-domain Signal to Noise, institutional memory & data “findability”

Page 12: Dynamic Semantic Metadata in Biomedical Communications
Page 13: Dynamic Semantic Metadata in Biomedical Communications

Genes

Proteins

Biological Processes

Chemical Compounds

Antibodies

Cells

Brain anatomy

Page 14: Dynamic Semantic Metadata in Biomedical Communications

Annotation Ontology (AO) is a domain-independent Web ontology. Links document fragments to ontology terms. Metadata separate from annotated documents.

SWAN AF manages document annotation. Interfaces to text-mining services & supports curation.

Collaborating with NCBO, UCSD, Elsevier, USC, Manchester, EMBL, Colorado, EBI, etc.
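
Editor's sketch of the stand-off annotation idea on this slide: a document fragment is linked to an ontology term, and the annotation record lives apart from the document text. The class names, the prefix/exact/suffix selector, the sample sentence, and the example.org URIs are all assumptions made for illustration; they are not the AO vocabulary or the SWAN AF API. The term URI is the one labeled "Beta amyloid" on Page 32.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextSelector:
    """Locates a document fragment by its exact text plus surrounding context."""
    exact: str
    prefix: str = ""
    suffix: str = ""

    def find(self, text: str) -> int:
        """Return the start offset of the fragment in text, or -1 if absent."""
        start = text.find(self.prefix + self.exact + self.suffix)
        return -1 if start < 0 else start + len(self.prefix)


@dataclass(frozen=True)
class Annotation:
    """Stand-off annotation: stored separately from the annotated document."""
    document_uri: str
    selector: TextSelector
    topic_uri: str   # ontology term the fragment is linked to
    created_by: str


# Hypothetical document and sample text, for illustration only.
document_uri = "http://example.org/doc/1"
document_text = "The sample text mentions amyloid beta once."

annotation = Annotation(
    document_uri=document_uri,
    selector=TextSelector(exact="amyloid beta", prefix="mentions ", suffix=" once"),
    topic_uri="http://bio2rdf.org/chebi:53002",
    created_by="http://example.org/person/curator",  # hypothetical curator
)

offset = annotation.selector.find(document_text)
fragment = document_text[offset:offset + len(annotation.selector.exact)]
print(offset, fragment)
```

Because the selector carries its own context, the annotation can be re-attached to the text without modifying the document itself.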

Page 15: Dynamic Semantic Metadata in Biomedical Communications

Text

Shared metadata

Page 16: Dynamic Semantic Metadata in Biomedical Communications

2) Automatic annotation

Dr. Paolo Ciccarese – Oct 8, 2010

Page 17: Dynamic Semantic Metadata in Biomedical Communications


Page 18: Dynamic Semantic Metadata in Biomedical Communications
Page 19: Dynamic Semantic Metadata in Biomedical Communications
Page 20: Dynamic Semantic Metadata in Biomedical Communications

Semantics on documents (SESL)

Vocabulary standards & terminology development

Document & data management

Collaboratories & web communities

Hypothesis management (SWAN)

Nanopublications (OpenPHACTS)

Page 21: Dynamic Semantic Metadata in Biomedical Communications
Page 22: Dynamic Semantic Metadata in Biomedical Communications

Model the thinking behind your research

Database it, web-ify it, RDF-ize it, share it

Link the Models / Hypotheses to: Claims / Interpretations; Evidence (publications, experiments, data); Supporting and contradictory claims from others; Evidence for these other claims

Web 3.0: share, compare and discuss

Manage knowledge while creating it

Can be public, private, or semi-private
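
Editor's sketch of the linking described on this slide: a hypothesis holds claims, each claim holds its evidence, and claims can point to supporting or contradictory claims from others, which carry their own evidence. All class names, field names, and sample values are hypothetical; this is not the SWAN data model, only one possible way to lay out the links.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Evidence:
    kind: str        # "publication", "experiment" or "data"
    reference: str


@dataclass
class Claim:
    text: str
    author: str
    evidence: List[Evidence] = field(default_factory=list)
    supported_by: List["Claim"] = field(default_factory=list)
    contradicted_by: List["Claim"] = field(default_factory=list)


@dataclass
class Hypothesis:
    title: str
    claims: List[Claim] = field(default_factory=list)
    visibility: str = "private"  # "public", "private" or "semi-private"

    def all_evidence(self) -> List[Evidence]:
        """Collect evidence for the claims and for related claims from others."""
        collected: List[Evidence] = []
        for claim in self.claims:
            collected.extend(claim.evidence)
            for other in claim.supported_by + claim.contradicted_by:
                collected.extend(other.evidence)
        return collected


# Placeholder values, for illustration only.
supporting = Claim(
    text="Related claim from another group",
    author="other-lab",
    evidence=[Evidence("publication", "hypothetical-citation-2")],
)
claim = Claim(
    text="Example claim",
    author="hypothetical-author",
    evidence=[Evidence("experiment", "hypothetical-assay-1")],
    supported_by=[supporting],
)
hypothesis = Hypothesis(title="Example hypothesis", claims=[claim],
                        visibility="semi-private")

references = [e.reference for e in hypothesis.all_evidence()]
print(references)
```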

Page 23: Dynamic Semantic Metadata in Biomedical Communications
Page 24: Dynamic Semantic Metadata in Biomedical Communications
Page 25: Dynamic Semantic Metadata in Biomedical Communications


Page 26: Dynamic Semantic Metadata in Biomedical Communications


Page 27: Dynamic Semantic Metadata in Biomedical Communications


Page 28: Dynamic Semantic Metadata in Biomedical Communications

Cognitive Deficits (S)

Relate to (P)

BACE1 (O)

provenance

context

With thanks to Barend Mons and Paul Groth…

Mons / Groth model of a nanopublication
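
Editor's sketch of the structure named on this slide: one subject-predicate-object assertion bundled with provenance and context. The class and the provenance/context keys and values are hypothetical placeholders; the slide does not specify them, and this is not the nanopublication schema itself.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class Nanopublication:
    assertion: Triple            # (subject, predicate, object)
    provenance: Dict[str, str]   # who or what produced the assertion
    context: Dict[str, str]      # conditions under which it is asserted

    def statement(self) -> str:
        """Render the assertion as a readable S-P-O string."""
        subject, predicate, obj = self.assertion
        return f"{subject} {predicate} {obj}"


nanopub = Nanopublication(
    assertion=("Cognitive Deficits", "relate to", "BACE1"),
    provenance={"asserted_by": "hypothetical-author"},  # placeholder
    context={"scope": "unspecified"},                   # placeholder
)

print(nanopub.statement())
```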

Page 29: Dynamic Semantic Metadata in Biomedical Communications

swande:Claim

<http://tinyurl.com/4h2am3a>

Intramembranous Aβ behaves as chaperones of other membrane proteins

rdf:type

dct:title

G1

<http://example.info/person/1>

pav:authoredBy

Vincent Marchesi

foaf:name

foaf:Person

rdf:type

pav: http://purl.org/pav/provenance/2.0/

foaf: http://xmlns.com/foaf/0.1/

G2
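
Editor's sketch of one possible reading of this diagram as named graphs. It assumes G1 holds the claim and its title, G2 holds the authorship statements, and the subject of pav:authoredBy is the claim URI; the flattened slide text does not confirm any of these. Prefixed names are kept as written on the slide rather than expanded, so the output is TriG-like text, not valid TriG.

```python
from collections import defaultdict

CLAIM = "<http://tinyurl.com/4h2am3a>"
AUTHOR = "<http://example.info/person/1>"

# (subject, predicate, object, graph); graph membership is an assumption.
quads = [
    (CLAIM, "rdf:type", "swande:Claim", "G1"),
    (CLAIM, "dct:title",
     '"Intramembranous Aβ behaves as chaperones of other membrane proteins"', "G1"),
    (CLAIM, "pav:authoredBy", AUTHOR, "G2"),
    (AUTHOR, "rdf:type", "foaf:Person", "G2"),
    (AUTHOR, "foaf:name", '"Vincent Marchesi"', "G2"),
]


def group_by_graph(quads):
    """Group (s, p, o, g) quads into {graph_name: [(s, p, o), ...]}."""
    graphs = defaultdict(list)
    for s, p, o, g in quads:
        graphs[g].append((s, p, o))
    return dict(graphs)


def render(graphs):
    """Render the grouped triples as TriG-like text, one block per graph."""
    lines = []
    for name in sorted(graphs):
        lines.append(f"{name} {{")
        lines.extend(f"  {s} {p} {o} ." for s, p, o in graphs[name])
        lines.append("}")
    return "\n".join(lines)


graphs = group_by_graph(quads)
print(render(graphs))
```

Keeping authorship in its own graph means later statements (curation, supporting citations) can be added as further graphs without touching the claim.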

Page 30: Dynamic Semantic Metadata in Biomedical Communications

swande:Claim

<http://tinyurl.com/4h2am3a>

Intramembranous Aβ behaves as chaperones of other membrane proteins

rdf:type

dct:title

G1

<http://example.info/person/1>

pav:authoredBy

G2

<http://example.info/person/0>

pav:curatedBy

G4

Gwen Wong

foaf:name

foaf:Person

rdf:type

Page 31: Dynamic Semantic Metadata in Biomedical Communications

swande:Claim

<http://tinyurl.com/4h2am3a>

Intramembranous Aβ behaves as chaperones of other membrane proteins

rdf:type

dct:title

G1

<http://example.info/person/1>

pav:contributedBy

<http://example.info/citation/1>

swanrel:referencesAsSupportiveEvidence

G5

G6

Page 32: Dynamic Semantic Metadata in Biomedical Communications

G8

<http://example.info/alzswan:statement_f3556dcfc331d9b9af9d5c0cfc570ba6_event_1>

<http://bio2rdf.org/go:0051087>

rdf:type

Event of type GO "chaperone binding"

rdfs:label

<prefix:actor_1>

<prefix:target_1>

<prefix:location_1>

<http://bio2rdf.org/chebi:53002>

<http://bio2rdf.org/mesh:D008565>

<http://bio2rdf.org/go:0005886>

rdf:type

rdf:type

rdf:type

rdfs:label “Beta amyloid”

rdfs:label “Membrane protein”

rdfs:label “Plasma membrane”

With many thanks to Nigam Shah, Stanford University
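
Editor's sketch of the event structure on this slide: an event node typed by a GO term, with an actor, a target, and a location, each typed and labeled. The pairing of participant nodes with type URIs and labels follows their order on the slide, which is an assumption. The predicates linking the event to its participants are not preserved in the slide text, so they are left out rather than guessed; the class names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Participant:
    node: str       # node name as written on the slide
    type_uri: str
    label: str


@dataclass(frozen=True)
class Event:
    node: str
    type_uri: str
    label: str
    participants: Tuple[Participant, ...]

    def triples(self) -> List[Tuple[str, str, str]]:
        """Emit only the rdf:type and rdfs:label triples shown on the slide."""
        out = [(self.node, "rdf:type", self.type_uri),
               (self.node, "rdfs:label", self.label)]
        for p in self.participants:
            out.append((p.node, "rdf:type", p.type_uri))
            out.append((p.node, "rdfs:label", p.label))
        return out


event = Event(
    node="<http://example.info/alzswan:statement_f3556dcfc331d9b9af9d5c0cfc570ba6_event_1>",
    type_uri="<http://bio2rdf.org/go:0051087>",
    label='Event of type GO "chaperone binding"',
    participants=(
        Participant("<prefix:actor_1>", "<http://bio2rdf.org/chebi:53002>", "Beta amyloid"),
        Participant("<prefix:target_1>", "<http://bio2rdf.org/mesh:D008565>", "Membrane protein"),
        Participant("<prefix:location_1>", "<http://bio2rdf.org/go:0005886>", "Plasma membrane"),
    ),
)

for triple in event.triples():
    print(triple)
```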

Page 33: Dynamic Semantic Metadata in Biomedical Communications

HyQue triples

G8

<http://example.info/person/2>

pav:contributedBy

Nigam Shah

foaf:name

foaf:Person

rdf:type

G9

Page 34: Dynamic Semantic Metadata in Biomedical Communications

swande:Claim

<http://tinyurl.com/4h2am3a>

Intramembranous Aβ behaves as chaperones of other membrane proteins

rdf:type

dct:title

G1

HyQue triples

G8

swanrel:derivedFrom

Page 35: Dynamic Semantic Metadata in Biomedical Communications

The target hypothesis will be linked to: Pathway & target relation to disease, Target selection criteria, Validation assays and criteria, Experiment (assay) provenance, Experimental data and computations, Scientist remarks, findings and discussion.

Start as a relatively simple model and extend

Page 36: Dynamic Semantic Metadata in Biomedical Communications

Hypotheses of therapeutic action for compounds and scaffolds, linked to: Hypothesis / results for individual assays, Experiment (assay) provenance, Experimental data, Group annotation, Internal databases, etc.

Start as a relatively simple model and extend

Page 37: Dynamic Semantic Metadata in Biomedical Communications
Page 38: Dynamic Semantic Metadata in Biomedical Communications

Information ecosystem

Page 39: Dynamic Semantic Metadata in Biomedical Communications

Curing complex medical disorders goes hand in hand with next-gen biomedical communications.

Web 3.0 provides the technology framework.

Semantic annotation, hypothesis management, nanopubs: tools for next-gen biomed comms.

Requires / enables international collaborations of biomedical researchers and informaticians.

Open enterprise model with semantic metadata.

Page 40: Dynamic Semantic Metadata in Biomedical Communications

People: Paolo Ciccarese (Harvard), Maryann Martone (UCSD), Anita DeWaard & Tony Scerri (Elsevier), Karen Verspoor & Larry Hunter (Colorado), Adam West & Ernst Dow (Eli Lilly), Carole Goble (Manchester), Nigam Shah (Stanford / NCBO), Paul Groth (VU Amsterdam)

Funding: Elsevier, NIH, Eli Lilly, & EMD Serono