RAMI-NGS, Hamburg, Germany, 9-11 June 2016 · 2016-06-21 · Read mapping algorithms Bowtie2 BWA...

João André Carriço, Mario Ramirez Microbiology Institute and Instituto de Medicina Molecular, Faculty of Medicine, University of Lisbon [email protected] twitter: @jacarrico RAMI-NGS, Hamburg, Germany, 9-11 June 2016

Page 1:

João André Carriço, Mario Ramirez Microbiology Institute and Instituto de Medicina Molecular, Faculty of Medicine, University of Lisbon [email protected] twitter: @jacarrico

RAMI-NGS, Hamburg, Germany, 9-11 June 2016

Page 2:

Moving from Typing into High Throughput Sequencing (HTS) Genomics:

Increase in discrimination

Extra information to be extracted from the genome (resistance profiles, virulence factors, genome organization)

Global Outbreak detection / Surveillance

Direct application in public health: Source attribution -> intervention

Page 3:

Image credits: 1) http://www.iissiidiology.net/en/publications/104-ayfaar-interpersonal-and-true-human-relationship-harmonization-mechanisms 2) http://blog.f1000research.com/2014/04/04/reproducibility-tweetchat-recap/

Data Integration

Harmonization Reproducibility


Page 4:

Algorithms

Interfaces

Ontologies

Page 5:

Read mapping algorithms:

Bowtie2

BWA

SOAP2

Saruman

mr/mrsFAST

… (and a lot more)

Algorithms

Hatem M et al., BMC Bioinformatics 2013, 14:184. DOI: 10.1186/1471-2105-14-184

+ a plethora of parameters for each of them

+ a (proper) choice of reference

Page 6:

Gene-by-gene approach allele call algorithms:

BIGSdb (Jolley, K. A. & Maiden, M. C. J. BMC Bioinf 11, 595 (2010))

Enterobase (https://enterobase.warwick.ac.uk/)

GEP (Genome Profiler) (JCM. 2015 May;53(5):1765-7)

Ridom Seqsphere

Bionumerics (Applied Maths)

Mostly assembly based (yes, it is a lot of work…)

Assembly algorithms have some parameters (mostly k-mer sizes)

Lots of heuristics for allele definition.

Algorithms

Page 7:

Gene by gene approaches:

What is a locus?

What is an allele?

It depends on the algorithm(s) used!

Algorithms

However, the results are largely congruent!
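To make the "what is an allele?" question concrete, here is a minimal sketch of one possible allele-calling rule: an allele is any distinct exact sequence observed at a locus. This is only an assumption for illustration. The tools listed earlier apply their own heuristics (similarity thresholds, checks on coding sequences, and so on), and the function names, locus names and sequences below are hypothetical.

```python
import hashlib


def call_allele(sequence, known_alleles):
    """Return (allele_id, is_new) for one locus sequence.

    known_alleles maps a sequence digest to an integer allele id and is
    updated in place when a previously unseen sequence is found.
    Assumption: two sequences are the same allele only if they match exactly.
    """
    normalized = sequence.upper()
    digest = hashlib.sha1(normalized.encode("ascii")).hexdigest()
    if digest in known_alleles:
        return known_alleles[digest], False
    new_id = len(known_alleles) + 1
    known_alleles[digest] = new_id
    return new_id, True


def call_profile(locus_sequences, schema):
    """Build an allelic profile {locus: allele_id} for one isolate."""
    profile = {}
    for locus, seq in locus_sequences.items():
        alleles = schema.setdefault(locus, {})
        allele_id, _ = call_allele(seq, alleles)
        profile[locus] = allele_id
    return profile


# Hypothetical toy data: two isolates, two loci.
schema = {}
isolate_a = {"locus1": "ATGAAATAA", "locus2": "ATGCCCTAA"}
isolate_b = {"locus1": "ATGAAATAA", "locus2": "ATGCCGTAA"}
profile_a = call_profile(isolate_a, schema)
profile_b = call_profile(isolate_b, schema)
```

Changing the matching rule (for example, accepting near-identical sequences as the same allele) would change the profiles, which is why the definitions depend on the algorithm used.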

Page 8:

Ontologies

Image from http://www.emiliosanfilippo.it/?page_id=1172

Page 9:

“Formal representation of knowledge as a set of concepts within a domain, and the relationships between those concepts” – Wikipedia

Domain modeling: represents all the concepts involved in microbial typing by sequence-based methods

Provides a shared vocabulary, where the concepts should be unambiguous

Enables a machine-readable format that software and algorithms can use to interact automatically with multiple databases
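As a rough illustration of "concepts and relationships in a machine-readable form", the sketch below stores a tiny vocabulary as subject-predicate-object triples and queries it. The terms and relation names are hypothetical and are not taken from any of the ontologies named in this talk. A real ontology would normally be written in a standard format such as OWL/RDF.

```python
# Hypothetical mini-vocabulary for sequence-based typing.
# Each entry is a (subject, predicate, object) triple.
TRIPLES = {
    ("Allele", "is_a", "SequenceVariant"),
    ("Allele", "belongs_to", "Locus"),
    ("Locus", "part_of", "Schema"),
    ("AllelicProfile", "has_part", "Allele"),
    ("Isolate", "has_profile", "AllelicProfile"),
}


def objects(subject, predicate, triples=TRIPLES):
    """All objects linked to `subject` through `predicate`, sorted."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def subjects(predicate, obj, triples=TRIPLES):
    """All subjects linked to `obj` through `predicate`, sorted."""
    return sorted(s for s, p, o in triples if p == predicate and o == obj)


# A program can now answer "what does an allele belong to?" without a human.
allele_parent = objects("Allele", "belongs_to")
```

Because every concept has exactly one name and an explicit set of relations, two databases that share the vocabulary can be queried by the same code.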

Ontologies

Page 10:

Ontologies

Page 11:

GenEpiO: Combining Different Epi, Lab, Genomics and Clinical Data Fields.

Lab Analytics: Genomics, PFGE, Serotyping, Phage typing, MLST, AMR

Clinical Data: Patient demographics, Medical History, Comorbidities, Symptoms, Health Status

Reporting: Case/Investigation Status

GenEpiO (Genomic Epidemiology Application Ontology)

See draft version at https://github.com/Public-Health-Bioinformatics/IRIDA_ontology

Original slide from Emma Griffiths

Ontologies

Page 12:

Public Health Surveillance

Case Cluster Analysis

Result Reporting

Infectious Disease Epidemiology (from case to Intervention)

Lab Surveillance (from sample to strain typing results)

Evidence Collection & Outbreak Investigation

Sample Collection & Processing

Sequence Data Generation & Processing

Bioinformatics Analysis

Result Reporting

Whole Genome Sequencing (SO, ERO, OBI etc)

Quality Control (OBI, ERO)

Anatomy (FMA)

Environment (Envo)

Food (FoodOn)

Clinical Sampling (OBI)

Custom LIMS

Quality Control (OBI, ERO)

AMR (ARO)

Virulence (PATO)

Phylogenetic Clustering (EDAM)

Mobile Elements (MobiO)

Quality Control (OBI, ERO)

AMR (ARO) LOINC

Surveillance (SurvO)

Demographics (SIO)

Patient History (SIO)

Symptoms (SYMP)

Exposures (ExO)

Source Attribution (IDO)

Travel (IDO)

Transmission (TRANS)

Food (FoodOn)

Geography (OMRSE)

Outbreak Protocols

Surveillance (SurvO)

Food (FoodOn)

Surveillance (SurvO)

Mobile Elements (MobiO)

Infectious Disease (IDO)

Typing (TypON)

Nomenclature & Taxonomy (NCBItaxon)

Original slide from Emma Griffiths /IRIDA

http://foodontology.github.io/foodon/

(pipeline) NGSOnto

Page 13:

Provides a machine-readable web-based interface, i.e., the algorithms (not humans) can:

retrieve, submit, update data / analysis results

launch analysis/algorithms

Interfaces

http://www.clker.com/cliparts/q/P/V/D/5/R/cog-allgrey-hi.png

Page 14:

BIGSdb and Enterobase offer a RESTful API for data retrieval, submission and analysis
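To show what "algorithms (not humans)" talking to a RESTful interface can look like, here is a minimal sketch that only builds the HTTP requests, without sending them. The base URL, endpoint paths, query parameters and JSON fields are all hypothetical. They are not the actual BIGSdb or Enterobase routes, which should be taken from each project's API documentation.

```python
import json
from urllib.parse import urlencode, urljoin
from urllib.request import Request

# Hypothetical service root; not a real BIGSdb or Enterobase address.
BASE_URL = "https://typing-db.example.org/api/"


def build_profile_query(scheme, sequence_type):
    """Build a GET request asking for one profile in a typing scheme."""
    query = urlencode({"scheme": scheme, "st": sequence_type})
    url = urljoin(BASE_URL, "profiles") + "?" + query
    return Request(url, headers={"Accept": "application/json"}, method="GET")


def build_submission(isolate_id, profile):
    """Build a POST request submitting an allelic profile as JSON."""
    body = json.dumps({"isolate": isolate_id, "profile": profile}).encode("utf-8")
    return Request(
        urljoin(BASE_URL, "isolates"),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


query_request = build_profile_query("MLST", 11)
submit_request = build_submission("isolate-001", {"locus1": 1, "locus2": 2})
# Passing either object to urllib.request.urlopen() would send it.
```

Keeping request construction separate from sending makes the calls easy to inspect and test before any real service is contacted.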

Interfaces

Page 15:

Interfaces

Page 16:

Interfaces

https://online.phyloviz.net/

API:

* account creation
* profile + metadata upload
* running goeBURST
* retrieving a link

Private or Public data sharing

Scalable to thousands of nodes

Tree Analysis tools:

* Interactive distance matrix
* NLV graph
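The two tree analysis tools can be illustrated with a small sketch. It computes pairwise distances between allelic profiles as the number of differing loci, then keeps every pair at distance n or less, which is the assumed reading of an "n locus variant" (NLV) graph. This is not PHYLOViZ code and it does not implement goeBURST and its tie-break rules. The profile names and allele numbers are made up.

```python
from itertools import combinations


def hamming(profile_a, profile_b):
    """Number of loci at which two equal-length profiles differ."""
    return sum(1 for a, b in zip(profile_a, profile_b) if a != b)


def distance_matrix(profiles):
    """Symmetric matrix (dict of dicts) of pairwise locus differences."""
    names = sorted(profiles)
    matrix = {n: {m: 0 for m in names} for n in names}
    for a, b in combinations(names, 2):
        d = hamming(profiles[a], profiles[b])
        matrix[a][b] = matrix[b][a] = d
    return matrix


def nlv_graph(profiles, n):
    """Edges (a, b, distance) for every pair differing at no more than n loci."""
    matrix = distance_matrix(profiles)
    return [
        (a, b, matrix[a][b])
        for a, b in combinations(sorted(profiles), 2)
        if matrix[a][b] <= n
    ]


# Hypothetical 7-locus profiles.
profiles = {
    "ST1": (1, 1, 1, 1, 1, 1, 1),
    "ST2": (1, 1, 1, 1, 1, 1, 2),
    "ST3": (1, 1, 1, 1, 1, 3, 2),
    "ST4": (4, 4, 4, 4, 1, 1, 1),
}
matrix = distance_matrix(profiles)
slv_edges = nlv_graph(profiles, 1)
dlv_edges = nlv_graph(profiles, 2)
```

With n = 1 only single locus variants are linked (ST1-ST2 and ST2-ST3). Raising n to 2 adds the ST1-ST3 link, while ST4 stays disconnected.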

Page 17:

Transparency of analytical methods

Better definition of concepts (Clinical/Lab/Analysis)

Better tool/database interoperability

• Reproducibility of results
• Creation of modular analysis with added value
• Custom interfaces for non-bioinformatics specialists

Actionable Results

Page 18:

Page 19:

UMMI Members

Bruno Gonçalves, Mickael Silva, Miguel Machado, Mário Ramirez, José Melo-Cristino

INESC-ID: Alexandre Francisco, Cátia Vaz, Marta Nascimento

EFSA INNUENDO Project (https://sites.google.com/site/innuendocon/)

Mirko Rossi

FP7 PathoNGenTrace (http://www.patho-ngen-trace.eu/):

Dag Harmsen (Univ. Muenster); Stefan Niemann (Research Center Borstel); Keith Jolley, James Bray and Martin Maiden (Univ. Oxford); Joerg Rothganger (RIDOM); Hannes Pouseele (Applied Maths)

Genome Canada IRIDA project (www.irida.ca)

Franklin Bristow, Thomas Matthews, Aaron Petkau, Morag Graham and Gary Van Domselaar (NML, PHAC); Ed Taboada and Peter Kruczkiewicz (Lab Foodborne Zoonoses, PHAC); Fiona Brinkman (SFU); William Hsiao (BCCDC)

INTEGRATED RAPID INFECTIOUS DISEASE ANALYSIS