WGS in public health microbiology - MDU/VIDRL Seminar - Wed 17 Jun 2015

Whole genome sequencing in public health microbiology. A/Prof Torsten Seemann. Victorian Life Sciences Computation Initiative (VLSCI); Microbiological Diagnostic Unit Public Health Laboratory (MDU PHL); Doherty Centre for Applied Microbial Genomics (DCAMG); The University of Melbourne. MDU/VIDRL Mini Seminar - Melbourne, AU - Wed 17 June 2015


Whole genome sequencing in public health microbiology

A/Prof Torsten Seemann

Victorian Life Sciences Computation Initiative (VLSCI)

Microbiological Diagnostic Unit Public Health Laboratory (MDU PHL)

Doherty Centre for Applied Microbial Genomics (DCAMG)

The University of Melbourne

MDU/VIDRL Mini Seminar - Melbourne, AU - Wed 17 June 2015

The one true assay?

Traditional workflow

Whole Genome Sequencing (WGS)

2-10 Mbp

100-300 bp

30-100x depth
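The three figures on this slide (genome size, read length, depth) are linked by simple arithmetic: mean depth is roughly total sequenced bases divided by genome size. A minimal sketch follows; the function names are illustrative, and the example values are simply picked from within the ranges above.

```python
import math


def reads_needed(genome_bp: int, read_bp: int, depth: int) -> int:
    """Smallest number of reads such that reads * read_bp / genome_bp >= depth."""
    return math.ceil(depth * genome_bp / read_bp)


def mean_depth(n_reads: int, read_bp: int, genome_bp: int) -> float:
    """Average number of times each base is covered, ignoring any bias."""
    return n_reads * read_bp / genome_bp


# Illustrative values only: a 5 Mbp genome, 250 bp reads, 50x target depth.
n_reads = reads_needed(genome_bp=5_000_000, read_bp=250, depth=50)
print(n_reads)                                  # 1000000
print(mean_depth(n_reads, 250, 5_000_000))      # 50.0
```

This is an average only; real runs cover some regions more deeply than others and may miss some entirely.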

Simple?

“Analysis” and “Results”

The promise of genomics

∷ A single assay

∷ Cheaper

∷ Faster

∷ High throughput

∷ Full single nucleotide resolution

Utility of WGS

∷ Diagnostics: strain level identification: in silico antibiogram and virulence profile

∷ Surveillance: in silico genotyping - MLST, serotyping, VNTR, MLVA: what’s lurking in our hospital/community?

∷ Forensics: outbreak detection: source tracking
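To make "in silico genotyping" concrete, here is a toy MLST-style lookup: find which known allele of each locus is present in an assembly, then map the allele profile to a sequence type. Everything in it is hypothetical (locus names, allele sequences, ST labels); real schemes use several full-length housekeeping genes, search both strands, and tolerate inexact matches.

```python
# Hypothetical allele database: locus -> {allele sequence: allele number}.
ALLELES = {
    "locusA": {"ATGGCA": 1, "ATGGCG": 2},
    "locusB": {"TTGACC": 1, "TTGATC": 2},
}

# Hypothetical profile table: allele numbers (in sorted locus order) -> ST.
PROFILES = {(1, 1): "ST1", (2, 1): "ST2", (2, 2): "ST3"}


def call_mlst(assembly: str):
    """Return (allele profile, sequence type) using exact substring matches.

    Forward strand only; a locus with no known allele is reported as None.
    """
    profile = []
    for locus in sorted(ALLELES):
        hit = None
        for sequence, number in ALLELES[locus].items():
            if sequence in assembly:
                hit = number
                break
        profile.append(hit)
    profile = tuple(profile)
    return profile, PROFILES.get(profile, "novel or incomplete")


print(call_mlst("CCCATGGCGAAATTGACCGGG"))   # ((2, 1), 'ST2')
```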

Modern workflow

Does it deliver?

∷ In general, YES

∷ But it does NOT replace good epidemiology

∷ WGS is just another (powerful) tool

∷ Need proper bioinformatics

Got my reads. Now what?

Aligning to reference

AGTCTGATTAGCTTAGCTTGTAGCGCTATATTATAGTCTGATTAGCTTAGAT
      ATTAGCTTAGATTGTAG
           CTTAGATTGTAGC-C
    TGATTAGCTTAGATTGTAGC-CTATAT
        TAGCTTAGATTGTAGC-CTATATT
             TAGATTGTAGC-CTATATTA
             TAGATTGTAGC-CTATATTAT
                ^       ^
                SNP     Deletion
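The alignment on this slide can be turned into variant calls with a simple pileup: stack the reads on the reference and compare the majority base in each column with the reference base. The sketch below uses the slide's reference and reads; the read offsets are an assumption, inferred by matching each read against the reference because the slide's layout is not preserved in the transcript. The `min_depth` parameter and the function name are illustrative, and real callers also weigh base quality, mapping quality and strand.

```python
from collections import Counter

REFERENCE = "AGTCTGATTAGCTTAGCTTGTAGCGCTATATTATAGTCTGATTAGCTTAGAT"

# (0-based offset on the reference, gapped read); "-" marks a deleted base.
# Offsets are inferred from the sequences, not taken from the slide.
ALIGNED_READS = [
    (6, "ATTAGCTTAGATTGTAG"),
    (11, "CTTAGATTGTAGC-C"),
    (4, "TGATTAGCTTAGATTGTAGC-CTATAT"),
    (8, "TAGCTTAGATTGTAGC-CTATATT"),
    (13, "TAGATTGTAGC-CTATATTA"),
    (13, "TAGATTGTAGC-CTATATTAT"),
]


def call_variants(reference, aligned_reads, min_depth=2):
    """Report columns where the majority read base differs from the reference."""
    pileup = [Counter() for _ in reference]
    for offset, read in aligned_reads:
        for i, base in enumerate(read):
            pileup[offset + i][base] += 1

    variants = []
    for pos, counts in enumerate(pileup):
        if sum(counts.values()) < min_depth:
            continue  # too few reads to say anything about this column
        consensus, _ = counts.most_common(1)[0]
        if consensus != reference[pos]:
            kind = "deletion" if consensus == "-" else "SNP"
            variants.append((pos + 1, reference[pos], consensus, kind))
    return variants


for variant in call_variants(REFERENCE, ALIGNED_READS):
    print(variant)
# (17, 'C', 'A', 'SNP')
# (25, 'G', '-', 'deletion')
```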

Reference based analysis

∷ Implies you have a “close” reference: need to be careful with draft genomes

∷ Very sensitive: single mutation precision

∷ Core genome only: ignores novel DNA in your isolate

De novo genome assembly

De novo analyses

∷ Does not require a reference

∷ Access to whole pan-genome: new plasmids: unexpected antibiotic resistance elements: virulence factors

∷ Limited by short reads: misleading results in repeated regions: not suitable for high-res SNP analysis

Best practice

∷ Use both approaches: reference-based + de novo

∷ Best of both worlds: and worst of both worlds - interpretation is non-trivial

∷ Still need: good epidemiology, metadata and domain knowledge!

Limitations

Sequencing bias

Isolate genome

Sequenced reads

Other isolates in sequencing run

Contamination

Unsequenced regions

Read length

250 bp - Illumina - $100

8000 bp - PacBio - $1000

Repeats

Repeat copy 1

Repeat copy 2

Collapsed repeat consensus

1 locus

4 contigs
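One way a repeat fragments an assembly can be shown with a toy k-mer graph. In the sketch below a made-up genome carries the same 10 bp repeat twice. With k shorter than the repeat, the two copies collapse into one path and the graph breaks into four non-branching pieces; with k longer than the repeat, the genome comes back as a single piece. This is a simplification under stated assumptions: the sequences are invented, k stands in for read length, and real assemblers also handle sequencing errors, reverse complements and circular genomes.

```python
from collections import defaultdict

# Invented example: unique flanks around two copies of the same 10 bp repeat.
REPEAT = "GGATCCGGTA"
GENOME = "ACGTTGCA" + REPEAT + "CATTAGAC" + REPEAT + "TTCGAGCT"


def unitigs(genome: str, k: int):
    """Maximal non-branching paths in the k-mer graph of a linear genome."""
    kmers = [genome[i:i + k] for i in range(len(genome) - k + 1)]
    succ, pred = defaultdict(set), defaultdict(set)
    for a, b in zip(kmers, kmers[1:]):
        succ[a].add(b)
        pred[b].add(a)

    def starts_path(node):
        # A path starts where the way in is missing or ambiguous.
        if len(pred[node]) != 1:
            return True
        (parent,) = pred[node]
        return len(succ[parent]) != 1

    paths = []
    for node in sorted(set(kmers)):
        if not starts_path(node):
            continue
        path, current = node, node
        # Extend while there is exactly one way out and one way in.
        while len(succ[current]) == 1:
            (nxt,) = succ[current]
            if len(pred[nxt]) != 1:
                break
            path += nxt[-1]
            current = nxt
        paths.append(path)
    return paths


print(len(unitigs(GENOME, k=5)))    # 4: the repeat is collapsed into one piece
print(len(unitigs(GENOME, k=11)))   # 1: k spans the repeat, so it is resolved
```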

Inferring transmission

∷ Identical sequence does not imply transmission

∷ Easier to rule out than in
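The asymmetry in these two points can be sketched with pairwise SNP distances over a core-genome alignment: a large distance argues against recent transmission, while a small or zero distance only means the pair cannot be excluded. The isolate names, sequences and threshold below are hypothetical; a sensible threshold depends on the organism and the time frame, and the epidemiology still decides.

```python
from itertools import combinations

# Hypothetical aligned core-genome sites for three isolates.
ALIGNMENT = {
    "iso1": "ACGTACGTAC",
    "iso2": "ACGTACGTAC",
    "iso3": "ACGAACGTTC",
}


def snp_distance(a: str, b: str) -> int:
    """Count differing sites, ignoring gaps and ambiguous bases."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    return sum(
        1 for x, y in zip(a, b)
        if x != y and x in "ACGT" and y in "ACGT"
    )


def pairwise_distances(alignment):
    """SNP distance for every pair of isolates, keyed by sorted name pair."""
    return {
        (p, q): snp_distance(alignment[p], alignment[q])
        for p, q in combinations(sorted(alignment), 2)
    }


def not_excluded(alignment, threshold):
    """Pairs close enough that transmission cannot be ruled out on sequence alone."""
    return [pair for pair, d in pairwise_distances(alignment).items() if d <= threshold]


print(pairwise_distances(ALIGNMENT))
print(not_excluded(ALIGNMENT, threshold=1))   # [('iso1', 'iso2')]
```

Note that the output lists pairs that are "not excluded", not pairs that are "linked": identical sequences are consistent with transmission but do not prove it.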

Cutting edge web tools

Real time tracking of seasonal influenza virus evolution in humans

nextflu.org

ebola.nextflu.org

mers.nextflu.org

Drag genomes.

Calculates:

∷ tree

∷ MLST

∷ resistome

Add metadata

∷ source

∷ location

∷ colours

wgsa.net

Visualise and explore trees linked to genome data. Just upload .nwk and .csv!

microreact.org

Sharing data

Open science

∷ Crowd-sourcing provably works: EHEC outbreak 2011: Ebola: MERS

∷ But only if people share: sequencing data: metadata: software source code for analysis

GenomeTrakr

∷ International cooperation: led by FDA + NCBI: >20 collaborating institutes inc. UK PHE, DK DTU, MX: Salmonella and Listeria

∷ Public SRA BioProject #183844: real-time submission of WGS genome reads: nightly updates of phylogenomic trees: contains ~8000 strains of Salmonella

“GenomeTrakka”

∷ A shared online system for all Australian labs: upload samples: automated standard/specific analyses: simple reports and visualization: easy to submit to international archives (SRA)

∷ Access control: each lab controls their own data: jurisdictions can share data in national outbreaks

Conclusion

Acknowledgements

∷ Slide source material: Nick Loman: Jennifer Gardy: Rob Beiko

∷ Slide feedback: Jason Kwong: Dieter Bulach

Contact

∷ http://tseemann.github.io

[email protected]

∷ @torstenseemann

The End

Thank you for listening.