ASHG 2012 Poster

InVitae reports findings only for requisitioned conditions. A report for the 150 conditions currently offered is 250 pages. Online reporting and organization make the report easily navigable. A few features of our reports are shown below.

A consistent definition of a transcript's exon structure is essential to reliably mapping and interpreting variants. Inconsistencies lead to incorrect translations of research findings to clinical settings. We account for the following challenges:

Curators and developers may easily generate reports for simulated samples with arbitrary collections of curated and novel variants across multiple conditions. Tests may be saved for future execution and regression testing.
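A minimal sketch of how a saved, repeatable simulation might be specified. All names here (`simulate_sample`, `to_json`, the variant identifiers, and the ancestry label) are hypothetical illustrations, not InVitae's tooling; the only details taken from the poster are the homozygous, heterozygous, and no-data locus states and the choice of gender and ancestry.

```python
import json
import random

# Locus states named on the poster.
ZYGOSITIES = ("homozygous", "heterozygous", "no-data")


def simulate_sample(variant_ids, gender, ancestry, seed):
    """Assign a random locus state to each selected variant.

    The seed is stored with the spec so that a saved test can be
    regenerated identically for regression testing.
    """
    rng = random.Random(seed)
    loci = {v: rng.choice(ZYGOSITIES) for v in variant_ids}
    return {"gender": gender, "ancestry": ancestry, "seed": seed, "loci": loci}


def to_json(spec):
    """Serialize a simulation spec so it can be saved and replayed later."""
    return json.dumps(spec, indent=2, sort_keys=True)


# Hypothetical variant identifiers and ancestry label, for illustration only.
spec = simulate_sample(["var1", "var2", "var3"], "female", "EUR", seed=42)
replay = simulate_sample(["var1", "var2", "var3"], "female", "EUR", seed=42)
```

Because the random generator is seeded, `spec` and `replay` are identical, which is the property a regression test relies on.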

one sample, one requisition

one report, up to 150 conditions

two weeks, one lab, one price

Computational and informatics challenges in providing clinically-relevant genome interpretation from high-throughput sequencing data.

Reece Hart; InVitae Team, San Francisco, CA, 94107

InVitae provides sequencing and clinically-relevant genome interpretation services to physicians from patient blood samples. Our value is based on three essential components: a database of high-quality associations of variants and conditions, carefully designed targeted sequencing assays, and a sophisticated analysis pipeline for interpreting variants. The current process requires less than two weeks from the arrival of blood to the delivery of a clinical report covering over 10,000 curated variants in 250 genes for up to 150 conditions (subject to physician's requisition). This poster summarizes the computational and informatics tools that enable this process.

Clinician's view of InVitae

InVitae's process features online requisitioning and reporting, CLIA-certified sequencing, and HIPAA-compliant information management.

intake

The Trouble with Transcripts

Report Excerpts

Variant Simulation and Report Testing

similar conditions grouped together

carriers of known pathogenic variants

condition groups sorted by risk level and evidence

ancestry-dependent quantitative risks

known pathogenic variants have strongest evidence of association

predicted effect(s)

supporting publications

frequency in 1000 Genomes Project

haplotype alleles, inferred haplotypes, and risk association

absence of known pathogenic variants (covered regions and qualities shown at end of report)

pathogenic variants inferred from condition-specific rules for the interpretation of novel variants

variants of unknown significance, with and without prior observations

ancestry-aware inference of risk from combination of odds ratios

regions where transcript sequences differ from the reference genome are not interpretable

simulate variants for specified genders and ancestry

simulate new variants for VUS analysis

select curated variants; create homozygous, heterozygous, and no-data loci

NM_012345.6

NM_012345.6

ENST987654

disagreement between reference genome and transcript

(3514/33165 transcripts)

exon structure changes for a single RefSeq accession, e.g., NM_001035.2 (RYR2)

suboptimal alignments to the reference genome

e.g., ALMS1

structure and CDS equivalence of RefSeq and Ensembl transcripts

transcript records with atypical record formats (all 18 DMD transcripts)
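As an illustration of the structure and CDS equivalence check listed among the challenges above, the sketch below compares two transcript definitions. It assumes both transcripts are expressed as exon intervals on the same genome assembly, using 0-based half-open coordinates, and it ignores strand and sequence content. The `Transcript` type, the function names, and all coordinates are hypothetical; the accessions are the placeholder labels that appear on the poster.

```python
from typing import NamedTuple, Tuple


class Transcript(NamedTuple):
    accession: str
    exons: Tuple[Tuple[int, int], ...]  # genomic (start, end), 0-based half-open
    cds_start: int                      # genomic start of the coding sequence
    cds_end: int                        # genomic end of the coding sequence


def exon_structure_equal(a, b):
    """True when both transcripts have identical exon boundaries."""
    return a.exons == b.exons


def cds_segments(t):
    """Clip each exon to the CDS, dropping exons that lie entirely in UTRs."""
    segments = []
    for start, end in t.exons:
        s, e = max(start, t.cds_start), min(end, t.cds_end)
        if s < e:
            segments.append((s, e))
    return tuple(segments)


def cds_equivalent(a, b):
    """True when the coding portions match even if the UTRs differ."""
    return cds_segments(a) == cds_segments(b)


# Invented coordinates: the two records differ only in their UTR extents.
refseq = Transcript("NM_012345.6", ((100, 200), (300, 400), (500, 650)), 150, 600)
ensembl = Transcript("ENST987654", ((90, 200), (300, 400), (500, 700)), 150, 600)
```

Here `exon_structure_equal(refseq, ensembl)` is false while `cds_equivalent(refseq, ensembl)` is true, the case in which a variant in the coding region maps identically under either transcript.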

NM_123456.7

The InVitae pipeline is designed to provide at least 50x depth across all targeted regions for all covered genes/conditions. Samples that do not meet stringent criteria for sequence depth, sequence coverage, and coverage of known pathogenic variants for requisitioned conditions are rerun or failed. Personal Health Information remains on premises; the rest of the pipeline (reads through anonymized report) executes on the Amazon Web Services platform.

☞ See also: 3692W, lab process (Session I)
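A minimal sketch of a depth check in the spirit of the criteria above. The 50x threshold comes from the poster; treating it as a per-base minimum over each targeted region is an assumption, as are the `RegionCoverage` type, the function names, and the region names. The actual criteria also cover sequence coverage and known pathogenic variant sites, which are not modeled here.

```python
from dataclasses import dataclass
from typing import List

MIN_DEPTH = 50  # from the poster: at least 50x across targeted regions


@dataclass
class RegionCoverage:
    name: str          # hypothetical region label
    depths: List[int]  # per-base read depth across the targeted region


def region_passes(region, min_depth=MIN_DEPTH):
    """True when every base in the region meets the minimum depth (assumed rule)."""
    return bool(region.depths) and min(region.depths) >= min_depth


def sample_status(regions, min_depth=MIN_DEPTH):
    """Return ("pass", []) or ("rerun_or_fail", [names of failing regions])."""
    failing = [r.name for r in regions if not region_passes(r, min_depth)]
    return ("pass", []) if not failing else ("rerun_or_fail", failing)


# Invented depths for two hypothetical regions.
regions = [
    RegionCoverage("region_A", [60, 75, 52]),
    RegionCoverage("region_B", [80, 49, 90]),
]
status = sample_status(regions)
```

With these invented depths, `region_B` dips to 49x at one base, so the sample is flagged for rerun or failure.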

report

known pathogenic

alignment ● bwa ● base quality recalibration ● automated ● coverage analysis

variant calling ● GATK ● polyMNP caller ● variant phasing ● haplotype calling

reporting ● overall pipeline versioning ● lab director oversight

VUS

sample intake ● online requisitioning ● barcoding ● information security

sequencing assay ● assay design ● PCR fill-in ● multiplexing ● automation ● LIMS

inferred pathogenic

variants, alignments, reads

variant annotation ● classification ● variant effect/VUS pipeline ● quantitative risk modeling

blood

known pathogenic, novel pathogenic, and VUS variants appear in distinct sections of the report

process

computing challenges

The heart of InVitae is the curation database, a manually curated compendium of associations of genomic variants and clinical conditions derived from literature and public sources. The curation database informs assay design and variant interpretation.

curation database

Curated genomic variants and clinical findings derived from literature and public databases.

☞ See also: 1766W, curation process; 1771W, variant classification (both Session I)

Curation Database

Sequence Analysis and Variant Interpretation

Requisitioning and Laboratory Information Management System

interpretation, sequencing, director review
