Diversity and Genome Analysis
Deanna M. Church
Senior Director of Applications, 10x Genomics
• How does diversity affect assembly?
• Why is it important to represent this?
• Improving individual genome analysis
How does diversity affect assembly?
Old Assembly Models
Sequences from haplotype 1
Sequences from haplotype 2
New Assembly Models
Sequences from haplotype 1
Sequences from haplotype 2
New Reference Assembly Models
Sequences from haplotype 1
Sequences from haplotype 2
Improving the model
CCL3 region: GRCh37
NC_000017.10 (chr17): 34,442,621-35,005,379
Alternate Loci
CCL5-TBC1D3 region: GRCh38
NC_000017.11 (chr17): 36,032,574-36,269,924
NT_187661.1
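As a minimal sketch (not from the slides), the coordinate strings above can be parsed to get the span of each region. It assumes the coordinates are 1-based and inclusive; `Region` and `parse_region` are hypothetical helper names. The two strings refer to different assemblies (GRCh37 and GRCh38), so their positions are not directly comparable.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    accession: str
    chrom: str
    start: int  # assumed 1-based, inclusive
    end: int    # assumed 1-based, inclusive

    @property
    def length(self) -> int:
        # Inclusive coordinates, so add 1 to the difference.
        return self.end - self.start + 1


# Matches strings such as "NC_000017.10 (chr17): 34,442,621-35,005,379".
_REGION_RE = re.compile(r"^(\S+)\s+\((\w+)\):\s*([\d,]+)-([\d,]+)$")


def parse_region(text: str) -> Region:
    """Parse an 'ACCESSION (chrom): start-end' string into a Region."""
    m = _REGION_RE.match(text.strip())
    if m is None:
        raise ValueError(f"unrecognised region string: {text!r}")
    accession, chrom, start, end = m.groups()
    return Region(accession, chrom,
                  int(start.replace(",", "")),
                  int(end.replace(",", "")))


grch37 = parse_region("NC_000017.10 (chr17): 34,442,621-35,005,379")
grch38 = parse_region("NC_000017.11 (chr17): 36,032,574-36,269,924")
print(grch37.accession, grch37.length)  # span of the GRCh37 region in bp
print(grch38.accession, grch38.length)  # span of the GRCh38 region in bp
```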
100 Kb deletion on chromosome
Steinberg et al., 2014 http://dx.doi.org/10.1101/006841
Diversity and the assembly
• Complex diversity can lead to assembly errors
• Haploid consensus leads to missing information
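One way to see how a haploid consensus can lose information is a toy majority-vote pileup. This is an illustrative sketch only, not the method used to build any real assembly: the two haplotypes, the reads, and the function name `haploid_consensus` are all made up for the example. Because read depth from each haplotype differs along the region, the consensus switches haplotypes and matches neither one.

```python
from collections import Counter, defaultdict


def haploid_consensus(reads, length):
    """Majority-vote consensus from (start, sequence) reads on one coordinate system."""
    pileup = defaultdict(Counter)
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pileup[start + offset][base] += 1
    # Take the most common base per column; 'N' where nothing covers the position.
    return "".join(
        pileup[pos].most_common(1)[0][0] if pileup[pos] else "N"
        for pos in range(length)
    )


# Two hypothetical haplotypes that differ at positions 1 and 6.
hap1 = "ACGTACGT"
hap2 = "ATGTACCT"

# Hypothetical reads: haplotype 1 dominates the left half, haplotype 2 the right.
reads = [
    (0, hap1[0:4]), (0, hap1[0:4]), (0, hap2[0:4]),
    (4, hap1[4:8]), (4, hap2[4:8]), (4, hap2[4:8]),
]

consensus = haploid_consensus(reads, len(hap1))
print(consensus)
print("matches haplotype 1:", consensus == hap1)
print("matches haplotype 2:", consensus == hap2)
```

The single consensus string carries no record that two configurations existed, which is the missing information the bullet refers to.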
Why is this important?
Allelic diversity
10.1038/nrg3933
MAPT locus
MHC (chr6)
Chr 6 representation (PGF)
Alt_Ref_Locus_2 (COX)
Assembly Representation
75 Mb new sequence
>2000 new protein coding transcripts
178 regions with alternate loci
3.6 Mb unique sequence to alts
153 genes unique to alts
GRCh38
• Missing sequence leads to:
  • Missing haplotype configuration
  • Missing gene models
  • Missing knowledge
Improving individual genome analysis
Analysis challenges: alignments
Primary Assembly
Paralogous duplication
Allelic duplication
Alt Locus
MapQ
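The alignment challenge above can be sketched with a toy model: a read that aligns equally well to the primary assembly and to an alt locus looks like a multi-mapping read (and would typically receive a low MapQ) unless the aligner knows the two hits are allelic. Everything below is a hypothetical illustration, not the behaviour of any specific aligner: the contig names, the `ALT_PLACEMENTS` table, and the `Hit`, `locus_key`, and `is_ambiguous` names are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    contig: str
    pos: int
    score: int


# Hypothetical table: alt contig -> (primary contig, start, end) it is placed on.
ALT_PLACEMENTS = {"chrA_alt1": ("chrA", 1000, 2000)}


def locus_key(hit):
    """Collapse a hit onto the primary region it represents, if any."""
    placement = ALT_PLACEMENTS.get(hit.contig)
    if placement is not None:
        return placement  # hit lies on an alt locus
    for chrom, start, end in ALT_PLACEMENTS.values():
        if hit.contig == chrom and start <= hit.pos <= end:
            return (chrom, start, end)  # primary hit inside a region that has an alt
    return (hit.contig, hit.pos, hit.pos)  # ordinary hit, kept distinct


def is_ambiguous(hits, alt_aware):
    """True if the best-scoring hits fall on more than one distinct locus."""
    best = max(h.score for h in hits)
    top = [h for h in hits if h.score == best]
    if alt_aware:
        loci = {locus_key(h) for h in top}
    else:
        loci = {(h.contig, h.pos) for h in top}
    return len(loci) > 1


# Allelic duplication: same locus represented on the primary and on the alt.
allelic = [Hit("chrA", 1500, 100), Hit("chrA_alt1", 500, 100)]
# Paralogous duplication: two genuinely different places on the primary.
paralogous = [Hit("chrA", 1500, 100), Hit("chrA", 9000, 100)]

print(is_ambiguous(allelic, alt_aware=False))
print(is_ambiguous(allelic, alt_aware=True))
print(is_ambiguous(paralogous, alt_aware=True))
```

In this sketch only the paralogous case stays ambiguous once alt placements are taken into account; the allelic case is ambiguous only for an analysis that treats the alt locus as an unrelated sequence.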
Alt Loci and Genes
Medically Interpretable Genes (MIG) ~8,000
Primary Assembly
Alt Locus
6.4%
6.2%
0.18%
Impact on reference based analysis
De Novo Local Assembly of HLA-A Region
Example of analysis from Linked-Reads
Call and phase HLA genes, introns and intervening MHC sequence
Valerie Schneider, Friday 7:50pm
David Jaffe, Friday 8:30pm
Karyn Meltz Steinberg, Friday 9:10
http://www.pathology.washington.edu/galleries/Cytogallery/main.php?file=human%20karyotypes
The reference assembly and genome interpretation
KCNE1
Parting thoughts
Out with the old and in with the new
• Old assembly models lead to analysis holes
• Improved reference can help reference-based analysis
• Individual de novo assembly is the future
• Still need reference to interpret
• Need tool development