Validation of Identity and Ancestry SNP Panels for the Ion PGM™

Christopher Phillips, Carla Santos, Maria de la Puente, Manuel Fondevila, Ángel Carracedo, Maviky Lareu

Forensic Genetics Unit, University of Santiago de Compostela

Runa Daniel, Dennis McNevin & Roland van Oorschot, VPFSD, Melbourne

Walther Parson & Mayra Eduardoff, GMI

Peter Schneider & Theresa Gross, UHC

Where does Ion PGM™ fit into the ‘sequence explosion’?

Next-generation sequencing (NGS) / massively parallel sequencing

Platforms shown: LT SOLiD, Illumina HiSeq, MiSeq, Ion PGM™

Ion Torrent is designed to read comparatively short fragments at much greater coverage - more reads provide high levels of accuracy

Considerations for evaluation of Ion PGM™ SNP panels

• For analysing degraded DNA with short amplicon SNPs there is plenty of choice (75 million variants in 1000 Genomes) - so if a SNP performs poorly, it can be easily replaced. By contrast, SNPs selected for EVC prediction and the best AIMs must work optimally for the test to be informative.

• Interested in the SNPs as much as the detection system, and can ask: ‘are there good and bad performers in any one SNP set detected with Ion PGM™?’ ‘Does performance impact quality of results from low level DNA?’

Are all SNPs in the multiplex equally well genotyped?

What is the genotyping precision (but also, what is the true genotype)?

What happens to sequence data when low-level or mixed DNA is the input?

• The need to align generated sequences to a reference sequence adds a new layer of complexity if context sequence features (homopolymers or indels) occur near the SNP and interfere with its secure alignment.

• Forensic SNP typing must get the balance right between the desired coverage : multiplex scales : the barcoded samples loaded per chip. Ion PGM™ sequences are spread across the ‘pit capacity’ in the chips - not easy to set up consistently.

Are there coverage outliers?

Can we set a minimum coverage limit?

• As SNPs are binary and Ion PGM™ is very sensitive, it is important to create a secure system to distinguish mixtures from imbalanced heterozygotes. What are the patterns of allelic balance in the SNP set?

Three studies of the Ion PGM™ system

Combined five established SNaPshot® PCRs to create a pool of 136 unique amplicons with 51-156 bp size range (five-fold increase in marker depth)

No balancing made of individual PCRs - 10 µl of purified product (elution columns) pooled for the library preparation, then sequenced with 314v1 chips. No AmpliSeq™

Three samples, three input amounts, concordance checked with SNaPshot®/Sanger

Provided with prototype versions of the LT forensic Identity Panel

Decided to adopt a simple NIST validation framework centred on the qualified run - here, closely matched protocols for the same control DNAs

Four chip types, 16 samples, concordance checked with public data. AmpliSeq™

A EUROFORGEN deliverable is to develop ancestry and EVC inference panels. Designed a 128-SNP ancestry set for NGS.

Obtained customised primer design service of LT ‘white glove’ team. Validating this SNP set performance with above framework in five labs. Ongoing

Combined five optimised forensic PCRs

Multiplexes: 52plex, 34plex, Eurasiaplex, Pacifiplex, IrisPlex

Amplified three control DNAs with five SNaPshot PCRs

QIAquick® PCR purification, but no quantitation

Pooled DNA end-labelled and put into library preparation

SNPs per multiplex: SNPforID: 52, 34-plex: 34, Eurasiaplex: 27, Pacifiplex: 28, IrisPlex: 6

136 unique sites

[Figure: Sequence coverage: 9947A 0.2 ng. SNPs grouped by multiplex: Pacifiplex, 34plex AIMs, IrisPlex, SNPforID 52, Eurasiaplex]

Established forensic PCRs worked well

Eurasiaplex dominates top 10% coverage - PCR too strong

Obtained three times more coverage than expected from 314 Chips

Template amounts did not influence coverage

[Figure: coverage for samples 9947A, 007 and S1 at 0.1 ng, 0.2 ng and 0.3 ng input]

Allelic balance

• Outlier Allele Read Frequency ranges correlated with lower coverage

• Outlier 10-30% and 70-90% Allele Read Frequency were difficult to interpret

• These ranges represent ambiguous genotype designations - are these homozygotes with a large number of spurious alternative alleles or heterozygotes with pronounced allelic imbalance?

• If a small proportion of SNPs are consistently imbalanced they may be ‘discounted’ from analyses

[Figure: allele counts and coverage for samples 9947A, 007 and S1 at 0.1 ng, 0.2 ng and 0.3 ng input]

Genotyping concordance

97% concordance obtained comparing Ion PGM™ to SNaPshot® - with a further 1.5% genotypes from Sanger sequencing concordant with Ion PGM™. No no-calls

[Figure: concordance chart. Labels: With SNaPshot®, With Sanger; 98.5%; -3%; -1.5%; rs1029047, rs717302]

[Figure: rs1029047 ...one of several tricky homopolymeric alignments]

Conclusions

The PCR multiplex limits of SNaPshot® can be readily expanded by combining products of multiple reactions and pooling for library preparation.

Although the average sequence coverage per SNP of 500x was very high, it was highly variable amongst the SNPs in each multiplex. This coverage bias was non-random: the same SNPs had consistently high/low coverage in each sample/input amount. Absence of AmpliSeq™ likely to affect balance.

Furthermore, SNaPshot® could be better balanced. Not certain if Ion PGM™ kit PCRs are equimolar or have molar ratios adjusted to improve balance.

Imbalanced heterozygotes correlated with low coverage, but not all low coverage SNPs were imbalanced. So an Allele Read Frequency threshold (10/90 - 40/60) is the preferred approach and needs adjusting for outlier SNPs.

4/5 discordant genotypes in Ion PGM™ had coverage <13x, indicating a 20x minimum could be applicable (a value suggested in other NGS studies).
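
The coverage minimum and Allele Read Frequency thresholds from these conclusions can be sketched as a simple calling rule. This is a minimal illustration and not the presenters' software: the function name `call_genotype`, its parameters, and the reading of "10/90 - 40/60" as "homozygote below 10% or above 90% reference-allele reads, heterozygote between 40% and 60%, ambiguous otherwise" are all assumptions made for the example.

```python
def call_genotype(ref_reads, alt_reads, min_coverage=20,
                  hom_bounds=(0.10, 0.90), het_bounds=(0.40, 0.60)):
    """Classify a biallelic SNP from read counts (hypothetical helper).

    The thresholds are an assumed reading of the slide's
    '10/90 - 40/60' Allele Read Frequency rule and 20x minimum.
    """
    coverage = ref_reads + alt_reads
    if coverage < min_coverage:
        return "no-call"            # below the 20x minimum coverage
    arf = ref_reads / coverage      # reference Allele Read Frequency
    if arf > hom_bounds[1]:
        return "hom-ref"
    if arf < hom_bounds[0]:
        return "hom-alt"
    if het_bounds[0] <= arf <= het_bounds[1]:
        return "het"
    return "ambiguous"              # between the homozygote and heterozygote bands


print(call_genotype(250, 240))  # balanced heterozygote
print(call_genotype(495, 5))    # homozygote with a few spurious reads
print(call_genotype(400, 100))  # 80% reference reads: ambiguous
print(call_genotype(8, 4))      # 12x coverage: no-call
```

For an outlier SNP the bounds would be passed explicitly, which is one way to express "needs adjusting for outlier SNPs" without changing the default rule.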

The HID-SNP identity set has undergone several revisions

HID-SNP 2.3 ‘prototype’ (169 SNPs)

HID-Ion AmpliSeq™ Identity Panel (124)

HID-Ion AmpliSeq™ Ancestry Panel

9 Y-SNPs replaced

45 A-SNPs removed

[Diagram labels: Genplex; SNaPshot; P202, M479, L298]

NIST validation framework chosen for Ion PGM™ evaluations

SRM2391b now superseded by SRM2391c, but opted for 007 and 9947A forensic control DNAs. Added six staff and seven Coriell genomic controls

Coriell universal genomic controls:

• NA06994 EUR male
• NA07000 EUR female
• NA07029 male child
• NA18498 AFR
• HG00403 E ASN
• NA10540 OCE
• NA11200 AME

NIST suggest a simple CE validation MTP for new STRs/kits/protocols:

Plate layout (series shown twice): neg, 10 ng, 1 ng, 0.1 ng, 0.05 ng, 0.025 ng, plus challenging DNA sources

Qualifying run: Closely matched Ion PGM™ protocols IMU/UHC/USC

Concordance: Inter/Intra-lab plus vs. online data for Coriells

Sensitivity: 007 & 9947A plus 12th Century aDNA (0.45 ng)

Mixture: Male:female staff at above five ratios (duplicated)

Key analysis parameters

Ion Torrent Suite (plugin: VariantCaller) used Somatic & Germline parameters

• Allele Read Frequency - how much of each allele-carrying sequence is detected and in what ratio

• Coverage – number of target sequences per chip, per sample, per SNP

• Base misincorporation - the number of incorrect bases detected

• Strand bias - the ratio of sequences in each direction
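
The four parameters listed above can be derived from per-base, per-strand read counts at a SNP position. The sketch below is illustrative only: `snp_metrics` and its input layout are hypothetical, the example counts are invented, and it does not reproduce how the VariantCaller plugin reports these values. Strand bias is expressed here as the forward-strand fraction, which is an assumption about the definition.

```python
def snp_metrics(counts, ref, alt):
    """Summarise one SNP position (hypothetical helper).

    counts maps base -> (forward_reads, reverse_reads).
    Assumes at least one read carries the ref or alt allele.
    """
    per_base = {base: fwd + rev for base, (fwd, rev) in counts.items()}
    coverage = sum(per_base.values())
    forward = sum(fwd for fwd, _ in counts.values())
    allele_reads = per_base.get(ref, 0) + per_base.get(alt, 0)
    return {
        "coverage": coverage,
        # reference allele reads / total allele reads
        "arf": per_base.get(ref, 0) / allele_reads,
        # share of reads carrying a non-specific 3rd/4th base
        "misincorporation": (coverage - allele_reads) / coverage,
        # share of reads sequenced from the forward strand
        "strand_bias": forward / coverage,
    }


# Invented counts for an A/G SNP, for illustration only
example = {"A": (120, 110), "G": (130, 125), "C": (2, 1), "T": (1, 1)}
for name, value in snp_metrics(example, ref="A", alt="G").items():
    print(name, round(value, 4))
```

Note that this only captures non-specific misincorporation; low levels of the alternative allele in a homozygote show up in the ARF instead.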

Coverage

[Figure: coverage heat map. Axes: increasing mean coverage per sample; increasing mean coverage per SNP. Y-SNPs and A-SNPs marked. Coverage bins: 0, >10, >100, >200, >300, >400, >500, >1000, >1500, >5000]

[Figure: ranked mean coverage per marker. Axis ticks 500 to 2500; bins >300, >400, >500, >1000, >1500; Y-SNPs and A-SNPs marked]

The majority of the topmost samples were:

• amplified with ≤100 pg
• degraded DNA
• run with too many samples loaded on a chip

20x minimum coverage limit

Allele Read Frequency balance

[Figure: Allele Read Frequency balance per SNP. Axis: reference allele frequency / total allele frequency. Legend: identified as ARF outliers by both; identified as ARF outliers by Børsting only; identified as ARF outliers by this study only]

Base misincorporations

[Figure: misincorporation as proportion of total coverage, with panels for A reads, C reads, G reads and T reads. Two categories shown: non-specific base misincorporation (e.g. C or T in an A/G SNP); reference or alternative allele misincorporations (e.g. low levels of A in GG homozygotes)]

• <1.5% misincorporation of expected bases
• <3% misincorporation of a non-specific 3rd/4th base
• 0.25% overall rate of misincorporation - applicable to nearly all SNPs
• Total Y-SNP sequences detected in females: 34 (in >2 million sequences from 6 analyses)

Strand bias

[Figure: strand bias per SNP. Labelled SNPs: rs430046, rs1463729, rs9866013, rs13182883; rs5746846, rs576261, rs2567608, rs4606077, rs1523537]

IGV: rs13182883

The SNP target base calls unequivocally record a GG homozygote, but sequences were generated from 355 forward strands and 2 reverse strands (= 0.994 strand bias)

[IGV legend: C, T, A, G; Deletion (DEL); Insertion (INS); Direction]
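
The 0.994 figure quoted for rs13182883 is consistent with strand bias being the forward-strand share of all reads. A short check, with `strand_bias` as a hypothetical helper name:

```python
def strand_bias(forward_reads, reverse_reads):
    """Fraction of reads from the forward strand (0.5 means no bias)."""
    return forward_reads / (forward_reads + reverse_reads)


bias = strand_bias(355, 2)
print(f"{bias:.3f}")  # 0.994
```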

IGV: rs430046

One deletion in ~95% of sequences in the forward strand

Twelve direction-based deletion sites recorded in the reverse strand, including the target SNP and two clustering SNP sites 1 and 3

[IGV legend: sites 1, 2, 3; C, T, A, G; Deletion (DEL); Insertion (INS); Direction]

Concordance

99.8% concordance from: inter-run (6009/6022); inter-lab (3751/3763); vs. online data (1621/1624)

SNPs highlighted: rs2399332, rs1004357, rs938283, rs1979255 and rs2032597

[Figure labels: 0.3%, 0.2%, 1.2%, 0.8%, 2.4%; 0.2%; rs2032597]

Annotations on the comparisons with online data:

• Likely 1000 Genomes-Phase 1 error (two cases)
• No call on either allele in Complete Genomics (NN, two cases)
• No call for 1st allele in Complete Genomics (NG)
• No call for 2nd allele in Complete Genomics (GN)
• One Ion PGM™ discordancy in a single sample: T is the incorrect base call (A/C SNP)
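
The three genotype tallies quoted for the concordance comparisons can be turned into percentages as follows. The counts come from the slide; pooling the three comparisons by summing them is an assumption about how the headline 99.8% was obtained, and `concordance_pct` is a hypothetical helper.

```python
# (concordant genotypes, total genotypes compared), as quoted on the slide
comparisons = {
    "inter-run": (6009, 6022),
    "inter-lab": (3751, 3763),
    "vs. online data": (1621, 1624),
}


def concordance_pct(concordant, total):
    """Concordance as a percentage of genotypes compared."""
    return 100.0 * concordant / total


for name, (concordant, total) in comparisons.items():
    print(f"{name}: {concordance_pct(concordant, total):.2f}%")

# Assumed pooling: simple sum over the three comparisons
pooled_concordant = sum(c for c, _ in comparisons.values())
pooled_total = sum(t for _, t in comparisons.values())
print(f"pooled: {concordance_pct(pooled_concordant, pooled_total):.1f}%")
```

Under that assumption the pooled value rounds to the 99.8% stated in the text.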

Mixtures

[Figure: Allele Read Frequency plots for single donors and for mixtures at 1:1, 3:1, 9:1, 1:9 and 1:3]

Mixed input DNA created Allele Read Frequency balance shifts - but at 9:1 very little heterozygote displacement
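
Why a 9:1 mixture barely displaces a heterozygote can be seen from the expected reference Allele Read Frequency of a two-person mixture. This is a simplified sketch under stated assumptions: reads are taken to be proportional to each contributor's share of the input DNA, with no amplification or sequencing bias, and the names `expected_arf` and `GENOTYPE_REF_FRACTION` are hypothetical.

```python
# Reference-allele fraction contributed by each diploid genotype
GENOTYPE_REF_FRACTION = {"AA": 1.0, "AB": 0.5, "BB": 0.0}


def expected_arf(major_genotype, minor_genotype, major_parts, minor_parts):
    """Expected reference ARF for a two-contributor mixture (idealised)."""
    total = major_parts + minor_parts
    return (major_parts * GENOTYPE_REF_FRACTION[major_genotype]
            + minor_parts * GENOTYPE_REF_FRACTION[minor_genotype]) / total


# Major contributor heterozygous (AB), minor contributor homozygous (AA)
for ratio in [(1, 1), (3, 1), (9, 1)]:
    print(ratio, expected_arf("AB", "AA", *ratio))
```

In this idealised model the heterozygote moves from 0.50 to 0.75 at 1:1 and 0.625 at 3:1, but only to 0.55 at 9:1, which sits inside the normal spread of heterozygote balance.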

Sensitivity

Low-level DNA (25 or 50 pg): in low-level samples ~8-12 SNPs failed. Only rs2016276 appeared disproportionately in failing SNPs

Optimum input DNA: in ‘optimum input’ concordance samples ~1-3 SNPs gave no-calls or dropouts. All were outlier SNPs

[Figure: read length in bp for low-level DNA and for optimum input DNA]

Sensitivity

aDNA: Volders site, Tyrol, Austria

450 pg quantified with Quantifiler Duo. 25 PCR cycles or 25+5 library re-amplification cycles

Prototype SNP panel amplicons compared with HID-Ion AmpliSeq™ Identity Panel amplicons

• 54 SNPs removed from prototype set
• 57 SNPs have reduced amplicon sizes (by an average 57.5 bp)
• 58 SNPs retain the original prototype primer designs

[Figure: aDNA results per SNP for both amplicon sets. Legend: N / NN; QUAL=0; SNPs >100x coverage; SNPs 20-100x coverage. Numeric labels: 47, 68, 108, 130; 120, 123, 137, 117, 119, 99. Other labels: RMP; GlobalFiler; 1.2E-28; 1.2E-33]
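
The SNP counts quoted for the panel revision are internally consistent, which the short check below makes explicit. The counts are taken from the slides; the interpretation that the 54 removed SNPs are the 45 A-SNPs plus the 9 replaced Y-SNPs, and that 9 replacement Y-SNPs bring the final panel to 124, is an assumption.

```python
prototype_total = 169        # HID-SNP 2.3 'prototype'
identity_panel_total = 124   # HID-Ion AmpliSeq Identity Panel

removed = 54                 # SNPs removed from prototype set
shortened = 57               # SNPs with reduced amplicon sizes
retained = 58                # SNPs keeping the prototype primer designs

a_snps_removed = 45
y_snps_replaced = 9

# Every prototype SNP falls into exactly one of the three groups
assert removed + shortened + retained == prototype_total

# Assumed breakdown of the removed SNPs
assert a_snps_removed + y_snps_replaced == removed

# Assumed composition of the final panel: carried-over SNPs plus new Y-SNPs
assert shortened + retained + y_snps_replaced == identity_panel_total

print("carried over from prototype:", shortened + retained)
```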

Identified three kinds of outlier SNPs

Should be removed:
rs1979255, rs1004357, rs938283, rs2032597, rs2399332

Have some outlying characteristics such as strand bias that mean they should be treated with caution when looking for unusual patterns such as mixed DNA:
rs9866013, rs727811, rs321198, rs4606077, rs1463729, rs6591147, rs8037429, rs430046, rs2567608, rs1523537, rs17250535

Have some outlying characteristics but these do not have a detectable effect on genotyping performance:
rs1029047, rs1336071, rs1478829, rs2032599, rs13182883, rs2107612, rs576261, rs5746846, rs13447352

Conclusions

Not all SNPs performed equally well. Five gave genotype discordancies (inter-lab and inter-run); two remain in the HID-Ion Identity Panel.

Several other SNPs have outlier characteristics; those with imbalanced ARFs were also identified in Børsting’s study, though two were not and three Børsting outliers had reasonably good allelic balance in our study.

Limited experiments with low-level DNA suggest very high sensitivity. 81/169 SNPs gave >100x coverage from aDNA. 57 SNPs have since been shortened.

Sought to identify ARF outliers to allow their discounting in mixtures. LT Genotyper uses Somatic parameter settings, but mixtures mimic mutation patterns, so Germline settings improve post-hoc analysis of mixed DNA.

Harmonising chip loading to balance the coverage with samples-per-chip was very difficult. Continues into AIM set validation - high inter-lab variability.

The 99.8% concordance is the best of any SNP test comparisons made at USC.

EUROFORGEN working on a mixture analysis system for Ion PGM™ data that works well with binary loci but requires conditioning on one contributor.

Optimising a custom SNP panel for Ion PGM™

• 125 of 128 SNPs incorporated into the PCR multiplex: 97.7% conversion rate

• Concordance rates high: inter-run 99.98% (1 SNP), inter-lab 99.75% (6 SNPs) - with reasons for discordance identified in each case

• Concordance from online database comparisons of Coriell DNAs: 1000 Genomes 99.74% (3 SNPs), Complete Genomics 99.79% (same SNPs)

• The six discordant SNPs in 13 genotypes were mainly homopolymeric tracts around the SNP. Need retrogressive analysis due to strand directionality.

• Eight no-call SNPs in 20 genotypes: 4 low coverage, 1 population-specific flanking indel, 3 did not pass VariantCaller quality filters.

• Some population-specific low coverage seen - untracked primer site SNPs?

Thank You

Thanks to all the team at Santiago particularly Carla, Maria and Fonde. To Mayra, Walther at GMI and Theresa, Peter at UHC. To Matt Phipps, LT and David Ballard, KCL for help with data analysis.

Speaker was provided travel and hotel support by Thermo Fisher Scientific for this presentation, but no remuneration