Genotyping By Sequencing (GBS) Method Overview
Transcript of Genotyping By Sequencing (GBS) Method Overview
![Page 1: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/1.jpg)
Genotyping By Sequencing (GBS) Method Overview
RJ Elshire, JC Glaubitz, Q Sun, JV HarrimanES Buckler, and SE Mitchell
http://www.maizegenetics.net/
![Page 2: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/2.jpg)
•Background/Goals
•GBS lab protocol
•Illumina sequencing review
•GBS adapter system
•How GBS differs from RAD
•Modifying GBS for different species
•Workflow in our lab
Topics Presented
![Page 3: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/3.jpg)
Background
Genotyping by sequencing (GBS) in any large genome species requires reduction of genome complexity.
II. Restriction Enzymes (REs)I. Target enrichment
•Long range PCR of specific genes or genomic subsets
•Molecular inversion probes
•Sequence capture approaches hybridization‐based (microarrays)
*Technically less challenging*
• Methylation sensitive REs filter out repetitive genomic fraction
![Page 4: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/4.jpg)
QTL are often located in non‐coding regionsVgt1, Tb, B regulatory regions 60‐150kb from gene
Exon Exon Exon Exon
Exon captureMap large numbers ofgenome‐wide markers
QTL
![Page 5: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/5.jpg)
Create inexpensive, robust multiplex
sequencing protocol
Low‐coverageSequencing Illumina GA
Informatics Pipelin
Anchor markers across the genomeImpute missing data
Combine genotypic & phenotypic data for GS and GWAS
Our goal is to create a public genotyping/informatics platform based on next‐generation sequencing
![Page 6: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/6.jpg)
Open Source
• Method available for anyone to use / modify.• Analysis pipeline details and code are public.• Promote dataset compatibility.• Method published in PLoS ONE to promote accessibility.
• Genotype calls publically available.
![Page 7: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/7.jpg)
< 450 bp
Restriction site( ) sequence tag
Loss of cut site
Sample1
Overview of Genotyping by Sequencing (GBS)
• Focuses NextGen sequencing power to ends of restriction fragments• Scores both SNPs and presence/absence markers
Sample2
![Page 8: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/8.jpg)
• Reduced sample handling• Few PCR & purification steps• No DNA size fractionation• Efficient barcoding system• Simultaneous marker discovery & genotyping
• Scales very well
GBS is a simple, highly multiplexed system for constructing libraries for next‐gen sequencing
![Page 9: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/9.jpg)
GBS 96‐plex Protocol(http://www.maizegenetics.net/gbs‐overview)
1. Plate DNA & adapter pair
2. Digest DNA with RE3. Ligate adapters
Now at 384‐plex
![Page 10: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/10.jpg)
GBS Adapters and Enzymes
BarcodeAdapter
“Sticky Ends”
Barcode(4‐8 bp)
CommonAdapter
P1 P2
ApeKI G CWGCPstI CTGCA GEcoT22I ATGCA T
5’ 3’
Restriction Enzymes
Illumina Sequencing Primer 2
Illumina Sequencing Primer 1
![Page 11: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/11.jpg)
GBS 96‐plex Protocol(http://www.maizegenetics.net/gbs‐overview)
............. ...
..... ..................... ........... ..
..
. .. ...
.
...
..... ......... .. .........
.....
...
. .. ....... ......
...
.. .......
...
. .
..
.... . .. . ....
1. Plate DNA & adapter pair
5. PCR
Primers
2. Digest DNA with RE3. Ligate adapters
Clean‐up4. Pool 96 samples
![Page 12: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/12.jpg)
. .
..
..
.
....
.. ...
...........
.
.
.. ........
.
.
. ... ...
..
.
.
..
.
.
.
.....
.
.
...
... ..
..
.
.
.
.
...
.. ...
.. ...
......
.
..
.....
. ......
....
... ..
...
.. ..
. .
..
..
.
........
.
. .
P1 P2 P1 P2
.
O1 O2
PCR
Insert
Pooled Digestion/LigationReactions (n=94)
GBS“Library”
PRC primers:
Insert
![Page 13: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/13.jpg)
GBS 96‐plex Protocol(http://www.maizegenetics.net/gbs‐overview)
............. ...
..... ..................... ........... ..
..
. .. ...
.
...
..... ......... .. .........
.....
...
. .. ....... ......
...
.. .......
...
. .
..
.... . .. . ....
1. Plate DNA & adapter pair
5. PCR
Primers
2. Digest DNA with RE3. Ligate adapters4. Pool 96 DNA aliquots
6. Evaluate fragment sizes
Clean‐upClean‐up
![Page 14: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/14.jpg)
Perform Titration to Minimize Adapter Dimers Before Sequencing
Size Standards
LibraryAdapterDimer
1500 bp15 bp
NOTE: Done once with a small number of samples.Adapter dimers constitute only 0.05% of raw sequence reads
Fluo
rescen
seintensity
Time Time
Optimal adapter amount
![Page 15: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/15.jpg)
Small Fragments are Enriched in GBS Libraries Total Fragm
ents
0%
5%
10%
15%
20%
25%
30%
35%
40%
0 50 100
150
200
250
300
350
400
450
500
550
600
650
700
750
800
850
900
950
1000
1000
0000
REFGENOME
IBM
ApeKI fragment size (bp) >10000
B73 RefGen v1IBM (B73 X Mo17) RILs
![Page 16: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/16.jpg)
0
100000
200000
300000
400000
500000
600000
700000
800000
900000
1000000
384‐plex GBS Results for Maize
Reads
Mean read count per line = 528,000c.v. = 0.22
![Page 17: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/17.jpg)
Flowcell8 channels
Solid Phase Oligos
Illumina Sequencing by Synthesis Review
![Page 18: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/18.jpg)
Flow cell with bound oligos
Denatured “Library”
Cluster Formation Amplifies Sequencing Signal
Bridge Amplification
Linearization
![Page 19: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/19.jpg)
ACGTGGC
ACGT
GC
T
G
TG
P1 primer
C
ACGT
GC
TG
GA
ACGT
GC
TGC
G
T TG TGC TGCA
Sequencing by Synthesis
FlowcellHiSeq 2000
![Page 20: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/20.jpg)
![Page 21: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/21.jpg)
GACGT
GC
T
G
ACGTGGC
T
C
ACGT
GC
TG
G
T N
ACGTGGC
T ACGTGGC
T ACGTGGC
T
“In Phase” “Out of Phase”
![Page 22: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/22.jpg)
Read 1
GBS captures barcode and insert DNA sequence in single read
Insert DNA
Barcode
Sequencing Primer 1
Sequencing Primer 2
Illumina
Read 1
Read 2
Sequencing Primer 1
BarcodeRE cut‐site
Insert DNA
GBS
![Page 23: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/23.jpg)
Variable Length GBS Barcodes Solves Sequence Phasing Issues
•First 12 nt used to calculate phasing.•Algorithm assumes random nt distribution.•Incorrect phasing causes incorrect base calls.
Barcode
Illumina
Insert DNA
•Good design and modulating the RE cut site position with variable length barcodes produces even nt distribution.
BarcodeRE cut‐site
GBS
Insert DNA
Read 2
Read 1
![Page 24: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/24.jpg)
Invariant GBS barcodes cause loss of signal intensity.
![Page 25: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/25.jpg)
Successful GBS sequencing run.
![Page 26: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/26.jpg)
Most significant GBS technical issue?
DNA quality
![Page 27: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/27.jpg)
Different Ends 50%
Same Ends 50%
100%
Ligation
LigationPCR
GBSStandard
GBS does not use standard “Y” adapters
![Page 28: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/28.jpg)
Denatured “library”
Flow Cell with Bound Oligos
Same‐ended Fragments Do Not Form Clusters
Bridge Amplification
Cluster Formation &“Linearization”
No P1 bindingsite
Cleaved from surface
![Page 29: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/29.jpg)
< 450 bp
Restriction site( ) sequence tag
Loss of cut site
Sample1
GBS vs. RAD
• Focuses NextGen sequencing power to ends of restriction fragments• Scores both SNPs and presence/absence markers
Sample2
![Page 30: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/30.jpg)
Digest
Ligate adapters
Pool
Random shear
Size select
Ligate Y adapters
PCR
RADGBS
ReferenceDavey et al. 2011
![Page 31: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/31.jpg)
Modifying GBS
Considerations for using GBS with new species and / or different enzymes.
![Page 32: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/32.jpg)
Why Modify the GBS Protocol?
• More markers• Fewer markers (deeper sequence coverage per locus)
• Increase multiplexing• More genome appropriate (avoid more repetitive DNA classes)
• Other novel applications (i.e., bisulfitesequencing).
![Page 33: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/33.jpg)
Genome Sampling Strategies Vary by Species
Dependent on Factors that Affect Diversity:
•Mating System(Outcrosser, inbreeder, clonal?)
•Ploidy(Haploid, diploid, auto‐ or allopolyploid?)
•Geographical Distribution(Island population, cosmopolitan?)
![Page 34: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/34.jpg)
Other Factors
• Genome size– The size of the genome has some bearing on the size of the fragment pool.
– Amount of repetitive DNA directly correlated with genome size.
• Genome composition– The composition of the genome can affect the frequency and distribution of the cut sites.
![Page 35: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/35.jpg)
Sampling large genomes with methylation‐sensitive restriction enzymes.
5‐base cutter
6‐base cutter
MethylatedDNA
UnmethylatedDNA
GBS LibrarySequenced Fragments
![Page 36: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/36.jpg)
Scrub jay
Vole Giantsquid
Deer Mouse
TunicateYeast
Solitary Bee
Grape Maize Cacao
Rice
Raspberry
Barley
Cassava
Goose
Sorghum
Cassava
Shrub willow
Optimizing GBS in New Species
![Page 37: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/37.jpg)
Choosing Appropriate Restriction Enzymes
“Life is like a bowl of chocolates”.
![Page 38: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/38.jpg)
Goose
PstI
Giant Squid
ApeKI
![Page 39: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/39.jpg)
Deer Mouse
ApeKI
PstI
![Page 40: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/40.jpg)
Shrub Willow
ApeKI
PstI
EcoT22I
![Page 41: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/41.jpg)
GBS Adapter Design
![Page 42: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/42.jpg)
Barcode Design Considerations• Barcode sets are enzyme specific
– Must not recreate the enzyme recognition site– Must have complementary overhangs
• Sets must be of variable length• Bases must be well balanced at each position• Must different enough from each other to avoid confusion if
there is a sequencing error.– At least 3 bp differences among barcodes.
• Must not nest within other barcodes• No mononucleotide runs of 3 or more bases
http://www.deenabio.com/services/gbs‐adapters
![Page 43: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/43.jpg)
GBS SNP calls in Sorghum bicolor
091211.c10.hmp alleles chrom SC283 BR007B RIL_100 RIL_101 RIL_102 RIL_103 RIL_104 RIL_105 RIL_106 RIL_107 RIL_108 RIL_109 RIL_10 RIL_110 RIL_111 RIL_112 RIL_113 RIL_114 RIL_115
S10_29954 C/A 10 A N A C A N N C C N N N N N N N N C N
S10_29963 A/C 10 C N C A C N N A A N N N N N N N N A N
S10_29965 T/C 10 C N C T C N N T T N N N N N N N N T N
S10_84178 C/G 10 N N C N G N N N N N N N N N C N N N N
S10_84512 T/C 10 T T N N T N T N N N N C N N N N N N N
S10_84513 C/T 10 C C N N C N C N N N N T N N N N N N N
S10_85556 G/A 10 G A G A G N G G N N G G G R N A N R G
S10_85556 G/A 10 G N N A G A G N N G G G N N A N G G G
S10_115300 T/C 10 T C T C T C T N N T T N N N C N N T T
S10_133268 T/C 10 T C N N N N N T T N N N N N N N N N N
S10_158101 G/A 10 G N N A G N N N N N N N G A A N G G N
S10_158101 G/A 10 G A N A G A N N N G N G N N N N G G N
S10_163724 G/A 10 G A N A N N N G N N G N N N N N G N G
S10_163929 T/G 10 G T N T G N G N N G G N G T T T N G N
S10_163931 C/T 10 T C N C T N T N N T T N T C C C N T N
S10_180445 G/A 10 G A G N G A G G G N G N G A A N G G G
S10_209049 C/G 10 N G C S C G C C C N C C N C G G N C N
S10_209231 G/A 10 N A G A G A G G N N G G G N A A G G N
S10_214941 C/T 10 C N N T N N N N C N C N N N N N N C C
S10_234260 T/G 10 T N T N T G T N T T T T T N N G T N T
S10_234260 T/G 10 T G T G T G N T T T T T T N G G T T T
S10_238937 G/A 10 N A N A N N G G G N G N N N A N N N N
S10_252729 G/T 10 G T G T G N N G G G N N N N N T G G G
S10_252730 C/T 10 C T C T C N N C C C N N N N N T C C C
S10_261151 T/C 10 T C T C T C N N T T N N T N N N T N N
S10_268392 C/A 10 N N N N A N A N N N N N N N N N N N N
S10_268393 G/A 10 N N N N A N A N N N N N N N N N N N N
S10_274414 C/T 10 N N C N N N N C T C N N C T N N C N C
S10_281272 T/C 10 T N T N N N N N N N N N N N N N N T T
S10_287523 G/T 10 G N G N N N N G G G N N N N N N N N N
S10_292020 A/T 10 A T N N A N N A A A N A A N T T A A A
S10_292020 A/T 10 A T A T N N N A N A N A N N T T A N A
S10_294189 C/T 10 N N C N C N C C C N N N C T N N C N C
S10_302638 C/T 10 C T C T C N N C N C C N C T T N C C N
S10_312560 A/C 10 A C A N A N A A A A A A A C C N A A A
S10_347848 T/C 10 T C T C T C N T T T T N T N C N T T T
![Page 44: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/44.jpg)
Missing Data Strategies
• Impute• Technical Options
– Reduce the multiplexing level– Sequence the same library multiple times
• Molecular Options– Choose less frequently cutting enzymes
![Page 45: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/45.jpg)
DNA sample info entered to webform
HTS databaseProject approved?Samples shipped
GBS libraries made
Reference genome
Non‐reference genome
Data FileSNPs/Sample
Lab Analysis Pipelines
DNA sequence data Data File
States/Sample
GBS Workflow
![Page 46: Genotyping By Sequencing (GBS) Method Overview](https://reader034.fdocuments.net/reader034/viewer/2022051522/58a31da71a28ab32438c283c/html5/thumbnails/46.jpg)
GBS Team
Sharon Mitchell
Jeff Glaubitz
Ed Buckler
Qi Sun
Rob Elshire
James Harriman