8/8/2019 1. Genomic DNA Libraries for Shotgun Sequencing Projects
TIGR: The Institute for Genomic Research
Genomic DNA Libraries for Shotgun Sequencing Projects
William C. Nierman
Whole Genome Shotgun Sequencing
Library construction
a. isolate DNA
b. fragment DNA
c. clone DNA
Random Sequencing Phase
a. sequence DNA (15,000 sequences/Mb)
Closure Phase
a. assemble sequences
b. close gaps
c. edit
d. annotation
Complete genome sequence
Genomic Sequencing Overview
Large Insert Library (20-500 Kb)
Physical Map
Genomic DNA: Marker1, Marker2
Shotgun Library (2-3 Kb)
Sequencing (6-8X)
Assembly
Gap Closure
Analysis
Genomic Sequencing Overview
Genomic DNA: Marker1, Marker2
Shotgun Library (2, 10, 50 Kb)
Sequencing (6-8X)
Assembly
Gap Closure
Analysis
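The 15,000 sequences/Mb figure and the 6-8X sequencing redundancy quoted on these overview slides can be related with simple arithmetic. The sketch below is illustrative only: the mean read lengths (400 and 500 bp) are assumptions that do not appear on the slides, and the uncovered-fraction estimate uses the standard Poisson (Lander-Waterman style) approximation for idealized random reads, not a method stated by the author.

```python
import math


def fold_coverage(reads_per_mb: float, mean_read_len_bp: float) -> float:
    """Average redundancy (X): bases sequenced per base of genome."""
    return reads_per_mb * mean_read_len_bp / 1_000_000


def expected_uncovered_fraction(coverage: float) -> float:
    """Poisson estimate of the fraction of bases hit by zero reads: e^-c."""
    return math.exp(-coverage)


# 15,000 sequences/Mb is from the slide; the read lengths are assumed values.
for read_len in (400, 500):
    c = fold_coverage(15_000, read_len)
    print(f"{read_len} bp reads: {c:.1f}X, "
          f"uncovered fraction ~{expected_uncovered_fraction(c):.2e}")
```

Under these assumptions, 15,000 reads per Mb lands inside the 6-8X range. Real libraries are not perfectly random, so the uncovered fraction should be read as an optimistic lower bound.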
Shotgun Sequencing Phase
Library Construction
Clone Picking
Template Preparation
Sample Tracking
Sequencing Reactions
Electrophoresis and Base Calling
Sequence Files
Genome Assembly
Graphical representation of phred quality values
Consensus quality values
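For readers unfamiliar with phred quality values, the standard definition is Q = -10 log10(P), where P is the estimated probability that a base call is wrong. The short sketch below shows the conversion in both directions; the function names are illustrative and are not taken from phred or from any TIGR tool.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Error probability implied by a phred quality value Q."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Phred quality value for an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)


for q in (10, 20, 30, 40):
    print(f"Q{q}: about 1 error in {round(1 / phred_to_error_prob(q)):,} bases")
```

Each step of 10 in Q corresponds to a tenfold drop in the estimated error rate.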
3 Tier Whole Genome Shotgun Library Strategy
1. Moderate copy number plasmids containing ~2-kb inserts
2. Moderate copy number plasmids containing ~10-kb inserts
3. Fosmid or other clones containing 40-200-kb inserts
TIGR Assembly Viewer. Green arrows represent F and R sequences from the same clone. Red arrows represent sequences with a sequence mate in a different contig. The 5' end of the assembly points to a telomeric repeat and is linked to a clone containing telomeric sequence.
Repetitive sequences
Repetitive Regions
Output from the TIGR software tool repeat Display showing a section of an assembly. The black boxes represent a 700 bp repeat (7V, 24 copies/genome) and a 3100 bp repeat (9D, 9 copies/genome). Both repeats are spanned by clone DMGRG22. To confirm the sequence of these repeats, this clone was transposed.
Large-insert spanning clone, DMGRG22
Library Requirements
1. Free vector should be at a low or undetectable level.
2. None of the clones should contain chimeras derived by insertion of two or more random fragments from separate parts of the genome.
3. The inserts should be of relatively uniform size.
4. Libraries of different insert sizes should be used for linking.
5. Libraries should be representative of the genome.
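One common way to put a number on "representative" is the Clarke-Carbon estimate, N = ln(1 - P) / ln(1 - f), where f is the insert size divided by the genome size and P is the desired probability that any given sequence is present in the library. This formula is not on the slides; it is brought in here as a standard textbook approximation, and it assumes perfectly random fragmentation and cloning. The example values (a ~2-kb insert and a 1,080,471-base genome) are borrowed from other slides in this deck purely for illustration.

```python
import math


def clones_needed(genome_bp: int, insert_bp: int, probability: float) -> int:
    """Clarke-Carbon estimate of the clones needed so that a given sequence
    is present with the requested probability (idealized random library)."""
    fraction = insert_bp / genome_bp
    return math.ceil(math.log(1 - probability) / math.log(1 - fraction))


# Illustrative inputs only: ~2-kb inserts, 1,080,471-base genome.
for p in (0.95, 0.99, 0.999):
    print(f"P = {p}: {clones_needed(1_080_471, 2_000, p):,} clones")
```

Sequences that the E. coli host tolerates poorly break the randomness assumption, so real libraries need more clones than this estimate suggests, or a different vector.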
BstXI adaptor cloning system
[Figure: BstXI adaptors are ligated to the DNA insert; the vector ends are complementary to the BstXI adaptor; ligation gives vector plus insert. Sequences shown: CTTTCCAGCACA, GTGTGACCTTTC, GAAAGGTC, CTGGAAAG.]
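The adaptor sequences printed on this slide can be checked for the property that makes adaptor cloning attractive: a single-stranded overhang that cannot pair with itself. The sketch below assumes that the 12-mer CTTTCCAGCACA and the 8-mer CTGGAAAG are the two adaptor strands, both written 5' to 3'; that pairing is a reading of the slide, not something it states explicitly.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def is_self_complementary(seq: str) -> bool:
    """True if two copies of this single-stranded end could anneal."""
    return seq == reverse_complement(seq)


# Sequences as printed on the slide; strand assignment is an assumption.
ADAPTOR_LONG = "CTTTCCAGCACA"
ADAPTOR_SHORT = "CTGGAAAG"

duplex_part = ADAPTOR_LONG[:len(ADAPTOR_SHORT)]
overhang = ADAPTOR_LONG[len(ADAPTOR_SHORT):]

print("8-mer pairs with the 12-mer:", reverse_complement(duplex_part) == ADAPTOR_SHORT)
print("overhang:", overhang)
print("overhang self-complementary:", is_self_complementary(overhang))
print("overhang needed on the vector:", reverse_complement(overhang))
```

If this reading is right, adaptored inserts carry CACA overhangs that cannot anneal to one another, and the vector carries the complementary TGTG ends, which is consistent with the library requirements on chimeras and free vector.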
What is Unclonable DNA?
Difficult cloning targets include several different types of sequences, such as:
Toxic coding sequences
Promoters
A/T rich DNA
Modified bases
Repetitive regions
Library Coverage and Randomness
Tolerance of cloned DNA by E. coli host
Vector copy number
Insert size
Vector Design Issues
Vector driven transcription and translation into the insert induce expression of the cloned sequence.
Fortuitous transcription out of the insert can interfere with vector maintenance.
False positives and false negatives arise from inappropriate transcription.
High copy number can cause plasmid instability.
[Figure labels: lacP, cloned fragment]
Sequencing Project Vector Features
1. The sequencing primer sites immediately flank the cloning site to avoid excessive re-sequencing of vector DNA.
2. PCR primer sites are located immediately outside of the sequencing primer sites to allow PCR amplification for template preparation.
3. The entire cloning region including the primer sites is isolated from RNA transcription.
Design Features of a BstXI Adaptor Cloning System
[Figure: pHOS vector plus insert. Labeled features: AmpR; Ori, copy number; Pr; ter1; ter2; rrnBT1; rrnBT2; two BstXI sites; forward and reverse sequencing primers; forward and reverse PCR primers.]
Construction of Linking Library in pHOS2Kan
[Figure: genomic DNA of ~50 kb with BstXI adaptors (sequences shown: CTTTCCAGCACA, GAAAGGTC, CTGGAAAG, ACACGACCTTTC) and the pHOS2 vector (Amp). Labeled steps: restriction digest; phosphatase; ligate Kan cassette; double Amp/Kan selection; ligation.]
Fosmid Library Construction
Copy Number Induced (+) vs. Uninduced (-) Fosmid DNA preps
Library Mix
Wolbachia (endosymbiont of B. malayi), sequenced to 20X
True genome size: 1,080,471 bases.
At 7.6X coverage:
0% small, 100% large gave 1 scaffold of 1,076,660 bp and 10 contigs
5% small, 95% large gave 1 scaffold of 1,077,210 bp and 12 contigs
60% small, 40% large gave 14 scaffolds (largest = 160 kb), 79 contigs
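To see how close the single-scaffold assemblies come to the true genome size, the scaffold lengths can be divided by the 1,080,471-base figure. The sketch below uses only the numbers given on this slide; the function and variable names are illustrative.

```python
TRUE_GENOME_BP = 1_080_471  # true genome size from the slide

# Single-scaffold results at 7.6X coverage, keyed by library mix.
single_scaffold_bp = {
    "0% small / 100% large": 1_076_660,
    "5% small / 95% large": 1_077_210,
}


def spanned_fraction(scaffold_bp: int, genome_bp: int = TRUE_GENOME_BP) -> float:
    """Fraction of the true genome length accounted for by one scaffold."""
    return scaffold_bp / genome_bp


for mix, length in single_scaffold_bp.items():
    print(f"{mix}: {spanned_fraction(length):.2%} of the genome in one scaffold")
```

Both large-insert-dominated mixes place more than 99.6% of the genome in one scaffold, whereas the 60% small mix fragments into 14 scaffolds at the same coverage.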
Redundancy Analysis from Completed Projects
Genome & Insert Size | Coverage Est. | Contigs, Scaffolds, Coverage Actual (digits run together in the source)
TPG Large | 3.2X | 16201284.2X
TPG Small | 6.5X | 321 916.5X
GFS Large | 2.5X | 4221045.1X
GFS Small | 10.8X | 21012310.4X
GMX Large | 3.2X | 1238384.1X
GMX Small | 2.2X | 13219963.4X
GBS Large | 2.1X | 279944.7X
GBS Small | 9.0X | 95758.9X
gbs = Streptococcus agalactiae
gmx = Myxococcus xanthus
gfs = Fibrobacter succinogenes
tpg = Theileria parva
Comparison of Library Strategies
[Pie chart: 5 RNA'S 8%; REPEATS 15%; FAILED MATES 1%; EDITING 8%; COVERAGE 3%; MATT'S HELP 3%; SEQ GAPS 12%; PHYS GAPS 23%; 2 DIFFICULT REPEATS 27%]
Genome      BSP            GBS            GSA        GSE
Species     S. pneumoniae  S. agalactiae  S. aureus  S. epidermidis
Size (Mb)   2.1            2.1            2.8        2.7
Groups      160            58             134        12
Seq Gaps    290            46             198        24
Start Date  Nov 95         Dec 00         Mar 99     Feb 01
In Closure  49 months      10 months      26 months  7 months
Myxococcus xanthus Sequencing Statistics
Total shotgun sequences: 130,436
TIGR library insert sizes: 2-3 kb, 10-12 kb
Sequence coverage of 9X
Assembled into single scaffold of 103 contigs
Two rounds of autoprimer sequencing reduced contig number to 36
9,131,959 bases, 3500 Ns in gaps
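The three headline numbers on this slide (130,436 sequences, 9X coverage, 9,131,959 bases) imply an average usable read length, which is a quick sanity check on the statistics. The sketch below simply rearranges coverage = reads x read length / genome size; it assumes the 9X figure is computed over the 9,131,959-base assembly, which the slide does not state.

```python
def implied_mean_read_length(genome_bp: int, coverage: float, n_reads: int) -> float:
    """Mean read length (bp) implied by coverage = n_reads * length / genome."""
    return genome_bp * coverage / n_reads


length = implied_mean_read_length(9_131_959, 9, 130_436)
print(f"Implied mean read length: about {length:.0f} bp")
```

Because 9X is a rounded figure, the result is only approximate.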
Aspergillus fumigatus karyotype
1,789 Kb
3,779 Kb
2,021 Kb
3,992 Kb
4,018 Kb
4,834 Kb
4,891 Kb
3,933* Kb
Optical Analysis
Molecule maps generated from images of single DNA molecules digested with NheI
Resolution (avg fragment size): 8.28 kb
Total coverage: 8,987 Mbase, or 300x
Total of 8 chromosomes
Total size: 29.189 Megabases
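The optical mapping figures can be cross-checked against each other with two divisions: total mapped bases over genome size gives the fold coverage, and genome size over average fragment size gives the approximate number of NheI fragments in the map. The sketch uses only the values on this slide; the variable names are illustrative.

```python
TOTAL_MAPPED_MB = 8_987      # total coverage, Mbase (from the slide)
GENOME_MB = 29.189           # total size, Megabases (from the slide)
AVG_FRAGMENT_KB = 8.28       # resolution, average fragment size (from the slide)

fold = TOTAL_MAPPED_MB / GENOME_MB
approx_fragments = GENOME_MB * 1_000 / AVG_FRAGMENT_KB

print(f"Optical map coverage: about {fold:.0f}x")
print(f"Approximate NheI fragments across the genome: {approx_fragments:.0f}")
```

The computed coverage is slightly above the rounded 300x quoted on the slide.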
A. fumigatus chr5-7 contig placement
Aspergillus fumigatus Chromosomes
[Figure: map of the eight chromosomes (4.9, 4.8, 4.0, 3.9, 3.9, 3.6, 2.0 and 1.8 Mb) and the mitochondrion (32 Kb), with telomeres, presumed centromeric areas and rRNA marked.]