454 - Sequencing · 454 sequencing at the NSC Rex Rodney arrived! Samples Wide variety of starting...
454 sequencing
454 sequencing at the NSC
Rex Rodney arrived!
Samples
Wide variety of starting materials:
• genomic DNA
• PCR products
• BACs
• cDNA
• mRNA
For shotgun or cDNA libraries, 500 ng of sample DNA is required.
Library preparation
For shotgun libraries:
• Fragmentation (if needed)
• Ligation of 454 sequencing adaptors
For amplicon libraries: PCR products are created by amplifying with specific fusion primers containing 454 sequencing adaptor sequences.
Emulsion PCR Amplification
One Fragment = One Bead
Library is attached to DNA Capture Beads.
Sequencing
One Bead = One Read
Sequencing
Data Processing & Analysis
Homopolymer errors
3 G's? 4 G's?
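The "3 G's or 4 G's" question can be illustrated with a small sketch. In pyrosequencing, each nucleotide flow produces a light signal roughly proportional to the number of bases incorporated, so the length of a homopolymer run is obtained by rounding that signal. The function name `call_bases` and the signal values below are made up for illustration; the TACG flow order is assumed, and real base callers are considerably more sophisticated.

```python
FLOW_ORDER = "TACG"  # assumed cyclic nucleotide flow order


def call_bases(flow_signals, flow_order=FLOW_ORDER):
    """Turn per-flow signal intensities into a base sequence by rounding.

    Each signal is rounded to the nearest whole number, which is taken
    as the number of identical bases incorporated during that flow.
    """
    bases = []
    for i, signal in enumerate(flow_signals):
        nucleotide = flow_order[i % len(flow_order)]
        run_length = max(0, int(signal + 0.5))  # round half up, never negative
        bases.append(nucleotide * run_length)
    return "".join(bases)


# Hypothetical signals for the same template, read twice.
# Only the G flow differs, and only by measurement noise.
clean_read = [1.02, 0.04, 0.97, 3.10]  # T, no A, C, three G's
noisy_read = [1.02, 0.04, 0.97, 3.55]  # same flows, G signal drifts upward

print(call_bases(clean_read))  # TCGGG
print(call_bases(noisy_read))  # TCGGGG
```

A signal near 3.5 sits between two valid calls, which is why errors concentrate in homopolymer runs and show up as insertions or deletions rather than substitutions.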
[Figure: two read-length histograms (x-axis: Read Length (bp), 1 to 601; y-axis: # Reads, 0 to 7000) and two per-position error plots (x-axis: Base Posn (bp), 0 to 500; y-axis: Err @ Base Posn (%), 0% to 5%)]
Read Length and Error profile for GS FLX Titanium Series
Solving errors: oversampling
AGAAAGTCAGCGGCAAATTTGGTTTTAGACGAA-TTGTCCCTTTGACATAACGACTAAAGG
AGAAAGTCAGCGGCAAATT-GGTTTTAGACGAA-TTGTCCCTTTGACATAACGACTAAAGG
AGAAAGTCAGCGGCAAATTTGGTTTTAGACGAAATTGTCCCTTTGACATAACGACTAAAGG
AGAAAGTCAGCGGCAAATTTGGTTTTAGACGAA-TTGTCCCTTTGACATAACGACTAAAGG
AGAAAGTCAGCGGCAAATTTGGTTTTAGACGAA-TTGTCCCTTTGACATAACGACTAAAGG
AGAAAGTCAGCGGCAAATTTGGTTTTAGACGAAATTGTCCCTTTGACATAACGACTAAAGG
AGAAAGTCAGCGGCAAATTTGGTTTTAGACGAA-TTGTCCCTTTGACATAACGACTAAAGG
AGAAAGTCAGCGGCAAATTTGGTTTTAGACGAAATTGTCCCTTTGACATAACGACTAAAGG
AGAAAGTCAGCGGCAAATT-GGTTTTAGACGAA-TTGTCCCTTTGACATAACGACTAAAGG
AGAAAGTCAGCGGCAAATTTGGTTTTAGACGAAATTGTCCCTTTGACATAACGACTAAAGG
AGAAAGTCAGCGGCAAATTTGGTTTTAGACGAA-TTGTCCCTTTGACATAACGACTAAAGG

AGAAAGTCAGCGGCAAATTTGGTTTTAGACGAA-TTGTCCCTTTGACATAACGACTAAAGG
Undercall in two reads
Overcall in four reads
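The way oversampling outvotes these errors can be sketched as a per-column majority vote over the eleven aligned reads on the slide. This is only an illustration of the idea, not the algorithm used by any particular assembler; the function name `column_consensus` is hypothetical, and the sketch assumes the reads are already aligned to equal length with "-" marking gaps.

```python
from collections import Counter


def column_consensus(aligned_reads):
    """Return the most common character in each alignment column.

    A gap ('-') competes in the vote like any base, so an undercall or
    overcall present in a minority of reads is outvoted.
    """
    if len({len(read) for read in aligned_reads}) != 1:
        raise ValueError("reads must all be aligned to the same length")
    consensus = []
    for column in zip(*aligned_reads):
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# The three distinct rows that appear in the alignment on the slide.
MAJORITY = "AGAAAGTCAGCGGCAAATTTGGTTTTAGACGAA-TTGTCCCTTTGACATAACGACTAAAGG"
UNDERCALL = "AGAAAGTCAGCGGCAAATT-GGTTTTAGACGAA-TTGTCCCTTTGACATAACGACTAAAGG"
OVERCALL = "AGAAAGTCAGCGGCAAATTTGGTTTTAGACGAAATTGTCCCTTTGACATAACGACTAAAGG"

# Same order as on the slide: 5 majority rows, 2 undercalls, 4 overcalls.
reads = [
    MAJORITY, UNDERCALL, OVERCALL, MAJORITY, MAJORITY, OVERCALL,
    MAJORITY, OVERCALL, UNDERCALL, OVERCALL, MAJORITY,
]

consensus = column_consensus(reads)
print(consensus)
print(consensus.replace("-", ""))  # consensus with gap columns removed
```

With eleven reads, the missing T (2 of 11) and the extra A (4 of 11) both lose the vote, so the result matches the final line of the alignment. The overcall wins with only two more votes, which shows why deeper coverage is needed at homopolymer positions.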
What do you get? Read length
[Figure panels: Amplicon, Genomic DNA, cDNA]
Applications
• Whole genome sequencing
• Metagenomics
• Amplicon
• BAC, fosmid etc. sequencing
• Transcriptome
Whole genome sequencing
• Shotgun library: DNA → shotgun reads → contigs
• Paired end library: contigs → scaffold (contig NNNNN contig NNNNN contig)
• Collapsed contig
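The scaffold notation on this slide, contigs separated by runs of N, can be reproduced with a few lines of code. This is a minimal sketch assuming the contig order and the gap sizes have already been estimated from paired end reads; the function name `build_scaffold`, the contig sequences, and the gap lengths are all made up for illustration.

```python
def build_scaffold(contigs, gap_lengths):
    """Join ordered contigs, filling each estimated gap with 'N' characters."""
    if len(gap_lengths) != len(contigs) - 1:
        raise ValueError("need exactly one gap length between each pair of contigs")
    pieces = [contigs[0]]
    for gap, contig in zip(gap_lengths, contigs[1:]):
        pieces.append("N" * gap)  # unknown sequence of estimated length
        pieces.append(contig)
    return "".join(pieces)


# Hypothetical contigs and gap estimates, for illustration only.
contigs = ["ACGTTGCA", "GGATCC", "TTAGGC"]
gap_lengths = [5, 5]

scaffold = build_scaffold(contigs, gap_lengths)
print(scaffold)  # ACGTTGCANNNNNGGATCCNNNNNTTAGGC
```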
Whole genome sequencing with GS FLX+
From mRNA to transcriptome
Fragmented mRNA sample
mRNA sample
cDNA
What do you get? Realistic yield
cDNA sample
Library
[Figure: read length distribution, x-axis ticks from 40 to 568]
Library fragment cut-off 250-300 bases
cDNA (Roche protocol) Library
[Figure: read length distribution, x-axis ticks from 40 to 600]
Transcriptome analysis with GS FLX+
Amplicon
Performed using special Fusion Primers
IMPORTANT: primer dimers must be removed!
What do you get? Read length
[Figure: read length distribution, x-axis ticks from 40 to 592]
Primer dimer
What do you get? Number of reads
BUT: number of reads depends on total length of the amplified products
Pacific Biosciences
Pacific Biosciences RS
• Arrives Q4 2011
• Single molecule sequencing
• 36 000 reads/run
• Long reads:
  • Average 2500 bp
  • Max 14 000 bp
• Accuracy:
  • Single pass: 87%
  • Consensus: 5 passes yield an average of Q30 (1 in 1000 chance of error)
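To put the two accuracy figures on the same scale, the standard Phred relation Q = -10 · log10(P_error) can be applied in both directions. The sketch below also multiplies the quoted read count by the quoted average read length to get a rough per-run yield. The function names are illustrative, and the yield is a back-of-the-envelope estimate from the slide's numbers, not a vendor specification.

```python
import math


def error_to_phred(p_error):
    """Convert an error probability into a Phred quality score."""
    return -10.0 * math.log10(p_error)


def phred_to_error(q):
    """Convert a Phred quality score into an error probability."""
    return 10.0 ** (-q / 10.0)


# Figures quoted on the slide.
SINGLE_PASS_ACCURACY = 0.87
CONSENSUS_Q = 30
READS_PER_RUN = 36_000
MEAN_READ_LENGTH_BP = 2_500

single_pass_q = error_to_phred(1.0 - SINGLE_PASS_ACCURACY)
consensus_error = phred_to_error(CONSENSUS_Q)
yield_bp = READS_PER_RUN * MEAN_READ_LENGTH_BP

print(f"single pass: about Q{single_pass_q:.1f}")
print(f"Q30 consensus: error probability {consensus_error:.4f}")
print(f"rough yield per run: {yield_bp / 1e6:.0f} Mb")
```

On this scale a single pass at 87% accuracy is a little under Q9, so five passes move the consensus up by roughly 21 Phred units.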
How does it work?
Anchored polymerases + template in wells (Zero Mode Waveguides)
Add fluorescent nucleotides & film what happens.