©2001 Timothy G. Standish
Matthew 13:17
17 For verily I say unto you, That many prophets and righteous men have desired to see those things which ye see, and have not seen them; and to hear those things which ye hear, and have not heard them.
DNA Sequencing
Timothy G. Standish, Ph. D.
Sequenced Genomes
Over the past three years, large scale sequencing of eukaryotic genomes has become a reality
Currently the sequencing of at least 5 multi-celled eukaryotic genomes has been completed:
1998 Caenorhabditis elegans - 8 x 10^7 bp - A nematode worm
2000 Homo sapiens - 3 x 10^9 bp - Humans
2000 Arabidopsis thaliana - 1.15 x 10^8 bp - A plant related to mustard
2000 Drosophila melanogaster - 1.65 x 10^8 bp - Fruit flies
2002 Anopheles gambiae - 2.78 x 10^8 bp - Mosquito vector of malaria
New Technology
Rapid sequencing of large complex genomes has been made possible by:
– Foundational work done over many years and…
– Dramatic improvement in DNA sequencing technology over the past few years
In this presentation we will look at both the basic principles of DNA sequencing and how techniques have been refined to yield the dramatic results we now see
A Sequencing Timeline
Each entry is followed by: Samples/person/week X Average read length = Total/week
1977 Sanger and Maxam-Gilbert sequencing techniques developed
4 X 50 bp = 200 bp
1980 M13 vector developed for cloning, many refinements and application of computer technology
20 X 100 bp = 2,000 bp
1990 Improved sequencing enzymes, fluorescent dyes developed, robotics used for high throughput
60 X 300 bp = 18,000 bp
1997 Saccharomyces cerevisiae genome sequenced
180 X 500 bp = 90,000 bp
1999 Caenorhabditis elegans, Human chromosome 22 and about 20 bacterial genomes
500 X 650 bp = 325,000 bp
2000 Drosophila melanogaster, Homo sapiens, Arabidopsis thaliana
5000 X 600 bp = 3,000,000 bp
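The weekly totals in the timeline are simply samples per person per week multiplied by average read length. A minimal Python sketch of that arithmetic, using only the figures quoted above (the function and variable names are illustrative):

```python
# (samples per person per week, average read length in bp), lowest throughput first
rows = [(4, 50), (20, 100), (60, 300), (180, 500), (500, 650), (5000, 600)]


def total_per_week(samples, read_length_bp):
    """Bases read per person per week."""
    return samples * read_length_bp


totals = [total_per_week(s, r) for s, r in rows]
for (s, r), t in zip(rows, totals):
    print(f"{s:>5} X {r} bp = {t:,} bp")

# Fold increase in weekly output from the first row to the last
fold_increase = totals[-1] // totals[0]
print(f"fold increase: {fold_increase:,}")
```

Most of the gain comes from the number of samples handled per week rather than from read length, which levels off at around 600 bp.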
Basic Principles
All current practical DNA sequencing techniques can be divided into four major steps:
1. Labeling of DNA so that small quantities can be easily detected, traditionally done by labeling with either P32 or S35
2. Generation of fragments for which the specific bases at the 3’ end are known
3. Separation of fragments using gel electrophoresis sensitive enough to resolve differences in size of one nucleotide
4. Fragment detection
Outline
In this presentation we will look at:
1. The Maxam-Gilbert and Sanger methods of DNA fragment generation
2. Then methods for separation of fragments
3. And finally examine how these techniques have been refined and automated to allow for rapid cheap sequencing of large quantities of DNA
The Maxam-Gilbert Chemical Method
Three major steps:
1. DNA to be sequenced is typically labeled at the 5’ end using P32
2. Fragments are generated using chemicals that break DNA at specific bases
3. These fragments are then separated and detected using autoradiography
Polyacrylamide Gel Electrophoresis is typically used to separate fragments on the basis of single nucleotide differences
2 Fragment Generation
A number of chemicals will specifically modify the bases in DNA
Modified bases can then be removed from the deoxyribose sugar to which they are attached on the sugar-phosphate DNA backbone
Piperidine, a volatile secondary amine, is used to cleave the sugar-phosphate backbone of DNA at sites where bases were modified
Cleavage at Specific Bases
Typically 5 reactions are run:
1. Dimethylsulfate at pH 8.0 results in modification of guanine (G)
2. Piperidine formate at pH 2.0 breaks glycosidic bonds between deoxyribose and both purines, guanine (G) and adenine (A), by protonation of nitrogen atoms
3. Hydrazine (rocket fuel!) opens pyrimidine rings on both pyrimidines, cytosine (C) and thymine (T)
4. Hydrazine in the presence of 1.5 M NaCl only reacts with C
5. 1.2 N NaOH at 90 °C strongly cleaves at A and may also weakly cleave at C
Cleavage at Specific Bases
The trick in chemical sequencing is to not allow the reactions to go to completion
Partial reactions run using the following conditions will result in a series of labeled DNA fragments whose final base is known:
Dimethylsulfate at pH 8.0 -----------> G
Piperidine formate at pH 2.0 -------> G and A
Hydrazine ------------------------------> C and T
Hydrazine in 1.5 M NaCl -----------> C
1.2 N NaOH at 90 °C -----------------> A and some C
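The specificities listed above can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not a model of the chemistry: each molecule is cut exactly once, every possible cut is listed, and the weak cleavage at C in the NaOH reaction is treated like any other cut. The lane names and the function name are made up for the example.

```python
# Base specificity of each partial reaction, as listed above
REACTIONS = {
    "G": "G",      # dimethylsulfate, pH 8.0
    "G+A": "GA",   # piperidine formate, pH 2.0
    "T+C": "TC",   # hydrazine
    "C": "C",      # hydrazine in 1.5 M NaCl
    "A>C": "AC",   # 1.2 N NaOH at 90 C (cleavage at C is only weak)
}


def labeled_fragments(sequence, bases):
    """Return the 5'-labeled fragments left by a single cut at each target base.

    The modified base is lost, so a cut at index i leaves sequence[:i].
    """
    return [sequence[:i] for i, base in enumerate(sequence) if base in bases]


seq = "NNGACGTACTTA"  # written 5' to 3', with the label on the 5' end
lanes = {name: labeled_fragments(seq, bases) for name, bases in REACTIONS.items()}
for name, frags in lanes.items():
    print(f"{name:>4}: {frags}")
```

Only the fragments that keep the 5’ label are listed, because the unlabeled 3’ pieces never show up on the autoradiograph.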
Partial Reactions: Dimethylsulphate pH 8.0
P32 label (*) on the 5’ end
5’*NNGACGTACTTA3’
5’*NNGACGTACTTA3’
5’*NNGACGTACTTA3’
5’*NNGACGTACTTA3’
5’*NNGACGTACTTA3’
5’*NNGACGTACTTA3’
5’*NNGACGTACTTA3’
Partial Reactions: Dimethylsulphate pH 8.0
5’*NNGACGTACTTA3’
5’*NNGACGTACTTA3’
5’*NNGACGTACTTA3’
5’*NNGACGTACTTA3’
5’*NNGACGTACTTA3’
5’*NNGACGTACTTA3’
5’*NNGACGTACTTA3’
Modification of some, but not all, of the G bases as the reaction is not allowed to go to completion
Partial Reactions: Dimethylsulphate pH 8.0
5’TACTTA3’
5’ACGTACTTA3’
5’TACTTA3’
5’ACGTACTTA3’
5’ACGTACTTA3’
5’TACTTA3’
5’*NNGAC3’
5’*NN3’
5’*NN3’
5’*NN3’
5’*NN3’
5’*NNGAC3’
Following breaking of the DNA strand at positions where G was chemically modified, two sets of fragments result: 1) A labeled set all ending where a G once was and 2) An unlabeled set which cannot be detected using autoradiography
Unlabeled fragments undetectable using autoradiography
Labeled fragments all of which represent a place where G used to be
Partial Reactions: Hydrazine
5’*NNGACGTACTTA3’
5’*NNGACGTACTTA3’
5’*NNGACGTACTTA3’
5’*NNGACGTACTTA3’
5’*NNGACGTACTTA3’
5’*NNGACGTACTTA3’
5’*NNGACGTACTTA3’
Some, but not all, C and T bases are modified as the reaction is not allowed to go to completion
Partial Reactions: Hydrazine
5’A3’
5’GTACTTA3’
5’A3’
5’TA3’
5’G3’ 5’ACTTA3’
5’ACTTA3’
5’*NNGACGTAC3’
5’*NNGA3’
5’*NNGACGTACT3’
5’*NNGACGTAC3’
5’*NNGA3’
5’*NNGACG3’
Following breaking of the DNA strand at positions where C or T was chemically modified, two sets of fragments result: 1) A labeled set all ending where a C or T once was and 2) An unlabeled set which cannot be detected using autoradiography
Unlabeled fragments
Labeled T+C set
Disadvantages
Toxic chemicals
Large amounts of radioactivity
Sometimes ambiguous and frequently ugly sequencing gels
Tricky to read autorads
Lack of automated methods
Sanger Sequencing
The Sanger sequencing method takes advantage of the way that normal DNA replication occurs
For DNA to be extended using normal DNA polymerases, a hydroxyl group must be present at the 3’ carbon on deoxyribose
Fragments are generated by spiking reactions with small quantities of 2’3’-dideoxynucleotides, which terminate polymerization whenever they are incorporated into DNA
Polymerases used must lack 3’ to 5’ exonuclease proofreading activity for this method to work
Dideoxynucleotides
DNA sequencing using the Sanger method involves the use of 2’3’-dideoxynucleotide triphosphates in addition to regular 2’-deoxynucleotide triphosphates
Because 2’3’-dideoxynucleotide triphosphates lack a 3’ hydroxyl group, and DNA polymerization occurs only in the 3’ direction, once 2’3’-dideoxynucleotide triphosphates are incorporated, primer extension stops
[Figure: nucleotide structure with the phosphate, sugar (carbons 1’ to 5’) and base labeled, comparing a 2’-deoxynucleotide monophosphate (OH on the 3’ carbon) with a 2’3’-dideoxynucleotide monophosphate (H on the 3’ carbon)]
2’3’-dideoxynucleotides Terminate DNA Replication
[Figure: DNA strand with labels: sugar-phosphate backbone, bases, H2O, 2’3’-dideoxynucleotide]
Making DNA Fragments
In Sanger DNA sequencing reactions all the basic components needed to replicate DNA are used
4 reactions are set up, each containing:
– DNA Polymerase
– Primer
– Template to be sequenced
– dNTPs
– A small amount of one ddNTP (ddATP, ddCTP, ddGTP or ddTTP)
As incorporation of ddNTPs terminates DNA replication, a series of fragments is produced all terminating with the ddNTP that was added to each reaction
DNA Sequencing
[Figure: plasmid (or phage) with cloned DNA fragment, showing the cloned fragment, the primer binding sites and a primer]
The ddATP Reaction
5’TTATCG3’
3’AATAGCATGGTACTGATCTTACGCTAT5’
5’TTATCGTACCATGACTAGATGCGA
5’TTATCGTACCA
5’TTATCGTACCATGACTA
5’TTATCGTA
5’TTATCGTACCATGA
5’TTATCGTACCATGACTAGATGCGATA
5’TTATCGTACCATGACTAGA
Pol. 5’TTATCGTA "Let me through!"
Pol. 5’TTATCGTACCATGA "Oh come on!"
Pol. 5’TTATCGTACCATGACTAGA "Not again!"
Pol. 5’TTATCGTACCATGACTAGATGCGATA "Agggg…."
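Chain termination in the ddATP reaction can be sketched as follows. The code assumes an idealized reaction in which every position where ddATP could be incorporated yields one product; in a real tube the products are a random mixture. The template string is a hypothetical one, chosen so that the products come out as the fragments listed on the slide, and the function names are illustrative.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def extend(template_3to5):
    """Full-length new strand (5' to 3') copied from a template written 3' to 5'."""
    return "".join(COMPLEMENT[b] for b in template_3to5)


def terminated_fragments(template_3to5, primer, dd_base):
    """All products that end where the given ddNTP could be incorporated."""
    product = extend(template_3to5)
    if not product.startswith(primer):
        raise ValueError("primer does not match the template")
    return [
        product[: i + 1]
        for i in range(len(primer), len(product))
        if product[i] == dd_base
    ]


# Hypothetical template, written 3' to 5' (an assumption for this sketch)
template = "AATAGCATGGTACTGATCTACGCTAT"
primer = "TTATCG"
dd_a = terminated_fragments(template, primer, "A")
for frag in dd_a:
    print("5'" + frag)
```

Calling `terminated_fragments` with "C", "G" or "T" instead gives the products of the other three reactions.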
Separation of DNA Fragments
All current practical sequencing methods rely on separation of DNA fragments in such a way that differences in length of a single base can be resolved
This is typically done using polyacrylamide gel electrophoresis
Polyacrylamide Gels
Polyacrylamide is a polymer made of acrylamide (C3H5NO) and bis-acrylamide (N,N’-methylene-bis-acrylamide, C7H10N2O2)
[Figure: structures of acrylamide and bis-acrylamide]
Polyacrylamide Gels
Acrylamide polymerizes in the presence of free radicals, typically supplied by ammonium persulfate
[Figure: acrylamide monomers and a sulfate free radical (SO4-.)]
Polyacrylamide Gels
TEMED (N,N,N’,N’-tetramethylethylenediamine) serves as a catalyst in the reaction
[Figure: acrylamide monomers joined into a chain, with the sulfate free radical (SO4-.)]
Polyacrylamide Gels
bis-Acrylamide polymerizes along with acrylamide, forming cross-links between acrylamide chains
[Figure: bis-acrylamide linking two acrylamide chains]
Polyacrylamide Gels
Pore size in gels can be varied by varying the ratio of acrylamide to bis-acrylamide
[Figure: gel networks with lots of bis-acrylamide and with little bis-acrylamide]
DNA sequencing separations typically use a 19:1 acrylamide to bis ratio
Denaturation of DNA
For gel electrophoresis to accurately separate on the basis of size and not shape or other considerations, it is important that the DNA be denatured
This is typically achieved by using a high urea concentration (8 M) in the gel
[Figure: 8 M urea converts both double stranded DNA and self annealing DNA into denatured single stranded DNA]
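The 19:1 ratio and the 8 M urea concentration translate into weighed amounts as sketched below. The gel volume (50 mL) and total acrylamide percentage (6 % w/v) are assumptions made for the example and are not given in the presentation; the ratio is taken to be by mass, and the molar mass of urea is 60.06 g/mol.

```python
UREA_MOLAR_MASS = 60.06  # g/mol


def gel_components(volume_ml, total_percent, ratio=19.0, urea_molar=8.0):
    """Grams of acrylamide, bis-acrylamide and urea for one gel.

    total_percent is % (w/v) of acrylamide plus bis-acrylamide combined;
    ratio is the acrylamide to bis-acrylamide mass ratio.
    """
    total_g = volume_ml * total_percent / 100.0
    bis_g = total_g / (ratio + 1.0)
    acrylamide_g = total_g - bis_g
    urea_g = urea_molar * UREA_MOLAR_MASS * volume_ml / 1000.0
    return acrylamide_g, bis_g, urea_g


# Hypothetical gel: 50 mL at 6 % total acrylamide
acrylamide_g, bis_g, urea_g = gel_components(volume_ml=50, total_percent=6)
print(f"acrylamide {acrylamide_g:.2f} g, bis {bis_g:.2f} g, urea {urea_g:.2f} g")
```

Lowering `ratio` adds more bis-acrylamide for the same total, which is the knob the presentation describes for changing pore size.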
Separation of Fragments: Maxam-Gilbert
5’GACGTACTTA3’
Lanes and the reactions that produce them:
G - Dimethyl sulfate pH 8
G+A - Piperidine formate pH 2
T+C - Hydrazine
C - Hydrazine in 1.5 M NaCl
A>C - 1.2 N NaOH at 90 °C
[Figure: gel with bands (X) in the G, G+A, T+C, C and A>C lanes, read 5’ to 3’]
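Reading a Maxam-Gilbert gel comes down to comparing lanes at each position. The sketch below assumes clean bands and uses only the G, G+A, T+C and C lanes; the A>C lane is left out for simplicity. Band positions are written out by hand for 5’GACGTACTTA3’, and all names are illustrative.

```python
def call_base(in_g, in_ga, in_tc, in_c):
    """Infer one base from which lanes show a band at that position."""
    if in_g and in_ga:
        return "G"
    if in_ga:
        return "A"
    if in_c and in_tc:
        return "C"
    if in_tc:
        return "T"
    return "N"  # no band, or a pattern that does not fit any base


def read_gel(lanes, length):
    """lanes maps a lane name to the set of positions (1 = 5' end) with a band."""
    return "".join(
        call_base(p in lanes["G"], p in lanes["G+A"], p in lanes["T+C"], p in lanes["C"])
        for p in range(1, length + 1)
    )


# Band positions for 5'GACGTACTTA3', entered by hand for illustration
lanes = {
    "G": {1, 4},
    "G+A": {1, 2, 4, 6, 10},
    "T+C": {3, 5, 7, 8, 9},
    "C": {3, 7},
}
sequence = read_gel(lanes, 10)
print(sequence)
```

A band in G+A but not in G is called as A, and a band in T+C but not in C is called as T, which is why the paired lanes are needed.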
Separation of Sanger Fragments
Products from 4 reactions, each containing a small amount of a dideoxynucleotide, are loaded onto a gel
Because polymerization goes 5’ to 3’, the shortest fragments are 5’ compared to longer fragments, which are in the 3’ direction
Lanes: ddTTP, ddCTP, ddGTP, ddATP
Read 5’ to 3’ from bottom to top
DNA Sequencing
What A Sequencing Autorad Actually Looks Like
To read the autorad it is important to start at the bottom and work up so that it is read in the 5’ to 3’ direction
[Figure: autorad with lanes A C G T]
5’CTAGAGGATCCCCGGGTACCGAGCT...3’
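Reading from the bottom up is the same as sorting the bands by fragment length and noting which lane each band sits in. A minimal sketch, with band lengths invented for illustration (they are not measured from the autorad) and a hypothetical function name:

```python
def read_autorad(lanes):
    """lanes maps a base to the fragment lengths seen in that ddNTP lane.

    Shorter fragments run further, so sorting by length reads the gel
    from the bottom (5') to the top (3').
    """
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


# Hypothetical band lengths (primer plus newly added bases) for a short read
lanes = {
    "A": [23, 25, 28],
    "C": [21],
    "G": [24, 26, 27],
    "T": [22, 29],
}
read = read_autorad(lanes)
print("5'" + read + "3'")
```

Reading the same bands from the top down would give the sequence backwards, 3’ to 5’.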
Sequencing Method Refinements
Because of difficulties intrinsic to the Maxam-Gilbert chemical sequencing strategy, efforts at improvement have been concentrated on the Sanger method
Major improvements in the following areas have been achieved:
– Labeling and detection
– Fragment separation
– DNA polymerases used in sequencing and resulting strategies for generation of fragments
– Automation
Pros and Cons of the Sanger Method
It is more amenable to automation than Maxam-Gilbert
Fewer dangerous chemicals are used, but acrylamide and P32 or S35 are still a problem
Gels or autorads are generally cleaner looking and the reading of bases is a lot easier than Maxam-Gilbert data
The bottom line: Without improvements in automation, detection and separation technologies Sanger sequencing is still very labor intensive
Labeling and Detection
Labeling using radioactive isotopes is difficult, dangerous and expensive
Using biotin labeled primers has allowed conjugation of enzymes to fragments and their subsequent detection using substrates that change color in the presence of the enzyme
This technique is clumsy, expensive, time consuming and unreliable
It also may require transfer of fragments to membranes, thus increasing labor, and generally has not caught on
Labeling and Detection
Another approach has involved development of very sensitive silver staining technologies
I have tried this one, it is miserable and unreliable
Read length on gels is typically short and creation of a permanent copy of the gel requires expensive additional equipment and supplies
It may not involve isotopes, but it is such a hassle and the data is of such low quality that it is not worth the effort
Labeling and Detection
The most significant advance in labeling has been the production of electrophoretically neutral dyes that fluoresce at specific wavelengths when excited by laser produced light over a very narrow range of wavelengths
These dyes, when attached to primers, allow detection down to 15 attomoles (1 attomole = 10^-18 mole)
That’s less than 10^7 molecules!
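The molecule count follows from multiplying the amount in moles by Avogadro’s number. A short check in Python (the names are illustrative):

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules(amount_mol):
    """Number of molecules in a given amount of substance."""
    return amount_mol * AVOGADRO


detection_limit_mol = 15e-18  # 15 attomoles
n_molecules = molecules(detection_limit_mol)
print(f"{n_molecules:.2e} molecules")
```

The result is roughly 9 x 10^6 molecules, which is where the "less than 10^7" figure comes from.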
The Li-Cor System
Li-Cor of Lincoln, Nebraska was one of the first to implement fluorescent dyes as part of an automated sequencing system
The Li-Cor system uses infrared lasers scanning a fixed line toward the bottom of an acrylamide slab gel
Fluorescence of dyes attached to DNA fragments is detected as they pass the lasers and detectors
Data in digital form is fed directly into a computer system where automated base calling is done
A graphic representation of the data resembles a traditional autorad with bands appearing in 4 lanes
The Li-Cor System
[Figure: dye labeled fragments in lanes A, T, G and C of a polyacrylamide gel passing a laser and a CCD detector]
Pros and Cons
The Li-Cor system’s major advantage is the length of its DNA reads
– Because all fragments travel through the entire gel, resolution is sufficient to read over 1,000 bases in a single run with over 99 % accuracy
– This is better than just about any single run manual sequencing method
Elimination of manual reading of autorads also eliminates human error and removes a labor intensive step
P32 or S35 not used - another major advantage
Tricky acrylamide gels still must be cast and loaded manually
Applied Biosystems
Applied Biosystems (ABI) has developed fluorescent dye systems further and improved methods for loading and electrophoresis
Four dyes, each of which fluoresces at a different wavelength but has about the same impact on electrophoretic mobility, can be used to label either primers or the nucleotides that terminate a reaction
If terminator dyes are used, the entire sequencing reaction is reduced to one tube from 4 in conventional Sanger sequencing
Instead of polyacrylamide slab gels, a single capillary can be used with a liquid polymer that is replaced after each individual run
Replication Using Dye Terminators
3’AATAGCATAACGTTAACGTTACGCTAT5’
Pol. 5’TTATCGTA "Let me through!"
Pol. 5’TTATCGTACCAC "Oh come on!"
Pol. 5’TTATCGTACCATAATT "Not again!"
Pol. 5’TTATCGTACCATAATTGCA "Agggg…."
5’TTATCG
5’TTATCGTATTGCAATTGCA
5’TTATCGTATTGCAATT
5’TTATCGTA
5’TTATCGTATTGCAAT
5’TTATCGTATTGCAATTG
5’TTATCGTATTGCAA
5’TTATCGTATTGCAATTGC
5’TTATCGTATTGCA
5’TTATCGTAT
5’TTATCGTATTG
5’TTATCGTATT
5’TTATCGTATTGC
As the base at the end of each fragment is clearly marked with a unique fluorescent dye, the entire reaction can be done in a single tube
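The single tube idea can be sketched by tagging each terminated fragment with the dye of its last base and then reading the dyes in order of fragment length. The dye colors below are placeholders, since the presentation does not say which dye marks which base, and the code assumes an idealized reaction with one product of every length. Function names are illustrative.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
# Placeholder colors: which dye marks which base is an assumption
DYE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
BASE_FOR_DYE = {color: base for base, color in DYE.items()}


def one_tube_products(template_3to5, primer):
    """Every terminated product, paired with the dye on its last base."""
    product = "".join(COMPLEMENT[b] for b in template_3to5)
    return [
        (product[: i + 1], DYE[product[i]])
        for i in range(len(primer), len(product))
    ]


def read_capillary(products):
    """Shortest fragments pass the detector first; the dye color gives the base."""
    ordered = sorted(products, key=lambda item: len(item[0]))
    return "".join(BASE_FOR_DYE[color] for _, color in ordered)


template = "AATAGCATAACGTTAACGTTACGCTAT"  # written 3' to 5'
primer = "TTATCG"
read = read_capillary(one_tube_products(template, primer))
print(read)
```

With four separate lanes the lane identifies the base; here the color does, so one lane or one capillary is enough.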
ABI Prism 310 System
[Figure: ABI Prism 310 schematic with labels: sequencing reaction, capillary, liquid polymer, heat plate, window, laser, beam splitter, detectors, - and + ends; fragments A T T G C A shown passing the window]
The State of the Art
The ABI Prism 310 (1 capillary), 3100 (16 capillaries) and 3700 (96 capillaries) represent the current state of the art in automated sequencing machines
A single ABI Prism 377 slab gel sequencer can run 115,000 bases per day!
The 3100 can run up to 184,000 bases per day
The 3700 can run up to 1,104,000 bases per day
Large sequencing facilities, like Celera, have factories full of these machines which can run 24 hours a day with very little down time for routine maintenance
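These daily figures can be turned into a rough estimate of how long a genome takes to read. The sketch below uses the bases per day quoted above and the human genome size quoted earlier in the presentation; the fleet of 100 machines and the single pass coverage are assumptions for illustration, not numbers from any real facility.

```python
BASES_PER_DAY = {"377": 115_000, "3100": 184_000, "3700": 1_104_000}
CAPILLARIES = {"3100": 16, "3700": 96}


def per_capillary(model):
    """Bases per day delivered by each capillary of a given model."""
    return BASES_PER_DAY[model] / CAPILLARIES[model]


def days_to_read(genome_bp, model, machines=1, coverage=1):
    """Days of nonstop running to read genome_bp bases `coverage` times over."""
    return genome_bp * coverage / (BASES_PER_DAY[model] * machines)


human_bp = 3e9  # human genome size quoted earlier in the presentation
single = days_to_read(human_bp, "3700")
fleet = days_to_read(human_bp, "3700", machines=100)  # fleet size is hypothetical
print(f"one 3700: {single:,.0f} days; 100 machines: {fleet:.1f} days")
```

The 3100 and 3700 come out at the same 11,500 bases per capillary per day, so the larger machine gains its throughput purely from running more capillaries in parallel.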
The State of the Art
[Figure: ABI Prism 3700]