Making Sense of Genomes
![Page 1: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/1.jpg)
Making Sense of Genomes
![Page 2: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/2.jpg)
Interpreting genomes
• DNA was discovered in the mid- to late 1800s, during biochemical investigations of proteins, as a phosphorus-rich substance (called nuclein because it was isolated from white cell nuclei)
• The term chromatin was coined to describe the colored components of cell nuclei after staining
• Chromosome, coined in 1888, describes threads of stainable material found within the nucleus; in the 1930s nuclein became desoxyribose nucleic acid, later DNA
![Page 3: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/3.jpg)
Early work looked at genome size
![Page 4: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/4.jpg)
Animal genome size variation
• Ranges from 0.04 pg in a placozoan to ~133 pg in lungfish (~3300X difference)
• Current dataset is skewed towards vertebrates (66%)
• Some hagfish undergo chromatin loss: large fragments of the genome present in the germline are eliminated in the somatic line
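The fold difference quoted above is simple arithmetic on the two C-values. A minimal sketch follows; the conversion factor of roughly 978 Mb per picogram is a commonly used approximation that is not stated on the slide, and the function names are hypothetical.

```python
# Assumption (not from the slide): 1 pg of DNA is roughly 978 million base pairs.
MB_PER_PG = 978.0

def pg_to_mb(picograms):
    """Convert a genome size in picograms to megabases (approximate)."""
    return picograms * MB_PER_PG

def fold_difference(small_pg, large_pg):
    """Ratio between two genome sizes given in the same unit."""
    return large_pg / small_pg

placozoan_pg = 0.04   # value from the slide
lungfish_pg = 133.0   # value from the slide

print(round(fold_difference(placozoan_pg, lungfish_pg)))  # 3325, i.e. ~3300X
print(pg_to_mb(lungfish_pg))  # lungfish genome size in Mb under the assumed factor
```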
![Page 5: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/5.jpg)
Invertebrate genomes
• At most, 1% have been looked at
• Terrestrial snails and slugs have 2X larger genomes than freshwater relatives
• Some annelid groups exhibit huge differences (120X range) unrelated to polyploidy
• No information on tapeworm or tarantula – potential senior project.
![Page 6: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/6.jpg)
Mechanisms for alterations in animal genome sizes
• Insertion-deletion mechanism for genome reduction
• Selfish DNA and spread of transposable elements
• Accumulation of pseudogenes
• Introns (clear lack in Fugu (pufferfish))
• Chromosome-level events
– Aneuploidy (duplication or loss of individual chromosomes)
– Segments break off and fuse (e.g. two chromosomes that are separate in other apes are fused together as chromosome 2 in humans)
– Unequal crossing over in meiosis, and unequal sister chromatid exchange during gamete formation
• Polyploidy
– Duplication of entire chromosome set
• Satellite DNA
– Mini (9-100 bp; 15 bp mostly), short (3-5 bp), micro (1-5 bp; ignore overlap with short), copies of rDNA
![Page 7: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/7.jpg)
Plants are well-known polyploids
• Allopolyploidy – combination of genetically distinct chromosome sets
• Autopolyploidy – multiplication of one basic set of chromosomes
• Wheat is an allohexaploid containing three distinct sets of chromosomes from three different diploid species of goat-grass.
![Page 8: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/8.jpg)
![Page 9: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/9.jpg)
Genome size correlates with cell size
![Page 10: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/10.jpg)
Genome size and phenotypes
• As cells become larger, the surface-to-volume ratio changes (affecting the exchange rate with the environment)
• Transcription is affected by cell size
• Body size is a function of cell number, not cell size
• Metabolism: higher metabolic rates trend with smaller genomes and cell sizes (e.g. among birds, flightless birds have larger genomes)
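The surface-to-volume point can be made concrete with a rough calculation. This sketch assumes an idealized spherical cell, which is a simplification and not something the slide states; under that assumption the ratio works out to 3/r, so it halves each time the radius doubles.

```python
import math

def surface_to_volume(radius):
    """Surface-area-to-volume ratio of a sphere (idealized cell shape)."""
    area = 4.0 * math.pi * radius ** 2
    volume = (4.0 / 3.0) * math.pi * radius ** 3
    return area / volume

# Arbitrary example radii (any length unit); the ratio falls as the cell grows.
for radius in (1.0, 2.0, 4.0):
    print(radius, surface_to_volume(radius))
```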
![Page 11: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/11.jpg)
Reasons for correlation
• Obviously a physical constraint: large genomes need more room
• DNA acts as a nucleoskeleton around which nucleus is assembled?
• Observe proportionate change in cell size in response to polyploidization
• Cell size is presumably due to the nature of the DNA as well as amount
![Page 12: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/12.jpg)
Other trends related to genome/cell size
![Page 13: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/13.jpg)
Duplications and deletions
![Page 14: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/14.jpg)
Inversions
![Page 15: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/15.jpg)
Genome Sequencing “Big” Biology
![Page 16: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/16.jpg)
![Page 17: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/17.jpg)
$1000 genome
• Race for the prize
• Methods
• YouTube1
• Whose genome in the databases?
• Venter – writing the code
![Page 18: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/18.jpg)
A G C TAGCATCCGTAT
Capillary and slab gel electrophoresis use a modified Sanger technology with fluorescent dyes
Typical reads of 500-750 nt on an hour timescale. Variation depending on sequencer.
![Page 19: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/19.jpg)
Four color sequencing
![Page 20: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/20.jpg)
![Page 21: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/21.jpg)
![Page 22: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/22.jpg)
Innovations in DNA sequencing
• Sequencing by synthesis
• Cot-based analysis
• Chip-based analysis, hybridization
• Single molecule linear read, RNA polymerase
• Nanopore technology
– Different nucleotides = different change in electric signal
![Page 23: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/23.jpg)
Free Solution Electrophoresis
• Possibly will improve separation time (no matrix) without losing read length
• Label DNA molecules with a friction-increasing molecule such as streptavidin
• Currently can read 100 bp, a long way to go…
![Page 24: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/24.jpg)
Who needs electrophoresis?
• Pyrosequencing
• MALDI-TOF Mass Spectrometry
• Sequencing by Hybridization
• Massively Parallel Signature Sequencing
– A testimony to innovative molecular biology
• Single molecule methods
![Page 25: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/25.jpg)
Pyrosequencing
• Real-time sequencing measuring release of PPi during DNA synthesis
• Has been of particular use for SNP analysis
• The four deoxynucleotide triphosphates are added to the reaction one at a time; when the correct one is incorporated, PPi is released and measured using ATP sulfurylase-coupled ATP synthesis and luciferase – wash and repeat
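The add, measure, wash, and repeat cycle can be sketched as a toy simulation. Several simplifications are assumed here and are not from the slide: the input is the strand being synthesized rather than the template, the light signal is idealized as an exact count of incorporated bases, and the nucleotides are dispensed in a fixed A, C, G, T order. The function names are hypothetical.

```python
def pyrosequence(new_strand, dispensation_order="ACGT", max_cycles=100):
    """Return a flowgram: one (base, signal) pair per nucleotide dispensation.

    The signal is the number of bases incorporated in that step, standing in
    for the amount of PPi released (idealized, noise-free).
    """
    flowgram = []
    pos = 0
    for _ in range(max_cycles):
        for base in dispensation_order:
            run = 0
            # A homopolymer run is incorporated in a single dispensation.
            while pos < len(new_strand) and new_strand[pos] == base:
                run += 1
                pos += 1
            flowgram.append((base, run))
            if pos == len(new_strand):
                return flowgram
    return flowgram

def decode(flowgram):
    """Rebuild the sequence from the flowgram signals."""
    return "".join(base * signal for base, signal in flowgram)

flow = pyrosequence("GGATC")
print(flow)
print(decode(flow))  # GGATC
```

Dispensations with a signal of zero correspond to nucleotides that were washed away without being incorporated.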
![Page 26: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/26.jpg)
Put the sequencing reactions through a mass spectrometer
Spectra of the C- and G-terminated oligonucleotides
Current limit ~100 bp. Facilitated by sensitivity and high-throughput loading
![Page 27: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/27.jpg)
Shotgun sequencing – 2 approaches
– Hierarchical shotgun approach
• Generating an overlapping set of intermediate-sized clones (e.g. bacterial artificial chromosomes with 200 kb inserts), and keeping a map of that (mapping E. coli took 2 yrs)
• Subjecting each of these clones to shotgun sequencing, and using the map to get the whole sequence.
– Whole-genome shotgun (WGS) approach
• Generating sequence reads directly from a whole-genome library
• Using computational techniques to reassemble in one step.
• Used for Drosophila melanogaster (fruit fly) and by Celera Genomics (formed 1998) for human genome.
![Page 28: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/28.jpg)
Overview of “Shotgun” Genomic Sequencing
Break DNA into random fragments (8-10X Coverage)
Original DNA
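Coverage relates the number of reads to genome size and read length: coverage = (number of reads x read length) / genome size. The sketch below assumes idealized, uniformly random reads; the estimate of the uncovered fraction as e^(-coverage) is the standard Lander-Waterman approximation, which the slide does not mention. The genome size and read length in the example are hypothetical round numbers.

```python
import math

def reads_needed(genome_size_bp, read_length_bp, coverage):
    """Number of reads required to reach a target average coverage."""
    return math.ceil(coverage * genome_size_bp / read_length_bp)

def expected_uncovered_fraction(coverage):
    """Approximate fraction of bases covered by no read (random-read model)."""
    return math.exp(-coverage)

# Hypothetical example: a 1 Mb genome sequenced with 500 bp reads at 8X.
print(reads_needed(1_000_000, 500, 8))   # 16000
print(expected_uncovered_fraction(8))    # a few bases in ten thousand remain unread
```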
![Page 29: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/29.jpg)
Cloning vectors
• 2-5 kb in pUC or M13
• 5-50 kb in phage or cosmid
• 30-100 kb in P1 bacteriophage
• 60-300 kb in BAC
• 60-2000 kb in YAC
![Page 30: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/30.jpg)
Overview of Genomic Sequencing
Break DNA into random fragments (8-10X Coverage)
Amplify fragments in a vector and sequence 500-700 bases in from each end
Original DNA
Base calling performed by Phred software:
http://www.phrap.org/
http://www.genome.org/cgi/reprint/8/3/175.pdf
![Page 31: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/31.jpg)
Phred Software
• Calls bases in four phases:
– Predicting peaks (ideal locations)
– Locating observed peaks
– Matching observed to predicted
– Finding missing peaks
• http://www.genome.org/cgi/reprint/8/3/186.pdf
• http://www.genome.org/cgi/reprint/8/3/175.pdf
![Page 32: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/32.jpg)
Errors in Sequencing Reads
• Each base call is assigned a quality score:
– q = -10 x log10(p), where p is the estimated error probability (higher quality scores correspond to lower error probabilities)
• Errors are associated with peak vicinity; the following parameters are used in error probability determination on a TRAINING SET:
– Peak spacing
– Uncalled/called ratio (two window sizes)
– Peak resolution
• The result is a look-up table inherent to Phred software
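The quality-score formula above converts directly in both directions. A minimal sketch, with hypothetical function names; it only applies the formula and does not reproduce Phred's trained look-up table.

```python
import math

def error_prob_to_quality(p):
    """q = -10 * log10(p), where p is the estimated error probability."""
    return -10.0 * math.log10(p)

def quality_to_error_prob(q):
    """Inverse of the formula: p = 10 ** (-q / 10)."""
    return 10.0 ** (-q / 10.0)

print(error_prob_to_quality(0.001))  # about 30: one expected error in 1000 calls
print(quality_to_error_prob(20))     # about 0.01: one expected error in 100 calls
```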
![Page 33: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/33.jpg)
Common Sources of Sequencing Errors
• The first fifty or so peaks of a trace are noisy and unevenly spaced due to anomalous migration of short DNA fragments, and unreacted dye-primer and dye-terminator molecules.
• Near the end of the trace, peaks become less evenly spaced due to less accurate trace processing and less well resolved as diffusion effects increase; the number of labeled molecules also decreases.
• Compressions – most common in GC-rich regions when bases near the end of a single-stranded fragment bind to a complementary region forming a hairpin (migrates more rapidly than expected)
• Dye-terminator sequencing method helps resolve compressions, but has its own problems: “About 85% of high quality dye terminator errors resulted from a missing G peak following an A, or a missing A following a T,…” Ewing and Green, 1998.
![Page 34: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/34.jpg)
Overview of Genomic Sequencing
Break DNA into random fragments (8-10X Coverage)
Amplify fragments in a vector and sequence 500-700 bases in from each end
Assemble fragments of sequence that have been read:
Original DNA
Contig 1 Contig 2
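The assembly step can be illustrated with a toy greedy overlap merge. This is a sketch only: it is not the algorithm used by any particular assembler, it assumes error-free reads from a single strand, and the reads and the minimum overlap of 3 bases are made up for illustration. Reads that share no sufficient overlap stay in separate contigs, which is how gaps between contigs arise.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a equal to a prefix of b, or 0 if shorter than min_len."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_overlap=3):
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_overlap)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # no remaining overlaps: these are the contigs
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)

# Hypothetical reads: the first three overlap each other, as do the last two.
reads = ["AGCATC", "ATCCGT", "CGTATG", "TTTGGA", "GGACCA"]
print(sorted(greedy_assemble(reads)))  # ['AGCATCCGTATG', 'TTTGGACCA']
```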
![Page 35: Making Sense of Genomes](https://reader030.fdocuments.net/reader030/viewer/2022032612/568131bc550346895d98246b/html5/thumbnails/35.jpg)