Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype &...

48
1 Intro 2: Last week's take home lessons Elements & Purification Systems Biology & Applications of Models Life Components & Interconnections Continuity of Life & Central Dogma Qualitative Models & Evidence Functional Genomics & Quantitative models Mutations & Selection Harvard-MIT Division of Health Sciences and Technology HST.508: Genomics and Computational Biology x= u c a g Uxu F Y C S - - R - W Cxu L H P R Q axu N S I T K R M NH+ gxu D V A G E A uxc uxa TE uxg cxc cxa cxg axc C-S axa axg gxc O- gxa gxg H:D/

Transcript of Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype &...

Page 1: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

1

Intro 2: Last week's take home lessons

Elements & PurificationSystems Biology & Applications of Models Life Components & Interconnections Continuity of Life & Central DogmaQualitative Models & EvidenceFunctional Genomics & Quantitative modelsMutations & Selection

Harvard-MIT Division of Health Sciences and Technology

HST.508: Genomics and Computational Biology

x= u c a g

Uxu F Y C S

- - R - W

Cxu L H P R

Q

axu N S I T

K R M NH+

gxu D V A G

E A

uxc uxa TEuxg

cxc cxa cxg

axc C-S axa axg

gxc O­gxa gxg H:D/

Page 2: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

DNA 1: Today's story, logic & goals

Binomial & exponential dx/dt = kx

Types of mutantsMutation, drift, selection

Association studies χ2 statistic Linked and causative alleles Haplotypes Computing the first genome,

the second ... New technologies Random and systematic errors

2

Page 3: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

Connecting Genotype & Phenotype

%DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms (SNPs) 70-98% Speciation 30% Sequence homology <25% Distant (detectable only in 3D structures)

3

Page 4: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

Types of phenotypic effects of mutations

Null: PKU Dosage: Trisomy 21

Gain of function: HbS Altered ligand specificity

Conditional (e.g. temperature or chemical)

4

Page 5: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

Types of mutations

Single substitution: A to C, G or T, etc.Deletion: 1 bp ... chromosomes (aneuploidy)Duplication: as above (often at tandem repeats)Inversion: ABCDEFG to ABedcFG Translocation: ABCD & WXYZ to ABYZ & WXCDInsertion: ABCD to ABινσερτCDRecombination: ABCDEFGH & ABcDEfGH to

ABcDEFGH & ABCDEfGH5

Page 6: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

Mutations & Polymorphisms

Mutations become polymorphisms or “common alleles” when frequency > 1% in a population (arbitrary)

All Single Nucleotide Polymorphisms (SNPs) (probably) exist in the human population: 3 billion x 4 (ACGT) at frequencies near 10-5 .

SNPs linked to a phenotype or causative.

6

Page 7: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

Haplotypes

Representation of the DNA sequence of one chromsome (or smaller segments in cis).

Indirect inference from pooled diploid data

Direct observation from meiotic or mitotic segregation,cloned or physically separated chromsomes or segments

7

Page 8: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

Linkage & Association

Family Triad: parents & child vs case-control

vs.

Case-control studies of association in structured or admixed populations. Pritchard &Donnelly, 2001. To appear in Theor. Pop. Biol. Program STRAT

Null hypothesis: allele frequencies in a candidate locus do not depend on phenotype (within subpopulations)

8

Page 9: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

Pharmacogenomics Gene/Enzyme Drug Quantitative effect

Examples of clinically relevant genetic polymorphisms influencing drug metabolism and effects. Additional data (http://www.sciencemag.or g/feature/data/1044449.shl)

CYP2C9 Tolbutamide, warfarin, phenytoin, nonsteroidal anti-inflammatories

Anticoagulant effect of warfarin

Beta blockers, antidepressants, antipsychotics, Tardive dyskinesia from codeine, debrisoquin, dextromethorphan, encainide, antipsychotics; narcotic side

CYP2D6 flecainide, guanoxan, methoxyamphetamine, N - effects, efficacy, and dependence; propylajmaline, perhexiline, phenacetin, phenformin, imipramine dose requirement; beta­propafenone, sparteine blocker effect

Dihydropyrimidine dehydrogenase Fluorouracil Fluorouracil neurotoxicity

Thiopurine methyltransferase Mercaptopurine, thioguanine, azathioprine Thiopurine toxicity and efficacy; risk of second cancers

Renoprotective effects, cardiac ACE Enalapril, lisinopril, captopril indices, blood pressure,

immunoglobulin A nephropathy

Potassium channels

HERG Quinidine Drug-induced long QT syndrome

Cisapride Drug-induced torsade de pointes KvLQT1 Terfenadine, disopyramide, meflaquine Drug-induced long QT syndrome

hKCNE2 Clarithromycin 9Drug-induced arrhythmia

Page 10: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

DNA Diversity Databases

~100 genomes completed (GOLD) (http://wit.integratedgenomics.com/GOLD/)

A list of SNP databases (http://ariel.ucs.unimelb.edu.au/~cotton/mdi.htm)

3 million human SNPs www.ncbi.nlm.nih.gov/SNP

mapped snp.cshl.org

23K to 60K SNPs in genes HGMD(http://snp.cshl.org/naturepaper.html), (http://archive.uwcm.ac.uk/uwcm/mg/docs/hahaha.html)

10

Page 11: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

Causative SNPs can be in non-coding repeats

aggcAggtggatca aggcGggtggatca

ALU repeat found upstream of Myeloperoxidase

“severalfold less transcriptional activity”"-463 G creates a stronger SP1 binding site & retinoic acid response element (RARE) in the allele... overrepresented in acute promyelocytic leukemia"

Piedrafita FJ, et al. 1996 JBC 271: 14412 (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=8662930&dopt=Abstract)

11

Page 12: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

Modes of inheritanceDNA, RNA (e.g. RNAi), protein (prion),

& modifications (e.g. 5mC)

“Horizontal” (generally between species) transduction, transformation, transgenic

“Vertical” Mitosis: duplication & division (e.g. somatic) Meiosis/fusion: diploid recombination, reduction Maternal (e.g. mitochondrial)

12

Page 13: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

Today's story, logic & goals

Types of mutantsMutation, drift, selection

Binomial & exponential dx/dt = kx Association studies χ2 statistic Linked and causative alleles Haplotypes Computing the first genome,

the second ... New technologies Random and systematic errors

13

Page 14: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

Where do allele frequencies come from?

Mutation/migration(M), Selection(S), Drift (D), … Assumptions: Constant population size N Random mating Non-overlapping generations (NOT at equilibrium, not infinite alleles,sites or N)

See: Fisher 1930, Wright 1931, Hartl & Clark 1997 (http://shop.barnesandnoble.com/textbooks/booksearch/isbnInquiry.asp?isbn=0878933069)

14

Page 15: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

Directional & Stabilizing Selection

• codominant mode of selection (coefficient s) – fitness of heterozygote is the mean of the fitness(w) of the two

homozygotesAA = 1; Aa = 1 + s; aa = 1 + 2s– always increase frequency of one allele at expense of the other

• overdominant mode – heterozygote has highest fitnessAA = 1, Aa = 1 + s; aa = 1 + twhere 0 < t < s– reach equilibrium where two alleles coexist

See H&C 1997 p. 229

15

Page 16: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

In-framemutants

Ratio of strains over environments, e , times, te , selection coefficients, se, R = Ro exp[-Σsete]

+ wild-typeTagged mutants

40° pH5 NaCl Complex Pool Select

t=0

Multiplex PCR size-tag or chip readout

16

Page 17: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

Where do allele frequencies come from?Mutation/migration(M), Selection(S), Drift (D), … Mj= Σ(Ti*B[N-i,j-i,F]); Mj= Σ(Mi*B[i,i-j,R])

i=0,j i=j,N

Sj= Σ(Mi*B[N-i,j-i,1-1/w]); Sj= Σ(Mi*B[i,i-j,1-w]); i=1,j if w>1 i=j,N-1 if w<1

Dj= Σ Si*B[N,j,i/N] i=1,N-1

w=relative fitness of i mutants to N-i original Ti, Mi, Di, Si = frequency of i mutants in a pop. size N F= forward mutation(or migration) probability ; R=reverse. B(N,i,p)= Binomial = C(N,i) pi (1-p)N-i

(Fisher 1930, Wright 1931, Hartl & Clark 1997) 17 (http://shop.barnesandnoble.com/textbooks/booksearch/isbnInquiry.asp?isbn=0878933069)

Page 18: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

Random Genetic Drift very dependent upon population size

18

Page 19: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

Role of Genetic Exchange

• Effect on distribution of fitness in the whole population • Can accelerate rate of evolution at high cost (50%)

from Crow & Kimura 1970 Clark & Hartl 191997 p. 182

Page 20: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

Common Disease – Common Variant Theory. How common?

ApoE allele ε4 : Alzheimer’s dementia, & hypercholesterolemia 20% in humans, >97% in chimps (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=7772071&dopt=Abstract)

HbS 17% & G6PD 40% in a Saudi sample

CCR5∆32 : resistance to HIV 9% in caucasians

20

Page 21: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

Are rare variants responsible for susceptibility to complex diseases?

“Customary in theoretical work relating to complex diseases, the allele frequencies ... are treated as parameters of the model” New here: “resulting from an evolutionary process including selection, mutation, and genetic drift ... to learn about the underlying allele frequencies”

See Pritchard Am.J.Hum.Gen 69:124-137. (2001) Programs (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=11404818&dopt=Abstract)

(http://www.stats.ox.ac.uk/~pritch/software.html)

Top 5 RR 21

Page 22: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

DNA1: Today's story, logic & goals

Types of mutants Mutation, drift, selection

Association studies χ2 statistic Linked and causative alleles Haplotypes Computing the first genome,

the second ... New technologies Random and systematic errors

Binomial & exponential dx/dt = kx

22

Page 23: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

One form of HIV-1 Resistance

23

Page 24: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

r

Association test for CCR-5 & HIV resistanceAlleles Obs Neg ObsSeroPos total ExpecNeg ExpecPos CCR-5+ 1278 1368 2646 1305 1341 ∆ ccr-5 130 78 208 103 105 total 1408 1446 2854

P dof=(r-1)(c-1)=1 ChiSq=sum[(o-e)^2/e]= 15.6 0.00008

Samson et al. Nature 1996 382:722-5 (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=8751444&dopt=Abst

24

Page 25: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

25

Y ( ) (p)

1

[(

|

= Numbe r of S ib P airs As s ociationX= P opulation fre que ncy

GRR=1.5, #alle le s =1E6

1E+2

1E+3

1E+4

1E+5

1E+6

1E+7

1E+8

1E+9

1E+10

1E-09 1E-08 1E-07 1E-06 0.00001 0.0001 0.001 0.01 0.1

bas ed on Ris ch & Merikangas 1996) S c ience 273: 1516]

But what if we test more than one locus?

The future of genetic studies of complex human diseases. Ref (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=8801636&dopt=Abstract)

GRR = Genotypic relative risk

Y ( ) (

1

2

[(|

|

= Numbe r of S ib P airs As s ociationX= Ge notypic Re lative Ris k GRR)

#alle le s =1E6, p=0.5

1E+1

1E+2

1E+3

1E+4

1E+5

1E+6

1E+7

1E+8

0.001 0.01 0.1 10 100 1000 10000

1 .00 1 1 .01 1 .1 11 10 1 1,00 1 10,0 01

1-GRR GRR

bas ed on Ris ch & Merikangas 1996) S cience 273: 1516]

Y )

0

200

400

600

800

[(|

= Numbe r of S ib P airs (As s ocationX= Numbe r of Alle le s (Hypothe s e s ) Te s te d

GRR=1.5, p= 0.5

1,000

1,200

1,400

1,600

1E+4 1E+6 1E+8 1E+10 1E+12 1E+14 1E+16 1E+18 1E+20 1E+22

bas ed on Ris ch & Merikangas 1996) S cience 273: 1516]

Page 26: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

How many "new" polymorphisms?

G= generations of exponential population growth = 5000 N'= population size = 6 x 109 now; N= 104 pre-G m= mutation rate per bp per generation = 10-8 to 10-9 (ref) (http://www.nature.com/cgi-taf/DynaPage.taf?file=nature/journal/v397/n6717/abs/397344a0_fs.html&filetype=&content_filetype=)

L= diploid genome = 6 x 109 bp ekG = N'/N; so k= 0.0028 Av # new mutations < Σ Lektm = 4 x 103 to 4 x 104

per genome t=1 to 5000

Take home: "High genomic deleterious mutation rates in hominids" accumulate over 5000 generations & confound linkage methods

26And common (causative) allele assumptions.

Page 27: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

Finding & Creating mutants

IsogenicProof of causality: Find > Create a copy > Revert

Caution:Effects on nearby genesAneuploidy (ref)

(http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=10888885&dopt=Abstract) 27

Page 28: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

28

Pharmacogenomics Example

5-hydroxytryptamine transporter

Lesch KP, et al Science 1996 274:1527-31 Association of anxiety-related traits with a polymorphism in the serotonin transporter gene regulatory region. Pubmed (http://www.ncbi.nlm.nih.gov/htbinpost/Entrez/query?uid=8929413&form= 6&db=m&Dopt=b)

Page 29: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

Caution: phases of human genetics

Monogenic vs. Polygenic dichotomy

Method Problems Mendelian Linkage (300bp) need large families Common indirect/LD (106bp) recombination & new alleles Common direct (causative) 3% coding + ?non-coding All alleles (109) expensive ($0.20 per SNP)

(methods) (http://www.ncbi.nlm.nih.gov/SNP/snp_tableList.cgi?type=method)

29

Page 30: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

DNA1: Today's story, logic & goals

Types of mutants Mutation, drift, selection

Binomial & exponential dx/dt = kx Association studies χ2 statistic Linked and causative alleles Haplotypes Computing the first genome,

the second ... New technologies Random and systematic errors

30

Page 31: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

Examples of random & systematic errors?

For (clone) template isolation:

For sequencing:

For assembly:

31

Page 32: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

Sequence assembly

aaaaaggggggccccccc aggggggccAcccctttttttag

Overlap 100 kbp BAC clone (haplotype)

ccccctttttttagcgc 4 sequences in 2 islands

32acgacatagcgactagcta

Page 33: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

See Ewing, Hillier, Wendl, & Green 1998

33

Page 34: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

Examples of random & systematic errors?

For (clone) template isolation:

For sequencing:

For assembly:

34

Page 35: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

Examples of systematic errors

For (clone) template isolation:restriction sites, repeats

For sequencing: Hairpins, tandem repeats

For assembly: repeats, errors, polymorphisms, chimeric clones, read mistracking 35

Page 36: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

36

Project completion % vs coverage redundancy

(Roach 1995)(http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=8808467&dopt=Abstract)

0%

20%

40%

60%

80%

100%

120%

140%

160%

0 1 2 3 4 5 6 7 8 9 10 11 12

Closure Probab. 1939

Av Island length 1995

Island Length 1988

Whole-genome shotgun

X= mean coverage

Page 37: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

Weber & Myers 1997 37

Page 38: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

2-Oct-2002 Boston GSAC Panel Discussion"The Future of Sequencing Technology: Advancing

Toward the $1,000 Genome"

Moderators: •J. Craig Venter, Ph.D., The Center for Advancement of Genomics •Gerald Rubin, Ph.D., Howard Hughes Medical Institute

Speakers: •George Church, Ph.D., Harvard University •Eugene Chen, Ph.D., US Genomics •Tony Smith, Ph.D., Solexa •Trevor Hawkins, Ph.D., Amersham Biosciences Corporation •Susan Hardin, Ph.D., VisiGen Biotechnologies, Inc. •Michael P. Weiner, 454 Corporation •Daniel H. Densham, Mobious Genomics, Ltd

38

Page 39: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

Conventional

with 2 hairpin

B B’

C T A

G ddA

Gel size separation

ddT

B B’

C T A

G A

dideoxy gel

3’ 5’ 3’ 5’

39

Page 40: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

A

Conventional

with 2 hairpin

B B’

Systematic errors

T A T A

dideoxy gel

3’ 5’

G A

CTA

Sequential dNTP addition (pyrosequencing) > 30 base reads; no hairpin artefacts 40

Page 41: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

41

Fluorescent primers or ddNTPs

Anal Biochem 1997 Oct 1;252(1):78-88 Optimization of spectroscopic and electrophoretic properties of energy transfer primers. Hung SC, Mathies RA, Glazer AN

http://www.pebio.com/ab/apply/dr/dra3b1b.html

Page 42: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

New Genotyping & haplotyping technologiesde novo sequencing > scanning > selected sequencing > diagnostic methods Sequencing by synthesis • 1-base Fluorescent, isotopic or Mass-spec* primer extension (Pastinen97) • 30-base extension Pyrosequencing (Ronaghi99)* • 700-base extension, capillary arrays dideoxy* (Tabor95, Nickerson97, Heiner98)

SNP & mapping methods • Sequencing by hybridization on arrays (Hacia98, Gentalen99)* • Chemical & enzymatic cleavage: (Cotton98) • SSCP, D-HPLC (Gross 99) Femtoliter scale reactions (105 molecules) • 20-base restriction/ligation MPSS (Gross 99) • 30-base fluorescent in situ amplification sequencing (Mitra 1999) Single molecule methods (not production) • Fluorescent exonuclease (Davis91) • Patch clamp current during ss-DNA nanopore transit (Kasianowicz96) • Electron, STM, optical microscopy (Lagutina96, Lin99)

42

Page 43: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

43

Use of DNA Chips for SNP ID & Scoring

Wang et al., Science 280 (1998): 1077

• with HIV-1, BRCA1, mitochondria

• human DNA

T G C A

T TTGAACA G

(Context) C A T

TTGCACA G C A

A/A A/C C/C

(http://www.sciencemag.org/cgi/content/full/280/5366/1077)

Used for mutation detection

higher throughput and potential for automation ID of > 2000 SNPs in 2 Mb of

Multiplex reactions 50-fold

Page 44: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

Use of Mass Spec for Analysis and Scoring

Haff and Smirnov, Genome Research 7 (1997): 378

A single nucleotide primer extension assay 44

Page 45: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

45

Mass Spectrometry for Analysis and

Scoring Haff and Smirnov,

Genome Res. 7 (1997): 378

Use mass spec to score which base(s) add

Multiplex 5 with known primer masses

Pool 50 to 500 samples

Sequenom

(http://www.sequenom.com/genotyping/overview/techno/techno.html)

Page 46: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

Searching for (nearly) exact matches

Hash Suffix arrays Suffix trees

4N ~ = Genome length N=word length (for “lookup”) e.g. Set aside space for 416 ~ = 4 billion genomic positions (each requires 4­bytes of storage).

46

Page 47: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

Exact Sequence Searching

#!/usr/local/bin/perl$dnatext = "ggggggCgggCgggCgggg";print " Original genome: $dnatext \n";$n_mut = $dnatext =~ s/gC/gg/gi;print " Found: $n_mut mutation(s)\n"; print " After gene-therapy: $dnatext \n";

Original genome: ggggggCgggCgggCgggg Found: 3 mutation(s)After gene-therapy: ggggggggggggggggggg

47

Page 48: Intro 2: Last week's take home lessons · Random and systematic errors 2 . Connecting Genotype & Phenotype %DNA identity 100% Functional measures 99.9% Single Nucleotide Polymorphisms

DNA 1: Today's story, logic & goals

Types of mutants Mutation, drift, selection

Binomial & exponential dx/dt = kx Association studies χ2 statistic Linked and causative alleles Haplotypes Computing the first genome,

the second ... New technologies Random and systematic errors

48