Linkage analysis: Two-factor testcross AaBb x aabb AaBb, Aabb, aaBb, aabb What are the implications...

Linkage analysis: Two-factor testcross AaBb x aabb AaBb, Aabb, aaBb, aabb What are the implications of phenotypes scored on these progeny?
  • date post

    20-Dec-2015

Page 1

Linkage analysis: Two-factor testcross

AaBb x aabb

AaBb, Aabb, aaBb, aabb

What are the implications of phenotypes scored on these progeny?

Page 2

Linkage analysis: Two-factor testcross

• Double heterozygotes are mated with homozygous recessives

• Genotypes of a large number of progeny are scored

• If loci A and B are on different chromosomes, alleles will follow Mendel’s law of Independent Assortment

• Genetically linked? Two of four genotypes more frequent than expected ($\chi^2$ test statistic)

Page 3

Linkage analysis: Interval mapping (Haley and Knott, 1992)

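Marker A, QTL Q, marker B (order A, Q, B)

$r_A$ = recombination fraction between A and Q; $r_B$ = recombination fraction between Q and B

$r_{AB} = r_A + r_B - 2 r_A r_B$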
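
As a quick numerical check of the relation above, the sketch below evaluates it for a few values. The function name is hypothetical; the formula is the one on the slide, which holds when crossovers in the two sub-intervals occur independently (no interference).

```python
def flanking_recombination(r_a, r_b):
    """Recombination fraction between flanking markers A and B.

    A recombinant between A and B arises from a crossover in exactly one
    of the two sub-intervals, hence r_a + r_b - 2 * r_a * r_b.
    """
    return r_a + r_b - 2 * r_a * r_b

r_ab_close = flanking_recombination(0.10, 0.10)
r_ab_unlinked = flanking_recombination(0.50, 0.20)
print(r_ab_close, r_ab_unlinked)
```

Note that the result never exceeds 0.5: once one sub-interval is unlinked, the flanking markers are unlinked as well.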

Page 4

Frequencies for F1 gametes and RI genotypes (Markel et al., 1996)

F1 gamete   Frequency     RI genotype   Frequency
A1B1        (1 - R')/2    A1A1B1B1      (1 - R)/2
A1B2        R'/2          A1A1B2B2      R/2
A2B1        R'/2          A2A2B1B1      R/2
A2B2        (1 - R')/2    A2A2B2B2      (1 - R)/2

Page 5

RI genotypic frequencies of two flanking markers and an intermediate QTL (Markel et al., 1996)

Genotype        Predicted frequency
A1A1Q1Q1B1B1    (1 - RA)(1 - RB)/2
A1A1Q2Q2B1B1    RA RB/2
A1A1Q1Q1B2B2    (1 - RA)RB/2
A1A1Q2Q2B2B2    RA(1 - RB)/2
A2A2Q1Q1B1B1    RA(1 - RB)/2
A2A2Q2Q2B1B1    (1 - RA)RB/2
A2A2Q1Q1B2B2    RA RB/2
A2A2Q2Q2B2B2    (1 - RA)(1 - RB)/2

Page 6

Expected additive effect coefficients of each pair of RI genotypes (Markel et al., 1996)

RI Genotypes Expected additive effect

A1A1B1B1 [(1 - RA - RB)/(1 - R)](a)

A1A1B2B2 [(RB - RA)/R](a)

A2A2B1B1 [(RA - RB)/R](a)

A2A2B2B2 [(RA + RB - 1)/(1 - R)](a)

Page 7

Coefficients (xi) of the additive effect of a QTL at five positions between two flanking markers of A and B that are 20 cM apart (Markel et al., 1996)

Position of QTL (cM)

Genotype 0 5 10 15 20

A1A1B1B1 1.00 0.84 0.79 0.84 1.00

A1A1B2B2 1.00 0.43 0.00 -0.43 -1.00

A2A2B1B1 -1.00 -0.43 0.00 0.43 1.00

A2A2B2B2 -1.00 -0.84 -0.79 -0.84 -1.00
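
The coefficients in this table can be recomputed from the expected additive effect expressions given for each RI genotype. The sketch below is an illustration under two assumptions that the slides do not state: map distance is converted to a recombination fraction with the Haldane map function, and the RI recombinant fraction follows the standard sib-mating relation R = 4r/(1 + 6r). The function names are hypothetical. Under these assumptions the tabulated values are recovered to within rounding, and by symmetry the A2A2B2B2 row is the negative of the A1A1B1B1 row.

```python
import math

def haldane(cm):
    """Haldane map function: map distance in cM -> recombination fraction r."""
    return 0.5 * (1.0 - math.exp(-2.0 * cm / 100.0))

def ri_fraction(cm):
    """Assumed RI (sib-mated) recombinant fraction, R = 4r / (1 + 6r)."""
    r = haldane(cm)
    return 4.0 * r / (1.0 + 6.0 * r)

def additive_coefficients(pos_cm, interval_cm=20.0):
    """Coefficients of the additive effect a for the four marker genotypes."""
    ra = ri_fraction(pos_cm)                # marker A to QTL
    rb = ri_fraction(interval_cm - pos_cm)  # QTL to marker B
    r = ri_fraction(interval_cm)            # marker A to marker B
    return {
        "A1A1B1B1": (1 - ra - rb) / (1 - r),
        "A1A1B2B2": (rb - ra) / r,
        "A2A2B1B1": (ra - rb) / r,
        "A2A2B2B2": (ra + rb - 1) / (1 - r),
    }

for pos in (0, 5, 10, 15, 20):
    row = additive_coefficients(pos)
    print(pos, {genotype: round(x, 2) for genotype, x in row.items()})
```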

Page 8

Maximum likelihood approach to QTL mapping (Lander and Botstein, 1988)

• Assuming complete map coverage, is it possible to design a cross to make it highly likely that QTLs will be found?

• Using flanking markers as opposed to single-marker analysis

• Reduce the number of markers individually tested and thus reduce type I error

Page 9

Traditional approach

• Compare the mean phenotypic value of progeny with genotype AB to those with marker genotype AA

• One-way analysis of variance, i.e., a linear regression; assumes normally distributed residual environmental variance

Page 10

Number of progeny required for detection (Soller and Brody, 1976)

• Assume that a QTL contributes $\sigma^2_{exp}$ to the genetic variance and is located exactly at a marker locus

• Number of progeny required: $(Z_\alpha)^2 (\sigma^2_{res} / \sigma^2_{exp})$

– $Z_\alpha$ is the number of standard deviations beyond which the normal curve contains probability $\alpha$

• Phenotypic effect may be underestimated if not at marker locus

• Greater number of progeny if not at the marker

• No definition of the likely position of the QTL

• Multiple testing
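
To make the progeny-number expression concrete, here is a small sketch. It reads the expression as $(Z_\alpha)^2 (\sigma^2_{res}/\sigma^2_{exp})$ with a one-tailed $Z_\alpha$; the function name and the example variances (a QTL explaining 5% of the variance) are illustrative assumptions, not values from the slides.

```python
from statistics import NormalDist

def progeny_required(alpha, var_res, var_exp):
    """Approximate number of progeny needed to detect a QTL at a marker."""
    # One-tailed reading: Z_alpha leaves probability alpha in the upper tail
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    return z_alpha ** 2 * (var_res / var_exp)

# Hypothetical QTL explaining 5% of the variance, tested at alpha = 0.05
n_needed = progeny_required(alpha=0.05, var_res=0.95, var_exp=0.05)
print(round(n_needed, 1))
```

A stricter threshold raises the requirement quickly, because the requirement scales with the square of $Z_\alpha$.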

Page 11

Interval mapping of QTLs using LOD scores: Method of maximum likelihood

• $\phi_i = a + b g_i + \varepsilon$

· $g_i$ is coded (0, 1) for the number of B alleles

· $\varepsilon$ is a random normal variable with mean 0 and variance $\sigma^2$

· b denotes the estimated phenotypic effect of a single allele substitution at a putative QTL

• $L(a, b, \sigma^2) = \prod_i z((\phi_i - (a + b g_i)), \sigma^2)$

• $\mathrm{LOD} = \log_{10}(L(a', b', \sigma^{2\prime}) / L(\mu_A', 0, \sigma_B^{2\prime}))$
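
The LOD score at a marker can be computed directly from two least-squares fits, because the maximised normal likelihood depends on the data only through the residual sum of squares. The sketch below is an illustration with hypothetical data and a made-up function name; it compares the fitted model with the constrained model in which b = 0.

```python
import math

def lod_single_marker(phenotypes, genotypes):
    """Return (a, b, LOD) for phi_i = a + b * g_i + eps versus b = 0."""
    n = len(phenotypes)
    mean_p = sum(phenotypes) / n
    mean_g = sum(genotypes) / n
    sxy = sum((g - mean_g) * (p - mean_p) for g, p in zip(genotypes, phenotypes))
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    b = sxy / sxx
    a = mean_p - b * mean_g
    rss_full = sum((p - (a + b * g)) ** 2 for g, p in zip(genotypes, phenotypes))
    rss_null = sum((p - mean_p) ** 2 for p in phenotypes)
    # With ML variance estimates rss / n, the likelihood ratio reduces to
    # (rss_null / rss_full) ** (n / 2)
    lod = (n / 2.0) * math.log10(rss_null / rss_full)
    return a, b, lod

# Hypothetical backcross progeny: genotype coded 0/1, phenotype in arbitrary units
genotypes = [0, 0, 0, 0, 1, 1, 1, 1]
phenotypes = [9.8, 10.1, 10.4, 9.7, 11.9, 12.2, 12.0, 11.7]
a_hat, b_hat, lod = lod_single_marker(phenotypes, genotypes)
print(a_hat, b_hat, lod)
```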

Page 12

Interval mapping of QTLs using LOD scores: Method of maximum likelihood

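• $\mathrm{ELOD} = \tfrac{1}{2}\log_{10}(1 + \sigma^2_{exp}/\sigma^2_{res})$ (a result from linear regression)

• $\approx \tfrac{1}{2}(\log_{10} e)(\sigma^2_{exp}/\sigma^2_{res})$ (Taylor expansion for small values of $\sigma^2_{exp}/\sigma^2_{res}$)

• $\approx 0.22\,(\sigma^2_{exp}/\sigma^2_{res})$

• $T/\mathrm{ELOD} \approx (Z_\alpha)^2 / (\sigma^2_{exp}/\sigma^2_{res})$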
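
The quality of the Taylor approximation is easy to inspect numerically. The following sketch, with hypothetical function names, compares the exact expression for ELOD with its first-order expansion and shows where the constant 0.22 comes from.

```python
import math

def elod_exact(ratio):
    """ELOD = 1/2 * log10(1 + ratio), where ratio = var_exp / var_res."""
    return 0.5 * math.log10(1.0 + ratio)

def elod_taylor(ratio):
    """First-order approximation: 1/2 * log10(e) * ratio."""
    return 0.5 * math.log10(math.e) * ratio

HALF_LOG10_E = 0.5 * math.log10(math.e)  # the constant quoted as 0.22

for ratio in (0.01, 0.05, 0.25):
    print(ratio, elod_exact(ratio), elod_taylor(ratio))
```

The approximation slightly overstates ELOD, and the gap widens as the variance ratio grows, which is why it is stated only for small ratios.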

Page 13

Interval mapping of QTLs using LOD scores(Lander and Botstein, 1988)

• $L(a, b, \sigma^2) = \prod_i [G_i(0) L_i(0) + G_i(1) L_i(1)]$

• $L_i(x) = z((\phi_i - (a + bx)), \sigma^2)$ denotes the likelihood function for individual i, assuming $g_i = x$

• $G_i(x)$ denotes the probability that $g_i = x$ conditional on the genotypes and positions of the flanking markers
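
The mixture likelihood above can be sketched as follows. This is an illustration, not the authors' implementation: it assumes a backcross with genotypes coded 0/1, a putative QTL lying between the two markers, no interference, and hypothetical function names and data.

```python
import math

def normal_pdf(x, var):
    """Density z(x, var) of a normal variable with mean 0 and variance var."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def qtl_genotype_prob(m_a, m_b, r_a, r_b):
    """G_i(1): P(g_i = 1 | flanking marker genotypes m_a, m_b)."""
    def joint(g):
        p_a = (1 - r_a) if m_a == g else r_a
        p_b = (1 - r_b) if m_b == g else r_b
        return 0.5 * p_a * p_b
    return joint(1) / (joint(0) + joint(1))

def log10_likelihood(data, a, b, var, r_a, r_b):
    """log10 of prod_i [G_i(0) L_i(0) + G_i(1) L_i(1)].

    data is a list of (phenotype, marker_a, marker_b) tuples.
    """
    total = 0.0
    for phi, m_a, m_b in data:
        g1 = qtl_genotype_prob(m_a, m_b, r_a, r_b)
        mix = (1 - g1) * normal_pdf(phi - a, var) + g1 * normal_pdf(phi - (a + b), var)
        total += math.log10(mix)
    return total

# Hypothetical progeny: (phenotype, genotype at marker A, genotype at marker B)
data = [(10.0, 0, 0), (12.0, 1, 1), (10.3, 0, 1), (11.8, 1, 1)]
with_qtl = log10_likelihood(data, a=10.0, b=2.0, var=1.0, r_a=0.1, r_b=0.1)
without_qtl = log10_likelihood(data, a=11.0, b=0.0, var=1.0, r_a=0.1, r_b=0.1)
print(with_qtl - without_qtl)
```

In practice the parameters would be maximised at each candidate position (for example with an EM algorithm) rather than fixed by hand as they are here.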

Page 14

Confirmation of EtOH sensitivity QTL in mouse (Markel et al., 1997)

Page 15

Genetic map of EtOH-sensitivity QTL (Lore1 - 6; Markel et al., 1997)

Page 16

Additive effect of confirmed QTL for alcohol sensitivity (Markel et al., 1997)

Page 17

Marker-assisted breeding of congenic mouse strains (Markel et al, 1997b)

• Yellow indicates the donor (D) genome

• Blue represents the recipient (R) genome

• Apoe is the target region of introgression

• The left side represents the traditional approach; the right side represents the “speed” congenic method

Page 18

Traditional congenic breeding strategy (Markel et al., 1997b)

Generation   Average % heterozygous (D/R) segments   SD     % recipient genome
F1           100.00                                         50.00
N2           50.00                                   7.07   75.00
N3           25.00                                   5.00   87.50
N4           12.50                                   3.54   93.75
N5           6.25                                    2.50   96.88
N6           3.13                                    1.76   98.44
N7           1.56                                    1.25   99.22
N8           0.78                                    0.88   99.61
N9           0.39                                    0.63   99.81
N10          0.20                                    0.44   99.90
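
The averages in this table follow a simple halving rule, which the sketch below reproduces. The function name is hypothetical, and the code covers only the expected values, not the SD column: each backcross halves the expected share of heterozygous (D/R) segments, and half of each heterozygous segment is still recipient genome.

```python
def traditional_backcross(generation):
    """Expected % heterozygous (D/R) segments and % recipient genome.

    generation = 1 for F1, 2 for N2, and so on.
    """
    het = 100.0 * 0.5 ** (generation - 1)
    recipient = 100.0 - het / 2.0
    return het, recipient

for generation in range(1, 11):
    label = "F1" if generation == 1 else "N%d" % generation
    het, recipient = traditional_backcross(generation)
    print(label, round(het, 2), round(recipient, 2))
```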

Page 19

Marker-assisted congenic breeding strategy (Markel et al., 1997)

Backcross generation   Average % D/R segments   SD     % D/R segments in 'best' male   % recipient genome of 'best' male
F1                     100                      0      100                             50
N2                     50.00                    7.07   38.32                           80.84
N3                     19.16                    4.38   11.93                           94.03
N4                     5.98                     2.44   1.95                            99.03
N5                     0.98                     0.98   ~0                              ~100

Page 20

Theoretical potential (Markel et al., 1997b)

Number of male carriers   Potential reduction in D/R (x)
5                         0.85
10                        1.29
15                        1.50
20                        1.65
30                        1.84
40                        1.96
50                        2.06
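
The slides do not say how these values were obtained. One reading that comes close to them, offered here only as an assumption, is that the potential reduction is expressed in standard deviation units and equals the standard normal quantile at 1 - 1/n, i.e., roughly where the best of n males is expected to fall. The sketch below compares that quantile with the tabulated numbers; the function name is hypothetical.

```python
from statistics import NormalDist

def upper_quantile(n_males):
    """Standard normal quantile at 1 - 1/n (assumed interpretation)."""
    return NormalDist().inv_cdf(1.0 - 1.0 / n_males)

# Values as tabulated on the slide
tabulated = {5: 0.85, 10: 1.29, 15: 1.50, 20: 1.65, 30: 1.84, 40: 1.96, 50: 2.06}

for n, listed in tabulated.items():
    print(n, listed, round(upper_quantile(n), 3))
```

Whatever the exact derivation, the pattern is one of diminishing returns: going from 5 to 20 carriers adds far more than going from 20 to 50.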

Page 21

Comparison of theoretical expectations and empirical data

Recipient strain at N5   Estimated % recipient genome for best male   Observed % recipient genome for best male
BALB/cByJ                99.52                                        99.11
C3H/HeJ                  99.27                                        99.41
C57BL/Ks                 99.66                                        99.70
CAST/Ei                  92.74 (N4)                                   95.54 (N4)
DBA/2J                   98.97                                        99.38
FVB/NJ                   99.38                                        99.73

Page 22

Lecture 4: Mapping in humans (1 of 2)

• Linkage analysis

• Relative-pair analysis

Page 23

Genetic mapping was uncommon in humans for most of the last century

• Lack of an abundant supply of markers

• Inability to arrange human crosses to suit experimental purposes

• Breakthrough with Botstein et al. (1980) for yeast

• Use naturally occurring DNA sequence variation in humans

• Led to mapping several hundred rare Mendelian diseases

Page 24

Human Genetic Revolution

• Human genetics has sparked a revolution in medical science

• Can find genes behind disease without knowing how they function

• Completely generic approach

Page 25

Last two decades ushered in complex traits

• Do not follow simple Mendelian monogenic inheritance

• Heart disease, hypertension, diabetes, cancer, and infection

Page 26

Defining disease

• Clinical phenotype

• Age at onset

• Family history

• Severity

Page 27

[Diagram: “Allele frequencies + Environment” and “Method/Technique + Time/Place”, bracketed against the labels The Population, The Sample, and The Metric. Listed measures: prevalence, risk, heritability, age of onset, family history, severity, etc.]

Page 28

Linkage Analysis: Overview

• Simple Mendelian traits offer a small number of hypotheses for the geneticist to test.

• Thus, the geneticist speculates based on Mendelian rules what the most appropriate model is to explain the pattern of relationship between observed phenotype and genotype.

Page 29

Linkage analysis: Hypothesis

• For simple Mendelian traits, Mendelian rules of gametic transmission can adequately explain the pattern of phenotypes in a multigenerational family

• Linkage is supported when M1, a specified model that suggests a specific location for a trait-causing gene, is much more likely to have produced the observed data than M0, a model that suggests no linkage to a trait-causing gene in the region

Page 30

Linkage analysis: Hypothesis

• The evidence for M1 versus M0 is measured by the likelihood ratio

LR = Prob(Data|M1)/Prob (Data|M0)

• This is also presented as Z, the lod score

Z = log10(LR)

• (see 49, 50; Morton (1955))

Page 31

[Pedigree figure: autosomal dominant trait. Parents (1, 2): T/t, M1/M2 and t/t, M2/M2. Six children (1 to 6) with genotypes listed as t/t M1/M2, T/t M2/M2, T/t M2/M2, T/t M1/M2, T/t M1/M2, t/t M1/M2.]

Page 32

Basic calculations in human linkage analysis

• Assign linkage phase

• Calculate conditional probabilities

• Observe the number of each class of paternal gametes in progeny

• Probability of observed family given a model [$L(\theta)$]

• Probability assuming independent assortment [$L(0.5)$]

• Calculate likelihood ratio: $LR = L(\theta)/L(0.5)$

Page 33

Assign linkage phase

• Equivalent to experimental two-factor testcross

• Linkage phase

– Different sets of alleles on each member within a pair of homologous chromosomes (i.e., haplotype)

– AB/ab is in coupling; Ab/aB is in repulsion

– Marker alleles are codominant, so phase is arbitrary; coupling is TM1/tM2 and repulsion is tM1/TM2

Page 34

Conditional probabilities

Gamete frequencies

Phase       TM1              TM2              tM1              tM2
Coupling    $(1-\theta)/2$   $\theta/2$       $\theta/2$       $(1-\theta)/2$
Repulsion   $\theta/2$       $(1-\theta)/2$   $(1-\theta)/2$   $\theta/2$
Count       n1               n2               n3               n4

Page 35

Observe paternal gametes

• n1 = TM1, n2 = TM2, n3 = tM1, and n4 = tM2 gametes

• Six children in the present example

– n1 = 1

– n2 = 2

– n3 = 3

– n4 = 0

Page 36

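Probability $L(\theta)$

• Each offspring is an independent event, so that:

$L(\theta) = P(\text{coupling})\,L(\theta \mid \text{coupling}) + P(\text{repulsion})\,L(\theta \mid \text{repulsion})$

$= 0.5[0.5^n (1-\theta)^{n_1+n_4}\,\theta^{n_2+n_3}] + 0.5[0.5^n (1-\theta)^{n_2+n_3}\,\theta^{n_1+n_4}]$

$= 0.5^{n+1}[(1-\theta)^{n_1+n_4}\,\theta^{n_2+n_3} + (1-\theta)^{n_2+n_3}\,\theta^{n_1+n_4}]$

• The geneticist provides a reasonable value for $\theta$; in this case, what is a reasonable value for $\theta$?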

Page 37

Probability $L(0.167)$

• $L(0.167) = (0.5)^7[(0.833)^1(0.167)^5 + (0.833)^5(0.167)^1] = 0.000524$

Page 38

L(0.5)

• $L(0.5) = 0.25^n$, where n is the number of progeny

• $L(0.5) = (0.25)^6 = 0.000244$

Page 39

LR and Z

• $LR = L(\theta)/L(0.5) = 0.00052/0.00024 = 2.147$

• $Z = \log_{10} LR = 0.332$

• Try different values of $\theta$

• If recombinants (r) can be counted directly, then the maximum likelihood estimate (MLE) of $\theta$ = r/n
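
The whole phase-unknown calculation for the six-child example (n1 = 1, n2 = 2, n3 = 3, n4 = 0) fits in a short sketch. The function names are hypothetical; the formula is the phase-averaged likelihood, in which coupling and repulsion each have prior probability 0.5.

```python
import math

def likelihood_phase_unknown(theta, n1, n2, n3, n4):
    """L(theta) averaged over coupling and repulsion, each with prior 0.5."""
    n = n1 + n2 + n3 + n4
    coupling = (1 - theta) ** (n1 + n4) * theta ** (n2 + n3)
    repulsion = (1 - theta) ** (n2 + n3) * theta ** (n1 + n4)
    return 0.5 ** (n + 1) * (coupling + repulsion)

def lod(theta, n1, n2, n3, n4):
    """Z = log10 of L(theta) / L(0.5)."""
    return math.log10(
        likelihood_phase_unknown(theta, n1, n2, n3, n4)
        / likelihood_phase_unknown(0.5, n1, n2, n3, n4)
    )

counts = (1, 2, 3, 0)  # n1..n4 from the example family
l_theta = likelihood_phase_unknown(0.167, *counts)
l_null = likelihood_phase_unknown(0.5, *counts)
z = lod(0.167, *counts)

# "Try different values of theta": scan a grid and keep the best one
best_theta = max((t / 100 for t in range(1, 51)), key=lambda t: lod(t, *counts))
print(l_theta, l_null, z, best_theta)
```

Because lod scores are logarithms of likelihood ratios, scores from independent families evaluated at the same $\theta$ can be added.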

Page 40

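[Pedigree figure: the same family with grandparents added. Grandparents (1, 2): t/t, M1/M2 and T/t, M2/M2. Parents (1, 2): T/t, M1/M2 and t/t, M2/M2. Six children (1 to 6) with genotypes listed as t/t M1/M2, T/t M2/M2, T/t M2/M2, T/t M1/M2, T/t M1/M2, t/t M1/M2.]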

Page 41

Father’s genotype is in repulsion

• Assume father’s alleles are in repulsion (TM2/tM1)

– $L(\theta) = 0.5^n (1-\theta)^{n_2+n_3}\,\theta^{n_1+n_4}$

– $L(0.167) = (0.5)^6(0.833)^5(0.167) = 0.001046$

• Multiple generations are thus valuable

– Nearly twice the earlier value

– Z improves by 0.3, underscoring the value of multi-generation pedigrees

• How about two families of 6 children versus one family of 12?
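
The gain from knowing the phase can be checked numerically. The sketch below, with hypothetical function names, evaluates the repulsion-phase likelihood for the same six children (n1 = 1, n2 = 2, n3 = 3, n4 = 0) and compares the resulting lod score with the phase-unknown value of 0.332.

```python
import math

def likelihood_repulsion(theta, n1, n2, n3, n4):
    """L(theta) when the father is known to be in repulsion (TM2/tM1)."""
    n = n1 + n2 + n3 + n4
    return 0.5 ** n * (1 - theta) ** (n2 + n3) * theta ** (n1 + n4)

counts = (1, 2, 3, 0)
l_known = likelihood_repulsion(0.167, *counts)
l_null = 0.25 ** sum(counts)           # independent assortment, theta = 0.5
z_known = math.log10(l_known / l_null)

Z_PHASE_UNKNOWN = 0.332                # value obtained without grandparents
print(l_known, z_known, z_known - Z_PHASE_UNKNOWN)
```

The likelihood is not exactly doubled because the phase-unknown likelihood also contains a small contribution from the coupling phase.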

Page 42

Linkage analysis: Autosomal recessive trait

• More complicated analysis; more families are required to demonstrate linkage between a marker locus and an autosomal recessive trait compared to autosomal dominant

• Normal children can be Tt or TT; thus, they alone cannot be used to deduce the linkage phase of a doubly heterozygous parent

• Families with just one affected are not informative, even when several normal children are available

• $LR(\theta) = 0.5[(1-\theta)^1(\theta)^0 + (\theta)^1(1-\theta)^0]$

$= 0.5[(1-\theta) + \theta]$

$= 0.5$

Page 43

Allele frequency estimation

• Allelic heterogeneity

• Critical; rare versus common allele

Page 44

Allele-sharing studies

• Penrose (1935)

• Haseman and Elston (1972)

• Carey and Williamson (1993)

• Fulker and Cardon (1994)

• Lander et al. (1995)

Page 45

Allele-sharing: Haseman and Elston (1972)

• Can genetic variance be assigned to a locus?

• Twin studies

– Partition genetic variance

– Do not address the contribution of individual loci

• Sib-pairs

– Address secular and age effects

– Include information about parents

Page 46

Allele-sharing: Haseman and Elston (1972)

• $X_{ij} = \mu + g_{ij} + e_{ij}$

• $g_{ij}$ = genotypic value; $e_{ij}$ = environmental deviation

• Assume random mating and linkage equilibrium

• $Y_j = (\text{sib-pair difference})^2$

• Estimate Y based on the best estimate of the number of alleles the sibs share identical by descent (IBD)

Page 47

Allele-sharing: Haseman and Elston (1972)

• Let $\pi_j$ = proportion of genes shared IBD and $Y_j = (x_{1j} - x_{2j})^2$ for sib pair j

• Develop the expectation of Y if $\pi$ is known precisely at the disease locus

• Estimate $\pi$ ($\pi'$) given the genotypes of the parents (sometimes) and children for the marker locus

• Predict Y based on $\pi'$

Page 48

Development of the model

• $E(Y_j \mid \pi_j)$

• $E(\pi' \mid I_m)$

– $\pi'$ = estimate of $\pi$

– $I_m$ = information about parent and sib genotypes

• $E(Y \mid \pi')$

Page 49

$E(Y_j \mid \pi_j)$

• For sib pair BB-Bb

• $x_{1j} = \mu + a + e_{1j}$

• $x_{2j} = \mu + d + e_{2j}$

• $Y_j = (a + e_{1j} - d - e_{2j})^2 = (a - d + e_j)^2$

Page 50

$E(Y_j \mid \pi_j)$

$\pi_j$   Genotype pair   Probability
0         BB - BB         $p^2(p^2) = p^4$
1/2       BB - BB         $p^2(p) = p^3$
1         BB - BB         $p^2(1) = p^2$

Page 51

$E(Y_j \mid \pi_j)$

Expectation                   Variance components
$E(Y_j \mid \pi_j = 1)$       $\sigma^2_e$
$E(Y_j \mid \pi_j = 1/2)$     $\sigma^2_e + \sigma^2_a + 2\sigma^2_d$
$E(Y_j \mid \pi_j = 0)$       $\sigma^2_e + 2\sigma^2_a + 2\sigma^2_d$

Page 52

[Plot: $E(Y_j \mid \pi_j)$ on the vertical axis ($Y_j$) against $\pi_j$ = 0, 1/2, 1 on the horizontal axis]

Page 53

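$E(Y_j \mid \pi_j)$

• Expectation for $Y_j$ varies with the proportion $\pi_j$

• $E(Y_j \mid \pi_j) = \alpha + \beta \pi_j$

$\alpha = \sigma^2_e + 2\sigma^2_g$

$\beta = -2\sigma^2_g$

$\pi_j = 0, 1/2, 1$

• Note: $\sigma^2_d$ vanishes with large n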
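
The link between the variance components and the regression line can be verified with a short calculation. The sketch below is illustrative only: the function names and variance values are hypothetical, and the dominance variance is set to zero so that the three expectations fall exactly on a line with intercept $\sigma^2_e + 2\sigma^2_g$ and slope $-2\sigma^2_g$.

```python
def expected_sq_diff(pi, var_e, var_a, var_d=0.0):
    """E(Y_j | pi_j) from the variance components, for pi in {0, 0.5, 1}."""
    additive_multiplier = {0.0: 2.0, 0.5: 1.0, 1.0: 0.0}[pi]
    dominance_term = 0.0 if pi == 1.0 else 2.0 * var_d
    return var_e + additive_multiplier * var_a + dominance_term

def least_squares(xs, ys):
    """Ordinary least-squares intercept and slope."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - slope * mean_x, slope

var_e, var_g = 1.0, 0.5            # hypothetical variance components
pis = [0.0, 0.5, 1.0]
ys = [expected_sq_diff(pi, var_e, var_g) for pi in pis]
alpha, beta = least_squares(pis, ys)
print(alpha, beta)
```

A slope significantly below zero is therefore the signature of genetic variance at the locus.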

Page 54

$E(\pi' \mid I_m)$

• Estimate $\pi$ based on sib-pair and parental genotypes for a marker locus

• $f_{ji}$ is the probability that the jth sib pair has i genes identical by descent

• $I_m$ is the information on sib-pair and parental genotypes

Page 55

• Our best estimate of $\pi_j$ (strongest correlation) is given as

$\pi'_j = f_{j2} + \tfrac{1}{2} f_{j1}$

$\pi'_j$ is the Bayes estimate of $\pi_j$ when a squared error loss function is used

• Maximum possible correlation with $\pi_j$ when $\pi_j$ is a random variable taking on values of 0, 1/2, and 1 (Haseman, 1970).
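
The estimate is simply the expected value of the proportion shared IBD under the probabilities $f_{j0}$, $f_{j1}$, $f_{j2}$. A minimal sketch using exact fractions follows; the function name is hypothetical, and the three example inputs are the uninformative case (1/4, 1/2, 1/4) plus two partially informative cases.

```python
from fractions import Fraction

def pi_estimate(f0, f1, f2):
    """pi'_j = f_j2 + 1/2 * f_j1, the expected proportion shared IBD."""
    assert f0 + f1 + f2 == 1, "IBD probabilities must sum to 1"
    return f2 + f1 / 2

uninformative = pi_estimate(Fraction(1, 4), Fraction(1, 2), Fraction(1, 4))
concordant = pi_estimate(Fraction(0), Fraction(1, 2), Fraction(1, 2))
discordant = pi_estimate(Fraction(1, 2), Fraction(1, 2), Fraction(0))
print(uninformative, concordant, discordant)
```

When the marker carries no information, the estimate falls back to the prior expectation of 1/2.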

Page 56

$E(\pi' \mid I_m)$

Type Probability

7 parental mating types p(b)

34 offspring types p(a|b)

Joint probability p(ab)

Page 57

$E(\pi' \mid I_m)$

Mating type   Sib pair type   p(ab)              fj0   fj1   fj2   $\pi'_j$
AiAi x AiAi   AiAi - AiAi     $p_i^4$            1/4   1/2   1/4   1/2
AiAi x AjAj   AiAj - AiAj     $2p_i^2 p_j^2$     1/4   1/2   1/4   1/2
AiAi x AiAj   AiAi - AiAi     $p_i^3 p_j$        0     1/2   1/2   3/4
AiAi x AiAj   AiAi - AiAj     $2p_i^3 p_j$       1/2   1/2   0     1/4
AiAi x AiAj   AiAj - AiAj     $p_i^3 p_j$        0     1/2   1/2   3/4

Page 58

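$f_{ji} = \dfrac{\sum_{v \in P_p} \sum_{w \in P_s} P\{v \text{ and } w \text{ and } \pi_j = i/2\}}{\sum_{h=0}^{2} \sum_{v \in P_p} \sum_{w \in P_s} P\{v \text{ and } w \text{ and } \pi_j = h/2\}}$, for i = 0, 1, 2

Numerator: joint probability of observing $I_m$ and that $\pi_j$ should equal i/2

Denominator: sum of the three joint probabilities, i = 0, 1, 2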

Page 59

$E(Y_j \mid \pi'_j)$

• Assume a two-allele marker locus...

• No dominance...

• And complete parental information

Page 60

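$E(Y \mid \pi')$

• Given complete $I_m$

• $E(Y_j \mid \pi'_j) = \alpha + \beta \pi'_j$

$\beta = -2(1-2c)^2 \sigma^2_g$

• $(1-2c)^2$ = correlation between $\pi_{jm}$ and $\pi_{jt}$, i.e., between the proportion of marker genes i.b.d. and the proportion of QTL genes i.b.d.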

Page 61

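$E(Y_j \mid \pi'_{jm}) = \sum_{\pi_{jm}} \sum_{\pi_{jt}} E(Y \mid \pi_{jt})\, P\{\pi_{jt} \mid \pi_{jm}\}\, P\{\pi_{jm} \mid \pi'_{jm}\}$

$P\{\pi_{jt} \mid \pi_{jm}\}$: from the joint distribution of $\pi_{jt}$ and $\pi_{jm}$

$P\{\pi_{jm} \mid \pi'_{jm}\}$: from the joint distribution of $\pi'_{jm}$ and $\pi_{jm}$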

Page 62

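$E(Y_j \mid \pi'_{jm}) = [\sigma^2_e + 2(1 - 2c + 2c^2)\sigma^2_g] - 2(1 - 2c)^2 \sigma^2_g\, \pi'_{jm}$

$\alpha = \sigma^2_e + 2(1 - 2c + 2c^2)\sigma^2_g$

$\beta = -2(1 - 2c)^2 \sigma^2_g$

If c = 1/2, then $\beta = 0$

If c = 0, then $\beta = -2\sigma^2_g$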
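
How quickly the regression slope fades with distance from the trait locus can be seen by evaluating the two coefficients over a range of recombination fractions c. The sketch below uses the slope $-2(1-2c)^2\sigma^2_g$ given for the case of complete marker information; the function name and variance values are hypothetical.

```python
def haseman_elston_terms(c, var_e, var_g):
    """Intercept and slope of E(Y_j | pi'_jm) at recombination fraction c."""
    psi = 1.0 - 2.0 * c + 2.0 * c * c        # equals c^2 + (1 - c)^2
    alpha = var_e + 2.0 * psi * var_g
    beta = -2.0 * (1.0 - 2.0 * c) ** 2 * var_g
    return alpha, beta

var_e, var_g = 1.0, 0.5                      # hypothetical variance components
for c in (0.0, 0.05, 0.1, 0.25, 0.5):
    alpha, beta = haseman_elston_terms(c, var_e, var_g)
    print(c, round(alpha, 4), round(beta, 4))
```

At c = 0.1 the slope is already reduced to 64% of its value at the trait locus, which is why dense marker maps matter for this method.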

Page 63

$P\{\pi_{jm} = \pi_{jt} = 1\}$

Parental mating: A1B1/A2B2 x A3B3/A4B4 (A = marker, B = trait)

Gamete   Frequency     Gamete   Frequency
A1B1     (1 - c)/2     A3B3     (1 - c)/2
A2B2     (1 - c)/2     A4B4     (1 - c)/2
A1B2     c/2           A3B4     c/2
A2B1     c/2           A4B3     c/2

Page 64

[Figure: parental mating A1B1/A2B2 x A3B3/A4B4, with Sib 1 = A1B1/A3B3 and Sib 2 = A1B1/A3B3]

Page 65

Sib 1: A1B1/A3B3, with probability $[(1 - c)/2]^2$

Sib 2: A1B1/A3B3, with probability $[(1 - c)/2]^2$

Page 66

$[(1 - c)/2]^2 [(1 - c)/2]^2 = (1 - c)^4 / 16$

$P\{\pi_{jm} = \pi_{jt} = 1\} = 4(c^4/16) + 8[c^2(1 - c)^2/16] + 4[(1 - c)^4/16]$

$= [c^2 + (1 - c)^2]^2 / 4 = \Psi^2 / 4$, where

$\Psi = c^2 + (1 - c)^2$
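
The closed form can be checked by enumeration. For the sibs to share both alleles IBD at the marker and at the trait locus, they must receive the same gamete type from each parent, so the probability is the squared sum of squared gamete frequencies. The sketch below, with hypothetical function names, compares that enumeration with $\Psi^2/4$.

```python
def gamete_frequencies(c):
    """Gamete frequencies for one doubly heterozygous parent."""
    return {
        "parental_1": (1 - c) / 2,
        "parental_2": (1 - c) / 2,
        "recombinant_1": c / 2,
        "recombinant_2": c / 2,
    }

def prob_ibd2_at_both_loci(c):
    """P{pi_jm = pi_jt = 1} by enumeration over gametes of both parents."""
    # Probability that two sibs receive the same gamete type from one parent
    same_from_one_parent = sum(f * f for f in gamete_frequencies(c).values())
    # The two parents transmit independently
    return same_from_one_parent ** 2

def closed_form(c):
    psi = c ** 2 + (1 - c) ** 2
    return psi ** 2 / 4

for c in (0.0, 0.1, 0.25, 0.5):
    print(c, prob_ibd2_at_both_loci(c), closed_form(c))
```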

Page 67

Contemporary sib-pair analysis (Kruglyak and Lander, 1995)

• Multipoint linkage analysis

– Full inheritance information

– Maximum likelihood estimates

• Qualitative traits

• Quantitative traits

Page 68

Sib-pair analysis advantages

• Sib pairs are relatively easy to ascertain

• Closely matched, control for secular effects

• No assumptions about inheritance

• No assumptions about:

– Penetrance

– Phenocopy

– Disease allele frequency

Page 69

Sib-pair analysis: Basic model

• Determine whether a sib pair shares 0, 1, or 2 alleles identical by descent (IBD)

• Affected sibs should share alleles IBD more often than expected under random Mendelian segregation (qualitative trait)

• Sib-pairs should show a correlation between magnitude of phenotypic difference and number of alleles shared IBD (quantitative trait)

Page 70

Sib-pair analysis: Qualitative traits

• Estimated proportions of IBD sharing: $(z_0, z_1, z_2)$

• Mendelian expectation: $(\alpha_0, \alpha_1, \alpha_2) = (1/4, 1/2, 1/4)$

• According to Holmans (1993):

– $z_0 + z_1 + z_2 = 1$; $z_1 \le 1/2$; $z_1 \ge 2z_0$

– If there is no dominance variance: $z_1 = 1/2$

Page 71

Sib-pair analysis and relative risk (Risch, 1990)

• If only a single locus is involved...

• Relative-risk ratio for a sib (prevalence in siblings of affecteds divided by population prevalence)

$\lambda_S$ = relative risk ratio for sibling

$\lambda_O$ = relative risk ratio for offspring

$\lambda_M$ = relative risk ratio for monozygotic twin

Page 72

Sib-pair analysis and relative risk (Risch, 1990)

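• $z_0 = \alpha_0 / \lambda_S$

• $z_1 = \alpha_1 \lambda_O / \lambda_S$

• $z_2 = \alpha_2 \lambda_M / \lambda_S$

• In the absence of dominance variance, $\lambda_O = \lambda_S$ and $\lambda_M - 1 = 2(\lambda_S - 1)$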
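
These relations fix the expected IBD sharing of affected sib pairs once the sibling risk ratio is known. The sketch below reads the expressions as $z_0 = \alpha_0/\lambda_S$, $z_1 = \alpha_1\lambda_O/\lambda_S$, $z_2 = \alpha_2\lambda_M/\lambda_S$ with Mendelian proportions (1/4, 1/2, 1/4), and assumes a single locus with no dominance variance. The function name and the value $\lambda_S = 2$ are illustrative.

```python
def affected_sib_pair_sharing(lambda_s):
    """Expected (z0, z1, z2) for affected sib pairs, no dominance variance."""
    lambda_o = lambda_s                        # offspring ratio equals sib ratio
    lambda_m = 2.0 * (lambda_s - 1.0) + 1.0    # from lambda_M - 1 = 2(lambda_S - 1)
    z0 = 0.25 / lambda_s
    z1 = 0.5 * lambda_o / lambda_s
    z2 = 0.25 * lambda_m / lambda_s
    return z0, z1, z2

z0, z1, z2 = affected_sib_pair_sharing(2.0)
within_triangle = (abs(z0 + z1 + z2 - 1.0) < 1e-12) and z1 <= 0.5 and z1 >= 2 * z0
print(z0, z1, z2, within_triangle)
```

With $\lambda_S = 1$ the function returns the Mendelian proportions, and larger $\lambda_S$ shifts probability from sharing 0 alleles to sharing 2 while $z_1$ stays at 1/2.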

Page 73

IBD distribution (adapted from Kruglyak and Lander, 1995)

[Figure: marker alleles of Sibling 1 and Sibling 2 along a map from 0 to 100 cM, with p(IBD) = 2, 1, 0 plotted beneath on a scale up to 1.00]

Page 74

Quantitative trait sib-pair analysis

• Let $\phi_{1i}$, $\phi_{2i}$ denote the phenotypes of two siblings

• $D_i = \phi_{1i} - \phi_{2i}$

• $v_i$ represents the number of alleles shared IBD

• At the QTL, the variance of D depends on v

• So that $\sigma^2_0 > \sigma^2_1 > \sigma^2_2$, where $\sigma^2_j$ is the variance of the difference D when j alleles are shared

• How do we test this hypothesis?

Page 75

Quantitative traits with complete information: Haseman-Elston

• $E(D_i^2 \mid v_i) = \alpha - \beta v_i$; $\beta = \sigma^2_g$ (additive genetic variance)

• Linear regression assures an ML estimate only if the noise process is normally distributed and uncorrelated with the dependent variable

• The squared difference $D^2$ does not necessarily follow these assumptions

• The standard error and distribution of the test statistic are based on normal, uncorrelated error; thus, a t-test derived by dividing $\beta$ by its standard error is inappropriate

Page 76

Quantitative traits with complete information: ML QTL variance estimation

• Derive direct estimates of $\sigma^2_j$ based on D for each value of v

• Assume the simple constraint

$\sigma^2_0 \ge \sigma^2_1 \ge \sigma^2_2$

• No dominance variance

$\sigma^2_1 = (\sigma^2_0 + \sigma^2_2) / 2$

• How to deal with incomplete data?

Page 77

Quantitative traits with complete information: Nonparametric QTL analysis

• Make no assumptions about the phenotypic distribution; Wilcoxon rank-sum test

• Rank sib pairs according to absolute D; let rank(i) be the rank of the ith sib pair and s a location in the genome

$X_W(s) = \sum_{i=1}^{n} \mathrm{rank}(i)\, f(v_i)$

Page 78

Quantitative traits with complete information: Nonparametric QTL analysis

• For f(v)

• Under no linkage, $X_W(s)$ has expectation 0 and variance $V = [n(n+1)(2n+1)]/12$

• Ratio $Z(s) = X_W(s) / V^{1/2}$

• Z(s) is asymptotically distributed as

– Standard normal

– Ornstein-Uhlenbeck diffusion process
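
A small sketch of the rank statistic follows. The slides leave f(v) unspecified; the code assumes f(v) = v - 1 (that is, -1, 0, +1 for 0, 1, 2 alleles shared), a choice that has mean 0 and variance 1/2 under Mendelian sharing and so is consistent with the variance V quoted above. The function name and the four sib pairs are hypothetical, and ties in |D| are not handled.

```python
import math

def wilcoxon_qtl_statistic(differences, shared):
    """Return (X_W, V, Z) at one genome location.

    differences: phenotype differences D_i for each sib pair
    shared:      number of alleles shared IBD (0, 1, or 2) for each pair
    """
    n = len(differences)
    # Rank pairs by absolute difference, smallest difference gets rank 1
    order = sorted(range(n), key=lambda i: abs(differences[i]))
    ranks = [0] * n
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    # Assumed weighting: f(v) = v - 1
    x_w = sum(ranks[i] * (shared[i] - 1) for i in range(n))
    variance = n * (n + 1) * (2 * n + 1) / 12.0
    return x_w, variance, x_w / math.sqrt(variance)

# Hypothetical data: small differences go with 2 alleles shared
x_w, variance, z = wilcoxon_qtl_statistic([0.1, -2.0, 0.5, 1.5], [2, 0, 2, 0])
print(x_w, variance, z)
```

With this sign convention, a strongly negative Z(s) means that pairs sharing more alleles tend to have smaller phenotypic differences, the pattern expected near a QTL.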

Page 79

Lecture 5a: Mapping in humans (2 of 2)

• Linkage disequilibrium

• Allele frequency estimation

• Association analysis

Page 80

Linkage equilibrium and disequilibrium

• The linkage analyses so far discussed assume linkage equilibrium

• All possible combinations of alleles on a single chromosome (all possible haplotypes or all possible gamete genotypes) occur as frequently as would be predicted from the random association of individual allele frequencies

Page 81

For example, assume that: A = 0.2, a = 0.8, M = 0.6, m = 0.4

Haplotype   Expected frequency
AM          0.2 x 0.6 = 0.12
Am          0.2 x 0.4 = 0.08
aM          0.8 x 0.6 = 0.48
am          0.8 x 0.4 = 0.32
Total       1.00

Page 82

Disequilibrium = D = observed frequency - expected frequency

Haplotype   Observed (O)   O - E = D
AM          .04            .04 - .12 = -0.08
Am          .16            .16 - .08 = +0.08
aM          .56            .56 - .48 = +0.08
am          .24            .24 - .32 = -0.08
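
The disequilibrium values can be recomputed from the allele frequencies (A = 0.2, a = 0.8, M = 0.6, m = 0.4) and the observed haplotype frequencies. This is a minimal sketch with a hypothetical function name; the expected frequency of each haplotype is the product of its two allele frequencies.

```python
def disequilibrium(observed, allele_freqs):
    """D = observed - expected haplotype frequency, for each haplotype."""
    result = {}
    for haplotype, obs in observed.items():
        first, second = haplotype[0], haplotype[1]
        expected = allele_freqs[first] * allele_freqs[second]
        result[haplotype] = obs - expected
    return result

allele_freqs = {"A": 0.2, "a": 0.8, "M": 0.6, "m": 0.4}
observed = {"AM": 0.04, "Am": 0.16, "aM": 0.56, "am": 0.24}
d_values = disequilibrium(observed, allele_freqs)

for haplotype, d in d_values.items():
    print(haplotype, round(d, 2))
```

The four values have the same magnitude with alternating signs, as they must for two biallelic loci, because the haplotype frequencies are constrained by the allele frequencies.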

Page 83

Comments on linkage disequilibrium

• Dmax is determined by setting one of the haplotypes involving the least common allele at a frequency of zero

– Dmax = 0.12, if frequency of AM were zero

– Absolute Dmax is 0.25 for any two-locus system (reached when the frequency of each of the four alleles is 0.5)

• Effect on linkage analysis

– If no assumptions about any genotype, D is not relevant

– Guess about one or more individual’s genotype, total lod score is less accurate

Page 84

Linkage disequilibrium between marker and trait loci

• Most cases of trait are due to relatively few distinct ancestral mutations at trait-causing locus

• Allele A was present on an ancestral chromosome and lies close enough to the trait-causing locus that linkage has not been thoroughly “shuffled” in the population’s history

• Young mutation in an isolated population

Page 85

Association Studies

• Disregard familial patterns of inheritance

• Case-control studies

• Allele A is associated with a trait if it is significantly more frequent among affecteds as compared to unrelated controls

• 2 x 2 contingency $\chi^2$ test
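
A case-control comparison of this kind reduces to a single formula for a 2 x 2 table. The sketch below is an illustration with hypothetical counts and a made-up function name; it uses the standard shortcut for the Pearson chi-square statistic without continuity correction, which has 1 degree of freedom.

```python
def chi_square_2x2(case_with, case_without, control_with, control_without):
    """Pearson chi-square for a 2 x 2 table (no continuity correction)."""
    a, b, c, d = case_with, case_without, control_with, control_without
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical counts: allele A carriers and non-carriers among cases and controls
chi2 = chi_square_2x2(case_with=30, case_without=70,
                      control_with=15, control_without=85)

CRITICAL_1DF_05 = 3.841  # chi-square critical value, 1 degree of freedom, alpha = 0.05
associated = chi2 > CRITICAL_1DF_05
print(round(chi2, 3), associated)
```

A significant statistic shows only that the allele frequencies differ between the two groups; it does not by itself say why they differ.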

Page 86

Association studies

• Choice of control group is a major issue

– Not an issue in linkage or allele-sharing methods; why?

• Association studies are most meaningful when they involve alleles with direct biological relevance

Page 87

Association studies and complex traits

• HLA complex (chrom. 6) implicated in etiology of autoimmune diseases

• HLA-B27 allele

– Occurs in 90% of patients with ankylosing spondylitis

– Only 9% of the general population

• Type I diabetes, rheumatoid arthritis, multiple sclerosis, systemic lupus, late-onset Alzheimer’s disease

Page 88

Three competing hypotheses (Hn) for positive associations

• H1: Allele is actually a cause of the disease

• H2: Allele is in linkage disequilibrium with the actual cause (syntenic with trait-causing allele)

• Recall that for D

– Most cases of trait are due to relatively few distinct ancestral mutations at trait-causing locus

– Allele A was present on one of these ancestral chromosomes and lies close enough to trait-causing locus such that linkage has not been thoroughly “shuffled” in the population’s history

– Young mutation in an isolated population

Page 89

Three competing hypotheses (Hn) for positive associations

• H3: Artifact of population admixture

• A trait present at a higher frequency in an ethnic group will be positively associated with any allele that happens to be more common in that group

• For example (Lander and Schork, 1994)

– Eating with chopsticks in San Francisco

– HLA-A1 allele (more common among Asians than Caucasians)