1 QTL mapping in mice Lecture 10, Statistics 246 February 24, 2004.

1

QTL mapping in mice

Lecture 10, Statistics 246

February 24, 2004

2

The mouse as a model

Same genes?The genes involved in a phenotype in the mouse may also be

involved in similar phenotypes in the human.

Similar complexity?The complexity of the etiology underlying a mouse phenotype

provides some indication of the complexity of similar human phenotypes.

Transfer of statistical methods.The statistical methods developed for gene mapping in the

mouse serve as a basis for similar methods applicable in direct human studies.

3

Backcross experiment

4

F2 intercross experiment

5

F2 intercross: another view

6

Quantitative traits (phenotypes)133 females from our earlier (NOD B6) (NOD B6) cross

Trait 4 is the log count of a particular white blood cell type.

7

Another representation of a trait distribution

Note the equivalent of dominance in our trait distributions.

8

A second example

Note the approximate additivity in our trait distributions here.

9

Trait distributions:a classical view

In general we seek a difference in the phenotype distributions of the parental strains before we think seeking genes associated with a trait is worthwhile. But even if there is little difference, there may be many such genes. Our trait 4 is a case like this.

10

Data and goals

Data

Phenotypes: yi = trait value for mouse i

Genotype: xij = 1/0 of mouse i is A/H at marker j (backcross); need two dummy variables for intercross

Genetic map: Locations of markers

Goals •Identify the (or at least one) genomic region, called quantitative

trait locus = QTL, that contributes to variation in the trait•Form confidence intervals for the QTL location •Estimate QTL effects

11

Genetic map from our NOD B6 intercross

12

Genotype data

13

Models: Recombination

We assume no chromatid or crossover interference.

points of exchange (crossovers) along chromosomes are distributed as a Poisson process, rate 1 in genetic distancce

the marker genotypes {xij} form a Markov chain along the chromosome for a backcross; what do they form in an F2 intercross?

14

Models: GenotypePhenotype

Let y = phenotype, g = whole genome genotype

Imagine a small number of QTL with genotypes g1,…., gp (2p or 3p distinct genotypes for BC, IC resp).

We assume

E(y|g) = (g1,…gp ), var(y|g) = 2(g1,…gp)

15

Models: GenotypePhenotype, ctd

Homoscedacity (constant variance)

2(g1,…gp) = 2 (constant)

Normality of residual variation

y|g ~ N(g ,2 )

Additivity:

(g1,…gp ) = + ∑j gj (gj = 0/1 for BC)

Epistasis: Any deviations from additivity.

16

Additivity, or non-additivity (BC)

17

Additivity or non-additivity: F2

18

The simplest method: ANOVA

•Split mice into groups according to genotype at a marker•Do a t-test/ANOVA•Repeat for each marker•Adjust for multiplicity

LOD score = log10 likelihood ratio, comparing single-QTL model to the “no QTL anywhere” model.

19

Exercise

1. Explain what happens when one compares trait values of individuals with the A and H genotypes in a backcross (a standard 2-sample comparison), when a QTL contributing to the trait is located at a map distance d (and recombination fraction r) away from the marker.

2. Can the location of a QTL as in 1 be estimated, along with the magnitude of the difference of the means for the two genotypes at the QTL? Explain fully.

20

Interval mapping (IM)

Lander & Botstein (1989)• Take account of missing genotype data (uses the HMM)• Interpolates between markers • Maximum likelihood under a mixture model

21

Interval mapping, cont

Imagine that there is a single QTL, at position z between two (flanking) markers

Let qi = genotype of mouse i at the QTL, and assume

yi | qi ~ Normal( qi , 2 )

We won’t know qi, but we can calculate

pig = Pr(qi = g | marker data)

Then, yi, given the marker data, follows a mixture of normal distributions, with known mixing proportions (the pig).

Use an EM algorithm to get MLEs of = (A, H, B, ).

Measure the evidence for a QTL via the LOD score, which is the log10 likelihood ratio comparing the hypothesis of a single QTL at position z to the hypothesis of no QTL anywhere.

22

Exercises

1. Suppose that two markers Ml and Mr are separated by map distance d, and that the locus z is a distance dl from Ml and dr from Mr. a) Derive the relationship between the three recombination fractions connecting Ml , Mr and z corresponding to dl + dr = d.

b) Calculate the (conditional) probabilities pig defined on the previous page for a BC (two g, four combinations of flanking genotypes), and an F2 (three g, nine combinations of flanking genotype).

2. Outline the mixture model appropriate for the BC distribution of a QT governed by a single QTL at the locus z as in 1 above.

23

LOD score curves

24

LOD curves for Chr 9 and 11 for trait4

25

LOD thresholds

To account for the genome-wide search, compare the observed LOD scores to the distribution of the maximum LOD score, genome-wide, that would be obtained if there were no QTL anywhere.

LOD threshold = 95th %ile of the distribution of genome-wide maxLOD,, when there are no QTL anywhere

Derivations: • Analytical calculations (Lander & Botstein, 1989)• Simulations • Permutation tests (Churchill & Doerge, 1994).

26

Permutation distribution for trait4

27

Epistasis for trait4

28

Acknowledgement

Karl Broman, Johns Hopkins

1 QTL mapping in mice Lecture 10, Statistics 246 February 24, 2004.

Documents

Transcript of 1 QTL mapping in mice Lecture 10, Statistics 246 February 24, 2004.