1 QTL mapping in mice Lecture 10, Statistics 246 February 24, 2004.
-
Upload
lawrence-walters -
Category
Documents
-
view
224 -
download
2
Transcript of 1 QTL mapping in mice Lecture 10, Statistics 246 February 24, 2004.
1
QTL mapping in mice
Lecture 10, Statistics 246
February 24, 2004
2
The mouse as a model
Same genes?The genes involved in a phenotype in the mouse may also be
involved in similar phenotypes in the human.
Similar complexity?The complexity of the etiology underlying a mouse phenotype
provides some indication of the complexity of similar human phenotypes.
Transfer of statistical methods.The statistical methods developed for gene mapping in the
mouse serve as a basis for similar methods applicable in direct human studies.
3
Backcross experiment
4
F2 intercross experiment
5
F2 intercross: another view
6
Quantitative traits (phenotypes)133 females from our earlier (NOD B6) (NOD B6) cross
Trait 4 is the log count of a particular white blood cell type.
7
Another representation of a trait distribution
Note the equivalent of dominance in our trait distributions.
8
A second example
Note the approximate additivity in our trait distributions here.
9
Trait distributions:a classical view
In general we seek a difference in the phenotype distributions of the parental strains before we think seeking genes associated with a trait is worthwhile. But even if there is little difference, there may be many such genes. Our trait 4 is a case like this.
10
Data and goals
Data
Phenotypes: yi = trait value for mouse i
Genotype: xij = 1/0 of mouse i is A/H at marker j (backcross); need two dummy variables for intercross
Genetic map: Locations of markers
Goals •Identify the (or at least one) genomic region, called quantitative
trait locus = QTL, that contributes to variation in the trait•Form confidence intervals for the QTL location •Estimate QTL effects
11
Genetic map from our NOD B6 intercross
12
Genotype data
13
Models: Recombination
We assume no chromatid or crossover interference.
points of exchange (crossovers) along chromosomes are distributed as a Poisson process, rate 1 in genetic distancce
the marker genotypes {xij} form a Markov chain along the chromosome for a backcross; what do they form in an F2 intercross?
14
Models: GenotypePhenotype
Let y = phenotype, g = whole genome genotype
Imagine a small number of QTL with genotypes g1,…., gp (2p or 3p distinct genotypes for BC, IC resp).
We assume
E(y|g) = (g1,…gp ), var(y|g) = 2(g1,…gp)
15
Models: GenotypePhenotype, ctd
Homoscedacity (constant variance)
2(g1,…gp) = 2 (constant)
Normality of residual variation
y|g ~ N(g ,2 )
Additivity:
(g1,…gp ) = + ∑j gj (gj = 0/1 for BC)
Epistasis: Any deviations from additivity.
16
Additivity, or non-additivity (BC)
17
Additivity or non-additivity: F2
18
The simplest method: ANOVA
•Split mice into groups according to genotype at a marker•Do a t-test/ANOVA•Repeat for each marker•Adjust for multiplicity
LOD score = log10 likelihood ratio, comparing single-QTL model to the “no QTL anywhere” model.
19
Exercise
1. Explain what happens when one compares trait values of individuals with the A and H genotypes in a backcross (a standard 2-sample comparison), when a QTL contributing to the trait is located at a map distance d (and recombination fraction r) away from the marker.
2. Can the location of a QTL as in 1 be estimated, along with the magnitude of the difference of the means for the two genotypes at the QTL? Explain fully.
20
Interval mapping (IM)
Lander & Botstein (1989)• Take account of missing genotype data (uses the HMM)• Interpolates between markers • Maximum likelihood under a mixture model
21
Interval mapping, cont
Imagine that there is a single QTL, at position z between two (flanking) markers
Let qi = genotype of mouse i at the QTL, and assume
yi | qi ~ Normal( qi , 2 )
We won’t know qi, but we can calculate
pig = Pr(qi = g | marker data)
Then, yi, given the marker data, follows a mixture of normal distributions, with known mixing proportions (the pig).
Use an EM algorithm to get MLEs of = (A, H, B, ).
Measure the evidence for a QTL via the LOD score, which is the log10 likelihood ratio comparing the hypothesis of a single QTL at position z to the hypothesis of no QTL anywhere.
22
Exercises
1. Suppose that two markers Ml and Mr are separated by map distance d, and that the locus z is a distance dl from Ml and dr from Mr. a) Derive the relationship between the three recombination fractions connecting Ml , Mr and z corresponding to dl + dr = d.
b) Calculate the (conditional) probabilities pig defined on the previous page for a BC (two g, four combinations of flanking genotypes), and an F2 (three g, nine combinations of flanking genotype).
2. Outline the mixture model appropriate for the BC distribution of a QT governed by a single QTL at the locus z as in 1 above.
23
LOD score curves
24
LOD curves for Chr 9 and 11 for trait4
25
LOD thresholds
To account for the genome-wide search, compare the observed LOD scores to the distribution of the maximum LOD score, genome-wide, that would be obtained if there were no QTL anywhere.
LOD threshold = 95th %ile of the distribution of genome-wide maxLOD,, when there are no QTL anywhere
Derivations: • Analytical calculations (Lander & Botstein, 1989)• Simulations • Permutation tests (Churchill & Doerge, 1994).
26
Permutation distribution for trait4
27
Epistasis for trait4
28
Acknowledgement
Karl Broman, Johns Hopkins