Genetic evaluation under parental uncertainty
Robert J. Tempelman, Michigan State University, East Lansing, MI
National Animal Breeding Seminar Series
December 6, 2004.
Key papers from our lab:
Cardoso, F.F., and R.J. Tempelman. 2003. Bayesian inference on genetic merit under uncertain paternity. Genetics, Selection, Evolution 35:469-487.
Cardoso, F.F., and R.J. Tempelman. 2004. Genetic evaluation of beef cattle accounting for uncertain paternity. Livestock Production Science 89: 109-120.
Multiple sires – The situation
Cows are mated with a group of bulls under pasture conditions.
Common in large beef cattle populations raised on extensive pasture conditions
– Accounts for up to 50% of calves in some herds under genetic evaluation in Brazil (~25-30% on average)
– Multiple-sire group sizes range from 2 to 12+ (breeding cow group sizes range from 50 to 300+)
Common in commercial U.S. herds
– Potential bottleneck for genetic evaluations beyond the seedstock level (Pollak, 2003).
[Figure: multiple-sire mating diagram. Who is the sire?]
The tabular method for computing genetic relationships
Recall the basic tabular method for computing the numerator relationship matrix:
– Henderson, C.R. 1976. A simple method for computing the inverse of a numerator relationship matrix used in prediction of breeding values. Biometrics 32:69.
$A = \{a_{ij}\}$, where $a_{ij}$ is the genetic relationship between animals i and j. Let the parents of j be $s_j$ and $d_j$. Then
$$a_{ij} = 0.5\,a_{i,s_j} + 0.5\,a_{i,d_j}$$
$$a_{jj} = 1 + 0.5\,a_{s_j,d_j} = 1 + F_j$$
The average numerator relationship matrix (ANRM)
Henderson, C.R. 1988. Use of an average numerator relationship matrix for multiple-sire joining. Journal of Animal Science 66:1614-1621.
– $a_{ij}$ is the genetic relationship between animals i and j.
Suppose the dam of j is known to be $d_j$, whereas there are $v_j$ different candidate sires $(s_1, s_2, \ldots, s_{v_j})$ with probabilities $(p_1, p_2, \ldots, p_{v_j})$ of being the true sire:
$$a_{ij} = 0.5\,a_{i,d_j} + 0.5\left(p_1 a_{i,s_1} + p_2 a_{i,s_2} + \cdots + p_{v_j} a_{i,s_{v_j}}\right)$$
$$a_{jj} = 1 + 0.5\left(p_1 a_{s_1,d_j} + p_2 a_{s_2,d_j} + \cdots + p_{v_j} a_{s_{v_j},d_j}\right) = 1 + F_j$$
$$\sum_{k=1}^{v_j} p_k = 1$$
Pedigree file example from Henderson (1988)

Animal   Sires     Sire probabilities   Dam
1        0         1                    0
2        0         1                    0
3        1         1                    2
4        1         1                    2
5        3         1                    4
6        3         1                    0
7        3, 5      0.6, 0.4             6
8        1, 5      0.3, 0.7             4
9        1, 4, 5   0.3, 0.6, 0.1        6
10       1         1                    4

0 = unknown
Sire probabilities could be determined using genetic markers.
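Numerator relationship matrix (animals 1 to 7, upper triangle shown; A is symmetric; rest provided in Henderson, 1988):
$$A = \begin{bmatrix}
1 & 0 & 0.5 & 0.5 & 0.5 & 0.25 & 0.375 \\
  & 1 & 0.5 & 0.5 & 0.5 & 0.25 & 0.375 \\
  &   & 1 & 0.5 & 0.75 & 0.5 & 0.7 \\
  &   &   & 1 & 0.75 & 0.25 & 0.425 \\
  &   &   &   & 1.25 & 0.375 & 0.6625 \\
  &   &   &   &   & 1 & 0.725 \\
  &   &   &   &   &   & 1.225
\end{bmatrix}$$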
Animal   Sires     Sire probabilities   Dam
7        3, 5      0.6, 0.4             6
8        1, 5      0.3, 0.7             4
9        1, 4, 5   0.3, 0.6, 0.1        6
10       1         1                    4

$$a_{17} = 0.5\,a_{16} + 0.5\,(p_3 a_{13} + p_5 a_{15}) = 0.5(0.25) + 0.5\,[0.6(0.5) + 0.4(0.5)] = 0.375$$
$$a_{27} = 0.5\,a_{26} + 0.5\,(p_3 a_{23} + p_5 a_{25}) = 0.5(0.25) + 0.5\,[0.6(0.5) + 0.4(0.5)] = 0.375$$
$$a_{37} = 0.5\,a_{36} + 0.5\,(p_3 a_{33} + p_5 a_{35}) = 0.5(0.5) + 0.5\,[0.6(1.0) + 0.4(0.75)] = 0.7$$
$$a_{47} = 0.5\,a_{46} + 0.5\,(p_3 a_{43} + p_5 a_{45}) = 0.5(0.25) + 0.5\,[0.6(0.5) + 0.4(0.75)] = 0.425$$
$$a_{57} = 0.5\,a_{56} + 0.5\,(p_3 a_{53} + p_5 a_{55}) = 0.5(0.375) + 0.5\,[0.6(0.75) + 0.4(1.25)] = 0.6625$$
$$a_{67} = 0.5\,a_{66} + 0.5\,(p_3 a_{63} + p_5 a_{65}) = 0.5(1.0) + 0.5\,[0.6(0.5) + 0.4(0.375)] = 0.725$$
$$a_{77} = 1 + 0.5\,(p_3 a_{36} + p_5 a_{56}) = 1 + 0.5\,[0.6(0.5) + 0.4(0.375)] = 1.225$$
Note: if the true sire of 7 is 3, $a_{77} = 1.25$; otherwise $a_{77} = 1.1875$.
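The recursion for the ANRM can be checked numerically. The following is a minimal sketch in plain Python, not Henderson's own program: the function name `average_relationship_matrix` and the tuple layout of the pedigree are assumptions made for illustration, and animals are assumed to be numbered so that parents precede offspring. With a single sire of probability 1 it reduces to the ordinary tabular method.

```python
def average_relationship_matrix(pedigree):
    """Average numerator relationship matrix for a pedigree with uncertain sires.

    pedigree: list of (sires, probs, dam) tuples, one per animal, in an order
    where parents come before offspring. IDs are 1-based; an empty sire list
    or dam 0 means "unknown".
    """
    n = len(pedigree)
    # Index 0 is a dummy "unknown parent" whose relationships are all zero.
    A = [[0.0] * (n + 1) for _ in range(n + 1)]
    for j, (sires, probs, dam) in enumerate(pedigree, start=1):
        for i in range(1, j):
            # Probability-weighted average over the candidate sires of j.
            sire_part = sum(p * A[i][s] for s, p in zip(sires, probs))
            A[i][j] = A[j][i] = 0.5 * sire_part + 0.5 * A[i][dam]
        # Diagonal: 1 + F_j, with F_j averaged over the candidate sires.
        A[j][j] = 1.0 + 0.5 * sum(p * A[s][dam] for s, p in zip(sires, probs))
    return [row[1:] for row in A[1:]]


# Pedigree from the Henderson (1988) example: (candidate sires, probabilities, dam).
henderson_1988 = [
    ([], [], 0), ([], [], 0), ([1], [1.0], 2), ([1], [1.0], 2),
    ([3], [1.0], 4), ([3], [1.0], 0), ([3, 5], [0.6, 0.4], 6),
    ([1, 5], [0.3, 0.7], 4), ([1, 4, 5], [0.3, 0.6, 0.1], 6), ([1], [1.0], 4),
]

A = average_relationship_matrix(henderson_1988)
print([round(A[i][6], 4) for i in range(7)])
# -> [0.375, 0.375, 0.7, 0.425, 0.6625, 0.725, 1.225]
```

The printed values are column 7 of A and agree with the hand calculations for animal 7.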
How about inferring upon what might be the correct sire?
Empirical Bayes strategy:
– Foulley, J.L., D. Gianola, and D. Planchenault. 1987. Sire evaluation with uncertain paternity. Genetics, Selection, Evolution 19:83-102.
– Sire model implementation.
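Simple sire model

Animal   Sires     Sire probabilities
1        0         1
2        0         1
3        1         1
4        1         1
5        3         1
6        3         1
7        3, 5      0.6, 0.4
8        1, 5      0.3, 0.7
9        1, 4, 5   0.3, 0.6, 0.1
10       1         1

$$y = X\beta + Zs + e$$
$$\begin{bmatrix} y_3 \\ y_4 \\ y_5 \\ y_6 \\ y_7 \\ y_8 \\ y_9 \\ y_{10} \end{bmatrix}
= X\beta +
\begin{bmatrix}
1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & ? & 0 & ? \\
? & 0 & 0 & ? \\
? & 0 & ? & ? \\
1 & 0 & 0 & 0
\end{bmatrix}
\begin{bmatrix} s_1 \\ s_3 \\ s_4 \\ s_5 \end{bmatrix}
+
\begin{bmatrix} e_3 \\ e_4 \\ e_5 \\ e_6 \\ e_7 \\ e_8 \\ e_9 \\ e_{10} \end{bmatrix}$$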
One possibility: Substitute sire probabilities for elements of Z.

Animal   Sires     Sire probabilities
7        3, 5      0.6, 0.4
8        1, 5      0.3, 0.7
9        1, 4, 5   0.3, 0.6, 0.1

$$\begin{bmatrix} y_3 \\ y_4 \\ y_5 \\ y_6 \\ y_7 \\ y_8 \\ y_9 \\ y_{10} \end{bmatrix}
= X\beta +
\begin{bmatrix}
1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0.6 & 0 & 0.4 \\
0.3 & 0 & 0 & 0.7 \\
0.3 & 0 & 0.6 & 0.1 \\
1 & 0 & 0 & 0
\end{bmatrix}
\begin{bmatrix} s_1 \\ s_3 \\ s_4 \\ s_5 \end{bmatrix}
+
\begin{bmatrix} e_3 \\ e_4 \\ e_5 \\ e_6 \\ e_7 \\ e_8 \\ e_9 \\ e_{10} \end{bmatrix}$$
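Filling Z with sire probabilities is mechanical once the candidate sires are listed per record. A minimal sketch in plain Python, using the candidate sires and probabilities from the slide; the dictionary layout and the function name `probability_incidence_matrix` are assumptions made for illustration.

```python
# Candidate sires and prior probabilities for the animals with records (3 to 10).
candidates = {
    3: {1: 1.0}, 4: {1: 1.0}, 5: {3: 1.0}, 6: {3: 1.0},
    7: {3: 0.6, 5: 0.4},
    8: {1: 0.3, 5: 0.7},
    9: {1: 0.3, 4: 0.6, 5: 0.1},
    10: {1: 1.0},
}

# Columns of Z: every animal that appears as a candidate sire (here 1, 3, 4, 5).
sire_ids = sorted({s for probs in candidates.values() for s in probs})


def probability_incidence_matrix(candidates, sire_ids):
    """Return Z with one row per record (sorted by animal) and one column per sire."""
    return [
        [probs.get(s, 0.0) for s in sire_ids]
        for _, probs in sorted(candidates.items())
    ]


Z = probability_incidence_matrix(candidates, sire_ids)
for animal, row in zip(sorted(candidates), Z):
    print(animal, row)
```

Each row sums to 1, and rows for animals with a known sire reduce to the usual 0/1 incidence pattern.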
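Strategy of Foulley et al. (1987)
$$\begin{bmatrix} y_3 \\ y_4 \\ y_5 \\ y_6 \\ y_7 \\ y_8 \\ y_9 \\ y_{10} \end{bmatrix}
= X\beta +
\begin{bmatrix}
1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & \Pr(\text{sire}_7 = 3 \mid y) & 0 & \Pr(\text{sire}_7 = 5 \mid y) \\
\Pr(\text{sire}_8 = 1 \mid y) & 0 & 0 & \Pr(\text{sire}_8 = 5 \mid y) \\
\Pr(\text{sire}_9 = 1 \mid y) & 0 & \Pr(\text{sire}_9 = 4 \mid y) & \Pr(\text{sire}_9 = 5 \mid y) \\
1 & 0 & 0 & 0
\end{bmatrix}
\begin{bmatrix} s_1 \\ s_3 \\ s_4 \\ s_5 \end{bmatrix}
+
\begin{bmatrix} e_3 \\ e_4 \\ e_5 \\ e_6 \\ e_7 \\ e_8 \\ e_9 \\ e_{10} \end{bmatrix}$$
$\Pr(\text{sire}_i = j \mid y)$: Posterior probabilities, using the provided sire probabilities as “prior” probabilities and y to estimate elements of Z.
- computed iteratively
Limitation: Can only be used for sire models.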
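To illustrate how the data can update the provided sire probabilities, here is a minimal sketch of a single Bayes-rule step for one record. It is not the exact algorithm of Foulley et al. (1987): it assumes normal residuals, treats the fixed effects, sire effects and residual variance as known current estimates, and all names and numbers below are hypothetical.

```python
import math


def posterior_sire_probabilities(y_adj, priors, sire_effects, residual_var):
    """Posterior probability of each candidate sire for one record.

    y_adj:        record adjusted for fixed effects (y_i - x_i'beta)
    priors:       {sire_id: prior probability}
    sire_effects: {sire_id: current estimate of that sire's effect}
    residual_var: residual variance (assumed known for this step)
    """
    # Prior times normal likelihood of the record under each candidate sire.
    weights = {
        s: p * math.exp(-0.5 * (y_adj - sire_effects[s]) ** 2 / residual_var)
        for s, p in priors.items()
    }
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}


# Animal 7 with priors 0.6 / 0.4; effects and record are made-up numbers.
post = posterior_sire_probabilities(
    y_adj=-3.0,
    priors={3: 0.6, 5: 0.4},
    sire_effects={3: 2.0, 5: -2.0},
    residual_var=25.0,
)
print(post)
```

In an iterative scheme these probabilities would replace the corresponding elements of Z, the sire effects would be re-estimated, and the two steps repeated until convergence. A single record moves the probabilities only modestly when sire effects are small relative to the residual standard deviation.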
Inferring upon elements of design matrix
Where else is this method currently used? Segregation analysis
– Estimating allelic frequencies and genotypic effects for a biallelic locus WITHOUT molecular marker information.
– Prior probabilities based on HW equilibrium for base population.
– Posterior probabilities based on data.
– Reference: Janss, L.L.G., R. Thompson, and J.A.M. Van Arendonk. 1995. Application of Gibbs sampling for inference in a mixed major gene-polygenic inheritance model in animal populations. Theoretical and Applied Genetics 91:1137-1147.
Another strategy (most commonly used)
Use phantom groups (Westell et al., 1988; Quaas et al., 1988).
– Used commonly in genetic evaluation systems having incomplete ancestral pedigrees in order to mitigate bias due to genetic trend.
– Limitations (applied to multiple sires):
1. Assumes the number of candidate sires is effectively infinite within a group.
2. None of the phantom parents are related.
3. Potential confounding problems for small groups (Quaas, 1988).
The ineffectiveness of phantom grouping for genetic evaluations in multiple sire pastures:
Perez-Enciso, M. and R.L. Fernando. 1992. Genetic evaluation with uncertain parentage: A comparison of methods. Theoretical and Applied Genetics 84:173-179.
Sullivan, P.G. 1995. Alternatives for genetic evaluation with uncertain paternity. Canadian Journal of Animal Science 75:31-36.
– Greater selection response using Henderson’s ANRM relative to phantom grouping (simulation studies).
– Excluding animals with uncertain paternity reduces expected selection response by as much as 37%.
Uncertain paternity - objectives
1. To propose a hierarchical Bayes animal model for genetic evaluation of individuals having uncertain paternity
2. To estimate posterior probabilities of each bull in the group being the correct sire of the individual
3. To compare the proposed method with Henderson’s ANRM via
   1. Simulation study
   2. Application to Hereford PWG and WW data.
Uncertain paternity - hierarchical Bayes model
1st stage
Data - y (performance records)
Non-genetic effects - $\beta$ (contemporary groups, age of dam, age of calf, gender)
Animal genetic values - a
Residual terms - e (assumed to be normal)
$$y = X\beta + Za + e; \quad e \sim N(0, I\sigma_e^2)$$
Uncertain paternity - hierarchical Bayes model
2nd stage
Non-genetic effects: $\beta \sim N(\beta_o, V_\beta)$
– Prior means based on literature information
– Variance based on the reliability of prior information
Animal genetic values: $a \mid s \sim N(0, A_s\sigma_a^2)$
– (Co)variances based on relationship (A), sire assignments (s) and genetic variance ($\sigma_a^2$)
Residual variance: $\sigma_e^2 \sim s_e^2\,\chi^{-2}_{\nu_e}$
– Prior knowledge based on literature information
Uncertain paternity - hierarchical Bayes model
3rd stage
Sire assignments: $\text{Prob}(s_j \mid \pi_j)$
– Probability for sire assignments ($\pi_j$)
– Could be based on marker data.
Genetic variance: $\sigma_a^2 \sim s_a^2\,\chi^{-2}_{\nu_a}$
– Prior knowledge based on literature information
Uncertain paternity - hierarchical Bayes model
4th stage
Specifying uncertainty for probability of sire assignments: Dirichlet prior
$$p(\pi_j \mid \alpha_j) \propto \prod_{k=1}^{v_j} \pi_{kj}^{\alpha_{kj} - 1}$$
e.g. How sure are you about the prior probabilities of 0.6 and 0.4 for Sires 3 and 5, respectively, being the correct sire?
Assessment based on how much you trust the genotype-based probabilities.
Could also model genotyping error rates explicitly (Rosa, G.J.M., B.S. Yandell, and D. Gianola. 2002. A Bayesian approach for constructing genetic maps when markers are miscoded. Genetics, Selection, Evolution 34:353-369).
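One way to read the Dirichlet prior is that the ratios of the $\alpha$ values set the prior sire probabilities while their sum sets how firmly those probabilities are held. A minimal sketch in plain Python, using the standard mean and variance of a Dirichlet distribution; the function name and the three $\alpha$ vectors are hypothetical choices for the 0.6 / 0.4 example.

```python
def dirichlet_mean_sd(alpha):
    """Mean and standard deviation of each component of a Dirichlet(alpha)."""
    a0 = sum(alpha)
    means = [a / a0 for a in alpha]
    sds = [(a * (a0 - a) / (a0 ** 2 * (a0 + 1))) ** 0.5 for a in alpha]
    return means, sds


# Same prior means (0.6 for Sire 3, 0.4 for Sire 5), increasing confidence.
for alpha in [(0.6, 0.4), (6, 4), (60, 40)]:
    means, sds = dirichlet_mean_sd(alpha)
    print(alpha, [round(m, 3) for m in means], [round(s, 3) for s in sds])
```

All three settings centre the prior on 0.6 and 0.4, but the prior standard deviation shrinks as the $\alpha$ values grow, so larger values express more trust in the genotype-based probabilities.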
Uncertain paternity - joint posterior density
[Figure: diagram of the four model stages combined into the joint posterior density, explored by Markov chain Monte Carlo (MCMC).]
1st stage: Data
2nd stage:
– Non-genetic fixed effects: prior means (literature information); variance (reliability of priors)
– Genetic effects: (co)variances (relationship, sire assignments and genetic variances)
– Residual error: prior knowledge based on literature information
3rd stage:
– Prior probability for sire assignments
– Prior knowledge based on literature information
4th stage: Reliability of priors
Simulation Study (Cardoso and Tempelman, 2003)
[Figure: flow diagram of the simulated population.]
Generation 0: base population; selection (20 sires & 100 dams); breeding population; random mating (inbreeding avoided); offspring (500 animals).
Generations 1, 2, ...: selection (15 sires & 75 dams) and selection (5 sires & 25 dams); breeding population; random mating (inbreeding avoided); offspring (500 animals).
Generation 5: offspring (360 animals).
Totals: 80 sires, 400 dams, 2000 non-parents.
Paternity assignment
Offspring: random assignment to paternity condition
– Uncertain: .3
– Certain: .7
Assignment to multiple-sire groups:

Group size    2    3    4    6    8    10
Probability   .2   .3   .2   .1   .1   .1

Within the assigned group, one of the sires is picked to be the true sire (with equal or unequal probabilities).
Record: $y_i = \tfrac{1}{2}a_{s_i} + \tfrac{1}{2}a_{d_i} + \phi_i + m_{d_i} + e_i$, where $\phi_i$ denotes the Mendelian sampling term.
Sires averaged 23.6 progeny; dams averaged 5.9 progeny.
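The paternity assignment scheme can be sketched as a small random draw per offspring. This is an illustration only, not the code used in the study: it assumes the slide's ".3 / .7" means 30% uncertain and 70% certain, picks the true sire with equal probabilities, and uses hypothetical names throughout.

```python
import random

GROUP_SIZES = [2, 3, 4, 6, 8, 10]
GROUP_PROBS = [0.2, 0.3, 0.2, 0.1, 0.1, 0.1]
P_UNCERTAIN = 0.3  # assumed reading of the slide: .3 uncertain, .7 certain


def assign_paternity(all_sires, rng):
    """Draw a candidate-sire group and a true sire for one offspring.

    Returns (group, true_sire). A group of size 1 means certain paternity.
    """
    if rng.random() < P_UNCERTAIN:
        size = rng.choices(GROUP_SIZES, weights=GROUP_PROBS)[0]
    else:
        size = 1
    group = rng.sample(all_sires, size)
    true_sire = rng.choice(group)  # equal probabilities within the group
    return group, true_sire


rng = random.Random(2004)  # fixed seed so the draw is repeatable
sires = list(range(1, 21))  # e.g. the 20 sires selected in generation 0
for _ in range(5):
    print(assign_paternity(sires, rng))
```

Unequal within-group probabilities could be handled by passing weights to `rng.choices` in place of `rng.choice`.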
Simulated traits:
Ten datasets generated from each of two different types of traits:
– Trait 1 (WW): $h_a^2 = 0.3$, $r_{am} = 0.2$, $h_m^2 = 0.2$
– Trait 2 (PWG): $h_a^2 = 0.5$, $r_{am} = 0$, $h_m^2 = 0$
Naïve prior assignments: i.e. equal prior probabilities to each candidate sire (no information based on genetic markers available)
Posterior probabilities of sire assignments being equal to true sires

                     Multiple-sire group size
Animal category      2       3       4       6       8       10
Trait 1
  Parents            0.525   0.349   0.269   0.183   0.127   0.110
  Non-parents        0.517   0.345   0.268   0.178   0.134   0.105
Trait 2
  Parents            0.521   0.352   0.280   0.188   0.138   0.111
  Non-parents        0.540   0.360   0.289   0.191   0.143   0.111
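Because the priors were naïve, each candidate sire in a group of size v started at probability 1/v, so the table is easiest to read as a gain over that baseline. A short sketch in plain Python that computes the gain from the values in the table; the variable names are assumptions.

```python
group_sizes = [2, 3, 4, 6, 8, 10]
posterior = {
    "Trait 1, parents":     [0.525, 0.349, 0.269, 0.183, 0.127, 0.110],
    "Trait 1, non-parents": [0.517, 0.345, 0.268, 0.178, 0.134, 0.105],
    "Trait 2, parents":     [0.521, 0.352, 0.280, 0.188, 0.138, 0.111],
    "Trait 2, non-parents": [0.540, 0.360, 0.289, 0.191, 0.143, 0.111],
}

# Gain of the posterior probability of the true sire over the naive prior 1/v.
gains = {
    label: [p - 1.0 / v for p, v in zip(probs, group_sizes)]
    for label, probs in posterior.items()
}
for label, values in gains.items():
    print(label, [round(g, 3) for g in values])
```

Every gain is positive but none exceeds about 0.04, so phenotypic records alone shift the sire probabilities only slightly away from the naïve prior.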
Rank correlation of predicted genetic effects
[Figure: bar chart of rank correlation (y-axis, 0.35 to 0.85) for the ANRM, HIER and TRUE models. Trait 1: parents additive, non-parents additive, parents maternal, non-parents maternal. Trait 2: parents additive, non-parents additive. In every category the bars are labelled a, a, b for ANRM, HIER and TRUE, respectively.]
ANRM = Henderson’s ANRM
HIER = proposed model
TRUE = all sires known
Sidenote: Model fit criteria were clearly in favor of HIER over ANRM.
Uncertain paternity - application to field data
Data set
– 3,402 post-weaning gain records on Hereford calves raised in southern Brazil (from 1991-1999)
– 4,703 animals
– Paternity (57% certain; 15% uncertain & 28% unknown-base animals)
– Group sizes 2, 3, 4, 5, 6, 10, 12 & 17
Methods
– ANRM (average relationship)
– HIER (uncertain paternity hierarchical Bayes model)
Posterior inference for PWG genetic parameters under ANRM versus HIER models

Model   Parameter          Posterior median   95% credible set
ANRM    $h_a^2$            0.231              (0.153, 0.316)
ANRM    $\sigma_a^2$       73.8               (48.0, 103.6)
ANRM    $\sigma_e^2$       246.5              (221.5, 271.2)
ANRM    $\sigma_{cg}^2$    404.5              (334.3, 494.0)
HIER    $h_a^2$            0.244              (0.162, 0.336)
HIER    $\sigma_a^2$       78.2               (51.1, 111.2)
HIER    $\sigma_e^2$       242.9              (216.5, 268.2)
HIER    $\sigma_{cg}^2$    404.5              (333.9, 493.8)
Uncertain paternity - Results summary
– Model choice criteria (DIC and PBF) decisively favored HIER over ANRM
– Very high rank correlations between genetic evaluations using ANRM versus HIER
– Some non-trivial differences in posterior means of additive genetic value for some animals
Uncertain paternity - assessment of accuracy (PWG)
[Figure: scatter plot of the standard deviation of additive genetic effects, SD(a), under ANRM (y-axis, kg) against SD(a) under HIER (x-axis, kg), both axes from 4.0 to 9.0. Fitted line: y = 0.6786x + 2.2914, R² = 0.741. Highlighted points: a sire with 50 progeny and a sire with 9 progeny.]
i.e. accuracies are generally slightly overstated with Henderson’s ANRM
Conclusions
Uncertain paternity modeling complements genetic marker information (as priors)
– Reliability of prior information can be expressed (via Dirichlet).
Little advantage over the use of Henderson’s ANRM.
– However, accuracies of EPDs are overstated using ANRM.
– Power of inference may improve with better statistical assumptions (e.g. heterogeneous residual variances)
Implementation issues
Likely to require a non-MCMC approach to providing genetic evaluations.
Some hybrid with phantom grouping will likely be needed.
– Candidate sires are simply not known for some animals.
Bob Weaber’s talk.