An Elston-Stewart algorithm for computing exact derivatives of the likelihood in genetic epidemiology
Alexandra Lefebvre1 and Gregory Nuel1
(PhD thesis co-supervised by G. Nuel1 & A. Duval2)
(1) LPSM (Probabilités, Statistique et Modélisation), CNRS, Sorbonne Université, Paris
(2) CRSA, Instabilité des microsatellites et cancer, Hôpital Saint-Antoine, Paris
Annual meeting of the GDR Statistique et Santé
Nantes, 27 & 28 September 2018
A. Lefebvre and G. Nuel Derivatives of the likelihood in BNs PGM 2018 1 / 20
General context: Likelihood in family studies
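Genetic epidemiology:
  Population based (GWAS, ...)
  Family studies, pedigree based (segregation, linkage, TDT, ...)
[Figure: pedigree with founders 1, 2, 3, 4, children 5 and 6, and grandchild 7]
$G_U = \{G_u\}_{u \in U}$, $Y_U = \{Y_u\}_{u \in U}$
$\mathrm{pa}_u$: set of labels for parents of $X_u$
$\mathcal{G}_u$ (resp. $\mathcal{Y}_u$): set of values for $G_u$ (resp. $Y_u$)
Evidence: either data or neutral evidence
\[ \mathrm{ev}|G = \{G_u \in \mathcal{G}^*_u \subseteq \mathcal{G}_u\},\ u \in U \qquad \mathrm{ev}|Y = \{Y_u \in \mathcal{Y}^*_u \subseteq \mathcal{Y}_u\},\ u \in U \]
Potential related to u
\[ \varphi_u(G_u | G_{\mathrm{pa}_u}; \theta) = \int_{\mathcal{Y}_u} \mathbf{1}_{G_u \in \mathcal{G}^*_u} \mathbf{1}_{Y_u \in \mathcal{Y}^*_u} P(G_u | G_{\mathrm{pa}_u}; \theta) P(Y_u | G_u; \theta) \]
with $\theta \in \mathbb{R}^p$, $p \ge 1$
Likelihood
\[ L(\theta) = P(\mathrm{ev} | \theta) = \sum_{G_U} \prod_{u \in U} \varphi_u(G_u | G_{\mathrm{pa}_u}; \theta) \]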
General context: Computation of L(θ)
Complexity
\[ L(\theta) = \sum_{G_1} \cdots \sum_{G_n} \prod_{u=1}^{n} \varphi_u(G_u | G_{\mathrm{pa}_u}; \theta) \]
Let $g = |\mathcal{G}_u|$ ⇒ Complexity = $O(g^n)$
Algorithms
Elston-Stewart (1971), Totir et al. (2009): peeling of individuals.
Sum-product or message passing: elimination of variables.
Complexity: $O(g^{m \times n}) \to O(n \times g^{m \times k})$, with n individuals, m loci, and k less than 5 in most cases
Lander-Green (1987): peeling of loci.
Complexity: $O(g^{m \times n}) \to O(m \times g^n)$
General context: Computation of L(θ)
Elston-Stewart algorithm or Sum-product algorithm
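[Figure: pedigree with founders 1, 2, 3, 4, children 5 (of 1 and 2) and 6 (of 3 and 4), and grandchild 7 (of 5 and 6), shown next to its clique decomposition]
\[ L(\theta) = \sum_{G_5} \sum_{G_6} \sum_{G_7} \Big\{ \Big( \overbrace{\sum_{G_1} \sum_{G_2} \varphi_1(G_1) \varphi_2(G_2) \varphi_5(G_5 | G_1, G_2)}^{F_1(G_5)} \Big) \Big( \overbrace{\sum_{G_3} \sum_{G_4} \varphi_3(G_3) \varphi_4(G_4) \varphi_6(G_6 | G_3, G_4)}^{F_2(G_6)} \Big) \varphi_7(G_7 | G_5, G_6) \Big\} \]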
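As a rough illustration of the factorization above, the sketch below compares a brute-force sum over all configurations with the peeled version on the seven-person pedigree. Everything specific here is an assumption made for the example, not taken from the slides: a toy biallelic locus with genotypes 0, 1, 2, a made-up allele frequency `Q`, and a hypothetical evidence table `ALLOWED`.

```python
from itertools import product

G = (0, 1, 2)  # toy genotypes: number of copies of a hypothetical allele "a"
Q = 0.3        # made-up allele frequency

# Hypothetical evidence: genotypes compatible with the data for each individual
# (the full set {0, 1, 2} stands for neutral evidence).
ALLOWED = {1: {0, 1}, 2: {1}, 3: {0, 1, 2}, 4: {1, 2},
           5: {0, 1, 2}, 6: {0, 1, 2}, 7: {2}}

def phi_founder(u, g):
    """Potential of founder u: Hardy-Weinberg prior times the evidence indicator."""
    prior = ((1 - Q) ** 2, 2 * Q * (1 - Q), Q ** 2)[g]
    return prior if g in ALLOWED[u] else 0.0

def phi_child(u, g, gf, gm):
    """Potential of non-founder u: Mendelian transmission times the evidence indicator."""
    tf, tm = gf / 2, gm / 2  # probability that each parent transmits allele "a"
    trans = ((1 - tf) * (1 - tm), tf * (1 - tm) + (1 - tf) * tm, tf * tm)[g]
    return trans if g in ALLOWED[u] else 0.0

def likelihood_brute_force():
    """Sum the product of the seven potentials over all 3^7 configurations."""
    total = 0.0
    for g1, g2, g3, g4, g5, g6, g7 in product(G, repeat=7):
        total += (phi_founder(1, g1) * phi_founder(2, g2)
                  * phi_founder(3, g3) * phi_founder(4, g4)
                  * phi_child(5, g5, g1, g2) * phi_child(6, g6, g3, g4)
                  * phi_child(7, g7, g5, g6))
    return total

def likelihood_peeling():
    """Peel the founders first (messages F1 and F2), then sum over G5, G6, G7."""
    F1 = {g5: sum(phi_founder(1, a) * phi_founder(2, b) * phi_child(5, g5, a, b)
                  for a in G for b in G) for g5 in G}
    F2 = {g6: sum(phi_founder(3, a) * phi_founder(4, b) * phi_child(6, g6, a, b)
                  for a in G for b in G) for g6 in G}
    return sum(F1[g5] * F2[g6] * phi_child(7, g7, g5, g6)
               for g5 in G for g6 in G for g7 in G)

print(likelihood_brute_force(), likelihood_peeling())
```

Both functions should return the same value up to rounding; the peeled version only ever sums over three variables at a time.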
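[Figure: clique tree with cliques $C_1 = \{G^*_1, G^*_2, G^*_5\}$, $C_2 = \{G^*_3, G^*_4, G^*_6\}$, $C_3 = \{G_5, G_6, G^*_7\}$ and messages $F_1(G_5 | \theta)$, $F_2(G_6 | \theta)$, $F_3(\emptyset | \theta)$]
\[ \forall i \in \{1, \ldots, 3\}: \quad \phi_i(C_i | \theta) = \prod_{G_u \in C^*_i} \varphi_u(G_u | G_{\mathrm{pa}_u}; \theta) \]
\[ F_i(S_i | \theta) = \sum_{C_i \setminus S_i} \prod_{j \in \mathrm{from}_i} F_j(S_j | \theta) \times \phi_i(C_i | \theta); \qquad F_3(\emptyset | \theta) = L(\theta) \]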
Our objective: Compute L(θ) and its derivatives up to a chosen order d .
State of the art
Sensitivity analysis: express L(θ) as a polynomial in θ
  ⇒ not always possible (e.g. $P(X_u = 1) = e^\theta / (1 + e^\theta)$)
  ⇒ prohibitive if the degree in θ is large (e.g. θ appears in many potentials)
Smoothing recursions (Cappé and Moulines, 2005): express $\nabla^d \log L(\theta)$ as a function of $E[\nabla^d \log \varphi_u | \mathrm{ev}]$ (Fisher-Louis formulas)
  ⇒ developed for hidden Markov models
  ⇒ many terms and complex recursions as d increases
Polynomial computations: perform sum-product with polynomials
  ⇒ moments of additive functionals in BNs (Cowell, 1992; Nilsson, 2001)
  ⇒ moment generating function / probability generating function for regular expressions in Markov models (Nuel, 2008, 2010)
Our idea: use polynomial computations in the framework of derivatives
Method: Derivative generating function
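\[ L(\theta) = \sum_{G_U} \prod_{u \in U} \varphi_u(G_u | G_{\mathrm{pa}_u}; \theta) \quad \Rightarrow \quad \sum_{G_U} \bigstar_{u \in U} D^d \varphi_u(G_u | G_{\mathrm{pa}_u}; \theta) \]
Derivative generating function (Dgf)
Let f be a function of class $C^d$ of $\theta \in \mathbb{R}$, $d \in \mathbb{N}$:
\[ D^d f(\theta) = \sum_{k=0}^{d} f^{(k)}(\theta) z^k \]
where z is a dummy variable.
Example: $D^2 f(\theta) = f(\theta) + f'(\theta) z + f''(\theta) z^2$
⇒ Polynomial potentials of degree d: $\forall u \in U$:
\[ D^d \varphi_u(G_u | G_{\mathrm{pa}_u}; \theta) = \sum_{k=0}^{d} \varphi_u^{(k)}(G_u | G_{\mathrm{pa}_u}; \theta) z^k = \varphi_u(G_u | G_{\mathrm{pa}_u}; \theta) + \varphi'_u(G_u | G_{\mathrm{pa}_u}; \theta) z + \ldots + \varphi_u^{(d)}(G_u | G_{\mathrm{pa}_u}; \theta) z^d \]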
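Method: “Leibniz’s product” $\star$
Definition of the “Leibniz’s product”
\[ P = \sum_{k=0}^{d} a_k z^k \text{ with } a_k = f^{(k)}(\theta) \qquad Q = \sum_{k=0}^{d} b_k z^k \text{ with } b_k = g^{(k)}(\theta) \]
\[ P \star Q = \sum_{k=0}^{d} c_k z^k \text{ with } c_k = (fg)^{(k)}(\theta) \]
\[ (fg)^{(k)}(\theta) = \sum_{i=0}^{k} \binom{k}{i} f^{(i)}(\theta) g^{(k-i)}(\theta) \quad \Rightarrow \quad c_k = \sum_{i=0}^{k} \binom{k}{i} a_i b_{k-i} \]
\[ P \star Q = \sum_{k=0}^{d} \sum_{i=0}^{k} \binom{k}{i} a_i b_{k-i} z^k \]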
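The definition above translates almost line by line into code. The following is a minimal sketch, assuming a polynomial is stored as the list of its coefficients `[a_0, ..., a_d]`; the function name `leibniz` and the test functions f(θ) = θ² and g(θ) = θ³ are choices made for this example only.

```python
from math import comb

def leibniz(a, b):
    """Leibniz product of two derivative generating functions of the same order d.

    a[k] and b[k] hold the k-th derivatives of f and g at the current theta;
    the result holds the derivatives of the product fg, up to order d.
    """
    d = len(a) - 1
    return [sum(comb(k, i) * a[i] * b[k - i] for i in range(k + 1))
            for k in range(d + 1)]

theta = 2.0
P = [theta ** 2, 2 * theta, 2.0]            # f(theta) = theta^2 and its derivatives
Q = [theta ** 3, 3 * theta ** 2, 6 * theta]  # g(theta) = theta^3 and its derivatives
PQ = leibniz(P, Q)
print(PQ)  # derivatives of theta^5 at theta = 2: [32.0, 80.0, 160.0]
```

The double loop over k and i is where the O(d²) factor in the complexity comes from.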
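Multidimensional parameter $\Theta = (\theta_1, \ldots, \theta_p)$
Dgf of a function f of class $C^d$ of $\Theta \in \mathbb{R}^p$
\[ D^d f(\Theta) = \sum_{k_1 + \ldots + k_p \le d} \frac{\partial^{k_1 + \ldots + k_p} f(\Theta)}{\partial \theta_1^{k_1} \ldots \partial \theta_p^{k_p}} z_1^{k_1} \ldots z_p^{k_p} \]
“Leibniz product”
\[ P = \sum_{k_1 + \ldots + k_p \le d} a_{k_1, \ldots, k_p} z_1^{k_1} \ldots z_p^{k_p}; \qquad a_{k_1, \ldots, k_p} = \frac{\partial^{k_1 + \ldots + k_p} f(\Theta)}{\partial \theta_1^{k_1} \ldots \partial \theta_p^{k_p}} \]
Q: same with coefficients b instead of a and function g instead of f
\[ P \star Q = \sum_{k_1 + \ldots + k_p \le d} c_{k_1, \ldots, k_p} z_1^{k_1} \ldots z_p^{k_p} \]
\[ \text{with } c_{k_1, \ldots, k_p} = \sum_{i_1 = 0}^{k_1} \ldots \sum_{i_p = 0}^{k_p} \binom{k_1}{i_1} \ldots \binom{k_p}{i_p} a_{k_1 - i_1, \ldots, k_p - i_p} b_{i_1, \ldots, i_p} \]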
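The multivariate product can be sketched in the same way. The version below assumes coefficients are stored in a dictionary keyed by the multi-index (k_1, ..., k_p) with total degree at most d; the helper names and the two test functions, f = θ1·θ2 and g = θ1 + θ2, are illustrative assumptions.

```python
from itertools import product
from math import comb

def multi_indices(p, d):
    """All multi-indices (k_1, ..., k_p) with k_1 + ... + k_p <= d."""
    return [k for k in product(range(d + 1), repeat=p) if sum(k) <= d]

def leibniz_multi(a, b, p, d):
    """Multivariate Leibniz product of two Dgfs stored as {multi-index: coefficient}."""
    c = {}
    for k in multi_indices(p, d):
        total = 0.0
        for i in product(*(range(kj + 1) for kj in k)):
            weight = 1
            for kj, ij in zip(k, i):
                weight *= comb(kj, ij)
            rest = tuple(kj - ij for kj, ij in zip(k, i))
            total += weight * a[rest] * b[i]
        c[k] = total
    return c

# Example with p = 2, d = 2 at (theta1, theta2) = (2, 3).
t1, t2 = 2.0, 3.0
a = {(0, 0): t1 * t2, (1, 0): t2, (0, 1): t1, (1, 1): 1.0, (2, 0): 0.0, (0, 2): 0.0}
b = {(0, 0): t1 + t2, (1, 0): 1.0, (0, 1): 1.0, (1, 1): 0.0, (2, 0): 0.0, (0, 2): 0.0}
c = leibniz_multi(a, b, p=2, d=2)
print(c)  # value, gradient and Hessian entries of theta1^2*theta2 + theta1*theta2^2
```

For d = 2 the dictionary holds the value, the p gradient entries and the p(p+1)/2 distinct Hessian entries.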
Computation: $\sum \prod \varphi_u \to \sum \bigstar D^d \varphi_u$
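\[ \forall i \in \{1, \ldots, 3\}: \quad \Phi^d_i(C_i | \theta) = \bigstar_{G_u \in C^*_i} D^d \varphi_u(G_u | G_{\mathrm{pa}_u}; \theta) \]
\[ F^d_i(S_i | \theta) = \sum_{C_i \setminus S_i} \Big( \bigstar_{j \in \mathrm{from}_i} F^d_j(S_j | \theta) \Big) \star \Phi^d_i(C_i | \theta); \qquad F^d_3(\emptyset | \theta) = D^d L(\theta) \]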
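Example:
\[ \Phi^d_2(G_3, G_4, G_6 | \theta) = D^d \varphi_3(G_3; \theta) \star D^d \varphi_4(G_4; \theta) \star D^d \varphi_6(G_6 | G_3, G_4; \theta) \]
\[ F^d_3(\emptyset | \theta) = \sum_{G_5} \sum_{G_6} \sum_{G_7} F^d_1(G_5 | \theta) \star F^d_2(G_6 | \theta) \star \Phi^d_3(G_7 | G_5, G_6; \theta) \]
\[ F^d_3(\emptyset | \theta) = D^d L(\theta) = L(\theta) + L'(\theta) z + \ldots + L^{(d)}(\theta) z^d \]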
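To make the polynomial recursion concrete, here is a small sketch on the seven-person pedigree with d = 2. It is a toy model, not the authors' implementation: θ is taken to be the frequency of a hypothetical allele "a" at a biallelic locus, founders follow Hardy-Weinberg proportions, and the evidence table `ALLOWED` is invented for the example.

```python
from math import comb

G = (0, 1, 2)   # toy genotypes: copies of a hypothetical allele "a"
D = 2           # derivative order (the founder Dgf below is written for d = 2)
ZERO = [0.0] * (D + 1)
ALLOWED = {1: {0, 1}, 2: {1}, 3: {0, 1, 2}, 4: {1, 2},
           5: {0, 1, 2}, 6: {0, 1, 2}, 7: {2}}  # hypothetical evidence

def star(a, b):
    """Leibniz product of two Dgfs of order D."""
    return [sum(comb(k, i) * a[i] * b[k - i] for i in range(k + 1))
            for k in range(D + 1)]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def founder(u, g, th):
    """Dgf of the Hardy-Weinberg prior (value, first and second derivative in th)."""
    if g not in ALLOWED[u]:
        return ZERO
    return ([(1 - th) ** 2, -2 * (1 - th), 2.0],
            [2 * th * (1 - th), 2 - 4 * th, -4.0],
            [th ** 2, 2 * th, 2.0])[g]

def child(u, g, gf, gm):
    """Dgf of the Mendelian transmission: constant in th, so a degree-0 polynomial."""
    if g not in ALLOWED[u]:
        return ZERO
    tf, tm = gf / 2, gm / 2
    p = ((1 - tf) * (1 - tm), tf * (1 - tm) + (1 - tf) * tm, tf * tm)[g]
    return [p, 0.0, 0.0]

def message(th, fa, mo, ch):
    """Polynomial message F^d(G_ch): sum over both founder parents of the clique potential."""
    out = {}
    for gc in G:
        acc = ZERO
        for ga in G:
            for gb in G:
                clique = star(star(founder(fa, ga, th), founder(mo, gb, th)),
                              child(ch, gc, ga, gb))
                acc = add(acc, clique)
        out[gc] = acc
    return out

def dgf_likelihood(th):
    """Last message: [L(th), L'(th), L''(th)]."""
    F1, F2 = message(th, 1, 2, 5), message(th, 3, 4, 6)
    acc = ZERO
    for g5 in G:
        for g6 in G:
            for g7 in G:
                acc = add(acc, star(star(F1[g5], F2[g6]), child(7, g7, g5, g6)))
    return acc

print(dgf_likelihood(0.3))
```

The structure is identical to the scalar sum-product: only the product of numbers is replaced by `star` and the sum of numbers by a coefficient-wise sum.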
Comparison with empirical computation
Empirical computation
\[ L'(\theta) \simeq \frac{L(\theta + h_1) - L(\theta)}{h_1}; \qquad L''(\theta) \simeq \frac{L(\theta + 2h_2) - 2L(\theta + h_2) + L(\theta)}{h_2^2} \]
for small enough $h_1$ and $h_2$.
Complexities
Let C be the complexity of computing $L(\theta) = P(\mathrm{ev} | \theta)$.
Unidimensional parameter and arbitrary d → $L(\theta), L'(\theta), \ldots, L^{(d)}(\theta)$: complexity = $O(C \times d^2)$
Multidimensional parameter (p > 1) and d = 2 → $L(\theta), \nabla L(\theta), \mathrm{Hess}\, L(\theta)$: complexity = $O(C \times p^2)$
Similar to the empirical method, with two main advantages:
  Exact derivatives
  No need to choose the step h and converge iteratively to the solution
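The dependence of the empirical method on the step can be seen on a function whose derivatives are known in closed form. The sketch below uses the forward differences written above on a toy likelihood L(θ) = θ^r (1 − θ)^nr; the function and the counts r = 2, nr = 8 are assumptions chosen for the illustration.

```python
def L(theta, r=2, nr=8):
    """Toy likelihood with a known closed form (assumed for this example)."""
    return theta ** r * (1 - theta) ** nr

def exact_derivatives(theta, r=2, nr=8):
    """Closed-form L' and L'' obtained from the derivative of log L."""
    s = r / theta - nr / (1 - theta)                  # (log L)'
    s_prime = -r / theta ** 2 - nr / (1 - theta) ** 2  # (log L)''
    L0 = L(theta, r, nr)
    return L0 * s, L0 * (s * s + s_prime)

def forward_differences(f, theta, h1, h2):
    """Empirical first and second derivatives with steps h1 and h2."""
    d1 = (f(theta + h1) - f(theta)) / h1
    d2 = (f(theta + 2 * h2) - 2 * f(theta + h2) + f(theta)) / h2 ** 2
    return d1, d2

theta = 0.1
exact1, exact2 = exact_derivatives(theta)
for h in (1e-2, 1e-4, 1e-6):
    d1, d2 = forward_differences(L, theta, h, h)
    print(f"h={h:g}  error on L'={abs(d1 - exact1):.2e}  error on L''={abs(d2 - exact2):.2e}")
```

The printed errors change with h, which is the tuning issue the exact polynomial computation avoids.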
Application: two-point linkage in genetics
Goal: locate a gene of interest on the genome.
Trick: use crossovers (CO), a phenomenon that happens during meiosis.
[Figure: a paternal chromosome (Xp, Mp) and a maternal chromosome (Xm, Mm) with the four possible gametes Xp Mp, Xp Mm, Xm Mp, Xm Mm, each labeled R (recombinant) or NR (non-recombinant)]
Assumption: the closer two genes are on the genome, the less likely they are to be separated during meiosis.
Parameter:
\[ \theta = \frac{\#R}{\#R + \#NR} = P(\text{odd } C) \text{ with } C = \#CO \]
Genetic distance h - Haldane, J. B. S. (1919).
Genetic distance h:
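Assumption: $C \sim \mathcal{P}(h)$ (Poisson distribution) ⇒ $h = E[C]$
\[ \theta = P(\text{odd } C) = \sum_{k=0}^{\infty} P(C = 2k + 1) = \sum_{k=0}^{\infty} \frac{e^{-h} h^{2k+1}}{(2k+1)!} = \frac{1 - e^{-2h}}{2} \]
\[ h = -\frac{\log(1 - 2\theta)}{2} \]
expressed in morgans (M) or centimorgans (cM)
LOD score:
\[ \theta \xrightarrow[h \to \infty]{} \frac{1}{2} \qquad \mathrm{LOD}(\theta) = \log_{10} \left( \frac{L(\theta)}{L(\frac{1}{2})} \right) \]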
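The relation between θ and h above can be checked numerically. This is a small sketch with hypothetical function names; the truncated series is only there to confirm the closed form.

```python
from math import exp, factorial, log

def theta_from_h(h):
    """Recombination fraction from the genetic distance h (in morgans), closed form."""
    return (1 - exp(-2 * h)) / 2

def h_from_theta(theta):
    """Genetic distance (in morgans) from the recombination fraction, for theta < 1/2."""
    return -log(1 - 2 * theta) / 2

def theta_series(h, terms=30):
    """P(odd C) for C ~ Poisson(h), summed term by term (truncated series)."""
    return sum(exp(-h) * h ** (2 * k + 1) / factorial(2 * k + 1) for k in range(terms))

h = 0.3
print(theta_from_h(h), theta_series(h))  # the two values should agree
print(h_from_theta(theta_from_h(h)))     # should give back h
```

As h grows, `theta_from_h(h)` approaches 1/2, the value used as the reference in the LOD score.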
Example with the KUS family (MENDEL package)
Locus of interest (RADIN):
  $X_p \in \{1, 0\}$, $X_m \in \{1, 0\}$ →
    11 → lethal before birth
    10 or 01 → Y = 1, black filling
    00 → Y = 0, no filling
Marker PGM1:
  $M_p \in \{1+, 1-, 2+, 2-\}$, $M_m \in \{1+, 1-, 2+, 2-\}$ → 16 configurations
  $G \in \{1{+}1{+}, 1{+}1{-}, \ldots, 2{-}2{-}\}$, 10 values
Pedigree (familial structure)
[Figure: pedigree of the KUS family with the observed PGM1 genotypes (such as 1+1-, 1-2-, 1+2+, 1+2-, 1-1-, 2+2-) of individuals B1, M1, Y1, C1, C2, Y2, C3, C4, Y4, D1 to D8, D11, D12, D13, D16 and D17]
Variables in two-point linkage
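[Figure: Bayesian network with nodes $Y_i$, $Xp_i$, $Xm_i$, $Mp_i$, $Mm_i$, $G_i$ for individuals i = 1, 2, 3, the selectors $SXp_3$, $SXm_3$, $SMp_3$, $SMm_3$, and the parameter Θ]
Known structure and observed G and Y.
Latent: Xp and Xm ∈ {1, 0}; Mp and Mm ∈ {1+, 1-, 2+, 2-}; SXp, SXm, SMp, SMm ∈ {pat, mat}
\[ \varphi_{SXp_3}(SXp_3 = \mathrm{pat}) = \varphi_{SXp_3}(SXp_3 = \mathrm{mat}) = 0.5 \]
\[ \varphi_{SMp_3}(SMp_3 = SXp_3 | SXp_3) = 1 - \theta \]
\[ \varphi_{SMp_3}(SMp_3 \ne SXp_3 | SXp_3) = \theta \]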
Build polynomial potentials
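Polynomials of degree 0: all potentials except those related to the selectors of M
Polynomials of strictly positive degree: potentials related to the selectors of M
\[ D^2 \varphi_{SMp_3}(SMp_3 = SXp_3 | SXp_3) = (1 - \theta) - z \]
\[ D^2 \varphi_{SMp_3}(SMp_3 \ne SXp_3 | SXp_3) = \theta + z \]
Reparametrization: $\theta = e^\beta / (1 + e^\beta)$
\[ D^2 \varphi_{SMp_3}(SMp_3 = SXp_3 | SXp_3) = \frac{1}{1 + e^\beta} - \frac{e^\beta}{(1 + e^\beta)^2} z - \frac{e^\beta (1 - e^\beta)}{(1 + e^\beta)^3} z^2 \]
\[ D^2 \varphi_{SMp_3}(SMp_3 \ne SXp_3 | SXp_3) = \frac{e^\beta}{1 + e^\beta} + \frac{e^\beta}{(1 + e^\beta)^2} z + \frac{e^\beta (1 - e^\beta)}{(1 + e^\beta)^3} z^2 \]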
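The polynomial potentials above can be multiplied with the Leibniz product to see how derivatives accumulate. The sketch below considers a deliberately simplified situation that is not on the slides: phase-known meioses with hypothetical counts of 2 recombinants and 8 non-recombinants, so that L(θ) = θ²(1 − θ)⁸ and the result can be checked by hand.

```python
from math import comb, exp, log

def star(a, b):
    """Leibniz product of two Dgfs of order 2."""
    return [sum(comb(k, i) * a[i] * b[k - i] for i in range(k + 1)) for k in range(3)]

def potentials_theta(theta):
    """D^2 potentials of a selector of M in the theta parametrization."""
    same = [1 - theta, -1.0, 0.0]  # selector of M equals selector of X
    diff = [theta, 1.0, 0.0]       # selector of M differs from selector of X
    return same, diff

def potentials_beta(beta):
    """D^2 potentials in the beta parametrization, theta = e^beta / (1 + e^beta)."""
    e = exp(beta)
    d1 = e / (1 + e) ** 2
    d2 = e * (1 - e) / (1 + e) ** 3
    same = [1 / (1 + e), -d1, -d2]
    diff = [e / (1 + e), d1, d2]
    return same, diff

def dgf_product(same, diff, n_rec, n_nonrec):
    """Star-product of n_rec 'diff' potentials and n_nonrec 'same' potentials."""
    acc = [1.0, 0.0, 0.0]
    for _ in range(n_rec):
        acc = star(acc, diff)
    for _ in range(n_nonrec):
        acc = star(acc, same)
    return acc

theta = 0.1
beta = log(theta / (1 - theta))
Lt = dgf_product(*potentials_theta(theta), n_rec=2, n_nonrec=8)  # [L, dL/dtheta, d2L/dtheta2]
Lb = dgf_product(*potentials_beta(beta), n_rec=2, n_nonrec=8)    # [L, dL/dbeta, d2L/dbeta2]
print(Lt)
print(Lb)
```

Both runs give the same likelihood value; only the derivative coefficients differ, since they are taken with respect to different parameters.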
Results: Examples of the last polynomial message
Recall: $\theta = \frac{e^\beta}{1 + e^\beta}$; $\beta = \log \frac{\theta}{1 - \theta}$
We define $\ell(\beta) = \log L\left( \frac{e^\beta}{1 + e^\beta} \right)$

θ       F^2_K(∅|θ) = D^2 ℓ(θ)                  F^2_K(∅|β) = D^2 ℓ(β)
0.001   -46.68 + 984.0 z - 1·10^6 z^2          -46.68 + 0.9830 z - 17·10^-3 z^2
0.01    -44.52 + 83.84 z - 100^2 z^2           -44.52 + 0.8300 z - 17·10^-2 z^2
0.05    -43.57 + 3.159 z - 417.7 z^2           -43.57 + 0.1500 z - 0.8074 z^2
0.1     -43.74 - 7.771 z - 119.5 z^2           -43.75 - 0.6993 z - 1.528 z^2
0.15    -44.25 - 12.12 z - 65.80 z^2           -44.25 - 1.546 z - 2.152 z^2
0.2     -44.93 - 14.90 z - 47.94 z^2           -44.93 - 2.384 z - 2.658 z^2
0.3     -46.63 - 18.90 z - 33.49 z^2           -46.63 - 3.969 z - 3.065 z^2
0.4     -48.66 - 21.42 z - 14.72 z^2           -48.67 - 5.140 z - 1.876 z^2
0.5     -50.84 - 22.00 z - 4.000 z^2           -50.85 - 5.5000 z - 0.2500 z^2
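The two columns of the table are linked by the chain rule, since dθ/dβ = θ(1 − θ) and d²θ/dβ² = θ(1 − θ)(1 − 2θ). The sketch below shows this conversion, together with the usual passage from the derivatives of L to those of log L; both helper names are hypothetical, and the check uses rounded values read from the table.

```python
from math import log

def dgf_log(L0, L1, L2):
    """Derivatives of l = log L up to order 2 from those of L."""
    return log(L0), L1 / L0, L2 / L0 - (L1 / L0) ** 2

def theta_to_beta(theta, l1, l2):
    """First and second derivatives in beta from those in theta (chain rule)."""
    t = theta * (1 - theta)  # d theta / d beta
    return l1 * t, l2 * t * t + l1 * t * (1 - 2 * theta)

# Row theta = 0.05 of the table: l'(theta) = 3.159, l''(theta) = -417.7
print(theta_to_beta(0.05, 3.159, -417.7))  # close to (0.1500, -0.8074)
```

Small discrepancies in the last digit are expected because the table entries are rounded.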
Results: Derivatives
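[Figure: LOD score as a function of beta, with beta from -7 to 0 on the x-axis and LOD from 0 to 3 on the y-axis]
\[ \mathrm{LOD}(\beta) = \frac{\ell\left( \frac{e^\beta}{1 + e^\beta} \right) - \ell(0.5)}{\log(10)} \]
\[ \hat\beta = -2.772409 \Rightarrow \hat\theta = 0.05883346 \]
\[ \mathrm{LOD}(\hat\beta) = 3.164775 \]
\[ \frac{\partial \mathrm{LOD}}{\partial \beta}(0.0) = -2.38862 \]
\[ \frac{\partial^2 \mathrm{LOD}}{\partial \beta^2}(\hat\beta) = -0.4104178 \]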
Black dots computed with Mendel 16.0 (Reference software)
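Results: Confidence intervals and tests

               θ (95% CI)             LR          W           S
KUS (n = 22)   0.059 (0.008, 0.320)   14.574      7.264       32.010
ALL (n = 93)   0.193 (0.106, 0.326)   17.012      15.821      12.900
p-values KUS                          1.3·10^-4   7.0·10^-3   1.5·10^-8
p-values ALL                          3.7·10^-5   7.0·10^-5   3.3·10^-4

\[ O(\beta) = -\ell''(\beta) \qquad I(\beta) = E[O(\beta) | \beta] \]
Likelihood ratio test statistic:
\[ LR = 2(\ell(\hat\beta) - \ell(\beta_0)) \underset{H_0}{\sim} \chi^2(df = 1) \]
Wald’s test statistic (the expected information is replaced by the observed information):
\[ W = \cancel{I(\beta_0)(\hat\beta - \beta_0)^2} = O(\hat\beta)(\hat\beta - \beta_0)^2 \underset{H_0}{\sim} \chi^2(df = 1) \]
Score test statistic (same replacement):
\[ S = \cancel{\frac{\ell'(\beta_0)^2}{I(\beta_0)}} = \frac{\ell'(\beta_0)^2}{O(\hat\beta)} \underset{H_0}{\sim} \chi^2(df = 1) \]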
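As a sanity check, the three statistics can be recomputed from the derivative values reported on the "Results: Derivatives" slide, taking β0 = 0 (θ = 0.5). The sketch below assumes those values belong to the KUS family and that the estimate is the maximum likelihood estimate; the chi-square tail probability with one degree of freedom is written with `math.erfc` to avoid any external dependency.

```python
from math import erfc, log, sqrt

LN10 = log(10.0)

def chi2_sf_1df(x):
    """P(chi2 with 1 degree of freedom > x), i.e. P(|Z| > sqrt(x)) for Z standard normal."""
    return erfc(sqrt(x / 2.0))

# Values copied from the "Results: Derivatives" slide
beta_hat, beta_0 = -2.772409, 0.0
lod_hat = 3.164775      # LOD at beta_hat
dlod_0 = -2.38862       # first derivative of LOD at beta_0 = 0
d2lod_hat = -0.4104178  # second derivative of LOD at beta_hat

O_hat = -d2lod_hat * LN10                # observed information, -l''(beta_hat)
LR = 2 * LN10 * lod_hat                  # 2 (l(beta_hat) - l(beta_0))
W = O_hat * (beta_hat - beta_0) ** 2     # Wald statistic
S = (dlod_0 * LN10) ** 2 / O_hat         # score statistic

for name, stat in (("LR", LR), ("W", W), ("S", S)):
    print(f"{name} = {stat:.3f}, p-value = {chi2_sf_1df(stat):.1e}")
```

The factor log(10) converts derivatives of the LOD score back into derivatives of the natural log-likelihood ℓ.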
Summary and perspectives
Summary:
  Sum-product with polynomials and “Leibniz product”
  Univariate: derivatives up to order d in $O(d^2 \times C)$
  Multivariate $\theta \in \mathbb{R}^p$: gradient and Hessian in $O(p^2 \times C)$
  Confidence intervals, theoretical distributions under H0
  A. Lefebvre and G. Nuel. A sum-product algorithm with polynomials for computing exact derivatives of the likelihood in Bayesian networks. Proceedings of Machine Learning Research 72:201-212, 2018.
Perspectives:
  Joint estimation of allele frequencies and θ
  Extensive simulations under H0
  Other genetic models: segregation, TDT, IBD, etc.
  Applications to the local score of a sequence
Acknowledgement: this work was funded by the Ligue Nationale Contre le Cancer.