
Fast Simulation based Estimation for Complex Models

Maria-Pia Victoria-Feser1

1 Research Center for Statistics, University of Geneva

Workshop on Forecasting from Complexity - IMA, University of Minnesota

April 26, 2018

M.-P. Victoria-Feser Fast Simulation based Estimation April 26, 2018 1 / 24

Ongoing research with...

Stephane Guerrier, Department of Statistics, Penn State University

Guillaume Blanc, Research Center for Statistics, University of Geneva

Samuel Orso, Research Center for Statistics, University of Geneva

Mucyo Karema, Department of Statistics, Penn State University (from 09/18)

Motivation

With ever-increasing data size and model complexity, an important challenge in constructing new estimators, or in implementing classical ones, is the numerical side of the estimation procedure.

Parametric models with hundreds (or thousands) of parameters are common in the social, environmental and medical sciences. Fast and efficient model selection and estimation methods are necessary.

Classical estimation approaches need to be revisited. A potential tool is (parametric) simulation, for which (almost) only the data generating process is needed.

We propose simulation-based estimators, built within the framework of indirect inference, that are consistent and have small finite-sample bias.

Problem Setting

Inputs:

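A family of models $F_\theta$, $\theta \in \Theta \subset \mathbb{R}^p$,

An observed sample $X = [X_i]$, $i = 1, \ldots, n$, supposedly generated from $F_{\theta_0}$, $\theta_0 \in \Theta$,

A (working, auxiliary) estimator $\hat{\pi} : X \to \Pi \subset \mathbb{R}^r$, $r \geq p$,

A related estimating function (i.e. what is optimized) $\Psi_\pi$.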

Outputs:

Produce an estimator $\hat{\theta}$ that targets $\theta_0$ (consistency), through an (implicit) function of $\hat{\pi}$.

Use the estimator to produce an (estimated) probability distribution for inference purposes (e.g. confidence intervals, hypothesis testing, etc.).

Problem Setting

Setting:

Complex $F_\theta$: high dimensionality in $p$, hierarchical modelling, nonlinearity, complex dependences, no or approximate likelihood function, etc.

One can generate data from $F_\theta$.

Objectives:

Choose $\hat{\pi}$ (and $\Psi_\pi$) that is simple to compute numerically (e.g. obtained with successive steps) but does not need to be consistent.

Use simulations to obtain $\hat{\theta}$ from $\hat{\pi}$ and provide a working distribution for inference. (We assume there exists a bijective function $b$ such that $\theta = b(\pi)$.)

Problem Setting

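By hypothesis, the observed sample $X$ is a function of $\theta_0$ (unknown).

The sample has been drawn from an infinite sequence at, say, $\omega_0 \in \mathbb{R}^m$, $m \geq n$ (unknown), e.g. the seed when simulating.

Hence $X := X(\theta_0, n, \omega_0) \in \mathbb{R}^n$, and

$$\hat{\pi}(\theta_0, n, \omega_0) \equiv \operatorname*{argzero}_{\pi} \Psi_\pi\left[X(\theta_0, n, \omega_0), \pi\right]. \tag{1}$$

From $\hat{\pi}(\theta_0, n, \omega_0)$, recover by means of simulations an estimator for $\theta_0$.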

Simulation Based Estimation

Known strategies:

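Indirect inference: Ensures consistency and has a known limiting distribution (see e.g. Gourieroux et al., 1993).

In the just identified case ($\dim(\hat{\pi}(\theta_0, n, \omega_0)) = \dim(\theta_0)$):

$$\hat{\theta}_{(j,n,H)} \equiv \operatorname*{argzero}_{\theta} \; \hat{\pi}(\theta_0, n, \omega_0) - \frac{1}{H} \sum_{h=1}^{H} \hat{\pi}(\theta, n, \omega_{j+(h-1)H}), \tag{2}$$

That is, for given values of $\theta$, $H$ samples $X(\theta, n, \omega_{j+(h-1)H})$, $h = 1, \ldots, H$, are simulated, the first one with seed $\omega_j$ that is kept fixed to find the matching value for $\theta$ (using a numerical method).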

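Iterative bootstrap: Effective iterative method for removing (asymptotic) bias (see Guerrier et al., 2017).

Iterate until convergence:

$$\hat{\theta}^{(k)}_{(j,n,H)} = \hat{\pi}(\theta_0, n, \omega_0) + \hat{\theta}^{(k-1)}_{(j,n,H)} - \frac{1}{H} \sum_{h=1}^{H} \hat{\pi}\left(\hat{\theta}^{(k-1)}_{(j,n,H)}, n, \omega_{j+(h-1)H}\right) \tag{3}$$

Both estimators coincide.

The iterative bootstrap does not rely on numerical optimisation methods if $\hat{\pi}$ is of closed form (or a multiple-step estimator).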

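New: Use $H = 1$...

$$\hat{\theta}^{(k)}_{(j,n,1)} = \hat{\pi}(\theta_0, n, \omega_0) + \hat{\theta}^{(k-1)}_{(j,n,1)} - \hat{\pi}\left(\hat{\theta}^{(k-1)}_{(j,n,1)}, n, \omega_j\right) \tag{4}$$

Under some regularity conditions, $\hat{\theta}_{(j,n,1)}$ is equivalent to:

$$\hat{\theta}_{(j,n)} \equiv \operatorname*{argzero}_{\theta \in \Theta} \Psi_\pi\left[X(\theta, n, \omega_j), \hat{\pi}(\theta_0, n, \omega_0)\right]. \tag{5}$$

The optimization is reversed: for a given seed $\omega_j$, find the value of $\theta$ that produces the sample $X(\theta, n, \omega_j)$ such that $\Psi_\pi\left[X(\theta, n, \omega_j), \hat{\pi}(\theta_0, n, \omega_0)\right] = 0$.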

Remarks:

If $\hat{\pi}$ is of closed form, the estimation procedure can be very fast!

The choice of $\hat{\pi}$ is therefore decisive.

$\hat{\pi}$ can be defined as a multi-step estimator.

To obtain an (approximate) distribution for inference, one can obtain several $\hat{\theta}_{(j,n,1)}$ by varying the seed $\omega_j$ (experimental).
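The iteration with $H = 1$ can be illustrated on a toy problem. The sketch below is not taken from the talk: it assumes a model $N(0, \theta)$ and, as auxiliary estimator, the biased (divide-by-$n$) sample variance. The function names `simulate`, `pi_hat` and `iterative_bootstrap` are hypothetical, and the use of the median across seeds as a point estimate is only one possible summary.

```python
import numpy as np

def simulate(theta, n, seed):
    """Draw X(theta, n, omega): n values from N(0, theta), reproducible for a given seed."""
    rng = np.random.default_rng(seed)
    return np.sqrt(theta) * rng.standard_normal(n)

def pi_hat(x):
    """Auxiliary estimator (assumed for this toy case): biased sample variance."""
    return np.mean((x - x.mean()) ** 2)

def iterative_bootstrap(pi_obs, n, seed, max_iter=500, tol=1e-12):
    """Iterate theta <- theta + pi_obs - pi_hat(X(theta, n, seed)), i.e. H = 1."""
    theta = pi_obs  # start from the auxiliary estimate
    for _ in range(max_iter):
        theta_new = theta + pi_obs - pi_hat(simulate(theta, n, seed))
        if abs(theta_new - theta) < tol:
            return theta_new
        theta = theta_new
    return theta

n = 50
x_obs = simulate(2.0, n, seed=0)  # plays the role of X(theta_0, n, omega_0)
pi_obs = pi_hat(x_obs)

# One estimate per simulation seed; the spread across seeds gives a working distribution.
theta_draws = np.array([iterative_bootstrap(pi_obs, n, seed=j) for j in range(1, 201)])
theta_point = np.median(theta_draws)
```

In this toy case the simulated auxiliary estimate is linear in $\theta$ for a fixed seed, so the fixed point is available in closed form (the observed value divided by the auxiliary estimate computed on the standard normal draws), which makes the iteration easy to check.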

Logistic Regression

Experimental Example - Logistic Regression

Consider the logistic regression model with

response $y$ (with elements $y_i$, $i = 1, \ldots, n$),

linear predictor $X\beta$, $X$ being an $n \times p$ matrix of fixed covariates with rows $x_i$, $i = 1, \ldots, n$,

logit link $E[y_i] = \mu_i = \exp(x_i\beta) / (1 + \exp(x_i\beta))$.

Estimation is performed using Iteratively Reweighted Least Squares (IRLS) as implemented in R (glm function):

$$\hat{\beta}^{(k)} \equiv \hat{\beta}^{(k-1)} + J^{-1}\left(\hat{\beta}^{(k-1)}\right) S\left(\hat{\beta}^{(k-1)} \mid X, y\right), \tag{6}$$

where $S$ is the score function and $J$ is the negative of the Hessian matrix (evaluated at the current value of $\beta$), which requires numerous inversions (through QR decompositions) of potentially large matrices.
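To make the cost of update (6) concrete, here is a minimal Newton-Raphson sketch for logistic regression. It is an illustration under assumptions, not R's glm implementation: it solves the $p \times p$ system with `numpy.linalg.solve` rather than a QR decomposition, starts from zero, and runs a fixed number of iterations; the function name `logistic_newton` and the simulated data are hypothetical.

```python
import numpy as np

def logistic_newton(X, y, n_iter=25):
    """Newton-Raphson / IRLS updates beta <- beta + J^{-1}(beta) S(beta | X, y)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))     # fitted probabilities
        score = X.T @ (y - mu)                   # S(beta | X, y)
        w = mu * (1.0 - mu)                      # IRLS weights
        J = X.T @ (X * w[:, None])               # negative Hessian, p x p
        beta = beta + np.linalg.solve(J, score)  # a new linear solve at every iteration
    return beta

# Small simulated data set (illustrative values only).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
beta_true = np.array([0.5, -1.0, 0.25])
y = (rng.uniform(size=200) < 1.0 / (1.0 + np.exp(-X @ beta_true))).astype(float)
beta_mle = logistic_newton(X, y)
```

The point of the sketch is that the matrix `J` changes with `beta`, so a linear system has to be solved at each iteration.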

Logistic Regression

Instead, we use the auxiliary (non consistent) estimator

$$\hat{\pi}(\beta_0, n, \omega_0) = \left(X^T X\right)^{-1} X^T y(\beta_0, n, \omega_0),$$

i.e. the LS.

The consistent (indirect) estimator is defined iteratively as

$$\hat{\beta}^{(k)}_{(j,n)} \equiv \hat{\beta}^{(k-1)}_{(j,n)} + \left[\hat{\pi}(\beta_0, n, \omega_0) - \left(X^T X\right)^{-1} X^T y\left(\hat{\beta}^{(k-1)}_{(j,n)}, n, \omega_j\right)\right],$$

The numerical aspects are reduced to only one inversion of a (potentially high-dimensional) matrix ($X^T X$).

The data $y(\beta, n, \omega_j)$ can be obtained through the inverse distribution function $F^{-1}_{\beta}(u_j)$, where $u_j = (u_{ij})$, $i = 1, \ldots, n$, are simulated from a uniform distribution only once with (arbitrary) seed $\omega_j$.
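The iteration above can be sketched as follows. This is a hedged illustration rather than the authors' code: the names `simulate_y` and `ib_logistic` are hypothetical, responses are generated with the rule $y_i = 1$ if $u_i < \mu_i(\beta)$ (one convention for inverting the Bernoulli distribution function), the starting value is the LS estimate, and a fixed number of iterations replaces a convergence check.

```python
import numpy as np

def simulate_y(beta, X, u):
    """Generate y(beta, n, omega_j) from fixed uniforms u: y_i = 1 if u_i < mu_i(beta)."""
    eta = np.clip(X @ beta, -30.0, 30.0)  # guard against overflow in exp
    mu = 1.0 / (1.0 + np.exp(-eta))
    return (u < mu).astype(float)

def ib_logistic(X, y_obs, seed, n_iter=200):
    """Iterative bootstrap with H = 1 and the LS auxiliary estimator."""
    n, p = X.shape
    # (X^T X)^{-1} X^T is computed once and reused at every iteration.
    P = np.linalg.solve(X.T @ X, X.T)
    pi_obs = P @ y_obs
    u = np.random.default_rng(seed).uniform(size=n)  # uniforms drawn only once
    beta = pi_obs.copy()                             # assumed starting value
    for _ in range(n_iter):
        beta = beta + (pi_obs - P @ simulate_y(beta, X, u))
    return beta

# Illustrative data (values are assumptions, not those of the talk).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
beta_true = np.array([0.5, -1.0, 0.25])
y_obs = simulate_y(beta_true, X, rng.uniform(size=200))
beta_ib = ib_logistic(X, y_obs, seed=1)
```

Because the simulated responses are piecewise constant in $\beta$ for fixed uniforms, the iterates in this sketch may hover around a solution rather than reach an exact fixed point, which is why the loop runs for a fixed number of steps.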

Logistic Regression

Simulation:

$n = 50$

$p = 30$ (20 covariates have null slope coefficients)

1000 Monte Carlo simulations

Compute the MLE (using the IRLS as implemented in the glm function in R) and the associated confidence intervals (from asymptotic theory) to obtain coverage probabilities.

Compute $\hat{\beta}_{(j,n)}$ (IB) for 1000 seeds, take the median and use the 1000 replicates to obtain coverage probabilities.

Logistic Regression

[Figure: two panels comparing IRLS and IB, one for mean bias and one for coverage.]

Generalized Linear Latent Variable Model

Example - Generalized Linear Latent Variable Models

Generalized Linear Latent Variable Models (GLLVM) are very popular in various areas of research in social and behavioural science, and also in life sciences (ecology).

Basically, latent variable models include latent (non observable) variables to account for the dependence structure between (and also within) the manifest (observed) variables.

In particular, the GLLVM generalizes factor analysis to manifest variables that are not normally distributed.

Complex models: large number of manifest variables, complex survey design (e.g. panel data and household surveys) inducing complex relations/dependences, etc.

Examples: education surveys such as the OECD Programme for International Student Assessment (PISA) and the Survey of Adult Skills (PIAAC), media audiences, etc.

Let $z^{(k)}$, $k = 1, \ldots, q$, be the latent variables and $x^{(l)}$, $l = 1, \ldots, p$, be the manifest variables, $p > q$.

The conditional density $g_l(x^{(l)} \mid z)$ is assumed to belong to the exponential family but can be different for different $l$.

Let the link function (corresponding to $g_l$, $l = 1, \ldots, p$) be

$$\nu_l\left(E(x^{(l)} \mid z_{(2)})\right) = \lambda^{(l)T} z,$$

where

$z = (1, z_{(2)}^T)^T$, $z_{(2)} = (z^{(1)}, \ldots, z^{(q)})^T$,

$\lambda^{(l)} = (\lambda^{(l)}_0, \ldots, \lambda^{(l)}_q)^T = (\lambda^{(l)}_0, \lambda^{(l)T}_{(2)})^T$,

$\lambda^{(l)}_{(2)}$ are the loadings,

$h(z_{(2)})$ is the multivariate standard normal.

Assumption: conditionally on the latent variables, the manifest variables are independent of each other.

Hence, given a sample of $n$ observations $x_1, \ldots, x_n$ where $x_i = (x^{(1)}_i, \ldots, x^{(p)}_i)^T$, $i = 1, \ldots, n$, the log-likelihood $l(\lambda, \phi \mid x)$ of the loadings $\lambda$ and the scale parameters $\phi$ is

$$\sum_{i=1}^{n} \log \int \cdots \int \left[ \prod_{l=1}^{p} \exp\left\{ \frac{x^{(l)}_i u_l(\lambda^{(l)T} z) - b_l\left(u_l(\lambda^{(l)T} z)\right)}{\phi_l} + c_l\left(x^{(l)}_i, \phi_l\right) \right\} \right] h(z_{(2)}) \, dz_{(2)}.$$

The corresponding likelihood function contains multidimensional integrals that have no analytical simplification.

Approximation methods such as adaptive Gaussian quadrature (see e.g. Rabe-Hesketh et al., 2002), the Laplace approximation (Huber et al., 2004), integrated nested Laplace approximations (Rue et al., 2009) or the fully exponential Laplace approximation (Bianconcini and Cagnone, 2012) can be used.

Composite likelihoods (see Lindsay, 1988) are alternative target functions that are also used.

Even so, optimizing the (approximated) likelihood function or other functions can be very challenging because the models can be excessively large in the number of parameters...

Example - Generalized Linear Latent Variable Models

Example: Exploratory factor analysis with binary outcomes

We observe $p$ binary manifest variables and suppose $q$ latent variables with free structure (up to some identifiability constraints).

Consider the auxiliary (non consistent) estimator that is based on the normal (linear) model: $x = \Lambda z + \varepsilon$, $\varepsilon \sim N(0, \Sigma)$, with $\Sigma$ a diagonal matrix of residual variances.

Using the EM algorithm, it is obtained by iterating between the following two steps.

Obtain

$$\hat{z}_i = \left(\Lambda^T \Sigma^{-1} \Lambda + I\right)^{-1} \Lambda^T \Sigma^{-1} x_i, \tag{7}$$

and adjust the scale of the $\hat{z}_i$ to get the rescaled $\tilde{z}_i$.

Obtain

$$\Lambda = \left[\sum_{i=1}^{n} \tilde{z}_i \tilde{z}_i^T\right]^{-1} \sum_{i=1}^{n} \tilde{z}_i x_i^T, \tag{8}$$

Note that $\hat{z}_i$ requires only the inversion of the $p \times p$ diagonal matrix $\Sigma$ and the $q \times q$ matrix $(\Lambda^T \Sigma^{-1} \Lambda + I)$.
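The computational remark about (7) can be made concrete with a short sketch. It covers only the score step (7), not the rescaling or the update (8), and it assumes that $\Sigma$ is stored as a vector of residual variances; the function name `factor_scores` and the random inputs are hypothetical.

```python
import numpy as np

def factor_scores(X, Lam, sigma2):
    """Scores from (7) for all rows of X at once.

    X: n x p data, Lam: p x q loadings, sigma2: length-p residual variances.
    """
    q = Lam.shape[1]
    W = Lam / sigma2[:, None]   # Sigma^{-1} Lam; elementwise because Sigma is diagonal
    A = Lam.T @ W + np.eye(q)   # q x q matrix (Lam^T Sigma^{-1} Lam + I)
    # Only a q x q system is solved, whatever the value of p.
    return np.linalg.solve(A, W.T @ X.T).T  # n x q

# Illustrative inputs (random values, for shape checking only).
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 4))
Lam = rng.standard_normal((4, 2))
sigma2 = rng.uniform(0.5, 1.5, size=4)
Z = factor_scores(X, Lam, sigma2)
```

With $p = 500$ and $q = 5$, as in the simulation reported in the talk, the only dense system solved in this step would be $5 \times 5$.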

A consistent estimator is obtained using the iterative bootstrap with $H = 1$ by simulating from a GLLVM with binary outcomes.

Actually, one simulates uniform realizations (only once) and the samples are generated using $F^{-1}_{\lambda,\phi}$, with $\lambda, \phi$ evaluated at the current values $\hat{\lambda}^{(k)}_{(j,n)}$ and $\hat{\phi}^{(k)}_{(j,n)}$ of the iterative bootstrap.

Simulation:

$n = 100$

$q = 5$ factors, $p = 500$ manifest variables

100 Monte Carlo simulations from a GLLVM with binary outcomes

Values for the 2500 factor loadings were generated from a standard normal distribution; for each vector of loadings $\lambda^{(l)}$, $l = 1, \ldots, p$, up to one third of the loadings, chosen randomly, were set to 0.

There are ≈ 3000 parameters to estimate.

For each sample, the estimator is obtained in less than 30 seconds (without computational optimization).

[Figure: loadings.]

The loadings displayed (9 among 2500) have been selected randomlyamong factor loadings below zero, equal to zero, and above zero.

Conclusion

Thank you very much for your attention!

Thanks to the organizers for the invitation.


References

S. Bianconcini and S. Cagnone. Estimation of generalized linear latent variable models via fully exponential Laplace approximation. Journal of Multivariate Analysis, 112:183–193, 2012.

C. Gourieroux, A. Monfort, and A. E. Renault. Indirect inference. Journal of Applied Econometrics, 8 (supplement):S85–S118, 1993.

S. Guerrier, E. Dupuis-Lozeron, Y. Ma, and M.-P. Victoria-Feser. Simulation based bias correction methods for complex models. Journal of the American Statistical Association, (online version: http://dx.doi.org/10.1080/01621459.2017.1380031), 2017.

P. Huber, E. Ronchetti, and M.-P. Victoria-Feser. Estimation of generalized linear latent variable models. Journal of the Royal Statistical Society, Series B, 66:893–908, 2004.

B. Lindsay. Composite likelihood methods. Contemporary Mathematics, 80:221–239, 1988.

S. Rabe-Hesketh, A. Skrondal, and A. Pickles. Reliable estimation of generalized linear mixed models using adaptive quadrature. The Stata Journal, 2:1–21, 2002.

H. Rue, S. Martino, and N. Chopin. Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. Journal of the Royal Statistical Society, Series B, 71:319–392, 2009.
