Journal of Statistical Planning and Inference 100 (2002) 81–84

www.elsevier.com/locate/jspi

Book review

Geometrical foundations of asymptotic inference
Robert E. Kass and Paul W. Vos; Wiley, New York, 1997, pp. 355+xii, ISBN: 0-471-82668-5

For a reader with the appropriate mathematical background, the book by Kass and Vos is an extremely readable and, in the reviewer's opinion, mathematically beautiful treatise on the insights differential geometry can yield for asymptotic inference. A significant accomplishment of this book is that it generally expresses itself through pictures and ideas and not through unintelligible formulas in symbols involving four subscripts and superscripts. I personally have learned much from this book and believe that it will yield new insights for my own research.

The first half of this book deals with asymptotic inference for curved exponential families. Consider a regular exponential family of order k whose densities are of the form

p(y | η) = exp{y^T η − ψ(η)} h(y),   (1)

where y ∈ R^k, η ∈ N ⊆ R^k. A subfamily of (1), parameterized by θ ∈ Θ ⊆ R^m, is a curved exponential family if the map θ → η(θ) imbeds Θ as a submanifold of R^k.
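To fix ideas, the simplest instance of form (1) is worth writing out (the following unpacking is not quoted from the book): for Y ~ N_k(μ, I_k), completing the square in the Gaussian density gives

```latex
% Y ~ N_k(mu, I_k): rewriting the density in the form (1).
\begin{aligned}
p(y \mid \mu)
  &= (2\pi)^{-k/2}\exp\{-\tfrac{1}{2}\lVert y-\mu\rVert^{2}\}\\
  &= \exp\{y^{T}\mu - \tfrac{1}{2}\lVert\mu\rVert^{2}\}
     \cdot (2\pi)^{-k/2} e^{-\lVert y\rVert^{2}/2},
\end{aligned}
```

so that η = μ, ψ(η) = ½‖η‖², and h(y) = (2π)^{-k/2} e^{-‖y‖²/2}. The regression model (2) below, with σ = 1, is the curved subfamily obtained by restricting μ to the surface traced out by f.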

An example is the usual nonlinear regression model

Y_j = f(x_j; θ) + ε_j,   j = 1, …, k,   (2)

where the x_j are the covariates and, as usual, ε_1, …, ε_k are assumed i.i.d. N(0, σ²). In this model η(θ) = (f(x_1; θ), …, f(x_k; θ)). Under certain regularity conditions, primarily that the Jacobian of the map θ → η(θ) have full rank m, the family of densities (2) will be a curved exponential family of dimension m.
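The rank condition is easy to probe numerically. Here is a minimal sketch (not taken from the book), using the hypothetical one-parameter mean function f(x, θ) = exp(θx):

```python
import numpy as np

# Hypothetical mean function f(x, theta) = exp(theta * x); the curved
# family is eta(theta) = (f(x_1, theta), ..., f(x_k, theta)).
def eta(theta, x):
    return np.exp(theta * x)

def eta_dot(theta, x):
    # d/dtheta of exp(theta * x): the k x m Jacobian (here m = 1).
    return (x * np.exp(theta * x)).reshape(-1, 1)

x = np.array([0.5, 1.0, 1.5, 2.0])   # covariates (k = 4)
J = eta_dot(0.7, x)                  # Jacobian at theta = 0.7

# The regularity condition: the Jacobian must have full rank m = 1,
# i.e. eta_dot must be nonzero, so theta -> eta(theta) is an immersion.
print(np.linalg.matrix_rank(J))      # 1
```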

After a chapter of primarily notational preliminaries, Chapters 2 and 3 discuss the geometry of curved exponential families with m = dim(Θ) = 1. For the regular exponential family (1), Fisher information is reinterpreted as an inner product (which depends upon η) on R^k. The matrix of this inner product is the usual Fisher information matrix I(η), whose entries consist of the second partial derivatives, with respect to the components of η, of ψ. Let C be the curve in N which is the image of the map θ → η(θ). The curvature γ(θ) of C, as defined in differential geometry, is, following Efron (1975), called the statistical curvature of the curved exponential family.
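For readers who want the formula behind the picture, Efron's statistical curvature for m = 1 can be written in the following standard form (a summary in the notation above, not a quotation from the book):

```latex
% Efron's statistical curvature for m = 1, in one standard form:
% <u, v> is the information inner product u^T I(eta(theta)) v on R^k,
% and dots denote derivatives of eta(theta) with respect to theta.
\gamma(\theta)^2
  = \frac{\langle\dot\eta,\dot\eta\rangle\,\langle\ddot\eta,\ddot\eta\rangle
          - \langle\dot\eta,\ddot\eta\rangle^{2}}
         {\langle\dot\eta,\dot\eta\rangle^{3}}
```

When I(η) is the identity this reduces to the classical squared curvature of a parameterized curve from elementary differential geometry.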


An important property of γ(θ) is that it depends only upon C. In other words, if θ is reparameterized by, for example, θ̃ = arctan(θ), and if γ̃(θ̃) is the curvature function calculated using the parameter θ̃, then γ(θ) = γ̃(arctan(θ)).
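This invariance is easy to check numerically. A small sketch (not from the book), using a bivariate normal family with identity covariance, so that I(η) = I₂ and the information inner product is Euclidean, and the hypothetical curve η(θ) = (cos θ, sin θ), whose statistical curvature should be identically 1:

```python
import numpy as np

def gamma_sq(eta, theta, h=1e-4):
    # Efron's curvature formula via central finite differences
    # (m = 1, with I(eta) the identity, so <u, v> = u.v).
    d1 = (eta(theta + h) - eta(theta - h)) / (2 * h)
    d2 = (eta(theta + h) - 2 * eta(theta) + eta(theta - h)) / h**2
    v20, v11, v02 = d1 @ d1, d1 @ d2, d2 @ d2
    return (v20 * v02 - v11**2) / v20**3

eta = lambda t: np.array([np.cos(t), np.sin(t)])
theta = 0.3
# Reparameterize by theta = tan(phi): same curve C, new parameter phi.
eta_tilde = lambda p: eta(np.tan(p))

print(gamma_sq(eta, theta))                   # ~1.0 (unit circle)
print(gamma_sq(eta_tilde, np.arctan(theta)))  # ~1.0 as well
```

Both calls print approximately 1, as the invariance property predicts.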

Let ξ(θ) = E_{η(θ)}(Y), where Y has the density (1). Consider regular estimators of θ. These are estimators which, together with some nonsingularity conditions, are functions T(Ȳ) of the sample mean Ȳ such that T(ξ(θ)) = θ. The auxiliary spaces of T are the subsets of R^k on which T is constant. Let I_T(θ) be the Fisher information for T and nI(θ) = n (η′(θ))^T I(η(θ)) (η′(θ)) be the Fisher information for the full sample. Then the information loss in T can be expressed as

lim_{n→∞} I(θ)^{-1} [nI(θ) − I_T(θ)] = γ(θ)² + ½ δ(θ)²,   (3)

where δ(θ) is a curvature-type measurement of the auxiliary space at ξ(θ). The result (3) is due to Efron (1975). This result, and others from Chapters 2 and 3, are generalized to the case m > 1 in Chapter 4.

Chapters 5 and 6 present nonlinear regression in the exponential family context. Suppose Y_1, …, Y_k are independent, so that Y = [Y_1 … Y_k]^T has a density of the form (1). Eq. (2) is replaced by ξ_j = E(Y_j) = f(x_j; θ). This model includes, of course, the usual generalized linear model, in which f(x_j; θ) depends only upon x_j^T θ. Let M = {[f(x_1; θ) … f(x_k; θ)]^T}, a surface of dimension m in R^k, where m is the dimension of θ, and let ξ = [ξ_1 … ξ_k]^T. In normal distribution nonlinear regression, Y − ξ̂ is orthogonal to the tangent m-dimensional hyperplane of M at ξ̂. This geometry is generalized to exponential family nonlinear regression. A measure of influence, Cook's D, is also generalized to exponential family nonlinear regression. Finally, in normal distribution nonlinear regression, it is standard to use curvature measures to analyze the adequacy of the linearizations used (see, for example, Seber and Wild (1989)). In Chapter 6, these curvature measures are generalized to exponential family nonlinear regression.
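The orthogonality statement can likewise be illustrated in a few lines. A sketch (not from the book), with the hypothetical mean function f(x, θ) = θ₁ exp(θ₂x) and SciPy's standard nonlinear least-squares routine:

```python
import numpy as np
from scipy.optimize import least_squares

# At the least-squares fit, the residual Y - xi_hat is orthogonal to
# the tangent plane of M at xi_hat, i.e. J(theta_hat)^T (Y - xi_hat) = 0,
# where J is the Jacobian of theta -> xi(theta).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0, 20)

def xi(theta):
    return theta[0] * np.exp(theta[1] * x)

y = xi([2.0, -1.0]) + 0.05 * rng.standard_normal(x.size)

fit = least_squares(lambda th: y - xi(th), x0=[1.0, 0.0])
theta_hat = fit.x

# Columns of J span the tangent plane of M at xi(theta_hat).
J = np.column_stack([np.exp(theta_hat[1] * x),
                     theta_hat[0] * x * np.exp(theta_hat[1] * x)])
print(J.T @ (y - xi(theta_hat)))   # ~ (0, 0): residual _|_ tangent plane
```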

The prerequisite level of familiarity with differential geometry takes a significant leap in Chapter 7. This chapter considers the geometry of the parameter space Θ using the Riemannian metric defined by Fisher information. Recall that Fisher information has been reinterpreted as an inner product which depends upon a basepoint. In the more general setting, the information inner product is defined upon the tangent hyperplanes of Θ, and this is exactly what is meant by a Riemannian metric. It is well known that for the binomial distribution, the transformation p → arcsin √p is variance stabilizing. More generally, for the multinomial distribution with probabilities (p_1, …, p_k), the transformation (p_1, …, p_k) → (2√p_1, …, 2√p_k) is an isometry (that is, a metric-preserving map) from the multinomial distribution, considered as a manifold with the Fisher information metric, to an orthant of the sphere of radius 2 in R^k. Somewhat less well known is that the m-dimensional multivariate normal distribution with covariance σ²I_m is isometric to a hyperbolic (m + 1)-dimensional space. Is it part of some cosmic plan that the two most important distributions in statistics should be related to the two most important non-Euclidean geometries in the history of the development of geometry?
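The multinomial claim rewards a one-line verification (a standard computation, not quoted from the book). Writing z_i = 2√p_i, we have Σ z_i² = 4 Σ p_i = 4, so the image lies on the sphere of radius 2; and since dz_i = dp_i/√p_i,

```latex
% The pullback of the Euclidean line element under z_i = 2 sqrt(p_i)
% equals the Fisher information line element of the multinomial:
\sum_{i=1}^{k} dz_i^{2}
  = \sum_{i=1}^{k} \frac{dp_i^{2}}{p_i}
```

which is precisely the Fisher information metric on the multinomial, so the map is an isometry onto its image.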


Chapter 7 also generalizes the χ² tests for the multinomial distribution to a class of distance-based goodness-of-fit tests. The level of geometry used takes a further quantum leap in Chapter 8, in which nonmetric connections on the parameter space are used, resulting in a statistical manifold. The work of Amari (1990) on information recovery and conditioning, asymptotic efficiency of estimators, asymptotic power of tests, and effects of parameterization is presented. In Chapter 9, the asymptotic efficiency of maximum quasi-likelihood estimators is studied by constructing a statistical manifold and studying its curvature. Chapter 10 describes recent statistical uses in asymptotics of other standard constructions from differential geometry (namely, vector bundles, Hilbert bundles, and fiber bundles), as well as preferred point geometries, a new geometric construction which, as far as the reviewer can determine, has only been studied in the statistical literature.

Returning to our discussion of Chapters 2–4, we see an example of a general feature of this book. Almost any reader will appreciate the beauty of expressing information loss in terms of curvature. But a clearer understanding of this result requires a clear understanding of differential geometry. Differential geometry, like much of mathematics, can be understood in a formal sense relatively quickly. However, for most people, it takes a long time for the subject to really "sink in". The book of Kass and Vos contains all the required differential geometric background material and has made a genuine effort to present its ideas with a minimum of mathematical abstraction. However, the reviewer is not convinced that it has been able to surmount the Herculean obstacle of raising the general statistical reader's differential geometric comfort level to a point where the reader can truly appreciate the results in this book.

Very early in his conversion from a mathematician to a statistician, this reviewer learned that statistics requires a different frame of mind from mathematics: one must abandon the search for the absolute formal correctness that one finds in mathematics and accept the ambiguity inherent in statistical modeling. Differential geometry also teaches a frame of mind, one that the reviewer believes is useful to the statistician. Notice that although the calculation of statistical curvature γ(θ) requires a parameterization of the curved exponential family, statistical curvature itself is independent of such a parameterization. The study of differential geometry would teach the statistician to separate the inherent properties of a family of distributions from the properties of the specific parameterizations and coordinate systems used to express this family. The experience of differential geometry would indicate that one should not think in terms of the distribution of an estimator θ̂, but rather in terms of the distribution of the deviation of θ̂ from θ. The book by Kass and Vos has many beautiful examples of differential geometric insights in statistics. But the reviewer believes that statistics would greatly benefit, in ways even more fundamental than those presented by Kass and Vos, by acquiring the "differential geometry frame of mind".

The reviewer hopes that Kass and Vos will apply their considerable expository skills to write a first book which will slowly introduce, in the spirit of Spivak (1979), the statistical reader to the constructions of differential geometry and, in the process, promote the statistical benefits of the "differential geometry frame of mind".


References

Amari, S.-I., 1990. Differential-Geometrical Methods in Statistics. Springer, New York.
Efron, B., 1975. Defining the curvature of a statistical problem (with applications to second order efficiency). Ann. Statist. 3, 1189–1217.
Seber, G.A.F., Wild, C.J., 1989. Nonlinear Regression. Wiley, New York.
Spivak, M., 1979. A Comprehensive Introduction to Differential Geometry, Vol. I, 2nd Edition. Publish or Perish, Boston.

Ted Chang
Department of Statistics,
University of Virginia,
Charlottesville, VA 22904-4135, USA