Princeton University
Workshop on Frontiers of Statistics
in Honour of
Professor Peter Bickel’s 65th Birthday
May 18 - 20, 2006, Princeton, USA
Table of Contents
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Biography of Peter J. Bickel . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Committees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
Invited Speakers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Program Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Directions Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Program . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Abstracts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Workshop Participants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Contents of Book “Frontiers of Statistics” . . . . . . . . . . . . . . . . . . . 38
Special Thanks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
Acknowledgements
Sponsors
We gratefully acknowledge the generous financial support of:
Minerva Research Foundation
Bendheim Center for Finance, Princeton University
National Science Foundation
Department of Operations Research & Financial Engineering,
Princeton University
and academic support of:
Institute of Mathematical Statistics
International Indian Statistical Association
Background
The workshop brings senior and junior researchers together to define and expand the
frontiers of statistics. It provides a focal venue where they can gather, interact and
present new research findings, discuss and outline emerging problems in their fields, and
lay the groundwork for fruitful future collaborations. A distinguishing feature is that all
topics are in core statistics while interacting with other disciplines such as biology,
medicine, engineering, computer science, economics and finance. Topics include: (1)
Nonparametric inference and machine learning; (2) Longitudinal and functional data
analysis; (3) Time series and financial econometrics; (4) Computational biology and
biostatistics; (5) MCMC, bootstrap, and robust statistics; (6) Experimental design and
industrial engineering.
The workshop also serves advanced graduate students and young researchers looking
for new topics to work on, as well as experienced researchers who hope to gain an
overview of contemporary developments in statistics.
The workshop is held on the occasion of the 65th birthday of Professor Peter J. Bickel,
Professor of Statistics, University of California, Berkeley, one of the most celebrated
statisticians of our time. A book, “Frontiers of Statistics”, based on the topics presented
at the workshop, will soon be published in celebration of Professor Bickel’s 65th birthday.
The book will map the frontiers of the various disciplines in statistics and provide useful
references on the latest developments in each subject. It will be helpful both to new
researchers and to experienced researchers who wish to gain a bird’s-eye view of the
various frontiers of statistics.
Biography of Peter J. Bickel
Peter J. Bickel
Peter Bickel has been a leading figure in the field of statistics
in the 43 years since he received his Ph.D. in Statistics at the
age of 22. He is widely recognized as one of the greatest statisticians
of our time by any metric: breadth, depth and productivity.
He has made wide-ranging and far-reaching contributions to the
discipline of statistics, pioneering research and making fundamental
contributions in many areas. These include robust statistics, decision
theory, semiparametric modeling, the bootstrap, nonparametric modeling,
machine learning, computational biology, and many other
fields (e.g. transportation and genomics) where statistics and
quantitative approaches play an important role. His exceptional
record of research accomplishment is evidenced by his many publications in
the very top-ranking journals in the field of statistics. His scientific findings have
strongly reshaped statistical thinking, methodological development, theoretical studies,
and data analysis. His research has strongly influenced the development of other
quantitative disciplines such as engineering, economics, finance, computational biology,
and public health.
Bickel’s wide-ranging and far-reaching contributions to statistics have been widely
recognized internationally through numerous awards and honors. These include being the
first recipient of the COPSS Presidents’ Award (1980) and delivering the Wald Lectures
(1980). His work has also been recognized well beyond the statistical profession, through
his John D. and Catherine T. MacArthur Foundation Fellowship in 1984; Guggenheim,
NATO and Miller Fellowships; and his election to the American Academy of Arts and
Sciences in 1985, the National Academy of Sciences in 1985, and the Royal Netherlands
Academy of Arts and Sciences in 1995. He was also honored as a Chancellor’s
Distinguished Professor at UC Berkeley (1996-1999).
Professor Bickel is a strong professional leader. He has provided strong leadership
at all levels: enthusiastic administrative service to Berkeley as department chairman
(1976-79, 1993-98), director of the statistical laboratory (1987-92), dean (twice) of the
Physical Sciences, and member of many other important committees; professional service
as President of the Institute of Mathematical Statistics (1980-1982), President of the
Bernoulli Society (1991-1993), and member of the Board of Trustees of the National
Institute of Statistical Sciences (1991- ); and, at the national level, various leading
positions in the National Academy of Sciences, the National Research Council, the
Council of Scientific Advisors, and the American Association for the Advancement of
Science.
Scientific Committee:
Jianqing Fan (Chair) Princeton University
Luisa Fernholz Temple University
Hira Koul Michigan State University
Hans Mueller University of California at Davis
Vijay Nair University of Michigan
Ya’acov Ritov Hebrew University of Jerusalem
Jeff Wu Georgia Institute of Technology
Organizing Committee:
Jianqing Fan (Chair) Princeton University
Luisa T. Fernholz Temple University
Heng Peng Princeton University
Chongqi Zhang Guangzhou University
Yazhen Wang University of Connecticut
Committee on Travel Support:
Luisa T. Fernholz (Chair) Temple University
Jianqing Fan Princeton University
Liza Levina University of Michigan
Yijun Zuo Michigan State University
Yazhen Wang University of Connecticut
Invited speakers:
Yacine Ait-Sahalia Princeton University
Donald Andrews Yale University
Peter Buhlmann Swiss Federal Institute of Technology Zurich
Kjell Doksum University of California, Berkeley
David Donoho Stanford University
Ursula Gather University of Dortmund
Jayanta K. Ghosh Purdue University
Friedrich Goetze University of Bielefeld
Peter G. Hall The Australian National University
Haiyan Huang University of California at Berkeley
Jiming Jiang University of California, Davis
Hira Koul Michigan State University
Soumendra N. Lahiri Iowa State University
Elizaveta Levina University of Michigan
Jun Liu Harvard University
Regina Liu Rutgers University
Xiaoli Meng Harvard University
Stephan Morgenthaler Swiss Federal Institute of Technology Lausanne (EPFL)
Hans Muller University of California, Davis
Vijay Nair The University of Michigan
Byeong Park Seoul National University
Nancy Reid University of Toronto
John Rice University of California, Berkeley
Yaacov Ritov Israel Social Sciences Data Center
Anton Schick Binghamton University
Chris Sims Princeton University
David Tyler Rutgers University
Sara van de Geer Swiss Federal Institute of Technology Zurich
Mark van der Laan University of California, Berkeley
Willem van Zwet University of Leiden
Jane-Ling Wang University of California, Davis
Jon Wellner University of Washington
Yazhen Wang University of Connecticut
Jeff C. Wu Georgia Institute of Technology
Zhiliang Ying Columbia University
Chunming Zhang University of Wisconsin at Madison
Yijun Zuo Michigan State University
Program Overview
              Thursday              Friday                               Saturday
 8:30-8:45    Registration
 8:45-9:00    Opening Ceremony
 9:00-9:30    Peter G. Hall         Jun Liu / Hira Koul                  John Rice
 9:30-10:00   Peter Buhlmann        Haiyan Huang / Anton Schick          Jon Wellner
10:00-10:30   Sara van de Geer      Zhiliang Ying / Soumendra N. Lahiri  Ursula Gather
10:30-11:00   Photo and Break       Break                                Break
11:00-11:30   Willem van Zwet       Hans Muller / Regina Liu             Jayanta K. Ghosh
11:30-12:00   Nancy Reid            Chunming Zhang / Yijun Zuo           Xiaoli Meng
12:00-12:30   Friedrich Goetze      Byeong Park / Jiming Jiang           Jeff Wu
12:30-14:00   Lunch                 Lunch                                Lunch
14:00-14:30   Kjell Doksum          David Donoho
14:30-15:00   Jane-Ling Wang        David Tyler
15:00-15:30   Stephan Morgenthaler  Yaacov Ritov
15:30-16:00   Break                 Break
16:00-16:30   Vijay Nair            Chris Sims
16:30-17:00   Elizaveta Levina      Yacine Ait-Sahalia
17:00-17:30   Mark van der Laan     Donald Andrews

(On Friday morning, the two parallel sessions are separated by a slash.)
Program
May 17, 2006 (Wednesday)
19:30-21:30 Reception Palmer House
(http://www.princeton.edu/palmerhouse/)
Tel: 609-258-3715
Fax: 609-258-0526
May 18, 2006 (Thursday)
8:00-8:45 Registration F101∗
8:45-9:00 Opening Ceremony F101
Chair: Jianqing Fan
Invited Session
9:00-10:30 Chair: Don Fraser F101
9:00 Peter G. Hall
Some theory for classifiers in high-dimensional,
low sample size settings
9:30 Peter Buhlmann
Very high-dimensional data: prediction and
variable selection
10:00 Sara van de Geer
Oracle inequalities for the LASSO
10:30-11:00 Photo and Break
11:00-12:30 Chair: Ursula Gather F101
11:00 Willem van Zwet
An expansion for a discrete non-lattice
distribution
11:30 Nancy Reid
Applied Asymptotics
12:00 Friedrich Goetze
Edgeworth Approximations for
Symmetric Statistics
12:30-14:00 Lunch (Friend Convocation Room)
14:00-15:30 Chair: Luisa Fernholz F101
14:00 Kjell Doksum
Powerful Choices: Variable and Tuning
Constant Selection in Nonparametric
Regression based on Power
14:30 Jane-Ling Wang
Flexible Approaches to Model Survival
and Longitudinal Data Jointly
15:00 Stephan Morgenthaler
Smoothing Large Tables
15:30-16:00 Break
16:00-17:30 Chair: David Blei F101
16:00 Vijay Nair
Statistical Inverse Problems in Active Network
Tomography
16:30 Elizaveta Levina
Detection in Wireless Sensor Networks
17:00 Mark van der Laan
Estimating function based cross-validation
End of day 1
∗Friend 101
Program
May 19, 2006 (Friday)
8:45-9:00 Registration F006∗
Parallel Invited Sessions
9:00-10:30 Chair: Julian Faraway F006
9:00 Jun Liu
Bayesian Methods in Haplotype Inference
and Disease Mapping
9:30 Haiyan Huang
A statistical framework to infer functional
gene associations from multiple biologically
dependent microarray experiments
10:00 Zhiliang Ying
Semiparametric mixed effects models for
duration and longitudinal data
9:00-10:30 Chair: Run-ze Li F004∗∗
9:00 Hira Koul
Goodness-of-fit testing in interval
censoring case 1
9:30 Anton Schick
Efficient estimators for time series
10:00 Soumendra N. Lahiri
Edgeworth expansions for sums of
block-variables under weak dependence
10:30-11:00 Break
11:00-12:30 Chair: Richard Samworth F006
11:00 Hans Muller
Functional Variance
11:30 Chunming Zhang
Spatially Adaptive Functional Linear
Regression with Functional Smooth Lasso
12:00 Byeong Park
Estimation and Testing for Varying
Coefficients in Additive Models with
Marginal Integration
11:00-12:30 Chair: Miriam Donoho F004
11:00 Regina Liu
Mining Massive Text Data: Classification,
Construction of Tracking Statistics and
Inference under Misclassification
11:30 Yijun Zuo
Multi-Dimensional Trimming Based on
Data Depth
12:00 Jiming Jiang
Fence Methods: Another Look at Model Selection
12:30-14:00 Lunch (Friend Convocation Room)
Invited Session
14:00-15:30 Chair: Stephan Morgenthaler FCR∗∗∗
14:00 David Donoho
Sparsity in Inference: past trends,
future promise
14:30 David Tyler
Invariant coordinate selection (ICS):
A robust statistical perspective on
independent component analysis (ICA)
15:00 Yaacov Ritov
Some remarks on non-linear
dimension reduction
15:30-16:00 Break
16:00-17:30 Chair: Yazhen Wang FCR
16:00 Chris Sims
Bayesian Inference in Central Banks: Recent
Developments in Monetary Policy Modeling
16:30 Yacine Ait-Sahalia
Likelihood Inference for Diffusions
17:00 Donald Andrews
The Limit of Finite Sample Size and a
Problem with Subsampling
End of day 2
∗Friend 006 ∗∗Friend 004 ∗∗∗Friend Convocation Room
Program
May 20, 2006 (Saturday)
8:45-9:00 Registration FCR
Invited Sessions
9:00-10:30 Chair: Anirban Dasgupta FCR
9:00 John Rice
Multiple Testing in Astronomy
9:30 Jon Wellner
Goodness of fit via phi-divergences:
a new family of test statistics
10:00 Ursula Gather
Methods of robust online signal
extraction and applications
10:30-11:00 Break
11:00-12:30 Chair: Zhezhen Jin FCR
11:00 Jayanta K. Ghosh
Convergence and Consistency of
Newton’s Algorithm for Estimating a
Mixing Distribution
11:30 Xiaoli Meng
Statistical physics and statistical
computing: A critical link– estimating
criticality via perfect sampling
12:00 Jeff Wu
Bayesian Hierarchical Modeling for
Integrating Low-accuracy and
High-accuracy Experiments
12:30-14:00 Lunch (Friend Convocation Room)
End of day 3
Abstracts
Likelihood Inference for Diffusions
Yacine Ait-Sahalia
Bendheim Center for Finance, Princeton University
This talk surveys recent results on closed form likelihood expansions for discretely
sampled diffusions. One major impediment to both theoretical modeling and empirical
work with continuous-time models is the fact that in most cases little can be said about
the implications of the instantaneous dynamics for longer time intervals. One cannot in
general characterize in closed form an object as simple, yet fundamental for everything
from prediction to estimation and derivative pricing, as the conditional density of the
process, also known as the transition function of the process. I will describe a method
which produces accurate approximations in closed form to the transition function of an
arbitrary multivariate diffusion. I will then show a connection between this method and
saddlepoint approximations and provide examples. Next, I will discuss inference using this
method when the state vector is only partially observed, as in stochastic volatility or term
structure models. Finally, I will outline the use of this method in specification testing and
sketch derivative pricing applications.
The Limit of Finite Sample Size and a Problem with Subsampling
Donald W.K. Andrews
Department of Economics, Yale University
This paper considers tests and confidence intervals based on a test statistic that has
a limit distribution that is discontinuous in a nuisance parameter or the parameter of
interest. The paper shows that standard fixed critical value (FCV) tests and subsample
tests often have asymptotic size—defined as the limit of the finite sample size—that is
greater than the nominal level of the test. We determine precisely the asymptotic size of
such tests under a general set of high-level conditions that are relatively easy to verify.
Often the asymptotic size is determined by a sequence of parameter values that approach
the point of discontinuity of the asymptotic distribution. The problem is not a small
sample problem. For every sample size, there can be parameter values for which the test
over-rejects the null hypothesis. Analogous results hold for confidence intervals.
We introduce a hybrid subsample/FCV test that alleviates the problem of over-rejection
asymptotically and in some cases eliminates it. In addition, we introduce size-corrections
to the FCV, subsample, and hybrid tests that eliminate over-rejection asymptotically. In
some examples, these size corrections are computationally challenging or intractable. In
other examples, they are feasible. This is joint work with Patrik Guggenberger.
Very High-dimensional Data: Prediction and Variable Selection
Peter Buhlmann
Swiss Federal Institute of Technology Zurich
We consider problems where the number of predictor variables p is much larger than the
sample size n. For prediction in this setting, the Lasso and boosting algorithms have been
shown to be asymptotically consistent, and both often exhibit very good empirical
performance. However, the problem of variable selection is much more subtle and difficult
than prediction.
We will discuss theoretical and practical potential and limitations of the Lasso and
boosting for variable selection, and we will present powerful improvements. The talk is a
special birthday tour for Peter Bickel: from the “Relaxed Lasso” via “Sparse Boosting” to
completely different ideas from the “PC algorithm” in graphical modeling. The methods
are used for two problems in computational biology: (i) alternative splicing using single-
gene libraries; and (ii) short motif modeling for splice site detection.
Powerful Choices: Variable and Tuning Constant Selection in Nonparametric Regression based on Power
Kjell Doksum
Department of Statistics, University of California, Berkeley
This paper considers nonparametric multiple regression procedures for analyzing the
relationship between a response variable and a vector of covariates. It uses an approach
which handles the dilemma that with high dimensional data the sparsity of data in re-
gions of the sample space makes estimation of nonparametric curves and surfaces virtually
impossible. This is accomplished by abandoning the goal of trying to estimate true under-
lying curves and instead estimating measures of dependence that can determine important
relationships between variables. These dependence measures are based on local parametric
fits on subsets of the covariate space that vary in both dimension and size within each
dimension. The subset which maximizes a signal to noise ratio is chosen. The signal is a
local estimate of a dependence parameter which depends on the subset size, and the noise
is an estimate of the standard error (SE) of the estimated signal. This approach of choos-
ing the window size to maximize a signal to noise ratio lifts the curse of dimensionality
because for regions with sparsity of data the SE is very large. For contiguous Pitman
alternatives it corresponds to asymptotically maximizing the probability of correctly finding
relationships between covariates and a response, that is, maximizing asymptotic power. It
is shown that within a selected dimension, the bandwidths of the optimally selected subset
do not tend to zero as the sample size n grows except for alternatives where the length of
the intervals where the alternative differs from the hypothesis tends to zero as n grows.
One of the dimension reduction algorithms is used together with MARS and GUIDE and
is shown to improve their performance. This is joint work with Chad Schafer, Shijie Tang
and Kam Tsui.
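The window-selection idea described above can be sketched in a few lines (an illustrative simplification under assumptions of our own, not the authors' procedure): on each candidate window a local linear fit is computed, the signal is the estimated slope, the noise is its standard error, and the window maximizing |signal|/noise is chosen. Sparse windows produce large standard errors and therefore lose.

```python
import math

def slope_and_snr(xs, ys):
    """Local linear fit on one window: return (slope, |slope| / SE of slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    rss = sum((y - my - b * (x - mx)) ** 2 for x, y in zip(xs, ys))
    if rss == 0.0:          # perfect local fit: infinite signal-to-noise
        return b, math.inf
    se = math.sqrt(rss / (n - 2) / sxx)
    return b, abs(b) / se

def best_window(data, windows):
    """Choose the window whose local dependence estimate has the largest
    signal-to-noise ratio; windows with sparse data give large SEs."""
    def snr(w):
        lo, hi = w
        pts = [(x, y) for x, y in data if lo <= x <= hi]
        return slope_and_snr([p[0] for p in pts], [p[1] for p in pts])[1]
    return max(windows, key=snr)
```

For data that follow a clear linear relationship on one window and pure scatter on another, the first window is selected.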
Sparsity in Inference: Past Trends, Future Promise
David Donoho
Statistics Department, Stanford University
Suppose we have to estimate a large number of parameters, most of which are zero or
negligible and some of which are important or significant; but we don’t know in advance
which parameters are likely to be negligible and which are likely to be important. This
important problem in some sense spans large swaths of applied statistics, from regression
model building to gene association studies.
I’ll discuss some of Peter Bickel’s early work related to this problem, and how the
problem has grown and mutated over the years. At this point, it's a problem with truly
vast implications, with applications throughout science and technology and plenty of
challenging mathematics.
Methods of Robust Online Signal Extraction and Applications
Ursula Gather
Department of Statistics, University of Dortmund
We discuss filtering procedures for robust extraction of a signal from noisy time series.
These methods can e.g. be applied to online observations of vital parameters which are
acquired by clinical information systems for critically ill patients. Multivariate time series
from online monitoring exhibit trends, abrupt level changes and large spikes (outliers)
as well as periods of relative stability. Also, the measurements are overlaid with a high
level of noise and among the variables strong dynamic dependencies are found (Gather et
al. (2002)). The challenge is to develop methods that allow a fast and reliable denoising
of these time series. Noise and artifacts are to be separated from structural patterns of
relevance.
Standard approaches to univariate signal extraction are moving averages and (univari-
ate) running medians, but they have shortcomings when outliers or trends occur. Review-
ing and extending recent work we present new methods for robust online signal extraction
and discuss their merits for preserving trends, abrupt shifts and extremes and for the
removal of spikes (Davies, Fried, Gather (2004)). Our robust regression moving window
methods are applicable even in real time because of increased computational power and
fast algorithms (Bernholt and Fried (2003)).
In multivariate robust signal extraction efficiency is lost if the error terms of the vari-
ables are highly correlated since generalizing robust univariate regression methods does not
result in affine equivariant procedures. Multivariate affine equivariant regression methods
with high breakdown point, e.g. MCD-regression (Rousseeuw et al. (2004)), moreover
assume that the data are in general position. For discrete data in short time windows this
is however often not the case.
We therefore propose new procedures for multivariate signal extraction, which offer
fast and robust signal extraction, good efficiency properties and which can be used for
discretely measured data with low variability as well as in situations with many outliers.
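The advantage of running medians over moving averages in the presence of spikes, mentioned above, can be seen in an elementary sketch (our own illustration, not the procedures proposed in the talk): a single large artifact drags a windowed mean but leaves a windowed median untouched.

```python
import statistics

def moving_average(x, width):
    """Centered moving average (window truncated at the series boundaries)."""
    h = width // 2
    return [sum(x[max(0, i - h):i + h + 1]) / len(x[max(0, i - h):i + h + 1])
            for i in range(len(x))]

def running_median(x, width):
    """Centered running median: robust to isolated spikes (outliers)."""
    h = width // 2
    return [statistics.median(x[max(0, i - h):i + h + 1]) for i in range(len(x))]
```

For the series 1, 1, 1, 1, 1, 100, 1, 1, 1, 1, 1 with window width 5, the moving average at the spike is 20.8 while the running median stays at 1.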
Convergence and Consistency of Newton’s Algorithm for Estimating a Mixing Distribution
Jayanta K. Ghosh
Department of Statistics, Purdue University
In recent years Michael Newton has proposed an algorithmic estimate of a mixing
distribution, which is computationally efficient. We prove its convergence and consistency
under rather strong conditions. The consistency result is new. A proof of convergence
given earlier by Newton under the same conditions is shown to be incomplete and not easily
rectifiable. We study various other aspects of the estimate and compare it with the Bayes
estimate based on Dirichlet mixtures. This is joint work with Surya Tokdar.
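Newton's recursive estimate discussed above can be sketched on a finite grid of support points (an illustrative implementation under simplifying assumptions; the weights w_i = 1/(i+1) are one common choice, and the kernel is supplied by the user):

```python
def newton_mixing_estimate(data, grid, likelihood):
    """Newton's recursive algorithm for estimating a mixing distribution.

    Starting from the uniform distribution f_0 on `grid`, each observation
    x_i moves the current estimate toward the posterior over grid points:
        f_i = (1 - w_i) * f_{i-1} + w_i * posterior(. | x_i, f_{i-1}),
    with weights w_i = 1 / (i + 1).
    """
    f = [1.0 / len(grid)] * len(grid)
    for i, x in enumerate(data, start=1):
        w = 1.0 / (i + 1)
        post = [fj * likelihood(x, theta) for fj, theta in zip(f, grid)]
        z = sum(post)
        f = [(1 - w) * fj + w * pj / z for fj, pj in zip(f, post)]
    return f
```

Each update is a convex combination of the current estimate and its posterior given the new observation, so the result remains a probability distribution at every step.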
Edgeworth Approximations for Symmetric Statistics
Friedrich Goetze
Department of Mathematics, University of Bielefeld
We shall describe conditions, such that Edgeworth approximations up to an error
o(N^{-1}) hold for a general class of asymptotically linear symmetric statistics in N independent
observations that admit a regular stochastic Hoeffding expansion. The conditions
involve Cramer’s condition of smoothness for the linear term and some covariance type
conditions for the second order term. The results are joint work with M. Bloznelis and
extend previous work by P. Bickel, V. Bentkus, W. van Zwet and the author. They are
based on new analytical and combinatorial techniques. Connections with approximation
results in Probability and Number Theory for related degenerate U -statistics, and their
dimension dependence will be discussed as well.
Some Theory for Classifiers in High-dimensional, Low Sample Size Settings
Peter Hall
Centre for Mathematics and its Applications, Mathematical Sciences Institute,
Australian National University
A large class of distance-based classifiers is defined, and their performance addressed
using theoretical arguments based on letting dimension diverge as sample size is kept
fixed. Particular attention is paid to the use of truncation, to heighten sensitivity of the
classifiers in cases of data sparsity. It is shown that in that setting, truncated distance-
based classifiers can perform well when differences between distributions are detectable
but not estimable. They do not do quite as well as classifiers based on Donoho and
Jin’s higher-criticism methods, although they are more robust against assumptions about
distribution type and component relationships. However, the robustness of higher criticism
can be increased by using methods based on thresholding, as well as empirical approaches.
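The role of truncation can be illustrated with a toy distance-based rule (a hypothetical sketch of our own, not the classifiers analyzed in the talk): with many noise coordinates, restricting the distance to components where the class centroids differ appreciably heightens sensitivity in sparse, high-dimensional settings.

```python
def truncated_distance_classify(x, centroid0, centroid1, t):
    """Assign x to the nearer class centroid, but compute the distance only
    over coordinates where the centroids differ by more than t; this
    discards components that carry no signal about the class difference."""
    keep = [j for j in range(len(x)) if abs(centroid0[j] - centroid1[j]) > t]
    d0 = sum((x[j] - centroid0[j]) ** 2 for j in keep)
    d1 = sum((x[j] - centroid1[j]) ** 2 for j in keep)
    return 0 if d0 <= d1 else 1
```

With truncation, large noise values in non-informative coordinates cannot swamp the few coordinates where the distributions actually differ.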
A Statistical Framework to Infer Functional Gene Associations from Multiple Biologically Dependent Microarray Experiments
Haiyan Huang
Department of Statistics, University of California, Berkeley
Microarray data from an increasing number of biologically interrelated and interde-
pendent experiments now allow more complete portrayals of functional gene relationships
involved in biological processes. However, in the current integrative analyses of microarray
data, an important practical issue is widely ignored: the existence of dependencies among
gene expressions across biologically related experiments. When not accounted for, these
dependencies (due to either similar intrinsic conditions or relevant external perturbations
among the experiments) can result in inaccurate inferences of functional gene associa-
tions, and hence incorrect biological conclusions. To address this fundamental problem,
we propose a new measure, Knorm correlation, to quantify functional gene associations
in the presence of such experimental dependencies. Our intuitive strategy is to reduce
the experimental dependencies before estimating gene correlations. The statistical model
underlying Knorm correlation is a multivariate normal distribution characterized by a
Kronecker product dependency structure. This unique structure maintains the same ex-
perimental correlations across genes and the same gene correlations across experiments.
The proposed measure simplifies to the Pearson coefficient when experiments are uncor-
related. Applications to simulation studies and to two real datasets (on yeast and human
genes) demonstrate the success of Knorm correlation, and also the adverse impact of exper-
imental dependencies on gene associations using Pearson coefficients. Knorm correlation is
expected to greatly improve the accuracy of biological inferences made from experiments
currently (and incorrectly) assumed to be uncorrelated.
This is joint work with Melinda Teng and Xianghong Zhou.
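The strategy of reducing experimental dependencies before correlating genes can be sketched as follows (an illustrative simplification assuming the experiment covariance S is known; the Knorm estimator itself is more involved): whiten each gene's profile across experiments with the Cholesky factor of S, then take the Pearson coefficient of the whitened profiles. When the experiments are uncorrelated (S is the identity), this reduces to the ordinary Pearson coefficient, mirroring the property stated above.

```python
import math

def pearson(x, y):
    """Ordinary Pearson correlation of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return sxy / (sx * sy)

def cholesky(S):
    """Lower-triangular L with S = L L' (S symmetric positive definite)."""
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(S[i][i] - s) if i == j else (S[i][j] - s) / L[j][j]
    return L

def forward_solve(L, b):
    """Solve L y = b for y (whitening step: y = L^{-1} b)."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    return y

def decorrelated_corr(x, y, S):
    """Correlation of gene profiles x, y over experiments with (assumed known)
    experiment covariance S: whiten both by L^{-1}, then take Pearson."""
    L = cholesky(S)
    return pearson(forward_solve(L, x), forward_solve(L, y))
```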
Fence Methods: Another Look at Model Selection
Jiming Jiang
Department of Statistics, University of California, Davis
Many model search strategies involve trading off model fit with model complexity in a
penalized goodness of fit measure. Asymptotic properties for these types of procedures in
settings like linear regression and ARMA time series have been studied. Yet, such strate-
gies do not always translate into good finite sample performance. The issue is typically
one of the procedure being overly sensitive to the setting of penalty parameters, which are
required to be increasing functions of sample size. Furthermore, these strategies do not
generalize naturally to more complex models, such as those for modeling clustered data
or those that involve adaptive estimation. In these cases, penalties and model complexity
may not be naturally defined.
We introduce a new class of model selection strategies known as fence methods. The
general idea involves a procedure to isolate a subgroup of what are known as correct
models (of which the optimal model is a member). This is accomplished by constructing
a statistical fence, or barrier, to carefully eliminate incorrect models. Once the fence is
constructed, the optimal model will be selected among the correct models (those within the
fence) according to simplicity of the models. We describe a variety of fence methods, based
on the same principle but applied to different situations. These include regression, least
angle regression, linear mixed models for clustered and non-clustered data, generalized
linear mixed models for clustered and non-clustered data, and time series models. We
show the broad applicability of fence methods to all of these areas by giving a number
of examples, each supported by simulation results or real-life data analyses. In terms of
theoretical development, we give sufficient conditions for consistency of the fence, a desirable
property for a good model selection procedure.
This work is joint with J. Sunil Rao, Zhonghua Gu and Thuan Nguyen.
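The fence construction described above reduces, in its simplest form, to a short rule; the sketch below (our own illustration) takes the lack-of-fit measure Q and the cutoff (in the method, a tuning constant times an estimated standard deviation) as given:

```python
def fence_select(Q, complexity, cutoff):
    """Fence model selection (illustrative sketch).

    Q: dict mapping each candidate model to its lack-of-fit measure
       (smaller is better).  complexity: dict mapping each model to a
       simplicity ranking (smaller = simpler).  The fence admits every
       model M with Q[M] <= min Q + cutoff; among the models inside the
       fence, the simplest one is selected (ties broken by fit).
    """
    q_min = min(Q.values())
    inside = [m for m in Q if Q[m] <= q_min + cutoff]
    return min(inside, key=lambda m: (complexity[m], Q[m]))
```

For instance, with hypothetical measures Q = {"x1": 12.0, "x1+x2": 4.1, "x1+x2+x3": 4.0} and cutoff 0.5, the two larger models fall inside the fence and the simpler "x1+x2" is selected; shrinking the cutoff to 0.01 leaves only the full model inside.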
Goodness-of-fit Testing in Interval Censoring Case 1
Hira L. Koul
Department of Statistics and Probability, Michigan State University
In the interval censoring case 1, an event occurrence time is unobservable, but one ob-
serves an inspection time and whether the event has occurred prior to this time or not. The
focus here is to provide tests of goodness-of-fit hypothesis pertaining to the distribution
of the event occurrence time. The proposed tests are based on certain marked empirical
processes for testing a simple hypothesis and their martingale transforms. These tests are
asymptotically distribution-free, consistent against a large class of fixed alternatives and
have nontrivial asymptotic power against a large class of local alternatives.
Edgeworth Expansions for Sums of Block-variables under Weak Dependence
Soumendra N. Lahiri
Department of Statistics, Iowa State University
Let {X_i} be a sequence of random vectors and let Y_in = f_in(X_{i,l}) be zero-mean
block variables, where X_{i,l} = (X_i, ..., X_{i+l-1}), i >= 1, are overlapping blocks of
length l and where the f_in are Borel measurable functions. This paper establishes valid
joint asymptotic expansions of general orders for the joint distribution of the sums
S_n = X_1 + ... + X_n and T_n = Y_1n + ... + Y_nn under weak dependence conditions
on the sequence {X_i} when the block length l grows to infinity. In contrast to the
classical Edgeworth expansion results, where the terms in the expansions are given by
powers of n^{-1/2}, the expansions derived here are mixtures of two series, one in powers
of n^{-1/2} and the other in powers of (n/l)^{-1/2}. Applications
of the expansions to studentized statistics and to block bootstrap methods for time series
data are given.
Detection in Wireless Sensor Networks
Elizaveta Levina
Department of Statistics, The University of Michigan
Wireless sensor networks are becoming more widely available for use in various appli-
cations, such as intruder detection and ecological monitoring. The basic issues in sensor
networks (detection, estimation, design) are statistical but little work in this area has been
done by statisticians. I will give a brief overview of the main problems and then focus
on a local-vote decision algorithm we developed for target detection by a wireless sensor
network. Sensors acquire measurements corrupted by noise, make individual decisions,
correct their decisions after consulting the neighboring sensors, and then a collective
decision is made by the network. Related local methods have been proposed by engineers,
but no theoretical performance guarantees were available. We give an explicit formula for
the decision threshold for a given false alarm rate, based on limit theorems for weakly
dependent random fields. We also show that, for a fixed false alarm rate, the local-vote
correction significantly improves target detection rate.
Joint work with George Michailidis and Natallia Katenka.
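A bare-bones version of a local-vote correction on a grid of sensors might look as follows (a hypothetical sketch under simplifying assumptions of our own, not the algorithm analyzed in the talk): each sensor first thresholds its own noisy reading, then revises its decision by majority vote over itself and its neighbors, so that an isolated false alarm is voted down while a clustered detection region survives.

```python
def local_vote(readings, tau):
    """One round of local-vote correction on a 2-D grid of sensor readings.

    readings: list of lists of floats (one noisy measurement per sensor).
    tau: individual decision threshold.
    Returns the grid of corrected 0/1 decisions.
    """
    n, m = len(readings), len(readings[0])
    # Step 1: each sensor thresholds its own noisy measurement.
    initial = [[1 if readings[i][j] > tau else 0 for j in range(m)]
               for i in range(n)]
    # Step 2: each sensor revises its decision by majority vote over
    # itself and its available 4-neighborhood.
    corrected = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            votes = [initial[i][j]]
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                if 0 <= i + di < n and 0 <= j + dj < m:
                    votes.append(initial[i + di][j + dj])
            corrected[i][j] = 1 if 2 * sum(votes) > len(votes) else 0
    return corrected
```

A network-level decision can then be made, for example, by comparing the fraction of corrected positives against a threshold calibrated to the desired false alarm rate.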
Bayesian Methods in Haplotype Inference and Disease Mapping
Jun Liu
Department of Statistics, Harvard University
Haplotypes provide complete information on inheritance and are very useful in population
genetics and association studies. Since experimentally determining haplotype data
is expensive, much effort has been devoted to developing computational tools for inferring
haplotypes from genotype data. I will present a few Bayesian and semi-Bayesian models
that have been formulated over the past few years for this task, including a new
hierarchical Bayes model developed in our group that incorporates the coalescence effect
in a prior distribution. The prediction accuracy of the new method is uniformly improved
compared to existing methods such as HAPLOTYPER and PHASE.
I will further discuss a Bayesian approach to detecting multi-locus interactions (epistasis)
in case-control association studies. Existing methods either have low power or are
computationally infeasible when facing a large number of markers. Using MCMC sampling
techniques, our method can efficiently detect interactions among thousands of markers.
Using simulation results, I will discuss the power of our approach and the importance of
considering epistasis in association mapping.
Based on joint work with Yu Zhang and Tim Niu.
Mining Massive Text Data: Classification, Construction of Tracking Statistics and Inference under Misclassification
Regina Liu
Department of Statistics, Rutgers University
We present a systematic data mining procedure for exploring large free-style text
datasets to discover useful features and develop tracking statistics (often referred to as
performance measures or risk indicators). The procedure includes text classification, con-
struction of tracking statistics, inference under measurement error, and risk analysis. The
main difficulty in deriving this inference scheme is accounting for misclassification
errors, for which we propose two types of approaches: “plug-in” and “projection” methods.
We also consider bootstrap calibration for fine tuning. Finally, as an illustrative
example, the proposed data mining procedure is applied to analyzing an aviation safety
report repository from the FAA to show its utility in aviation risk management or general
decision-support systems.
Although most illustrations here are drawn from aviation safety data, the proposed
data mining procedure applies to many other domains, including, for example, mining
free-style medical reports for tracking possible disease outbreaks.
This is joint work with Daniel Jeske, Department of Statistics, UC Riverside.
Statistical Physics and Statistical Computing: A Critical Link – Estimating Criticality via Perfect Sampling
Xiao-Li Meng
Department of Statistics, Harvard University
This talk is based on the following chapter, jointly written with James Servidea of the
U.S. Department of Defense, in the volume dedicated to Professor Peter Bickel: “The main
purpose of this chapter is to demonstrate the fruitfulness of cross-fertilization between
statistical physics and statistical computation, by focusing on the celebrated Swendsen-
Wang algorithm for the Ising model and its recent perfect sampling implementation by
Mark Huber. In particular, by introducing the Hellinger derivative as a measure of instanta-
neous changes of distributions, we provide probabilistic insight into the algorithm’s critical
slowing down at the phase transition point. We show that at or near the phase transition,
an infinitesimal change in the temperature parameter of the Ising model causes an as-
tronomical shift in the underlying state distribution. This finding suggests an interesting
conjecture linking the critical slowing down in coupling time with the grave instability of
the system as characterized by the Hellinger derivative (or equivalently, by Fisher infor-
mation). It also suggests that we can approximate the critical point of the Ising model, a
physics quantity, by monitoring the coupling time of Huber’s bounding chain algorithm,
an algorithmic quantity. This finding might provide an alternative way of approximating
criticality of thermodynamic systems, which is typically intractable analytically. We also
speculate that whether we can turn perfect sampling from a pet pony into a workhorse
for general scientific computation may depend critically on how successfully we can engage,
in its development, researchers from statistical physics and related scientific fields.”
Smoothing Large Tables
Stephan Morgenthaler
EPFL Learning Center
Methods to smooth large tables are described. Such smoothing problems are of interest
in many scientific contexts and with a variety of objectives in mind. One may want to
interpolate the table entries, or to quantify the differences between rows and columns, or
to classify rows and columns into homogeneous subgroups, or to find the best rows and
columns, or some other objective. Fisher’s ANOVA, which can be computed by sweeping
row means and column means from the table, assigns a single effect to each row and
each column and was originally invented for tables of low dimension. The singular value
decomposition of the table offers an alternative single effects approximation. In both cases,
the smoothed row traces, that is, the plots of the row entries against the row effects, are
straight lines.
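The sweep just described can be illustrated concretely; this is a minimal sketch with a made-up table, not an example from the talk:

```python
import numpy as np

# Fisher's additive ANOVA decomposition by sweeping row and column means.
# The table values below are illustrative.
table = np.array([[10., 12., 11.],
                  [13., 15., 14.],
                  [ 7.,  9.,  8.]])

grand = table.mean()
row_eff = table.mean(axis=1) - grand      # a single effect per row
col_eff = table.mean(axis=0) - grand      # a single effect per column

# Single-effects smooth, and the residual table left after sweeping.
smooth = grand + row_eff[:, None] + col_eff[None, :]
resid = table - smooth
```

For this exactly additive table the residuals vanish; in general the residual table is what richer smoothers, such as a rank-one singular value approximation, try to capture.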
More general table smoothers are obtained by using more flexible traces. Some of
the difficulties with this approach are discussed, among them the choice of row and col-
umn variables replacing the single effects from above, the parsimonious choice of trace
parameters, the classification of traces, and the transformation of table entries.
Functional Variance
Hans-Georg Muller
Department of Statistics, University of California, Davis
Functional data consist of an observed sample of smooth random trajectories. A key
tool for the analysis of such data is a representation in terms of eigenfunctions of the
autocovariance operator of the underlying stochastic process and the associated functional
principal components. In some applications the information of interest resides not in
the observed smooth random trajectories themselves but rather in the additive noise.
Assuming the noise is composed of a white noise component and a smooth random process
component, we refer to the latter as the functional variance process. This process can
then be decomposed in terms of its eigenfunctions. Methods to estimate eigenfunctions
and functional principal component scores for the functional variance process are based on
residuals obtained in an initial smoothing step, applied to the original data. We discuss
asymptotic justifications and applications. (Joint work with U. Stadtmüller and F. Yao.)
Statistical Inverse Problems in Active Network Tomography
Vijay Nair
Department of Statistics, Department of Industrial & Operations Engineering,
University of Michigan, Ann Arbor
The term network tomography, first introduced in Vardi (1996), characterizes two
classes of large-scale inverse problems that arise in the modeling and analysis of computer
and communications networks. This talk will deal with active network tomography where
the goal is to recover link-level quality of service parameters, such as packet loss rates and
delay distributions, from end-to-end path-level measurements. Internet service providers
use this to characterize network performance and to monitor service quality. We will
provide a review of recent developments, including the design of probing experiments,
inference for loss rates and delay distributions, and applications to network monitoring.
This is joint work with George Michailidis, Earl Lawrence, Bowei Xi, and Xiaodong Yang.
Estimation and Testing for Varying Coefficients in Additive Models with Marginal Integration
Byeong Park
Department of Statistics, Seoul National University
We propose marginal integration estimation and testing methods for the coefficients of
a varying-coefficient multivariate regression model. Asymptotic distribution theory is
developed for the estimation method, which enjoys the same rate of convergence as univariate
function estimation. For the test statistic, asymptotic normal theory is established. These
theoretical results are derived under the fairly general conditions of absolute regularity
(β-mixing). Application of the test procedure to the West German real GNP data reveals
that a partially linear varying-coefficient model gives the most parsimonious fit to the data
dynamics, a fact that is also confirmed by residual diagnostics.
Applied Asymptotics
Nancy Reid
Department of Statistics, University of Toronto
The theory of higher order asymptotics provides quite accurate approximations for
a large number of parametric models. However, the details of the theory are somewhat
complicated, and perhaps for that reason the methods are not used as often as they might
be. I will outline some ‘case studies’ where improved approximation is readily implemented
and illustrate the effects on the resulting inference. I will suggest areas where further
research is needed.
Multiple Testing in Astronomy
John Rice
Department of Statistics, University of California, Berkeley
Suppose that a very large number of independent null hypotheses are tested, almost
all of which are true. How can the proportion of false null hypotheses be estimated? For
motivation, I will discuss the Taiwanese-American Occultation Survey, and will explain
how this question arises. I will then present some recent results.
Some Remarks on Non-linear Dimension Reduction
Ya’acov Ritov
Israel Social Sciences Data Center
We remark on the possibility of a well-defined dimension reduction. We consider a
model in which the data are distributed on a manifold. We present an algorithm for
generating a global map of the data to a lower-dimensional space, minimizing the local
structure of the manifold. We remark on the importance of estimating the manifold
structure when the main concern is estimating a regression function.
Efficient Estimators for Time Series
Anton Schick
Department of Mathematical Sciences, Binghamton University
I illustrate several recent results on efficient estimation for semiparametric time series
models with a simple class of models: first-order nonlinear autoregression with indepen-
dent innovations. In particular I consider estimation of the autoregression parameter, the
innovation distribution, conditional expectations, the stationary distribution, the station-
ary density, and higher-order transition densities.
Bayesian Inference in Central Banks: Recent Developments in Monetary Policy Modeling
Christopher A. Sims
Department of Economics, Princeton University
In the 1950’s and 60’s, large-scale econometric models, grounded in an elegant theory
of inference initiated by Trygve Haavelmo, began to be widely used by policy-making
institutions. While the models remained in use, their grounding in a theory of inference
had almost completely disappeared by 2000. In the last few years, there has been research ac-
tivity in many central banks aimed at producing models grounded in a Bayesian approach
to inference and using modern computational approaches to posterior simulation. This
talk summarizes the history and describes the methods and results driving the current
research.
Invariant Coordinate Selection (ICS): A Robust Statistical Perspective on Independent Component Analysis (ICA)
David E. Tyler
Department of Statistics, Rutgers University
In many disciplines, independent component analysis (ICA) has become a popular
method for analyzing multivariate data. Independent component analysis typically
assumes the observed data Y ∈ ℜp are generated by a nonsingular affine transformation of
independent components, i.e. Y = AZ, where A is a nonsingular matrix and Z = (Z1, . . . , Zp)′
consists of independent variables Z1, . . . , Zp. The objective is then to estimate A and hence
recover Z. Approaches for recovering Z have often been successful in exploring multivariate
data in general, i.e. in cases where the ICA model may not hold. The purpose
of this talk is to provide some understanding as to why independent component analysis
may work well as a general multivariate method. In particular, without reference to the
ICA model, it can be noted that for some methods the recovered Z can be viewed as affine
invariant coordinates. That is, if we transform Y → Y∗ = BY + b for any nonsingular
B, then Z∗ = ∆Z + c, where ∆ is a nonsingular diagonal matrix. In other words, the
standardized versions of the components Zj and Z∗j are the same. Hence, the terminology
invariant coordinate selection (ICS).
Consequently, this leads to the development of a wide class of affine equivariant co-
ordinatewise methods for multivariate data. Some methods to be discussed are affine
equivariant principal components, robust estimates of multivariate location and scatter,
affine invariant multivariate nonparametric tests, affine invariant multivariate distribution
functions, and affine invariant coordinate plots. The affine equivariant principal compo-
nents and the corresponding affine invariant coordinate plots can be regarded in a sense as
projection pursuit without the pursuit. Several examples are given to illustrate the utility
of the proposed methods.
Oracle Inequalities for the LASSO
Sara van de Geer
Seminar fuer Statistik, ETH Zuerich
We consider the LASSO penalty for general M-estimators. Examples include logistic
regression, quantile regression, log-density estimation, and boosting with, for example,
logistic loss or hinge loss. Let Y be a real-valued (response) variable and X be a (co-)variable
with values in some space 𝒳. Let

F ⊂ { f_α = ∑_{k=1}^{m} α_k ψ_k }

be a (convex subset of a) linear space of functions on 𝒳. Here, {ψ_k}_{k=1}^{m} is a given system
of base functions. Let γ_f : R × 𝒳 → R be some loss function, and let {(Y_i, X_i)}_{i=1}^{n} be
i.i.d. copies of (Y, X). We consider the estimator

f̂ = argmin_{f_α ∈ F} { (1/n) ∑_{i=1}^{n} γ_{f_α}(Y_i, X_i) + λ I(α) },

where I(α) := ∑_{k=1}^{m} τ_k |α_k| denotes the weighted ℓ1 norm of the vector α ∈ R^m, with
random weights τ_k := ( (1/n) ∑_{i=1}^{n} ψ_k^2(X_i) )^{1/2}. We study the situation where the number of
parameters m is large (possibly much larger than the number of observations n). Our
purpose is threefold. Firstly, we want to show that for a proper choice of the smoothing
parameter λ (possibly depending on {τ_k}), the estimator f̂ satisfies an oracle inequality.
Secondly, we want the result to hold without any a priori bounds on the functions in F.
Thirdly, we aim at “reasonable” values for the constants involved, as an indication that the
result is not merely an asymptotic one. In certain settings, the smoothing parameter λ can
be chosen asymptotically equal to 4 √(2 log m / n), which is four times as large as in the linear
Gaussian case with soft thresholding. The factor 4 comes from using a symmetrization
and a contraction inequality.
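The estimator can be sketched numerically. The toy fit below uses logistic loss with the weighted ℓ1 penalty and the λ = 4 √(2 log m / n) choice mentioned in the abstract, solved by proximal gradient descent with soft thresholding; the solver, data, step size, and iteration count are assumptions for illustration, not part of the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data: n observations, m base functions (here, raw covariates).
n, m = 200, 50
X = rng.normal(size=(n, m))                 # columns play the role of psi_k(X_i)
beta_true = np.zeros(m)
beta_true[:3] = 2.0                         # sparse truth, for illustration
y = (rng.random(n) < 1 / (1 + np.exp(-X @ beta_true))).astype(float)

tau = np.sqrt((X ** 2).mean(axis=0))        # random weights tau_k
lam = 4 * np.sqrt(2 * np.log(m) / n)        # smoothing parameter from the abstract

def grad(alpha):
    """Gradient of the average logistic loss."""
    p = 1 / (1 + np.exp(-X @ alpha))
    return X.T @ (p - y) / n

# Proximal gradient descent: gradient step, then soft-threshold by
# step * lam * tau_k (the prox of the weighted l1 penalty).
alpha, step = np.zeros(m), 0.5
for _ in range(500):
    a = alpha - step * grad(alpha)
    alpha = np.sign(a) * np.maximum(np.abs(a) - step * lam * tau, 0.0)
```

The weighted soft-thresholding step is what produces sparse solutions: coordinates whose gradient signal stays below λτ_k are driven exactly to zero.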
Estimating Function Based Cross-validation
Mark van der Laan
Division of Biostatistics, University of California, Berkeley
Suppose that we observe a sample of independent and identically distributed realiza-
tions of a random variable. Given a model for the data generating distribution, assume
that the parameter of interest can be characterized as the parameter value which makes
the population mean of a possibly infinite dimensional estimating function equal to zero.
Given a collection of candidate estimators of this parameter, and specification of the vec-
tor estimating function, we propose a norm of the cross-validated estimating equation as
a criterion for selecting among these estimators. For example, if we use the Euclidean norm,
then our criterion is defined as the Euclidean norm of the empirical mean over the validation
sample of the estimating function at the candidate estimator based on the training sam-
ple. We establish a finite sample inequality of this method relative to an oracle selector,
and illustrate it with some examples. This finite sample inequality also yields
asymptotic equivalence of the selector with the oracle selector under general conditions.
We also study the performance of this method in the case that the parameter of interest
itself is pathwise differentiable (and thus, in principle, root-n estimable).
An Expansion for A Discrete Non-lattice Distribution
Willem R. van Zwet
Department of Statistics, University of Leiden
Much is known about asymptotic expansions for asymptotically normal distributions
if these distributions are either absolutely continuous or pure lattice distributions. In this
paper we begin an investigation of the discrete but non-lattice case. We tackle one of the
simplest examples imaginable and find that curious phenomena occur. Clearly more work
is needed. (Co-author Friedrich Götze)
Flexible Approaches to Model Survival and Longitudinal Data Jointly
Jimin Ding and Jane-Ling Wang (Speaker)
Department of Statistics, University of California at Davis
In clinical studies, longitudinal covariates are often used to monitor the progression
of the disease as well as survival time. The relationship between a failure time process and
longitudinal covariates is of key interest, as is understanding the pattern of the longitudinal
process in order to learn more about the health status of patients or to gain insight into
the progression of disease. Joint modeling of longitudinal and survival data has certain
advantages and has emerged as an effective way to handle both types of data
simultaneously. In this talk, we will explore several intriguing and challenging issues in
joint modelling.
Typically, a parametric longitudinal model is assumed to facilitate the likelihood
approach. However, the choice of a proper parametric model turns out to be more elusive
than in standard longitudinal studies, where no survival end-point occurs. Furthermore, the
computational burden due to both Monte Carlo numerical integration and the EM
(Expectation-Maximization) algorithm is an important concern in the joint modelling setting. To deal with
those challenges, we propose several flexible longitudinal models in the joint modelling
setting. Simplicity of the model structure is crucial to have good numerical stability, and
we will illustrate this through numerical studies and data analysis.
Goodness of Fit via Phi-divergences: A New Family of Test Statistics
Jon A. Wellner
Department of Statistics, University of Washington
A new family of goodness-of-fit tests based on phi-divergences is introduced and
studied. The new family is based on phi-divergences somewhat analogously to the phi-
divergence tests for multinomial distributions introduced by Cressie and Read (1984), and
is indexed by a real parameter s ∈ R: s = 2 gives the Anderson - Darling test statistic,
s = 1 gives the Berk-Jones test statistic, s = 1/2 gives a new (Hellinger - distance type)
statistic, s = 0 corresponds to the “reversed Berk-Jones” statistic, and s = −1 gives a
“studentized” (or empirically weighted) version of the Anderson - Darling statistic. We
also introduce corresponding integral versions of the new statistics.
We show that the asymptotic null distribution theory of Jaeschke (1979) and Eicker
(1979) for the Anderson-Darling statistic, and of Berk and Jones (1979) applies to the
whole family of statistics Sn(s) with s ∈ [−1, 2]. We also provide new finite-sample
approximations to the null distributions and show how the new approximations can be
used to obtain accurate computation of quantiles.
On the side of power behavior, we show that for 0 < s < 1 and fixed alternatives
the test statistics always converge almost surely to their corresponding natural parameter.
For 1 < s <∞ we provide necessary and sufficient conditions on the alternative d.f. F for
convergence to the corresponding natural parameter to hold, and show that the “Poisson
boundary” phenomenon noted by Berk and Jones for their statistic continues to hold for
s ≥ 1 and s < 0, by identifying the Poisson boundary distributions explicitly.
We extend the results of Donoho and Jin (2004) by showing that all our new tests
for s ∈ [−1, 2] have the same “optimal detection boundary” for normal shift mixture
alternatives as Tukey’s “higher-criticism” statistic and the Berk-Jones statistic.
Heterogeneous Autoregressive Realized Volatility Model
Yazhen Wang
Department of Statistics, University of Connecticut
Volatilities of asset returns are pivotal for many issues in financial economics. The
availability of high frequency intraday data should allow us to estimate volatility more
accurately. Realized volatility is often used to estimate integrated volatility. To obtain
better volatility estimation and forecast, some autoregressive structure of realized volatility
is proposed in the literature. This talk will present my recent work on heterogeneous
autoregressive models of realized volatility.
Bayesian Hierarchical Modeling for Integrating Low-accuracy and High-accuracy Experiments
Jeff Wu
Georgia Institute of Technology, School of Industrial and Systems Engineering
Standard practice in analyzing data from different types of experiments is to treat data
from each type separately. By borrowing strength across multiple sources, an integrated
analysis can produce better results. Careful adjustments need to be made to incorpo-
rate the systematic differences among various experiments. To this end, some Bayesian
hierarchical Gaussian process models (BHGP) are proposed. The heterogeneity among
different sources is accounted for by performing flexible location and scale adjustments.
The approach tends to produce predictions closer to those from the high-accuracy
experiment. The Bayesian computations are aided by the use of Markov chain Monte Carlo and
Sample Average Approximation algorithms. The proposed method is illustrated with two
examples: one with detailed and approximate finite elements simulations for mechanical
material design and the other with physical and computer experiments. (Based on joint
work with Zhiguang Qian).
Semiparametric Mixed Effects Models for Duration and Longitudinal Data
Zhiliang Ying
Department of Statistics, Columbia University
In this talk, I will present a doubly semiparametric mixed effects model for duration and
recurrent event time data. This model is useful in accommodating possible informative
censoring, a common problem in many follow-up studies. It also exhibits interesting
features which make it relatively easy to carry out the usual statistical inferences. We
show the usefulness and practicality of the proposed approach via theoretical properties,
simulation results and data analysis. Some additional developments on linear mixed effects
model for longitudinal data will also be presented.
Spatially Adaptive Functional Linear Regression with Functional Smooth Lasso
Chunming Zhang
Department of Statistics, University of Wisconsin
In this paper we consider the setting where the regressor is a functional object, such as a
curve or an image, and the response is a scalar. We propose the “functional smooth lasso”
(FSL) approach to simultaneously regularize the roughness and the size of the nonzero
regions of the functional linear regression estimates. An efficient algorithm is developed
for computing FSL. The degrees of freedom of FSL are derived and incorporated into the
automatic tuning of regularization parameters. Furthermore, we prove the consistency and
the convergence rate of FSL. An interesting finding is that the convergence rate depends
on the degree of “smoothness” of the predictors. The proposed method is illustrated
via simulation studies and a real data application.
Multi-Dimensional Trimming Based on Data Depth
Yijun Zuo
Department of Statistics and Probability, Michigan State University, East Lansing
With a natural order principle, trimming in one dimension is straightforward. One-
dimensional trimmed means are among the most popular estimators of the center of data
and have been used in various fields of statistics and in our daily life. Trimmed means
can overcome the high sensitivity of the mean to outliers (or heavy-tailed data) and the
low efficiency of the median for light-tailed data. Hence they can serve as compromises
between the mean and the median.
Multi-dimensional data often contain outliers, which are typically far more difficult to
detect than in one dimension. A robust procedure, such as multi-dimensional trimming,
that can automatically detect outliers or “heavy tails” is thus desirable. The task of
trimming, however, becomes non-trivial, for there is no natural order principle in high
dimensions. In this talk, multi-dimensional trimming based on “data depth” is discussed.
It is found that depth-trimmed means can possess very desirable properties such as high
efficiency and high robustness. Furthermore, inference procedures based on the depth-
trimmed means can outperform the classical Hotelling’s T² (and the univariate t) ones.
Applications of data depth trimming such as clustering and dimension reduction are also
addressed. Contributions of Professor Bickel to trimming are discussed.
Workshop Participants
Name Institution Email Address
Yacine Ait-Sahalia Princeton University yacine@Princeton.EDU
Beth Andrews Northwestern University bandrews@northwestern.edu
Donald W.K. Andrews Yale University donald.andrews@yale.edu
Alex Bajamonde Genentech Inc. bajamonde.alex@gene.com
Peter J. Bickel Univ. of California, Berkeley bickel@stat.berkeley.edu
Steinar Bjerve University of Oslo steinar@math.uio.no
David Blei Princeton University blei@cs.princeton.edu
Howard Bondell North Carolina State Univ. bondell@stat.ncsu.edu
Peter Buhlmann Swiss Federal Institute of Tech. Zurich peter.buehlmann@stat.math.ethz.ch
Christopher Calderon Princeton University ccaldero@princeton.edu
Melissa Carroll Princeton University mkc@princeton.edu
Serena Chan Cornell University ssc35@cornell.edu
Scott Chasalow Bristol-Myers Squibb scott.chasalow@bms.com
Aiyou Chen Bell Labs, Lucent Tech. aychen@research.bell-labs.com
Ming-Yen Cheng National Taiwan University cheng@math.ntu.edu.tw
Shojaeddin Chenouri University of Waterloo schenouri@uwaterloo.ca
Laura Chioda Princeton University lchioda@princeton.edu
Gregory Chow Princeton University gchow@Princeton.EDU
Erhan Cinlar Princeton University ecinlar@princeton.edu
Anirban Dasgupta Purdue University dasgupta@stat.purdue.edu
Savas Dayanik Princeton University sdayanik@princeton.edu
Aurore Delaigle Univ. of California, San Diego delaigle@math.ucsd.edu
Jimin Ding Univ. of California, Davis jmding@wald.ucdavis.edu
Kjell Doksum Univ. of California, Berkeley doksum@stat.wisc.edu
David Donoho Stanford University donoho@stanford.edu
Miriam G. Donoho San Jose State Univ. donoho@email.sjsu.edu
Juan Du Michigan State University dujuan@msu.edu
Veronica Esaulova Otto von Guericke Univ. veronica.esaulova@gmail.com
Yingying Fan Princeton University yingying@princeton.edu
Julian Faraway University of Michigan faraway@umich.edu
Luisa T. Fernholz Princeton University lfernhol@princeton.edu
Don Fraser University of Toronto dfraser@utstat.toronto.edu
Mendel Fygenson Univ. of Southern California fygenson@marshall.usc.edu
Anne Gadermann Univ. of British Columbia gaderman@interchange.ubc.ca
Ursula Gather University of Dortmund gather@statistik.uni-dortmund.de
Zhiyu Ge Merrill Lynch gary_ge@mil.com
Jayanta K. Ghosh Purdue University ghosh@stat.purdue.edu
Sujit Ghosh North Carolina State Univ. sghosh@stat.ncsu.edu
Subhashis Ghoshal North Carolina State Univ. ghosal@stat.ncsu.edu
Friedrich Goetze University of Bielefeld goetze@mathematik.uni-bielefeld.de
Wenceslao G. Manteiga Univ. of Santiago de Compostela wenceslao@usc.es
Jiezhun Gu North Carolina State Univ. jgu@unity.ncsu.edu
Arjun Gupta Bowling Green State Univ. gupta@bgnet.bgsu.edu
Peter G. Hall The Australian National Univ. Peter.Hall@maths.anu.edu.au
Hillary Han Cornell University hillary_han@yahoo.com
Jaroslaw Harezlak Harvard University jharezla@hsph.harvard.edu
Nick Hengartner Los Alamos National Laboratory nickh@lanl.gov
Moonseong Heo Cornell University moh2002@med.Cornell.edu
David Hitchcock Univ. of South Carolina hitchcock@stat.sc.edu
Haiyan Huang Univ. of California at Berkeley hhuang@stat.berkeley.edu
Li-Shan Huang University of Rochester Lhuang@bst.rochester.edu
Tao Huang Yale University t.huang@yale.edu
Ben Huang Bristol-Myers Squibb shupang.huang@bms.com
Xiaoming Huo Univ. of California at Riverside xmhuo@ucr.edu
Ed Ionides University of Michigan inides@umich.edu
Barry James Univ. of Minnesota, Duluth bjames@d.umn.edu
Kang James Univ. of Minnesota Duluth kjames@d.umn.edu
Yuan Ji The University of Texas yuanji@mdanderson.org
Jiancheng Jiang Princeton University jjiang@princeton.edu
Jiming Jiang University of California, Davis jiang@wald.ucdavis.edu
Kun Jin FDA/CDER/OB/DBI JINK@cder.fda.gov
Zhezhen Jin Columbia University zj7@columbia.edu
Rebecka Jornsten Rutgers University rebecka@stat.rutgers.edu
Noureddine El Karoui UC Berkeley nkaroui@stat.berkeley.edu
Katerina Kechris Univ. of Colorado Health Sci. Center katerina.kechris@uchsc.edu
Abbas Khalili University of Waterloo aa2mahmo@math.uwaterloo.ca
Chris Klaassen Universiteit van Amsterdam chrisk@science.uva.nl
Hira Koul Michigan State University koul@stt.msu.edu
Sanjeev Kulkarni Princeton University kulkarni@princeton.edu
Jaimyoung Kwon Cal State East Bay jaimyoung.kwon@csueastbay.edu
Soumendra N. Lahiri Iowa State University snlahiri@iastate.edu
Clifford Lam Princeton University wlam@princeton.edu
Hyunsook Lee Pennsylvania State Univ. hlee@stat.psu.edu
Elizaveta Levina University of Michigan elevina@umich.edu
Michael Levine Purdue University mlevins@stat.purdue.edu
Hongzhe Li University of Pennsylvania hli@cceb.upenn.edu
Lexin Li North Carolina State Univ. li@stat.ncsu.edu
Runze Li Pennsylvania State Univ. rli@stat.psu.edu
Chaobin Liu Bowie State University cliu@bowiestate.edu
Jun Liu Harvard University jliu@stat.harvard.edu
Mengling Liu New York University mengling.liu@med.nyu.edu
Regina Liu Rutgers University rliu@stat.rutgers.edu
Yanning Liu Cornell University yl229@cornell.edu
Yufeng Liu Univ. of North Carolina yfliu@email.unc.edu
Markus Loecher Rutgers University loecher@mailaps.org
Adriana Lopes University of Pittsburgh ad15@pitt.edu
Panos Lorentziadis Hellenic American Univ. paloren@attglobal.net
Aurelie Lozano Princeton University alozano@princeton.edu
Ying Lu Harvard University ylu@latte.harvard.edu
Jun Luo Michigan State University luojun@msu.edu
Jinchi Lv Princeton University jlv@princeton.edu
Loriano Mancini University of Zürich mancini@isb.unizh.ch
David Masson University of Delaware davidm@udel.edu
Jon McAuliffe University of Pennsylvania mcjon@wharton.upenn.edu
Anjana Meel University of Pennsylvania anjanam@seas.upenn.edu
Xiaoli Meng Harvard University meng@stat.harvard.edu
Oksana Mokliatchouk Bristol-Myers Squibb oksana.mokliatchouk@bms.com
Stephan Morgenthaler EPFL Learning Center stephan.morgenthaler@epfl.ch
Akira Morita Georgia Tech gtg347v@mail.gatech.edu
Hans Mueller Univ. of California Davis mueller@wald.ucdavis.edu
Yolanda Munoz University of Texas Yolanda.M.Munoz@uth.tmc.edu
Vijay Nair University of Michigan vnn@umich.edu
Jan Neumann Siemens Corporate Research jan.neumann@siemens.com
Yue Niu Princeton University yniu@princeton.edu
Juan Carlos Pardo University of Vigo juancp@uvigo.es
Byeong Park Seoul National University bupark@stats.snu.ac.kr
Emanuel Parzen Texas A&M University eparzen@stat.tamu.edu
Heng Peng Princeton University pheng@princeton.edu
Jianan Peng Acadia University jianan.peng@acadiau.ca
Quang Pham University of Alaska Fairbanks pxquang@cox.net
Nancy Reid University of Toronto reid@utstat.edu
Philip Reiss Columbia University ptr2003@columbia.edu
John Rice Univ. of California, Berkeley rice@stat.Berkeley.EDU
Yaacov Ritov Israel Social Sci. Data Center yaacov.ritov@huji.ac.il
Alex Rojas Carnegie Mellon University arojas@stat.cmu.edu
Juan Romo Universidad Carlos III de Madrid juan.romo@uc3m.es
Kaisiromwe Sam Uganda Bureau of Statistics sam.kaisiromwe@ubos.org
Alexander Samarov MIT samarov@mit.edu
Richard Samworth University of Cambridge rjs57@cam.ac.uk
Stanley Sawyer Washington University sawyer@math.wustl.edu
Robert Schapire Princeton University schapire@cs.princeton.edu
Anton Schick Binghamton University anton@math.binghamton.edu
Damla Senturk Penn State Univ. dsenturk@stat.psu.edu
Chris Sims Princeton University sims@princeton.edu
Dan Spitzner Virginia Tech dan.spitzner@vt.edu
Curtis Storlie North Carolina State Univ. storlie@stat.ncsu.edu
Umar Syed Princeton University usyed@cs.princeton.edu
Nian-Sheng Tang Columbia University nstang@ynu.edu.cn
Shijie Tang Univ. of Wisconsin at Madison tangs@stat.wisc.edu
Tiejun Tong Yale University tiejun.tong@yale.edu
David Tyler Rutgers University dtyler@rci.rutgers.edu
Sara van de Geer Swiss Federal Institute of Tech. Zurich geer@math.leidenuniv.nl
Mark van der Laan Univ. of California, Berkeley laan@stat.Berkeley.EDU
Willem van Zwet University of Leiden vanzwet@math.leidenuniv.nl
Bob Vanderbei Princeton University rvdb@princeton.edu
Aldo Jose Viollaz Univ. Nac. De Tucuman aviollaz@herrera.unt.edu.ar
Haiyan Wang Kansas State University hwang@ksu.edu
Jane-Ling Wang Univ. of California at Davis wang@wald.ucdavis.edu
Naisyin Wang Texas A&M University nwangstat@gmail.com
Paul C. Wang CPR & CDR Technologies, Inc pcwang@cprcdr.com
Qing Wang Princeton University qingwang@princeton.edu
Xiaohui Wang University of Virginia xw5a@virginia.edu
Yonghua Wang Bristol-Myers Squibb yonghua.wang@bms.com
Jon Wellner University of Washington jaw@stat.washington.edu
Roy Welsch MIT rwelsch@mit.edu
Yazhen Wang University of Connecticut yzwang@stat.uconn.edu
Baolin Wu University of Minnesota baolin@biostat.umn.edu
Jeff C. Wu Georgia Institute of Tech. jeffwu@isye.gatech.edu
Qiang Wu University of Pittsburgh qiw8@pitt.edu
Yichao Wu University of North Carolina wuy@email.unc.edu
Joseph A. Yahav The Hebrew Univ. of Jerusalem j-ya@bezeqint.net
Zhiliang Ying Columbia University zying@stat.columbia.edu
Angela Yu Princeton University ajyu@princeton.edu
Peng Zeng Auburn University zengpen@auburn.edu
Chongqi Zhang Guangzhou University chongqi@gzhu.edu.cn
Chunming Zhang Univ. of Wisconsin at Madison cmzhang@stat.wisc.edu
Hao Zhang North Carolina State Univ. hzhang@stat.ncsu.edu
Heping Zhang Yale University heping.zhang@yale.edu
Jingjin Zhang Princeton University jingjinz@princeton.edu
Jin-Ting Zhang National Univ. of Singapore stazjt@nus.edu.sg
Zhengjun Zhang Univ. of Wisconsin at Madison zjz@stat.wisc.edu
Zhigang Zhang Oklahoma State Univ. zhigang.zhang@okstate.edu
Tian Zheng Columbia University tzheng@stat.columbia.edu
Jianhui Zhou University of Virginia jz9p@virginia.edu
Hongtu Zhu Columbia University hz2114@columbia.edu
Ji Zhu University of Michigan jizhu@umich.edu
Hui Zou University of Minnesota hzou@stat.umn.edu
Yijun Zuo Michigan State University zuo@stt.msu.edu
Frontiers of Statistics
— in honor of Professor Peter J. Bickel’s 65th Birthday
Edited by Jianqing Fan and Hira L. Koul
Imperial College Press
Table of Contents
1. Our Steps on the Bickel Way, by Kjell Doksum and Ya’acov Ritov 1
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Doing Well at a Point and Beyond . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Robustness, Transformations, Oracle-free Inference, and Stable Parameters . . . . . . . . 4
1.4 Distribution Free Tests, Higher Order Expansions, and Challenging Projects . . . . . . . 4
1.5 From Adaptive Estimation to Semiparametric Models . . . . . . . . . . . . . . . . . . . . 5
1.6 Hidden Markov Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.7 Non- and Semi-parametric Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.8 The Road to Real Life . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Bickel’s Publication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Part I. Semiparametric Modeling
2. Semiparametric Models: A Review of Progress since BKRW (1993), by Jon A. Wellner, Chris A. J. Klaassen and Ya’acov Ritov 25
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.2 Missing Data Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.3 Testing and Profile Likelihood Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.4 Semiparametric Mixture Model Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.5 Rates of Convergence via Empirical Process Methods . . . . . . . . . . . . . . . . . . . . 30
2.6 Bayes Methods and Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.7 Model Selection Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.8 Empirical Likelihood . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.9 Transformation and Frailty Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.10 Semiparametric Regression Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.11 Extensions to Non-i.i.d. Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.12 Critiques and Possible Alternative Theories . . . . . . . . . . . . . . . . . . . . . . . . . 35
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3. Efficient Estimators for Time Series, by Anton Schick and Wolfgang Wefelmeyer 45
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.2 Characterization of Efficient Estimators . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.3 Autoregression Parameter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.4 Innovation Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.5 Innovation Density . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.6 Conditional Expectation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.7 Stationary Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.8 Stationary Density . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.9 Transition Density . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4. On the Efficiency of Estimation for a Single-index Model, by Yingcun Xia and Howell Tong 63
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.2 Estimation via Outer Product of Gradients . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.3 Global Minimization Estimation Methods . . . . . . . . . . . . . . . . . . . . . . . . . . 68
4.4 Sliced Inverse Regression Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
4.5 Asymptotic Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.6 Comparisons in Some Special Cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
4.7 Proofs of the Theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
5. Estimating Function Based Cross-Validation, by M. J. van der Laan and Dan Rubin 87
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
5.2 Estimating Function Based Cross-Validation . . . . . . . . . . . . . . . . . . . . . . . . . 90
5.3 Some Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
5.4 General Finite Sample Result . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
5.5 Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
Part II. Nonparametric Methods
6. Powerful Choices: Tuning Parameter Selection Based on Power, by Kjell Doksum and Chad Schafer 113
6.1 Introduction: Local Testing and Asymptotic Power . . . . . . . . . . . . . . . . . . . . . 114
6.2 Maximizing Asymptotic Power . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
6.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
6.4 Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
7. Nonparametric Assessment of Atypicality, by Peter Hall and Jim W. Kay 143
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
7.2 Estimating Atypicality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
7.3 Theoretical Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
7.4 Numerical Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
7.5 Outline of Proof of Theorem 7.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
8. Selective Review on Wavelets in Statistics, by Yazhen Wang 163
8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
8.2 Wavelets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
8.3 Nonparametric Regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
8.4 Inverse Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
8.5 Change-points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
8.6 Local Self-similarity and Non-stationary Stochastic Process . . . . . . . . . . . . . . . . 176
8.7 Beyond Wavelets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
9. Model Diagnostics via Martingale Transforms: A Brief Review, by Hira L. Koul 183
9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
9.2 Lack-of-fit Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
9.3 Censoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
9.4 Khmaladze Transform or Bootstrap . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
Part III. Statistical Learning and Bootstrap
10. Boosting Algorithms: with an Application to Bootstrapping Multivariate Time Series, by Peter Bühlmann and Roman W. Lutz 209
10.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
10.2 Boosting and Functional Gradient Descent . . . . . . . . . . . . . . . . . . . . . . . . . . 211
10.3 L2-Boosting for High-dimensional Multivariate Regression . . . . . . . . . . . . . . . . . 217
10.4 L2-Boosting for Multivariate Linear Time Series . . . . . . . . . . . . . . . . . . . . . . . 222
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
11. Bootstrap Methods: A Review, by S. N. Lahiri 231
11.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
11.2 Bootstrap for i.i.d. Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
11.3 Model Based Bootstrap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
11.4 Block Bootstrap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
11.5 Sieve Bootstrap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
11.6 Transformation Based Bootstrap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
11.7 Bootstrap for Markov Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
11.8 Bootstrap under Long Range Dependence . . . . . . . . . . . . . . . . . . . . . . . . . . 246
11.9 Bootstrap for Spatial Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
12. An Expansion for a Discrete Non-Lattice Distribution, by Friedrich Götze and Willem R. van Zwet 257
12.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
12.2 Proof of Theorem 12.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
12.3 Evaluation of the Oscillatory Term . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273
Part IV. Longitudinal Data Analysis
13. An Overview on Nonparametric and Semiparametric Techniques for Longitudinal Data, by Jianqing Fan and Runze Li 277
13.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277
13.2 Nonparametric Model with a Single Covariate . . . . . . . . . . . . . . . . . . . . . . . . 279
13.3 Partially Linear Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283
13.4 Varying-Coefficient Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291
13.5 An Illustration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293
13.6 Generalizations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294
13.7 Estimation of Covariance Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
14. Regressing Longitudinal Response Trajectories on a Covariate, by Hans-Georg Müller and Fang Yao 305
14.1 Introduction and Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305
14.2 The Functional Approach to Longitudinal Responses . . . . . . . . . . . . . . . . . . . . 311
14.3 Predicting Longitudinal Trajectories from a Covariate . . . . . . . . . . . . . . . . . . . . 313
14.4 Illustrations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321
Part V. Statistics in Science and Technology
15. Statistical Physics and Statistical Computing: A Critical Link, by James D. Servidea and Xiao-Li Meng 327
15.1 MCMC Revolution and Cross-Fertilization . . . . . . . . . . . . . . . . . . . . . . . . . . 328
15.2 The Ising Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328
15.3 The Swendsen-Wang Algorithm and Criticality . . . . . . . . . . . . . . . . . . . . . . . 329
15.4 Instantaneous Hellinger Distance and Heat Capacity . . . . . . . . . . . . . . . . . . . . 331
15.5 A Brief Overview of Perfect Sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334
15.6 Huber’s Bounding Chain Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 336
15.7 Approximating Criticality via Coupling Time . . . . . . . . . . . . . . . . . . . . . . . . 340
15.8 A Speculation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343
16. Network Tomography: A Review and Recent Developments, by Earl Lawrence, George Michailidis, Vijayan N. Nair and Bowei Xi 345
16.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346
16.2 Passive Tomography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348
16.3 Active Tomography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352
16.4 An Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
16.5 Concluding Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 363
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364
Part VI. Financial Econometrics
17. Likelihood Inference for Diffusions: A Survey, by Yacine Aït-Sahalia 369
17.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 369
17.2 The Univariate Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371
17.3 Multivariate Likelihood Expansions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 378
17.4 Connection to Saddlepoint Approximations . . . . . . . . . . . . . . . . . . . . . . . . . 383
17.5 An Example with Nonlinear Drift and Diffusion Specifications . . . . . . . . . . . . . . . 386
17.6 An Example with Stochastic Volatility . . . . . . . . . . . . . . . . . . . . . . . . . . . . 389
17.7 Inference When the State is Partially Observed . . . . . . . . . . . . . . . . . . . . . . . 391
17.8 Application to Specification Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 399
17.9 Derivative Pricing Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 400
17.10 Likelihood Inference for Diffusions under Nonstationarity . . . . . . . . . . . . . . . . . 400
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 402
18. Nonparametric Estimation of Production Efficiency, by Byeong U. Park, Seok-Oh Jeong, and Young Kyung Lee 407
18.1 The Frontier Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 407
18.2 Envelope Estimators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 409
18.3 Order-m Estimators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 417
18.4 Conditional Frontier Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 421
18.5 Outlook . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 424
Part VII. Parametric Techniques and Inferences
19. Convergence and Consistency of Newton’s Algorithm for Estimating Mixing Distribution, by Jayanta K. Ghosh and Surya T. Tokdar 429
19.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 429
19.2 Newton’s Estimate of Mixing Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . 431
19.3 Review of Newton’s Result on Convergence . . . . . . . . . . . . . . . . . . . . . . . . . 432
19.4 Convergence Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 433
19.5 Other Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 438
19.6 Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 440
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 442
20. Mixed Models: An Overview, by Jiming Jiang and Zhiyu Ge 445
20.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 445
20.2 Linear Mixed Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 446
20.3 Generalized Linear Mixed Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450
20.4 Nonlinear Mixed Effects Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 460
21. Robust Location and Scatter Estimators in Multivariate Analysis, by Yijun Zuo 467
21.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 467
21.2 Robustness Criteria . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 469
21.3 Robust Multivariate Location and Scatter Estimators . . . . . . . . . . . . . . . . . . . 473
21.4 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 481
21.5 Conclusions and Future Works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 484
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 485
22. Estimation of the Loss of an Estimate, by Wing Hung Wong 491
22.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 491
22.2 Kullback-Leibler Loss and Exponential Families . . . . . . . . . . . . . . . . . . . . . . . 493
22.3 Mean Square Error Loss . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 495
22.4 Location Families . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 496
22.5 Approximate Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 498
22.6 Convergence of the Loss Estimate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 502
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 506
Subject Index 507
Author Index 511
Special Thanks
to
Mary Beth Falke
Connie Brown
Zoya Kramer
and
Michael Bino, Lisa Glass, Noelina Hall, Kimberly Lupinacci