Testing for Serial Correlation by means of Extreme Values
Ishay Weissman, Technion - Israel Institute of Technology
Vimeiro 2013
A typical lecture in statistics begins as follows:
Let $X_1, X_2, \dots, X_n$ be an iid sample from some df $F$ ...
I will open with
Ivette, Ivette Jr., Tiago de Oliveira
Vimeiro, 1983
And now, let $X_1, X_2, \dots, X_n$ be a sample from a continuous df $F_0$, where $F_0$ is $U[0,1]$ (if not, replace $X_i$ by $F_0(X_i)$).
We suspect the data exhibit some serial correlation (dependence).
The main purpose of this work is to study the effectiveness of the LARGEST SPACING (LS) as a tool to detect serial dependence.
OVERVIEW
- Background on Spacings
- Possible Competitors
- Autoregressive Model and a Surprising
Connection to Extreme Values
- Power Comparisons
- Two More Models
- Conclusions
We want to test
$H_0$: "iid-uniform".
There is no optimal test against all possible alternatives!
We concentrate on the Autoregressive Model
$$X_i = \rho X_{i-1} + (1-\rho)\,U_i \qquad (1 \le i \le n,\ 0 \le \rho \le 1),$$
where $\{U_i : i \ge 0\}$ is an iid $U[0,1]$ sequence and $X_0 = U_0$.
So, here we test
$$H_0 : \rho = 0 \quad \text{vs.} \quad H_1 : \rho > 0 .$$
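The autoregressive recursion can be simulated directly. The following is only a minimal sketch, not the code behind the slides; the function name `simulate_ar` and the optional `rng` argument are assumptions made for illustration.

```python
import random

def simulate_ar(n, rho, rng=None):
    """Return [X_1, ..., X_n] with X_i = rho*X_{i-1} + (1-rho)*U_i and X_0 = U_0."""
    rng = rng or random.Random()
    x_prev = rng.random()          # X_0 = U_0
    xs = []
    for _ in range(n):
        # Each new value is a convex combination of the previous value
        # and a fresh U[0,1] draw, so it stays inside [0, 1].
        x_prev = rho * x_prev + (1.0 - rho) * rng.random()
        xs.append(x_prev)
    return xs

sample = simulate_ar(10, 0.3, random.Random(1))
```

With `rho = 0` the output is an iid uniform sample; with `rho = 1` every value equals $X_0$.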
SPACINGS
Let
$$Y_1 \le Y_2 \le \dots \le Y_n \qquad (Y_0 \equiv 0,\ Y_{n+1} \equiv 1)$$
be the order statistics of the $\{X_i\}$, let
$$V_i = Y_i - Y_{i-1} \qquad (i = 1, 2, \dots, n+1)$$
be the spacings, and let $V_{\max}$ be the largest.
When $\rho = 0$, for $0 \le y \le 1$,
$$P\{V_{\max} \le y\} = \sum_{j=0}^{n+1} (-1)^j \binom{n+1}{j} \{(1 - jy)_+\}^n$$
(Whitworth (1897), Darling (1953)).
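Both the largest spacing and its null distribution are easy to compute for small $n$. A minimal sketch follows; the function names are assumptions, and the alternating sum is evaluated naively, so floating-point cancellation can be expected for large $n$.

```python
from math import comb

def largest_spacing(xs):
    """V_max: largest gap between consecutive order statistics, with Y_0 = 0, Y_{n+1} = 1."""
    ys = [0.0] + sorted(xs) + [1.0]
    return max(b - a for a, b in zip(ys, ys[1:]))

def vmax_cdf(y, n):
    """P{V_max <= y} under H0 (iid uniform), for a sample of n points."""
    total = 0.0
    for j in range(n + 2):             # j = 0, ..., n+1
        base = 1.0 - j * y
        if base <= 0.0:                # (1 - j*y)_+ vanishes from here on
            break
        total += (-1) ** j * comb(n + 1, j) * base ** n
    return total
```

For $n = 1$ the two spacings are $X_1$ and $1 - X_1$, so $P\{V_{\max} \le y\} = 2y - 1$ on $[1/2, 1]$, which the sum reproduces.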
If $E_1, E_2, \dots, E_{n+1}$ are iid unit-exponential and
$$T_{n+1} = \sum_{i=1}^{n+1} E_i ,$$
then
$$(V_1, V_2, \dots, V_{n+1}) \overset{D}{=} \frac{(E_1, E_2, \dots, E_{n+1})}{T_{n+1}} = \frac{n+1}{T_{n+1}} \cdot \frac{(E_1, E_2, \dots, E_{n+1})}{n+1},$$
independent of $T_{n+1}$. Since $T_{n+1}/(n+1) \to 1$ a.s., for large $n$ the spacings behave (approximately) as iid exponential ($\lambda = n+1$).
Hence, for $-\infty < x < \infty$,
$$\lim_{n\to\infty} P\{(n+1)V_{\max} - \log(n+1) \le x\} = \exp\{-e^{-x}\},$$
i.e. attraction to the Gumbel distribution.
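Inverting the Gumbel limit gives an approximate critical value for the LS test. This is a sketch of that inversion only; the function name is an assumption, and the slides do not say how accurate the approximation is for moderate $n$.

```python
import math

def gumbel_critical_value(alpha, n):
    """Approximate c with P{V_max > c} ~ alpha under H0, from the Gumbel limit."""
    # Solve exp(-exp(-x)) = 1 - alpha for x ...
    x = -math.log(-math.log(1.0 - alpha))
    # ... then undo the normalisation (n+1)*V_max - log(n+1) = x.
    return (math.log(n + 1) + x) / (n + 1)

c = gumbel_critical_value(0.05, 1000)
```

For small $n$ the exact Whitworth formula is the safer choice.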
We want to compare the power of LS with that of some competitors.
That is, we compare the power of the test which rejects $H_0$ when $V_{\max} > c_\alpha$ with the powers of tests based on:
- Likelihood ratio (LR)
- Sample serial correlation (SSC)
- Kolmogorov-Smirnov (K-S)
LR: Most powerful; serves as a benchmark, to see
how close LS is to LR.
SSC: Least squares estimator of ρ, intuitive.
K-S: Very popular, similar in nature:
extreme vertical distance
vs.
extreme horizontal distance.
[Figure: Empirical df vs. Uniform df, two panels, both axes from 0 to 1. Panel captions: "LS and K-S occur together"; "K-S is large due to accumulation".]
LIKELIHOOD RATIO
Denote
$$X = (X_1, X_2, \dots, X_n), \qquad U = (U_1, U_2, \dots, U_n),$$
and let $U_0 = X_0 = x_0 \in [0,1]$. Then
$$U_i = (X_i - \rho X_{i-1})(1-\rho)^{-1} \qquad (1 \le i \le n).$$
The Jacobian of the map $X \mapsto U$ is $(1-\rho)^{-n}$.
Hence the joint density of $X$, conditioned on $U_0 = X_0 = x_0$, at $x \in [0,1]^n$, is given by
$$f_X(x) = (1-\rho)^{-n} \prod_{i=1}^{n} I\{\rho x_{i-1} \le x_i \le \rho x_{i-1} + 1 - \rho\}$$
$$= (1-\rho)^{-n}\, I\left\{ \rho \le \min_{1 \le i \le n} \min\left( \frac{x_i}{x_{i-1}},\ \frac{1-x_i}{1-x_{i-1}} \right) \right\}.$$
Let
$$T_i = \min\left( \frac{X_i}{X_{i-1}},\ \frac{1-X_i}{1-X_{i-1}} \right) \in [\rho, 1] \qquad (1 \le i \le n), \qquad T_{\min} = \min_{1 \le i \le n} T_i ;$$
then the following facts follow from the joint density of $X$ derived on the LIKELIHOOD RATIO slide:
Fact 1. The $\{T_i\}$ are iid uniform on $[\rho, 1]$.
Fact 2. The likelihood function is given by
$$L(\rho) = (1-\rho)^{-n}\, I\{\rho \le T_{\min}\} \qquad (0 \le \rho \le 1).$$
Fact 3. The statistic $T_{\min}$ is sufficient with respect to $\rho$ and it is the maximum likelihood estimator (MLE) of $\rho$.
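The statistics $T_i$ need the starting value $x_0$ as well as the sample, because the likelihood is conditional on $X_0 = x_0$. A minimal sketch, with a hypothetical function name:

```python
def t_statistics(x0, xs):
    """Return [T_1, ..., T_n], T_i = min(X_i/X_{i-1}, (1-X_i)/(1-X_{i-1}))."""
    ts = []
    prev = x0
    for x in xs:
        # Assumes 0 < prev < 1, which holds with probability 1 for continuous data.
        ts.append(min(x / prev, (1.0 - x) / (1.0 - prev)))
        prev = x
    return ts

ts = t_statistics(0.5, [0.6, 0.3])   # T_1 = min(1.2, 0.8), T_2 = min(0.5, 1.75)
rho_hat = min(ts)                    # T_min, the MLE of rho
```

In this toy example $T_1 = 0.8$ and $T_2 = 0.5$, so $T_{\min} = 0.5$.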
Fact 4. For testing
$$H_0 : \rho = 0 \quad \text{vs.} \quad H_1 : \rho > 0$$
the test which rejects $H_0$ when
$$T_{\min} > c_\alpha = 1 - \alpha^{1/n}$$
is the most powerful $\alpha$-level test, with power given by
$$\pi_\alpha(\rho) = \begin{cases} \alpha\,(1-\rho)^{-n} & \text{if } \rho \le c_\alpha , \\ 1 & \text{if } \rho \ge c_\alpha . \end{cases}$$
Interesting case:
a sample extreme (minimum) is most powerful for testing the existence of serial correlation!
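The critical value and the power function in Fact 4 are closed-form, so they can be evaluated directly. A small sketch, with assumed function names:

```python
def lr_critical_value(alpha, n):
    """c_alpha = 1 - alpha**(1/n): under H0, P{T_min > c} = (1 - c)**n = alpha."""
    return 1.0 - alpha ** (1.0 / n)

def lr_power(rho, alpha, n):
    """Power of the test 'reject H0 when T_min > c_alpha' at a given rho."""
    c = lr_critical_value(alpha, n)
    if rho >= c:
        return 1.0                      # T_min >= rho >= c_alpha almost surely
    return alpha / (1.0 - rho) ** n

c10 = lr_critical_value(0.05, 10)       # roughly 0.259
```

For $n = 10$ and $\alpha = .05$ the critical value is about 0.259, so any $\rho$ above that is detected with probability 1.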
POWER COMPARISONS
For each pair $(\rho, n)$ we generated $10^5$ samples
from the autoregressive model and computed
the (empirical) power, namely, the proportion
of samples for which $H_0 : \rho = 0$ was rejected.
The significance level is $\alpha = .05$ in all cases.
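The simulation design can be sketched for the LS and LR tests as follows. This is an illustration under stated assumptions, not the code used for the slides: all names are hypothetical, far fewer replications than $10^5$ are used by default, the LS critical value is estimated by simulation under $H_0$ rather than from the exact formula, and the K-S and SSC tests are left out.

```python
import random

def simulate_ar(n, rho, rng):
    """Return (x0, [X_1..X_n]) from X_i = rho*X_{i-1} + (1-rho)*U_i."""
    x = x0 = rng.random()
    xs = []
    for _ in range(n):
        x = rho * x + (1.0 - rho) * rng.random()
        xs.append(x)
    return x0, xs

def largest_spacing(xs):
    ys = [0.0] + sorted(xs) + [1.0]
    return max(b - a for a, b in zip(ys, ys[1:]))

def t_min(x0, xs):
    prev, best = x0, float("inf")
    for x in xs:
        best = min(best, x / prev, (1.0 - x) / (1.0 - prev))
        prev = x
    return best

def empirical_power(n, rho, alpha=0.05, reps=2000, seed=0):
    """Return (LS power, LR power): proportions of samples rejecting H0."""
    rng = random.Random(seed)
    # LS critical value: empirical upper-alpha quantile of V_max under H0.
    null = sorted(largest_spacing(simulate_ar(n, 0.0, rng)[1]) for _ in range(reps))
    c_ls = null[min(int((1.0 - alpha) * reps), reps - 1)]
    c_lr = 1.0 - alpha ** (1.0 / n)     # exact critical value for T_min
    rej_ls = rej_lr = 0
    for _ in range(reps):
        x0, xs = simulate_ar(n, rho, rng)
        rej_ls += largest_spacing(xs) > c_ls
        rej_lr += t_min(x0, xs) > c_lr
    return rej_ls / reps, rej_lr / reps

ls_power, lr_power_hat = empirical_power(20, 0.1)
```

Calling `empirical_power` over a grid of `rho` values for each `n` would give the kind of power curves the plots show.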
[Figure: Power Functions, Autoregressive Model, α = .05. Ten panels, one per sample size n = 10, 20, 50, 100, 200, 500, 1000, 2000, 5000, 10000; horizontal axis: rho. Curves: LR (blue), LS (black), K-S (red), SSC (green).]
To be fair to Kolmogorov-Smirnov, we have
run similar simulations on samples from beta
models beta($\gamma$, 1), namely
$$X_i = U_i^{1/\gamma} .$$
These are independent, but not uniform.
Here,
$$H_0 : \gamma = 1 \quad \text{vs.} \quad H_1 : \gamma > 1$$
or
$$H_0 : \gamma = 1 \quad \text{vs.} \quad H_2 : \gamma < 1 .$$
[Figure: Power Functions, Beta Model, α = .05. Panels for n = 10, 50, 100, 1000; horizontal axis: 1/gamma. Curves: LR (blue), LS (black), K-S (red).]
LR here refers to the likelihood ratio test for
this model (∗), namely the most powerful test.
K-S tends to the optimum, while LS stays
far below.
(∗) Reject $H_0$ vs. $\gamma > 1$ when $-2\sum \log X_i < \chi^2_{2n}(.05)$.
Reject $H_0$ vs. $\gamma < 1$ when $-2\sum \log X_i > \chi^2_{2n}(.95)$.
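A short derivation, not on the slides, may help explain why the statistic $-2\sum \log X_i$ is compared with chi-square quantiles. It uses only the representation $X_i = U_i^{1/\gamma}$ and the standard fact that $-2\log U \sim \chi^2_2$ for $U \sim U[0,1]$.

```latex
X_i = U_i^{1/\gamma}
\;\Longrightarrow\;
-2\log X_i = \frac{1}{\gamma}\,\bigl(-2\log U_i\bigr),
\qquad -2\log U_i \sim \chi^2_2 ,
\\[4pt]
-2\sum_{i=1}^{n} \log X_i \;\overset{D}{=}\; \frac{1}{\gamma}\,\chi^2_{2n} .
```

Under $H_0$ ($\gamma = 1$) the statistic is exactly $\chi^2_{2n}$; $\gamma > 1$ shrinks it, so small values are evidence for $H_1$, while $\gamma < 1$ inflates it, so large values are evidence for $H_2$.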
TWO MORE MODELS
Binomial Model:
Let $B_1, B_2, \dots$ be an iid Bernoulli sequence with
parameter $p$, independent of the $\{U_i\}$
sequence.
Define
$$Y_i = B_i Y_{i-1} + (1 - B_i)\,U_i \qquad (i \ge 1,\ Y_0 = U_0).$$
Notice that the marginal distribution of $Y_i$ is $U[0,1]$,
and that the first serial correlation and $P\{Y_i = Y_{i+1}\}$
are both equal to $p$. Clusters of equal neighbors are of
random (geometric) length with mean $1/(1-p)$, so the
extremal index is equal to $1 - p$.
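The binomial model is simple to simulate, and the tie probability can be checked empirically. A minimal sketch; the function names are assumptions, and a single pseudo-random generator supplies both the $B_i$ and the $U_i$.

```python
import random

def simulate_binomial_model(n, p, rng=None):
    """Return [Y_1, ..., Y_n] with Y_i = B_i*Y_{i-1} + (1-B_i)*U_i, Y_0 = U_0."""
    rng = rng or random.Random()
    y = rng.random()                 # Y_0 = U_0
    ys = []
    for _ in range(n):
        if rng.random() >= p:        # B_i = 0 (probability 1-p): draw a fresh U_i
            y = rng.random()
        ys.append(y)                 # B_i = 1 (probability p): repeat Y_{i-1}
    return ys

def tie_fraction(seq):
    """Proportion of neighbouring pairs that are exactly equal."""
    return sum(a == b for a, b in zip(seq, seq[1:])) / (len(seq) - 1)

ys = simulate_binomial_model(20000, 1 / 3, random.Random(0))
```

For long sequences `tie_fraction(ys)` should settle near $p$.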
Moving-max model:
Let $\xi_1, \xi_2, \dots$ be a sequence of iid $\beta(k^{-1}, 1)$
random variables, where $k$ is a fixed positive
integer. Let
$$Z_i = \max\{\xi_i, \xi_{i+1}, \dots, \xi_{i+k-1}\} \qquad (i \ge 1).$$
The $Z$-sequence is called a moving-max
sequence of order $k$. For each $i$, $Z_i$ is
$U[0,1]$-distributed but neighboring values are
dependent. Upper extreme values appear in
clusters of size $k$, which implies that the
extremal index is equal to $k^{-1}$.
For $k = 2$, the first serial correlation is 3/7
and $P\{Z_i = Z_{i+1}\} = 1/3$.
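A moving-max sequence can be generated by inverse transform: a $\beta(k^{-1}, 1)$ variable has df $x^{1/k}$ on $[0,1]$, so it can be sampled as $U^k$. The sketch below uses hypothetical function names.

```python
import random

def simulate_moving_max(n, k, rng=None):
    """Return [Z_1, ..., Z_n], Z_i = max(xi_i, ..., xi_{i+k-1}), xi ~ beta(1/k, 1)."""
    rng = rng or random.Random()
    # beta(1/k, 1) has df x**(1/k); inverse transform gives xi = U**k.
    xi = [rng.random() ** k for _ in range(n + k - 1)]
    return [max(xi[i:i + k]) for i in range(n)]

def tie_fraction(seq):
    """Proportion of neighbouring pairs that are exactly equal."""
    return sum(a == b for a, b in zip(seq, seq[1:])) / (len(seq) - 1)

zs = simulate_moving_max(30000, 2, random.Random(0))
```

For $k = 2$ neighbouring values tie exactly when the shared $\xi$ is the largest of three, which matches the stated probability 1/3.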
[Figure: Scatter points $(i, Y_i)$ and $(i, Z_i)$ for i from 0 to 100. Panels: Binomial, p = .333; Moving-Max(2).]
The two plots look very similar. In both
cases, the experienced practitioner will reject
the independence hypothesis simply because,
for continuous random variables, the probability
of a tie is 0. We include these cases to see how
well the LS and K-S tests detect the dependence.
[Figure: Power Functions, α = .05. Panels: Moving-Max(2) and Moving-Max(3); horizontal axis: n from 1 to 10000 on a logarithmic scale. Curves: LS (black), K-S (red).]
The K-S test is not consistent!
(Similar results for the Binomial Model.)
CONCLUSION
- We presented here evidence (not a
theorem) that the largest spacing is quite
sensitive to serial dependence.
- K-S is more sensitive to deviation from
"uniform distribution".
- As a byproduct, in the Autoregressive
Model, the optimal test for serial correlation
is based on lower extremes.
THANK YOU FOR
YOUR ATTENTION
SEE YOU ALL
IN VIMEIRO 2043