An update on the Statistical Toolkit Barbara Mascialino, Maria Grazia Pia, Andreas Pfeiffer, Alberto...
-
Upload
allison-booker -
Category
Documents
-
view
222 -
download
3
Transcript of An update on the Statistical Toolkit Barbara Mascialino, Maria Grazia Pia, Andreas Pfeiffer, Alberto...
An update on the An update on the Statistical ToolkitStatistical Toolkit
Barbara Mascialino, Maria Barbara Mascialino, Maria Grazia Pia, Andreas Pfeiffer, Grazia Pia, Andreas Pfeiffer,
Alberto Ribon, Alberto Ribon,
Paolo ViarengoPaolo Viarengo
July 19July 19thth, 2005, 2005
G.A.P Cirrone, S. Donadio, S. Guatelli, A. Mantero, B. Mascialino, S. Parlati, M.G. Pia, A. Pfeiffer, A. Ribon, P. Viarengo
“A Goodness-of-Fit Statistical Toolkit”IEEE- Transactions on Nuclear Science (2004), 51 (5): 2056-2063.
Release StatisticsTesting-V1-01-00 downloadable from the web:http://www.ge.infn.it/geant4/analysis/HEPstatistics/
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
EMPIRICAL DISTRIBUTION FUNCTIONORIGINAL DISTRIBUTIONS
• Kolmogorov-Smirnov test
• Goodman approximation of KS test
• Kuiper test
)(
4 22
nm
nmDmn
)()( xGxFSupD mnmn
)()()()( 00* xFxFMaxxFxFMaxD TT
Dmn
Tests based on maximum distanceTests based on maximum distanceunbinned distributionsunbinned distributions
SUPREMUMSUPREMUMSTATISTICSSTATISTICS
• Fisz-Cramer-von Mises test
• Anderson-Darling test
i
ii xFxFnn
nnt 2
21221
21 )]()([)(
i k kkk
kiikk
ik nh
HnH
HnnFh
nkn
nA
4)(
)(1
)1(
)1( 2
22
Tests containing a weighting functionTests containing a weighting function
binned/unbinned distributionsbinned/unbinned distributions
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
EMPIRICAL DISTRIBUTION FUNCTIONORIGINAL DISTRIBUTIONS
QUADRATICQUADRATICSTATISTICSSTATISTICS
+ + WEIGHTING WEIGHTING FUNCTIONFUNCTION
Sum/integral of all the distances
1.1. Status of the existing testsStatus of the existing tests
2. New GoF Tests added2. New GoF Tests added
3. Description of the power study3. Description of the power studyphase Iphase I
4. Description of the power study4. Description of the power studyphase IIphase II
5. A concrete example: IMRT5. A concrete example: IMRT
1.1. Status of the existing tests:Status of the existing tests:Fisz-Cramer-von MisesFisz-Cramer-von Mises
Conover (book) + Darling (1957):
- The two-sample Cramer-von Mises test (Fisz test) has the same asymptotic distribution of the one-sample test (Cramer-von Mises test).
- The equation of the asymptotic distribution is available in the paperby Anderson and Darling (1952).
binned/unbinned distributionsbinned/unbinned distributions
1.1. Status of the existing tests:Status of the existing tests:two-sample Anderson-Darlingtwo-sample Anderson-Darling
Scholz and Stephens (1987):-The two-sample Anderson-Darling test can be written in different ways:
- exact formulation (for unbinned distributions only)- approximated formulation (for binned/unbinned distributions)
-The approximated distance is already available in the toolkit.
-The asymptotic distributions of both exact and approximated formulations are available in the paper.- The two-sample Anderson-Darling test has the same asymptotic distribution of the one-sample test.
binned/unbinned distributionsbinned/unbinned distributions
1.1. Status of the existing tests:Status of the existing tests:Tiku testTiku test
Tiku (1965):
- Cramer-von Mises test in a chi-squared approximation.
- Cramer-von Mises test statistics is converted into a central chi-square, bypassing the problem of integrating the weighting function.
binned/unbinned distributionsbinned/unbinned distributions
2. New GoF Tests:2. New GoF Tests:weighted Kolmogorov-Smirnovweighted Kolmogorov-Smirnov
Canner (1975) & Buning (2001):
- Canner modified KS test introducing one weighting function identical to the one used in AD test.
- Buning modified KS test introducing one weighting function similar to the one used in AD test.
- The equation of the asymptotic distribution is not available in Canner’s paper, only a few critical values for some samples sizes (n=m).
))()( xxGxFSupKSW mnmn
unbinned distributionsunbinned distributions
2. New GoF Tests:2. New GoF Tests:weighted Cramer von Misesweighted Cramer von Mises
Buning (2001):
- Buning modified CVM test introducing one weighting function similar to the one used in AD test.
- The equation of the asymptotic distribution is not available in the paper, only critical values for many samples sizes.
unbinned distributionsunbinned distributions
)][ xGFCVMW mnmn 2
2. New GoF Tests:2. New GoF Tests:WatsonWatson
Watson (1975):
-Derives from Cramer-von Mises test statistics.
- Like Kuiper test it can be applied in case of cyclic observations.
- The equation of the asymptotic distribution is not available in the paper, only critical values for many samples sizes.
)()()()( xFxGxGxFW nmmn 2 2
Other newsOther news
• New user layer dealing with ROOT histograms (Andreas is working on that).
• Paper to IEEE-TNS
• Next release of the GoF Statistical Toolkit scheduled within summer.
Future developmentsFuture developments
• Fix some design-related problems.
• New design (add uncertainties).
• Extend the toolkit to the comparison of:– Experimental data versus theoretical functions,– k-sample problem,– Many dimensional one-, two-, k-sample problem.
Which is the recipe toWhich is the recipe toselect the most suitableselect the most suitable
Goodness-of-Fit testGoodness-of-Fit testamong the ones available inamong the ones available inthe GoF Statistical Toolkit?the GoF Statistical Toolkit?
3. Description of the3. Description of thepower study – phase Ipower study – phase I
SAMPLE 1SAMPLE 1
RESULTS:RESULTS:LOCATION-SCALE LOCATION-SCALE
ALTERNATIVEALTERNATIVE
SAMPLE 2SAMPLE 2TESTTEST
PARENT 1PARENT 1 PARENT 2PARENT 2MONTEMONTECARLOCARLO
REPLICATIONSREPLICATIONSk=1000k=1000
EDF STATISTICS (UNBINNED DATA): EDF STATISTICS (UNBINNED DATA): KS, KSW, KSA, KUIPER, CVM, ADAKS, KSW, KSA, KUIPER, CVM, ADA
““EMPIRICAL” POWER EVALUATIONEMPIRICAL” POWER EVALUATION
RESULTS:RESULTS:GENERAL GENERAL
ALTERNATIVEALTERNATIVE
COMPARISON WITHCOMPARISON WITHPUBLISHED PUBLISHED
RESULTSRESULTS
REALREALDATADATA
EXAMPLESEXAMPLES
Parent distributionsParent distributions
1)(1 xf
Uniform)
2(
2
2
2
1)(
x
exf
Gaussian
||3
2
1)( xexf
Double exponential
24
1
11)(
xxf
Cauchy
xexf )(5
Exponential
Contaminated Normal Distribution 2
)1,1(5.0)4,1(5.0)(7 xf
)9,0(1.0)1,0(9.0)(6 xf
Contaminated Normal Distribution 1
Skewness and tailweightSkewness and tailweight
025.05.0
5.0975.0
xx
xxS
125.0875.0
025.0975.0
xx
xxT
ParentParent SS TTf1(x) Uniform 1 1.267
f2(x) Gaussian 1 1.704
f3(x) Double exponential 1 2.161
f4(x) Cauchy 1 5.263
f5(x) Exponential 4.486 1.883
f6(x) Contamined normal 1
1 1.991
f7(x) Contamined normal 2
1.769 1.693
SkewnessSkewness TailweightTailweight
Comparative evaluation of testsComparative evaluation of tests
ShortShort
(T(T<1.5)<1.5)
MediumMedium
(1.5 < T < 2)(1.5 < T < 2)
LongLong
(T>2)(T>2)
SS~~11 KSKS KS – CVMKS – CVM CVM - ADCVM - AD
SS>1.5>1.5 KS - ADKS - AD ADAD CVM - ADCVM - AD
Skewness
Skewness
TailweightTailweight
2222 Supremum Supremum statistics statistics
teststests
Supremum Supremum statistics statistics
teststests
Tests Tests containing a containing a
weight functionweight function
Tests Tests containing a containing a
weight functionweight function< <
4. Description of the4. Description of thepower study – phase IIpower study – phase II
SAMPLE 1SAMPLE 1
RESULTS:RESULTS:LOCATION-SCALE LOCATION-SCALE
ALTERNATIVEALTERNATIVE
SAMPLE 2SAMPLE 2TESTTEST
PARENT 1PARENT 1 PARENT 2PARENT 2MONTEMONTECARLOCARLO
REPLICATIONSREPLICATIONSk=1000k=1000
BINNED/UNBINNED DATABINNED/UNBINNED DATACHI2, KS, KSW, KSA, KUIPER, CVM, CVMA, ADA, ADCHI2, KS, KSW, KSA, KUIPER, CVM, CVMA, ADA, AD
““EMPIRICAL” + “MC” POWER EVALUATIONEMPIRICAL” + “MC” POWER EVALUATION
RESULTS:RESULTS:GENERAL GENERAL
ALTERNATIVEALTERNATIVE
LINEAR LINEAR POWERPOWER
CORRELATIONCORRELATIONBETWEENBETWEEN
TESTSTESTSISODYNESISODYNES
Which is the most suitable goodness-of-fit test?
EXAMPLE EXAMPLE : unbinned dataLateral profilesLateral profiles
5. A concrete example: IMRT5. A concrete example: IMRT
Mich
ela
Mich
ela
Pierge
ntili
Pierge
ntili
mn GFH :0
mn GFH :1
GoF test selectionGoF test selection
025.05.0
5.0975.0
xx
xxS
125.0875.0
025.0975.0
xx
xxT
SkewnessSkewness TailweightTailweight
S = 1: symmetric distributionS < 1: left skewed distributionS > 1: right skewed distribution
T is always greater than 1, the longer the tail the greater
the value of T.
1.1. Classify the type of the distributions in terms of Classify the type of the distributions in terms of skewness S and tailweight Tskewness S and tailweight T
Comparative evaluation of tests powerComparative evaluation of tests power
ShortShort
(T(T<1.3)<1.3)
MediumMedium
(1.3 < T < 2)(1.3 < T < 2)
LongLong
(T>2)(T>2)
SS~~11 KSKS KS – CVMKS – CVM CVM - ADCVM - AD
SS>1.5>1.5 KS - ADKS - AD ADAD CVM - ADCVM - ADSkewness
Skewness
TailweightTailweight
2. Choose the most appropriate test for the classified 2. Choose the most appropriate test for the classified type of distributiontype of distribution
GoF test: test selection & resultsGoF test: test selection & results
Moderate skewed – medium tail
KOLMOGOROV-SMIRNOV TESTD=0.27 – p>0.05
X-variable: Ŝ=1.53 T=1.36
Y-variable: Ŝ=1.27 T=1.34
^
^
RESULTSRESULTS: unbinned data