Modern Methods of Data Analysis - WS 07/08 Stephanie Hansmann-Menzemer
Modern Methods of Data Analysis
Lecture XIIa (14.01.08)
Contents:
● Goodness of Fit
Re: Exercise
● Tests are made of a detection system which is claimed to be “at least 90%” efficient.
● How many tests should be made to establish this claim (> 90%) with a significance of 5% and to reject the alternative hypothesis “< 80% efficient” with a power of 99%?
● Solution: number of tests N ≥ 232;
● for N = 232, the hypothesis (detection efficiency > 90%) is accepted if ≥ 199 tests are successful.
● Question: A test with N = 232 is performed and n tests are successful (e.g. n = 210). Can I use the same data to check a second hypothesis, e.g. detection efficiency > 95%, or do I introduce a bias by already knowing the test result?
Goodness of Fit
● Is the data distribution consistent with a given pdf f(x)?
– no assumption about an alternative hypothesis
– compute the probability (assuming f(x) is true) of observing data at least as incompatible with f(x) as the data actually observed.
Χ²-Test for Histogram
● If the data points are numbers of events n[i] in bin i of a histogram with B bins, then
  $\chi^2 = \sum_{i=1}^{B} (n_i - \nu_i)^2 / \nu_i$, with $\nu_i$: # expected events in bin i
● bin content should be > 10, so that the Gaussian approximation is justified
● if experimental data & theoretical prediction are normalized to the same total number of entries, then ndf = B - 1
● if the data are used for a fit, ndf = B - # fit parameters
● Note:
– one should not use the Χ² test as the one and only criterion; also inspect the data!
– check for local irregularities
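A minimal sketch of the histogram Χ² computation described above, assuming the usual Pearson form (squared deviation divided by the expected bin content). The function name `chi2_histogram`, its arguments and the bin contents are hypothetical and only serve as illustration; the ndf rules follow the bullets of this slide.

```python
def chi2_histogram(observed, expected, n_fit_params=0, normalised=True):
    """Return (chi2, ndf) for observed bin counts against predicted bin counts."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need the same number of bins")
    if min(observed) <= 10:
        # Rule of thumb from the slide: Gaussian approximation needs > 10 entries
        print("warning: at least one bin has <= 10 entries")

    chi2 = sum((n - nu) ** 2 / nu for n, nu in zip(observed, expected))

    n_bins = len(observed)
    if n_fit_params > 0:
        ndf = n_bins - n_fit_params      # prediction was fitted to these data
    elif normalised:
        ndf = n_bins - 1                 # prediction scaled to the same total
    else:
        ndf = n_bins                     # fully independent prediction
    return chi2, ndf


# Hypothetical bin contents; both histograms sum to 90 entries.
observed = [12, 18, 25, 21, 14]
expected = [15, 20, 20, 20, 15]
chi2, ndf = chi2_histogram(observed, expected)
print(f"chi2 = {chi2:.3f} for ndf = {ndf}")
```

A Χ² close to ndf indicates compatibility; the value alone does not reveal local irregularities, which is why the slide recommends inspecting the data as well.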
Χ²-Test:
● Χ² is insensitive to the sign of (y[i] – f(x[i]))
● Χ² is insensitive to the sequence of y[i] => it is insensitive to trends
● the larger the number of data points, the smaller the impact of a single outlier => Χ² tests get less powerful with an increasing number of degrees of freedom
● It is only valid for Gaussian error distributions or an approximation ....
● It is the only test which remains valid if the data sample and the test sample are identical (with ndf corrected)!
Example: Lottery
● In a lottery 1000 lots are drawn every week. Over 10 weeks the numbers of wins are: n[i] = {24, 15, 17, 18, 26, 24, 32, 33, 29, 32}. In total there were 250 winning lots, on average μ = 25 per week.
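● Χ² test of the hypothesis that each n[i] fluctuates around μ = 25: $\chi^2 = \sum_{i=1}^{10} (n_i - \mu)^2 / \mu$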
● Now form two groups {1-5} and {6-10} and test the hypothesis that each sum fluctuates around 125:
– grouping into two blocks of five was quite ad hoc
● biased CL
● a fair method needs to consider all possible partitions
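The two lottery tests can be checked numerically. The sketch below is an illustration, not the lecture's own calculation: it assumes the variance of each count equals its expectation (Poisson approximation), takes ndf = number of counts - 1 because the mean is computed from the same data, and uses a hypothetical helper `chi2_sf` for the Χ² tail probability.

```python
import math


def chi2_sf(x, ndf):
    """Upper tail probability P(chi2 > x) for integer ndf >= 1.

    Uses the recursion Q(x; k+2) = Q(x; k) + (x/2)^(k/2) e^(-x/2) / Gamma(k/2 + 1),
    started from Q(x; 1) = erfc(sqrt(x/2)) or Q(x; 2) = exp(-x/2).
    """
    if x <= 0:
        return 1.0
    half = x / 2.0
    if ndf % 2 == 0:
        q, k = math.exp(-half), 2
    else:
        q, k = math.erfc(math.sqrt(half)), 1
    while k < ndf:
        q += math.exp((k / 2.0) * math.log(half) - half - math.lgamma(k / 2.0 + 1.0))
        k += 2
    return q


wins = [24, 15, 17, 18, 26, 24, 32, 33, 29, 32]

# Test 1: ten weekly counts around the common mean
mu = sum(wins) / len(wins)
chi2_weeks = sum((n - mu) ** 2 / mu for n in wins)
ndf_weeks = len(wins) - 1
p_weeks = chi2_sf(chi2_weeks, ndf_weeks)

# Test 2: the ad hoc split into weeks 1-5 and 6-10
halves = [sum(wins[:5]), sum(wins[5:])]
mu_half = sum(wins) / 2
chi2_halves = sum((s - mu_half) ** 2 / mu_half for s in halves)
ndf_halves = len(halves) - 1
p_halves = chi2_sf(chi2_halves, ndf_halves)

print(f"weeks:  chi2 = {chi2_weeks:.2f}, ndf = {ndf_weeks}, p = {p_weeks:.3f}")
print(f"halves: chi2 = {chi2_halves:.2f}, ndf = {ndf_halves}, p = {p_halves:.4f}")
```

Under these assumptions the weekly test gives a p-value of several percent, while the split into two halves gives a far smaller one. Since that split was chosen after looking at the data, the small value overstates the evidence, which is the bias the slide warns about.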
Kolmogorov Smirnov Test (I)
● Take a random data sample x[i] of size n; order the data: x[1] ≤ x[2] ≤ ... ≤ x[n]
● Define the step function $F_n(x)$ = (number of x[i] ≤ x) / n
● $F_n(x)$ has to be compared to the cumulative distribution F(x), which corresponds to the hypothesis f(x)
Kolmogorov Smirnov Test (II)
● form the Kolmogorov-Smirnov test statistic d from the step function $F_n(x)$ of the ordered sample, given by: $d = \sqrt{n} \cdot \max_x |F_n(x) - F(x)|$
● a small value of d means good agreement. If d is larger than a given critical value (from a table), reject the hypothesis
● For a large number of events (≥ 50) the significance is given by the following table (otherwise more complicated tables exist, depending on the number of measurements):
  critical value   1.63   1.36   1.22   1.07
  significance     1%     5%     10%    20%
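A short sketch of how the KS statistic can be evaluated for an unbinned sample. It assumes the scaled form d = sqrt(n) * D, with D the largest distance between the empirical step function and F(x); this scaling is what the critical values 1.63, 1.36, 1.22, 1.07 correspond to. The names `ks_statistic` and `uniform_cdf` and the sample values are hypothetical.

```python
import math


def ks_statistic(sample, cdf):
    """Return (D, d): D = max |F_n(x) - F(x)| and d = sqrt(n) * D."""
    xs = sorted(sample)
    n = len(xs)
    big_d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The step function jumps from (i-1)/n to i/n at x, so check both sides.
        big_d = max(big_d, i / n - f, f - (i - 1) / n)
    return big_d, math.sqrt(n) * big_d


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1]."""
    return min(max(x, 0.0), 1.0)


# Hypothetical sample, clearly too concentrated at small values
sample = [0.1, 0.2, 0.3, 0.4]
big_d, d = ks_statistic(sample, uniform_cdf)
print(f"D = {big_d:.3f}, d = sqrt(n) * D = {d:.3f}")
```

With only four points the large-sample table does not apply; the sample is kept tiny so that the result can be checked by hand.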
Kolmogorov Smirnov Test (III)
● KS is an “exact test”: it does not rely on approximations
● applies only to continuous distributions F(x)
● F(x) has to be fully specified beforehand. If parameters are estimated from the data, the critical regions are no longer valid!
● KS tends to be more sensitive to the center of the distribution than to its tails, and to differences in the global form/trend of the distribution.
● KS is often used as an alternative to Χ² for small samples
– no binning, no loss of information
● often used to test for a uniform distribution => F(x): straight line
Example: Lottery
The corresponding p-value (CL) is p < 1%
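The calculation behind this p-value is not preserved in the transcript. One plausible reading, sketched below as an assumption, is a KS test of the hypothesis that the 250 winning lots are uniformly distributed over the 10 weeks. Because only weekly totals are known, the cumulative fraction of wins is compared with the straight line F = week/10 at the week boundaries only, and d = sqrt(250) * D is compared with the large-sample critical values.

```python
import math

wins = [24, 15, 17, 18, 26, 24, 32, 33, 29, 32]
total = sum(wins)          # 250 winning lots, used as sample size n
n_weeks = len(wins)

max_dev = 0.0
cumulative = 0
for week, n in enumerate(wins, start=1):
    cumulative += n
    observed_fraction = cumulative / total   # empirical CDF at end of this week
    expected_fraction = week / n_weeks       # uniform hypothesis: straight line
    max_dev = max(max_dev, abs(observed_fraction - expected_fraction))

d = math.sqrt(total) * max_dev
print(f"D = {max_dev:.3f}, d = {d:.3f}")
```

Under these assumptions d comes out slightly above the 1% critical value of 1.63, consistent with the statement p < 1%.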
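Other Tests based on CDF F(x)
● look at the integrated quadratic deviations of the step function $F_n(x)$ from F(x), suitably weighted by a weighting function ψ(x), and build the test statistic Q: $Q = n \int_{-\infty}^{\infty} [F_n(x) - F(x)]^2 \, \psi(x) \, dF(x)$
● if weighting function ψ(x) = 1 => Cramér-von Mises test: $W^2 = n \int_{-\infty}^{\infty} [F_n(x) - F(x)]^2 \, dF(x)$
● if weighting function ψ(x) = 1/[F(x)(1-F(x))] => Anderson-Darling test
The AD test gives strong weight to deviations in the tails. It is an excellent test of how Gaussian a distribution is.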
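For an ordered sample both integrals reduce to finite sums over u[i] = F(x[i]). The sketch below uses the standard computational forms W² = 1/(12n) + Σ (u[i] - (2i-1)/(2n))² and A² = -n - (1/n) Σ (2i-1) [ln u[i] + ln(1 - u[n+1-i])]; these closed forms are textbook results, not taken from the slides, and the function names and sample are hypothetical.

```python
import math


def cramer_von_mises(sample, cdf):
    """Cramér-von Mises statistic W^2 for a fully specified CDF."""
    u = sorted(cdf(x) for x in sample)
    n = len(u)
    return 1.0 / (12 * n) + sum(
        (u_i - (2 * i - 1) / (2 * n)) ** 2 for i, u_i in enumerate(u, start=1)
    )


def anderson_darling(sample, cdf):
    """Anderson-Darling statistic A^2; requires 0 < F(x) < 1 for every point."""
    u = sorted(cdf(x) for x in sample)
    n = len(u)
    s = sum(
        (2 * i - 1) * (math.log(u[i - 1]) + math.log(1.0 - u[n - i]))
        for i in range(1, n + 1)
    )
    return -n - s / n


# Hypothetical two-point sample tested against the uniform CDF on [0, 1]
sample = [0.25, 0.75]
w2 = cramer_von_mises(sample, lambda x: x)
a2 = anderson_darling(sample, lambda x: x)
print(f"W^2 = {w2:.5f}, A^2 = {a2:.5f}")
```

The logarithms in A² diverge as F(x) approaches 0 or 1, which is how the tail weighting 1/[F(1-F)] shows up in practice.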
Run Test
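● Form a sequence of A (data point above the fit) and B (below): AAABBBBBBAAA consists of 3 runs
● For good fits, As and Bs are jumbled around: many “short runs”
● One can show ... for sufficiently large $N_A$, $N_B$ (numbers of As and Bs, $N = N_A + N_B$), P(r = # runs) is Gaussian distributed with:
– $\langle r \rangle = 1 + 2 N_A N_B / N$
– $V(r) = 2 N_A N_B (2 N_A N_B - N) / [N^2 (N - 1)]$
● Run test is less powerful than Χ², but provides additional information
● Figure: Χ² = 12 for ndf = 12 => good fit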
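A sketch of the run test applied to the sequence AAABBBBBBAAA from this slide. It uses the standard expressions for the mean and variance of the number of runs and a Gaussian approximation for the tail probability; with only 6 As and 6 Bs that approximation is rough, so the p-value is indicative only. The function names are hypothetical.

```python
import math


def count_runs(seq):
    """Number of runs (maximal blocks of identical symbols) in a sequence."""
    if not seq:
        return 0
    return 1 + sum(1 for a, b in zip(seq, seq[1:]) if a != b)


def runs_moments(n_a, n_b):
    """Expected number of runs and its variance for n_a As and n_b Bs."""
    n = n_a + n_b
    mean = 1 + 2 * n_a * n_b / n
    var = 2 * n_a * n_b * (2 * n_a * n_b - n) / (n ** 2 * (n - 1))
    return mean, var


seq = "AAABBBBBBAAA"
r = count_runs(seq)
mean, var = runs_moments(seq.count("A"), seq.count("B"))
z = (r - mean) / math.sqrt(var)
# One-sided Gaussian probability of observing this few runs or fewer
p_low = 0.5 * math.erfc(-z / math.sqrt(2))

print(f"r = {r}, expected = {mean:.1f} +- {math.sqrt(var):.2f}, z = {z:.2f}, p = {p_low:.4f}")
```

Three runs where about seven are expected lies roughly 2.4 standard deviations low, so the run test flags a trend that a Χ² equal to ndf does not reveal.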
Compare Two Samples
● Given two data samples x[i] and y[i]: do they belong to the same underlying (unknown) distribution?
● Sort x[i] and y[i] together in increasing order, count runs
● Works only if the numbers of elements in both samples are roughly the same.
● Not a very powerful test => however, no assumptions about the underlying functions are made
● The Kolmogorov-Smirnov test can be used as well. Same critical values of d for the significance levels as before.
● Many more rank tests around, all relatively weak.
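The two-sample run test can be sketched as follows: pool both samples, sort them, label each value by its origin and count runs of identical labels. The data values and the function name `two_sample_runs` are hypothetical, ties between the samples are not treated specially, and the Gaussian z-score is only a rough guide for samples this small.

```python
import math


def two_sample_runs(xs, ys):
    """Return (number of runs, z-score) for the pooled, sorted samples."""
    pooled = sorted([(v, "x") for v in xs] + [(v, "y") for v in ys])
    labels = [label for _, label in pooled]
    runs = 1 + sum(1 for a, b in zip(labels, labels[1:]) if a != b)

    n_x, n_y = len(xs), len(ys)
    n = n_x + n_y
    mean = 1 + 2 * n_x * n_y / n
    var = 2 * n_x * n_y * (2 * n_x * n_y - n) / (n ** 2 * (n - 1))
    return runs, (runs - mean) / math.sqrt(var)


# Hypothetical samples of equal size
x = [1.2, 2.5, 3.1, 4.8, 5.0, 6.7]
y = [1.9, 2.8, 3.3, 4.1, 5.6, 6.2]
runs, z = two_sample_runs(x, y)
print(f"runs = {runs}, z = {z:.2f}")
```

Samples drawn from different distributions tend to separate after sorting and therefore give too few runs (a strongly negative z); well-mixed samples such as the one above do not.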
Exercise
● Ten temperatures are measured, each with an error of 0.2 K:
  10.1 10.3 9.7 10.4 9.8 9.7 10.1 10.0 10.1 9.8
  It is suggested that they all have the same true value. What are ndf and Χ²? What do you conclude?
● How would things be different if the original suggestion were that they all have the same true value of 10.1 K?
● If 1000 measurements are grouped in 25 bins and fitted to the sum of a Gaussian + flat background, how many ndf are there?
● Perform the run test for the lottery example: {24,15,17,18,26,24,32,33,29,32}