Modern Methods of Data Analysis - Physikalisches Institutmenzemer/Stat0708/statistik... ·...

Modern Methods of Data Analysis - WS 07/08 Stephanie Hansmann-Menzemer

Modern Methods of Data Analysis

Lecture XIIa (14.01.08)

Contents:

● Goodness of Fit


Re: Exercise

● Tests are made of a detection system which is claimed to be “at least 90%” efficient.

● How many tests should be made to establish the claim (>90%) with a significance of 5% and to reject the alternative hypothesis “<80% efficient” with a power of 99%?

● Solution: number of tests N ≥ 232;

● for N=232, the hypothesis (detection efficiency > 90%) is accepted if ≥ 199 tests are successful.

● Question: A test with N=232 is performed and n tests have been successful (e.g. n = 210). Can I use the same data to check a second hypothesis, e.g. detection efficiency > 95%, or do I introduce a bias by already knowing the test result?


Goodness of Fit

● Is the data distribution consistent with a given pdf f(x)?

– no assumption about an alternative hypothesis

– compute the probability (assuming f(x) is true) of observing data which are at least as incompatible with f(x) as the data actually observed.


Χ²-Test for Histogram

● If the data points are the numbers of events n[i] in bin i of a histogram with B bins, then:

$$\chi^2 = \sum_{i=1}^{B} \frac{(n_i - \nu_i)^2}{\nu_i}$$

ν[i]: # expected events in bin i

● bin content should be > 10, so that the Gaussian approximation is justified

● if experimental data & theoretical prediction are normalized to the same total number of entries, then ndf = B - 1

● if the data are used for a fit, ndf = B - # fit parameters

● Note:

– one should not use the Χ² test as the one and only criterion; also inspect the data!

– check for local irregularities


Χ²-Test:

● Χ² is insensitive to the sign of (y[i] – f(x[i]))

● Χ² is insensitive to the sequence of y[i] => it is insensitive to trends

● the larger the number of data points, the smaller the impact of a single outlier => Χ² tests get less powerful with an increasing number of degrees of freedom

● It is only valid for a Gaussian error distribution or an approximation ....

● It is the only test which is valid if the data used for the fit and the test sample are identical! (correction of ndf)


Example: Lottery

● In a lottery 1000 lots are drawn every week. Over 10 weeks the numbers of wins are: n[i] = {24, 15, 17, 18, 26, 24, 32, 33, 29, 32}. In total there were 250 winning lots, on average μ=25 per week.

●

● Now form two groups {1-5} and {6-10} and test the hypothesis that each sum fluctuates around 125:

– grouping into two blocks of five was quite ad hoc => biased CL

– a fair method needs to consider all possible partitions
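As a rough numerical illustration of the two tests in this example, the sketch below computes Χ² for the ten weekly counts against μ = 25 and for the two ad hoc groups against 125. It assumes Χ² = Σ (n[i] − μ)²/μ, with ndf = 9 for the weekly test (the total is fixed to 250) and ndf = 1 for the two groups. The helper `chi2_sf` is a hypothetical name, written here from the closed-form tail probability of the Χ² distribution for integer ndf.

```python
import math

def chi2_sf(x, ndf):
    """Upper tail probability P(chi2 >= x) for integer ndf >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if ndf % 2 == 0:
        # even ndf: exp(-h) * sum_{k=0}^{ndf/2 - 1} h^k / k!
        term = math.exp(-h)
        total = term
        for k in range(1, ndf // 2):
            term *= h / k
            total += term
        return total
    # odd ndf: erfc(sqrt(h)) plus a finite series in x
    total = math.erfc(math.sqrt(h))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-h)
    for k in range(1, (ndf - 1) // 2 + 1):
        total += term
        term *= x / (2 * k + 1)
    return total

wins = [24, 15, 17, 18, 26, 24, 32, 33, 29, 32]
mu = sum(wins) / len(wins)  # 25 wins per week on average

# Test 1: every week fluctuates around mu (assumed ndf = 10 - 1)
chi2_weeks = sum((n - mu) ** 2 / mu for n in wins)
p_weeks = chi2_sf(chi2_weeks, len(wins) - 1)

# Test 2: the ad hoc grouping into weeks 1-5 and 6-10 (assumed ndf = 2 - 1)
groups = [sum(wins[:5]), sum(wins[5:])]
mu_group = sum(groups) / len(groups)  # 125 wins per group
chi2_groups = sum((g - mu_group) ** 2 / mu_group for g in groups)
p_groups = chi2_sf(chi2_groups, len(groups) - 1)

print(f"weeks:  chi2 = {chi2_weeks:.2f}, p = {p_weeks:.3f}")
print(f"groups: chi2 = {chi2_groups:.2f}, p = {p_groups:.4f}")
```

Under these assumptions the weekly test gives a p-value of roughly 7%, while the grouping chosen after looking at the data gives one well below 1%, which is the bias referred to above.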


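Kolmogorov Smirnov Test (I)

● Take a random data sample x[i] of size n and order the data: x[1] ≤ x[2] ≤ ... ≤ x[n]

● Define the step function

$$S_n(x) = \begin{cases} 0 & x < x_1 \\ i/n & x_i \le x < x_{i+1} \\ 1 & x \ge x_n \end{cases}$$

● The step function has to be compared to the cumulative distribution F(x), which corresponds to the hypothesis f(x)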


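Kolmogorov Smirnov Test (II)

● form the Kolmogorov-Smirnov test statistic d from the largest distance between the step function S_n(x) of the ordered sample and F(x):

$$D = \max_x \left| S_n(x) - F(x) \right|, \qquad d = \sqrt{n}\, D$$

● a small value of d means good agreement. If d is larger than a given critical value (from a table), reject the hypothesis

● For a large number of events (≥ 50) the significance is given by the following critical values (otherwise more complicated tables exist, depending on the number of measurements):

critical value   1.63   1.36   1.22   1.07
significance     1%     5%     10%    20%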
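The procedure above can be sketched in a few lines. This is a minimal illustration, assuming the statistic is d = sqrt(n) · max |S_n(x) − F(x)|, which is the convention that matches the critical values 1.63 ... 1.07. The function names, the uniform hypothesis and the toy sample are hypothetical.

```python
import math

# Asymptotic critical values of d = sqrt(n) * D, copied from the table above.
CRITICAL_D = {0.01: 1.63, 0.05: 1.36, 0.10: 1.22, 0.20: 1.07}

def ks_statistic(sample, cdf):
    """Return (D, d) with D = max |S_n(x) - F(x)| and d = sqrt(n) * D."""
    xs = sorted(sample)
    n = len(xs)
    big_d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # S_n jumps from (i-1)/n to i/n at x, so check both sides of the step
        big_d = max(big_d, i / n - f, f - (i - 1) / n)
    return big_d, math.sqrt(n) * big_d

def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1], i.e. a straight line."""
    return min(max(x, 0.0), 1.0)

# Hypothetical toy sample; far too small for the asymptotic table to apply,
# it only shows the mechanics of the calculation.
D, d = ks_statistic([0.1, 0.4, 0.7], uniform_cdf)
reject_at_5_percent = d > CRITICAL_D[0.05]
print(D, d, reject_at_5_percent)
```

Checking both sides of every step matters because the largest distance can occur just before a jump of the step function as well as just after it.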


Kolmogorov Smirnov Test (III)

● KS is an “exact test” - it does not rely on approximations

● applies only to continuous distributions F(x)

● F(x) has to be fully specified beforehand. If parameters are estimated from the data, the critical regions are no longer valid!

● KS tends to be more sensitive to the center of the distribution than to the tails, and to differences in the global form/trend of the distribution.

● KS is often used as an alternative to Χ² for small samples

– no binning, no loss of information

● often used to test for a uniform distribution => F(x): straight line


Example: Lottery

The corresponding p-value (CL) is p < 1%
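The quoted p-value can be reproduced under one plausible reading of this example, which is an assumption here: the 250 winning lots form the sample, the cumulative fraction of wins after each week plays the role of the step function, and the hypothesis of a constant win rate is the straight line F = week/10. The tail probability of d = sqrt(n) · D is taken from the asymptotic Kolmogorov series 2 Σ (−1)^(k−1) exp(−2 k² d²). The function name `kolmogorov_sf` is hypothetical.

```python
import math

def kolmogorov_sf(d, terms=100):
    """Asymptotic P(sqrt(n) * D >= d) from the Kolmogorov series.

    The series converges quickly for d of order 1; for very small d more
    terms would be needed, so the result is clipped to [0, 1].
    """
    if d <= 0:
        return 1.0
    total = 0.0
    for k in range(1, terms + 1):
        total += (-1) ** (k - 1) * math.exp(-2.0 * k * k * d * d)
    return min(1.0, max(0.0, 2.0 * total))

wins = [24, 15, 17, 18, 26, 24, 32, 33, 29, 32]
n_total = sum(wins)   # 250 winning lots
n_weeks = len(wins)

# Largest distance between the cumulative fraction of wins and the line
# week / n_weeks, evaluated at the end of each week.
D = 0.0
cumulative = 0
for week, n in enumerate(wins, start=1):
    cumulative += n
    D = max(D, abs(cumulative / n_total - week / n_weeks))

d = math.sqrt(n_total) * D
p = kolmogorov_sf(d)
print(f"D = {D:.3f}, d = {d:.3f}, p = {p:.4f}")
```

With these assumptions d comes out at about 1.64, just above the 1% critical value of 1.63. Because the data are grouped by week, this is only an approximate use of a test designed for continuous distributions.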


Other Tests based on CDF F(x)

● look at the integrated quadratic deviation of the step function S_n(x) from F(x), suitably weighted by a weighting function ψ(x), and build the test statistic Q:

$$Q = n \int_{-\infty}^{\infty} \left[ S_n(x) - F(x) \right]^2 \psi(x)\, dF(x)$$

● if the weighting function ψ(x) = 1 => Cramér-von Mises test

● if the weighting function ψ(x) = 1/[F(x)(1-F(x))] => Anderson-Darling test

The AD test gives strong weight to deviations in the tails. It is an excellent test of how Gaussian a distribution is.
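For an ordered sample the integrated quadratic deviation reduces to a finite sum. The sketch below uses the standard computational formulas W² = 1/(12n) + Σ [(2i−1)/(2n) − F(x[i])]² for Cramér-von Mises and A² = −n − (1/n) Σ (2i−1) [ln F(x[i]) + ln(1 − F(x[n+1−i]))] for Anderson-Darling. Both assume the normalization with a factor n in front of the integral, which may differ from the convention used in the lecture. The function names and the toy sample are hypothetical, and critical values would have to be taken from tables.

```python
import math

def cramer_von_mises(sample, cdf):
    """W^2 statistic (weight psi = 1) for a fully specified CDF."""
    us = sorted(cdf(x) for x in sample)  # F is monotonic, so this is F(x_(i))
    n = len(us)
    return 1.0 / (12 * n) + sum(
        ((2 * i - 1) / (2 * n) - u) ** 2 for i, u in enumerate(us, start=1)
    )

def anderson_darling(sample, cdf):
    """A^2 statistic (weight psi = 1 / [F (1 - F)]); needs 0 < F(x) < 1."""
    us = sorted(cdf(x) for x in sample)
    n = len(us)
    s = sum(
        # us[i - 1] is F(x_(i)), us[n - i] is F(x_(n + 1 - i))
        (2 * i - 1) * (math.log(us[i - 1]) + math.log(1.0 - us[n - i]))
        for i in range(1, n + 1)
    )
    return -n - s / n

def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1]."""
    return min(max(x, 0.0), 1.0)

# Hypothetical toy sample, only to show the mechanics
sample = [0.1, 0.4, 0.7]
w2 = cramer_von_mises(sample, uniform_cdf)
a2 = anderson_darling(sample, uniform_cdf)
print(w2, a2)
```

The logarithms in A² diverge as F(x) approaches 0 or 1, which is how the Anderson-Darling weight emphasizes the tails.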


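Run Test

● Form a sequence of signs, A: data point above the fit function, B: below, e.g. AAABBBBBBAAA, which consists of 3 runs

● For a good fit the As and Bs are jumbled around, giving many “short runs”

● One can show ... that for sufficiently large N_A and N_B (the numbers of As and Bs), P(r = # runs) is Gaussian distributed with:

– $$\langle r \rangle = 1 + \frac{2 N_A N_B}{N_A + N_B}$$

– $$V(r) = \frac{2 N_A N_B \,(2 N_A N_B - N_A - N_B)}{(N_A + N_B)^2 \,(N_A + N_B - 1)}$$

● The run test is less powerful than Χ², but provides additional information

(Figure: fit example with Χ² = 12 for ndf = 12 => good fit)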
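As an illustration, the run test can be applied to the example sequence AAABBBBBBAAA. The sketch assumes the usual expressions for the expected number of runs, 1 + 2 N_A N_B/(N_A + N_B), and its variance, 2 N_A N_B (2 N_A N_B − N_A − N_B) / [(N_A + N_B)² (N_A + N_B − 1)]. The function name `run_test` is hypothetical, and with only six As and six Bs the Gaussian approximation is rough.

```python
import math
from itertools import groupby

def run_test(signs):
    """Return (runs, expected runs, variance, z) for a sequence of 'A'/'B'."""
    n_a = signs.count("A")
    n_b = signs.count("B")
    n = n_a + n_b
    # groupby yields one group per block of identical consecutive symbols
    runs = sum(1 for _ in groupby(signs))
    mean = 1 + 2 * n_a * n_b / n
    var = 2 * n_a * n_b * (2 * n_a * n_b - n) / (n * n * (n - 1))
    z = (runs - mean) / math.sqrt(var)
    return runs, mean, var, z

runs, mean, var, z = run_test("AAABBBBBBAAA")
# One-sided Gaussian tail probability for observing this few runs or fewer
p_too_few = 0.5 * math.erfc(-z / math.sqrt(2.0))
print(runs, mean, round(var, 3), round(z, 2), round(p_too_few, 4))
```

Here 3 runs are observed where 7 are expected, about 2.4 standard deviations too few, so the run test flags a trend that a Χ² of 12 for 12 degrees of freedom does not show.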


Compare Two Samples

● Given two data samples x[i] and y[i]: do they belong to the same underlying (unknown) distribution?

● Sort x[i] and y[i] together in increasing order and count the runs

● Works only if the numbers of elements in both samples are roughly the same.

● Not a very powerful test => however, no assumptions about the underlying functions are made

● The Kolmogorov-Smirnov test can be used as well, with the same critical values of d for the significance levels as before.

● Many more rank tests are around, all relatively weak.
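A minimal sketch of the two-sample Kolmogorov-Smirnov comparison follows. It assumes the common convention d = sqrt(n_x n_y / (n_x + n_y)) · max |S_x − S_y|, where S_x and S_y are the step functions of the two samples; the text above does not spell out this scaling, so treat it as an assumption. The function name and the toy samples are hypothetical.

```python
import math
from bisect import bisect_right

def ks_two_sample(xs, ys):
    """Return (D, d) for two samples, with D = max |S_x - S_y|."""
    xs = sorted(xs)
    ys = sorted(ys)
    n_x, n_y = len(xs), len(ys)
    big_d = 0.0
    # The step functions only change at data points, so it is enough to
    # compare them at every observed value.
    for v in xs + ys:
        s_x = bisect_right(xs, v) / n_x  # fraction of x values <= v
        s_y = bisect_right(ys, v) / n_y  # fraction of y values <= v
        big_d = max(big_d, abs(s_x - s_y))
    return big_d, math.sqrt(n_x * n_y / (n_x + n_y)) * big_d

# Hypothetical toy samples: fully separated vs. interleaved
D_sep, d_sep = ks_two_sample([1, 2, 3], [4, 5, 6])
D_mix, d_mix = ks_two_sample([1, 3, 5], [2, 4, 6])
print(D_sep, d_sep, D_mix, d_mix)
```

The resulting d would then be compared with the same critical values as in the one-sample case, which again requires reasonably large samples.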


Exercise

● Ten temperatures are measured, each with an error of 0.2 K: 10.1 10.3 9.7 10.4 9.8 9.7 10.1 10.0 10.1 9.8. It is suggested that they all have the same true value. What are ndf and Χ²? What do you conclude?

● How would things be different if the original suggestion were that they all have the same true value of 10.1 K?

● If 1000 measurements are grouped in 25 bins and fitted to the sum of a Gaussian + flat background, how many ndf are there?

● Perform the run test for the lottery example: {24,15,17,18,26,24,32,33,29,32}
