Overview course in Statistics (usually given in 26h, but now in 2h) introduction of basic concepts...

Post on 29-Dec-2015

Overview course in Statistics (usually given in 26h, but now in 2h)

• introduction of basic concepts of probability
• concepts of parameter estimation and confidence belts
• techniques: maximum likelihood & least-squares
• hypothesis testing & goodness-of-fit concepts

Prof. Jorgen D’Hondt

(Vrije Universiteit Brussel)

IPN (Teheran, Iran) Winter School – February, 2008

Statistics, Teheran, February 2008 Prof. Jorgen D'Hondt (Vrije Universiteit Brussel) 2

It will go fast...

Lecture 1: content

Basic concepts of probability theory
• definition of probability & main calculation rules
• rule of Bayes
• permutations & combinations & binomial distribution

Probability distributions
• experimental histograms versus theoretical distributions
• definition of a distribution of a stochastic variable
• concept of correlation between stochastic variables

Main distributions used in physics
• binomial, Poisson, Gaussian/normal, exponential, χ², Cauchy
• convoluting an experimental resolution with a physics distribution

Central Limit theorem in action

Parameter estimation
• basic concept & interpretation of the techniques
• construction of confidence intervals (method of Neyman)
• properties of estimators (incl. Minimum Variance Bound)

Propagation of uncertainties
• one variable & many variables with a covariance matrix

Rule of Bayes

X_i : theory i
Y : experimental result
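In this notation, with X_i the competing theories and Y the experimental result, the rule of Bayes is usually written as below. The formula did not survive on the slide, so this is the standard textbook form, assuming the theories X_j are mutually exclusive and together exhaust all possibilities.

```latex
P(X_i \mid Y) \;=\; \frac{P(Y \mid X_i)\, P(X_i)}{\sum_j P(Y \mid X_j)\, P(X_j)}
```

Here P(X_i) is the prior probability of theory i, P(Y | X_i) the probability of the observed result if theory i is true, and P(X_i | Y) the posterior probability of theory i after the measurement.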

Experiment of Galton

Binomial distribution

Experiment of Galton → binomial distribution

[Figure panels: N=20, P(left)=0.5; N=20, P(left)=0.25; N=50, P(left)=0.5; N=50, P(left)=0.25]
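The four settings of the Galton experiment can be reproduced numerically. The sketch below is not taken from the slides; it assumes each ball makes N independent left/right choices with a fixed P(left), so that the number of left turns follows a binomial distribution. The function names are illustrative.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k 'left' outcomes in n independent rows."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def binomial_summary(n, p):
    """Return (pmf list, mean, variance), computed directly from the pmf."""
    pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
    mean = sum(k * w for k, w in enumerate(pmf))
    var = sum((k - mean) ** 2 * w for k, w in enumerate(pmf))
    return pmf, mean, var


if __name__ == "__main__":
    # The four (N, P(left)) combinations shown on the slide.
    for n, p in [(20, 0.5), (20, 0.25), (50, 0.5), (50, 0.25)]:
        _, mean, var = binomial_summary(n, p)
        print(f"N={n:2d}  P(left)={p:.2f}  mean={mean:.3f}  variance={var:.3f}")
```

The printed mean and variance should agree with the closed forms N·p and N·p·(1-p).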

Describe your data-set

Discrete (Integers)

Continuous (Reals)

Binning

{ 5.6354 7.3625 8.1635 9.3634 1.3846 0.2847 1.4763 }

‘Histograms’ ‘N-tuples’

Quantitative description via summary variables (N, mean, median, standard deviation, percentiles, ...)
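As an illustration, the seven values listed above can be summarised and binned with the Python standard library. This is a minimal sketch: the bin range [0, 10) and the number of bins are arbitrary choices made here, not taken from the slide, and `describe` and `histogram` are illustrative names.

```python
import statistics

# The data set shown on the slide.
data = [5.6354, 7.3625, 8.1635, 9.3634, 1.3846, 0.2847, 1.4763]


def describe(values):
    """Summary variables of a data sample."""
    return {
        "N": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample S.D., 1/(N-1) convention
    }


def histogram(values, low, high, n_bins):
    """Count entries per equal-width bin on [low, high); other values are ignored."""
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        if low <= v < high:
            index = min(int((v - low) / width), n_bins - 1)  # guard against rounding
            counts[index] += 1
    return counts


if __name__ == "__main__":
    print(describe(data))
    print(histogram(data, 0.0, 10.0, 5))  # assumed binning: 5 bins of width 2
```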

Probability Density Function (PDF)

cumulative

Poisson distribution : example

On 23 February 1987 the Irvine-Michigan-Brookhaven (IMB) experiment observed a total of 9 neutrinos in an interval of 10 seconds, exactly at the moment when the supernova SN 1987A was observed by astronomers. This event was compared with the many other observations made with the same experiment. What is the probability that this special event occurred by chance in the time period they measured?

[Table: number of neutrinos in a 10-second interval versus number of times observed]

Check what a Poissonian distribution would predict.

Answer: Over the total time they measured, a Poisson distribution predicts 0.0003 intervals of 10 seconds in which 9 neutrinos are observed. Hence the probability that this event was simply a fluctuation from a Poisson distribution is very small.
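The check can be sketched as follows. The observed counts did not survive in this transcript, so the mean rate per interval and the number of intervals used below are assumed, illustrative inputs and not values read from the slide. In a real analysis both would be computed from the observed table.

```python
from math import exp, factorial

# ASSUMED inputs, for illustration only (the slide's table is not available here).
ASSUMED_MEAN_PER_INTERVAL = 0.77  # hypothetical mean number of neutrinos per 10 s
ASSUMED_N_INTERVALS = 2306        # hypothetical number of 10 s intervals measured


def poisson_pmf(k, mean):
    """Poisson probability of observing exactly k events when `mean` are expected."""
    return exp(-mean) * mean**k / factorial(k)


def expected_intervals(k, mean, n_intervals):
    """Expected number of intervals that contain exactly k events."""
    return n_intervals * poisson_pmf(k, mean)


if __name__ == "__main__":
    n9 = expected_intervals(9, ASSUMED_MEAN_PER_INTERVAL, ASSUMED_N_INTERVALS)
    print(f"expected number of intervals with 9 neutrinos: {n9:.2e}")
```

With these assumed inputs the expectation comes out at the order of 10^-4, the same order of magnitude as the 0.0003 quoted in the answer.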

Central Limit theorem at work

[Figure panels: N = 2, 3, 5, 10, 20, 50]
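The behaviour in these panels can be imitated with a small simulation. The sketch below assumes the quantity plotted is the average of N uniform random numbers, which is a common choice but is not stated on the slide. As a crude measure of how Gaussian the result is, it reports the fraction of values within one standard deviation of the mean, which is about 0.683 for a Gaussian.

```python
import random


def sample_means(n_terms, n_samples, rng):
    """Average of n_terms uniform random numbers, repeated n_samples times."""
    return [
        sum(rng.random() for _ in range(n_terms)) / n_terms
        for _ in range(n_samples)
    ]


def fraction_within_one_sigma(values):
    """Fraction of values closer than one standard deviation to their mean."""
    mu = sum(values) / len(values)
    sigma = (sum((v - mu) ** 2 for v in values) / len(values)) ** 0.5
    return sum(1 for v in values if abs(v - mu) < sigma) / len(values)


if __name__ == "__main__":
    rng = random.Random(2008)  # fixed seed so that the run is reproducible
    for n in (2, 3, 5, 10, 20, 50):
        frac = fraction_within_one_sigma(sample_means(n, 10000, rng))
        print(f"N={n:2d}  fraction within 1 sigma = {frac:.3f}  (Gaussian: 0.683)")
```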

Limits between distributions

Limits between distributions: example

No limit for the Cauchy distribution
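A quick way to see this is to average N Cauchy-distributed numbers and look at the spread of the averages. For a distribution with finite variance the spread shrinks like 1/sqrt(N); for the Cauchy distribution it does not shrink at all. The sketch below is an illustration added here, not part of the slides; it uses the interquartile range as the measure of spread because the variance of a Cauchy distribution is not defined.

```python
import math
import random
from statistics import quantiles


def cauchy(rng):
    """Draw one standard Cauchy number by inverting its cumulative distribution."""
    return math.tan(math.pi * (rng.random() - 0.5))


def iqr_of_sample_means(n_terms, n_samples, rng):
    """Interquartile range of the average of n_terms Cauchy numbers."""
    means = [
        sum(cauchy(rng) for _ in range(n_terms)) / n_terms
        for _ in range(n_samples)
    ]
    q1, _, q3 = quantiles(means, n=4)
    return q3 - q1


if __name__ == "__main__":
    rng = random.Random(2008)  # fixed seed so that the run is reproducible
    for n in (1, 10, 100):
        print(f"N={n:3d}  IQR of the averages = {iqr_of_sample_means(n, 2000, rng):.2f}")
```

The interquartile range of a standard Cauchy distribution is 2, and the printed values should stay near 2 for every N.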

Experiment versus Theory

Data Sample (experiment): Latin letters
x̄ : mean of our sample
s² : sample variance of our sample

Probability Distribution (from which data sample was drawn): Greek letters
μ : mean of our parent distribution
σ : S.D. of our parent distribution
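For reference, the sample quantities and their parent-distribution counterparts are commonly defined as below. The definitions are not spelled out on the slide; the 1/(n-1) normalisation of the sample variance is the usual unbiased convention and is an assumption here.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i ,
\qquad
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2 ,
\qquad
\mu = E[x] ,
\qquad
\sigma^2 = E\!\left[(x-\mu)^2\right]
```

The Latin quantities are computed from the n measurements and fluctuate from sample to sample; the Greek quantities are fixed properties of the parent distribution.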

Statistics – parameter estimation

Stochastic X

Different samples of n measurements

estimator (one value per sample)

result of 1 sample
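The picture of an estimator taking one value per sample can be simulated directly. The sketch below assumes a Gaussian parent distribution and uses the sample mean as the estimator; the parameter values (mean 10, S.D. 2, n = 25) are hypothetical and chosen only for illustration.

```python
import random


def estimator_values(mu, sigma, n, n_samples, rng):
    """Sample mean (the estimator) evaluated on n_samples independent samples of size n."""
    return [
        sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        for _ in range(n_samples)
    ]


def spread(values):
    """Standard deviation of a list of estimator values."""
    m = sum(values) / len(values)
    return (sum((v - m) ** 2 for v in values) / (len(values) - 1)) ** 0.5


if __name__ == "__main__":
    rng = random.Random(2008)  # fixed seed so that the run is reproducible
    MU, SIGMA, N = 10.0, 2.0, 25  # hypothetical parent parameters and sample size
    values = estimator_values(MU, SIGMA, N, 5000, rng)
    print(f"result of 1 sample:          {values[0]:.3f}")
    print(f"spread over all samples:     {spread(values):.3f}")
    print(f"expected sigma / sqrt(n):    {SIGMA / N ** 0.5:.3f}")
```

A single experiment yields only one of these values; the spread over many samples is what the uncertainty on the estimate describes.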

Confidence intervals (Neyman)

[Figure: confidence belt, with the observed value x0 and the two interval limits labelled 1 and 2]
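The construction can be sketched for the simplest case. The example below assumes a single Gaussian measurement x with known resolution sigma and unknown true value theta; this case and the names `acceptance_interval` and `neyman_interval` are choices made here, not taken from the slide. For every theta on a grid a central acceptance interval in x is built, and the confidence interval is the set of theta values whose acceptance interval contains the observed x0.

```python
from statistics import NormalDist


def acceptance_interval(theta, sigma, cl):
    """Central interval in x containing a fraction cl of the outcomes for a given theta."""
    z = NormalDist().inv_cdf(0.5 + cl / 2.0)
    return theta - z * sigma, theta + z * sigma


def neyman_interval(x0, sigma, cl=0.95, half_range=10.0, steps=2000):
    """Scan theta on a grid and keep the values whose acceptance interval contains x0."""
    step = 2.0 * half_range * sigma / steps
    accepted = []
    for i in range(steps + 1):
        theta = x0 - half_range * sigma + i * step
        x_low, x_high = acceptance_interval(theta, sigma, cl)
        if x_low <= x0 <= x_high:
            accepted.append(theta)
    return min(accepted), max(accepted)


if __name__ == "__main__":
    theta_1, theta_2 = neyman_interval(x0=5.0, sigma=1.0, cl=0.95)
    print(f"95% confidence interval for theta: [{theta_1:.2f}, {theta_2:.2f}]")
```

In this Gaussian case the scan reproduces, up to the grid spacing, the familiar interval x0 ± 1.96 sigma. The same scan applies to other distributions once `acceptance_interval` is replaced accordingly.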

End of first lecture

Lecture 2: content

Maximum Likelihood technique
• general method & example
• properties of the Maximum Likelihood estimator & variance
• graphical interpretation

Least-Square technique
• general method
• the linear model as main case study & matrix notation
• variance of the estimator & graphical interpretation

Hypothesis testing
• general concepts of a hypothesis test & type of errors made
• Neyman-Pearson optimal test (likelihood ratio method)

Goodness-of-fit tests
• basic concept & confidence level
• the χ²-test as most important test & the RUN test
