Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.
![Page 1: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/1.jpg)
Short Resume of Statistical Terms
Fall 2012
By Yaohang Li, Ph.D.
![Page 2: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/2.jpg)
Review
• Last Class
  – Introduction to Monte Carlo
• This Class
  – Important Statistics Terms
    • Random Events
      – Independence of Random Events
      – Axioms on Random Events
    • Random Variables
      – Independence of Random Variables
    • CDF
    • PDF
    • Expectation
      – Characteristics of Expectation
    • Moments of a Distribution
      – rth moment
      – rth central moment
    • Mean
    • Variance
    • Standard Deviation
    • Covariance
      – Characteristics of covariance
    • Review of Statistics and Probability Terms
    • Important Distributions
    • Central Limit Theorem
    • Estimand and Estimator
• Next Class
  – Monte Carlo for Integration
![Page 3: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/3.jpg)
Random Events and Probability
• Random Event
  – An event which has a chance of happening
• Probability
  – A numerical measure of that chance
  – Lying between 0 and 1, both inclusive
• Terminology
  – P(A)
    • The probability that an event A occurs
  – P(A+B+…)
    • The probability that at least one of the events A, B, … occurs
  – P(AB…)
    • The probability that all the events A, B, … occur
  – P(A|B)
    • The probability that the event A occurs when it is known that the event B occurs
    • Conditional probability of A given B
![Page 4: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/4.jpg)
Axioms in Probability
• P(A+B+…) ≤ P(A)+P(B)+…
  – If at most one of the events A, B, … can occur, they are called exclusive. The equality then holds
  – If at least one of the events A, B, … must occur, they are called exhaustive. Then P(A+B+…)=1
• P(AB)=P(A|B)P(B)
  – If P(A|B)=P(A), A and B are independent
    • The chance of A occurring is uninfluenced by the occurrence of B
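The rules on this slide can be checked by exact counting on a small sample space. The sketch below assumes two fair dice and two illustrative events (A: the first die shows 6, B: the second die is even); the dice, the events, and the helper names `prob` and `cond_prob` are chosen here for illustration and do not come from the slides.

```python
from fractions import Fraction
from itertools import product

# Sample space: the 36 equally likely outcomes of two fair dice (illustrative assumption).
OUTCOMES = list(product(range(1, 7), repeat=2))


def prob(event):
    """Exact probability of an event given as a predicate on one outcome."""
    return Fraction(sum(1 for o in OUTCOMES if event(o)), len(OUTCOMES))


def cond_prob(event_a, event_b):
    """P(A|B): count A only among the outcomes in which B occurs."""
    in_b = [o for o in OUTCOMES if event_b(o)]
    return Fraction(sum(1 for o in in_b if event_a(o)), len(in_b))


def first_is_six(o):
    return o[0] == 6


def second_is_even(o):
    return o[1] % 2 == 0


def both(o):      # the event AB
    return first_is_six(o) and second_is_even(o)


def either(o):    # the event A+B
    return first_is_six(o) or second_is_even(o)


# Product rule: P(AB) = P(A|B) P(B)
assert prob(both) == cond_prob(first_is_six, second_is_even) * prob(second_is_even)
# Independence: P(A|B) = P(A), since one die does not influence the other
assert cond_prob(first_is_six, second_is_even) == prob(first_is_six)
# Sum rule: strict inequality here because A and B are not exclusive
assert prob(either) < prob(first_is_six) + prob(second_is_even)
```

Because `Fraction` arithmetic is exact, the assertions test the identities themselves rather than a floating-point approximation.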
![Page 5: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/5.jpg)
Random Variables and Distributions
• Random variable (η)
  – A number used to characterize each event in a set of exclusive and exhaustive events
• Cumulative Distribution Function (CDF)
  – F(y)=P(η ≤ y)
  – The probability that the event which occurs has a value not exceeding a prescribed y
  – F(+∞)=1 and F(−∞)=0
  – F(y) is a non-decreasing function of y
![Page 6: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/6.jpg)
Expectation
• If g(η) is a function of η, the expectation (or mean value) of g is denoted and defined by

$$E\,g(\eta) = \int g(y)\, dF(y)$$

  – Stieltjes integral
  – The integral is taken over all values of y
• Explanation
  – Continuous random events
    • F(y) is continuous and f(y) is its derivative

$$E\,g(\eta) = \int g(y)\, f(y)\, dy$$

  – Discrete random events
    • F(y) is a step function and f_i is the height of the step at the point y_i

$$E\,g(\eta) = \sum_i g(y_i)\, f_i$$

• Probability Density Function (pdf)
  – f(y) and f_i are the probability density functions
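The discrete sum and the continuous integral can both be evaluated directly. The sketch below is a minimal illustration, assuming a fair die for the discrete case and a uniform density on [0, 1] for the continuous case; the function names and the midpoint rule used for the integral are choices made here, not part of the slides.

```python
from fractions import Fraction


def expectation_discrete(g, points, steps):
    """Sum of g(y_i) * f_i over the jump points y_i with step heights f_i."""
    return sum(g(y) * f for y, f in zip(points, steps))


def expectation_continuous(g, f, lo, hi, n=10_000):
    """Midpoint-rule approximation of the integral of g(y) f(y) dy over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        y = lo + (i + 0.5) * h
        total += g(y) * f(y)
    return total * h


def square(y):
    return y * y


# Discrete: fair die, f_i = 1/6 at y_i = 1..6, so E(eta^2) = 91/6 exactly.
e_die = expectation_discrete(square, range(1, 7), [Fraction(1, 6)] * 6)

# Continuous: density f(y) = 1 on [0, 1], so E(eta^2) = 1/3.
e_unif = expectation_continuous(square, lambda y: 1.0, 0.0, 1.0)
```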
![Page 7: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/7.jpg)
More on Expectation
• The statistical physicist uses another notation for expectation
  – Suppose p_i is the probability density function
• How about if g(x) is a constant function?
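A sketch of what this notation usually looks like, assuming the angle-bracket average common in statistical physics is what is meant (the slide text does not confirm the exact form):

$$\langle g \rangle = \sum_i p_i\, g(x_i)$$

For a constant function $g(x) = c$ the answer follows because the probabilities sum to 1:

$$\langle c \rangle = \sum_i p_i\, c = c \sum_i p_i = c$$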
![Page 8: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/8.jpg)
Linear Combination of the Expectation Values
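A sketch of the standard linearity property that this slide title refers to. The constants $a$, $b$ and the second function $h$ are symbols introduced here for illustration:

$$E\{a\,g(\eta) + b\,h(\eta)\} = \int \{a\,g(y) + b\,h(y)\}\, dF(y) = a\,E\,g(\eta) + b\,E\,h(\eta)$$

The middle step only uses the linearity of the integral, so no independence assumption is needed.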
![Page 9: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/9.jpg)
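Multi-dimensional Distribution
• Multi-dimensional Random Variable
  – Represented using a vector η
• Multi-dimensional CDF
  – F(y)=P(η ≤ y)
  – η ≤ y means that each coordinate of η is not greater than the corresponding coordinate of y
• Expectation

$$E\,g(\boldsymbol{\eta}) = \int g(\mathbf{y})\, dF(\mathbf{y})$$

  – Continuous multidimensional events

$$E\,g(\boldsymbol{\eta}) = \int g(\mathbf{y})\, f(\mathbf{y})\, d\mathbf{y}$$

    • where

$$f(\mathbf{y}) = f(y_1, y_2, \ldots, y_k) = \frac{\partial^k F(y_1, y_2, \ldots, y_k)}{\partial y_1\, \partial y_2 \cdots \partial y_k}$$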
![Page 10: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/10.jpg)
Independence of Random Variables
• Consider a set of exhaustive and exclusive events, each characterized by a pair of numbers η and ζ, for which F(y,z) is the joint distribution. G(y) is the CDF of η and H(z) is the CDF of ζ.
  – F(y,z) = P(η ≤ y, ζ ≤ z)
  – G(y) = P(η ≤ y)
  – H(z) = P(ζ ≤ z)
• If it so happens that
  – F(y,z)=G(y)H(z) for all y and z
  – the random variables η and ζ are called independent
![Page 11: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/11.jpg)
Characteristics of Expectations
• Holds regardless of whether or not the random variables η_i are independent

$$E \sum_i g_i(\eta_i) = \sum_i E\,g_i(\eta_i)$$

• Guaranteed only if the η_i are mutually independent

$$E \prod_i g_i(\eta_i) = \prod_i E\,g_i(\eta_i)$$
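A small exact check of the two rules, assuming two fair dice as the random variables and the identity function for each g_i (both are illustrative choices). The independent case uses two separate dice; the dependent case sets the second variable equal to the first.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
p = Fraction(1, 6)                     # probability of each face of a fair die
mean = sum(y * p for y in faces)       # 7/2

# Independent pair: the joint probability of (y, z) is p * p.
e_sum_indep = sum((y + z) * p * p for y, z in product(faces, faces))
e_prod_indep = sum(y * z * p * p for y, z in product(faces, faces))

# Fully dependent pair: the second variable always equals the first.
e_sum_dep = sum((y + y) * p for y in faces)
e_prod_dep = sum(y * y * p for y in faces)

# The sum rule holds in both cases.
assert e_sum_indep == mean + mean
assert e_sum_dep == mean + mean
# The product rule holds for the independent pair and fails for the dependent pair.
assert e_prod_indep == mean * mean
assert e_prod_dep != mean * mean
```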
![Page 12: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/12.jpg)
Moments of Distribution
• rth moment of a distribution
  – E(η^r)
• Principal moment
  – μ = E(η)
• rth central moment
  – μ_r = E{(η − μ)^r}
• Most important moments
  – μ = E(η), known as the mean of η
    • Measure of location of a random variable
  – μ_2, known as the variance of η (usually abbreviated "var")
    • Measure of dispersion about the mean
  – standard deviation

$$\sigma = \sqrt{\mu_2}$$

  – coefficient of variation σ/μ
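These definitions translate directly into code for a discrete distribution. The sketch below assumes a fair die as the example distribution; the function names `moment` and `central_moment` are chosen here for illustration.

```python
from fractions import Fraction
import math


def moment(points, probs, r):
    """r-th moment E(eta^r) of a discrete distribution."""
    return sum(p * y ** r for y, p in zip(points, probs))


def central_moment(points, probs, r):
    """r-th central moment E{(eta - mu)^r}."""
    mu = moment(points, probs, 1)
    return sum(p * (y - mu) ** r for y, p in zip(points, probs))


points = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

mu = moment(points, probs, 1)                  # mean
variance = central_moment(points, probs, 2)    # second central moment
sigma = math.sqrt(variance)                    # standard deviation
cv = sigma / float(mu)                         # coefficient of variation

# The variance also equals E(eta^2) - mu^2.
assert variance == moment(points, probs, 2) - mu ** 2
```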
![Page 13: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/13.jpg)
Covariance
• Definition of covariance (usually abbreviated "cov")
  – If η and ζ are random variables with means μ and ν, respectively, the quantity E{(η − μ)(ζ − ν)} is called the covariance of η and ζ
  – If η and ζ are independent, the covariance is 0
    • Why?
  – Also, cov(η, η)=var(η)
    • Why?
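A short derivation that answers both questions, using the product rule for expectations of independent variables. Here $\mu$ and $\nu$ denote the means of $\eta$ and $\zeta$.

If $\eta$ and $\zeta$ are independent, the expectation of the product factorizes:

$$\operatorname{cov}(\eta, \zeta) = E\{(\eta - \mu)(\zeta - \nu)\} = E(\eta - \mu)\, E(\zeta - \nu) = 0 \cdot 0 = 0$$

Setting $\zeta = \eta$ (so that $\nu = \mu$) in the definition gives the second central moment:

$$\operatorname{cov}(\eta, \eta) = E\{(\eta - \mu)^2\} = \operatorname{var}(\eta)$$

The converse of the first statement does not hold in general: a zero covariance does not by itself imply independence.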
![Page 14: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/14.jpg)
Important Formula of Covariance

$$\operatorname{var}\left(\sum_{i=1}^{k} \eta_i\right) = \sum_{i=1}^{k} \sum_{j=1}^{k} \operatorname{cov}(\eta_i, \eta_j)$$
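The identity can be verified exactly on a small data table by treating each row as an equally likely joint outcome. The numbers below are made up for illustration, and the moments use the divisor n so that the check is an exact algebraic identity.

```python
from fractions import Fraction


def mean(xs):
    return sum(xs, Fraction(0)) / len(xs)


def cov(xs, ys):
    """Covariance with divisor n, treating the rows as equally likely outcomes."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


# Three variables observed jointly on four outcomes (illustrative values).
raw_columns = ([1, 2, 4, 7], [2, 1, 5, 3], [0, 3, 3, 6])
columns = [[Fraction(v) for v in col] for col in raw_columns]

# Row-wise total eta_1 + eta_2 + eta_3.
totals = [sum(row) for row in zip(*columns)]

var_of_sum = cov(totals, totals)                              # var = cov with itself
double_sum = sum(cov(a, b) for a in columns for b in columns)

assert var_of_sum == double_sum
```

The double sum contains each variance once (the terms with i = j) and each cross covariance twice, which is why correlated samples can raise or lower the variance of a total.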
![Page 15: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/15.jpg)
Correlation Coefficient
• Definition

$$\rho = \frac{\operatorname{cov}(\eta, \zeta)}{\sqrt{\operatorname{var}\eta\; \operatorname{var}\zeta}}$$

  – Always between +1 and -1
  – If ρ=0, η and ζ are not correlated
  – If ρ<0, they are negatively correlated
  – If ρ>0, they are positively correlated
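A minimal sketch of the definition applied to paired data, with three made-up data sets that give a correlation of +1, -1, and 0. The function name `correlation` and the data are assumptions for illustration.

```python
import math


def correlation(xs, ys):
    """Correlation coefficient cov(x, y) / sqrt(var x * var y) for paired data."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mx) ** 2 for x in xs) / n
    var_y = sum((y - my) ** 2 for y in ys) / n
    return cov / math.sqrt(var_x * var_y)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
rho_up = correlation(xs, [2 * x + 1 for x in xs])        # increasing linear relation
rho_down = correlation(xs, [-x for x in xs])             # decreasing linear relation
rho_none = correlation(xs, [1.0, -1.0, 0.0, -1.0, 1.0])  # symmetric pattern, zero covariance
```

The last data set is clearly dependent on `xs` in a structured way, yet its correlation is zero: ρ measures only linear association.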
![Page 16: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/16.jpg)
Important Distributions
• Uniform Distribution
• Exponential Distribution
• Binomial Distribution
• Poisson Distribution
• Normal Distribution
![Page 17: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/17.jpg)
Uniform Distribution
• Uniform Distribution (Rectangular Distribution)
  – A distribution that has constant probability density over its range
  – Mean?
  – Variance?
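One way to explore the two questions is to sample and compare with the standard closed forms for a uniform distribution on [a, b], namely mean (a + b)/2 and variance (b − a)²/12. The endpoints, seed, and sample size below are arbitrary choices for this sketch.

```python
import random


def sample_moments(samples):
    """Sample mean and sample variance (divisor n - 1)."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((y - mean) ** 2 for y in samples) / (n - 1)
    return mean, variance


a, b = 2.0, 5.0           # illustrative endpoints
rng = random.Random(0)    # fixed seed so that the run is repeatable
samples = [rng.uniform(a, b) for _ in range(200_000)]
mean_hat, var_hat = sample_moments(samples)

mean_exact = (a + b) / 2          # closed form for the mean
var_exact = (b - a) ** 2 / 12     # closed form for the variance
```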
![Page 18: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/18.jpg)
Exponential Distribution
• Exponential Distribution
  – mean 1/λ
  – variance 1/λ²
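A sketch of drawing exponential samples by inverting the CDF, assuming the usual rate parametrization with density λ·exp(−λy) for y ≥ 0 (the slide's own formula is not shown in this transcript, so the parametrization is an assumption). The sample mean and variance can then be compared with 1/λ and 1/λ².

```python
import math
import random


def sample_exponential(lam, rng):
    """Inverse-transform sampling: solve F(y) = 1 - exp(-lam * y) = u for y."""
    u = rng.random()                     # uniform on [0, 1)
    return -math.log(1.0 - u) / lam      # 1 - u lies in (0, 1], so the log is defined


lam = 2.0                 # illustrative rate
rng = random.Random(0)    # fixed seed so that the run is repeatable
n = 200_000
samples = [sample_exponential(lam, rng) for _ in range(n)]

mean_hat = sum(samples) / n
var_hat = sum((y - mean_hat) ** 2 for y in samples) / (n - 1)
```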
![Page 19: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/19.jpg)
Binomial Distribution
• Binomial Distribution
  – Discrete probability distribution P_p(n|N) of obtaining exactly n successes out of N Bernoulli trials
  – Each Bernoulli trial is true with probability p and false with probability q=1-p

$$P_p(n \mid N) = \binom{N}{n} p^n q^{N-n} = \frac{N!}{n!\,(N-n)!}\, p^n (1-p)^{N-n}$$
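A minimal sketch of the binomial probability, using `math.comb` for the binomial coefficient (available from Python 3.8). The values N = 10 and p = 0.3 are illustrative; the checks use the standard results that the probabilities sum to 1, the mean is Np, and the variance is Npq.

```python
import math


def binomial_pmf(n, N, p):
    """Probability of exactly n successes in N Bernoulli trials with success probability p."""
    return math.comb(N, n) * p ** n * (1 - p) ** (N - n)


N, p = 10, 0.3
pmf = [binomial_pmf(n, N, p) for n in range(N + 1)]

total = sum(pmf)
mean = sum(n * w for n, w in enumerate(pmf))
variance = sum((n - mean) ** 2 * w for n, w in enumerate(pmf))
```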
![Page 20: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/20.jpg)
Poisson Distribution
• Poisson Distribution
  – The limit of the Binomial Distribution as N → ∞ with ν = Np held fixed

$$P_\nu(n) = \lim_{N \to \infty} P_p(n \mid N) = \frac{\nu^n e^{-\nu}}{n!}$$

  – Mean is ν
  – Variance is ν
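The limit can be observed numerically by holding the mean fixed and letting N grow with p = ν/N. The sketch below measures the largest gap between the binomial and Poisson probabilities over the first few counts; the value ν = 3, the range of counts, and the helper names are assumptions made for illustration.

```python
import math


def binomial_pmf(n, N, p):
    """Probability of exactly n successes in N Bernoulli trials."""
    return math.comb(N, n) * p ** n * (1 - p) ** (N - n)


def poisson_pmf(n, nu):
    """Poisson probability of the count n when the mean is nu."""
    return nu ** n * math.exp(-nu) / math.factorial(n)


def max_gap(N, nu, n_max=15):
    """Largest absolute difference between the two pmfs for n = 0..n_max."""
    p = nu / N
    return max(abs(binomial_pmf(n, N, p) - poisson_pmf(n, nu))
               for n in range(n_max + 1))


nu = 3.0
gaps = {N: max_gap(N, nu) for N in (10, 100, 1000, 10000)}
```

Printing `gaps` shows the difference shrinking as N increases, which is the convergence the slide describes.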
![Page 21: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/21.jpg)
Normal Distribution
• Normal Distribution (Gaussian Distribution)
  – Bell curve
  – De Moivre developed the normal distribution as an approximation to the binomial distribution
![Page 22: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/22.jpg)
Normal Distribution in Data Analysis
• 68.27% of the data will be found within one SD either side of the mean (±1SD)
• 95.45% of the data will be found within two SD either side of the mean (±2SD)
• 99.73% of the data will be found within three SD either side of the mean (±3SD)
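The coverage figures follow from the normal CDF. For a normal variable, the probability of falling within k standard deviations of the mean is erf(k/√2), which the standard library can evaluate; the function name `coverage` is chosen here for illustration.

```python
import math


def coverage(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    print(f"within ±{k} SD: {100 * coverage(k):.2f}%")
```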
![Page 23: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/23.jpg)
Central Limit Theorem
• Central Limit Theorem
  – The sum of n independent random variables has an approximately normal distribution when n is large
    • The random variables may follow an arbitrary distribution, provided its variance is finite
![Page 24: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/24.jpg)
Central Limit Theorem in Practice
• In practice
  – n = 10 is a reasonably large number
  – n = 25 is rather large (effectively infinite)
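A rough way to see the rule of thumb at work is to sum n = 10 uniform random numbers, standardize the sum, and count how often it falls within one and two standard deviations. The sketch below assumes uniform(0, 1) summands (mean 1/2, variance 1/12); the seed and trial count are arbitrary.

```python
import math
import random


def standardized_sum(n, rng):
    """Sum of n uniform(0, 1) draws, shifted and scaled to mean 0 and variance 1."""
    total = sum(rng.random() for _ in range(n))
    return (total - n * 0.5) / math.sqrt(n / 12.0)


def fraction_within(k, n, trials, rng):
    """Fraction of standardized sums whose absolute value is at most k."""
    hits = sum(1 for _ in range(trials) if abs(standardized_sum(n, rng)) <= k)
    return hits / trials


rng = random.Random(2012)   # fixed seed so that the run is repeatable
within_1sd = fraction_within(1.0, n=10, trials=50_000, rng=rng)
within_2sd = fraction_within(2.0, n=10, trials=50_000, rng=rng)
```

If the sum were exactly normal, the two fractions would be about 0.683 and 0.954; with n = 10 the simulated values should already be close to those.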
![Page 25: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/25.jpg)
Estimation
• Monte Carlo Computation
  – Goal: estimating the unknown numerical value of some parameter of some distribution
    • The parameter is called an estimand
• Sample
  – The available data (may consist of a number of observed random variables)
  – The number of observations in the sample is called the sample size
• Estimators for the estimand
  – mean
    • (η_1+η_2+…+η_n)/n
  – weighted average
    • (w_1η_1+w_2η_2+…+w_nη_n)/(w_1+w_2+…+w_n)
    • May be a better estimator
• Connection between the sample and the estimand
  – The estimand is a parameter of the distribution of the random variables constituting the sample
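The two estimators written as functions, as a minimal sketch. The observations and weights are made-up values; with equal weights the weighted average reduces to the plain mean.

```python
def mean(observations):
    """Plain average (eta_1 + ... + eta_n) / n."""
    return sum(observations) / len(observations)


def weighted_average(observations, weights):
    """Weighted average sum(w_i * eta_i) / sum(w_i)."""
    if len(observations) != len(weights):
        raise ValueError("observations and weights must have the same length")
    return sum(w * y for w, y in zip(weights, observations)) / sum(weights)


observations = [1.0, 2.0, 6.0]    # illustrative sample
weights = [1.0, 1.0, 2.0]         # illustrative weights, e.g. trusting the last value more

plain = mean(observations)
weighted = weighted_average(observations, weights)
equal = weighted_average(observations, [1.0, 1.0, 1.0])
```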
![Page 26: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/26.jpg)
Sampling Distribution
• Parent Distribution
  – We can represent the sample by a vector η with coordinates η_1, η_2, η_3, …, η_n
  – The distribution of η_1, η_2, η_3, …, η_n is called the Parent Distribution
  – To estimate the estimand θ (a parameter of the Parent Distribution), we use some function t(η)
    • t is an estimator
• Sampling Distribution
  – η is a random variable, so is t(η)
    • if we repeated the experiment, we should expect to get a different value of η
  – Since η varies from experiment to experiment, t(η) has a distribution, called the sampling distribution
  – If t(η) is to be close to θ, then the sampling distribution ought to be closely concentrated around θ
![Page 27: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/27.jpg)
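Measuring Sampling Distribution
• The bias of t
  – The difference between θ and the average value of t(η)

$$\beta = E\{t(\boldsymbol{\eta}) - \theta\}$$

  – t is an unbiased estimator if β=0
• The sampling variance of t

$$\sigma_t^2 = \operatorname{var}\{t(\boldsymbol{\eta})\} = E\{[t(\boldsymbol{\eta}) - E\,t(\boldsymbol{\eta})]^2\} = E\{[t - \theta - \beta]^2\}$$

• If β and σ_t² are small, t is a good estimator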
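Bias and sampling variance can be estimated by repeating the experiment many times. The sketch below assumes a uniform(0, 1) parent distribution, whose variance 1/12 is the estimand, and compares two variance estimators: one dividing by n (biased) and one dividing by n − 1 (unbiased). All names, the seed, and the repeat count are illustrative.

```python
import random


def var_divide_n(sample):
    """Variance estimator with divisor n (biased)."""
    m = sum(sample) / len(sample)
    return sum((y - m) ** 2 for y in sample) / len(sample)


def var_divide_n_minus_1(sample):
    """Variance estimator with divisor n - 1 (unbiased)."""
    m = sum(sample) / len(sample)
    return sum((y - m) ** 2 for y in sample) / (len(sample) - 1)


def bias_and_sampling_variance(estimator, theta, n, repeats, rng):
    """Repeat the experiment and estimate the bias and sampling variance of the estimator."""
    values = [estimator([rng.random() for _ in range(n)]) for _ in range(repeats)]
    average = sum(values) / repeats
    bias = average - theta
    sampling_var = sum((t - average) ** 2 for t in values) / repeats
    return bias, sampling_var


theta = 1.0 / 12.0        # variance of the uniform(0, 1) parent distribution
rng = random.Random(1)    # fixed seed so that the run is repeatable
bias_n, svar_n = bias_and_sampling_variance(var_divide_n, theta, 5, 40_000, rng)
bias_n1, svar_n1 = bias_and_sampling_variance(var_divide_n_minus_1, theta, 5, 40_000, rng)
```

For sample size n = 5 the divisor-n estimator has expectation (n − 1)/n · θ, so its bias should come out near −θ/5, while the divisor-(n − 1) estimator should show a bias near zero at the cost of a somewhat larger sampling variance.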
![Page 28: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/28.jpg)
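Important Estimators
• Mean of the parent distribution

$$\bar{\eta} = (\eta_1 + \eta_2 + \cdots + \eta_n)/n$$

  – standard error

$$\sigma/\sqrt{n}$$

• Variance of the parent distribution

$$s^2 = \{(\eta_1 - \bar{\eta})^2 + (\eta_2 - \bar{\eta})^2 + \cdots + (\eta_n - \bar{\eta})^2\}/(n-1)$$

  – standard error (approximate, for a parent distribution close to normal)

$$\sigma^2 \sqrt{2/n}$$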
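A sketch of both estimators with their standard errors, using the standard library's `statistics` module. Since σ is unknown in practice, the code plugs the sample value s in for σ; that substitution, the approximate formula s²·√(2/n) for the variance (reasonable only for a near-normal parent distribution), and the sample values are assumptions of this sketch.

```python
import math
import statistics


def mean_with_standard_error(sample):
    """Sample mean and its estimated standard error s / sqrt(n)."""
    n = len(sample)
    s = statistics.stdev(sample)          # square root of s^2 (divisor n - 1)
    return statistics.mean(sample), s / math.sqrt(n)


def variance_with_standard_error(sample):
    """Sample variance s^2 and its approximate standard error s^2 * sqrt(2 / n)."""
    n = len(sample)
    s2 = statistics.variance(sample)      # divisor n - 1
    return s2, s2 * math.sqrt(2.0 / n)


sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # illustrative observations
mean_hat, mean_se = mean_with_standard_error(sample)
var_hat, var_se = variance_with_standard_error(sample)
```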
![Page 29: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/29.jpg)
Efficiency
• Goal of Monte Carlo Work
  – Obtain a respectably small standard error in the final result
  – More random samples can lead to better accuracy
    • Not very rewarding: the standard error falls only as 1/√n
  – Variance Reduction Method
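The cost of brute-force sampling can be made concrete with the σ/√n standard error of a sample mean: halving the error requires four times as many samples. The function names and numbers below are illustrative.

```python
import math


def standard_error(sigma, n):
    """Standard error sigma / sqrt(n) of a mean of n independent samples."""
    return sigma / math.sqrt(n)


def samples_needed(sigma, target_error):
    """Smallest n for which sigma / sqrt(n) does not exceed target_error."""
    return math.ceil((sigma / target_error) ** 2)


sigma = 2.0
n_for_half = samples_needed(sigma, 0.5)       # error target 0.5
n_for_quarter = samples_needed(sigma, 0.25)   # halving the target quadruples n
```

Reducing σ itself, which is what variance reduction methods aim for, lowers the required sample size quadratically.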
![Page 30: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/30.jpg)
Summary
• Important Statistics Terms
  – Random Events
    • Independence of Random Events
    • Axioms on Random Events
  – Random Variables
    • Independence of Random Variables
  – CDF
  – PDF
  – Expectation
    • Characteristics of Expectation
  – Moments of a Distribution
    • rth moment
    • rth central moment
  – Mean
  – Variance
  – Standard Deviation
  – Covariance
    • Characteristics of covariance
  – Correlation Coefficient
![Page 31: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/31.jpg)
Summary (Cont.)
• Important Distributions
  – Uniform Distribution
  – Exponential Distribution
  – Binomial Distribution
  – Poisson Distribution
  – Normal Distribution
• Estimation
  – Sample
  – Estimand
  – Parent Distribution
  – Sampling Distribution
  – Estimator
    • Important estimators
  – Buffon’s Needle
![Page 32: Short Resume of Statistical Terms Fall 2012 By Yaohang Li, Ph.D.](https://reader036.fdocuments.net/reader036/viewer/2022081519/56649f425503460f94c62b44/html5/thumbnails/32.jpg)
What do I want you to do?
• Review Slides
• Review basic probability/statistics concepts
• Work on your Assignment 1