ST102 Notes


STATISTICS (ST102): LENT TERM MATERIAL

1. Point Estimation

Data (X, Y) → Estimator (a function of the data used to give estimates) → Estimates

Methods of estimation: Method of Moments, Least Squares, Maximum Likelihood

Note: The estimator is a random variable, as the value of the estimate changes as different samples are drawn. Thus, the estimator has a probability distribution. The parameters are regarded as constants.

Measuring the Performance of an Estimator:

Mean Square Error (MSE): measures the trade-off between bias and efficiency.

$\mathrm{MSE}(\hat\theta) = E[(\hat\theta - \theta)^2] = E\big[(\hat\theta - E(\hat\theta)) + (E(\hat\theta) - \theta)\big]^2 = \mathrm{Var}(\hat\theta) + \big[\mathrm{Bias}(\hat\theta)\big]^2$

Bias: measures whether, on average, the estimator gives the true value.

$\mathrm{Bias}(\hat\theta) = E(\hat\theta) - \theta$. Therefore, if an estimator is unbiased: $\mathrm{Bias}(\hat\theta) = 0$, i.e. $E(\hat\theta) = \theta$.

Disadvantage of MSE: deviations are not weighted equally, since squaring magnifies deviations greater than 1 and shrinks deviations less than 1.

Mean Absolute Deviation (MAD): $\mathrm{MAD} = E\,|\hat\theta - \theta|$

Method of Moments Estimator (MME):

Set each kth (non-centred) sample moment equal to the corresponding kth population moment:

$k = 1:\quad \frac{1}{n}\sum_{i=1}^{n} X_i = \bar X = E(X)$

$k = 2:\quad \frac{1}{n}\sum_{i=1}^{n} X_i^2 = E(X^2) = \mathrm{Var}(X) + [E(X)]^2$

$\ldots$

$k:\quad \frac{1}{n}\sum_{i=1}^{n} X_i^k = E(X^k)$

The left-hand sides are computable from data; the right-hand sides are not computable directly, as they depend on the unknown parameters. Solving these equations for the parameters gives the MME.
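As a minimal sketch of the k = 1, 2 case (assuming, for illustration, a normal sample; the data and parameter values here are hypothetical, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=1000)  # hypothetical N(mu, sigma^2) sample

# k = 1: (1/n) * sum(X_i) = E(X) = mu  =>  mu_hat = sample mean
mu_hat = x.mean()

# k = 2: (1/n) * sum(X_i^2) = E(X^2) = Var(X) + E(X)^2
# => Var(X) estimate = second sample moment minus squared first moment
var_hat = np.mean(x**2) - mu_hat**2

print(mu_hat, var_hat)  # should be close to 5.0 and 4.0
```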


Maximum Likelihood Estimator (MLE): requires independent and identically distributed (i.i.d.) samples

Step 1: Construct the Likelihood Function

$L(\theta) = f(X_1, X_2, \ldots, X_n; \theta) = f(X_1;\theta)\,f(X_2;\theta)\cdots f(X_n;\theta) = \prod_{i=1}^{n} f(X_i;\theta)$

It is often easier to work with the log-likelihood function:

$\ell(\theta) = \ln L(\theta) = \ln \prod_{i=1}^{n} f(X_i;\theta) = \sum_{i=1}^{n} \ln f(X_i;\theta)$

Step 2: Maximize the Likelihood Function

Maximize the log-likelihood function by differentiation or by observation:

$\left.\frac{d\ell(\theta)}{d\theta}\right|_{\theta = \hat\theta} = 0$

Note: In the likelihood function, the $X_i$ (sample data) are treated as constants, and $\theta$ (the parameter) is the variable.

First-order conditions are only applicable if the function is continuously differentiable.

Properties:

- Under suitable conditions, the MLE and MME have nice large-sample properties:

- Consistent: as $n \to \infty$, $\mathrm{MSE}(\hat\theta) \to 0$

- Asymptotically normal: as $n$ approaches infinity, under some regularity conditions,

$\sqrt{n}\,(\hat\theta - \theta) \to N\!\left(0, \frac{1}{I(\theta)}\right)$, equivalently $(\hat\theta - \theta)\sqrt{n I(\theta)} \to N(0,1)$, i.e. $\hat\theta \approx N\!\left(\theta, \frac{1}{n I(\theta)}\right)$

$I(\theta)$ is the Fisher information, defined as:

$I(\theta) = -E\!\left[\frac{\partial^2}{\partial\theta^2} \ln f(X;\theta)\right] = \int f(x;\theta)\left(\frac{\partial}{\partial\theta}\ln f(x;\theta)\right)^2 dx = -\int f(x;\theta)\,\frac{\partial^2}{\partial\theta^2}\ln f(x;\theta)\,dx$

On top of this, the MLE is also:

- Invariant: the MLE of $g(\theta)$ is $g$ applied to the MLE of $\theta$, i.e. $\widehat{g(\theta)} = g(\hat\theta)$

- More efficient than the MME: $\mathrm{Var}(\hat\theta_{MLE}) \le \mathrm{Var}(\hat\theta_{MME})$

Conceptually, the MLE uses more information in its calculation than the MME. Hence, we always use the MLE when possible.
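A minimal numerical sketch of the two steps, assuming an i.i.d. exponential sample with density $f(x;\lambda) = \lambda e^{-\lambda x}$ (the data and the use of scipy's optimizer are illustrative assumptions):

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=500)  # hypothetical i.i.d. sample, true rate 0.5

# Step 1: log-likelihood l(lam) = n*ln(lam) - lam * sum(x_i)
def neg_loglik(lam):
    return -(len(x) * np.log(lam) - lam * x.sum())

# Step 2: maximize l (i.e. minimize -l) numerically
res = minimize_scalar(neg_loglik, bounds=(1e-6, 10.0), method="bounded")

print(res.x, 1 / x.mean())  # numerical MLE agrees with the closed form 1/xbar
```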


2. Confidence Intervals

Confidence interval for a parameter $\theta$ at the 90/95/99% level: $\hat\theta \pm C \cdot SE(\hat\theta)$, where $C$ is the appropriate critical value. Length of the confidence interval = $2 \times C \times SE(\hat\theta)$.

x% confidence interval: if one repeats the interval estimation a large number of times, about x% of the time the interval estimator covers the true $\theta$.

Confidence Intervals for the Population Mean

- Population normally distributed, variance known: $\dfrac{\bar X - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$; use the Z-table.

- Population normally distributed, variance unknown: $\dfrac{\bar X - \mu}{s/\sqrt{n}} \sim t_{n-1}$; use the t-table (sketch below).

- Population not normal, $n$ large: by the Central Limit Theorem, $\dfrac{\bar X - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$ approximately; use the Z-table.

- For a proportion (simplification after the CLT): $\dfrac{\hat p - p}{\sqrt{p(1-p)/n}} \sim N(0,1)$ approximately; use the Z-table.
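For instance, a 95% interval in the variance-unknown case can be computed as follows (a sketch with simulated, hypothetical data; scipy's t quantile plays the role of the t-table):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(10.0, 3.0, size=40)  # hypothetical sample, variance unknown

n, xbar, s = len(x), x.mean(), x.std(ddof=1)
t_crit = stats.t.ppf(0.975, df=n - 1)   # t_{alpha/2, n-1} with alpha = 0.05
half = t_crit * s / np.sqrt(n)
print(xbar - half, xbar + half)         # 95% CI for mu
```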

Chi-Squared Distribution: chi-squared distribution with $k$ degrees of freedom

If $X_i \sim N(0,1)$ independently, then $Z = X_1^2 + X_2^2 + \cdots + X_k^2 = \sum_{i=1}^{k} X_i^2 \sim \chi^2_k$

The support of $\chi^2_k$ is $[0, \infty)$; $E(Z) = k$ and $\mathrm{Var}(Z) = 2k$.

Test for Population Variance:

$\frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1}$

A $(1-\alpha)$ confidence interval for $\sigma^2$ is $\left(\dfrac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}},\ \dfrac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}\right)$, where $\chi^2_{\alpha,\,k}$ denotes the upper-$\alpha$ critical value.

Proof

Let $X_i \sim N(\mu, \sigma^2)$, $i = 1, 2, \ldots, n$. Then

$\sum_{i=1}^{n}\left(\frac{X_i - \mu}{\sigma}\right)^2 = \frac{1}{\sigma^2}\sum_{i=1}^{n}(X_i - \mu)^2 \sim \chi^2_n$

Decomposing,

$\frac{1}{\sigma^2}\sum_{i=1}^{n}(X_i - \mu)^2 = \frac{1}{\sigma^2}\sum_{i=1}^{n}\big[(X_i - \bar X) + (\bar X - \mu)\big]^2 = \frac{1}{\sigma^2}\sum_{i=1}^{n}(X_i - \bar X)^2 + \frac{1}{\sigma^2}\sum_{i=1}^{n}(\bar X - \mu)^2 + \frac{2}{\sigma^2}(\bar X - \mu)\sum_{i=1}^{n}(X_i - \bar X)$

and the cross term vanishes since $\sum_{i=1}^{n}(X_i - \bar X) = 0$. For the second term,

$\frac{1}{\sigma^2}\sum_{i=1}^{n}(\bar X - \mu)^2 = \frac{n}{\sigma^2}(\bar X - \mu)^2 = \left(\frac{\bar X - \mu}{\sigma/\sqrt{n}}\right)^2 \sim \chi^2_1$

For the first term,

$\frac{1}{\sigma^2}\sum_{i=1}^{n}(X_i - \bar X)^2 = \frac{n-1}{\sigma^2}\cdot\frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar X)^2 = \frac{(n-1)s^2}{\sigma^2}$

Hence, loosely, $\chi^2_n - \chi^2_1 = \chi^2_{n-1}$, giving $\frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1}$.
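A sketch of the resulting interval for $\sigma^2$ (simulated, hypothetical data; scipy's chi-squared quantiles replace the table, and note that `chi2.ppf` returns lower-tail quantiles):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.normal(0.0, 2.0, size=30)        # hypothetical sample, true sigma^2 = 4
n, s2, alpha = len(x), x.var(ddof=1), 0.05

# (n-1)s^2 / sigma^2 ~ chi^2_{n-1}; invert the pivot to get the CI.
# The larger (upper) quantile goes in the lower limit's denominator.
lower = (n - 1) * s2 / stats.chi2.ppf(1 - alpha / 2, df=n - 1)
upper = (n - 1) * s2 / stats.chi2.ppf(alpha / 2, df=n - 1)
print(lower, upper)
```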


t-Distribution: Student's t-distribution with $k$ degrees of freedom

Let $Z \sim N(0,1)$ and $X \sim \chi^2_k$ be independent. Then

$T = \frac{Z}{\sqrt{X/k}} \sim t_k$

$t_k$ is a continuous and symmetric distribution on $(-\infty, \infty)$, with heavier tails than the normal distribution. As $k$ approaches infinity, the $t$ distribution converges to the standard normal distribution.

Test for Population Mean:

$\frac{\bar X - \mu}{s/\sqrt{n}} \sim t_{n-1}$

A $(1-\alpha)$ confidence interval for $\mu$ is $\bar X \pm t_{\alpha/2,\,n-1}\,\dfrac{s}{\sqrt{n}}$

Proof

$\frac{\bar X - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$ and $\frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1}$, independently. Then

$\frac{(\bar X - \mu)\big/(\sigma/\sqrt{n})}{\sqrt{\dfrac{(n-1)s^2/\sigma^2}{n-1}}} = \frac{\bar X - \mu}{\sigma/\sqrt{n}}\cdot\frac{\sigma}{s} = \frac{\bar X - \mu}{s/\sqrt{n}} \sim t_{n-1}$

F-Distribution: F-distribution with degrees of freedom $p$, $k$

Let $U \sim \chi^2_p$ and $V \sim \chi^2_k$ be independent random variables. Then

$W = \frac{U/p}{V/k} \sim F_{p,k}$

The support of $F_{p,k}$ is $[0,\infty)$; $E(W) = \dfrac{k}{k-2}$ for $k > 2$, and $\mathrm{Var}(W) = \dfrac{2k^2(p+k-2)}{p(k-2)^2(k-4)}$ for $k > 4$.

If $W \sim F_{p,k}$, then $W^{-1} \sim F_{k,p}$. If $T \sim t_k$, then $T^2 \sim F_{1,k}$.

Test for the Ratio of Two Normal Variances

$H_0: \sigma_Y^2/\sigma_X^2 = r$ vs $H_1: \sigma_Y^2/\sigma_X^2 \ne r$

$T = \frac{\big[(n-1)s_X^2/\sigma_X^2\big]\big/(n-1)}{\big[(m-1)s_Y^2/\sigma_Y^2\big]\big/(m-1)} = \frac{\sigma_Y^2}{\sigma_X^2}\cdot\frac{s_X^2}{s_Y^2} = r\,\frac{s_X^2}{s_Y^2} \sim F_{n-1,\,m-1}$

A $(1-\alpha)$ confidence interval for $\sigma_Y^2/\sigma_X^2$ is

$\left(F_{1-\alpha/2,\,n-1,\,m-1}\,\frac{s_Y^2}{s_X^2},\ \ F_{\alpha/2,\,n-1,\,m-1}\,\frac{s_Y^2}{s_X^2}\right)$
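A sketch of the $r = 1$ case (testing equal variances) with simulated, hypothetical samples; scipy's F quantiles replace the F-table:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = rng.normal(0, 2.0, size=25)   # hypothetical sample X, n = 25
y = rng.normal(0, 2.0, size=30)   # hypothetical sample Y, m = 30

n, m = len(x), len(y)
T = x.var(ddof=1) / y.var(ddof=1)         # r * sX^2/sY^2 with r = 1 under H0

# two-tail acceptance region at alpha = 0.05 (lower-tail quantiles from scipy)
lo = stats.f.ppf(0.025, dfn=n - 1, dfd=m - 1)
hi = stats.f.ppf(0.975, dfn=n - 1, dfd=m - 1)
print(T, (lo, hi), not (lo < T < hi))     # True would mean: reject H0
```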


3. Hypothesis Testing

1. State the Null and Alternative Hypotheses

Two-tail test: $H_0: \theta = k$ vs $H_1: \theta \ne k$

One-tail test: $H_0: \theta = k$ vs $H_1: \theta > k$, or $H_0: \theta = k$ vs $H_1: \theta < k$

2. Compute the Test Statistic (T)

3. Look Up Critical Values at the Level of Significance $\alpha$ (1%, 5%, 10%)

One-tail test: $P(T > C_\alpha) = \alpha$. If $T > C_\alpha$, reject $H_0$ at the $\alpha$ level of significance.

Two-tail test: $P(|T| > C_{\alpha/2}) = \alpha$. If $|T| > C_{\alpha/2}$, reject $H_0$ at the $\alpha$ level of significance.

Alternatively, compute the p-value: the p-value is the smallest level of significance at which $H_0$ can be rejected. Let $t$ be the calculated test statistic; then $P(T > t)$ or $P(T < t)$ = p-value.

One-tail test: if p-value $\le \alpha$, reject $H_0$ at the $\alpha$ level of significance.

Two-tail test: if p-value $\le \alpha/2$, reject $H_0$ at the $\alpha$ level of significance.

4. Conclude: "Reject $H_0$ even at the 1% level of significance" / "Do not reject $H_0$ even at the 10% level of significance."
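For example, with a standard normal test statistic the p-value step looks like this (a sketch; the observed value t is hypothetical, and the rejection rules follow the notes' convention above):

```python
from scipy import stats

t = 2.1                              # hypothetical observed test statistic
alpha = 0.05

p_value = 1 - stats.norm.cdf(t)      # P(T > t), upper tail

# one-tail test: reject if p-value <= alpha
# two-tail test (convention above): reject if p-value <= alpha / 2
print(p_value, p_value <= alpha, p_value <= alpha / 2)
```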


    Errors in Hypothesis Testing

    Type 1 Error: Reject the Null Hypothesis when H0 is true

    Type 2 Error: Not rejecting H0 when H1 is true

    Power: P ( Rejecting H0 when H1 is true )

    Note: P ( Type 2 Error ) + Power = 1

For Example:

$H_0: \mu = \mu_0$ vs $H_1: \mu \ne \mu_0$, with $T = \dfrac{\bar X - \mu_0}{\sigma/\sqrt{n}}$. Under $H_0$, $T \sim N(0,1)$.

Type 1 Error

P(Type 1 Error) = P(observed T lies in the critical region) = level of significance $\alpha$

Type 2 Error

The true mean is not $\mu_0$; instead $\mu = \mu_1$. T is no longer standard normal:

$T \sim N\!\left(\frac{\mu_1 - \mu_0}{\sigma/\sqrt{n}},\ 1\right)$

P(Type 2 Error) = P(T, under $H_1$, lies within the critical values)

Power = P(T, under $H_1$, lies outside the critical values)

Properties

There is a trade-off between Type 1 and Type 2 error:

- As $\alpha$ falls, P(Type 2 error) increases
- As the distance $|\mu_0 - \mu_1|$ increases, P(Type 2 error) falls
- If the variance increases, P(Type 2 error) increases
- If the number of samples increases, the variance of $\bar X$ falls, so P(Type 2 error) falls
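A sketch computing P(Type 2 error) and power for the example above, assuming a two-tail z-test at $\alpha$ = 0.05 with hypothetical values of $\mu_0$, $\mu_1$, $\sigma$, $n$:

```python
import numpy as np
from scipy import stats

mu0, mu1, sigma, n, alpha = 0.0, 0.5, 1.0, 25, 0.05   # hypothetical values
shift = (mu1 - mu0) / (sigma / np.sqrt(n))            # mean of T under H1
c = stats.norm.ppf(1 - alpha / 2)                     # two-tail critical value

# P(Type 2 error): T (under H1, i.e. T ~ N(shift, 1)) falls inside (-c, c)
beta = stats.norm.cdf(c - shift) - stats.norm.cdf(-c - shift)
print(beta, 1 - beta)                                 # P(Type 2 error), power
```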


Testing the Difference of Two Population Means

Data is normally distributed, or $n$ is large:

$X \sim N(\mu_X, \sigma_X^2)$, $Y \sim N(\mu_Y, \sigma_Y^2)$, so $\bar X \sim N\!\left(\mu_X, \frac{\sigma_X^2}{n_X}\right)$ and $\bar Y \sim N\!\left(\mu_Y, \frac{\sigma_Y^2}{n_Y}\right)$

$H_0: \mu_X - \mu_Y = a$

Matched Pairs

The two distributions can be logically linked; same sample size $n_X = n_Y = n$. With differences $Z_i = X_i - Y_i$,

$\bar Z = \bar X - \bar Y \sim N\!\left(\mu_X - \mu_Y,\ \frac{\sigma_X^2}{n} + \frac{\sigma_Y^2}{n}\right)$

Under $H_0$:

If the variance is known: $T = \dfrac{\bar Z - a}{\sqrt{\dfrac{\sigma_X^2}{n} + \dfrac{\sigma_Y^2}{n}}} \sim N(0,1)$

If the variance is unknown, use the sample variance: $T = \dfrac{\bar Z - a}{s/\sqrt{n}} \sim t_{n-1}$, where $s^2 = \dfrac{1}{n-1}\left(\sum Z_i^2 - n\bar Z^2\right)$

Independent Samples

The two distributions cannot be linked; different sample sizes $n_X \ne n_Y$.

$T = \frac{(\bar X - \bar Y) - a}{SE(\bar X - \bar Y)}$

Under $H_0$, if the variances are known:

$SE(\bar X - \bar Y) = \sqrt{\frac{\sigma_X^2}{n_X} + \frac{\sigma_Y^2}{n_Y}}, \qquad T = \frac{(\bar X - \bar Y) - a}{\sqrt{\dfrac{\sigma_X^2}{n_X} + \dfrac{\sigma_Y^2}{n_Y}}} \sim N(0,1)$

A $(1-\alpha)$ confidence interval for $\mu_X - \mu_Y$ is $(\bar X - \bar Y) \pm z_{\alpha/2}\sqrt{\dfrac{\sigma_X^2}{n_X} + \dfrac{\sigma_Y^2}{n_Y}}$

If the variances are unknown but equal, $\sigma_X^2 = \sigma_Y^2 = \sigma^2$:

Pooled variance: $s_P^2 = \dfrac{(n_X - 1)s_X^2 + (n_Y - 1)s_Y^2}{(n_X - 1) + (n_Y - 1)}$

The pooled variance is the weighted average of the sample variances.

$SE(\bar X - \bar Y) = \sqrt{\frac{s_P^2}{n_X} + \frac{s_P^2}{n_Y}} = \sqrt{\left(\frac{1}{n_X} + \frac{1}{n_Y}\right)\frac{(n_X - 1)s_X^2 + (n_Y - 1)s_Y^2}{(n_X - 1) + (n_Y - 1)}}$

$T = \frac{(\bar X - \bar Y) - a}{\sqrt{\dfrac{s_P^2}{n_X} + \dfrac{s_P^2}{n_Y}}} \sim t_{n_X + n_Y - 2}$

A $(1-\alpha)$ confidence interval for $\mu_X - \mu_Y$ is

$(\bar X - \bar Y) \pm t_{\alpha/2,\,n_X + n_Y - 2}\sqrt{\left(\frac{1}{n_X} + \frac{1}{n_Y}\right)\frac{(n_X - 1)s_X^2 + (n_Y - 1)s_Y^2}{(n_X - 1) + (n_Y - 1)}}$
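A sketch of the pooled (equal-variance) case with $a = 0$ and hypothetical data; scipy's ttest_ind with equal_var=True uses this same pooled statistic:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
x = rng.normal(10.0, 2.0, size=20)   # hypothetical sample X
y = rng.normal(11.0, 2.0, size=25)   # hypothetical sample Y

# pooled variance and test statistic, H0: mu_X - mu_Y = 0
nx, ny = len(x), len(y)
sp2 = ((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2)
T = (x.mean() - y.mean()) / np.sqrt(sp2 / nx + sp2 / ny)

t_stat, p = stats.ttest_ind(x, y, equal_var=True)   # should match T
print(T, t_stat, p)
```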

Test for Correlation

$\rho = \mathrm{Corr}(X,Y) = \frac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}} = \frac{E\big[(X - EX)(Y - EY)\big]}{\sqrt{E(X - EX)^2\,E(Y - EY)^2}}$

Correlation measures the linear relationship between X and Y. When $\rho = 0$, X and Y are linearly independent.

$H_0: \rho = 0$ vs $H_1: \rho > 0$ / $\rho < 0$ / $\rho \ne 0$

Sample correlation coefficient:

$\hat\rho = \frac{\sum (X_i - \bar X)(Y_i - \bar Y)}{\sqrt{\sum (X_i - \bar X)^2 \sum (Y_i - \bar Y)^2}} = \frac{\sum X_i Y_i - n\bar X\bar Y}{(n-1)s_X s_Y}$

$T = \hat\rho\,\sqrt{\frac{n-2}{1 - \hat\rho^2}} = \sqrt{\frac{n-2}{\hat\rho^{-2} - 1}} \sim t_{n-2}$
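A sketch of the correlation test on hypothetical data; scipy's pearsonr returns the same $\hat\rho$ and the two-tail p-value from $t_{n-2}$:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
x = rng.normal(size=50)
y = 0.6 * x + rng.normal(size=50)    # hypothetical linearly related data

rho, n = np.corrcoef(x, y)[0, 1], len(x)
T = rho * np.sqrt((n - 2) / (1 - rho**2))      # ~ t_{n-2} under H0: rho = 0
p = 2 * (1 - stats.t.cdf(abs(T), df=n - 2))    # two-tail p-value

print(T, p, stats.pearsonr(x, y))              # pearsonr agrees
```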


Goodness-of-Fit Test: to assess if a given distribution fits the data well

1. $H_0$: the r.v. X follows a certain distribution; $H_1$: the r.v. X does not follow that distribution

Note: in cases where the parameters of the distribution are not given, use the MLE/MME to get a point estimate.

2. Construct the table and calculate the test statistic. With categories $1, 2, \ldots, k$, the rows of the table are:

- Observed frequency $Z_i$ (row total: $n$)
- Probability $p_i$ (row total: 1)
- Expected frequency $E_i = np_i$: $np_1, np_2, \ldots$ (row total: $n$)
- Difference $Z_i - E_i$ (row total: 0)
- $(Z_i - E_i)^2 / E_i$ (row total: $T$, the test statistic)

Under the null hypothesis,

$T = \sum_{i=1}^{k} \frac{(Z_i - E_i)^2}{E_i} \sim \chi^2_{k - 1 - \text{(no. of parameters estimated)}}$

A convenient computational form, using $\sum Z_i = \sum E_i = n$:

$T = \sum_{i=1}^{k} \frac{(Z_i - E_i)^2}{E_i} = \sum_{i=1}^{k} \frac{Z_i^2}{E_i} - 2\sum_{i=1}^{k} Z_i + \sum_{i=1}^{k} E_i = \sum_{i=1}^{k} \frac{Z_i^2}{E_i} - n$

Note: If any category has an expected cell count < 5, merge groups so that all groups have expected counts of at least 5. In some cases, intervals or groups can be self-created; for example, in a test for normality, one can divide the real line into 10 intervals, each with probability 10%.
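A sketch of the table and statistic for a die-fairness test with hypothetical counts; scipy's chisquare reproduces T and the p-value:

```python
import numpy as np
from scipy import stats

Z = np.array([18, 22, 16, 25, 19, 20])   # hypothetical observed frequencies
p = np.full(6, 1 / 6)                    # H0: fair die
E = Z.sum() * p                          # expected frequencies, all >= 5

T = ((Z - E) ** 2 / E).sum()             # equivalently (Z**2 / E).sum() - Z.sum()
p_value = 1 - stats.chi2.cdf(T, df=len(Z) - 1)   # no parameters estimated

print(T, p_value, stats.chisquare(Z, E))         # chisquare agrees
```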

Contingency Tables / Tests of Association: a special application of the goodness-of-fit test

$p$: number of free counts among the $Z_{ij}$; $d$: number of estimated free parameters. For most cases, $p - d = (r-1)(c-1)$.

The observed counts $Z_{ij}$, the expected counts $E_{ij}$, and the estimated probabilities $\hat p_{ij}$ are each arranged in an $r \times c$ table (rows $i = 1, \ldots, r$; columns $j = 1, \ldots, c$), with $Z_{i\cdot}$ and $Z_{\cdot j}$ the row and column totals.

Test of independence:

$H_0: p_{ij} = p_{i\cdot}\,p_{\cdot j}$

$\hat p_{i\cdot} = \frac{Z_{i\cdot}}{n}, \quad \hat p_{\cdot j} = \frac{Z_{\cdot j}}{n}, \quad \hat p_{ij} = \hat p_{i\cdot}\,\hat p_{\cdot j}, \quad E_{ij} = n\hat p_{ij}$

Test for several binomial distributions:

$H_0: p_{11} = p_{12} = \cdots = p_{1c} = p, \qquad \hat p = \frac{Z_{11} + Z_{12} + \cdots + Z_{1c}}{Z_{\cdot 1} + Z_{\cdot 2} + \cdots + Z_{\cdot c}}$

In both cases, under $H_0$,

$T = \sum_{i=1}^{r}\sum_{j=1}^{c} \frac{(Z_{ij} - E_{ij})^2}{E_{ij}} \sim \chi^2_{p-d}$
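A sketch of the independence test on a 2 × 3 table of hypothetical counts; scipy's chi2_contingency builds $E_{ij}$ from the row and column totals and uses $(r-1)(c-1)$ degrees of freedom:

```python
import numpy as np
from scipy import stats

Z = np.array([[30, 20, 10],      # hypothetical r x c observed counts
              [20, 25, 15]])

chi2, p, dof, E = stats.chi2_contingency(Z, correction=False)
print(chi2, p, dof)              # dof = (r-1)(c-1) = 2
print(E)                         # E_ij = n * p_hat_i. * p_hat_.j
```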


Common Tests (for the simple linear regression model $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, $\varepsilon_i \sim N(0, \sigma^2)$):

Test whether $\beta_0$ is non-zero

$H_0: \beta_0 = 0$

$T = \frac{\hat\beta_0 - \beta_0}{SE(\hat\beta_0)} = \frac{\hat\beta_0 - \beta_0}{\sqrt{\hat\sigma^2 \sum_{i=1}^{n} x_i^2 \Big/ \left(n\sum_{i=1}^{n}(x_i - \bar x)^2\right)}} \sim t_{n-2}$

A $(1-\alpha)$ confidence interval for $\beta_0$ is $\hat\beta_0 \pm t_{\alpha/2,\,n-2}\sqrt{\hat\sigma^2 \sum_{i=1}^{n} x_i^2 \Big/ \left(n\sum_{i=1}^{n}(x_i - \bar x)^2\right)}$

Test whether $\beta_1$ is significantly non-zero

$H_0: \beta_1 = 0$

$T = \frac{\hat\beta_1 - \beta_1}{SE(\hat\beta_1)} = \frac{\hat\beta_1 - \beta_1}{\sqrt{\hat\sigma^2 \Big/ \sum_{i=1}^{n}(x_i - \bar x)^2}} \sim t_{n-2}$

A $(1-\alpha)$ confidence interval for $\beta_1$ is $\hat\beta_1 \pm t_{\alpha/2,\,n-2}\sqrt{\hat\sigma^2 \Big/ \sum_{i=1}^{n}(x_i - \bar x)^2}$
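A sketch of the $\beta_1$ test on hypothetical data; scipy's linregress returns the slope, its standard error, and the two-tail p-value from $t_{n-2}$:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
x = np.linspace(0, 10, 30)
y = 1.5 + 0.8 * x + rng.normal(0, 1.0, size=30)   # hypothetical linear data

res = stats.linregress(x, y)
T = res.slope / res.stderr                # (beta1_hat - 0) / SE(beta1_hat)
print(T, res.pvalue)                      # two-tail test of H0: beta1 = 0

# CI: beta1_hat +/- t_{alpha/2, n-2} * SE(beta1_hat)
t_crit = stats.t.ppf(0.975, df=len(x) - 2)
print(res.slope - t_crit * res.stderr, res.slope + t_crit * res.stderr)
```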

Testing the variance of the residuals

$\frac{(n-2)\hat\sigma^2}{\sigma^2} = \frac{1}{\sigma^2}\sum_{i=1}^{n}\big(y_i - (\hat\beta_0 + \hat\beta_1 x_i)\big)^2 \sim \chi^2_{n-2}$

ANOVA

Total Sum of Squares (SS) = Regression SS + Residual SS:

$\sum_{i=1}^{n}(y_i - \bar y)^2 = \hat\beta_1^2\sum_{i=1}^{n}(x_i - \bar x)^2 + \sum_{i=1}^{n}\big(y_i - (\hat\beta_0 + \hat\beta_1 x_i)\big)^2$

Total SS $= \sum_{i=1}^{n}(y_i - \bar y)^2 = \sum_{i=1}^{n} y_i^2 - n\bar y^2$

Regression SS $= \hat\beta_1^2\sum_{i=1}^{n}(x_i - \bar x)^2 = \hat\beta_1^2\left(\sum_{i=1}^{n} x_i^2 - n\bar x^2\right)$

Another test for whether $\beta_1$ is significantly non-zero:

$F = \frac{\text{Regression SS}}{\text{Residual SS}\,/\,(n-2)} = \left(\frac{\hat\beta_1}{SE(\hat\beta_1)}\right)^2 \sim F_{1,\,n-2}$

Regression Correlation Coefficient, $R^2$: the percentage of total variation explained by $x$

$R^2 = \frac{\text{Regression SS}}{\text{Total SS}} = 1 - \frac{\text{Residual SS}}{\text{Total SS}}, \qquad R^2_{adj} = 1 - \frac{\text{Residual SS}\,/\,(n-2)}{\text{Total SS}\,/\,(n-1)}$
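A sketch checking the SS decomposition and $R^2$ numerically, continuing with hypothetical data and a least-squares fit:

```python
import numpy as np

rng = np.random.default_rng(7)
x = np.linspace(0, 10, 30)
y = 1.5 + 0.8 * x + rng.normal(0, 1.0, size=30)   # hypothetical data

# least-squares estimates
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

total_ss = np.sum((y - y.mean()) ** 2)
reg_ss = b1**2 * np.sum((x - x.mean()) ** 2)
resid_ss = np.sum((y - (b0 + b1 * x)) ** 2)

print(np.isclose(total_ss, reg_ss + resid_ss))    # decomposition holds
print(reg_ss / total_ss)                          # R^2
```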


Analysis of Minitab Results

[Fitted line plot: Velocity = 18.06 + 0.2818 × Stopping Distance, shown with the regression line, 95% CI, and 95% PI bands; axes: Stopping Distance vs Velocity. S = 2.14805, R-Sq = 98.4%, R-Sq(adj) = 98.0%]


STATISTICS (ST102): MICHAELMAS TERM MATERIAL

    Notepad for Probability

    Basic axioms, independence, mutual exclusion, pairwise disjoint/partitions,

    total probability, Bayes theorem, permutations and combinations


    Notepad for Discrete and Continuous Random Variables

    Discrete Random Variables Continuous Random Variables

    p.d.f.

c.d.f.

    E(X)

    Var(X)

    m.g.f.

    Misc


    Discrete Random Variables 1: Discrete Uniform Distribution

    pdf: cdf: mgf: mean:

    variance:


    Discrete Random Variables 2: Bernoulli Distribution

    pdf: cdf: mgf: mean:

    variance:


    Discrete Random Variables 3: Binomial Distribution

    pdf: cdf: mgf: mean:

    variance:


    Discrete Random Variables 4: Poisson Distribution

    pdf: cdf: mgf: mean:

    variance:


    Continuous Random Variables 1: Uniform Distribution

    pdf: cdf: mgf: mean:

    variance:


    Continuous Random Variables 2: Exponential Distribution

    pdf: cdf: mgf: mean:

    variance:


    Continuous Random Variables 3: Normal Distribution

    pdf: cdf: mgf: mean:

    variance:


    Notepad for Multivariate Random Variables

    Joint distributions, Marginal Distributions, Conditional Distributions, Covariance and Correlation