Statistics for Librarians, Session 3: Inferential statistics
University of North Texas
INFERENTIAL STATISTICS
GOALS OF SERIES
Comfort
Fears
SESSION OBJECTIVES
Purpose of Inferential Statistics
Probability
Elements of Significance Testing
Three key tests
• T-test
• Chi-squared
• Correlation (or binomial)
Effect Measures
PURPOSE OF INFERENTIAL STATISTICS
• Infer results
• Draw conclusions
• Increase the Signal-Noise ratio
Signal
Noise
INFERENTIAL STATISTICS
Tests of hypotheses
• Expectations
• Associations
Accounts for uncertainty
• Random error
• Confidence interval
HYPOTHESES
Your Hypothesis (H1)
Null Hypothesis (H0)
NOT TO PROVE, BUT TO FALSIFY
H1: Difference
H0: No Difference
NOT TO PROVE, BUT TO FALSIFY
H1: >=10% Increase
H0: <10% Increase
REVIEW OF DESCRIPTIVE STATISTICS
LEVELS OF MEASUREMENT (NOIR)
Nominal
• Counts by category
• Binary (Yes/No)
• No meaning between the categories (Blue is not better than Red)
Ordinal
• Ranks
• Scales
• Space between ranks is subjective
Interval
• Integers
• Zero is just another value – doesn’t mean “absence of”
• Space between values is equal and objective, but discrete
Ratio
• Interval data with a baseline
• Zero (0) means “absence of”
• Space between is continuous
• Includes simple counts
DESCRIPTIVE STATISTICAL ANALYSIS
Central Tendency
Error
Spread
CENTRAL TENDENCY BY LEVELS OF MEASUREMENT
Interval or Ratio
Mean
Median
Nominal or Rank
Mode
Median (rank only)
SPREAD
Interval & Ratio
• Range
• Quantiles
• Standard Deviation
Nominal & Rank
• Distribution Tables
• Bar Graphs
How variable is the data?
RANGE & QUARTILES
FORMULAS
Mean
Standard Deviation
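The two formulas named on this slide can be sketched in Python. This is a minimal illustration, assuming the sample (n-1) form of the standard deviation; the function names and the `years` list are hypothetical and not taken from the slides.

```python
import math

def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    return sum(values) / len(values)

def sample_std_dev(values):
    """Sample standard deviation: square root of the summed squared
    deviations from the mean, divided by n - 1."""
    m = mean(values)
    squared_deviations = [(v - m) ** 2 for v in values]
    return math.sqrt(sum(squared_deviations) / (len(values) - 1))

# Hypothetical years of service for five staff members.
years = [2, 4, 4, 6, 9]
print(mean(years))            # 5.0
print(sample_std_dev(years))  # square root of 7, about 2.65
```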
PROBABILITY
WHAT’S PROBABILITY GOT TO DO WITH STATISTICS?
WHAT IS PROBABILITY?
Chance of something happening (x)
Expressed as P(x)=y
Between 0 and 1
Based on distribution of events
STEM-AND-LEAF
Stem (Groups) | Leaf (Last digit)
0 | 01112222222222222233333344445556666677788899
1 | 0000000011122223333356778899
2 | 00122234444799
3 | 0245
Years at UNT
0 5 13
1 6 13
1 6 13
1 6 13
2 6 15
2 6 16
2 7 17
2 7 17
2 7 18
2 8 18
2 8 19

3 11 29
4 11 29
4 12 30
4 12 32
4 12 34
5 12 35
5 13
Stem | Leaf | Count
0 | 1122223334445555666666677777899 | 31
1 | 000011122222222333346677889 | 27
2 | 0122234468 | 10
3 | 1112355888 | 11
4 | 12 | 2

Range | Count
0-9 | 31
10-19 | 27
20-29 | 10
30-39 | 11
40-49 | 2
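Dividing each count by the total turns a frequency table into an empirical probability distribution. A small Python sketch using the counts from the Range/Count table above; the variable names are illustrative only.

```python
# Counts per range of years at UNT, copied from the Range/Count table.
range_counts = {"0-9": 31, "10-19": 27, "20-29": 10, "30-39": 11, "40-49": 2}

total = sum(range_counts.values())

# P(x) for each range: the share of all observations that fall in it.
probabilities = {r: count / total for r, count in range_counts.items()}

for r, p in probabilities.items():
    print(f"P({r}) = {p:.3f}")

# Each probability lies between 0 and 1, and together they sum to 1.
```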
[Figure: Histogram of Years at UNT. X-axis: ranges 0-9, 10-19, 20-29, 30-39, 40-49. Y-axis: count, 0 to 40.]
NORMAL DISTRIBUTIONS
PROBABILITY DISTRIBUTION
Set the mean to 0
Standard deviations above and below the mean
DEMONSTRATION OF DISTRIBUTIONS
Distribution of the Population
The “Truth”
N is the number of samples
n is the number of items in each sample
Watch the cumulative means & medians slowly converge on the population values
ACTIVITIES
CASE STUDY
• Background: Info-Lit course is meeting resistance from skeptical faculty.
• Research Questions:
  • Does the IL course improve grades on final papers?
  • Can the IL course improve passing rates for the course?
  • Do students in different majors respond differently to the IL training?
  • Is the final score related to the number of credit hours enrolled for each student?
METHODOLOGY
Selection
• Two sections of same course with different instructors.
• Random Assignment
Outcome
• Blinded scoring by 2 TAs
• Scores range from 1-100
• Passing grade: 70
ACTIVITIES
Table 1
• Distribution of scores
Table 2
• Distribution of passing rates by major
Table 3
• Correlation of scores with credit hours
DESCRIPTIVE STATISTICS OF CASE STUDY
DISTRIBUTION OF SCORES
Table 1
• Distribution of scores
Table 2
• Distribution of passing rates by broad field of major
Table 3
• Correlation of scores & credit hours
SIGNIFICANCE TESTING
SIGNIFICANCE TESTING
Comparing significance of differences
• Groups against each other
• A group against the population or standard
What is “significant”?
• Risk of being wrong
• Alpha (α)
• Set in advance
Critical Value
• The value that the statistic must meet or exceed to be statistically significant.
• Based on statistic and α
STEPS IN SIGNIFICANCE TESTING
Which Test?
Calculate Statistic
Critical Value of Statistic?
Probability (p-value)
KEY ELEMENTS OF SIGNIFICANCE TESTING
Null Hypothesis
Measure of Central Tendency
Standard deviations
Risk of being wrong (alpha)
• Usually .05 or .025 or .01 or .001
Degrees of freedom (df)
DEGREES OF FREEDOM
Number of values in the final calculation of a statistic that are free to vary.
DEGREES OF FREEDOM EXPLAINED
• All these have a mean of 5:
  • 5, 5, 5
  • 2, 8, 5
  • 3, 2, 10
  • 7, 4, & ?
• If 2 values are known and the mean is known, then the 3rd value is also known.
• Only 2 of the 3 values are free to vary.
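The idea that the last value is forced can be checked with a few lines of Python. This is only an illustrative sketch; `missing_value` is a hypothetical helper name.

```python
def missing_value(known_values, mean, n):
    """Return the one value that is fixed once the mean and the
    other n - 1 values are known: mean * n minus the known sum."""
    return mean * n - sum(known_values)

# The slide's last example: 7, 4 and an unknown third value, mean of 5.
print(missing_value([7, 4], mean=5, n=3))  # 4
```

Because the third value can always be computed this way, only two of the three values are free to vary, so df = 3 - 1 = 2.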
CALCULATING DEGREES OF FREEDOM (DF)
For a single sample:
• Degrees of freedom (df) for t-test = n-1
For more than one group:
• df = ∑(n-1) for all groups (k)
• OR, ∑n - k
For comparing proportions in categories (k):
• df = k-1 (# of categories minus 1)
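The three rules can be written as small Python functions. This is a sketch with hypothetical function names; the group sizes of 51 and 50 are the ones that appear on the t-test results slide, and 5 is the number of majors in the case study.

```python
def df_single_sample(n):
    """One sample of n values: n - 1 are free to vary."""
    return n - 1

def df_groups(group_sizes):
    """Several groups: sum of (n - 1) per group, which equals
    the total sample size minus the number of groups k."""
    return sum(n - 1 for n in group_sizes)

def df_categories(k):
    """Proportions across k categories: k - 1."""
    return k - 1

print(df_groups([51, 50]))   # 99
print(df_categories(5))      # 4
```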
COMPARING VALUES: T-TEST
T-TEST
Used with interval or ratio data
Based on normal distribution
Four Decisions
• Paired or un-paired samples?
• Equal or unequal variances (standard deviations)?
• Risk?
• One- or two-tail?
  • Direction of expected difference
  • Best to bet on difference in both directions (2-tail)
One-Tail
Two-Tail
T-TEST FORMULA FOR UNPAIRED SAMPLES
$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}}$
Signal: Difference Between Group Means
Noise: Variability of Groups
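The signal-over-noise reading of the formula can be sketched in Python. This is a minimal illustration, assuming equal variances (a pooled estimate) for the noise term; the function name and both score lists are hypothetical and are not the case-study data.

```python
import math
from statistics import mean, variance

def unpaired_t(group1, group2):
    """t statistic for two independent samples, assuming equal variances."""
    n1, n2 = len(group1), len(group2)
    # Signal: difference between the group means.
    signal = mean(group1) - mean(group2)
    # Noise: standard error of that difference, built from the pooled variance.
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    noise = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return signal / noise

# Hypothetical final-paper scores for two small groups.
intervention = [82, 75, 90, 68, 85]
control = [70, 65, 80, 60, 75]

t = unpaired_t(intervention, control)
df = len(intervention) + len(control) - 2
print(f"t = {t:.3f}, df = {df}")
```

A larger difference between means raises t, while more spread within the groups lowers it.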
ELEMENTS OF T-TEST USING EXCEL DATA ANALYSIS TOOLPAK
Paired or Unpaired samples?
• Unpaired
Equal or Unequal Variances?
• Equal*
Data
• Data for intervention group
• Data for control group
Hypothesized difference
• 0
Alpha
• 0.025 (for a 2-tail test)
T-TEST IN EXCEL
READING T-TEST RESULTS
df = ∑(n-1) = (51-1)+(50-1) = 50+49 = 99
p <= 0.025?
IS THE DIFFERENCE SIGNIFICANT?
p=0.0005
TESTING DISTRIBUTION OF NOMINAL DATA
PEARSON’S CHI-SQUARED (χ²) GOODNESS OF FIT TEST
Does an observed frequency distribution differ from an expected distribution?
• Observed is the sample or the intervention.
• Expected is the population or the control or a theoretical distribution.
• Will depend on your Null Hypothesis
Nominal or categorical data
• Counts by category
EXPECTED RATIOS FOR CASE STUDY
Research Question:
• Do students in different majors respond differently to the IL training?
Null Hypothesis
• The ratio of students who passed will be the same for all majors.
WHEN TO USE PEARSON’S CHI-SQUARED GOODNESS OF FIT TEST
Nominal Data
Sample Size
• Not too large:
  • Sample is at most 1/10th of population
• Not too small:
  • At least five in each of the categories for the expected group.
OBSERVED PASSING RATES BY MAJOR
Major | Passed | Not Passed | Grand Total
Arts | 6 | 7 | 13
Humanities | 8 | 5 | 13
Social Sciences | 17 | 10 | 27
STEM | 20 | 5 | 25
Undeclared | 16 | 7 | 23
Total | 67 | 34 | 101
EXPECTED RATIOS OF PASSING RATES BY MAJOR
• H0: Rates of passing will be the same for all majors.
• Expected rates: 70% of class passes.
• Expected ratios: 70% of each major passes.
Major | Passed | Not Passed | Grand Total
Arts | 11.2 (16*.7) | 4.8 | 16
Humanities | 11.2 (16*.7) | 4.8 | 16
Social Sciences | 18.2 (26*.7) | 7.8 | 26
STEM | 16.1 (23*.7) | 6.9 | 23
Undeclared | 14 (20*.7) | 6 | 20
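The expected counts are each group total multiplied by the expected pass rate, with the remainder counted as not passed. A short Python sketch of that arithmetic, using the group totals as printed in the expected-ratios table; the variable names are illustrative.

```python
# Group totals as printed in the expected-ratios table.
group_totals = {
    "Arts": 16,
    "Humanities": 16,
    "Social Sciences": 26,
    "STEM": 23,
    "Undeclared": 20,
}
expected_pass_rate = 0.70  # H0: 70% of every major passes

expected = {}
for major, total in group_totals.items():
    passed = total * expected_pass_rate
    # Round to one decimal place, as on the slide.
    expected[major] = (round(passed, 1), round(total - passed, 1))

for major, (passed, not_passed) in expected.items():
    print(f"{major}: {passed} passed, {not_passed} not passed")
```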
CHI-SQUARED GOF TEST FORMULA
• Critical value of Chi-squared depends on degrees of freedom.
• Degrees of freedom
  • Based on the number of categories or table cells (k)
  • df=k-1
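The formula itself did not survive in this transcript. The standard Pearson goodness-of-fit statistic is the sum over categories of (observed - expected)² / expected, and a minimal Python sketch of it follows. The function name and the counts are hypothetical, not the case-study data.

```python
def chi_squared_gof(observed, expected):
    """Pearson's goodness-of-fit statistic: sum of (O - E)^2 / E,
    returned together with df = k - 1."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need the same number of categories")
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return statistic, df

# Hypothetical counts for four categories with equal expected frequencies.
observed = [18, 22, 30, 30]
expected = [25, 25, 25, 25]

statistic, df = chi_squared_gof(observed, expected)
print(f"chi-squared = {statistic:.2f}, df = {df}")  # chi-squared = 4.32, df = 3
```

The statistic is then compared with the critical value for the chosen alpha and df, or converted to a p-value.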
CHI-SQUARED IN EXCEL
What is Null Hypothesis?
There is no difference between the majors regarding passing rates.
What is your alpha (risk)?
0.05
Data in summary tables?
Actual Ratios
Expected Ratios
Excel function:
=CHISQ.TEST(actual_range, expected_range)
Provides a p-value
0.0000172
Is p-value <= alpha?
Yes
CORRELATION OF SCORE & SEMESTER HOURS ENROLLED
STATISTICAL CORRELATION
Quantitative value of relationship of 2 variables
Expressed in range of -1 to +1
• -1 represents a perfect indirect (negative) correlation
• 0 represents no correlation
• +1 represents a perfect direct (positive) correlation
Based on co-variance
• How much two variables change together
PEARSON’S PRODUCT MOMENT CORRELATION COEFFICIENT
Most commonly used statistic
Normally distributed interval or ratio data only
Labeled as r
Multiplication = Interaction
Signal
Noise
$r_{xy} = \frac{\sum (x - \bar{x})(y - \bar{y})}{(n - 1) s_x s_y}$
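Pearson's r can be computed directly from its definition: the summed products of paired deviations (signal) divided by (n - 1) times the two sample standard deviations (noise). The Python sketch below is illustrative only; the function name and the paired data are hypothetical, not the case-study values.

```python
from statistics import mean, stdev

def pearson_r(x, y):
    """Pearson's product moment correlation coefficient."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    # Signal: how the paired deviations from each mean multiply together.
    signal = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Noise: (n - 1) times the product of the sample standard deviations.
    noise = (n - 1) * stdev(x) * stdev(y)
    return signal / noise

# Hypothetical credit hours and final scores for five students.
credit_hours = [12, 15, 9, 18, 6]
scores = [70, 80, 65, 90, 60]

r = pearson_r(credit_hours, scores)
print(f"r = {r:.3f}")
```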
CORRELATION IN EXCEL
Null Hypothesis?
• No correlation
Coefficient function (r):
• =PEARSON(range1,range2)
Excel does NOT have a single function to test for significance
Calculate Probability:
n # in sample 101
df # in sample - 2 99
alpha 0.025 for 2-Tail Test 0.025
r =PEARSON(range1,range2) 0.362287
t =r*SQRT(df)/SQRT(1-r^2) 3.867434
p =T.DIST.2T(t,df) 0.000197
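The t value in this table can be reproduced from r and df alone, using t = r × √df / √(1 - r²) with df = n - 2. A minimal Python check with the n and r reported above; the p-value step is left to a t-distribution function such as the T.DIST.2T call shown in the table.

```python
import math

n = 101          # number of students in the sample
r = 0.362287     # Pearson's r reported in the table

df = n - 2
# Convert r into a t statistic with n - 2 degrees of freedom.
t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)
print(f"df = {df}, t = {t:.4f}")
```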
CORRELATIONS FOR ORDINAL DATA
Spearman’s ρ (rho)
• Use if there are limited ties in rank.
Kendall’s τ (tau)
• Use if you have a number of ties.
SELECTING THE TESTS
KNOW THE TESTS
Assumptions
Limitations
Appropriate data type
What the test tests
FACTORS ASSOCIATED WITH CHOICE OF STATISTICAL METHOD
Level of Measurement
What is being compared
Independence of units
Underlying variance in the population
Distribution
Sample size
Number of comparison groups
USE A FLOW CHART
GOING BEYOND THE P-VALUE: EFFECT SIZES
AND THE P-VALUE SAYS…
Much about the distributions
More about the H0 than H1
Little about size of differences
MORE USEFUL STATISTICS
Effect Sizes
• Tell the real story
Confidence Intervals
• State your certainty
EFFECT SIZES OF QUANTITATIVE DATA
Differences from the mean
• Standardized
• Weighted against the pooled (average) standard deviation
• Cohen’s d
$d = \frac{\bar{x}_1 - \bar{x}_2}{s_{x_1, x_2}}$
Correlations
• Cohen’s guidelines for Pearson’s r
• r = 0.362
Effect Size | r >
Small | .10
Medium | .30
Large | .50
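Cohen's d divides the difference between two group means by their pooled standard deviation. A minimal Python sketch under that definition; the function name and the two score lists are hypothetical, not the case-study data.

```python
import math
from statistics import mean, variance

def cohens_d(group1, group2):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / math.sqrt(pooled_var)

# Hypothetical final-paper scores for two small groups.
intervention = [82, 75, 90, 68, 85]
control = [70, 65, 80, 60, 75]

d = cohens_d(intervention, control)
print(f"d = {d:.2f}")
```

Unlike the t statistic, d does not grow with sample size, which is why it describes the size of a difference rather than its significance.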
EFFECT SIZES OF QUALITATIVE DATA
Based on Contingency table
Relative risk
• Uses probabilities
RR of Case Study
• RR = 1.608
• The passing rate for the intervention group was 1.6 times the passing rate for the control group.
 | Pass | No Pass | Total
Intervention | a (41) | b (24) | a+b (65)
Control | c (26) | d (10) | c+d (36)
Totals | a+c (67) | b+d (34) | a+b+c+d (101)
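Relative risk is the ratio of two passing rates: RR = [a / (a + b)] / [c / (c + d)]. The Python sketch below shows the calculation. The counts passed in are an assumption, not the table's printed cells: they keep the 41 and 26 passes but use group sizes of 50 and 51 (the two sizes shown on the t-test results slide), which reproduces a ratio of about 1.6.

```python
def relative_risk(a, b, c, d):
    """Ratio of the pass rate in the first group, a / (a + b),
    to the pass rate in the second group, c / (c + d)."""
    return (a / (a + b)) / (c / (c + d))

# Assumed counts (hypothetical): 41 of 50 pass in one group, 26 of 51 in the other.
rr = relative_risk(a=41, b=9, c=26, d=25)
print(f"RR = {rr:.3f}")
```

An RR of 1 would mean the two groups pass at the same rate.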
CONFIDENCE INTERVALS
Point estimates
• Single value
• Mean
Intervals
• Degree of uncertainty
• Range of certainty around the point estimate
Based on
• Point estimate (e.g. mean)
• Confidence level (usually .95)
• Standard deviation
Expressed as:
• The mean score of the students who had the IL training was 79.5, with a 95% CI of 76.4 to 82.5.
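A confidence interval around a mean can be sketched from the three ingredients listed: the point estimate, the confidence level, and the standard deviation. The Python example below is a rough illustration that assumes a normal approximation (a z value of about 1.96 for 95%); the function name and the scores are hypothetical, not the case-study data.

```python
import math
from statistics import NormalDist, mean, stdev

def confidence_interval(values, confidence=0.95):
    """Approximate confidence interval for the mean of a sample."""
    n = len(values)
    point_estimate = mean(values)
    standard_error = stdev(values) / math.sqrt(n)
    # Normal approximation; for small samples a t critical value
    # would give a slightly wider interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return point_estimate - margin, point_estimate + margin

# Hypothetical scores for ten students.
scores = [72, 85, 78, 90, 66, 81, 79, 88, 74, 83]

low, high = confidence_interval(scores)
print(f"mean = {mean(scores):.1f}, 95% CI = {low:.1f} to {high:.1f}")
```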
CASE STUDY CONCLUSIONS
• Research Questions:
  • Could the IL course improve grades on final papers?
  • Could the IL course improve passing rates for the course?
  • Do students in different majors respond differently to the IL training?
  • Is the final score related to the number of credit hours enrolled for each student?
• Control for external variables
STATISTICAL ANALYSIS
Signal
Noise