Sampling and Statistical Analysis for Decision Making
A. A. Elimam
College of Business
San Francisco State University
Chapter Topics
• Sampling: Design and Methods
• Estimation:
  • Confidence Interval Estimation for the Mean (σ Known)
  • Confidence Interval Estimation for the Mean (σ Unknown)
  • Confidence Interval Estimation for the Proportion
• The Situation of Finite Populations
• Student’s t distribution
• Sample Size Estimation
• Hypothesis Testing
• Significance Levels
• ANOVA
Statistical Sampling
• Sampling: a valuable tool
• Population:
  • Too large to deal with effectively or practically
  • Impossible or too expensive to obtain all data
• Collect sample data to draw conclusions about the unknown population
Sample design
• Representative samples of the population
• Sampling plan: the approach used to obtain samples
• A sampling plan states:
  • Objectives
  • Target population
  • Population frame
  • Method of sampling
  • Data collection procedure
  • Statistical analysis tools
Objectives
• Estimate population parameters such as a mean, proportion or standard deviation
• Identify whether a significant difference exists between two populations
Population Frame
• List of all members of the target population
Sampling Methods
• Subjective Sampling:
  • Judgment: the analyst chooses the sample (e.g. best customers)
  • Convenience: chosen for ease of sampling
• Probabilistic Sampling:
  • Simple Random Sampling
    • With replacement
    • Without replacement
Sampling Methods
• Systematic Sampling:
  • Selects items periodically from the population
  • First item is randomly selected; the fixed interval may produce bias
  • Example: pick one sample every 7 days
• Stratified Sampling:
  • Population divided into natural strata
  • Allocates a proper proportion of samples to each stratum
  • Each stratum weighted by its size; cost or significance of certain strata might suggest a different allocation
  • Example: sampling of political districts (wards)
Sampling Methods
• Cluster Sampling:
  • Population divided into clusters, then a random sample of clusters is selected
  • Items within each selected cluster become members of the sample
  • Example: segment customers by geographical location
• Sampling Using Excel:
  • Population listed in a spreadsheet
  • Periodic
  • Random
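The probabilistic methods above can be sketched in a few lines of Python as an alternative to a spreadsheet. This is a minimal illustration, not the course's own procedure: the function names, the population of 100 item IDs, and the two strata ("north", "south") are all hypothetical.

```python
import random

def simple_random_sample(population, n, replace=False, rng=random):
    """Draw n items, with or without replacement."""
    if replace:
        return [rng.choice(population) for _ in range(n)]
    return rng.sample(population, n)

def systematic_sample(population, step, rng=random):
    """Random start within the first `step` items, then every `step`-th item."""
    start = rng.randrange(step)
    return population[start::step]

def stratified_sample(strata, n, rng=random):
    """Allocate n across strata in proportion to stratum size (rounded)."""
    total = sum(len(items) for items in strata.values())
    sample = {}
    for name, items in strata.items():
        k = round(n * len(items) / total)
        sample[name] = rng.sample(items, k)
    return sample

rng = random.Random(42)                  # fixed seed so the run is repeatable
population = list(range(1, 101))         # hypothetical item IDs 1..100
srs = simple_random_sample(population, 10, rng=rng)
weekly = systematic_sample(population, 7, rng=rng)   # "one sample every 7 days"
strata = {"north": list(range(60)), "south": list(range(60, 100))}
by_stratum = stratified_sample(strata, 10, rng=rng)  # 6 from north, 4 from south
```

Proportional allocation is only one choice; as the slide notes, cost or the significance of a stratum may justify a different split.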
Sampling Methods: Selection
• Systematic Sampling:
  • Population is large; random selection would take considerable effort
• Stratified Sampling:
  • Items in each stratum are homogeneous (low variance)
  • Needs a relatively smaller sample size than simple random sampling
• Cluster Sampling:
  • Items in each cluster are heterogeneous
  • Clusters are representative of the entire population
  • Requires a larger sample
Sampling Errors
• Sample does not represent the target population (e.g. an inappropriate sampling method was selected)
• Inherent error: a sample is only a subset of the population
  • Depends on the size of the sample relative to the population
  • Accuracy of estimates
  • Trade-off: cost/time versus accuracy
Sampling From Finite Populations
• Finite populations are sampled without replacement (R)
• Statistical theory assumes samples are selected with R
• When n < .05 N, the difference is insignificant
• Otherwise a correction factor is needed
• Standard error of the mean:
  $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$
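A small sketch of the 5% rule and the correction factor described above. The function name and the numbers in the usage lines (σ = 8, n = 25, N = 200 or 10,000) are hypothetical and chosen only for illustration.

```python
import math

def standard_error(sigma, n, N=None):
    """Standard error of the mean.

    The finite population correction sqrt((N - n) / (N - 1)) is applied
    only when the sample is at least 5% of the population.
    """
    se = sigma / math.sqrt(n)
    if N is not None and n / N >= 0.05:
        se *= math.sqrt((N - n) / (N - 1))
    return se

small_pop = standard_error(8, 25, N=200)     # n/N = 12.5%: correction applied
large_pop = standard_error(8, 25, N=10_000)  # n/N = 0.25%: correction skipped
```

The correction always shrinks the standard error, because a sample drawn without replacement from a small population carries more information about it.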
Statistical Analysis of Sample Data
• Estimation of population parameters (PP)
• Development of confidence intervals for PP
  • Probability that the interval correctly estimates the true population parameter
  • Means to compare alternative decisions/processes (e.g. comparing transmission production processes)
• Hypothesis testing: validate differences among PP
Estimation Process
[Figure: a random sample with mean $\bar{X} = 50$ is drawn from a population whose mean, $\mu$, is unknown. Conclusion: "I am 95% confident that $\mu$ is between 40 & 60."]
Population Parameters Estimated

Estimate      Population Parameter   Point Estimate (Sample)
Mean          $\mu$                  $\bar{X}$
Proportion    $p$                    $p_s$
Variance      $\sigma^2$             $s^2$
Std. Dev.     $\sigma$               $s$
Confidence Interval Estimation
• Provides a range of values based on observations from the sample
• Gives information about closeness to the unknown population parameter
• Stated in terms of probability: never 100% sure
Elements of Confidence Interval Estimation
• A probability that the population parameter falls somewhere within the interval
[Figure: a confidence interval centered on the sample statistic, running from the lower confidence limit to the upper confidence limit.]
Example of Confidence Interval Estimation
• Example: the 90% CI for the mean is 10 ± 2
• Point estimate = 10
• Margin of error = 2
• CI = [8, 12]
• Level of confidence = $1 - \alpha$ = 0.9
• Probability that the true PP is not in this CI = $\alpha$ = 0.1
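Confidence Limits for Population Mean
• Parameter = Statistic ± Its Error:
  $\mu = \bar{X} \pm \text{Error}$
• $\text{Error} = \bar{X} - \mu$ or $\mu - \bar{X}$
• $Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{\text{Error}}{\sigma_{\bar{X}}}$
• $\text{Error} = Z \sigma_{\bar{X}}$
• $\mu = \bar{X} \pm Z \sigma_{\bar{X}}$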
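Confidence Intervals
• $\bar{X} \pm Z \sigma_{\bar{X}} = \bar{X} \pm Z \frac{\sigma}{\sqrt{n}}$
• 90% of samples: $\mu - 1.645\,\sigma_{\bar{x}}$ to $\mu + 1.645\,\sigma_{\bar{x}}$
• 95% of samples: $\mu - 1.96\,\sigma_{\bar{x}}$ to $\mu + 1.96\,\sigma_{\bar{x}}$
• 99% of samples: $\mu - 2.58\,\sigma_{\bar{x}}$ to $\mu + 2.58\,\sigma_{\bar{x}}$
[Figure: sampling distribution of $\bar{X}$ with the 90%, 95% and 99% ranges marked.]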
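The multipliers 1.645, 1.96 and 2.58 are the standard normal values that leave α/2 in each tail. A short sketch using Python's standard library reproduces them; the helper name `z_critical` is an assumption made for this illustration.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value Z_{alpha/2} for a given confidence level."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: Z = {z_critical(level):.3f}")
# 90%: Z = 1.645
# 95%: Z = 1.960
# 99%: Z = 2.576
```

The 99% value, 2.576, is the figure the slide rounds to 2.58.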
Level of Confidence
• Probability that the unknown population parameter falls within the interval
• Denoted $100(1 - \alpha)\%$ = level of confidence, e.g. 90%, 95%, 99%
• $\alpha$ is the probability that the parameter is not within the interval
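Intervals & Level of Confidence
• Intervals extend from $\bar{X} - Z \sigma_{\bar{X}}$ to $\bar{X} + Z \sigma_{\bar{X}}$
• $100(1 - \alpha)\%$ of intervals contain $\mu$; $100\alpha\%$ do not
[Figure: sampling distribution of the mean, centered at $\mu_{\bar{X}} = \mu$, with area $1 - \alpha$ in the middle and $\alpha/2$ in each tail. Confidence intervals built from different sample means $\bar{X}$ are drawn beneath it.]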
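The claim that about 100(1 - α)% of intervals contain μ can be checked by simulation. The sketch below is only an illustration under stated assumptions: a normal population with hypothetical μ = 50 and σ = 8, samples of n = 25, σ treated as known, and a fixed random seed.

```python
import random
from statistics import NormalDist

def coverage(mu, sigma, n, confidence=0.95, trials=4000, seed=0):
    """Fraction of simulated z-intervals that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and compute its mean.
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / trials

share = coverage(mu=50, sigma=8, n=25, confidence=0.95)
print(f"share of intervals containing mu: {share:.3f}")
```

The printed share should land close to 0.95. Any single interval either contains μ or it does not, so the confidence level describes the long-run behavior of the procedure.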
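Factors Affecting Interval Width
• Data variation, measured by $\sigma$
• Sample size: $\sigma_{\bar{X}} = \sigma / \sqrt{n}$
• Level of confidence $(1 - \alpha)$
• Intervals extend from $\bar{X} - Z \sigma_{\bar{X}}$ to $\bar{X} + Z \sigma_{\bar{X}}$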
Confidence Interval Estimates
• Mean
  • σ Known
  • σ Unknown
• Proportion
• Finite Population
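Confidence Intervals (σ Known)
• Assumptions:
  • Population standard deviation is known
  • Population is normally distributed
  • If not normal, use large samples
• Confidence interval estimate:
  $\bar{X} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$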
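A minimal sketch of the σ-known interval. The function name and the inputs in the usage line (sample mean 50, σ = 8 treated as known, n = 25) are hypothetical, since the slides give no worked example for this case.

```python
import math
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin

lower, upper = z_interval(xbar=50, sigma=8, n=25, confidence=0.95)
print(f"{lower:.3f} <= mu <= {upper:.3f}")  # 46.864 <= mu <= 53.136
```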
Confidence Intervals (σ Unknown)
• Assumptions:
  • Population standard deviation is unknown
  • Population must be normally distributed
• Use Student’s t distribution
• Confidence interval estimate:
  $\bar{X} - t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}}$
Student’s t Distribution
• Shape similar to the normal distribution
• Different t distributions based on df
• Has a larger variance than the normal
• Larger sample size: t approaches the normal
• At n = 120, virtually the same
• For any sample size, the sample mean standardized with S follows Student’s t (normal population)
• For unknown σ, and when in doubt, use t
Student’s t Distribution
[Figure: the standard normal (Z) curve compared with t (df = 13) and t (df = 5), all centered at 0. The t curves are bell-shaped and symmetric, with ‘fatter’ tails.]
Degrees of Freedom (df)
• Number of observations that are free to vary after the sample mean has been calculated
• Example: the mean of 3 numbers is 2
  • X1 = 1 (or any number)
  • X2 = 2 (or any number)
  • X3 = 3 (cannot vary)
  • Mean = 2
• Degrees of freedom = n - 1 = 3 - 1 = 2
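The example can be restated in code: once the mean and n - 1 of the values are fixed, the last value is determined. The helper name `last_value` is hypothetical.

```python
def last_value(mean, free_values):
    """Return the one remaining value implied by the mean and the n - 1 free values."""
    n = len(free_values) + 1
    return n * mean - sum(free_values)

print(last_value(2, [1, 2]))    # 3, as in the slide
print(last_value(2, [10, -4]))  # 0: other free choices force a different last value
```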
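Student’s t Table

Upper Tail Area
df    .25     .10     .05
1     1.000   3.078   6.314
2     0.817   1.886   2.920
3     0.765   1.638   2.353

• Assume: n = 3, df = n - 1 = 2
• $\alpha = .10$, $\alpha/2 = .05$
• t value: $t_{.05,\,2} = 2.920$
[Figure: t distribution with an upper-tail area of .05 to the right of t = 2.920.]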
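Example: Interval Estimation σ Unknown
A random sample of n = 25 has $\bar{X} = 50$ and s = 8. Set up a 95% confidence interval estimate for $\mu$.
$\bar{X} - t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}}$
$50 - 2.0639 \frac{8}{\sqrt{25}} \le \mu \le 50 + 2.0639 \frac{8}{\sqrt{25}}$
$46.70 \le \mu \le 53.30$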
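Example: Tracway Transmission
Sample of n = 30, $\bar{X} = 289.6$, S = 45.4. Find a 99% CI for $\mu$, the mean of each transmission system process. Therefore $\alpha = .01$ and $\alpha/2 = .005$.
$t_{\alpha/2,\,n-1} = t_{.005,\,29} = 2.7564$
$\bar{X} - t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}}$
$289.6 - 2.7564 \frac{45.4}{\sqrt{30}} \le \mu \le 289.6 + 2.7564 \frac{45.4}{\sqrt{30}}$
$266.75 \le \mu \le 312.45$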
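Both t-interval examples above can be reproduced with a few lines of Python. This sketch takes the critical value from the t table as an input, because the standard library has no t distribution; the function name `t_interval` is an assumption made for illustration.

```python
import math

def t_interval(xbar, s, n, t_crit):
    """Confidence interval for the mean when sigma is unknown.

    t_crit is t_{alpha/2, n-1}, looked up in a Student's t table.
    """
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin

# n = 25, mean 50, s = 8, 95% confidence: t_{.025,24} = 2.0639
lo1, hi1 = t_interval(50, 8, 25, 2.0639)
print(f"{lo1:.2f} <= mu <= {hi1:.2f}")   # 46.70 <= mu <= 53.30

# Tracway: n = 30, mean 289.6, S = 45.4, 99% confidence: t_{.005,29} = 2.7564
lo2, hi2 = t_interval(289.6, 45.4, 30, 2.7564)
print(f"{lo2:.2f} <= mu <= {hi2:.2f}")   # 266.75 <= mu <= 312.45
```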
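Estimation for Finite Populations
• Assumptions: sample is large relative to the population, n / N > .05
• Use the finite population correction factor
• Confidence interval (mean, $\sigma_X$ unknown):
  $\bar{X} - t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$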
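A sketch of the finite-population interval. The slides give no numbers for this case, so the usage line reuses a sample of n = 25 with mean 50 and s = 8 and adds a hypothetical population size N = 200 (n / N = 12.5%). The function name is also an assumption.

```python
import math

def t_interval_finite(xbar, s, n, N, t_crit):
    """t-interval for the mean with the finite population correction factor."""
    fpc = math.sqrt((N - n) / (N - 1))
    margin = t_crit * s / math.sqrt(n) * fpc
    return xbar - margin, xbar + margin

lo_f, hi_f = t_interval_finite(50, 8, 25, N=200, t_crit=2.0639)
print(f"{lo_f:.2f} <= mu <= {hi_f:.2f}")
```

With these inputs the interval is narrower than the uncorrected one, since the correction factor is below 1.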
Confidence Interval Estimate Proportion
• Assumptions:
  • Two categorical outcomes
  • Population follows the binomial distribution
  • Normal approximation can be used if $n \cdot p \ge 5$ and $n \cdot (1 - p) \ge 5$
• Confidence interval estimate:
  $p_s - Z_{\alpha/2} \sqrt{\frac{p_s (1 - p_s)}{n}} \le p \le p_s + Z_{\alpha/2} \sqrt{\frac{p_s (1 - p_s)}{n}}$
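Example: Estimating Proportion
A random sample of 1000 voters showed 51% voted for Candidate A. Set up a 90% confidence interval estimate for p.
$p_s - Z_{\alpha/2} \sqrt{\frac{p_s (1 - p_s)}{n}} \le p \le p_s + Z_{\alpha/2} \sqrt{\frac{p_s (1 - p_s)}{n}}$
$.51 - 1.645 \sqrt{\frac{.51 (1 - .51)}{1000}} \le p \le .51 + 1.645 \sqrt{\frac{.51 (1 - .51)}{1000}}$
$.484 \le p \le .536$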
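The voter example can be checked with a short sketch. The function name is hypothetical, and the normal-approximation check uses the sample proportion as a stand-in for the unknown p, which is an assumption of this illustration.

```python
import math
from statistics import NormalDist

def proportion_interval(p_s, n, confidence=0.90):
    """Normal-approximation confidence interval for a proportion."""
    # Rule of thumb from the assumptions, with p_s standing in for p.
    if n * p_s < 5 or n * (1 - p_s) < 5:
        raise ValueError("normal approximation needs n*p >= 5 and n*(1-p) >= 5")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p_s * (1 - p_s) / n)
    return p_s - margin, p_s + margin

p_lo, p_hi = proportion_interval(0.51, 1000, confidence=0.90)
print(f"{p_lo:.3f} <= p <= {p_hi:.3f}")  # 0.484 <= p <= 0.536
```

Because the interval includes values below .50, this sample alone does not show that Candidate A holds a majority.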
Sample Size
• Too big: requires too many resources
• Too small: won’t do the job
Example: Sample Size for Mean
What sample size is needed to be 90% confident of being correct within ± 5? A pilot study suggested that the standard deviation is 45.
$n = \frac{Z^2 \sigma^2}{\text{Error}^2} = \frac{1.645^2 \cdot 45^2}{5^2} = 219.2 \cong 220$
Round up.
Example: Sample Size for Proportion
What sample size is needed to be within ± 5% (.05) with 90% confidence? Out of a population of 1,000, we randomly selected 100, of which 30 were defective, so p = .30.
$n = \frac{Z^2 p (1 - p)}{\text{error}^2} = \frac{1.645^2 (.30)(.70)}{.05^2} = 227.3 \cong 228$
Round up.
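Both sample-size calculations above follow the same pattern: compute n from the formula, then round up. A minimal sketch, with hypothetical function names:

```python
import math

def sample_size_mean(z, sigma, error):
    """Smallest n so that z * sigma / sqrt(n) <= error."""
    return math.ceil((z * sigma / error) ** 2)

def sample_size_proportion(z, p, error):
    """Smallest n so that z * sqrt(p * (1 - p) / n) <= error."""
    return math.ceil(z ** 2 * p * (1 - p) / error ** 2)

n_mean = sample_size_mean(z=1.645, sigma=45, error=5)             # 219.2 -> 220
n_prop = sample_size_proportion(z=1.645, p=0.30, error=0.05)      # 227.3 -> 228
print(n_mean, n_prop)  # 220 228
```

Rounding up rather than to the nearest integer guarantees the margin of error does not exceed the target.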
Hypothesis Testing
• Draw inferences about two contrasting propositions (hypotheses)
• Determine whether two means are equal:
  1. Formulate the hypothesis to test
  2. Select a level of significance
  3. Determine a decision rule as a basis for the conclusion
  4. Collect data and calculate a test statistic
  5. Apply the decision rule to draw a conclusion
Hypothesis Formulation
• Null hypothesis: H0, representing the status quo
• Alternative hypothesis: H1
• The test assumes that H0 is true
• Sample evidence is obtained to determine whether H1 is more likely to be true
Significance Level

Test result    H0 True         H0 False
Accept H0      Correct         Type II Error
Reject H0      Type I Error    Correct

• Probability of making a Type I error = $\alpha$ = level of significance
• Confidence coefficient = $1 - \alpha$
• Probability of making a Type II error = $\beta$
• Power of the test = $1 - \beta$
Decision Rules
• Sampling distribution: normal or t distribution
• Rejection region
• Non-rejection region
• Two-tailed test: $\alpha/2$ in each tail
• One-tailed test: $\alpha$ in one tail
• P-values
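As an illustration of a two-tailed decision rule, the sketch below tests whether two population means are equal using a z statistic. It assumes both population standard deviations are known and uses hypothetical summary statistics; with unknown σ the t distribution would be used instead.

```python
import math
from statistics import NormalDist

def two_sample_z_test(xbar1, xbar2, sigma1, sigma2, n1, n2, alpha=0.05):
    """Two-tailed z test of H0: mu1 = mu2 against H1: mu1 != mu2."""
    se = math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    z = (xbar1 - xbar2) / se
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # alpha/2 in each tail
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    reject = abs(z) > z_crit                       # z falls in the rejection region
    return z, p_value, reject

# Hypothetical data: two samples of 50, both sigmas taken as 8.
z_a, p_a, reject_a = two_sample_z_test(52, 50, 8, 8, 50, 50)  # z = 1.25
z_b, p_b, reject_b = two_sample_z_test(55, 50, 8, 8, 50, 50)  # z = 3.125
print(reject_a, reject_b)  # False True
```

Comparing |z| with the critical value and comparing the p-value with α are equivalent decision rules: H0 is rejected exactly when the p-value is below α.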
Hypothesis Testing: Cases
• Two-Sample Means
• F-Test for Variances
• Proportions
• ANOVA: Differences of several means
• Chi-square for independence
Chapter Summary
• Sampling: Design and Methods
• Estimation:
  • Confidence Interval Estimation for the Mean (σ Known)
  • Confidence Interval Estimation for the Mean (σ Unknown)
  • Confidence Interval Estimation for the Proportion
• Finite Populations
• Student’s t distribution
• Sample Size Estimation
• Hypothesis Testing
• Significance Levels: Type I/II errors
• ANOVA