SAMPLING METHODS Stratification and Clustering Richard L. Scheaffer University of Florida...
![Page 1: SAMPLING METHODS Stratification and Clustering Richard L. Scheaffer University of Florida RLS907@bellsouth.net .](https://reader033.fdocuments.net/reader033/viewer/2022061305/551452935503462d4e8b51b1/html5/thumbnails/1.jpg)
SAMPLING METHODS
Stratification and Clustering
Richard L. Scheaffer
University of Florida
http://courses.ncssm.edu/math/Stat_Inst/Notes.htm
The Sampling Process
• Population
• Sampling Unit
• Sampling Frame
• Sampling Design
• Sample
…and Process Failures
• Sampling errors
• Non-sampling errors*
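PSI Example
CEMENT PLANT

| Street \ Avenue | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 121 | 118 | 124 | 123 | 116 | 118 | 120 | 118 | 114 | 122 |
| 2 | 116 | 118 | 118 | 113 | 117 | 116 | 117 | 112 | 112 | 115 |
| 3 | 114 | 107 | 109 | 106 | 112 | 108 | 112 | 110 | 111 | 111 |
| 4 | 105 | 104 | 103 | 101 | 103 | 105 | 104 | 106 | 109 | 107 |
| 5 | 100 | 100 | 101 | 96 | 98 | 96 | 100 | 100 | 105 | 100 |
| 6 | 97 | 95 | 96 | 94 | 96 | 95 | 96 | 97 | 96 | 97 |
| 7 | 92 | 90 | 91 | 89 | 93 | 94 | 93 | 92 | 92 | 90 |
| 8 | 86 | 81 | 85 | 87 | 85 | 85 | 86 | 87 | 83 | 84 |
| 9 | 80 | 78 | 80 | 79 | 77 | 81 | 81 | 79 | 84 | 81 |
| 10 | 76 | 77 | 74 | 77 | 75 | 74 | 80 | 75 | 77 | 74 |

Display 1: Grid of houses showing PSI measurements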
[Box plot: Sampling Design Comparisons. Horizontal axis: Sample_Mean (80 to 120). Designs shown: Cluster_Column, Cluster_Row, Random_Sample, Stratify_Column, Stratify_Row.]
Display 2: Samples of size 10 by various designs
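The comparison in Display 2 can be approximated with a short simulation. The sketch below is not the code behind the slide; it assumes that "stratify" means one randomly chosen house per street (row) or avenue (column), that "cluster" means one whole row or column chosen at random, and that every design yields 10 houses. The function names, the number of repetitions, and the seed are illustrative choices.

```python
import random
from statistics import mean, pstdev

# PSI grid from Display 1: psi[street][avenue] (streets are rows, avenues are columns).
psi = [
    [121, 118, 124, 123, 116, 118, 120, 118, 114, 122],
    [116, 118, 118, 113, 117, 116, 117, 112, 112, 115],
    [114, 107, 109, 106, 112, 108, 112, 110, 111, 111],
    [105, 104, 103, 101, 103, 105, 104, 106, 109, 107],
    [100, 100, 101, 96, 98, 96, 100, 100, 105, 100],
    [97, 95, 96, 94, 96, 95, 96, 97, 96, 97],
    [92, 90, 91, 89, 93, 94, 93, 92, 92, 90],
    [86, 81, 85, 87, 85, 85, 86, 87, 83, 84],
    [80, 78, 80, 79, 77, 81, 81, 79, 84, 81],
    [76, 77, 74, 77, 75, 74, 80, 75, 77, 74],
]
ROWS, COLS = 10, 10

def random_sample(rng):
    """Simple random sample of 10 houses from all 100."""
    houses = [psi[r][c] for r in range(ROWS) for c in range(COLS)]
    return mean(rng.sample(houses, 10))

def stratify_row(rng):
    """One randomly chosen house from each street (assumed design)."""
    return mean([rng.choice(row) for row in psi])

def stratify_column(rng):
    """One randomly chosen house from each avenue (assumed design)."""
    return mean([psi[rng.randrange(ROWS)][c] for c in range(COLS)])

def cluster_row(rng):
    """All 10 houses on one randomly chosen street."""
    return mean(rng.choice(psi))

def cluster_column(rng):
    """All 10 houses on one randomly chosen avenue."""
    c = rng.randrange(COLS)
    return mean([psi[r][c] for r in range(ROWS)])

def simulate(design, reps=2000, seed=1):
    """Repeat one design many times and return the list of sample means."""
    rng = random.Random(seed)
    return [design(rng) for _ in range(reps)]

designs = {
    "Cluster_Column": cluster_column,
    "Cluster_Row": cluster_row,
    "Random_Sample": random_sample,
    "Stratify_Column": stratify_column,
    "Stratify_Row": stratify_row,
}
spread = {name: pstdev(simulate(design)) for name, design in designs.items()}
for name, sd in spread.items():
    print(f"{name:16s} sd of sample mean = {sd:6.2f}")
```

Because PSI falls steadily from Street 1 to Street 10 but changes little across avenues, rows are homogeneous and columns are not. Under these assumptions, stratifying by row and clustering by column should give the tightest sample means, and clustering by row the widest.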
SIMPLE RANDOM SAMPLING
The observations $y_1, y_2, \ldots, y_n$ are to be sampled from a population with mean $\mu$, standard deviation $\sigma$, and size $N$ in such a way that every possible sample of size $n$ has an equal chance of being selected. If the sample mean is denoted by $\bar{y}$, it follows that

$$E(\bar{y}) = \mu$$

and

$$V(\bar{y}) = \frac{\sigma^2}{n}\left(\frac{N-n}{N-1}\right)$$
For the sample variance $s^2$, it can be shown that

$$E(s^2) = \frac{N}{N-1}\,\sigma^2 = S^2$$

and, thus, that an unbiased estimator of the variance of the sample mean is given by

$$\hat{V}(\bar{y}) = \frac{s^2}{n}\left(\frac{N-n}{N}\right) = \frac{s^2}{n}\,(1-f)$$

where $f = n/N$ is the sampling fraction.
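These identities can be checked by brute force on a tiny population, since every possible sample can be listed. The sketch below uses a hypothetical population of six values (chosen only for illustration) and enumerates all samples of size 3; `pvariance` uses divisor N (giving σ²) and `variance` uses divisor N − 1 (giving S²).

```python
from itertools import combinations
from statistics import mean, pvariance, variance

# Hypothetical small population (N = 6), used only to illustrate the identities.
population = [121, 116, 114, 105, 100, 97]
N, n = len(population), 3

# Under simple random sampling, each of the C(N, n) samples is equally likely.
samples = list(combinations(population, n))
sample_means = [mean(s) for s in samples]

mu = mean(population)
sigma2 = pvariance(population)   # sigma^2, divisor N
S2 = variance(population)        # S^2, divisor N - 1

expected_ybar = mean(sample_means)                # E(ybar) over the design
var_ybar = pvariance(sample_means)                # V(ybar) over the design
theory_var = (sigma2 / n) * (N - n) / (N - 1)     # formula for V(ybar)

expected_s2 = mean([variance(s) for s in samples])                        # E(s^2)
expected_vhat = mean([variance(s) / n * (1 - n / N) for s in samples])    # E(Vhat)

print(expected_ybar, mu)
print(var_ybar, theory_var, expected_vhat)
print(expected_s2, S2)
```

Each printed line should show values that agree up to floating-point rounding: the first confirms unbiasedness of the sample mean, the second that the estimator's expectation equals the true variance of the sample mean, and the third that the expected sample variance equals S².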
STRATIFIED RANDOM SAMPLING
Stratified sampling designs:
• Convenience (administration efficiency)
• Estimates desired for each of the strata
• Reduced variation (statistical efficiency)
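Let $\bar{y}_i$ denote the sample mean for the simple random sample selected from stratum $i$, $n_i$ the sample size for stratum $i$, $\mu_i$ the population mean for stratum $i$, and $N_i$ the size of stratum $i$.

$$\bar{y}_{st} = \frac{1}{N}\sum_{i=1}^{L} N_i\,\bar{y}_i = \sum_{i=1}^{L} W_i\,\bar{y}_i$$

$$\hat{V}(\bar{y}_{st}) = \frac{1}{N^2}\sum_{i=1}^{L} N_i^2\left(\frac{N_i-n_i}{N_i}\right)\left(\frac{s_i^2}{n_i}\right)$$

Here $L$ is the number of strata, $N = \sum_i N_i$, $W_i = N_i/N$, and $s_i^2$ is the sample variance within stratum $i$.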
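As a minimal sketch, the two stratified formulas translate directly into code. The function name `stratified_estimate` and the sample values are hypothetical; the stratum sizes 45 and 15 are borrowed from the GPA example later in these slides.

```python
from statistics import mean, variance

def stratified_estimate(strata):
    """Return (y_st, V_hat) for a stratified random sample.

    strata: list of (N_i, sample_i) pairs, where N_i is the stratum size
    and sample_i is the list of observations drawn from that stratum.
    """
    N = sum(N_i for N_i, _ in strata)
    # Weighted mean of the stratum sample means, with weights N_i / N.
    y_st = sum(N_i * mean(s) for N_i, s in strata) / N
    # Each stratum contributes N_i^2 * (1 - n_i/N_i) * s_i^2 / n_i.
    v_hat = sum(
        N_i ** 2 * ((N_i - len(s)) / N_i) * (variance(s) / len(s))
        for N_i, s in strata
    ) / N ** 2
    return y_st, v_hat

# Hypothetical samples from two strata of sizes 45 and 15.
y_st, v_hat = stratified_estimate([(45, [3.0, 3.5, 4.0]), (15, [2.0, 3.0])])
print(y_st, v_hat)
```

Each stratum needs at least two observations, because the sample variance is undefined for a single value.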
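Allocation
From a total sample size of $n$, how many observations should be allocated to stratum $i$? The allocation across strata may be chosen to minimize the variance of the estimator for a fixed total sample size (optimal allocation):

$$n_i = n\,\frac{N_i\,\sigma_i}{\sum_j N_j\,\sigma_j}$$

where $\sigma_i$ is the standard deviation within stratum $i$.

Proportional Allocation

$$n_i = n\,\frac{N_i}{\sum_j N_j} = \frac{n\,N_i}{N}$$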
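To make the two allocation rules concrete, the sketch below applies them to the gender strata reported in the GPA example later in these slides (45 and 15 students, standard deviations 0.436 and 0.698) with a total sample of 20. The function names are illustrative, and the results are left unrounded; in practice the allocations would be rounded to whole numbers.

```python
def proportional_allocation(n, sizes):
    """n_i = n * N_i / N."""
    total = sum(sizes)
    return [n * N_i / total for N_i in sizes]

def optimal_allocation(n, sizes, sds):
    """n_i = n * N_i * sigma_i / sum_j(N_j * sigma_j)."""
    weights = [N_i * sd_i for N_i, sd_i in zip(sizes, sds)]
    total = sum(weights)
    return [n * w / total for w in weights]

sizes = [45, 15]        # stratum sizes from the GPA example (F, M)
sds = [0.436, 0.698]    # stratum standard deviations from the GPA example
n = 20

prop = proportional_allocation(n, sizes)
opt = optimal_allocation(n, sizes, sds)
print("proportional:", prop)
print("optimal:     ", [round(x, 2) for x in opt])
```

Proportional allocation reproduces the n1 = 15, n2 = 5 split used in the stratified GPA figure. Optimal allocation moves roughly two observations to the smaller but more variable stratum.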
Comparison to SRS
The following comparisons apply for situations in which the $N_i$ are all relatively large. (Here $W_i = N_i/N$.)

$$V_{ran} - V_{prop} = \frac{1-f}{n}\sum_i W_i\,(\mu_i - \mu)^2$$

and

$$V_{prop} - V_{opt} = \frac{1}{n}\sum_i W_i\,(S_i - \bar{S})^2, \qquad \text{where } \bar{S} = \sum_i W_i\,S_i$$
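As a rough numerical illustration, the two differences can be evaluated for the gender strata of the GPA example later in these slides. This is only a sketch: it treats the reported stratum standard deviations as the $S_i$, and the strata (45 and 15 students) are not large, so the values are approximations at best.

```python
# Stratum figures taken from the GPA example (F, M); n = 20 from N = 60.
W = [45 / 60, 15 / 60]
mu_i = [3.39, 2.92]
S_i = [0.436, 0.698]   # assumption: reported SDs stand in for S_i
n, N = 20, 60
f = n / N

# Overall mean as the weighted average of the stratum means.
mu = sum(w * m for w, m in zip(W, mu_i))

# V_ran - V_prop: gain from proportional stratification over SRS.
gain_prop = (1 - f) / n * sum(w * (m - mu) ** 2 for w, m in zip(W, mu_i))

# V_prop - V_opt: further gain from optimal allocation.
S_bar = sum(w * s for w, s in zip(W, S_i))
gain_opt = (1 / n) * sum(w * (s - S_bar) ** 2 for w, s in zip(W, S_i))

print(f"mu = {mu:.4f}")
print(f"V_ran - V_prop  = {gain_prop:.6f}")
print(f"V_prop - V_opt = {gain_opt:.6f}")
```

Both differences are sums of squares, so neither can be negative. The first is driven by how far apart the stratum means are, the second by how unequal the stratum standard deviations are.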
CLUSTER SAMPLING
Commonly used when:
• Frame for elements is relatively difficult to obtain
• Frame for clusters of elements is relatively easy to obtain

Examples:
• Classrooms versus students
• City blocks versus residents
• Cartons of items stored in a warehouse versus individual items
Examples of Clusters
• Polls (home or workplace)
• Crop yield surveys (trees, corn, sugar cane)
• Animal studies (traps, colonies)
• Systematic sample
Single-stage cluster sample:
Select a simple random sample of clusters and then measure each element within the sampled clusters.
• $N$ = number of clusters in the population
• $n$ = number of clusters selected in a simple random sample
• $m_i$ = number of elements in cluster $i$, $i = 1, \ldots, N$
• $\bar{m} = \frac{1}{n}\sum_{i=1}^{n} m_i$ = average cluster size for the sample
• $y_i$ = total of all observations in the $i$th cluster

$$\bar{y} = \frac{\sum_{i=1}^{n} y_i}{\sum_{i=1}^{n} m_i} = \frac{\bar{y}_t}{\bar{m}}$$

where $\bar{y}_t = \frac{1}{n}\sum_{i=1}^{n} y_i$ is the average cluster total in the sample.
$$\hat{V}(\bar{y}) = \left(\frac{N-n}{N}\right)\frac{1}{n\,\bar{m}^2}\,s_r^2$$

where

$$s_r^2 = \frac{\sum_{i=1}^{n}\left(y_i - \bar{y}\,m_i\right)^2}{n-1}$$
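A minimal sketch of the single-stage estimator and its variance estimate follows. The function name `cluster_ratio_estimate` and the data are hypothetical; the numbers could be read as, say, renters counted in three sampled blocks out of ten.

```python
def cluster_ratio_estimate(totals, sizes, N):
    """Return (y_bar, V_hat) for a single-stage cluster sample.

    totals: cluster totals y_i for the n sampled clusters
    sizes:  cluster sizes m_i for the same clusters
    N:      number of clusters in the population
    """
    n = len(totals)
    y_bar = sum(totals) / sum(sizes)   # ratio of sums
    m_bar = sum(sizes) / n             # average sampled cluster size
    # Residuals of the cluster totals about the fitted line y = y_bar * m.
    s_r2 = sum((y - y_bar * m) ** 2 for y, m in zip(totals, sizes)) / (n - 1)
    v_hat = ((N - n) / N) * s_r2 / (n * m_bar ** 2)
    return y_bar, v_hat

# Hypothetical data: 3 clusters sampled from N = 10.
y_bar, v_hat = cluster_ratio_estimate(totals=[1, 3, 2], sizes=[2, 4, 6], N=10)
print(y_bar, v_hat)
```

The estimator is a ratio of sums rather than an average of cluster means, so larger clusters carry more weight. The variance term measures how far the cluster totals fall from a line through the origin with slope equal to the estimate.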
[Scatter plot: Rented Residences. Renters_y (axis 0 to 8) versus Residences_m (axis 0 to 14), with fitted line Renters_y = 0.488 Residences_m.]
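Two-stage cluster sampling with equal cluster sizes

$$\hat{\mu} = \left(\frac{N}{M}\right)\frac{\sum_{i=1}^{n} M_i\,\bar{y}_i}{n} = \frac{1}{\bar{M}}\,\frac{\sum_{i=1}^{n} M_i\,\bar{y}_i}{n}$$

Here $M_i$ is the number of elements in cluster $i$, $M$ is the total number of elements in the population, $\bar{M} = M/N$ is the average cluster size, and $\bar{y}_i$ is the mean of the elements sampled from cluster $i$. With equal cluster sizes, $M_i = \bar{M}$ for every cluster, so $\hat{\mu}$ reduces to the average of the $n$ sampled cluster means.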
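Two-stage Cluster Sampling with Equal Cluster Sizes

$$\hat{V}(\hat{\mu}) = (1-f_1)\,\frac{MSB}{nm} + (1-f_2)\left(\frac{1}{N}\right)\frac{MSW}{m}$$

where $f_1 = n/N$, $f_2 = m/\bar{M}$ (with $\bar{M}$ the common cluster size),

$$MSB = \frac{m}{n-1}\sum_{i=1}^{n}\left(\bar{y}_i - \hat{\mu}\right)^2$$

and

$$MSW = \frac{1}{n(m-1)}\sum_{i=1}^{n}\sum_{j=1}^{m}\left(y_{ij} - \bar{y}_i\right)^2 = \frac{1}{n}\sum_{i=1}^{n} s_i^2$$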
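The variance estimator above can be sketched in a few lines. The function name `two_stage_estimate`, its argument names, and the data are hypothetical; the code assumes equal cluster sizes and the same subsample size m in every sampled cluster.

```python
from statistics import mean, variance

def two_stage_estimate(samples, N, M_bar):
    """Return (mu_hat, MSB, MSW, V_hat) for two-stage cluster sampling.

    samples: n lists, each holding the m observations subsampled from one cluster
    N:       number of clusters in the population
    M_bar:   common cluster size
    """
    n = len(samples)
    m = len(samples[0])
    if any(len(s) != m for s in samples):
        raise ValueError("every sampled cluster must contribute m observations")
    cluster_means = [mean(s) for s in samples]
    mu_hat = mean(cluster_means)            # equal sizes: average of cluster means
    f1, f2 = n / N, m / M_bar
    msb = m / (n - 1) * sum((yb - mu_hat) ** 2 for yb in cluster_means)
    msw = mean([variance(s) for s in samples])   # average within-cluster variance
    v_hat = (1 - f1) * msb / (n * m) + (1 - f2) * (1 / N) * msw / m
    return mu_hat, msb, msw, v_hat

# Hypothetical data: n = 2 of N = 4 clusters, m = 2 of M_bar = 4 elements each.
mu_hat, msb, msw, v_hat = two_stage_estimate([[1, 3], [5, 7]], N=4, M_bar=4)
print(mu_hat, msb, msw, v_hat)
```

In this toy example nearly all of the estimated variance comes from the between-cluster term, which is typical when cluster means differ much more than the elements within a cluster do.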
1. If $N$ is large, $\hat{V}(\hat{\mu}) \approx MSB/(nm)$, which depends only on the cluster means.
2. If $m = \bar{M}$ (or $f_2 = 1$), then two-stage cluster sampling reduces to one-stage cluster sampling.
3. If $n = N$, then two-stage cluster sampling becomes stratified random sampling with $N$ strata and $m$ observations from each.
FIGURE 1: Distribution of class GPAs
[Plot of GPA, axis 2.00 to 4.00] N = 60, Mean 3.27, Standard Deviation 0.55
FIGURE 2: Distribution of sample means from simple random sampling
[Plot of SRS 20 sample means, horizontal axis 3.00 to 3.40, counts up to 20] Mean 3.27, Standard Deviation 0.105
FIGURE 3: GPAs by gender
[Plot of GPA (axis 2.0 to 4.0) by Gender (1, 2)]

| Gender | Count | Mean | Standard Deviation |
|---|---|---|---|
| F | 45 | 3.39 | 0.436 |
| M | 15 | 2.92 | 0.698 |
FIGURE 4: Distribution of sample means from stratified random sampling
[Plot of StRS 20 sample means, horizontal axis 3.00 to 3.40, counts up to 20] Mean 3.27, Standard Deviation 0.091 (n = 20, n1 = 15, n2 = 5)
FIGURE 5: Distribution of sample means from cluster sampling; ordered clusters
[Plot of MeanOrd, horizontal axis 2.5 to 3.5, counts up to 25] Mean 3.30, Standard Deviation 0.232 (n = 4, m = 5)
FIGURE 6: Distribution of sample means from cluster sampling; random clusters
[Plot of MeanRan, horizontal axis 2.5 to 3.5, counts up to 40] Mean 3.30, Standard Deviation 0.089
FIGURE 7: Distribution of sample means from cluster sampling; systematic clusters
[Plot of Mean, horizontal axis 2.50 to 3.25, counts up to 40] Mean 3.27, Standard Deviation 0.055
Summary Chart

| Design | Mean | Standard Deviation |
|---|---|---|
| Population | 3.27 | 0.550 |
| Simple random | 3.27 | 0.105 |
| Stratified (gender) | 3.27 | 0.091 |
| Cluster-ordered | 3.30 | 0.232 |
| Cluster-random | 3.30 | 0.089 |
| Cluster-systematic | 3.27 | 0.055 |
Names to Explore
• P. C. Mahalanobis
  – "The I.S.I. has taken the lead in the original development of the technique of sample surveys, the most potent fact finding process available to the administration." (R. A. Fisher)
• Walter Shewhart
• Jerzy Neyman
• William Cochran
• Edwards Deming
• Warren Mitofsky
EXTRAS!
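Equal cluster sizes: comparison to SRS

$$MSB = \frac{SSB}{n-1} = \frac{m}{n-1}\sum_{i=1}^{n}\left(\bar{y}_i - \bar{y}_c\right)^2$$

$$MSW = \frac{SSW}{n(m-1)} = \frac{1}{n(m-1)}\sum_{i=1}^{n}\sum_{j=1}^{m}\left(y_{ij} - \bar{y}_i\right)^2$$

$$\hat{V}(\bar{y}_c) = \left(\frac{N-n}{N}\right)\frac{1}{nm}\,MSB$$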
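For a random sample of size $mn$,

$$\hat{V}(\bar{y}) = \left(\frac{Nm-nm}{Nm}\right)\frac{s^2}{nm} = \left(\frac{N-n}{N}\right)\frac{s^2}{nm}$$

$$\hat{s}^2 \approx \frac{1}{m}\left[(m-1)\,MSW + MSB\right]$$

$$\widehat{RE}(\bar{y}_c/\bar{y}) = \frac{\hat{s}^2}{MSB}$$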
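A short sketch shows how the relative efficiency could be estimated from a single cluster sample with equal cluster sizes. The function name `cluster_vs_srs` and the data are hypothetical; the three clusters are deliberately homogeneous inside and different from one another.

```python
from statistics import mean

def cluster_vs_srs(clusters):
    """Return (MSB, MSW, s2_hat, RE) for equal-sized, fully measured clusters."""
    n = len(clusters)
    m = len(clusters[0])
    if any(len(c) != m for c in clusters):
        raise ValueError("all clusters must have the same size m")
    cluster_means = [mean(c) for c in clusters]
    y_c = mean(cluster_means)
    msb = m / (n - 1) * sum((yb - y_c) ** 2 for yb in cluster_means)
    ssw = sum((y - yb) ** 2 for c, yb in zip(clusters, cluster_means) for y in c)
    msw = ssw / (n * (m - 1))
    s2_hat = ((m - 1) * msw + msb) / m   # approximate element-level variance
    return msb, msw, s2_hat, s2_hat / msb

# Hypothetical data: 3 clusters of 3 elements each.
msb, msw, s2_hat, re = cluster_vs_srs([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(msb, msw, s2_hat, re)
```

Here the ratio is well below 1, meaning the cluster sample is estimated to be less precise than a simple random sample of the same number of elements. The ratio exceeds 1 only when MSW is larger than MSB, that is, when elements within a cluster vary more than the cluster means do.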