NATIONAL HEALTH AND NUTRITION EXAMINATION SURVEY III

WEIGHTING AND ESTIMATION METHODOLOGY

EXECUTIVE SUMMARY

Prepared for:

National Center for Health Statistics
Hyattsville, Maryland

Prepared by:

Westat, Inc.
1650 Research Boulevard
Rockville, Maryland 20850

Leyla Mohadjer
Jill Montaquila
Joseph Waksberg
Bridgett Bell
Penny James
Ismael Flores-Cervantes
Maida Montes

February 1996

TABLE OF CONTENTS

Section

1. INTRODUCTION
   1.1 Sample Design
   1.2 Comparisons Between NHANES I, NHANES II, Hispanic HANES, and NHANES III
   1.3 Goals of Weighting

2. GENERAL OVERVIEW OF THE WEIGHTING METHODOLOGY
   2.1 Computing Basic Weights
   2.2 Adjusting for Nonresponse and Poststratification

3. WEIGHTING FOR FULL SAMPLE AND SUBSAMPLES FOR PHASE 1 AND PHASE 2
   3.1 Weighting for Full Sample
   3.2 Weighting for Subsamples

4. WEIGHTING FOR PHASES 1 AND 2 COMBINED

5. VARIANCE ESTIMATION

REFERENCES

LIST OF TABLES

Table

1. Selected Sample Design Parameters for the Health and Nutrition Examination Surveys

2. March 1990 Undercount Adjusted CPS Totals

3. March 1993 Undercount Adjusted CPS Totals

4. Number of Respondents by Age-Sex-Race/Ethnicity Subdomain for Phases 1 and 2 Combined

5. Number of Respondents by Age-Sex-Race/Ethnicity Subdomain for Phases 1 and 2

6. Fifth and 95th Percentiles, and the Mean Values of the Weights By Sampling Domain

7. Appropriate Uses of the Weights

LIST OF EXHIBITS

Exhibit

1. NHANES III Weighting Flow Chart Phase 1, Phase 2 (Separately)

2. NHANES III Weighting Flow Chart, Phase 1 & 2 Replicate Weights

EXECUTIVE SUMMARY

1. INTRODUCTION

1.1 Sample Design

The NHANES III sample represents the total civilian noninstitutionalized

population, 2 months of age or older, in the 50 states of the United States. A four-stage

sample design was used: (1) Primary Sampling Units (PSUs) comprising mostly single

counties, (2) area segments within PSUs, (3) households within area segments, and (4)

persons within households.

The PSUs in the first stage were mostly individual counties; in a few cases,

adjacent counties were combined to keep PSUs above a certain minimum size. There were 81

PSUs in the sample, selected with probability proportionate to measures of size (pps) and

without replacement. The measure of size reflected the desire to oversample the minority

groups in NHANES III. Thirteen large counties were chosen with certainty. The 13 certainty

counties were divided into 21 survey locations for logistical and operational reasons. The

data collection was carried out between October 1988 and September 1994. In order to

permit separate analyses for two 3-year periods (referred to as Phase 1 and Phase 2), as well

as for the entire field period, the sample of PSUs was randomly allocated to the two 3-year

periods. One set was allocated to the first 3-year time period during which NHANES III was

conducted (Phase 1, 1988-91), and the other set to the second 3-year period (Phase 2, 1991-

94). The allocation of PSUs to the two phases was made in a way that retained as much of

the original stratification as possible.

For most of the sample in Phase 1, the second stage was area segments

comprising city or suburban blocks, combinations of blocks, or other area segments in places

where block statistics were not produced in the 1980 census. The area segments were used

only for a sample of persons who lived in housing units built before 1980. For units built in

1980 and later, the second stage consisted of sets of addresses selected from building permits

issued in 1980 or later (these are referred to as new construction segments). In Phase 2 of the

survey, the 1990 census data were used for the selection of the second stage units, with no

new construction sampling. In both phases, the area segments were stratified by percent

Mexican-American prior to sample selection.

The third stage consisted of households and group quarters. All households and

group quarters in the sample segments were listed, and a subsample of households and group

quarters was designated for screening in order to identify potential respondents. The

subsampling rates were set to produce a national, approximately equal probability sample of

households in most of the U.S., with higher rates for the geographic strata with high minority

concentrations.

Persons within the households or group quarters were the fourth stage of sample

selection. (The persons selected for the sample are frequently referred to as SPs, Sampled

Persons, throughout the report.) The screened households were grouped into a number of

classes, depending on the age-sex-race/ethnicity of their members. The classes were

subsampled at different rates, and, within each class, members of particular

age-sex-race/ethnicity subdomains were identified as potential SPs; other members of the households

were excluded from the sample. For more detail on the NHANES III sample design refer to

Sample Design: Third National Health and Nutrition Examination Survey, National Center

for Health Statistics, Vital and Health Statistics, Series 2, Number 113, September 1992.

A summary of the sample sizes for the full 6-year NHANES III sample at each

stage of selection follows:

Number of separate areas (PSUs) in sample                      81
Number of survey locations                                     89
Number of segments                                          2,144
Number of households screened                              93,653
Number of households with SPs designated for interview     19,528
Number of designated SPs                                   39,695
Number of interviewed SPs                                  33,994
Number of MEC examined SPs                                 30,818
Number of home examined SPs                                   493

1.2 Comparisons Between NHANES I, NHANES II, Hispanic HANES, and

NHANES III

It should be noted that due to differences in the sample sizes and designs for the

three cycles of NHANES, estimates will differ in reliability across surveys. NHANES is one

of the major programs in the series of health-related studies conducted by the National Center

for Health Statistics (NCHS) over the past 30 years. This system of surveys has included

NHANES I, NHANES II, Hispanic HANES (HHANES), NHANES III, and NHANES I

Epidemiologic Follow-up Surveys. Although the three cycles have similar analytic

objectives, there are differences in their sample designs. A comparison of the sample design

parameters for the four Health and Nutrition Examination Surveys is given in Table 1.

Table 1. Selected sample design parameters for the Health and Nutrition Examination Surveys

Age of civilian noninstitutionalized target population
    NHANES I:        1-74 years
    NHANES II:       6 months-74 years
    Hispanic HANES:  6 months-74 years
    NHANES III:      2 months and over

Geographical areas
    NHANES I:        United States (excluding Alaska and Hawaii)
    NHANES II:       United States (including Alaska and Hawaii)
    Hispanic HANES:  Southwest for Mexican-American persons; NY, NJ, CT for Puerto Rican persons; Dade County, FL, for Cuban persons
    NHANES III:      United States (including Alaska and Hawaii)

Average number of sample persons per household
    NHANES I:        1
    NHANES II:       1
    Hispanic HANES:  2-3
    NHANES III:      2-3

Number of survey locations
    NHANES I:        100
    NHANES II:       64
    Hispanic HANES:  17 in Southwest; 9 in NY, NJ, CT; 4 in Dade County
    NHANES III:      89

Domains for oversampling
    NHANES I:        Low income: children aged 1-5 years; women aged 20-44 years; persons aged 65 years and over
    NHANES II:       Low income: children aged 6 months-5 years; persons aged 60-74 years
    Hispanic HANES:  Dade County: 6 months-19 years and 45-74 years; Southwest and NY, NJ, and CT: persons aged 6 months-19 years and 45-74 years
    NHANES III:      52 subdomains were predesignated consisting of age-sex groups for black, Mexican-American, and other persons. Target sample sizes were established for the subdomains.

Sample size
    NHANES I:        28,043
    NHANES II:       27,801
    Hispanic HANES:  15,931
    NHANES III:      39,695

Examined sample size
    NHANES I:        20,749
    NHANES II:       20,322
    Hispanic HANES:  11,672
    NHANES III:      30,818

Years covered
    NHANES I:        1971-1974
    NHANES II:       1976-1980
    Hispanic HANES:  1982-1984
    NHANES III:      1988-1994

The differences in the sample sizes and designs for the three cycles of NHANES

and for HHANES should be considered when comparisons are made across various HANES

surveys. For example, it should be noted that NHANES III is the only survey that includes

persons 75 years or older, and that NHANES I and NHANES II did not include any

oversampling of Hispanics.

1.3 Goals of Weighting

The purpose of weighting the sample data is to permit analysts to produce

estimates of statistics that would have been obtained if the entire sampling frame had been

surveyed. Sample weights can be considered as measures of the number of persons the

particular sample observation represents. Weighting takes into account several features of the

survey: the specific probabilities of selection for the individual domains that were

oversampled, as well as nonresponse and differences between the sample and the total

population. Differences between the sample and the population may arise due to sampling

variability, differential undercoverage in the survey among demographic groups, and possibly

other types of response errors, such as differential response rates or misclassification errors.
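The interpretation of a sample weight as the number of persons an observation represents can be illustrated with a short Python sketch. The records below are hypothetical and are not NHANES III data, and the function names are illustrative only; the sketch shows the usual weighted total and weighted mean, not the specific estimation software used for the survey.

```python
# Hypothetical records: (sample weight, indicator of a health condition).
# These values are made up for illustration; they are not NHANES III data.
records = [(1200.0, 1), (800.0, 0), (3000.0, 1), (500.0, 0)]


def weighted_total(records):
    """Estimated population total: each value counts once per person represented."""
    return sum(w * y for w, y in records)


def weighted_mean(records):
    """Weighted mean (a proportion when the values are 0/1 indicators)."""
    return weighted_total(records) / sum(w for w, _ in records)


population_represented = sum(w for w, _ in records)  # persons represented by the sample
estimated_cases = weighted_total(records)            # estimated persons with the condition
estimated_prevalence = weighted_mean(records)        # estimated share of the population

print(population_represented, estimated_cases, round(estimated_prevalence, 4))
```

In this sketch four sample persons stand in for 5,500 persons in the population, so an unweighted proportion (2 of 4) would differ from the weighted one.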

Sample weighting in NHANES III was used to accomplish the following

objectives:

1. To compensate for differential probabilities of selection among subgroups (age-sex-race/ethnicity subdomains; persons living in different geographic strata sampled at different rates);

2. To reduce biases arising from the fact that nonrespondents may be different from those who participate;

3. To bring sample data up to the dimensions of the target population totals;

4. To compensate, to the extent possible, for inadequacies in the sampling frame (resulting from omissions of some housing units in the listing of area segments, omissions of persons with no fixed address, etc.); and

5. To reduce variances in the estimation procedure by using auxiliary information that is known with a high degree of accuracy.

The sample weighting was carried out in three stages. The first stage involved

the computation of weights to compensate for unequal probabilities of selection (Objective 1

above). The second stage adjusted for nonresponse (Objective 2). The third stage used

poststratification of the sample weights to Census Bureau estimates of the U.S. population to

simultaneously accomplish the third, fourth, and fifth objectives.

It should be noted that due to the form of estimators typically used with data

from complex samples, extreme variability in the weights may result in reduced reliability of

the estimates. The NHANES III sample was designed to minimize the variability in the

weights, subject to operational and analytic constraints. Additionally, measures such as

weight trimming have been used to reduce the variability in the weights for NHANES III.

However, the analyst should bear in mind the fact that extreme observations in conjunction

with large weights may result in extremely influential observations, i.e., observations that

dominate the analysis.
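One common way to gauge how much variability in the weights may cost in precision is Kish's approximate design effect due to unequal weighting, n Σw² / (Σw)². This measure is not described in this report; the Python sketch below is offered only as an illustration, and the weights in it are hypothetical.

```python
def weighting_design_effect(weights):
    """Kish's approximation: n * sum(w^2) / (sum(w))^2; equals 1.0 for equal weights."""
    n = len(weights)
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return n * total_sq / (total * total)


def effective_sample_size(weights):
    """Nominal sample size deflated by the weighting design effect."""
    return len(weights) / weighting_design_effect(weights)


equal = [1.0, 1.0, 1.0, 1.0]   # hypothetical self-weighting sample
skewed = [1.0, 1.0, 1.0, 9.0]  # hypothetical sample with one extreme weight

deff_equal = weighting_design_effect(equal)
deff_skewed = weighting_design_effect(skewed)
print(deff_equal, round(deff_skewed, 3), round(effective_sample_size(skewed), 3))
```

Under this approximation, a single extreme weight among four observations more than doubles the variance relative to equal weights, which is the kind of effect weight trimming is intended to limit.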

2. GENERAL OVERVIEW OF THE WEIGHTING METHODOLOGY

2.1 Computing Basic Weights

The first-stage (or basic) weight for each SP was calculated as the reciprocal of

the SP’s probability of selection, with adjustments for other variabilities in sampling rates

such as changes made to the sampling rates at the time of data collection. The probability of

selection of a person in NHANES III depended on three factors: (1) the person’s

age-sex-race/ethnicity domain; (2) the density stratum; and (3) the PSU. The following provides a

brief description of each of the three components.

Older persons, children, Mexican-Americans, and black persons were

oversampled to ensure a prespecified minimum sample size for each analytic domain so that

estimates of the health and nutrition status of persons in each domain could be made with

acceptable precision. The oversampling in NHANES III was part of a pattern established in

the sample design. The population was decomposed into 52 subdomains: 7 age groups by

sex for black and Mexican-American persons and 12 age groups by sex for white persons and

other racial groups combined. After defining age-sex-race/ethnicity subdomains, variable

sampling rates were derived to ensure the achievement of sample sizes sufficient to permit

analyses of the data for each subdomain.

The density strata were established by dividing the census blocks (or

enumeration districts) in each sampled PSU into six classes with each class having a different

level of concentration of Mexican-American persons. Blocks with high concentrations of

Mexican-American persons were oversampled to increase the sample yield for this group.

The third component, the PSU factor, was introduced to adjust the basic weights

to reflect the effect of the relatively fixed sample size within each PSU in NHANES III on the

sample weights. The reason for the relatively fixed sample size by PSU was to have a

manageable and efficient field procedure. However, the use of nearly a fixed number of

examinations per PSU implied that NHANES III would not consist of exact self-weighting

samples.
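The reciprocal-probability construction of the basic weight can be sketched in Python as follows. The sketch assumes, for illustration only, that the overall selection probability is the product of conditional selection probabilities at each of the four stages of the design; the probabilities shown are hypothetical, and the function name is not taken from the NHANES III documentation.

```python
from math import prod


def base_weight(stage_probabilities):
    """Reciprocal of the overall selection probability (product over stages)."""
    p = prod(stage_probabilities)
    if not 0.0 < p <= 1.0:
        raise ValueError("overall selection probability must be in (0, 1]")
    return 1.0 / p


# Hypothetical conditional probabilities: PSU, segment, household, person.
typical_sp = base_weight([0.05, 0.1, 0.2, 0.5])

# Same path, but in a hypothetical domain sampled at twice the person-level rate.
oversampled_sp = base_weight([0.05, 0.1, 0.2, 1.0])

print(round(typical_sp), round(oversampled_sp))
```

Doubling the sampling rate for a domain halves the basic weight of its members, which is how the weights compensate for oversampling.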

2.2 Adjusting for Nonresponse and Poststratification

If every selected household had agreed to complete the screener and every

selected person had agreed to complete the interview and the medical examination, weighted

estimates based on the data would be close to unbiased estimates of statistics for the total U.S.

population. However, nonresponse occurs in any survey operation, and thus, nonresponse

bias may result. The best approach to minimizing nonresponse bias is to plan and implement

field procedures that maintain high cooperation rates. For NHANES III, the payment of cash

incentives and repeated callbacks for refusal conversion were very effective in reducing

nonresponse, and thus, nonresponse bias. Because some nonresponse occurs even with the

best strategies, adjustments are always necessary to minimize potential nonresponse bias.

All persons selected in the sample were asked to participate in a personal

interview at their home, where medical history and socio-demographic information were

collected. After the initial interview, all interviewed persons were invited to the MEC for

physical examination. Persons who were unable to come to the MEC were offered an

abbreviated physical examination at their home.

Therefore, nonresponse in NHANES III occurred at several stages of the data

collection process. Some of the sample persons who were screened (100 percent of the selected

sample was screened, including about 6.7 percent for which neighbors provided the

information) refused to be interviewed (interview nonresponse). Some of the interviewed

SPs refused the medical examination (exam nonresponse). The overall interview and exam

nonresponse rates were 14 percent and 9 percent, respectively. The adjustment procedures

used for unit nonresponse were slightly different from those Ezzati et al. (1991, 1992) used

for creating preliminary weights for Phase 1 of the survey. A two-stage procedure for

nonresponse adjustment and poststratification to known population totals was carried out to

adjust for unit nonresponse in NHANES III. Exploratory research and analysis were carried

out to identify variables to be used for nonresponse adjustment. A clustering methodology

was used to identify potential variables and their subclasses for use in nonresponse

adjustment. The SI-CHAID (Statistical Innovation's Chi-Square Automatic Interaction

Detection) software was used to examine the relationship between response and various

independent predictor variables (see Kass 1980, Lee 1989). SI-CHAID forms adjustment

classes that maximize the variation in response rates. The outcome has a tree-shaped

structure that identifies, based on chi-square values, the predictor variables that are highly

related to the dependent variable (response status). Separate weighting class nonresponse

adjustments were carried out for groups of sample individuals, defined by the following set of

characteristics for the interviewed sample: (1) race/ethnicity, (2) age, and (3) household size.

For the examined sample, the nonresponse classes were defined by: (1) race/ethnicity, (2)

age, (3) household size, and (4) self-reported health status.
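A weighting class nonresponse adjustment of the general kind described above can be sketched in Python. The sketch assumes the standard form of the adjustment, in which the weights of respondents in a class are inflated by the ratio of the weighted count of all eligible persons to the weighted count of respondents in that class; the class labels, field names, and weights are hypothetical.

```python
from collections import defaultdict


def nonresponse_adjust(persons):
    """Inflate respondent weights within each class; nonrespondents get zero."""
    eligible = defaultdict(float)
    responding = defaultdict(float)
    for p in persons:
        eligible[p["cls"]] += p["weight"]
        if p["responded"]:
            responding[p["cls"]] += p["weight"]
    adjusted = []
    for p in persons:
        if p["responded"]:
            factor = eligible[p["cls"]] / responding[p["cls"]]
            adjusted.append(p["weight"] * factor)
        else:
            adjusted.append(0.0)
    return adjusted


# Hypothetical sample persons in two adjustment classes.
persons = [
    {"cls": "A", "weight": 100.0, "responded": True},
    {"cls": "A", "weight": 100.0, "responded": False},
    {"cls": "A", "weight": 200.0, "responded": True},
    {"cls": "B", "weight": 50.0, "responded": True},
    {"cls": "B", "weight": 50.0, "responded": True},
]
adjusted = nonresponse_adjust(persons)
print([round(w, 2) for w in adjusted])
```

The adjustment preserves the weighted total within each class, so respondents in class A carry the weight of the class A nonrespondent.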

Extreme weights may occasionally result when units are sampled to yield fixed

sample sizes within a PSU, as was the case with NHANES III. Additionally, the procedures

used to make nonresponse and poststratification weighting adjustments can contribute to

extreme weights. A few unexpectedly large sampling weights can seriously inflate the

variance of the survey estimates. Thus, for a very small number of records, weight trimming

procedures were used to reduce the impact of such large weights on the estimates produced

from the sample.
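The report does not state the trimming rule that was applied. Purely as an illustration of the idea, the Python sketch below caps weights at an assumed multiple of the median weight; the multiple, the function name, and the weights are hypothetical and should not be read as the NHANES III procedure.

```python
from statistics import median


def trim_weights(weights, multiple=5.0):
    """Cap each weight at `multiple` times the median weight (assumed rule)."""
    cap = multiple * median(weights)
    return [min(w, cap) for w in weights]


# Hypothetical nonresponse-adjusted weights with one extreme value.
weights = [100.0, 120.0, 90.0, 110.0, 2000.0]
trimmed = trim_weights(weights)
print(trimmed)
```

Because trimming lowers the weighted total, a subsequent poststratification step, as described in this section, is what restores agreement with the population controls.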

Poststratification of sample weights to independent population estimates is used

for several purposes. In most household surveys, certain demographic groups in the U.S.

population (for example, young black males) experience fairly high rates of undercoverage in

survey efforts. Poststratification to Census estimates partially compensates for such

undercoverage and for any differential nonresponse, and can help to reduce the resulting bias

in the survey estimates. Poststratification can also help to reduce the variability of sample

estimates as well as achieve consistency with accepted U.S. figures for various

subpopulations.

For both Phase 1 and Phase 2, a two-stage poststratification procedure was used.

The first-stage adjustment was poststratified to Census region/MSA status totals, while the

second-stage adjustment used age-sex-race/ethnicity domain totals. The Current Population

Survey (CPS) was used for both stages of the poststratification for the control totals. The

estimates used as control totals for poststratification corresponded to the midpoint of each

time period for Phase 1 and Phase 2. For Phase 1, the control totals used were derived from

the March 1990 CPS; for Phase 2, the totals were from the March 1993 CPS. For both

Phase 1 and Phase 2, all control totals were obtained using undercount adjusted CPS weights.

These CPS weights have themselves undergone poststratification to the Census Bureau’s best

estimates of the total civilian noninstitutionalized population of the United States, including

homeless and others not counted in surveys or in the most recent decennial census. The

NHANES III poststratification therefore brings the weighted totals up to the level of the

presumed total civilian noninstitutionalized population in the United States. Furthermore, the

detailed cells used in poststratification correct for distortions in the age-sex-race/ethnicity

composition of the sample arising from undercoverage, as well as distortions in geography,

etc. Tables 2 and 3 provide the 1990 and 1993 undercount adjusted CPS population totals

used in poststratification for Phase 1 and Phase 2 of the survey.
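The two-stage poststratification can be sketched in Python as two successive ratio adjustments, first to region/MSA status totals and then to age-sex-race/ethnicity domain totals. The cell labels, weights, and control totals below are hypothetical, and the sketch assumes that each stage is a simple ratio adjustment within cells.

```python
from collections import defaultdict


def poststratify(weights, cells, control_totals):
    """Scale weights so each cell's weighted total equals its control total."""
    sums = defaultdict(float)
    for w, c in zip(weights, cells):
        sums[c] += w
    return [w * control_totals[c] / sums[c] for w, c in zip(weights, cells)]


# Hypothetical trimmed, nonresponse-adjusted weights and cell memberships.
weights = [100.0, 150.0, 250.0, 300.0]
regions = ["NE", "NE", "S", "S"]    # stand-in for Census region/MSA status cells
domains = ["d1", "d2", "d1", "d2"]  # stand-in for age-sex-race/ethnicity domains

region_totals = {"NE": 300.0, "S": 500.0}   # hypothetical control totals
domain_totals = {"d1": 380.0, "d2": 420.0}  # hypothetical control totals

stage1 = poststratify(weights, regions, region_totals)
final = poststratify(stage1, domains, domain_totals)
print([round(w, 2) for w in final])
```

In this sketch the second stage reproduces the domain totals exactly, while the first-stage totals hold only approximately afterward, since the two stages are applied once in sequence rather than iterated.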

The final weight for each sample person is the product of the basic weight and

the nonresponse adjustment, trimming, and poststratification adjustment factors. Some SPs

were considered ineligible for the exam or for certain components of the exam due to

nonresponse at the interview stage, or due to the fact that they were not selected into the

subsample under consideration. For nonrespondents, the final weight is zero, while for

ineligibles, the final weight is missing. Three full-sample weights are provided for each

phase: Interview, Mobile Examination Center (MEC) exam, and MEC+Home exam weights.

For Phase 1 and Phase 2, each of these weights was computed using the procedures described

above. In addition to the full-sample weights, weights were computed for four subsamples of

the NHANES III sample: Persons examined at the MEC, or MEC+Home, with morning

(Standard) blood draws; persons examined at the MEC, or MEC+Home, with

afternoon/evening (Modified) blood draws; persons administered allergy tests at the MEC;

and persons administered neurobehavioral tests or Central Nervous System (CNS) tests.

Persons in the Standard sample were instructed to fast overnight for 12 hours, while persons

in the Modified sample were instructed to fast for 6 hours. (The actual fasting hours,

however, could be different from the instructions given to the respondents.) For more detail

on the fasting instructions, refer to the Plan and Operation of the Third National Health and

Nutrition Examination Survey, Vital and Health Statistics (1992). The Allergy subsample

consists of all sampled persons aged 6 to 19, and a half-sample of persons aged 20 to 59. The

CNS subsample consists of the other half-sample of persons aged 20 to 59. Sampling

weights were also created for the combined Phase 1 and Phase 2 samples by multiplying the

weights for each phase by a factor of one-half and then combining the two phase samples.

The following sections provide brief descriptions of the various sampling weights computed

for NHANES III.
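The composition of the final weight described above (the product of the basic weight and the adjustment factors, with zero for nonrespondents and a missing value for ineligibles) can be sketched in Python. The status labels and factor values are hypothetical, and `None` is used here as an assumed stand-in for a missing value.

```python
def final_weight(base, nr_factor, trim_factor, ps_factor, status):
    """Product of basic weight and adjustment factors, by response status."""
    if status == "ineligible":
        return None   # missing: not eligible for this weight
    if status == "nonrespondent":
        return 0.0    # eligible, but did not respond
    return base * nr_factor * trim_factor * ps_factor


# Hypothetical sample persons: (basic weight, factors..., status).
respondent = final_weight(2000.0, 1.25, 1.0, 1.1, "respondent")
nonrespondent = final_weight(2000.0, 1.25, 1.0, 1.1, "nonrespondent")
ineligible = final_weight(2000.0, 1.25, 1.0, 1.1, "ineligible")
print(round(respondent, 1), nonrespondent, ineligible)
```

Keeping zero and missing distinct lets an analyst tell apart persons who were eligible but did not respond from persons to whom the weight does not apply.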

Table 2. March 1990 undercount adjusted CPS totals*

White/Other

Age                      Male        Female
2-11 months**       1,287,784     1,220,410
1 to 2 years        2,980,860     2,822,836
3 to 5 years        4,374,435     4,140,866
6 to 11 years       8,629,062     8,152,429
12 to 19 years     11,042,440    10,581,409
20 to 29 years     15,688,213    16,154,034
30 to 39 years     16,935,511    17,257,300
40 to 49 years     13,113,718    13,505,554
50 to 59 years      9,011,922     9,529,197
60 to 69 years      8,299,588     9,707,882
70 to 79 years      5,051,094     7,115,632
80+ years           1,803,494     3,422,850
Total              98,218,122   103,610,396

Black, Non-Hispanics and Mexican-Americans

                   Black, Non-Hispanics        Mexican-Americans
Age                      Male      Female          Male      Female
2-35 months**         892,888     866,286       530,908     489,043
3 to 5 years          911,942     907,764       524,592     533,892
6 to 11 years       1,737,184   1,706,130       962,604     948,191
12 to 19 years      2,170,730   2,206,642     1,194,780   1,122,249
20 to 39 years      4,548,182   5,611,803     3,010,919   2,499,606
40 to 59 years      2,439,958   2,995,107     1,055,783   1,061,955
60+ years           1,325,033   1,956,130       387,851     462,704
Total              14,025,917  16,249,863     7,667,437   7,117,640

Overall total     246,889,375

*These totals were used as population controls for poststratification to the 52 age-sex-race/ethnicity domains for Phase 1.

**The population totals for 2-11 month old white/other babies were estimated by taking 5/6 of the total CPS estimate for less than 1 year old white/others. The population totals for 2-35 month old black and Mexican-American babies were estimated by taking 17/18 of the CPS estimate for less than 3 year old black or Mexican-American babies.

SOURCE: Current Population Survey
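The fractions 5/6 and 17/18 in the table footnote correspond to the share of months of age covered by the NHANES III target population within the CPS age group, under the assumption that ages are spread evenly over the months in the group. The Python sketch below works through that arithmetic; the CPS estimate in it is hypothetical, and the function name is illustrative.

```python
from fractions import Fraction


def age_share(first_month, last_month, months_in_group):
    """Share of an age group covered by ages first_month..last_month inclusive."""
    return Fraction(last_month - first_month + 1, months_in_group)


# Ages 2-11 months within "less than 1 year" (12 months of age: 0 through 11).
white_other_share = age_share(2, 11, 12)
# Ages 2-35 months within "less than 3 years" (36 months of age: 0 through 35).
black_mexican_american_share = age_share(2, 35, 36)

hypothetical_cps_under_1 = 1_200_000  # not an actual CPS figure
control_total = float(white_other_share * hypothetical_cps_under_1)
print(white_other_share, black_mexican_american_share, control_total)
```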

Table 3. March 1993 undercount adjusted CPS totals*

White/Other

Age                      Male        Female
2-11 months**       1,220,009     1,195,902
1 to 2 years        3,084,848     2,938,969
3 to 5 years        4,524,065     4,268,933
6 to 11 years       8,932,943     8,338,142
12 to 19 years     11,048,058    10,564,791
20 to 29 years     14,928,357    15,138,441
30 to 39 years     17,657,521    17,937,053
40 to 49 years     14,498,177    14,877,962
50 to 59 years      9,605,640    10,058,779
60 to 69 years      8,107,318     9,350,120
70 to 79 years      5,474,728     7,453,251
80+ years           2,054,518     3,844,970
Total             101,136,181   105,967,312

Black, Non-Hispanics and Mexican-Americans

                   Black, Non-Hispanics        Mexican-Americans
Age                      Male      Female          Male      Female
2-35 months**         987,819     923,857       665,129     597,756
3 to 5 years          959,781     969,256       601,980     592,474
6 to 11 years       1,803,866   1,759,779     1,033,780   1,050,243
12 to 19 years      2,211,922   2,230,171     1,165,540   1,224,296
20 to 39 years      4,708,931   5,745,358     3,198,334   2,766,656
40 to 59 years      2,717,786   3,329,443     1,296,774   1,274,973
60+ years           1,368,276   2,038,108       439,283     539,568
Total              14,758,383  16,995,971     8,400,820   8,045,966

Overall total     255,304,631

*These totals were used as population controls for poststratification to the 52 age-sex-race/ethnicity domains for Phase 2.

**The population totals for 2-11 month old white/other babies were estimated by taking 5/6 of the total CPS estimate for less than 1 year old white/others. The population totals for 2-35 month old black and Mexican-American babies were estimated by taking 17/18 of the CPS estimate for less than 3 year old black or Mexican-American babies.

SOURCE: Current Population Survey

3. WEIGHTING FOR FULL SAMPLE AND SUBSAMPLES FOR PHASE 1 AND PHASE 2

3.1 Weighting for Full Sample

Interview Weights

All sampled persons were contacted for an interview at home. Those who did

not participate in the interview were considered nonrespondents in the calculation of

interview weights; all who did respond to the interview were assigned interview weights.

There was a total of 17,464 interview respondents out of the eligible 20,277 SPs for Phase 1,

and 16,530 interview respondents out of the eligible 19,418 SPs for Phase 2. Interview

weights were computed by applying interview nonresponse adjustment to the basic

poststratified weights. Weight trimming was used on a small number of cases (less than 1

percent of interviewed cases) with extreme weights. Poststratification was then applied to the

nonresponse adjusted trimmed weights.

MEC Examination Weights

All interviewed persons were invited to the MEC for physical examinations.

Those who reported to the MEC were considered as respondents in calculating the MEC

exam weight. All home examinees and interviewed persons who were not examined were

treated as nonrespondents. All SPs who were not interviewed were regarded as ineligibles for

the purpose of computing MEC exam weights. Out of the 20,277 sampled persons in Phase

1, there were 15,630 respondents to the MEC exam, 1,834 nonrespondents, and 2,813

ineligibles (interview nonrespondents). For Phase 2, the 19,418 SPs comprise 15,188 MEC

respondents, 1,342 nonrespondents, and 2,888 ineligibles (interview nonrespondents). The

final interview weight was adjusted for examination nonresponse. Again, the weights for a

small number of cases (less than 1 percent of MEC examined cases) were trimmed to reduce

their effect on the variance estimates. Poststratification was then carried out on the

nonresponse adjusted trimmed exam weights to arrive at the final MEC examination weights.
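The counts quoted in this section can be checked against the overall figures given earlier in the report. The Python sketch below uses only counts stated in the text and computes unweighted rates; whether the 14 percent and 9 percent figures cited earlier were computed on an unweighted basis is an assumption of this sketch.

```python
# Counts of sampled, interviewed, and MEC examined persons as stated in the text.
counts = {
    "Phase 1": {"sampled": 20_277, "interviewed": 17_464, "mec_examined": 15_630},
    "Phase 2": {"sampled": 19_418, "interviewed": 16_530, "mec_examined": 15_188},
}

sampled = sum(c["sampled"] for c in counts.values())
interviewed = sum(c["interviewed"] for c in counts.values())
mec_examined = sum(c["mec_examined"] for c in counts.values())

# Unweighted nonresponse rates for the two phases combined.
interview_nonresponse = 1 - interviewed / sampled
exam_nonresponse = 1 - mec_examined / interviewed

print(sampled, interviewed, mec_examined)
print(round(interview_nonresponse, 3), round(exam_nonresponse, 3))
```

The combined counts reproduce the designated, interviewed, and MEC examined totals in the summary of sample sizes, and the unweighted rates round to 14 percent and 9 percent.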

MEC+Home Examination Weights

An additional 493 persons who were unable to come to the MEC were examined

at their home. For the calculation of the MEC+Home exam weight, sampled persons

examined either at the MEC or at their home were considered respondents. All interviewed

persons who were not examined were considered nonrespondents. Of the 20,277 sampled

persons in Phase 1, there were 15,884 respondents to the MEC or home examination. Out of

19,418 sampled persons in Phase 2, there were 15,427 respondents to the MEC or home

examination. The final MEC+Home exam weights were calculated by adjusting the final

interview weights for exam (MEC+Home) nonresponse, trimming the MEC+Home exam

nonresponse adjusted weights (less than 1 percent of MEC+Home examined cases had their

weights trimmed), and applying the two-stage poststratification procedure.

3.2 Weighting for Subsamples

Standard and Modified Weights

Persons 12 years or older in a random half of the households selected in the

sample were instructed to fast overnight (12 hours) and report to the morning examination

session. The sample is referred to as the “Standard” subsample. The other half of the sample

(persons 12 years or older) were instructed to fast for 6 hours and then report to either the

afternoon or evening examination session. This sample is referred to as the “Modified”

subsample. Two Standard half-sample weights were computed for each phase: a MEC exam

weight for persons who were examined at the MEC, and a MEC+Home exam weight for persons who were

examined either at the MEC or at home. Similarly, both MEC exam and MEC+Home exam

weights were calculated for the Modified half-sample. Since each of these subsamples is

approximately a half-sample of the age-eligible sample, the basic MEC exam and the basic

MEC+Home exam weights were computed by doubling the final full-sample MEC exam

weight and the final full-sample MEC+Home weight, respectively, for each person in the

given (i.e., Standard or Modified) half-sample.

A ratio adjustment for "nonresponse" due to reporting to the "wrong" session

was applied, and then the two-stage poststratification was carried out to obtain the final MEC

exam and MEC+Home weights for the given half-sample.
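The half-sample weighting steps can be sketched in Python: the full-sample weight is doubled to give the basic half-sample weight, and a ratio adjustment then transfers the weight of persons who reported to the wrong session to those who reported to the assigned one. The report does not give the adjustment cells, so the sketch assumes a single overall ratio; the weights and flags are hypothetical, and the final poststratification step is omitted.

```python
def half_sample_weights(full_weights, correct_session):
    """Double full-sample weights, then ratio-adjust for wrong-session cases."""
    basic = [2.0 * w for w in full_weights]
    total = sum(basic)
    responding = sum(b for b, ok in zip(basic, correct_session) if ok)
    factor = total / responding
    return [b * factor if ok else 0.0 for b, ok in zip(basic, correct_session)]


# Hypothetical final full-sample MEC exam weights for one half-sample.
full_weights = [100.0, 200.0, 300.0, 400.0]
correct_session = [True, True, False, True]  # third person came to the wrong session

adjusted = half_sample_weights(full_weights, correct_session)
print([round(w, 2) for w in adjusted])
```

After the adjustment, the half-sample weights sum to twice the full-sample total for the same persons, so the half-sample still represents the whole age-eligible population.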

The Standard half-sample for Phase 1 consisted of 4,913 SPs aged 12 years and

older, of which 4,785 SPs were examined at the MEC, and 128 SPs were examined at home.

For Phase 2, there was a total of 5,134 SPs, with 5,016 SPs examined at the MEC, and 118

SPs examined at home.

The Modified half-sample for Phase 1 consisted of 5,048 SPs aged 12 years

and older, of which 4,947 SPs were examined at the MEC, and 101 SPs were examined at

home. For Phase 2, there was a total of 5,146 SPs, with 5,036 SPs examined at the MEC, and

110 SPs examined at home.

Allergy and CNS Weights

All persons aged 6-19 years and a random half of adults aged 20-59 years were

eligible for the allergy test. The other random half of persons aged 20-59 were assigned to

the CNS component. The Allergy and CNS components of the exam were assigned only to

those persons in the Allergy and CNS subsamples, respectively, who reported to the MEC for

their exam. Thus, all interviewed persons who were not examined at the MEC were

considered nonrespondents. This means that the nonresponse adjustment for the Allergy and

CNS subsamples was completed during the examination nonresponse adjustment. However,

the weights for the Allergy and CNS subsamples for ages 20 to 59 did go through another

round of processing: an adjustment to reflect the random assignment to either CNS or allergy, trimming,

and poststratification.

For Phase 1, the Allergy sample consisted of 7,616 SPs, of which 6,097 were

respondents. For Phase 2, the 7,483 SPs in the Allergy sample comprised 6,009 respondents.

The CNS sample for Phases 1 and 2 contains 3,645 and 3,811 SPs, respectively. For Phase 1,

there were 2,751 respondents, while for Phase 2 the number of respondents was 2,911.

4. WEIGHTING FOR PHASES 1 AND 2 COMBINED

The full sample and subsample weights for Phases 1 and 2 combined were

computed by taking one-half the weights for the phase to which the sampled person was

assigned. The decision to forego any further poststratification of these weights was based on

the duration of the survey. Because Phase 1 collection spanned the years 1988 to 1991, and

Phase 2 collection occurred from 1991 to 1994, any control totals used to poststratify the

combined Phases 1 and 2 samples would not appropriately reflect the population over each of

the reference periods. However, because weights for each phase were poststratified

separately, they do reflect the population for their respective reference periods. In addition,

estimates obtained for the full NHANES III sample will be consistent with estimates obtained

separately for either Phase 1 or Phase 2. Tables 4 and 5 provide the number of respondents,

by sampling domain, for each of the samples for which weights were computed. For the

interviewed and MEC samples, Table 6 shows the 5th and 95th percentiles, and the mean of

the distribution of the weights. The distribution of the weights for the MEC+Home sample is

similar to that of the MEC sample, since home examined cases constitute a very small portion

of the examinations conducted for elderly persons and babies. Similarly, the distributions of

weights for the subsamples closely resemble those of the full sample in shape, since the

subsamples are random subsets of the full sample; only the scale differs. For example, if the

subsample includes a random one-half of the SPs in a given sampling domain, the subsample

MEC weights for that domain would be about twice as large as the full-sample MEC weights.
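The combination rule described above (one-half of each person's phase weight) can be sketched as follows. This is a minimal illustration, not the production weighting code; the record layout, the function name, and the weight values are assumptions made up for the example.

```python
def combine_phase_weights(records):
    """Return combined Phase 1+2 weights: one-half of each person's phase weight.

    `records` is a list of (person_id, phase, phase_weight) tuples. This layout
    is hypothetical and is used only for illustration.
    """
    combined = {}
    for person_id, phase, phase_weight in records:
        if phase not in (1, 2):
            raise ValueError(f"unknown phase: {phase!r}")
        combined[person_id] = 0.5 * phase_weight
    return combined


# Made-up weights, not survey values.
sample = [("A", 1, 8000.0), ("B", 1, 12000.0), ("C", 2, 9000.0), ("D", 2, 11000.0)]
combined = combine_phase_weights(sample)

# Each phase's weights sum to a population total for its own reference period;
# the combined weights sum to the average of the two phase totals.
phase1_total = sum(w for _, p, w in sample if p == 1)
phase2_total = sum(w for _, p, w in sample if p == 2)
combined_total = sum(combined.values())
```

Because each phase was poststratified separately, halving the weights makes the combined total equal to the average of the two phase totals, which is why no further poststratification is applied in this sketch.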

Flow charts of the methodology used to weight the samples for Phase 1 and for

Phase 2 are given in Exhibit 1. Table 7 summarizes the appropriate uses of the weights.


Table 4. Number of respondents by age-sex-race/ethnicity subdomain for Phases 1 and 2 combined

                                          Number of respondents
Age-Sex-Race/Ethnicity                                         MEC    MEC+Home
subdomain                     Screened Interviewed    examined    examined

Total                           39,695      33,994      30,818      31,311

White/Other
  Male
    2-11 months                    748         704         639         659
    1-2 years                      560         525         490         490
    3-5 years                      582         539         503         503
    6-11 years                     632         565         518         518
    12-19 years                    536         471         433         433
    20-29 years                    674         535         472         472
    30-39 years                    740         564         506         506
    40-49 years                    669         521         475         476
    50-59 years                    694         519         456         461
    60-69 years                    795         609         539         549
    70-79 years                    796         613         503         535
    80+ years                      843         688         486         576
  Female
    2-11 months                    731         707         658         668
    1-2 years                      552         524         475         475
    3-5 years                      632         578         519         519
    6-11 years                     591         527         489         489
    12-19 years                    706         615         552         552
    20-29 years                    743         631         579         582
    30-39 years                    892         731         678         681
    40-49 years                    717         598         539         542
    50-59 years                    763         604         536         541
    60-69 years                    818         619         529         548
    70-79 years                  1,061         806         634         685
    80+ years                    1,049         823         511         639


Table 4. Number of respondents by age-sex-race/ethnicity subdomain for Phases 1 and 2 combined (continued)

                                          Number of respondents
Age-Sex-Race/Ethnicity                                         MEC    MEC+Home
subdomain                     Screened Interviewed    examined    examined

Black, non-Hispanic
  Male
    2-35 months                    585         555         532         535
    3-5 years                      575         535         512         512
    6-11 years                     655         605         577         577
    12-19 years                    656         579         542         542
    20-39 years                  1,287       1,057         986         987
    40-59 years                    814         645         585         588
    60+ years                      730         598         527         544
  Female
    2-35 months                    552         532         515         515
    3-5 years                      600         565         542         542
    6-11 years                     606         556         541         541
    12-19 years                    692         629         601         601
    20-39 years                  1,538       1,333       1,280       1,281
    40-59 years                    940         776         723         728
    60+ years                      831         662         546         581

Mexican-American
  Male
    2-35 months                    667         630         594         595
    3-5 years                      642         601         564         564
    6-11 years                     644         598         570         570
    12-19 years                    655         572         535         535
    20-39 years                  1,488       1,265       1,147       1,149
    40-59 years                    756         593         558         558
    60+ years                      743         609         532         552
  Female
    2-35 months                    650         619         585         587
    3-5 years                      689         647         620         620
    6-11 years                     657         616         591         591
    12-19 years                    642         575         548         548
    20-39 years                  1,432       1,261       1,188       1,191
    40-59 years                    736         596         563         563
    60+ years                      709         569         495         515


Table 5. Number of respondents by age-sex-race/ethnicity subdomain for Phases 1 and 2

                                          Number of respondents
                             Allergy       CNS  Standard  Standard  Modified  Modified
Age-Sex-Race/Ethnicity     component component       MEC  MEC+Home       MEC  MEC+Home
subdomain                                       examined  examined  examined  examined

Total                         12,106     5,662     9,127     9,254     9,497     9,630

White/Other
  Male
    6-11 years                   518         -         -         -         -         -
    12-19 years                  433         -       169       169       212       212
    20-29 years                  224       248       214       214       239       239
    30-39 years                  249       257       233       233       250       250
    40-49 years                  232       243       218       218       226       226
    50-59 years                  230       226       219       219       216       218
    60-69 years                    -         -       266       270       250       255
    70-79 years                    -         -       254       263       235       245
    80+ years                      -         -       218       247       240       267
  Female
    6-11 years                   489         -         -         -         -         -
    12-19 years                  552         -       216       216       240       240
    20-29 years                  261       318       283       283       280       281
    30-39 years                  350       328       315       315       343       343
    40-49 years                  274       265       245       246       269       271
    50-59 years                  258       278       269       270       248       250
    60-69 years                    -         -       265       267       238       245
    70-79 years                    -         -       286       299       317       335
    80+ years                      -         -       234       270       238       272

Black, non-Hispanic
  Male
    6-11 years                   577         -         -         -         -         -
    12-19 years                  542         -       213       213       263       263
    20-39 years                  496       490       474       474       478       479
    40-59 years                  303       282       288       289       272       273
    60+ years                      -         -       251       255       250       254
  Female
    6-11 years                   541         -         -         -         -         -
    12-19 years                  601         -       240       240       276       276
    20-39 years                  639       641       602       603       631       631
    40-59 years                  345       378       334       335       359       361
    60+ years                      -         -       258       267       258       266

Mexican-American
  Male
    6-11 years                   570         -         -         -         -         -
    12-19 years                  535         -       206       206       266       266
    20-39 years                  596       551       531       532       559       559
    40-59 years                  261       297       262       262       276       276
    60+ years                      -         -       257       263       255       260
  Female
    6-11 years                   591         -         -         -         -         -
    12-19 years                  548         -       251       251       225       225
    20-39 years                  621       567       573       574       562       563
    40-59 years                  270       293       263       263       274       274
    60+ years                      -         -       220       228       252       255


Table 6. Fifth and 95th percentiles, and the mean values of the weights by sampling domain

Age-sex-race/ethnicity         Interview weights            MEC exam weights
domain*                          5%     Mean      95%       5%     Mean      95%

White
  Male
    2-11 months               1,028    1,775    3,187    1,104    1,951    3,603
    1 to 2 years              3,435    5,835   10,318    3,647    6,231   11,049
    3 to 5 years              3,481    8,452   19,259    3,632    9,083   21,783
    6 to 11 years             6,850   16,081   36,139    7,438   17,477   40,788
    12 to 19 years           12,809   24,582   44,631   14,055   26,657   48,042
    20 to 29 years           16,417   29,532   48,769   18,886   33,480   56,947
    30 to 39 years           14,042   30,923   66,802   14,937   34,522   76,759
    40 to 49 years           14,372   27,347   56,634   15,194   30,043   56,124
    50 to 59 years            7,638   17,722   41,610    8,469   20,183   47,973
    60 to 69 years            5,622   13,419   23,994    6,170   15,155   30,140
    70 to 79 years            4,441    8,496   18,677    5,599   10,348   22,565
    80+ years                 1,486    2,776    5,200    2,214    3,922    6,965
  Female
    2-11 months                 983    1,678    3,082    1,013    1,798    3,273
    1 to 2 years              2,894    5,417   10,492    3,175    5,981   11,636
    3 to 5 years              2,819    7,212   16,339    2,942    7,994   19,256
    6 to 11 years             7,014   15,838   35,150    7,288   17,057   37,685
    12 to 19 years            7,141   17,569   35,424    8,745   19,567   41,651
    20 to 29 years            7,700   25,491   50,726    8,429   27,848   52,573
    30 to 39 years            7,484   24,258   47,953    8,251   26,128   55,136
    40 to 49 years            9,583   24,082   50,456   10,666   26,923   56,937
    50 to 59 years            8,287   16,573   26,700    8,860   18,712   32,029
    60 to 69 years            9,021   15,593   25,388   10,635   18,254   30,467
    70 to 79 years            3,283    9,132   20,604    3,974   11,611   26,557
    80+ years                 2,429    4,413    7,545    3,728    7,087   13,336

Black, non-Hispanic
  Male
    2-35 months                 962    1,694    2,797      956    1,768    3,095
    3 to 5 years              1,016    1,749    2,999    1,055    1,828    2,890
    6 to 11 years             1,622    2,926    4,913    1,602    3,069    5,540
    12 to 19 years            1,487    3,785    7,122    1,547    4,043    7,705
    20 to 39 years            2,507    4,379    7,472    2,569    4,694    8,320
    40 to 59 years            2,016    3,998    6,308    2,108    4,408    7,142
    60+ years                 1,424    2,252    3,596    1,511    2,555    4,001
  Female
    2-35 months                 903    1,682    2,919      915    1,738    3,013
    3 to 5 years                982    1,661    3,023      962    1,732    3,012
    6 to 11 years             1,763    3,117    5,059    1,825    3,203    5,322
    12 to 19 years            1,069    3,527    7,046    1,086    3,691    7,577
    20 to 39 years            1,951    4,260    8,490    1,740    4,436    9,398
    40 to 59 years            1,782    4,075    8,273    1,899    4,374    8,857
    60+ years                 1,622    3,017    5,737    1,928    3,658    7,440


Table 6. Fifth and 95th percentiles, and the mean values of the weights by sampling domain (continued)

Age-sex-race/ethnicity         Interview weights            MEC exam weights
domain*                          5%     Mean      95%       5%     Mean      95%

Mexican-American
  Male
    2-35 months                 345      949    2,188      360    1,007    2,304
    3 to 5 years                363      937    2,228      384      999    2,290
    6 to 11 years               269    1,669    3,676      280    1,751    3,944
    12 to 19 years            1,023    2,063    4,075    1,053    2,206    4,334
    20 to 39 years            1,287    2,454    4,850    1,404    2,707    5,269
    40 to 59 years              810    1,984    3,909      820    2,108    4,170
    60+ years                   230      679    1,882      242      777    2,106
  Female
    2-35 months                 314      878    2,192      343      929    2,276
    3 to 5 years                285      870    2,204      300      908    2,308
    6 to 11 years               243    1,622    4,994      248    1,691    5,245
    12 to 19 years              921    2,040    4,168      959    2,141    4,267
    20 to 39 years            1,065    2,088    4,049    1,103    2,216    4,384
    40 to 59 years              891    1,961    4,080      955    2,075    4,292
    60+ years                   403      881    1,972      442    1,012    2,373

Other
  Male
    2-11 months               1,099    1,822    3,072    1,181    2,042    3,439
    1 to 2 years              1,979    5,467    8,953    1,953    5,971    9,712
    3 to 5 years              1,700    7,199   15,901    1,795    7,643   16,229
    6 to 11 years             2,446   12,407   36,139    2,781   14,034   42,915
    12 to 19 years            2,655   17,478   46,071    2,983   19,555   51,296
    20 to 29 years            3,834   23,611   43,848    4,065   26,888   50,046
    30 to 39 years            5,163   28,633   66,840    5,313   31,513   73,023
    40 to 49 years            4,168   17,949   40,236    4,339   20,156   45,842
    50 to 59 years            4,363   20,085   49,443    5,143   22,693   58,288
    60 to 69 years            6,114   14,266   33,623    6,510   16,183   34,555
    70 to 79 years            2,138   11,736   26,172    2,883   14,469   32,791
    80+ years                 1,453    3,699    7,940    2,227    5,353   15,722
  Female
    2-11 months               1,116    1,885    3,548    1,184    2,047    3,811
    1 to 2 years              2,023    6,218   16,005    2,369    6,777   17,031
    3 to 5 years              1,596    7,649   15,877    1,766    8,696   19,547
    6 to 11 years             2,775   14,255   43,375    2,773   15,516   47,138
    12 to 19 years            1,585   15,154   44,062    1,859   17,064   52,279
    20 to 29 years            2,575   20,334   47,400    2,729   21,798   52,190
    30 to 39 years            2,674   22,604   53,511    2,644   24,656   57,541
    40 to 49 years            2,955   21,138   50,456    3,321   22,153   56,159
    50 to 59 years            3,043   12,712   26,197    3,162   14,505   30,559
    60 to 69 years            3,791   13,087   26,673    4,625   15,650   29,733
    70 to 79 years            3,423    7,083   13,143    3,919    9,210   17,896
    80+ years                 2,627    4,509    8,114    4,498    7,961   14,136

*The race/ethnicity domain, white/other, is divided into white and other (the remainder of white/other) groups in this table to show the distribution of the weights separately for each group.


Exhibit 1. NHANES III Weighting Flow Chart: Phase 1, Phase 2 (separately)

[Flow chart. The recoverable boxes are listed in order; indented lines are processing steps.]

Basic weight
    Adjust for new construction (Phase 1 only)
    Trim
    Poststratify to region/MSA controls
    Poststratify to sampling domain controls

Poststratified basic weights
    Adjust for interview nonresponse by: age, race/ethnicity, household size
    Trim

Poststratified interview weights
    Adjust for MEC examination nonresponse by: age, race/ethnicity, household size, self-reported health status
    Trim

Poststratified examination weights
    Adjust for MEC and Home examination nonresponse by: age, race/ethnicity, household size, self-reported health status
    Trim

Poststratified MEC+Home examination weights


Exhibit 1. NHANES III Weighting Flow Chart (continued): Phase 1, Phase 2 (separately)

[Flow chart. Only the box labels are recoverable, not the connections between them.]

Weights carried over from the first part of the exhibit:
    Poststratified examination weights
    Poststratified MEC+Home examination weights

Processing steps appearing on the branches:
    Adjust for morning session nonresponse by: age, race/ethnicity, household size, self-reported health status
    Adjust for afternoon/evening session nonresponse by: age, race/ethnicity, household size, self-reported health status
    Trim
    Poststratify to region/MSA controls
    Poststratify to sampling domain controls

Resulting weights:
    Allergy full-sample (ages 6-11) weights
    Poststratified allergy half-sample (ages 20-59) weights
    Poststratified CNS half-sample (ages 20-59) weights
    Poststratified standard half-sample MEC exam weights
    Poststratified modified half-sample MEC exam weights
    Poststratified standard half-sample MEC+Home exam weights
    Poststratified modified half-sample MEC+Home exam weights


Table 7. Appropriate uses of the weights

Weight / Application

Final interview weight
    Use only in conjunction with the sample interviewed at home, and only with items collected during the household interview.

Final exam (MEC only) weight
    Use only in conjunction with the MEC examined sample, and only with interview and examination items collected at the MEC.

Final MEC+Home exam weight
    Use only in conjunction with the MEC+Home examined sample, and only with items collected at both the MEC and home.

Final Allergy weight
    Use only in conjunction with the Allergy subsample, and only with items collected as part of the allergy component of the exam.

Final CNS weight
    Use only in conjunction with the CNS subsample, and only with items collected as part of the CNS component of the exam.

Final Standard exam (MEC only) weight
    Use only in conjunction with the MEC examined persons assigned to the Standard subsample, and only with items collected at the MEC exam. These weights should be used to analyze tests such as the Oral Glucose Tolerance Test (OGTT), where overnight fasting is preferred.

Final Modified exam (MEC only) weight
    Use only in conjunction with the MEC examined persons assigned to the Modified subsample, and only with items collected at the MEC exam.

Final Standard MEC+Home exam weight
    Use only in conjunction with the MEC and home examined persons assigned to the Standard subsample, and only with items collected during the MEC and home examinations.

Final Modified MEC+Home exam weight
    Use only in conjunction with the MEC and home examined persons assigned to the Modified subsample, and only with items collected during the MEC and home examinations.


5. VARIANCE ESTIMATION

When data are collected as part of a complex sample survey, care is needed to

produce approximately unbiased and design-consistent estimates of variance analytically. In

a complex sample survey setting, variance estimates computed using standard statistical

software packages that assume simple random sampling are biased. Two common

approaches are available for estimation of variances for complex survey data: linearization

and replication.

For the linearization approach, nonlinear estimates are approximated by linear

ones for the purpose of variance estimation. The linear approximation is derived by taking

the first order Taylor series approximation for the estimator. Standard variance estimation

methods for linear statistics are then used to estimate the variance of the linearized estimator.

For a two-PSUs-per-stratum sample design such as NHANES III, with some

simplifying assumptions including with-replacement sampling at the first stage (see Wolter,

1985), the linearization variance estimate is obtained by summing the squared differences

between the linearized estimates for the two PSUs in each stratum. That is,

$$v(z) = \sum_{h=1}^{H} (z_{h1} - z_{h2})^2 ,$$

where $z_{h1}$ and $z_{h2}$ are the linearized estimates for PSU 1 and PSU 2, respectively, of stratum $h$.
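As a rough illustration of the two-PSUs-per-stratum formula above, the sketch below sums the squared differences between the paired PSU estimates. The function name and the input values are hypothetical; in practice the linearized PSU-level estimates would come from a survey package, not from hand-entered numbers.

```python
def two_psu_linearization_variance(psu_estimates):
    """Sum of squared differences between the two PSU estimates in each stratum.

    `psu_estimates` is a list of (z_h1, z_h2) pairs, one pair per stratum,
    where each z is the linearized (weighted) estimate for that PSU.
    """
    return sum((z_h1 - z_h2) ** 2 for z_h1, z_h2 in psu_estimates)


# Three made-up strata, each with two PSU-level linearized estimates.
strata = [(10.0, 12.0), (7.5, 7.0), (3.0, 4.0)]
variance = two_psu_linearization_variance(strata)  # 4.0 + 0.25 + 1.0 = 5.25
```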

Replication methods provide a general means for estimating variances for the

types of complex sample designs and weighting procedures usually encountered in practice.

The basic idea behind the replication approach is to select subsamples repeatedly from the

whole sample, to calculate the statistic of interest for each of these subsamples, and then to

use the variability among these subsamples or replicate statistics to estimate the variance of

the full-sample statistics. See Wolter (1985) for further descriptions of both the replication

and linearization approaches.

One of the main advantages of the replication approach is its ease of use at the

analysis stage. The same estimation procedure is used for the total sample and for each

replicate. The variance estimates are then readily computed by a simple procedure.


Furthermore, the same procedure is applicable to most statistics desired such as means,

percentages, ratios, correlations, etc. (Efron, 1982). These estimates can also be calculated

for analytic groups or subpopulations. Another important advantage of the replication

approach is that it provides a simple way to account for adjustments that are made in

weighting, such as adjustments for nonresponse and poststratification. By separately

computing the weighting adjustments for each replicate, it is possible to reflect the effects of

poststratification and nonresponse adjustment in the estimates of variance.

There are different ways of creating replicates from the full sample. Jackknife

and balanced repeated replication (BRR) methods are two common procedures for the

derivation of replicates. The jackknife procedure retains most of the sample in each replicate,

whereas the BRR approach retains about one-half of the sample in each replicate. The choice

of a replication method for NHANES III depended on the objectives of the survey. In

NHANES, special attention is given to (1) estimates of health characteristics for subdomains

of the population, and (2) estimates of quartiles for various statistics. For small subdomain

estimation, the jackknife procedure is more stable since every replicate includes most of the

entire sample, and the chance of having replicates with no observation for the characteristic of

interest is small. However, the BRR method has proven to be more reliable for the estimation

of quartiles. Kovar, Rao, and Wu (1988) found in an empirical study that the jackknife

replication method performed poorly for estimating the variances of estimated population

quartiles, but BRR seemed to work relatively well for these quartiles. Rao, Wu, and Yue

(1992) report on both jackknife and BRR procedures for estimating the median for cluster

samples.

For the combined Phases 1 and 2 sample, replicate weights were calculated

using Fay’s Method, a variant of the balanced repeated replication (BRR) method. For more

details on Fay’s Method, refer to Judkins (1990). BRR is generally used with stratified

multistage sample designs when two PSUs per stratum have been selected. For standard

BRR, each replicate half-sample estimate is formed by selecting one of the two PSUs from

each stratum and then using only the selected PSUs to estimate the parameter of interest. The

weights for the units selected are multiplied by a factor of two to form the replicate weights.

Fay’s Method produces replicate weights by multiplying the full-sample weights

by factors of K and 2-K (0 < K < 1). For creating replicate weights for NHANES III, K=0.3

was used. In studies where quartile estimates and small domain estimates are both of interest,

Fay’s Method has sometimes been used as a compromise between the jackknife and standard


BRR. Judkins (1990) demonstrates that for estimation of quartiles and other statistics, Fay’s

Method with K=0.3 does well in terms of both bias and stability.
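The construction of Fay replicate weights described above can be sketched as follows. This is a toy illustration under stated assumptions: the 4x4 Hadamard matrix, the three strata, the unit weights, and the convention for which PSU receives the factor 2-K are all chosen for the example and are not taken from the NHANES III files.

```python
K = 0.3  # Fay factor reported for NHANES III

# 4x4 Hadamard matrix: rows are replicates; columns 1-3 are assigned to strata
# (the all-ones first column is left unused). Toy size for illustration only.
HADAMARD_4 = [
    [1,  1,  1,  1],
    [1, -1,  1, -1],
    [1,  1, -1, -1],
    [1, -1, -1,  1],
]


def fay_replicate_weights(units, signs, k=K):
    """Return one replicate's weights.

    units: list of (stratum_index, psu, full_sample_weight), with psu in {1, 2}.
    signs: +1/-1 per stratum for this replicate. By the convention assumed
           here, +1 inflates PSU 1 (factor 2-k) and deflates PSU 2 (factor k);
           -1 does the reverse.
    """
    weights = []
    for stratum, psu, weight in units:
        inflate = (signs[stratum] == 1) == (psu == 1)
        factor = (2.0 - k) if inflate else k
        weights.append(weight * factor)
    return weights


# Made-up sample: three strata, two PSUs each, one unit per PSU.
units = [(0, 1, 100.0), (0, 2, 100.0),
         (1, 1, 200.0), (1, 2, 200.0),
         (2, 1, 50.0), (2, 2, 50.0)]
replicates = [fay_replicate_weights(units, row[1:]) for row in HADAMARD_4]
```

Setting `k=0.0` in this sketch reproduces standard BRR, where the selected PSU's weights are doubled and the other PSU's weights are set to zero. With `k=0.3`, every unit keeps a positive weight in every replicate, which is what keeps small-domain estimates defined.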

The full-sample estimate, $\hat{\theta}$, is calculated using the full-sample weights. The

replicate weights are then used to calculate replicate estimates, $\hat{\theta}_{(j)}$, using the same

methodology as was used to calculate the full-sample estimate. The variance estimator, $v(\hat{\theta})$,

then takes the form

$$v(\hat{\theta}) = \frac{1}{G(1-K)^2} \sum_{j=1}^{G} \left( \hat{\theta}_{(j)} - \hat{\theta} \right)^2 ,$$

where G is the total number of replicates formed. The degrees of freedom associated with

this variance estimator is approximately L, the number of PSUs minus the number of strata.
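A minimal sketch of this variance estimator follows, assuming the replicate estimates have already been computed with the replicate weights. The function name and the numbers are hypothetical.

```python
def fay_variance(full_estimate, replicate_estimates, k=0.3):
    """Fay's-method variance: sum of squared deviations of the replicate
    estimates from the full-sample estimate, divided by G * (1 - k)^2."""
    g = len(replicate_estimates)
    if g == 0:
        raise ValueError("at least one replicate estimate is required")
    if not 0.0 <= k < 1.0:
        raise ValueError("k must satisfy 0 <= k < 1")
    squared_deviations = sum((r - full_estimate) ** 2 for r in replicate_estimates)
    return squared_deviations / (g * (1.0 - k) ** 2)


# Made-up estimates: each replicate deviates by 0.7 from the full-sample value,
# so the estimate is approximately 1.0 after dividing by (1 - 0.3)^2.
variance = fay_variance(10.0, [10.7, 9.3, 10.7, 9.3])
```

The division by (1 - K)^2 compensates for the fact that Fay's perturbation factors K and 2-K move each replicate estimate only a fraction (1 - K) as far from the full-sample estimate as standard BRR would.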

The total number of replicate samples that can be formed is $2^L$. However, it is

not necessary to form all replicates. All of the information available in the $2^L$ replicates can

be captured using G orthogonal or “balanced” replications. The Plackett-Burman algorithm,

described in McCarthy (1966), is used to create the orthogonal Hadamard matrix. The

minimum number of replicates needed to have the full information, G, is the smallest integer

divisible by 4 which is greater than or equal to L. For NHANES III, L=49, so G=52

replicates were used.
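The rule for the number of balanced replicates can be checked with a few lines of arithmetic. The function name is an assumption made for this sketch; the rule itself (smallest multiple of 4 that is at least L) is the one stated above.

```python
def balanced_replicate_count(l):
    """Smallest integer divisible by 4 that is greater than or equal to l."""
    if l < 1:
        raise ValueError("l must be a positive integer")
    return 4 * ((l + 3) // 4)


g = balanced_replicate_count(49)  # 52, as used for NHANES III
full_set = 2 ** 49                # number of possible half-sample replicates
```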

Replicate weights are provided for both the interviewed and MEC examined

samples for Phases 1 and 2 combined. Exhibit 2 contains a flow chart of the methodology

used to create the replicate weights. The PC software, WesVarPC, can be used to analyze

NHANES III data using the replicate weights. WesVarPC may be accessed via the Internet at

Westat’s home page (URL: www.westat.com). Any other replication software (such as

VPLX, developed by Bob Fay) that accounts for Fay’s Method in the computation of variances

can also be used.

Replicate weights were not created for the MEC+Home examined sample or for

the subsamples. WesVarPC may be used to create simple replicate weights for these samples.

Unlike the interview and MEC replicate weights that are provided, replicate weights created

using WesVarPC will not reflect all the stages of adjustment that were applied to the weights.

However, WesVarPC does have the capacity to reflect the final stage of poststratification; to

obtain the poststrata totals that should be input, for each age-sex-race/ethnicity domain,


average the Phase 1 poststrata totals given in Table 2 with the Phase 2 poststrata totals given

in Table 3. For specific instructions on using WesVarPC to create replicate weights, refer to

A User’s Guide to WesVarPC. This manual may be obtained via the Internet at Westat’s

home page (URL: www.westat.com).

In addition to the replicate weights, pseudo-stratum and pseudo-PSU identifiers

along with probabilities are provided, and may be used to calculate variance estimates using

standard linearization software such as SUDAAN (developed by Research Triangle Institute),

PC-CARP (developed by the Iowa State University), or OSIRIS (developed by the University

of Michigan).

Occasionally, analysts may wish to compute estimates based on only Phase 1 or

only Phase 2 data. This could occur if certain data items were collected in one phase of the

survey, but not collected in the other phase. In addition, analysts may wish to compare an

estimate based only on Phase 1 data with the corresponding estimate based on only Phase 2

data. These applications create special problems for variance estimation.

NHANES III was designed with 2 PSUs selected per stratum. Each of the two

selected PSUs in a stratum was randomly assigned to either Phase 1 or Phase 2 of the data

collection. Thus, each phase has only one PSU per stratum in the sample. In order to

compute variance estimates for only one phase, strata must be collapsed, or paired, so that an

implied two-PSUs-per-stratum design exists in each phase. Because this is not how the

sample was actually designed, an additional between-PSU component of variation is

artificially introduced, and variance estimates based on the collapsed strata are slight

over-estimates of the “true” sampling variances. Furthermore, the degrees of freedom for

estimating the variances in only one phase are reduced by one-half. This makes the variance

estimates less stable; that is, the variance of the variance estimates is increased.

If data are available from all 6 years of data collection, but separate phase

estimates are desired, it is advisable to calculate the variance (or relative variance) estimates

based on the true survey design and the 6 years of data as discussed above. The variance (or

rel-variance) for an estimate based on one phase of data is then taken to be twice the variance

(or rel-variance) of the 6-year estimate. For example, if $X_t$ is an estimate based on the 6

years of data, with variance estimate $V_t$, and if $X_1$ is the corresponding estimate based on

only Phase 1 data, then the variance of $X_1$ is taken to be $2V_t$.
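The doubling rule can be written out as below. This is a sketch only: the function names and the numbers are made up, and the relative-variance version reflects one reading of the text (doubling the 6-year rel-variance and rescaling by the squared single-phase estimate).

```python
import math


def single_phase_variance(six_year_variance):
    """Variance for a one-phase estimate, taken as twice the 6-year variance."""
    return 2.0 * six_year_variance


def single_phase_variance_from_relvar(six_year_estimate, six_year_variance,
                                      phase_estimate):
    """Same rule on the relative-variance scale (an interpretation, see lead-in):
    double the 6-year rel-variance, then rescale by the phase estimate squared."""
    six_year_relvar = six_year_variance / six_year_estimate ** 2
    return 2.0 * six_year_relvar * phase_estimate ** 2


# Made-up values: a 6-year estimate of 20.0 with variance 0.04.
v_t = 0.04
v_1 = single_phase_variance(v_t)   # twice the 6-year variance
se_1 = math.sqrt(v_1)              # standard error grows by a factor of sqrt(2)
v_1_rel = single_phase_variance_from_relvar(20.0, v_t, 22.0)
```

The two versions agree when the single-phase estimate equals the 6-year estimate; they differ when the phase estimate is noticeably larger or smaller.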


If data are available in only one phase of the survey, then a paired (collapsed)

strata estimate of variance must be used. This will provide a slight over-estimate of the

sampling variance. For the NHANES III survey, paired strata for both Phase 1 and Phase 2

are available. The SUDAAN software can use the pairings directly to produce linearized

variance estimates. WesVarPC can be used to create simple replicate weights based on the

paired strata, to produce BRR variance estimates. Again, no matter what procedure is used

for individual phase variance estimates, there will be problems related to the stability of the

variance estimates. It is suggested that some generalized variance function technique, such as

relative variance curves or average design effect models, be employed to smooth the unstable

variance estimates.


Exhibit 2. NHANES III Weighting Flow Chart: Phase 1+2 Replicate Weights

[Flow chart. Only the box labels are recoverable, and some are incomplete. Parallel branches start from the Phase 1 basic weight and the Phase 2 basic weight.]

Starting weights:
    Phase 1 Basic Weight
    Phase 2 Basic Weight

Processing steps appearing on the branches:
    Adjust for new construction
    Trim
    Construct replicate basic weights using Fay's method (K=0.3)
    Adjust for MEC examination nonresponse by: age, race/ethnicity, household size,

Resulting weights:
    Poststratified basic replicate weights
    Poststratified interview replicate weights
    Poststratified MEC examination replicate weights


REFERENCES

Efron, B. (1982). The Jackknife, the Bootstrap, and Other Resampling Plans. Philadelphia: Society for Industrial and Applied Mathematics.

Ezzati, T. and Khare, M. (1991). “Consideration of Health Variables to Adjust Sampling Weights for Nonresponse in a National Health Survey,” Proceedings of the Social Statistics Section of the American Statistical Association, 203-208.

Ezzati, T. and Khare, M. (1992). “Nonresponse Adjustments in a National Health Survey,” Proceedings of the Survey Research Methods Section of the American Statistical Association, 339-344.

Judkins, D.R. (1990). "Fay's Method for Variance Estimation," Journal of Official Statistics, 6, 3, 223-239.

Kass, G.V. (1980). "An Exploratory Technique for Investigating Large Quantities of Categorical Data," Applied Statistics, 29, 2, 119-127.

Kovar, J.G., Rao, J.N.K., and Wu, C.F.J. (1988). "Bootstrap and Other Methods to Measure Errors in Survey Estimates," The Canadian Journal of Statistics, 16 Supplement, 25-45.

Lee, K.H. et al. (1989). Analyzing Complex Survey Data. Newbury Park, CA.

McCarthy, P.J. (1966). Replication: An Approach to the Analysis of Data from Complex Surveys. Vital and Health Statistics, Series 2, No. 14, National Center for Health Statistics, Public Health Service, Washington, D.C.

Vital and Health Statistics, Series 1, Number 32, July 1992. Plan and Operation of the Third National Health and Nutrition Examination Survey, National Center for Health Statistics.

Vital and Health Statistics, Series 2, Number 113, September 1992. Sample Design: Third National Health and Nutrition Examination Survey, National Center for Health Statistics.

Rao, J.N.K., Wu, C.F.J., and Yue, K. (1992) "Some Recent Work on Resampling Methods for Complex Surveys," Survey Methodology, 18, 3, 209-217.

Wolter, K.M. (1985). Introduction to Variance Estimation, New York: Springer-Verlag.

