Cigarette smoking and self-reported health in China · 2010. 5. 3. · Cigarette smoking and...

Cigarette smoking and self-reported health in China

[Accepted, forthcoming, China Economic Review]

Steven T. YEN a,*, W. Douglass SHAW b, and Yan YUAN c

a Department of Agricultural Economics 308D Morgan Hall

University of Tennessee Knoxville, TN 37996-4518, USA

b Department of Agricultural Economics

Texas A&M University TAMU 2124 / Blocker Building

College Station, TX 77843-2124, USA

c The Research Institute of Economics and Management Southwestern University of Finance and Economics

Chengdu, China

The authors thank Thomas McGuire, Richard Dunn and Ximing Wu for their comments on an earlier version of this manuscript, and Paul Jakus, Mary Riddel, and V. Kerry Smith for valuable comments on a related paper on smoking behavior. Shaw is also Research Fellow, Hazard Reduction and Recovery Center at A&M, and acknowledges support from the W-2133 USDA/Hatch Project. An earlier version of this paper was presented at the 2009 Far East and South Asia Meeting of the Econometric Society, Tokyo, August 3-5. * Corresponding author.

E-mail address: [email protected] (S.T. Yen).


Abstract

The effect of cigarette smoking on self-reported or self-assessed health (SAH) has been

considered in several studies, with some surprising results, but smoking behavior has received

less attention in countries like China than in the United States and various European

countries. In this manuscript the variation in an ordinal endogenous SAH variable is modeled

with an ordinal endogenous cigarette smoking variable, using the copula approach to

accommodate skewness in the error distribution. The treatment approach avoids several selection

issues that could bias empirical estimates. The empirical model is estimated for a random sample

of adult males from nine Chinese provinces in the 2006 China Health and Nutrition Survey. The

results for our sample suggest that heavy smokers are more likely to report excellent health.

Governments and health policymakers might target heavy smokers with the message that

quitting does result in benefits, keeping in mind that their own reported assessment of their

health is itself a function of several factors.

Keywords: China; Self-assessed health; Smoking

JEL Classification: I10, C31


1. Introduction

Are cigarette smokers more likely to state that their health is good or bad than non-

smokers? They can answer either way, as the relationship is complicated. People who believe

themselves to be healthy may know about the dangers of smoking, and have stopped, or never

started smoking. Some healthy people may also simply believe that they can handle the potential

negative effects of smoking, continuing smoking even when they know the risks. People who are

self-assessed to be less healthy may get the message about the dangers of smoking and have

stopped, but there are also many sick people who may not have stopped, for various reasons.

Perceptions about the risks of disease or death likely play a role here, as does addiction.

We investigate the complicated relationship between self-assessed health (SAH) and

smoking, using micro-level data for a sample of Chinese men. Cigarette smoking behavior in

lower-income countries including China has received less attention from economists (Lance,

Akin, Dow, & Loh, 2004) than in other countries, although studies exist for ethnic

Chinese populations in Taiwan (Liu & Hsieh, 1995; Hsieh, 1998) and Hong Kong (Ho, Lam, Fielding, &

Janus, 2003). Examining this relationship may help health officials who are trying to

communicate the risks of smoking to people in China. Identifying the

characteristics of smokers and their health status will help officials learn what more they might

do to communicate knowledge that people can use to gauge their own health and consider

smoking risks.

Cigarette smoking is the single most preventable cause of death in the world today.

Worldwide it kills one person every six seconds, causes 1 in 10 deaths among adults, and claims

more than five million lives annually (Mathers & Loncar, 2006; WHO, 2009). Smoking not only

causes premature deaths but also leads to several diseases which may not necessarily kill a

person but affect health, such as chronic bronchitis, mucus hypersecretion, bladder cancer, and


peptic ulcer disease (Samet, 2001). Yet, cigarette or tobacco use remains common throughout the

world, with more than a quarter of the adult population smoking in many countries.

With more than 320 million smokers consuming 30% of the world’s cigarette production,

China is the largest producer and consumer of tobacco (Mackay, 1997; WHO, 2009). Per capita

cigarette consumption in China rose dramatically from the early 1970s to the early 1990s, and

deaths due to second-hand smoke are estimated to be over 100,000 per year (China Ministry of

Health, 2007). In China, smoking is largely an activity for males. Estimates from the China

Health and Nutrition Survey (CHNS) suggest that 59.6% of men and 5.1% of women were

current smokers in 1993, and the percentages decreased to 48.9% and 3.2%, respectively, by

2003 (Qian et al., 2010). Statistics from the 2006 CHNS, the sample used in the current study,

suggest that 53.3% of men were current smokers, compared to 3.7% of women. Because of the low

prevalence of smoking among women, our paper investigates the demand for cigarettes by men

in China.

It is expensive to obtain medical professionals’ assessments of a subject’s health

condition, or to do extensive objective tests of overall health on a patient. Therefore, it is

common to rely on survey questionnaires or interviews that simply ask individuals to rate or

assess their own health on a scale, yielding SAH. The question typically is phrased as, “On a

scale from 1 (poor) to 4 (excellent), how would you rate your health?” When respondents can see

the question (in mail, internet, or in-person surveys that use visual devices) they might be

instructed to circle one of the four or more discrete (ordinal) responses, i.e., numbers on a Likert

scale. Sometimes the SAH is simply a binary variable indicating excellent/good health versus

mediocre or poor health, but the data set we use offers information on a 1 to 4 scale. We posit

that the SAH variable is endogenous, and explain its variation with several variables, including

cigarette smoking. Smoking is a choice that people make, and can be a variable that is


endogenous in our model, so we link the SAH to an ordinal smoking participation equation. We

use a very general econometric model that accommodates dependent and potentially skewed

error distributions using the copula approach.

Below we briefly review related literature on the SAH measure and studies of smoking

behavior. Then we describe the data and report sample statistics in section 3. The empirical

model is featured in section 4, followed by estimation results in section 5. The final section

concludes.

2. Empirical literature

2.1 Self-assessed health

SAH, as a health measure, has been the topic of many previous studies (Au, Crossley, &

Shellhorn, 2005; Baker, Stabile, & Deri, 2004; Butler, Burkhauser, Mitchell, & Pincus, 1987;

Cai & Kalb, 2006; Campolieti, 2002; Case & Paxson, 2005; Dwyer & Mitchell, 1999; Etilé &

Milcent, 2006; Idler, 2003; Idler & Benyamini, 1997; Moore & Zhu, 2000; Shmueli, 2002; van

Doorslaer & Jones, 2003). Many researchers have examined factors that lead to heterogeneity in

SAH or in reporting it. Reported health might depend on personal characteristics such as gender

and education levels, while “true” health, which is perhaps never observable, does not (Groot,

2000). Exogeneity of SAH has been rejected in many recent studies of various behaviors related

to health status. We focus below on the link between the SAH and cigarette smoking.

2.2 Cigarette smoking and health

The vast literature on cigarette smoking is much too extensive to address here (see

Viscusi, 1992 for an extensive discussion). Researchers have dealt with a host of issues for over

twenty-five years, but much of the in-depth micro-level analysis has been done in the United

States (US) or in a European country. Many smoking studies conducted in developing countries

rely on aggregated and not micro-level data (e.g., Chapman & Richardson, 1990), and aggregate


data have been used to examine the demand for cigarettes in China (e.g., Mao & Xiang, 1997;

Xu, Hu, & Keeler, 1998). These studies are subject to the limitations and empirical concerns

normally associated with aggregated data: explanatory variables are often highly collinear and

there can be substantial simultaneity. In addition, aggregate data generally do not contain

detailed demographic information. Modest overall smoking rates in poorer societies often

disguise high incidences for certain subgroups for whom consumption is highly concentrated,

and many studies also fail to incorporate price variation below the national level, which can be

considerable in developing countries (Lance et al., 2004).

Micro-level data have been used recently in studies of cigarette demand in China (Lance

et al., 2004; Mao & Xiang, 1997) and smoking status in Taiwan (Hsieh, 1998; Liu & Hsieh,

1995) and Hong Kong (Ho et al., 2003). Important issues in microdata modeling of smoking are

summarized in Sloan, Smith, and Taylor (2003), and earlier, by Jones (1994). The issues include

whether smokers are addicts, and if so, whether the addiction is rational (Becker, Grossman, &

Murphy, 1994; Chaloupka, 1991; Chaloupka & Warner, 2000), and as in the Taiwan studies, the

role of risk perceptions in the decision to smoke (which builds on earlier work by Viscusi

(1990)). As reported and perceived risks are not available in our data, the large number of studies

that deal with perceived risks is not reviewed here.

Some studies have explored the effect of smoking on SAH (suggesting that smoking causes

variation in the SAH), some the reverse (the SAH causes variation in smoking behavior), while

others have explored simultaneity in the relationship. In an early exploration, Blaylock and

Blisard (1992) consider the possibility of simultaneity between health status, the decision to be a

current smoker, to quit, and the quantity of cigarettes consumed. The SAH variable used by these

authors is binary (good or not good), as are the smoking participation and quit decision variables.

Using a sample of women from the Continuing Survey of Food Intakes by Individuals in the US,


they find SAH did not influence the probability of smoking or quitting.

Jones (1994) noted that the smoking-health relationship is potentially obscured by many

unobservable variables, such as an individual's level

of stress. For example, an individual with a genetic predisposition toward anxiety might feel

stressed and smoke because of that personal trait, which may be quite difficult to measure, but

which may be correlated with an observed health measure. Using a sample of British adults from

the Health and Lifestyle Survey (HALS), Jones (1994) finds that individuals with poor or fair

SAH are less likely to have quit smoking than those in better health.1

Shmueli (1996) questioned Jones’ (1994) result because people who quit

might have done so at different times, causing unobserved heterogeneity in SAH. Using a

sample of the elderly from Israel, he finds that the effect of SAH (binary) on smoking varies,

depending on the four-year cohort considered. Income is used as an identifying instrument. For a

period of 1966 to 1985 (for only the last group of smokers from 1981 to 1985), present health

was an exogenous variable with a negative effect, with sicker (and not healthier) smokers

quitting smoking.

Jones (1996) notes, in a response, that a variable indicating a disability may in fact pick

up a curative effect sought by Shmueli (1996); he also questions the use of income as an

instrument, for income is likely correlated with the quitting decision. Using additional data from

a follow-up survey to the 1984-85 HALS and allowing for endogeneity in SAH, Jones (1996)

confirms his earlier finding. However, he notes that those who experienced serious injury or

illness at the end of the period of analysis are indeed more likely to quit smoking.

Ho et al. (2003) examine a cross section of women and men surveyed in Hong Kong in

1 Jones (1994) and Sloan et al. (2003) considered more complicated decisions to stop smoking, but such modeling requires panel data, which we do not have.


the mid-1990s. They note that, at that time, the proportion of former smokers among those who

had smoked at some point was higher than that in Mainland China. Most of their sample report

being in good or very good health, but 5% of men and about 8% of women report being in poor

health. Using odds ratio models, these authors consider the role of smoking in reporting poor or

very poor health, adjusted for age, alcohol consumption, exercise (in the past 30 days),

education, marital status, and place of birth. Those who had never smoked had better perceived

health than those currently smoking, but those who had quit (had smoked before) had the worst

perceived health, for both genders.

Most recently, Contoyannis and Jones (2004) consider the role of several “lifestyle”

behaviors in SAH. The lifestyle behaviors that may affect health include diet (a breakfast

indicator), smoking (current or not), exercise (binary), sleep, alcohol consumption, and stress.

Their data again come from the original HALS used by Jones (1994), supplemented with a

follow-up source of panel data. They take advantage of exogenous variables from the follow-up

data to model the lifestyle variables from the earlier data, and use lifestyle variables to explain

SAH. Allowing for unobserved heterogeneity, they find that non-smoking has a large and

positive effect on the probability of reporting excellent or good health (SAH = 1 in their binary coding). We conclude,

based on the above literature and debates, that the jury is still out on how smoking affects the

SAH, that this may ultimately be an empirical outcome that varies across samples, and most

importantly, that findings for our sample will contribute to knowledge about this for men in

China.

3. Data and sample

Our data come from the 2006 China Health and Nutrition Survey (CHNS). The survey

was designed to examine the effects of the health, nutrition, and family planning programs

implemented by national and local governments and to see how the social and economic


transformation of the Chinese society is affecting the health and nutritional status of its

population. The CHNS is a longitudinal survey that covers Guangxi Zhuang and eight other

provinces with substantial variation in geography, economic development, public resources, and

health indicators, with waves in 1989, 1991, 1993, 1997, 2000, 2004, and 2006. The nine provinces accounted

for approximately 42% of China’s population in 2006. The surveys collected data on consumer

goods as well as detailed information on measures of health outcomes, such as height, weight,

blood pressure, activities of daily living, SAH status, morbidity, physical function limitations,

and disease history.

A multistage, random cluster process was used to draw the sample surveyed in each of

the provinces. Counties in the nine provinces were stratified by income level (low, middle, and

high) and a weighted sampling scheme was used to randomly select four counties (one in low,

two in middle, and one in high-income levels) from each province. In addition, the provincial

capital and a lower income city were selected. Villages and townships within the counties and

urban and suburban neighborhoods within the cities were selected randomly. Currently, there are

about 4,400 households in the survey, covering some 19,000 individuals. Further details and

updates of the survey are described elsewhere (CHNS, 2007).

A sample of males from the 2006 CHNS adult survey is used for this study. Other years

of the CHNS are not used because of problems that may arise when estimating limited dependent

variable models with short panels; see Greene (2004), who suggests that ignoring panel

approaches might be best for such data. To focus the comparison between current smokers and

those that have never smoked (“never smokers”), former smokers are excluded from the sample.

After eliminating observations with missing values, 2,908 men remained in the final sample.

Definitions of variables and their descriptive statistics are presented in Table 1. The first

endogenous variable used in this analysis is an indicator of cigarette smoking behavior. The


survey contains information on the number of cigarettes smoked per day, and therefore cigarette

consumption could be modeled as an integer-count variable. However, because the number of

cigarette packs smoked is clustered in these data, with pile-ups at 0.5, 1, 1.5, and 2

packs, we recode the cigarette pack quantity into an ordinal variable representing the decision

whether to smoke and how much (with values from 1 to 4). Ordinal response models are found to

perform better in modeling cigarette demand with such clustered data (Kasteridis, Munkin, &

Yen, 2010). The other endogenous variable in the analysis, SAH, is collected as an ordinal

variable, ranging from poor (coded as 1) to excellent (coded as 4). The average SAH is about

2.7, with the largest frequency falling in category 3 (good health).
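The recoding of daily cigarette counts into the four-level ordinal smoking variable can be sketched as follows. This is a minimal illustration, not the authors' code: level 1 (non-smoker) and level 2 (1 to 10 cigarettes) follow the categories the paper names when discussing Table 4, while the cutoff of 20 cigarettes between levels 3 and 4, the function name `smoking_category`, and the sample values are assumptions made here for illustration.

```python
def smoking_category(cigs_per_day: int) -> int:
    """Map cigarettes per day to an ordinal smoking level from 1 to 4.

    Level 1 = non-smoker and level 2 = 1 to 10 cigarettes follow the text.
    The cutoff of 20 between levels 3 and 4 is an ASSUMPTION for illustration.
    """
    if cigs_per_day < 0:
        raise ValueError("cigarettes per day cannot be negative")
    if cigs_per_day == 0:
        return 1
    if cigs_per_day <= 10:
        return 2
    if cigs_per_day <= 20:  # assumed cutoff, not taken from the paper
        return 3
    return 4


# Hypothetical daily counts at the pile-up points (0, 0.5, 1, 1.5, 2 packs of 20)
sample = [0, 10, 20, 30, 40]
levels = [smoking_category(c) for c in sample]
print(levels)
```

Collapsing the counts this way discards within-category variation, which is the price paid for a response variable that is not dominated by the pile-ups at half-pack multiples.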

Table 2 reports two-way frequencies between SAH and smoking, and illustrates the

difficulty in spotting a simple relationship between these two variables, although there are a few

suggestive cells of high frequency [e.g. non-smokers (smoking = 1) and those in good health

(SAH = 3); heavy smokers (smoking = 4) and those in excellent health (SAH = 4)]. Several cell

frequencies appear puzzling, such as the high count of moderate smokers (smoking = 3) in good

health. Further exploration of this frequency distribution requires a formal modeling

procedure.

Continuous explanatory variables include age, income, and household size, which have

been shown to be important in health-related studies (e.g. Deshazo & Cameron, 2005). A

common problem in cross-sectional studies is a lack of variation in cigarette prices. As our data

include people from many provinces and regions, community-level prices were collected for

local markets and merged to the sample. Community-level prices in poorer, developing countries

can provide legitimate cross-sectional variation that identifies demand (Deaton, 1997). The

extraneous price data exhibit some variation, with a mean of RMB 4.85 and a standard deviation

of RMB 2.75 per pack. Binary explanatory variables include categories of education, which are


found to play a role in cigarette smoking (e.g., Viscusi, 1995), and the region of residence,

marital status, and some indicators of alcohol drinking behavior.2

4. An ordinal health model with an ordinal endogenous treatment

A treatment effect model accommodates the non-random selection of individuals into the

“treated” state and avoids statistical bias in empirical estimates caused by such non-random

selection. Most sample selection and treatment effect models have heretofore been estimated

with Gaussian (jointly normal) error disturbances. Normality of the error disturbances generally

cannot be justified by economic theory and misspecification of the distribution can lead to

inconsistent empirical estimates and misleading inference. To avoid biases caused by such

distributional misspecification, we develop an ordered probability treatment effect model with a

more flexible error distribution. Below we specify the model, first building its likelihood

function without a specific distributional assumption for the error terms. Following this the error

distribution is specified by the copula approach (Nelsen, 2006), which requires a marginal

cumulative distribution function (cdf) (henceforth, margin) for each error term and a copula

function which links the margins. In all that follows, observation subscripts are suppressed for

brevity. The model consists of an ordinal treatment equation for cigarette smoking (y1)

$$y_1 = j \quad \text{if } \mu_{j-1} \le z'\alpha + u_1 < \mu_j, \qquad j = 1, \ldots, J \tag{1}$$

and an ordinal outcome equation for SAH (y2) which includes observed cigarette level as a

regressor:

$$y_2 = k \quad \text{if } \xi_{k-1} \le x'\beta + \delta y_1 + u_2 < \xi_k, \qquad k = 1, \ldots, K \tag{2}$$

2 An anonymous reviewer suggested control for occupations, as being in certain types of jobs may be associated with smoking for social reasons, or because of stress, or [we add] because of correlations with risk preferences. However, we explored this for six types of occupations and the only significant one was for those who were professional or administrative managers, which was in turn strongly linked to being employed for wages.


where z and x are vectors of explanatory variables, α and β are conformable vectors of

parameters, and the µ’s and ξ’s are threshold parameters such that $\mu_0 = -\infty$, $\mu_1 = 0$, $\mu_J = \infty$,

$\xi_0 = -\infty$, $\xi_1 = 0$, $\xi_K = \infty$, and $\mu_2, \ldots, \mu_{J-1}$ and $\xi_2, \ldots, \xi_{K-1}$ are estimable. The random errors u1, u2

are bivariate (not necessarily normally) distributed with zero means, unitary variances, and a

correlation structure specified below. Besides a different error distribution, the specification

above is an extension of the conventional treatment effect model (Barnow, Cain, & Goldberger,

1980) in that the treatment variable y1 is ordinal (vs. binary) and the outcome variable (y2) is also

ordinal rather than continuous. The likelihood function for an independent sample is

$$L = \prod_{\text{all}} \prod_{j=1}^{J} \prod_{k=1}^{K} \Pr(y_1 = j, y_2 = k)^{1(y_1 = j,\, y_2 = k)} \tag{3}$$

where “all” indexes the product over sample observations, 1(A) is a binary indicator function

which equals 1 if event A holds, and the bivariate probabilities (likelihood contributions) are

$$\begin{aligned}
\Pr(y_1 = j, y_2 = k) ={}& F(\mu_j - z'\alpha,\; \xi_k - x'\beta - \delta y_1) - F(\mu_{j-1} - z'\alpha,\; \xi_k - x'\beta - \delta y_1) \\
& - F(\mu_j - z'\alpha,\; \xi_{k-1} - x'\beta - \delta y_1) + F(\mu_{j-1} - z'\alpha,\; \xi_{k-1} - x'\beta - \delta y_1), \\
& j = 1, \ldots, J; \quad k = 1, \ldots, K
\end{aligned} \tag{4}$$

such that $F(v_1, v_2) = \Pr(V_1 \le v_1, V_2 \le v_2)$ is a bivariate cdf for standardized random variables V1

and V2 with margins $F_1(v_1) = \Pr(V_1 \le v_1)$ and $F_2(v_2) = \Pr(V_2 \le v_2)$.

4.1 The copulas

To accommodate skewness in the distribution of the error terms u1, u2, each bivariate cdf in Eq.

(4) is specified with the copula approach (Nelsen, 2006).3 We consider two leading

3 A copula, denoted $C(v_1, v_2) = C[F_1(v_1), F_2(v_2)]$, is a dependence function that can be used to generate joint distributions of random variables V1 and V2 with specific margins $F_1(v_1)$ and $F_2(v_2)$.


specifications: the Frank and Gaussian (bivariate normal) copulas (Nelsen, 2006). The Frank

copula was used in a slightly different context, in a recent paper in this journal by Yen, Yuan,

and Liu (2009). Here we present the Gaussian copula, defined as

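$$\begin{aligned}
C(F_1, F_2; \theta) &= \Psi[\Phi^{-1}(F_1), \Phi^{-1}(F_2); \theta] \\
&= \int_{-\infty}^{\Phi^{-1}(F_1)} \int_{-\infty}^{\Phi^{-1}(F_2)} \frac{1}{2\pi (1 - \theta^2)^{1/2}} \exp\left[ \frac{-(s^2 - 2\theta s t + t^2)}{2(1 - \theta^2)} \right] ds\, dt
\end{aligned} \tag{5}$$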

where $\Phi^{-1}$ is the inverse univariate standard normal cdf and Ψ is the standard bivariate normal

cdf with correlation parameter θ bounded within [–1, 1]. This is the copula used by Lee (1983)

in developing sample selectivity models with continuous but non-normal distributions.

4.2 The margins

Most alternatives to the Gaussian copula admit skewness in the random errors even with

symmetric margins. Additional skewness can be accommodated by using skewed margins. We

consider two forms of margins for F1 and F2. The first is the benchmark univariate Gaussian cdf

and the other is the generalized log-Burr cdf for standardized random variable ui with skewness

κi (Burr, 1942; Lawless, 2003)

$$F_i(u_i; \kappa_i) = 1 - (1 + \kappa_i e^{u_i})^{-1/\kappa_i}, \qquad -\infty < u_i < \infty, \quad i = 1, 2. \tag{6}$$

The generalized log-Burr distribution includes the logistic (κi = 1) and extreme value (κi → 0)

distributions as special cases; it can deliver very different probabilities even within a moderate

range of skewness.
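The two special cases of the generalized log-Burr margin can be checked numerically. The sketch below implements the cdf in Eq. (6) and compares it with the logistic cdf at κ = 1 and with the extreme value cdf, 1 − exp(−e^u), at a very small κ. The function names and the evaluation points are illustrative choices, not part of the paper.

```python
import math


def log_burr_cdf(u: float, kappa: float) -> float:
    """Generalized log-Burr cdf of Eq. (6): 1 - (1 + kappa*e^u)^(-1/kappa), kappa > 0."""
    if kappa <= 0:
        raise ValueError("kappa must be positive; see extreme_value_cdf for the limit")
    return 1.0 - (1.0 + kappa * math.exp(u)) ** (-1.0 / kappa)


def logistic_cdf(u: float) -> float:
    """Logistic cdf, the kappa = 1 special case."""
    return 1.0 / (1.0 + math.exp(-u))


def extreme_value_cdf(u: float) -> float:
    """Limit of the log-Burr cdf as kappa -> 0: 1 - exp(-e^u)."""
    return 1.0 - math.exp(-math.exp(u))


for u in (-2.0, 0.0, 1.5):
    print(
        u,
        log_burr_cdf(u, 1.0), logistic_cdf(u),          # should agree
        log_burr_cdf(u, 1e-6), extreme_value_cdf(u),    # should nearly agree
    )
```

Evaluating the same point under different κ values (for example u = 0 gives 0.5 under the logistic case but about 0.63 under the extreme value case) shows how strongly the skewness parameter can shift the implied probabilities.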

To demonstrate the copula approach in the present context, for a model with Gaussian

copula and generalized log-Burr margins (henceforth, Gaussian-Burr model), our preferred

specification, the first probability on the right-hand side of Eq. (4) is obtained by substituting

$F_1(u_1; \kappa_1)$ and $F_2(u_2; \kappa_2)$ from Eq. (6) into the copula in Eq. (5):

$$\begin{aligned}
F(\mu_j - z'\alpha,\; \xi_k - x'\beta - \delta y_1) = \Psi\{ & \Phi^{-1}[1 - (1 + \kappa_1 \exp(\mu_j - z'\alpha))^{-1/\kappa_1}], \\
& \Phi^{-1}[1 - (1 + \kappa_2 \exp(\xi_k - x'\beta - \delta y_1))^{-1/\kappa_2}];\; \theta \}.
\end{aligned} \tag{7}$$

The specific forms of the remaining probabilities in the likelihood contribution Eq. (4) and for

alternative copulas can be derived in like manner, with the threshold parameters

µ and ξ indexed as in Eq. (4).
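To make the construction of the likelihood contributions concrete, the sketch below evaluates the rectangle probabilities of Eq. (4) for a single observation. It uses the Frank copula in its standard closed form (Nelsen, 2006) rather than the Gaussian copula of Eq. (5), purely because the Frank form needs no bivariate normal cdf routine; the margins are the generalized log-Burr cdf of Eq. (6). All parameter values, index values, and function names are hypothetical and chosen only for illustration.

```python
import math


def log_burr_cdf(u, kappa):
    """Generalized log-Burr cdf of Eq. (6), with the limits at +/- infinity handled."""
    if u == math.inf:
        return 1.0
    if u == -math.inf:
        return 0.0
    return 1.0 - (1.0 + kappa * math.exp(u)) ** (-1.0 / kappa)


def frank_copula(a, b, theta):
    """Frank copula C(a, b; theta), theta != 0 (standard closed form)."""
    num = math.expm1(-theta * a) * math.expm1(-theta * b)
    return -math.log1p(num / math.expm1(-theta)) / theta


def joint_cdf(v1, v2, kappa1, kappa2, theta):
    """Bivariate cdf F(v1, v2) built from the copula and the two margins."""
    return frank_copula(log_burr_cdf(v1, kappa1), log_burr_cdf(v2, kappa2), theta)


def cell_probability(j, k, mu, xi, za, xb, delta, kappa1, kappa2, theta):
    """Pr(y1 = j, y2 = k) from Eq. (4); mu and xi hold thresholds indexed 0..J and 0..K."""
    idx2 = xb + delta * j  # outcome index x'beta + delta*y1 evaluated at y1 = j

    def F(m, x):
        return joint_cdf(m - za, x - idx2, kappa1, kappa2, theta)

    return (F(mu[j], xi[k]) - F(mu[j - 1], xi[k])
            - F(mu[j], xi[k - 1]) + F(mu[j - 1], xi[k - 1]))


# Hypothetical thresholds (mu_0 = xi_0 = -inf, mu_1 = xi_1 = 0, mu_J = xi_K = +inf)
mu = [-math.inf, 0.0, 0.8, 1.6, math.inf]
xi = [-math.inf, 0.0, 1.0, 2.2, math.inf]
params = dict(za=0.3, xb=0.9, delta=-0.2, kappa1=0.1, kappa2=1.0, theta=2.0)

cells = [[cell_probability(j, k, mu, xi, **params) for k in range(1, 5)]
         for j in range(1, 5)]
total = sum(sum(row) for row in cells)
print(total)  # the 16 cell probabilities should sum to one up to rounding
```

The sum-to-one check works because the four-term differences in Eq. (4) telescope over k and then over j; it is a useful sanity test when coding any alternative copula or margin.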

4.3 Average treatment effects

An important reason for estimating a treatment effect model is calculation of average

treatment effects (ATEs). Unlike conventional models in which a treatment is specified as an

exogenous discrete variable, an endogenous treatment effect model accommodates endogeneity

of the treatment and non-random selection of individuals into the treated states, thus eliminating

potential sample selection bias and providing consistent estimates of the treatment effects and the

effects of other explanatory variables. Eqs. (1) and (2) give the marginal probabilities

$$\Pr(y_1 = h) = F_1(\mu_h - z'\alpha; \kappa_1) - F_1(\mu_{h-1} - z'\alpha; \kappa_1) \tag{8}$$

$$\Pr(y_2 = k) = F_2(\xi_k - x'\beta - \delta y_1) - F_2(\xi_{k-1} - x'\beta - \delta y_1). \tag{9}$$

Using the joint probability in Eq. (4) and the marginal probability in Eq. (8), we have the

conditional probability

$$\Pr(y_2 = k \mid y_1 = h) = \Pr(y_1 = h, y_2 = k) / \Pr(y_1 = h). \tag{10}$$

Using Eq. (10), the treatment effects can be calculated as

$$TE_{khg} = \Pr(y_2 = k \mid y_1 = h) - \Pr(y_2 = k \mid y_1 = g), \quad \text{for all } h > g, \; k = 1, \ldots, K \tag{11}$$

which are the effects of being in smoking category h (in reference to category g) on the

probability of being in the kth SAH category for all k. All ATEs are calculated by averaging the

component effects across the sample.
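The calculation in Eqs. (10) and (11) can be sketched as follows, assuming that each individual's joint probabilities Pr(y1 = h, y2 = k) have already been computed from the fitted model and stored as a matrix with one row per smoking category. The two small matrices below are made-up numbers for a hypothetical two-category example, the function names are illustrative, and indices are 0-based, so row 0 corresponds to the lowest smoking category.

```python
def conditional(joint, h):
    """Pr(y2 = k | y1 = h) for all k, Eq. (10); joint[h][k] holds Pr(y1 = h, y2 = k)."""
    row_total = sum(joint[h])  # the marginal Pr(y1 = h), as in Eq. (8)
    return [p / row_total for p in joint[h]]


def treatment_effects(joint, h, g):
    """TE_khg of Eq. (11) for one individual, for every outcome category k."""
    cond_h, cond_g = conditional(joint, h), conditional(joint, g)
    return [a - b for a, b in zip(cond_h, cond_g)]


def average_treatment_effects(joints, h, g):
    """Average the individual effects across the sample."""
    effects = [treatment_effects(joint, h, g) for joint in joints]
    n = len(effects)
    return [sum(e[k] for e in effects) / n for k in range(len(effects[0]))]


# Hypothetical joint probabilities for two individuals (rows: y1, columns: y2)
person_a = [[0.20, 0.20], [0.15, 0.45]]
person_b = [[0.10, 0.40], [0.20, 0.30]]

ate = average_treatment_effects([person_a, person_b], h=1, g=0)
print(ate)
```

Because each set of conditional probabilities sums to one, the effects for a given pair (h, g) sum to zero across the outcome categories, which is a convenient check on any table of ATEs.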


4.4 Marginal effects of explanatory variables

Drawing on the marginal probability in Eq. (9) and conditional probability in Eq. (10),

we calculate the marginal effects of exogenous variables on SAH probabilities by differentiating

(or differencing, in the case of a binary explanatory variable) the marginal probabilities

$\Pr(y_2 = 1)$ and $\Pr(y_2 = 4)$, and conditional probabilities $\Pr(y_2 = 1 \mid y_1 = 1)$, $\Pr(y_2 = 1 \mid y_1 = 4)$,

$\Pr(y_2 = 4 \mid y_1 = 1)$, and $\Pr(y_2 = 4 \mid y_1 = 4)$. For statistical inference, standard errors of all

marginal effects (and ATEs) are calculated by the delta method (Serfling, 1980).
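A minimal sketch of the delta method is given below: the variance of a smooth function g of the estimated parameters is approximated by the quadratic form of its gradient with the parameter covariance matrix. The paper does not say whether analytical or numerical derivatives were used; central finite differences are assumed here, and the example function, parameter values, and covariance matrix are hypothetical.

```python
import math


def numerical_gradient(g, theta, step=1e-6):
    """Central finite-difference gradient of g at theta (an assumed implementation choice)."""
    grad = []
    for i in range(len(theta)):
        up = list(theta)
        dn = list(theta)
        up[i] += step
        dn[i] -= step
        grad.append((g(up) - g(dn)) / (2.0 * step))
    return grad


def delta_method_se(g, theta, cov, step=1e-6):
    """Standard error of g(theta_hat): sqrt(grad' * cov * grad)."""
    grad = numerical_gradient(g, theta, step)
    n = len(theta)
    var = sum(grad[i] * cov[i][j] * grad[j] for i in range(n) for j in range(n))
    return math.sqrt(var)


def prob_lowest_category(theta):
    """Hypothetical example: Pr(y <= 1) under a logistic margin, threshold 0, index theta[0] + theta[1]."""
    index = theta[0] + theta[1]
    return 1.0 / (1.0 + math.exp(index))


theta_hat = [0.4, -0.1]                   # hypothetical estimates
cov_hat = [[0.04, 0.01], [0.01, 0.09]]    # hypothetical covariance matrix
se = delta_method_se(prob_lowest_category, theta_hat, cov_hat)
print(se)
```

The same routine applies to a marginal effect or an ATE once it is written as a function of the full parameter vector.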

5. Empirical results

First, we explore a number of alternative modeling options. We estimate a Gaussian

binary treatment (smoking) effect model, assuming that SAH is a continuous variable (Barnow et

al., 1980). We also estimate an ordinary least-squares model with exogenous smoking, and two

ordered probit models with exogenous smoking (with the Gaussian and generalized log-Burr

distributions). These latter two models are restricted forms of the preferred Gaussian-Burr and

Frank-Burr models (by imposing zero error correlation). These alternative empirical models lead

to different results than those reported below. The most notable difference is insignificance of

the exogenous smoking variable in the last three models. The Gaussian binary treatment effect

model, though not directly comparable to the probability models considered here, did produce a

significant treatment effect of binary smoking but notably different results on some of the

explanatory variables (e.g., significance of age and its squared term in the SAH equation).

Complete results of these preliminary analyses are not presented due to space considerations but

are available upon request.

Our next task is to choose among alternative copulas and margins. As the models with

different copulas and margins are nonnested, model selection is accomplished with a nonnested

specification test procedure. Specifically, let $r_i$ and $s_i$ be the maximum log-likelihood contributions

of sample observation i for two competing specifications and define differences $d_i = r_i - s_i$ for $i = 1, \ldots, n$

with sample mean $\bar{d}$ and standard deviation $s_d$. Then, under the null hypothesis of no

difference between the two models, Vuong’s (1989, Eqs. (3.1), (4.2), (5.6)) standard normal statistic

is $z = n^{1/2} \bar{d} / s_d \sim N(0, 1)$. The test results suggest that the Frank and Gaussian copulas perform

equally well, and that the generalized log-Burr margin fits the data better when either copula is

used. In sum, the Gaussian-Burr and Frank-Burr models perform equally well (z = 0.29, p-value

= 0.77), and both are preferable to the Gaussian-Gaussian (bivariate Gaussian) and

Frank-Gaussian models, all with z > 2.0 and p-value < 0.04. Most important is the finding that

the Gaussian-Gaussian model, used extensively in empirical studies with sample selection, is

rejected at the 5% level of significance. The Gaussian-Burr and Frank-Burr models produce

fairly close empirical estimates, in terms of average treatment effects and marginal effects of

explanatory variables, and we focus on results of the former for the remainder of the analysis.4
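The nonnested test statistic described above can be computed from the per-observation log-likelihood contributions of two fitted models, as in the sketch below. The standard deviation is taken with divisor n, which is an assumption about how the sample standard deviation is defined; with 2,908 observations the choice between n and n − 1 is immaterial. The log-likelihood values are made up for illustration, and the function names are not from the paper.

```python
import math
from statistics import mean, pstdev


def vuong_z(loglik_a, loglik_b):
    """Vuong statistic z = sqrt(n) * mean(d) / sd(d), with d_i = r_i - s_i."""
    d = [r - s for r, s in zip(loglik_a, loglik_b)]
    return math.sqrt(len(d)) * mean(d) / pstdev(d)  # pstdev uses divisor n (assumed)


def two_sided_p(z):
    """Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|))."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical per-observation log-likelihood contributions from two models
ll_model_a = [-1.20, -0.85, -2.10, -1.55, -0.95]
ll_model_b = [-1.25, -0.80, -2.30, -1.50, -1.05]

z = vuong_z(ll_model_a, ll_model_b)
print(z, two_sided_p(z))

# The reported z = 0.29 corresponds to a two-sided p-value of about 0.77
print(round(two_sided_p(0.29), 2))
```

A positive and significant z favors the first model, a negative and significant z favors the second, and an insignificant z means the data cannot discriminate between them.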

5.1 Maximum-likelihood estimates

Maximum-likelihood estimates for the Gaussian-Burr model are presented in Table 3.

The estimate of the error correlation is positive and significant at the 1% level. Both threshold

parameters are positive and significant at the 1% level; a negative threshold parameter estimate

would have suggested misspecification of the models. The skewness parameter for the cigarette

equation is not significantly different from zero at the 10% level of significance (suggesting the

extreme-value margin) but is significantly different from one (rejecting the logistic margin). The

skewness parameter for the SAH equation shows the opposite pattern: it is significantly different

from zero at the 1% level of significance (rejecting the extreme-value margin) but is not

significantly different from (and is extremely close to) unity, suggesting the logistic distribution

is appropriate.

4 The full set of results for all econometric specifications are available from the authors.


Table 3 contains many interesting coefficients, but moving straight to the variable

relating to the focus of the paper, note that the effect of cigarette smoking, holding other factors

constant, is negative and significant on SAH at the 1% level. The statistical significance of

smoking in the SAH equation suggests that a misspecified model, such as the exogenous models referred to above,

can disguise the important effects of smoking on SAH, and highlights the importance of

accommodating endogeneity of smoking. We more carefully consider the effects of smoking on

SAH by turning to the average treatment effects.

5.2 Average treatment effects

Table 4 reports the ATEs, i.e., the average changes in the probability of each SAH level

for various smoking treatments. All ATEs are significant at the 1% level with only a few

exceptions. These results indicate the average effect of one level of smoking relative to another

level on the probability of falling into a given SAH category. For example, in the upper left-hand

corner, the –5.71 estimate indicates that, relative to being a non-smoker (smoking = 1), a

person who smokes 1 to 10 cigarettes a day (smoking = 2) is 5.71 percentage points less likely to be in poor

self-assessed health (SAH = 1). The diagonal elements show the effects between adjacent smoking

categories, so here, for example, a category 3 smoker is 2.52 percentage points less likely to be in poor

self-reported health than a category 2 smoker. Similarly, in the lower right-hand corner, a person who

smokes very heavily (smoking = 4) has a 21.64 percentage point higher chance of being in excellent

self-reported health (SAH = 4) than a category 3 smoker, and a 28.70 percentage point higher chance of being in the

excellent SAH category than a non-smoker.
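
Because the four SAH probabilities sum to one at every smoking level, the ATEs for a given pair of smoking levels should sum to zero across the SAH categories, up to rounding. The sketch below applies this check to the point estimates in Table 4 as printed; the dictionary layout and function name are illustrative and not part of the paper.

```python
# ATEs (percentage points) from Table 4, keyed by (smoking level, comparison level);
# each list holds the effects on P(SAH = 1), ..., P(SAH = 4) as printed.
ATE = {
    (2, 1): [-5.71, -7.19, 5.84, 7.06],
    (3, 1): [-8.23, -13.58, 7.05, 14.76],
    (3, 2): [-2.52, -6.39, 1.22, 7.70],
    (4, 1): [-10.33, -22.18, 3.82, 28.70],
    (4, 2): [-4.62, -15.00, -2.02, 13.94],
    (4, 3): [-2.10, -8.60, -3.24, 21.64],
}
QUITTING = [-5.99, -13.71, 1.45, 18.25]

def adding_up_gap(effects):
    """Sum of ATEs across SAH categories; zero in theory, small with rounding."""
    return round(sum(effects), 2)

gaps = {pair: adding_up_gap(vals) for pair, vals in ATE.items()}
quit_gap = adding_up_gap(QUITTING)

if __name__ == "__main__":
    for (level, base), gap in gaps.items():
        print(f"smoking {level} vs. {base}: sum across SAH = {gap:+.2f}")
    print(f"quitting: sum across SAH = {quit_gap:+.2f}")
```

With the values as printed, four of the six smoking contrasts and the quitting contrast sum to zero within rounding (0.01), while the 4 vs. 2 and 4 vs. 3 contrasts sum to –7.70 and +7.70. That gap equals the difference between the two SAH = 4 entries (21.64 – 13.94), which would be consistent with those two entries having been exchanged; this is an inference from the arithmetic only and would need to be checked against the original estimates.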

Our result showing a negative effect of smoking on the probability of reporting poor

health (SAH = 1) is similar to the findings by Ho et al. (2003), i.e., that those who had never

smoked had better perceived health than those currently smoking. However, at higher SAH

levels, our ATE results clearly suggest that the level of smoking increases the SAH status. For

our sample of Chinese men, those who smoke more report being healthier than those who smoke

less. Non-smokers may have quit smoking because they were in poor health, or they may have

never started smoking because of poor health. Another possibility is that heavy smokers are

falsely reporting their health as being good or excellent because they have adapted to

whatever negative effects smoking may actually have on their health, and so report their health status quite

differently from those who never smoked or who fully understand the possible negative effects of

smoking. We cannot control for past health using panel data in the manner that Shmueli (1996)

did, but our results from the ATEs are consistent with his.

It is very likely that some men in our sample are former smokers who quit smoking

because they had problems with their health. It is thus worthwhile to investigate the SAH

specifically for current smokers and for former smokers. We accomplish this by estimating a

special case of the same (Gaussian-Burr) model as presented in this paper, but with a binary

quitting treatment (J = 2 in Eq. (1)). We create a subsample of current and former

smokers, excluding non-smokers from this part of the analysis.5 The ATEs for quitting smoking,

also presented in Table 4, do in fact suggest that quitting has negative effects on the probabilities

of reporting low SAH (poor and fair) and a positive effect on the probability of reporting excellent

health. This highlights the importance of not only comparing current smokers to non-smokers, but also

considering former smokers in relation to current smokers.

We cannot read more specific causal stories into the ATEs because details on the underlying

nature of the health status or the timing of starting or quitting smoking are not available in the data

used here. For example, with the exception of a variable indicating hypertension, we do not

know whether individuals who rate themselves as unhealthy do so because they have one or

more of the specific smoking-related diseases (cardiovascular disease, one of the associated

cancers, or non-fatal diseases mentioned in the introduction). The relationships between SAH

and the explanatory variables are explored next by examining the marginal effects,

conditional on the level of smoking.

5 We thank one of the reviewers for suggesting this analysis. Parameter estimates for the current and former smoker sample are available upon request from the authors.

5.3 Marginal effects

The marginal effects of exogenous variables on the probabilities of falling into particular

SAH categories, conditioned on the smoking levels, are calculated at the sample means of all

explanatory variables, and the results are presented in Table 5. The large number of explanatory

variables in the model prohibits discussion of the marginal effects of every variable in this paper,

so we focus on the variables most important for the theme here. The conditional probabilities reflect

the strength of effects for non-smokers and heavy smokers. To begin, age has the usual effect on

health: older people are more (less) likely to be in poor (excellent) health, and the magnitudes

differ only slightly (relative to their standard errors) between the two smoking statuses (non-smokers

and heavy smokers). Being in a larger household increases the chance of reporting

excellent health. There is evidence of regional differences, with residents in Shandong Province,

for instance, being healthier than residents in Liaoning (and many other provinces). Residing in

Guangxi has the opposite effect.
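
As a rough illustration of how a marginal effect "at the sample means" is obtained, the sketch below differentiates the category probabilities of a simple ordered logit with respect to one continuous regressor. This is a deliberately simplified stand-in: it ignores the copula, the Burr margins and the conditioning on smoking used in the paper, and the coefficients, cutpoints and mean vector are hypothetical values chosen only for the example.

```python
import math

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def logistic_pdf(z):
    p = logistic_cdf(z)
    return p * (1.0 - p)

def category_probs(xb, cuts):
    """P(y = 1), ..., P(y = J) for an ordered logit with index xb and ordered cutpoints."""
    cdf = [logistic_cdf(c - xb) for c in cuts] + [1.0]
    return [cdf[0]] + [cdf[j] - cdf[j - 1] for j in range(1, len(cdf))]

def marginal_effects(xb, cuts, beta_k):
    """dP(y = j)/dx_k for each category j, evaluated at index xb."""
    dens = [logistic_pdf(c - xb) for c in cuts]
    lower = [0.0] + dens   # density at the lower cutpoint of each category
    upper = dens + [0.0]   # density at the upper cutpoint of each category
    return [beta_k * (lo - up) for lo, up in zip(lower, upper)]

# Hypothetical values (not estimates from the paper)
beta = [0.08, -0.25]       # coefficients on two regressors
x_bar = [1.28, 4.78]       # sample means of those regressors
cuts = [-3.0, -1.0, 1.5]   # three cutpoints, hence four ordered categories

xb_bar = sum(b * x for b, x in zip(beta, x_bar))
effects = marginal_effects(xb_bar, cuts, beta[0])

if __name__ == "__main__":
    for j, me in enumerate(effects, start=1):
        print(f"dP(y = {j})/dx_1 at the means: {100 * me:+.2f} percentage points")
```

Because the category probabilities sum to one, the marginal effects of any regressor sum to zero across categories, which is why a variable that lowers the probability of poor health must raise the probability of at least one other category.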

Government agencies naturally can influence income via tax and other policies, and this

may affect health for men in China. According to the marginal effects, household income has a

negative effect on the probability of self-reporting poor health, with higher income reducing the

marginal probability of being in SAH = 1, and increasing the probability of being in excellent

health (SAH = 4), both conditional and unconditional on smoking status. Wealthier men report

being healthier, but the marginal effect of income on the probability of being in excellent health

is stronger for non-smokers (0.82 percentage points per 10,000 RMB) than for heavy smokers (0.50 percentage points). We also note

that the income effects on the probability of poor health, conditioned on smoking status (a non-smoker or heavy

smoker), are not significant.

Government policies can affect education levels, which are often found to play a role in

understanding health risks. Several different qualitative education categories are included in the

specification of the models, but for Chinese men, there is no significant marginal effect of

education on health. In other settings, the relationship between education and SAH can be significant. For

example, D'Hombres et al. (2010) find that tertiary (primary) levels of education in their sample

of households from eight transition countries of the former Soviet Union lead to higher (lower)

levels of SAH, as compared to secondary levels. Di Novi (2010) reports a positive effect of

college education on SAH as well in the US. It is possible that the higher levels of education

(high school or more) are required before one sees a strong positive influence of years of

schooling on the SAH. Thus, the insignificance that we find may be due to the fact that the

majority of our sample of men are educated at the junior high school level or less (79%), and only

6% have some college education or more.

Of considerable interest here are the drinking and physical activity variables. Physical

activity has a positive marginal effect on being in the excellent health category, and oddly, so

does drinking occasionally, up to drinking almost daily. Here again, interpretation is limited

because health is self-reported and a factor could be adaptation and perceptions in that reporting,

but note that “frequently” here does not necessarily indicate abuse of alcohol, as moderate

drinking could be one drink per day. Contoyannis and Jones (2004) consider drinking behavior

described as “prudent” as a lifestyle that is also modeled as an endogenous variable in their

analysis, but find no strongly significant inverse relationship to social class: there is only a

hint that social class negatively influences drinking alcohol.

6. Conclusions

In this paper we have developed and estimated a model of the variation in SAH that

formally incorporates an ordinal endogenous smoking variable, applied to a sample of

Chinese men from nine provinces. The importance of this is, first and foremost, that the

estimated model of variation in the SAH, itself assumed to be endogenous, also takes into consideration

the possibility that smoking is an endogenous variable. Exogeneity of the smoking variable

has been maintained in many earlier attempts to explain behaviors. The treatment approach

we apply can, in general, alleviate some problems stemming from selection issues. Indeed,

results of our preliminary analysis demonstrate that failure to accommodate endogeneity of

smoking can disguise an important effect of smoking on SAH.

Though there have been some important studies of smoking in China, Hong Kong and

Taiwan, we add to the empirical literature on smoking in a developing country. Results based on

a one-year sample from the CHNS support the notion that health, as indicated by the SAH, is

better for heavier smokers in this sample of Chinese men, after controlling for age and other socio-demographic

effects. Sick people in our sample may have quit smoking, which is what was found by Shmueli

(1996) for a sample of Israelis. Agencies that send messages about smoking risks may need to

work harder to reach those with lower education levels, who are predominant in our sample. The

CHNS data do not contain information on stopping and starting smoking over time, nor do we

have details on specific diseases of the sample units, so our interpretation is limited to

what can be drawn from this cross-sectional analysis, using sub-categories such as current and

“former” smokers. However, our research here suggests that heavy smokers be targeted by

policy-makers, who should communicate to them that quitting smoking benefits

their health, even if they have smoked for a significant period of time. Policy-makers also need to keep in

mind that self-reported health for this group may have a complicated relationship to other factors.

As mentioned above, we do not use other years from the CHNS or adopt a panel-data

technique because of econometric issues with a short panel and the type of limited dependent

variable models we have (Greene, 2004). Also at the technical/econometric level, the copula

approach we implement here allows for specification of the treatment effect model with more

flexible distributions for the error terms than those used in conventional treatment effect models.

The use of the flexible model allows a convenient way to test other specifications against the

more conventional model. The Gaussian-Gaussian model, used in much of the sample selection

literature, is rejected.
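
To make the copula construction concrete, the sketch below evaluates a Gaussian copula, C(u, v; θ) = Φ2(Φ⁻¹(u), Φ⁻¹(v); θ), by one-dimensional numerical integration and uses it to obtain the joint probability of a rectangle from two sets of marginal CDF values. It is a generic illustration rather than the authors' estimation code: the function names, the integration settings and the marginal probabilities in the example are assumptions, while θ = 0.362 is the dependence estimate reported in Table 3.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def bivariate_normal_cdf(a, b, rho, n=2000, lower=-8.0):
    """P(X <= a, Y <= b) for standard normals with correlation rho (Simpson's rule)."""
    if a <= lower:
        return 0.0
    scale = math.sqrt(1.0 - rho * rho)

    def integrand(x):
        return STD_NORMAL.pdf(x) * STD_NORMAL.cdf((b - rho * x) / scale)

    h = (a - lower) / n  # n must be even for Simpson's rule
    total = integrand(lower) + integrand(a)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(lower + i * h)
    return total * h / 3.0

def gaussian_copula(u, v, theta):
    """Gaussian copula C(u, v; theta) for u, v strictly between 0 and 1."""
    return bivariate_normal_cdf(STD_NORMAL.inv_cdf(u), STD_NORMAL.inv_cdf(v), theta)

def rectangle_prob(u_lo, u_hi, v_lo, v_hi, theta):
    """Joint probability that both margins fall between the given CDF values."""
    def c(u, v):
        if u <= 0.0 or v <= 0.0:
            return 0.0
        # Cap at values just below 1 so that inv_cdf stays finite
        return gaussian_copula(min(u, 1 - 1e-12), min(v, 1 - 1e-12), theta)
    return c(u_hi, v_hi) - c(u_lo, v_hi) - c(u_hi, v_lo) + c(u_lo, v_lo)

theta_hat = 0.362  # dependence parameter from Table 3
# Hypothetical marginal CDF values at two adjacent thresholds of each equation
joint = rectangle_prob(0.40, 0.65, 0.30, 0.80, theta_hat)
independent = (0.65 - 0.40) * (0.80 - 0.30)

if __name__ == "__main__":
    print(f"joint cell probability with theta = {theta_hat}: {joint:.4f}")
    print(f"same cell under independence: {independent:.4f}")
```

Because the margins enter only through their CDF values, the same copula function can be paired with Gaussian, logistic, extreme-value or generalized log-Burr margins, which is the flexibility referred to above.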

A few final additional caveats pertain to the results we present here. First, we use

smoking as a variable that may affect health, formally treated as an endogenous variable, but

recognize that there may also be a fully simultaneous relationship between the SAH and smoking

(e.g., Blaylock & Blisard, 1992). A fully simultaneous relationship is not pursued here as a

simultaneous-equation model with dual observed discrete (binary or ordinal) variables is not

identified (Schmidt, 1981). Second, unlike some other models of smoking behavior, we have not

explored the role of several possible additional influences on smoking decisions. For example,

we have not included information about the subjective smoking-related mortality or morbidity

risks of the smokers and non-smokers (e.g., Viscusi, 1990), because the data set for this part of

China does not contain it. Third, we do not have indicators of perceived addiction, which may be

quite important in determining whether people currently smoke. However, to the extent that

subjective risks of harm or addiction are themselves endogenous variables that are functions of

other explanatory variables, such as gender and age, these other factors, which act as

instruments, are indeed used in our models. Thus, whether the “incompleteness” of our model matters

in the usual econometric sense is not assured, but it is a question worthy of further investigation.

Finally, a number of the regressors we do use, such as drinking and physical activity,

may be potentially endogenous. While the lack of viable instruments does not allow further

exploration of potential endogeneity of these variables, results of an alternative model without

these variables produce few discernable differences in the treatment effects and marginal effects

of other explanatory variables, based on the current sample. Further studies might consider

endogeneity of these additional variables, perhaps when a longer panel data sample becomes

available.

References

Au, D. W. H., Crossley, T. F., & Shellhorn, M. (2005). The effect of health changes and long-

term health on the work activity of older Canadians. Health Economics, 14(10), 999–

1018.

Baker, M., Stabile, M., & Deri, C. (2004). What do self-reported, objective measures of health

measure? Journal of Human Resources, 39(4), 1067–1093.

Barnow, B. S., Cain, G. G., & Goldberger, A. S. (1980). Issues in the analysis of selectivity bias.

In E. W. Stromsdorfer, & G. Farkas (Eds.), Evaluation studies review annual, 5 (pp. 43–

59). Beverley Hills: Sage.

Becker, G. S., Grossman, M., & Murphy, K. M. (1994). An empirical analysis of cigarette

addiction. American Economic Review, 84(3), 396–418.

Blaylock, J. R., & Blisard, W. N. (1992). Self-evaluated health status and smoking behaviour.

Applied Economics, 24(4), 429–435.

Burr, I. W. (1942). Cumulative frequency functions. The Annals of Mathematical Statistics,

13(2), 215–232.

Butler, J. S., Burkhauser, R. V., Mitchell, J. M., & Pincus, T. P. (1987). Measurement error in

self-reported health variables. Review of Economics and Statistics, 69(4), 644–650.

Cai, L., & Kalb, G. (2006). Health status and labour force participation: Evidence from

Australia. Health Economics, 15(3), 241–261.

Campolieti, M. (2002). Disability and the labor force participation of older men in Canada.

Labour Economics, 9(3), 405–432.

Case, A. C., & Paxson, C. (2005). Sex differences in morbidity and mortality. Demography,

42(2), 189–214.

Chaloupka, F. J. (1991). Rational addictive behavior and cigarette smoking. Journal of Political

Economy, 99(4), 722–742.

Chaloupka, F. J., & Warner, K. E. (2000). The economics of smoking. In A. J. Culyer, & J. P.

Newhouse (Eds.), Handbook of Health Economics, Vol. 1 (pp. 1539–1627). Amsterdam:

Elsevier.

Chapman, S., & Richardson, J. (1990). Tobacco excise and declining consumption: The case of

Papua New Guinea. American Journal of Public Health, 80(5), 537–540.

China Health and Nutrition Survey (CHNS) (2007). Design and methods. Chapel Hill, NC:

Carolina Population Center. http://www.cpc.unc.edu/projects/china/design/design.html

(Accessed April 1, 2008).

China Ministry of Health. (2007). China tobacco control report. Beijing, May.

Contoyannis, P., & Jones, A. M. (2004). Socio-economic status, health and lifestyle. Journal of

Health Economics, 23(5), 965–995.

Deaton, A. (1997). The analysis of household surveys. Baltimore: Johns Hopkins University

Press.

DeShazo, J. R., & Cameron, T. A. (2005). The effect of health status on willingness to pay for

morbidity and mortality risk reductions. California Center for Population Research. On-

Line Working Paper Series. Paper CCPR-050-05. URL

http://repositories.cdlib.org/ccpr/olwp/CCPR-050-05/ (Accessed March 8, 2009).

D'Hombres, B., Rocco, L., Suhrcke, M., & McKee, M. (2010). Does social capital determine

health? Evidence from eight transition countries. Health Economics, 19(1), 56–74.

Di Novi, C. (2010). The influence of traffic-related pollution on individuals' life-style: results from

the BRFSS. Health Economics. in press. DOI: 10.1002/hec.1550.

Dwyer, D. S., & Mitchell, O. S. (1999). Health problems as determinants of retirement: Are self-

rated measures endogenous? Journal of Health Economics, 18(2), 173–193.

Etilé, F., & Milcent, C. (2006). Income-related reporting heterogeneity in self-assessed health:

Evidence from France. Health Economics, 15(9), 965–981.

Greene, W. (2004). The behavior of the maximum likelihood estimator of limited dependent

variable models in the presence of fixed effects. Econometrics Journal, 7(1), 98–119.

Groot, W. (2000). Adaptation and scale reference bias in self-assessments of quality of life.

Journal of Health Economics, 19(3), 403–420.

Hsieh, C. (1998). Health risk and the decision to quit smoking. Applied Economics, 30(6), 795–

804.

Ho, S. Y., Lam, T. H., Fielding, R., & Janus, E. D. (2003). Smoking and perceived health in

Hong Kong Chinese. Social Science & Medicine, 57(9), 1761–1770.

Idler, E. L. (2003). Discussion: Gender differences in self-rated health, in mortality, and in the

relationship between the two. Gerontologist, 43(3), 372–375.

Idler, E. L., & Benyamini, Y. (1997). Self-rated health and mortality: A review of twenty-seven

community studies. Journal of Health and Social Behavior, 38(1), 21–37.

Jones, A. M. (1994). Health, addiction, social interaction and the decision to quit smoking.

Journal of Health Economics, 13(1), 93–110.

Jones, A. M. (1996). Smoking cessation and health: A response. Journal of Health Economics,

15(6), 755–759.

Kasteridis, P. P., Munkin, M. K., & Yen, S. T. (2010). A binary-ordered probit model of

cigarette demand. Applied Economics, 42(4), 413–426.

Lance, P. M., Akin, J. S., Dow, W. H., & Loh, C. P. (2004). Is cigarette smoking in poorer

nations highly sensitive to price? Evidence from Russia and China. Journal of Health

Economics, 23(1), 173–189.

Lawless, J. F. (2003). Statistical models and methods for lifetime data. New York: John Wiley &

Sons.

Lee, L-F. (1983). Generalized econometric models with selectivity. Econometrica, 51(2),

507–513.

Liu, J., & Hsieh, C. (1995). Risk perception and smoking behaviour: Empirical evidence from

Taiwan. Journal of Risk and Uncertainty, 11(2), 139–157.

Mackay, J. (1997). Beyond the clouds – Tobacco smoking in China. Journal of the American

Medical Association, 278(18), 1531–1532.

Mao, Z., & Xiang, J. (1997). Demand for cigarettes and factors affecting demand: A cross-

sectional survey. Chinese Healthcare Industry Management, 5, 227–229.

Mathers, C. D., & Loncar, D. (2006). Projections of global mortality and burden of disease from

2002 to 2030. PLoS Medicine, 3(11), e442.

Moore, M. J., & Zhu, C. W. (2000). Passive smoking and health care: Health perceptions myth

vs. health care reality. Journal of Risk and Uncertainty, 21(2–3), 283–310.

Nelsen, R. B. (2006). An introduction to copulas, (2nd Ed.). New York: Springer.

Qian, J., Cai, M., Gao, J., Tang, S., Xu, L., & Critchley, J. A. (2010). Trends in smoking and

quitting in China from 1993 to 2003: National Health Service Survey data. Bulletin of the

World Health Organization, 88, in press. DOI: 10.2471/BLT.09.064709.

Samet, J. M. (2001). The risks of passive and active smoking. In P. Slovic (Ed.), Smoking: Risk,

perception and policy (pp. 3–28). London: Sage Publications.

Schmidt, P. (1981). Constraints on the parameters in simultaneous Tobit and probit models. In C.

F. Manski, & D. McFadden (Eds.), Structural analysis of discrete data with econometric

applications (Chap. 12, pp. 422–434). Cambridge, MA: MIT Press.

Serfling, R. J. (1980). Approximation theorems of mathematical statistics. New York: John

Wiley & Sons.

Shmueli, A. (1996). Smoking cessation and health: A comment. Journal of Health Economics,

15(6), 751–754.

Shmueli, A. (2002). Reporting heterogeneity in the measurement of health and health-related

quality of life. Pharmacoeconomics, 20(6), 405–412.

Sloan, F., Smith, V. K., & Taylor, D. H. (2003). The smoking puzzle: Information, risk

perception, and choice. Cambridge: Harvard University Press.

van Doorslaer, E., & Jones, A. M. (2003). Inequalities in self-reported health: Validation of a

new approach to measurement. Journal of Health Economics, 22(1), 61–87.

Viscusi, W. K. (1995). Cigarette taxation and the social consequences of smoking. In J. M.

Poterba (Ed.), Tax policy and the economy (pp. 51–101). Cambridge, MA: MIT Press.

Viscusi, W. K. (1990). Do smokers underestimate risks? Journal of Political Economy, 98(6),

1253–1269.

Viscusi, W. K. (1992). Smoking: Making the risky decision. New York: Oxford University Press.

Vuong, Q. H. (1989). Likelihood ratio tests for model selection and nonnested hypotheses.

Econometrica, 57(2), 307–333.

WHO. (2009). WHO report on the global tobacco epidemic, 2009: Implementing smoke-free

environments. Geneva: World Health Organisation. URL

http://www.who.int/tobacco/mpower/en/ (Accessed April 23, 2010).

Xu, X., Hu, T., & Keeler, T. (1998). Optimal cigarette taxation: Theory and estimation. Working

Paper, University of California at Berkeley.

Yen, S. T., Yan, Y., & Liu, X. (2009). Alcohol consumption by men in China: A non-Gaussian

censored system approach. China Economic Review, 20(2), 162–173.

Table 1

Variable definitions and sample statistics (n = 2908)

Variable Definition Mean

Endogenous variables (ordinal)

Smoking Cigarettes smoked per day: recoded as 1 = nonsmoker, 2 = 1–10 cigarettes, 3 = 11–20 cigarettes, 4 = > 20 cigarettes 2.01 (1.00)

SAH Self-assessed (reported) health status: 1 = poor, 2 = fair, 3 = good, 4 = excellent 2.74 (0.78)

Continuous explanatory variables

Cigarette price Local free-market cigarette prices (RMB per pack) 4.85 (2.75)

Age Age in years 47.76 (15.02)

HH income Household income in RMB per year 12818.26 (18373.03)

HH size Household size 2.18 (1.50)

Binary explanatory variables (1 = yes; 0 = no)

< Primary No school or did not graduate from primary school 0.22

Primary Graduated from primary school (reference) 0.20

Junior high Graduated from junior high school 0.37

Senior high Graduated from senior high school 0.14

≥ College Some college or more 0.06

Employed Employed for wages 0.25

Self-employed Self-employed 0.43

Unemployed Unemployed 0.05

Home maker Home maker (reference) 0.05

Retired Retired 0.12

Student Student, unable to work, or others 0.10

Heilongjiang Resides in Heilongjiang Province 0.12

Jiangsu Resides in Jiangsu Province 0.12

Shandong Resides in Shandong Province 0.10

Henan Resides in Henan Province 0.12

Hubei Resides in Hubei Province 0.10

Hunan Resides in Hunan Province 0.11

Guangxi Resides in Guangxi Zhuang Autonomous Region 0.11

Guizhou Resides in Guizhou Province 0.11

Liaoning Resides in Liaoning Province (reference) 0.11

Urban Resides in central city 0.41

Married Married 0.88

Divorced Divorced, separated, or widowed 0.02

Single Never married (reference) 0.10

Drink frequently Drinks almost everyday 0.21

Drink less frequently Drinks 1–4 times a week 0.25

Drink occasionally Drinks 1–2 times a month 0.09

Drink rarely Drinks < 1 time a month 0.46

Minority National minority 0.09

Tea Normally drinks tea 0.45

Active Participates in physical activities 0.54

Hypertension Hypertension 0.09

Standard deviations in parentheses.

Table 2

Two-way frequency distributions of smoking and SAH status

SAH

Smoking 1 2 3 4 Total

1 86 354 590 194 1224

(2.96) (12.17) (20.29) (6.67)

2 35 180 311 99 625

(1.20) (6.19) (10.69) (3.40)

3 37 250 435 133 855

(1.27) (8.60) (14.96) (4.57)

4 10 72 102 20 204

(0.34) (2.48) (3.51) (0.69)

Total 168 856 1438 446 2908

Relative frequencies (%) in parentheses.
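
The relative frequencies in Table 2 and the sample means of the two ordinal variables in Table 1 can be reproduced directly from the cell counts. The short sketch below does so; the variable names are illustrative.

```python
# Cell counts from Table 2: rows are smoking levels 1-4, columns are SAH levels 1-4.
COUNTS = [
    [86, 354, 590, 194],
    [35, 180, 311, 99],
    [37, 250, 435, 133],
    [10, 72, 102, 20],
]

n = sum(sum(row) for row in COUNTS)

# Relative frequency (%) of each cell, as reported in parentheses in Table 2
rel_freq = [[100.0 * c / n for c in row] for row in COUNTS]

# Marginal totals by smoking level and by SAH level
smoking_totals = [sum(row) for row in COUNTS]
sah_totals = [sum(col) for col in zip(*COUNTS)]

# Means of the ordinal codes (1-4), comparable with the means in Table 1
mean_smoking = sum(level * t for level, t in enumerate(smoking_totals, start=1)) / n
mean_sah = sum(level * t for level, t in enumerate(sah_totals, start=1)) / n

if __name__ == "__main__":
    for level, row in enumerate(rel_freq, start=1):
        print(level, " ".join(f"({p:.2f})" for p in row))
    print(f"n = {n}, mean smoking = {mean_smoking:.2f}, mean SAH = {mean_sah:.2f}")
```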

Table 3

Maximum-likelihood estimates of ordinal probability models with ordinal treatment: Gaussian

copula with generalized log-Burr margins

Smoking SAH

Variable Estimate S.E. Estimate S.E.

Constant –1.334*** 0.287 4.756*** 0.421

Income / 10000 0.041*** 0.015 0.078*** 0.024

Age / 10 0.896*** 0.124 –0.240 0.209

Age² / 1000 –0.916*** 0.127 –0.157 0.203

HH size 0.022 0.017 0.081*** 0.028

Heilongjiang –0.458*** 0.101 0.158 0.170

Jiangsu 0.015 0.119 0.235 0.157

Shandong –0.223** 0.100 0.439*** 0.166

Henan –0.282*** 0.098 –0.123 0.161

Hubei –0.095 0.101 –0.177 0.164

Hunan 0.254*** 0.094 0.295* 0.163

Guangxi –0.021 0.101 –0.517*** 0.166

Guizhou 0.092 0.105 0.068 0.166

< Primary 0.144* 0.075 –0.069 0.119

Junior high –0.052 0.064 0.083 0.102

Senior high –0.127 0.084 0.061 0.131

≥ College –0.191* 0.106 –0.034 0.170

Employed 0.051 0.124 0.005 0.189

Self-employed 0.180 0.116 0.160 0.178

Unemployed 0.100 0.157 0.185 0.232

Retired –0.248* 0.132 –0.046 0.197

Student –0.065 0.132 –0.281 0.190

Married 0.164 0.109 0.203 0.167

Divorced –0.236 0.185 –0.253 0.271

Urban –0.115** 0.059 0.169* 0.093

Minority –0.212** 0.102 0.114 0.160

Hypertension 0.191** 0.078

Cigarette price –0.040 0.011

Tea –0.030 0.075

Active 0.244*** 0.080

Drink frequently 0.521*** 0.101

Drink less freq. 0.336*** 0.092

Drink occasionally 0.330** 0.131

Cigarettes –0.627*** 0.158

µ2, ξ2 0.687*** 0.035 2.276*** 0.117

µ3, ξ3 1.859*** 0.114 4.770*** 0.251

κi 0.115 0.084 1.001*** 0.143

θ 0.362*** 0.094

Log likelihood –6574.464

Asterisks *** indicate statistical significance at the 1% level, ** at the 5% level, and * at the

10% level.

Table 4

Average treatment effects of smoking and quitting on the probabilities of health: Gaussian-Burr model

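Effects of smoking (rows: smoking level; columns: smoking level used as the comparison; standard errors in parentheses)

SAH = 1

Smoking   Relative to 1       Relative to 2       Relative to 3

2         –5.71*** (2.13)

3         –8.23*** (2.73)     –2.52*** (0.61)

4         –10.33*** (2.91)    –4.62*** (0.81)     –2.10*** (0.22)

SAH = 2

Smoking   Relative to 1       Relative to 2       Relative to 3

2         –7.19*** (1.39)

3         –13.58*** (2.96)    –6.39*** (1.58)

4         –22.18*** (4.56)    –15.00*** (3.19)    –8.60*** (1.62)

SAH = 3

Smoking   Relative to 1       Relative to 2       Relative to 3

2         5.84*** (1.78)

3         7.05*** (1.53)      1.22*** (0.40)

4         3.82** (1.67)       –2.02 (3.27)        –3.24 (2.92)

SAH = 4

Smoking   Relative to 1       Relative to 2       Relative to 3

2         7.06*** (1.72)

3         14.76*** (4.19)     7.70*** (2.48)

4         28.70*** (8.84)     13.94*** (4.66)     21.64*** (7.14)

Effects of quitting

SAH = 1             SAH = 2             SAH = 3             SAH = 4

–5.99*** (2.29)     –13.71*** (5.30)    1.45 (2.73)         18.25* (10.17)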

All probabilities are multiplied by 100. Asymptotic standard errors are in parentheses. Asterisks *** indicate statistical significance at

the 1% level, ** at the 5% level, and * at the 10% level.

Table 5

Marginal effects of selected explanatory variables on the probabilities of health: Gaussian copula with

generalized log-Burr margins

Probability of SAH = 1 conditional on / Probability of SAH = 4 conditional on

Variable None Smoking = 1 Smoking = 4 None Smoking = 1 Smoking = 4

Continuous explanatory variables

Income / 10000 –0.20*** –0.29 –0.23 1.42*** 0.82*** 0.50*

(0.07) (0.21) (0.22) (0.48) (0.29) (0.30)

Age / 10 1.00*** 1.85*** 1.86*** –7.06*** –5.04*** –4.30***

(0.18) (0.24) (0.33) (0.95) (0.56) (0.50)

HH size –0.21*** –0.33** –0.30* 1.47*** 0.93*** 0.68**

(0.08) (0.14) (0.16) (0.54) (0.34) (0.30)

Cigarette price –0.08* –0.13 0.18* 0.33**

(0.04) (0.33) (0.10) (0.17)

Binary explanatory variables

Heilongjiang –0.39 –1.50** –1.91*** 2.92 4.27** 6.10***

(0.44) (0.67) (0.67) (3.02) (2.02) (1.97)

Shandong –0.96** –2.12*** –2.20*** 8.67*** 7.56*** 7.34***

(0.41) (0.66) (0.67) (3.18) (2.31) (2.00)

Hunan –0.69* –0.86 –0.62 5.64* 2.79 1.23

(0.39) (0.70) (0.73) (3.29) (2.07) (1.64)

Guangxi 1.78*** 3.12*** 3.45*** –7.89*** –4.92*** –4.19***

(0.64) (1.04) (1.24) (2.79) (1.62) (1.41)

Urban –0.43* –0.99** –1.11*** 3.08* 2.68** 2.77***

(0.25) (0.41) (0.42) (1.68) (1.16) (1.04)

Minority –0.28 –0.88 –1.07* 2.11 2.50 3.18*

(0.38) (0.61) (0.56) (3.00) (2.12) (1.88)

Hypertension 0.38* 0.61** –0.81** –1.46*

(0.20) (0.30) (0.40) (0.75)

Active –0.64*** –1.15*** –1.14*** 4.40*** 3.08*** 2.57***

(0.22) (0.38) (0.40) (1.52) (1.03) (0.85)

Drink frequently –1.31*** –2.36*** –2.33*** 9.62*** 6.85*** 5.68***

(0.28) (0.43) (0.55) (2.14) (1.51) (1.12)

Drink less freq. –0.91*** –1.64*** –1.66*** 5.91*** 4.08*** 3.43***

(0.26) (0.43) (0.50) (1.76) (1.19) (0.95)

Drink occasionally –0.90*** –1.62*** –1.64*** 5.81** 4.01** 3.38**

(0.34) (0.58) (0.62) (2.51) (1.76) (1.43)

All marginal effects on probabilities are multiplied by 100. Asymptotic standard errors are in

parentheses. Asterisks *** indicate statistical significance at the 1% level, ** at the 5% level, and * at

the 10% level. Marginal effects for insignificant variables are not presented.
