A grouped data regression approach to estimating economic and social influences on individual...

11

Click here to load reader

Transcript of A grouped data regression approach to estimating economic and social influences on individual...

Page 1: A grouped data regression approach to estimating economic and social influences on individual drinking behaviour

HEALTH ECONOMICS, VOL. 4: 237-247 (1995)

ECONOMETRICS AND HEALTH ECONOMICS

A GROUPED DATA REGRESSION APPROACH TO ESTIMATING ECONOMIC AND SOCIAL

INFLUENCES ON INDIVIDUAL DRINKING BEHAVIOUR

MATI’HEW SUTTON AND CHRISTINE GODFREY Centre for Health Economics, University of York, UK

SUMMARY

General Household Survey (GHS) data sets, covering the period 1978-1990, are pooled to investigate the relationship between the riskiness of individuals’ self-reported drinking behaviour and a wide range of personal characteristics and economic factors. A grouped data regression approach is used to reduce problems with the inaccuracy of self-reports of alcohol consumption and clustering of observations in the consumption data. Results for males aged 18 to 24 years are presented, and possible methods for interpreting the results of grouped data regression are illustrated. Controlling for other factors, current smokers are estimated to be at a 75% higher risk of drinking over recommended levels than non-smokers. Particular attention is paid to the interactions between the price of alcohol, income and heavy drinking. At average levels of income, a 5% increase in the real price of alcohol is predicted to reduce the probability of ‘at-risk’ drinking by 1.5%. At lower initial levels of income, drinking patterns are found to be more responsive to both price and income changes. Grouped data regression is proposed as a way of focusing policy analysis on individual risks of alcohol-related health and social problems.

KEY WORDS-akohol consumption; grouped data regression; iso-probabilities; health policy

There are numerous studies of the determinants of alcohol consumption.1s2 The majority have employed time series data aggregated over the population, or different regions, of a country. Results of such studies have been used to inform governments’ taxation and revenue policies. Alcohol consumption patterns are, however, not only of interest because of their impact on gover- nment finances, but also because of the risks of health and social problems at the individual level.

There have been only a few economic studies of the determinants of alcohol consumption using individual- or household-based data. This is partly due to the difficulty in obtaining survey data which includes economic variables, such as income. Also, several years of survey data may need to be

pooled if price elasticity estimates are to be obtained. However, the advantage of these data sets, once compiled, is that the interactions between economic and social determinants of individual consumption levels can be examined.

In the UK, there have been a number of studies which have used household expenditure data from the Family Expenditure Survey (FES).*5 However, whilst such studies are useful in examining household consumption behaviour, the health implications of the d t s based on the FRS are more difTicult to interpret. In England, Government health policy is now expressly concemed with the riskiness of alcohol consumption pattern^.^ Therefore, there is an acute need for estimates of the effectiveness of policy instruments on individual drinking behaviour.

Address for correspondence: Matthew Sutton, Centre for Health Economics, University of York, YORK YO1 5DD. England.

CCC 1057-9230/95/030221-11 0 1995 by John Wiley & Sons, Ltd.

Received 21 December 1994 Accepted 28 March 1995

Page 2: A grouped data regression approach to estimating economic and social influences on individual drinking behaviour

238 M. SU’ITON AND C. GODFREY

In this paper, relationships between the riskiness of individuals’ self-reported drinking behaviour and a wide range of personal characteristics and economic factors are investigated. Due to the nature of the data used, a grouped data regression approach is adopted. Results for males aged 18 to 24 years are presented, and possible methods for interpreting the results of grouped data regression are illustrated. Given the importance of equity considerations to health policy, particular attention is given to the interactions between the price of alcohol, income and heavy ability curves are suggested examine these relationships.

DATA

The UK General Household

drinking. fio-prob- as a useful way to

Survey (GHS) con- tains questions on individuals’ drinkkg behaviour, health status, socio-demographic characteristics, employment and income. Independent, representa- tive samples of individuals living in privately- owned households are interviewed each year. Questions on alcohol consumption have been included every other year since 1978. Heavy drinkers may be under-represented for several reasons, including the exclusion of institutions and lower response rates of heavy drinker^.^ Neverthe- less, this data-set is important, since it will be the main method of monitoring Government attempts to meet its national alcohol policy targets.

Respondents are asked to consider their alcohol consumption across five categories of drink: shandy; beer, lager, stout and cider; spirits and liqueurs; sherry and martini; and wine. Inter- viewees are first asked to estimate how often they consume a drink from each category. Responses are coded into one of six groups (seven from 1990 onwards), ranging from ‘once or twice a year’ to ‘most days a week’. Following this, individuals are requested to estimate the number of ‘standard drinks’ they consume on each drinking occasion, where a ‘standard drink’ is a half pint of beer, a glass of wine, or a pub measure of spirits.

Using a mid-point estimate of the frequency of drinking (eg. 3.5 times a week for ‘three or four times a week’), and the average number of units consumed on each occasion, the number of units consumed weekly by each respondent can be estimated. However, given the limited set of possible frequencies of consumption, these figures cluster around popular drinking patterns.

In addition, GHS reports indicate more funda- mental problems in the accurate measurement of individuals’ weekly consumption of units of alcohol.8 There are particular problems with basing weekly estimates on the ‘usual’ amount drunk on a single drinking occasion. Moreover, the strength of drinks will vary within drink-types, notably in the case of lagers and beers. Survey reports sug- gest that the only accurate way to use the alcohol consumption data is to assign individuals to one of seven drinking categories (table 1).

This categorisation of drinking behaviour has further importance in terms of ublic health

ment has set targets for reducing the proportion of the population in England drinking in the ‘fairly high’, ‘high’ and ‘very high’ groups. GHS data are the basis for the setting and monitoring of these targets.

The period 1978-1990 is considered in this paper, with GHS data from every other year being included. In total, seven independent GHS samples are pooled. The analysis is restricted to males aged 18 to 24 years in this study for three reasons. First, in previous work no correctly specified models could be found for the whole population, and the sample was divided into age-sex groups since these variables were found to be the most import- ant predictors of the heterogeneity of drinking patterns.’ Second, the alcohol price series is con- structed from national expenditure figures and a large proportion of this expenditure will be made by this population group. Therefore, the price series may be most applicable to these individuals since time trends in the prices of different types of alcoholic beverages vary considerably, as does the consumption of different drink-types between age and sex groups. Finally, the probability of drink- ing heavily is highest in this group and these

Table 1. Drinking categories in the GHS

policy. In The Health of the Nation F , the Govern-

Units/week

Drinking category Men Women

Non-drinker None None Very low <1 <1 LOW 1-10 1-7 Mo&rate 11-21 8-14 Fairly high 22-35 15-25 High 36-50 26-35 Very high 51+ 36 +

Page 3: A grouped data regression approach to estimating economic and social influences on individual drinking behaviour

ESTIMATING IhFLUENCES ON DRINKING BEHAWOUR 239

individuals are therefore a prime consideration for p~licy-making.~

The wide range of independent variables included is intended to reflect differences across individuals over a number of dimensions: personal characteristics; family structure; socio-economic status; health-related attitudes and behaviour; consumer behaviour (particularly expenditure indicating planning for the long-term); peer influences; and aggregate economic factors.

THE GROUPED DATA REGRESSION APPROACH

The grouped data regression technique is appli- cable when the dependent variable is limited to a certain number of categories, but the ranges of the underlying variable to which each category refers are known. Therefore, grouped data regression based on the seven drinking categories used in the GHS is a valid method of estimation for these data. A discussion of methods for estimating parameters is provided by Stewart." In this aper the maximum likelihood approach is adopted.

The grouped data regression approach is similar to an ordered probit model. An underlying latent variable, y*, is assumed to generate the observa- tions on the dependent variable (in this case, the drinking category to which each individual is assigned). The latent variable is assumed to be generated in the following way:

Pl

y * = p x + & (1)

where the values of y* are unobserved, the vector B contains regression coefficients, the vector x contains the values of the corresponding regres- sors and the disturbance terms E are independently, normally distributed random variables with zero mean and variance 0'. The observations, y, are generated in the following fashion:

y = l i f y ' s u , y = 2 i f a , < y * c a ,

y = J ifaJ-, < y *

in which u ~ . . . u ~ - , are pre-determined threshold values. This approach is more efficient than an alternative ordered probit approach, since the

estimation procedure utilises information on the scale of y* (provided by the threshold values) to produce an estimate of 0, rather than requiring that this be normalised to one. ''

In this study, the estimated number of units of alcohol consumed per week generated by responses to the GHS questions are assumed to be the best estimates of the unobserved y* values. The threshold values, u1 to a6, are set equal to the boundary values of the drinking categories shown in table 1.

The grouped data re ession model was esti- mated using LTMDEP.lFLinear and logarithmic functional forms were tested. For grouped data regression, a logarithmic model is generated by using logged values of the threshold terms and taking logs of the continuous regressors. In the linear case, a general model was initially attempted, which included cubic terms of all continuous variables and cross-products of the price-series and family income variables. A more parsimonious model was sought by removing continuous variables not significant at the 5% level.

Many different types of misspecification, such as non-normality , heteroskedasticity , omitted variables and endogenous regressors, lead to inconsistency of maximum likelihood estimators. A RESET test was included as a general check for the validity of the coefficients estimated by maximum likelihood. The evidence provided by Horowitz" and others suggests that RESET is a fairly reliable and convenient general check. While early evidence on the RESET tests sug- gested the use of second, third and fourth powers, more recent work has indicated that using only the squared term provides a more effective check. l3

RESULTS

For completeness, ordinary least-squares models were estimated, but neither linear nor logarithmic functional forms passed the RESET test at the 1% level, and there was evidence of heteroskedasticity in the errors, based on a general test. For the grouped data regression approach, the RESET statistic was significant in the logarithmic form, but not in the linear model. The estimated regres- sion coefficients using the latter functional form are given in tables 2a and 2b (Variable definitions are provided in the Appendix.)

Page 4: A grouped data regression approach to estimating economic and social influences on individual drinking behaviour

240 M. SUlTON AND C. GODFREY

Table 2a. Grouped data regression results-continuous variables

Variable Coefficient t-ratio Meanof x

Peer-influences AREABOOZ HOUSBOOZ HSBOOZSQ HSBOOZ3 Economic factors INCOME INCOMESQ INCOME3 PRICE PRICE2 ADVERT PRICE *INCOME (pRIcE*INcoME)2 (PRICE COME)^

2.24 8.17

-0.90 0.03

-20.06 2.11

-0.04 -333.28

156.29 -0.33 23.44 -2.78

0.06

3.19 18.75 -8.96 5.7 1

-2.32 2.36 1.91

-3.06 2.72

-2.40 2.52

-2.44 1.97

1.013 1.052 3.033

17.448

0.902 2.257

36.665 0.959 0.923 3.261 0.870 1.935

25.176

Restricted Log-likelihood - - -15152.05 Log -likelihood - - -14253.20 RESET statistic Sample size

- -0.28 - - - 7552

The estimated coefficients in tables 2a and 2b do not give the estimated marginal impacts of changes in the independent variables on the pre- dicted value of the dependent variable. Manipulation of the coefficients at a certain value of the mean function are required to estimate the marginal impact of each variable on the probabili- ties of each group. The only prediction which can be made from the reported coefficients is that, if the coefficient is positive, the probability of the highest group will increase, and the probability of the lowest group will decrease (the opposite being true if the coefficient is negative). The impact on the other groups cannot be predicted from the reported coefficients.

Most of the estimated coefficients have the expected sign. However, since there are a variety of cross-product and higher-order terms for the continuous variables (Table 2a), the influences of specific factors are difficult to interpret. The level of drinking in the individual’s area of residence is found to be positively associated with riskiness of drinking behaviour. This may reflect the relative number of licensed premises in close proximity, a factor which has been found previously to affect levels of alcohol consumption at the national level, l4 A non-linear relationship is estimated

between alcohol consumption by other members of the household and the individual’s drinking risk, but the influence of household-peers is still positive. These findings are in accordance with previous studies which demonstrate the influence of drinking habits in an individual’s social network. l5

In table 2b, the influences of personal character- istics are found to be significantly different from zero, with individuals born outside the UK (NONUKBRN), or having parents who were born outside the UK (SLFUKBRN), having less risky drinking patterns. Individuals who are married are estimated to have a lower risk of drinking heavily (EVERMARR), but if they subsequently divorce or separate, the risk rises to above pre-marriage levels (SEP-DIV). The estimated relationship between age and drinking behaviour is not monot- onic (AGE18 through AGE23), but younger individuals tend to report less risky drinking patterns.

One-adult families are associated with lower probabilities of high alcohol consumption (ONEADF). The number of children in the family influences drinking behaviour, with increasing numbers of children predicting less risk of heavy drinking. None of the dummy variables represent-

Page 5: A grouped data regression approach to estimating economic and social influences on individual drinking behaviour

ESTIMATING INFLUENCES ON DFUNKING BEHAVIOUR

Table 2b. Grouped data regression results-dummy variables

24 1

Variable Coefficient t-ratio Meanof x

Personal characteristics SLFUKBRN -2.80 NONUKBRN - 8.40 EVERMARR -5.86 SEP-DIV 9.11 AGE18 -5.59 AGE19 - 1.46 AGE20 0.13 AGE2 1 -2.04 AGE22 -0.33 AGE23 -0.06 Family structure ONEADF -2.44 THREEADF -0.80 FOURADF -1.14 ONECHDF -1.06 TWOCHDF -2.10 THREECHDF -2.94 CHDUSF 0.39 Educational qualifrcations HIGHERQ NURSING GCEORCSE APPRENTC O m Q Socio-economic group EMPMAN PROF ARMED MANUAL FARMER OWACC Consumer behaviour RENT MANYCAR NOCAR FEWDURAB MANYDURB Economic status UNEMP COLLEGE HOUSKEEP

0.96 0.10 0.88 0.31 3.18

2.44 -0.83

3.30 3.33 0.38 5.85

0.08 -1.00

3.06 -2.14 -0.12

-4.17 -4.30 -6.35

-3.80 -9.26 -5.67

3.12 -6.45 - 1.74

0.16 -2.51 -0.42 -0.08

-2.12 -0.86

1.21 -1.73 -2.26 -1.91

0.39

1.09 0.01 1.53 0.16 1.38

2.61

1.61 6.48 0.30 5.32

0.16 -1.80

5.47 -3.30 -0.20

-6.18 -5.34 - 1.06

-0.70

Health-related attitudes and behaviour IL.LHLTH -0.76 - 1.05 SMOKER 7.67 17.56 ALCNODAM 4.12 5.69 Constant 185.81 3.59 NOHSBOOZ -1.55 - 1.34

0.087 0.063 0.220 0.006 0.149 0.148 0.139 0.136 0.134 0.147

0.139 0.270 0.320 0.199 0.068 0.021 0 099

0.107 0.001 0.669 0.013 0.009

0.060 0.037 0.107 0.555 0.030 0.043

0.457 0.228 0.267 0.169 0.171

0.130 0.090 0.001

0.089 0.423 0.093

0.05 1 -

Page 6: A grouped data regression approach to estimating economic and social influences on individual drinking behaviour

242 M. SUITON AND C. GODFREY

ing the highest educational qualification achieved are significantly different from zero.

Employers/managers (EMPMAN), manual workers (MANUAL) and own-account non-pro- fessionals (OWNACC) are all estimated to be more likely to drink heavily than the default group, junior or intermediate non-manual workers. Fifty-six percent of the individuals in this data set are classified in manual socio-economic groups.

The positive coefficient on the variable represent- ing households with no car, NOCAR, may be indicative of many possible effects. This is unlikely to represent an urban-rural differential, since differ- ences in alcohol availability between individuals living in urban and rural areas should be reflected in the coefficient on AREABOOZ. Differences in current wealth should be reflected in the family income variables, but possession of a car may represent wealth capital which may have a positive effect on current drinking behaviour. The positive influence of the NOCAR variable may reflect safer drinking by those who drive to licensed premises.

Individuals who are currently unemployed (UNEMP), at college full-time (COLLEGE), or keeping house (HOUSKEEP) are all less likely than those currently employed to be drinking heavily. The coefficient on HOUSKEEP is, however, not significantly different from zero, although this may reflect the minor variation in this variable in this group. Current smokers are found to have a higher risk of heavy drinking than never or former smokers (SMOKER), suggesting that there is complementarity rather than substitution between risky consumption patterns. l6 Reported ignorance of the possible harmful health effects of high alcohol consumption has a significant positive influence on the likelihood of drinking heavily (ALCNODAM). Endogeneity of some variables, such as employment status, income, self-reported health and smoking status, was not indicated by the general misspecification test used in this study.

More detailed estimation of the impact of independent variables on each of the drinking categories must be considered from a base situa- tion. The base situation considered in this section is for all dummy variables equal to zero, and all continuous variables set to their mean values. For dummy variables, impact is assessed by calculat- ing the probabilities of each drinking category when the variable is set to zero or one.

Heavy alcohol consumption has been found to pose a greater health risk for smokers compared to non-smokers. l7 Individuals who consume

more than ten units of alcohol a day face a doubling of the already elevated relative risk of oesophageal cancer if they also smoke twenty cigarettes a day.‘* Differences in the probabilities of each of the seven drinking categories between smokers and non-smokers are shown in table 3. Smoking is estimated to have a positive impact on the probability of drinking in the ‘high’ and ‘very high’ consumption categories. The impact is largest on the highest risk group, with the probability of this drinking level increasing by 75% for smokers relative to non-smokers.

Simulations of the changes in drinking cate- gory probabilities which would be associated with various changes in continuous variables are shown in table 4. The ‘at-risk’ category represents those who drink over 21 units of alcohol per week (the combination of categories 5, 6 and 7). The influence on the ‘at-risk’ category is simply the combination of the influences on its constitu- ent categories. A 50% increase in alcohol consumption in the individual’s household is shown to have a larger impact than a similar change in alcohol consumption in the local area. For both factors, the largest decrease in prob- ability is seen in category 1 (abstainers), and the largest increase in probability is found in the highest risk group (category 7).

A 5% increase in the real price of alcohol is predicted to have a substantial impact on the probability of the ‘at-risk’ group. Relative to the initial situation, the negative impact of an increase in price is calculated to be largest on the ‘very high’ consumption group (category 7). This may be interpreted as indicating a higher level of price- responsiveness in heavy drinking behaviour rela- tive to moderate drinking behaviour. From the base situation, a 10% real increase in the level of income is found to have little impact on individuals’ drinking behaviour.

THE INTERACTION BETWEEN THE PRICE OF ALCOHOL, INCOME AND DRINKING

BEHAVIOUR

Interaction terms between the price of alcohol and family income per adult were found to be significantly Merent from zero in the grouped data regression. This result implies that the influence of price changes will depend on the level of income, and the influence of income changes

Page 7: A grouped data regression approach to estimating economic and social influences on individual drinking behaviour

ESTIMATING INFLUENCES ON DRINKING BJHAVIOUR

Table 3. The estimated impact of smoking status on drinking behaviour

Probabilities Probabilities Drinking group Units/week for non-smokers for smokers Change

243

Non-drinker None 0.127

Low 1-10 0.092 Moderate 11-21 0.140 Fairly high 22-35 0.250 High 36-50 0.192 Very high 51 + 0.187

1 .OOo

Very low <1 0.012

-

0.056 -0.071 0.007 -0.005 0.056 -0.036 0.101 -0.039 0.227 -0.023 0.226 0.034 0.328 0.141

1 .Ooo 0.o00 -

~~

Table 4. The estimated impacts of continuous variables on drinking behaviour

Situation

Initial 50% inc. Areabooz (Relative change) 50% inc. Housbooz (Relative change) 5% inc. Price (Relative change) 10% inc. Income (Relative change)

Drinking category

1

0.127 0.113

-10.8% 0.089

-29.7% 0.132

+4.2% 0.125

-1.3%

2

0.012 0.012

-7.3% 0.010

-20.2% 0.013

+2.9% 0.012

-0.8%

3

0.092 0.086

-5.9% 0.077

-16.3% 0.094

+2.3% 0.091

-0.8%

4

0.140 0.135

-3.4% 0.127

-9.5% 0.142

+ 1.4% 0.139

-0.4%

5

0.250 0.250 0.0% 0.250

-0.3% 0.250 0.0% 0.250 0.0%

6

0.192 0.199

+3.8% 0.212

+10.6% 0.189

-1.5% 0.193 +0.5%

7 At-risk

0.187 0.205

+9.5% 0.236

+26.1% 0.180

-3.7% 0.189

+1.2%

0.629 0.654

+4.0% 0.697

+ 10.9% 0.619

-1.5% 0.632 +0.5%

will depend on the level of alcohol price. More- over, the significance of higher-order terms on price and income indicates that the relationships with drinking behaviour are not linear, and the impact of changes will therefore depend on the initial value.

The impact of price and income changes can be estimated from 'base situations' other than that considered in table 4. At a level of income 50% higher than average, for example, the impact of a 5% increase in price on the probability of 'at-risk' drinking is approximately half of that estimated at the average income level.

It is possible to illustrate these interactions with iso-probability curves. An individual's probability of drinking in category c, n:, has been estimated to be a function of price ( P i ) , income (YJ, price- income interaction terms, and other factors (Xij). To simulate the effect of a price increase, it is necessary to compare an individual, with a certain profile of characteristics, who faces price Po, with an individual who has identical characteristics and

faces price P,. The probability of drinking over safe limits for individual 1, nl, can be estimated by:

where the terms us to es are the estimated marginal impacts of the independent variables RALCP, PRICE2, PRICJ3NC, PRICEY2 and PRICEY3 respectively. This equation can be rearranged to give a standard cubic function in P,:

4 p ; + (b3 + dSY2) p : + (aS + csY) eSY eSy3

(rI; - n; - (aS + C"P, - (bS + d Y ')Pi - e"3P; )

e'Y

= o (4)

Page 8: A grouped data regression approach to estimating economic and social influences on individual drinking behaviour

244 M. SU"TON AND C. GODFREY

Base Prob=0.63

Prob=0.65

------.-.- /--- ---\

.*.-.- -.-._. -.-.._ . -.-__

i Prob=O.lO

_._._.--.. -.-...--.___._._____.-_.-. ------ -.--._ -.-._._ --.-._ -.-.__ .-.-._, -.--._ -.-. ! --..-._.

I i ,".-..._ ;( ! I -.-..-.- ;.i t i -.-._. 1; ---.__. ;i ;! ii 'I

i j

-5.. -.. ...

-%. -.-... .-.-._ Prob=0.75

,--.--.- -.-.

i i

-.-- .-_.__._.___._.____.-. -.---- --.--._.. -.-.-._._. It

i

I 1 I I 1 I 1

0.5 1 1.5 2 2.5 3 3.5

This expression can be used to calculate the price level required to produce a pre-determined probability of drinking over safe limits, llf, for each possible level of income. The solution(s) to this equation can be used to generate contours for llf over a range of income levels. General for- mulae for solving cubic equations are given in mathematical texts.'9

The original l-Ii value has been calculated for the base situation used in the previous section (average values for all continuous variables and zero values for all dummy variables). The range of income levels for which the P I values are derived is 5% - 400% of the average income in this sample. For probabilities n: substantially less than the initial probability, lli = 0.63, there is only one root for P I , and this is large and negative. For probabilities greater than 0.63 there are three roots, with one solution once again being large and negative. Another set of solutions involves values of the price index higher than the average price, and imply probabilities lIj which are increasing in price at all income levels. The third set of solu- tions include the base situation and the average

4

1

0.95

0.9 X 0 TI C - 0.85 .- 8 a'

0.8

0.75

0.7

value of the price index. This set of solutions is shown in figure 1. Four probability contours are shown for nJ

equal to 0.63, 0.65, 0.70 and 0.75. The base situa- tion, with average income (income index equals 1) and average alcohol price (price index equals 0.96), is marked. The diagram is best interpreted by holding either income or price constant, and investigating the impact of changes in the other variables on the probability l l J . Holding the level of income constant, it can be seen that decreases in the price index are associated with increasing probabilities of drinking over safe limits. The distance between the contours indicates the respon- siveness of drinking behaviour to price changes, with closer contours representing higher respon- siveness. Therefore, these iso-probability curves can be seen to imply higher price-responsiveness at lower initial levels of income.

Holding the price index constant at 0.85, increases in income are initially associated with rapid increases in the probability of drinking heavily. However, if an individual earns between approximately 33% and 250% of average income,

Figure 1. Iso-probability lines of drinking over safe limits by price and income levels

Page 9: A grouped data regression approach to estimating economic and social influences on individual drinking behaviour

ESTIMATING l"NcIjS ON DRINKING BEHAMOUR 245

marginal increases in income are associated with decreases in risky drinking. Once again, however, the responsiveness of the probability Ill depends on the initial level of the other variable. At the average price (index=0.96), increases in income from 33% of average income are estimated to have a negligible impact on drinking behaviour.

CONCLUSION

In this paper, the importance of estimating models which predict individual drinking behaviour has been emphasised. Much of the previous work on determinants in the alcohol literature has focused on time-series data at the population level. This work is limited, since it cannot predict influences on the riskiness of individuals' drinking patterns which are the focus of health policy.

In this analysis of the drinking behaviours of 18 to 24 year-old males over the period 1978-1990, personal characteristics and family structure were found to affect significantly the probability of risky drinking behaviour. Whilst there were significant differences in drinking patterns between some socio-economic groups, the highest edu- cational qualification obtained by individuals was found to be an insignificant influence on alcohol consumption.

There was, however, a significant association between risky behaviours, with current smokers significantly more likely to drink heavily. The risk pf drinking over 50 units per week, for example, was found to be 75% higher for current smokers compared to non-smokers (table 3). Self-reported ignorance of the potential health hazards of heavy drinking was also positively correlated with high alcohol consumption.

The results of this study are also compatible with previous studies which find the drinking habits of individuals' social networks to be highly influential on their alcohol cons~mption.'~ In this study, the levels of alcohol consumption in the individual's area of residence and household were both found to positively impact on the risk of heavy drinking, with the influence of the closest peers (in this case, other household members) being the most significant (table 4).

A significant interaction between the influences of the price of alcohol and income on drinking behaviour was estimated. As a result, it is only feasible to discuss the responsiveness of drinking behaviour to changes in either of these variables at

given levels of the other. A base situation was described which involved average levels of all continuous variables and zeros for all binary vari- ables. From this base situation, a 5% increase in the real price of alcohol was estimated to reduce the probability of drinking over safe limits by 1.5% (table 4). On the other hand, an increase in income, equal to 10% of the average income level, is predicted to increase the probability of at-risk drinking by 0.5%. The influences of these changes are larger for individuals on lower incomes.

Iso-probability lines (figure 1) illustrate the complexity of the relationships between alcohol price, income and drinking behaviour which have been estimated in this article. Based on the regres- sion results obtained, simplification of the relationships between these variables, through exclusion of cross-product and higher-order terms, would have reduced the explanatory power of the model and produced biased results. Nevertheless, the appropriate level of complexity leads to results from which it is difficult to make generalisable interpretations. The influence of price changes, for example, will depend on the initial values of all independent variables, and will therefore be different for each individual.

In this article, the use of iso-probability curves has been demonstrated for illustrating complex relationships between two independent variables and drinking behaviour. For predicting the effects of policy changes, the grouped data regression approach can be used to produce probabilities of each drinking category for different combinations of independent variables. Since the GHS is a large, representative sample of the population, it would be possible to use these individual-level data to investigate changes in explanatory vari- ables. Monte-Carlo simulations of policy changes, drawing at random from the sample population and the predicted probabilities of different drink- ing levels, would be more accurate and informative. Use of the grouped data regression approach is recommended for focusing policy analysis on individual risks of alcohol-related health and social problems, rather than taxation and employment consequences.

ACKNOWLEDGEMENTS

We are grateful to the OPCS and ESRC Data Archive for supplying the GHS data used in this paper. The authors would like to thank participants

Page 10: A grouped data regression approach to estimating economic and social influences on individual drinking behaviour

246 M. SUTTON AND C. GODFREY

of the Third European Workshop on Econometrics and Health Economics for helpful comments on an early version of this paper. We are grateful to Les Godfrey, Chr is Orme and Sean Healy for advice on various aspects of the work. We would also like to acknowledge constructive comments made by two anonymous referees and Owen O’Donnell. Preliminary work on which this paper is based was funded by the Health Education Authority.

FOURADF

GCEORCSE

HIGHERQ

HOUSBOOZ APPENDIX

ADVERT

AGE18

ALCNODAM

APPRENTC

AREABOOZ

ARMED

CALCINC

CHDUSF

COLLEGE EMPh4AN

EVERMARR

FARMER

FEWDURAB

Quarterly expenditure on alcohol advertising deflated by an all items price index (Source: Media Expenditure Analysis Limited). Dummy; 1 if aged 18 years (AGE19 to AGE23 similarly defined). Dummy; 1 if does not believe alcohol can damage health in general or if drunk to excess. Dummy; 1 if highest educational qualification is an apprenticeship. Index of drinking habits of other adults in the individual’s area of residence. Constructed similarly to HOUSBOOZ. Dummy; 1 if member of the armed forces (see EMPMAN). Per adult gross income of the family unit deflated by a price index. The price index is total consumers’ expenditure divided by total consumer’s expenditure at 1985 prices. Dummy; 1 if any child under five years in the family unit. Dummy; 1 if at college. Dummy; 1 if employer or man- ager, or if currently unemployed was most recently an employer or manager, or if never employed, this is the socio-economic group of their partner or father. Dummy; 1 if ever been married or currently cohabiting. Dummy; 1 if farmer or agri- cultural worker (see EMPMAN). Dummy; 1 if household lies in bottom quartile of the distribution of the ratio of the number of

HOUSKEEP

HSBOOZ3 HSBOOZSQ ILLHLTH

INCOME3 INCOMESQ UANUAL

MANYCAR

NOCAR

NOHSBOOZ

durables divided by the average number of durables in that year. Dummy; 1 if family contains four adults or more. Dummy; 1 if highest education qualification is WE, CSE, or 0- level. Dummy; 1 if highest education qualification is more advanced than W E , CSE or 0-Level. Index reflecting the drinking habits of the other adults in the household, constructed in two stages: (1) the average number of units of alcohol consumed by the other adults in the household is calculated, (2) this variable is divided by the average of this variable across all individuals in this time period. If no value is available for HOUSBOOZ it is set equal to the mean value for this sample. An variable NOHSBOOZ is also created for these cases. Dummy; 1 if currently employed in keeping house. Cube of HOUSBOOZ. Square of HOUSBOOZ. Dummy; 1 if in ill-health, defined as rating own health as poor, having a longstanding illness which limited activity in the last two weeks, or being unemployed because health state prevents working. Cube of CALCINC. Square of CALCINC. Dummy; 1 if intermediate or junior manual worker (see EMPMAN). Dummy; 1 if have ratio of cars per adult over average cars per adult in the top quartile of the distribution. Dummy; 1 if have ratio of durables to average durables in top quartile of the distribution. Dummy; 1 if household does not have a car. Dummy; 1 if there is no value for HOUSBOOZ, because either the other adults in the household did not complete the drinking

Page 11: A grouped data regression approach to estimating economic and social influences on individual drinking behaviour

ESTIMATING INFLUENCES ON DRINKING BEHAVIOUR 247

NONUKBRN NURSING

ONEADF

ONECHDF

OTHERQ

OWNACC

PRICE2 PRICEINC PRICEY2 PRICEY3 PROF

RALCP

RENT

SEP-DIV

SLFUKBRN

SMOKER

THREEADF

THREECHDF

TWOCHDF

UNEMP

questions, or there are no other adults in the household. Dummy; 1 if born outside UK. Dummy; 1 if highest education qualification is a nursing qualification. Dummy; 1 if only adult in family (defined as sixteen years or over). Dummy; 1 if family contains one child only. Dummy; 1 if highest education qualification is any qualification not included in HIGHERQ, GCEORCSE, APPR-ENTC, or NURSING, including foreign qualifications. Dummy; 1 if own account non- manual worker (see EMPMAN). Square of RALCP. Product of CALCINC and RALCF'. Square of PRICEINC. Cube of PRICEINC. Dummy; 1 if professional (see EMPMAN). Expenditure on alcohol at current prices divided by expenditure valued at 1985 prices, deflated by an all items price index. Dummy; 1 if house is rented or leased. Dummy; 1 if currently separated or divorced. Dummy; 1 if born in the UK but at least one parent born outside UK. Dummy; 1 if current smoker of cigarettes, pipes or cigars. Dummy; 1 if a three-adult family. Dummy; 1 if family contains three or more children. Dummy; 1 if family unit contains two children. Dummy; 1 if currently unem- ployed or actively seeking work.

REFTRENCES

1. Godfrey, C. Economic influences on change in population and personal substance behaviour in, Edwards, G. and Lader, M. (eds.) Addiction: processes of change Society for the Study of Addiction Monograph. Oxford: Oxford Medical Publications 1994.

2. Edwards, E. et al. Alcohol policy and the public good Oxford: Oxford University Press 1994.

3. Atkinson, A., Gomulka, J. and Stem, N. Spending on beer, wine and spirits: evidence from the Family Expenditure Survey 1970-1983. Discussion Paper 114, ESRC programme on taxation, incentives and the distribution of income. London School of Economics, London 1989.

4. Crooks, E. Alcohol consumption and taxation an economic analysis. Institute for Fiscal Studies, London 1989.

5. Baker, P. and McKay, S . The structure of alcohol tares: a hangover from the past? IFS Commentary 21. Institute forFiscal Studies, London 1990.

6. The Health of the Nation: a strategy for health in England. London: HMSO 1992.

7. Warner, K.E. Possible increases in the underreporting of cigarette consumption. Journal of the American Statistical Association, 1978; 73: 314-318.

8. OPCS. General Household Survey 1990. London: HMSO 1991.

9. Sutton, M. and Godfrey, C. The Health of the Nation targets for alcohol: a study of the economic and social determinants of high alcohol consump- tion in diferent groups. A report to the Health Education Authority. Centre for Health Economics: York 1994.

10. Stewart, M.B. On least squares estimation when the dependent variable is grouped. Review of Econ- omic Studies 1983; L: 737-753.

11. Greene, W. LJMDEP Version 6.0: user's manual and reference guide. Econometric Software, hc.: New York 1992.

12. Horowitz, J.L. Bootstrap-based critical values for the information matrix test. Journal of Econome- trics 1994; 61 (2): 395-411.

13. Godfrey, L.G., McAleer, M. & MacKenzie, C.R. Variable addition and Lagrange multiplier tests for linear and logarithmic regression models. Review of Economics and Statistics 1988; 70: 492-503.

14. Godfrey, C. Licensing and the demand for alcohol. Applied Economics 1988; 200: 1541-58.

15. Skog, 0-J. Social interaction and the distribution of alcohol consumption. Journal of Drug Issues 1980;

16. Jones, A. A systems approach to the demand for alcohol and tobacco. Bulletin of Economic Research 1989; 4: 23-39.

17. The Faculty of Public Health Medicine. Alcohol and the public health: the prevention of harm related to the use of alcohol. Royal Colleges of Physicians. MacMillan: Basingstoke, 1991.

18. The Royal College of Physicians. A great and growing evil: the medical consequences of alcohol abuse. Tavistock London, 1987.

19. Press, W.H., Flannery, B.P.. Tukolsky, S.A. and Vetterling, W.T. Numerical Recipes. Cambridge University Press 1986.

10 71-92.