J BUSN RES 1992:24:115-133

115

Screening for Interactions Between Design Factors and Demographics in Choice-Based Conjoint

Goutam Chakraborty

George Woodworth

Gary J. Gaeth University of Iowa

Richard Ettenson University of Maryland

We propose an iterative stepwise forward selection procedure to screen a subset of important variables from a large pool of candidate variables in choice-based conjoint analysis. The method involves computing weighted correlations between candidate variables and residuals from the prior best fitting multinomial logit (MNL) model. Candidate variables that pass the screening step are then introduced and the MNL model is refitted using standard MNL software. A series of diagnostic tests is carried out before each iteration cycle. The procedure can be implemented easily using commonly available statistical software. We illustrate the application of the proposed method on a large data set.

Introduction

Choice modeling, whether based on field or experimental data, has been an important area of study for researchers in economics, marketing, as well as other fields (Sen, 1986; McFadden, 1986). Recently, discrete choice experiments have gained considerable popularity among academicians and practitioners who wish to understand consumer behavior (Louviere and Woodworth, 1983; Louviere, 1988). In a discrete choice experiment, a respondent is asked to evaluate a series of choice sets. Each choice set typically contains several alternatives described by specific combinations of attribute levels. The respondent’s task is to choose the most preferred alternative from each choice set. Analogous to rating-based conjoint analysis (Green and Srinivasan, 1978; Louviere, 1988), the data generated from choice

Address correspondence to Goutam Chakraborty, currently at Oklahoma State University, College of Business Administration, Department of Marketing, Stillwater, OK 74074.

Journal of Business Research 24, 115-133 (1992) © 1992 Elsevier Science Publishing Co., Inc.

0148-2963/92/$5.00

655 Avenue of the Americas, New York, NY 10010


experiments can be modeled to estimate relative importance of the attributes, as well as part-worths associated with different levels of the attributes. This information provides considerable insight into consumers’ decision making processes and is often used to predict marketplace behavior through choice simulators. Although different types of choice modeling procedures have been reported in the literature (Currim, 1982; Buckley, 1988), the more popular ones appear to be variants of the multinomial logit model (MNL).

Estimating MNL models from choice experiment data becomes increasingly complex as the number of attributes describing the alternatives increases. For example, in a choice experiment that uses 10 attributes to describe alternatives, the second order model includes 10 main effects, 10 quadratic effects (i.e., the linear-by-linear component of main effects), and 45 2-way interaction effects. Thus, even without higher order interactions, there are 65 effects that might be included in the model. The situation is further complicated by the fact that researchers in marketing are often interested in finding differences in decision processes across managerially relevant market segments. These differences typically manifest themselves in the form of varying attribute coefficients across demographic variables. Using the example above, if one considers only 5 such demographic variables (e.g., age, gender, income, education, and marital status), an additional 50 interactions between main effects and demographics need to be modeled. Additionally, if the researcher wants to include interactions between demographics and the quadratic effects or the 2-way attribute-by-attribute interactions, the number of total effects to be included in the model grows to 325. A MNL model of this magnitude is unwieldy, if not intractable, for 2 reasons. First, the associated X-matrix is often ill-conditioned (collinear), making simultaneous estimation of all parameters difficult, and second, many software packages cannot estimate such a large number of parameters. To deal with these problems, we outline an iterative stepwise forward selection procedure that can be implemented with existing statistical software packages.
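The counting above is easy to verify with a short script. This is only an illustration of the arithmetic in the example (10 attributes, 5 demographics); the variable names are ours, not the article's:

```python
from itertools import combinations

n_attrs = 10   # attributes describing each alternative (example from the text)
n_demos = 5    # demographic variables: age, gender, income, education, marital status

main_effects = n_attrs                                 # linear terms
quadratic = n_attrs                                    # linear-by-linear terms
two_way = len(list(combinations(range(n_attrs), 2)))   # attribute-by-attribute pairs

second_order = main_effects + quadratic + two_way      # 10 + 10 + 45 = 65
demo_by_main = n_demos * main_effects                  # demographics crossed with main effects
demo_by_all = n_demos * second_order                   # demographics crossed with all 65 terms

print(two_way, second_order, demo_by_main, demo_by_all)  # → 45 65 50 325
```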

The remainder of this article is organized as follows. First, we briefly describe the estimation of parameters in a MNL model. Next, we propose a screening method to identify significant effects from a large number of potential effects to include in the choice model. We then apply our screening method to an existing data set collected as part of a large study of consumer decision making for health insurance. Finally, we conclude the article by discussing some limitations of our screening method and suggesting directions for future research.

Maximum Likelihood Estimation of Parameters in MNL Models

The MNL form of Luce’s model (1959, 1977) can be derived from random utility theory (Amemiya, 1981; Ben-Akiva and Lerman, 1985; McFadden, 1981), which assumes that the utility of an alternative has 2 components, one that is fixed and the other random. The fixed or systematic component of an alternative’s utility is typically modeled as a function of the attribute levels and individual consumer characteristics. The random component reflects unobserved attributes and taste variation, measurement errors, and imperfect information (Ben-Akiva and Lerman, 1985). If the random components are independently and identically Gumbel distributed, the probability that a randomly selected individual in population segment “g” chooses alternative “a” from choice set “A” is given by the MNL model:


p_g(a|A) = exp(V_ag) / Σ_{j∈A} exp(V_jg),

where V_jg is the systematic component of the utility of alternative “j” to individuals in segment “g.” This systematic component can be further specified as a function of the vector of the observed levels of each attribute (x) of the alternatives as

V_jg = Σ_k β_kg · x_kj,

where β_kg is the weight of the attribute “k” in the valuation of the alternative “j” for respondents in segment “g.” This specification can be easily extended to accommodate quadratic or interaction effects.
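The random-utility claim above (iid Gumbel errors imply the MNL form) can be checked by simulation. In this sketch the systematic utilities are arbitrary assumed values, not estimates from the article; the empirical choice shares should approach the closed-form logit probabilities:

```python
import math
import random

random.seed(0)
V = [1.0, 0.5, -0.2]  # assumed systematic utilities for three alternatives

# closed-form MNL probabilities: exp(V_a) / sum_j exp(V_j)
denom = sum(math.exp(v) for v in V)
p_mnl = [math.exp(v) / denom for v in V]

def gumbel():
    # inverse-CDF draw from the standard Gumbel distribution
    return -math.log(-math.log(random.random()))

n = 200_000
counts = [0] * len(V)
for _ in range(n):
    u = [v + gumbel() for v in V]          # utility = systematic + random component
    counts[u.index(max(u))] += 1           # respondent picks the max-utility alternative
shares = [c / n for c in counts]

print([round(p, 3) for p in p_mnl])
print([round(s, 3) for s in shares])
```

With 200,000 draws the simulated shares agree with the logit form to roughly three decimal places.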

There are several ways to model the variation of attribute weights across segments, but the simplest is to regard them as functions of demographic (or similar) variables:

β_kg = β_k + Σ_e α_ke · w_eg,

where α_ke represents the interaction between the attribute “k” and the demographic variable “e” and w_eg is the value of the demographic variable “e” in the segment “g.” Thus, the utility of alternative “j” for individuals in segment “g” is:

V_jg = Σ_k β_k · x_kj + Σ_k Σ_e α_ke · x_kj · w_eg    (1)

For the main effects model, attribute-by-demographic interactions are all zero; hence, dropping the segment (g) subscript, we are left with:

V_jg = V_j = Σ_k β_k · x_kj    (2)

Choice Experiments

Data from choice experiments are often analyzed using the MNL model (e.g., Louviere, 1988). Suppose that the design in a choice experiment involves presenting I choice sets, A_1, A_2, …, A_I, to each respondent. The probability that an individual from segment “g” chooses alternative “a” from choice set A_i, according to the MNL model, is:

p_g(a|A_i) = exp(V_ag) / Σ_{j∈A_i} exp(V_jg)    (3)

where V_jg is the systematic component of alternative “j” for respondents in segment “g” given by equation (1). In equation (3), the denominator is constant for alternatives within the same choice set A_i. Therefore, one can rewrite this equation as:

p_g(a|A_i) = exp(V_ag) / k_ig,

where k_ig is an unknown constant corresponding to the choice set “i” and the segment “g” (see Technical Appendix for details). For the purpose of parameter estimation, we treat the choices made by a single respondent as conditionally independent, given the demographic characteristics of that respondent. Failure of this assumption does not bias parameter estimates but inflates estimates of their standard errors. Louviere and Woodworth (1983) discuss this and propose a conservative correction of the standard errors.
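The within-set computation in equation (3) can be sketched directly. The utilities below are hypothetical placeholders; the point is that the denominator k_ig is computed once per choice set and shared by all of its alternatives:

```python
import math

# hypothetical systematic utilities V_jg for one segment, grouped by choice set
choice_sets = {
    "A1": {"a": 0.8, "b": 0.1, "c": -0.4},
    "A2": {"a": 0.2, "d": 0.9},
}

def mnl_probs(V):
    """p_g(a|A_i) = exp(V_ag) / k_ig, where k_ig = sum over j in A_i of exp(V_jg)."""
    k_ig = sum(math.exp(v) for v in V.values())
    return {a: math.exp(v) / k_ig for a, v in V.items()}

probs = {name: mnl_probs(V) for name, V in choice_sets.items()}
for name, p in probs.items():
    print(name, {a: round(x, 3) for a, x in p.items()})
```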

The parameters of the utility model in equation (1) are estimated by maximizing the formal likelihood

L = Π_{g=1..G} Π_{i=1..I} Π_{a∈A_i} [p_g(a|A_i)]^f_aig,

where f_aig is the number of individuals in segment “g” choosing alternative “a” from choice set A_i, and G and I are the total number of segments and choice sets, respectively.
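The log of this likelihood is a frequency-weighted sum of log probabilities. A minimal sketch, with made-up frequencies and fitted probabilities for one segment and two choice sets:

```python
import math

# hypothetical frequencies f_aig and fitted probabilities p_g(a|A_i)
f = {("A1", "a"): 30, ("A1", "b"): 20, ("A2", "a"): 15, ("A2", "b"): 35}
p = {("A1", "a"): 0.6, ("A1", "b"): 0.4, ("A2", "a"): 0.3, ("A2", "b"): 0.7}

# log L = sum over choice sets and alternatives of f_aig * log p_g(a|A_i)
log_lik = sum(f[k] * math.log(p[k]) for k in f)
print(round(log_lik, 3))  # → -64.194
```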

Practical Problems in Screening from a Large Number of Interactions

The practical barrier to fitting such models is the need to estimate coefficients for an extremely large number of potential interaction terms, both attribute-by-attribute and attribute-by-demographic. In what follows, we describe an iterative stepwise forward selection method to rapidly screen a large number of variables for improving the model fit (see Cooper and Nakanishi, 1988, for a single-step screening procedure). Our method involves computing weighted correlations between candidate variables and residuals from the currently best fitting MNL model. Candidate variables that pass the screening step are then introduced and the MNL model is refitted using standard MNL software.

Our procedure is analogous to the well-known forward stepwise regression method. As such, it shares 3 generic shortcomings associated with any stepwise procedure. First, the decision as to which variables to include and when to stop is ad hoc and should depend on researchers’ knowledge and goals. Second, stepwise procedures are specifically designed to maximize model fit and are therefore generally unsuitable for inference and theory testing. Finally, all stepwise procedures suffer from the lack of a suitable comparison or base model, because the correct model is unknown and the full model with all independent variables is often saturated and of little value, especially in the presence of collinearity. To provide some safeguards against the generic weaknesses of stepwise procedures, we have incorporated a series of diagnostic checks in various steps of our proposed method.

Ideally, a screening procedure for choosing candidate variables should maximize pre-specified research criteria, such as improvement in model fit or improvement in the predictive ability of the model. In the context of maximum likelihood estimation, the usual criterion is maximizing the improvement in log-likelihood. However, existing MNL programs do not allow stepwise addition of variables by this criterion. We are not aware of any existing software for stepwise introduction of variables in a multinomial logistic model, although such procedures are available for binary logit models (in SAS and BMDP). Our solution to this problem is to screen a small set of candidate variables, fit these variables with MNL software, and then test for improvement in log-likelihood by backwards elimination of the comparatively small set of candidate variables passing the screening step. The residuals from the refitted model are then used to screen for additional candidate variables. The procedure is repeated until no useful improvement in the model is made. Iteration helps to avoid missing an important variable in the screening stage.

[Figure 1 (flowchart): Step One: fit main effects MNL model; compute residuals and working logits. Step Two: compute weighted correlations of residuals with candidate variables; identify subset of candidate variables. Step Three: center all variables and run weighted stepwise regression. Step Four: refit MNL model, adding candidate variables which survived the previous step. Step Five: check likelihood ratio test for joint significance of all new variables; t-test for individual significance of each new variable; check whether variables from Step One are still significant; check improvement in ρ̄².]

Figure 1.

Proposed Stepwise Screening Procedure

Our screening procedure is outlined in Figure 1. We start with the assumption that the attributes in a controlled choice experiment have been carefully selected by the researcher. Although there are certainly applications for which this is not appropriate, we decided to use the main effects as a core model; that is, all main effects are automatically included in all subsequent models. We then select important higher-order variables from a list of potential effects including quadratics, attribute-by-attribute and attribute-by-demographic interactions. The goal is to find effects that contribute to improvement in log-likelihood, and ultimately to provide a better model fit.


As shown in Figure 1, we fit the main effects model using standard MNL software and obtain estimates of coefficients in step one. From these estimated coefficients, we compute expected frequencies, residuals and weights:

Expected frequency: f̂_aig = n_ig · p̂_g(a|A_i)
Residual: e_aig = (f_aig − f̂_aig) / f̂_aig
Weight: w_aig = f̂_aig

where n_ig is the number of respondents in segment “g” evaluating choice set A_i. Typically, data are completely disaggregated so that n_ig equals 1 and f_aig is a 0/1 variable corresponding to each individual and alternative. In this case, the expected frequency equals the predicted probability p̂_g(a|A_i).
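For the fully disaggregated case (n_ig = 1), these step-one quantities reduce to simple transforms of the fitted probabilities. A sketch with hypothetical fitted values for one respondent and one choice set of 4 alternatives:

```python
# hypothetical fitted probabilities from a main-effects model
fitted_p = [0.50, 0.30, 0.15, 0.05]
chosen = 1  # this respondent chose the second alternative (0-based index)

f = [1 if j == chosen else 0 for j in range(len(fitted_p))]   # observed 0/1 frequencies
f_hat = fitted_p[:]                       # expected frequency = n_ig * p_hat = p_hat
resid = [(fo - fh) / fh for fo, fh in zip(f, f_hat)]          # e_aig
weight = f_hat[:]                         # w_aig = f_hat_aig

print([round(r, 3) for r in resid])  # → [-1.0, 2.333, -1.0, -1.0]
```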

In step 2, we identify a set of candidate variables for further consideration by selecting a subset of variables from all quadratic and interaction effects. This is accomplished by including those effects that have large weighted simple correlations with the residuals from the fitted model in step 1. This can be carried out easily using standard statistical software packages, which allow calculation of weighted correlations. This is similar in principle to identification of additional variables in large regression problems (Draper and Smith, 1981), except that we use the ordinary correlations between the residuals and candidate variables instead of the partial correlations controlling for variables already in the model. The purpose of this second step is to reduce the number of variables for step 3, where we work with the partial correlations. Alternative approaches can be taken to this step. In particular, a reviewer suggested that a split-half or a jackknife analysis could replace our use of the single set of correlations. Such an approach helps to protect against instability in the simple correlations. We did not include these approaches in this example, but have found this latter suggestion to be quite helpful when high leverage data points are present, particularly in interval scaled data.
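The screening statistic of step 2 is just a weighted Pearson correlation, which is straightforward to compute directly. The sketch below uses synthetic data (not the article's) in which the residuals depend on candidate x1 but not on x2, so only x1 should pass the screen:

```python
import math
import random

random.seed(1)

def weighted_corr(x, y, w):
    """Weighted Pearson correlation between x and y with weights w."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y)) / sw
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x)) / sw
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y)) / sw
    return cov / math.sqrt(vx * vy)

n = 1000
x1 = [random.choice([-1, 0, 1]) for _ in range(n)]    # relevant candidate effect
x2 = [random.choice([-1, 0, 1]) for _ in range(n)]    # irrelevant candidate effect
w = [random.uniform(0.1, 1.0) for _ in range(n)]      # weights, standing in for f_hat
resid = [0.8 * a + random.gauss(0, 0.5) for a in x1]  # residuals depend on x1 only

corrs = {"x1": weighted_corr(x1, resid, w), "x2": weighted_corr(x2, resid, w)}
print({k: round(v, 3) for k, v in corrs.items()})
```

Ranking candidates by the absolute value of this statistic reproduces the screening step; only high-correlation variables would be carried forward.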

In step 3, we approximate a multinomial logistic regression by stepwise weighted linear regression (Jennrich and Moore, 1975) using the working logit as the dependent variable and the set of variables currently in the model, plus the candidate variables selected in the previous step, as independent variables. The working logit, the dependent variable in this step, is:

y_aig = log(f̂_aig) + e_aig

In this step, all the variables from step 1 are forced into the model to ensure that the stepwise regression selects those candidate variables having large partial correlations with the dependent variable, after adjusting for the variables already in the model. We also center all independent and dependent variable means to zero at the subject level to reduce collinearity (Draper and Smith, 1981). The weighted least squares stepwise procedure approximates a stepwise MNL solution but computes inflated “F to enter” values (F-ratio for each new variable), due to an overstatement of the residual degrees of freedom. We therefore recommend that researchers set a stringent “F to enter” threshold to ensure that only the most important variables are retained.
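The working logit and the centering can be sketched as follows. The fitted probabilities are hypothetical, and centering is shown with the regression weights w = f̂, which is our assumption about how the weighted means are taken:

```python
import math

# hypothetical fitted probabilities and 0/1 choices across two choice sets
# (4 alternatives each) for one respondent
f_hat = [0.50, 0.30, 0.15, 0.05, 0.40, 0.35, 0.20, 0.05]
f     = [0,    1,    0,    0,    1,    0,    0,    0]

# working logit: y_aig = log(f_hat_aig) + (f_aig - f_hat_aig) / f_hat_aig
y = [math.log(fh) + (fo - fh) / fh for fo, fh in zip(f, f_hat)]

# center the dependent variable (and, likewise, each regressor) to a zero
# weighted mean at the subject level
sw = sum(f_hat)
y_bar = sum(wi * yi for wi, yi in zip(f_hat, y)) / sw
y_centered = [yi - y_bar for yi in y]

check = sum(wi * yi for wi, yi in zip(f_hat, y_centered))
print(abs(check) < 1e-9)  # → True
```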

In step 4, we add the surviving variables to the MNL model. In step 5, the new model is tested against the original for improvement in log-likelihood using the nested model likelihood ratio test (Ben-Akiva and Lerman, 1985). This determines whether all the new variables’ coefficients are jointly equal to zero. It is possible to reject this hypothesis and simultaneously have some nonsignificant t statistics for the new variables. In this case, we recommend additional nested model tests by deleting those new variables that individually are nonsignificant in the model. The ideal approach is to delete new variables one at a time starting with the least significant one and testing the significance of each deletion. However, this may be impractical due to the computer time required to fit large models.

Another important issue in this stage is whether the addition of new variables renders some of the original variables nonsignificant. This may arise due to high collinearity between the original variables and new variables. Although it is possible to maintain uncorrelated linear and linear-by-linear effects using a particular experimental design, it is not possible to achieve the same for attribute-by-demographic interactions, as these would depend on the distribution of the demographic variables in the sample. As mentioned previously, for controlled choice experiments we are guided by the principle that main effects are as important as interactions, and we therefore recommend that the original variables be retained rather than the new variables that create this condition. We recognize that in some instances, an interaction between a demographic variable and a design variable may make more theoretical sense than a prespecified main effect for an attribute. In that case, the interaction can be retained and the main effect dropped. We leave this decision to the judgment of the researcher. An additional diagnostic we recommend at this stage is improvement in the adjusted likelihood ratio index (rho-squared bar), which is related to the Akaike information criterion (Ben-Akiva and Lerman, 1985). This is a measure of improvement in goodness-of-fit and is adjusted for degrees of freedom lost due to the addition of new variables.
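The adjusted likelihood ratio index follows the standard definition ρ̄² = 1 − (LL(β̂) − K)/LL(0) (Ben-Akiva and Lerman, 1985). A sketch of the diagnostic; note that the null log-likelihood LL(0) used here is an assumed value, not one reported in the article:

```python
def rho_squared_bar(ll_model, ll_zero, k):
    """Adjusted likelihood ratio index: 1 - (LL(beta_hat) - K) / LL(0),
    where K is the number of estimated parameters."""
    return 1.0 - (ll_model - k) / ll_zero

# hypothetical comparison of a model before and after adding 3 variables
ll_zero = -7000.0  # assumed log-likelihood with all coefficients zero
before = rho_squared_bar(-5995.0, ll_zero, 28)
after = rho_squared_bar(-5968.0, ll_zero, 31)

# the added variables are worthwhile only if the penalized index improves
print(round(before, 4), round(after, 4))
```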

Once each criterion is satisfied, the result is a model with all the original main effects and a small number of additional effects. We then return to step 1 and repeat the procedure to select additional interactions from those not yet considered. We continue to iterate until the likelihood cannot be improved further by a model that also satisfies the diagnostic criteria of step 5. When the iteration stops, we recommend RESET (regression specification error test) to check for adequacy of functional form of the selected model (Ramsey, 1969; Thursby, 1989). This test is illustrated in the application section.

It is possible that the selected model may fail one of these proposed tests. In such a case, the users may go back to step 1 and reiterate by starting with a smaller number of variables from the output of the first stepwise run. Another problem that may occur is that the iteration process may identify non-nested models. In such a case, we propose that the researchers conduct the modified likelihood ratio test for non-nested alternatives (Horowitz, 1982) to select one of the alternative models.

Our method of selecting candidate variables, based on the stepwise weighted regression of the current working logits on centered independent variables, shares the same theoretical foundations as the “score test” procedure used in econometrics to carry out the Lagrange multiplier test for omitted variables (McFadden, 1987; Engle, 1984; Crowder, 1976). In the score test, normalized residuals from the current model are regressed on centered and weighted independent variables to make inferences about whether any new variable should be added in the next step. Thus, in both the score test and our procedure, the inferences are guided by the partial correlations of normalized residuals or working logits (a function of the residuals), respectively, with the omitted independent variables. However, the score test is an omnibus test for the presence of significant omitted variables, while our method seeks to maximize model fit. Inferences based on the omnibus score test are likely to be overly conservative when there are only a few significant variables among a large number of omitted variables. This is similar to the problem encountered when conducting the Scheffé omnibus test for multiple comparisons in the regression context.

Application of Proposed Screening Method

Method

In this section, we illustrate the use of our screening method by applying the procedure to data collected as a part of a large scale study of consumers’ decision processes when evaluating and selecting health insurance in a multi-plan environment. Through extensive preliminary work (including 6 focus group interviews with consumers and input based on expert opinion), 24 attributes were selected to describe health insurance plans. For each attribute (except brand name), 3 levels were developed after additional pretesting and expert consultation; brand name was varied at 6 levels. The population of interest was Maryland state employees. The sampling frame was based on a set of workplace telephone directories of state agencies. A systematic sample was created by using every 26th name from the directories. After obtaining consent through telephone solicitation, choice conjoint questionnaires were mailed to 662 subjects. Each subject was paid $15 after returning a completed questionnaire. The total number of returned surveys was 562, yielding a response rate of 85%.

Design

The conjoint portion of the questionnaire was composed of 11 choice sets with 4 alternatives per set. Of these, the first 2 choice sets were for practice and not used in the analysis. Each alternative in the choice set was described by a brand name and 23 other attributes. The attribute levels of the first 3 alternatives in each choice set were determined by a 6 × 3²³ fractional factorial design. The attribute levels of the fourth alternative in each choice set were held constant at the middle level. The constant alternative serves as a base for estimating the contrast of the utilities of the other 3 alternatives. Following the conjoint section, the respondents completed, among other things, a section on demographics.

The attribute brand name was varied at 6 levels and coded by 5 0/1 dummy variables. The 3 levels of the other attributes were coded such that the linear effects had values of −1, 0, and 1, and the corresponding quadratic effects had values of 1, −2, and 1. The design permitted estimation of all linear and quadratic effects free from other higher-level interactions (attribute-by-attribute). Previous research suggested that the evaluation and selection of health insurance might be influenced by a respondent’s age, gender, job-class, education, current health insurance plan, and whether or not the respondent had dependents. These 6 variables were therefore used in this application. The 5 demographic variables were dichotomized and coded as −1 and 1. “Current health insurance plan” was also dichotomized based


Table 1. Coding and Sample Distribution of Demographic Variables

Variable Name                  Levels                               Coding Used   % of Respondents
Age                            40 yrs. or above                          1             50.3
                               Less than 40 yrs.                        -1             49.7
Gender                         Male                                      1             48.3
                               Female                                   -1             51.7
Job-class                      White collar                              1             58.5
                               Blue collar                              -1             41.5
Current health insurance plan  Enrolled in Blue Cross/Blue Shield        1             62.3
                               Enrolled in other insurance              -1             37.7
Education                      More than one year of college             1             62.9
                               High school or comm. college             -1             27.1
Dependents                     Yes                                       1             51.5
                               No                                       -1             48.5

on whether the respondent was enrolled in a Blue Cross/Blue Shield plan (Blue Cross/Blue Shield held approximately 80% of market share). The codes and the sample distribution for the levels of these 6 demographic variables are shown in Table 1. Thus, the model building task involved a total of 189 potential effects: 5 dummy variables for brand name, 23 linear effects, 23 quadratic effects, and 138 (6 × 23) attribute-by-demographic interactions, contained in a data set of over 20,000 choices.
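The attribute coding described above (linear codes −1, 0, 1 and quadratic codes 1, −2, 1) yields zero-sum, mutually orthogonal contrast vectors, which is what lets the linear and quadratic effects be estimated separately in a balanced design. A quick check:

```python
# three-level attribute coding from the text: linear -1/0/1, quadratic 1/-2/1
LINEAR = [-1, 0, 1]
QUADRATIC = [1, -2, 1]

def code_level(level):
    """Return (linear, quadratic) codes for a level index 0, 1, or 2."""
    return LINEAR[level], QUADRATIC[level]

sums = (sum(LINEAR), sum(QUADRATIC))                  # both contrasts sum to zero
dot = sum(a * b for a, b in zip(LINEAR, QUADRATIC))   # and are orthogonal
print(sums, dot)  # → (0, 0) 0
```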

Result of First Iteration

As the first step, we estimated the parameters of the main effects model by using the SAS MLOGIT procedure. In the main effects model, most effects are significant and the adjusted likelihood ratio index is 13.9%. Although this statistic is similar to adjusted R² in regression, the values of rho-squared bar are typically lower than regression R² and there are no general guidelines for what would be considered an acceptable value for this statistic (Ben-Akiva and Lerman, 1985). All variables, except the brand name dummies and office hours, are coded such that −1 represents the “best” level and +1 represents the “worst” level of the attribute. Therefore, a negative sign is expected for all coefficients except those for office hours and brand dummies; the values under the main effects model in Table 2 confirm this. The order of the 5 most important attributes (hospitalization, choice of doctors, premium, dental coverage, and choice of hospitals) generally confirms managerial expectations. Thus, the overall performance of the main effects model appears to be satisfactory.

As an initial screening, we next identified 50 candidate variables from the pool of 161 variables left after fitting the main effects (28 variables) model. We retained all the main effects (even those few that are not significant) for the reasons discussed previously. Fifty new variables were chosen on the basis of highest weighted correlations with the residuals of the main effects model. In this case, the number 50 was a result of an ad hoc decision to include only correlations significant at p < 0.10. Although it is theoretically possible to use all 161 variables in the stepwise procedure, the computer time required is prohibitive. Moreover, since


Table 2. Details of Iterations of Screening Process

[Table 2 reports parameter estimates under the main effects model and after each of the five screening iterations, for the following variables: dummies for brands 1-5; waiting time in office; office hoursᵃ; premium; emergency care; choice of doctors; prescription coverage; convenience/paperwork; office visit coverage; out-of-town emergency coverage; dental coverage; quality of hospitals; choice of hospitals; travel time to physician; travel time to hospital; time to make appointment; substance abuse coverage; psychologist/psychiatrist coverage; wellness/education program; vision & hearing care; communication with participants; preventive care; hospitalization; 24-hour consultation by phone; the quadratic effects vision & hearing care_q, out-of-town emergency coverage_q, travel time to physician_q, emergency coverage_q, time to make appointment_q, convenience/paperwork_q, hospitalization_q, preventive care_q, consultation by phone_q, and prescription_q; and the interactions current plan-by-premium and gender-by-premium. The model-fit rows are:

Model            Log-likelihood   Change in log-likelihood    ρ̄²
Main effects         -5995                   —               13.9%
1st iteration        -5968                  27ᵇ              14.2%
2nd iteration        -5953                  16ᵇ              14.4%
3rd iteration        -5940                  13ᵇ              14.6%
4th iteration        -5928                  12ᵇ              14.7%
5th iteration        -5900                  28ᵇ              15.1%]

ᵃExcept for office hours, all other variables are coded such that −1 corresponds to best levels and +1 corresponds to worst levels.
ᵇSignificant at α = 0.01 or better by likelihood ratio test for nested models.
*Not significant at α = 0.05 for one-tailed test.


we iterated through each stage (including initial screening and stepwise), the chance of missing an important variable due to this part of the screening procedure was very slim.

After centering each variable, we ran a weighted stepwise regression with working logits calculated from the current (main effects) model as the dependent variable and all main effects plus the 50 new candidate variables as independent variables. We first forced the 28 variables (all main effects including brand dummies) into the model and then looked for remaining variables that had the highest partial correlations with the dependent variable. The statistic "F to enter" in the stepwise regression is a measure of these partial correlations. In this illustration, we decided to consider 4 variables, "vision and hearing care-q," "out-of-town emergency coverage-q," "current health care plan-by-choice of doctors," and "travel time to physician-q," which had the highest "F to enter" in the stepwise run. In our example, any variable ending with "-q" denotes the quadratic effect of that variable.
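The forced-entry screening step above can be sketched in code. This is a minimal illustration, not the original software: the function name, argument layout, and the F-to-enter formula (partial r² scaled by residual degrees of freedom) are our own choices, and it assumes the working logits and regression weights have already been computed from the current MNL fit.

```python
import numpy as np

def f_to_enter(X_forced, candidates, y, w):
    """Rank candidate variables by "F to enter" in a weighted forward step.

    X_forced   : (n, p) variables forced into the model (e.g., all main effects)
    candidates : dict of name -> (n,) candidate column
    y          : (n,) dependent variable (working logits from the current fit)
    w          : (n,) regression weights
    """
    sw = np.sqrt(w)
    Xw, yw = X_forced * sw[:, None], y * sw
    # Residualize y against the forced variables (weighted least squares).
    beta, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
    resid = yw - Xw @ beta
    df = len(y) - X_forced.shape[1] - 1
    scores = {}
    for name, c in candidates.items():
        cw = c * sw
        gamma, *_ = np.linalg.lstsq(Xw, cw, rcond=None)
        c_res = cw - Xw @ gamma     # part of the candidate orthogonal to forced vars
        r = resid @ c_res / np.sqrt((resid @ resid) * (c_res @ c_res))
        scores[name] = df * r**2 / (1 - r**2)   # "F to enter"
    return dict(sorted(scores.items(), key=lambda kv: -kv[1]))
```

A candidate that genuinely improves the fit shows a much larger "F to enter" than a pure noise variable, which is what the ranking exploits.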

Next, we re-estimated the MNL model and included these 4 new variables along with the 28 variables of the main effects model. Using the nested model test, we retained only the variables "vision and hearing care-q," "out-of-town emergency coverage-q," and "travel time to physician-q." The coefficient for "current health care plan-by-choice of doctors" was not significant, and there was no improvement in ρ² when this variable was added to the model containing all other variables. Thus, the model after the first iteration contained the core (main effects) model and 3 additional quadratic effects, yielding a total of 31 variables.

Result of Second Iteration

Starting with the final model from the previous iteration, we recomputed the weighted correlations among the residuals of this model and the 158 variables remaining from the original pool of 189 variables. Again, based on the highest weighted correlations, we chose 50 new candidate variables. From the results of the stepwise regression of the working logits (calculated from the final model after the first iteration) on the 31 variables plus the 50 new variables, as before we selected 4 variables based on the highest "F to enter." These 4 variables were "emergency care-q," "current health care plan-by-choice of doctors," "time to make appointment-q," and "convenience/paperwork-q." As in the first iteration, we dropped the interaction "current health care plan-by-choice of doctors" based on significance tests and lack of improvement in model fit. Therefore, the model after the second iteration contained all main effects and 6 quadratic effects, for a total of 34 variables.
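The residual-correlation screening that opens each iteration can be sketched as follows. The helper name is our own, and we assume the residuals and weights come from the current MNL fit as described in the text.

```python
import numpy as np

def screen_by_residual_correlation(candidates, resid, w, top=50):
    """Rank candidate columns by absolute weighted correlation with the
    residuals of the current model, keeping the `top` best (a sketch).

    candidates : dict of name -> (n,) candidate column
    resid      : (n,) residuals from the current MNL model
    w          : (n,) weights (e.g., expected frequencies)
    """
    def wcorr(a, b):
        # Weighted Pearson correlation about the weighted means.
        aw = a - np.average(a, weights=w)
        bw = b - np.average(b, weights=w)
        return (w * aw * bw).sum() / np.sqrt((w * aw**2).sum() * (w * bw**2).sum())

    r = {name: abs(wcorr(c, resid)) for name, c in candidates.items()}
    return sorted(r, key=r.get, reverse=True)[:top]
```

Because this step uses simple rather than partial correlations, it is cheap enough to run over the full candidate pool at every iteration.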

Result of Third Iteration

Proceeding as in the previous iterations, we chose the first 4 variables from the stepwise regression for final consideration. These were "current health care plan-by-choice of doctors," "hospitalization-q," "current health care plan-by-premium," and "preventive care-q." Again, based on the tests used in the previous iterations, we dropped the variable "current health care plan-by-choice of doctors" and retained the remaining variables in the model. Thus, at this stage the model contained all 28 main effects, 8 quadratic effects, and 1 attribute-by-demographic interaction effect. We also dropped the variable "current health care plan-by-choice of doctors" from the list of potential candidate variables for all subsequent iterations. We could have dropped this variable from the list of potential variables after the first iteration. However, as a check, we decided to carry it through 3 iterations to show that it had no effect on subsequent models. In general, once a variable has been tested in the MNL model and shown to be nonsignificant, it may be dropped from the list of candidate variables.

Result of Fourth Iteration

In this step, the first 4 additional variables from the stepwise regression were "24-hour consultation by phone-q," "prescription coverage-q," "current health care plan-by-preventive care," and "current health care plan-by-choice of hospitals," respectively. In the re-estimated MNL model, the coefficients for the 2 demographic-by-attribute interaction variables were nonsignificant. A nested model test also failed to reject the hypothesis that these 2 coefficients were jointly zero. Similarly, the goodness-of-fit measure ρ² suggested that adding these 2 variables did not improve the model fit. Therefore, we retained "24-hour consultation by phone-q" and "prescription coverage-q," but dropped both interaction variables from further consideration. The model at this stage had all 28 main effects, 10 quadratic effects, and 1 interaction effect.

Result of Fifth and Final Iteration

The first 4 variables from the stepwise regression of this stage were "job-class-by-office visit coverage," "gender-by-premium," "premium-q," and "current health care plan-by-quality of hospitals." When added to the model, these new variables improved the log-likelihood and ρ². Significance tests indicated that the coefficients of the 4 new variables were different from zero, both individually and jointly. However, adding these 4 new variables resulted in nonsignificant t statistics for a number of variables already in the model. As argued earlier, we believe that in a controlled discrete choice experiment, main effects are at least as important (if not more so) as higher order effects. Hence, further investigation is needed before any new variable is added to the model. By deleting the new variables one at a time and jointly, we concluded that only the "gender-by-premium" variable could be added to the model at this stage without degrading main effects. Thus, the model after the fifth iteration contained all main effects, 10 quadratic effects, and 2 attribute-by-demographic interactions.

After the fifth iteration, no variable could be added to the model that satisfied the criteria outlined in step 5 of Figure 1. Thus, the final model was obtained in 5 iterations. The adjusted likelihood ratio index, ρ², was 15.1%, which was a modest but significant improvement over the main effects only model (ρ² = 13.9%). In this model, most effects (except 2 brand-specific dummies and 2 main effects) were significant, and all main effect coefficients were in the expected directions. The iteration results indicate that the coefficient estimates remained fairly stable (with no sign reversals) through each iteration. The final model also passed the RESET test. This test checked the adequacy of the model's functional form by adding higher powers (square, cube) of the predicted values based on variables already in the model. Had these additional terms been significant, they would have indicated an incorrect functional form for the variables in the model. In sum, all diagnostic checks indicated a satisfactory terminal model.
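The logic of a RESET check can be sketched for an ordinary linear model; the MNL version described above works analogously, adding powers of the fitted utilities and using a likelihood ratio test. The function name and the F-test form below are our own illustrative choices, not the authors' implementation.

```python
import numpy as np
from scipy import stats

def reset_test(X, y):
    """Ramsey-style RESET for a linear model (illustrative analogue of the
    MNL version in the text): augment the regression with the squares and
    cubes of the fitted values and F-test their joint significance.
    Returns the p-value; a small p-value flags a misspecified functional form."""
    n, p = X.shape
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    yhat = X @ b
    Xa = np.column_stack([X, yhat**2, yhat**3])   # add square and cube of fits
    ba, *_ = np.linalg.lstsq(Xa, y, rcond=None)
    rss0 = ((y - yhat) ** 2).sum()
    rss1 = ((y - Xa @ ba) ** 2).sum()
    F = ((rss0 - rss1) / 2) / (rss1 / (n - p - 2))
    return stats.f.sf(F, 2, n - p - 2)
```

When the fitted model omits curvature that is actually present, the added power terms absorb it and the test rejects.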

Although the primary purpose of this article is not to describe the substance of the original health care project, the final model resulting from the application of our technique contains several quadratic and attribute-by-demographic interaction effects that provide insights into consumers’ health insurance decisions. For in- stance, the sign and significance of the “current health care plan-by-premium” interaction indicates that the respondents enrolled in Blue Cross and Blue Shield plans are less price sensitive than respondents enrolled in other plans. This may reflect a brand loyalty effect, since a majority of Blue Cross and Blue Shield enrollees have been with this plan for many years. Similarly, the “gender-by- premium” interaction indicates that male respondents were more sensitive to price than female respondents. This indicates that females may be a more lucrative target market for health insurance providers.

Conclusion

We proposed a way to help researchers identify a subset of important variables from a large pool of candidate variables in a discrete choice experiment. Our iterative procedure involved an initial screening based on the correlations of the residuals from a current model with the candidate variables. This was followed by a weighted linear stepwise regression that approximates stepwise multinomial logistic regression. Our weighted regression screening procedure based on partial correlations shares the same theoretical underpinnings as the Lagrange multiplier test for omitted variables reported in the econometric literature (McFadden, 1987). Cooper and Nakanishi (1988) also recommend a similar screening procedure to identify potential cross-effects in modeling multinomial panel data. However, this latter procedure uses only a one-step analysis of the residual correlations. Thus, their suggestion is identical to our step 2 in Figure 1. The advantages of our procedure are as follows. First, we use ordinary correlations to quickly screen a large number of candidate variables, followed by partial correlations for a more precise approximation to stepwise multinomial logistic regression. Second, our procedure is iterative and has diagnostic checks built into each iteration stage, so that variables missed at one stage can enter in subsequent stages. Thus, our procedure is economical and more flexible than currently available screening methods for modeling choice. When applied to a large database from a choice experiment, our method proved effective both in identifying appropriate variables to include in the model and in extending our understanding of consumer decision behavior for health insurance.

Our procedure is not without limitations, because it shares several generic limitations possessed by any stepwise variable selection technique. Most notable is the arbitrary nature of selecting one model out of a large number of models of roughly equal explanatory power. For instance, the stepwise procedures commonly used for large regression problems suffer from this limitation. This issue is philosophical to the extent that model building requires both analytic skills and a theoretical foundation on the part of the researcher. The diagnostic checks proposed in our method will help researchers avoid statistical mistakes, but there would undoubtedly be situations in which a researcher may need to rely on theory in judging among candidate models. A skilled researcher might propose a good base model on a priori grounds and use our screening technique "on the margin" to fine-tune that model by identifying useful omitted variables.

Finally, one inherent limitation unique to our method is that we approximate the exact stepwise solution of the MNL model by using a weighted stepwise regression (step 3) to retain the flexibility of including demographic-by-attribute interactions in the pool of candidate variables (see Technical Appendix). However, this issue is less relevant because the final model estimation (step 4) is based on maximum likelihood without any approximation. Another weakness is the use of simple rather than partial correlations in the initial screening phase (step 2). This may cause a useful omitted variable to be dropped in the screening phase if that variable is somewhat collinear with the variables already in the model. The iterative nature of our procedure allows important omitted variables to be picked up in subsequent iteration cycles, but this should be investigated in future research with a Monte Carlo simulation varying the degree of intercorrelation among the set of candidate variables.

We thank Joel Horowitz for his comments on this manuscript and specifically for the suggestion of conducting the RESET test on MNL models. Thanks are also extended to two anonymous reviewers and especially to Jordan Louviere for their helpful comments. Thanks to Don Anderson for developing the experimental design used in the application.

Technical Appendix

Fitting Multinomial Logistic Models via Iteratively Reweighted Least Squares (IRLS)

Two well known ideas in mathematical statistics, combined with a variable centering technique, produce an IRLS algorithm for computing maximum likelihood estimators of the parameters of multinomial logistic choice models. This algorithm is easy to implement, and variants of it are used in software for fitting MNL models.

Our explanation of the method involves imbedding the multinomial likelihood function in a more general, but simpler, Poisson likelihood function. A complex maximization involving relatively few parameters is thereby replaced by a simpler one involving a large number of "slack" parameters. Having simplified the likelihood function, we then use the IRLS algorithm to find its maximum (Jennrich and Moore, 1975).

For example, suppose that subjects are presented with 4 choice tasks, each requiring them to choose 1 of 3 alternatives: a0, a1, and a2. To fit a main effects model, we use completely aggregated data. We use the symbols (f0i, f1i, f2i) to denote the choice frequencies for the ith choice set. These are absolute, not relative, frequencies and may be disaggregated completely or into segments using sociodemographic variables. For the purpose of parameter estimation, the vectors of choice frequencies are modeled as independent; however, this is unlikely to be true for completely aggregated data. Under this assumption, (f0i, f1i, f2i) is multinomial with probabilities P0i, P1i, and P2i, where Pai = exp(Vai)/Σa exp(Vai), Vai = xai′β, xai is a vector of attribute variables associated with alternative "a" in choice set "i," and β is the vector of attribute weights (see equation (1)).
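Under these definitions, the choice probabilities for a single choice set can be computed directly. This is a minimal sketch: the function name is illustrative, and the max-subtraction is a standard numerical safeguard not discussed in the text.

```python
import numpy as np

def mnl_probs(X_i, beta):
    """Choice probabilities for one choice set:
    P_ai = exp(V_ai) / sum_a exp(V_ai), with V_ai = x_ai' beta.
    X_i has one row of attribute values per alternative."""
    V = X_i @ beta
    e = np.exp(V - V.max())   # subtract the max utility for numerical stability
    return e / e.sum()
```

An alternative with a higher systematic utility V receives a proportionally higher choice probability, and the probabilities always sum to one within the choice set.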


Let us say that there are n subjects presented with the task of choosing product A, product B, or neither in 4 choice sets. The price of each product is either high or low and each product either has or does not have a particular feature. The data layout might look like this:

                     Attributes
               Price        Feature        Choice Frequencies
Choice Set    A     B      A      B       None     A     B
    1         Lo    Hi     No    Yes
    2         Lo    Hi     Yes   No
    3         Hi    Lo     No    Yes
    4         Hi    Lo     Yes   No

We want to model choice probabilities with brand-specific intercepts, feature effects, and a generic price effect. That is, if we code price as -1/+1 (Lo/Hi) and feature as 1/0 (Present/Absent), the model (dropping the choice set subscript) is:

p(A) = exp(VA)/[1 + exp(VA) + exp(VB)];  VA = αA + π*PRICEA + φA*FEATUREA

p(B) = exp(VB)/[1 + exp(VA) + exp(VB)];  VB = αB + π*PRICEB + φB*FEATUREB

where the utility of the "neither" alternative is fixed at zero, αA and αB are brand-specific intercepts, π is the generic price effect, and φA and φB are brand-specific feature effects. To fit this model by iteratively reweighted least squares, we create a data matrix of independent variables.

                               Independent Variables      Choice Set Dummy
                               Brand   Price  Features    Variables (within segment)
Choice Set   ALT   FRQ         x1  x2    x3    x4  x5     s1   s2   s3   s4
    1         0    f01          0   0     0     0   0      1    0    0    0
    1         1    f11          1   0    -1     0   0      1    0    0    0
    1         2    f21          0   1     1     0   1      1    0    0    0
    2         0    f02          0   0     0     0   0      0    1    0    0
    2         1    f12          1   0    -1     1   0      0    1    0    0
    2         2    f22          0   1     1     0   0      0    1    0    0
    3         0    f03          0   0     0     0   0      0    0    1    0
    3         1    f13          1   0     1     0   0      0    0    1    0
    3         2    f23          0   1    -1     0   1      0    0    1    0
    4         0    f04          0   0     0     0   0      0    0    0    1
    4         1    f14          1   0     1     1   0      0    0    0    1
    4         2    f24          0   1    -1     0   0      0    0    0    1

Parameters:                    αA  αB     π    φA  φB     γ1   γ2   γ3   γ4

The estimation algorithm then consists of repeatedly running weighted linear regressions of a constructed dependent variable, y* (the working logit), on the independent variables. Let us collect the parameters to be estimated in the vector β′ = (β1, β2, β3, β4, β5) = (αA, αB, π, φA, φB). The β vector is initially set equal to (0,0,0,0,0), and is repeatedly updated as follows:

Reweight

Using the current value of β̂, compute:

I. The probabilities, P̂ai(β̂):

P̂0i = exp(V̂0i)/[exp(V̂0i) + exp(V̂1i) + exp(V̂2i)], where V̂0i = 0.

P̂1i = exp(V̂1i)/[exp(V̂0i) + exp(V̂1i) + exp(V̂2i)], where V̂1i = β̂1x1 + β̂3x3 + β̂4x4, the x's taken from the alternative-1 row of choice set i.

P̂2i = exp(V̂2i)/[exp(V̂0i) + exp(V̂1i) + exp(V̂2i)], where V̂2i = β̂2x2 + β̂3x3 + β̂5x5, the x's taken from the alternative-2 row of choice set i.

II. The expected frequencies f̂ai = ni*P̂ai.

III. The residuals eai = (fai - f̂ai)/f̂ai, and

IV. The working logits y*ai = log(f̂ai) + eai.

Estimate β

Center all the variables x1, ..., x5 and y* at their f̂ai-weighted averages within choice sets in each segment, and regress centered y* on centered x1, ..., x5. The resulting regression coefficients are the updated values of β.
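The Reweight / Estimate-β cycle above can be sketched in code. This is a minimal sketch assuming a single aggregated segment; the function name, argument layout, and the max-shift inside the softmax are our own choices, not the original software.

```python
import numpy as np

def irls_mnl(X, freq, choice_set, n_iter=25):
    """Fit an MNL model by IRLS with within-choice-set centering (sketch).

    X          : (rows, k) attribute matrix, one row per alternative per choice set
    freq       : (rows,) observed absolute choice frequencies
    choice_set : (rows,) integer labels grouping rows into choice sets
    """
    sets = np.unique(choice_set)
    beta = np.zeros(X.shape[1])              # start at beta = (0, ..., 0)
    for _ in range(n_iter):
        V = X @ beta
        P = np.empty_like(V)
        for s in sets:                       # I. probabilities within each choice set
            m = choice_set == s
            e = np.exp(V[m] - V[m].max())    # max-shift for numerical stability
            P[m] = e / e.sum()
        n_i = np.array([freq[choice_set == s].sum() for s in sets])
        n_row = n_i[np.searchsorted(sets, choice_set)]
        fhat = n_row * P                                # II. expected frequencies
        ystar = np.log(fhat) + (freq - fhat) / fhat     # III.-IV. working logits
        # Estimate beta: center X and y* at their fhat-weighted means within
        # each choice set, then run the weighted regression (equation (A1)).
        Xc, yc = X.astype(float).copy(), ystar.copy()
        for s in sets:
            m = choice_set == s
            w = fhat[m] / fhat[m].sum()
            Xc[m] -= w @ Xc[m]
            yc[m] -= w @ ystar[m]
        XtW = (Xc * fhat[:, None]).T
        beta = np.linalg.solve(XtW @ Xc, XtW @ yc)
    return beta
```

Because the centering absorbs the choice-set slack parameters, each pass is an ordinary weighted regression, yet the iteration converges to the multinomial maximum likelihood estimate.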

Why Does It Work?

The multinomial log likelihood function is:

L(β) = Σi Σa fai log(Pai(β)).

Setting to zero the partial derivative with respect to β, we get the maximum likelihood estimating equation:

Σi Σa xai [fai - ni Pai(β)] = 0,

where ni = f0i + f1i + f2i. Now let us treat the frequencies, fai, as independent Poisson variates with means λai, where log(λai) = xai′β + γi, and γi = log(ni) - log(Σa exp(xai′β)) is treated as a free parameter. Note that the conditional distribution of f0i, f1i, f2i given the sampling constraint, f0i + f1i + f2i = ni, is multinomial with probabilities Pai given above. The role of the "slack" parameters γi is to ensure that the sampling constraint is satisfied by the fitted model. The Poisson log likelihood is:

L(β,γ) = Σi Σa [fai log(λai) - λai]

so that the joint estimating equations for β and γ are:

Σi [ni - exp(γi) Σa exp(xai′β)] = 0

and:

Σi Σa xai [fai - exp(γi + xai′β)] = 0.

Solving the first equation for γ and substituting the result into the second equation yields an equation identical to the multinomial estimating equation for β; consequently, the multinomial and Poisson likelihoods yield the same estimate of β.
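This equivalence can be checked numerically: with the slack parameters γi profiled out of the Poisson likelihood, the two score vectors coincide term by term. The helper below is our own illustration, written in the notation of the text.

```python
import numpy as np

def scores(X, freq, choice_set, beta):
    """Return (multinomial score, Poisson score with gamma_i profiled out)
    evaluated at beta, for aggregated choice data (illustrative)."""
    V = X @ beta
    s_mnl = np.zeros(len(beta))
    s_pois = np.zeros(len(beta))
    for s in np.unique(choice_set):
        m = choice_set == s
        n_i = freq[m].sum()
        P = np.exp(V[m]) / np.exp(V[m]).sum()
        # Profiled slack parameter from the first estimating equation:
        gamma = np.log(n_i) - np.log(np.exp(V[m]).sum())
        lam = np.exp(gamma + V[m])                 # Poisson means = n_i * P_ai
        s_mnl += X[m].T @ (freq[m] - n_i * P)      # multinomial score
        s_pois += X[m].T @ (freq[m] - lam)         # Poisson score
    return s_mnl, s_pois
```

Since exp(γi + Vai) reduces to ni Pai once γi is profiled out, the two scores agree exactly, which is the substance of the equivalence argument.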

It can also be shown that the β submatrix of the inverse of the Poisson information matrix for β and γ equals the inverse of the multinomial information matrix for β, so that the multinomial and Poisson models produce the same asymptotic standard errors for β. Jennrich and Moore (1975, section 3.3) failed to get this result because they did not introduce a slack parameter to handle the fixed-N sampling constraint.

The use of iteratively reweighted least squares to compute maximum likelihood estimates in the log-linear Poisson model is explained in Jennrich and Moore (1975). Briefly, the method is as follows.

Write log(λai) = zai′θ, where zai is xai augmented with the choice set dummy variables and θ is β augmented with the slack parameters, γ. The estimating equation for θ is:

Σi Σa zai [fai - λai(θ)] = 0.

Expanding this around an arbitrary point, θ0, we get:

Σi Σa zai [fai - λai(θ0)] ≈ Σi Σa λai(θ0) zai zai′ (θ - θ0).

Letting y*ai = zai′θ0 + (fai - λai(θ0))/λai(θ0), we have:

θ̂ ≈ [Σi Σa λai(θ0) zai zai′]⁻¹ [Σi Σa λai(θ0) zai y*ai] = [Z′ΛZ]⁻¹ [Z′ΛY*].

That is, θ̂ is approximately the vector of regression coefficients in the Λ-weighted regression of y* on the z variables.

A practical impediment to the use of this method is the large number of dummy variables corresponding to the slack parameters, one per choice set per segment. If the data are disaggregated into segments, there must be one dummy variable for each choice set within each segment. This is likely to create an intractability problem. To overcome this problem, we center y* and the x-variables at their Λ-weighted means within each choice set in each segment; these centered variables are then orthogonal to the dummy variables corresponding to the γ parameters. Consequently, the updating equation for β becomes:

@ = [Xc’ A,,Xc]-‘[Xc’ A,, Y,*] (Al)

where X, and Y,* have been centered; however, the residual degrees of freedom will be incorrect.

This is the inspiration for our screening procedure. Suppose that Z is a vector of candidate variables (e.g., brand-by-gender interaction) not included in the model. We would disaggregate the data matrix by gender, so that the first 3 rows of the data matrix become:


    Gender        Choice              Indep. Vars.       Choice Set   Brand by Gender
(1 = M, -1 = F)    Set   Alt   Freq   x1 x2 x3 x4 x5     Dummy        Interaction
                                                         s11  s12      Z1   Z2
      1             1     0    f011    0  0  0  0  0      1    0        0    0
      1             1     1    f111    1  0 -1  0  0      1    0        1    0
      1             1     2    f211    0  1  1  0  1      1    0        0    1
     -1             1     0    f012    0  0  0  0  0      0    1        0    0
     -1             1     1    f112    1  0 -1  0  0      0    1       -1    0
     -1             1     2    f212    0  1  1  0  1      0    1        0   -1

The screening step consists of fitting the main effects model excluding variables Z1 and Z2. That is, we iterate equation (A1) to convergence. It is immaterial whether aggregated or gender-disaggregated data are used; the estimate will be the same. To add the gender-by-brand interaction to the equation, we must disaggregate the data, append Z1 and Z2, appropriately centered, to the Xc matrix, and reiterate equation (A1) to convergence. Our screening method, in effect, refits equation (A1) but, in the interest of speed, does not reiterate. The significance levels obtained when weighted linear regression software is used to fit (A1) are incorrect, since the residual degrees of freedom include the degrees of freedom that ought to be deducted for the omitted choice set dummy variables.

The proposed screening method described in this article is an approximation to what would be an exact stepwise method for choosing variables in an MNL model. It is approximate for 2 reasons. First, the exact procedure requires iteratively reweighting the least squares fit produced by the weighted linear regression underlying each step in the stepwise regression. Second, the exact procedure requires the introduction of dummy variables (whether explicitly estimated or "absorbed") for each choice set within each segment. While it is feasible to introduce these dummy variables when working with highly aggregated data (say, 8 segments and 16 choice sets), it is not feasible when the data are highly disaggregated by the inclusion of demographic interactions in the model. Thus, we trade off accuracy in our ability to identify potential effects for the flexibility of including all types of effects in the list of candidate variables.

References

Amemiya, Takeshi, Qualitative Response Models: A Survey. Journal of Economic Literature 19 (1981): 1483-1536.

Ben-Akiva, Moshe, and Lerman, Steven R. Discrete Choice Analysis: Theory and Application to Travel Demand, The MIT Press, Cambridge, MA. 1985.

Buckley, Patrick G. Nested Multinomial Logit Analysis of Scanner Data for a Hierarchical Choice Model. Journal of Business Research 17 (September 1988): 133-154.

Cooper, Lee G., and Nakanishi, Masao. Market-Share Analysis: Evaluating Competitive Marketing Effectiveness, Kluwer Academic Publishers, MA. 1988.

Crowder, M. Maximum Likelihood Estimation for Dependent Observations. Journal of the Royal Statistical Society B 38 (1976).

Currim, Imran S. Predictive Testing of Consumer Choice Models Not Subject to Independence of Irrelevant Alternatives. Journal of Marketing Research 19 (May 1982): 208-222.

Draper, Norman R., and Smith, Harry. Applied Regression Analysis, John Wiley & Sons, Inc., New York. 1981.

Engle, Robert. Wald, Likelihood Ratio and Lagrange Multiplier Tests in Econometrics, in Handbook of Econometrics. Z. Griliches and M. Intriligator, eds., North-Holland, Amsterdam. 1984, Vol. 2.

Green, Paul E., and Srinivasan, V. Conjoint Analysis in Consumer Research: Issues and Outlook. Journal of Consumer Research 5 (September 1978): 103-123.

Horowitz, Joel. Statistical Comparison of Non-Nested Probabilistic Discrete Choice Models. Transportation Science 17 (1982): 319-350.

Jennrich, R. I., and Moore, R. H. Maximum Likelihood Estimation by Means of Nonlinear Least Squares, ASA Proceedings of the Statistical Computing Section, 1975: 57-65.

Louviere, Jordan J., and Woodworth, George. Design and Analysis of Simulated Consumer Choice or Allocation Experiments: An Approach Based on Aggregate Data. Journal of Marketing Research 20 (November 1983): 350-367.

Louviere, Jordan J. Analyzing Decision Making: Metric Conjoint Analysis, Sage Publications, Inc., CA. 1988.

Luce, R. Duncan. Individual Choice Behavior, Wiley, New York. 1959.

Luce, R. Duncan. The Choice Axiom After Twenty Years. Journal of Mathematical Psychology 3 (1977): 215-33.

McFadden, Daniel. Econometric Models of Probabilistic Choice, in Structural Analysis of Discrete Data with Econometric Applications. C. F. Manski and D. McFadden, eds., MIT Press, Cambridge, MA. 1981. Pp. 198-272.

McFadden, Daniel. The Choice Theory Approach to Market Research. Marketing Science 5 (Fall 1986): 275-297.

McFadden, Daniel. Regression-Based Specification Tests for the Multinomial Logit Model. Journal of Econometrics 34 (1987): 63-82.

Ramsey, J. B. Tests for Specification Errors in Classical Linear Least-Squares Regression Analysis. Journal of the Royal Statistical Society Series B 31 (1969): 350-371.

Sen, Subrata K. Introduction to the Special Issue on Consumer Choice Models. Marketing Science 5 (Fall 1986).

Thursby, Jerry G. A Comparison of Several Specification Error Tests for a General Alternative. International Economic Review 30 (February 1989): 217-230.