

Measuring Information Systems Service Quality: Concerns on the Use of the SERVQUAL Questionnaire [1]

By: Thomas P. Van Dyke
Department of Information Systems and Technologies
College of Business and Economics
Weber State University
3804 University Circle
Ogden, Utah 84408-3804
U.S.A.
[email protected]

Leon A. Kappelman
Business Computer Information Systems Department
College of Business Administration
University of North Texas
Denton, Texas 76203-3677
U.S.A.
[email protected]

Victor R. Prybutok
Business Computer Information Systems Department
College of Business Administration
University of North Texas
Denton, Texas 76203-3677
U.S.A.

[1] Robert Zmud was the accepting senior editor for this paper.

Abstract

A recent MIS Quarterly article rightfully points out that service is an important part of the role of the information systems (IS) department and that most IS assessment measures have a product orientation (Pitt et al. 1995). The article went on to suggest the use of an IS-context-modified version of the SERVQUAL instrument to assess the quality of the services supplied by an information services provider (Parasuraman et al. 1985, 1988, 1991).[2] However, a number of problems with the SERVQUAL instrument have been discussed in the literature (e.g., Babakus and Boller 1992; Carman 1990; Cronin and Taylor 1992, 1994; Teas 1993). This article reviews that literature and discusses some of the implications for measuring service quality in the information systems context. Findings indicate that SERVQUAL suffers from a number of conceptual and empirical difficulties. Conceptual difficulties include the operationalization of perceived service quality as a difference or gap score, the ambiguity of the expectations construct, and the unsuitability of using a single measure of service quality across different industries. Empirical problems, which may be linked to the use of difference scores, include reduced reliability, poor convergent validity, and poor predictive validity. This suggests that (1) some alternative to difference scores is preferable and should be utilized; (2) if used, caution should be exercised in the interpretation of IS-SERVQUAL difference scores; and (3) further work is needed in the development of measures for assessing the quality of IS services.

Keywords: IS management, evaluation, measurement, service quality, user attitudes, user expectations

ISRL Categories: AI0104, AI0109, A104, E10206.03, GB02, GB07

[2] The terms "information services provider" or "IS services provider" are used to refer to any provider of information systems services. This includes the information systems function within an organization as well as external vendors of information systems products and/or services.


Research Note

Introduction

Due to the growth of outsourcing, end-user-controlled information assets, joint ventures, and other alternative mechanisms by which organizations are meeting their need for information systems services, IS managers are increasingly concerned about improving the perceived (as well as actual) service quality of the IS function (Kettinger and Lee 1994). In recent years, the use of SERVQUAL-based instruments has become increasingly popular with information systems (IS) researchers. However, a review of the literature suggests that the use of such instruments may result in a number of measurement problems.

A recent article makes several important contributions to the assessment and evaluation of the effectiveness of information systems (IS) departments in organizations (Pitt et al. 1995). The article:

1. Points out that, although service is an important part of the role of the IS department, most IS assessment measures have a product orientation.

2. Proposes an extension of the categorization of IS success measures (DeLone and McLean 1992) to include service quality.

3. Proposes the use of the SERVQUAL instrument from marketing (Parasuraman et al. 1985, 1988, 1991) to operationalize the IS service quality construct and modifies the wording of the instrument to better accommodate its use in an IS context.

4. Adapts and augments a theory regarding the determinants of service quality expectations to an IS context and offers ideas for future research.

A number of studies, however, identify potential difficulties with the SERVQUAL instrument (e.g., Babakus and Boller 1992; Carman 1990; Cronin and Taylor 1992, 1994; Teas 1993, 1994). This research note reviews some of the literature regarding difficulties with the SERVQUAL instrument in general and examines the implications of these difficulties for the use of the instrument in an IS context.

The SERVQUAL Instrument: Problems Identified in the Literature

The difficulties with the SERVQUAL instrument identified in the literature can be grouped into two main categories: (1) conceptual and (2) empirical, although the boundary between them blurs because they are closely interrelated. The conceptual problems center around (1) the use of two separate instruments, one for each of two constructs (i.e., perceptions and expectations), to operationalize a third conceptually distinct construct (i.e., perceived service quality) that is itself the result of a complex psychological process; (2) the ambiguity of the expectations construct; and (3) the suitability of using a single instrument to measure service quality across different industries (i.e., content validity). The empirical problems are, by and large, the result of these conceptual difficulties, most notably the use of difference scores, in conjunction with the atheoretical nature of the process used in the construction of the original five dimensions of service quality. The empirical difficulties most often attributed to the SERVQUAL instrument include low reliability, unstable dimensionality, and poor convergent validity. A review of these conceptual and empirical problems should serve to caution those who wish to use SERVQUAL to measure the service quality of an information systems provider.

Conceptual difficulties with SERVQUAL

Subtraction as a "Simulation" of a Psychological Process

Many of the difficulties associated with the SERVQUAL instrument stem from the operationalization of a service quality construct that is theoretically grounded in a discrepancy or gap model. In conceptualizing service quality, Parasuraman et al. (1985, 1988, 1991, 1994b) use the "service quality model," which posits that one's perception of service quality is the result of an evaluation process whereby "the customer compares ... the perceived service against the expected service" (Grönroos 1984, p. 37).

Rather than develop an instrument to directly measure the perception of service quality that is the outcome of this cognitive evaluation process, the SERVQUAL instrument (Parasuraman et al. 1988, 1991) separately measures the expected level of service and the experienced level of service. Service quality scores are then calculated as the difference between these two measures. These three sets of scores are commonly referred to as expectation (E), perception (P), and SERVQUAL (whereby SERVQUAL = P - E). Although not without precedent, and certainly worthy of fair empirical evaluation, the implicit assumption that subtraction accurately portrays this cognitive process seems overly simplistic. Even if one fully accepts this discrepancy model of experiences vis-à-vis expectations as indicative of the general process whereby one arrives at an evaluation of a service experience, the notion that the specific mechanism is merely subtraction does not logically follow. The use of differences was, and remains, an operational decision. Regrettably, it does not appear to have been a particularly good one. The direct measurement of one's perception of service quality that is the outcome of this cognitive evaluation process seems more likely to yield a valid and reliable outcome. If the discrepancy is what one wants to measure, then one should measure it directly.
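To make the gap-scoring mechanics concrete, here is a minimal sketch in Python of how paired expectation (E) and perception (P) ratings are turned into P - E difference scores. The item values are hypothetical illustrations, not actual SERVQUAL data, and the computation shown is simply the subtraction described above, not an endorsement of it.

```python
# Minimal sketch of SERVQUAL-style gap scoring (hypothetical ratings).
# Each item is rated twice on a seven-point scale: once for the expected
# level of service (E) and once for the perceived level of service (P).

expectations = [7, 6, 7, 5, 6]   # E ratings (hypothetical)
perceptions  = [5, 6, 4, 6, 5]   # P ratings (hypothetical)

# The SERVQUAL score for each item is simply P - E.
gap_scores = [p - e for p, e in zip(perceptions, expectations)]

print(gap_scores)                           # e.g. [-2, 0, -3, 1, -1]
print(sum(gap_scores) / len(gap_scores))    # mean gap across the items
```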

Ambiguity of the "Expectations" Construct

Teas (1994) notes that SERVQUAL expectations are variously defined as desires, wants, what a service provider should possess, normative expectations, ideal standards, desired service, and the level of service a customer hopes to receive (e.g., Parasuraman et al. 1985, 1988, 1991; Zeithaml et al. 1993). These multiple definitions and corresponding operationalizations of "expectations" in the SERVQUAL literature result in a concept that is loosely defined and open to multiple interpretations (Teas). Yet even when concise definitions are provided, various interpretations of the expectations construct can result in potentially serious measurement validity problems.

The conceptualization of "expectations" consistent with the SERVQUAL model is the vector attribute interpretation, "that is one on which a customer's ideal point is at an infinite level" (Parasuraman et al. 1994, p. 116). Unfortunately, as the proportion of extreme responses (e.g., seven on a seven-point scale) increases, the expectation scores become less useful, as an increasing proportion of the variation in difference-based SERVQUAL scores is due only to changes in perception scores.

Teas (1993) found three different interpretations of "expectations" derived from an analysis of follow-up questions to an administration of the SERVQUAL questionnaire. One interpretation of expectations is as a forecast or prediction. The forecast interpretation of expectations cannot be discriminated from the disconfirmed expectations model of consumer satisfaction (Oliver 1980). This interpretation is inconsistent with the definition of service quality put forth by Parasuraman et al. (1988) and results in a discriminant validity problem with respect to consumer satisfaction. A second interpretation of expectations is as a measure of attribute importance. When respondents use this interpretation, the resulting perception-minus-expectation scores exhibit an inverse relationship between attribute importance and perceived service quality, all other things being equal.

The third interpretation identified by Teas (1993) is the "classic ideal point" concept. Parasuraman et al. (1994) describe this when they note that "the P-E [i.e., perceptions-minus-expectations] specification could be problematic when a service attribute is a classic ideal point attribute, that is one on which a customer's ideal point is at a finite level and therefore, performance beyond which will displease the customer (e.g., friendliness of salesperson in a retail store)" (p. 116). This interpretation of expectations results in an inverse relationship between the SERVQUAL score, calculated as perception minus expectation (P - E), and actual service quality for all values where perception scores are greater than expectation scores (i.e., P > E). This interpretation is consistent with the finding that user satisfaction scores were highest when actual user participation was in congruence with the user's need for participation, rather than merely maximized (Doll and Torkzadeh 1989).

These various interpretations of the "expectation" construct lead to a number of measurement problems. The findings suggest that a considerable portion of the variance in the SERVQUAL instrument is the result of measurement error induced by respondents' varying interpretations of the "expectations" construct (Teas 1993).

Three separate types of expectations have been described (Boulding et al. 1993): (1) the will expectation, what the customer believes will happen in their next service encounter; (2) the should expectation, what the customer believes should happen in the next service encounter; and (3) an ideal expectation, what a customer wants in an ideal sense. The ideal interpretation of expectation is often used in the SERVQUAL literature (Boulding et al. 1993). Boulding et al. (1993) differentiate between should and ideal expectations by stating that what customers think should happen may change as a result of what they have been told to expect by the service provider, as well as what the consumer views as reasonable and feasible based on what they have been told and their experience with the firm or a competitor's service. In contrast, an ideal expectation may "be unrelated to what is reasonable/feasible and/or what the service provider tells the customer to expect" (Boulding et al. 1993, p. 9). A series of experiments demonstrated results that were incompatible with the gap model of service quality (Boulding et al. 1993). Instead, the results demonstrate that service quality is influenced only by perceptions. Moreover, the results indicate that perceptions are influenced by both will and should expectations, but in opposite directions. Increasing will expectations leads to a higher perception of service quality, whereas an increasing expectation of what should be delivered during a service encounter will actually decrease the ultimate perception of the quality of the service provided (Boulding et al. 1993). Not only do these findings fail to support the gap model of service quality, but these results also demonstrate the wildly varying impact of different interpretations of the expectations construct.

Different methods to operationalize "expectations" have been used in developing the IS versions of SERVQUAL (Pitt et al. 1995; Kettinger and Lee 1994). One study used the instructions to the survey to urge the respondents to "think about the kind of IS unit that would deliver excellent quality of service" (Pitt et al. 1995). The items then take a form such as E1: "They will have up-to-date hardware and software." The second study (Kettinger and Lee 1994) used the form E1: "Excellent college computing services will have up-to-date equipment."

Recall that some respondents to SERVQUAL were found to interpret expectations as forecasts or predictions (Teas 1993). This interpretation corresponds closely with the will expectation (Boulding et al. 1993). It is easy to see how this interpretation might be formed, especially with the "They will" phrasing (Pitt et al. 1995). Unfortunately, the impact of the will expectation on perceptions of service quality is opposite from that intended by the SERVQUAL authors and the (P-E) or gap model of service quality (Boulding et al. 1993).

In summary, a review of the literature indicates that respondents to SERVQUAL may have numerous interpretations of the expectations construct and that these various interpretations have different and even opposite impacts on perceptions of service quality. Moreover, some of the findings demonstrate that expectations influence only perceptions and that perceptions alone directly influence overall service quality (Boulding et al. 1993). These findings fail to support the (P-E) gap model of service quality and indicate that the use of the expectations construct as operationalized by SERVQUAL-based instruments is problematic.

Applicability of SERVQUAL Across Industries

Another often mentioned conceptual problem with SERVQUAL concerns the applicability of a single instrument for measuring service quality across different industries. Several researchers have articulated their concerns on this issue. A study of SERVQUAL across four different industries found it necessary to add as many as 13 additional items to the instrument in order to adequately capture the service quality construct in various settings, while at the same time dropping as many as 14 items from the original instrument based on the results of factor analysis (Carman 1990). The conclusion arrived at was that considerable customization was required to accommodate differences in service settings. Another study attempted to utilize SERVQUAL in the banking industry (Brown et al. 1993). The authors were struck by the omission of items which they thought a priori would be critical to subjects' evaluation of service quality. They concluded that it takes more than simple adaptation of the SERVQUAL items to effectively address service quality across diverse settings. A study of service quality for the retail sector also concluded that utilizing a single measure of service quality across industries is not feasible (Dabholkar et al. 1996).

Researchers of service quality in the information systems context appear to lack consensus on this issue. Pitt et al. (1995) state that they could not discern any unique features of IS that make the standard SERVQUAL dimensions inappropriate, nor could they discern any dimensions with some meaning of service quality in the IS domain that had been excluded from SERVQUAL. Kettinger and Lee (1994), however, found that SERVQUAL should be used as a supplement to the UIS (Baroudi and Orlikowski 1988) because that instrument also contains items that are important determinants of IS service quality. Their findings suggest that neither the UIS nor SERVQUAL alone can capture all of the factors which contribute to perceived service quality in the IS domain. For example, items contained in the UIS include the degree of training provided to users by the IS staff, the level of communication between the users and the IS staff, and the time required for new systems development and implementation, all of which possess strong face validity as determinants of IS service quality. In addition, Kettinger and Lee dropped the entire tangibles dimension from their IS version of SERVQUAL based on the results of confirmatory factor analysis. These findings contradict the belief that all dimensions of SERVQUAL are relevant and that there are no unique features of the IS domain not included in the standard SERVQUAL instrument (Pitt et al. 1995). It is difficult to argue that items concerning the manner of dress of IS employees and the visual attractiveness of IS facilities (i.e., tangibles) should be retained as important factors in the IS domain while issues such as training, communication, and time to complete new systems development are excluded. We agree that using a single measure of service quality across industries is not feasible (Dabholkar et al. 1996) and therefore future research should involve the development of industry-specific measures of service quality.

Empirical difficulties with the SERVQUAL instrument

A difference score is created by subtracting the measure of one construct from the measure of another in an attempt to create a measure of a third distinct construct. For example, in scoring the SERVQUAL instrument, an expectation score is subtracted from a perception score to create such a "gap" measure of service quality. Even if one assumes that the discrepancy theory is correct and that these are the only (or at least, the last) two inputs into this cognitive process, it still raises the question: Can calculated difference scores operationalize the outcome of a cognitive discrepancy? It appears that several problems with the use of difference scores make them a poor measure of psychological constructs (e.g., Edwards 1995; Johns 1981; Lord 1958; Peter et al. 1993; Wall and Payne 1973). Among the difficulties related to the use of difference measures discussed in the literature are low reliability, unstable dimensionality, and poor predictive and convergent validities.

Reliability Problems With Difference Scores

Many studies demonstrate that Cronbach's (1951) alpha, a widely-used method of estimating instrument reliability, is inappropriate for difference scores (e.g., Cronbach and Furby 1970; Edwards 1995; Johns 1981; Lord 1958; Peter et al. 1993; Prakash and Lounsbury 1983; Wall and Payne 1973). This is because the reliability of a difference score is dependent on the reliability of the component scores and the correlation between them. The correct formula for calculating the reliability of a difference score (r_D) is:

$$ r_D = \frac{\sigma_1^2 r_{11} + \sigma_2^2 r_{22} - 2 r_{12} \sigma_1 \sigma_2}{\sigma_1^2 + \sigma_2^2 - 2 r_{12} \sigma_1 \sigma_2} $$

where r11 and r22 are the reliabilities of the two component scores, σ1² and σ2² are the variances of the component scores, and r12 is the correlation between the component scores (Johns 1981).

This formula shows that as the correlation of the component scores increases, the reliability of the difference scores is decreased. An example was provided where the reliability of the difference score formed by subtracting one component from another, with an average reliability of .70 and a correlation of .40, is only .50 (Johns 1981). Thus, while the average reliability of the two components is .70, which is considered acceptable (Pitt et al. 1995; cf. Nunnally 1978), the correlation between the components reduces the reliability of the difference score to a level that most researchers would consider unacceptable (Peter et al. 1993).
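As an arithmetic check, the following is a small Python sketch (the function name and inputs are ours, purely illustrative) that implements the Johns (1981) formula above and reproduces the example just cited, assuming equal component variances.

```python
def difference_score_reliability(r11, r22, var1, var2, r12):
    """Reliability of a difference score D = X1 - X2 (Johns 1981).

    r11, r22   -- reliabilities of the two component scores
    var1, var2 -- variances of the component scores
    r12        -- correlation between the component scores
    """
    sd1, sd2 = var1 ** 0.5, var2 ** 0.5
    numerator = var1 * r11 + var2 * r22 - 2 * r12 * sd1 * sd2
    denominator = var1 + var2 - 2 * r12 * sd1 * sd2
    return numerator / denominator

# Components with reliability .70 each and an intercorrelation of .40
# (equal variances assumed) yield a difference-score reliability of .50.
print(difference_score_reliability(0.70, 0.70, 1.0, 1.0, 0.40))  # ~0.50
```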

An example of the overestimation of reliability caused by the misuse of Cronbach's alpha can be found in the analysis of service quality for a computer manufacturer (Parasuraman et al. 1994a; see Table 1). Note that Cronbach's alpha consistently overestimates the actual reliability for the difference scores of each dimension (compare the two difference-score columns in Table 1). Also note that the use of the correct formula for calculating the reliability of a difference score has demonstrated that the actual reliabilities for the SERVQUAL dimensions may be as much as .10 lower than reported by researchers incorrectly using Cronbach's alpha. In addition, these findings show that the non-difference, direct response method results in consistently higher reliability scores than the (P-E) difference method of scoring.

These results have important implications for the IS-SERVQUAL instrument (Pitt et al. 1995).

Table 1. Reliability of SERVQUAL: The Misuse of Cronbach's Alpha

A Priori Dimensions | Cronbach's Alpha (Non-Difference) | Cronbach's Alpha (Difference) | Johns' Alpha for Differences (Difference)
Tangibles | .83 | .75 | .65
Reliability | .91 | .87 | .83
Responsiveness | .87 | .84 | .81
Assurance | .86 | .81 | .71
Empathy | .90 | .85 | .81

Note: Difference scores calculated as perception minus expectation (P - E).


In that study, Cronbach's alpha, which consistently overestimates the reliability of difference scores, was used incorrectly. Even when using the inflated alpha scores, Pitt et al. note that two of three reliability measures for the tangibles dimension fall below the 0.70 level required for commercial applications. Had they utilized the appropriate modified alpha, they may have concluded that the tangibles dimension is not reliable in the IS context, a finding which would have been consistent with the results of Kettinger and Lee (1994).

A review of the literature clearly indicates that by utilizing Cronbach's alpha, researchers tend to overestimate the reliabilities of difference scores, especially when the component scores are highly correlated; such is the case with the SERVQUAL instrument (Peter et al. 1993).

Predictive and Convergent Validity Issues With Difference Scores

Another problem with the SERVQUAL instrument concerns the poor predictive and convergent validities of the measure. Convergent validity is concerned with the extent to which multiple measures of the same construct agree with each other (Campbell and Fiske 1959). Predictive validity refers to the extent to which scores of one construct are empirically related to scores of other conceptually-related constructs (Bagozzi et al. 1992; Kappelman 1995; Parasuraman et al. 1991).

One study reported that perceptions-only SERVQUAL scores had higher correlations with an overall service quality measure (i.e., convergent measure) and with complaint resolution scores (i.e., the predictive measure) than did the perception-minus-expectation difference scores used with SERVQUAL (Babakus and Boller 1992). A different study performed regression analyses in which an overall single-question service quality rating was regressed separately on both difference scores (i.e., perception minus expectation) and perception-only scores (Parasuraman et al. 1991). The perception-only SERVQUAL scores produced higher adjusted r-squared values (ranging from .72 to .81) compared to the SERVQUAL difference scores (ranging from .51 to .71).
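The kind of regression comparison described above can be illustrated with a short Python sketch using NumPy and simulated data. The respondent ratings, sample size, and the assumption that overall quality tracks perceptions are all hypothetical; the point is only to show how adjusted r-squared values for perception-only and difference-score predictors would be compared, not to reproduce the cited studies' results.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 respondents rate 5 items for expectations (E)
# and perceptions (P) on a 1-7 scale.
n = 200
E = rng.integers(4, 8, size=(n, 5)).astype(float)   # expectations: 4-7
P = rng.integers(1, 8, size=(n, 5)).astype(float)   # perceptions: 1-7

# Hypothetical overall service-quality rating, assumed here to be driven
# mainly by perceptions (consistent with Boulding et al. 1993).
overall = P.mean(axis=1) + rng.normal(0, 0.5, size=n)

def adjusted_r2(X, y):
    """Adjusted R-squared from an ordinary least squares fit with intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    residuals = y - X1 @ beta
    ss_res = residuals @ residuals
    ss_tot = ((y - y.mean()) ** 2).sum()
    r2 = 1 - ss_res / ss_tot
    p = X.shape[1]
    return 1 - (1 - r2) * (len(y) - 1) / (len(y) - p - 1)

print("Perception-only adj. R^2:", round(adjusted_r2(P, overall), 3))
print("P - E gap       adj. R^2:", round(adjusted_r2(P - E, overall), 3))
```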

The predictive validity of difference scores, a non-difference direct response score, and the perceptions-only scores for SERVQUAL in the context of a financial institution have been compared (Brown et al. 1993). Correlation analysis was performed between the various scores and a three-item behavioral intentions scale. Behavioral intentions include such concepts as whether the customer would recommend the financial institution to a friend or whether they would consider the financial institution first when seeking new services. The results of the study show that both the perceptions-only (.31) and direct response (.32) formats demonstrated higher correlations with the behavioral intentions scale than did the traditional difference score (.26).

The superior predictive and convergent validity of perception-only scores was confirmed (Cronin and Taylor 1992). Those results indicated higher adjusted r-squared values for perception-only scores across four different industries. The perception component of the perception-minus-expectation score consistently performs better as a predictor of overall service quality than the difference score itself (Babakus and Boller 1992; Boulding et al. 1993; Cronin and Taylor 1992; Parasuraman et al. 1991).

Unstable Dimensionality of the SERVQUAL Instrument

The unstable nature of the factor structure for the SERVQUAL instrument may be related to the atheoretical process by which the original dimensions were defined. The SERVQUAL questionnaire is based on a multi-dimensional model (i.e., theory) of service quality. A ten-dimensional model of service quality, based on a review of the service quality literature and the extensive use of both executive and focus group interviews, was developed (Parasuraman et al. 1985).


Table 2. Unstable Dimensionality of SERVQUAL

Study | Instrument | Analysis | Factor Structure
Carman (1990) | Four modified SERVQUALs using 12-21 of the original items | Principal axis factor analysis with oblique rotation | Five to nine factors
Brensinger and Lambert (1990) | Original 22 items | Principal axis factor analysis with oblique rotation | Four factors with eigenvalues > 1
Parasuraman, Zeithaml, and Berry (1991) | Original 22 items | Principal axis factor analysis with oblique rotation | Five factors, but different from the a priori model; tangibles dimension split into two factors, while the responsiveness and assurance dimensions loaded on a single factor
Finn and Lamb (1991) | Original 22 items | LISREL confirmatory factor analysis | Five-factor model had poor fit
Babakus and Boller (1992) | Original 22 items | (1) Principal axis factor analysis with oblique rotation; (2) confirmatory factor analysis | (1) Five-factor model not supported; (2) two factors
Cronin and Taylor (1992) | Original 22 items | Principal axis factor analysis with oblique rotation | Unidimensional structure
*Van Dyke and Popelka (1993) | 19 of the original 22 items | Principal axis factor analysis with oblique rotation | Unidimensional structure
*Pitt, Watson, and Kavan (1995) | Original 22 items | Principal components and maximum likelihood with varimax rotation | (1) Financial institution: seven-factor model with tangibles and empathy each split into two; (2) consulting firm: five factors, none matching the original; (3) information systems service firm: three-factor model
*Kettinger and Lee (1994) | Original 22 items | LISREL confirmatory factor analysis | Four-factor model, tangibles dimension dropped
*Kettinger, Lee, and Lee (1995) | Original 22 items | Principal axis factor analysis with oblique rotation | (1) Korea: three-factor model, tangibles retained; (2) Hong Kong: four-factor model, tangibles retained

*Measured information systems service quality.


During instrument development, Parasuraman et al. (1988) began with 97 paired questions (i.e., one for expectation and one for perception). Items (i.e., question pairs) were first dropped on the basis of within-dimension Cronbach coefficient alphas, reducing the pool to 54 question pairs. More items were then dropped or reassigned based on oblique-rotation factor loadings and within-dimension Cronbach coefficient alphas, resulting in a 34 paired-item instrument with a proposed seven-dimensional structure. A second data collection and analysis with this "revised" definition and operationalization of service quality resulted in the 22 paired-item SERVQUAL instrument with a proposed five-dimensional structure. Two of these five dimensions contained items representing seven of the original 10 dimensions. We are cautioned, however, that those who wish to interpret factors as real dimensions shoulder a substantial burden of proof (Cronbach and Meehl 1955). Moreover, such proof must rely on more than just empirical evidence (e.g., Bynner 1988; Galletta and Lederer 1989).

The results of several studies have demonstrated that the five dimensions claimed for the SERVQUAL instrument are unstable (see Table 2). SERVQUAL studies in the information systems domain have also demonstrated the unstable dimensionality of the SERVQUAL instrument. The service quality of IS services was measured in three different industries: a financial institution, a consulting firm, and an information systems service business (Pitt et al. 1995). Factor analysis was conducted using principal components and maximum likelihood methods with varimax rotation for a range of models. Analysis indicated differing factor structures for each type of firm. Analysis of the results for the financial institution indicated a seven-factor model with both the tangibles and empathy dimensions split into two. These breakdowns should not be surprising. Pitt et al. note that "up-to-date hardware and software" are quite distinct from physical appearances in the IS domain. The empathy dimension was created by the original SERVQUAL authors from two distinctly different constructs, namely understanding and access, which were combined due to the factor loadings alone, without regard to underlying theory. Not only did IS-SERVQUAL not match the proposed model, its factor structure varied across settings. Analysis of the data from the consulting firm resulted in a five-factor model, although none of the factors matched the original a priori factors. The factor analysis of the information systems business data resulted in the extraction of only three factors.

LISREL confirmatory factor analysis was used on SERVQUAL data collected from users (i.e., students) of a college computing services department (Kettinger and Lee 1994). Analysis of this data resulted in a four-factor solution. The entire tangibles dimension was dropped. An IS version of SERVQUAL was used in a cross-national study (Kettinger et al. 1995). Results of exploratory common factor analysis with oblique rotation indicated a three-factor model for a Korean sample, while a four-factor model was extracted from a Hong Kong data set. The tangibles dimension was retained in the analysis of both of the Asian samples.

The unstable dimensionality of SERVQUAL demonstrated in many domains, including information services, is not just a statistical curiosity. The scoring procedure for SERVQUAL calls for averaging the P-E gap scores within each dimension (Parasuraman et al. 1988). Thus a high expectation coupled with a low perception for one item would be canceled by a low expectation and high perception for another item within the same dimension. This scoring method is only appropriate if all of the items in that dimension are interchangeable. This type of analysis would be justified if SERVQUAL demonstrated a clear and consistent dimensional structure. However, given the unstable number and pattern of the factor structures, averaging groups of items to calculate separate scores for each dimension cannot be justified. Therefore, for scoring purposes, each item should be treated individually and not as part of some a priori dimension.
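The cancellation problem can be seen in a two-line sketch (hypothetical values): two items in the same dimension with large but opposite gaps average to a dimension score of zero, hiding both discrepancies.

```python
# Hypothetical within-dimension item gap scores (P - E):
# item 1 falls far short of expectations, item 2 far exceeds them.
item_gaps = [-3, +3]

dimension_score = sum(item_gaps) / len(item_gaps)
print(dimension_score)   # 0.0 -- the averaged score masks both discrepancies
```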


Discussion and Conclusions

In summary, numerous problems with the original SERVQUAL instrument are described in the literature (e.g., Babakus and Boller 1992; Carman 1990; Cronin and Taylor 1992, 1994; Teas 1993, 1994). The evidence suggests that difference scores, like the SERVQUAL perception-minus-expectation calculation, tend to exhibit reduced reliability, poor discriminant validity, spurious correlations, and restricted variance problems (Edwards 1995; Peter et al. 1993). The fact that the perception component of the difference score exhibits better reliability, convergent validity, and predictive validity than the perception-minus-expectation difference score itself calls into question the empirical and practical usefulness of both the expectations scores as well as the difference scores (Babakus and Boller 1992; Cronin and Taylor 1992; Parasuraman et al. 1994). Moreover, inconsistent definitions and/or interpretations of the "expectation" construct lead to a number of problems. The findings of Teas (1993) suggest that a considerable portion of the variance in SERVQUAL scores is the result of measurement error induced by respondents' varying interpretations of the expectations construct. In addition, since expectations, as well as perceptions, are subject to revision based on experience, concerns regarding the temporal reliability of SERVQUAL difference scores are raised. Furthermore, the dimensionality of the SERVQUAL instrument is problematic.

It was reported that an analysis of IS-SERVQUAL difference scores resulted in either three, five, or seven factors depending on the industry (Pitt et al. 1995). A portion of the instability in the dimensionality of SERVQUAL can be traced to the development of the original instrument (i.e., Parasuraman et al. 1988). Given these problems, users of the SERVQUAL instrument should be cautioned to assess the dimensionality implicit in their specific data set in order to determine whether the hypothesized five-factor structure that has been proposed (Parasuraman et al. 1988, 1991) is supported in their particular domain. Moreover, if the item elimination and dimensional collapsing utilized in the development of SERVQUAL has resulted in a 22 paired-item instrument that in fact does not measure all of the theoretical dimensions of the service quality construct (i.e., content validity), then the use of linear sums of those items for purposes of measuring overall service quality is problematic as well (Galletta and Lederer 1989).
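For readers who take up the caution above about assessing dimensionality, one simple screening device is to inspect the eigenvalues of the inter-item correlation matrix (the Kaiser criterion). The Python sketch below uses simulated responses and is only a rough first check under our own illustrative assumptions, not the exploratory or confirmatory factor analyses employed in the studies cited in this note.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical SERVQUAL-style responses: 300 respondents x 22 items,
# generated here from two latent factors plus noise.
n, items = 300, 22
factors = rng.normal(size=(n, 2))
loadings = rng.normal(size=(2, items))
data = factors @ loadings + rng.normal(size=(n, items))

corr = np.corrcoef(data, rowvar=False)          # 22 x 22 inter-item correlations
eigenvalues = np.sort(np.linalg.eigvalsh(corr))[::-1]

print(np.round(eigenvalues[:6], 2))             # leading eigenvalues
print("Eigenvalues > 1:", int((eigenvalues > 1).sum()))  # rough factor count
```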

Many of the difficulties identified with the SERVQUAL instrument also apply to the IS-modified versions of the SERVQUAL instrument used by Pitt et al. (1995) and Kettinger and Lee (1994). It appears that the IS-SERVQUAL instrument, utilizing difference scores, is neither a reliable nor a valid measurement for operationalizing the service quality construct for an information systems services provider. The IS versions of the SERVQUAL instrument, much like the original instrument (Parasuraman et al. 1988), suffer from unstable dimensionality and are likely to exhibit relatively poor predictive and convergent validity, as well as reduced reliability, when compared to non-difference scoring methods. The existing literature provides impressive evidence that the use of perception-minus-expectation difference scores is problematic.

This critique of the perceptions-minus-expectations gap score should not be interpreted as a claim that expectations are not important or that they should not be measured. On the contrary, evidence indicates that both should and will expectations are precursors to perceptions but that perceptions alone directly influence overall perceived service quality (Boulding et al. 1993). Our criticism is not with the concept of expectations per se, but rather with the operationalization of service quality as a simple subtraction of an ambiguously defined expectations construct from the perceptions of the service actually delivered.

IS professionals have been known to raise expectations to an unrealistically high level in order to gain user commitment to new systems and technologies. This can make it much more difficult to deliver systems and services that will be perceived as successful. According to the model developed by Boulding et al. (1993), perceived service quality can be increased by either improving actual performance or by managing expectations, specifically by reducing should expectations and/or increasing will expectations. These two different types of expectations are not differentiated by the traditional SERVQUAL gap scoring method. A better approach to understanding the impact of expectations on perceived service quality may be to measure will and should expectations separately and then compare them to a service quality measure that utilizes either a direct response or perceptions-only method of scoring.

Prescriptions for the use of SERVQUAL

The numerous problems associated with the use of difference scores suggest the need for an alternative response format. One alternative is to use the perceptions-only method of scoring. A review of the literature (Babakus and Boller 1992; Boulding et al. 1993; Cronin and Taylor 1992; Parasuraman et al. 1991, 1994) indicates that perceptions-only scores are superior to the perception-minus-expectation difference scores in terms of reliability, convergent validity, and predictive validity. In addition, the use of perceptions-only scores reduces by 50% the number of items that must be answered and measured (from 44 items to 22). Moreover, the findings of Boulding et al. (1993) suggest that expectations are a precursor to perceptions and that perceptions alone directly influence service quality.

A second alternative, suggested by Carman (1990) and Babakus and Boller (1992), is to revise the wording of the SERVQUAL items into a format combining both expectations and perceptions into a single question. Such an approach would maintain the theoretical value of expectations and perceptions in assessing service quality, as well as reduce the number of questions to be answered by 50%. This direct response format holds promise for overcoming the inherent problems with calculated difference scores. Items with this format could be presented with anchors such as "falls far short of expectations" and "greatly exceeds expectations." One study indicates that such direct measures possess higher reliability and improved convergent and predictive validity when compared to difference scores (Parasuraman et al. 1994a).

Conclusion

Recognizing that we cannot manage what we cannot measure, the increasingly competitive market for IS services has emphasized the need to develop valid and reliable measures of the service quality of information systems services providers, both internal and external to the organization. An important contribution to this effort was made with the suggestion of an IS-modified version of the SERVQUAL instrument (Pitt et al. 1995). However, earlier studies raised several important questions concerning the SERVQUAL instrument (e.g., Babakus and Boller 1992; Carman 1990; Cronin and Taylor 1992, 1994; Peter et al. 1993; Teas 1993, 1994). A review of the literature suggests that the use of difference scores with the IS-SERVQUAL instrument results in neither a valid nor a reliable measure of perceived IS service quality. Those choosing to use any version of the IS-SERVQUAL instrument are cautioned. Scoring problems aside, the consistently unstable dimensionality of the SERVQUAL and IS-SERVQUAL instruments intimates that further research is needed to determine the dimensions underlying the construct of service quality. Given the importance of the service quality concept in IS theory and practice, the development of improved measures of service quality for an information systems services provider deserves further theoretical and empirical research.

References

Babakus, E., and Boller, G. W. "An Empirical Assessment of the SERVQUAL Scale," Journal of Business Research (24:3), 1992, pp. 253-268.

Bagozzi, R., Davis, F., and Warshaw, P. "Development and Test of a Theory of Technological Learning and Usage," Human Relations (45:7), 1992, pp. 659-686.

Baroudi, J., and Orlikowski, W. "A Short-Form Measure of User Information Satisfaction: A Psychometric Evaluation and Notes on Use," Journal of Management Information Systems (4:4), 1988, pp. 44-59.

Boulding, W., Kalra, A., Staelin, R., and Zeithaml, V. A. "A Dynamic Process Model of Service Quality: From Expectations to Behavioral Intentions," Journal of Marketing Research (30:1), 1993, pp. 7-27.

Brensinger, R. P., and Lambert, D. M. "Can the SERVQUAL Scale be Generalized to Business-to-Business Services?" in Knowledge Development in Marketing, AMA's Summer Educators Conference Proceedings, Boston, MA, 1990, p. 289.

Brown, T. J., Churchill, G. A., and Peter, J. P. "Improving the Measurement of Service Quality," Journal of Retailing (69:1), 1993, pp. 127-139.

Bynner, J. "Factor Analysis and the Construct Indicator Relationship," Human Relations (41:5), 1988, pp. 389-405.

Campbell, D. T., and Fiske, D. W. "Convergent and Discriminant Validation by the Multitrait-Multimethod Matrix," Psychological Bulletin (56), 1959, pp. 81-105.

Carman, J. M. "Consumer Perceptions of Service Quality: An Assessment of the SERVQUAL Dimensions," Journal of Retailing (66:1), 1990, pp. 33-55.

Cronbach, L. J. "Coefficient Alpha and the Internal Structure of Tests," Psychometrika (16), 1951, pp. 297-334.

Cronbach, L. J., and Furby, L. "How We Should Measure 'Change'--or Should We?" Psychological Bulletin (74), July 1970, pp. 68-80.

Cronbach, L. J., and Meehl, P. "Construct Validity in Psychological Tests," Psychological Bulletin (52), 1955, pp. 281-302.

Cronin, J. J., and Taylor, S. A. "Measuring Service Quality: A Reexamination and Extension," Journal of Marketing (56:3), 1992, pp. 55-68.

Cronin, J. J., and Taylor, S. A. "SERVPERF versus SERVQUAL: Reconciling Performance-Based and Perceptions-Minus-Expectations Measurements of Service Quality," Journal of Marketing (58:1), 1994, pp. 125-131.

Dabholkar, P. A., Thorpe, D. I., and Rentz, J. O. "A Measure of Service Quality for Retail Stores: Scale Development and Validation," Journal of the Academy of Marketing Science (24:1), 1996, pp. 3-16.

DeLone, W., and McLean, E. "Information Systems Success: The Quest for the Dependent Variable," Information Systems Research (3:1), March 1992, pp. 60-95.

Doll, W. J., and Torkzadeh, G. "End-User Computing Involvement: A Discrepancy Model," Management Science (35:10), 1989, pp. 1151-1171.

Edwards, J. R. "Alternatives to Difference Scores as Dependent Variables in the Study of Congruence in Organizational Research," Organizational Behavior and Human Decision Processes (64:3), 1995, pp. 307-324.

Finn, D. W., and Lamb, C. W. "An Evaluation of the SERVQUAL Scales in a Retailing Setting," Advances in Consumer Research (18), 1991, pp. 338-357.

Galletta, D. F., and Lederer, A. L. "Some Cautions on the Measurement of User Information Satisfaction," Decision Sciences (20:3), 1989, pp. 419-439.

Grönroos, C. "A Service Quality Model and its Marketing Implications," European Journal of Marketing (18:4), 1984, pp. 36-55.

Johns, G. "Difference Score Measures of Organizational Behavior Variables: A Critique," Organizational Behavior and Human Performance (27), 1981, pp. 443-463.

Kappelman, L. "Measuring User Involvement: A Diffusion of Innovation Approach," The DATA BASE for Advances in Information Systems (26:2,3), 1995, pp. 65-83.

Kettinger, W. J., and Lee, C. C. "Perceived Service Quality and User Satisfaction with the Information Services Function," Decision Sciences (25:5), 1994, pp. 737-766.

Kettinger, W., Lee, C., and Lee, S. "Global Measurements of Information Service Quality: A Cross-National Study," Decision Sciences (26:5), 1995, pp. 569-585.

Lord, F. M. "The Utilization of Unreliable Difference Scores," Journal of Educational Psychology (49:3), 1958, pp. 150-152.

Nunnally, J. Psychometric Theory, 2nd ed., McGraw-Hill, New York, 1978.

Oliver, R. L. "A Cognitive Model of the Antecedents and Consequences of Satisfaction Decisions," Journal of Marketing Research (17:3), 1980, pp. 460-469.

Parasuraman, A., Zeithaml, V. A., and Berry, L. L. "A Conceptual Model of Service Quality and its Implications for Future Research," Journal of Marketing (49:4), 1985, pp. 41-50.

Parasuraman, A., Zeithaml, V. A., and Berry, L. L. "SERVQUAL: A Multiple Item Scale for Measuring Consumer Perceptions of Service Quality," Journal of Retailing (64:1), 1988, pp. 12-40.

Parasuraman, A., Zeithaml, V. A., and Berry, L. L. "Refinement and Reassessment of the SERVQUAL Scale," Journal of Retailing (67:4), 1991, pp. 420-450.

Parasuraman, A., Zeithaml, V. A., and Berry, L. L. "Alternative Scales for Measuring Service Quality: A Comparative Assessment Based on Psychometric and Diagnostic Criteria," Journal of Retailing (70:3), 1994a, pp. 201-229.

Parasuraman, A., Zeithaml, V. A., and Berry, L. L. "Reassessment of Expectations as a Comparison Standard in Measuring Service Quality: Implications for Further Research," Journal of Marketing (58:1), 1994b, pp. 111-124.

Peter, J. P., Churchill, G. A., and Brown, T. J. "Caution in the Use of Difference Scores in Consumer Research," Journal of Consumer Research (19:1), 1993, pp. 655-662.

Pitt, L. F., Watson, R. T., and Kavan, C. B. "Service Quality: A Measure of Information Systems Effectiveness," MIS Quarterly (19:2), June 1995, pp. 173-187.

Prakash, V., and Lounsbury, J. W. "A Reliability Problem in the Measurement of Disconfirmation of Expectations," in Advances in Consumer Research, Vol. 10, Tybout, A. M. and Bagozzi, R. P. (eds.), Association for Consumer Research, Ann Arbor, MI, 1983, pp. 244-249.

Teas, R. K. "Expectations, Performance Evaluation and Consumer's Perception of Quality," Journal of Marketing (57:4), 1993, pp. 18-34.

Teas, R. K. "Expectations as a Comparison Standard in Measuring Service Quality: An Assessment of a Reassessment," Journal of Marketing (58:1), 1994, pp. 132-139.

Van Dyke, T. P., and Popelka, M. E. "Development of a Quality Measure for an Information Systems Provider," in The Proceedings of the Decision Sciences Institute (3), 1993, pp. 1910-1912.

Wall, T. D., and Payne, R. "Are Deficiency Scores Deficient?" Journal of Applied Psychology (58:3), 1973, pp. 322-326.

Zeithaml, V. A., Berry, L., and Parasuraman, A. "The Nature and Determinant of Customer Expectation of Service Quality," Journal of the Academy of Marketing Science (21:1), 1993, pp. 1-12.

About the Authors

Thomas P. Van Dyke is an assistant professor of information systems and technologies at Weber State University and recently completed his doctoral dissertation in business computer information systems at the University of North Texas. His current research interests include the effects of alternative presentation formats on biases and heuristics in human decision making and MIS evaluation and assessment. He has published articles in The Journal of Computer Information Systems and Proceedings of the Decision Sciences Institute.

Leon A. Kappelman is an associate professor of business computer information systems in the College of Business Administration at the University of North Texas and associate director of the Center for Quality and Productivity. After a successful career in industry, he received his Ph.D. (1990) in management information systems from Georgia State University. He has published over two dozen journal articles. His work has appeared in Communications of the ACM, Journal of Management Information Systems, DATA BASE for Advances in Information Systems, Journal of Systems Management, Journal of Computer Information Systems, Information Week, National Productivity Review, Project Management Journal, Journal of Information Technology Management, Industrial Management, as well as other journals and conference proceedings. He authored Information Systems for Managers, McGraw-Hill (1993). His current research interests include the management of information assets, information systems (IS) development and implementation, IS project management, and MIS evaluation and assessment. He is co-chair of the Society for Information Management's (SIM) Year 2000 Working Group.

Victor R. Prybutok is the director of the University of North Texas Center for Quality and Productivity and an associate professor of management science in the Business Computer Information Systems Department. He has published articles in Technometrics, Operations Research, Economic Quality Control, and Quality Progress, as well as other journals and conference proceedings. Dr. Prybutok is a senior member of the American Society for Quality Control (ASQC), an ASQC-certified quality engineer, a certified quality auditor, and a 1993 Texas quality award examiner. His current research interests include project management, assessment of quality programs, neural networks, and MIS evaluation and assessment.
