Customer satisfaction measurement

The choice between a five-point and a ten-point scale in the framework of customer satisfaction measurement

Pedro S. Coelho
Susana P. Esteves
New University of Lisbon

In marketing research, and particularly in the context of customer satisfaction measurement, we often try to measure attitudes and human perceptions. This raises a number of questions regarding appropriate scales to use, such as the number of response alternatives. Obviously, there is a trade-off between the desired response discrimination level and the effort that is demanded of the respondent to situate his or her answer in one of the scale categories. If this effort is too high it can reduce the quality of responses and increase the non-response rate. In the context of customer satisfaction measurement we compare a five-point and a ten-point numerical scale. The analysis includes the evaluation of non-response rates, response distribution, the ability to model customer satisfaction, as well as convergent, discriminant and nomological validity of constructs used in the ECSI (European Customer Satisfaction Index) model. Globally, results tend to favour the choice of the ten-point scale, which contradicts some conventional wisdom. Moreover, we conclude that in this context there are no effects of socio-demographic characteristics (namely educational level) on the ability of respondents to use each scale.

Introduction

In marketing research, and particularly in the framework of customer satisfaction measurement, we often try to measure attitudes and human perceptions. This task raises a number of questions regarding questionnaire design and particularly about the appropriate response scales to use. Among the usual decisions is the choice between a verbal or numerical scale, along with the number of response alternatives.

International Journal of Market Research Vol. 49 Issue 3

© 2007 The Market Research Society 313

Received (in revised form): 28 December 2006

In fact, in recent years some discussion has taken place among academics and practitioners regarding the appropriate number of response alternatives to use. It is usually accepted that a small number of points does not allow a good discrimination of responses (limiting the ability to find significant differences between segments) and may limit the data analysis methods that can be used. More points improve the data metric, enrich the possible data analyses and facilitate the calculation of covariances between variables, which are used in most multivariate data analysis methods. Obviously, there is a trade-off between the desired response discrimination level and the effort that is demanded of the respondent to locate his response in one of the scale categories. If this effort is too high it can reduce the quality of responses and increase the non-response rate. Some of the traditional guidelines suggest the use of a number of categories between three and nine (Stem & Noazin 1985; Malhotra & Birks 2003) and clearly the most used number of response alternatives is seven (Cox 1980). Motivational theorists have been arguing against questions with a large number of response alternatives, based on the fact that respondents may not be sufficiently motivated to make meaningful discriminations (Tourangeau 1984; Krosnick & Alwin 1989; Alwin 1991). Obviously the choice regarding the number of response alternatives depends on several factors, such as the nature of the phenomena being measured, the involvement of respondents in the phenomena, the socio-demographic characteristics of respondents and even the nature of the data collection methods.

Also, some discussion exists regarding the use of an odd or even number of response alternatives. The choice between an odd or even number of alternatives is usually based on providing a neutral point as an acceptable response to a specific question. We contend that, within an odd scale, the middle point is often used by the respondents that prefer to reduce the response effort. This fact results in an overestimation of the true frequency associated with this middle point. We also contend that, for most questions measuring attitudes within customer satisfaction surveys, the use of a scale with an even number of response alternatives is a preferable choice. The reasoning is that respondents should have at least a slightly positive or slightly negative attitude towards the evaluated attribute. Obviously, it is possible that a respondent does not have an opinion or experience with regard to specific attributes, but that should cause a non-response or a ‘no experience’ response, instead of being considered indifferent.

In spite of these well-known ideas very little work has been done in trying to validate them. Particularly in the context of customer satisfaction measurement many authors have proposed the ten-point numerical scale (anchored in the extremes). This is the usual approach in the European Customer Satisfaction Index (ECSI) (ECSI 1998; Ball et al. 2004) and in the framework of the American Customer Satisfaction Index (ACSI) (Fornell et al. 1996, 1998; Johnson et al. 2001). Nevertheless, some practitioners have criticised this scale, arguing that respondents have difficulty discriminating answers using more than five points (particularly in a telephone survey) and that an odd number of points may be preferable in order to account for neutrality. According to these arguments a standard five-point scale would be a preferable choice over a ten-point scale.

This paper aims to compare a five-point and a ten-point numerical scale in the context of customer satisfaction measurement. The analysis includes the evaluation of non-response rates, response distribution, the ability to discriminate between attributes being measured, the ability to model customer satisfaction, as well as convergent, discriminant and nomological validity of constructs used in the ECSI model. Moreover, we investigate the effects of socio-demographic characteristics on the ability of respondents to use each scale. The socio-demographic analysis includes both the effects on response profile and the probability of non-response.

The paper is organised as follows. The Introduction presents the problem and the goals of the paper. In the next section we introduce the use of scales in attitude measurement, and approach the problem of choosing the number of response alternatives, referring to previous work in this context. The organisation of the empirical study is shown in the subsequent section. This presentation includes the ECSI model as well as the study design. The fourth section presents and analyses the main results obtained in the study. The final section discusses the main findings.

Number of scale points in attitude measurement

Most marketing research and particularly customer satisfaction surveys involve the measurement of attitudes. In fact, when using surveys we are often interested in measuring behaviour, but in practice we are usually limited to the measurement of attitudes. The reasons behind this practice are that, on the one hand, it is easier and more feasible to question about attitudes than to observe real behaviour and, on the other, there is the common belief that attitudes can be seen as antecedents of behaviours.

For instance, customer satisfaction and customer loyalty are considered antecedents of behaviours like customer attrition, customer acquisition and customer value. Attitude variables that can be found in typical surveys include things such as beliefs, preferences and intentions.

In surveys, attitude variables are often measured using rating scales. Among the major decisions when constructing a rating scale are the number of response alternatives and the use of an odd or even number of categories. The choice between attitude scales and particularly the choice regarding the number of response alternatives is not new in marketing theory, but there is clearly a lack of empirical work on this issue. Moreover, in published research there is little agreement about the optimal number of response alternatives (Neumann & Neumann 1981; Alwin 1997), or the use of an odd or even number of categories (Malhotra & Birks 2003). While several authors have concluded that there are no significant gains in using more than five response alternatives (Jenkins & Taber 1977; Stem & Noazin 1985; Converse & Presser 1986), others have favoured scales with more alternatives (up to 25) (Green & Rao 1970; Cox 1980; Alwin 1997). Nevertheless, most of the results tend to favour a solution with a number of alternatives between five and nine (Andrews & Withey 1976; Cox 1980; Neumann & Neumann 1981; Givon & Shapira 1984; Cicchetti et al. 1985; Alwin & Krosnick 1991; Colman et al. 1997).

Green and Rao (1970), comparing scales with two, three, six and eighteen response categories, recommend using at least six points and at least eight scales per variable. Nevertheless, they conclude that little information appears to be gained by increasing the number of response categories beyond six. Ramsay (1973), studying the effects of the number of response categories on precision of scale values, concluded that using seven categories provides almost as much precision as a scale requiring a continuous judgment. Jenkins and Taber (1977), using a Monte Carlo study to analyse composite scale reliability, conclude that reliability levels off after five response categories. Neumann and Neumann (1981) compared six rating scales of lengths between two and ten points. Although their conclusions favour the longer scales, they could not find significant differences between a seven-point and a ten-point scale in terms of correlation and eta coefficients. Moreover they found that deviations of actual averages from theoretical means increase as the number of choice points increases. Stem and Noazin (1985) have investigated the effect of the number of scale positions on test–retest reliability, concluding that five- and seven-point scales are the most reliable for bipolar adjective scales. Moreover, they have not found any increase in reliability from using a number of categories above five, and detected a significant decrease in reliability when using more than seven points. Cicchetti et al. (1985), using a Monte Carlo simulation to assess the extent to which the interrater reliability is affected by the number of scale points, concluded that, although reliability increases up to seven scale points, no substantial increases occur beyond that point. Nevertheless, conclusions are necessarily limited to the simulation parameters. Haley and Case (1979) have tested 13 attitude scales for agreement, discrimination among brands and response pattern, but their study did not offer much insight into the choice of the number of response alternatives within the same scale type. Givon and Shapira (1984) use a stochastic model to investigate the conjoint effect of the number of items and number of response alternatives on the sampling error of a composite scale estimator. They conclude that sampling error may be reduced, increasing the number of response alternatives up to five, seven or nine, depending on the number of items used. A review of the work done up until the beginning of the 1980s, regarding the optimal number of response alternatives, can be found in Cox (1980). The author concludes that there is no single number of response alternatives for a scale that is appropriate under all circumstances. Also, he concludes that no formula can be given to indicate what this number should be, even in a particular set of circumstances. Nevertheless, he establishes that scales with two or three response alternatives are generally inadequate and that the marginal return from using more than nine response alternatives is minimal. Consequently, he proposes that the optimal number of response alternatives is normally situated between five and nine. One exception is Alwin’s (1997) work, which compares seven- and eleven-category rating scales. His conclusions favour the eleven-point scale in terms of measurement precision, and reject the idea that the eleven-point scale is more vulnerable to measurement errors.

Description of the study

The ECSI model

The European Customer Satisfaction Index (ECSI) appeared in 1999 and is adapted from the Swedish Customer Satisfaction Index (Fornell 1992) and the ACSI (American Customer Satisfaction Index) (Fornell et al. 1998). The ECSI model is well established as a tool for measuring and explaining customer satisfaction and its antecedents and related constructs (ECSI 1998; Cassel & Eklof 2001; Vilares & Coelho 2004). It has been validated across a number of European countries and many industries, such as insurance, mobile phones, fixed phones, carbonated soft drinks, public transportation, retail banking, cable TV, supermarkets, postal services, food products and public service.

The ECSI model is composed of two sub-models: the structural model and the measurement model. The structural model defines the relationships between the latent variables and is represented in Figure 1. Customer satisfaction is the central variable of this model, having as antecedents the image of the company, customer expectations, perceived quality of products and services, and perceived value (where the relation between quality and price is measured). As consequences of customer satisfaction there are two variables: complaints and loyalty.

The measurement model includes the relations between the latent or non-observable variables and the observed indicators that correspond to survey questions (Table 1). Within this model we assume that the relationships between the latent variables and the observed indicators are all reflective (i.e. the indicators are assumed to reflect the latent variables).

Figure 1 ECSI structural model (diagram not reproduced; latent variables shown: Image, Expectations, Perceived quality, Perceived value, Satisfaction (ECSI), Complaints, Loyalty)

PLS (Partial Least Squares) was used to estimate this model using two data sets obtained as explained in the next section. The PLS methodology applied to ECSI is presented in detail by several authors (e.g. ECSI 1998; Cassel et al. 2000).

Data

Data came from a survey corresponding to the 2004 wave of ECSI-Portugal (the Portuguese Customer Satisfaction Index). The selection of respondents follows the criteria defined in ECSI (1998). Data collection took place in November and December 2004, through telephone interviews supported by a CATI system. The same questionnaire was administered to both samples, but for one sample we used a five-point scale and for the other a ten-point scale (anchored in the extremes).¹ The two scales are both numerical, with the same labels on the extreme points. Therefore, the only difference between them is the number of response alternatives. The questionnaire is the standard questionnaire used in ECSI-Portugal for the mobile telecommunications industry. The questionnaire includes a set of questions regarding the seven constructs of a structural satisfaction model (image, expectations, perceived quality, perceived value, satisfaction, complaints, and loyalty), plus a set of socio-demographic questions. The sample size was 252 for the five-point scale and 253 for the ten-point scale. Both data sets were collected among customers of the same mobile telecommunications operator. The sampling design includes a random selection of households using random-digit dialling. In each household one resident is selected randomly and qualified as a member of the target population.

Table 1 Indicators of each latent variable

Latent variable     Indicators

Image               Q4A: It is a reliable operator
                    Q4B: It is well established
                    Q4C: It gives a positive contribution to society
                    Q4D: It is concerned about its customers
                    Q4E: It is innovative and forward looking

Expectations        Q5A: Expectations concerning overall quality
                    Q5B: Expectations concerning the fulfilment of personal needs
                    Q5C: Expectations concerning reliability

Perceived quality   Q6: Perceived overall quality
                    Q7A: Technical quality of the network
                    Q7B: Personal attention
                    Q7C: Quality of services provided
                    Q7D: Diversity of products and services
                    Q7E: Product reliability
                    Q7F: Quality of information provided
                    Q7G: Coverage of the network

Perceived value     Q10: Evaluation of price given quality
                    Q11: Evaluation of quality given price

Satisfaction        Q3: Overall satisfaction
                    Q9: Fulfilment of expectations
                    Q18: Distance to the ideal company

Complaints          Q15: Complaint handling
                    Q16: Expectations of complaint handling

Loyalty             Q12: Intention of remaining as a customer
                    Q17: Recommendation to colleagues and friends

Results

Descriptive analysis

Table 2 shows the frequency of non-response and the frequency of response on the middle points of each scale. Results are shown by indicator and organised in seven groups corresponding to the seven latent variables in the satisfaction model. From the results presented in Table 2 it can be seen that, in general, the five-point scale has a higher proportion of non-responses when compared to the ten-point scale. Among the 25 indicators considered, only four show higher non-response rates for the ten-point scale.

When we formally test the difference between the proportions of non-response measured with the two scales, using the hypotheses

$$H_0: p_{i,5\text{-p}} = p_{i,10\text{-p}}$$

$$H_1: p_{i,5\text{-p}} \neq p_{i,10\text{-p}}$$

where $p_{i,5\text{-p}}$ is the proportion of non-response for variable $i$ when using the five-point scale and $p_{i,10\text{-p}}$ has the same meaning when using the ten-point scale, we never reject the null hypothesis at a 5% significance level, with the exception of variables Q4A and Q4E. Therefore, although the proportion of non-response is numerically higher in the sample using the five-point scale, we cannot conclude that generally these proportions are different in the population.

¹ The labelling varies with the specific attributes but is generally stated as ‘very low’ to ‘very high’.

Table 2 Non-response rates and proportion of responses in the middle points of the scale

                    NR (%)    NR (%)                 % of 3     % of 5 and 6
Indicator           1 to 5    1 to 10   Difference   1 to 5     1 to 10        Difference

Image
Q4A                 4.0       0.8       3.2*         20.2       21.1           –0.9
Q4B                 2.8       1.2       1.6          7.8        10.8           –3.0
Q4C                 4.8       2.4       2.4          27.1       21.9           5.2
Q4D                 4.4       2.4       2.0          29.0       28.3           0.7
Q4E                 4.0       1.2       2.8*         16.9       15.6           1.3

Expectations
Q5A                 6.0       4.3       1.7          27.0       24.4           2.6
Q5B                 6.7       7.1       –0.4         35.7       25.5           10.2*
Q5C                 6.3       6.7       –0.4         40.3       25.8           14.5*

Perceived quality
Q6                  0.4       0.0       0.4          19.9       18.2           1.7
Q7A                 1.2       0.0       1.2          25.7       16.6           9.1*
Q7B                 8.3       4.7       3.6          20.3       13.3           7.0*
Q7C                 13.1      10.3      2.8          20.5       17.2           3.3
Q7D                 12.7      7.5       5.2          28.6       17.9           10.7*
Q7E                 7.1       4.0       3.1          25.2       21.4           3.8
Q7F                 6.0       3.2       2.8          26.6       17.1           9.5*
Q7G                 0.8       0.4       0.4          32.8       21.0           11.8*

Perceived value
Q10                 2.0       1.6       0.4          42.1       44.2           –2.1
Q11                 2.8       1.2       1.6          44.1       41.6           2.5

Satisfaction
Q3                  0.4       0.8       –0.4         25.5       25.1           0.4
Q9                  2.4       2.4       0.0          20.2       27.5           –7.3
Q18                 4.4       4.0       0.4          7.8        21.0           –13.2

Complaints
Q15                 0.0       0.0       0.0          34.6       12.5           22.1*
Q16                 11.1      11.4      –0.3         31.8       27.6           4.2

Loyalty
Q12                 3.2       2.0       1.2          10.7       18.1           –7.4
Q17                 3.2       2.8       0.4          17.2       15.0           2.2

NR = non-response. The last three columns give the proportion of responses in the middle points of each scale.
* Significant at 5% level

If the response effort were too high using the ten-point scale we would expect to find a higher frequency of non-response in the group using this scale. Results do not confirm this hypothesis, and we may conclude that the use of a five-point or ten-point scale does not tend to significantly affect the non-response rate.
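The paper does not state which test statistic was used for the comparison of non-response proportions. Purely as an illustration, the sketch below applies a pooled two-sample z-test for proportions, one common choice for hypotheses of this form. The function name `two_proportion_z_test` is hypothetical, and the counts are assumptions back-calculated from the rounded percentages in Table 2 and the reported sample sizes (252 and 253).

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sided z-test of H0: p1 = p2.

    Illustrative only: the source does not say which test the authors used.
    Returns the z statistic and the two-sided p-value.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        # Both samples are all zeros or all ones: no evidence of a difference.
        return 0.0, 1.0
    z = (p1 - p2) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Assumed counts: 4.0% of 252 is about 10 non-responses; 0.8% of 253 is about 2.
z_q4a, p_q4a = two_proportion_z_test(10, 252, 2, 253)
# Assumed counts: 2.8% of 252 is about 7 non-responses; 1.2% of 253 is about 3.
z_q4b, p_q4b = two_proportion_z_test(7, 252, 3, 253)

print(f"Q4A: z = {z_q4a:.2f}, p = {p_q4a:.3f}")
print(f"Q4B: z = {z_q4b:.2f}, p = {p_q4b:.3f}")
```

Under these assumed counts the Q4A difference falls below the 5% threshold and the Q4B difference does not, which agrees with the asterisks in Table 2.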

The concentration of response in the middle points of the scale can also be seen in Table 2. This table exhibits the proportion of response in category 3 for the five-point scale and the proportion of response in categories 5 and 6 for the ten-point scale. It can be observed that in general the concentration of response in the middle points is higher for the five-point scale, when compared to the ten-point one. In fact, only six indicators (among 25) show a higher concentration of response in middle points for the ten-point scale.

When we formally test the difference between the proportions of responses in middle points with the two scales, using the hypotheses

$$H_0: p_{i,5\text{-p}} = p_{i,10\text{-p}}$$

$$H_1: p_{i,5\text{-p}} \neq p_{i,10\text{-p}}$$

where $p_{i,5\text{-p}}$ is the proportion of responses in category 3 for variable $i$ when using the five-point scale and $p_{i,10\text{-p}}$ is the proportion of responses in categories 5 and 6 when using the ten-point scale, we reject for eight variables the null hypothesis that the proportions are equal in the population (cf. Table 2) at a 5% significance level. Therefore, we can conclude that concentration of response in middle points tends to be higher for the five-point scale.

This is also an interesting result that tends to confirm our hypothesis that within an odd scale the middle point is often used by the respondents that prefer to reduce the response effort, resulting in an overestimation of the true frequency associated with this middle point. This result, along with the equivalence of non-response rates using both scales, tends to validate the idea that for most questions measuring attitudes within customer satisfaction surveys the use of a scale with an even number of response alternatives may be a preferable choice.

The mean of each indicator, both for the five-point and the ten-point scale, is shown in Table 3. We also present the transformed mean, after conversion of both scales to the interval [0;1], using the formula $y_i^* = (y_i - 1)/R_i$, where $y_i$ is the original rating for respondent $i$, $R_i$ is the range of the scale used by respondent $i$ and $y_i^*$ is the transformed rating for respondent $i$.
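The transformation can be sketched in a few lines. The function name `rescale_rating` is hypothetical, and the sketch assumes that the range of a scale running from 1 to n points is n − 1 (4 for the five-point scale, 9 for the ten-point scale), which is the reading under which the transformed ratings fall in [0;1].

```python
def rescale_rating(rating, n_points):
    """Map a rating on a 1..n_points scale to [0, 1] using (y - 1) / R.

    Assumption: R is the scale range, n_points - 1.
    Because the map is linear, it can also be applied to a mean rating.
    """
    if n_points < 2:
        raise ValueError("a scale needs at least two points")
    if not 1 <= rating <= n_points:
        raise ValueError("rating outside the scale")
    scale_range = n_points - 1
    return (rating - 1) / scale_range


# Extremes of both scales map to the same end points.
print(rescale_rating(1, 5), rescale_rating(5, 5))
print(rescale_rating(1, 10), rescale_rating(10, 10))
# The middle category of the five-point scale maps to 0.5.
print(rescale_rating(3, 5))
```

On the ten-point scale the value 0.5 corresponds to a rating of 5.5, midway between categories 5 and 6, which is consistent with treating those two categories as the middle points of that scale.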

It can be seen that for the transformed variables the global average is identical both for the five-point and ten-point scales: 0.70. Also, when analysing individual indicators, both means tend to show similar values (the absolute differences never exceed 0.08).

Table 3 Means by indicator using the original and the transformed scale

                    1 to 5                 1 to 10
Indicator           Mean      Transformed  Mean      Transformed  Difference
                              mean                   mean

Image
Q4A                 4.0       0.76         7.7       0.75         0.01
Q4B                 4.4       0.84         8.3       0.82         0.02
Q4C                 3.9       0.72         7.6       0.73         –0.01
Q4D                 3.7       0.68         7.2       0.69         –0.01
Q4E                 4.1       0.77         7.9       0.77         0.00

Expectations
Q5A                 3.8       0.70         7.4       0.71         –0.01
Q5B                 3.7       0.68         7.3       0.70         –0.02
Q5C                 3.6       0.66         6.9       0.65         0.01

Perceived quality
Q6                  4.0       0.75         7.8       0.76         –0.01
Q7A                 3.9       0.72         7.6       0.74         –0.02
Q7B                 4.1       0.77         7.8       0.76         0.01
Q7C                 4.0       0.75         7.7       0.74         0.01
Q7D                 3.8       0.70         7.4       0.71         –0.01
Q7E                 3.9       0.72         7.4       0.71         0.01
Q7F                 3.8       0.71         7.5       0.72         –0.01
Q7G                 3.7       0.66         7.2       0.69         –0.03

Perceived value
Q10                 2.9       0.47         5.4       0.49         –0.02
Q11                 3.4       0.60         6.4       0.60         0.00

Satisfaction
Q3                  3.9       0.71         7.5       0.72         –0.01
Q9                  3.8       0.69         7.3       0.70         –0.01
Q18                 3.6       0.65         7.2       0.69         –0.04

Complaints
Q15                 2.7       0.43         5.6       0.51         –0.08
Q16                 3.7       0.67         7.0       0.67         0.00

Loyalty
Q12                 4.2       0.79         7.7       0.75         0.04
Q17                 4.1       0.77         7.7       0.75         0.02

Global average      3.8       0.70         7.3       0.70         0.00

When we formally test the difference between the transformed means with the two scales, using the hypotheses

$$H_0: m_{i,5\text{-p}} = m_{i,10\text{-p}}$$

$$H_1: m_{i,5\text{-p}} \neq m_{i,10\text{-p}}$$

where $m_{i,5\text{-p}}$ is the mean for variable $i$ when using the five-point scale and $m_{i,10\text{-p}}$ has the same meaning when using the ten-point scale, we never reject the null hypothesis. Therefore we cannot conclude that these means are different in the population. Once again this result tends to confirm that both scales produce equivalent mean scores and validate the acceptability of the ten-point scale. In fact, if the efforts demanded of respondents were too high using the ten-point scale we would expect to find different mean scores for the two scales.

The effect of socio-demographics on ratings

One specific fear regarding the use of ten-point scales concerns some socio-demographic groups, specifically those with a lower education level. The question to be answered is: ‘Do people with different education levels use the five- and ten-point scales differently?’

To answer this question we have implemented two analyses:

1. an analysis of variance having as its dependent variable the ratings to question Q3 (overall satisfaction)

2. a logistic regression having as its dependent variable a binary variable representing the non-response to question Q7C (quality of services provided).

Both analyses include as independent variables three demographic variables (sex, age group and educational level) and a variable representing the original scale used by each respondent (ten-point or five-point). Also, in both cases, a cross-effect between education level and the scale used was included. So, the analyses allow us to understand if the included independent variables influence the response pattern and also if some kind of interaction between education level and the response scale exists. The variables included in the analyses and their categories are presented in Table 4. In the analysis of variance the original scales of the dependent variable were transformed into the interval [0;1], using the transformation presented in the previous section.

Table 4 Independent variables and categories

Independent variable                        Categories

Sex                                         Male
                                            Female

Age group                                   <30
                                            30–39
                                            40–49
                                            50 or more

Education level                             Basic
                                            Secondary (high school)
                                            University degree

Scale used                                  10 points scale
                                            5 points scale

Scale used × education level cross-effects  10 points scale/Basic
                                            10 points scale/Secondary
                                            10 points scale/University
                                            5 points scale/Basic
                                            5 points scale/Secondary
                                            5 points scale/University

Results for the analysis of variance are given in Tables 5 and 6. Table 5 shows the F tests for the significance of the regression and for each individual variable (using Type III sum of squares). The null hypothesis that all model coefficients are zero is rejected, showing the relevance of the model. From the tested factors, only sex and education level are significant at any reasonable significance level. These are important results that tend to show that the scale used (ten-point or five-point) does not influence the mean rating for the dependent variable. Also, although the education level is a significant factor, there is no interaction between the scale used and the education level. The conclusion is that people with different educations do not tend to use the two scales differently.

Table 5 F tests

Source                        DF    Sum of squares  Mean square   F value   Pr > F

Model                         10    1.21637952      0.12163795    3.42      0.0002
Error                         491   17.46895476     0.03557832
Corrected total               501   18.68533428

Variables
Sex                           1     0.27802108      0.27802108    7.81      0.0054
Scale used                    1     0.00268551      0.00268551    0.08      0.7836
Age group                     4     0.07279392      0.01819848    0.51      0.7273
Education level               2     0.79510525      0.39755263    11.17     <0.0001
Scale used*education level    2     0.06104471      0.03052235    0.86      0.4247
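As a reading aid for Table 5, each F value is the mean square of the effect (sum of squares divided by its degrees of freedom) divided by the error mean square. The short sketch below recomputes the F values from the tabulated sums of squares; the names `MS_ERROR`, `effects` and `f_value` are illustrative, and the p-values are not recomputed here.

```python
# Error mean square reported in Table 5.
MS_ERROR = 0.03557832

# Effect name -> (Type III sum of squares, degrees of freedom), copied from Table 5.
effects = {
    "Sex": (0.27802108, 1),
    "Scale used": (0.00268551, 1),
    "Age group": (0.07279392, 4),
    "Education level": (0.79510525, 2),
    "Scale used*education level": (0.06104471, 2),
}


def f_value(sum_of_squares, df, ms_error=MS_ERROR):
    """F statistic: mean square of the effect divided by the error mean square."""
    return (sum_of_squares / df) / ms_error


f_values = {name: round(f_value(ss, df), 2) for name, (ss, df) in effects.items()}

for name, f in f_values.items():
    print(f"{name}: F = {f:.2f}")
```

The recomputed ratios round to the F values printed in Table 5 (7.81, 0.08, 0.51, 11.17 and 0.86), with the scale effect and the scale-by-education interaction both small relative to the error mean square.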

Table 6 shows coefficient estimates for the significant parameters. Recall that only sex and education level were considered significant. Results confirm the well-known notion in customer satisfaction studies that females tend to show higher ratings than males, and also that mean scores tend to decrease with the increase of education level.

With a logistic regression we intend to offer a complementary insight to the question regarding the hypothetical influence of socio-demographic variables on the rating profile. With the analysis of variance we have been concerned with the response pattern to question Q3 (overall satisfaction). Nevertheless, it is also interesting to understand whether the scale used and socio-demographic characteristics influence the probability of non-response. This is particularly important since, even though we conclude that the scale does not influence the response pattern (for respondents), it may influence the probability of response and therefore the quality of collected data. The dependent variable Q7C (quality of services provided) was chosen because it is one of the variables with the highest non-response rates in the questionnaire. Results for the logistic regression are given in Tables 7 and 8.

Table 6 Parameter estimates

Parameter               Estimate        Standard error  t value   Pr > |t|

Intercept               0.7424151497    0.07227142      10.27     <0.0001
Female                  0.0490012601    0.01752916      2.80      0.0054
Male                    0.0000000000
Basic education         0.0849742986    0.03029263      2.81      0.0052
Secondary education     0.0215570425    0.02873063      0.75      0.4534
University education    0.0000000000

Table 7 shows a chi-square test for the model significance and Wald statistics² for individual independent variables. The null hypothesis that all model coefficients are zero is rejected, showing the relevance of the model. For the independent variables only age group and education level are significant at any reasonable significance level. Once again, these results tend to show that the scale used (ten-point or five-point) does not influence the probability of non-response for the dependent variable. Also, although the education level is a significant factor, there is no interaction between the scale used and the education level. The conclusion is that within an education level, people using different scales do not tend to show different response rates.

² The Wald statistic is used in this context to test the significance of model coefficients. Under the null hypothesis (corresponding to the nullity of each coefficient) the statistic follows a chi-square distribution with degrees of freedom equal to the number of restrictions to be tested.

Coefficient estimates for the significant parameters are shown in Table 8. Note that only age group and education level were considered significant. Also note that the probability modelled in our analysis is with regard to the response event. Results show that the response probability tends to decrease with the increase of age. Also it can be seen that the response probability tends to be higher for people with higher education levels.
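To make the direction of these effects concrete, the following minimal sketch converts the Table 8 estimates into predicted response probabilities through the inverse logit. It assumes reference coding (university education is reported with an estimate of 0.0000) and that the reference age category is the one not listed in the table. The function and dictionary names are hypothetical, and the sketch is an illustration rather than the authors' computation.

```python
import math

# Estimates copied from Table 8 (logit scale; the response event is modelled).
INTERCEPT = 1.7622
AGE_GROUP = {"<30": 0.9717, "30-39": 0.4654, "40-49": 0.7258, "50 or +": -0.1955}
EDUCATION = {"basic": -0.6560, "secondary": -0.0335, "university": 0.0}


def response_probability(age_group: str, education: str) -> float:
    """Inverse logit of the linear predictor, assuming reference coding."""
    linear_predictor = INTERCEPT + AGE_GROUP[age_group] + EDUCATION[education]
    return 1.0 / (1.0 + math.exp(-linear_predictor))


young_university = response_probability("<30", "university")
older_basic = response_probability("50 or +", "basic")
print(f"under 30, university: {young_university:.3f}")
print(f"50 or +, basic:       {older_basic:.3f}")
```

Under these assumptions the younger, university-educated profile has the higher predicted response probability, in line with the pattern described above.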


Table 7 Chi-square tests

Testing global null hypothesis: BETA = 0

Test               DF   Chi-square   Pr > chi sq
Likelihood ratio   10   40.6550      <.0001
Score              10   44.7595      <.0001
Wald               10   36.7804      <.0001

Variables                    DF   Chi-square   Pr > chi sq
Sex                          1    0.8306       0.3621
Scale used                   1    0.6921       0.4054
Age group                    4    16.8279      0.0021
Education level              2    12.8545      0.0016
Scale used*education level   2    0.0698       0.9657

Table 8 Parameter estimates

Analysis of maximum likelihood estimates

Parameter                             DF   Estimate   Standard error   Wald chi-square   Pr > chi sq
Intercept                             1    1.7622     0.2092           70.9829           <0.0001
Age group <30                         1    0.9717     0.3625           7.1847            0.0074
Age group 30–39                       1    0.4654     0.3489           1.7787            0.1823
Age group 40–49                       1    0.7258     0.3529           4.2303            0.0397
Age group 50 or +                     1    –0.1955    0.2795           0.4893            0.4842
Education level Basic education       1    –0.6560    0.2155           9.2615            0.0023
Education level Secondary education   1    –0.0335    0.2395           0.0196            0.8888
Education level University education  –    0.0000
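For a single coefficient, the Wald statistic described in footnote 2 reduces to the squared ratio of the estimate to its standard error, referred to a chi-square distribution with one degree of freedom. The sketch below recomputes two rows of Table 8 from their rounded estimates and standard errors. The helper names are hypothetical, and small differences from the published values are expected because the inputs are rounded.

```python
import math


def wald_chi_square(estimate: float, standard_error: float) -> float:
    """Single-coefficient Wald statistic: (estimate / standard error) squared."""
    return (estimate / standard_error) ** 2


def chi_square_p_value_1df(statistic: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# (estimate, standard error) pairs copied from Table 8.
rows = {
    "Age group <30": (0.9717, 0.3625),
    "Education level Basic education": (-0.6560, 0.2155),
}

results = {}
for name, (estimate, standard_error) in rows.items():
    statistic = wald_chi_square(estimate, standard_error)
    results[name] = (statistic, chi_square_p_value_1df(statistic))
    print(f"{name}: Wald = {statistic:.2f}, p = {results[name][1]:.4f}")
```

Both recomputed statistics agree with the Wald chi-square and Pr > chi sq columns of Table 8 to within rounding.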


Globally, we can conclude that:

• some socio-demographic variables, such as sex, age group and education level, influence the probability of non-response or the response profiles for respondents

• the scale used (ten-point or five-point) influences neither the probability of non-response nor the response profiles for respondents

• in neither analysis is there an interaction between the scale used and the education level; therefore, we can conclude that, within any education level, people using different scales do not tend to show different response patterns.

These conclusions tend to confirm the empirical analysis made in the previous section, where we observed that both scales produced similar non-response rates as well as similar mean scores (after rescaling). Therefore, the small and usually non-significant differences in response rates between both samples, rather than being a consequence of the scale used, may be totally explained by small differences in the socio-demographic profile of the samples. In fact, the sample using the ten-point scale presents a slightly higher proportion of both young and well-educated people, which contributes to a higher response rate.

Validity assessment

Figures 2 and 3 show the estimated path coefficients and t values (in parentheses) for the structural models estimated with five-point and ten-point scales, respectively. Generally, the hypothesised links tend to be significant at the 5% significance level. For the model estimated with the five-point scale the only exceptions are the Image–Loyalty, Expectations–Satisfaction and Complaints–Loyalty paths, which are not significant at the 10% significance level. In particular, the estimate for the hypothesised path between Complaints and Loyalty is negative (contradicting the theory) and shows an extremely low t ratio. When using the ten-point scale only the Image–Loyalty and Expectations–Satisfaction paths are not significant at the 5% significance level. Nevertheless, the Expectations–Satisfaction path is significant at the 10% significance level and the Image–Loyalty path at the 10.5% significance level. Globally, the ten-point scale showed a greater ability to capture the significance of theoretically supported links, resulting in a higher nomological validity for this scale.
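The significance levels quoted for the ten-point model can be approximated from the t values in Figure 3. The sketch below uses a standard normal approximation for the two-sided p-value. This is an assumption, since this section states neither the sample size nor the resampling scheme behind the t values. The function name is hypothetical.

```python
import math


def two_sided_p_value(t_value: float) -> float:
    """Two-sided p-value under a standard normal approximation."""
    return math.erfc(abs(t_value) / math.sqrt(2.0))


# The two t values flagged as non-significant at 5% in Figure 3 (ten-point scale).
p_162 = two_sided_p_value(1.62)
p_179 = two_sided_p_value(1.79)
print(f"t = 1.62 -> p = {p_162:.3f}")
print(f"t = 1.79 -> p = {p_179:.3f}")
```

Under this approximation, t = 1.62 gives a p-value of about 0.105, consistent with the 10.5% level quoted for the Image–Loyalty path, while t = 1.79 falls between the 5% and 10% levels, consistent with the statement about the Expectations–Satisfaction path.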


Figure 2 Model parameter estimate and t values for the five-point scale

[Path diagram of the ECSI model with the constructs Image, Expectations, Perceived quality, Perceived value, Satisfaction (ECSI), Complaints and Loyalty. Path coefficients (t values), listed without their path assignment: 0.10 (1.31**), 0.11 (1.46**), 0.25 (3.87), 0.46 (5.73), 0.25 (4.44), 0.31 (3.99), 0.58 (13.42), 0.66 (16.78), 0.65 (17.22), 0.30 (3.93), 0.29 (3.21), –0.03 (0.34**).]

* Non-significant at 5% significance level
** Non-significant at 10% significance level

Figure 3 Model parameter estimate and t values for the ten-point scale

[Path diagram of the ECSI model with the constructs Image, Expectations, Perceived quality, Perceived value, Satisfaction (ECSI), Complaints and Loyalty. Path coefficients (t values), listed without their path assignment: 0.14 (1.62**), 0.14 (1.79*), 0.34 (4.83), 0.48 (4.89), 0.34 (4.56), 0.19 (2.22), 0.55 (9.77), 0.68 (22.42), 0.64 (15.43), 0.17 (2.31), 0.38 (4.76), 0.22 (3.22).]

* Non-significant at 5% significance level
** Non-significant at 10% significance level


We also assess the model’s explanatory power (through the determination coefficient,³ R²) for the equations explaining satisfaction and loyalty, for both the five-point and ten-point scales. From the results presented in Table 9 it can be seen that the ten-point scale shows a higher explanatory power for customer satisfaction and loyalty, when compared to the five-point scale. The increase in explanatory power attributed to the ten-point scale is particularly impressive for the loyalty construct.

The R² value for satisfaction is higher than 0.50 for both scales (0.59 for the five-point scale and 0.66 for the ten-point scale). On the other hand, although the R² value for loyalty on the ten-point scale is quite high (0.54), the value is very low for the five-point scale (0.27), showing a weaker explanatory power for loyalty.

These results clearly favour the ten-point scale and can be seen as a confirmation of the higher nomological validity of this scale, since, globally, constructs in the model estimated with the ten-point scale tend to show higher correlations, confirming theoretical predictions.

Table 9 Determination coefficient (R²) of satisfaction and loyalty

Latent variable   1 to 5   1 to 10
Satisfaction      0.59     0.66
Loyalty           0.27     0.54

³ The determination coefficient reveals the proportion of variation in each dependent variable (satisfaction and loyalty) that is explained by the model.

Table 10 presents the average communalities for the seven latent variables of the ECSI model, both for five-point and ten-point scales. Communality for a manifest variable may be interpreted as the proportion of its variance that is reproduced by the directly connected latent variable. This measure can be used as an indicator of the convergent validity of the measurement model. In almost all cases latent variable communalities are higher than 0.50, indicating that the variance captured by each latent variable is significantly larger than the variance due to measurement error, and thus demonstrating a high convergent validity of the construct. There are two exceptions: quality and loyalty, which have communality slightly below 0.50 for the five-point scale (0.479 and 0.468, respectively). In general the communality is higher for the ten-point scale. The only exception occurs in expectation (0.717 for the five-point scale and 0.674 for the ten-point scale). Once again these results tend to favour the convergent validity of the ten-point scale.

Some authors (e.g. Givon & Shapira 1984) have contended that the advantages of using a higher number of response alternatives tend to be more pronounced for constructs with a smaller number of indicators. The results shown in Table 10 do not confirm this statement. In fact, the latent variables with the largest differences between the communality for the ten-point scale and the five-point scale are image (0.093), loyalty (0.089) and perceived quality (0.041). Nevertheless, image and quality are the constructs in our model with the highest number of indicators (five and eight, respectively). So, globally, we could not find a relation between the communality improvements resulting from using a higher number of response alternatives and the number of indicators in the construct.

One way to assess discriminant validity is to determine whether each latent variable shares more variance with its own measurement variables than with other constructs. To do so, we first compare the communalities of the measurement variables with the squared correlations between their own construct and the other constructs in the model. A low percentage of latent variable squared correlations exceeding measurement variables communalities tends to confirm discriminant validity (Chin 1998).

Communalities for the indicators of each latent variable and the percentage of latent variable squared correlations exceeding measurement variables communalities are shown in Table 11. Regarding the indicators communalities it can be seen that values tend to be higher for the ten-point scale, when compared to the five-point one. Among the 24 indicators considered, only five show higher communalities in the five-point scale.

Regarding the latent variable squared correlations exceeding measurement variables communalities, in general there are few violations.


Table 10 Communality and number of indicators by latent variable

Latent variable     Communality   Communality   Difference        Number of
                    (1 to 5)      (1 to 10)     (10 to 5 points)  indicators
Image               0.51          0.60          0.09              5
Expectations        0.72          0.67          –0.04             3
Perceived quality   0.48          0.52          0.04              8
Perceived value     0.74          0.74          0.00              2
Satisfaction        0.68          0.70          0.02              3
Complaints          –             –             –                 1
Loyalty             0.47          0.56          0.09              3
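As an illustration of how the construct-level figures in Table 10 relate to the indicator-level figures in Table 11, the sketch below takes the unweighted mean of the indicator communalities for two constructs. The unweighted mean is an assumption about how the averages were formed, and the names are hypothetical. The inputs are the rounded values from Table 11, so the results match Table 10 only to within rounding.

```python
from statistics import mean
from typing import Tuple

# Indicator communalities copied from Table 11: (five-point, ten-point).
INDICATORS = {
    "Image": {
        "Q4A": (0.49, 0.59), "Q4B": (0.21, 0.43), "Q4C": (0.57, 0.62),
        "Q4D": (0.71, 0.70), "Q4E": (0.54, 0.64),
    },
    "Expectations": {
        "Q5A": (0.71, 0.65), "Q5B": (0.78, 0.70), "Q5C": (0.65, 0.67),
    },
}


def average_communality(construct: str) -> Tuple[float, float]:
    """Unweighted mean communality per scale for one construct."""
    values = INDICATORS[construct].values()
    five_point = mean(v[0] for v in values)
    ten_point = mean(v[1] for v in values)
    return five_point, ten_point


for construct in INDICATORS:
    five_point, ten_point = average_communality(construct)
    print(f"{construct}: {five_point:.3f} (1 to 5), {ten_point:.3f} (1 to 10), "
          f"difference {ten_point - five_point:+.3f}")
```

The sign of the difference reproduces the pattern in Table 10: positive for image and negative for expectations.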


Table 11 Indicators communalities and percentage of latent variable squared correlations exceeding measurement variables communalities

For each scale, the last two columns refer to latent variable squared correlations exceeding measurement variables communalities.

                         1 to 5                                        1 to 10
Indicator    Communality  Number of    % of          Communality  Number of    % of
                          comparisons  violations                 comparisons  violations

Image
Q4A          0.49         6            16.7          0.59         6            0.0
Q4B          0.21         6            83.3          0.43         6            33.3
Q4C          0.57         6            0.0           0.62         6            0.0
Q4D          0.71         6            0.0           0.70         6            0.0
Q4E          0.54         6            16.7          0.64         6            0.0

Expectations
Q5A          0.71         6            0.0           0.65         6            0.0
Q5B          0.78         6            0.0           0.70         6            0.0
Q5C          0.65         6            0.0           0.67         6            0.0

Perceived quality
Q6           0.51         6            16.7          0.59         6            0.0
Q7A          0.45         6            33.3          0.51         6            0.0
Q7B          0.39         6            50.0          0.44         6            50.0
Q7C          0.50         6            16.7          0.63         6            0.0
Q7D          0.50         6            16.7          0.53         6            0.0
Q7E          0.57         6            0.0           0.60         6            0.0
Q7F          0.57         6            0.0           0.53         6            0.0
Q7G          0.33         6            50.0          0.34         6            66.7

Perceived value
Q10          0.67         6            0.0           0.68         6            0.0
Q11          0.80         6            0.0           0.80         6            0.0

Satisfaction
Q3           0.63         6            0.0           0.59         6            0.0
Q9           0.66         6            0.0           0.73         6            0.0
Q18          0.75         6            0.0           0.78         6            0.0

Complaints
Q15–16       1.00         6            0.0           1.00         6            0.0

Loyalty
Q12          0.30         6            0.0           0.72         6            0.0
Q17          0.32         6            0.0           0.76         6            0.0
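The violation percentages in Table 11 can be reproduced for the Image indicators on the five-point scale by squaring the correlations between Image and the other six constructs (Table 12) and counting how many exceed each indicator's communality. The sketch below is only an illustration of that comparison. The names are hypothetical, and the inputs are the rounded values printed in the tables.

```python
# Five-point-scale correlations between Image and the other six constructs (Table 12).
IMAGE_CORRELATIONS = {
    "Expectations": 0.65, "Perceived value": 0.46, "Quality": 0.75,
    "Satisfaction": 0.67, "Complaints": 0.50, "Loyalty": 0.40,
}

# Five-point-scale communalities of the Image indicators (Table 11).
IMAGE_COMMUNALITIES = {"Q4A": 0.49, "Q4B": 0.21, "Q4C": 0.57, "Q4D": 0.71, "Q4E": 0.54}


def violation_percentage(communality, correlations):
    """Percentage of squared construct correlations exceeding the communality."""
    violations = sum(1 for r in correlations.values() if r ** 2 > communality)
    return round(100.0 * violations / len(correlations), 1)


percentages = {
    indicator: violation_percentage(communality, IMAGE_CORRELATIONS)
    for indicator, communality in IMAGE_COMMUNALITIES.items()
}
print(percentages)
```

With these inputs, Q4B is the indicator with the most violations (five of the six comparisons), matching the 83.3% reported for it in Table 11.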


In expectations, perceived value, satisfaction, complaints and loyalty we observed no violations. In the other latent variables, the violations tend to be more significant for the five-point scale. Note that the gains in discriminant validity for the ten-point scale are concentrated in the Image and Perceived Quality constructs, which is to be expected since these are the latent variables measured with a higher number of indicators. The exception occurs in indicator Q7G of perceived quality, where there is one more violation for the ten-point scale (four, in a total of six comparisons). Nevertheless, one should note that this is an indicator with low discriminant validity in both scales. Generally, these results confirm a higher discriminant validity of the constructs when using the ten-point scale.

A complementary assessment of discriminant validity may be obtained using the variance extracted test (Fornell & Larcker 1981). We compare the estimates of average variance extracted (AVE) for each pair of constructs in the model with the correlation between the constructs. Discriminant validity is demonstrated if both square roots of variance extracted are greater than this correlation. Table 12 presents the results for both scales. Elements in the main diagonal represent the square roots of AVE and the other elements correlations between constructs. Although for both scales discriminant validity is generally achieved for most constructs, results tend to favour the ten-point scale. In fact, with the five-point scale a lack of discriminant validity is detected between image and quality constructs (note that these are the constructs where we have found some construct squared correlations exceeding measurement variables communalities). Also, the square root of AVE for quality is equal to the correlation between this construct and satisfaction, while with the ten-point scale all constructs show square roots of AVEs higher than all the corresponding correlations.

Table 12 Square roots of average variance extracted and correlations between constructs

Five-point scale

                  Image   Expectations   Perceived value   Quality   Satisfaction   Complaints   Loyalty
Image             0.74
Expectations      0.65    0.85
Perceived value   0.46    0.50           0.85
Quality           0.75    0.66           0.50              0.69
Satisfaction      0.67    0.60           0.58              0.69      0.83
Complaints        0.50    0.42           0.44              0.51      0.58           1.00
Loyalty           0.40    0.34           0.37              0.45      0.51           0.29         0.82

Ten-point scale

                  Image   Expectations   Perceived value   Quality   Satisfaction   Complaints   Loyalty
Image             0.78
Expectations      0.64    0.82
Perceived value   0.52    0.45           0.86
Quality           0.70    0.68           0.51              0.71
Satisfaction      0.72    0.62           0.66              0.67      0.84
Complaints        0.44    0.49           0.40              0.49      0.55           1.00
Loyalty           0.58    0.52           0.49              0.60      0.71           0.55         0.70
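The variance extracted comparison can be sketched for the five-point panel of Table 12 as follows. For each pair of constructs, the correlation is checked against the smaller of the two square roots of AVE, and a tie is treated as a failure because the criterion requires the square roots to be greater. The function and variable names are hypothetical, and the inputs are the rounded values printed in the table.

```python
CONSTRUCTS = ["Image", "Expectations", "Perceived value", "Quality",
              "Satisfaction", "Complaints", "Loyalty"]

# Lower triangle of the five-point panel of Table 12. The last entry of each
# row is the square root of AVE; the others are correlations with the
# constructs listed earlier.
FIVE_POINT = [
    [0.74],
    [0.65, 0.85],
    [0.46, 0.50, 0.85],
    [0.75, 0.66, 0.50, 0.69],
    [0.67, 0.60, 0.58, 0.69, 0.83],
    [0.50, 0.42, 0.44, 0.51, 0.58, 1.00],
    [0.40, 0.34, 0.37, 0.45, 0.51, 0.29, 0.82],
]


def failing_pairs(lower_triangle, names):
    """Pairs whose correlation is not strictly below both square roots of AVE."""
    failures = []
    for i, row in enumerate(lower_triangle):
        for j in range(i):
            smaller_root = min(lower_triangle[i][i], lower_triangle[j][j])
            if row[j] >= smaller_root:
                failures.append((names[j], names[i]))
    return failures


five_point_failures = failing_pairs(FIVE_POINT, CONSTRUCTS)
print(five_point_failures)
```

On these inputs the check flags the image and quality pair and the tie between quality and satisfaction, which are the two cases discussed in the text for the five-point scale.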

Discussion and conclusions

This paper aimed to compare a five-point and a ten-point numerical scale in customer satisfaction measurement in the framework of the Portuguese Customer Satisfaction Index (ECSI-Portugal). The analysis includes the evaluation of non-response rates, response distribution, as well as convergent, discriminant and nomological validity of constructs used in the ECSI model. Moreover we have investigated the effects of socio-demographic characteristics on the ability of respondents to use each scale.

Globally, it is apparent that the ten-point scale shows better properties than the five-point scale, validating the choice made in the context of ECSI-Portugal. In fact, it has been seen that the ten-point scale generally shows higher validity than the five-point scale. This is true both for convergent and discriminant validity. Also the ten-point scale showed a higher explanatory power for the main variables in our model (satisfaction and loyalty), thus confirming a higher nomological validity.

Results also showed that both scales produced similar non-response rates and similar mean scores. Therefore results do not tend to confirm the hypothesis that the response effort demanded by the ten-point scale is too high for respondents. In addition, we have not confirmed the conclusions of previous work (e.g. Neumann & Neumann 1981) regarding an increase in deviation of actual averages from theoretical means as the number of choice alternatives increases. Moreover, we confirmed that the five-point scale tends to show a higher attraction of responses to the middle point of the scale. This result tends to confirm our hypothesis that, within an odd scale, the middle point is often used by the respondents that prefer to reduce the response effort, resulting in an overestimation of the true frequency associated with this middle point. This result, along with the equivalence of non-response rates and mean scores using both scales, tends to validate the idea that, when measuring attitudes within customer satisfaction surveys, a scale with a higher number of response alternatives may be a preferable choice. We can also conclude that respondents can deal with scales with an even number of points and therefore the use of scales with a neutral category should not be mandatory.

Through an analysis of variance aiming to explain response scores and a logistic regression aiming to explain non-response, we confirmed the well-known idea in customer satisfaction studies that some socio-demographic variables, such as sex, age group and education level, influence the probability of non-response and the response profiles of respondents. We have also concluded that the scale used (ten-point or five-point) influences neither the probability of non-response, nor the response profiles of respondents. Finally, we found that there was not, in either analysis, any interaction between the scale used and education level. Therefore we can conclude that within any education level, people using different scales do not tend to exhibit different response probabilities or response profiles. This is a particularly important result since some criticism regarding the use of scales with a high number of response alternatives is that it would decrease the quality of response, particularly for people with lower educational levels.

If we use Cox’s (1980) definition regarding the optimal number of response alternatives for a scale – ‘a scale with the optimal number of response alternatives is refined enough to be capable of transmitting most of the information available from respondents without being so refined that it simply encourages response error’ – we can clearly state that in the context of our study ten points is a better choice for the number of response alternatives than five points.

Results obtained contradict some conventional wisdom defending scales with a lower number of points. What are the possible explanations for this? First, the context (e.g. population, market) may have some unknown characteristics that account for the superiority of ten-point scales, although this seems unlikely. In fact, the target population (users of mobile phones) reaches more than 80% of the Portuguese population, and they are well spread through all socio-demographic classes (Vilares & Coelho 2006). Second, there is the purpose of the research used for the analysis. We have tested the application of scales in the context of customer satisfaction measurement, which has seldom been used in previous work regarding the evaluation of rating scales. Although this research was conducted in the framework of customer satisfaction measurement, we find it unlikely that our conclusions are specific to this type of research. We could probably argue that this is a marketing research context where customers are particularly motivated to participate, since they understand that they are contributing to the improvement of the product/service offered by their supplier, but this seems to be an insufficient cause for the specificity of the results. Third, the analysis performed differs from most previous studies defending a lower number of response categories. In fact, we have compared the two scales using multi-item constructs estimated in a structural equation modelling (SEM) framework. Some of the previous work was concerned with single-item measures, and the studies using multi-item constructs were almost exclusively concerned with reliability. In fact, validity, which was one of the major concerns in our study, has not been used as a criterion for most studies. Fourth, it is possible that the conventional wisdom of 20 or 30 years ago no longer applies. One possible explanation is that consumers have become more sophisticated at taking tests and rating their attitudes. Globally, the familiarity with scales has definitely been growing. Decades ago, the use of standardised tests and attitude scales was much less common than today. Phrases such as ‘On a scale of 1 to 10, how would you rate …’ have crept into popular speech. Taken together, these facts may point to a shift in the ability of average consumers to use more discriminating scales with greater ease.

If it is true that consumers can use scales with more scale points more easily than they could in the past, there is every reason for researchers to use such scales. Our results suggest that more scale points, routinely used, will result in greater ability to identify important relationships, higher validity for constructs, and better hypothesis tests in theory and practice.

Nevertheless, this study shows several limitations and should be improved in different ways. First, we have considered only a numerical interval scale. In fact, when using other types of scales (e.g. a Likert-type scale) we may arrive at different conclusions. Second, our analysis is limited to the comparison between five and ten response alternatives. Although our results favour the ten-point scale we cannot state that this is an optimal number of response alternatives, and we may not exclude the possibility that a number of response alternatives between five and ten would produce better results than the scales analysed. Also, our study does not consider the effects of different types of labelling of the response alternatives. For instance, labelling that tries to produce an unbalanced scale may result in different conclusions when choosing the number of response alternatives. We have also not tested the use of any fully labelled scale, so we can only confirm the superiority of a ten-point scale over an end-labelled five-point one, but we cannot exclude the hypothesis that a fully labelled five-point scale would perform better. This limitation constitutes fertile ground for future research. Finally, it should be pointed out that the analysis was limited by the fact that the two questionnaires (using different scales) were administered to independent samples. Some additional analysis would be possible if the two questionnaires were administered to the same sample.

Acknowledgements

The authors would like to thank Dr Dwayne Ball (University of Nebraska-Lincoln) and the anonymous referees for their helpful comments.

References

Alwin, D.F. (1991) Research on survey quality. Sociological Methods & Research, 20, pp. 3–29.

Alwin, D.F. (1997) Feeling thermometers versus 7-point scales – which are better? Sociological Methods & Research, 25, 3, pp. 318–340.

Alwin, D.F. & Krosnick, J.A. (1991) The reliability of attitudinal survey measures: the role of question and respondent attributes. Sociological Methods & Research, 20, pp. 139–181.

Andrews, F.M. & Withey, S.B. (1976) Social Indicators of Well-Being: Americans’ Perceptions of Life Quality. New York: Plenum.

Ball, A.D., Coelho, P.S. & Machás, A. (2004) The role of communication and trust in explaining customer loyalty: an extension to the ECSI model. European Journal of Marketing, 38, available from the authors.

Cassel, C. & Eklof, J.A. (2001) Modeling customer satisfaction and loyalty on aggregate levels: experience from the ECSI pilot study. Total Quality Management, 12, 7–8, pp. 834–841.

Cassel, C., Hackl, P. & Westlund, A. (2000) On measurement of intangible assets: a study of robustness of partial least squares. Total Quality Management, 7, pp. 897–907.

Chin, W.W. (1998) The partial least squares approach to structural equation modeling. In: G.A. Marcoulides (ed.) Modern Methods for Business Research. Mahwah, NJ: Lawrence Erlbaum Associates.

Cicchetti, D.V., Showalter, D. & Tyrer, P.J. (1985) The effect of number of rating scale categories on levels of interrater reliability: a Monte Carlo investigation. Applied Psychological Measurement, 9, 1, pp. 31–36.

Colman, A.M., Norris, C.E. & Preston, C.C. (1997) Comparing rating scales of different lengths: equivalence of scores from 5-point and 7-point scales. Psychological Reports, 80, pp. 355–362.

Converse, J.M. & Presser, S. (1986) Survey Questions: Handcrafting the Standardized Questionnaire. Newbury Park, CA: Sage.

Cox, E.P. (1980) The optimal number of response alternatives for a scale: a review. Journal of Marketing Research, 17, pp. 407–422.

ECSI (1998) European Customer Satisfaction Index. Report prepared for the ECSI Steering Committee.

Fornell, C. (1992) A national customer satisfaction barometer: the Swedish experience. Journal of Marketing, 56, 1, pp. 6–21.

Fornell, C. & Larcker, D.F. (1981) Evaluating structural equation models with unobservable variables and measurement error. Journal of Marketing Research, 18, pp. 39–50.

Fornell, C., Johnson, M.D., Anderson, E.W., Cha, J. & Everitt Bryant, B. (1996) The American Customer Satisfaction Index: nature, purpose and findings. Journal of Marketing, 60, 4, pp. 7–18.

Fornell, C., Johnson, M.D., Anderson, E.W., Cha, J. & Everitt Bryant, B. (1998) The American Customer Satisfaction Index: Methodology Report. Ann Arbor, MI: University of Michigan.

Givon, M.M. & Shapira, Z. (1984) Response to rating scales: a theoretical model and its application to the number of categories problem. Journal of Marketing Research, 21, pp. 410–419.

Green, P.E. & Rao, V.R. (1970) Rating scales and information recovery – how many scales and response categories to use. Journal of Marketing, 34, pp. 33–39.

Haley, R.I. & Case, P.B. (1979) Testing thirteen attitude scales for agreement and brand discrimination. Journal of Marketing, 43, pp. 20–32.

Jenkins, G.D. & Taber, T.D. (1977) A Monte Carlo study of factors affecting three indices of composite scale reliability. Journal of Applied Psychology, 62, pp. 392–398.

Johnson, M., Gustafsson, A., Andreason, T.W., Lervik, L. & Cha, G. (2001) The evolution and future of national customer satisfaction index models. Journal of Economic Psychology, 22, pp. 217–245.

Krosnick, J.A. & Alwin, D.F. (1989) Response strategies for coping with the cognitive demands of survey questions. Unpublished manuscript. Ann Arbor, MI: University of Michigan, Institute for Social Research.

Malhotra, N. & Birks, D. (2003) Marketing Research: An Applied Approach, 2nd European edn. Prentice Hall.

Neumann, L. & Neumann, Y. (1981) Comparison of six lengths of rating scales: students’ attitudes toward instruction. Psychological Reports, 48, pp. 399–404.

Ramsay, J.O. (1973) The effect of number of categories in rating scales on precision of estimation of scale values. Psychometrika, 37, pp. 513–532.

Reynolds, F.D. & Neter, J. (1982) How many categories for respondent classification. Journal of the Market Research Society, 24, 4, pp. 345–346.

Stem, D.E. & Noazin, S. (1985) The effects of number of objects and scale positions on graphic position scale reliability. In: R.E. Lusch et al. (eds) AMA Educators’ Proceedings. Chicago: Marketing Association, pp. 370–372.

Tourangeau, R. (1984) Cognitive sciences and survey methods. In: T.B. Jabine, M.L. Straf, J.M. Tanur & R. Tourangeau (eds) Cognitive Aspects of Survey Methodology: Building a Bridge between Disciplines. Washington, DC: National Academy Press, pp. 73–100.

Vilares, M. & Coelho, P. (2004) The employee–customer satisfaction chain in the ECSI model. European Journal of Marketing, 37, pp. 1703–1722.

Vilares, M. & Coelho, P. (2006) ECSI-Portugal – Relatório de Sectores. Lisbon: IPQ.


About the authors

Pedro Simões Coelho is Associate Professor at Instituto Superior de Estatística e Gestão de Informação of the Universidade Nova de Lisboa (ISEGI-UNL). He is also a researcher in the Statistics and Information Management Center (CEGI) at ISEGI-UNL, Vice-President of the Portuguese Association for Classification and Data Analysis (CLAD) and Vice-President of Qmetrics, SA. Additionally, he is co-coordinator of the Portuguese committee of the ECSI-Portugal (European Customer Satisfaction Index) project. At ISEGI-UNL he is Director of the Master degrees and lectures courses in survey methodology, marketing research, data collection methodologies and quantitative methods for marketing. Pedro Simões Coelho has been a consultant for several organisations, including the Portuguese Statistical Office. His main research interests are in survey methodology, structural equation modelling, customer satisfaction measurement, and the explanation of customer loyalty.

Susana Pereira Esteves ([email protected]) is presently Assistant Professor at Instituto Superior de Estatística e Gestão de Informação of the Universidade Nova de Lisboa (ISEGI-UNL). She is a researcher in the Statistics and Information Management Research Center (CEGI) of the same university and is also a member of the workgroup of the ECSI-Portugal (European Customer Satisfaction Index) project. Her current work focuses on marketing research, in particular customer satisfaction and loyalty measurement.

Address correspondence to: Professor Pedro Simões Coelho, ISEGI-UNL, Campus de Campolide, 1070-312 Lisboa, Portugal.

Email: [email protected]
