Discriminant Analysis



Discriminant Analysis_2001-2002


Quantitative Techniques for Marketing

6.0 Discriminant Analysis

Aims

To give you an appreciation of the theoretical and practical issues in the application of discriminant analysis to marketing decision making.

In particular, the topic aims to indicate the objectives of discriminant analysis, data requirements, methodology and technical concepts, practical issues and potential applications.

Objectives

By the end of this section, and after additional private study and completion of the assignment, you should be able to:

- Understand the suitability of discriminant analysis in the context of data properties and research objectives.

- Appreciate theoretical issues and understand technical concepts of discriminant analysis.

- Understand how to decide how many discriminant functions to derive.

- Interpret the results of a discriminant function.

- Evaluate the classification performance of a discriminant function.

- Appreciate the use of discriminant analysis in marketing.

- Conduct a discriminant analysis using SPSS software.

- Critically evaluate published studies in discriminant analysis.

- Write a structured, critical account of the methodology, application, results and interpretation of discriminant analysis.


Content

6.1 Introduction: Objectives of Discriminant Analysis

6.2 Data Requirements

6.3 The Discriminant Function

6.3.1 Estimation of the function

6.3.2 Unstandardised and Standardised Coefficients

6.4 Classification using Discriminant Analysis

6.4.1 Discriminant Scores

6.4.2 Classification Criteria

6.4.3 Evaluating the Discriminant Function using Classification Performance

6.5 Multiple Discriminant Analysis

6.5.1 Choosing the Number of Functions - Eigenvalue criteria and Wilks' Lambda

6.6 Applications of Discriminant Analysis to Marketing

6.7 Summary


Reading

Hair J F, Anderson R E, Tatham R L and Black W C (1998) Multivariate Data Analysis, 5th Edition, New Jersey, USA: Prentice Hall International. Chapter 4.

Albers-Miller, N. D. (1999). Consumer Misbehaviour: Why People Buy Illicit Goods, Journal of Consumer Marketing, Vol. 16 No. 3, 273-287.

Beharrell, B. and Crockett, A. (1992). New Age Food! New Age Consumers! With or Without Technology Fix Please, British Food Journal, Vol. 94 No. 7.

Crask M R and Perreault J R (1977) Validation of Discriminant Analysis in Marketing Research. Journal of Marketing Research, Vol. XIV, No. 1, Feb.

Cunningham, I. C. M. and Cunningham, W. H. (1973) The Urban In-Home Shopper: Socio-Economic and Attitudinal Characteristics, Journal of Retailing, 49, 42-50.

Gamesalingham, S. and Kumar, K. (2001). Detection of Financial Stress via Multivariate Statistical Analysis, Managerial Finance, Vol. 27 No. 4, 45-55.

Korgaonkar, P., Silverblatt, R. and O'Leary, B. (2001). Web Advertising and Hispanics, Journal of Consumer Marketing, Vol. 18 No. 2, 134-152.

Kuei, C-H, Madu, C. N., Chinho, L. and Min, H. (1997). An Empirical Investigation of the Association Between Quality Management Practices and Organisational Climate, International Journal of Quality Science, Vol. 2 No. 2, 121-137.

Mannion M A, Cowan, C and Gannon, M (2000), Factors Associated With Perceived Quality Influencing Beef Consumption in Ireland, British Food Journal, Vol. 102, No. 2, pp. 195-210.

McEnally M R and Hawes J M (1984). The Market for Generic Brand Grocery Products, Journal of Marketing, Winter, 75-83.

Montgomery D B (1975). New Product Distribution: An Analysis of Supermarket Buyer Decisions, Journal of Marketing Research, (August 1975), 255-264.

Morrison, D. G. (1969) On the Interpretation of Discriminant Analysis. Journal of Marketing Research, Vol. 6, May, 156-163.

Perry M (1969) Discriminant Analysis of Relations between Consumers' Attitudes, Behaviour and Intentions. Journal of Advertising Research, Vol. 9, No. 2, 34-39.


Pessemier E A, Burger P C, and Tigert D J (1967) Can New Product Buyers be Identified? Journal of Marketing Research, Vol. 4, November, pp. 349-354.

Robertson T S and Kennedy J N (1968) Prediction of Consumer Innovators: Application of Multiple Discriminant Analysis. Journal of Marketing Research, Vol. 5, No. 1, Feb., pp. 64-69.

Sands S and Moore P (1981) Store Site Selection by Discriminant Analysis. Journal of the Market Research Society, Vol. 23, No. 1, Jan, pp. 40-51.

Siu, W-S and Tsoi, M-Y (1998). Nutrition Label Usage of Chinese Consumers. British Food Journal, Vol. 100, No. 1, 25-29.

Steel P, Storey D and Wynarczyk, P (1985) The Prediction of Small Company Failure Using Financial Statement Analysis, CURDS, University of Newcastle upon Tyne, Discussion Paper No 19. (See MRN for loan copy)

Taffler R and Houston A W (1980) How to Identify Failing Companies Before it is Late, Professional Administration. (See MRN for loan copy)

Tomlinson M (1994) Do Distinct Class Preferences for Foods Exist?, British Food Journal, Vol. 96 (7), 11-17.

Waldron D G (1978) The Image of Craftsmanship. A Predictor Variable Influencing the Purchase of European Automobiles by Americans. European Journal of Marketing, Vol. 12, No 8, pp. 554-561.

Williams, C. E. and Tse, E. C. Y. (1995). The Relationship Between Strategy and Entrepreneurship: The US Restaurant Sector. International Journal of Contemporary Hospitality Management, Vol. 7 No. 1, 22-26.


6.1 Introduction and Objectives

1. The aim of discriminant analysis is to explain and predict the group membership of things on the basis of measurements on explanatory variables.

2. Explanation / prediction uses a discriminant function, a linear combination of explanatory (discriminating) variables.

e.g. For 2 groups, 2 explanatory variables

D = a_1 x_1 + a_2 x_2    (1)

where D is the discriminant score, a_1 and a_2 are coefficients, and x_1 and x_2 are explanatory variables.

3. Analysis concerns estimation of the coefficients for an appropriate set of variables, interpretation of the relative importance of the variables and evaluation of the predictive power of the model.

4. Marketing applications include product usage, store site selection and company failure.


6.2 Data Requirements

1. The data must include a nominal dependent variable and metric explanatory variables. In this example freezer ownership [cat] is explained by family size [size] and real disposable monthly income [income]. The discriminant score used to predict cat is

D = a_1 size + a_2 income

No.   cat   size   income   dat
 1     2     1      2500     1
 2     2     2      3000     1
 3     2     2      4000     1
 4     2     5      4500     1
 5     2     4      5000     1
 6     2     2      5500     1
 7     1     4      6000     1
 8     1     4      7000     1
 9     1     2      8500     1
10     1     4     10000     1
11     2     1      2500     2
12     1     2      3000     2
13     2     2      4000     2
14     2     5      4500     2
15     1     4      5000     2
16     2     2      5500     2
17     2     4      6000     2
18     1     4      7000     2
19     1     2      8500     2
20     2     4     10000     2

Variables

cat = group variable:
    1 = freezer owner
    2 = non-freezer owner

size = family size

income = monthly disposable income in real terms

dat = data selection variable:
    1 = data for estimation
    2 = data for classification
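As a minimal sketch of how this data set can be handled outside SPSS (Python is an assumption here, and the names `rows`, `estimation`, `holdout` and `group_means` are illustrative only), the observations can be split on dat and the group averages computed for the estimation sample:

```python
# Each row: (no, cat, size, income, dat), copied from the data table in Section 6.2.
rows = [
    (1, 2, 1, 2500, 1), (2, 2, 2, 3000, 1), (3, 2, 2, 4000, 1),
    (4, 2, 5, 4500, 1), (5, 2, 4, 5000, 1), (6, 2, 2, 5500, 1),
    (7, 1, 4, 6000, 1), (8, 1, 4, 7000, 1), (9, 1, 2, 8500, 1),
    (10, 1, 4, 10000, 1),
    (11, 2, 1, 2500, 2), (12, 1, 2, 3000, 2), (13, 2, 2, 4000, 2),
    (14, 2, 5, 4500, 2), (15, 1, 4, 5000, 2), (16, 2, 2, 5500, 2),
    (17, 2, 4, 6000, 2), (18, 1, 4, 7000, 2), (19, 1, 2, 8500, 2),
    (20, 2, 4, 10000, 2),
]

# dat == 1 -> estimation sample, dat == 2 -> classification (holdout) sample
estimation = [r for r in rows if r[4] == 1]
holdout = [r for r in rows if r[4] == 2]


def group_means(sample, group):
    """Return (mean size, mean income) for one group (cat) within a sample."""
    members = [r for r in sample if r[1] == group]
    n = len(members)
    return (sum(r[2] for r in members) / n, sum(r[3] for r in members) / n)


for g in (1, 2):
    size_mean, income_mean = group_means(estimation, g)
    print(f"group {g}: mean size {size_mean:.5f}, mean income {income_mean:.3f}")
```

The group averages from the estimation sample are the values later substituted into the discriminant function to obtain group scores.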


6.3 The Discriminant Function

1. With g groups, at most g-1 discriminant functions can be derived (fewer if there are fewer than g-1 explanatory variables). We consider the 2 group case.

6.3.1 Estimation

1. The objective is to find a linear combination of the explanatory variables which separates the groups. For example, with p explanatory variables the discriminant function is:

D = a_1 x_1 + a_2 x_2 + ... + a_p x_p    (2)

2. With g groups, estimation can employ canonical correlations to derive up to g-1 functions as a descending hierarchy so that:

D1 explains the group differences the most

D2 explains group differences not explained by D1 etc.

6.3.2 Discriminant Function Coefficients

1. Output usually presents 2 types of functions:

(a) based on unstandardised coefficients (a_u)

(b) based on standardised coefficients (a_s)

2. Standardised coefficients are those that would be obtained if the explanatory variables had zero means and unit variance. They are used to interpret the relative importance of the explanatory variables.

3. Unstandardised coefficients are used for prediction. See the freezer ownership example.


Example: Freezer Ownership

Groups

1 = freezer owner
2 = non-owner

Explanatory variables

Size = family size
Income = real family income

Results (based on data from Section 6.2)

Variable    Standardised    Unstandardised
            coefficient     coefficient
Size        0.00537          0.004013
Income      0.99877          0.000709
Constant    n/a             -3.979905

D_s = 0.00537 size + 0.99877 income

D_u = 0.004013 size + 0.000709 income - 3.979905
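The link between the two sets of coefficients can be checked by hand. The sketch below assumes the common convention that a standardised coefficient equals the unstandardised coefficient multiplied by the pooled within-group standard deviation of that variable; the handout does not state this, but the convention reproduces the reported figures to within rounding. The helper names are illustrative.

```python
from math import sqrt

# Estimation sample from Section 6.2: (cat, size, income)
estimation = [
    (2, 1, 2500), (2, 2, 3000), (2, 2, 4000), (2, 5, 4500), (2, 4, 5000),
    (2, 2, 5500), (1, 4, 6000), (1, 4, 7000), (1, 2, 8500), (1, 4, 10000),
]


def pooled_within_sd(values_by_group):
    """Pooled within-group standard deviation: sqrt(SS_within / (n - g))."""
    ss_within = 0.0
    n = 0
    for values in values_by_group:
        mean = sum(values) / len(values)
        ss_within += sum((v - mean) ** 2 for v in values)
        n += len(values)
    return sqrt(ss_within / (n - len(values_by_group)))


def by_group(column):
    """Split one column of the estimation sample into per-group lists."""
    return [[row[column] for row in estimation if row[0] == g] for g in (1, 2)]


unstandardised = {"size": 0.004013, "income": 0.000709}
columns = {"size": 1, "income": 2}

# Assumed convention: standardised = unstandardised * pooled within-group SD
standardised = {
    name: coef * pooled_within_sd(by_group(columns[name]))
    for name, coef in unstandardised.items()
}

for name, value in standardised.items():
    print(f"{name}: {value:.5f}")
```

The income figure differs slightly from 0.99877 in the later decimal places because the unstandardised coefficient is only quoted to six decimal places. Either way, income dominates the standardised function, which is the point of comparing standardised rather than unstandardised coefficients.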


6.4 Classification Using Discriminant Analysis

6.4.1 Discriminant Scores

1. Discriminant scores are the basis for prediction. They are derived from the unstandardised function and the values of the explanatory variables, and can be computed for cases not used for estimation.

2. We classify the items on the basis of:

(a) individual scores

(b) group scores

3. Individual scores are obtained by substituting individual values of variables in the unstandardised function.

No.   Group   Size   Income   D score
 1     2       1      2500    -2.2045
 2     2       2      3000    -1.8462
 3     2       2      4000    -1.1377
 4     2       5      4500    -0.7714
 5     2       4      5000    -0.4211
 6     2       2      5500    -0.0749
 7     1       4      6000     0.2874
 8     1       4      7000     0.9960
 9     1       2      8500     2.0508
10     1       4     10000     3.1216

4. Group scores are obtained by substituting average group values for each variable in the unstandardised function.

Group   Average size   Average income   D score
1       3.50000        7875.000          1.614
2       2.66667        4083.333         -1.076
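The substitution described in points 3 and 4 can be sketched as follows. The function name `d_score` is illustrative, and the coefficients are the rounded values quoted in the freezer example, so the results agree with the tabulated scores only to about two decimal places.

```python
def d_score(size, income):
    """Unstandardised discriminant function from the freezer ownership example."""
    return 0.004013 * size + 0.000709 * income - 3.979905


# Estimation sample: (no, group, size, income)
estimation = [
    (1, 2, 1, 2500), (2, 2, 2, 3000), (3, 2, 2, 4000), (4, 2, 5, 4500),
    (5, 2, 4, 5000), (6, 2, 2, 5500), (7, 1, 4, 6000), (8, 1, 4, 7000),
    (9, 1, 2, 8500), (10, 1, 4, 10000),
]

# Individual scores: substitute each case's own values.
individual = {no: d_score(size, income) for no, _group, size, income in estimation}

# Group scores: substitute the group average of each variable.
group_averages = {1: (3.5, 7875.0), 2: (2.66667, 4083.333)}
group_scores = {g: d_score(*avg) for g, avg in group_averages.items()}

for no, score in individual.items():
    print(f"case {no:2d}: D = {score:7.4f}")
for g, score in group_scores.items():
    print(f"group {g}: D = {score:6.3f}")
```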


6.4.2 Classification Criteria

1. Classification compares individual scores with the group scores. Individuals are classified in the nearest group.

2. We can implement this using the mid-point (M) between the group scores, classifying each case by comparing its score (D_i) with M:

if D_i < M classify in the left (lower-scoring) group

if D_i > M classify in the right (higher-scoring) group

In the freezer example M = (1.614 + (-1.076)) / 2 = 0.269, so cases scoring below 0.269 are classified as non-owners (group 2) and cases scoring above it as owners (group 1).

No.   Actual group   Size   Income   D score   Predicted group
 1     2              1      2500    -2.2045    2
 2     2              2      3000    -1.8462    2
 3     2              2      4000    -1.1377    2
 4     2              5      4500    -0.7714    2
 5     2              4      5000    -0.4211    2
 6     2              2      5500    -0.0749    2
 7     1              4      6000     0.2874    1
 8     1              4      7000     0.9960    1
 9     1              2      8500     2.0508    1
10     1              4     10000     3.1216    1
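A minimal sketch of the mid-point rule, using the group scores and D scores reported in the tables of this section. The handling of a score exactly equal to M is an assumption (the handout does not cover that case), and `classify` is an illustrative name.

```python
# Group scores from Section 6.4.1
group_scores = {1: 1.614, 2: -1.076}

# Mid-point between the two group scores
midpoint = sum(group_scores.values()) / 2

low_group = min(group_scores, key=group_scores.get)    # "left" group
high_group = max(group_scores, key=group_scores.get)   # "right" group


def classify(d):
    """Assign a case to the group whose score lies on the same side of M.

    Assumption: a score exactly equal to M goes to the higher-scoring group.
    """
    return low_group if d < midpoint else high_group


# D scores of the 10 estimation cases, in case order
d_scores = [-2.2045, -1.8462, -1.1377, -0.7714, -0.4211,
            -0.0749, 0.2874, 0.9960, 2.0508, 3.1216]

predicted = [classify(d) for d in d_scores]
print(f"M = {midpoint:.3f}")
print(predicted)
```

With only two groups, comparing a score with M is equivalent to picking the nearer of the two group scores, which is why the mid-point can stand in for the "nearest group" rule.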


6.4.3 Evaluation of the Discriminant Function From Classification Criteria

1. Evaluation compares actual group membership with predicted group membership.

2. Morrison (1969) suggests this should be carried out using a holdout sample, i.e. data not used for estimation. That is, the full data set is split, with some data used for estimation and the rest used as a holdout sample for evaluation.

3. For example, suppose we have 10 additional observations and classify these on the basis of the existing function and group scores.

4. The D scores and the actual and predicted group memberships are shown below:

No.   Actual group   Size   Income   D score   Predicted group
 1     2              1      2500    -2.2045    2
 2     1              2      3000    -1.8462    2
 3     2              2      4000    -1.1377    2
 4     2              5      4500    -0.7714    2
 5     1              4      5000    -0.4211    2
 6     2              2      5500    -0.0749    2
 7     2              4      6000     0.2874    1
 8     1              4      7000     0.9960    1
 9     1              2      8500     2.0508    1
10     2              4     10000     3.1216    1

5. Actual and predicted group allocation is summarised in a confusion or classification matrix as follows:

                 Predicted group
Actual group     1      2      Totals
1                2      2       4
2                2      4       6
Totals           4      6      10
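The classification matrix is a cross-tabulation of actual against predicted group. A short sketch, with the actual and predicted groups copied from the holdout table in this section (variable names are illustrative):

```python
from collections import Counter

# Holdout sample, cases 1 to 10
actual = [2, 1, 2, 2, 1, 2, 2, 1, 1, 2]
predicted = [2, 2, 2, 2, 2, 2, 1, 1, 1, 1]

# Count each (actual, predicted) pair
counts = Counter(zip(actual, predicted))

# matrix[actual_group][predicted_group] -> number of cases
matrix = {a: {p: counts[(a, p)] for p in (1, 2)} for a in (1, 2)}

# Correctly classified cases lie on the diagonal
correct = sum(matrix[g][g] for g in (1, 2))

for a in (1, 2):
    print(a, matrix[a][1], matrix[a][2], sum(matrix[a].values()))
print("correct:", correct, "of", len(actual))
```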


6. Performance is summarised by C, the per cent correct classification:

C = (2 + 4) / 10 x 100 = 60%    (3)

7. In addition to this measure we need a comparative basis on which to evaluate performance.

8. Common bases are

(a) the C max criterion

(b) the C pro criterion

9. C max is based on classification of all cases into the dominant group.

e.g. p = probability of belonging to group 1

1-p = probability of belonging to group 2

C max = max(p, 1-p), with the criterion that

if C > C max the function is good

e.g. C max = max(0.40, 0.60) = 0.60

C = 60%, which only equals C max, so on this criterion the function does no better than assigning every case to the larger group.

10. C pro is based on the probability of correct classification in all groups using a random method.


11. Defining

proportion of actual cases in group 1 = p

proportion of actual cases in group 2 = 1-p

C pro = p^2 + (1-p)^2    (4)

so the criterion is:

if C > C pro the function is good

e.g. C pro = (0.40)^2 + (0.60)^2 = 0.52

C = 60%, which exceeds C pro = 52%, so on this criterion the function is good.
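The hit rate and both benchmarks can be computed directly from the classification matrix. This sketch uses exact fractions so that the comparison of C with C max is not affected by floating-point rounding; the variable names are illustrative.

```python
from fractions import Fraction

# Classification matrix from Section 6.4.3: matrix[actual][predicted]
matrix = {1: {1: 2, 2: 2}, 2: {1: 2, 2: 4}}

n = sum(sum(row.values()) for row in matrix.values())

# C: proportion of cases on the diagonal (correctly classified)
c = Fraction(sum(matrix[g][g] for g in matrix), n)

# p: proportion of actual cases in group 1
p = Fraction(sum(matrix[1].values()), n)

c_max = max(p, 1 - p)            # assign every case to the larger group
c_pro = p ** 2 + (1 - p) ** 2    # proportional chance criterion, equation (4)

print(f"C = {float(c):.0%}, C max = {float(c_max):.0%}, C pro = {float(c_pro):.0%}")
print("C > C max:", c > c_max)
print("C > C pro:", c > c_pro)
```

In this example the function passes the C pro test but not the stricter C max test, so the verdict depends on which benchmark is adopted.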


6.5 Multiple Discriminant Analysis

1. With g groups the maximum number of functions required is g-1, but successful discrimination may be possible with fewer functions.

2. Choice of the number of functions

(a) Eigenvalue/ variance criterion

(b) Wilks' lambda

3. Eigenvalue/variance

e.g. 4 groups (3 functions at most)

Function    Eigenvalue    Percentage    Comment
number                    variance
1           0.31781        61.52        Substantial
2           0.19802        38.33        Fair
3           0.00078         0.15        Poor
Total       0.51661       100.00

The contribution of function 3 is poor, so it would not be derived.
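The percentage-of-variance column is each eigenvalue expressed as a share of the eigenvalue total. A brief sketch using the three eigenvalues from the example (variable names are illustrative):

```python
# Eigenvalues of the three discriminant functions in the 4-group example
eigenvalues = [0.31781, 0.19802, 0.00078]

total = sum(eigenvalues)

# Share of the total discriminating variance carried by each function
percent = [100 * ev / total for ev in eigenvalues]

# Running total, useful for judging how many functions are worth keeping
cumulative = []
running = 0.0
for share in percent:
    running += share
    cumulative.append(running)

for number, (ev, share, cum) in enumerate(zip(eigenvalues, percent, cumulative), start=1):
    print(f"function {number}: eigenvalue {ev:.5f}, {share:6.2f}% (cumulative {cum:6.2f}%)")
```

The first two functions together account for almost all of the variance, which is the basis for dropping function 3.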

4. Wilks' lambda is an inverse measure of the discriminatory power in the explanatory variables which has not been removed by the current set of discriminant functions: a value close to 1 means that little discriminating information remains.

5. The statistical test is for the significance of the information which has not been explained by the current set of functions.

6. The hypotheses are

H0: remaining information is not significant

H1: remaining information is significant


7. The test is a χ^2 test.

8. Example

After       Lambda    Actual     df    Critical    Sig.
function              χ^2              χ^2
0           0.633     85.538     12    21.026      0.000
1           0.834     33.931      6    12.592      0.010
2           0.999      0.145      2     5.991      0.243

Note

1. df = degrees of freedom

2. Critical χ^2 is based on a 5 per cent significance level.
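Two things can be checked from the example. First, the residual Wilks' lambda after k functions is the product of 1 / (1 + eigenvalue) over the functions not yet derived; the sketch assumes that the eigenvalues in the eigenvalue/variance table belong to the same analysis as this example, which the handout does not state, although they reproduce the Lambda column to three decimal places. Second, the decision rule compares the actual χ^2 with the critical value at each step. All names are illustrative.

```python
# Eigenvalues from the eigenvalue/variance table (assumed to be the same analysis)
eigenvalues = [0.31781, 0.19802, 0.00078]


def wilks_lambda(remaining):
    """Residual Wilks' lambda: product of 1 / (1 + eigenvalue) over remaining functions."""
    result = 1.0
    for ev in remaining:
        result /= 1.0 + ev
    return result


# Lambda after 0, 1 and 2 functions have been derived
lambdas = [wilks_lambda(eigenvalues[k:]) for k in range(len(eigenvalues))]

# (functions already derived, actual chi-square, critical chi-square at 5%)
tests = [(0, 85.538, 21.026), (1, 33.931, 12.592), (2, 0.145, 5.991)]

# Keep deriving functions while the remaining information is significant
n_functions = 0
for after, actual, critical in tests:
    if actual > critical:        # reject H0: another function is worth deriving
        n_functions = after + 1
    else:                        # H0 not rejected: stop
        break

print([round(value, 3) for value in lambdas])
print("functions to derive:", n_functions)
```

On these figures H0 is rejected after 0 and after 1 functions but not after 2, so two functions would be derived, in line with the eigenvalue criterion.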


6.6 Applications of Discriminant Analysis to Marketing

1. Discriminant analysis has been applied to three main areas:

(a) product/service users and non-users;

(b) store site selection;

(c) prediction of company failure.

2. In the application to product use, consumers are classified by their degree of product use (user vs. non-user; heavy, medium or light user) or by the time that elapses before they try a product (early adopter, late adopter, non-adopter). Psychographic and demographic variables are then used as discriminating variables.

3. See for example the application to product innovators (Robertson and Kennedy, 1968), buyers of a new supermarket product (Montgomery, 1975), generic brand grocery products (McEnally and Hawes, 1984), or buyers of a new detergent (Pessemier, Burger and Tigert, 1967).

4. A study of beef consumption behaviour in Ireland by Mannion et al (2000) employed factor analysis of a 25-item scale concerned with the importance of a series of attributes (health, safety and quality) associated with beef. The solution produced 7 factors that were subsequently employed in a discriminant analysis of consumers who had maintained their consumption and consumers who had reduced consumption.

5. The application to store site selection classifies stores on the basis of performance and uses demographic characteristics of the population to discriminate between good and bad sites (Sands and Moore, 1981). The objective of the analysis is to formulate a screening policy for new store sites.

6. The application to the prediction of company failure classifies firms on the basis of their performance and uses financial ratios to discriminate between good and bad performers. The objective is to provide a decision framework to anticipate company decline and institute policies to prevent failure. See for example, Steele, Storey and Wynarczyk (1985), Taffler and Houston (1980) and Taffler (1982).


6.7 Summary

1. Discriminant analysis is a useful aid to classify people or objects into groups using metric or non-metric discriminating variables.

2. The advantage over univariate analysis is that it establishes inter-group profiles and identifies a hierarchy of relevant variables.

3. The effect of estimation bias on the evaluation of predictive performance can be avoided by using a split data set.

4. A critical aspect of the analysis involves the identification of relevant discriminating variables. In more sophisticated applications, the definition of both dependent and discriminating variables may be less straightforward, e.g. defining product adopters or devising psychographic variables to measure social mobility or venturesomeness.