SMU Marketing Research Ch-11

UNIT 11 DATA ANALYSIS

Structure

11.1 Introduction

Objectives

11.2 Descriptive statistics

11.3 Univariate tests

11.4 Discriminant analysis

11.5 Correlation Analysis of variance and covariance

11.6 Linear regression analysis

11.7 Logistic regression analysis

11.8 Types of factor analysis

Exploratory factor analysis

Confirmatory factor analysis

11.9 Cluster analysis

11.10 Multidimensional scaling techniques

11.11 Conjoint analysis

11.12 Summary

11.13 Glossary

11.14 Terminal Questions

11.15 Answers

11.16 Case Study

11.1 INTRODUCTION

In the previous unit, you learnt the concept of central tendency and its measures. You also studied the measures of dispersion and, finally, various types of analysis such as bivariate analysis and multivariate analysis. In this unit, you will focus on the concept of data analysis in more detail.

Data analysis is a process of inspecting, cleaning, transforming and modelling data so that the objective of collecting relevant information can be achieved. Data analysis has multiple facets, techniques and approaches, such as descriptive statistics, linear regression analysis, ANOVA, structural equation modelling, factor analysis, cluster analysis, conjoint analysis and multidimensional scaling.

In this unit you will learn the types of multivariate analysis, which include discriminant analysis, factor analysis, cluster analysis, conjoint analysis and multidimensional scaling. You will also learn the two types of factor analysis: exploratory and confirmatory.

Objectives

After studying this unit, you should be able to:

describe descriptive statistics

explain univariate, multivariate and discriminant analysis

discuss ANOVA and linear and logistic regression analysis

outline factor, cluster and conjoint analysis

explain the multidimensional scaling techniques

CASELET

Regression Analysis

Our client, a regional manufacturer, approached SQPS with a request to analyze historical data. The client wanted to identify relationships among variables and to predict a response variable given the values of the independent variables. There were 21 independent variables and one response (Throughput).

Approach: Our approach can be summarized as follows:



• A team consisting of the client's continuous improvement associates was assembled

• Historical data was reviewed to ensure accuracy, validity and completeness

• Some data transformation was conducted

• Correlation analysis was performed to identify relationships among independent variables

• Stepwise regression was run to develop a prediction formula

• Prediction formula was verified using other raw data

Results: The results showed that:

• Relationships among independent variables were identified.

• Highly correlated variables were examined, as we only needed one of each pair for the prediction model.

• A prediction model for "Throughput" was estimated with R-Sq of 88%. This means that 88% of the variability in the "Throughput" data was explained by the effect of the model. This was a very reliable prediction model.

Source: http://www.shraimqps.com/Resources/case_study2-reg.htm

11.2 DESCRIPTIVE STATISTICS

Descriptive statistics describe the features or characteristics of a data set. They also help in checking for violations of the assumptions that underlie statistical techniques and in addressing specific research questions. Many advanced statistical tests are very sensitive to such violations, so descriptive statistics give the researcher a clear picture of where the data violate the assumptions. Descriptive statistics can be obtained for both categorical and continuous variables, and programmes such as SPSS produce them. For categorical variables, they give frequencies and percentages, for example, how many respondents are male and how many are female. For continuous variables, they give the mean, standard deviation, skewness and kurtosis.

Describing the main characteristics of a collection of data in a quantitative way is termed descriptive statistics. Descriptive statistics differ substantially from inferential statistics. In statistical inference, conclusions are drawn from data that are subject to random variation, for example, errors during observation.

Descriptive statistics, on the other hand, aim to summarize a data set rather than use the data to learn about the population that the data are supposed to represent. Unlike inferential statistics, descriptive statistics are not developed on the basis of probability theory. Even when the main conclusions of a data analysis are drawn using inferential statistics, descriptive statistics are generally presented as well.

For example: in a report on a study involving human subjects, a table typically shows the overall sample size, the sample sizes in important subgroups, and demographic or clinical characteristics such as the average age, the proportion of subjects of each gender and the proportion of subjects with comorbidities.
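The summaries described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any particular package: the survey values are hypothetical, the function names are made up for this sketch, and the moments use the simple divide-by-n formulas, whereas programmes such as SPSS apply small-sample corrections and may report slightly different figures.

```python
from collections import Counter
from math import sqrt

# Hypothetical survey responses (illustrative values only).
gender = ["male", "female", "female", "male", "female", "female", "male", "female"]
age = [23, 35, 31, 46, 28, 52, 39, 30]


def frequency_table(values):
    """Return {category: (count, percent)} for a categorical variable."""
    counts = Counter(values)
    total = len(values)
    return {category: (n, 100.0 * n / total) for category, n in counts.items()}


def describe(values):
    """Return mean, standard deviation, skewness and excess kurtosis.

    Uses population (divide-by-n) moments, an assumption of this sketch.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return {
        "mean": mean,
        "std": sqrt(m2),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3.0,
    }


gender_table = frequency_table(gender)  # categorical: frequency and percent
age_summary = describe(age)             # continuous: moments
print(gender_table)
print(age_summary)
```

A positive skewness indicates a longer right tail, and an excess kurtosis near zero indicates tails similar to those of a normal distribution.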

Self assessment questions

1. Descriptive analysis also traces out the specific research questions. (True/False)

2. In ______, conclusions are drawn from data that are subject to random variation.

11.3 UNIVARIATE TESTS

Univariate analysis is the simplest form of quantitative statistical analysis. It describes a single variable and its attributes across the units of analysis.

For example: if the variable being analysed is age, the researcher finds out how many subjects fall into each age category.

There are major differences between univariate analysis and bivariate or multivariate analysis. Bivariate analysis examines two variables at the same time, whereas multivariate analysis examines more than two variables at the same time.

Univariate analysis is normally used for descriptive purposes, while bivariate and multivariate analysis are used for explanatory purposes. Univariate analysis is usually carried out in the first stage of research, before bivariate or multivariate analysis is applied to the data.

The process of univariate analysis includes creating a frequency distribution of the individual cases, which shows the number of cases observed in the sample for each attribute of the variable. This can be done in a table format or with a pie chart, bar chart or other similar type of graphical presentation.

For example: a sample frequency distribution table and a bar chart are given below. The table shows the frequency distribution of the variable age, which comes under univariate analysis. The bar chart, which comes under bivariate analysis, involves two variables: incarceration rate and country.

Age range    Frequency    Percent
under 18            10          5
18–29               50         25
29–45               40         20
45–65               40         20
over 65             60         30

Valid cases: 200    Missing cases: 0
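The percent column of the age table follows directly from the frequencies. A minimal sketch, assuming the counts are typed in by hand as a dictionary (the variable names are chosen for this illustration only):

```python
# Frequencies copied from the age table above.
frequencies = {
    "under 18": 10,
    "18-29": 50,
    "29-45": 40,
    "45-65": 40,
    "over 65": 60,
}

valid_cases = sum(frequencies.values())

# Percent of valid cases falling in each age range.
percents = {label: 100 * count / valid_cases for label, count in frequencies.items()}

print(f"{'Age range':<10}{'Frequency':>10}{'Percent':>9}")
for label, count in frequencies.items():
    print(f"{label:<10}{count:>10}{percents[label]:>9.0f}")
print(f"Valid cases: {valid_cases}")
```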

Figure 11.1: Frequency distribution chart (ranked from lowest to highest) comparing international incarceration rates in 2002




Source: http://en.wikipedia.org/wiki/Univariate_analysis

In univariate analysis, several tools and methods can be used, and their applicability depends upon the nature of the variable, that is, whether you are dealing with a discrete variable or a continuous variable.

For example: discrete variables like gender, state, country etc. and continuous variables like income, age etc.

In addition to frequency distributions, univariate analysis also uses measures of central tendency (location).

Central tendency describes the way in which quantitative data cluster around some value. It can be measured by calculating an average of a set of measurements. The average may be the mean, the median, the mode or another measure.

Measures of central tendency are complemented by measures of statistical dispersion, which are also used in univariate analysis. These measures describe how the values are distributed around the central value. Commonly used measures of dispersion are the range and the standard deviation.
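The measures of central tendency and dispersion named above can be computed with Python's standard `statistics` module. The ages below are hypothetical values chosen for illustration, and the standard deviation shown is the sample version (dividing by n - 1), which is an assumption of this sketch.

```python
import statistics

# Hypothetical ages of ten respondents (illustrative values only).
ages = [22, 25, 25, 31, 34, 38, 41, 41, 41, 52]

# Central tendency
mean_age = statistics.mean(ages)
median_age = statistics.median(ages)
mode_age = statistics.mode(ages)

# Dispersion
age_range = max(ages) - min(ages)
std_age = statistics.stdev(ages)  # sample standard deviation (n - 1)

print("mean:", mean_age, "median:", median_age, "mode:", mode_age)
print("range:", age_range, "standard deviation:", round(std_age, 2))
```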


3. There are various similarities between univariate and bivariate analysis or multivariate analysis. (True/False)

4. _____________ can be measured in terms of calculating the average of a set of measurements.

11.4 DISCRIMINANT ANALYSIS

The process of classifying a set of observations into predefined classes is known as discriminant analysis. The main goal of discriminant analysis is to determine the class of an observation on the basis of a set of input variables, also known as predictors. Discriminant analysis has two basic objectives:

a) To assess the adequacy of the classification, given the group memberships of the objects under study.

b) To assign objects to one of a number of known groups.

Discriminant analysis may thus have a descriptive or a predictive objective. In both cases, some objects must already be assigned to groups before the discriminant analysis is carried out. These group assignments can be arrived at by any method. Thus Cluster Analysis or Principal Components Analysis proves to be a useful complement to Discriminant Analysis.

For example: you can use discriminant analysis to distinguish consumers who purchase a Sony laptop from those who purchase a Lenovo laptop. You can also use it to classify the performance of your sales personnel as low, medium or high.

Where the classes are known, the model is built on a set of observations whose classes are already known. The model calculates the dependent value from the values of the independent variables as:

$$Z = b_1 x_1 + \dots + b_n x_n + c$$

Where,

Z = Discriminant score

b = Discriminant weight for the variable

x = Independent variable

c = Constant

Source: Marketing Research, Avinash Kapoor, Ch-10

As the above equation shows, each independent variable is multiplied by its corresponding weight b and the products are added to the constant, so that each individual gets a single composite discriminant score. The average of the discriminant scores of all the individuals in a particular group is the group mean, which is known as the centroid. If two groups are involved, there are two centroids. This is similar to multiple regression, where several independent variables are combined in one equation. Discriminant analysis is helpful in predicting the class of a new observation whose class is unknown.

For example: for a k-class problem, k discriminant functions are constructed. When a new observation is given, all k discriminant functions are evaluated and the observation is assigned to class i if the ith discriminant function has the highest value.
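The discriminant score and the group centroids can be illustrated with a short sketch. Everything numeric here is hypothetical: the weights and the constant are assumed rather than estimated from data, and assigning a new observation to the group with the nearest centroid is one simple rule chosen for this illustration, not necessarily the rule a statistical package would apply.

```python
# Hypothetical discriminant weights and constant (assumed, not estimated).
weights = [0.6, 0.4]  # b1, b2
constant = -5.0       # c


def discriminant_score(x):
    """Z = b1*x1 + ... + bn*xn + c for one individual."""
    return sum(b * xi for b, xi in zip(weights, x)) + constant


def centroid(group):
    """Mean discriminant score of all individuals in a group."""
    return sum(discriminant_score(x) for x in group) / len(group)


# Hypothetical observations (x1, x2) with known group membership.
group_a = [(2.0, 3.0), (3.0, 4.0), (4.0, 2.0)]
group_b = [(7.0, 8.0), (8.0, 6.0), (9.0, 9.0)]

centroids = {"A": centroid(group_a), "B": centroid(group_b)}


def classify(x):
    """Assign a new observation to the group whose centroid is nearest."""
    z = discriminant_score(x)
    return min(centroids, key=lambda group: abs(z - centroids[group]))


print(centroids)
print(classify((3.0, 3.0)), classify((8.0, 8.0)))
```

With two groups there are two centroids, as the text notes; a new observation with a score between them goes to whichever centroid is closer.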

There are various methods of discriminant analysis:



a) Fisher's Linear Discriminant Analysis

b) Multiple Discriminant Analysis

c) K-Nearest Neighbors’ Discriminant Analysis.


5. Cluster Analysis or _________ Analysis proves to be a useful complement for Discriminant Analysis.

6. The main goal of ______ analysis is the determination of observational class which is based on a set of variables namely input variables also known as predictors.

a) Discriminant

b) Variance

c) Factor

d) Regression

11.5 CORRELATION ANALYSIS OF VARIANCE AND COVARIANCE

There is an interesting relationship between the analysis of variance (ANOVA) and the simple correlation coefficient, and this relationship affords many pedagogical opportunities. Both methods use graphical representation of the data. Such a representation depicts the strength of the relationship between the dependent and independent variables and also reveals potential violations of the assumptions of the analysis of variance.

Historically, a direct link has been made between the analysis of variance and the correlation coefficient in terms of both hypothesis testing and conceptualization.

For example: in a one-way ANOVA with k treatment groups, the square of the simple correlation between each individual's score and his or her group mean is

$$r^2_{y\bar{y}_k} = \frac{SS_{between}}{SS_{total}} = \eta^2$$

This allows a direct F test of the equality of the k group means:

$$F = \frac{v_2}{v_1} \cdot \frac{\eta^2}{1 - \eta^2}$$

based on v1 and v2 degrees of freedom. In terms of the square of correlation’s numerical components, you have,

Source:http://www.jstor.org/discover/10.2307/2685166?uid=3738256&uid=2129&uid=2&uid=70&uid=4&sid=47699108992517

You will understand this relationship with the help of following example:

Suppose there is a three-group design with a total of 20 subjects. The scores of the first group are 3, 5, 6, 7, 8, 9 and 11 (average = 7); the scores of the second group are 4, 5, 6, 8, 9, 10 and 16 (average = 9); and the scores of the third group are 5, 7, 9, 10, 11 and 12 (average = 9).
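The link between the correlation coefficient and the F test can be checked numerically on these scores. The sketch below uses the scores exactly as printed; note that the printed second-group scores average about 8.29 rather than 9, so the figures computed here follow the scores, not the stated averages. The degrees of freedom are taken as v1 = k - 1 and v2 = N - k, the usual one-way ANOVA values, and the variable names are chosen for this illustration.

```python
from math import sqrt

# Scores as printed in the example above.
groups = [
    [3, 5, 6, 7, 8, 9, 11],
    [4, 5, 6, 8, 9, 10, 16],
    [5, 7, 9, 10, 11, 12],
]

scores = [x for g in groups for x in g]
group_means = [sum(g) / len(g) for g in groups for _ in g]  # one per subject
n_total, k = len(scores), len(groups)
grand_mean = sum(scores) / n_total

ss_total = sum((x - grand_mean) ** 2 for x in scores)
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
ss_within = ss_total - ss_between

# Eta squared as a ratio of sums of squares ...
eta_squared = ss_between / ss_total

# ... and as the squared correlation between each score and its group mean.
mean_of_means = sum(group_means) / n_total
cov = sum((y - grand_mean) * (m - mean_of_means) for y, m in zip(scores, group_means))
var_m = sum((m - mean_of_means) ** 2 for m in group_means)
r = cov / sqrt(ss_total * var_m)

# F computed from eta squared and from the mean squares.
v1, v2 = k - 1, n_total - k
f_from_eta = (v2 / v1) * eta_squared / (1 - eta_squared)
f_from_ms = (ss_between / v1) / (ss_within / v2)

print(round(eta_squared, 4), round(r ** 2, 4))
print(round(f_from_eta, 4), round(f_from_ms, 4))
```

Both routes give the same F statistic because the ratio of eta squared to one minus eta squared equals SS between divided by SS within.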


7. ANOVA and correlation coefficient method use graphical representation of data. (True/False)

8. A direct link has been made between the _______ and the correlation coefficient.

11.6 LINEAR REGRESSION ANALYSIS

Linear regression is an approach to modelling the relationship between a scalar dependent variable y and one or more explanatory variables. The types of linear regression are as follows:

1. Simple Regression: Where there is only one explanatory variable, it is called simple regression.

2. Multiple Regression: Where there is more than one explanatory variable, it is called multiple regression.





3. Multivariate Linear Regression: Where multiple correlated dependent variables are predicted, rather than a single scalar variable, it is known as multivariate linear regression.

In linear regression, data are modelled using linear predictor functions, and the unknown model parameters are estimated from the data. Such models are called linear models.

Linear regression was the first type of regression analysis to be studied rigorously and to be used extensively in practical applications. There are two reasons for this:

the statistical properties of the resulting estimators are easier to determine

models which depend linearly on their unknown parameters are easier to fit than models which are non-linearly related to their parameters

Linear regression models are often fitted using the least squares approach, but they may also be fitted in other ways, such as by minimizing the “lack of fit” in some other norm (as with least absolute deviations regression) or by minimizing a penalized version of the least squares loss function (as in ridge regression). Conversely, the least squares approach can be used to fit models that are not linear models. Hence, while the terms "linear model" and "least squares" are closely linked, they are not synonymous.

Source: http://en.wikipedia.org/wiki/Linear_regression

Figure 11.2: Linear regression graph




Example: Consider a situation where a small ball is tossed up in the air and we then measure its heights of ascent $h_i$ at various moments in time $t_i$. Physics tells us that, ignoring the drag, the relationship can be modelled as

$$h_i = \beta_1 t_i + \beta_2 t_i^2 + \varepsilon_i$$

where β1 determines the initial velocity of the ball, β2 is proportional to the standard gravity, and εi is due to measurement errors. Linear regression can be used to estimate the values of β1 and β2 from the measured data. This model is non-linear in the time variable, but it is linear in the parameters β1 and β2; if we take regressors $x_i = (x_{i1}, x_{i2}) = (t_i, t_i^2)$, the model takes on the standard form

$$h_i = x_i^{T} \beta + \varepsilon_i$$

Source: http://en.wikipedia.org/wiki/Linear_regression
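The ball example can be fitted by ordinary least squares with the regressors t and t squared. The sketch below solves the two normal equations directly; the measurements are hypothetical values generated from h = 10t - 4.9t² and lightly perturbed, and the function name is made up for this illustration.

```python
def fit_ball_model(times, heights):
    """Least squares estimates of (beta1, beta2) in h = beta1*t + beta2*t**2.

    Solves the 2x2 normal equations by Cramer's rule; there is no intercept
    because the model in the text has none.
    """
    s11 = sum(t ** 2 for t in times)
    s12 = sum(t ** 3 for t in times)
    s22 = sum(t ** 4 for t in times)
    r1 = sum(t * h for t, h in zip(times, heights))
    r2 = sum(t ** 2 * h for t, h in zip(times, heights))
    det = s11 * s22 - s12 ** 2
    beta1 = (r1 * s22 - r2 * s12) / det
    beta2 = (s11 * r2 - s12 * r1) / det
    return beta1, beta2


# Hypothetical measurements: time in seconds, height in metres.
times = [0.2, 0.4, 0.6, 0.8, 1.0, 1.2]
heights = [1.82, 3.20, 4.25, 4.85, 5.12, 4.93]

beta1, beta2 = fit_ball_model(times, heights)
print("initial velocity estimate:", round(beta1, 2))
print("coefficient on t squared:", round(beta2, 2))
```

The fit is linear in the parameters even though the curve is a parabola in time, which is exactly the point the example makes.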


9. In ______ regression, unknown model parameters are estimated from the data and data is modeled by using linear predictor functions.

a) logistic

b) linear

c) factor

d) descriptive

10. The terms "linear model" and "_______" are closely linked, but they are not synonymous.

11.7 LOGISTIC REGRESSION ANALYSIS

Logistic regression analysis can be explained with the help of the logistic function, whose output, interpreted as a probability, always takes values between 0 and 1:

$$\pi(x) = \frac{e^{\beta_0 + \beta_1 x}}{e^{\beta_0 + \beta_1 x} + 1} = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}$$

and

$$g(x) = \ln \frac{\pi(x)}{1 - \pi(x)} = \beta_0 + \beta_1 x$$

and

$$\frac{\pi(x)}{1 - \pi(x)} = e^{\beta_0 + \beta_1 x}$$

Figure 11.3: The logistic function, with $\beta_0 + \beta_1 x$ on the horizontal axis and $\pi(x)$ on the vertical axis

Source: http://en.wikipedia.org/wiki/Logistic_regression

The above figure shows the graph of the function, where

Input = $\beta_0 + \beta_1 x$

Output = $\pi(x)$

The utility of the logistic function is that it can accept any input from negative infinity to positive infinity, whereas its output is confined to values between 0 and 1. The equations given above use the following terms:

g(x) = the logit function of a given predictor x

ln = the natural logarithm

π(x) = the probability of being a case

β0 = the intercept from the linear regression equation, that is, the value of the criterion when the predictor = 0

β1x = the regression coefficient multiplied by some value of the predictor

e = the base of the exponential function

The first equation shows that the probability of being a case is the exponential function of the linear regression equation divided by one plus that exponential function. The input of the logistic regression equation can vary from negative to positive infinity, yet the output varies between 0 and 1.

The second equation shows that the logit is equal to the linear regression equation. Similarly, the third equation shows that the odds of being a case are equal to the exponential function of the linear regression equation.

Together, these equations show how the logit acts as a link between the odds and the linear regression equation. Because the logit varies from negative infinity to positive infinity, it provides a suitable criterion on which to conduct linear regression, and the logit is easily converted back into the odds.
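The relationships among the probability, the logit and the odds can be checked numerically. In this sketch the names `beta0` and `beta1` for the intercept and the regression coefficient, and their values, are assumptions made for illustration; they are not estimated from any data.

```python
import math


def logistic(t):
    """Map any real input to a probability strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-t))


def logit(p):
    """Natural log of the odds p / (1 - p); the inverse of logistic()."""
    return math.log(p / (1.0 - p))


# Hypothetical intercept and regression coefficient (assumed values).
beta0, beta1 = -3.0, 0.5


def probability(x):
    """Probability of being a case for predictor value x."""
    return logistic(beta0 + beta1 * x)


def odds(x):
    """Odds of being a case: probability divided by its complement."""
    p = probability(x)
    return p / (1.0 - p)


for x in (0.0, 4.0, 6.0, 12.0):
    linear = beta0 + beta1 * x
    print(x, round(probability(x), 4), round(logit(probability(x)), 4), round(linear, 4))
```

Taking the logit of the probability returns the linear equation, and the odds equal the exponential of that same linear equation, which is the link described in the text.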


11. The utility of the logistic function is that it can accept any value from negative infinity to positive infinity as an input. (True/False)

12. Logistic regression analysis can be explained with the help of logistic function in which probabilities uses values between 5 and 6 always. (True/False)

11.8 TYPES OF FACTOR ANALYSIS

There are two types of factor analysis:




a) Exploratory

b) Confirmatory

a) Exploratory Factor Analysis

Exploratory factor analysis (EFA) is a statistical method used to uncover the underlying structure of a relatively large set of variables. It is a very important technique within factor analysis. Its main objective is to identify the underlying relationships among measured variables. Researchers commonly use it when developing a scale, to identify the set of latent constructs underlying the measured variables. It is used when the researcher has no prior hypothesis about the factors underlying the measured variables. Measured variables are any of the several attributes of people that can be observed and measured.

An item on a measurement scale is an example of a measured variable.

Proper consideration must be given to the number of measured variables included in the analysis. EFA gives more accurate results when each factor is represented by several measured variables. At least 3 to 5 measured variables should be used for each factor.

Source: braintechllc.com



Figure 11.4: Process of exploratory data analysis

The above figure shows a complete process of exploratory factor analysis. First you collect a set of data, then select a sample from it, and then apply different statistical techniques. If the results are satisfactory, you fit the model to your data and apply it as a system.

The common factor model is the basis for this method. In this model, each measured variable is expressed as a function of common factors, unique factors and measurement error. Each common factor influences two or more measured variables. On the other hand, each unique factor influences only one measured variable and does not explain correlations among measured variables.
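The common factor model can be illustrated with a small simulation. This is a sketch under stated assumptions: there is a single common factor, the loadings are hypothetical, each variable's unique factor and measurement error are lumped into one independent random term, and all variables are scaled to unit variance. Under those assumptions the correlation between two measured variables is implied by the common factor alone and equals the product of their loadings.

```python
import random
from math import sqrt

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical loadings of three measured variables on one common factor.
loadings = [0.8, 0.7, 0.6]
n_cases = 20000


def simulate_case():
    """One respondent: a shared common factor plus a unique part per variable."""
    common = random.gauss(0, 1)
    return [
        lam * common + sqrt(1 - lam ** 2) * random.gauss(0, 1)
        for lam in loadings
    ]


data = [simulate_case() for _ in range(n_cases)]


def correlation(i, j):
    """Pearson correlation between measured variables i and j."""
    xs = [row[i] for row in data]
    ys = [row[j] for row in data]
    mx, my = sum(xs) / n_cases, sum(ys) / n_cases
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


for i, j in [(0, 1), (0, 2), (1, 2)]:
    implied = loadings[i] * loadings[j]
    print(i, j, "observed:", round(correlation(i, j), 3), "implied:", round(implied, 3))
```

The unique parts are independent across variables, so they add variance to each variable without contributing to the correlations between variables.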

An important assumption of exploratory factor analysis is that any factor can be associated with any measured variable. When developing a scale, the researcher should use EFA first before moving on to confirmatory factor analysis. Exploratory factor analysis requires the researcher to take a number of important decisions about when and how the analysis is done, because there is no single set method for it.

b) Confirmatory Factor Analysis

Confirmatory factor analysis (CFA) is a special form of factor analysis that is most commonly used in social research. It is used to test whether the measures of a construct are consistent with the researcher’s understanding of the nature of that construct. Thus the objective of confirmatory factor analysis is to test whether the data fit a hypothesized measurement model. This hypothesized model is based on theory or on previous analytic research.

The process of confirmatory factor analysis is as follows:

First, develop a hypothesis about which factors underlie the measures used.

Second, identify the constraints that this hypothesis implies.

Third, impose these constraints on the model.

Finally, by imposing these constraints, the researcher forces the model to be consistent with the hypothesis; the fit of this model to the data is then assessed.

For example, suppose it is hypothesized that two factors account for the covariance among the measures and that these two factors are unrelated to each other. The researcher can then create a model in which the correlation between factor A and factor B is forced to 0. Measures of model fit can then be obtained to assess how well the proposed model captures the covariance among all the measures in the model. If the constraints that the researcher has imposed on the model are inconsistent with the sample data, the statistical tests of model fit indicate a poor fit and the model is rejected. A poor fit may arise because some items measure multiple factors. It may also arise because some items within a factor are more closely related to each other than to others.

For some applications, the requirement of zero loadings (for indicators that are not supposed to load on a certain factor) is regarded as too strict. Therefore, a newer method of analysis known as exploratory structural equation modelling has been developed. It specifies hypotheses about the relation between the observed indicators and their supposed primary latent factors, while also allowing loadings on other latent factors to be estimated.

Structural equation modelling (SEM) software is typically used to perform confirmatory factor analysis.

For example: popular software packages for this purpose include Mplus, EQS, LISREL and AMOS.

In structural equation modelling, confirmatory factor analysis is often used as the first step, to assess the proposed measurement model. Many of the rules of structural equation modelling concerning the assessment of model fit and model modification apply equally to CFA.

The major difference between CFA and SEM is that CFA has no directed arrows between latent factors. In other words, in CFA the factors are not presumed to cause one another directly, whereas SEM often does specify causal relationships between particular factors and variables. In the context of SEM, the CFA is often called the measurement model, while the relations between the latent variables (with directed arrows) are called the structural model.


13. _______ factor analysis helps the researcher in taking important decisions as when and how the analysis can be done.

14. Measured variables are influenced by one or two common factors. (True/False)



15. Structural equation modelling method is used to perform confirmatory factor analysis. (True/False)

11.9 CLUSTER ANALYSIS

Cluster analysis is often used for market segmentation. It is a process of establishing market segments by dividing the market into different clusters. This contrasts with conventional demographic segmentation, which is based entirely on tangible features like age, income, sex and family class.

Cluster analysis usually depends on two types of elements:

a) Subjective features like consumers' perceptions, attitudes, motivations, willingness, aspirations etc.

b) Behavioural features like consumers' knowledge, the number of visits to a particular shop, product trials etc.

The main principle of cluster analysis is to subdivide the sample market into homogeneous clusters or groups. The members of each cluster should share characteristics with one another, while differing from the members of other clusters.

Source: http://en.wikipedia.org/wiki/Cluster_analysis_(in_marketing)

Figure: 11.5: cluster analysis




When interpreting a cluster analysis, you should bear in mind that cluster groups sharing the same traits and characteristics are not necessarily identical in nature. Cluster analysis helps you in identifying the number and nature of the distinct customer groups in the sample segment. When you analyse each segment of the market, you can more easily ascertain the market's requirements, needs, threats and opportunities. This assessment further helps you in judging the current and future potential of the market from a business point of view.

For example: in the field of cardiology, clustering of medicines, symptoms, heart attack ratios and the eating habits of heart patients can be very useful for a study. In the field of diabetes, clustering of sugar levels, symptoms, cures, urine tests etc. can be useful.

Thus, wherever you need to classify information in a meaningful way, cluster analysis can be used.
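The idea of subdividing a sample into homogeneous groups can be sketched with k-means, one common clustering technique; the text above does not prescribe a particular algorithm, so its use here is an assumption. The customer data, the choice of two clusters and the starting centroids are all hypothetical.

```python
# Hypothetical customers: (shop visits per month, attitude score from 1 to 10).
customers = [(1, 2), (2, 1), (2, 3), (8, 9), (9, 8), (10, 10)]


def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, centroids, iterations=10):
    """Alternate between assigning points and recomputing cluster centres."""
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the nearest centre.
        labels = [
            min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centre moves to the mean of its members.
        updated = []
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centre
        centroids = updated
    return labels, centroids


# Starting centres are picked by hand for this sketch.
labels, centroids = k_means(customers, centroids=[customers[0], customers[3]])
print(labels)
print(centroids)
```

Here the low-visit, low-attitude customers end up in one cluster and the high-visit, high-attitude customers in the other; with real data the number of clusters and the starting centres need more care.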


16. ______ analysis helps you in identifying the number and nature of the distinct customer groups in the sample segment.

11.10 MULTIDIMENSIONAL SCALING TECHNIQUES

Multidimensional scaling (MDS) is regarded as an alternative to factor analysis. Its main objective is to identify meaningful underlying dimensions that allow the researcher to explain the observed similarities or dissimilarities (distances) between the objects investigated. In factor analysis, the similarities between the objects (e.g., variables) are expressed in the correlation matrix. With MDS, one may analyse any kind of similarity or dissimilarity matrix. You will be able to understand this with the help of the following example:

Example: Suppose you take a matrix of the distances between major Indian cities from a map. You then analyse this matrix, specifying that you want to reproduce the distances in two dimensions. As a result of the MDS analysis, you would most likely get a two-dimensional representation of the locations of the cities, that is, you would basically obtain a two-dimensional map. You can then "explain" the distances in terms of two underlying geographical dimensions: north/south and east/west.

There are various techniques of multidimensional scaling as follows:



Orientation of Axes: As in factor analysis, the actual orientation of the axes in the final solution is arbitrary. In the example above, you can rotate the map in any direction you want and the distances between the cities remain the same. Hence, the final orientation of the axes in the plane or space is the outcome of a subjective decision by the researcher, who will select the orientation that can be most easily explained.

Computational Approach: MDS is not so much an exact procedure as a way to "rearrange" objects efficiently so as to arrive at a configuration that best approximates the observed distances. It moves the objects around in the space defined by the requested number of dimensions and checks how well the distances between the objects can be reproduced by the new configuration.

Applications: The "beauty" of MDS is that you can analyse any kind of distance or similarity matrix. These similarities can represent people's ratings of similarities between objects, the percent agreement between judges, the number of times a subject fails to discriminate between stimuli and so on. You will be able to understand this with the help of the following example:

Example: You can see the MDS method in psychological research and marketing research:

1. It is popular in psychological research on person perception, where similarities between trait descriptors are analysed to uncover the underlying dimensionality of people’s perceptions of traits.

2. It is also popular in marketing research, where it is used to detect the number and nature of the dimensions underlying the perceptions of different products or brands.

Thus this method allows the researcher to ask relatively unobtrusive questions ("how similar is brand A to brand B") and to derive the underlying dimensions from those questions without the respondents knowing what the researcher's real interest is.

MDS and Factor Analysis: Even though MDS and factor analysis are fundamentally different methods, there are similarities in the types of research questions to which the two procedures can be applied. Factor analysis requires that the relationships are linear and that the underlying data are distributed as multivariate normal, whereas MDS imposes no such restrictions. MDS can be used as long as the rank-ordering of the similarities or distances in the matrix is meaningful. In terms of resultant differences, factor analysis tends to extract more factors (dimensions) than MDS, so MDS often yields more readily interpretable solutions. Factor analysis requires you to first compute a correlation matrix, while MDS can be applied to any kind of similarities or distances. Factor analysis requires subjects to rate the stimuli on some list of attributes, whereas MDS can be based on subjects' direct assessment of the similarities between stimuli. Therefore, MDS methods are relevant to a wide range of research designs, because distance measures can be obtained in any number of ways.
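Two points from this section, that MDS checks how well a configuration reproduces the observed distances and that rotating the solution changes nothing, can be illustrated with a short sketch. The objects, their coordinates and the observed distances are made-up values. The badness-of-fit measure is a stress-type index in the style of Kruskal's stress-1, with the observed distances used directly as targets; the term "stress" does not appear in the text above and is introduced here as an assumption.

```python
import math

# Hypothetical two-dimensional configuration of four objects.
config = {"A": (0.0, 0.0), "B": (3.0, 0.0), "C": (3.0, 4.0), "D": (0.0, 4.0)}

# Hypothetical observed distances between the objects.
observed = {
    ("A", "B"): 3.0, ("A", "C"): 5.0, ("A", "D"): 4.0,
    ("B", "C"): 4.0, ("B", "D"): 5.0, ("C", "D"): 3.0,
}


def stress(points):
    """Badness of fit: 0 means the observed distances are reproduced exactly."""
    numerator = denominator = 0.0
    for (i, j), d_observed in observed.items():
        d_config = math.dist(points[i], points[j])
        numerator += (d_observed - d_config) ** 2
        denominator += d_config ** 2
    return math.sqrt(numerator / denominator)


def rotate(points, degrees):
    """Rotate every point about the origin by the given angle."""
    a = math.radians(degrees)
    return {
        name: (x * math.cos(a) - y * math.sin(a), x * math.sin(a) + y * math.cos(a))
        for name, (x, y) in points.items()
    }


# Moving one object away from its best position worsens the fit.
moved = dict(config, C=(3.5, 4.0))

print("stress of configuration:", stress(config))
print("stress after rotation:  ", round(stress(rotate(config, 37)), 6))
print("stress after moving C:  ", round(stress(moved), 4))
```

An MDS routine would repeat the last step systematically, moving objects and keeping the moves that lower the stress.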


17. In factor analysis, the relationships are required to be linear and the underlying data to be distributed as multivariate normal. (True/False)

18. In correlation matrix, the similarities between the objects (e.g., variables) are not expressed. (True/False)

11.11 CONJOINT ANALYSIS

Conjoint analysis is a technique which helps in determining two important things: firstly, what the features of a new product should be and, secondly, how a new product should be priced.

This analysis assumes that a product can be “broken down” into its component attributes. You will be able to understand this with the help of the following example:

Example: The attributes of a car are miles-per-gallon, colour, model style, size and price. The value which an individual places on a product is taken to be equal to the sum of the utilities which they receive from all the attributes that make up the product. It is further assumed that the likelihood of buying a product and the preference for it are both proportional to the utility which the individual gains from that product.
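The additive assumption in the example can be sketched in a few lines of Python. The attribute levels and utility values below are hypothetical and are assumed only for illustration. The sketch adds up the utilities of the attribute levels of each car and then, following the proportionality assumption, divides each total by the sum of the totals to obtain a rough share of preference.

```python
# Hypothetical utilities for each level of three car attributes.
part_worths = {
    "miles_per_gallon": {"20": 0.0, "30": 1.5},
    "colour": {"red": 0.4, "white": 0.1},
    "price": {"low": 2.0, "high": 0.0},
}


def total_utility(profile, part_worths):
    """Sum the utilities of the attribute levels that make up one product."""
    return sum(part_worths[attr][level] for attr, level in profile.items())


car_a = {"miles_per_gallon": "30", "colour": "white", "price": "high"}
car_b = {"miles_per_gallon": "20", "colour": "red", "price": "low"}

utility_a = total_utility(car_a, part_worths)
utility_b = total_utility(car_b, part_worths)

# Preference is assumed to be proportional to utility.
share_a = utility_a / (utility_a + utility_b)
share_b = utility_b / (utility_a + utility_b)

print(f"Car A: utility {utility_a:.2f}, share {share_a:.2f}")
print(f"Car B: utility {utility_b:.2f}, share {share_b:.2f}")
```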

Conjoint analysis has three phases, which are as follows:

collection of trade-off data through a questionnaire

statistical analysis of the data

market simulation

Conjoint analysis is based on the fact that the relative values of attributes can be measured better when they are considered jointly than when they are considered in isolation. Hence, it is a tool in which a subset of the possible combinations of product features is used to determine the importance of each feature in the purchasing decision.

Here the respondents are asked to arrange a list of product attribute combinations in decreasing order of preference. This ranking reveals each respondent's order of preference, which helps in finding out the utilities of the different values of each attribute.

The following steps are followed for the development of Conjoint analysis:

Step 1: Select product attributes such as price, appearance or size.

Step 2: Select the options (values) for each attribute. For example, one may choose the levels of 5", 10" or 20" for the attribute of size. The higher the number of options used for each attribute, the greater the burden placed on the respondent.

Step 3: Identify products as a combination of attribute options. The set of attribute combinations which is used will be a subset of the possible universe of products.

Step 4: Select the form in which the combinations of attributes are to be presented to the respondents. The options include paragraph description, pictorial presentation and verbal presentation.

Step 5: Choose how responses will be aggregated. There are three choices:

pool all responses into a single utility function

define segments of respondents who have similar preferences

use individual responses

Step 6: Choose the appropriate technique to analyse the collected data. The model which is used to express the utilities of the various attributes is the part-worth model. Other models which help in analysing the data are the ideal-point (quadratic) model and the vector (linear) model.
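The steps above can be sketched in Python for a single respondent. This is a minimal illustration under stated assumptions, not a full conjoint procedure: the two attributes, their levels and the ranking are hypothetical, every possible combination is shown to the respondent (a full factorial design), and ranks are simply reversed into preference scores. For such a complete and balanced set of profiles, the part-worth of a level can be estimated as the mean score of the profiles containing that level minus the overall mean, which coincides with the least-squares estimate of an additive main-effects model.

```python
from itertools import product

# Steps 1 and 2: attributes and their options (hypothetical).
attributes = {
    "price": ["low", "high"],
    "size": ['5"', '10"', '20"'],
}

# Step 3: products as combinations of attribute options (full factorial).
profiles = [dict(zip(attributes, levels)) for levels in product(*attributes.values())]

# Hypothetical ranking by one respondent, in profile order (1 = most preferred).
ranks = [3, 2, 1, 6, 5, 4]


def estimate_part_worths(profiles, ranks, attributes):
    """Part-worth of a level = mean score of its profiles minus the grand mean."""
    scores = [len(ranks) + 1 - r for r in ranks]  # reverse ranks: higher = better
    grand_mean = sum(scores) / len(scores)
    worths = {}
    for attr, levels in attributes.items():
        worths[attr] = {}
        for level in levels:
            level_scores = [s for p, s in zip(profiles, scores) if p[attr] == level]
            worths[attr][level] = sum(level_scores) / len(level_scores) - grand_mean
    return worths


def attribute_importance(worths):
    """Importance of an attribute = its part-worth range as a share of all ranges."""
    ranges = {a: max(w.values()) - min(w.values()) for a, w in worths.items()}
    total = sum(ranges.values())
    return {a: r / total for a, r in ranges.items()}


worths = estimate_part_worths(profiles, ranks, attributes)
importance = attribute_importance(worths)
print(worths)
print(importance)
```

Here the individual responses are used directly, which corresponds to the third aggregation choice in Step 5. Pooling would mean averaging the scores of several respondents before estimating the part-worths.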

Thus, conjoint analysis has become an important marketing research tool and has been in use since the early 1970s. It is well suited to improving an existing product or defining a new product.


19. ______ analysis helps in assuming that a product can be “broken down” into its component attributes.

20. Rearrange the steps:



a) Select product attributes

b) Choose how responses will be aggregated

c) Identify products as a combination of attribute options

d) Choose the appropriate method and technique

i) d, c, b, a

ii) a, b, c, d

iii) a, c, b, d

iv) b, a, d, c

11.12 SUMMARY

Let us recapitulate the important concepts discussed in this unit:

Descriptive statistics are used for both categorical and continuous variables.

They can be produced in programmes such as SPSS. These statistics give information such as frequencies and percentages.

There are major differences between univariate, bivariate and multivariate analysis.

Bivariate analysis is analysis which includes two variables at the same time. Multivariate analysis considers multiple variables; in other words, more than two variables are considered in this analysis.

The main goal of discriminant analysis is to determine the class of an observation on the basis of a set of input variables, also known as predictors.

There is an interesting relationship between analysis of variance (ANOVA) and the simple correlation coefficient. This relationship affords many pedagogical opportunities. Both methods use graphical representation of data.

In linear regression, unknown model parameters are estimated from the data and data is modeled by using linear predictor functions. Therefore, such models are called linear models.








It is important to note that the input of the logistic regression equation can vary from negative to positive infinity, yet the output varies only from 0 to 1. This is the result of exponentiating the log-odds equation.

The main objective of exploratory factor analysis is to understand the relationships among measured variables. Researchers commonly use this method when a scale is being developed, where it helps in identifying the set of latent constructs underlying the measured variables.

The objective of confirmatory factor analysis is to check how well the data fit a hypothesised measurement model.

The main principle of cluster analysis is the subdivision of a sample market into homogeneous clusters or groups. The members of each cluster should share characteristics with one another, while their features should differ from those of the other clusters.

Multidimensional scaling (MDS) is regarded as an alternative to factor analysis. Its main objective is to identify meaningful dimensions which allow the researcher to explain the observed similarities or dissimilarities (distances) between the objects investigated.

Conjoint analysis is based on the fact that the relative values of attributes can be measured better when they are considered jointly than when they are considered in isolation.

11.13 GLOSSARY

Linear Regression: In statistics, it is an approach to modelling the relationship between a scalar dependent variable y and one or more explanatory variables.

Univariate Tests: Univariate analysis is a very simple method of quantitative statistical analysis in which one variable is analysed at a time.

Computational Approach: MDS is not so much an exact procedure as a way to "rearrange" objects efficiently so as to arrive at a configuration that best approximates the observed distances.






Conjoint Analysis: Conjoint analysis is a technique which helps in determining two important things: what the features of a new product should be and how a new product should be priced.

11.14 TERMINAL QUESTIONS

1. Elaborate on the concepts of univariate and discriminant analysis.

2. Discuss the concept of ANOVA and describe the relationship of ANOVA with correlation.

3. Explain the types of factor analysis.

4. Define the following terms:

a) cluster analysis

b) conjoint analysis

5. Elaborate on the techniques of multidimensional scaling.

6. Describe linear regression and logistic regression analysis.

7. Explain the types of linear regression analysis.

11.15 ANSWERS

Self-Assessment Questions

1. True

2. Statistical Inference

3. False

4. Central tendency

5. Principal components

6. a) Discriminant

7. True

8. ANOVA



9. b) Linear

10. Least square

11. Logistic

12. True

13. Exploratory

14. False

15. True

16. Cluster

17. Factor

18. False

19. Conjoint

20. iii) a, c, b, d

Terminal questions

1. A very simple method of quantitative statistical analysis is known as univariate analysis… Refer 11.3 & 11.4

2. There is an interesting relationship between analysis of variance (ANOVA) and the simple correlation coefficient… Refer 11.5

3. There are two types of factor analysis… Refer 11.8

4. Market segmentation is often considered as cluster analysis… Refer 11.9 & 11.11

5. Multidimensional scaling (MDS) is regarded as an alternative to factor analysis… Refer 11.10

6. Linear regression is an approach to modelling the relationship between a scalar dependent variable y… Refer 11.6 & 11.7

7. You can find different types of linear regression, which are as follows… Refer 11.6

11.16 CASE STUDY

ANOVA Case at Wentworth Medical Center

As part of a long-term study of individuals 65 years of age or older, sociologists and physicians at the Wentworth Medical Center in upstate New York investigated the relationship between geographic location and depression. A sample of 60 individuals, all in reasonably good health, was selected; 20 individuals were residents of Florida, 20 were residents of New York, and 20 were residents of North Carolina. Each of the individuals sampled was given a standardized test to measure depression. The data collected follow; higher test scores indicate higher levels of depression. These data are available on the data disk in the file Medical1.

A second part of the study considered the relationship between geographic location and depression for individuals 65 years of age or older who had a chronic health condition such as arthritis, hypertension, and/or heart ailment. A sample of 60 individuals with such conditions was identified. Again, 20 were residents of Florida, 20 were residents of New York, and 20 were residents of North Carolina. The levels of depression recorded for this study follow. These data are available on the data disk in the file Medical2.

Data from Medical1                     Data from Medical2

Florida   New York   North Carolina    Florida   New York   North Carolina

3         8          10                13        14         10
7         11         7                 12        9          12
7         9          3                 17        15         15
3         7          5                 17        12         18
8         8          11                20        16         12
8         7          8                 21        24         14
8         8          4                 16        18         17
5         4          3                 14        14         8
5         13         7                 13        15         14
2         10         8                 17        17         16
6         6          8                 12        20         18
2         8          7                 9         11         17
6         12         3                 12        23         19
6         8          9                 15        19         15
9         6          8                 16        17         13
7         8          12                15        14         14
5         5          6                 13        9          11
4         7          3                 10        14         12
7         7          8                 11        13         13
3         8          11                17        11         11

Questions:

Use analysis of variance on data set 1 (Good Health). State the hypotheses being tested. What are your conclusions?

Use analysis of variance on data set 2 (Chronic Bad Health). State the hypotheses being tested. What are your conclusions?
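As a starting point for the first question, the following minimal Python sketch shows how the F statistic of a one-way analysis of variance could be computed by hand. The three lists are the Medical1 scores transcribed from the case data above; the function name and layout are assumptions made only for illustration. The hypotheses and the conclusion still have to be stated by you, by comparing the F value with the critical value for the reported degrees of freedom. The same function can be applied to the Medical2 scores.

```python
# Medical1 depression scores (individuals in reasonably good health).
florida = [3, 7, 7, 3, 8, 8, 8, 5, 5, 2, 6, 2, 6, 6, 9, 7, 5, 4, 7, 3]
new_york = [8, 11, 9, 7, 8, 7, 8, 4, 13, 10, 6, 8, 12, 8, 6, 8, 5, 7, 7, 8]
north_carolina = [10, 7, 3, 5, 11, 8, 4, 3, 7, 8, 8, 7, 3, 9, 8, 12, 6, 3, 8, 11]


def one_way_anova(groups):
    """Return the F statistic and its degrees of freedom for a one-way ANOVA."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation between the group means and the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


f_value, df_between, df_within = one_way_anova([florida, new_york, north_carolina])
print(f"F({df_between}, {df_within}) = {f_value:.3f}")
```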

Source: http://brainmass.com/statistics/all-topics/90897






