Research Methodology: Lecture 6 - Indian Statistical Institute
Research Methodology
3. Research Methodology-Data Collection, Statistical Analysis
This chapter includes:
Type of study, Population and Sampling technique, Power analysis
Instrumentation-Design of questionnaire
Descriptive statistics and distributions
Factor Analysis of measuring tools
Reliability and Validity of measuring tools
Discriminant Analysis-Steps followed to arrive at the final DA model
Structural Equation Modeling-Confirmatory Factor Analysis and fit indexes for all variables
Type of study-The study is descriptive and cross-sectional in nature using the survey
design.
Population
There are around 0.35 million credit card users in Kochi Municipal Corporation,
Kerala (total population of about 1.5 million). These cards are offered by 28 different
national and international banks, which together have 199 different types of product
offerings.
Sample frame and Sampling technique
A quota sample of 550 respondents was selected from among those who had been using
credit cards for more than one year. The researcher identified this criterion as a
credible way of tracking credit card spending and financing patterns, as such
respondents have completed several billing cycles in that period. Prior to this, a
pilot study was conducted among 200 credit card users to measure their responses
and to validate the questionnaire. The number of final respondents from whom data
were collected is 550. Quota sampling is a non-probability sampling technique
which can also involve the judgement of the researcher to ensure the quality of the
responses. A non-probability sample can be relevant for research to the extent that it
possesses the essential person and setting characteristics that define membership in
the intended target population (Sackett & Larsen, 1998).
Power Analysis: Power analysis is carried out a priori, during the design stage of the
study. A study can conclude that a null hypothesis is true or false, while the real
answer, the state of the world, can also be that the hypothesis is true or false. Given
the three factors alpha, sample size and effect size, beta can be calculated. Alpha (α)
is the probability of a type I error (rejection of a correct null hypothesis). Beta (β)
is the probability of a type II error (acceptance of a false null hypothesis). The
probability of correctly accepting a true null hypothesis is 1-α, where α is fixed in
advance; the probability of incorrectly accepting a false null hypothesis is β. The
probability of correctly rejecting a false null hypothesis is 1-β, which is the power.
The sample size required by power analysis for DA is 128 (.1 error margin) and for SEM
is 468 (.2 error margin) (Cohen, 1989). These sample sizes give a power of 80% or more.
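The relationship between alpha, sample size, effect size and beta described above can be sketched numerically. The following is a minimal illustration only, assuming a two-sided, two-sample comparison under a normal approximation; it is not the procedure used in the study, and the function names and the example effect size are hypothetical.

```python
from statistics import NormalDist


def approx_power(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power (1 - beta) of a two-sided, two-sample z-test.

    effect_size is a standardised mean difference (Cohen's d).
    This is a normal approximation, assumed here purely for illustration.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)                # rejection threshold fixed by alpha
    shift = effect_size * (n_per_group / 2) ** 0.5   # expected test statistic under H1
    # Probability that the statistic falls in either rejection region
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


def type_ii_error(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Beta: probability of failing to reject a false null hypothesis."""
    return 1 - approx_power(effect_size, n_per_group, alpha)


# Hypothetical example: medium effect (d = 0.5), 64 respondents per group
power = approx_power(0.5, 64)
beta = type_ii_error(0.5, 64)
```

With no true effect (effect_size = 0) the function returns alpha itself, which matches the definition of the type I error rate.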
Sample Adequacy Test: For Discriminant Analysis (DA):
Discriminant Analysis (DA) is relatively robust even when there is a modest violation of
the equality of covariance assumption (Lachenbruch, 1975). Dichotomous variables,
which often violate multivariate normality, are not likely to affect conclusions based on
DA (Klecka, 1980). The condition that the smallest group in the DA should have at least
five times the total number of variables used in the study is fulfilled. The smallest
group, credit non-defaulters, has 61 respondents, more than twice the required sample
size of 25 (illustration: (one dependent variable + four predictor variables) × five
= 25).
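The group-size rule above is simple arithmetic, sketched here for clarity. The function name is hypothetical; the figures (one dependent variable, four predictors, multiplier of five, smallest group of 61) are those given in the text.

```python
def min_group_size(n_dependent: int, n_predictors: int, multiplier: int = 5) -> int:
    """Smallest acceptable group size: multiplier times the total number of variables."""
    return (n_dependent + n_predictors) * multiplier


required = min_group_size(n_dependent=1, n_predictors=4)  # (1 + 4) * 5
smallest_group = 61                                       # credit non-defaulters
rule_satisfied = smallest_group >= required
```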
Sample Adequacy Test: For Structural Equation Modeling (SEM):
Hoelter's critical N is used to judge whether the sample size is adequate in SEM. The
Hoelter's N is 313 (at the .05 level) and 332 (at the .01 level), which is more than
sufficient to accept a model by chi-square (Schumacker & Lomax, 2004) (table 41). A
sample size greater than 500 is recommended to produce more robust model fit indexes
(Lei & Lomax, 2005).
Tools of Analysis- Two tests of normality, viz. Kolmogorov-Smirnov (K-S) and
Shapiro-Wilk (S-W), Factor Analysis, Discriminant Analysis (DA) and Structural Equation
Modeling (SEM).
Instrumentation- Design of Questionnaire
Materialism (MAT) has 5 items measured on a 5 point Likert scale from strongly
disagree to strongly agree. Those who score three neither agree nor disagree with the
materialistic statements. The scale was adopted from James Carl Stone IV (2001), Oklahoma
State University, because of its reduced number of items. The original scale, developed
by Richins & Dawson (1992), had a larger number of statements and was abridged without
compromising the content validity. The following are the statements in the
questionnaire. (a) I enjoy buying expensive things (b) My possessions are important for
my happiness (c) I like to own nice things more than most people in my immediate and
comparable vicinity (d) Acquiring valuable things is important to me (e) I enjoy owning
luxury items. To avoid measurement error, the beginning of the questionnaire states
that respondents need to indicate what they do in daily life and not what
they think about.
Compulsive Buying (CBB) has 9 items measured on a 5 point scale from never to
very often. Those who score three are sometimes compulsive buyers. The following are
the statements in the questionnaire (a) I have bought things that I could not really
afford (b) I bought something to make myself feel better (c) I felt depressed after
shopping (d) I have gone on a buying spree without being able to stop (e) I bought
something and when I got home I wasn’t sure why I bought it (f) I felt anxious on days I
didn’t do shopping (g) I buy things simply because they are on sale (h) I just wanted to
buy and didn’t care what I bought (i) I have felt that others would be horrified if they
knew about my spending habits. The scale was adopted from James Carl Stone IV
(2001), Oklahoma State University. That scale was itself adapted from the
O’Guinn & Faber compulsive buying screener scale (1992). To avoid measurement
error, the beginning of the questionnaire states that respondents need
to indicate how frequently they engage in the following in daily life and not how they
wish they could.
Enhanced Credit Card Spending (ECCS) has 5 items measured on a five point Likert
scale from strongly disagree to strongly agree. Those who score three neither agree nor
disagree that they are enhanced credit card spenders. The following are the statements
in the questionnaire (a) I end up buying more due to the possession of a credit card
(b) When I shop with credit card(s), I tend to make unplanned purchases (c) It is easy
for me to overspend when I shop in the presence of a credit card (d) Without a credit
card, my spending habits would not be different (e) If I did not have a credit card, I
would probably spend less. The scale was adopted from the credit card consumption scale
(Sahni, 1995). The fourth statement (d) lowered the overall scale reliability and hence
was removed. This does not affect the content validity as the fifth item is of a
similar nature.
Credit Card Financing Behaviour (CCFB) has 6 items measured on a five point Likert
scale from strongly disagree to strongly agree. Those who score three neither agree nor
disagree that they show credit card financing behaviour. The following are the
statements in the questionnaire (a) I exhaust the credit limit on my credit card(s)
(b) When purchasing I have been told that I have spent beyond the credit limit (c) The
way I use my credit card I always have enough credit (d) I manage bills in an effort to
make payments on my credit cards (e) I pay credit card bills after their due dates (f)
Creditors have threatened to cancel my credit cards. The scale was adopted from the
credit card consumption scale (Sahni, 1995), which was originally adapted from Edwards
(1992). The fourth item (d) was deleted as it lowered the overall scale reliability. The
deleted item does not interfere with content validity because there are other
statements cross-checking one’s ability to pay bills on time.
Credit Default Probability
A debt know-how quiz (Master Card, 2006) was administered to measure the dependent
variable credit default probability (CD). The questions pertain to credit
card bills and payment management patterns. The questions asked are (a) Do you
avoid looking at your bills and credit card balances? (b) Do you usually pay only the
minimum on your credit cards? (c) Do you sometimes pay your bills late or miss
payments entirely? (d) Do you use credit cards and store credit to make purchases
because you don’t have the money to pay for them at the time? (e) Is your paycheck
already spent before you receive it? (f) Do you choose the longest allowable payment
period or installment plan to make major purchases (for example, a car or major
appliance) affordable? (g) Have you taken out a home loan to pay down your debt and
already run up new consumer debts? (h) Do payments on your debt account for more
than 20 percent of your household take-home pay each month (excluding your
mortgage or rent payment)? (i) Do you have savings to fall back on if something
unexpected happens, such as a car repair or medical emergency? (j) Do you spend
more time worrying about your bills than paying them? The questionnaire has a
nominal scale and the answers are accumulated as points indicating the probability of
credit default. All ten questions have a yes or no choice. In all questions except
number nine, the option yes has a score of one and no has a score of zero. In question
number nine, yes has a score of zero and no has a score of one. If all answers are yes
the total score is nine, and if all answers are no the total score is one. A score of
zero means excellent, as the credit default probability is nil. A score of one to four
(both inclusive) indicates a potential defaulter (early warning stage) and a score of
five to ten (both inclusive) indicates a high credit default probability. After
categorisation the respondents fall into three groups, viz. non-defaulters (61),
potential defaulters (early warning stage) (364) and defaulters (125). This allocation
is used as part of the descriptive interpretation. In order to use Structural Equation
Modeling with AMOS 7, the dependent credit default was taken as a single-item
measure by summing up the actual score of each respondent from zero to ten. The
higher the score, the greater the probability of default. The use of a single-item
summated score from ten nominal questions is considered appropriate in terms of
validity and reliability (Wanous & Reichers, 1997). The dependent credit default score
(metric) is used for further analysis using SEM.
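The scoring rule for the ten-question quiz can be written out as a short sketch. It follows the description in the text (yes scores one, except question nine, which is reverse-scored; 0 is a non-defaulter, 1 to 4 a potential defaulter, 5 to 10 a defaulter). The function and constant names are hypothetical and are not taken from the study.

```python
REVERSE_SCORED = {9}  # question (i) on savings: "no" scores one, "yes" scores zero


def default_score(answers: list[bool]) -> int:
    """Sum the quiz points; answers[i] is True for 'yes' on question i + 1."""
    if len(answers) != 10:
        raise ValueError("expected ten yes/no answers")
    score = 0
    for number, yes in enumerate(answers, start=1):
        if number in REVERSE_SCORED:
            score += 0 if yes else 1
        else:
            score += 1 if yes else 0
    return score


def default_category(score: int) -> str:
    """Map a 0-10 score to the three groups used for Discriminant Analysis."""
    if score == 0:
        return "non-defaulter"
    if 1 <= score <= 4:
        return "potential defaulter (early warning stage)"
    if 5 <= score <= 10:
        return "defaulter"
    raise ValueError("score must be between 0 and 10")


all_yes = default_score([True] * 10)    # nine points: question nine contributes zero
all_no = default_score([False] * 10)    # one point: only question nine contributes
```

The raw score is what the text describes as the metric dependent variable for SEM, while the category is what it describes as the grouping variable for the descriptive and discriminant work.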
All the questionnaires were administered in English. The quota of target respondents
was identified by closed group networking using judgement. The business
management students of Rajagiri School of Management, Kochi were asked to identify
at least ten respondents in their family and friends circle who volunteered and were
willing to cooperate with the study. A few respondents sought clarifications
regarding the questions, and the student volunteers were more than
helpful in providing them.
Procedure of data analysis and time frame:
A combination of descriptive (mean, standard deviation, distribution tests), and
inferential (t-test, Factor Analysis, Discriminant Analysis and SEM) analysis is
undertaken. The respondents of the pilot study were met from January 2007 to March
2008. The respondents of the final study were met from April 2008 to April 2010.
Analysis Plan:
Descriptive Statistics and Distributions:
The data was screened for the following defects.
1. Incorrect data entry and out-of-range values: no such errors were found.
2. Missing values: these were rectified.
3. Outliers: The composite scores of the predictors (averaging the items reduces the
probability of outliers), viz. Compulsive Buying (CBB), Materialism (MAT), Enhanced
Credit Card Spending (ECCS) and Credit Card Financing Behaviour (CCFB), were converted
to standardized (Z) scores, and values outside the range -3 to +3 were identified.
There were six such cases arising from CBB scores (Z score-CBB) and two cases arising
from CCFB scores (Z score-CCFB).
Histograms, box plots, skewness, kurtosis and normal Q-Q plots (observed against
expected values) were produced for each composite predictor variable separately with 542
and with 550 samples, and it was found that the eight identified cases did not
affect the results substantially. If a respondent (subject) chose to respond with a
particular value, then that value is a reflection of reality, so removing the so-called
outliers would run against the purpose of the research; hence the final sample was
retained at 550.
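The outlier screen described above (standardize each composite score and flag values beyond ±3) can be sketched as follows. This is an illustration under stated assumptions: the function name and the sample data are hypothetical, and the sample standard deviation (n - 1 denominator) is assumed.

```python
from statistics import mean, stdev


def flag_outliers(scores: list[float], threshold: float = 3.0) -> list[int]:
    """Return the positions of scores whose Z score lies outside +/- threshold."""
    centre = mean(scores)
    spread = stdev(scores)  # sample standard deviation (n - 1 denominator)
    return [
        index
        for index, value in enumerate(scores)
        if abs((value - centre) / spread) > threshold
    ]


# Hypothetical composite scores on a 1-5 scale: thirty typical values, one extreme
composite = [2.0] * 30 + [5.0]
flagged = flag_outliers(composite)
```

Flagging a case only identifies it; whether to drop it is a separate decision, and the text retains all 550 cases.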
As a single method does not capture all possible facets of normality in a given
sample, this was followed by two tests of normality: Kolmogorov-Smirnov (K-S) when
the sample was greater than 50 and Shapiro-Wilk (S-W) when the sample was smaller than
50 (when sub-groups of predictor variables were created based on the dependent credit
default categories). It was found that there is no great difference in the normality
assumption when the whole sample of 550 is divided based on the categories generated by
the dependent variable credit default. Two separate categorisations of the dependent
variable, one for the Discriminant data set and the other for the SEM data set, were
used to generate separate K-S and S-W tests (see tables 5, 6, 7, 8). One limitation of
the normality tests is that the larger the sample size, the more likely they are to
give significant results (p<.05, indicating non-normality). So even a slight deviation
from normality will be significant (p<.05) when the sample is large; this need not be a
substantial deviation from normality (see tables 5, 6, 7, 8). The researcher, as said
above, has used various methods to check for normality and found the data conforming
(see the normal Q-Q plots and descriptives of the predictors) (figures 4, 5, 6, 7)
(table 9). In the normal Q-Q plot the black line indicates the values the sample should
adhere to if the distribution is normal. The dots are the actual data. In the
descriptive table 9, the 5% trimmed mean indicates the mean value after removing the
top and bottom 5% of scores. The skewness and (excess) kurtosis are zero for a normal
distribution. Values within the range -1 to +1 are acceptable for a nearly normal
distribution. This was cross-checked with the sample of 542 after dividing it based on
the dependent variable category, and still no significant difference existed.
Separate Discriminant Analysis and Structural Equation Modeling runs were done with
the outliers eliminated (sample of 542) and with the outliers retained (sample of 550).
The results were slightly better for Discriminant Analysis with a sample size of 550,
and for Structural Equation Modeling there was no difference between 542 and 550
samples. Hence the final distribution is close to normal without eliminating the
outliers and the final sample used in the analysis is 550.
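The skewness and kurtosis rule of thumb used in this section (values within -1 to +1 treated as near-normal) can be sketched as below. The moment-based formulas are a simplification assumed for illustration; statistical packages typically apply small-sample corrections, so their output differs slightly. The function names are hypothetical, and the dictionary holds the skewness and kurtosis values reported in table 9.

```python
def skewness(values: list[float]) -> float:
    """Moment-based skewness: third central moment over variance to the power 1.5."""
    n = len(values)
    centre = sum(values) / n
    m2 = sum((v - centre) ** 2 for v in values) / n
    m3 = sum((v - centre) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def excess_kurtosis(values: list[float]) -> float:
    """Moment-based kurtosis minus 3, so that a normal distribution gives zero."""
    n = len(values)
    centre = sum(values) / n
    m2 = sum((v - centre) ** 2 for v in values) / n
    m4 = sum((v - centre) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3


def within_limits(skew: float, kurt: float, limit: float = 1.0) -> bool:
    """Apply the +/- 1 rule of thumb to a skewness and kurtosis pair."""
    return abs(skew) <= limit and abs(kurt) <= limit


# (skewness, kurtosis) as reported in table 9 for each composite predictor
reported = {
    "CBB": (0.91, 0.70),
    "MAT": (-0.04, -0.30),
    "ECCS": (-0.38, -0.67),
    "CCFB": (0.65, -0.11),
}
near_normal = {name: within_limits(s, k) for name, (s, k) in reported.items()}
```

All four reported pairs fall inside the ±1 band, which is consistent with the conclusion drawn in the text.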
Table: 5
Tests of Normality(b) of Predictor Variables based on Credit Default Scores used in SEM Analysis
Predictor variable / Category based on actual default score (dependent)   Kolmogorov-Smirnov(a) Statistic   Sample size (df)   Sig.   Type of Distribution
CBB
0.00 0.11 61.00 0.04 Not normal
1.00 0.12 115.00 0.00 Not normal
2.00 0.08 96.00 0.14 Normal
3.00 0.09 70.00 0.20 Normal
4.00 0.13 83.00 0.00 Not normal
5.00 0.09 52.00 0.20 Normal
6.00 0.08 25.00 0.20 Normal
7.00 0.11 19.00 0.20 Normal
8.00 0.16 23.00 0.12 Normal
9.00 0.16 4.00 .
10.00 0.26 2.00 .
MAT
0.00 0.14 61.00 0.01 Not normal
1.00 0.09 115.00 0.02 Not normal
2.00 0.09 96.00 0.05 Normal
3.00 0.12 70.00 0.01 Not normal
4.00 0.12 83.00 0.01 Not normal
5.00 0.15 52.00 0.00 Not normal
6.00 0.13 25.00 0.20 Normal
7.00 0.14 19.00 0.20 Normal
8.00 0.12 23.00 0.20 Normal
9.00 0.23 4.00 .
10.00 0.26 2.00 .
ECCS
0.00 0.15 61.00 0.00 Not normal
1.00 0.13 115.00 0.00 Not normal
2.00 0.13 96.00 0.00 Not normal
3.00 0.18 70.00 0.00 Not normal
4.00 0.18 83.00 0.00 Not normal
5.00 0.12 52.00 0.05 Normal
6.00 0.13 25.00 0.20 Normal
7.00 0.16 19.00 0.19 Normal
8.00 0.14 23.00 0.20 Normal
9.00 0.24 4.00 .
10.00 0.26 2.00 .
CCFB
0.00 0.13 61.00 0.02 Not normal
1.00 0.12 115.00 0.00 Not normal
2.00 0.14 96.00 0.00 Not normal
3.00 0.17 70.00 0.00 Not normal
4.00 0.15 83.00 0.00 Not normal
5.00 0.10 52.00 0.20 Normal
6.00 0.13 25.00 0.20 Normal
7.00 0.20 19.00 0.04 Not normal
8.00 0.16 23.00 0.13 Normal
9.00 0.29 4.00 .
*This is a lower bound of the true significance.
a-Lilliefors Significance Correction
b-CCFB is constant when MCSUM = 10.00. It has been omitted.
K-S test used when sample greater than 50
S-W test used when sample smaller than 50
Table: 6
Tests of Normality(b) of Predictor Variables based on Credit Default Scores used in SEM Analysis
Predictor variable / Category based on actual default score (dependent)   Shapiro-Wilk Statistic   Sample size (df)   Sig.   Type of Distribution
CBB
0.00 0.96 61.00 0.03 Not normal
1.00 0.94 115.00 0.00 Not normal
2.00 0.98 96.00 0.08 Normal
3.00 0.91 70.00 0.00 Not normal
4.00 0.96 83.00 0.01 Not normal
5.00 0.97 52.00 0.19 Normal
6.00 0.99 25.00 1.00 Normal
7.00 0.99 19.00 0.99 Normal
8.00 0.92 23.00 0.08 Normal
9.00 0.99 4.00 0.95 Normal
10.00
MAT
0.00 0.96 61.00 0.07 Normal
1.00 0.97 115.00 0.02 Not normal
2.00 0.97 96.00 0.04 Not normal
3.00 0.96 70.00 0.05 Normal
4.00 0.97 83.00 0.08 Normal
5.00 0.95 52.00 0.04 Not normal
6.00 0.97 25.00 0.70 Normal
7.00 0.94 19.00 0.22 Normal
8.00 0.96 23.00 0.39 Normal
9.00 0.89 4.00 0.40 Normal
10.00
ECCS
0.00 0.95 61.00 0.02 Not normal
1.00 0.95 115.00 0.00 Not normal
2.00 0.95 96.00 0.00 Not normal
3.00 0.92 70.00 0.00 Not normal
4.00 0.94 83.00 0.00 Not normal
5.00 0.95 52.00 0.02 Not normal
6.00 0.96 25.00 0.41 Normal
7.00 0.94 19.00 0.22 Normal
8.00 0.91 23.00 0.03 Not normal
9.00 0.91 4.00 0.49 Normal
10.00
CCFB
0.00 0.92 61.00 0.00 Not normal
1.00 0.91 115.00 0.00 Not normal
2.00 0.93 96.00 0.00 Not normal
3.00 0.95 70.00 0.01 Not normal
4.00 0.97 83.00 0.04 Not normal
5.00 0.97 52.00 0.23 Normal
6.00 0.95 25.00 0.20 Normal
7.00 0.91 19.00 0.07 Normal
8.00 0.96 23.00 0.44 Normal
9.00 0.89 4.00 0.37 Normal
*This is a lower bound of the true significance.
b-CCFB is constant when MCSUM = 10.00. It has been omitted.
K-S test used when sample greater than 50
S-W test used when sample smaller than 50
Table: 7
Tests of Normality(b) based on Credit Default Category as used in Discriminant Analysis
Predictor / Credit default category (dependent)   Kolmogorov-Smirnov(a) Statistic   Sample size (df)   Sig.   Type of Distribution
CBB
Non-defaulter 0.11 61.00 0.04 Not normal
Potential defaulters (early warning stage) 0.11 364.00 0.00 Not normal
Defaulter 0.06 125.00 0.20 Normal
MAT
Non-defaulter 0.14 61.00 0.01 Not normal
Potential defaulters (early warning stage) 0.07 364.00 0.00 Not normal
Defaulter 0.10 125.00 0.01 Not normal
ECCS
Non-defaulter 0.15 61.00 0.00 Not normal
Potential defaulters (early warning stage) 0.15 364.00 0.00 Not normal
Defaulter 0.12 125.00 0.00 Not normal
CCFB
Non-defaulter 0.13 61.00 0.02 Not normal
Potential defaulters (early warning stage) 0.13 364.00 0.00 Not normal
Defaulter 0.08 125.00 0.03 Not normal
*This is a lower bound of the true significance.
a-Lilliefors Significance Correction
Table: 8
Tests of Normality(b) based on Credit Default Category as used in Discriminant Analysis
Predictor / Credit default category (dependent)   Shapiro-Wilk Statistic   Sample size (df)   Sig.   Type of Distribution
CBB
Non-defaulter 0.96 61.00 0.03 Not normal
Potential defaulters (early warning stage) 0.96 364.00 0.00 Not normal
Defaulter 0.99 125.00 0.50 Normal
MAT
Non-defaulter 0.96 61.00 0.07 Normal
Potential defaulters (early warning stage) 0.99 364.00 0.01 Not normal
Defaulter 0.98 125.00 0.03 Not normal
ECCS
Non-defaulter 0.95 61.00 0.02 Not normal
Potential defaulters (early warning stage) 0.95 364.00 0.00 Not normal
Defaulter 0.95 125.00 0.00 Not normal
CCFB
Non-defaulter 0.92 61.00 0.00 Not normal
Potential defaulters (early warning stage) 0.95 364.00 0.00 Not normal
Defaulter 0.98 125.00 0.10 Normal
*This is a lower bound of the true significance.
Table: 9
Descriptive Statistics for the Predictor Variables
Statistic Std. Error
CBB Composite
Mean 2.14 0.03
95% Confidence Interval for Mean Lower Bound 2.08
Upper Bound 2.21
5% Trimmed Mean 2.10
Std. Deviation 0.78
Skewness 0.91 0.10
Kurtosis 0.70 0.21
MAT Composite
Mean 2.96 0.04
95% Confidence Interval for Mean Lower Bound 2.88
Upper Bound 3.03
5% Trimmed Mean 2.95
Std. Deviation 0.86
Skewness -0.04 0.10
Kurtosis -0.30 0.21
ECCS Composite
Mean 3.20 0.04
95% Confidence Interval for Mean Lower Bound 3.12
Upper Bound 3.28
5% Trimmed Mean 3.22
Std. Deviation 0.99
Skewness -0.38 0.10
Kurtosis -0.67 0.21
CCFB Composite
Mean 2.00 0.03
95% Confidence Interval for Mean Lower Bound 1.93
Upper Bound 2.06
5% Trimmed Mean 1.96
Std. Deviation 0.72
Skewness 0.65 0.10
Kurtosis -0.11 0.21
Figure: 4
Normal Q-Q Plot of CBB
Figure: 5
Normal Q-Q Plot of MAT
Figure: 6
Normal Q-Q Plot of ECCS
Figure: 7
Normal Q-Q Plot of CCFB
Factor Analysis of the measuring tools
The response error or systematic error includes researcher error, interviewer error and
respondent error. These errors can affect the scale reliability and vice versa.
Researcher error arises due to population definition error, measurement error and
surrogate information error. Interviewer error arises due to response selection
error, questioning error, recording error and cheating error. Respondent error
arises due to inability error and unwillingness error. The systematic error is minimized
by ensuring a good quality measuring tool and good implementation.
The random error or noise is taken care of by the large sample size. The reliability of
the scale is one of the measures of the quality of the tool. The reliability of the
scale is affected by researcher error and interviewer error. The scale items with the
maximum Cronbach's alpha value for each variable are subjected to factor analysis, to
confirm the validity of the existing standardised scale.
Factor Analysis is used to cross validate the existing standardised questionnaire. At
the early stage the attempt is to eliminate any variables that don’t correlate with any
other variables or that correlate very highly with other variables (R>0.9).
As part of the final study, the questionnaires as modified by the results of the pilot
study had a total of 25 items spread across four predictor variables. One statement
(indicator) from the enhanced credit card spending scale and one statement
from the credit card financing behaviour scale were removed, as reported earlier,
because a preliminary analysis using scale reliability showed that these two items were
reducing the overall scale reliability. The total number of items across the four
predictor variables is therefore 23. The dependent variable credit default, being
measured on a nominal scale, was not subjected to factor analysis.
Factor analysis is then repeated using the 23 items across the four predictor variables,
viz. materialism (5 items), compulsive buying (9 items), enhanced credit card spending
(4 items) and credit card financing behaviour (5 items). The appropriateness of the
factor analysis model for the given data, that is, whether the data were suitable for
conducting factor analysis, was tested using the Kaiser-Meyer-Olkin (KMO) measure of
sampling adequacy and Bartlett’s test of sphericity. The KMO statistic varies between
0 and 1. A value close to 1 indicates that the patterns of correlations are relatively
compact and so factor analysis should yield distinct and reliable factors. Values
above 0.9 are excellent (Hutcheson & Sofroniou, 1999). A significant Bartlett’s test
tells us that the correlation matrix (R-matrix) is not an identity matrix, so there are
some relationships between the variables we hope to include in the analysis. Here the
KMO value is 0.93 and Bartlett’s test is significant, so factor analysis is appropriate
(table 10).
Table: 10 KMO and Bartlett's Test
Kaiser-Meyer-Olkin Measure of Sampling Adequacy. 0.93
Bartlett's Test of Sphericity Approx. Chi-Square 5935.13
df 253.00
Sig. 0.00
Looking at the correlation matrix, the correlation values are less than or equal to 0.65,
and in the majority of cases the correlations are significant (p<0.05); only 6
inter-correlation values were insignificant (p>0.05). The determinant of the correlation
matrix is 0.000017, which is greater than the necessary value of 0.00001, indicating
that multi-collinearity is not a problem for these data.
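As a rough cross-check, Bartlett's chi-square can be approximated from the determinant of the correlation matrix using the standard formula χ² = -(n - 1 - (2p + 5)/6) · ln|R| with p(p - 1)/2 degrees of freedom. The sketch below assumes n = 550 respondents, p = 23 items and the rounded determinant 0.000017 quoted in the text; the function name is hypothetical, and because the determinant is rounded the result only approximates the value in table 10.

```python
import math


def bartlett_sphericity(n_obs: int, n_vars: int, determinant: float) -> tuple[float, int]:
    """Approximate Bartlett's test statistic and its degrees of freedom."""
    chi_square = -(n_obs - 1 - (2 * n_vars + 5) / 6) * math.log(determinant)
    degrees_of_freedom = n_vars * (n_vars - 1) // 2
    return chi_square, degrees_of_freedom


chi_square, df = bartlett_sphericity(n_obs=550, n_vars=23, determinant=0.000017)
# Multi-collinearity screen from the text: determinant should exceed 0.00001
determinant_ok = 0.000017 > 0.00001
```

The degrees of freedom come out at 253, and the chi-square lands within a few units of the 5935.13 reported in table 10.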
Four factors with eigenvalues greater than 1 were extracted from the 23 statements
belonging to the four predictor variables. The average communality of the 23 statements
was 0.60, which is a good value when the sample size exceeds 250. Hence all four
factors were retained using Kaiser’s criterion (table 11).
Table: 11
Communalities
Initial Extraction
CB1 1.00 0.56
CB2 1.00 0.59
CB3 1.00 0.34
CB4 1.00 0.61
CB5 1.00 0.60
CB6 1.00 0.65
CB7 1.00 0.64
CB8 1.00 0.46
CB9 1.00 0.61
Mat 1 1.00 0.59
Mat 2 1.00 0.58
Mat 3 1.00 0.58
Mat 4 1.00 0.61
Mat 5 1.00 0.64
ECCS1 1.00 0.75
ECCS2 1.00 0.72
ECCS3 1.00 0.73
ECCS5 1.00 0.56
CCFB1 1.00 0.61
CCFB2 1.00 0.71
CCFB3 R 1.00 0.39
CCFB5 1.00 0.61
CCFB6 1.00 0.48
Extraction Method: Principal Component Analysis.
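The average communality can be recomputed from the extraction column of table 11. This is only an arithmetic sketch; the variable names are hypothetical and the values are the two-decimal figures printed in the table, so the result is subject to rounding.

```python
# Extraction communalities as printed in table 11
communalities = {
    "CB1": 0.56, "CB2": 0.59, "CB3": 0.34, "CB4": 0.61, "CB5": 0.60,
    "CB6": 0.65, "CB7": 0.64, "CB8": 0.46, "CB9": 0.61,
    "Mat1": 0.59, "Mat2": 0.58, "Mat3": 0.58, "Mat4": 0.61, "Mat5": 0.64,
    "ECCS1": 0.75, "ECCS2": 0.72, "ECCS3": 0.73, "ECCS5": 0.56,
    "CCFB1": 0.61, "CCFB2": 0.71, "CCFB3R": 0.39, "CCFB5": 0.61, "CCFB6": 0.48,
}

n_items = len(communalities)
average_communality = sum(communalities.values()) / n_items
```

From the rounded table values the average works out at roughly 0.59, in line with the figure of about 0.60 quoted in the text.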
The rotated component matrix loads clearly onto four factors, which are
materialism (5 items), compulsive buying (9 items), enhanced credit card spending (4
items) and credit card financing behaviour (5 items). Varimax rotation was adopted
as the four factors were considered theoretically independent (table 12).
Table: 12
Rotated Component Matrix
Rotated Component Matrix(a)
Component
1 2 3 4
CB7 0.75
CB6 0.75
CB9 0.72
CB4 0.72
CB5 0.71
CB2 0.70
CB1 0.65
CB8 0.65
CB3 0.42
Mat 5 0.77
Mat 2 0.74
Mat 4 0.74
Mat 3 0.74
Mat 1 0.69
ECCS3 0.82
ECCS1 0.81
ECCS2 0.79
ECCS5 0.73
CCFB5 0.75
CCFB2 0.73
CCFB6 0.67
CCFB3 R 0.61
CCFB1 0.61
Extraction Method: Principal Component Analysis.
a-Rotation converged in 5 iterations.
Rotation Method: Varimax with Kaiser Normalization.
Figure: 8
Component Plot in Rotated Space (axes: Component 1, Component 2 and Component 3; the plotted points are the scale items)
The factors identified by factor analysis were subjected to reliability analysis and
their values are reported below. In the reliability statistics, the Cronbach's alpha
values of the scales for all the included variables are greater than 0.7, as
recommended by J.C. Nunnally (1978). Hence the items of each construct can be used
together as a scale (table 13).
Table: 13
Number of items in each scale and their Cronbach Alpha Value
Scales   No. of items after deletion to enhance alpha   Cronbach's alpha on standardised items
Compulsive Buying (CBB)   9   0.894
Materialism (MAT)   5   0.836
Enhanced Credit Card Spending (ECCS)   4   0.852
Credit Card Financing Behaviour (CCFB)   5   0.776
Table: 14
Inter-Item Correlation Matrix for Compulsive Buying Scale
CB1 CB2 CB3 CB4 CB5 CB6 CB7 CB8 CB9
CB1 1.00 0.66 0.42 0.52 0.45 0.49 0.48 0.42 0.52
CB2 0.66 1.00 0.42 0.56 0.48 0.52 0.54 0.43 0.52
CB3 0.42 0.42 1.00 0.38 0.31 0.28 0.34 0.21 0.36
CB4 0.52 0.56 0.38 1.00 0.57 0.61 0.56 0.40 0.53
CB5 0.45 0.48 0.31 0.57 1.00 0.62 0.63 0.43 0.57
CB6 0.49 0.52 0.28 0.61 0.62 1.00 0.58 0.49 0.59
CB7 0.48 0.54 0.34 0.56 0.63 0.58 1.00 0.49 0.56
CB8 0.42 0.43 0.21 0.40 0.43 0.49 0.49 1.00 0.47
CB9 0.52 0.52 0.36 0.53 0.57 0.59 0.56 0.47 1.00
Table: 15
Inter-Item Correlation Matrix for Materialism Scale
Mat 1 Mat 2 Mat 3 Mat 4 Mat 5
Mat 1 1.00 0.45 0.47 0.50 0.62
Mat 2 0.45 1.00 0.48 0.52 0.46
Mat 3 0.47 0.48 1.00 0.51 0.49
Mat 4 0.50 0.52 0.51 1.00 0.56
Mat 5 0.62 0.46 0.49 0.56 1.00
Table: 16
Inter-Item Correlation Matrix for ECCS scale
ECCS1 ECCS2 ECCS3 ECCS5
ECCS1 1.00 0.70 0.66 0.49
ECCS2 0.70 1.00 0.65 0.48
ECCS3 0.66 0.65 1.00 0.55
ECCS5 0.49 0.48 0.55 1.00
Table: 17
Inter-Item Correlation Matrix for CCFB scale
CCFB1 CCFB2 CCFB3 R CCFB5 CCFB6
CCFB1 1.00 0.65 0.35 0.45 0.30
CCFB2 0.65 1.00 0.40 0.54 0.41
CCFB3 R 0.35 0.40 1.00 0.28 0.22
CCFB5 0.45 0.54 0.28 1.00 0.47
CCFB6 0.30 0.41 0.22 0.47 1.00
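The link between the inter-item correlation matrices above and the Cronbach's alpha on standardised items in table 13 can be sketched with the standard formula alpha = k · r̄ / (1 + (k - 1) · r̄), where k is the number of items and r̄ the mean inter-item correlation. The function name is hypothetical; the matrices are copied from tables 15 and 16, so the results are subject to the two-decimal rounding of those tables.

```python
def standardized_alpha(corr: list[list[float]]) -> float:
    """Cronbach's alpha on standardised items from an inter-item correlation matrix."""
    k = len(corr)
    # Use each pair once: the entries above the diagonal
    pairs = [corr[i][j] for i in range(k) for j in range(i + 1, k)]
    mean_r = sum(pairs) / len(pairs)
    return k * mean_r / (1 + (k - 1) * mean_r)


# Table 16: ECCS1, ECCS2, ECCS3, ECCS5
eccs = [
    [1.00, 0.70, 0.66, 0.49],
    [0.70, 1.00, 0.65, 0.48],
    [0.66, 0.65, 1.00, 0.55],
    [0.49, 0.48, 0.55, 1.00],
]

# Table 15: Mat 1 to Mat 5
mat = [
    [1.00, 0.45, 0.47, 0.50, 0.62],
    [0.45, 1.00, 0.48, 0.52, 0.46],
    [0.47, 0.48, 1.00, 0.51, 0.49],
    [0.50, 0.52, 0.51, 1.00, 0.56],
    [0.62, 0.46, 0.49, 0.56, 1.00],
]

alpha_eccs = standardized_alpha(eccs)
alpha_mat = standardized_alpha(mat)
```

Both values come out within rounding distance of the 0.852 (ECCS) and 0.836 (MAT) reported in table 13.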
Reliability - Test-Retest, Parallel Form, and Internal Consistency
1. Test-retest - To ensure stability over a period of time and to negate the effect of
the respondent's mood, a test-retest was done by going back to a few respondents who
were part of the pilot study. 40 respondents from within Kochi Municipal Corporation
limits (Thripunithura, Kakkanad, Edapally, and Ernakulam) were met. Their original
responses were compared to the new set of responses and found conforming. This
reaffirms the measuring capability and consistency of the tool.
2. Parallel form- Two different measures (two forms) were used to check for reliability,
particularly for the variable compulsive buying, as the confirmatory factor analysis at
the pilot study showed some form of inconsistency in the compulsive buying and
impulsiveness scale (CBIS or CB) (nine items), developed by James Carl Stone, 2001,
Oklahoma State University. The CBIS was complemented with the
compulsive buying screener of Faber & O’Guinn (1992) so as to check for parallel
form consistency. The CBIS scale with 550 respondents in the final study was
re-subjected to factor analysis as reported. The scale items grouped into a single
factor in the rotated component matrix analysis. The CBIS also has a reported
reliability (Cronbach's alpha) of 0.894. The compulsive buying trait was also measured
with the Faber scale. The reported reliability of the Faber scale is 0.836 (Cronbach's
alpha). The percentage of compulsive buyers identified by the CBIS scale is 14.54% of
respondents (out of 550) and the percentage of compulsive buyers identified by the
Faber scale is 13.93% of respondents (out of 359). Among the 359 respondents common to
both scales, 58 respondents were classified as compulsive buyers. Of these 58
respondents, the Faber scale classified 44 (75.86%) as compulsive buyers, while the
CBIS scale classified all 58 as compulsive buyers. This supports the parallel form
reliability of the compulsive buying tool.
The two different measures of compulsive buying (CBIS and Faber) had negative
correlations between them in the correlation matrix because the CBIS scale was
positive in nature and the Faber scale was negative in nature. The determinant of the
matrix was 0.0000128, which is greater than 0.00001. The items of both scales
converged into a single factor under direct oblimin rotation. Direct oblimin
rotation was used as the CBIS scale and the Faber scale both measure the same
variable (table 18).
Table: 18 Pattern Matrix for CBIS and Faber Scale
Pattern Matrix(a) for CBIS and Faber Scale
Checking for Convergent validity
Component
1 2
CB6 0.9445
FCBB6R -0.8553
CB5 0.81503
CB4 0.80031
FCBB4R -0.7607
CB7 0.75417
CB9 0.66724
FCBB1R -0.6527
CB1 0.62399
CB8 0.62381
CB2 0.60981
FCBB3R -0.5667
FCBB2R -0.5194
FCBB5R 0.8621
CB3 -0.832
FCBB7R
Extraction Method: Principal Component Analysis.
Rotation Method: Oblimin with Kaiser Normalization.
a-Rotation converged in 6 iterations.
3. Internal consistency- This is the most important method when there is a large
number of items, and it helps to look for a suspected testing effect. Factor analysis
and a t-test were used for elimination of items. No indicator (statement) had
inter-item correlations as high as 0.8 or 0.9. In all four predictors the inter-item
correlations were sufficient to
bring in high reliability (Cronbach Alpha values) (see inter-item correlation tables 14 -
17 and Cronbach alpha table 13).
4. Average inter-item correlation- The language and scale of the items were corrected
so that the inter-item correlation among all the statements in a variable was high
enough to ensure a high Cronbach's alpha value. The item-total correlation of odd and
even items (taken at random) was analysed to check for consistency. The statements
(factors) in each predictor variable ensured face and content validity.
5. Cronbach's alpha- It takes all possible split-halves into account. A value above
0.90 is undesirable, and in none of the measuring instruments is the Cronbach's alpha
above 0.90 in the final study (table 13).
6. Training of the data collectors was important and was ensured before the final
data collection. The data collectors were a group of marketing research students at
Rajagiri School of Management. Their training minimised interviewer errors such as
response selection error, questioning error, recording error and cheating error.
Respondent errors such as inability error and unwillingness error were also minimised.
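The Cronbach's alpha referred to in the reliability steps above can be illustrated
with a minimal sketch. It assumes the standard formula (number of items, sum of item
variances and variance of the total score); the responses below are hypothetical
Likert ratings and not data from the study, and the function name is illustrative.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])                          # number of items in the scale
    items = list(zip(*rows))                  # scores regrouped item by item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical 5-point Likert responses (6 respondents x 4 items), for illustration.
responses = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 3, 4],
    [1, 2, 1, 2],
    [4, 4, 5, 4],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

With real data, each row would hold one respondent's answers to the items of a single
measuring instrument.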
Construct Validity-Convergent, Discriminant
Construct validity-Convergent validity is indicated by high correlations among
different measures of the same concept obtained using different methods of
measurement. This was not achievable, as the process of tracking a compulsive buyer
as explicitly reported in the Diagnostic Criteria for Compulsive Buying from McElroy
et al (1994) was difficult and beyond the scope of the study. It was also not
possible for the variables materialism, enhanced credit card spending and credit card
financing behaviour.
In the dependent variable case- A debt know-how quiz of Master Card was administered
to find the credit default probability of the respondents. This is a nominal scale
with ten items; the answers are accumulated as points indicating the probability of
credit default. All ten questions have a Yes or No choice. In question number nine,
Yes scores zero and No scores one. In all other questions, Yes scores one and No
scores zero. If all answers are Yes the total score is nine, and if all answers are
No the total score is one.
Zero score means- Excellent, as the credit default probability is nil.
Score of 1-4 means- Potential defaulters (early warning stage).
Score of 5-10 means- High credit default probability.
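The scoring rule described above can be sketched as follows. This is a minimal
illustration of the rule as stated in the text; the function names and the use of
True/False for Yes/No are assumptions made for the example.

```python
def quiz_score(answers):
    """Total score for ten Yes/No answers (True = Yes, False = No)."""
    if len(answers) != 10:
        raise ValueError("the quiz has ten questions")
    score = 0
    for number, yes in enumerate(answers, start=1):
        if number == 9:
            score += 0 if yes else 1      # question nine is reverse scored
        else:
            score += 1 if yes else 0
    return score

def default_category(score):
    """Map a total score (0 to 10) to the three categories used in the study."""
    if score == 0:
        return "non-defaulter"
    if 1 <= score <= 4:
        return "potential defaulter (early warning stage)"
    return "high credit default probability"

all_yes = quiz_score([True] * 10)
all_no = quiz_score([False] * 10)
print(all_yes, default_category(all_yes))
print(all_no, default_category(all_no))
```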
Because this is a self-response form, it is a delicate matter to classify a respondent
as a non-defaulter (zero score), a potential defaulter (early warning stage) (score
of 1 to 4) or a high defaulter (score of 5 to 10). Therefore convergent validity was
to be examined by finding out the actual credit default status of a deliberately
chosen respondent set from the banks/financial institutions (the initially proposed
number was 275). This was attempted at the sample design stage by selecting a known
group of defaulters from banks. The researcher sought the help of various bank
officials to disclose their lists of defaulters. Many banks showed interest at the
discussion stage, but when it came to the actual disclosure of the list of defaulters,
or even to helping collect data indirectly from their defaulters through their own
recovery executives, they were reluctant.
Discriminant validity-The different concepts, i.e., the four predictor variables
measured in the study using the same survey method, should have low correlations with
one another, showing that these constructs are different and discriminated. The four
factors (predictor variables under study) are orthogonal (unrelated). Four factors
clearly emerged in the factor analysis (table 12). The second type of discriminant
validity, using a different method of measurement for the five variables under study
(four predictors and one criterion), could not be examined owing to paucity of time
and because it was beyond the researcher's cost considerations.
Content validity
The tools of measurement are standardised and already used elsewhere. The tools
were developed after extensive focus group idea generation and hence can ask
respondents to rate their agreement or disagreement with statements on a scale.
Further, calculating the mean of the rankings given by different judges across
questions ensures content validity; this was already done as part of the standardised
questionnaire development. The literature review also ensures that the tools are in
line with the available body of knowledge in the concerned domains.
External validity
External validity is the degree to which the conclusions in the study would hold for
other persons in other places and at other times. This is ensured by the
above-mentioned steps as well as by the scientific sample design reported in the
present chapter. The present study was undertaken in a different, distinct and
contrasting cultural setting, the typical India-specific culture, vis-à-vis the
typical American or European cultures where near-similar studies were done in the
past. At the same time, there are many similarities between the Indian economy, a
fast-developing globalised competitive economy, and the European and American
economies, which are post-modern globalised developed economies. The new retail
environment, credit card availability and usage, changing family structure and work
culture, spending patterns, attitude to consumerism and their impact on the social
milieu are worth mentioning. The study in its present form and its findings would
have a lasting impact on the already available body of knowledge, its inter-domain
relationships and generalisability.
Conclusion validity
Conclusion validity is the degree to which the conclusions we reach about
relationships in our data are reasonable and believable, owing to the rigor in the
process of research as mentioned above.
Discriminant Analysis
The dependent variable is non-metric; it is accumulated to a score on the basis of
which the sample is categorised as credit non-defaulters, potential defaulters (early
warning stage) and defaulters. Respondents with a score of zero are credit
non-defaulters, and those with a score between one and four (both inclusive) are
potential defaulters (early warning stage). Those with a score between five and ten
(both inclusive) are defaulters. There are 61 credit non-defaulters, 364 potential
defaulters (early warning stage) and 125 defaulters. A debt know-how quiz of Master
Card was administered to find the credit default probability of the respondents. A
baseline discriminant analysis using the step-wise method was completed. Of the
original grouped cases, 52.7 percent were correctly classified. Of the cross-validated
grouped cases, 51.8 percent were correctly classified. The predictor variables are
normally distributed (table 7, 8 & 9, explanation as in figure 4 to 7), and the
Mahalanobis D square to the most likely group is not greater than the critical value
of 13.28 (sig .01 & df 4 with four independent variables), except for thirteen cases.
A separate DA was run eliminating the 13 cases and using the within-group covariance
matrix. In that run 50.8% of the original grouped cases and 50.1% of the
cross-validated grouped cases were correctly classified. Since Box's M test of
equality of covariance matrices is significant (p<.05), the DA was re-run using the
separate-group covariance matrix. Only 49.5% of the original grouped cases were then
correctly classified, which is not more than 2% above the 50.8% obtained when the DA
was run using the within-group covariance matrix (the target value for the
separate-group covariance analysis was 51.82%). The researcher decided to proceed
with further analysis on the full sample of 550 without eliminating the 13 outliers,
as 52.7% of the original grouped cases and 51.8% of the cross-validated grouped cases
are correctly classified.
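The critical value of 13.28 quoted above for the Mahalanobis D square corresponds to
the chi-square distribution with 4 degrees of freedom at the .01 level. The minimal
sketch below recovers it using the closed-form upper-tail probability for 4 degrees
of freedom and a simple bisection; the function names are illustrative, and a
statistics package would normally be used instead.

```python
import math

def chi2_sf_df4(x):
    """Upper-tail probability of a chi-square variable with 4 degrees of freedom."""
    return math.exp(-x / 2) * (1 + x / 2)

def critical_value_df4(alpha, lo=0.0, hi=100.0, tol=1e-9):
    """Bisection for the x whose upper-tail probability equals alpha."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if chi2_sf_df4(mid) > alpha:
            lo = mid                      # tail still too heavy: move right
        else:
            hi = mid
    return (lo + hi) / 2

def is_outlier(d_squared, alpha=0.01):
    """Flag a case whose Mahalanobis D square exceeds the critical value."""
    return d_squared > critical_value_df4(alpha)

critical = critical_value_df4(0.01)
print(round(critical, 2))
```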
In the DA using the 550 cases, in the tests of equality of group means, the F value
is significant (p<.05) for the discriminant model as a whole, rejecting the null
hypothesis that the predictors have no impact in categorising the dependent outcome.
However, the further results of the discriminant analysis include only three
variables, viz. enhanced credit card spends (ECCS), credit card financing behaviour
(CCFB) and compulsive buying behaviour (CBB). The predictor variables do not exhibit
multi-collinearity (table 19).
Table: 19 Variables in the Analysis for DA

Step  Predictors  Tolerance  Sig. of F to Remove  Min. D Squared  Between Groups
1     ECCS        1.00       0.00
2     ECCS        0.95       0.00                 0.08            .00 and 1.00
      CCFB        0.95       0.00                 0.14            .00 and 1.00
3     ECCS        0.86       0.06                 0.09            .00 and 1.00
      CCFB        0.80       0.00                 0.15            .00 and 1.00
      CBB         0.72       0.00                 0.19            .00 and 1.00
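The tolerance values in table 19 can be read as a multi-collinearity check. The
sketch below converts the step 3 tolerances into variance inflation factors (the
reciprocal of tolerance) and applies a cut-off of 0.10 on tolerance. That cut-off is
a commonly used rule of thumb assumed here for illustration; it is not stated in the
study.

```python
# Tolerance values reported for step 3 in table 19.
tolerance = {"ECCS": 0.86, "CCFB": 0.80, "CBB": 0.72}

def vif(tol):
    """Variance inflation factor: the reciprocal of tolerance."""
    return 1.0 / tol

def is_collinear(tol, cutoff=0.10):
    """Assumed rule of thumb: tolerance below the cut-off signals collinearity."""
    return tol < cutoff

report = {name: (round(vif(t), 2), is_collinear(t)) for name, t in tolerance.items()}
print(report)
```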
Box's M test of equality of within-group covariance matrices is significant (p<.05).
The log determinants of the three dependent categories are close, indicating equality
of covariance. As classification using separate-group covariance matrices is less
accurate (less than a 2% improvement over the within-group covariance matrix), the
baseline within-group covariance output is used for further analysis. The percentage
of original grouped cases correctly classified using separate-group covariance
matrices is 53.5%, which is less than the target value of 53.75%, a 2% increase over
52.7% (the within-group covariance classification percentage).
Discriminant analysis (DA) is relatively robust even when there is a modest violation
of the equality of covariance assumption (Lachenbruch, 1975). Dichotomous variables,
which often violate multivariate normality, are not likely to affect conclusions
based on DA (Klecka, 1980).
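The 2% rule used above to choose between the within-group and separate-group
covariance classifications can be sketched as a small calculation. The target is
taken as a relative 2% gain over the within-group percentage, which reproduces the
target values of 53.75% and 51.82% quoted in this chapter; the function names are
illustrative.

```python
def target_accuracy(within_group_pct, gain=0.02):
    """Accuracy the separate-group run must reach: a relative gain of 2%."""
    return round(within_group_pct * (1 + gain), 2)

def prefer_separate(within_group_pct, separate_group_pct):
    """True only if the separate-group run meets the target accuracy."""
    return separate_group_pct >= target_accuracy(within_group_pct)

# Full sample of 550: within-group 52.7%, separate-group 53.5%.
print(target_accuracy(52.7), prefer_separate(52.7, 53.5))
# Sample without the 13 outliers: within-group 50.8%, separate-group 49.5%.
print(target_accuracy(50.8), prefer_separate(50.8, 49.5))
```

In both runs the separate-group classification falls short of the target, so the
within-group output is retained.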
Table: 20 Wilks' Lambda for DA

                                                  Exact F
Step  Number of Variables  Lambda  df1  df2  df3  Statistic  df1  df2   Sig.
1     1                    0.92    1    2    547  22.76      2    547   0.00
2     2                    0.77    2    2    547  38.10      4    1092  0.00
3     3                    0.67    3    2    547  40.40      6    1090  0.00
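The exact F statistics and degrees of freedom in table 20 can be approximately
recovered from Wilks' lambda. The sketch below assumes the standard exact F
transformation for three groups (hypothesis degrees of freedom of 2) and a sample of
550; because lambda is reported to only two decimals, the recomputed F values are
close to, but not identical with, the reported ones. The function name is
illustrative.

```python
import math

def wilks_exact_f(lam, p, n, groups=3):
    """Exact F for Wilks' lambda with p predictors, three-group case only."""
    if groups != 3:
        raise ValueError("this sketch covers the three-group case only")
    df_error = n - groups                      # 547 for n = 550
    if p == 1:
        # With one predictor this reduces to the one-way ANOVA F ratio.
        return (1 - lam) / lam * df_error / 2, 2, df_error
    root = math.sqrt(lam)
    df2 = 2 * (df_error - p + 1)
    f = (1 - root) / root * df2 / (2 * p)
    return f, 2 * p, df2

for step, lam in [(1, 0.92), (2, 0.77), (3, 0.67)]:
    f, df1, df2 = wilks_exact_f(lam, step, 550)
    print(step, round(f, 2), df1, df2)
```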
A one-sample t-test for each variable shows that there is no significant difference
between the sample mean and the population mean (p>.05), indicating minimal regression
to the mean.
Structural Equation Modeling
The data analysis is done in two stages: descriptive statistics followed by structural
equation modeling. Structural equation modeling is used in the study as it is robust,
owing to its ability to model mediating variables and to test the overall model
rather than the coefficients individually. It includes confirmatory factor analysis
(measurement model) and full model testing.
The Structural Equation Models (SEM) depicting the relationships among the variables
(see figure 9 to 12) are modeled using covariances. The hypothesised relationships
between the variables, including their indicators and error terms, were used to draw
the full model (figure 13). With the theoretical grounding firmly in place, the
change in Modification Indices (MI) was used to assign covariances between error
terms within the indicators of the respective variables. This enables the researcher
to find the most optimal model, or combination of variables, that fits well with the
data on which it is built, serves as a purposeful representation of the reality from
which the data has been extracted, and provides a parsimonious explanation of the
data (Kline, 1998).
Confirmatory Factor Analysis of the final questionnaire using SEM
The modified questionnaire was used to collect the final data, which were then
subjected to Confirmatory Factor Analysis (CFA) to recheck reliability and validity
(see table 21 to 32 and figure 9 to 12 for exact values).
In the confirmatory factor analysis, the standardised regression weights for all the
variables, viz. materialism (5 items), compulsive buying (9 items), enhanced credit
card spend (4 items) and credit card financing behaviour (5 items), showed a factor
loading for each indicator greater than 0.70 with a critical ratio (C.R.) greater
than 1.96 (p<.05), which shows good construct validity (Schumacker & Lomax, 2004).
However, two indicators in the compulsive buying scale (CB), one in enhanced credit
card spends (ECCS) and two items in credit card financing behaviour (CCFB) had factor
loadings greater than 0.5 only, although their critical ratios were greater than 1.96
(p<.05). The variances extracted from each of the error terms of the indicators were
greater than 0.5 with critical ratios greater than 1.96 (p<.05) (Graham 2006).
However, two error terms of indicators in the enhanced credit card spends scale,
three error terms in the compulsive buying scale and one error term in the credit
card financing behaviour scale had variances extracted just above 0.4, with critical
ratios greater than 1.96 (p<.05). Goodness-of-fit measures such as IFI and CFI
indicate good fit, with values >0.90 (table 23, 26, 29, 32).
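The 0.70 and 0.5 cut-offs discussed above can be applied mechanically to the
standardised regression weights listed in tables 21, 24, 27 and 30. The sketch below
simply counts the indicators of each scale falling in each band; the band labels and
the function name are illustrative and are not taken from the study.

```python
# Standardised regression weights as listed in tables 21, 24, 27 and 30.
loadings = {
    "MAT": [0.728, 0.651, 0.669, 0.736, 0.771],
    "CBB": [0.697, 0.728, 0.466, 0.75, 0.747, 0.77, 0.761, 0.598, 0.742],
    "ECCS": [0.833, 0.82, 0.803, 0.62],
    "CCFB": [0.735, 0.865, 0.462, 0.644, 0.496],
}

def band(weight):
    """Illustrative bands built from the 0.70 and 0.5 cut-offs in the text."""
    if weight > 0.70:
        return "above 0.70"
    if weight > 0.5:
        return "0.5 to 0.70"
    return "0.5 or below"

counts = {}
for scale, weights in loadings.items():
    tally = {"above 0.70": 0, "0.5 to 0.70": 0, "0.5 or below": 0}
    for w in weights:
        tally[band(w)] += 1               # count each indicator in its band
    counts[scale] = tally
print(counts)
```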
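Figure 9: Confirmatory Factor Analysis - Materialism (MAT)
[Path diagram: latent variable MAT with indicators Mat 1 to Mat 5. Path weights from
MAT: Mat 1 = 1.00, Mat 2 = .84, Mat 3 = .91, Mat 4 = 1.05, Mat 5 = 1.09. Variances of
the error terms: e10 = .58, e11 = .63, e12 = .66, e13 = .61, e14 = .53; e1 = .65.]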
Table: 21 Standardized Regression Weights for MAT

Standardized Regression Weights for MAT*   Estimate
Mat1 <--- MAT                              0.728
Mat2 <--- MAT                              0.651
Mat3 <--- MAT                              0.669
Mat4 <--- MAT                              0.736
Mat5 <--- MAT                              0.771
* SRW greater than 0.70 with their critical ratio (C.R) greater than 1.96 (p<.05)

Table: 22 Variances for MAT

Variances for MAT   Estimate   S.E.    C.R.
e1                  0.652      0.071   9.154
e10                 0.578      0.045   12.943
e11                 0.625      0.044   14.23
e12                 0.663      0.047   13.982
e13                 0.608      0.048   12.754
e14                 0.526      0.044   11.855
Estimate greater than 0.5 with their critical ratio greater than 1.96 (p< .05)
Table: 23 Fit Indexes for MAT

Model   NFI (Delta1)   RFI (rho1)   IFI (Delta2)   TLI (rho2)   CFI
MAT     0.976          0.952        0.981          0.961        0.981
Model Fit: CMIN/DF 4.729, GFI 0.982, AGFI 0.946
Figure 10: Confirmatory Factor Analysis - Compulsive Buying (CBB)
Table: 24 Standardized Regression Weights for CBB

Standardized Regression Weights for CBB*   Estimate
CB1 <--- cbb                               0.697
CB2 <--- cbb                               0.728
CB3 <--- cbb                               0.466
CB4 <--- cbb                               0.75
CB5 <--- cbb                               0.747
CB6 <--- cbb                               0.77
CB7 <--- cbb                               0.761
CB8 <--- cbb                               0.598
CB9 <--- cbb                               0.742
* SRW greater than 0.70 with their critical ratio (C.R) greater than 1.96 (p<.05)
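[Path diagram for Figure 10: latent variable cbb with indicators CB1 to CB9. Path
weights from cbb: CB1 = 1.00, CB2 = 1.05, CB3 = .66, CB4 = 1.03, CB5 = 1.05,
CB6 = .96, CB7 = .98, CB8 = .74, CB9 = .99. Variances of the error terms: e10 = .66,
e11 = .61, e12 = .96, e13 = .51, e14 = .54, e15 = .39, e16 = .43, e17 = .61,
e18 = .49; e1 = .62.]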
Table: 25 Variances of CBB

Variances of CBB   Estimate   S.E.    C.R.
e1                 0.621      0.069   9.015
e10                0.657      0.044   14.911
e11                0.609      0.042   14.59
e12                0.964      0.06    16.085
e13                0.511      0.036   14.304
e14                0.543      0.038   14.345
e15                0.393      0.028   14.004
e16                0.432      0.031   14.145
e17                0.614      0.039   15.596
e18                0.494      0.034   14.412
Estimate greater than 0.5 with their critical ratio greater than 1.96 (p< .05)
Table: 26 Fit Indexes for CBB

Model   NFI (Delta1)   RFI (rho1)   IFI (Delta2)   TLI (rho2)   CFI
CBB     0.934          0.912        0.945          0.926        0.945
Model Fit: CMIN/DF 5.705, GFI 0.935, AGFI 0.892
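Figure 11: Confirmatory Factor Analysis - Enhanced Credit Card Spending (ECCS)
[Path diagram: latent variable Eccs with indicators ECCS1, ECCS2, ECCS3 and ECCS5.
Path weights from Eccs: ECCS1 = 1.00, ECCS2 = .95, ECCS3 = .93, ECCS5 = .68. Variances
of the error terms: e30 = .47, e31 = .47, e32 = .50, e33 = .79; e3 = 1.06.]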
Table: 27 Standardized Regression Weights for ECCS

Standardized Regression Weights for ECCS*   Estimate
ECCS1 <--- Eccs                             0.833
ECCS2 <--- Eccs                             0.82
ECCS3 <--- Eccs                             0.803
ECCS5 <--- Eccs                             0.62
* SRW greater than 0.70 with their critical ratio (C.R) greater than 1.96 (p<.05)
Table: 28 Variances of ECCS

Variances of ECCS   Estimate   S.E.    C.R.
e3                  1.06       0.094   11.269
e30                 0.467      0.044   10.592
e31                 0.467      0.042   11.163
e32                 0.503      0.043   11.822
e33                 0.785      0.052   15.003
Estimate greater than 0.5 with their critical ratio greater than 1.96 (p< .05)
Table: 29 Fit Indexes for ECCS

Model   NFI (Delta1)   RFI (rho1)   IFI (Delta2)   TLI (rho2)   CFI
ECCS    0.987          0.96         0.989          0.966        0.989
Model Fit: CMIN/DF 6.495, GFI 0.988, AGFI 0.939
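Figure 12: Confirmatory Factor Analysis - Credit Card Financing Behaviour (CCFB)
[Path diagram: latent variable CCFB with indicators CCFB1, CCFB2, CCFB3 R, CCFB5 and
CCFB6. Path weights from CCFB: CCFB1 = 1.00, CCFB2 = 1.10, CCFB3 R = .57,
CCFB5 = .76, CCFB6 = .50. Variances of the error terms: e20 = .57, e21 = .27,
e22 = .81, e23 = .55, e24 = .51; e2 = .67.]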
Table: 30 Standardized Regression Weights for CCFB

Standardized Regression Weights for CCFB*   Estimate
CCFB1 <--- CCFB                             0.735
CCFB2 <--- CCFB                             0.865
CCFB3R <--- CCFB                            0.462
CCFB5 <--- CCFB                             0.644
CCFB6 <--- CCFB                             0.496
* SRW greater than 0.70 with their critical ratio (C.R) greater than 1.96 (p<.05)
Table: 31 Variances of CCFB

Variances of CCFB   Estimate   S.E.    C.R.
e2                  0.671      0.073   9.133
e20                 0.569      0.046   12.267
e21                 0.27       0.039   6.961
e22                 0.808      0.051   15.715
e23                 0.551      0.039   14.187
e24                 0.514      0.033   15.534
Estimate greater than 0.5 with their critical ratio greater than 1.96 (p< .05)
Table: 32 Fit Indexes for CCFB

Model   NFI (Delta1)   RFI (rho1)   IFI (Delta2)   TLI (rho2)   CFI
CCFB    0.949          0.897        0.955          0.909        0.955
Model Fit: CMIN/DF 8.021, GFI 0.971, AGFI 0.913
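The fit indexes reported in tables 23, 26, 29 and 32 can be compared against the
0.90 level used in this chapter for IFI and CFI. The sketch below lists, for each
measurement model, any index that does not exceed 0.90. Applying the same cut-off to
NFI, RFI and TLI is an assumption made for illustration, and the function name is
hypothetical.

```python
# Fit indexes as reported in tables 23, 26, 29 and 32.
fit = {
    "MAT":  {"NFI": 0.976, "RFI": 0.952, "IFI": 0.981, "TLI": 0.961, "CFI": 0.981},
    "CBB":  {"NFI": 0.934, "RFI": 0.912, "IFI": 0.945, "TLI": 0.926, "CFI": 0.945},
    "ECCS": {"NFI": 0.987, "RFI": 0.960, "IFI": 0.989, "TLI": 0.966, "CFI": 0.989},
    "CCFB": {"NFI": 0.949, "RFI": 0.897, "IFI": 0.955, "TLI": 0.909, "CFI": 0.955},
}

def not_above_cutoff(indices, cutoff=0.90):
    """Names of the indexes that do not exceed the cut-off, in alphabetical order."""
    return sorted(name for name, value in indices.items() if value <= cutoff)

flagged = {model: not_above_cutoff(indices) for model, indices in fit.items()}
print(flagged)
```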
Figure 13: Final SEM model as originally drawn using AMOS software
[Path diagram: latent variables mat (indicators Mat 1 to Mat 5, error terms e10 to
e14), cbb (indicators CB1 to CB9, error terms e20 to e28), eccs (indicators ECCS1,
ECCS2, ECCS3 and ECCS5, error terms e30 to e33), ccfb (indicators CCFB1, CCFB2,
CCFB3 R, CCFB5 and CCFB6, error terms e40 to e44) and cd (indicator MCSUM, label 50),
together with the terms e1, e2, e3 and e4 and fixed path weights of 1.]