BRM (1)

MEASUREMENT AND SCALING CONCEPTS

Presented by Group 7
Ajai Govind G (191065)
Ambika Gupta (191067)
Ankit Jain (191074)
Bhanupriya Deswal (191081)
Chandrika Mittal (191082)
Kulvir Singh Gill (191092)

Measurement & Scaling Techniques

Measurement is the process of assigning numbers to objects or observations.

Scaling is a procedure for assigning numbers to a property of objects in order to impart some of the characteristics of numbers to the property in question.

Concept: A generalized idea about a class of objects, attributes, occurrences, or processes (for example, age, sex, brand loyalty).

Operational definition: A definition that gives meaning to a concept by specifying the activities or operations necessary to measure it.

Rules of Measurement: A rule is an instruction to guide the assignment of a number or other measurement designation.

Scale: Any series of items arranged progressively according to value or magnitude, into which an item can be placed according to its quantification.

Measurement Scales

Nominal Scale

Uses numbers or letters to identify different objects.

Eg: A scale to measure employment status
1) Public sector
2) Private sector
3) Self employed
4) Unemployed
5) Others

A nominal scale implies no order or magnitude among the categories; the numbers are labels only.

Measure of central tendency – Mode
Statistical test – Chi-square test
Least powerful scale

Eg: Assignment of numbers to basketball players to identify them.

Ordinal Scale

Places events in a particular order.
Variables in an ordinal scale can be ranked.
It gives only the relative position of the variables.
Implies greater than or less than.
Measure of central tendency – Median
Statistical test – Non-parametric methods

Eg: Question: Please rank the following mobile telephone service providers from 1 to 5, with 1 representing the most preferred and 5 representing the least preferred.

Airtel _____
Hutch _____
Idea _____
BSNL _____
Reliance _____

Interval scale

The intervals between the points on the scale are equal, so there is an equal distance between any two adjacent points.

Eg: Interval scale with points placed at an interval of 1
[10]---[9]---[8]---[7]---[6]---[5]---[4]---[3]---[2]---[1]

More powerful than the ordinal scale.
Measure of central tendency – Mean
Measure of dispersion – Standard deviation
Statistical test – t-test, F-test

Ratio Scale

Has an absolute zero point of measurement as well as equal intervals.
Allows two values measured on the scale to be compared as ratios.
Represents actual amounts of the variable.
This is the most precise type of scale.
Can be subjected to any type of mathematical operation.

Eg: Age, weight, money and height are common ratio scales.
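The summary statistics named for each scale level above can be sketched in a few lines. This is a minimal illustration using Python's standard `statistics` module; all of the sample data below is hypothetical and invented only for the example.

```python
from statistics import mean, median, mode

# Hypothetical sample data, invented for illustration only.
employment_status = [1, 2, 2, 4, 2, 3]   # nominal codes (1=public sector ... 5=others)
provider_ranks = [1, 3, 2, 5, 4, 2, 1]   # ordinal: preference ranks from 1 to 5
ratings = [7, 8, 6, 9, 10]               # interval: points on a 1-10 scale
weights_kg = [60.0, 75.0, 90.0]          # ratio: an absolute zero exists

nominal_summary = mode(employment_status)   # most frequent category code
ordinal_summary = median(provider_ranks)    # middle rank
interval_summary = mean(ratings)            # arithmetic mean

# Ratios such as "1.5 times as heavy" are meaningful only on a ratio scale.
ratio_comparison = weights_kg[2] / weights_kg[0]
```

The point of the sketch is the pairing of statistic and scale level: taking a mean of the nominal codes would run without error but would have no meaning.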

Scale construction decisions

What level of data is involved (nominal, ordinal, interval, or ratio)?
What will the results be used for?
What types of statistical analysis would be useful?
Should you use a comparative scale or a noncomparative scale?
How many scale divisions or categories should be used (1 to 10; 1 to 7; -3 to +3)?
Should there be an odd or even number of divisions? (Odd gives a neutral center value; even forces respondents to take a non-neutral position.)
What should the nature and descriptiveness of the scale labels be?
What should the physical form or layout of the scale be (graphic, simple linear, vertical, horizontal)?
Should a response be forced or be left optional?

Sources of Measurement Problems/Errors

1. Respondent associated errors

Non-response errors:
Failure to respond at all (unit non-response)
Failure to respond to one or more questions (item non-response)

Reasons for non-response:
Lack of knowledge
Doesn’t want to answer

Response bias:
When respondents consciously or unconsciously misrepresent the truth.

2. Instrument associated errors

Due to poor questionnaire design or improper selection of samples.
Inadequate space for registering the answers in the questionnaire
Ambiguous questions – confusion for respondents
Complicated words & sentences – misinterpretation

3. Situational errors

No proper response if a third person is present during the interview
Location of interview – public places – lack of response
No assurance of data confidentiality

4. Measurer as a source

Body language & gestures of the interviewer discourage certain responses.
Failing to record the full response of the respondent
Inappropriate coding & tabulation
Irrelevant statistical tools

Bases for classification of scales

Subject orientation
A scale may be designed to measure the characteristics of the respondent, or to judge the stimulus object presented to the respondent.
In the latter case the respondent is asked to judge some specific objects in terms of one or more dimensions.

Response form
Categorical – Rating (without reference to other objects)
Comparative – Ranking (compares with other objects)

Degree of subjectivity
Subjective personal preference – choose which person he favours, which solution he likes.
Non-preference judgments – judge which solution will take fewer resources.

Scale properties
Based on the scale the researcher chooses (nominal, ordinal, etc.)

Number of dimensions
Unidimensional scales – measure only one attribute.
Multidimensional scales – measure more than one attribute.

Scale construction Approaches

Arbitrary approach
Scale is developed on an ad hoc basis.
Most widely used approach.

Consensus approach (Thurstone Scale)
A panel of judges evaluates the items chosen.

Item analysis approach (Likert Scale)
Individual items are tested by a group of respondents.
Total scores are calculated.
Items are analysed on the basis of their degree of discrimination.

Cumulative scales (Guttman’s Scalogram)
Items conform to some ranking in ascending or descending order.

Factor scales (Semantic Differential Scale)
Built on the basis of the intercorrelations of items to identify the common factors.
Factor analysis is used.

Important Scaling Techniques

Rating Scale
Ranking Scale

Rating Scale

Qualitative description of a limited number of aspects.
Judge in terms of specific criteria, for example:
Like --- Dislike
Above average, average, below average

3 to 7 point scales are used.
The more points on the rating scale, the greater its sensitivity.

A Classification of Noncomparative Rating Scales

Noncomparative Rating Scales
    Continuous Rating Scales
    Itemized Rating Scales
        Likert
        Semantic Differential
        Stapel

Noncomparative Rating Techniques

Respondents evaluate only one object at a time, and for this reason noncomparative scales are often referred to as monadic scales.

Noncomparative techniques consist of continuous and itemized rating scales.

Rating Scale Types

Graphical Rating / Continuous Rating Scale
Points are put on a continuum.
Respondent indicates the rating with a tick mark.

Like Very Much
Like Somewhat
Neutral
Dislike Somewhat
Dislike Very Much

Continuous/Graphic Rating Scale

Respondents rate the objects by placing a mark at the appropriate position on a line that runs from one extreme of the criterion variable to the other.

The form of the continuous scale may vary considerably.

How would you rate Bigbazaar as a department store?

Version 1
Probably the worst - - - - - - -I - - - - - - - - - - - - - - - - - - - - - - - - - - - Probably the best

Version 2
Probably the worst - - - - - - -I - - - - - - - - - - - - - - - - - - - - - - - - - - - Probably the best
0    10    20    30    40    50    60    70    80    90    100

Version 3
                   Very bad          Neither good nor bad          Very good
Probably the worst - - - - - - -I - - - - - - - - - - - - - - - - - - - - - - - - - - - Probably the best
0    10    20    30    40    50    60    70    80    90    100

Rating Scale Types

Itemized Rating
Presents a series of statements.
Respondent selects the statement that best reflects his or her evaluation.

He is always involved in some friction with his fellow workers
He is often at odds with one or more of his fellow workers
He sometimes gets involved in friction
He infrequently becomes involved in friction with others
He almost never gets involved in friction with his fellow workers

Itemized Rating Scales

The respondents are provided with a scale that has a number or brief description associated with each category.

The categories are ordered in terms of scale position; and the respondents are required to select the specified category that best describes the object being rated.

The commonly used itemized rating scales are the Likert, semantic differential, and Stapel scales.

Likert Scale (Summated Scale)

Evaluates each item on its ability to discriminate between those with high score and those with low score

Respondent indicates degree of agreement or disagreement with the statements in the instrument

Each response is given a numerical score, indicating favourableness or unfavourableness and total score represents the attitude

Likert Scale

The Likert scale requires the respondents to indicate a degree of agreement or disagreement with each of a series of statements about the stimulus objects.

                                          Strongly             Neither agree            Strongly
                                          disagree   Disagree  nor disagree    Agree    agree
1. Sears sells high quality merchandise.     1          2X          3            4        5
2. Sears has poor in-store service.          1          2X          3            4        5
3. I like to shop at Sears.                  1          2           3X           4        5

The analysis can be conducted on an item-by-item basis (profile analysis), or a total (summated) score can be calculated.
When arriving at a total score, the categories assigned to the negative statements by the respondents should be scored by reversing the scale.
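The summated score with reverse scoring can be sketched as follows. This is a minimal illustration, not a standard library routine: the function name `summated_score` and its parameters are hypothetical, and the responses are the X marks from the Sears example, with item 2 treated as the negative statement.

```python
def summated_score(responses, negative_items, scale_max=5):
    """Sum item scores, reversing the scale for negatively worded items."""
    total = 0
    for item, score in responses.items():
        if not 1 <= score <= scale_max:
            raise ValueError(f"score {score} for item {item} is off the scale")
        # Reverse scoring maps 1<->5 and 2<->4 on a 5-point scale.
        total += (scale_max + 1 - score) if item in negative_items else score
    return total

# Responses marked X in the Sears example: items 1 and 2 -> 2, item 3 -> 3.
sears = {1: 2, 2: 2, 3: 3}
total = summated_score(sears, negative_items={2})
print(total)  # 2 + (6 - 2) + 3 = 9
```

Without the reversal, disagreeing with "poor in-store service" would lower the total even though it expresses a favourable attitude.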

Likert Scale Construction

1. Identify the attitudinal object and delimit it quite specifically.
2. Compose a series of statements about the attitudinal object that are half positive and half negative and are not extreme, ambiguous, or neutral.
3. Establish (at a minimum) content validity with the help of an expert panel.
4. Pilot test the statements to establish reliability for each domain.
5. Eliminate statements that negatively affect internal consistency.
6. Construct the final scale by using the fewest number of items while still maintaining validity and reliability; create a balance of positive and negative items.
7. Administer the scale and instruct respondents to indicate their level of agreement with each statement.
8. Sum each respondent’s item scores to determine attitude.

Likert Scale (Multi Item) - Example

                                          Strongly           Neither Agree              Strongly
                                          Agree      Agree   Nor Disagree    Disagree   Disagree
1. Bigbazaar is an attractive store.         1         2          3             4          5
2. The service at Bigbazaar is slow.         1         2          3             4          5
3. Bigbazaar has attractive prices.          1         2          3             4          5

Likert Scale (Summated Scale)

Advantages
Easier to construct than the Thurstone Scale
Can be built without a panel of judges
More reliable, as it considers each item statement and respondent

Limitations
Shows only differences in attitude and does not quantify them

Semantic Differential Scale

The semantic differential is a seven-point rating scale with end points associated with bipolar labels that have semantic meaning.

RELIANCE IS:
Powerful    --:--:--:--:-X-:--:--:  Weak
Unreliable  --:--:--:--:--:-X-:--:  Reliable
Modern      --:--:--:--:--:--:-X-:  Old-fashioned

The negative adjective or phrase sometimes appears at the left side of the scale and sometimes at the right.

This controls the tendency of some respondents, particularly those with very positive or very negative attitudes, to mark the right- or left-hand sides without reading the labels.

Individual items on a semantic differential scale may be scored on either a -3 to +3 or a 1 to 7 scale.
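Because the favourable adjective switches sides from item to item, each mark has to be oriented before scoring. The sketch below is one way to do that, under two stated assumptions: a higher score is taken to mean a more favourable evaluation, and "Powerful", "Reliable" and "Modern" are treated as the favourable poles. The helper names `score_item` and `to_signed` are hypothetical.

```python
def score_item(position, positive_on_left, points=7):
    """Return a 1..points score where a higher score is more favourable."""
    if not 1 <= position <= points:
        raise ValueError("position is off the scale")
    # If the favourable adjective is on the left, reverse the position.
    return points + 1 - position if positive_on_left else position

def to_signed(score, points=7):
    """Convert a 1..7 score to the equivalent -3..+3 score."""
    return score - (points + 1) // 2

# Positions of the X marks in the Reliance example, counted from the left.
marks = [
    ("Powerful / Weak", 5, True),
    ("Unreliable / Reliable", 6, False),
    ("Modern / Old-fashioned", 7, True),
]
scores = [score_item(pos, left) for _, pos, left in marks]
signed = [to_signed(s) for s in scores]
```

Either scoring range carries the same information; the signed form simply places the midpoint of the scale at zero.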

Semantic Differential Procedure

Identify the concept to be measured.

Generate a list of approximately 7 or 8 bipolar adjectives with a number of positions between each pair. (Subjects lose focus after 8.)

Administer the scale and instruct respondents to identify where, on the continuum between the two adjectives, their beliefs about the concept lie.

The spaces or positions between the adjectives become categories with a numerical value (e.g. 1=unfavorable and 6=favorable) and responses are summed to determine attitude.

Semantic Differential Scale - Example

Service is discourteous       1…2…3…4…5…6…7   Service is courteous
Location is convenient        1…2…3…4…5…6…7   Location is inconvenient
Hours are inconvenient        1…2…3…4…5…6…7   Hours are convenient
Loan interest rates are high  1…2…3…4…5…6…7   Loan interest rates are low

Stapel Scale

The Stapel scale is a unipolar rating scale with ten categories numbered from -5 to +5, without a neutral point (zero). This scale is usually presented vertically.

           Bigbazaar

     +5               +5
     +4               +4
     +3               +3
     +2               +2X
     +1               +1
HIGH QUALITY     POOR SERVICE
     -1               -1
     -2               -2
     -3               -3
     -4X              -4
     -5               -5

The data obtained by using a Stapel scale can be analyzed in the same way as semantic differential data.

Ranking Scale/Comparative scale

Respondents make comparative/relative judgments.

Approaches
Method of paired comparison
Method of rank order

Paired Comparisons

Description - Paired comparison scales ask a respondent to pick one of two objects from a set based upon a given criterion

Example - In each of the following pairs, which attribute is more important to you while selecting a toothpaste?

a. Fights decay             b. Affordable
a. Affordable               b. Longer germ protection
a. Longer germ protection   b. Fights decay
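One common way to turn paired choices into an overall order is to count how often each attribute is chosen. The sketch below assumes that approach; the respondent's choices are hypothetical, and only the three attribute names come from the toothpaste example.

```python
from collections import Counter
from itertools import combinations

attributes = ["Fights decay", "Affordable", "Longer germ protection"]

# Every attribute is paired once with every other: n(n-1)/2 pairs in all.
pairs = list(combinations(attributes, 2))

# Hypothetical choices by one respondent, one winner per pair.
choices = {
    ("Fights decay", "Affordable"): "Fights decay",
    ("Fights decay", "Longer germ protection"): "Fights decay",
    ("Affordable", "Longer germ protection"): "Longer germ protection",
}

# Start every attribute at zero so that never-chosen ones still appear.
wins = Counter({name: 0 for name in attributes})
for pair in pairs:
    wins[choices[pair]] += 1

ranking = [name for name, _ in wins.most_common()]
```

With 3 attributes there are only 3 pairs, but 10 attributes would already need 45, which is why the number of pairs limits how many objects this method can handle comfortably.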

Rank-Order Scale

Description - respondent is asked to judge one item against another.

Example - Rank the following brands of cereal according to your preference (1=most preferred).

__ Kellogg’s Corn Flakes
__ Rice Krispies
__ Wheaties
__ Kellogg’s Raisin Bran
...

Other Types

Constant Sum Scale

This technique requires the respondent to divide a given number of points, typically 100, among two or more attributes based on their importance

Constant sum scales are used more often than paired comparisons because the long list of paired items is avoided

Characteristics of a supermarket        Number of points

It is conveniently located              _____
Sales persons are cooperative           _____
The ambience is pleasing                _____
Parking facility is adequate            _____

Total                                   100 points
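A constant sum response is only usable if the points actually add up to the stated total. The sketch below checks that and converts the points to shares; the function name `constant_sum_shares` and the allocation shown are hypothetical.

```python
def constant_sum_shares(allocation, total=100):
    """Check that the points add up to `total` and return each share."""
    if any(points < 0 for points in allocation.values()):
        raise ValueError("points cannot be negative")
    if sum(allocation.values()) != total:
        raise ValueError(f"points must add up to {total}")
    return {attr: points / total for attr, points in allocation.items()}

# Hypothetical allocation by one respondent.
answer = {
    "Conveniently located": 40,
    "Sales persons are cooperative": 25,
    "Ambience is pleasing": 15,
    "Parking facility is adequate": 20,
}
shares = constant_sum_shares(answer)
```

Rejecting responses that do not sum correctly is one possible policy; a researcher might instead rescale them, which would be a separate design decision.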

Cumulative Scale /Guttman’s Scalogram Scale

Here the respondent checks each item with which they agree.
The items are constructed so that they are automatically cumulative: if you agree to one, you probably agree to all of the ones above it on the list.
Can be a good way to gauge how people feel about controversial topics.
Requires care when writing so that it doesn’t seem leading.

Example: Please check each statement that you agree with:
__ Willing to permit immigrants to live in the U.S.
__ Willing to permit immigrants to live in your community.
__ Willing to permit immigrants to live in your neighborhood.
__ Willing to have an immigrant as a next door neighbor.
__ Willing to let your child marry an immigrant.
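The cumulative property can be checked mechanically for each respondent. The sketch below assumes the statements are ordered from the easiest to agree with to the hardest, as in the immigration example; the function name `guttman_score` and the two response patterns are hypothetical.

```python
def guttman_score(checked):
    """Return (score, is_cumulative) for one respondent.

    `checked` holds True/False per statement, ordered from the easiest
    statement to agree with to the hardest.
    """
    score = sum(checked)
    # A perfectly cumulative pattern is a run of agreements followed
    # only by disagreements, e.g. [True, True, False, False, False].
    expected = [i < score for i in range(len(checked))]
    return score, list(checked) == expected

print(guttman_score([True, True, True, False, False]))   # (3, True)
print(guttman_score([True, False, True, False, False]))  # (2, False)
```

For a cumulative pattern the score alone tells you exactly which statements were checked; a non-cumulative pattern is a sign that the items do not form a clean single ordering for that respondent.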

Differential Scale (Thurstone Scale)

Uses the consensus approach.
Method used in measuring attitude on a single dimension.
Used to measure attitudes on issues like war, religion, etc.

Thurstone Scales

Items are formed (80 to 100).
Items are given to a group of judges.
The panel of experts assigns a value from 1 to 11 to each item, according to how favourable or unfavourable the judges find it.
All items on which there is consensus are selected; other items are eliminated.
Mean or median scores are calculated for each item.
Attitude comparison is made on the basis of this median.
It is a time consuming method.

Thurstone Scales - Example

Please check the item that best describes your level of willingness to try new tasks:
I seldom feel willing to take on new tasks (1.7)
I will occasionally try new tasks (3.6)
I look forward to new tasks (6.9)
I am excited to try new tasks (9.8)
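The two calculations in the procedure, the scale value of an item and the score of a respondent, can be sketched as below. The median is used for both, which is one of the options the procedure mentions. The judges' ratings are hypothetical; the scale values 1.7 to 9.8 are the ones shown in the example, and the function names are illustrative.

```python
from statistics import median

def item_scale_value(judge_ratings):
    """Scale value of an item: median of the judges' 1-11 ratings."""
    return median(judge_ratings)

def respondent_score(endorsed_values):
    """Attitude score: median scale value of the items the respondent checked."""
    return median(endorsed_values)

# Hypothetical ratings from five judges for one candidate item.
value = item_scale_value([3, 4, 4, 5, 3])

# Scale values taken from the example above.
scale_values = {
    "I seldom feel willing to take on new tasks": 1.7,
    "I will occasionally try new tasks": 3.6,
    "I look forward to new tasks": 6.9,
    "I am excited to try new tasks": 9.8,
}
checked = ["I look forward to new tasks"]
score = respondent_score([scale_values[s] for s in checked])
```

When the respondent checks a single item, as the example asks, the score is simply that item's scale value.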

Balanced and Unbalanced Scales

Balanced Scale

Surfing the Internet is
____ Extremely Good
____ Very Good
____ Good
____ Bad
____ Very Bad
____ Extremely Bad

Unbalanced Scale

Surfing the Internet is
____ Extremely Good
____ Very Good
____ Good
____ Somewhat Good
____ Bad
____ Very Bad

Criteria for good measurement

Reliability
Validity
Sensitivity
Relevance
Versatility
Ease of response

Scale Evaluation

Reliability: Test-Retest, Alternative Forms, Internal Consistency
Validity: Content, Criterion, Construct

Reliability

When the outcome of the measuring process is reproducible, the measuring instrument is reliable.
Eg: If a coffee vending machine dispenses the same quantity of coffee every time, then the measurement of the coffee vending machine is reliable.

Ability to obtain similar results by measuring an object, trait or construct with independent but comparable measures.
Example: Do both CAT and MAT scores measure the candidate’s performance?

Reliability can be defined as the degree to which the measurements of a particular instrument are free from errors and as a result produce consistent results.

Evaluation of reliability

1. Test-retest reliability
If the result of a research study is the same when it is conducted a second or third time, this confirms its repeatability.
Eg: If 40% of the population say that they do not watch movies, and the result is the same when the research is repeated after some time, then the measurement process is said to be reliable.

2. Split-half method
The researcher divides the results obtained into two halves and then checks one half of the scale items against the other half.

3. Internal consistency
When the data give the same results even after some manipulations.
Eg: After a research result is obtained for a particular study, the result can be split into two parts and the result of one part tested against the result of the other; if they are consistent, then the measure is reliable.
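Checking one half of the items against the other is usually done by correlating the two half-scores across respondents. The sketch below assumes an odd/even split of the items and a Pearson correlation, neither of which is prescribed above; the response data is hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def split_half(responses):
    """Correlate odd-item totals with even-item totals across respondents."""
    odd = [sum(r[0::2]) for r in responses]
    even = [sum(r[1::2]) for r in responses]
    return pearson(odd, even)

# Hypothetical data: four respondents, four items scored 1-5.
data = [
    [1, 2, 1, 2],
    [2, 2, 3, 3],
    [4, 4, 4, 5],
    [5, 5, 5, 4],
]
r_half = split_half(data)
```

A correlation close to 1 suggests that the two halves rank respondents consistently. Note that `pearson` divides by zero if either half has no variation, so real data would need that case handled.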

Validity

The ability of a scale or measuring instrument to measure what it is intended to measure is termed the validity of the measurement.
Eg: Measuring employee morale on the basis of absenteeism alone may not be valid.

Tests for validity

1. Face validity
Collective agreement of the experts and researchers on the validity of the measurement scale.
Weakest form of validity.

2. Criterion-related validity
The degree to which a measurement instrument can analyze a variable that is said to be a criterion.
If a new method is developed, one has to ensure that it correlates with other measures of the same construct.
Eg: The length of an object is measured with a tape measure, calipers or a ruler; if a new technique is developed, one has to ensure that this new measure correlates with the other measures of length.

Types
Predictive validity – The extent to which the future level of a criterion variable can be predicted by the current measurement on a scale.
Eg: A scale measuring the future occupancy of an apartment.

Concurrent validity – Concerns the relationship between the predictor variable & the criterion variable.
Both the predictor variable & the criterion variable are measured on the same scale.
A measure is used to predict something assessed at the same point in time.

3. Construct validity
The degree to which a measurement instrument represents & logically connects with the underlying theory.
It assesses the underlying aspects relating to behaviour.
It measures why a person behaved in a certain way rather than how he has behaved.
Assessment of how well the instrument captures the construct, concept, or trait it is supposed to be measuring.

Sensitivity

Sensitivity refers to an instrument’s ability to accurately measure variability in stimuli or responses.
Sensitivity is not high in instruments offering only ‘Agree’ or ‘Disagree’.
It will be higher with more categories, e.g. ‘Strongly agree, mildly agree, mildly disagree, none of the above’.

Generalizability
The amount of flexibility in interpreting the data in different research designs.

Relevance
Appropriateness of using a particular scale.

Examples Of Category (Itemized) Rating Scales

1. Balanced, forced-choice, odd-interval scale focusing on an attitude toward a specific attribute
How do you like the taste of Classic Coke?
___ Like It Very Much
___ Like It
___ Neither Like Nor Dislike It
___ Dislike It
___ Strongly Dislike It

2. Balanced, forced-choice, even-interval scale focusing on an overall attitude
Overall, how would you rate Ultra Brite Toothpaste?
___ Extremely Good
___ Very Good
___ Somewhat Good
___ Somewhat Bad
___ Very Bad
___ Extremely Bad

3. Unbalanced, forced-choice, odd-interval scale focusing on an overall attitude
What is your reaction to this advertisement?
___ Enthusiastic
___ Very Favorable
___ Favorable
___ Neutral
___ Unfavorable

4. Balanced, non-forced, odd-interval scale focusing on a specific attribute
How would you rate the friendliness of the sales personnel at Sears’ downtown store?
__ Very Friendly
__ Moderately Friendly
__ Slightly Friendly
__ Neither Friendly Nor Unfriendly
__ Slightly Unfriendly
__ Moderately Unfriendly
__ Very Unfriendly
__ Don’t Know
