PS 366 3. Measurement Related to reliability, validity: Bias and error – Is something wrong with...

Post on 14-Dec-2015


PS 366

3

Measurement

• Related to reliability, validity:

• Bias and error

– Is something wrong with the instrument?

– Is something up with the thing being measured?

Measurement

• Bias & error with the instrument

– Random?

– Systematic?

Measurement

• Bias & error with the thing being measured

– Random?

• failure to understand a survey question

– Systematic?

• does the person have something to hide?

Measurement

• Example:

– Reliability, validity, error & bias in measuring unemployment

– Census survey [also hiring reports, claims filed w/ government, state data to feds...]

– What sources of bias?

Measurement

• Unemployment [employment status]:

– Fully employed

– Part time

– looking for work, + part time

– looking for work, no job

– lost job, not looking for work

– retired

Measurement

• Example:

– Reliability, validity, error & bias in measuring victims of violent crime

– Census surveys, police records, FBI UCR

– What sources of bias?

Measurement

• How do we ask people questions about attitudes and behavior that aren’t socially accepted?

– prejudice

– Racism

– Feelings toward gays & lesbians

– shoplifting

Measurement: Item Count Technique

• Here are 3 things that sometimes make people angry or upset. After reading these, record how many of them upset you. Not which ones, just how many?

• federal govt increasing the gas tax

• professional athletes getting million dollar salaries

• large corporations polluting the environment

Measurement: Item Count Technique

3 item list:

• federal govt increasing the gas tax

• professional athletes getting million dollar salaries

• large corporations polluting the environment

4 item list (adds the sensitive item):

• federal govt increasing the gas tax

• professional athletes getting million dollar salaries

• large corporations polluting the environment

• a black family moving next door

Measurement: Item Count Technique

• Randomly assign ½ of subjects to the 3 item list

• Randomly assign ½ of subjects to the 4 item list

• Difference in mean # of items reported between groups = % upset by sensitive item

– (4 item list mean – 3 item list mean) * 100 = %

Item Count

               Control   Treatment   % upset
• Non South    2.28      2.24        0
• South        1.95      2.37        42

2.37 – 1.95 = 0.42; 0.42 * 100 = 42%
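The South row can be checked with a short Python sketch. The function name `item_count_estimate` is hypothetical (it does not come from the slides); it only applies the difference-in-means rule, taking the 4 item list mean minus the 3 item list mean.

```python
def item_count_estimate(control_mean, treatment_mean):
    """Estimated % of respondents upset by the sensitive item.

    control_mean:   mean number of items reported on the 3 item list
    treatment_mean: mean number of items reported on the 4 item list
    """
    return (treatment_mean - control_mean) * 100


# Group means taken from the Item Count table.
south = item_count_estimate(control_mean=1.95, treatment_mean=2.37)
non_south = item_count_estimate(control_mean=2.28, treatment_mean=2.24)

print(round(south))      # about 42
print(round(non_south))  # slightly negative
```

The Non South difference comes out slightly below zero; the table reports it as 0, presumably because a negative share of respondents is not meaningful.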

Item Count – Using poll information

3 item list:

• 1) The candidate graduated from a prestigious college

• 2) The candidate ran a business

• 3) The candidate’s family background

4 item list (adds the poll item):

• 1) The candidate graduated from a prestigious college

• 2) The candidate ran a business

• 3) The candidate’s family background

• 4) The candidate is ahead in polls

Use poll info

• Columns: Control, Treatment, % use poll

• All 2.28 1.36 1.39 3.2

• Young 1.04 1.46 41

• Is it significant?

– Depends... how well does the mean reflect the group? How much variation is there around the mean?

Central Tendency

• Statistics that describe the ‘average’ or ‘typical’ value of a variable

– Mean

– Median

– Mode

Central Tendency

• Why median vs. mean?

– Household income

– Home prices

Median vs Mean HH Income

  median    mean
• 60,667    63,809
• 49,847    61,187
• 66,875    74,653
• 67,005    71,443
• 45,735    66,662
• 63,472    73,648
• 44,891    60,250
• 50,262    59,688
• 39,930    60,495
• 65,885    80,581
• 76,917    85,837
• 61,146    64,526
• 56,815    59,781
• 62,244    78,289

Median vs Mean Price

• Seattle Median $400K

• Seattle Mean higher!

Central Tendency

• Mean

– 125, 92, 72, 126, 120, 99, 130, 100

– sum = 864

• mean = sum X / N

• = 864 / 8

• mean = 108

• Is this representative?

Central Tendency

• Mean

– 125, 92, 300, 126, 120, 99, 130, 100

– sum = 1092

• mean = sum X / N

• = 1092 / 8

• = 136.5

• Is this representative?

Central Tendency

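• Median

– 300, 130, 126, 125, 120, 100, 99, 92

• median position = (N + 1) / 2

– (8 + 1) / 2

– 9 / 2

– 4.5th value, halfway between the 4th and 5th (125, 120)

– median = (125 + 120) / 2 = 122.5

• Is this representative?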
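To see how a single extreme value moves the mean more than the median, the two sets of eight values from the mean slides can be compared with Python's standard `statistics` module. This is a minimal sketch; the variable names are illustrative and not from the slides.

```python
from statistics import mean, median

# The eight values from the first mean slide.
scores = [125, 92, 72, 126, 120, 99, 130, 100]

# The same values with 72 replaced by 300, as on the second mean slide.
with_outlier = [125, 92, 300, 126, 120, 99, 130, 100]

print("mean:  ", mean(scores), "->", mean(with_outlier))
print("median:", median(scores), "->", median(with_outlier))

# How far each statistic moved when the one value changed.
mean_shift = mean(with_outlier) - mean(scores)
median_shift = median(with_outlier) - median(scores)
print("shift in mean:", mean_shift, " shift in median:", median_shift)
```

Changing one observation moves the mean from 108 to 136.5, while the median moves from 110 to 122.5, a smaller shift.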

Central Tendency

• Example

– $120,000, $60,000, $40,000, $40,000, $30,000, $30,000, $30,000

• Mean = $50,000 Mdn = $40,000 Mo = $30,000

• Which is most representative?
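The three measures for this income example can be reproduced with the standard `statistics` module. The sketch below reads the first value as $120,000, which is what the stated mean of $50,000 implies; the variable names are illustrative.

```python
from statistics import mean, median, mode

# Seven incomes from the example slide (first value read as 120,000).
incomes = [120_000, 60_000, 40_000, 40_000, 30_000, 30_000, 30_000]

income_mean = mean(incomes)      # sum of incomes divided by 7
income_median = median(incomes)  # middle (4th) value when sorted
income_mode = mode(incomes)      # most frequent value

print(income_mean, income_median, income_mode)
```

The one high income pulls the mean above both the median and the mode, which is the pattern behind the household income and home price comparisons.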

The Distribution

• Where is mean, median, mode if

– Normal

– Left skew

– Right skew

Variation

• How are observations distributed around the central point?

• Is there one central point, or more than one?

– unimodal

– bimodal

Variation

• Which is unimodal, which is bimodal:

– Mass public ideology

• V. con, con, moderate, lib, v. lib

– Members of Congress ideology

– What does the mean mean?

Distribution

• How spread out are the observations?

• Single peak

– not much variation

• Flat?

– lots of variation; what does the mean mean?

Variation

• Standard deviation

• Information about variation around the mean


Variation

• Mean

– 125, 92, 72, 126, 120, 99, 130, 100

– mean = 108

Variance = sum of squared distances of each observation from the mean, divided by the # of observations

Variance

• Mean

– 125, 92, 72, 126, 120, 99, 130, 100

– mean = 108

(x - mean)

125 - 108
92 - 108
72 - 108
126 - 108
120 - 108
99 - 108
130 - 108
100 - 108

Variance

• Mean

– 125, 92, 72, 126, 120, 99, 130, 100

– mean = 108

  x      (x - mean)   (x - mean)^2
  125     17           289
  92     -16           256
  72     -36          1296
  126     18           324
  120     12           144
  99      -9            81
  130     22           484
  100     -8            64

sum sqs = 2938

Variance & Std. deviation

• Variance does not tell us much on its own (it is in squared units)

• mean = 108

• variance = 2938 / 8

• = 367.25

• Standard deviation = square root of variance

• sd = sqrt 367.25

• = 19.2
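The variance and standard deviation steps can be traced in a few lines of Python. This is a sketch that follows the slides in dividing by the number of observations N (the population formula) rather than N - 1; the variable names are illustrative.

```python
from math import sqrt

values = [125, 92, 72, 126, 120, 99, 130, 100]
n = len(values)

# Step 1: the mean.
mean = sum(values) / n

# Step 2: distance of each observation from the mean.
deviations = [x - mean for x in values]

# Step 3: square each distance and add them up.
sum_sq = sum(d ** 2 for d in deviations)

# Step 4: divide by N to get the variance, as on the slide.
variance = sum_sq / n

# Step 5: the standard deviation is the square root of the variance.
sd = sqrt(variance)

print(mean, sum_sq, variance, round(sd, 1))
```

Taking the square root puts the result back in the original units of the variable, which is why the standard deviation is easier to interpret than the variance.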

Variation

• Range ( lo – hi)

• Variance (sum of squared distances from mean) / n

• Standard Deviation

• Bigger # for each = more variation

• Standard Deviation

• expresses variation around the mean in ‘standardized’ units

• Bigger # = more variation

• Allow us to compare apples to oranges

Standard Deviation

• Total convictions

– mean = 178, s.d. = 199.7

• Per capita convictions (per 10,000 people)

– mean = .357, s.d. = .197

Standard Deviation

Low s.d. relative to mean

High s.d. relative to mean

Standard Deviation

Distribution of total convictions: mean 187; s.d. 199

Standard Deviation

Mean .357, s.d. .197

Standard Deviation

Turnout by state: mean = .62 ; s.d. = .07

Standard Deviation

• Tells even more if distribution is ‘normal’

• If data are interval

• What about a state that has 50% turnout, and .7 corruption convictions per 10,000?

• Where are they in each distribution?

Standard Deviation

Mean .357, s.d. .197

X

Standard Deviation

Turnout by state: mean = .62 ; s.d. = .07

X

Standard Deviation & z-scores

• State’s position on turnout = z

– z = (score – mean) / s.d.

– = (.50 - .61) / .07

– = -.09 / .07 = -1.28

1.28 standard deviations below mean on turnout

Standard Deviation & z-scores

• State’s position on corruption = z

– z = (score – mean) / s.d.

– = (.70 - .35) / .19

– = +.35 / .19 = +1.84

1.84 standard deviations above mean on corruption
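The z-score formula is small enough to write as a one-line function. The sketch below uses the rounded corruption figures from the worked line (score .70, mean .35, s.d. .19); the function name `z_score` is illustrative.

```python
def z_score(score, mean, sd):
    """Number of standard deviations a score lies above (+) or below (-) the mean."""
    return (score - mean) / sd


# Corruption convictions per 10,000 people, using the slide's rounded inputs.
corruption_z = z_score(score=0.70, mean=0.35, sd=0.19)

print(round(corruption_z, 2))
```

Because the result is expressed in standard deviation units, the same function can be applied to turnout or any other interval variable, and the resulting z-scores can be compared directly.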

Std Dev & Normal Curve


Standard Deviation & z-scores

• Apples: Turnout -1.28

• Oranges: Corruption +1.84

• Z = 0 is mean

• Z = 3 is very rare

Z scores and Normal Curve

• How many states between mean & +1.84

• How many above 1.84

• See Appendix C in text

– below mean = 50%

– between mean and z = 1.84 = 46.7%

– beyond z = 1.84 = 3.3% [about 1.6 states if normal]

Z scores and Normal Curve

• How many states between mean & -1.28

• How many below z = -1.28

• See Appendix C in text

– above mean = 50%

– between mean and z = -1.28 = 39.9%

– beyond z = -1.28 = about 10% [about 5 states if normal]
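The areas read from a normal curve table can also be computed with `statistics.NormalDist` from the Python standard library. This is a sketch under two assumptions that are not stated in the slides: the variable is exactly normal, and there are 50 states. The helper name `areas` is illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal curve: mean 0, s.d. 1
N_STATES = 50              # assumption: 50 states


def areas(z):
    """Return (share between the mean and z, share beyond z)."""
    between = abs(std_normal.cdf(z) - 0.5)
    beyond = 0.5 - between
    return between, beyond


for z in (1.84, -1.28):
    between, beyond = areas(z)
    print(
        f"z = {z:+.2f}: between mean and z = {between:.1%}, "
        f"beyond z = {beyond:.1%}, "
        f"about {beyond * N_STATES:.1f} states beyond"
    )
```

For z = 1.84 this gives about 46.7% between the mean and z and about 3.3% beyond it. The normal curve is symmetric, so a negative z uses the same table entry as the positive value of the same size.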