
Week 4 Slides

• Conscientiousness was the most highly voted construct

• We will also give other measures
  – Protestant work ethic and turnover intentions
  – A measure of academic performance? GPA / SAT?

Validity

• Traditional: Extent to which the test measures what it claims to measure or predicts what it claims to predict.

[Diagram: a Test pointing to the Construct it claims to measure]

Historical Terms

• Construct Validity: Does it measure, or is it related to, what we would expect?

• Content Validity: Does the test accurately measure the construct?
  – Is each facet represented?

• Criterion-related Validity: Does the test predict what it is supposed to predict?
  – e.g., integrity tests predicting bad behaviors

Historical view of validity

[Diagram: validity tree under the historical view]

Validity
  – Construct Validity
    • Content Validity → specific step: CVR
    • Convergent & Discriminant → correlation & MTMM matrix
    • Factorial → reliability; factor analysis
  – Criterion-related → predictive vs. concurrent

Validity

– Current: Are the inferences based on the test score appropriate?

– The newer definition emphasizes the role of the test in relation to other variables besides the construct
  • e.g., predicting the criteria

New Conceptualization of Validity

• Validity is a unitary concept
  – There are not different types of validity

• New definition: Are the inferences I am making from a test appropriate?
  – Is it appropriate to use this test to measure this construct?
  – Is it appropriate to use these items to assess this population?
  – Is it appropriate to use this test to predict these behaviors?

New view of validity

[Diagram: validity tree under the unitary view]

Validity
  – Test Content → specific step: CVR
  – Relationship with other variables → correlation & MTMM matrix; criterion-related (predictive vs. concurrent)
  – Internal Structure → reliability; factor analysis
  – Response Process → ask, experiment, observe
  – Consequences

Sources of Evidence

Evidence based on…

1. Test Content
2. Relationships with Other Variables
3. Internal Structure
4. Response Processes
5. Consequences of Testing

Content Validity

• Extent to which items on a test are representative of the construct

• Two ways to demonstrate
  – DURING test development
  – AFTER test development

Content Validity

• During Test Development

1. Define the testing universe
   • Interview experts
   • Review old tests
   • Review the literature
   • Define the construct
   • What is the testing universe of a test that measures physical fitness?

2. Develop test specifications
   • Blueprint (see the sketch after step 4)
   • What are the content areas?
   • How many questions?

Content Validity

3. Establish a test format
   – Written test? CAT? Practical test?
   – What type of questions? MC? T/F? Matching?
   – Assessment centers?

4. Construct test questions
   – Carefully write questions according to the blueprint.
   – Make sure questions represent the content area.
   – This is done by Subject Matter Experts.
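As a minimal sketch of what a step-2 blueprint might look like in code: the facet names and item counts here are hypothetical, not ones the slides commit to.

```python
# Hypothetical blueprint: content areas mapped to item counts.
blueprint = {
    "orderliness": 5,
    "industriousness": 5,
    "self-discipline": 5,
    "dependability": 5,
}

total_items = sum(blueprint.values())
print(f"{total_items} items across {len(blueprint)} content areas")
```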

Content Validity

• After Test Development
  – Examine the extent to which subject matter experts (SMEs) agree on the content validity of the items
    • SMEs rate how essential test items are to the attribute
    • e.g., essential; useful but not essential; not necessary for success on the job
    • The Content Validity Ratio (CVR) is calculated
    • Items below a minimum CVR value are dropped

Content Validity Ratio

• Judges rate each item on a scale of importance
  – Essential; useful but not essential; not essential

CVRi = (ne – N/2) / (N/2)
  – CVRi = CVR value for item i
  – ne = number of experts saying the item is essential
  – N = total number of experts
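A minimal sketch of that computation; the helper name is my own, not from the slides.

```python
def content_validity_ratio(n_essential: int, n_experts: int) -> float:
    """Lawshe's CVR: (n_e - N/2) / (N/2)."""
    half = n_experts / 2
    return (n_essential - half) / half

# Example: 8 of 10 SMEs rate an item "essential"
print(content_validity_ratio(8, 10))  # 0.6
```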

Sources of Evidence

Evidence based on…

1. Test Content
2. Relationships with Other Variables
   A. Construct Validity
   B. Criterion-Related Validity
3. Internal Structure
4. Response Processes
5. Consequences of Testing

Multitrait-Multimethod Design

• Pick variables that are theoretically unrelated

• Measure each variable with several different types of measurement (e.g., forced choice, true/false)

• Each variable should correlate highly with other measures of the same construct (convergent validity)

Multitrait-Multimethod Design

[Figure: example multitrait-multimethod correlation matrix]

Multitrait-Multimethod Design

• Correlations between different variables measured with the same method assess method bias (using the same method inflates correlations regardless of construct)
  – These can also be evidence of discriminant validity

• Different variables should not be highly correlated, regardless of the method (discriminant validity)
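To make the convergent/discriminant logic concrete, here is a sketch over a small MTMM matrix. All correlations, trait labels, and method labels are invented for illustration.

```python
import pandas as pd

# Hypothetical MTMM correlations: conscientiousness (C) and turnover
# intentions (T), each measured by Likert (L) and forced-choice (F) methods.
labels = ["C_L", "C_F", "T_L", "T_F"]
r = pd.DataFrame(
    [[1.00, 0.65, 0.20, 0.10],
     [0.65, 1.00, 0.25, 0.15],
     [0.20, 0.25, 1.00, 0.70],
     [0.10, 0.15, 0.70, 1.00]],
    index=labels, columns=labels)

# Convergent: same trait, different method -- should be high.
print("convergent:", r.loc["C_L", "C_F"], r.loc["T_L", "T_F"])
# Discriminant: different traits -- should be low; if the same-method
# pairs (C_L with T_L, C_F with T_F) are inflated, suspect method bias.
print("discriminant:", r.loc["C_L", "T_L"], r.loc["C_F", "T_F"])
```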

Sources of Evidence

Evidence based on…

1. Test Content
2. Relationships with Other Variables
   A. Construct Validity
   B. Criterion-Related Validity
3. Internal Structure
4. Response Processes
5. Consequences of Testing

Criterion Related Validity

[Diagram: the Test relates to the Criterion Test as the Construct relates to the Criterion Construct; the test-criterion correlation is the Validity Coefficient]
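Since the validity coefficient is just the test-criterion correlation, a minimal sketch with invented scores:

```python
import numpy as np

# Hypothetical data: test scores vs. supervisor-rated performance.
test = np.array([42, 55, 38, 61, 47, 52, 35, 58])
criterion = np.array([3.1, 4.0, 2.8, 4.4, 3.5, 3.9, 2.5, 4.1])

# The validity coefficient is the Pearson correlation between them.
r = np.corrcoef(test, criterion)[0, 1]
print(f"validity coefficient r = {r:.2f}")
```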

Issues with Criterion

• Objective criterion: observable & measurable
  – e.g., how many sales an employee made
  – Less error because no subjectivity
  – Scope is narrow and does not get at motivation

• Subjective criterion: a matter of judgment
  – e.g., a supervisor's rating of performance
  – More error
  – Can take circumstances into account

Issues with Criterion

• Criterion deficiency
  – When the criterion measure does not assess every area of the testing universe of the criterion (construct)

• Criterion contamination
  – When the criterion measures extraneous variables that are not considered part of the testing universe of the criterion

Issues with Criterion

[Diagram: Venn diagrams illustrating deficiency and contamination. Panel 1: "Sales" vs. the job performance of a retailer (customer service, successful sales, dependability). Panel 2: job performance vs. a Sales & Management Test and Jeremy's Intelligence Test]

• Concurrent study
  – Give the test to current employees → performance
  – Little evidence of causality

• Predictive study
  – Give the test to applicants → performance
  – Range restriction
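Range restriction shrinks the observed correlation because only high scorers were hired. One standard fix, not named on the slides, is Thorndike's Case II correction for direct range restriction; a sketch with hypothetical numbers:

```python
import math

def correct_range_restriction(r: float, sd_restricted: float,
                              sd_unrestricted: float) -> float:
    """Thorndike Case II correction for direct range restriction."""
    u = sd_unrestricted / sd_restricted
    return (r * u) / math.sqrt(1 - r**2 + (r**2) * (u**2))

# Observed r = .25 among hires whose test SD shrank from 10 to 6.
print(round(correct_range_restriction(0.25, 6, 10), 2))  # ~0.40
```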

Indirect range restriction

• If you select employees based on some criterion measure
  – Make sure it is not correlated with the test you are trying to gather evidence for
  – For instance: a personality test used to make hiring decisions would be bad to use if you are interested in the validity of an integrity test

Overlap of Predictors

[Diagram: Predictors 1-3 overlapping one another and the criterion]

Single Predictor

[Diagram: a single predictor (Jealousy) overlapping the criterion]

Overlapping Predictors

[Diagram: Jealousy, Envy, and Anger overlapping one another and the criterion]
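A simulation of the Venn-diagram idea: when envy and anger mostly overlap jealousy, adding them barely raises R². All data and coefficients below are invented.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
jealousy = rng.normal(size=n)
envy = 0.8 * jealousy + 0.6 * rng.normal(size=n)    # overlaps jealousy
anger = 0.7 * jealousy + 0.7 * rng.normal(size=n)   # overlaps jealousy
criterion = jealousy + 0.1 * envy + rng.normal(size=n)

def r_squared(predictors, y):
    # OLS R^2 with an intercept column.
    X = np.column_stack([np.ones(len(y))] + predictors)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

print(f"jealousy alone: R2 = {r_squared([jealousy], criterion):.2f}")
print(f"all three:      R2 = {r_squared([jealousy, envy, anger], criterion):.2f}")
```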

Sources of Evidence

Evidence based on…

1. Test Content
2. Relationships with Other Variables
   A. Construct Validity
   B. Criterion-Related Validity
3. Internal Structure
4. Response Processes
5. Consequences of Testing

Evidence from Internal Structure

• Test responses should follow the format that is theoretically expected
  – If the test is thought to be increasingly more difficult, there should be evidence to support that claim
  – If the test is thought to be homogeneous or heterogeneous, there should be evidence to support that claim
  – Items may function differently for certain sub-groups
    • Differential item functioning
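One common check on homogeneity (internal consistency) is Cronbach's alpha; a minimal sketch with an invented 5-item scale:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: respondents-by-items matrix of scores."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Hypothetical responses: 4 people answering 5 Likert items.
scores = np.array([[4, 5, 4, 4, 5],
                   [2, 2, 3, 2, 2],
                   [3, 4, 3, 3, 4],
                   [5, 5, 4, 5, 5]])
print(f"alpha = {cronbach_alpha(scores):.2f}")
```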

Sources of Evidence

Evidence based on…

1. Test Content
2. Relationships with Other Variables
   A. Construct Validity
   B. Criterion-Related Validity
3. Internal Structure
4. Response Processes
5. Consequences of Testing

Evidence from Response Process

• Ask test takers about their decision process
  – Are they using traditional or atypical strategies to answer the items?

• Monitor the response process
  – Keystroke analysis
  – Have them show their work

• This also applies to raters or judges
  – Look for evidence of consistency across judgments
    • What about consistently inaccurate judgments?

Sources of Evidence

Evidence based on…

1. Test Content
2. Relationships with Other Variables
   A. Construct Validity
   B. Criterion-Related Validity
3. Internal Structure
4. Response Processes
5. Consequences of Testing

Consequences of Testing

• Consequences of the test can help set guidelines about acceptable evidence of validity

• Tests for serious diseases
  – What are their false positive vs. false negative rates?

• What about differential item functioning (i.e., sub-group differences)?

• If a test can place you in a better job, it had better be able to prove it.

• Online or paper-and-pencil?
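For the disease-screening point, the rates in question come straight from a confusion matrix; the counts below are invented:

```python
# Hypothetical screening outcomes.
true_pos, false_neg = 90, 10    # disease present: caught vs. missed
false_pos, true_neg = 40, 860   # disease absent: flagged vs. cleared

fpr = false_pos / (false_pos + true_neg)  # healthy people flagged
fnr = false_neg / (false_neg + true_pos)  # sick people missed
print(f"false positive rate: {fpr:.1%}, false negative rate: {fnr:.1%}")
```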

Validity Generalization

• Can be done via
  – Meta-analyses
  – Synthetic validity

• Please be careful!
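A bare-bones sketch of the meta-analytic step: averaging validity coefficients across studies, weighted by sample size. The study values are invented, and a full validity generalization analysis would also correct for artifacts such as unreliability and range restriction.

```python
# Hypothetical studies: (validity coefficient, sample size).
studies = [(0.21, 120), (0.35, 80), (0.28, 200)]

total_n = sum(n for _, n in studies)
mean_r = sum(r * n for r, n in studies) / total_n
print(f"sample-size-weighted mean validity: {mean_r:.2f}")  # 0.27
```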

Class Assignment

• Define Conscientiousness
  – Use the literature and be specific

• Define the testing universe
  – What are some sub-facets that we can reasonably measure?

• Write 5 multiple-choice items that will adequately assess the testing universe

• What demographics should we include?

• What variables should we control for?

Writing Good Items

• GOOD survey questions
  – Straightforward & unambiguous
    • "How well did your student do?" is both indirect and ambiguous.
      » Better: "How well did Ted do on the last exam?"

  – Concise & specific
    • Long sentences are easily misread

  – Appropriate response options
    • "How many times in the past month have you sabotaged?"
      – All the time, occasionally, sometimes, never.
        » We should use objective frequencies here instead of subjective accounts of frequency

Writing Good Items

– Ask only one question
  • "Did you like my party last weekend? If so, please indicate how likely you will be to attend another one of my events."
  • This is double-barreled

– Easy to read
  • Make sure to use an appropriate reading level
    – Most surveys are safe at a 6th-grade level unless the sample is children