SELECTION OF MEASUREMENT INSTRUMENTS
3 WAYS TO COLLECT DATA
Administer a standardized instrument
Administer a self-developed instrument
Record naturally available data (GPA, absenteeism rates)
Measure physical performance data
Validity
The degree to which a test measures what it is supposed to measure.
Valid for what?
Valid for whom?
Types of Validity
Content validity
The test measures the intended content
Item validity: Test items measure intended content
Sampling validity: Test samples content adequately
Face (logical) validity: Test appears to measure what it is supposed to
Who Determines Validity?
NO SET TEST!
Researcher determines validity by comparing what should be included with what is included!
Types of Validity
Construct validity
The test measures the intended hypothetical construct
Construct- a non-observable trait
INTELLIGENCE
ANXIETY
RATE OF LEARNING
RPE
Types of Validity
Concurrent validity
Scores on your test are related to scores on a previously administered test, e.g., judges' ratings and tournament results
Steps
•Administer a VO2Max test
•Administer a treadmill walking test
•Correlate the two sets of test scores
•Evaluate the results
•If the correlation is high, the treadmill walking test can be substituted for the VO2Max test
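The "correlate the two sets of test scores" step can be sketched as follows. This is a minimal illustration that assumes a Pearson correlation is the intended coefficient; the function name and the score lists are hypothetical and are not data from the presentation.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary to compute a correlation")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for the same six subjects on both tests
vo2max_scores = [38.0, 42.5, 45.1, 50.3, 55.0, 47.2]
walking_test_scores = [36.5, 41.0, 46.0, 49.0, 56.2, 45.8]

validity_coefficient = pearson_r(vo2max_scores, walking_test_scores)
print(round(validity_coefficient, 2))
```

The same calculation applies whenever two sets of scores from the same group are correlated, for example the test-retest and alternate-forms procedures.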
Types of Validity
PREDICTIVE VALIDITY
Test predicts how well an individual will perform in the future
Examples
Predictive validity of GRE for graduate study
Prediction of NTE scores and success as a teacher
Prediction of population who will become obese
To Determine Predictive Validity
Steps
•Administer the GRE
•Wait until first-year GPA is established
•Correlate the two sets of scores
•Evaluate the results
•Determine validity coefficient
Interpretation
Validity coefficients range from 0 to 1
A higher coefficient is better
Reliability
Consistency of test measurement
High test reliability means that when the test is retaken, the same scores would be earned
Reliability ranges from 0 to 1
How do validity and reliability relate?
A valid test is always reliable
A reliable test is not always valid
Tests with high reliability may not measure what is intended by the researcher
Why do tests have low reliability?
Errors in the test
Failure to follow procedures
Student fatigue
Inattention to detail
Ambiguous questions
Familiarity with the test
Unclear directions
Improper administration
Student mood
Test-Retest Reliability
Scores are consistent over time
Steps
1. Administer test to group
2. Administer test again after time has passed (1 or 2 weeks)
3. Correlate 2 sets of scores
Coefficient of Stability
Alternate Forms Reliability
Test A and Test B measure the same traits
Steps
1. Administer one test form
2. Administer second form to the same group
3. Correlate 2 sets of scores
Coefficient of Equivalence
Split-half Reliability
Requires only one administration of the test
Steps
1. Administer the total test to group
2. Divide test into 2 comparable halves (odd and even questions)
3. Compute a set of scores for each half
4. Correlate 2 sets of scores
5. Apply Spearman-Brown correction
COEFFICIENT OF EQUIVALENCE
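Spearman-Brown example
Spearman-Brown correction: used to predict the reliability of the full 50-item test from the correlation between its two 25-item halves
SPLIT-HALF RELIABILITY = 0.80
\[ r_{\text{total}} = \frac{2\, r_{\text{split half}}}{1 + r_{\text{split half}}} = \frac{2 \times 0.80}{1.00 + 0.80} = \frac{1.60}{1.80} = 0.89 \]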
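The split-half procedure and the Spearman-Brown correction can be sketched in code. This is an illustration only: the function names are hypothetical, items are assumed to be scored numerically (for example 1 for correct, 0 for incorrect), and the halves are assumed to be the odd-numbered and even-numbered items.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_brown(r_split_half):
    """Estimate full-test reliability from the correlation between two halves."""
    return (2 * r_split_half) / (1 + r_split_half)


def split_half_reliability(item_scores):
    """item_scores holds one list of item scores per examinee.

    Returns (correlation between halves, Spearman-Brown corrected reliability).
    """
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_totals, even_totals)
    return r_half, spearman_brown(r_half)


# Reproduces the worked example: a split-half correlation of 0.80
print(round(spearman_brown(0.80), 2))  # 0.89
```

Because the corrected value is always at least as large as the split-half correlation for positive correlations, the uncorrected figure understates the reliability of the full-length test.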
Rationale Equivalence Reliability
Requires only one administration of the test
Internal and External Reliability
External Reliability
Researcher status
Choice of subjects
Social situations and conditions
Analytic constructs
Methods of data collection and analysis
Internal Reliability
Inter-observer agreement
All team members trained the same way
All team members treat subjects identically
For videotape and transcript analysis, reliability is established before the experiment begins
Scorer/rater Reliability
Occurs when subjective scoring of test items is performed
Inter-judge reliability - 2 or more scorers rate the tests the same way
Intra-judge reliability - same scorer rates each test the same way
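One simple way to check whether two scorers "rate the tests the same way" is the proportion of tests on which they give the same rating. The slides do not name a statistic, so percent agreement is an assumption here, and the function name and ratings are hypothetical.

```python
def percent_agreement(ratings_a, ratings_b):
    """Proportion of tests on which two judges gave the identical rating."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    matches = sum(1 for a, b in zip(ratings_a, ratings_b) if a == b)
    return matches / len(ratings_a)


# Hypothetical ratings of the same five tests by two judges
judge_a = [3, 4, 2, 5, 4]
judge_b = [3, 4, 3, 5, 4]

agreement = percent_agreement(judge_a, judge_b)
print(agreement)  # 0.8 (the judges agree on 4 of 5 tests)
```

The same function can be used for intra-judge reliability by passing one scorer's ratings from two separate scoring sessions.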
Reliability Coefficients
The closer to 1.0 the better
Achievement/aptitude tests – reliability should not be less than 0.9
Subtest reliability should be calculated for tests that have more than one component
Standard Error of Measurement
Used to express test reliability
Small standard error of measurement (SEM) indicates high reliability
Interpreted similarly to standard deviation
\[ SEM = SD\sqrt{1 - r} \]
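The standard error of measurement formula, SEM = SD × sqrt(1 − r), can be computed directly. The sketch below is illustrative; the function name and the SD and reliability values are hypothetical.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - r), where r is the test's reliability coefficient."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


# Hypothetical test with SD = 10: higher reliability gives a smaller SEM
sem_high = standard_error_of_measurement(10, 0.91)
sem_low = standard_error_of_measurement(10, 0.64)
print(round(sem_high, 2), round(sem_low, 2))  # 3.0 6.0
```

With the same standard deviation, raising reliability from 0.64 to 0.91 halves the SEM, which is why a small SEM indicates high reliability.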
Four types of measurement scales
Nominal: subjects grouped based on gender, race, fitness level
Ordinal: rank comparisons, e.g., rank tallest to shortest
Interval: supplies the order and the distance between scores (used with standard scores); there is no true zero point, so an IQ of 160 is not twice as smart as an IQ of 80
Ratio: used when there is a true zero point, so that ratios between scores are meaningful…
Standard Scores
Z scores: M = 0, SD = 1.0. Used to compare and contrast 2 different test scores, e.g., push-up and 40-yard dash
T scores: M = 50, SD = 10. Converts Z scores to all positive measures
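The conversions behind these standard scores are z = (score − M) / SD and T = 50 + 10z. The sketch below uses hypothetical group means, standard deviations, and scores for the push-up and 40-yard dash example.

```python
def z_score(raw_score, group_mean, group_sd):
    """Distance of a raw score from the group mean, in SD units."""
    if group_sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw_score - group_mean) / group_sd


def t_score(z):
    """Rescale a z score to a distribution with M = 50 and SD = 10."""
    return 50 + 10 * z


# Hypothetical group statistics and one athlete's raw scores
push_up_z = z_score(40, group_mean=30, group_sd=5)      # 40 push-ups
dash_z = z_score(5.0, group_mean=5.5, group_sd=0.25)    # 5.0 seconds

print(push_up_z, t_score(push_up_z))  # 2.0 70.0
print(dash_z, t_score(dash_z))        # -2.0 30.0
```

Note that for a timed event a lower raw score is the better performance, so the negative z score for the dash reflects a faster-than-average time rather than a weak result.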
Measuring Affective Behavior
Personality
Anxiety
Self-esteem
Social Behavior
RPEs
Types of Scales
LIKERT
STRONGLY AGREE   AGREE   UNDECIDED   DISAGREE   STRONGLY DISAGREE
1   2   3   4   5
High point values on a positive statement indicate a positive attitude.
SEMANTIC DIFFERENTIAL
NECESSARY __ __ __ __ __ __ __ UNNECESSARY
FAIR __ __ __ __ __ __ __ UNFAIR
3 2 1 0 -1 -2 -3
POSITIVE   NEGATIVE
Personality Tests
Non-projective - uses a self-report instrument
Inventory - yes/no questions
Scale - used to determine what an individual feels or believes
May not be accurate due to societal influences