Posted on 12-Apr-2017
Reliability & Validity
By Rahul Singh, Rajat Uprety, Ravi Pohani, Sumit Sethi, Vibhor Garg
Commonly used terms…
“She has a valid point”
“My car is unreliable”
…in science…“The conclusion of the study was not valid”
“The findings of the study were not reliable”.
•Reliability
“…the degree to which a test or measure produces the same scores when applied in the same circumstances…”
(Nelson 1997)
Reliability
• Reliability is a prerequisite of validity
Types of Reliability
• Relative
• Absolute
• Rater reliability (Objectivity)
• Intra-rater reliability
• Inter-rater reliability
Relative Reliability
A measure obtained by assessing the same phenomenon with the same sample group using more than one assessment method (also called parallel reliability).
Example: The level of employee satisfaction in a company may be assessed with questionnaires, in-depth interviews and focus groups, and the results compared.
Absolute Reliability
i.e. test-retest within individuals
A measure of reliability obtained by conducting the same test more than once over a period of time with the same sample group.
Example: Employees of a company may be asked to complete the same job satisfaction questionnaire twice, one week apart, so that the results can be compared to assess the stability of scores.
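As a rough illustration of the comparison described in this example, the two sets of scores can be correlated: a coefficient close to 1 suggests the scores are stable. This is only a sketch. The satisfaction scores are invented for illustration and `pearson_r` is a hypothetical helper name, not something taken from the slides.

```python
from math import sqrt

def pearson_r(first, second):
    """Pearson correlation between two equally long lists of scores."""
    n = len(first)
    mean_1 = sum(first) / n
    mean_2 = sum(second) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((a - mean_1) * (b - mean_2) for a, b in zip(first, second))
    var_1 = sum((a - mean_1) ** 2 for a in first)
    var_2 = sum((b - mean_2) ** 2 for b in second)
    return cov / sqrt(var_1 * var_2)

# Hypothetical satisfaction scores (1 to 5) for five employees,
# collected with the same questionnaire one week apart.
week_1 = [4, 3, 5, 2, 4]
week_2 = [4, 3, 4, 2, 5]

r = pearson_r(week_1, week_2)
print(f"test-retest correlation: {r:.2f}")  # about 0.81
```

A high correlation only shows that employees keep roughly the same rank order between the two occasions; it does not show that the questionnaire measures satisfaction.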
Rater Reliability
• Inter-rater reliability
• The consistency of a given measurement from more than one observer or measurement tool
e.g.
Score for the American Gymnast
British Judge = 9.9
French Judge = 4.4
Japanese Judge = 7.0
• Intra-rater reliability
• The consistency of a given measurement from one observer or measurement tool on different occasions
e.g.
Score for the American Gymnast in 2002, quarter finals and finals
quarter finals = 9.9
finals = 9.8
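One simple way to put a number on the two gymnast examples is the gap between the highest and lowest score. This is a minimal sketch using the scores from the slides; `score_range` is a hypothetical helper, and the range is just one of several possible consistency measures.

```python
def score_range(scores):
    """Return the gap between the highest and lowest score."""
    return max(scores) - min(scores)

# Scores taken from the gymnast examples above.
between_judges = [9.9, 4.4, 7.0]   # British, French and Japanese judges
across_occasions = [9.9, 9.8]      # quarter finals and finals

inter_rater_gap = score_range(between_judges)
intra_rater_gap = score_range(across_occasions)

print(f"gap between judges:    {inter_rater_gap:.1f}")
print(f"gap between occasions: {intra_rater_gap:.1f}")
```

The large gap between judges points to poor inter-rater reliability, while the small gap between occasions points to good intra-rater reliability.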
Threats to Reliability
• Standardisation of Procedures
• Control of extraneous variables (keeping all else equal, ceteris paribus)
• Precision of Measurements
e.g. Keeping temperature, humidity and pressure constant
Measurement Errors
• Ultimately, reliability is dependent on the degree of measurement error in a given study
• The overall error in any measurement comprises both systematic error (error in the system) and random error (error from other sources, such as human error)
e.g. Parallax error when measuring refractive index
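The split between systematic and random error can be sketched with a handful of repeated readings. The readings and the reference value below are invented for illustration, and `error_components` is a hypothetical helper: it treats the average offset from the reference as the systematic part and the spread of the readings as the random part.

```python
from statistics import mean, stdev

def error_components(readings, true_value):
    """Split measurement error into a systematic and a random part."""
    # Systematic error: how far the readings are off on average.
    systematic = mean(readings) - true_value
    # Random error: how much the readings scatter around their own mean.
    random_spread = stdev(readings)
    return systematic, random_spread

# Hypothetical repeated refractive index readings of one sample
# whose reference value is assumed to be 1.50.
readings = [1.52, 1.54, 1.53, 1.55, 1.51]
systematic, random_spread = error_components(readings, true_value=1.50)

print(f"systematic error: {systematic:+.3f}")
print(f"random error (standard deviation): {random_spread:.3f}")
```

Repeating the measurement reduces the influence of the random part on the average, but it does nothing about the systematic part.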
•Validity
“The soundness or appropriateness of a test or instrument in measuring what it is designed to measure”
(Vincent 1999)
e.g. A thermometer used to measure humidity is not a valid instrument
•Validity
“Degree to which a test or instrument measures what it purports to measure”
(Thomas & Nelson 1996)
Types of Experimental Validity
• Internal
• Is the experimenter measuring the effect of the independent variable on the dependent variable?
• External
• Can the results be generalised to the wider population?
Logical Validity
• Face Validity
• Implies that a test is valid by definition
• It is clear that the test measures what it is supposed to
e.g. If you want to assess reaction time, measuring how long it takes an individual to react to a given stimulus would have face validity
Logical Validity
• Content Validity
• Implies that the test measures all aspects contributing to the variable of interest
Test Group skills
Interview
Overall:
A logically valid test simply appears to measure the right variable in its entirety
Statistical Validity
• Concurrent Validity
• Implies that the test produces similar results to a previously validated test
e.g. A toad jumps when it sees an insect
When tested with a fly
When tested with a beetle
Statistical Validity
• Predictive Validity
• Implies that the test provides a valid reflection of future performance using a similar test
e.g. Can performance during test A be used to predict future performance in test B?
Overall:
A statistically valid test produces results that agree with other similar tests
Logical/Statistical Validity
• Construct Validity
• Implies not only that the test is measuring what it is supposed to, but also that it is capable of detecting what should exist, theoretically
• Therefore relates to hypothetical or intangible constructs
e.g. Theoretically, winds near coastal regions blow at higher speed, and an anemometer shows the same
Threats to Internal Validity
• Maturation
• Changes in the dependent variable (DV) over time irrespective of the independent variable (IV)
e.g. With an immature immune system the body gets infected; with a mature immune system white blood cells kill the virus
Threats to Internal Validity
• History
• Unplanned events between measurements
e.g. Friendship measured at 100%, left unmeasured for a while, then measured at 0%
Threats to Internal Validity
• Statistical Regression
• AKA regression to the mean
• An initial extreme score is likely to be followed by less extreme subsequent scores
e.g.
Training has the greatest effect on untrained individuals.
Therefore, solution = effective sampling.
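Regression to the mean can be seen in a small simulation with no training at all. The sketch below assumes each observed score is a stable ability plus independent random noise; the means, spreads and sample size are arbitrary illustrative choices, and `simulate_retest` is a hypothetical name.

```python
import random

def simulate_retest(n=10000, seed=0):
    """Return the mean first and second score of the top 10% on the first test."""
    rng = random.Random(seed)
    # Stable ability per person, plus fresh random noise on each test.
    ability = [rng.gauss(50, 10) for _ in range(n)]
    first = [a + rng.gauss(0, 10) for a in ability]
    second = [a + rng.gauss(0, 10) for a in ability]
    # Select the people with the most extreme (highest) first scores.
    cutoff = sorted(first)[int(0.9 * n)]
    top = [i for i in range(n) if first[i] >= cutoff]
    mean_first = sum(first[i] for i in top) / len(top)
    mean_second = sum(second[i] for i in top) / len(top)
    return mean_first, mean_second

first_mean, retest_mean = simulate_retest()
print(f"top group, first test:  {first_mean:.1f}")
print(f"top group, second test: {retest_mean:.1f}")
```

The selected group scores lower on the second test even though nothing changed between tests, which is why groups chosen for their extreme scores can show apparent change without any real effect.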
Threats to Internal Validity
• Selection Bias
• The groups for comparison are not equivalent
e.g. Selecting Einstein and a clown to measure the validity of an IQ test
Threats to Internal/External Validity
• Pre-testing
• Interactive effects due to the pre-test (e.g. learning, sensitisation, etc.)
e.g. Once we have been tickled, the next time we start to laugh even before someone tickles us
Threats to Internal/External Validity
• Experimental Mortality
• Missing data due to subject drop-out
• Reduced numbers = reduced statistical power
• Not only challenges the quality of the data gathered (Internal Validity) but also our ability to generalise (External Validity)
e.g. If someone drops out of the experiment
Therefore, solution = recruit sufficient participants
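The link between reduced numbers and reduced power can be illustrated with the standard error of a mean, which grows as participants drop out. This is a simplified sketch: the standard deviation and group sizes are invented, and `standard_error` is a hypothetical helper rather than a full power calculation.

```python
from math import sqrt

def standard_error(sd, n):
    """Standard error of a sample mean for standard deviation sd and n subjects."""
    return sd / sqrt(n)

# Hypothetical study: scores with a standard deviation of 10,
# starting with 100 participants and ending with 25 after drop-out.
se_full = standard_error(10, 100)
se_after_dropout = standard_error(10, 25)

print(f"standard error with 100 participants: {se_full}")
print(f"standard error with 25 participants:  {se_after_dropout}")
```

Losing three quarters of the sample doubles the uncertainty of the estimated mean, so a real effect becomes harder to detect.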
Threats to External Validity
• Inadequate description
• 5th characteristic of research… should be replicable
If nobody can replicate the methods of a given study, then it is irrefutable and therefore lacks external validity.
Therefore, solution = comprehensive methodology
Threats to External Validity
• Demand Characteristics
• Participants detect the purpose of the study and behave accordingly
Therefore, solution = double or single blinding.
Threats to External Validity
• Operationalisation
• AKA Ecological Validity
• The DV must have some relevance in the ‘real world’
Therefore, solution = choose your DV carefully.
e.g. To study the function of the brain, taking “heart beat” instead of “neural impulses” as the dependent variable would be a poor choice
Valid and Reliable
A subject takes 3 different tests and the results are almost the same each time
Test 1   Test 2   Test 3
100 dB   100 dB   100 dB
90 dB    90 dB    90 dB
Not Valid but Reliable
A subject takes 3 different tests and the result of one test differs from the other two by a similar error each time
Test 1   Test 2   Test 3
100 dB   100 dB   98 dB
90 dB    90 dB    88 dB
Not Valid and not Reliable
A subject takes 3 different tests and the results of all tests differ, and not by the same amount
Test 1   Test 2   Test 3
100 dB   110 dB   98 dB
90 dB    89 dB    120 dB
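The three cases can be told apart mechanically by looking at how far each test sits from the others. The sketch below is one possible rule, not a standard procedure: it assumes the median reading in each row is a fair stand-in for the true level, uses an arbitrary tolerance of 0.5 dB, and `classify` is a hypothetical name.

```python
from statistics import median

def classify(rows, tolerance=0.5):
    """Label a table of readings: one row per measured level, one column per test."""
    # Offset of every reading from its row's median reading.
    offsets = [[value - median(row) for value in row] for row in rows]
    # Regroup the offsets by test (column) instead of by row.
    per_test = list(zip(*offsets))
    # Consistent: each test is off by about the same amount in every row.
    consistent = all(max(col) - min(col) <= tolerance for col in per_test)
    # Unbiased: every offset is close to zero.
    unbiased = all(abs(x) <= tolerance for col in per_test for x in col)
    if consistent and unbiased:
        return "valid and reliable"
    if consistent:
        return "not valid but reliable"
    return "not valid and not reliable"

# Readings in dB from the three cases above.
case_1 = [[100, 100, 100], [90, 90, 90]]
case_2 = [[100, 100, 98], [90, 90, 88]]
case_3 = [[100, 110, 98], [90, 89, 120]]

for case in (case_1, case_2, case_3):
    print(classify(case))
```

A constant offset, as in the second case, behaves like a systematic error, while offsets that change from row to row, as in the third case, behave like random error.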