Overview of Research Methods in Dentistry
Robert Weyant, DMD DrPH
Department of Dental Public Health and Information Management
University of Pittsburgh
2
What is “Causation”
• Koch-Henle postulates
• Bradford Hill "criteria"
• Inductivist, refutationist, or hypothetico-deductivist view
• Provides the basis for “intervention”
"Causality. There is no escape from it, we are forever slaves to it. Our only hope, our only peace is to understand it, to understand the why." (Larry and Andy Wachowski, The Matrix Reloaded)
3
Hill's Criteria of Causation
• Proposed by Austin Bradford Hill (1897-1991), a British medical statistician, as a way of determining the causal link between a specific factor (e.g., cigarette smoking) and a disease (such as emphysema or lung cancer).
• Hill's criteria form the basis of modern epidemiological research, which attempts to establish scientifically valid causal connections between a disease and its cause.
• Temporal relationship
• Strength
• Dose-response relationship
• Consistency
• Plausibility
• Consideration of alternate explanations
• Experiment
• Specificity
• Coherence
4
Systems
• Deterministic Systems
  • Events are part of an unbroken chain of prior occurrences.
  • Outcomes occur predictably
  • Newtonian physics
• Stochastic Systems
  • Outcomes are computationally and practically unpredictable.
  • Present state does not fully determine the next state
  • Biology and medicine are stochastic
5
Statistical Causality
• Observational studies (like counting cancer cases among smokers and among non-smokers and then comparing the two) can give hints, but can never establish cause and effect.
  • Useful for hypothesis generation.
• The gold standard for causation here is the randomized experiment.
  • One limitation of experiments: they do a good job of testing for the presence of some causal effect, but they do less well at estimating the size of that effect in a population of interest.
  • Subject selection may lack generalizability.
[Diagram labels: Exp, Med, Outcome]
Research Designs
In clinical research
6
7
Essentials of Research Design
• Basic research
• Clinical research (often experimental)
• Epidemiological research (often observational, known denominator)
• Health services research
Limited to human research (in vivo)
8
What are our research (and clinical) concerns?
• Exposure
  • Good or bad: chemical, biological, psychological, educational, etc.
• Outcome
  • Good or bad: disease, cure, improved attitude, longer life, etc.
• We generally know one and want to measure the other
• Concerns are that we measure both accurately and understand what population is represented.
9
Classification Schemes
• Descriptive vs. Analytical
• Experimental vs. Observational
• Time Referenced
  • Prospective vs. Cross-sectional vs. Retrospective
10
Describe or Analyze?
• Descriptive
  • Simply describe what was seen (common in surveys), e.g., the prevalence of various conditions.
  • PREVALENCE: the proportion of the population who exhibit the condition of interest.
• Analytic
  • Attempt to determine the associations between disease and possible risk factors/determinants and to quantify risk (common for experimental designs and the search for causality).
11
Experiment or Observe?
• Experimentation is defined by the degree of control or manipulation the investigator has over the study conditions.
• In a non-experimental (observational) design the investigator has less control over the study conditions.
• The consequences of study design are in the limitations put upon the interpretation of the results of the study.
12
Time orientation of study designs
• Cross sectional: a single point in time
• Retrospective: case control
• Prospective: all experimental and cohort (observational) designs
13
Classification according to CONTROL / INTERVENTION
• Experimental Designs (Classic Design = RCT)
  • Prospective
  • Investigator alters the conditions under study
  • There is a true control group
  • Randomization MUST occur
• Observational
  • May be prospective/retrospective/cross-sectional
  • No control
  • No intervention
14
Issues of concern
1. Population
2. Control group
3. Sample size
4. Placebo
5. Control of Operational Procedures
6. Validity and Reliability of Measures
7. Duration
8. Statistical Analysis
15
1. Population (Relevance)
• When you read a study you must ask:
  • Is the population representative of something I care about?
  • Is it appropriate to answer the question?
16
How do people get into a “study”?
• They volunteer
• Often they are in the right place at the right time
• They have the right disease (severity) or exposure.
• Often “clinic” based studies generalize very poorly to larger populations.
17
Why people don’t get into a study
• Too sick or not sick enough
• Wrong gender, race, etc.
• Don’t live in the right place.
• Don’t know about the study.
Where do research “subjects” come from?
Population of interest (in community)
  Barriers: lack of knowledge, referral issues, fear, transportation
Present for study
  Barriers: wrong disease severity, demographic issues
Eligible
  Barriers: fear, transportation, not willing to be “randomized”
Consent/Enroll
  Barriers: do not adhere to protocol, lost to follow up (move, die)
Complete study and can be found for follow up
Generalizability of Results
19
Is the study relevant and valid?
• External validity
  • Do the study subjects represent a definable population of interest, i.e., “your patients”?
  • Hence, is it relevant?
• Internal validity
  • Is the study well designed and analyzed?
  • Hence, is it valid?
20
2. Sample Size (did you look at enough people…)
• There must be enough people in the study to ensure that the conclusions are valid. The likelihood that a finding will be spurious or incorrect decreases as you increase the number of individuals in the study.
• POWER: the ability of a test to detect a significant difference when one exists. Be particularly attentive to negative studies.
• Function of effect size, variance, sample size
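The dependence of power on effect size and sample size can be sketched with a short calculation. This is a minimal illustration, assuming a comparison of two proportions and the usual normal approximation; the function name `n_per_group` and the example proportions are hypothetical and not taken from the lecture.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to detect p1 vs. p2.

    Uses the standard normal-approximation formula for comparing two
    proportions with a two-sided test. Illustrative only.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


# Hypothetical example: caries incidence of 50% in controls vs. 30% treated.
print(n_per_group(0.50, 0.30))
# A smaller expected difference needs many more subjects.
print(n_per_group(0.50, 0.40))
```

A "negative" study with far fewer subjects than such a calculation suggests may simply have been too small to detect the effect.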
21
3. Control Group ?(it worked!……compared to what?)
• If we are to conclude that an intervention has an effect, then we must be sure that the groups with and without the intervention were similar before the study began and remained so except for the intervention.
• If not, bias can result in spurious conclusions.
22
4. Placebo ?(I feel much better...what was that?)
• Placebo is a material, formulation, intervention that is similar to the test product, but without the active ingredient.
• There is a well documented placebo effect in many situations.
  • Up to 70% in some studies.
23
5. Control of operational procedures (What exactly did you do, doctor?)
• When reading a study for your own use, it is important that the authors explain precisely what they did. This allows the reader to generalize to his/her own situation and helps to assess relevance.
24
6. Reliability of measures (That was great…now do it again?)
• One of the most important areas in any study: did the effect occur, and how do we know? Someone measured it. We must be able to determine that the investigator(s) measured it accurately and repeatably.
• INTRA-RATER reliability (same cases over time)
• INTER-RATER reliability (comparison of same cases among raters)
• Instrumentation
25
7. Duration of study (over so fast?)
• Did the trial run long enough to measure the desired effect?
  • Caries trials: 2-3 years
  • Calculus-preventing agents: 90 days
  • Orthodontic outcomes (20 years?)
  • Implants (5 years)
26
8. Statistical Analysis (So, did I find anything “significant?”)
• Were the analyses appropriate to the design, quality of data, and intent of the investigators?
• Statistical analysis is based on the type of data (nominal, ordinal, interval, ratio).
• Type of question being asked
  • Summarize
  • Difference between groups
  • Effect size or risk
27
Threats to Validity of a Study (Nice result, but what about…)
• Bias: Any systematic error in a study which results in an incorrect estimate of the association between disease and exposure.
• Confounding: results when there is a mixing of the effect of the exposure and disease with a “third factor”
• Chance: The exposure:disease relationship is spurious as the result of random variation in sampling.
28
Types of Bias
• Selection
  • Non-representative sample
  • Non-comparable case/control groups
  • Loss to follow-up
  • Differential survival
• Observation (Misclassification error)
  • Disease Classification
  • Exposure Classification
  • Instrumentation
29
Confounding
• Definition: the bias in the (crude) disease-exposure estimate that can result when the exposure-disease relationship is mixed up with the effect of “extraneous variables”
• Confounding affects our understanding of the “true” disease-exposure relationship
• The determination is “data-based”
• Two methods
  • Stratification
  • Multivariate analysis
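Stratification can be illustrated with a small sketch. The counts below are hypothetical, chosen so that the exposure has no effect within either stratum of a third factor, yet the crude (unstratified) odds ratio suggests an association. The Mantel-Haenszel summary used here is one common way to combine strata; the lecture does not name a specific method, and the function names are illustrative.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for one 2x2 table.

    a = exposed with disease,   b = exposed without disease,
    c = unexposed with disease, d = unexposed without disease.
    """
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio across a list of (a, b, c, d) tables."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den


# Hypothetical data, stratified by a "third factor" (e.g., smoker / non-smoker).
strata = [
    (40, 20, 20, 10),  # stratum 1: odds ratio = 1
    (5, 20, 20, 80),   # stratum 2: odds ratio = 1
]

# Crude table: simply add the strata together, ignoring the third factor.
crude = tuple(sum(col) for col in zip(*strata))

print("crude OR:", odds_ratio(*crude))
print("stratum ORs:", [odds_ratio(*s) for s in strata])
print("adjusted (Mantel-Haenszel) OR:", mantel_haenszel_or(strata))
```

When the crude and adjusted estimates differ noticeably, as in this constructed example, the third factor is acting as a confounder.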
30
Chance
• That’s what we have statistics for - to quantify the chance.
• Type 1 (alpha) error (p-value).
Research Designs
Alter the conditions under study?
  yes: Experimental studies
    Is there to be a control group?
      yes: True experiment
      no: Quasi-experiment
  no: Observational studies
    Will observations be made at more than one time?
      yes: Cohort study
      no: Do we know disease status of patients before study?
        yes: Case-control study
        no: Cross sectional study
32
Observational Designs
Cross Sectional
Case Control (retrospective)
Cohort (prospective)
33
Cross Sectional Study
• Measure, Classify, Compare
• Used for questionnaires, surveys, prevalence estimates, to generate hypotheses.
• Everything occurs “at once”.
Cross Sectional Design
1. Select population of interest
2. Select study sample
3. Assess sample for both disease (outcome) status and risk factor (exposure) status
   • Disease Positive: RF + or RF -
   • Disease Negative: RF + or RF -
Analyze using correlational statistics, but causation is not “provable” due to lack of temporal association
Cross-Sectional Design
Advantage:
1. Quick and Low Cost
2. Evaluate a large number of variables
3. Enroll a large number of Subjects
Disadvantage:
1. Subject selection may reflect selection bias (volunteers, hospital patients)
2. Is difficult to identify cause and effect relationship.
Common Uses:
• Questionnaires and Surveys
• Prevalence studies
• Hypothesis Development
36
Case Control
• Select cases and controls
• Retrospective assessment of risk factors
• Quantify exposure. Since there is no denominator, only relative measures (not true rates) can be estimated.
Case-Control Design
1. Select group of subjects WITH disease/outcome of interest = CASES
2. Select group of subjects WITHOUT disease/outcome = CONTROLS
3. Measure (retrospectively) risk factors of interest (RF + or RF - within cases and within controls).
4. Analyze using strength of association measures.
Selection of controls is crucial
Case selection also must be carefully considered
Common Use:
Rare Disease (e.g., birth defects)
Long Latency (e.g., cancer)
Case-Control Design
Advantages
1. not dependent on natural frequency of disease (thus used to study rare diseases)
2. well suited to study diseases with long latency
3. requires comparatively few cases (2:1 or 3:1 matching)
4. not dependent on previously established cohort
5. allows study of multiple potential causes of disease
6. relatively low cost and quick
7. ethical: disease has already occurred
Disadvantages
1. case selection may be problematic
2. controls may not be representative of same population as cases in terms of disease risk or confounders
3. investigators may be biased when they know the disease status of subjects
4. subjects may bias answers (recall) due to disease status
5. factors which are used to match are removed from analysis
6. incidence, prevalence, RR and AR can't be calculated since no "population at risk" denominator is available
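Because no population-at-risk denominator is available, the usual strength-of-association measure in a case-control study is the odds ratio. The sketch below is illustrative only: the counts are hypothetical, and the confidence interval uses the common log (Woolf) approximation, which the lecture does not specify.

```python
from math import exp, log, sqrt


def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Odds ratio and approximate 95% CI from a case-control 2x2 table."""
    a, b = exposed_cases, unexposed_cases
    c, d = exposed_controls, unexposed_controls
    or_ = (a * d) / (b * c)
    # Standard error of log(OR), Woolf's method.
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return or_, exp(log(or_) - z * se), exp(log(or_) + z * se)


# Hypothetical study: 100 cases (30 exposed) and 100 controls (15 exposed).
or_, low, high = odds_ratio_ci(30, 70, 15, 85)
print(f"OR = {or_:.2f}, 95% CI {low:.2f} to {high:.2f}")
```

An interval that excludes 1 suggests the association is unlikely to be due to chance alone; it says nothing about bias or confounding.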
39
Cohort Design
• Select two or more groups (cohorts) that are free of disease but differ on their exposure status.
  • May start with one heterogeneous cohort.
• Cohorts have a “denominator” which allows the calculation of true rates.
• Useful when “exposure” varies over time.
Cohort Study Design
Prospective, Observational Design.
1. Select population of interest
2. Recruit sample WITHOUT disease(s) of interest and measure risk factors (disease-free study sample, baseline exam)
3. Recall cohort periodically and remeasure risk factors and disease status (Visit 2, Visit 3, ... Visit n over time)
Uses:
• Determining/quantifying risk factors
• Developing new etiological theory
• Establishing causality
Cohort Design
Advantages
1. allows risk to be expressed as incidence
2. certain biases are reduced:
exposure status
disease status
3. subject characteristics can be related to more than one outcome
Disadvantages
1. inefficient for study of rare disease
2. assessment of relationships limited to those defined at beginning of study
3. selection bias not controlled
4. loss to follow-up common
5. subjects may change in regards to characteristics (i.e. exposure status)
6. bias may be present if the characteristic studied influences surveillance and if surveillance influences detection of outcome (detection or surveillance bias)
7. expensive and time consuming
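Since a cohort has a known denominator, incidence can be computed in each exposure group and compared directly. The following sketch uses hypothetical counts; the function and key names are illustrative assumptions.

```python
def cohort_measures(exposed_cases, exposed_total,
                    unexposed_cases, unexposed_total):
    """Incidence, relative risk (RR) and attributable risk (AR) from a cohort."""
    incidence_exposed = exposed_cases / exposed_total
    incidence_unexposed = unexposed_cases / unexposed_total
    return {
        "incidence_exposed": incidence_exposed,
        "incidence_unexposed": incidence_unexposed,
        # RR: how many times more likely disease is among the exposed.
        "relative_risk": incidence_exposed / incidence_unexposed,
        # AR: excess incidence among the exposed (risk difference).
        "attributable_risk": incidence_exposed - incidence_unexposed,
    }


# Hypothetical cohort: 30 of 100 exposed and 10 of 100 unexposed develop disease.
for name, value in cohort_measures(30, 100, 10, 100).items():
    print(f"{name}: {value:.2f}")
```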
42
Experimental Designs
Clinical Trials (RCTs)
Field Trials
43
Clinical Trials
• Prospective controlled experiment on human subjects to assess an intervention for a specific disease.
• Asks an important research question
• Clinical event or outcome
• Done in clinical or medical setting
• Evaluates one or more interventions compared with “standard treatment”
• Informed consent and a Data and Safety Monitoring Board (DSMB) required
44
Phases of Clinical Trials
• Phase I: dose finding
• Phase II: efficacy at fixed dose
• Phase III: comparing treatment (RCT)
• Phase IV: late/uncommon effects
45
Uses of Clinical Trials (experimental studies)
• Test new drug therapy
• Test new surgical interventions
• Test educational/programmatic interventions
Randomized Clinical Trial Design
1. Recruit individuals WITH disease (study sample with disease).
2. Randomize into treatment arms (Standard Treatment vs. New Treatment).
3. Follow up to assess outcomes in each arm.
Randomization is essential and, along with strict control of experimental conditions, allows for minimal bias.
Excellent internal validity (but possibly low external validity)
Ethical only to the degree that differences in treatment are unknown at time of study initiation (equipoise). Requires DSMB.
Experimental Design
Advantages
1. investigator directly controls assignment to study groups
2. investigator directly controls exposure to agent.
3. random assignment can control for extraneous factors.
4. blinding of evaluators may be possible
Disadvantages:
1. not immune to problems encountered with other designs: (non-compliance, incomplete follow-up, biased observation)
2. may have low external validity
3. may not be feasible for studies of disease etiology (ethical considerations, rare disease)
4. may not be feasible when effective disease prevention exists (can't withhold treatment)
5. Can be very expensive
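Investigator-controlled random assignment can be sketched in a few lines. The example below uses permuted-block randomization, which is one common scheme and an assumption here (the lecture does not specify a method); the arm labels, block size, and function name are illustrative.

```python
import random


def block_randomize(subject_ids, arms=("standard", "new"),
                    block_size=4, seed=None):
    """Assign subjects to arms in shuffled blocks so the arms stay balanced."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    allocation = {}
    block = []
    for subject in subject_ids:
        if not block:
            # Start a new block with equal numbers of each arm, in random order.
            block = list(arms) * (block_size // len(arms))
            rng.shuffle(block)
        allocation[subject] = block.pop()
    return allocation


# Hypothetical trial with 8 enrolled subjects.
allocation = block_randomize(range(1, 9), seed=2024)
for subject, arm in allocation.items():
    print(subject, arm)
```

In practice the allocation sequence is generated and concealed by someone other than the clinician enrolling patients, so that assignment cannot be predicted.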
48
Efficacy vs. Effectiveness
• Efficacy is the potential to provide a clinical benefit.
• Measured in CTs
• Effectiveness is the benefit provided in routine “real world” use.
• Measured in surveillance systems (registries), after market incident reports, etc.
49
Hierarchy of Research Designs
• Experimental designs
• Cohort studies
• Case-control designs
• Human trial without controls
• Cross-sectional designs
• Descriptive studies
• Case reports
• Personal opinion
Based on control of bias and confounding and ability to make causal arguments
50
RCT’s Strengths
• Minimally biased design
  • Randomization
  • Control of extraneous variables
• Prospective (causality established)
• Design issues determined prior to initiation of study.
51
Problems with (Dental) RCTs
• Difficult to randomize
• Ethical Concerns
  • Principle of Equipoise involves the ethical treatment of human subjects in experimental conditions. A subject should only be submitted to a randomized, controlled design if there is substantial uncertainty about which of the treatments would benefit the subject most.
  • RCTs should not be done when patient preference can be elicited (ortho vs. surgical tx)
• Blinding issues (Hawthorne effect)
• Expensive (and often lack sponsor)
What are the current “Issues” in Dental clinical research?
• Diagnosis
• Treatment approach
• Materials
• Long term issues
• Harm
• Health Services Research
• Cost Effectiveness
52
53
Negative Study
• No association
• Sloppy design (poor methods or analysis)
• Bias
• Chance
  • Statistics measures “chance” (expressed as p-value)
Systematic Reviews
Putting it all together
54
55
Scientific Truth relies on
• the weight of evidence over many studies that creates confidence in results.
• If it's not published…it didn’t happen.
• Journalistic Reviews…the “old way”
  • Remember the essays you used to write as a student? You would browse through the indexes of books and journals until you came across a paragraph that looked relevant, and copied it out. If anything you found did not fit in with the theory you were proposing, you left it out.
  • Or the way it's done by senior academics. Take a simmering topic, extract the juice of an argument, add the essence of one filing cabinet, sprinkle liberally with your own publications and sift out the work of noted detractors or adversaries…or
56
Systematic Reviews…the new way
• In contrast to the old way, systematic reviews use explicit and rigorous methods to identify, critically appraise, and synthesize relevant studies.
• Qualitative: when the results of studies are not statistically combined.
• Quantitative or Meta-analysis: systematic review that uses statistical methods to combine the results of two or more studies
Maturation of Dentistry
Age of Empiricism: Dental practice based on observation and experience in ignorance of scientific findings
Age of Evidence: Dental practice based on high quality evidence of effectiveness
All knowledge maintained personally → Textbooks and Journals → Internet
Apprentice Model of Education → Scientific Literature and Knowledge Synthesis-based Education
Absence of Research → RCTs → Systematic Reviews and Meta Analysis
Evolution of the Dental Knowledge Base
• store of specialized information
  - diseases
  - treatment methods
  - treatment outcomes
• basis of professional decision-making
• has evolved over time with respect to:
  - creation
  - synthesis
  - dissemination
Bader, JEBDP 2004
What is a Systematic Review?
• A “systematic review” comprehensively locates, evaluates and synthesizes all the available literature on a given topic using a strict scientific design which must itself be reported in the review.
• The aim of an SR is to be:
  • Systematic (e.g., in its identification of literature)
  • Explicit (e.g., in its statement of objectives, materials and methods)
  • Reproducible (e.g., in its methodology and conclusions)
• Goal: To efficiently integrate valid information and provide a basis for rational decision making.
60
Features of a Systematic Review
• Explicit criteria (reproducible)
• Efficient
  • It is impractical for even an expert to read all the literature published in his or her field. SRs are a succinct but robust format for practitioners who need to keep up to date.
• Well focused (PICO)
• Thorough (unpublished information may be included)
• Provides a context for studies and creates a sense of the “weight of evidence”
• Secondary data analysis
Why Systematic Reviews
• Annually 3 million articles are published in biomedical journals, and the doubling time of the biomedical literature is less than 20 months.
• You would need to read a dozen or more articles per day (365 days/yr.) to stay up to date.
• Not all articles are valid or useful for patient care.
• SRs provide a summary and context of the current state of knowledge (that is lacking if you only read a few articles in an area).
Quality of Evidence Pyramid
Meta-Analysis
Systematic Review
Randomized Controlled Trial
Cohort studies
Case Control studies
Case Series/Case Reports
Basic Research and Animal research
Guidelines
Questions come in two varieties:
• BACKGROUND QUESTIONS
  • Textbooks/Basic Sci Faculty
• FOREGROUND QUESTIONS
  • Clinical Faculty
  • Journal articles
  • Guidelines
[Diagram: Background vs. Foreground questions, from Dental School to Professional Practice]
Background Questions
• Are general questions about conditions, illnesses, syndromes and patterns of disease, and pathophysiology.
• "What is the typical clinical presentation of primary oral herpes?” or
• “Which teeth are most commonly affected during ECC?”
• Novices ask this type of question in a particular knowledge area in order to gain a general understanding of clinical issues.
• Best resources include textbooks and faculty.
Foreground Questions
• Foreground questions are about issues of patient care and clinical decision-making.
• Best resources:
  • guidelines
  • systematic reviews
Remember: Generally, it's not what you don’t know that causes problems; it's what you “know” that just ain't so….
Steps in Developing Systematic Reviews
66
Step 1: Identify an area of Uncertainty
• Diagnosis
  • How well does DIAGNODENT diagnose interproximal caries?
• Therapy
  • Should asymptomatic impacted third molars be extracted?
• Prognosis
  • How long will an implant last when used to replace a single anterior tooth lost due to trauma? Is it different if the tooth loss is due to perio?
• Harm or Causality
  • Do posterior inlays result in greater risk of tooth sensitivity compared with other posterior restorations?
Step 2: Frame it as an Answerable Question (PICO Format)
• P patients or populations
• I interventions
• C comparison group(s) or "gold standard"
• O outcome(s) of interest
P.I.C.O.
Patient or Problem
Intervention (a cause, prognostic factor, treatment, etc.)
Comparison Intervention (if necessary)
Outcomes
Tips for Building Questions
• Patient or Problem: Starting with your patient, ask “How would I describe a group of patients similar to mine?” Balance precision with brevity.
• Intervention: Ask “Which main intervention am I considering?” Be specific.
• Comparison: Ask “What is the main alternative to compare with the intervention?” Be specific.
• Outcomes: Ask “What can I hope to accomplish?” or “What could this exposure really affect?” Be specific.
Example
• Patient or Problem: In young adults, will asymptomatic impacted third molars cause ortho relapse or lead to problems better dealt with prophylactically?
• Intervention: Surgical extraction
• Comparison: Watchful waiting
• Outcomes: Reduction of ortho relapse, prevention of oral infections, reduction in surgical complications at an older age.
Step 3: Search for the Evidence
• Philosophy: Find all literature that is relevant and valid
• Eliminate studies with poor design
• Reduce potential for bias
  • Effect size (design effects)
  • Publication (no negative studies)
  • Author (COI)
  • Poor search strategies
Step 3: Search for the Evidence
• Establish inclusion and exclusion criteria
  • Type of study (RCTs, Cohort, Case-Control, Cross sectional)
  • Type of exposure and outcomes
    • Case Definition
    • Exposure Definition
    • Are Outcomes Important (to whom?)
71
Step 3: Search for the Evidence
• Develop Search Strategy
  • Electronic Databases
    • MEDLINE, EMBASE, Cochrane Library, etc.
    • Search filters (are they tested and sensitive/specific?)
  • Hand searching
  • Unpublished studies
    • Gray literature (conference proceedings, dissertations)
  • Reference lists
  • Personal communication
72
Step 4: Extract Data
• Apply inclusion and exclusion criteria
• Two stage review (title/abstract; full article)
• Two reviewers
• Rules for resolving disagreements
• Use predetermined forms
• Log reason for exclusion
73
Step 5: Analyze and Present Results
• Evidence Table
  • Research design
  • Subjects
  • Methods
  • Results
• Qualitative Summary
• Quantitative Summary
  • Heterogeneity
  • Meta-analysis
  • Sensitivity analysis
• Methodological Quality
  • allocation concealment
  • blinding
  • statistical analysis
  • funding/sponsorship
  • population (specificity)
  • intervention (specificity)
  • outcomes (specificity)
74
Step 6: Interpret and Review Results
• Have all the main outcomes been considered?
• Have data been presented about absolute change as a result of the intervention?
• Have any factors that may limit application been considered?
• Are the results consistent?
• Don’t confuse “no evidence of an effect” with “evidence of no effect”
75
Forest Plots
A quick look at meta-analysis
76
77
There’s a label to tell you what the comparison is and what the outcome of interest is.
78
At the bottom there’s a horizontal line. This is the scale measuring the treatment effect. Here the outcome is death, and towards the left the scale is less than one, meaning the treatment has made death less likely.
Take care to read what the labels say: things to the left do not always mean the treatment is better than the control.
79
The vertical line in the middle is where the treatment and control have the same effect: there is no difference between the two.
80
For each study there is an id
The data for each trial are here, divided into the experimental and control groups
This is the % weight given to this study in the pooled analysis
81
• Each study is given a blob, placed where the data measure the effect.
• The size of the blob is proportional to the % weight.
• The horizontal line is called a confidence interval and is a measure of how we think the result of this study might vary with the play of chance.
• The wider the horizontal line is, the less confident we are of the observed effect.
The label above the graph tells you what statistic has been used
The data shown in the graph are also given numerically
82
The pooled analysis is given a diamond shape, where the widest bit in the middle is located at the calculated best guess (point estimate), and the horizontal width is the confidence interval.
Definition of a 95% confidence interval: If a trial were repeated many times and an interval calculated each time, about 95% of those intervals would contain the true effect.
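The percentage weights and the pooled "diamond" on a forest plot can be reproduced with a short calculation. This sketch assumes a fixed-effect, inverse-variance model on the log odds ratio scale, which is one common choice and not necessarily the one behind the plot shown in the lecture; the trial names and numbers are hypothetical.

```python
from math import exp, log, sqrt


def pool_fixed_effect(studies, z=1.96):
    """Pool odds ratios by inverse-variance weighting.

    studies: list of (name, odds_ratio, ci_low, ci_high) with 95% CIs.
    Returns (pooled_or, pooled_low, pooled_high, percent_weights).
    """
    rows = []
    for name, odds_ratio, ci_low, ci_high in studies:
        # Recover the standard error of log(OR) from the reported CI width.
        se = (log(ci_high) - log(ci_low)) / (2 * z)
        rows.append((name, log(odds_ratio), 1 / se ** 2))
    total_weight = sum(w for _, _, w in rows)
    pooled_log = sum(y * w for _, y, w in rows) / total_weight
    se_pooled = sqrt(1 / total_weight)
    weights = {name: 100 * w / total_weight for name, _, w in rows}
    return (exp(pooled_log),
            exp(pooled_log - z * se_pooled),
            exp(pooled_log + z * se_pooled),
            weights)


# Hypothetical trials: (name, OR, lower 95% limit, upper 95% limit).
trials = [
    ("Trial A", 0.60, 0.30, 1.20),
    ("Trial B", 0.80, 0.60, 1.07),
    ("Trial C", 0.50, 0.20, 1.25),
]
pooled, low, high, weights = pool_fixed_effect(trials)
print(f"pooled OR = {pooled:.2f} ({low:.2f} to {high:.2f})")
for name, pct in weights.items():
    print(f"{name}: {pct:.1f}% weight")
```

Studies with narrower confidence intervals receive more weight, which is why they are drawn with larger blobs.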
At the end of the day….
What do we really want to know?
83
Can we believe it ?
• Bias-free search and inclusion criteria?
• Appraisal of methodology of primary studies?
• Consistent results from all primary studies?
  • If not, are the differences sensibly explained?
• Are the conclusions supported by the data?
84
If we believe it — does it apply to our patient?
• Is our patient (or population) so different from those in the primary studies that the results may not apply?
• Consider differences in:
  • Time: many things change.
  • Culture: both treatments and values of outcomes can be different.
  • Stage of illness or prevalence can affect results.
We believe it ! But….does it matter?
• Is the benefit worthwhile to our patient?
• Ask the patient about cultural values.
• Think about Relative Risk Reduction vs. Absolute Risk to our patient.
• Potential benefit is the absolute risk avoided in our patient = Absolute Risk Reduction (ARR)!
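The difference between relative and absolute risk reduction is easiest to see with numbers. In this sketch the event rates are hypothetical, and the number needed to treat (NNT = 1/ARR) is a standard companion measure added here for illustration rather than something named on the slide.

```python
def risk_reduction(control_event_rate, treated_event_rate):
    """Absolute and relative risk reduction, plus number needed to treat."""
    arr = control_event_rate - treated_event_rate   # Absolute Risk Reduction
    rrr = arr / control_event_rate                  # Relative Risk Reduction
    nnt = 1 / arr                                   # Number Needed to Treat
    return {"ARR": arr, "RRR": rrr, "NNT": nnt}


# Two hypothetical patients given the same treatment (it halves the risk).
high_risk = risk_reduction(0.20, 0.10)   # baseline risk 20%
low_risk = risk_reduction(0.02, 0.01)    # baseline risk 2%

for label, result in (("high-risk", high_risk), ("low-risk", low_risk)):
    print(f"{label}: RRR = {result['RRR']:.0%}, "
          f"ARR = {result['ARR']:.2%}, NNT = {result['NNT']:.0f}")
```

The relative risk reduction is identical for both patients, but the absolute benefit, and so the case for treating, is much larger for the high-risk patient.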
Is it a systematic review? Does it:
• define a four part (answerable) clinical question?
• combine Randomized Controlled Trials (RCTs)?
• describe PRE-DEFINED search methods?
• describe PRE-DEFINED inclusion criteria?
• describe PRE-DEFINED methodological exclusion criteria?
PICO Practice
Step 1: Key Clinical Question
“What is the effectiveness of semiannual fluoride varnish compared to semiannual fluoride gel in preventing dental caries in permanent teeth among caries-active adults?”
Egger et al., 2001
• Population: caries-active adults
• Intervention: semiannual fluoride varnish
• Comparison: semiannual fluoride gel
• Outcome: prevention of dental caries in permanent teeth
98
Source of Secondary Information
• Systematic Reviews
  • E.g., Cochrane Collaboration
• Guidelines
  • E.g., National Guideline Clearinghouse
Cochrane Collaboration
• An international organisation that aims to help people make well-informed decisions about healthcare by preparing, maintaining and promoting the accessibility of systematic reviews of the effects of health care interventions.
Cochrane Centres
South African
Australasian
Chinese
Brazilian
Nordic
German
San Antonio
Italian
Iberoamerican
French
Dutch
UK
Canadian
New England
San Francisco
Cochrane Library
106
107
Evaluation of Diagnostic Tests
108
Topics
1. How do we “know” something?
  • Scientific Reasoning
2. What are the elements and structure of scientific thinking?
  • Facts, Hypotheses, Theories, Paradigms
3. Research Designs and Control of Bias
4. Clinical Epidemiology
  • Sensitivity, Specificity, Predictive Value
5. Measurement in Dentistry
6. The Research Enterprise
110
Diagnostic Tests
• Purpose: to increase our certainty about the cause of a patient's illness
• Common Types:
  • Physical and history findings
  • Laboratory test
  • Radiography
  • “Other” technological findings (pulp tester, etc.)
111
Examples of Diagnostic Tests in Dentistry
• Caries
  • Visual, radiography, DIFOTI
• Pulpal necrosis
  • Electrical, thermal
• Soft tissue lesions
  • Biopsy, dye
• Periodontitis
  • Future attachment loss, PSR
• Malocclusion
  • Index, study models, ceph
112
Reduction of Diagnostic Information
• Scales
• Indexes
• Cut Points
• Basic Decision: Treat or No Treatment
113
Outcomes in Orthodontics
• Malocclusion is not a disease
• Outcomes based on clinician assumptions of patients' needs/desires
• Many dimensions need to be measured
  • Overjet, overbite, cross bite, etc.
114
Measurement Issues in Orthodontics
• Index: assign numerical rating
  • Diagnostic (Angle)
  • Epidemiological Index (Summers)
  • Treatment need (HLD, Salzmann, IOTN)
  • Treatment Outcome (PAR)
115
Valid and Reliable
Reliable but NOT valid
NOT reliable or valid
Reliable and valid
Can’t be valid unless reliable
116
What is validity in Ortho Index
• Measures dimensions of occlusion that are considered clinically important. These could be based upon:
  • Expert opinion
  • Clinical consequences (disease) or change
  • Patient values and desires
117
How to assess reliability
• Intra-rater
  • Have the same person rate the “case” more than once.
• Inter-rater
  • Have different people rate the “case”.
• Expressed as measures of rater agreement
  • Nominal (Kappa)
  • Categorical (Percent agreement, weighted Kappa)
  • Continuous (Correlation, ICC)
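Kappa corrects raw percent agreement for the agreement two raters would reach by chance. A minimal sketch of unweighted Cohen's kappa for two raters follows; the ratings are hypothetical and the function name is illustrative.

```python
from collections import Counter


def cohens_kappa(rater1, rater2):
    """Unweighted Cohen's kappa for two raters scoring the same cases."""
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("both raters must score the same non-empty set of cases")
    n = len(rater1)
    # Observed agreement: share of cases where both raters gave the same score.
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Expected agreement by chance, from each rater's marginal frequencies.
    counts1, counts2 = Counter(rater1), Counter(rater2)
    categories = counts1.keys() | counts2.keys()
    expected = sum(counts1[c] * counts2[c] for c in categories) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical treatment-need ratings of four cases by two examiners.
examiner_a = ["treat", "treat", "no", "no"]
examiner_b = ["treat", "no", "no", "no"]
print("percent agreement:", 3 / 4)
print("kappa:", cohens_kappa(examiner_a, examiner_b))
```

The same function can be used for intra-rater reliability by passing one examiner's first and second ratings of the same cases.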
118
Test Quality
• Diagnosis is an imperfect process - all tests have some inherent inaccuracy
• The “correct” diagnosis thus becomes a probability
• Understanding the mathematical performance of a test improves the clinician's decision making process.
119
Measures of the Quality of a Diagnostic Test
• Sensitivity
• Specificity
• Accuracy
• Predictive Value (positive and negative)
• The higher these numbers, the better the test.
120
The “Gold Standard”
• The definitive diagnostic technique
• Often expensive, elaborate, or difficult to perform.
• We are always looking for faster, cheaper, better ways to diagnose disease (and to determine treatment).
121
Sensitivity
• The proportion of people with the disease (per the Gold Standard) who have a positive test result.
• Relates Gold Standard to New Test.
• A sensitive test rarely misses people with disease.
• Sensitive tests should be selected when there is an important penalty for missing disease (e.g., cancer diagnosis)
122
Specificity
• The proportion of people without the disease who test negative.
• A specific test will rarely misclassify people without disease as diseased.
• Specific tests are used to “rule in” a diagnosis that has been suggested by other tests.
123
Accuracy of a Test
• The overall ability of a test to correctly classify a patient.
• (TP + TN) / (TP + FP + FN + TN); this equals (Sensitivity + Specificity) / 2 only when prevalence is 50%
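A short derivation may help show how overall accuracy relates to sensitivity and specificity. The symbols TP, FP, FN, TN (the four cells of the 2×2 table) and p (prevalence) are introduced here only for illustration.

```latex
\begin{align*}
\text{Accuracy} &= \frac{TP + TN}{TP + FP + FN + TN} \\
  &= \text{Sensitivity}\cdot p + \text{Specificity}\cdot(1 - p),
  \qquad p = \frac{TP + FN}{TP + FP + FN + TN} \\
  &= \frac{\text{Sensitivity} + \text{Specificity}}{2}
  \quad \text{when } p = 0.5
\end{align*}
```

Accuracy is therefore a prevalence-weighted average of the two measures, and the simple average applies only when half of the tested population has the disease.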
124
Predictive Value
• Positive predictive value is the probability of disease in a patient with an abnormal test result.
• Negative predictive value is the probability of no disease in a patient with a normal test result.
125
A new diagnostic test for periodontal disease
126
“PERIOCHECK®”
• A new diagnostic assay that the company claims “predicts” future periodontal attachment loss (LOA).
• Requires a “blood test” of 1 ml of blood placed into the “Periocheck” machine.
• Values of the test range from -5 to +5
• “Gold Standard” is actual attachment loss (measured prospectively).
127
A Validation Study for PERIOCHECK
• 300 subjects recruited into study
  • 2 edentulous exclusions
  • 8 medical complication exclusions
  • 4 refused upon consent
• 45% African American
• Mean age 49 ± 15 y
• Upon 2-year follow-up
  • 48 lost to follow-up
• Final Study Sample
  • 238 (79%)
  • 125 had LOA (52.5%)
  • 113 no LOA (47.5%)
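The sample-flow arithmetic on this slide can be checked with a few lines. This is only a sketch; the variable names are chosen for illustration and the figures are those stated on the slide.

```python
recruited = 300
exclusions = {"edentulous": 2, "medical complication": 8, "refused consent": 4}
lost_to_follow_up = 48

enrolled = recruited - sum(exclusions.values())    # subjects who entered the study
final_sample = enrolled - lost_to_follow_up        # subjects with 2-year outcome data
retention_of_recruited = final_sample / recruited  # share of the original 300

with_loa, without_loa = 125, 113
prevalence = with_loa / final_sample               # share who developed LOA

print(enrolled, final_sample)
print(f"{retention_of_recruited:.1%} retained, {prevalence:.1%} developed LOA")
```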
128
Distribution of Baseline PERIOCHECK values by future LOA
[Figure: overlapping frequency distributions of baseline Periocheck values (x-axis: Periocheck values, -4 to 4; y-axis: frequency, 0 to 35) for people who do NOT develop LOA and people who DO develop LOA, divided at the diagnostic cut point ≥ 0 into four regions]
• TN - True Negatives
• TP - True Positives
• FN - False Negatives
• FP - False Positives
129
Distribution of Baseline PERIOCHECK values by future LOA
[Figure: frequency distributions of baseline Periocheck values (-4 to 4) for people who do NOT develop LOA and people who DO develop LOA, with counts at the diagnostic cut point ≥ 0: TN = 91, TP = 109, FN = 16, FP = 22]

                        Gold Standard (eventual LOA)
Periocheck              Disease Present   Disease Absent   Total
Positive Test (≥ 0)           109               22
Negative Test (< 0)            16               91
Total                         125              113          238

Prevalence = 125/238 = 52.5%
131
Quality of Diagnostic Test
• Sensitivity - the proportion of people with disease who have a positive test.
• Specificity - the proportion of people without disease who have a negative test.
[Figure: distribution of Periocheck values with the TN, TP, FN and FP regions at cut point ≥ 0]

                        Gold Standard (eventual LOA)
Periocheck              Disease Present   Disease Absent   Total
Positive Test (≥ 0)           109               22
Negative Test (< 0)            16               91
Total                         125              113          238

Sensitivity = 109/125 = 87.2%
Specificity = 91/113 = 80.5%
Prevalence = 125/238 = 52.5%
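The measures defined so far can all be read off the four cells of the 2×2 table. The sketch below uses the Periocheck counts at cut point ≥ 0 (TP = 109, FP = 22, FN = 16, TN = 91); the class name `TwoByTwo` and its property names are hypothetical and serve only this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cells of a diagnostic 2x2 table (test result vs. gold standard)."""
    tp: int  # test positive, disease present
    fp: int  # test positive, disease absent
    fn: int  # test negative, disease present
    tn: int  # test negative, disease absent

    @property
    def total(self):
        return self.tp + self.fp + self.fn + self.tn

    @property
    def sensitivity(self):
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self):
        return self.tn / (self.tn + self.fp)

    @property
    def accuracy(self):
        return (self.tp + self.tn) / self.total

    @property
    def ppv(self):
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self):
        return self.tn / (self.tn + self.fn)

    @property
    def prevalence(self):
        return (self.tp + self.fn) / self.total


periocheck = TwoByTwo(tp=109, fp=22, fn=16, tn=91)
for name in ("sensitivity", "specificity", "accuracy", "ppv", "npv", "prevalence"):
    print(f"{name:12s} {getattr(periocheck, name):.1%}")
```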
133
Performance related to “Cut Point”
• “Cut point” is arbitrary and may be changed.
• It is a decision point that a clinician may wish to set for him/herself.
• Sensitivity and Specificity are inversely associated with one another and vary with the cut point
[Figure: distribution of Periocheck values with the TN, TP, FN and FP regions at cut point ≥ 0]

                        Gold Standard (eventual LOA)
Periocheck              Disease Present   Disease Absent   Total
Positive Test (≥ 0)           109               22
Negative Test (< 0)            16               91
Total                         125              113          238

Sensitivity = 109/125 = 87.2%
Specificity = 91/113 = 80.5%
Prevalence = 125/238 = 52.5%
[Figure: distribution of Periocheck values with the TN, TP, FN and FP regions at cut point ≥ -2]

                        Gold Standard (eventual LOA)
Periocheck              Disease Present   Disease Absent   Total
Positive Test (≥ -2)          124               35
Negative Test (< -2)            1               78
Total                         125              113          238

Sensitivity = 124/125 = 99.2%
Specificity = 78/113 = 69.0%
Prevalence = 125/238 = 52.5%
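To make the effect of moving the cut point concrete, here is a minimal sketch that reclassifies the same subjects at two cut points. The ten scores and outcomes are hypothetical (the study's raw Periocheck values are not given on the slides), and the function names are illustrative.

```python
def classify_at_cut_point(scores, has_disease, cut_point):
    """Count (TP, FP, FN, TN) when a score >= cut_point is a positive test."""
    tp = fp = fn = tn = 0
    for score, diseased in zip(scores, has_disease):
        positive = score >= cut_point
        if positive and diseased:
            tp += 1
        elif positive and not diseased:
            fp += 1
        elif diseased:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def sens_spec(tp, fp, fn, tn):
    """Sensitivity and specificity from the four cell counts."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical test values and eventual outcomes (True = developed LOA).
scores = [-3, -2, -1, -1, 0, 0, 1, 2, 2, 3]
has_disease = [False, False, False, True, False, True, True, False, True, True]

for cut in (0, -2):
    cells = classify_at_cut_point(scores, has_disease, cut)
    sens, spec = sens_spec(*cells)
    print(f"cut point >= {cut}: sensitivity {sens:.0%}, specificity {spec:.0%}")
```

Lowering the cut point catches more of the diseased subjects but mislabels more of the healthy ones, which is the same trade-off seen between the ≥ 0 and ≥ -2 tables.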
136
What we have so far
• At the cut point studied (i.e., 0):
  • For every 100 patients without disease we will correctly classify about 80 of them. (Specificity)
  • For every 100 patients with disease we will correctly classify about 87 of them. (Sensitivity)
137
Relationship of Sensitivity/Specificity to Cut Point
Cut Point   Sensitivity (%)   Specificity (%)
  -3             100                34
   0              95                71
   0.5            90                82
   1              83                91
   3              55                99
138
ROC Curves
• Relates changes in sensitivity and specificity to changes in cut point.
• Provides overall utility of test
• Suggests “optimal” cut point
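As a sketch of how an ROC curve summarizes a test, the code below turns the cut point table from the previous slide into ROC points, estimates the area under the curve with the trapezoid rule, and picks a cut point. Two assumptions are made here that the slides do not state: the curve is anchored at the corners (0, 0) and (100, 100), and “optimal” is taken to mean the largest Youden index (sensitivity + specificity - 100), which is only one of several possible criteria.

```python
# (cut point, sensitivity %, specificity %) from the cut point table.
table = [(-3, 100, 34), (0, 95, 71), (0.5, 90, 82), (1, 83, 91), (3, 55, 99)]

# ROC points: x = 100 - specificity (false positive rate), y = sensitivity.
roc_points = sorted((100 - spec, sens) for _, sens, spec in table)

# Assumed anchors so the curve spans the whole x-axis.
curve = [(0, 0)] + roc_points + [(100, 100)]


def trapezoid_auc(points):
    """Area under a piecewise-linear curve given in percent units (0 to 1 scale)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area / (100 * 100)


auc = trapezoid_auc(curve)

# Youden index per cut point; the largest value marks one candidate "optimal" cut point.
best_cut, best_j = max(
    ((cut, sens + spec - 100) for cut, sens, spec in table),
    key=lambda pair: pair[1],
)
print(f"AUC = {auc:.3f}, best cut point = {best_cut} (Youden J = {best_j})")
```

A criterion that weights false negatives and false positives differently (for example, when missing disease is costly) would select a different cut point from the same curve.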
139
[Figure: ROC curve. y-axis: Sensitivity (0 to 100); x-axis: 1 - Specificity (0 to 100); points on the curve labeled by cut point: -2, 0.5, 1, 1.5, 3]
140
[Figure: ROC curve. y-axis: Sensitivity (0 to 100); x-axis: 1 - Specificity (0 to 100); points on the curve labeled by cut point: -1, 0.5, 1, 1.5, 2]
141
142
[Figure: ROC curves. y-axis: Sensitivity (0 to 100); x-axis: 1 - Specificity (0 to 100); one curve labeled Area = .5 and one labeled Area = .91]
143
[Figure: ROC curve. y-axis: Sensitivity (0 to 100); x-axis: 1 - Specificity (0 to 100); the “optimal cut point” is marked on the curve]
144
What we actually get clinically
• People with a “positive” test
  • And we want to know how many really DO have disease
  • Positive Predictive Value - the proportion of people with a positive test who have disease.
• People with a “negative” test
  • And we want to know how many really DO NOT have disease.
  • Negative Predictive Value - the proportion of people with a negative test who do not have disease.
[Figure: distribution of Periocheck values with the TN, TP, FN and FP regions at cut point ≥ 0]

                        Gold Standard (eventual LOA)
Periocheck              Disease Present   Disease Absent   Total
Positive Test (≥ 0)           109               22          131
Negative Test (< 0)            16               91          107
Total                         125              113          238

Positive Pred = 109/131 = 83.2%
Negative Pred = 91/107 = 85.0%
Prevalence = 125/238 = 52.5%
146
Test performance and prevalence
• Sensitivity and Specificity are stable properties
• PPV and NPV are frequency (Prevalence) dependent properties
[Figure: distribution of Periocheck values with the TN, TP, FN and FP regions at cut point ≥ 0]

                        Gold Standard (eventual LOA)
Periocheck              Disease Present   Disease Absent   Total
Positive Test (≥ 0)           109               22          131
Negative Test (< 0)            16               91          107
Total                         125              113          238

Positive Pred = 109/131 = 83.2%
Negative Pred = 91/107 = 85.0%
Prevalence = 125/238 = 52.5%
[Figure: distribution of Periocheck values with the TN, TP, FN and FP regions at cut point ≥ 0, for a population with lower prevalence]

                        Gold Standard (eventual LOA)
Periocheck              Disease Present   Disease Absent   Total
Positive Test (≥ 0)           109              194          303
Negative Test (< 0)            16              806          822
Total                         125             1000         1125

Positive Pred = 109/303 = 36.0%
Negative Pred = 806/822 = 98.0%
Prevalence = 125/1125 = 11.1%
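The prevalence dependence of the predictive values can be sketched by holding sensitivity and specificity fixed and varying only prevalence. The function name `predictive_values` is hypothetical; sensitivity and specificity are taken from the cut point ≥ 0 table (109/125 and 91/113), and the two prevalences are those of the two tables above.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    # Expected share of the tested population falling in each cell.
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    tn = specificity * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


sens, spec = 109 / 125, 91 / 113

ppv_study, npv_study = predictive_values(sens, spec, 125 / 238)   # study sample
ppv_low, npv_low = predictive_values(sens, spec, 125 / 1125)      # lower prevalence

print(f"prevalence 52.5%: PPV {ppv_study:.1%}, NPV {npv_study:.1%}")
print(f"prevalence 11.1%: PPV {ppv_low:.1%}, NPV {npv_low:.1%}")
```

The low-prevalence PPV computed this way differs from the slide's figure in the first decimal place because the slide rounds the expected false positives to a whole number (194).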
149
Remember
• Sensitivity and Specificity are stable with changing prevalence, but will vary inversely with “cut point”.
• PPV/NPV vary by the prevalence of the population in which the test is administered.
• Best to use when uncertainty is high
  • Prevalence close to 50%
150
HIV Example (ELISA)
When used in premarital screenings
• Sensitivity - 98%
• Specificity - 99%
• Prevalence - 250/100,000
• PPV = 20%
• 2 million marriages / year in US
• HIV cases = 5,000
• For every 1000 correctly diagnosed, there will be 4000 false positives.
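The figures on this slide can be reproduced by counting expected cases directly. This sketch assumes 2,000,000 people are screened, since that is the population size that yields the slide's 5,000 HIV cases at a prevalence of 250 per 100,000; the variable names are illustrative.

```python
screened = 2_000_000                       # assumed number of people tested per year
cases = screened * 250 // 100_000          # prevalence 250 per 100,000
non_cases = screened - cases

true_positives = cases * 98 / 100          # sensitivity 98%
false_positives = non_cases * 1 / 100      # 1 - specificity = 1%

ppv = true_positives / (true_positives + false_positives)
false_per_1000_true = 1000 * false_positives / true_positives

print(f"cases = {cases}, PPV = {ppv:.1%}")
print(f"false positives per 1000 true positives = {false_per_1000_true:.0f}")
```

Even with 98% sensitivity and 99% specificity, roughly four of every five positive results are false at this prevalence, which matches the slide's PPV of about 20%.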
Research Ethics
• Subject safety
• Informed Consent
• Privacy and Confidentiality
• Adverse events
• Equipoise
151
Subject safety: Risk/benefit ratio.
Informed consent: Written consent. IRB approved. Full disclosure of risks.
Privacy and confidentiality: How information is protected from unauthorized observation, and how participants are to be notified of any unforeseen findings from the research that they may or may not want to know.
Adverse events: The investigator must consider how adverse events will be handled; who will provide care for a participant injured in a study and who will pay for that care are important considerations.
Equipoise: The investigator should be in a state of "equipoise," that is, if a new intervention is being tested against the currently accepted treatment, the investigator should be genuinely uncertain which approach is superior.
Ethical Issues in Human Research
• Autonomy
• Beneficence
• Justice
152
Autonomy: the obligation on the part of the investigator to respect each participant as a person capable of making an informed decision regarding participation in the research study. The investigator must ensure that the participant has received a full disclosure of the nature of the study, the risks, benefits and alternatives, with an extended opportunity to ask questions.
Tuskegee: Study of syphilis in Black men, conducted without telling them of their participation. Deception and lack of informed consent were used. Individuals were followed for 40 years without treatment.
Beneficence: the obligation on the part of the investigator to attempt to maximize benefits for the individual participant and/or society, while minimizing risk of harm to the individual. An honest and thorough risk/benefit calculation must be performed.
Justice: demands equitable selection of participants, i.e., avoiding participant populations that may be unfairly coerced into participating, such as prisoners and institutionalized children.
Components of Ethical, Valid Consent
• Disclosure
• Understanding
• Voluntariness
• Competence
• Consent
153
Disclosure: The potential participant must be informed as fully as possible of the nature and purpose of the research, the procedures to be used, the expected benefits to the participant and/or society, the potential of reasonably foreseeable risks, stresses, and discomforts, and alternatives to participating in the research.
Understanding: The participant must understand what has been explained and must be given the opportunity to ask questions and have them answered by one of the investigators.
Voluntariness: The participant's consent to participate in the research must be voluntary, free of any coercion or promises of benefits unlikely to result from participation.
Competence: The participant must be competent to give consent. If the participant is not competent due to mental status, disease, or emergency, a designated surrogate may provide consent if it is in the participant's best interest to participate.
Consent: The potential human subject must authorize his/her participation in the research study, preferably in writing, although at times an oral consent or assent may be more appropriate.
154
The End
Questions?