Field Reliability of Static-99R and diagnosis in WIC 6600 Evaluations Joseph Lockhart, PhD, ABPP Melinda DiCiro, PsyD, ABPP FMHAC 2020

Page 1

Field Reliability of Static-99R and diagnosis in WIC 6600 Evaluations

Joseph Lockhart, PhD, ABPP

Melinda DiCiro, PsyD, ABPP

FMHAC 2020

Page 2

Thanks to all who contributed to this study over nearly two years!

• Anna Brennan
• Jim Rokop
• Administration
• …and others

Page 3

Questions the Study was designed to answer

• How consistent are raters in scoring the risk instrument (Static-99R)?

• (The only other SVP study, conducted in Texas, found poor interrater reliability – Boccaccini et al., 2009)

• What are the most common diagnoses?

• Do the raters show adequate diagnostic agreement? (Other studies found varied kappa values.)

• Are there differences between employees and independent evaluators (IEs) in their ratings?

Page 4

Why do we care about field reliability?

• “Field studies we would argue provide evidence concerning an instrument’s psychometric properties that is more generalizable to real-world cases specifically because the data were collected under similar circumstances.” (Edens and Boccaccini, 2017)

Page 5

RELIABILITY & VALIDITY

• Reliability is the degree to which an assessment tool produces stable and consistent results.

• Validity refers to how well a test measures what it is purported to measure.

• Why is reliability necessary?

• While reliability is necessary, it alone is not sufficient. For a test to be useful, it also needs to be valid. For example, if your scale is off by 5 lbs, it reads your weight every day with a deficit of 5 lbs. The scale is reliable because it consistently reports the same weight every day, but it is not valid because it subtracts 5 lbs from your true weight.
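The scale example can be sketched in a few lines of Python. This is only an illustration: the true weight, the noise level, and the function name `simulate_scale` are hypothetical choices, not part of the study.

```python
import random
import statistics

def simulate_scale(true_weight, bias, noise_sd, days, seed=0):
    """Daily readings from a scale with a fixed offset and small random noise."""
    rng = random.Random(seed)
    return [true_weight + bias + rng.gauss(0, noise_sd) for _ in range(days)]

true_weight = 150.0  # hypothetical true weight in lbs
readings = simulate_scale(true_weight, bias=-5.0, noise_sd=0.1, days=30)

# Small day-to-day spread: the scale is consistent (reliable).
spread = statistics.stdev(readings)
# Systematic error near -5 lbs: the scale is not accurate (not valid).
offset = statistics.mean(readings) - true_weight

print(f"day-to-day SD: {spread:.2f} lbs")
print(f"mean error:    {offset:.2f} lbs")
```

The two numbers are independent of each other: shrinking the noise improves reliability but leaves the 5 lb offset untouched.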

Page 6

How to weigh yourself

This method might be reliable, but is it valid?

Page 7

Why do we care about bias and error?

• Threaten reliability
• Violate ethical principles
• Violate scientific principles
• Demonstrated forensic evaluator vulnerability
• Demonstrated human vulnerability
• Effective ways to mitigate bias and error exist

Page 8

Why test field reliability?

• Reliability of outcomes is valued
• Field conditions introduce bias and error
• Do chosen methods mitigate potential bias and error?
  • Instruments
  • Diagnostic schemes
  • Training
  • QA

Page 10

Bias threatens reliability, especially under field conditions

Agencies & Adversaries
• Training
• Methods
• Perspectives
• Incentives

Evaluators
• Thinking too fast
• Human vulnerability
• Heuristic cognitive bias
• Individual biases

Page 11

Ethics Codes and Guidelines, Bias & Error

Key Ethical Principles
• Justice
• Respect for persons
• Integrity

Forensic Guidelines
• Impartiality
• Avoiding conflicts of interest
• Mitigating impact of personal bias
• Reliable sources and methods

Page 12

Scientific Principles Decry Bias

• Objectivity

• Neutrality

• Reproducibility

National Research Council, National Academy of Sciences, Strengthening Forensic Science (2009): rigorous protocols to control for bias and error.

Similarities between forensic science and Forensic Mental Health Evaluation
• Cognition
• Understanding, analysis, and interpretation of data
• Perception
• Decision making

Mitigate bias with scientific principles and research-based methods
• Work like a scientist, not a clinician
• Rival hypothesis testing
• Use standardized methods
• Use certification programs

Zapf & Dror (2017) Understanding and Mitigating Bias in Forensic Evaluation. International Journal of Forensic Mental Health

Page 13

Frye and Daubert Admissibility

Standards and Requirements

Frye: The method is generally accepted. Standard for the admissibility of an expert's scientific testimony, established in Frye v. United States, 293 F. 1013 (D.C. Cir. 1923).

Daubert: The theory or technique in question

1. Can be and has been tested

2. Has been subjected to peer review and publication

3. Has known or potential error rate

4. Has standards controlling its operation

5. Has widespread acceptance within a relevant scientific community.

US Supreme Court case, Daubert v. Merrell Dow Pharmaceuticals Inc., 509 U.S. 579 (1993).

Page 14

California admissibility is based on People v. Kelly (1976), similar to Frye, and Sargon v. USC (2012), mentioning Daubert.

(1) the reliability of the method must be established, usually by expert testimony, and

(2) the witness furnishing such testimony must be properly qualified as an expert to give an opinion on the subject.

CA Supreme Court, People v. Kelly, (1976).

Page 15

Adversarial Bias & 3rd Party Allegiance

Demonstrated Vulnerability of Forensic Evaluators to Bias

Results and scores favor retaining party or align with agency perspective

• Murrie, Boccaccini, Guarnera, & Rufino (2013)
• Murrie, Boccaccini, et al. (2009; 2013)
• Levinson (2004)
• Murrie, Boccaccini, Johnson, & Janke (2008)
• Murrie et al. (2008)
• Murrie & Boccaccini (2015)
• Chevalier, Boccaccini, Murrie, & Varela (2015)


Page 16

Inherent subjectivity

Demonstrated Vulnerability of Forensic Evaluators to Bias

More subjective indicators are more subject to bias
• Murrie, Boccaccini, Guarnera, & Rufino (2013)
• Murrie, Boccaccini, et al. (2009)
• Murrie et al. (2008)
• Guarnera & Murrie (2017)

Less bias with more objective evidence
• Murrie, Boccaccini, Guarnera, & Rufino (2013)
• Murrie, Boccaccini, et al. (2009)

Page 17

Boundaries of cognition

Demonstrated Vulnerability of Forensic Evaluators to Bias

Evaluators subject to irrelevant information
• Murrie, Boccaccini, Guarnera, & Rufino (2013)
• Zapf & Dror (2017)

More bias/unreliability with adjustments to actuarials
• Hanson, Helmus, & Harris (2015)
• Storey, Watt, Jackson, & Hart (2012)
• Wormith, Hogg, & Guzzo (2012)

“Bounded rationality” of the brain for complex configural analysis tasks
• Faust and Faust (2012)

Page 18

Individual biases and preferences

Demonstrated Vulnerability of Forensic Evaluators to Bias

Bias and results vary by evaluator
• Murrie, Boccaccini, Guarnera, & Rufino (2013)
• Boccaccini, Turner, & Murrie (2008)
• Murrie (2008)

Page 19

Expert status and experience

Demonstrated Vulnerability of Forensic Evaluators to Bias

Experience and number of evaluations completed are not protective

• More divergence with evaluations (n=2)
  • Boccaccini, Turner, & Murrie (2008)

• Expertise builds bias traps
  • Zapf, P., & Dror, I. (2017)

• More experience, more “bias blind spots” and ineffective mitigation
  • Zapf, P. A., Kukucka, J., Kassin, S. M., & Dror, I. E. (2018)

Page 20

Misguided bias mitigation strategies

Demonstrated Vulnerability of Forensic Evaluators to Bias

• Myth that willpower and introspection reduce bias
  • Zapf, P. A., Kukucka, J., Kassin, S. M., & Dror, I. E. (2018)
  • Kukucka, Kassin, Zapf, & Dror (2017)

• Bias blind spots
  • Pronin, Lin, & Ross (2002)
  • Neal & Brodsky (2014)

Page 21

Sources of Evaluator Bias

• Cognitive architecture of the brain
• Training and motivation
• Social interaction
• Base rate expectations
• Irrelevant case information
• Reference materials
• Case evidence

Zapf, P., & Dror, I. (2017). Understanding and mitigating bias in forensic evaluation. International Journal of Forensic Mental Health

Page 22

More well-established sources of human bias and error (a partial list)

• Confirmation bias
• Earlier findings
• Diagnostic momentum
• Motivated reasoning
• Overconfidence
• Dunning Kruger Effect
• Limits on cognition (Faust 2012)
• Memory fallibility
• Nonlinearity
• Too Much Information

Page 23

Mitigating biases and error

Allegiance
• Standardized procedures
• Reduce incentives
• Uniform requirements
• Neutral organization
• Avoid in house solutions
• Diverse training

Base rate neglect
• Use standardized instruments
• Know base rates
• Understand effects of low base rates
• Avoid anecdotes

Confirmation
• Use standardized instruments
• Force review of data that disconfirms
• Identify alternative hypotheses
• Be ready to revise
• Minimize contamination

Expert Overconfidence
• Use standardized instruments
• Mitigate Dunning Kruger effect
• QA
• Don’t over-rely on unique data to demonstrate expertise
• Monitor drift

Limits on Cognition
• Use standardized instruments
• Algorithms
• Streamline data
• Mask irrelevant data

Thinking too fast
• Use standardized instruments
• Use algorithms
• Put your thinking on ice
• Think slow

Individual prejudice and preferences
• Use standardized instruments and methods
• Verify research-base
• Subject to court of public opinion

Subjectivity
• Use standardized instruments
• Use external metrics
• Use checklists
• Use instruments with firm rules

Page 26

Static-99R

Standardization
• .8 to .9 ICC

Field
• > .80 ICC (Boccaccini et al., 2012; Hansen et al., 2014; Hanson and Morton-Bourgon, 2009)
• Overall ICC = .78 [.64, .90] for Static-99R total score in a sample of 55 California parole and probation officers (Hanson, R. K., Thornton, D., Helmus, L., & Babchishin, K. M., 2016)
• .88 vs. .73 (pre-2003 coding) (Rice et al., 2014)

Page 27

DSM-5 Reliability of Diagnoses

ASPD
• K .21 (trials)
• K .51, Levenson 2004 (SVP)
• Pa .76, Packard & Levenson 2006

Pedophilic Disorder
• .41, Seto et al. 2016
• K .65, Levenson 2004 (SVP)
• Pa .85, Packard & Levenson 2006

Substance Use Disorders
• K .40 (trials)
• K .43, Levenson 2004 (SVP)
• Pa .71, Packard & Levenson 2006

Paraphilic Disorders
• K .30 to .47, Levenson 2004 (SVP)
• Pa .68 to .97, Packard & Levenson 2006

Page 28

Static-99R, DSM-5 & ICD-11: reliability and bias protections

Use congruent with ethical guidelines

Use congruent with scientific principles

Generally accepted

• Standardized definitions

• Consensus-based

• Base rates described

• Mitigate memory fallibility

• Mitigate over-reliance on expert status, overconfidence, gut and intuition

But DSM-5 has poor or unknown reliability for SVP-applicable disorders

DSM-5 Field trials
• ASPD Kappa .21
• EtOH Use Disorder Kappa .40
• Not funded for paraphilias

Page 29

Key Strategies; Organizations

Improving reliability (thus validity) in sex offender evaluations

Kill Idols! Slay Egos!

USE
• Actuarial risk assessments
• Consensus-based diagnostic schemes
• External review & guardrails
• Training

TO
• Mitigate adversarial and agency bias
• Increase accuracy
• Mitigate individual biases and error

Page 30

Sample Description

• Inmates with qualifying offenses are screened prior to their release (screening was done by DSH and, for the past two years, by BPH).

• Those who are screened as potential SVPs are referred for a full evaluation.

• Sample thus consists of pre-screened “potential” SVPs (none low-level)

• Two separate evaluators per case.

Page 31

Sample Description

• DSH maintains a large SVP database of the full evals covering several years.

• Variables include the Static-99R and diagnosis.

• Database records prior to 2012 could contain Static-99 rather than Static-99R scores (and may be less complete).

• Rather than analyzing (and cleaning) all data, we decided to analyze a random subset.

• 200 “negative” cases chosen randomly
• 50 “positive” cases chosen randomly
• Two evaluators per case

Page 32

Sample Description

• The Department of State Hospitals (DSH) Forensic Services Division (FSD) uses a database to capture and store all referral and evaluation data for the Sexually Violent Predator (SVP) program. The database is called the Sex Offender Commitment Program Support System (SOCPSS).

• The SOCPSS holds over 30,000 records relating to Sexually Violent Predator referrals. During the timeframe queried, 2012 – 2017, there were 14,089 referrals and 10,912 initial evaluations conducted.

• To gather the provided sample, the SOCPSS was queried to collect all evaluations completed between the years 2012 and 2017. Evaluations were then separated into categories of positive or negative outcomes, and a random sample of 200 negative evaluations and 50 positive evaluations was taken using the RAND function in Excel. Each evaluation selected included the following information: Evaluator ID, inmate name, DOB, CDCR#, Static-99R scores, diagnoses, evaluation received date, and the evaluation decision.
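The sampling step described above was done with Excel's RAND function. As a rough equivalent only, a stratified random draw could be sketched in Python as follows; the record layout (`eval_id`, `decision`) and the toy data are hypothetical stand-ins for the SOCPSS query results.

```python
import random

def stratified_sample(evaluations, n_negative=200, n_positive=50, seed=None):
    """Draw a fixed number of negative and positive evaluations at random."""
    rng = random.Random(seed)
    negatives = [e for e in evaluations if e["decision"] == "negative"]
    positives = [e for e in evaluations if e["decision"] == "positive"]
    # random.sample draws without replacement, so no evaluation is picked twice.
    return rng.sample(negatives, n_negative) + rng.sample(positives, n_positive)

# Hypothetical stand-in for the queried evaluations (not real data).
evaluations = [
    {"eval_id": i, "decision": "positive" if i % 10 == 0 else "negative"}
    for i in range(5000)
]

sample = stratified_sample(evaluations, seed=42)
print(len(sample))
```

Fixing the seed makes the draw reproducible, which a spreadsheet RAND column does not offer unless the values are frozen after sorting.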

Page 33

Inter-rater Reliability on the Static-99R

Inter-rater reliability on continuous scales is typically measured by the ICC statistic

Page 34

Characterizing Static-99R Inter-rater reliability using the ICC

ICC           Interpretation
> 0.9         Excellent
0.75 – 0.89   Good
0.5 – 0.75    Moderate
< 0.5         Poor agreement

Source: Koo & Li (2016), A Guideline of Selecting and Reporting Intraclass Correlation Coefficients for Reliability Research
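The slides do not say which ICC form was computed. As a minimal sketch, assuming a two-way random-effects, absolute-agreement, single-rater ICC (often written ICC(2,1)), the statistic and the interpretive bands above could be coded as follows. The function names and the example scores are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a list of rows, one row per case, one score per rater."""
    n = len(ratings)        # number of cases
    k = len(ratings[0])     # number of raters per case
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between cases
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

def icc_label(icc):
    """Map an ICC to the bands on the slide (0.90 itself is treated as excellent)."""
    if icc >= 0.9:
        return "excellent"
    if icc >= 0.75:
        return "good"
    if icc >= 0.5:
        return "moderate"
    return "poor"

# Hypothetical paired scores: (rater 1, rater 2) for six cases.
scores = [(2, 2), (4, 5), (6, 6), (3, 3), (7, 6), (1, 1)]
value = icc_2_1(scores)
print(round(value, 3), icc_label(value))
```

Treating 0.90 as "excellent" follows the labeling used on the results slide; a different ICC form (for example, a consistency rather than agreement definition) would give somewhat different values.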

Page 35

Comparing Static-99R ICCs for final outcomes

“Negative” final outcome
ICC                  Lower bound   Upper bound
0.90 (“excellent”)   0.87          0.92

“Positive” final outcome
ICC                  Lower bound   Upper bound
0.81 (“good”)        0.69          0.89*

*may be lowered due to restricted range

Page 36

Static-99R Inter-rater score differences

Score difference   Cases (%)
0                  130 (52%)
1                  88 (35%)
2                  21 (8%)
3                  6 (3%)
4                  5 (2%)
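The counts above can be summarized with a little arithmetic. The sketch below assumes the first column is the absolute point difference between the two raters and the second is the number of cases; the variable names are illustrative.

```python
# Counts taken from the slide: score difference -> number of cases.
differences = {0: 130, 1: 88, 2: 21, 3: 6, 4: 5}

total_cases = sum(differences.values())

# Share of cases where the two raters were at most one point apart.
within_one_point = (differences[0] + differences[1]) / total_cases

# Average absolute difference between the two raters' scores.
mean_abs_difference = sum(d * c for d, c in differences.items()) / total_cases

print(total_cases)
print(f"{within_one_point:.1%} within one point")
print(f"{mean_abs_difference:.3f} mean absolute difference")
```

The total matches the 250 sampled cases (200 negative plus 50 positive), and most pairs of raters agree exactly or differ by a single point.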

Page 37

Summary: Static-99R Inter-rater Reliability Results

All results were in the “good” or “excellent” range. The ICC for the “positive” final outcome group is likely smaller due to a restricted range of Static-99R scores (i.e., few low scores in the “positive” group).
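The restricted-range effect can be illustrated with a small simulation. This is a sketch on made-up continuous scores, not Static-99R data: every case gets a hypothetical "true" score, two raters each add independent noise, and a simple one-way ICC is compared for all cases against high scorers only. The one-way ICC form, the distributions, and the cutoff are all assumptions chosen for illustration.

```python
import random

def icc_one_way(pairs):
    """One-way random-effects ICC for exactly two ratings per case."""
    n = len(pairs)
    grand = sum(a + b for a, b in pairs) / (2 * n)
    ms_between = 2 * sum(((a + b) / 2 - grand) ** 2 for a, b in pairs) / (n - 1)
    ms_within = sum((a - b) ** 2 / 2 for a, b in pairs) / n
    return (ms_between - ms_within) / (ms_between + ms_within)

rng = random.Random(1)
cases = []
for _ in range(5000):
    true_score = rng.gauss(3, 2)              # hypothetical underlying score
    rater_a = true_score + rng.gauss(0, 1)    # same rating noise for both raters
    rater_b = true_score + rng.gauss(0, 1)
    cases.append((rater_a, rater_b, true_score))

all_pairs = [(a, b) for a, b, _ in cases]
high_pairs = [(a, b) for a, b, t in cases if t >= 5]  # restricted to high scorers

icc_all = icc_one_way(all_pairs)
icc_high = icc_one_way(high_pairs)
print(f"all cases:         {icc_all:.2f}")
print(f"high scorers only: {icc_high:.2f}")
```

The raters are equally noisy in both groups; the ICC drops in the restricted group only because there is less between-case variance for the raters to agree about.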

Page 38

Static-99R scores by final outcome

Page 40

Frequency of diagnoses by outcome

Page 41

Diagnostic frequency – Negative cases (400 ratings). *Percentages may sum to more than 100%

Pedophilic Disorder 161 (40%)

Antisocial Personality Disorder 98 (25%)

Alcohol Use Disorder 77 (19%)

Stimulant Use Disorder 36 (9%)

Cannabis Use Disorder 9 (2%)

Schizoaffective Disorder 11 (3%)

Psychotic Disorder/Schizophrenia 8 (2%)

Page 42

Diagnostic frequency – Positive cases (100 ratings)

Pedophilic Disorder 63

Antisocial Personality Disorder 19

Alcohol Use Disorder 18

Stimulant Use Disorder 4

Cannabis Use Disorder 9

Psychotic Disorder/Schizophrenia 6

Other Specified Paraphilic Disorder (OSPD) 10

Exhibitionistic Disorder 10

Fetishistic Disorder 2

Frotteuristic Disorder 2

Page 43

Inter-rater diagnostic agreement for Pedophilic/non-Pedophilic Disorder (Cohen’s kappa)

Page 44

Characterizing Inter-rater diagnostic agreement using Cohen’s kappa

Kappa         Interpretation
> 0.9         Almost perfect
0.81 – 0.89   Excellent
0.61 – 0.80   Substantial
0.41 – 0.60   Moderate
0.21 – 0.40   Fair

Source: Landis and Koch
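For readers who want to see how the statistic is built, here is a minimal sketch of Cohen's kappa for a yes/no diagnosis made by two raters. The cell counts in the example are hypothetical and are not the study's data.

```python
def cohens_kappa(both_yes, a_only, b_only, both_no):
    """Cohen's kappa for two raters making a binary (yes/no) diagnosis."""
    n = both_yes + a_only + b_only + both_no
    p_observed = (both_yes + both_no) / n
    # Each rater's marginal rate of giving the diagnosis.
    a_yes = (both_yes + a_only) / n
    b_yes = (both_yes + b_only) / n
    # Agreement expected if the raters diagnosed independently at those rates.
    p_chance = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical 2x2 table: 100 cases, raters disagree on 10 of them.
kappa = cohens_kappa(both_yes=40, a_only=5, b_only=5, both_no=50)
print(round(kappa, 3))
```

Raw agreement in this example is 90%, but kappa is lower because roughly half of that agreement would be expected by chance alone.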

Page 45

Comparing Cohen’s kappa for Pedophilic/non-Pedophilic Disorder diagnostic agreement – results show “substantial” agreement

“Negative” final outcome
kappa   Lower bound   Upper bound
0.69    0.59          0.79

“Positive” final outcome
kappa   Lower bound   Upper bound
0.74    0.55          0.93

Page 46

Static-99 scores by Pedophilic/non-Pedophilic Disorder & Final Outcome

Page 48

Comparing DSH employees vs Independent Evaluators (IEs)

• Substantial differences in either Static-99R scores or final outcome opinions could indicate bias, as found in some studies (e.g., Chevalier et al., 2015).

• Differences in categorical outcome are typically measured by the chi-square statistic.

Page 49

Final Outcome opinion by DSH Employee vs IEs

• N=363, due to some independent evaluators becoming employees. RESULTS: The chi-square results indicate no systematic differences between how often employees/IEs came to positive/negative outcome decisions.

Final Outcome   IEs   Employees
Negative        125   166
Positive        32    40

Chi-square p-value: 0.924 (non-significant)
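The table can be checked with a short chi-square computation. The slide does not say whether a continuity correction was used; this sketch applies Yates' correction as an assumption, and under that assumption the p-value comes out close to the reported 0.924. The function name is illustrative, and the p-value formula holds only for a 2x2 table (1 degree of freedom).

```python
import math

def chi_square_2x2(table, yates=True):
    """Chi-square test of independence for a 2x2 table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                # Yates' continuity correction (an assumption, see lead-in).
                diff = max(diff - 0.5, 0.0)
            chi2 += diff ** 2 / expected
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Rows: negative, positive final outcome. Columns: IEs, employees.
table = [[125, 166], [32, 40]]
chi2, p_value = chi_square_2x2(table)
print(f"chi-square = {chi2:.4f}, p = {p_value:.3f}")
```

Passing `yates=False` gives the uncorrected statistic, which yields a smaller but still clearly non-significant p-value for this table.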

Page 50

Evaluator distributions

• Next, we look at distributions of the Evaluator’s Static-99 ratings depending on whether they are DSH employees or not.

• Major differences in Static-99 scores by DSH vs IEs could suggest bias or training issues.

• As there are many more DSH employees in our sample than IEs, we will put them on the same scale using a density function.
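Putting two groups of different sizes "on the same scale" amounts to comparing proportions rather than raw counts. A minimal sketch, using made-up scores and a hypothetical helper named `score_density`:

```python
from collections import Counter

def score_density(scores):
    """Proportion of ratings at each score, so groups of any size are comparable."""
    counts = Counter(scores)
    n = len(scores)
    return {score: counts[score] / n for score in sorted(counts)}

# Hypothetical Static-99R scores (not study data); the groups differ in size.
employee_scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6, 3, 2]
ie_scores = [2, 3, 3, 4]

employee_density = score_density(employee_scores)
ie_density = score_density(ie_scores)

for score in sorted(set(employee_density) | set(ie_density)):
    print(score, round(employee_density.get(score, 0.0), 2),
          round(ie_density.get(score, 0.0), 2))
```

Each group's proportions sum to 1, so the two curves can be overlaid directly. A smoothed kernel density estimate serves the same purpose for plotting; this sketch uses plain proportions to keep the idea visible.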

Page 52

Summary of study results

• Raters showed good or excellent consistency (ICC) in scoring the risk instrument (Static-99R)

• The most common diagnoses are pedophilia, ASPD, and substance use Disorders

• Static-99R scores and pedophilia dx are related to final outcome

• Raters show substantial diagnostic agreement (Cohen’s kappa) for Pedo/non-Pedo dx’s

• There are no significant differences between employees and IEs in their outcome opinions (per chi-square) or in their Static-99R ratings

Page 53

Potential Factors underlying results

• Allegiance
  • Fluidity of allegiance
  • Many evaluators work for both PD and DA; variety of evaluations
  • Absence of incentives
  • Absence of pressure
  • Same side

• Confirmation
  • Use of Static-99 - least vulnerable to “pull”
  • Force consideration of blind spot data
  • Lack of diagnostic momentum
  • Diagnoses justified by DSM criteria – “diagnosed” disorder
  • CDCR qualifying diagnosis is rare

• Base Rates
  • Knowledge of base rate
  • Force base rate consideration

• Limits on Cognition
  • Structured methods
  • Structured tools
  • Effective documentation and use of memory aids

Page 54

Potential Factors Mitigating Bias in SVP Evaluations

• Overconfidence
  • Training
  • Legal review for all
  • “Grandma” reasoning and pseudo-expert not tolerated

• Thinking too fast & subjectivity
  • Standardized assessment protocol
  • Selection of well trained and high integrity evaluators!

Page 55

Recommendations

Pre and Post

• Training
  • Standardized and rigorous
  • Regular refreshers for scoring tests

• QA
  • Robust
  • Review all DOPs and Positives
  • Review key indicators

• Hiring
  • Hire the best
  • Value integrity