Using NDNQI For Quality Improvement at Jefferson Healthcare System

Improving the Evaluation of Measures of Patient Reported Outcomes using Content Validity Analysis: A Bayesian Randomized Equivalency Experiment

Byron J. Gajewski

Associate Professor of Biostatistics

Associate Professor of Nursing

w/ Coffland, Boyle, Bott, Leopold, Oberhelman, & Dunton

http://www.kumc.edu/

Valmi D. Sousa, PhD, APRN, BC

Associate Professor of Nursing

1958-2010

Background: Measures of Patient Reported Outcomes (PRO)

As evolving healthcare therapies, treatments, and policies are introduced in the U.S., it is more critical than ever to quickly develop new valid and reliable instruments for measuring patient reported outcomes

Examples:

o disease management (e.g. diabetes)

o fine motor function

o activities of daily living of older adults (e.g. bathing, dressing…)

o cognitive impairment

Background: Other Behavioral Measures (Scales and Instruments)

Nursing home staff perception of Culture Change

Nursing home staff perception of End-of-Life practices

Registered Nurse (RN) Job Enjoyment in hospitals

Background: Validating measures of PRO & other behavioral measures

Core: Psychometrics “Instruments” (“reliability”)

o Also known as “Scales,” or “paper & pencil questionnaires”

Background work

o Develop theory

o Draft potential questions (or items)

Validity

o Criterion Validity (Gold Standard)

• Often Gold Standard not directly measurable (Job Enjoyment)

o Alternative to Criterion Validity

• Content Validity: Elicit content experts’ opinion (“essential” or “relevance”)

• Construct Validity: Test on participants

– How many “constructs” are we measuring?

– Factor Analysis

Example Instrument: RN Job Enjoyment

strongly agree (6) to strongly disagree (1)

Nurses with whom I work would say that they:

# Item

1 Are fairly well satisfied with their jobs.

2 Would not consider taking another job.

3 Have to force themselves to work much of the time.

4 Are enthusiastic about their work almost every day.

5 Like their jobs better than the average worker does.

6 Feel that each day on their job will never end.

7 Find real enjoyment in their work.

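One-Factor analysis measurement model

[Figure: path diagram of the one-factor model, with the latent factor f, the seven standardized item responses Z1-Z7, and the correlations ρ1-ρ7 between f and each item]

ρj = corr (f, zj)

f = standardized latent domain score

o (mean = 0 variance = 1)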

zj = standardized response for item j

(1) Validate Factor Structure

(2) Good estimates of ρj

One-Factor analysis measurement model

(1) Validate Factor Structure

(2) Good estimates of ρj

(3) Sample sizes? 10/parameter estimate

7 means + 7 variances + 7 correlations = 21 parameters, so 21 × 10 = 210 subjects

Note also: se{g(ρj)}=1/ √n

Outline

Introduction

Content Validity

Related Literature

Study Design

o Exact Model

o Approximate model

o Hypotheses

o Sample size calculations and equivalency trial

Results

Discussion & Limitations

Introduction

Instrument validation methods

o Content validity

o Construct validity

Traditionally analyzed separately

Integrated analysis of content and construct validity (IACCV)

o Both datasets on the same metric

Accomplished using Bayesian methodology

o Expert data – prior distribution

o Participants’ data utilized to update prior information via a posterior distribution

Introduction

Previous use of combination of IACCV and Bayesian methodology

o Instrument measuring nursing home culture change

o Stable estimates of psychometric parameters via posterior distribution

o Useful results with small sample size

Introduction: Potential Impact of IACCV -> Efficiency!

~76,000 participants will be in instrument development studies in the next five years

Using IACCV reduces this number of participants by 37%, a large decrease in manpower and time

IACCV transfers some response burden from vulnerable participants to expert panels

o Disabilities/Cancer/low populations

Content Validity

Typical content validity procedures

o Content experts

o Instrument available to experts

o Content review tool

o Definition of the construct

o Assessment of item relevance using a four-point Likert scale

• 1 = content not relevant

• 2 = content somewhat relevant

• 3 = content quite relevant

• 4 = content highly relevant

Content Validity

Content validity index for each item

o Calculating proportion

o Focus on responses of “quite” and “highly” relevant

o Minimum item content validity index of 0.80

• At least 80% experts agree that an item is quite or highly relevant

Justification of the content validity index cut-point

o IACCV/Bayesian methodology valid?
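The item content validity index described above can be sketched in a few lines. This is only an illustration: the function name `item_cvi` and the ten ratings are hypothetical, not study data.

```python
def item_cvi(ratings):
    """Proportion of experts rating an item 3 ("quite") or 4 ("highly") relevant."""
    if not ratings:
        raise ValueError("need at least one expert rating")
    relevant = sum(1 for r in ratings if r >= 3)
    return relevant / len(ratings)


# Hypothetical ratings from ten experts for a single item (not study data).
example_ratings = [4, 4, 3, 4, 2, 3, 4, 4, 3, 1]
cvi = item_cvi(example_ratings)

# The slides use a minimum item content validity index of 0.80.
meets_cutoff = cvi >= 0.80
```

With these made-up ratings, eight of the ten experts chose 3 or 4, so the item sits exactly at the 0.80 cut-point.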

Purpose of Study

We hypothesize that experts equate the “relevance” and “correlation” scales

ρj = corr (f, zj)

f = standardized latent domain score

o (mean = 0 variance = 1)

zj = standardized response for item j

Purpose of Study

Relevancy Responses          Corresponding Correlation Scale
1 = Not Relevant         ↔   No Correlation       [0.0 – 0.10)
2 = Somewhat Relevant    ↔   Small Correlation    [0.10 – 0.30)
3 = Quite Relevant       ↔   Medium Correlation   [0.30 – 0.50)
4 = Highly Relevant      ↔   Large Correlation    [0.50 – 1.00)
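The hypothesized correspondence between the two scales can be written as a small lookup. This is a sketch only: the interval boundaries come from the slide, while the names `CATEGORIES` and `correlation_to_category` are made up for illustration.

```python
# Upper bounds of the half-open intervals on the slide:
# [0.0, 0.10), [0.10, 0.30), [0.30, 0.50), [0.50, 1.00)
CATEGORIES = [
    (0.10, 1, "Not Relevant", "No Correlation"),
    (0.30, 2, "Somewhat Relevant", "Small Correlation"),
    (0.50, 3, "Quite Relevant", "Medium Correlation"),
    (1.00, 4, "Highly Relevant", "Large Correlation"),
]


def correlation_to_category(rho):
    """Return (score, relevance label, correlation label) for 0 <= rho < 1."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must lie in [0, 1)")
    for upper, score, relevance, size in CATEGORIES:
        if rho < upper:
            return score, relevance, size
```

For example, `correlation_to_category(0.35)` falls in the third interval and returns the "Quite Relevant" / "Medium Correlation" pair.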

Overall Design

Bayesian design with two-group, randomized equivalency study

Registered Nurse Job Enjoyment Scale (Taunton et al. 2004)

o National Database of Nursing Quality Indicators TM (NDNQI®)

Role of the site coordinator

Subjects are volunteers from the NDNQI® site coordinator pool

Overall Design

[Figure: flow diagram. Site coordinators are randomized to either the Relevance Group (relevance scale, Not Relevant to Highly Relevant) or the Correlation Group (correlation categories, 0.00 to 1.00)]

Study Participants

Content experts

o Total 1,226 site coordinators emailed

o 397 eligible participants volunteered

o All are registered nurses

o 120 participants randomly chosen

o Participants randomized into two groups

• n₁ = n₂ = 60

o Oversampled by 22 to obtain a minimum of 98 completed surveys (see slides later)

Study Design

Tools created using Survey Monkey (http://www.surveymonkey.com/)

Participants received the respective link by email

Survey with 11 items

o Eight items based on Job Enjoyment (7 actual, 1 “sabotage”)

o Three items for basic demographics

Exact Model

Combining all expert responses for all items

g (ρjkm) = g (ρjm) + ejkm

m = 1 ‘relevance’ group and m= 2 ‘correlation’ group

r = 8 Job Enjoyment items

j = item 1,…,8

k = expert opinion 1,…,nm

ρjm = correlation between item and domain

(pooling within the mth group)

ejkm = normally distributed, mean 0, and variance σ²

Exact Model

g (ρjkm) = g (ρjm) + ejkm

ρjkm = kth expert opinion for jth item

o Transformation g(ρ) = 1/2 log{(1+ρ)/(1-ρ)}

o Allows for correlations of -1 to 1

o Related to xjkm (observed ordinal values)
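The transformation g is Fisher's z transformation, whose inverse is the hyperbolic tangent. A minimal sketch follows; the function names `g` and `g_inverse` are chosen here for illustration.

```python
import math


def g(rho):
    """Fisher z transformation: maps a correlation in (-1, 1) to the real line."""
    return 0.5 * math.log((1 + rho) / (1 - rho))


def g_inverse(z):
    """Back-transform from the z scale to the correlation scale."""
    return math.tanh(z)


# A correlation of 0.5 maps to about 0.5493 on the transformed scale.
z_half = g(0.5)
```

Working on the transformed scale is what makes a normal error term reasonable, since g(ρ) is unbounded while ρ itself is confined to (-1, 1).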

[Figure: measurement model for the correlation of the first item to its domain from six experts, showing ρ1, the expert-level correlations ρ11-ρ16, and the observed responses x11-x16]

Approximate Model

xjkm = µjm + exjkm

Modeling x’s on the ordinal scale 1-4

µjm = mean response for the jth item in the mth group

exjkm = normally distributed, mean 0, and variance σj²

Hypotheses

H1j :| ρj1 - ρj2 | < 0.25

• Exact Model

• g (ρjkm) = g (ρjm) + ejkm

H*1j :| µj1 - µj2 | < 0.5

• Approximate Model

• xjkm = µjm + exjkm

Posterior Calculations

Exact Model

o Complicated by cutpoints and untransformed scale inferences

o Posterior distribution calculations of ρjkm using Markov chain Monte Carlo (MCMC)

o WinBUGS

o Burned in 1,000 draws and used the next 10,000 iterations

Approximate Model

o Calculations are in closed form

o Can be done in Excel

o Easy sample size calculations
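The closed-form calculation for the approximate model can be sketched as a standard conjugate normal update. This assumes a known σ and a N(2.5, σ/√4) prior on each group mean (a prior worth four observations); the slides do not spell out the exact formulas, so the function names and the inputs below are hypothetical and the output is not expected to reproduce the results tables.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def posterior_mean_sd(xbar, n, sigma, prior_mean=2.5, prior_n=4):
    """Posterior mean and sd of a group mean, with the prior worth prior_n observations."""
    post_n = prior_n + n
    mean = (prior_n * prior_mean + n * xbar) / post_n
    sd = sigma / math.sqrt(post_n)
    return mean, sd


def equivalence_probability(xbar1, n1, xbar2, n2, sigma, margin=0.5):
    """Posterior mean, sd, and Pr(|mu1 - mu2| < margin) for two independent groups."""
    m1, s1 = posterior_mean_sd(xbar1, n1, sigma)
    m2, s2 = posterior_mean_sd(xbar2, n2, sigma)
    diff_mean = m1 - m2
    diff_sd = math.sqrt(s1 ** 2 + s2 ** 2)
    prob = (normal_cdf((margin - diff_mean) / diff_sd)
            - normal_cdf((-margin - diff_mean) / diff_sd))
    return diff_mean, diff_sd, prob


# Hypothetical inputs: both groups average 3.5 with 60 experts each and sigma = 1.
diff_mean, diff_sd, prob = equivalence_probability(3.5, 60, 3.5, 60, sigma=1.0)
```

Because every step is a weighted average or a normal tail area, the same arithmetic fits in a spreadsheet, which is the point the slide makes about Excel.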

Priors

Model         Parameter    Distribution                                Median (95% CrI)      Pr
Approximate   µjm          N (2.5, σj / √4), n0m = 4 & σ = 1           2.5 (1.52, 3.48)      - -
Approximate   µj1 - µj2    N (0, σj / √2)                              0.0 (-1.39, 1.39)     0.52
Exact         σ            1/σ² ~ U(0.01, 100), n0m = 8 & σ = 1/n      0.50 (0.03, 0.98)     - -
Exact         ρjm          g(ρjm) ~ N(0.5493, 1/ √8)                   0.50 (-0.15, 0.85)    - -
Exact         ρj1 - ρj2    Simulation                                  0.00 (-0.74, 0.74)    0.52
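The approximate-model rows of the priors table can be checked by hand. The sketch below assumes σj = 1 and reads the second argument of N(·, ·) as a standard deviation; under those assumptions it recovers the tabulated intervals and the prior probability of equivalence. The variable names are illustrative.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


Z975 = 1.959964  # 97.5th percentile of the standard normal

# Prior on each group mean: N(2.5, sd = sigma_j / sqrt(4)) with sigma_j = 1.
sd_mu = 1.0 / math.sqrt(4)
mu_interval = (2.5 - Z975 * sd_mu, 2.5 + Z975 * sd_mu)

# Prior on the difference of means: N(0, sd = sigma_j / sqrt(2)).
sd_diff = 1.0 / math.sqrt(2)
diff_interval = (-Z975 * sd_diff, Z975 * sd_diff)

# Prior probability that the difference lies inside the 0.5 equivalence margin.
prior_prob_equivalent = 2 * normal_cdf(0.5 / sd_diff) - 1
```

Rounded to two decimals these give (1.52, 3.48), (-1.39, 1.39), and 0.52, so the prior is close to neutral about equivalence before any expert data are seen.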

Equivalency Analysis

Posterior calculations

o Approximate and exact models

o Posterior median (50th percentile)

o 95% Credible Intervals (CrI)

• 2.5th percentile

• 97.5th percentile

o Pr(H1j)
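Given posterior draws of a difference (for example from MCMC), the three summaries listed above reduce to sorting and counting. The sketch below uses simulated normal draws as a stand-in for real sampler output; the function name, the stand-in distribution, and the 0.25 margin applied to it are assumptions for illustration only.

```python
import random


def summarize_draws(draws, margin):
    """Posterior median, 95% credible interval, and Pr(|difference| < margin)."""
    ordered = sorted(draws)
    n = len(ordered)

    def percentile(p):
        # Simple nearest-rank percentile on the sorted draws.
        return ordered[int(round(p * (n - 1)))]

    prob = sum(1 for d in draws if abs(d) < margin) / n
    return {
        "median": percentile(0.5),
        "lower": percentile(0.025),
        "upper": percentile(0.975),
        "prob_equivalent": prob,
    }


# Stand-in for 10,000 posterior draws of a difference in correlations (not study output).
rng = random.Random(0)
fake_draws = [rng.gauss(0.02, 0.04) for _ in range(10_000)]
summary = summarize_draws(fake_draws, margin=0.25)
```

The same function applies to either model: only the draws and the margin (0.25 on the correlation scale, 0.5 on the ordinal scale) change.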

Sample Size Calculations Equivalency Trial

Sample size calculations

o Each item observed as:

• Approx. normal random variable

• x̄j1 - x̄j2

– Mean µj1 - µj2

– Variance σj² (1/n1 + 1/n2)

o Prior distribution for µj1 - µj2

• Normal

– Mean 0

– Variance σj² (1/4 + 1/4)

• Equivalent to four in each group

Sample Size Calculations Equivalency Trial

Sample size for n1 + n2

Pr(|µj1 - µj2| < 0.5) > λ

OR

P_(µj1 - µj2)({|µj1 - µj2| < 0.5} | x̄j1, x̄j2, n1, n2, σj², 4, 4) > λ

(parameters to the right)

A = (x̄j1, x̄j2, n1, n2, σj², 4, 4)’

Sample Size Calculations Equivalency Trial

λ = 90%

x̄j1 - x̄j2 = -0.25 and σj² = 1

Suspect mean differences are “0”

Allow for deviation between 0 and boundary (0.5)

Find the minimal integer n=n1=n2:

Min{n | P_(µj1 - µj2)({|µj1 - µj2| < 0.5} | x̄j1 - x̄j2 = -0.25, n, A) > 0.90}

Noting the posterior distribution

[µj1 - µj2 | x̄j1 - x̄j2 = -0.25, n, A] ~ N(-0.25, √(2/(4+n)))

Sample Size Calculations Equivalency Trial

Plot indicates that n = 49 is sufficient

n=n1=n2=60 to account for possible dropout
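The search for the minimal n can be reproduced without the plot. The sketch below takes the posterior stated in the slides at face value, a normal distribution centred at -0.25 with standard deviation √(2/(4+n)), and steps n upward until the posterior probability of equivalence exceeds 0.90. The function names are illustrative.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def equivalence_probability(n, post_mean=-0.25, margin=0.5):
    """Pr(|mu1 - mu2| < margin) under N(post_mean, sd = sqrt(2 / (4 + n)))."""
    sd = math.sqrt(2.0 / (4 + n))
    return (normal_cdf((margin - post_mean) / sd)
            - normal_cdf((-margin - post_mean) / sd))


def minimal_n(threshold=0.90, max_n=1000):
    """Smallest per-group n whose equivalence probability exceeds the threshold."""
    for n in range(1, max_n + 1):
        if equivalence_probability(n) > threshold:
            return n
    raise ValueError("threshold not reached within max_n")


n_required = minimal_n()
```

Under these assumptions the search stops at n = 49 per group, which agrees with the value read off the plot.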

Response Rates

Relevance group (m=1) with 59 subjects

Correlation group (m=2) with 51 subjects

Response rates greater in the relevance group

o Beta-Binomial distribution with uniform priors

o Posterior probability = 0.9984

o 95% Credible Interval for the difference is (0.04, 0.23)

o Demonstrates correlation group has a significantly smaller response rate
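The response-rate comparison can be approximated by simulation. The sketch below assumes 60 experts were invited in each group, so 59 of 60 and 51 of 60 responded, and that each rate has a uniform Beta(1, 1) prior. The function name, seed, and number of draws are arbitrary, and Monte Carlo output will only approximate the figures on the slide.

```python
import random


def posterior_rate_comparison(resp1, n1, resp2, n2, draws=100_000, seed=0):
    """Monte Carlo Pr(rate1 > rate2) and 95% interval for rate1 - rate2."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(draws):
        # Uniform prior plus binomial data gives a Beta(1 + successes, 1 + failures) posterior.
        p1 = rng.betavariate(1 + resp1, 1 + n1 - resp1)
        p2 = rng.betavariate(1 + resp2, 1 + n2 - resp2)
        diffs.append(p1 - p2)
    diffs.sort()
    prob_greater = sum(1 for d in diffs if d > 0) / draws
    interval = (diffs[int(0.025 * draws)], diffs[int(0.975 * draws)])
    return prob_greater, interval


# Relevance group: 59 of 60 responded; correlation group: 51 of 60 (assumed denominators).
prob_greater, interval = posterior_rate_comparison(59, 60, 51, 60)
```

The first return value estimates the posterior probability that the relevance group responds at a higher rate, and the second is the credible interval for the difference in rates.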

RN Demographics

Length of time in Position %

1-5 years in current position 40.4%

6-19 years in current position 34.8%

> 20 years in current position 24.8%

Experience

Total RN experience in US > 20 years 70%

Highest Academic Degree %

Diploma/Associate 9%

Baccalaureate 37%

Masters 47%

Doctorate 6%

Not Applicable 1%

Summary Statistics

                                                  m=1 ‘Relevance’ (n=59)    m=2 ‘Correlation’ (n=51)
                                                  Response %                Response %
#    Item                                          1    2    3    4          1    2    3    4
1    Satisfied with job                            0   10   25   64          0    6   26   68
2    Consider another job                          3   27   25   44          2   14   44   40
3    Force themselves to work                      5   15   29   51          8   10   20   62
4    Enthusiastic to work                          0   12   27   61          0    2   34   64
5    Like job better than average worker does      5   24   37   34          4   10   36   50
6*   Are clinically competent                     17   31   24   29         10   30   40   20
7    Feel job will never end                       7   25   25   42          6    8   34   52
8    Real enjoyment in their work                  0    2   22   76          0    6   16   78

Relevance responses (1=“not relevant” to 4=“highly relevant”)

Correlation responses (1=“0.00-0.10” to 4=“0.50-1.00”)

Summary Statistics

      m=1 (Relevancy)          m=2 (Correlation)
#     x̄1     n1    s1          x̄2     n2    s2
1     3.54   59    0.68        3.62   50    0.60
2     3.10   59    0.92        3.22   50    0.76
3     3.25   59    0.90        3.36   50    0.96
4     3.49   59    0.70        3.62   50    0.53
5     3.00   59    0.89        3.32   50    0.82
6*    2.64   59    1.08        2.70   50    0.91
7     3.03   59    0.98        3.32   50    0.87
8     3.75   59    0.48        3.72   50    0.57

Approximate Model (Relevancy-Correlation)

Differences

#     σ      E(µ1-µ2)   SE     2.5%-tile   97.5%-tile   Prob(H1*)
1     0.64   -0.08      0.12   -0.32       0.17         1.00
2     0.85   -0.12      0.16   -0.44       0.20         0.99
3     0.93   -0.11      0.18   -0.46       0.24         0.99
4     0.63   -0.13      0.12   -0.37       0.11         1.00
5     0.86   -0.32      0.17   -0.64       0.00         0.86*
6*    1.00   -0.06      0.19   -0.43       0.32         0.99
7     0.93   -0.29      0.18   -0.64       0.06         0.88*
8     0.52    0.03      0.10   -0.17       0.22         1.00

*Correlation groups had higher ratings

Exact Model Results (Relevancy-Correlation)

      m=1               m=2               Differences
#     E(ρ1)   sd(ρ1)    E(ρ2)   sd(ρ2)    E(ρ1-ρ2)   sd(ρ1-ρ2)   2.5%    97.5%   Pr(H1)
1     0.56    0.03      0.58    0.03      0.02       0.04        -0.05   0.09    1.00
2     0.45    0.03      0.47    0.03      0.02       0.04        -0.06   0.10    1.00
3     0.49    0.03      0.52    0.03      0.03       0.04        -0.04   0.11    1.00
4     0.54    0.03      0.57    0.03      0.03       0.04        -0.04   0.10    1.00
5     0.42    0.03      0.50    0.03      0.07       0.04         0.00   0.15    1.00
6*    0.35    0.03      0.36    0.03      0.00       0.04        -0.08   0.09    1.00
7     0.44    0.03      0.50    0.03      0.06       0.04        -0.01   0.14    1.00
8     0.62    0.03      0.61    0.03      0.00       0.04        -0.08   0.07    1.00

Median (95% CrI) for σ was 0.24 (0.22-0.26)

Discussion

The assumption that relevance and correlation correspond is substantiated

Comparing models

o Exact model

• Precise estimates on the correlation scale

• Can infer the proportion of experts scoring items with medium to large correlation

o Approximate model

• Conservative approach for calculating sample sizes

• If sample size is appropriate for the approximate model, the size is large enough for the exact model

Discussion

Combining m=1 (relevance) and m=2 (correlation)

#    Item                                        1(%)   2(%)   3(%)   4(%)   CVI    CVIa   CVIe
1    Satisfied with job                           0      8     25     66     0.91   0.81   0.92
2    Consider another job                         3     21     34     42     0.76   0.56   0.78
3    Force themselves to work                     6     13     25     56     0.81   0.61   0.85
4    Enthusiastic to work                         0      7     30     62     0.93   0.80   0.91
5    Like job better than average worker does     5     18     37     41     0.78   0.56   0.78
6    Are clinically competent                    14     31     31     25     0.56   0.37   0.60
7    Feel job will never end                      7     17     29     47     0.76   0.57   0.79
8    Real enjoyment in their work                 0      4     19     77     0.96   0.92   0.96

Limitations

(1) Negatively worded items and impact on the interpretation of correlations.

(2) Fewer responses in the “correlation” group than in the “relevance” group.

(3) The two tools may agree for spurious reasons?

(4) Lack of training of experts regarding correlation.

Conclusions

The relevance tool and the correlation tool were found to be equivalent

Content validity justified with correlation argument

Implications for replicating this method with other psychometric instruments

o Recall that se{g(ρj)}=1/ √n

o From exact model (Relevance group) se{g(ρj)}=0.037

• 12.4 participants per expert (about 10)

o Typically 6 experts are used, so the experts are worth 6*10=60 participants

• Originally we needed 210 participants; content validity reduces this to 150!
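The arithmetic behind these bullets can be traced step by step. The sketch assumes the relation se ≈ 1/√n quoted above and uses the 59 responders in the relevance group as the number of experts; the variable names are illustrative.

```python
se_exact = 0.037   # se{g(rho_j)} from the exact model, relevance group
n_experts = 59     # completed surveys in the relevance group

# Inverting se = 1/sqrt(n) gives the number of participants carrying the same information.
equivalent_participants = 1 / se_exact ** 2
per_expert = equivalent_participants / n_experts

# The slides round 12.4 down to about 10 participants per expert.
typical_experts = 6
participants_replaced = typical_experts * 10
remaining_participants = 210 - participants_replaced
```

This gives roughly 730 equivalent participants, or about 12.4 per expert, and a reduced requirement of 150 participants when a panel of six experts is used.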

Acknowledgments

Thanks to site coordinators and NDNQI staff member Kim Boyle

This research is supported by:

o Contract from the American Nurses Association, NDNQI (PI: Nancy Dunton)

o University of Kansas Research Institute Bridging Grant (Chair Peter Smith)

o Department of Biostatistics (Chair Matt Mayo)

o School of Nursing Office of Grants and Research (Associate Dean Marge Bott)

o Lauren Aaronson & Carol Smith (grant writing mentors)
