Predictive Analytics for Readmission of Patients with ... · Bardhan et al.: Predictive Analytics...

This article was downloaded by: [71.164.205.228] On: 04 April 2015, At: 09:03 Publisher: Institute for Operations Research and the Management Sciences (INFORMS) INFORMS is located in Maryland, USA Information Systems Research Publication details, including instructions for authors and subscription information: http://pubsonline.informs.org Predictive Analytics for Readmission of Patients with Congestive Heart Failure Indranil Bardhan, Jeong-ha (Cath) Oh, Zhiqiang (Eric) Zheng, Kirk Kirksey To cite this article: Indranil Bardhan, Jeong-ha (Cath) Oh, Zhiqiang (Eric) Zheng, Kirk Kirksey (2015) Predictive Analytics for Readmission of Patients with Congestive Heart Failure. Information Systems Research 26(1):19-39. http://dx.doi.org/10.1287/isre.2014.0553 Full terms and conditions of use: http://pubsonline.informs.org/page/terms-and-conditions This article may be used only for the purposes of research, teaching, and/or private study. Commercial use or systematic downloading (by robots or other automatic processes) is prohibited without explicit Publisher approval, unless otherwise noted. For more information, contact [email protected]. The Publisher does not warrant or guarantee the article’s accuracy, completeness, merchantability, fitness for a particular purpose, or non-infringement. Descriptions of, or references to, products or publications, or inclusion of an advertisement in this article, neither constitutes nor implies a guarantee, endorsement, or support of claims made of that product, publication, or service. Copyright © 2015, INFORMS Please scroll down for article—it is on subsequent pages INFORMS is the largest professional society in the world for professionals in the fields of operations research, management science, and analytics. For more information on INFORMS, its publications, membership, or meetings visit http://www.informs.org


Information Systems Research
Vol. 26, No. 1, March 2015, pp. 19–39
ISSN 1047-7047 (print) | ISSN 1526-5536 (online)
http://dx.doi.org/10.1287/isre.2014.0553

© 2015 INFORMS

Predictive Analytics for Readmission of Patients with Congestive Heart Failure

Indranil Bardhan
Naveen Jindal School of Management, University of Texas at Dallas, Richardson, Texas 75080, [email protected]

Jeong-ha (Cath) Oh
J. Mack Robinson College of Business, Georgia State University, Atlanta, Georgia 30302, [email protected]

Zhiqiang (Eric) Zheng
Naveen Jindal School of Management, University of Texas at Dallas, Richardson, Texas 75080, [email protected]

Kirk Kirksey
University of Texas Southwestern Medical Center, Dallas, Texas 75390, [email protected]

Mitigating preventable readmissions, where patients are readmitted for the same primary diagnosis within 30 days, poses a significant challenge to the delivery of high-quality healthcare. Toward this end, we develop a novel predictive analytics model, termed the beta geometric Erlang-2 (BG/EG) hurdle model, which predicts the propensity, frequency, and timing of readmissions of patients diagnosed with congestive heart failure (CHF). This unified model enables us to answer three key questions related to the use of predictive analytics methods for patient readmissions: whether a readmission will occur, how often readmissions will occur, and when a readmission will occur. We test our model using a unique data set that tracks patient demographic, clinical, and administrative data across 67 hospitals in North Texas over a four-year period. We show that our model provides superior predictive performance compared to extant models such as the logit, BG/NBD hurdle, and EG hurdle models. Our model also allows us to study the association between hospital usage of health information technologies (IT) and readmission risk. We find that health IT usage, patient demographics, visit characteristics, payer type, and hospital characteristics are significantly associated with patient readmission risk. We also observe that implementation of cardiology information systems is associated with a reduction in the propensity and frequency of future readmissions, whereas administrative IT systems are correlated with a lower frequency of future readmissions. Our results indicate that patient profiles derived from our model can serve as building blocks for a predictive analytics system to identify CHF patients with high readmission risk.

Keywords: patient readmissions; healthcare information technologies; congestive heart failure; predictive healthcare analytics

History: Il-Horn Hann, Senior Editor; Sudip Bhattacharjee, Associate Editor. This paper was received on October 14, 2011, and was with the authors 13 months for 4 revisions. Published online in Articles in Advance November 24, 2014.

1. Introduction

Readmission of patients with chronic diseases is a significant and growing problem in the United States and an increasing burden on the healthcare system. Preventable patient readmissions cost the U.S. healthcare system about $25 billion every year, according to a study by PricewaterhouseCoopers (2010). Experts believe that high readmission rates, when patients are readmitted within 30 days of discharge, indicate that the nation’s hospitals are not adequately addressing patient health issues. To tackle this problem, the U.S. Centers for Medicare and Medicaid Services (CMS) has imposed penalties on hospitals for preventable readmissions related to chronic conditions such as heart failure or pneumonia, starting in 2012.

We develop a novel, predictive analytics model to predict patient readmission rates based on clinical, patient, and hospital information technologies (IT) characteristics. We focus on patients diagnosed with CHF, because this represents one of the first two health conditions that the Department of Health and Human Services (HHS) began to monitor, starting in 2012. Specifically, our research seeks to develop a healthcare analytics model that considers the propensity, frequency, and timing of patient readmissions. That is, for a given patient, we are interested in studying: (a) What is the likelihood of a future readmission? (b) How many future readmissions are likely to occur? And (c) when will the next readmission occur? Our model stands in sharp contrast to the existing readmission literature that mostly focuses on one or the other, but seldom addresses all of the above research questions in an integrated manner (Chin and Goldman 1997, Philbin and DiSalvo 1999, Krumholz et al. 2000, Silverstein et al. 2008, Kansagara et al. 2011). Our proposed beta geometric/Erlang-2 gamma (BG/EG) hurdle model addresses these research questions simultaneously.

A key aspect to improving the quality of healthcare and reducing patient readmissions is the implementation of health information technology (HIT). The use of HIT has the potential to improve healthcare quality, reduce readmission rates and costs, and consequently increase productivity (Congressional Budget Office 2008, Miller and Tucker 2011). Past studies have primarily focused on the impact of HIT on the productivity and operational efficiency of hospitals and/or healthcare providers (Bardhan and Thouin 2013, Das et al. 2011, Hillestad et al. 2005). However, to the best of our knowledge, they have not analyzed the impact of HIT on patient-level readmissions (e.g., see the extensive review by Kansagara et al. 2011). In their commentary on digital transformation of healthcare, Agarwal et al. (2010, p. 796) identify the measurement and quantification of HIT payoff and its impact on patient care outcomes as a significant area for future research. Our research fills this gap in the literature by examining whether adoption of HIT is associated with a reduction in the readmission risk of CHF patients, through our proposed model.

To fully account for patient and hospital heterogeneity, it is important to conduct a comprehensive study based on a large, longitudinal panel of patients across multiple hospitals to evaluate the readmission risk of patients. We obtained such a unique data set, which tracks a large panel of patient admissions across hospitals in North Texas. The data was gleaned from hospitals’ administrative claims systems electronically and integrated across all hospitals in the region using a unique master patient index.

Our results indicate that health IT applications, patient demographics, payer type, admission condition, and comorbidities are important determinants of patient readmission risk. Our model offers a more nuanced view of patient readmissions when we differentiate the propensity of (initial) readmission from the frequency of (future) readmissions. We find that usage of cardiology information systems is associated with a reduction in the propensity and frequency of future readmissions, whereas administrative IT system usage is correlated with a lower frequency of future readmissions. Furthermore, we compare the predictive performance of our model with several state-of-the-art models that have been proposed in the extant literature. Our “horse race” experiment demonstrates the superiority of our model in predicting both the incidence and timing of future readmissions. Because improved prediction is a foundational step toward mitigating future readmissions, our model can serve as an integral component of a predictive healthcare analytics tool to better profile, predict, and take preventive actions on patients with high readmission risk.

2. Background

2.1. Readmission Analytics

Extant readmission studies have typically been based on a single hospital using relatively small samples, or are restricted to a specific cohort such as elderly patients (e.g., Joynt et al. 2011, Shelton et al. 2000, Silverstein et al. 2008), veterans (e.g., Muus et al. 2010), or specific racial and income groups (e.g., Philbin et al. 2001). A few studies have used data from several hospitals (e.g., Philbin et al. 2001), but usually they belong to the same hospital system (Amarasingham et al. 2010, Deswal et al. 2004, Silverstein et al. 2008), overlooking the possibility of patient admissions across multiple (disparate) hospitals. This can lead to serious undercounting of patient readmissions because it is not uncommon for patients to be admitted (over time) to different hospitals that are owned by different entities. In fact, our data shows that 37.4% of CHF patients who were readmitted within 30 days of their initial admission visit a different hospital. As Nasir et al. (2010) observe, the same-hospital readmission rate is likely to underreport the actual interhospital readmission rate by as much as 50%, and is of limited value as a benchmark for care quality. Hence, it is important to analyze data that provide a complete picture of patient admissions across hospitals within a geographic region.
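The undercounting argument can be illustrated with a minimal Python sketch. The records, identifiers, and the helper `readmission_rate` are hypothetical and are not taken from the study's data; for simplicity the 30-day window is measured from the initial admission date. The sketch compares a rate computed only from returns to the same hospital with one computed across all hospitals linked by a master patient index.

```python
from datetime import date

# Hypothetical admission records: (master_patient_id, hospital_id, admit_date).
admissions = [
    ("p1", "H1", date(2006, 1, 10)),
    ("p1", "H2", date(2006, 1, 30)),  # returns to a different hospital
    ("p2", "H1", date(2006, 3, 1)),
    ("p2", "H1", date(2006, 3, 20)),  # returns to the same hospital
    ("p3", "H3", date(2006, 5, 5)),   # never readmitted
]

def readmission_rate(records, same_hospital_only, window_days=30):
    """Share of patients with a later admission within window_days of the first."""
    by_patient = {}
    for pid, hosp, day in sorted(records, key=lambda r: (r[0], r[2])):
        by_patient.setdefault(pid, []).append((hosp, day))
    readmitted = 0
    for visits in by_patient.values():
        first_hosp, first_day = visits[0]
        for hosp, day in visits[1:]:
            if same_hospital_only and hosp != first_hosp:
                continue  # invisible when each hospital is analyzed in isolation
            if (day - first_day).days <= window_days:
                readmitted += 1
                break
    return readmitted / len(by_patient)

same_rate = readmission_rate(admissions, same_hospital_only=True)
any_rate = readmission_rate(admissions, same_hospital_only=False)
print(same_rate, any_rate)
```

In this toy panel the same-hospital rate misses patient p1 entirely, which is the kind of gap a region-wide patient index is meant to close.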

Besides modeling the risk propensity of readmission, it is equally important to understand how frequently (count) and when (timing) readmissions are likely to occur using a healthcare analytics system. Some examples of count and timing models in healthcare include the count of health service utilization, such as physician consultations, emergency room visits, or the amount of home care received (Deb and Trivedi 2002, Winkelmann 2006). Our proposed readmission model draws on count models in statistics (Winkelmann 2010) and consumer repeat-buying models in the marketing literature (e.g., Fader et al. 2005, Gupta 1991).

2.2. Health IT and Patient Readmissions

The information systems research literature has witnessed a growing interest in the role of IT in patient diagnosis, healthcare delivery, and treatment. Prior studies have mostly focused on hospital performance and the impact of IT (Devaraj and Kohli 2000, 2003; Menon et al. 2000; Das et al. 2011), or on hospital-level adoption and diffusion of health IT (Angst et al. 2010, Agarwal et al. 2010, Zheng et al. 2005). However, there is a growing emphasis on patient-level analytics as researchers and clinicians have come to recognize that the ultimate impact of HIT has to be measured on patient-level outcomes, and therefore, recent studies have called for greater attention to using patient-level data to generate useful and actionable insights (Angst et al. 2010, Gao et al. 2010).

The growth in adoption of electronic medical records (EMR) and HIT systems in recent years has spurred widespread interest in studying the impact of HIT applications on patient care outcomes (Anderson and Agarwal 2011, Bardhan and Thouin 2013). Prior studies report improved quality of care in diabetes treatment with the use of EMR systems (Cebul et al. 2011), and in a recent review, Buntin et al. (2011) report that 92% of recent studies document a positive impact of HIT on hospital outcomes, including healthcare quality and efficiency. Whereas a few studies on the use of computerized provider order entry (CPOE) systems report improvements related to medication errors (Aron et al. 2011, Kaushal et al. 2003), others report unintended adverse consequences (Campbell et al. 2006), such as a sudden increase in mortality rates after implementation of CPOE (Han et al. 2005). Usage of automated notes and record systems, order entry, and clinical decision support systems has helped lower complications and mortality rates (Amarasingham et al. 2009, McCullough et al. 2010, Miller and Tucker 2011). However, as reported in a recent comprehensive review by Kansagara et al. (2011), there have not been any studies on the relationship between health IT systems and patient readmission rates.

Although recent studies have reported mixed evidence on the impact of HIT on the quality of patient care, they have been limited by data deficiencies and limitations in their econometric estimation methods. For example, Linder et al. (2009) use cross-sectional, pooled data analysis of patient visit data but do not take into account the possibility of serial correlation among multiple visits by the same patient over time, which may lead to biased estimates in ordinary least squares (OLS) regressions. McCullough et al. (2010) and DesRoches et al. (2010) focus specifically on two types of HIT applications, EHR and CPOE systems, and study whether hospitals with these systems exhibit greater levels of process quality compared to hospitals without these systems. They ignore the role of ancillary HIT applications, such as radiology, laboratory, and order communication systems, which support decision making related to patient care, and the impact of nonclinical applications, such as patient scheduling systems, human resource systems, and financial systems, on information workflows. A recent study reports nine potential areas where health IT can be utilized to reduce readmissions directly, including case management, communication, analytics and modeling, postacute follow up, health information exchanges, social media, mobility, robotics, and innovation (HIMSS 2012). In particular, it advocates the use of HIT, such as EMR and risk assessment software, in improving care coordination and transitions from admission to discharge by facilitating patient assessment and discharge planning.

A major difference between our study and the extant literature is our focus on health IT applications and their relationship with the risk, frequency, and timing of patient readmissions. We focus on three classes of hospital IT applications, namely, cardiology specific, general clinical, and hospital administrative systems. Cardiology information systems enhance patient safety by serving as a repository of patient cardiac information across the continuum of cardiac care. Such systems support cardiac and peripheral catheterization, hemodynamics monitoring, echocardiography, vascular ultrasound, nuclear cardiology, and ECG management, and integrate information and imaging data from multiple systems, enabling clinicians to make optimal care decisions (Pratt 2010).

Clinical information systems improve decision-making capabilities associated with care management. For instance, use of CPOE systems not only speeds up transmission of a patient’s prescription to a pharmacy, thereby reducing delays, but also (a) reduces the need for nurses or physician assistants to transcribe prescriptions, thereby lowering the potential for medication transcription errors, and (b) provides decision support capabilities to flag possible drug-drug and drug-allergy interactions at the time when a physician enters the prescription. Such IT-enabled capabilities within CPOE applications reduce the incidence of adverse drug events and are expected to yield significant savings in inpatient care as well as outpatient visits (Amarasingham et al. 2009). Clinical systems aid in short-term preventive care as well as disease management of chronic diseases such as CHF. For example, heuristics within EMR systems can identify patients in need of follow-up cardio tests, remind physicians to order needed tests and schedule preventive care visits, and provide consistent records of clinical test results, thereby leading to better clinical outcomes. Case management systems (within EMRs) also help to coordinate workflows, such as communication between multiple specialists and high-risk patients.

Hospital administrative systems also play an important role in the delivery and coordination of patient care. Benefits management portals enable cross-functional integration of data across multiple departments, and patient administration systems track patient movement in inpatient settings, allowing clinicians and supporting staff to improve hospital resource utilization by reducing waiting time at the point of admission, discharge, or transfer (Bardhan and Thouin 2013). Other administrative applications, such as personnel management systems, support staff needs related to patient education and discharge transition, which is critical to reducing readmission risk.

3. Model Development

We first briefly describe two baseline estimation models that have been widely used in the readmission literature: the logistic regression and the proportional hazard models. We then address their methodological deficiencies and propose a predictive model for readmission analytics to address these challenges.

3.1. Baseline Models

The readmission literature has commonly used logistic regression models to estimate the readmission probability of patients (Muus et al. 2010, Philbin and DiSalvo 1999, Shelton et al. 2000, Silverstein et al. 2008). These studies model the incidence of a readmission after a patient’s initial visit as a binary outcome, and involve patient-level analysis where the unit of analysis is a patient. Hence, the readmission propensity of each patient is defined as a logit function of covariates. Another type of baseline model uses survival analysis (or hazard models) to estimate the time duration between consecutive patient readmissions. It considers each visit as the unit of analysis, and hence, is called visit-level analysis. In our case, the hazard rate, $h(t)$, refers to the readmission rate of a patient per unit of time, i.e., the readmission rate of a patient on a given day, which is defined as

\[
h(t) = \lim_{\Delta t \to 0} \frac{P(t < T < t + \Delta t \mid T > t)}{\Delta t}.
\]

The hazard rate is also often expressed through the survival function, $S(t) = 1 - F(t)$, where $F(t)$ is the cumulative distribution function of the time to failure. The hazard function, $h(t)$, provides the instantaneous readmission rate that a patient, who is not readmitted by time $t$, will be readmitted during the infinitesimally small time interval $(t, t + \Delta t)$. The commonly used survival model is the Cox proportional hazard model (Cox 1972, Krumholz et al. 2000).
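The link between the limit definition of the hazard and the survival function can be checked numerically. The following Python sketch is illustrative only: it assumes an exponential time-to-readmission with a hypothetical daily rate, and uses the standard identity that the hazard equals the density divided by the survival function. It compares that ratio with a small-interval approximation of the conditional probability in the definition.

```python
import math

def exponential_pdf(t, rate):
    return rate * math.exp(-rate * t)

def exponential_cdf(t, rate):
    return 1.0 - math.exp(-rate * t)

def hazard(pdf, cdf, t, rate):
    """h(t) = f(t) / S(t), with S(t) = 1 - F(t)."""
    survival = 1.0 - cdf(t, rate)
    return pdf(t, rate) / survival

def hazard_from_limit(cdf, t, rate, dt=1e-6):
    """Small-dt approximation of P(t < T < t + dt | T > t) / dt."""
    survival = 1.0 - cdf(t, rate)
    return (cdf(t + dt, rate) - cdf(t, rate)) / (dt * survival)

rate = 0.05  # hypothetical readmission rate per day, for illustration only
for t in (1.0, 10.0, 30.0):
    exact = hazard(exponential_pdf, exponential_cdf, t, rate)
    approx = hazard_from_limit(exponential_cdf, t, rate)
    print(t, exact, approx)
```

Under the exponential assumption both columns stay at the chosen rate for every t, which is the constant-hazard (stationary) behavior that the later sections relax.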

Though both approaches are useful in identifying readmission risk factors, they do not provide additional insights to develop a predictive model to estimate the frequency and timing of future readmissions. First, these models are typically limited in tackling the nonstationary nature of patient readmissions, where a patient’s readmission propensity often changes over time depending on her changing condition and treatment during prior admissions, commonly referred to as state dependency (Heckman 1991). Prior studies using logistic regression models (Amarasingham et al. 2010, Silverstein et al. 2008) or proportional hazard models (Alexander et al. 1999, Krumholz et al. 2000) typically only account for the first readmission; they do not track multiple readmissions for the same patient over time. Second, such models do not fully account for unobserved patient heterogeneity, wherein some patients may be intrinsically healthier than others to start with.1 The aforementioned prior studies only account for observed patient heterogeneity such as demographics, comorbidities, utilization patterns (Amarasingham et al. 2010), self-rated health conditions (Mudge et al. 2010), self-reported compliance to prescriptions (Chin and Goldman 1997), or prescreened samples to reduce heterogeneity in patient groups (Krumholz et al. 2000). Though logit or hazard models can be extended to account for unobserved patient heterogeneity (e.g., the mixed proportional hazard model), to the best of our knowledge, such extensions have not been applied to the healthcare literature to model readmissions.

Third, extant readmission models do not capture the timing or frequency of readmissions. For example, logistic regression models record a readmission as a binary outcome based on the occurrence or absence of a readmission (Amarasingham et al. 2010, Silverstein et al. 2008), regardless of the number of occurrences of such readmissions for a patient. Furthermore, although hazard models partially address the issue of data censoring, they typically assume that the censored data follow the same stationary process as the observed data, which is a severe limitation of most readmission studies. In contrast, our proposed model considers the survival after each admission, thus directly addressing the data censoring issue. We summarize the major differences between our model and the extant readmission literature in Table 1.

3.2. The BG/EG Hurdle Model

To address the deficiencies of existing baseline models, we now develop a stochastic model, the beta geometric/Erlang-2 gamma hurdle model, to better predict patient readmission patterns. Our model has several distinctive properties compared to the baseline models. It accounts for (a) a patient’s readmission propensity, frequency, and timing in an integrated manner; (b) nonstationarity in readmission rates; (c) both observed and unobserved patient heterogeneity; and (d) data truncation due to unobserved causes such as patients’ death or migration out of the geographic region.2

1 Overlooking unobserved heterogeneity can lead to “weeding out effects” where duration dependence in the observed hazard function becomes more negative as the hazardous population tends to drop out first (Heckman 1991).

Table 1 A Contrast of the Literature with Our Study

Column groups: Domain; Research design; Estimation methods; Research model.
Columns: (1) Multihospital system analysis; (2) 30-day readmissions; (3) HIT impact on patient-level outcomes; (4) Integrates frequency and timing of events; (5) Trade-offs between Type I and Type II errors; (6) Individual-level predictive performance; (7) Account for unobserved heterogeneity; (8) Data truncation; (9) Nonzero hurdle; (10) Nonstationarity; (11) Time-varying covariates; (12) Continuous time.

| Study | Research domain | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) | (12) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Philbin and DiSalvo (1999) | Healthcare | Y | 12-month | N | N | N | Y | N | N | N | N | N | N |
| Silverstein et al. (2008) | Healthcare | N | Y | N | N | N | Y | N | N | N | N | N | N |
| Krumholz et al. (2000) | Healthcare | N¹ | 6-month | N | N | N | N | N | N | N | N | N | N |
| Amarasingham et al. (2010) | Healthcare | N | Y | N | N | N | Y | N | N | N | N | N | N |
| Chin and Goldman (1997) | Healthcare | N | 60-day | N | N | N | N | N | N | N | N | N | N |
| Felker et al. (2004) | Healthcare | N¹ | 60-day | N | N | N | Y | N | N | N | N | N | N |
| Alexander et al. (1999) | Healthcare | Y | 12-month | N | N | N | N | N | Y | N | N | N | N |
| Angst et al. (2010) | Info. Sys. | Y | N.A. | N | N | N.A. | N.A. | N.A. | Y | N.A. | N.A. | Y | N |
| Devaraj and Kohli (2003) | Info. Sys. | N | N.A. | N | N.A. | N.A. | N.A. | N.A. | N.A. | N.A. | N.A. | Y | N |
| Bardhan and Thouin (2013) | Info. Sys. | Y | N.A. | Y | N.A. | N.A. | N.A. | N | N.A. | N | N.A. | Y | N |
| Gupta (1991) | Marketing | N.A. | N.A. | N.A. | Y | N | Y | Y³ | Y | N | Y² | Y | N |
| Fader et al. (2004) | Marketing | N.A. | N.A. | N.A. | Y | N | Y | Y³ | Y | N | Y | Y | N |
| Fader et al. (2005) | Marketing | N.A. | N.A. | N.A. | Y | N | Y | Y³ | Y | N | N | N | N |
| Schweidel and Knox (2013) | Marketing | N.A. | N.A. | N.A. | Y | N | Y | Y³ | Y | N | Y | Y | Y |
| Gönül and Ter Hofstede (2006) | Marketing | N.A. | N.A. | N.A. | Y | N | Y | Y³ | Y | N | Y | Y | Y |
| Jain and Vilcassim (1991) | Marketing | N.A. | N.A. | N.A. | Y | N | N | Y³ | Y | N | Y | Y | N |
| Winkelmann (2004) | Economics | N.A. | N.A. | N.A. | N | N | N | Y³ | Y | Y | N | Y | N |
| This study | Healthcare & Info. Sys. | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |

¹ Patients across multihospital systems are studied but readmission is only to a single hospital.
² Nonstationarity through time-varying covariates.
³ These papers consider unobserved heterogeneity at the population level; we consider both population- and individual-level heterogeneity (through random effects).

Figure 1 Illustration of Patient Readmission Patterns
[Timeline for patient i. Labels: s0 – 0, initial admission for patient i; s1; t2; T; observation starting point for the entire data collection (January 2006); observation end point for the entire data (December 2009).]

Our model integrates two components: a hurdle component, which estimates the probability of readmission, and a BG/EG component, which estimates the frequency and timing of future readmissions. The hurdle model is suitable when one believes that patients who are admitted once need to be treated differently from those who are readmitted multiple times. The hurdle component not only estimates the probability of readmission but also addresses the excessive zero count, as observed in our data, where 70% of CHF patients are not readmitted (Winkelmann 2010). Wooldridge (2010, p. 690) refers to this scenario as the participation decision because the hurdle model reflects a decision-maker’s choice on whether or not to participate in an event (i.e., readmission). Specifically, the hurdle component models the probability of a zero outcome using a logit model as

\[
\log\left(\frac{\pi_{0i}}{1 - \pi_{0i}}\right) = X_{0i} \cdot \beta_{0i}, \quad i = 1, \ldots, N, \tag{1}
\]

where $\pi_{0i}$ is the probability of no readmission for an individual patient $i$ and $X_{0i}$ is the set of covariates observed for patient $i$ at their initial admission time with coefficients $\beta_{0i}$. A hurdle regression considers systematically different statistical processes for the zero and nonzero binary outcomes, where the positive counts are conditioned on having nonzero outcomes. It reflects a two-stage process, where the risk factors affecting readmission frequency may be different from those determining the propensity of readmission.
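As a rough illustration of the hurdle component, Equation (1) can be inverted with the logistic function to obtain the probability of no readmission from a patient's covariates. The Python sketch below is a minimal example under stated assumptions: the covariate names, their values, and the coefficients are hypothetical and are not estimates from the paper.

```python
import math

def prob_no_readmission(x, beta):
    """Invert log(p / (1 - p)) = x . beta to recover p, the zero-outcome probability."""
    linear = sum(xk * bk for xk, bk in zip(x, beta))
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical covariates at the initial admission:
# intercept, age in decades, indicator for a cardiology information system.
x_i = [1.0, 6.9, 1.0]
beta = [2.0, -0.2, 0.3]  # illustrative coefficients, not estimated values

p_zero = prob_no_readmission(x_i, beta)
p_cross_hurdle = 1.0 - p_zero  # probability of at least one readmission
print(p_zero, p_cross_hurdle)
```

Only patients who cross the hurdle (at least one readmission) would then enter the count and timing part of the model, which is what makes the two stages separable.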

The BG/EG component estimates the frequency and timing of admissions simultaneously. Wooldridge (2010, p. 690) refers to the frequency of events as the amount decision. Suppose we have $N$ patients, where patient $i$ is readmitted $J_i$ times at $(t_1, t_2, \ldots, t_{J_i})$ over the period $(0, T_i]$, where $t_0 = 0$ corresponds to the initial admission time and $T_i$ represents the censoring point, which is the end of the model calibration period for patient $i$. As each patient $i$ is admitted and readmitted at different times, $T_i$ varies across patients. Figure 1 provides a schematic representation of a patient’s readmission patterns over time.

2 Failure to account for data truncation is especially problematic for CHF, which tends to occur among older patients (who are 69 years old on average at discharge). For example, older patients typically tend to incur more readmissions, which is consistent with our model results.

We assume that the time interval between two consecutive admissions follows an Erlang-2 distribution, which means that the timing of a patient's future readmission depends not only on the current visit but also on the previous admission. This relaxes the restrictive stationarity assumption of the exponential distribution that is most common in proportional hazard models (Winkelmann 2010). By treating each admission as an independent random event, the exponential specification not only overlooks the rich admission history of a patient but can also lead to erroneous predictions of patients' future readmissions. A patient's readmission rate depends on the medical treatments that she receives, her health status, and prior hospitalization history. Consequently, researchers in the marketing literature have proposed that the Erlang-2 distribution be used to model interpurchase times, as it more closely resembles customer purchase behavior (Chatfield and Goodhardt 1973, Gupta 1991, Jeuland et al. 1980, Morrison and Schmittlein 1981, Fader et al. 2005). Hence, the Erlang-2 distribution is appropriate in our context of patient readmission behavior, as it assumes that the timing of the next admission is conditionally dependent on the duration of the previous admission.

The Erlang-2 distribution takes the form $f_i(x; 2, \lambda) = \lambda^2 x e^{-\lambda x}$ for $x, \lambda \ge 0$, where x is a continuous random variable and $\lambda$ is the (readmission) rate parameter. We follow Gupta's (1991) general approach, which specifies patient i's probability, or hazard function, to pay a hospital visit in time period t, given a set of time-varying covariates, $X_t$, as

$$h(t, X_t) = \lambda_t \cdot e^{X_t\beta_1} \equiv \lambda_t \cdot \psi(t), \tag{2}$$

where $\lambda_t$ is the baseline hazard at time t (Cox 1972). We follow Seetharaman and Chintagunta's (2003) formulation of the continuous time hazard model to incorporate time-varying covariates $(X_{t_1}, X_{t_2}, \ldots, X_{t_{J_i}})$ in the Erlang-2 distribution. The patient-level survivor function of the interadmission time distribution between the (j − 1)th and the jth admission is specified as

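$$S(t_j - t_{j-1}, X_{t_j}) = \left(1 + \lambda_j \int_{t_{j-1}}^{t_j} \psi(u)\,du\right) \cdot \exp\left[-\lambda_j \int_{t_{j-1}}^{t_j} \psi(u)\,du\right] = \bigl(1 + \lambda_j \Psi(t_j, t_{j-1})\bigr)\exp\bigl[-\lambda_j \Psi(t_j, t_{j-1})\bigr], \tag{3}$$

where

$$\Psi(t_j, t_{j-1}) \equiv \Psi(t_j) - \Psi(t_{j-1}); \quad \Psi(t) \equiv \int_0^t \psi(u)\,du; \quad \psi(t) \equiv e^{X_t\beta_1}. \tag{4}$$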

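The individual probability density function during the time interval $(t_{j-1}, t_j]$, given a covariate vector $X_{t_j}$, then follows:

$$f(t_j - t_{j-1} \mid \lambda_j, \beta_1, X_{t_j}) = \lambda_j^2 \cdot \psi(t_j) \cdot \Psi(t_j, t_{j-1}) \cdot \exp\bigl[-\lambda_j \Psi(t_j, t_{j-1})\bigr]. \tag{5}$$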
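The link between the survivor function in Equation (3) and the density in Equation (5) can be checked numerically: the density should equal the negative time derivative of the survivor function. The sketch below assumes covariates held constant within the interval, so that Ψ(t_j, t_{j−1}) is ψ times the elapsed time; the function names and parameter values are hypothetical.

```python
import math

def survivor(lam, Psi):
    """Equation (3): probability of no readmission over an interval
    whose integrated covariate effect is Psi."""
    return (1.0 + lam * Psi) * math.exp(-lam * Psi)

def density(lam, psi_t, Psi):
    """Equation (5): Erlang-2 density, with covariate multiplier psi_t
    evaluated at the readmission time."""
    return lam ** 2 * psi_t * Psi * math.exp(-lam * Psi)

# Hypothetical values; covariates are constant, so Psi = psi * elapsed time.
lam, psi = 0.02, 1.3
elapsed = 45.0
h = 1e-4

# Central-difference slope of the survivor function with respect to time.
numeric_slope = (survivor(lam, psi * (elapsed + h))
                 - survivor(lam, psi * (elapsed - h))) / (2.0 * h)
analytic_density = density(lam, psi, psi * elapsed)
```

Under these assumptions, `-numeric_slope` and `analytic_density` agree up to finite-difference error.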

The overall likelihood at the patient level is simply the product of Equation (5) over j as

$$L(T_i, \beta_1 \mid \lambda, X_t) = \prod_{j=1}^{J_i} f(t_j - t_{j-1} \mid \lambda_j, \beta_1, X_{t_j}) + \varepsilon_i, \tag{6}$$

where $\varepsilon_i \sim N(\mu, \sigma^2)$ represents patient-level random effects to capture the unobserved heterogeneity of individual patients (where $\mu$ and $\sigma$ represent the mean and standard deviation of the random effect normal distribution function, respectively). We note that incorporating time-varying covariates into the Erlang-2 Gamma model in this manner is a new methodological contribution to the literature. Although Fader et al. (2004) also incorporate time-varying covariates into the Erlang-2 distribution, they only model the grouped duration data where time is grouped into weeks and the covariates are assumed to be constant during a week's interval. Wooldridge (2010) argues that this treatment is unsuitable for multispell data, where the event can occur multiple times during the chosen time interval. Since patient admissions can occur at any time and multiple times during a certain time period (e.g., a month), and considering that the covariates (comorbidities) may change at any time, we need to model readmissions along a continuous time frame.

Furthermore, to account for the fact that the rate of patient visits may differ across patients, we adopt the common mixture distribution for $\lambda$ (Winkelmann 2010), which is assumed to be gamma distributed with shape parameter r and scale parameter $\alpha$: $g(\lambda \mid r, \alpha) = \alpha^r \lambda^{r-1} e^{-\alpha\lambda}\,\Gamma(r)^{-1}$. The flexibility associated with the gamma distribution allows it to fit various shapes of distributions because of its additive and conjugate properties. Gupta (1991) observes that the most appropriate specification for interpurchase time is a model that features an Erlang-2 interpurchase process with gamma-distributed purchase rates (to account for customer heterogeneity). Based on the assumptions that interadmission times are distributed according to (5) and that unobserved heterogeneity in $\lambda$ follows a gamma distribution, we specify the likelihood function as

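$$L(T_i, \beta_1, r, \alpha) = \left(\prod_{j=1}^{J} \psi(t_j)\,\Psi(t_j, t_{j-1})\right) \cdot \frac{\Gamma(r + 2J)\cdot\alpha^r}{\Gamma(r)} \cdot \left(\alpha + \sum_{j=1}^{J} \Psi(t_j, t_{j-1}) + \Psi(T, t_J)\right)^{-(r+2J)} + \varepsilon_i. \tag{7}$$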
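The integration step behind Equation (7) can be sketched as follows, setting aside the additive random effect. The shorthand C for the sum of Ψ terms in the exponent is introduced here only for brevity and does not appear in the original notation.

```latex
% Assumption: C denotes \sum_{j=1}^{J} \Psi(t_j, t_{j-1}) + \Psi(T, t_J).
\begin{aligned}
\int_0^\infty \lambda^{2J} e^{-\lambda C}\,
  \frac{\alpha^r \lambda^{r-1} e^{-\alpha\lambda}}{\Gamma(r)}\, d\lambda
&= \frac{\alpha^r}{\Gamma(r)} \int_0^\infty
   \lambda^{r+2J-1} e^{-(\alpha + C)\lambda}\, d\lambda \\
&= \frac{\Gamma(r+2J)\,\alpha^r}{\Gamma(r)}\,(\alpha + C)^{-(r+2J)} .
\end{aligned}
```

Multiplying the result by the product of the ψ(t_j)Ψ(t_j, t_{j−1}) terms, which do not depend on λ, gives the form shown in Equation (7).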

We further consider the possibility that, after every admission, a patient can become inactive with a dropout probability of p from a geometric process. However, each patient is likely to have a different p, and the cause of dropout (such as death or relocation outside the region) may not always be observed, which leads to a data truncation problem. For a geometric process, this unobserved heterogeneity is commonly modeled through a beta mixing function for the binary outcome geometric processes (Winkelmann 2010). Taken together, this yields the beta-geometric distribution (Fader et al. 2005), which is specified as

$$P(\text{dropout after } j\text{th admission}) = p(1 - p)^{j-1}, \tag{8}$$

where the heterogeneity in dropout probabilities follows a beta distribution, with parameters a and b indicating the relative propensity of dropping out or not: $f(p \mid a, b) = p^{a-1}(1 - p)^{b-1} B(a, b)^{-1}$.
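To illustrate how the beta mixing distribution combines with Equation (8), the sketch below averages p(1 − p)^(j−1) over p ~ Beta(a, b), which gives B(a + 1, b + j − 1) / B(a, b). The function names and the parameter values are hypothetical and are not estimates from this study.

```python
import math

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def dropout_prob(j, a, b):
    """E[p (1 - p)^(j - 1)] for p ~ Beta(a, b): the marginal probability
    that a patient drops out right after the j-th admission."""
    return math.exp(log_beta(a + 1.0, b + j - 1.0) - log_beta(a, b))

# Hypothetical beta parameters.
a, b = 1.0, 3.0

# Dropout after the first admission equals the mean of the beta, a / (a + b).
first = dropout_prob(1, a, b)
# Cumulative dropout probability over the first 2000 admissions.
cumulative = sum(dropout_prob(j, a, b) for j in range(1, 2001))
```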

Thus a patient may be inactive either after T (i.e., no observation is made between the last admission and the end of the period) or right after the last admission. While dropping the subscript for each individual i for brevity, we model these cases as follows:

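(i) A patient is inactive after T:³

$$L(\lambda \mid t_1, \ldots, t_J, T, \text{inactive at time } \tau > T) = \lambda^{2J} \left(\prod_{j=1}^{J} \psi(t_j)\,\Psi(t_j, t_{j-1})\right) \cdot \exp\left(-\lambda \cdot \left(\sum_{j=1}^{J} \Psi(t_j, t_{j-1}) + \Psi(T, t_J)\right)\right). \tag{9}$$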

³ For a full derivation of Equations (9)–(15), see Appendix A in the online supplement (available as supplemental material at http://dx.doi.org/10.1287/isre.2014.0553).


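(ii) A patient becomes inactive right after the last admission J:

$$L(\lambda \mid t_1, \ldots, t_J, T, \text{inactive at time } \tau \in (t_J, T]) = \lambda^{2J} \left(\prod_{j=1}^{J} \psi(t_j)\,\Psi(t_j, t_{j-1})\right) \cdot \exp\left(-\lambda \cdot \sum_{j=1}^{J} \Psi(t_j, t_{j-1})\right). \tag{10}$$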

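This yields the likelihood function

$$\begin{aligned}
L(\lambda, p, \beta_1 \mid X, J, t_J, T) ={}& (1 - p)^J \lambda^{2J} \left(\prod_{j=1}^{J} \psi(t_j)\,\Psi(t_j, t_{j-1})\right) \cdot \exp\left(-\lambda \left(\sum_{j=1}^{J} \Psi(t_j, t_{j-1}) + \Psi(T, t_J)\right)\right) \\
&+ \delta_{J>0}\, p(1 - p)^{J-1} \lambda^{2J} \left(\prod_{j=1}^{J} \psi(t_j)\,\Psi(t_j, t_{j-1})\right) \cdot \exp\left(-\lambda \sum_{j=1}^{J} \Psi(t_j, t_{j-1})\right) + \varepsilon_i. \tag{11}
\end{aligned}$$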

Expectation over the distribution of $\lambda$ yields the likelihood function

$$L(r, \alpha, p, \beta_1 \mid X, J, t_J, T) = (1 - p)^J \cdot A_1 + \delta_{J>0}\, p(1 - p)^{J-1} A_2 + \varepsilon_i, \tag{12}$$

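where

$$A_1 = \left(\prod_{j=1}^{J} \psi(t_j)\,\Psi(t_j, t_{j-1})\right) \cdot \frac{\Gamma(r + 2J)\cdot\alpha^r}{\Gamma(r)} \cdot \left(\alpha + \sum_{j=1}^{J} \Psi(t_j, t_{j-1}) + \Psi(T, t_J)\right)^{-(r+2J)}, \tag{13}$$

$$A_2 = \left(\prod_{j=1}^{J} \psi(t_j)\,\Psi(t_j, t_{j-1})\right) \cdot \frac{\Gamma(r + 2J)\cdot\alpha^r}{\Gamma(r)} \cdot \left(\alpha + \sum_{j=1}^{J} \Psi(t_j, t_{j-1})\right)^{-(r+2J)}. \tag{14}$$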

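Taking expectation over $\lambda$ and p yields the individual likelihood function $L_1$ for patients with at least one readmission as follows:

$$\begin{aligned}
L_1(r, \alpha, a, b, \beta_1 \mid X, J, t_J, T) ={}& A_1 \cdot \int_0^1 (1 - p)^J\, \frac{p^{a-1}(1 - p)^{b-1}}{B(a, b)}\, dp + A_2 \cdot \int_0^1 p \cdot (1 - p)^{J-1}\, \frac{p^{a-1}(1 - p)^{b-1}}{B(a, b)}\, dp + \varepsilon_i \\
={}& \left(\prod_{j=1}^{J} \psi(t_j)\,\Psi(t_j, t_{j-1})\right) \cdot \frac{\Gamma(r + 2J)}{\Gamma(r)} \cdot \alpha^r \cdot \frac{\Gamma(a + b)\,\Gamma(b + J - 1)}{\Gamma(b)\,\Gamma(a + b + J)} \\
&\cdot \left[(b + J - 1) \cdot \left(\alpha + \sum_{j=1}^{J} \Psi(t_j, t_{j-1}) + \Psi(T, t_J)\right)^{-(r+2J)} + \delta_{J>0} \cdot a \cdot \left(\alpha + \sum_{j=1}^{J} \Psi(t_j, t_{j-1})\right)^{-(r+2J)}\right] + \varepsilon_i. \tag{15}
\end{aligned}$$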
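A minimal sketch of how the closed form in Equation (15) might be evaluated on the log scale is given below. It omits the additive random effect, takes the ψ(t_j) and Ψ(t_j, t_{j−1}) values as precomputed inputs, and uses hypothetical names and parameter values throughout; it is not the authors' estimation code.

```python
import math

def log_L1(psi, Psi, Psi_tail, r, alpha, a, b):
    """Log of the closed form in Equation (15), without the random effect.

    psi      -- psi(t_j) at each of the J readmission times
    Psi      -- Psi(t_j, t_{j-1}) for each of the J intervals
    Psi_tail -- Psi(T, t_J), from the last admission to the censoring point
    """
    J = len(psi)
    if J < 1 or len(Psi) != J:
        raise ValueError("expects one psi and one Psi value per readmission")
    n = r + 2 * J
    S = sum(Psi)
    log_common = (
        sum(math.log(p) + math.log(q) for p, q in zip(psi, Psi))
        + math.lgamma(n) - math.lgamma(r) + r * math.log(alpha)
        + math.lgamma(a + b) + math.lgamma(b + J - 1)
        - math.lgamma(b) - math.lgamma(a + b + J)
    )
    # First term: still active at T; second term: dropped out after admission J.
    bracket = (b + J - 1) * (alpha + S + Psi_tail) ** (-n) + a * (alpha + S) ** (-n)
    return log_common + math.log(bracket)

# Hypothetical inputs for a patient with two readmissions.
psi = [1.0, 1.2]
Psi = [0.5, 0.8]
Psi_tail = 0.3
r, alpha, a, b = 2.0, 1.5, 1.0, 3.0
value = log_L1(psi, Psi, Psi_tail, r, alpha, a, b)
```

Working with log-gamma terms rather than raw gamma functions is a design choice to limit overflow when J is large.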

Hence, the log-likelihood of N patients, who have at least one readmission, is specified as

$$LL_1 = \sum_{i=1}^{N} \log\bigl(L_1(r, \alpha, a, b, \beta_1 \mid X_i(t), J_i, t_{J_i}, T_i)\bigr). \tag{16}$$

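Combining it with the logit hurdle component, the overall likelihood function, $L_0$, for a patient with no readmission after the initial admission (J = 0), simplifies to

$$L_0 = \int_0^\infty \exp\bigl(-\lambda\Psi(T, 0)\bigr) \cdot \bigl(1 + \lambda\Psi(T, 0)\bigr) \cdot \frac{\alpha^r \lambda^{r-1} e^{-\alpha\lambda}}{\Gamma(r)}\, d\lambda = \left(\frac{\alpha}{\alpha + \Psi(T, 0)}\right)^r \cdot \left(1 + \frac{r\,\Psi(T, 0)}{\alpha + \Psi(T, 0)}\right). \tag{17}$$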
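The closed form in Equation (17) can be compared against a direct numerical integration of the gamma mixture. The sketch below uses a simple midpoint rule; the function names, the truncation point of the integral, and the parameter values are assumptions made for illustration.

```python
import math

def L0_closed_form(Psi_T, r, alpha):
    """Right-hand side of Equation (17)."""
    ratio = alpha / (alpha + Psi_T)
    return ratio ** r * (1.0 + r * Psi_T / (alpha + Psi_T))

def L0_numeric(Psi_T, r, alpha, upper=60.0, steps=200_000):
    """Midpoint-rule approximation of the integral in Equation (17),
    truncated at `upper`, beyond which the integrand is negligible here."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        lam = (k + 0.5) * h
        gamma_pdf = alpha ** r * lam ** (r - 1.0) * math.exp(-alpha * lam) / math.gamma(r)
        total += math.exp(-lam * Psi_T) * (1.0 + lam * Psi_T) * gamma_pdf
    return total * h

# Hypothetical parameter values.
Psi_T, r, alpha = 0.7, 2.0, 1.5
closed = L0_closed_form(Psi_T, r, alpha)
numeric = L0_numeric(Psi_T, r, alpha)
```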

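Altogether, this yields a BG/EG hurdle model, the likelihood function of which is given by

$$L = \prod_{i=1}^{N} \frac{\pi_{0i}^{d_i}\,(1 - \pi_{0i})^{1-d_i}}{(1 - L_{0i})^{1-d_i}} \cdot L_{1i}^{1-d_i}, \tag{18}$$

where

$$\pi_{0i} = P(J_i = 0) \quad \text{and} \quad d_i = 1 - \min\{J_i, 1\}.$$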

The log-likelihood of the BG/EG hurdle model is therefore

$$LL = \sum_{i=1}^{N} \bigl(d_i \cdot \log \pi_{0i} + (1 - d_i) \cdot \log(1 - \pi_{0i}) - (1 - d_i) \cdot \log(1 - L_{0i}) + (1 - d_i) \cdot LL_{1i}\bigr), \tag{19}$$

where the first two terms of the right-hand side refer to the hurdle, and the last two terms are the likelihood of positive count of readmissions.
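The way Equation (19) combines the hurdle and count components can be sketched per patient as follows. The function name and the input values are hypothetical; in practice π0, L0, and the log of L1 would come from the fitted components described above.

```python
import math

def hurdle_loglik_contribution(J, pi0, L0, log_L1):
    """One patient's term in Equation (19), with d = 1 - min(J, 1).

    J      -- number of readmissions observed for the patient
    pi0    -- hurdle probability of no readmission
    L0     -- BG/EG likelihood of zero readmissions (Equation (17))
    log_L1 -- log of the BG/EG likelihood for J >= 1 (ignored when J == 0)
    """
    if J == 0:
        return math.log(pi0)
    # Positive counts are conditioned on crossing the hurdle, hence the
    # renormalization by 1 - L0.
    return math.log(1.0 - pi0) - math.log(1.0 - L0) + log_L1

# Hypothetical patients: (J, pi0, L0, log_L1).
patients = [(0, 0.7, 0.6, 0.0), (2, 0.7, 0.6, -5.0)]
total = sum(hurdle_loglik_contribution(*p) for p in patients)
```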

3.3. Contrast with the Literature

We now recap the key methodological contributions that differentiate our study from the prior literature. Table 1 provides a summary of the key differentiators of our study, in contrast to the prior literature, where we draw on multiple disciplines. With respect to research design, there are several distinguishing features of our paper. First, none of the prior studies have focused on the association between health IT and patient readmission risk. Second, although the marketing literature has numerous studies that integrate the estimation of the propensity and timing of consumer purchases, the prior healthcare research has been limited by its explicit focus on the estimation of readmission propensity, while ignoring the frequency and timing of future readmissions. This is the first study to provide an integrated model to estimate risk propensity as well as the frequency and timing of readmissions in a healthcare analytics context.

With respect to model estimation, while the issues of unobserved heterogeneity and data truncation have been previously studied in the marketing literature, this is the first study to explicitly account for the possibility of unobserved heterogeneity and data truncation (e.g., because of death or patient relocation) using patient readmission data. Similarly, extant healthcare studies on readmissions have not accounted for the possibility of nonstationarity in the patient readmission process, as well as the impact of time-varying covariates on readmission risk and frequency (Kansagara et al. 2011). With the exception of Winkelmann (2004), this is the only study to develop a hurdle-based estimation model that estimates the frequency and timing of future readmissions once the first readmission hurdle has been crossed. This study is among the first to account for trade-offs between type I and type II errors in the development of a predictive model to study patient readmissions. Although a few marketing studies (e.g., Gupta 1991, Fader et al. 2005) also examine predictive performance of their respective models, they focus on prediction accuracy (i.e., type I error) without regard to the potential trade-offs with type II errors.

As shown in the last row of Table 1, this is the only study in the health IT domain to address the limitations of prior studies in terms of their research design, estimation methods, and model assumptions with respect to modeling patient readmissions.

4. Data

Our data consist of four years of patient admission records and clinical data from 67 hospitals in the North Texas region, from January 2006 to December 2009. Patient visits across multiple hospitals are tracked by matching the regional master patient index (REMPI), developed by the Dallas Fort Worth Hospital Council (DFWHC) Research Foundation. A REMPI is a unique ID number assigned to each patient that allows us to track patients over time and across all hospitals in a region. In other words, a REMPI makes it possible to obtain a patient's entire readmission history and enables us to study the patterns of patient care and clinical diagnosis received across multiple hospitals with different ownership. We observe that this is a major improvement in our study compared to previous studies that have been restricted to studying patient readmission data from hospitals that belong to a single hospital (or health) system (Silverstein et al. 2008).

Our data record 65,188 admissions that originate from 40,983 distinct patients with CHF as their primary diagnosis. Among these patients, 70% had a single admission, whereas 30% (12,211) experienced multiple admissions, as shown in Figure B.1 in the online appendix. Table B.1 in the online appendix provides a description of our sample data. Our data capture several patient demographic characteristics including gender, racial profile, and discharge age. Among all patients, 52% (21,281) are female, 72% (29,320) are Caucasian, and 21% (8,686) are African American. The average discharge age is 69 years, with 66% (27,134) of the patients being 65 years or older.

The hospital health IT usage data are drawn from the HIMSS analytics database for the corresponding four-year period, i.e., 2006 to 2009. After consulting with health IT practitioners, and based on the intensity of health IT usage among our sample of 67 hospitals, we identify 45 applications that are commonly used in the treatment of CHF patients. We omit various IT applications that are only relevant to general hospital management, and focus instead on clinical and administrative functions associated with CHF. Exploratory factor analysis (EFA) on this group of applications results in the selection of 18 health IT applications, where the factor scores are greater than the threshold of 0.6 for significance. Such clustering of HIT applications has been previously employed to group health IT applications according to their primary functionality (Bhattacherjee et al. 2007, Himmelstein et al. 2010, Bardhan and Thouin 2013).

We code each HIT application as a binary variable where one indicates that it has been implemented and is operational, and zero indicates otherwise. Using EFA with Varimax rotation, we identify three major classes of health IT applications: administrative IT, clinical IT, and cardiology IT. Administrative IT consists of health information management applications, chart tracking, revenue cycle management, and patient billing applications. Clinical IT comprises hospital-wide clinical systems such as EMR, operating room IT, and CPOE applications. Cardiology IT applications, which are primarily used to treat CHF patients, include cardiology information systems, cath lab systems, and echocardiology and computerized tomography (CT) systems. The specific applications that comprise our three HIT factors are shown in Table B.2 in the online appendix. We observe that our HIT factors are generally consistent with previous research (e.g., Bhattacherjee et al. 2007, Menachemi et al. 2008).

Based on the EFA results, we calculate a summative index score for each factor that represents a ratio of the number of applications used in each hospital to the total number of applications. We normalize the summative index to a value between zero and one, which represents the percentage of health IT applications being used at a given hospital out of the entire class of applications.⁴ As of 2009, hospitals in our sample have (on average) implemented 91% of administrative IT applications, 57% of clinical IT applications, and 23% of cardiology information systems. Compared to hospitals with below-average levels of administrative IT and cardiology IT, hospitals with above-average levels of these two HIT applications enjoy 30-day readmission rates that are lower by 3% and 4%, respectively. On the other hand, the 30-day readmission rate of hospitals with an above-average level of clinical IT applications is not significantly different from that of hospitals in the below-average group.⁵
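The normalized summative index described above can be sketched in a few lines. The function name and the binary flags below are hypothetical; the actual application lists are those reported in Table B.2 of the online appendix.

```python
def summative_index(implemented):
    """Share of applications in one class that a hospital has implemented.

    implemented -- list of 0/1 flags, one per application in the class
    """
    if not implemented:
        raise ValueError("the class must contain at least one application")
    return sum(implemented) / len(implemented)

# Hypothetical hospital with two of four cardiology IT applications in use.
cardiology_flags = [1, 0, 0, 1]
index = summative_index(cardiology_flags)
```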

Our sample statistics show that 37% of patients who are readmitted visit different hospitals. This alarmingly large percentage suggests a severe undercounting for a single-hospital study, the common approach adopted in the extant readmission literature. Hence, for each patient readmission, we count the number of different hospitals visited by the patient prior to their current visit. We develop a new measure, patient stickiness, defined as the ratio of the number of times that a patient visits the same hospital to the total number of admissions until the present time. In other words, if a patient is treated within a single hospital (across multiple visits), her stickiness measure is higher compared to patients who are readmitted across multiple hospitals.
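One way to compute the stickiness measure is sketched below. It assumes that "the same hospital" refers to the hospital of the current admission, so the measure at each visit is the number of visits to that hospital so far divided by all visits so far; this reading, the function name, and the hospital labels are assumptions made for illustration.

```python
def stickiness(hospital_ids):
    """Running stickiness for one patient's chronologically ordered admissions.

    For each admission, returns (visits to the current hospital so far)
    divided by (total admissions so far), including the current one.
    """
    counts = {}
    result = []
    for n, hospital in enumerate(hospital_ids, start=1):
        counts[hospital] = counts.get(hospital, 0) + 1
        result.append(counts[hospital] / n)
    return result

# Hypothetical patient admitted to hospital A twice, then B, then A again.
values = stickiness(["A", "A", "B", "A"])
```

A patient treated at a single hospital keeps a value of 1.0 at every visit, which matches the interpretation given in the text.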

We also include other control variables that have been commonly used in the readmission literature (Mudge et al. 2010, Ross et al. 2008, Silverstein et al. 2008) including patient demographics (discharge age, gender, race), length of stay (LOS), number of diagnoses, number of procedures, payer type, admission type, and the risk of mortality. LOS is defined as the number of patient days from admission to discharge during an inpatient visit. In our sample, the average LOS is 5.45 days, with a mean of 12.58 diagnoses, 1.08 procedures recorded, and total hospital charges of $37,649. Payer type is classified into five categories according to the claim filing code: Medicare, Medicaid, self-pay, private insurance, and other insurance types. For each admission, hospitals record the admission type and the risk of patient mortality. The standard admission type is coded into six classes (class 1 denotes emergency). Risk mortality is coded on a scale from 1 to 4, which indicates the patient death risk as minor, moderate, major, or extreme, respectively.

⁴ We also experimented with factor scores (i.e., using factor loading as the weights) and the results remain consistent. We prefer the normalized summative index because it has a more intuitive interpretation.

⁵ Based on a split sample analysis, we observe that the average readmission rate for hospitals with above-average usage of administrative IT is 12%, whereas that for hospitals with below-average administrative IT is 15% (the difference is significant at p < 0.001). Similarly, hospitals with above-average usage of cardiology IT have a 10% readmission rate, compared to a 14% readmission rate for below-average hospitals. This difference is also statistically significant (p < 0.001). On the other hand, high or low levels of clinical IT are not associated with a significant difference in average readmission rates.

CHF is likely to be accompanied by other comorbidities. Frequent comorbidities associated with CHF include diabetes mellitus, hypertension, peripheral vascular disease, chronic pulmonary disease, renal failure, anemia, alcohol abuse, drug abuse, and ischemic heart disease (Ross et al. 2008). We control for these comorbidity variables, which are identified by the Elixhauser index (Elixhauser et al. 1998) based on the ICD-9-CM (International Classification of Diseases) diagnosis codes.

Other hospital characteristics may also affect patient readmission rates. We include hospital-level control variables such as the number of beds, case mix index, and teaching/nonteaching hospital attributes. Num_beds is the number of beds available for use in each hospital, which serves as a proxy for hospital capacity. We control for the hospital case mix index (CMI), which accounts for the average severity of patients' disease case mix. Tch_hosp represents the academic status of the hospitals. Table 2 provides definitions as well as descriptive statistics on our model variables.

5. Empirical Analysis

We first discuss identification of the potentially endogenous HIT variables and then present the results of our empirical estimation, starting with the baseline results, followed by the BG/EG hurdle estimation.

5.1. Identification of the Health IT Effects

It is likely that the health IT variables that we construct in our study may be subject to endogeneity. For example, having a higher level of readmissions may prompt a hospital to implement HIT (i.e., simultaneity). Identification of the causal effect of health IT can be challenging, because it is hard to isolate it from a hospital's efforts to improve patient outcomes such as readmission reduction. To address endogeneity, we identify two instrumental variables (IV): average level of health IT in (external) peer hospital systems, and the difference in a hospital's level of health IT between the current and previous year. The first IV is defined as the average level of HIT among peer hospitals, after excluding other hospitals within the same health system as the focal hospital. The second IV is derived by taking the difference in the values of the HIT variables across two consecutive years. Using the difference of an endogenous variable as an IV was proposed by Arellano and Bond (1991) and has become a well-accepted method to account for endogeneity. Both IVs are correlated with the current year's level of HIT of the focal hospital since (a) the hospital is likely to monitor and follow the HIT applications that its peer hospitals have implemented (due to peer pressure), resulting in correlated hospital-level IT decisions and (b) the previous year's HIT (and the derived difference) ought to be correlated with the level of HIT in the current year.

Table 2 Variable Definitions

| Variable | Description of variable | Descriptive statistics |
|---|---|---|
| Demographic variables | | |
| Gender | Patient's gender | Female (52%), Male (48%) |
| Patient race | Patient's race | Caucasian (72%), African American (21%) |
| log(disch_age) | Patient age on the day of discharge (log transformed) | 69.17 (15.65)ᵃ,ᵇ |
| (log(disch_age))² | Quadratic term of log-transformed discharge age | |
| Health IT variables | | |
| Administrative IT | Normalized summative index of administrative IT applications | 0.91 (0.22)ᵃ |
| Clinical IT | Normalized summative index of clinical IT applications | 0.57 (0.28)ᵃ |
| Cardiology IT | Normalized summative index of cardiology IT applications | 0.23 (0.35)ᵃ |
| Visit characteristics | | |
| Number of procedures | Number of procedures on each admission per patient | 1.08 (1.98)ᵃ |
| log(LOS) | Log-transformed length of stay | 5.45 (5.87)ᵃ,ᶜ |
| Patient stickiness | | |
| Proportion of same hospital visits | Number of times a patient visited the same hospital divided by the total number of visits up to the current admission | 0.93 (0.18)ᵃ |
| Payer type variables | | |
| Medicare | Binary indicator of claim filed to Medicare | 40,266 (61.77%)ᵈ |
| Medicaid | Binary indicator of claim filed to Medicaid | 4,912 (7.54%)ᵈ |
| Private | Binary indicator of private insurance | 16,650 (25.54%)ᵈ |
| Other type of insurance | Binary indicator of other types of insurance (Veterans Administration or other federal programs) | 686 (1.05%)ᵈ |
| Admission condition variables | | |
| Admission type (medical emergency) | Binary indicator of admission type classified as medical emergency | 44,247 (67.88%)ᵈ |
| Risk mortality | Risk mortality (1: minor (15.14%), 2: moderate (46.88%), 3: major (29.24%), 4: extreme (8.74%)) | |
| Comorbidity variables | | |
| Diabetes_mellitus | Binary indicator of diabetes mellitus | 28,436 (43.62)ᵈ |
| Hypertension | Binary indicator of hypertension | 27,399 (42.03)ᵈ |
| Periph_vascular | Binary indicator of peripheral vascular disease | 6,935 (10.64)ᵈ |
| Chronic_pulmonary | Binary indicator of chronic pulmonary disease | 22,949 (35.20)ᵈ |
| Renal_failure | Binary indicator of renal failure | 22,441 (34.43)ᵈ |
| Anemia | Binary indicator of anemia | 19,214 (29.47)ᵈ |
| Alcohol_abuse | Binary indicator of alcohol abuse | 984 (1.51)ᵈ |
| Drug_abuse | Binary indicator of drug abuse | 2,237 (3.43)ᵈ |
| Ischemic_disease | Binary indicator of ischemic disease | 35,301 (54.15)ᵈ |
| Hospital variables | | |
| Num_beds | Number of beds in hospital | 392.70 (298.44)ᵃ |
| Tch_hosp | Binary indicator of teaching/nonteaching hospital (1 = teaching/0 = nonteaching) | 30,341 (46.54%)ᵈ |
| CMI | Case mix index | 1.53 (0.26)ᵃ |
| Admission and discharge | | |
| Admission source | ER reference (1 = ER reference (77%)/0 = non-ER reference) | |
| Discharge disposition | Discharged to home/self-care (1 = home (62%)/0 = elsewhere) | |

ᵃ Mean (standard deviation); ᵇ statistics on discharge age in years; ᶜ statistics on length of stay in days; ᵈ number of occurrences (%, percentage out of the total number of observations).

At the same time, these two IVs are unlikely to be systematically determined by an individual patient's readmission at the focal hospital. In other words, competition in the local healthcare market may drive health IT implementations such that the IT infrastructure of competing hospital systems may influence the focal hospital's decision to implement HIT. However, it is unlikely that HIT implementation decisions of peer hospitals will be associated with an individual patient's readmission rate at the focal hospital. Likewise, a hospital's current level of HIT may depend on its level in the prior year, but prior year levels (and the derived difference) predate patient outcomes in the current year and thus are unlikely to be systematically codetermined. Hence, both variables fulfill the criteria for IV estimation. We also test the strength and exogeneity of our IVs in a nonlinear, two-stage, least-squares model. The first-stage F-values of the administrative IT, clinical IT, and cardiology IT variables are 296.85, 860.79, and 578.13, respectively, confirming the strength of these IVs. Furthermore, Hansen's J test provides a test statistic of J = 3.305 (p = 0.3469), thus supporting the exogeneity of these IVs.

5.2. Baseline Model Results

For patient-level analysis, our objective is to develop a better understanding of the determinants of readmission propensity within 30 days of discharge from the previous admission. The dependent variable in the logit model is measured as a binary variable that takes a value of one in the presence of a 30-day readmission, and zero otherwise. The dependent variable in the proportional hazard model is measured as the time interval between two consecutive admissions that occur within a 30-day window. If a readmission occurs outside of the 30-day window, we treat the subsequent visit as a new admission, a common convention in research and practice (Joynt and Jha 2012). We present our estimation results for the logit and Cox proportional hazard models in Table 3. Based on the logit estimation results, we observe that female patients are 7.6% less likely to be readmitted within 30 days compared to male patients (odds ratio = 0.924); their average duration between consecutive readmissions is also 11.7% less than their male counterparts.
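The percentages quoted above follow from the reported ratios by subtracting one, as the short sketch below illustrates. The function name is hypothetical, and the coefficient of −0.078 is the rounded female estimate listed in Table 3, so the ratio implied by exponentiating it matches the reported odds ratio only up to rounding.

```python
import math

def percent_change(ratio):
    """Percentage change implied by an odds ratio or hazard ratio."""
    return (ratio - 1.0) * 100.0

# Ratios reported for female patients in the baseline models.
odds_change = round(percent_change(0.924), 1)    # logit odds ratio
hazard_change = round(percent_change(0.883), 1)  # proportional hazard ratio

# Exponentiating the rounded logit coefficient approximately recovers
# the reported odds ratio.
implied_odds_ratio = math.exp(-0.078)
```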

The logit results suggest that African American CHF patients are 42% more likely to be readmitted within 30 days of their prior discharge, as compared to their Caucasian counterparts (coeff. = 0.356, p < 0.10; odds ratio = 1.42). Our results also suggest that older patients do not necessarily incur higher readmission risks, as demonstrated by the insignificant value of the coefficient of log(disch_age). However, the negative and significant coefficient on the quadratic term, log(disch_age)², implies that the risk of readmission within 30 days starts to decrease for patients who are older than 63. We note that this negative effect on the quadratic term can be attributed to the possibility of data truncation (i.e., older patients being closer to the end of their lives), which is not accounted for in the baseline estimation models.⁶

⁶ The BG/EG hurdle model addresses this issue by modeling a "drop out" probability after each visit for a patient. We further conducted a robustness check using the approach described in Gönül and Hofstede (2006) to test the potential nonlinear effect of age and found that the estimation results are consistent with our results reported here.

The estimation results of the logit and proportional hazard models indicate that the three classes of health IT applications (i.e., administrative, clinical, and cardiology IT) are not significantly associated with the propensity of 30-day readmission. We also do not observe a significant association between health IT applications and the time duration between consecutive admissions based on the results of the proportional hazard model. We will revisit these factors during our discussion of the BG/EG hurdle model results.

A positive coefficient (coeff. = 0.112; p-value < 0.01) on LOS indicates that longer hospital stays are associated with higher risk of readmission, which is also consistent with previous findings (Mudge et al. 2010). This result may be attributed to the possibility that sicker patients may require longer LOS and are more likely to be readmitted within 30 days.

With respect to the effect of payer type on readmission risk, we observe that patients with Medicaid are at a significant risk of 30-day readmission, compared to self-pay patients. For Medicaid patients, the readmission risk increases by 16.7% compared to self-payers, marginally significant at a p-value < 0.10.

The severity of a patient's condition, based on risk mortality scores, also plays a significant role in determining readmission risk. Patients with moderate risk levels (i.e., level 2) have a higher risk of being readmitted within 30 days, as compared to patients with low risk levels (i.e., level 1), with the risk increasing by 36.4%. Higher levels of mortality risk are also associated with a greater 30-day readmission risk, by 63.9% and 65.2% for levels 3 and 4, respectively. We observe that the average duration between consecutive readmissions is 15.8% less for emergency room patients compared with those that are admitted as inpatients. We also observe that teaching hospitals have a higher risk of 30-day patient readmission compared to their nonteaching counterparts, with the risk of readmission being 14.9% greater, possibly because such hospitals often treat more complex cases.

Table 3 Baseline Logit and Proportional Hazard Estimation of 30-Day Readmission Model

| Variable | Logit: parameter estimate | Logit: standard error | Logit: odds ratio | Proportional hazard: parameter estimate | Proportional hazard: standard error | Proportional hazard: hazard ratio |
|---|---|---|---|---|---|---|
| Intercept | −3.585 | (0.877)*** | 0.028 | | | |
| Gender: female | −0.078 | (0.022)*** | 0.924 | −0.124 | (0.039)*** | 0.883 |
| Patient race: Asian or Pacific Islander | 0.108 | (0.261) | 1.115 | −0.133 | (0.179) | 0.876 |
| Patient race: African American | 0.356 | (0.211)* | 1.428 | −0.013 | (0.052) | 0.987 |
| Patient race: Other | 0.126 | (0.219) | 1.135 | −0.170 | (0.086)** | 0.843 |
| log(disch_age) | 0.502 | (0.426) | 1.653 | 0.181 | (0.262) | 1.198 |
| log(disch_age)^2 | −0.094 | (0.058)* | 0.911 | −0.055 | (0.039) | 0.947 |
| Health IT: Administrative IT | 0.038 | (0.115) | 1.039 | −0.087 | (0.10) | 0.916 |
| Health IT: Clinical IT | 0.076 | (0.085) | 1.079 | 0.102 | (0.078) | 1.107 |
| Health IT: Cardiology IT | −0.050 | (0.060) | 0.951 | −0.083 | (0.056) | 0.921 |
| Number of procedures | −0.067 | (0.013)*** | 0.935 | −0.063 | (0.012)*** | 0.939 |
| log(LOS) | 0.111 | (0.034)*** | 1.117 | 0.120 | (0.031)*** | 1.128 |
| Payer type: Medicare | 0.112 | (0.087) | 1.119 | 0.106 | (0.081) | 1.111 |
| Payer type: Medicaid | 0.154 | (0.094)* | 1.167 | 0.128 | (0.088) | 1.136 |
| Payer type: Private | 0.070 | (0.082) | 1.073 | 0.061 | (0.077) | 1.063 |
| Payer type: Other type of insurance | 0.057 | (0.236) | 1.058 | 0.070 | (0.199) | 1.073 |
| Admission type: Medical emergency | 0.06 | (0.057) | 1.062 | 0.147 | (0.043)*** | 1.158 |
| Risk mortality: Level 2 | 0.310 | (0.073)*** | 1.364 | 0.294 | (0.066)*** | 1.341 |
| Risk mortality: Level 3 | 0.494 | (0.080)*** | 1.639 | 0.482 | (0.072)*** | 1.619 |
| Risk mortality: Level 4 | 0.502 | (0.102)*** | 1.652 | 0.646 | (0.093)*** | 1.909 |
| Comorbidities: Diabetes_mellitus | 0.025 | (0.043) | 1.026 | −0.0004 | (0.039) | 1 |
| Comorbidities: Hypertension | −0.137 | (0.052)*** | 0.872 | −0.103 | (0.046)** | 0.902 |
| Comorbidities: Periph_vascular | 0.032 | (0.065) | 1.033 | 0.078 | (0.058) | 1.081 |
| Comorbidities: Chronic_pulmonary | −0.012 | (0.044) | 0.988 | −0.024 | (0.04) | 0.977 |
| Comorbidities: Renal_failure | 0.089 | (0.056)* | 1.093 | 0.130 | (0.05)*** | 1.139 |
| Comorbidities: Anemia | −0.005 | (0.047) | 0.995 | 0.033 | (0.042) | 1.033 |
| Comorbidities: Alcohol_abuse | −0.253 | (0.18) | 0.777 | −0.116 | (0.16) | 0.89 |
| Comorbidities: Drug_abuse | 0.689 | (0.115)*** | 1.992 | 0.598 | (0.108)*** | 1.819 |
| Comorbidities: Ischemic_disease | 0.206 | (0.044)*** | 1.229 | 0.175 | (0.04)*** | 1.191 |
| Num_beds | −0.0003 | (0.0001)*** | 1 | −0.0001 | (0.0001) | 1 |
| Tch_hosp | 0.139 | (0.057)*** | 1.149 | 0.142 | (0.051)*** | 1.153 |
| CMI | −0.224 | (0.123)* | 0.8 | −0.503 | (0.098)*** | 0.605 |
| −2 logL | 17,893.18 | | | 61,254.92 | | |
| AIC | 17,959.18 | | | 61,318.92 | | |

Note. Standard errors in parentheses. *p = 0.10; **p = 0.05; ***p = 0.01.

5.3. BG/EG Hurdle Model Results

Next, we estimate the BG/EG hurdle model, which takes into account the following salient estimation issues: (a) unobserved patient-level heterogeneity, (b) state dependency of patient readmissions that results in a nonstationary readmission rate, and (c) data censoring due to truncation in the data caused by patient dropout from the sample. A unique feature of our model is that it allows estimation of two distinct components of patient readmissions that have been overlooked in the readmission literature: (a) a patient’s propensity of being readmitted within 30 days of the prior admission (logit hurdle), and (b) the frequency of future readmissions (BG/EG) after the hurdle has been crossed. We treat the 30-day readmission window as a hurdle to be overcome before the latter condition (i.e., future readmission) is observed.

5.3.1. Logit-Hurdle Analysis. The logit-hurdle component estimates the propensity of 30-day readmission, where we use the same set of independent variables as in the patient-level baseline analysis. For the BG/EG estimation of the frequency of future readmissions, we include three new variables in addition to the ones used in the logit hurdle model. These variables are (a) patient stickiness, (b) destination of the patient’s prior discharge (i.e., if the previous discharge was to home or self-care), and (c) admission source of the patient (e.g., if the patient was admitted to the ER on the prior admission). These variables allow for accurate model identification of our BG/EG estimation model as they are only relevant to the BG/EG component and are not included in the logit hurdle estimation. Our choice of these variables is based on recent anecdotal evidence that suggests that community-level factors, such as challenges faced by patients when they are discharged to their homes, are associated with patient readmissions (e.g., Kansagara et al. 2011).


The logit-hurdle parameter estimates, as shown in the left-hand panel of Table 4, indicate that patient demographics (gender, race, and discharge age) and admission characteristics (number of procedures and LOS) are significant determinants of 30-day readmission risk. We observe that usage of cardiology IT is associated with a significant reduction in 30-day readmission risk (coeff. = −0.086; p-value < 0.05). CHF patients who are admitted to hospitals with a high level of cardiology IT applications are about 8.3% less likely to be readmitted within 30 days. On the other hand, higher levels of administrative and clinical IT systems are associated with a slight increase in patient readmission risk, by 2.2% and 1.4%, respectively. There are several possible explanations for this result. First, it may be attributed partly to self-selection because sicker patients, who are at higher risk of being readmitted, may seek better-equipped hospitals with greater IT resources. However, since we have controlled for patient risk mortality in our model, this is unlikely to be an issue. As a robustness check, we conduct a Heckit analysis to account for possible sample selection bias arising from sicker patients. The results show that all parameter estimates remain qualitatively unchanged, ruling out possible selection bias. A similar argument may hold that hospitals with higher levels of readmissions self-select into adopting clinical and administrative IT systems. However, we have addressed this potential endogeneity with instrumental variables in §5.1.

Because the HIMSS data do not explicitly distinguish between implementation and usage of various health IT applications, another possibility may be that, although hospitals have implemented administrative and clinical IT systems, their actual use of these systems may not have occurred in tandem. As Devaraj and Kohli (2003) observe, the driver of business value is not implementation of IT per se but its actual usage within business processes. Hospitals with high levels of HIT implementation may not necessarily be the ones with a higher proportion of users of these systems. Our sample period (from 2006 to 2009) represents the period when hospitals started to implement clinical IT, such as CPOE and EMR systems. Implementation of administrative and clinical IT systems usually requires significant investments in training clinical staff and users, and incurs a time lag of 18 to 24 months before process improvements are realized (Menon et al. 2000). Since we do not observe actual usage of HIT applications, the estimated negative effects of clinical IT on care outcomes may not account for the time lag between health IT implementation and actual usage. In this respect, we note that recent meaningful use incentives provided by the federal government to spur usage of electronic health records (EHRs) may indeed serve as a catalyst to increase usage of health IT to improve clinical workflows and patient care outcomes (Blumenthal and Tavenner 2010).

5.3.2. BG/EG Analysis. Next, we present the estimation results of the BG/EG component in the right-hand panel of Table 4, and observe that interesting patterns begin to emerge for many of the model variables compared to the results of our logit hurdle estimation.7 We find differential effects of health IT applications on CHF patient readmission rates. For patients who cross the 30-day readmission hurdle, cardiology IT systems are associated with a 23% reduction in the frequency of future readmission (coeff. = −0.25, p < 0.01, hazard ratio = 0.770). Similarly, our results indicate that administrative IT systems are associated with a 26.3% reduction in the frequency of future readmissions (coeff. = −0.304, p < 0.01, hazard ratio = 0.737). This indicates that healthcare IT systems have differential impacts on 30-day patient readmission risk and the frequency of future readmissions, necessitating the separation of the two components in our model. This differential effect reveals the underlying mechanism of the impact of EHRs on clinical information workflow. When a patient is first admitted to a hospital, there is no prior history available on the patient, thereby limiting the benefit of EHRs. However, once the patient is readmitted, administrative and cardiology systems accrue more information on the patient’s prior medical history and their benefits start to emerge. Our results also indicate that clinical HIT systems are not significantly associated with the frequency of future readmissions. This may be attributed to the possibility that such systems may improve workflows for clinicians through automated reminders and computerized order entry, but may not have a direct impact on patient readmission risk. Our results suggest that administrative and cardiology IT are particularly important in ensuring that readmitted patients receive high-quality care, which translates into a lower risk and frequency of future readmissions.

An interesting finding is that repeat care from the same hospital reduces the risk of future readmissions significantly, as reflected in the negative coefficient estimate for the patient stickiness variable. Our BG/EG results indicate that a 1% increase in patient stickiness reduces the frequency of future readmission by 7.3% (coeff. = −0.075; p-value < 0.05; hazard ratio = 0.927). In other words, patients who are treated at the same hospital are less likely to incur future readmissions. This finding can be attributed to the possibility that a patient who is treated at the same hospital may receive a better continuum of care since doctors are likely to have access to her complete medical history and can make more informed decisions related to patient diagnosis and treatment. Another plausible interpretation is that these “sticky” patients are less severe ones and are less likely to incur readmissions in the first place. To account for this possibility, we compare the profiles of highly sticky patients with less sticky ones and do not find a statistical difference across these two samples in terms of risk mortality scores. Hence, our results imply that improving “patient stickiness” allows healthcare providers to reduce the frequency of future readmissions of CHF patients, and provide indirect evidence of the value of information integration across disparate hospitals.

Table 4 BG/EG Hurdle Model Estimation Results

| Variable | 30-day readmission logit hurdle: parameter estimate | Logit hurdle: standard error | Logit hurdle: odds ratio | BG/EG: parameter estimate | BG/EG: standard error | BG/EG: hazard ratio |
|---|---|---|---|---|---|---|
| Intercept | −3.379 | (0.364)*** | 0.034 | | | |
| Gender: female | −0.024 | (0.01)** | 0.976 | −0.094 | (0.057)* | 0.908 |
| Patient race: African American | 0.343 | (0.039)*** | 1.409 | 0.112 | (0.011)*** | 1.124 |
| Patient race: Asian or Pacific Islander | 0.054 | (0.039) | 1.056 | −0.002 | (0.009) | 0.998 |
| Patient race: Other | 0.075 | (0.018)*** | 1.078 | −0.024 | (0.014) | 0.975 |
| log(disch_age) | 0.459 | (0.13)*** | 1.582 | 1.448 | (0.23)*** | 4.295 |
| log(disch_age)^2 | −0.061 | (0.013)*** | 0.941 | −0.272 | (0.044)*** | 0.641 |
| Health IT: Administrative IT | 0.022 | (0.013)* | 1.022 | −0.304 | (0.117)*** | 0.737 |
| Health IT: Clinical IT | 0.014 | (0.002)*** | 1.014 | 0.006 | (0.008) | 1.006 |
| Health IT: Cardiology IT | −0.086 | (0.034)** | 0.917 | −0.250 | (0.089)*** | 0.770 |
| Number of procedures | −0.074 | (0.01)*** | 0.929 | −0.011 | (0.004)*** | 0.989 |
| log(LOS) | 0.120 | (0.021)*** | 1.128 | −0.016 | (0.017) | 0.984 |
| Payer type: Medicare | 0.130 | (0.034)*** | 1.139 | −0.263 | (0.035)*** | 0.764 |
| Payer type: Medicaid | 0.090 | (0.079) | 1.094 | −0.102 | (0.053)* | 0.898 |
| Payer type: Private | 0.103 | (0.019)*** | 1.109 | 0.259 | (0.015)*** | 1.310 |
| Payer type: Other type of insurance | 0.116 | (0.097) | 1.123 | −0.031 | (0.007)*** | 0.967 |
| Admission type: Medical emergency | 0.081 | (0.019)*** | 1.084 | −0.147 | (0.012)*** | 0.861 |
| Risk mortality: Level 2 | 0.227 | (0.039)*** | 1.254 | 0.060 | (0.006)*** | 1.064 |
| Risk mortality: Level 3 | 0.509 | (0.051)*** | 1.663 | −0.039 | (0.017)** | 0.960 |
| Risk mortality: Level 4 | 0.459 | (0.076)*** | 1.583 | 0.005 | (0.002)*** | 1.005 |
| Comorbidities: Diabetes_mellitus | 0.028 | (0.013)** | 1.028 | −0.098 | (0.041)** | 0.903 |
| Comorbidities: Hypertension | −0.075 | (0.026)*** | 0.927 | 0.144 | (0.036)*** | 1.160 |
| Comorbidities: Periph_vascular | 0.020 | (0.015) | 1.020 | −0.019 | (0.004)*** | 0.980 |
| Comorbidities: Chronic_pulmonary | 0.000 | (0.005) | 1.000 | 0.070 | (0.021)*** | 1.075 |
| Comorbidities: Renal_failure | 0.084 | (0.021)*** | 1.088 | 0.072 | (0.028)** | 1.078 |
| Comorbidities: Anemia | 0.039 | (0.01)*** | 1.040 | −0.079 | (0.03)*** | 0.921 |
| Comorbidities: Alcohol_abuse | −0.283 | (0.2) | 0.754 | −0.017 | (0.012) | 0.982 |
| Comorbidities: Drug_abuse | 0.709 | (0.088)*** | 2.033 | 0.016 | (0.02) | 1.017 |
| Comorbidities: Ischemic_disease | 0.179 | (0.035)*** | 1.197 | 0.018 | (0.01)* | 1.019 |
| Num_beds | 0.009 | (0.006) | 1.009 | −0.004 | (0.003) | 0.997 |
| Tch_hosp. | 0.200 | (0.04)*** | 1.221 | 0.066 | (0.032)** | 1.070 |
| CMI | −0.353 | (0.058)*** | 0.703 | −0.154 | (0.048)*** | 0.861 |
| Patient stickiness | | | | −0.075 | (0.037)** | 0.927 |
| Previous discharge to home/self | | | | 0.135 | (0.031)*** | 1.148 |
| Previous ER reference | | | | −0.002 | (0.008) | 0.997 |

Model parameters: r = 2.833; a = 5.782; � = 0.021; � = 129.097; b = 6.096; � = 0.099. Expected daily admission rate: 2.18%. Expected drop-out rate: 48.68%.

−2 log L = 97,239.979; AIC = 97,379.979.

Note. Standard errors in parentheses. *p = 0.10; **p = 0.05; ***p = 0.01.

7 Because our BG/EG model is based on continuous time, where the time unit is one day, we first calculate the values of the density function for each time point and integrate it over a 30-day period to define the probability for 30 days. To calculate the marginal effect for each variable, we compute the point estimate of the density function by plugging in one coefficient at a time, holding all other variables at their mean values. We then aggregate the functional estimation over 30 days for each variable.
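The procedure in footnote 7, evaluating a daily density and integrating it over 30 days, can be illustrated with a minimal numerical sketch. The exponential density below is only a stand-in chosen because it has a closed form to check against; it is not the BG/EG density, and the rate of 0.0218 borrows the expected daily admission rate from Table 4 purely for illustration.

```python
import math


def exponential_density(t_days: float, daily_rate: float) -> float:
    """Stand-in density f(t) = rate * exp(-rate * t); NOT the BG/EG density."""
    return daily_rate * math.exp(-daily_rate * t_days)


def integrate_density(density, horizon_days: int = 30, steps_per_day: int = 1) -> float:
    """Trapezoidal integration of a daily density over [0, horizon_days]."""
    n = horizon_days * steps_per_day
    h = 1.0 / steps_per_day
    total = 0.5 * (density(0.0) + density(float(horizon_days)))
    for i in range(1, n):
        total += density(i * h)
    return total * h


ILLUSTRATIVE_RATE = 0.0218  # assumption: reuses the 2.18% daily rate from Table 4

prob_30 = integrate_density(lambda t: exponential_density(t, ILLUSTRATIVE_RATE))
closed_form = 1.0 - math.exp(-30 * ILLUSTRATIVE_RATE)
print(f"numerical: {prob_30:.4f}, closed form: {closed_form:.4f}")
```

A marginal effect in the spirit of the footnote would repeat the integration with one covariate's contribution changed while the others are held at their means, and then compare the two 30-day probabilities.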

Next, we observe that payer-type variables are associated with different patterns of readmission propensity and frequency of readmissions. Compared to self-pay patients, Medicare patients exhibit a higher propensity of initial 30-day readmission. However, once they cross this readmission hurdle, their frequency of future readmission decreases by 23.6% (coeff. = −0.263; p-value < 0.01; hazard ratio = 0.764). Future readmissions of Medicaid patients also decrease by 10.2% (coeff. = −0.102; p-value < 0.10; hazard ratio = 0.898). On the other hand, patients with private insurance exhibit a 31.0% higher risk of future readmissions than self-pay patients. In other words, after the first readmission, Medicare/Medicaid patients are likely to incur less frequent hospital admissions relative to self-pay patients, whereas patients with private insurance are likely to be readmitted more frequently. We also observe that ER patients exhibit the same readmission characteristics as Medicare and Medicaid patients.

Our results imply that providers who treat Medicare and Medicaid patients would be well-served to reduce future readmissions, in light of the proposed penalties for not meeting federal readmission requirements (Joynt and Jha 2012). However, for self-pay or private insurance patients, there are no such stipulations and the quality of preventive medical care that they receive is not likely to be on par with that of Medicare patients (Ong et al. 2009). Hospitals are more likely to provide high-quality, preventive care to Medicare and Medicaid patients once they are readmitted by ensuring that they receive extraordinary interventions to reduce future readmissions. For example, a recent partnership between Parkland Hospital in Dallas and the Texas Health Resources system has resulted in the development of a risk stratification model to identify high-risk CHF patients and provide them with a dedicated heart failure team, nurse practitioner, and pharmacist, to reduce the likelihood of future readmissions (Hagland 2011).

We also observe that patients who are discharged to their homes or self-care facilities exhibit a greater frequency of future readmissions (coeff. = 0.135; p < 0.01; odds ratio = 1.148). This result suggests that alternative discharge locations, such as intermediate care or skilled nursing facilities, should be explored for at-risk CHF patients who may otherwise receive inadequate postdischarge care.

5.4. Comparison of Predictive Analytics Performance

Although we have, thus far, focused on explanatory results of the BG/EG hurdle model, we now turn our attention to the predictive capabilities of our proposed model and contrast these results against existing models in the literature. In a recent paper, Shmueli and Koppius (2011, p. 553) advocated the importance of predictive analytics and questioned the (almost exclusive) practice of explanatory statistical modeling in IS research. They argue that

...Despite the importance of predictive analytics, we find that they are rare in the empirical IS literature. Extant IS literature relies nearly exclusively on explanatory statistical modeling.... However, explanatory power does not imply predictive power and thus predictive analytics are necessary for assessing predictive power and for building empirical models that predict well....

Fader et al. (2005) and Morrice and Bardhan (1995) also call for greater use of prediction models as the yardstick for judging model performance.

Our review of the readmission literature reveals a lack of attention to predictive analytics. Whereas the primary focus has been limited to models that estimate patient readmission risk, our BG/EG hurdle model serves as a predictive model that is capable of predicting the propensity and frequency of future readmissions for any given patient. We now evaluate our model using a “horse race” to compare its predictive performance against other benchmark models, which consist of random estimation, the baseline logit model, the BG/NBD hurdle model, the EG hurdle model, and the BG/EG model without hurdle. The logit model represents the baseline model commonly used in the healthcare literature to model readmissions. The BG/NBD hurdle model (Fader et al. 2005) represents the case when nonstationary readmission patterns are not accounted for, and it differs from our model only by replacing our EG component with the stationary NBD process. The EG hurdle model (Gupta 1991) depicts the case where the dropout component (i.e., the BG process) is not accounted for; and the “BG/EG without hurdle” model is similar to our proposed model without the hurdle component. Collectively, we design this horse race to demonstrate the relative performance of our model compared to the current state-of-the-art alternative models that do not fully address the complexity of the patient readmission process.

We use two years of data to calibrate the training model and then test its predictive performance by using the next one year as the holdout period. Hence, our training set consists of admission records of CHF patients from January 2007 through December 2008, and the testing set includes one year of admission data from January to December 2009.8 In the holdout sample, 2,348 patients were readmitted out of 19,408 patients. This evaluation scheme is consistent with the conditional expectation approach of Fader et al. (2005). For the benchmarking models and our BG/EG hurdle model, the last observed visit for each patient in the training set is used as the “snapshot” to build the training model, which is then applied to the test data to predict readmission occurrences of each patient during the holdout period. We measure our model’s predictive accuracy against actual readmission data observed during the last year of our sample period.

8 We drop the first year’s (i.e., 2006) data from our training model to alleviate the potential left censoring problem. Initiating the training model in 2007 allows us to have at least a one-year lead time to ensure that we measure our dependent variable as accurately as possible. We acknowledge that if the index admission occurred before 2006, it is possible (though with small probability) that we may mislabel such readmissions. However, for 30-day readmissions, having a whole year of data as a buffer ensures that our labels are accurate.

Figure 2 Lift Curve of the Predictive Performance of Different Models [x-axis: Patient segment (%); y-axis: Matched readmitted patients (%); series: Random, Baseline logit, EG-hurdle, BG/EG, BG/EG-hurdle]

Figure 2 represents a lift curve, which describes the overall lift in predictive performance provided by the six types of estimation methodologies in our horse race experiment. We derive the probability of readmission for each patient for all models considered in our experiment. Figure 2 demonstrates that our BG/EG hurdle model outperforms all other benchmark models as a whole, followed by the BG/EG without hurdle, BG/NBD hurdle, logit, EG hurdle, and the random estimation model, in that order. For example, compared to the logit model, the lift improvement in our BG/EG hurdle model is 27.29% if we focus on the top 25% of readmitted patients. We provide additional details of the lift table for specific patient segments in Appendix C of the online supplement. Based on our sample data of 587 readmitted, high-risk patients, the BG/EG hurdle model accurately profiles 160 more patients than the baseline logit model.

Furthermore, we adopt another criterion to evaluate the predictive power of the BG/EG hurdle model. This criterion focuses on the accuracy of our model’s prediction of the frequency of future readmissions. Prediction of readmission frequency has important managerial implications because hospital managers can utilize the results to anticipate future demand and allocate hospital resources accordingly. We evaluate our model by first predicting the timing of readmissions for each patient during the holdout period. Then, by aggregating the occurrence of readmissions for each month across all patients who are identified as candidates for readmission, we obtain an expectation of the total number of monthly readmissions. We then compare this expectation to the number of actual monthly readmissions.
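The lift comparison reported with Figure 2 can be outlined in code. The sketch below is a generic cumulative-gains calculation, not the authors' implementation: patients are ranked by a model's predicted readmission probability, and for each patient segment we compute the share of actually readmitted patients that falls inside it. All scores and labels are hypothetical toy values.

```python
def cumulative_gains(scores, readmitted, fractions=(0.25, 0.5, 0.75, 1.0)):
    """Share (%) of readmitted patients captured in the top-scoring segments."""
    # Rank patient indices from highest to lowest predicted probability.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    total_readmitted = sum(readmitted)
    gains = {}
    for frac in fractions:
        k = int(round(frac * len(scores)))  # number of patients in the segment
        captured = sum(readmitted[i] for i in order[:k])
        gains[frac] = 100.0 * captured / total_readmitted
    return gains


def relative_improvement(model_gain: float, baseline_gain: float) -> float:
    """Percentage improvement of one model's gain over a baseline's gain."""
    return 100.0 * (model_gain - baseline_gain) / baseline_gain


# Hypothetical toy data: eight patients, three of whom were readmitted.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
toy_labels = [1, 1, 0, 1, 0, 0, 0, 0]
print(cumulative_gains(toy_scores, toy_labels))
```

Running the same function on each competing model's scores and comparing the values at a given segment (for instance with `relative_improvement`) gives one point of a comparison like the one plotted in Figure 2.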

Since the baseline logit model only deals with a binary dependent variable, we use the estimated logit probability along with a simple OLS regression model to predict the frequency of future readmissions. First, for each patient, we calculate the projected number of total admissions based on OLS estimation. At the same time, an estimated logit probability is derived for each patient. Next, we estimate the projected number of admissions for patients with estimated odds greater than 0.5 for each month. Figure 3 reports the actual readmissions versus the predicted readmission frequency based on the BG/EG hurdle and baseline logit estimation for each month of the one-year holdout period. Overall, we observe that the BG/EG hurdle model provides a fairly accurate prediction, where the average difference between the predicted and actual values is 10%. On the other hand, the average difference between the predicted and actual values for the baseline model is significantly worse at 62%. The BG/EG hurdle model is able to outperform the baseline models by accurately identifying and predicting patient readmission patterns. It is worth noting that our hurdle model also achieves a higher accuracy in terms of predicting the 30-day readmission rate, which is within 4.7% of the actual number of readmissions.
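The frequency criterion, aggregating predicted readmissions by month and comparing them with actual monthly counts, can be sketched as follows. This is an illustration under our own assumptions: the helper names are hypothetical, the "average difference" is taken to be a mean absolute percentage difference across months (the text does not spell out the formula), and the data are toy values.

```python
from collections import Counter


def monthly_totals(predicted_months_by_patient, n_months=12):
    """Count predicted readmissions per month (1..n_months) across patients."""
    counts = Counter(
        month
        for months in predicted_months_by_patient.values()
        for month in months
    )
    return [counts.get(m, 0) for m in range(1, n_months + 1)]


def mean_abs_pct_diff(predicted, actual):
    """Mean absolute percentage difference; every actual count must be nonzero."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    diffs = [abs(p - a) / a for p, a in zip(predicted, actual)]
    return 100.0 * sum(diffs) / len(diffs)


# Hypothetical predicted readmission months for three patients over a quarter.
toy_predictions = {"p1": [1, 3], "p2": [3], "p3": []}
print(monthly_totals(toy_predictions, n_months=3))
print(mean_abs_pct_diff([110, 90], [100, 100]))
```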

5.5. Robustness Checks

A common alternative to our approach is to create an out-of-sample set by randomly selecting a subset of patients from our data across all years, and then to predict the readmission patterns for this set given the in-sample set representing other patients’ readmission history (Shmueli and Koppius 2011). We generate such a split by randomly selecting 50% of patients as the in-sample set and treating the rest as the out-of-sample set. We recalibrate our model on the in-sample set and use it to predict readmissions for out-of-sample patients. For each out-of-sample patient, we use the BG/EG hurdle model to predict the propensity of a 30-day readmission.
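A patient-level random split of this kind might be implemented as in the sketch below. The function name, the seed, and the integer patient identifiers are assumptions for illustration; the key point is that the split is made over patients rather than over admission records, so that all of a patient's admissions fall on one side.

```python
import random


def split_patients(patient_ids, in_sample_share=0.5, seed=0):
    """Randomly assign unique patients to in-sample and out-of-sample sets."""
    unique_ids = sorted(set(patient_ids))   # deduplicate repeated admissions
    rng = random.Random(seed)               # fixed seed for reproducibility
    rng.shuffle(unique_ids)
    cut = int(len(unique_ids) * in_sample_share)
    return set(unique_ids[:cut]), set(unique_ids[cut:])


# Hypothetical identifiers; repeated ids mimic multiple admissions per patient.
admission_patient_ids = [1, 1, 2, 3, 3, 3, 4, 5, 6, 6]
in_sample, out_of_sample = split_patients(admission_patient_ids)
print(len(in_sample), len(out_of_sample))
```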


Figure 3 Predicted versus Actual Number of Patient Admissions [panel title: Prediction of total number of admissions; x-axis: Month (1 to 12); y-axis: Total no. of admissions; series: Actual, BG/EG hurdle, Logit]

The BG/EG hurdle model exhibits an overall out-of-sample accuracy of 59%, whereas the overall accuracy of the logit model is 56%. One concern regarding the prediction accuracy of a model among high-risk patient segments is that it may be achieved at the cost of over-predicting readmitted cases. Therefore, we also account for type-II errors, which represent the probabilities of false classification of patients as readmission cases. Our randomly selected out-of-sample prediction shows that the type-II error for the BG/EG hurdle model is lower than that of the baseline logit model. The C-statistic, the standard measure that accounts for the trade-off between type I and type II errors, is 0.601 for the BG/EG hurdle model, whereas the corresponding C-statistic of the logit model is 0.563.
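For readers unfamiliar with the C-statistic, it equals the probability that a randomly chosen readmitted patient receives a higher predicted score than a randomly chosen non-readmitted patient, with ties counted as one half. The sketch below computes it by direct pairwise comparison; it is a generic illustration on hypothetical toy data, not the procedure used to obtain the values reported above.

```python
def c_statistic(scores, labels):
    """Concordance (area under the ROC curve) by pairwise comparison."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0   # readmitted patient ranked higher
            elif p == n:
                concordant += 0.5   # ties count as half
    return concordant / (len(positives) * len(negatives))


# Hypothetical scores: 1 = readmitted, 0 = not readmitted.
print(c_statistic([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which puts the reported values of 0.601 and 0.563 in context.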

We also check the robustness of our results by analyzing whether our estimations that are based on multihospital data still hold for a single hospital sample. Our analysis of patient data from a large, teaching hospital shows that the traditional, single-hospital readmission rate, when readmissions to other hospitals are not accounted for, is 30.55%. On the other hand, if we also account for patients who are readmitted to other hospitals in the region, the readmission rate climbs to 40.45%, which indicates that a single-hospital view of readmission is erroneous and underestimates the true risk of readmission. For example, although diabetes, renal failure, drug abuse, and ischemic disease are significant factors that increase the risk of CHF readmission in our analysis, a single-hospital view only identifies drug abuse as a readmission risk factor. Although the actual number of CHF admissions to this hospital is 12.3 cases per month, our model predicts an average of 13.2 CHF-related visits per month, whereas the logistic model predicts an average of 27.2 visits per month. This is an average difference across months of 33.3% and 172% for the BG/EG and logistic models, respectively, confirming that the BG/EG model outperforms logit models even based on data obtained from a single hospital.

We further estimate the all-period readmission results, where we study readmissions over the entire four-year period (instead of 30 days), and report these results in Appendix D of the online appendix. These results are qualitatively similar to the main results reported in the paper.

6. Conclusions

Understanding the characteristics of patient readmission patterns allows hospitals to develop better predictive capabilities in order to identify and profile patients who pose greater readmission risk. Predicting the propensity of readmission for a CHF patient enables hospitals to identify and deliver appropriate treatment to the right patients and provide more efficient postacute care, which significantly reduces subsequent readmissions. In this study, we examine the association between patient and health IT characteristics and the risk propensity of future readmissions for patients with CHF. By incorporating patient history of readmissions across multiple hospitals, we develop a predictive BG/EG hurdle model that accounts for unobserved patient heterogeneity, nonstationary admission rates, time-varying risk factors, and data censoring. Furthermore, we estimate the specific effects related to the propensity as well as the frequency of future readmissions. Our proposed model represents a significant methodological improvement over extant models of readmission risk, and delivers superior predictive performance compared to traditional models.

By developing a greater understanding of patient readmission behavior, a hospital can better profile patients who are at higher risk of readmission and implement preventive measures to target these patients effectively. The scope of previous academic research on patient readmissions has been severely limited because of the lack of information sharing across hospitals that can be attributed to the absence of a common master patient index. In this study, we identify the risk factors associated with patient readmissions across multiple hospitals over a four-year period based on a unique data set obtained through electronic integration of patient data across 67 hospitals in a large geographical region.

6.1. Discussion

There are several deficiencies in the manner in which the extant healthcare literature treats the readmission problem. Admittedly, monitoring the 30-day hospital readmission rate is one of the key barometers of the current healthcare reform plan. However, overly focusing on short-term (e.g., 30-day) readmission targets alone may lead to myopic actions, as evident from the hurdle estimation component of the BG/EG hurdle model. We find that managerial insights obtained from the logit hurdle component and the BG/EG component are drastically different. For instance, we observe that health IT systems are associated with a reduction in the frequency of future readmissions, once the 30-day readmission hurdle has been crossed. Similarly, we find that Medicare patients exhibit a lower frequency of future readmissions, once they cross the 30-day initial readmission hurdle. This paper sends a forward-looking message to policy makers that thinking beyond 30-day readmissions may be necessary.

We observe that health IT applications represent important tools in reducing avoidable inpatient readmissions. For example, Parkland Hospital in Dallas captures clinical, social, and demographic characteristics of patient data in EMR systems, which is used to calculate a risk score for each heart failure patient on admission (Hagland 2011). A high risk score triggers an alert to a heart failure “SWAT team” for follow-up care. Our results suggest that the use of cardiology and administrative IT applications helps to reduce patient readmission risk and lays the foundation for more effective treatment and care delivery. Our empirical results support recent recommendations made by the Hospital Readmission Workgroup, which advocates use of health IT tools, such as case management systems, predictive analytics, and social media, as enablers to offer patients better postdischarge care and reduce the incidence of hospital readmissions (HIMSS 2012).

6.2. Implications

Prediction of the timing and frequency of a patient’s future readmissions is a unique component of our model, since it enables managers to make better decisions related to hospital capacity planning. When aggregated across thousands of patient readmissions in a given year, even relatively small improvements in predictive modeling of readmission risk and frequency can substantially improve the quality and cost of healthcare delivery. For example, in 2004, the average hospitalization cost for a CHF case was estimated to be $9,400 (Russo et al. 2007). Hence, our model’s superior prediction capabilities can potentially provide average annual savings of up to $1,504,000 for these high-risk patients, if our predictions were to result in successful readmission avoidance when we correctly predict their readmission propensity and apply preventive care in advance (e.g., deployment of a dedicated cardiology team and support processes).⁹
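The savings figure follows from the lift-curve comparison described in footnote 9: rank patients from highest to lowest predicted readmission probability, keep the top 25%, count the true positives under each model, and multiply the difference by the average cost of a CHF hospitalization. The Python sketch below illustrates that arithmetic only. The function names and the eight-patient toy sample are hypothetical and are not the study's data or code.

```python
from typing import Sequence

# Average cost of one CHF hospitalization in 2004 (Russo et al. 2007), in USD.
COST_PER_CHF_STAY = 9_400


def true_positives_in_top(scores: Sequence[float],
                          readmitted: Sequence[int],
                          top_fraction: float = 0.25) -> int:
    """Rank patients by predicted score (highest first) and count the
    actual readmissions among the top `top_fraction` of the ranking."""
    ranked = sorted(zip(scores, readmitted), key=lambda pair: pair[0], reverse=True)
    cutoff = int(len(ranked) * top_fraction)
    return sum(1 for _, outcome in ranked[:cutoff] if outcome)


def potential_savings(extra_true_positives: int,
                      cost_per_stay: int = COST_PER_CHF_STAY) -> int:
    """Upper bound on savings if every additional correctly flagged
    patient's readmission were avoided."""
    return extra_true_positives * cost_per_stay


# Hypothetical toy sample: 1 = patient was readmitted, 0 = not readmitted.
readmitted = [1, 0, 1, 0, 0, 1, 0, 0]
baseline_scores = [0.9, 0.8, 0.3, 0.7, 0.2, 0.4, 0.1, 0.5]  # stand-in for the logit model
hurdle_scores = [0.9, 0.3, 0.8, 0.2, 0.1, 0.4, 0.5, 0.6]    # stand-in for the hurdle model

extra = (true_positives_in_top(hurdle_scores, readmitted)
         - true_positives_in_top(baseline_scores, readmitted))
print(extra, potential_savings(extra))

# The paper's own figure: 160 additional patients at $9,400 each.
print(potential_savings(160))
```

With the paper's numbers, `potential_savings(160)` gives 1,504,000, which matches the $1,504,000 quoted above. This is an upper bound, since it assumes every additional correctly profiled patient's readmission is avoided.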

Furthermore, disentangling the estimation of the frequency from readmission propensity provides us with a more accurate and nuanced understanding of patient readmission patterns. From a patient perspective, we find that repeat care delivery at the same hospital reduces the risk of future readmissions significantly. This indicates that a patient treated at the same hospital (across multiple visits) tends to receive better quality of care, which reduces their risk of being readmitted for the same diagnosis in the future.

Predictive analytics is central to the Medicare Hospital Readmissions Reduction Program (HRRP), as established by the Affordable Care Act (ACA) to improve the quality of healthcare. To reduce preventable readmissions, the ACA imposes a financial penalty on hospitals with above-average Medicare readmissions, where the penalties are collected from hospitals through a reduction in their base Medicare inpatient claims payments, up to a cap set at 3% each year (James 2013). For instance, one of the hospitals in our sample would have faced a significant penalty if this regulation had been enforced in 2009. Its CHF inpatient-related charges were $9 million, of which $5.7 million was charged to Medicare patients. Under the HRRP, the financial penalty for this hospital would be about $171,000. If the hospital could successfully implement healthcare IT, it could potentially reduce patient readmission rates by 4%, resulting in a reduction of $360,000 in readmission costs as well as avoiding $171,000 in financial penalties due to ACA regulations, for a total cost reduction of $531,000.
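The penalty example can be reproduced with a few lines of arithmetic. The sketch below assumes that the $171,000 penalty is the 3% cap applied to the $5.7 million in Medicare charges and that the $360,000 reduction is 4% of the $9 million in total CHF charges. Both readings match the quoted figures, but the text does not spell out the derivation. The function names are hypothetical.

```python
def hrrp_penalty(medicare_charges: int, cap_percent: int = 3) -> float:
    """Maximum HRRP penalty, assuming the cap applies to Medicare charges."""
    return medicare_charges * cap_percent / 100


def readmission_cost_reduction(total_chf_charges: int,
                               reduction_percent: int = 4) -> float:
    """Cost avoided if readmission-related charges fall by the given
    percentage (assumed here to apply to total CHF charges)."""
    return total_chf_charges * reduction_percent / 100


# Figures quoted in the text for one hospital in the sample (2009).
total_chf_charges = 9_000_000
medicare_charges = 5_700_000

penalty = hrrp_penalty(medicare_charges)
avoided_costs = readmission_cost_reduction(total_chf_charges)
total_reduction = penalty + avoided_costs

print(f"penalty avoided:       ${penalty:,.0f}")
print(f"readmission costs cut: ${avoided_costs:,.0f}")
print(f"total cost reduction:  ${total_reduction:,.0f}")
```

Percentages are passed as whole numbers so that the products stay exact. Multiplying by `0.03` directly could introduce floating-point rounding in the last digit.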

⁹ The estimated cost savings are based on the top 25% highest risk patients from the lift curve, which first ranks patients from the highest readmission probability to the lowest, and then counts the rate of true positives. The predictive performance from the lift curve shows that the BG/EG hurdle model accurately profiles 160 more patients than the baseline logit model. Therefore, the potential cost savings of accurately predicting 160 patients is equal to 160 × $9,400 per patient, or $1.5 million.

6.3. Limitations and Future Research

Nevertheless, our study does have a few limitations. Our model is limited to studying CHF patients within one geographic region. Although the North Texas region is fairly diverse in terms of its population, future studies are needed to expand the scope of our models to account for patient demographic characteristics in other regions of the country. Our study is restricted to patients whose primary diagnosis is CHF. Future studies will extend these models to other chronic diseases. Our measures of hospital IT usage are based on the HIMSS data, which provides information on overall hospital-wide health IT applications instead of their usage for treatment of specific patients. Future studies will be designed to investigate the impact of the usage of different types of health IT for treatment of specific patient and disease clusters. We acknowledge that our observed relationships between health IT and patient readmission risk are associational in nature, although we have accounted for potential endogeneity in terms of readmission risk.

Supplemental Material

Supplemental material to this paper is available at http://dx.doi.org/10.1287/isre.2014.0553.

Acknowledgments

The authors gratefully acknowledge the feedback received from the senior editor, associate editor, and the review team. The authors thank seminar participants at the University of Washington, University of Arizona, University of Texas at Austin, University of Texas at Dallas, University at Buffalo, Georgia State University; as well as attendees of the 32nd International Conference on Information Systems, Shanghai, and the 2011 INFORMS Conference on Information Systems and Technology, Charlotte; and Dr. Gary Reed at the University of Texas Southwestern Medical Center for their helpful feedback. The authors also gratefully acknowledge the Dallas Fort Worth Hospital Council Research Foundation and the Healthcare Information and Management Systems Society for their help in providing access to the research data. Financial support from UT Dallas and research grants awarded by the UT Southwestern Medical Center (Contract 34456001 and 34456002) are gratefully acknowledged.

References

Agarwal R, Gao G, DesRoches C, Jha AK (2010) Research commentary—The digital transformation of healthcare: Current status and the road ahead. Inform. Systems Res. 21(4):796–809.

Alexander M, Grumbach K, Remy L, Rowell R, Massie BM (1999) Congestive heart failure hospitalizations and survival in California: Patterns according to race/ethnicity. Amer. Heart J. 137(5):919–927.

Amarasingham R, Plantinga L, Diener-West M, Gaskin DJ, Powe NR (2009) Clinical information technologies and inpatient outcomes: A multiple hospital study. Arch. Intern. Med. 169(2):108–114.

Amarasingham R, Moore BJ, Tabak YP, Drazner MH, Clark CA, Zhang S, Reed WG, Swanson TS, Ma Y, Halm EA (2010) An automated model to identify heart failure patients at risk for 30-day readmission or death using electronic medical record data. Medical Care 48(11):981–988.

Anderson CL, Agarwal R (2011) The digitization of healthcare. Inform. Systems Res. 22(3):469–490.

Angst CM, Agarwal R, Sambamurthy V, Kelley K (2010) Social contagion and information technology diffusion: The adoption of electronic medical records in U.S. hospitals. Management Sci. 56(8):1219–1241.

Arellano M, Bond S (1991) Some tests of specification for panel data: Monte Carlo evidence and an application to employment equations. Rev. Econom. Stud. 58(2):277–297.

Aron R, Dutta S, Janakiraman R, Pathak PA (2011) Impact of automation of systems on medical errors. Inform. Systems Res. 22(3):429–446.

Bardhan I, Thouin M (2013) Health information technology and its impact on the quality and cost of healthcare delivery. Decision Support Systems 55(2):438–449.

Bhattacherjee A, Hikmet N, Menachemi N, Kayhan VO, Brooks RG (2007) The differential performance effects of healthcare information technology adoption. Inform. Systems Management 24(1):5–14.

Blumenthal D, Tavenner M (2010) The “meaningful use” regulation for electronic healthcare records. New England J. Medicine 363(6):501–504.

Buntin MB, Burke MF, Hoaglin MC, Blumenthal D (2011) The benefits of health information technology: A review of the recent literature shows predominantly positive results. Health Affairs 30(3):464–471.

Campbell EM, Sittig DF, Ash JS, Guappone KP, Dykstra RH (2006) Types of unintended consequences related to computerized provider order entry. J. Amer. Med. Informatics Assoc. 13(5):547–556.

Cebul RD, Love TE, Jain AK, Hebert CJ (2011) Electronic health records and quality of diabetes care. New England J. Medicine 365(9):825–833.

Chatfield C, Goodhardt GJ (1973) A consumer purchasing model with Erlang interpurchase time. J. Amer. Statist. Assoc. 68(344):828–835.

Chin M, Goldman M (1997) Correlates of early hospital readmission or death in patients with congestive heart failure. Amer. J. Cardiol. 79(12):1640–1644.

Congressional Budget Office (2008) Evidence on the Costs and Benefits of Health Information Technology. Washington, DC. www.cbo.gov/publication/41690.

Cox DR (1972) Regression models and life-tables. J. Royal Statist. Soc. Ser. B (Methodological) 34(2):187–220.

Das S, Yaylacicegi U, Menon NM (2011) The effect of information technology investments in healthcare: A longitudinal study of its lag, duration, and economic value. IEEE Trans. Engrg. Management 58(1):124–140.

Deb P, Trivedi PK (2002) The structure of demand for health care: Latent class versus two-part models. J. Health Econom. 21(4):601–625.

DesRoches CM, Campbell EG, Vogeli C, Zheng J, Rao SR, Shields AE, Donelan K, Rosenbaum S, Bristol SJ, Jha AK (2010) Electronic health records’ limited successes suggest more targeted uses. Health Affairs 29(4):639–646.

Deswal A, Petersen NJ, Souchek J, Ashton CM, Wray NP (2004) Impact of race on health care utilization and outcomes in veterans with congestive heart failure. J. Amer. Coll. Cardiol. 43(5):778–784.

Devaraj S, Kohli R (2000) Information technology payoff in the health-care industry: A longitudinal study. J. Management Inform. Systems 16(4):41–67.

Devaraj S, Kohli R (2003) Performance impacts of information technology: Is actual usage the missing link? Management Sci. 49(3):273–289.

Elixhauser A, Steiner C, Harris DR, Coffey RM (1998) Comorbidity measures for use with administrative data. Medical Care 36(1):8–27.

Fader PS, Hardie BGS, Huang C-Y (2004) A dynamic change-point model for new product sales forecasting. Marketing Sci. 23(1):50–65.

Fader PS, Hardie BGS, Lee KL (2005) “Counting your customers” the easy way: An alternative to the Pareto/NBD model. Marketing Sci. 24(2):275–284.

Felker GM, Leimberger JD, Califf RM, Cuffe MS, Massie BM, Adams KF, Gheorghiade M, O’Connor CM (2004) Risk stratification after hospitalization for decompensated heart failure. J. Cardiac Failure 10(6):460–466.


Gao G, McCullough J, Agarwal R, Jha A (2010) A study of online physician ratings by patients. Working paper, R. H. Smith School of Business, University of Maryland, College Park.

Gönül FF, Hofstede FT (2006) How to compute optimal catalog mailing decisions. Marketing Sci. 25(1):65–74.

Gupta S (1991) Stochastic models of interpurchase time with time-dependent covariates. J. Mktg. Res. 28(1):1–15.

Hagland M (2011) Mastering readmissions: Laying the foundation for change. Healthcare Informatics 28(4):10–16.

Han YY, Carcillo JA, Venkataraman ST, Clark RSB, Watson RS, Nguyen TC, Bayir H, Orr RA (2005) Unexpected increased mortality after implementation of a commercially sold computerized physician order entry system. Pediatrics 116(6):1506–1512.

Heckman JJ (1991) Identifying the hand of past: Distinguishing state dependence from heterogeneity. Amer. Econom. Rev. 81(2):75–79.

Hillestad R, Bigelow RJ, Bower A, Girosi F, Meili R, Scoville R, Taylor R (2005) Can electronic medical record systems transform health care? Potential health benefits, savings, and costs. Health Affairs 24(5):1103–1117.

Himmelstein DU, Wright A, Woolhandler S (2010) Hospital computing and the costs and quality of care: A national study. Amer. J. Medicine 123(1):40–46.

HIMSS (2012) Reducing readmissions: Top ways information technology can help. The Hospital Readmission Workgroup, Management Engineering-Process Improvement Committee, Chicago. www.himss.org/ResourceLibrary/ResourceDetail.aspx?ItemNumber=10534.

Jain DC, Vilcassim NJ (1991) Investigating household purchase timing decisions: A conditional hazard function approach. Marketing Sci. 10(1):1–23.

James J (2013) Health policy brief: Medicare hospital readmission reduction program. Health Affairs (November 12).

Jeuland AP, Bass FM, Wright GP (1980) A multibrand stochastic model compounding heterogeneous Erlang timing and multinomial choice processes. Oper. Res. 28(2):255–277.

Joynt KE, Jha AK (2012) Thirty-day readmissions—truth and consequences. New England J. Medicine 366(15):1366–1369.

Joynt KE, Orav EJ, Jha AK (2011) Thirty-day readmission rates for Medicare beneficiaries by race and site of care. J. Amer. Med. Assoc. 305(7):675–681.

Kansagara DEH, Englander H, Salanitro A, Kagen D, Theobald C, Freeman M, Kripalani S (2011) Risk prediction models for hospital readmission: A systematic review. J. Amer. Med. Assoc. 306(15):1688–1698.

Kaushal RSK, Shojania KG, Bates DW (2003) Effects of computerized physician order entry and clinical decision support systems on medication safety: A systematic review. Arch. Intern. Med. 163(12):1409–1416.

Krumholz HM, Chen Y-T, Wang Y, Vaccarino V, Radford MJ, Horwitz RI (2000) Predictors of readmission among elderly survivors of admission with heart failure. Amer. Heart J. 139(1):72–77.

Linder JA, Rigotti NA, Schneider LI, Kelley JHK, Brawarsky PP, Haas JS (2009) An electronic health record–based intervention to improve tobacco treatment in primary care: A cluster-randomized controlled trial. Arch. Intern. Med. 169(8):781–787.

McCullough JS, Casey M, Moscovice I, Prasad S (2010) The effect of health information technology on quality in U.S. hospitals. Health Affairs 29(4):647–654.

Menachemi N, Chukmaitov A, Saunders C, Brooks RG (2008) Hospital quality of care: Does information technology matter? The relationship between information technology adoption and quality of care. Health Care Manage. Rev. 33(1):51–59.

Menon NM, Lee B, Eldenburg L (2000) Productivity of information systems in the healthcare industry. Inform. Systems Res. 11(1):83–92.

Miller AR, Tucker CE (2011) Can health care information technology save babies? J. Political Econom. 119(2):289–324.

Morrice DJ, Bardhan IR (1995) A weighted least squares approach to computer simulation factor screening. Oper. Res. 43(5):792–806.

Morrison DG, Schmittlein DC (1981) Predicting future random events based on past performance. Management Sci. 27(9):1006–1023.

Mudge AM, Kasper K, Clair A, Redfern H, Bell JJ, Barras MA, Dip G, Pachana NA (2010) Recurrent readmissions in medical patients: A prospective study. J. Hosp. Med. 6(2):61–67.

Muus K, Knudson A, Klug M, Gokun J, Sarrazin M (2010) Effect of postdischarge follow-up care on readmissions among U.S. veterans with congestive heart failure: A rural-urban comparison. Internat. J. Rural Remote Health Res. 10(2):1447.

Nasir K, Lin Z, Bueno H, Normand S-LT, Drye EE, Keenan PS, Krumholz HM (2010) Is same-hospital readmission rate a good surrogate for all-hospital readmission rate? Medical Care 48(5):477–481.

Ong M, Mangione CM, Romano PS, Zhou Q, Auerbach AD, Chun A, Davidson B, et al. (2009) Looking forward, looking back. Circulation: Cardiovascular Quality Outcomes 2(6):548–557.

Philbin EF, DiSalvo TG (1999) Prediction of hospital readmission for heart failure: Development of a simple risk score based on administrative data. J. Amer. Coll. Cardiol. 33(6):1560–1566.

Philbin EF, Dec GW, Jenkins PL, DiSalvo TG (2001) Socioeconomic status as an independent risk factor for hospital readmission for heart failure. Amer. J. Cardiol. 87(12):1367–1371.

Pratt L (2010) N.J. teaching hospital parlays cardiology and IT partnership into new business model. Health Imaging IT 8(6):1–4.

PricewaterhouseCoopers (2010) The price of excess: Identifying waste in healthcare spending. PricewaterhouseCoopers Health Research Institute. www.pwc.com/us/en/healthcare/publications/the-price-of-excess.jhtml.

Ross JS, Mulvey GK, Stauffer B, Patlolla V, Bernheim SM, Keenan PS, Krumholz HM (2008) Statistical models and patient predictors of readmission for heart failure. Arch. Intern. Med. 168(13):1371–1386.

Russo CA, Ho K, Elixhauser A (2007) Hospital stays for circulatory diseases, 2004. HCUP Statistical Brief #26 (Agency for Healthcare Research and Quality, Rockville, MD).

Schweidel DA, Knox G (2013) Incorporating direct marketing activity into latent attrition models. Marketing Sci. 32(3):471–487.

Seetharaman PB, Chintagunta PK (2003) The proportional hazard model for purchase timing: A comparison of alternative specifications. J. Bus. Econom. Statist. 21(3):368–382.

Shelton P, Sager M, Schraeder C (2000) The community assessment risk screen (CARS): Identifying elderly persons at risk for hospitalization or emergency department visit. Amer. J. Managed Care 6(8):925–933.

Shmueli G, Koppius OR (2011) Predictive analytics in information systems research. MIS Quart. 35(3):553–572.

Silverstein MD, Qin H, Mercer SQ, Fong J, Haydar Z (2008) Risk factors for 30-day hospital readmission in patients ≥ 65 years of age. Proc. Baylor U. Medical Center 21(4):363–372.

Winkelmann R (2004) Health care reform and the number of doctor visits—An econometric analysis. J. Appl. Econom. 19(4):455–472.

Winkelmann R (2006) Reforming health care: Evidence from quantile regressions for counts. J. Health Econom. 25(1):131–145.

Winkelmann R (2010) Econometric Analysis of Count Data (Springer, Berlin).

Wooldridge JM (2010) Econometric Analysis of Cross Section and Panel Data, 2nd ed. (MIT Press, Cambridge, MA).

Zheng K, Padman R, Johnson MP, Diamond HS (2005) Understanding technology adoption in clinical care: Clinician adoption behavior of a point-of-care reminder system. Int. J. Medical Informatics 74(7–8):535–543.
