Validating EMR Audit Automation Carl A. Gunter University of Illinois Accountable Systems Workshop.
Validating EMR Audit Automation
Carl A. Gunter, University of Illinois
Accountable Systems Workshop
Situation
• Access to hospital Electronic Medical Record (EMR) data carries a risk of high loss in the event of false negatives (incorrect refusal of access).
– Example: a doctor acting on an emergency cannot get access to a list of allergies.
• Hospital has highly trained personnel in whom much trust is vested.
Consequences
• Hospital access systems give liberal access to records, relying on accountability.
• Insider threats are serious and abuses are widely documented.
• Accesses are too numerous to review manually by experts.
• Automated support is required.
Root Problem Statement
Ideal Approach
• Obvious approach: develop an anomaly detector (AD) with rules and train classifiers on bad and good accesses.
• Run the AD on the audit logs and investigate positives manually with domain experts.
Problem
• This requires considerable dependence on experts.
• Assumes experts know how to provide labels.
• Assumes experts can formulate rules.
• Assumes labeled training sets exist and that researchers will be able to get access to them.
Validation Problem Statement
• The primary validation approach applied by researchers in this area can be called the Random Object Access Model (ROAM).
• ROAM is based on the premise that anomalous users and accesses look random.
• Strategy
– Develop rules and train a classifier on a real data set augmented with synthetic random users and accesses.
– Test the ability to recognize random users or accesses.
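The augmentation step of the ROAM strategy can be sketched as follows. This is a minimal illustration, not the procedure used in any particular study: the log layout (a dict from user ID to a list of patient IDs), the function name `inject_random_users`, and the uniform sampling of patients are all assumptions made for the example.

```python
import random

def inject_random_users(access_log, n_fake, accesses_per_user, rng):
    """Return (augmented_log, labels) with synthetic random users added.

    access_log: dict mapping user ID -> list of accessed patient IDs
                (a hypothetical layout chosen for this sketch).
    labels:     dict mapping user ID -> 0 for real users, 1 for synthetic ones.
    """
    # Pool of all patients that appear anywhere in the real log.
    patients = sorted({p for accessed in access_log.values() for p in accessed})
    augmented = dict(access_log)
    labels = {user: 0 for user in access_log}
    for i in range(n_fake):
        user_id = f"synthetic_{i}"
        # A "random" user accesses patients chosen uniformly at random.
        augmented[user_id] = [rng.choice(patients) for _ in range(accesses_per_user)]
        labels[user_id] = 1
    return augmented, labels

# Toy log: two real users with specialty-like access patterns (made-up data).
real_log = {
    "dr_a": ["p1", "p2", "p1", "p3"],
    "dr_b": ["p7", "p8", "p9", "p8"],
}
augmented_log, labels = inject_random_users(real_log, n_fake=2, accesses_per_user=4,
                                            rng=random.Random(0))
```

A classifier trained on `augmented_log` can then be scored on how well it separates users labeled 1 from users labeled 0.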
Primary Validation Approach
Pro
• Likely that illegitimate accesses appear random.
• Good ROAM classifier prepares for expert review to identify false positives.
• ROAM classifier may find legitimate but interesting hospital information flows.
• Provides a ready testing strategy reminiscent of “fuzzing”.
Con
• There is no current quantified evidence that random accesses and illegitimate accesses have strong overlap.
• Indeed, there is evidence that in some cases legitimate accesses look random.
• Some illegitimate accesses may be systematic in ways that defy detection by ROAM classifiers.
ROAM Assessment
• What are the prospects for alternative models?
• Example: introduce specific attacks experienced “in the wild”, similar to network traces enriched with known attacks.
• Another idea: look at problems like masquerading and open terminals.
• Behaviors are not random, but may display learnable characteristics.
Beyond ROAMing
Explored an alternative validation model based on topic classification. Idea:
• Patients are “documents” and diagnoses, drugs, etc. are their “words”.
• Use Latent Dirichlet Allocation (LDA) to learn topics that can be used to classify patients.
• Use this to characterize users as readers of documents.
• Detect unusual readers.
• Detect readers of random topics.
Modeling and Detecting Anomalous Topic Access, Siddharth Gupta, Casey Hanson, Carl A. Gunter, Mario Frank, David Liebovitz, and Bradley Malin. IEEE Intelligence and Security Informatics, June 2013.
Random Topic Access Model (RTAM)
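One way to "characterize users as readers of documents" is to place each user in the same topic space as the patients. The sketch below assumes that per-patient topic distributions are already available (for example from an LDA fit) and that a user is summarized by the mean of the topic distributions of the patients accessed. That averaging rule and the name `user_topic_vector` are assumptions for illustration; the slides do not specify the exact construction.

```python
def user_topic_vector(accessed_patients, patient_topics):
    """Average the topic distributions of the patients a user accessed.

    accessed_patients: list of patient IDs (repeats count as repeated reads).
    patient_topics:    dict mapping patient ID -> list of topic weights
                       (assumed to come from a topic model such as LDA).
    """
    n_topics = len(next(iter(patient_topics.values())))
    totals = [0.0] * n_topics
    for patient_id in accessed_patients:
        for topic, weight in enumerate(patient_topics[patient_id]):
            totals[topic] += weight
    return [total / len(accessed_patients) for total in totals]

# Made-up topic weights over two topics.
patient_topics = {"p1": [1.0, 0.0], "p2": [0.0, 1.0]}
reader = user_topic_vector(["p1", "p2"], patient_topics)
```

Here `reader` is an even blend of the two topics because the user read one patient from each.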
[Figures: Topic Distributions; Diagnosis Topics (Neoplasm Topic, Obstetric Topic, Kidney Topic); Multidimensional Scaling: Patient Diagnosis]
RTAM: Random Users
• r ~ Dir(α) with n dimensions, where n is the number of topics.
a.) Direct or Masquerading User (α<1): an anomalous user of some specialty gains sole access to the terminal of another user in the hospital.
b.) Purely Random User (α=1): user is characterized by completely random behavior, with little semantic congruence to the hospital setting.
c.) Indirect User: user type resembles an even blend of the topics of many specialized users.
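Drawing a synthetic user's topic mixture from a symmetric Dirichlet can be sketched with the standard library alone, using the fact that a Dirichlet sample is a vector of independent Gamma draws normalized to sum to 1. The helper name `sample_dirichlet` and the specific concentration values are assumptions. In particular, the slide gives no α for the indirect user, so the large value used here (α = 10, which pulls samples toward an even blend) is only a guess at the intent.

```python
import random

def sample_dirichlet(alpha, n_topics, rng):
    """Draw one topic mixture r ~ Dir(alpha) with symmetric concentration alpha."""
    # Independent Gamma(alpha, 1) draws, normalized, follow Dir(alpha, ..., alpha).
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(n_topics)]
    total = sum(draws)
    return [d / total for d in draws]

rng = random.Random(0)
n_topics = 5  # hypothetical number of topics

# alpha < 1: mass tends to concentrate on a few topics (masquerading specialist).
masquerader = sample_dirichlet(0.1, n_topics, rng)
# alpha = 1: uniform over the simplex (purely random user).
purely_random = sample_dirichlet(1.0, n_topics, rng)
# alpha > 1 (assumed, not from the slides): close to an even blend (indirect user).
indirect = sample_dirichlet(10.0, n_topics, rng)
```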
• Random Topic Access Detection (RTAD): an anomaly detection framework that generates synthetic users using RTA and applies a standard spatial outlier detection scheme based on k-nearest neighbors (k-NN) for classification.
• Methodology
1. LDA: define patient topics, and user typing to represent users in the topic space.
2. RTA user injection: generate three types of anomalous users and insert into each role at a 5% mix rate.
3. Detection (k-NN): if the ratio of the avg. distance from a user to its k nearest spatial neighbors to the avg. pairwise distance among those neighbors is greater than a threshold, call the user anomalous.
4. Evaluation Metric: best Area Under the Curve (AUC) for each , role combination.
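The detection rule in step 3 can be sketched as a small function. This is an illustrative reading of the rule as stated, assuming Euclidean distance in topic space and k ≥ 2; the function name `knn_outlier_score`, the toy vectors, and the threshold of 2.0 are hypothetical and not taken from the slides.

```python
import math

def knn_outlier_score(user, others, k):
    """Ratio of (avg distance from user to its k nearest neighbors)
    to (avg pairwise distance among those k neighbors). Requires k >= 2."""
    nearest = sorted(others, key=lambda v: math.dist(user, v))[:k]
    avg_to_neighbors = sum(math.dist(user, v) for v in nearest) / k
    pairs = [(a, b) for i, a in enumerate(nearest) for b in nearest[i + 1:]]
    avg_pairwise = sum(math.dist(a, b) for a, b in pairs) / len(pairs)
    return avg_to_neighbors / avg_pairwise

# Made-up users of one role, all concentrated on the first topic.
role_users = [
    (0.80, 0.10, 0.10),
    (0.70, 0.20, 0.10),
    (0.75, 0.15, 0.10),
    (0.80, 0.15, 0.05),
]
typical_user = (0.78, 0.12, 0.10)
injected_user = (0.10, 0.10, 0.80)  # concentrated on a different topic

THRESHOLD = 2.0  # hypothetical; in practice the threshold is swept to trace a ROC curve
typical_score = knn_outlier_score(typical_user, role_users, k=3)
injected_score = knn_outlier_score(injected_user, role_users, k=3)
is_anomalous = injected_score > THRESHOLD
```

Normalizing by the neighbors' own spread makes the score relative to how tightly a role clusters, so the same threshold is less sensitive to roles with naturally diverse access patterns.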
Random Topic Access Detection (RTAD)
Results - I
The best AUC across all evaluated dimensions is plotted for each role performing poorly for .
Results - II
The best AUC across all evaluated dimensions is plotted for each role performing well or near average for .
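For reference, an AUC of the kind reported here can be computed directly from detector scores as the probability that a randomly chosen injected user receives a higher score than a randomly chosen real user, with ties counted as one half. The sketch below uses that pairwise definition; the function name `auc` and the example scores are made up for illustration.

```python
def auc(anomalous_scores, normal_scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    wins = 0.0
    for a in anomalous_scores:
        for n in normal_scores:
            if a > n:
                wins += 1.0   # anomalous user ranked above normal user
            elif a == n:
                wins += 0.5   # ties count as half
    return wins / (len(anomalous_scores) * len(normal_scores))

# Hypothetical outlier scores for injected and real users in one role.
injected = [10.2, 3.1, 0.9]
real = [0.6, 0.7, 1.2, 0.8]
role_auc = auc(injected, real)  # 11 of 12 pairs are ranked correctly
```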
• Other strategies besides ROAM may capture new types of threats.
• Good progress on technical measures of validation; need links to expert review and ground truth.
• More evaluation studies are needed.
• Important to integrate access audit with general business intelligence: understanding the roles and workflows of the organization.
Discussion and Conclusions