Comparative Analysis of Algorithmic Approaches for Auto-Coding with ICD-10-AM and ACHI Rajvir Kaur Master of Research

Page 1: Comparative Analysis of Algorithmic Approaches for Auto ...

Comparative Analysis of Algorithmic Approaches for Auto-Coding with

ICD-10-AM and ACHI

Rajvir Kaur

Master of Research

Page 2: Comparative Analysis of Algorithmic Approaches for Auto ...

Authors

Rajvir Kaur, Jeewani Anupama Ginige

Page 3: Comparative Analysis of Algorithmic Approaches for Auto ...

Introduction

• Electronic Health Records (EHRs): Digitised versions of paper-based medical records

• What is Clinical Coding?

• Assignment of alphanumeric codes

• Manually assigned by clinical coders

• Uses:

• Funding, insurance claim processing

• Research

• Government and policy makers use coded data.

Image Credit: https://medium.com/@Petuum/automated-icd-coding-using-deep-learning-1e9170652175

Page 4: Comparative Analysis of Algorithmic Approaches for Auto ...

Classification system in different countries

• Country-specific classification systems:

ICD-10-CM (Clinical Modification)

ICD-10-CA (Canadian Modification)

ICD-10-GM (German Modification)

ICD-10-AM (Australian Modification)

• ICD-10-AM is also used in Ireland, Singapore and Saudi Arabia

Image Credit: https://www.slideshare.net/EduardoPorras2

Page 5: Comparative Analysis of Algorithmic Approaches for Auto ...

Challenges in manual coding

• Complexity of codes

ICD-9: 3,882 codes

ICD-10: Approx. 70,000 codes

• 15-42 records per day

• Annual cost: 25 billion dollars (U.S.)

• Training and recruitment cost

• Highly prone to errors

Image Credit: http://bestptbilling.com/how-to-reduce-icd-10-transition-pain-for-physical-therapy-practice-owners/

“Boy, this new system is so confusing. Your ICD-9 code says that you’re here for a sprained ankle, but your ICD-10 code says it’s complete and irreversible skeletal failure.”

Page 6: Comparative Analysis of Algorithmic Approaches for Auto ...

Our Contribution

• We focus on:

• ICD-10-AM and ACHI classification system

• Comparing and analysing various approaches based on standard evaluation criteria

• Our research concentrates on only two ICD-10-AM and ACHI chapters

• Digestive System:

Chapter 11: Diseases of the digestive system (ICD-10-AM)

Chapter 10: Procedures on digestive system (ACHI)

• Respiratory System:

Chapter 10: Diseases of the respiratory system (ICD-10-AM)

Chapter 7: Procedures on respiratory system (ACHI)

Page 7: Comparative Analysis of Algorithmic Approaches for Auto ...

Ethics Approval

• Western Sydney University Ethics No.: H12628190

• Dataset:

• Total 190 clinical records (Gold Standard)

• Collected from hospitals across Australia

• Archived by the National Centre for Classification in Health (NCCH)

Page 8: Comparative Analysis of Algorithmic Approaches for Auto ...

Sample data

Page 9: Comparative Analysis of Algorithmic Approaches for Auto ...

Paper-based to electronic version

• PDF or image files converted to tabular format

• Created text narratives

• Information extracted from medical records include:

Principal Diagnoses (PDx)

Additional Diagnoses (ADx)

Smoking-related diagnosis

Diabetes condition

Supplementary conditions

Past Medical History (PMHx)

Family Medical History

Principal Procedure

Additional Procedure

Type of anaesthesia

Ventilation details

Allied health intervention

Page 10: Comparative Analysis of Algorithmic Approaches for Auto ...

Dataset

• 190 original records

• Additional 45 records with similar digestive and respiratory diseases and interventions (15 digestive system, 30 respiratory system)

• Total: 190 + 45 = 235 clinical records

Dataset   Digestive system records   Respiratory system records
Data190   116                        74
Data235   131                        104

Page 11: Comparative Analysis of Algorithmic Approaches for Auto ...

Overview of the Proposed work

Clinical Text Processing Using ICD-10-AM/ACHI

• TASK 1: ICD-10-AM/ACHI Chapter Classification

  • Digestive System

  • Respiratory System

• TASK 2: ICD-10-AM/ACHI Code Assignment

  • Pattern Matching

  • Rule-Based

  • Machine Learning

Page 12: Comparative Analysis of Algorithmic Approaches for Auto ...

Clinical Text Processing Approaches and Techniques

Pattern Matching

• Regular Expression

• Evaluation: 1. Precision 2. Recall 3. F-score 4. Accuracy 5. Hamming Loss 6. Jaccard Similarity

Rule-based

• Pre-processing: 1. Sentence splitting 2. Abbreviation Expansion 3. Tokenisation 4. Spell Check

• Defining Rules

• Evaluation: 1. Precision 2. Recall 3. F-score 4. Accuracy 5. Hamming Loss 6. Jaccard Similarity

Machine Learning

• Pre-processing: 1. Sentence splitting 2. Abbreviation Expansion 3. Tokenisation 4. Spell Check 5. Stop word removal 6. Negation detection

• Feature Extraction: 1-gram, 2-gram, 3-gram, 4-gram

• Classification: SVM, Naïve Bayes, Decision Tree, Random Forest, AdaBoost, kNN, MLP

• Evaluation: 1. Precision 2. Recall 3. F-score 4. Accuracy 5. Hamming Loss 6. Jaccard Similarity

Page 13: Comparative Analysis of Algorithmic Approaches for Auto ...

Pattern Matching

• Simplest approach

• Search a text-string within the text

• Match character for character

• Use Regular Expression

Example word variants: bronchi, bronchus, bronchial, bronchitis

Example sentence: A 51 year old patient has serious cough but no sign of pneumonia

Keywords: cough, pneumonia
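As an illustration of the idea, the following minimal sketch uses Python's standard `re` module. Both patterns and the helper name `find_matches` are assumptions made for this example; the slides do not show the expressions that were actually used.

```python
import re

# Hypothetical pattern: one expression covering the four "bronch-" variants.
BRONCH_PATTERN = re.compile(r"\bbronch(?:i|us|ial|itis)\b", re.IGNORECASE)

# Hypothetical keyword pattern for the example sentence.
KEYWORD_PATTERN = re.compile(r"\b(?:cough|pneumonia)\b", re.IGNORECASE)


def find_matches(text, pattern):
    """Return every substring of `text` matched by `pattern`, lower-cased."""
    return [m.group(0).lower() for m in pattern.finditer(text)]


sentence = "A 51 year old patient has serious cough but no sign of pneumonia"
print(find_matches("Bronchial wall thickening and chronic bronchitis", BRONCH_PATTERN))
print(find_matches(sentence, KEYWORD_PATTERN))
```

The second call returns both cough and pneumonia, although the sentence rules pneumonia out. Character-for-character matching alone cannot tell the difference.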

Page 14: Comparative Analysis of Algorithmic Approaches for Auto ...

Rule-based approach

• Use logical expression and Boolean operations

if (logical expression) then (category)

ICD-10 code: K05.2 Acute periodontitis

Includes:

Acute pericoronitis

Parodontal abscess

Periodontal abscess

Excludes:

acute apical periodontitis (K04.4)

periapical abscess (K04.7)

periapical abscess with sinus (K04.6)

Generating rule:

If document contains

(acute periodontitis OR acute pericoronitis OR parodontal abscess OR periodontal abscess)

AND document does NOT contain

(acute apical periodontitis OR periapical abscess OR periapical abscess with sinus)

then assign code K05.2
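The K05.2 rule can be sketched in a few lines of Python. This is only an illustration that assumes plain case-insensitive substring tests; the function name `assign_k05_2` and the two term lists are hypothetical, and the slides do not say how the rules were implemented.

```python
# Inclusion and exclusion terms taken from the K05.2 example.
INCLUDE_TERMS = [
    "acute periodontitis",
    "acute pericoronitis",
    "parodontal abscess",
    "periodontal abscess",
]
EXCLUDE_TERMS = [
    "acute apical periodontitis",
    "periapical abscess",
    "periapical abscess with sinus",
]


def assign_k05_2(document):
    """Return 'K05.2' if an inclusion term and no exclusion term occurs."""
    text = document.lower()
    has_include = any(term in text for term in INCLUDE_TERMS)
    has_exclude = any(term in text for term in EXCLUDE_TERMS)
    return "K05.2" if has_include and not has_exclude else None


print(assign_k05_2("Patient presents with acute pericoronitis."))
print(assign_k05_2("Periodontal abscess on a background of acute apical periodontitis."))
```

The first call yields the code; the second returns `None` because an exclusion term is present.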

Page 15: Comparative Analysis of Algorithmic Approaches for Auto ...

Machine Learning

• ML

Image Credit: https://www.newtium.com/Software/Predictive

Page 16: Comparative Analysis of Algorithmic Approaches for Auto ...

Data Preprocessing

1. Abbreviation Expansion

Sample record:

Admission Date: **** Discharge Date: ****

Presenting Problems

Respiratory - cough

PRINCIPAL DIAGNOSIS

Infective exacerbation of bronchiectasis

Acute-on-chronic Type 2 respiratory failure

Summary of Progress

Dear Doctor,

Thank you for your ongoing care of ****, who presented to **** hospital on **** with SOB, cough and chest pain, on a background of bronchiectasis. The patient was admitted under the care of Dr **** (Respiratory) for management of infective exacerbation of bronchiectasis.

Background

Bronchiectasis

- Known to Dr **** (Respiratory)

- Bronchiectasis diagnosed 20 years ago, secondary to childhood pertussis

Left ventricular failure

- Known to Dr **** (Cardiology)

Cough, SOB, Pleuritic chest pain

Abbreviations Full-form

COPD Chronic obstructive pulmonary disease

SBO Small bowel obstruction

IHD Ischaemic heart disease

SOB Shortness of breath

HTN Hypertension

T2DM Type 2 diabetes mellitus
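A minimal sketch of abbreviation expansion using the six table entries above follows. The lookup-and-replace strategy, the case-sensitive whole-word matching and the function name `expand_abbreviations` are assumptions for illustration, not the method reported in the slides.

```python
import re

# Abbreviation table from the slide.
ABBREVIATIONS = {
    "COPD": "chronic obstructive pulmonary disease",
    "SBO": "small bowel obstruction",
    "IHD": "ischaemic heart disease",
    "SOB": "shortness of breath",
    "HTN": "hypertension",
    "T2DM": "type 2 diabetes mellitus",
}

# Match abbreviations as whole, upper-case words only.
_ABBR_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(a) for a in ABBREVIATIONS) + r")\b"
)


def expand_abbreviations(text):
    """Replace each known abbreviation in `text` with its full form."""
    return _ABBR_PATTERN.sub(lambda m: ABBREVIATIONS[m.group(1)], text)


print(expand_abbreviations("presented with SOB, cough and chest pain"))
```

Matching is kept case-sensitive so that ordinary lower-case words are not mistaken for abbreviations.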

Page 17: Comparative Analysis of Algorithmic Approaches for Auto ...

Data Preprocessing

2. Spell Check

Used: NLTK and PyEnchant Python libraries

Australian English American English

oesophagus esophagus

tumour tumor

anaemia anemia

anaesthetic anesthetic

ischaemic ischemic

diarrhoea diarrhea
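The spelling pairs above can be handled with a simple lookup. The sketch below is a plain-Python illustration and does not use the NLTK or PyEnchant APIs; the direction of normalisation (American to Australian) and the name `to_australian_spelling` are assumptions, since the slides do not state how variant spellings were treated.

```python
# Spelling pairs from the slide, keyed by the American form.
AMERICAN_TO_AUSTRALIAN = {
    "esophagus": "oesophagus",
    "tumor": "tumour",
    "anemia": "anaemia",
    "anesthetic": "anaesthetic",
    "ischemic": "ischaemic",
    "diarrhea": "diarrhoea",
}


def to_australian_spelling(tokens):
    """Map known American spellings to Australian ones; keep other tokens."""
    return [AMERICAN_TO_AUSTRALIAN.get(t.lower(), t) for t in tokens]


print(to_australian_spelling(["esophagus", "tumor", "cough"]))
```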

Page 18: Comparative Analysis of Algorithmic Approaches for Auto ...

Data Preprocessing

3. Stop word removal

‘again’, ‘about’, ‘there’, ‘once’, ‘during’, ‘out’, ‘they’, ‘own’, ‘an’, ‘some’, ‘its’, ‘yours’, ‘such’, ‘into’, ‘most’, ‘itself’, ‘other’, ‘off’, ‘am’, ‘who’, ‘as’, ‘him’, ‘each’, ‘themselves’, ‘until’, ‘we’, ‘these’, ‘your’, ‘his’, ‘through’, ‘me’, ‘her’, ‘more’, ‘himself’, ‘this’, ‘down’, ‘should’, ‘our’, ‘their’, ‘while’, ‘above’, ‘both’, ‘up’, ‘ours’, ‘she’, ‘all’, ‘when’, ‘at’, ‘any’, ‘before’, ‘them’, ‘same’, ‘yourselves’, ‘because’, ‘what’, ‘over’, ‘why’, ‘now’, ‘he’, ‘you’, ‘herself’, ‘just’, ‘ourselves’, ‘hers’, ‘yourself’, ‘how’, ‘theirs’, ‘further’, ‘doing’, ‘where’, ‘too’, ‘whom’, ‘those’

Not removed (negation words): no, not, nil, never
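A minimal sketch of this step follows, assuming that the words no, not, nil and never are kept because the later negation step needs them. The short `GENERAL_STOP_WORDS` set is a hypothetical stand-in for a full stop word list, and the function name is illustrative.

```python
# Hypothetical stand-in for a general-purpose stop word list; such lists
# often contain "no" and "not" as well.
GENERAL_STOP_WORDS = {
    "there", "during", "this", "at", "any", "all", "both", "while", "no", "not",
}

# Negation words that must survive stop word removal.
NEGATION_WORDS = {"no", "not", "nil", "never"}

STOP_WORDS = GENERAL_STOP_WORDS - NEGATION_WORDS


def remove_stop_words(tokens):
    """Drop stop words but keep negation words."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]


tokens = "there was no sign of pneumonia during this admission".split()
print(remove_stop_words(tokens))
```

Subtracting the negation words from the list, rather than filtering them back in afterwards, keeps the token order intact for the negation step.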

Page 19: Comparative Analysis of Algorithmic Approaches for Auto ...

Data Preprocessing

4. Negation Detection

Example: The patient is suffering from serious cough but no evidence of pneumonia.

Keywords: cough, pneumonia

Negated term: pneumonia

Negated findings: (pneumonia, ‘True’) – do not assign code

Non-negated findings: (cough, ‘True’) – assign code
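The example can be reproduced with a small window-based detector. This is a simplified sketch, not the algorithm used in the study (which the slides do not name): the trigger list, the scope-breaking words, the five-token window and the function name `detect_negation` are all assumptions.

```python
import re

NEGATION_TRIGGERS = {"no", "not", "nil", "never"}
SCOPE_BREAKERS = {"but", "however", "although"}  # assumed scope terminators


def detect_negation(sentence, keywords, window=5):
    """Map each keyword found in `sentence` to True if it is negated.

    A keyword counts as negated when it occurs within `window` tokens
    after a negation trigger and no scope breaker intervenes.
    """
    tokens = re.findall(r"[a-z0-9']+", sentence.lower())
    findings = {}
    remaining = 0  # tokens left inside the scope of the last trigger
    for token in tokens:
        if token in NEGATION_TRIGGERS:
            remaining = window
            continue
        if token in SCOPE_BREAKERS:
            remaining = 0
            continue
        if token in keywords:
            findings[token] = remaining > 0
        if remaining > 0:
            remaining -= 1
    return findings


sentence = "The patient is suffering from serious cough but no evidence of pneumonia."
print(detect_negation(sentence, {"cough", "pneumonia"}))
```

For the example sentence, cough is reported as not negated and pneumonia as negated, so only cough would go forward to code assignment.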

Page 20: Comparative Analysis of Algorithmic Approaches for Auto ...

Feature Extraction

Bag of words representation

X: The infant was admitted to the hospital for bronchiolitis with worse cough and wheeze

Y: The old male presented for vomiting and diarrhoea

Word counts over X and Y:

admitted 1, and 2, bronchiolitis 1, cough 1, diarrhoea 1, for 2, hospital 1, infant 1, male 1, old 1, presented 1, to 1, the 3, vomiting 1, was 1, wheeze 1, with 1, worse 1
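The counts for the two example sentences can be reproduced with the standard library. The sketch below assumes lower-casing and a simple letters-only tokeniser; the helper names are illustrative, and `ngrams` shows how the same tokens could yield the 2-gram to 4-gram features mentioned earlier in the deck.

```python
import re
from collections import Counter


def tokenise(text):
    """Lower-case `text` and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(documents):
    """Count how often each token occurs across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenise(doc))
    return counts


def ngrams(tokens, n):
    """Return the contiguous n-token sequences of `tokens` as strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


X = "The infant was admitted to the hospital for bronchiolitis with worse cough and wheeze"
Y = "The old male presented for vomiting and diarrhoea"

counts = bag_of_words([X, Y])
print(counts["the"], counts["and"], counts["for"])
print(ngrams(tokenise(Y), 2))
```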

Page 21: Comparative Analysis of Algorithmic Approaches for Auto ...

Classification

Seven classifiers:

Support Vector Machine (SVM)

Naïve Bayes (NB)

Decision Tree (DT)

Random Forest (RF)

AdaBoost

k-Nearest Neighbor (kNN)

Multi Layer Perceptron (MLP)

Page 22: Comparative Analysis of Algorithmic Approaches for Auto ...

Evaluation

Yi: Ground truth labels of record i

Zi: Predicted labels of record i

N: Number of records

M: Set of all labels

Confusion matrix:

                        Predicted Positive     Predicted Negative
Ground Truth Positive   True Positive (TP)     False Negative (FN)
Ground Truth Negative   False Positive (FP)    True Negative (TN)
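The formulas themselves do not survive in this transcript. As a hedged illustration, the sketch below computes common example-based multi-label versions of precision, recall, F-score, Hamming loss and Jaccard similarity from the ground truth label sets, the predicted label sets and the set of all labels. These definitions are an assumption and may differ from the ones used in the study; accuracy is left out because its definition here is unknown. The codes in the example are placeholder labels only.

```python
def multilabel_scores(y_true, y_pred, all_labels):
    """Average per-record scores over N records.

    y_true, y_pred: lists of label sets (ground truth, predicted).
    all_labels: the set of all labels.
    """
    n = len(y_true)
    m = len(all_labels)
    totals = dict.fromkeys(
        ["precision", "recall", "f_score", "hamming_loss", "jaccard"], 0.0
    )
    for truth, pred in zip(y_true, y_pred):
        inter = len(truth & pred)
        union = len(truth | pred)
        totals["precision"] += inter / len(pred) if pred else 0.0
        totals["recall"] += inter / len(truth) if truth else 0.0
        totals["f_score"] += 2 * inter / (len(truth) + len(pred)) if union else 0.0
        # Symmetric difference: labels that are wrong in either direction.
        totals["hamming_loss"] += len(truth ^ pred) / m
        totals["jaccard"] += inter / union if union else 1.0
    return {name: total / n for name, total in totals.items()}


labels = {"K05.2", "K04.4", "K04.6", "K04.7"}
y_true = [{"K05.2", "K04.7"}, {"K04.4"}]
y_pred = [{"K05.2"}, {"K04.4", "K04.6"}]
scores = multilabel_scores(y_true, y_pred, labels)
print(scores)
```

In this toy example each record has one wrong label out of four, so the Hamming loss is 0.25 while the Jaccard similarity is 0.5.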

Page 23: Comparative Analysis of Algorithmic Approaches for Auto ...

Results: TASK 1 ICD-10-AM/ACHI Chapter Classification

TASK 1: ICD-10-AM/ACHI Chapter Classification

• Digestive System: Gastrointestinal class

• Respiratory System: Respiratory class

Metrics per classifier (two rows per classifier, one for each dataset):

Dataset   Precision   Recall   F-score   Accuracy   Hamming Loss   Jaccard Similarity

Data190   0.95        0.95     0.95      0.9474     0.05263        0.94736
Data235   0.87        0.87     0.87      0.8723     0.12765        0.87234

Data190   0.93        0.92     0.92      0.9211     0.07894        0.92105
Data235   0.98        0.98     0.98      0.9787     0.02127        0.97872

Data190   0.89        0.87     0.86      0.8684     0.13157        0.86842
Data235   0.88        0.87     0.87      0.8723     0.12765        0.87234

Data190   0.76        0.55     0.42      0.5526     0.44736        0.55263
Data235   0.84        0.81     0.8       0.8085     0.19148        0.80851

Data190   0.84        0.84     0.84      0.8421     0.15789        0.84211
Data235   0.9         0.89     0.89      0.8936     0.10638        0.89361

Data190   0.85        0.84     0.84      0.8421     0.15789        0.84211
Data235   0.89        0.89     0.89      0.8936     0.10638        0.89361

Data190   0.88        0.87     0.87      0.8684     0.13157        0.86842
Data235   0.9         0.89     0.89      0.8936     0.10638        0.89361

Classifier row labels, in the order they appear in the transcript: Multi Layer Perceptron, Support Vector Machine, Naïve Bayes, Decision Tree, Random Forest, k-Nearest Neighbor, AdaBoost

Page 24: Comparative Analysis of Algorithmic Approaches for Auto ...

Task 2: ICD-10-AM/ACHI Code Assignment

• Pattern Matching and Rule-Based: training and testing not required

• Machine Learning: training and testing required

Test Data (20%): number of medical records

Dataset   Digestive system   Respiratory system   Total
Data190   22                 16                   38
Data235   26                 21                   47
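The 20% test sizes follow directly from the dataset sizes: 0.2 × 190 = 38 and 0.2 × 235 = 47. The sketch below shows one way such a hold-out split could be drawn. The slides do not say how records were selected, so the random shuffle, the fixed seed and the function name `holdout_split` are assumptions.

```python
import random


def holdout_split(records, test_fraction=0.2, seed=0):
    """Shuffle `records` and return (train, test) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


train190, test190 = holdout_split(range(190))
train235, test235 = holdout_split(range(235))
print(len(train190), len(test190))
print(len(train235), len(test235))
```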

Page 25: Comparative Analysis of Algorithmic Approaches for Auto ...

Results: TASK 2 ICD-10-AM/ACHI Code Assignment

[Figure: two bar charts, one for Data190 and one for Data235, showing Precision, Recall, F-score, Accuracy, Hamming Loss (HL) and Jaccard Similarity (JS) for Pattern Matching and Rule-Based; vertical axis from 0 to 1]

Approach           Dataset   Precision   Recall   F-score   Accuracy   Hamming Loss   Jaccard Similarity
Pattern Matching   Data190   0.7953      0.4184   0.5277    0.4027     0.043          0.4365
Pattern Matching   Data235   0.8029      0.409    0.5201    0.3945     0.0405         0.4255
Rule-based         Data190   0.7913      0.6916   0.7257    0.6053     0.1728         0.5803
Rule-based         Data235   0.792       0.6872   0.7222    0.6011     0.1745         0.5768

Page 26: Comparative Analysis of Algorithmic Approaches for Auto ...

TASK 2 Results: Machine Learning

Metrics per classifier (two rows per classifier, one for each dataset):

Dataset   Precision   Recall    F-score   Accuracy   Hamming Loss   Jaccard Similarity

Data190   0.76798     0.45175   0.54361   0.44051    0.03706        0.44776
Data235   0.89308     0.55191   0.65373   0.54143    0.01955        0.52697

Data190   0.62534     0.63168   0.57465   0.44051    0.67841        0.42014
Data235   0.72891     0.61722   0.61821   0.49643    0.35158        0.48805

Data190   0.58333     0.25586   0.33523   0.25389    0.01392        0.27135
Data235   0.66666     0.30773   0.39793   0.29717    0.02453        0.32365

Data190   0.81421     0.81329   0.79115   0.66831    0.23514        0.65517
Data235   0.92392     0.92019   0.91412   0.86118    0.09458        0.82945

Data190   0.92062     0.85015   0.87305   0.79201    0.08776        0.74537
Data235   0.91407     0.91295   0.90351   0.84462    0.11271        0.79245

Data190   0.62938     0.29488   0.37559   0.29073    0.02192        0.29
Data235   0.63475     0.34756   0.38689   0.34537    0.00942        0.33055

Data190   0.68001     0.46388   0.51485   0.38567    0.34411        0.36667
Data235   0.57679     0.46974   0.40582   0.40993    0.24057        0.3913

Classifier row labels, in the order they appear in the transcript: kNN, SVM, Naïve Bayes, Random Forest, AdaBoost, Decision Tree, MLP

Data190 results using 4-gram and Data235 results using 2-gram feature set

Page 27: Comparative Analysis of Algorithmic Approaches for Auto ...

Comparison of approaches

[Figure: two bar charts, one for Data190 and one for Data235, comparing Pattern Matching, Rule-based and Machine Learning; vertical axis from 0 to 1]

Page 28: Comparative Analysis of Algorithmic Approaches for Auto ...

Conclusion and Future Work

• Conclusion:

• With the adoption of EHRs and more advanced classification systems, there is a need to automate the clinical coding workflow

• Computer Assisted Coding has the capability to overcome the challenges of manual coding

• The Machine Learning approach is capable of predicting correct ICD-10-AM and ACHI codes

• Future Work:

• To work on large-scale data

• To work on other chapters of ICD-10-AM and ACHI classification system

• To apply Deep Learning and Hybrid approaches for Computer Assisted Coding

Page 29: Comparative Analysis of Algorithmic Approaches for Auto ...

Thank you