PRESENTATION: Identification of the Independency of Variables and Prediction of Outcome of...

Identification of the Independency of Variables and Prediction of Outcome of Tuberculosis (TBC) Treatment using Dynamic Bayesian Networks: Initial Work of a Case Study at a Public Hospital, Jakarta, Indonesia. Ito Wasito. Bogor, 29 October 2015. Trilateral Scientific Meeting on Big Data, Data Bases, and Dynamic Analysis


Presented by Ito Wasito, Universitas Indonesia, on 29 October 2015 at the Trilateral Scientific Meeting on Big Data, Data Bases, and Dynamic Analysis in Bogor, Indonesia.


Page 1: PRESENTATION: Identification of the Independency of Variables and Prediction of Outcome of Tuberculosis (TBC) Treatment using Dynamic Bayesian Networks: Initial work of a case study

Identification of the Independency of Variables and Prediction of Outcome of Tuberculosis (TBC) Treatment using Dynamic Bayesian Networks:

Initial Work of a Case Study at a Public Hospital, Jakarta, Indonesia

Ito Wasito

Bogor, 29 October 2015

Trilateral Scientific Meeting on Big Data, Data Bases, and Dynamic Analysis

Page 2:

Outline

1. Background

2. Goals

3. Related Works

4. Approach

5. Scope of Problems

6. Review of Bayesian Networks and Dynamic Bayesian Networks

7. Implementation

8. Experimental Results

9. Conclusions

10. Future Works

Page 3:

Faculty of Computer Science, University of Indonesia

Labs:

1. IT Governance

2. Computer Networks, Architecture, High Performance Computing

3. Digital Library and Distance Learning

4. E-Government

5. Enterprise Computing

6. Formal Methods in Software Engineering

7. Information Retrieval

8. Pattern Recognition

26/11/2015

Page 4:

Faculty of Computer Science, University of Indonesia

In preparation to develop a Big Data Centre (in collaboration with Australian universities).

Page 5:

Background

The countries with the largest TBC epidemics:

1. India

2. China

3. South Africa

4. Indonesia (WHO, 2012)

Diagram labels: TBC patient; supervision (6-9 months); model; outcome of TBC treatment

Page 6:

Background

Bayesian Networks

• DAG (Directed Acyclic Graph)

• Variable relationships: probabilities and graph

• Causal

• Easy to interpret

• Static model

Dynamic Bayesian Networks

• Time series

• Temporally dependent

Page 7:

Approach

Graph structure on Dynamic Bayesian Networks

• Identification of independency of variables on TBC patients

• Independency and dependency of variables with the treatment outcome

• Prediction

Page 8:

Goals

1. To identify relationships between variables in TBC patients' data.

2. To identify the independency of variables in TBC disease.

3. To identify the independency and dependency of variables in TBC disease with respect to the treatment outcome.

Page 9:

Scope of Problems

- 12 variables of TBC patient data.

- Patient treatment data from Persahabatan Public Hospital, Jakarta, Indonesia.

- Software tools: CaMML 1.4.1 (Monash University, Australia) and the Netica Java API package.

- Performance evaluation: prediction accuracy and logarithmic loss.
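
The two evaluation measures named above can be sketched as follows. This is a minimal illustration, not the Netica API: the function names, the dictionary layout of the predicted distributions, the use of natural logarithms, and the example values are all assumptions made for this sketch.

```python
import math

def prediction_accuracy(true_labels, predicted_probs):
    """Fraction of cases whose most probable predicted class equals the true class."""
    correct = 0
    for label, probs in zip(true_labels, predicted_probs):
        predicted = max(probs, key=probs.get)  # class with the highest probability
        if predicted == label:
            correct += 1
    return correct / len(true_labels)

def logarithmic_loss(true_labels, predicted_probs, eps=1e-15):
    """Mean negative natural log of the probability assigned to the true class."""
    total = 0.0
    for label, probs in zip(true_labels, predicted_probs):
        # Clip to avoid log(0) when the true class received zero probability.
        p = min(max(probs.get(label, 0.0), eps), 1.0)
        total -= math.log(p)
    return total / len(true_labels)

# Hypothetical predictions for three patients (illustrative values only).
labels = ["success", "unsuccess", "success"]
probs = [
    {"success": 0.8, "unsuccess": 0.2},
    {"success": 0.6, "unsuccess": 0.4},
    {"success": 0.7, "unsuccess": 0.3},
]
accuracy = prediction_accuracy(labels, probs)
log_loss = logarithmic_loss(labels, probs)
```

Accuracy only checks whether the top-ranked class is right, while logarithmic loss also penalises confident wrong predictions, which is presumably why both are reported.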

Page 10:

Related Works

• Effective in predicting symptoms of disease

• Accuracy 100%

DBN to predict osteoarthritic knee pain (Watt et al., 2008)

A Bayesian network based model for prediction of IVF success rate (Kakhki, 2013)

DBN to predict sequences of organ failures in patients admitted to ICU (Sandri et al., 2014)

Page 11:

Evaluation of a Dynamic Bayesian Network to predict osteoarthritic knee pain

(Watt et al., 2008)

Page 12:

ICU Treatment Analysis

To predict the order of organ failure in patients under 7 days of treatment.

Score changes on the Sequential Organ Failure Assessment (SOFA) of patients in the ICU.

Uses GeNIe software.

(Sandri et al., 2014)

Page 13:

Revisit: Bayesian Networks

Graph representation of a probability distribution

DAG (Directed Acyclic Graph)

(Larranaga et al., 2013)
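
The link between the DAG and the probability distribution is usually written as the chain-rule factorization below. The notation (X_i for the variables, Pa_G(X_i) for the parents of X_i in graph G) is assumed here and is not taken from the slide.

```latex
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}_G(X_i)\bigr)
```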

Page 14:

D-separation

d-separation: determines whether a node is independent of other nodes.

d-separation is denoted as:

d-sep_G(X; Y | Z) if there is no active path between any node X ∈ X and any node Y ∈ Y in graph G, given Z

Relationships between nodes:

• Direct

• Indirect

(Direct relationship)

(Koller and Friedman, 2009)

Page 15:

Node Relationships: Indirect

(Koller and Friedman, 2009)
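
A d-separation test can be sketched in a few lines. The version below uses the moralized ancestral graph, a standard equivalent of the active-path definition; it is an illustrative sketch, not the procedure used in the study. The function name, the `parents` dictionary layout, and the three-node example graphs are assumptions.

```python
from collections import deque

def d_separated(parents, xs, ys, zs):
    """Return True if every node in xs is d-separated from every node in ys given zs.

    parents maps each node of a DAG to the set of its parents.
    """
    xs, ys, zs = set(xs), set(ys), set(zs)
    # 1. Keep only xs, ys, zs and their ancestors.
    keep, stack = set(), list(xs | ys | zs)
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(parents.get(node, ()))
    # 2. Moralize: link each node to its parents and link co-parents together.
    adj = {n: set() for n in keep}
    for child in keep:
        ps = list(parents.get(child, ()))
        for p in ps:
            adj[child].add(p)
            adj[p].add(child)
        for i, p in enumerate(ps):
            for q in ps[i + 1:]:
                adj[p].add(q)
                adj[q].add(p)
    # 3. Drop the conditioning nodes and look for any undirected path from xs to ys.
    queue, seen = deque(xs - zs), set(xs - zs)
    while queue:
        node = queue.popleft()
        if node in ys:
            return False
        for nb in adj[node] - zs - seen:
            seen.add(nb)
            queue.append(nb)
    return True

# Indirect relationship through a chain: A -> B -> C.
chain = {"A": set(), "B": {"A"}, "C": {"B"}}
# Indirect relationship through a common effect (collider): A -> C <- B.
collider = {"A": set(), "B": set(), "C": {"A", "B"}}
```

In the chain, A and C become independent once B is observed; in the collider, A and B are independent until C is observed.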

Page 16:

Bayesian Networks in Health Applications

(Lucas, 2004)

Diagnostic Reasoning

• To develop a diagnosis model for one specific disease.

• Prediction model for the outcome of treatment.

• Exploitation of knowledge from the treatment process.

Treatment Selection

• Decision support system for choosing the optimal treatment of a disease.

Discovering functional interactions

• Molecular mechanisms, e.g. gene interaction

Page 17:

Dynamic Bayesian Networks

A two-tuple: prior model and transition model

DBN formula

(Van Gerven, 2008)
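
The slide's formula is not preserved in this transcript. A common first-order formulation of a DBN, given here as an assumption rather than as the slide's exact expression, combines the prior model P(X_0) with the transition model P(X_t | X_{t-1}), where the parents of a node may lie in the same slice or the previous one:

```latex
P(\mathbf{X}_{0:T}) = P(\mathbf{X}_0) \prod_{t=1}^{T} P(\mathbf{X}_t \mid \mathbf{X}_{t-1}),
\qquad
P(\mathbf{X}_t \mid \mathbf{X}_{t-1}) = \prod_{i=1}^{n} P\bigl(X_t^{i} \mid \mathrm{Pa}(X_t^{i})\bigr)
```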

Page 18:

Dynamic Bayesian Networks Components

(Perez-Ariza, 2012)

1. Set of defined nodes

2. Intra-slice links

3. Temporal (inter-slice) links

4. Conditional probability tables for the first time slice

5. Conditional probability tables for the second time slice (with different parents).
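
The structural components listed above (nodes, intra-slice links, inter-slice links) can be sketched as a template that is unrolled over time. This is an illustration only: the function name, the variable names, and the links are hypothetical, and the conditional probability tables are left out.

```python
def unroll(nodes, intra_links, inter_links, num_slices):
    """Unroll a two-slice DBN template into the arcs of a static network.

    Nodes of slice t are named (name, t). Intra-slice links are repeated in
    every slice; inter-slice links connect slice t-1 to slice t.
    """
    known = set(nodes)
    for a, b in list(intra_links) + list(inter_links):
        if a not in known or b not in known:
            raise ValueError(f"unknown node in link {(a, b)}")
    arcs = []
    for t in range(num_slices):
        arcs += [((a, t), (b, t)) for a, b in intra_links]
        if t > 0:
            arcs += [((a, t - 1), (b, t)) for a, b in inter_links]
    return arcs

# Hypothetical template: the variable names and links are illustrative only.
nodes = ["Weight", "Sputum"]
intra = [("Weight", "Sputum")]
inter = [("Weight", "Weight"), ("Sputum", "Sputum")]
arcs = unroll(nodes, intra, inter, num_slices=3)
```

With three slices this template yields three intra-slice arcs and four inter-slice arcs.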

Page 19:

CaMML (Causal Discovery via MML)

(Korb and Nicholson, 2010)

Structure Learning

• Metropolis algorithm, a Markov Chain Monte Carlo (MCMC) search

Score based

• Minimum Message Length (MML)
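
How a Metropolis search can use a message-length score may be sketched as an acceptance rule. This is a generic textbook rule, not CaMML's actual implementation: the function name, the assumption that lengths are measured in nats, and the target distribution proportional to exp(-length) are all assumptions made for illustration.

```python
import math
import random

def metropolis_accept(current_length, proposed_length, rng=random):
    """Metropolis rule for a target distribution proportional to exp(-message length).

    Lengths are assumed to be in nats. A shorter (or equal) message is always
    accepted; a longer one is accepted with probability exp(current - proposed).
    """
    if proposed_length <= current_length:
        return True
    return rng.random() < math.exp(current_length - proposed_length)
```

Occasionally accepting a longer message is what lets the search leave local optima instead of behaving like plain hill climbing.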

Page 20:

Minimum Message Length (MML)

(Korb and Nicholson, 2010)

where:

h: data

N: number of variables in h

p_i: prior probability i

i: index in h

j: index not in h

Page 21:

Comparison of DBN Structure Learning Software

Name | Structure Learning | Parameter Learning | DBN Algorithms | GUI | URL
CaMML | Yes | Yes | CaMML | Yes | bayesian-intelligence.com
Bayes Net Toolbox* | Partial [1] / 2-Step [1] / Yes | Yes | BIC [1,2], BDe [2] | No | code.google.com/p/bnt/
BNFinder | Partial [1] | ? | BIC, BDe, MIT | No | bioputer.mimuw.edu.pl/software/bnf/
GlobalMIT* | Partial [1] | No | MIT | No | code.google.com/p/globalmit/
Tetrad | No/Manual [3] | Yes | (PC, others) | Yes | www.phil.cmu.edu/tetrad/
GeNIe | No/Manual [3,4] | Yes | (PC, K2, other) | Yes | genie.sis.pitt.edu
Banjo | Yes | No | BDe | No | cs.duke.edu/~amink/software/banjo/

* Requires Matlab. GlobalMIT: it may be possible to use Octave (free) instead of Matlab.

[1] Supports DBN learning with inter-slice arcs only (i.e. no arcs within time slices).

[2] With the DBmcmc extension (bioss.ac.uk/~dirk/software/DBmcmc/), but binary/ternary data/attributes only.

[3] No official support for learning DBNs. Can adapt BN algorithms using tier priors etc.

[4] DBN parameter learning, but no structure learning. Supports DBN inference, unrolling etc.

(Black, 2013)

Page 22:

CaMML (Causal Discovery via MML)

An example of data transformation from static Bayesian networks to dynamic Bayesian networks is shown in Figure 2 (Black, 2013).

Figure 2. Data Transformation
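
Figure 2 is not reproduced in this transcript. One common way to prepare longitudinal records for two-slice DBN learning is to pair each time step with the next one, as sketched below. This layout is an assumption and may differ from the transformation in Black (2013); the function name, the `_t0`/`_t1` suffixes, and the patient record are hypothetical.

```python
def to_two_slice_rows(record, temporal_vars, static_vars):
    """Turn one patient record with repeated measurements into two-slice rows.

    Each output row pairs the values at time t (suffix _t0) with the values at
    time t+1 (suffix _t1); static variables are copied into every row.
    """
    length = len(record[temporal_vars[0]])
    rows = []
    for t in range(length - 1):
        row = {name: record[name] for name in static_vars}
        for name in temporal_vars:
            row[f"{name}_t0"] = record[name][t]
            row[f"{name}_t1"] = record[name][t + 1]
        rows.append(row)
    return rows

# Hypothetical patient: three sputum results and three weight categories.
patient = {
    "sex": "F",
    "sputum": ["positive", "negative", "negative"],
    "weight": ["low", "low", "normal"],
}
rows = to_two_slice_rows(patient, ["sputum", "weight"], ["sex"])
```

Three measurements per variable give two rows per patient, one for each consecutive pair of time points.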

Page 23:

Data Preprocessing

Data categories defined by expert

Raw data (TBC electronic data)

Pre-processed data

Page 24:

12 Variables

1. Age

2. Sex

3. Sputum test (D1, D2, D3)

4. Treatment category

5. Weight (B1, B2, B3)

6. Tuberculosis type (pulmonary or extrapulmonary)

7. Regularity of taking anti-TBC medicine (yes/no)

8. Outcome of the treatment

Counting the three sputum tests and the three weight measurements separately gives 12 variables.

Page 25:

Implementation

Input: data and MML parameters

CaMML (Causal Discovery via MML)

The resulting DBN is validated by expert opinion

Page 26:

Structure Evaluation of DBN Graph

• Uses the Netica-J API package

Prediction accuracy (%)

Page 27:

Experiments with different outcomes

6 categories of treatment outcomes:

• recovered

• completed

• failed

• died

• incomplete treatment

• changed treatment location

2 categories of treatment outcomes:

• success (recovered, completed)

• unsuccess (failed, died, incomplete treatment, changed treatment location)

(WHO, 2013)
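
The grouping of six outcome categories into two can be expressed as a simple lookup. This is a minimal sketch: the exact label strings in the hospital data are not given on the slide, so the spellings below are assumptions.

```python
# Assumed label spellings; the real data may encode the categories differently.
OUTCOME_GROUPS = {
    "recovered": "success",
    "completed": "success",
    "failed": "unsuccess",
    "died": "unsuccess",
    "incomplete treatment": "unsuccess",
    "changed treatment location": "unsuccess",
}

def to_binary_outcome(outcome):
    """Map one of the six outcome categories to success/unsuccess."""
    key = outcome.strip().lower()
    if key not in OUTCOME_GROUPS:
        raise ValueError(f"unknown outcome category: {outcome!r}")
    return OUTCOME_GROUPS[key]
```

Raising an error on unknown labels, rather than defaulting to one class, keeps data-entry variants from being silently counted as failures.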

Page 28:

Experiment Results 1

DBN graph of experimental results 1

Page 29:

Experiment Results 2

DBN graph of experimental results 2

Page 30:

Experiment Results 3

BN graph of experimental results 3

Page 31:

Experiment Results 4

BN graph of experimental results 4

Page 32:

Experiment Results 5

BN graph of experimental results 5 (structure model by experts)

Page 33:

Experiment Results 6

BN graph of experimental results 6 (structure model by experts)

Page 34:

Accuracy Comparisons

Page 35:

Conclusions

The graph structure of the Dynamic Bayesian Network, used as a prediction model of tuberculosis treatment outcome, could identify independencies among variables in tuberculosis patients' treatment data.

The independencies of variables can be identified by the d-separation algorithm.

Page 36:

Future Works

By adding more variables, the prediction model for TBC treatment outcome may perform better.

The approach can be applied to other epidemic data, such as dengue fever and influenza-type diseases (e.g. bird flu).

Page 37:

Q & A

Thank you