PRESENTATION: Identification of the Independency of Variables and Prediction of Outcome of...

Post on 18-Feb-2016


Presented by Ito Wasito, Universitas Indonesia, on 29 October 2015 at the Trilateral Scientific Meeting on Big Data, Data Bases, and Dynamic Analysis in Bogor, Indonesia.


Identification of the Independency of Variables and Prediction of Outcome of Tuberculosis (TBC) Treatment using Dynamic Bayesian Networks:

Initial Work on a Case Study at a Public Hospital, Jakarta, Indonesia

Ito Wasito

Bogor, 29 October 2015

Trilateral Scientific Meeting on Big Data, Data Bases, and Dynamic Analysis

Outline


1. Background

2. Goals

3. Related Works

4. Approach

5. Scope of Problems

6. Review of Bayesian Networks and Dynamic Bayesian Networks

7. Implementation

8. Experimental Results

9. Conclusions

10. Future Works

Faculty of Computer Science, University of Indonesia

Labs:

1. IT Governance

2. Computer Networks, Architecture, High Performance Computing

3. Digital Library and Distance Learning

4. E-Government

5. Enterprise Computing

6. Formal Methods in Software Engineering

7. Information Retrieval

8. Pattern Recognition

26/11/2015

Faculty of Computer Science, University of Indonesia

In preparation: development of a Big Data Centre (in collaboration with Australian universities).

Background

Countries with the highest TBC burden:

1. India

2. China

3. South Africa

4. Indonesia

(WHO, 2012)

[Diagram: 6-9 months; model of TBC patient supervision; outcome of TBC treatment]

Background

Bayesian Networks

• DAG (Directed Acyclic Graph)

• Variable relationships: probabilities and graph

• Causal

• Easy to interpret

• Static model

Dynamic Bayesian Networks

• Time series

• Temporally dependent

Approach

[Diagram: Graph Structure on Dynamic Bayesian Networks; Identification of Independency of Variables on TBC Patient; Independency and Dependency of Variables with Outcome; Treatment Prediction]

Goals

1. To identify relationships between variables in TBC patients' data

2. To identify the independency of variables in TBC disease

3. To identify the independency and dependency of variables in TBC disease with respect to the treatment outcome.

Scope of Problems

- 12 variables of TBC patient data.

- Patient treatment data from Persahabatan Public Hospital, Jakarta, Indonesia.

- Software tools: CaMML 1.4.1 (Monash University, Australia) and the Netica Java API package.

- Performance evaluation: prediction accuracy and logarithmic loss.

Related Works

• Effective in predicting disease symptoms

• Accuracy 100%

DBN to predict Osteoarthritic Knee Pain (Watt et al., 2008)

A Bayesian-network-based model for prediction of IVF success rate (Kakhki, 2013)

DBN to predict sequences of organ failures in patients admitted to ICU (Sandri et al., 2014)

Evaluation of a Dynamic Bayesian Network to predict Osteoarthritic Knee Pain

(Watt et al., 2008)

ICU Treatment Analysis

To predict the order of organ failure in patients during 7 days of treatment.

Score changes on the Sequential Organ Failure Assessment (SOFA) of ICU patients

Uses GeNIe software

(Sandri et al., 2014)

Revisit: Bayesian Networks

Graph representation of a probability distribution

DAG (Directed Acyclic Graph)

(Larranaga et al., 2013)
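As a reminder of what the DAG encodes, a Bayesian network over variables X_1, ..., X_n factorizes the joint distribution into one conditional distribution per node given its parents. This is the standard textbook form; the symbol Pa(X_i) for the parent set of X_i is notation introduced here for illustration.

```latex
P(X_1, \dots, X_n) \;=\; \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr)
```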


D-separation

d-separation: determines whether a node is independent of other nodes.

d-separation is denoted as:

d-sepG(X; Y | Z) holds if there is no active path between any node X ∈ X and any node Y ∈ Y in graph G, given Z

Relationships between nodes:

Direct

Indirect

(Direct relationship)

(Koller and Friedman, 2009)

Node Relationships: Indirect

(Koller and Friedman, 2009)
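A minimal sketch of a d-separation test, assuming the DAG is given as a dictionary from each node to its list of parents. It uses the well-known moralized ancestral graph criterion rather than explicit path enumeration. The function names and the toy graph (Regularity, Sputum, Age, Outcome) are hypothetical and are not the structure learned in this study.

```python
from collections import deque

def ancestors(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen = set(nodes)
    stack = list(nodes)
    while stack:
        node = stack.pop()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def d_separated(parents, xs, ys, zs):
    """True if every node in xs is d-separated from every node in ys given zs."""
    xs, ys, zs = set(xs), set(ys), set(zs)
    # 1. Keep only the nodes of interest and their ancestors.
    keep = ancestors(parents, xs | ys | zs)
    # 2. Moralize: link each child to its parents and "marry" co-parents.
    adj = {node: set() for node in keep}
    for child in keep:
        ps = list(parents.get(child, ()))
        for p in ps:
            adj[child].add(p)
            adj[p].add(child)
        for i, a in enumerate(ps):
            for b in ps[i + 1:]:
                adj[a].add(b)
                adj[b].add(a)
    # 3. Remove the conditioning set and test reachability from xs to ys.
    frontier = deque(xs - zs)
    visited = set(frontier)
    while frontier:
        node = frontier.popleft()
        if node in ys:
            return False
        for neighbour in adj[node]:
            if neighbour not in zs and neighbour not in visited:
                visited.add(neighbour)
                frontier.append(neighbour)
    return True

# Hypothetical toy DAG: Regularity -> Sputum -> Outcome <- Age
toy_parents = {"Sputum": ["Regularity"], "Outcome": ["Sputum", "Age"]}
chain_open = d_separated(toy_parents, {"Regularity"}, {"Outcome"}, set())
chain_blocked = d_separated(toy_parents, {"Regularity"}, {"Outcome"}, {"Sputum"})
collider_closed = d_separated(toy_parents, {"Regularity"}, {"Age"}, set())
collider_opened = d_separated(toy_parents, {"Regularity"}, {"Age"}, {"Outcome"})
```

In the toy graph, conditioning on the middle node of a chain blocks the path, while conditioning on a common effect (a collider) opens a path that was otherwise blocked.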

Bayesian Networks in Health Applications

(Lucas, 2004)

Diagnostic Reasoning

• To develop a diagnostic model for one specific disease.

• To model and predict the outcome of treatment.

• Exploitation of knowledge from the treatment process.

Treatment Selection

• Decision support system for choosing the optimal treatment of a disease.

Discovering Functional Interactions

• Molecular mechanisms, e.g. gene interactions

Dynamic Bayesian Networks

Two-tuple: Prior Model and Transition Model

DBN formula

(Van Gerven, 2008)
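A common textbook way to write the pair of prior model and transition model, assuming a first-order Markov process over time slices 0, ..., T. The notation (B_0 for the prior model, B_→ for the transition model, X_t for the variables in slice t) is introduced here for illustration and may differ from the notation in Van Gerven (2008).

```latex
P(X_{0:T}) \;=\; P_{B_0}(X_0) \prod_{t=1}^{T} P_{B_\rightarrow}\bigl(X_t \mid X_{t-1}\bigr),
\qquad
P_{B_\rightarrow}\bigl(X_t \mid X_{t-1}\bigr) \;=\; \prod_{i} P\bigl(X_t^{i} \mid \mathrm{Pa}(X_t^{i})\bigr)
```

Here the parents Pa(X_t^i) may lie in slice t (intra-slice links) or in slice t-1 (inter-slice links).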

Dynamic Bayesian Networks Components

(Perez-Ariza, 2012)

1. A set of defined nodes

2. Intra-slice links

3. Temporal (inter-slice) links

4. Conditional Probability Table for the first time slice

5. Conditional Probability Table for the second time slice (from different parents).
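The five components can be sketched as a small data structure. This is an illustrative sketch only: the class name `TwoSliceDBN`, its field names, and the example variables are assumptions, and the conditional probability tables are left as empty dictionaries. The `unroll` method shows how intra-slice and inter-slice links repeat when the network is expanded over several time slices.

```python
from dataclasses import dataclass, field

@dataclass
class TwoSliceDBN:
    nodes: list                      # 1. set of defined nodes
    intra_links: list                # 2. (parent, child) pairs within one slice
    inter_links: list                # 3. (parent at t-1, child at t) pairs
    prior_cpt: dict = field(default_factory=dict)       # 4. CPTs, first slice
    transition_cpt: dict = field(default_factory=dict)  # 5. CPTs, later slices

    def unroll(self, n_slices):
        """Return the edges of the DAG obtained by unrolling over n_slices."""
        edges = []
        for t in range(n_slices):
            # Intra-slice links are copied into every slice.
            edges += [((p, t), (c, t)) for p, c in self.intra_links]
            # Inter-slice links connect slice t-1 to slice t.
            if t > 0:
                edges += [((p, t - 1), (c, t)) for p, c in self.inter_links]
        return edges

# Hypothetical example with two variables observed at three visits.
dbn = TwoSliceDBN(
    nodes=["Weight", "Sputum"],
    intra_links=[("Weight", "Sputum")],
    inter_links=[("Weight", "Weight"), ("Sputum", "Sputum")],
)
unrolled_edges = dbn.unroll(3)
```

Each node in the unrolled graph is a (name, slice) pair, so the result is an ordinary static DAG that standard Bayesian network tools can process.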

CaMML (Causal Discovery via MML)

(Korb and Nicholson, 2010)

Structure Learning

• Metropolis algorithm, a Markov Chain Monte Carlo (MCMC) search

Score-based

• Minimum Message Length (MML)
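To illustrate the general idea of a score-based Metropolis search over graph structures, here is a generic sketch. It is not CaMML's actual sampler, which is considerably more elaborate. The function names, the edge-toggle proposal, the `temperature` parameter, and the toy `message_length` score are all assumptions made for illustration.

```python
import math
import random

def is_acyclic(nodes, edges):
    """Kahn's algorithm: True if the directed graph has no cycle."""
    indegree = {n: 0 for n in nodes}
    for _, child in edges:
        indegree[child] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    processed = 0
    while ready:
        node = ready.pop()
        processed += 1
        for parent, child in edges:
            if parent == node:
                indegree[child] -= 1
                if indegree[child] == 0:
                    ready.append(child)
    return processed == len(nodes)

def metropolis_search(nodes, message_length, steps, temperature=1.0, seed=0):
    """Sample DAGs, favouring shorter message lengths; return the best seen."""
    rng = random.Random(seed)
    candidates = [(a, b) for a in nodes for b in nodes if a != b]
    current = frozenset()
    current_len = message_length(current)
    best, best_len = current, current_len
    for _ in range(steps):
        # Symmetric proposal: add or remove one randomly chosen arc.
        proposal = current ^ {rng.choice(candidates)}
        if not is_acyclic(nodes, proposal):
            continue
        new_len = message_length(proposal)
        # Metropolis rule: always accept improvements, sometimes accept worse.
        accept_prob = math.exp(min(0.0, (current_len - new_len) / temperature))
        if rng.random() < accept_prob:
            current, current_len = proposal, new_len
            if new_len < best_len:
                best, best_len = proposal, new_len
    return best, best_len

# Toy score (hypothetical): number of arcs that differ from a known target DAG.
target = frozenset({("A", "B"), ("B", "C")})
found, found_len = metropolis_search(
    ["A", "B", "C"], lambda edges: len(edges ^ target),
    steps=2000, temperature=0.1, seed=0,
)
```

In a real setting the score would be a message length computed from data, so that shorter messages correspond to structures that explain the data more compactly.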

Minimum Message Length (MML)

(Korb and Nicholson, 2010)

where:

h: data

N: number of variables in h

p_i: prior probability of i

i: index in h

j: index not in h
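As general background, the MML principle scores a model by the length of a two-part message: first the model, then the data encoded using that model. The form below is the generic two-part statement, not the structure-coding formula that the definitions above belong to; the symbols M (model) and D (data) are introduced here for illustration only.

```latex
I(M, D) \;=\; I(M) + I(D \mid M) \;=\; -\log P(M) \;-\; \log P(D \mid M)
```

The preferred model is the one that minimizes this total message length.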

The Comparisons of DBN Structure Learning Software

Name | Structure Learning | Parameter Learning | DBN Algorithms | GUI | URL
CaMML | Yes | Yes | CaMML | Yes | bayesian-intelligence.com
Bayes Net Toolbox* | Partial[1]/2-Step[1], Yes | Yes | BIC[1,2], BDe[2] | No | code.google.com/p/bnt/
BNFinder | Partial[1] | ? | BIC, BDe, MIT | No | bioputer.mimuw.edu.pl/software/bnf/
GlobalMIT* | Partial[1] | No | MIT | No | code.google.com/p/globalmit/
Tetrad | No/Manual[3] | Yes | (PC, others) | Yes | www.phil.cmu.edu/tetrad/
GeNIe | No/Manual[3,4] | Yes | (PC, K2, other) | Yes | genie.sis.pitt.edu
Banjo | Yes | No | BDe | No | cs.duke.edu/~amink/software/banjo/

* Requires Matlab. GlobalMIT: may be possible to use Octave (free) instead of Matlab.

[1] Supports DBN learning with inter-slice arcs only (i.e. no arcs within time slices).

[2] With DBmcmc extension (bioss.ac.uk/~dirk/software/DBmcmc/), but binary/ternary data/attributes only.

[3] No official support for learning DBNs. Can adapt BN algorithms using tier priors etc.

[4] DBN parameter learning, but no structure learning. Supports DBN inference, unrolling etc.

(Black, 2013)

CaMML (Causal Discovery via MML)

An example of transforming data from the static Bayesian network format to the dynamic Bayesian network format is shown in Figure 2 (Black, 2013).

Figure 2. Data Transformation
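One common way to prepare time-series records for two-slice DBN learning is to pair each observation with the next one, so that every row holds the variables at time t0 and at time t1. The sketch below assumes this pairing scheme; it is not taken from Figure 2, and the function name, the `_t0`/`_t1` column suffixes, and the example values are hypothetical.

```python
def to_two_slice_rows(series, variables):
    """Pair consecutive observations of one patient into two-slice rows.

    series: list of dicts, one per time step, in chronological order.
    variables: names of the variables to carry into each slice.
    """
    rows = []
    for previous, current in zip(series, series[1:]):
        row = {f"{name}_t0": previous[name] for name in variables}
        row.update({f"{name}_t1": current[name] for name in variables})
        rows.append(row)
    return rows

# Hypothetical patient with three visits.
visits = [
    {"Weight": "low", "Sputum": "positive"},
    {"Weight": "low", "Sputum": "negative"},
    {"Weight": "normal", "Sputum": "negative"},
]
two_slice_rows = to_two_slice_rows(visits, ["Weight", "Sputum"])
```

A series of n observations yields n-1 rows, and a static structure learner applied to these rows can then discover arcs from `_t0` columns to `_t1` columns.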

Data Preprocessing

Data categories defined by expert

Raw data (TBC electronic data)

Pre-processed data

12 Variables

1. Age

2. Sex

3. Sputum test (D1, D2, D3)

4. Treatment category

5. Weight (B1, B2, B3)

6. Tuberculosis type (pulmonary or extrapulmonary)

7. Regularity of taking anti-TBC medicine (yes/no)

8. Outcome of the treatment

Implementation

Input: data and MML parameters

CaMML (Causal Discovery via MML)

The resulting DBN is checked against expert opinion

Structure Evaluation of DBN Graph

• Uses the Netica-J API package

Prediction accuracy (%)
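The two evaluation measures named in the scope, prediction accuracy and logarithmic loss, can be computed as in the sketch below. This is a plain Python illustration, not the Netica-J API; the function name, the use of the natural logarithm, the probability floor of 1e-15, and the example predictions are assumptions.

```python
import math

def accuracy_and_log_loss(true_labels, predicted_probs):
    """Return (accuracy in percent, mean logarithmic loss).

    true_labels: list of the observed outcome labels.
    predicted_probs: list of dicts mapping each label to its predicted probability.
    """
    correct = 0
    total_loss = 0.0
    for label, probs in zip(true_labels, predicted_probs):
        # The predicted class is the label with the highest probability.
        predicted = max(probs, key=probs.get)
        correct += predicted == label
        # Log loss penalizes low probability assigned to the true label.
        total_loss += -math.log(max(probs.get(label, 0.0), 1e-15))
    n = len(true_labels)
    return 100.0 * correct / n, total_loss / n

# Hypothetical predictions for two patients.
labels = ["success", "unsuccess"]
probabilities = [
    {"success": 0.8, "unsuccess": 0.2},
    {"success": 0.4, "unsuccess": 0.6},
]
accuracy_percent, mean_log_loss = accuracy_and_log_loss(labels, probabilities)
```

Accuracy only checks whether the most probable label is right, while logarithmic loss also rewards well-calibrated probabilities, so the two measures complement each other.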

Experiments with different outcomes

6 categories of treatment outcomes:

• recovered

• completed

• failed

• died

• incomplete treatment

• changed treatment location

2 categories of treatment outcomes:

• success (recovered, completed)

• unsuccess (failed, died, incomplete treatment, changed treatment location)

(WHO, 2013)
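The grouping of the six outcome categories into two can be expressed as a simple lookup. The label spellings and the function name below are assumptions chosen for illustration; the actual coding used in the hospital data may differ.

```python
# Assumed label spellings for the six outcome categories.
SUCCESS_OUTCOMES = {"recovered", "completed"}
UNSUCCESS_OUTCOMES = {
    "failed",
    "died",
    "incomplete treatment",
    "changed treatment location",
}

def to_binary_outcome(outcome):
    """Map one of the six outcome categories to 'success' or 'unsuccess'."""
    label = outcome.strip().lower()
    if label in SUCCESS_OUTCOMES:
        return "success"
    if label in UNSUCCESS_OUTCOMES:
        return "unsuccess"
    raise ValueError(f"unknown treatment outcome: {outcome!r}")

binary_labels = [to_binary_outcome(o) for o in ["Recovered", "died", "completed"]]
```

Raising an error on unknown labels keeps unexpected categories from being silently counted in either group.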

Experimental Results 1

DBN graph of Experiment 1

Experimental Results 2

DBN graph of Experiment 2

Experimental Results 3

BN graph of Experiment 3

Experimental Results 4

BN graph of Experiment 4

Experimental Results 5

BN graph of Experiment 5 (structure model by experts)

Experimental Results 6

BN graph of Experiment 6 (structure model by expert)

Accuracy Comparisons


Conclusions

The graph structure of Dynamic Bayesian Networks, used as a prediction model of tuberculosis treatment outcome, can identify the independency of variables in tuberculosis patients' treatment data.

The independencies of variables can be identified by the d-separation algorithm.

Future Works

Adding more variables may improve the prediction performance of the TBC treatment model.

The approach can be applied to data on other epidemics, such as dengue fever and influenza-type diseases (e.g. bird flu).

Q & A

Thank you