Active Learning based on Bayesian Networks


Luis M. de Campos, Silvia Acid and Moisés Fernández


Index of Contents

1. Introduction: the scenario is a pool-based active learning cycle.

2. Data and evaluation: we participated in five of the six datasets considered. Evaluation uses AUC and ALC.

3. Methods: features, implemented modules, general procedure, how to query labels, and a practical example.

4. Results: our best result is sixth place.

5. Conclusions

6. Acknowledgments


1. Introduction


2. Data and evaluation

The final test phase has six datasets. We participated in five of the six: A, C, D, E and F.

These datasets come from different application domains: chemoinformatics, embryology, marketing and text ranking.

Evaluation measures: area under the ROC curve (AUC) and area under the learning curve (ALC).
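As a rough illustration of the two measures, the sketch below computes AUC from pairwise comparisons and an area under a learning curve with the trapezoid rule. The slides do not give the exact ALC definition, so the log2-scaled x axis and the normalization by the x range are assumptions, and the function names `auc` and `alc` are made up for this example.

```python
import math

def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == -1]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def alc(num_labels, auc_values, log_scale=True):
    """Trapezoid area under the (labels queried, AUC) curve.

    Assumption: x axis in log2 of the number of labels, and the area is
    divided by the x range so that a flat curve at height h scores h.
    """
    xs = [math.log2(n) for n in num_labels] if log_scale else list(num_labels)
    area = sum((xs[i + 1] - xs[i]) * (auc_values[i] + auc_values[i + 1]) / 2.0
               for i in range(len(xs) - 1))
    return area / (xs[-1] - xs[0])
```

With this normalization a learner that reaches a high AUC after few queries gets a larger ALC than one that reaches the same AUC only at the end.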


3. Methods. Features

Hardware used: a laptop running Ubuntu 8.10, with 4 GB of memory and an Intel Core Duo at 2.53 GHz.

We used three Bayesian network base classifiers:

Naive Bayes. Used on dataset D.

TAN (Tree Augmented Network) with the BDeu score. Used on dataset F.

CHillClimber. A new classifier that searches a reduced space centered on the class node. Used with the BDeu score on datasets A, C and E.

Discretization of numerical variables: Fayyad & Irani MDL for TAN and CHillClimber; none for Naive Bayes.


3. Methods. Features and Modules

Active learning method: uncertainty sampling.

We didn’t use unlabeled data for training.

Software implemented (several modules):

Matlab: main module. It calls the C++ module.

C++: intermediate module. It calls the Weka-Java module.

Weka-Java: final module, implemented in Java on top of Weka with several modifications.



3. Methods. Procedure

The procedure is as follows:

1. The algorithm trains on all known instances; initially it has only the seed.

2. It selects new examples to query using one of three methods (a, b or c), described on the next slide.

3. It adds the newly labeled examples to the known instances.

4. Are all instances known? No: go to 1. Yes: end.
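The four steps can be sketched as a generic loop. This is only an illustration of the cycle, not the Matlab/C++/Weka implementation: the callables `oracle`, `train` and `select`, and the toy classifier used to exercise the loop, are hypothetical stand-ins.

```python
def active_learning_loop(seed, pool, oracle, train, schedule, select):
    """Run the four-step cycle until every pool instance is labeled.

    seed:     dict {instance_id: label} known at the start
    pool:     iterable of all instance ids
    oracle:   function id -> true label (the label query)
    train:    function labeled_dict -> model, with model(id) -> (max_prob, cls)
    schedule: iterable of batch sizes, one per iteration
    select:   function (scored, x) -> list of ids to query
    """
    labeled = dict(seed)
    unlabeled = [i for i in pool if i not in labeled]
    history = []
    for x in schedule:
        if not unlabeled:                       # step 4: everything is known
            break
        model = train(labeled)                  # step 1: train on known instances
        scored = [(i,) + tuple(model(i)) for i in unlabeled]
        chosen = select(scored, x)              # step 2: pick examples to query
        for i in chosen:                        # step 3: add the new labels
            labeled[i] = oracle(i)
        unlabeled = [i for i in unlabeled if i not in labeled]
        history.append(chosen)
    return labeled, history


# Toy stand-ins (hypothetical): ids are integers, true class is 1 for id >= 5.
def toy_oracle(i):
    return 1 if i >= 5 else -1

def toy_train(labeled):
    def model(i):
        # Predict the class of the nearest labeled id; confidence drops with distance.
        nearest = min(labeled, key=lambda j: abs(i - j))
        confidence = 1.0 - min(abs(i - nearest), 5) / 10.0
        return confidence, labeled[nearest]
    return model

def lowest_confidence(scored, x):
    return [e[0] for e in sorted(scored, key=lambda e: e[1])[:x]]

labeled, history = active_learning_loop(
    seed={0: -1, 9: 1}, pool=range(10), oracle=toy_oracle,
    train=toy_train, schedule=[1, 2, 4, 8], select=lowest_confidence)
```

Note that the loop also stops if the schedule runs out before the pool is exhausted, so the schedule should cover the whole pool.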

The number of instances to query in each iteration is fixed by one of three schedules, where “n” is the total number of labels in the dataset:

Exponential: 1, 2, 4, 8, 16, 32, 64, …

Equal10-All: (n/2)/10, (n/2)/10, (n/2)/10, …, (n/2)

All-Equal10: (n/2), (n/2)/10, (n/2)/10, (n/2)/10, …
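One possible reading of the three schedules is sketched below. The slides only show the first terms of each sequence, so the handling of remainders, the cap on the last exponential batch, and the interpretation of the "10" as ten equal batches covering half of the labels are assumptions; the function names are made up for this example.

```python
def split_equal(total, parts):
    """Split `total` into `parts` near-equal positive batch sizes."""
    base, extra = divmod(total, parts)
    sizes = [base + (1 if i < extra else 0) for i in range(parts)]
    return [s for s in sizes if s > 0]

def exponential_schedule(n):
    """1, 2, 4, 8, ... with the last batch cut to whatever remains."""
    sizes, remaining, batch = [], n, 1
    while remaining > 0:
        take = min(batch, remaining)
        sizes.append(take)
        remaining -= take
        batch *= 2
    return sizes

def equal10_all_schedule(n):
    """Ten batches of about (n/2)/10, then one batch with the other half."""
    half = n // 2
    return split_equal(half, 10) + [n - half]

def all_equal10_schedule(n):
    """One batch of n/2 first, then ten batches of about (n/2)/10."""
    half = n // 2
    return [half] + split_equal(n - half, 10)
```

For n = 100 labels this gives batches of 5 for the "Equal10" part and a single batch of 50 for the "All" part.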


3. Methods. How to query examples (a, b or c)

In each iteration we sort the examples in increasing order of the probability of their most probable class. Then we choose “x” examples with the selected method:

a. We query the “x” examples with the lowest probabilities.

b. We query the “x1” and “x2” examples with the lowest probabilities among those assigned to class -1 and class 1 respectively, keeping the class proportions of the examples known so far, with x = x1 + x2.

c. Like method b, but “x1” and “x2” are computed from the class proportions estimated from both the labels returned by the oracle and the values predicted by our classifier.
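The three selection methods can be sketched as follows. The slides do not say how x1 and x2 are rounded, so rounding x1 half up and setting x2 = x - x1 is an assumption, as are the function names and the tuple layout `(example_id, max_prob, predicted_class)`. The pool values are the MaxProb/Class figures from the example slide.

```python
import math

def sort_by_confidence(examples):
    """Sort (example_id, max_prob, predicted_class) tuples by increasing max_prob."""
    return sorted(examples, key=lambda e: e[1])

def query_a(examples, x):
    """Method a: the x examples with the lowest max probability."""
    return [e[0] for e in sort_by_confidence(examples)[:x]]

def query_by_proportion(examples, x, prop_neg):
    """Pick x1 least-confident class -1 examples and x2 class 1 examples."""
    x1 = min(x, math.floor(x * prop_neg + 0.5))   # assumption: round half up
    x2 = x - x1
    ranked = sort_by_confidence(examples)
    neg = [e[0] for e in ranked if e[2] == -1][:x1]
    pos = [e[0] for e in ranked if e[2] == 1][:x2]
    return neg + pos

def query_b(examples, x, n_neg_known, n_pos_known):
    """Method b: proportions taken from the labels known so far."""
    prop_neg = n_neg_known / (n_neg_known + n_pos_known)
    return query_by_proportion(examples, x, prop_neg)

def query_c(examples, x, n_neg_known, n_pos_known):
    """Method c: proportions from known labels plus the classifier's predictions."""
    n_neg = n_neg_known + sum(1 for e in examples if e[2] == -1)
    n_pos = n_pos_known + sum(1 for e in examples if e[2] == 1)
    return query_by_proportion(examples, x, n_neg / (n_neg + n_pos))

# Pool from the example slide: (example, MaxProb, Class).
pool = [(1, 0.90, 1), (2, 0.80, 1), (3, 0.60, -1),
        (4, 0.70, -1), (5, 0.65, -1), (6, 0.75, -1)]
```

With 6 known examples of class -1 and 4 of class 1, and x = 4, these functions reproduce the selections given in the example slide when only the six listed examples are considered.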


3. Methods. An example.

Prior knowledge: 6 examples of class -1 and 4 of class 1.

In addition, our classifier gives the following probabilities:

Example   Class -1   Class 1
1         0.10       0.90
2         0.20       0.80
3         0.60       0.40
4         0.70       0.30
5         0.65       0.35
6         0.75       0.25
…         …          …

Step 1, select the maximum probability of each example:

Example   MaxProb   Class
1         0.90      1
2         0.80      1
3         0.60      -1
4         0.70      -1
5         0.65      -1
6         0.75      -1
…         …         …

Step 2, sort by increasing MaxProb:

Example   MaxProb   Class
3         0.60      -1
5         0.65      -1
4         0.70      -1
6         0.75      -1
2         0.80      1
1         0.90      1
…         …         …

Our exponential strategy indicates that we have to choose 4 examples (we are in iteration three):

With method a we would choose examples 3, 5, 4, 6.

With method b we would choose examples 3, 5, 2, 1.

With method c we would choose examples 3, 5, 4, 2.


4. Results

Our results are rather modest, with reasonable performance on only two datasets, C and E.

The left plot shows dataset E and the right plot shows dataset C.

Dataset   Method                           Ranking
A         CHillClimber, exponential, a)    20/22
C         TAN, equal10-all, c)             6/14
D         NaiveBayes, all-equal10, a)      15/19
E         CHillClimber, exponential, b)    12/20
F         TAN, exponential, b)             13/16


5. Conclusions

We could improve our process by adding a clustering step while only a few instances are known.

Advantages: simple; not time-consuming.

Disadvantages: static behavior; lack of knowledge in the early stages of the process.


Acknowledgments

This work has been supported by the Spanish research programme Consolider Ingenio 2010: MIPRCV (CSD2007-00018).