2008: Applied AIS - A Roadmap of AIS Research in Brazil and Sample Applications



ICARIS 2008 (International Conference on Artificial Immune Systems), Phuket, Thailand


Applied AIS: A Roadmap of AIS Research in Brazil and Sample Applications

Leandro Nunes de Castro

[email protected]; [email protected]

Mackenzie University

NatComp – From Nature to Business

Applied AIS - ICARIS 2008 Leandro Nunes de Castro


Agenda

• Main Application Areas and When AIS Should be Applied
• A Worth Knowing Bibliography
• A Roadmap of AIS Research in Brazil
• Sample Projects
  – Bi-clustering for text mining
  – An AIS for Spam Detection
  – Optimal Power Flow
  – A Grain Classifier
  – Container Scheduling

Application Areas

Which and When

Main Application Areas

• An imprecise and incomplete classification:
  – Pattern Recognition and Classification
  – Machine Learning
  – Knowledge Discovery from Databases
  – Search and Optimization
  – Robotics
  – Control
  – Industrial Applications
  – Anomaly Detection

Common Features

• When AIS should be used:
  – Difficulty in modeling
  – Poorly defined
  – Dynamic environments
  – Large number of variables
  – Missing or noisy variables (attributes)
  – Highly nonlinear
  – Difficulty in finding derivatives
  – Combinatorial solutions (NP-Complete/NP-Hard)
  – Multiple simultaneous solutions are required

Where to Find Information

http://www.artificial-immune-systems.org/

AISWeb: The Online Home of Artificial Immune Systems

www.artificial-immune-systems.org

• About the Immune System
• ICARIS
• Immune-Inspired Algorithms
• Jobs and studentships
• Links to Researchers
• Modeling the Immune System
• Publications
• Teaching Resources

The On-line Searchable Bibliography

http://www.asap.cs.nott.ac.uk/ais/

You can either search or browse:
• Books
• Book Chapters
• Theses
• Journal Papers
• Conference Papers
• Technical Reports

A Roadmap of AIS Research in Brazil

Main Research Groups and Their Focus

Geographic Distribution

UFMG

UNIFEI

UNICAMP

MACKENZIE & NATCOMP

UTFPR

A Brief Description of Each Group

• University: UNICAMP (University of Campinas)
• LBiC: Laboratory for Bio-Inspired Computing
• Leader: Prof. Fernando J. Von Zuben
• Main Application Areas:
  – Nonlinear dynamic systems identification
  – Combinatorial optimization: vehicle routing, gene ordering
  – Music composition and arts
  – Bioinformatics: phylogenetic tree reconstruction, gene expression analysis
  – Optimal Wiener equalizers
  – ANN design
  – Data mining: clustering, classification, text mining

Some LBiC Contributions

aiNet

opt-aiNet

copt-aiNet

dopt-aiNet

dcopt-aiNet

ARIA

ABNET

RABNET

SABNET

SaiNet

omni-aiNet

A Brief Description of Each Group

• University: UFMG (Federal University of Minas Gerais)
• LITC: Computational Intelligence Laboratory
• Leader: Prof. Walmir Matos Caminhas
• Main Application Areas:
  – Data Mining: spam identification; fault, anomaly and intrusion detection
  – Nonlinear system identification and control: induction motors, electromagnetic devices

A Brief Description of Each Group

• University: UNIFEI (Federal University of Itajubá)
• CRTI: Reference Center on Information Technology
• Leader: Dr. Leonardo de Mello Honório
• Main Application Areas:
  – Optimization: optimal power flow, agents scheduling

• University: UTFPR (Technological Federal University of Paraná)
• Leader: Dr. Myriam Regattieri
• Main Application Areas:
  – Economic load dispatch, protein folding, breast cancer profiling, protein structure prediction

A Brief Description of Each Group

• University: Mackenzie & NatComp
• Leader: Prof. Leandro Nunes de Castro
• Main Application Areas:
  – Optimization: combinatorial and multimodal
  – Intelligent machines
  – Optimal Wiener equalizers
  – ANN design
  – Data mining: clustering, classification, text mining

Sample Applications from the Brazilian Groups

From Text Mining to Grain Classification

BIC-aiNet: An AIS for Text Clustering

Pablo de Castro et al., Natural Computing journal, in press.

A Group from Unicamp

Immune-Inspired Biclustering to Text Mining

• Standard Clustering:
  – Applied to either the rows or columns of a data matrix; that is, clusters either objects or attributes
  – ‘Global’ model

• Biclustering:
  – Simultaneous row-column clustering; that is, clusters objects and attributes
  – ‘Local’ model
  – Also called coclustering, bidimensional clustering, subspace clustering
  – Terminology introduced for gene expression data analysis

Biclustering

• When should biclustering be used?
  – A set of objects influences (is influenced by) a set of attributes
  – An object may belong to more than one cluster

• Restrictions:
  – A cluster of objects should be defined with respect to only a subset of attributes
  – A cluster of attributes should be defined with respect to only a subset of objects
  – The clusters should not be exclusive and/or exhaustive: an object/attribute should be able to belong to more than one cluster or no cluster at all and be grouped using a subset of attributes/objects

Biclustering

• A bicluster is a subset of rows that exhibit similar behavior across a subset of columns, and vice-versa.

• The bicluster $A_{IJ} = (I, J)$ is a subset of rows and a subset of columns where $I = \{i_1, \ldots, i_k\}$ is a subset of rows ($I \subseteq X$ and $k \le n$), and $J = \{j_1, \ldots, j_s\}$ is a subset of columns ($J \subseteq Y$ and $s \le m$).

• A bicluster can be defined as a k by s submatrix of matrix A.
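The submatrix view of a bicluster can be sketched in a few lines of Python. This is only an illustration: the helper name `extract_bicluster`, the 1-based index convention and the matrix values are assumptions made for the example, not part of the original slides.

```python
def extract_bicluster(matrix, rows, cols):
    """Return the k-by-s submatrix selected by 1-based row and column indices."""
    return [[matrix[i - 1][j - 1] for j in cols] for i in rows]


# Illustrative 3 x 4 data matrix (values are an assumption for this example).
A = [
    [1, 2, 2, 1],
    [2, 2, 3, 3],
    [1, 3, 2, 1],
]

# I = {1, 3}, J = {1, 4}: a 2-by-2 bicluster with identical entries.
bicluster = extract_bicluster(A, rows=[1, 3], cols=[1, 4])
print(bicluster)  # [[1, 1], [1, 1]]
```

Note that the selected rows and columns need not be contiguous in A, which is what distinguishes a bicluster from a simple block of the matrix.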

Biclustering

• Two interpretations:
  – As a two-way permutation problem: interactive reordering of rows and columns so as to produce multiple clusters in different regions of the matrix
  – As a two-way partition problem: creation of submatrices so as to maximize an index that evaluates clustering properties (e.g., similarity among objects)

[Figure: a 3 × 4 data matrix with rows (1 2 2 1), (2 2 3 3), (1 3 2 1), shown with reordered rows and columns and with example submatrices labelled rows = {1,3}, columns = {1,4} and rows = {2,3}, columns = {3,4}]

An Immune-Inspired Algorithm for Biclustering

• BIC-aiNet:
  – Multi-population
  – Dynamic control of the population size
  – Diversity maintenance

• Encoding:
  – Two ordered vectors (rows and columns)
  – Each individual represents a single bicluster

[Figure: a data matrix, an individual (Row: [2, 3], Column: [1, 3, 4]) and the bicluster it encodes]

BIC-aiNet

• Fitness function:

$$f(M, N) = \frac{R}{\lambda} + \frac{w_c \lambda}{|M|} + \frac{w_r \lambda}{|N|}$$

$$R = \frac{1}{|N|\,|M|} \sum_{i \in N,\, j \in M} \left(a_{ij} - r_{Ij} - r_{iJ} + r_{IJ}\right)^2$$

N, M are the sets of rows and columns, R is the residue of a bicluster, $\lambda$ is a residue threshold, $w_c$ is the weight of the number of columns, $w_r$ is the weight of the number of rows, $a_{ij}$ is the value in the original data matrix, $r_{iJ}$ ($r_{Ij}$) indicates the mean value for row i (column j), and $r_{IJ}$ is the mean over the whole bicluster.
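A minimal sketch of how the residue R and the fitness could be evaluated for one bicluster. It assumes R is the usual mean squared residue (entry minus row mean minus column mean plus overall mean, squared and averaged) and that the fitness has the form R/λ + w_c·λ/|M| + w_r·λ/|N|; the function names, the 0-based indices and the parameter names `lam`, `w_r`, `w_c` are choices made for this example only.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the bicluster given by 0-based rows and cols."""
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(rows), len(cols)
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            total += (sub[i][j] - row_means[i] - col_means[j] + overall) ** 2
    return total / (n_rows * n_cols)


def fitness(matrix, rows, cols, lam, w_r, w_c):
    """Assumed fitness form: R/lam + w_c*lam/|cols| + w_r*lam/|rows|."""
    residue = mean_squared_residue(matrix, rows, cols)
    return residue / lam + w_c * lam / len(cols) + w_r * lam / len(rows)


# A perfectly additive bicluster (rows differ by a constant) has zero residue.
additive = [[1, 2, 3], [2, 3, 4], [4, 5, 6]]
print(mean_squared_residue(additive, [0, 1, 2], [0, 1, 2]))
```

Under this form, lower values are better: a low residue rewards coherent biclusters, and the two size terms shrink as more rows and columns are included.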

The BIC-aiNet Algorithm

Mutation: insertion/deletion of rows and columns

Suppression: eliminates similar biclusters
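The insertion/deletion mutation on the two ordered index vectors might look like the sketch below. The function names, the 1-based indices, the equal probabilities and the rule that a vector is never emptied are all assumptions made for illustration; the slides do not specify these details.

```python
import bisect
import random


def insert_index(vector, index):
    """Insert index into an ordered vector if absent, keeping it sorted."""
    pos = bisect.bisect_left(vector, index)
    if pos < len(vector) and vector[pos] == index:
        return list(vector)  # already present: nothing to insert
    return vector[:pos] + [index] + vector[pos:]


def delete_index(vector, index):
    """Remove index if present; assumed rule: never empty the vector."""
    if index in vector and len(vector) > 1:
        return [v for v in vector if v != index]
    return list(vector)


def mutate(rows, cols, n_rows, n_cols, rng):
    """Apply one random insertion or deletion to the rows or to the columns."""
    mutate_rows = rng.random() < 0.5
    vector, limit = (rows, n_rows) if mutate_rows else (cols, n_cols)
    index = rng.randint(1, limit)
    operator = insert_index if rng.random() < 0.5 else delete_index
    changed = operator(vector, index)
    return (changed, list(cols)) if mutate_rows else (list(rows), changed)


rng = random.Random(0)
rows, cols = mutate([2, 3], [1, 3, 4], n_rows=3, n_cols=4, rng=rng)
print(rows, cols)
```

Keeping both vectors sorted means two individuals that select the same rows and columns have identical encodings, which makes it straightforward to compare biclusters during suppression.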

BIC-aiNet Performance

• Application in Collaborative Filtering
  – Perform automated recommendations for a user
  – Input: matrix Rij, in which each entry represents the rating given by user i to item j

• Data sets used: MovieLens, Jester and Dating
  – MovieLens: 100,000 ratings assigned by 943 users on 1,682 movies. Range: 1 (bad) – 5 (excellent)
  – The purpose is to group users with similar interests in order to recommend a movie when a user asks for one.

• Performance Measures:
  – RMSE, MAE, Accuracy, Precision
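For reference, the two error measures listed above follow their standard definitions and can be computed as in the sketch below. The rating values are made up for the example and do not come from MovieLens or from the reported experiments.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between actual and predicted ratings."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error between actual and predicted ratings."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


# Hypothetical ratings on the 1-5 scale.
actual = [5, 3, 4, 1]
predicted = [4, 3, 5, 3]
print(rmse(actual, predicted), mae(actual, predicted))
```

RMSE penalizes large individual errors more heavily than MAE, so reporting both gives a fuller picture of recommendation quality.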


BIC-aiNet Discussion

• Bi-clustering interpreted as a bipartition problem
• Possibility of using just some attributes per cluster
• Multimodal solutions
• Accurate recommendations
• An attribute may appear in more than one cluster or in none

IA-AIS: An AIS To Detect Spam

Thiago S. Guzella et al., BioSystems 92 (2008), pp. 215-225

A Group from UFMG

SPAM Detection

• SPAM messages are constantly ‘evolving’, e.g.:
  – free = fr33
  – viagra = v1agra
  – casino = casin0
  – watch = w4tch

• SPAM messages can be identified by features such as the sender’s e-mail address, message subject and message body
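To make the 'evolving' spelling problem concrete, the sketch below undoes the digit-for-letter substitutions shown in the examples. It is not part of IA-AIS: the mapping table and the `normalize` helper are assumptions for illustration, and a fixed table like this is exactly what spammers adapt around, which is the motivation for an adaptive detector.

```python
# Hypothetical substitution table covering only the examples on the slide.
LEET_MAP = str.maketrans({"0": "o", "1": "i", "3": "e", "4": "a"})


def normalize(token):
    """Lower-case a token and undo simple digit-for-letter substitutions."""
    return token.lower().translate(LEET_MAP)


for word in ["fr33", "v1agra", "casin0", "w4tch"]:
    print(word, "->", normalize(word))
```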

IA-AIS: Innate and Adaptive AIS

• Combines features from Negative Selection and Clonal Selection
• Composed of Macrophages, B cells, T cells, Interactions among B and T cells (helper and regulatory)


IA-AIS: Parameters and Configurations

IA-AIS: Experimental Results

IA-AIS: Discussion

• Macrophages, B cells, T helper cells, T regulatory cells
• Incorporation of user feedback
• Considers interactions of immune cells
• Interesting alternative when high true positive values are relevant

CGbAIS: A Cluster Gradient-Based AIS to Optimal Power Flow

Leonardo M. Honorio et al., IEEE Trans. on Power Systems, in press.

Optimal Power Flow Problems

• OPF Main Features:
  – Non-linear, non-convex, large-scale
  – Several sets of continuous and discrete variables

• CGbAIS Main Features:
  – An individual is related to a set of control variables that define a possible solution, characterized by a set of equations that describe its behavior
  – The Jacobian vector associated with the solutions can be used to guide mutation
  – A clustering algorithm is used to reduce computational effort
  – To ensure the KKT conditions, a modified Lagrangian system is used

The Cluster Gradient-Based AIS (CGbAIS)

CGbAIS for Discrete Optimization

• An antibody is a partial path over a search tree
• Maintenance of nLocalBest clones and selection of nGlobalBest clones
• Clustering of paths with the same nodes
• Numerical information used to evolve the population:

Combinatorial Optimization with CGbAIS

Solving the Lagrangian Problem with CGbAIS

• Typical OPF Problem Formulation

$$\begin{aligned}
\text{Minimize} \quad & f(x_s, x_c) \\
\text{Subject to} \quad & g(x_s, x_c) = 0 \\
& h^{\min} \le h(x_s, x_c) \le h^{\max} \\
& (x_s, x_c)^{\min} \le (x_s, x_c) \le (x_s, x_c)^{\max}
\end{aligned}$$

• Augmented Lagrangian Function

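Solving the Lagrangian Problem with CGbAIS

• Karush-Kuhn-Tucker Conditions

• The Dual Problem

[Equations garbled in extraction. The surviving fragments show five conditions set to zero, including $\partial L / \partial x_i$ and $\partial L / \partial w_i$, and a dual problem of the form "Maximize ... subject to ..." whose objective is defined through a minimization of $L(x, w)$.]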

Solving the Lagrangian Problem with CGbAIS

• Formulation Mixing the Primal and Dual Problems

$$\text{Minimize} \quad \sum_{i=1}^{n_x} \left(\frac{\partial L}{\partial x_i}\right)^2 + \sum_{i=1}^{n_w} \left(\frac{\partial L}{\partial w_i}\right)^2 \quad \text{over } (x, w)$$

• Scenario 1:
  – Transmission loss reduction by installing shunt compensation
  – Variation of the IEEE 14-bus test system
  – mFLoss: mean loss reduction
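The idea of mixing the primal and dual problems, read here as minimizing the sum of squared partial derivatives of the Lagrangian with respect to the variables x and the multipliers w, can be checked on a toy problem. The problem below (minimize x² subject to x − 1 = 0) and the function names are assumptions chosen for illustration; they are unrelated to the OPF test system.

```python
def lagrangian_gradients(x, w):
    """Partial derivatives of L(x, w) = x**2 + w * (x - 1)."""
    dL_dx = 2 * x + w
    dL_dw = x - 1
    return dL_dx, dL_dw


def kkt_residual(x, w):
    """Sum of squared partial derivatives of L; zero at a stationary point."""
    dL_dx, dL_dw = lagrangian_gradients(x, w)
    return dL_dx ** 2 + dL_dw ** 2


print(kkt_residual(0.0, 0.0))   # an arbitrary starting point
print(kkt_residual(1.0, -2.0))  # the constrained optimum x = 1, w = -2
```

Because the residual is non-negative and vanishes only where both sets of derivatives are zero, a population-based search such as an AIS can minimize it directly without treating x and w differently.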

CGbAIS: Experimental Results

[Plot: Dual Error, Primal Error and Losses (pu) versus no. of generations (1 to 16)]

Results without clustering

nAt   mFLoss   mGen    mPerror   mDerror   mTime
10    0.293    22.01   0.022     0.041     12.3
20    0.293    18.02   0.021     0.037     19.8
30    0.293    17.30   0.012     0.029     30.1
40    0.293    17.21   0.015     0.038     42.1
50    0.293    17.90   0.009     0.026     49.5

Results with clustering

nAt   mFLoss   mGen    mPerror   mDerror   mTime
10    0.293    16.32   0.021     0.045     4.9
20    0.293    17.01   0.015     0.033     7.1
30    0.293    15.80   0.011     0.031     9.1
40    0.293    15.20   0.018     0.039     11.1
50    0.293    22.10   0.009     0.026     14.6
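The effect of clustering on run time can be read off the two result tables with a short calculation. The sketch assumes that the table with the larger mTime values is the one without clustering and that mTime is a mean run time in the same (unstated) unit in both tables.

```python
# mTime columns copied from the two result tables, indexed by nAt.
n_at = [10, 20, 30, 40, 50]
time_without = [12.3, 19.8, 30.1, 42.1, 49.5]  # assumed: no clustering
time_with = [4.9, 7.1, 9.1, 11.1, 14.6]        # assumed: with clustering

# Ratio of run times for each population setting.
speedups = [slow / fast for slow, fast in zip(time_without, time_with)]

for n, s in zip(n_at, speedups):
    print(f"nAt={n}: {s:.2f}x faster with clustering")
```

The mFLoss column is identical (0.293) in both tables, so under these assumptions the time savings do not come at the expense of the loss reduction achieved.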

CGbAIS: Discussion

• A hybridization of an AIS with clustering and numerical information to reduce computing effort and improve search robustness
• Use of a bent augmented Lagrangian
• Good results when compared with traditional interior-point methods

A Grain Classification Machine

NatComp – From Nature to Business

Automatic Grain Classification

• Actors involved:
  – Producers;
  – Local and global consumers;
  – Cooperatives;
  – Banks;
  – Stock Market.

• Motivation:
  – Automatic certification of quality;
  – Avoid classification conflicts;
  – No equivalent machine available***;
  – Standardization.

Automatic Grain Classification

• Physical Classification: based on a sample of grains
  – Grain quality: endogenous and exogenous defects;
  – Grain size.

Examples of Grain Defects

The Grain Classifier Project

• Public Investor
• The Development Cycle:

HW Prototyping


Computer Vision

• Image Capture:
  – Double face capture

• Feature Extraction:
  – Color, Texture and Shape attributes
  – Based on the RGB histograms
  – Total of 70 attributes extracted per grain

Feature Selection and Classification

• Feature Selection:
  – Filter and Wrapper

• Classification:
  – Naïve Bayes
  – KNN
  – Support Vector Machines
  – Multi-Layer Perceptrons
  – aiNet+RBF
  – SRABNET: Supervised RABNET

Experimental Results

• Estimating the Weight

ICS-RBF = aiNet+RBF

Experimental Results

• Classification Performance

Etr% Ete%

        ECV%    Std
MLP     8.80    1.814
SVM     10.60   3.31
k-NN    15.10   3.00

Sample Certificate

Discussion

• The immune system approach proved to be competitive
• Experiment with binary classification followed by defect classification
• Experiment with hierarchical classification
• Possibility of automating the classification of grains

Operation Planning in a Container Terminal (CONTER)

NatComp – From Nature to Business

The Importance of Container Terminals

• Most world commerce is carried out using containers

• The operation of a CONTER is a very complicated and challenging task, involving space and equipment constraints, short time spans for ship docking, pre-specified ship plans, customs procedures, etc.

A Typical Problem: Scheduling RTGs

• When a Ship Plan is received in the terminal, the operators have to locate the selected containers and load them onto the ship.

• RTGs (Rubber Tyred Gantry cranes) are typical container-handling equipment and move in three directions: x, y, and z.

• The fewer movements the RTGs make, the faster and cheaper the ship loading becomes.

RTGs Movements: Productive

[Figure: sequence of panels illustrating productive RTG movements]

RTGs Movements: Unproductive (Set-Up)

[Figure: panels (a) to (d) illustrating unproductive (set-up) RTG movements]

Cost to Remove Containers

$$Q_i = \frac{l_x n_x}{V_x} + \frac{l_y n_y}{V_y} + \frac{l_z n_z}{V_z} + t_T n_T$$

$$C = \sum_{i=1}^{N} Q_i$$

where $n_x$, $n_y$ and $n_z$ are the numbers of movements in directions X, Y and Z, respectively; $V_x$, $V_y$ and $V_z$ are the RTG velocities in directions X, Y and Z, respectively; $t_T$ is the time spent to lock or unlock the spreader and $n_T$ is the number of spreader locking/unlocking operations.
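The cost of removing containers can be evaluated as in the sketch below. It assumes the per-container cost is travel time along each axis (l·n/V) plus spreader lock/unlock time (t_T·n_T), and it interprets l as the distance covered by one movement in each direction, which the slide does not define. All numeric values and function names are hypothetical.

```python
def container_cost(l, n, v, t_lock, n_lock):
    """Q_i: travel time along x, y, z plus spreader lock/unlock time."""
    travel = sum(l[axis] * n[axis] / v[axis] for axis in ("x", "y", "z"))
    return travel + t_lock * n_lock


def total_cost(containers):
    """C: sum of Q_i over all containers to be removed."""
    return sum(container_cost(**c) for c in containers)


# Hypothetical container: distances per movement, movement counts, velocities.
example = {
    "l": {"x": 2.0, "y": 3.0, "z": 1.0},
    "n": {"x": 1, "y": 2, "z": 4},
    "v": {"x": 1.0, "y": 2.0, "z": 0.5},
    "t_lock": 5.0,
    "n_lock": 2,
}
print(container_cost(**example))       # cost of one container
print(total_cost([example, example]))  # cost of a two-container sequence
```

A scheduling algorithm would evaluate `total_cost` for each candidate ordering of containers and prefer the ordering with the lowest value.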

The copt-aiNet Algorithm

A Demo on the RTG Scheduling Problem

Discussion

• Vast number of applications available
• Great potential for further applications and developments
• Some issues that still deserve investigation:
  – Formal aspects
  – Comparison (theoretical and empirical) with other approaches
  – Loads of testing
  – Real benefits (Are they really useful?)
  – Danger theory
  – How far to stretch the metaphor?
  – Scalability
  – Robustness to high dimensions

Discussion

• Current Issues at Mackenzie and NatComp
  – An optimal clustering algorithm
  – AIS applied to recommender systems
  – AIS applied to intelligent virtual environments
  – AIS applied to virtual simulations for training purposes

THANK YOU FOR THE ATTENTION!