Post on 05-Dec-2014
Applied AIS: A Roadmap of AIS Research in Brazil and Sample Applications
Leandro Nunes de Castro
lnunes@mackenzie.br; lnunes@natcomp.com.br
Mackenzie University
NatComp – From Nature to Business
Applied AIS - ICARIS 2008 Leandro Nunes de Castro
Agenda
• Main Application Areas and When AIS Should be Applied
• A Bibliography Worth Knowing
• A Roadmap of AIS Research in Brazil
• Sample Projects
– Bi-clustering for text mining
– An AIS for Spam Detection
– Optimal Power Flow
– A Grain Classifier
– Container Scheduling
Main Application Areas
• An imprecise and incomplete classification:
– Pattern Recognition and Classification
– Machine Learning
– Knowledge Discovery from Databases
– Search and Optimization
– Robotics
– Control
– Industrial Applications
– Anomaly Detection
Common Features
• When AIS should be used:
– Difficulty in modeling
– Poorly defined
– Dynamic environments
– Large number of variables
– Missing or noisy variables (attributes)
– Highly nonlinear
– Difficulty in finding derivatives
– Combinatorial solutions (NP-Complete/NP-Hard)
– Multiple simultaneous solutions are required
AISWeb: The Online Home of Artificial Immune Systems
www.artificial-immune-systems.org
• About the Immune System
• ICARIS
• Immune-Inspired Algorithms
• Jobs and studentships
• Links to Researchers
• Modeling the Immune System
• Publications
• Teaching Resources
The On-line Searchable Bibliography
http://www.asap.cs.nott.ac.uk/ais/
You can either search or browse:
• Books
• Book Chapters
• Theses
• Journal Papers
• Conference Papers
• Technical Reports
A Roadmap of AIS Research in Brazil
Main Research Groups and Their Focus
Geographic Distribution
UFMG
UNIFEI
UNICAMP
MACKENZIE & NATCOMP
UTFPR
A Brief Description of Each Group
• University: UNICAMP (University of Campinas)
• LBiC: Laboratory for Bio-Inspired Computing
• Leader: Prof. Fernando J. Von Zuben
• Main Application Areas:
– Nonlinear dynamic systems identification
– Combinatorial optimization: vehicle routing, gene ordering
– Music composition and arts
– Bioinformatics: phylogenetic tree reconstruction, gene expression analysis
– Optimal Wiener equalizers
– ANN design
– Data mining: clustering, classification, text mining
Some LBiC Contributions
aiNet
opt-aiNet
copt-aiNet
dopt-aiNet
dcopt-aiNet
ARIA
ABNET
RABNET
SABNET
SaiNet
omni-aiNet
A Brief Description of Each Group
• University: UFMG (Federal University of Minas Gerais)
• LITC: Computational Intelligence Laboratory
• Leader: Prof. Walmir Matos Caminhas
• Main Application Areas:
– Data Mining: spam identification; fault, anomaly and intrusion detection
– Nonlinear system identification and control: induction motors, electromagnetic devices
A Brief Description of Each Group
• University: UNIFEI (Federal University of Itajuba)
• CRTI: Reference Center on Information Technology
• Leader: Dr. Leonardo de Mello Honório
• Main Application Areas:
– Optimization: optimal power flow, agent scheduling
• University: UTFPR (Technological Federal University of Paraná)
• Leader: Dr. Myriam Regattieri
• Main Application Areas:
– Economic load dispatch, protein folding, breast cancer profiling, protein structure prediction
A Brief Description of Each Group
• University: Mackenzie & NatComp
• Leader: Prof. Leandro Nunes de Castro
• Main Application Areas:
– Optimization: combinatorial and multimodal
– Intelligent machines
– Optimal Wiener equalizers
– ANN design
– Data mining: clustering, classification, text mining
Sample Applications from the Brazilian Groups
From Text Mining to Grain Classification
BIC-aiNet: An AIS for Text Clustering
Pablo de Castro et al., Natural Computing journal, in press.
A Group from Unicamp
Immune-Inspired Biclustering for Text Mining
• Standard Clustering:
– Applied to either the rows or the columns of a data matrix; that is, it clusters either objects or attributes
– ‘Global’ model
• Biclustering:
– Simultaneous row-column clustering; that is, it clusters objects and attributes
– ‘Local’ model
– Also called coclustering, bidimensional clustering, subspace clustering
– Terminology introduced for gene expression data analysis
Biclustering
• When should biclustering be used?
– A set of objects influences (is influenced by) a set of attributes
– An object may belong to more than one cluster
• Restrictions:
– A cluster of objects should be defined with respect to only a subset of attributes
– A cluster of attributes should be defined with respect to only a subset of objects
– The clusters should not be exclusive and/or exhaustive: an object/attribute should be able to belong to more than one cluster or to no cluster at all, and be grouped using a subset of attributes/objects
Biclustering
• A bicluster is a subset of rows that exhibit similar behavior across a subset of columns, and vice-versa.
• The bicluster $A_{IJ} = (I, J)$ is a subset of rows and a subset of columns, where $I = \{i_1, \ldots, i_k\}$ is a subset of rows ($I \subseteq X$ and $k \le n$) and $J = \{j_1, \ldots, j_s\}$ is a subset of columns ($J \subseteq Y$ and $s \le m$).
• A bicluster can be defined as a k by s submatrix of matrix A.
Biclustering
• Two interpretations:
– As a two-way permutation problem: interactive reordering of rows and columns so as to produce multiple clusters in different regions of the matrix
– As a two-way partition problem: creation of submatrices so as to maximize an index that evaluates clustering properties (e.g., similarity among objects)
[Figure: a 3 × 4 data matrix (rows 1221, 2233, 1321), reordered versions of it (rows 2233, 1221, 1321 and rows 3223, 2211, 2311), and two extracted 2 × 2 submatrices (11, 11 and 12, 22) labeled rows = {1,3}, columns = {1,4} and rows = {2,3}, columns = {3,4}]
An Immune-Inspired Algorithm for Biclustering
• BIC-aiNet:
– Multi-population
– Dynamic control of the population size
– Diversity maintenance
• Encoding:
– Two ordered vectors (rows and columns)
– Each individual represents a single bicluster
[Figure: a data matrix (rows 1221, 2233, 1321), an individual encoded as Row: [2, 3], Column: [1, 3, 4], and the corresponding bicluster (shown as 121, 222)]
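The encoding described on this slide, two ordered index vectors per individual, can be sketched as follows. The function name `extract_bicluster`, the 1-based indexing convention, and the toy rating matrix are assumptions made for illustration; they are not taken from the BIC-aiNet paper.

```python
def extract_bicluster(matrix, rows, cols):
    """Return the submatrix selected by 1-based row and column indices."""
    return [[matrix[r - 1][c - 1] for c in cols] for r in rows]


# Hypothetical 3 x 4 data matrix (not the one shown on the slide).
data = [
    [5, 1, 5, 4],
    [1, 3, 2, 2],
    [4, 1, 5, 5],
]

# One individual = one bicluster: an ordered row vector and an ordered column vector.
individual = {"rows": [1, 3], "cols": [1, 3, 4]}

bicluster = extract_bicluster(data, individual["rows"], individual["cols"])
for row in bicluster:
    print(row)
```

Under this representation, mutation by insertion or deletion of rows and columns amounts to adding or removing an index in one of the two vectors.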
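BIC-aiNet
• Fitness function:
$$f(M, N) = \frac{R}{\lambda} + \frac{w_c \lambda}{|M|} + \frac{w_r \lambda}{|N|}$$
$$R = \frac{1}{|N|\,|M|} \sum_{i \in N,\; j \in M} \left( a_{ij} - r_{iJ} - r_{Ij} + r_{IJ} \right)^2$$
N, M are the sets of rows and columns, R is the residue of a bicluster, $\lambda$ is a residue threshold, $w_c$ is the weight of the number of columns, $w_r$ is the weight of the number of rows, $a_{ij}$ is the value in the original data matrix, $r_{iJ}$ ($r_{Ij}$) indicates the mean value for row (column) i (j), and $r_{IJ}$ is the mean over the whole bicluster.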
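The residue term can be illustrated with a short sketch. It assumes the usual mean squared residue definition (each entry minus its row mean, minus its column mean, plus the bicluster mean); the function name `mean_squared_residue` and the 0-based index lists are illustrative choices, and the weighting terms of the full fitness function are left out.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the bicluster given by 0-based rows and cols."""
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(sub), len(sub[0])

    # Row means, column means and overall mean inside the bicluster.
    row_means = [sum(row) / n_cols for row in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    total_mean = sum(row_means) / n_rows

    squared = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + total_mean
            squared += residue ** 2
    return squared / (n_rows * n_cols)


# Rows that differ only by a constant shift form a perfectly coherent bicluster.
coherent = [[1, 2, 3], [2, 3, 4]]
print(mean_squared_residue(coherent, [0, 1], [0, 1, 2]))

# A checkerboard-like pattern is not coherent and has a positive residue.
incoherent = [[1, 2], [2, 1]]
print(mean_squared_residue(incoherent, [0, 1], [0, 1]))
```

A low residue means the selected rows behave alike across the selected columns, which is what the fitness function rewards alongside larger row and column sets.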
The BIC-aiNet Algorithm
Mutation: insertion/deletion of rows and columns
Suppression: eliminates similar biclusters
BIC-aiNet Performance
• Application in Collaborative Filtering
– Perform automated recommendations for a user
– Input: matrix Rij, in which each entry represents the rating given by user i to item j
• Data sets used: MovieLens, Jester and Dating
– MovieLens: 100,000 ratings assigned by 943 users to 1,682 movies. Range: 1 (bad) – 5 (excellent)
– The purpose is to group users with similar interests in order to recommend a movie when a user asks for one.
• Performance Measures:
– RMSE, MAE, Accuracy, Precision
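Two of the performance measures listed above, RMSE and MAE, can be computed as in the sketch below. The function names and the sample ratings are hypothetical; they are not results from the BIC-aiNet experiments.

```python
import math


def mae(predicted, actual):
    """Mean absolute error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    return math.sqrt(
        sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    )


# Hypothetical ratings on the 1 (bad) to 5 (excellent) scale.
predicted = [4, 3, 5]
actual = [5, 3, 4]

print(mae(predicted, actual))
print(rmse(predicted, actual))
```

RMSE penalizes large rating errors more heavily than MAE, so reporting both gives a fuller picture of recommendation quality.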
BIC-aiNet Discussion
• Bi-clustering interpreted as a bipartition problem
• Possibility of using just some attributes per cluster
• Multimodal solutions
• Accurate recommendations
• An attribute may appear in more than one cluster or in none
IA-AIS: An AIS To Detect Spam
Thiago S. Guzella et al., BioSystems 92(2008), pp. 215-225
A Group from UFMG
SPAM Detection
• SPAM messages are constantly ‘evolving’, e.g.:
– free = fr33
– viagra = v1agra
– casino = casin0
– watch = w4tch
• SPAM messages can be identified by features such as the sender’s e-mail address, message subject and message body
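The obfuscations listed above can be undone with a fixed substitution table, as in the sketch below. This is not part of IA-AIS; the table `LEET_MAP` and the function `normalize_token` are hypothetical and cover only the four digit substitutions shown on the slide.

```python
# Digit-for-letter substitutions taken from the slide's examples.
LEET_MAP = str.maketrans({"0": "o", "1": "i", "3": "e", "4": "a"})


def normalize_token(token):
    """Lower-case a token and map obfuscating digits back to letters."""
    return token.lower().translate(LEET_MAP)


for word in ["fr33", "v1agra", "casin0", "w4tch"]:
    print(word, "->", normalize_token(word))
```

A static table like this breaks as soon as spammers pick a new substitution, which is the motivation for a classifier that keeps adapting to the messages it sees.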
IA-AIS: Innate and Adaptive AIS
• Combines features from Negative Selection and Clonal Selection
• Composed of Macrophages, B cells, T cells, Interactions among B and T cells (helper and regulatory)
IA-AIS: Parameters and Configurations
IA-AIS: Experimental Results
IA-AIS: Discussion
• Macrophages, B cells, T helper cells, T regulatory cells
• Incorporation of user feedback
• Considers interactions of immune cells
• Interesting alternative when high true positive values are relevant
CGbAIS: A Cluster Gradient-Based AIS to Optimal Power Flow
Leonardo M. Honorio et al., IEEE Trans. on Power Systems, in press.
Optimal Power Flow Problems
• OPF Main Features:
– Non-linear, non-convex, large-scale
– Several sets of continuous and discrete variables
• CGbAIS Main Features:
– An individual is related to a set of control variables that define a possible solution, characterized by a set of equations that describe its behavior
– The Jacobian vector associated with the solutions can be used to guide mutation
– A clustering algorithm is used to reduce computational effort
– To ensure the KKT conditions, a modified Lagrangian system is used
The Cluster Gradient-Based AIS (CGbAIS)
CGbAIS for Discrete Optimization
• An antibody is a partial path over a tree search
• Maintenance of nLocalBest clones and selection of nGlobalBest clones
• Clustering of paths with the same nodes
• Numerical information used to evolve the population:
Combinatorial Optimization with CGbAIS
Solving the Lagrangian Problem with CGbAIS
• Typical OPF Problem Formulation
$$\begin{aligned}
\text{Minimize } & f(x_s, x_c) \\
\text{Subject to } & g(x_s, x_c) = 0 \\
& h^{\min} \le h(x_s, x_c) \le h^{\max} \\
& (x_s, x_c)^{\min} \le (x_s, x_c) \le (x_s, x_c)^{\max}
\end{aligned}$$
• Augmented Lagrangian Function
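Solving the Lagrangian Problem with CGbAIS
• Karush-Kuhn-Tucker Conditions
[Equation: five conditions on the Lagrangian L, including $\partial L / \partial x = 0$ and $\partial L / \partial w = 0$; the remaining terms, which involve the variables s, are not recoverable]
• The Dual Problem
[Equation: maximize a dual function subject to a non-negativity constraint, with the dual function defined as a minimum of L(x, w); the symbols are not recoverable]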
Solving the Lagrangian Problem with CGbAIS
• Formulation Mixing the Primal and Dual Problems
$$\text{Minimize } \phi(x, w) \quad \text{with} \quad \phi(x, w) = \sum_{i=1}^{n_x} \left( \frac{\partial L}{\partial x_i} \right)^2 + \sum_{i=1}^{n_w} \left( \frac{\partial L}{\partial w_i} \right)^2$$
• Scenario 1:
– Transmission loss reduction by installing shunt compensation
– Variation of the IEEE 14-bus test system
– mFLoss: mean loss reduction
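The idea of mixing the primal and dual problems, minimizing a sum of squared partial derivatives of the Lagrangian with respect to the primal variables x and the multipliers w, can be tried on a toy problem. The sketch assumes a plain (not augmented) Lagrangian and a hypothetical one-variable problem, minimize x² subject to x − 1 = 0; the names `lagrangian_gradients` and `merit` are illustrative and do not come from the CGbAIS paper.

```python
def lagrangian_gradients(x, w):
    """Partial derivatives of L(x, w) = x**2 + w * (x - 1)."""
    dL_dx = 2.0 * x + w   # stationarity with respect to the primal variable
    dL_dw = x - 1.0       # primal feasibility (equality constraint residual)
    return [dL_dx], [dL_dw]


def merit(x, w):
    """Sum of squared partial derivatives of L over x and over w."""
    grad_x, grad_w = lagrangian_gradients(x, w)
    return sum(g * g for g in grad_x) + sum(g * g for g in grad_w)


# The merit function vanishes exactly at the constrained optimum x = 1, w = -2.
print(merit(1.0, -2.0))
# Away from the optimum it is strictly positive.
print(merit(0.0, 0.0))
```

Because the function is zero only where both sets of derivatives vanish, a population-based search can minimize it directly, treating primal variables and multipliers as one search vector. Inequality constraints and their slack variables are omitted here.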
CGbAIS: Experimental Results
[Figure: Dual Error, Primal Error and Losses (pu) versus no. of generations (1 to 16); vertical axis from 0 to 0.35]

Results without clustering
nAt   mFLoss   mGen    mPerror   mDerror   mTime
10    0.293    22.01   0.022     0.041     12.3
20    0.293    18.02   0.021     0.037     19.8
30    0.293    17.30   0.012     0.029     30.1
40    0.293    17.21   0.015     0.038     42.1
50    0.293    17.90   0.009     0.026     49.5

Results with clustering
nAt   mFLoss   mGen    mPerror   mDerror   mTime
10    0.293    16.32   0.021     0.045     4.9
20    0.293    17.01   0.015     0.033     7.1
30    0.293    15.80   0.011     0.031     9.1
40    0.293    15.20   0.018     0.039     11.1
50    0.293    22.10   0.009     0.026     14.6
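The effect of clustering on run time can be read off the mTime columns of the two tables. The sketch below assumes that the first table is the one captioned "Results without clustering" and the second the one captioned "Results with clustering", and that mTime is a mean run time in the same (unstated) unit in both; the dictionary names are illustrative.

```python
# mTime values copied from the two result tables, keyed by nAt.
time_without_clustering = {10: 12.3, 20: 19.8, 30: 30.1, 40: 42.1, 50: 49.5}
time_with_clustering = {10: 4.9, 20: 7.1, 30: 9.1, 40: 11.1, 50: 14.6}

# Ratio of mean times: values above 1 mean the clustered variant is faster.
speedup = {
    n_at: time_without_clustering[n_at] / time_with_clustering[n_at]
    for n_at in time_without_clustering
}

for n_at in sorted(speedup):
    print(f"nAt={n_at}: {speedup[n_at]:.2f}x")
```

Under these assumptions the clustered variant is more than twice as fast for every value of nAt, while mFLoss stays at 0.293 in both tables.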
CGbAIS: Discussion
• A hybridization of an AIS with clustering and numerical information to reduce computing effort and improve search robustness
• Use of a bent augmented Lagrangian
• Good results when compared with traditional interior-point methods
Automatic Grain Classification
• Actors involved:
– Producers;
– Local and global consumers;
– Cooperatives;
– Banks;
– Stock Market.
• Motivation:
– Automatic certification of quality;
– Avoid classification conflicts;
– No equivalent machine available***;
– Standardization.
Automatic Grain Classification
• Physical Classification: based on a sample of grains
– Grain quality: endogenous and exogenous defects;
– Grain size.
Examples of Grain Defects
The Grain Classifier Project
• Public Investor
• The Development Cycle:
Computer Vision
• Image Capture:
– Double face capture
• Feature Extraction:
– Color, Texture and Shape attributes
– Based on the RGB histograms
– Total of 70 attributes extracted per grain
Feature Selection and Classification
• Feature Selection:
– Filter and Wrapper
• Classification:
– Naïve Bayes
– KNN
– Support Vector Machines
– Multi-Layer Perceptrons
– aiNet+RBF
– SRABNET: Supervised RABNET
Experimental Results
• Estimating the Weight
ICS-RBF = aiNet+RBF
Experimental Results
• Classification Performance
Etr% Ete%
Classifier   ECV%    Std
MLP          8.80    1.814
SVM          10.60   3.31
k-NN         15.10   3.00
Discussion
• The immune system approach proved to be competitive
• Experiment with binary classification followed by defect classification
• Experiment with hierarchical classification
• Possibility of automating the classification of grains
Operation Planning in a Container Terminal (CONTER)
NatComp – From Nature to Business
The Importance of Container Terminals
• Most world commerce is carried out using containers
• The operation of a CONTER is a very complicated and challenging task, involving space and equipment constraints, short time spans for ship docking, pre-specified ship plans, customs procedures, etc.
A Typical Problem: Scheduling RTGs
• When a Ship Plan is received in the terminal, the operators have to find the selected containers and load them onto the ship.
• RTGs (Rubber Tyred Gantry Cranes) are typical container handling equipment and move in three directions: x, y, and z.
• The fewer movements the RTGs make, the faster and cheaper the ship loading becomes.
RTG Movements: Productive
[Figure: six panels showing productive RTG movements]
RTG Movements: Unproductive (Set-Up)
[Figure: four panels, (a) to (d), showing unproductive (set-up) RTG movements]
Cost to Remove Containers
$$Q_i = \frac{l_x n_x}{V_x} + \frac{l_y n_y}{V_y} + \frac{l_z n_z}{V_z} + t_T n_T$$
$$C = \sum_{i=1}^{N} Q_i$$
where $n_x$, $n_y$ and $n_z$ are the numbers of movements in directions X, Y and Z, respectively; $V_x$, $V_y$ and $V_z$ are the RTG velocities in directions X, Y and Z, respectively; $t_T$ is the time spent to lock or unlock the spreader; and $n_T$ is the number of spreader locking/unlocking operations.
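The cost model can be turned into a small calculation. The sketch assumes that each container's cost is the travel time in each direction (number of movements times the length of one movement, divided by the velocity) plus the spreader locking time; the per-movement lengths are not defined on the slide, so treating them as distances per movement is an assumption, and the names `Removal`, `removal_cost` and `total_cost` as well as all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Removal:
    """Movement counts needed to remove one container."""
    n_x: int  # movements in direction X
    n_y: int  # movements in direction Y
    n_z: int  # movements in direction Z
    n_t: int  # spreader lock/unlock operations


def removal_cost(removal, lengths, speeds, t_lock):
    """Time cost of one removal: travel time per axis plus locking time."""
    l_x, l_y, l_z = lengths   # assumed: distance covered by one movement
    v_x, v_y, v_z = speeds    # RTG velocity along each axis
    return (l_x * removal.n_x / v_x
            + l_y * removal.n_y / v_y
            + l_z * removal.n_z / v_z
            + t_lock * removal.n_t)


def total_cost(removals, lengths, speeds, t_lock):
    """Total cost: sum of the costs of all removals in the plan."""
    return sum(removal_cost(r, lengths, speeds, t_lock) for r in removals)


# Hypothetical plan with two containers and made-up crane parameters.
plan = [Removal(n_x=2, n_y=1, n_z=2, n_t=2), Removal(n_x=1, n_y=0, n_z=2, n_t=2)]
lengths = (6.0, 3.0, 2.0)
speeds = (2.0, 1.0, 0.5)
print(total_cost(plan, lengths, speeds, t_lock=5.0))
```

A scheduling algorithm would compare alternative removal orders by evaluating this total for each candidate plan and preferring the cheapest.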
The copt-aiNet Algorithm
A Demo on the RTG Scheduling Problem
Discussion
• Vast number of applications available
• Great potential for further applications and developments
• Some issues that still deserve investigation:
– Formal aspects
– Comparison (theoretical and empirical) with other approaches
– Loads of testing
– Real benefits (Are they really useful?)
– Danger theory
– How far to stretch the metaphor?
– Scalability
– Robustness to high dimensions
Discussion
• Current Issues at Mackenzie and NatComp
– An optimal clustering algorithm
– AIS applied to recommender systems
– AIS applied to intelligent virtual environments
– AIS applied to virtual simulations for training purposes
THANK YOU FOR THE ATTENTION!