Real-valued negative selection algorithms Zhou Ji 11-2-2005.
Outline
Background
Variations of real-valued selection algorithms
More details through an example: V-detector
Demonstration
Background: AIS
AIS (Artificial Immune Systems): only about 10 years' history
Negative selection (development of T cells)
Immune network theory (how B cells and antibodies interact with each other)
Clonal selection (how a pool of B cells, especially memory cells, is developed)
New inspirations from immunology: danger theory, germinal center, etc.
Negative selection algorithms: the earliest and most widely used AIS.
Biological metaphor of negative selection
How T cells mature in the thymus:
The cells are diversified.
Those that recognize self are eliminated.
The rest are used to recognize nonself.
The idea of negative selection algorithms (NSA)
The problem to deal with: anomaly detection (or one-class classification)
Detector set:
random generation: maintain diversity
censoring: eliminate those that match self samples
The concept of feature space and detectors
Outline of a typical NSA
Generation of detector set
Anomaly detection: classification of incoming data items
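The two phases above can be sketched in a few lines. This is an illustrative outline, not the exact procedure of any one paper: the Euclidean matching rule, the unit-square feature space, and the fixed matching threshold are assumptions for the sketch.

```python
import math
import random

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def generate_detectors(self_samples, n_detectors, self_radius, dim=2):
    """Phase 1: random generation with censoring. Keep only candidates
    that do not match any self sample."""
    detectors = []
    while len(detectors) < n_detectors:
        candidate = [random.random() for _ in range(dim)]
        if all(euclidean(candidate, s) > self_radius for s in self_samples):
            detectors.append(candidate)
    return detectors

def classify(item, detectors, match_threshold):
    """Phase 2: an incoming item matched by any detector is an anomaly."""
    return any(euclidean(item, d) <= match_threshold for d in detectors)

self_samples = [[0.5, 0.5], [0.52, 0.48]]
detectors = generate_detectors(self_samples, 20, 0.1)
is_anomaly = classify([0.9, 0.1], detectors, 0.1)
```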
Family of NSA
Types of works about NSA:
Applications: solving real-world problems by using a typical version or adapting it for specific applications
Improving NSA: new detector schemes and generation methods, and analysis of existing methods. These works are data-representation specific, mostly binary representation.
Establishing a framework for binary representation to include various matching rules; discussing the uniqueness and usefulness of NSA; introducing new concepts.
What defines a negative selection algorithm?
Representation in negative space
One-class learning
Usage of detector set
Data representation in NSA
Different representations vs. different search spaces
Various representations:
Binary
String over a finite alphabet: no fundamental difference from binary
Real-valued vector
Hybrid
Different distance measures
Data representation is not the only factor that makes a scheme different
Real-valued NSA
Why is real-valued NSA different from binary NSA?
Hard to analyze: simple combinatorics does not work
Necessary and proper for many real applications: binary representation may decouple the relation between feature space and representation
Is categorization based on data representation a good way to understand and develop NSA?
Major issues in NSA
Number of detectors: affects the efficiency of generation and detection
Detector coverage: affects the accuracy of detection
Generation mechanisms: affect the efficiency of generation and the quality of the resulting detectors
Matching rules (generalization): how to interpret the training data; depends on the feature space and representation scheme
Issues that are not NSA-specific: the difficulty of one-class classification; the curse of dimensionality
Variations of real-valued NSA
Rectangular detectors generated with GA
Circular detectors that move and change size
MILA (multilevel immune learning algorithm)
Rectangular detectors + GA
Rectangular detectors: "rules" of value ranges
Generated by a typical genetic algorithm
By Gonzalez, Dasgupta
Circular detectors (hyperspheres)
From constant size to variable size
Moving after initial generation:
reduce overlap
"artificial annealing"
By Dasgupta, KrishnaKumar et al.
By Dasgupta, Gonzalez
MILA
Multilevel: to capture local patterns and global patterns
Negative selection + positive selection
Euclidean distance on sub-spaces
For example, suppose a self string is <s1, s2, …, sL> and the window size is chosen as 3; then the self peptide strings can be <s1, s3, sL>, <s2, s4, s9>, <s5, s7, s8>, and so on, obtained by randomly picking the attributes at some positions.
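The random attribute-picking in this example can be sketched as follows; the helper name and the use of sorted positions are illustrative assumptions, not MILA's actual code.

```python
import random

def random_subspaces(s, window_size, count):
    """Randomly pick `count` attribute combinations of length `window_size`,
    keeping the chosen positions in ascending order as in the example."""
    result = []
    for _ in range(count):
        positions = sorted(random.sample(range(len(s)), window_size))
        result.append(tuple(s[i] for i in positions))
    return result

s = ["s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8", "s9"]
peptides = random_subspaces(s, 3, 3)  # e.g. three random 3-attribute windows
```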
V-detector
V-detector is a new negative selection algorithm.
It builds on a series of related works to develop a more efficient and more reliable algorithm.
It has a unique process to generate detectors and determine coverage.
V-detector's major features
Variable-sized detectors
Statistical confidence in detector coverage
Boundary-aware algorithm
Extensibility
In real-valued representation, a detector can be visualized as a hypersphere. Candidate 1: thrown away; candidate 2: made a detector.
Match or not match?
Variable-sized detectors in the V-detector method are "maximized detectors"
Unanswered question: what is the self space?
Traditional detectors: constant size. V-detector: maximized size.
Why is the idea of "variable-sized detectors" novel?
The rationale of constant size: a uniform matching threshold
Detectors of variable size exist in some negative selection algorithms as a different mechanism:
allowing multiple or evolving sizes to optimize the coverage, limited by the concern of overlap
variable size as part of the random properties of detectors/candidates
V-detector uses variable-sized detectors to maximize the coverage with a limited number of detectors:
size is decided by the training data
large nonself regions are covered easily
small detectors cover "holes"
overlap is not an issue in V-detector
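A minimal sketch of the maximization rule described above, assuming Euclidean distance: a candidate inside the self region is censored, and an accepted candidate's radius grows to the distance of its nearest self sample minus the self radius. The helper name is hypothetical.

```python
import math

def make_detector(candidate, self_samples, self_radius):
    """Return (center, radius) for a maximized detector, or None when the
    candidate matches self (lies within self_radius of a self sample)."""
    d = min(math.dist(candidate, s) for s in self_samples)
    if d <= self_radius:
        return None  # censored: too close to the self region
    return (candidate, d - self_radius)  # grown until it touches self

self_samples = [(0.5, 0.5)]
detector = make_detector((0.9, 0.5), self_samples, 0.1)  # radius about 0.3
```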
Statistical estimate of detector coverage
Existing works estimate the necessary number of detectors; there is no direct relationship between the estimate and the actual detector set obtained.
Novelty of V-detector:
evaluates the coverage of the actual detector set
statistical inference is used as an integrated component of the detector generation algorithm, not to estimate the coverage of a finished detector set
Basic idea leading to the new estimation mechanism
Random points are taken as detector candidates. The probability that a random point falls on the covered region (some existing detectors) reflects the portion that is covered, similar to the idea of Monte Carlo integration.
Proportion of covered nonself space = probability of a sample point being a covered point (points in the self region are not counted).
When more of the nonself space has been covered, it becomes less likely that a sample point is uncovered. In other words, we need to try more random points to find an uncovered one, that is, one that can be used to make a detector.
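The sampling idea can be sketched as a small Monte Carlo routine (illustrative, assuming a 2-D unit-square feature space and hyperspherical detectors): points falling in the self region are skipped, and the fraction of the remaining points that land inside existing detectors estimates the covered proportion of nonself space.

```python
import math
import random

def estimate_coverage(detectors, self_samples, self_radius, n_points=1000):
    """Monte Carlo estimate of the covered proportion of nonself space.
    `detectors` is a list of (center, radius) pairs."""
    covered = tried = 0
    while tried < n_points:
        p = (random.random(), random.random())
        if any(math.dist(p, s) <= self_radius for s in self_samples):
            continue  # points in the self region are not counted
        tried += 1
        if any(math.dist(p, c) <= r for c, r in detectors):
            covered += 1
    return covered / n_points

detectors = [((0.2, 0.2), 0.15), ((0.8, 0.8), 0.2)]
coverage = estimate_coverage(detectors, [(0.5, 0.5)], 0.1)
```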
Statistics involved
Central limit theorem: the sample statistic follows a normal distribution
Using a sample statistic to estimate a population parameter
In our application, the proportion of covered random points is used to estimate the actual proportion of covered area
Statistical inference
Point estimate versus confidence interval
Estimate with confidence interval versus hypothesis testing:
a proportion close to 100% will make the central-limit-theorem assumption invalid (not a normal distribution)
the purpose is to terminate the detector generation
Hypothesis testing
Identify the null hypothesis and the alternative hypothesis.
Type I error: falsely rejecting the null hypothesis
Type II error: falsely accepting the null hypothesis
The null hypothesis is the statement that we'd rather take as true if there is not strong enough evidence showing otherwise. In other words, we consider a Type I error more costly.
In terms of coverage estimation, we consider falsely claiming adequate coverage more costly. So the null hypothesis is: the current coverage is below the target coverage.
Choose a significance level: the maximum probability we are willing to accept of making a Type I error.
Collect a sample and compute its statistic, in this case the proportion.
Calculate the z score from the proportion and compare it with the critical value z_alpha. If z is larger, we can reject the null hypothesis and claim adequate coverage with confidence.
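The steps above amount to a standard one-sided z-test for a proportion. A sketch, with the target coverage and significance level as example values:

```python
import math
from statistics import NormalDist

def adequate_coverage(covered, n, target=0.99, alpha=0.05):
    """One-sided z-test on a proportion. Null hypothesis: the current
    coverage is below the target; rejecting it lets us claim adequate
    coverage at significance level alpha."""
    p_hat = covered / n
    z = (p_hat - target) / math.sqrt(target * (1 - target) / n)
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # critical value
    return z > z_alpha

print(adequate_coverage(998, 1000))  # True: strong evidence of 99% coverage
```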
Boundary-aware algorithm versus point-wise interpretation
A new concept in negative selection algorithms
Previous works of NSA:
the matching threshold is used as a mechanism to control the extent of generalization
however, each self sample is used individually; the continuous area represented by a group of samples is not captured (point-wise interpretation)
More specificity: relatively more aggressive in detecting anomalies
More generalization: the real boundary is extended
Desired interpretation: the area represented by the group of points
Boundary-aware: using the training points as a collection
Boundary-aware algorithm: a "clustering" mechanism, though represented in negative space
The training data are used as a collection instead of individually.
Positive selection cannot do the same thing.
V-detector is more than a real-valued negative selection algorithm
V-detector can be implemented for any data representation and distance measure.
Usually, negative selection algorithms were designed with a specific data representation and distance measure.
The features just introduced are not limited by the representation scheme or generation mechanism (as long as we have a distance measure and a threshold to decide matching).
Contribution: V-detector algorithm with confidence in detector coverage
V-detector's advantages
Efficiency: fewer detectors; fast generation
Coverage confidence
Extensibility, simplicity
Experiments
A large pool of synthetic data (2-D real space) was used to understand V-detector's behavior
More detailed analysis of the influence of various parameters is planned as "work to do"
Real-world data:
confirm it works well enough to detect real-world "anomalies"
compare with methods dealing with similar problems
Demonstration:
how actual training data and detectors look
basic UI and visualization of the V-detector implementation
Parameters to evaluate its performance
Detection rate
False alarm rate
Number of detectors
Control parameters and algorithm variations
Self radius: the key parameter
Target coverage
Significance level (of hypothesis testing)
Boundary-aware versus point-wise
Hypothesis testing versus naïve estimate
Reuse of random points versus minimum detector set (to be implemented)
Data's influence on performance
Specific shape: intuitively, "corners" will affect the results.
Number of training points: the major influence.
Experiments on 2-D synthetic data
Training points (1000); test data (1000 points) and the "real shape" we try to learn
Detector sets generated
Trained with 1000 points; trained with 100 points
Synthetic data ("intersection" and pentagram): compare naïve estimate and hypothesis testing
"Intersection" shape; pentagram
Synthetic data: results for different shapes of self region
Synthetic data (ring): compare boundary-aware and point-wise
(Figures: detection rate and false alarm rate versus self threshold, comparing point-wise and boundary-aware.)
Synthetic data (cross-shaped self): balance of errors
(Figure: error rate (percentage) versus self radius; false negatives and false positives at 99% coverage.)
Real-world data
Biomedical data
Pollution data
Ball bearing: preprocessed time series data
Others: Iris data, gene data, India Telugu
Results of biomedical data

Training data   Algorithm   Detection rate (mean / SD)   False alarm rate (mean / SD)   Number of detectors (mean / SD)
100% training   MILA        59.07 / 3.85                 0 / 0                          1000* / 0
100% training   NSA         69.36 / 2.67                 0 / 0                          1000 / 0
100% training   r = 0.1     30.61 / 3.04                 0 / 0                          21.52 / 7.29
100% training   r = 0.05    40.51 / 3.92                 0 / 0                          14.84 / 5.14
50% training    MILA        61.61 / 3.82                 2.43 / 0.43                    1000* / 0
50% training    NSA         72.29 / 2.63                 2.94 / 0.21                    1000 / 0
50% training    r = 0.1     32.92 / 2.35                 0.61 / 0.31                    15.51 / 4.85
50% training    r = 0.05    42.89 / 3.83                 1.07 / 0.49                    12.28 / 4
25% training    MILA        80.47 / 2.80                 14.93 / 2.08                   1000* / 0
25% training    NSA         86.96 / 2.72                 19.50 / 2.05                   1000 / 0
25% training    r = 0.1     43.68 / 4.25                 1.24 / 0.5                     12.24 / 3.97
25% training    r = 0.05    57.97 / 5.86                 2.63 / 0.77                    8.94 / 2.57
Results of air pollution data
(Figures: detection rate and false alarm rate versus self radius at 99% and 99.99% coverage; number of detectors versus self radius.)
Ball bearing's structure and damage
(Figure: damaged cage.)
Ball bearing data
Raw data: time series of acceleration measurements
Preprocessing (from time domain to representation space for detection):
1. FFT (Fast Fourier Transform) with Hanning windowing: window size 32
2. Statistical moments: up to 5th order
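The second preprocessing route can be sketched in plain Python; whether the slides use raw, central, or standardized moments is not stated, so central moments here are an assumption.

```python
def moment_features(signal, order=5):
    """Represent a time-series window by its mean and central moments
    up to the given order (central moments are an assumed choice)."""
    n = len(signal)
    mean = sum(signal) / n
    feats = [mean]
    for k in range(2, order + 1):
        feats.append(sum((x - mean) ** k for x in signal) / n)
    return feats

print(moment_features([1.0, 2.0, 3.0, 4.0]))  # [2.5, 1.25, 0.0, 2.5625, 0.0]
```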
(Figure: example of raw data, new bearings, first 1000 points.)
Ball bearing data: results

Preprocessed with FFT:

Ball bearing condition                Total data points   Detected anomalies   Percentage detected
New bearing (normal)                  2739                0                    0%
Outer race completely broken          2241                2182                 97.37%
Damaged cage with one loose element   2988                577                  19.31%
Damaged cage, four loose elements     2988                337                  11.28%
No evident damage; badly worn         2988                209                  6.99%

Preprocessed with statistical moments:

Ball bearing condition                Total data points   Detected anomalies   Percentage detected
New bearing (normal)                  2651                0                    0%
Outer race completely broken          2169                1674                 77.18%
Damaged cage with one loose element   2892                14                   0.48%
Damaged cage, four loose elements     2892                0                    0%
No evident damage; badly worn         2892                0                    0%
Ball bearing experiments with two different preprocessing techniques
Results of Iris data

Normal class      Algorithm            Detection rate   False alarm rate
Setosa 100%       MILA                 95.16            0
Setosa 100%       NSA (single level)   100              0
Setosa 100%       V-detector           99.98            0
Setosa 50%        MILA                 94.02            8.42
Setosa 50%        NSA (single level)   100              11.18
Setosa 50%        V-detector           99.97            1.32
Versicolor 100%   MILA                 84.37            0
Versicolor 100%   NSA (single level)   95.67            0
Versicolor 100%   V-detector           85.95            0
Versicolor 50%    MILA                 84.46            19.6
Versicolor 50%    NSA (single level)   96               22.2
Versicolor 50%    V-detector           88.3             8.42
Virginica 100%    MILA                 75.75            0
Virginica 100%    NSA (single level)   92.51            0
Virginica 100%    V-detector           81.87            0
Virginica 50%     MILA                 88.96            24.98
Virginica 50%     NSA (single level)   97.18            33.26
Virginica 50%     V-detector           93.58            13.18
Iris data: number of detectors

                  Mean     Max   Min   SD
Setosa 100%       20       42    5     7.87
Setosa 50%        16.44    33    5     5.63
Versicolor 100%   153.24   255   72    38.8
Versicolor 50%    110.08   184   60    22.61
Virginica 100%    218.36   443   78    66.11
Virginica 50%     108.12   203   46    30.74
Conclusions
Real-valued NSA has unique advantages and difficulties.
A good NSA should not be limited by differences in data representation.
A "killer application" is needed to support the necessity of NSA, as with many other "soft computation" paradigms.
Compare with other methods; in the case of NSA, with other one-class classifiers, e.g. one-class SVM.
A good representation scheme and distance measure play a very important role in performance, more important than algorithm variations in many cases.
References
S. Forrest, A. S. Perelson, L. Allen, and R. Cherukuri. Self-nonself discrimination in a computer. In Proc. of the IEEE Symposium on Research in Security and Privacy, IEEE Computer Society Press, Los Alamitos, CA, pp. 202–212, 1994.
D. Dasgupta and F. Gonzalez. An Immunity-Based Technique to Characterize Intrusions in Computer Networks. IEEE Transactions on Evolutionary Computation, Volume 6, Issue 3, pp. 281–291, June 2002.
F. Gonzalez, D. Dasgupta, and L. F. Nino. A Randomized Real-Valued Negative Selection Algorithm. In the proceedings of the 2nd International Conference on Artificial Immune Systems, UK, September 1-3, 2003.
D. Dasgupta, S. Yu, and N. S. Majumdar. MILA - Multilevel Immune Learning Algorithm. In the proceedings of the Genetic and Evolutionary Computation Conference (GECCO), Chicago, July 12-16, 2003.
Dasgupta, Ji, Gonzalez. Artificial immune system (AIS) research in the last five years. CEC 2003.
Ji, Dasgupta. Augmented negative selection algorithm with variable-coverage detectors. CEC 2004.
D. Dasgupta, K. KrishnaKumar, D. Wong, M. Berry. Negative Selection Algorithm for Aircraft Fault Detection. 3rd International Conference on Artificial Immune Systems, Catania, Sicily (Italy), September 13-16, 2004.
Ji, Dasgupta. Real-valued negative selection algorithm with variable-sized detectors. GECCO 2004.
Simon M. Garrett. How do we evaluate artificial immune systems? Evolutionary Computation, 13(2):145–178, 2005.
Ji, Dasgupta. Estimating the detector coverage in a negative selection algorithm. GECCO 2005.
Ji. A boundary-aware negative selection algorithm. ASC 2005.
Ji, Dasgupta. Revisiting negative selection algorithms. Submitted to the Evolutionary Computation Journal.
Ji, Dasgupta. An efficient negative selection algorithm of "probably adequate" coverage. Submitted to SMC.
Questions?
Thank you!
What is a matching rule?
A matching rule determines when a sample and a detector are considered matching.
The matching rule plays an important role in a negative selection algorithm. It largely depends on the data representation.
Experiments and Results
Synthetic data: 2-D; training data are randomly chosen from the normal region.
Fisher's Iris data: one of the three types is considered "normal".
Biomedical data: abnormal data are the medical measurements of disease-carrier patients.
Air pollution data: abnormal data are made by artificially altering the normal air measurements.
Ball bearings: time series measurements with preprocessing, 30-D and 5-D.
Synthetic data - cross-shaped self space
Shape of self region and example detector coverage
(a) Actual self space (b) self radius = 0.05 (c) self radius = 0.1
Synthetic data - cross-shaped self space: results
(Figures: detection rate and false alarm rate versus self radius at 99% and 99.99% coverage; number of detectors versus self radius.)
Synthetic data - Ring-shaped self space
Shape of self region and example detector coverage
(a) Actual self space (b) self radius = 0.05 (c) self radius = 0.1
Synthetic data - Ring-shaped self space: Results
[Charts: detection rate and false alarm rate vs. self radius (0.01-0.19) at 99% and 99.99% estimated coverage; number of detectors vs. self radius at 99% and 99.99% estimated coverage]
Iris data: Virginica as normal, 50% of points used to train
[Charts: detection rate and false alarm rate vs. self radius (0.01-0.19) at 99% and 99.99% estimated coverage; number of detectors vs. self radius at 99% and 99.99% estimated coverage]
Biomedical data
Blood measures for a group of 209 patients.
Each patient has four different types of measurement.
75 patients are carriers of a rare genetic disorder; the others are normal.
Biomedical data: Results
[Charts: detection rate and false alarm rate vs. self radius (0.01-0.19) at 99% and 99.99% estimated coverage; number of detectors vs. self radius at 99% and 99.99% estimated coverage]
Air pollution data
60 original records in total.
Each record consists of 16 different measurements concerning air pollution.
All the real data are considered normal.
More data are made artificially:
1. Decide the normal range of each of the 16 measurements.
2. Randomly choose a real record.
3. Change three randomly chosen measurements within a larger-than-normal range.
4. If some of the changed measurements are out of range, the record is considered abnormal; otherwise it is considered normal.
In total 1000 records, including the original 60, are used as test data. The original 60 are used as training data.
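The generation procedure above can be sketched as follows. The function name and the `widen` factor are assumptions for illustration; the talk does not specify how much larger than the normal range the altered values may be.

```python
import random

def make_test_record(real_records, normal_ranges, widen=1.5):
    """Sketch of the synthetic-record procedure for the air pollution
    data. `normal_ranges` is a list of (low, high) pairs, one per
    measurement; `widen` (assumed) enlarges the sampling range so some
    altered values fall outside the normal range.
    """
    record = list(random.choice(real_records))        # pick a real record
    for i in random.sample(range(len(record)), 3):    # alter 3 measurements
        low, high = normal_ranges[i]
        mid, half = (low + high) / 2, (high - low) / 2 * widen
        record[i] = random.uniform(mid - half, mid + half)
    # abnormal if any value ends up outside its normal range
    abnormal = any(not (normal_ranges[i][0] <= v <= normal_ranges[i][1])
                   for i, v in enumerate(record))
    return record, abnormal

random.seed(0)
rec, abnormal = make_test_record([[0.5] * 16], [(0.0, 1.0)] * 16)
```

With `widen > 1`, roughly a third of the generated records land outside some normal range and are labeled abnormal, which matches the stated goal of producing both normal and abnormal test data from real records.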
Example of data (FFT of new bearings) --- first 3 coefficients of the first 100 points
[Chart: coefficients 1-3 over the first 100 points]
Example of data (statistical moments of new bearings) --- moments up to 3rd order of the first 100 points
[Chart: 1st-, 2nd-, and 3rd-order moments over the first 100 points]
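The two preprocessing ideas shown in the bearing examples, DFT coefficient magnitudes and low-order statistical moments of a time-series window, can be sketched as below. The feature counts here are illustrative (the talk used 30D and 5D feature vectors), and the function name is an assumption.

```python
import cmath
import statistics

def features(window, n_fft=3, n_moments=3):
    """Illustrative time-series preprocessing: magnitudes of the first
    few DFT coefficients plus central moments up to a given order.
    """
    n = len(window)
    # magnitudes of the first n_fft DFT coefficients
    fft = [abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                   for t, x in enumerate(window)))
           for k in range(n_fft)]
    # central moments up to n_moments-th order
    mu = statistics.fmean(window)
    moments = [sum((x - mu) ** p for x in window) / n
               for p in range(1, n_moments + 1)]
    return fft + moments

# A pure oscillation concentrates energy in one DFT coefficient:
vec = features([0.0, 1.0, 0.0, -1.0])
```

Each window of the raw vibration signal is reduced this way to a short real-valued vector, which is what the negative selection algorithm then operates on.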
How much one sample tells
Samples may be on the boundary.
In terms of detectors
Comparing three methods
Constant-sized detectors | V-detector | New algorithm
Self radius = 0.05
Comparing three methods
Constant-sized detectors | V-detector | New algorithm
Self radius = 0.1