Post on 10-May-2015
Anomaly Detection Using The Cortical Learning Algorithm
Subutai Ahmad, subutai@numenta.org
Source: mimobaby.com
Lindsay Lohan
Three Topics
• What is “Anomaly Detection”?
• How is the anomaly score computed in NuPIC/CLA today?
• How is the anomaly score used in the product Grok?
+ sample code!
Spatial (Static) Anomalies
Temporal Anomalies
Windmill Gear Bearing Temperature
Anomalies In Random Behavior
“Temporary” Anomalies
Anomaly Detection
• Anomalies are any significant deviation from normal behavior
• Anomaly detection is valuable
• Anomaly detection is hard: there are many flavors
– Spatial anomalies
– Temporal anomalies
– Anomalies in random data
– “Temporary” anomalies
– Etc.
The Anomaly Score In NuPIC
• NuPIC implements anomaly scoring for streaming datasets
• Core feature of the OPF (Online Prediction Framework)
– Use inferenceType = TemporalAnomaly
– Outputs an anomaly score between 0 and 1 for every data point
• Detects spatial and temporal anomalies
• Continuously learning online system
• Works for numerical and categorical data
Computing Anomaly Score
Data (Time of Day, Sensor Value) --> Encoders --> Spatial Pooler --> Temporal Pooler --> Predictions
(The Spatial Pooler and Temporal Pooler together form the CLA.)
• CLA constantly learns common spatial patterns and temporal sequences in the stream of inputs
• At each time step the Temporal Pooler makes multiple predictions about what might come next
• Anomaly Score =
– 0 if the current value was predicted
– 1 if the value was totally unpredicted
– between 0 and 1 if similar to a predicted value
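The scoring rule above can be sketched in a few lines of Python. This is a minimal illustration, not NuPIC's API: it assumes that the current input and the predictions are both represented as sets of active column indices, and that "similar" is measured by the fraction of active columns that were predicted. The function name `raw_anomaly_score` and the example sets are hypothetical.

```python
def raw_anomaly_score(active_columns, predicted_columns):
    """Fraction of currently active columns that were NOT predicted.

    Assumption: both arguments are collections of column indices, and
    predicted_columns is the union of all predictions made at the
    previous time step.
    """
    active = set(active_columns)
    if not active:
        # Nothing is active, so nothing was unexpected.
        return 0.0
    predicted = set(predicted_columns)
    return 1.0 - len(active & predicted) / len(active)


predicted = {1, 2, 3, 4}
print(raw_anomaly_score({1, 2, 3, 4}, predicted))  # 0.0: fully predicted
print(raw_anomaly_score({5, 6, 7, 8}, predicted))  # 1.0: totally unpredicted
print(raw_anomaly_score({1, 2, 7, 8}, predicted))  # 0.5: half of the columns predicted
```

Because the score depends only on the overlap between encoded representations, two raw values that encode to overlapping columns yield an intermediate score.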
Artificial Example
Sequence: A B A B A C A B A D A _
• B, C, or D occurs:
– Anomaly score = 0
• E occurs:
– Completely different from B, C, or D --> anomaly score = 1
– Similar to B, C, or D --> score will be between 0 and 1
– “Similar” means “similar after encoding”
• If A -> E repeats:
– Anomaly score will drop to 0
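The artificial example can be reproduced with a toy stand-in for the CLA. The sketch below is not the CLA: it assumes a hypothetical first-order model (`ToySequenceMemory`) that only remembers which symbols have followed each symbol. It scores exact matches only, so it cannot produce the in-between scores that come from similarity after encoding.

```python
from collections import defaultdict


class ToySequenceMemory:
    """Hypothetical first-order stand-in for the CLA (exact matches only)."""

    def __init__(self):
        self.successors = defaultdict(set)  # symbol -> symbols seen after it
        self.previous = None

    def score(self, symbol):
        """Anomaly score for `symbol` arriving next, without learning it."""
        if self.previous is None:
            return 0.0  # no context yet, so nothing to compare against
        return 0.0 if symbol in self.successors[self.previous] else 1.0

    def step(self, symbol):
        """Score `symbol`, then learn the transition (online learning)."""
        result = self.score(symbol)
        if self.previous is not None:
            self.successors[self.previous].add(symbol)
        self.previous = symbol
        return result


memory = ToySequenceMemory()
for symbol in "ABABACABADA":
    memory.step(symbol)

print([memory.score(s) for s in "BCD"])  # [0.0, 0.0, 0.0]: each was seen after A
print(memory.score("E"))                 # 1.0: never seen after A
first = memory.step("E")                 # A -> E occurs for the first time
memory.step("A")
print(first, memory.step("E"))           # 1.0 0.0: A -> E has now been learned
```

In this toy model a single repetition is enough for the score to drop to 0; how many repetitions the real CLA needs is not stated in the slides.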
Example: Anomalous CPU Usage
Example: Heater Temperature
[Figure labels: “Unusual temporal behavior”, “Unusually low readings”, “Anomaly score”]
Example: Change In Randomness
Sample Code
• Sample code and datasets for running anomaly detection are available:
– https://github.com/subutai/nupic.subutai/run_anomaly
|-- README.md
|-- data
| |-- art_load_balancer_spikes.csv
| |-- cpu_5f553.csv
| |-- cpu_825cc.csv
| |-- cpu_cc0c5.csv
| `-- rds_connections.csv
|-- model_params.py
|-- run_all.sh
`-- run_anomaly.py
Grok
• Define what to monitor
• Grok ingests streaming data
• Builds models automatically
• Continuously learns
• Adapts to changes
• Visualize likelihood of unusual behavior
• See metrics and data
• Prevent downtime
Use Case: Sudden Changes, Slow Changes
Use Case: Subtle Changes
What Have We Learned From Grok?
• Anomaly detection is extremely useful
• Real-world data is really, really noisy!
– We will never build a perfect predictive model
• There’s no way to set a threshold on the anomaly score
– A high anomaly score is not necessarily bad
– Random stuff happens normally
• Visually you can see a qualitative change in the anomaly scores
• In Grok we detect the change in the anomaly score itself
– Compute a likelihood that the predictability of the data has changed
Anomaly Likelihood In Grok
1. For each new data point compute anomaly score using OPF
2. Estimate the probability distribution of historical anomaly scores
3. Compute the likelihood that the recent anomaly scores come from the same distribution as the historical anomaly scores
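A minimal sketch of these three steps, under stated assumptions: the historical scores are modeled as a Gaussian, the recent scores are summarized by their mean over a short window, and the likelihood is reported as one minus the Gaussian tail probability of that mean. The function name `anomaly_likelihood`, the window size, and the standard-deviation floor are hypothetical choices for illustration. This is not the code from the repository.

```python
import math
from statistics import mean, pstdev


def anomaly_likelihood(scores, recent_window=10, min_std=1e-3):
    """Likelihood in [0, 1] that recent anomaly scores are unusually high.

    Assumptions (illustrative only): historical scores are modeled as a
    Gaussian and recent scores are summarized by their mean.
    """
    if len(scores) <= recent_window:
        return 0.5  # not enough history to say anything either way
    history = scores[:-recent_window]
    recent = scores[-recent_window:]

    # Step 2: estimate the distribution of historical anomaly scores.
    mu = mean(history)
    sigma = max(pstdev(history), min_std)  # floor avoids division by zero

    # Step 3: tail probability of a recent mean at least this high.
    z = (mean(recent) - mu) / sigma
    tail_probability = 0.5 * math.erfc(z / math.sqrt(2))
    return 1.0 - tail_probability


# Step 1 (computing the scores with the OPF) is replaced here by made-up scores.
quiet = [0.1, 0.2, 0.15, 0.05, 0.1] * 20       # noisy but stable history
print(anomaly_likelihood(quiet + [0.1] * 10))  # recent scores resemble history
print(anomaly_likelihood(quiet + [0.9] * 10))  # recent scores jumped
```

The first call returns a value below 0.5 and the second a value very close to 1, so a fixed cutoff on the likelihood is more meaningful than a cutoff on the raw anomaly score, which depends on how noisy each stream is.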
Example: Anomaly Score
Example: Likelihood Score
Example: Change In Randomness
Use Case: Changes in Randomness
Windmill Gear Bearing Temperature
Anomaly Likelihood Code
• The anomaly likelihood scheme has proven critical to making the anomaly score useful in a practical application
• We are making the Anomaly Likelihood code available:
– https://github.com/subutai/nupic.subutai/run_anomaly
• Self-contained function right now
– It might be useful to look at, but it is not in an easy-to-use form yet!
– Plan to create better sample code and then perhaps integrate it into the OPF
What About Swarming?
• Swarming is an automated parameter selection scheme in NuPIC
– Runs hundreds of models with unique parameter combinations
– Selects the best field combinations and parameters
• In Grok we use a single pre-swarmed parameter set
– Fixed set of fields (timestamp + value)
– Data fed in every 5 minutes
– Works very well across different data streams with above characteristics
• In general you will still need to swarm
– Great set of tutorials online put together by Matt
– But the system is relatively insensitive to small parameter changes, so you may not need to swarm too often
Where Do We Go Next?
• The CLA is proving to be excellent at detecting anomalies in datasets we’ve tried so far
– Fully automated - no parameter tuning in Grok!
• We’ve learned a lot in the process of creating the product
– We’d like to spread the ideas to the community
• It’s clear we’re just scratching the surface
Benchmark For Streaming Anomaly Detection
• Hard to find good anomaly detection benchmarks for streaming data
• We’ve decided to create a dataset and testing methodology focused on streaming data and anomaly detection
– Model real time online streaming data sources
– Emphasis will be on temporal streaming data, automation, and continuous learning
– Well defined methodology for evaluating algorithms
– Baseline results using CLA
• We’re hoping it will be useful to the NuPIC community as we continue to push the boundaries
– Please see Ian Danforth or me if you’re interested
Resources
• Read “The Science of Anomaly Detection” whitepaper on numenta.com
• Github repository containing sample code, anomaly likelihood algorithm, and data:
– https://github.com/subutai/nupic.subutai/run_anomaly
• Survey of Machine Learning techniques:
– Chandola, Varun, Arindam Banerjee, and Vipin Kumar. "Anomaly detection: A survey." ACM Computing Surveys (CSUR) 41.3 (2009): 15.