Anomaly Detection Using the CLA
Source: mimobaby.com
Lindsay Lohan
Three Topics
• What is “Anomaly Detection”?
• How is the anomaly score computed in NuPIC/CLA today?
• How is the anomaly score used in the product Grok?
+ sample code!
Spatial (Static) Anomalies
Temporal Anomalies
Windmill Gear Bearing Temperature
Anomalies In Random Behavior
“Temporary” Anomalies
Anomaly Detection
• An anomaly is any significant deviation from normal behavior
• Anomaly detection is valuable
• Anomaly detection is hard – there are many flavors
– Spatial anomalies
– Temporal anomalies
– Anomalies in random data
– “Temporary” anomalies
– Etc.
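To make the difference between the first two flavors concrete, here is a minimal sketch. It uses a plain z-score check, which is a generic statistical baseline and not the CLA; the data values and the 2.5 threshold are made up for illustration. The check catches a spatial anomaly (one value far out of range) but is blind to a temporal one (ordinary values in an unusual order).

```python
import math


def z_scores(values):
    """Distance of each value from the mean, in population standard deviations."""
    n = len(values)
    mean = sum(values) / float(n)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / float(n))
    return [(v - mean) / std for v in values]


# Spatial anomaly: one value far outside the usual range (index 7).
spike = [10, 11, 10, 11, 10, 11, 10, 50, 10, 11]
# Temporal anomaly: every value is ordinary, but the 10/11 alternation breaks.
broken_rhythm = [10, 11, 10, 11, 10, 10, 10, 10, 10, 11, 10, 11]

THRESHOLD = 2.5  # arbitrary cut-off chosen for this toy data
spike_flags = [abs(z) > THRESHOLD for z in z_scores(spike)]
rhythm_flags = [abs(z) > THRESHOLD for z in z_scores(broken_rhythm)]

print(spike_flags.index(True))  # 7: the spike is flagged
print(any(rhythm_flags))        # False: the broken rhythm goes unnoticed
```

Detecting the second case needs a model of what usually comes next, not just of which values are usual.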
The Anomaly Score In NuPIC
• NuPIC implements anomaly scoring for streaming datasets
• Core feature of the OPF (Online Prediction Framework)
– Use inferenceType = TemporalAnomaly
– Outputs an anomaly score between 0 and 1 for every data point
• Detects spatial and temporal anomalies
• Continuously learning online system
• Works for numerical and categorical data
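A minimal sketch of the per-record scoring loop described above. It assumes an OPF-style model object whose `run(record)` returns a result carrying `inferences["anomalyScore"]`. `StubModel`, `score_stream`, the field names `timestamp` and `value`, and the sample data are hypothetical stand-ins so the loop runs without NuPIC installed; the stub is not the CLA.

```python
from collections import namedtuple
from datetime import datetime, timedelta

# Hypothetical stand-in for the result object an OPF model returns.
Result = namedtuple("Result", ["inferences"])


class StubModel(object):
    """Toy model (NOT the CLA): scores 1.0 for a never-seen value, else 0.0."""

    def __init__(self):
        self._seen = set()

    def run(self, record):
        value = record["value"]
        score = 0.0 if value in self._seen else 1.0
        self._seen.add(value)  # learn online, one record at a time
        return Result(inferences={"anomalyScore": score})


def score_stream(model, records):
    """Feed records one at a time; collect one anomaly score per data point."""
    scores = []
    for record in records:
        result = model.run(record)
        scores.append(result.inferences["anomalyScore"])
    return scores


start = datetime(2014, 1, 1)  # arbitrary start time
records = [
    {"timestamp": start + timedelta(minutes=5 * i), "value": v}
    for i, v in enumerate([10.0, 10.0, 10.0, 95.0, 10.0])
]
scores = score_stream(StubModel(), records)
print(scores)  # [1.0, 0.0, 0.0, 1.0, 0.0]
```

With a real OPF model in place of the stub, the loop itself would stay the same: one record in, one score between 0 and 1 out, with learning happening as the stream is consumed.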
Computing Anomaly Score
Data (Time of Day, Sensor Value) --> Encoders --> CLA (Spatial Pooler --> Temporal Pooler) --> Predictions
• The CLA constantly learns common spatial patterns and temporal sequences in the stream of inputs
• At each time step the Temporal Pooler makes multiple predictions about what might come next
• Anomaly Score =
– 0 if the current value was predicted
– 1 if the value was totally unpredicted
– Between 0 and 1 if the value is similar to a predicted value
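One way to get a score with exactly these three cases is to measure what fraction of the current input was not predicted. The sketch below assumes the input and the predictions are both represented as sets of active column indices; that representation and the function name `raw_anomaly_score` are assumptions for illustration, since the slide gives only the three cases.

```python
def raw_anomaly_score(active_columns, predicted_columns):
    """Fraction of currently active columns that were not predicted."""
    active = set(active_columns)
    if not active:
        return 0.0  # nothing active, so nothing was unexpected
    predicted = set(predicted_columns)
    unpredicted = active - predicted
    return len(unpredicted) / float(len(active))


fully_predicted = raw_anomaly_score([1, 2, 3, 4], [1, 2, 3, 4, 9])
unpredicted = raw_anomaly_score([1, 2, 3, 4], [7, 8, 9])
partly_predicted = raw_anomaly_score([1, 2, 3, 4], [1, 2, 8, 9])

print(fully_predicted)   # 0.0
print(unpredicted)       # 1.0
print(partly_predicted)  # 0.5
```

Extra predictions that do not occur (column 9 in the first call) cost nothing, which fits a model that makes multiple predictions at each step.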
Artificial Example
Sequence seen so far: A B A B A C A B A D A _
• B, C, or D occurs:
– Anomaly score = 0
• E occurs:
– Completely different from B, C, or D --> anomaly score = 1
– Similar to B, C, or D --> score will be between 0 and 1
– “Similar” means “similar after encoding”
• If A -> E repeats:
– Anomaly score will drop to 0
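The artificial example can be replayed with a toy first-order memory. This is a sketch only, not the CLA: the 4-bit encodings, the class `FirstOrderMemory`, and the two variants of E (`E_FAR`, sharing no bits with B, C, or D, and `E_NEAR`, sharing half its bits with B) are all hypothetical.

```python
# Hypothetical encodings: each symbol is a set of active bits.
ENCODINGS = {
    "A": {0, 1, 2, 3},
    "B": {10, 11, 12, 13},
    "C": {20, 21, 22, 23},
    "D": {30, 31, 32, 33},
    "E_FAR": {40, 41, 42, 43},   # nothing in common with B, C, or D
    "E_NEAR": {12, 13, 50, 51},  # shares two of four bits with B
}
TRAINING = "A B A B A C A B A D A".split()


class FirstOrderMemory(object):
    """Toy stand-in for the CLA: remembers which bits followed each symbol."""

    def __init__(self):
        self._next_bits = {}  # previous symbol -> union of bits seen after it
        self._previous = None

    def step(self, symbol):
        bits = ENCODINGS[symbol]
        predicted = self._next_bits.get(self._previous, set())
        score = len(bits - predicted) / float(len(bits))
        if self._previous is not None:
            self._next_bits.setdefault(self._previous, set()).update(bits)
        self._previous = symbol
        return score


def score_after_training(next_symbol):
    memory = FirstOrderMemory()
    for symbol in TRAINING:
        memory.step(symbol)
    return memory.step(next_symbol)


score_b = score_after_training("B")          # 0.0: B was predicted after A
score_far = score_after_training("E_FAR")    # 1.0: completely different
score_near = score_after_training("E_NEAR")  # 0.5: similar after encoding

# If A -> E repeats, the score drops to 0.
memory = FirstOrderMemory()
for symbol in TRAINING:
    memory.step(symbol)
first_e = memory.step("E_FAR")   # 1.0 the first time
memory.step("A")
second_e = memory.step("E_FAR")  # 0.0 once A -> E has been seen
print(score_b, score_far, score_near, first_e, second_e)
```

In this toy a single repetition is enough to drive the score to 0, because every transition is memorized immediately.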
Example: Anomalous CPU Usage
Example: Heater Temperature
Figure labels: “Unusual temporal behavior”, “Unusually low readings”, “Anomaly score”
Example: Change In Randomness
Sample Code
• Sample code and datasets for running anomaly detection are available at:
– https://github.com/subutai/nupic.subutai/run_anomaly
|-- README.md
|-- data
| |-- art_load_balancer_spikes.csv
| |-- cpu_5f553.csv
| |-- cpu_825cc.csv
| |-- cpu_cc0c5.csv
| `-- rds_connections.csv
|-- model_params.py
|-- run_all.sh
`-- run_anomaly.py
Grok
• Define what to monitor
• Grok ingests streaming data
• Builds models automatically
• Continuously learns
• Adapts to changes
• Visualize likelihood of unusual behavior
• See metrics and data
• Prevent downtime
Use Case: Sudden Changes, Slow Changes
Use Case: Subtle Changes
What Have We Learned From Grok?
• Anomaly detection is extremely useful
• Real-world data is really, really noisy!
– We will never build a perfect predictive model
• There’s no way to set a threshold on the anomaly score
– High anomaly score not necessarily bad
– Random stuff happens normally
• Visually you can see a qualitative change in the anomaly scores
• In Grok we detect the change in the anomaly score itself
– Compute a likelihood that the predictability of the data has changed
Anomaly Likelihood In Grok
1. For each new data point, compute the anomaly score using the OPF
2. Estimate the probability distribution of historical anomaly scores
3. Compute the likelihood that the recent anomaly scores come from the same distribution as the historical anomaly scores
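The three steps can be sketched as follows. This is a simplified illustration, not Grok's code: modelling the historical scores as a Gaussian, summarizing the recent scores by their mean, and the names `gaussian_tail` and `anomaly_likelihood` are all assumptions, since the slide does not say which distribution is estimated. The function returns a value near 1 when the recent scores are unusually high for this stream.

```python
import math


def gaussian_tail(x, mean, std):
    """P(X >= x) for X ~ Normal(mean, std)."""
    z = (x - mean) / (std * math.sqrt(2.0))
    return 0.5 * math.erfc(z)


def anomaly_likelihood(historical_scores, recent_scores, min_std=1e-3):
    """Likelihood that the recent scores are higher than history explains."""
    # Step 2: estimate the distribution of historical scores (Gaussian assumed).
    n = len(historical_scores)
    mean = sum(historical_scores) / float(n)
    variance = sum((s - mean) ** 2 for s in historical_scores) / float(n)
    std = max(math.sqrt(variance), min_std)  # avoid division by zero
    # Step 3: how surprising is the recent average under that distribution?
    recent_mean = sum(recent_scores) / float(len(recent_scores))
    return 1.0 - gaussian_tail(recent_mean, mean, std)


# Step 1 is assumed done: these are made-up per-point anomaly scores.
history = [0.1, 0.2, 0.15, 0.1, 0.2, 0.15, 0.1, 0.2]
typical = anomaly_likelihood(history, [0.15, 0.15])
elevated = anomaly_likelihood(history, [0.9, 0.8, 0.95])
print(typical, elevated)
```

Because the comparison is against the stream's own history, a stream that is normally noisy (high scores all the time) does not trigger, while a normally predictable stream that turns noisy does.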
Example: Anomaly Score
Example: Likelihood Score
Example: Change In Randomness
Use Case: Changes in Randomness
Windmill Gear Bearing Temperature
Anomaly Likelihood Code
• The anomaly likelihood scheme has proven to be critical in making the anomaly score useful in a practical application
• We are making the Anomaly Likelihood code available:
– https://github.com/subutai/nupic.subutai/run_anomaly
• Self-contained function right now
– It might be useful to look at, but it is not in an easy-to-use form yet!
– Plan to create better sample code and then perhaps integrate it into the OPF
What About Swarming?
• Swarming is an automated parameter selection scheme in NuPIC
– Runs hundreds of models with unique parameter combinations
– Selects the best field combinations and parameters
• In Grok we use a single pre-swarmed parameter set
– Fixed set of fields (timestamp + value)
– Data fed in every 5 minutes
– Works very well across different data streams with the above characteristics
• In general you will still need to swarm
– Great set of tutorials online put together by Matt
– But the system is relatively insensitive to small parameter changes, so you may not need to swarm too often
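The idea of running many models with different parameter combinations and keeping the best can be sketched with a plain exhaustive grid. This is a simplification for illustration only: the slide does not describe how swarming chooses which combinations to run, and the parameter names (`window`, `resolution`) and the error function are hypothetical.

```python
import itertools


def grid_search(param_grid, evaluate):
    """Try every combination in param_grid; return (best_params, best_error)."""
    names = sorted(param_grid)
    best_params, best_error = None, float("inf")
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        error = evaluate(params)  # in practice: train a model, measure its error
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error


# Hypothetical parameters and error function, for illustration only.
grid = {"window": [1, 2, 4, 8], "resolution": [0.5, 1.0, 2.0]}


def toy_error(params):
    return abs(params["window"] - 4) + abs(params["resolution"] - 1.0)


best, err = grid_search(grid, toy_error)
print(best, err)
```

A pre-swarmed parameter set, as used in Grok, amounts to running a search like this once on representative data and reusing `best` for every new stream with the same fields and sampling rate.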
Where Do We Go Next?
• The CLA is proving to be excellent at detecting anomalies in datasets we’ve tried so far
– Fully automated - no parameter tuning in Grok!
• We’ve learned a lot in the process of creating the product
– We’d like to spread the ideas to the community
• It’s clear we’re just scratching the surface
Benchmark For Streaming Anomaly Detection
• Hard to find good anomaly detection benchmarks for streaming data
• We’ve decided to create a dataset and testing methodology focused on streaming data and anomaly detection
– Model real-time online streaming data sources
– Emphasis will be on temporal streaming data, automation, and continuous learning
– Well-defined methodology for evaluating algorithms
– Baseline results using CLA
• We’re hoping it will be useful to the NuPIC community as we continue to push the boundaries
– Please see Ian Danforth or me if you’re interested
Resources
• Read “The Science of Anomaly Detection” whitepaper on numenta.com
• GitHub repository containing sample code, anomaly likelihood algorithm, and data:
– https://github.com/subutai/nupic.subutai/run_anomaly
• Survey of machine learning techniques:
– Chandola, Varun, Arindam Banerjee, and Vipin Kumar. "Anomaly detection: A survey." ACM Computing Surveys (CSUR) 41.3 (2009): 15.