Machine Learning-Based Detection of Ransomware Using SDN
Greg Cusack*, Oliver Michel, Eric Keller
3/21/18
* Presenter
Overview
● Ransomware Overview
● Previous work
● Programmable Forwarding Engines (PFEs)
● Method
● Classification
● Results
● Current Progress
Ransomware
● Malicious software that holds a victim’s files for ransom
● Files held until ransom is paid
● Two main types:
○ Locker
○ Crypto
● Difficult to develop long term solutions
● IoT boom -> More avenues for infection
● Ransomware as a Service (RaaS)
WannaCry Ransomware
Source: https://www.pbs.org/newshour/science/everything-need-know-wannacrypt-ransomware-attack
Ransomware Data Flow
Previous Work
● EldeRan: Machine learning approach for ransomware classification
○ Track Windows API calls, file system operations, registry key operations, etc.
● Software-defined networking-based detection of crypto ransomware
○ Fingerprint HTTP traffic
● Most packet trace approaches are payload-based
Source: K. Cabaj, M. Gregorczyk, and W. Mazurczyk. Software-defined networking-based crypto ransomware detection using HTTP traffic characteristics. arXiv preprint arXiv:1611.08294, 2016.
Programmable Forwarding Engines (PFEs)
● High-rate, programmable network switches
● Support the scalable generation of rich flow records
● Can process network data at high rates and extract vital per-packet flow information
● Provide the data and speed necessary for flow-based network traffic analysis and fingerprinting
Compact, Per Packet Flow Records
● Provides richness and scalability for large networked systems
● Tailored to fit a user's specific application
PFE Flow Record Overview and Features
Method
● Goal: Utilize machine learning and leverage the recent trend in switch hardware to identify ransomware via its network traffic signature
● Collect ransomware PCAP samples (>100MB)
● Collect clean traffic as baseline
○ Web browsing, streaming, file downloading, etc.
● Stream processor development
● Classification
Source: http://www.malware-traffic-analysis.net
Flow Records and Stream Processing
Flow-Based Features
● Flow Duration
● Interarrival Times
○ Minimum
○ Mean
○ Maximum
○ Standard deviation
● Packet Lengths
○ Minimum
○ Mean
○ Maximum
○ Standard deviation
● Burst Lengths
○ Minimum
○ Mean
○ Maximum
● Total number of packets
● Ratios:
○ Packets out / packets in
○ Bytes out / bytes in
● # of unique packet lengths
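Most of the features listed above can be computed from a list of per-packet records. The following is a minimal sketch, not the authors' stream processor: the `(timestamp, length, direction)` record layout and the names `summarize` and `flow_features` are assumptions for illustration. Burst lengths are left out because the slides do not define how a burst is delimited, and the sketch assumes at least two packets with at least one inbound packet.

```python
import statistics


def summarize(values):
    """Return min, mean, max and population standard deviation."""
    return {
        "min": min(values),
        "mean": statistics.fmean(values),
        "max": max(values),
        "std": statistics.pstdev(values),
    }


def flow_features(packets):
    """Compute flow-level features from per-packet records.

    packets: list of (timestamp_s, length_bytes, direction) tuples, where
    direction is "out" or "in" (hypothetical layout, not from the slides).
    """
    packets = sorted(packets)  # order by timestamp
    times = [t for t, _, _ in packets]
    lengths = [n for _, n, _ in packets]
    gaps = [b - a for a, b in zip(times, times[1:])]  # interarrival times
    out_lengths = [n for _, n, d in packets if d == "out"]
    in_lengths = [n for _, n, d in packets if d == "in"]
    return {
        "duration": times[-1] - times[0],
        "interarrival": summarize(gaps),
        "length": summarize(lengths),
        "total_packets": len(packets),
        "packet_ratio": len(out_lengths) / len(in_lengths),
        "byte_ratio": sum(out_lengths) / sum(in_lengths),
        "unique_lengths": len(set(lengths)),
    }
```

Calling `flow_features` on one flow's packets returns a dictionary that can be flattened into a feature vector for a classifier.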
Random Forest
● Ensemble Algorithm
○ Divide and conquer approach
● Collection of decision trees
○ Avoids overfitting
● Random subsets of features used to build smaller, shallower trees
● Majority voting from decision trees to decide class
● Bagging used to improve stability, reduce variance, and increase accuracy
Source: https://www.youtube.com/watch?v=D_2LkhMJcfY
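The bagging, random feature subset, and majority-voting ideas above can be illustrated with a deliberately tiny ensemble. This is a sketch under strong simplifications, not the classifier used in the talk: each "tree" is a one-level decision stump rather than a tree of depth up to 15, labels are assumed to be 0 or 1, and the names `fit_stump`, `fit_forest`, and `predict` are hypothetical.

```python
import random
from collections import Counter


def fit_stump(rows, labels, features):
    """Pick the (feature, threshold, polarity) with the fewest training errors."""
    best = None
    for f in features:
        for threshold in sorted({r[f] for r in rows}):
            for above in (0, 1):  # label predicted when value > threshold
                errors = sum(
                    (above if r[f] > threshold else 1 - above) != y
                    for r, y in zip(rows, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, threshold, above)
    _, f, threshold, above = best
    return lambda row: above if row[f] > threshold else 1 - above


def fit_forest(rows, labels, n_trees=40, n_features=1, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bagging: sample with replacement
        feats = rng.sample(range(len(rows[0])), n_features)  # random feature subset
        forest.append(
            fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats)
        )
    return forest


def predict(forest, row):
    """Majority vote over all trees in the ensemble."""
    votes = Counter(tree(row) for tree in forest)
    return votes.most_common(1)[0][0]
```

Individual stumps trained on unlucky bootstrap samples can be wrong, but the vote across 40 of them is far more stable, which is the variance reduction the bagging bullet refers to.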
Results
Confusion Matrix (28 Features)
● Accuracy
○ Correct / total = 0.8689
● Recall
○ tp / (tp + fn) = 0.8925
● Precision
○ tp / (tp + fp) = 0.8384
● F1 Score
○ 2 * (R * P) / (R + P) = 0.8689
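The four formulas above can be evaluated directly from confusion-matrix counts. The sketch below is illustrative only: the function name `classification_metrics` is hypothetical, and any counts passed to it here are made up, because the confusion-matrix figure itself did not survive in this transcript.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, recall, precision and binary F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * (recall * precision) / (recall + precision)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts, chosen only to exercise the formulas.
example = classification_metrics(tp=8, fp=2, fn=2, tn=8)
```

Plugging the recall and precision reported above into the binary F1 formula gives about 0.865 rather than 0.8689, so the reported F1 may be an averaged variant; the transcript does not say which was used.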
ROC Curve (28 Features)
● Area Under Curve (AUC)
○ 0.93502
● 10-Fold Cross Validation Score
○ 0.87301
● Decision Trees: 40
● Max Tree Depth: 15
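For readers unfamiliar with the AUC figure above, it can be read as the probability that a randomly chosen ransomware flow receives a higher classifier score than a randomly chosen clean flow. The sketch below computes it that way; the function name `roc_auc` is hypothetical, and using the fraction of trees voting "ransomware" as the score is an assumption, not something the slides state.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for the positive class, 0 for the negative class.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

This pairwise form is quadratic in the number of samples, which is fine for small test sets; sorting-based implementations are preferable at scale.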
Feature Importances (28 Features)
[Figure: feature importances plotted per feature; axis label: Features. Labeled features: Inflow # of Bytes, Outflow to inflow packet ratio, Inflow σ Length, Inflow Mean Burst Length, Outflow # of Bytes, Outflow Minimum Interarrival Time, Outflow Flow Duration, Outflow σ Length]
Confusion Matrix (8 Features)
● Accuracy
○ Correct / total = 0.8738
● Recall
○ tp / (tp + fn) = 0.8710
● Precision
○ tp / (tp + fp) = 0.8617
● F1 Score
○ 2 * (R * P) / (R + P) = 0.8738
ROC Curve (8 Features)
● Area Under Curve (AUC)
○ 0.91951
● 10-Fold Cross Validation Score
○ 0.86827
● Decision Trees: 40
● Max Tree Depth: 15
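The 10-fold cross-validation score above is the average test score over ten train/test splits in which every sample is held out exactly once. A minimal sketch of that procedure follows; the names `k_fold_indices` and `cross_val_score`, and the `fit`/`score` callables, are hypothetical, and the sketch assumes the samples have already been shuffled.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs; each sample is held out exactly once."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def cross_val_score(fit, score, rows, labels, k=10):
    """Average the held-out score over k folds.

    fit(train_rows, train_labels) returns a model;
    score(model, test_rows, test_labels) returns a number.
    """
    scores = []
    for train, test in k_fold_indices(len(rows), k):
        model = fit([rows[i] for i in train], [labels[i] for i in train])
        scores.append(
            score(model, [rows[i] for i in test], [labels[i] for i in test])
        )
    return sum(scores) / len(scores)
```

Because every fold's model is evaluated only on data it never saw, the averaged score is a less optimistic estimate than accuracy on the training set.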
Feature Importances (8 Features)
[Figure: feature importances plotted per feature; axis label: Features. Labeled features: Inflow σ Length, Outflow Minimum Interarrival Time, Outflow to inflow packet ratio]
Can we identify a specific type of ransomware?
Crypto-Based Cerber Ransomware Detection
Cerber Feature Importances (8 Features)
[Figure: feature importances plotted per feature; axis label: Features. Labeled features: Inflow Mean Burst Length, Inflow Max Burst Length, Outflow to inflow packet ratio]
Confusion Matrix Cerber Ransomware
● Accuracy
○ Correct / total = 0.9444
● Recall
○ tp / (tp + fn) = 1.000
● Precision
○ tp / (tp + fp) = 0.9091
● F1 Score
○ 2 * (R * P) / (R + P) = 0.9439
ROC Curve Cerber Ransomware
● Area Under Curve (AUC)
○ 0.98750
● 10-Fold Cross Validation Score
○ 0.90500
● Decision Trees: 40
● Max Tree Depth: 15
Takeaways
● Initial findings are promising but require further research
● Packet lengths, interarrival times, and flow ratios leave ransomware susceptible to identification
● The recent emergence of PFEs provides the right backbone for flow-based feature extraction
Work in Progress
● Sandboxing ransomware samples to collect network traffic
● Implementing stream processor on a PFE ASIC
● Developing LSTM Recurrent Neural Network
● System architecture redesign
Questions?
Ransomware Detection System Overview
Hardware-generated, Bidirectional Microflows
● Generated in switch ASIC
● Turn current *Flow flow table into bidirectional flow table
● Evict bidirectional microflows to CPU for feature extraction
● PFE data flow overview:
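The idea of a bidirectional flow table can be sketched in software: both directions of a connection map to one table entry by ordering the two endpoints canonically. This is only an illustration of the keying idea, not the switch ASIC implementation or the *Flow design; the packet field names and the functions `bidirectional_key` and `group_bidirectional` are assumptions.

```python
from collections import defaultdict


def bidirectional_key(src_ip, src_port, dst_ip, dst_port, proto):
    """Return the same key for both directions of a flow."""
    lo, hi = sorted(((src_ip, src_port), (dst_ip, dst_port)))
    return (lo, hi, proto)


def group_bidirectional(packets):
    """Group packets into bidirectional flows.

    packets: dicts with src_ip, src_port, dst_ip, dst_port, proto, length
    (hypothetical field names). Each stored packet is tagged "fwd" if it was
    sent by the canonical first endpoint, otherwise "rev".
    """
    flows = defaultdict(list)
    for p in packets:
        key = bidirectional_key(
            p["src_ip"], p["src_port"], p["dst_ip"], p["dst_port"], p["proto"]
        )
        direction = "fwd" if (p["src_ip"], p["src_port"]) == key[0] else "rev"
        flows[key].append((direction, p["length"]))
    return dict(flows)
```

Keeping the per-packet direction tag is what makes inflow and outflow features, such as the outflow-to-inflow packet ratio, recoverable from a single table entry.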
Flow Feature Structure
SDN Controller/Server Data Flow
Packet Length Frequency