opml shooting the moving target - USENIX · Shooting the moving target : machine learning in...


Shooting the moving target: machine learning in cybersecurity

Ankit Arun*, Ignacio Arnaldo

MIT CSAIL top 16 2016


Outline

1. Machine Learning in Cybersecurity: problem statement and state-of-the-art

2. Machine Learning Platform

3. Current state of the system

4. Ongoing efforts


Vast number of data sources and attacks

100+ log types

1000+ security attacks reported in 2018

~24k malicious mobile apps are blocked every day

600% increase in IoT attacks in 2017

Ransomware attacks growing 350% annually

USA faced 303 targeted attacks between 2015 and 2017


The Need for AI in InfoSec: Data Problem

86% are investigated successfully

80% of attacks go undetected

By machines (aka logs and network systems) during or after the attack

By human analysts, after an attack has been known to occur


Detection approaches

[Figure: detection approaches (threat intel and signatures, rules, anomaly detection, supervised models) compared on coverage, false positives and dwell time, with an "I'm Here" marker]


Challenges…

Cybersecurity vs. Computer Vision

Data property: availability, variety, labeled, static / dynamic

More expert knowledge required

Siloed with barriers

Adversarial and dynamic


State-of-the-art ML in Cybersecurity

Academia

● 2015-2016: lexical analysis to detect spam, malware hosting and phishing URLs [1][2]

● 2016: LSTMs for DGA detection [3][4]

● 2017: Char-level CNNs for URL classification [5]

Industry

● Web traffic open by default

● Blacklists based on threat intelligence

● ML is rarely used for live detection

Risks preventing ML adoption:

● Are the approaches still valid or are they outdated?

● How do the models perform in real world scenarios?

● Will the models work in my environment?

[1] M. Darling, G. Heileman, G. Gressel, A. Ashok, and P. Poornachandran, “A lexical approach for classifying malicious URLs”

[2] M. S. I. Mamun, M. A. Rathore, A. H. Lashkari, N. Stakhanova, and A. A. Ghorbani, “Detecting malicious URLs using lexical analysis”

[3] J. Woodbridge, H. S. Anderson, A. Ahuja, and D. Grant, “Predicting domain generation algorithms with long short-term memory networks”

[4] H. S. Anderson, J. Woodbridge, and B. Filar, “DeepDGA: Adversarially-Tuned Domain Generation and Detection”

[5] J. Saxe et al., “eXpose: A Character-Level Convolutional Neural Network with Embeddings For Detecting Malicious URLs, File Paths and Registry Keys”
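To make "lexical analysis" of URLs concrete, here is a minimal Python sketch of the kind of string-level features such approaches typically start from. The feature set and the names `shannon_entropy` and `lexical_features` are assumptions chosen for illustration; they are not taken from the cited papers or from the authors' platform.

```python
import math
from collections import Counter
from urllib.parse import urlparse


def shannon_entropy(text: str) -> float:
    """Shannon entropy in bits per character; 0.0 for an empty string."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def lexical_features(url: str) -> dict:
    """Compute a few illustrative lexical features from the URL string alone."""
    parsed = urlparse(url)
    host = parsed.hostname or ""  # hostname is None when the URL has no netloc
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_digits": sum(ch.isdigit() for ch in url),
        "num_dots_in_host": host.count("."),
        "path_depth": len([part for part in parsed.path.split("/") if part]),
        "host_entropy": shannon_entropy(host),
    }


if __name__ == "__main__":
    print(lexical_features("http://example.com/a/b"))
```

A feature dictionary like this would then feed a classifier; which classifier and which features work in a given environment is exactly the open question the slide raises.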


Logs

Data Pipelines

Labels

Models

Continuous Improvement Process

• Adding/Changing more data

• Changing the entity to model

• Adding more attack examples

• Changing modeling strategy


Machine Learning Platform


The cloud repositories

Golden Data Set and Models

Threat Researchers

ML Engineers

Data Scientists


The cloud repositories

Horizontal Brute Force Attack

environment_1

raw_logs

normalized_logs

features

label.csv

labeled_feature_matrix

models

Brute_force_attack_classifier_v1.1

Brute_force_attack_outlier_v1.1


Configurable data pipelines

fields {
  name: 'protocol'
  display_name: 'Protocol'
  index: 'proto'
  data_type: string
}

Log Parsing Engine
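As a rough illustration of how a field definition like the one above could drive log normalization, here is a small Python sketch. It assumes that `index` names the key in the raw log record and that `data_type` selects a cast; the slides do not specify either, and `FIELDS`, `CASTS` and `normalize` are hypothetical names, not the platform's API.

```python
from typing import Any, Callable, Dict, List

# Hypothetical in-memory form of the slide's field definition.
FIELDS: List[Dict[str, str]] = [
    {
        "name": "protocol",
        "display_name": "Protocol",
        "index": "proto",  # assumed: key of this field in the raw log record
        "data_type": "string",
    },
]

# Assumed mapping from data_type names to Python casts.
CASTS: Dict[str, Callable[[Any], Any]] = {"string": str, "int": int, "float": float}


def normalize(raw: Dict[str, Any], fields: List[Dict[str, str]] = FIELDS) -> Dict[str, Any]:
    """Rename raw keys to normalized field names and cast the values.

    Fields missing from the raw record are emitted as None; raw keys that
    no field definition refers to are dropped.
    """
    normalized: Dict[str, Any] = {}
    for field in fields:
        value = raw.get(field["index"])
        cast = CASTS[field["data_type"]]
        normalized[field["name"]] = None if value is None else cast(value)
    return normalized


if __name__ == "__main__":
    print(normalize({"proto": "tcp", "src": "10.0.0.1"}))
```

Keeping the mapping in configuration rather than code means a new log source can be onboarded by adding field definitions, which fits the "configurable data pipelines" framing of the slide.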


Configurable data pipelines

feature {
  name: 'distinct_protocol'
  display_name: 'Distinct Protocol'
  definition: 'count_distinct(protocol)'
  data_type: int
}

Feature Compute Engine
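The following Python sketch shows one way a definition string such as `count_distinct(protocol)` could be evaluated per entity. The grammar (a single `function(field)` call), the `AGGREGATIONS` registry, the `compute_feature` function and the `src_ip` entity key in the usage example are all assumptions for illustration; the slides only show the configuration, not the engine.

```python
import re
from collections import defaultdict
from typing import Callable, Dict, Iterable, List

# Assumed registry of aggregations; only count_distinct appears on the slide.
AGGREGATIONS: Dict[str, Callable[[List[object]], int]] = {
    "count_distinct": lambda values: len(set(values)),
    "count": len,
}

# Assumed grammar: exactly one call of the form function(field).
DEFINITION_RE = re.compile(r"^(\w+)\((\w+)\)$")


def compute_feature(definition: str, records: Iterable[dict], entity_key: str) -> Dict[object, int]:
    """Evaluate a 'function(field)' definition separately for each entity."""
    match = DEFINITION_RE.match(definition.strip())
    if match is None:
        raise ValueError(f"unsupported definition: {definition!r}")
    func_name, field = match.groups()
    if func_name not in AGGREGATIONS:
        raise ValueError(f"unknown aggregation: {func_name!r}")

    # Group the field's values by entity, skipping incomplete records.
    grouped: Dict[object, List[object]] = defaultdict(list)
    for record in records:
        if field in record and entity_key in record:
            grouped[record[entity_key]].append(record[field])

    aggregate = AGGREGATIONS[func_name]
    return {entity: aggregate(values) for entity, values in grouped.items()}


if __name__ == "__main__":
    logs = [
        {"src_ip": "10.0.0.1", "protocol": "tcp"},
        {"src_ip": "10.0.0.1", "protocol": "udp"},
        {"src_ip": "10.0.0.2", "protocol": "icmp"},
    ]
    print(compute_feature("count_distinct(protocol)", logs, "src_ip"))
```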


Model Versioning

Brute_force_attack_Classifier_V2.3 (Major Version: 2, Minor Version: 3)

[Figure: Brute Force Attack Classifier version timeline, Sep 2018 to Apr 2019, showing param versions 1 to 5 across major versions v1, v2 and v3]
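A naming scheme like `Brute_force_attack_Classifier_V2.3` can be parsed and compared programmatically. The sketch below assumes the suffix is always `_V<major>.<minor>` (upper or lower case `v`); `ModelVersion`, `parse_model_name` and `latest` are hypothetical helpers, and what triggers a major versus a minor bump is not stated in the slides.

```python
import re
from typing import Iterable, NamedTuple


class ModelVersion(NamedTuple):
    name: str
    major: int
    minor: int


# Assumed suffix format: _V<major>.<minor> or _v<major>.<minor>.
VERSION_RE = re.compile(r"^(?P<name>.+)_[vV](?P<major>\d+)\.(?P<minor>\d+)$")


def parse_model_name(full_name: str) -> ModelVersion:
    """Split a versioned model name into base name, major and minor version."""
    match = VERSION_RE.match(full_name)
    if match is None:
        raise ValueError(f"not a versioned model name: {full_name!r}")
    return ModelVersion(match["name"], int(match["major"]), int(match["minor"]))


def latest(names: Iterable[str]) -> str:
    """Return the name with the highest (major, minor) pair, compared numerically."""
    return max(names, key=lambda n: parse_model_name(n)[1:])


if __name__ == "__main__":
    print(parse_model_name("Brute_force_attack_Classifier_V2.3"))
```

Comparing the versions as integers rather than strings matters once a minor version reaches two digits: `v1.10` should sort after `v1.9`.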


Current state of the system

Ping Sweep

Port Scan

DNS Reconnaissance

Zone Transfer

Social Eng Domains

Phishing Domains

Redirects

DLL Hijack

Task Sched

Mimikatz

Winroot

Domain Enumeration

Brute Force Login

Overpass the Hash

Skeleton Key Attack

Kerberoasting

DC Replication

Golden Ticket Attack

SSO Login Attack

Malware Backdoor

DGAs

TOR Connections

ICMP Tunneling

HTTP Tunneling

Twittor

SSH Tunneling

DNS Tunneling

DNS Beaconing

ICMP Exfiltration

HTTP Exfiltration

Gmail Exfiltration

Twitter Exfiltration

NTP Exfiltration

SMTP Exfiltration

DNS Exfiltration

Cloud Takeover

Reconnaissance

Delivery

Privilege Escalation

Lateral Movement

Command and Control

Exfiltration

Fwd Proxy Logs / NGFW

AD Logs

EDR Logs

DNS Logs

App Logs

Network

Proxy Logs

Zscaler

BlueCoat

Squid

Bro HTTP

Intersafe

FW Logs

PANW

Cisco ASA

Fortigate

NetScreen

Bro Conn

Flow Logs

Netflow

VPC Flow

IBM QFlow

DNS Logs

Windows DNS

Suricata

Bro DNS

Authentication

Auth/Auth

Active Directory

Okta

End Point

EDR Logs

Carbon Black

osQuery

Applications

App Logs

Apache

Box

OneDrive

Audit Trail

AWS CloudTrail

Contextual

Contextual

DHCP

Tenable

STIX

Open IoC

Alexa Top 1M

31 data sources

27 golden datasets

70 models

1000 model deployments

Weekly model updates


Ongoing efforts

• Automating Feature Computation

• Data Shift Detection

• Automating Model Review/Update Process
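One common way to flag data shift on a single numeric feature is the population stability index (PSI), sketched below in Python. The slides do not say which method the authors use, so this is an illustrative stand-in; the equal-width binning, the `eps` floor and the function names are assumptions.

```python
import math
from bisect import bisect_right
from typing import List, Sequence


def bin_fractions(values: Sequence[float], edges: Sequence[float], eps: float = 1e-6) -> List[float]:
    """Fraction of values per bin; out-of-range values are clamped to the end bins."""
    n_bins = len(edges) - 1
    counts = [0] * n_bins
    for v in values:
        idx = min(max(bisect_right(edges, v) - 1, 0), n_bins - 1)
        counts[idx] += 1
    # Floor each fraction at eps so the logarithm below is always defined.
    return [max(c / len(values), eps) for c in counts]


def population_stability_index(reference: Sequence[float], current: Sequence[float], n_bins: int = 10) -> float:
    """PSI between a reference sample and a current sample of one feature."""
    if not reference or not current:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(reference), max(reference)
    if hi == lo:
        hi = lo + 1.0  # avoid zero-width bins when the reference is constant
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins + 1)]
    ref = bin_fractions(reference, edges)
    cur = bin_fractions(current, edges)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


if __name__ == "__main__":
    baseline = [float(v) for v in range(100)]
    shifted = [v + 50.0 for v in baseline]
    print(population_stability_index(baseline, baseline))
    print(population_stability_index(baseline, shifted))
```

Identical samples give a PSI of zero, and the value grows as the current distribution moves away from the reference. The alerting threshold is a deployment choice and would need tuning per feature.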



References

• https://www.ptsecurity.com/ww-en/analytics/cybersecurity-threatscape-2018-q3/

• https://www.checkpoint.com/downloads/product-related/report/2018-security-report.pdf

• https://www.varonis.com/blog/cybersecurity-statistics/


Questions?