Strata San Jose 2016 - Reduce False Positives in Security

Post on 11-Apr-2017


Base Rate Fallacy
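The base-rate fallacy is easiest to see with Bayes' rule. The sketch below is only an illustration: the function name and the rates (1 malicious event in 10,000, 99% detection, 1% false alarms) are assumptions chosen for the example, not figures from the talk.

```python
def alert_precision(base_rate: float, tpr: float, fpr: float) -> float:
    """P(attack | alert) via Bayes' rule.

    base_rate: prior probability that an event is malicious
    tpr: P(alert | attack); fpr: P(alert | benign)
    """
    true_alerts = tpr * base_rate
    false_alerts = fpr * (1.0 - base_rate)
    return true_alerts / (true_alerts + false_alerts)


# Hypothetical rates, chosen only to illustrate the effect.
p = alert_precision(base_rate=1e-4, tpr=0.99, fpr=0.01)
print(f"P(attack | alert) = {p:.4f}")
```

With these assumed rates, only about 1 alert in 100 is a real attack, even though the detector looks accurate on paper. The rare base rate dominates the result.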

Why False Positives?

Case Study: Outlier Detection

Using an outlier detection system to identify fraudsters within the environment.

For a set of generating mechanisms find the unusual ones.
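The talk does not say which detector was used. As a minimal sketch of "find the unusual ones", the example below flags outliers with a modified z-score based on the median and the median absolute deviation (MAD). The function name, threshold, and sample data are all hypothetical.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return indices whose modified z-score exceeds the threshold.

    Uses median and MAD instead of mean and standard deviation so that
    the outliers themselves do not distort the baseline.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread in the data, nothing to score against
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical daily logon counts for one account.
daily_logons = [12, 15, 11, 14, 13, 12, 96, 14, 13]
print(mad_outliers(daily_logons))
```

A detector like this reports what is statistically unusual. Whether the unusual point is also interesting to a security analyst is a separate question, which is the gap the rest of the talk addresses.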

Example Time Series

Solution: Feedback Loop

Fraud: Takeaways

- Concept Drift is a shift in behavior.
- Feedback combats concept drift.
- Implicit Feedback > Explicit Feedback

IDS: Anatomy of Successful Detection

Context: Security Analyst

Red team Kill Chain

Blue team Kill Chain

False positives: Lose Ability to Triage

Fact: You cannot salvage a false positive with Contextual Info or Visualization

What is a Successful detection?

Properties + Frameworks

Successful detection captures Adversary TTP from Sensor data ignoring Expected activity

Source: @MSwannMSFT

Properties of a Successful Detection

Adaptable

Credible

Interpretable

Actionable

[Figure: Usefulness of Alerts (Less Useful to More Useful) against Sophistication of Algorithms (Basic to Advanced), with Security Domain Knowledge as a further axis]

Framework for a Successful detection

[Figure: Usefulness of Alerts (Less Useful to More Useful) against Sophistication of Algorithms (Basic to Advanced), with Security Domain Knowledge as a further axis. Label: Outlier]

[Figure: Usefulness of Alerts (Less Useful to More Useful) against Sophistication of Algorithms (Basic to Advanced), with Security Domain Knowledge as a further axis. Labels: Outlier, Anomaly, Increase Complexity]

[Figure: Usefulness of Alerts (Less Useful to More Useful) against Sophistication of Algorithms (Basic to Advanced), with Security Domain Knowledge as a further axis. Labels: Outlier, Anomaly, Increase Complexity, Security Interesting Alerts, Increase Domain Knowledge]

Successful Detections incorporate Domain Knowledge

How to encode Domain Knowledge: Embrace Rules

• Business Heuristics to filter out the “Security interesting anomalies”

• Rules can take many forms:
  • TI feeds
  • IOCs, IOAs
  • TTPs

• Rules are awesome
  • Credible, Interpretable, Adaptable (to some extent), Actionable!
  • Highest Precision
  • Highest Recall

Three ways to combine ML and Rules

1. Above Machine Learning Systems
   a. Business Heuristics to filter alerts
      i. “For account _foo_, only raise sev 2 alerts until March 28th, 2016”

Work by Dan Mace et al., Microsoft
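A rule that sits above the ML system can be as simple as a filter over the alerts the model emits. The sketch below is one way to express the quoted heuristic. The `Alert` and `SuppressionRule` types are hypothetical, and it reads "only raise sev 2 alerts" literally (only severity 2 passes), which is an assumption about the intended meaning.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Alert:
    account: str
    severity: int
    raised_on: date


@dataclass
class SuppressionRule:
    """For `account`, let only `only_severity` alerts through until `until`."""
    account: str
    only_severity: int
    until: date

    def allows(self, alert: Alert) -> bool:
        if alert.account != self.account or alert.raised_on > self.until:
            return True  # rule does not apply to this alert
        return alert.severity == self.only_severity


def filter_alerts(alerts, rules):
    """Keep an alert only if every business rule allows it."""
    return [a for a in alerts if all(r.allows(a) for r in rules)]


rule = SuppressionRule(account="foo", only_severity=2, until=date(2016, 3, 28))
alerts = [
    Alert("foo", 3, date(2016, 3, 20)),  # suppressed: wrong severity, rule active
    Alert("foo", 2, date(2016, 3, 20)),  # kept: sev 2
    Alert("foo", 3, date(2016, 4, 1)),   # kept: rule has expired
    Alert("bar", 3, date(2016, 3, 20)),  # kept: different account
]
kept = filter_alerts(alerts, [rule])
print(len(kept))
```

Keeping the rule outside the model means an analyst can add, read, or expire it without retraining anything.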

2. Below Machine Learning Systems
   a. Featurizations - “If IP address present in List of malicious IP dataset, flag 1”
   b. Utilizes Threat Intel feeds (Cymru, VirusTotal, FireEye)
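A rule below the ML system becomes one more column in the feature vector. The sketch below assumes a small in-memory set standing in for a threat-intel feed. The addresses come from the documentation-reserved ranges, and the event fields (`bytes_out`, `failed_logons`, `src_ip`) are hypothetical.

```python
import ipaddress

# Hypothetical threat-intel list; a real system would load this from a feed.
MALICIOUS_IPS = {
    ipaddress.ip_address(ip) for ip in ("203.0.113.7", "198.51.100.23")
}


def ip_flag(ip: str) -> int:
    """Return 1 if the address appears in the malicious-IP list, else 0."""
    return 1 if ipaddress.ip_address(ip) in MALICIOUS_IPS else 0


def featurize(event: dict) -> list:
    """Build a feature vector with the rule output as its last feature."""
    return [
        float(event["bytes_out"]),
        float(event["failed_logons"]),
        float(ip_flag(event["src_ip"])),
    ]


event = {"bytes_out": 5120, "failed_logons": 3, "src_ip": "203.0.113.7"}
print(featurize(event))
```

The model then learns how much weight the flag deserves alongside the other features, rather than the flag raising an alert on its own.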

3. Combining Rules and Machine Learning together using Markov Logic Networks

Initial Ideas given by Vinod Nair, MSR

Intuition

• Rules alone place a set of hard constraints on the set of possible worlds
• Let’s make them soft constraints: when a world violates a formula, it becomes less probable, not impossible
• Give each formula a weight (Higher weight ⇒ Stronger constraint)

Source: Lectures by Pedro Domingos
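The intuition above corresponds to the standard Markov logic network distribution from Richardson and Domingos (2006). Here $w_i$ is the weight of formula $i$, $n_i(x)$ is the number of true groundings of that formula in world $x$, and $Z$ normalizes over all possible worlds:

```latex
P(X = x) = \frac{1}{Z} \exp\!\Big( \sum_i w_i \, n_i(x) \Big),
\qquad
Z = \sum_{x'} \exp\!\Big( \sum_i w_i \, n_i(x') \Big)
```

Other things being equal, a world that violates one more grounding of a formula with weight $w$ is $e^{w}$ times less probable. As $w \to \infty$ the formula behaves like a hard constraint.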

Example: Service Accounts

Domain Knowledge

Interactive logons from service accounts indicate an attack

Similar service accounts tend to have similar logon behavior

Encode as First Order Logic

Associate Each Rule With the Learned Weight

1.5

1.1

Consider two service accounts: A, B

[Figure: network of ground atoms Attack(A), Attack(B), InteractiveLogon(A), InteractiveLogon(B), Similar(A,A), Similar(A,B), Similar(B,A), Similar(B,B), built from the two rules with weights 1.5 and 1.1]
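The first-order formulas are not preserved in this transcript, so the sketch below assumes a plausible encoding of the two domain-knowledge statements: `InteractiveLogon(x) => Attack(x)` with weight 1.5 and `Similar(x, y) => (InteractiveLogon(x) <=> InteractiveLogon(y))` with weight 1.1. The pairing of weights to rules follows the slide order and is also an assumption. With only two accounts there are 8 ground atoms, so inference can be done by brute-force enumeration of worlds. Tools such as Alchemy use approximate inference instead.

```python
from itertools import product
from math import exp

ACCOUNTS = ("A", "B")
W_LOGON_ATTACK = 1.5  # assumed: InteractiveLogon(x) => Attack(x)
W_SIMILAR = 1.1       # assumed: Similar(x,y) => (InteractiveLogon(x) <=> InteractiveLogon(y))

ATOMS = (
    [("Attack", a) for a in ACCOUNTS]
    + [("InteractiveLogon", a) for a in ACCOUNTS]
    + [("Similar", a, b) for a in ACCOUNTS for b in ACCOUNTS]
)


def log_weight(world):
    """Sum of weight * (number of satisfied groundings) for both formulas."""
    n1 = sum(
        (not world[("InteractiveLogon", x)]) or world[("Attack", x)]
        for x in ACCOUNTS
    )
    n2 = sum(
        (not world[("Similar", x, y)])
        or (world[("InteractiveLogon", x)] == world[("InteractiveLogon", y)])
        for x in ACCOUNTS
        for y in ACCOUNTS
    )
    return W_LOGON_ATTACK * n1 + W_SIMILAR * n2


def query(target, evidence):
    """P(target is true | evidence), by enumerating all consistent worlds."""
    free = [a for a in ATOMS if a not in evidence]
    num = den = 0.0
    for values in product([False, True], repeat=len(free)):
        world = dict(evidence)
        world.update(zip(free, values))
        w = exp(log_weight(world))
        den += w
        if world[target]:
            num += w
    return num / den


similar = {("Similar", a, b): True for a in ACCOUNTS for b in ACCOUNTS}
p_with = query(("Attack", "B"), {**similar, ("InteractiveLogon", "A"): True})
p_without = query(("Attack", "B"), {**similar, ("InteractiveLogon", "A"): False})
print(p_with > p_without)
```

Under these assumed formulas, seeing an interactive logon on account A raises the probability of an attack on the similar account B, even though nothing was observed on B directly. The soft similarity rule is what propagates the evidence.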

• How to learn the structure?
  • Begin with hand-coded rules
  • Use Inductive Logic Programming, but need to infer arbitrary clauses

• How to learn the weights?
  • For generative learning, depend on pseudolikelihood

• Check out Alchemy -- http://alchemy.cs.washington.edu/

Call for Action - After the conference

• One Week
  • Review
    • @CodyRioux - IPython Notebook
    • @Ram_ssk - Follow Up material
  • Think comprehensively about Rules

• One Month
  • Ask your data scientists to review the Literature section
  • Implement the rules on TOP of ML systems

• One Quarter
  • Implement a feedback system to capture training data
  • Implement all TI feeds within an ML System
  • Play with Alchemy

Literature

● The Base-Rate Fallacy and its Implications for the Difficulty of Intrusion Detection (Axelsson, 1999)

● Enhancing Performance Prediction Robustness by Combining Analytical Modeling and Machine Learning (Didona et al., 2015)

● Richardson, Matthew, and Pedro Domingos. "Markov logic networks." Machine Learning 62.1-2 (2006): 107-136.