Automated Worm Fingerprinting Sumeet Singh, Cristian Estan, George Varghese, and Stefan Savage Manan Sanghi
  • Date posted: 21-Dec-2015

Page 1: Automated Worm Fingerprinting Sumeet Singh, Cristian Estan, George Varghese, and Stefan Savage Manan Sanghi.

Automated Worm Fingerprinting

Sumeet Singh, Cristian Estan, George Varghese, and Stefan Savage

Manan Sanghi

Page 2:

The menace

Page 3:

Context

Worm Detection
  • Scan detection
  • Honeypots
  • Host based behavioral detection
  • Payload-based ???

Page 4:

Context

Characterization
  • A priori vulnerability signatures: generally manual
  • Honeycomb: host based, longest common subsequences
  • Autograph: network level automatic signature generation

Page 5:

Context

Containment
  • Host quarantine
  • String matching
  • Connection throttling
  • Address Blacklisting
  • Content Filtering
  • Internet Quarantine

Page 6:

Worm behavior

Content Invariance
  • Limited polymorphism, e.g. encryption
  • Key portions are invariant, e.g. the decryption routine

Content Prevalence
  • The invariant portions appear frequently in network traffic

Address Dispersion
  • The number of distinct infected hosts grows over time, so the invariant content is seen with many different source and destination addresses

Page 7:

Key Idea

Detect unknown worms on the basis of
  • A common exploit sequence
  • A range of unique sources and destinations

Page 8:

Content Sifting

For each string w, maintain
  • prevalence(w): number of times it is found in the network traffic
  • sources(w): number of unique sources corresponding to it
  • destinations(w): number of unique destinations corresponding to it

If thresholds exceeded, then block(w)
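
The bookkeeping on this slide can be sketched directly. The following is a minimal Python illustration that keeps exact counters and sets, which is precisely the expensive approach the later slides try to avoid. The class name, method names, and the choice to require all three thresholds at once are assumptions made for this example, not details taken from the slides.

```python
from collections import defaultdict


class NaiveContentSifter:
    """Exact, memory-hungry version of the per-string bookkeeping (illustrative only)."""

    def __init__(self, prevalence_threshold=3, dispersion_threshold=30):
        self.prevalence_threshold = prevalence_threshold
        self.dispersion_threshold = dispersion_threshold
        self.prevalence = defaultdict(int)    # w -> number of sightings
        self.sources = defaultdict(set)       # w -> distinct source addresses
        self.destinations = defaultdict(set)  # w -> distinct destination addresses
        self.blocked = set()

    def observe(self, w, src, dst):
        """Record one sighting of string w sent from src to dst.

        Returns True if w is blocked after this sighting.
        """
        self.prevalence[w] += 1
        self.sources[w].add(src)
        self.destinations[w].add(dst)
        # Assumption: w is blocked only when all three thresholds are exceeded.
        if (self.prevalence[w] > self.prevalence_threshold
                and len(self.sources[w]) > self.dispersion_threshold
                and len(self.destinations[w]) > self.dispersion_threshold):
            self.blocked.add(w)
        return w in self.blocked
```

A string repeated many times between one pair of hosts never trips the dispersion thresholds, while a string seen from many sources to many destinations does.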

Page 9:

Issues

How to compute prevalence(w), sources(w) and destinations(w) efficiently?

  • Scalable
  • Low memory and CPU requirements
  • Real time deployment over a Gigabit scale link

Page 10:

prevalence(w)

w – entire packet
  • Use multi-stage filters (k-ary sketches?)

w – small fixed length b
  • Rabin fingerprints
  • Value sampling
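
The multi-stage filter mentioned above can be sketched as follows. This is a minimal Python illustration of the general idea (several independently hashed counter arrays, with a key treated as prevalent only if its counter is high in every stage). The class name, the default sizes, and the use of salted BLAKE2b as the per-stage hash are assumptions for this example.

```python
import hashlib


class MultistageFilter:
    """Approximate prevalence counting with several hashed counter arrays."""

    def __init__(self, stages=4, buckets=1024, threshold=3):
        self.stages = stages
        self.buckets = buckets
        self.threshold = threshold
        self.counters = [[0] * buckets for _ in range(stages)]

    def _index(self, stage, key):
        # A different salt per stage gives an independent hash function per stage.
        digest = hashlib.blake2b(
            key, digest_size=8, salt=stage.to_bytes(2, "big")
        ).digest()
        return int.from_bytes(digest, "big") % self.buckets

    def add(self, key):
        """Count one occurrence of key (bytes).

        Returns True if the key's counter has reached the threshold in every stage.
        """
        indexes = [self._index(stage, key) for stage in range(self.stages)]
        for stage, index in enumerate(indexes):
            self.counters[stage][index] += 1
        smallest = min(self.counters[stage][index]
                       for stage, index in enumerate(indexes))
        return smallest >= self.threshold
```

Because unrelated keys can share a counter, the smallest counter never underestimates the true count: the filter may report a false positive, but it does not miss a key that really reached the threshold.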

Page 11:

Value Sampling

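The problem: a packet with an s-byte payload has s-b+1 substrings of length b
Solution: Sample
But: Random sampling is not good enough, since different instances of the same worm would have different substrings sampled
Trick: Sample only those substrings for which the fingerprint matches a certain pattern

Since Rabin fingerprints are randomly distributed, a worm with x bytes of invariant content is tracked with probability

Pr_track(x) = 1 - e^{-f(x-b+1)}

where f is the fraction of fingerprints sampled.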
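
Value sampling and the tracking probability can be illustrated with a short Python sketch. It uses a simple polynomial rolling hash as a stand-in for a true Rabin fingerprint, and it samples a substring when the low bits of its fingerprint are all zero, so that f = 2^-sample_bits. The function names, the hash constants, and the sampling pattern are assumptions chosen for this example.

```python
import math

BASE = 257
MOD = (1 << 61) - 1  # large prime modulus for the stand-in rolling hash


def rolling_fingerprints(payload: bytes, b: int):
    """Yield (offset, fingerprint) for each of the s-b+1 substrings of length b."""
    if len(payload) < b:
        return
    top = pow(BASE, b - 1, MOD)  # weight of the byte that leaves the window
    fp = 0
    for byte in payload[:b]:
        fp = (fp * BASE + byte) % MOD
    yield 0, fp
    for i in range(b, len(payload)):
        # Slide the window one byte: drop payload[i - b], append payload[i].
        fp = ((fp - payload[i - b] * top) * BASE + payload[i]) % MOD
        yield i - b + 1, fp


def sampled_fingerprints(payload: bytes, b: int, sample_bits: int = 6):
    """Keep only fingerprints whose low sample_bits bits are zero (f = 2**-sample_bits)."""
    mask = (1 << sample_bits) - 1
    return [(off, fp) for off, fp in rolling_fingerprints(payload, b)
            if fp & mask == 0]


def p_track(x: int, b: int, f: float) -> float:
    """Probability that at least one of the x-b+1 invariant substrings is sampled."""
    if x < b:
        return 0.0
    return 1.0 - math.exp(-f * (x - b + 1))
```

The sampling decision depends only on the substring's content, so every instance of the same worm has the same substrings selected and their counts accumulate. With the illustrative values b = 40 and f = 1/64, `p_track(200, 40, 1/64)` is about 0.92 and `p_track(400, 40, 1/64)` is about 0.996.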

Page 12:

sources(w) & destinations(w)

Address Dispersion
  • Counting distinct elements vs. repeating elements
  • Simple list or hash table is too expensive
  • Key Idea: Bitmaps
  • Trick: Scaled Bitmaps

Page 13:

Direct Bitmap

Each content source is hashed into a bitmap, the corresponding bit is set, and an alarm is raised when the number of bits set exceeds a threshold

Drawback: the actual value of each counter can no longer be estimated
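
A direct bitmap of this kind fits in a few lines. The sketch below is a minimal Python illustration; the class name, the 32-bit size, the alarm threshold of 20 bits, and the use of SHA-256 as the hash are assumptions for this example rather than values stated on the slide.

```python
import hashlib


class DirectBitmap:
    """Tiny distinct-address detector: one bit per hash bucket."""

    def __init__(self, bits=32, alarm_threshold=20):
        self.bits = bits
        self.alarm_threshold = alarm_threshold
        self.bitmap = 0  # integer used as a bit vector

    def add(self, address: str) -> bool:
        """Hash the address to one bit and set it.

        Returns True when the number of set bits exceeds the alarm threshold.
        """
        digest = hashlib.sha256(address.encode()).digest()
        position = int.from_bytes(digest[:8], "big") % self.bits
        self.bitmap |= 1 << position
        return self.bits_set() > self.alarm_threshold

    def bits_set(self) -> int:
        return bin(self.bitmap).count("1")
```

Repeated sightings of the same address always set the same bit, so only distinct addresses push the count toward the threshold. Once most bits are set, however, the bitmap says little about how many addresses were actually seen.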

Page 14:

Scaled Bitmap

Idea: Subsample the range of the hash space

How does it work?
  • Multiple bitmaps, each mapped to a progressively smaller portion of the hash space
  • Bitmaps are recycled if necessary

Result
  • Roughly 5 times less memory, plus an actual estimate of address dispersion
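
The subsampling idea can be illustrated with a single bitmap that covers only a slice of the hash space. The Python sketch below is a deliberate simplification: it shows one level only, with no multiple bitmaps and no recycling, and it estimates the count with the standard linear-counting formula scaled up by the sampling factor. The class name, sizes, and hash choice are assumptions for this example.

```python
import hashlib
import math


class SubsampledBitmap:
    """One bitmap covering 1 / 2**scale of the hash space (single level only)."""

    def __init__(self, bits=256, scale=2):
        self.bits = bits
        self.scale = scale
        self.bitmap = [False] * bits

    def add(self, address: str) -> None:
        digest = hashlib.sha256(address.encode()).digest()
        h = int.from_bytes(digest[:8], "big")  # 64-bit hash value
        # Only addresses whose top `scale` hash bits are all zero are recorded.
        if h >> (64 - self.scale) != 0:
            return
        # The low bits pick the position, independently of the sampling test.
        self.bitmap[h % self.bits] = True

    def estimate(self) -> float:
        """Linear-counting estimate of distinct addresses, scaled by 2**scale."""
        zeros = self.bitmap.count(False)
        if zeros == 0:
            return float("inf")  # bitmap saturated; a larger scale is needed
        return (2 ** self.scale) * self.bits * math.log(self.bits / zeros)
```

Raising `scale` lets the same number of bits cover larger populations at the cost of coarser estimates for small ones, which is the trade-off that the multiple, progressively scaled bitmaps on the slide are meant to balance.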

Page 15:

Putting it together

Page 16:

Experience

System design: Sensors and Aggregators
  • Sensors sift through traffic on configurable address space zones of responsibility
  • The aggregator coordinates real-time updates from the sensors, coalesces related signatures and so on

Parameters:
  • Content prevalence threshold: 3
  • Address dispersion threshold: 30
  • Garbage collection time: several hours

Page 17:

prevalence(w) threshold

Page 18:

Address Dispersion threshold

Page 19:

Garbage Collection threshold

Page 20:

Trace-based False Positives

Page 21:

Performance

Processing time:

Memory Consumption: 4M bytes

Page 22:

Live Experience

Detect known worms: CodeRed,

Detect new worms: MyDoom, Sasser, Kibvu.B

Page 23:

Limitation & Extension

Variant content

Network evasion

Extension: Dealing with slow worms

Page 24:

Comparison

Earlybird | Autograph

Infect the system with Network Data (real traces)

Rabin fingerprint

White-list/blacklist

No-prefiltering | Flow-reassembly

Single sensor algorithmics + centralized aggregators | Distributed Deployment + active cooperation between multiple sensors

On-line | Off-line

Overlapping, fixed-length chunks | Non-overlapping, variable-length chunks

Qinghua Zhang

Page 25:

Breather

Page 26:

Polygraph: Automatically Generating Signatures For Polymorphic Worms

James Newsome, Brad Karp, Dawn Song

Page 27:

The case for polymorphic worms

A single substring is insufficient as a signature; it would have to be both
  • Sensitive: it should exist in all payloads of a worm
  • Specific: it should be long enough not to exist in any non-worm payload

Page 28:

Examples

Page 29:

Signature Classes

Signature – set of tokens

Conjunction Signatures

Token-subsequence Signatures

Bayes Signatures

Page 30:

Problem Formulation

Page 31:

Algorithms

Preprocessing
  • Extract tokens: distinct substrings of a minimum length l that occur in at least k samples in the suspicious pool

Generating signatures
  • Conjunction signatures
  • Token-subsequence signatures
  • Bayes signatures
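
The preprocessing step and the three signature classes can be sketched in Python. This is a naive illustration only: token extraction here enumerates every substring (quadratic per sample) instead of using an efficient structure, and it does not prune tokens contained in longer tokens. All function names are hypothetical, and the Bayes scoring shown is simply a sum of per-token scores to be compared with a threshold chosen by the caller.

```python
def candidate_tokens(samples, min_len, k):
    """Distinct substrings of length >= min_len that occur in at least k samples."""
    counts = {}
    for sample in samples:
        seen = set()  # count each substring once per sample
        for i in range(len(sample)):
            for j in range(i + min_len, len(sample) + 1):
                seen.add(sample[i:j])
        for token in seen:
            counts[token] = counts.get(token, 0) + 1
    return {token for token, count in counts.items() if count >= k}


def matches_conjunction(payload: bytes, tokens) -> bool:
    """Conjunction signature: every token appears, in any order."""
    return all(token in payload for token in tokens)


def matches_subsequence(payload: bytes, tokens) -> bool:
    """Token-subsequence signature: tokens appear in the given order, without overlap."""
    position = 0
    for token in tokens:
        index = payload.find(token, position)
        if index < 0:
            return False
        position = index + len(token)
    return True


def bayes_score(payload: bytes, token_scores) -> float:
    """Bayes signature: sum the scores of the tokens present in the payload."""
    return sum(score for token, score in token_scores.items() if token in payload)
```

For example, the tokens `[b"GET /", b"Host: "]` match `b"Host: q GET /"` as a conjunction signature but not as a token-subsequence signature, because the ordering constraint fails.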

Page 32:

Wrap Up

Automated Worm Fingerprinting (OSDI 2004)

Polygraph: Automatically Generating Signatures For Polymorphic Worms (IEEE Symposium on Security and Privacy 2005)

Manan Sanghi