Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware...

33
Hardware Malware Detectors John Demme , Matthew Maycock, Jared Schmitz, Adrian Tang, Adam Waksman, Simha Sethumadhavan, Salvatore Stolfo Computer Science Department, Columbia University 1 Thursday, June 27, 13

Transcript of Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware...

Page 1: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

Hardware Malware DetectorsJohn Demme, Matthew Maycock, Jared Schmitz, Adrian Tang,

Adam Waksman, Simha Sethumadhavan, Salvatore Stolfo

Computer Science Department,Columbia University

1

Thursday, June 27, 13

Page 2: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

2

WormsTrojan Horses

SpywareAdwareBotnets

Thursday, June 27, 13

Page 3: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

0M

25M

50M

75M

100M

125M

2000 2002 2004 2006 2008 2010 2012

Malware in the Wild

AVTest unique database entries (av-test.org)3

Thursday, June 27, 13

Page 4: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

Why we are losing the battle?

100

1,000

10,000

100,000

1,000,000

10,000,000

1985 1991 1997 2003 2009 2015

Line

s of C

ode

DEC Seal

Stalker

Milky Way

Snort

Network FlightRecorder

Unified ThreatManagement

ClamAV (2005)

ClamAV (2012)

Sources: “An analytical framework for cyber security” DARPA’s Peter “Mudge” Zatko 2011 & Ohloh.net

Malware: 125 lines

Growth of Security Code

Expected number

of bugs

4

Thursday, June 27, 13

Page 5: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

State of PracticeApplications

Antivirus

Operating System

Hardware

Hypervisor

• Software-based antivirus runs above the O/S, hypervisor and hardware

• Operating systems and hypervisors have exploited bugs

• Software antivirus is vulnerable and can’t help it!

5

Thursday, June 27, 13

Page 6: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

Proposed Solution

Applications

Operating System

Hardware Antivirus

Hypervisor

• Hardware-based antivirus not susceptible O/S bugs

• Hardware tends to have fewer bugs & exploits

6

Thursday, June 27, 13

Page 7: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

Outline

• Detecting malware with performance counters

• ... using machine learning

• Experimental setup

• Results for Android

• An architecture for hardware antivirus systems

7

Thursday, June 27, 13

Page 8: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

How do we build hardware A/V?

8

Software Hardware

Primitives

How it works

Files, Downloads,File Systems,

Registry Entries,System Calls

Memory,Dynamic Instructions,

uArch Events,System Calls

Scans downloads for static signatures ?

Thursday, June 27, 13

Page 9: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

bzip

2-i1

L1 Exclusive Hits Arithmetic µOps Executed

bzip

2-i2

bzip

2-i3

mcf

hmm

ersje

nglib

quan

tum

h264

omne

tpp

asta

ras

tar

Xala

nc

Programs have Unique µArch Signatures

9

Can µarch footprints uniquely identify malware during

execution?

Could we build a database of malicious

µarch signatures?

bzip

2-i1

L1 Exclusive Hits Arithmetic µOps Executed

bzip

2-i2

bzip

2-i3

mcf

hmm

ersje

nglib

quan

tum

h264

omne

tpp

asta

ras

tar

Xala

nc

Time

Thursday, June 27, 13

Page 10: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

Outline

• Detecting malware with performance counters

• ... using machine learning

• Experimental setup

• Results for Android

• An architecture for hardware antivirus systems

10

Thursday, June 27, 13

Page 11: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

Machine Learning: Classifiers

11

Classifier

VVV

1

2

3

VVV

1

2

3

VVV

1

2

3

VVV

1

2

3

VVV

1

2

3

VVV

1

2

3

VVV

1

2

3

VVV

1

2

3

VVV

1

2

3

VVV

1

2

3

VVV

1

2

3

MalwareNot

Malware

Training(at A/V Vendor)

Production(on consumer device)

V1

VV

2

3

Classifier

Malware?

p( malware )Thursday, June 27, 13

Page 12: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

• Microarchitectural event counts yield performance vectors over time

• Feed each vector into classifier, results in p(malware) over time

• Average over time, decide if malware or not with threshold

bzip

2-i1

L1 Exclusive Hits Arithmetic µOps Executed

bzip

2-i2

bzip

2-i3

mcf

hmm

ersje

nglib

quan

tum

h264

omne

tpp

asta

ras

tar

Xala

nc

bzip

2-i1

L1 Exclusive Hits Arithmetic µOps Executed

bzip

2-i2

bzip

2-i3

mcf

hmm

ersje

nglib

quan

tum

h264

omne

tpp

asta

ras

tar

Xala

nc

Machine Learning on µArch Events

12

bzip

2-i1

L1 Exclusive Hits Arithmetic µOps Executed

bzip

2-i2

bzip

2-i3

mcf

hmm

ersje

nglib

quan

tum

h264

omne

tpp

asta

ras

tar

Xala

nc

p( malware )

Not malware

Time

Classifier

Thursday, June 27, 13

Page 13: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

bzip

2-i1

L1 Exclusive Hits Arithmetic µOps Executed

bzip

2-i2

bzip

2-i3

mcf

hmm

ersje

nglib

quan

tum

h264

omne

tpp

asta

ras

tar

Xala

nc

Malware

• Microarchitectural event counts yield performance vectors over time

• Feed each vector into classifier, results in p(malware) over time

• Average over time, decide if malware or not with threshold

bzip

2-i1

L1 Exclusive Hits Arithmetic µOps Executed

bzip

2-i2

bzip

2-i3

mcf

hmm

ersje

nglib

quan

tum

h264

omne

tpp

asta

ras

tar

Xala

nc

bzip

2-i1

L1 Exclusive Hits Arithmetic µOps Executed

bzip

2-i2

bzip

2-i3

mcf

hmm

ersje

nglib

quan

tum

h264

omne

tpp

asta

ras

tar

Xala

nc

Machine Learning on µArch Events

13

Time

Classifier

p( malware )

Thursday, June 27, 13

Page 14: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

Outline

• Detecting malware with performance counters

• ... using machine learning

• Experimental setup

• Results for Android

• An architecture for hardware antivirus systems

14

Thursday, June 27, 13

Page 15: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

Android Malware Detection

• 17x increase malware detections in 2012 (“Trends for 2013” ESET Latin America’s Lab)

• Android 4.1 mobile O/S

• ARM/TI PandaBoard

• 6 performance counters

15

Thursday, June 27, 13

Page 16: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

Malware Families & Variants

• Family of variants which all do similar things

• Usually packaged with different host software

• Expectation: similar malicious code, different host code

16

GPS LoggerGPS

Logger

GPS Logger

GPS Logger

GPS Logger GPS

Logger

Thursday, June 27, 13

Page 17: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

Can we detect new variants after seeing only old ones?

17

GPS Logger GPS

LoggerGPS

Logger

GPS LoggerGPS

Logger

Train classifier on these malware apps

Evaluate our classifier with different variants in the same family

Thursday, June 27, 13

Page 18: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

Experimental Setup• 503 Malware apps

• 37 families

• Taken from internet repository [1] and previous work [2]

• 210 Non-malware apps

• Most popular apps on Google play

• System applications (ls, bash, com.android.*)

• 3.68e8 data points total

[1] http://contagiominidump.blogspot.com/[2] Y. Zhou and X. Jiang, “Dissecting android malware: Characterization and evolution,” in Security and Privacy (SP), 2012 IEEE Symp. on, pp. 95 –109, may 2012.18

•Tapsnake•Zitmo•Loozfon-android•Android.Steek•Android.Trojan.Qic

somos•CruseWin•Jifake•AnserverBot•Gone60•YZHC•FakePlayer•LoveTrap•Bgserv•KMIN•DroidDreamLight•HippoSMS•Dropdialerab•Zsone

•Endofday•AngryBirds-Lena.C• jSMSHider•Plankton•PJAPPS•Android.Sumzand•RogueSPPush•FakeNetflix•GEINIMI•SndApps•GoldDream•CoinPirate•BASEBRIDGE•DougaLeaker.A•Newzitmo•BeanBot•GGTracker•FakeAngry•DogWars

Thursday, June 27, 13

Page 19: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

Very Realistic, Noisy Data• Network connectivity allowed:

• Malware could phone home

• Additional noise introduced

• Input bias allowed:

• Multiple users conducted data collection

• Environmental noise allowed:

• Malware ran with system applications, not isolated

• Contamination between training & testing data prevented:

• Non-volatile storage wiped, eliminating ‘sticky’ malware

• Training/testing split before data collection

• Makes our task harder, better feasibility study

19

Thursday, June 27, 13

Page 20: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

Outline

• Detecting malware with performance counters

• ... using machine learning

• Experimental setup

• Results for Android

• An architecture for hardware antivirus systems

20

Thursday, June 27, 13

Page 21: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

Experiments in Paper

• Detection of malicious packages on Android

• Detection of malicious threads on Android

• Linux rootkit detection

• Cache side-channel attack detection

21

Thursday, June 27, 13

Page 22: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

RandomPerc

enta

ge o

f Mal

war

e Id

entifi

edDetection Rates

22False Positive Rate

Ideal

Thursday, June 27, 13

Page 23: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

Perc

enta

ge o

f Mal

war

e Id

entifi

ed

23False Positive Rate

Malware Family Detection

Thursday, June 27, 13

Page 24: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

Rootkit Detection

24

Perc

enta

ge o

f Mal

war

e Id

entifi

ed

False Positive Rate

100

80

60

40

20

0 100806040200

Thursday, June 27, 13

Page 25: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

Result Summary• Detection of malicious packages on Android

• 90% accuracy

• Detection of malicious threads on Android

• 80% accuracy

• Linux rootkit detection

• About 60% accuracy

• Difficult problem; rootkits are tiny slices of execution

• Cache side-channel attack detection

• 100% accuracy, no false positives

25

Thursday, June 27, 13

Page 26: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

One Way to Improve Results

• Malware writers package with non-malware.

• Problem: what’s actually malware?

• Our (bad) solution: all of it

• Raises false-positives

26

MalwareNon-Malware(e.g. Angry Birds, Pandora)

Android Software Package

Thursday, June 27, 13

Page 27: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

Outline

• Detecting malware with performance counters

• ... using machine learning

• Experimental setup

• Results for Android

• An architecture for hardware antivirus systems

27

Thursday, June 27, 13

Page 28: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

Recommendations for Hardware A/V System Design1. Provide strong isolation mechanisms to

enable anti-virus software to execute without interference.

2. Investigate both on-chip and off-chip solutions for the AV implementations.

3. Allow performance counters to be read without interrupting the executing process.

4. Ensure that the AV engine can access physical memory safely.

5. Investigate domain-specific optimizations for the AV engine.

6. Increase performance counter coverage and the number of counters available.

7. The AV engine should be flexible enough to enforce a wide range of security policies.

8. Create mechanisms to allow the AV engine to run in the highest privilege mode.

9. Provide support in the AV engine for secure updates.

28

Thursday, June 27, 13

Page 29: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

Recommendations for Hardware A/V System Design1. Provide strong isolation mechanisms to

enable anti-virus software to execute without interference.

2. Investigate both on-chip and off-chip solutions for the AV implementations.

3. Allow performance counters to be read without interrupting the executing process.

4. Ensure that the AV engine can access physical memory safely.

5. Investigate domain-specific optimizations for the AV engine.

6. Increase performance counter coverage and the number of counters available.

7. The AV engine should be flexible enough to enforce a wide range of security policies.

8. Create mechanisms to allow the AV engine to run in the highest privilege mode.

9. Provide support in the AV engine for secure updates.

29

Thursday, June 27, 13

Page 30: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

Strong isolation mechanisms enable anti-virus software to execute without interference

• Non-interruptable

• A/V requires data from cores

• Starvation == exploit

• A/V uses off-die memory

• Starvation == exploit

30

AV

Strongly Isolated

Core

NOC: Rx for performance cntr. data, Tx: security exceptions

Isolated Secure Bus

Thursday, June 27, 13

Page 31: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

Allow Secure System Updates

31

Update Package(Encrypted & signed)

Trained Classifier

Action Program

Revision Number

• Update protocol:

• Decrypt

• Verify signature

• Check revision number

• Disallows access to classifiers & action programs

Thursday, June 27, 13

Page 32: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

Outline

• Detecting malware with performance counters

• ... using machine learning

• Experimental setup

• Results for Android

• An architecture for hardware antivirus systems

32

Thursday, June 27, 13

Page 33: Hardware Malware Detectors - Columbia Universityjdd/papers/isca13_malware_slides.pdf · Hardware Malware Detectors John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, ... Trojan

Contributions(1) First hardware-based antivirus detection

• Promising results, reasons to believe results will improve

• First branch predictors started at 80% accuracy...

(2) Dataset available: http://castl.cs.columbia.edu/colmalset

Much to follow on: 0-day exploit detection, attacks,counterattack malware detection, better machine learning,

precise training labels, ML accelerators, prototypes, etc.

33

Thursday, June 27, 13