Hardware Malware Detectors
John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, Adam Waksman, Simha Sethumadhavan, Salvatore Stolfo
Computer Science Department, Columbia University
Thursday, June 27, 13
Worms, Trojan Horses, Spyware, Adware, Botnets
Malware in the Wild
[Figure: AV-Test unique database entries (av-test.org) per year, 2000 to 2012; y-axis from 0M to 125M]
Why are we losing the battle?
Growth of Security Code
[Figure: lines of code (y-axis, 100 to 10,000,000) versus year (x-axis, 1985 to 2015) for security software: DEC Seal, Stalker, Milky Way, Snort, Network Flight Recorder, Unified Threat Management, ClamAV (2005), ClamAV (2012). Malware is marked at 125 lines. An annotation reads "Expected number of bugs".]
Sources: "An analytical framework for cyber security", DARPA's Peter "Mudge" Zatko, 2011, and Ohloh.net
State of Practice
[Diagram: system stack with layers Applications, Antivirus, Operating System, Hypervisor, Hardware]
• Software-based antivirus runs above the O/S, hypervisor and hardware
• Operating systems and hypervisors have exploited bugs
• Software antivirus is vulnerable and can't help it!
Proposed Solution
[Diagram: system stack with layers Applications, Operating System, Hypervisor, and Hardware, with the Antivirus placed at the hardware layer]
• Hardware-based antivirus is not susceptible to O/S bugs
• Hardware tends to have fewer bugs & exploits
Outline
• Detecting malware with performance counters
• ... using machine learning
• Experimental setup
• Results for Android
• An architecture for hardware antivirus systems
How do we build hardware A/V?
Primitives
• Software: Files, Downloads, File Systems, Registry Entries, System Calls
• Hardware: Memory, Dynamic Instructions, µArch Events, System Calls
How it works
• Software: Scans downloads for static signatures
• Hardware: ?
Programs have Unique µArch Signatures
[Figure: two panels, L1 Exclusive Hits and Arithmetic µOps Executed, plotted over time for bzip2-i1, bzip2-i2, bzip2-i3, mcf, hmmer, sjeng, libquantum, h264, omnetpp, astar, and Xalanc]
Can µarch footprints uniquely identify malware during execution?
Could we build a database of malicious µarch signatures?
Outline
• Detecting malware with performance counters
• ... using machine learning
• Experimental setup
• Results for Android
• An architecture for hardware antivirus systems
Machine Learning: Classifiers
[Diagram: Training (at A/V vendor): performance vectors (V1, V2, V3), each labeled Malware or Not Malware, are fed to the classifier. Production (on consumer device): a new vector (V1, V2, V3) is fed to the classifier, which answers "Malware?" with p(malware).]
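A minimal sketch of the training/production split in the diagram above. The slide does not say which classifier is used, so a k-nearest-neighbor classifier is assumed here purely as a stand-in. The class name, method names, and two-element vectors are hypothetical, and p(malware) is taken to be the fraction of malware labels among the k nearest training vectors.

```python
import math
from typing import List, Sequence, Tuple

# One training sample: (performance-counter vector, is_malware label).
Sample = Tuple[Sequence[float], bool]


class KNearestClassifier:
    """Toy k-nearest-neighbor stand-in for the trained classifier (assumption)."""

    def __init__(self, k: int = 3) -> None:
        self.k = k
        self.samples: List[Sample] = []

    def train(self, samples: Sequence[Sample]) -> None:
        # Training step (at the A/V vendor): store the labeled vectors.
        self.samples = list(samples)

    def p_malware(self, vector: Sequence[float]) -> float:
        # Production step (on the consumer device): score one new vector
        # as the fraction of malware labels among its k nearest neighbors.
        if not self.samples:
            raise ValueError("classifier has not been trained")
        nearest = sorted(
            self.samples, key=lambda s: math.dist(s[0], vector)
        )[: self.k]
        return sum(1 for _, is_malware in nearest if is_malware) / len(nearest)


# Made-up two-counter vectors, for illustration only.
training_set: List[Sample] = [
    ([1.0, 1.0], False), ([1.0, 2.0], False),
    ([9.0, 9.0], True), ([8.0, 9.0], True), ([9.0, 8.0], True),
]
clf = KNearestClassifier(k=3)
clf.train(training_set)
print(clf.p_malware([9.0, 8.5]))
```

In this arrangement only the trained model would need to ship to the device; the labeled training vectors stay with the vendor.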
Machine Learning on µArch Events
• Microarchitectural event counts yield performance vectors over time
• Feed each vector into classifier, results in p(malware) over time
• Average over time, decide if malware or not with threshold
[Figure: traces of L1 Exclusive Hits and Arithmetic µOps Executed are fed to the classifier, which outputs p(malware) over time; in this example the decision is Not malware]
Machine Learning on µArch Events
[Figure: traces of L1 Exclusive Hits and Arithmetic µOps Executed are fed to the classifier, which outputs p(malware) over time; in this example the decision is Malware]
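The three bullets of the "Machine Learning on µArch Events" slides can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: a logistic scorer with made-up weights stands in for the trained classifier, and the 0.5 threshold and all function names are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def logistic_scorer(weights: Vector, bias: float) -> Callable[[Vector], float]:
    """Return a toy classifier mapping one vector to p(malware). Assumption only."""
    def classify(vector: Vector) -> float:
        z = bias + sum(w * x for w, x in zip(weights, vector))
        return 1.0 / (1.0 + math.exp(-z))
    return classify


def detect(
    vectors: List[Vector],
    classify: Callable[[Vector], float],
    threshold: float = 0.5,  # hypothetical decision threshold
) -> Tuple[float, bool]:
    """Average p(malware) over time and compare it against a threshold."""
    if not vectors:
        raise ValueError("need at least one performance vector")
    # Step 2 of the slide: one p(malware) value per sampled vector.
    probabilities = [classify(v) for v in vectors]
    # Step 3 of the slide: average over time, then threshold.
    mean_p = sum(probabilities) / len(probabilities)
    return mean_p, mean_p > threshold


# Step 1 of the slide: event counts sampled over time (made-up numbers).
trace = [[2.0, 0.0], [2.0, 0.0], [1.5, 0.5]]
classifier = logistic_scorer(weights=[1.0, -1.0], bias=0.0)
mean_p, is_malware = detect(trace, classifier)
print(mean_p, is_malware)
```

Averaging before thresholding means a single noisy sample does not flip the decision on its own; the threshold then trades detection rate against false positives.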
Outline
• Detecting malware with performance counters
• ... using machine learning
• Experimental setup
• Results for Android
• An architecture for hardware antivirus systems
Android Malware Detection
• 17x increase in malware detections in 2012 ("Trends for 2013", ESET Latin America's Lab)
• Android 4.1 mobile O/S
• ARM/TI PandaBoard
• 6 performance counters
Malware Families & Variants
• Family of variants which all do similar things
• Usually packaged with different host software
• Expectation: similar malicious code, different host code
[Diagram: six app variants, each labeled "GPS Logger"]
Can we detect new variants after seeing only old ones?
[Diagram: five GPS Logger variants in two groups. One group is labeled "Train classifier on these malware apps"; the other is labeled "Evaluate our classifier with different variants in the same family".]
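A minimal sketch of the split described above: within each malware family, some variants go to training and the remaining, unseen variants go to evaluation. The function name, the one-third training fraction, and the sample data are assumptions for illustration; the slide does not state the proportions used.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

App = Tuple[str, str]  # (family name, variant identifier)


def split_by_family(
    apps: List[App],
    train_fraction: float = 1 / 3,  # hypothetical; not taken from the slides
    seed: int = 0,
) -> Tuple[List[App], List[App]]:
    """Split variants so that every test variant is unseen during training."""
    by_family: Dict[str, List[str]] = defaultdict(list)
    for family, variant in apps:
        by_family[family].append(variant)

    rng = random.Random(seed)
    train: List[App] = []
    test: List[App] = []
    for family in sorted(by_family):
        variants = sorted(by_family[family])
        rng.shuffle(variants)
        n = len(variants)
        k = max(1, int(n * train_fraction))
        if n > 1:
            k = min(k, n - 1)  # keep at least one variant for testing
        train += [(family, v) for v in variants[:k]]
        test += [(family, v) for v in variants[k:]]
    return train, test


# Made-up variant identifiers for two families named in the deck.
apps = [
    ("Tapsnake", "v1"), ("Tapsnake", "v2"), ("Tapsnake", "v3"),
    ("Zitmo", "v1"), ("Zitmo", "v2"),
]
train_apps, test_apps = split_by_family(apps)
print(len(train_apps), len(test_apps))
```

A family with a single variant ends up in training only, since there is no second variant left to evaluate on.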
Experimental Setup
• 503 malware apps
  • 37 families
  • Taken from internet repository [1] and previous work [2]
• 210 non-malware apps
  • Most popular apps on Google Play
  • System applications (ls, bash, com.android.*)
• 3.68e8 data points total
Malware families: Tapsnake, Zitmo, Loozfon-android, Android.Steek, Android.Trojan.Qicsomos, CruseWin, Jifake, AnserverBot, Gone60, YZHC, FakePlayer, LoveTrap, Bgserv, KMIN, DroidDreamLight, HippoSMS, Dropdialerab, Zsone, Endofday, AngryBirds-Lena.C, jSMSHider, Plankton, PJAPPS, Android.Sumzand, RogueSPPush, FakeNetflix, GEINIMI, SndApps, GoldDream, CoinPirate, BASEBRIDGE, DougaLeaker.A, Newzitmo, BeanBot, GGTracker, FakeAngry, DogWars
[1] http://contagiominidump.blogspot.com/
[2] Y. Zhou and X. Jiang, "Dissecting Android malware: Characterization and evolution," in Security and Privacy (SP), 2012 IEEE Symp. on, pp. 95-109, May 2012.
Very Realistic, Noisy Data
• Network connectivity allowed:
  • Malware could phone home
  • Additional noise introduced
• Input bias allowed:
  • Multiple users conducted data collection
• Environmental noise allowed:
  • Malware ran with system applications, not isolated
• Contamination between training & testing data prevented:
  • Non-volatile storage wiped, eliminating 'sticky' malware
  • Training/testing split before data collection
• Makes our task harder, better feasibility study
Outline
• Detecting malware with performance counters
• ... using machine learning
• Experimental setup
• Results for Android
• An architecture for hardware antivirus systems
Experiments in Paper
• Detection of malicious packages on Android
• Detection of malicious threads on Android
• Linux rootkit detection
• Cache side-channel attack detection
Detection Rates
[Figure: Percentage of Malware Identified (y-axis) versus False Positive Rate (x-axis), with Random and Ideal reference curves]
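A plot of percentage of malware identified against false positive rate is typically produced by sweeping the decision threshold over the classifier's scores. The sketch below shows one way to do that, assuming each app has a single averaged p(malware) score; the function name and the scores are made up and are not results from the paper.

```python
from typing import List, Tuple


def roc_points(
    malware_scores: List[float], benign_scores: List[float]
) -> List[Tuple[float, float]]:
    """Return (false positive rate, detection rate) pairs, one per threshold."""
    if not malware_scores or not benign_scores:
        raise ValueError("need scores for both classes")
    # Every distinct score is a candidate threshold, from strictest to loosest.
    thresholds = sorted(set(malware_scores) | set(benign_scores), reverse=True)
    points = [(0.0, 0.0)]  # threshold above every score: nothing is flagged
    for t in thresholds:
        detected = sum(s >= t for s in malware_scores) / len(malware_scores)
        false_pos = sum(s >= t for s in benign_scores) / len(benign_scores)
        points.append((false_pos, detected))
    return points


# Made-up averaged p(malware) scores, for illustration only.
for fpr, tpr in roc_points([0.9, 0.8, 0.4], [0.7, 0.3, 0.1]):
    print(f"false positive rate {fpr:.2f} -> detection rate {tpr:.2f}")
```

On such a plot, a classifier that guesses at random falls on the diagonal, while an ideal one reaches full detection at a false positive rate of zero.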
Malware Family Detection
[Figure: Percentage of Malware Identified (y-axis) versus False Positive Rate (x-axis)]
Rootkit Detection
[Figure: Percentage of Malware Identified (y-axis, 0 to 100) versus False Positive Rate (x-axis, 0 to 100)]
Result Summary
• Detection of malicious packages on Android
  • 90% accuracy
• Detection of malicious threads on Android
  • 80% accuracy
• Linux rootkit detection
  • About 60% accuracy
  • Difficult problem; rootkits are tiny slices of execution
• Cache side-channel attack detection
  • 100% accuracy, no false positives
One Way to Improve Results
• Malware writers package with non-malware.
• Problem: what’s actually malware?
• Our (bad) solution: all of it
• Raises false-positives
[Diagram: an Android Software Package containing both Malware and Non-Malware (e.g. Angry Birds, Pandora)]
Outline
• Detecting malware with performance counters
• ... using machine learning
• Experimental setup
• Results for Android
• An architecture for hardware antivirus systems
Recommendations for Hardware A/V System Design
1. Provide strong isolation mechanisms to enable anti-virus software to execute without interference.
2. Investigate both on-chip and off-chip solutions for the AV implementations.
3. Allow performance counters to be read without interrupting the executing process.
4. Ensure that the AV engine can access physical memory safely.
5. Investigate domain-specific optimizations for the AV engine.
6. Increase performance counter coverage and the number of counters available.
7. The AV engine should be flexible enough to enforce a wide range of security policies.
8. Create mechanisms to allow the AV engine to run in the highest privilege mode.
9. Provide support in the AV engine for secure updates.
Strong isolation mechanisms enable anti-virus software to execute without interference
• Non-interruptible
• A/V requires data from cores
  • Starvation == exploit
• A/V uses off-die memory
  • Starvation == exploit
[Diagram: the AV runs on a Strongly Isolated Core. NOC: Rx for performance counter data, Tx for security exceptions. An Isolated Secure Bus is also shown.]
Allow Secure System Updates
[Diagram: Update Package (encrypted & signed) containing Trained Classifier, Action Program, and Revision Number]
• Update protocol:
  • Decrypt
  • Verify signature
  • Check revision number
• Disallows access to classifiers & action programs
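The update protocol above (decrypt, verify signature, check revision number) can be sketched as follows. This is a toy under loud assumptions: an HMAC-SHA256 tag stands in for a real digital signature, decryption is passed in as a callable because the slides name no cipher, the package is assumed to be JSON, and the revision check is assumed to reject anything not newer than the installed revision. All names are hypothetical.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class Update:
    classifier: str       # stand-in for the trained classifier
    action_program: str   # stand-in for the action program
    revision: int


def apply_update(
    blob: bytes,
    tag: bytes,
    key: bytes,
    decrypt: Callable[[bytes], bytes],
    installed_revision: int,
) -> Update:
    # 1. Decrypt the package (cipher supplied by the caller; assumption).
    plaintext = decrypt(blob)
    # 2. Verify the signature (HMAC used here only as a stand-in).
    expected = hmac.new(key, plaintext, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("signature check failed")
    fields = json.loads(plaintext)
    update = Update(
        fields["classifier"], fields["action_program"], int(fields["revision"])
    )
    # 3. Check the revision number (assumed to block rollbacks and replays).
    if update.revision <= installed_revision:
        raise ValueError("revision is not newer than the installed one")
    return update


# Toy usage: byte reversal is a placeholder transform, NOT real encryption.
key = b"demo-key"
payload = json.dumps(
    {"classifier": "model-v2", "action_program": "alert", "revision": 2}
).encode()
tag = hmac.new(key, payload, hashlib.sha256).digest()
update = apply_update(payload[::-1], tag, key, lambda b: b[::-1], installed_revision=1)
print(update.revision)
```

A real design would use an authenticated cipher and a public-key signature so that the device holds no secret capable of forging updates.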
Outline
• Detecting malware with performance counters
• ... using machine learning
• Experimental setup
• Results for Android
• An architecture for hardware antivirus systems
Contributions
(1) First hardware-based antivirus detection
• Promising results, reasons to believe results will improve
• First branch predictors started at 80% accuracy...
(2) Dataset available: http://castl.cs.columbia.edu/colmalset
Much to follow on: 0-day exploit detection, attacks, counterattack malware detection, better machine learning, precise training labels, ML accelerators, prototypes, etc.