Evolutionary Computation for Improving Malware Analysis-12pt · Evolutionary Computation for...
Transcript of Evolutionary Computation for Improving Malware Analysis-12pt · Evolutionary Computation for...
![Page 1: Evolutionary Computation for Improving Malware Analysis-12pt · Evolutionary Computation for Improving Malware Analysis Kevin Leach1, Ryan Dougherty2, Chad Spensky3, Stephanie Forrest2,](https://reader033.fdocuments.in/reader033/viewer/2022042418/5f33d82339828c360c6a459c/html5/thumbnails/1.jpg)
Evolutionary Computation forImproving Malware Analysis
Kevin Leach1, Ryan Dougherty2, Chad Spensky3,Stephanie Forrest2, Westley Weimer1
1University of Michigan2Arizona State University
3University of California, Santa Babara
June 1, 2019
1/18
![Page 2: Evolutionary Computation for Improving Malware Analysis-12pt · Evolutionary Computation for Improving Malware Analysis Kevin Leach1, Ryan Dougherty2, Chad Spensky3, Stephanie Forrest2,](https://reader033.fdocuments.in/reader033/viewer/2022042418/5f33d82339828c360c6a459c/html5/thumbnails/2.jpg)
Setting the Stage
Ï Can we improve the efficiency ofautomated malware analysis usingevolutionary computation?
2/18
![Page 3: Evolutionary Computation for Improving Malware Analysis-12pt · Evolutionary Computation for Improving Malware Analysis Kevin Leach1, Ryan Dougherty2, Chad Spensky3, Stephanie Forrest2,](https://reader033.fdocuments.in/reader033/viewer/2022042418/5f33d82339828c360c6a459c/html5/thumbnails/3.jpg)
Introduction
3/18
![Page 4: Evolutionary Computation for Improving Malware Analysis-12pt · Evolutionary Computation for Improving Malware Analysis Kevin Leach1, Ryan Dougherty2, Chad Spensky3, Stephanie Forrest2,](https://reader033.fdocuments.in/reader033/viewer/2022042418/5f33d82339828c360c6a459c/html5/thumbnails/4.jpg)
Malware Analysis
Ï Analysts want to quickly identifymalware behavior
Ï What damage does it do?Ï How does it infect a system?Ï How do we defend against it?
4/18
![Page 5: Evolutionary Computation for Improving Malware Analysis-12pt · Evolutionary Computation for Improving Malware Analysis Kevin Leach1, Ryan Dougherty2, Chad Spensky3, Stephanie Forrest2,](https://reader033.fdocuments.in/reader033/viewer/2022042418/5f33d82339828c360c6a459c/html5/thumbnails/5.jpg)
Stealthy MalwareÏ Growing volume of stealthy malwareÏ Malware sample maintains secrecy by using
artifacts to detect analysis environmentsÏ Timing artifacts — overhead introduced by analysis
Ï Single-stepping instructions with debugger is slowÏ Imperfect VM environment does not match native speed
Ï Functional artifacts — features introduced by analysisÏ isDebuggerPresent() — legitimate feature abused by
adversariesÏ Incomplete emulation of some instructions by VMÏ Device names (hard drive named “VMWare disk”)
Ï Automated analysis is difficult
5/18
![Page 6: Evolutionary Computation for Improving Malware Analysis-12pt · Evolutionary Computation for Improving Malware Analysis Kevin Leach1, Ryan Dougherty2, Chad Spensky3, Stephanie Forrest2,](https://reader033.fdocuments.in/reader033/viewer/2022042418/5f33d82339828c360c6a459c/html5/thumbnails/6.jpg)
Automated Malware Analysis
Ï Cluster of servers analyzes malwareÏ Analysis success depends on cluster’s environment
(e.g., OS, virtualization)
6/18
![Page 7: Evolutionary Computation for Improving Malware Analysis-12pt · Evolutionary Computation for Improving Malware Analysis Kevin Leach1, Ryan Dougherty2, Chad Spensky3, Stephanie Forrest2,](https://reader033.fdocuments.in/reader033/viewer/2022042418/5f33d82339828c360c6a459c/html5/thumbnails/7.jpg)
Transparency
Ï We want to understand stealthy samples
Ï We can mitigate artifactsÏ Hook/intercept API calls
(e.g., isDebuggerPresent())Ï Spoof timing
(e.g., virtualize result of rdtsc instruction)Ï Use alternate virtualization
(e.g., a sample that detects VMWare may not detectVirtualBox)
7/18
![Page 8: Evolutionary Computation for Improving Malware Analysis-12pt · Evolutionary Computation for Improving Malware Analysis Kevin Leach1, Ryan Dougherty2, Chad Spensky3, Stephanie Forrest2,](https://reader033.fdocuments.in/reader033/viewer/2022042418/5f33d82339828c360c6a459c/html5/thumbnails/8.jpg)
Cost of TransparencyÏ Mitigation costs resources
Ï Development effort(e.g., modifying virtualization)
Ï Execution time(e.g., due to runtime overhead)
Ï Mitigation covers some subset ofmalware
Ï Artifact category(e.g., hooking disk-related APIs coversmalware that checks the disk)
8/18
![Page 9: Evolutionary Computation for Improving Malware Analysis-12pt · Evolutionary Computation for Improving Malware Analysis Kevin Leach1, Ryan Dougherty2, Chad Spensky3, Stephanie Forrest2,](https://reader033.fdocuments.in/reader033/viewer/2022042418/5f33d82339828c360c6a459c/html5/thumbnails/9.jpg)
Key Idea: Transparency/Cost
Ï We can control which artifacts areexposed
Ï i.e., control costs and coverage
Ï Use a vector of yes/no answers
VMWare? large disk? Spoof timers? cost
0 1 0 x1 0 1 y
9/18
![Page 10: Evolutionary Computation for Improving Malware Analysis-12pt · Evolutionary Computation for Improving Malware Analysis Kevin Leach1, Ryan Dougherty2, Chad Spensky3, Stephanie Forrest2,](https://reader033.fdocuments.in/reader033/viewer/2022042418/5f33d82339828c360c6a459c/html5/thumbnails/10.jpg)
Key Idea: Transparency/Cost
Ï Derive a cost model empiricallyÏ What are the values of x and y?
VMWare? large disk? Spoof timers? cost
0 1 0 x1 0 1 y
10/18
![Page 11: Evolutionary Computation for Improving Malware Analysis-12pt · Evolutionary Computation for Improving Malware Analysis Kevin Leach1, Ryan Dougherty2, Chad Spensky3, Stephanie Forrest2,](https://reader033.fdocuments.in/reader033/viewer/2022042418/5f33d82339828c360c6a459c/html5/thumbnails/11.jpg)
Automated Malware Analysis
Ï Can we analyze more stealthy malware ifwe offer different analysis environments?
11/18
![Page 12: Evolutionary Computation for Improving Malware Analysis-12pt · Evolutionary Computation for Improving Malware Analysis Kevin Leach1, Ryan Dougherty2, Chad Spensky3, Stephanie Forrest2,](https://reader033.fdocuments.in/reader033/viewer/2022042418/5f33d82339828c360c6a459c/html5/thumbnails/12.jpg)
Proposed Architecture
Covering array GI algorithm
Cost model Artifacts
Server Cluster
StealthyMalware
1©
2©
3©
12/18
![Page 13: Evolutionary Computation for Improving Malware Analysis-12pt · Evolutionary Computation for Improving Malware Analysis Kevin Leach1, Ryan Dougherty2, Chad Spensky3, Stephanie Forrest2,](https://reader033.fdocuments.in/reader033/viewer/2022042418/5f33d82339828c360c6a459c/html5/thumbnails/13.jpg)
Proposed Approach
Given: Cost model, number of serversFind: settings that minimize cost, maximize coverage
Config Server A0 A1 A2 cost
1 1 0 1 0 32 1 0 1 2
2 1 0 1 1 42 1 1 0 3
3 1 0 0 0 12 1 1 1 7
13/18
![Page 14: Evolutionary Computation for Improving Malware Analysis-12pt · Evolutionary Computation for Improving Malware Analysis Kevin Leach1, Ryan Dougherty2, Chad Spensky3, Stephanie Forrest2,](https://reader033.fdocuments.in/reader033/viewer/2022042418/5f33d82339828c360c6a459c/html5/thumbnails/14.jpg)
Proposed Approach
Given: Cost model, number of serversFind: settings that minimize cost, maximize coverage
Config Server A0 A1 A2 cost
1 1 0 1 0 32 1 0 1 2
2 1 0 1 1 42 1 1 0 3
3 1 0 0 0 12 1 1 1 7
14/18
![Page 15: Evolutionary Computation for Improving Malware Analysis-12pt · Evolutionary Computation for Improving Malware Analysis Kevin Leach1, Ryan Dougherty2, Chad Spensky3, Stephanie Forrest2,](https://reader033.fdocuments.in/reader033/viewer/2022042418/5f33d82339828c360c6a459c/html5/thumbnails/15.jpg)
Proposed Approach
Given: Cost model, number of serversFind: settings that minimize cost, maximize coverage
Config Server A0 A1 A2 cost
1 1 0 1 0 32 1 0 1 2
2 1 0 1 1 42 1 1 0 3
3 1 0 0 0 12 1 1 1 7
15/18
![Page 16: Evolutionary Computation for Improving Malware Analysis-12pt · Evolutionary Computation for Improving Malware Analysis Kevin Leach1, Ryan Dougherty2, Chad Spensky3, Stephanie Forrest2,](https://reader033.fdocuments.in/reader033/viewer/2022042418/5f33d82339828c360c6a459c/html5/thumbnails/16.jpg)
Initial ResultsÏ Challenge Ground-truth data is hard to
come byÏ Manually reverse engineered 20 samples
Ï Baseline analyze each withhigh-transparency, high-cost analysis(e.g., all 1’s)
Ï roughly 360x overhead
Ï Cost Function We estimated a costfunction based manual review
16/18
![Page 17: Evolutionary Computation for Improving Malware Analysis-12pt · Evolutionary Computation for Improving Malware Analysis Kevin Leach1, Ryan Dougherty2, Chad Spensky3, Stephanie Forrest2,](https://reader033.fdocuments.in/reader033/viewer/2022042418/5f33d82339828c360c6a459c/html5/thumbnails/17.jpg)
Initial ResultsBaseline Us / 2 servers Us / 4 servers Us / 8 servers
360x 16.14 – 64.51 12.06 – 49.39 1.05 – 30.92
Ï Promising improvements to throughput(potentially, by 1–2 orders of magnitude)
Ï Future workÏ Ongoing analysis involving 20k+ stealthy
samplesÏ Need to empirically derive cost function
(rather than manual assessment)
17/18
![Page 18: Evolutionary Computation for Improving Malware Analysis-12pt · Evolutionary Computation for Improving Malware Analysis Kevin Leach1, Ryan Dougherty2, Chad Spensky3, Stephanie Forrest2,](https://reader033.fdocuments.in/reader033/viewer/2022042418/5f33d82339828c360c6a459c/html5/thumbnails/18.jpg)
Conclusion
Ï We can control which artifactsare exposed to stealthy malwaresamples
Ï We can evolve a set of analysisserver configurations tomaximize coverage whileminimizing cost
18/18