English Shellcode
Joshua Mason, Sam Small, Johns Hopkins University
Fabian Monrose, University of North Carolina
Greg MacManus, iSIGHT Partners
16th ACM CCS
Advanced Defense Lab

Outline
Introduction
On the arms race
Related work
Our approach
Automatic generation
Implementation
Evaluation

Introduction
Code-injection attacks
  Source code for a scripting language
  Byte-code
  Machine code
The common component
  The injected code, or … shellcode

Misconception
Shellcode is delivered in tandem with the exploitation.
  Store shellcode in memory, then exploit
Shellcode takes the form of directly executable machine code.
  Polymorphism

Misconception…?
Even polymorphic shellcode is constrained by an essential component: the decoder.
Shellcode is fundamentally different in structure from non-executable payload data.
  This paper!!!
[Figure: a decoder followed by encoded data]

About This Paper
Automatically producing English shellcode
It is not indistinguishable from authentic English prose.
  But do you want to analyze it?

On The Arms Race
Shellcode developers are often faced with constraints that limit the range of byte values accepted.
  E.g. printable, alphanumeric, MIME
Encoding
Self-modification

On The Arms Race
Much of the literature describing code-injection attacks assumes a standard attack template.
  A NOP sled, shellcode, and one or more pointers
Emulation and static analysis have been successful in identifying some failings of advanced shellcode.
  But … overhead

On The Arms Race
It has been suggested that malicious polymorphic behavior cannot be modeled effectively.
  "On the Infeasibility of Modeling Polymorphic Shellcode" by Y. Song et al.

Related Work
Limiting the spoils of exploitation and preventing developers from writing vulnerable code
Preventing the execution of injected code
Content-based input validation
  Polymorphic
    ▪ To identify self-decrypting shellcode
    ▪ But … non-self-contained polymorphic shellcode

Our Approach
Shellcode is simply an ordered list of machine instructions.
  "Shake Shake Shake!"
  push %ebx; push "ake "; push %ebx; push "ake "; push %ebx; push "ake!";
  But what about add, mov, and call?
To develop an automated approach
  Arbitrary shellcode → English representation
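The "Shake Shake Shake!" example can be checked by hand: in 32-bit x86, the byte for "S" (0x53) is the one-byte `push %ebx`, and the byte for "h" (0x68) is `push imm32`, which consumes the next four bytes as its operand. The following is a minimal sketch of a disassembler that understands only these two instruction forms; the function name `disassemble_pushes` is hypothetical and the code is an illustration, not the authors' tooling.

```python
# One-byte "push r32" opcodes are 0x50..0x57, in this register order.
PUSH_REG = {0x50 + i: reg for i, reg in enumerate(
    ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"])}
PUSH_IMM32 = 0x68  # "push imm32": opcode followed by a 4-byte operand


def disassemble_pushes(data):
    """Decode a byte string made only of push r32 / push imm32 (hypothetical helper)."""
    listing = []
    i = 0
    while i < len(data):
        opcode = data[i]
        if opcode in PUSH_REG:
            listing.append("push %" + PUSH_REG[opcode])
            i += 1
        elif opcode == PUSH_IMM32:
            operand = data[i + 1:i + 5]
            if len(operand) < 4:
                raise ValueError("truncated push imm32 operand")
            # Show the operand as the text it spells, as the slide does.
            listing.append('push "' + operand.decode("ascii") + '"')
            i += 5
        else:
            raise ValueError("unsupported opcode 0x%02x" % opcode)
    return listing


for line in disassemble_pushes(b"Shake Shake Shake!"):
    print(line)
```

Any other opcode raises an error, which mirrors the slide's point: push-only text is easy to find, but useful shellcode also needs instructions such as add, mov, and call.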

High-level Overview
English shellcode is completely self-contained.

The Decoder
The decoder must be English-compatible.
  Cannot use many instructions
    ▪ E.g. loop instructions
Our decoder has the form:
  Initialization
  Decoder
  Encoded payload

The Decoder Principle
Only English-compatible instructions
English-compatible instructions that can produce useful instructions
Favor instructions that have less-constrained ASCII equivalents
  push %eax ("P") > push %ecx ("Q")
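One plausible reading of "less-constrained" is that the ASCII letter encoding an instruction should be easy to place in English text. The sketch below ranks the eight one-byte `push r32` opcodes (0x50 to 0x57, the letters "P" to "W") by letter frequency. The frequency values are approximate, case-insensitive published figures for English and are an assumption of this example, as are the helper names; the slides do not state how the authors measured constraint.

```python
# Register order for the one-byte "push r32" opcodes 0x50..0x57.
PUSH_REGISTERS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

# Approximate English letter frequencies in percent (assumed, case-insensitive).
APPROX_LETTER_FREQ = {
    "P": 1.93, "Q": 0.095, "R": 5.99, "S": 6.33,
    "T": 9.06, "U": 2.76, "V": 0.98, "W": 2.36,
}


def push_opcode_char(register):
    """ASCII character whose byte value encodes push %register."""
    return chr(0x50 + PUSH_REGISTERS.index(register))


def rank_push_registers():
    """Registers ordered from most to least common encoding letter."""
    return sorted(
        PUSH_REGISTERS,
        key=lambda reg: APPROX_LETTER_FREQ[push_opcode_char(reg)],
        reverse=True,
    )


for reg in rank_push_registers():
    print("push %%%s -> %r" % (reg, push_opcode_char(reg)))
```

Under these assumed frequencies `push %eax` ("P") ranks above `push %ecx` ("Q"), which agrees with the preference stated on the slide.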

Decoder - Initialization
Overwriting registers and patching some instructions
Using the inc instruction and manipulating the alignment of the stack

Decoder - Unpacking
"and r/m8, r8" (0x20, the ASCII space character)
  add
    ▪ lods (load string from %esi)
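Because 0x20 is the opcode for `and r/m8, r8`, a space followed by a letter starts an `and` instruction whose operands come from the letter's ModR/M bits. Every ASCII letter lies in 0x40 to 0x7F, so its top two bits select the `[reg32 + disp8]` addressing form and the next character becomes the displacement. The sketch below decodes that one form; the function name is hypothetical, and letters whose low three bits are 100 (such as "d", "l", "t") need a SIB byte and are deliberately rejected.

```python
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]
# Index 4 means "SIB byte follows" in this addressing form, so no register name.
REG32 = ["eax", "ecx", "edx", "ebx", None, "ebp", "esi", "edi"]


def decode_space_and(data):
    """Decode ' ' + ModR/M + disp8 as 'and r8, disp8(r32)' in AT&T syntax."""
    if len(data) < 3 or data[0] != 0x20:
        raise ValueError("expected 0x20 followed by ModR/M and disp8")
    modrm = data[1]
    mod = modrm >> 6          # addressing mode
    reg = (modrm >> 3) & 7    # 8-bit source register
    rm = modrm & 7            # 32-bit base register
    if mod != 0b01 or rm == 0b100:
        raise ValueError("this sketch only handles the [reg32 + disp8] form")
    disp = data[2] if data[2] < 0x80 else data[2] - 0x100  # signed 8-bit
    return "and %%%s, %#x(%%%s)" % (REG8[reg], disp, REG32[rm])


print(decode_space_and(b" an"))
```

For the text " an", the byte for "a" (0x61) selects `%ah` as the source and `%ecx` as the base, and "n" (0x6e) is the displacement.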

Decoder - Decoding
Two pointers: %esi, %edi
"," and " "
"u" and "decode" "G"

Decoder - Initializing Registers
Using the popa instruction (ASCII character "a")

Automatic Generation
Taken as-is, the custom decoder consists of common English characters but does not have the appearance of English text.
Add some instructions between decoder instructions
Augmenting a statistical language generation algorithm

Automatic Generation
The n-gram model length is 5.
The i-th instruction in the decoder has level i.
A sentence has score i when it completes level i.
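\[
\begin{aligned}
P(W_1 W_2 \cdots W_n) &= P(W_1)\, P(W_2 \mid W_1)\, P(W_3 \mid W_1 W_2) \cdots P(W_n \mid W_1 W_2 \cdots W_{n-1}) \\
&\approx P(W_1)\, P(W_2 \mid W_1)\, P(W_3 \mid W_2) \cdots P(W_n \mid W_{n-1})
\end{aligned}
\]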
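The chain-rule factorization on this slide, truncated to one word of history, can be evaluated directly from counts. The sketch below trains bigram counts on a toy sentence and scores word sequences in log space. The toy corpus, the add-one smoothing, and the function names are assumptions made for illustration; the slides specify a 5-gram model trained on a far larger corpus.

```python
import math
from collections import Counter


def train_bigrams(tokens):
    """Count unigrams and adjacent word pairs in a token list."""
    return Counter(tokens), Counter(zip(tokens, tokens[1:]))


def sentence_log_prob(words, unigrams, bigrams):
    """log P(W1) + sum of log P(Wi | Wi-1), with add-one smoothing (assumed)."""
    vocab = len(unigrams)
    total = sum(unigrams.values())
    logp = math.log((unigrams[words[0]] + 1) / (total + vocab))
    for prev, cur in zip(words, words[1:]):
        logp += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab))
    return logp


corpus = "the cat sat on the mat the cat ate".split()
uni, bi = train_bigrams(corpus)
print(sentence_log_prob("the cat sat".split(), uni, bi))
print(sentence_log_prob("sat the cat".split(), uni, bi))
```

The word order seen in the corpus scores higher than the shuffled one, which is the property a generator relies on when it prefers candidate text that looks like English.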

Using a beam search algorithm
  Keep the best m (= 20,000) candidates during the process
For the encoded payload, observe how many target bytes are encoded
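Beam search extends every surviving candidate by each possible next word, then keeps only the best m. The following is a minimal sketch over a toy next-word table; the table, the `beam_search` signature, and scoring by log-probability alone are assumptions of this example. The generator described in the slides uses m = 20,000 and also scores candidates by the decoder level they complete, which this sketch does not model.

```python
import heapq
import math

# Toy next-word model (assumed): word -> list of (next_word, probability).
TOY_MODEL = {
    "the": [("cat", 0.6), ("dog", 0.4)],
    "cat": [("sat", 0.9), ("ran", 0.1)],
    "dog": [("sat", 0.7), ("ran", 0.3)],
}


def beam_search(start, successors, steps, beam_width):
    """Return up to beam_width (log_prob, words) candidates after `steps` expansions."""
    beam = [(0.0, [start])]
    for _ in range(steps):
        expanded = []
        for logp, words in beam:
            for nxt, prob in successors(words[-1]):
                expanded.append((logp + math.log(prob), words + [nxt]))
        if not expanded:
            break  # no candidate can be extended further
        # Prune: keep only the beam_width highest-scoring candidates.
        beam = heapq.nlargest(beam_width, expanded, key=lambda c: c[0])
    return beam


result = beam_search("the", lambda w: TOY_MODEL.get(w, []), steps=2, beam_width=2)
for logp, words in result:
    print(round(math.exp(logp), 2), " ".join(words))
```

A wider beam costs more time and memory per step but is less likely to prune a candidate that would have scored well later.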

Implementation
The training data
  Over 15,000 Wikipedia articles
  27,000 books from Project Gutenberg
The language engine was constructed in Java using the LingPipe API
The scoring engine uses the ptrace API
  Executor
  Watcher
Taking 12 hours

An Optimized Design
Emulation
  Expands 1 instruction into tens of instructions
Monitored direct execution
  Maintain 2 machine states
  Use 3 separate stacks
  Pause on 2 conditions
    ▪ Encountering a jump
    ▪ Changing memory
Takes roughly less than 1 hour

Evaluation
Exit(0) 2054 bytes

Comparison with Spectrum Analysis
Windows Bind DLL Inject