Qemu Symbolic Execution
Qemu code fault automatic discovery with symbolic search
Paul Marinescu, Cristian Cadar, Chunjie Zhu, Philippe Gabriel
Goals of this presentation
Introduction of KLEE (symbolic execution tool)
Qemu fault/patch retrospective
Understand how Qemu-dm works
Qemu code check by symbolic execution
Work on the way
Introduction of KLEE (symbolic execution tool) - 1
https://github.com/klee/klee
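#include <klee/klee.h>

int foo(int x, int y) {
    int ret = 0;
    if (x + y < 15) {
        if (y != 10)
            ret = 1;
        else
            ret = 2;
    } else {
        if (y != 10)
            ret = 3;
        else
            ret = 4;
    }
    return ret;
}

int main(void) {
    int a, b;
    /* Mark both inputs as symbolic so that KLEE explores every feasible path of foo. */
    klee_make_symbolic(&a, sizeof(a), "a");
    klee_make_symbolic(&b, sizeof(b), "b");
    return foo(a, b);
}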
Introduction of KLEE (symbolic execution tool) - 2
See real execution paths explored by KLEE
test000001.ktest
args       : ['test.o']
num objects: 2
object 0: name: b'a'
object 0: size: 4
object 0: data: 2147483635
object 1: name: b'b'
object 1: size: 4
object 1: data: 10

test000002.ktest
args       : ['test.o']
num objects: 2
object 0: name: b'a'
object 0: size: 4
object 0: data: 0
object 1: name: b'b'
object 1: size: 4
object 1: data: 0

test000003.ktest
args       : ['test.o']
num objects: 2
object 0: name: b'a'
object 0: size: 4
object 0: data: 2110308033
object 1: name: b'b'
object 1: size: 4
object 1: data: 37170385

test000004.ktest (int32 overflow)
args       : ['test.o']
num objects: 2
object 0: name: b'a'
object 0: size: 4
object 0: data: 2147483640
object 1: name: b'b'
object 1: size: 4
object 1: data: 10
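The four inputs can be cross-checked by replaying them against the branches of foo. The following is a minimal sketch, not KLEE's own replay mechanism. The names foo_wrapping and replay_klee_tests are made up for this illustration, and the sum is computed in unsigned arithmetic because the int32 overflow case relies on wraparound, which ISO C leaves undefined for signed int.

```c
#include <assert.h>
#include <stdint.h>

/* Same branches as foo(), with the 32-bit wraparound of x + y made explicit.
 * Unsigned addition wraps modulo 2^32; converting the result back to int32_t
 * is implementation-defined and yields the two's-complement value on GCC and
 * Clang (assumed here). */
int foo_wrapping(int32_t x, int32_t y) {
    int32_t sum = (int32_t)((uint32_t)x + (uint32_t)y);
    if (sum < 15)
        return (y != 10) ? 1 : 2;
    return (y != 10) ? 3 : 4;
}

/* Replays the four generated inputs and checks which return value each reaches. */
void replay_klee_tests(void) {
    assert(foo_wrapping(2147483635, 10) == 4);        /* test000001 */
    assert(foo_wrapping(0, 0) == 1);                  /* test000002 */
    assert(foo_wrapping(2110308033, 37170385) == 3);  /* test000003 */
    assert(foo_wrapping(2147483640, 10) == 2);        /* test000004: sum wraps negative */
}
```

Each of the four return values is reached by exactly one test case. test000004 reaches ret = 2 only because 2147483640 + 10 wraps to a negative number and so satisfies x + y < 15.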
Introduction of KLEE (symbolic execution tool) - 3
How does KLEE work
compile the target program to LLVM bitcode
the core engine acts as a virtual machine that executes the LLVM bitcode symbolically
it traverses as many code paths as possible within a given time budget (what about infinite loops?)
  1. depth-first search/breadth-first search/non-uniform random search
  2. query-cost optimization/code-coverage optimization
it asks the constraint solver for a solution whenever execution reaches a branch
special case handling
  1. the constraint solver does not support symbolic-sized objects, e.g. malloc(size)
  2. external environment modeling (e.g. file system access)
one test case is generated each time a code path reaches its end or hits an error
the test cases can be replayed after the KLEE run completes
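The loop described above (fork at each branch, ask a solver for inputs, emit one test case per finished path) can be sketched as a toy. This is not KLEE's implementation: the state struct, explore, and the brute-force solve are assumptions made for illustration, the branches of the foo example are hard-coded, and the "solver" is only queried once a path is complete rather than at every branch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The program under test: same branches as the foo example. */
int foo(int x, int y) {
    if (x + y < 15)
        return (y != 10) ? 1 : 2;
    return (y != 10) ? 3 : 4;
}

/* A partially explored path: outcomes chosen for the first `depth` branches. */
struct state {
    int  depth;
    bool taken[2];
};

/* One generated test case: concrete inputs for one complete path. */
struct test_case {
    int x, y;
};

/* Path condition of a complete path, evaluated on concrete inputs. */
static bool path_holds(const struct state *s, int x, int y) {
    return ((x + y < 15) == s->taken[0]) && ((y != 10) == s->taken[1]);
}

/* Stand-in for the constraint solver: brute force over a small input range. */
static bool solve(const struct state *s, struct test_case *tc) {
    for (int x = -20; x <= 20; x++)
        for (int y = -20; y <= 20; y++)
            if (path_holds(s, x, y)) {
                tc->x = x;
                tc->y = y;
                return true;
            }
    return false;
}

/* Depth-first exploration with an explicit stack.
 * Returns the number of test cases written to out (at most cap). */
size_t explore(struct test_case *out, size_t cap) {
    struct state stack[8];
    size_t top = 0, n = 0;
    struct state init = { 0, { false, false } };
    stack[top++] = init;

    while (top > 0) {
        struct state s = stack[--top];
        if (s.depth == 2) {                 /* the path reached its end */
            if (n < cap && solve(&s, &out[n]))
                n++;                        /* one test case per feasible path */
            continue;
        }
        /* Fork at the branch: push the false side first so that the
         * true side is popped and explored first. */
        for (int side = 0; side <= 1; side++) {
            struct state next = s;
            next.taken[next.depth] = (side == 1);
            next.depth++;
            assert(top < sizeof stack / sizeof stack[0]);
            stack[top++] = next;
        }
    }
    return n;
}
```

Replacing the stack with a queue would turn the traversal into breadth-first search; the rest of the loop stays the same.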
Introduction of KLEE (symbolic execution tool) - 4
Success story (see http://llvm.org/pubs/2008-12-OSDI-KLEE.pdf)
Qemu fault/patch retrospective - 1
Qemu buffer overflow
CVE-2015-4106, Qemu does not restrict write access to the PCI config space for PCI pass-through
CVE-2015-3456, floppy disk controller issue
CVE-2015-2752, XEN_DOMCTL_memory_mapping hypercall issue
others
Postmortem: is there a way to spot potential vulnerabilities automatically?
Qemu fault/patch retrospective - 2
Solutions
fuzz testing
  a) treats Qemu as a black box
  b) generates random input for Qemu; easy to implement
  c) has a very hard time reaching some code paths (e.g. for a branch on int32 x == 764387, an unguided random input has a 1/2^32 chance of hitting it)
  d) not reproducible
symbolic execution
  a) keeps an internal representation of Qemu's state
  b) generates stable test cases that reproduce any code fault
  c) higher code coverage
  d) difficult to adopt
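The 1/2^32 figure can be made concrete with a small sketch. Everything here is illustrative: hits_branch, fuzz_hits and solve_branch are hypothetical names, and xorshift32 merely stands in for whatever random source a fuzzer would use.

```c
#include <assert.h>
#include <stdint.h>

#define MAGIC 764387

/* The hard-to-reach branch: taken for exactly one of the 2^32 int32 values. */
int hits_branch(int32_t x) {
    return x == MAGIC;
}

/* xorshift32: a small pseudo-random generator used as the fuzzer's input source. */
static uint32_t xorshift32(uint32_t *state) {
    uint32_t s = *state;
    s ^= s << 13;
    s ^= s >> 17;
    s ^= s << 5;
    return *state = s;
}

/* Black-box fuzzing: feed `tries` random inputs and count how many reach the branch. */
unsigned fuzz_hits(uint32_t seed, unsigned tries) {
    assert(seed != 0);                  /* xorshift32 never leaves state 0 */
    unsigned hits = 0;
    for (unsigned i = 0; i < tries; i++)
        hits += (unsigned)hits_branch((int32_t)xorshift32(&seed));
    return hits;
}

/* What a constraint solver effectively does: read the input off the
 * branch condition x == MAGIC instead of guessing. */
int32_t solve_branch(void) {
    return MAGIC;
}
```

With one matching value out of 2^32, unguided fuzzing needs on the order of 2^32 (about 4.3 billion) inputs on average to hit the branch, while the input derived from the branch condition hits it on the first try.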
Understand how Qemu-dm works - 1
Understand how Qemu-dm works - 2
guest os <-> xen hypervisor
guest os issues “IN AL, 0x10”
VM exit traps guest os into hypervisor
hypervisor packages an ioreq and puts it into the ioreq queue (memory shared between the hypervisor and qemu), notifies qemu to handle the request, and waits for the I/O instruction to complete (the hypervisor schedules other tasks on the CPU in the meantime, so it does not block forever)
qemu responds; the hypervisor reads the data out and copies it into the guest os registers in the VMCS (see the x86 VT-x spec)
xen hypervisor <-> qemu process (qemu calls libxc to map the shared memory into its own virtual address space at startup)
qemu event loop polls ioreq from queue
qemu gets an ioreq and parses it (port 0x10, read, memory to store the data read from port 0x10)
qemu calls xen_platform ioport read function (xen_platform registers ioport 0x10)
qemu writes the data into the memory block (the ioreq contains a memory pointer that is used to store the data)
qemu notifies hypervisor that job is done
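The qemu side of this exchange can be sketched as a port-handler table plus a dispatch function. This is a simplified model under stated assumptions: toy_ioreq is not Xen's real ioreq layout, and register_ioport_read, handle_ioreq and toy_platform_read are hypothetical names rather than Qemu functions. Returning all ones for a port with no handler is also an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NPORTS 0x10000                 /* size of the x86 I/O port space */

enum toy_state { TOY_REQ_READY, TOY_RESP_READY };
enum toy_dir   { TOY_DIR_WRITE, TOY_DIR_READ };

/* Simplified request as placed in the shared queue by the hypervisor. */
struct toy_ioreq {
    uint16_t       port;    /* e.g. 0x10 for "IN AL, 0x10" */
    enum toy_dir   dir;
    uint32_t      *data;    /* memory that receives the result of a read */
    enum toy_state state;
};

typedef uint32_t (*port_read_fn)(uint16_t port);

static port_read_fn read_handlers[NPORTS];

/* A device model registers the ports it wants to serve. */
void register_ioport_read(uint16_t port, port_read_fn fn) {
    read_handlers[port] = fn;
}

/* Handles one request taken from the queue by the event loop. */
void handle_ioreq(struct toy_ioreq *req) {
    assert(req != NULL && req->state == TOY_REQ_READY);
    if (req->dir == TOY_DIR_READ) {
        port_read_fn fn = read_handlers[req->port];
        /* Store the value where the request says it should go. */
        *req->data = fn ? fn(req->port) : 0xffffffffu;
    }
    req->state = TOY_RESP_READY;       /* tells the hypervisor the job is done */
}

/* Hypothetical read handler standing in for the device behind port 0x10. */
uint32_t toy_platform_read(uint16_t port) {
    (void)port;
    return 0x5a;
}
```

After register_ioport_read(0x10, toy_platform_read), a read request for port 0x10 leaves 0x5a in the memory that data points to and the state set to TOY_RESP_READY.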
Qemu code check by symbolic search
Rebuild Qemu as LLVM bitcode (libxc dependencies?)
Minimal Qemu image
  necessary load/startup instructions
  ioport in/out instructions
Run check
  klee core engine loads the Qemu LLVM bitcode and the minimal image
  klee generates input and traverses the Qemu program state space
  klee records the input sequence (requires a change to klee?)
  watchdog monitor: restart klee if it terminates on hitting a Qemu code fault
  code coverage report?
Alternative option
  start Qemu from an actual instruction trace and treat various instruction arguments as symbolic input, then see whether some input causes errors
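Treating instruction arguments as symbolic input could look roughly like the harness below. It is a sketch against a made-up device, not Qemu code: toy_dev and toy_dev_write are hypothetical, and USE_KLEE is an assumed build flag of this example (not something KLEE defines) so that the device model also compiles without KLEE. klee_make_symbolic, klee_assume and klee_assert are the intrinsics declared in klee/klee.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#ifdef USE_KLEE                 /* assumed flag: set only when building for KLEE */
#include <klee/klee.h>
#endif

#define FIFO_SIZE 16

/* Hypothetical device state: a small FIFO filled by port writes. */
struct toy_dev {
    uint8_t fifo[FIFO_SIZE];
    size_t  pos;
};

/* Port write handler. Returns 0 on success, -1 if the write is rejected. */
int toy_dev_write(struct toy_dev *dev, uint16_t port, uint8_t val) {
    assert(dev != NULL);
    if (port != 0x10)
        return -1;              /* not our port */
    if (dev->pos >= FIFO_SIZE)
        return -1;              /* the bounds check under test */
    dev->fifo[dev->pos++] = val;
    return 0;
}

#ifdef USE_KLEE
int main(void) {
    struct toy_dev dev;
    uint16_t port;
    uint8_t val;

    /* Start from any valid device state and any OUT instruction arguments. */
    klee_make_symbolic(&dev, sizeof(dev), "dev");
    klee_make_symbolic(&port, sizeof(port), "port");
    klee_make_symbolic(&val, sizeof(val), "val");
    klee_assume(dev.pos <= FIFO_SIZE);

    toy_dev_write(&dev, port, val);

    /* The write must keep the FIFO index inside the buffer. */
    klee_assert(dev.pos <= FIFO_SIZE);
    return 0;
}
#endif
```

If the bounds check in toy_dev_write were removed, KLEE would be expected to report an out-of-bounds write for dev.pos == FIFO_SIZE and emit a test case with the offending state and arguments.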
Work on the way
Rebuild Qemu
  remove dependency
  stub libxc?
  klee libxc modeling?
Achievement
  booted a toy OS under klee and ran some initial symbolic checks
KLEE symbolic variable input -> instruction input?
restart after crash: does the next crash occur at the same location? (using klee seeds)
others
We are still on the way ...
Q & A
Thanks. Questions?