ReVirt: Enabling Intrusion Analysis through Virtual-Machine Logging and Replay
description
Transcript of ReVirt: Enabling Intrusion Analysis through Virtual-Machine Logging and Replay
ReVirt: Enabling Intrusion Analysis through Virtual-
Machine Logging and Replay
Jae Wook KimDistributed Computing Systems Laboratory
2005.11.28
2
Introduction
Most computer systems try to enable attack analysis by logging various events.
Logs provided by current systems fall short in two ways of what is needed: integrity and completeness
Integrity: they assume the OS kernel is trustworthy
Attacker's first move is to subvert the logs Delete or modify, or at least disable
Completeness: Do not log sufficient information to recreate or understand all attacks
Still require lots of educated guesses Can't account for non-determinism
3
Goal of ReVirt
Integrity Encapsulate the target system inside a virtual machine
and place the logging software beneath this virtual machine.
Running the logger in a different domain tan the target system protects the logger from a compromised application or operating system.
Completeness Adapt techniques used in fault-tolerance for primary-
backup recovery, such as checkpointing, logging, and roll-forward recovery.
4
Virtual Machines
The VMM makes a much better trusted computing base than the guest operating system, due to its narrow interface and small size.
The narrow VMM interface restricts the actions of an attacker.
5
UMLinux
Virtual machine used by ReVirt The guest OS in UMLinux runs on top of the host
OS and uses host services as the interface to peripheral devices. (OS-on-OS)
Guest OS and all applications run within a single host process
VMM is implemented as a loadable module in the host kernel.
6
UMLinux Address Space
Host kernel occupies [0xc0000000, 0xffffffff]
Host user occupies [0x0, 0xc0000000]
Guest kernel occupies [0x70000000, 0xc0000000]
Current guest application occupies [0x0, 0x70000000]
7
Trusted computing base for UMLinux
The trusted computing base (TCB) is composed of the VMM kernel module and the host OS.
Logging in an OS-on-OS structure is much more difficult to attack than the logging in a direct-on-host structure, because the TCB for an OS-on-OS structure can be much smaller than the complete host operating system.
8
Attacks against host OS
From above by causing application processes to invoke the host OS in dangerous ways
DoH: Attacker has complete freedom to invoke whatever functionality the host OS makes available to user processes
OoO: Attacker who has gained control of all application processes can use these same avenues to attack the guest OS
Low level of the network protocol stack by sending dangerous network packets to the host
DoH: Packets traverse through the entire network stack and are delivered to applications
OoO: Packets need only traverse a small part of the network stack
9
Logging and replaying UMLinux
Logging is used widely for recovering state. Basic concept
Start from a checkpoint of a prior state Roll forward using the log to reach the desired state
Replaying a process requires logging the non-deterministic events that affect the process’s computation.
Time: the exact point in the execution stream at which an event takes place
External input: data received from a non-logged entity
10
Logging in ReVirt
Logs All non-deterministic events that can affect the execution
of the virtual-machine process Asynchronous virtual interrupts All input from external entities
During replay, ReVirt prevents new asynchronous virtual interrupts from perturbing the replaying virtual machine process.
11
Cooperative logging
Of all the sources of non-determinism, only received network messages have the potential to generate enormous quantities of log data.
If the sending computer is being logged via ReVirt, then the receiver need not log the message data because the sender can re-create the sent data via replay.
Can reduce log volume, but complicates replay and requires that cooperating computers trust each other to regenerate the same message data during replay
Not yet implemented in ReVirt
12
Direct-on-host logging
Host kernel logs and replays all its host processes Not as secure and much more difficult than a
virtual-machine approach DoH involves multiple host processes while an OoO
approach involves only a sing host process Replaying multiple host processes can be done in
2 ways Replay communication channels between processes
Replaying shared-memory communication channel requires complex instrumentation of the executing code and adds significant overhead
Replay the scheduling order between host processes Difficult because host process can be interrupted while
executing in kernel mode Hard to identify the point where an interrupt occurred.
13
Using ReVirt to analyze attacks
ReVirt enables an administrator to replay the complete execution of a computer before, during, and after the attack.
Two types of tools to assist the administrator to understand the attack.
Inside the guest virtual machine ReVirt supports the ability to continue live execution at any
point in the replay. Use this ability to run new guest commands to probe the
virtual machine state. Virtual machine cannot switch back to replaying after being
perturbed in this manner. Outside the guest virtual machine
Debuggers and disk analyzers Do not depend on the guest kernel or guest applications.
14
Experiments
AMD Athlon 1800+ IDE, 256 MB, Samsung SV4084 IDE
Host & Guest kernel: modified Linux 2.4.18 5 workloads
POV-Ray CPU-intensive ray-tracing program
kernel-build NFS kernel-build SPECweb99
benchmark to measure web server performance Desktop machine
15
Virtualization overhead
Time overhead that arises from running all applications in the UMLinux virtual machine
Compare running all applications within UMLinux with running them directly on a host Linux 2.4.18.
Results Very little overhead for compute-intensive POV-Ray No overhead for interactive jobs such as e-mail Others are higher because they issue more guest kernel
calls
16
Validating ReVirt correctness
Verify that the ReVirt system faithfully replays the exact execution of the original run
Add extensive error checking to alert if the replaying run deviates from the original
2 micro-benchmarks Runs 2 guest processes that share an mmap’ed memory
region Runs a single process that increments a variable in an
infinite loop
1 macro-benchmark Boot computer, start the GNOME window manager, open
several interactive terminal windows, and concurrently build two applications on a remote NFS server
17
Logging and replaying overhead
Quantify the space and time overhead of logging Time overhead of logging is small (at most 8%) Space overhead of logging is small enough to
save logs over a long period of time at low cost 120 GB disk can store the volume of log traffic generated
by NFS kernel-build for 3-4 months
18
Logging and replaying overhead
19
Conclusions
ReVirt applies virtual-machine and fault-tolerance techniques to enable a system administrator to replay the long-term, instruction-by-instruction execution of a computer system.
Because the target operating system and target applications run within a virtual machine, ReVirt can replay the execution before, during, and after the intruder compromises the system.
Because ReVirt can replay instruction-by-instruction sequences, it can provide arbitrarily detailed observations about what transpired on the system.