Statistical Approach to Detection of Hardware Virtualization Based Rootkits

Statistical Approach to Detection of Hardware Virtualization Based Rootkits Igor Korkin Adviser: Prof. T.V. Petrova February 2012

description

Many methods exist for detecting this type of malware, and as many ways to prevent detection. In this paper, I chose a detection method based on the time-stamp counter (TSC) and considered the ways to prevent this detection, such as modification of the TSC and Blue Chicken technology. To develop ways to detect a hypervisor, I mined and modeled CPU behavior. I designed two models (directed multigraphs) of CPU behavior for the cases when a hypervisor is present and when it is not. With the help of these models, I discovered hidden relationships between the variability of the execution time of certain instructions in various CPU states. I suggested that certain statistics, such as the variance, the fourth moment and others, can be used to detect a hypervisor or several nested ones. Experimental verification of the models with the Kolmogorov criterion showed that, at the 5% significance level, the model data are consistent with the experimental data. These statistics grow when a hypervisor is installed. The hypervisor can modify only the mean values; it cannot change these measures of variation. I took into consideration the lack of repeatability and reproducibility of experimental results. The method was implemented as a program and a driver for Windows. The tool was successfully tested on various workstations, laptops and hypervisors.

Transcript of Statistical Approach to Detection of Hardware Virtualization Based Rootkits

Page 1: Statistical Approach to Detection of Hardware Virtualization Based Rootkits

Statistical Approach to Detection of Hardware Virtualization Based Rootkits

Igor Korkin
Adviser: Prof. T.V. Petrova

February 2012

Page 2: Statistical Approach to Detection of Hardware Virtualization Based Rootkits

History of the problem

o There are no legitimate tools to detect a hypervisor, or nested ones.
o A hypervisor is likely to be run illegally and to include a malware payload.
o A hypervisor can prevent its detection with the help of TSC cheating and Blue Chicken technology.

Nowadays many computer systems use CPUs with hardware virtualization support, yet there is no tool that detects a hypervisor with certainty.

[Figure: How a PC works in cases of no hypervisor and after it has been installed. Left panel: PC without hypervisor (operating system, hardware). Right panel: PC is controlled by two nested hypervisors (operating system 1, virtual machine 1, operating system 2, virtual machine 2, trusted hypervisor, hypervisor rootkit, hardware). The arrow between the panels is labelled "hypervisor installation".]

Page 3: Statistical Approach to Detection of Hardware Virtualization Based Rootkits

Research aims


o Analysis of statistical characteristics of trace execution latencies in cases when hypervisors are present or not;

o Creation of a criterion of hypervisors’ presence under the condition that the hypervisors can prevent their detection;

A hypervisor is able to cheat the TSC. As a result, the mean trace execution latencies when hypervisors are present are similar to the values observed when there is no hypervisor. Therefore, a new statistical approach has to be found.

Page 4: Statistical Approach to Detection of Hardware Virtualization Based Rootkits

The main tasks

● Comparative analysis and classification of existing methods and tools of hypervisor detection.
● Research into the processor's instruction execution latencies in cases when hypervisors are present or not.
● Design and analysis of execution trace models in cases when hypervisors are present or not.
● Creation of criteria for hypervisors' presence under the condition that the hypervisors can prevent their detection.
● Research into statistical characteristics of trace execution latencies in cases when hypervisors are present or not, and testing of the criteria.
● Development of a hypervisor detection tool.

Page 5: Statistical Approach to Detection of Hardware Virtualization Based Rootkits

Hypervisor detection methods

[Figure: The suggested classification of hypervisor detection methods. Node labels on the slide: Proactive; Signature based; Behavioural; Timing; Based on a trusted hypervisor; Based on hardware; Based on software; Based on memory walk; Based on bugs in hypervisors and CPUs; Based on TLB; Based on RSB; Based on instructions trapped by hypervisor. Tools named on the slide: Hypersight Rootkit Detector, McAfee's DeepSAFE, DeepWatch (JTAG port), Co-Pilot (PCI card).]

My method is based on the difference in trace execution latencies in cases when hypervisors are present or not.

Page 6: Statistical Approach to Detection of Hardware Virtualization Based Rootkits

Comparative analysis of existing hypervisor detection tools

Name of the tool | Detects non-stealthy hypervisor | Detects stealthy hypervisor | Convenience in usage and distribution | Detection of nested hypervisors
Hypersight Rootkit Detector | + | - | + | -
Symantec Endpoint Protection | + | - | + | -
McAfee Deep Defender | + | - | + | -
DeepWatch | + | + | - | -
Co-Pilot | + | + | - | -
Experimental software samples | + | - | + | -

Page 7: Statistical Approach to Detection of Hardware Virtualization Based Rootkits

Analysis of timing detection method which is based on instructions trapped by hypervisor

[Figure: Diagram with the labels virtual machine, operating system, detection tool (DT), trace, hypervisor, hardware and TSC. Annotation: "Hypervisor cheats TSC".]

Trace: a set of CPUID instructions.

How to get statistical data:
• read the TSC value;
• execute 10 CPUID instructions;
• read the TSC value again;
• calculate the trace execution latency.

Mean trace execution latencies in ticks:

State | No cheating TSC | Cheating TSC
No hypervisor | 2*10^3 | 2*10^3
Hypervisor is present | 2*10^5 | 2*10^3
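The four measurement steps can be sketched as follows. This is only an illustration of the control flow, not the author's driver: Python cannot issue RDTSC or CPUID directly, so `time.perf_counter_ns` and a no-op callable stand in for them here, and the names `measure_trace_latency` and `collect_latencies` are hypothetical.

```python
import time
from typing import Callable, List


def measure_trace_latency(read_counter: Callable[[], int],
                          run_instruction: Callable[[], None],
                          n0: int = 10) -> int:
    """Return the latency of one trace of n0 instructions, in counter units."""
    start = read_counter()      # step 1: read the counter value
    for _ in range(n0):         # step 2: execute the trace (10 CPUID on the slide)
        run_instruction()
    end = read_counter()        # step 3: read the counter value again
    return end - start          # step 4: trace execution latency


def collect_latencies(samples: int, n0: int = 10) -> List[int]:
    """Repeat the measurement to build a sample of latencies.

    Assumption: perf_counter_ns replaces the TSC read and a no-op
    replaces CPUID, so the numbers are nanoseconds, not ticks.
    """
    return [measure_trace_latency(time.perf_counter_ns, lambda: None, n0)
            for _ in range(samples)]
```

A real tool would take both readings in kernel mode with the trapped instruction in between, so that each CPUID in the trace forces a VM exit when a hypervisor is present.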

Page 8: Statistical Approach to Detection of Hardware Virtualization Based Rootkits

Switching between CPU modes in cases when hypervisor is present or not

In case of no hypervisor: protected mode (R-mode) and System Management Mode (S-mode), connected by SMM entry and SMM exit.

Hypervisor is present: VMX non-root mode (R-mode) and VMX root mode (V-mode), connected by VM exit and VM entry. Each of these two modes is connected to System Management Mode (S-mode) by SMM entry and SMM exit.

Page 9: Statistical Approach to Detection of Hardware Virtualization Based Rootkits

Execution trace model in case of no hypervisor (NH)

[Figure: directed multigraph over the trace instructions $U_1, U_2, \dots, U_{n_0}$. Each step has two arcs: an upper arc labelled $(q, 0)$ and a lower arc labelled $(p, n_S)$.]

Trace execution latency with $n_0$ CPUID instructions:

$$t_{NH} = (n_0 + m_{NH} \cdot n_S) \cdot k$$

$k$ is the execution latency of one instruction in ticks;
$n_0$ is the number of CPUID instructions in the trace;
$U_i$ are the instructions in the trace, $i = 1, \dots, n_0$;
$n_S$ is the number of instructions in the SMM dispatcher;
$p$ is the probability that a switch between R- and S-mode occurs (along a lower arc);
$q$ is the probability that a switch between R- and S-mode does not occur (along an upper arc), $q = 1 - p$;
$m_{NH}$ is a random variable equal to the number of switches between R- and S-mode. It is binomially distributed with parameters $n_0$ and $p$.
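As a worked step that the slide leaves implicit: because $m_{NH}$ is binomial with parameters $n_0$ and $p$, the mean and variance of $t_{NH}$ follow directly from the latency formula. The derivation below uses only the symbols defined on this slide and standard properties of the binomial distribution.

```latex
\begin{align}
E[m_{NH}] &= n_0 p, &
\operatorname{Var}[m_{NH}] &= n_0 p q, \\
E[t_{NH}] &= (n_0 + n_0 p \, n_S)\, k, &
\operatorname{Var}[t_{NH}] &= k^2 n_S^2 \, n_0 p q .
\end{align}
```

The variance depends only on how often the SMM dispatcher interrupts the trace, which is why it is a candidate statistic that a constant shift of the counter does not affect.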

Page 10: Statistical Approach to Detection of Hardware Virtualization Based Rootkits

Execution trace model in case when hypervisor is present (HP) under the condition that it can prevent its detection

[Figure: directed multigraph over the trace instructions $U_i$ and, for each of them, the hypervisor dispatcher instructions $V_j$. Each step has two arcs: an upper arc labelled $(q, 0)$ and a lower arc labelled $(p, n_S)$.]

Trace execution latency with $n_0$ CPUID instructions:

$$t_{HP} = (n_0 + n_0 \cdot n_V + m_{HP} \cdot n_S + n_0 \cdot \delta) \cdot k$$

$k$ is the execution latency of one instruction in ticks;
$n_0$ is the number of instructions in the trace;
$U_i$ are the instructions in the trace, $i = 1, \dots, n_0$;
$n_V$ is the number of instructions in the hypervisor dispatcher;
$V_j$ are the instructions in the hypervisor dispatcher, $j = 1, \dots, n_V$;
$n_S$ is the number of instructions in the SMM dispatcher;
$p$ is the probability that a switch to S-mode occurs (along a lower arc);
$q$ is the probability that a switch to S-mode does not occur (along an upper arc), $q = 1 - p$;
$\delta$ is the value of TSC cheating;
$m_{HP}$ is a random variable equal to the number of switches to S-mode. It is binomially distributed with parameters $(n_0 + n_0 \cdot n_V)$ and $p$.
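The two latency models can be compared with a small simulation. The sketch below draws the binomial switch counts and evaluates the two formulas as stated on the slides; the function names are hypothetical, and the default parameter values are borrowed from the theoretical-distribution slide of this deck (with its `d` read as $\delta$) purely as an example. Under the model, the variance of $t_{HP}$ is $k^2 n_S^2\, n_0 (1 + n_V)\, p q$, so it exceeds the no-hypervisor variance by a factor of $1 + n_V$, while $\delta$ only shifts the mean.

```python
import random
from statistics import pvariance


def binomial(rng, trials, p):
    """Number of successes in `trials` independent Bernoulli(p) draws."""
    return sum(1 for _ in range(trials) if rng.random() < p)


def t_nh(rng, n0, n_s, k, p):
    """Latency without a hypervisor: (n0 + m_NH * nS) * k."""
    m_nh = binomial(rng, n0, p)
    return (n0 + m_nh * n_s) * k


def t_hp(rng, n0, n_v, n_s, k, p, delta):
    """Latency with a hypervisor: (n0 + n0*nV + m_HP*nS + n0*delta) * k."""
    m_hp = binomial(rng, n0 + n0 * n_v, p)
    return (n0 + n0 * n_v + m_hp * n_s + n0 * delta) * k


def simulate(samples=300, seed=1, n0=10, n_s=200, n_v=200,
             k=2920, p=0.004, delta=-20):
    """Return two lists of simulated latencies: (no hypervisor, hypervisor)."""
    rng = random.Random(seed)
    nh = [t_nh(rng, n0, n_s, k, p) for _ in range(samples)]
    hp = [t_hp(rng, n0, n_v, n_s, k, p, delta) for _ in range(samples)]
    return nh, hp


def variances(nh, hp):
    """Population variances of the two simulated samples."""
    return pvariance(nh), pvariance(hp)
```

Calling `variances(*simulate())` gives a quick check that the spread grows once the hypervisor dispatcher is in the path, whatever value of `delta` is chosen.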

Page 11: Statistical Approach to Detection of Hardware Virtualization Based Rootkits

Graphs of theoretical distribution of trace execution latencies in cases when hypervisor is present or not and under the condition that it prevents its detection

[Figure: two series, "Hypervisor is present" and "Hypervisor is not present". Horizontal axis: number of repeated measurements (0 to 100). Vertical axis: trace execution latencies, ticks (2915 to 2940).]

Parameter values: k = 2920, p = 0.004, δ = -20, n0 = 10, nS = 200, nV = 200.

Page 12: Statistical Approach to Detection of Hardware Virtualization Based Rootkits

Hypervisor presence criteria

$n_0, n_S, n_V, \delta, k \in \mathbb{Z}$ and $p \in \mathbb{R}$ are some fixed values. $X_{NH}$ is the range of values received from $t_{NH}$; $X_{HP}$ is the range of values received from $t_{HP}$; $X = X_{NH} \cup X_{HP}$. $\vec{x}_n = (x_1, \dots, x_n)$ is a sample from $X$.

Hypervisor presence criterion based on the variance. Critical set (hypervisor is present):

$$W = \{\vec{x}_n : \hat{\sigma}^2(\vec{x}_n) \ge d\},$$

where $\hat{\sigma}^2$ is the sample variance and $d \in \mathbb{Z}$ is experimentally defined. Making a decision: if $\vec{x}_n \in W$, a hypervisor is present; if $\vec{x}_n \notin W$, there is no hypervisor.

Hypervisor presence criterion based on the fourth central moment. Critical set (hypervisor is present):

$$W = \{\vec{x}_n : \hat{\upsilon}^2(\vec{x}_n) \ge \mu\},$$

where $\hat{\upsilon}^2$ is the sample fourth central moment and $\mu \in \mathbb{Z}$ is experimentally defined. Making a decision: if $\vec{x}_n \in W$, a hypervisor is present; if $\vec{x}_n \notin W$, there is no hypervisor.

Hypervisor presence criterion based on the length of the variation series. Critical set (hypervisor is present):

$$W = \{\vec{x}_n : \hat{l}(\vec{x}_n) \ge e\},$$

where $\hat{l}$ is the length of the variation series and $e \in \mathbb{Z}$ is experimentally defined. Making a decision: if $\vec{x}_n \in W$, a hypervisor is present; if $\vec{x}_n \notin W$, there is no hypervisor.

Theorem. For samples from $t_{NH}$ and $t_{HP}$, a hypervisor presence criterion exists.
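The three criteria share one shape: compute a statistic of the sample and compare it with an experimentally chosen threshold. A minimal sketch of that shape follows. The function names are hypothetical, and the slide does not define the "length of variation series", so the code assumes it means the range of the ordered sample (largest minus smallest value); another reading, such as the number of distinct values, would need a different function.

```python
from typing import Callable, Sequence


def sample_variance(x: Sequence[float]) -> float:
    """Unbiased sample variance."""
    n = len(x)
    mean = sum(x) / n
    return sum((v - mean) ** 2 for v in x) / (n - 1)


def fourth_central_moment(x: Sequence[float]) -> float:
    """Sample fourth central moment."""
    n = len(x)
    mean = sum(x) / n
    return sum((v - mean) ** 4 for v in x) / n


def variation_series_length(x: Sequence[float]) -> float:
    """ASSUMPTION: range of the ordered sample (max minus min)."""
    ordered = sorted(x)  # the variation series is the sorted sample
    return ordered[-1] - ordered[0]


def in_critical_set(statistic: Callable[[Sequence[float]], float],
                    sample: Sequence[float],
                    threshold: float) -> bool:
    """True if the sample falls in W, i.e. a hypervisor is reported."""
    return statistic(sample) >= threshold
```

For example, `in_critical_set(sample_variance, latencies, d)` applies the variance criterion to a list of measured latencies with threshold `d`.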

Page 13: Statistical Approach to Detection of Hardware Virtualization Based Rootkits

Experimental check of hypervisors presence criteria

● The experiments were carried out as single-factor experiments.

● The variable factor is the PC state: a hypervisor is present or not.

● Statistics of trace execution latencies were analyzed.

● The results were 1000x10 matrices of measured trace execution latencies for the cases when a hypervisor is present and when it is not (THP and TNH).

● According to ISO 5725, the experiments were carried out in series of 5 repeated cases over 10 days, until the data stabilized.

Page 14: Statistical Approach to Detection of Hardware Virtualization Based Rootkits

Experimental results. Graphs of distribution of trace execution latencies in cases when hypervisor is present or not and under the condition that it prevents its detection

[Figure: two series, "Hypervisor is present" and "Hypervisor is not present". Horizontal axis: number of repeated measurements (0 to 100). Vertical axis: trace execution latencies, ticks (2915 to 2940).]

Experimental results from Intel Core 2 Duo E8200, Windows 7.

Page 15: Statistical Approach to Detection of Hardware Virtualization Based Rootkits

Threshold values of statistics

PC | Statistic | Filtration level f | Threshold, hypervisor is not present | Threshold, hypervisor is present | Type I error, α | Type II error, β
1 | $\bar{L}_f$ | 0 | ≤ 7 | ≥ 8 | 0.04 | 0
1 | $\bar{D}_f$ | 0 | ≤ 14 | ≥ 18 | 0.02 | 0
1 | $\bar{M}_f$ | 0.1 | ≤ 679 | ≥ 947 | 0.02 | 0
2 | $\bar{L}_f$ | 0 | ≤ 11 | ≥ 12 | 0.1 | 0.06
2 | $\bar{D}_f$ | 0.2 | ≤ 100 | ≥ 101 | 0.08 | 0.1
2 | $\bar{M}_f$ | 0.2 | ≤ 168 | ≥ 13030 | 0.14 | 0.02
3 | $\bar{L}_f$ | 0 | ≤ 34 | ≥ 241 | 0 | 0
3 | $\bar{D}_f$ | 0 | ≤ 216 | ≥ 5478 | 0 | 0
3 | $\bar{M}_f$ | 0.02 | ≤ 54 | ≥ 956 | 0 | 0

$\bar{L}_f$ is the mean length of variation series, $\bar{D}_f$ is the mean variance, $\bar{M}_f$ is the mean fourth central moment.

PCs: #1 is Intel Core 2 Duo E8200, Windows 7; #2 is Intel Core 2 Duo E6300, Windows 7; #3 is AMD Phenom X4 945, Windows Live CD XP (DDD).

My hypervisor was used in PCs #1 and #2. A special BIOS hypervisor was used in PC #3.

Page 16: Statistical Approach to Detection of Hardware Virtualization Based Rootkits

Algorithm of getting threshold values of trace execution statistical characteristics

Input data: none.
Output data: statistics, their filtration levels and threshold values.

1. Entry.
2. Get matrices $T_{NH}$ and $T_{HP}$ of trace execution latencies without and with a hypervisor.
3. From matrices $T_{NH}$ and $T_{HP}$ get filtered arrays $T_{NH,f}$ and $T_{HP,f}$ for the following filtration levels: f = {0; 0.02; 0.05; 0.1; 0.15; 0.2}.
4. Calculate the statistical characteristics $S_{NH,f}$ and $S_{HP,f}$ for the arrays $T_{NH,f}$ and $T_{HP,f}$.
5. Choose threshold values $S_{NH,[f]}$ and $S_{HP,[f]}$ whereby the type I and type II errors are minimal.
6. End.
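A rough sketch of this threshold-selection flow is given below. Several details are assumptions, since the slide does not specify them: filtering at level f is taken to mean dropping the largest fraction f of values in each series, the type I error is taken to be a false alarm on a clean machine, and a single cut value minimizing α + β stands in for the pair of thresholds. All function names are hypothetical.

```python
from statistics import pvariance
from typing import Callable, List, Sequence, Tuple


def filter_series(series: Sequence[float], f: float) -> List[float]:
    """ASSUMPTION: filtration level f drops the largest fraction f of values."""
    keep = len(series) - int(len(series) * f)
    return sorted(series)[:keep]


def statistics_per_series(matrix: Sequence[Sequence[float]], f: float,
                          statistic: Callable[[Sequence[float]], float]
                          ) -> List[float]:
    """Apply the statistic to every filtered row of a latency matrix."""
    return [statistic(filter_series(row, f)) for row in matrix]


def choose_cut(s_nh: Sequence[float], s_hp: Sequence[float]
               ) -> Tuple[float, float, float]:
    """Return (cut, alpha, beta) minimizing alpha + beta.

    alpha: share of no-hypervisor values above the cut (false alarms).
    beta:  share of hypervisor values at or below the cut (misses).
    """
    best = None
    for cut in sorted(set(s_nh) | set(s_hp)):
        alpha = sum(v > cut for v in s_nh) / len(s_nh)
        beta = sum(v <= cut for v in s_hp) / len(s_hp)
        if best is None or alpha + beta < best[1] + best[2]:
            best = (cut, alpha, beta)
    return best
```

Running `choose_cut` once per statistic and per filtration level, then keeping the level with the smallest errors, would yield rows shaped like those in the threshold table; `pvariance` is imported only as one example of a statistic to pass in.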

Page 17: Statistical Approach to Detection of Hardware Virtualization Based Rootkits

Hypervisor detection algorithm

Input data: statistics, their filtration levels and threshold values.
Output data: a decision about hypervisors' presence.

1. Entry.
2. Input characteristics: $S_{NH,[f]}$, $S_{HP,[f]}$ and $[f]$.
3. Get the matrix of trace execution latencies $T_{TEST}$.
4. Get the array $T_{TEST,[f]}$ after filtering of matrix $T_{TEST}$.
5. Calculate statistical characteristics $S_{TEST,[f]}$.
6. If $S_{TEST,[f]} \le S_{NH,[f]}$ is true: hypervisor is not present.
7. If $S_{TEST,[f]} \ge S_{HP,[f]}$ is true: hypervisor is present.
8. End.
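The decision step of the flowchart can be sketched as a two-threshold comparison. This is an illustration under stated assumptions: the test statistic is taken as the mean of the per-series statistic (mirroring the "mean" statistics in the threshold table), filtering drops the largest fraction f of each series, and the "undecided" result for values between the two thresholds is an assumption, since the slide does not show what happens on that branch. The names are hypothetical.

```python
from statistics import mean, pvariance
from typing import Callable, List, Sequence


def filter_series(series: Sequence[float], f: float) -> List[float]:
    """ASSUMPTION: filtration level f drops the largest fraction f of values."""
    keep = len(series) - int(len(series) * f)
    return sorted(series)[:keep]


def detect(test_matrix: Sequence[Sequence[float]], f: float,
           statistic: Callable[[Sequence[float]], float],
           s_nh: float, s_hp: float) -> str:
    """Compare the mean statistic of the filtered test data with both thresholds."""
    s_test = mean(statistic(filter_series(row, f)) for row in test_matrix)
    if s_test <= s_nh:
        return "hypervisor is not present"
    if s_test >= s_hp:
        return "hypervisor is present"
    # ASSUMPTION: values between the thresholds are left undecided here
    return "undecided"
```

For instance, `detect(latency_matrix, 0, pvariance, 14, 18)` applies the variance thresholds listed for PC #1 in the threshold table to a matrix of measured latencies.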

Page 18: Statistical Approach to Detection of Hardware Virtualization Based Rootkits

Approach to hypervisor detection

Preparatory phase:
1. Upgrade the BIOS software with a trusted BIOS image with the help of a programmer.
2. Install the OS.
3. Get threshold values for hypervisor detection with the help of the corresponding algorithm.

Operational phase:
4. Repeatedly check the system to see if a hypervisor is present.
5. Install additional software (MS Office etc.).
6. Monitor messages about hypervisor detection.
7. To adapt the detection tool to a legitimate hypervisor, do step 3 again.

[Figure: labels BIOS image, OS, additional software, software updates.]

Page 19: Statistical Approach to Detection of Hardware Virtualization Based Rootkits

Hypervisor detection tool architecture

[Figure: block diagram divided into a preparatory phase and an operational phase. Blocks: threshold values creation subsystem; intruder's activity imitation subsystem (hypervisor's start and stop); threshold values; hypervisor detection subsystem.]

Page 20: Statistical Approach to Detection of Hardware Virtualization Based Rootkits

The main results

o Execution trace models were designed for the cases when hypervisors are present or not.

o Hypervisor presence criteria were created. They are based on the variance, fourth moments, and length of variation series of trace execution latencies.

o Algorithms of hypervisor detection were developed.

o A hypervisor detection software tool was developed. It has the following advantages:

Ability to detect stealthy hypervisors | Convenience in usage and distribution | Detection of nested hypervisors
+ | + | +