Dependability Benchmarking of Off-the-Shelf OS...

31
Dependability Benchmarking of Off-the-Shelf OS Kernels Karama Kanoun 45th Meeting of IFIP Working Group 10.4, Moorea, French Polynesia, March 5-9, 2004 Partially financed by the European Commission

Transcript of Dependability Benchmarking of Off-the-Shelf OS...

Page 1: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

Dependability Benchmarking

of Off-the-Shelf OS Kernels

Karama Kanoun

45th Meeting of IFIP Working Group 10.4, Moorea, French Polynesia, March 5-9, 2004

Partially financed by the European Commission

Page 2: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

DBench

Objective of DBench

Conceptual framework & experimental environment for benchmarking

the dependability of (C)OTS and COTS-based systems

Concepts, specifications and guidelines for dependability benchmarking

Dependability benchmark prototypes

Current / final results

A framework for dependability benchmarking

A set of benchmark specifications and associated prototypes

User point of view: robustness benchmarks wrt external errors

Emphasis on representativeness and validation

Page 3: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

DBench Framework

Categorization

Benchmark Target - BT

(System nature

Application area

Operating environment)

Benchmarking context

(Life-cycle phase

Benchmark user

Benchmark performer

Benchmark purpose)

Measures

Measure nature (qualitative or quantitative)

Measure type(dependability- or performance-related)

Measure extent(comprehensive or specific)

Assessment method (experimentation or modeling & experimentation)

System Under Benchmark - SUB

Workload

Faultload

Measurements

Procedures & rules

Experimentation

Page 4: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

Benchmark developed

General-purpose operating systems

Robustness and timing measures, TPC-C Client, faulty application

Real-Time Kernels in onboard space systems

Predictability of the kernel response time, faulty application

Engine control applications in automotive systems

Robustness of the control application, transient hardware faults

On-line transaction processing (OLTP) systems,

TPC-C-based, Operator, software & hardware faults

TPC-C like measures

Measures based on modeling & experimentation: availability, cost

Web servers

Page 5: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

Properties

Representativeness

Repeatability

Reproducibility

Portability

Non-intrusiveness

Scalability

Cost effective

Set-up

Execution duration

Page 6: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

OS Benchmarking

Integrator of a system including an operating system (OS)

Information on OS dependability

Select the most appropriate OS / system characteristics

Publishable results

Objectives of OS dependability benchmarking

Provide generic and reproducible methods

Characterize the OS behavior in the presence of faults

Compare alternative solutions

Page 7: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

OS Benchmarking Context

Limited knowledge about the OS

OS

Functional description of the OS

Non-intrusiveness

Faults injected outside the OS

Accessibility and observability

Page 8: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

Operating system

Hardware

Devicedrivers

Faultload

Benchmark Management

System

Workload

API

Benchmark Target & SUB

System Under Benchmark (SUB)

Page 9: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

Benchmark Measures

OS level

Workload level

Operating system

Hardware

Devicedrivers

Workload

API

Page 10: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

Measures

- POS: OS Robustness (outcome distribution)

- Texec: OS reaction time in the presence of faults

- Tres: OS Restart time in the presence of faults

OS Level Measures

OS Outcomes SEr Error code

SXp Exception

SPc Panic

SHg Hang

SNS No signaling

Page 11: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

Workload Level Measures

Workload Outcomes WCC Correct completion

WEC Erroneous completion

WAb Abort

WHg Hang

Page 12: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

SNS-WHgSHg-WHgSPc-WHgSXp-WHgSEr-WHgHang

SNS-WAb—SPc-WAbSXp-WAbSEr-WAbAbort

SNS-WEC——SXp-WECSEr-WECErroneous completion

SNS-WCC——SXp-WCCSEr-WCCCorrect completion

Nosignalling

HangPanicExceptionError code OS

Workload

Workload Measures

- PSNS: WL Robustness (WL outcome distribution)

- TWL: WL completion time in the presence of faults

Workload Level Measures

Combined states

Page 13: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

Measure Summary

OS Measures

POS: OS Robustness

Texec: reaction time in the presence of fault (τexec: in absence of faults)

Tres: restart time in the presence of faults (τres: in absence of faults)

Workload Measures

PSNS: WL Robustness, when OS in SNS

TWL: WL correct completion time in the presence of faults

(τWL: in absence of faults)

Page 14: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

Execution profile

Workload: TPC-C Client in the current prototype

Faultload

Selection of system calls to be corrupted

Ideally: all system calls with parameters

In practice: most critical OS functional components

Processes and Threads, File Input/output,

Memory management, Configuration Management

28 system calls, 75 parameters, 502 corrupted values

Interception of the selected system calls

Parameter corruption technique: selective substitution

Page 15: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

Out-of-range

Data

Parameter Corruption technique

Systematic Bit Flip Selective substitution

Incorrect

Address

Incorrect

Data

Page 16: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

Experimental Set-up

Page 17: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

Measurements

Experiments with Workload (WL) completion

tExpEnd (n)

tResume (n)

tResponse (n)

tWStart (n)

tExpStart (n+1)

Restart time

Workload Completion Time

OS Reaction time

System Call to intercept WL End

tExpStart (n)

Page 18: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

Timeout >> Workload completion duration

tExpStart (n)

Experiments without Workload (WL) completion

tExpEnd (n)

Restart time

tExpStart (n+1)

Experiment End

Measurements

tResume (n)

tResponse (n)

tWStart (n)

OS Reaction time

System Call to intercept WL End

Page 19: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

Results: OS Robustness

Panic/Gel0,0%

Non Signalement57,4%

Exception11,4%

Code d'erreur31,2%

Windows XP

Panic/Gel0,0%Exception

11,4%

Code d'erreur34,1%

Non Signalement54,5%

Windows 2000

Panic/Gel0,0%

Non Signalement55,1%

Exception12,0%

Code d'erreur33,0%

Windows NT4

11

Exception11.4%

Error code34.1%

No signaling54.5%

28 system calls intercepted, 552 experiments

POS

Exception11.4%

Error code31.2%

No signaling57.4%

Exception12%

Error code33%

No signaling55%

Windows NT Windows XPWindows 2000

Page 20: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

Sensitivity Analysis wrt Faultload

353All (132)xFL4

240028xFL3

11328xFL2

32528xxFL1

55228xxxFL0

#experiments

# Systemcalls

SystematicBit-Flip

Out-of-range data

Incorrectaddress

Incorrectdata

Page 21: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

Workload States

Abort / Hang (101)

Completion (451)

Windows NT

↓WL

Abort / Hang (107)

Completion (445)

Windows 2000

↓WL

Abort / Hang (128)

Completion (424)

Windows XP

↓WL

Page 22: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

Refinement of Workload States

47846Abort / Hang (101)

25758136Completion (451)

No signaling (304)Exception (66)Error code (182) Windows NT

↓WL

49652Abort / Hang (107)

25257136Completion (445)

No signaling (301)Exception (63)Error code (188) Windows 2000

↓WL

49673Abort / Hang (128)

2685799Completion (424)

No signaling (317)Exception (63)Error code (172) Windows XP

↓WL

↓PSNS

Page 23: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

OS Reaction Time

Windows XP

Windows 2000

Windows NT

111 µs

1782 µs

344 µs

τexec

114 µs

(176 µs)

1241 µs

(3359 µs)

128 µs

(230 µs)

Texec

(Std dev.)

23 µs

(17 µs)

22 µs

(28 µs)

17 µs

(18 µs)

Texec

Error code

108 µs

(162 µs)

973 µs

(2978 µs)

86 µs

(138 µs)

Texec

Exception

165 µs

(204 µs)

2013 µs

(4147)

203 µs

(281)

Texec

No-signaling

Page 24: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

Detailed OS Reaction Time

Error Code return

System Call

µs

010203040506070

Clos

eHan

dle

Crea

teRe

mot

eThr

ead

Duplica

teHa

ndle

GetE

nviro

nmen

tVar

iableW

GetE

xitC

odeT

hrea

dGe

tFile

Type

GetP

roce

ssVe

rsion

Glob

alAl

loc

Glob

alFr

eeGl

obalLo

ckGl

obalUn

lock

IsBa

dWrit

ePtr

Loca

lAllo

cLo

calFre

eLo

calR

eAllo

cRe

adFile

Resu

meT

hrea

d

SetT

hrea

dPrio

rity

Susp

endT

hrea

d

Virtua

lAllo

cEx

Writ

eFile

Windows NT4 Windows 2000 Windows XPNT 2000 XP

Page 25: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

Detailed OS Reaction Time

Exception Notification

System Call

µs

0

50

100

150

200

250

300

Creat

eRem

oteT

hrea

d

GetEnv

ironm

entV

ariab

leWGet

ExitCod

eThr

ead

GetPriv

ateP

rofile

String

AGet

Startu

pInf

oA

GlobalF

ree

Loca

lFre

e

ReadF

ile

Writ

eFile

Windows NT4 Windows 2000 Windows XP

a: 2640 µs

a 2640

NT 2000 XP

Page 26: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

Detailed OS Reaction TimeNo-Signaling

µs

System call0

100

200

300

400

500

600

Creat

eRem

oteT

hrea

d

Creat

eThr

ead

Duplic

ateH

andle

FreeE

nviro

nmen

tStri

ngsW

GetExit

CodeT

hrea

d

GetPriv

ateP

rofile

IntA

GetPriv

ateP

rofile

String

A

GetPro

cess

Versio

n

GetSta

rtupI

nfoA

GlobalA

lloc

GlobalF

ree

GlobalL

ock

GlobalU

nlock

IsBad

ReadP

tr

IsBad

Writ

ePtr

Loca

lAllo

cLo

calF

ree

Loca

lReA

lloc

ReadF

ile

SetThr

eadP

riorit

y

Virtua

lAllo

cEx

Writ

eFile

Windows NT4 Windows 2000 Windows XP

a: 10416 µsb: 8205 µs

a b 10416 8205

NT 2000 XP

Page 27: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

OS Restart Time

Windows XP

Windows 2000

Windows NT

74 s

105 s

92 s

τres

80 s

109 s

96 s

Tres

8 s

8 s

4 s

Std Deviation

Page 28: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

Detailed OS Restart Time

WL Abort or Hang

60

80

100

120

140

160

0 100 200 300 400 500

Experiment

seconds

Windows NT4

60

80

100

120

140

160

0 100 200 300 400 500

Experiment

seconds

Windows 2000

60

80

100

120

140

160

0 100 200 300 400 500

Experiment

seconds

Windows XP

Windows NT Windows 2000

Windows XP

Page 29: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

WL Execution Time

Windows XP

Windows 2000

Windows NT

67 s

70 s

74 s

τWC

69 s

74 s

80 s

TWC

10 s

13 s

12 s

Std Deviation

Page 30: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

Conclusion

OS robustness benchmark wrt application erroneous behavior

Dependability benchmark prototype for Windows family

Novelty

Structured set of measures

Realistic Workload: TPC-C Client

Standard experimental procedures and rules

Benchmark properties

Benchmark execution duration: 2 days

Page 31: Dependability Benchmarking of Off-the-Shelf OS Kernelswebhost.laas.fr/TSF/IFIPWG/Workshops&Meetings/45/03-Kanoun.pdf · Benchmark developed wGeneral-purpose operating systems ½Robustness

Validation of the benchmark

Results in conformance with Microsoft claim

Sensitivity study wrt to parameter corruption technique

Sensitivity study wrt system calls corrupted

Benchmark properties

Current work

Other OS family: Linux

Other workload