Resource Management for Computer Operating Systems Sarah Bird PhD Candidate Berkeley EECS Burton...

26
Resource Management for Computer Operating Systems Sarah Bird PhD Candidate Berkeley EECS Burton Smith Technical Fellow Microsoft

Transcript of Resource Management for Computer Operating Systems Sarah Bird PhD Candidate Berkeley EECS Burton...

Page 1: Resource Management for Computer Operating Systems Sarah Bird PhD Candidate Berkeley EECS Burton Smith Technical Fellow Microsoft.

Resource Management for Computer Operating Systems

Sarah BirdPhD CandidateBerkeley EECS

Burton SmithTechnical Fellow

Microsoft

Page 2: Resource Management for Computer Operating Systems Sarah Bird PhD Candidate Berkeley EECS Burton Smith Technical Fellow Microsoft.

Background• Management of computing resources is a

primary operating system responsibility– It lets multiple applications share a single system

• These days, we face new challenges– Increasing numbers of cores (hardware threads)– Emergence of parallel applications– Quality of Service (QoS) requirements– The need to manage power and energy

• Current practice fails to address these issues– Thread management is a good example

Page 3: Resource Management for Computer Operating Systems Sarah Bird PhD Candidate Berkeley EECS Burton Smith Technical Fellow Microsoft.

Windows Thread Management• Windows maintains queues of runnable threads

– One queue per priority per core• Each core chooses a thread from the head of its nonempty queue

of highest priority and runs it• The thread runs for a “time quantum” unless it blocks or a higher

priority queue becomes nonempty• Thread priority can change at scheduling events, with a new

priority based on what just happened:– One thread unblocking another– I/O completion (keyboard, mouse, storage, network)– Preemption by a higher priority thread– Time quantum expiration– New thread creation

• Other operating systems are pretty similar

Page 4: Resource Management for Computer Operating Systems Sarah Bird PhD Candidate Berkeley EECS Burton Smith Technical Fellow Microsoft.

Some Obvious Shortcomings• Operating system thread scheduling is costly– It incurs (needless) changes in protection– User mode thread scheduling is much cheaper

• Thread progress is unpredictable– Lock-free techniques were invented to avoid this

• Processes have no control over threads– Despite their big role in memory management

• Response times are impossible to guarantee– What number of threads is enough for a process?

• Power and energy are ignored

Page 5: Resource Management for Computer Operating Systems Sarah Bird PhD Candidate Berkeley EECS Burton Smith Technical Fellow Microsoft.

A Way Forward• Resources should be allocated to processes– Cores of various types– Memory (working sets)– Bandwidths, e.g. to shared caches, memory, storage

and interconnection networks• The operating system should:– Optimize the responsiveness of the system– Respond to changes in user expectations– Respond to changes in process requirements– Maintain resource, power and energy constraints

What follows is a scheme to realize this vision

Page 6: Resource Management for Computer Operating Systems Sarah Bird PhD Candidate Berkeley EECS Burton Smith Technical Fellow Microsoft.

Responsiveness• Run times determine process responsiveness– The time from a mouse click to its result– The time from a service request to its response– The time from job launch to job completion– The time to execute a specified amount of work

• The relationship is usually a nonlinear one– Achievable responses may be needlessly fast– There is generally a threshold of acceptability

• Run time depends on the resources– Some resources will have more effect than others– Effects will often vary with computational phase

Page 7: Resource Management for Computer Operating Systems Sarah Bird PhD Candidate Berkeley EECS Burton Smith Technical Fellow Microsoft.

Penalty Functions• The penalty function of a process defines how run

times translate into responsiveness– Its shape expresses the nonlinearity of the relationship– The shape will depend on the application and on the

current user interface state (e.g. minimized)• We let the total penalty be the instantaneous sum

of the current penalties of the running processes– Resources determine run times and therefore penalties

• Assigning resources to processes to minimize the total penalty maximizes system responsiveness

Page 8: Resource Management for Computer Operating Systems Sarah Bird PhD Candidate Berkeley EECS Burton Smith Technical Fellow Microsoft.

Typical Penalty Functions

Run time

Pena

ltyDeadline

Run time

Pena

lty

Page 9: Resource Management for Computer Operating Systems Sarah Bird PhD Candidate Berkeley EECS Burton Smith Technical Fellow Microsoft.

Adjusting Penalty Functions• Penalty functions are like priorities, except:– They apply to processes, not kernel threads– They are explicit convex functions of the run time

• The operating system can adjust their slopes– Up or down, based on user behavior or preference– The deadlines can probably be left alone

• The total penalty is easy to compute– The objective is to keep minimizing it

Page 10: Resource Management for Computer Operating Systems Sarah Bird PhD Candidate Berkeley EECS Burton Smith Technical Fellow Microsoft.

Run Time Functions• Generally, they exhibit “diminishing returns”

as resources are added to a process– That is, run times are generally convex functions

• They vary with input and must be measured– Sometimes run time increases with some resource– We may see “plateaus” or various types of “noise”

Run

time

Memory allocation

Page 11: Resource Management for Computer Operating Systems Sarah Bird PhD Candidate Berkeley EECS Burton Smith Technical Fellow Microsoft.

Resource Allocation As OptimizationContinuously minimize pP p(p(rp,0, … rp,n-1) with

respect to the resource allocations rp,j , where

• P, p, p, and the rp,j are all time-varying;• P is the set of runnable processes;• rp,j 0 is the allocation of resource j to process p;

• The penalty p depends on the run time p;

• p depends in turn on the allocated resources rp,j;

• pP rp,j = Aj , the available quantity of resource j.– All unused (slack) resources are allocated to process 0

Page 12: Resource Management for Computer Operating Systems Sarah Bird PhD Candidate Berkeley EECS Burton Smith Technical Fellow Microsoft.

The Optimization Picture

Runtime1(r(0,1), …, r(n-1,1)) Runtime1

Pena

lty1Continuously

minimize the total penalty

Runtime2 (r(0,2), …, r(n-1,2)) Runtime2

Runtimei(r(0,i), …, r(n-1,i)) Runtimei

Pena

lty2

Pena

ltyi

Subject to the total available resources

Page 13: Resource Management for Computer Operating Systems Sarah Bird PhD Candidate Berkeley EECS Burton Smith Technical Fellow Microsoft.

Convexity and Concavity• A function f is convex if for all x, y in n, in , and

0 1, f(x + (1−)y) f(x) + (1−)f(y)– Chords of the function’s graph must lie on or above it

• A function f is concave if for all x, y in n, in , and0 1, f(x + (1−)y) f(x) + (1−)f(y)– Alternatively, f is concave if and only if –f is convex

• Affine functions Ax + b are both convex and concave• If functions fi are convex so is their sum i fi

• If functions fi are convex so is their maximum maxi fi • If f is both convex and monotone non-decreasing and g is convex then

h(x) = f(g(x)) is convex• If f is convex, so is the set C = { xdom(f) | f(x) }

– C is called the -sublevel set of f

Page 14: Resource Management for Computer Operating Systems Sarah Bird PhD Candidate Berkeley EECS Burton Smith Technical Fellow Microsoft.

Convex Optimization• A convex optimization problem has the form:

Minimize f0(x1, … xm)

subject to fi(x1, … xm) 0, i = 1, … k

where the functions fi : Rm R are all convex• Convex optimization has several virtues– It guarantees a single global extremum– It is not much slower than linear programming

• In our formulation, resource management is a convex optimization problem

Page 15: Resource Management for Computer Operating Systems Sarah Bird PhD Candidate Berkeley EECS Burton Smith Technical Fellow Microsoft.

Managing Power and Energy• Total system power can be limited by an affine resource

constraint j wj · p 0 rp,j W

• Total power can also be limited using 0 and 0

– Assume all slack resources r0,j are powered off

– Instead of being a run time,0 is total power• It will be convex in each of the slack resources r0,j

– 0 has a slope that can depend on the battery charge• Low-penalty work loses to 0 when the battery is depleted

Total Power

Pena

lty

As charge depletes,slope increases

a0,r

Tota

l Pow

er

Page 16: Resource Management for Computer Operating Systems Sarah Bird PhD Candidate Berkeley EECS Burton Smith Technical Fellow Microsoft.

Modeling Run Times• Run time measurements can be used to

construct a model, yielding several advantages– “Noise” can be removed while building the model– Interpolation among measurements is avoided– Rates of change with resources are computable

• We have constructed several types of model– Linear and quadratic (rate) models– Kernel Canonical Correlation Analysis– Genetically Programmed Response Surfaces– Most recently, convex run time models

Page 17: Resource Management for Computer Operating Systems Sarah Bird PhD Candidate Berkeley EECS Burton Smith Technical Fellow Microsoft.

Convex Models 17

w: learned parametersb: bandwidth allocations: bandwidth amplificationsm: memory allocations

ji ji

ji

bb

wTbw

,

,0

*),(

3

3

2

2

1

10),(

b

w

m

w

b

wTbw

i iii

i

mb

wTmbw

0),,,(

Page 18: Resource Management for Computer Operating Systems Sarah Bird PhD Candidate Berkeley EECS Burton Smith Technical Fellow Microsoft.
Page 19: Resource Management for Computer Operating Systems Sarah Bird PhD Candidate Berkeley EECS Burton Smith Technical Fellow Microsoft.
Page 20: Resource Management for Computer Operating Systems Sarah Bird PhD Candidate Berkeley EECS Burton Smith Technical Fellow Microsoft.

Modeling Results• Our convex models fit the data very well– We are now seeing how they work in practice

• We think linear least-squares applied to past run time data can learn the model parameters– The QR matrix factorization can be updated and

downdated to track changes in process behavior• We have also discovered some interesting

artifacts along the way…

Page 21: Resource Management for Computer Operating Systems Sarah Bird PhD Candidate Berkeley EECS Burton Smith Technical Fellow Microsoft.

0 500 1000 1500 2000 2500 3000 35000

0.02

0.04

0.06

0.08

0.1

0.12

0.14

0.16

0.18

0.22

3

4

5

6

7

8

9

10

11

12

13

14

15

16

Memory Pages

Seco

nds

Nbodies Frame Time

Page 22: Resource Management for Computer Operating Systems Sarah Bird PhD Candidate Berkeley EECS Burton Smith Technical Fellow Microsoft.

Nbodies Frame Time

0 500 1000 1500 2000 2500 3000 35000.00333333333333333

0.0333333333333333

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16Memory Pages

Seco

nds

Page 23: Resource Management for Computer Operating Systems Sarah Bird PhD Candidate Berkeley EECS Burton Smith Technical Fellow Microsoft.

PACORA• PACORA stands for “Performance-Aware Convex

Optimization for Resource Allocation”– See our paper in the poster session of HotPar ‘11

• It manages resource allocation policy in the Berkeley ParLab’s Tesselation operating system– One of us (Bird) is currently implementing it

• Nearly all of the ideas presented here are being prototyped in PACORA and in Tesselation– If these ideas work they may be used elsewhere

Page 24: Resource Management for Computer Operating Systems Sarah Bird PhD Candidate Berkeley EECS Burton Smith Technical Fellow Microsoft.

24

Policy S

erv

ice

MajorChangeRequest

ACK/NACK

App Creation

and Resizing

Requestsfrom Users

Admission

Control

MinorChanges

ACK/NACK

App App App

All system

resources

App group with

fraction of resources

App

Current Application

Store

(Current Resources)

Global Policies /User Policies andPreferences

Offline Modelsand Behavioral

Parameters

Build and Refine

Resource Value

Functions

App

#1

App

#2

App

#3

User/System

Perfo

rman

ce

Repo

rts

Convex Optimization

Set Penalty Functions

Decision Validation

PACORA in Tessellation

Resource Allocations

Performance/Energy Counters

Page 25: Resource Management for Computer Operating Systems Sarah Bird PhD Candidate Berkeley EECS Burton Smith Technical Fellow Microsoft.

Acknowledgements• Stephen Boyd and Lieven Vandenberghe, whose

book taught us about convex optimization• Krste Asanović, David Patterson, and John

Kubiatowicz at Berkeley, for their strong support• The Microsoft Windows developers, for giving

us real applications and ways to measure them• Colleagues at the UC Berkeley ParLab and at

Microsoft Research, for helping us tell the story

Page 26: Resource Management for Computer Operating Systems Sarah Bird PhD Candidate Berkeley EECS Burton Smith Technical Fellow Microsoft.

Thank you. Any Questions?