EECS 750: Advanced Operating Systems, 01/31/2014, Heechul Yun


Page 2

Administrative

• Next summary assignment
  – Due by 11:59 p.m., Sunday
  – Improving Performance Isolation on Chip Multiprocessors via an Operating System Scheduler, PACT’07

• Sign up for presentations
  – First student presentation starts on Feb 3 (Monday)

• Project group
  – Due on Feb 3 (Monday)

Page 3

Recap: CPU Scheduling

• Unicore scheduling
  – Fairness and responsiveness
  – Completely Fair Scheduler
  – BVT

• Multicore scheduling
  – Partitioned scheduling
  – Load balancing
  – DWRR and global fairness

Page 5

Process Abstraction

• Basic unit for resource accounting
  – E.g., CPU time spent, memory size, …

• Basic unit for scheduling

• But…

Page 6

A Scenario

• Background tasks on your desktop
  – E.g., BitTorrent, antivirus scan, search-index creation for local files, …

• Control all background tasks so that together they consume
  – Less than 10% of CPU time
  – Less than 20% of total memory
  – Less than 50% of network bandwidth
  – …

Page 7

Another Scenario

• Tasks in a server node
  – Map/reduce processes
  – Web server processes
  – Search query processing processes
  – …

• You want to
  – Control all map/reduce processes’ resources
  – Control all web server processes’ resources
  – …

• How?

[Figure: Threads/server at Google (*)]

(*) Figure source: Zhang et al., “CPI2: CPU performance isolation for shared compute clusters”, EuroSys’13

Page 8

CGROUP in Linux

• Control Group
  – Started by engineers at Google in 2006

• Group multiple processes
  – E.g., EECS750 (15 processes), EECS678 (30 processes) groups

• Control resource usage on a per-group basis
  – E.g., 70% CPU to EECS750, 30% CPU to EECS678
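To see why per-group control matters for the EECS750/EECS678 example, the following minimal sketch works out what each course would get if the scheduler were fair per process instead of per group. It assumes every process is CPU-bound and equally weighted; the function name `per_process_share` is hypothetical, and the results are rounded down to whole percent.

```shell
# Percentage of CPU a group receives under per-process fairness,
# assuming all processes are CPU-bound and have equal weight.
# usage: per_process_share GROUP_PROCS TOTAL_PROCS   (integer, rounded down)
per_process_share() {
    pps_group=$1
    pps_total=$2
    echo $(( pps_group * 100 / pps_total ))
}

eecs750_procs=15
eecs678_procs=30
all_procs=$(( eecs750_procs + eecs678_procs ))

echo "EECS750 without groups: $(per_process_share "$eecs750_procs" "$all_procs")%"
echo "EECS678 without groups: $(per_process_share "$eecs678_procs" "$all_procs")%"
```

Under these assumptions EECS750 ends up with about a third of the CPU simply because it runs fewer processes, which is far from the 70% the slide wants to give it. A per-group policy decouples the allocation from the process count.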

Page 9

Group Hierarchy

• Represents a tree-structured relationship among groups

Root Group
  Undergraduate Group
  Graduate Group
    PhD Group
    Master Group

Page 10

Resource Management

• Control resource allocation on a per-group basis

Root Group (100% CPU, 100% MEM)
  Undergraduate Group (30% CPU, 50% MEM)
  Graduate Group (70% CPU, 50% MEM)
    PhD Group (30% CPU, 30% MEM)
    Master Group (40% CPU, 20% MEM)
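The numbers on this slide can be read as a budget that is split at each level of the tree. The sketch below checks that reading, assuming the figure assigns 30%/50% (CPU/MEM) to the undergraduate group, 70%/50% to the graduate group, and 30%/30% and 40%/20% to its PhD and master children. The helper `fits_within` is a hypothetical name, not part of any cgroup tool.

```shell
# Succeeds if the children's percentages add up to no more than the parent's.
# usage: fits_within PARENT CHILD...
fits_within() {
    fw_parent=$1
    shift
    fw_sum=0
    for fw_child in "$@"; do
        fw_sum=$(( fw_sum + fw_child ))
    done
    [ "$fw_sum" -le "$fw_parent" ]
}

# CPU: root 100 -> undergraduate 30 + graduate 70; graduate 70 -> phd 30 + master 40
fits_within 100 30 70 && echo "root CPU budget ok"
fits_within 70 30 40 && echo "graduate CPU budget ok"

# MEM: root 100 -> undergraduate 50 + graduate 50; graduate 50 -> phd 30 + master 20
fits_within 100 50 50 && echo "root MEM budget ok"
fits_within 50 30 20 && echo "graduate MEM budget ok"
```

Under this reading each child group's allocation is expressed as a fraction of the whole machine, and the children of a group never receive more than the group itself.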

Page 11

CGROUP Subsystems

• Control resources of a CGROUP

• Available subsystems
  – cpu: CPU bandwidth limit, weight (share), …
  – memory: memory size limit, …
  – cpuset: cores, memory controllers, …
  – …
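As a rough illustration of how the cpu and memory subsystems could express the desktop scenario from Page 6 (less than 10% CPU, less than 20% memory), the sketch below computes the values and prints the writes as a dry run instead of performing them. It assumes the cgroup v1 control files `cpu.cfs_period_us`, `cpu.cfs_quota_us`, and `memory.limit_in_bytes` (the CPU bandwidth files exist only on kernels built with CFS bandwidth control), interprets "10% of CPU" as 10% of one core, and uses a hypothetical group name `background` and an assumed 8 GiB machine.

```shell
# CFS quota in microseconds per period that caps a group at PERCENT of one CPU.
# usage: cpu_quota_us PERCENT PERIOD_US
cpu_quota_us() {
    cq_percent=$1
    cq_period=$2
    echo $(( cq_period * cq_percent / 100 ))
}

# Memory limit in bytes for PERCENT of a machine with TOTAL_KB of RAM.
# usage: mem_limit_bytes PERCENT TOTAL_KB
mem_limit_bytes() {
    ml_percent=$1
    ml_total_kb=$2
    echo $(( ml_total_kb * 1024 * ml_percent / 100 ))
}

period_us=100000      # 100 ms, the usual default of cpu.cfs_period_us
total_kb=8388608      # assumed 8 GiB; MemTotal in /proc/meminfo gives the real value
group=background      # hypothetical group name

# Dry run: print the writes that would configure the group.
echo "echo $period_us > $group/cpu.cfs_period_us"
echo "echo $(cpu_quota_us 10 "$period_us") > $group/cpu.cfs_quota_us"
echo "echo $(mem_limit_bytes 20 "$total_kb") > $group/memory.limit_in_bytes"
```

The network bandwidth limit from the scenario is left out on purpose: none of the three subsystems listed on this slide covers it.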

Page 12

Example

# mount -t cgroup none /sys/fs/cgroup
# cd /sys/fs/cgroup

# mkdir grad; mkdir grad/phd; mkdir grad/master

→ create ‘grad’, ‘phd’, ‘master’ CGROUPs

# for pid in 100 101; do echo $pid > grad/phd/tasks; done

→ assign PID 100, 101 to the ‘phd’ group (the tasks file accepts one PID per write)

# for pid in 200 201 202; do echo $pid > grad/master/tasks; done

→ assign PID 200, 201, 202 to the ‘master’ group

# echo 3072 > grad/phd/cpu.shares

# echo 4096 > grad/master/cpu.shares

→ assign 3:4 CPU weights to ‘phd’ and ‘master’ groups

See https://www.kernel.org/doc/Documentation/cgroups/cgroups.txt
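The 3:4 weights above translate into CPU fractions only when both groups are competing for the CPU. A minimal sketch of that arithmetic, assuming ‘phd’ and ‘master’ are the only busy siblings under ‘grad’ (the function name `share_percent` is hypothetical, and the result is truncated to two decimals):

```shell
# Percentage of the parent's CPU a group gets under contention:
# its cpu.shares divided by the sum of its busy siblings' cpu.shares.
# usage: share_percent MY_SHARES TOTAL_SHARES   (two decimals, truncated)
share_percent() {
    sp_mine=$1
    sp_total=$2
    sp_hundredths=$(( sp_mine * 10000 / sp_total ))
    printf '%d.%02d\n' $(( sp_hundredths / 100 )) $(( sp_hundredths % 100 ))
}

phd_shares=3072
master_shares=4096
sum_shares=$(( phd_shares + master_shares ))

echo "phd:    $(share_percent "$phd_shares" "$sum_shares")%"
echo "master: $(share_percent "$master_shares" "$sum_shares")%"
```

Shares are weights, not caps: if the ‘master’ processes are idle, the ‘phd’ group may use the CPU time they leave unused.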

Page 14

Some Background

• Dot-com bubble (1997-2000)
  – Many internet companies were founded
    • Google: 1998, Netflix: 1997
  – Every company wanted to create its website

• Web server research
  – First version of the Apache web server in 1995
  – High-performance web servers were a hot topic

• Client computing → Server computing

Page 15

Classical Application

• An application = A process = A resource principal (CPU time, memory)

Page 16

Kernel Intensive Application

• Scenario: an interrupt that receives packets for process A fires while process B is executing
  – Time spent in the ISR is charged to process B’s system time
  – Packet buffer memory is not charged to any process

• Resources spent in the kernel are not controlled
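The system time that the kernel charges to a process can be inspected from user space. The sketch below is only an illustration, assuming a Linux system with /proc mounted: it reads fields 14 (utime) and 15 (stime) of /proc/PID/stat, both in clock ticks. The function name `cpu_ticks` is hypothetical, and the output cannot tell how much of the system time came from interrupts handled on behalf of other processes.

```shell
# Print "utime stime" (clock ticks) for a PID, read from /proc/PID/stat.
# The command name in field 2 may contain spaces, so everything up to the
# last ") " is stripped before the line is split into fields.
# usage: cpu_ticks PID
cpu_ticks() {
    ct_line=$(cat "/proc/$1/stat") || return 1
    ct_rest=${ct_line##*) }     # remainder starts at field 3 (state)
    set -- $ct_rest             # word-split the remaining fields
    echo "${12} ${13}"          # fields 14 (utime) and 15 (stime) of the full line
}

# Show the ticks charged so far to the current shell.
cpu_ticks $$
```

Because the accounting is per process, any kernel work that happens to run in a process's context lands in that process's stime, which is the misattribution the scenario describes.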

Page 17

Application with Multiple Processes

• How to account and manage resources for a group of processes?

Page 18

Resource Container

• A logical abstraction to manage resources
  – Resources: CPU time, memory, …

• A resource container can be associated with multiple processes (threads)

• Can construct a hierarchy

Page 19

Example

[Figure: Web clients send requests to a web server with CGI processes #1 … #N. Labels: static documents (Container 1), dynamic documents (Container 2).]

Page 20

Web Server Throughput

Page 21

Summary

• Group resource management
  – Generic abstraction to account for and control resources

• CGROUP in Linux
  – Powerful tool that is heavily used by Google cloud, Android, and many Linux distros

Page 22

Discussion

• What resources are important?

• What resources are controllable by the OS?

• What resources are not controllable by the OS?