CMPT 431 Dr. Alexandra Fedorova Lecture III: OS Support.


Page 1: CMPT 431 Dr. Alexandra Fedorova Lecture III: OS Support.

CMPT 431

Dr. Alexandra Fedorova

Lecture III: OS Support

Page 2: CMPT 431 Dr. Alexandra Fedorova Lecture III: OS Support.

CMPT 431 © A. Fedorova

The Role of the OS

• The operating system needs to provide support for implementation of distributed systems

• We will look at how distributed systems services interact with the operating systems

• We will discuss the support that the operating system needs to provide

Page 3: CMPT 431 Dr. Alexandra Fedorova Lecture III: OS Support.

Direct Interaction with the OS

[Diagram: a process (a DS component) interacts with the OS via system calls]

• A process directly interacts with the OS via system calls

• Example: a web browser, a web server
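As a rough illustration of a process talking to the OS through system calls, the sketch below pushes a message through a kernel pipe. Python's `os.pipe`, `os.write`, `os.read` and `os.close` are thin wrappers over the corresponding system calls; the helper name `pipe_round_trip` and the sample payload are made up for this example.

```python
import os


def pipe_round_trip(payload: bytes) -> bytes:
    """Send a small payload through a kernel pipe and read it back."""
    read_fd, write_fd = os.pipe()  # pipe(2): the kernel creates the buffer
    try:
        written = os.write(write_fd, payload)  # write(2): copy into the kernel
        return os.read(read_fd, written)       # read(2): copy back out
    finally:
        os.close(read_fd)   # close(2)
        os.close(write_fd)  # close(2)


# Hypothetical payload; kept small so the write cannot block on a full pipe.
print(pipe_round_trip(b"GET /index.html"))
```

Each call crosses from user mode into the kernel and back, which is the kind of boundary crossing a web browser or web server performs on every network or file operation.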

Page 4: CMPT 431 Dr. Alexandra Fedorova Lecture III: OS Support.

Interaction via Middleware Layer

[Diagram: a process (a DS component) reaches the middleware through function calls or IPC; the middleware reaches the OS through system calls]

• A process interacts with the OS indirectly, via a middleware layer

• A middleware layer directly interacts with the OS

• Example: a peer-to-peer file system implemented over a distributed hash table

Page 5: CMPT 431 Dr. Alexandra Fedorova Lecture III: OS Support.

Interaction via Inclusion

[Diagram: the DS component sits inside the OS]

• A DS component is a part of the operating system, i.e., an operating system daemon

• Example: Network File System (NFS) daemon

• Runs as a kernel thread, shares address space with the kernel, interacts with the rest of the OS via function calls

• Why would one want to build a DS component that interacts with the OS via inclusion?

Page 6: CMPT 431 Dr. Alexandra Fedorova Lecture III: OS Support.

Digression: Protection Implementation In the Kernel

• System calls are expensive

• Why? Protection domains

• Refresh memory protection from your OS class

• Good thing: we get memory protection

• Bad thing: crossing protection domains is expensive. Why?

• So is this the best solution?

Page 7: CMPT 431 Dr. Alexandra Fedorova Lecture III: OS Support.

Alternative: Protection Via Language

• Safety features are guaranteed by the language/runtime

• The compiler checks that memory accesses are safe

• In addition, there are manifests stating what the process will and will not do

• This way you get protection

• And there is no need for hardware protection domains: everything can run in a single address space

• Singularity: an OS from Microsoft that implemented these concepts

• ... End digression

Page 8: CMPT 431 Dr. Alexandra Fedorova Lecture III: OS Support.

Infrastructure Provided by the OS

• Networking
– Interface to network devices
– Implementation of common protocols: TCP, UDP, IP

• Processes and threads
– Efficient scheduling, load balancing and thread switching
– Efficient thread synchronization
– Efficient inter-process communication (IPC)

Page 9: CMPT 431 Dr. Alexandra Fedorova Lecture III: OS Support.

The Need for Good Process/Thread Support

• Many distributed applications are implemented using multiple threads or processes

• Why?

Page 10: CMPT 431 Dr. Alexandra Fedorova Lecture III: OS Support.

Motivation for Multithreaded Designs

• Servers provide access to large data sets (web servers, e-commerce servers)

• Even in the presence of caching, they often need to do I/O (to access files on disk or a network FS)

• I/O takes much longer than computation

• Overlapping I/O with computation improves response time

• Threads make it easy to overlap I/O with computation

• While one thread blocks on I/O another can perform computation

[Figure: single thread: 1 request; multiple threads: 1.6 requests]
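The overlap described on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: `time.sleep` stands in for a blocking disk or network access, `IO_DELAY` is an arbitrary value, and the function names are invented for the example.

```python
import threading
import time

IO_DELAY = 0.05  # assumed stand-in for a disk or network wait, in seconds


def handle_request(request_id, results):
    time.sleep(IO_DELAY)  # simulated blocking I/O: the thread gives up the CPU
    results[request_id] = request_id * request_id  # the "computation" step


def serve_sequentially(num_requests):
    """Handle requests one after another in a single thread."""
    results = {}
    start = time.perf_counter()
    for request_id in range(num_requests):
        handle_request(request_id, results)
    return results, time.perf_counter() - start


def serve_with_threads(num_requests):
    """Handle each request in its own thread so the waits overlap."""
    results = {}
    threads = [
        threading.Thread(target=handle_request, args=(request_id, results))
        for request_id in range(num_requests)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, time.perf_counter() - start


sequential_results, sequential_time = serve_sequentially(8)
threaded_results, threaded_time = serve_with_threads(8)
print(f"sequential: {sequential_time:.2f}s, threaded: {threaded_time:.2f}s")
```

Both versions produce the same results; the threaded one finishes sooner because the simulated I/O waits happen at the same time instead of back to back.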

Page 11: CMPT 431 Dr. Alexandra Fedorova Lecture III: OS Support.

Process or Thread Scheduling

• Will use “process” and “thread” interchangeably
– A single-threaded process maps to a kernel thread
– Each thread in a multithreaded process (usually) maps to a kernel thread

• A scheduler decides which thread runs next on the CPU

• To ensure good support for DS components, a scheduler must:
– Be scalable
– Balance the load well
– Ensure good interactive response
– Keep context switches to a minimum (why?)

Page 12: CMPT 431 Dr. Alexandra Fedorova Lecture III: OS Support.

Case Study: Solaris™ 10 OS

• Solaris is often used on server systems

• Known for its good scalability, good load balancing and interactive performance

• We will look at Solaris runqueues and how they are managed
– A runqueue is a scheduling queue
– A structure containing pointers to runnable threads, i.e., threads that are waiting for CPU

Page 13: CMPT 431 Dr. Alexandra Fedorova Lecture III: OS Support.

Runqueues in Solaris

[Diagram: a global kernel priority queue (kpqueue), plus a set of user priority queues (disp_qs) for each CPU (CPU0, CPU1, …), with one queue per priority level, Pri 0 through Pri N]

• There is a user-level queue for each priority level

• A dispatcher runs the thread from the highest-priority non-empty queue
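A minimal sketch of the per-priority queues for one CPU, assuming that a larger number means a higher priority. The class and method names are hypothetical and are not taken from the Solaris source.

```python
from collections import deque


class Dispatcher:
    """One CPU's set of runqueues: one FIFO queue per priority level."""

    def __init__(self, num_priorities):
        self.queues = [deque() for _ in range(num_priorities)]

    def enqueue(self, thread_name, priority):
        # Runnable threads wait at the back of the queue for their priority.
        self.queues[priority].append(thread_name)

    def pick_next(self):
        # Scan from the highest priority down (assumption: larger = higher)
        # and run the first thread of the first non-empty queue.
        for priority in range(len(self.queues) - 1, -1, -1):
            if self.queues[priority]:
                return self.queues[priority].popleft()
        return None  # nothing is runnable


dispatcher = Dispatcher(num_priorities=4)
dispatcher.enqueue("batch-job", priority=0)
dispatcher.enqueue("web-server", priority=3)
dispatcher.enqueue("logger", priority=3)
print(dispatcher.pick_next())
```

Threads at the same priority run in arrival order, and a lower-priority thread runs only once every higher-priority queue is empty.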

Page 14: CMPT 431 Dr. Alexandra Fedorova Lecture III: OS Support.

Processor Load Balancing

• Load balancing ensures that the load is evenly distributed among the CPUs on a multiprocessor

• This improves the overall response time

• Solaris kernel ensures that queues are well balanced when it enqueues a thread into a runqueue

/*
 * setbackdq() keeps runqs balanced such that the difference in length
 * between the chosen runq and the next one is no more than RUNQ_MAX_DIFF.
 * (…)
 */

A comment from Solaris source code. Source: http://cvs.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/disp/disp.c, line 1200
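The balancing idea in that comment can be sketched as follows. This is not the Solaris implementation: the value of `RUNQ_MAX_DIFF`, the choice of "the next one" as the neighbouring CPU, and the helper names are all assumptions made for illustration.

```python
RUNQ_MAX_DIFF = 2  # assumed value, chosen only for this illustration


def choose_runq(runq_lengths, preferred_cpu):
    """Return the CPU whose runqueue should receive the next thread."""
    next_cpu = (preferred_cpu + 1) % len(runq_lengths)  # assumed neighbour
    # If enqueueing on the preferred CPU would push the gap past the limit,
    # fall back to the next CPU's queue instead.
    if runq_lengths[preferred_cpu] - runq_lengths[next_cpu] >= RUNQ_MAX_DIFF:
        return next_cpu
    return preferred_cpu


def enqueue_back(runqs, preferred_cpu, thread_name):
    """Append the thread to the back of the chosen runqueue."""
    cpu = choose_runq([len(q) for q in runqs], preferred_cpu)
    runqs[cpu].append(thread_name)
    return cpu


runqs = [[], []]
for i in range(6):
    enqueue_back(runqs, preferred_cpu=0, thread_name=f"t{i}")
print([len(q) for q in runqs])
```

Even though every thread asks for CPU 0 here, some spill over to CPU 1, so the two queue lengths never differ by more than `RUNQ_MAX_DIFF`.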

Page 15: CMPT 431 Dr. Alexandra Fedorova Lecture III: OS Support.

Tuning Thread Priorities For Improved Response Time

• If a thread has waited too long for a processor, its priority is elevated, so no thread is starved

• Threads holding critical resources are put to the front of the queue so that they release those resources as quickly as possible

/*
 * Put the specified thread on the front of the dispatcher
 * queue corresponding to its current priority.
 *
 * Called with the thread in transition, onproc or stopped state
 * and locked (transition implies locked) and at high spl.
 * Returns with the thread in TS_RUN state and still locked.
 */

A comment on setfrontdq from Solaris source code. Source: http://cvs.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/disp/disp.c, line 1381

Page 16: CMPT 431 Dr. Alexandra Fedorova Lecture III: OS Support.

Ensuring Good Responsiveness in Time-Sharing Scheduler

• Solaris’s time-sharing scheduler (the default scheduler) assigns priorities so as to ensure good interactive performance

• Timeslice: the amount of time a thread can run on CPU before it is pre-empted

• If thread T used up its entire timeslice on CPU:
– priority(T) ↓, timeslice(T) ↑

• If thread T has given up CPU before using up its timeslice:
– priority(T) ↑, timeslice(T) ↓

• Why is this done?

Page 17: CMPT 431 Dr. Alexandra Fedorova Lecture III: OS Support.

Time-Sharing Scheduler: Answers

• Minimizing context switch costs:
– CPU-bound threads stay on CPU longer without a context switch
– In compensation, they are scheduled less often, due to decreased priority
– Reducing the number of context switches improves performance

• Ensuring good response for interactive applications
– Interactive applications usually don’t use up their entire timeslice
– Example: process a network message and release the CPU before the timeslice expires
– Those applications will have their priority elevated, so they will respond quickly when response is needed (e.g., the next network packet arrives)
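The priority and timeslice adjustment rule can be sketched as below. The ranges, step sizes and names are assumptions chosen for readability; the sketch captures only the direction of each adjustment, not the actual values a real time-sharing scheduler uses.

```python
from dataclasses import dataclass

# Assumed bounds and step sizes, for illustration only.
MIN_PRIORITY, MAX_PRIORITY = 0, 59
MIN_SLICE_MS, MAX_SLICE_MS = 20, 200
PRIORITY_STEP = 10


@dataclass
class SchedThread:
    name: str
    priority: int
    timeslice_ms: int


def after_run(thread, used_full_timeslice):
    """Adjust a thread once it leaves the CPU."""
    if used_full_timeslice:
        # CPU-bound: lower priority, but a longer slice (fewer switches).
        thread.priority = max(MIN_PRIORITY, thread.priority - PRIORITY_STEP)
        thread.timeslice_ms = min(MAX_SLICE_MS, thread.timeslice_ms * 2)
    else:
        # Interactive: higher priority, so it runs soon after it wakes up.
        thread.priority = min(MAX_PRIORITY, thread.priority + PRIORITY_STEP)
        thread.timeslice_ms = max(MIN_SLICE_MS, thread.timeslice_ms // 2)


cpu_bound = SchedThread("matrix-multiply", priority=30, timeslice_ms=80)
interactive = SchedThread("network-handler", priority=30, timeslice_ms=80)
after_run(cpu_bound, used_full_timeslice=True)
after_run(interactive, used_full_timeslice=False)
print(cpu_bound, interactive)
```

Starting from the same settings, the two threads drift apart: the CPU-bound one is scheduled less often but runs longer each time, while the interactive one gets to the CPU quickly for short bursts.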

Page 18: CMPT 431 Dr. Alexandra Fedorova Lecture III: OS Support.

What Limits Performance of MP/MT Applications?

• The cost of context switching – depends on the hardware; the OS cannot fix it alone

– Save/restore the registers
– Flush the CPU pipeline
– If switching address spaces:
  • May need to flush the TLB (depends on the processor)
  • May need to flush the cache (depends on the processor)

• The cost of inter-process communication (IPC): requires context switching

• The cost of inter-thread synchronization – by and large depends on the program structure; OS can fix some of it, but not all

Page 19: CMPT 431 Dr. Alexandra Fedorova Lecture III: OS Support.

Thread Synchronization

• If a lock is not available, threads wait

• Execution becomes serialized
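A small sketch of how a lock serializes execution, using Python's `threading.Lock`. The function name `run_workers` and the shared counter are invented for the example.

```python
import threading


def run_workers(num_threads, increments):
    """Have several threads bump one shared counter under a single lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            # Only one thread at a time can hold the lock; the others wait
            # here, so the increments execute one after another.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter


print(run_workers(num_threads=4, increments=10_000))
```

The lock keeps the count correct, but while one thread is inside the critical section the other three make no progress, so adding threads does not speed up this part of the program.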

Page 20: CMPT 431 Dr. Alexandra Fedorova Lecture III: OS Support.

Next…

• Talk about synchronization

• Operating system support for efficient synchronization

• Transactional memory: a new programming paradigm for efficient synchronization