
Unit 7 Process Management

Structure:
7.1 Introduction
    Objectives
7.2 Process Migration
7.3 Threads
7.4 Terminal Questions

7.1 Introduction

The notion of a process is central to the understanding of operating systems. There are quite a few definitions presented in the literature, but no "perfect" definition has yet appeared.

Definition

The term "process" was first used by the designers of MULTICS in the 1960s. Since then, the term has been used somewhat interchangeably with 'task' or 'job'. The process has been given many definitions, for instance:

A program in execution.
An asynchronous activity.
The 'animated spirit' of a procedure in execution.
The entity to which processors are assigned.
The 'dispatchable' unit.

and many more. As we can see from the above, there is no universally agreed-upon definition, but "a program in execution" seems to be the most frequently used, and it is the concept adopted in the present study of operating systems.

Now that we have agreed upon a definition of process, the question is: what is the relation between a process and a program? Is it the same beast with a different name, or is it called a program when it is sleeping (not executing) and a process when it is executing? To be precise, a process is not the same as a program. In the following discussion we point out some of the differences between a process and a program.

A process is more than the program code. A process is an 'active' entity, as opposed to a program, which is considered a 'passive' entity. As we all know, a program is an algorithm expressed in some suitable notation (e.g., a programming language). Being passive, a program is only a part of a process. A process, on the other hand, includes:

The current value of the program counter (PC)
The contents of the processor's registers
The values of the variables
The process stack (SP), which typically contains temporary data such as subroutine parameters, return addresses, and temporary variables
A data section that contains global variables

A process is the unit of work in a system. (A small C sketch of this per-process state appears below.)
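The structure below is a minimal sketch of the per-process state just listed, expressed in C. The field names and sizes are illustrative assumptions, not the process control block of any particular operating system.

    /* Illustrative sketch of the state that makes up a process. */
    #include <stddef.h>
    #include <stdint.h>

    #define NUM_REGS 16

    struct process_state {
        uint64_t pc;              /* current value of the program counter */
        uint64_t regs[NUM_REGS];  /* contents of the processor's registers */
        uint64_t sp;              /* top of the process stack: subroutine
                                     parameters, return addresses,
                                     temporaries live here */
        void    *stack_base;      /* where the stack segment is mapped */
        size_t   stack_size;
        void    *data_section;    /* global variables */
        size_t   data_size;
    };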

In the process model, all software on the computer is organized into a number of sequential processes. A process includes the PC, registers, and variables. Conceptually, each process has its own virtual CPU; in reality, the CPU switches back and forth among the processes. (This rapid switching back and forth is called multiprogramming.)

Process Management

In a conventional (or centralized) operating system, process management deals with mechanisms and policies for sharing the processor of the system among all processes. In a distributed operating system, the main goal of process management is to make the best possible use of the processing resources of the entire system by sharing them among all the processes. Three important concepts are used in distributed operating systems to achieve this goal:

1. Processor allocation: deciding which process should be assigned to which processor.
2. Process migration: moving a process from its current location to the processor to which it has been assigned.
3. Threads: exploiting fine-grained parallelism for better utilization of the processing capability of the system.

This unit describes the concepts of process migration and threads.

Issues in Process Management

Transparent relocation of processes
– Preemptive process migration (costly)
– Non-preemptive process migration
Selecting the source and destination nodes for migration
Cost of migration – the size of the address space and the time taken to migrate
Address space transfer mechanisms – total freezing, pre-transferring, transfer on reference
Message forwarding for migrated processes
– Resending the message
– The origin site mechanism
– The link traversal mechanism
– The link update mechanism
Process migration in heterogeneous systems

Objectives:

This unit introduces the reader to the management of processes in a distributed network. It discusses the differences between processes running on a uni-processor system and processes running on a distributed system. It describes process migration mechanisms, by which processes may be shifted, or migrated, to different machines on the network depending on the availability of the resources needed to complete their execution. It also discusses the concept of threads, their mechanisms, and the differences between a thread and a process on a uni-processor system and on a distributed system.

7.2 Process Migration

Definition: The relocation of a process from its current location (the source node) to some other location (the destination node).

A process may be migrated either before it starts executing on its source node or during the course of its execution. The former is known as non-preemptive process migration and the latter as pre-emptive process migration.

Process migration involves the following steps:

1. Selection of a process to be migrated
2. Selection of the destination system or node
3. Actual transfer of the selected process to the destination system or node

The desirable features of a good process migration mechanism are transparency, minimal interference, minimal residual dependencies, efficiency, and robustness.

i) Transparency: Several levels of transparency are desirable:
Access to objects such as files and devices should be done in a location-independent manner. To accomplish this, the system should provide a mechanism for transparent object naming.
System calls should be location-independent. However, system calls related to the physical properties of a node need not be location-independent.
Interprocess communication should be transparent. Messages sent to a migrated process should be delivered to it transparently; i.e., the sender does not have to resend them.

ii) Minimal interference: Migration of a process should involve minimal interference to the progress of the process and to the system as a whole. For example, the freezing time should be minimized; this can be done by partial transfer of the address space.
iii) Minimal residual dependencies: A migrated process should not continue to depend in any way on its previous node, because such dependency diminishes the benefits of migration, and a failure of the previous node would cause the process to fail.
iv) Efficiency: The time required for migrating a process and the cost of supporting remote execution should be minimized.
v) Robustness: The failure of any node other than the one on which the process is running should not affect the execution of the process.

Process Migration Mechanism

Migration of a process is a complex activity that involves proper handling of several sub-activities in order to meet the requirements of a good process migration mechanism. The four major sub-activities involved in process migration are as follows:

1. Freezing the process on its source node and restarting it on another node
2. Transferring the process's address space from its source node to its destination node
3. Forwarding messages meant for the migrant process
4. Handling communication between cooperating processes that have been separated as a result of process migration

The commonly used mechanisms for handling each of these sub-activities are described below.

1. Mechanisms for freezing the process:

In pre-emptive process migration, the usual approach is to take a "snapshot" of the process's state on its source node and reinstate the snapshot on the destination node. For this, at some point during migration, the process is frozen on its source node, its state information is transferred to the destination node, and the process is restarted there using this state information. By freezing the process, we mean that its execution is suspended and all external interactions with it are deferred. (A small sketch of the suspend/resume step appears below.)
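As a rough illustration only, the following C sketch suspends and later resumes a process on a UNIX-like node with SIGSTOP and SIGCONT. This is not the mechanism of any particular migration facility, and capture_state is a hypothetical placeholder for the state-capture step.

    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>

    /* Hypothetical helper: capture whatever state the migration mechanism
     * needs (registers, address space, open-file information) while the
     * target process is stopped. */
    static void capture_state(pid_t pid)
    {
        printf("capturing state of frozen process %ld...\n", (long)pid);
    }

    int main(int argc, char *argv[])
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s <pid>\n", argv[0]);
            return 1;
        }
        pid_t target = (pid_t)atol(argv[1]);

        /* Freeze: suspend execution; external interactions with the
         * process would be deferred by the migration layer. */
        if (kill(target, SIGSTOP) == -1) { perror("SIGSTOP"); return 1; }

        capture_state(target);

        /* In a real migration the process would be restarted on the
         * destination node; here we simply resume it in place. */
        if (kill(target, SIGCONT) == -1) { perror("SIGCONT"); return 1; }
        return 0;
    }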

Some general issues involved in these operations are described below:

i) Immediate and delayed blocking: When can these two approaches be used? If the process is not executing a system call, it can be blocked immediately. If the process is executing a system call, it may or may not be possible to block it immediately, depending on the situation and the implementation.

ii) Fast and slow I/O operations: It is feasible to wait for fast I/O operations (e.g., disk I/O) to complete after blocking the process. However, it is not feasible to wait for slow I/O operations, such as those on a terminal; proper mechanisms are necessary to allow such I/O operations to continue after migration.

iii) Information about open files: The names of open files, their file descriptors, current modes, the current positions of their file pointers, etc., need to be preserved and transferred. Also, temporary files are more efficiently created at the node on which the process is currently executing.

iv) Reinstating the process on the destination node: An empty process is created on the destination node, the state of the transferred process is copied into it, and the process is then unfrozen.

v) Address transfer mechanisms: Migration of a process involves the transfer of the process state (which includes the contents of registers, memory tables, I/O states, process identifiers, etc.) and the process's address space (i.e., its code, data, and program stack). There are three ways to transfer the address space (a sketch of the pre-transfer approach follows this list):

a) Total freezing: Process execution is stopped while the address space is being transferred. It is simple but inefficient, since the process remains frozen for the whole transfer.

b) Pre-transferring: The address space is transferred while the process is still running on the source node. The initial transfer is followed by repeated transfers of the pages modified during the previous round, until the remaining set of modified pages is small enough to be moved after freezing the process.

c) Transfer on reference: Only a part of the address space is transferred initially; the rest is fetched from the source node on demand, as the process references it.
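To make the pre-transferring idea concrete, here is a small self-contained simulation in C. It is only a sketch under simplifying assumptions: the address space is modelled as an array of dirty-page flags, and send_page and touch_pages are hypothetical stand-ins for the network transfer and for the still-running process modifying pages.

    #include <stdbool.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define NPAGES     64
    #define THRESHOLD   4   /* freeze when this few dirty pages remain */
    #define MAX_ROUNDS  8   /* give up pre-copying after this many rounds */

    static bool dirty[NPAGES];

    static void send_page(int page) { (void)page; /* stands in for network I/O */ }

    static void touch_pages(void)   /* the process keeps running and dirties pages */
    {
        for (int i = 0; i < NPAGES / 8; i++)
            dirty[rand() % NPAGES] = true;
    }

    static int transfer_dirty(void)  /* send every dirty page, clear its flag */
    {
        int sent = 0;
        for (int p = 0; p < NPAGES; p++)
            if (dirty[p]) { send_page(p); dirty[p] = false; sent++; }
        return sent;
    }

    int main(void)
    {
        for (int p = 0; p < NPAGES; p++) dirty[p] = true;  /* nothing sent yet */

        int round = 0, remaining = NPAGES;
        while (remaining > THRESHOLD && round < MAX_ROUNDS) {
            transfer_dirty();        /* copy pages while the process still runs */
            touch_pages();           /* ...so some pages get dirtied again */
            remaining = 0;
            for (int p = 0; p < NPAGES; p++) remaining += dirty[p];
            round++;
            printf("round %d: %d pages still dirty\n", round, remaining);
        }

        /* Final step: freeze the process and send the last dirty pages. */
        printf("freezing; final transfer of %d pages\n", transfer_dirty());
        return 0;
    }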

vi) Message forwarding mechanisms: After the process has been migrated, messages bound for it should be forwarded to its current node. Three types of messages need to be handled:

a) messages received at the source node after the process's execution has been stopped there but before it has been started at the new node;
b) messages received at the source node after the process has started executing at the destination node;
c) messages sent to the migrant process from any other node after it has started executing at the destination node.

Message Forwarding Mechanisms

In migrating a process, it must be ensured that all pending, en-route, and future messages arrive at the process's new location. The messages to be forwarded to the migrant process's new location can be classified as follows:

Type 1: Messages received at the source node after the process's execution has been stopped on the source node but has not yet been started on the destination node.

Type 2: Messages received at the source node after the process's execution has started on the destination node.

Type 3: Messages that are sent to the migrant process from any other node after it has started executing on the destination node.

The different mechanisms used for message forwarding in existing distributed systems are described below:

1. Resending the message: Instead of the source node forwarding the messages received for the migrated process, it notifies the sender about the new status of the process. The sender then locates the process and resends the message.

2. Origin site mechanism: The process's origin site is embedded in its process identifier. Each site is responsible for keeping information about the current locations of all the processes created on it. Messages are always sent to the origin site, which then forwards them to the process's current location. A drawback of this approach is that a failure of the origin site disrupts message forwarding. Another drawback is the continuous load imposed on the origin site.

3. Link traversal mechanism: A forwarding address (link) is left at the source node. The forwarding address has two components:
– The first component is a system-wide unique process identifier, consisting of (the id of the node on which the process was created, the local pid).
– The second component is the last known location of the process. This component is updated when the corresponding process is accessed from the node.

A sketch of such a forwarding-address record and of link traversal appears below.
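The following C sketch illustrates the link traversal idea under assumed names: a forwarding-address record with the two components described above, and a toy per-node table that stands in for the forwarding links a real system would keep (and would consult via messages).

    #include <stdio.h>

    struct proc_id {              /* system-wide unique process identifier */
        int creation_node;        /* node on which the process was created */
        int local_pid;            /* pid local to that node */
    };

    struct forward_addr {
        struct proc_id pid;       /* component 1: unique identifier */
        int last_known_node;      /* component 2: last known location */
    };

    /* Toy forwarding tables: next_hop[n] is where node n believes the example
     * process now lives; n == next_hop[n] means "resident here". */
    static int next_hop[4] = { 2, 1, 3, 3 };   /* process migrated 0 -> 2 -> 3 */

    static int lookup_link(int node, struct proc_id pid)
    {
        (void)pid;                /* the toy table tracks just one process */
        return next_hop[node];
    }

    /* Follow forwarding links until the hosting node is reached. */
    static int locate_process(int start_node, struct proc_id pid)
    {
        int node = start_node, next;
        while ((next = lookup_link(node, pid)) != node)
            node = next;          /* a real system would also update stale links */
        return node;
    }

    int main(void)
    {
        struct proc_id pid = { .creation_node = 0, .local_pid = 42 };
        printf("process (%d,%d) found on node %d\n",
               pid.creation_node, pid.local_pid, locate_process(0, pid));
        return 0;
    }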

Co-processes Handling Mechanisms

In systems that allow process migration, an important issue is the need to provide efficient communication between a process (the parent) and its sub-processes (children), which may have been migrated and placed on different nodes. The two mechanisms used by existing distributed operating systems to take care of this problem are described below:

1. Disallowing separation of co-processes: There are two ways to do this:
Disallow migration of processes that wait for one or more of their children to complete.
Migrate child processes along with their parent process.

2. Home node or origin site concept: This approach allows the processes and sub-processes to migrate independently. All communication between the parent and child processes takes place via the home node.

Process Migration in Heterogeneous Systems

The following issues arise when processes migrate between heterogeneous nodes:

Use an external data representation mechanism to handle differences in internal data representation between nodes (a sketch of this idea follows this list).
Issues related to floating-point representation need to be addressed; i.e., the number of bits allocated to the mantissa and the exponent in the external representation should be at least as large as in the largest representation used in the system.
Signed infinity and signed zero representations: not all nodes in the system may support these.
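As a minimal sketch of the external-data-representation idea, the C program below converts 32-bit integers to and from network byte order as the canonical external form. A real facility (for example an XDR-style encoding) would also cover floating point, word sizes, and alignment.

    #include <arpa/inet.h>   /* htonl, ntohl */
    #include <stdint.h>
    #include <stdio.h>

    /* Encode a value into the external representation before sending it. */
    static uint32_t to_external(uint32_t host_value)
    {
        return htonl(host_value);
    }

    /* Decode a received value into this node's native representation. */
    static uint32_t from_external(uint32_t wire_value)
    {
        return ntohl(wire_value);
    }

    int main(void)
    {
        uint32_t counter = 123456789;          /* part of the migrated state */
        uint32_t wire = to_external(counter);  /* what actually travels */
        printf("host=%u wire=0x%08x back=%u\n",
               (unsigned)counter, (unsigned)wire,
               (unsigned)from_external(wire));
        return 0;
    }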

Process Migration Merits

Reducing the average response time of processes
Speeding up individual jobs
Gaining higher throughput
Utilizing resources effectively
Reducing network traffic
Improving system reliability
Improving system security

7.3 Threads

Threads are a popular way to improve application performance through parallelism. In traditional operating systems, the basic unit of CPU utilization is a process. Each process has its own program counter, register states, stack, and address space. In operating systems with a threads facility, the basic unit of CPU utilization is a thread. In such operating systems, a process consists of an address space and one or more threads of control. Each thread of a process has its own program counter, register states, and stack, but all the threads of a process share the same address space. Hence they also share the same global variables. In addition, all threads of a process share the same set of operating system resources, such as open files, child processes, semaphores, signals, accounting information, and so on. Threads share the CPU in the same way as processes do; i.e., on a uni-processor system threads run in a time-sharing mode, whereas on a shared-memory multiprocessor as many threads can run simultaneously as there are processors. Like traditional processes, threads can create child threads, can block waiting for system calls to complete, and can change states during the course of their execution. At a particular instant of time, a thread is in one of several states: running, blocked, ready, or terminated. In operating systems with a threads facility, a process having a single thread corresponds to a process of a traditional operating system. Threads are referred to as lightweight processes and traditional processes as heavyweight processes. (A small pthreads sketch of threads sharing one address space follows.)
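The following sketch uses POSIX threads (pthreads, compiled with -pthread) to show several threads of one process sharing the same address space while each has its own stack; the variable and function names are illustrative, not taken from the text.

    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS 4

    static int results[NTHREADS];   /* shared global data: one slot per thread */

    static void *worker(void *arg)
    {
        int id = *(int *)arg;       /* 'id' lives on this thread's private stack */
        results[id] = id * id;      /* ...but 'results' is visible to all threads */
        return NULL;
    }

    int main(void)
    {
        pthread_t tids[NTHREADS];
        int ids[NTHREADS];

        for (int i = 0; i < NTHREADS; i++) {
            ids[i] = i;
            pthread_create(&tids[i], NULL, worker, &ids[i]);
        }
        for (int i = 0; i < NTHREADS; i++)
            pthread_join(tids[i], NULL);   /* wait for all threads to finish */

        for (int i = 0; i < NTHREADS; i++)
            printf("thread %d wrote %d\n", i, results[i]);
        return 0;
    }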

Why Threads?

Some of the limitations of the traditional process model are listed below:

1. Many applications wish to perform several largely independent tasks that can run concurrently but must share the same address space and other resources; for example, a database server or a file server. (UNIX's make facility, for instance, allows users to compile several files in parallel, but it has to use a separate process for each.)

2. Creating several processes and maintaining them involves a lot of overhead. When a context switch occurs, the state information of the process (register values, page tables, file descriptors, outstanding I/O requests, etc.) needs to be saved.

3. On UNIX systems, new processes are created using the fork system call, and fork is an expensive system call.

4. Processes cannot take full advantage of multiprocessor architectures, because a process can use only one processor at a time; an application must create a number of processes and dispatch them to the available processors.

5. Switching between threads sharing the same address space is considerably cheaper than switching between processes. The traditional UNIX process is single-threaded. (The fork-versus-thread sketch after this list illustrates the difference in address-space sharing.)
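The sketch below contrasts, under POSIX assumptions, a child created with fork (which gets a separate copy of the address space) with a thread created with pthread_create (which shares the address space). It is illustrative only; compile with -pthread.

    #include <pthread.h>
    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    static int shared_value = 0;    /* global variable in the data section */

    static void *thread_body(void *arg)
    {
        (void)arg;
        shared_value = 2;           /* same address space: main() sees this */
        return NULL;
    }

    int main(void)
    {
        pid_t child = fork();
        if (child == 0) {           /* child process: separate copy of memory */
            shared_value = 1;       /* modifies the child's copy only */
            _exit(0);
        }
        waitpid(child, NULL, 0);
        printf("after forked child: shared_value = %d\n", shared_value); /* 0 */

        pthread_t tid;
        pthread_create(&tid, NULL, thread_body, NULL);
        pthread_join(tid, NULL);
        printf("after joined thread: shared_value = %d\n", shared_value); /* 2 */
        return 0;
    }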

Consider a set of single-threaded processes executing on a uni-processor machine, as in Figure 7.1. The first three processes were spawned by a server in response to three clients; the lower two processes run some other server application.

Figure 7.1: Traditional UNIX system – uniprocessor with single-threaded processes

Now consider the same two servers running on a uni-processor system, with each server running as a single process containing multiple threads that share a single address space. Inter-thread context switching can be handled either by the OS kernel or by a user-level threads library. Eliminating multiple, nearly identical address spaces for each application reduces the load on the memory subsystem.

The disadvantage is that multithreaded processes must be concerned with synchronizing the access of several of their own threads to shared objects.

Figure 7.2 shows two multithreaded processes running on a multiprocessor. All threads of one process share the same address space but may run on different processors. This gives improved performance, but synchronization becomes more complicated.

Figure 7.2: Multithreaded processes in a multiprocessor system

To summarize:

A process can be divided into two components – a set of threads and a collection of resources. The collection of resources includes an address space, open files, user credentials, quotas, etc., which are shared by all threads in the process.

A thread:
is a dynamic object that represents a control point in the process and executes a sequence of instructions;
has its own private objects: a program counter, a stack, and a register context.

User-level thread libraries: The IEEE POSIX standards group has produced several drafts of a threads package known as pthreads. Sun's Solaris OS supports the pthreads library and has also implemented its own threads library.

Models for Organizing Threads

The following are some ways of organizing threads (a sketch of the dispatcher-workers model appears after this list):

Dispatcher-workers model: A dispatcher thread accepts requests from clients and dispatches each request to an appropriate free worker thread for further processing.

Team model: All threads behave as equals in this model. Each thread gets and processes a client's request on its own.

Pipeline model: Threads are arranged in a pipeline, so that the output data generated by the first thread is used for processing by the second thread, the output of the second thread is used by the third, and so on.
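The following is a compact, illustrative sketch of the dispatcher-workers model using POSIX threads (compile with -pthread). The dispatcher pushes numbered requests into a shared queue protected by a mutex and a condition variable, and free worker threads pick them up; all names here are assumptions, not taken from the text.

    #include <pthread.h>
    #include <stdio.h>

    #define NWORKERS  3
    #define NREQUESTS 9
    #define QSIZE     16
    #define STOP      (-1)          /* sentinel telling a worker to exit */

    static int queue[QSIZE];
    static int head, tail, count;
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  nonempty = PTHREAD_COND_INITIALIZER;

    static void enqueue(int req)            /* dispatcher side */
    {
        pthread_mutex_lock(&lock);
        queue[tail] = req;                  /* assume the queue never overflows */
        tail = (tail + 1) % QSIZE;
        count++;
        pthread_cond_signal(&nonempty);
        pthread_mutex_unlock(&lock);
    }

    static int dequeue(void)                /* worker side */
    {
        pthread_mutex_lock(&lock);
        while (count == 0)
            pthread_cond_wait(&nonempty, &lock);
        int req = queue[head];
        head = (head + 1) % QSIZE;
        count--;
        pthread_mutex_unlock(&lock);
        return req;
    }

    static void *worker(void *arg)
    {
        long id = (long)arg;
        for (;;) {
            int req = dequeue();
            if (req == STOP) break;
            printf("worker %ld handling request %d\n", id, req);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t tids[NWORKERS];
        for (long i = 0; i < NWORKERS; i++)
            pthread_create(&tids[i], NULL, worker, (void *)i);

        /* The dispatcher role: accept requests (here just numbered ids)
         * and hand each to whichever worker becomes free. */
        for (int r = 0; r < NREQUESTS; r++)
            enqueue(r);
        for (int i = 0; i < NWORKERS; i++)
            enqueue(STOP);

        for (int i = 0; i < NWORKERS; i++)
            pthread_join(tids[i], NULL);
        return 0;
    }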

User-level Threads Libraries

The interface provided by a threads package must include several important facilities, such as those for:

Creating and terminating threads
Suspending and resuming threads
Assigning priorities to individual threads
Thread scheduling and context switching
Synchronizing activities through facilities such as semaphores and mutual exclusion locks (see the semaphore sketch after this list)
Sending messages from one thread to another
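As a minimal sketch of the synchronization facility, the program below uses a POSIX unnamed semaphore as a mutual exclusion lock around a shared counter updated by two threads (compile with -pthread; the counter and names are illustrative).

    #include <pthread.h>
    #include <semaphore.h>
    #include <stdio.h>

    static sem_t mutex;             /* binary semaphore used as a lock */
    static long shared_counter = 0;

    static void *increment(void *arg)
    {
        (void)arg;
        for (int i = 0; i < 100000; i++) {
            sem_wait(&mutex);       /* enter critical section */
            shared_counter++;
            sem_post(&mutex);       /* leave critical section */
        }
        return NULL;
    }

    int main(void)
    {
        sem_init(&mutex, 0, 1);     /* shared among threads only, initial value 1 */

        pthread_t t1, t2;
        pthread_create(&t1, NULL, increment, NULL);
        pthread_create(&t2, NULL, increment, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);

        printf("counter = %ld (expected 200000)\n", shared_counter);
        sem_destroy(&mutex);
        return 0;
    }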

Case Study – DCE threads

DCE threads comply with the IEEE POSIX (Portable Operating System Interface) standard known as pthreads.

DCE provides a set of user-level library procedures for the creation and maintenance of threads. To access the thread services, DCE provides an API that is compatible with the POSIX standard.

If a system supporting DCE has no intrinsic support for threads, the API provides an interface to the threads library that is linked to the application. If the system supporting DCE has OS kernel support for threads, DCE is set up to use this facility; in this case, the API serves as an interface to the kernel-supported threads facility.

7.4 Terminal Questions

1. Differentiate between pre-emptive and non-preemptive process migration. Mention their advantages and disadvantages.
2. Discuss the issues involved in freezing a migrant process on its source node and restarting it on its destination node.
3. Discuss the threading issues with respect to process management in a DSM system.