MC0070 SMU MCA SEM2 2011


MC0070 – Operating Systems with Unix, Book ID: B0682 & B0683

Set-1

1. Define process. Explain the major components of a process.

Ans: 1

Process: A process can be defined simply as a program in execution. Along with the program code, a process comprises the program counter value, the processor register contents, the values of variables, the stack, and program data. A process is created and terminated, and during its lifetime it passes through some or all of the process states: New, Ready, Running, Waiting, and Exit.

A process is not the same as a program; a process is more than the program code. A process is an ‘active’ entity, as opposed to a program, which is a ‘passive’ entity. A program is an algorithm expressed in some programming language; being passive, it is only one part of a process. A process, on the other hand, includes:

· The current value of the Program Counter (PC)

· The contents of the processor’s registers

· The values of the variables

· The process stack, which typically contains temporary data such as subroutine parameters, return addresses, and temporary variables

· A data section that contains global variables

A process is the unit of work in a system.

A process has certain attributes that directly affect execution; these include:

PID - The process identifier, a unique number that identifies the process within the kernel.

PPID - The parent PID: the PID of the process that created this process.

UID - The user ID of the user that owns this process.

EUID - The effective user ID of the process.

GID - The group ID of the user that owns this process.

EGID - The effective group ID of the process.

Priority - The priority that this process runs at.

To view processes, use the ps command:

# ps -l
 F S   UID   PID  PPID C PRI NI P  SZ:RSS    WCHAN TTY   TIME COMD
30 S     0 11660   145 1  26 20 *  66:20  88249f10 ttyq6 0:00 rlogind


The F field: This is the flag field. It uses hexadecimal values which are added to show the value of the flag bits for the process. For a normal user process this will be 30, meaning it is loaded into memory.

The S field: The S field is the state of the process, the two most common values are S for Sleeping and R for Running. An important value to look for is X, which means the process is waiting for memory to become available.

PID field: The PID shows the Process ID of each process. This value should be unique. Generally, PIDs are allocated from lowest to highest, but wrap at some point. This value is necessary for you to send a signal to a process, such as the KILL signal.

PRI field: This stands for priority field. The lower the value, the higher the priority. This refers to the process NICE value. It will range from 0 to 39. The default is 20; as a process uses the CPU, the system will raise the nice value.

P flag: This is the processor flag. On the SGI this refers to the processor the process is running on.

SZ field: The size of the process, given as the total number of pages in the process. Each page is 4096 bytes.

TTY field: This is the terminal assigned to your process.

Time field: The cumulative execution time of the process in minutes and seconds.

COMD field: The command that was executed.

In the process model, all software on the computer is organized into a number of sequential processes. A process includes its PC, registers, and variables. Conceptually, each process has its own virtual CPU; in reality, the CPU switches back and forth among processes.
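For example, most modern ps implementations let you select exactly these attributes as output columns (the -o option is POSIX, though the available column keywords vary slightly between systems):

```shell
# Show selected attributes of the current shell's process.
# -o picks the output columns; -p restricts output to one PID.
# $$ expands to the shell's own PID.
ps -o pid,ppid,uid,comm -p $$

# Print a single attribute with no header line
# (a trailing '=' after the column name suppresses the header):
ps -o ppid= -p $$
```

This is often more convenient than parsing the fixed columns of `ps -l` shown above.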

2. Describe the following:

A.) Layered Approach

B.) Micro Kernels

C.) Virtual Machines

Ans:2

Layered Approach

With proper hardware support, operating systems can be broken into pieces that are smaller and more appropriate than those allowed by the original MS-DOS or UNIX systems. The operating system can then retain much greater control over the computer and over the applications that make use of that computer. Implementers have more freedom in changing the inner workings of the system and in creating modular operating systems. Under the top-down approach, the overall functionality and features are determined and then separated into components. Information hiding is also important, because it leaves programmers free to implement the low-level routines as they see fit, provided that the external interface of the routine stays unchanged and that the routine itself performs the advertised task.

A system can be made modular in many ways. One method is the layered approach, in which the operating system is broken up into a number of layers (levels). The bottom layer (layer 0) is the hardware; the highest (layer N) is the user interface.

 

[Fig. 2.2: Layered Architecture. Layers from top to bottom: Users; File Systems; Inter-process Communication; I/O and Device Management; Virtual Memory; Primitive Process Management; Hardware]

An operating-system layer is an implementation of an abstract object made up of data and the operations that can manipulate those data. A typical operating-system layer, say layer M, consists of data structures and a set of routines that can be invoked by higher-level layers. Layer M, in turn, can invoke operations on lower-level layers.

The main advantage of the layered approach is simplicity of construction and debugging. The layers are selected so that each uses functions (operations) and services of only lower-level layers. This approach simplifies debugging and system verification. The first layer can be debugged without any concern for the rest of the system, because, by definition, it uses only the basic hardware (which is assumed correct) to implement its functions. Once the first layer is debugged, its correct functioning can be assumed while the second layer is debugged, and so on. If an error is found during debugging of a particular layer, the error must be on that layer, because the layers below it are already debugged. Thus, the design and implementation of the system is simplified.

Each layer is implemented with only those operations provided by lower-level layers. A layer does not need to know how these operations are implemented; it needs to know only what these operations do. Hence, each layer hides the existence of certain data structures, operations, and hardware from higher-level layers. The major difficulty with the layered approach involves appropriately defining the various layers. Because a layer can use only lower-level layers, careful planning is necessary. For example, the device driver for the backing store (disk space used by virtual-memory algorithms) must be at a lower level than the memory-management routines, because memory management requires the ability to use the backing store.


Other requirements may not be so obvious. The backing-store driver would normally be above the CPU scheduler, because the driver may need to wait for I/O and the CPU can be rescheduled during this time. However, on a larger system, the CPU scheduler may have more information about all the active processes than can fit in memory. Therefore, this information may need to be swapped in and out of memory, requiring the backing-store driver routine to be below the CPU scheduler. A final problem with layered implementations is that they tend to be less efficient than other types. For instance, when a user program executes an I/O operation, it executes a system call that is trapped to the I/O layer, which calls the memory-management layer, which in turn calls the CPU-scheduling layer, which is then passed to the hardware. At each layer, the parameters may be modified; data may need to be passed, and so on. Each layer adds overhead to the system call; the net result is a system call that takes longer than one on a non-layered system. These limitations have caused a small backlash against layering in recent years. Fewer layers with more functionality are being designed, providing most of the advantages of modularized code while avoiding the difficult problems of layer definition and interaction.

Micro-kernels

As operating systems grew, the kernel became large and difficult to manage. In the mid-1980s, researchers at Carnegie Mellon University developed an operating system called Mach that modularized the kernel using the microkernel approach. This method structures the operating system by removing all nonessential components from the kernel and implementing them as system- and user-level programs. The result is a smaller kernel. There is little consensus regarding which services should remain in the kernel and which should be implemented in user space. Typically, however, micro-kernels provide minimal process and memory management, in addition to a communication facility.

 

 

 

[Fig. 2.3: Microkernel Architecture. Client processes, the file server, device drivers, virtual memory, and other services run in user space; they all sit on top of the microkernel, which runs directly on the hardware]

The main function of the microkernel is to provide a communication facility between the client program and the various services that are also running in user space. Communication is provided by message passing: the client program and a service never interact directly. Rather, they communicate indirectly by exchanging messages with the microkernel.
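The indirection can be illustrated in ordinary shell with a named pipe standing in for the microkernel's message channel (a loose analogy only; real microkernels use kernel-mediated message queues, and the path names below are invented for the demo):

```shell
dir=$(mktemp -d)             # scratch directory for the demo
mkfifo "$dir/channel"        # named pipe: our stand-in for a message queue

# "Service": waits for one message on the channel, then writes a reply.
( read request < "$dir/channel"
  echo "service handled: $request" > "$dir/reply" ) &

# "Client": never calls the service directly; it only sends a message.
echo "open file.txt" > "$dir/channel"

wait                         # let the background service finish
cat "$dir/reply"             # service handled: open file.txt
rm -r "$dir"                 # clean up
```

The client and the service share nothing but the channel, which is the essential property message passing provides.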


One benefit of the microkernel approach is ease of extending the operating system. All new services are added to user space and consequently do not require modification of the kernel. When the kernel does have to be modified, the changes tend to be fewer, because the microkernel is a smaller kernel. The resulting operating system is easier to port from one hardware design to another. The microkernel also provides more security and reliability, since most services run as user processes rather than kernel processes; if a service fails, the rest of the operating system remains untouched.

Several contemporary operating systems have used the microkernel approach. Tru64 UNIX (formerly Digital UNIX) provides a UNIX interface to the user, but it is implemented with a Mach kernel. The Mach kernel maps UNIX system calls into messages to the appropriate user-level services.

In the UNIX operating system architecture, the hardware sits at the center, covered by the kernel. Above that are the UNIX utilities and the command interface, such as the shell (sh).

Virtual Machines

The layered approach of operating systems is taken to its logical conclusion in the concept of the virtual machine. The fundamental idea behind a virtual machine is to abstract the hardware of a single computer (the CPU, memory, disk drives, network interface cards, and so forth) into several different execution environments, thereby creating the illusion that each separate execution environment is running its own private computer. By using CPU scheduling and virtual memory techniques, an operating system can create the illusion that a process has its own processor with its own (virtual) memory. Normally a process has additional features, such as system calls and a file system, which are not provided by the hardware. The virtual machine approach does not provide any such additional functionality, but rather an interface that is identical to the underlying bare hardware. Each process is provided with a (virtual) copy of the underlying computer.

Hardware Virtual machine

The original meaning of virtual machine, sometimes called a hardware virtual machine, is that of a number of discrete identical execution environments on a single computer, each of which runs an operating system (OS). This can allow applications written for one OS to be executed on a machine which runs a different OS, or provide execution “sandboxes” which provide a greater level of isolation between processes than is achieved when running multiple processes on the same instance of an OS. One use is to provide multiple users the illusion of having an entire computer, one that is their “private” machine, isolated from other users, all on a single physical machine. Another advantage is that booting and restarting a virtual machine can be much faster than with a physical machine, since it may be possible to skip tasks such as hardware initialization.

Such software is now often referred to with the terms virtualization and virtual servers. The host software which provides this capability is often referred to as a virtual machine monitor or hypervisor.

Software virtualization can be done in three major ways:

· Emulation, full system simulation, or “full virtualization with dynamic recompilation”: the virtual machine simulates the complete hardware, allowing an unmodified OS for a completely different CPU to be run.

· Paravirtualization: the virtual machine does not simulate hardware but instead offers a special API that requires OS modifications. An example of this is XenSource’s XenEnterprise (www.xensource.com).

· Native virtualization and “full virtualization”: the virtual machine only partially simulates enough hardware to allow an unmodified OS to be run in isolation, but the guest OS must be designed for the same type of CPU. The term native virtualization is also sometimes used to designate that hardware assistance through Virtualization Technology is used.

Application virtual machine

Another meaning of virtual machine is a piece of computer software that isolates the application being used by the user from the computer. Because versions of the virtual machine are written for various computer platforms, any application written for the virtual machine can be operated on any of the platforms, instead of having to produce separate versions of the application for each computer and operating system. The application is run on the computer using an interpreter or Just In Time compilation. One of the best known examples of an application virtual machine is Sun Microsystem’s Java Virtual Machine.


3. Memory management is important in operating systems. Discuss the main problems that can occur if memory is managed poorly.

Ans: 3

The part of the operating system which handles memory management is called the memory manager. Since every process must have some amount of primary memory in order to execute, the performance of the memory manager is crucial to the performance of the entire system. Virtual memory refers to the technique in which some space on the hard disk is used as an extension of main memory, so that a user program need not worry if its size exceeds the size of the main memory.

For paging memory management, each process is associated with a page table. Each entry in the table contains the frame number of the corresponding page in the virtual address space of the process. This same page table is also the central data structure for virtual memory mechanisms based on paging, although more facilities are needed, such as control bits and multi-level page tables. Segmentation is another popular method for both memory management and virtual memory.

Basic Cache Structure: The idea of cache memories is similar to virtual memory, in that some active portion of a low-speed memory is stored in duplicate in a higher-speed cache memory. When a memory request is generated, the request is first presented to the cache memory; if the cache cannot respond, the request is then presented to main memory.

Content-Addressable Memory (CAM) is a special type of computer memory used in certain very high speed searching applications. It is also known as associative memory, associative storage, or associative array, although the last term is more often used for a programming data structure.

In addition to the responsibility of managing processes, the operating system must efficiently manage the primary memory of the computer. The part of the operating system which handles this responsibility is called the memory manager. Since every process must have some amount of primary memory in order to execute, the performance of the memory manager is crucial to the performance of the entire system. Nutt explains: “The memory manager is responsible for allocating primary memory to processes and for assisting the programmer in loading and storing the contents of the primary memory. Managing the sharing of primary memory and minimizing memory access time are the basic goals of the memory manager.”


The real challenge of efficiently managing memory is seen in the case of a system which has multiple processes running at the same time. Since primary memory can be space-multiplexed, the memory manager can allocate a portion of primary memory to each process for its own use. However, the memory manager must keep track of which processes are running in which memory locations, and it must also determine how to allocate and de-allocate available memory when new processes are created and when old processes complete execution. While various strategies are used to allocate space to processes competing for memory, three of the most popular are Best fit, Worst fit, and First fit. Each of these strategies is described below:

Best fit: The allocator places a process in the smallest block of unallocated memory in which it will fit. For example, suppose a process requests 12KB of memory and the memory manager currently has a list of unallocated blocks of 6KB, 14KB, 19KB, 11KB, and 13KB blocks. The best-fit strategy will allocate 12KB of the 13KB block to the process.

Worst fit: The memory manager places a process in the largest block of unallocated memory available. The idea is that this placement will create the largest hole after the allocation, thus increasing the possibility that, compared to best fit, another process can use the remaining space. Using the same example as above, worst fit will allocate 12KB of the 19KB block to the process, leaving a 7KB block for future use.

First fit: There may be many holes in the memory, so the operating system, to reduce the amount of time it spends analyzing the available spaces, begins at the start of primary memory and allocates memory from the first hole it encounters large enough to satisfy the request. Using the same example as above, first fit will allocate 12KB of the 14KB block to the process.
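Using the example sizes above, the three placement strategies can be sketched directly in shell (bash is assumed for the array syntax; a real allocator tracks addresses and splits blocks, which this sketch deliberately omits):

```shell
#!/bin/bash
# Unallocated block sizes (KB) from the example above, and a 12KB request.
blocks=(6 14 19 11 13)
request=12

best=0; worst=0; first=0
for b in "${blocks[@]}"; do
  [ "$b" -lt "$request" ] && continue           # block too small: skip it
  [ "$first" -eq 0 ] && first=$b                # first fit: first block big enough
  if [ "$best" -eq 0 ] || [ "$b" -lt "$best" ]; then
    best=$b                                     # best fit: smallest block that fits
  fi
  [ "$b" -gt "$worst" ] && worst=$b             # worst fit: largest block overall
done

echo "best fit uses the ${best}KB block"    # best fit uses the 13KB block
echo "worst fit uses the ${worst}KB block"  # worst fit uses the 19KB block
echo "first fit uses the ${first}KB block"  # first fit uses the 14KB block
```

The three answers match the worked example in the text: 13KB, 19KB, and 14KB respectively.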


Notice in the diagram above that the Best fit and First fit strategies both leave a tiny segment of memory unallocated just beyond the new process. Since the amount of memory is small, it is not likely that any new processes can be loaded here. This condition of splitting primary memory into segments as the memory is allocated and deallocated is known as fragmentation. The Worst fit strategy attempts to reduce the problem of fragmentation by allocating the largest fragments to new processes. Thus, a larger amount of space will be left as seen in the diagram above.

Another way in which the memory manager enhances the ability of the operating system to support multiple processes running simultaneously is by the use of virtual memory. According to Nutt, “virtual memory strategies allow a process to use the CPU when only part of its address space is loaded in the primary memory. In this approach, each process’s address space is partitioned into parts that can be loaded into primary memory when they are needed and written back to secondary memory otherwise.” Another consequence of this approach is that the system can run programs which are actually larger than the primary memory of the system, hence the idea of “virtual memory.” Brookshear explains how this is accomplished:

“Suppose, for example, that a main memory of 64 megabytes is required but only 32 megabytes is actually available. To create the illusion of the larger memory space, the memory manager would divide the required space into units called pages and store the contents of these pages in mass storage. A typical page size is no more than four kilobytes. As different pages are actually required in main memory, the memory manager would exchange them for pages that are no longer required, and thus the other software units could execute as though there were actually 64 megabytes of main memory in the machine.”

In order for this system to work, the memory manager must keep track of all the pages that are currently loaded into the primary memory. This information is stored in a page table maintained by the memory manager. A page fault occurs whenever a process requests a page that is not currently loaded into primary memory. To handle page faults, the memory manager takes the following steps:

1. The memory manager locates the missing page in secondary memory.

2. The page is loaded into primary memory, usually causing another page to be unloaded.


3. The page table in the memory manager is adjusted to reflect the new state of the memory.

4. The processor re-executes the instructions which caused the page fault.
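These four steps can be mimicked with a toy simulator in plain shell (the three-frame memory, FIFO replacement policy, and page names are our own choices for the demo; real memory managers use far more sophisticated replacement policies):

```shell
#!/bin/bash
# Toy page-fault handler: a 3-frame primary memory with FIFO replacement.
frames=3
loaded=""        # pages currently in primary memory, oldest first
faults=0

access() {
  page=$1
  case " $loaded " in
    *" $page "*)                       # page already resident: no fault
      echo "page $page: hit" ;;
    *)
      faults=$((faults + 1))           # step 1: page located in secondary memory
      set -- $loaded
      if [ $# -ge "$frames" ]; then
        shift                          # step 2: loading it unloads the oldest page
      fi
      loaded="$* $page"                # step 3: page table updated
      echo "page $page: fault"         # step 4: the faulting instruction re-runs
      ;;
  esac
}

for p in A B C A D B; do access "$p"; done
echo "total faults: $faults"           # total faults: 4
```

The reference string A B C A D B faults four times: the three cold misses, plus D, whose arrival evicts A under FIFO.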

4. Explain the following with suitable examples:

1. Copying Files

2. Moving Files

3. Removing Files

4. Creating Multiple Directories

5. Removing a Directory

6. Renaming Directories

Ans: 4

Copying Files-

Copying files and directories is done with the cp command. A useful option is recursive copy (copy all underlying files and subdirectories), using the -R option to cp. The general syntax is

cp [-R] fromfile tofile

Consider, as an example, the case of user newguy, who wants the same Gnome desktop settings that user oldguy has. One way to solve the problem is to copy the settings of oldguy to the home directory of newguy:

victor:~> cp -R ../oldguy/.gnome/ .

This gives some errors involving file permissions, but all the errors have to do with private files that newguy doesn't need anyway. We will discuss in the next part how to change these permissions in case they really are a problem.
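The same recursive-copy behaviour can be reproduced end to end in a scratch session (all of the paths and file names below are invented for the demo):

```shell
# Build a small source tree, then copy it recursively with cp -R.
mkdir -p /tmp/oldguy/.gnome
echo "theme=dark" > /tmp/oldguy/.gnome/settings

mkdir -p /tmp/newguy
cp -R /tmp/oldguy/.gnome /tmp/newguy/   # copies the directory and everything below it

cat /tmp/newguy/.gnome/settings         # theme=dark
rm -r /tmp/oldguy /tmp/newguy           # clean up
```

Without -R, cp refuses to copy a directory at all, which is why the recursive option matters here.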

Moving Files-

The mv command (stands for move) allows you to rename a file in the same directory or move a file from one directory to another. If you move a file to a different directory, the file can be renamed or it can retain its original name. mv can also be used to move and rename directories.


% mv [<options>] <source1> [<source2> ...] <target>

Depending on whether the <source(s)> and <target> are files or directories, different actions are taken. These are described in the table below. If <target> is a filename, only one <source> file may be specified.

Source             Target                   Result
file               new file name            Rename file to the new name
file               existing file name       Overwrite existing file with source file contents; keep existing file name
directory          new directory name       Rename directory to the new name
directory          existing directory name  Move directory so that it becomes a subdirectory of the existing directory
one or more files  existing directory name  Move files to the existing directory

An important option is:

-i If <target> exists, the user is prompted for confirmation before overwriting.
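A short session exercising two rows of the table above (the file and directory names are invented):

```shell
cd "$(mktemp -d)"            # work in a fresh scratch directory
echo hello > notes.txt
mkdir archive

mv notes.txt report.txt      # file -> new file name: a rename
mv -i report.txt archive     # file -> existing directory: a move
                             # (-i would prompt first if it had to overwrite)
ls archive                   # report.txt
```

After the session, notes.txt no longer exists under either name outside archive; the single file has simply been renamed and then moved.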

Creating Multiple Directories-

The mkdir command (for make directory) is used to create a directory. The command format is:

% mkdir <dirname> ...

If a pathname is not specified, the directory is created as a subdirectory of the current working directory. Directory creation requires write access to the parent directory. The owner ID and group ID of the new directory are set to those of the creating process.

Examples:

create a subdirectory in the current working directory

% mkdir progs

create one with a full pathname (the directory Tools must already exist)

% mkdir /usr/nicholls/Tools/Less

The mkdir command can even create a directory and its subdirectories if we use its -p option:

$ mkdir -p journal/97


Removing a Directory-

To remove an empty directory, use rmdir. Suppose that we made a typing mistake while creating a directory and want to remove it so that we can create the right one. Enter these commands:

$ mkdir jornal

$ rmdir jornal

$ mkdir journal

The rmdir command removes only empty directories. If a directory still has files, we must remove them before using rmdir:

$ rmdir journal
rmdir: journal: Directory not empty
$ rm journal/*
$ rmdir journal

Actually, rm can remove directories if we use its -r option.
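A complete session contrasting the two commands (run in a scratch directory; the file names are invented):

```shell
cd "$(mktemp -d)"            # fresh scratch directory
mkdir journal
echo "entry" > journal/day1

# rmdir refuses a non-empty directory; '|| ...' captures that refusal.
rmdir journal || echo "rmdir refused: directory not empty"

rm -r journal                # rm -r removes the directory and its contents
[ -e journal ] || echo "journal is gone"   # journal is gone
```

Because rm -r deletes everything below the directory without further checks, rmdir's refusal to touch non-empty directories is a useful safety net.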

Renaming Directories:

To change the name of a file, use the following command format (where thirdfile and file3 are sample file names):

mv thirdfile file3

The result of this command is that there is no longer a file called thirdfile, but a new file called file3 contains what was previously in thirdfile.

Like cp, the mv command also overwrites existing files. For example, if you have two files, fourthfile and secondfile, and you type the command

mv fourthfile secondfile

mv will remove the original contents of secondfile and replace them with the contents of fourthfile. The effect is that fourthfile is renamed secondfile, but in the process secondfile is deleted.


5. Discuss the concept of File substitution with respect to managing data files in UNIX.

Ans:5

How File Substitution Works

It is important to understand how file substitution actually works. In the previous examples, the ls command doesn’t do the work of file substitution; the shell does. Even though all the previous examples employ the ls command, any command that accepts filenames on the command line can use file substitution. In fact, using the simple echo command is a good way to experiment with file substitution without having to worry about unexpected results. For example,

$ echo p*

p10 p101 p11

When a metacharacter is encountered in a UNIX command, the shell looks for patterns in filenames that match the metacharacter. When a match is found, the shell substitutes the actual filename in place of the string containing the metacharacter, so that the command sees only a list of valid filenames. If the shell finds no filenames that match the pattern, it passes the pattern string to the command unchanged.

The shell can expand more than one pattern on a single line. Therefore, the shell interprets the command

$ ls LINES.* PAGES.*

as

$ ls LINES.dat LINES.idx PAGES.dat PAGES.idx

There are file substitution situations that you should be wary of. You should be careful about the use of whitespace (extra blanks) in a command line. If you enter the following command, for example, the results might surprise you:

$ ls LINES. *

What has happened is that the shell interpreted the first parameter as the filename LINES. with no metacharacters and passed it directly on to ls. Next, the shell saw the single asterisk (*), and matched it to any character string, which matches every file in the directory. This is not a big problem if you are simply listing the files, but it could mean disaster if you were using the command to delete data files!

Unusual results can also occur if you use the period (.) in a shell command. Suppose that you are using the

$ ls .*

command to view the hidden files. What the shell would see after it finishes interpreting the metacharacter is

$ ls . .. .profile

which gives you a complete directory listing of both the current and parent directories.

When you think about how filename substitution works, you might assume that the default form of the ls command is actually

$ ls *

However, in this case the shell passes to ls the names of directories, which causes ls to list all the files in the subdirectories. The actual form of the default ls command is

$ ls .
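All of the cases above can be reproduced safely with echo in a scratch directory (the p* file names follow the earlier example; the behaviour shown for an unmatched pattern is that of the Bourne/bash family with default settings):

```shell
cd "$(mktemp -d)"            # fresh scratch directory
touch p10 p101 p11 PAGES.dat

echo p*        # the shell expands the pattern: p10 p101 p11
echo z*        # no match: the pattern is passed through literally: z*
echo .*        # the hidden entries, including . and ..
```

Because echo merely prints its arguments, it shows exactly what any other command, including rm, would have received after substitution.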

6. How do we make calculations using dc and bc utilities? Describe with at least two examples in each case.

Ans:6

Making Calculations with dc and bc

UNIX has two calculator programs that you can use from the command line: dc and bc. The dc (desk calculator) program uses Reverse Polish Notation (RPN), familiar to everyone who has used Hewlett-Packard pocket calculators, and the bc (basic calculator) program uses the more familiar algebraic notation. Both programs perform essentially the same calculations.

Calculating with bc

The basic calculator, bc, can do calculations to any precision that you specify. Therefore, if you know how to calculate pi and want to know its value to 20, 50, or 200 places, for example, use bc. This tool can add, subtract, multiply, divide, and raise a number to a power. It can take square roots, compute sines and cosines of angles, calculate exponentials and logarithms, and handle arctangents and Bessel functions. In addition, it contains a programming language whose syntax looks much like that of the C programming language. This means that you can use the following:

· Simple and array variables

· Expressions

· Tests and loops

· Functions that you define

Also, bc can take input from the keyboard, from a file, or from both.

Here are some examples of bc receiving input from the keyboard:

$ bc

2*3

6

To do multiplication, all you have to do is enter the two values with an asterisk between them. To exit from bc, just type Ctrl+d. However, you can also continue giving bc more calculations to do.

Here’s a simple square root calculation (as a continuation of the original bc command):

sqrt(11)

3

Oops! The default behavior of bc is to treat all numbers as integers. To get floating-point numbers (that is, numbers with decimal points in them), use the scale command. For example, the following input tells bc that you want it to set four decimal places and then try the square root example again:

scale=4

sqrt(11)

3.3166

In addition to setting the number of decimal places with scale, you can set the number of significant digits with length.

You need not always use base-10 for all your calculations, either. For example, suppose that you want to calculate the square root of the base-8 (octal) number, 11. First change the input base to 8 and then enter the same square root command as before to do the calculation:


ibase=8

sqrt(11)

3.0000

Ctrl+D

$

This result is correct because octal 11 is decimal 9 and the square root of 9 is 3 in both octal and decimal.

You can use a variable even without a program:

$ bc

x=5

10*x

50

Here’s a simple loop in bc’s C-like syntax:

y=1

while(y<5){

y^2

y=y+1

}

1

4

9

16

The first line sets y to the value 1. The next four lines establish a loop: the middle two lines repeat as long as the value of y is less than 5 (while(y<5)). Those two repeated lines cause bc to print the value of y-squared and then add one to the value of y. Note that bc doesn’t display the value of a variable when it’s on a line with an equals sign (or a while statement). Also, note the positions of the braces.


Here’s another, more compact kind of loop. It sets the initial value for y, tests the value of y, and adds one to the value of y, all on one line:

for (y = 1; y <= 5; y = y + 1){

3*y

}

3

6

9

12

15

Initially, y is set to 1. Then the loop tests whether the variable is less than or equal to 5. Because it is, bc performs the calculation 3*y and prints 3. Next, 1 is added to the present value of y, making it 2. That’s also less than 5, so bc performs the 3*y calculation, which results in 6 being printed. y is incremented to 3, which is then tested; because 3 is less than 5, 3*y is calculated again. At some point, bc increments y to 6, which is neither less than 5 nor equal to it, so that the loop terminates with no further calculation or display.

You can define and use new functions for the bc program. A bc function is a device that can take in one or more numbers and calculate a result. For example, the following function, s, adds three numbers:

define s(x,y,z){

return(x+y+z)

}

To use the s function, you enter a command such as the following:

s(5,9,22)

36

Each variable name and each function name must be a single lowercase letter. If you are using the math library, bc -l (discussed below), the letters a, c, e, j, l, and s are already used.

If you have many functions that you use fairly regularly, you can type them into a text file and start bc by entering bc myfile.bc (where myfile is the name of the text file). The bc program then knows those functions and you can invoke them without having to type their definitions again. If you use a file to provide input to bc, you can put comments in that file. When bc reads the file, it ignores anything that you type between /* and */.

If scale is 0, the bc program does modulus division (using the % symbol), which provides the remainder that results from the division of two integers, as in the following example:

scale=4

5/2

2.5000

5%2

0

scale=0

5/2

2

5%2

1

If scale is not 0, the numbers are treated as floating point even if they are typed as integers.

In addition to including C’s increment and decrement operators (++ and --), bc also provides some special assignment operators: +=, -=, *=, /=, and ^=.

The built-in math functions include the following:

Function   Returns
a(x)       The arc tangent of x
c(x)       The cosine of x
e(x)       e raised to the x power
j(n,x)     The Bessel function of n and x, where n is an integer and x is any real number
l(x)       The natural logarithm of x
s(x)       The sine of x

To use these math functions, you must invoke bc with the -l option, as follows:

$ bc -l


Calculating with dc

As mentioned earlier, the desk calculator, dc, uses RPN, so unless you’re comfortable with that notation, you should stick with bc. Also, dc does not provide a built-in programming language, built-in math functions, or the capability to define functions. It can, however, take its input from a file.

If you are familiar with stack-oriented calculators, you’ll find that dc is an excellent tool. It can do all the calculations that bc can and it also lets you manipulate the stack directly.

To display values, you must enter the p command. For example, to add and print the sum of 5 and 9, enter

5

9

+p

14

See your UNIX reference manual (different versions use different titles), or if you have them, view the on-line man pages for details on dc.


Set-2

1. Describe the following with respect to Deadlocks in Operating Systems:

a. Deadlock Avoidance

b. Deadlock Prevention

Ans:1

Deadlock:

In computer science, Coffman deadlock refers to a specific condition when two or more processes are each waiting for the other to release a resource, or more than two processes are waiting for resources in a circular chain (see the necessary conditions discussed later). Deadlock is a common problem in multiprocessing where many processes share a specific type of mutually exclusive resource known as a software lock or soft lock. Computers intended for the time-sharing and/or real-time markets are often equipped with a hardware lock (or hard lock) which guarantees exclusive access to processes, forcing serialized access. Deadlocks are particularly troubling because there is no general solution for avoiding (soft) deadlocks.

Deadlock avoidance

Deadlock can be avoided if certain information about processes is available in advance of resource allocation. For every resource request, the system checks whether granting the request would put the system into an unsafe state, meaning a state that could result in deadlock. The system then only grants requests that lead to safe states. In order for the system to figure out whether the next state will be safe or unsafe, it must know in advance, at any time, the number and type of all resources in existence, available, and requested. One well-known algorithm used for deadlock avoidance is the Banker's algorithm, which requires each process's maximum resource usage to be known in advance. However, for many systems it is impossible to know in advance what every process will request. This means that deadlock avoidance is often impossible.

Two other algorithms are Wait/Die and Wound/Wait, each of which uses a symmetry-breaking technique. In both these algorithms there exists an older process (O) and a younger process (Y). Process age can be determined by a timestamp at process creation time. Smaller time stamps are older processes, while larger timestamps represent younger processes.

                               Wait/Die    Wound/Wait
O needs a resource held by Y   O waits     Y dies
Y needs a resource held by O   Y dies      Y waits

It is important to note that an unsafe state does not necessarily lead to deadlock. The notion of safe/unsafe states only refers to the ability of the system to enter a deadlock state or not. For example, if a process requests A, which would result in an unsafe state, but releases B, which would prevent circular wait, then the state is unsafe but the system is not in deadlock.

Deadlock Prevention


The difference between deadlock avoidance and deadlock prevention is a little subtle. Deadlock avoidance refers to a strategy where whenever a resource is requested, it is only granted if it cannot result in deadlock. Deadlock prevention strategies involve changing the rules so that processes will not make requests that could result in deadlock.

Here is a simple example of such a strategy. Suppose every possible resource is numbered (easy enough in theory, but often hard in practice), and processes must make their requests in order; that is, they cannot request a resource with a number lower than any of the resources that they have been granted so far. Deadlock cannot occur in this situation.
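The ordered-request rule can be sketched in shell with flock(1) from util-linux; the lock-file names and the helper function below are illustrative, not part of the original discussion. Every process takes the locks in the same ascending order, so no circular wait can form:

```shell
# acquire two numbered resources, always lowest number first
acquire_both() {
    exec 8>/tmp/resource1.lock
    flock 8                     # resource 1: lower number, taken first
    exec 9>/tmp/resource2.lock
    flock 9                     # resource 2: taken second
    echo "both resources held"
    flock -u 9                  # release in reverse order
    flock -u 8
}
acquire_both
```

A second process running the same function blocks on resource 1 instead of grabbing resource 2 first, so the cross-holding pattern that causes deadlock never arises.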

As an example, consider the dining philosophers problem. Suppose each chopstick is numbered, and philosophers always have to pick up the lower-numbered chopstick before the higher-numbered chopstick. Philosopher 5 picks up chopstick 4, philosopher 4 picks up chopstick 3, philosopher 3 picks up chopstick 2, and philosopher 2 picks up chopstick 1. Philosopher 1 is hungry and, without this rule, would pick up chopstick 5, thus causing deadlock. However, because the lower-number rule is in effect, he/she has to pick up chopstick 1 first, and it is already in use, so he/she is blocked. Philosopher 5 picks up chopstick 5, eats, and puts both chopsticks down, allowing philosopher 4 to eat. Eventually everyone gets to eat.

An alternative strategy is to require all processes to request all of their resources at once, and either all are granted or none are granted. Like the above strategy, this is conceptually easy but often hard to implement in practice because it assumes that a process knows what resources it will need in advance.

2. Discuss the file structure? Explain the various access modes.

Ans:2

File Structure

Unix hides the “chunkiness” of tracks, sectors, etc. and presents each file as a “smooth” array of bytes with no internal structure. Application programs can, if they wish, use the bytes in the file to represent structures. For example, a widespread convention in Unix is to use the newline character (the character with bit pattern 00001010) to break text files into lines. Some other systems provide a variety of other types of files. The most common are files that consist of an array of fixed or variable size records and files that form an index mapping keys to values. Indexed files are usually implemented as B-trees.

File Types

Most systems divide files into various “types.” The concept of “type” is a confusing one, partially because the term “type” can mean different things in different contexts. Unix initially supported only four types of files: directories, two kinds of special files (discussed later), and “regular” files. Just about any type of file is considered a “regular” file by Unix. Within this category, however, it is useful to distinguish text files from binary files; within binary files there are executable files (which contain machine-language code) and data files; text files might be source files in a particular programming language (e.g. C or Java) or they may be human-readable text in some mark-up language such as html (hypertext markup language). Data files may be classified according to the program that created them or is able to interpret them, e.g., a file may be a Microsoft Word document or Excel spreadsheet or the output of TeX. The possibilities are endless.

In general (not just in Unix) there are three ways of indicating the type of a file:

1. The operating system may record the type of a file in meta-data stored separately from the file, but associated with it. Unix only provides enough meta-data to distinguish a regular file from a directory (or special file), but other systems support more types.

2. The type of a file may be indicated by part of its contents, such as a header made up of the first few bytes of the file. In Unix, files that store executable programs start with a two byte magic number that identifies them as executable and selects one of a variety of executable formats. In the original Unix executable format, called the a.out format, the magic number is the octal number 0407, which happens to be the machine code for a branch instruction on the PDP-11 computer, one of the first computers to implement Unix. The operating system could run a file by loading it into memory and jumping to the beginning of it. The 0407 code, interpreted as an instruction, jumps to the word following the 16-byte header, which is the beginning of the executable code in this format. The PDP-11 computer is extinct by now, but it lives on through the 0407 code!

3. The type of a file may be indicated by its name. Sometimes this is just a convention, and sometimes it's enforced by the OS or by certain programs. For example, the Unix Java compiler refuses to believe that a file contains Java source unless its name ends with .java.

Some systems enforce the types of files more vigorously than others. File types may be enforced:

· Not at all,

· Only by convention,

· By certain programs (e.g. the Java compiler), or

· By the operating system itself.

Unix tends to be very lax in enforcing types.

Access Modes [ Silberschatz, Galvin, and Gagne, Section 11.2 ]

Systems support various access modes for operations on a file.

Sequential. Read or write the next record or next n bytes of the file. Usually, sequential access also allows a rewind operation.

Random. Read or write the nth record or bytes i through j. Unix provides an equivalent facility by adding a seek operation to the sequential operations listed above. This packaging of operations allows random access but encourages sequential access.

Indexed. Read or write the record with a given key. In some cases, the “key” need not be unique--there can be more than one record with the same key. In this case, programs use a combination of indexed and sequential operations: Get the first record with a given key, then get other records with the same key by doing sequential reads.


Note that access modes are distinct from file structure--e.g., a record-structured file can be accessed either sequentially or randomly--but the two concepts are not entirely unrelated. For example, indexed access mode only makes sense for indexed files.

File Attributes

This is the area where there is the most variation among file systems. Attributes can be grouped by general category.

Name.

Ownership and Protection.

Owner, owner's “group,” creator, access-control list (information about who can do what to this file; for example, perhaps the owner can read or modify it, other members of his group can only read it, and others have no access).

Time stamps.

Time created, time last modified, time last accessed, time the attributes were last changed, etc. Unix maintains the last three of these. Some systems record not only when the file was last modified, but by whom.

Sizes.

Current size, size limit, “high-water mark”, space consumed (which may be larger than size because of internal fragmentation or smaller because of various compression techniques).

Type Information.

As described above: File is ASCII, is executable, is a “system” file, is an Excel spread sheet, etc.

Misc.

Some systems have attributes describing how the file should be displayed when a directory is listed. For example, MacOS records an icon to represent the file and the screen coordinates where it was last displayed. DOS has a “hidden” attribute meaning that the file is not normally shown. Unix achieves a similar effect by convention: the ls program that is usually used to list files does not show files with names that start with a period unless you explicitly request it to (with the -a option).

Unix records a fixed set of attributes in the meta-data associated with a file. If you want to record some fact about the file that is not included among the supported attributes, you have to use one of the tricks listed above for recording type information: encode it in the name of the file, put it into the body of the file itself, or store it in a file with a related name (e.g. “foo.attributes”). Other systems (notably MacOS and Windows NT) allow new attributes to be invented on the fly. In MacOS, each file has a resource fork, which is a list of (attribute-name, attribute-value) pairs. The attribute name can be any four-character string, and the attribute value can be anything at all. Indeed, some kinds of files put the entire “contents” of the file in an attribute and leave the “body” of the file (called the data fork) empty.

3. Discuss various conditions to be true for deadlock to occur.

Ans:3

There are many resources that can be allocated to only one process at a time, and we have seen several operating system features that allow this, such as mutexes, semaphores or file locks.

Sometimes a process has to reserve more than one resource. For example, a process which copies files from one tape to another generally requires two tape drives. A process which deals with databases may need to lock multiple records in a database.

A deadlock is a situation in which two computer programs sharing the same resource are effectively preventing each other from accessing the resource, resulting in both programs ceasing to function.

The earliest computer operating systems ran only one program at a time. All of the resources of the system were available to this one program. Later, operating systems ran multiple programs at once, interleaving them. Programs were required to specify in advance what resources they needed so that they could avoid conflicts with other programs running at the same time. Eventually some operating systems offered dynamic allocation of resources. Programs could request further allocations of resources after they had begun running. This led to the problem of the deadlock. Here is the simplest example:

Program 1 requests resource A and receives it.
Program 2 requests resource B and receives it.
Program 1 requests resource B and is queued up, pending the release of B.
Program 2 requests resource A and is queued up, pending the release of A.

Now neither program can proceed until the other program releases a resource. The operating system cannot know what action to take. At this point the only alternative is to abort (stop) one of the programs.

Learning to deal with deadlocks had a major impact on the development of operating systems and the structure of databases. Data was structured and the order of requests was constrained in order to avoid creating deadlocks.

In general, resources allocated to a process are not preemptable; this means that once a resource has been allocated to a process, there is no simple mechanism by which the system can take the resource back from the process unless the process voluntarily gives it up or the system administrator kills the process. This can lead to a situation called deadlock. A set of processes or threads is deadlocked when each process or thread is waiting for a resource to be freed which is controlled by another process. Here is an example of a situation where deadlock can occur.

Mutex M1, M2;


/* Thread 1 */
while (1) {
    NonCriticalSection();
    Mutex_lock(&M1);
    Mutex_lock(&M2);
    CriticalSection();
    Mutex_unlock(&M2);
    Mutex_unlock(&M1);
}

/* Thread 2 */
while (1) {
    NonCriticalSection();
    Mutex_lock(&M2);
    Mutex_lock(&M1);
    CriticalSection();
    Mutex_unlock(&M1);
    Mutex_unlock(&M2);
}

Suppose thread 1 is running and locks M1, but before it can lock M2, it is interrupted. Thread 2 starts running; it locks M2, when it tries to obtain and lock M1, it is blocked because M1 is already locked (by thread 1). Eventually thread 1 starts running again, and it tries to obtain and lock M2, but it is blocked because M2 is already locked by thread 2. Both threads are blocked; each is waiting for an event which will never occur.

Traffic gridlock is an everyday example of a deadlock situation.


In order for deadlock to occur, four conditions must be true.

Mutual exclusion – Each resource is either currently allocated to exactly one process or it is available. (Two processes cannot simultaneously control the same resource or be in their critical section.)

Hold and wait – Processes currently holding resources can request new resources.

No preemption – Once a process holds a resource, it cannot be taken away by another process or the kernel.

Circular wait – Each process is waiting to obtain a resource which is held by another process.

The dining philosophers problem discussed in an earlier section is a classic example of deadlock. Each philosopher picks up his or her left fork and waits for the right fork to become available, but it never does.

Deadlock can be modeled with a directed graph. In a deadlock graph, vertices represent either processes (circles) or resources (squares). A process which has acquired a resource is shown with an arrow (edge) from the resource to the process. A process which has requested a resource which has not yet been assigned to it is modeled with an arrow from the process to the resource. If these edges create a cycle, there is deadlock.

The deadlock situation in the above code can be modeled like this.


This graph shows an extremely simple deadlock situation, but it is also possible for a more complex situation to create deadlock. Here is an example of deadlock with four processes and four resources.

There are a number of ways that deadlock can occur in an operating system. We have seen some examples; here are two more.

Two processes need to lock two files, the first process locks one file the second process locks the other, and each waits for the other to free up the locked file.

Two processes want to write a file to a print spool area at the same time and both start writing. However, the print spool area is of fixed size, and it fills up before either process finishes writing its file, so both wait for more space to become available.


4. What are various File Systems supported in UNIX? Discuss any three of them.

Ans:4

File System Types

Initially, there were only two types of file systems: the ones from AT&T and Berkeley. Following are some file system types:

2.1 s5

Before SVR4, this was the only file system used by System V, but today it is offered by SVR4 by this name for backward compatibility only. This file system uses a logical block size of 512 or 1024 bytes and a single super block. It also can’t handle filenames longer than 14 characters.

2.2 ufs

This is how the Berkeley fast file system is known to SVR4, and it has been adopted by most UNIX systems. Because the block size here can go up to 64 KB, performance of this file system is considerably better than s5. It uses multiple superblocks, with each cylinder group storing a superblock. Unlike s5, ufs supports 255-character filenames, symbolic links and disk quotas.

2.3 Ext2

This is the standard file system of Linux. It uses a block size of 1024 bytes and, like ufs, uses multiple superblocks and symbolic links.

2.4 Iso9660 or hsfs

This is the standard file system used by CD-ROMs and uses DOS-style 8+3 filenames. Since UNIX uses longer filenames, hsfs also provides Rock Ridge extensions to accommodate them.

2.5 msdos or pcfs

Most UNIX systems also support DOS file systems. You can create this file system on a floppy diskette and transfer files to it for use on a Windows system. Linux and Solaris can also directly access a DOS file system on the hard disk.

2.6 swap

This file system type is used for swap partitions, which act as the backing store for virtual memory; it holds no user files.

2.7 bfs

The boot file system is used by SVR4 to host the boot programs and the UNIX kernel. Users are not meant to use this file system.


2.8 proc or procfs

This can be considered a pseudo-file system maintained in memory. It stores data of each running process and appears to contain files, but actually contains none. Users can obtain most process information, including their PIDs, directly from here.
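On Linux, for example, you can inspect the current process through the /proc/self entry (a standard procfs path):

```shell
# /proc/self always names the process doing the looking
head -3 /proc/self/status   # the first fields: process name, state, PIDs
ls /proc/self | head -5     # each entry is a pseudo-file, not stored on disk
```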

2.9 fdisk

Creating Partitions

Both Linux and SCO UNIX allow a user to have multiple operating systems on Intel machines. It’s no wonder then that both offer the Windows-type fdisk command to create, delete and activate partitions. fdisk in Linux, however, operates differently from Windows. Typing m at the fdisk prompt shows all its internal commands, of which the following subset should serve our purpose:

Command   Action
a         toggle a bootable flag
d         delete a partition
l         list known partition types
m         print this menu
n         add a new partition
p         print the partition table
q         quit without saving changes
w         write table to disk and exit

2.10 mkfs: creating file systems

Now that you have created a partition, you need to create a file system on this partition to make it usable. mkfs is used to build a Linux file system on a device, usually a hard disk partition. The exit code returned by mkfs is 0 on success and 1 on failure.

The file system-specific builder is searched for in a number of directories like perhaps /sbin, /sbin/fs, /sbin/fs.d, /etc/fs, /etc (the precise list is defined at compile time but at least contains /sbin and /sbin/fs), and finally in the directories listed in the PATH environment variable.

5. What do you mean by a Process? What are the various possible states of Process? Discuss.

Ans:5

A process under UNIX consists of an address space and a set of data structures in the kernel to keep track of that process. The address space is a section of memory that contains the code to execute as well as the process stack. The kernel must keep track of the following data for each process on the system:

the address space map,


the current status of the process,

the execution priority of the process,

the resource usage of the process,

the current signal mask,

the owner of the process.

A process has certain attributes that directly affect execution, these include:

PID - The PID stands for the process identification. This is a unique number that defines the process within the kernel.

PPID - This is the processes Parent PID, the creator of the process.

UID - The User ID number of the user that owns this process.

EUID - The effective User ID of the process.

GID - The Group ID of the user that owns this process.

EGID - The effective Group User ID that owns this process.

Priority - The priority that this process runs at.

To view a process you use the ps command.

# ps -l

F S UID PID PPID C PRI NI P SZ:RSS WCHAN TTY TIME COMD

30 S 0 11660 145 1 26 20 * 66:20 88249f10 ttyq 6 0:00 rlogind

The F field: This is the flag field. It uses hexadecimal values which are added to show the value of the flag bits for the process. For a normal user process this will be 30, meaning it is loaded into memory.

The S field: The S field is the state of the process, the two most common values are S for Sleeping and R for Running. An important value to look for is X, which means the process is waiting for memory to become available.

PID field: The PID shows the Process ID of each process. This value should be unique. Generally, PIDs are allocated lowest to highest, but wrap at some point. This value is necessary for you to send a signal to a process, such as the KILL signal.
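For instance, a sketch of signalling a process by its PID (sleep stands in here for any long-running process):

```shell
sleep 30 &                 # start a throwaway background process
pid=$!                     # the shell stores the new process's PID in $!
kill -TERM "$pid"          # signal 15: a polite termination request
wait "$pid" 2>/dev/null    # reap the terminated child
echo "terminated $pid"
```

kill -KILL (signal 9) is the forceful variant that cannot be caught or ignored.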


PRI field: This stands for the priority field. The lower the value, the higher the priority. This refers to the process NICE value. It will range from 0 to 39. The default is 20; as a process uses the CPU, the system will raise the nice value.

P flag: This is the processor flag. On the SGI this refers to the processor the process is running on.

SZ field: This refers to the SIZE field. This is the total number of pages in the process. Each page is 4096 bytes.

TTY field: This is the terminal assigned to your process.

Time field: The cumulative execution time of the process in minutes and seconds.

COMD field: The command that was executed.

The fork() System Call

The fork() system call is the basic way to create a new process. It is also a unique system call, since it returns twice(!) to the caller. This system call causes the current process to be split into two processes: a parent process and a child process. All of the memory pages used by the original process get duplicated during the fork() call, so both parent and child process see the exact same image. The only distinction is when the call returns. When it returns in the parent process, its return value is the process ID (PID) of the child process. When it returns inside the child process, its return value is 0. If for some reason this call failed (not enough memory, too many processes, etc.), no new process is created, and the return value of the call is -1. In case the process was created successfully, both child process and parent process continue from the same place in the code where the fork() call was used.

#include <stdio.h>    /* printf(), perror() */
#include <stdlib.h>   /* exit() */
#include <unistd.h>   /* defines fork(), and pid_t */
#include <sys/wait.h> /* defines the wait() system call */

int main(void)
{
    /* storage place for the pid of the child process, and its exit status */
    pid_t child_pid;
    int child_status;

    child_pid = fork(); /* let's fork off a child process... */

    switch (child_pid) { /* check what the fork() call actually did */
    case -1:             /* fork() failed */
        perror("fork");  /* print a system-defined error message */
        exit(1);
    case 0: /* fork() succeeded, we're inside the child process */
        printf("hello world\n");
        exit(0); /* here the CHILD process exits, not the parent */
    default: /* fork() succeeded, we're inside the parent process */
        wait(&child_status); /* wait till the child process exits */
    }

    /* parent's process code may continue here... */
    return 0;
}

6. Explain the working of file substitution in UNIX. Also describe the usage of pipes in UNIX Operating system.

Ans:6

File Substitution:

It is important to understand how file substitution actually works. In the previous examples, the ls command doesn’t do the work of file substitution – the shell does. Even though all the previous examples employ the ls command, any command that accepts filenames on the command line can use file substitution. In fact, using the simple echo command is a good way to experiment with file substitution without having to worry about unexpected results. For example,

$ echo p*

p10 p101 p11

When a metacharacter is encountered in a UNIX command, the shell looks for patterns in filenames that match the metacharacter. When a match is found, the shell substitutes the actual filename in place of the string containing the metacharacter, so that the command sees only a list of valid filenames. If the shell finds no filenames that match the pattern, most shells pass the pattern to the command unchanged.
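You can watch both behaviors with echo in a scratch directory (the directory name below is made up; the files match the earlier p* example):

```shell
mkdir -p /tmp/subst_demo && cd /tmp/subst_demo
touch p10 p101 p11
echo p*    # the shell expands the pattern before echo ever runs
echo z*    # nothing matches, so the pattern reaches echo unchanged
```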

The shell can expand more than one pattern on a single line. Therefore, the shell interprets the command

$ ls LINES.* PAGES.*

as

$ ls LINES.dat LINES.idx PAGES.dat PAGES.idx

There are file substitution situations that you should be wary of. You should be careful about the use of whitespace (extra blanks) in a command line. If you enter the following command, for example, the results might surprise you:
$ ls LINES. *


What has happened is that the shell interpreted the first parameter as the filename LINES. with no metacharacters and passed it directly on to ls. Next, the shell saw the single asterisk (*), and matched it to any character string, which matches every file in the directory. This is not a big problem if you are simply listing the files, but it could mean disaster if you were using the command to delete data files!

Unusual results can also occur if you use the period (.) in a shell command. Suppose that you are using the

$ ls .*

command to view the hidden files. What the shell would see after it finishes interpreting the metacharacter is

$ ls . .. .profile

which gives you a complete directory listing of both the current and parent directories.
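The dot-pattern behaviour, and one common workaround, can be sketched as follows. The workaround pattern is an idiom, not something the text prescribes, so treat it as an assumption worth checking in your own shell:

```shell
# Scratch directory with one hidden and one visible file.
cd "$(mktemp -d)"
touch .profile visible

# In many shells .* also matches the . and .. directory entries
# (recent bash versions skip them by default), so ls .* can list
# both the current and the parent directory.
echo .*

# A common safer idiom: match a dot followed by any non-dot character,
# which skips the . and .. entries.
hidden=$(echo .[!.]*)
echo "$hidden"   # .profile
```

Note that `.[!.]*` also misses unusual names beginning with two dots; it is a convenience, not a complete solution.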

When you think about how filename substitution works, you might assume that the default form of the ls command is actually

$ ls *

However, in this case the shell passes to ls the names of directories, which causes ls to list all the files in the subdirectories. The actual form of the default ls command is

$ ls .

Using Pipes to Pass Files Between Programs:

Suppose that you wanted a directory listing that was sorted by the mode (file type plus permissions). To accomplish this, you might redirect the output from ls to a data file and then sort that data file. For example,

$ ls -l >tempfile
$ sort tempfile


Although you get the result that you wanted, there are three drawbacks to this method:

· You might end up with a lot of temporary files in your directory. You would have to go back and remove them.

· The sort program doesn’t begin its work until the first command is complete. This isn’t too significant with the small amount of data used in this example, but it can make a considerable difference with larger files.

· The final output contains the name of your tempfile, which might not be what you had in mind.

Fortunately, there is a better way. The pipe symbol (|) causes the standard output of the program on the left side of the pipe to be passed directly to the standard input of the program on the right side of the pipe symbol. Therefore, to get the same results as before, you can use the pipe symbol. For example,

$ ls -l | sort

You have accomplished your purpose elegantly, without cluttering your disk. It is not readily apparent, but you have also worked more efficiently.


Both ls and sort are executing simultaneously, which means that sort can begin processing its input even before ls has finished producing its output. A program, such as sort, that reads standard input and writes standard output is sometimes called a filter.

The capability to string commands together in a pipeline, combined with the capability to redirect input and output, is part of what gives UNIX its great power. Instead of having one large, comprehensive program perform a task, several simpler programs can be strung together, giving the end user more control over the results. It is not uncommon in the UNIX environment to see something like this:
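A small pipeline of filters can be sketched as below; the generated input data is an assumption chosen so the result is easy to check:

```shell
# Each stage reads standard input and writes standard output:
# sort orders the lines, head keeps the first two, paste joins
# them onto one line. No temporary file is ever created.
result=$(printf '3\n1\n2\n' | sort | head -2 | paste -s -d ' ' -)
echo "$result"   # 1 2
```

All stages run concurrently; the shell connects each stage's standard output to the next stage's standard input through a pipe.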

$ cmd1 <infile | cmd2 -options | cmd3 | cmd4 -options >outfile