1 Distributed Systems Distributed Web-Based Systems Chapter 12.
Distributed Systems
Uploaded by george-foster
Transcript of Distributed Systems
![Page 1: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/1.jpg)
Lecture 14 – Operating System Architecture and Performance
![Page 2: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/2.jpg)
Part 1 – Operating System Support: Clearly the ability of an operating system to adequately provide the support for both local and remote interprocess communication is the most important characteristic of an operating system in determining its suitability for use in a distributed system. We will begin today’s lecture by looking in detail at the issues related to providing this support.
![Page 3: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/3.jpg)
OS Features of Concern:
• What primitives does the OS provide to facilitate remote interprocess communication?
• Which standard communications protocols are supported by the OS to do this?
more …
![Page 4: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/4.jpg)
OS Features of Concern:
• Is the implementation open?
i.e. are the key interfaces well published and widely available?
• What has been done in order to ensure that the communication operations are performed efficiently?
more …
![Page 5: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/5.jpg)
OS Features of Concern:
• What support exists to account for use in networks with high latency?
• What support exists to deal with disconnections from the network?
![Page 6: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/6.jpg)
![Page 7: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/7.jpg)
How primitive are the primitives?
• Does the OS provide only basic functions like “getRequest” and “sendReply”? …
OR
• Are more sophisticated functions such as the “doOperation” provided?
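The difference between the two levels of primitive can be sketched in a few lines. This is a minimal illustration, assuming UDP on the loopback interface; the function names mirror the slide (getRequest, sendReply, doOperation), while `serve_once`, `demo`, and the upper-casing "service" are hypothetical stand-ins invented for the example.

```python
import socket
import threading

def get_request(sock):
    """Server-side primitive: block until one request datagram arrives."""
    data, client_addr = sock.recvfrom(4096)
    return data, client_addr

def send_reply(sock, reply, client_addr):
    """Server-side primitive: send one reply datagram back to the caller."""
    sock.sendto(reply, client_addr)

def do_operation(server_addr, request, timeout=2.0):
    """Higher-level client call: send a request, then wait for its reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(request, server_addr)
        reply, _ = sock.recvfrom(4096)
        return reply

def serve_once(sock):
    """Hypothetical service: handle a single request by upper-casing it."""
    request, client_addr = get_request(sock)
    send_reply(sock, request.upper(), client_addr)

def demo():
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    worker = threading.Thread(target=serve_once, args=(server,))
    worker.start()
    reply = do_operation(server.getsockname(), b"ping")
    worker.join()
    server.close()
    return reply
```

Here `do_operation` lives at user level and is built from lower-level send and receive calls, which is the arrangement the question contrasts with a kernel that offers such an operation directly.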
![Page 8: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/8.jpg)
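[Figure: two arrangements compared in terms of efficiency. In one, the kernel provides sendRequest and getReply, and doOperation is implemented at user level. In the other, the kernel provides doOperation directly.]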
![Page 9: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/9.jpg)
![Page 10: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/10.jpg)
Theory and Practice:
• There are advantages to embedding higher level functionality in the kernel
• In practice, middleware usually provides this functionality instead
• TCP and UDP are traditionally provided by the OS and used by middleware
![Page 11: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/11.jpg)
The Research Continues …
Portability and Interoperability
vs.
Efficiency
![Page 12: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/12.jpg)
Part 2 – Openness and Standardization: Early in our study of distributed systems, we saw the importance of openness. On the other hand, an open system that is not widely used may be problematic. Standardization is therefore also important.
![Page 13: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/13.jpg)
Necessity for Internet connectivity has become a given …
• Requirements for UDP and TCP support abound
![Page 14: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/14.jpg)
Sometimes novel protocols are required for special hardware (e.g. wireless) …
• Layered approach helps with this difficulty (i.e. layering allows alternative choices to be made at the lower layers)
Although TCP is a standard choice, it isn’t terribly effective for wireless communications
![Page 15: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/15.jpg)
Part 3 – Invocation Performance: Given the importance of both local and in particular remote invocation mechanisms, we will be concerned with the costs incurred by the operating system when providing these capabilities. We will see that the network is not the only source of performance problems.
![Page 16: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/16.jpg)
![Page 17: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/17.jpg)
What is a remote invocation? …
• Any invocation that crosses an address space boundary
• It may or may not also cross a machine boundary (i.e. the network)
![Page 18: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/18.jpg)
How is crossing address space like crossing network space?
Arguments need to be copied from space to space
• Not unlike marshalling/unmarshalling
• May rely on similar or the same mechanisms
![Page 19: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/19.jpg)
Remote vs. Local
• local invocations only require pointers to arguments - no address space is crossed
• remote invocations are more complex and require copying all the bytes representing all the structures involved across the address spaces involved
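The contrast can be made concrete with a small sketch. It assumes Python's `pickle` as a stand-in for marshalling; `append_item`, `local_invoke`, and `remote_style_invoke` are hypothetical names, and no real address space is crossed here, only the copying that such a crossing would require.

```python
import pickle

def append_item(items):
    """The procedure being invoked: mutates and returns its argument."""
    items.append("x")
    return items

def local_invoke(func, arg):
    """Local call: the callee receives a reference to the caller's object."""
    return func(arg)

def remote_style_invoke(func, arg):
    """Remote-style call: the argument is copied byte by byte in both directions."""
    wire_bytes = pickle.dumps(arg)           # marshal in the caller's space
    server_arg = pickle.loads(wire_bytes)    # unmarshal in the callee's space
    result = func(server_arg)
    reply_bytes = pickle.dumps(result)       # marshal the result
    return pickle.loads(reply_bytes), len(wire_bytes)
```

With the local call, the caller's list is changed in place; with the remote-style call, the caller's list is untouched and the result arrives as a separate copy, at the cost of serializing every byte of the structure.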
![Page 20: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/20.jpg)
Network capabilities have improved substantially over the years …
… but invocation times have not kept up.
![Page 21: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/21.jpg)
Invocation Costs
• crossing address space
• crossing network space
• marshalling/unmarshalling
• data copying
• thread scheduling
• context switching
![Page 22: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/22.jpg)
Part 4 – Measuring Performance: To provide a fair basis for comparison when determining the penalties paid for remote procedure calls or remote method invocations as opposed to their local counterparts, we may use a “null RPC” or a “null RMI.” We will now explain what these are and discuss the typical observations made when they are used to measure performance.
![Page 23: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/23.jpg)
Features of null RPC/null RMI:
• Executes a null procedure
• Passes no arguments from the caller
• Returns no results to the caller
Allows measurement of delay introduced by OS/network
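A rough measurement harness in this spirit might look as follows. It is only a sketch under stated assumptions: the "remote" side is a thread reached through a `socket.socketpair()`, so no address space or network is actually crossed, and the timing reflects only the kernel socket path, thread scheduling, and context switching. All function names are hypothetical.

```python
import socket
import threading
import time

def null_procedure():
    """Takes no arguments, does nothing, returns nothing."""
    return None

def time_local(n):
    """Average seconds per local null call."""
    start = time.perf_counter()
    for _ in range(n):
        null_procedure()
    return (time.perf_counter() - start) / n

def _null_server(conn, n):
    """Serve n null requests: read one byte, run the procedure, send one byte."""
    for _ in range(n):
        conn.recv(1)
        null_procedure()
        conn.sendall(b"\0")

def time_cross_thread(n):
    """Average seconds per null call routed through a kernel socket pair."""
    client, server = socket.socketpair()
    worker = threading.Thread(target=_null_server, args=(server, n))
    worker.start()
    start = time.perf_counter()
    for _ in range(n):
        client.sendall(b"\0")   # request carries no arguments
        client.recv(1)          # reply carries no results
    elapsed = time.perf_counter() - start
    worker.join()
    client.close()
    server.close()
    return elapsed / n
```

Comparing `time_local(100000)` with `time_cross_thread(1000)` on a given machine gives a feel for how much of the delay comes from the OS rather than from the procedure itself; the absolute numbers will vary from system to system.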
![Page 24: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/24.jpg)
Network I/O for a typical null RPC/RMI is minimal … only about 100 bytes.
At 100 Mbps, this amounts to roughly 0.01 milliseconds (800 bits / 100 Mbps = 0.008 ms)
The time required for a typical null RPC/RMI is, however, on the order of 0.1 milliseconds
That’s about 10X longer than the network calculation suggests!
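The arithmetic behind these figures can be checked directly. The sketch below uses only the numbers quoted on the slide (100 bytes, 100 Mbps, 0.1 ms); the function and constant names are invented for illustration.

```python
def wire_time_ms(payload_bytes, link_bits_per_second):
    """Time to push the payload onto the link, in milliseconds."""
    return payload_bytes * 8 / link_bits_per_second * 1000.0

NULL_RPC_BYTES = 100   # approximate network I/O for a null RPC/RMI
LINK_BPS = 100e6       # 100 Mbps
OBSERVED_MS = 0.1      # order-of-magnitude null RPC/RMI time

wire_ms = wire_time_ms(NULL_RPC_BYTES, LINK_BPS)   # about 0.008 ms
overhead_factor = OBSERVED_MS / wire_ms            # about 12.5
```

The exact transmission time is about 0.008 ms, which the slide rounds to 0.01 ms; the resulting ratio of roughly 12.5 is consistent with the "10X" order-of-magnitude claim.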
![Page 25: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/25.jpg)
Conclusion:
Clearly there is much delay introduced by the operating system in RMI/RPC.
![Page 26: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/26.jpg)
Part 5 – Improving Performance: There are a number of things that the operating system can do to improve the performance of RPC/RMI. We have already mentioned that one such strategy is to embed higher-level functions in the OS; however, we noted that this often comes at the expense of portability and interoperability. We will now look at some other alternatives to improve performance.
![Page 27: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/27.jpg)
• Memory sharing
Rapid “copying” of arguments/results can improve the delay. This extends down the protocol stack from layer to layer.
• Choice of protocols
TCP/UDP … overhead with TCP isn’t always significant. How the OS buffers TCP data can be more significant. If the policy of the OS is to wait for more data before sending, this could be a hindrance.
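One widely used instance of a "wait for more data before sending" policy is Nagle's algorithm, which many TCP stacks enable by default. The slide does not name it, so treating it as the policy in question is an assumption. A minimal sketch of opting out through the standard socket option `TCP_NODELAY` follows; `make_low_latency_socket` is a hypothetical helper name.

```python
import socket

def make_low_latency_socket():
    """Create a TCP socket that sends small writes immediately."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY=1 disables Nagle's algorithm on this socket, so small
    # request/reply messages are not held back to be coalesced.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock
```

For request/reply traffic such as RPC, where each message is small and the caller is blocked waiting, this trade of bandwidth efficiency for latency is often worthwhile, although whether it helps in a given system needs to be measured.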
![Page 28: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/28.jpg)
• Recognition of LRPC
If RPC is on a single machine, improvements in performance can be made by recognizing this (hence the name “lightweight RPC”) and treating it differently to take advantage of the fact that it is on a single machine.
In one implementation, for example, a stack was used to transfer parameters from client to server.
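The idea of a shared argument stack can be sketched with a shared memory region. This is an illustration only, assuming Python 3.8+ and `multiprocessing.shared_memory`; the layout (`ARG_FORMAT`), the function names, and the choice of an "add" procedure are all hypothetical, and the server routine is called in the same process here rather than in a separate protection domain.

```python
import struct
from multiprocessing import shared_memory

ARG_FORMAT = "<ii"  # hypothetical layout: two 32-bit integers

def client_push_args(stack, a, b):
    """Client writes its arguments directly into the shared region."""
    struct.pack_into(ARG_FORMAT, stack.buf, 0, a, b)

def server_add(stack_name):
    """Server attaches to the region by name and works on the arguments in place."""
    stack = shared_memory.SharedMemory(name=stack_name)
    try:
        a, b = struct.unpack_from(ARG_FORMAT, stack.buf, 0)
        struct.pack_into("<i", stack.buf, 0, a + b)  # result overwrites the args
    finally:
        stack.close()

def lrpc_add(a, b):
    """One lightweight call: no marshalling into messages, no network stack."""
    stack = shared_memory.SharedMemory(create=True, size=64)
    try:
        client_push_args(stack, a, b)
        server_add(stack.name)  # a real system would switch to the server here
        (result,) = struct.unpack_from("<i", stack.buf, 0)
        return result
    finally:
        stack.close()
        stack.unlink()
```

Because both sides read and write the same memory, the arguments are written once by the client and never copied through the kernel or a protocol stack, which is the saving a single-machine RPC path can exploit.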
![Page 29: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/29.jpg)
Part 6 – Network Latency: Despite the preceding material which implied that the operating system was to blame for all performance problems with distributed systems, one must admit that network latencies can often be very high in applications running across the internet. Additionally, such applications may also suffer outright network disconnections for extended periods. This can be viewed as a period of extremely high latency and treated as such. We will now discuss how this can be dealt with and the role that the operating system might play in this.
![Page 30: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/30.jpg)
Asynchronous and Concurrent Invocations:
As dictated by the operating system, it may be possible to employ the following strategies:
• Permit Asynchronous Invocation (i.e. do not block when performing I/O)
• Permit Concurrent Invocations (i.e. allow many operations to take place in parallel)
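A small sketch of both strategies, assuming Python's `asyncio` and using `asyncio.sleep` as a stand-in for network latency. The names `remote_invoke`, `invoke_concurrently`, and `run_demo` are hypothetical.

```python
import asyncio
import time

async def remote_invoke(name, latency_s):
    """Stand-in for a remote call: the caller is not blocked while waiting."""
    await asyncio.sleep(latency_s)  # simulated network round trip
    return f"{name}: done"

async def invoke_concurrently(names, latency_s):
    """Issue all invocations at once and collect replies in request order."""
    return await asyncio.gather(*(remote_invoke(n, latency_s) for n in names))

def run_demo(latency_s=0.1):
    start = time.perf_counter()
    results = asyncio.run(invoke_concurrently(["a", "b", "c"], latency_s))
    return results, time.perf_counter() - start
```

Three invocations issued one after another would take at least three round trips; issued concurrently, their latencies overlap and the total is close to a single round trip.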
![Page 31: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/31.jpg)
Recall:
• Even with one CPU this is advantageous
• This is called pipelining:
Pipelining: The simultaneous execution of many independent subtasks by a single autonomous unit.
![Page 32: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/32.jpg)
Persistent Invocations:
Again, the availability of this strategy may be dictated by the operating system.
• Persistent invocation does not give up
• The caller may eventually choose to cancel it
• Such a strategy may be appropriate for something like a PDA which might go out of range for a few minutes at a time
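The behaviour described in these bullets can be sketched as a retry loop that stops only on success or on explicit cancellation. This is a minimal illustration; `persistent_invoke`, `make_flaky_operation`, and the use of `ConnectionError` to model a disconnection are assumptions made for the example.

```python
import threading

def persistent_invoke(operation, cancel_event, retry_delay_s=0.01):
    """Keep retrying until the operation succeeds or the caller cancels."""
    while not cancel_event.is_set():
        try:
            return operation()
        except ConnectionError:
            # Treat a disconnection as very high latency: wait, then retry.
            cancel_event.wait(retry_delay_s)
    raise RuntimeError("invocation cancelled by caller")

def make_flaky_operation(failures_before_success):
    """Hypothetical operation that is 'out of range' for its first few attempts."""
    state = {"remaining": failures_before_success}

    def operation():
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise ConnectionError("out of range")
        return "reply"

    return operation
```

Calling `persistent_invoke(make_flaky_operation(3), threading.Event())` returns the reply after three failed attempts; setting the event from another thread is how the caller would give up.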
![Page 33: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/33.jpg)
Part 7 – Operating System Architecture: There are two primary types of operating system kernels typically available. These are monolithic kernels and microkernels. We will look at the features of each type, and will compare and contrast each. There are also hybrid versions of kernels possible.
![Page 34: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/34.jpg)
Openness:
• For efficient use of resources (memory, disk, CPU), only what is required should be included. This is of particular importance with small systems such as PDAs.
• Any hardware or software component should be able to be altered without requiring changes throughout
• Alternative components should be permissible to meet user preferences
• We should be able to add services without compromising existing ones
![Page 35: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/35.jpg)
Kernel types:
- the kernel type determines what level of functionality is incorporated in the kernel and what level of functionality remains in user space
• Monolithic
- large, massive
- non-modular
- difficult to adapt
![Page 36: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/36.jpg)
Kernel types (cont’d):
• Microkernel
- sleek and streamlined
- provides only low level basic functionality
- layers may be built on top to provide portability or user processes can access low level functionality directly to improve performance.*
* This will not rival the performance of higher level functionality embedded in a monolithic kernel.
![Page 37: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/37.jpg)
Comparison of Kernels:
Advantages of microkernels:
- can enforce modularity
- small size suggests less likelihood of bugs
Advantage of monolithic kernels:
- efficient operations
![Page 38: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/38.jpg)
Hybrid Kernels:
Performance problems are the biggest disadvantage of microkernels. To deal with this, attempts have been made to provide loadable kernel modules, which are loaded into the kernel’s address space to provide higher-level functionality without requiring address spaces to be crossed.
The research continues …
![Page 39: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/39.jpg)
Part 8 – Applications: We will now discuss a number of questions based upon applying the material from this lecture.
![Page 40: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/40.jpg)
Discuss encapsulation, concurrency, protection, name resolution, parameter passing, and scheduling in the context of the UNIX file service running on a single computer.
![Page 41: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/41.jpg)
Encapsulation:
A process may only access file data and attributes through the system call interface
Concurrency:
Several processes may access the same or different files concurrently. Locks may be placed on files by processes.
![Page 42: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/42.jpg)
Protection:
Users set access permissions using the familiar ugo/rwx format. Processes are associated with particular users and groups.
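A simplified sketch of the ugo/rwx check is given below. It assumes the conventional rule that exactly one class (user, then group, then other) is selected for a given process, and it ignores the superuser and any access control lists; `may_access` and its parameters are hypothetical names.

```python
import stat

def may_access(mode, file_uid, file_gid, uid, gids, want):
    """Return True if a process (uid, gids) may 'r', 'w' or 'x' the file."""
    if uid == file_uid:
        shift = 6          # user (owner) bits
    elif file_gid in gids:
        shift = 3          # group bits
    else:
        shift = 0          # other bits
    bit = {"r": 4, "w": 2, "x": 1}[want]
    return bool((mode >> shift) & bit)

# A regular file with mode 640 renders in the familiar format as -rw-r-----
EXAMPLE_MODE = 0o640
EXAMPLE_STRING = stat.filemode(stat.S_IFREG | EXAMPLE_MODE)
```

With mode 640, the owner may read and write, members of the file's group may only read, and everyone else has no access.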
Name Resolution:
Pathnames are resolved by looking up each component in the appropriate directory until the actual filename is reached.
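Component-by-component resolution can be sketched with nested dictionaries standing in for directories. This is an illustration only: the in-memory tree, the `resolve` function, and the example paths are invented, and symbolic links, mount points, and permission checks are left out.

```python
def resolve(root, pathname):
    """Walk the directory tree one pathname component at a time."""
    node = root
    for component in pathname.strip("/").split("/"):
        if component == "":
            continue  # tolerate "/" and repeated slashes
        if not isinstance(node, dict) or component not in node:
            raise FileNotFoundError(pathname)
        node = node[component]  # look the component up in the current directory
    return node

# Hypothetical file system: directories are dicts, files are strings.
EXAMPLE_ROOT = {"usr": {"bin": {"ls": "<ls binary>"}}, "etc": {"passwd": "<users>"}}
```

Resolving `/usr/bin/ls` looks up `usr` in the root directory, then `bin` in `usr`, and finally `ls` in `bin`, which is the final component naming the file itself.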
![Page 43: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/43.jpg)
Parameter Passing:
May be done by passing them in machine registers during a system call or by copying them between address spaces
Scheduling:
There are no separate file system threads. All file activity executes in the kernel.
![Page 44: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/44.jpg)
Why are some system interfaces implemented by dedicated system calls to the kernel while others are built on top of message-based system calls?
![Page 45: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/45.jpg)
Dedicated system calls are more efficient than message-based calls. However, there is an advantage to implementing a system call as an RPC: it makes operations on local and remote resources look the same to the caller.
![Page 46: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/46.jpg)
What are the advantages to using copy-on-write for UNIX where a call to fork is often followed with a call to exec? What should occur in the event that the region that has been copied is itself copied?
![Page 47: Distributed Systems](https://reader033.fdocuments.in/reader033/viewer/2022051516/56813053550346895d960209/html5/thumbnails/47.jpg)
It would be wasteful to copy the address space of the forked process, since it is almost immediately replaced by the call to exec. With copy-on-write, only the few pages that are modified before the exec need to be copied.
If exec is not called and the forked process forks again, there are then three address spaces that depend on the same pages: those of the parent, the child, and the grandchild. This arrangement of page dependencies complicates the copy-on-write policy.
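The semantics that copy-on-write has to preserve can be observed from user level. The sketch below assumes a POSIX system where `os.fork` is available; `fork_and_modify` is a hypothetical name, and the example shows only the visible behaviour (the child's write does not affect the parent), not the kernel's page-level bookkeeping.

```python
import os

def fork_and_modify():
    """Fork, let the child overwrite a buffer, and report both views of it."""
    data = bytearray(b"parent")
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: this write lands on the child's private copy of the page.
        os.close(read_fd)
        data[:] = b"child!"
        os.write(write_fd, bytes(data))
        os._exit(0)
    # Parent: its copy of the buffer is unaffected by the child's write.
    os.close(write_fd)
    child_view = os.read(read_fd, 16)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return bytes(data), child_view
```

After the fork, both processes initially share the same pages; only when the child writes does the kernel need to give it a private copy, so a child that calls exec straight away causes almost no copying.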