Distributed System


Page 1: Distributed System

1. What is a Distributed System?

A distributed system is a collection of autonomous computers linked by a computer network that appears to the users of the system as a single computer.

Inter-process communication

In computing, Inter-process communication (IPC) is a set of methods for the exchange of data among multiple threads in one or more processes. Processes may be running on one or more computers connected by a network. IPC methods are divided into methods for message passing, synchronization, shared memory, and remote procedure calls (RPC). The method of IPC used may vary based on the bandwidth and latency of communication between the threads, and the type of data being communicated.
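As a small illustration of the message-passing style of IPC (a sketch only; the queue setup and the squaring worker are assumptions for the example), two processes exchange messages without sharing memory:

```python
# Minimal sketch of message-passing IPC between two processes, using
# Python's standard multiprocessing module (names and data are illustrative).
from multiprocessing import Process, Queue

def worker(inbox: Queue, outbox: Queue):
    # Receive one request message, compute, and send a reply back.
    request = inbox.get()                  # blocks until a message arrives
    outbox.put({"squared": request["value"] ** 2})

if __name__ == "__main__":
    to_worker, from_worker = Queue(), Queue()
    p = Process(target=worker, args=(to_worker, from_worker))
    p.start()
    to_worker.put({"value": 7})            # message passing, no shared memory
    print(from_worker.get())               # -> {'squared': 49}
    p.join()
```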

There are several reasons for providing an environment that allows process cooperation:

Information sharing

Speedup

Modularity

Convenience

Privilege separation

IPC may also be referred to as inter-thread communication and inter-application communication.

The combination of IPC with the address space concept is the foundation for address space independence/isolation.

Marshalling

In a distributed system, different modules can use different representations for the same data. To exchange such data between modules, it is necessary to reformat the data. This operation (called marshalling) takes computer time and is sometimes the most expensive part of network communication.
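As a small illustration (a sketch only; the record layout is an assumption), marshalling a record into a fixed network byte order lets machines with different native representations exchange it safely:

```python
# Sketch: marshalling a record into a fixed big-endian ("network order") byte
# layout so that machines with different native representations agree on it.
# The field layout (!I10sd = uint32, 10-byte name, float64) is an assumption.
import struct

WIRE_FORMAT = "!I10sd"        # '!' selects network byte order, no padding

def marshal(user_id: int, name: str, balance: float) -> bytes:
    return struct.pack(WIRE_FORMAT, user_id, name.encode("utf-8")[:10], balance)

def unmarshal(data: bytes):
    user_id, raw_name, balance = struct.unpack(WIRE_FORMAT, data)
    return user_id, raw_name.rstrip(b"\0").decode("utf-8"), balance

wire = marshal(42, "alice", 99.5)          # 22 bytes, identical on every machine
print(unmarshal(wire))                     # -> (42, 'alice', 99.5)
```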

Remote procedure call

In computer science, a remote procedure call (RPC) is an inter-process communication that allows a computer program to cause a subroutine or procedure to execute in another address space (commonly on another computer on a shared network) without the programmer explicitly coding the details for this remote interaction. That is, the programmer writes essentially the same code whether the subroutine is local to the executing program, or remote. When the software in question uses object-oriented principles, RPC is called remote invocation or remote method invocation.
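A minimal sketch using Python's standard xmlrpc library shows how the remote call reads like a local call; the add procedure, host, and port are assumptions chosen only for illustration:

```python
# Minimal RPC sketch with Python's standard xmlrpc library (illustrative).
# The server is started in a background thread only so the example is
# self-contained; normally server and client are separate programs.
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer

def add(a, b):                        # the remotely callable procedure
    return a + b

server = SimpleXMLRPCServer(("localhost", 8000), allow_none=True, logRequests=False)
server.register_function(add, "add")
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: the remote call reads like a local function call; the client
# stub marshals the arguments and unmarshals the result behind the scenes.
proxy = xmlrpc.client.ServerProxy("http://localhost:8000/")
print(proxy.add(2, 3))                # -> 5, executed in the server's address space
```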

Page 2: Distributed System

Client-Server Communication Model

The client–server model is a computing model that acts as a distributed application which partitions tasks or workloads between the providers of a resource or service, called servers, and service requesters, called clients.[1] Often clients and servers communicate over a computer network on separate hardware, but both client and server may reside in the same system. A server machine is a host that is running one or more server programs which share their resources with clients. A client does not share any of its resources, but requests a server's content or service function. Clients therefore initiate communication sessions with servers which await incoming requests.

• Structure: group of servers offering service to clients

• Based on a request/response paradigm

• Techniques (a minimal socket sketch follows this list):

– Socket, remote procedure calls (RPC), Remote Method Invocation (RMI)
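A minimal request/response sketch over a plain TCP socket (standard library only; the port and the echo protocol are assumptions for illustration):

```python
# Sketch of the request/response paradigm: the server awaits incoming
# requests, the client initiates the session and reads the response.
import socket
import threading

srv = socket.create_server(("localhost", 9000))   # bind + listen immediately

def handle_one_request():
    conn, _addr = srv.accept()                    # server awaits a request
    with conn:
        request = conn.recv(1024)                 # read the client's request
        conn.sendall(b"echo: " + request)         # send back the response

threading.Thread(target=handle_one_request, daemon=True).start()

with socket.create_connection(("localhost", 9000)) as sock:
    sock.sendall(b"hello")                        # client initiates the session
    print(sock.recv(1024))                        # -> b'echo: hello'
srv.close()
```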

Issues in Client-Server Communication

• Addressing

• Blocking versus non-blocking

• Buffered versus unbuffered

• Reliable versus unreliable

• Server architecture: concurrent versus sequential

• Scalability

Page 3: Distributed System

Stub (distributed computing)

• The stubs in RPC are responsible for packing and unpacking the call parameters, and the call results

- this is called marshalling/unmarshalling

• Stubs must allow for the fact that client and server may be machines of different types

- for example, integers may be represented differently (byte-ordering)

A stub in distributed computing is a piece of code used for converting parameters passed during a Remote Procedure Call (RPC).

The main idea of an RPC is to allow a local computer (client) to remotely call procedures on a remote computer (server). The client and server use different address spaces, so conversion of the parameters used in a function call has to be performed; otherwise the values of those parameters could not be used, because pointers into one machine's memory would point to different data on the other machine. The client and server may also use different data representations even for simple parameters (e.g., big-endian versus little-endian for integers). Stubs perform this conversion of the parameters, so that a remote function call looks like a local function call.

Stub libraries must be installed on client and server side. A client stub is responsible for conversion of parameters used in a function call and deconversion of results passed from the server after execution of the function. A server skeleton, the stub on server side, is responsible for deconversion of parameters passed by the client and conversion of the results after the execution of the function.

Stubs can be generated in one of two ways:

1. Manually: In this method, the RPC implementer provides a set of translation functions from which a user can construct his or her own stubs. This method is simple to implement and can handle very complex parameter types.

2. Automatically: This is the more commonly used method for stub generation. It uses an interface definition language (IDL) to define the interface between client and server. For example, an interface definition has information to indicate whether each argument is input, output, or both -- only input arguments need to be copied from client to server, and only output arguments need to be copied from server to client.
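As a rough illustration of what a hand-written stub pair does (a sketch under assumed names and a JSON wire encoding, not the interface of any particular RPC system), consider a single procedure add(a, b):

```python
# Sketch: manually written client stub and server skeleton for one procedure,
# add(a, b). JSON serves as a simple machine-neutral encoding; the transport
# (send/receive callables) is deliberately left abstract.
import json

def client_stub_add(a: int, b: int, send, receive) -> int:
    # Marshal the call: procedure name plus input arguments.
    request = json.dumps({"proc": "add", "args": [a, b]}).encode()
    send(request)
    # Unmarshal the result returned by the server skeleton.
    return json.loads(receive())["result"]

def server_skeleton(request: bytes) -> bytes:
    # Unmarshal parameters, dispatch to the local procedure, marshal the result.
    call = json.loads(request)
    if call["proc"] == "add":
        result = call["args"][0] + call["args"][1]
    return json.dumps({"result": result}).encode()

# Loopback "transport" just to show the round trip end to end.
reply_box = []
print(client_stub_add(2, 3,
                      lambda msg: reply_box.append(server_skeleton(msg)),
                      lambda: reply_box.pop()))   # -> 5
```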

Page 4: Distributed System

Mutual exclusion

Mutual exclusion (often abbreviated to mutex) algorithms are used in concurrent programming to avoid the simultaneous use of a common resource, such as a global variable, by pieces of computer code called critical sections. A critical section is a piece of code in which a process or thread accesses a common resource.

The critical section by itself is not a mechanism or algorithm for mutual exclusion. A program, process, or thread can have the critical section in it without any mechanism or algorithm which implements mutual exclusion.
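As a minimal single-machine illustration (a sketch only; the shared counter and thread counts are assumptions), a mutex ensures that at most one thread executes the critical section at a time:

```python
# Sketch: protecting a critical section with a mutex (threading.Lock). In a
# distributed setting the same role is played by a distributed mutual-
# exclusion algorithm rather than a local lock.
import threading

counter = 0                      # the shared resource
lock = threading.Lock()          # the mutex

def increment(times: int):
    global counter
    for _ in range(times):
        with lock:               # enter the critical section
            counter += 1         # access the shared resource
                                 # leaving the block releases the lock

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)                   # -> 40000 (no lost updates)
```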

Election Algorithms

The coordinator election problem is to choose a process from among a group of processes on different processors in a distributed system to act as the central coordinator. An election algorithm is an algorithm for solving the coordinator election problem. By the nature of the coordinator election problem, any election algorithm must be a distributed algorithm.

            - A group of processes on different machines need to choose a coordinator.

            - Peer-to-peer communication: every process can send messages to every other process.

            - Assume that processes have unique IDs, such that one is highest.

            - Assume that the priority of process Pi is i.

(a) Bully Algorithm

Background: any process Pi sends a message to the current coordinator; if no response in T time units, Pi tries to elect itself as leader. Details follow:

 Algorithm for process Pi that detected the lack of coordinator

1. Process Pi sends an “Election” message to every process with higher priority.

2. If no other process responds, process Pi starts the coordinator code running and sends a message to all processes with lower priorities saying “Elected Pi”.

3. Else, Pi waits for T’ time units to hear from the new coordinator, and if there is no response, it starts from step (1) again.

Algorithm for other processes (also called Pi)

            If Pi is not the coordinator, then Pi may receive either of these messages from Pj:

            If Pj sends “Elected Pj” [this message is only received if i < j]:

                        Pi updates its records to say that Pj is the coordinator.

            Else if Pj sends an “Election” message (i > j):

                        Pi sends a response to Pj saying it is alive.

                        Pi starts an election.
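The following is a compact, single-process simulation of the core bully rule (challenge only higher-priority processes; the highest alive process ends up coordinator). Real message passing, the timeouts T and T’, and failure detection are abstracted away, so this is only an illustrative sketch:

```python
# Simplified simulation of the bully algorithm: process i sends "Election"
# to every higher-priority process; if none of them is alive, i elects
# itself, otherwise the responders run their own elections and the highest
# alive process eventually wins (timeouts/messages abstracted away).
def bully_election(initiator: int, alive: set, n: int) -> int:
    higher_alive = [p for p in range(initiator + 1, n) if p in alive]
    if not higher_alive:
        return initiator                      # no response: elect myself
    # Each responder starts its own election; recursing on any responder
    # converges on the highest alive process.
    return bully_election(min(higher_alive), alive, n)

# Example: 5 processes P0..P4, P4 has crashed, P1 detects the failure.
print(bully_election(1, alive={0, 1, 2, 3}, n=5))   # -> 3
```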

 

Page 5: Distributed System

(b) Election in a Ring (Ring Algorithm)

-assume that processes form a ring: each process only sends messages to the next process in the ring

- Active list: a process's record of all other active processes

- assumption: message continues around the ring even if a process along the way has crashed.

 Background: any process Pi sends a message to the current coordinator; if no response in T time units, Pi initiates an election

1. Initialize the active list to empty.

2. Send an “Elect(i)” message to the right and add i to the active list.

 

If a process receives an “Elect(j)” message

            (a) if this is the first message sent or seen:

                        initialize its active list to [i,j]; send “Elect(i)” + send “Elect(j)”

            (b) if i != j, add j to the active list and forward the “Elect(j)” message to the next process in the ring

            (c) otherwise (i = j), so process i has complete set of active processes in its active list.

                        => choose highest process ID + send “Elected (x)” message to neighbor

If a process receives “Elected(x)” message,

            set coordinator to x

           

Example: 

Suppose that we have four processes arranged in a ring: P1 → P2 → P3 → P4 → P1 → …

P4 is coordinator

Suppose P1 + P4 crash

Suppose P2 detects that coordinator P4 is not responding

P2 sets active list to [ ]

P2 sends “Elect(2)” message to P3; P2 sets active list to [2]

P3 receives “Elect(2)”

This message is the first message seen, so P3 sets its active list to [2,3]

P3 sends “Elect(3)” towards P4 and then sends “Elect(2)” towards P4

The messages pass the crashed P4 and P1 and then reach P2

P2 adds 3 to active list [2,3]

P2 forwards “Elect(3)” to P3

P2 receives the “Elect(2)” message

Page 6: Distributed System

        P2 chooses P3 as the highest process in its list [2, 3] and sends an “Elected(P3)” message

            P3 receives the “Elect(3)” message

            P3 chooses P3 as the highest process in its list [2, 3] + sends an “Elected(P3)” message
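A small simulation of the same ring election (a sketch only: crashed processes are simply skipped rather than timed out, and the ring and alive set below mirror the worked example above):

```python
# Sketch of the ring election: Elect messages circulate, collecting the IDs
# of active processes; when a process sees its own ID come back, it picks
# the highest collected ID and announces Elected(x) around the ring.
def ring_election(initiator: int, ring: list, alive: set) -> int:
    n = len(ring)
    start = ring.index(initiator)
    active = [initiator]                       # initiator's active list
    pos = (start + 1) % n
    while ring[pos] != initiator:              # circulate the Elect messages
        if ring[pos] in alive:                 # crashed nodes are skipped
            active.append(ring[pos])
        pos = (pos + 1) % n
    return max(active)                         # own ID returned: choose highest

# Matches the example: ring P1 -> P2 -> P3 -> P4, P1 and P4 crashed,
# P2 detects the failed coordinator and initiates the election.
print(ring_election(2, ring=[1, 2, 3, 4], alive={2, 3}))   # -> 3
```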

Distributed Scheduling

Load balancing (computing)

Load balancing is a computer networking methodology to distribute workload across multiple computers or a computer cluster, network links, central processing units, disk drives, or other resources, to achieve optimal resource utilization, maximize throughput, minimize response time, and avoid overload. Using multiple components with load balancing, instead of a single component, may increase reliability through redundancy. The load balancing service is usually provided by dedicated software or hardware, such as a multilayer switch or a Domain Name System server.

Load Balancing

In a distributed system, some nodes may be lightly loaded, some moderately loaded, and some heavily loaded.

The difference between one "N-times faster" processor and a pool of N processors is interesting. While the pool's arrival rate is N times the individual rate, its effective service rate is not, unless the pool is kept constantly and maximally busy; on the other hand, one N-times-faster processor can be much more expensive (or simply not exist) compared with N slow processors. To see the problem of underutilization in the absence of load balancing, let us analyze N isolated systems -- consider a system of N identical and independent M/M/1 servers:
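As a rough sketch of that comparison (the rates and N below are assumptions chosen only for illustration), the standard M/M/1 mean response time T = 1/(mu - lambda) can be evaluated for N isolated slow servers versus one N-times-faster server fed by the combined arrival stream:

```python
# Sketch of the underutilization argument with M/M/1 formulas:
# mean response time T = 1 / (mu - lam) for a stable M/M/1 queue.
def mm1_response_time(lam: float, mu: float) -> float:
    assert lam < mu, "queue must be stable"
    return 1.0 / (mu - lam)

lam, mu, N = 4.0, 5.0, 10          # per-node arrival and service rates (illustrative)

# N isolated M/M/1 servers: each node keeps its own queue.
t_isolated = mm1_response_time(lam, mu)

# One server that is N times faster, fed by the combined arrival stream.
t_fast_single = mm1_response_time(N * lam, N * mu)

print(t_isolated, t_fast_single)   # -> 1.0 versus 0.1: N times lower delay
```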

Issues in Load Distribution

Load

Resource and CPU queue lengths are good indicators of load. Artificially increment the CPU queue length for transferred jobs that are on their way, and set timeouts for such jobs to safeguard against transfer failures. There is little correlation between queue length and CPU utilization for interactive jobs, so use utilization instead; however, monitoring CPU utilization is expensive. Modeling: Poisson process, Markov process, M/M/1 queue, M/M/N.

Classification of Algorithms

Static -- decisions hard-wired into the algorithm using prior knowledge of the system.

Dynamic -- use state information to make decisions.

Adaptive -- special case of dynamic algorithms; dynamically change the parameters of the algorithm.

Load Sharing vs. Load Balancing

Load Sharing -- reduce the likelihood of unshared state by transferring tasks to lightly loaded nodes.

Load Balancing -- try to give each node approximately the same load.

Preemptive vs. Nonpreemptive

Page 7: Distributed System

Preemptive transfers -- transfer of a task that is partially executed; expensive due to the collection of the task's state.

Nonpreemptive transfers -- only transfer tasks that have not begun execution.

Components of Load Distribution

Transfer policy -- threshold-based; determines whether a task should be executed remotely or locally.

Selection policy -- which task should be picked; the overhead of transferring the selected task should be offset by the reduction in its response time.

Location policy -- which node the task should be sent to; possibly use polling to find a suitable node.

Information policy -- when the state information of other nodes should be collected: demand-driven, periodic, or state-change-driven.

Demand-driven: nodes gather information about other nodes; may be sender initiated, receiver initiated, or symmetrically initiated.

Periodic: nodes exchange information periodically.

State-change-driven: nodes disseminate information when their state changes.

Stability -- can be analyzed from a queueing-theoretic or an algorithmic perspective.

Sender Initiated Algorithms

Overloaded node -- a new task makes the queue length ≥ a threshold T. Underloaded node -- accepting a task still keeps the queue length < T. An overloaded node attempts to send a task to an underloaded node; only newly arrived tasks are considered for transfer.

Location policies:

random -- no information about other nodes is used.

threshold -- polling to determine whether the polled node is a receiver (underloaded).

shortest -- a number of nodes are polled at random to determine their queue lengths, and the task is sent to the shortest queue.

Information policy: demand-driven.

Stability: polling increases activity and can render the system unstable at high load.
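A sketch of a sender-initiated scheme with a threshold transfer policy and a polling-based location policy (the threshold T, the poll limit, and the queue lengths below are assumptions for illustration):

```python
# Sketch of a sender-initiated threshold policy: when a new task pushes the
# local queue to the threshold T, poll a few nodes at random and transfer
# the task to the first one that would stay below T after accepting it.
import random

T = 3            # queue-length threshold
POLL_LIMIT = 3   # how many nodes a sender probes before giving up

def place_task(local_node: int, queues: dict) -> int:
    queues[local_node] += 1                       # new task arrives locally
    if queues[local_node] < T:
        return local_node                         # node is not overloaded
    candidates = [n for n in queues if n != local_node]
    for node in random.sample(candidates, min(POLL_LIMIT, len(candidates))):
        if queues[node] + 1 < T:                  # receiver stays underloaded
            queues[local_node] -= 1
            queues[node] += 1                     # transfer the new task
            return node
    return local_node                             # no receiver found: run locally

queues = {0: 2, 1: 0, 2: 4}
print(place_task(0, queues), queues)              # -> the task moves to node 1
```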

Receiver Initiated Algorithms

Initiated by underloaded nodes: an underloaded node tries to obtain a task from an overloaded node, initiating the search for a sender either on a task departure or after a predetermined period. Information policy: demand-driven. Stability: remains stable at high and low loads. Disadvantage: most task transfers are preemptive.

Symmetrically Initiated Algorithms

Senders search for receivers -- good in low-load situations, but high polling overhead in high-load situations. Receivers search for senders -- useful in high-load situations; a preemptive task transfer facility is necessary.

Stable Symmetrically Initiated Algorithms

Use the information gathered during polling to classify nodes in the system as Sender (overloaded), Receiver (underloaded), or OK.

Page 8: Distributed System

Distributed Deadlock

A deadlock is a condition in a system where a process cannot proceed because it needs to obtain a resource held by another process, while it itself is holding a resource that the other process needs. More formally, four conditions have to be met for a deadlock to occur in a system:

1. mutual exclusion -A resource can be held by at most one process.

2. hold and wait -Processes that already hold resources can wait for another resource.

3. non-preemption -A resource, once granted, cannot be taken away.

4. circular wait -Two or more processes are waiting for resources held by one of the other processes.

Resource allocation can be represented by directed graphs:

R1 → P1 means that resource R1 is allocated to process P1.

P1 → R1 means that resource R1 is requested by process P1.

Deadlock is present when the graph has cycles. An example is shown in Figure 1.
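Deadlock detection can therefore be phrased as cycle detection in a wait-for graph. A minimal sketch (the process names and graph encoding are assumptions for illustration):

```python
# Sketch: detecting deadlock as a cycle in a wait-for graph, where an edge
# u -> v means "process u waits for a resource held by process v".
def has_deadlock(wait_for: dict) -> bool:
    visited, on_stack = set(), set()

    def dfs(node) -> bool:
        visited.add(node)
        on_stack.add(node)
        for nxt in wait_for.get(node, []):
            if nxt in on_stack:                # back edge: cycle found
                return True
            if nxt not in visited and dfs(nxt):
                return True
        on_stack.discard(node)
        return False

    return any(dfs(p) for p in wait_for if p not in visited)

# P1 waits for P2 and P2 waits for P1: the classic circular wait.
print(has_deadlock({"P1": ["P2"], "P2": ["P1"]}))   # -> True
print(has_deadlock({"P1": ["P2"], "P2": []}))       # -> False
```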

Deadlocks in distributed systems

The same conditions for deadlocks in uniprocessors apply to distributed systems. Unfortunately, as in many other aspects of distributed systems, they are harder to detect, avoid, and prevent. Tanenbaum proposes four strategies for dealing with distributed deadlocks:

1. ignorance: ignore the problem (this is the most common approach).

2. detection: let deadlocks occur, detect them, and then deal with them.

3. prevention: make deadlocks impossible.

4. avoidance: choose resource allocation carefully so that deadlocks will not occur.

The last of these, deadlock avoidance through resource allocation is difficult and requires the ability

Page 9: Distributed System

to predict precisely the resources that will be needed and the times that they will be needed. This is difficult and not practical in real systems. The first of these is trivially simple. We will focus on the middle two approaches.

DDBMS (distributed database management system)

A DDBMS (distributed database management system) is a centralized application that manages a distributed database as if it were all stored on the same computer. The DDBMS synchronizes all the data periodically, and in cases where multiple users must access the same data, ensures that updates and deletes performed on the data at one location will be automatically reflected in the data stored elsewhere.

DDBMS Advantages

Data are located near “greatest demand” site

Faster data access

Faster data processing

Growth facilitation

Improved communications

Reduced operating costs

User-friendly interface

Less danger of a single-point failure

Processor independence

DDBMS Disadvantages

Complexity of management and control

Security

Lack of standards

Increased storage requirements

Greater difficulty in managing the data environment

Increased training cost

Distributed Multimedia Systems

1. Introduction

o Definition: “A distributed multimedia system (DMS) is an integrated communication, computing, and information system that enables the processing, management, delivery, and presentation of synchronized multimedia information with quality-of-service guarantees.”

o http://encyclopedia.jrank.org/articles/pages/6729/Distributed-Multimedia-Systems.html

2. Characteristics

Page 10: Distributed System

o Delivering the streams of multimedia data

Audio samples, Video frames

o To meet the timing requirements

QoS (Quality of Service)

o Flexibility (adapting to user needs)

o Availability

o Scalability

3. Factors that affect a system

o Server bandwidth

o Cache space

o Number of copies

o The number of clients

4. Basic Schema [diagram: wide area gateway, video server, digital TV/radio server, video camera and microphone, local networks]

5. Typical infrastructure components for multimedia applications


7. Different Designs and Architectures

o Database

o Proxy/information servers

o Clients

o Wired or wireless networks

8. Approaches

o Proxy-based approach

o Parallel or clustered servers approach

Varies based on clip duration, number of clients, bandwidth available, etc.

o Caching

9. Quality of Service (QoS)

o DMMS are real-time systems as data must be delivered on time

o Not critical – Some flexibility exists

o Loss is acceptable when resync is possible.

o “Acceptable” service is measured by:

Bandwidth (Throughput)

Latency (Access time)

Data Loss Rate (Acceptable loss ratio)

10. QoS Management

o “QoS Management”

Page 11: Distributed System

Process of managing resources to meet the Acceptable service criteria.

o Resources include:

CPU / processing power

Network bandwidth

Buffer memory(on both ends)

Disk bandwidth

Other factors affecting communication

11. Why do we need QoS?

o As multimedia becomes more widespread, strain on network increases!

o Networks provide insufficient QoS for distribution of multimedia.

Ethernet (wired or wireless) is best effort

Collisions, data loss, congestion, etc.

o For some multimedia applications, synchronization is vital.

12. QoS Managers

o Software that runs on network nodes which have two main functions:

QoS negotiation : gets the requirements from applications and checks their feasibility against the available resources.

Admission control : If negotiation succeeds, provides a “resource contract” that guarantees reservation of resources for a certain time.

13. Ways to achieve QoS

o Buffering (on both ends)

o Compression

More load on the nodes, but that is okay

o Bandwidth Reservation

o Resource Scheduling

o Traffic Shaping

o Flow Specifications

o Stream Adaptation

14. Traffic Shaping

o Output buffering at the source to keep data flowing smoothly.

o Two main algorithms:

Leaky bucket : guarantees that data flows at a constant rate without bursts - completely eliminating bursty traffic.

Token bucket : variation of leaky bucket where tokens are generated to allow for some bursty traffic when bandwidth is unused for a certain period of time.
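A minimal token-bucket sketch (the rates and packet sizes are illustrative assumptions): tokens accumulate while the link is idle, so a limited burst can be sent later, unlike the strict leaky bucket:

```python
# Sketch of a token bucket shaper: tokens accumulate at a fixed rate up to a
# bucket capacity; a packet may be sent only if enough tokens are available,
# which permits limited bursts after idle periods.
import time

class TokenBucket:
    def __init__(self, rate: float, capacity: float):
        self.rate = rate                # tokens added per second
        self.capacity = capacity        # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self, packet_size: float) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= packet_size:
            self.tokens -= packet_size  # consume tokens and transmit
            return True
        return False                    # shape: delay or drop the packet

bucket = TokenBucket(rate=1000, capacity=4000)   # 1 KB/s refill, 4 KB bursts
print([bucket.allow(1500) for _ in range(4)])    # back-to-back: [True, True, False, False]
```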

15. Traffic Shaping

16. Flow specifications

o RFC 1363 defines QoS parameters:

Page 12: Distributed System

Bandwidth

Latency and jitter constraints

Data loss limits

Token bucket size


17. Stream Adaptation

o Adjust the data flow based on resource availability.

o Scaling

Scale down content at the source to reduce bandwidth required:

Audio : reduce the rate of audio sampling or drop channels

Video : reduce the resolution or number of pixels, change the compression algorithm, color depth, or color space, or use combinations of these.

o Filtering

One target asks the source to reduce quality for all the clients, even if some can handle higher quality.

Suitable for more than one simultaneous target and guarantees the same QoS for all the targets

18. Applications of DMMS

o Digital Libraries

o Distance learning

o Teleconferencing

o Video on Demand (VoD) & Video on Reservation (VoR)

o Pay Per View

o Audio Streaming

o Video Streaming

o E-commerce

o P2PTV

19. Voddler

o Video on Demand and Pay Per View

o Long movies

o Requires high bandwidth

o Hybrid P2P distribution network

20. Voddler http://en.wikipedia.org/wiki/File:P2ptv.PNG

21. YouTube, Platform

o Apache

o Python

o Linux

o MySQL

Page 13: Distributed System

o Psyco

o lighttpd for video instead of Apache, because of overheads

22. YouTube, Serving Video

o Each video is hosted by a mini-cluster; each video is served by more than one machine.

o Most popular content is moved to a CDN (content delivery network)

o Less popular content (1-20 views per day) uses YouTube servers in various colo sites

23. YouTube, Data Center Strategy

o Used managed hosting providers at first. Living off credit cards, so it was the only way.

o Managed hosting can't scale with you. You can't control hardware or make favorable networking agreements.

o So they went to a colocation arrangement. Now they can customize everything and negotiate their own contracts.

o Videos come out of any data center. Not closest match or anything. If a video is popular enough it will move into the CDN.