Why do we need models? There are many dimensions of variability in distributed systems. Examples:...

Why do we need models?

There are many dimensions of variability in distributed systems. Examples: interprocess communication mechanisms, failure classes, security mechanisms etc.

Models are simple abstractions that help understand the variability -- abstractions that preserve the essential features, but hide the implementation details from observers who view the system at a higher level.

A message passing model

System is a graph G = (V, E).

V= set of nodes (sequential processes)

E = set of edges (links or channels) (bi/unidirectional)

Four types of actions by a process:

- Internal action

- Communication action

- Input action

- Output action

A reliable FIFO channel

• Axiom 1. Message m sent message m received

• Axiom 2. Message propagation delay is arbitrary but finite.

• Axiom 3. m1 sent before m2 m1 received before m2.

P

Q

Life of a process

When a message m is received

1. Evaluate a predicate with m

and the local variables;

2. if predicate = true then

- update internal variables;

-send k (k≥0) messages;

else skip {do nothing}

end if

Synchrony vs. Asynchrony

Send & receive can be blocking or non-blocking

Postal communication is asynchronous:

Telephone communication is synchronous

Synchronous communication or not?

(1) RPC,(2) Email

Synchronous

clocks

Local physical clocks synchronized

Synchronous processes

Lock-step synchrony

Synchronous channels

Bounded delay

Synchronous message-order

First-in first-out channels

Synchronous communication

Communication via handshaking

Shared memory model

Here is an implementation of a channel of capacity (max-1)

Sender P writes into the circular buffer and increments the pointer (when p ≠ q-1 mod max)

Receiver Q reads from the circular buffer and increments the pointer (when q ≠ p)

0

1

2

max-1

P Q

p q

Variations of shared memory models

0

3

21 State reading model.

Each process can read

the states of its neighbors

0

3

21Link register model.

Each process can read from

and write ts adjacent buffers

Modeling wireless networks

• Communication via broadcasting• Limited range• Dynamic topology• Collision of broadcasts

(handled by CSMA/CA protocol)• Low power

0

1

2

3

4

5

6

0

1

2

3

4

5

6

(a)

(b)

RTS RTS

CTS

Weak vs. Strong Models

One object (or operation) of a strong model = More than one objects (or operations) of a weaker model.

Often, weaker models are synonymous with fewer restrictions.

One can add layers to create a stronger model from weaker one).

Examples

HLL model is stronger than assembly language model.

Asynchronous is weaker than synchronous.

Bounded delay is stronger than unbounded delay (channel)

Model transformation

Stronger models

- simplify reasoning, but

- needs extra work to implement

Weaker models

- are easier to implement.

- Have a closer relationship with the real

world

Can model X be implemented using model Y is an interesting question in computer science.

Sample problems

Non-FIFO channel to FIFO channel

Message passing to shared memory

Atomic broadcast to non-atomic broadcast

A resequencing protocol{sender process P} {receiver process Q}var i : integer {initially 0} var k : integer {initially 0}

buffer: buffer[0..∞] of msg {initially k: buffer [k] = empty

repeat repeat send m[i],i to Q; {STORE} receive m[i],i from P; i := i+1 store m[i] into buffer[i];

forever {DELIVER} while buffer[k] ≠ empty do begin

deliver content of buffer [k];buffer [k] := empty k := k+1;

endforever

Observations(a) Needs Unbounded sequence numbers and (b) Needs Unbounded number of buffer slots This is bad.

Now solve the same problem on a model where (a) The propagation delay has a known upper bound of T.(b) The messages are sent out @ r per unit time.(c) The messages are received at a rate faster than r.

The buffer requirement drops to r.T. Synchrony pays.

Question. How can we solve the problem using bounded buffer space when the propagation delay is arbitrarily large?

Non-atomic to atomic broadcast

We now include crash failure as a part of our model. First realize what the difficulty is with a naïve approach:

{process i is the sender}for j = 1 to N-1 (j ≠ i) send message m to neighbor[j]

What if the sender crashes at the middle?

Suggest an algorithm for atomic broadcast.

Other classifications

Reactive vs Transformational systemsA reactive system never sleeps (like: a server, or a token ring)A transformational (or non-reactive systems) reaches a fixed point

after which no further change occurs in the system

Named vs Anonymous systemsIn named systems, process id is a part of the algorithm. In anonymous systems, it is not so. All are equal.

(-) Symmetry breaking is often a challenge.

(+) Saves log N bits. Easy to switch one process by another with no side effect)

Model and complexity

Many measures

o Space complexityo Time complexityo Message complexityo Bit complexityo Round complexity

Consider broadcasting in an n-cube (n=3)

0 1

2

3

4

5

6

7

source

Broadcasting using messages{Process 0} sends m to neighbors

{Process i > 0}

repeatreceive message m {m contains

the value};if m is received for the first time then x[i] := m.value; send x[i] to

each neighbor j > I else discard mend ifforeverWhat is the (1) message complexity (2) space complexity per process?

0 1

2

3

4

5

6

7

Each process jhas a variable x[j]initially undefined

Broadcasting using shared memory

{Process 0} x[0] := v{Process i > 0}repeatif a neighbor j < i : x[i] ≠ x[j]

then x[i] := x[j] {this is a step}

else skip end ifforever

What is the time complexity? How many steps are needed?Can be arbitrarily large!

0 1

2

3

4

5

6

7


Broadcasting using shared memory

Now, use “large atomicity”, where

in one step, a process can read

the states of ALL its neighbors.

What is the time complexity?

How many steps are needed?

The time complexity is now O(n2) 0 1

2

3

4

5

6

7


Complexity in rounds

Rounds are truly defined for

synchronous systems. An

asynchronous round consists

of a number of steps where

every process (including the

slowest one) takes at least

one step.

How many rounds do you need

to complete the broadcast?

0 1

2

3

4

5

6

7


Exercises

Consider an anonymous distributed system consisting of N

processes. The topology is a completely connected

network, and the links are bidirectional. Propose an

algorithm using which processes can acquire unique

identifiers. (Hint: use coin flipping, and organize the

computation in rounds).

Exercises

Alice and Bob enter into an agreement: whenever one falls

sick, (s)he will call the other person. Since making the

agreement, no one called the other person, so both

concluded that they are in good health. Assume that the

clocks are synchronized, communication links are perfect,

and a telephone call requires zero time to reach. What kind

of interprocess communication model is this?

Why do we need models? There are many dimensions of variability in distributed systems. Examples:...

Documents

Transcript of Why do we need models? There are many dimensions of variability in distributed systems. Examples:...