A11 Clocks -Synch1

download A11 Clocks -Synch1

of 55

Transcript of A11 Clocks -Synch1

  • 8/13/2019 A11 Clocks -Synch1

    1/55

    Synchronizationin

    Distributed Systems

    Chapter 6

  • 8/13/2019 A11 Clocks -Synch1

    2/55

    Guide to Synchronization Lectures

    Synchronization in shared memory systems

    Event ordering in distributed systems

    Logical time, logical clocks, time stamps, Mutual exclusion in distributed systems

    Centralized, decentralized, etc.

    Election algorithms Data race detection in multithreaded

    programs

  • 8/13/2019 A11 Clocks -Synch1

    3/55

    Background

    Synchronization: coordination of actionsbetween processes.

    Processes are usually asynchronous, (operate

    independent of events in other processes) Sometimes need to cooperate/synchronize

    For mutual exclusion

    For event ordering (was message x from process Psent before or after message y from process Q?)

  • 8/13/2019 A11 Clocks -Synch1

    4/55

    Introduction

    Synchronization in centralized systems isprimarily accomplished through sharedmemory

    Event ordering is clear because all events aretimed by the same clock

    Synchronization in distributed systems isharder No shared memory

    No common clock

  • 8/13/2019 A11 Clocks -Synch1

    5/55

    Clock Synchronization

    Some applications rely on event orderingto be successful See page 232 for some examples

    Event ordering is easier if you can accuratelytime-stamp events, but in a distributed systemthe clocks may not always be synchronized

    Is it possible to synchronize clocks in adistributed system?

  • 8/13/2019 A11 Clocks -Synch1

    6/55

    Physical Clocks - pages 233-238

    Physical clock example: counter + holdingregister + oscillating quartz crystal The counter is decremented at each oscillation

    Counter interrupts when it reaches zero

    Reloads from the holding register Interrupt = clock tick (often 60 times/second)

    Software clock: counts interrupts This value represents number of seconds since some

    predetermined time (Jan 1, 1970 for UNIX systems;beginning of the Gregorian calendar for Microsoft)

    Can be converted to normal clock times

  • 8/13/2019 A11 Clocks -Synch1

    7/55

    Clock Skew

    In a distributed system each computer hasits own clock

    Each crystal will oscillate at slightly

    different rate. Over time, the software clock values on

    the different computers are no longer the

    same.

  • 8/13/2019 A11 Clocks -Synch1

    8/55

    Clock Skew

    Clock skew(offset): the difference betweenthe times on two different clocks

    Clock drift : the difference between a clock

    and actual time Ordinary quartz clocks drift by ~ 1sec in

    11-12 days. (10-6secs/sec)

    High precision quartz clocks drift rate issomewhat better

  • 8/13/2019 A11 Clocks -Synch1

    9/55

    Various Ways of Measuring Time*

    The sun Mean solar secondgradually getting longer as

    earths rotation slows.

    International Atomic Time (TAI)

    Atomic clocks are based on transitions of the cesiumatom

    Atomic second = value of solar second at some fixedtime (no longer accurate)

    Universal Coordinated Time (UTC) Based on TAI seconds, but more accurately reflects

    sun time (inserts leap seconds to synchronize atomicsecond with solar second)

  • 8/13/2019 A11 Clocks -Synch1

    10/55

    Getting the Correct (UTC) Time*

    WWV radio station or similar stations inother countries (accurate to +/- 10 msec)

    UTC services provided by earth satellites(accurate to .5 msec)

    GPS (Global Positioning System)(accurate to 20-35 nanoseconds)

  • 8/13/2019 A11 Clocks -Synch1

    11/55

    Clock Synchronization Algorithms*

    In a distributed system one machine mayhave a WWV receiver and some techniqueis used to keep all the other machines in

    synch with this value.

    Or, no machine has access to an externaltime source and some technique is used

    to keep all machines synchronized witheach other, if not with real time.

  • 8/13/2019 A11 Clocks -Synch1

    12/55

    Clock Synchronization Algorithms

    Network Time Protocol (NTP): Objective: to keep all clocks in a system synchronized to

    UTC time (1-50 msec accuracy)not so good in WAN

    Uses a hierarchy of passive time servers

    The Berkeley Algorithm: Objective: to keep all clocks in a system synchronized to

    each other (internal synchronization)

    Uses active time servers that poll machines periodically

    Reference broadcast synchronization (RBS) Objective: to keep all clocks in a wireless system

    synchronized to each other

  • 8/13/2019 A11 Clocks -Synch1

    13/55

    Three Philosophies of Clock

    Synchronization Try to keep all clocks synchronized to

    real time as closely as possible

    Try to keep all clocks synchronized toeach other, even if they vary somewhatfrom UTC time

    Try to synchronize enough so thatinteracting processes can agree upon anevent order. Refer to these clocks as logical clocks

  • 8/13/2019 A11 Clocks -Synch1

    14/55

    6.2 Logical Clocks

    Observation: if two processes (running onseparate processors) do not interact, itdoesnt matter if their clocks are not

    synchronized. Observation: When processes do interact,

    they are usually interested in event order,instead of exact event time.

    Conclusion: Logical clocks are sufficientfor many applications

    Some applications rely on event orderingto be successful

  • 8/13/2019 A11 Clocks -Synch1

    15/55

    Formalization

    The distributed system consists of nprocesses, p1, p2, pn (e.g, a MPI group)

    Each piexecutes on a separate processor

    No shared memory

    Each pihas a state si

    Process execution: a sequence of events Changes to the local state

    Message Send or Receive

  • 8/13/2019 A11 Clocks -Synch1

    16/55

    Two Versions

    Lamports logical clocks: synchronizes

    logical clocks

    Can be used to determine an absolute

    ordering among a set of events although theorder doesnt necessarily reflect causal

    relations between events.

    Vector clocks: can capture the causalrelationships between events.

  • 8/13/2019 A11 Clocks -Synch1

    17/55

    Lamports Logical Time

    Lamport defined a happens-beforerelation between events in a process.

    "Events" are defined by the application. Thegranularity may be as coarse as aprocedure or as fine-grained as a singleinstruction.

  • 8/13/2019 A11 Clocks -Synch1

    18/55

    Happened BeforeRelation (ab)

    ab: (page 244-245)

    in the same [sequential] process,

    send, receive in different processes,

    (messages)

    transitivity: if ab and bc, then ac

    If ab, then a and b are causally related;

    i.e., event a potentially has a causal effecton event b.

  • 8/13/2019 A11 Clocks -Synch1

    19/55

    Concurrent Events

    Happens-before defines a partial order ofevents in a distributed system.

    Some events cant be placed in the order

    a and bare concurrent (a || b) if!(ab) and !(ba).

    If a and b arent connected by thehappened-before relation, theres no wayone could affect the other.

  • 8/13/2019 A11 Clocks -Synch1

    20/55

    Logical Clocks

    Needed: method to assign a timestamp toevent a(call it C(a)), even in the absence of aglobal clock

    The method must guarantee that the clocks

    have certain properties, in order to reflect thedefinition of happens-before. Define a clock (event counter), Ci, at each

    process (processor) Pi.

    When an event a occurs, its timestamp ts(a) =C(a), the local clock value at the time the eventtakes place.

  • 8/13/2019 A11 Clocks -Synch1

    21/55

    Correctness Conditions

    If a and b are in the same process, andab then C(a) < C(b)

    If a is the event of sending a message

    from Pi, and b is the event of receiving themessage by Pj, then Ci (a) < Cj (b).

    The value of C must be increasing (time

    doesnt go backward). Corollary: any clock corrections must bemade by adding a positive number to a time.

    I l t ti R l

  • 8/13/2019 A11 Clocks -Synch1

    22/55

    Implementation Rules

    Between any two successive events a & b in

    Pi, increment the local clock (Ci = Ci + 1) thus Ci(b) = Ci(a) + 1

    When a message m is sent from Pi, set its

    time-stamp tsmto Ci, the time of the sendevent after following previous step.

    When the message is received at Pjthe local

    time must be greater than tsm. The rule is(Cj= max{Cj, tsm} + 1).

  • 8/13/2019 A11 Clocks -Synch1

    23/55

    Lamports Logical Clocks (2)

    Figure 6-9. (a) Three processes, each with its own clock.

    The clocks run at different rates.

    Event a: P1 sends m1to P2 at t = 6,Event b: P2 receivesm1 at t = 16.If C(a) is the time m1

    was sent, and C(b) isthe time m1 isreceived, do C(a) andC(b) satisfy thecorrectness conditions?

  • 8/13/2019 A11 Clocks -Synch1

    24/55

    Lamports Logical Clocks (3)

    Figure 6-9. (b) Lamports algorithm corrects the clocks.

    Event c: P3sends m3 toP2 at t = 60Event d: P2receives m3

    at t = 56Do C(c) andC(d) satisfytheconditions?

  • 8/13/2019 A11 Clocks -Synch1

    25/55

    Application Layer

    Application sends message mi

    Adjust local clock,Timestamp mi

    Middleware sendsmessage

    Network Layer

    Message miis received

    Adjust local clock

    Deliver mi

    to application

    Middleware layer

    Figure 6-10. The positioning of Lamports logical clocks in distributed systemsHandling clock management as a middleware operation

  • 8/13/2019 A11 Clocks -Synch1

    26/55

    Figure 5.3 (Advanced Operating Systems,Singhal and Shivaratri)How Lamports logical clocks advance

    e11 e12 e13 e14 e15 e16 e17

    e21 e22 e23 e24 e25

    P1

    P2

    Which events are causally related?

    Which events are concurrent?

    eij represents event jon processor i

  • 8/13/2019 A11 Clocks -Synch1

    27/55

    A Total Ordering Rule(does not guarantee causality)

    A total ordering of events can be obtainedif we ensure that no two events happen atthe same time (have the same timestamp).

    Why? So all processors can agree on anunambiguous order.

    How? Attach process number to low-order

    end of time, separated by decimal point;e.g., event at time 40 at process P1 is40.1,event at time 40 at process P2 is 40.2

  • 8/13/2019 A11 Clocks -Synch1

    28/55

    Figure 5.3 - Singhal and Shivaratri

    e11 e12 e13 e14 e15 e16 e17

    e21 e22 e23 e24 e25

    P1

    P2

    What is the total ordering of the events in thesetwo processes?

  • 8/13/2019 A11 Clocks -Synch1

    29/55

    Example: Total Order Multicast

    Consider a banking database, replicatedacross several sites.

    Queries are processed at thegeographically closest replica

    We need to be able to guarantee that DBupdates are seen in the same order

    everywhere

  • 8/13/2019 A11 Clocks -Synch1

    30/55

    Totally Ordered Multicast

    Update 1: Process 1 at Site A adds $100 to anaccount, (initial value = $1000)Update 2: Process 2 at Site B increments theaccount by 1%

    Without synchronization,its possible thatreplica 1 = $1111,replica 2 = $1110

  • 8/13/2019 A11 Clocks -Synch1

    31/55

    Message 1: add $100.00

    Message 2: increment account by 1% The replica that sees the messages in the

    order m1, m2 will have a final balance of$1111

    The replica that sees the messages in theorder m2, m1 will have a final balance of$1110

  • 8/13/2019 A11 Clocks -Synch1

    32/55

    The Problem

    Site 1 has final account balance of $1,111after both transactions complete and Site 2has final balance of $1,100.

    Which is right? Either, from the standpointof consistency.

    Problem: lack of consistency.

    Both values should be the same Solution: make sure both sites see/process

    all messages in the same order.

  • 8/13/2019 A11 Clocks -Synch1

    33/55

    Implementing Total Order

    Assumptions:

    Updates are multicast to all sites, including(conceptually) the sender

    All messages from a single sender arrive inthe order in which they were sent

    No messages are lost

    Messages are time-stamped with Lamportclock values.

  • 8/13/2019 A11 Clocks -Synch1

    34/55

    Implementation

    When a process receives a message, putit in a local message queue, ordered bytimestamp.

    Multicast an acknowledgement to all sites Each ackhas a timestamp larger than the

    timestamponthemessage it

    acknowledges The message queue at each site willeventually be in the same order

  • 8/13/2019 A11 Clocks -Synch1

    35/55

    Implementation

    Deliver a message to the application only whenthe following conditions are true:

    The message is at the head of the queue

    The message has been acknowledged by all otherreceivers. This guarantees that no update messageswith earlier timestamps are still in transit.

    Acknowledgements are deleted when the

    message they acknowledge is processed. Since all queues have the same order, all sites

    process the messages in the same order.

  • 8/13/2019 A11 Clocks -Synch1

    36/55

    Causality

    Causally related events:

    Event a may causally affect event bifab

    Events aand bare causally related if either

    ab orba.

    If neither of the above relations hold, thenthere is no causal relation between a & b. We

    say that a || b (a and b are concurrent)

  • 8/13/2019 A11 Clocks -Synch1

    37/55

    Vector Clock Rationale

    Lamport clocks limitation: If (ab) then C(a) < C(b) but

    If C(a) < C(b) then we only know that either

    (a

    b) or (a || b), i.e., b a In other words, you cannot look at the clock

    values of events on two different processors

    and decide which one happens before. Lamport clocks do not capture causality

  • 8/13/2019 A11 Clocks -Synch1

    38/55

    Lamports Logical Clocks (3)

    Figure 6-12.

    Suppose we add a message to thescenario in Fig. 6.12(b).

    Tsnd(m1) < Tsnd(m3).(6) < (32)

    Does this mean

    send(m1)send(m3)?But

    Tsnd(m1) < Tsnd(m2).(6) < (20)

    Does this meansend(m1)send(m2)?

    m2

    m3

  • 8/13/2019 A11 Clocks -Synch1

    39/55

    Figure 5.4Time

    P1

    P2

    P3

    e11.

    e21

    e12

    e22

    e31 e32 e33

    (1) (2)

    (1) (3)

    (1) (2) (3)

    C(e11) < C(e22) and C(e11) < C(e32) but while e11e22, we cannot saye11e32 since there is no causal path connecting them. So, withLamport clocks we can guarantee that if C(a) < C(b) thenb a , but by looking at the clock values alone we cannot saywhether or not the events are causally related.

    Space

  • 8/13/2019 A11 Clocks -Synch1

    40/55

    Vector ClocksHow They Work

    Each processor keeps a vector of values,instead of a single value. VCiis the clock at process i; it has a component

    for each process in the system.

    VCi[i] corresponds to Pis local time. VCi[j] represents Pis knowledge of the timeat Pj (the # of events that Piknows haveoccurred at Pj

    Each processor knows its own time exactly,and updates the values of other processorsclocks based on timestamps received inmessages.

  • 8/13/2019 A11 Clocks -Synch1

    41/55

    Implementation Rules

    IR1: Increment VCi[i] before each new event.

    IR2: When process i sends a message m it setsms (vector) timestamp to VCi (after incrementingVC

    i[i])

    IR3: When a process receives a message itdoes a component-by-component comparison ofthe message timestamp to its local time andpicks the maximum of the two correspondingcomponents. Adjust local componentsaccordingly.

    Then deliver the message to the application.

  • 8/13/2019 A11 Clocks -Synch1

    42/55

    Review

    Physical clocks: hard to keep synchronized

    Logical clocks: can provide some notion ofrelative event occurrence

    Lamports logical time

    happened-before relation defines causal relations logical clocksdont capture causality

    total ordering relation

    use in establishing totally ordered multicast

    Vector clocks

    Unlike Lamport clocks, vector clocks capture causality

    Have a component for each process in the system

  • 8/13/2019 A11 Clocks -Synch1

    43/55

    Figure 5.5. Singhal and Shivaratri

    (1, 0 , 0)(2, 0, 0) (4, 5, 2)

    e11 e12 e14

    (0, 1, 0) (2, 2, 0) (2, 3, 1) (2, 5, 2)

    (0, 0, 1) (0, 0, 2)

    e21 e22 e23 e24

    e31 e32

    P1

    P2

    P3

    (2,4,2)e25

    Vector clock values. In a 3- process system, VC(Pi) = vc1, vc2, vc3

    e13

    (3, 0, 0)

    e33

    (0, 0, 3)

  • 8/13/2019 A11 Clocks -Synch1

    44/55

    Establishing Causal Order

    When Pi sends a message mto Pj, Pj knows How many events occurred at Pi before mwas sent

    How many relevantevents occurred at other sites beforemwas sent (relevant = happened-before)

    In Figure 5.5, VC(e24) = (2, 4, 2). Two events in P1and two events in P3 happened before e24.

    Even though P1 and P3 may have executed other events,

    they dont have a causal effect on e24.

    f /C

  • 8/13/2019 A11 Clocks -Synch1

    45/55

    Happened Before/Causally RelatedEvents - Vector Clock Definition

    a b iff ts(a) < ts(b)(a happens before b iff the timestamp of a is lessthan the timestamp of b)

    Events a and b are causally relatedif ts(a) < ts(b) or

    ts(b) < ts(a)

    Otherwise, we say the events are concurrent.

    Any pair of events that satisfy the vector clockdefinition of happens-before will also satisfy theLamport definition, and vice-versa.

  • 8/13/2019 A11 Clocks -Synch1

    46/55

    Comparing Vector Timestamps

    Less than: ts(a) < ts(b)iff at least onecomponent of ts(a)is strictly less than thecorresponding component of ts(b) and allother components of ts(a) are either lessthan or equal to the correspondingcomponent in ts(b).

    (3,3,5) (3,4,5), (3, 3, 3) (3, 3, 3),

    (3,3,5) (3,2,4), (3, 3 ,5) | | (4,2,5).

  • 8/13/2019 A11 Clocks -Synch1

    47/55

    Figure 5.4Time

    P1

    P2

    P3

    e21

    e12

    e22

    e31 e32 e33

    (1, 0, 0) (2, 0, 0)

    (0, 1, 0) (2, 2, 0)

    (0, 0,1) (0, 0, 2) (0, 0, 3)

    ts(e11) = (1, 0, 0) and ts(e32) = (0, 0, 2), which shows that thetwo events are concurrent.ts(e11) = (1, 0, 0) and ts(e22) = (2, 2, 0), which shows thate11 e22

    e11

    C l O d i f M

  • 8/13/2019 A11 Clocks -Synch1

    48/55

    Causal Ordering of MessagesAn Application of Vector Clocks

    Premise: Deliver a message only ifmessages that causally precede it havealready been received

    i.e., if send(m1) send(m2), then it should betrue that receive(m1) receive(m2)at eachsite.

    If messages are not related (send(m1) ||send(m2)),delivery order is not of interest.

  • 8/13/2019 A11 Clocks -Synch1

    49/55

    Compare to Total Order

    Totally ordered multicast (TOM) isstronger (more inclusive) than causalordering (COM).

    TOM orders allmessages, not just those thatare causally related.

    Weaker COM is often what is needed.

  • 8/13/2019 A11 Clocks -Synch1

    50/55

    Enforcing Causal Communication

    Clocks are adjusted only when sending orreceiving messages; i.e, these are the onlyevents of interest.

    Send m: Piincrements VCi[i]by 1 andapplies timestamp, ts(m).

    Receive m: Pi compares VCito ts(m); set

    VCi[k]to max{VCi[k] , ts(m)[k]}for each k,k i.

  • 8/13/2019 A11 Clocks -Synch1

    51/55

    Message Delivery Conditions

    Suppose: PJreceives message mfrom Pi Middleware delivers mto the application iff

    ts(m)[i] = VCj[i] + 1

    all previous messages from Pihave been delivered ts(m)[k] VCi[k] for all k i

    PJ has received all messages that Pi had seen beforeit sent message m.

  • 8/13/2019 A11 Clocks -Synch1

    52/55

    In other words, if a message mis receivedfrom Pi,you should also have received

    every message that Pireceived before itsent m; e.g.,

    if m is sent by P1and ts(m)is (3, 4, 0) and youare P

    3, you should already have received

    exactly 2 messages from P1and at least 4from P2

    if m is sent by P2and ts(m) is (4, 5, 1, 3) and if

    you are P3and VC3is (3, 3, 4, 3) then youneed to wait for a fourth message from P2andat least one more message from P1.

  • 8/13/2019 A11 Clocks -Synch1

    53/55

    P0

    P1

    P2

    (1, 0, 0)

    P1 received message mfrom P0 before sendingmessage m*to P2; P2 must wait for delivery of mbefore receiving m*

    (Increment own clock only on message send)

    Before sending or receiving any messages, ones

    own clock is (0, 0, 0)

    VC2

    (1, 0, 0) (1, 1, 0)

    (1, 1, 0)

    VC1

    m

    m*

    VC0

    VC2

    Figure 6-13. Enforcing Causal Communication

    VC0

    (1, 1, 0)

    (0, 0, 0)VC2

  • 8/13/2019 A11 Clocks -Synch1

    54/55

    History

    ISIS and Horus were middleware systemsthat supported the building of distributedenvironments through virtually

    synchronous process groups Provided both totally ordered and causally

    ordered message delivery.

    Lightweight Causal and Atomic Group Multicast Birman, K., Schiper, A., Stephenson, P, ACM Transactions on

    Computer Systems, Vol 9, No. 3, August 1991, pp 272-314.

  • 8/13/2019 A11 Clocks -Synch1

    55/55

    Location of Message Delivery

    Problems if located in middleware:

    Message ordering captures only potential causality;no way to know if two messages from the samesource are actually dependent.

    Causality from other sources is not captured.

    End-to-end argument: the application is betterequipped to know which messages are causally

    related. But developers are now forced to do more

    work; re-inventing the wheel.