Mathematics-Inspired Protocols for Distributed Systems Steven Y. Ko*, Indranil Gupta Dept. of...

Mathematics-Inspired Protocols for Distributed Systems

Steven Y. Ko*, Indranil Gupta
Dept. of Computer Science
University of Illinois at Urbana-Champaign
{[email protected]; [email protected]}

Yookyung Jo**
Dept. of Computer Science
Cornell University
{[email protected]}

* Currently PhD student at UIUC. In the audience here.
** Work done during M.S. at UIUC.




2

An Open Design Challenge

Phenomena and Results from Biological, Physical, Social Worlds, etc.
→ Self-adaptive Protocols for Distributed Computing Problems

3

Lost in Translation

Two popular ways today for this translation:
I. Sit down with social scientists, biologists, etc.
  – Time consuming, terminology different
  – But good to talk!
II. Read textbooks written by them and derive protocols that “somewhat” model the phenomenon

Both of the above approaches often lead to:
  – Hand-wavy design: non-rigorous translation leads to unpredictable protocols
  – Difficulty of analysis of derived protocol
  – Lack of generality of translation

4

The Third Way


Social Worlds, etc.



Mathematical Models

5

Why Mathematical Models?

For a long time, a popular language for representing phenomena and ideas
• Scientists from fields of Biology, Sociology, Physics, etc. have used these models to represent phenomena, results, and ideas
• Many decades (or centuries) of equations available in these fields
• E.g., sequence equations [example equation garbled in the extracted slide]

Translation is Systematic and Rigorous
• Derive Protocol from Mathematical model (equation)
• Translation is not hand-wavy

Derived Protocol is easy to understand
• Rigorous analysis and provable properties of derived protocol
• Generality of translation
• Amenable to augmentation with topology-awareness, etc.

6

Story of this Paper

• Consider a popular class of mathematical models
  – E.g., sequence equations, e.g., $x_{t+1} = r \cdot x_t$

• Develop techniques for translating any mathematical equation belonging to this class into a distributed protocol
  – Key idea: Emergent behavior of the protocol across the distributed system = the mathematical equation
    • We are not simulating the mathematical equation at each process

• Challenge: going from global (equation) → local (protocol)

• Use these techniques to design adaptive protocols for P2P computing

7

Roadmap

I. Related Work and System Model

II. Translation of Sequence Equations into Sequence Protocols

III. Adaptive Protocols for P2P Computing
  – HoneyAdapt system for Grid computing

8

Related Work

• Simulation of mathematical equation at each process [Uresin90]
  – Instead, our focus is on running a protocol that obtains the equation as emergent behavior

• Nature-inspired research, e.g., [Mute, AntNet], etc., and population protocols [Merritt00, Angluin04]
  – But not derived from mathematical models

• [Gupta04] considered translation of continuous differential equations into equivalent distributed protocols
  – Current SASO 07 paper considers translation of sequence equations. Main differences from [Gupta04]:
    – Sequence equations are discrete and not continuous
    – Require completely different translation techniques
    – Adaptive and phase-change behavior more pronounced

9

System Model

• Static group of N non-faulty processes (N large)
  – Can be relaxed for most sequence protocols

• Reliable unicast communication
  – TCP
  – Sequence protocols resilient to message losses

• Coarse-grained time synchronization (O(minutes))
  – allows processes to move synchronously
  – allows notion of “rounds”
  – Provided by NTP, TIME, or DAYTIME (e.g., NIST servers)
  – Many sequence protocols have asynchronous variants

• Any process can randomly sample another process
  – Use CYCLON [Voulgaris05], Peer sampling service [Jelasity04], etc.

10

Creating Sequence Protocols

• Canonical Sequence Equation: $x_{t+1} = f(x_t, x_{t-1}, x_{t-2}, \ldots, x_{t-k})$
  – $x$ is a variable in [0,1]
  – $x_t$ is its value at time t
  – k is constant

[All our discussion is extendible to multi-variable sequence equations]

• Challenge: global → local
  – Assign each process p a binary state variable Xp, representing whether it is in state X or not. Xp=1 means the process is in state X. Xp=0 means the process is not in state X.
  – Let x = fraction of processes (system-wide) with Xp=1. So, $x \in [0,1]$
  – Derive a distributed protocol so that the time-variation of x is predicted by the sequence equation. That is:

Goal: $x = x_{t+1}$

11

Case Study I (Constant Term)

$x_{t+1} = r,\ r \in [0,1]$

Each round at process p:
    Flip a coin with heads probability r
    if heads
        Xp=1
    else
        Xp=0

=> Value of x, the fraction of processes in state X, is predicted by: $x = r$

Goal: $x = x_{t+1}$
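The coin-flip protocol of Case Study I can be checked with a small simulation. The sketch below is only an illustration: the function names, the group size of 10,000 and the value r = 0.3 are assumptions made here, not values from the slides.

```python
import random

def constant_term_round(n, r, rng):
    """One round of the constant-term protocol over n processes."""
    # Each process p flips a coin with heads probability r:
    # heads sets Xp = 1, tails sets Xp = 0.
    return [1 if rng.random() < r else 0 for _ in range(n)]

def fraction_in_state(states):
    """x = fraction of processes (system-wide) with Xp = 1."""
    return sum(states) / len(states)

rng = random.Random(42)  # fixed seed, chosen only for repeatability
states = constant_term_round(n=10_000, r=0.3, rng=rng)
x = fraction_in_state(states)
print(f"x = {x:.3f} (target r = 0.3)")
```

No process computes x itself; the value r only appears as the system-wide fraction, which is the emergent behavior the slides aim for.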

12

Case Study II (Linear Term)

Multiplicative Protocol: $x_{t+1} = r \cdot x_t,\ r \ge 0$

Each round at process p:
    remember last round’s Xp value

    // Token Generation
    if last round’s Xp was 1
        generate an expected r tokens
        relay each token to a random process

    // Token Relay
    hold at most one token at any time
    if receive any additional tokens
        relay them to a random process

    // Token Apply
    at end of round
        if have > 0 tokens
            set Xp=1
        else
            Xp=0

Expected number of tokens generated is $r \cdot x_t \cdot N$

Value of x, the fraction of processes in state X, is predicted by: $x_{t+1} = r \cdot x_t$

This protocol can be extended to:
  – Arbitrary memory (k in sequence equation)
  – Multi-variable equations

Goal: $x = x_{t+1}$
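A rough simulation of the token protocol of Case Study II is sketched below. Several details are assumptions made for illustration: "an expected r tokens" is realized as floor(r) tokens plus one more with probability equal to the fractional part of r, the token count is capped at N because x cannot exceed 1, and the names and parameter values are hypothetical.

```python
import random

def expected_count(r, rng):
    """Random integer with mean r (assumed way to 'generate an expected r tokens')."""
    base = int(r)
    return base + (1 if rng.random() < r - base else 0)

def multiplicative_round(states, r, rng):
    """One round: token generation, relay, and apply."""
    n = len(states)
    # Token generation: only processes whose Xp was 1 last round generate tokens.
    tokens = sum(expected_count(r, rng) for s in states if s == 1)
    tokens = min(tokens, n)  # the fraction x cannot exceed 1
    # Token relay: a process holds at most one token; an extra token is
    # relayed to another random process until it finds a free holder.
    holds_token = [False] * n
    for _ in range(tokens):
        q = rng.randrange(n)
        while holds_token[q]:
            q = rng.randrange(n)
        holds_token[q] = True
    # Token apply: at end of round, Xp = 1 iff p holds a token.
    return [1 if h else 0 for h in holds_token]

rng = random.Random(7)
n = 5000
states = [1] * 1000 + [0] * 4000          # x_t = 0.2
new_states = multiplicative_round(states, r=1.5, rng=rng)
x_next = sum(new_states) / n
print(f"x_next = {x_next:.3f} (r * x_t = 0.3)")
```

Because every token ends up at a distinct process, the number of processes in state X after the round equals the number of tokens generated, which is about r · x_t · N.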

13

Stepping Back – General Methodology

For the sequence equation: $x_{t+1} = f(x_t, x_{t-1}, x_{t-2}, \ldots, x_{t-k})$

  – Take each term on the right hand side
    • Term is minimal unit separated by + and – signs
  – Translate each term according to appropriate case studies (Term Translation = Case Study)
  – Generate positive tokens for + terms, and negative tokens for – terms
  – A positive token destroys a negative token
  – Relay and Apply tokens as usual

Theorem: If for each term T, number of tokens generated is $T \times N$, then $x = x_{t+1}$
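The positive/negative token idea can be sketched for a hypothetical two-term equation x_{t+1} = c - r·x_t, which is not an example from the slides. The sketch makes simplifying assumptions: each process emits one positive token with probability c for the constant term, each process in state X emits one negative token with probability r for the linear term, and cancellation is modelled globally rather than at the process where two tokens meet.

```python
import random

def emit(count, prob, rng):
    """Total tokens when `count` processes each emit one token with probability prob."""
    return sum(1 for _ in range(count) if rng.random() < prob)

def signed_round(states, c, r, rng):
    """One round of the assumed protocol for x_{t+1} = c - r * x_t."""
    n = len(states)
    positive = emit(n, c, rng)             # term +c: about c * N positive tokens
    negative = emit(sum(states), r, rng)   # term -r*x_t: about r * x_t * N negative tokens
    net = max(positive - negative, 0)      # each positive token destroys one negative token
    # Surviving tokens end up at distinct processes (hold at most one token).
    holders = set(rng.sample(range(n), net))
    return [1 if p in holders else 0 for p in range(n)]

rng = random.Random(1)
n = 5000
states = [1] * 4000 + [0] * 1000           # x_0 = 0.8
for _ in range(12):
    states = signed_round(states, c=0.5, r=0.5, rng=rng)
x_final = sum(states) / n
print(f"x after 12 rounds = {x_final:.3f} (fixed point of x = 0.5 - 0.5x is 1/3)")
```

Each term contributes roughly T × N tokens, as the theorem requires, so the fraction x follows the equation and settles near its fixed point.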

14

What other Terms can we Translate?

I. Polynomial Terms:
  1. Constant – Case Study I: $x_{t+1} = r,\ r \in [0,1]$
  2. Linear – Case Study II: $x_{t+1} = r \cdot x_t,\ r \ge 0$
  3. (multi-variable equations) Multiplicative Polynomial – see paper

II. Non-polynomial Terms:
  1. Division Terms – see paper
  2. Fractional Terms – next

III. Recursive Translation – next

15

Translation of Fractional Term

$x_{t+1} = T = \frac{\sum_{j=1}^{L} b_j \cdot a_j \cdot x_j}{\sum_{j=1}^{L} a_j \cdot x_j}$, $a_j$'s positive, $b_j$'s binary

Each round at process p:
    remember last k round’s Xp values
    divide round into two equi-long subrounds

    // Subround 1
    // Token Generation
    for each j = 1 to L
        if Xj(p)=1
            generate aj tokens tagged with j
    // Token Relay and Apply
    multicast tokens to all other processes

    // Subround 2
    // Token Generation
    select random token among those received
    suppose tag is j’
    if (bj’=1)
        generate a token for subround 2
    // Token Relay and Apply
    apply as usual

Round = Subround 1 + Subround 2

Subround 1: E[Number of tokens generated at p] is $\sum_{j=1}^{L} a_j \cdot x_j$

Subround 2: E[Number of tokens generated at p] is $\frac{\sum_{j=1}^{L} b_j \cdot a_j \cdot x_j}{\sum_{j=1}^{L} a_j \cdot x_j}$

Value of x, the fraction of processes in state X, is predicted by: $x_{t+1} = T$

Goal: $x = T$
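The two-subround protocol for a fractional term can be approximated as follows. This is a sketch under stated assumptions: the a_j are positive integers, the multicast in subround 1 is modelled by letting every process see the global per-tag token counts, and the function name, weights and fractions are all hypothetical.

```python
import random

def fractional_round(var_states, a, b, rng):
    """var_states[j][p] is X_j(p); a[j] are positive ints; b[j] is 0 or 1."""
    n = len(var_states[0])
    # Subround 1: each process with X_j(p) = 1 generates a[j] tokens tagged j
    # and multicasts them, so every process receives the same tag counts.
    tag_counts = [a[j] * sum(var_states[j]) for j in range(len(a))]
    if sum(tag_counts) == 0:
        return [0] * n  # no tokens were generated anywhere
    tags = list(range(len(a)))
    # Subround 2: each process selects one received token uniformly at random
    # and generates a subround-2 token only if the selected tag j' has b[j'] = 1.
    picks = rng.choices(tags, weights=tag_counts, k=n)
    generated = sum(1 for j in picks if b[j] == 1)
    # Apply as usual: tokens end up at distinct processes, which set Xp = 1.
    holders = set(rng.sample(range(n), generated))
    return [1 if p in holders else 0 for p in range(n)]

rng = random.Random(3)
n = 4000
fractions = [0.5, 0.25, 0.25]               # x_1, x_2, x_3
var_states = [[1] * int(f * n) + [0] * (n - int(f * n)) for f in fractions]
a = [1, 2, 3]
b = [1, 0, 1]
new_states = fractional_round(var_states, a, b, rng)
x_next = sum(new_states) / n
target = (1 * 0.5 + 3 * 0.25) / (1 * 0.5 + 2 * 0.25 + 3 * 0.25)
print(f"x_next = {x_next:.3f}, T = {target:.3f}")
```

The probability that a process generates a token in subround 2 is the share of received tokens whose tag has b_j = 1, which is exactly the ratio T.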

16

Recursive Translation

• Any term that consists of sub-terms that are translatable can itself be translated
  – Split a round into two subrounds
  – In subround 1, run the derived protocols for the subterms
  – In subround 2, run the derived protocol for the overall term
  – Subround division is also recursive

EXAMPLE: $x_{t+1} = T = \frac{x_t^2}{x_{t-1}}$
  – STEP 1 (Subround 1): $x' = x_t^2$
  – STEP 2 (Subround 2): $x_{t+1} = \frac{x'}{x_{t-1}}$
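Recursive translation can be illustrated with a deliberately simple composite term, T = r · (s · x_t), which is a hypothetical example chosen here because both the subterm and the overall term are linear and so only the token protocol of Case Study II is needed. The names and parameter values are assumptions, and token relay is abstracted as placing tokens at distinct random processes.

```python
import random

def linear_round(states, r, rng):
    """Token protocol for a linear term: new fraction is about r * (old fraction)."""
    n = len(states)
    tokens = 0
    for s in states:
        if s == 1:
            base = int(r)
            # Generate an expected r tokens: floor(r), plus one with prob. frac(r).
            tokens += base + (1 if rng.random() < r - base else 0)
    tokens = min(tokens, n)
    # Relay is abstracted: tokens end up at distinct random processes.
    holders = set(rng.sample(range(n), tokens))
    return [1 if p in holders else 0 for p in range(n)]

def recursive_round(states, r, s, rng):
    """One round split into two subrounds for T = r * (s * x_t)."""
    inner = linear_round(states, s, rng)   # subround 1: subterm x' = s * x_t
    return linear_round(inner, r, rng)     # subround 2: overall term r * x'

rng = random.Random(11)
n = 5000
states = [1] * 2000 + [0] * 3000           # x_t = 0.4
new_states = recursive_round(states, r=0.5, s=1.5, rng=rng)
x_next = sum(new_states) / n
print(f"x_next = {x_next:.3f} (r * s * x_t = 0.3)")
```

The output of subround 1 is itself a set of process states, so the second protocol can consume it unchanged; deeper nesting would split a subround again in the same way.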

17

Roadmap

I. Related Work and System Model
II. Translation of Sequence Equations into Sequence Protocols
III. Adaptive Protocols for P2P Computing
  • Multiplicative Protocol (based on $x_{t+1} = r \cdot x_t$)
    – For detecting global thresholds in a distributed fashion
    – see paper for details
  • HoneyAdapt system for Grid computing
    – next

18

HoneyAdapt - Motivation

Grid Server (master)
- Partitions large data set into chunks
- Serves out chunks on-demand to clients
- Collates results in the end
- E.g., parallel sorting problem, graphics rendering, etc.

Grid Clients (workers)
- Connected in an overlay

Typical Client
1. Fetch next data chunk
2. Process data chunk using one of algorithms A,B,C,D,…L
3. Send back results to server

Challenge: how do clients choose “best” algorithm (A,B,…L) adaptively at run time in a black-box manner?
e.g., for parallel sorting problem, A=quicksort, B=insertion sort,…

19

HoneyAdapt – Inspiration

Nectar Source A, Nectar Source B

Honeybees (Apis mellifera)
- need to decide which is the “better” nectar source
- in a distributed fashion

20

HoneyAdapt – Inspiration

Nectar Source A, Nectar Source B

1. (time t) Forage a nectar source

2. With probability (1-pf), use the same nectar source for time (t+1); pf = following probability

3. Execute honeybee dance of 8’s
- Duration of dance proportional to quality of advertised nectar source
- Direction of dance points towards source

4. After dance, if not staying with the same source (so with probability pf), decide next source to forage by picking a dancing bee at random

21

HoneyAdapt – Mathematical Model

Source A = Algorithm A, Source B = Algorithm B

$a_i(t+1) = (1 - p_f) \cdot a_i(t) + p_f \cdot \sum_{j=A}^{L} a_j(t) \cdot \frac{\mathit{sq}_i \cdot a_i(t)}{\sum_{j=A}^{L} \mathit{sq}_j \cdot a_j(t)}$

  – $a_i(t+1)$ = fraction of nodes (bees/clients) foraging source (algorithm) i at time (t+1)
  – $p_f$ = following probability
  – $\mathit{sq}_j$ = quality of source (algorithm) j
  – First term on the right: Linear Term
  – Second term on the right: Fractional Term (+Recursive)

(See paper for general model. From [Seeley96].)

Bees converge quickly towards better source (proof in paper)
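The convergence claim can be explored numerically by iterating the model. The sketch assumes the model has the form a_i(t+1) = (1 - pf)·a_i(t) + pf·(Σ_j a_j(t))·q_i·a_i(t) / Σ_j q_j·a_j(t), where q_j stands for the quality of source j; this form is a reading of the slide and the exact general model is in the paper. The quality values and starting fractions below are made up for illustration.

```python
def honey_step(a, quality, pf):
    """One step of the assumed model; a[i] is the fraction foraging source i."""
    total = sum(a)                                        # equals 1 if every node forages
    weighted = sum(q * ai for q, ai in zip(quality, a))   # sum_j q_j * a_j(t)
    return [(1 - pf) * ai + pf * total * (q * ai) / weighted
            for q, ai in zip(quality, a)]

a = [0.5, 0.5]            # sources A and B start with equal shares
quality = [1.0, 2.0]      # source B is assumed to be twice as good
for _ in range(30):
    a = honey_step(a, quality, pf=0.9)
print([round(ai, 4) for ai in a])
```

Under these assumptions the total foraging fraction is preserved at every step while the share of the higher-quality source grows until it dominates.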

22

(Recall) HoneyAdapt - Motivation

Grid Server (master)
- Partitions large data set into chunks
- Serves out chunks on-demand to clients
- Collates results in the end
- E.g., parallel sorting problem, graphics rendering, etc.

Grid Clients (workers)
- Connected in an overlay

Typical Client
1. Fetch next data chunk
2. Process data chunk using one of algorithms A,B,C,D,…L
3. Send back results to server

23

HoneyAdapt – Model and Derived Protocol

$a_i(t+1) = (1 - p_f) \cdot a_i(t) + p_f \cdot \sum_{j=A}^{L} a_j(t) \cdot \frac{\mathit{sq}_i \cdot a_i(t)}{\sum_{j=A}^{L} \mathit{sq}_j \cdot a_j(t)}$

  – $a_i(t+1)$ = fraction of nodes (bees/clients) foraging source (algorithm) i at time (t+1)
  – $p_f$ = following probability
  – $\mathit{sq}_j$ = quality of algorithm j

2A. Choose algorithm i (initially, random) for this chunk
2B. With probability (1-pf), use same algorithm for next chunk
2C. Dance: create a number of advertisement messages for algorithm i. The number of advertisement messages is proportional to the quality of sorting (inversely proportional to the running time of the chunk with algorithm i)
2D. Send advertisement messages to immediate neighbors in overlay
2E. If follow (prob. pf), decide algorithm i for next chunk by picking an advertisement message at random

Algorithm’s emergent behavior = Sequence equation (proof in paper)
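The client-side steps can be mimicked with a toy simulation. The sketch assumes a complete overlay, so every client sees every advertisement, and it represents "number of advertisement messages proportional to quality" by a real-valued weight per algorithm. The function name, the three quality values and the client count are illustrative assumptions, not the setup used in the paper.

```python
import random

def honeyadapt_round(choices, quality, pf, rng):
    """One chunk: every client dances, then either follows an ad or keeps its algorithm."""
    n_algos = len(quality)
    # Dance: each client advertises its current algorithm with weight
    # proportional to that algorithm's quality.
    ad_weight = [0.0] * n_algos
    for c in choices:
        ad_weight[c] += quality[c]
    algos = list(range(n_algos))
    next_choices = []
    for c in choices:
        if rng.random() < pf:
            # Follow: pick one advertisement at random (weighted by ad volume).
            next_choices.append(rng.choices(algos, weights=ad_weight, k=1)[0])
        else:
            # With probability (1 - pf), use the same algorithm for the next chunk.
            next_choices.append(c)
    return next_choices

rng = random.Random(5)
n_clients = 1000
quality = [1.0, 2.0, 4.0]                  # algorithm 2 is assumed to be the fastest
choices = [rng.randrange(len(quality)) for _ in range(n_clients)]
for _ in range(25):
    choices = honeyadapt_round(choices, quality, pf=0.9, rng=rng)
share_best = choices.count(2) / n_clients
print(f"share of clients using the best algorithm: {share_best:.3f}")
```

Clients never compare algorithms directly; the better algorithm simply produces more advertisements, so followers pick it more often, which is the black-box adaptation the slides describe.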

24

HoneyAdapt - Simulation

Scalability up to and beyond 4000 nodes:
- Running time: Only 85% worse than optimal
- Bandwidth: 0.04 messages/node/chunk

Adaptivity: HoneyAdapt takes only 2x time compared to optimal, and beats non-adaptive strategies

Setup:
* Random graph overlay of ~1000 clients
* Dataset consists of 100K chunks of 10 different types
* Each type has 10 algorithms assigned randomly in terms of quality
* “Cluster” = consecutive chunks of same type (with same “best” algo.)
* pf=0.9

25

HoneySort – Deployment

Setup:
* Up to 30 COTS PC clients (Linux)
* Complete graph overlay with TCP links
* Clients choose between quicksort and insertion sort
* Sort a database of 1 million 8 B entries
* Server pre-partitions data into 333 chunks

Results:
- Sorted arrays: HoneySort as good as insertion sort
- Randomized arrays: HoneySort as good as quicksort
- Part-sorted, part-randomized arrays: HoneySort beats both quicksort and insertion sort!

26

Summary


Social Worlds, etc.



Mathematical Models

This paper:
• Model = Sequence Equations
• Translation techniques for polynomial/non-polynomial terms
• Derived Sequence Protocol so its emergent behavior = Sequence equation
• HoneyAdapt for adaptive Grid computing
• HoneySort beats traditional parallel sorting algorithms

Distributed Protocols Research Group (DPRG): http://kepler.cs.uiuc.edu

27

Backup Slides

28

Translation of Division Term

$T = \frac{r}{x_{t-k}}$, k is an integer (usually a sub-term in a larger term)

Each round at process p:
    remember last k round’s Xp values
    divide round into two equi-long subrounds

    // Token Generation [subround 1]
    integer i=0
    do
        select a random process q
        query the value of Xq k rounds ago
        i=i+1
    until (Xq=1)
    generate i token messages

    // Token Relay [subround 2]
    relay token to random process
    hold at most one token at any time
    if receive any additional tokens
        relay it to a random process

    // Token Apply [subround 2]
    at end of round
        use tokens for next subround

Round = Subround 1 + Subround 2

E[Number of tokens generated at p] is $\frac{1}{x_{t-k}}$

Total number of tokens generated is $N \cdot \frac{1}{x_{t-k}}$

Value of x, the fraction of processes in state X, is predicted by T.

This protocol can be extended to:
  – Arbitrary memory (k in sequence equation)
  – Multi-variable equations

Goal: $x = T$
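The token-generation loop of the division term is a geometric sampling experiment: the number of queries until a process with Xq = 1 is found has mean 1/x. The sketch below checks this for a single process; the function name, the group size and the fraction 0.25 are assumptions made for illustration.

```python
import random

def division_tokens(past_states, rng):
    """Query random processes until one with Xq = 1 is found; return the query count i."""
    if 1 not in past_states:
        raise ValueError("no process was in state X, so the loop would never end")
    n = len(past_states)
    i = 0
    while True:
        q = rng.randrange(n)        # select a random process q
        i += 1                      # i = i + 1
        if past_states[q] == 1:     # until (Xq = 1)
            return i                # generate i token messages

rng = random.Random(9)
past_states = [1] * 250 + [0] * 750   # x_{t-k} = 0.25
trials = 20_000
mean_tokens = sum(division_tokens(past_states, rng) for _ in range(trials)) / trials
print(f"mean tokens per process = {mean_tokens:.2f} (1 / x = 4.0)")
```

Summed over N processes this gives about N / x_{t-k} tokens, matching the totals stated on the slide.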

29

Big Picture

• Self-adaptive and self-organizing distributed protocols

• Protocol design

• Biological, Physical, Social phenomena as a source of ideas for protocol design

Need: Systematic Translation of phenomena into distributed protocols

• Use mathematical models as a conduit
