Naming
Sandhya Turaga, CS-249
Fall 2005
Outline of Chapter
Naming in general
Characteristics of distributed naming: bindings, consistency, scaling
Approaches to designing a global name service:
DEC Global Name Service (by Lampson)
Stanford design (by David R. Cheriton and Timothy P. Mann)
Naming in general
Names facilitate sharing; names are used to communicate about objects.
Unique identifiers: never reused, always bound to the same object, and provide location independence.
Pure names: nothing but a bit pattern used as an identifier; a pure name commits one to nothing. E.g., 123065#$21
Impure names: carry commitments of one sort or another. E.g., src/dec/ibm
Characteristics
Bindings: machine-to-address bindings and service-to-machine bindings.
Consistency: how one propagates updates among replicas of a naming database (explained further in the DEC example).
Characteristics (contd.)
Scaling: being able to manage an indefinite number of machines, each providing some part of a name lookup service of indefinite size, managed by a great number of more or less autonomous administrators, without major changes to the existing environment.
The implementation should manage the name space, rather than merely avoid scale disasters.
Approaches
The DEC Global Naming Service has two levels:
Client level
Administrational level
The Stanford design has three levels:
Global level
Administrational level
Managerial level
DEC approach: client level
The file structure is seen as a UNIX-like tree of directories (Figure 1).
Each directory is identified by a unique Directory Identifier (DI).
Each arc of the tree is a directory reference (DR), which holds the DI of the child directory.
A directory's path relative to the root is its Full Name (FN).
In Figure 1, finance/personal is a full name relative to the root src.
Figure 1: the file system as seen at the DEC client level. The root src (DI = #100) has children computerSci (DI = #200, reached through DR = #200) and finance (DI = #300, reached through DR = #300); below them sit windows, mainframe, and personal.
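The client-level structure above can be sketched in a few lines of Python. This is an illustrative model only, not the DEC implementation: directories are objects holding a DI, arcs act as DRs pointing at child directories, and a full name is resolved one arc at a time. The arc names and DIs come from Figure 1.

```python
class Directory:
    def __init__(self, di):
        self.di = di      # unique Directory Identifier (DI)
        self.arcs = {}    # arc name -> child Directory; each arc plays the role of a DR

    def add_child(self, name, child):
        self.arcs[name] = child

def lookup(root, full_name):
    """Resolve a full name (path relative to the root) to a DI."""
    node = root
    for arc in full_name.split("/"):
        node = node.arcs[arc]   # follow one DR per path component
    return node.di

# The tree of Figure 1:
src = Directory(100)
computer_sci = Directory(200)
finance = Directory(300)
src.add_child("computerSci", computer_sci)
src.add_child("finance", finance)

print(lookup(src, "finance"))   # -> 300
```

A real client would also carry the timestamps and present/absent marks described on the next slide; they are omitted here to keep the name-resolution step visible.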
DEC client level (contd.)
In this tree structure each node carries a timestamp and a mark that is either "present" or "absent".
Update operations determine whether a node is present or absent; an absent node is drawn as a struck-through circle in Figure 2.
Timestamps are used to allow the trees to be updated concurrently.
Figure 2
DEC: administrational level
The administrator allocates resources to the implementation of the service and reconfigures it to deal with failures.
This level sees a set of directory copies (DCs), each stored on a different server (S) machine.
Figure 3 shows dec/src stored on four servers.
Figure 3: DEC/SRC stored on four servers. Servers 1–3 show lastSweep = 10 and server 4 shows lastSweep = 12; the copies hold partially differing entries such as user1 10, user2 12, user3 12, and user4 11.
DEC administrational level (contd.)
A lookup can try one or more of the servers to find a copy from which to read.
The value shown on each server (10 or 12) is its last sweep time (lastSweep).
Each copy also has a nextTS, the next timestamp it will assign to a new update.
Updates are spread by the sweep operation, which carries a timestamp sweepTS.
An update originates at one DC.
DEC administrational level (contd.)
The sweep operation sets each DC's nextTS to sweepTS and then reads and collects the updates from every DC.
Once these updates are written back, each copy's lastSweep is set to sweepTS.
To speed up propagation between sweeps, messages carrying updates are also sent among the DCs (Figure 4-a).
After a successful sweep operation at time 14, Figure 3 would look like Figure 4-b.
Figure 4-a: the same four copies as in Figure 3, with update messages (user2 12, user4 11) in flight among the DCs.
Figure 4-b: after the sweep at time 14, all four servers show lastSweep = 14 and hold identical entries (user1 10, user2 11, user3 12, user4 14).
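The sweep described above can be sketched as follows. This is a hedged model, not Lampson's implementation: each DC is a dict with lastSweep, nextTS, and an entry map; the sweep visits every copy, merges entries keeping the newest timestamp per name, writes the merged set back, and advances lastSweep to sweepTS. The per-server entry placement is illustrative.

```python
def sweep(copies, sweep_ts):
    """copies: list of dicts {'lastSweep': int, 'nextTS': int, 'entries': {name: ts}}."""
    merged = {}
    for dc in copies:
        dc['nextTS'] = sweep_ts                # no later update may get an earlier timestamp
        for name, ts in dc['entries'].items():
            if name not in merged or ts > merged[name]:
                merged[name] = ts              # keep the newest timestamp for each entry
    for dc in copies:
        dc['entries'] = dict(merged)           # write the merged updates back to every copy
        dc['lastSweep'] = sweep_ts
    return copies

# Four copies in a Figure-3-like state, before the sweep:
dcs = [
    {'lastSweep': 10, 'nextTS': 11, 'entries': {'user1': 10, 'user2': 12}},
    {'lastSweep': 10, 'nextTS': 11, 'entries': {'user1': 10, 'user3': 12}},
    {'lastSweep': 10, 'nextTS': 11, 'entries': {'user1': 10, 'user4': 11}},
    {'lastSweep': 12, 'nextTS': 13, 'entries': {'user1': 10, 'user2': 12}},
]
sweep(dcs, 14)
# After the sweep every copy agrees and shows lastSweep = 14.
```

A real sweep would travel the ring of DCs described on the next slide rather than iterate over a local list.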
DEC administrational level (contd.)
To enumerate the set of DCs reliably, all the DCs are linked in a ring, each with an arrow pointing to the next DC (Figure 5).
A sweep starts at one DC, follows the arrows, and completes the ring.
If a server fails, the ring must be reformed; updates may be lost in this process, and a new sweep operation must be started.
Figure 5: the four servers holding DEC/SRC linked in a ring, with the sweep (at time 14) following the arrows from copy to copy.
DEC name space
In a hierarchical naming structure it is important to be able to expand the name space and change its structure without destroying the usefulness of old names.
Each directory in the hierarchy is a context within which names can be generated independently of what is going on in any other directory.
Thus we can have a directory named Finance under two different prefixes, such as src/dec/Finance and src/ibm/Finance. Each directory can have its own administrator, and the administrators do not have to coordinate their actions.
Figure 6: a new root SRC (DI = #999) placed over the formerly separate roots DEC (DI = #311) and IBM (DI = #783), each with its own Finance directory (DI = #222 and #444).
DEC name space
The name space grows by combining existing name services, each with its own root: add a new root and make the existing root nodes its children (Figure 6).
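The growth rule above is simple enough to state in code. This is an illustrative sketch, with made-up arc names and DIs: combining name services just means building one new root directory whose arcs point at the old roots, so every old full name stays valid and merely gains one leading component.

```python
def combine(old_roots, new_root_di):
    """Build a new root whose children are the old roots.
    old_roots: dict of arc name -> old root (a dict with 'di' and 'arcs')."""
    return {'di': new_root_di, 'arcs': dict(old_roots)}

# Two formerly independent name services (DIs are illustrative):
dec = {'di': 311, 'arcs': {}}
ibm = {'di': 783, 'arcs': {}}

root = combine({'dec': dec, 'ibm': ibm}, 999)
# Names relative to the old roots are untouched; a client that learns the new
# root simply prepends 'dec/' or 'ibm/' to reach them.
```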
Figure 7
DEC name space (contd.)
How can a directory be found from its DI?
The root keeps a table of well-known directories, mapping certain DIs to links, which are FNs relative to the root (see Figure 7).
DEC name space (contd.)
Sometimes restructuring (moving a subtree) is required.
For example, if DEC buys IBM, the path to IBM changes as shown in Figure 8.
Old paths such as ansi/ibm no longer work; users must be forwarded from ansi/ibm to ansi/dec/ibm to access IBM.
Figure 8
DEC name space: caching
Caching is a good idea for lookups. Caching works because of:
1) a slow rate of change in the naming database, or
2) tolerating some inaccuracy in cached data.
A slow rate of change can be enforced with an expiration time (TX) on entries in the database, with an exception.
DEC name space (Figure 9)
Figure 9: ANSI (DI = #999) has child DEC (DI = #311), whose entry expires 20 Nov 2006, and grandchild SRC (DI = #783), whose entry expires 30 Nov 2006. The cached entry #999/DEC = #311 is valid until 20 Nov 2006.
DEC name space
For example, in Figure 9 the result of looking up ansi/dec/src is valid until 20 Nov 2006, the earliest expiration time along the path.
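The expiration rule above can be made concrete with a small sketch. The cache layout here (keyed by parent DI and arc name) is an illustrative assumption, not the DEC data structure; the dates follow Figure 9. The point is that a multi-step lookup result is valid only until the minimum TX of the entries used.

```python
from datetime import date

# Hypothetical cache: (parent DI, arc name) -> (child DI, expiration time TX)
cache = {
    (999, 'dec'): (311, date(2006, 11, 20)),   # ANSI/DEC entry
    (311, 'src'): (783, date(2006, 11, 30)),   # DEC/SRC entry
}

def cached_lookup(arcs, root_di=999):
    """Resolve a path through the cache; the result expires with its weakest link."""
    di, valid_until = root_di, date.max
    for arc in arcs:
        di, tx = cache[(di, arc)]
        valid_until = min(valid_until, tx)   # take the earliest TX along the path
    return di, valid_until

di, until = cached_lookup(['dec', 'src'])
print(di, until)   # the ansi/dec/src result is valid until 20 Nov 2006
```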
Stanford naming system design
It has three levels: global level, administrational level, managerial level.
In the lower levels, naming is handled by object managers.
How do the administrational and global levels differ?
Stanford naming system design (contd.) (Figure 10)
Managerial level
Each directory is stored by a single object manager.
Any kind of object can be named using a directory implemented by its object manager, e.g., files.
Each object manager stores the absolute name of the root of each subtree it implements.
Managerial level (contd.)
For example (in Figure 10), the subtrees rooted at the directories %edu/stanford/dsg/bin and %edu/stanford/dsg/lib are both implemented by DSG file server 1, which thus covers all names with the prefixes %edu/stanford/dsg/bin and %edu/stanford/dsg/lib.
Accordingly, file server 1 stores all the files and directories under these two subtrees.
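The notion of "coverage" in the example above amounts to a prefix test. The sketch below is an illustrative model (the server and path names follow the Figure 10 example; the function is not part of the Stanford design's API): a manager covers exactly the names at or below the roots of the subtrees it implements.

```python
# Subtree roots implemented by DSG file server 1 (from the Figure 10 example):
PREFIXES = ['%edu/stanford/dsg/bin', '%edu/stanford/dsg/lib']

def covers(name, prefixes=PREFIXES):
    """True if this manager implements the directory or object named."""
    return any(name == p or name.startswith(p + '/') for p in prefixes)

print(covers('%edu/stanford/dsg/bin/ls'))      # True: under a covered subtree
print(covers('%edu/stanford/dsg/user/mann'))   # False: another manager's subtree
```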
Managerial level (contd.)
An object manager implements all operations on the names it covers.
What are the advantages of integrating naming with object management?
Managerial directories record every name-to-object binding.
Managerial level (contd.)
Requirements for constructing a complete naming service:
Clients should know which manager to send their messages to.
Clients should be able to distinguish a name that is unbound from one that is merely unavailable.
A separate mechanism is needed to implement operations on directories above the managerial level.
Managerial level (contd.)
Manager location can be found using prefix caches and multicasting.
Each client in the naming system maintains a name prefix cache.
Each cache entry associates a name prefix with a directory identifier.
A directory identifier has two fields (m, s): m, the manager identifier, and s, the manager-specific directory identifier.
Managerial level (contd.)
Hit: a cache search that returns the entry for the managerial directory itself.
Near miss: a cache search that does not return the managerial directory's entry but does return an entry for some shorter prefix.
Managerial level (contd.)
If the cache search returns a local administrational entry, the client multicasts a "probe" request to the group of managers specified by the cache entry.
If the cache search returns a directory identifier that specifies a liaison server, the client sends a probe request on the given name to that liaison server.
What is a liaison server?
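The hit / near-miss behavior just described can be sketched as a longest-prefix match over the client's cache. The entry kinds and identifiers below are illustrative assumptions, not the paper's data structures: a 'mgr' entry is a hit, while an 'admin' or 'liaison' entry is a near miss that triggers a probe.

```python
# Client-side name prefix cache: prefix -> (kind, directory identifier (m, s))
cache = {
    '%edu/stanford/dsg/bin': ('mgr', ('fileserver1', 'bin')),    # managerial entry
    '%edu/stanford':         ('admin', ('stanford-admins', 0)),  # administrational entry
    '%edu':                  ('liaison', ('liaison1', 0)),       # via a liaison server
}

def search(name):
    """Return (kind, directory identifier) for the longest cached prefix, or None."""
    best = None
    for prefix, entry in cache.items():
        if (name == prefix or name.startswith(prefix + '/')) and \
           (best is None or len(prefix) > len(best[0])):
            best = (prefix, entry)
    return best[1] if best else None

kind, ident = search('%edu/stanford/dsg/bin/ls')
print(kind)   # 'mgr': a hit. An 'admin' or 'liaison' result would be a near
              # miss, answered by probing the group the entry identifies.
```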
Managerial level (contd.)
What if the liaison server crashes in between?
What happens when the liaison server receives the probe?
Cache consistency is maintained by discarding stale cache entries.
What is a stale cache entry?
Managerial level (contd.)
Advantages of the name-prefix caching mechanism:
1) High cache-hit ratio and good performance.
2) A near miss reduces the amount of work required of the shared naming system.
3) The longer the prefix returned by a near miss, the more work is saved.
4) The correctness of cached information is checked automatically.
Administrational level
Administrational directories are implemented using object managers and administrational directory managers.
The administrational directory manager covers the unbound names; bound names are covered by the object managers.
Administrational level (contd.)
The administrational directory manager holds a list of the bound names, but it is not considered to cover them.
Figure 11 illustrates how information is distributed in the directory %edu/stanford/dsg/user of Figure 10.
Figure 11
Figure 10 (same as in slide 30)
Administrational level (contd.)
Managers that cooperate in implementing an administrational directory are called its participants, and they form a participant group.
Each participant responds to probes on the names it covers.
Every name is covered by at least one participant.
Administrational level (contd.)
The directory listing is maintained by the directory's manager, and clients obtain the list of names from it.
What if directory managers fail?
The directory's manager can coordinate access to the directory by clients located outside the local administration.
Administrational level (contd.)
A remote administrational directory can be accessed through the local liaison server and the global directory system.
An advantage of this technique: even if the directory manager is down, the corresponding file server can still respond to name-mapping requests (see Figure 12).
Figure 12: a root directory at the global level; child directories (child1, child2) at the administrational level; and user directories (user1–user7) and nodes (Node1–Node3) at the managerial level, spread across file servers 1–3.
Global level
This level is similar to the global directory system of the DEC implementation proposed by Lampson.
Interfacing to the global directory system:
Liaison servers act as intermediaries for all client operations at the global level.
Caching performed by liaison servers improves the response time of global-level queries.
Global level (contd.)
Caching also reduces the load on the global directory system.
What happens if the global directory system becomes unavailable within some administration?
Performance of name-prefix caching
Load per operation:
1) A packet event is the transmission or reception of a network packet.
2) A multicast with g recipients costs g + 1 packet events.
3) The average number of packet events required to map a name is Cmap = 4h + (r + m + 7)(1 - h).
Performance of name-prefix caching (contd.)
h: cache-hit ratio.
r: number of retransmissions required to determine that a host is down.
m: number of object managers in the system.
Derivation of the above equation: when there is a cache hit, name mapping costs 4 packet events; when there is a cache miss, it costs r + m + 7 packet events.
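The cost formula above is easy to explore numerically. The sketch below simply codes Cmap = 4h + (r + m + 7)(1 - h); the parameter values chosen are illustrative, not from the paper.

```python
def c_map(h, r, m):
    """Average packet events per name mapping: hits cost 4, misses cost r + m + 7."""
    return 4 * h + (r + m + 7) * (1 - h)

# With a 99% hit ratio, even a system with many managers stays cheap on average:
print(c_map(0.99, r=2, m=100))   # about 5.05 packet events per mapping
```

This shows why a high hit ratio matters: the expensive multicast term (r + m + 7) grows with system size m, but it is paid only on the (1 - h) fraction of requests that miss.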
Performance of name-prefix caching (contd.)
When does Cmap reach its optimum value?
What is the r + m + 7 term?
Is there a statistical model for computing the cache-hit ratio? (See next slide.)
Cache-performance model
Cache-performance model
Input parameters:
1) The number of name-mapping requests issued per unit time.
2) The average length of time a name-cache entry is valid.
3) The average length of time a client cache remains in use before it is discarded.
4) The locality of reference.
Cache-performance model
Average steady-state hit ratio over all clients: h = 1 − Σ_j Σ_k β / (β_{j,k} + v_k)
β_{j,k}: average interarrival time of requests generated by client j that reference a name in managerial subtree k.
v_k: validity time of a cache entry for managerial subtree k.
β: global average interarrival time of name-mapping requests.
Cache-performance model
The steady-state hit ratio for a single (client, subtree) pair is: h_{j,k} = 1 − β_{j,k} / (β_{j,k} + v_k)
Miss ratio = total number of misses / total number of requests; hit ratio = 1 − miss ratio.
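The global and per-pair formulas above are consistent: the global hit ratio is the request-rate-weighted average of the per-pair ratios, since the weight of pair (j, k) is its share of the total request rate, β / β_{j,k}. The sketch below checks this numerically with made-up interarrival and validity times (not data from the paper).

```python
# Illustrative per-pair average interarrival times beta_{j,k} and validity times v_k:
betas = {('c1', 's1'): 10.0, ('c1', 's2'): 50.0, ('c2', 's1'): 20.0}
validity = {'s1': 1000.0, 's2': 3000.0}

# Global average interarrival time: the total rate is the sum of per-pair rates.
beta = 1.0 / sum(1.0 / b for b in betas.values())

# Global formula: h = 1 - sum_{j,k} beta / (beta_{j,k} + v_k)
h_global = 1.0 - sum(beta / (b + validity[k]) for (j, k), b in betas.items())

# Rate-weighted average of the per-pair ratios h_{j,k} = 1 - beta_{j,k}/(beta_{j,k} + v_k):
h_weighted = sum((beta / b) * (1.0 - b / (b + validity[k]))
                 for (j, k), b in betas.items())

print(round(h_global, 4), round(h_weighted, 4))  # the two computations agree
```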
Cache performance model (contd.) (Figure 13)
Cache performance model (contd.)
Is it reasonable to expect h in the range 0.9900–0.9998? The following graph shows h for each (client, subtree) pair as v_k varies from 100 to 5000 (on the x-axis).
The hit ratio is 0.9901 at v_k = 100 and 0.9998 at v_k = 5000.
Cache performance model (contd.) (Figure 14)
Cache performance model (contd.)
What are startup misses? Will startup misses affect the hit ratio? That depends on the validity time (see Figure 15).
Hit ratio as a function of time (Figure 15)
Figure 15: Poisson-model hit ratio versus time. The hit ratio h (y-axis, 0 to 0.9) climbs toward its steady-state value as time (x-axis, 0 to 15) increases.
Conclusions
How large a system can be built with this design? That depends on the load per operation and the load on the managers (see Figure 16).
Load per server as a function of system size - Figure 16
Questions ?