Centrifuge: Integrated Lease Management and Partitioning for Cloud Services
Atul Adya†, John Dunagan*, Alec Wolman*
†Google, *Microsoft Research
7th USENIX conference on Networked systems design and implementation, 2010 (NSDI’10)
Seminar Presentation for CSE 708 by Ruchika Mehresh
Department of Computer Science and Engineering
22nd February, 2011
Outline
• Problem statement
• Requirements
• Architecture
• Structure
– Manager service
– Lookup library
– Owner library
• Other issues
• API
• Performance evaluation
• Conclusion
Problem statement
Managing and operating on a large amount of in-memory state while maintaining the responsiveness of a cloud service.
Microsoft’s Live Mesh service defined the requirements and constraints for solving this problem.
Microsoft’s Live Mesh service
• Large-scale commercial cloud service
• In operation since April 2008
• As of March 2009, Centrifuge was actively used by 5 Live Mesh component services (see the list of Live Mesh services at the end)
Terminology
• Lease
– Technique to ensure that only one server at a time is responsible for a given piece of state.
• Partitioning
– Process of assigning each piece of state to an in-memory server; requests are then sent to the appropriate server.
Sample Application
Cloud-based rendezvous service
[Figure: front-end web servers forward requests to back-end application servers]
• Request (D1): My IP address is x
• Request (D2): What is D1’s IP address?
• Response (D2): x
Requirements
• Large number of small objects
• Every object can be handled by one server
• Memory is expensive – no replication
• Effective partitioning
• Freedom from fine-grained leasing
• Load balancing
• Easy addition, removal or crash handling of servers.
Architecture
• Lookup library (front-end servers)
– Takes care of incoming requests and forwards them to the back-end servers
• Manager service
– Logically centralized service implemented using a replicated state machine
– Manager-directed leasing and partitioning
• Owner library (back-end servers)
– Servers that hold object leases
– Hold objects in memory
– Process requests
Manager Service
• Executes manager-directed leasing
– Manager controls how the key space is partitioned and leases are assigned.
• Key space mapped to variable-length ranges using consistent hashing
– One range lease per virtual node
– 64 virtual nodes per owner library (see Questions-1)
• Every leased range has a lease generation number
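The partitioning scheme on this slide can be illustrated with a minimal Python sketch. It is not Centrifuge's actual code: the class name `LeaseTable`, the method `owner_for`, the use of SHA-1 and the 2**32 ring size are all assumptions made for illustration. Each owner contributes 64 virtual nodes to a hash ring, each virtual node holds the variable-length range that ends at its ring position, and every range carries a lease generation number.

```python
import bisect
import hashlib

VIRTUAL_NODES_PER_OWNER = 64  # value quoted on the slide


def ring_position(label: str) -> int:
    """Map a string to a point on a 2**32 ring (SHA-1 is an assumption)."""
    digest = hashlib.sha1(label.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class LeaseTable:
    """Hypothetical lease table: one range lease per virtual node."""

    def __init__(self, owners):
        # Each entry is (ring position, owner name, lease generation number).
        entries = []
        for owner in owners:
            for v in range(VIRTUAL_NODES_PER_OWNER):
                entries.append((ring_position(f"{owner}#{v}"), owner, 1))
        self.entries = sorted(entries)
        self.positions = [pos for pos, _, _ in self.entries]

    def owner_for(self, key: str):
        """Return (owner, generation) for the range that contains the key."""
        idx = bisect.bisect_left(self.positions, ring_position(key))
        if idx == len(self.positions):
            idx = 0  # wrap around the ring
        _, owner, generation = self.entries[idx]
        return owner, generation


table = LeaseTable(["owner-a", "owner-b", "owner-c"])
owner, generation = table.owner_for("device:D1")
```

Because ranges are derived from hash positions, adding or removing one owner only moves the ranges adjacent to its virtual nodes, which is the usual reason for choosing consistent hashing.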
Manager Service
• Paxos group
– Runs Paxos
– Serves as state store
– Executes leader election protocol
• Leader and standby servers
– Only the current leader executes the logic for partitioning and lease management
– Only the current leader interacts with Lookups and Owners
• Live deployment
– 3 standby servers
– 5 Paxos-running servers
– Can tolerate 2 machine failures
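The failure tolerance quoted for the live deployment is consistent with majority quorums: a Paxos group stays available as long as a majority of its servers is up. A small sketch of that arithmetic (the function name is hypothetical):

```python
def tolerated_failures(paxos_servers: int) -> int:
    """Largest number of failed servers that still leaves a majority alive."""
    return (paxos_servers - 1) // 2


failures = tolerated_failures(5)  # 5 Paxos-running servers in the live deployment
```

With 5 servers, a majority is 3, so 2 servers may fail.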
Manager Service – Leader Election
• Changes to leader applied to Paxos group first
• Standby servers check for candidacy periodically
• New leader reads in the state from Paxos group
• Lookups/Owners contact leader directly
– What happens when leader does not respond?
• Server addition and removal
– Leader calculates the desired (re)assignments, recalls leases, then grants the new ones
– 60 second leases
Lookup library
• Maintains a complete copy of the lease table
– 200 KB per lease table (100 × 64 × 32 B)
• Advantages
– Object-to-owner mapping is resolved locally
– When confirmation at the remote owner fails (due to a change in lease generation number), a loss notification is generated
• Incremental changes copied from the manager every 30 seconds (see Questions-3)
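The lease table size on this slide can be checked with a few lines of Python. The meaning of each factor is an assumption (the slide only gives the numbers): 100 is read as the number of owners, 64 as the virtual nodes per owner, and 32 B as the size of one range entry.

```python
OWNERS = 100                   # first factor on the slide (assumed: number of owners)
VIRTUAL_NODES_PER_OWNER = 64   # second factor (virtual nodes per owner library)
BYTES_PER_RANGE = 32           # third factor (assumed: bytes per range entry)

table_bytes = OWNERS * VIRTUAL_NODES_PER_OWNER * BYTES_PER_RANGE
table_kib = table_bytes / 1024
```

The product is 204,800 bytes, which is 200 KB when 1 KB is taken as 1,024 bytes.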
Lookup-Manager Protocol
• Entries in the change log are truncated every 5 minutes
• Manager sends the entire lease table when the lookup’s LSN is too old, or when Size(change log) > Size(lease table)
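A minimal sketch of how this synchronization rule could look on the manager side. All names here (`Manager`, `assign`, `truncate`, `sync`) are hypothetical, and measuring the size of the log and the table in entries rather than bytes is a simplifying assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Manager:
    """Hypothetical manager-side state for lease table synchronization."""
    lease_table: dict = field(default_factory=dict)  # range id -> owner
    change_log: list = field(default_factory=list)   # (lsn, range id, owner)
    lsn: int = 0                                      # latest log sequence number
    oldest_lsn: int = 1                               # oldest LSN still in the log

    def assign(self, range_id, owner):
        """Record a lease change in both the table and the change log."""
        self.lsn += 1
        self.lease_table[range_id] = owner
        self.change_log.append((self.lsn, range_id, owner))

    def truncate(self, keep_from_lsn):
        """Drop old log entries (every 5 minutes according to the slide)."""
        self.change_log = [e for e in self.change_log if e[0] >= keep_from_lsn]
        self.oldest_lsn = keep_from_lsn

    def sync(self, lookup_lsn):
        """Answer a lookup that has applied all changes up to lookup_lsn."""
        too_old = lookup_lsn + 1 < self.oldest_lsn
        log_too_big = len(self.change_log) > len(self.lease_table)
        if too_old or log_too_big:
            return "full", self.lsn, dict(self.lease_table)
        delta = [e for e in self.change_log if e[0] > lookup_lsn]
        return "delta", self.lsn, delta


manager = Manager()
for range_id, owner in [("r1", "owner-a"), ("r2", "owner-b"), ("r3", "owner-c")]:
    manager.assign(range_id, owner)
fresh_reply = manager.sync(lookup_lsn=2)   # lookup is one change behind
manager.truncate(keep_from_lsn=3)
stale_reply = manager.sync(lookup_lsn=0)   # lookup's LSN predates the log
```

In this sketch the first lookup receives only the missing log entry, while the second receives the whole table because the entries it needs have already been truncated.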
Owner Library
• Owners send lease request/renew messages to the Manager every 15 seconds
– 3 consecutive lost/delayed requests lose the lease
• Manager responds with the (complete) renewed/granted lease information
• Why are lease grants different from renewals?
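The renewal timing can be sketched as a small simulation. It combines the 15 second renewal interval from this slide with the 60 second lease length given on the leader election slide. Two things are assumptions of this sketch, not statements from the paper: the lease is granted at time 0, and each acknowledged renewal extends the lease to 60 seconds after the time the renewal was sent. The function name is hypothetical.

```python
RENEW_INTERVAL_S = 15   # owners contact the Manager every 15 seconds
LEASE_DURATION_S = 60   # lease length quoted on the leader election slide


def time_lease_lost(renewal_outcomes):
    """Return the time (in seconds) at which the lease expires, or None.

    renewal_outcomes[i] is True when the renewal sent at (i + 1) * 15 s
    is acknowledged by the Manager, and False when it is lost or delayed.
    """
    expires_at = LEASE_DURATION_S  # assumed: lease granted at time 0
    for i, acknowledged in enumerate(renewal_outcomes, start=1):
        sent_at = i * RENEW_INTERVAL_S
        if sent_at >= expires_at:
            return expires_at      # lease ran out before this renewal
        if acknowledged:
            expires_at = sent_at + LEASE_DURATION_S
    return None


two_losses = time_lease_lost([True, False, False, True])
three_losses = time_lease_lost([True, False, False, False, True])
```

Under these assumptions two consecutive lost renewals are survivable, while three consecutive losses let the 60 second lease run out before the next renewal can extend it.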
Other issues
• Dealing with clocks
– Clock rate synchronization
• Dealing with message races
– Solution: add two sequence numbers (sender’s and owner’s seq. no.)
– Drop the racing message
– Random backoff, send again
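One possible reading of the message race rule, as a hedged Python sketch. The slide does not spell out the exact check, so the details below are assumptions: each message carries the sender's own sequence number plus the latest sequence number the sender had seen from its peer, and a receiver treats a message as racing when the sender had not yet seen the receiver's latest message. The names `Message`, `Endpoint` and `backoff_delay` are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Message:
    sender_seq: int      # sequence number assigned by the sender
    last_seen_seq: int   # latest sequence number the sender had seen from the peer
    body: str


class Endpoint:
    """Hypothetical Manager or Owner endpoint that detects message races."""

    def __init__(self):
        self.sent_seq = 0   # sequence number of our latest outgoing message
        self.seen_seq = 0   # latest sequence number observed from the peer

    def make_message(self, body):
        self.sent_seq += 1
        return Message(self.sent_seq, self.seen_seq, body)

    def receive(self, msg):
        """Return True if accepted, False if dropped as a racing message."""
        # Remember the peer's sequence number even when the message is dropped,
        # so that a later resend from our side is not seen as racing again.
        self.seen_seq = max(self.seen_seq, msg.sender_seq)
        if msg.last_seen_seq < self.sent_seq:
            return False    # peer sent this before seeing our latest message
        return True


def backoff_delay(max_delay_s=1.0):
    """Random delay before resending a dropped message (range is an assumption)."""
    return random.uniform(0.0, max_delay_s)


manager, owner = Endpoint(), Endpoint()
recall = manager.make_message("recall range")
renew = owner.make_message("renew lease")      # crosses the recall in flight
recall_accepted = owner.receive(recall)
renew_accepted = manager.receive(renew)
retry_accepted = owner.receive(manager.make_message("recall range"))
```

Here both crossing messages are dropped, and the resend succeeds because the manager has by then observed the owner's latest sequence number. The random backoff reduces the chance that both sides resend at the same moment and race again.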
API
• Lookup() returns hints
• If a hint fails, the caller retries after a short backoff
• A new lease generation number (without flag) at the Owner represents a node crash
• Caller resends the earlier subscribe message
• All lookup libraries signal a LossNotificationUpcall() on the appropriate ranges
• OwnershipChangeUpcall(): used to initialize data structures when some new range of the key space has been granted, or to garbage collect the state associated with a range that has been revoked
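The hint semantics of Lookup() imply a retry loop on the caller side. The following is a minimal sketch under stated assumptions: `lookup` and `send` are stand-in callables supplied by the caller, `StaleHintError` is a hypothetical exception, and the attempt limit and delays are illustrative values. None of these names come from the Centrifuge API.

```python
import random
import time


class StaleHintError(Exception):
    """Raised by the (hypothetical) owner when it does not hold the lease."""


def call_with_hint_retry(lookup, send, key, request,
                         max_attempts=5, base_delay_s=0.05):
    """Resolve key via lookup(), send the request, retry when the hint is stale."""
    for attempt in range(max_attempts):
        owner = lookup(key)            # a hint: it may be out of date
        try:
            return send(owner, request)
        except StaleHintError:
            # Short randomized backoff gives the lease table time to refresh.
            time.sleep(random.uniform(0, base_delay_s * (attempt + 1)))
    raise RuntimeError(f"no owner accepted the request for {key!r}")


# Usage with stand-in functions: the first hint is stale, the second is current.
hints = iter(["old-owner", "new-owner"])


def lookup(key):
    return next(hints)


def send(owner, request):
    if owner == "old-owner":
        raise StaleHintError(owner)
    return f"{owner} handled {request}"


result = call_with_hint_retry(lookup, send, "D1", "get-ip", base_delay_s=0.0)
```

Returning hints keeps lookups local and cheap; correctness is preserved because the owner, not the lookup library, decides whether it actually holds the lease.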
Performance Evaluation
• Observation of Centrifuge in production for 2.5 months (Dec 2008 to March 2009)
• 130 Owners; 1,000 Lookups; 8 Manager servers
• Questions
– Is the Centrifuge manager a bottleneck in steady state?
– How well does Centrifuge handle high-churn events?
– How stable are production servers?
Performance Evaluation
• Low CPU and network utilization in steady state for all servers
– Slightly higher for the current leader
• Network bursts on 12/16/2008 and 1/15/2009 when standbys take over
– Security patches rolled out on these two days
• 1.5 months of steady activity; 10 leases lost over this period
Conclusions
• Unplanned owner failures are quite rare
• Owner recovery is reasonably rapid (7 Owners recovered in <10 min)
• Message races are very rare in steady state (12 drops in 1.5 months)
Performance Evaluation
• 2.5-hour window during the second security rollout period
• 9:20 pm: Servers 1 and 3 restarted, server 2 took over
• 9:45 pm: Server 2 restarted, server 1 took over
• The number of owners dips at periodic intervals because the patch is applied to groups of servers at regular intervals.
Conclusion
• Burst in network traffic due to lookup servers copying entire lease tables
• During high churn, the observed load at the leader (Manager service) is small
• During the (quiet) period Jan 8 to Jan 13, the API success rate is 100% over 53 million invocations
Performance Evaluation
• Testbed
– Around 2k front-end server instances, maintaining a 10:1 Lookups:Owners ratio
– Restarted more rapidly than in the production environment
[Figure: measurements at the Manager leader]
Conclusion
• Simplifies building scalable application tiers with in-memory state
• Combines leasing and partitioning to handle a massive number of objects
– Freedom from fine-grained leases.
• Manager-directed leasing that scales well
– Avoids lease fragmentation
• Non-traditional API where
– Clients cannot request leases
Questions-1 (Santoshb)
• Any reasons for choosing 64 virtual nodes per owner library? Can this be increased in more complex systems?
• Most of these parameters were experimentally derived. Although the authors do not discuss this choice directly, it is most likely the optimum for the workload.
Adding more virtual nodes can adversely affect performance, much like installing too many virtual machines on one server.
Questions-2 (Santoshb)
• What are the bottlenecks to extend the current design across multiple data centers?
• Centrifuge is designed to work within a datacenter (Refer to section 2.4). However, if it has to be extended to multiple datacenters, network reliability will be the first and foremost issue.
I believe it is not an issue of bottlenecks, but of feasibility. If there are too many message races or high latencies involved, Centrifuge cannot work efficiently. Besides, maintaining strict clock rate synchronization across datacenters will also be a challenge. And this is just the tip of the iceberg.
Questions-3 (Fatih)
• Each lookup library has a replicated copy of the lease table and these lookups can have stale data. How does the system handle this stale data?
• In slides.
Questions-4 (Lavone_R)
• What do you think the drawbacks and shortcomings of Centrifuge are with regard to its lease mechanism policies, its partitioning implementation, and its infrastructure that enables services to be run on pools of in-memory servers?
• Centrifuge works well within a datacenter, so we cannot move replicas close to the site and make one a primary copy (as in PNUTS). Also, such applications need fine-grained leases. The synchronization requirement is strict.
Chubby is better for loosely coupled distributed systems. For this problem specification, Centrifuge seems to be the best option. Chubby handles a different kind of problem (file system, session).
Questions-5 (Yong Wang)
• In section 2: “each object can be fully handled by one machine holding an exclusive”. Does that mean the reading or writing of each object is serializable? Why does that simplify the design? Is there any consideration about keeping consistency?
• Yes, the requests are inherently serializable because only one server is responsible for applying changes to a particular object. (Refer to Section 4.2)
This assumption simplifies the design because, if a server could not handle a popular object from within a lease range, there would be additional lease reassignments, which would complicate the design.
There are no consistency issues here (Refer to Section 4.2)
Thank You !!
List of Live Mesh Services
1. File sharing and synchronization across devices
2. Notifications of activity on these shared files
3. A virtual desktop that is hosted in the datacenter
4. File manipulation through a web browser
5. Connectivity to remote devices