Amazon’s Dynamo: Simple Cloud Storage
Foundations
• 1970 – E.F. Codd, “A Relational Model of Data for Large Shared Data Banks”
  – Idea of tabular data
  – SQL foundations
• Codd’s 12 rules
  – How the database is structured and what is available to the user
  – Application not dependent on physical or logical levels of the database
  – Insert, Update, Delete operators
Foundations, continued
• First relational database management systems (RDBMS)
  – Oracle in 1979, first commercial SQL-based system
  – Microsoft SQL Server, etc.
  – Open source software would follow later (MySQL)
• Follows Codd’s ideas
  – Complexity on the server side, let the query do all the work
  – Very stringent requirements
Drawbacks
• Licensing fees on a per-processor rate
  – High-end Oracle system is mind-numbingly expensive
• Load distribution requires specific nodes to handle
  – Some servers have specific roles, failure point in network
• Complexity on servers creates difficulty with maintenance, upgrades
  – Upgrades often all at once as a result, not incremental
A New Direction
• Simplify services that the database provides
  – Easier scaling and error handling
• Take a more pragmatic approach
  – Tailor system to sacrifice some aspects of the traditional RDBMS to gain performance in others
  – Systems less general, specific end requirements in mind when creating
Examples
• Amazon Dynamo
  – Simple primary key
  – Highly available, end user based model
  – Low cost virtualized nodes
• Facebook Cassandra
  – Similar goals to Amazon’s Dynamo
  – Highly available, incremental scalability
• Google File System
  – Master node
  – Data distributed across low cost nodes
Dynamo Goals
• Scale – adding systems to network causes minimal impact
• Symmetry – No special roles, all features in all nodes
• Decentralization – No master node(s)
• Highly Available – Focus on end user experience
• Speed – A system can only be as fast as the lowest level
• Service Level Agreements – System can be adapted to an application’s specific needs, allows flexibility
Dynamo Assumptions
• Query Model – Simple interface exposed to application level
  – Get(), Put()
  – No Delete()
• Atomicity, Consistency, Isolation, Durability
  – Operations either succeed or fail, no middle ground
  – System will be eventually consistent, no sacrifice of availability to assure consistency
  – Conflicts can occur while updates propagate through system
  – System can still function while entire sections of network are down
• Efficiency – Measure system by the 99.9th percentile
  – Important with millions of users, 0.1% can be in the 10,000s
• Non-hostile environment – No need to authenticate query, speed boost
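The point about measuring at the 99.9th percentile can be illustrated with a small sketch. The latency values, the nearest-rank percentile method, and the figure of 10 million users are all assumptions chosen for illustration; none of them come from the slides.

```python
import math
from fractions import Fraction

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Fractions avoid floating-point rounding when computing the rank.
    rank = math.ceil(Fraction(str(p)) * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]

# Hypothetical latencies in milliseconds: 9,980 fast requests and 20 slow ones.
latencies_ms = [10] * 9980 + [500] * 20

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p999_ms = percentile(latencies_ms, 99.9)

# With an assumed 10 million users, the slowest 0.1% is still a large group.
users = 10_000_000
users_in_tail = users // 1000

print(f"mean={mean_ms:.2f} ms, median={p50_ms} ms, p99.9={p999_ms} ms")
print(f"users beyond p99.9: {users_in_tail}")
```

In this made-up data set the mean and median look healthy while the 99.9th percentile exposes the slow requests, which is the reason for judging the system by the tail rather than the average.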
Wanted Results
• Deliver requests in a bounded time
• Always writable
  – Highly available to users
• No dedicated roles
• Work split between nodes fairly
Techniques
Partitioning
• Consistent Hashing
  – Changing the number of slots in hash table results in only a small number of keys to remap
• A ring of virtual nodes
  – Node responsible for region between it and its predecessor
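A minimal sketch of the ring described above, assuming MD5 as the hash function and one ring position per node. The class name `HashRing`, the node names, and the key names are hypothetical and used only for illustration.

```python
import bisect
import hashlib

def ring_position(value: str) -> int:
    """Map a string to a position on the ring (MD5 is an assumption of this sketch)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest(), "big")

class HashRing:
    def __init__(self, nodes=()):
        self._positions = []  # sorted ring positions
        self._owners = {}     # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        pos = ring_position(node)
        bisect.insort(self._positions, pos)
        self._owners[pos] = node

    def lookup(self, key: str) -> str:
        """Return the first node at or after the key's position, wrapping around the ring."""
        i = bisect.bisect_left(self._positions, ring_position(key))
        return self._owners[self._positions[i % len(self._positions)]]

keys = [f"key-{i}" for i in range(1000)]
ring = HashRing(["node-A", "node-B", "node-C", "node-D"])
before = {k: ring.lookup(k) for k in keys}

ring.add("node-E")
after = {k: ring.lookup(k) for k in keys}

# Only keys that fall between node-E and its predecessor change owner.
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved, all to node-E")
```

Every key that changes owner moves to the new node; keys in the rest of the ring are untouched, which is the property the slide refers to.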
Virtual Node
• Physical Machine has # of virtual nodes based on performance
• Can adapt load more easily if a machine goes down
• Likewise, assign nodes to a new machine in network
Replication
• Application-provided parameter N
• Replication on different physical nodes
  – Data still available if nodes go down
  – Makes part of preference list for query
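One way to picture the preference list is to walk the ring from the key's position and collect the first N distinct physical machines, skipping virtual nodes that belong to a machine already chosen. The sketch below assumes MD5 positions and a hypothetical naming scheme for virtual nodes; the machine names and token counts are made up.

```python
import bisect
import hashlib

def ring_position(value: str) -> int:
    """Map a string to a ring position (MD5 is an assumption of this sketch)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest(), "big")

def build_ring(vnodes_per_machine):
    """Place each machine on the ring once per virtual node; returns sorted (position, machine) pairs."""
    ring = []
    for machine, count in vnodes_per_machine.items():
        for i in range(count):
            ring.append((ring_position(f"{machine}#vnode-{i}"), machine))
    ring.sort()
    return ring

def preference_list(ring, key, n):
    """Walk clockwise from the key and return up to n distinct physical machines."""
    positions = [pos for pos, _ in ring]
    start = bisect.bisect_left(positions, ring_position(key))
    chosen = []
    for step in range(len(ring)):
        machine = ring[(start + step) % len(ring)][1]
        if machine not in chosen:
            chosen.append(machine)
            if len(chosen) == n:
                break
    return chosen

# A weaker machine gets fewer virtual nodes, so it owns less of the ring.
ring = build_ring({"machine-a": 8, "machine-b": 8, "machine-c": 4, "machine-d": 8})
replicas = preference_list(ring, "cart:12345", 3)
print(replicas)
```

Skipping repeated machines is what keeps the N replicas on different physical hardware even though one machine appears at many ring positions.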
Versioning and Vector Clocks
• Updates propagate asynchronously, need a way of distinguishing conflicts
  – Possible reason for absence of Delete()
• Vector Clock
  – List of (node, counter)
  – Limited size, limit overhead for data
  – If every counter in the first clock is less than or equal to the matching counter in the second, the first version can be replaced by the second
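The comparison rule for vector clocks can be sketched as follows. Clocks are modeled as plain dictionaries from node name to counter; the node names `Sx`, `Sy`, `Sz` and the helper names are hypothetical, and the size limit mentioned on the slide is not modeled here.

```python
def descends(a, b):
    """True if every counter in clock a is <= the matching counter in clock b."""
    return all(count <= b.get(node, 0) for node, count in a.items())

def compare(a, b):
    """Classify two clocks as equal, ordered, or conflicting."""
    a_le_b, b_le_a = descends(a, b), descends(b, a)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "a_before_b"   # a can be replaced by b
    if b_le_a:
        return "b_before_a"   # b can be replaced by a
    return "conflict"         # concurrent updates, reconciliation needed

def increment(clock, node):
    """Return a copy of the clock with the given node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated

def merge(a, b):
    """Element-wise maximum of two clocks, used after reconciling a conflict."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}

v1 = increment({}, "Sx")   # first write, handled by Sx
v2 = increment(v1, "Sx")   # second write, also handled by Sx
v3 = increment(v2, "Sy")   # update of v2 handled by Sy
v4 = increment(v2, "Sz")   # concurrent update of v2 handled by Sz

print(compare(v2, v3))     # v3 supersedes v2
print(compare(v3, v4))     # neither supersedes the other
reconciled = increment(merge(v3, v4), "Sx")
print(reconciled)
```

When `compare` reports a conflict, both versions have to be kept and handed to the reader for reconciliation; the merged clock then supersedes both branches.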
Sloppy Quorum and Hinted Handoff
• W and R parameter set min # of nodes in a read or write
• Read and write on the first N healthy nodes, no strict membership, can vary over time
• Hint in metadata for intended node, will update once that node is again available
• Allows for temporary failure in nodes or entire networks
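The following is a simplified, in-memory sketch of a sloppy-quorum write with hinted handoff. The `Node` class, the function names, and the choice to keep hinted replicas in a separate list are assumptions of this sketch, not details taken from the slides.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}    # key -> value for replicas this node owns
        self.hints = []   # (intended node name, key, value) held for nodes that were down

def sloppy_write(ring_order, key, value, n, w):
    """Write to the first n healthy nodes in ring order; succeed if at least w accept."""
    intended = ring_order[:n]
    missing = [node for node in intended if not node.up]
    healthy = [node for node in ring_order if node.up][:n]
    acks = 0
    for node in healthy:
        if node in intended:
            node.data[key] = value
        elif missing:
            # Stand-in node: keep the replica with a hint naming the intended owner.
            node.hints.append((missing.pop(0).name, key, value))
        acks += 1
    return acks >= w

def deliver_hints(holder, nodes_by_name):
    """Hand hinted replicas back to their intended owners once those are reachable."""
    remaining = []
    for target_name, key, value in holder.hints:
        target = nodes_by_name[target_name]
        if target.up:
            target.data[key] = value
        else:
            remaining.append((target_name, key, value))
    holder.hints = remaining

a, b, c, d = (Node(name) for name in "ABCD")
nodes = {node.name: node for node in (a, b, c, d)}

a.up = False                                   # A is temporarily unreachable
ok = sloppy_write([a, b, c, d], "cart:1", "book", n=3, w=2)
hints_while_down = list(d.hints)               # D holds the replica meant for A

a.up = True                                    # A recovers
deliver_hints(d, nodes)
print(ok, hints_while_down, a.data)
```

The write succeeds even though one of the intended replicas is down, and the replica reaches its intended node later without any client involvement.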
Synchronization and Gossip
• Merkle Trees
• Use common key values between two nodes
  – Traverse tree and check vector clocks to see if updates are needed
  – Exchange information on most current version of the data if inconsistencies are found
• Gossip
  – Nodes select neighbors at random and reconcile membership change histories
    • Use seed nodes to initialize
    • Detect failures
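A rough sketch of how a Merkle tree narrows down which keys differ between two replicas. It assumes both replicas hold the same non-empty, sorted set of keys, uses SHA-256, and hashes each key together with its value; the slides do not specify any of these details.

```python
import hashlib

def digest(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def build_tree(items):
    """Build hash levels bottom-up; levels[0] holds leaf hashes, levels[-1] holds the root."""
    level = [digest(f"{key}={value}") for key, value in items]
    levels = [level]
    while len(level) > 1:
        level = [digest("".join(level[i:i + 2])) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def differing_leaves(a, b, level=None, index=0):
    """Descend from the root, following only subtrees whose hashes differ."""
    if level is None:
        level = len(a) - 1
    if a[level][index] == b[level][index]:
        return []                       # identical subtree, nothing to exchange
    if level == 0:
        return [index]                  # reached a leaf that is out of sync
    found = []
    for child in (2 * index, 2 * index + 1):
        if child < len(a[level - 1]):
            found += differing_leaves(a, b, level - 1, child)
    return found

replica_a = [(f"k{i}", f"v{i}") for i in range(8)]
replica_b = list(replica_a)
replica_b[5] = ("k5", "stale-value")    # one key is out of date on replica B

tree_a, tree_b = build_tree(replica_a), build_tree(replica_b)
stale = differing_leaves(tree_a, tree_b)
print([replica_a[i][0] for i in stale])
```

If the root hashes match, the replicas are in sync and nothing else is compared; otherwise only the differing branches are walked, so little data is exchanged when few keys differ.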
Routing get() and put()
• Two techniques
  – Route request through a load balancer
    • Slower
    • Simpler application level code
  – Partition-aware client, route directly to appropriate nodes
    • Faster
    • More complicated application level
• First node routed to is “coordinator” node
  – Generates vector clock for put and gives data to N highest healthy nodes
  – Queries N highest nodes for all versions, returns all versions found
Implementation
• Java based
  – Hardware independent, JVM
• Allows different back-end systems to be used, based on size of data needed to be stored
  – Berkeley Database Transactional Data Store
  – BDB Java Edition
  – MySQL, can handle large objects
• Coordinator node is a state machine for read/writes for client
  – Coordinator for a write is the node that replied fastest to the previous read
Flexibility
• Changing W, R, N
  – Business logic specific reconciliation
    • Data replicated over nodes
    • Application level reconciliation for conflicting objects
  – Timestamp reconciliation
    • Similar to above, last write wins
  – High performance read engine
    • By setting R = 1, W = N
    • Reads fast and numerous, few updates
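The effect of choosing N, R, and W can be sketched with simple counting. The helper names are hypothetical, the overlap check `R + W > N` is the usual quorum condition rather than something stated on the slide, and the failure counts assume strict membership (no hinted handoff).

```python
def describe(n, r, w):
    """Summarize what a given (N, R, W) choice implies under strict quorum counting."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("R and W must be between 1 and N")
    return {
        "read_write_overlap": r + w > n,       # every read set shares a node with every write set
        "read_failures_tolerated": n - r,      # replicas that may be down while reads still succeed
        "write_failures_tolerated": n - w,     # replicas that may be down while writes still succeed
    }

def last_write_wins(versions):
    """Timestamp reconciliation: keep the value with the largest timestamp."""
    return max(versions, key=lambda version: version[0])[1]

# High performance read engine from the slide: R = 1, W = N (N = 3 is an assumed value).
read_engine = describe(n=3, r=1, w=3)
print(read_engine)

# Hypothetical (timestamp, value) pairs for one key.
winner = last_write_wins([(100, "a"), (250, "b"), (180, "c")])
print(winner)
```

With R = 1 and W = N, any single replica can answer a read, but a write needs every replica, which suits data that is read often and updated rarely.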
Observed Results - Speed
Observed Results – Load Balancing
• Higher traffic causes load to be balanced more evenly
  – Requests for popular keys let the system balance more easily
• In lower traffic, less important to balance load
Observed Results - Coordination
• Client coordination can provide a speed boost
• Read and write latency nearly identical
• Results as expected
Observed Results - Versions
• Measured over 24 hour period for shopping cart
  – 99.94% of requests saw 1 version
  – 0.00057% saw 2 versions
  – 0.00047% saw 3 versions
  – 0.00009% saw 4 versions
• Increase in number of versions most likely caused by an increase in the number of concurrent writers
Conclusions
• Dynamo allows Amazon’s customers to have a consistent experience even in the face of server and network errors
• Gives a scalable solution with millions of data points to be queried quickly and efficiently
• Offloads complexity to the application to provide a simple, flexible, and fast server-side implementation
Thanks for listening!