The Very First
Thanks!
Tonight
• Membase Overview
• Use Cases and Customer Examples
• Zynga and Membase
• Membase Architecture
• Demo!
• Developing with Membase
• A Glimpse into the Future
What is Membase?
Membase is a distributed database
(Diagram: Membase servers in the data center, web application servers, application user, administrator console)
Membase is Simple, Fast, Elastic
Five minutes or less to a working cluster
• Downloads for Linux and Windows
• Start with a single node
• One button press joins nodes to a cluster
Easy to develop against
• Just SET and GET – no schema required
• Drop it in. 10,000+ existing applications already “speak membase” (via memcached)
• Practically every language and application framework is supported, out of the box
Easy to manage
• One-click failover and cluster rebalancing
• Graphical and programmatic interfaces
• Configurable alerting
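Since existing applications reach Membase through the memcached protocol, "just SET and GET" can be illustrated at the wire level. The sketch below encodes the two requests in the memcached text protocol and decodes a GET reply. The function names are hypothetical helpers for illustration, not part of any client library, and no network connection is made.

```python
from typing import Optional


def build_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Encode a memcached text-protocol SET: header line, then the raw bytes."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get(key: str) -> bytes:
    """Encode a memcached text-protocol GET for a single key."""
    return f"get {key}\r\n".encode("ascii")


def parse_get_response(raw: bytes) -> Optional[bytes]:
    """Decode 'VALUE <key> <flags> <bytes>' + data + 'END'; None on a miss."""
    if raw == b"END\r\n":
        return None
    header, rest = raw.split(b"\r\n", 1)
    _value, _key, _flags, nbytes = header.split(b" ")
    return rest[: int(nbytes)]
```

There is no schema anywhere in the exchange: the server sees a key, a few integers and a run of bytes.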
Membase is Simple, Fast, Elastic
Predictable
• “Never keep an application waiting”
• Quasi-deterministic latency and throughput
Low latency
• Built-in Memcached technology
High throughput
• Multi-threaded
• Low lock contention
• Asynchronous wherever possible
• Automatic write de-duplication
Membase is Simple, Fast, Elastic
Zero-downtime elasticity
• Spread I/O and data across commodity servers (or VMs)
• Consistent performance with linear cost
• Dynamic rebalancing of a live cluster
All nodes are created equal
• No special case nodes
• Any node can replace any other node, online
• Clone to grow
Extensible
• Filtered TAP interface provides hook points for external systems (e.g. full-text search, backup, warehouse)
• Data bucket – engine API for specialized container types
Built-in Memcached Caching Layer
(Diagram: Memcached Mode, showing Memcached and a Membase Database; Membase Mode, showing a Membase Cache and a Membase Database)
Fact: The Membase development team has also contributed over half of the code to the Memcached project.
Use Cases
Ad targeting
(Diagram with numbered steps 1, 2, 3; labels: events; profiles, campaigns; profiles, real-time campaign statistics)
40 milliseconds to come up with an answer.
Search and Gaming Portal
Database
(Zynga slides not available)
Membase Architecture
Clustering
• Underlying cluster functionality based on Erlang OTP
• Have a custom, vector clock based way of storing and propagating...
  – Cluster topology
  – vBucket mapping
• Collect statistics from many nodes of the cluster
  – Identify hot keys, resource utilization
TAP
• A generic, scalable method of streaming mutations from a given server
  – As data operations arrive, they can be sent to arbitrary TAP receivers
• Leverages the existing memcached engine interface, and the non-blocking IO interfaces to send data
• Three modes of operation
(Diagram of the three modes; labels: Working set, Data, Mutations)
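The idea of streaming mutations to arbitrary receivers can be sketched as a toy in-process model. This is not the TAP wire protocol: `TapSource`, `register` and the `key_filter` argument are hypothetical names, and the filter only stands in for the "filtered TAP" idea mentioned earlier in the deck.

```python
from typing import Callable, Dict, List, Optional, Tuple

Receiver = Callable[[str, bytes], None]   # called with (key, new value)
KeyFilter = Callable[[str], bool]         # True if the receiver wants this key


class TapSource:
    """Toy server that forwards every mutation to its registered receivers."""

    def __init__(self) -> None:
        self.data: Dict[str, bytes] = {}
        self._receivers: List[Tuple[Receiver, Optional[KeyFilter]]] = []

    def register(self, receiver: Receiver,
                 key_filter: Optional[KeyFilter] = None) -> None:
        """Attach a receiver, optionally limited to keys the filter accepts."""
        self._receivers.append((receiver, key_filter))

    def set(self, key: str, value: bytes) -> None:
        """Apply the mutation locally, then stream it to interested receivers."""
        self.data[key] = value
        for receiver, key_filter in self._receivers:
            if key_filter is None or key_filter(key):
                receiver(key, value)
```

A full-text indexer, a backup job or a warehouse loader would each be one receiver in this picture.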
Membase data flow – under the hood
1. SET request arrives at KEY’s master server
2. SET acknowledgement returned to application
(Diagram: master server for KEY, replica server 1 for KEY and replica server 2 for KEY, each with a listener-sender, RAM, the membase storage engine and disk; steps 3 and 4 are marked on the diagram without captions)
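One way to read the numbered flow is sketched below: the master stores the value in RAM and acknowledges straight away, while replication and the disk write happen later. That ordering is an assumption drawn from the step numbers and from "asynchronous wherever possible"; the class and method names are hypothetical.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


class Node:
    """Toy node with RAM, disk, and a set of keys waiting to be persisted."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.ram: Dict[str, bytes] = {}
        self.disk: Dict[str, bytes] = {}
        self.dirty: Dict[str, None] = {}   # dict used as an ordered set of keys

    def store(self, key: str, value: bytes) -> None:
        self.ram[key] = value
        self.dirty[key] = None             # repeated writes collapse to one entry

    def flush(self) -> int:
        """Persist each dirty key once; return the number of disk writes."""
        writes = 0
        for key in list(self.dirty):
            self.disk[key] = self.ram[key]
            writes += 1
        self.dirty.clear()
        return writes


class Master(Node):
    def __init__(self, name: str, replicas: List[Node]) -> None:
        super().__init__(name)
        self.replicas = replicas
        self.outbox: Deque[Tuple[str, bytes]] = deque()

    def handle_set(self, key: str, value: bytes) -> str:
        self.store(key, value)             # step 1: request reaches the master
        self.outbox.append((key, value))   # replication is queued, not awaited
        return "STORED"                    # step 2: acknowledge the application

    def replicate(self) -> None:
        """Assumed step 3: push queued mutations to every replica."""
        while self.outbox:
            key, value = self.outbox.popleft()
            for replica in self.replicas:
                replica.store(key, value)
```

Because `dirty` is keyed by item, two quick writes to the same key cost one disk write, which is one possible reading of "automatic write de-duplication".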
Clients, nodes and other nodes
(Diagram labels: ns_server; membase (memcached + membase engine); moxi; vbucketmigrator; TAP; memcached operations with tap commands; Client on port 11211, memcached operations; moxi + Client on port 11210, memcached operations; REST/comet for cluster topology and vbucket map)
Data buckets are secure membase “slices”
(Diagram: application user, web application server, and Membase data servers in the data center; the administrator console shows Bucket 1 and Bucket 2 within the aggregate cluster memory and disk capacity)
vBucket mapping
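The slide's diagram did not survive, so here is a minimal sketch of the general idea: a key hashes to a fixed vBucket, and a separate map says which server currently owns each vBucket. The CRC32 hash, the round-robin assignment and all function names are assumptions for illustration, not the product's actual algorithm.

```python
import zlib
from typing import List


def vbucket_for_key(key: bytes, num_vbuckets: int) -> int:
    """Hash a key to a vBucket id; this depends only on the key."""
    return zlib.crc32(key) % num_vbuckets


def server_for_key(key: bytes, vbucket_map: List[str]) -> str:
    """Look up the server that currently owns the key's vBucket."""
    return vbucket_map[vbucket_for_key(key, len(vbucket_map))]


def rebalance(vbucket_map: List[str], servers: List[str]) -> List[str]:
    """Reassign vBuckets round-robin; keys never change vBucket."""
    return [servers[i % len(servers)] for i in range(len(vbucket_map))]
```

Adding a server changes only the map, so clients that hold the current map can keep routing requests without rehashing any keys.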
Disk > Memory
Bucket Configuration
• mem_high_wat
• mem_low_wat
• memory quota
The dataset may have many infrequently accessed items. However, memcached's behavior (LRU eviction) is not what we want with membase.
Still, most traditional RDBMS implementations are not 100% right for us either. The speed of a miss is very, very important.
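A plausible reading of the two watermarks is sketched below: once RAM use passes `mem_high_wat`, values are ejected from RAM until use drops to `mem_low_wat`, and an ejected item is still served from disk on a miss. This behavior is an assumption inferred from the setting names. The `Bucket` class is hypothetical, the choice of which item to eject (least recently used) is arbitrary, the memory quota is left out, and writes are persisted immediately to keep the sketch short.

```python
from collections import OrderedDict
from typing import Dict, Optional


class Bucket:
    """Toy bucket whose dataset on disk can be larger than RAM."""

    def __init__(self, mem_high_wat: int, mem_low_wat: int) -> None:
        self.mem_high_wat = mem_high_wat
        self.mem_low_wat = mem_low_wat
        self.ram: "OrderedDict[str, bytes]" = OrderedDict()
        self.disk: Dict[str, bytes] = {}
        self.disk_reads = 0

    def mem_used(self) -> int:
        return sum(len(value) for value in self.ram.values())

    def _cache(self, key: str, value: bytes) -> None:
        """Keep the value in RAM, ejecting old values if RAM runs high."""
        self.ram[key] = value
        self.ram.move_to_end(key)
        if self.mem_used() > self.mem_high_wat:
            while self.ram and self.mem_used() > self.mem_low_wat:
                self.ram.popitem(last=False)   # ejected from RAM, kept on disk

    def set(self, key: str, value: bytes) -> None:
        self.disk[key] = value                 # simplification: persist at once
        self._cache(key, value)

    def get(self, key: str) -> Optional[bytes]:
        if key in self.ram:
            self.ram.move_to_end(key)
            return self.ram[key]
        if key in self.disk:                   # a miss in RAM is not data loss
            self.disk_reads += 1
            value = self.disk[key]
            self._cache(key, value)
            return value
        return None
```

The contrast with a pure cache is in `get`: an item dropped from RAM comes back from disk instead of being gone.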
Membase Demo
Key-Value Patterns
Key-Value
Items have:
• Key
• Value
• Expiration
• Flags
• CAS (more on this later)
Operations include:
• Get/Set
• Increment/Decrement
• Append/Prepend
Image courtesy http://www.flickr.com/photos/brenda-starr/3509344100/sizes/m/in/photostream/
(with a replica)
Membase Datatypes
• byte[]
  – Does your data have 1s and 0s?
“Any customer can have a car painted any colour that he wants so long as it is black.”
• Items do have flags
  – Many clients use flags
  – Data type options
    • Google protobuf
    • Thrift
    • Avro
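Because the server stores only bytes, a client can use the flags field to remember how a value was serialized. The sketch below shows one such client-side convention. The flag values and function names are hypothetical, and JSON stands in for protobuf, Thrift or Avro so the example needs only the standard library.

```python
import json
from typing import Any, Tuple

# Hypothetical flag values; real clients each define their own convention.
FLAG_RAW = 0
FLAG_INT = 1
FLAG_JSON = 2


def encode(value: Any) -> Tuple[bytes, int]:
    """Turn a Python value into (bytes, flags) ready to be stored."""
    if isinstance(value, bytes):
        return value, FLAG_RAW
    if isinstance(value, int) and not isinstance(value, bool):
        return str(value).encode("ascii"), FLAG_INT
    return json.dumps(value).encode("utf-8"), FLAG_JSON


def decode(data: bytes, flags: int) -> Any:
    """Rebuild the original value from the stored bytes and flags."""
    if flags == FLAG_RAW:
        return data
    if flags == FLAG_INT:
        return int(data.decode("ascii"))
    if flags == FLAG_JSON:
        return json.loads(data.decode("utf-8"))
    raise ValueError(f"unknown flags: {flags}")
```

Two clients only interoperate if they agree on the flag meanings, which is worth checking before mixing client libraries on the same keys.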
Transactions
• Lock == slow me down
• CAS operations
  – Optimistic locking
• Very useful with complex datatypes
  – Imagine two clients trying to update a complex item
• You’re likely using CAS already... if you use a CPU
(Diagram: User 1: Fail! User 2: Success)
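The two-user picture can be reproduced with a small in-memory model of check-and-set. `CasStore` is a hypothetical stand-in for the server, not a client API: every write gives the item a new CAS value, and a `cas` write succeeds only if the caller still holds the current one.

```python
import itertools
from typing import Callable, Dict, Optional, Tuple


class CasStore:
    """In-memory stand-in for a store that supports check-and-set."""

    def __init__(self) -> None:
        self._items: Dict[str, Tuple[bytes, int]] = {}
        self._next_cas = itertools.count(1)

    def set(self, key: str, value: bytes) -> int:
        cas = next(self._next_cas)         # every write gets a fresh CAS value
        self._items[key] = (value, cas)
        return cas

    def gets(self, key: str) -> Optional[Tuple[bytes, int]]:
        """Return (value, cas) or None if the key is missing."""
        return self._items.get(key)

    def cas(self, key: str, value: bytes, expected_cas: int) -> bool:
        """Write only if nobody has changed the item since it was read."""
        current = self._items.get(key)
        if current is None or current[1] != expected_cas:
            return False
        self.set(key, value)
        return True


def update_with_retry(store: CasStore, key: str,
                      change: Callable[[bytes], bytes], attempts: int = 5) -> bool:
    """Optimistic update: re-read and retry when another writer got in first."""
    for _ in range(attempts):
        found = store.gets(key)
        if found is None:
            return False
        value, cas = found
        if store.cas(key, change(value), cas):
            return True
    return False
```

No lock is held between the read and the write, so the losing client pays with a retry instead of every client paying with a wait.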
Common Use: Sessions
• Web user sessions
  – Read-heavy, with fewer writes in many cases
  – Protocol advantage of memcached
    • Options already for PHP, Ruby and Java
• Application state
  – Not necessarily “entity” style things
  – May be appropriate for a “cache” pool
Common Use (cache): Rate Limiting
• Want to provide API calls into the system
  – Twitter search
  – Google search services
• Use the atomic increment
  – Set an item with a unique ID
  – Upon API request, increment and check
• HTTP 420: go away and come back later
(Diagram: Your Users, Your App, ¡Ouch!)
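The increment-and-check pattern can be sketched with an in-memory counter that stands in for the server. `Counters` and `check_request` are hypothetical names, the limit and window are arbitrary, and the sketch folds "set an item" and "increment" into one call, whereas with memcached the counter has to exist before it can be incremented. The 420 status follows the slide; 429 Too Many Requests is the standardized code for the same situation.

```python
from typing import Dict, Tuple


class Counters:
    """In-memory stand-in for atomic increment with expiration."""

    def __init__(self) -> None:
        self._items: Dict[str, Tuple[int, float]] = {}   # key -> (count, expires)

    def incr(self, key: str, now: float, ttl: float) -> int:
        """Create the counter if needed, start over once expired, then add one."""
        count, expires = self._items.get(key, (0, now + ttl))
        if now >= expires:
            count, expires = 0, now + ttl
        count += 1
        self._items[key] = (count, expires)
        return count


def check_request(counters: Counters, client_id: str, now: float,
                  limit: int = 3, window: float = 60.0) -> int:
    """Return an HTTP status: 200 while under the limit, 420 once over it."""
    count = counters.incr(f"rate:{client_id}", now, window)
    return 200 if count <= limit else 420
```

Passing `now` in explicitly keeps the sketch deterministic; a real handler would use the store's own expiration instead.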
Looking Ahead: NodeCode
Frank Weigel, Membase
NodeCode – Motivation
Beyond key-value
• Indexing/Range Queries
• Advanced Data Structures
• Sub-object direct manipulation
Validation and In-flight transformation
• Block mutations failing validation
• Enrich or transform objects
Connectors (Integrate easily with other systems)
• Solr
• Hadoop
• MySQL
NodeCode - What is it?
Method for extending & customizing Membase
Separate code modules
Defined interface to datapath and cluster manager
Notification on events
• Synchronous
• Asynchronous
NodeCode – Drivers
Simple
• Packaged modules for easy install and enable
• Library of “off the shelf” modules
• Module monitoring
• Straightforward development and debugging
Fast
• Low latency/high-throughput
• Per-bucket process isolation
• Don’t break data manager performance/correctness
Elastic
• Automatically migrate and instantiate on rebalance
• Provide support for migration of internal data
• Leverage native Membase engine for internal data storage
Block-level architecture
NodeCode 1.0 Plans
Java only
– jar format
Must implement minimal module API
• Initial module startup
• Module removal
• Association with bucket
NodeCode library helper functions
• Register synchronous & asynchronous listeners/callbacks
• Register protocol extension/callbacks
• Register rebalance callback
• Register cluster manager event callbacks
• Membase data access
Q&A
Attributions
• http://commons.wikimedia.org/wiki/File:Flag_of_China.png
• http://commons.wikimedia.org/wiki/File:Flag_of_South_Korea.svg
• http://commons.wikimedia.org/wiki/File:Flag_of_Japan.svg