Membase Intro from Membase Meetup San Francisco

48
The Very First

description

 

Transcript of Membase Intro from Membase Meetup San Francisco

Page 1: Membase Intro from Membase Meetup San Francisco

The Very First

Page 2: Membase Intro from Membase Meetup San Francisco

2

Thanks!

Page 3: Membase Intro from Membase Meetup San Francisco

Tonight

3

• Membase Overview• Use Cases and Customer Examples• Zynga and Membase• Membase Architecture• Demo!• Developing with Membase• A Glimpse into the Future

Page 4: Membase Intro from Membase Meetup San Francisco

What is Membase?

Page 5: Membase Intro from Membase Meetup San Francisco

Membase is a distributed database

5

Membase Servers

In the data center

Web application server

Application user

On the administrator console

Web application serverWeb application server

Page 6: Membase Intro from Membase Meetup San Francisco

Five minutes or less to a working cluster• Downloads for Linux and Windows• Start with a single node• One button press joins nodes to a clusterEasy to develop against• Just SET and GET – no schema required• Drop it in. 10,000+ existing applications

already “speak membase” (via memcached)• Practically every language and application

framework is supported, out of the boxEasy to manage• One-click failover and cluster rebalancing• Graphical and programmatic interfaces• Configurable alerting

Membase is Simple, Fast, Elastic

6

Page 7: Membase Intro from Membase Meetup San Francisco

Membase is Simple, Fast, Elastic

7

Predictable• “Never keep an application waiting”• Quasi-deterministic latency and throughputLow latency• Built-in Memcached technology

High throughput• Multi-threaded• Low lock contention• Asynchronous wherever possible• Automatic write de-duplication

Page 8: Membase Intro from Membase Meetup San Francisco

Membase is Simple, Fast, Elastic

8

Zero-downtime elasticity• Spread I/O and data across commodity

servers (or VMs) • Consistent performance with linear cost• Dynamic rebalancing of a live clusterAll nodes are created equal• No special case nodes• Any node can replace any other node, online• Clone to growExtensible• Filtered TAP interface provides hook points

for external systems (e.g. full-text search, backup, warehouse)

• Data bucket – engine API for specialized container types

Page 9: Membase Intro from Membase Meetup San Francisco

Built-in Memcached Caching Layer

9

Memcached

Membase Database

Membase Cache

Membase Database

Memcached Mode Membase Mode

Fact: Membase development team has also contributed over half of the code to the Memcached project.

Page 10: Membase Intro from Membase Meetup San Francisco

Use Cases

Page 11: Membase Intro from Membase Meetup San Francisco

Ad targeting

11

eventsprofiles, campaigns

profiles, real time campaign statistics

40 milliseconds to come up with an answer.

2

3

1

Page 12: Membase Intro from Membase Meetup San Francisco

Search and Gaming Portal

12

Database

Page 13: Membase Intro from Membase Meetup San Francisco

(Zynga slides not available)

Page 14: Membase Intro from Membase Meetup San Francisco

Membase Architecture

Page 15: Membase Intro from Membase Meetup San Francisco

Clustering

• Underlying cluster functionality based on erlang OTP

• Have a custom, vector clock based way of storing and propagating...– Cluster topology– vBucket mapping

• Collect statistics from many nodes of the cluster– Identify hot keys, resource

utilization

15

Page 16: Membase Intro from Membase Meetup San Francisco
Page 17: Membase Intro from Membase Meetup San Francisco
Page 18: Membase Intro from Membase Meetup San Francisco
Page 19: Membase Intro from Membase Meetup San Francisco
Page 20: Membase Intro from Membase Meetup San Francisco

TAP

• A generic, scalable method of streaming mutations from a given server– As data operations arrive, they can be sent to arbitrary TAP

receivers

• Leverages the existing memcached engine interface, and the non-blocking IO interfaces to send data

• Three modes of operation

Working setDataMutations

Working setDataMutations

Working set

17

Page 21: Membase Intro from Membase Meetup San Francisco

Membase data flow – under the hood

18

SET request arrives at KEY’s master server

Listener-Sender

Master server for KEY Replica Server 2 for KEYReplica Server 1 for KEY

3 3

1SET acknowledgement returned to application2

DiskDisk Disk

RAM

mem

base

sto

rage

eng

ine

DiskDisk Disk

4

Page 22: Membase Intro from Membase Meetup San Francisco

ns_servermembase(memcached + membase engine)

moxi ns_server

vbucketmigratorTAP

memcached operationswith tap commands

memcached operations

Client

port 11211 memcached operations

moxi + Client

port 11210 memcached operations REST/comet

cluster topology and vbucket map

Clients, nodes and other nodes

19

Page 23: Membase Intro from Membase Meetup San Francisco

Data buckets are secure membase “slices”

20

Membase data servers

In the data center

Web application server

Application user

On the administrator console

Bucket 1Bucket 2

Aggregate Cluster Memory and Disk Capacity

Page 24: Membase Intro from Membase Meetup San Francisco

vBucket mapping

21

Page 25: Membase Intro from Membase Meetup San Francisco

Disk > Memory

Buc

ket C

onfig

urat

ion

mem_high_wat

mem_low_wat

memory quota

22

Dataset may have many items infrequently accessed. However, memcached has different behavior (LRU) than wanted with membase.

Still, traditional (most) RDBMS implementations are not 100% correct for us either. The speed of a miss is very, very important.

Page 26: Membase Intro from Membase Meetup San Francisco

Membase Demo

Page 27: Membase Intro from Membase Meetup San Francisco

Key-Value Patterns

Page 28: Membase Intro from Membase Meetup San Francisco

Key-Value

25

Page 29: Membase Intro from Membase Meetup San Francisco

Key-Value

25

Items have:KeyValueExpirationFlagsCAS (more on this later)

Operations include:Get/SetIncrement/DecrementAppend/Prepend

Page 30: Membase Intro from Membase Meetup San Francisco

Key-Value

25

Page 31: Membase Intro from Membase Meetup San Francisco

Key-Value

Image courtesy http://www.flickr.com/photos/brenda-starr/3509344100/sizes/m/in/photostream/

(with a replica )25

Page 32: Membase Intro from Membase Meetup San Francisco

Membase Datatypes

26

Page 33: Membase Intro from Membase Meetup San Francisco

Membase Datatypes

• byte[]– Does your data have

1s and 0s?

26

Page 34: Membase Intro from Membase Meetup San Francisco

Membase Datatypes

• byte[]– Does your data have

1s and 0s?

26

“Any customer can have a car painted any colour that he wants so long as it is black.”

Page 35: Membase Intro from Membase Meetup San Francisco

Membase Datatypes

• byte[]– Does your data have

1s and 0s?

26

“Any customer can have a car painted any colour that he wants so long as it is black.”

• Items do have flags– Many clients use flags

– Data type options• Google protobuf• Thrift• Avro

Page 36: Membase Intro from Membase Meetup San Francisco

Transactions

• Lock == slow me down• CAS operations

– Optimistic locking• Very useful with complex

datatypes– Imagine two clients trying to

update a complex item• You’re likely using CAS

already... if you use a CPU

27

User 1

Fail!

User 2Success

Page 37: Membase Intro from Membase Meetup San Francisco

Common Use: Sessions

• Web user sessions– Highly read, less writes in many case– Protocol advantage of memcached

• Options already for PHP, Ruby and Java

• Application state– Not necessarily “entity” style things– May be appropriate for a “cache” pool

28

Page 38: Membase Intro from Membase Meetup San Francisco

Common Use (cache): Rate Limiting

• Want to provide API calls into the system– Twitter search– Google search services

• Use the atomic increment– Set an item with a unique ID– Upon API request,

increment and check• HTTP 420: go away and come

back later

29

Your Users

Your App

¡Ouch!

Page 39: Membase Intro from Membase Meetup San Francisco

Looking Ahead: NodeCodeFrank Weigel, Membase

Page 40: Membase Intro from Membase Meetup San Francisco

Beyond key-value • Indexing/Range Queries• Advanced Data Structures• Sub-object direct manipulation

Validation and In-flight transformation• Block mutations failing validation• Enrich or transform objects

Connectors (Integrate easily with other systems)• Solr• Hadoop• MySQL

NodeCode – Motivation

31

Page 41: Membase Intro from Membase Meetup San Francisco

NodeCode - What is it?

Method for extending & customizing Membase

Separate code modules

Defined interface to datapath and cluster manager

Notification on events• Synchronous• Asynchronous

32

Page 42: Membase Intro from Membase Meetup San Francisco

Simple• Packaged modules for easy install and enable• Library of “off the shelf” modules• Module monitoring• Straight forward development and debuggingFast• Low latency/high-throughput• Per-bucket process isolation• Don’t break data manager performance/correctnessElastic• Automatically migrate and instantiate on rebalance• Provide support for migration of internal data• Leverage native Membase engine for internal data storage

NodeCode – Drivers

33

Page 43: Membase Intro from Membase Meetup San Francisco

Block-level architecture

34

Page 44: Membase Intro from Membase Meetup San Francisco

Java only– jar format

Must implement minimal module API• Initial module startup• Module removal• Association with bucket

NodeCode library helper functions• Register synchronous & asynchronous listeners/callbacks• Register protocol extension/callbacks • Register rebalance callback• Register cluster manager event callbacks• Membase data access

NodeCode 1.0 Plans

35

Page 45: Membase Intro from Membase Meetup San Francisco
Page 46: Membase Intro from Membase Meetup San Francisco

37

Q&A

Page 48: Membase Intro from Membase Meetup San Francisco