Deep Dive into Amazon ElastiCache Architecture and Design Patterns (DAT307) | AWS re:Invent 2013

© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.

DAT307 - Deep Dive into Amazon ElastiCache

Architecture and Design Patterns

Nate Wiger, Principal Solutions Architect

November 14, 2013

Contents

• Caching: What’s all this then?

• Amazon ElastiCache

• Laziness, impatience, and hubris

• From one to a dozen nodes

• Memcached vs. Redis showdown

Device Fragmentation

• Phones, tablets, PCs, toasters

• HTML, apps, JSON APIs

• Presentation differs

• Data is the same

• CDN for static images, videos

• Doesn’t help “Welcome Back, Kotter!”

Death By 1000 Queries

• Login, session

• New messages, recent posts

• Calls to Facebook, Twitter APIs

• Your friends love the new Coldplay album!!!

• Sudden viral traffic spikes

cache (noun)

a group of things that have been stored in a secret place because they are illegal or have been stolen

Typical Web 2.0 App

[Diagram: ELB, App, External APIs]

Amazon ElastiCache

• Managed cache service

• Memcached or Redis

• Launch cluster of nodes

• Scale up / down

• Monitoring + alerts

Memcached

• In-memory

• Slab allocator

• Multithreaded

• No persistence

• Gold standard

Fire It Up

Wire It Up

# Ruby
require 'dalli'

cache = Dalli::Client.new([
  'mycache.z2vq55.0001.usw2.cache.amazonaws.com:11211',
  'mycache.z2vq55.0002.usw2.cache.amazonaws.com:11211'
])

cache.set("some_key", "Some value")
value = cache.get("some_key")

cache.set("another_key", 3)
cache.delete("another_key")

Multiple Cache Nodes

[Diagram: ELB, App, External APIs]

Sharding Across Nodes

server_list = [
  # cache node endpoints
]
server_index = hash(key) % server_list.length
server = server_list[server_index]

BAD: adding or removing a node changes server_list.length, so most keys map to a different server

Consistent Hashing
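
The ring itself is only a picture on the slide. A minimal sketch of the idea follows, assuming MD5 as the hash function and 100 virtual points per node. The `HashRing` class, the `moved_keys` helper, and the node names are hypothetical and are not taken from any client library. It compares how many keys change owner when a fourth node is added, against the modulo scheme from the previous slide.

```python
import bisect
import hashlib


def _hash(value):
    # Map a string to a large integer position on the ring.
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hash ring with virtual points per node."""

    def __init__(self, nodes, points_per_node=100):
        self._points = []   # sorted ring positions
        self._owner = {}    # ring position -> node name
        for node in nodes:
            for i in range(points_per_node):
                position = _hash("%s#%d" % (node, i))
                self._owner[position] = node
                bisect.insort(self._points, position)

    def get_node(self, key):
        # Walk clockwise to the first point at or after the key's position,
        # wrapping around to the start of the ring if necessary.
        index = bisect.bisect_left(self._points, _hash(key))
        if index == len(self._points):
            index = 0
        return self._owner[self._points[index]]


def moved_keys(before, after, keys):
    # Keys whose owning node differs between two lookup functions.
    return [k for k in keys if before(k) != after(k)]


keys = ["user:%d" % i for i in range(1000)]
three = ["node-a", "node-b", "node-c"]
four = three + ["node-d"]

ring3, ring4 = HashRing(three), HashRing(four)
ring_moved = moved_keys(ring3.get_node, ring4.get_node, keys)

# Modulo sharding, for comparison.
modulo_moved = moved_keys(
    lambda k: three[_hash(k) % len(three)],
    lambda k: four[_hash(k) % len(four)],
    keys,
)

print("ring moved:", len(ring_moved), "modulo moved:", len(modulo_moved))
```

With the ring, the only keys that move are the ones the new node takes over; every other key keeps its cached value.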

It’s All Been Done Before

• Ruby – Dalli

• Python – HashRing / MemcacheRing

• Node.js – node-memcached

• PHP – libketama or ElastiCache Client

• Java – SpyMemcached or ElastiCache Client

https://github.com/mperham/dalli

https://pypi.python.org/pypi/hash_ring/

https://github.com/3rd-Eden/node-memcached



https://github.com/RJ/ketama

http://docs.aws.amazon.com/AmazonElastiCache/latest/UserGuide/AutoDiscovery.html

https://code.google.com/p/spymemcached/

https://github.com/amazonwebservices/aws-elasticache-cluster-client-memcached-for-java

So Far

• Launched a cache cluster

• Got the node names

• Connected our client

• Figured out sharding

What To Cache?

• Everything!

• Database records

• Full HTML pages

• Page fragments

• Remote API calls

How To Cache It?

• Lazy population

• Write-through

• Timed refresh

Laziness is a Virtue

# Python
def get_user(user_id):
    record = cache.get(user_id)
    if record is None:
        # Run a DB query
        record = db.query("select * from users where id = ?", user_id)
        cache.set(user_id, record)
    return record

# App code
user = get_user(17)

Ship It

• Most data is never accessed

• Ensures cache is filled

• Caches fail and scale

• But cache miss penalty

• Best approach for most data

Foresight is 20-20

# Python
def save_user(user_id, values):
    record = db.query("update users ... where id = ?", user_id, values)
    cache.set(user_id, record)
    return record

# App code
user = save_user(17, {"name": "Nate Dogg"})

Laziness vs. Impatience

• Ensures cache is always current

• Write penalty vs. read penalty

• But missing data on scale up

• Plus excess data / cache churn

• Still need lazy fetch too

Combo Move!

def save_user(user_id, values):
    record = db.query("update users ... where id = ?", user_id, values)
    cache.set(user_id, record, 300)  # ttl
    return record

def get_user(user_id):
    record = cache.get(user_id)
    if record is None:
        record = db.query("select * from users where id = ?", user_id)
        cache.set(user_id, record, 300)  # ttl
    return record

# App code
save_user(17, {"name": "Nate Diddy"})
user = get_user(17)

Timed Refresh

• Run job to periodically update cache

• Good for Top-N lists

• Time-intensive rankings

• Trending items
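
A timed refresh could be sketched roughly as below. Everything here is an assumption for illustration: `DictCache` is an in-memory stand-in for a real cache client, `fetch_scores` stands in for an expensive ranking query, and the `top_players` key name is made up.

```python
import heapq
import time


class DictCache:
    """Hypothetical in-memory stand-in for a Memcached client."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


def refresh_top_n(cache, fetch_scores, n=3, key="top_players"):
    # fetch_scores stands in for the expensive ranking query.
    scores = fetch_scores()
    top = heapq.nlargest(n, scores.items(), key=lambda item: item[1])
    cache.set(key, top)
    return top


def run_refresh_job(cache, fetch_scores, interval_seconds, iterations):
    # A real job would loop forever (or run from cron); the iteration
    # count keeps this sketch finite.
    for _ in range(iterations):
        refresh_top_n(cache, fetch_scores)
        time.sleep(interval_seconds)


def fetch_scores():
    # Placeholder data; in practice this would be a slow database query.
    return {"Andy": 556, "Barry": 819, "Carl": 105, "Derek": 1312}


cache = DictCache()
run_refresh_job(cache, fetch_scores, interval_seconds=0, iterations=2)
print(cache.get("top_players"))
```

Readers only ever hit the cache, so the cost of the ranking query is paid once per interval rather than once per request.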

Monitoring + Alerts

Monitoring

• Integration with CloudWatch metrics

• Set up alarms to send notifications via email

• Memory usage

• Evictions

• Which ElastiCache metrics should I monitor?

http://aws.amazon.com/cloudwatch/

http://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/AlarmThatSendsEmail.html

http://docs.aws.amazon.com/AmazonElastiCache/latest/UserGuide/CacheMetrics.WhichShouldIMonitor.html
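
As a rough sketch, an eviction alarm could be described with the parameters below. The helper name `eviction_alarm_params`, the cluster ID, the topic ARN, and the threshold of 1000 evictions per five minutes are all assumptions chosen for illustration; pick a threshold that suits your workload.

```python
def eviction_alarm_params(cluster_id, topic_arn, threshold=1000):
    # Keyword arguments for the CloudWatch PutMetricAlarm API.
    # The threshold and period are illustrative assumptions.
    return {
        "AlarmName": "%s-evictions" % cluster_id,
        "Namespace": "AWS/ElastiCache",
        "MetricName": "Evictions",
        "Dimensions": [{"Name": "CacheClusterId", "Value": cluster_id}],
        "Statistic": "Sum",
        "Period": 300,
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],
    }


params = eviction_alarm_params(
    "mycache", "arn:aws:sns:us-west-2:123456789012:cache-alerts"
)
print(params["MetricName"], params["Threshold"])
```

With boto3 installed and credentials configured, `boto3.client("cloudwatch").put_metric_alarm(**params)` would create the alarm; an email subscription on the SNS topic then delivers the alert.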




Node Discovery

• Set up an Amazon SNS topic for ElastiCache

• Have app listen for events

– ElastiCache:AddCacheNodeComplete

– ElastiCache:RemoveCacheNodeComplete

• Reconfigure connections

• See Event Notifications and Amazon SNS

http://docs.aws.amazon.com/AmazonElastiCache/latest/UserGuide/ManagingCacheClusters.html#ManagingCacheClusters.SNS

http://docs.aws.amazon.com/AmazonElastiCache/latest/UserGuide/ElastiCacheSNS.html
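
The "listen, then reconfigure" step might look something like the sketch below. `CachePool`, `list_nodes`, and `connect` are hypothetical names, the endpoints are placeholders, and the handler only checks whether the notification text mentions one of the two event names, so it makes no assumption about the exact message format.

```python
NODE_EVENTS = (
    "ElastiCache:AddCacheNodeComplete",
    "ElastiCache:RemoveCacheNodeComplete",
)


class CachePool:
    """Hypothetical holder for the app's current cache connection."""

    def __init__(self, list_nodes, connect):
        self._list_nodes = list_nodes   # returns the current node endpoints
        self._connect = connect         # builds a client from endpoints
        self.client = connect(list_nodes())

    def on_notification(self, message_text):
        # Reconnect only when the message mentions a node add/remove event.
        if any(event in message_text for event in NODE_EVENTS):
            self.client = self._connect(self._list_nodes())
            return True
        return False


# Placeholder endpoints; tuple() stands in for a real client constructor.
nodes = ["node-1.example:11211", "node-2.example:11211"]
pool = CachePool(lambda: list(nodes), tuple)

nodes.append("node-3.example:11211")
changed = pool.on_notification("ElastiCache:AddCacheNodeComplete : mycache")
ignored = pool.on_notification("some unrelated notification")
print(changed, ignored, len(pool.client))
```

In a real app, `list_nodes` could call the ElastiCache DescribeCacheClusters API and `connect` would build the Memcached client from the returned endpoints.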

Programmable Scaling

[Diagram: ELB, App, External APIs, SNS; Add Node]

Node Auto-Discovery

# PHP
$server_endpoint = "mycache.z2vq55.cfg.usw2.cache.amazonaws.com";
$server_port = 11211;

$cache = new Memcached();
$cache->setOption(
    Memcached::OPT_CLIENT_MODE, Memcached::DYNAMIC_CLIENT_MODE);

# Set config endpoint as only server
$cache->addServer($server_endpoint, $server_port);

# Lib auto-locates nodes
$cache->set("key", "value");

Redis

• Also in-memory

• Advanced data types

• Atomic operations

• Single-threaded

• Persistence

• Read replicas

http://redis.io/topics/data-types

Leaderboard with Sorted Sets

ZADD leaderboard 556 "Andy"
ZADD leaderboard 819 "Barry"
ZADD leaderboard 105 "Carl"
ZADD leaderboard 1312 "Derek"

ZREVRANGE leaderboard 0 -1
1) "Derek"
2) "Barry"
3) "Andy"
4) "Carl"

ZREVRANK leaderboard "Barry"
(integer) 1

Follow the Leader

def save_score(user, score):
    record = db.query("update users ... where id = ?", user, score)
    # redis-py 3.0+ takes a {member: score} mapping
    redis.zadd("leaderboard", {user: score})

def get_rank(user):
    # zrevrank is zero-based, so add 1 for a human-readable rank
    return redis.zrevrank("leaderboard", user) + 1

# App code
save_score("Andy", 556)
save_score("Barry", 819)
save_score("Carl", 105)
save_score("Derek", 1312)

get_rank("Barry")  # 2

Redis Replicas

[Diagram: ELB, App, External APIs; Replication Group with Reads and Writes]
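
Splitting reads and writes across a replication group could be sketched as follows. `ReplicatedClient` and `FakeNode` are hypothetical; the fake nodes share one dictionary to mimic replication, whereas real Redis replication is asynchronous, so a replica read can briefly return stale data.

```python
import itertools


class ReplicatedClient:
    """Hypothetical wrapper: writes go to the primary, reads rotate over replicas."""

    def __init__(self, primary, replicas):
        self._primary = primary
        # Fall back to the primary when there are no replicas.
        self._replicas = itertools.cycle(replicas or [primary])

    def set(self, key, value):
        return self._primary.set(key, value)

    def get(self, key):
        return next(self._replicas).get(key)


class FakeNode:
    """In-memory stand-in for a Redis connection; all nodes share one store."""

    def __init__(self, name, store):
        self.name, self.store, self.reads = name, store, 0

    def set(self, key, value):
        self.store[key] = value
        return True

    def get(self, key):
        self.reads += 1
        return self.store.get(key)


store = {}
primary = FakeNode("primary", store)
replicas = [FakeNode("replica-1", store), FakeNode("replica-2", store)]
client = ReplicatedClient(primary, replicas)

client.set("greeting", "hello")
values = [client.get("greeting") for _ in range(4)]
print(values, primary.reads, [r.reads for r in replicas])
```

Any objects with `get` and `set` methods would work in place of `FakeNode`, for example one connection per endpoint in the replication group.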

Redis Sharding

• Same concept as Memcached

• BUT

• Can't shard

– Lists

– Sets / sorted sets

– Hashes

• Require single in-memory structure

Anti-Pattern: Dedicated Nodes

• Spawn multiple nodes

• Use for different features

– Leaderboard

– Counters

• Can still shard key-value ops

Dedicated Redis Nodes

[Diagram: ELB, App, External APIs; Counters, Leaderboard]

Summary

• Caching is good

• Good caching is hard

• ElastiCache eases deployment

• Memcached or Redis

• More to come

