AWS re:Invent 2016: Learn how IFTTT uses ElastiCache for Redis to predict events and index terabytes...

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Nicholas Silva, IFTTT

Darin Briskman, Amazon Web Services

DAT317: Learn how IFTTT uses ElastiCache for Redis to predict events and index terabytes of logs


In-Memory Key-Value Store

High-performance

Redis and Memcached

Fully managed; Zero admin

Highly Available and Reliable

Hardened by Amazon

Redis – the fast in-memory database

• Powerful: ~200 commands + Lua scripting

• In-memory database

• Utility data structures: strings, lists, hashes, sets, sorted sets, bitmaps & HyperLogLogs

• Simple

• Atomic operations: supports transactions, has ACID properties

• Ridiculously fast! <500 microsecond latency for most commands

• Highly Available: replication

• Persistent: snapshots or append-only log

• Open Source

Data Types for Rapid Development

| Redis Data Type | Contains | Read/write ability |
|---|---|---|
| String | Binary-safe strings (up to 512 MB), integers or floating-point values, bitmaps | Operate on the whole string or parts, increment/decrement the integers and floats, get/set bits by position |
| Hash | Unordered hash table of keys to string values | Add, fetch, or remove individual items by key, fetch the whole hash |
| List | Doubly linked list of strings | Push or pop items from both ends, trim based on offsets, read individual or multiple items, find or remove items by value |
| Set | Unordered collection of unique strings | Add, fetch, or remove individual items, check membership, intersect, union, difference, fetch random items |
| Sorted Set | Ordered mapping of string members to floating-point scores, ordered by score | Add, fetch, or remove individual items, fetch items based on score ranges or member value |
| Geospatial index | Sorted set implementation using geospatial information as the score | Add, fetch, or remove individual items, search by coordinates and radius, calculate distance |
| HyperLogLog | Probabilistic data structure to count unique things using 12 KB of memory | Add individual or multiple items, get the cardinality |


Source: https://cs.brown.edu/courses/cs227/archives/2011/slides/mar07-redis.pdf

Some Data Type Examples

Source: http://www.slideshare.net/FedericoDanielColomb/redis-introduction-54750742

One connection, countless possibilities

• 43 Million Applets created

• 9.5 Million Users on the platform

• 360+ Services launched

• 1 Billion Runs per month

• 80 Million Service activations

• 1 Region (us-east-1)

• 4 Availability Zones

• 300+ EC2 Spot Instances

• 60+ ElastiCache Nodes

• 1 DevOps Engineer

Applet Optimization

IFTTT Applet Optimization

• Applets consist primarily of triggers and actions

• Instant triggers are inbound calls to IFTTT

• Strive for as little delay as possible

• Many services do not support instant triggers

IFTTT Applet Optimization

• Polling...

IFTTT Data Storage Requirements

• Fast

• Native complex data types (sets, bitmaps)

• Atomic transactions across multiple keys

IFTTT Applet Prediction

• Applet run data published to Kinesis

• Kinesis consumer writes to ElastiCache

• Redis Bitmap - a bit for every minute of the day (1440)

IFTTT Applet Prediction

• Kinesis events are processed and written to ElastiCache

• Key name is a combination of applet id and date

• Value is a bitmap of minutes. If an event happens at 1 a.m. (minute 60 of the day), "SETBIT key 60 1".

• Schedule predictor uses ML to come up with a schedule and exposes an API internally
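
The bitmap write described above can be sketched in a few lines. This is only an illustration under stated assumptions: the key layout `applet:<id>:<date>` and the helper names are hypothetical, since the slides say only that the key combines the applet id and the date. The sketch builds the command as a tuple rather than talking to a Redis server.

```python
from datetime import datetime, timezone

MINUTES_PER_DAY = 1440  # one bit per minute of the day, as on the slide


def bitmap_key(applet_id: str, when: datetime) -> str:
    """Hypothetical key layout: applet id plus calendar date."""
    return f"applet:{applet_id}:{when:%Y-%m-%d}"


def minute_offset(when: datetime) -> int:
    """Bit position for the minute of the day, in the range 0..1439."""
    return when.hour * 60 + when.minute


def setbit_command(applet_id: str, when: datetime) -> tuple:
    """Build the SETBIT command a consumer could send for one run event."""
    return ("SETBIT", bitmap_key(applet_id, when), minute_offset(when), 1)


# An event at 1 a.m. maps to bit 60, matching the slide's example.
event = datetime(2016, 11, 30, 1, 0, tzinfo=timezone.utc)
print(setbit_command("12345", event))
```

With a client library such as redis-py, the same write would likely be a single `setbit(key, offset, 1)` call. Which time zone defines "minute of the day" is not stated in the slides; UTC is assumed here.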

IFTTT Applet Prediction

Polling Frequency:

_____|_____|_____|_____|_____|_____|_____|_____|____

Actual Event Schedule:

_____________|___________|________________|______

Predicted Applet Schedule:

_____|______||||||_________|||||_______|______||||||_____

IFTTT Applet Prediction

• Prediction algorithm uses historical data to return a daily schedule for each applet

• Schedule is transformed into a key for each minute of the day with a list of applets to check

• Applet enqueuer reads the entire schedule for the minute
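
The transformation from per-applet schedules to per-minute lists is a simple pivot. The sketch below shows one way it might look in plain Python; the function name and the input shape (applet id mapped to a list of minute offsets) are assumptions made for illustration, not the format IFTTT's predictor actually returns.

```python
from collections import defaultdict
from typing import Dict, List, Set

MINUTES_PER_DAY = 1440


def pivot_schedules(schedules: Dict[str, List[int]]) -> Dict[int, Set[str]]:
    """Turn {applet_id: [minutes]} into {minute: {applet_ids}}."""
    by_minute: Dict[int, Set[str]] = defaultdict(set)
    for applet_id, minutes in schedules.items():
        for minute in minutes:
            if not 0 <= minute < MINUTES_PER_DAY:
                raise ValueError(f"minute out of range: {minute}")
            by_minute[minute].add(applet_id)
    return dict(by_minute)


# Hypothetical predicted schedules for two applets.
predicted = {"a": [60, 61], "b": [61, 900]}
per_minute = pivot_schedules(predicted)
print(sorted(per_minute[61]))
```

After the pivot, the enqueuer needs only one lookup per minute instead of scanning every applet's schedule.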

IFTTT Applet Scheduling

• Enqueueing service crawls all Applets, refreshing their schedules

• Writes a pivoted schedule back to Redis sets

• Each fetch results in a Lua script of SADD, SREM, and SISMEMBER

• New structure is 1440 sets (one for each minute of the day)

• Enqueuer gets the set for the current minute and enqueues its applet IDs
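
When an applet's schedule is refreshed, only the minutes that changed need to be touched. The sketch below computes which SREM and SADD commands such a refresh might issue. It is a Python model of the set bookkeeping only: the slides say the real update runs as a Lua script inside Redis, and the key name `schedule:minute:<n>` is a hypothetical choice.

```python
from typing import Iterable, List, Tuple


def minute_key(minute: int) -> str:
    """Hypothetical name for one of the 1440 per-minute sets."""
    return f"schedule:minute:{minute}"


def schedule_update_commands(
    applet_id: str, old_minutes: Iterable[int], new_minutes: Iterable[int]
) -> List[Tuple[str, str, str]]:
    """List the set commands that move an applet from its old schedule to its new one."""
    old, new = set(old_minutes), set(new_minutes)
    commands = []
    # Remove the applet from minutes it no longer polls at.
    for minute in sorted(old - new):
        commands.append(("SREM", minute_key(minute), applet_id))
    # Add the applet to minutes that are new in the refreshed schedule.
    for minute in sorted(new - old):
        commands.append(("SADD", minute_key(minute), applet_id))
    return commands


for command in schedule_update_commands("a", [60, 61], [61, 62]):
    print(command)
```

Running the commands atomically, as a Lua script would, keeps the enqueuer from seeing a half-updated schedule.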

IFTTT Applet Optimization

• Significant reduction in average polling frequency

• Significant reduction in delay time for Applets

• Smoother workload (less spiky due to enqueuer changes)

• Fast, scalable, and flexible data storage that exceeded project requirements

Applet Logs

IFTTT Applet Logs

• Every Applet transaction is logged (run, error, deactivated, etc.)

• Applet logs are accessible by users

• Indexed by user, applet, and service

IFTTT Applet Log Requirements

• Must be affordable and scalable

• Good user experience when displaying logs

• Future-proof: additional indexes can be added later

IFTTT Writing Applet Logs

• Applet run data published to Kinesis

• Kinesis consumer simultaneously writes chunks of data to S3 and indexes to ElastiCache

• Can add indexes going forward or reindex backwards
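
The write path above can be modeled with two dictionaries standing in for S3 and ElastiCache. Everything named here is an assumption for illustration: the record fields, the `logs:<kind>:<id>` index keys, and the `chunks/<id>.json` object layout do not come from the slides, which say only that logs are indexed by user, applet, and service.

```python
import json
from collections import defaultdict

s3 = {}                       # stand-in for S3: object key -> serialized chunk
indexes = defaultdict(list)   # stand-in for Redis: index key -> chunk locations


def index_keys(record: dict) -> list:
    """Hypothetical index keys for one log record (user, applet, service)."""
    return [
        f"logs:user:{record['user_id']}",
        f"logs:applet:{record['applet_id']}",
        f"logs:service:{record['service']}",
    ]


def write_chunk(chunk_id: str, records: list) -> str:
    """Store a chunk of records, then point every relevant index at it."""
    location = f"chunks/{chunk_id}.json"
    s3[location] = json.dumps(records)
    for key in sorted({k for record in records for k in index_keys(record)}):
        indexes[key].append(location)
    return location


records = [
    {"user_id": "1", "applet_id": "10", "service": "weather", "event": "run"},
    {"user_id": "1", "applet_id": "11", "service": "weather", "event": "error"},
]
write_chunk("0001", records)
print(sorted(indexes))
```

Because the chunks in S3 hold the full records, a new index can be built later by replaying the chunks, which is what makes reindexing backwards possible.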

IFTTT Reading Applet Logs

• Look up index in ElastiCache, fetch chunk locations

• Read data from S3 and cache recent reads in ElastiCache for performance
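
The read path (look up the index, fetch chunk locations, read from S3, cache recent reads) can be sketched as a read-through cache. The dictionaries below are in-memory stand-ins for S3 and ElastiCache, and all key names and sample data are hypothetical.

```python
import json

# In-memory stand-ins with hypothetical sample data.
s3 = {"chunks/0001.json": '[{"applet_id": "10", "event": "run"}]'}
index = {"logs:applet:10": ["chunks/0001.json"]}
cache = {}                    # recently read chunks, keyed by location
stats = {"s3_reads": 0}       # counts how often the slow store is hit


def fetch_chunk(location: str) -> list:
    """Return a chunk from the cache, falling back to S3 on a miss."""
    if location in cache:
        return cache[location]
    stats["s3_reads"] += 1
    chunk = json.loads(s3[location])
    cache[location] = chunk
    return chunk


def read_logs(index_key: str) -> list:
    """Resolve an index to chunk locations, then gather their records."""
    records = []
    for location in index.get(index_key, []):
        records.extend(fetch_chunk(location))
    return records


first = read_logs("logs:applet:10")
second = read_logs("logs:applet:10")  # served from the cache
print(first, stats["s3_reads"])
```

A real cache would also need an expiry or size limit so that only recent reads stay in memory; that policy is not described in the slides and is omitted here.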

IFTTT Reading Applet Logs

• Indexes keep most recent X items

• When an index gets too large, we truncate the index with Lua and store to S3
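
The truncation rule can be illustrated by splitting an index into the entries that stay hot and the entries that move to S3. This is a plain Python model of the split only; per the slides, the real truncation runs as a Lua script in Redis. The limit of 3 is a placeholder for the slide's unspecified "X", and entries are assumed to be ordered oldest to newest.

```python
from typing import List, Tuple

MAX_ITEMS = 3  # placeholder for the slide's "X"; the real value is not given


def truncate_index(entries: List[str], max_items: int = MAX_ITEMS) -> Tuple[List[str], List[str]]:
    """Split entries into (kept, archived), keeping the newest max_items."""
    if max_items < 0:
        raise ValueError("max_items must be non-negative")
    cut = max(0, len(entries) - max_items)
    return entries[cut:], entries[:cut]


# Five chunk locations, oldest first; the two oldest would be archived.
kept, archived = truncate_index(["c1", "c2", "c3", "c4", "c5"])
print(kept, archived)
```

Doing the split and the delete in one server-side script avoids a window in which a concurrent writer could append to an index that is being trimmed.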


IFTTT Applet Logs

• S3 cold storage for long-term archiving

• ElastiCache Redis for "hot" data and indexes

• Fast-loading for users while still affordable and scalable

• Costs scale with number of users instead of growing out of hand

• Applets or entire user accounts can be re-indexed from S3

• Future indexes can be added without hassle

Thank you!

Remember to complete your evaluations!