Introduction to Aerospike
Transcript of Introduction to Aerospike
© 2014 Aerospike. All rights reserved. Confidential 1
Aerospike (aer·o·spike) [air-oh-spahyk] noun, 1. tip of a rocket that enhances speed and stability
ROCKET ENGINE FOR CONTEXT DRIVEN APPS THAT PERSONALIZE THE INTERNET
BY YOUNG PAIK, DIRECTOR SALES ENGINEERING, AEROSPIKE
Aerospike NoSQL Database
AGENDA
1. How the game has changed, driving need for next-gen NoSQL
2. Who uses Aerospike and why
3. Architecture Overview
Internet Enterprises have changed the game…
Simple, Personalized, Instant
Complex, Standardized, Silo-ed
Consumers Expect and “Want it All”
1. Instant Response
■ “Every 100ms latency costs Amazon 1% in sales” – Greg Linden, Amazon
■ “An extra ½ sec in search page generation dropped traffic 20%” – Google (average 1.5 sec)
■ “A 1 sec delay can cause 7% decline in conversion” – Walmart
Consumers Expect and “Want it All”
1. Instant Response
2. Intuitive Service
■ Personalized & Consistent across channels
  ■ Mobile devices, tablet, car…
  ■ Web, mobile, social media…
■ Seamless across the business
  ■ Marketing, sales, support…
Consumers Expect and “Want it All”
1. Instant Response
2. Intuitive Service
3. Always-On
■ How much does down-time cost?
Enterprises must “Deliver it All”
■ Use every swipe, search, share to delight - Instantly, Intuitively, Always-On
■ (DIY or SaaS)
CONTEXT
■ IDENTITY
  ■ SessionIDs, Cookies, DeviceIDs, ip-Addr
■ ATTRIBUTES
  ■ Demographic, geographic
■ BEHAVIOR (REAL-TIME)
  ■ Presence, swipe, search, share..
  ■ Channels – web, phone, in-store..
  ■ Services – frequency, sophistication
■ SEGMENTS (PRE-CALCULATED)
  ■ Attitudes, values, lifestyle, history..
■ TRANSACTIONS
  ■ Payments, campaigns
How Big is Real-time Big Context?
■ How many Objects?
  ■ # People * # Devices * # Browsers
  ■ People move around, cookies get cleared..
  ■ * 2x Replication
■ “100M people ≈ 2 Billion cookies” - eBay

                    # People      Per Profile      Size
Customers           10 M       *  25 KB         =    250 GB
Prospects           500 M      *  1 KB          =  + 500 GB
Real-time Context                               =    750 GB
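The sizing arithmetic on this slide can be reproduced with a short sketch. It assumes decimal units (1 KB = 1,000 bytes), which is what makes 10 M * 25 KB come out to 250 GB. Applying the "2x Replication" bullet on top of the 750 GB total is also an assumption about how the slide is meant to be read. The function and variable names are illustrative only.

```python
# Sizing sketch for the "Real-time Context" estimate on the slide.
KB = 1000        # assumption: decimal units
GB = 1000 ** 3


def context_bytes(people: int, bytes_per_profile: int) -> int:
    """Raw storage for one population: number of people times profile size."""
    return people * bytes_per_profile


customers = context_bytes(10_000_000, 25 * KB)   # 10 M customers at 25 KB each
prospects = context_bytes(500_000_000, 1 * KB)   # 500 M prospects at 1 KB each
total = customers + prospects

REPLICATION_FACTOR = 2  # "2x Replication"; applying it to the total is an assumption
replicated = total * REPLICATION_FACTOR

print(f"customers:           {customers / GB:.0f} GB")
print(f"prospects:           {prospects / GB:.0f} GB")
print(f"real-time context:   {total / GB:.0f} GB")
print(f"with 2x replication: {replicated / GB:.0f} GB")
```

Under these assumptions the unreplicated figure matches the slide's 750 GB, and a replication factor of 2 would double the raw capacity needed.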
Aerospike Database – Powering Context-Driven Apps
1. ROCKET ENGINE – In-Memory, Flash-optimized
2. WEB SCALE – Distributed, Shared nothing
3. ACID RELIABILITY – Immediate Consistency, High Availability
4. NoSQL FLEXIBILITY – Distributed Queries, Real-time Analytics
■ Apps that Personalize the Internet Instantly, Intuitively & Always-On
New Architecture for Context Computing
[Diagram labels]
■ MILLIONS OF CONSUMERS, BILLIONS OF DEVICES
■ APP SERVERS
■ AEROSPIKE CLUSTER, RDBMS
■ R/W REAL-TIME CONTEXT
  ■ IDENTITY: Cookies, device ID..
  ■ ATTRIBUTES: Age, Gender..
  ■ BEHAVIOR: Click, search…
  ■ SEGMENTS
■ REAL-TIME ANALYTICS
  ■ QUERIES & AGGREGATIONS: Risk scores, best sellers, trending now…
■ DATA WAREHOUSE, BATCH ANALYTICS
  ■ Calculate models, discover SEGMENTS, e.g. early adopter, bargain hunter, mass affluent…
Pioneered by Ad-Tech
AppNexus, eXelate, Federated Media, [x+1]
Powering the profile store for AppNexus (RTB)
■ “For the last three years, Aerospike’s database has been managing our vast volumes of user data. With Aerospike, we process many terabytes of data daily across our global data centers at a rate in excess of a million requests per second.” – Mike Nolet, CTO
■ “AppNexus* operates at massive scale while paying close attention to the economics of the platform. Aerospike’s flash optimizations running on top of Intel® SSDs have given us the price, performance, reliability, and serviceability we need to grow our business.” – Timothy G Smith, SVP Technical Operations
• 50 Billion Ad + 300 Billion Bid Requests/day for Microsoft Ad Exchange, Interactive Media (Deutsche Telekom), Collective…
• 6 Billion Mobile Ads/day for Millennial Media Exchange
• 100ms SLA from click to view
Powering the eXelate Data Exchange (DMP)
■ 200 publishers/marketers access real-time context on 700 million consumers
  ■ Demographics, purchase intent, behavioral propensities from online/offline sources, e.g. Nielsen, MasterCard Advisors, Bizo
■ Found SQL DBs an order of magnitude too expensive, considered several NoSQL DBs
■ 200 servers ingest 2 TB clickstream data per day to Aerospike and an analytics DWH
■ Models calculated in DWH, loaded into Aerospike:
  ■ 20 TB data, 60 billion transactions per month
  ■ 50/50 balanced reads/writes
  ■ 12 node clusters, 7 SSDs/128 GB DRAM per node
  ■ Data synchronized across clusters in 4 data centers
■ “Aerospike delivered on all of these requirements.” – Elad Efraim, CTO of eXelate
Powering [x+1] Origin Digital Marketing Hub (DMP + DSP)
■ Marketing Hub
  ■ Multi-channel analytics and personalization of messages across touchpoints
  ■ Integrated with leading CRM platforms
  ■ Deployed at many Fortune 500 companies
■ “It was a challenge to find an extremely high-performance, high availability database…
  ■ 4 TB of data, 2 Billion profiles
  ■ 5,000-10,000 attributes per profile analyzed in 4ms
  ■ 10 relevant recommendations suggested in 50ms - each time a visitor clicks on a website
  …providing fast reliable access to data in real-time is simple to say, but it’s not easy to do…. Aerospike has proven that our choice to buy, not build was the right decision.” – Patrick DeAngelis, CTO, [x+1]
Internet-scale Context Computing Platforms
RETAIL, E-COMMERCE, MOBILE, OMNICHANNEL, GAMING, WEB, VIDEO, SOCIAL, SEARCH
Context driven Apps - Use Cases
ADVERTISING
• REAL-TIME BIDDING
• DEMAND SIDE PLATFORM (DSP)
• DATA MGMT PLATFORM (DMP)
• SUPPLY SIDE PLATFORM (SSP)
MARKETING
• MULTI-SCREEN OFFERS
• MULTI-CHANNEL PERSONALIZATION
• REAL-TIME RECOMMENDATIONS
• ONE TIME COUPONS
• LOYALTY REWARDS
• DEALS NEAR YOU
• RELATED ITEMS
SALES
• PRODUCT AVAILABILITY
• DYNAMIC PRICING
• RISK SCORES
• FRAUD PREVENTION
• STREAM ANALYSIS
SUPPORT
• REAL-TIME DASHBOARDS
• PERSONAL FINANCE PORTFOLIOS
• REAL-TIME REPORTS
Next Gen NoSQL
Aerospike Database / Context Computing Platform
1. ROCKET ENGINE – In-Memory, Flash-optimized
2. WEB SCALE – Distributed, Shared nothing
3. ACID RELIABILITY – Immediate Consistency, High Availability
4. NoSQL EXTENSIBILITY – Distributed Queries, Real-time Analytics
■ Powering Context driven Apps that Personalize the Internet Instantly, Intuitively & Always-On!
OTHER DATABASE
OS FILE SYSTEM
PAGE CACHE
BLOCK INTERFACE
HDD SSD
BLOCK INTERFACE
SSD SSD
OPEN NVM
SSD
OTHER DATABASE
FLASH OPTIMIZED IN-MEMORY DATABASE
Ask and I’ll tell you now.
Ask me. I’ll look up the answer and then tell it to you.
AEROSPIKE HYBRID MEMORY SYSTEM™
• Indexes in DRAM
• Data in DRAM / SSD
• Balanced Reads & Writes
• Highly Parallelized
• Lock-free + ACID
ROCKET ENGINE
Aerospike Certification Tool (ACT) for SSDs
■ Industry Standard Flash (SSD / PCI-E) Benchmark
■ Open Source Tool used by Flash Vendors to certify drives
10X Faster for Balanced Read/Write loads
HIGH THROUGHPUT
[Chart: throughput in ops/sec (axis 0 to 350,000) for Balanced and Read-Heavy workloads; series: Aerospike, Cassandra, MongoDB, Couchbase 2.0*]
“*We were forced to exclude Couchbase...since when run with either disk or replica durability on it was unable to complete the test.” – Thumbtack Technology
LOW LATENCY
[Chart: Balanced Workload Read Latency; average latency in ms (axis 0 to 10) vs. throughput in ops/sec (axis 0 to 200,000); series: Aerospike, Cassandra, MongoDB]
[Chart: Balanced Workload Update Latency; average latency in ms (axis 0 to 16) vs. throughput in ops/sec (axis 0 to 200,000); series: Aerospike, Cassandra, MongoDB]
4 Node Cluster, each with:
CPU: 8 x Intel(R) Xeon(R) CPU E5-2665 0 @ 2.40GHz
RAM: 31 GB
SSD: 4 x INTEL SSDSA2CW120G3, 120 GB (94 GB over-provisioned)
HDD: ST500NM0011, 500 GB, SATA III, 7200 RPM
OS: Ubuntu Server 12.04.1 64-bit (Linux kernel v.3.2.0)
Actual customer analysis: 500K TPS, 99% < 1ms, 10 TB Storage, 2x Replication

                                          OTHER DATABASES
                                          DRAM & HDD              SSD & DRAM
Storage /server                           180 GB (196 GB)         2.4 TB (4 x 700 GB)
TPS /server                               500,000                 500,000
Cost /server                              $8,000                  $11,000
Server costs                              $1,488,000              $154,000
Power /server                             0.9 kW                  1.1 kW
Power (2 years), $0.12 per kWh ave. US    $352,000                $32,400
Maintenance (2 years), $3,600 /server     $670,000                $50,400
Total                                     $2,510,000              $236,800
Servers                                   186 SERVERS REQUIRED    ONLY 14 SERVERS
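The cost rows in this comparison follow from the server counts and per-server figures, and a minimal sketch can recompute them. It assumes a two-year period of 2 * 365 * 24 hours and that every server draws its rated power continuously. The class and field names are hypothetical. The slide's totals are rounded, so the computed values agree only approximately.

```python
from dataclasses import dataclass

HOURS_2_YEARS = 2 * 365 * 24      # 17,520 hours, ignoring leap days (assumption)
USD_PER_KWH = 0.12                # "$0.12 per kWh ave. US" from the slide
MAINTENANCE_PER_SERVER = 3_600    # USD per server over 2 years, from the slide


@dataclass
class Deployment:
    name: str
    servers: int
    cost_per_server: float   # USD
    kw_per_server: float     # kW, assumed constant draw

    def server_cost(self) -> float:
        return self.servers * self.cost_per_server

    def power_cost(self) -> float:
        return self.servers * self.kw_per_server * HOURS_2_YEARS * USD_PER_KWH

    def maintenance_cost(self) -> float:
        return self.servers * MAINTENANCE_PER_SERVER

    def total(self) -> float:
        return self.server_cost() + self.power_cost() + self.maintenance_cost()


dram_hdd = Deployment("DRAM & HDD", servers=186, cost_per_server=8_000, kw_per_server=0.9)
ssd_dram = Deployment("SSD & DRAM", servers=14, cost_per_server=11_000, kw_per_server=1.1)

for d in (dram_hdd, ssd_dram):
    print(f"{d.name}: servers ${d.server_cost():,.0f}, power ${d.power_cost():,.0f}, "
          f"maintenance ${d.maintenance_cost():,.0f}, total ${d.total():,.0f}")
```

The sketch takes the server counts (186 and 14) as given; it does not try to derive them from the storage and TPS requirements.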
WEB SCALE
■ Distributed Hash Table with No Hotspots
  ■ Every key hashed with RIPEMD160 into a 20 byte (fixed length) string
  ■ Hash + additional (fixed 64 bytes) data stored in DRAM in the index
  ■ Some bits from hash value are used to calculate the Partition ID (4096 partitions)
  ■ Partition ID maps to Node ID in the cluster
■ 1 Hop to data
  ■ Client just calculates Partition ID, determines Node ID
  ■ No Load Balancers required
■ Shared Nothing architecture
  ■ Every node is identical

Key:  cookie-abcdefg-12345678
Hash: 182023kh15hh3kahdjsh

Partition ID    Master Node ID    Replica Node ID
…               1                 4
1820            2                 3
1821            3                 2
4096            4                 1
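The key-to-node lookup described on this slide can be sketched as follows. This illustrates the idea and is not Aerospike's client code. Which 12 bits of the digest are used, the round-robin partition map, and all function names are assumptions made for the example, and the real digest input may include more than the raw key. Some Python builds ship without RIPEMD-160, so the sketch falls back to SHA-1 (also 20 bytes) purely so that it runs.

```python
import hashlib

N_PARTITIONS = 4096  # partition count from the slide


def key_digest(key: str) -> bytes:
    """Return a 20-byte digest of the key."""
    data = key.encode("utf-8")
    try:
        return hashlib.new("ripemd160", data).digest()
    except ValueError:
        # Some OpenSSL builds disable RIPEMD-160. SHA-1 is only a 20-byte
        # stand-in so the sketch still runs; it is not what the slide describes.
        return hashlib.sha1(data).digest()


def partition_id(digest: bytes) -> int:
    """Take 12 bits of the digest (assumed: low 12 bits of the first two bytes)."""
    return int.from_bytes(digest[:2], "little") & (N_PARTITIONS - 1)


def build_partition_map(nodes):
    """Hypothetical round-robin assignment of (master, replica) per partition."""
    n = len(nodes)
    return [(nodes[p % n], nodes[(p + 1) % n]) for p in range(N_PARTITIONS)]


def locate(key, partition_map):
    """Client-side lookup: hash the key, derive the partition, read the node IDs."""
    pid = partition_id(key_digest(key))
    master, replica = partition_map[pid]
    return pid, master, replica


partition_map = build_partition_map([1, 2, 3, 4])
pid, master, replica = locate("cookie-abcdefg-12345678", partition_map)
print(f"partition {pid}: master node {master}, replica node {replica}")
```

Because the client holds the partition map and computes the partition ID itself, it can address the owning node directly, which is the "1 hop, no load balancers" point on the slide.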
WEB SCALE with ACID RELIABILITY
1) No Hotspots – DHT with RIPEMD160 simplifies data partitioning
2) Smart Client – 1 hop to data, no load balancers
3) Shared Nothing Architecture, every node identical
4) Single row ACID – synch replication in cluster
5) Smart Cluster, Zero Touch – auto-failover, rebalancing, rolling upgrades..
6) Transactions and long running tasks prioritized real-time
7) XDR – asynch replication across data centers ensures Zero Downtime
XDR ensures Zero Downtime
■ Cross Data Center Replication (XDR) enables geographic redundancy and location proximity
■ Maximum flexibility
  ■ Replication set at namespace level
  ■ Active-Passive / Active-Active modes
  ■ Changes in one data center can be
    ■ replicated to multiple data centers
    ■ forwarded to another data center
  ■ Clusters can have different number of nodes
  ■ Automatic failure handling ensures continuity in spite of node failures
■ Superstorm Sandy 2012
  ■ Power outage, NYC Cluster down for 17 hours
  ■ Once power returned, XDR synched in 1 hour
“Aerospike allows us to handle business continuity and reliability across 4 data centers seamlessly.” - Elad Efraim, CTO
Architected for Context Computing
Patents pending
Written in ‘C’
1) ROCKET FAST 2) WEB SCALE 3) ACID RELIABILITY 4) NOSQL FLEXIBILITY

APP SERVER (APP/WEB SERVER)
APPLICATION
AEROSPIKE SMART CLIENT™
• APIs (C, C#, Java, PHP, Python, Ruby, Erlang…)
• Transactions, Cluster awareness

AEROSPIKE SERVER (AEROSPIKE CLUSTER)
EXTENSIBLE DATA MODEL
• Str, Int, Lists, Maps
• Lookups, Queries, Scans
• User Defined Functions
• Distributed Aggregations
MONITORING & MANAGEMENT
• Aerospike Monitoring Console™
• Command Line Tools
• Plugins - Nagios, Graphite, Zabbix
AEROSPIKE SMART CLUSTER™
AEROSPIKE HYBRID MEMORY SYSTEM™
AEROSPIKE (XDR) CROSS DATA CENTER REPLICATION™
AEROSPIKE REAL-TIME ENGINE™
NOSQL EXTENSIBILITY
■ Namespaces (policy containers)
  ■ Determine storage - DRAM or Flash
  ■ Determine replication factor
  ■ Contain records and sets
■ Sets (tables) of records
  ■ Arbitrary grouping
■ Records (rows)
  ■ Max 128k, contain key and bins
  ■ Bin with same name can contain values of different types
    ■ String, integer, bytes (raw, blob, etc)
    ■ list (an ordered collection of values)
    ■ map (a collection of keys and values)
  ■ Bins can be added anytime
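The namespace / set / record / bin hierarchy can be pictured with a small in-memory model. This is a toy sketch in plain Python, not the Aerospike client API: the class names, the put method and the sample data are all hypothetical, and the 128k record limit is not modeled.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Record:
    """A row: one key plus any number of named bins."""
    key: str
    bins: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Namespace:
    """A policy container: storage and replication are set at this level."""
    name: str
    storage: str               # "DRAM" or "Flash"
    replication_factor: int
    sets: Dict[str, Dict[str, Record]] = field(default_factory=dict)

    def put(self, set_name: str, key: str, bins: Dict[str, Any]) -> Record:
        """Create the set and record on first use, then merge in the given bins."""
        records = self.sets.setdefault(set_name, {})
        record = records.setdefault(key, Record(key))
        record.bins.update(bins)
        return record


ns = Namespace("profiles", storage="Flash", replication_factor=2)
ns.put("users", "cookie-1", {"age": 34, "segments": ["early adopter"]})
ns.put("users", "cookie-2", {"age": "unknown"})                  # same bin name, different type
ns.put("users", "cookie-1", {"last_seen": {"channel": "web"}})   # bin added later

print(ns.sets["users"]["cookie-1"].bins)
```

The second and third calls show the two schema-free properties from the slide: the bin "age" holds an integer in one record and a string in another, and a new bin is added to an existing record without any schema change.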
Real-time Analytics on Operational Data
DISTRIBUTED QUERIES
1. “Scatter” requests to all nodes
2. Indexes in DRAM for fast mapping of secondary keys to primary keys
3. Indexes co-located with data to guarantee ACID, manage migrations
4. Records read in parallel from all SSDs using lock free concurrency control
5. Aggregate results on each node
6. “Gather” results from all nodes on client
STREAM AGGREGATIONS
7. Push Code / Security Policies / Rules to Data with UDFs
8. Pipe Query results through UDFs to Filter, Transform, Aggregate.. Map, Reduce
REAL-TIME ANALYTICS on OPERATIONAL DATA (No ETL)
■ In Database, within the same Cluster
■ On the same Data, on XDR Replicated Clusters
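The scatter, per-node aggregate, and gather steps can be illustrated with an in-memory sketch. The Node class, the round-robin placement of records, and the sample purchase data are all hypothetical. A thread pool stands in for the parallel requests a client would send to the cluster, and a plain Python function stands in for a UDF.

```python
from collections import Counter, defaultdict
from concurrent.futures import ThreadPoolExecutor


class Node:
    """One hypothetical node holding its own records and a local secondary index."""

    def __init__(self):
        self.records = {}                 # primary key -> bins
        self.index = defaultdict(set)     # secondary value -> primary keys

    def put(self, key, bins, indexed_bin):
        self.records[key] = bins
        self.index[bins[indexed_bin]].add(key)

    def query(self, value, aggregate):
        """Resolve primary keys through the local index, then aggregate locally."""
        matches = (self.records[k] for k in self.index.get(value, ()))
        return aggregate(matches)


def count_by_item(records):
    """Per-node aggregation step (stand-in for a UDF): purchases per item."""
    return Counter(r["item"] for r in records)


def scatter_gather(nodes, value):
    # Scatter: send the same query to every node in parallel.
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(lambda n: n.query(value, count_by_item), nodes))
    # Gather: merge the per-node partial results on the client.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


nodes = [Node() for _ in range(3)]
purchases = [
    ("p1", "US", "book"),
    ("p2", "US", "phone"),
    ("p3", "DE", "book"),
    ("p4", "US", "book"),
    ("p5", "DE", "phone"),
]
for i, (key, country, item) in enumerate(purchases):
    # Round-robin placement for simplicity; a real cluster places records by key hash.
    nodes[i % len(nodes)].put(key, {"country": country, "item": item}, "country")

best_sellers = scatter_gather(nodes, "US")
print(best_sellers.most_common(1))
```

Each node returns only its small partial count, not the matching records, which is why aggregating next to the data keeps the gather step on the client cheap.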
Aerospike Database / Context Computing Platform
1. ROCKET ENGINE – In-Memory, Flash-optimized
2. WEB SCALE – Distributed, Shared nothing
3. ACID RELIABILITY – Immediate Consistency, High Availability
4. NoSQL EXTENSIBILITY – Distributed Queries, Real-time Analytics
■ Powering Context driven Apps that Personalize the Internet Instantly, Intuitively & Always-On!
Recognized as the only Visionary in Gartner's Magic Quadrant for Operational Database Management Systems
Gartner, Magic Quadrant for Operational Database Management Systems, Donald Feinberg et al., October 23, 2013
This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available at www.aerospike.com. Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings. Gartner research publications consist of the opinions of Gartner's research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.