Scylla: 1 Million CQL operations per second per server
Scylla: A new NoSQL Database
Capable of 1,000,000 operations per second PER NODE
With predictable, low latencies
Compatible with Apache Cassandra
FULLY COMPATIBLE
❏ Uses Cassandra SSTables
❏ Use your existing drivers
❏ Use your existing CQL queries
❏ Use your existing cassandra.yaml
❏ Manage with nodetool or other JMX console
❏ Use your existing code with no change
❏ Copy over a complete Cassandra database
❏ Works with the Cassandra ecosystem (Spark etc.)
WHAT WOULD YOU DO WITH 1 MILLION TPS?
❏ Shrink your cluster by a factor of 10X
❏ Handle 10X traffic spikes on Black Friday
❏ Model your data instead of using Cassandra as K/V
❏ Get the most out of your data: run more queries
❏ Maintain your clusters while serving
❏ Stop using caches in front of the database
SCYLLA IS QUITE DIFFERENT
❏ Shard-per-core, no locks, no threads, zero-copy
❏ Based on the Seastar C++ application framework
❏ Efficient, unified DB cache (not using Linux page cache)
❏ CQL-oriented storage engine
❏ Exploit all hardware resources: NUMA, multiqueue NICs, etc.
SCYLLA DB: ARCHITECTURE COMPARISON
● KVM was invented by Avi in 2006; development was managed by Dor
● It was a new hypervisor that arrived after VMware and Xen had come to dominate the market
● Through smart design choices and by leveraging Linux and the hardware, it became the best-performing hypervisor
  ○ KVM holds the SPECvirt performance record
  ○ KVM holds the max IOPS record
● The Open Virtualization Alliance includes hundreds of companies, including HP, IBM, Intel, AMD, Red Hat, etc.
● KVM is the engine behind many clouds such as OpenStack, IBM, NTT, Fujitsu, HP, Google, DigitalOcean, etc.
[Figure: Traditional stack vs. Scylla sharded stack]
Traditional stack: Cassandra threads in the application layer; TCP/IP, scheduler, and NIC queues in the kernel; memory.
● Lock contention
● Cache contention
● NUMA unfriendly
Scylla sharded stack: one userspace stack per core, each with core database (application), TCP/IP, task scheduler with queues and smp queue, NIC queue, and DPDK. The kernel isn't involved.
● No contention
● Linear scaling
● NUMA friendly
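The shard-per-core idea in the diagram can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not Scylla's or Seastar's code: `Shard`, `ShardedStore`, and `shard_of` are hypothetical names, the hash is plain `std::hash`, and everything runs on one thread. In a real sharded design each shard would be pinned to its own core and requests for another shard would be forwarded over a message queue, which is why no locks are needed around a shard's data.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical shard: a private slice of the data owned by exactly one core.
struct Shard {
    std::unordered_map<std::string, std::string> rows;
};

// Hypothetical store that routes every key to exactly one shard.
class ShardedStore {
public:
    explicit ShardedStore(std::size_t shard_count) : shards_(shard_count) {}

    // The owning shard is a pure function of the key, so any core can
    // compute where a request belongs without consulting shared state.
    std::size_t shard_of(const std::string& key) const {
        return std::hash<std::string>{}(key) % shards_.size();
    }

    void put(const std::string& key, std::string value) {
        shards_[shard_of(key)].rows[key] = std::move(value);
    }

    // Returns nullptr when the key is absent.
    const std::string* get(const std::string& key) const {
        const auto& rows = shards_[shard_of(key)].rows;
        auto it = rows.find(key);
        return it == rows.end() ? nullptr : &it->second;
    }

    std::size_t total_rows() const {
        std::size_t n = 0;
        for (const auto& s : shards_) n += s.rows.size();
        return n;
    }

private:
    std::vector<Shard> shards_;
};
```

Because a key always maps to the same shard, two cores never touch the same row, which is the property the "no contention" label refers to.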
Scylla has its own task scheduler
[Figure: Traditional stack vs. Scylla stack]
Traditional stack: each CPU has a scheduler that runs threads, each thread with its own stack.
● Thread is a function pointer
● Stack is a byte array from 64k to megabytes
● Context switch cost is high. Large stacks pollute the caches
Scylla stack: each CPU runs many promises and tasks.
● Promise is a pointer to an eventually computed value
● Task is a pointer to a lambda function
● No sharing, millions of parallel events
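The promise/task model described on this slide can be illustrated with a small single-threaded sketch. It is not Seastar's implementation: `TaskScheduler` and `SimplePromise` are hypothetical names, and the promise supports only one continuation. The point it shows is that a pending operation costs a small heap object and a queued lambda instead of a thread with its own stack.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical per-core scheduler: a queue of small tasks (lambdas),
// each run to completion with no context switch.
class TaskScheduler {
public:
    void schedule(std::function<void()> task) {
        tasks_.push_back(std::move(task));
    }

    // Runs queued tasks until the queue is empty; returns how many ran.
    std::size_t run() {
        std::size_t executed = 0;
        while (!tasks_.empty()) {
            auto task = std::move(tasks_.front());
            tasks_.pop_front();
            task();
            ++executed;
        }
        return executed;
    }

private:
    std::deque<std::function<void()>> tasks_;
};

// Hypothetical promise: holds an eventually computed value and one
// continuation; once both are present, the continuation becomes a task.
template <typename T>
class SimplePromise {
public:
    explicit SimplePromise(TaskScheduler& sched) : sched_(sched) {}

    void then(std::function<void(T)> continuation) {
        continuation_ = std::move(continuation);
        maybe_schedule();
    }

    void set_value(T value) {
        value_ = std::make_unique<T>(std::move(value));
        maybe_schedule();
    }

private:
    void maybe_schedule() {
        if (!value_ || !continuation_) return;
        // The continuation is queued, never invoked inline.
        sched_.schedule([k = std::move(continuation_),
                         v = std::move(*value_)]() mutable { k(std::move(v)); });
        value_.reset();
        continuation_ = nullptr;
    }

    TaskScheduler& sched_;
    std::unique_ptr<T> value_;
    std::function<void(T)> continuation_;
};
```

In use, a caller attaches a continuation with `then`, some later event calls `set_value`, and the continuation executes the next time the scheduler's `run` loop drains the queue.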
Blasting out I/O operations
future<>
make_data_requests(digest_resolver_ptr resolver, targets_iterator begin, targets_iterator end) {
    return parallel_for_each(begin, end, [this, resolver = std::move(resolver)] (gms::inet_address ep) {
        return make_data_request(ep).then_wrapped([resolver, ep] (future<foreign_ptr<lw_shared_ptr<query::result>>> f) {
            try {
                resolver->add_data(ep, f.get0());
            } catch (...) {
                resolver->error(ep, std::current_exception());
            }
        });
    });
}
Unified cache
[Figure: Cassandra vs. Scylla caching layers]
Cassandra: Key cache, Row cache, On-heap / Off-heap, Linux page cache, SSTables
Scylla: Unified cache, SSTables
Callouts on the diagram: Tuning, Caching unparsed data, Parasitic rows, Page faults
Work in progress
❏ Implement missing CQL features
❏ Stabilize clustering
❏ Complete nodetool support
❏ Spark integration, other connectors
❏ Thrift support
❏ Authentication and encryption
OUTLOOK
❏ Open source @ GitHub
❏ In Beta
❏ Try it out! RPM, Docker images available
❏ Live Demo @ ScyllaDB booth!
❏ → http://scylladb.com