Summary of the "YCSB" paper for the NoSQL summer reading in Tokyo on Sep 15, 2010

Benchmarking Cloud Serving Systems with YCSB, by Cooper, B.F., Silberstein, A., Tam, E., Ramakrishnan, R., Sears, R. Gemini Mobile Technologies, Inc. NOSQL Tokyo Reading Group (http://nosqlsummer.org/city/tokyo) September 15, 2010. Tags: #ycsb #nosql

description

These are the summary materials for the paper "Benchmarking Cloud Serving Systems with YCSB", prepared for the NoSQL summer reading in Tokyo on September 15, 2010, held at Gemini Mobile Technologies in Shibuya, Tokyo.


Benchmarking Cloud Serving Systems with YCSB

by

Cooper, B.F., Silberstein, A., Tam, E., Ramakrishnan, R., Sears, R.

Gemini Mobile Technologies, Inc.

NOSQL Tokyo Reading Group

(http://nosqlsummer.org/city/tokyo)

September 15, 2010

Tags: #ycsb #nosql

04/10/2023 Gemini Mobile Technologies, Inc. 1


Benchmarking Cloud Serving Systems with YCSB

Authors: Cooper, B.F., Silberstein, A., Tam, E., Ramakrishnan, R., Sears, R.

Abstract: … We present the "Yahoo! Cloud Serving Benchmark" (YCSB) framework, with the goal of facilitating performance comparisons of the new generation of cloud data serving systems. We define a core set of benchmarks and report results for four widely used systems: Cassandra, HBase, Yahoo!'s PNUTS, and a simple sharded MySQL implementation. We also hope to foster the development of additional cloud benchmark suites that represent other classes of applications by making our benchmark tool available via open source. In this regard, a key feature of the YCSB framework/tool is that it is extensible---it supports easy definition of new workloads, in addition to making it easy to benchmark new systems.

Appeared in: ACM Symposium on Cloud Computing, ACM, Indianapolis, IN, USA (2010)

http://research.yahoo.com/files/ycsb.pdf

04/10/2023 Gemini Mobile Technologies, Inc. All rights reserved. 2


1. Introduction

Hard to compare non-relational DBs

• Data model varies. Key-Value vs. Column-oriented vs. Document-oriented.

• Each DB's performance profile emphasizes different operations (writes, reads, updates).

• Consistency model, replication, fault handling, etc. are all different.

Goal: A standard benchmarking framework to evaluate “serving” systems that do online read/write data ops.

YCSB (Yahoo! Cloud Serving Benchmark)

• Workload generating client.

• Package of standard workloads (e.g., read-heavy, scan, etc.)

• Package of DB interface layers for Cassandra, HBase, MongoDB.

• Extensible. Add new workloads. Add new DBs.


2.1. Cloud Serving System Characteristics

• Scale-out

• To add capacity, add servers.

• Goal is constant performance/node.

• Elasticity

• Capacity is added to a running system by adding a server and shifting some of the load to it.

• Temporary performance decrease as data is re-distributed.

• High Availability

• System remains available in face of failures.


2.2 Classifications of Systems and Tradeoffs

• Read vs. Write Performance

• Write-optimized. Log-structured systems. Append updates to commit log. Reads may need to merge update information.

• Latency vs. Durability

• Syncing writes to disk before acknowledging them improves durability but increases write latency.

• Synchronous vs. Asynchronous Replication

• Data Partitioning

• Row-based storage: A row’s data is stored contiguously on disk.

• Column storage: Different columns can be stored separately.
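The write-optimized, log-structured bullet above can be illustrated with a toy sketch. This is not how any of the benchmarked systems is implemented; the class name `LogStructuredStore` and its methods are hypothetical, and everything lives in memory. It only shows why writes are cheap (an append) while reads may need to merge several updates.

```python
class LogStructuredStore:
    """Toy write-optimized store: updates are appended, reads merge them."""

    def __init__(self):
        self.base = {}   # key -> dict of fields (already merged data)
        self.log = []    # append-only list of (key, changed_fields)

    def update(self, key, fields):
        # A write is just an append; no existing record is read or rewritten.
        self.log.append((key, dict(fields)))

    def read(self, key):
        # A read starts from the base record and replays logged updates in order.
        record = dict(self.base.get(key, {}))
        for logged_key, fields in self.log:
            if logged_key == key:
                record.update(fields)
        return record

    def compact(self):
        # Fold the log into the base data so later reads have less to merge.
        for key, fields in self.log:
            self.base.setdefault(key, {}).update(fields)
        self.log.clear()


store = LogStructuredStore()
store.update("user1", {"field0": "a"})
store.update("user1", {"field1": "b"})
store.update("user1", {"field0": "c"})
merged_before = store.read("user1")   # merges three log entries
store.compact()
merged_after = store.read("user1")    # served from the compacted base
```

The read result is the same before and after compaction; only the amount of merging work at read time changes.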


3.1 Benchmark Tiers


Tier 1: Performance (Latency)

• Measure latency as throughput is increased until system is saturated.

Tier 2: Scaling

• Scaleup. Increase the number of servers, the amount of data, and the offered throughput proportionally. Latency should stay constant.

• Elastic Speedup. In a running system, add more servers. Performance should improve.
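As a rough sketch of the scaleup idea, the test configurations can be derived by multiplying per-server quantities by the cluster size, and the "latency should be constant" expectation can be checked against a tolerance. The function names, the per-server throughput of 1000 ops/sec, and the 20% tolerance are all assumptions made for illustration, not values from the paper.

```python
def scaleup_plan(server_counts, data_per_server_gb, ops_per_server):
    """Return one test configuration per cluster size, scaled proportionally."""
    return [
        {
            "servers": n,
            "data_gb": n * data_per_server_gb,
            "target_ops_per_sec": n * ops_per_server,
        }
        for n in server_counts
    ]


def latency_is_flat(latencies_ms, tolerance=0.2):
    """True if every latency is within `tolerance` (a fraction) of the first one."""
    baseline = latencies_ms[0]
    return all(abs(x - baseline) <= tolerance * baseline for x in latencies_ms)


# Hypothetical inputs: 20 GB and 1000 ops/sec per server.
plan = scaleup_plan([2, 4, 6], data_per_server_gb=20, ops_per_server=1000)
```

A system that scales well would pass `latency_is_flat` when given the measured latency of each configuration in `plan`.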


4. Benchmark Workloads

• Operation Types

• Insert

• Update

• Read

• Scan

• Data size

• Number of fields (e.g., 10)

• Field length (e.g., 100 bytes)

• Request distribution

• Uniform: All items equally likely.

• Zipfian: Some records are very popular, most records are unpopular.

• Latest: Like Zipfian with most recently inserted records as the most popular
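The operation mix and request distributions listed above can be sketched in a few lines. This is a simplified illustration, not YCSB's actual generator: the function names are hypothetical, the Zipfian exponent `s=0.99` and the 95/5 read/update mix are assumed example values, and the weights are computed naively rather than with an efficient sampling algorithm.

```python
import random


def zipfian_weights(n, s=0.99):
    """Weight for rank i (0 = most popular) proportional to 1 / (i + 1)**s."""
    return [1.0 / (i + 1) ** s for i in range(n)]


def choose_key(distribution, record_count, rng):
    """Pick a record index in [0, record_count) under the named distribution."""
    if distribution == "uniform":
        return rng.randrange(record_count)
    weights = zipfian_weights(record_count)
    rank = rng.choices(range(record_count), weights=weights)[0]
    if distribution == "zipfian":
        return rank  # record 0 is the most popular
    if distribution == "latest":
        return record_count - 1 - rank  # the newest record is the most popular
    raise ValueError(f"unknown distribution: {distribution}")


def choose_operation(mix, rng):
    """Pick an operation name according to its proportion in `mix`."""
    names = list(mix)
    return rng.choices(names, weights=[mix[name] for name in names])[0]


rng = random.Random(42)
mix = {"read": 0.95, "update": 0.05}  # assumed read-heavy example mix
ops = [choose_operation(mix, rng) for _ in range(1000)]
zipf_keys = [choose_key("zipfian", 100, rng) for _ in range(1000)]
latest_keys = [choose_key("latest", 100, rng) for _ in range(1000)]
```

Here records are assumed to be numbered in insertion order, so the "latest" distribution simply mirrors the Zipfian ranks onto the most recently inserted records.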


4.2 Core Workloads


5.1 YCSB Client Architecture

• Workload Executor. Traffic generation for both “load” and “transaction” phases.

• DB Interface Layer. Custom for each DB.


5.2 Extensibility

YCSB package is open-source Java code.

Workload Executor

1. Modify configuration (e.g., operation mix, distribution, data size, etc.)

2. Custom Java class to define workload.

DB Interface Layer

3. Implement interface (read, update, insert, delete, scan) for DB.
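To make the DB interface layer in step 3 concrete, here is a minimal sketch of the five operations backed by an in-memory dictionary. YCSB itself is Java; this Python analogue, including the class names `DB` and `InMemoryDB`, the method signatures, and the table name `usertable`, is hypothetical and only illustrates the shape of such an adapter.

```python
from abc import ABC, abstractmethod


class DB(ABC):
    """Hypothetical analogue of a DB interface layer: five operations per DB."""

    @abstractmethod
    def read(self, table, key): ...

    @abstractmethod
    def update(self, table, key, values): ...

    @abstractmethod
    def insert(self, table, key, values): ...

    @abstractmethod
    def delete(self, table, key): ...

    @abstractmethod
    def scan(self, table, start_key, record_count): ...


class InMemoryDB(DB):
    """Stand-in backend; a real adapter would call the target DB's client here."""

    def __init__(self):
        self.tables = {}  # table name -> {key -> dict of fields}

    def read(self, table, key):
        return dict(self.tables[table][key])

    def update(self, table, key, values):
        self.tables[table][key].update(values)

    def insert(self, table, key, values):
        self.tables.setdefault(table, {})[key] = dict(values)

    def delete(self, table, key):
        del self.tables[table][key]

    def scan(self, table, start_key, record_count):
        # Return up to record_count records in key order, starting at start_key.
        keys = sorted(k for k in self.tables.get(table, {}) if k >= start_key)
        return [(k, dict(self.tables[table][k])) for k in keys[:record_count]]


db = InMemoryDB()
db.insert("usertable", "user2", {"field0": "b"})
db.insert("usertable", "user1", {"field0": "a"})
db.update("usertable", "user1", {"field1": "x"})
scanned = db.scan("usertable", "user1", 2)
db.delete("usertable", "user2")
```

Because the workload generator only talks to this interface, benchmarking a new system would mean writing one such adapter rather than changing the workloads.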


6. Results: Setup

• Tested 4 DBs

• Cassandra 0.5.0

• HBase 0.20.3

• PNUTS (on MySQL 5.1.24)

• MySQL (sharded) 5.1.32

• 6 servers. Dual 64-bit quad-core 2.5 GHz Intel Xeon CPUs, 8GB RAM, 6-disk RAID-10 array, gigabit ethernet.

• YCSB Client on a separate 8-core server.

• Up to 500 threads.

• Client was not the bottleneck.

• No replication

• Data is 120M 1KB records (total size: 120GB). Each server therefore stored 20GB of data.

• Cassandra, PNUTS, and MySQL were configured to sync to disk. HBase was not configured to sync to disk.

• Periodic compaction operations.
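The data-size figures in the setup above follow from simple arithmetic, sketched here. It assumes that each 1KB record consists of 10 fields of 100 bytes (the example values from the workload slide), uses decimal units, and ignores keys and storage overhead.

```python
FIELDS_PER_RECORD = 10        # assumed: example value from the workload slide
FIELD_LENGTH_BYTES = 100      # assumed: example value from the workload slide
RECORD_COUNT = 120_000_000    # 120M records
SERVERS = 6                   # no replication, so data is split across servers

record_bytes = FIELDS_PER_RECORD * FIELD_LENGTH_BYTES   # bytes per record
total_gb = RECORD_COUNT * record_bytes / 1e9            # decimal gigabytes
per_server_gb = total_gb / SERVERS

print(record_bytes, total_gb, per_server_gb)  # prints: 1000 120.0 20.0
```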


6. Results: Read vs. Write Performance

• Cassandra and HBase had better performance on write-heavy workload.

• PNUTS and MySQL had better performance on read-heavy workload.


6. Results: Scalability

• Vary number of servers from 2 to 12. Data size and request rate varied proportionally.


Cassandra and PNUTS scale well.

HBase is erratic.


6. Results: Elasticity

• Start with 2 servers with 120GB data. Then add more servers up to 6.

• Cassandra, HBase, and PNUTS were able to grow elastically.

• HBase does not repartition data until next compaction.

• PNUTS was best, with the most stable latency while elastically repartitioning data.


Go from 5 to 6 servers at 10 minute mark.


7. Future Work

• Tier 3: Availability

• Tier 4: Replication


Further Study

• Main Site: http://research.yahoo.com/Web_Information_Management/YCSB

• Source Code: http://github.com/brianfrankcooper/YCSB

• Mailing list: http://tech.groups.yahoo.com/group/ycsb-users/
