MySQL Cluster performance best practices

MySQL Cluster: Delivering Breakthrough Performance, 26th July 2012. Andrew Morgan, Senior Product Manager, MySQL HA ([email protected], clusterdb.com); Mat Keep, Senior Product Manager, MySQL HA ([email protected])

Get the best out of MySQL Cluster. The presentation covers:
- Tuning and optimization to exploit the auto-sharded, distributed design of MySQL Cluster
- Using Adaptive Query Localization to scale cross-shard JOINs
- Data access patterns, schema and query optimizations
- Recommended tuning parameters

Tune in to the on-demand webinar: http://www.mysql.com/news-and-events/on-demand-webinars/display-od-719.html

Transcript of MySQL Cluster performance best practices

Page 1: MySQL Cluster performance best practices

MySQL Cluster : Delivering Breakthrough Performance 26th July 2012 Andrew Morgan Senior Product Manager – MySQL HA [email protected] clusterdb.com

Mat Keep Senior Product Manager – MySQL HA [email protected]

Page 2: MySQL Cluster performance best practices

The presentation is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

Copyright 2012 Oracle Corporation - 26th July 2012 2

Page 3: MySQL Cluster performance best practices

Session Agenda
•  Introduction to MySQL Cluster
•  Where does MySQL Cluster fit?
•  Benchmarks
•  WHILE (cluster.measurePerformance() < target) { cluster.optimize(); }
•  Boosting performance
•  Scaling out
•  Further resources

Page 4: MySQL Cluster performance best practices

MySQL Cluster – Users & Applications
Extreme Scalability, Availability and Affordability
http://www.mysql.com/customers/cluster/

•  Web
   •  High volume OLTP
   •  eCommerce
   •  On-Line Gaming
   •  Digital Marketing
   •  User Profile Management
   •  Session Management & Caching
   •  Content Management
•  Telecoms
   •  Service Delivery Platforms
   •  VAS: VoIP, IPTV & VoD
   •  Mobile Content Delivery
   •  Mobile Payments
   •  LTE Access

Page 5: MySQL Cluster performance best practices

MySQL Cluster Architecture

[Diagram: application nodes (including REST and JPA access) and two Cluster Mgmt nodes connect to four data nodes. Node Group 1 contains Node 1 and Node 2, which hold fragments F1 and F3; Node Group 2 contains Node 3 and Node 4, which hold fragments F2 and F4.]

Page 6: MySQL Cluster performance best practices

When to Consider MySQL Cluster
•  What are the consequences of downtime or failing to meet performance requirements?
•  How much effort and $ is spent in developing and managing HA in your applications?
•  Are you considering sharding your database to scale write performance? How does that impact your application and developers?
•  Do your services need to be real-time?
•  Will your services have unpredictable scalability demands, especially for writes?
•  Do you want the flexibility to manage your data with more than just SQL?

Page 7: MySQL Cluster performance best practices

Where would I not Use MySQL Cluster?
•  “Hot” data sets >3TB
   •  Replicate cold data to InnoDB
•  Long running transactions
•  Large rows, without using BLOBs
•  Foreign Keys
   •  Check out MySQL Cluster 7.3 Early Access: http://labs.mysql.com/
•  Many full table scans
•  Geo-Spatial indexes
•  In these scenarios, the InnoDB storage engine would be the right choice

MySQL Cluster Evaluation Guide: http://mysql.com/why-mysql/white-papers/mysql_cluster_eval_guide.php

Page 8: MySQL Cluster performance best practices

General Design Considerations

Overall design goal: minimize network round trips for your most important requests!

•  MySQL Cluster is designed for
   –  Short transactions
   –  Many parallel transactions
•  Use simple access patterns to fetch data
   –  Use efficient scans and batching interfaces
•  Analyze what your most typical use cases are
   –  Optimize for those

Page 9: MySQL Cluster performance best practices

Servicing the Most Performance-Intensive Workloads


Page 10: MySQL Cluster performance best practices

Servicing the Most Performance-Intensive Workloads

writes


Page 11: MySQL Cluster performance best practices

Comparing MySQL Cluster Performance: 8x Higher Performance per Node
•  1 Billion+ Reads per Minute, 8 node Intel Xeon cluster
•  Multi-Threaded Data Node Extensions
•  NoSQL C++ API, flexAsynch benchmark

[Chart: reads per second (millions), axis 0 to 20, MySQL Cluster 7.1 vs MySQL Cluster 7.2]

Page 12: MySQL Cluster performance best practices

1.2 Billion UPDATEs per Minute
•  30 x Intel E5-2600 servers
•  NoSQL C++ API, flexAsynch benchmark
•  ACID Transactions, with Synchronous Replication

[Chart: millions of UPDATEs per second (axis 0 to 25) against number of MySQL Cluster data nodes (2 to 30)]
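As a rough cross-check on the two benchmark slides, the per-minute headline figures can be converted to the per-second units used on the chart axes. The following is a minimal sketch in Python; the function name is illustrative only, and the inputs are simply the rounded headline numbers quoted on the slides rather than raw benchmark data.

```python
def per_second(per_minute: float) -> float:
    """Convert a per-minute rate to a per-second rate."""
    return per_minute / 60.0


# Rounded headline figures quoted on the benchmark slides.
reads_per_minute = 1_000_000_000    # "1 Billion+ Reads per Minute"
updates_per_minute = 1_200_000_000  # "1.2 Billion UPDATEs per Minute"

reads_per_sec_millions = per_second(reads_per_minute) / 1e6
updates_per_sec_millions = per_second(updates_per_minute) / 1e6

print(f"{reads_per_sec_millions:.1f} million reads per second")
print(f"{updates_per_sec_millions:.1f} million updates per second")
```

Both values fall inside the axis ranges shown on the charts (0 to 20 and 0 to 25 million operations per second respectively).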

Page 13: MySQL Cluster performance best practices

WHILE (cluster.measurePerformance() < target)

•  Don’t optimize for the sake of it
   •  Wastes effort
   •  Introduces unnecessary compromises/complications
•  Forget about database benchmarks
   •  You care about the end-to-end performance of your application on your database
   •  If possible, use your real application to drive the database
•  Measurements need to be based on representative traffic
•  Measurements need to be repeatable
   •  Easily see impact of each optimization

Page 14: MySQL Cluster performance best practices

Where has the time gone?

•  Enable the slow query log
   –  set global slow_query_log=1;
   –  set global long_query_time=3; (3 seconds)
   –  set global log_queries_not_using_indexes=1;
•  Slow queries will be written to the slow query log:

mysql> show global variables like 'slow_query_log_file';
+---------------------+------------------------------+
| Variable_name       | Value                        |
+---------------------+------------------------------+
| slow_query_log_file | /data1/mysql/mysqld-slow.log |
+---------------------+------------------------------+

•  Queries will be written in plain text

Page 15: MySQL Cluster performance best practices

Query Analyzer in MySQL Enterprise Monitor (take the easy option!)


Page 16: MySQL Cluster performance best practices

Simple database traffic generation

create.sql: CREATE TABLE sub_name (sub_id INT NOT NULL PRIMARY KEY, name VARCHAR(30)) engine=ndb;

CREATE TABLE sub_age (sub_id INT NOT NULL PRIMARY KEY, age INT) engine=ndb;

INSERT INTO sub_name VALUES (1,'Bill'),(2,'Fred'),(3,'Bill'),(4,'Jane'),(5,'Andrew'),(6,'Anne'),(7,'Juliette'),(8,'Awen'),(9,'Leo'),(10,'Bill');

INSERT INTO sub_age VALUES (1,40),(2,23),(3,33),(4,19),(5,21),(6,50),(7,31),(8,65),(9,18),(10,101);

query.sql: SELECT sub_age.age FROM sub_name, sub_age WHERE sub_name.name='Bill' AND sub_name.sub_id=sub_age.sub_id;


Page 17: MySQL Cluster performance best practices

Simple database traffic generation

shell> mysqlslap --concurrency=5 --iterations=100 --query=query.sql --create=create.sql

Benchmark
    Average number of seconds to run all queries: 0.132 seconds
    Minimum number of seconds to run all queries: 0.037 seconds
    Maximum number of seconds to run all queries: 0.268 seconds
    Number of clients running queries: 5
    Average number of queries per client: 1
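When the traffic comes from your own application rather than mysqlslap, the same average/minimum/maximum summary can be collected with a small timing harness. The sketch below is an assumption-laden illustration: `measure` and `placeholder_request` are hypothetical names, and the placeholder workload stands in for whatever end-to-end request your application really issues.

```python
import statistics
import time
from typing import Callable, Dict


def measure(run_once: Callable[[], None], iterations: int = 100) -> Dict[str, float]:
    """Time repeated calls of run_once and summarise them in seconds."""
    timings = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_once()
        timings.append(time.perf_counter() - start)
    return {
        "average": statistics.mean(timings),
        "minimum": min(timings),
        "maximum": max(timings),
    }


def placeholder_request() -> None:
    # Hypothetical stand-in: replace with a real application request
    # that ends up querying the cluster.
    sum(range(1000))


result = measure(placeholder_request, iterations=20)
for name, seconds in result.items():
    print(f"{name}: {seconds:.6f} seconds")
```

Running the same harness before and after each change keeps the measurements repeatable, so the effect of a single optimization is easy to see.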

Page 18: MySQL Cluster performance best practices

What on Earth is it doing?

EXPLAIN is your friend

mysql> EXPLAIN <query>;

mysql> EXPLAIN PARTITIONS <query>;

mysql> EXPLAIN EXTENDED <query>;

mysql> SHOW WARNINGS;

Page 19: MySQL Cluster performance best practices

Boosting Performance

•  ANALYZE TABLE
•  Access patterns
•  AQL (fast JOINs)
•  Distribution aware
•  Batching
•  Schema
•  Connection pools
•  Multi-threaded data nodes
•  NoSQL APIs
•  Hardware
•  More tips

Page 20: MySQL Cluster performance best practices

Before you do anything else, ANALYZE!

•  New for MySQL Cluster 7.2
•  Lets the MySQL optimizer figure out how to best use indexes etc.
•  Can instantly speed up queries many times over

mysql> ANALYZE TABLE <tab-name>;

•  Repeat after changing schema, adding/removing indexes or making major data changes
•  Only needs running on one mysqld in the cluster

Page 21: MySQL Cluster performance best practices

Access patterns

•  Primary key reads/writes -> O(1)
   •  Independent of database size and number of nodes
•  Index searches -> O(log n)
   •  n = number of tuples
•  BLOBs are stored in a second table -> take longer to access
•  JOINs massively faster in MySQL Cluster 7.2
•  Partition pruning
   •  By allowing a query to be satisfied by a single data node, reduces resource consumption -> greater throughput
   •  If result sets are not large, will also reduce latency

Page 22: MySQL Cluster performance best practices

Adaptive Query Localization: Scaling Distributed Joins

•  Perform Complex Queries across Shards
   •  JOINs pushed down to data nodes
   •  Executed in parallel
   •  Returns single result set to MySQL
•  Opens Up New Use-Cases
   •  Real-time analytics
   •  Recommendations engines
   •  Analyze click-streams

[Diagram: mysqld in front of the data nodes, shown without and with AQL; 70x More Performance]

DON’T COMPROMISE FUNCTIONALITY TO SCALE-OUT !!

Page 23: MySQL Cluster performance best practices

MySQL Cluster 7.2 AQL Test Query: Web-Based Content Management System

[Diagram: test setup with one MySQL Server and two data nodes (Data Node1, Data Node2)]

Page 24: MySQL Cluster performance best practices

Web-Based CMS

Must analyze tables for best results:
mysql> ANALYZE TABLE <tab-name>;

[Chart: query execution time in seconds (axis 0 to 100). MySQL Cluster 7.1: 87.23 seconds; MySQL Cluster 7.2: 1.26 seconds; 70x More Performance]
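The "70x" headline follows from the two execution times on the chart. A minimal sketch of the arithmetic in Python (the function name is illustrative):

```python
def speedup(before_seconds: float, after_seconds: float) -> float:
    """Return how many times faster the 'after' run is than the 'before' run."""
    return before_seconds / after_seconds


# Query execution times quoted on the slide.
ratio = speedup(87.23, 1.26)
print(f"{ratio:.1f}x")
```

The ratio comes out at roughly 69, which the slide rounds to 70x.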

Page 25: MySQL Cluster performance best practices

Did I mention ANALYZE TABLE?


Page 26: MySQL Cluster performance best practices

AQL – How to Use it
•  Activated when ndb_join_pushdown is on (default)
•  Rules for a Join to be pushed down:
   1.  Joined columns use identical types
   2.  No reference to BLOB or TEXT columns
   3.  No explicit lock
   4.  Child tables in the Join must be accessed using one of the ref, eq_ref, or const access methods
   5.  Tables not explicitly partitioned by [LINEAR] HASH, LIST, or RANGE
   6.  Query plan doesn’t select 'Using join buffer'
   7.  If root of Join is an eq_ref or const, child tables must be joined by eq_ref
•  Run ANALYZE TABLE <tab-name> on each table once
•  Use EXPLAIN to see what components are being pushed down:
   •  Extra: Child of 'd' in pushed join@1
   •  EXPLAIN EXTENDED <query>; SHOW WARNINGS;

Page 27: MySQL Cluster performance best practices

Distribution Aware Apps
•  Partition selected using hash on Partition Key
   •  Primary Key by default
   •  User can override in table definition
•  MySQL Server (or NDB API) will attempt to send transaction to the correct data node
   •  If all data for the transaction are in the same partition, less messaging -> faster
•  Aim to have all rows for high-running queries in same partition

[Table as drawn on the slide, with Partition Key and Primary Key column markers]

town        country  population
Maidenhead  UK       78000
Paris       France   2193031
Boston      UK       58124
Boston      USA      617594

SELECT SUM(population) FROM towns WHERE country='UK';

[Same table, again with Partition Key and Primary Key column markers]

town        country  population
Maidenhead  UK       78000
Paris       France   2193031
Boston      UK       58124
Boston      USA      617594

SELECT SUM(population) FROM towns WHERE town='Boston';
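The effect of the partition key choice can be illustrated with a toy model. The sketch below is not how MySQL Cluster computes partitions internally: the MD5-based hash, the partition count of 4 and all function names are assumptions made purely for illustration. It only shows the principle that rows sharing a partition key value always land in the same partition, so a query that filters on that key needs to visit one partition.

```python
import hashlib
from typing import Callable, List, Set, Tuple

NUM_PARTITIONS = 4  # assumption: arbitrary partition count for the toy model

Row = Tuple[str, str, int]  # (town, country, population)

TOWNS: List[Row] = [
    ("Maidenhead", "UK", 78000),
    ("Paris", "France", 2193031),
    ("Boston", "UK", 58124),
    ("Boston", "USA", 617594),
]


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a partition key value to a partition (illustrative hash only)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


def partitions_holding(rows: List[Row], key_column: int,
                       predicate: Callable[[Row], bool]) -> Set[int]:
    """Return the partitions that hold the rows matching the predicate."""
    return {partition_for(row[key_column]) for row in rows if predicate(row)}


# Partition key = country: every UK row is stored in one partition.
uk_by_country = partitions_holding(TOWNS, 1, lambda r: r[1] == "UK")

# Partition key = town: every Boston row is stored in one partition.
boston_by_town = partitions_holding(TOWNS, 0, lambda r: r[0] == "Boston")

print(len(uk_by_country), len(boston_by_town))
```

If the partition key does not match the column in the WHERE clause, the matching rows may be spread over several partitions, which is the case the slide advises avoiding for high-running queries.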

Page 28: MySQL Cluster performance best practices

Distribution Aware – Multiple Tables

•  Extend partition awareness over multiple tables
•  Same rule: aim to have all data for an instance of a high-running transaction in the same partition

ALTER TABLE service_ids PARTITION BY KEY(sub_id);

EXPLAIN PARTITIONS <query>;

[Table as drawn on the slide, with Partition Key and Primary Key column markers]

service   sub_id  svc_id
twitter   19724   76325732
twitter   84539   67324782
facebook  19724   83753984
facebook  73642   87324793

[Table as drawn on the slide, with Partition Key and Primary Key column markers]

sub_id  age  gender
19724   25   male
84539   43   female
19724   16   female
74574   21   female

Page 29: MySQL Cluster performance best practices

Validate if “partition pruning” is working

mysql> SHOW GLOBAL STATUS LIKE 'ndb_pruned_scan_count';
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| Ndb_pruned_scan_count | 12    |
+-----------------------+-------+

mysql> SELECT * FROM services WHERE sub_id=1;
+--------+--------------+--------------+
| sub_id | service_name | service_parm |
+--------+--------------+--------------+
| 1      | IM           | 878          |
| 1      | ssh          | 666          |
| 1      | Video        | 654          |
+--------+--------------+--------------+

mysql> SHOW GLOBAL STATUS LIKE 'ndb_pruned_scan_count';
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| Ndb_pruned_scan_count | 13    |
+-----------------------+-------+

Page 30: MySQL Cluster performance best practices

Batching
•  MySQL Cluster allows batching on
   •  Inserts, index scans (when not part of a JOIN), PK reads, PK deletes, and (most) PK updates
•  Batching means that one network round trip is used to read/modify a number of records → less ping-pong!
•  If you can batch, do it!
•  Example: insert 1M records
   •  No batching:
      •  INSERT INTO t1(data) VALUES (<data>);
      •  765 seconds to insert 1M records
   •  Batching (batches of 16):
      •  INSERT INTO t1(<columns>) VALUES (<data>), (<data>)...
      •  50 seconds to insert 1M records
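The batched form of the INSERT above can be generated mechanically. The sketch below builds multi-row INSERT statements in batches of 16 and counts how many statements (and so round trips from the client) a given row count needs. The helper names are hypothetical, the `%s` placeholder style is an assumption about the Python MySQL driver in use, and the table and column names are interpolated directly, so they must come from trusted code rather than user input.

```python
import math
from typing import Iterator, List, Sequence, Tuple


def batched_inserts(table: str, columns: Sequence[str],
                    rows: Sequence[Sequence[object]],
                    batch_size: int = 16) -> Iterator[Tuple[str, List[object]]]:
    """Yield (sql, params) pairs, one multi-row INSERT per batch."""
    column_list = ", ".join(columns)
    row_placeholder = "(" + ", ".join(["%s"] * len(columns)) + ")"
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        sql = (f"INSERT INTO {table}({column_list}) VALUES "
               + ", ".join([row_placeholder] * len(batch)))
        params = [value for row in batch for value in row]
        yield sql, params


def statements_needed(row_count: int, batch_size: int) -> int:
    """Number of INSERT statements needed for row_count rows."""
    return math.ceil(row_count / batch_size)


sample_rows = [(f"row-{i}",) for i in range(40)]
statements = list(batched_inserts("t1", ["data"], sample_rows))

print(len(statements))                   # 40 rows in batches of 16
print(statements_needed(1_000_000, 16))  # statements for the 1M-row example
print(statements_needed(1_000_000, 1))   # statements without batching
```

Each yielded pair could be passed to a driver's `cursor.execute(sql, params)`; the speed-up reported on the slide (765 seconds down to 50 seconds) is the measured result there, not something this sketch reproduces.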

Page 31: MySQL Cluster performance best practices

Batching, more examples

SELECT * FROM t1 WHERE userid=1 AND serviceid IN (1,2,3,4,5,7,8,9,10);

SET transaction_allow_batching=1; -- must be set on the connection
BEGIN;
INSERT INTO t1 ....;
INSERT INTO t2 ....;
INSERT INTO t3 ....;
INSERT INTO t4 ....;
DELETE FROM t5 ....;
UPDATE t1 SET value='new value' WHERE id=1;
COMMIT;

Page 32: MySQL Cluster performance best practices

Optimizing schema - denormalization

Before, with two tables joined on userid:

SELECT * FROM bb,voip WHERE bb.userid=voip.userid AND bb.userid=1;

bb
userid  bb_data
1       <data>
2       <data>
3       <data>
4       <data>

voip
userid  voip_data
1       <data>
2       <data>
3       <data>
4       <data>

After, with a single denormalized table (1.7x improvement):

mysql> SELECT * FROM voip_bb WHERE userid=1;

voip_bb
userid  voip_data  bb_data
1       <data>     <data>
2       <data>     <data>
3       <data>     <data>
4       <data>     <data>

Page 33: MySQL Cluster performance best practices

Connection Pools
•  Network hops increase latency (e.g. compared with an InnoDB read of cached data)
•  Increase throughput by sending in lots of parallel operations
   •  Multiple client connections (sessions) to each MySQL Server
   •  Multiple MySQL Servers
•  Connection pooling between MySQL Servers and data nodes
   •  Set ndb-cluster-connection-pool > 1 in my.cnf
   •  Ensure enough [api] sections in config.ini
      •  Don’t assign hostnames!

[Diagram: application threads connect to two mysqld processes, each of which uses the NDB API to reach the data nodes]
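Because each connection in a mysqld's pool occupies its own API node slot, the number of [api] sections needed in config.ini grows with the pool size. The sketch below totals the slots for a hypothetical deployment and emits the matching empty sections; the function names and the example sizes (two MySQL Servers with a pool of 4 each, plus 2 other API clients) are assumptions for illustration.

```python
from typing import Sequence


def required_api_slots(pool_sizes: Sequence[int], other_api_nodes: int = 0) -> int:
    """Total API slots: one per pooled connection plus any other API clients."""
    return sum(pool_sizes) + other_api_nodes


def api_sections(count: int) -> str:
    """Build config.ini text with `count` empty [api] sections.

    The sections are left without a host name, following the slide's
    advice not to assign hostnames.
    """
    return "\n\n".join("[api]" for _ in range(count))


# Hypothetical example: two mysqld processes, each with
# ndb-cluster-connection-pool=4, plus two other NDB API applications.
slots = required_api_slots([4, 4], other_api_nodes=2)
config_fragment = api_sections(slots)

print(slots)
print(config_fragment)
```

The generated fragment would be merged into the existing config.ini alongside the data node and management node sections.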

Page 34: MySQL Cluster performance best practices

Multi-threaded data nodes

•  Scaling out on commodity hardware is the standard way to increase performance
   •  Add more data nodes and API nodes as required
•  MySQL Cluster 7.2 increases the ability to also scale-up each data node
   •  Increases maximum number of utilised threads from 8 to 59

[Diagram: application nodes above Node Group 1 (Node 1, Node 2) and Node Group 2 (Node 3, Node 4)]

Page 35: MySQL Cluster performance best practices

Multi-threaded data nodes

•  Threads:
   •  ldm: 1, 2, 4, 8 or 12 Local Query Handler threads
   •  tc: typically ldm/4 Transaction Coordinator threads
   •  send: ~2-3 Send threads
   •  recv: ~2-4 Receive threads
   •  main: 1 Main thread
   •  rep: 1 Replication thread
   •  io: 1 I/O thread

[Diagram: application nodes above Data Node 1, whose thread types are recv, tc, ldm, send, main, rep and io]

Page 36: MySQL Cluster performance best practices

Multi-threaded data nodes

•  Applies to ndbmtd only
•  Configure through either:
   •  MaxNoOfExecutionThreads
      •  Single value for number of threads
      •  System will allocate these threads to blocks in a reasonable way
   •  ThreadConfig
      •  Specify explicitly how many threads for each block type
      •  Lock threads to CPUs for further performance gains
      •  ThreadConfig=main={cpubind=0},ldm={count=4,cpubind=1,2,5,6},io={count=2,cpubind=3,4}
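A ThreadConfig value like the one above is tedious to type by hand once there are many thread types and CPU bindings. The sketch below assembles the string from a list of thread specifications; the helper names are hypothetical, and it only covers the `count` and `cpubind` settings that appear on the slide.

```python
from typing import List, Optional, Sequence, Tuple

# (thread type, count or None, list of CPU ids or None)
ThreadSpec = Tuple[str, Optional[int], Optional[List[int]]]


def thread_entry(name: str, count: Optional[int], cpubind: Optional[List[int]]) -> str:
    """Format one thread type, e.g. ldm={count=4,cpubind=1,2,5,6}."""
    settings = []
    if count is not None:
        settings.append(f"count={count}")
    if cpubind:
        settings.append("cpubind=" + ",".join(str(cpu) for cpu in cpubind))
    return name + "={" + ",".join(settings) + "}"


def thread_config(specs: Sequence[ThreadSpec]) -> str:
    """Join the per-type entries into a single ThreadConfig value."""
    return ",".join(thread_entry(name, count, cpus) for name, count, cpus in specs)


# The example from the slide, expressed as data.
value = thread_config([
    ("main", None, [0]),
    ("ldm", 4, [1, 2, 5, 6]),
    ("io", 2, [3, 4]),
])
print("ThreadConfig=" + value)
```

Keeping the layout as data makes it easier to review which CPUs are bound to which thread type before the value goes into config.ini.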

Page 37: MySQL Cluster performance best practices

Multi-threaded data nodes – starting point for ThreadConfig

        24 threads  32 threads  40 threads  48 threads
ldm     8           12          16          16
tc      4           6           8           12
recv    3           3           4           5
send    3           3           4           4
rep     1           1           1           1

Note that some threads are left for other data node blocks as well as the OS
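To see how many threads each starting point leaves for the other data node blocks and the OS, the table can be totalled per column. A minimal sketch, with the table values copied from the slide and the names chosen only for illustration:

```python
from typing import Dict

# Starting points from the slide: total hardware threads -> threads per type.
STARTING_POINTS: Dict[int, Dict[str, int]] = {
    24: {"ldm": 8, "tc": 4, "recv": 3, "send": 3, "rep": 1},
    32: {"ldm": 12, "tc": 6, "recv": 3, "send": 3, "rep": 1},
    40: {"ldm": 16, "tc": 8, "recv": 4, "send": 4, "rep": 1},
    48: {"ldm": 16, "tc": 12, "recv": 5, "send": 4, "rep": 1},
}


def threads_left_over(total_threads: int) -> int:
    """Threads not assigned in the table for a given machine size."""
    return total_threads - sum(STARTING_POINTS[total_threads].values())


for total in sorted(STARTING_POINTS):
    print(f"{total} threads: {threads_left_over(total)} left over")
```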

Page 38: MySQL Cluster performance best practices

NoSQL APIs

•  SQL: Complex, relational queries
•  Memcached: Key-Value web services
•  Java: Enterprise Apps
•  NDB API: Real-time services
•  Mix & Match

[Diagram: clients reach the data nodes through the NDB API via Native, memcached, HTTP/REST, JDBC / ODBC, PHP / PERL and Python / Ruby access paths]

Page 39: MySQL Cluster performance best practices

Hardware

•  High bandwidth, low latency network
   •  Turn off firewalls if you can
•  Use multiple disks
   •  Checkpoints
   •  Log files
   •  Table spaces
•  SSDs
   •  Biggest benefit for Table spaces
•  Refer to MySQL Cluster Evaluation Guide for more details

Page 40: MySQL Cluster performance best practices

MySQL Query Cache

•  Don't enable the Query Cache!
   •  It is very expensive to invalidate over multiple MySQL servers
   •  A write on one server will force the others to purge their cache.
•  If you have tables that are read only (or change very seldom):
   my.cnf:
      query_cache_size=1000000
      query_cache_type=2 (ON DEMAND)
   mysql> SELECT SQL_CACHE <cols> .. FROM table;
   •  SQL_CACHE tells (demands) MySQL to cache the results from this SELECT
   •  This can be good for STATIC data

Page 41: MySQL Cluster performance best practices

Non-Durable tables

•  Some types of tables account for a lot of WRITEs, but do not need to be recovered (e.g. Session tables)
•  Unnecessary to persist such tables: no REDO LOGs or CHECKPOINTs
•  Create these tables as 'NO LOGGING' tables:

mysql> set ndb_table_no_logging=1;
mysql> create table session_table(..) engine=ndb;
mysql> set ndb_table_no_logging=0;

•  After system restart table will be there, but empty!

Page 42: MySQL Cluster performance best practices

More optimisation tips
•  When using auto-increment columns, increase ndb-autoincrement-prefetch-sz
•  Set RedoBuffer=32-64M
•  Disk-based tables:
   •  Increase UNDO_BUFFER for write-intensive apps
   •  Increase DiskIOThreadPool
   •  Increase DiskPageBufferMemory for better caching in the data nodes; monitor effectiveness using NDBINFO / MEM
•  FragmentLogFileSize=256M
•  NoOfFragmentLogFiles = 6 x DataMemory (in MB) / (4 x 256MB)
•  Use OPTIMIZE TABLE and perform rolling restarts if memory fragmentation is an issue
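The NoOfFragmentLogFiles rule of thumb above can be worked through for a concrete DataMemory size. The sketch below applies the slide's formula; the function name, the choice to round up to a whole number, and the example DataMemory values are assumptions for illustration. The factor of 4 reflects that each fragment log file entry is a set of 4 files.

```python
import math


def no_of_fragment_log_files(data_memory_mb: int,
                             fragment_log_file_size_mb: int = 256) -> int:
    """Slide's rule of thumb: 6 x DataMemory / (4 x FragmentLogFileSize)."""
    return math.ceil(6 * data_memory_mb / (4 * fragment_log_file_size_mb))


def redo_log_size_mb(log_files: int, fragment_log_file_size_mb: int = 256) -> int:
    """Total redo log space: each entry is a set of 4 files of the given size."""
    return log_files * 4 * fragment_log_file_size_mb


# Hypothetical example: a data node configured with DataMemory=4096M.
files = no_of_fragment_log_files(4096)
print(files, redo_log_size_mb(files))
```

With these inputs the total redo log space works out to six times DataMemory, which is the sizing the formula is aiming for.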

Page 43: MySQL Cluster performance best practices

Scaling out with MySQL Cluster Manager

[Diagram: a client manages four hosts, each running an MCM agent. 192.168.0.10 and 192.168.0.11 each run ndb_mgmd and mysqld; 192.168.0.12 and 192.168.0.13 each run two ndbd processes.]

Page 44: MySQL Cluster performance best practices

Scaling out with MySQL Cluster Manager

[Diagram: the same cluster after scaling out. Hosts 192.168.0.10 to 192.168.0.13 are unchanged; two hosts are added, 192.168.0.14 and 192.168.0.15, each running an agent and two ndbd processes. Two additional mysqld processes are also shown.]

Page 45: MySQL Cluster performance best practices

Scaling out with MySQL Cluster Manager

mcm> add hosts --hosts=192.168.0.14,192.168.0.15 mysite;

mcm> add package --basedir=/usr/local/mysql_7_0_9 --hosts=192.168.0.14,192.168.0.15 7.0.9;

mcm> add process [email protected],[email protected],[email protected],[email protected],[email protected],[email protected] -s port:mysqld:52=3307,port:mysqld:53=3307 mycluster;

mcm> start process --added mycluster;

mysql> ALTER ONLINE TABLE <table-name> REORGANIZE PARTITION;

mysql> OPTIMIZE TABLE <table-name>;

Page 46: MySQL Cluster performance best practices

Download MySQL Cluster Today!

http://www.mysql.com/downloads/cluster/#downloads

Page 47: MySQL Cluster performance best practices

Further resources

•  MySQL Cluster Performance white paper: http://www.mysql.com/why-mysql/white-papers/mysql_wp_cluster_performance.php

•  MySQL Cluster Forum: http://forums.mysql.com/list.php?25

•  MySQL Cluster Evaluation guide: http://www.mysql.com/why-mysql/white-papers/mysql_cluster_eval_guide.php

•  MySQL Cluster in Web-Scale Architectures: http://www.mysql.com/why-mysql/white-papers/mysql_cluster_eval_guide.php
