MySQL Cluster performance best practices
MySQL Cluster: Delivering Breakthrough Performance
26th July 2012
Andrew Morgan, Senior Product Manager – MySQL HA, clusterdb.com
Mat Keep, Senior Product Manager – MySQL HA
The presentation is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
Copyright 2012 Oracle Corporation - 26th July 2012 2
Session Agenda
• Introduction to MySQL Cluster
• Where does MySQL Cluster fit?
• Benchmarks
• WHILE (cluster.measurePerformance() < target) { cluster.optimize(); }
• Boosting performance
• Scaling out
• Further resources
MySQL Cluster – Users & Applications
Extreme Scalability, Availability and Affordability
http://www.mysql.com/customers/cluster/
• Web
  • High volume OLTP
  • eCommerce
  • On-Line Gaming
  • Digital Marketing
  • User Profile Management
  • Session Management & Caching
  • Content Management
• Telecoms
  • Service Delivery Platforms
  • VAS: VoIP, IPTV & VoD
  • Mobile Content Delivery
  • Mobile Payments
  • LTE Access
MySQL Cluster Architecture
[Figure: architecture diagram. Application nodes (including REST and JPA access) sit above four data nodes in two node groups: Node Group 1 (Node 1 and Node 2, holding fragments F1 and F3) and Node Group 2 (Node 3 and Node 4, holding fragments F2 and F4). Two Cluster Mgmt nodes manage the cluster.]
When to Consider MySQL Cluster
• What are the consequences of downtime or failing to meet performance requirements?
• How much effort and $ is spent in developing and managing HA in your applications?
• Are you considering sharding your database to scale write performance? How does that impact your application and developers?
• Do your services need to be real-time?
• Will your services have unpredictable scalability demands, especially for writes?
• Do you want the flexibility to manage your data with more than just SQL?
Where would I not Use MySQL Cluster?
• “Hot” data sets >3TB
  • Replicate cold data to InnoDB
• Long running transactions
• Large rows, without using BLOBs
• Foreign Keys
  • Check out MySQL Cluster 7.3 Early Access: http://labs.mysql.com/
• Many full table scans
• Geo-Spatial indexes
• In these scenarios, the InnoDB storage engine would be the right choice
MySQL Cluster Evaluation Guide: http://mysql.com/why-mysql/white-papers/mysql_cluster_eval_guide.php
General Design Considerations
Overall design goal: Minimize network roundtrips for your most important requests!
• MySQL Cluster is designed for
  – Short transactions
  – Many parallel transactions
• Utilize simple access patterns to fetch data
  – Use efficient scans and batching interfaces
• Analyze what your most typical use cases are
  – Optimize for those
Servicing the Most Performance-Intensive Workloads
Comparing MySQL Cluster Performance: 8x Higher Performance per Node
• 1 Billion+ Reads per Minute, 8 node Intel Xeon cluster
• Multi-Threaded Data Node Extensions
• NoSQL C++ API, flexAsynch benchmark
[Figure: bar chart of Reads per Second (Millions), axis 0 to 20, comparing MySQL Cluster 7.1 with MySQL Cluster 7.2]
1.2 Billion UPDATEs per Minute
• 30 x Intel E5-2600 Intel Servers
• NoSQL C++ API, flexAsynch benchmark
• ACID Transactions, with Synchronous Replication
[Figure: chart of Millions of UPDATEs per Second (axis 0 to 25) against the number of MySQL Cluster Data Nodes (2 to 30)]
WHILE (cluster.measurePerformance() < target)
• Don’t optimize for the sake of it
  • Wastes effort
  • Introduces unnecessary compromises/complications
• Forget about database benchmarks
  • You care about the end-to-end performance of your application on your database
  • If possible, use your application to drive the database
• Measurements need to be based on representative traffic
• Measurements need to be repeatable
  • Easily see impact of each optimization
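One way to keep measurements repeatable is to wrap the workload in a small harness that runs it several times and reports the spread. The sketch below is only illustrative: `fake_workload` is a hypothetical stand-in for whatever drives your application, and nothing here is part of MySQL Cluster itself.

```python
import statistics
import time


def measure(run_workload, repeats=5):
    """Run the workload `repeats` times and return per-run wall-clock seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_workload()
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Report median, min and max; the median is less sensitive to one slow run."""
    return {
        "median": statistics.median(timings),
        "min": min(timings),
        "max": max(timings),
    }


def fake_workload():
    # Hypothetical placeholder: replace with code that drives your application.
    sum(i * i for i in range(10_000))


if __name__ == "__main__":
    print(summarize(measure(fake_workload)))
```

Run the same harness before and after each optimization so that the effect of a single change is visible in isolation.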
Where has the time gone?
• Enable the slow query log
  – set global slow_query_log=1;
  – set global long_query_time=3; -- 3 seconds
  – set global log_queries_not_using_indexes=1;
  – Slow queries will be written in the slow query log:
mysql> show global variables like 'slow_query_log_file';
+---------------------+------------------------------+
| Variable_name       | Value                        |
+---------------------+------------------------------+
| slow_query_log_file | /data1/mysql/mysqld-slow.log |
+---------------------+------------------------------+
• Queries will be written in plain text
Query Analyzer in MySQL Enterprise Monitor (take the easy option!)
Simple database traffic generation
create.sql:
CREATE TABLE sub_name (sub_id INT NOT NULL PRIMARY KEY, name VARCHAR(30)) engine=ndb;
CREATE TABLE sub_age (sub_id INT NOT NULL PRIMARY KEY, age INT) engine=ndb;
INSERT INTO sub_name VALUES (1,'Bill'),(2,'Fred'),(3,'Bill'),(4,'Jane'),(5,'Andrew'),(6,'Anne'),(7,'Juliette'),(8,'Awen'),(9,'Leo'),(10,'Bill');
INSERT INTO sub_age VALUES (1,40),(2,23),(3,33),(4,19),(5,21),(6,50),(7,31),(8,65),(9,18),(10,101);

query.sql:
SELECT sub_age.age FROM sub_name, sub_age WHERE sub_name.name='Bill' AND sub_name.sub_id=sub_age.sub_id;
Simple database traffic generation
shell> mysqlslap --concurrency=5 --iterations=100 --query=query.sql --create=create.sql
Benchmark
  Average number of seconds to run all queries: 0.132 seconds
  Minimum number of seconds to run all queries: 0.037 seconds
  Maximum number of seconds to run all queries: 0.268 seconds
  Number of clients running queries: 5
  Average number of queries per client: 1
What on Earth is it doing?
EXPLAIN is your friend
mysql> EXPLAIN <query>;
mysql> EXPLAIN PARTITIONS <query>;
mysql> EXPLAIN EXTENDED <query>;
mysql> SHOW WARNINGS;
Boosting Performance
• ANALYZE TABLE
• Access patterns
• AQL (fast JOINs)
• Distribution aware
• Batching
• Schema
• Connection pools
• Multi-threaded data nodes
• NoSQL APIs
• Hardware
• More tips
Before you do anything else, ANALYZE!
• New for MySQL Cluster 7.2
• Lets the MySQL optimizer figure out how to best use indexes etc.
• Instantly speed up queries by many times
mysql> ANALYZE TABLE <tab-name>;
• Repeat after changing schema, adding/removing indexes or making major data changes
• Only needs running on one mysqld in the cluster
Access patterns
• Primary key reads/writes -> O(1)
  • Independent of database size and number of nodes
• Index searches -> O(log n)
  • n = number of tuples
• BLOBs are stored in a second table -> take longer to access
• JOINs massively faster in MySQL Cluster 7.2
• Partition pruning
  • By allowing a query to be satisfied by a single data node, reduce resource consumption -> greater throughput
  • If result sets are not large, will also reduce latency
Adaptive Query Localization: Scaling Distributed Joins
• Perform Complex Queries across Shards
  • JOINs pushed down to data nodes
  • Executed in parallel
  • Returns single result set to MySQL
• Opens Up New Use-Cases
  • Real-time analytics
  • Recommendations engines
  • Analyze click-streams
[Figure: mysqld and data nodes, shown without and with AQL; callout "70x More Performance"]
DON’T COMPROMISE FUNCTIONALITY TO SCALE-OUT !!
MySQL Cluster 7.2 AQL Test Query: Web-Based Content Management System
[Figure: test configuration with one MySQL Server and two data nodes (Data Node1, Data Node2)]
Web-Based CMS
Must analyze tables for best results:
mysql> ANALYZE TABLE <tab-name>;
[Figure: bar chart of Query Execution Time (seconds). MySQL Cluster 7.1: 87.23 seconds; MySQL Cluster 7.2: 1.26 seconds; callout "70x More Performance"]
Did I mention ANALYZE TABLE?
AQL – How to Use it
• Activated when ndb_join_pushdown is on (default)
• Rules for a Join to be pushed down:
  1. Joined columns use identical types
  2. No reference to BLOB or TEXT columns
  3. No explicit lock
  4. Child tables in the Join must be accessed using one of the ref, eq_ref, or const access methods
  5. Tables not explicitly partitioned by [LINEAR] HASH, LIST, or RANGE
  6. Query plan doesn’t select 'Using join buffer'
  7. If root of Join is an eq_ref or const, child tables must be joined by eq_ref
• Run ANALYZE TABLE <tab-name> on each table once
• Use EXPLAIN to see what components are being pushed down:
  • Extra: Child of 'd' in pushed join@1
  • EXPLAIN EXTENDED <query>; SHOW WARNINGS;
Distribution Aware Apps
• Partition selected using hash on Partition Key
  • Primary Key by default
  • User can override in table definition
• MySQL Server (or NDB API) will attempt to send transaction to the correct data node
  • If all data for the transaction are in the same partition, less messaging -> faster
• Aim to have all rows for high-running queries in same partition
Partition Key
Primary Key
town country population
Maidenhead UK 78000
Paris France 2193031
Boston UK 58124
Boston USA 617594
SELECT SUM(population) FROM towns WHERE country='UK';
Partition Key
Primary Key
town country population
Maidenhead UK 78000
Paris France 2193031
Boston UK 58124
Boston USA 617594
SELECT SUM(population) FROM towns WHERE town='Boston';
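The effect of the partition key choice can be sketched in a few lines of Python. This is an illustration only: the CRC32 hash and the count of four partitions are stand-ins (assumptions, not MySQL Cluster's real hashing or partition count), and it assumes one layout partitions on town alone while the other partitions on the full (town, country) primary key. The rows are the ones from the tables above.

```python
import zlib

NUM_PARTITIONS = 4  # assumption for illustration only

# (town, country, population) rows from the slide
TOWNS = [
    ("Maidenhead", "UK", 78000),
    ("Paris", "France", 2193031),
    ("Boston", "UK", 58124),
    ("Boston", "USA", 617594),
]


def partition_of(key_values):
    """Map partition-key values to a partition using a stand-in hash (CRC32)."""
    key = "|".join(key_values).encode("utf-8")
    return zlib.crc32(key) % NUM_PARTITIONS


def partitions_touched(rows, key_columns, predicate):
    """Return the set of partitions holding rows that match the predicate."""
    return {
        partition_of([row[c] for c in key_columns])
        for row in rows
        if predicate(row)
    }


def is_boston(row):
    return row[0] == "Boston"


# Partition key = town only: every 'Boston' row hashes to the same partition.
by_town = partitions_touched(TOWNS, key_columns=[0], predicate=is_boston)
print("partition key (town):", by_town)

# Partition key = (town, country): 'Boston' rows may land in different partitions.
by_pk = partitions_touched(TOWNS, key_columns=[0, 1], predicate=is_boston)
print("partition key (town, country):", by_pk)
```

When the partition key matches the column in the WHERE clause, the matching rows are guaranteed to share a partition, which is what lets the query be served by a single data node.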
Distribution Aware – Multiple Tables
• Extend partition awareness over multiple tables
• Same rule – aim to have all data for each instance of a high-running transaction in the same partition
ALTER TABLE service_ids PARTITION BY KEY(sub_id);
EXPLAIN PARTITIONS <query>;
Partition Key
Primary Key
service sub_id svc_id
twitter 19724 76325732
twitter 84539 67324782
facebook 19724 83753984
facebook 73642 87324793
Partition Key
Primary Key
sub_id age gender
19724 25 male
84539 43 female
19724 16 female
74574 21 female
Validate if “partition pruning” is working
mysql> SHOW GLOBAL STATUS LIKE 'ndb_pruned_scan_count';
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| Ndb_pruned_scan_count | 12    |
+-----------------------+-------+
mysql> SELECT * FROM services WHERE sub_id=1;
+--------+--------------+--------------+
| sub_id | service_name | service_parm |
+--------+--------------+--------------+
| 1      | IM           | 878          |
| 1      | ssh          | 666          |
| 1      | Video        | 654          |
+--------+--------------+--------------+
mysql> SHOW GLOBAL STATUS LIKE 'ndb_pruned_scan_count';
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| Ndb_pruned_scan_count | 13    |
+-----------------------+-------+
Batching
• MySQL Cluster allows batching on
  • Inserts, index scans (when not part of a JOIN), PK reads, PK deletes, and (most) PK updates
• Batching means that one network round trip is used to read/modify a number of records → less ping-pong!
• If you can batch - do it!
• Example – Insert 1M records
  • No batching:
    • INSERT INTO t1(data) VALUES (<data>);
    • 765 seconds to insert 1M records
  • Batching (batches of 16):
    • INSERT INTO t1(<columns>) VALUES (<data>), (<data>)...
    • 50 seconds to insert 1M records
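As a rough sketch of how an application might group rows into multi-row INSERT statements, the Python below builds one statement per batch of 16. The helper name `batched_inserts`, the example columns, and the `%s` placeholder style (as used by common Python MySQL drivers) are assumptions for illustration; the statements are only built here, not executed.

```python
def batched_inserts(table, columns, rows, batch_size=16):
    """Yield (sql, params) pairs, one multi-row INSERT per batch of rows.

    `table` and `columns` must be trusted identifiers, not user input.
    """
    col_list = ", ".join(columns)
    row_placeholder = "(" + ", ".join(["%s"] * len(columns)) + ")"
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        sql = "INSERT INTO {} ({}) VALUES {}".format(
            table, col_list, ", ".join([row_placeholder] * len(batch))
        )
        # Flatten the batch so the parameters line up with the placeholders.
        params = [value for row in batch for value in row]
        yield sql, params


# Hypothetical data: 40 rows become 3 statements (16 + 16 + 8 rows).
rows = [(i, "data-{}".format(i)) for i in range(40)]
statements = list(batched_inserts("t1", ["id", "data"], rows))
print(len(statements), "statements instead of", len(rows))
```

Each yielded pair could then be passed to a driver's `cursor.execute(sql, params)`, so that every network round trip carries up to 16 rows.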
Batching, more examples
SELECT * FROM t1 WHERE userid=1 AND serviceid IN (1,2,3,4,5,7,8,9,10);

SET transaction_allow_batching=1; -- must be set on the connection
BEGIN;
INSERT INTO t1 ....;
INSERT INTO t2 ....;
INSERT INTO t3 ....;
INSERT INTO t4 ....;
DELETE FROM t5 ....;
UPDATE t1 SET value='new value' WHERE id=1;
COMMIT;
Optimizing schema - denormalization
Before (two tables, joined):
SELECT * FROM bb,voip WHERE bb.userid=voip.userid AND bb.userid=1;

bb
userid  bb_data
1       <data>
2       <data>
3       <data>
4       <data>

voip
userid  voip_data
1       <data>
2       <data>
3       <data>
4       <data>

After (one denormalized table):
mysql> SELECT * FROM voip_bb WHERE userid=1;

voip_bb
userid  voip_data  bb_data
1       <data>     <data>
2       <data>     <data>
3       <data>     <data>
4       <data>     <data>

1.7x improvement
Connection Pools
• Network hops increase latency (e.g. compared with InnoDB read of cached data)
• Increase throughput by sending in lots of parallel operations
  • Multiple client connections (sessions) to each MySQL Server
  • Multiple MySQL Servers
• Connection pooling between MySQL Servers and data nodes
  • Set ndb-cluster-connection-pool > 1 in my.cnf
  • Ensure enough [api] sections in config.ini
    • Don’t assign hostnames!
[Figure: application threads reaching the data nodes, some through mysqld servers and some directly through the NDB API]
Multi-threaded data nodes
• Scaling out on commodity hardware is the standard way to increase performance
  • Add more data nodes and API nodes as required
• MySQL Cluster 7.2 increases the ability to also scale-up each data node
  • Increases maximum number of utilised threads from 8 to 59
[Figure: application nodes above two node groups; Node Group 1 holds Node 1 and Node 2, Node Group 2 holds Node 3 and Node 4]
Multi-threaded data nodes
• Threads:
  • ldm: 1, 2, 4, 8 or 12 Local Query Handler threads
  • tc: typically ldm/4 Transaction Coordinator threads
  • send: ~2-3 Send threads
  • recv: ~2-4 Receive threads
  • main: 1 Main thread
  • rep: 1 Replication thread
  • io: 1 I/O thread
[Figure: Data Node 1 with its recv, tc, ldm, send, main, rep and io threads, connected to the application nodes]
Multi-threaded data nodes
• Applies to ndbmtd only
• Configure through either:
  • MaxNoOfExecutionThreads
    • Single value for number of threads
    • System will allocate these threads to blocks in a reasonable way
  • ThreadConfig
    • Specify explicitly how many threads for each block type
    • Lock threads to CPUs for further performance gains
    • ThreadConfig=main={cpubind=0},ldm={count=4,cpubind=1,2,5,6},io={count=2,cpubind=3,4}
Multi-threaded data nodes – starting point for ThreadConfig
        24 threads  32 threads  40 threads  48 threads
ldm     8           12          16          16
tc      4           6           8           12
recv    3           3           4           5
send    3           3           4           4
rep     1           1           1           1

Note that some threads are left for other data node blocks as well as the OS
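To show how a row of the starting-point table might be turned into a ThreadConfig value, here is a small Python sketch. The helper `thread_config` and the dictionary layout are assumptions for illustration; the counts are copied from the table above, no CPU binding is generated, and the result should be treated as a starting point to tune rather than a recommended setting.

```python
# Starting-point thread counts from the table above, keyed by the
# total number of threads available to the data node.
STARTING_POINTS = {
    24: {"ldm": 8, "tc": 4, "recv": 3, "send": 3, "rep": 1},
    32: {"ldm": 12, "tc": 6, "recv": 3, "send": 3, "rep": 1},
    40: {"ldm": 16, "tc": 8, "recv": 4, "send": 4, "rep": 1},
    48: {"ldm": 16, "tc": 12, "recv": 5, "send": 4, "rep": 1},
}


def thread_config(total_threads):
    """Build a ThreadConfig string (thread counts only, no cpubind)."""
    counts = STARTING_POINTS[total_threads]
    parts = ["{}={{count={}}}".format(name, count) for name, count in counts.items()]
    return "ThreadConfig=" + ",".join(parts)


def threads_left_over(total_threads):
    """Threads not assigned above, left for other blocks and the OS."""
    return total_threads - sum(STARTING_POINTS[total_threads].values())


print(thread_config(24))
print("left over:", threads_left_over(24))
```

For the 24-thread column this assigns 19 threads explicitly, which leaves 5 for the other data node blocks and the OS, in line with the note above.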
NoSQL APIs
• SQL: Complex, relational queries
• Memcached: Key-Value web services
• Java: Enterprise Apps
• NDB API: Real-time services
Mix & Match
[Figure: clients reaching the data nodes through the NDB API, either natively or via memcached, HTTP/REST, JDBC / ODBC, PHP / PERL and Python / Ruby]
Hardware
• High bandwidth, low latency network
  • Turn off firewalls if you can
• Use multiple disks
  • Checkpoints
  • Log files
  • Table spaces
• SSDs
  • Biggest benefit for Table spaces
• Refer to MySQL Cluster Evaluation Guide for more details
MySQL Query Cache
• Don't enable the Query Cache!
  • It is very expensive to invalidate over multiple MySQL servers
  • A write on one server will force the others to purge their cache.
• If you have tables that are read only (or change very seldom):
  my.cnf:
    query_cache_size=1000000
    query_cache_type=2 (ON DEMAND)
  mysql> SELECT SQL_CACHE <cols> .. FROM table;
  • SQL_CACHE tells (demands) MySQL to cache the results from this SELECT
  • This can be good for STATIC data
Non-Durable tables
• Some types of tables account for a lot of WRITEs, but do not need to be recovered (e.g., session tables)
• Unnecessary to persist such tables - no REDO LOGs or CHECKPOINTs
• Create these tables as 'NO LOGGING' tables:
mysql> set ndb_table_no_logging=1;
mysql> create table session_table(..) engine=ndb;
mysql> set ndb_table_no_logging=0;
• After system restart table will be there, but empty!
More optimisation tips
• When using auto-increment columns, increase ndb-autoincrement-prefetch-sz
• Set RedoBuffer=32-64M
• Disk-based tables:
  • Increase UNDO_BUFFER for write-intensive apps
  • Increase DiskIOThreadPool
  • Increase DiskPageBufferMemory for better caching in the data nodes; Monitor effectiveness using NDBINFO / MEM
• FragmentLogFileSize=256M
• NoOfFragmentLogFiles = 6 x DataMemory (in MB) / (4 x 256MB)
• Use OPTIMIZE TABLE and perform rolling restarts if memory fragmentation is an issue
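The NoOfFragmentLogFiles rule of thumb above is simple arithmetic, sketched here in Python. The function name is hypothetical, the 8 GB DataMemory figure is just an example input, and rounding up to a whole number of files is an assumption on top of the slide's formula.

```python
import math


def no_of_fragment_log_files(data_memory_mb, fragment_log_file_size_mb=256):
    """Apply: 6 x DataMemory (MB) / (4 x FragmentLogFileSize (MB)).

    Rounding up is an assumption; the slide gives only the formula.
    """
    return math.ceil(6 * data_memory_mb / (4 * fragment_log_file_size_mb))


# Example: 8 GB of DataMemory with FragmentLogFileSize=256M
# 6 x 8192 / (4 x 256) = 49152 / 1024 = 48
print(no_of_fragment_log_files(8192))
```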
Scaling out with MySQL Cluster Manager
[Figure: four hosts, each running an MCM agent, with a client connected to the agents. Hosts 192.168.0.10 and 192.168.0.11 each run ndb_mgmd and mysqld; hosts 192.168.0.12 and 192.168.0.13 each run two ndbd processes.]
Scaling out with MySQL Cluster Manager
[Figure: the same cluster after scaling out. Two hosts are added, 192.168.0.14 and 192.168.0.15, each running an agent and two ndbd processes; two further mysqld processes are also added.]
Scaling out with MySQL Cluster Manager
mcm> add hosts --hosts=192.168.0.14,192.168.0.15 mysite;
mcm> add package --basedir=/usr/local/mysql_7_0_9 --hosts=192.168.0.14,192.168.0.15 7.0.9;
mcm> add process [email protected],[email protected],[email protected],[email protected],[email protected],[email protected] -s port:mysqld:52=3307,port:mysqld:53=3307 mycluster;
mcm> start process --added mycluster;
mysql> ALTER ONLINE TABLE <table-name> REORGANIZE PARTITION;
mysql> OPTIMIZE TABLE <table-name>;
Download MySQL Cluster Today!
http://www.mysql.com/downloads/cluster/#downloads
Further resources
• MySQL Cluster Performance white paper: http://www.mysql.com/why-mysql/white-papers/mysql_wp_cluster_performance.php
• MySQL Cluster Forum: http://forums.mysql.com/list.php?25
• MySQL Cluster Evaluation guide: http://www.mysql.com/why-mysql/white-papers/mysql_cluster_eval_guide.php
• MySQL Cluster in Web-Scale Architectures: http://www.mysql.com/why-mysql/white-papers/mysql_cluster_eval_guide.php