MySQL Performance Tuning course
Uploaded by alberto-centanni
Transcript of the MySQL Performance Tuning course
MySQL Performance Tuning

Course topics

Introduction
  MySQL Overview
  MySQL Products and Tools
  MySQL Services and Support
  MySQL Web Pages
  MySQL Courses
  MySQL Certification
  MySQL Documentation
Performance Tuning Basics
  Thinking About Performance
  Areas to Tune
  Performance Tuning Terminology
  Benchmark Planning
  Benchmark Errors
  Tuning Steps
  General Tuning Session
  Deploying MySQL and Benchmarking
Performance Tuning Tools
  MySQL Monitoring Tools
  Open Source Community Monitoring Tools
  Benchmark Tools
  Stress Tools
MySQL Server Tuning
  Major Components of the MySQL Server
  MySQL Thread Handling
  MySQL Memory Usage
  Simultaneous Connections in MySQL
  Reusing Threads
  Effects of Thread Caching
  Reusing Tables
  Setting table_open_cache
MySQL Query Cache
  MySQL Query Cache
  When to Use the MySQL Query Cache
  When NOT to Use the MySQL Query Cache
  MySQL Query Cache Settings
  MySQL Query Cache Status Variables
  Improve Query Cache Results
InnoDB
  InnoDB Storage Engine
  InnoDB Storage Engine Uses
  Using the InnoDB Storage Engine
  InnoDB Log Files and Buffers
  Committing Transactions
  InnoDB Table Design
  SHOW ENGINE INNODB STATUS
  InnoDB Monitors and Settings
MyISAM
  MyISAM Storage Engine Uses
  MyISAM Table Design
  Optimizing MyISAM
  MyISAM Table Locks
  MyISAM Settings
  MyISAM Key Cache
  MyISAM Full-Text Search
Other MySQL Storage Engines and Issues
  Large Objects
  MEMORY Storage Engine Uses
  MEMORY Storage Engine Performance
  Multiple Storage Engine Advantages
  Single Storage Engine Advantages
Schema Design and Performance
  Schema Design Considerations
  Normalization and Performance
  Schema Design
  Data Types
  Indexes
  Partitioning
MySQL Query Performance
  General SQL Tuning Best Practices
  EXPLAIN
  MySQL Optimizer
  Finding Problematic Queries
  Improve Query Executions
  Locate and Correct Problematic Queries
Performance Tuning Extras
  Configuring Hardware
  Considering Operating Systems
  Operating Systems Configurations
  Logging
  Backup and Recovery
Introduction: MySQL Overview
MySQL is a database management system.
A database is a structured collection of data.
MySQL databases are relational.
A relational database stores data in separate tables rather than putting all the data in one big storeroom.
MySQL software is Open Source.
Open Source means that it is possible for anyone to use and modify the software.
MySQL Server works in client/server or embedded systems.
The MySQL Database Software is a client/server system that consists of a multi-threaded SQL server that supports different backends, several different client programs and libraries, administrative tools, and a wide range of application programming interfaces (APIs).
Introduction: MySQL Products and Tools
MySQL Database Server
It is a fully integrated, transaction-safe, ACID-compliant database with full commit, rollback, crash recovery, and row-level locking capabilities.
MySQL Connectors
MySQL provides standards-based drivers for JDBC, ODBC, and .Net enabling developers to build database applications
MySQL Replication
MySQL Replication enables users to cost-effectively deliver application performance, scalability and high availability.
MySQL Fabric
MySQL Fabric is an extensible framework for managing farms of MySQL Servers.
MySQL Partitioning
MySQL Partitioning enables developers and DBAs to improve database performance and simplify the management of very large databases.
MySQL Utilities
MySQL Utilities is a set of command-line tools that are used to work with MySQL servers.
MySQL Workbench
MySQL Workbench provides data modeling, SQL development, and comprehensive administration tools for server configuration, user administration, backup, and much more.
Introduction: MySQL Services and Support
MySQL Technical Support Services provide direct access to our expert MySQL Support engineers who are ready to assist you in the development, deployment, and management of MySQL applications.
Even though you might have highly skilled technical staff that can solve your issues, MySQL Support Engineers can typically solve those same issues a lot faster. A vast majority of the problems the MySQL Support Engineers encounter, they have seen before. So an issue that could take several weeks for your staff to research and resolve, may be solved in a matter of hours by the MySQL Support team.
Introduction: MySQL Web Pages
Home page: http://www.mysql.com/
Downloads: http://www.mysql.com/downloads/
Documentation: http://dev.mysql.com/doc/
Developer Zone: http://dev.mysql.com/
Introduction: MySQL Courses
MySQL Database Administrator
MySQL for Beginners
MySQL for Database Administrators
MySQL Performance Tuning
MySQL High Availability
MySQL Cluster
MySQL Developer
MySQL for Beginners
MySQL and PHP - Developing Dynamic Web Applications
MySQL for Developers
MySQL Developer Techniques
MySQL Advanced Stored Procedures
Introduction: MySQL Certification
Competitive Advantage
The rigorous process of becoming Oracle certified makes you a better technologist. The knowledge gained through training and practice will significantly expand your skill set and increase your credibility when interviewing for jobs.
Salary Advancement
Companies value skilled workers. According to Oracle's 2012 salary survey, more than 80% of Oracle Certified individuals reported a promotion, compensation increase or other career improvements as a result of becoming certified.
Opportunity and Credibility
The skills and knowledge gained by becoming certified will lead to greater confidence and increased career security. Expanded skill set will also help unlock opportunities with employers and potential employers.
Introduction: MySQL Documentation
The main source of official MySQL documentation is found at
http://dev.mysql.com/doc/ or http://docs.oracle.com/cd/E17952_01/
MySQL is a well-documented database system, so it is usually easy to find whatever you need.
Performance Tuning Basics: Thinking About Performance
Performance is measured by the time required to complete a task. In other words, performance is response time.
A database server’s performance is measured by query response time, and the unit of measurement is time per query.
So if the goal is to reduce response time, we need to understand why the server requires a certain amount of time to respond to a query, and reduce or eliminate whatever unnecessary work it’s doing to achieve the result.
In other words, we need to measure where the time goes. This leads to our second important principle of optimization: you cannot reliably optimize what you cannot measure.
Your first job is therefore to measure where time is spent.
Performance Tuning Basics: Areas to Tune
Performance usually hinges on a few areas:
Hardware
MySQL Configuration
Schema and Queries
Application Architecture
Performance Tuning Basics: Areas to Tune -> Hardware
CPU
MySQL works well on 64-bit architectures, which are now the default. Make sure you use a 64-bit operating system on 64-bit hardware.
The number of CPUs MySQL can use effectively and how it scales under increasing load depend on both the workload and the system architecture.
The CPU architecture (RISC, CISC, depth of pipeline, etc.), CPU model, and operating system all affect MySQL’s scaling pattern.
In practice, CPUs with up to 24 cores are a good choice.
RAM
The biggest reason to have a lot of memory isn’t so you can hold a lot of data in memory: it’s ultimately so you can avoid disk I/O, which is orders of magnitude slower than accessing data in memory. The trick is to balance the memory and disk size, speed, cost, and other qualities so you get good performance for your workload.
To ensure reliable operation and a good performance standard, a MySQL environment can usefully be sized up to hundreds of GB of memory.
I/O
The main bottleneck in a database environment is usually located at the mechanical layer: disk drives and storage. Transaction logs and temporary spaces are heavy consumers of I/O and affect performance for all users of the database. Disks spend time waiting on spindle rotation and seeks for read and write operations, and on swapping between RAM and dedicated partitions.
Storage engines often keep their data and/or indexes in single large files, which means RAID (Redundant Array of Inexpensive Disks) is usually the most feasible option for storing a lot of data. RAID can help with redundancy, storage size, caching, and speed.
Network
Modern NICs (network interface cards) are capable of high speeds, high bandwidth, and low latency.
For best performance and robustness, dedicated servers can rely on the bonding and teaming features of the OS.
1Gb Ethernet is good enough to ensure optimal throughput even in clustered configurations.
Measure, that is, find the bottleneck or limiting resource:
CPU
RAM
I/O
Network bandwidth
Measure I/O: vmstat and iostat (from sysstat package)
Measure RAM: ps, free, top
Measure CPU: top, vmstat, dstat
Measure network bandwidth: dstat, ifconfig
Performance Tuning Basics: Areas to Tune -> MySQL Configuration
MySQL allows a DBA or developer to modify parameters including the maximum number of client connections, the size of the query cache, the execution style of different logs, index memory cache size, the network protocol used for client-server communications, and dozens of others. This is done by editing the “my.cnf” configuration file, as in this example:
[mysqld]
performance_schema
performance_schema_events_waits_history_size=20
performance_schema_events_waits_history_long_size=15000
slow_query_log = 1
slow_query_log_file = slow_query.log
long_query_time = 1
log_queries_not_using_indexes = 1
Performance Tuning Basics: Areas to Tune -> Schema and Queries
Queries here means the usual sequence of SELECT, INSERT, UPDATE, and DELETE statements.
A database is designed to handle queries quickly, efficiently and reliably.
"Quickly" means getting a good response time in any circumstance
"Efficiently" means a wise use of resources such as CPU, memory, I/O, and disk space. Practically speaking, this translates into higher revenue and less human effort.
"Reliably" means High Availability. High availability and performance come together to ensure continuity and fast responses.
Performance Tuning Basics: Areas to Tune -> Application Architecture
Not all application performance problems come from MySQL, and not all of those that do can be resolved at the MySQL level.
Rethinking how application logic translates into queries is one of the architectural questions, and often a great optimization.
To make an application work better, it is fundamental to tune the statements, tune the code, and tune the logic behind it.
Performance Tuning Basics: Performance Tuning Terminology
Bottleneck: The part of a system which is at capacity. Other parts of the system will be idle, waiting for it to perform its task.
Capacity: The total workload a system can handle without violating predetermined key performance acceptance criteria.
Investigation: An activity based on collecting information related to the speed, scalability, and/or stability characteristics of the product under test that may have value in determining or improving product quality. Investigation is frequently employed to prove or disprove hypotheses regarding the root cause of one or more observed performance issues.
Latency: Delay experienced in network transmissions as network packets traverse the network infrastructure.
Metrics: Measurements obtained by running performance tests, expressed on a commonly understood scale. Some metrics commonly obtained through performance tests include processor utilization over time and memory usage by load.
Performance: Information regarding your application’s response times, throughput, and resource utilization levels.
Resource utilization: The cost of the project in terms of system resources. The primary resources are processor, memory, disk I/O, and network I/O.
Response time: A measure of how responsive an application or subsystem is to a client request.
Scalability: An application’s ability to handle additional workload, without adversely affecting performance, by adding resources such as processor, memory, and storage capacity.
Stress test: A type of performance test designed to evaluate an application’s behaviour when it is pushed beyond normal or peak load conditions. The goal of stress testing is to reveal application bugs that surface only under high load conditions, such as synchronization issues, race conditions, and memory leaks. Stress testing enables you to identify your application’s weak points, and shows how the application behaves under extreme load conditions.
Throughput: Typically expressed in transactions per second (TPS); expresses how many operations or transactions can be processed in a set amount of time.
Utilization: In the context of performance testing, the percentage of time that a resource is busy servicing user requests. The remaining percentage of time is considered idle time.
Workload: The stimulus applied to a system, application, or component to simulate a usage pattern, in regard to concurrency and/or data inputs. The workload includes the total number of users, concurrent active users, data volumes, and transaction volumes, along with the transaction mix.
Performance Tuning Basics: Planning a Benchmark
Designing and Planning a Benchmark
The first step in planning a benchmark is to identify the problem and the goal. Next, decide whether to use a standard benchmark or design your own.
Next, you need queries to run against the data. You can make a unit test suite into a rudimentary benchmark just by running it many times, but that’s unlikely to match how you really use the database.
How Long Should the Benchmark Last?
It’s important to run the benchmark for a meaningful amount of time.
Most systems have some buffers that create burstable capacity — the ability to absorb spikes, defer some work, and catch up later after the peak is over.
Capturing System Performance and Status
It is important to capture as much information about the system under test (SUT) as possible while the benchmark runs.
It’s a good idea to make a benchmark directory with subdirectories for each run’s results. You can then place the results, configuration files, measurements, scripts, and notes for each run in the appropriate subdirectory.
Getting Accurate Results
The best way to get accurate results is to design your benchmark to answer the question you want to answer.
Are you capturing the data you need to answer the question? Are you benchmarking by the wrong criteria? For example, are you running a CPU-bound benchmark to predict the performance of an application you know will be I/O-bound?
Performance Tuning Basics: Benchmark Errors
The BENCHMARK() function can be used to compare the speed of MySQL functions or operators. For example:
mysql> SELECT BENCHMARK(100000000, CONCAT('a','b'));
However, this cannot be used to compare queries:
mysql> SELECT BENCHMARK(100, SELECT `id` FROM `lines`);
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'SELECT `id` FROM `lines`)' at line 1
MySQL needs a fraction of a second just to parse a query, and the system is probably busy doing other things too. Benchmarks with runtimes of less than 5-10 seconds can therefore be considered meaningless, and runtime differences of that order of magnitude pure chance.
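To time a real query instead, run it repeatedly against a warmed-up server, or use the session profiler available in MySQL 5.5 (deprecated in later releases). A minimal sketch, reusing the `lines` table from the error example above:

```sql
SET profiling = 1;
SELECT `id` FROM `lines`;
SHOW PROFILES;              -- each statement with its measured duration
SHOW PROFILE FOR QUERY 1;   -- per-stage breakdown of the first statement
```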
As a general rule, when you run multiple instances of any benchmarking tool and increase the number of concurrent connections, you might encounter a "Too many connections" error. You then need to adjust MySQL's max_connections variable, which controls the maximum number of concurrent connections allowed by the server.
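A minimal sketch of diagnosing and raising the limit; the value 500 is only an example, and the change should also be persisted in my.cnf so it survives a restart:

```sql
SHOW VARIABLES LIKE 'max_connections';           -- current limit
SHOW GLOBAL STATUS LIKE 'Max_used_connections';  -- peak reached since startup
SET GLOBAL max_connections = 500;                -- runtime change, example value
```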
Performance Tuning Basics: Tuning Steps
Step 1 - Storage Engines (MyISAM, InnoDB)
Step 2 - Connections
Step 3 - Sessions
Step 4 - Query Cache
Step 5 - Queries
Step 6 - Schema
Performance Tuning Basics: Tuning Steps – Step 1 – Storage Engines
MySQL supports multiple storage engines:
MyISAM - Original Storage Engine, great for web apps
InnoDB - Robust transactional storage engine
Memory Engine - Stores all data in Memory
InfoBright - Large scale data warehouse with 10x or more compression
Kickfire - Appliance-based; world's fastest 100GB TPC-H result
To see what tables are in what engines
mysql> SHOW TABLE STATUS ;
Selecting the storage engine to use is a tuning decision
mysql> alter table tab engine=myisam ;
Performance Tuning Basics: Tuning Steps – Step 1 – MyISAM
The primary tuning factors in MyISAM are its two caches:
key_buffer_cache should be 25% of available memory
system cache - leave 75% of available memory free
Available memory is:
All on a dedicated server, if the server has 8GB, use 2GB for the key_buffer_cache and leave the rest free for the system cache to use.
Percent of the part of the server allocated for MySQL, i.e. if you have a server with 8GB, but are using 4GB for other applications then use 1GB for the key_buffer_cache and leave the remaining 3GB free for the system cache to use.
Maximum size for a single key buffer cache is 4GB
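Applying the 25% rule to the hypothetical dedicated 8GB server above:

```sql
-- 2GB key cache; the remaining memory stays free for the OS cache
SET GLOBAL key_buffer_size = 2 * 1024 * 1024 * 1024;
SHOW VARIABLES LIKE 'key_buffer_size';
```

In my.cnf the equivalent is key_buffer_size = 2G under [mysqld].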
mysql> show status like 'Key%' ;
Key_blocks_not_flushed - Dirty key blocks not flushed to disk
Key_blocks_unused - unused blocks in the cache
Key_blocks_used - used Blocks in the cache
% of cache free : Key_blocks_unused /( Key_blocks_unused + Key_blocks_used )
Key_read_requests - key read requests to the cache
Key_reads - times a key read request went to disk
Cache read hit % : 1 - ( Key_reads / Key_read_requests )
Key_write_requests - key write requests to the cache
Key_writes - times a key write request went to disk
Cache write hit % : 1 - ( Key_writes / Key_write_requests )
To see the system cache in Linux:
$ cat /proc/meminfo
Performance Tuning Basics: Tuning Steps – Step 1 – InnoDB
Unlike MyISAM, InnoDB uses a single cache for both index and data
innodb_buffer_pool_size - should be 70-80% of available memory.
It is not uncommon for this to be very large on dedicated servers, i.e. tens of GB on a system with 40GB of memory.
Make sure it's not set so large as to cause swapping!
mysql> show status like 'Innodb_buffer%' ;
InnoDB can use direct I/O on systems that support it: Linux, FreeBSD, and Solaris.
innodb_flush_method = O_DIRECT
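Put together, a my.cnf sketch for a hypothetical dedicated 8GB Linux server; the sizes are assumptions to adapt to your own memory budget:

```
[mysqld]
# roughly 75% of 8GB on a dedicated server; too large causes swapping
innodb_buffer_pool_size = 6G
# direct I/O, bypassing the OS cache (Linux, FreeBSD, Solaris)
innodb_flush_method = O_DIRECT
```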
Performance Tuning Basics: Tuning Steps – Step 2 – Connections
MySQL caches the threads used by a connection
mysql> show status like 'Thread%';
thread_cache_size - Number of threads to cache
Setting this to 100 or higher is not unusual
Monitor Threads_created to see if this is an issue
Counts connections not using the thread cache
Should be less than 1-2 a minute
Usually only an issue if more than 1-2 a second
Only an issue if you create and drop a lot of connections, i.e. PHP
Overhead is usually about 250k per thread
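A sketch of monitoring and enabling the thread cache; 100 is an example value, per the guideline above:

```sql
SHOW GLOBAL STATUS LIKE 'Threads%';  -- watch Threads_created over time
SET GLOBAL thread_cache_size = 100;  -- persist in my.cnf as well
```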
Performance Tuning Basics: Tuning Steps – Step 3 – Sessions
Some session variables control space allocated by each session (connection)
Setting these too small can give bad performance
Setting these too large can cause the server to swap!
Can be set per connection:
SET sort_buffer_size = 1024*1024*128;
Set small by default; increase in connections that need it
sort_buffer_size
Used for ORDER BY, GROUP BY, SELECT DISTINCT, UNION DISTINCT
Monitor Sort_merge_passes < 1-2 an hour optimal
Usually a problem in a reporting or data warehouse database
Other important session variables
read_rnd_buffer_size - Set to 1/2 sort_buffer_size
join_buffer_size - used for joins without indexes (bad!); watch Select_full_join
read_buffer_size - Used for full table scans, watch Select_scan
tmp_table_size - Max temp table size in memory, watch Created_tmp_disk_tables
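Following the advice to keep defaults small and raise buffers only where needed, a session running one heavy report might do this (table and column names are hypothetical):

```sql
SET SESSION sort_buffer_size = 1024 * 1024 * 128;  -- 128MB for this session only
SELECT customer_id, SUM(total) AS revenue
FROM orders GROUP BY customer_id ORDER BY revenue DESC;
SET SESSION sort_buffer_size = DEFAULT;            -- back to the small default
```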
Performance Tuning Basics: Tuning Steps – Step 4 – Query Cache
MySQL Query Cache caches both the query and the full result set
query_cache_type - Controls behavior
0 or OFF - Not used (buffer may still be allocated)
1 or ON cache all unless SELECT SQL_NO_CACHE (DEFAULT)
2 or DEMAND cache none unless SELECT SQL_CACHE
query_cache_size - Determines the size of the cache
mysql> show status like 'Qc%' ;
Gives great performance if:
Identical queries returning identical data are used often
No or rare inserts, updates or deletes
Best Practice
Set to DEMAND
Add SQL_CACHE to appropriate queries
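With query_cache_type = 2 (DEMAND) in my.cnf, only queries that opt in are cached; the table and column names below are hypothetical:

```sql
-- cached: identical, frequently repeated lookup
SELECT SQL_CACHE id, name FROM products WHERE active = 1;
-- not cached: volatile query that would just churn the cache
SELECT id, name FROM products WHERE updated_at > NOW() - INTERVAL 1 HOUR;
```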
Performance Tuning Basics: Tuning Steps – Step 5 – Queries
Often the #1 issue in overall performance
Always have the slow query log on
http://dev.mysql.com/doc/refman/5.5/en/slow-query-log.html
Analyze using mysqldumpslow
Use: log_queries_not_using_indexes
Check it regularly
Use mysqldumpslow
Best practice is to automate running mysqldumpslow every morning and email results to DBA, DBDev, etc.
Understand and use EXPLAIN
Select_scan - Number of full table scans
Select_full_join - Joins without indexes
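A sketch of that daily routine; the log path is an assumption and should match your slow_query_log_file setting:

```shell
# top 10 slow queries by total time; similar statements are grouped together
mysqldumpslow -s t -t 10 /var/lib/mysql/slow_query.log
```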
The IN clause in MySQL is very fast!
Select ... Where idx IN (1,23,345,456) - Much faster than a join
Don’t wrap your indexes in expressions in Where
Select ... Where func(idx) = 20 [index ignored]
Select .. Where idx = otherfunc(20) [may use index]
Best practice : Keep index alone on left side of condition
Avoid % at the start of LIKE on an index
Select ... Where idx LIKE 'ABC%' can use the index
Select ... Where idx LIKE '%XYZ' must do a full table scan
Use union all when appropriate, default is union distinct!
Understand left/right joins and use only when needed.
Performance Tuning Basics: Tuning Steps – Step 6 – Schema
Too many indexes slow down inserts/deletes
Use only the indexes you must have
Check often
mysql> show create table tabname ;
Don’t duplicate leading parts of compound keys
index key123 (col1,col2,col3)
index key12 (col1,col2) <- Not needed!
index key1 (col1) <-- Not needed!
Use prefix indexes on large keys
Best indexes are 16 bytes/chars or less
Indexes bigger than 32 bytes/chars should be looked at very closely
and should have their own key cache if in MyISAM
For large strings that need to be indexed, i.e. URLs, consider using a separate column that stores an MD5 hash of the value as the key.
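A sketch of the hash-key approach with a hypothetical urls table:

```sql
CREATE TABLE urls (
  id      INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  url     TEXT NOT NULL,
  url_md5 CHAR(32) NOT NULL,     -- MD5 of url: short, fixed-width, indexable
  KEY idx_url_md5 (url_md5)
);
INSERT INTO urls (url, url_md5)
VALUES ('http://example.com/page', MD5('http://example.com/page'));
-- lookups go through the short hash index, not the long TEXT column
SELECT id, url FROM urls WHERE url_md5 = MD5('http://example.com/page');
```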
Size = performance, smaller is better
Size is important. Do not automatically use 255 for VARCHAR
Temp tables, most caches, expand to full size
Use “procedure analyse” to determine the optimal types given the values in your table
mysql> select * from tab procedure analyse (64,2000)\G
Consider the types:
enum: http://dev.mysql.com/doc/refman/5.5/en/enum.html
set: http://dev.mysql.com/doc/refman/5.5/en/set.html
Compress large strings
Use the MySQL COMPRESS and UNCOMPRESS functions
Very important in InnoDB!
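A sketch with a hypothetical docs table; the column must be a binary type (BLOB/VARBINARY), since COMPRESS() returns binary data:

```sql
CREATE TABLE docs (id INT PRIMARY KEY, body BLOB);
INSERT INTO docs VALUES (1, COMPRESS(REPEAT('some long text ', 1000)));
SELECT UNCOMPRESS(body) FROM docs WHERE id = 1;
```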
Performance Tuning Basics: General Tuning Session
Never make a change in production first
Have a good benchmark or reliable load
Start with a good baseline
Only change one thing at a time:
identify a set of possible changes
try each change separately
try in combinations of 2, then 3, etc.
Monitor the results
Query performance - query analyzer, slow query log, etc.
throughput
single query time
average query time
CPU - top, vmstat
IO - iostat, top, vmstat, bonnie++
Network bandwidth
Document and save the results
Performance Tuning Basics: Deploying MySQL and Benchmarking
Benchmarking can be a very revealing process. It can be used to isolate performance problems, and drill down to specific bottlenecks. More importantly, it can be used to compare different servers in your environment, so you have an expectation of performance from those servers, before you put them to work servicing your application.
MySQL can be deployed on a spectrum of different servers. Some may be servers we physically set up in a data centre, while others are managed hosting servers, and still others are cloud hosted.
Benchmarking can help give us a picture of what we're dealing with.
Why Benchmarking?
We want to know what our server can handle. We want to get an idea of the IO performance, CPU, and overall database throughput. Simple queries run on the server can give us a sense of queries per second, or transactions per second if we want to get more complicated.
Benchmarking Disk IO
On Linux systems, there is a very good tool for benchmarking disk IO: sysbench. Let's run through a simple example of installing sysbench and putting our server through its paces.
Installation
$ apt-get -y install sysbench
Test run
$ sysbench --test=fileio prepare
$ sysbench --test=fileio --file-test-mode=rndrw run
$ sysbench --test=fileio cleanup
Benchmarking CPU
Sysbench can also be used to test the CPU performance. It is simpler, as it doesn't need to set up files and so forth.
Test run
$ sysbench --test=cpu run
Benchmarking Database Throughput
With MySQL 5.1 distributions there is a tool included that can do very exhaustive database benchmarking. It's called mysqlslap.
$ mysqlslap -uroot -proot -h localhost --create-schema=sakila -i 5 -c 10 -q "select * from actor order by rand() limit 10"
Performance Tuning Tools
MySQL Monitoring Tools
Open Source Community Monitoring Tools
Benchmark Tools
Stress Tools
Performance Tuning Tools: MySQL Monitoring Tools
MySQL Enterprise Monitor
http://www.mysql.com/products/enterprise/monitor.html
MySQL Workbench
http://www.mysql.com/products/workbench/
Percona Toolkit for MySQL
http://www.percona.com/software/percona-toolkit
Performance Tuning Tools: Open Source Community Monitoring Tools
mysqladmin
mysqlreport
innotop http://sourceforge.net/projects/innotop/
OProfile http://oprofile.sourceforge.net/about/
sysbench http://sysbench.sf.net/
Percona Monitoring Plugins http://www.percona.com/software/percona-monitoring-plugins
mytop
Performance Tuning Tools: Benchmark Tools
MySQL Super Smack http://jeremy.zawodny.com/mysql/super-smack/
Database Test Suite http://sourceforge.net/projects/osdldbt/
Percona’s TPCC-MySQL Tool https://launchpad.net/perconatools
MySQL’s BENCHMARK() Function. MySQL has a handy BENCHMARK() function that you can use to test execution speeds for certain types of operations. You use it by specifying a number of times to execute and an expression to execute.
sysbench https://launchpad.net/sysbench
sysbench is a multithreaded system benchmarking tool. Its goal is to get a sense of system performance in terms of the factors important for running a database server.
Performance Tuning Tools: Stress Tools
Mysqltuner http://mysqltuner.pl/
Neotys
http://www.neotys.com/product/monitoring-mysql-web-load-testing.html
IOZone http://www.iozone.org/
Open Source Database Benchmark http://osdb.sourceforge.net/
Mysqlslap http://dev.mysql.com/doc/refman/5.5/en/mysqlslap.html
MySQL Server Tuning
Most of the tuning work should start from the core: the MySQL server itself. Here, "server" means the mysqld service running on a physical machine, returning visible results in response to queries and stored procedures, and making data available for any use, such as populating dynamic web pages.
MySQL is very different from other database servers, and its architectural characteristics make it useful for a wide range of purposes.
At the same time, MySQL can power embedded applications, data warehouses, content indexing and delivery software, highly available redundant systems, online transaction processing (OLTP), and much more.
MySQL Server Tuning: Major Components of the MySQL Server
A picture of how MySQL's components work together will help you understand the server. The original slide shows a logical view of MySQL's architecture as three layers. The topmost layer contains the services that aren't unique to MySQL. They're services most network-based client/server tools or servers need: connection handling, authentication, security, and so forth.
The second layer is where things get interesting. Much of MySQL’s brains are here, including the code for query parsing, analysis, optimization, caching, and all the built-in functions (e.g., dates, times, math, and encryption). Any functionality provided across storage engines lives at this level: stored procedures, triggers, and views.
The third layer contains the storage engines. They are responsible for storing and retrieving all data stored “in” MySQL. Like the various filesystems available for GNU/Linux, each storage engine has its own benefits and drawbacks. The server communicates with them through the storage engine API. This interface hides differences between storage engines and makes them largely transparent at the query layer. The API contains a couple of dozen low-level functions that perform operations such as “begin a transaction” or “fetch the row that has this primary key.” The storage engines don’t parse SQL or communicate with each other; they simply respond to requests from the server.
MySQL Server Tuning: MySQL Thread Handling
Each client connection gets its own thread within the server process. The connection's queries execute within that single thread, which in turn resides on one core or CPU. The server caches threads, so they don't need to be created and destroyed for each new connection. When clients (applications) connect to the MySQL server, the server needs to authenticate them. Authentication is based on username, originating host, and password. By default, connection manager threads associate each client connection with a thread dedicated to it that handles authentication and request processing for that connection. Manager threads create a new thread when necessary, but try to avoid doing so by first consulting the thread cache to see whether it contains a thread that can be used for the connection. When a connection ends, its thread is returned to the thread cache if the cache is not full.
MySQL Server Tuning: MySQL Memory Usage
The following list indicates some of the ways that the mysqld server uses memory.
All threads share the MyISAM key buffer; its size is determined by the key_buffer_size variable.
Each thread that is used to manage client connections uses some thread-specific space. The following list indicates these and which variables control their size:
stack (variable thread_stack)
connection buffer (variable net_buffer_length)
result buffer (variable net_buffer_length)
All threads share the same base memory
Each request that performs a sequential scan of a table allocates a read buffer (variable read_buffer_size).
All joins are executed in a single pass, and most joins can be done without even using a temporary table.
When a thread is no longer needed, the memory allocated to it is released and returned to the system unless the thread goes back into the thread cache.
Almost all parsing and calculating is done in thread-local and reusable memory pools. No memory overhead is needed for small items, so the normal slow memory allocation and freeing is avoided. Memory is allocated only for unexpectedly large strings.
A FLUSH TABLES statement or mysqladmin flush-tables command closes all tables that are not in use at once and marks all in-use tables to be closed when the currently executing thread finishes. This effectively frees most in-use memory. FLUSH TABLES does not return until all tables have been closed.
The server caches information in memory as a result of GRANT, CREATE USER, CREATE SERVER, and INSTALL PLUGIN statements. This memory is not released by the corresponding REVOKE, DROP USER, DROP SERVER, and UNINSTALL PLUGIN statements, so for a server that executes many instances of the statements that cause caching, there will be an increase in memory use. This cached memory can be freed with FLUSH PRIVILEGES.
68. MySQL Server Tuning: Simultaneous Connections in MySQL
One means of limiting use of MySQL server resources is to set the global max_user_connections system variable to a nonzero value.
This limits the number of simultaneous connections that can be made by any given account, but places no limits on what a client can do once connected.
In addition, setting max_user_connections does not enable management of individual accounts.
You can set max_connections at server startup or at runtime to control the maximum number of clients that can connect simultaneously.
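As a sketch, both limits can be set at runtime, and a per-account limit can also be attached to a specific user; the 'app'@'localhost' account here is hypothetical:

```sql
-- Cap total simultaneous clients and per-account connections:
SET GLOBAL max_connections = 500;
SET GLOBAL max_user_connections = 50;

-- Limit one account individually; this overrides the global value:
GRANT USAGE ON *.* TO 'app'@'localhost'
    WITH MAX_USER_CONNECTIONS 25;
```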
69. MySQL Server Tuning: Reusing Threads
MySQL is a single process with multiple threads. Not all databases are architected this way; some have multiple processes that communicate through shared memory or other means.
Creating a connection in MySQL is generally so fast that there is less need for connection pools than with other databases.
However, many development environments and programming languages expect a connection pool.
Many others use persistent connections by default, so a connection is not really closed when the application closes it.
There can be more than one solution to this problem, but the one that is actually (partially) implemented is a pool of threads.
The thread pool plugin is a commercial feature. It is not included in MySQL community distributions.
This tool provides an alternative thread-handling model designed to reduce overhead and improve performance. It implements a thread pool that increases server performance by efficiently managing statement execution threads for large numbers of client connections.
To control and monitor how the server manages threads that handle client connections, several system and status variables are relevant.
70. MySQL Server Tuning: Effects of Thread Caching
MySQL uses a separate thread for each client connection. In environments where applications do not attach to a database instance persistently, but rather create and close many connections every second, spawning new threads at a high rate can start consuming significant CPU resources. To alleviate this, MySQL implements a thread cache, which allows it to save threads from connections that are being closed and reuse them for new connections. The thread_cache_size parameter defines how many unused threads can be kept alive at any time.
The default value is 0 (no caching), which causes a thread to be set up for each new connection and disposed of when the connection terminates. Set thread_cache_size to N to enable N inactive connection threads to be cached. thread_cache_size can be set at server startup or changed while the server runs. A connection thread becomes inactive when the client connection with which it was associated terminates.
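A minimal tuning session might look like this; the value 16 is only an illustration, and the cache should be sized to your connection churn:

```sql
-- Allow up to 16 idle connection threads to be cached:
SET GLOBAL thread_cache_size = 16;

-- Compare thread creation with total connection attempts;
-- a high Threads_created/Connections ratio means too many cache misses:
SHOW GLOBAL STATUS WHERE Variable_name IN
    ('Connections', 'Threads_created');
```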
71. MySQL Server Tuning: Reusing Tables
MySQL is multi-threaded, so there may be many clients issuing queries for a given table simultaneously. To minimize the problem with multiple client sessions having different states on the same table, the table is opened independently by each concurrent session. This uses additional memory but normally increases performance.
When the table cache fills up, the server uses the following procedure to locate a cache entry to use:
Tables that are not currently in use are released, beginning with the table least recently used.
If a new table needs to be opened, but the cache is full and no tables can be released, the cache is temporarily extended as necessary. When the cache is in a temporarily extended state and a table goes from a used to unused state, the table is closed and released from the cache.
72. MySQL Server Tuning: Reusing Tables
You can determine whether your table cache is too small by checking the mysqld status variable Opened_tables, which indicates the number of table-opening operations since the server started:
mysql> SHOW GLOBAL STATUS LIKE 'Opened_tables';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| Opened_tables | 277 |
+---------------+-------+
73. MySQL Server Tuning: Setting table_open_cache
The table_open_cache and max_connections system variables affect the maximum number of files the server keeps open. If you increase one or both of these values, you may run up against a limit imposed by your operating system on the per-process number of open file descriptors. Many operating systems permit you to increase the open-files limit, although the method varies widely from system to system. Consult your operating system documentation to determine whether it is possible to increase the limit and how to do so.
table_open_cache is related to max_connections. For example, for 200 concurrently running connections, specify a table cache size of at least 200 * N, where N is the maximum number of tables per join in any of the queries that you execute. You must also reserve some extra file descriptors for temporary tables and files.
Make sure that your operating system can handle the number of open file descriptors implied by the table_open_cache setting. If table_open_cache is set too high, MySQL may run out of file descriptors and refuse connections, fail to perform queries, and be very unreliable.
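Applying the sizing rule above, a server with 200 connections whose largest join touches 4 tables needs at least 800 cache entries; the numbers here are illustrative:

```sql
-- 200 connections * 4 tables per join = 800, plus headroom:
SET GLOBAL table_open_cache = 1000;

-- If Opened_tables keeps climbing while Open_tables sits at the
-- cache limit, the cache is still too small:
SHOW GLOBAL STATUS LIKE 'Open%tables';
```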
74. MySQL Query Cache: MySQL Query Cache
The query cache stores the text of a SELECT statement together with the corresponding result that was sent to the client. If an identical statement is received later, the server retrieves the results from the query cache rather than parsing and executing the statement again. The query cache is shared among sessions, so a result set generated by one client can be sent in response to the same query issued by another client.
Before even parsing a query, MySQL checks for it in the query cache, if the cache is enabled. This operation is a case-sensitive hash lookup. If the query differs from a similar query in the cache by even a single byte, it won’t match and the query processing will go to the next stage.
The query cache can be useful in an environment where you have tables that do not change very often and for which the server receives many identical queries. This is a typical situation for many Web servers that generate many dynamic pages based on database content. For example, when an order form queries a table to display the lists of all US states or all countries in the world, those values can be retrieved from the query cache. Although the values would probably be retrieved from memory in any case (from the InnoDB buffer pool or MyISAM key cache), using the query cache avoids the overhead of processing the query, deciding whether to use a table scan, and locating the data block for each row.
The query cache always contains current and reliable data. Any insert, update, delete, or other modification to a table causes any relevant entries in the query cache to be flushed.
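A short session illustrates both the hit and the invalidation behavior; the country table used here is hypothetical:

```sql
-- Byte-identical SELECTs: the second one is answered from the cache.
SELECT Name FROM country WHERE Code = 'USA';
SELECT Name FROM country WHERE Code = 'USA';   -- cache hit

-- Any modification to the table flushes its cached results:
UPDATE country SET Population = Population + 1 WHERE Code = 'USA';
SELECT Name FROM country WHERE Code = 'USA';   -- miss; re-executed and re-cached
```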
75. MySQL Query Cache: When to Use the MySQL Query Cache
The query cache offers the potential for substantial performance improvement, but it is best suited to certain applications: typically simple applications deployed on a limited scale, or applications dealing with small data sets. The query cache comes in handy in a few particular situations:
Third-party applications: you cannot change how the application works with MySQL to add caching, but you can enable the query cache so it runs faster.
Low-load applications: if you are building an application that is not designed for extreme load, as with many personal applications, the query cache might be all you need, especially in a mostly read-only scenario.
76. MySQL Query Cache: When NOT to Use the MySQL Query Cache
As a first consideration, note that the query cache is disabled by default. Having the query cache on adds some overhead even if no queries are ever cached, so its benefits are always relative to that cost.
The cache is not used for queries of the following types:
Queries that are a subquery of an outer query
Queries executed within the body of a stored function, trigger, or event
Caching works on full queries only, so it does not work for subselects, inline views or parts of UNION.
Only SELECT queries are cached; SHOW commands and stored procedure calls are not, even if the stored procedure would simply perform a SELECT to retrieve data from a table.
It might not work with transactions: different transactions may see different states of the database, depending on the updates they have performed and even on the snapshot they are working with. Statements issued outside of a transaction have the best chance of being cached.
The amount of usable memory is limited: queries are constantly being invalidated from the query cache by table updates, so the number of queries in the cache and the memory used cannot grow forever, even if you have a very large number of different queries being run.
77. MySQL Query Cache: MySQL Query Cache Settings
The query cache system variables all have names that begin with query_cache_.
The have_query_cache server system variable indicates whether the query cache is available:
mysql> SHOW VARIABLES LIKE 'have_query_cache';
+------------------+-------+
| Variable_name | Value |
+------------------+-------+
| have_query_cache | YES |
+------------------+-------+
78. MySQL Query Cache: MySQL Query Cache Settings
query_alloc_block_size (defaults to 8192): the actual size of the memory blocks created for result sets in the query cache (don’t adjust)
query_cache_limit (defaults to 1048576): queries with result sets larger than this won’t make it into the query cache
query_cache_min_res_unit (defaults to 4096): the smallest size (in bytes) for blocks in the query cache (don’t adjust)
query_cache_size (defaults to 0): the total size of the query cache (disables query cache if equal to 0)
query_cache_type (defaults to 1): 0 means don’t cache, 1 means cache everything, 2 means only cache result sets on demand
query_cache_wlock_invalidate (defaults to FALSE): when FALSE, allows SELECTs to be served from the query cache even while the MyISAM table is locked for writing
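As a sketch, the main settings can be adjusted at runtime; note that on some server versions query_cache_type cannot be changed at runtime if the server was started with the cache disabled:

```sql
-- Enable a 64MB query cache, caching everything by default but
-- refusing individual result sets larger than 2MB:
SET GLOBAL query_cache_size  = 67108864;   -- 64 * 1024 * 1024
SET GLOBAL query_cache_type  = 1;
SET GLOBAL query_cache_limit = 2097152;    -- 2 * 1024 * 1024
```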
79. MySQL Query Cache: MySQL Query Cache Status Variables
mysql> SHOW STATUS LIKE 'Qcache%';
+-------------------------+----------+
| Variable_name | Value |
+-------------------------+----------+
| Qcache_free_blocks | 1 |
| Qcache_free_memory | 16759696 |
| Qcache_hits | 0 |
| Qcache_inserts | 0 |
| Qcache_lowmem_prunes | 0 |
| Qcache_not_cached | 164 |
| Qcache_queries_in_cache | 0 |
| Qcache_total_blocks | 1 |
+-------------------------+----------+
80. MySQL Query Cache: MySQL Query Cache Status Variables
Qcache_free_blocks: The number of free memory blocks in query cache.
Qcache_free_memory: The amount of free memory for query cache.
Qcache_hits: The number of cache hits.
Qcache_inserts: The number of queries added to the cache.
Qcache_lowmem_prunes: The number of queries that were deleted from the cache because of low memory.
Qcache_not_cached: The number of non-cached queries (not cachable, or due to query_cache_type).
Qcache_queries_in_cache: The number of queries registered in the cache.
Qcache_total_blocks: The total number of blocks in the query cache.
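These counters can be combined into a rough hit rate; Com_select counts the SELECTs that were actually executed (that is, cache misses and uncached queries):

```sql
SHOW GLOBAL STATUS WHERE Variable_name IN
    ('Qcache_hits', 'Com_select', 'Qcache_lowmem_prunes');
-- hit rate ~= Qcache_hits / (Qcache_hits + Com_select)
-- a steadily growing Qcache_lowmem_prunes suggests the cache is too small
```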
81. MySQL Query Cache: Improve Query Cache Results
To get optimized and speedy responses from your MySQL server, add the following two configuration directives:
query_cache_size=SIZE
The amount of memory (SIZE) allocated for caching query results. The default value is 0, which disables the query cache.
query_cache_type=OPTION
Set the query cache type. Possible options are as follows:
0 : Don’t cache results in or retrieve results from the query cache.
1 : Cache all query results except for those that begin with SELECT SQL_NO_CACHE.
2 : Cache results only for queries that begin with SELECT SQL_CACHE.
You can set them in the /etc/my.cnf (Red Hat) or /etc/mysql/my.cnf (Debian) file:
$ vi /etc/mysql/my.cnf
Append config directives as follows:
query_cache_size = 268435456
query_cache_type=1
query_cache_limit=1048576
82. InnoDB: InnoDB Storage Engine
InnoDB is a storage engine for MySQL. MySQL 5.5 and later use it by default, rather than MyISAM. It provides the standard ACID-compliant transaction features, along with foreign key support (Declarative Referential Integrity).
InnoDB tables fully support ACID compliance and transactions, and they perform very well. InnoDB supports foreign keys, commit, rollback, and roll-forward operations. An InnoDB table can be up to 64TB in size.
The InnoDB storage engine maintains its own buffer pool for caching data and indexes in main memory. When the innodb_file_per_table setting is enabled, each new InnoDB table and its associated indexes are stored in a separate file. When the innodb_file_per_table option is disabled, InnoDB stores all its tables and indexes in the single system tablespace, which may consist of several files (or raw disk partitions). InnoDB tables can handle large quantities of data, even on operating systems where file size is limited to 2GB.
ACID - Atomicity, Consistency, Isolation, Durability
83. InnoDB: InnoDB Storage Engine Uses
Transactions
If your application requires transactions, InnoDB is the most stable, well-integrated, proven choice. MyISAM is a good choice if a task doesn’t require transactions and issues primarily either SELECT or INSERT queries. Sometimes specific components of an application (such as logging) fall into this category.
Backups
The need to perform regular backups might also influence your choice. If your server can be shut down at regular intervals for backups, the storage engines are equally easy to deal with. However, if you need to perform online backups, you basically need InnoDB.
Crash recovery
If you have a lot of data, you should seriously consider how long it will take to recover from a crash. MyISAM tables become corrupt more easily and take much longer to recover than InnoDB tables. In fact, this is one of the most important reasons why a lot of people use InnoDB when they don’t need transactions.
84. InnoDB: Using the InnoDB Storage Engine
InnoDB is designed to handle transactional applications that require crash recovery, referential integrity, high levels of user concurrency and fast response times.
When to use InnoDB?
You are developing an application that requires ACID compliance. At the very least, your application demands the storage layer support the notion of transactions.
You require expedient crash recovery. Almost all production sites fall into this category; however, MyISAM table recovery times will obviously vary from one usage pattern to the next. To estimate an accurate figure for your environment, try running myisamchk over a many-gigabyte table from your application's backups on hardware similar to what you have in production. While recovery times of MyISAM tables increase as the table grows, InnoDB table recovery times remain largely constant throughout the life of the table.
Your web site or application is mostly multi-user. The database has to deal with frequent UPDATEs to a single table, and you would like to make better use of your multi-processing hardware.
85. InnoDB: InnoDB Log Files and Buffers
InnoDB is a general-purpose storage engine that balances high reliability and high performance. It is a transactional storage engine and is fully ACID compliant, as would be expected from any relational database. The durability guarantee provided by InnoDB is made possible by the redo logs.
By default, InnoDB creates two redo log files (or just log files) ib_logfile0 and ib_logfile1 within the data directory of MySQL.
The redo log files are used in a circular fashion. Redo records are written from the beginning to the end of the first redo log file; writing then continues in the next log file, and so on until the last redo log file is filled. At that point, writing wraps around and redo records are again written from the start of the first redo log file.
The log files are viewed as a sequence of blocks called "log blocks" whose size is given by OS_FILE_LOG_BLOCK_SIZE which is equal to 512 bytes. Each log file has a header whose size is given by LOG_FILE_HDR_SIZE, which is defined as 4*OS_FILE_LOG_BLOCK_SIZE.
86. InnoDB: InnoDB Log Files and Buffers
The global log system object log_sys holds important information related to log subsystem of InnoDB.
This object points to various positions in the in-memory redo log buffer and on-disk redo log files.
The accompanying slide diagram shows the locations pointed to by the global log_sys object, and makes clear that the redo log buffer maps to a specific portion of the redo log file.
87. InnoDB: Committing Transactions
By default, MySQL starts the session for each new connection with autocommit mode enabled, so MySQL does a commit after each SQL statement if that statement did not return an error. If a statement returns an error, the commit or rollback behavior depends on the error.
If a session that has autocommit disabled ends without explicitly committing the final transaction, MySQL rolls back that transaction.
Some statements implicitly end a transaction, as if you had done a COMMIT before executing the statement.
To optimize InnoDB transaction processing, find the ideal balance between the performance overhead of transactional features and the workload of your server.
The default MySQL setting AUTOCOMMIT=1 can impose performance limitations on a busy database server. Where practical, wrap several related DML operations into a single transaction, by issuing SET AUTOCOMMIT=0 or a START TRANSACTION statement, followed by a COMMIT statement after making all the changes.
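A sketch of the pattern, using hypothetical orders and customers tables:

```sql
-- Batch related DML into one transaction instead of paying
-- a log flush for every statement:
SET autocommit = 0;
START TRANSACTION;
INSERT INTO orders (customer_id, total) VALUES (42, 19.90);
UPDATE customers SET order_count = order_count + 1 WHERE id = 42;
COMMIT;
```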
88. InnoDB: Committing Transactions
Avoid performing rollbacks after inserting, updating, or deleting huge numbers of rows. If a big transaction is slowing down server performance, rolling it back can make the problem worse, potentially taking several times as long to perform as the original DML operations. Killing the database process does not help, because the rollback starts again on server startup.
When rows are modified or deleted, the rows and associated undo logs are not physically removed immediately, or even immediately after the transaction commits. The old data is preserved until transactions that started earlier or concurrently are finished, so that those transactions can access the previous state of modified or deleted rows. Thus, a long-running transaction can prevent InnoDB from purging data that was changed by a different transaction.
89. InnoDB: InnoDB Table Design
Use short PRIMARY KEY
Primary key is part of all other indexes on table
Consider an artificial AUTO_INCREMENT PRIMARY KEY, with a UNIQUE index on the original primary key columns
INT keys are faster than VARCHAR/CHAR
PRIMARY KEY is most efficient for lookups
Reference tables by PRIMARY KEY when possible
Do not update PRIMARY KEY
This will require all other keys to be modified for row
This often requires row relocation to other page
Cluster your accesses by PRIMARY KEY
Inserts in PRIMARY KEY order are much faster.
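The guidelines above combine naturally into a table definition like the following (the schema is illustrative): a short AUTO_INCREMENT surrogate becomes the clustered PRIMARY KEY, while the natural key keeps a UNIQUE index:

```sql
CREATE TABLE customer (
    id    INT UNSIGNED NOT NULL AUTO_INCREMENT,
    email VARCHAR(255) NOT NULL,
    name  VARCHAR(100) NOT NULL,
    PRIMARY KEY (id),           -- short clustered key, copied into every secondary index
    UNIQUE KEY uk_email (email) -- original natural key, still enforced
) ENGINE=InnoDB;
```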
90. InnoDB: InnoDB Table Design
InnoDB creates each table and associated primary key index either in the system tablespace, or in a separate tablespace (represented by a .ibd file).
Always set up a primary key for each InnoDB table, specifying the column or columns that:
Are referenced by the most important queries.
Are never left blank.
Never have duplicate values.
Rarely if ever change value once inserted.
Although the table works correctly without you defining a primary key, the primary key is involved with many aspects of performance and is a crucial design aspect for any large or frequently used table.
InnoDB provides an optimization that significantly improves scalability and performance of SQL statements that insert rows into tables with AUTO_INCREMENT columns.
91. InnoDB: InnoDB Table Design
Limits on InnoDB Tables
A table can contain a maximum of 1000 columns.
A table can contain a maximum of 64 secondary indexes.
By default, an index key for a single-column index can be up to 767 bytes.
The InnoDB internal maximum key length is 3500 bytes, but MySQL itself restricts this to 3072 bytes.
The maximum row length is slightly less than half of a database page. The default database page size in InnoDB is 16KB.
Although InnoDB supports row sizes larger than 65,535 bytes internally, MySQL itself imposes a row-size limit of 65,535 for the combined size of all columns.
92. InnoDB: SHOW ENGINE INNODB STATUS
The InnoDB storage engine exposes a lot of information about its internals in the output of SHOW ENGINE INNODB STATUS. Unlike most of the SHOW commands, its output consists of a single string, not rows and columns.
HEADER
The first section is the header, which simply announces the beginning of the output, the current date and time, and how long it has been since the last printout.
SEMAPHORES
If you have a high-concurrency workload, you might want to pay attention to the next section, SEMAPHORES . It contains two kinds of data: event counters and, optionally, a list of current waits. If you’re having trouble with bottlenecks, you can use this information to help you find the bottlenecks.
LATEST FOREIGN KEY ERROR
This section, LATEST FOREIGN KEY ERROR, doesn't appear unless your server has had a foreign key error. Sometimes the problem has to do with a transaction and the parent or child rows it was looking for while trying to insert, update, or delete a record.
LATEST DETECTED DEADLOCK
Like the foreign key section, the LATEST DETECTED DEADLOCK section appears only if your server has had a deadlock. The deadlock error messages are also overwritten every time there's a new error, and the pt-deadlock-logger tool from Percona Toolkit can help you save these for later analysis. A deadlock is a cycle in the waits-for graph, which is a data structure of row locks held and waited for. The cycle can be arbitrarily large.
93. InnoDB: SHOW ENGINE INNODB STATUS
FILE I/O
The FILE I/O section shows the state of the I/O helper threads, along with performance counters.
INSERT BUFFER AND ADAPTIVE HASH INDEX
This section shows the status of these two structures inside InnoDB.
LOG
This section shows statistics about InnoDB’s transaction log (redo log) subsystem.
BUFFER POOL AND MEMORY
This section shows statistics about InnoDB’s buffer pool and how it uses memory.
ROW OPERATIONS
This section shows miscellaneous InnoDB statistics.
94. InnoDB: InnoDB Monitors and Settings
InnoDB monitors provide information about the InnoDB internal state. This information is useful for performance tuning. There are four types of InnoDB monitors:
The standard InnoDB Monitor displays the following types of information:
Table and record locks held by each active transaction.
Lock waits of a transaction.
Semaphore waits of threads.
Pending file I/O requests.
Buffer pool statistics.
Purge and insert buffer merge activity of the main InnoDB thread.
The InnoDB Lock Monitor is like the standard InnoDB Monitor but also provides extensive lock information.
The InnoDB Tablespace Monitor prints a list of file segments in the shared tablespace and validates the tablespace allocation data structures.
The InnoDB Table Monitor prints the contents of the InnoDB internal data dictionary.
95. InnoDB: InnoDB Monitors and Settings
When switched on, InnoDB monitors print data about every 15 seconds. Server output usually is directed to the error log. This data is useful in performance tuning. InnoDB sends diagnostic output to stderr or to files rather than to stdout or fixed-size memory buffers, to avoid potential buffer overflows.
The output of SHOW ENGINE INNODB STATUS is written to a status file in the MySQL data directory every fifteen seconds. The name of the file is innodb_status.pid, where pid is the server process ID. InnoDB removes the file for a normal shutdown.
96. InnoDB: InnoDB Monitors and Settings
Enabling the Standard InnoDB Monitor
To enable the standard InnoDB Monitor for periodic output, create the innodb_monitor table:
CREATE TABLE innodb_monitor (a INT) ENGINE=INNODB;
To disable the standard InnoDB Monitor, drop the table:
DROP TABLE innodb_monitor;
Enabling the InnoDB Lock Monitor
To enable the InnoDB Lock Monitor for periodic output, create the innodb_lock_monitor table:
CREATE TABLE innodb_lock_monitor (a INT) ENGINE=INNODB;
To disable the InnoDB Lock Monitor, drop the table:
DROP TABLE innodb_lock_monitor;
97. InnoDB: InnoDB Monitors and Settings
Enabling the InnoDB Tablespace Monitor
To enable the InnoDB Tablespace Monitor for periodic output, create the innodb_tablespace_monitor table:
CREATE TABLE innodb_tablespace_monitor (a INT) ENGINE=INNODB;
To disable the InnoDB Tablespace Monitor, drop the table:
DROP TABLE innodb_tablespace_monitor;
Enabling the InnoDB Table Monitor
To enable the InnoDB Table Monitor for periodic output, create the innodb_table_monitor table:
CREATE TABLE innodb_table_monitor (a INT) ENGINE=INNODB;
To disable the InnoDB Table Monitor, drop the table:
DROP TABLE innodb_table_monitor;
98. InnoDB: InnoDB Monitors and Settings
To fine tune InnoDB working parameters, first check their values.
mysql> show variables like 'innodb_buffer%';
+------------------------------+-----------+
| Variable_name | Value |
+------------------------------+-----------+
| innodb_buffer_pool_instances | 1 |
| innodb_buffer_pool_size | 134217728 |
+------------------------------+-----------+
mysql> show variables like 'innodb_log%';
+---------------------------+---------+
| Variable_name | Value |
+---------------------------+---------+
| innodb_log_buffer_size | 8388608 |
| innodb_log_file_size | 5242880 |
| innodb_log_files_in_group | 2 |
| innodb_log_group_home_dir | ./ |
+---------------------------+---------+
99. InnoDB: InnoDB Monitors and Settings
To make the modification persistent, edit the “my.cnf” configuration file.
$ vi /etc/mysql/my.cnf
Add the following lines with values as needed:
# innodb
innodb_buffer_pool_size = 128M
innodb_log_file_size = 32M
100. MyISAM: MyISAM Storage Engine Uses
MyISAM is a storage engine for MySQL, and it was the default prior to MySQL version 5.5 (released in December 2010). It is based on ISAM (Indexed Sequential Access Method), an indexing method developed by IBM that allows fast retrieval of information from large sets of data.
Read-only tables. If your applications use tables that are never or rarely modified, you can safely change their storage engine to MyISAM.
Replication configuration. Replication enables you to keep several databases synchronized automatically. Unlike clustering, in which all nodes are self-sufficient, replication assigns different roles to different servers. In particular, you can use an InnoDB-based master database for writing and processing data, and a MyISAM-based slave database for reading.
Backup. The most effective approach to MySQL backup is a combination of Master-to-Slave replication and backup of Slave Servers.
101. MyISAM: MyISAM Table Design
MyISAM is no longer the default storage engine. All new tables will be created with the InnoDB storage engine if you do not specify a storage engine name. But if you want to create a new table with the MyISAM storage engine explicitly, you can specify "ENGINE = MYISAM" at the end of the "CREATE TABLE" statement.
MyISAM supports three different storage formats. The fixed and dynamic format are chosen automatically depending on the type of columns you are using. The compressed format can be created only with the myisampack utility.
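A sketch of creating the compressed format; the paths are illustrative, and this should be run only while the server is stopped or the table is flushed and locked:

```shell
# Compress a read-only MyISAM table, then rebuild its indexes:
myisampack /var/lib/mysql/test/archive.MYI
myisamchk -rq /var/lib/mysql/test/archive.MYI
```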
102. MyISAM: MyISAM Table Design
Static-format tables have these characteristics:
CHAR and VARCHAR columns are space-padded to the specified column width, although the column type is not altered. BINARY and VARBINARY columns are padded with 0x00 bytes to the column width.
Very quick.
Easy to cache.
Easy to reconstruct after a crash, because rows are located in fixed positions.
Reorganization is unnecessary unless you delete a huge number of rows and want to return free disk space to the operating system. To do this, use OPTIMIZE TABLE or myisamchk -r.
Usually require more disk space than dynamic-format tables.
103. MyISAM: MyISAM Table Design
Dynamic-format tables have these characteristics:
All string columns are dynamic except those with a length less than four.
Each row is preceded by a bitmap that indicates which columns contain the empty string (for string columns) or zero (for numeric columns). Note that this does not include columns that contain NULL values. If a string column has a length of zero after trailing space removal, or a numeric column has a value of zero, it is marked in the bitmap and not saved to disk. Nonempty strings are saved as a length byte plus the string contents.
Much less disk space usually is required than for fixed-length tables.
Each row uses only as much space as is required. However, if a row becomes larger, it is split into as many pieces as are required, resulting in row fragmentation. For example, if you update a row with information that extends the row length, the row becomes fragmented. In this case, you may have to run OPTIMIZE TABLE or myisamchk -r from time to time to improve performance. Use myisamchk -ei to obtain table statistics.
More difficult than static-format tables to reconstruct after a crash, because rows may be fragmented into many pieces and links (fragments) may be missing.
104. MyISAM: MyISAM Table Design
Compressed tables have the following characteristics:
Compressed tables take very little disk space. This minimizes disk usage, which is helpful when using slow disks (such as CD-ROMs).
Each row is compressed separately, so there is very little access overhead. The header for a row takes up one to three bytes depending on the biggest row in the table. Each column is compressed differently. There is usually a different Huffman tree for each column. Some of the compression types are:
Suffix space compression.
Prefix space compression.
Numbers with a value of zero are stored using one bit.
If values in an integer column have a small range, the column is stored using the smallest possible type. For example, a BIGINT column (eight bytes) can be stored as a TINYINT column (one byte) if all its values are in the range from -128 to 127.
If a column has only a small set of possible values, the data type is converted to ENUM.
A column may use any combination of the preceding compression types.
105. MyISAM: Optimizing MyISAM
The MyISAM storage engine performs best with read-mostly data or with low-concurrency operations, because table locks limit the ability to perform simultaneous updates.
Some general tips for speeding up queries on MyISAM tables:
To help MySQL better optimize queries, use ANALYZE TABLE or run myisamchk --analyze on a table after it has been loaded with data. This updates a value for each index part that indicates the average number of rows that have the same value.
Try to avoid complex SELECT queries on MyISAM tables that are updated frequently, to avoid problems with table locking that occur due to contention between readers and writers.
For MyISAM tables that change frequently, try to avoid all variable-length columns (VARCHAR, BLOB, and TEXT).
Use INSERT DELAYED when you do not need to know when your data is written. This reduces the overall insertion impact because many rows can be written with a single disk write.
Use OPTIMIZE TABLE periodically to avoid fragmentation with dynamic-format MyISAM tables.
You can increase performance by caching queries or answers in your application and then executing many inserts or updates together. Locking the table during this operation ensures that the index cache is only flushed once after all updates.
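The batching tip above can be sketched in SQL. The table and column names here are hypothetical; the point is that the lock is held across all inserts so the index cache is flushed only once:

```sql
-- Hypothetical example: batch many inserts under one table lock
-- so the MyISAM key cache is flushed only once, at UNLOCK TABLES.
LOCK TABLES log_entries WRITE;
INSERT INTO log_entries (logged_at, message) VALUES (NOW(), 'first');
INSERT INTO log_entries (logged_at, message) VALUES (NOW(), 'second');
INSERT INTO log_entries (logged_at, message) VALUES (NOW(), 'third');
UNLOCK TABLES;
```

A multi-row INSERT (one statement with many VALUES lists) achieves a similar effect without explicit locking.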
106
MyISAM MyISAM Table Locks
To keep locking fast, MySQL uses table locking for several of its storage engines, including MyISAM.
A table lock is exactly what it sounds like: it locks the entire table.
When a client has to write to a table (insert, delete, update, etc.), it acquires a write lock. This keeps all other read and write operations pending.
When nobody is writing, readers can obtain read locks, which don’t conflict with other read locks.
107
MyISAM MyISAM Table Locks
Considerations for Table Locking
Table locking in MySQL is deadlock-free for storage engines that use table-level locking. Deadlock avoidance is managed by always requesting all needed locks at once at the beginning of a query and always locking the tables in the same order.
MySQL grants table write locks as follows:
If there are no locks on the table, put a write lock on it.
Otherwise, put the lock request in the write lock queue.
MySQL grants table read locks as follows:
If there are no write locks on the table, put a read lock on it.
Otherwise, put the lock request in the read lock queue.
The MyISAM storage engine supports concurrent inserts to reduce contention between readers and writers for a given table: If a MyISAM table has no free blocks in the middle of the data file, rows are always inserted at the end of the data file. In this case, you can freely mix concurrent INSERT and SELECT statements for a MyISAM table without locks.
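Concurrent-insert behavior is controlled by the concurrent_insert system variable; a minimal sketch of inspecting and changing it:

```sql
-- 0: concurrent inserts disabled
-- 1 (default): concurrent inserts allowed when the data file has no holes
-- 2: concurrent inserts allowed even for tables with holes
--    (new rows are appended at the end of the data file)
SHOW VARIABLES LIKE 'concurrent_insert';
SET GLOBAL concurrent_insert = 2;
```

Setting the value to 2 trades some space reuse for better read/write concurrency on busy MyISAM tables.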
108
MyISAM MyISAM Settings
MyISAM offers table-level locking: when data is being written to a table, the whole table is locked, and any other writes that must be performed at the same time on the same table have to wait until the first one has finished.
The problems of table-level locking are noticeable mainly on very busy servers. For the typical website scenario, MyISAM usually offers good performance at a lower server cost.
If the load on the MySQL server is very high and the server is not swapping, before upgrading to a more expensive server with more processing power, you may want to try altering its tables to use the MyISAM engine instead of InnoDB and see what happens.
In the end, which engine you should use will depend on the particular scenario of the server.
If you decide to use only MyISAM tables, you should add the following configuration lines to your my.cnf file:
default-storage-engine=MyISAM
default-tmp-storage-engine=MyISAM
If you only have MyISAM tables, you can disable the InnoDB engine, which will save you RAM, by adding the following line to your my.cnf file:
skip-innodb
Note, however, that if you don't add the two lines presented above to your my.cnf file, the skip-innodb setting will prevent your MySQL server from starting, since current versions of the MySQL server use InnoDB as the default storage engine.
109
MyISAM MyISAM Key Cache
To minimize disk I/O, the MyISAM storage engine exploits a strategy that is used by many database management systems. It employs a cache mechanism to keep the most frequently accessed table blocks in memory:
For index blocks, a special structure called the key cache (or key buffer) is maintained. The structure contains a number of block buffers where the most-used index blocks are placed.
For data blocks, MySQL uses no special cache. Instead it relies on the native operating system file system cache.
The MyISAM key caches are also referred to as key buffers; there is one by default, but you can create more. MyISAM caches only indexes, not data (it lets the operating system cache the data). If you use mostly MyISAM, you should allocate a lot of memory to the key caches.
110
MyISAM MyISAM Key Cache
To control the size of the key cache, use the key_buffer_size system variable. If this variable is set equal to zero, no key cache is used. The key cache also is not used if the key_buffer_size value is too small to allocate the minimal number of block buffers.
Key caches should be no bigger than the total index size, and no more than 25% to 50% of the amount of memory you reserved for operating system caches.
By default, MyISAM caches all indexes in the default key buffer, but you can create multiple named key buffers. This lets you keep more than 4 GB of indexes in memory at once. To create key buffers named key_buffer_1 and key_buffer_2, each sized at 1 GB, place the following in the my.cnf configuration file:
key_buffer_1.key_buffer_size = 1G
key_buffer_2.key_buffer_size = 1G
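Once the named key buffers exist, tables are assigned to them, and optionally preloaded, with the CACHE INDEX and LOAD INDEX INTO CACHE statements. The table names below are hypothetical:

```sql
-- Assign the indexes of two hypothetical MyISAM tables to the named caches
CACHE INDEX orders IN key_buffer_1;
CACHE INDEX customers IN key_buffer_2;

-- Optionally preload the indexes into their assigned caches
-- so the first queries do not pay the disk-read cost
LOAD INDEX INTO CACHE orders;
LOAD INDEX INTO CACHE customers;
```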
111
MyISAM MyISAM Full-Text Search
MySQL has support for full-text indexing and searching:
A full-text index in MySQL is an index of type FULLTEXT.
Full-text indexes can be used only with MyISAM tables. Full-text indexes can be created only for CHAR, VARCHAR, or TEXT columns.
A FULLTEXT index definition can be given in the CREATE TABLE statement when a table is created, or added later using ALTER TABLE or CREATE INDEX.
For large data sets, it is much faster to load your data into a table that has no FULLTEXT index and then create the index after that, than to load data into a table that has an existing FULLTEXT index.
Full-text searching is performed using MATCH() ... AGAINST syntax. MATCH() takes a comma-separated list that names the columns to be searched. AGAINST takes a string to search for, and an optional modifier that indicates what type of search to perform. The search string must be a string value that is constant during query evaluation.
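For example, against a hypothetical posts table with a FULLTEXT index on post_content:

```sql
-- Natural-language search (the default mode)
SELECT id, title
FROM posts
WHERE MATCH(post_content) AGAINST('performance tuning');

-- Boolean mode: require the word "mysql", exclude "oracle"
SELECT id, title
FROM posts
WHERE MATCH(post_content) AGAINST('+mysql -oracle' IN BOOLEAN MODE);
```

In natural-language mode, MATCH() also returns a relevance score you can select or sort by.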
112
MyISAM MyISAM Full-Text Search
Before you can perform a full-text search on a column of a table, you must index its data, and re-index it whenever the data in the column changes. In MySQL, the full-text index is a kind of index named FULLTEXT.
You can define the FULLTEXT index in a variety of ways:
Typically, you define the FULLTEXT index for a column when you create a new table by using the CREATE TABLE statement.
CREATE TABLE posts (
id int(4) NOT NULL AUTO_INCREMENT,
title varchar(255) NOT NULL,
post_content text,
PRIMARY KEY (id),
FULLTEXT KEY post_content (post_content)
) ENGINE=MyISAM;
If you already have existing tables and want to define full-text indexes on them, you can use the ALTER TABLE statement or the CREATE INDEX statement.
This is the syntax for defining a FULLTEXT index using the ALTER TABLE statement:
ALTER TABLE table_name ADD FULLTEXT(column_name1, column_name2, ...)
You can also use the CREATE INDEX statement to create a FULLTEXT index on an existing table:
CREATE FULLTEXT INDEX index_name ON table_name(idx_column_name, ...)
113
MyISAM MyISAM Full-Text Search
SPHINX
Sphinx http://www.sphinxsearch.com is a free, open source, full-text search engine, designed from the ground up to integrate well with databases. It has DBMS-like features, is very fast, supports distributed searching, and scales well. It is also designed for efficient memory and disk I/O, which is important because they’re often the limiting factors for large operations.
Sphinx works well with MySQL. It can be used to accelerate a variety of queries, including full-text searches; you can also use it to perform fast grouping and sorting operations, among other applications.
114
MyISAM MyISAM Full-Text Search
SPHINX
Sphinx can complement a MySQL-based application in many ways, increasing performance where MySQL is not a good solution and adding functionality MySQL can’t provide.
Typical usage scenarios include:
Fast, efficient, scalable, relevant full-text searches
Optimizing WHERE conditions on low-selectivity indexes or columns without indexes
Optimizing ORDER BY ... LIMIT N queries and GROUP BY queries
Generating result sets in parallel
Scaling up and scaling out
Aggregating partitioned data
115
Other MySQL Storage Engines and Issues
Large Objects
Even though MySQL is used to power a lot of web sites and applications that handle large binary objects (BLOBs) such as images, videos, or audio files, these objects are usually not stored directly in MySQL tables today. The reasons are that the MySQL client/server protocol places restrictions on the size of objects that can be returned, and that overall performance is often not acceptable, as current MySQL storage engines have not really been optimized to handle large numbers of BLOBs.
In MySQL the maximum size of a given BLOB can be up to 4 GB (LONGBLOB). MySQL doesn't offer any other parameter directly impacting BLOB performance.
116
Other MySQL Storage Engines and Issues
Large Objects
BLOBs create big rows in memory, and sequential scans become expensive. The database can become too big to handle, and then it won't scale well. In addition, BLOBs slow down replication, because BLOB data must be written to the binary log.
On the other hand, when BLOBs are stored in the database, BLOB operations are transactional, references to the data stay valid, and replication of the data becomes possible.
One solution is the Scalable BLOB Streaming project for MySQL, which includes the "PrimeBase XT Storage Engine for MySQL" (PBXT) and the "PrimeBase Media Streaming" engine (PBMS).
117
Other MySQL Storage Engines and Issues
MEMORY Storage Engine Uses
The MEMORY storage engine creates special-purpose tables with contents that are stored in memory. Because the data is vulnerable to crashes, hardware issues, or power outages, only use these tables as temporary work areas or read-only caches for data pulled from other tables.
A typical use case for the MEMORY engine involves these characteristics:
Operations involving transient, non-critical data such as session management or caching. When the MySQL server halts or restarts, the data in MEMORY tables is lost.
In-memory storage for fast access and low latency. Data volume can fit entirely in memory without causing the operating system to swap out virtual memory pages.
A read-only or read-mostly data access pattern (limited updates).
Basically, it’s an engine that’s really only useful for a single connection in limited use cases.
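A minimal sketch of such a use case, assuming a hypothetical sessions table:

```sql
-- Session cache in memory; all data is lost on server restart,
-- so only transient, reconstructible data belongs here.
CREATE TABLE sessions (
    session_id CHAR(32) NOT NULL,
    user_id    INT UNSIGNED NOT NULL,
    last_seen  TIMESTAMP NOT NULL,
    PRIMARY KEY (session_id) USING HASH  -- HASH is the MEMORY default
) ENGINE=MEMORY;
```

The hash primary key gives O(1) lookups by session_id, which matches the access pattern described above.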
118
Other MySQL Storage Engines and Issues
MEMORY Storage Engine Performance
People often want to use the MySQL MEMORY engine to store web sessions or other similarly volatile data.
There are good reasons for that; here are the main ones:
Data is volatile; it is not the end of the world if it is lost
Elements are accessed by primary key, so hash indexes are a good fit
Session tables are accessed heavily (reads and writes), so using MEMORY tables saves disk I/O
Unfortunately, the MEMORY engine also has some limitations that can prevent its use at large scale:
Bound by the memory of one server
Variable-length data types like VARCHAR are expanded to their full length
Bound to the CPU processing of one server
The MEMORY engine only supports table-level locking, limiting concurrency
These limitations can be hit fairly quickly, especially if the session payload data is large.
However, MEMORY performance is constrained by contention resulting from single-thread execution and table lock overhead when processing updates.
MySQL Cluster offers the same features as the MEMORY engine with higher performance levels.
119
Other MySQL Storage Engines and Issues Multiple Storage Engine Advantages
MySQL supports several storage engines that act as handlers for different table types. MySQL storage engines include both those that handle transaction-safe tables and those that handle non-transaction-safe tables.
Transaction-safe tables (TSTs) have several advantages over non-transaction-safe tables (NTSTs):
Safer. Even if MySQL crashes or you get hardware problems, you can get your data back, either by automatic recovery or from a backup plus the transaction log.
You can combine many statements and commit them all at the same time with the COMMIT statement (if autocommit is disabled).
You can execute ROLLBACK to ignore your changes (if autocommit is disabled).
If an update fails, all your changes will be restored. (With non-transaction-safe tables, all changes that have taken place are permanent.)
Transaction-safe storage engines can provide better concurrency for tables that get many updates concurrently with reads.
Non-transaction-safe tables have several advantages of their own, all of which occur because there is no transaction overhead:
Much faster
Lower disk space requirements
Less memory required to perform updates
You can combine transaction-safe and non-transaction-safe tables in the same statements to get the best of both worlds.
120
Other MySQL Storage Engines and Issues
Single Storage Engine Advantages
One of the strengths of MySQL is its support for multiple storage engines, and at first glance it is indeed great to give users the same top-level SQL interface while letting them store their data in many different ways. As nice as it sounds in theory, this flexibility comes at a significant cost in performance and in operational and development complexity.
In practice, for probably 95% of applications a single storage engine would be good enough. People already tend not to mix multiple storage engines very actively because of the potential complications involved.
Now consider what we could have with a version of the MySQL server that drops everything but the InnoDB (or any other single) storage engine: we could save a lot of CPU cycles by making the storage format the same as the processing format. We could tune the optimizer to handle InnoDB's specifics well. We could get rid of SQL-level table locks, and use InnoDB's internal data dictionary instead of .frm files. We could use the InnoDB transactional log for replication. Finally, backups could be done safely.
A single-storage-engine server would also be a lot easier to test and operate.
This would not mean giving up flexibility completely; for example, one can imagine InnoDB tables that do not log changes and are hence faster for update operations. One could also lock them in memory to ensure predictable in-memory performance.
121
Schema Design and Performance
Schema Design Considerations
Good logical and physical design is the cornerstone of high performance, and you must design your schema for the specific queries you will run. This often involves trade-offs. For example, adding counter and summary tables is a great way to optimize queries, but they can be expensive to maintain. MySQL's particular features and implementation details influence this quite a bit. Most optimization tricks for MySQL focus on query performance or server tuning, but optimization starts with the design of the database schema. If you neglect to optimize the base of your database (its structure), you will pay the price for that laxity throughout your work with the database. Every storage engine has its own advantages and disadvantages, but regardless of the engine you choose, you should consider some items in your database schema.
As a quick rule of thumb, consider these initial few steps:
Do not index columns that you do not need in a SELECT
Use clever refactoring to accommodate changes to the current schema
Choose the minimal character set that fits the actual needs
Use triggers only when needed
122
Schema Design and Performance
Normalization and Performance
In a normalized database, each fact is represented once and only once. Conversely, in a denormalized database, information is duplicated, or stored in multiple places.
Database normalization is a process by which an existing schema is modified to bring its component tables into compliance with a series of progressive normal forms.
The goal of database normalization is to ensure that every non-key column in every table is directly dependent on the key, the whole key, and nothing but the key; with this goal come benefits in the form of reduced redundancies, fewer anomalies, and improved efficiency. While normalization is not the be-all and end-all of good design, a normalized schema provides a good starting point for further development.
123
Schema Design and Performance
Normalization and Performance
To see why normalization is usually the preferred approach, even in terms of performance, consider the drawbacks of splitting one logical table into many (for example, one table per customer):
You cannot write generic queries or views to access the data. Essentially, all queries in the code need to be dynamic, so you can substitute the right table name.
Maintaining the data becomes cumbersome. Instead of updating a single table, you have to update multiple tables.
Performance is a mixed bag. Although you might save the overhead of storing the customer id in each table, you incur another cost. Having lots of smaller tables means lots of tables with partially filled pages. Depending on the number of jobs per customer and number of overall customers, you might actually be multiplying the amount of space used. In the worst case of one job per customer where a page contains -- say -- 100 jobs, you would be multiplying the required space by about 100.
The last point also applies to the page cache in memory. So, data in one table that would fit into memory might not fit into memory when split among many tables.
Through the process of database normalization it's possible to bring the schema's tables into conformance with progressive normal forms. As a result the tables each represent a single entity (a book, an author, a subject, etc) and we benefit from decreased redundancy, fewer anomalies and improved efficiency.
124
Schema Design and Performance
Schema Design
The major schema design principle states you should use one table per object of interest. That means one table for users, one table for pages, one table for posts, etc. Use a normalized database for transactional data.
Although there are universally bad and good design principles, there are also issues that arise from how MySQL is implemented.
Too many columns. MySQL's storage engines interact with the server by copying rows through a row buffer. Extremely wide tables (hundreds of columns) can cause high CPU consumption even when only a few columns are actually used, and this costs the server in performance.
Too many joins. MySQL has a limitation of 61 tables per join. It’s better to have a dozen or fewer tables per query if you need queries to execute very fast with high concurrency.
ENUM. The enumerated value type can be a problem in database design. It is often preferable to use an INT foreign key into a lookup table for quick lookups.
SET. An ENUM permits the column to hold one value from a set of defined values. A SET permits the column to hold one or more values from a set of defined values: this may lead to confusion.
NULL. It's a good practice to avoid NULL when possible, but note that MySQL does index NULLs, unlike some databases (such as Oracle) that don't include non-values in indexes.
125
Schema Design and Performance
Data Types
MySQL supports a large variety of data types, and choosing the correct type to store your data is crucial to getting good performance.
Whole Numbers There are two kinds of numbers: whole numbers and real numbers (numbers with a fractional part). If you’re storing whole numbers, use one of the integer types: TINYINT, SMALLINT, MEDIUMINT, INT or BIGINT.
Real Numbers Real numbers are numbers that have a fractional part. However, they aren’t just for fractional numbers; you can also use DECIMAL to store integers that are so large they don’t fit in BIGINT. The FLOAT and DOUBLE types support approximate calculations with standard floating-point math.
String Types MySQL supports quite a few string data types, with many variations on each.
VARCHAR stores variable-length character strings and is the most common string data type.
CHAR is fixed-length: MySQL always allocates enough space for the specified number of characters.
BLOB and TEXT are string data types designed to store large amounts of data as either binary or character strings, respectively.
Using ENUM instead of a string type Sometimes you can use an ENUM column instead of conventional string types. An ENUM column can store a predefined set of distinct string values.
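A minimal sketch of an ENUM column, using a hypothetical table:

```sql
-- The status column accepts only the three listed string values;
-- internally each value is stored as a small integer index,
-- which is more compact than storing the string itself.
CREATE TABLE tickets (
    id     INT UNSIGNED NOT NULL AUTO_INCREMENT,
    status ENUM('open', 'in_progress', 'closed') NOT NULL DEFAULT 'open',
    PRIMARY KEY (id)
);
```

The trade-off is that changing the set of allowed values later requires an ALTER TABLE.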
126
Schema Design and Performance
Data Types
Date and Time Types
MySQL has many types for various kinds of date and time values, such as YEAR and DATE. The finest granularity of time MySQL can store is one second.
DATETIME This type can hold a large range of values, from the year 1000 to the year 9999, with a precision of one second.
TIMESTAMP The TIMESTAMP type stores the number of seconds elapsed since midnight, January 1, 1970, Greenwich Mean Time (GMT), the same as a Unix timestamp.
Special Types of Data
Some kinds of data don’t correspond directly to the available built-in types.
IPv4 address. People often use VARCHAR(15) to store the dotted-decimal IP address notation, or an unsigned 32-bit integer; MySQL provides the INET_ATON() and INET_NTOA() functions to convert between the two representations.
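For example (the table is hypothetical; INET_ATON() and INET_NTOA() are built-in MySQL functions):

```sql
-- Store the address compactly as an unsigned 32-bit integer
CREATE TABLE visits (
    ip INT UNSIGNED NOT NULL
);
INSERT INTO visits (ip) VALUES (INET_ATON('192.168.0.1'));  -- 3232235521

-- Convert back to dotted-decimal notation when reading
SELECT INET_NTOA(ip) FROM visits;  -- '192.168.0.1'
```

The integer form is four bytes instead of up to fifteen, and it sorts and compares numerically.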
127
Schema Design and Performance Indexes
Indexes (also called “keys” in MySQL) are data structures that storage engines use to find rows quickly. Without an index, MySQL must begin with the first row and then read through the entire table to find the relevant rows.
The easiest way to understand how an index works in MySQL is to think about the index in a book. To find out where a particular topic is discussed in a book, you look in the index, and it tells you the page number(s) where that term appears.
MySQL uses indexes for these operations:
To find the rows matching a WHERE clause quickly.
To eliminate rows from consideration. If there is a choice between multiple indexes, MySQL normally uses the index that finds the smallest number of rows.
To retrieve rows from other tables when performing joins. MySQL can use indexes on columns more efficiently if they are declared as the same type and size.
For comparisons between nonbinary string columns, both columns should use the same character set.
Comparison of dissimilar columns (for example, comparing a string column to a temporal or numeric column) may prevent use of indexes.
To find the MIN() or MAX() value for a specific indexed column key_col.
To sort or group a table if the sorting or grouping is done on a leftmost prefix of a usable key.
Indexes are less important for queries on small tables, or big tables where report queries process most or all of the rows. When a query needs to access most of the rows, reading sequentially is faster than working through an index. Sequential reads minimize disk seeks, even if not all the rows are needed for the query.
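The leftmost-prefix rule mentioned above can be illustrated with a hypothetical composite index:

```sql
CREATE TABLE staff (
    last_name  VARCHAR(50),
    first_name VARCHAR(50),
    dob        DATE,
    KEY idx_name_dob (last_name, first_name, dob)
);

-- These queries can use the index (leftmost prefix of the key):
SELECT * FROM staff WHERE last_name = 'Smith';
SELECT * FROM staff WHERE last_name = 'Smith' AND first_name = 'Anna';

-- This one cannot: first_name alone is not a leftmost prefix
SELECT * FROM staff WHERE first_name = 'Anna';
```

The same rule governs whether the index can be used for ORDER BY or GROUP BY.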
128
Schema Design and Performance Indexes
Types of Indexes
There are many types of indexes, each designed to perform well for different purposes. Indexes are implemented in the storage engine layer, not the server layer: so they are not standardized. Indexing works slightly differently in each engine, and not all engines support all types of indexes.
B-Tree Indexes
This is the default index type for most storage engines in MySQL. The general idea of a B-Tree is that all the values are stored in order, and each leaf page is the same distance from the root. A B-Tree index speeds up data access because the storage engine doesn’t have to scan the whole table to find the desired data. Instead, it starts at the root node and navigates down to the leaf page that holds the value.
Hash indexes
A hash index is built on a hash table and is useful only for exact lookups that use every column in the index. For each row, the storage engine computes a hash code of the indexed columns, which is a small value that will probably differ from the hash codes computed for other rows with different key values. It stores the hash codes in the index and stores a pointer to each row in a hash table.
Spatial (R-Tree) indexes
MyISAM supports spatial indexes, which you can use with geospatial types such as GEOMETRY. Unlike B-Tree indexes, spatial indexes don’t require WHERE clauses to operate on a leftmost prefix of the index. They index the data by all dimensions at the same time. As a result, lookups can use any combination of dimensions efficiently.
Full-text indexes
FULLTEXT is a special type of index that finds keywords in the text instead of comparing values directly to the values in the index. It is much more analogous to what a search engine does than to simple WHERE parameter matching.
129
Schema Design and Performance
Partitioning
Partitioning is performed by logically dividing one large table into small physical fragments.
Partitioning may bring several advantages:
In some situations query performance can be significantly increased, especially when the most intensively used table area is a separate partition or a small number of partitions. Such a partition and its indexes are more easily placed in the memory than the index of the whole table.
When queries or updates use a large percentage of one partition, performance may be increased simply through more beneficial sequential access to that partition on disk, instead of using the index and random read access across the whole table. B-Tree indexes on the partitioning columns (for example, an (itemid, clock) index on time-series data) benefit substantially in performance from partitioning.
Mass INSERT and DELETE operations can be performed by simply adding or dropping partitions, as long as this possibility was planned for when the partitioning was created. The ALTER TABLE statement will work much faster than any statement for mass insertion or deletion.
It is not possible to place InnoDB tables in arbitrary tablespaces in MySQL; you get one directory per database. Thus, to move a table partition file to another medium, it must be physically copied there and then referenced using a symbolic link.
130
Schema Design and Performance
Partitioning
Partitioned Tables
A partitioned table is a single logical table that’s composed of multiple physical subtables. The way MySQL implements partitioning means that indexes are defined per-partition, rather than being created over the entire table.
How Partitioning Works
As we’ve mentioned, partitioned tables have multiple underlying tables, which are represented by Handler objects. You can’t access the partitions directly. Each partition is managed by the storage engine in the normal fashion (all partitions must use the same storage engine), and any indexes defined over the table are actually implemented as identical indexes over each underlying partition.
Types of Partitioning
MySQL supports several types of partitioning. The most common type we’ve seen used is range partitioning, in which each partition is defined to accept a specific range of values for some column or columns, or a function over those columns. The next slides bring further details.
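A sketch of range partitioning on a hypothetical time-series table, showing how dropping a partition replaces a mass DELETE:

```sql
-- Partition by the integer timestamp column; the VALUES LESS THAN
-- expressions are evaluated to constants at CREATE time.
CREATE TABLE measurements (
    itemid INT NOT NULL,
    clock  INT NOT NULL  -- Unix timestamp
)
PARTITION BY RANGE (clock) (
    PARTITION p2022 VALUES LESS THAN (UNIX_TIMESTAMP('2023-01-01')),
    PARTITION p2023 VALUES LESS THAN (UNIX_TIMESTAMP('2024-01-01')),
    PARTITION pmax  VALUES LESS THAN MAXVALUE
);

-- Purge a year of old data by dropping its partition,
-- which is far faster than DELETE FROM ... WHERE clock < ...
ALTER TABLE measurements DROP PARTITION p2022;
```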
131
MySQL Query Performance
General SQL Tuning Best Practices
The goals of writing any SQL statement include delivering quick response times, using the least CPU resources, and achieving the fewest number of I/O operations, although real-life situations do not always allow every one of these so-called best practices to be applied.
Do not use SELECT * in your queries.
Always write the required column names after the SELECT statement: this technique results in reduced disk I/O and better performance.
Always use table aliases when your SQL statement involves more than one source.
If more than one table is involved in a FROM clause, each column name must be qualified using either the complete table name or an alias; the alias is preferred. It is more readable to use aliases than to write columns with no table information.
Use the more readable ANSI-Standard Join clauses instead of the old style joins.
With ANSI joins, the WHERE clause is used only for filtering data, whereas with old-style joins the WHERE clause handles both the join condition and the filtering. Furthermore, the ANSI join syntax supports the full outer join.
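A side-by-side sketch with hypothetical tables:

```sql
-- Old-style join: join condition and filter both live in WHERE
SELECT o.id, c.name
FROM orders o, customers c
WHERE o.customer_id = c.id
  AND o.total > 100;

-- ANSI join: join condition in ON, WHERE used only for filtering
SELECT o.id, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.total > 100;
```

Both queries return the same rows; the ANSI form makes the join logic explicit and easier to audit.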
132
MySQL Query Performance
General SQL Tuning Best Practices
Do not use column numbers in the ORDER BY clause.
Always use column names in an order by clause. Avoid positional references.
Always use a column list in your INSERT statements.
Always specify the target columns when executing an insert command. This helps in avoiding problems when the table structure changes (like adding or dropping a column).
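For example, with a hypothetical users table:

```sql
-- Fragile: depends on the table's column order and count
INSERT INTO users VALUES (1, 'alice', 'alice@example.com');

-- Robust: survives added or reordered columns
INSERT INTO users (id, login, email)
VALUES (1, 'alice', 'alice@example.com');
```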
Always use a SQL formatter to format your SQL.
The formatting of SQL code may not seem that important, but consistent formatting makes it easier for others to scan and understand your code. SQL statements have a structure, and having that structure be visually evident makes it much easier to locate and verify various parts of the statements. Uniform formatting also makes it much easier to add sections to and remove them from complex SQL statements for debugging purposes.
133
MySQL Query Performance
EXPLAIN
The EXPLAIN command is the main way to find out how the query optimizer decides to execute queries. This feature has limitations and doesn’t always tell the truth, but its output is the best information available, and it’s worth studying so you can learn how your queries are executed. Learning to interpret EXPLAIN will also help you learn how MySQL’s optimizer works.
To use EXPLAIN, simply add the word EXPLAIN just before the SELECT keyword in your query. MySQL will set a flag on the query. When it executes the query, the flag causes it to return information about each step in the execution plan, instead of executing it. It returns one or more rows, which show each part of the execution plan and the order of execution.
134
MySQL Query Performance
EXPLAIN
EXPLAIN tells you:
In which order the tables are read
What types of read operations are made
Which indexes could have been used
Which indexes are used
How the tables refer to each other
How many rows the optimizer estimates to retrieve from each table
135
MySQL Query Performance EXPLAIN
EXPLAIN example
mysql> explain select * from actor where 1;
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
| 1 | SIMPLE | actor | ALL | NULL | NULL | NULL | NULL | 200 | |
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
1 row in set (0.00 sec)
mysql> explain select * from actor where actor_id = 192;
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
| 1 | SIMPLE | actor | const | PRIMARY | PRIMARY | 2 | const | 1 | |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
1 row in set (0.00 sec)
136
MySQL Query Performance
EXPLAIN - Output
Column: Description
id: The SELECT identifier
select_type: The SELECT type
table: The table for the output row
partitions: The matching partitions
type: The join type
possible_keys: The possible indexes to choose
key: The index actually chosen
key_len: The length of the chosen key
ref: The columns compared to the index
rows: Estimate of rows to be examined
filtered: Percentage of rows filtered by table condition
Extra: Additional information
137
MySQL Query Performance
EXPLAIN - Types
Type: Description
system: The table has only one row
const: At most one matching row, treated as a constant
eq_ref: One row per row from previous tables
ref: Several rows with matching index value
ref_or_null: Like ref, plus NULL values
index_merge: Several index searches are merged
unique_subquery: Same as ref for some subqueries
index_subquery: As above, for non-unique indexes
range: A range index scan
index: The whole index is scanned
ALL: A full table scan
138
MySQL Query Performance
EXPLAIN - SELECT
SELECT TYPE Description
simple Simple SELECT (not using UNION or subqueries)
primary Outermost SELECT
union Second or later SELECT statement in a UNION
dependent union Second or later SELECT statement in a UNION, dependent on outer query
union result Result of a UNION.
subquery First SELECT in subquery
dependent subquery First SELECT in subquery, dependent on outer query
derived Derived table SELECT (subquery in FROM clause)
uncacheable subquery A subquery for which the result cannot be cached and must be re-evaluated for each row of the outer query
uncacheable union The second or later select in a UNION that belongs to an uncacheable subquery
139
MySQL Query Performance
EXPLAIN – Performance troubleshooting
In a real-world application there are many tables with numerous relations between them, and it is sometimes hard to anticipate the optimal way to write a query. The following sample query deliberately joins tables that have no indexes or primary keys, purely to demonstrate the impact of such a bad design.
EXPLAIN SELECT * FROM
orderdetails d
INNER JOIN orders o ON d.orderNumber = o.orderNumber
INNER JOIN products p ON p.productCode = d.productCode
INNER JOIN productlines l ON p.productLine = l.productLine
INNER JOIN customers c on c.customerNumber = o.customerNumber
WHERE o.orderNumber = 10101\G
140
MySQL Query Performance EXPLAIN – Performance troubleshooting
********************** 1. row **********************
id: 1
select_type: SIMPLE
table: l
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 7
Extra:
********************** 2. row **********************
id: 1
select_type: SIMPLE
table: p
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 110
Extra: Using where; Using join buffer
141
MySQL Query Performance EXPLAIN – Performance troubleshooting
********************** 3. row **********************
id: 1
select_type: SIMPLE
table: c
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 122
Extra: Using join buffer
********************** 4. row **********************
id: 1
select_type: SIMPLE
table: o
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 326
Extra: Using where; Using join buffer
142
MySQL Query Performance
EXPLAIN – Performance troubleshooting
********************** 5. row **********************
id: 1
select_type: SIMPLE
table: d
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 2996
Extra: Using where; Using join buffer
5 rows in set (0.00 sec)
The above result shows all the symptoms of a bad query. Even a better-written query would produce the same plan, because there are no indexes. The join type is "ALL" (the worst possible), which means MySQL could not identify any keys usable for the join; hence the possible_keys and key columns are NULL.
143
MySQL Query Performance
EXPLAIN – Performance troubleshooting
Now let's add some obvious indexes, such as primary keys for each table, and execute the query once again. As a general rule of thumb, the columns used in the JOIN clauses of a query are good candidates for keys, because MySQL will always scan those columns to find matching records. Re-running the same query after adding the indexes produces a result like this:
********************** 1. row **********************
id: 1
select_type: SIMPLE
table: o
type: const
possible_keys: PRIMARY,customerNumber
key: PRIMARY
key_len: 4
ref: const
rows: 1
Extra:
144
MySQL Query Performance EXPLAIN – Performance troubleshooting
********************** 2. row **********************
id: 1
select_type: SIMPLE
table: c
type: const
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: const
rows: 1
Extra:
********************** 3. row **********************
id: 1
select_type: SIMPLE
table: d
type: ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: const
rows: 4
Extra:
145
MySQL Query Performance EXPLAIN – Performance troubleshooting
********************** 4. row **********************
id: 1
select_type: SIMPLE
table: p
type: eq_ref
possible_keys: PRIMARY,productLine
key: PRIMARY
key_len: 17
ref: classicmodels.d.productCode
rows: 1
Extra:
********************** 5. row **********************
id: 1
select_type: SIMPLE
table: l
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 52
ref: classicmodels.p.productLine
rows: 1
Extra:
After adding indexes, the number of records scanned has been brought down to 1 × 1 × 4 × 1 × 1 = 4. That means for each record with orderNumber 10101 in the orderdetails table, MySQL was able to directly find the matching record in all other tables using the indexes and didn’t have to resort to scanning the entire table.
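The slides do not list the index statements that were added. A plausible sketch for the classicmodels sample schema, consistent with the possible_keys values reported in the EXPLAIN output above, would be:

```sql
-- Hypothetical DDL, for illustration only; it matches the keys the
-- EXPLAIN output above reports (PRIMARY plus the customerNumber and
-- productLine secondary indexes).
ALTER TABLE orders       ADD PRIMARY KEY (orderNumber),
                         ADD KEY customerNumber (customerNumber);
ALTER TABLE customers    ADD PRIMARY KEY (customerNumber);
ALTER TABLE orderdetails ADD PRIMARY KEY (orderNumber, productCode);
ALTER TABLE products     ADD PRIMARY KEY (productCode),
                         ADD KEY productLine (productLine);
ALTER TABLE productlines ADD PRIMARY KEY (productLine);
```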
146
MySQL Query Performance
MySQL Optimizer
The MySQL Query Optimizer
The goal of the MySQL optimizer is to take an SQL query as input and produce an optimal execution plan for it.
When you issue a query that selects rows, MySQL analyzes it to see if any optimizations can be used to process the query more quickly. In this section, we'll look at how the query optimizer works.
The MySQL query optimizer takes advantage of indexes, of course, but it also uses other information.
For example, if you issue the following query, MySQL will execute it very quickly, no matter how large the table is:
SELECT * FROM tbl_name WHERE 0;
In this case, MySQL looks at the WHERE clause, realizes that no rows can possibly satisfy the query, and doesn't even bother to search the table. You can see this by issuing an EXPLAIN statement, which tells MySQL to display some information about how it would execute a SELECT query without actually executing it.
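As a quick sketch (exact formatting depends on the server version):

```sql
-- EXPLAIN on a query whose WHERE clause can never be true:
EXPLAIN SELECT * FROM tbl_name WHERE 0;
-- The Extra column reports "Impossible WHERE": the optimizer has proven
-- that no row can match, so the table is never read.
```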
The optimizer itself is always active; its decision process can be traced by enabling the optimizer trace:
SET optimizer_trace='enabled=on';
147
MySQL Query Performance
MySQL Optimizer
How the Optimizer Works
The MySQL query optimizer has several goals, but its primary aims are to use indexes whenever possible and to use the most restrictive index in order to eliminate as many rows as possible as soon as possible.
The reason the optimizer tries to reject rows is that the faster it can eliminate rows from consideration, the more quickly the rows that do match your criteria can be found. Queries can be processed more quickly if the most restrictive tests can be done first. You can help the optimizer take advantage of indexes by using the following guidelines:
Try to compare columns that have the same data type. When you use indexed columns in comparisons, use columns that are of the same type. Identical data types will give you better performance than dissimilar types.
Try to make indexed columns stand alone in comparison expressions. If you use a column in a function call or as part of a more complex term in an arithmetic expression, MySQL can't use the index because it must compute the value of the expression for every row.
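The second guideline can be sketched in SQL (employees and its indexed hire_date column are hypothetical, not part of the course schema):

```sql
-- The index on hire_date CANNOT be used here: the function must be
-- evaluated for every row before the comparison.
SELECT * FROM employees WHERE YEAR(hire_date) = 2020;

-- The equivalent range condition leaves the column standing alone,
-- so the index CAN be used.
SELECT * FROM employees
WHERE hire_date >= '2020-01-01' AND hire_date < '2021-01-01';
```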
148
MySQL Query Performance
MySQL Optimizer
How the Optimizer Works
Don't use wildcards at the beginning of a LIKE pattern. A string search whose pattern starts with '%' cannot use an index, so don't put '%' on both sides of the string simply out of habit.
Use EXPLAIN to verify optimizer operation. The EXPLAIN statement can tell you whether indexes are being used. This information is helpful when you're trying different ways of writing a statement or checking whether adding indexes actually will make a difference in query execution efficiency.
Give the optimizer hints when necessary. Normally, the MySQL optimizer considers itself free to determine the order in which to scan tables to retrieve rows most quickly. On occasion, the optimizer will make a non-optimal choice. If you find this happening, you can override the optimizer's choice using the STRAIGHT_JOIN keyword.
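Two of the guidelines above, sketched in SQL (the customers/orders tables and the last_name index are hypothetical):

```sql
-- Leading '%' defeats an index on last_name: MySQL must scan every row.
SELECT * FROM customers WHERE last_name LIKE '%son';

-- A pattern anchored at the start can use the index as a range scan.
SELECT * FROM customers WHERE last_name LIKE 'Ander%';

-- STRAIGHT_JOIN forces the tables to be joined in the written order,
-- overriding the optimizer's choice.
SELECT STRAIGHT_JOIN c.last_name, o.order_date
FROM customers c
INNER JOIN orders o ON o.customer_id = c.customer_id;
```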
149
MySQL Query Performance
MySQL Optimizer
How the Optimizer Works
Take advantage of areas in which the optimizer is more mature. MySQL can do joins and subqueries, but subquery support is more recent, having been added in MySQL 4.1. Consequently, the optimizer has been better tuned for joins than for subqueries in some cases.
Test alternative forms of queries, but run them more than once. When testing alternative forms of a query (for example, a subquery versus an equivalent join), run it several times each way. If you run a query only once each of two different ways, you'll often find that the second query is faster just because information from the first query is still cached and need not actually be read from the disk.
Avoid overuse of MySQL's automatic type conversion. MySQL will perform automatic type conversion, but if you can avoid conversions, you may get better performance.
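A classic conversion pitfall, assuming a hypothetical indexed VARCHAR column phone:

```sql
-- Comparing a string column to a numeric literal forces MySQL to
-- convert the column value for every row, so the index is unusable:
SELECT * FROM directory WHERE phone = 5551212;

-- Comparing like with like lets the index do its job:
SELECT * FROM directory WHERE phone = '5551212';
```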
150
MySQL Query Performance
MySQL Optimizer
Overriding Optimization
It sounds odd, but there may be times when you'll want to defeat MySQL's optimization behaviour.
To override the optimizer's table join order. Use STRAIGHT_JOIN to force the optimizer to use tables in a particular order. If you do this, you should order the tables so that the first table is the one from which the smallest number of rows will be chosen.
To empty a table with minimal side effects. When you need to empty a MyISAM table completely, it's fastest to have the server just drop the table and re-create it based on the description stored in its .frm file. To do this, use a TRUNCATE TABLE statement.
151
MySQL Query Performance
Finding Problematic Queries
Database performance is affected by many factors, one of them being the query optimizer. To be sure the optimizer is not introducing noise into well-functioning queries, we must analyze any slow queries. Watch the slow query log first, as stated previously in the course. By default, the slow query log is disabled. To specify the initial slow query log state explicitly, use
mysqld --slow_query_log[={0|1}]
With no argument or an argument of 1, --slow_query_log enables the log. With an argument of 0, this option disables the log.
One of the best tools for this kind of query analysis is pt-query-digest from Percona. It is a third-party tool that can work from logs, the processlist, or tcpdump captures.
You also need the log to include all the queries, not just those that take more than N seconds. The reason is that some queries are individually quick, and would not be logged if you set the long_query_time configuration variable to 1 or more seconds. You want that threshold to be 0 seconds while you’re collecting logs.
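On servers where these variables are dynamic (MySQL 5.1 and later), the capture threshold can also be changed at runtime rather than on the command line; a sketch:

```sql
-- Enable the slow query log and capture every statement while collecting.
SET GLOBAL slow_query_log = 1;
SET GLOBAL long_query_time = 0;
-- ...run the workload for a representative period, then restore a saner
-- threshold (the new value applies to connections opened after the change):
SET GLOBAL long_query_time = 1;
```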
152
MySQL Query Performance
Finding Problematic Queries
Another good practice involves the processlist and SHOW EXPLAIN:
mysql> show processlist;
mysql> show explain for <connection_id>;
Note that SHOW EXPLAIN FOR is provided by MariaDB and Percona Server; stock MySQL 5.7 and later offers the equivalent EXPLAIN FOR CONNECTION <connection_id>.
An evolution of this approach comes from the performance_schema database, which can be analyzed via queries against tables such as:
events_statements_summary_by_digest
count_star, sum_timer_wait, min_timer_wait, avg_timer_wait, max_timer_wait
digest_text, digest
sum_rows_examined, sum_created_tmp_disk_tables, sum_select_full_join
events_statements_history
sql_text, digest_text, digest
timer_start, timer_end, timer_wait
rows_examined, created_tmp_disk_tables, select_full_join
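For example, a "worst queries by total time" report built from the first table above (column names as in performance_schema; the exact selection is a sketch):

```sql
SELECT digest_text,
       count_star,
       sum_timer_wait,
       avg_timer_wait,
       sum_rows_examined
FROM performance_schema.events_statements_summary_by_digest
ORDER BY sum_timer_wait DESC
LIMIT 10;
```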
153
MySQL Query Performance
Improve Query Execution
One nice feature added to the EXPLAIN statement in MySQL 4.1 is the EXTENDED keyword, which provides helpful additional information on query optimization. It should be used together with SHOW WARNINGS to see how the query looks after transformation, as well as any other notes the optimizer wishes to report.
While it may look like a regular EXPLAIN statement, EXPLAIN EXTENDED makes MySQL record the statement in its optimized form; running SHOW WARNINGS afterwards prints out that optimized SELECT statement.
For example, the optimizer's transformations of the statement below can be analyzed by adding the EXPLAIN EXTENDED prefix:
EXPLAIN EXTENDED SELECT COUNT(*) FROM employees WHERE id IN (SELECT emp_id FROM bonuses);
The resulting output table is very much like the one produced by a regular EXPLAIN, except for the added filtered column in the second-to-last position. The filtered column is an estimated percentage of table rows that will pass the table condition; thus the rows column shows the estimated number of rows examined, and rows × filtered / 100 gives the number of rows that will be joined with previous tables.
Applying EXPLAIN EXTENDED to our query gives us the opportunity to run SHOW WARNINGS afterwards to see the final optimized query:
SHOW WARNINGS;
154
MySQL Query Performance
Locate and Correct Problematic Queries
Finding bad queries is a big part of optimization. Queries, or groups of queries, are bad because:
they are slow and provide a bad user experience
they add too much load to the system
they block other queries from running
In the real world, problematic queries typically stem from one of these situations:
Bad query plan: rewrite the query, or force a good plan.
Bad optimizer settings: tune them.
Query is inherently complex: don't waste time on it; look for other solutions.
155
MySQL Query Performance Locate and Correct Problematic Queries
Baseline. Always establish the current baseline of MySQL performance before any changes are made. Otherwise it is really only a guess afterwards whether the changes improved MySQL performance. The easiest way to baseline MySQL performance is with mysqlreport.
Assess Baseline. The report that mysqlreport writes can contain a lot of information, but for our purpose here there are only three things we need to look at. It is not necessary to understand the nature of these values at this point, but they give us an idea how well or not MySQL is really running.
Log Slow Queries and Wait. By default MySQL does not log slow queries, and the default slow query time is 10 seconds. Change this by adding these lines under the [mysqld] section in /etc/my.cnf:
log-slow-queries
long_query_time = 1
Restart MySQL and wait at least a full day. This will cause MySQL to log all queries which take longer than 1 second to execute.
Isolate Top 10 Slow Queries. The easiest way to isolate the top 10 slowest queries in the slow queries log is to use mysqlsla. Run mysqlsla on your slow queries log and save the output to a file. For example: "mysqlsla --log-type slow /var/lib/mysql/slow_queries.log > ~/top_10_slow_queries". That command will create a file in your home directory called top_10_slow_queries.
Post-fix Proof. Presuming that your MySQL expert was able to fix the top slow queries, the final step is to prove this is actually the case and not just coincidence. Restart MySQL and wait as long as MySQL had run in the first step (ideally at least a day). Then baseline MySQL performance again with mysqlreport. Compare the first report with this second report, specifically the three values we looked at in step two (Read ratio, Slow, and Waited).
156
Performance Tuning Extras
Configuring Hardware
Your MySQL server can perform only as well as its weakest link, and the operating system and the hardware on which it runs are often limiting factors. The disk size, the available memory and CPU resources, the network, and the components that link them all limit the system's ultimate capacity. MySQL requires significant amounts of memory to provide optimal performance. The fastest and most effective change you can make is to increase the amount of RAM on your database server: get as much as possible (e.g. 4GB or more). Increasing primary memory reduces the need for processes to swap to disk and enables your server to handle more users.
157
Performance Tuning Extras
Configuring Hardware
Better performance is gained by obtaining the best processor capability you can, e.g. dual or dual-core processors. A modern BIOS should allow you to enable hyperthreading; check whether this makes a difference to the overall performance of the processors by using a CPU benchmarking tool.
If you can afford them, use SCSI hard disks instead of SATA drives. SATA drives will increase your system's CPU utilization, whereas SCSI drives have their own integrated processors and come into their own when you have multiple drives. If you must have SATA drives, check that your motherboard and the drives themselves support NCQ (Native Command Queuing).
Purchase hard disks with a low seek time. This will improve the overall speed of your system, especially when accessing MySQL tablespaces and datafiles.
158
Performance Tuning Extras
Configuring Hardware
Size your swap file correctly. The general advice is to set it to 4 x physical RAM.
Use a RAID disk system. Although there are many different RAID configurations you can create, the following generally works best:
install a hardware RAID controller
the operating system and swap drive on one set of disks configured as RAID-1.
MySQL server on another set of disks configured as RAID-5 or RAID-10.
Use gigabit ethernet for improved latency and throughput. This is especially important when you have your webserver and database server separated out on different hosts.
Check the settings on your network card. You may get an improvement in performance by increasing the use of buffers and transmit/receive descriptors (balance this with processor and memory overheads) and off-loading TCP checksum calculation onto the card instead of the OS.
159
Performance Tuning Extras
Considering Operating Systems
You can use Linux (recommended), Unix-based, Windows or Mac OS X for the server operating system. *nix operating systems generally require less memory than Mac OS X or Windows servers for doing the same task as the server is configured with just a shell interface. Additionally Linux does not have licensing fees attached, but can have a big learning curve if you're used to another operating system. If you have a large number of processors running SMP, you may also want to consider using a highly tuned OS such as Solaris.
Check your own OS and vendor specific instructions for optimization steps.
For Linux look at the Linux Performance Team site.
On Linux, investigate the hdparm command; e.g. hdparm -m16 -d1 can be used to enable read/write on multiple sectors and DMA. Mount disks with the async and noatime options.
For Windows set the server to be optimized for network applications (Control Panel, Network Connections, LAN connection, Properties, File & Printer Sharing for Microsoft Networks, Properties, Optimization). You can also search the Microsoft TechNet site for optimization documents.
160
Performance Tuning Extras
Operating Systems Configurations
Windows
If you installed MySQL on a Windows system using the Windows Installation Wizard, most of the work is already done. When that wizard completes, it most likely launched the MySQL Configuration Wizard, which walks you through configuring the database. When the wizard starts for the first time, it asks whether you'd like to perform a standard or a detailed configuration. The standard configuration process consists of two steps: service options and security options. You'll first see a screen asking if you'd like to install MySQL as a service; in most cases you should select this option, since running the database as a service lets it run in the background without requiring user interaction. The second phase of the standard configuration lets you set two types of security settings. The first is the use of a root password, which is strongly recommended; this root password controls access to the most sensitive administration tasks on your server. The second option is whether to have an anonymous user account; to increase the security of your system, we recommend that you do not enable it unless absolutely necessary.
161
Performance Tuning Extras
Operating Systems Configurations
Linux
Whatever distribution you choose, the configuration is based on the my.cnf file. In most cases, you should not touch this file. By default, it will have the following entries:
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
[mysql.server]
user=mysql
basedir=/var/lib
[safe_mysqld]
err-log=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
162
Performance Tuning Extras
Logging
MySQL Server has several logs that can help you find out what activity is taking place.
Error log Problems encountered starting, running, or stopping mysqld
General query log Established client connections and statements received from clients
Binary log Statements that change data (also used for replication)
Relay log Data changes received from a replication master server
Slow query log Queries that took more than long_query_time seconds to execute
By default, no logs are enabled; when a log is enabled, the server writes its file in the data directory unless another location is specified.
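In MySQL 5.1 and later, the general and slow query logs can also be switched on at runtime, without a restart (the binary and relay logs still require configuration-file changes). A sketch:

```sql
SET GLOBAL general_log_file = '/var/log/mysql/mysql.log';
SET GLOBAL general_log = 'ON';
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 2;  -- seconds; applies to new connections
```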
163
Performance Tuning Extras
Logging
Logging parameters are located under [mysqld] section in /etc/my.cnf configuration file. A typical schema should be the following:
[mysqld]
log-bin=/var/log/mysql-bin.log
log=/var/log/mysql.log
log-error=/var/log/mysql-error.log
log-slow-queries=/var/log/mysql-slowquery.log
164
Performance Tuning Extras Logging Error Log
Error Log goes to syslog due to /etc/mysql/conf.d/mysqld_safe_syslog.cnf, which contains the following:
[mysqld_safe]
syslog
General Query Log
To enable General Query Log, uncomment (or add) the relevant lines
general_log_file = /var/log/mysql/mysql.log
general_log = 1
Slow Query Log
To enable Slow Query Log, uncomment (or add) the relevant lines
log_slow_queries = /var/log/mysql/mysql-slow.log
long_query_time = 2
log-queries-not-using-indexes
Restart MySQL server after changes
This method requires a server restart.
$ service mysql restart
165
Performance Tuning Extras
Backup and Recovery
It is important to back up your databases so that you can recover your data and be up and running again in case problems occur, such as system crashes, hardware failures, or users deleting data by mistake. Backups are also essential as a safeguard before upgrading a MySQL installation, and they can be used to transfer a MySQL installation to another system or to set up replication slave servers.
166
Performance Tuning Extras Backup and Recovery
Logical Backups
Logical Backup (mysqldump)
Amongst other things, the mysqldump command allows you to do logical backups of your database by producing the SQL statements necessary to rebuild all the schema objects. An example is shown below.
$ # All DBs
$ mysqldump --user=root --password=mypassword --all-databases > all_backup.sql
$ # Individual DB (or comma separated list for multiple DBs)
$ mysqldump --user=root --password=mypassword mydatabase > mydatabase_backup.sql
$ # Individual Table
$ mysqldump --user=root --password=mypassword mydatabase mytable > mydatabase_mytable_backup.sql
Recovery from Logical Backup (mysql)
The logical backup created using the mysqldump command can be applied to the database using the MySQL command line tool, as shown below.
$ # All DBs
$ mysql --user=root --password=mypassword < all_backup.sql
$ # Individual DB
$ mysql --user=root --password=mypassword --database=mydatabase < mydatabase_backup.sql
167
Performance Tuning Extras
Backup and Recovery
Cold Backups
Cold backups are a type of physical backup as you copy the database files while the database is offline.
Cold Backup
The basic process of a cold backup involves stopping MySQL, copying the files, then restarting MySQL. You can use whichever method you want to copy the files (cp, scp, tar, zip, etc.).
# service mysqld stop
# cd /var/lib/mysql
# tar -cvzf /tmp/mysql-backup.tar.gz ./*
# service mysqld start
Recovery from Cold Backup
To recover the database from a cold backup, stop MySQL, restore the backup files and start MySQL again.
# service mysqld stop
# cd /var/lib/mysql
# tar -xvzf /tmp/mysql-backup.tar.gz
# service mysqld start
168
Performance Tuning Extras Backup and Recovery
Binary Logs : Point In Time Recovery (PITR)
Binary logs record all changes to the databases, which are important if you need to do a Point In Time Recovery (PITR). Without the binary logs, you can only recover the database to the point in time of a specific backup. The binary logs allow you to wind forward from that point by applying all the changes that were written to the binary logs. Unless you have a read-only system, it is likely you will need to enable the binary logs.
To enable the binary logs, edit the "/etc/my.cnf" file, uncommenting the "log_bin" entry.
# Remove leading # to turn on a very important data integrity option: logging
# changes to the binary log between backups.
log_bin
The binary logs will be written to the "datadir" location specified in the "/etc/my.cnf" file, with a default prefix of "mysqld". If you want to alter the prefix and path, you can do so by specifying an explicit base name.
# Prefix set to "mydb". Stored in the default location.
log_bin=mydb
# Files stored in "/u01/log_bin" with the prefix "mydb".
log_bin=/u01/log_bin/mydb
Restart the MySQL service for the change to take effect.
# service mysqld restart
The mysqlbinlog utility converts the contents of the binary logs to text, which can be replayed against the database.
169
Conclusion
Course Overview
Course Aims
Understand the basics of performance tuning
Use performance tuning tools
Tune the MySQL Server instance to improve performance
Improve performance of tables based on the storage engine being used
Implement proper Schema Design to improve performance
Improve the performance of MySQL Queries
Describe additional items related to performance tuning
170
Conclusion
Training and Certification Website
The following is a small list of sites of interest for related MySQL training courses.
Oracle University
http://education.oracle.com/pls/web_prod-plq-ad/db_pages.getpage?page_id=3
MySQL Training
http://www.mysql.it/training/
MySQL Certifications
http://www.mysql.it/certification/
171
Conclusion
Course evaluation
Please answer the questions in order to verify the knowledge achieved during this course. Thanks.
172
Conclusion
Thank you!
173
Conclusion
Q & A
174
Lab 1: Basic MySQL operations MySQL installation
On Debian Linux distros, this is done by entering the command:
$ sudo apt-get -y install mysql-server
Other distributions rely on similar commands, such as SuSE Zypper, Red Hat YUM and others.
Set root password
$ mysql -u root
mysql> SET PASSWORD FOR 'root'@'localhost' = PASSWORD('root');
Set host
mysql> GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY 'root' WITH GRANT OPTION;
mysql> FLUSH PRIVILEGES;
175
Lab 1: MySQL DB connection MySQL connection
On the command line, just type
$ mysql -u root -p
Then you are prompted to insert the password. Once entered, a banner greets you and a new command prompt appears:
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 70
Server version: 5.5.38-0+wheezy1-log (Debian)
Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql>
176
Lab 1: MySQL Environment OS commands
$ cat /proc/cpuinfo
$ cat /proc/meminfo
$ iostat -dx 5
$ netstat -an
$ dstat
177
Lab 1: MySQL Environment First MySQL server configuration. Find and edit the main
configuration file called "my.cnf", enter these values, then restart MySQL
[mysqld]
performance_schema
performance_schema_events_waits_history_size=20
performance_schema_events_waits_history_long_size=15000
log_slow_queries = slow_query.log
long_query_time = 1
log_queries_not_using_indexes = 1
$ service mysql restart
178
Lab 1: Benchmarks Try to use the native BENCHMARK() function to compare operators
mysql> SELECT BENCHMARK(100000000, CONCAT('a','b'));
Now try the same function against queries:
mysql> use sakila;
mysql> SELECT BENCHMARK(100, SELECT `actor_id` FROM `actor`);
Did it work? Why?
179
Lab 1: Storage engines Create a brand new table without specifying the engine to use:
use test;
mysql> CREATE TABLE char_test( char_col CHAR(10));
To see which tables use which engines:
mysql> SHOW TABLE STATUS;
Selecting the storage engine to use is a tuning decision
mysql> alter table char_test engine=myisam;
Re-run the previous command to see the differences:
mysql> SHOW TABLE STATUS;
180
Lab 1: I/O Benchmark Install “sysbench” and try to run it with simple options as shown before:
$ sysbench --test=fileio prepare
$ sysbench --test=fileio --file-test-mode=rndrw run
$ sysbench --test=fileio cleanup
Install “iozone” and try the same:
$ iozone -a
You can also save the output to a spreadsheet using iozone -b
$ ./iozone -a -b output.xls
181
Lab 2: Performance Enable Slow Query Log
Find and edit configuration file “my.cnf” with:
log_slow_queries = <example slow_query.log>
long_query_time = 1
log_queries_not_using_indexes = 1
Then restart the MySQL daemon
$ service mysql restart
Now run the Mysqldumpslow command, after some MySQL operations:
$ mysqldumpslow
or
$ mysqldumpslow <options> <example slow_query.log>
182
Lab 2: MySQL Query Cache Let’s assume we have a standard “my.cnf” configuration file. To enable
query cache, we have to edit it
$ vi /etc/mysql/my.cnf
Append the following lines and then restart the MySQL daemon
query_cache_size = 268435456
query_cache_type=1
query_cache_limit=1048576
$ service mysql restart
Now run a benchmark session and keep note of the results
$ mysqlslap -uroot -proot -h localhost --create-schema=sakila -i 5 -c 10 -q "select * from actor order by rand() limit 10"
183
Lab 2: MySQL Query Cache Disable the query cache in any of the following ways, from inside the MySQL prompt:
SET GLOBAL query_cache_size=0;
SHOW GLOBAL STATUS LIKE 'Qcache%';
SET SESSION query_cache_type=0;
Re-run the benchmark session and observe the differences
$ mysqlslap -uroot -proot -h localhost --create-schema=sakila -i 5 -c 10 -q "select * from actor order by rand() limit 10"
184
Lab 3: InnoDB Launch and figure out how InnoDB is set on the server:
SHOW ENGINE INNODB STATUS;
Enable the InnoDB logging facilities
mysql> use mysql;
mysql> CREATE TABLE innodb_monitor (a INT) ENGINE=INNODB;
mysql> CREATE TABLE innodb_lock_monitor (a INT) ENGINE=INNODB;
mysql> CREATE TABLE innodb_tablespace_monitor (a INT) ENGINE=INNODB;
mysql> CREATE TABLE innodb_table_monitor (a INT) ENGINE=INNODB;
185
Lab 3: MyISAM Choose and use any Sakila DB table to define a FULLTEXT index using the
ALTER TABLE statement:
mysql> ALTER TABLE table_name ADD FULLTEXT(column_name1, column_name2,…)
You can also use CREATE INDEX statement to create FULLTEXT index for existing tables.
mysql> CREATE FULLTEXT INDEX index_name ON table_name(idx_column_name,...)
Use any benchmark tool to see the differences in speed during queries without and with the fulltext indexing enabled.
186
Lab 3: MyISAM with Sphinx Example: create a table
CREATE TABLE `film` (
`film_id` smallint(5) unsigned NOT NULL
auto_increment,
`title` varchar(255) NOT NULL,
`description` text,
`last_update` timestamp NOT NULL default
CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP,
...
PRIMARY KEY (`film_id`),
...
) ENGINE=InnoDB ;
187
Lab 3: MyISAM with Sphinx Example: edit the sphinx.conf file
source film
{
type = mysql
sql_host = localhost
sql_user = sakila_ro
sql_pass = 123456
sql_db = sakila
sql_port = 3306 # optional, default is 3306
sql_query = \
SELECT film_id, title, UNIX_TIMESTAMP(last_update) AS
last_update_timestamp FROM film
sql_attr_int = film_id
sql_attr_timestamp = last_update_timestamp
sql_query_info = SELECT * FROM film WHERE film_id=$id
}
188
Lab 3: MyISAM with Sphinx Example: edit the sphinx.conf file
index film
{
source = film
path = /usr/bin/sphinx/data/film
}
Run queries
189
Lab 3: MyISAM with Sphinx Example: create a table using the Sphinx Storage Engine (SphinxSE)
CREATE TABLE sphinx_film
(
film_id INT NOT NULL,
weight INT NOT NULL,
query VARCHAR(3072) NOT NULL,
last_update INT,
INDEX(query)
) ENGINE=SPHINX
CONNECTION="sphinx://localhost:12321/film";
190
Lab 3: MyISAM with Sphinx Example: SphinxSE queries
SELECT * FROM sphinx_film WHERE query='drama';
SELECT * FROM sphinx_film INNER JOIN film USING (film_id) WHERE query='drama';
SELECT * FROM sphinx_film
INNER JOIN film USING (film_id) WHERE query='drama;limit=50';
SELECT * FROM sphinx_film
INNER JOIN film USING (film_id) WHERE
query='drama;limit=50;sort=attr_asc:last_update';
SELECT * FROM sphinx_film INNER JOIN film USING
(film_id) WHERE query='drama;limit=50;groupby=day:last_update';
191
Lab 4: Explain EXPLAIN
Suppose you want to rewrite the following UPDATE statement to make it EXPLAIN-able:
mysql> UPDATE sakila.actor
INNER JOIN sakila.film_actor USING (actor_id)
SET actor.last_update=film_actor.last_update;
The following EXPLAIN statement is not equivalent to the UPDATE, because it doesn't
require the server to retrieve the last_update column from either table:
mysql> EXPLAIN SELECT film_actor.actor_id
-> FROM sakila.actor
-> INNER JOIN sakila.film_actor USING (actor_id)\G
192
Lab 4: Explain EXPLAIN
This is better, closer to the original UPDATE:
mysql> EXPLAIN SELECT film_actor.last_update, actor.last_update
-> FROM sakila.actor
-> INNER JOIN sakila.film_actor USING (actor_id)\G
Rewriting queries like this is not an exact science, but it’s often good enough to help
you understand what a query will do.
193
Lab 4: Critical queries Practice with these commands:
mysql> show processlist;
mysql> show explain for <connection_id>; (MariaDB; use the Id column from show processlist)
Practice with the information_schema database
information_schema is the database where information about all the other databases is kept: database and table names, column data types, access privileges, and so on. It is a built-in virtual database whose sole purpose is to provide information about the database system itself. The MySQL server populates its tables automatically.
194
Lab 4: Performance_schema queries Once enabled, try to use the performance_schema monitoring database
$ vi /etc/my.cnf
[mysqld]
performance_schema=on
mysql> USE performance_schema;
mysql> SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'performance_schema';
mysql> SHOW TABLES FROM performance_schema;
mysql> SHOW CREATE TABLE setup_timers\G
mysql> UPDATE setup_instruments SET ENABLED = 'YES', TIMED = 'YES';
mysql> UPDATE setup_consumers SET ENABLED = 'YES';
mysql> SELECT * FROM events_waits_current\G
195
Lab 4: Performance_schema queries
mysql> SELECT THREAD_ID, NUMBER_OF_BYTES
-> FROM events_waits_history
-> WHERE EVENT_NAME LIKE 'wait/io/file/%'
-> AND NUMBER_OF_BYTES IS NOT NULL;
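To make sense of the rows that query returns, you would typically aggregate bytes per thread. A minimal Python sketch over hypothetical (THREAD_ID, NUMBER_OF_BYTES) rows (the sample values are invented, not from a real server):

```python
from collections import defaultdict

# Sum NUMBER_OF_BYTES per THREAD_ID from rows like those returned by the
# events_waits_history query above. Sample rows are illustrative.
def bytes_per_thread(rows):
    totals = defaultdict(int)
    for thread_id, nbytes in rows:
        totals[thread_id] += nbytes
    return dict(totals)

rows = [(11, 4096), (11, 16384), (12, 512)]
print(bytes_per_thread(rows))  # {11: 20480, 12: 512}
```

The same aggregation can of course be done in SQL with GROUP BY THREAD_ID; the sketch just shows the shape of the analysis.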
Performance Schema Runtime Configuration
mysql> SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES
-> WHERE TABLE_SCHEMA = 'performance_schema'
-> AND TABLE_NAME LIKE 'setup%';
196
Case studies
Case study n. 1
197
Case studies – Case study n. 1 Scope of Problem
Overnight the query performance went from <1ms to 50x worse.
Nothing changed in terms of server configuration, schema, etc.
Tried throttling the server to 1/2 of its workload
from 20k QPS to 10k QPS
no improvement
198
Case studies – Case study n. 1 Considerations
Change in config client doesn't know about?
Hardware problem such as a failing disk?
Load increase: data growth or QPS crossed a "tipping point"?
Schema changes client doesn't know about (missing index?)
Network component such as DNS?
199
Case studies – Case study n. 1 Elimination of easy possibilities:
ALL queries are found to be slower in slow-query-log
eliminates DNS as a possibility.
Queries are slow when run via Unix socket
eliminates network.
No errors in dmesg or RAID controller
suggests (doesn't eliminate) that hardware is not the problem.
Detailed historical metrics show no change in Handler_ graphs
suggests (doesn't eliminate) that indexing is not the problem.
Also, combined with the fact that ALL queries are 50x slower, very strong reason to believe indexing is not the problem.
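The Handler_ reasoning above can be made concrete: take two SHOW GLOBAL STATUS snapshots some seconds apart and compute per-second deltas of the Handler_ counters. A minimal Python sketch, with invented sample values (the counter names are real MySQL status variables):

```python
# Compare two SHOW GLOBAL STATUS snapshots taken `interval` seconds apart
# and report the per-second rate of each Handler_* counter. A jump in
# Handler_read_rnd_next relative to Handler_read_key would suggest full
# scans, i.e. a lost or unused index.
def handler_rates(before, after, interval):
    rates = {}
    for name, end_value in after.items():
        if name.startswith("Handler_"):
            rates[name] = (end_value - before[name]) / interval
    return rates

snap1 = {"Handler_read_key": 1_000_000, "Handler_read_rnd_next": 50_000}
snap2 = {"Handler_read_key": 1_060_000, "Handler_read_rnd_next": 50_500}
print(handler_rates(snap1, snap2, 60))
# Handler_read_key: 1000.0/s, with only a trickle of Handler_read_rnd_next
```

In the case study, flat Handler_ graphs across the incident are what ruled indexing out.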
200
Case studies – Case study n. 1 Investigation of the obvious:
Aggregation of SHOW PROCESSLIST shows queries are not in Locked status.
Investigating SHOW INNODB STATUS shows no problems with semaphores, transaction states such as "commit", main thread, or other likely culprits.
However, SHOW INNODB STATUS shows many queries with an empty ("") status, as here:
---TRANSACTION 4 3879540100, ACTIVE 0 sec, process no 26028, OS thread id 1344928080
MySQL thread id 344746, query id 1046183178 10.16.221.148 webuser
SELECT ....
All such queries are simple and well-optimized according to EXPLAIN.
The system has 8 CPUs (Intel(R) Xeon(R) E5450 @ 3.00GHz) and a RAID controller with 8 Intel X25-E SSD drives behind it, with BBU and WriteBack caching.
201
Case studies – Case study n. 1 vmstat 5
r b swpd free buff cache si so bi bo in cs us sy id wa
4 0 875356 1052616 372540 8784584 0 0 13 3320 13162 49545 18 7 75 0
4 0 875356 1070604 372540 8785072 0 0 29 4145 12995 47492 18 7 75 0
3 0 875356 1051384 372544 8785652 0 0 38 5011 13612 55506 22 7 71 0
iostat -dx 5
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 61.20 1.20 329.20 15.20 4111.20 24.98 0.03 0.09 0.09 3.04
dm-0 0.00 0.00 0.80 390.60 12.80 4112.00 21.08 0.03 0.08 0.07 2.88
mpstat 5
10:36:12 PM CPU %user %nice %sys %iowait %irq %soft %steal %idle intr/s
10:36:17 PM all 18.81 0.05 3.22 0.22 0.24 2.71 0.00 74.75 13247.40
10:36:17 PM 0 19.57 0.00 3.52 0.98 0.20 2.74 0.00 72.99 1939.00
10:36:17 PM 1 18.27 0.00 3.08 0.38 0.19 2.50 0.00 75.58 1615.40
202
Case studies – Case study n. 1 Premature Conclusion
As a result of all the above, we conclude that nothing external to the database is obviously the problem
The system is not virtualized
I expect the database to be able to perform normally.
What to do next?
Try to use a tool to make things easy.
Solution: use pt-ioprofile (from Percona Toolkit).
203
Case studies – Case study n. 1 Solution
Start innotop (just to have a realtime monitor)
Disable query cache.
Watch QPS change in innotop.
Additional Confirmation
The slow query log also confirms that queries are back to normal
tail -f /var/log/slow.log | perl pt-query-digest --run-time 30s --report-format=profile
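pt-query-digest's profile report is, at its core, an aggregation of Query_time per statement. A heavily simplified Python sketch of that idea (real fingerprinting normalizes literals and whitespace; this one groups on the raw statement text, and the sample log entries are invented):

```python
import re

# Sum Query_time per statement from slow-log text: a "# Query_time:" meta
# line records the time, and the next non-comment line is the statement.
def profile(log_text):
    totals = {}
    current_time = None
    for line in log_text.splitlines():
        m = re.match(r"# Query_time: ([\d.]+)", line)
        if m:
            current_time = float(m.group(1))
        elif current_time is not None and line and not line.startswith("#"):
            totals[line] = totals.get(line, 0.0) + current_time
            current_time = None
    return totals

log = """# Query_time: 0.50 Lock_time: 0.00
SELECT * FROM film WHERE film_id = 1;
# Query_time: 0.25 Lock_time: 0.00
SELECT * FROM film WHERE film_id = 1;"""
print(profile(log))  # {'SELECT * FROM film WHERE film_id = 1;': 0.75}
```

For real work, use pt-query-digest itself; the sketch only shows why the profile output lets you confirm response times at a glance.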
204
Case studies
Case study n. 2
205
Case studies – Case study n. 2 Information Provided
About 4PM on Saturday, queries suddenly began taking insanely long to complete
From sub-ms to many minutes.
As far as the customer knew, nothing had changed.
Nobody was at work.
They had disabled selected apps where possible to reduce load.
206
Case studies – Case study n. 2 Overview
They are running 5.0.77-percona-highperf-b13.
The server has an EMC SAN
with a RAID5 array of 5 disks, and LVM on top of that
Server has 2 quad-core CPUs (Xeon L5420 @ 2.50GHz).
No virtualization.
They tried restarting mysqld
It has 64GB of RAM, so it's not warm yet.
207
Case studies – Case study n. 2 Train of thought
The performance drop is way too sudden and large.
On a weekend, when no one is working on the system.
Something is seriously wrong.
Look for things wrong first.
208
Case studies – Case study n. 2 Elimination of easy possibilities:
First, confirm that queries are actually taking a long time to complete.
They all are, as seen in processlist.
Check the SAN status.
They checked and reported that it's not showing any errors or failed disks.
209
Case studies – Case study n. 2 Investigation of the obvious:
Server's incremental status variables don't look amiss
150+ queries in commit status.
Many transactions are waiting for locks inside InnoDB
But no semaphore waits, and main thread seems OK.
iostat and vmstat at 5-second intervals:
Suspicious IO performance and a lot of iowait
But virtually no work being done.
210
Case studies – Case study n. 2 iostat
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdb 0.00 49.00 10.00 104.00 320.00 8472.00 77.12 2.29 20.15 8.78 100.10
sdb1 0.00 49.00 10.00 104.00 320.00 8472.00 77.12 2.29 20.15 8.78 100.10
vmstat
r b swpd free buff cache si so bi bo in cs us sy id wa st
5 1 176 35607308 738468 19478720 0 0 48 351 0 0 1 0 96 3 0
0 1 176 35605912 738472 19478820 0 0 560 848 2019 2132 4 1 83 13 0
0 2 176 35605788 738480 19479048 0 0 608 872 2395 2231 0 1 85 14 0
From vmstat/iostat:
It looks like something is blocking commits
Likely to be either a serious bug (a transaction that has gotten the commit mutex and is hung?) or a hardware problem.
IO unreasonably slow, so that is probably the problem.
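The derived iostat columns can be cross-checked from the raw per-second numbers, a useful sanity check when judging whether a device is queueing. A small Python sketch using the sdb row above (field names follow iostat -dx output):

```python
# avgrq-sz is just total sectors per second divided by total requests per
# second; avgqu-sz can be estimated via Little's law (arrival rate times
# time in system).
def avg_request_size(r_s, w_s, rsec_s, wsec_s):
    return (rsec_s + wsec_s) / (r_s + w_s)

def queue_depth(r_s, w_s, await_ms):
    return (r_s + w_s) * await_ms / 1000.0

# sdb row from the slide: r/s=10, w/s=104, rsec/s=320, wsec/s=8472, await=20.15
print(round(avg_request_size(10, 104, 320, 8472), 2))  # 77.12, matches avgrq-sz
print(round(queue_depth(10, 104, 20.15), 2))           # 2.3, matches avgqu-sz 2.29
```

Here the numbers are self-consistent; what is anomalous is 100% utilization at only ~114 requests/second, which is what points at the IO subsystem itself.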
211
Case studies – Case study n. 2 Analysis
Because the system is not "doing anything,"
profiling where CPU time is spent is probably useless.
We already know that it's spent waiting on mutexes in the commit path, so oprofile will probably show nothing.
✦ Other options that come to mind:
profile IO calls with strace -c
benchmark the IO system, since it seems to be suspicious.
212
Case studies – Case study n. 2 Oprofile ★ As expected: nothing useful in oprofile
samples % symbol name
6331 15.3942 buf_calc_page_new_checksum
2008 5.1573 sync_array_print_long_waits
2004 4.8728 MYSQLparse(void*)
1724 4.1920 srv_lock_timeout_and_monitor_thread
1441 3.5039 rec_get_offsets_func
1098 2.6698 my_utf8_uni
780 1.8966 mem_pool_fill_free_list
762 1.8528 my_strnncollsp_utf8
682 1.6583 buf_page_get_gen
650 1.5805 MYSQLlex(void*, void*)
604 1.4687 btr_search_guess_on_hash
566 1.3763 read_view_open_now
strace -c ★ Nothing relevant after 30 seconds or so.
Process 24078 attached - interrupt to quit
Process 24078 detached
% time     seconds  usecs/call     calls    errors syscall
100.00 0.098978 14140 7 select
0.00 0.000000 0 7 accept
213
Case studies – Case study n. 2 Examine history
Look at 'sar' for historical reference.
Ask the client to look at their graphs to see if there are obvious changes around 4PM.
Observations
writes dropped dramatically around 4:40
at the same time iowait increased a lot
corroborated by the client's graphs
points to decreased performance of the IO subsystem
SAN attached by fibre channel, so it could be
this server
the SAN
the connection
the specific device on the SAN.
214
Case studies – Case study n. 2 Elimination of Options:
Benchmark /dev/sdb1 and see if it looks reasonable.
This box or the SAN?
Check the same thing from another server.
Tool: use iozone with the -I flag (O_DIRECT).
The result was 54 writes per second on the first iteration
canceled it after that because it took so long.
Conclusions
Customer said RAID failed after all
Moral of the story: information != facts
Customer's web browser had cached the SAN status page!
215
Case studies
Case study n. 3
216
Case studies – Case study n. 3 Information from the start
Sometimes (once every day or two) the server starts to reject connections with a max_connections error.
This lasts from 10 seconds to a couple of minutes and is sporadic.
Server specs:
16 cores
12GB of RAM, 900MB data
Data on Intel X25-E SSD
Running MySQL 5.1 with InnoDB Plugin
217
Case studies – Case study n. 3 Considerations
Pile-ups cause long queue waits?
thus incoming new connections exceed max_connections?
Pile-ups can be
the query cache
InnoDB mutexes
218
Case studies – Case study n. 3 Elimination
There are no easy possibilities.
We'd previously worked with this client and the DB wasn't the problem then.
Queries aren't perfect, but are still running in less than 10ms normally.
Investigation
Nothing is obviously wrong.
Server looks fine in normal circumstances.
219
Case studies – Case study n. 3 Analysis
We are going to have to capture server activity when the problem happens.
We can't do anything without good diagnostic data.
Decision: install 'collect' (from Aspersa) and wait.
For further info, please refer to Percona Aspersa Official Site:
http://www.percona.com/blog/2011/04/17/aspersa-tools-bit-ly-download-shortcuts/
After several pile-ups nothing very helpful was gathered
But then we got a good one
This took days (about a week)
Result of diagnostics data: too much information!
220
Case studies – Case study n. 3 During the Freeze
Connections increased from normal 5-15 to over 300.
QPS was about 1-10k.
Lots of Com_admin_commands.
Vast majority of "real" queries are Com_select (300-2000 per second)
There are only 5 or so Com_update per second; the other Com_ counters are zero.
No table locking.
Lots of query cache activity, but normal-looking.
no lowmem_prunes.
20 to 100 sorts per second
between 1k and 12k rows sorted per second.
221
Case studies – Case study n. 3 During the Freeze
Between 12 and 90 temp tables created per second
about 3 to 5 of them created on disk.
Most queries doing index scans or range scans – not full table scans or cross joins.
InnoDB operations are just reads, no writes.
InnoDB doesn't write much log or anything.
InnoDB status:
✦ InnoDB main thread was in "flushing buffer pool pages" and there were basically no dirty pages.
✦ Most transactions were waiting in the InnoDB queue.
"12 queries inside InnoDB, 495 queries in queue"
✦ The log flush process was caught up.
✦ The InnoDB buffer pool wasn't even close to being full (much bigger than the data size).
222
Case studies – Case study n. 3 There were mostly 2 types of queries in SHOW PROCESSLIST, most of them in the
following states:
$ grep State: status-file | sort | uniq -c | sort -nr
161 State: Copying to tmp table
156 State: Sorting result
136 State: statistics
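The same state tally can be done without shell tools. A small Python equivalent of the grep | sort | uniq -c | sort -nr pipeline above, run against a saved SHOW PROCESSLIST dump (the sample dump here is abbreviated):

```python
from collections import Counter

# Count "State:" lines from a saved SHOW PROCESSLIST dump, most common first.
def state_counts(status_file_text):
    states = [line.strip() for line in status_file_text.splitlines()
              if line.strip().startswith("State:")]
    return Counter(states).most_common()

dump = ("State: Copying to tmp table\n"
        "State: Sorting result\n"
        "State: Copying to tmp table\n")
print(state_counts(dump))
# [('State: Copying to tmp table', 2), ('State: Sorting result', 1)]
```

Either way, the point is the same: a skewed state distribution (here, tmp-table and sorting states dominating) narrows the search to a few query shapes.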
223
Case studies – Case study n. 3 iostat
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda3 0.04 493.63 0.65 15.49 142.18 4073.09 261.18 0.17 10.68 1.02 1.65
sda3 0.00 8833.00 1.00 500.00 8.00 86216.00 172.10 5.05 11.95 0.59 29.40
sda3 0.00 33557.00 0.00 451.00 0.00 206248.00 457.31 123.25 238.00 1.90 85.90
sda3 0.00 33911.00 0.00 565.00 0.00 269792.00 477.51 143.80 245.43 1.77 100.00
sda3 0.00 38258.00 0.00 649.00 0.00 309248.00 476.50 143.01 231.30 1.54 100.10
sda3 0.00 34237.00 0.00 589.00 0.00 281784.00 478.41 142.58 232.15 1.70 100.00
vmstat
r b swpd free buff cache si so bi bo in cs us sy id wa st
50 2 86064 1186648 3087764 4475244 0 0 5 138 0 0 1 1 98 0 0
13 0 86064 1922060 3088700 4099104 0 0 4 37240 312832 50367 25 39 34 2 0
2 5 86064 2676932 3088812 3190344 0 0 0 136604 116527 30905 9 12 71 9 0
1 4 86064 2782040 3088812 3087336 0 0 0 153564 34739 10988 2 3 86 9 0
0 4 86064 2871880 3088812 2999636 0 0 0 163176 22950 6083 2 2 89 8 0
Oprofile
samples % image name app name symbol name
473653 63.5323 no-vmlinux no-vmlinux /no-vmlinux
95164 12.7646 mysqld mysqld /usr/libexec/mysqld
53107 7.1234 libc-2.10.1.so libc-2.10.1.so memcpy
224
Case studies – Case study n. 3 Analysis:
There is a lot of data here
most of it points to nothing in particular except "need more research."
For example, in oprofile, what does build_template() do in InnoDB?
Why is memcpy() such a big consumer of time?
What is hidden within the 'mysqld' image/symbol?
We could spend a lot of time on these things.
In looking for things that just don't make sense, the iostat data is very strange.
We can see hundreds of MB per second written to disk for sustained periods
but there isn't even that much data in the whole database.
So clearly this can't simply be InnoDB's "furious flushing" problem
Virtually no reading from disk is happening in this period of time.
Raw disk stats show that all the time is consumed in writes.
There is an enormous queue on the disk.
225
Case studies – Case study n. 3 Analysis:
There was no swap activity, and 'ps' confirmed that nothing else significant was happening.
'df -h' and 'lsof' showed that:
mysqld's temp files became large
disk free space changed noticeably while this pattern happened.
So mysqld was writing GB to disk in short bursts
Although this is not fully instrumented inside of MySQL, we know that
MySQL only writes data, logs, sort, and temp tables to disk.
Thus, we can eliminate data and logs.
Discussion with developers revealed that some kinds of caches could expire and cause a stampede on the database.
226
Case studies – Case study n. 3 Conclusion
Based on reasoning and knowledge of internals: it is likely that poorly optimized queries are causing a storm of very large temp tables on disk.
Plan of Attack
Optimize the 2 major kinds of queries found in SHOW PROCESSLIST so they don't use temp tables on disk.
These queries are fine in isolation, but when there is a rush on the database, they can pile up.
The problem was resolved after eliminating the on-disk temporary tables.
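The "temp tables on disk" mechanism behind this case can be sketched: MySQL spills an in-memory temporary table to disk once it outgrows min(tmp_table_size, max_heap_table_size), and goes to disk immediately when the result contains BLOB/TEXT columns. A Python sketch of that decision with illustrative sizes:

```python
# Decide whether an implicit temporary table would end up on disk.
# estimated_bytes and the two limits are illustrative values, not defaults.
def goes_to_disk(estimated_bytes, tmp_table_size, max_heap_table_size,
                 has_blob_or_text=False):
    if has_blob_or_text:
        return True  # BLOB/TEXT results always go to disk
    return estimated_bytes > min(tmp_table_size, max_heap_table_size)

MB = 2**20
print(goes_to_disk(64 * MB, 32 * MB, 128 * MB))   # True: exceeds tmp_table_size
print(goes_to_disk(16 * MB, 32 * MB, 128 * MB))   # False: fits in memory
print(goes_to_disk(1 * MB, 32 * MB, 128 * MB, has_blob_or_text=True))  # True
```

This is why the fix worked: rewriting the two query shapes so their intermediate results stayed under the in-memory limit (or avoided temp tables entirely) removed the bursts of multi-GB disk writes.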