MySQL Performance Tuning course
Uploaded by alberto-centanni
Transcript of the MySQL Performance Tuning course
MySQL Performance Tuning

Course topics

Introduction
  MySQL Overview
  MySQL Products and Tools
  MySQL Services and Support
  MySQL Web Pages
  MySQL Courses
  MySQL Certification
  MySQL Documentation
Performance Tuning Basics
  Thinking About Performance
  Areas to Tune
  Performance Tuning Terminology
  Benchmark Planning
  Benchmark Errors
  Tuning Steps
  General Tuning Session
  Deploying MySQL and Benchmarking
Performance Tuning Tools
  MySQL Monitoring Tools
  Open Source Community Monitoring Tools
  Benchmark Tools
  Stress Tools
MySQL Server Tuning
  Major Components of the MySQL Server
  MySQL Thread Handling
  MySQL Memory Usage
  Simultaneous Connections in MySQL
  Reusing Threads
  Effects of Thread Caching
  Reusing Tables
  Setting table_open_cache
MySQL Query Cache
  MySQL Query Cache
  When to Use the MySQL Query Cache
  When NOT to Use the MySQL Query Cache
  MySQL Query Cache Settings
  MySQL Query Cache Status Variables
  Improve Query Cache Results
InnoDB
  InnoDB Storage Engine
  InnoDB Storage Engine Uses
  Using the InnoDB Storage Engine
  InnoDB Log Files and Buffers
  Committing Transactions
  InnoDB Table Design
  SHOW ENGINE INNODB STATUS
  InnoDB Monitors and Settings
MyISAM
  MyISAM Storage Engine Uses
  MyISAM Table Design
  Optimizing MyISAM
  MyISAM Table Locks
  MyISAM Settings
  MyISAM Key Cache
  MyISAM Full-Text Search
Other MySQL Storage Engines and Issues
  Large Objects
  MEMORY Storage Engine Uses
  MEMORY Storage Engine Performance
  Multiple Storage Engine Advantages
  Single Storage Engine Advantages
Schema Design and Performance
  Schema Design Considerations
  Normalization and Performance
  Schema Design
  Data Types
  Indexes
  Partitioning
MySQL Query Performance
  General SQL Tuning Best Practices
  EXPLAIN
  MySQL Optimizer
  Finding Problematic Queries
  Improve Query Executions
  Locate and Correct Problematic Queries
Performance Tuning Extras
  Configuring Hardware
  Considering Operating Systems
  Operating Systems Configurations
  Logging
  Backup and Recovery
Introduction: MySQL Overview
MySQL is a database management system.
A database is a structured collection of data.
MySQL databases are relational.
A relational database stores data in separate tables rather than putting all the data in one big storeroom.
MySQL software is Open Source.
Open Source means that it is possible for anyone to use and modify the software.
MySQL Server works in client/server or embedded systems.
The MySQL Database Software is a client/server system that consists of a multi-threaded SQL server that supports different backends, several different client programs and libraries, administrative tools, and a wide range of application programming interfaces (APIs).
Introduction: MySQL Products and Tools
MySQL Database Server
It is a fully integrated, transaction-safe, ACID-compliant database with full commit, rollback, crash recovery, and row-level locking capabilities.
MySQL Connectors
MySQL provides standards-based drivers for JDBC, ODBC, and .Net enabling developers to build database applications
MySQL Replication
MySQL Replication enables users to cost-effectively deliver application performance, scalability and high availability.
MySQL Fabric
MySQL Fabric is an extensible framework for managing farms of MySQL Servers.
MySQL Partitioning
MySQL Partitioning enables developers and DBAs to improve database performance and simplify the management of very large databases.
MySQL Utilities
MySQL Utilities is a set of command-line tools that are used to work with MySQL servers.
MySQL Workbench
MySQL Workbench provides data modeling, SQL development, and comprehensive administration tools for server configuration, user administration, backup, and much more.
Introduction: MySQL Services and Support
MySQL Technical Support Services provide direct access to our expert MySQL Support engineers who are ready to assist you in the development, deployment, and management of MySQL applications.
Even though you might have highly skilled technical staff that can solve your issues, MySQL Support Engineers can typically solve those same issues a lot faster. A vast majority of the problems the MySQL Support Engineers encounter, they have seen before. So an issue that could take several weeks for your staff to research and resolve, may be solved in a matter of hours by the MySQL Support team.
Introduction: MySQL Web Pages
Home page: http://www.mysql.com/
Downloads: http://www.mysql.com/downloads/
Documentation: http://dev.mysql.com/doc/
Developer Zone: http://dev.mysql.com/
Introduction: MySQL Courses
MySQL Database Administrator
MySQL for Beginners
MySQL for Database Administrators
MySQL Performance Tuning
MySQL High Availability
MySQL Cluster
MySQL Developer
MySQL for Beginners
MySQL and PHP - Developing Dynamic Web Applications
MySQL for Developers
MySQL Developer Techniques
MySQL Advanced Stored Procedures
Introduction: MySQL Certification
Competitive Advantage
The rigorous process of becoming Oracle certified makes you a better technologist. The knowledge gained through training and practice will significantly expand your skill set and increase your credibility when interviewing for jobs.
Salary Advancement
Companies value skilled workers. According to Oracle's 2012 salary survey, more than 80% of Oracle Certified individuals reported a promotion, compensation increase or other career improvements as a result of becoming certified.
Opportunity and Credibility
The skills and knowledge gained by becoming certified will lead to greater confidence and increased career security. Expanded skill set will also help unlock opportunities with employers and potential employers.
Introduction: MySQL Documentation
The main source of official MySQL documentation is found at
http://dev.mysql.com/doc/ or http://docs.oracle.com/cd/E17952_01/
MySQL is a well-documented database system, so it is usually easy to find whatever you need.
Performance Tuning Basics: Thinking About Performance
Performance is measured by the time required to complete a task. In other words, performance is response time.
A database server’s performance is measured by query response time, and the unit of measurement is time per query.
So if the goal is to reduce response time, we need to understand why the server requires a certain amount of time to respond to a query, and reduce or eliminate whatever unnecessary work it’s doing to achieve the result.
In other words, we need to measure where the time goes. This leads to our second important principle of optimization: you cannot reliably optimize what you cannot measure.
Your first job is therefore to measure where time is spent.
Performance Tuning Basics: Areas to Tune
Performance usually hinges on a few areas:
Hardware
MySQL Configuration
Schema and Queries
Application Architecture
Performance Tuning Basics: Areas to Tune -> Hardware
CPU
MySQL works well on 64-bit architectures, which are now the default. Make sure you use a 64-bit operating system on 64-bit hardware.
The number of CPUs MySQL can use effectively and how it scales under increasing load depend on both the workload and the system architecture.
The CPU architecture (RISC, CISC, depth of pipeline, etc.), CPU model, and operating system all affect MySQL’s scaling pattern.
In practice, CPUs with up to 24 cores are a good choice.
RAM
The biggest reason to have a lot of memory isn’t so you can hold a lot of data in memory: it’s ultimately so you can avoid disk I/O, which is orders of magnitude slower than accessing data in memory. The trick is to balance the memory and disk size, speed, cost, and other qualities so you get good performance for your workload.
To ensure reliable operation and a good performance standard, a MySQL environment can usefully be sized up to hundreds of GB of memory.
I/O
The main bottleneck in a database environment is usually located at the mechanical layer: disk drives and storage. Transaction logs and temporary spaces are heavy consumers of I/O and affect performance for all users of the database. Disks spend time waiting on spindle rotation and seeks for read and write operations, and on swapping between RAM and dedicated partitions.
Storage engines often keep their data and/or indexes in single large files, which means RAID (Redundant Array of Inexpensive Disks) is usually the most feasible option for storing a lot of data. RAID can help with redundancy, storage size, caching, and speed.
Network
Modern NICs (network interface cards) are capable of high speeds, high bandwidth, and low latency.
For best performance and robustness, dedicated servers can rely on the bonding and teaming features of the OS.
1Gb Ethernet is good enough to ensure optimal throughput even in clustered configurations.
Measure, that is, find the bottleneck or limiting resource:
CPU
RAM
I/O
Network bandwidth
Measure I/O: vmstat and iostat (from sysstat package)
Measure RAM: ps, free, top
Measure CPU: top, vmstat, dstat
Measure network bandwidth: dstat, ifconfig
Performance Tuning Basics: Areas to Tune -> MySQL Configuration
MySQL allows a DBA or developer to modify parameters including the maximum number of client connections, the size of the query cache, the execution style of different logs, index memory cache size, the network protocol used for client-server communications, and dozens of others. This is done by editing the “my.cnf” configuration file, as in this example:
[mysqld]
performance_schema
performance_schema_events_waits_history_size=20
performance_schema_events_waits_history_long_size=15000
slow_query_log = 1
slow_query_log_file = slow_query.log
long_query_time = 1
log_queries_not_using_indexes = 1
Performance Tuning Basics: Areas to Tune -> Schema and Queries
Queries here means the usual sequence of SELECT, INSERT, UPDATE, and DELETE statements.
A database is designed to handle queries quickly, efficiently and reliably.
"Quickly" means getting a good response time in any circumstance
"Efficiently" means a wise use of resources such as CPU, memory, I/O, and disk space. Practically speaking, this translates into higher revenue and less human effort.
"Reliably" means High Availability. High availability and performance come together to ensure continuity and fast responses.
Performance Tuning Basics: Areas to Tune -> Application Architecture
Not all application performance problems come from MySQL, and not all of those that do can be resolved at the MySQL level.
Rethinking how application logic translates into queries is one of the architectural questions, and often a great optimization.
To make an application work better, it is fundamental to tune the statements, tune the code, and tune the logic behind it.
Performance Tuning Basics: Performance Tuning Terminology
Bottleneck: The part of a system which is at capacity. Other parts of the system will be idle, waiting for it to perform its task.
Capacity: The total workload a system can handle without violating predetermined key performance acceptance criteria.
Investigation: An activity based on collecting information related to the speed, scalability, and/or stability characteristics of the product under test that may have value in determining or improving product quality. Investigation is frequently employed to prove or disprove hypotheses regarding the root cause of one or more observed performance issues.
Latency: Delay experienced in network transmissions as network packets traverse the network infrastructure.
Metrics: Measurements obtained by running performance tests, expressed on a commonly understood scale. Some metrics commonly obtained through performance tests include processor utilization over time and memory usage by load.
Performance: Information regarding your application’s response times, throughput, and resource utilization levels.
Resource utilization: The cost of the project in terms of system resources. The primary resources are processor, memory, disk I/O, and network I/O.
Response time: A measure of how responsive an application or subsystem is to a client request.
Scalability: An application’s ability to handle additional workload, without adversely affecting performance, by adding resources such as processor, memory, and storage capacity.
Stress test: A type of performance test designed to evaluate an application’s behaviour when it is pushed beyond normal or peak load conditions. The goal of stress testing is to reveal application bugs that surface only under high load conditions, such as synchronization issues, race conditions, and memory leaks. Stress testing enables you to identify your application’s weak points, and shows how the application behaves under extreme load conditions.
Throughput: Typically expressed in transactions per second (TPS); expresses how many operations or transactions can be processed in a set amount of time.
Utilization: In the context of performance testing, the percentage of time that a resource is busy servicing user requests. The remaining percentage of time is considered idle time.
Workload: The stimulus applied to a system, application, or component to simulate a usage pattern, in regard to concurrency and/or data inputs. The workload includes the total number of users, concurrent active users, data volumes, and transaction volumes, along with the transaction mix.
Performance Tuning Basics: Planning a Benchmark
Designing and Planning a Benchmark
The first step in planning a benchmark is to identify the problem and the goal. Next, decide whether to use a standard benchmark or design your own.
Next, you need queries to run against the data. You can make a unit test suite into a rudimentary benchmark just by running it many times, but that’s unlikely to match how you really use the database.
How Long Should the Benchmark Last?
It’s important to run the benchmark for a meaningful amount of time.
Most systems have some buffers that create burstable capacity — the ability to absorb spikes, defer some work, and catch up later after the peak is over.
Capturing System Performance and Status
It is important to capture as much information about the system under test (SUT) as possible while the benchmark runs.
It’s a good idea to make a benchmark directory with subdirectories for each run’s results. You can then place the results, configuration files, measurements, scripts, and notes for each run in the appropriate subdirectory.
Getting Accurate Results
The best way to get accurate results is to design your benchmark to answer the question you want to answer.
Are you capturing the data you need to answer the question? Are you benchmarking by the wrong criteria? For example, are you running a CPU-bound benchmark to predict the performance of an application you know will be I/O-bound?
Performance Tuning Basics: Benchmark Errors
The BENCHMARK() function can be used to compare the speed of MySQL functions or operators. For example:
mysql> SELECT BENCHMARK(100000000, CONCAT('a','b'));
However, this cannot be used to compare queries:
mysql> SELECT BENCHMARK(100, SELECT `id` FROM `lines`);
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'SELECT `id` FROM `lines`)' at line 1
MySQL needs a fraction of a second just to parse a query, and the system is probably busy doing other things too. Benchmarks with runtimes of less than 5-10 seconds can therefore be considered meaningless, and runtime differences of that order of magnitude pure chance.
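To time a real query instead, run it repeatedly against a warmed-up server, or use the session profiler available in MySQL 5.5 (deprecated in later releases). A minimal sketch, reusing the `lines` table from the error example above:

```sql
SET profiling = 1;
SELECT `id` FROM `lines`;
SHOW PROFILES;              -- each statement with its measured duration
SHOW PROFILE FOR QUERY 1;   -- per-stage breakdown of the first statement
```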
As a general rule, when you run multiple instances of any benchmarking tool and increase the number of concurrent connections, you might encounter a "Too many connections" error. You then need to adjust MySQL's max_connections variable, which controls the maximum number of concurrent connections allowed by the server.
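A minimal sketch of diagnosing and raising the limit; the value 500 is only an example, and the change should also be persisted in my.cnf so it survives a restart:

```sql
SHOW VARIABLES LIKE 'max_connections';           -- current limit
SHOW GLOBAL STATUS LIKE 'Max_used_connections';  -- peak reached since startup
SET GLOBAL max_connections = 500;                -- runtime change, example value
```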
Performance Tuning Basics: Tuning Steps
Step 1 - Storage Engines (MyISAM, InnoDB)
Step 2 - Connections
Step 3 - Sessions
Step 4 - Query Cache
Step 5 - Queries
Step 6 - Schema
Performance Tuning Basics: Tuning Steps – Step 1 – Storage Engines
MySQL supports multiple storage engines:
MyISAM - Original Storage Engine, great for web apps
InnoDB - Robust transactional storage engine
Memory Engine - Stores all data in Memory
InfoBright - Large scale data warehouse with 10x or more compression
Kickfire - Appliance-based; world's fastest 100GB TPC-H result
To see what tables are in what engines
mysql> SHOW TABLE STATUS ;
Selecting the storage engine to use is a tuning decision
mysql> alter table tab engine=myisam ;
Performance Tuning Basics: Tuning Steps – Step 1 – MyISAM
The primary tuning factors in MyISAM are its two caches:
key_buffer_cache should be 25% of available memory
system cache - leave 75% of available memory free
Available memory is:
All on a dedicated server, if the server has 8GB, use 2GB for the key_buffer_cache and leave the rest free for the system cache to use.
Percent of the part of the server allocated for MySQL, i.e. if you have a server with 8GB, but are using 4GB for other applications then use 1GB for the key_buffer_cache and leave the remaining 3GB free for the system cache to use.
Maximum size for a single key buffer cache is 4GB
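Applying the 25% rule to the hypothetical dedicated 8GB server above:

```sql
-- 2GB key cache; the remaining memory stays free for the OS cache
SET GLOBAL key_buffer_size = 2 * 1024 * 1024 * 1024;
SHOW VARIABLES LIKE 'key_buffer_size';
```

In my.cnf the equivalent is key_buffer_size = 2G under [mysqld].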
mysql> show status like 'Key%' ;
Key_blocks_not_flushed - Dirty key blocks not flushed to disk
Key_blocks_unused - unused blocks in the cache
Key_blocks_used - used Blocks in the cache
% of cache free : Key_blocks_unused /( Key_blocks_unused + Key_blocks_used )
Key_read_requests - key read requests to the cache
Key_reads - times a key read request went to disk
Cache read hit % : 1 - ( Key_reads / Key_read_requests )
Key_write_requests - key write requests to the cache
Key_writes - times a key write request went to disk
Cache write hit % : 1 - ( Key_writes / Key_write_requests )
To see the system cache in Linux:
$ cat /proc/meminfo
Performance Tuning Basics: Tuning Steps – Step 1 – InnoDB
Unlike MyISAM, InnoDB uses a single cache for both index and data
innodb_buffer_pool_size - should be 70-80% of available memory.
It is not uncommon for this to be very large on dedicated servers, i.e. tens of GB on a system with 40GB of memory.
Make sure it's not set so large as to cause swapping!
mysql> show status like 'Innodb_buffer%' ;
InnoDB can use direct I/O on systems that support it: Linux, FreeBSD, and Solaris.
innodb_flush_method = O_DIRECT
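Put together, a my.cnf sketch for a hypothetical dedicated 8GB Linux server; the sizes are assumptions to adapt to your own memory budget:

```
[mysqld]
# roughly 75% of 8GB on a dedicated server; too large causes swapping
innodb_buffer_pool_size = 6G
# direct I/O, bypassing the OS cache (Linux, FreeBSD, Solaris)
innodb_flush_method = O_DIRECT
```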
Performance Tuning Basics: Tuning Steps – Step 2 – Connections
MySQL caches the threads used by a connection
mysql> show status like 'Thread%';
thread_cache_size - Number of threads to cache
Setting this to 100 or higher is not unusual
Monitor Threads_created to see if this is an issue
Counts connections not using the thread cache
Should be less than 1-2 a minute
Usually only an issue if more than 1-2 a second
Only an issue if you create and drop a lot of connections, i.e. PHP
Overhead is usually about 250k per thread
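A sketch of monitoring and enabling the thread cache; 100 is an example value, per the guideline above:

```sql
SHOW GLOBAL STATUS LIKE 'Threads%';  -- watch Threads_created over time
SET GLOBAL thread_cache_size = 100;  -- persist in my.cnf as well
```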
Performance Tuning Basics: Tuning Steps – Step 3 – Sessions
Some session variables control space allocated by each session (connection)
Setting these too small can give bad performance
Setting these too large can cause the server to swap!
Can be set per connection:
SET sort_buffer_size = 1024*1024*128;
Set small by default; increase in connections that need it
sort_buffer_size
Used for ORDER BY, GROUP BY, SELECT DISTINCT, UNION DISTINCT
Monitor Sort_merge_passes < 1-2 an hour optimal
Usually a problem in a reporting or data warehouse database
Other important session variables
read_rnd_buffer_size - Set to 1/2 sort_buffer_size
join_buffer_size - used for joins without indexes (bad!); watch Select_full_join
read_buffer_size - Used for full table scans, watch Select_scan
tmp_table_size - Max temp table size in memory, watch Created_tmp_disk_tables
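Following the advice to keep defaults small and raise buffers only where needed, a session running one heavy report might do this (table and column names are hypothetical):

```sql
SET SESSION sort_buffer_size = 1024 * 1024 * 128;  -- 128MB for this session only
SELECT customer_id, SUM(total) AS revenue
FROM orders GROUP BY customer_id ORDER BY revenue DESC;
SET SESSION sort_buffer_size = DEFAULT;            -- back to the small default
```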
Performance Tuning Basics: Tuning Steps – Step 4 – Query Cache
MySQL Query Cache caches both the query and the full result set
query_cache_type - Controls behavior
0 or OFF - Not used (buffer may still be allocated)
1 or ON cache all unless SELECT SQL_NO_CACHE (DEFAULT)
2 or DEMAND cache none unless SELECT SQL_CACHE
query_cache_size - Determines the size of the cache
mysql> show status like 'Qc%' ;
Gives great performance if:
Identical queries returning identical data are used often
No or rare inserts, updates or deletes
Best Practice
Set to DEMAND
Add SQL_CACHE to appropriate queries
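With query_cache_type = 2 (DEMAND) in my.cnf, only queries that opt in are cached; the table and column names below are hypothetical:

```sql
-- cached: identical, frequently repeated lookup
SELECT SQL_CACHE id, name FROM products WHERE active = 1;
-- not cached: volatile query that would just churn the cache
SELECT id, name FROM products WHERE updated_at > NOW() - INTERVAL 1 HOUR;
```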
Performance Tuning Basics: Tuning Steps – Step 5 – Queries
Often the #1 issue in overall performance
Always have the slow query log on
http://dev.mysql.com/doc/refman/5.5/en/slow-query-log.html
Analyze using mysqldumpslow
Use: log_queries_not_using_indexes
Check it regularly
Use mysqldumpslow
Best practice is to automate running mysqldumpslow every morning and email results to DBA, DBDev, etc.
Understand and use EXPLAIN
Select_scan - Number of full table scans
Select_full_join - Joins without indexes
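A sketch of that daily routine; the log path is an assumption and should match your slow_query_log_file setting:

```shell
# top 10 slow queries by total time; similar statements are grouped together
mysqldumpslow -s t -t 10 /var/lib/mysql/slow_query.log
```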
The IN clause in MySQL is very fast!
Select ... Where idx IN (1,23,345,456) - Much faster than a join
Don’t wrap your indexes in expressions in Where
Select ... Where func(idx) = 20 [index ignored]
Select .. Where idx = otherfunc(20) [may use index]
Best practice : Keep index alone on left side of condition
Avoid % at the start of LIKE on an index
Select ... Where idx LIKE 'ABC%' can use the index
Select ... Where idx LIKE '%XYZ' must do a full table scan
Use union all when appropriate, default is union distinct!
Understand left/right joins and use only when needed.
Performance Tuning Basics: Tuning Steps – Step 6 – Schema
Too many indexes slow down inserts/deletes
Use only the indexes you must have
Check often
mysql> show create table tabname ;
Don’t duplicate leading parts of compound keys
index key123 (col1,col2,col3)
index key12 (col1,col2) <- Not needed!
index key1 (col1) <-- Not needed!
Use prefix indexes on large keys
Best indexes are 16 bytes/chars or less
Indexes bigger than 32 bytes/chars should be looked at very closely
and should have their own key cache if in MyISAM
For large strings that need to be indexed, i.e. URLs, consider using a separate column that stores an MD5 hash of the value as the key.
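A sketch of the hash-key approach with a hypothetical urls table:

```sql
CREATE TABLE urls (
  id      INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  url     TEXT NOT NULL,
  url_md5 CHAR(32) NOT NULL,     -- MD5 of url: short, fixed-width, indexable
  KEY idx_url_md5 (url_md5)
);
INSERT INTO urls (url, url_md5)
VALUES ('http://example.com/page', MD5('http://example.com/page'));
-- lookups go through the short hash index, not the long TEXT column
SELECT id, url FROM urls WHERE url_md5 = MD5('http://example.com/page');
```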
Size = performance, smaller is better
Size is important. Do not automatically use 255 for VARCHAR
Temp tables, most caches, expand to full size
Use “procedure analyse” to determine the optimal types given the values in your table
mysql> select * from tab procedure analyse (64,2000)\G
Consider the types:
enum: http://dev.mysql.com/doc/refman/5.5/en/enum.html
set: http://dev.mysql.com/doc/refman/5.5/en/set.html
Compress large strings
Use the MySQL COMPRESS and UNCOMPRESS functions
Very important in InnoDB!
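A sketch with a hypothetical docs table; the column must be a binary type (BLOB/VARBINARY), since COMPRESS() returns binary data:

```sql
CREATE TABLE docs (id INT PRIMARY KEY, body BLOB);
INSERT INTO docs VALUES (1, COMPRESS(REPEAT('some long text ', 1000)));
SELECT UNCOMPRESS(body) FROM docs WHERE id = 1;
```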
Performance Tuning Basics: General Tuning Session
Never make a change in production first
Have a good benchmark or reliable load
Start with a good baseline
Only change one thing at a time:
identify a set of possible changes
try each change separately
try in combinations of 2, then 3, etc.
Monitor the results
Query performance - query analyzer, slow query log, etc.
throughput
single query time
average query time
CPU - top, vmstat
IO - iostat, top, vmstat, bonnie++
Network bandwidth
Document and save the results
Performance Tuning Basics: Deploying MySQL and Benchmarking
Benchmarking can be a very revealing process. It can be used to isolate performance problems, and drill down to specific bottlenecks. More importantly, it can be used to compare different servers in your environment, so you have an expectation of performance from those servers, before you put them to work servicing your application.
MySQL can be deployed on a spectrum of different servers. Some may be servers we physically set up in a data centre, while others are managed hosting servers, and still others are cloud hosted.
Benchmarking can help give us a picture of what we're dealing with.
Why Benchmarking?
We want to know what our server can handle. We want to get an idea of the IO performance, CPU, and overall database throughput. Simple queries run on the server can give us a sense of queries per second, or transactions per second if we want to get more complicated.
Benchmarking Disk IO
On Linux systems, there is a very good tool for benchmarking disk IO: sysbench. Let's run through a simple example of installing sysbench and putting our server through its paces.
Installation
$ apt-get -y install sysbench
Test run
$ sysbench --test=fileio prepare
$ sysbench --test=fileio --file-test-mode=rndrw run
$ sysbench --test=fileio cleanup
Benchmarking CPU
Sysbench can also be used to test the CPU performance. It is simpler, as it doesn't need to set up files and so forth.
Test run
$ sysbench --test=cpu run
Benchmarking Database Throughput
With MySQL 5.1 distributions there is a tool included that can do very exhaustive database benchmarking. It's called mysqlslap.
$ mysqlslap -uroot -proot -h localhost --create-schema=sakila -i 5 -c 10 -q "select * from actor order by rand() limit 10"
Performance Tuning Tools
MySQL Monitoring Tools
Open Source Community Monitoring Tools
Benchmark Tools
Stress Tools
Performance Tuning Tools: MySQL Monitoring Tools
MySQL Enterprise Monitor
http://www.mysql.com/products/enterprise/monitor.html
MySQL Workbench
http://www.mysql.com/products/workbench/
Percona Toolkit for MySQL
http://www.percona.com/software/percona-toolkit
Performance Tuning Tools: Open Source Community Monitoring Tools
mysqladmin
mysqlreport
innotop http://sourceforge.net/projects/innotop/
OProfile http://oprofile.sourceforge.net/about/
sysbench http://sysbench.sf.net/
Percona Monitoring Plugins http://www.percona.com/software/percona-monitoring-plugins
mytop
Performance Tuning Tools: Benchmark Tools
MySQL Super Smack http://jeremy.zawodny.com/mysql/super-smack/
Database Test Suite http://sourceforge.net/projects/osdldbt/
Percona’s TPCC-MySQL Tool https://launchpad.net/perconatools
MySQL’s BENCHMARK() Function. MySQL has a handy BENCHMARK() function that you can use to test execution speeds for certain types of operations. You use it by specifying a number of times to execute and an expression to execute.
sysbench https://launchpad.net/sysbench
sysbench is a multithreaded system benchmarking tool. Its goal is to get a sense of system performance in terms of the factors important for running a database server.
Performance Tuning Tools: Stress Tools
Mysqltuner http://mysqltuner.pl/
Neotys
http://www.neotys.com/product/monitoring-mysql-web-load-testing.html
IOZone http://www.iozone.org/
Open Source Database Benchmark http://osdb.sourceforge.net/
Mysqlslap http://dev.mysql.com/doc/refman/5.5/en/mysqlslap.html
MySQL Server Tuning
Most of the tuning work should start from the core: the MySQL server itself. Here, "server" means the mysqld service running on a physical machine, returning visible results in response to queries and stored procedures, and making data available for any use, such as populating dynamic web pages.
MySQL is very different from other database servers, and its architectural characteristics make it useful for a wide range of purposes.
At the same time, MySQL can power embedded applications, data warehouses, content indexing and delivery software, highly available redundant systems, online transaction processing (OLTP), and much more.
MySQL Server Tuning: Major Components of the MySQL Server
A picture of how MySQL's components work together will help you understand the server. The original slide shows a logical view of MySQL's architecture as three layers. The topmost layer contains the services that aren't unique to MySQL. They're services most network-based client/server tools or servers need: connection handling, authentication, security, and so forth.
The second layer is where things get interesting. Much of MySQL’s brains are here, including the code for query parsing, analysis, optimization, caching, and all the built-in functions (e.g., dates, times, math, and encryption). Any functionality provided across storage engines lives at this level: stored procedures, triggers, and views.
The third layer contains the storage engines. They are responsible for storing and retrieving all data stored “in” MySQL. Like the various filesystems available for GNU/Linux, each storage engine has its own benefits and drawbacks. The server communicates with them through the storage engine API. This interface hides differences between storage engines and makes them largely transparent at the query layer. The API contains a couple of dozen low-level functions that perform operations such as “begin a transaction” or “fetch the row that has this primary key.” The storage engines don’t parse SQL or communicate with each other; they simply respond to requests from the server.
MySQL Server Tuning: MySQL Thread Handling
Each client connection gets its own thread within the server process. The connection's queries execute within that single thread, which in turn resides on one core or CPU. The server caches threads, so they don't need to be created and destroyed for each new connection. When clients (applications) connect to the MySQL server, the server needs to authenticate them. Authentication is based on username, originating host, and password. By default, connection manager threads associate each client connection with a thread dedicated to it that handles authentication and request processing for that connection. Manager threads create a new thread when necessary, but try to avoid doing so by first consulting the thread cache to see whether it contains a thread that can be used for the connection. When a connection ends, its thread is returned to the thread cache if the cache is not full.
MySQL Server Tuning: MySQL Memory Usage
The following list indicates some of the ways that the mysqld server uses memory.
All threads share the MyISAM key buffer; its size is determined by the key_buffer_size variable.
Each thread that is used to manage client connections uses some thread-specific space. The following list indicates these and which variables control their size:
stack (variable thread_stack)
connection buffer (variable net_buffer_length)
result buffer (variable net_buffer_length)
All threads share the same base memory
Each request that performs a sequential scan of a table allocates a read buffer (variable read_buffer_size).
All joins are executed in a single pass, and most joins can be done without even using a temporary table.
When a thread is no longer needed, the memory allocated to it is released and returned to the system unless the thread goes back into the thread cache.
Almost all parsing and calculating is done in thread-local and reusable memory pools. No memory overhead is needed for small items, so the normal slow memory allocation and freeing is avoided. Memory is allocated only for unexpectedly large strings.
A FLUSH TABLES statement or mysqladmin flush-tables command closes all tables that are not in use at once and marks all in-use tables to be closed when the currently executing thread finishes. This effectively frees most in-use memory. FLUSH TABLES does not return until all tables have been closed.
The server caches information in memory as a result of GRANT, CREATE USER, CREATE SERVER, and INSTALL PLUGIN statements. This memory is not released by the corresponding REVOKE, DROP USER, DROP SERVER, and UNINSTALL PLUGIN statements, so for a server that executes many instances of the statements that cause caching, there will be an increase in memory use. This cached memory can be freed with FLUSH PRIVILEGES.
68. MySQL Server Tuning: Simultaneous Connections in MySQL
One means of limiting use of MySQL server resources is to set the global max_user_connections system variable to a nonzero value.
This limits the number of simultaneous connections that can be made by any given account, but places no limits on what a client can do once connected.
In addition, setting max_user_connections does not enable management of individual accounts.
You can set max_connections at server startup or at runtime to control the maximum number of clients that can connect simultaneously.
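As a sketch, both limits can be set at runtime, and a per-account limit can also be attached to a specific user; the 'app'@'localhost' account here is hypothetical:

```sql
-- Cap total simultaneous clients and per-account connections:
SET GLOBAL max_connections = 500;
SET GLOBAL max_user_connections = 50;

-- Limit one account individually; this overrides the global value:
GRANT USAGE ON *.* TO 'app'@'localhost'
    WITH MAX_USER_CONNECTIONS 25;
```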
69. MySQL Server Tuning: Reusing Threads
MySQL is a single process with multiple threads. Not all databases are architected this way; some have multiple processes that communicate through shared memory or other means.
Creating a connection in MySQL is generally so fast that there is less need for connection pools than with other databases.
However, many development environments and programming languages expect a connection pool.
Many others use persistent connections by default, so a connection is not really closed when the application closes it.
There can be more than one solution to this problem, but the one that is actually (partially) implemented is a pool of threads.
The thread pool plugin is a commercial feature. It is not included in MySQL community distributions.
This tool provides an alternative thread-handling model designed to reduce overhead and improve performance. It implements a thread pool that increases server performance by efficiently managing statement execution threads for large numbers of client connections.
To control and monitor how the server manages threads that handle client connections, several system and status variables are relevant.
70. MySQL Server Tuning: Effects of Thread Caching
MySQL uses a separate thread for each client connection. In environments where applications do not attach to a database instance persistently, but rather create and close many connections every second, spawning new threads at a high rate can start consuming significant CPU resources. To alleviate this, MySQL implements a thread cache, which allows it to save threads from connections that are being closed and reuse them for new connections. The thread_cache_size parameter defines how many unused threads can be kept alive at any time.
The default value is 0 (no caching), which causes a thread to be set up for each new connection and disposed of when the connection terminates. Set thread_cache_size to N to enable N inactive connection threads to be cached. thread_cache_size can be set at server startup or changed while the server runs. A connection thread becomes inactive when the client connection with which it was associated terminates.
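A minimal tuning session might look like this; the value 16 is only an illustration, and the cache should be sized to your connection churn:

```sql
-- Allow up to 16 idle connection threads to be cached:
SET GLOBAL thread_cache_size = 16;

-- Compare thread creation with total connection attempts;
-- a high Threads_created/Connections ratio means too many cache misses:
SHOW GLOBAL STATUS WHERE Variable_name IN
    ('Connections', 'Threads_created');
```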
71. MySQL Server Tuning: Reusing Tables
MySQL is multi-threaded, so there may be many clients issuing queries for a given table simultaneously. To minimize the problem with multiple client sessions having different states on the same table, the table is opened independently by each concurrent session. This uses additional memory but normally increases performance.
When the table cache fills up, the server uses the following procedure to locate a cache entry to use:
Tables that are not currently in use are released, beginning with the table least recently used.
If a new table needs to be opened, but the cache is full and no tables can be released, the cache is temporarily extended as necessary. When the cache is in a temporarily extended state and a table goes from a used to unused state, the table is closed and released from the cache.
72. MySQL Server Tuning: Reusing Tables
You can determine whether your table cache is too small by checking the mysqld status variable Opened_tables, which indicates the number of table-opening operations since the server started:
mysql> SHOW GLOBAL STATUS LIKE 'Opened_tables';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| Opened_tables | 277 |
+---------------+-------+
73. MySQL Server Tuning: Setting table_open_cache
The table_open_cache and max_connections system variables affect the maximum number of files the server keeps open. If you increase one or both of these values, you may run up against a limit imposed by your operating system on the per-process number of open file descriptors. Many operating systems permit you to increase the open-files limit, although the method varies widely from system to system. Consult your operating system documentation to determine whether it is possible to increase the limit and how to do so.
table_open_cache is related to max_connections. For example, for 200 concurrently running connections, specify a table cache size of at least 200 * N, where N is the maximum number of tables per join in any of the queries that you execute. You must also reserve some extra file descriptors for temporary tables and files.
Make sure that your operating system can handle the number of open file descriptors implied by the table_open_cache setting. If table_open_cache is set too high, MySQL may run out of file descriptors and refuse connections, fail to perform queries, and be very unreliable.
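Applying the sizing rule above, a server with 200 connections whose largest join touches 4 tables needs at least 800 cache entries; the numbers here are illustrative:

```sql
-- 200 connections * 4 tables per join = 800, plus headroom:
SET GLOBAL table_open_cache = 1000;

-- If Opened_tables keeps climbing while Open_tables sits at the
-- cache limit, the cache is still too small:
SHOW GLOBAL STATUS LIKE 'Open%tables';
```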
74. MySQL Query Cache: MySQL Query Cache
The query cache stores the text of a SELECT statement together with the corresponding result that was sent to the client. If an identical statement is received later, the server retrieves the results from the query cache rather than parsing and executing the statement again. The query cache is shared among sessions, so a result set generated by one client can be sent in response to the same query issued by another client.
Before even parsing a query, MySQL checks for it in the query cache, if the cache is enabled. This operation is a case-sensitive hash lookup. If the query differs from a similar query in the cache by even a single byte, it won’t match and the query processing will go to the next stage.
The query cache can be useful in an environment where you have tables that do not change very often and for which the server receives many identical queries. This is a typical situation for many Web servers that generate many dynamic pages based on database content. For example, when an order form queries a table to display the lists of all US states or all countries in the world, those values can be retrieved from the query cache. Although the values would probably be retrieved from memory in any case (from the InnoDB buffer pool or MyISAM key cache), using the query cache avoids the overhead of processing the query, deciding whether to use a table scan, and locating the data block for each row.
The query cache always contains current and reliable data. Any insert, update, delete, or other modification to a table causes any relevant entries in the query cache to be flushed.
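A short session illustrates both the hit and the invalidation behavior; the country table used here is hypothetical:

```sql
-- Byte-identical SELECTs: the second one is answered from the cache.
SELECT Name FROM country WHERE Code = 'USA';
SELECT Name FROM country WHERE Code = 'USA';   -- cache hit

-- Any modification to the table flushes its cached results:
UPDATE country SET Population = Population + 1 WHERE Code = 'USA';
SELECT Name FROM country WHERE Code = 'USA';   -- miss; re-executed and re-cached
```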
75. MySQL Query Cache: When to Use the MySQL Query Cache
The query cache offers the potential for substantial performance improvement, but it is best suited to certain applications: typically simple applications deployed on a limited scale, or applications dealing with small data sets. The query cache comes in handy in a few particular situations:
Third-party applications: you cannot change how the application works with MySQL to add caching, but you can enable the query cache so it runs faster.
Low-load applications: if you are building an application that is not designed for extreme load, as with many personal applications, the query cache might be all you need, especially in a mostly read-only scenario.
76. MySQL Query Cache: When NOT to Use the MySQL Query Cache
As a first consideration, note that the query cache is disabled by default. Having the query cache on adds some overhead even if no queries are ever cached, so its benefits are always relative to that cost.
The cache is not used for queries of the following types:
Queries that are a subquery of an outer query
Queries executed within the body of a stored function, trigger, or event
Caching works on full queries only, so it does not work for subselects, inline views or parts of UNION.
Only SELECT queries are cached; SHOW commands and stored procedure calls are not, even if the stored procedure would simply perform a SELECT to retrieve data from a table.
It might not work with transactions: different transactions may see different states of the database, depending on the updates they have performed and even on the snapshot they are working with. Statements issued outside of a transaction have the best chance of being cached.
The amount of usable memory is limited: queries are constantly being invalidated from the query cache by table updates, so the number of queries in the cache and the memory used cannot grow forever, even if you have a very large number of different queries being run.
77. MySQL Query Cache: MySQL Query Cache Settings
The query cache system variables all have names that begin with query_cache_.
The have_query_cache server system variable indicates whether the query cache is available:
mysql> SHOW VARIABLES LIKE 'have_query_cache';
+------------------+-------+
| Variable_name | Value |
+------------------+-------+
| have_query_cache | YES |
+------------------+-------+
78. MySQL Query Cache: MySQL Query Cache Settings
query_alloc_block_size (defaults to 8192): the actual size of the memory blocks created for result sets in the query cache (don’t adjust)
query_cache_limit (defaults to 1048576): queries with result sets larger than this won’t make it into the query cache
query_cache_min_res_unit (defaults to 4096): the smallest size (in bytes) for blocks in the query cache (don’t adjust)
query_cache_size (defaults to 0): the total size of the query cache (disables query cache if equal to 0)
query_cache_type (defaults to 1): 0 means don’t cache, 1 means cache everything, 2 means only cache result sets on demand
query_cache_wlock_invalidate (defaults to FALSE): when FALSE, allows SELECTs to be served from the query cache even while the MyISAM table is locked for writing
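As a sketch, the main settings can be adjusted at runtime; note that on some server versions query_cache_type cannot be changed at runtime if the server was started with the cache disabled:

```sql
-- Enable a 64MB query cache, caching everything by default but
-- refusing individual result sets larger than 2MB:
SET GLOBAL query_cache_size  = 67108864;   -- 64 * 1024 * 1024
SET GLOBAL query_cache_type  = 1;
SET GLOBAL query_cache_limit = 2097152;    -- 2 * 1024 * 1024
```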
79. MySQL Query Cache: MySQL Query Cache Status Variables
mysql> SHOW STATUS LIKE 'Qcache%';
+-------------------------+----------+
| Variable_name | Value |
+-------------------------+----------+
| Qcache_free_blocks | 1 |
| Qcache_free_memory | 16759696 |
| Qcache_hits | 0 |
| Qcache_inserts | 0 |
| Qcache_lowmem_prunes | 0 |
| Qcache_not_cached | 164 |
| Qcache_queries_in_cache | 0 |
| Qcache_total_blocks | 1 |
+-------------------------+----------+
80. MySQL Query Cache: MySQL Query Cache Status Variables
Qcache_free_blocks: The number of free memory blocks in query cache.
Qcache_free_memory: The amount of free memory for query cache.
Qcache_hits: The number of cache hits.
Qcache_inserts: The number of queries added to the cache.
Qcache_lowmem_prunes: The number of queries that were deleted from the cache because of low memory.
Qcache_not_cached: The number of non-cached queries (not cachable, or due to query_cache_type).
Qcache_queries_in_cache: The number of queries registered in the cache.
Qcache_total_blocks: The total number of blocks in the query cache.
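These counters can be combined into a rough hit rate; Com_select counts the SELECTs that were actually executed (that is, cache misses and uncached queries):

```sql
SHOW GLOBAL STATUS WHERE Variable_name IN
    ('Qcache_hits', 'Com_select', 'Qcache_lowmem_prunes');
-- hit rate ~= Qcache_hits / (Qcache_hits + Com_select)
-- a steadily growing Qcache_lowmem_prunes suggests the cache is too small
```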
81. MySQL Query Cache: Improve Query Cache Results
To get optimized and speedy responses from your MySQL server, add the following two configuration directives:
query_cache_size=SIZE
The amount of memory (SIZE) allocated for caching query results. The default value is 0, which disables the query cache.
query_cache_type=OPTION
Set the query cache type. Possible options are as follows:
0 : Don’t cache results in or retrieve results from the query cache.
1 : Cache all query results except for those that begin with SELECT SQL_NO_CACHE.
2 : Cache results only for queries that begin with SELECT SQL_CACHE.
You can set them in the /etc/my.cnf (Red Hat) or /etc/mysql/my.cnf (Debian) file:
$ vi /etc/mysql/my.cnf
Append config directives as follows:
query_cache_size = 268435456
query_cache_type=1
query_cache_limit=1048576
82. InnoDB: InnoDB Storage Engine
InnoDB is a storage engine for MySQL. MySQL 5.5 and later use it by default, rather than MyISAM. It provides the standard ACID-compliant transaction features, along with foreign key support (Declarative Referential Integrity).
InnoDB tables fully support ACID compliance and transactions, and they perform very well. InnoDB supports foreign keys, commit, rollback, and roll-forward operations. An InnoDB table can be up to 64TB in size.
The InnoDB storage engine maintains its own buffer pool for caching data and indexes in main memory. When the innodb_file_per_table setting is enabled, each new InnoDB table and its associated indexes are stored in a separate file. When the innodb_file_per_table option is disabled, InnoDB stores all its tables and indexes in the single system tablespace, which may consist of several files (or raw disk partitions). InnoDB tables can handle large quantities of data, even on operating systems where file size is limited to 2GB.
ACID - Atomicity, Consistency, Isolation, Durability
83. InnoDB: InnoDB Storage Engine Uses
Transactions
If your application requires transactions, InnoDB is the most stable, well-integrated, proven choice. MyISAM is a good choice if a task doesn’t require transactions and issues primarily either SELECT or INSERT queries. Sometimes specific components of an application (such as logging) fall into this category.
Backups
The need to perform regular backups might also influence your choice. If your server can be shut down at regular intervals for backups, the storage engines are equally easy to deal with. However, if you need to perform online backups, you basically need InnoDB.
Crash recovery
If you have a lot of data, you should seriously consider how long it will take to recover from a crash. MyISAM tables become corrupt more easily and take much longer to recover than InnoDB tables. In fact, this is one of the most important reasons why a lot of people use InnoDB when they don’t need transactions.
84. InnoDB: Using the InnoDB Storage Engine
InnoDB is designed to handle transactional applications that require crash recovery, referential integrity, high levels of user concurrency and fast response times.
When to use InnoDB?
You are developing an application that requires ACID compliance. At the very least, your application demands the storage layer support the notion of transactions.
You require expedient crash recovery. Almost all production sites fall into this category; however, MyISAM table recovery times will obviously vary from one usage pattern to the next. To estimate an accurate figure for your environment, try running myisamchk over a many-gigabyte table from your application's backups on hardware similar to what you have in production. While recovery times of MyISAM tables increase as the table grows, InnoDB table recovery times remain largely constant throughout the life of the table.
Your web site or application is mostly multi-user. The database has to deal with frequent UPDATEs to a single table, and you would like to make better use of your multi-processing hardware.
85. InnoDB: InnoDB Log Files and Buffers
InnoDB is a general-purpose storage engine that balances high reliability and high performance. It is a transactional storage engine and is fully ACID compliant, as would be expected from any relational database. The durability guarantee provided by InnoDB is made possible by the redo logs.
By default, InnoDB creates two redo log files (or just log files) ib_logfile0 and ib_logfile1 within the data directory of MySQL.
The redo log files are used in a circular fashion. Redo records are written from the beginning to the end of the first redo log file; writing then continues in the next log file, and so on until the last redo log file is filled. At that point, writing wraps around and redo records are again written from the start of the first redo log file.
The log files are viewed as a sequence of blocks called "log blocks" whose size is given by OS_FILE_LOG_BLOCK_SIZE which is equal to 512 bytes. Each log file has a header whose size is given by LOG_FILE_HDR_SIZE, which is defined as 4*OS_FILE_LOG_BLOCK_SIZE.
86. InnoDB: InnoDB Log Files and Buffers
The global log system object log_sys holds important information related to log subsystem of InnoDB.
This object points to various positions in the in-memory redo log buffer and on-disk redo log files.
The accompanying slide diagram shows the locations pointed to by the global log_sys object, and makes clear that the redo log buffer maps to a specific portion of the redo log file.
87. InnoDB: Committing Transactions
By default, MySQL starts the session for each new connection with autocommit mode enabled, so MySQL does a commit after each SQL statement if that statement did not return an error. If a statement returns an error, the commit or rollback behavior depends on the error.
If a session that has autocommit disabled ends without explicitly committing the final transaction, MySQL rolls back that transaction.
Some statements implicitly end a transaction, as if you had done a COMMIT before executing the statement.
To optimize InnoDB transaction processing, find the ideal balance between the performance overhead of transactional features and the workload of your server.
The default MySQL setting AUTOCOMMIT=1 can impose performance limitations on a busy database server. Where practical, wrap several related DML operations into a single transaction, by issuing SET AUTOCOMMIT=0 or a START TRANSACTION statement, followed by a COMMIT statement after making all the changes.
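A sketch of the pattern, using hypothetical orders and customers tables:

```sql
-- Batch related DML into one transaction instead of paying
-- a log flush for every statement:
SET autocommit = 0;
START TRANSACTION;
INSERT INTO orders (customer_id, total) VALUES (42, 19.90);
UPDATE customers SET order_count = order_count + 1 WHERE id = 42;
COMMIT;
```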
88. InnoDB: Committing Transactions
Avoid performing rollbacks after inserting, updating, or deleting huge numbers of rows. If a big transaction is slowing down server performance, rolling it back can make the problem worse, potentially taking several times as long to perform as the original DML operations. Killing the database process does not help, because the rollback starts again on server startup.
When rows are modified or deleted, the rows and associated undo logs are not physically removed immediately, or even immediately after the transaction commits. The old data is preserved until transactions that started earlier or concurrently are finished, so that those transactions can access the previous state of modified or deleted rows. Thus, a long-running transaction can prevent InnoDB from purging data that was changed by a different transaction.
89. InnoDB: InnoDB Table Design
Use short PRIMARY KEY
Primary key is part of all other indexes on table
Consider an artificial AUTO_INCREMENT PRIMARY KEY, with a UNIQUE index on the original primary key columns
INT keys are faster than VARCHAR/CHAR
PRIMARY KEY is most efficient for lookups
Reference tables by PRIMARY KEY when possible
Do not update PRIMARY KEY
This will require all other keys to be modified for row
This often requires row relocation to other page
Cluster your accesses by PRIMARY KEY
Inserts in PRIMARY KEY order are much faster.
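The guidelines above combine naturally into a table definition like the following (the schema is illustrative): a short AUTO_INCREMENT surrogate becomes the clustered PRIMARY KEY, while the natural key keeps a UNIQUE index:

```sql
CREATE TABLE customer (
    id    INT UNSIGNED NOT NULL AUTO_INCREMENT,
    email VARCHAR(255) NOT NULL,
    name  VARCHAR(100) NOT NULL,
    PRIMARY KEY (id),           -- short clustered key, copied into every secondary index
    UNIQUE KEY uk_email (email) -- original natural key, still enforced
) ENGINE=InnoDB;
```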
90. InnoDB: InnoDB Table Design
InnoDB creates each table and associated primary key index either in the system tablespace, or in a separate tablespace (represented by a .ibd file).
Always set up a primary key for each InnoDB table, specifying the column or columns that:
Are referenced by the most important queries.
Are never left blank.
Never have duplicate values.
Rarely if ever change value once inserted.
Although the table works correctly without you defining a primary key, the primary key is involved with many aspects of performance and is a crucial design aspect for any large or frequently used table.
InnoDB provides an optimization that significantly improves scalability and performance of SQL statements that insert rows into tables with AUTO_INCREMENT columns.
91. InnoDB: InnoDB Table Design
Limits on InnoDB Tables
A table can contain a maximum of 1000 columns.
A table can contain a maximum of 64 secondary indexes.
By default, an index key for a single-column index can be up to 767 bytes.
The InnoDB internal maximum key length is 3500 bytes, but MySQL itself restricts this to 3072 bytes.
The maximum row length is slightly less than half of a database page. The default database page size in InnoDB is 16KB.
Although InnoDB supports row sizes larger than 65,535 bytes internally, MySQL itself imposes a row-size limit of 65,535 for the combined size of all columns.
92. InnoDB: SHOW ENGINE INNODB STATUS
The InnoDB storage engine exposes a lot of information about its internals in the output of SHOW ENGINE INNODB STATUS. Unlike most of the SHOW commands, its output consists of a single string, not rows and columns.
HEADER
The first section is the header, which simply announces the beginning of the output, the current date and time, and how long it has been since the last printout.
SEMAPHORES
If you have a high-concurrency workload, you might want to pay attention to the next section, SEMAPHORES . It contains two kinds of data: event counters and, optionally, a list of current waits. If you’re having trouble with bottlenecks, you can use this information to help you find the bottlenecks.
LATEST FOREIGN KEY ERROR
This section, LATEST FOREIGN KEY ERROR, doesn't appear unless your server has had a foreign key error. Sometimes the problem has to do with a transaction and the parent or child rows it was looking for while trying to insert, update, or delete a record.
LATEST DETECTED DEADLOCK
Like the foreign key section, the LATEST DETECTED DEADLOCK section appears only if your server has had a deadlock. The deadlock error messages are also overwritten every time there's a new error, and the pt-deadlock-logger tool from Percona Toolkit can help you save these for later analysis. A deadlock is a cycle in the waits-for graph, which is a data structure of row locks held and waited for. The cycle can be arbitrarily large.
93. InnoDB: SHOW ENGINE INNODB STATUS
FILE I/O
The FILE I/O section shows the state of the I/O helper threads, along with performance counters.
INSERT BUFFER AND ADAPTIVE HASH INDEX
This section shows the status of these two structures inside InnoDB.
LOG
This section shows statistics about InnoDB’s transaction log (redo log) subsystem.
BUFFER POOL AND MEMORY
This section shows statistics about InnoDB’s buffer pool and how it uses memory.
ROW OPERATIONS
This section shows miscellaneous InnoDB statistics.
94. InnoDB: InnoDB Monitors and Settings
InnoDB monitors provide information about the InnoDB internal state. This information is useful for performance tuning. There are four types of InnoDB monitors:
The standard InnoDB Monitor displays the following types of information:
Table and record locks held by each active transaction.
Lock waits of a transaction.
Semaphore waits of threads.
Pending file I/O requests.
Buffer pool statistics.
Purge and insert buffer merge activity of the main InnoDB thread.
The InnoDB Lock Monitor is like the standard InnoDB Monitor but also provides extensive lock information.
The InnoDB Tablespace Monitor prints a list of file segments in the shared tablespace and validates the tablespace allocation data structures.
The InnoDB Table Monitor prints the contents of the InnoDB internal data dictionary.
95. InnoDB: InnoDB Monitors and Settings
When switched on, InnoDB monitors print data about every 15 seconds. Server output usually is directed to the error log. This data is useful in performance tuning. InnoDB sends diagnostic output to stderr or to files rather than to stdout or fixed-size memory buffers, to avoid potential buffer overflows.
The output of SHOW ENGINE INNODB STATUS is written to a status file in the MySQL data directory every fifteen seconds. The name of the file is innodb_status.pid, where pid is the server process ID. InnoDB removes the file for a normal shutdown.
96. InnoDB: InnoDB Monitors and Settings
Enabling the Standard InnoDB Monitor
To enable the standard InnoDB Monitor for periodic output, create the innodb_monitor table:
CREATE TABLE innodb_monitor (a INT) ENGINE=INNODB;
To disable the standard InnoDB Monitor, drop the table:
DROP TABLE innodb_monitor;
Enabling the InnoDB Lock Monitor
To enable the InnoDB Lock Monitor for periodic output, create the innodb_lock_monitor table:
CREATE TABLE innodb_lock_monitor (a INT) ENGINE=INNODB;
To disable the InnoDB Lock Monitor, drop the table:
DROP TABLE innodb_lock_monitor;
97. InnoDB: InnoDB Monitors and Settings
Enabling the InnoDB Tablespace Monitor
To enable the InnoDB Tablespace Monitor for periodic output, create the innodb_tablespace_monitor table:
CREATE TABLE innodb_tablespace_monitor (a INT) ENGINE=INNODB;
To disable the InnoDB Tablespace Monitor, drop the table:
DROP TABLE innodb_tablespace_monitor;
Enabling the InnoDB Table Monitor
To enable the InnoDB Table Monitor for periodic output, create the innodb_table_monitor table:
CREATE TABLE innodb_table_monitor (a INT) ENGINE=INNODB;
To disable the InnoDB Table Monitor, drop the table:
DROP TABLE innodb_table_monitor;
98. InnoDB: InnoDB Monitors and Settings
To fine tune InnoDB working parameters, first check their values.
mysql> show variables like 'innodb_buffer%';
+------------------------------+-----------+
| Variable_name | Value |
+------------------------------+-----------+
| innodb_buffer_pool_instances | 1 |
| innodb_buffer_pool_size | 134217728 |
+------------------------------+-----------+
mysql> show variables like 'innodb_log%';
+---------------------------+---------+
| Variable_name | Value |
+---------------------------+---------+
| innodb_log_buffer_size | 8388608 |
| innodb_log_file_size | 5242880 |
| innodb_log_files_in_group | 2 |
| innodb_log_group_home_dir | ./ |
+---------------------------+---------+
99. InnoDB: InnoDB Monitors and Settings
To make the modification persistent, edit the “my.cnf” configuration file.
$ vi /etc/mysql/my.cnf
Add the following lines with values as needed:
# innodb
innodb_buffer_pool_size = 128M
innodb_log_file_size = 32M
100. MyISAM: MyISAM Storage Engine Uses
MyISAM is a storage engine for MySQL, and it was the default prior to MySQL version 5.5 (released in December 2010). It is based on ISAM (Indexed Sequential Access Method), an indexing method developed by IBM that allows fast retrieval of information from large sets of data.
Read-only tables. If your applications use tables that are never or rarely modified, you can safely change their storage engine to MyISAM.
Replication configuration. Replication enables you to keep several databases synchronized automatically. Unlike clustering, in which all nodes are self-sufficient, replication assigns different roles to different servers. In particular, you can use an InnoDB-based master database for writing and processing data, and a MyISAM-based slave database for reading.
Backup. The most effective approach to MySQL backup is a combination of Master-to-Slave replication and backup of Slave Servers.
101. MyISAM: MyISAM Table Design
MyISAM is no longer the default storage engine. All new tables will be created with the InnoDB storage engine if you do not specify a storage engine name. But if you want to create a new table with the MyISAM storage engine explicitly, you can specify "ENGINE = MYISAM" at the end of the "CREATE TABLE" statement.
MyISAM supports three different storage formats. The fixed and dynamic format are chosen automatically depending on the type of columns you are using. The compressed format can be created only with the myisampack utility.
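A sketch of creating the compressed format; the paths are illustrative, and this should be run only while the server is stopped or the table is flushed and locked:

```shell
# Compress a read-only MyISAM table, then rebuild its indexes:
myisampack /var/lib/mysql/test/archive.MYI
myisamchk -rq /var/lib/mysql/test/archive.MYI
```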
102. MyISAM: MyISAM Table Design
Static-format tables have these characteristics:
CHAR and VARCHAR columns are space-padded to the specified column width, although the column type is not altered. BINARY and VARBINARY columns are padded with 0x00 bytes to the column width.
Very quick.
Easy to cache.
Easy to reconstruct after a crash, because rows are located in fixed positions.
Reorganization is unnecessary unless you delete a huge number of rows and want to return free disk space to the operating system. To do this, use OPTIMIZE TABLE or myisamchk -r.
Usually require more disk space than dynamic-format tables.
103. MyISAM: MyISAM Table Design
Dynamic-format tables have these characteristics:
All string columns are dynamic except those with a length less than four.
Each row is preceded by a bitmap that indicates which columns contain the empty string (for string columns) or zero (for numeric columns). Note that this does not include columns that contain NULL values. If a string column has a length of zero after trailing space removal, or a numeric column has a value of zero, it is marked in the bitmap and not saved to disk. Nonempty strings are saved as a length byte plus the string contents.
Much less disk space usually is required than for fixed-length tables.
Each row uses only as much space as is required. However, if a row becomes larger, it is split into as many pieces as are required, resulting in row fragmentation. For example, if you update a row with information that extends the row length, the row becomes fragmented. In this case, you may have to run OPTIMIZE TABLE or myisamchk -r from time to time to improve performance. Use myisamchk -ei to obtain table statistics.
More difficult than static-format tables to reconstruct after a crash, because rows may be fragmented into many pieces and links (fragments) may be missing.
104. MyISAM: MyISAM Table Design
Compressed tables have the following characteristics:
Compressed tables take very little disk space. This minimizes disk usage, which is helpful when using slow disks (such as CD-ROMs).
Each row is compressed separately, so there is very little access overhead. The header for a row takes up one to three bytes depending on the biggest row in the table. Each column is compressed differently. There is usually a different Huffman tree for each column. Some of the compression types are:
Suffix space compression.
Prefix space compression.
Numbers with a value of zero are stored using one bit.
If values in an integer column have a small range, the column is stored using the smallest possible type. For example, a BIGINT column (eight bytes) can be stored as a TINYINT column (one byte) if all its values are in the range from -128 to 127.
If a column has only a small set of possible values, the data type is converted to ENUM.
A column may use any combination of the preceding compression types.
105. MyISAM: Optimizing MyISAM
The MyISAM storage engine performs best with read-mostly data or with low-concurrency operations, because table locks limit the ability to perform simultaneous updates.
Some general tips for speeding up queries on MyISAM tables:
To help MySQL better optimize queries, use ANALYZE TABLE or run myisamchk --analyze on a table after it has been loaded with data. This updates a value for each index part that indicates the average number of rows that have the same value.
Try to avoid complex SELECT queries on MyISAM tables that are updated frequently, to avoid problems with table locking that occur due to contention between readers and writers.
For MyISAM tables that change frequently, try to avoid all variable-length columns (VARCHAR, BLOB, and TEXT).
Use INSERT DELAYED when you do not need to know when your data is written. This reduces the overall insertion impact because many rows can be written with a single disk write.
Use OPTIMIZE TABLE periodically to avoid fragmentation with dynamic-format MyISAM tables.
You can increase performance by caching queries or answers in your application and then executing many inserts or updates together. Locking the table during this operation ensures that the index cache is only flushed once after all updates.
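The batching tip above can be sketched in SQL. The table and column names here are hypothetical; the point is that the lock is held across all inserts so the index cache is flushed only once:

```sql
-- Hypothetical example: batch many inserts under one table lock
-- so the MyISAM key cache is flushed only once, at UNLOCK TABLES.
LOCK TABLES log_entries WRITE;
INSERT INTO log_entries (logged_at, message) VALUES (NOW(), 'first');
INSERT INTO log_entries (logged_at, message) VALUES (NOW(), 'second');
INSERT INTO log_entries (logged_at, message) VALUES (NOW(), 'third');
UNLOCK TABLES;
```

A multi-row INSERT (one statement with many VALUES lists) achieves a similar effect without explicit locking.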
106
MyISAM MyISAM Table Locks
To keep locking fast, MySQL uses table locking for several of its storage engines, including MyISAM.
A table lock is exactly what it sounds like: it locks the entire table.
When a client has to write to a table (insert, delete, update, etc.), it acquires a write lock. This keeps all other read and write operations pending.
When nobody is writing, readers can obtain read locks, which don’t conflict with other read locks.
107
MyISAM MyISAM Table Locks
Considerations for Table Locking
Table locking in MySQL is deadlock-free for storage engines that use table-level locking. Deadlock avoidance is managed by always requesting all needed locks at once at the beginning of a query and always locking the tables in the same order.
MySQL grants table write locks as follows:
If there are no locks on the table, put a write lock on it.
Otherwise, put the lock request in the write lock queue.
MySQL grants table read locks as follows:
If there are no write locks on the table, put a read lock on it.
Otherwise, put the lock request in the read lock queue.
The MyISAM storage engine supports concurrent inserts to reduce contention between readers and writers for a given table: If a MyISAM table has no free blocks in the middle of the data file, rows are always inserted at the end of the data file. In this case, you can freely mix concurrent INSERT and SELECT statements for a MyISAM table without locks.
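Concurrent-insert behavior is controlled by the concurrent_insert system variable; a minimal sketch of inspecting and changing it:

```sql
-- 0: concurrent inserts disabled
-- 1 (default): concurrent inserts allowed when the data file has no holes
-- 2: concurrent inserts allowed even for tables with holes
--    (new rows are appended at the end of the data file)
SHOW VARIABLES LIKE 'concurrent_insert';
SET GLOBAL concurrent_insert = 2;
```

Setting the value to 2 trades some space reuse for better read/write concurrency on busy MyISAM tables.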
108
MyISAM MyISAM Settings
MyISAM offers table-level locking: when data is being written to a table, the whole table is locked, and any other writes that must be performed at the same time on the same table have to wait until the first one has finished.
The problems of table-level locking are noticeable mainly on very busy servers. For the typical website scenario, MyISAM usually offers good performance at a lower server cost.
If the load on the MySQL server is very high and the server is not swapping, before upgrading to a more expensive server with more processing power, you may want to try altering its tables to use the MyISAM engine instead of InnoDB and see what happens.
In the end, which engine you should use will depend on the particular scenario of the server.
If you decide to use only MyISAM tables, you should add the following configuration lines to your my.cnf file:
default-storage-engine=MyISAM
default-tmp-storage-engine=MyISAM
If you only have MyISAM tables, you can disable the InnoDB engine, which will save you RAM, by adding the following line to your my.cnf file:
skip-innodb
Note, however, that if you don't add the two lines presented above to your my.cnf file, the skip-innodb setting will prevent your MySQL server from starting, since current versions of the MySQL server use InnoDB as the default storage engine.
109
MyISAM MyISAM Key Cache
To minimize disk I/O, the MyISAM storage engine exploits a strategy that is used by many database management systems. It employs a cache mechanism to keep the most frequently accessed table blocks in memory:
For index blocks, a special structure called the key cache (or key buffer) is maintained. The structure contains a number of block buffers where the most-used index blocks are placed.
For data blocks, MySQL uses no special cache. Instead it relies on the native operating system file system cache.
The MyISAM key caches are also referred to as key buffers; there is one by default, but you can create more. MyISAM caches only indexes, not data (it lets the operating system cache the data). If you use mostly MyISAM, you should allocate a lot of memory to the key caches.
110
MyISAM MyISAM Key Cache
To control the size of the key cache, use the key_buffer_size system variable. If this variable is set equal to zero, no key cache is used. The key cache also is not used if the key_buffer_size value is too small to allocate the minimal number of block buffers.
Key caches should be no bigger than the total index size, and no more than 25% to 50% of the amount of memory you reserved for operating system caches.
By default, MyISAM caches all indexes in the default key buffer, but you can create multiple named key buffers. This lets you keep more than 4 GB of indexes in memory at once. To create key buffers named key_buffer_1 and key_buffer_2, each sized at 1 GB, place the following in the my.cnf configuration file:
key_buffer_1.key_buffer_size = 1G
key_buffer_2.key_buffer_size = 1G
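Once the named key buffers exist, tables are assigned to them, and optionally preloaded, with the CACHE INDEX and LOAD INDEX INTO CACHE statements. The table names below are hypothetical:

```sql
-- Assign the indexes of two hypothetical MyISAM tables to the named caches
CACHE INDEX orders IN key_buffer_1;
CACHE INDEX customers IN key_buffer_2;

-- Optionally preload the indexes into their assigned caches
-- so the first queries do not pay the disk-read cost
LOAD INDEX INTO CACHE orders;
LOAD INDEX INTO CACHE customers;
```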
111
MyISAM MyISAM Full-Text Search
MySQL has support for full-text indexing and searching:
A full-text index in MySQL is an index of type FULLTEXT.
Full-text indexes can be used only with MyISAM tables. Full-text indexes can be created only for CHAR, VARCHAR, or TEXT columns.
A FULLTEXT index definition can be given in the CREATE TABLE statement when a table is created, or added later using ALTER TABLE or CREATE INDEX.
For large data sets, it is much faster to load your data into a table that has no FULLTEXT index and then create the index after that, than to load data into a table that has an existing FULLTEXT index.
Full-text searching is performed using MATCH() ... AGAINST syntax. MATCH() takes a comma-separated list that names the columns to be searched. AGAINST takes a string to search for, and an optional modifier that indicates what type of search to perform. The search string must be a string value that is constant during query evaluation.
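For example, against a hypothetical posts table with a FULLTEXT index on post_content:

```sql
-- Natural-language search (the default mode)
SELECT id, title
FROM posts
WHERE MATCH(post_content) AGAINST('performance tuning');

-- Boolean mode: require the word "mysql", exclude "oracle"
SELECT id, title
FROM posts
WHERE MATCH(post_content) AGAINST('+mysql -oracle' IN BOOLEAN MODE);
```

In natural-language mode, MATCH() also returns a relevance score you can select or sort by.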
112
MyISAM MyISAM Full-Text Search
Before you can perform a full-text search on a column of a table, you must index its data, and re-index it whenever the data in the column changes. In MySQL, the full-text index is a kind of index named FULLTEXT.
You can define the FULLTEXT index in a variety of ways:
Typically, you define the FULLTEXT index for a column when you create a new table by using the CREATE TABLE statement.
CREATE TABLE posts (
id int(4) NOT NULL AUTO_INCREMENT,
title varchar(255) NOT NULL,
post_content text,
PRIMARY KEY (id),
FULLTEXT KEY post_content (post_content)
) ENGINE=MyISAM;
If you already have existing tables and want to define full-text indexes on them, you can use the ALTER TABLE statement or the CREATE INDEX statement.
This is the syntax for defining a FULLTEXT index using the ALTER TABLE statement:
ALTER TABLE table_name ADD FULLTEXT(column_name1, column_name2, ...)
You can also use the CREATE INDEX statement to create a FULLTEXT index on an existing table:
CREATE FULLTEXT INDEX index_name ON table_name(idx_column_name, ...)
113
MyISAM MyISAM Full-Text Search
SPHINX
Sphinx http://www.sphinxsearch.com is a free, open source, full-text search engine, designed from the ground up to integrate well with databases. It has DBMS-like features, is very fast, supports distributed searching, and scales well. It is also designed for efficient memory and disk I/O, which is important because they’re often the limiting factors for large operations.
Sphinx works well with MySQL. It can be used to accelerate a variety of queries, including full-text searches; you can also use it to perform fast grouping and sorting operations, among other applications.
114
MyISAM MyISAM Full-Text Search
SPHINX
Sphinx can complement a MySQL-based application in many ways, increasing performance where MySQL is not a good solution and adding functionality MySQL can’t provide.
Typical usage scenarios include:
Fast, efficient, scalable, relevant full-text searches
Optimizing WHERE conditions on low-selectivity indexes or columns without indexes
Optimizing ORDER BY ... LIMIT N queries and GROUP BY queries
Generating result sets in parallel
Scaling up and scaling out
Aggregating partitioned data
115
Other MySQL Storage Engines and Issues
Large Objects
Even though MySQL is used to power a lot of web sites and applications that handle large binary objects (BLOBs) such as images, videos, or audio files, these objects are usually not stored directly in MySQL tables today. The reasons are that the MySQL client/server protocol places restrictions on the size of objects that can be returned, and that overall performance is often not acceptable, as current MySQL storage engines have not really been optimized to handle large numbers of BLOBs.
In MySQL the maximum size of a given BLOB can be up to 4 GB (LONGBLOB). MySQL doesn't offer any other parameter directly impacting BLOB performance.
116
Other MySQL Storage Engines and Issues
Large Objects
BLOBs create big rows in memory, and sequential scans become expensive. The database can become too big to handle, and then it won't scale well. In addition, BLOBs slow down replication, because BLOB data must be written to the binary log.
On the other hand, when BLOBs are stored in the database, BLOB operations are transactional, references to the data stay valid, and replication of the data becomes possible.
One solution is the Scalable BLOB Streaming project for MySQL, which includes the "PrimeBase XT Storage Engine for MySQL" (PBXT) and the "PrimeBase Media Streaming" engine (PBMS).
117
Other MySQL Storage Engines and Issues
MEMORY Storage Engine Uses
The MEMORY storage engine creates special-purpose tables with contents that are stored in memory. Because the data is vulnerable to crashes, hardware issues, or power outages, only use these tables as temporary work areas or read-only caches for data pulled from other tables.
A typical use case for the MEMORY engine involves these characteristics:
Operations involving transient, non-critical data such as session management or caching. When the MySQL server halts or restarts, the data in MEMORY tables is lost.
In-memory storage for fast access and low latency. Data volume can fit entirely in memory without causing the operating system to swap out virtual memory pages.
A read-only or read-mostly data access pattern (limited updates).
Basically, it’s an engine that’s really only useful for a single connection in limited use cases.
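A minimal sketch of such a use case, assuming a hypothetical sessions table:

```sql
-- Session cache in memory; all data is lost on server restart,
-- so only transient, reconstructible data belongs here.
CREATE TABLE sessions (
    session_id CHAR(32) NOT NULL,
    user_id    INT UNSIGNED NOT NULL,
    last_seen  TIMESTAMP NOT NULL,
    PRIMARY KEY (session_id) USING HASH  -- HASH is the MEMORY default
) ENGINE=MEMORY;
```

The hash primary key gives O(1) lookups by session_id, which matches the access pattern described above.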
118
Other MySQL Storage Engines and Issues
MEMORY Storage Engine Performance
People often want to use the MySQL MEMORY engine to store web sessions or other similarly volatile data.
There are good reasons for that; here are the main ones:
Data is volatile; it is not the end of the world if it is lost
Elements are accessed by primary key, so hash indexes are a good fit
Session tables are accessed heavily (reads and writes), so using MEMORY tables saves disk I/O
Unfortunately, the MEMORY engine also has some limitations that can prevent its use at large scale:
Bound by the memory of one server
Variable-length data types like VARCHAR are expanded to their full length
Bound to the CPU processing of one server
The MEMORY engine only supports table-level locking, limiting concurrency
These limitations can be hit fairly quickly, especially if the session payload data is large.
However, MEMORY performance is constrained by contention resulting from single-thread execution and table lock overhead when processing updates.
MySQL Cluster offers the same features as the MEMORY engine with higher performance levels.
119
Other MySQL Storage Engines and Issues Multiple Storage Engine Advantages
MySQL supports several storage engines that act as handlers for different table types. MySQL storage engines include both those that handle transaction-safe tables and those that handle non-transaction-safe tables.
Transaction-safe tables (TSTs) have several advantages over non-transaction-safe tables (NTSTs):
Safer. Even if MySQL crashes or you get hardware problems, you can get your data back, either by automatic recovery or from a backup plus the transaction log.
You can combine many statements and commit them all at the same time with the COMMIT statement (if autocommit is disabled).
You can execute ROLLBACK to ignore your changes (if autocommit is disabled).
If an update fails, all your changes will be restored. (With non-transaction-safe tables, all changes that have taken place are permanent.)
Transaction-safe storage engines can provide better concurrency for tables that get many updates concurrently with reads.
Non-transaction-safe tables have several advantages of their own, all of which occur because there is no transaction overhead:
Much faster
Lower disk space requirements
Less memory required to perform updates
You can combine transaction-safe and non-transaction-safe tables in the same statements to get the best of both worlds.
120
Other MySQL Storage Engines and Issues
Single Storage Engine Advantages
One of the strengths of MySQL is its support for multiple storage engines, and at first glance it is indeed great to give users the same top-level SQL interface while letting them store their data in many different ways. As nice as it sounds in theory, this flexibility comes at a significant cost in performance and in operational and development complexity.
In practice, for probably 95% of applications a single storage engine would be good enough. People already tend not to mix multiple storage engines very actively because of the potential complications involved.
Now consider what we could have with a version of the MySQL server that drops everything but the InnoDB (or any other single) storage engine: we could save a lot of CPU cycles by making the storage format the same as the processing format. We could tune the optimizer to handle InnoDB's specifics well. We could get rid of SQL-level table locks, and use InnoDB's internal data dictionary instead of .frm files. We could use the InnoDB transactional log for replication. Finally, backups could be done safely.
A single-storage-engine server would also be a lot easier to test and operate.
This would not mean giving up flexibility completely; for example, one can imagine InnoDB tables that do not log changes and are hence faster for update operations. One could also lock them in memory to ensure predictable in-memory performance.
121
Schema Design and Performance
Schema Design Considerations
Good logical and physical design is the cornerstone of high performance, and you must design your schema for the specific queries you will run. This often involves trade-offs. For example, adding counter and summary tables is a great way to optimize queries, but they can be expensive to maintain. MySQL's particular features and implementation details influence this quite a bit. Most optimization tricks for MySQL focus on query performance or server tuning, but optimization starts with the design of the database schema. If you neglect to optimize the base of your database (its structure), you will pay the price for that laxity throughout your work with the database. Every storage engine has its own advantages and disadvantages, but regardless of the engine you choose, you should consider some items in your database schema.
As a quick rule of thumb, consider these initial few steps:
Do not index columns that you do not need in a SELECT
Use clever refactoring to accommodate changes to the current schema
Choose the minimal character set that fits the actual needs
Use triggers only when needed
122
Schema Design and Performance
Normalization and Performance
In a normalized database, each fact is represented once and only once. Conversely, in a denormalized database, information is duplicated, or stored in multiple places.
Database normalization is a process by which an existing schema is modified to bring its component tables into compliance with a series of progressive normal forms.
The goal of database normalization is to ensure that every non-key column in every table is directly dependent on the key, the whole key, and nothing but the key; with this goal come benefits in the form of reduced redundancies, fewer anomalies, and improved efficiency. While normalization is not the be-all and end-all of good design, a normalized schema provides a good starting point for further development.
123
Schema Design and Performance
Normalization and Performance
To see why normalization is usually the preferred approach, even in terms of performance, consider the drawbacks of splitting one logical table into many (for example, one table per customer):
You cannot write generic queries or views to access the data. Essentially, all queries in the code need to be dynamic, so you can substitute the right table name.
Maintaining the data becomes cumbersome. Instead of updating a single table, you have to update multiple tables.
Performance is a mixed bag. Although you might save the overhead of storing the customer id in each table, you incur another cost. Having lots of smaller tables means lots of tables with partially filled pages. Depending on the number of jobs per customer and number of overall customers, you might actually be multiplying the amount of space used. In the worst case of one job per customer where a page contains -- say -- 100 jobs, you would be multiplying the required space by about 100.
The last point also applies to the page cache in memory. So, data in one table that would fit into memory might not fit into memory when split among many tables.
Through the process of database normalization it's possible to bring the schema's tables into conformance with progressive normal forms. As a result the tables each represent a single entity (a book, an author, a subject, etc) and we benefit from decreased redundancy, fewer anomalies and improved efficiency.
124
Schema Design and Performance
Schema Design
The major schema design principle states you should use one table per object of interest. That means one table for users, one table for pages, one table for posts, etc. Use a normalized database for transactional data.
Although there are universally bad and good design principles, there are also issues that arise from how MySQL is implemented.
Too many columns. MySQL's storage engines interact with the server by copying rows through a row buffer. Extremely wide tables (hundreds of columns) can cause high CPU consumption even when only a few columns are actually used, and this costs the server in performance.
Too many joins. MySQL has a limitation of 61 tables per join. It’s better to have a dozen or fewer tables per query if you need queries to execute very fast with high concurrency.
ENUM. The enumerated value type can be a problem in database design. It is often preferable to use an INT foreign key into a lookup table for quick lookups.
SET. An ENUM permits the column to hold one value from a set of defined values. A SET permits the column to hold one or more values from a set of defined values: this may lead to confusion.
NULL. It's a good practice to avoid NULL when possible, but note that MySQL does index NULLs, unlike some databases (such as Oracle) that don't include non-values in indexes.
125
Schema Design and Performance
Data Types
MySQL supports a large variety of data types, and choosing the correct type to store your data is crucial to getting good performance.
Whole Numbers There are two kinds of numbers: whole numbers and real numbers (numbers with a fractional part). If you’re storing whole numbers, use one of the integer types: TINYINT, SMALLINT, MEDIUMINT, INT or BIGINT.
Real Numbers Real numbers are numbers that have a fractional part. However, they aren’t just for fractional numbers; you can also use DECIMAL to store integers that are so large they don’t fit in BIGINT. The FLOAT and DOUBLE types support approximate calculations with standard floating-point math.
String Types MySQL supports quite a few string data types, with many variations on each.
VARCHAR stores variable-length character strings and is the most common string data type.
CHAR is fixed-length: MySQL always allocates enough space for the specified number of characters.
BLOB and TEXT are string data types designed to store large amounts of data as either binary or character strings, respectively.
Using ENUM instead of a string type Sometimes you can use an ENUM column instead of conventional string types. An ENUM column can store a predefined set of distinct string values.
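A minimal sketch of an ENUM column, using a hypothetical table:

```sql
-- The status column accepts only the three listed string values;
-- internally each value is stored as a small integer index,
-- which is more compact than storing the string itself.
CREATE TABLE tickets (
    id     INT UNSIGNED NOT NULL AUTO_INCREMENT,
    status ENUM('open', 'in_progress', 'closed') NOT NULL DEFAULT 'open',
    PRIMARY KEY (id)
);
```

The trade-off is that changing the set of allowed values later requires an ALTER TABLE.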
126
Schema Design and Performance
Data Types
Date and Time Types
MySQL has many types for various kinds of date and time values, such as YEAR and DATE. The finest granularity of time MySQL can store is one second.
DATETIME This type can hold a large range of values, from the year 1000 to the year 9999, with a precision of one second.
TIMESTAMP The TIMESTAMP type stores the number of seconds elapsed since midnight, January 1, 1970, Greenwich Mean Time (GMT), the same as a Unix timestamp.
Special Types of Data
Some kinds of data don’t correspond directly to the available built-in types.
IPv4 address. People often use VARCHAR(15) to store the dotted-decimal IP address notation, or an unsigned 32-bit integer; MySQL provides the INET_ATON() and INET_NTOA() functions to convert between the two representations.
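For example (the table is hypothetical; INET_ATON() and INET_NTOA() are built-in MySQL functions):

```sql
-- Store the address compactly as an unsigned 32-bit integer
CREATE TABLE visits (
    ip INT UNSIGNED NOT NULL
);
INSERT INTO visits (ip) VALUES (INET_ATON('192.168.0.1'));  -- 3232235521

-- Convert back to dotted-decimal notation when reading
SELECT INET_NTOA(ip) FROM visits;  -- '192.168.0.1'
```

The integer form is four bytes instead of up to fifteen, and it sorts and compares numerically.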
127
Schema Design and Performance Indexes
Indexes (also called “keys” in MySQL) are data structures that storage engines use to find rows quickly. Without an index, MySQL must begin with the first row and then read through the entire table to find the relevant rows.
The easiest way to understand how an index works in MySQL is to think about the index in a book. To find out where a particular topic is discussed in a book, you look in the index, and it tells you the page number(s) where that term appears.
MySQL uses indexes for these operations:
To find the rows matching a WHERE clause quickly.
To eliminate rows from consideration. If there is a choice between multiple indexes, MySQL normally uses the index that finds the smallest number of rows.
To retrieve rows from other tables when performing joins. MySQL can use indexes on columns more efficiently if they are declared as the same type and size.
For comparisons between nonbinary string columns, both columns should use the same character set.
Comparison of dissimilar columns (for example, comparing a string column to a temporal or numeric column) may prevent use of indexes.
To find the MIN() or MAX() value for a specific indexed column key_col.
To sort or group a table if the sorting or grouping is done on a leftmost prefix of a usable key.
Indexes are less important for queries on small tables, or big tables where report queries process most or all of the rows. When a query needs to access most of the rows, reading sequentially is faster than working through an index. Sequential reads minimize disk seeks, even if not all the rows are needed for the query.
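The leftmost-prefix rule mentioned above can be illustrated with a hypothetical composite index:

```sql
CREATE TABLE staff (
    last_name  VARCHAR(50),
    first_name VARCHAR(50),
    dob        DATE,
    KEY idx_name_dob (last_name, first_name, dob)
);

-- These queries can use the index (leftmost prefix of the key):
SELECT * FROM staff WHERE last_name = 'Smith';
SELECT * FROM staff WHERE last_name = 'Smith' AND first_name = 'Anna';

-- This one cannot: first_name alone is not a leftmost prefix
SELECT * FROM staff WHERE first_name = 'Anna';
```

The same rule governs whether the index can be used for ORDER BY or GROUP BY.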
128
Schema Design and Performance Indexes
Types of Indexes
There are many types of indexes, each designed to perform well for different purposes. Indexes are implemented in the storage engine layer, not the server layer: so they are not standardized. Indexing works slightly differently in each engine, and not all engines support all types of indexes.
B-Tree Indexes
This is the default index type for most storage engines in MySQL. The general idea of a B-Tree is that all the values are stored in order, and each leaf page is the same distance from the root. A B-Tree index speeds up data access because the storage engine doesn’t have to scan the whole table to find the desired data. Instead, it starts at the root node and navigates down to the leaf page that holds the value.
Hash indexes
A hash index is built on a hash table and is useful only for exact lookups that use every column in the index. For each row, the storage engine computes a hash code of the indexed columns, which is a small value that will probably differ from the hash codes computed for other rows with different key values. It stores the hash codes in the index and stores a pointer to each row in a hash table.
Spatial (R-Tree) indexes
MyISAM supports spatial indexes, which you can use with geospatial types such as GEOMETRY. Unlike B-Tree indexes, spatial indexes don’t require WHERE clauses to operate on a leftmost prefix of the index. They index the data by all dimensions at the same time. As a result, lookups can use any combination of dimensions efficiently.
Full-text indexes
FULLTEXT is a special type of index that finds keywords in the text instead of comparing values directly to the values in the index. It is much more analogous to what a search engine does than to simple WHERE parameter matching.
129
Schema Design and Performance
Partitioning
Partitioning is performed by logically dividing one large table into small physical fragments.
Partitioning may bring several advantages:
In some situations query performance can be significantly increased, especially when the most intensively used table area is a separate partition or a small number of partitions. Such a partition and its indexes are more easily placed in the memory than the index of the whole table.
When queries or updates use a large percentage of one partition, performance may be increased simply through more beneficial sequential access to that partition on disk, instead of using the index and random read access across the whole table. B-Tree indexes on the partitioning columns (for example, an (itemid, clock) index on time-series data) benefit substantially in performance from partitioning.
Mass INSERT and DELETE operations can be performed by simply adding or dropping partitions, as long as this possibility was planned for when the partitioning was created. The ALTER TABLE statement will work much faster than any statement for mass insertion or deletion.
It is not possible to place InnoDB tables in arbitrary tablespaces in MySQL; you get one directory per database. Thus, to move a table partition file to another medium, it must be physically copied there and then referenced using a symbolic link.
130
Schema Design and Performance
Partitioning
Partitioned Tables
A partitioned table is a single logical table that’s composed of multiple physical subtables. The way MySQL implements partitioning means that indexes are defined per-partition, rather than being created over the entire table.
How Partitioning Works
As we’ve mentioned, partitioned tables have multiple underlying tables, which are represented by Handler objects. You can’t access the partitions directly. Each partition is managed by the storage engine in the normal fashion (all partitions must use the same storage engine), and any indexes defined over the table are actually implemented as identical indexes over each underlying partition.
Types of Partitioning
MySQL supports several types of partitioning. The most common type we’ve seen used is range partitioning, in which each partition is defined to accept a specific range of values for some column or columns, or a function over those columns. The next slides bring further details.
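A sketch of range partitioning on a hypothetical time-series table, showing how dropping a partition replaces a mass DELETE:

```sql
-- Partition by the integer timestamp column; the VALUES LESS THAN
-- expressions are evaluated to constants at CREATE time.
CREATE TABLE measurements (
    itemid INT NOT NULL,
    clock  INT NOT NULL  -- Unix timestamp
)
PARTITION BY RANGE (clock) (
    PARTITION p2022 VALUES LESS THAN (UNIX_TIMESTAMP('2023-01-01')),
    PARTITION p2023 VALUES LESS THAN (UNIX_TIMESTAMP('2024-01-01')),
    PARTITION pmax  VALUES LESS THAN MAXVALUE
);

-- Purge a year of old data by dropping its partition,
-- which is far faster than DELETE FROM ... WHERE clock < ...
ALTER TABLE measurements DROP PARTITION p2022;
```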
131
MySQL Query Performance
General SQL Tuning Best Practices
The goals of writing any SQL statement include delivering quick response times, using the least CPU resources, and achieving the fewest number of I/O operations, although real-life situations do not always allow every one of these so-called best practices to be applied.
Do not use SELECT * in your queries.
Always write the required column names after the SELECT statement: this technique results in reduced disk I/O and better performance.
Always use table aliases when your SQL statement involves more than one source.
If more than one table is involved in a FROM clause, each column name must be qualified using either the complete table name or an alias; the alias is preferred. It is more readable to use aliases than to write columns with no table information.
Use the more readable ANSI-Standard Join clauses instead of the old style joins.
With ANSI joins, the WHERE clause is used only for filtering data, whereas with old-style joins the WHERE clause handles both the join condition and the filtering. Furthermore, the ANSI join syntax supports the full outer join.
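A side-by-side sketch with hypothetical tables:

```sql
-- Old-style join: join condition and filter both live in WHERE
SELECT o.id, c.name
FROM orders o, customers c
WHERE o.customer_id = c.id
  AND o.total > 100;

-- ANSI join: join condition in ON, WHERE used only for filtering
SELECT o.id, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.total > 100;
```

Both queries return the same rows; the ANSI form makes the join logic explicit and easier to audit.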
132
MySQL Query Performance
General SQL Tuning Best Practices
Do not use column numbers in the ORDER BY clause.
Always use column names in an order by clause. Avoid positional references.
Always use a column list in your INSERT statements.
Always specify the target columns when executing an insert command. This helps in avoiding problems when the table structure changes (like adding or dropping a column).
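For example, with a hypothetical users table:

```sql
-- Fragile: depends on the table's column order and count
INSERT INTO users VALUES (1, 'alice', 'alice@example.com');

-- Robust: survives added or reordered columns
INSERT INTO users (id, login, email)
VALUES (1, 'alice', 'alice@example.com');
```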
Always use a SQL formatter to format your SQL.
The formatting of SQL code may not seem that important, but consistent formatting makes it easier for others to scan and understand your code. SQL statements have a structure, and having that structure be visually evident makes it much easier to locate and verify various parts of the statements. Uniform formatting also makes it much easier to add sections to and remove them from complex SQL statements for debugging purposes.
133
MySQL Query Performance
EXPLAIN
The EXPLAIN command is the main way to find out how the query optimizer decides to execute queries. This feature has limitations and doesn’t always tell the truth, but its output is the best information available, and it’s worth studying so you can learn how your queries are executed. Learning to interpret EXPLAIN will also help you learn how MySQL’s optimizer works.
To use EXPLAIN, simply add the word EXPLAIN just before the SELECT keyword in your query. MySQL will set a flag on the query. When it executes the query, the flag causes it to return information about each step in the execution plan, instead of executing it. It returns one or more rows, which show each part of the execution plan and the order of execution.
134
MySQL Query Performance
EXPLAIN
EXPLAIN tells you:
In which order the tables are read
What types of read operations are made
Which indexes could have been used
Which indexes are used
How the tables refer to each other
How many rows the optimizer estimates to retrieve from each table
135
MySQL Query Performance EXPLAIN
EXPLAIN example
mysql> explain select * from actor where 1;
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
| 1 | SIMPLE | actor | ALL | NULL | NULL | NULL | NULL | 200 | |
+----+-------------+-------+------+---------------+------+---------+------+------+-------+
1 row in set (0.00 sec)
mysql> explain select * from actor where actor_id = 192;
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
| 1 | SIMPLE | actor | const | PRIMARY | PRIMARY | 2 | const | 1 | |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
1 row in set (0.00 sec)
136
MySQL Query Performance
EXPLAIN - Output
Column: Description
id: The SELECT identifier
select_type: The SELECT type
table: The table for the output row
partitions: The matching partitions
type: The join type
possible_keys: The possible indexes to choose
key: The index actually chosen
key_len: The length of the chosen key
ref: The columns compared to the index
rows: Estimate of rows to be examined
filtered: Percentage of rows filtered by table condition
Extra: Additional information
137
MySQL Query Performance
EXPLAIN - Types
Type: Description
system: The table has only one row
const: At most one matching row, treated as a constant
eq_ref: One row per row from previous tables
ref: Several rows with matching index value
ref_or_null: Like ref, plus NULL values
index_merge: Several index searches are merged
unique_subquery: Same as ref for some subqueries
index_subquery: As above, for non-unique indexes
range: A range index scan
index: The whole index is scanned
ALL: A full table scan
138
MySQL Query Performance
EXPLAIN - SELECT
SELECT TYPE Description
simple Simple SELECT (not using UNION or subqueries)
primary Outermost SELECT
union Second or later SELECT statement in a UNION
dependent union Second or later SELECT statement in a UNION, dependent on outer query
union result Result of a UNION.
subquery First SELECT in subquery
dependent subquery First SELECT in subquery, dependent on outer query
derived Derived table SELECT (subquery in FROM clause)
uncacheable subquery A subquery for which the result cannot be cached and must be re-evaluated for each row of the outer query
uncacheable union The second or later select in a UNION that belongs to an uncacheable subquery
139
MySQL Query Performance
EXPLAIN – Performance troubleshooting
In a real-world application there are many tables with numerous relations between them, and it is sometimes hard to anticipate the optimal way to write a query. The following sample query deliberately joins tables that have no indexes or primary keys, purely to demonstrate the impact of such a bad design.
EXPLAIN SELECT * FROM
orderdetails d
INNER JOIN orders o ON d.orderNumber = o.orderNumber
INNER JOIN products p ON p.productCode = d.productCode
INNER JOIN productlines l ON p.productLine = l.productLine
INNER JOIN customers c on c.customerNumber = o.customerNumber
WHERE o.orderNumber = 10101\G
140
MySQL Query Performance EXPLAIN – Performance troubleshooting
********************** 1. row **********************
id: 1
select_type: SIMPLE
table: l
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 7
Extra:
********************** 2. row **********************
id: 1
select_type: SIMPLE
table: p
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 110
Extra: Using where; Using join buffer
141
MySQL Query Performance EXPLAIN – Performance troubleshooting
********************** 3. row **********************
id: 1
select_type: SIMPLE
table: c
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 122
Extra: Using join buffer
********************** 4. row **********************
id: 1
select_type: SIMPLE
table: o
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 326
Extra: Using where; Using join buffer
142
MySQL Query Performance
EXPLAIN – Performance troubleshooting
********************** 5. row **********************
id: 1
select_type: SIMPLE
table: d
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 2996
Extra: Using where; Using join buffer
5 rows in set (0.00 sec)
The above result shows all the symptoms of a bad query. Even a better-written query would produce the same plan, because there are no indexes. The join type is "ALL" (the worst possible), which means MySQL could not identify any keys usable for the join; hence the possible_keys and key columns are NULL.
143
MySQL Query Performance
EXPLAIN – Performance troubleshooting
Now let's add some obvious indexes, such as primary keys for each table, and execute the query once again. As a general rule of thumb, the columns used in the JOIN clauses of a query are good candidates for keys, because MySQL will always scan those columns to find matching records. Re-running the same query after adding the indexes produces a result like this:
********************** 1. row **********************
id: 1
select_type: SIMPLE
table: o
type: const
possible_keys: PRIMARY,customerNumber
key: PRIMARY
key_len: 4
ref: const
rows: 1
Extra:
144
MySQL Query Performance EXPLAIN – Performance troubleshooting
********************** 2. row **********************
id: 1
select_type: SIMPLE
table: c
type: const
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: const
rows: 1
Extra:
********************** 3. row **********************
id: 1
select_type: SIMPLE
table: d
type: ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: const
rows: 4
Extra:
145
MySQL Query Performance EXPLAIN – Performance troubleshooting
********************** 4. row **********************
id: 1
select_type: SIMPLE
table: p
type: eq_ref
possible_keys: PRIMARY,productLine
key: PRIMARY
key_len: 17
ref: classicmodels.d.productCode
rows: 1
Extra:
********************** 5. row **********************
id: 1
select_type: SIMPLE
table: l
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 52
ref: classicmodels.p.productLine
rows: 1
Extra:
After adding indexes, the number of records scanned has been brought down to 1 × 1 × 4 × 1 × 1 = 4. That means for each record with orderNumber 10101 in the orderdetails table, MySQL was able to directly find the matching record in all other tables using the indexes and didn’t have to resort to scanning the entire table.
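The slides do not list the index statements that were added. A plausible sketch for the classicmodels sample schema, consistent with the possible_keys values reported in the EXPLAIN output above, would be:

```sql
-- Hypothetical DDL, for illustration only; it matches the keys the
-- EXPLAIN output above reports (PRIMARY plus the customerNumber and
-- productLine secondary indexes).
ALTER TABLE orders       ADD PRIMARY KEY (orderNumber),
                         ADD KEY customerNumber (customerNumber);
ALTER TABLE customers    ADD PRIMARY KEY (customerNumber);
ALTER TABLE orderdetails ADD PRIMARY KEY (orderNumber, productCode);
ALTER TABLE products     ADD PRIMARY KEY (productCode),
                         ADD KEY productLine (productLine);
ALTER TABLE productlines ADD PRIMARY KEY (productLine);
```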
146
MySQL Query Performance
MySQL Optimizer
The MySQL Query Optimizer
The goal of the MySQL optimizer is to take an SQL query as input and produce an optimal execution plan for it.
When you issue a query that selects rows, MySQL analyzes it to see if any optimizations can be used to process the query more quickly. In this section, we'll look at how the query optimizer works.
The MySQL query optimizer takes advantage of indexes, of course, but it also uses other information.
For example, if you issue the following query, MySQL will execute it very quickly, no matter how large the table is:
SELECT * FROM tbl_name WHERE 0;
In this case, MySQL looks at the WHERE clause, realizes that no rows can possibly satisfy the query, and doesn't even bother to search the table. You can see this by issuing an EXPLAIN statement, which tells MySQL to display some information about how it would execute a SELECT query without actually executing it.
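As a quick sketch (exact formatting depends on the server version):

```sql
-- EXPLAIN on a query whose WHERE clause can never be true:
EXPLAIN SELECT * FROM tbl_name WHERE 0;
-- The Extra column reports "Impossible WHERE": the optimizer has proven
-- that no row can match, so the table is never read.
```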
The optimizer itself is always active; its decision process can be traced by enabling the optimizer trace:
SET optimizer_trace='enabled=on';
147
MySQL Query Performance
MySQL Optimizer
How the Optimizer Works
The MySQL query optimizer has several goals, but its primary aims are to use indexes whenever possible and to use the most restrictive index in order to eliminate as many rows as possible as soon as possible.
The reason the optimizer tries to reject rows is that the faster it can eliminate rows from consideration, the more quickly the rows that do match your criteria can be found. Queries can be processed more quickly if the most restrictive tests can be done first. You can help the optimizer take advantage of indexes by using the following guidelines:
Try to compare columns that have the same data type. When you use indexed columns in comparisons, use columns that are of the same type. Identical data types will give you better performance than dissimilar types.
Try to make indexed columns stand alone in comparison expressions. If you use a column in a function call or as part of a more complex term in an arithmetic expression, MySQL can't use the index because it must compute the value of the expression for every row.
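The second guideline can be sketched in SQL (employees and its indexed hire_date column are hypothetical, not part of the course schema):

```sql
-- The index on hire_date CANNOT be used here: the function must be
-- evaluated for every row before the comparison.
SELECT * FROM employees WHERE YEAR(hire_date) = 2020;

-- The equivalent range condition leaves the column standing alone,
-- so the index CAN be used.
SELECT * FROM employees
WHERE hire_date >= '2020-01-01' AND hire_date < '2021-01-01';
```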
148
MySQL Query Performance
MySQL Optimizer
How the Optimizer Works
Don't use wildcards at the beginning of a LIKE pattern. A string search whose pattern starts with '%' cannot use an index, so don't put '%' on both sides of the string simply out of habit.
Use EXPLAIN to verify optimizer operation. The EXPLAIN statement can tell you whether indexes are being used. This information is helpful when you're trying different ways of writing a statement or checking whether adding indexes actually will make a difference in query execution efficiency.
Give the optimizer hints when necessary. Normally, the MySQL optimizer considers itself free to determine the order in which to scan tables to retrieve rows most quickly. On occasion, the optimizer will make a non-optimal choice. If you find this happening, you can override the optimizer's choice using the STRAIGHT_JOIN keyword.
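Two of the guidelines above, sketched in SQL (the customers/orders tables and the last_name index are hypothetical):

```sql
-- Leading '%' defeats an index on last_name: MySQL must scan every row.
SELECT * FROM customers WHERE last_name LIKE '%son';

-- A pattern anchored at the start can use the index as a range scan.
SELECT * FROM customers WHERE last_name LIKE 'Ander%';

-- STRAIGHT_JOIN forces the tables to be joined in the written order,
-- overriding the optimizer's choice.
SELECT STRAIGHT_JOIN c.last_name, o.order_date
FROM customers c
INNER JOIN orders o ON o.customer_id = c.customer_id;
```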
149
MySQL Query Performance
MySQL Optimizer
How the Optimizer Works
Take advantage of areas in which the optimizer is more mature. MySQL can do joins and subqueries, but subquery support is more recent, having been added in MySQL 4.1. Consequently, the optimizer has been better tuned for joins than for subqueries in some cases.
Test alternative forms of queries, but run them more than once. When testing alternative forms of a query (for example, a subquery versus an equivalent join), run it several times each way. If you run a query only once each of two different ways, you'll often find that the second query is faster just because information from the first query is still cached and need not actually be read from the disk.
Avoid overuse of MySQL's automatic type conversion. MySQL will perform automatic type conversion, but if you can avoid conversions, you may get better performance.
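A classic conversion pitfall, assuming a hypothetical indexed VARCHAR column phone:

```sql
-- Comparing a string column to a numeric literal forces MySQL to
-- convert the column value for every row, so the index is unusable:
SELECT * FROM directory WHERE phone = 5551212;

-- Comparing like with like lets the index do its job:
SELECT * FROM directory WHERE phone = '5551212';
```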
150
MySQL Query Performance
MySQL Optimizer
Overriding Optimization
It sounds odd, but there may be times when you'll want to defeat MySQL's optimization behaviour.
To override the optimizer's table join order. Use STRAIGHT_JOIN to force the optimizer to use tables in a particular order. If you do this, you should order the tables so that the first table is the one from which the smallest number of rows will be chosen.
To empty a table with minimal side effects. When you need to empty a MyISAM table completely, it's fastest to have the server just drop the table and re-create it based on the description stored in its .frm file. To do this, use a TRUNCATE TABLE statement.
151
MySQL Query Performance
Finding Problematic Queries
Database performance is affected by many factors, one of them being the query optimizer. To be sure the optimizer is not introducing noise into well-functioning queries, we must analyze any slow queries. Watch the slow query log first, as stated previously in the course. By default, the slow query log is disabled. To specify the initial slow query log state explicitly, use
mysqld --slow_query_log[={0|1}]
With no argument or an argument of 1, --slow_query_log enables the log. With an argument of 0, this option disables the log.
One of the best tools for this kind of query analysis is pt-query-digest from Percona. It is a third-party tool that can work from logs, the processlist, or tcpdump captures.
You also need the log to include all the queries, not just those that take more than N seconds. The reason is that some queries are individually quick, and would not be logged if you set the long_query_time configuration variable to 1 or more seconds. You want that threshold to be 0 seconds while you’re collecting logs.
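On servers where these variables are dynamic (MySQL 5.1 and later), the capture threshold can also be changed at runtime rather than on the command line; a sketch:

```sql
-- Enable the slow query log and capture every statement while collecting.
SET GLOBAL slow_query_log = 1;
SET GLOBAL long_query_time = 0;
-- ...run the workload for a representative period, then restore a saner
-- threshold (the new value applies to connections opened after the change):
SET GLOBAL long_query_time = 1;
```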
152
MySQL Query Performance
Finding Problematic Queries
Another good practice involves the processlist and SHOW EXPLAIN:
mysql> show processlist;
mysql> show explain for <connection_id>;
Note that SHOW EXPLAIN FOR is provided by MariaDB and Percona Server; stock MySQL 5.7 and later offers the equivalent EXPLAIN FOR CONNECTION <connection_id>.
An evolution of this approach comes from the performance_schema database, which can be analyzed via queries against tables such as:
events_statements_summary_by_digest
count_star, sum_timer_wait, min_timer_wait, avg_timer_wait, max_timer_wait
digest_text, digest
sum_rows_examined, sum_created_tmp_disk_tables, sum_select_full_join
events_statements_history
sql_text, digest_text, digest
timer_start, timer_end, timer_wait
rows_examined, created_tmp_disk_tables, select_full_join
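For example, a "worst queries by total time" report built from the first table above (column names as in performance_schema; the exact selection is a sketch):

```sql
SELECT digest_text,
       count_star,
       sum_timer_wait,
       avg_timer_wait,
       sum_rows_examined
FROM performance_schema.events_statements_summary_by_digest
ORDER BY sum_timer_wait DESC
LIMIT 10;
```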
153
MySQL Query Performance
Improve Query Execution
One nice feature added to the EXPLAIN statement in MySQL 4.1 is the EXTENDED keyword, which provides helpful additional information on query optimization. It should be used together with SHOW WARNINGS to see how the query looks after transformation, as well as any other notes the optimizer wishes to report.
While it may look like a regular EXPLAIN statement, EXPLAIN EXTENDED makes MySQL record the statement in its optimized form; running SHOW WARNINGS afterwards prints out that optimized SELECT statement.
For example, the optimizer's transformations of the statement below can be analyzed by adding the EXPLAIN EXTENDED prefix:
EXPLAIN EXTENDED SELECT COUNT(*) FROM employees WHERE id IN (SELECT emp_id FROM bonuses);
The resulting output table is very much like the one produced by a regular EXPLAIN, except for the added filtered column in the second-to-last position. The filtered column is an estimated percentage of table rows that will pass the table condition; thus the rows column shows the estimated number of rows examined, and rows × filtered / 100 gives the number of rows that will be joined with previous tables.
Applying EXPLAIN EXTENDED to our query gives us the opportunity to run SHOW WARNINGS afterwards to see the final optimized query:
SHOW WARNINGS;
154
MySQL Query Performance
Locate and Correct Problematic Queries
Finding bad queries is a big part of optimization. Queries, or groups of queries, are bad because:
they are slow and provide a bad user experience
they add too much load to the system
they block other queries from running
In the real world, problematic queries typically stem from one of these situations:
Bad query plan: rewrite the query, or force a good plan.
Bad optimizer settings: tune them.
Query is inherently complex: don't waste time on it; look for other solutions.
155
MySQL Query Performance Locate and Correct Problematic Queries
Baseline. Always establish the current baseline of MySQL performance before any changes are made. Otherwise it is really only a guess afterwards whether the changes improved MySQL performance. The easiest way to baseline MySQL performance is with mysqlreport.
Assess Baseline. The report that mysqlreport writes can contain a lot of information, but for our purpose here there are only three things we need to look at. It is not necessary to understand the nature of these values at this point, but they give us an idea how well or not MySQL is really running.
Log Slow Queries and Wait. By default MySQL does not log slow queries, and the default slow query time is 10 seconds. Change this by adding these lines under the [mysqld] section in /etc/my.cnf:
log-slow-queries
long_query_time = 1
Restart MySQL and wait at least a full day. This will cause MySQL to log all queries which take longer than 1 second to execute.
Isolate Top 10 Slow Queries. The easiest way to isolate the top 10 slowest queries in the slow queries log is to use mysqlsla. Run mysqlsla on your slow queries log and save the output to a file. For example: "mysqlsla --log-type slow /var/lib/mysql/slow_queries.log > ~/top_10_slow_queries". That command will create a file in your home directory called top_10_slow_queries.
Post-fix Proof. Presuming that your MySQL expert was able to fix the top slow queries, the final step is to prove this is actually the case and not just coincidence. Restart MySQL and wait as long as MySQL had run in the first step (ideally at least a day). Then baseline MySQL performance again with mysqlreport. Compare the first report with this second report, specifically the three values we looked at in step two (Read ratio, Slow, and Waited).
156
Performance Tuning Extras
Configuring Hardware
Your MySQL server can perform only as well as its weakest link, and the operating system and the hardware on which it runs are often limiting factors. The disk size, the available memory and CPU resources, the network, and the components that link them all limit the system's ultimate capacity. MySQL requires significant amounts of memory to provide optimal performance. The fastest and most effective change you can make is to increase the amount of RAM on your database server: get as much as possible (e.g. 4GB or more). Increasing primary memory reduces the need for processes to swap to disk and enables your server to handle more users.
157
Performance Tuning Extras
Configuring Hardware
Better performance is gained by obtaining the best processor capability you can, e.g. dual or dual-core processors. A modern BIOS should allow you to enable hyperthreading; check whether this makes a difference to the overall performance of the processors by using a CPU benchmarking tool.
If you can afford them, use SCSI hard disks instead of SATA drives. SATA drives will increase your system's CPU utilization, whereas SCSI drives have their own integrated processors and come into their own when you have multiple drives. If you must have SATA drives, check that your motherboard and the drives themselves support NCQ (Native Command Queuing).
Purchase hard disks with a low seek time. This will improve the overall speed of your system, especially when accessing MySQL tablespaces and datafiles.
158
Performance Tuning Extras
Configuring Hardware
Size your swap file correctly. The general advice is to set it to 4 x physical RAM.
Use a RAID disk system. Although there are many different RAID configurations you can create, the following generally works best:
install a hardware RAID controller
the operating system and swap drive on one set of disks configured as RAID-1.
MySQL server on another set of disks configured as RAID-5 or RAID-10.
Use gigabit ethernet for improved latency and throughput. This is especially important when you have your webserver and database server separated out on different hosts.
Check the settings on your network card. You may get an improvement in performance by increasing the use of buffers and transmit/receive descriptors (balance this with processor and memory overheads) and off-loading TCP checksum calculation onto the card instead of the OS.
159
Performance Tuning Extras
Considering Operating Systems
You can use Linux (recommended), Unix-based, Windows or Mac OS X for the server operating system. *nix operating systems generally require less memory than Mac OS X or Windows servers for doing the same task as the server is configured with just a shell interface. Additionally Linux does not have licensing fees attached, but can have a big learning curve if you're used to another operating system. If you have a large number of processors running SMP, you may also want to consider using a highly tuned OS such as Solaris.
Check your own OS and vendor specific instructions for optimization steps.
For Linux look at the Linux Performance Team site.
On Linux, investigate the hdparm command; e.g. hdparm -m16 -d1 can be used to enable read/write on multiple sectors and DMA. Mount disks with the async and noatime options.
For Windows set the server to be optimized for network applications (Control Panel, Network Connections, LAN connection, Properties, File & Printer Sharing for Microsoft Networks, Properties, Optimization). You can also search the Microsoft TechNet site for optimization documents.
160
Performance Tuning Extras
Operating Systems Configurations
Windows
If you installed MySQL on a Windows system using the Windows Installation Wizard, most of the work is already done. When that wizard completes, it most likely launched the MySQL Configuration Wizard, which walks you through configuring the database. When the wizard starts for the first time, it asks whether you'd like to perform a standard or a detailed configuration. The standard configuration process consists of two steps: service options and security options. You'll first see a screen asking if you'd like to install MySQL as a service; in most cases you should select this option, since running the database as a service lets it run in the background without requiring user interaction. The second phase of the standard configuration lets you set two types of security settings. The first is the use of a root password, which is strongly recommended; this root password controls access to the most sensitive administration tasks on your server. The second option is whether to have an anonymous user account; to increase the security of your system, we recommend that you do not enable it unless absolutely necessary.
161
Performance Tuning Extras
Operating Systems Configurations
Linux
Whatever distribution you choose, the configuration is based on the my.cnf file. In most cases, you should not touch this file. By default, it will have the following entries:
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
[mysql.server]
user=mysql
basedir=/var/lib
[safe_mysqld]
err-log=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
162
Performance Tuning Extras
Logging
MySQL Server has several logs that can help you find out what activity is taking place.
Error log Problems encountered starting, running, or stopping mysqld
General query log Established client connections and statements received from clients
Binary log Statements that change data (also used for replication)
Relay log Data changes received from a replication master server
Slow query log Queries that took more than long_query_time seconds to execute
By default, no logs are enabled; when a log is enabled, the server writes its file in the data directory unless another location is specified.
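In MySQL 5.1 and later, the general and slow query logs can also be switched on at runtime, without a restart (the binary and relay logs still require configuration-file changes). A sketch:

```sql
SET GLOBAL general_log_file = '/var/log/mysql/mysql.log';
SET GLOBAL general_log = 'ON';
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 2;  -- seconds; applies to new connections
```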
163
Performance Tuning Extras
Logging
Logging parameters are located under [mysqld] section in /etc/my.cnf configuration file. A typical schema should be the following:
[mysqld]
log-bin=/var/log/mysql-bin.log
log=/var/log/mysql.log
log-error=/var/log/mysql-error.log
log-slow-queries=/var/log/mysql-slowquery.log
164
Performance Tuning Extras Logging Error Log
Error Log goes to syslog due to /etc/mysql/conf.d/mysqld_safe_syslog.cnf, which contains the following:
[mysqld_safe]
syslog
General Query Log
To enable General Query Log, uncomment (or add) the relevant lines
general_log_file = /var/log/mysql/mysql.log
general_log = 1
Slow Query Log
To enable Slow Query Log, uncomment (or add) the relevant lines
log_slow_queries = /var/log/mysql/mysql-slow.log
long_query_time = 2
log-queries-not-using-indexes
Restart MySQL server after changes
This method requires a server restart.
$ service mysql restart
165
Performance Tuning Extras
Backup and Recovery
It is important to back up your databases so that you can recover your data and be up and running again in case problems occur, such as system crashes, hardware failures, or users deleting data by mistake. Backups are also essential as a safeguard before upgrading a MySQL installation, and they can be used to transfer a MySQL installation to another system or to set up replication slave servers.
166
Performance Tuning Extras Backup and Recovery
Logical Backups
Logical Backup (mysqldump)
Amongst other things, the mysqldump command allows you to do logical backups of your database by producing the SQL statements necessary to rebuild all the schema objects. An example is shown below.
$ # All DBs
$ mysqldump --user=root --password=mypassword --all-databases > all_backup.sql
$ # Individual DB (or comma separated list for multiple DBs)
$ mysqldump --user=root --password=mypassword mydatabase > mydatabase_backup.sql
$ # Individual Table
$ mysqldump --user=root --password=mypassword mydatabase mytable > mydatabase_mytable_backup.sql
Recovery from Logical Backup (mysql)
The logical backup created using the mysqldump command can be applied to the database using the MySQL command line tool, as shown below.
$ # All DBs
$ mysql --user=root --password=mypassword < all_backup.sql
$ # Individual DB
$ mysql --user=root --password=mypassword --database=mydatabase < mydatabase_backup.sql
167
Performance Tuning Extras
Backup and Recovery
Cold Backups
Cold backups are a type of physical backup as you copy the database files while the database is offline.
Cold Backup
The basic process of a cold backup involves stopping MySQL, copying the files, then restarting MySQL. You can use whichever method you want to copy the files (cp, scp, tar, zip, etc.).
# service mysqld stop
# cd /var/lib/mysql
# tar -cvzf /tmp/mysql-backup.tar.gz ./*
# service mysqld start
Recovery from Cold Backup
To recover the database from a cold backup, stop MySQL, restore the backup files and start MySQL again.
# service mysqld stop
# cd /var/lib/mysql
# tar -xvzf /tmp/mysql-backup.tar.gz
# service mysqld start
168
Performance Tuning Extras Backup and Recovery
Binary Logs : Point In Time Recovery (PITR)
Binary logs record all changes to the databases, which are important if you need to do a Point In Time Recovery (PITR). Without the binary logs, you can only recover the database to the point in time of a specific backup. The binary logs allow you to wind forward from that point by applying all the changes that were written to the binary logs. Unless you have a read-only system, it is likely you will need to enable the binary logs.
To enable the binary logs, edit the "/etc/my.cnf" file, uncommenting the "log_bin" entry.
# Remove leading # to turn on a very important data integrity option: logging
# changes to the binary log between backups.
log_bin
The binary logs will be written to the "datadir" location specified in the "/etc/my.cnf" file, with a default prefix of "mysqld". If you want to alter the prefix and path, you can do so by specifying an explicit base name.
# Prefix set to "mydb". Stored in the default location.
log_bin=mydb
# Files stored in "/u01/log_bin" with the prefix "mydb".
log_bin=/u01/log_bin/mydb
Restart the MySQL service for the change to take effect.
# service mysqld restart
The mysqlbinlog utility converts the contents of the binary logs to text, which can be replayed against the database.
169
Conclusion
Course Overview
Course Aims
Understand the basics of performance tuning
Use performance tuning tools
Tune the MySQL Server instance to improve performance
Improve performance of tables based on the storage engine being used
Implement proper Schema Design to improve performance
Improve the performance of MySQL Queries
Describe additional items related to performance tuning
170
Conclusion
Training and Certification Website
The following is a small list of sites of interest for related MySQL training courses.
Oracle University
http://education.oracle.com/pls/web_prod-plq-ad/db_pages.getpage?page_id=3
MySQL Training
http://www.mysql.it/training/
MySQL Certifications
http://www.mysql.it/certification/
171
Conclusion
Course evaluation
Please answer the questions in order to verify the knowledge achieved during this course. Thanks.
172
Conclusion
Thank you!
173
Conclusion
Q & A
174
Lab 1: Basic MySQL operations MySQL installation
On Debian Linux distros, this is done by entering the command:
$ sudo apt-get -y install mysql-server
Other distributions rely on similar commands, such as SuSE Zypper, Red Hat YUM and others.
Set root password
$ mysql -u root
mysql> SET PASSWORD FOR 'root'@'localhost' = PASSWORD('root');
Set host
mysql> GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY 'root' WITH GRANT OPTION;
mysql> FLUSH PRIVILEGES;
175
Lab 1: MySQL DB connection MySQL connection
On the command line, just type
$ mysql -u root -p
Then you are prompted to insert the password. Once entered, a banner greets you and a new command prompt appears:
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 70
Server version: 5.5.38-0+wheezy1-log (Debian)
Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql>
176
Lab 1: MySQL Environment OS commands
$ cat /proc/cpuinfo
$ cat /proc/meminfo
$ iostat -dx 5
$ netstat -an
$ dstat
177
Lab 1: MySQL Environment First MySQL server configuration. Find and edit the main
configuration file called "my.cnf", enter these values, then restart MySQL
[mysqld]
performance_schema
performance_schema_events_waits_history_size=20
performance_schema_events_waits_history_long_size=15000
log_slow_queries = slow_query.log
long_query_time = 1
log_queries_not_using_indexes = 1
$ service mysql restart
178
Lab 1: Benchmarks Try to use the native BENCHMARK() function to compare operators
mysql> SELECT BENCHMARK(100000000, CONCAT('a','b'));
Now try the same function against queries:
mysql> use sakila;
mysql> SELECT BENCHMARK(100, SELECT `actor_id` FROM `actor`);
Did it work? Why?
179
Lab 1: Storage engines Create a brand new table without specifying the engine to use:
use test;
mysql> CREATE TABLE char_test( char_col CHAR(10));
To see which tables use which engines:
mysql> SHOW TABLE STATUS;
Selecting the storage engine to use is a tuning decision
mysql> alter table char_test engine=myisam;
Re-run the previous command to see the differences:
mysql> SHOW TABLE STATUS;
180
Lab 1: I/O Benchmark Install “sysbench” and try to run it with simple options as shown before:
$ sysbench --test=fileio prepare
$ sysbench --test=fileio --file-test-mode=rndrw run
$ sysbench --test=fileio cleanup
Install “iozone” and try the same:
$ iozone -a
You can also save the output to a spreadsheet using iozone -b
$ ./iozone -a -b output.xls
181
Lab 2: Performance Enable Slow Query Log
Find and edit configuration file “my.cnf” with:
log_slow_queries = <example slow_query.log>
long_query_time = 1
log_queries_not_using_indexes = 1
Then restart the MySQL daemon
$ service mysql restart
Now run the Mysqldumpslow command, after some MySQL operations:
$ mysqldumpslow
or
$ mysqldumpslow <options> <example slow_query.log>
182
Lab 2: MySQL Query Cache Let’s assume we have a standard “my.cnf” configuration file. To enable
query cache, we have to edit it
$ vi /etc/mysql/my.cnf
Append the following lines and then restart the MySQL daemon
query_cache_size = 268435456
query_cache_type=1
query_cache_limit=1048576
$ service mysql restart
Now run a benchmark session and keep note of the results
$ mysqlslap -uroot -proot -h localhost --create-schema=sakila -i 5 -c 10 -q "select * from actor order by rand() limit 10"
183
Lab 2: MySQL Query Cache Disable the query cache in any of the following ways, from inside the MySQL prompt:
SET GLOBAL query_cache_size=0;
SHOW GLOBAL STATUS LIKE 'Qcache%';
SET SESSION query_cache_type=0;
Re-run the benchmark session and observe the differences
$ mysqlslap -uroot -proot -h localhost --create-schema=sakila -i 5 -c 10 -q "select * from actor order by rand() limit 10"
184
Lab 3: InnoDB Launch and figure out how InnoDB is set on the server:
SHOW ENGINE INNODB STATUS;
Enable the InnoDB logging facilities
mysql> use mysql;
mysql> CREATE TABLE innodb_monitor (a INT) ENGINE=INNODB;
mysql> CREATE TABLE innodb_lock_monitor (a INT) ENGINE=INNODB;
mysql> CREATE TABLE innodb_tablespace_monitor (a INT) ENGINE=INNODB;
mysql> CREATE TABLE innodb_table_monitor (a INT) ENGINE=INNODB;
185
Lab 3: MyISAM Choose and use any Sakila DB table to define a FULLTEXT index using the
ALTER TABLE statement:
mysql> ALTER TABLE table_name ADD FULLTEXT(column_name1, column_name2,…)
You can also use CREATE INDEX statement to create FULLTEXT index for existing tables.
mysql> CREATE FULLTEXT INDEX index_name ON table_name(idx_column_name,...)
Use any benchmark tool to see the differences in speed during queries without and with the fulltext indexing enabled.
186
Lab 3: MyISAM with Sphinx Example: create a table
CREATE TABLE `film` (
`film_id` smallint(5) unsigned NOT NULL
auto_increment,
`title` varchar(255) NOT NULL,
`description` text,
`last_update` timestamp NOT NULL default
CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP,
...
PRIMARY KEY (`film_id`),
...
) ENGINE=InnoDB ;
187
Lab 3: MyISAM with Sphinx Example: edit the sphinx.conf file
source film
{
type = mysql
sql_host = localhost
sql_user = sakila_ro
sql_pass = 123456
sql_db = sakila
sql_port = 3306 # optional, default is 3306
sql_query = \
SELECT film_id, title, UNIX_TIMESTAMP(last_update) AS
last_update_timestamp FROM film
sql_attr_int = film_id
sql_attr_timestamp = last_update_timestamp
sql_query_info = SELECT * FROM film WHERE film_id=$id
}
188
Lab 3: MyISAM with Sphinx Example: edit the sphinx.conf file
index film
{
source = film
path = /usr/bin/sphinx/data/film
}
Run queries
189
Lab 3: MyISAM with Sphinx Example: create a table using the Sphinx Storage Engine (SphinxSE)
CREATE TABLE sphinx_film
(
film_id INT NOT NULL,
weight INT NOT NULL,
query VARCHAR(3072) NOT NULL,
last_update INT,
INDEX(query)
) ENGINE=SPHINX
CONNECTION="sphinx://localhost:12321/film";
190
Lab 3: MyISAM with Sphinx Example: SphinxSE queries
SELECT * FROM sphinx_film WHERE query='drama';
SELECT * FROM sphinx_film INNER JOIN film USING (film_id) WHERE query='drama';
SELECT * FROM sphinx_film
INNER JOIN film USING (film_id) WHERE query='drama;limit=50';
SELECT * FROM sphinx_film
INNER JOIN film USING (film_id) WHERE
query='drama;limit=50;sort=attr_asc:last_update';
SELECT * FROM sphinx_film INNER JOIN film USING
(film_id) WHERE query='drama;limit=50;groupby=day:last_update';
191
Lab 4: Explain EXPLAIN
Suppose you want to rewrite the following UPDATE statement to make it EXPLAIN-able:
mysql> UPDATE sakila.actor
INNER JOIN sakila.film_actor USING (actor_id)
SET actor.last_update=film_actor.last_update;
The following EXPLAIN statement is not equivalent to the UPDATE, because it doesn't
require the server to retrieve the last_update column from either table:
mysql> EXPLAIN SELECT film_actor.actor_id
-> FROM sakila.actor
-> INNER JOIN sakila.film_actor USING (actor_id)\G
192
Lab 4: Explain EXPLAIN
This is better, closer to the original UPDATE:
mysql> EXPLAIN SELECT film_actor.last_update, actor.last_update
-> FROM sakila.actor
-> INNER JOIN sakila.film_actor USING (actor_id)\G
Rewriting queries like this is not an exact science, but it’s often good enough to help
you understand what a query will do.
193
Lab 4: Critical queries Practice with these commands:
mysql> show processlist;
mysql> show explain for <connection_id>; (MariaDB; use the Id column from show processlist)
Practice with the information_schema database
information_schema is the database where information about all the other databases is kept: database and table names, column data types, access privileges, and so on. It is a built-in virtual database whose sole purpose is to provide information about the database system itself. The MySQL server populates its tables automatically.
194
Lab 4: Performance_schema queries Once enabled, try to use the performance_schema monitoring database
$ vi /etc/my.cnf
[mysqld]
performance_schema=on
mysql> USE performance_schema;
mysql> SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'performance_schema';
mysql> SHOW TABLES FROM performance_schema;
mysql> SHOW CREATE TABLE setup_timers\G
mysql> UPDATE setup_instruments SET ENABLED = 'YES', TIMED = 'YES';
mysql> UPDATE setup_consumers SET ENABLED = 'YES';
mysql> SELECT * FROM events_waits_current\G
195
Lab 4: Performance_schema queries
mysql> SELECT THREAD_ID, NUMBER_OF_BYTES
-> FROM events_waits_history
-> WHERE EVENT_NAME LIKE 'wait/io/file/%'
-> AND NUMBER_OF_BYTES IS NOT NULL;
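To make sense of the rows that query returns, you would typically aggregate bytes per thread. A minimal Python sketch over hypothetical (THREAD_ID, NUMBER_OF_BYTES) rows (the sample values are invented, not from a real server):

```python
from collections import defaultdict

# Sum NUMBER_OF_BYTES per THREAD_ID from rows like those returned by the
# events_waits_history query above. Sample rows are illustrative.
def bytes_per_thread(rows):
    totals = defaultdict(int)
    for thread_id, nbytes in rows:
        totals[thread_id] += nbytes
    return dict(totals)

rows = [(11, 4096), (11, 16384), (12, 512)]
print(bytes_per_thread(rows))  # {11: 20480, 12: 512}
```

The same aggregation can of course be done in SQL with GROUP BY THREAD_ID; the sketch just shows the shape of the analysis.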
Performance Schema Runtime Configuration
mysql> SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES
-> WHERE TABLE_SCHEMA = 'performance_schema'
-> AND TABLE_NAME LIKE 'setup%';
196
Case studies
Case study n. 1
197
Case studies – Case study n. 1 Scope of Problem
Overnight the query performance went from <1ms to 50x worse.
Nothing changed in terms of server configuration, schema, etc.
Tried throttling the server to 1/2 of its workload
from 20k QPS to 10k QPS
no improvement
198
Case studies – Case study n. 1 Considerations
Change in config client doesn't know about?
Hardware problem such as a failing disk?
Load increase: data growth or QPS crossed a "tipping point"?
Schema changes client doesn't know about (missing index?)
Network component such as DNS?
199
Case studies – Case study n. 1 Elimination of easy possibilities:
ALL queries are found to be slower in slow-query-log
eliminates DNS as a possibility.
Queries are slow when run via Unix socket
eliminates network.
No errors in dmesg or RAID controller
suggests (doesn't eliminate) that hardware is not the problem.
Detailed historical metrics show no change in Handler_ graphs
suggests (doesn't eliminate) that indexing is not the problem.
Also, combined with the fact that ALL queries are 50x slower, very strong reason to believe indexing is not the problem.
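The Handler_ reasoning above can be made concrete: take two SHOW GLOBAL STATUS snapshots some seconds apart and compute per-second deltas of the Handler_ counters. A minimal Python sketch, with invented sample values (the counter names are real MySQL status variables):

```python
# Compare two SHOW GLOBAL STATUS snapshots taken `interval` seconds apart
# and report the per-second rate of each Handler_* counter. A jump in
# Handler_read_rnd_next relative to Handler_read_key would suggest full
# scans, i.e. a lost or unused index.
def handler_rates(before, after, interval):
    rates = {}
    for name, end_value in after.items():
        if name.startswith("Handler_"):
            rates[name] = (end_value - before[name]) / interval
    return rates

snap1 = {"Handler_read_key": 1_000_000, "Handler_read_rnd_next": 50_000}
snap2 = {"Handler_read_key": 1_060_000, "Handler_read_rnd_next": 50_500}
print(handler_rates(snap1, snap2, 60))
# Handler_read_key: 1000.0/s, with only a trickle of Handler_read_rnd_next
```

In the case study, flat Handler_ graphs across the incident are what ruled indexing out.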
200
Case studies – Case study n. 1 Investigation of the obvious:
Aggregation of SHOW PROCESSLIST shows queries are not in Locked status.
Investigating SHOW INNODB STATUS shows no problems with semaphores, transaction states such as "commit", main thread, or other likely culprits.
However, SHOW INNODB STATUS shows many queries with an empty ("") status, as here:
---TRANSACTION 4 3879540100, ACTIVE 0 sec, process no 26028, OS thread id 1344928080
MySQL thread id 344746, query id 1046183178 10.16.221.148 webuser
SELECT ....
All such queries are simple and well-optimized according to EXPLAIN.
The system has 8 CPUs (Intel(R) Xeon(R) E5450 @ 3.00GHz) and a RAID controller with 8 Intel X25-E SSD drives behind it, with BBU and WriteBack caching.
201
Case studies – Case study n. 1 vmstat 5
r b swpd free buff cache si so bi bo in cs us sy id wa
4 0 875356 1052616 372540 8784584 0 0 13 3320 13162 49545 18 7 75 0
4 0 875356 1070604 372540 8785072 0 0 29 4145 12995 47492 18 7 75 0
3 0 875356 1051384 372544 8785652 0 0 38 5011 13612 55506 22 7 71 0
iostat -dx 5
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 61.20 1.20 329.20 15.20 4111.20 24.98 0.03 0.09 0.09 3.04
dm-0 0.00 0.00 0.80 390.60 12.80 4112.00 21.08 0.03 0.08 0.07 2.88
mpstat 5
10:36:12 PM CPU %user %nice %sys %iowait %irq %soft %steal %idle intr/s
10:36:17 PM all 18.81 0.05 3.22 0.22 0.24 2.71 0.00 74.75 13247.40
10:36:17 PM 0 19.57 0.00 3.52 0.98 0.20 2.74 0.00 72.99 1939.00
10:36:17 PM 1 18.27 0.00 3.08 0.38 0.19 2.50 0.00 75.58 1615.40
202
Case studies – Case study n. 1 Premature Conclusion
As a result of all the above, we conclude that nothing external to the database is obviously the problem
The system is not virtualized
I expect the database to be able to perform normally.
What to do next?
Try to use a tool to make things easy.
Solution: use pt-ioprofile (from Percona Toolkit).
203
Case studies – Case study n. 1 Solution
Start innotop (just to have a realtime monitor)
Disable query cache.
Watch QPS change in innotop.
Additional Confirmation
The slow query log also confirms that queries are back to normal
tail -f /var/log/slow.log | perl pt-query-digest --run-time 30s --report-format=profile
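pt-query-digest's profile report is, at its core, an aggregation of Query_time per statement. A heavily simplified Python sketch of that idea (real fingerprinting normalizes literals and whitespace; this one groups on the raw statement text, and the sample log entries are invented):

```python
import re

# Sum Query_time per statement from slow-log text: a "# Query_time:" meta
# line records the time, and the next non-comment line is the statement.
def profile(log_text):
    totals = {}
    current_time = None
    for line in log_text.splitlines():
        m = re.match(r"# Query_time: ([\d.]+)", line)
        if m:
            current_time = float(m.group(1))
        elif current_time is not None and line and not line.startswith("#"):
            totals[line] = totals.get(line, 0.0) + current_time
            current_time = None
    return totals

log = """# Query_time: 0.50 Lock_time: 0.00
SELECT * FROM film WHERE film_id = 1;
# Query_time: 0.25 Lock_time: 0.00
SELECT * FROM film WHERE film_id = 1;"""
print(profile(log))  # {'SELECT * FROM film WHERE film_id = 1;': 0.75}
```

For real work, use pt-query-digest itself; the sketch only shows why the profile output lets you confirm response times at a glance.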
204
Case studies
Case study n. 2
205
Case studies – Case study n. 2 Information Provided
About 4PM on Saturday, queries suddenly began taking insanely long to complete
From sub-ms to many minutes.
As far as the customer knew, nothing had changed.
Nobody was at work.
They had disabled selected apps where possible to reduce load.
206
Case studies – Case study n. 2 Overview
They are running 5.0.77-percona-highperf-b13.
The server has an EMC SAN
with a RAID5 array of 5 disks, and LVM on top of that
Server has 2 quad-core CPUs (Xeon L5420 @ 2.50GHz).
No virtualization.
They tried restarting mysqld
It has 64GB of RAM, so it's not warm yet.
207
Case studies – Case study n. 2 Train of thought
The performance drop is way too sudden and large.
On a weekend, when no one is working on the system.
Something is seriously wrong.
Look for things wrong first.
208
Case studies – Case study n. 2 Elimination of easy possibilities:
First, confirm that queries are actually taking a long time to complete.
They all are, as seen in processlist.
Check the SAN status.
They checked and reported that it's not showing any errors or failed disks.
209
Case studies – Case study n. 2 Investigation of the obvious:
Server's incremental status variables don't look amiss
150+ queries in commit status.
Many transactions are waiting for locks inside InnoDB
But no semaphore waits, and main thread seems OK.
iostat and vmstat at 5-second intervals:
Suspicious IO performance and a lot of iowait
But virtually no work being done.
210
Case studies – Case study n. 2 iostat
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdb 0.00 49.00 10.00 104.00 320.00 8472.00 77.12 2.29 20.15 8.78 100.10
sdb1 0.00 49.00 10.00 104.00 320.00 8472.00 77.12 2.29 20.15 8.78 100.10
vmstat
r b swpd free buff cache si so bi bo in cs us sy id wa st
5 1 176 35607308 738468 19478720 0 0 48 351 0 0 1 0 96 3 0
0 1 176 35605912 738472 19478820 0 0 560 848 2019 2132 4 1 83 13 0
0 2 176 35605788 738480 19479048 0 0 608 872 2395 2231 0 1 85 14 0
From vmstat/iostat:
It looks like something is blocking commits
Likely to be either a serious bug (a transaction that has gotten the commit mutex and is hung?) or a hardware problem.
IO unreasonably slow, so that is probably the problem.
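The derived iostat columns can be cross-checked from the raw per-second numbers, a useful sanity check when judging whether a device is queueing. A small Python sketch using the sdb row above (field names follow iostat -dx output):

```python
# avgrq-sz is just total sectors per second divided by total requests per
# second; avgqu-sz can be estimated via Little's law (arrival rate times
# time in system).
def avg_request_size(r_s, w_s, rsec_s, wsec_s):
    return (rsec_s + wsec_s) / (r_s + w_s)

def queue_depth(r_s, w_s, await_ms):
    return (r_s + w_s) * await_ms / 1000.0

# sdb row from the slide: r/s=10, w/s=104, rsec/s=320, wsec/s=8472, await=20.15
print(round(avg_request_size(10, 104, 320, 8472), 2))  # 77.12, matches avgrq-sz
print(round(queue_depth(10, 104, 20.15), 2))           # 2.3, matches avgqu-sz 2.29
```

Here the numbers are self-consistent; what is anomalous is 100% utilization at only ~114 requests/second, which is what points at the IO subsystem itself.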
211
Case studies – Case study n. 2 Analysis
Because the system is not "doing anything,"
profiling where CPU time is spent is probably useless.
We already know that it's spent waiting on mutexes in the commit path, so oprofile will probably show nothing.
✦ Other options that come to mind:
profile IO calls with strace -c
benchmark the IO system, since it seems to be suspicious.
212
Case studies – Case study n. 2 Oprofile ★ As expected: nothing useful in oprofile
samples % symbol name
6331 15.3942 buf_calc_page_new_checksum
2008 5.1573 sync_array_print_long_waits
2004 4.8728 MYSQLparse(void*)
1724 4.1920 srv_lock_timeout_and_monitor_thread
1441 3.5039 rec_get_offsets_func
1098 2.6698 my_utf8_uni
780 1.8966 mem_pool_fill_free_list
762 1.8528 my_strnncollsp_utf8
682 1.6583 buf_page_get_gen
650 1.5805 MYSQLlex(void*, void*)
604 1.4687 btr_search_guess_on_hash
566 1.3763 read_view_open_now
strace -c ★ Nothing relevant after 30 seconds or so.
Process 24078 attached - interrupt to quit
Process 24078 detached
% time     seconds  usecs/call     calls    errors syscall
100.00 0.098978 14140 7 select
0.00 0.000000 0 7 accept
213
Case studies – Case study n. 2 Examine history
Look at 'sar' for historical reference.
Ask the client to look at their graphs to see if there are obvious changes around 4PM.
Observations
writes dropped dramatically around 4:40
at the same time iowait increased a lot
corroborated by the client's graphs
points to decreased performance of the IO subsystem
SAN attached by fibre channel, so it could be
this server
the SAN
the connection
the specific device on the SAN.
214
Case studies – Case study n. 2 Elimination of Options:
Benchmark /dev/sdb1 and see if it looks reasonable.
This box or the SAN?
Check the same thing from another server.
Tool: use iozone with the -I flag (O_DIRECT).
The result was 54 writes per second on the first iteration
canceled it after that because it took so long.
Conclusions
Customer said RAID failed after all
Moral of the story: information != facts
Customer's web browser had cached the SAN status page!
215
Case studies
Case study n. 3
216
Case studies – Case study n. 3 Information from the start
Sometimes (once every day or two) the server starts to reject connections with a max_connections error.
This lasts from 10 seconds to a couple of minutes and is sporadic.
Server specs:
16 cores
12GB of RAM, 900MB data
Data on Intel X25-E SSD
Running MySQL 5.1 with InnoDB Plugin
217
Case studies – Case study n. 3 Considerations
Pile-ups cause long queue waits?
thus incoming new connections exceed max_connections?
Pile-ups can be
the query cache
InnoDB mutexes
218
Case studies – Case study n. 3 Elimination
There are no easy possibilities.
We'd previously worked with this client and the DB wasn't the problem then.
Queries aren't perfect, but are still running in less than 10ms normally.
Investigation
Nothing is obviously wrong.
Server looks fine in normal circumstances.
219
Case studies – Case study n. 3 Analysis
We are going to have to capture server activity when the problem happens.
We can't do anything without good diagnostic data.
Decision: install 'collect' (from Aspersa) and wait.
For further info, please refer to Percona Aspersa Official Site:
http://www.percona.com/blog/2011/04/17/aspersa-tools-bit-ly-download-shortcuts/
After several pile-ups nothing very helpful was gathered
But then we got a good one
This took days (about a week)
Result of diagnostics data: too much information!
220
Case studies – Case study n. 3 During the Freeze
Connections increased from normal 5-15 to over 300.
QPS was about 1-10k.
Lots of Com_admin_commands.
Vast majority of "real" queries are Com_select (300-2000 per second)
There are only 5 or so Com_update per second; the other Com_ counters are zero.
No table locking.
Lots of query cache activity, but normal-looking.
no lowmem_prunes.
20 to 100 sorts per second
between 1k and 12k rows sorted per second.
221
Case studies – Case study n. 3 During the Freeze
Between 12 and 90 temp tables created per second
about 3 to 5 of them created on disk.
Most queries doing index scans or range scans – not full table scans or cross joins.
InnoDB operations are just reads, no writes.
InnoDB doesn't write much log or anything.
InnoDB status:
✦ InnoDB main thread was in "flushing buffer pool pages" and there were basically no dirty pages.
✦ Most transactions were waiting in the InnoDB queue.
"12 queries inside InnoDB, 495 queries in queue"
✦ The log flush process was caught up.
✦ The InnoDB buffer pool wasn't even close to being full (much bigger than the data size).
222
Case studies – Case study n. 3 There were mostly 2 types of queries in SHOW PROCESSLIST, most of them in the
following states:
$ grep State: status-file | sort | uniq -c | sort -nr
161 State: Copying to tmp table
156 State: Sorting result
136 State: statistics
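The same state tally can be done without shell tools. A small Python equivalent of the grep | sort | uniq -c | sort -nr pipeline above, run against a saved SHOW PROCESSLIST dump (the sample dump here is abbreviated):

```python
from collections import Counter

# Count "State:" lines from a saved SHOW PROCESSLIST dump, most common first.
def state_counts(status_file_text):
    states = [line.strip() for line in status_file_text.splitlines()
              if line.strip().startswith("State:")]
    return Counter(states).most_common()

dump = ("State: Copying to tmp table\n"
        "State: Sorting result\n"
        "State: Copying to tmp table\n")
print(state_counts(dump))
# [('State: Copying to tmp table', 2), ('State: Sorting result', 1)]
```

Either way, the point is the same: a skewed state distribution (here, tmp-table and sorting states dominating) narrows the search to a few query shapes.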
223
Case studies – Case study n. 3 iostat
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda3 0.04 493.63 0.65 15.49 142.18 4073.09 261.18 0.17 10.68 1.02 1.65
sda3 0.00 8833.00 1.00 500.00 8.00 86216.00 172.10 5.05 11.95 0.59 29.40
sda3 0.00 33557.00 0.00 451.00 0.00 206248.00 457.31 123.25 238.00 1.90 85.90
sda3 0.00 33911.00 0.00 565.00 0.00 269792.00 477.51 143.80 245.43 1.77 100.00
sda3 0.00 38258.00 0.00 649.00 0.00 309248.00 476.50 143.01 231.30 1.54 100.10
sda3 0.00 34237.00 0.00 589.00 0.00 281784.00 478.41 142.58 232.15 1.70 100.00
vmstat
r b swpd free buff cache si so bi bo in cs us sy id wa st
50 2 86064 1186648 3087764 4475244 0 0 5 138 0 0 1 1 98 0 0
13 0 86064 1922060 3088700 4099104 0 0 4 37240 312832 50367 25 39 34 2 0
2 5 86064 2676932 3088812 3190344 0 0 0 136604 116527 30905 9 12 71 9 0
1 4 86064 2782040 3088812 3087336 0 0 0 153564 34739 10988 2 3 86 9 0
0 4 86064 2871880 3088812 2999636 0 0 0 163176 22950 6083 2 2 89 8 0
Oprofile
samples % image name app name symbol name
473653 63.5323 no-vmlinux no-vmlinux /no-vmlinux
95164 12.7646 mysqld mysqld /usr/libexec/mysqld
53107 7.1234 libc-2.10.1.so libc-2.10.1.so memcpy
224
Case studies – Case study n. 3 Analysis:
There is a lot of data here
most of it points to nothing in particular except "need more research."
For example, in oprofile, what does build_template() do in InnoDB?
Why is memcpy() such a big consumer of time?
What is hidden within the 'mysqld' image/symbol?
We could spend a lot of time on these things.
In looking for things that just don't make sense, the iostat data is very strange.
We can see hundreds of MB per second written to disk for sustained periods
but there isn't even that much data in the whole database.
So clearly this can't simply be InnoDB's "furious flushing" problem
Virtually no reading from disk is happening in this period of time.
Raw disk stats show that all the time is consumed in writes.
There is an enormous queue on the disk.
225
Case studies – Case study n. 3 Analysis:
There was no swap activity, and 'ps' confirmed that nothing else significant was happening.
'df -h' and 'lsof' showed that:
mysqld's temp files became large
disk free space changed noticeably while this pattern happened.
So mysqld was writing GB to disk in short bursts
Although this is not fully instrumented inside of MySQL, we know that
MySQL only writes data, logs, sort, and temp tables to disk.
Thus, we can eliminate data and logs.
Discussion with developers revealed that some kinds of caches could expire and cause a stampede on the database.
226
Case studies – Case study n. 3 Conclusion
Based on reasoning and knowledge of internals: it is likely that poorly optimized queries are causing a storm of very large temp tables on disk.
Plan of Attack
Optimize the 2 major kinds of queries found in SHOW PROCESSLIST so they don't use temp tables on disk.
These queries are fine in isolation, but when there is a rush on the database, they can pile up.
The problem was resolved after eliminating the on-disk temporary tables.
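The "temp tables on disk" mechanism behind this case can be sketched: MySQL spills an in-memory temporary table to disk once it outgrows min(tmp_table_size, max_heap_table_size), and goes to disk immediately when the result contains BLOB/TEXT columns. A Python sketch of that decision with illustrative sizes:

```python
# Decide whether an implicit temporary table would end up on disk.
# estimated_bytes and the two limits are illustrative values, not defaults.
def goes_to_disk(estimated_bytes, tmp_table_size, max_heap_table_size,
                 has_blob_or_text=False):
    if has_blob_or_text:
        return True  # BLOB/TEXT results always go to disk
    return estimated_bytes > min(tmp_table_size, max_heap_table_size)

MB = 2**20
print(goes_to_disk(64 * MB, 32 * MB, 128 * MB))   # True: exceeds tmp_table_size
print(goes_to_disk(16 * MB, 32 * MB, 128 * MB))   # False: fits in memory
print(goes_to_disk(1 * MB, 32 * MB, 128 * MB, has_blob_or_text=True))  # True
```

This is why the fix worked: rewriting the two query shapes so their intermediate results stayed under the in-memory limit (or avoided temp tables entirely) removed the bursts of multi-GB disk writes.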