Post on 27-Jun-2015
MySQL Query Optimization
2010.07.09 Cai Baohua
Friday, July 9, 2010
Agenda
• What the query optimizer is
• The principles of the optimization
• Explain and Profiling
• Use Index
• JOIN Optimization
• ORDER BY, GROUP BY Optimization
MySQL Query Optimizer
[Figure: MySQL server modules, including the Parser, the Optimizer, the Table Maintenance Module, the Table Modification Module, the Access Control Module, ...]
• Not purely a CBO: it combines CBO + RBO
• CBO: Cost-Based Optimizer
• RBO: Rule-Based Optimizer
The Principles Of the Optimization
• Optimize the queries that need it most
• Identify the performance bottleneck
• Set a clear optimization goal
• Start with EXPLAIN, and use profiling often
• Always let the small result set drive the large result set
• Do as much of the sorting as possible in the index
• Fetch only the fields that you need
• Filter with only the most effective conditions
• Avoid complex joins and subqueries as far as possible
Explain and Profiling
Use Explain and Profiling
EXPLAIN tells you:
• In which order the tables are read
• What types of read operations are made
• Which indexes could have been used
• Which indexes are used
• How the tables refer to each other
• How many rows the optimizer estimates it will retrieve from each table
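As a minimal sketch of how the points above show up in practice, the statements below build a small two-table schema and ask for the plan of a join. The table, column, and index names (account, invoice, idx_account) are assumptions made for illustration and do not come from the slides. In the output, the row order is the read order, the type column is the read operation, possible_keys and key are the candidate and chosen indexes, ref shows how the tables refer to each other, and rows is the optimizer's estimate.

```sql
-- Hypothetical schema, for illustration only.
CREATE TABLE account (
  id   INT NOT NULL PRIMARY KEY,
  name VARCHAR(50) NOT NULL
);

CREATE TABLE invoice (
  id         INT NOT NULL PRIMARY KEY,
  account_id INT NOT NULL,
  total      DECIMAL(10,2) NOT NULL,
  KEY idx_account (account_id)
);

-- Prefix a SELECT with EXPLAIN to get the execution plan instead of the rows.
EXPLAIN
SELECT a.name, i.total
FROM account AS a
JOIN invoice AS i ON i.account_id = a.id
WHERE a.id = 42;
```

The exact plan depends on the MySQL version and on how much data the tables hold, so it is worth running EXPLAIN against realistic data rather than empty tables.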
Use Explain and Profiling
Explain Types: the different join types, ordered from good (top) to bad (bottom)
system            The table has only one row
const             At the most one matching row, treated as a constant
eq_ref            One row per row from previous tables
ref               Several rows with matching index value
ref_or_null       Like ref, plus NULL values
index_merge       Several index searches are merged
unique_subquery   Same as ref for some subqueries
index_subquery    As above for non-unique indexes
range             A range index scan
index             The whole index is scanned
ALL               A full table scan
Use Explain and Profiling
Explain Extra: this column contains additional information about how MySQL resolves the query.
Using index                     The result is created straight from the index
Using where                     Not all rows are used in the result
Distinct                        Only a single row is read per row combination
Not exists                      A LEFT JOIN missing rows optimization is used
Using filesort                  An extra row sorting step is done
Using temporary                 A temporary table is used
Range checked for each record   The read type is optimized individually for each combination of rows from the previous tables
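The following sketch shows queries that commonly produce some of the type and Extra values listed above. The article table and its index are hypothetical, and which values actually appear depends on the MySQL version, the storage engine, and the data in the table, so treat the comments as what to look for rather than guaranteed output.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE article (
  id      INT NOT NULL PRIMARY KEY,
  author  INT NOT NULL,
  created DATETIME NOT NULL,
  title   VARCHAR(100) NOT NULL,
  KEY idx_author_created (author, created)
);

-- Only indexed columns are read, so the result can come straight from
-- the index: look for "Using index" in Extra.
EXPLAIN SELECT author, created FROM article WHERE author = 7;

-- Sorting on a column without an index needs an extra sorting step:
-- look for "Using filesort" in Extra.
EXPLAIN SELECT id, title FROM article ORDER BY title;

-- Lookup of a single primary key value: look for type = const
-- (once the table holds a matching row).
EXPLAIN SELECT title FROM article WHERE id = 1;
```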
Use Explain and Profiling
• Turn the Query Profiler on / off
mysql> SET profiling = 1;   (to turn off: SET profiling = 0;)
Use Explain and Profiling
Show profiles
[Figure: SHOW PROFILES output]
SHOW PROFILE
• ALL - displays all information
• BLOCK IO - displays counts for block input and output operations
• CONTEXT SWITCHES - displays counts for voluntary and involuntary context switches
• IPC - displays counts for messages sent and received
• MEMORY - is not currently implemented
• PAGE FAULTS - displays counts for major and minor page faults
• SOURCE - displays the names of functions from the source code, together with the name and line number of the file in which the function occurs
• SWAPS - displays swap count
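A minimal profiling session might look like the sketch below. The BENCHMARK call is only a stand-in for the statement being measured, and the query ID of 1 is an assumption: take the real ID from the SHOW PROFILES output. Note that newer MySQL releases mark SHOW PROFILE and SHOW PROFILES as deprecated in favor of the Performance Schema, so this applies mainly to the versions the slides were written for.

```sql
-- Turn profiling on for the current session.
SET profiling = 1;

-- Run the statement to be measured (a trivial stand-in here).
SELECT BENCHMARK(100000, MD5('profiling example'));

-- List the profiled statements with their Query_ID and duration.
SHOW PROFILES;

-- Break one statement down by execution stage.
-- Query ID 1 is assumed; use the ID reported by SHOW PROFILES.
SHOW PROFILE FOR QUERY 1;

-- Ask for specific kinds of detail from the option list.
SHOW PROFILE BLOCK IO, CONTEXT SWITCHES FOR QUERY 1;

-- Turn profiling off again.
SET profiling = 0;
```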
[Figure: SHOW PROFILE output showing more info]
Use Index
Index Types
• B-Tree (balanced tree)
  • Primary Key
  • Secondary Index
  • Commonly used by InnoDB and MyISAM
• Hash
  • Memory, NDB Cluster
  • Works for “=”, “IN”, “<=>”; not for >, <, BETWEEN, !=, LIKE
  • Does not work for ORDER BY
• Fulltext
  • CHAR, VARCHAR and TEXT columns
  • Use it instead of LIKE ‘%*****%’; it is more efficient
• R-Tree
  • Solves the problem of spatial data retrieval
  • Only for the GEOMETRY data type
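The four index types above can be declared as in the sketch below. All table, column, and index names are hypothetical. The MyISAM engine is chosen for the fulltext and spatial tables on the assumption of an older server such as the 5.1 release the slides reference; later versions also support these index types on InnoDB.

```sql
-- B-Tree: the default for InnoDB and MyISAM primary and secondary indexes.
CREATE TABLE demo_btree (
  id   INT NOT NULL,
  name VARCHAR(50) NOT NULL,
  PRIMARY KEY (id),
  INDEX idx_name USING BTREE (name)
) ENGINE=InnoDB;

-- Hash: available on the MEMORY engine; useful for equality lookups only.
CREATE TABLE demo_hash (
  token CHAR(32) NOT NULL,
  INDEX idx_token USING HASH (token)
) ENGINE=MEMORY;

-- Fulltext: searched with MATCH ... AGAINST instead of LIKE '%...%'.
CREATE TABLE demo_fulltext (
  id   INT NOT NULL PRIMARY KEY,
  body TEXT NOT NULL,
  FULLTEXT INDEX ft_body (body)
) ENGINE=MyISAM;

SELECT id FROM demo_fulltext WHERE MATCH (body) AGAINST ('optimizer');

-- R-Tree: a SPATIAL index on a NOT NULL geometry column.
CREATE TABLE demo_spatial (
  id  INT NOT NULL PRIMARY KEY,
  loc GEOMETRY NOT NULL,
  SPATIAL INDEX sp_loc (loc)
) ENGINE=MyISAM;
```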
Pros and Cons of Index
• Pros
  • Improve the efficiency of data retrieval
  • Reduce the cost of database I/O
  • Reduce the cost of data sorting
• Cons
  • Indexes take up more disk space
  • Indexes slow down table updates (INSERT, UPDATE, DELETE)
When to Use an Index?
• Field used frequently in WHERE: use an index
• Field like status or type: no index
  • Each value matches too many records, which brings too much random I/O and too much duplicate I/O
• Field updated too often: no index
• Field not used in WHERE: no index
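One rough way to judge whether a field behaves like "status or type" is to measure its selectivity, the number of distinct values divided by the number of rows. The sketch below does that for a hypothetical ticket table; the table, its columns, and the sample rows are assumptions for illustration. A value close to 1 means nearly unique values (a good index candidate), while a value close to 0 means each value matches many rows.

```sql
-- Hypothetical table and sample rows, for illustration only.
CREATE TABLE ticket (
  id       INT NOT NULL PRIMARY KEY,
  status   TINYINT NOT NULL,
  owner_id INT NOT NULL
);

INSERT INTO ticket (id, status, owner_id) VALUES
  (1, 0, 101), (2, 0, 102), (3, 1, 103), (4, 1, 104);

-- Selectivity = distinct values / total rows, per candidate column.
SELECT COUNT(DISTINCT status)   / COUNT(*) AS status_selectivity,
       COUNT(DISTINCT owner_id) / COUNT(*) AS owner_selectivity
FROM ticket;
```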
Single-Column or Multi-Column Index
• No absolute conclusion
• When one filter field can filter out more than 90% of the data and the other filter fields are updated often, we can try to use a composite index
• Reduce the cost of index updating and the disk space taken by indexes
• Let one index be used by different queries
• Don’t over-index
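The point about letting one index serve different queries relies on the leftmost-prefix behavior of multi-column B-Tree indexes. The sketch below uses a hypothetical message table and index (names are assumptions) to show which WHERE clauses can use the same composite index; the plans reported by EXPLAIN will vary with version and data.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE message (
  id       INT NOT NULL PRIMARY KEY,
  user_id  INT NOT NULL,
  group_id INT NOT NULL,
  created  DATETIME NOT NULL,
  KEY idx_user_group (user_id, group_id)
);

-- Both queries can use idx_user_group: the first through its leftmost
-- column alone, the second through both columns.
EXPLAIN SELECT id FROM message WHERE user_id = 5;
EXPLAIN SELECT id FROM message WHERE user_id = 5 AND group_id = 9;

-- This query skips the leftmost column, so the index cannot be used
-- to look up the matching rows.
EXPLAIN SELECT created FROM message WHERE group_id = 9;
```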
Index Prefixes
• Index a prefix of CHAR, VARCHAR, BINARY, VARBINARY, BLOB, and TEXT columns
• Example column: name CHAR(200)
• Most values are unique within the first 10-20 characters
• CREATE INDEX part_of_name ON customer (name(10));
• Faster queries and reduced disk I/O
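Before choosing a prefix length, it can help to check how much selectivity each candidate length keeps compared with the full column. The sketch below assumes a customer table shaped like the slide's example (the id column is an addition for illustration) and compares 10- and 20-character prefixes before creating the prefix index from the slide.

```sql
-- Table shaped like the slide's example; the id column is an assumption.
CREATE TABLE customer (
  id   INT NOT NULL PRIMARY KEY,
  name CHAR(200) NOT NULL
);

-- Compare how many distinct values each prefix length preserves
-- against the full column.
SELECT COUNT(DISTINCT name)           / COUNT(*) AS full_selectivity,
       COUNT(DISTINCT LEFT(name, 10)) / COUNT(*) AS prefix10_selectivity,
       COUNT(DISTINCT LEFT(name, 20)) / COUNT(*) AS prefix20_selectivity
FROM customer;

-- If the 10-character prefix is nearly as selective as the full
-- column, index only that prefix.
CREATE INDEX part_of_name ON customer (name(10));
```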
Limitations of MySQL Indexes
• MyISAM: the total length of an index key must be <= 1000 bytes
• BLOB and TEXT columns can only be indexed by prefix
• MySQL does not support function-based indexes (as of MySQL 5.1)
• “!=” or “<>” won’t use an index
• A function applied to a column, such as abs(column), won’t use an index
• Join (a.city = b.city): if the join fields’ types are not the same, MySQL won’t use an index
• LIKE ‘%abc’ won’t use an index
• A hash index can only be used with “=”, “<=>”, “IN”
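Two of these limitations can often be worked around by rewriting the predicate so the bare column is compared directly. The sketch below uses a hypothetical event table (all names are assumptions) to contrast a function-wrapped condition with an equivalent range, and a leading-wildcard LIKE with a fixed-prefix LIKE. Compare the type and key columns of each EXPLAIN on a populated table.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE event (
  id      INT NOT NULL PRIMARY KEY,
  name    VARCHAR(50) NOT NULL,
  created DATE NOT NULL,
  KEY idx_name (name),
  KEY idx_created (created)
);

-- Function wrapped around the column: the index on created cannot be
-- used to narrow the search (at best the whole index is scanned).
EXPLAIN SELECT id FROM event WHERE YEAR(created) = 2010;

-- The same rows expressed as a range on the bare column: a range scan
-- on idx_created becomes possible.
EXPLAIN SELECT id FROM event
WHERE created >= '2010-01-01' AND created < '2011-01-01';

-- Leading wildcard: no index lookup on name.
EXPLAIN SELECT id FROM event WHERE name LIKE '%abc';

-- Fixed prefix: a range scan on idx_name becomes possible.
EXPLAIN SELECT id FROM event WHERE name LIKE 'abc%';
```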
Join
Principle
• Nested Loop Join
Example
[Figure: nested loop join plan, read in this order]
• users_group (g): index ref scan
• Nested Loop (ref) on g.group_id = m.group_id, with group_message (m): index ref scan
• Nested Loop (ref) on m.id = c.group_msg_id, with group_message_content (c): index ref scan
• Result Set Output
Ideas for optimization
• Minimize the number of nested loop iterations
• Give priority to optimizing the inner loop
• Index the filter fields
  • ... FROM A, B WHERE B.group_id = A.group_id
• Tune the join buffer size; the join buffer is used when the join type is ALL, index, range, or index_merge
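The three-table example can be sketched as follows. The table names and join columns come from the slides; every other column, the index names, and the WHERE condition are assumptions added so the statements run. With an index on each join column, EXPLAIN should be able to report ref-style access for each step of the nested loop, though the actual plan depends on version and data.

```sql
-- Table names and join columns follow the slides; all other columns,
-- index names, and the filter value are assumptions.
CREATE TABLE users_group (
  user_id  INT NOT NULL,
  group_id INT NOT NULL,
  KEY idx_user (user_id),
  KEY idx_group (group_id)
);

CREATE TABLE group_message (
  id       INT NOT NULL PRIMARY KEY,
  group_id INT NOT NULL,
  subject  VARCHAR(100) NOT NULL,
  KEY idx_group (group_id)
);

CREATE TABLE group_message_content (
  group_msg_id INT NOT NULL,
  content      TEXT NOT NULL,
  KEY idx_msg (group_msg_id)
);

-- The small filtered set from users_group drives the two inner loops.
EXPLAIN
SELECT m.subject, c.content
FROM users_group AS g
JOIN group_message AS m ON g.group_id = m.group_id
JOIN group_message_content AS c ON m.id = c.group_msg_id
WHERE g.user_id = 1;
```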
Order By, Group By
How to Satisfy ORDER BY
• Use an index, without doing any extra sorting
• Use the filesort algorithm
Use Index
• SELECT col1, col2 FROM a ORDER BY [sort]
  index: (sort)
• SELECT col1, col2 FROM a WHERE colX=value ORDER BY [sort]
  index: (colX, sort)
• SELECT * FROM a WHERE uid=1 ORDER BY x, y
  index: (uid, x, y)
• SELECT * FROM a ORDER BY YEAR(date) won’t use an index
• ......
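The (uid, x, y) case can be checked with the sketch below. Table a and its columns follow the slide; the id column, the column types, and the index name are assumptions. On a populated table, the first query is expected to avoid "Using filesort" in the Extra column, while the second usually shows it.

```sql
-- Table and column names follow the slide; id, the types, and the
-- index name are assumptions.
CREATE TABLE a (
  id  INT NOT NULL PRIMARY KEY,
  uid INT NOT NULL,
  x   INT NOT NULL,
  y   INT NOT NULL,
  KEY idx_uid_x_y (uid, x, y)
);

-- uid is fixed by the WHERE clause, so the index entries for uid = 1
-- are already ordered by (x, y) and no extra sort is needed.
EXPLAIN SELECT * FROM a WHERE uid = 1 ORDER BY x, y;

-- Skipping x breaks the index order, so an extra sorting step
-- is expected here.
EXPLAIN SELECT * FROM a WHERE uid = 1 ORDER BY y;
```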
Use Filesort
• Increase max_length_for_sort_data
• Remove returned fields that are not necessary
• Increase sort_buffer_size
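The two variables can be inspected and adjusted per session, as in the sketch below. The sizes used are arbitrary assumptions, not recommendations, and recent MySQL releases may ignore or deprecate max_length_for_sort_data, so check the manual for the server version in use. The Sort_merge_passes status counter gives a rough sign of whether the sort buffer is too small.

```sql
-- Inspect the current settings.
SHOW VARIABLES LIKE 'sort_buffer_size';
SHOW VARIABLES LIKE 'max_length_for_sort_data';

-- Raise them for the current session only.
-- The values below are arbitrary examples, not tuning advice.
SET SESSION sort_buffer_size = 4 * 1024 * 1024;
SET SESSION max_length_for_sort_data = 4096;

-- A steadily growing Sort_merge_passes suggests that sorts do not fit
-- in the sort buffer.
SHOW SESSION STATUS LIKE 'Sort_merge_passes';
```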
How to Satisfy GROUP BY
• Loose Index Scan
• Tight Index Scan
Loose Index Scan
Conditions, each with an example:
• The query is over a single table
  Example: SELECT c1, c2 FROM t1 GROUP BY c1, c2;
• The GROUP BY names only columns that form a leftmost prefix of the index and no other columns
  Example: with an index on (c1,c2,c3), GROUP BY c1, c2 qualifies but GROUP BY c2, c3 does not
• The only aggregate functions used are MAX and MIN
  Example: SELECT c1, MIN(c2) FROM t1 GROUP BY c1;
Loose Index Scan
• Any parts of the index other than those from the GROUP BY that are referenced in the query must be constants
  Examples:
  SELECT c1, c2 FROM t1 WHERE c1 < const GROUP BY c1, c2;
  SELECT MAX(c3), MIN(c3), c1, c2 FROM t1 WHERE c2 > const GROUP BY c1, c2;
• A prefix index cannot be used for loose index scan
  Example: col VARCHAR(20), INDEX (col(10))
If loose index scan is applicable to a query, the EXPLAIN output shows “Using index for group-by” in the Extra column.
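To check this on a real server, something like the sketch below can be used. The table t1 and its index on (c1,c2,c3) follow the slides; the column types, the c4 column, and the index name are assumptions. Because the optimizer is cost based, it may still choose a different plan on an empty or very small table, so the comments describe what to look for on a populated table.

```sql
-- Table and index columns follow the slides; types and the index
-- name are assumptions. Run this in a scratch database.
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (
  c1 INT NOT NULL,
  c2 INT NOT NULL,
  c3 INT NOT NULL,
  c4 INT NOT NULL,
  KEY idx (c1, c2, c3)
);

-- GROUP BY on the leftmost index column with MIN() on the next one:
-- a candidate for loose index scan. Look for
-- "Using index for group-by" in Extra.
EXPLAIN SELECT c1, MIN(c2) FROM t1 GROUP BY c1;

-- GROUP BY c2, c3 is not a leftmost prefix of the index, so loose
-- index scan does not apply.
EXPLAIN SELECT c2, c3 FROM t1 GROUP BY c2, c3;
```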
Tight Index Scan
• MySQL Query Optimizer
  • If the conditions for loose index scan are not met, it then tries tight index scan
• How tight differs from loose
  • MySQL does the grouping operation only after finding all index keys that satisfy the WHERE conditions
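Tight Index Scan
idx(c1,c2,c3) on table t1(c1,c2,c3,c4)
• There is a gap in the GROUP BY, but it is covered by the condition c2 = 'a'
  • SELECT c1, c2, c3 FROM t1 WHERE c2 = 'a' GROUP BY c1, c3;
• The GROUP BY does not begin with the first part of the key, but a condition provides a constant for that part
  • SELECT c1, c2, c3 FROM t1 WHERE c1 = 'a' GROUP BY c2, c3;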
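The two queries above can be tried against a table like the one sketched below. The index definition follows the slide; the CHAR(1) column types and the index name are assumptions. On a populated table, a tight index scan would typically read from idx without reporting "Using temporary" or "Using filesort" in Extra, but the result depends on the server version and SQL mode.

```sql
-- Index definition follows the slide; column types and the index
-- name are assumptions. Run this in a scratch database.
DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (
  c1 CHAR(1) NOT NULL,
  c2 CHAR(1) NOT NULL,
  c3 CHAR(1) NOT NULL,
  c4 CHAR(1) NOT NULL,
  KEY idx (c1, c2, c3)
);

-- The gap in the GROUP BY (c2) is filled by the constant c2 = 'a'.
EXPLAIN SELECT c1, c2, c3 FROM t1 WHERE c2 = 'a' GROUP BY c1, c3;

-- The missing first key part (c1) is supplied by the constant c1 = 'a'.
EXPLAIN SELECT c1, c2, c3 FROM t1 WHERE c1 = 'a' GROUP BY c2, c3;
```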
More...
• Books
• 《MySQL性能调优和架构设计》 (MySQL Performance Tuning and Architecture Design), Author: 简朝阳 (Jian Chaoyang)
• Web Sites
• http://dev.mysql.com/doc/refman/5.1/en/optimization.html
• http://www.slideshare.net/
Q and A