Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf ·...

78
Target Practice A Workshop in Tuning MySQL Queries Jay Pipes Community Relations Manager, North America MySQL, Inc. [email protected]

Transcript of Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf ·...

Page 1: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

Target PracticeA Workshop in Tuning MySQL Queries

Jay PipesCommunity Relations Manager, North America

MySQL, [email protected]

Page 2: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 2

Agenda (if we actually stick to it...)

● Part I - A runthrough of the workbook

break time (the best part of the tutorial)● Part II - The schema, indexes, and MySQL

query cache● Part III - Benchmarking● Q&A:

– Short questions - fire away– Longer ones - leave to end (please)

Page 3: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

The EXPLAIN Command(the stuff in the workbook)

Page 4: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

A graphical overview

Page 5: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 5

EXPLAIN Basics

● Provides the execution plan chosen by the MySQL optimizer for a specific SELECT statement

● Simply append the word EXPLAIN to the beginning of your SELECT statement

● Each row in output represents a set of information used in the SELECT● A real schema table● A virtual table (derived table) or temporary table● A subquery in SELECT or WHERE● A unioned set

Page 6: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 6

EXPLAIN columns

● select_type - type of “set” the data in this row contains● table - alias (or full table name if no alias) of the table

or derived table from which the data in this set comes● type - “access strategy” used to grab the data in this set● possible_keys - keys available to optimizer for query● keys - keys chosen by the optimizer● rows - estimate of the number of rows in this set● Extra - information the optimizer chooses to give you● ref - shows the column used in join relations

Page 7: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

Example EXPLAIN output

Page 8: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

EXPLAIN examples(workbook p.6-28)

Page 9: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 9

Example #1 - the const access type

EXPLAIN SELECT * FROM rental WHERE rental_id = 13\G*************************** 1. row *************************** id: 1 select_type: SIMPLE table: rental type: const possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: const rows: 1 Extra: 1 row in set (0.00 sec)

Page 10: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 10

Constants in the optimizer

● a field indexed with a unique non-nullable key

● The access strategy of system is related to const and refers to when a table with only a single row is referenced in the SELECT

● Can be propogated across joined columns

Page 11: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 11

Example #2 - constant propogationEXPLAIN SELECT r.*, c.first_name, c.last_nameFROM rental r INNER JOIN customer c ON r.customer_id = c.customer_id WHERE r.rental_id = 13\G*************************** 1. row *************************** id: 1 select_type: SIMPLE table: r type: const possible_keys: PRIMARY,idx_fk_customer_id key: PRIMARY key_len: 4 ref: const rows: 1 Extra: *************************** 2. row *************************** id: 1 select_type: SIMPLE table: c type: const possible_keys: PRIMARY key: PRIMARY key_len: 2 ref: const /* Here is where the propogation occurs...*/ rows: 1 Extra: 2 rows in set (0.00 sec)

Page 12: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 12

Example #3 - the range access type

SELECT * FROM rentalWHERE rental_date BETWEEN '2005-06-14' AND '2005-06-16'\G*************************** 1. row *************************** id: 1 select_type: SIMPLE table: rental type: range possible_keys: rental_date key: rental_date key_len: 8 ref: NULL rows: 364 Extra: Using where 1 row in set (0.00 sec)

Page 13: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 13

Considerations with range accesses

● Index must be available on the field operated upon by a range operator

● If too many records are estimated to be returned by the condition, the range optimization won't be used– index or full table scan will be used instead

● The indexed field must not be operated on by a function call! (Important for all indexing)

Page 14: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 14

The scan vs. seek dilemma

● A seek operation, generally speaking, jumps into a random place -- either on disk or in memory -- to fetch the data needed.

– Repeat for each piece of data needed from disk or memory

● A scan operation, on the other hand, will jump to the start of a chunk of data, and sequentially read data -- either from disk or from memory -- until the end of the chunk of data

● For large amounts of data, scan operations tend to be more efficient than multiple seek operations

Page 15: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 15

Example #4 - Full table scan

EXPLAIN SELECT * FROM rentalWHERE rental_date BETWEEN '2005-06-14' AND '2005-06-21'\G*************************** 1. row *************************** id: 1 select_type: SIMPLE table: rental type: ALL possible_keys: rental_date /* larger range forces scan choice */ key: NULL key_len: NULL ref: NULL rows: 16298 Extra: Using where 1 row in set (0.00 sec)

Page 16: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 16

Why full table scans pop up

● No WHERE condition (duh.)● No index on any field in WHERE condition● Poor selectivity on an indexed field● Too many records meet WHERE condition● < MySQL 5.0 and using OR in a WHERE clause● Using SELECT * FROM

Page 17: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 17

Example #5 - Full index scan

EXPLAIN SELECT rental_id, rental_date FROM rental\G*************************** 1. row *************************** id: 1 select_type: SIMPLE table: rental type: indexpossible_keys: NULL key: rental_date key_len: 13 ref: NULL rows: 16325 Extra: Using index1 row in set (0.00 sec)

Page 18: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 18

Example #6 - eq_ref strategyEXPLAIN SELECT r.*, c.first_name, c.last_nameFROM rental r INNER JOIN customer c ON r.customer_id = c.customer_idWHERE r.rental_date BETWEEN '2005-06-14' AND '2005-06-16'\G*************************** 1. row *************************** id: 1 select_type: SIMPLE table: r type: range possible_keys: idx_fk_customer_id,rental_date key: rental_date key_len: 8 ref: NULL rows: 364 Extra: Using where *************************** 2. row *************************** id: 1 select_type: SIMPLE table: c type: eq_ref possible_keys: PRIMARY key: PRIMARY key_len: 2 ref: sakila.r.customer_id rows: 1 Extra: 2 rows in set (0.00 sec)

Page 19: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 19

When eq_ref pops up

● Joining two sets on a field where– One side has unique, non-nullable index– Other side has at least a non-nullable index

● In example #8, an eq_ref access strategy was chosen because a unique, non-nullable index is available on customer.customer_id and an index is available on the rental.customer_id field

Page 20: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 20

Nested loops join algorithm

● For each record in outermost set– Fetch a record from the next set via a join

column condition– Repeat until done with outermost set

● Main algorithm in optimizer– Main work in 5.1+ is in the area of subquery

optimization and additional join algorithms like semi- and merge joins

Page 21: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 21

Example #7 - ref strategy

EXPLAIN SELECT * FROM rentalWHERE rental_id IN (10,11,12)AND rental_date = '2006-02-01' \G*************************** 1. row *************************** id: 1 select_type: SIMPLE table: rental type: ref possible_keys: PRIMARY,rental_date key: rental_date key_len: 8 ref: const rows: 1 Extra: Using where 1 row in set (0.02 sec)

Page 22: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 22

OR conditions and the index merge

● Index merge best thing to happen in optimizer for 5.0

● Allows optimizer to use more than one index to satisfy a join condition– Prior to MySQl 5.0, only one index– In case of OR conditions in a WHERE, MySQL

<5.0 would use a full table scan

Page 23: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 23

Example #8 - index_merge strategy

EXPLAIN SELECT * FROM rentalWHERE rental_id IN (10,11,12)OR rental_date = '2006-02-01' \G*************************** 1. row *************************** id: 1 select_type: SIMPLE table: rental type: index_merge possible_keys: PRIMARY,rental_date key: rental_date,PRIMARY key_len: 8,4 ref: NULL rows: 4 Extra: Using sort_union(rental_date,PRIMARY); Using where 1 row in set (0.02 sec)

Page 24: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 24

The subquery access types

● Occur when using a subquery in the SELECT or WHERE clauses– unique_subquery: when results of

subquery are known to be distinct– index_subquery: otherwise

● Avoid them if possible– Always replace with a join– Especially correlated subqueries

Page 25: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 25

Example #9 - unique_subquery strategy

SELECT c.customer_id, c.first_name, c.last_name, p.amountFROM customer cINNER JOIN payment pON c.customer_id = p.customer_idWHERE p.rental_id IN ( SELECT rental_id FROM rental WHERE rental_date BETWEEN '2005-06-14' AND '2005-06-16')\G

(explain on next slide)

Page 26: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 26

Example #9 - unique_subquery strategy*************************** 1. row *************************** id: 1 select_type: PRIMARY table: c type: ALL rows: 541 *************************** 2. row *************************** id: 1 select_type: PRIMARY table: p type: ref possible_keys: idx_fk_customer_id key: idx_fk_customer_id key_len: 2 ref: sakila.c.customer_id rows: 15 Extra: Using where *************************** 3. row *************************** id: 2 select_type: DEPENDENT SUBQUERY table: rental type: unique_subquery possible_keys: PRIMARY,rental_date key: PRIMARY key_len: 4 ref: func /* Never quite been sure what this is...*/ rows: 1 Extra: Using where

Page 27: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 27

Explaining example #9

1)The optimizer chooses to first access the customer table, c, using a full table scan strategy

2)The IN() subquery (non-correlated) in the WHERE clause yields a unique_subquery access strategy that generates a list of unique rental_ids

Page 28: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 28

Explaining example #9 (cont'd)

3)In the unique_subquery set (row 3), both the PRIMARY and the rental_date indexes are available to the optimizer to reduce the set to an appropriate set of rental_id values. The optimizer chooses to use the PRIMARY key on rental_id, even though a range access could be used on rental_date index

4)In row 2, which represents the payment table set, we see a ref access strategy deployed on the customer_id index. This makes sense, since the index customer_id is non-unique and not-nullable. However, notice that in the Extra column you see “Using where”. Why?

Page 29: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 29

Example #10 - Rewrite #9 with a join

EXPLAIN SELECT c.customer_id, c.first_name, c.last_name, p.amountFROM customer cINNER JOIN payment pON c.customer_id = p.customer_idINNER JOIN rental rON p.rental_id = r.rental_idWHERE r.rental_date BETWEEN '2005-06-14' AND '2005-06-16'\G)\G

What performance difference do we see?

Page 30: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 30

Example #11 - Evil correlated subquery

EXPLAIN SELECT p.payment_id, p.amount, p.payment_date, c.first_name, c.last_nameFROM payment pINNER JOIN customer cON p.customer_id = c.customer_idWHERE p.payment_date = ( SELECT MAX(payment_date) FROM payment WHERE payment.customer_id = p.customer_id)\G

Page 31: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 31

EXPLAIN output from example #11*************************** 1. row *************************** id: 1 select_type: PRIMARY table: c type: ALL rows: 541 *************************** 2. row *************************** id: 1 select_type: PRIMARY table: p type: ref possible_keys: idx_fk_customer_id key: idx_fk_customer_id key_len: 2 ref: sakila.c.customer_id rows: 14 Extra: Using where *************************** 3. row *************************** id: 2 select_type: DEPENDENT SUBQUERY table: payment type: ref possible_keys: idx_fk_customer_id key: idx_fk_customer_id key_len: 2 ref: sakila.p.customer_id rows: 14

Page 32: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 32

Correlated subquery == evil (at least in MySQL)

● Correlated subquery can be (and most often is) executed once for each matched row in the outer set of information.– Kills scalability and performance as data

sets grows (even to a tiny 10K rows)● Can always be rewritten using joins

– Yes, always– That's what many optimizers do anyway

Page 33: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 33

Example #12 - Rewriting evil subquery

EXPLAIN SELECT p.payment_id, p.amount, p.payment_date, c.first_name, c.last_nameFROM payment pINNER JOIN ( SELECT customer_id, MAX(payment_date) AS payment_date FROM payment GROUP BY customer_id) AS last_ordersON p.customer_id = last_orders.customer_idAND p.payment_date = last_orders.payment_dateINNER JOIN customer cON p.customer_id = c.customer_id\G

Page 34: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 34

EXPLAIN output from example #12*************************** 1. row *************************** id: 1 select_type: PRIMARY table: <derived2> type: ALL rows: 599 *************************** 2. row *************************** id: 1 select_type: PRIMARY table: c type: eq_ref ref: last_orders.customer_id rows: 1 *************************** 3. row *************************** id: 1 select_type: PRIMARY table: p type: ref possible_keys: idx_fk_customer_id key: idx_fk_customer_id key_len: 2 ref: sakila.c.customer_id rows: 15 Extra: Using where *************************** 4. row *************************** id: 2 select_type: DERIVED table: payment type: index key: idx_fk_customer_id key_len: 2 rows: 16451

Page 35: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

phew. break time.

Page 36: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

The Schema, Indexes, and MySQL Query Cache

Page 37: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 37

Covering indexes

● An index is a covering index when all fields in SELECT statement for a specific table are contained in an index

● Shows up in Extra column of EXPLAIN as “Using index”

● Removes need for “bookmark lookup” operation– What the heck is a bookmark lookup

operation?

Page 38: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

Non clustered layout (MyISAM)

1-100

Data filecontaining unordered

data records

1-33 34-66 67-100

Root Index Node stores a directory of keys, along with pointers to non-leaf nodes (or leaf nodes for a very

small index)

Leaf nodes store sub-

directories of index keys

with pointers into the data

file to a specific record

these are the bookmark lookups...that a covering index eliminates

Page 39: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 39

Example #13 - covering index

EXPLAIN SELECT rental_id, customer_id, inventory_id FROM rental WHERE rental_date = '2005-06-14'\G*************************** 1. row *************************** id: 1 select_type: SIMPLE table: rental type: ref possible_keys: rental_date key: rental_date key_len: 8 ref: const rows: 1 Extra: Using index 1 row in set (0.00 sec)

Why is this a covering index?

Page 40: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

Clustered layout (InnoDB)

1-100

1-33

In a clustered layout, the leaf nodes actually

contain all the data for the record (not just the index key,

like in the non-clustered layout)

Root Index Node stores a directory of keys, along with pointers to non-leaf nodes (or leaf nodes for a very

small index)

34-66 67-100

Important:

When looking up a record by a primary key, for a clustered layout/organization, the lookup operation (following the pointer from the leaf node to the data file) involved in a non-clustered layout is not needed.

Page 41: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 41

Clustered layout

● Primary key value is appended to every record in a secondary index– So use a small primary key on InnoDB tables

● If you don't add a primary key...– InnoDB will automagically create one...

● A 6-byte integer that you have no control over– So always create a primary key...

Page 42: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 42

Filesorts and temporary tables

● Occur when– A sorted list of values needed does not

exist– More memory needed for grouping/sorting

● > tmp_table_size AND max_heap_table_size– a derived table (subquery in from clause)– Inappropriate column order in index

● SHOW STATUS LIKE 'Created_tmp_%';

Page 43: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 43

Temporary tables

● A temporary table is a MEMORY table or a MyISAM table– depending on size– or if the SELECT contains a TEXT or BLOB!

● VARCHARs turn into CHARs– Crazy Dr. Jekyll / Mr. Hyde situation?– Could be this

● One benefit: no locking...since within session

Page 44: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 44

Example #14 - temporary tables & filesorts

SELECT rental_rate, COUNT(*) AS num_filmsFROM film WHERE language_id = 1 GROUP BY rental_rate ORDER BY rental_rate DESC\g*************************** 1. row *************************** id: 1 select_type: SIMPLE table: film type: ref possible_keys: idx_fk_language_id key: idx_fk_language_id key_len: 1 ref: const rows: 511 Extra: Using where; Using temporary; Using filesort

Page 45: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 45

Example #15 - Remedy to filesort in #14

ALTER TABLE film DROP INDEX idx_fk_language_id, ADD INDEX idx_language_rental_rate (language_id, rental_rate);...re-run EXPLAIN form example #14...*************************** 1. row *************************** id: 1 select_type: SIMPLE table: film type: ref possible_keys: idx_language_rental_rate key: idx_language_rental_rate key_len: 1 ref: const rows: 469 Extra: Using where; Using index

Page 46: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 46

Example #16 - Ah, but there's a problem

SELECT rental_rate, COUNT(*) AS num_filmsFROM film GROUP BY rental_rate ORDER BY rental_rate DESC\g*************************** 1. row *************************** id: 1 select_type: SIMPLE table: film type: index possible_keys: NULL key: idx_language_rental_rate key_len: 3 ref: NULL rows: 938 Extra: Using index; Using temporary; Using filesort

Page 47: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 47

Example #17 - Effects of functions on indexes

EXPLAIN SELECT * FROM film WHERE title LIKE 'Tr%'\G*************************** 1. row *************************** select_type: SIMPLE table: film type: range possible_keys: idx_title key: idx_title key_len: 767 ref: NULL rows: 15

EXPLAIN SELECT * FROM film WHERE LEFT(title, 2) = 'Tr'\G*************************** 1. row *************************** select_type: SIMPLE table: film type: ALL rows: 938 Extra: Using where

Page 48: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 48

Example #18 - Rewriting to avoid functions

SELECT * FROM OrdersWHERE TO_DAYS(CURRENT_DATE()) – TO_DAYS(order_created) <= 7;

Not a good idea! Lots o' problems with this...

SELECT * FROM OrdersWHERE order_created >= CURRENT_DATE() ­ INTERVAL 7 DAY;

Better... Now the index on order_created will be used at least. Still a problem, though...

SELECT order_id, order_created, customerFROM OrdersWHERE order_created >= '2007­02­11' ­ INTERVAL 7 DAY;

Best. Now the query cache can cache this query, and given no updates, only run it once a day...

replace the CURRENT_DATE() function with a constant string in your programming language du jour... for instance, in PHP, we'd do:

$sql= “SELECT order_id, order_created, customer FROM Orders WHERE order_created >= '“ .date('Y-m-d') . “' - INTERVAL 7 DAY”;

Page 49: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 49

Schema guidelines

● Use the smallest data type possible– Do you really need that BIGINT?

● The smaller your data types, the more index (and data) records can fit into a single block of memory– Especially important for indexed fields

● Normalize first, denormalize later...

Page 50: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

Journey to the Center of the Database

Ahh, normalization...

http://thedailywtf.com/forums/thread/75982.aspx

Page 51: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 51

IP Addresses

● An IP address always reduces down to an INT UNSIGNED

● Each subnet part corresponds to one 8-byte division of the underlying INT UNSIGNED

● Use INET_ATON() to convert from a string to an integer

● Use INET_NTOA() to convert from integer to string

Page 52: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 52

Example #20 - IP addressingCREATE TABLE Sessions ( session_id INT UNSIGNED NOT NULL AUTO_INCREMENT, ip_address INT UNSIGNED NOT NULL // Compared to CHAR(15)!!, session_data TEXT NOT NULL, PRIMARY KEY (session_id), INDEX (ip_address)) ENGINE=InnoDB;

// Find all sessions coming from a local subnetSELECT * FROM SessionsWHERE ip_address BETWEEN INET_ATON('192.168.0.1') AND INET_ATON('192.168.0.255');

The INET_ATON() function reduces the string to a constant INT and a highly optimized range operation will be performed for:

SELECT * FROM SessionsWHERE ip_address BETWEEN 3232235521 AND 3232235775

Page 53: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

The MySQL Query Cache

Clients

Parser

Optimizer

QueryCache

Pluggable Storage Engine API

MyISAM InnoDB MEMORY Falcon Archive PBXT SolidDB Cluster(Ndb)

ConnectionHandling &

Net I/O

“Packaging”

Page 54: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 54

MySQL Query Cache

● To use effectively, you must understand application read/write ratio

● QC design is a compromise between CPU usage and read performance

● Bigger query cache != better performance, even for heavy read applications

Page 55: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 55

MySQL Query Cache invalidation

● Coarse invalidation designed to prevent CPU overuse during finding and storing cache entries

● This means any modification to any table referenced in the SELECT will invalidate any cache entry which uses that table

● Remedy with vertical table partitioning

Page 56: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 56

Example #19 - Remedy to QC thrashingCREATE TABLE Products ( product_id INT UNSIGNED NOT NULL AUTO_INCREMENT, name VARCHAR(80) NOT NULL, unit_cost DECIMAL(7,2) NOT NULL, description TEXT NULL, image_path TEXT NULL, num_views INT UNSIGNED NOT NULL, num_in_stock INT UNSIGNED NOT NULL, num_on_order INT UNSIGNED NOT NULL, PRIMARY KEY (product_id), INDEX (name(20))) ENGINE=InnoDB; // Or MyISAM

CREATE TABLE Products ( product_id INT UNSIGNED NOT NULL AUTO_INCREMENT, name VARCHAR(80) NOT NULL, unit_cost DECIMAL(7,2) NOT NULL, description TEXT NULL, image_path TEXT NULL, PRIMARY KEY (product_id), INDEX (name(20))) ENGINE=InnoDB; // Or MyISAMCREATE TABLE ProductCounts ( product_id INT UNSIGNED NOT NULL, num_views INT UNSIGNED NOT NULL, num_in_stock INT UNSIGNED NOT NULL, num_on_order INT UNSIGNED NOT NULL, PRIMARY KEY (product_id)) ENGINE=InnoDB;

Page 57: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

Benchmarking

Page 58: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 58

Benchmarking basics

● Enables tracking of performance of something over time– A whole application– An SQL script– An application script or web page

● Enables load or stress testing● Enables the comparison of two similar

techniques, engines, platforms, etc.

Page 59: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 59

Change one thing at a time

● Change only one thing at a time– Config variable– Addition of an index– Modification to a schema table– Change to an SQL script

● Adhere to scientific principle of isolating a variable condition

● Re-run the benchmark after change

Page 60: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 60

Isolate the benchmark environment

● Disable non-essential services– Network traffic analyzers– Non-essential servers– MySQL query cache (most of the time)

● Use a separate machine if possible● Separate MYSQL instance

– Use the --socket configuration variable to differentiate instance

Page 61: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 61

Save the benchmark results

● Save all benchmark files– Result files of benchmark runs– Configuration files at time of run– OS/Hardware configuration, if changed

● Have a baseline folder when doing stress testing– Configuration file at start of changes– Base results of benchmark tests

Page 62: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 62

Why a benchmarking framework?

● Automates the most tedious part of testing● Standardizes benchmark results● Can customize to your needs● Framework can be analyzed by outside

parties to determine:– If framework provides consistent results– If framework is biased or slanted in any

way

Page 63: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 63

A note about large benchmark frameworks

● Huge frameworks like TPC-x or SPECjAppServer2004– Cost lots of moolah– Developed to test fairly specific things

● Application server performance● Transaction processing load

● Some (like TPC) only allow member organizations to publish benchmarks

Page 64: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 64

More specific benchmarking tools

● sysbench– Useful for doing raw comparisons of

different MySQL versions or platforms● supersmack

– Fairly flexible (but difficult) tool for measuring SQL script performance

● ab (ApacheBench)– Great for load testing a web server

Page 65: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 65

Generic frameworks for the common folk

● Many other frameworks designed for the needs of the common developer– Junit/Ant (for you Java drinkers...)– MyBench (for you Perl mongers...)– thewench (for you PHP folks...)

● Today, I'll show thewench ... why?– Because I wrote it as a prototype– Because it will give you ideas to create your own

Page 66: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

thewench(a.k.a. Jay's humble little servant)

Page 67: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 67

thewench system requirements

● System requirements– PHP 5 with CLI (command line interface)– ext/mysqli– simple_xml support (PHP 5.1.10+)

● Future versions to support:– PDO, phpnd, other DB drivers– PostgreSQL, Oracle, other Dbs– Human and machine readable output

Page 68: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 68

Installing thewench

● Download thewench from her online home:$> wget http://jpipes.com/thewench.tar.gz

● Untar into some directory and make her an executable:$> tar xzf thewench.tag.gz

$> cd thewench/

$> chmod 0755 thewench.php

$> cd /usr/sbin

$> sudo ln -s ~/thewench/thewench.php thewench

Page 69: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 69

About thewench

● Multi-threaded -- uses pctrl_fork()● Runs test files

– Tests can be grouped into test suites● Configured either via config files or command

line args– Looks in command line and in./, ~/, $installdir

and /etc/thewench/

Page 70: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 70

thewench configuration options

-d | --database | database=mydbname

-h | --host | host=myhost

-u | --user | user=mydbuser

-p | --password | password=mydbpass

-c | --concurrency | concurrency=n

-n | --seconds | seconds=n

--port | port=nnnn

--socket | socket=path_to_socket

--test-dir | test-dir=path_to_tests

--suite-dir | suite-dir=path_to_test_suites

Page 71: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 71

Sample thewench.cnf

test_dir = "/home/jpipes/benchmarks/tests"

suite_dir = "/home/jpipes/benchmarks/suites"

host = "localhost"

user = "root"

password = ""

database = "sakila"

port = 3307

socket=/tmp/target.sock

Page 72: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 72

Running thewench

● Either name a set of tests:$> thewench [options] correlated derived

● or specify a test suite name:$> thewench [options] --suite=derived-compare

● Either case depends on existence of test files– XML test file format

● Must be named testname.[test|xml]– A test suite is simply a newline-delimited

list of tests in a file named for the suite

Page 73: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 73

thewench test files

● <test> must be outermost element● 1 or more <query> elements

Page 74: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 74

The <query> element

● One or more required per test● Must contain an <sql> element

– Good idea to put content within <![CDATA[ tags

● Can contain one or more <arg> elements– Each <arg> should be represented by a

question mark (?) in the <sql> element's SQL text

Page 75: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 75

Example test - correlated.xml

<?xml version='1.0' standalone='yes'?><test> <query> <sql><![CDATA[SELECT p.*FROM sakila.payment pWHERE p.payment_date =( SELECT MAX(payment_date) FROM sakila.payment WHERE customer_id=p.customer_id)]]> </sql> </query></test>

Page 76: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 76

Example test - derived.xml

<?xml version='1.0' standalone='yes'?><test> <query> <sql><![CDATA[SELECT p.*FROM ( SELECT customer_id, MAX(payment_date) as last_order FROM sakila.payment GROUP BY customer_id) AS last_ordersINNER JOIN sakila.payment pON p.customer_id = last_orders.customer_idAND p.payment_date = last_orders.last_order]]></sql> </query></test>

Page 77: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 77

Parting words...

● Email me <[email protected]>● New MySQL Forge coming soon

– http://forge.mysql.com● Party tomorrow night at DoubleTree w/Zend● Birds of a Feather on Wednesday evening

– Come hack on thewench with me...

Page 78: Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf · 2010-03-11 · 08/02/07 OSCON 2007 - Target Practice 14 The scan vs. seek dilemma A seek operation,

08/02/07 OSCON 2007 - Target Practice 78

Example output - derived vs. correlated

jpipes@shakedown:~/benchmarks/tests$ thewench --seconds=3 derived correlated>> thewench v0.6Results, ordered by average time taken per request:+-----+--------------------------------+-----------+----------+----------+| # | test name | req/sec | comp | fail |+-----+--------------------------------+-----------+----------+----------+| 1 | derived | 10.966133 | 33 | 0 || 2 | correlated | 0.813098 | 3 | 0 |+-----+--------------------------------+-----------+----------+----------+