Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf ·...
Transcript of Target Practice - joinfujoinfu.com/presentations/target-practice/target-practice.pdf ·...
Target PracticeA Workshop in Tuning MySQL Queries
Jay PipesCommunity Relations Manager, North America
MySQL, [email protected]
08/02/07 OSCON 2007 - Target Practice 2
Agenda (if we actually stick to it...)
● Part I - A runthrough of the workbook
break time (the best part of the tutorial)● Part II - The schema, indexes, and MySQL
query cache● Part III - Benchmarking● Q&A:
– Short questions - fire away– Longer ones - leave to end (please)
The EXPLAIN Command(the stuff in the workbook)
A graphical overview
08/02/07 OSCON 2007 - Target Practice 5
EXPLAIN Basics
● Provides the execution plan chosen by the MySQL optimizer for a specific SELECT statement
● Simply append the word EXPLAIN to the beginning of your SELECT statement
● Each row in output represents a set of information used in the SELECT● A real schema table● A virtual table (derived table) or temporary table● A subquery in SELECT or WHERE● A unioned set
08/02/07 OSCON 2007 - Target Practice 6
EXPLAIN columns
● select_type - type of “set” the data in this row contains● table - alias (or full table name if no alias) of the table
or derived table from which the data in this set comes● type - “access strategy” used to grab the data in this set● possible_keys - keys available to optimizer for query● keys - keys chosen by the optimizer● rows - estimate of the number of rows in this set● Extra - information the optimizer chooses to give you● ref - shows the column used in join relations
Example EXPLAIN output
EXPLAIN examples(workbook p.6-28)
08/02/07 OSCON 2007 - Target Practice 9
Example #1 - the const access type
EXPLAIN SELECT * FROM rental WHERE rental_id = 13\G*************************** 1. row *************************** id: 1 select_type: SIMPLE table: rental type: const possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: const rows: 1 Extra: 1 row in set (0.00 sec)
08/02/07 OSCON 2007 - Target Practice 10
Constants in the optimizer
● a field indexed with a unique non-nullable key
● The access strategy of system is related to const and refers to when a table with only a single row is referenced in the SELECT
● Can be propogated across joined columns
08/02/07 OSCON 2007 - Target Practice 11
Example #2 - constant propogationEXPLAIN SELECT r.*, c.first_name, c.last_nameFROM rental r INNER JOIN customer c ON r.customer_id = c.customer_id WHERE r.rental_id = 13\G*************************** 1. row *************************** id: 1 select_type: SIMPLE table: r type: const possible_keys: PRIMARY,idx_fk_customer_id key: PRIMARY key_len: 4 ref: const rows: 1 Extra: *************************** 2. row *************************** id: 1 select_type: SIMPLE table: c type: const possible_keys: PRIMARY key: PRIMARY key_len: 2 ref: const /* Here is where the propogation occurs...*/ rows: 1 Extra: 2 rows in set (0.00 sec)
08/02/07 OSCON 2007 - Target Practice 12
Example #3 - the range access type
SELECT * FROM rentalWHERE rental_date BETWEEN '2005-06-14' AND '2005-06-16'\G*************************** 1. row *************************** id: 1 select_type: SIMPLE table: rental type: range possible_keys: rental_date key: rental_date key_len: 8 ref: NULL rows: 364 Extra: Using where 1 row in set (0.00 sec)
08/02/07 OSCON 2007 - Target Practice 13
Considerations with range accesses
● Index must be available on the field operated upon by a range operator
● If too many records are estimated to be returned by the condition, the range optimization won't be used– index or full table scan will be used instead
● The indexed field must not be operated on by a function call! (Important for all indexing)
08/02/07 OSCON 2007 - Target Practice 14
The scan vs. seek dilemma
● A seek operation, generally speaking, jumps into a random place -- either on disk or in memory -- to fetch the data needed.
– Repeat for each piece of data needed from disk or memory
● A scan operation, on the other hand, will jump to the start of a chunk of data, and sequentially read data -- either from disk or from memory -- until the end of the chunk of data
● For large amounts of data, scan operations tend to be more efficient than multiple seek operations
08/02/07 OSCON 2007 - Target Practice 15
Example #4 - Full table scan
EXPLAIN SELECT * FROM rentalWHERE rental_date BETWEEN '2005-06-14' AND '2005-06-21'\G*************************** 1. row *************************** id: 1 select_type: SIMPLE table: rental type: ALL possible_keys: rental_date /* larger range forces scan choice */ key: NULL key_len: NULL ref: NULL rows: 16298 Extra: Using where 1 row in set (0.00 sec)
08/02/07 OSCON 2007 - Target Practice 16
Why full table scans pop up
● No WHERE condition (duh.)● No index on any field in WHERE condition● Poor selectivity on an indexed field● Too many records meet WHERE condition● < MySQL 5.0 and using OR in a WHERE clause● Using SELECT * FROM
08/02/07 OSCON 2007 - Target Practice 17
Example #5 - Full index scan
EXPLAIN SELECT rental_id, rental_date FROM rental\G*************************** 1. row *************************** id: 1 select_type: SIMPLE table: rental type: indexpossible_keys: NULL key: rental_date key_len: 13 ref: NULL rows: 16325 Extra: Using index1 row in set (0.00 sec)
08/02/07 OSCON 2007 - Target Practice 18
Example #6 - eq_ref strategyEXPLAIN SELECT r.*, c.first_name, c.last_nameFROM rental r INNER JOIN customer c ON r.customer_id = c.customer_idWHERE r.rental_date BETWEEN '2005-06-14' AND '2005-06-16'\G*************************** 1. row *************************** id: 1 select_type: SIMPLE table: r type: range possible_keys: idx_fk_customer_id,rental_date key: rental_date key_len: 8 ref: NULL rows: 364 Extra: Using where *************************** 2. row *************************** id: 1 select_type: SIMPLE table: c type: eq_ref possible_keys: PRIMARY key: PRIMARY key_len: 2 ref: sakila.r.customer_id rows: 1 Extra: 2 rows in set (0.00 sec)
08/02/07 OSCON 2007 - Target Practice 19
When eq_ref pops up
● Joining two sets on a field where– One side has unique, non-nullable index– Other side has at least a non-nullable index
● In example #8, an eq_ref access strategy was chosen because a unique, non-nullable index is available on customer.customer_id and an index is available on the rental.customer_id field
08/02/07 OSCON 2007 - Target Practice 20
Nested loops join algorithm
● For each record in outermost set– Fetch a record from the next set via a join
column condition– Repeat until done with outermost set
● Main algorithm in optimizer– Main work in 5.1+ is in the area of subquery
optimization and additional join algorithms like semi- and merge joins
08/02/07 OSCON 2007 - Target Practice 21
Example #7 - ref strategy
EXPLAIN SELECT * FROM rentalWHERE rental_id IN (10,11,12)AND rental_date = '2006-02-01' \G*************************** 1. row *************************** id: 1 select_type: SIMPLE table: rental type: ref possible_keys: PRIMARY,rental_date key: rental_date key_len: 8 ref: const rows: 1 Extra: Using where 1 row in set (0.02 sec)
08/02/07 OSCON 2007 - Target Practice 22
OR conditions and the index merge
● Index merge best thing to happen in optimizer for 5.0
● Allows optimizer to use more than one index to satisfy a join condition– Prior to MySQl 5.0, only one index– In case of OR conditions in a WHERE, MySQL
<5.0 would use a full table scan
08/02/07 OSCON 2007 - Target Practice 23
Example #8 - index_merge strategy
EXPLAIN SELECT * FROM rentalWHERE rental_id IN (10,11,12)OR rental_date = '2006-02-01' \G*************************** 1. row *************************** id: 1 select_type: SIMPLE table: rental type: index_merge possible_keys: PRIMARY,rental_date key: rental_date,PRIMARY key_len: 8,4 ref: NULL rows: 4 Extra: Using sort_union(rental_date,PRIMARY); Using where 1 row in set (0.02 sec)
08/02/07 OSCON 2007 - Target Practice 24
The subquery access types
● Occur when using a subquery in the SELECT or WHERE clauses– unique_subquery: when results of
subquery are known to be distinct– index_subquery: otherwise
● Avoid them if possible– Always replace with a join– Especially correlated subqueries
08/02/07 OSCON 2007 - Target Practice 25
Example #9 - unique_subquery strategy
SELECT c.customer_id, c.first_name, c.last_name, p.amountFROM customer cINNER JOIN payment pON c.customer_id = p.customer_idWHERE p.rental_id IN ( SELECT rental_id FROM rental WHERE rental_date BETWEEN '2005-06-14' AND '2005-06-16')\G
(explain on next slide)
08/02/07 OSCON 2007 - Target Practice 26
Example #9 - unique_subquery strategy*************************** 1. row *************************** id: 1 select_type: PRIMARY table: c type: ALL rows: 541 *************************** 2. row *************************** id: 1 select_type: PRIMARY table: p type: ref possible_keys: idx_fk_customer_id key: idx_fk_customer_id key_len: 2 ref: sakila.c.customer_id rows: 15 Extra: Using where *************************** 3. row *************************** id: 2 select_type: DEPENDENT SUBQUERY table: rental type: unique_subquery possible_keys: PRIMARY,rental_date key: PRIMARY key_len: 4 ref: func /* Never quite been sure what this is...*/ rows: 1 Extra: Using where
08/02/07 OSCON 2007 - Target Practice 27
Explaining example #9
1)The optimizer chooses to first access the customer table, c, using a full table scan strategy
2)The IN() subquery (non-correlated) in the WHERE clause yields a unique_subquery access strategy that generates a list of unique rental_ids
08/02/07 OSCON 2007 - Target Practice 28
Explaining example #9 (cont'd)
3)In the unique_subquery set (row 3), both the PRIMARY and the rental_date indexes are available to the optimizer to reduce the set to an appropriate set of rental_id values. The optimizer chooses to use the PRIMARY key on rental_id, even though a range access could be used on rental_date index
4)In row 2, which represents the payment table set, we see a ref access strategy deployed on the customer_id index. This makes sense, since the index customer_id is non-unique and not-nullable. However, notice that in the Extra column you see “Using where”. Why?
08/02/07 OSCON 2007 - Target Practice 29
Example #10 - Rewrite #9 with a join
EXPLAIN SELECT c.customer_id, c.first_name, c.last_name, p.amountFROM customer cINNER JOIN payment pON c.customer_id = p.customer_idINNER JOIN rental rON p.rental_id = r.rental_idWHERE r.rental_date BETWEEN '2005-06-14' AND '2005-06-16'\G)\G
What performance difference do we see?
08/02/07 OSCON 2007 - Target Practice 30
Example #11 - Evil correlated subquery
EXPLAIN SELECT p.payment_id, p.amount, p.payment_date, c.first_name, c.last_nameFROM payment pINNER JOIN customer cON p.customer_id = c.customer_idWHERE p.payment_date = ( SELECT MAX(payment_date) FROM payment WHERE payment.customer_id = p.customer_id)\G
08/02/07 OSCON 2007 - Target Practice 31
EXPLAIN output from example #11*************************** 1. row *************************** id: 1 select_type: PRIMARY table: c type: ALL rows: 541 *************************** 2. row *************************** id: 1 select_type: PRIMARY table: p type: ref possible_keys: idx_fk_customer_id key: idx_fk_customer_id key_len: 2 ref: sakila.c.customer_id rows: 14 Extra: Using where *************************** 3. row *************************** id: 2 select_type: DEPENDENT SUBQUERY table: payment type: ref possible_keys: idx_fk_customer_id key: idx_fk_customer_id key_len: 2 ref: sakila.p.customer_id rows: 14
08/02/07 OSCON 2007 - Target Practice 32
Correlated subquery == evil (at least in MySQL)
● Correlated subquery can be (and most often is) executed once for each matched row in the outer set of information.– Kills scalability and performance as data
sets grows (even to a tiny 10K rows)● Can always be rewritten using joins
– Yes, always– That's what many optimizers do anyway
08/02/07 OSCON 2007 - Target Practice 33
Example #12 - Rewriting evil subquery
EXPLAIN SELECT p.payment_id, p.amount, p.payment_date, c.first_name, c.last_nameFROM payment pINNER JOIN ( SELECT customer_id, MAX(payment_date) AS payment_date FROM payment GROUP BY customer_id) AS last_ordersON p.customer_id = last_orders.customer_idAND p.payment_date = last_orders.payment_dateINNER JOIN customer cON p.customer_id = c.customer_id\G
08/02/07 OSCON 2007 - Target Practice 34
EXPLAIN output from example #12*************************** 1. row *************************** id: 1 select_type: PRIMARY table: <derived2> type: ALL rows: 599 *************************** 2. row *************************** id: 1 select_type: PRIMARY table: c type: eq_ref ref: last_orders.customer_id rows: 1 *************************** 3. row *************************** id: 1 select_type: PRIMARY table: p type: ref possible_keys: idx_fk_customer_id key: idx_fk_customer_id key_len: 2 ref: sakila.c.customer_id rows: 15 Extra: Using where *************************** 4. row *************************** id: 2 select_type: DERIVED table: payment type: index key: idx_fk_customer_id key_len: 2 rows: 16451
phew. break time.
The Schema, Indexes, and MySQL Query Cache
08/02/07 OSCON 2007 - Target Practice 37
Covering indexes
● An index is a covering index when all fields in SELECT statement for a specific table are contained in an index
● Shows up in Extra column of EXPLAIN as “Using index”
● Removes need for “bookmark lookup” operation– What the heck is a bookmark lookup
operation?
Non clustered layout (MyISAM)
1-100
Data filecontaining unordered
data records
1-33 34-66 67-100
Root Index Node stores a directory of keys, along with pointers to non-leaf nodes (or leaf nodes for a very
small index)
Leaf nodes store sub-
directories of index keys
with pointers into the data
file to a specific record
these are the bookmark lookups...that a covering index eliminates
08/02/07 OSCON 2007 - Target Practice 39
Example #13 - covering index
EXPLAIN SELECT rental_id, customer_id, inventory_id FROM rental WHERE rental_date = '2005-06-14'\G*************************** 1. row *************************** id: 1 select_type: SIMPLE table: rental type: ref possible_keys: rental_date key: rental_date key_len: 8 ref: const rows: 1 Extra: Using index 1 row in set (0.00 sec)
Why is this a covering index?
Clustered layout (InnoDB)
1-100
1-33
In a clustered layout, the leaf nodes actually
contain all the data for the record (not just the index key,
like in the non-clustered layout)
Root Index Node stores a directory of keys, along with pointers to non-leaf nodes (or leaf nodes for a very
small index)
34-66 67-100
Important:
When looking up a record by a primary key, for a clustered layout/organization, the lookup operation (following the pointer from the leaf node to the data file) involved in a non-clustered layout is not needed.
08/02/07 OSCON 2007 - Target Practice 41
Clustered layout
● Primary key value is appended to every record in a secondary index– So use a small primary key on InnoDB tables
● If you don't add a primary key...– InnoDB will automagically create one...
● A 6-byte integer that you have no control over– So always create a primary key...
08/02/07 OSCON 2007 - Target Practice 42
Filesorts and temporary tables
● Occur when– A sorted list of values needed does not
exist– More memory needed for grouping/sorting
● > tmp_table_size AND max_heap_table_size– a derived table (subquery in from clause)– Inappropriate column order in index
● SHOW STATUS LIKE 'Created_tmp_%';
08/02/07 OSCON 2007 - Target Practice 43
Temporary tables
● A temporary table is a MEMORY table or a MyISAM table– depending on size– or if the SELECT contains a TEXT or BLOB!
● VARCHARs turn into CHARs– Crazy Dr. Jekyll / Mr. Hyde situation?– Could be this
● One benefit: no locking...since within session
08/02/07 OSCON 2007 - Target Practice 44
Example #14 - temporary tables & filesorts
SELECT rental_rate, COUNT(*) AS num_filmsFROM film WHERE language_id = 1 GROUP BY rental_rate ORDER BY rental_rate DESC\g*************************** 1. row *************************** id: 1 select_type: SIMPLE table: film type: ref possible_keys: idx_fk_language_id key: idx_fk_language_id key_len: 1 ref: const rows: 511 Extra: Using where; Using temporary; Using filesort
08/02/07 OSCON 2007 - Target Practice 45
Example #15 - Remedy to filesort in #14
ALTER TABLE film DROP INDEX idx_fk_language_id, ADD INDEX idx_language_rental_rate (language_id, rental_rate);...re-run EXPLAIN form example #14...*************************** 1. row *************************** id: 1 select_type: SIMPLE table: film type: ref possible_keys: idx_language_rental_rate key: idx_language_rental_rate key_len: 1 ref: const rows: 469 Extra: Using where; Using index
08/02/07 OSCON 2007 - Target Practice 46
Example #16 - Ah, but there's a problem
SELECT rental_rate, COUNT(*) AS num_filmsFROM film GROUP BY rental_rate ORDER BY rental_rate DESC\g*************************** 1. row *************************** id: 1 select_type: SIMPLE table: film type: index possible_keys: NULL key: idx_language_rental_rate key_len: 3 ref: NULL rows: 938 Extra: Using index; Using temporary; Using filesort
08/02/07 OSCON 2007 - Target Practice 47
Example #17 - Effects of functions on indexes
EXPLAIN SELECT * FROM film WHERE title LIKE 'Tr%'\G*************************** 1. row *************************** select_type: SIMPLE table: film type: range possible_keys: idx_title key: idx_title key_len: 767 ref: NULL rows: 15
EXPLAIN SELECT * FROM film WHERE LEFT(title, 2) = 'Tr'\G*************************** 1. row *************************** select_type: SIMPLE table: film type: ALL rows: 938 Extra: Using where
08/02/07 OSCON 2007 - Target Practice 48
Example #18 - Rewriting to avoid functions
SELECT * FROM OrdersWHERE TO_DAYS(CURRENT_DATE()) – TO_DAYS(order_created) <= 7;
Not a good idea! Lots o' problems with this...
SELECT * FROM OrdersWHERE order_created >= CURRENT_DATE() INTERVAL 7 DAY;
Better... Now the index on order_created will be used at least. Still a problem, though...
SELECT order_id, order_created, customerFROM OrdersWHERE order_created >= '20070211' INTERVAL 7 DAY;
Best. Now the query cache can cache this query, and given no updates, only run it once a day...
replace the CURRENT_DATE() function with a constant string in your programming language du jour... for instance, in PHP, we'd do:
$sql= “SELECT order_id, order_created, customer FROM Orders WHERE order_created >= '“ .date('Y-m-d') . “' - INTERVAL 7 DAY”;
08/02/07 OSCON 2007 - Target Practice 49
Schema guidelines
● Use the smallest data type possible– Do you really need that BIGINT?
● The smaller your data types, the more index (and data) records can fit into a single block of memory– Especially important for indexed fields
● Normalize first, denormalize later...
Journey to the Center of the Database
Ahh, normalization...
http://thedailywtf.com/forums/thread/75982.aspx
08/02/07 OSCON 2007 - Target Practice 51
IP Addresses
● An IP address always reduces down to an INT UNSIGNED
● Each subnet part corresponds to one 8-byte division of the underlying INT UNSIGNED
● Use INET_ATON() to convert from a string to an integer
● Use INET_NTOA() to convert from integer to string
08/02/07 OSCON 2007 - Target Practice 52
Example #20 - IP addressingCREATE TABLE Sessions ( session_id INT UNSIGNED NOT NULL AUTO_INCREMENT, ip_address INT UNSIGNED NOT NULL // Compared to CHAR(15)!!, session_data TEXT NOT NULL, PRIMARY KEY (session_id), INDEX (ip_address)) ENGINE=InnoDB;
// Find all sessions coming from a local subnetSELECT * FROM SessionsWHERE ip_address BETWEEN INET_ATON('192.168.0.1') AND INET_ATON('192.168.0.255');
The INET_ATON() function reduces the string to a constant INT and a highly optimized range operation will be performed for:
SELECT * FROM SessionsWHERE ip_address BETWEEN 3232235521 AND 3232235775
The MySQL Query Cache
Clients
Parser
Optimizer
QueryCache
Pluggable Storage Engine API
MyISAM InnoDB MEMORY Falcon Archive PBXT SolidDB Cluster(Ndb)
ConnectionHandling &
Net I/O
“Packaging”
08/02/07 OSCON 2007 - Target Practice 54
MySQL Query Cache
● To use effectively, you must understand application read/write ratio
● QC design is a compromise between CPU usage and read performance
● Bigger query cache != better performance, even for heavy read applications
08/02/07 OSCON 2007 - Target Practice 55
MySQL Query Cache invalidation
● Coarse invalidation designed to prevent CPU overuse during finding and storing cache entries
● This means any modification to any table referenced in the SELECT will invalidate any cache entry which uses that table
● Remedy with vertical table partitioning
08/02/07 OSCON 2007 - Target Practice 56
Example #19 - Remedy to QC thrashingCREATE TABLE Products ( product_id INT UNSIGNED NOT NULL AUTO_INCREMENT, name VARCHAR(80) NOT NULL, unit_cost DECIMAL(7,2) NOT NULL, description TEXT NULL, image_path TEXT NULL, num_views INT UNSIGNED NOT NULL, num_in_stock INT UNSIGNED NOT NULL, num_on_order INT UNSIGNED NOT NULL, PRIMARY KEY (product_id), INDEX (name(20))) ENGINE=InnoDB; // Or MyISAM
CREATE TABLE Products ( product_id INT UNSIGNED NOT NULL AUTO_INCREMENT, name VARCHAR(80) NOT NULL, unit_cost DECIMAL(7,2) NOT NULL, description TEXT NULL, image_path TEXT NULL, PRIMARY KEY (product_id), INDEX (name(20))) ENGINE=InnoDB; // Or MyISAMCREATE TABLE ProductCounts ( product_id INT UNSIGNED NOT NULL, num_views INT UNSIGNED NOT NULL, num_in_stock INT UNSIGNED NOT NULL, num_on_order INT UNSIGNED NOT NULL, PRIMARY KEY (product_id)) ENGINE=InnoDB;
Benchmarking
08/02/07 OSCON 2007 - Target Practice 58
Benchmarking basics
● Enables tracking of performance of something over time– A whole application– An SQL script– An application script or web page
● Enables load or stress testing● Enables the comparison of two similar
techniques, engines, platforms, etc.
08/02/07 OSCON 2007 - Target Practice 59
Change one thing at a time
● Change only one thing at a time– Config variable– Addition of an index– Modification to a schema table– Change to an SQL script
● Adhere to scientific principle of isolating a variable condition
● Re-run the benchmark after change
08/02/07 OSCON 2007 - Target Practice 60
Isolate the benchmark environment
● Disable non-essential services– Network traffic analyzers– Non-essential servers– MySQL query cache (most of the time)
● Use a separate machine if possible● Separate MYSQL instance
– Use the --socket configuration variable to differentiate instance
08/02/07 OSCON 2007 - Target Practice 61
Save the benchmark results
● Save all benchmark files– Result files of benchmark runs– Configuration files at time of run– OS/Hardware configuration, if changed
● Have a baseline folder when doing stress testing– Configuration file at start of changes– Base results of benchmark tests
08/02/07 OSCON 2007 - Target Practice 62
Why a benchmarking framework?
● Automates the most tedious part of testing● Standardizes benchmark results● Can customize to your needs● Framework can be analyzed by outside
parties to determine:– If framework provides consistent results– If framework is biased or slanted in any
way
08/02/07 OSCON 2007 - Target Practice 63
A note about large benchmark frameworks
● Huge frameworks like TPC-x or SPECjAppServer2004– Cost lots of moolah– Developed to test fairly specific things
● Application server performance● Transaction processing load
● Some (like TPC) only allow member organizations to publish benchmarks
08/02/07 OSCON 2007 - Target Practice 64
More specific benchmarking tools
● sysbench– Useful for doing raw comparisons of
different MySQL versions or platforms● supersmack
– Fairly flexible (but difficult) tool for measuring SQL script performance
● ab (ApacheBench)– Great for load testing a web server
08/02/07 OSCON 2007 - Target Practice 65
Generic frameworks for the common folk
● Many other frameworks designed for the needs of the common developer– Junit/Ant (for you Java drinkers...)– MyBench (for you Perl mongers...)– thewench (for you PHP folks...)
● Today, I'll show thewench ... why?– Because I wrote it as a prototype– Because it will give you ideas to create your own
thewench(a.k.a. Jay's humble little servant)
08/02/07 OSCON 2007 - Target Practice 67
thewench system requirements
● System requirements– PHP 5 with CLI (command line interface)– ext/mysqli– simple_xml support (PHP 5.1.10+)
● Future versions to support:– PDO, phpnd, other DB drivers– PostgreSQL, Oracle, other Dbs– Human and machine readable output
08/02/07 OSCON 2007 - Target Practice 68
Installing thewench
● Download thewench from her online home:$> wget http://jpipes.com/thewench.tar.gz
● Untar into some directory and make her an executable:$> tar xzf thewench.tag.gz
$> cd thewench/
$> chmod 0755 thewench.php
$> cd /usr/sbin
$> sudo ln -s ~/thewench/thewench.php thewench
08/02/07 OSCON 2007 - Target Practice 69
About thewench
● Multi-threaded -- uses pctrl_fork()● Runs test files
– Tests can be grouped into test suites● Configured either via config files or command
line args– Looks in command line and in./, ~/, $installdir
and /etc/thewench/
08/02/07 OSCON 2007 - Target Practice 70
thewench configuration options
-d | --database | database=mydbname
-h | --host | host=myhost
-u | --user | user=mydbuser
-p | --password | password=mydbpass
-c | --concurrency | concurrency=n
-n | --seconds | seconds=n
--port | port=nnnn
--socket | socket=path_to_socket
--test-dir | test-dir=path_to_tests
--suite-dir | suite-dir=path_to_test_suites
08/02/07 OSCON 2007 - Target Practice 71
Sample thewench.cnf
test_dir = "/home/jpipes/benchmarks/tests"
suite_dir = "/home/jpipes/benchmarks/suites"
host = "localhost"
user = "root"
password = ""
database = "sakila"
port = 3307
socket=/tmp/target.sock
08/02/07 OSCON 2007 - Target Practice 72
Running thewench
● Either name a set of tests:$> thewench [options] correlated derived
● or specify a test suite name:$> thewench [options] --suite=derived-compare
● Either case depends on existence of test files– XML test file format
● Must be named testname.[test|xml]– A test suite is simply a newline-delimited
list of tests in a file named for the suite
08/02/07 OSCON 2007 - Target Practice 73
thewench test files
● <test> must be outermost element● 1 or more <query> elements
08/02/07 OSCON 2007 - Target Practice 74
The <query> element
● One or more required per test● Must contain an <sql> element
– Good idea to put content within <![CDATA[ tags
● Can contain one or more <arg> elements– Each <arg> should be represented by a
question mark (?) in the <sql> element's SQL text
08/02/07 OSCON 2007 - Target Practice 75
Example test - correlated.xml
<?xml version='1.0' standalone='yes'?><test> <query> <sql><![CDATA[SELECT p.*FROM sakila.payment pWHERE p.payment_date =( SELECT MAX(payment_date) FROM sakila.payment WHERE customer_id=p.customer_id)]]> </sql> </query></test>
08/02/07 OSCON 2007 - Target Practice 76
Example test - derived.xml
<?xml version='1.0' standalone='yes'?><test> <query> <sql><![CDATA[SELECT p.*FROM ( SELECT customer_id, MAX(payment_date) as last_order FROM sakila.payment GROUP BY customer_id) AS last_ordersINNER JOIN sakila.payment pON p.customer_id = last_orders.customer_idAND p.payment_date = last_orders.last_order]]></sql> </query></test>
08/02/07 OSCON 2007 - Target Practice 77
Parting words...
● Email me <[email protected]>● New MySQL Forge coming soon
– http://forge.mysql.com● Party tomorrow night at DoubleTree w/Zend● Birds of a Feather on Wednesday evening
– Come hack on thewench with me...
08/02/07 OSCON 2007 - Target Practice 78
Example output - derived vs. correlated
jpipes@shakedown:~/benchmarks/tests$ thewench --seconds=3 derived correlated>> thewench v0.6Results, ordered by average time taken per request:+-----+--------------------------------+-----------+----------+----------+| # | test name | req/sec | comp | fail |+-----+--------------------------------+-----------+----------+----------+| 1 | derived | 10.966133 | 33 | 0 || 2 | correlated | 0.813098 | 3 | 0 |+-----+--------------------------------+-----------+----------+----------+