What you wanted to know about MySQL, but could not find using internal instrumentation only

Post on 08-Feb-2017

What you wanted to know about MySQL but could not find using internal instrumentation only

February 3, 2017

Sveta Smirnova

∙ MySQL Support engineer
∙ Author of
  ∙ MySQL Troubleshooting
  ∙ JSON UDF functions
  ∙ FILTER clause for MySQL
∙ Speaker
  ∙ Percona Live, OOW, Fosdem, DevConf, HighLoad...

Sveta Smirnova

2

Year 2009

3

∙ In modern versions we have a lot of online information
∙ However, users usually notice errors from log files, when the context is already gone
∙ This is partially solved by modern monitoring tools (PMM), which can save historical statistics
∙ But not about everything

Historical Data

4

∙ It is easy to find the query which failed with this error in the Audit log records

<AUDIT_RECORD
  NAME="Query"
  RECORD="2_2017-01-12T20:40:36"
  TIMESTAMP="2017-01-12T20:41:32 UTC"
  COMMAND_CLASS="update"
  CONNECTION_ID="3"
  STATUS="1205"
  SQLTEXT="update t1 set f1=f1-1"
  USER="root[root] @ localhost [127.0.0.1]"
  HOST="localhost"
  OS_USER=""
  IP="127.0.0.1"
  DB="test"
/>

∙ But where is the query which holds the lock?
∙ Hard to find even online
  ∙ Especially if you have thousands of running threads!
∙ Multiple statement transactions make it worse
∙ However, the server has all the information needed to print all queries of the locking transaction
∙ MySQL Bug #84563

Lock wait timeout

5
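
The audit log search mentioned on this slide could be sketched as follows. This is a minimal illustration, assuming the attribute-style XML audit log format of the record shown above; the <AUDIT> wrapper element, the function name and the SAMPLE string are assumptions made for the example, and only the record itself comes from the slide. It finds the statement that failed with status 1205, but it cannot tell which statement held the lock, which is exactly the gap the slide describes.

```python
import xml.etree.ElementTree as ET

LOCK_WAIT_TIMEOUT = "1205"  # status value of the failed statement on the slide

# Record from the slide, wrapped in an assumed <AUDIT> root element.
SAMPLE = """<AUDIT>
<AUDIT_RECORD
  NAME="Query"
  RECORD="2_2017-01-12T20:40:36"
  TIMESTAMP="2017-01-12T20:41:32 UTC"
  COMMAND_CLASS="update"
  CONNECTION_ID="3"
  STATUS="1205"
  SQLTEXT="update t1 set f1=f1-1"
  USER="root[root] @ localhost [127.0.0.1]"
  HOST="localhost"
  OS_USER=""
  IP="127.0.0.1"
  DB="test"
/>
</AUDIT>"""


def failed_with_lock_wait_timeout(audit_xml):
    """Return (timestamp, connection_id, sql_text) for records with status 1205."""
    root = ET.fromstring(audit_xml)
    return [
        (rec.get("TIMESTAMP"), rec.get("CONNECTION_ID"), rec.get("SQLTEXT"))
        for rec in root.iter("AUDIT_RECORD")
        # strip() tolerates stray whitespace inside the attribute value
        if rec.get("STATUS", "").strip() == LOCK_WAIT_TIMEOUT
    ]


for row in failed_with_lock_wait_timeout(SAMPLE):
    print(row)
```

The result identifies only the waiting connection; the statements of the transaction that held the lock are not part of any record with this status.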

∙ First transaction

------------------------
LATEST DETECTED DEADLOCK
------------------------
2017-01-19 13:03:42 7f37fc636700
*** (1) TRANSACTION:
TRANSACTION 1298, ACTIVE 3 sec starting index read
...
DELETE FROM t WHERE i = 1
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 314 n bits 72 index `GEN_CLUST_INDEX` of table `test`.`t` trx id 1298 lock_mode X waiting
...

∙ Second transaction

*** (2) TRANSACTION:
TRANSACTION 1297, ACTIVE 7 sec starting index read
...
DELETE FROM t WHERE i = 1
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 0 page no 314 n bits 72 index `GEN_CLUST_INDEX`...
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 314 n bits 72 index `GEN_CLUST_INDEX`...
*** WE ROLL BACK TRANSACTION (1)

∙ Which query held the lock?
  ∙ SELECT * FROM t WHERE i = 1 LOCK IN SHARE MODE;
∙ How would we know?
  ∙ refman/.../innodb-deadlock-example.html
∙ Bug #84607

What exactly caused the deadlock?

6
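
To make the gap concrete, here is a small sketch that summarizes the deadlock report from this slide. It assumes the report layout shown on the slide; the regular expressions, the function name and the crude rule for spotting a statement line (a line starting with a DML keyword) are illustrative assumptions, not part of any MySQL tool. The summary contains only the last statement of each transaction, so the SELECT ... LOCK IN SHARE MODE that took the lock never appears.

```python
import re

# Abridged report from the slides; "..." marks lines elided on the slides.
REPORT = """\
------------------------
LATEST DETECTED DEADLOCK
------------------------
2017-01-19 13:03:42 7f37fc636700
*** (1) TRANSACTION:
TRANSACTION 1298, ACTIVE 3 sec starting index read
...
DELETE FROM t WHERE i = 1
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 314 n bits 72 index `GEN_CLUST_INDEX` of table `test`.`t` trx id 1298 lock_mode X waiting
...
*** (2) TRANSACTION:
TRANSACTION 1297, ACTIVE 7 sec starting index read
...
DELETE FROM t WHERE i = 1
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 0 page no 314 n bits 72 index `GEN_CLUST_INDEX`...
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 314 n bits 72 index `GEN_CLUST_INDEX`...
*** WE ROLL BACK TRANSACTION (1)
"""

SECTION = re.compile(r"^\*\*\* \((\d+)\) (.+):$")
TRX = re.compile(r"^TRANSACTION (\d+), ACTIVE (\d+) sec")
VICTIM = re.compile(r"^\*\*\* WE ROLL BACK TRANSACTION \((\d+)\)$")
# Assumption: a statement line starts with one of these keywords.
STATEMENT = re.compile(r"^(SELECT|INSERT|UPDATE|DELETE|REPLACE)\b", re.IGNORECASE)


def summarize_deadlock(report):
    """Return ({number: {"trx_id", "statement"}}, number of the rolled back trx)."""
    transactions = {}
    victim = None
    current = None
    for line in report.splitlines():
        section = SECTION.match(line)
        if section:
            current = int(section.group(1))
            transactions.setdefault(current, {"trx_id": None, "statement": None})
            continue
        rolled_back = VICTIM.match(line)
        if rolled_back:
            victim = int(rolled_back.group(1))
            continue
        if current is None:
            continue  # still in the report header
        trx = TRX.match(line)
        if trx:
            transactions[current]["trx_id"] = int(trx.group(1))
        elif STATEMENT.match(line):
            transactions[current]["statement"] = line.strip()
    return transactions, victim


transactions, victim = summarize_deadlock(REPORT)
for number, info in sorted(transactions.items()):
    print(number, info["trx_id"], info["statement"])
print("rolled back:", victim)
```

Both transactions are reported with the same DELETE statement, so the report alone does not explain why the second one already held a lock on the row.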

∙ Performance Schema
  ∙ Bug #71364 Please provide warning text information into P_S
  ∙ Bug #61030 Make an I_S table of client error codes
  ∙ Bug #58058 please add instrumentation to track error counts on a server
∙ General logging
  ∙ Bug #70796 Error messages and warnings for sql-mode behaviours need more verbosity
  ∙ Bug #64190 Log failed queries in a separate log
  ∙ Bug #60884 Enable logging of all errors to the error log
  ∙ Bug #34137 Additional logging of the server shutdown process

Some past requests

7

∙ Which kind of query can produce this output?
  ∙ t is an InnoDB table

mysql> select * from table_handles where object_name='t'\G
*************************** 1. row ***************************
OBJECT_TYPE: TABLE
OBJECT_SCHEMA: test
OBJECT_NAME: t
OBJECT_INSTANCE_BEGIN: 140108477034256
OWNER_THREAD_ID: 23
OWNER_EVENT_ID: 3788
INTERNAL_LOCK: NULL
EXTERNAL_LOCK: READ EXTERNAL
1 row in set (0,00 sec)

  ∙ lock table t read;
  ∙ select * from t [lock in share mode];
  ∙ select * from t where i [=,in,<,>] ...
  ∙ But not select * from t where unique_key = ... !

∙ Which kind of query can produce this output?
  ∙ t is an InnoDB table

mysql> select * from table_handles where object_name='t'\G
*************************** 1. row ***************************
OBJECT_TYPE: TABLE
OBJECT_SCHEMA: test
OBJECT_NAME: t
OBJECT_INSTANCE_BEGIN: 140108477034256
OWNER_THREAD_ID: 23
OWNER_EVENT_ID: 4379
INTERNAL_LOCK: NULL
EXTERNAL_LOCK: WRITE EXTERNAL
1 row in set (0,00 sec)

  ∙ lock table t write;
  ∙ select * from t for update;
  ∙ update t set i=i+sleep(i) where i [=,in,<,>] ...

∙ Manual says: "The table lock used at the storage engine level. The value is one of READ EXTERNAL or WRITE EXTERNAL."
∙ Is this a storage engine level operation?
∙ Bug #84609
∙ Bug #84610

Table_handles

8

∙ In the past we had only one troubleshooting tool
  ∙ SHOW SLAVE STATUS
∙ Today Performance Schema supports replication
∙ But it still misses
  ∙ Bug #81249 SLAVE_NET_TIMEOUT TO P_S FOR SLAVE THREAD VARIABLES
  ∙ Bug #78918 Metric for successful slave reconnects
  ∙ Bug #77605 Add more information to SQL thread-related P_S tables
  ∙ Bug #76828 Slave details on a master
  ∙ Bug #74809 Stats per binlog event type
  ∙ Bug #72826 Support for joining replication_execute_status_by_%
  ∙ Bug #70951 Threads shutdown info

Replication

9

∙ What does this output mean?

2017-01-20T21:44:52.301177Z 5 [Note] Aborted connection 5 to db: 'test' user: 'root' host: 'localhost' (Got timeout reading communication packets)

  ∙ Timeout while connection was establishing?
  ∙ Connection was aborted, because interactive_timeout/wait_timeout passed?
  ∙ Something else?
∙ Bug #51219, Bug #28836, Bug #78843, Bug #84612

Connection errors

10
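
The error log line from this slide can be taken apart as in the sketch below. It assumes the exact line layout shown on the slide; the regular expression, the field names and the function name are made up for this illustration. The only hint about the cause is the free-text reason in parentheses, which does not say which timeout fired.

```python
import re

# Pattern for the "Aborted connection" note as it appears on the slide.
ABORTED = re.compile(
    r"^(?P<timestamp>\S+) (?P<thread_id>\d+) \[Note\] "
    r"Aborted connection (?P<connection_id>\d+) to db: '(?P<db>[^']*)' "
    r"user: '(?P<user>[^']*)' host: '(?P<host>[^']*)' \((?P<reason>.*)\)$"
)

LINE = (
    "2017-01-20T21:44:52.301177Z 5 [Note] Aborted connection 5 to db: 'test' "
    "user: 'root' host: 'localhost' (Got timeout reading communication packets)"
)


def parse_aborted_connection(line):
    """Return the fields of an 'Aborted connection' note, or None if no match."""
    match = ABORTED.match(line.strip())
    return match.groupdict() if match else None


parsed = parse_aborted_connection(LINE)
print(parsed["connection_id"], parsed["user"], parsed["host"], parsed["reason"])
```

Nothing in the parsed fields distinguishes a timeout during connection setup from an expired interactive_timeout or wait_timeout.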

∙ Bug #77888 max_used_connection per user/account missing in P_S/sys
∙ Bug #77581 Collect DNS timing information into Performance_Schema
∙ Bug #76403 COUNT_ABORTED_CLIENT_ERRORS to P_S.host_cache
∙ Bug #72219 First and last connection timestamps to P_S.users table
∙ Bug #71305 PERFORMANCE_SCHEMA.THREADS table, add a PORT column
∙ Bug #71186 P_S.host_cache does not collect connections aborted entries
∙ Bug #69880 Track and expose connection creation timestamp
∙ Bug #69725 P_S.socket_instances doesn't include named pipe or shared memory connections
∙ Bug #45817 Please add SHOW command for inc_host_errors(max_connect_errors)
∙ Bug #21565 More verbose connection log

Other connection requests

11

∙ One more output

mysql> flush status;
Query OK, 0 rows affected (0,00 sec)

mysql> select ...
600048 rows in set (1 min 17,26 sec)

mysql> show status like 'Created_tmp%';
+-------------------------+-------+
| Variable_name           | Value |
+-------------------------+-------+
| Created_tmp_disk_tables | 2     |
| Created_tmp_files       | 6     |
| Created_tmp_tables      | 3     |
+-------------------------+-------+
3 rows in set (0,00 sec)

∙ Were the tables created simultaneously?
∙ What is their size?
∙ Solution: watch lsof

COMMAND PID  USER  FD  TYPE DEVICE SIZE/OFF NODE     NAME
mysqld  8697 sveta 70u REG  0,43   11765657 43001204 /tmp/mysqld.1/MYSeEOHe (deleted)
mysqld  8697 sveta 71u REG  0,43   11765657 43001205 /tmp/mysqld.1/MYVwF8Od (deleted)

∙ Bug #74484
∙ Bug #84613

Temporary tables

12
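
The "watch lsof" workaround could be turned into numbers with a sketch like the one below. It assumes lsof output with the columns shown on the slide; the function name and the default temporary directory (copied from the slide's output) are assumptions for illustration and would differ on another server. It works on a captured snapshot, so repeated sampling, for example with watch, is left out.

```python
# Snapshot of lsof output from the slide.
LSOF_OUTPUT = """\
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
mysqld 8697 sveta 70u REG 0,43 11765657 43001204 /tmp/mysqld.1/MYSeEOHe (deleted)
mysqld 8697 sveta 71u REG 0,43 11765657 43001205 /tmp/mysqld.1/MYVwF8Od (deleted)
"""

DELETED = "(deleted)"


def deleted_temp_files(lsof_output, tmpdir="/tmp/mysqld.1/"):
    """Return (path, size_in_bytes) for deleted regular files mysqld keeps open."""
    files = []
    for line in lsof_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 8)  # NAME may contain spaces, keep it whole
        if len(fields) < 9:
            continue
        command, file_type, size, name = fields[0], fields[4], fields[6], fields[8]
        if command != "mysqld" or file_type != "REG" or not size.isdigit():
            continue  # SIZE/OFF may hold an offset such as 0t123 instead of a size
        if name.startswith(tmpdir) and name.endswith(DELETED):
            files.append((name[: -len(DELETED)].strip(), int(size)))
    return files


files = deleted_temp_files(LSOF_OUTPUT)
for path, size in files:
    print(path, size)
print("files open at the same time:", len(files))
print("total bytes:", sum(size for _, size in files))
```

A snapshot like this answers both questions from the slide for one moment in time: how many temporary files exist simultaneously and how large they are.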

∙ I_S.OPTIMIZER_TRACE is a good addition for the Optimizer
∙ But what about other parts of the server?
  ∙ Runtime
  ∙ Parser
  ∙ Binary logging
  ∙ InnoDB
∙ Bug #84620

Trace

13

∙ General
  ∙ Bug #83626 Collect per column usage data in performance_schema
  ∙ Bug #71755 Provide per partition summary information in PERFORMANCE_SCHEMA
  ∙ Bug #81020 performance_schema: Please add optimizer usage statistics
  ∙ Bug #55171 How much sort_buffer_size are actually used?
∙ InnoDB
  ∙ Bug #81611 Add P_S metrics to collect compressed page bytes vs other types written to redo log
  ∙ Bug #78448 Provide better metrics on innodb_sort_buffer_size usage
  ∙ Bug #71698 Add instrumentation for the doublewrite buffer and undo segments

More tracing requests

14

∙ SHOW PROCESSLIST has multiple states
∙ Some of them are clear
∙ But what do these mean?
  ∙ System lock
  ∙ statistics
  ∙ freeing items
  ∙ Sending data
  ∙ cleaning up
  ∙ closing tables
  ∙ end
∙ Bug #57544
∙ Bug #72083
∙ Bug #84615

Vague stages

15

Summary

∙ Feature requests
∙ Comments
∙ Fixes

To better MySQL!

17

http://www.slideshare.net/SvetaSmirnova

https://twitter.com/svetsmirnova

https://github.com/svetasmirnova

Thank you!

18

DATABASE PERFORMANCE MATTERS