Marko Mäkelä - InnoDB
-
Upload
morten-andersen -
Category
Sales
-
view
157 -
download
2
description
Transcript of Marko Mäkelä - InnoDB
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
MySQL 5.7InnoDB—What’s NewSunny Bains – [email protected]
Senior Engineering Manager
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.
Marko Mäkelä [email protected]
Senior Principal Software Engineer
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Program Agenda
Performance
Features
Download & Blogs
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Performance : Transactions
Transaction poolFixed chunks of 4MB each
sizeof(trx_t) reduced from 1144 to 712 bytes
Ordered on address, locality of reference
Improves performance of read-write transaction list scans
Reduces malloc()/free() overhead
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Performance : Transactions
Transaction life cycleAll transactions are considered as read-only by default
Read only transaction start/commit mutex free
No application changes required
Read views are cached
Read view created iff a RW transaction started since the last snapshot
Reduce contention when implicit → explicit row lock conversion is done
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Performance : Transactions
Transaction life cycleHigh priority transactions (Replication GCS)
Cannot be rolled back
Can jump the record lock queue – prioritized
Currently not visible to end users
Allows rollback of arbitrary user transactions internally• If purge blocked due to long running transactions
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Performance : Memcached Plugin
8.0 16.0 32.0 64.0 128.0 256.0 512.0 1024.00
200,000
400,000
600,000
800,000
1,000,000
1,200,000
MySQL 5.7 vs 5.6 - InnoDB & Memcached
MySQL 5.7
MySQL 5.6
Connections
Queries per Second
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Performance : Sysbench OLTP Read-Only
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Performance : Sysbench OLTP Read-Write
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Performance : Temporary Table Optimizations
DDL changesNot stored in the data dictionary – lower mutex contention
Special shared temporary tablespace – lower IO overhead
Compressed tables done the old way, separate .ibd file
Tablespace recreated on startup
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Performance : Temporary Table Optimizations
DML changesSpecial UNDO logs that are not redo logged
Undo logging required for rollback to savepoint
Changes to the temporary tablespace are not redo logged
No fsyncs() on the temporary tablespace
Configuration variablestemp_data_file_path := same format as the system tablespace
e.g., ibtmp1:12M:autoextend – default setting
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Performance : Temporary Tables Benchmarks
5.6 5.70
100
200
300
400
500
600
700
Temporary table CREATE/DROP
Version
Se
con
ds
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Performance : Temporary Tables Benchmarks
5.6 5.70
100
200
300
400
500
600
700
Insert 5M rows
Versions
Se
con
ds
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Performance : Temporary Tables Benchmarks
5.6 5,70
500
1000
1500
2000
2500
Delete 5M rows
Version
Se
con
ds
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Performance : Temporary Tables Benchmarks
5.6 5.70
500
1000
1500
2000
2500
Update 5M rows
Version
Se
con
ds
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Performance : Buffer pool improvements
Use atomics for page reference countingBug#68079 - INNODB DOES NOT SCALE WELL ON 12 CORE
Flush list traversalFix flush and LRU list rescanning
Previously after flushing a page we would start from the tail again
Extra work and increased mutex contention
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Performance : Buffer pool improvements
Multithreaded flushing5.6 introduced a separate thread for flushing
5.7 allows multiple threads
--innodb-page-cleaners := 1..64 – default is 1
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Performance : Redo log
Better concurrency and faster recoveryRemove unused code
Fix read on write issue – pad the log buffer before writing to disk
Refactor and clean up the mini-transaction code
Optimize mutex acquire/release during log checkpoint
Write data file names to the redo log
No need to open clean *.ibd files on crash recovery
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Performance : Memcached Plugin
Leverages the read-only transaction optimizationsFixed several bottlenecks in the Memcache and the plugin code
1.1 Million GET/s
Limiting factors were:• The network• Memcached client
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Performance : Memcache Benchmarks
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Performance : index->lock
Increased concurrency, improved performanceVery complex fix
Previously entire index X latched for tree structure modification
B-tree internal nodes not latched before fix
New SX lock mode – compatible with S lock mode
Increases concurrency e.g., index->lock(SX), reads can proceed
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Performance : DDL & Truncate
Truncate table is now atomicPreviously DROP + CREATE
ID mismatch or .ibd missing If crash after DROP but before CREATE
More schema-only ALTER TABLE supportedRename index
VARCHAR extension
Faster ALTER TABLEBug#17657223 EXCESSIVE TEMPORARY FILE USAGE IN ALTER TABLE
WL#7277 Bulk Load for Index Creation
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Features : Partitions
Native PartitioningReduced memory overhead
Allows us to easily add• Foreign key support• Full text index support
Makes it easier to plan for a parallel query infra-structure
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Features : PartitionsNative Partitioning memory overhead improvement — labs
Example Table with 8K partitionsCREATE TABLE `t1` (
`a` int(10) unsigned NOT NULL AUTO_INCREMENT,`b` varchar(1024) DEFAULT NULL, PRIMARY KEY (`a`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 PARTITION BY HASH (a) PARTITIONS 8192;
Memory overhead comparison
One open instance uses 49 % less memory (111 MB vs 218 MB)
Ten open instances take 90 % less memory (113 MB vs 1166 MB)
More work to be done – stay tuned!
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Features : PartitionsImport/Export support
Importing a single partition# If the table doesn't already exist, create it
mysql> CREATE TABLE partitioned_table <same as the source>;
# Discard the tablespaces for the partitions to be restored
mysql> ALTER TABLE partitioned_table DISCARD PARTITION p1,p4 TABLESPACE;
# Copy the tablespace files
$ cp /path/to/backup/db-name/partitioned_table#P#p{1,4}.{ibd,cfg} /path/to/mysql-datadir/db-name/
# Import the tablespaces
mysql> ALTER TABLE partitioned_table IMPORT PARTITION p1,p4 TABLESPACE;
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Features : Partitions
DMLIndex condition push down
Limited HANDLER support for partitionsCREATE TABLE t (a int, b int, KEY (a, b)) PARTITION BY HASH (b) PARTITIONS 2;
HANDLER t READ a = (1, 2);
HANDLER t READ a NEXT;
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Features : Tablespace management – labs
TablespacesSQL syntax for explicit tablespace management
Replaces legacy --innodb-file-per-table usage
• CREATE TABLESPACE Logs ADD DATAFILE 'log01.ibd';• CREATE TABLE http_req(c1 varchar) TABLESPACE=Logs ;• ALTER TABLE some_table TABLESPACE=Logs;• DROP TABLESPACE Logs; -- must be empty
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Features : Tablespace management – labs
TablespacesSQL syntax for explicit tablespace management
Replaces legacy --innodb-file-per-table usage
• CREATE TABLESPACE Logs ADD DATAFILE 'log01.ibd';• CREATE TABLE http_req(c1 varchar) TABLESPACE=Logs ;• ALTER TABLE some_table TABLESPACE=Logs;• DROP TABLESPACE Logs; -- must be empty
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Features : Buffer Pool
Dynamic buffer pool size re-sizeDone in a separate thread
--innodb_buffer_pool_chunk_size – resize done in chunk size
Example:
SET GLOBAL innodb_buffer_pool_size=402653184;
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Features : UNDO
UNDO Log Space ManagementRequires separate UNDO tablespaces to work
• --innodb-undo_log_truncate=on (default off)• --innodb-max_undo_log_size – default 1G• --innodb-purge-rseg-truncate-frequency – default 128 - advanced
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Features : Larger Page Sizes - labs
32K and 64K Page SizesLonger records can be stored “in-page”
Better compression with the new transparent page compression
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Features : Memcached Multiple Get - labs
MemcachedMultiple get
Gets around Memcached client protocol bottlenecks• Shorter query string• Fetch multiple keys in range• Extension to “traditional” Memcached
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Features : GIS
Spatial indexImplemented as an R-Tree
Supports all MySQL geometric types
Currently only 2D supported
Supports transactions & MVCC
Uses predicate locking to avoid phantom reads
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Features : GIS
R-TreeMulti-dimension spatial data search
Queries more like:• Find object “within”, “intersects” or “touches” another object• MySQL geometric types
• POINT, LINESTRING, POLYGON, MULTIPOINT,• MULTILINESTRING, MULTIPOLYGON, GEOMETRY
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Features : Mutexes
Flexible mutexesMix and match mutex types in the code – build time option only
Can use futex on Linux instead of condition variables
Futex eliminates “thundering herd” problem
Not enabled by default, build with -DMUTEX_TYPE=”futex” from source
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Features : Character Sets
Adds the MySQL character set gb18030 (Chinese encoding for Unicode)
Supports the China National Standard GB 18030 character set.
The new associated collations are gb18030_bin and gb18030_chinese_ci.
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Features : Full Text Search
Support for external parser
For tokenizing the document and the query
Example:
CREATE TABLE t1 (
id INT AUTO_INCREMENT PRIMARY KEY,
doc CHAR(255), FULLTEXT INDEX (doc) WITH PARSER my_parser) ENGINE=InnoDB;
ALTER TABLE articles ADD FULLTEXT INDEX (body) WITH PARSER my_parser;
CREATE FULLTEXT INDEX ft_index ON articles(body) WITH PARSER my_parser;
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Features : Sandisk/FusionIO Atomic Writes
No new configuration variables – may change in GASystem wide settingDisables the doublewrite buffer if the system tablespace is on NVMFSMore to come
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Features : Transparent PageIO Compression
Proof of concept patch from FusionIO – currently Linux onlyRequires sparse file support : NVMFS, XFS, EXT4, ZFS & NTFSLinux 2.6.39+ added PUNCH HOLE supportCan co-exist with current ROW_FORMAT=COMPRESSED tablesWorks on all table types, including the system tablespace
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Features : Transparent PageIO Compression
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Zip compression Tested and tried, works well enough
Complicates buffer pool code
Special page format required
No IO layer changes
Algorithm supported - Zlib
Can't compress system tablespace
Can't compress UNDO tablespace
Features : Zip vs Page IO compression
PageIO compression Requires OS/FS support
Simple
Works with all file types, system tablespaces
Potential fragmentation issues
NVMFS doesn't suffer from fragmentation
Adds to the cost of IO
Current algorithms are tuned to existing assumptions
Requires multi-threaded flushing
Easy to add new algorithms – Zlib, LZ4 etc.
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Features : PageIO Compression Benchmark
FusionIO – 25G BP – maxid 50 Million 64 Requesters - Linkbench
Normal PageIO compression Current compression0
10000
20000
30000
40000
50000
60000
70000
Size & Operations per/sec
Ops/sec
Size
Siz
e a
nd
op
era
tion
s p
er
se
c
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Features : Transparent PageIO Compression
Configuration options--innodb-compression-algorithm := 0,1,2
Where: 0 – None, 1 – Zlib and 2 – LZ4
--innodb-compression-level
--innodb-compression-punch-hole := boolean
--innodb-read-async := boolean
--innodb-read-block-size := boolean
Page IO Compression
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Features : Miscellaneous
Implement update_time for InnoDB tablesImprove select count(*) performance by using handler::records();Improve recovery, redo log tablespace meta data changes
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Download & Blogs
http://labs.mysql.com
http://dev.mysql.com/downloads/mysql/
http://mysqlserverteam.com/
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Thank You!
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.