Proprietary & Confidential
• What is Sharding
• Benefits of Sharding
• Alternatives to Sharding
• When to start Sharding
Agenda
• Wikipedia:
  – Horizontal partitioning is a database design principle whereby rows of a database table are held separately, rather than splitting by columns (which is what normalization and vertical partitioning do, to differing extents). Each partition forms part of a shard, which may in turn be located on a separate database server or physical location.
What Is Sharding
Example

Before sharding:
ID   First name  Last name
100  Steven      King
101  Neena       Kochhar
102  Lex         De Haan
103  Alexander   Hunold
104  Bruce       Ernst
105  David       Austin
106  Valli       Pataballa

After sharding (three shards):
ID   First name  Last name
102  Lex         De Haan
105  David       Austin

ID   First name  Last name
100  Steven      King
103  Alexander   Hunold
106  Valli       Pataballa

ID   First name  Last name
101  Neena       Kochhar
104  Bruce       Ernst
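
The slide does not say how rows were assigned to shards, but the split shown is consistent with taking the ID modulo the number of shards. The sketch below reproduces it under that assumption; the names `shard_for`, `split_into_shards`, and `NUM_SHARDS` are hypothetical and only for illustration.

```python
from collections import defaultdict

NUM_SHARDS = 3  # assumption: the example shows three shards

# Rows from the example table: (ID, first name, last name)
EMPLOYEES = [
    (100, "Steven", "King"),
    (101, "Neena", "Kochhar"),
    (102, "Lex", "De Haan"),
    (103, "Alexander", "Hunold"),
    (104, "Bruce", "Ernst"),
    (105, "David", "Austin"),
    (106, "Valli", "Pataballa"),
]


def shard_for(row_id, num_shards=NUM_SHARDS):
    """Return the shard index for a row (assumed rule: ID modulo shard count)."""
    return row_id % num_shards


def split_into_shards(rows, num_shards=NUM_SHARDS):
    """Group rows by shard index, keeping the original row order within a shard."""
    shards = defaultdict(list)
    for row in rows:
        shards[shard_for(row[0], num_shards)].append(row)
    return dict(shards)


SHARDS = split_into_shards(EMPLOYEES)
for index in sorted(SHARDS):
    print(index, [row[0] for row in SHARDS[index]])
```

With this rule, IDs 102 and 105 land in one shard, 100, 103, and 106 in another, and 101 and 104 in the third, matching the three small tables in the example. Other rules (ranges, hashing a different key) would split the rows differently.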
• Many large, high-traffic web sites
• Facebook, Twitter, and Flickr, to name a few
Who Uses Sharding
• Sharding lets you:
  – Scale out the database
    • Increase the number of concurrent transactions
  – Improve performance
    • Decrease latency
  – Make the database elastic
Benefits Of Sharding
• Size
  – Table size is reduced
  – Index size is reduced
  – More data fits in memory, so less disk access
• Hits
  – Isolation is a pain
  – Fewer hits per database, so less isolation overhead
Performance Improvement
• The database needs to maintain copies of the data per user to enforce transaction boundaries
  – More users: more copies
  – Longer transactions: more copies
• Indexes are stored on the actual data
  – Copies are problematic
• For a complete explanation, see http://www.scalebase.com/isolation-levels-in-relational-databases/
• Sharding helps reduce the number of transactions per database shard
Database Isolation
• Tuning
• Scale Up
• Read/Write Splitting
• NoSQL
Alternatives to Sharding
• There are many ways to tune your database
• A lot of information is available online; check out this post:
  – http://forge.mysql.com/wiki/Top10SQLPerformanceTips
Database Tuning
• innodb_buffer_pool_size
  – Holds the data and indexes of tables in memory
  – A bigger buffer results in faster row lookups
  – The bigger the better (within the server's available memory)
  – Default: 8M
• Query cache
  – Keeps the results of queries in memory until they are invalidated by writes
  – query_cache_size
    • Total amount of memory available for query caching
  – query_cache_limit
    • Maximum size of a single query result that may be cached
  – Example: query_cache_size = 128MB, query_cache_limit = 4MB
Database Tuning – Some Examples
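
To make the roles of query_cache_size and query_cache_limit concrete, here is a toy model of a result cache that is invalidated by writes. It is a sketch only and not how MySQL implements its query cache: the class `ToyQueryCache`, its method names, and the simplification that each query reads from exactly one table are all assumptions made for illustration.

```python
class ToyQueryCache:
    """Toy model of a query result cache invalidated by writes (not MySQL's code)."""

    def __init__(self, cache_size, cache_limit):
        self.cache_size = cache_size    # total bytes available (role of query_cache_size)
        self.cache_limit = cache_limit  # largest cacheable result (role of query_cache_limit)
        self.used = 0
        self.entries = {}               # query text -> (table name, result bytes)

    def get(self, query):
        """Return the cached result for a query, or None on a miss."""
        entry = self.entries.get(query)
        return entry[1] if entry else None

    def put(self, query, table, result):
        """Cache a result; return False if it is too large or the cache is full."""
        size = len(result)
        if size > self.cache_limit:
            return False  # a single result may not exceed the per-query limit
        if query in self.entries:
            self._remove(query)
        if self.used + size > self.cache_size:
            return False  # simplification: no eviction of older entries
        self.entries[query] = (table, result)
        self.used += size
        return True

    def write(self, table):
        """A write to a table invalidates every cached result that read from it."""
        stale = [q for q, (t, _) in self.entries.items() if t == table]
        for query in stale:
            self._remove(query)

    def _remove(self, query):
        _, result = self.entries.pop(query)
        self.used -= len(result)
```

In this model, a table that is written to often keeps emptying its cached results, which is why a query cache tends to help read-heavy workloads most.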
• The database usually gets the strongest servers
• However, there is a limit to how much performance you can gain by adding hardware
• Some data: http://www.mysqlperformanceblog.com/2011/01/26/modeling-innodb-scalability-on-multi-core-servers/
Scaling Up Hardware
• Solid-state drive (SSD)
  – Better latency and access time than a regular HDD
  – Costs more per GB (but prices are dropping)
• Vadim Tkachenko from Percona gave a great lecture on SSDs at MySQL Conf 2011
  – See the slides at http://en.oreilly.com/mysql2011/public/schedule/detail/17117
  – Claims you can expect up to 7x the performance from SSD
SSD
• Write to the MySQL master, read from one (or more) slaves
• Excellent read scaling
• Many issues:
  – Since replication is asynchronous, a read might not be up to date
  – Transactions create stickiness (statements in a transaction must stay on one server)
  – Requires code changes
Read/Write Splitting
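
The routing rule behind read/write splitting, including the stickiness issue, can be sketched as follows. This is an illustrative model under stated assumptions, not a real driver or proxy: the class `ReadWriteRouter`, the server names, the round-robin choice of slave, and the keyword-based statement detection are all hypothetical, and statements are assumed to be non-empty.

```python
import itertools


class ReadWriteRouter:
    """Pick a server name for each SQL statement (illustrative only)."""

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves)  # simple round-robin over the slaves
        self.in_transaction = False

    def route(self, sql):
        """Return the name of the server that should run this statement."""
        verb = sql.lstrip().split(None, 1)[0].upper()
        if verb in ("BEGIN", "START"):
            self.in_transaction = True
            return self.master
        if verb in ("COMMIT", "ROLLBACK"):
            self.in_transaction = False
            return self.master
        # Stickiness: everything inside a transaction stays on the master,
        # and so does every statement that is not a plain SELECT.
        if self.in_transaction or verb != "SELECT":
            return self.master
        return next(self._slaves)
```

Note that this sketch does nothing about stale reads: a SELECT sent to a slave right after a write may still see old data, because replication is asynchronous.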
NoSQL
• A term used to designate databases which differ from classic relational databases in some way. These data stores may not require fixed table schemas, and usually avoid join operations and typically scale horizontally. Academics and papers typically refer to these databases as structured storage, a term which would include classic relational databases as a subset.
http://en.wikipedia.org/wiki/NoSQL
NoSQL Types
• Key/value
  – A big hash table
  – Examples: Voldemort, Amazon Dynamo
• Big Table
  – Big table, column families
  – Examples: HBase, Cassandra
• Document based
  – Collections of collections
  – Examples: CouchDB, MongoDB
• Graph databases
  – Based on graph theory
  – Examples: Neo4j
• Each solves a different problem
• Database size (including indexes) > available memory
  – When databases go to disk, bad things happen
• Too many hits per second
• High write/read ratio
When To Start Sharding
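
The three warning signs above can be written as a simple check. The slide gives no thresholds, so the sketch below takes them as caller-supplied parameters; the function `reasons_to_shard` and all of its parameter names are hypothetical, and suitable limits depend on the workload and hardware.

```python
def reasons_to_shard(db_size_bytes, index_size_bytes, memory_bytes,
                     hits_per_second, max_hits_per_second,
                     writes_per_second, reads_per_second,
                     max_write_read_ratio):
    """Return the warning signs from the slide that currently apply.

    Every threshold is supplied by the caller; none comes from the slides.
    """
    reasons = []
    if db_size_bytes + index_size_bytes > memory_bytes:
        reasons.append("database size (including indexes) exceeds available memory")
    if hits_per_second > max_hits_per_second:
        reasons.append("too many hits per second")
    # Compare by multiplication so that zero reads does not divide by zero.
    if writes_per_second > max_write_read_ratio * reads_per_second:
        reasons.append("high write/read ratio")
    return reasons
```

An empty list means none of the three signs applies under the thresholds given.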
• Start small, end big
• TCO
  – Management
  – Backup
  – Time to market
Downsides of Sharding
• Sharding – no hassle
• No hidden costs
  – New features
  – Easy administration
Benefits Of ScaleBase