20141206 4 q14_dataconference_i_am_your_db
-
Upload
hyeongchae-lee -
Category
Software
-
view
2.080 -
download
0
Transcript of 20141206 4 q14_dataconference_i_am_your_db
![Page 1: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/1.jpg)
I’m your DB( I need a database that scales )
FB/hyeongchae.lee
4Q14 DataConference.IO 1
![Page 2: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/2.jpg)
4Q14 DataConference.IO 2
I’m your DB!May the oracle be with you
![Page 3: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/3.jpg)
Agenda
• About me
• DBMS vs NoSQL
• Local vs Global
• So... which databases scale?
• Amazon Aurora
4Q14 DataConference.IO 3
![Page 4: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/4.jpg)
ABOUT ME
----------------------------
4Q14 DataConference.IO 4
![Page 5: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/5.jpg)
4Q14 DataConference.IO 5
INERVITMobileLite
nhnCUBRID
TELCOWARE
Telcobase
ALTIBASEAltibase
TIBEROTibero
![Page 6: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/6.jpg)
4Q14 DataConference.IO 6
![Page 7: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/7.jpg)
Global Open Frontier Full-time
• Project : MySQL Redis Plug-in ( +MariaDB, +MaxScale )
– https://github.com/sql2/MySQL_Redis_Plugin_Dev
4Q14 DataConference.IO 7
![Page 8: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/8.jpg)
MySQL Memcached Plug-in
4Q14 DataConference.IO 8
Mysqld
MySQL Server
Handler API
Memcached plugin
innodb_memcache
local cache(optional)
InnoDB API
InnoDB Storage Engine
SQL Memcached protocol
Application
![Page 9: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/9.jpg)
MySQL Redis Plug-in
4Q14 DataConference.IO 9
Mysqld
MySQL Server
Handler API
Redis plugin
innodb_redislocal cache(optional)
InnoDB API
InnoDB Storage Engine
SQL Redis protocol
Application
![Page 10: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/10.jpg)
2015 : MaxScale Redis Cluster Plug-in
4Q14 DataConference.IO 10
URL : https://mariadb.com/blog/maxscale-proxy-mysql-replication-relay
![Page 11: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/11.jpg)
DBMS VS NoSQL
4Q14 DataConference.IO 11
![Page 12: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/12.jpg)
Rank Last Month DBMS Database Model Score Changes
1 1 Oracle Relational DBMS 1452.13 -19.77
2 2 MySQL Relational DBMS 1279.08 +16.11
3 3 Microsoft SQL Server Relational DBMS 1220.20 +0.59
4 4 PostgreSQL Relational DBMS 257.36 -0.36
5 5 MongoDB Document store 244.73 +4.33
6 6 DB2 Relational DBMS 206.23 -1.44
7 7 Microsoft Access Relational DBMS 138.84 -2.80
8 8 SQLite Relational DBMS 95.28 +0.33
9 10 Cassandra Wide column store 91.99 +6.29
10 9 Sybase ASE Relational DBMS 84.62 -2.17
DB-Engines Ranking
4Q14 DataConference.IO 12
2014.11.24
http://db-engines.com/en/ranking
![Page 13: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/13.jpg)
4Q14 DataConference.IO 13
http://db-engines.com/en/ranking_categories
![Page 14: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/14.jpg)
Winner !!
4Q14 DataConference.IO 14
![Page 15: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/15.jpg)
Magic Quadrant for Operational Database Management Systems
4Q14 DataConference.IO 15
1 Oracle's Letter to the EU Concerning MySQL
After an antitrust investigation, the European Commission approved Oracle's acquisition of Sun Microsystems, including MySQL, on 21 January 2010.
Wikileaks subsequently published cables indicating that the Obama administration applied pressure to the EU to approve the deal.
Concerns about the MySQL acquisition had been addressed in Oracle's 14 December 2009 pledges to customers, which were to extend for five years — thus expiring in early 2015.
Oracle's pledges included commitments to maintain certain APIs, extensions of licenses to then-current licensees, continued use of GPL licensing, and others. The expiration of these commitments may change the nature of Oracle's relationships with a number of hardware and software vendors, as well as its posture regarding product investment, support for purchasing requirements, and other aspects of MySQL's business model.
![Page 16: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/16.jpg)
LOCAL VS GLOBAL
4Q14 DataConference.IO 16
![Page 17: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/17.jpg)
Korean vs Japan
50M vs 127M
4Q14 DataConference.IO 17
![Page 18: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/18.jpg)
Korea vs Japan
4Q14 DataConference.IO 18
Slave Slave
Master
Slave
Slave Slave
Master
Slave
x3
![Page 19: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/19.jpg)
KakaoTalk vs LINE
4Q14 DataConference.IO 19
![Page 20: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/20.jpg)
KakaoTalk vs LINE
4Q14 DataConference.IO 20
![Page 21: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/21.jpg)
We Love FusionIO !!
4Q14 DataConference.IO 21
• facebook/flashcache
![Page 22: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/22.jpg)
Dolphinics’ Dolphin Interconnect Solutions
4Q14 DataConference.IO 22
![Page 23: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/23.jpg)
MEMSCALE
4Q14 DataConference.IO 23
![Page 24: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/24.jpg)
SO... WHICH DATABASES SCALE?
4Q14 DataConference.IO 24
![Page 25: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/25.jpg)
Read Caching
• Pros : Read-caching can take over a lot of read operations. If reads make up most of your workload, this will obviously help a lot. Even if you have a heavy write workload, read-caching might be enough to keep you from having to scale-out to handle writes.
• Cons : Read-caching, by nature, involves a memory store. If your data-access patterns are really random, or involve a large percentage of records, you might wind up with a pretty expensive memory foot print. Figuring out the right cache-invalidation for your app can also be really tricky. Many memory stores are pretty basic in terms of functionality — lack of support for transactions & joins can mean that you’ll need multiple process or network round-trips between the app & the cache.
4Q14 DataConference.IO 25
http://spiegela.com/2014/04/28/but-i-need-a-database-that-scales-part-1
![Page 26: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/26.jpg)
Write Coalescing
• Pros : In short: you can achieve better throughput of incoming writes. With many caching systems, you can also query the data in the cache creating a set of real-time use cases including: event-processing, triggers & real-time analytics.
• Cons : Coalescing writes will inherently mean that your persistence layer is behind your ingestion layer. To take advantage of this technique, you’ll need to consider a lot of questions:– Which data to query: cached, persisted, both?
– Does this data need to be made durable (survives a reboot)? How quickly?
– Are there consistency concerns? Unique indices? Atomic transaction?
4Q14 DataConference.IO 26
![Page 27: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/27.jpg)
Connection Scaling
• Pros : Connection scaling increases the number of concurrent connections (obviously, I think?) It’s biggest benefit, though, is in reliability, since any cluster node can fail and clients can simply reconnect.
• Cons : Connection Scaling requires shared storage. RAC, for example, typically uses OCFS, a clustered file-system, and SAN storage. The ability to handle more I/O transactions is dependent on scaling up that shared storage tier, which can be very expensive. Connection Scaling also doesn’t help much with capacity or analysis scaling since the data is shared, not spread out across nodes.
4Q14 DataConference.IO 27
![Page 28: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/28.jpg)
Master-Slave Replication
• Pros : While there’s some setup involved, it’s pretty seamless to your application. There’s still only a single node that has control over the data, so there are no new concerns around consistency. For read-constrained applications, nodes can be added quickly and the architecture remains relatively simple.
• Cons : MSR solves one problem: reader transactions. If you need to scale other aspects, you’re not doing it here. If you need more write throughput, MSR offloads the read transactions from the master, but writes are still limited to a single node. Also, slaves can lag in their updates from the master, if you need absolute consistency between the two, you’ll need to investigate options for synchronous replication which can impact performance of the master node.
4Q14 DataConference.IO 28
![Page 29: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/29.jpg)
Vertical Partitioning ( aka cluster )
• Pros : Having smaller databases makes indices perform better, and allows you to improve just about any aspect of scaling.
• Cons : If your model requires relationships between most or all of your tables for the basic operations, vertically partitioning may not be a fit. Even when you model fits well into partitions today, having these divisions can impact flexibility of performing joins across models in the future.
4Q14 DataConference.IO 29
![Page 30: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/30.jpg)
Horizontal Partitioning ( aka shard )
• Pros : This type of partitioning provides scaling for all of the elements of scale, allowing for very large data-sets and very good performance.
• Cons : Sharding can have a lot of drawbacks depending on the implementation. For one thing, the client must be aware of the partition key. When implementing sharding in MySQL, for example, an application will typically infer the partition key, and address the desired partition. Increasing the number of nodes, or changing the key requires an update to the app each time. Other trade-offs like database features are up for grabs too:
– Joins: if my data for two collections is distributed across multiple nodes, when I fetch the data back, I may need to join data across more than one — which is likely to be slower
– Transactions: if I have a transaction that involves two nodes of the cluster, how to I execute them atomic-ly? Do I lock multiple nodes? All of them?
– Bulk commits: If I update records in bulk across multiple nodes, this is really two transactions executed separately.
4Q14 DataConference.IO 30
![Page 31: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/31.jpg)
So... which databases scale?
• Scale Out Reads• Capacity• Scale Out Analysis• Scale Out Writes• Bulk Commits• Joins• Transactions• Durability• Consistency
4Q14 DataConference.IO 31
![Page 32: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/32.jpg)
4Q14 DataConference.IO 32
![Page 33: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/33.jpg)
Scaling Storytime
• http://en.wikipedia.org/wiki/Brad_Fitzpatrick
4Q14 DataConference.IO 33
![Page 34: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/34.jpg)
One Server
4Q14 DataConference.IO 34
MySQL
Apache
Internet
• Simple:
![Page 35: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/35.jpg)
Two Server
4Q14 DataConference.IO 35
MySQL
Apache
Internet
• Two SPOF
![Page 36: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/36.jpg)
• Replication !
Five Server
4Q14 DataConference.IO 36
Master
Apache
InternetApache
Apache
Slaveread
write
replication
![Page 37: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/37.jpg)
More Server
• Chaos !
4Q14 DataConference.IO 37
Master
Apache
Internet Apache
Slave
Apache
Apache
Apache
Apache Slave
Slave SlaveSlave
Slave
![Page 38: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/38.jpg)
Cluster vs Shard
Multi-Master
Cluster
Shard
Cluster + Shard4Q14 DataConference.IO 38
![Page 39: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/39.jpg)
MySQL Recruit
• Big Table ( X )
Small Table ( O )
• Performance ( X )
Scale-up ( O ) Distributed ( O )
• Query Tuning
hard ...
• Clustering & Sharding
mission ...
4Q14 DataConference.IO 39
![Page 40: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/40.jpg)
AMAZON AURORA
4Q14 DataConference.IO 40
![Page 41: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/41.jpg)
http://www.theregister.co.uk/2014/11/26/inside_aurora_how_disruptive_is_amazons_mysql_clone/
4Q14 DataConference.IO 41
![Page 42: 20141206 4 q14_dataconference_i_am_your_db](https://reader033.fdocuments.in/reader033/viewer/2022060206/55a209951a28ab96368b4595/html5/thumbnails/42.jpg)
OSSCON 4Q14 42