Idi2017 - Cloud DB: strengths and weaknesses
Core Team: Riccardo Capecchi - Marco Careddu - Piermarco Zerbini
Mar 2017 Devops Day 2017
Cloud DB - strengths and weaknesses
Shopfully - Who We Are
Founded in 2010, ShopFully is the leading platform used by over 25 million users worldwide when getting ready to go shopping in their neighborhood.
The platform contains a variety of information including details on promotions, new products, shops, opening times and contacts of the main retailers and brands in each shopping category, geolocated in one place and easily accessible to users.
Shopfully - What We Do
ShopFully is the last mile media, the first source of geolocalized information on promotions, new products, shops, opening times and contacts of the main retailers and brands in all shopping categories.
The services offered by ShopFully can be accessed both online at www.shopfully.com (or country-specific URLs) and through the free app developed for all major mobile platforms: iOS, Android, Windows 8, Amazon and BlackBerry.
Shopfully - We’re going to talk about
1. Some details on our old database infrastructure.
2. How we chose our cloud DB.
3. The design of our application, mainly focused on the database.
4. How we moved to a cloud DB.
5. The new challenges and benefits of a cloud DB.
6. Conclusions: should you consider moving to a cloud DB?
WHY we move on DB As Service
Before Cloud Database
● 9 dedicated servers
● Galera multi-master cluster managed by Severalnines’ ClusterControl framework
● Shared database infrastructure
WHY we move on DB As Service
Before Cloud Database
Problems:
● High load on all nodes during traffic spikes
● Very high load on the surviving nodes while recovering a broken node
Causes:
● Cluster capacity was close to its limit
Possible solutions:
● Horizontal scaling (adding nodes): unsafe because of the already high number of cluster nodes
● Vertical scaling by replacing all the dedicated servers: losing two nodes at the same time was too risky
WHERE do we move on?
Let’s go on DB As a Service, but… what do we want?
Goals
● Zero downtime
● The latest possible MySQL engine version
● Less effort for database management
WHERE do we move on?
Let’s go on DB As a Service, but… what do we want?
Preferred providers
● Google Cloud
● AWS
WHERE do we move on?
Google Cloud SQL vs Amazon Web Services RDS
Both of them could import a live database, but…
WHERE do we move on?
Google Cloud SQL: First Generation vs Second Generation
Second Generation features:
● Up to 7x the throughput and 20x the storage capacity of First Generation instances
● Less expensive than First Generation for most use cases
● Option to add high-availability failover and read replication
● MySQL 5.7
COOL! But… Second Generation does not support an external replication master.
WHERE do we move on?
AWS RDS: MySQL vs Aurora
Let’s move to Amazon RDS. But which database service?
Amazon Aurora (Aurora) is a fully managed, MySQL-compatible, relational database engine that combines the speed and reliability of high-end commercial databases with the simplicity and cost-effectiveness of open-source databases. It delivers up to five times the performance of MySQL without requiring changes to most of your existing applications.
Amazon Aurora makes it simple and cost-effective to set up, operate, and scale your new and existing MySQL deployments, thus freeing you to focus on your business and applications. We love Aurora! But...
HOW MUCH does it cost?
OVH vs AWS
The database infrastructure costs were due to:
● visible:
○ 9 dedicated servers
● invisible:
○ 3 virtual machines (load balancers)
○ 1 virtual machine (cluster monitor)
○ backup storage
○ sysadmin working time
HOW MUCH does it cost?
OVH vs AWS
A first look at AWS pricing:
● visible: 10x our infrastructure budget for databases
HOW MUCH does it cost?
Query Analysis
Data collected from our MySQL Galera Cluster showed:
● the “load” of each database
● that, for each database, around half of the total queries would be routed to the writer endpoint of RDS Aurora, and the rest to the reader endpoint
We mapped each database to an AWS DB tier; this reduced the total cost from 10x to around 4x.
[Figure: DB1, DB2, DB3, DB4, DB5, DB6]
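The idea of mapping each database to its own tier, instead of sizing everything for the busiest one, can be sketched as follows. This is only an illustration: the tier names, capacities, relative costs and per-database loads are all made-up assumptions, not the figures from our analysis.

```python
from dataclasses import dataclass
from typing import Mapping, Sequence


@dataclass(frozen=True)
class Tier:
    name: str
    capacity: float  # hypothetical load units the tier can absorb
    cost: float      # hypothetical relative monthly cost


# Hypothetical tiers; real AWS instance classes and prices are not modelled.
TIERS = (
    Tier("small", capacity=10, cost=1),
    Tier("medium", capacity=40, cost=4),
    Tier("large", capacity=100, cost=10),
)


def pick_tier(load: float, tiers: Sequence[Tier] = TIERS) -> Tier:
    """Return the cheapest tier whose capacity covers the given load."""
    for tier in sorted(tiers, key=lambda t: t.cost):
        if tier.capacity >= load:
            return tier
    raise ValueError(f"no tier can absorb load {load}")


def total_cost(loads: Mapping[str, float], tiers: Sequence[Tier] = TIERS) -> float:
    """Sum the cost of the cheapest fitting tier for every database."""
    return sum(pick_tier(load, tiers).cost for load in loads.values())


# Made-up loads for six databases.
LOADS = {"DB1": 80, "DB2": 35, "DB3": 8, "DB4": 5, "DB5": 30, "DB6": 3}

# Cost if every database ran on the largest tier, versus per-database tiers.
uniform_cost = len(LOADS) * max(t.cost for t in TIERS)
mapped_cost = total_cost(LOADS)
```

With these invented numbers the per-database mapping is much cheaper than the uniform sizing, which is the same kind of effect we saw when going from 10x to around 4x.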
HOW MUCH does it cost?
OVH vs AWS - Round 2
Considering the following:
● Our Galera cluster was near its limits, and we would have had to pay more for maintenance and new hardware.
● The Amazon solution offers:
○ a fully managed solution
○ data replication across Availability Zones
○ an easy way to add or remove read replicas
○ CloudWatch
○ possible automation
○ cost management by tag
Prerequisites for the cloud: Application design + challenges
Is your application ready for the Cloud?
● Follow some simple rules to simplify the configuration of your app
○ Twelve-Factor App is your friend (the 3rd factor in particular)
● Our application became almost ‘twelve-factored’ in previous iterations, in anticipation of a possible cloud migration
○ the extremely centralized configuration helped a lot during the migration
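The third factor (configuration stored in the environment) can be sketched as below. Our application is written in PHP, so this Python snippet is only an illustration, and the variable names (`DB_WRITER_HOST`, `DB_READER_HOST`, `DB_PORT`, `DB_NAME`) are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class DbConfig:
    writer_host: str
    reader_host: str
    port: int
    name: str


def load_db_config(env: Mapping[str, str] = os.environ) -> DbConfig:
    """Build the DB configuration from environment variables only."""
    writer = env["DB_WRITER_HOST"]  # required: fail fast if missing
    return DbConfig(
        writer_host=writer,
        # Fall back to the writer when no separate reader endpoint is set.
        reader_host=env.get("DB_READER_HOST", writer),
        port=int(env.get("DB_PORT", "3306")),
        name=env["DB_NAME"],
    )
```

Because nothing is hard-coded, pointing the application at a new database endpoint only means changing the environment, not the code.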
Prerequisites for the cloud: Application design + challenges
Is your application ready for a cloud DB?
● Keep a simple design
○ No DB triggers or stored procedures
■ In our case we were able to replace the former with async application jobs and to avoid the latter altogether
○ Rare use of MySQL-specific features
● The day you want to change DB vendor or upgrade to a new major release, you will thank yourself
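Replacing a trigger with an async application job could look roughly like this. It is a minimal sketch using an in-process queue; the job name, the `save_flyer` function and the counter are hypothetical, and a real setup would more likely use a persistent job queue.

```python
import queue
import threading
from typing import Dict


def save_flyer(jobs: queue.Queue, store_id: int) -> None:
    """Write path: do the main write, then enqueue the follow-up work."""
    # The main INSERT would happen here (omitted in this sketch).
    jobs.put(("increment_flyer_count", store_id))


def worker(jobs: queue.Queue, counts: Dict[int, int]) -> None:
    """Consume jobs and do the work a DB trigger would otherwise do."""
    while True:
        job = jobs.get()
        try:
            if job is None:  # sentinel: stop the worker
                return
            name, store_id = job
            if name == "increment_flyer_count":
                counts[store_id] = counts.get(store_id, 0) + 1
        finally:
            jobs.task_done()


def run_demo() -> Dict[int, int]:
    jobs: queue.Queue = queue.Queue()
    counts: Dict[int, int] = {}
    thread = threading.Thread(target=worker, args=(jobs, counts))
    thread.start()
    for store_id in (1, 1, 2):
        save_flyer(jobs, store_id)
    jobs.put(None)
    thread.join()
    return counts
```

The derived data is updated by application code, so the database itself stays free of vendor-specific logic.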
Prerequisites for the cloud: Application design + challenges
What we had
● Multi-master (Galera)
○ DB read/write split at the application level using the CakePHP ORM
■ a simple 'sticky' master after a write, to mitigate the inherent deadlocks of the multi-master model
What we needed
● Master-slave (Aurora)
○ improve the buggy DB read/write split
■ moving to master-slave we discovered that the split was imperfect, ‘leaking’ write queries to slaves
■ the bug had been hidden by the previous multi-master architecture
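A read/write split with a 'sticky' writer can be sketched as follows. This is not our CakePHP code: the verb list, the `Router` class and the 2-second sticky window are assumptions chosen for illustration. The sketch sends everything that is not clearly read-only to the writer, which is one way to avoid leaking writes to the replicas.

```python
import time
from typing import Callable, Optional

# Statements treated as read-only; anything else goes to the writer.
READ_VERBS = {"SELECT", "SHOW", "DESCRIBE", "EXPLAIN"}


def is_read_only(sql: str) -> bool:
    stripped = sql.strip()
    if not stripped:
        return False
    verb = stripped.split(None, 1)[0].upper()
    if verb not in READ_VERBS:
        return False
    upper = stripped.upper()
    # Locking reads must run on the writer.
    return "FOR UPDATE" not in upper and "LOCK IN SHARE MODE" not in upper


class Router:
    """Pick 'writer' or 'reader', sticking to the writer after a write."""

    def __init__(self, sticky_seconds: float = 2.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._sticky = sticky_seconds
        self._clock = clock
        self._last_write: Optional[float] = None

    def route(self, sql: str) -> str:
        now = self._clock()
        if not is_read_only(sql):
            self._last_write = now
            return "writer"
        if self._last_write is not None and now - self._last_write < self._sticky:
            return "writer"  # read-your-own-writes window
        return "reader"
```

In a master-slave setup the sticky window also hides replica lag from a user who has just written something.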
Prerequisites for the cloud: Application design + challenges
Scale for the cloud
● Using properly sized clusters pushed our application to its limits
○ Lessons learned
■ OLD (but gold): don’t forget to periodically check the usage (or lack of usage) of your DB indexes
■ Use any kind of shielding you can
● CDNs, application caches, etc.
■ Async, async everywhere
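As an example of application-level shielding, a small time-based cache in front of a query could look like this. It is a generic sketch, not the cache we use; the `TTLCache` class and its parameters are hypothetical.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Keep loaded values for a fixed number of seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_load(self, key: Hashable, loader: Callable[[], Any]) -> Any:
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh: the DB is not touched
        value = loader()   # expired or missing: hit the DB once
        self._store[key] = (now + self._ttl, value)
        return value
```

Even a short TTL can absorb most of the repeated reads during a traffic spike, at the price of slightly stale data.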
How do we move on?
From Galera Cluster to Aurora
GOAL: migrate one database at a time
Problem: the binlog-do-db option is not supported by OUR Galera Cluster.
How do we move on?
From Galera Cluster to Aurora
Galera Cluster: replication master for all databases.
MySQL: 1 slave replica for all databases.
How do we move on?
From Galera Cluster to Aurora
MySQL as “Washing Machine”: activates the binlog for a single DB
How do we move on?
From Galera Cluster to Aurora
MySQL as “Washing Machine”: works as the external replication master for Aurora
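The two halves of the “Washing Machine” setup can be sketched as below. The slides do not show our actual configuration, so this is an assumption-laden illustration: the option names (`log-bin`, `log-slave-updates`, `binlog-do-db`) are standard MySQL options and `mysql.rds_set_external_master` / `mysql.rds_start_replication` are the RDS stored procedures for external replication, but every value (hosts, user, binlog file and position) is a placeholder.

```python
from typing import List


def washing_machine_cnf(database: str, server_id: int) -> str:
    """my.cnf fragment for the intermediate MySQL replica.

    It replicates everything from Galera, but writes only one
    database to its own binary log.
    """
    lines = [
        "[mysqld]",
        f"server-id = {server_id}",    # must be unique in the topology
        "log-bin = mysql-bin",         # enable the binary log
        "log-slave-updates = 1",       # also log replicated events
        f"binlog-do-db = {database}",  # keep only this database
    ]
    return "\n".join(lines) + "\n"


def aurora_replication_sql(host: str, port: int, user: str, password: str,
                           log_file: str, log_pos: int) -> List[str]:
    """Statements to run on Aurora to replicate from the intermediate MySQL.

    Values are interpolated without escaping: fine for a sketch,
    not for untrusted input.
    """
    return [
        "CALL mysql.rds_set_external_master("
        f"'{host}', {port}, '{user}', '{password}', "
        f"'{log_file}', {log_pos}, 0);",
        "CALL mysql.rds_start_replication;",
    ]
```

Note that with statement-based logging `binlog-do-db` filters on the default database of the session rather than on the database of the modified table, so the binlog format matters for this kind of setup.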
How do we move on?
From Galera Cluster to Aurora
[Diagram: Load Balancer, Writer endpoint, Reader endpoint, Webservers]
How do we move on?
From Galera Cluster to Aurora
[Diagram: Writer endpoint, Reader endpoint, Webservers, Load Balancer]
New Challenges and Benefits
Great, our DB is now on the cloud, but how does this change our lives?
Performance, price and availability are now more interconnected than ever: we want responsive, fast services that make the best use of the DB instances in order to reduce AWS costs.
New Challenges and Benefits - Autoscale the DB
AWS services are famous for their auto-scaling capability… but not on RDS. Still, “something” that turns instances on and off would be really useful to us, because most of our traffic is predictable.
As a first attempt, we used the AWS CLI with some simple scheduled tasks on Jenkins to programmatically turn the DB instances for the different countries on and off at their wake-up and sleep times.
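The scheduling logic behind such tasks can be sketched as follows. Our real implementation was AWS CLI calls scheduled on Jenkins; this Python version only shows the decision, and the `CountrySchedule` type, the hours and the replica counts are hypothetical.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class CountrySchedule:
    wake: time            # local time at which traffic starts
    sleep: time           # local time at which traffic stops
    day_replicas: int     # replicas wanted while the country is awake
    night_replicas: int   # replicas wanted while it sleeps


def desired_replicas(schedule: CountrySchedule, local_now: time) -> int:
    """Return how many replicas should be running at the given local time."""
    if schedule.wake <= schedule.sleep:
        awake = schedule.wake <= local_now < schedule.sleep
    else:
        # The awake window wraps past midnight.
        awake = local_now >= schedule.wake or local_now < schedule.sleep
    return schedule.day_replicas if awake else schedule.night_replicas
```

A scheduled job would compare this number with the instances actually running for that country and add or remove replicas accordingly.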
This was a good change, but sometimes the load was higher or lower than expected, so we wrote some small bash utilities that periodically (every 2 minutes) check the CPU usage of our replica instances; if the average is over or under a threshold, they scale that cluster up or down.
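The threshold check at the heart of those utilities can be sketched like this. Our scripts are written in bash; the Python function below is only an illustration, and the thresholds and replica limits are made-up defaults. Fetching the CPU metric and actually adding or removing an instance are left out.

```python
from statistics import mean
from typing import Sequence


def scaling_action(cpu_samples: Sequence[float], replicas: int,
                   high: float = 70.0, low: float = 25.0,
                   min_replicas: int = 1, max_replicas: int = 5) -> int:
    """Return +1 to add a replica, -1 to remove one, 0 to do nothing."""
    if not cpu_samples:
        return 0  # no data: do not touch the cluster
    average = mean(cpu_samples)
    if average > high and replicas < max_replicas:
        return 1
    if average < low and replicas > min_replicas:
        return -1
    return 0
```

Keeping the two thresholds well apart avoids adding and removing instances in a loop when the load hovers around a single value.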
New Challenges and Benefits - Autoscale the DB
This is much better, as the number of instances changes dynamically based on the load, but…
● Adding an instance takes up to 10 minutes.
● Removing an instance causes failed connections for our users.
New Challenges and Benefits
Some Benefits of using a DB on AWS that we have found include:
- Easily create new instances and replicas of them; this means these tasks can now also be done by team members who are less skilled with databases.
- Easily manage snapshots and restore an instance from them (this can also be useful to scrap your staging/dev environment and start fresh every day or week).
- No more worrying about DR: with Aurora the data is automatically replicated across Availability Zones, and you can optionally have a replica in a different region.
- Easily change the class of your DB servers if you have over- or underestimated your load.
- CloudWatch adds value with a good range of metrics ready to use out of the box.
- Support Center is your friend: we experienced positive and proactive interaction with them when a development cluster crashed while we were experimenting with advanced new features (Aurora’s spatial index implementation).
New Challenges and Benefits
So is everything wonderful once you are on the cloud? Not exactly.
These are a few drawbacks of using a DB on AWS for our use cases:
- AWS is “fully managed”, but you have to understand and set up VPCs, subnets and security groups; this costs time or requires a consultant.
- RDS is “fully managed”, but you have to understand how parameter groups work and the slight differences in the Aurora engine.
- RDS is great if you have a simple infrastructure, but it gets harder if you want to achieve a zero-downtime service.
- If you don’t plan wisely, costs can easily grow (e.g. reserved instances).
Conclusions
Moving the database first might not be your best option: latency could be an application killer, and in general it’s best to start with an application that is entirely on the cloud, better still if it’s new and you can design it from scratch to work on the cloud with all the services you can find there.
That said, if you have to plan a big change or upgrade for your DB infrastructure, a cloud provider could be a great option, as it lets you start quickly and change capacity over time without too much hassle.
We have now moved our API to AWS as well, and this has increased performance and lowered response times… but that is maybe good for another talk...
THANK YOU FOR YOUR ATTENTION
We are hiring: https://corporate.doveconviene.it/lavora-con-noi/
Contacts
● Riccardo Capecchi
○ https://about.me/riccardocapecchi
● Marco Careddu
○ [email protected]
○ https://www.linkedin.com/in/marco-careddu-33707632
○ https://twitter.com/sgradix
● Piermarco Zerbini
○ [email protected]
○ https://it.linkedin.com/in/piermarcozerbini
○ https://github.com/Snafrutz
References
● Severalnines - ClusterControl
● Google Cloud - Replication with External Cluster
● Google Cloud - 1st vs 2nd Gen
● AWS Guides - MySQL importing External Replica
● Aurora Overview
● Our bash scripts to autoscale Aurora
● Twelve-Factor App