Disaster Recovery on Demand on the Cloud
Protect your app from Outages
Nati Shalom, CTO, GigaSpaces
@natishalom
May 2013
AGENDA
• AWS and outages
• Outage impact
• Disaster Recovery – it’s all about redundancy!
• Cloudify as a solution for redundancy
• Demo with Cloudify on EC2
® Copyright 2013 GigaSpaces Ltd. All Rights Reserved
AWS USAGE
Managing Big Data on the Cloud
• AWS – around 0.5M servers
• Facebook – less than 0.1M servers
• Google – around 1M servers
THE OUTAGE PROBLEM
OUTAGE – APRIL 21, 2011
OUTAGE - JUNE 29, 2012
OUTAGE - OCTOBER 22, 2012
OUTAGE - CHRISTMAS EVE 2012
NOT ONLY AMAZON
28 December 2012 - Some owners of Microsoft's Xbox 360 gaming console were unable to access some of their cloud-based storage files.
26 July 2012 - Service for Microsoft’s Windows Azure Europe region went down for more than two hours.
29 February 2012 - An outage caused service impacts of 8-10 hours for users of Azure data centers in Dublin (Ireland), Chicago, and San Antonio.
THAT’S WHAT YOU EXPECT?
• 99% - 3.65 days of downtime per year
• 99.9% - 8.76 hours of downtime per year
• 99.99% - 53 minutes of downtime per year
• 99.999% - 5.26 minutes of downtime per year
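The downtime figures above follow from simple arithmetic on the availability percentage. A minimal sketch, assuming a 365-day year (an assumption, the slide does not state its basis); the function name is illustrative:

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years are ignored (assumption)


def downtime_hours_per_year(availability_percent: float) -> float:
    """Hours per year a service may be down at the given availability level."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


for pct in (99.0, 99.9, 99.99, 99.999):
    hours = downtime_hours_per_year(pct)
    print(f"{pct}% -> {hours / 24:.2f} days = {hours:.2f} hours = {hours * 60:.2f} minutes")
```

Under this assumption the 99.99% level works out to about 52.6 minutes, which the slide rounds to 53.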
OUTAGE IMPACT – DESIGN FOR FAILURES
An outage could cost…
• $89K per hour for Amadeus
• $225K per hour for PayPal!
DISASTER RECOVERY
MULTI CLOUD
PREPARE FOR DISASTER RECOVERY
• Dedicated expert for DR architecture
• Define target recovery time & point
• Assume every tier can fail
• Use monitoring and alerts
• Document your operational processes
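One way to make the recovery time and recovery point targets concrete is to record them as numbers and check each DR drill against them. A minimal sketch; the class, the function, and all figures are hypothetical and not taken from the talk:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryTargets:
    rto_seconds: float  # recovery time objective: longest acceptable time to restore service
    rpo_seconds: float  # recovery point objective: largest acceptable window of lost data


def drill_meets_targets(targets: RecoveryTargets,
                        recovery_seconds: float,
                        replication_lag_seconds: float) -> bool:
    """True when a DR drill stayed within both the RTO and the RPO."""
    return (recovery_seconds <= targets.rto_seconds
            and replication_lag_seconds <= targets.rpo_seconds)


# Hypothetical targets: back online within 5 minutes, lose at most 30 seconds of data.
targets = RecoveryTargets(rto_seconds=300, rpo_seconds=30)
print(drill_meets_targets(targets, recovery_seconds=120, replication_lag_seconds=5))
```

Feeding the same check from monitoring data would tie the "define targets" and "use monitoring and alerts" items together.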
CHAOS MONKEY
It’s all about REDUNDANCY!
CLONE YOUR ENVIRONMENT
CLONE YOUR DATA
• RDS Read Replica
• More to come…
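As an illustration of the RDS Read Replica option, the sketch below wraps the `create_db_instance_read_replica` call of the boto3 RDS client. The helper name, the identifiers, and the choice of regions are assumptions made for the example; the client is passed in so the function can be exercised without AWS credentials.

```python
def create_cross_region_replica(rds_client, source_arn: str, replica_id: str) -> dict:
    """Ask RDS to create a read replica of the instance identified by source_arn.

    rds_client is expected to be a boto3 RDS client created in the DR region.
    For a cross-region replica the source is given as the instance ARN.
    """
    return rds_client.create_db_instance_read_replica(
        DBInstanceIdentifier=replica_id,          # name of the new replica (hypothetical)
        SourceDBInstanceIdentifier=source_arn,    # ARN of the primary instance (hypothetical)
    )
```

With boto3 installed and credentials configured, a client for the DR region could be created with `boto3.client("rds", region_name="us-east-1")` and passed as `rds_client`. Encrypted sources need additional parameters that this sketch leaves out.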
Automating your DR Processes
Leverage Existing Automation Frameworks
• Configuration Centric
• App Centric (PaaS)
CLONE YOUR ENV - HOW DOES IT WORK?
BUILT IN SUPPORT FOR MANAGING DATA IN THE CLOUD
• Real Time: Storm, Elastic Caching (XAP)
• Relational DB Clusters: MySQL, PostgreSQL
• NoSQL Clusters: MongoDB, Cassandra, Couchbase, ElasticSearch
• Hadoop: Hadoop (Hive, Pig,..), ZooKeeper
Real Life Scenario
CASE STUDY: VERIFI
• Technology-based concrete process control and information service
• Deployments across North America, Latin America, Asia, and Europe for nearly a decade
• Part of W.R. Grace & Co., a $6.3B company
• The problem: on-demand HA/DR over multiple cloud regions
Requirements:
• High Availability
• Data Replication
• Disaster Recovery
ELASTIC ON-DEMAND DISASTER RECOVERY
Problem: Can we eliminate the RTO vs. cost trade-off in the cloud?
Solution (Elastic DR):
• A hybrid between Hot and Warm DR
• Switch to the active site in a matter of seconds through cloud-agnostic lifecycle automation recipes
VERIFI (INITIAL) ARCHITECTURE
(Diagram) Availability region (US-West: Oregon), reached from the Internet:
• EC2 instance: mod_cluster
• EC2 instance: JBoss
• EC2 instance: PostgreSQL
• EC2 instance: Cassandra
• 2 data volumes
• 4 recipes
ELASTIC DR ON-DEMAND: FAILOVER SCENARIO
(Diagram, before failure) Cloud #1 in Region (US-West Oregon) runs the app servers and PostgreSQL. Cloud #2 in Region (US-East Virginia) runs PostgreSQL and sends a liveness poll to cloud #1.
(Diagram, after failure) Cloud #1 is down. Cloud #2 in Region (US-East Virginia) runs the app servers and PostgreSQL. Cloud #3 in Region (US-West California) runs PostgreSQL and sends a liveness poll to cloud #2.
0. Upon initial deployment, the primary deployment of the application “verifi” is bootstrapped onto cloud #1. Another, slightly modified application recipe, “verifi_dr”, is bootstrapped as cloud #2; it polls cloud #1 for failure and acts as a PostgreSQL DB slave.
1. Region failure occurs.
2. Turn the Postgres slave into master and start app server instances.*
3. Bootstrap another cloud in a different region using the same application recipe used to bootstrap cloud #2 above.*
* Initially, all those actions may be done manually by Verifi’s Ops team (e.g., via recipe commands in the CLI).
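The failover flow on this slide (liveness poll, promote the slave, start app servers, bootstrap a new DR cloud) can be sketched as a small state machine. This is an illustration only, not Cloudify recipe code: the class name, the failure threshold, and the recorded action strings are assumptions, and the real actions would be carried out by recipes or by the Ops team.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class DrSite:
    """Simulated DR site (cloud #2) that polls the primary site (cloud #1)."""
    is_primary_alive: Callable[[], bool]     # the liveness poll, injected by the caller
    failures_before_failover: int = 3        # assumed threshold to avoid reacting to one blip
    role: str = "slave"                      # PostgreSQL role of this site
    actions: List[str] = field(default_factory=list)
    consecutive_failures: int = field(default=0, init=False)

    def poll_once(self) -> None:
        if self.role == "master":
            return  # already failed over; nothing left to watch
        if self.is_primary_alive():
            self.consecutive_failures = 0
            return
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failures_before_failover:
            self._fail_over()

    def _fail_over(self) -> None:
        # Steps 2 and 3 of the scenario, recorded rather than executed.
        self.role = "master"
        self.actions.append("promote PostgreSQL slave to master")
        self.actions.append("start app server instances")
        self.actions.append("bootstrap a new DR cloud in a different region")


# Simulated poll results: healthy once, then the primary region fails.
results = iter([True, False, False, False])
site = DrSite(is_primary_alive=lambda: next(results))
for _ in range(4):
    site.poll_once()
print(site.role, site.actions)
```

Requiring several consecutive failed polls is a design choice added for the sketch; the slide only says that cloud #2 polls cloud #1 for failure.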
NEXT STEPS
(Diagram: levels of redundancy along an availability axis)
• Across clouds (AWS, Rackspace, Azure, etc.): 1 application + overrides, several cloud drivers
• Across AWS regions: 1 application + overrides, 1 cloud driver
• Across AWS zones: 1 application + overrides, 1 cloud driver
Supported by Verifi phase #1
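The "1 application + overrides" idea can be illustrated with a plain dictionary merge: one base definition, plus a small set of values that differ per site. This is not Cloudify recipe syntax; every key and value below is hypothetical.

```python
# Base definition of the application, shared by every site (hypothetical keys).
BASE_APP = {
    "region": "us-west-2",   # Oregon
    "app_servers": 2,
    "db_role": "master",
}


def with_overrides(base: dict, overrides: dict) -> dict:
    """Return a copy of the base definition with site-specific values replaced."""
    return {**base, **overrides}


# The DR site reuses the same definition and changes only what differs.
dr_site = with_overrides(BASE_APP, {
    "region": "us-east-1",   # N. Virginia
    "app_servers": 0,        # started on demand upon failover
    "db_role": "slave",
})
print(dr_site)
```

Keeping the differences in a small override set is what lets the same application definition move across zones, regions, or clouds.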
ELASTIC ON-DEMAND DR: COSTS
Cost:
• Main Site (US-West): $82,068
• Warm DR Site (US-East): $12,625
• Hot DR Site: $82,068
Configuration:
• Main Site: 1 load balancer, 2 JBoss instances, 1 PostgreSQL master, 3 Cassandra
• DR Site: 1 PostgreSQL slave; all other instances start on demand upon failover
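The cost figures above can be compared with a few lines of arithmetic. A minimal sketch using only the three numbers from the slide; the billing period is not stated in the source, so the results are in the same unspecified unit:

```python
MAIN_SITE = 82_068   # main site (US-West)
WARM_DR = 12_625     # warm DR site (US-East)
HOT_DR = 82_068      # hot DR site

warm_total = MAIN_SITE + WARM_DR      # main site plus warm DR
hot_total = MAIN_SITE + HOT_DR        # main site plus hot DR
savings = HOT_DR - WARM_DR            # what warm DR saves over hot DR
warm_share_of_hot = WARM_DR / HOT_DR  # warm DR cost as a fraction of hot DR cost

print(f"Main + warm DR: ${warm_total:,}")
print(f"Main + hot DR:  ${hot_total:,}")
print(f"Warm DR saves ${savings:,} ({1 - warm_share_of_hot:.1%} less than hot DR)")
```

On these numbers the warm DR site costs roughly 15% of a hot DR site, which is the trade-off the elastic approach exploits.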
ELASTIC DR: WARM DR COST, CLOUD PORTABILITY
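(Diagram: the same 4 recipes applied to several targets; the target labels are not recoverable from the slide text)
• DR Site: $12k
• Same recipe, other targets: $14k, $6k, $5k, $9k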
ELASTIC DR: HOT DR COST
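(Diagram: the same 4 recipes applied to several targets; the target labels are not recoverable from the slide text)
• DR Site: $82k
• Same recipe, other targets: $79k, $115k, $68k, $91k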
SUMMARY
• Disaster Recovery – it’s all about redundancy!
  • Cloning your environment – app stack
  • Cloning your data – DB replication
• Automation makes DR processes simple
  • Use recipes to clone your app stack consistently
  • Use replication to clone your data
• Leverage cloud economics to reduce the cost
  • DR on demand
  • Multi cloud
QUESTIONS & ANSWERS
Thank You!
@natishalom