Best Practices for Surviving Outages

Designing and implementing a High Availability and Disaster Recovery strategy

Sal Cardello, Director of Pro Services

Matt Dolian, System Engineer

Avroham Katz, System Engineer

description

Site disruptions happen, often when you least expect them. When your business depends on application uptime or access to critical data, a strategy for high availability (HA) and disaster recovery (DR) is essential. A carefully architected and well-implemented HA and DR strategy helps you minimize risk, strengthen fault tolerance, and rapidly redeploy your application and data after a disruption. This presentation gives an overview of HA and DR and offers some best practices from the Engine Yard team. The full on-demand webcast can be viewed here: http://pages.engineyard.com/BestPracticesforSurvivingOutagesWebcast.html

Transcript of Best Practices for Surviving Outages

Page 1: Best Practices for Surviving Outages

Best Practices for Surviving Outages

Designing and implementing a High Availability and Disaster Recovery strategy

Sal Cardello, Director of Pro Services

Matt Dolian, System Engineer

Avroham Katz, System Engineer

Page 2: Best Practices for Surviving Outages

Disaster Recovery

Photo credit: naturaldisasterss.com/wp-content/uploads/2011/12/Natural-Disaster-Images.jpg

Page 3: Best Practices for Surviving Outages

Tiers of Disaster Recovery

0 - No off-site data

1 - Data backup with no hot site

2 - Data backup with hot site

3 - Electronic vaulting

4 - Point-in-time copies

5 - Transaction integrity

6 - Zero or near-Zero data loss

7 - Highly automated, business integrated solution

Citation: http://en.wikipedia.org/wiki/Seven_tiers_of_disaster_recovery
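Because the tiers are ordered from least to most capable, they can be compared numerically. The following is a minimal sketch of that idea; the `DRTier` enum and the `meets_requirement` helper are names invented here for illustration and do not come from the slides.

```python
from enum import IntEnum


class DRTier(IntEnum):
    """The tiers listed on the slide, numbered as shown there."""
    NO_OFFSITE_DATA = 0
    BACKUP_NO_HOT_SITE = 1
    BACKUP_WITH_HOT_SITE = 2
    ELECTRONIC_VAULTING = 3
    POINT_IN_TIME_COPIES = 4
    TRANSACTION_INTEGRITY = 5
    ZERO_OR_NEAR_ZERO_DATA_LOSS = 6
    HIGHLY_AUTOMATED = 7


def meets_requirement(current: DRTier, required: DRTier) -> bool:
    """Return True when the current tier is at or above the required tier."""
    return current >= required


# Example: a point-in-time-copy setup checked against a vaulting requirement.
print(meets_requirement(DRTier.POINT_IN_TIME_COPIES, DRTier.ELECTRONIC_VAULTING))
```

Treating a higher tier as satisfying any lower requirement is an assumption of this sketch; a real assessment would look at recovery time and data-loss objectives directly.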

Page 4: Best Practices for Surviving Outages

Definition: High Availability

“Design approach & associated service implementation that ensures a pre-arranged level of operational performance will be met during a contractual measurement period”

Citation: http://en.wikipedia.org/wiki/High_availability
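A "pre-arranged level of operational performance" is often stated as an availability percentage over a measurement period. As a rough sketch, the arithmetic below converts such a target into the downtime it allows. The `downtime_budget` function and the 30-day period are illustrative assumptions, not figures from the presentation.

```python
from datetime import timedelta


def downtime_budget(availability_pct: float, period: timedelta) -> timedelta:
    """Return the maximum downtime allowed in `period` at the given availability."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability_pct must be between 0 and 100")
    # The unavailable fraction of the period is the downtime budget.
    return period * (1.0 - availability_pct / 100.0)


# Assumed measurement period of 30 days, for illustration only.
month = timedelta(days=30)
for target in (99.0, 99.9, 99.99):
    print(f"{target}% over 30 days allows {downtime_budget(target, month)} of downtime")
```

For example, a 99.9% target over 30 days leaves a budget of roughly 43 minutes.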

Page 5: Best Practices for Surviving Outages

High Availability Architecture

Page 6: Best Practices for Surviving Outages

Why implement HA?

Page 7: Best Practices for Surviving Outages

Best Practices for High Availability

Photo credit: http://bit.ly/z9OEwG

Environment Analysis

Geographic Mirroring

Database Replication

Store Assets Replication

Validate Synchronization

Escalation Plan

Test

Launch
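The slides do not say how synchronization is validated. One possible approach for stored assets, sketched here under that caveat, is to compare content checksums between the primary site and its mirror. The `manifest` and `sync_report` helpers and the in-memory file contents are hypothetical.

```python
import hashlib
from typing import Dict, List


def manifest(files: Dict[str, bytes]) -> Dict[str, str]:
    """Map each asset path to the SHA-256 digest of its contents."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def sync_report(primary: Dict[str, str], mirror: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare two manifests and list the paths that are out of sync."""
    shared = set(primary) & set(mirror)
    return {
        "missing": sorted(set(primary) - set(mirror)),   # on primary only
        "extra": sorted(set(mirror) - set(primary)),     # on mirror only
        "changed": sorted(p for p in shared if primary[p] != mirror[p]),
    }


# Illustrative in-memory "sites"; a real check would read files or object storage.
primary = manifest({"logo.png": b"v2", "app.css": b"body{}"})
mirror = manifest({"logo.png": b"v1", "app.css": b"body{}"})
print(sync_report(primary, mirror))
```

An empty report in all three categories would indicate the mirror matches the primary for the assets checked.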

Page 8: Best Practices for Surviving Outages

Application Considerations

• Environment Specific Configurations

• Asset Hosting

• Page Caching

• Other Data Stores

• Background Processing

• Cron Jobs

Photo credit: http://www.flickr.com/photos/dseneste/5912382808/
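The slide lists these considerations without detail. One common concern with cron jobs and background processing in a mirrored setup is that scheduled work should not run on both sites at once. The sketch below guards a job behind an environment-specific setting; the `SITE_ROLE` variable and the `run_if_active` helper are assumptions made for illustration, not Engine Yard configuration.

```python
import os
from typing import Callable, Mapping, Optional, TypeVar

T = TypeVar("T")


def is_active_site(env: Mapping[str, str] = os.environ) -> bool:
    """Treat the site as standby unless SITE_ROLE (hypothetical) says otherwise."""
    return env.get("SITE_ROLE", "standby") == "active"


def run_if_active(job: Callable[[], T], env: Mapping[str, str] = os.environ) -> Optional[T]:
    """Run the job only on the active site; return None when it is skipped."""
    if not is_active_site(env):
        return None
    return job()


# Example: the same scheduled job deployed to both sites.
print(run_if_active(lambda: "report sent", {"SITE_ROLE": "active"}))
print(run_if_active(lambda: "report sent", {"SITE_ROLE": "standby"}))
```

Defaulting to standby means a site with missing configuration skips the job instead of duplicating it.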

Page 9: Best Practices for Surviving Outages

Failover Process at Engine Yard

Manual, customer owned decision

1. Client contacted per terms of SLA

2. Engine Yard syncs database and performs manual failover

3. Redundant database promoted to master

4. DNS is updated

5. Replication to former master is re-established
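The order of the failover steps can be illustrated with a small in-memory model. This is only a sketch of the sequence on the slide, not Engine Yard's tooling: the `Database` and `Environment` classes, the `fail_over` function, and the host names are all invented for the example, and no real database or DNS calls are made.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Database:
    name: str
    role: str                              # "master" or "replica"
    replicates_from: Optional[str] = None


@dataclass
class Environment:
    master: Database
    replica: Database
    dns_target: str                        # host that application traffic resolves to
    log: List[str] = field(default_factory=list)


def fail_over(env: Environment, customer_approved: bool) -> Environment:
    # Step 1: the client is contacted; failover is a manual, customer-owned decision.
    if not customer_approved:
        raise PermissionError("failover requires the customer's decision")
    old_master, new_master = env.master, env.replica
    # Step 2: sync the database (only recorded here; a real sync would run now).
    env.log.append(f"synced {new_master.name} from {old_master.name}")
    # Step 3: promote the redundant database to master.
    new_master.role, new_master.replicates_from = "master", None
    env.log.append(f"promoted {new_master.name}")
    # Step 4: update DNS to point at the new master.
    env.dns_target = new_master.name
    env.log.append(f"dns -> {new_master.name}")
    # Step 5: re-establish replication toward the former master.
    old_master.role, old_master.replicates_from = "replica", new_master.name
    env.master, env.replica = new_master, old_master
    env.log.append(f"{old_master.name} now replicates from {new_master.name}")
    return env


env = Environment(
    master=Database("db-east", "master"),
    replica=Database("db-west", "replica", replicates_from="db-east"),
    dns_target="db-east",
)
fail_over(env, customer_approved=True)
print(env.dns_target, env.master.role, env.replica.replicates_from)
```

Refusing to proceed without approval mirrors the slide's point that the decision to fail over stays with the customer.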

Page 10: Best Practices for Surviving Outages

Questions?

Page 11: Best Practices for Surviving Outages

Get in touch

Contact us: Sal Cardello, Director of Pro Services [email protected]

Learn more: http://www.engineyard.com/services