Building a High Availability Strategy for Your Enterprise Using Microsoft SQL Server 2008
DAT301
Gopal Ashok
Program Manager
Microsoft Corp.
REVENUE
COMPLIANCE
24X7 GLOBAL BUSINESS
GROWTH
What is this talk about?
Analysis
Solution Design
Implementation
Testing
Maintenance
Deployments and Best Practices
Ensuring IT services and operational continuity in the enterprise
Protect mission-critical SQL Server databases using Always On Technologies
Defining HA and DR
High availability is a system design protocol and associated implementation that ensures a certain absolute degree of operational continuity during a given measurement period
Disaster recovery involves processes and procedures designed to restore business operations after a natural or human-induced disaster
Typically involves providing redundancy spanning multiple sites or across geographic regions
Availability is defined in terms of service level agreements (SLA)
– Recovery time
– Data loss during unplanned downtime
Recovery Time Objective (RTO) guided by availability requirements
How much downtime can you tolerate?
Recovery Point Objective (RPO) guided by criticality of application data
How much data can you lose?
Availability Class   Acceptable Downtime (hrs/yr) or RTO   Acceptable Data Loss (time of last copy) or RPO
Tier 1               >99.99% (1 hr or less)                5 min or less
Tier 2               99.9% - 99.99% (1 - 8.5 hrs)          5 mins to 8.5 hrs
Tier 3               <99.9% (hours to days)                Hours to days
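The downtime figures in the tier table follow from simple arithmetic on the availability percentage. The sketch below is illustrative only: the function names are hypothetical, a 365-day year is assumed, and the tier boundaries are read off the table above. The exact results (about 0.88 and 8.76 hours per year) suggest the slide's 1 hr and 8.5 hr figures are rounded.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year


def downtime_hours_per_year(availability_pct: float) -> float:
    """Hours of downtime per year allowed at a given availability percentage."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)


def availability_tier(availability_pct: float) -> str:
    """Map an availability percentage to the tier names used in the table."""
    if availability_pct > 99.99:
        return "Tier 1"
    if availability_pct >= 99.9:
        return "Tier 2"
    return "Tier 3"


for pct in (99.999, 99.99, 99.9, 99.0):
    hours = downtime_hours_per_year(pct)
    print(f"{pct}% -> {hours:.2f} hrs/yr ({availability_tier(pct)})")
```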
Protection Levels
Local HA
– Protection against resource failures: Machine, Database Corruption, Disk, Resource Bottlenecks
– Location redundancy: Building, < 10 miles
Regional DR
– Protection against Network Outages, Site Failures
– Location redundancy: City, County, < 100 miles
Geographic DR
– Protection against Natural Disasters
– Location redundancy: State, Country, > 100 miles
SQL Server High Availability Planning
Analysis
– Application tiers serviced by the databases
– Protection levels: Local HA, Regional DR, Geographic DR
– Causes of database downtime
Solution Design
– What solutions exist?
– What are the characteristics and cost of each solution?
Implementation
– What are the deployment steps and best practices?
Database Downtime Drivers
Unplanned Downtime
– Failure Protection
– User Errors
Planned Downtime
– Upgrade and Migrations
– Online Administration
– Predictable Resourcing
Solution Design
Understand the available technology options and characteristics before making a decision
Solution Architecture
HA Capabilities
Limitations and Caveats
Cost Vector
Always On Technologies
Provides a full range of options to minimize operational downtime and maintain appropriate levels of application availability.
Increase Availability
• Backup and Restore
• Log Shipping
• Database Mirroring
• Peer-to-Peer Replication
• Failover Clustering
Decrease Operational Downtime
• Online Index Operations
• Table Partitioning
• Enhanced Locking
• Resource Governor
• Database Snapshot
• Dedicated Admin Connection
• Dynamic Configuration
Always On Solution Characteristics
Solutions compared: Log Shipping, DBM Sync, DBM Async, Cluster, Transactional Replication, Peer-to-Peer Replication
Characteristics compared:
– RPO: No Data Loss (RPO=0)
– Failover: Failover Unit (Instance, Database, Table); Auto Failover (RTO)
– Redundancy and Utilization: Read; Multiple Write

Cost
Solution                    Hardware   App Perf Impact   Manageability
Log Shipping                Low        Low               Low
DBM Sync                    Low        High              Low
DBM Async                   Low        Low               Low
Cluster                     High***    Low***            Low***
Transactional Replication   Low        Low               High
Peer-to-Peer Replication    Low        Low               High

* Log Shipping and Database Mirroring can provide point-in-time read capability using STANDBY or database snapshots, respectively
** Database Mirroring provides the fastest failover to a hot secondary
*** Depends on SAN technology
Increasing Availability: ServiceU
Unplanned downtime
– Loss of a database server: RPO = 0 (no data loss); RTO = 60 seconds maximum
– Loss of the primary data center, or of the entire database storage unit in the primary data center: RPO = 3 minutes maximum; RTO = 15 minutes total, including evaluation of the issue
Planned downtime
– RPO = 0 (no data loss); RTO = 60 seconds maximum. Some database changes may require more than 60 seconds of downtime; in those cases every effort is made to minimize the service interruption
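One way to make targets like these testable is to record them as data and compare each drill or incident against them. The following is a minimal sketch: the scenario keys and the `meets_objective` helper are hypothetical names, and only the RPO and RTO numbers come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Objective:
    rpo_seconds: int  # maximum tolerated data loss, expressed as time
    rto_seconds: int  # maximum tolerated outage duration


# Targets taken from the slide; the dictionary keys are illustrative labels.
TARGETS = {
    "server_loss": Objective(rpo_seconds=0, rto_seconds=60),
    "site_loss": Objective(rpo_seconds=3 * 60, rto_seconds=15 * 60),
    "planned": Objective(rpo_seconds=0, rto_seconds=60),
}


def meets_objective(scenario: str, data_loss_seconds: int, outage_seconds: int) -> bool:
    """Return True if an observed incident stayed within both RPO and RTO."""
    target = TARGETS[scenario]
    return (data_loss_seconds <= target.rpo_seconds
            and outage_seconds <= target.rto_seconds)


# Example: a failover drill that lost no data and took 45 seconds.
print(meets_objective("server_loss", data_loss_seconds=0, outage_seconds=45))
```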
ServiceU provides solutions for reserved-seat ticketing, box office management, event management, and online payments
No Service = No Revenue
ServiceU High Availability Architecture
Basic principle: redundancy for all components
3-node cluster
– Redundancy during single-node failure, patching, etc.
No Majority: Disk Only quorum model
– Availability during multi-node failure
No automatic failback to the preferred node
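The quorum choice above can be illustrated by comparing it with a node-majority model. The sketch below is a simplification with hypothetical function names: under node majority a cluster needs more than half of its nodes, while under the disk-only model any single surviving node that can reach the quorum disk keeps the cluster running. The node-majority comparison is added here for contrast and is not from the slide.

```python
def node_majority_has_quorum(total_nodes: int, nodes_up: int) -> bool:
    """Node Majority: more than half of the nodes must be running."""
    return nodes_up > total_nodes // 2


def disk_only_has_quorum(nodes_up: int, quorum_disk_online: bool) -> bool:
    """No Majority (Disk Only): one node plus the quorum disk is enough."""
    return nodes_up >= 1 and quorum_disk_online


# A 3-node cluster losing nodes one at a time.
for nodes_up in (3, 2, 1):
    print(
        f"{nodes_up} of 3 nodes up:",
        "node majority =", node_majority_has_quorum(3, nodes_up),
        "| disk only =", disk_only_has_quorum(nodes_up, quorum_disk_online=True),
    )
```

The trade-off is that the quorum disk becomes a single point of failure for the cluster, so this model relies on the storage itself being highly available.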
ServiceU Disaster Recovery Architecture
Demo: Using Log Shipping to Set Up Mirroring
Upgrading to SQL Server 2008
Windows Server 2003\SQL Server 2005
Upgraded both OS and SQL Server to 2008
Had to do this with very little downtime
How much? Let’s find out!
Primary Site Upgrade Process
Temporary SQL Server 2008 cluster on Windows Server 2008
Establish async DBM from 2005 to 2008
Application switch-over to temp cluster
– Block users
– Sync mirroring
– DBM failover
– Redirection
– Remove DBM
Total end-user downtime: 10 minutes
Upgraded primary cluster to 2008; repeated the steps above
Downtime: 6 minutes
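The switch-over steps listed above map onto a short, ordered sequence of database mirroring commands. The sketch below only builds that sequence as text so it can be reviewed or logged; it does not connect to a server. The `ALTER DATABASE ... SET PARTNER` statements are standard SQL Server mirroring syntax, while the helper name, the database name, and the placeholder comments for blocking users and redirecting applications are assumptions, since the source does not say how ServiceU performed those two steps.

```python
from typing import List, Tuple


def switchover_statements(database: str) -> List[Tuple[str, str]]:
    """Return (step, T-SQL) pairs for a mirroring-based switch-over, in order."""
    db = "[" + database.replace("]", "]]") + "]"  # bracket-quote the name
    return [
        ("Block users",
         "-- site-specific: stop application traffic (method not given in the source)"),
        ("Sync mirroring",
         f"ALTER DATABASE {db} SET PARTNER SAFETY FULL;"),
        ("DBM failover",
         f"ALTER DATABASE {db} SET PARTNER FAILOVER;"),
        ("Redirection",
         "-- site-specific: repoint applications at the new principal"),
        ("Remove DBM",
         f"ALTER DATABASE {db} SET PARTNER OFF;"),
    ]


# "Tickets" is a hypothetical database name used only for illustration.
for step, statement in switchover_statements("Tickets"):
    print(f"{step}: {statement}")
```

Switching to synchronous mode before the manual failover matters because a manual failover requires the mirror to be synchronized, which is what keeps the switch-over at RPO = 0.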
Windows Server 2008 & SQL Server 2008 Better Together
Failover Clustering
– Rolling upgrade and patching
– 16 nodes
Resource Governor
– Manage SQL Server workloads and resources by specifying limits on resource consumption
Backup Compression
– Reduce backup and restore time
Database Mirroring
– Automatic recovery from page corruption
– Log stream compression
– Faster recovery on failover
Log Shipping
– Sub-minute log shipping
– Backup compression
Replication
– Peer-to-Peer Replication: hot add new nodes
– Improved performance over WAN links
Database Mirroring Compression
Benefit
Cost
Demo: Automatic Page Repair
Rolling upgrade using Mirroring
Failure is not an option: bWin
Environment
– 100+ TB data
– 850+ DBs
– 100 instances
– 450K SQL statements/sec
Sports betting, soft & skill games
1 million bets per day on > 90 sports
The Mission: Failure is not an option & Money is not a problem
Rather lose availability and performance than data
bWin High Availability Architecture
Diagram labels: Datacenter A, Datacenter B; Principal, Mirror (Mirroring); Log Shipping (no delay); Log Shipping (1h delay); two log backup file servers; two database backup file servers
Principal: 32 IA64 Dual Core CPUs
Mirror: 32 IA64 Single Core
64 Network Ports (1 Gbps)
400 local SAS drives on 16 RAID controllers (for OS, TempDB and Log files – low latency)
16 HBAs for a 256-disk / 256 GB cache SAN system
Scale Out and Availability Scenario
AdventureWorks is building a new web-based order management system that allows customers from all over the world to access the system and place orders
The core group of customers are in Western Europe, South East Asia and North America
Requirements
– Geo Redundancy
– Data Locality
– High Availability
– Local Read-Scale
Workload Characteristics
– Mainly reads
– Few writes
Application Characteristics
– Each user logging in connects to a particular server
– Partitioned based on user-id and region
– Writes from a user always happen on one server regardless of the region the user logs in from
– All reads redirected to the closest geo-location
– Reasonable tolerance for latency (5-10 minutes)
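The routing rule in the application characteristics can be sketched in a few lines. This is a minimal illustration, not the scenario's actual implementation: the server names, the `User` record, and the `route` helper are hypothetical, and only the three regions and the read/write rule come from the slide.

```python
from dataclasses import dataclass

# Hypothetical server names; the three regions come from the scenario.
PEER_NODES = {
    "Western Europe": "eu-peer",
    "South East Asia": "asia-peer",
    "North America": "na-peer",
}
READ_ONLY_SERVERS = {
    "Western Europe": "eu-read",
    "South East Asia": "asia-read",
    "North America": "na-read",
}


@dataclass(frozen=True)
class User:
    user_id: int
    home_region: str  # assigned by the user-id/region partitioning


def route(user: User, login_region: str, is_write: bool) -> str:
    """Writes go to the user's home peer; reads go to the closest region."""
    if is_write:
        return PEER_NODES[user.home_region]
    return READ_ONLY_SERVERS[login_region]


# A user partitioned to Asia who logs in from Europe.
traveller = User(user_id=42, home_region="South East Asia")
print(route(traveller, "Western Europe", is_write=True))
print(route(traveller, "Western Europe", is_write=False))
```

Because each user's writes always land on one server, write conflicts between peers are avoided by design; the cost is that a read served locally may lag a recent write, which the stated 5-10 minute latency tolerance allows for.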
Replication Topology
Peer Nodes
Read-Only Servers
Asia1 Asia2
Key to Success
It’s not the vendor!
It’s not the technology!
It’s not the features!
Licensing Facts
Passive servers are the mirror, the log-shipped secondary, and the passive cluster node
No license required on passive if it is truly passive
A passive server does not need a license if the number of processors in the passive server is equal to or less than the number of processors in the active server.
The passive server can take the duties of the active server for 30 days. Afterwards, it must be licensed accordingly.
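The rules above can be read as a simple decision function. The sketch below encodes only the three conditions stated on this slide; the function and parameter names are hypothetical, and the actual license terms remain the authority.

```python
def passive_needs_license(
    passive_processors: int,
    active_processors: int,
    days_acting_as_active: int = 0,
    serves_queries_while_standby: bool = False,
) -> bool:
    """Apply the slide's rules for whether a passive server needs a license."""
    # Rule 1: the server must be truly passive while it is on standby.
    if serves_queries_while_standby:
        return True
    # Rule 2: it may not have more processors than the active server.
    if passive_processors > active_processors:
        return True
    # Rule 3: it may stand in for the active server for up to 30 days.
    return days_acting_as_active > 30


print(passive_needs_license(passive_processors=4, active_processors=4))
print(passive_needs_license(passive_processors=4, active_processors=4,
                            days_acting_as_active=45))
```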
HA Features Edition Support
Columns: Feature, Express, Workgroup, Standard, Enterprise, Comments
Database Mirroring (1): Advanced high availability solution that includes fast failover and automatic client redirection
Failover Clustering (2)
Backup Log-shipping: Data backup and recovery solution
Online System Changes: Includes Hot Add Memory, dedicated administrative connection, and other online operations
Online Indexing
Online Restore
Fast Recovery: Database available when undo operations begin
(1) Single thread redo
(2) Limited to 2-node cluster
question & answer
Resources
Sessions On-Demand & Community: www.microsoft.com/teched
Resources for IT Professionals: http://microsoft.com/technet
Resources for Developers: http://microsoft.com/msdn
Microsoft Certification & Training Resources: www.microsoft.com/learning
Related Content
Breakout Sessions
– DAT312 All You Needed to Know about Microsoft SQL Server 2008 Failover Clustering
Hands-on Labs
– DAT12-HOL Microsoft SQL Server 2008 Database Mirroring, Part 1
– DAT12-HOL Microsoft SQL Server 2008 Database Mirroring, Part 2
– DAT05-HOL Microsoft SQL Server 2008 Data Snapshots
– DAT07-HOL Microsoft SQL Server 2008 Peer-to-Peer Replication
– DAT06-HOL Microsoft SQL Server 2008 Online Operations
© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.