PAGE 1 of 17 (877)-476-5973 www.polyserve.com

White Paper

Scalable Shared Databases for SQL Server 2005

Achieving Linear Scalability for Scale-out Reporting using SQL Server 2005 Enterprise Edition

Abstract: Microsoft SQL Server™ 2005 Enterprise Edition supports scale-out reporting through scalable shared databases. Scale-out reporting enables multiple SQL Server 2005 systems to attach a read-only copy of the same database.

When deployed using PolyServe’s Database Utility™ for SQL Server, Enterprises reduce report completion times by up to 16x. The solution reduces storage complexity, simplifying SQL Server scale-out for complex, off-hours reporting workloads. The PolyServe solution enables rapid transformation of OLTP to read-only data warehousing for scale-out and back again to OLTP—in seconds.

This proof of concept (POC) demonstrates PolyServe’s solution for scalable shared databases. The POC consists of a 4-node PolyServe Matrix Server cluster running PolyServe’s Database Utility for SQL Server and Microsoft SQL Server 2005 Enterprise Edition, connected to a SAN.

Contents

Scalable Shared Databases for SQL Server 2005
  Achieving Linear Scalability for Scale-out Reporting using SQL Server 2005 Enterprise Edition

Introduction
  Single System Performance Limits
  Data Warehousing Challenges

Introducing the Scalable Shared Database
  Analysis

Another Model for Scale Out: Shared Data
  Storage Management
  Concurrent Scalability for Scale Out

Proof of Concept Results
  Data Center Use Case
  Configuration Overview
  Database Configuration Overview
  Performance Results

Conclusions

The Database Utility Overview

The Database Utility™ for SQL Server Components

PolyServe’s Cluster Volume Manager

Summary

Introduction

With rapidly growing production databases deployed on Microsoft SQL Server 2005, timely and scalable reporting has become business-critical for Enterprises.

Today, the size of databases deployed on individual SQL Server systems is often measured in the hundreds of Gigabytes to Terabytes. For larger databases, data warehouse preparation is challenging and time-consuming.

Further, companies face time-constrained reporting windows where there is effectively less time and (as the database grows) less available computing power to complete resource-intensive reporting jobs within an off-hours reporting window.

Single System Performance Limits

When the power of SQL Server 2005 is combined with modern, industry-standard x64 servers and Storage Area Networks (SANs), Enterprises are provided with a robust platform for deploying mission-critical databases at an optimal price-performance. For single system scalability, SQL Server 2005 includes numerous performance and memory management advancements that make fuller use of the resources within a single system.

For Online Transaction Processing (OLTP)—where queries tend to be shorter and less resource-intensive—a single server often provides adequate bandwidth.

However, more complex workloads—such as reporting, ad hoc queries, and data warehouse preparation—often require more throughput than a single server can provide. Scanning tables, sorting large amounts of data, and running multiple reporting jobs concurrently—all against the same database—these are resource-intensive tasks that can easily overburden a single server system. For these workloads, the server is often the bottleneck.

Consider a business requirement to execute 8 reports against a 300GB table used primarily for OLTP. The indexes have been optimized for OLTP. The query plans generated for the reports are based upon full table scans. If such reporting saturates a single server, the only way to complete the 8 reporting jobs in less time is to scale out to several servers, running multiple reports concurrently across multiple servers.

Data Warehousing Challenges

Data warehouses and data marts often start small and simple (with a fact table or two and a few dimension tables). If successful, these small data warehouses may grow across an organization over time, transforming into corporate-wide repositories used for business intelligence and senior management decision support.

One goal behind transforming a large read-write database to a read-only data warehouse is to offload performance-intensive reporting functions to another server. Another overarching goal of a data warehouse is to maintain fresh data. Every second spent on data warehouse preparation means less time spent on reporting and progressively less up-to-date data.

From a performance perspective, it does not take long for multiple reporting operations to overburden servers and direct-attached storage during the off-hours data warehouse preparation and reporting usage periods.

When faced with a fixed amount of time and fixed amount of bandwidth, the scalable shared database is a revolutionary breakthrough.

Introducing the Scalable Shared Database

Microsoft SQL Server™ 2005 Enterprise Edition supports scale-out reporting through Scalable Shared Databases. Scale-out reporting enables multiple SQL Server 2005 instances to attach a read-only version of the database.

The Microsoft Knowledge Base (KB) article describing this feature focuses on use cases for reporting and data warehouse activity.

In summary, the implementation described in the Microsoft KB article recommends the following configuration guidelines:

• Read Only NTFS Volumes. A Scalable Database must reside in a read-only volume or set of volumes. Note: this POC validates using Scalable Shared Databases on the PSFS (PolyServe File System), an NTFS-compatible file system built using Microsoft’s Installable File System (IFS) Kit.

• Private tempdb. The Scalable Database must be attached to an instance that has private tempdb.

• SAN. The database must reside on a read-only volume configured using the diskpart.exe utility from the database server attached to a Storage Area Network. This POC obviates the need for this function.

• Windows Version Requirement. Scalable Shared Databases are supported only on Microsoft Windows Server 2003 Service Pack 1 or later.

The non-shared data approach described in the KB article documents how to maintain an updated copy of the production database. To use this copy of the database for scale-out reporting, a new read-only volume must be created and managed using DISKPART.exe. To do so, a volume and its database copy must be unmounted and remounted in read-only mode on all the reporting servers.

Conversely, the volume must then be unmounted from all the reporting servers and re-mounted in read-write mode to refresh its contents (i.e., bring the database copy up to date with the production database). This cycle must be repeated for each reporting exercise. To say the least, it’s a complicated process.

Microsoft has tested this approach when scaling out to 8 nodes.

Analysis

The approach covered in the KB article is based on using read-only volumes containing replicated database copies for reporting. This approach is functional but, given the operational overhead, likely of limited practical use.

An essential premise behind scalable shared databases is that the production database is large and growing and cannot be easily serviced by a single server for reporting or ad hoc query purposes.

There are still challenges with scale out:

• Each database requires a replicate or snapshot copy to manage. This creates storage management overhead. There is effectively more storage and data to manage, across more logical management points—for both the storage operator and DBA.

• Costly processing refreshes. Refreshing the replicated database affects the production database.

• Challenging to administer free space. There will be space management tasks for 1 large production database and its large replica.

• Challenging to maintain. Since the database is large, it would likely reside in several volumes, each of which must be individually managed. Mounting and unmounting volumes on each reporting server also creates administration overhead.

In the non-shared data approach, the volumes used for the replicated reporting database should be dedicated exclusively to reporting, since the volumes will be routinely unmounted and re-mounted in read-only mode. This will likely increase the total number of volumes to manage, per cluster.

Another Model for Scale Out: Shared Data

Another deployment option for scaling out databases on SQL Server 2005 is shared data. Through PolyServe’s Database Utility, all servers have read-write access to all storage. This simplifies storage management, shortens data warehouse preparation, and makes scaling out for reporting and back again an easy operation.

To fully exploit scale-out databases, an environment should support scaling out to many nodes without replication, without explicitly creating read-only volumes, and without mount/unmount operations. The PolyServe solution provides such an environment.

Storage Management

Storage management is greatly simplified because all servers can “see” all storage. There are few volumes to manage, and in the utility model storage does not have to be reconfigured to be used in support of scalable shared databases. Once provisioned, databases can be easily moved from server to server, in support of scale out and scale back operations.

Concurrent Scalability for Scale Out

Through a simple script operation, all SQL Server systems may attach the database as read-only. Reports or ad hoc queries can safely run on a server without affecting the performance of reports or queries running on other servers.

In effect, reports are isolated to a given server. More servers can be added as needed, so that more reports run concurrently against the same data.

Proof of Concept Results

As summarized above, SQL Server 2005 enables attaching a database in a read-only volume to more than one server. With PolyServe, there is no manual volume creation required. Attaching the read-only database to multiple servers can be automated through scheduling, or performed manually through a few easy steps.

This vastly simplifies the job for the DBA. Now, instead of having to create, mount and un-mount multiple volumes to each new server in the cluster, a simple file group permissions operation is sufficient. Thus, the production database itself can be scaled out (mounted by multiple servers) in support of reporting operations.

The benefits of Scalable Shared Database in the PolyServe Database Utility for SQL Server:

• No stale data: work in near real time. The database stays up to date because neither replication copies nor snapshots are required. While replication and snapshots are both supported in the PolyServe solution, neither is needed unless the reporting job takes place on a remote cluster.

• Single-touch scale-out. Since this solution is based on PolyServe’s solution for Windows Server, it can scale out to as many as 16 servers.

• No need for operator intervention to perform complicated mount/unmount operations

• Rapid transformation from OLTP to Scalable Shared Database for reporting operations, and back again

o Transform from OLTP to Scalable Shared Database on 16 servers in less than 60 seconds

o Scale back to read-write mode in less than 30 seconds

o Reporting can begin within seconds on current data on up to 16 servers

Figure 1: Proof of Concept Main Table with 1,000,000 rows

• Volumes can be shared—and do not have to be dedicated to the scale-out database

• Tempdb databases for all servers can reside in the same high-performance volume as the production database

• No need to burden internal drives and other server resources. Since operations are automated, there is less burden on each server hosting the scale-out database.

• No “volume sprawl”. Since all pre-defined servers can share volumes in the SAN, there are fewer volumes to create, manage, and back up.

• Single point of backup. During the reporting window, one or more nodes in the cluster may be dedicated to backing up the production database while reporting jobs occur in parallel.

Data Center Use Case

Most datacenters dedicate certain hours of operation to reporting, ad hoc queries, and data warehouse preparation. With PolyServe’s Utility approach to Scalable Shared Databases, the same database can be transformed from OLTP to reporting in a matter of seconds, and the workload can scale out to as many as 16 servers. With 16 servers working on a set of reports, or performing intensive data warehouse preparation, the reporting window shrinks significantly, freeing time to perform important database maintenance operations.

For example, consider a data center that currently performs, say, 50 reports each night in a reporting window of 4 hours. With the PolyServe approach to Scalable Shared Database, the reporting could be reduced to as little as approximately 1/16th the time, or roughly 15 minutes, assuming the work scales linearly across 16 servers. With roughly 3 hours and 45 minutes of newfound “free time”, it is then possible to scale back to OLTP mode to perform maintenance tasks such as index reorganization or statistics updating.

Figure 2: Proof of Concept Main Table File Properties
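The 15-minute estimate in the use case above is a best case: it assumes the reporting work divides evenly and scales linearly across all 16 servers. A minimal Python sketch of that arithmetic follows; the helper name `scaled_window_minutes` is hypothetical and is not part of any PolyServe or Microsoft tool.

```python
def scaled_window_minutes(window_minutes: float, servers: int) -> float:
    """Best-case reporting time if work divides evenly and scales linearly."""
    if servers < 1:
        raise ValueError("servers must be at least 1")
    return window_minutes / servers


baseline = 4 * 60                        # 4-hour reporting window, in minutes
scaled = scaled_window_minutes(baseline, 16)
freed = baseline - scaled                # time returned for maintenance

print(f"scaled window: {scaled:.0f} min")                          # 15 min
print(f"freed time: {int(freed // 60)} h {int(freed % 60)} min")   # 3 h 45 min
```

Any shortfall from linear scaling, such as the storage effect reported in the measured results, lengthens the scaled window accordingly.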

Configuration Overview

The Proof of Concept consisted of a 4-node PolyServe Matrix Server cluster for Windows, running SQL Server 2005 Enterprise Edition, attached to a Fibre Channel SAN. The servers were commodity dual-processor systems configured with 2GB physical memory. A PolyServe cluster filesystem was created in a high-performance cluster volume, created with Matrix Volume Manager, and mounted as drive “S:”.

Database Configuration Overview

The database used for the Scalable Shared Database was called debit. The debit database had a single table, called card, which held 100,000,000 credit card transactions (see Figure 1). The properties of the primary file for the debit database are shown in Figure 2. The card schema is depicted in Figure 3.

The PolyServe cluster filesystem mounted as S: contained directories for both the scale-out database and all of the tempdb databases required by the scale-out servers; this is a simple way to deploy Scalable Shared Databases. Figure 4 shows how easy it is to have many tempdb databases located in a single, high performance PolyServe cluster filesystem in support of a large scale-out configuration. In the example, 4 directories were created, one named for each of the servers (e.g., tmr6s3) used in the Proof of Concept. The total space consumed by the card table was roughly 4.3GB.

Performance Results

Using the card table described above, a set of queries was constructed to simulate a reporting workload that tests the Scalable Shared Database feature in the Database Utility model. The queries were constructed to stress all aspects of query processing.

Figure 3: Card Table Attributes

These workloads include:

• Physical Disk I/O

o The query plans involved full table scans of 100,000,000 rows

o A full scan of the card table required 4.3GB of physical disk reads

• Processor and Memory Utilization

o Data filtering

▪ Processing the WHERE predicate

o Sorting

o Grouping

o Aggregation

The measured test consisted of executing 8 consecutive ad hoc queries based on the example listed in the box below. As mentioned above, the card table was 4.3GB so the total reporting workload consisted of 34.4GB of sequential I/O. Randomization of the queries was achieved by plugging in different values for m and n in the BETWEEN clause. There were approximately 1,000,000 unique vendors stored in the vendor_id column, and 26 transaction types stored in the trantype column. On average, randomly assigning values for the BETWEEN clause rendered a dataset of approximately 13,000,000 rows for the sorting, grouping and aggregation tasks.

After the baseline results were collected from 1 server, the PolyServe sql_scale.exe command was executed to prepare the Scalable Shared Database on 2 servers. The same 8 queries were then executed, 4 per server, and completion times were measured. The test was then scaled out to 4 servers, where 2 of the 8 reports were executed on each server and completion times were measured.
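The split of the 8 queries across 1, 2, and 4 servers can be sketched as follows. This is an illustration only: the text states how many queries ran on each server but not how they were assigned, so the round-robin assignment, the function name `assign_queries`, and the server names are all assumptions. The per-server I/O figure uses the 4.3GB table scan size given above.

```python
TABLE_SCAN_GB = 4.3   # size of one full scan of the card table (from the text)
QUERY_COUNT = 8       # queries in the measured workload


def assign_queries(query_ids, servers):
    """Deal queries out round-robin so each server gets an equal share."""
    plan = {server: [] for server in servers}
    for i, query_id in enumerate(query_ids):
        plan[servers[i % len(servers)]].append(query_id)
    return plan


queries = list(range(1, QUERY_COUNT + 1))
for count in (1, 2, 4):
    # Server names are placeholders, not the hosts used in the POC.
    servers = [f"server{n}" for n in range(1, count + 1)]
    plan = assign_queries(queries, servers)
    for server, assigned in plan.items():
        io_gb = len(assigned) * TABLE_SCAN_GB
        print(f"{count} server(s): {server} runs {assigned} (~{io_gb:.1f} GB scanned)")
```

Each query still scans the whole table, so the total I/O stays at 34.4GB regardless of server count; only the share handled by each server shrinks.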

SELECT vendor_id, avg(amt) avamt
FROM card
WHERE trantype BETWEEN m AND n
GROUP BY vendor_id
ORDER BY avamt desc;

Figure 4: Simplicity of using PolyServe Matrix Server for tempdb requirements

Between each server count test, the database was transformed from scale-out mode to OLTP mode and a few more rows were inserted in order to validate the transformation from Scalable Shared Database mode to OLTP mode. The database was then transformed once again to a scale-out database on the next number of servers to be tested (e.g., scaling from 2 to 4 servers). At most, 58 seconds elapsed between measured executions of the query set. That is, no more than 58 seconds were required for the following tasks between measured runs of the benchmark:

1. Scaling back the database from scale-out read-only mode to OLTP read/write mode

2. Executing a small number of insertions into the card table

3. Scaling out the database to the next number of nodes (e.g., from 2 to 4)

The task of scaling out the database includes the startup time for all the instances. Incidentally, with PolyServe’s solution, the instances can be started in parallel on all servers so the preparation time of the Scalable Shared Database was standardized.

Measured Results

The baseline job completion time for the 8 queries executed on 1 server was 48 minutes. After scaling out with the sql_scale.exe tool and running 4 queries on each of 2 servers, the job completion time was reduced to 24 minutes, which is linear scalability. Finally, the sql_scale.exe tool was used to scale out to 4 servers. The same 8 queries were once again executed, 2 per server. The job completion time was 14 minutes, as depicted in Figure 6. All told, the scalability from 1 to 4 nodes was 86% (a 3.4x speedup against an ideal 4x).
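The paper does not define how the 86% figure was computed. One common definition, speedup divided by server count, reproduces it from the measured times, as the following Python sketch shows. The function name is illustrative.

```python
def speedup_and_efficiency(baseline_minutes, scaled_minutes, servers):
    """Return (speedup, efficiency) relative to the single-server baseline."""
    speedup = baseline_minutes / scaled_minutes
    efficiency = speedup / servers      # 1.0 means perfectly linear scaling
    return speedup, efficiency


BASELINE = 48  # minutes for 8 queries on 1 server
for servers, minutes in ((2, 24), (4, 14)):
    speedup, efficiency = speedup_and_efficiency(BASELINE, minutes, servers)
    print(f"{servers} servers: {speedup:.2f}x speedup, {efficiency:.0%} of linear")
# 2 servers: 2.00x speedup, 100% of linear
# 4 servers: 3.43x speedup, 86% of linear
```

Under this definition, perfectly linear scaling at 4 servers would have meant a 12-minute completion time rather than the measured 14 minutes.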

Figure 6: Scalability of the Scalable Shared Database on PolyServe Matrix Server (chart titled “Reporting Workload Complete Times”; y-axis: minutes, 0 to 60; x-axis: number of servers, 1, 2, 4)

Linear Scalability Requires a Balanced Hardware Configuration

Given the architecture of the Scalable Shared Database, the only component that can affect scalability is storage bandwidth, as documented in this Proof of Concept at 4 servers. The storage allocated to the small Matrix Server test cluster from the SAN was sufficient to sustain increased I/O demand from 1 to 2 servers, but I/O latency increased from 2 to 4 servers. Thus, scalability was slightly affected. Adding disk capacity in the PolyServe Database Utility for SQL Server is a non-intrusive administrative action, so this sort of bottleneck is simple to remedy.
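As a back-of-envelope check on the storage demand, the average aggregate read rate can be derived from the 34.4GB workload and the measured completion times. The Python sketch below is illustrative: it assumes every run read the full 34.4GB from disk and that 1 GB = 1024 MB, and it gives averages over the whole job, so peak demand during scan phases would be higher.

```python
WORKLOAD_GB = 34.4  # 8 full scans of the 4.3GB card table (from the text)

# Measured job completion times in minutes, keyed by server count.
COMPLETION_MINUTES = {1: 48, 2: 24, 4: 14}


def aggregate_read_mb_per_s(workload_gb, minutes):
    """Average aggregate read rate, assuming 1 GB = 1024 MB (an assumption)."""
    return workload_gb * 1024 / (minutes * 60)


for servers, minutes in COMPLETION_MINUTES.items():
    rate = aggregate_read_mb_per_s(WORKLOAD_GB, minutes)
    print(f"{servers} server(s): ~{rate:.1f} MB/s average across the cluster")
```

On these assumptions the average demand on the shared storage rises from roughly 12 MB/s on 1 server to roughly 42 MB/s on 4 servers, which is the direction of the latency effect described above.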

Conclusions

Given ample SAN resources, PolyServe’s implementation of the Scalable Shared Database delivers linear scalability for SQL Server 2005.

To achieve linear scalability, an appropriate amount of both system and storage bandwidth must be available to the reporting operations. An individual array, depending on the vendor, may provide sufficient bandwidth for some applications. For others, more storage bandwidth may be required.

The Scalable Shared Database option, combined with PolyServe Matrix Server for Windows Server, included with the Database Utility for SQL Server, provides an easy way to achieve linear system scalability.

To achieve cost-effective storage bandwidth scalability through software—to scale-out storage—PolyServe’s Cluster Volume Manager (CVM) may be utilized.

The Database Utility Overview

Built on SQL Server and Windows Server, PolyServe’s Database Utility for SQL Server is deeply integrated with both Windows Server and SQL Server. One of the core products provided with the Database Utility for SQL Server is Matrix Server for Windows Server.

Matrix Server provides the underlying technology enabling this form of scale-out reporting. This technology provides the building blocks for shared data. Shared data means all servers can safely share storage and data on the SAN. PolyServe’s NTFS-compatible Cluster File System, included with Matrix Server, and designed using Microsoft’s Installable File System Kit, supports concurrent, direct, read/write activity through a cache-coherent cluster file system (PSFS). This enables a simple, one-touch operation for mounting the shared database across multiple servers in the cluster simultaneously.

Shared data also means all the databases are stored in a single place that can be accessed by all of the servers simultaneously. This means moving any SQL database (read-only or read-write) from one server to another can be performed rapidly, and with minimal storage operator or DBA intervention.

In short, shared data brings the following fundamental benefits:

• A single pool of servers: In this approach, you no longer think of installing a database on any particular server. Instead, you install into the cluster, and the database can then run on any server in the cluster. This allows an administrator to move a database from one server to another in order to rebalance load and maintain an appropriate level of utilization—without any need to copy or migrate data. It also means if any server fails, the databases it had been running can immediately and automatically be restarted on other machines, ensuring high availability.

• A single pool of storage: Similarly, there is no need to manage storage on a server-by-server basis and no requirement to have a backup job per server or separate monitoring for each server’s free space pool. In addition, because the servers are connected to storage over a SAN, it is easy to provision more storage when required—but that is only done when the environment as a whole needs more space.

• Flexibility: A shared data cluster can be formed from servers you already own. There is no need to buy matched servers; you can mix servers from different vendors, with different processor types and speeds, different numbers of processors and different memory configurations. You can even mix Windows 2000 and Windows Server 2003 in the same cluster, and use the shared storage pool to migrate databases to Windows Server 2003 without having to copy data.

• Easy scalability: It is easy to add servers to a cluster if overall demand grows, and databases can be shifted in a matter of seconds onto newly added servers with no need for data copying. (Conversely, if workloads drop, you can shift databases off some of the servers and remove them for repurposing.) Databases receive full native performance with no virtualization overhead or virtualization limits, so if a database requires the full speed and capacity of a large server, it can have it. Once a cluster is created, adding Scalable Shared Databases becomes as easy as right-clicking and rehosting a given read-only database to any other server in the cluster.

• Easy high availability: Because all servers have access to all databases, high availability becomes easy to implement with no requirement for doubling up hardware. Simply by specifying where a database should be restarted in the event of failure—from among any of the servers in the cluster—you can ensure the database will remain available. The shared storage pool requires none of the complicated and brittle configuration on a server-pair-by-server-pair basis that traditional failover clustering can entail.

The following section describes the SQL Server-specific components included with PolyServe Database Utility for SQL Server.

The Database Utility™ for SQL Server Components

The final necessary ingredient to both Scalable Shared Databases and simplified management of SQL Server is SQL Server integration. The Database Utility provides an integration layer that applies the core capabilities of Matrix Server and the Cluster Volume Manager to SQL Server.

This layer, called the Database Utility for SQL Server, includes:

• A SQL Server Health Monitor that periodically probes SQL Server instances within the cluster to ensure that client requests are being successfully handled. This will detect if a SQL Server instance hangs, even if the operating system and hardware remain healthy.

• A SQL Instance Virtualizer that allows the creation of a “Virtual SQL Server.” A virtual SQL Server is an adaptation of the Matrix Server virtual host concept. It consists of a virtual IP address that clients use to connect, a specified primary server in the cluster, a prioritized list of backup servers, and from one to 16 associated SQL Server instances. If a server fails, the Virtualizer will move the virtual IP address to the top server in the list that is capable of hosting the virtual SQL server; it then restarts the database instances in the new location. Note that, unlike server virtualization, the SQL Instance Virtualizer component itself is not active during normal operation. Thus there is no performance penalty; SQL runs at full native speed, with native access to all hardware resources. The Virtualizer also ensures that a given database is only ever accessed from one server at a time, since SQL Server is not designed to allow multiple server concurrent access.

• A SQL Server Registry Replicator. SQL Server stores some configuration information in the Windows registry, which is located on each server’s individual C: drive. To ensure this information is available to other servers in the cluster if a database instance needs to move, the Solution Pack includes a component that automatically replicates relevant registry information into the cluster-wide shared storage.

• A SQL Installation and hotfix Updater agent. To simplify installation of SQL Server and of hotfixes across multiple servers, the Solution Pack includes a push installer agent. The administrator simply places the appropriate installation packages in the shared file system, then uses the push installer agent to perform installations or hotfix updates across the entire cluster—what PolyServe calls one-click maintenance.
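
The failover rule described for the SQL Instance Virtualizer can be sketched in a few lines of Python. This is an illustrative model only, not PolyServe's implementation: the class VirtualSqlServer, the function choose_host, and the node names and IP address used with them are hypothetical, and "capable of hosting" is reduced here to membership in a set of healthy servers.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VirtualSqlServer:
    """Hypothetical model of a virtual SQL Server (names are assumptions)."""
    virtual_ip: str                       # address that clients connect to
    primary: str                          # preferred server in the cluster
    backups: list                         # backup servers, highest priority first
    instances: list = field(default_factory=list)

    def __post_init__(self) -> None:
        # The text states one to 16 SQL Server instances per virtual server.
        if not 1 <= len(self.instances) <= 16:
            raise ValueError("a virtual SQL Server has 1 to 16 instances")


def choose_host(vsql: VirtualSqlServer, healthy: set) -> Optional[str]:
    """Return the first server, in priority order, that is able to host."""
    for server in [vsql.primary, *vsql.backups]:
        if server in healthy:
            return server
    return None  # no capable server remains
```

For example, with primary "node1" and backups ["node2", "node3"], choose_host returns "node2" when only node2 and node3 are healthy; the virtual IP address and the associated instances would then move to that server.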

The Database Utility for SQL Server thus provides an easy way to move SQL Server instances within a cluster to improve resource utilization, an easy path to complete high availability for all instances in the cluster, and a simple way of performing typical SQL Server installation and maintenance tasks, all leveraging the core capabilities of Matrix Server and Volume Manager. The shared data pool provided by Matrix Server is used to store SQL databases and log files in a single location, accessible to all servers. The Matrix Volume Manager allows this storage pool to be spread across space on multiple storage arrays. Matrix Server’s high-availability and application-control engine ensures that databases remain available regardless of server failures and, through Dynamic Re-Hosting, allows administrators to adjust work assignments to maximize utilization. Matrix Manager provides a single point for monitoring and controlling these elements.

PolyServe’s Cluster Volume Manager

PolyServe’s Windows-based Cluster Volume Manager (CVM) enables multiple storage arrays (or storage devices within an array) to be grouped together into scalable, high-speed storage pools that are available to all SQL Server systems in the cluster.

Matrix Server provides the foundation for the Database Utility for SQL Server by allowing a set of servers to access shared file systems simultaneously and thus be managed as a single unit. The Cluster Volume Manager provides an analogous capability for storage. The CVM allows disk space from multiple storage arrays to be used and managed as a single pool.

• With the CVM, an administrator can create a single volume from free space on multiple arrays (or in multiple LUNs on a single array). In this way a file system can make use of whatever storage is available.

• The CVM also allows a volume to be striped across multiple arrays. This can improve I/O rates by aggregating the performance of the arrays, which is especially useful for sequential workloads frequently associated with data warehouses.

• Through concatenation or striping, the CVM permits the construction of very large file systems that exceed the typical 2 TB limit on the size of individual LUNs.

• In some environments, it is customary to configure every storage array with a set of LUNs of a fixed size—say, 30 gigabytes. The CVM allows a server administrator to construct file systems of whatever sizes are desired using these fixed LUNs.

• Finally, if a file system is becoming too full, the CVM can be used to expand the file system, without taking the cluster down, using free space from any array accessible to the cluster.
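
As a rough illustration of concatenation, the sketch below builds one logical volume from fixed-size LUNs and maps a logical byte offset to the LUN that holds it. It is a simplified model under stated assumptions, not the CVM's actual on-disk layout: the function names build_concat_map and locate are hypothetical, and LUNs are assumed to be laid end to end in the order given.

```python
import bisect
from itertools import accumulate

GB = 1024 ** 3  # bytes per gigabyte (binary)


def build_concat_map(lun_sizes):
    """Return the cumulative end offset of each LUN in a concatenated volume."""
    return list(accumulate(lun_sizes))


def locate(ends, offset):
    """Map a logical byte offset to (lun_index, offset_within_lun)."""
    if not 0 <= offset < ends[-1]:
        raise ValueError("offset outside volume")
    # First LUN whose end offset lies beyond the requested byte.
    index = bisect.bisect_right(ends, offset)
    start = ends[index - 1] if index else 0
    return index, offset - start
```

With three 30 GB LUNs, logical offset 30 * GB falls at the start of the second LUN, so locate returns (1, 0). Concatenating 69 such LUNs yields 2,070 GB, which is larger than a single 2 TB LUN.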

If storage ever becomes a bottleneck, as it did in the POC, the Volume Manager enables multiple storage devices (controllers, cabinets, or arrays) to be aggregated into high-performance shared storage pools. Once these pools are created, the Volume Manager can stripe a volume across the underlying LUNs to aggregate the bandwidth of each device. Thus, if a single shared pool were composed of two LUNs, each located on a separate array, the logical shared volume would span both arrays, aggregating the performance of each array (across spindles, controllers, and paths to the cluster attached to the storage).
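
The way striping spreads a sequential workload over several LUNs can be sketched as a simple round-robin address mapping. This is a generic illustration of striping, not a description of the CVM's internals: the 64 KiB stripe unit and the function names stripe_locate and luns_touched are assumptions made for the example.

```python
from collections import Counter

STRIPE_SIZE = 64 * 1024  # assumed stripe unit in bytes; not from the white paper


def stripe_locate(offset, lun_count, stripe_size=STRIPE_SIZE):
    """Map a logical byte offset to (lun_index, offset_within_lun)."""
    if offset < 0 or lun_count < 1:
        raise ValueError("offset must be >= 0 and lun_count >= 1")
    stripe_no, within = divmod(offset, stripe_size)
    lun = stripe_no % lun_count                       # round-robin across LUNs
    lun_offset = (stripe_no // lun_count) * stripe_size + within
    return lun, lun_offset


def luns_touched(start, length, lun_count, stripe_size=STRIPE_SIZE):
    """Count how many stripe units of a sequential read land on each LUN."""
    first = start // stripe_size
    last = (start + length - 1) // stripe_size
    return Counter(s % lun_count for s in range(first, last + 1))
```

For a pool of two LUNs, a sequential 512 KiB read starting at offset 0 touches four stripe units on each LUN, so both arrays can service parts of the same read.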

Like all other aspects of the Database Utility, the CVM is managed cluster-wide from a single control point. Thus, just as Matrix Server allows servers’ data sets to be handled in a unified way, the Volume Manager allows the physical storage resources associated with a cluster to be used and managed as a single entity.

Summary

Microsoft SQL Server™ 2005 Enterprise Edition now supports scale-out reporting through Scalable Shared Databases. Scale-out reporting enables multiple SQL Server 2005 systems to attach a read-only copy of the same database.

When deployed using PolyServe’s Database Utility™ for SQL Server, enterprises may achieve linear scalability of SQL Server for reporting, reduce reporting times by up to 16x, and eliminate manual storage configuration processes.

This proof of concept (POC) demonstrates PolyServe’s solution for the Scalable Shared Database. The POC consists of a 4-node PolyServe Matrix Server cluster running PolyServe’s Database Utility for SQL Server and Microsoft SQL Server 2005 Enterprise Edition, connected to a SAN.