
Technical white paper

Adaptive Optimization for HP 3PAR StoreServ Storage

Configure multiple tiers of storage devices for maximum performance

Table of contents

Executive summary
Storage tiers: Opportunity and challenge
HP 3PAR Adaptive Optimization software
  Brief overview of volume mapping
  Adaptive optimization implementation
  Design tradeoff: Tiering vs. caching
  Configuration
  Tiering analysis algorithm
  Design tradeoff: Granularity of data movement
Results
Customer case study
Summary


Executive summary

New opportunities exist to optimize the cost and performance of storage arrays, thanks to the availability of a wide range of storage media such as solid state drives (SSDs), high-performance hard disk drives (HDDs), and high-capacity HDDs. But these opportunities come with the challenge of exploiting them effectively without increasing administrative burden, because the tradeoffs for storage arrays differ from those of CPU memory hierarchies. This white paper explains some of the tradeoffs, describes the technology that adaptively optimizes storage on HP 3PAR StoreServ Storage, and illustrates its effectiveness with performance results.

Storage tiers: Opportunity and challenge

Modern storage arrays support multiple tiers of storage media with a wide range of performance, cost, and capacity characteristics—ranging from inexpensive (~$200 USD) 2 TB SATA HDDs that can sustain only about 75 input/output operations per second (IOPS) to expensive (~$500+ USD) 50–200 GB SLC/MLC flash memory-based SSDs that can sustain more than 4,000 IOPS. Volume RAID and layout choices enable additional performance, cost, and capacity options. This wide range of cost, capacity, and performance characteristics is both an opportunity and a challenge.

Figure 1. Autonomic Tiering 3PAR StoreServ

The opportunity is that the performance and cost of the system can be optimized by correctly placing the data on different tiers: Move the most active data to the fastest (and most expensive) tier and move the idle data to the slowest (and least expensive) tier. The challenge, of course, is to do this in a way that minimizes the burden on storage administrators while also providing them with appropriate controls. Currently, data placement on different tiers is a task usually performed by storage administrators—and their decisions are often based not on application demands but on the price paid by the users. If they don't use careful analysis, they may allocate storage based on available space rather than on performance requirements. At times, HDDs with the largest capacity may also have the highest number of accesses. But the largest HDDs are often the slowest HDDs. This can create significant performance bottlenecks.

There is an obvious analogy with CPU memory hierarchies. Although the basic idea is the same (use the smallest, fastest, most expensive resource for the busiest data), the implementation tradeoffs are different for storage arrays. While deep CPU memory hierarchies (first, second, and third level caches; main memory; and finally paging store) are ubiquitous and have mature design and implementation techniques, storage arrays typically have only a single cache level (the “cache” on disk drives usually acts more like a buffer than a cache). Automatic tiering in storage arrays is a recent development, and not commonplace at all. The industry still has much to learn about it.


HP 3PAR Adaptive Optimization software

Brief overview of volume mapping

Before you can understand HP 3PAR Adaptive Optimization, it is important to understand volume mapping on HP 3PAR StoreServ Storage as illustrated in Figure 2.

Figure 2. HP 3PAR Adaptive Optimization

HP 3PAR virtual volumes (VVs) are organized into volume families (or trees) consisting of a base volume at the root and optional copy-on-write (COW) snapshot volumes of the base VV or of other snapshot VVs in the tree.

Each volume family has three distinct data storage spaces: 1) user space for the base volume; 2) snap space for the copy-on-write data; and 3) admin space for the mapping metadata for the snapshots. If the base volume is fully provisioned, there is a direct, one-to-one mapping from the VV virtual address to the user space. If the base volume is thin-provisioned, only written space in the base volume is mapped to user space and the mapping metadata is stored in the admin space, similar to COW snapshots. The unit of mapping for COW snapshot or thin-provisioned VVs is a 16 KB page. Caching is done at the VV space level and at a granularity of 16 KB pages.
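The fixed granularities above (16 KB mapping pages; 128 MB user/snap and 32 MB admin LD regions, described next) can be sketched numerically. This is an illustrative sketch, not HP code; the helper names are invented:

```python
# Illustrative sketch of the mapping granularities: which 16 KB page and
# which 128 MB user-space LD region a given VV byte offset falls into.
PAGE_SIZE = 16 * 1024             # 16 KB thin/snapshot mapping page
USER_REGION_SIZE = 128 * 1024**2  # 128 MB user/snap LD region
ADMIN_REGION_SIZE = 32 * 1024**2  # 32 MB admin LD region

def page_index(offset: int) -> int:
    """Index of the 16 KB mapping page containing a VV byte offset."""
    return offset // PAGE_SIZE

def user_region_index(offset: int) -> int:
    """Index of the 128 MB LD region containing a VV byte offset."""
    return offset // USER_REGION_SIZE
```

For example, the byte at offset 128 MB is the first byte of region 1, while the byte just before it still belongs to region 0.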

Physical storage in HP 3PAR StoreServ Storage is allocated to the volume family spaces in units of logical disk (LD) regions. The region size for the user and snap spaces is 128 MB, and the region size for the admin space is 32 MB.

Logical disk storage is striped across multiple RAID sets built from 256 MB allocation units of physical disks (PDs) known as chunklets. Every RAID set within one LD has the same RAID type (1, 5, or 6), set size, and disk type (SSD, FC, or SATA Nearline [NL]). These parameters determine the LD characteristics in terms of performance, cost, redundancy, and failure modes.

HP 3PAR StoreServ Storage is a cluster of controller nodes. The chunklets for one LD are allocated only from PDs with the primary access path directly connected to the same node, known as the LD owner node. You can achieve system level data striping by striping the volume family space across regions from LDs owned by different nodes. This ownership partitioning is one reason why thin-provisioned volumes still contain a user space mapping in which each region maps to a dummy zero LD with no physical storage.

A common provisioning group (CPG) is a collection of LDs. It contains the parameters for additional LD space creation, which include RAID type, set size, and disk type for chunklet selection, plus total space warning and limit points. Multiple VV family spaces may be associated with a CPG from which they get LD space on demand. Therefore, the CPG is a convenient way to specify a tier for adaptive optimization because it includes all of the necessary parameters and it permits adaptive optimization to operate below the cache. (There is no reason to bring busy data that is in the controller cache into high-performance storage below the cache.) An additional benefit of tiering at this level is that all three volume spaces, not just user space, are candidates for adaptive optimization. In fact, measurements show that admin space metadata regions are frequently chosen to be placed in the fastest tier.

Figure 2 illustrates the volume mapping for both non-tiered as well as tiered (adaptively optimized) volumes. For non-tiered VVs, each space (user, snap, or admin) is mapped to LD regions within a single CPG and therefore is in a single tier. For tiered VVs, each space can be mapped to regions from different CPGs.

Finally, remember that although this mapping from VVs to VV spaces to LDs to chunklets is complex, the user is not exposed to this complexity because the system software automatically creates the mappings.

The remainder of this white paper describes how this tiering is implemented and the benefits that can be expected.

Adaptive optimization implementation

In order to implement tiering, HP 3PAR Adaptive Optimization needs to do four things: (1) collect historical access data for all the regions in an array (this can be a lot of data); (2) analyze the data to determine the volume regions that should be moved between tiers; (3) instruct the array to move the regions from one CPG (tier) to another; and (4) provide the user with reports that show the impact of adaptive optimization.
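The four steps above form a repeating cycle. The following self-contained sketch shows that control flow only; every name is a stand-in for illustration, not a real System Reporter API:

```python
# Hypothetical sketch of one adaptive optimization cycle. The four callables
# stand in for: region stats collection, tiering analysis, region movement,
# and report generation. None of these are actual System Reporter interfaces.
def run_ao_cycle(collect_stats, analyze, move_region, report):
    stats = collect_stats()            # 1. collect region access history
    moves = analyze(stats)             # 2. pick regions to re-tier
    for region, dst_tier in moves:     # 3. move regions between CPGs (tiers)
        move_region(region, dst_tier)
    return report(stats, moves)        # 4. summarize the impact
```

Keeping the analysis separate from the movement mirrors the design described below: the expensive thinking happens offline, and only the resulting moves touch the array.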

HP 3PAR offers a software application called System Reporter that runs on a host server and periodically collects detailed performance and space data from HP 3PAR arrays, stores the data in a database, and analyzes the data. System Reporter can then generate AO reports from a host, or the 3PAR StoreServ Storage array can generate AO reports from the 3PAR OS management console.

HP implemented adaptive optimization by enhancing System Reporter to collect region-level performance data, perform tiering analysis, and issue region movement commands to the array as shown in Figure 3.

Figure 3. Adaptive optimization implementation using System Reporter


Design tradeoff: Tiering vs. caching

Traditional caching is an obvious choice for an algorithm to manage the different tiers of storage. In this case, data is copied from slower tiers into the fastest tier whenever it is accessed, replacing older data by using a simple, real-time algorithm such as least recently used (LRU). These caching algorithms have been extensively studied in the context of CPU-memory hierarchies. However, disk storage tiers in an array are different from a typical memory hierarchy in several respects.

In memory hierarchies, the faster tiers are almost always much smaller than the slower tiers. Plus, regions that are cached in the faster tier occupy space on the slower tier, but the space duplicated on the slower tier is a small fraction of its total size. In contrast, on arrays, the total space for mid-tier FC drives often is a significant fraction of the space on the slow-tier NL drives—and “losing” the duplicated space is generally not desirable.

Memory hierarchies require very fast response times, so it is not feasible to use complex analysis to figure out what should be cached or replaced. Simple algorithms such as LRU are all that designers can afford. For storage tiers, it is possible to devote time to more sophisticated analysis of access patterns to come up with more effective strategies than simple LRU algorithms.

Memory hierarchies typically use different hardware resources (memory buses) for different tiers, and transferring data between tiers may not significantly impact the available bandwidth to the fastest tier. Disk tiers may often share the same resources (FC ports). Also, the bandwidth used while transferring data between tiers impacts the total backend bandwidth available to the controllers.

For these reasons, HP chose to move regions between tiers instead of caching.

Configuration

Simple administration is an important design goal, which makes it tempting to completely automate adaptive optimization. That would require the administrator to do no configuration at all. However, analysis indicates that some controls are in fact desirable for administration simplicity. Since HP 3PAR StoreServ Storage is typically used for multiple applications—often for multiple customers—HP allows administrators to create multiple adaptive optimization configurations so that they can use different configurations for different applications or customers. Figure 4 shows the configuration settings for an adaptive optimization configuration.

Figure 4. Configuration settings

You can select CPGs for each of the tiers and also set a tier size if you want to limit the amount of space that the algorithm will use in each tier. You can set a very large number if you do not want to limit the size available for any given tier. Note that adaptive optimization will attempt to honor this size limit in addition to any warning or hard limit specified in the CPG.

Make sure to define tier 0 at a higher performance level than tier 1, which in turn should be higher performance than tier 2. For example, you may choose RAID 1 with SSDs for tier 0, RAID 5 with FC drives for tier 1, and RAID 6 with NL or SATA drives for tier 2.

Best practice is to begin an Adaptive Optimization configuration with your application CPG in tier 1; for example, tier 1 could be a CPG using your FC or SAS physical disks. Starting in the middle tier lets you add higher and lower tiers later with new CPGs: tier 0 using SSDs, tier 2 using NL drives, or CPG tiers that differ only in RAID level (for example, RAID 1 or RAID 5 above RAID 6). The main point is to begin with the middle tier (tier 1) when configuring Adaptive Optimization for your application.
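A three-tier configuration like the one recommended above might be represented as follows. This is purely illustrative; the field names, CPG names, and values are invented for this sketch and are not actual CLI or GUI parameters:

```python
# Illustrative representation of one adaptive optimization configuration
# (all names and values invented): three CPG tiers, per-tier size limits,
# a mode, and a schedule with its measurement window.
ao_config = {
    "name": "app-example",
    "mode": "balanced",                # performance | balanced | cost
    "tiers": [
        {"tier": 0, "cpg": "SSD_r1", "size_limit_gib": 512},    # RAID 1, SSD
        {"tier": 1, "cpg": "FC_r5",  "size_limit_gib": 8192},   # RAID 5, FC
        {"tier": 2, "cpg": "NL_r6",  "size_limit_gib": 65536},  # RAID 6, NL
    ],
    "schedule": "daily 02:00",         # run data movement off-peak
    "measurement_hours": 24,           # stats window preceding each run
}
```

Setting a very large `size_limit_gib` corresponds to not limiting the space available to a tier, as described above.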


It is also important to specify the schedule on which a configuration will execute, along with the measurement duration preceding the execution time. This allows the administrator to schedule data movement at times when its additional overhead is acceptable (for example, non-peak hours). You can also schedule when adaptive optimization should stop working before the next measurement period.

Plus, you can set a mode configuration parameter to one of three values:

1. Performance mode biases the tiering algorithm (described in the next section) to move more data into faster tiers

2. Cost mode biases the tiering algorithm to move more data into the slower tiers

3. Balanced mode is a balance between performance and cost

The mode configuration parameter does not change the basic flow of the tiering analysis algorithm, but rather it changes certain tuning parameters that the algorithm uses.

Tiering analysis algorithm

The tiering analysis algorithm that selects regions to move from one tier to another considers several factors, described in the following sections.

Space available in the tiers

If the space used in a tier exceeds the tier size (or the CPG warning limit), the algorithm first tries to move regions out of that tier into any other tier with available space, in an attempt to bring the tier's size below the limit. If no other tier has space, the algorithm logs a warning and does nothing. (Note that if the warning limit for any CPG is exceeded, the array will generate an alert.) If space is available in a faster tier, the algorithm chooses the busiest regions to move to that tier. Similarly, if space is available in a slower tier, it chooses the most idle regions to move to that tier. The average tier service times and average tier access rates are ignored when data is being moved because a tier's size limits have been exceeded.
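The space-limit pass above can be sketched as a small function. This is a hypothetical illustration, not HP's implementation; the function and parameter names are invented, and regions are modeled as `(id, access_rate_density)` pairs:

```python
# Hypothetical sketch of the space-limit pass: when a tier exceeds its size
# limit, move the busiest regions up if a faster tier has room, otherwise
# move the most idle regions down if a slower tier has room, otherwise do
# nothing (the real algorithm logs a warning in that case).
def relieve_over_limit(regions, overflow_count, faster_has_space, slower_has_space):
    if overflow_count <= 0:
        return []
    ranked = sorted(regions, key=lambda r: r[1])  # idle first, busy last
    if faster_has_space:
        return [("up", r[0]) for r in ranked[-overflow_count:]]
    if slower_has_space:
        return [("down", r[0]) for r in ranked[:overflow_count]]
    return []  # no tier has space: log a warning, move nothing
```

Note that, as the text says, this pass ignores service times and access rate comparisons entirely; only relative busyness within the over-limit tier matters.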

Average tier service times

Normally, HP 3PAR Adaptive Optimization tries to move busier regions in a slow tier into higher performance tiers. However, if a higher performance tier becomes overloaded (too busy), performance for regions in that tier may actually be worse than for regions in a "slower" tier. To prevent this, the algorithm does not move any regions from a slower to a faster tier unless the faster tier's average service time is lower than the slower tier's average service time by a certain factor (a parameter called svctFactor). There is an important exception to this rule, because service times are only meaningful when there is sufficient IOPS load on the tier. If the IOPS load on the destination tier is below another value (a parameter called minDstIops), the algorithm does not compare the destination tier's average service time with the source tier's; instead, it uses an absolute threshold (a parameter called maxSvctms).
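The guard just described can be sketched as follows. The parameter names mirror svctFactor, minDstIops, and maxSvctms from the text, but the default values are invented for illustration; the white paper does not state them:

```python
# Sketch of the service-time guard: may we move regions up into this
# destination tier? Defaults are illustrative, not HP's actual values.
def dst_tier_fast_enough(dst_svctms, src_svctms, dst_iops,
                         svct_factor=1.5, min_dst_iops=100, max_svctms=20.0):
    if dst_iops < min_dst_iops:
        # Too little load for a meaningful comparison:
        # fall back to an absolute service-time threshold (maxSvctms).
        return dst_svctms < max_svctms
    # Destination must be faster than the source by svctFactor.
    return dst_svctms * svct_factor < src_svctms
```

So a lightly loaded SSD tier is judged on its absolute service time, while a busy one must beat the source tier by the configured factor.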

Average tier access rate densities

When not limited, as described above, by lack of space in tiers or by high average tier service times, adaptive optimization computes the average tier access rate densities (a measure of how busy the regions in a tier are on average, calculated with units of IOPS per gigabyte per minute) and compares them with the access rate densities of individual regions in each tier. Then, it decides whether to move the region to a faster or slower tier.

We first consider the algorithm for selecting regions to move from a slower to a faster tier. For a region to be considered busy enough to move from a slower to a faster tier, its access rate density accr(region) must satisfy these two conditions:

First, the region must be sufficiently busy compared to other regions in the source tier:

accr(region) > srcAvgFactorUp(Mode) * accr(srcTier)

Where accr(srcTier) is the average access rate density of the source (slower) tier and srcAvgFactorUp(Mode) is a tuning parameter that depends on the mode configuration parameter. Note that by selecting different values of srcAvgFactorUp for the performance, balanced, and cost modes, HP 3PAR Adaptive Optimization can control how aggressively the algorithm moves regions up to faster tiers.

Second, the region must meet one of two conditions: It must be sufficiently busy compared with other regions in the destination tier, or it must be exceptionally busy compared with the source tier regions. The second condition covers the case in which a very small number of extremely busy regions are moved to the fast tier, and the resulting high average access rate density of the fast tier would otherwise create too high a barrier for other busy regions to move to the fast tier:

accr(region) > minimum((dstAvgFactorUp(Mode) * accr(dstTier)), (dstAvgMaxUp(Mode) * accr(srcTier)))
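The two move-up conditions combine as follows. This is a sketch of the published inequalities only; the factor values vary with the mode parameter, and the defaults below are invented for illustration:

```python
# Sketch of the move-up test: a region moves to a faster tier only if it is
# busy relative to its source tier AND busy relative to the destination tier
# (or exceptionally busy relative to the source). Defaults are illustrative.
def should_move_up(accr_region, accr_src, accr_dst,
                   src_avg_factor_up=1.5, dst_avg_factor_up=0.5,
                   dst_avg_max_up=8.0):
    cond1 = accr_region > src_avg_factor_up * accr_src
    cond2 = accr_region > min(dst_avg_factor_up * accr_dst,
                              dst_avg_max_up * accr_src)
    return cond1 and cond2
```

The `min(...)` in the second condition is what keeps a few extremely busy regions in the fast tier from raising the bar so high that other deserving regions can never follow them up.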


The algorithm for moving idle regions down from faster to slower tiers is similar in spirit—but instead of checking for access rate densities greater than some value, the algorithm checks for access rate densities less than some value:

accr(region) < srcAvgFactorDown(Mode) * accr(srcTier)

accr(region) < maximum((dstAvgFactorDown(Mode) * accr(dstTier)), (dstAvgMinDown(Mode) * accr(srcTier)))

HP makes a special case for regions that are completely idle (accr(region) = 0). These regions are moved directly to the lowest tier.
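The move-down test and the completely-idle special case can be sketched in the same style as the move-up test. Again, the factor defaults are invented for illustration:

```python
# Sketch of the move-down test: a region moves to a slower tier if it is
# idle relative to its source tier AND relative to the destination tier
# (with a floor relative to the source). Completely idle regions skip the
# comparisons and go straight to the lowest tier. Defaults are illustrative.
def should_move_down(accr_region, accr_src, accr_dst,
                     src_avg_factor_down=0.5, dst_avg_factor_down=1.5,
                     dst_avg_min_down=0.1):
    if accr_region == 0:
        return True  # completely idle: moved directly to the lowest tier
    cond1 = accr_region < src_avg_factor_down * accr_src
    cond2 = accr_region < max(dst_avg_factor_down * accr_dst,
                              dst_avg_min_down * accr_src)
    return cond1 and cond2
```

Note the symmetry with the move-up test: the inequalities flip direction, and `minimum` becomes `maximum`.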

Design tradeoff: Granularity of data movement

The volume space to LD mapping has a granularity of either 128 MB (user and snapshot data) or 32 MB (admin metadata), and that is naturally the granularity at which data is moved between tiers. Is that the optimal granularity? On the one hand, fine-grain data movement is better because a smaller region of busy data can be moved to high-performance storage without being forced to bring along additional idle data adjacent to it. On the other hand, a fine-grain mapping imposes larger overhead because HP 3PAR Adaptive Optimization must track the performance of a larger number of regions, maintain larger numbers of mappings, and perform more data movement operations. Larger regions also take more advantage of spatial locality (the blocks near a busy block are more likely to be busy in the near future than a distant block). HP's results show that this choice of granularity is a good one.

Results

HP measured the access rate for all regions for a number of application CPGs, sorted them by access rate, and plotted the cumulative access rate versus the cumulative space as shown in Figure 5. For all the applications, most of the accesses are concentrated in a small percentage of the regions. In several applications, this concentration of accesses is very pronounced (more than 95 percent of the accesses to less than 3 percent of the data) but less so for others (more than 30 percent of the space is needed to capture 95 percent of the accesses). In total, just 4 percent of the data gets 80 percent of the accesses. This indicates that the choice of region size is reasonably good, at least for some applications.
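The analysis behind Figure 5 (sort regions by access rate, then plot cumulative accesses against cumulative space) can be reproduced with a few lines. The function below is a sketch with synthetic data; it assumes equal-sized regions, which matches the fixed region granularity described earlier:

```python
# Sketch of the Figure 5 analysis: what fraction of total accesses is
# served by the busiest `space_fraction` of (equal-sized) regions?
def cumulative_access_share(access_rates, space_fraction):
    ranked = sorted(access_rates, reverse=True)       # busiest regions first
    n = max(1, round(len(ranked) * space_fraction))   # regions in that slice
    return sum(ranked[:n]) / sum(ranked)
```

With a synthetic skewed workload of five regions with access rates [96, 1, 1, 1, 1], the busiest 20 percent of the space captures 96 percent of the accesses, the kind of concentration the text describes.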

Figure 5. Distribution of IO accesses among regions for various applications

Because SSD space is still extremely expensive relative to HDD space (10x to 15x), very pronounced concentration of IO accesses to a small number of regions are needed in order for SSDs to be cost-effective. For applications that show less pronounced access concentration, HP 3PAR Adaptive Optimization may still be useful between different HDD tiers. One of the simple but important ideas in the implementation is the separation of the analysis and movement by CPGs (or applications).

The example results in Figure 6 describe region IO density after HP 3PAR Adaptive Optimization has run for a while. Both charts are histograms, with the x-axis showing the IO rate density buckets; the busiest regions are to the right and the most idle are to the left. The chart on the left shows on the y-axis the capacity of all the regions in each bucket, while the chart on the right shows on the y-axis the total IOPS/min for the regions in each bucket. As shown in the charts, the SSD tier (tier 0) occupies very little space but absorbs most of the IO accesses, whereas the Nearline tier (tier 2) occupies most of the space but absorbs almost no accesses at all. This is precisely what the user wants.

Figure 6. The two Region IO density reports after adaptive optimization, the first with two tiers and the second with three tiers.


Customer case study

This section describes the real benefits that a customer derived from using HP 3PAR Adaptive Optimization. The customer had a system with 96 300 GB 15k rpm FC drives and 48 1 TB 7.2k rpm NL drives, with 52 physical servers connected and running VMware with more than 250 VMs. The workload was mixed (development and QA, databases, file servers), and the customer needed more space to accommodate many more VMs that were scheduled to be moved onto the array. However, they faced a performance issue: they had difficulty managing their two tiers (FC and NL) in a way that kept the busier workloads on the FC disks. Even though the NL disks had substantially less performance capability (because there were fewer NL disks and they were much slower), they had larger overall capacity. As a result, more workloads were allocated to them, and they tended to be busier while incurring long latencies. The customer considered two options: purchase an additional 96 FC drives, or purchase an additional 48 NL drives plus 16 SSD drives and use HP 3PAR Adaptive Optimization to migrate busy regions onto the SSDs. They chose the latter and were pleased with the results (illustrated in Figure 7).

Figure 7. Improved performance after adaptive optimization

Before HP 3PAR Adaptive Optimization, as shown in the charts on the left, the NL drives (even though there are fewer of them) incur a greater aggregate IOPS load than the FC drives and consequently have very poor latency (~40 ms) compared with the FC drives (~10 ms). After HP 3PAR Adaptive Optimization has executed for a little while, as shown in the charts on the right, the IOPS load on the NL drives has dropped substantially and has been transferred mostly to the SSD drives. HP 3PAR Adaptive Optimization moved ~33 percent of the IOPS workload to the SSD drives even though doing so involved moving only 1 percent of the space. Performance improved in two ways: the 33 percent of the IOPS serviced by the SSD drives got very good latencies (~2 ms), and the latencies for the NL drives also improved (from ~40 ms to ~15 ms). Moreover, the investment in the 16 SSD drives permits the customer to add even more NL drives in the future, because the SSD drives have both space and performance headroom remaining.


Summary

HP 3PAR Adaptive Optimization is a powerful tool for identifying how to configure multiple tiers of storage devices for maximum performance. Its management features can deliver results with minimal effort. As in all matters concerning performance, "your results may vary," but proper focus and use of HP 3PAR Adaptive Optimization can deliver significant improvements in device utilization and total throughput.

Learn more at hp.com/go/3PARStoreServ

© Copyright 2012–2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.

4AA4-0867ENW, March 2013, Rev. 1