Sun Storage 7000 - Configuration Guide


  • 7/30/2019 Sun Storage 7000 - Configuration Guide

    1/40

    CONFIGURING SUN STORAGE 7000

UNIFIED STORAGE SYSTEMS FOR ORACLE DATABASES

Jeffrey T. Wright, Application Integration Engineering

    Sridhar Ranganathan, Application Integration Engineering

    Sun BluePrints Online

    Part No 820-7213-10

    Revision 1.0, 1/23/09


    Sun Microsystems, Inc.

    Table of Contents

Configuring Sun Storage 7000 Unified Storage Systems for ORACLE Databases . . . . 1

    Overview of the Sun Storage 7000 Unified Storage Systems . . . . . . . . . . . . . . . . . . . . 2

    Ideal Application Profiles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

    Configuring Unified Storage Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

    Basic I/O Service Time Measurements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

    Test Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

    128kB Skip-Sequential Read . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

    128kB Skip-Sequential Write . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

    8kB Random Read. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

    8kB 75 Percent Read/25 Percent Write/100 Percent Miss . . . . . . . . . . . . . . . . . . 13

    Estimating System Performance with Cache Hit and Miss I/O . . . . . . . . . . . . . . . . . 16

    Data Access Characteristics Using Oracle Database Workloads . . . . . . . . . . . . . . . . 19

    Environment for Oracle Database Testing. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

    Oracle Database Tests. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

    Oracle Database Testing Results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

    Implementation Guidelines. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

    Choosing a Unified Storage System Model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

    Write-Optimized Flash Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

    Read-Optimized Flash Devices. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

    Access Protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

    Project and Share Configuration for Snapshot and Replication . . . . . . . . . . . . . . 34

    General Database Layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

    Detailed DTrace Analytics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

    Summary. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

    About the Authors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

    References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

    Ordering Sun Documents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

    Accessing Sun Documentation Online . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37


    1 Configuring Sun Storage 7000 Systems for Oracle Databases Sun Microsystems, Inc.

    Configuring Sun Storage 7000 Unified Storage

    Systems for ORACLE Databases

    In planning storage solutions for Oracle database applications, Sun Storage 7000

    Unified Storage Systems offer a range of performance and scalability options, and can

    be configured to meet specific application requirements for capacity, performance, and

    reliability. Because these systems offer such tremendous configuration flexibility,

    storage architects face a number of decisions in designing optimal configurations to

    meet Oracle database storage requirements.

    To help the system planner understand how to accurately match a Unified Storage

    System configuration to specific Oracle data access requirements, this article provides

    results for two sets of I/O-related tests: testing of fundamental system I/O service times

    and testing under a series of Oracle database workloads. This Sun BluePrints article

    reviews configuration criteria and also discusses some important implementation

    guidelines. The article addresses these topics:

    Overview of the Sun Storage 7000 Unified Storage Systems on page 2 highlights

    storage system architecture and features, comparing and contrasting models

    within the product family.

    Basic I/O Service Time Measurements on page 5 provides test results that

    demonstrate general I/O performance characteristics. I/O service time

    characteristics can help storage architects understand how Sun Storage 7000

Unified Storage System configurations can effectively support different types of I/O workloads.

    Estimating System Performance with Cache Hit and Miss I/O on page 16 gives a

method for estimating average I/O response times based on the application's working set and the projected rates of I/O requests that are serviced from DRAM cache, read-optimized flash, and disk media.

    Data Access Characteristics Using Oracle Database Workloads on page 19

    presents test results for Oracle workloads that represent different types of

    Oracle applications.

    Implementation Guidelines on page 32 reviews some configuration decisions

    and criteria for selecting specific models and features according to Oracle

    database workload characteristics.

    This article is intended for storage architects who have a thorough understanding of

    data storage systems and basic skills to interpret test results derived from Oracle and

    operating system performance analysis tools.


    In this article, all test results reflect a single client accessing a single share on a Sun

Storage 7000 Unified Storage System; all response time and throughput metrics

    represent the single client/single share case. In practice, horizontal scaling of multiple

    clients accessing multiple shares over multiple interfaces can yield higher system

    throughput. Because such scaling depends strongly on specific system

    implementations, this article does not specifically address optimizing performance

    using multiple clients and multiple shares. Although this article focuses on testing with

    Oracle database workloads, the results presented here can be extended to other

    database software if the expected database workload is similar in composition.

Overview of the Sun Storage 7000 Unified Storage Systems

    The Sun Storage 7000 Unified Storage Systems are low-cost, fully functional network attached storage (NAS) devices designed around these core technologies:

    A general-purpose server as the NAS head

A general-purpose operating system (the OpenSolaris Operating System) and the ZFS file system

    A large and adaptive two-tiered cache based on DRAM and flash memory devices

    The Sun Storage 7000 Unified Storage Systems do not rely on proprietary operating

    systems, controller hardware, or a particular vendor's disk, but instead incorporate an

    open-source operating system and commodity hardware. Thus customers can leverage

    industry-standard solutions in place of expensive, proprietary disk controllers. ZFS,

    which is included in the OpenSolaris operating system, enables Hybrid Storage Pools

that provide data placement, data protection, and data services such as RAID, error correction, and system management. These capabilities help to insulate applications

    from failures in the underlying storage hardware.

    Since Sun Storage 7000 systems are based on an open storage system architecture, they

    offer significant cost savings while providing enterprise-class data services, good

    scalability, and superior performance. They feature a common, easy-to-use

    management interface, along with a comprehensive analytics environment to help

    isolate and resolve issues. The systems support NFS, CIFS, and iSCSI data access

    protocols, mirrored and parity-based data protection, local point-in-time (PIT) copy,

    remote replication, data checksum, data compression, and data reconstruction.

    To meet varied needs for capacity, reliability, performance, and price, the product

family includes three different models: the Sun Storage 7110, 7210, and 7410

    systems. When appropriate data processing and data storage resources are configured

    in these models, the systems can support nearly arbitrary data access requirements for

    Oracle Online Transaction Processing (OLTP) applications, Decision Support Systems

    (DSS), and mixed OLTP/DSS systems. Table 1 summarizes product features, comparing

    tested models in the product family.


    Table 1. Sun Storage 7000 Series Family Configurations

a. The initial release offers 288 TB in usable capacity. A software upgrade will enable the system's full 576 TB capacity to be utilized.

    b. A single system includes a 2U system plus a 4U expansion array. A clustered system includes two 2U systems plus a 4U expansion array.

    Sun Storage 7000 systems combine general-purpose hardware and software

    components in a design that simplifies hardware integration and enables

    comprehensive features. Of note, a large and sophisticated adaptive resource cache

    automatically identifies and places active data on high-performance DRAM (ARC)1 or on

    read-optimized flash (L2ARC) on Sun Storage 7210 and 7410 systems. This capability

    keeps data access times low for frequently and recently accessed data while storing

    large amounts of infrequently accessed data on less expensive, high-density SATA disk

    media. Because the cache provides exceptionally high cache hit rates for practical

    business applications, it reduces the need to access data from spinning media.

Because of these storage systems' tiered design, configuration decisions differ from

    those for solutions that use traditional disk-based storage architectures. Specifically,

    Sun Storage 7000 system performance depends strongly on configured flash, DRAM,

    and CPU resources instead of the number of physical drives in the storage system.

    Consequently, to accurately predict I/O performance under a particular application

    load, the designer must understand how cache can service the application's active data

    set, and match application-level data access times and data access requirements to

    configurations of cache, disk, network, and CPU resources. This article provides

    guidelines and test results to guide storage architects in the task of configuring these

    systems to achieve optimal storage performance for Oracle database workloads.

    1 The Unified Storage System ARC is based on the work ARC: A Self-Tuning, Low Overhead

    Replacement Cache, Nimrod Megiddo and Dharmendra S. Modha, Proceedings of FAST 03: 2nd

    USENIX Conference on File and Storage Technologies, 2003.

System Model | Maximum Storage Capacity | Space (Rack Units) | NAS Head Processor(s) | NAS Head Main Memory | Write-Optimized SSD | Read-Optimized SSD | Cluster Option

    Sun Storage 7110 system | 2 TB (16 x 2.5" SAS disks) | 2U | Quad-Core AMD Opteron | 8 GB | N | N | N

    Sun Storage 7210 system | 44 TB (48 x 3.5" SATA II disks) | 4U | Two Quad-Core AMD Opteron | Up to 64 GB | Up to 36 GB | N | N

    Sun Storage 7410 system | 576 TB^a (576 x 3.5" SATA II disks) | 6U for single system, or 8U for clustered configuration^b | Up to Four Quad-Core AMD Opteron | Up to 128 GB | Up to 288 GB for cluster configuration | Up to 600 GB | Y


    Ideal Application Profiles

    While nearly any application profile can be supported with an appropriate

    configuration, Sun Storage 7000 Unified Storage Systems are ideal for consolidating

    many small and lightly accessed database servers on a single storage device. In the

case of database consolidation, the system's large cache becomes a shareable resource

    among all small client systems, and the system automatically migrates active data to

    high-performance storage media and avoids over-provisioning media for inactive

    clients. Thus the architecture reduces system cost without compromising operations.

    The following application characteristics also help to determine whether a Sun Storage

    7000 system can accelerate application performance:

    The application performs sufficient I/O operations per transaction to generate a

    distribution of cache hit and cache miss I/O

    The application tolerates a bi-modal or tri-modal distribution of I/O response times

    (i.e., it exhibits a range of I/O response times because of the distribution of hits and

    misses in bi-modal configurations with DRAM and disk media or tri-modal

    configurations with flash, DRAM, and disk media)

    For example, an application that executes one read per transaction will have

    inconsistent transaction response times because there is no way to guarantee whether

    or not the I/O will be serviced from cache. On the other hand, an application that

    executes 100 reads per transaction is more likely to have an I/O distribution that reflects

    the engineered cache hit rate for the system. Likewise, if a user must perform a single

read operation before making a decision, then response times will be inconsistent; but if the user can batch-process data such that multiple I/O operations are performed

    at each step, then response times will tend to be more constant.
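The effect of batching on response-time consistency can be sketched with a short simulation. This is a hypothetical illustration: the 90 percent hit rate and the 0.3 ms/12 ms service times are assumed values, chosen to be in line with the lightly loaded hit and miss measurements presented later in this article.

```python
import random
import statistics

# Assumed values for illustration only: a 90% cache hit rate with
# 0.3 ms hits and 12 ms misses.
HIT_RATE, HIT_MS, MISS_MS = 0.90, 0.3, 12.0

def transaction_time_ms(ios_per_txn, rng):
    """Total service time for one transaction issuing ios_per_txn reads."""
    return sum(HIT_MS if rng.random() < HIT_RATE else MISS_MS
               for _ in range(ios_per_txn))

def coeff_of_variation(ios_per_txn, n_txns=5000, seed=42):
    """Relative spread (stdev/mean) of transaction response times."""
    rng = random.Random(seed)
    times = [transaction_time_ms(ios_per_txn, rng) for _ in range(n_txns)]
    return statistics.stdev(times) / statistics.mean(times)

# A 1-read transaction is either ~0.3 ms or ~12 ms, so its response
# time varies wildly; a 100-read transaction averages out the hit/miss
# mix, and its total time clusters near the expected value.
print(coeff_of_variation(1))    # large relative spread
print(coeff_of_variation(100))  # roughly 10x smaller relative spread
```

The function names and parameters are invented for this sketch; the point is only that the relative spread shrinks as the per-transaction I/O count grows.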

    Configuring Unified Storage Systems

    Sun Storage 7000 Unified Storage Systems allow storage architects to control data

    placement to minimize storage system costs and meet service level agreements (SLA).

    Designing an optimal storage solution starts with quantifying data access requirements

    in terms of protection, access time, access rate, and capacity. Once these requirements

    have been defined, a Unified Storage System architecture can be created to meet these

    requirements along with a pre-determined amount of headroom for growth.

    In the context of configuring storage systems for Oracle database applications, the

    following guidelines can help to estimate an appropriate architecture for arbitrary data

    access requirements:

    Network interface configuration

12,000 IOPS (8kB) or 110 MBPS per 1Gb NIC port

    1000 MBPS maximum per system

    CPU configuration

    One CPU core per 50 MBPS of client-side data traffic

16 CPU cores maximum


    DRAM configuration

    Four to eight times the combined database buffer cache of all active clients

    128GB maximum

Maximum IOPS: 12,000 per 1Gb network interface

    Typical read response time: 0.3 ms

    Read-optimized flash device configuration (on the Sun Storage 7410 system only)

    One device per 16-32 GB of DRAM

    Six devices maximum

    One device per 2500 IOPS; 15,000 IOPS maximum

    Typical read response time: 5 ms

    Write-optimized flash device configuration (on Sun Storage 7210 and 7410

    systems only)

    Eight devices maximum

    Striped option: one per 9,000 IOPS or 100 MBPS of client-side synchronous writes

    Mirrored option: two per 9,000 IOPS or 100 MBPS of client-side

    synchronous writes

    Typical write response time: 0.3 - 0.4 ms

    Storage configuration

    Data protection: mirrored

    Physical drive count for performance: estimated back-end IOPS / 50 IOPS

    per drive

    Physical drive count for capacity: estimated capacity / (0.5 * drive capacity)

    Typical read response time: 45 ms
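As a rough illustration, the guidelines above can be turned into a small sizing calculator. This is a sketch of the stated rules of thumb only; the function and parameter names are invented for illustration, and any real configuration should be validated against measured workloads.

```python
import math

def size_unified_storage(mbps, sync_write_mbps, total_buffer_cache_gb,
                         backend_iops, capacity_tb, drive_tb=1.0):
    """Rough resource estimate from the planning rules of thumb above."""
    return {
        # One 1Gb NIC port per 110 MBPS of client traffic (1000 MBPS system max).
        "nic_ports": math.ceil(min(mbps, 1000) / 110),
        # One CPU core per 50 MBPS of client-side traffic, 16 cores max.
        "cpu_cores": min(math.ceil(mbps / 50), 16),
        # DRAM: four to eight times the combined client buffer caches, 128 GB max
        # (the upper multiplier is used here).
        "dram_gb": min(8 * total_buffer_cache_gb, 128),
        # Mirrored write-optimized flash: two devices per 100 MBPS of
        # client-side synchronous writes, eight devices max.
        "write_ssds": min(2 * math.ceil(sync_write_mbps / 100), 8),
        # Drive count: the larger of the performance rule (50 IOPS per drive)
        # and the mirrored-capacity rule (usable = 0.5 * raw).
        "drives": max(math.ceil(backend_iops / 50),
                      math.ceil(capacity_tb / (0.5 * drive_tb))),
    }

print(size_unified_storage(mbps=200, sync_write_mbps=100,
                           total_buffer_cache_gb=8,
                           backend_iops=2600, capacity_tb=10))
```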

    Basic I/O Service Time Measurements

    Basic I/O service time characteristics can help storage architects understand how a

    storage system can effectively support an arbitrary database workload. For the Sun

    Storage 7000 Unified Storage Systems, it is helpful to analyze four basic workloads:

    128kB Skip-Sequential Read

    The 128kB skip-sequential read workload measures read bandwidth between the

    application server and tested Sun Storage 7000 system. The read could be from ARC,

    L2ARC, or physical drives. 128kB sequential read throughput is useful for planning

    storage systems for Data Warehouse applications, batch processing, and

    backup operations.

    128kB Skip-Sequential Write

    The 128kB skip-sequential write workload uses synchronous I/O to measure write

    bandwidth between the application server and either the back-end drives or the

    write-optimized flash of the tested storage system. 128kB skip-sequential write

    throughput is useful for planning database load, database recovery, and disk

    sort operations.


    8kB Random Read

    8kB random read workloads demonstrate the controller's ability to use storage

    media, including how hard the controller can effectively push the media and how

    much time it takes the controller to process I/O to and from the media. (The

    controller in the Sun Storage 7000 systems is the system server.) 8kB random I/O

    rate and response time data is useful when planning solutions for Online

    Transaction Processing (OLTP) applications.

    8kB 75 Percent Read/25 Percent Write/100 Percent Miss

    This workload demonstrates the controller's ability to handle a mixed workload,

    including how long it takes the controller to process I/O to and from the media

    when there are no cache hits. Writes are executed per synchronous I/O semantics.

    Test Environment

    All tests were conducted under the following client system configuration:

    Access protocol: NFSv3 over 1 Gb Ethernet

    NFS mount options: rw, bg, hard, nointr, rsize=32768, wsize=32768, proto=tcp,

    vers=3, noac, forcedirectio

    Ethernet interface MTU: 1500

    Operating system: Solaris 10 OS x64 update 5

    Hardware platform: Sun Fire V40Z and Sun Fire 4100 Series servers

    Table 2 details the hardware and software configuration for each tested Sun Storage

    7000 Unified Storage System. The tested systems represent only a subset of available

configuration options. For data access requirements beyond those met by the tested configurations, larger or smaller configurations can be deployed.

    Table 2. Tested Storage System Configurations

Option | Sun Storage 7110 System | Sun Storage 7210 System | Sun Storage 7410 System

    CPU core count | 4 | 8 | 8

    Network interface | 1x1Gb Ethernet | 1x1Gb Ethernet | 3x1Gb Ethernet

    DRAM | 8 GB | 64 GB | 16 GB

    Write-optimized flash | 0 | 1 | 2

    Read-optimized flash | 0 | 0 | 1

    Disk drives | 16x10k RPM SAS | 34x7200 RPM SATA | 47x7200 RPM SATA

    Data protection | mirror | mirror | mirror

    Share record size | 8kB | 8kB | 8kB

    Access protocol | NFSv3 | NFSv3 | NFSv3


    128kB Skip-Sequential Read

    As Figure 1 shows, all Sun Storage 7000 Unified Storage System models can service a

    single process executing 128kB skip-sequential reads at 40 MBPS. However, the

    scalability of throughput as the number of active I/O requests increases depends on the

    number of CPUs and the number and type of network interfaces available in the

    storage system. In this test, the Sun Storage 7110 and 7210 systems have equal

    network configurations but the Sun Storage 7210 system has twice the CPU resources.

    Consequently, throughput scales linearly on the Sun Storage 7210 system until the

    network interface saturates, while throughput on the Sun Storage 7110 system shows

    signs of non-linear scaling with only two outstanding I/O requests. In both cases, eight

    outstanding I/O requests in the queue are sufficient to saturate the 1Gb network

    interface. Configured with three network interfaces, the Sun Storage 7410 system

    shows linear throughput scaling up to four outstanding I/O requests and saturation at

    170 MBPS with eight outstanding I/O requests.

    This behavior shows a practical limit for a single client accessing a single file system on

    the Sun Storage 7410 system. Near linear throughput scaling occurs for two clients

    accessing separate file systems through separate network interfaces on the Sun Storage

    7410 system.

    Figure 1. Throughput Versus Queue Length: 128kB Skip-Sequential Read.

    128kB Skip-Sequential Write

    As shown by the test results in Figure 2, Sun Storage 7210 and 7410 systems can service

    a single process executing synchronous 128kB skip-sequential writes at 35 MBPS, while

    the Sun Storage 7110 system can support only 5 MBPS. This difference is due to the Sun


Storage 7110 system's lack of write-optimized flash for synchronous write processing.

    Consequently, the Sun Storage 7110 system must wait for the back-end disk drives to

    flush I/O from the drive cache to the physical spindle before acknowledging that the

    write is complete to the application. Non-synchronous I/O does not exhibit this

    behavior, but in the case of Oracle databases, an assumption of synchronous writes

    should be used for I/O planning.

Throughput limits on the Sun Storage 7110 system suggest suitable use cases in the

    context of Oracle databases. For applications that can tolerate 5-15 MBPS of

    synchronous write processing, the Sun Storage 7110 system provides a low-cost storage

    solution. For applications that require higher throughput levels for synchronous write

    processing, the Sun Storage 7210 and 7410 systems configured with write-optimized

    flash devices are more appropriate building blocks.

    Figure 2. Throughput Versus Queue Length: 128kB Skip-Sequential Write.

    The throughput of Sun Storage 7210 and 7410 systems is limited by network bandwidth

    and the latency of the write-optimized flash device. In the tested configurations, 16

    outstanding I/O requests are sufficient to saturate a single network interface on the

    Sun Storage 7210 system. In the case of the Sun Storage 7410 system, a second network

interface and a second write-optimized flash device are configured, and the saturation

    characteristics reflect the interplay of the network interface limits and the limits of the

    write-optimized flash device.


    From the perspective of capacity planning for database load and index build operations

    on the Sun Storage 7210 and 7410 systems, Figure 2 shows that four outstanding I/O

    requests to a single share over a single 1Gb network link with one solid state write

    processing device can process synchronous writes at 80 MBPS. As with 128kB sequential

    reads, near-linear throughput scaling can be expected for two servers accessing two

    shares over two network interfaces.

    8kB Random Read

    Because of the advanced cache architecture of the Sun Storage 7000 Unified Storage

    Systems, 8kB I/O tests are run in cache hit and cache miss test cases and the combined

    results of these tests are used to estimate average I/O response times for systems

    running with arbitrary cache hit rates. 8kB random read miss and hit testing shows how

    the Sun Storage 7000 Unified Storage Systems can respond to I/O requests for data

    stored on disk and in cache. From the perspective of planning for Oracle database

    applications, test results for 8kB read hit and 8kB read miss workloads can help system

planners to create solutions that meet arbitrary "db file sequential read" wait times

    at arbitrary database I/O rates.

    8kB Read Hit

    The 8kB read hit test results demonstrate the best-case scenario for read performance

    from the storage systems because the cache-friendly workload ensures that all data is

    read from the ARC. Applications that have small active data sets compared to the size of

    the cache can assume cache-hit I/O for the majority of I/O operations. In practice, not

    all application I/O may be serviced from cache, so data access characteristics from the

    storage target may be an average of both hit and miss I/O.
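When hit and miss I/O are mixed, the expected response time is simply the hit-rate-weighted average of the two service times. A minimal sketch, assuming illustrative service times of 0.3 ms for a cache hit and 12 ms for a disk miss:

```python
def avg_response_ms(hit_rate, hit_ms=0.3, miss_ms=12.0):
    """Expected I/O response time for a mix of cache hits and misses.
    Default service times are assumed illustrative values, in line with
    this article's lightly loaded hit and miss figures."""
    return hit_rate * hit_ms + (1.0 - hit_rate) * miss_ms

# Even a modest miss fraction dominates the average: at a 90% hit rate
# the mean is 1.47 ms, roughly five times the pure cache-hit latency.
print(round(avg_response_ms(0.90), 2))  # 1.47
```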

    Figure 3 and Figure 4 show 8kB read hit performance. Figure 3 demonstrates

throughput as a function of queue length to show planners the data access characteristics an individual client can expect from the storage system. Figure 4

    demonstrates I/O response time as a function of I/O rate for purposes of

    general planning.

    As these figures show, there are three important features when planning for read-hit

    I/O workloads on the Sun Storage 7000 Unified Storage Systems:

    Latency in the lightly loaded case is 0.3 ms on all platforms

    Throughput scales linearly until the network bandwidth is saturated

    The network is the most significant factor affecting throughput


    Consequently, a single host or client with a single outstanding I/O can expect 3000 IOPS

    or 27 MBPS for 8kB transfers from the ARC. As the number of outstanding I/O

    operations to the storage target increases, the total throughput increases until the

    network saturates. Beyond this point, as described by simple queuing theory, each I/O

    request observes an increase in I/O response time and a decrease in I/O throughput.
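The queuing behavior described above follows Little's law: the number of outstanding I/O requests equals throughput times response time. The sketch below is idealized (the function names are invented, and it ignores client and network overhead, which is why measured single-threaded rates fall somewhat below the computed ceiling):

```python
def iops_from_queue(queue_depth, response_time_ms):
    """Little's law: concurrency = throughput * response time,
    so throughput = queue_depth / response_time."""
    return queue_depth / (response_time_ms / 1000.0)

def mbps(iops, transfer_kb=8):
    """Convert an I/O rate to bandwidth for a fixed transfer size."""
    return iops * transfer_kb / 1024.0

# One outstanding 8kB read at a 0.3 ms hit latency sustains roughly
# 3,300 IOPS (~26 MBPS), near the single-threaded figures above.
single = iops_from_queue(1, 0.3)
print(round(single), round(mbps(single), 1))
```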

    Figure 3. I/O Access Rate Versus Queue Length: 8kB Read Hit.

    Figure 4. Response Time Versus Workload: 8kB Read Hit.


    8kB Read Miss

    The 8kB read miss test results demonstrate nominal read performance from the storage

    systems because the cache-hostile workload in this test helps to ensure that nearly all

    data is read from spinning media. Applications that have large active data sets

    compared to the size of the cache should assume cache-miss I/O behavior for the

    majority of I/O operations. In practice, some applications may benefit from cache hits

    and data access characteristics will be an average of both hit and miss I/O.

    Figure 5, Figure 6, and Figure 7 show three different ways to analyze 8kB read miss

    performance. Figure 5 demonstrates system throughput as a function of the number of

outstanding I/O requests. For workloads that feature a known number of active I/O

    requests, system planners can use Figure 5 to estimate how much throughput they can

    expect from the system and what the throughput would be for a given client.

    Figure 5. I/O Access Rate Versus Queue Length: 8kB Read Miss.

    Figure 6 shows client-observed I/O response time2 as a function of the total I/O rate per

    drive. This chart is useful for understanding the number of drives required to meet a

    specific I/O response time at an arbitrary level of system throughput. The data is

    especially relevant because of the wide range of drive configuration options available

    on the Sun Storage 7410 system.

2 I/O response time is defined by the time for the I/O system call (e.g., pread64) to complete; this time reflects the asvc_t metric reported by iostat.


    Figure 6. Response Time Versus Workload per Drive: 8kB Read Miss.

    Figure 7 shows I/O service times as a function of the total I/O rate and applies most

    directly to Sun Storage 7110 and 7210 systems.

    Figure 7. Response Time Versus Total Workload: 8kB Read Miss.


    Read miss performance depends on drive speed and drive count. Due to the higher

    drive speed, the Sun Storage 7110 system has lower I/O response times at a fixed I/O

rate per drive and higher I/O throughput for small numbers of active I/O requests. In the

    case of the Sun Storage 7110 system, read-miss response time ranges from 12-30 ms

    over a range of 10-125 IOPS per drive or 80-1500 total IOPS. Beyond 125 IOPS per drive or

    1500 total IOPS, response time increases exponentially and total system throughput

    saturates. For a single active I/O request, a 12 ms response time provides a single-

    threaded throughput of 83 IOPS.

    Compared to the Sun Storage 7110 system, the Sun Storage 7210 and 7410 systems

    have longer response times and lower throughput per I/O request. This effect is due to

    the slower drive speed on these models. However, the Sun Storage 7210 and 7410

    systems scale to a higher total throughput for large numbers of outstanding I/O

    requests. Read miss response times range from 25 ms in lightly loaded cases to 50 ms at

    60 IOPS per drive. Beyond 60 IOPS per drive, the physical drive performance saturates

    and I/O response time increases exponentially.

    For planning purposes, the 44 drives available in Sun Storage 7210 configurations can

    practically support up to 2600 IOPS, while the Sun Storage 7410 system can scale read

    miss performance up to the maximum number of available drives. For example, a

    mirrored 80 drive configuration in a Sun Storage 7410 system can support up to 5800

    read miss IOPS at a 50 ms I/O response time.
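These planning figures follow from Little's law, which relates throughput to the number of outstanding requests and the response time. The helper below is an illustrative sketch (the function name is ours, not from the source) that reproduces the single-threaded 83 IOPS figure quoted above:

```python
def iops(outstanding_requests, response_time_ms):
    """Little's law: throughput = concurrency / response time."""
    return outstanding_requests / (response_time_ms / 1000.0)

# One outstanding request at a 12 ms response time yields ~83 IOPS,
# matching the single-threaded Sun Storage 7110 figure above.
print(round(iops(1, 12)))  # 83
```

The same relation gives the concurrency needed to sustain a target rate: 5800 IOPS at 50 ms implies roughly 290 outstanding requests.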

    8kB 75 Percent Read/25 Percent Write/100 Percent Miss

    The 8kB 75 percent read/25 percent write/100 percent miss workload represents a mix

    of read and synchronous write I/O operations. This workload tests how well the storage

    device's I/O processing algorithm can balance the needs of front-end application I/O

    with the capabilities of the back-end drives and the data protection algorithm. There

    are three important results from this test to consider in the context of planning storage

    solutions for Oracle database applications:

• Front-end read response time versus front-end I/O rate (Figure 8)

• Front-end write response time versus front-end I/O rate (Figure 9)

• Total throughput as a function of queue length (Figure 10)

    Read response time (Figure 8) can be used to set expectations for how a user accessing

    the database will see response times for lookup operations that miss the database

    cache and must be serviced by spinning media. As the workload increases and system

    throughput approaches saturation, a gradual increase in the read response time

    translates to a gradual degradation in the end user's experience. Database operations

    can be dramatically influenced by write response time (Figure 9) because write

    response time directly affects database commit operations. For this reason system


    planners should ensure that workloads are kept within a range that provides consistent

    write response times. For example, Figure 9 shows the tested Sun Storage 7210 system

    can produce consistent write response times up to 2500 IOPS.

    Figure 8 and Figure 9 show read and write response times versus total I/O rate. The Sun

    Storage 7110 system (with 12 active 10,000 RPM SAS drives and no write-optimized

    flash devices) shows shorter read response times in the lightly loaded case with less

    total throughput. In contrast, in the saturated case, the Sun Storage 7210 system

    (configured with 44 active 7,200 RPM SATA drives and one write-optimized flash device)

    scales to more total throughput but with a reduced throughput per active I/O request.

    The performance of the Sun Storage 7410 system (configured with 32 active 7,200 RPM

    SATA drives and two write-optimized flash devices) falls between that of the Sun

    Storage 7110 and 7210 systems. Practical Sun Storage 7410 system configurations can

    be scaled to 80 active 7,200 RPM SATA drives to achieve both increased throughput

    and capacity.

    Figure 8. Read Response Time Versus Workload: 8kB 75% Read/25% Write/

    100% Miss.



    Figure 9. Write Response Time Versus Workload: 8kB 75% Read/25% Write/

    100% Miss.

    Figure 10. I/O Access Rate Versus Queue Length: 8kB 75% Read/25% Write/

    100% Miss.



Estimating System Performance with Cache Hit and Miss I/O

This section presents a simple model to predict average I/O response time for Sun

    Storage 7210 and 7410 systems based on the percentage of I/O requests serviced from

    DRAM (primary ARC), read-optimized flash (secondary L2ARC), or spinning disk. The

    storage architect can use this information to design solutions to meet arbitrary

    performance requirements using the following process:

    1. Quantify the access rate and average access time requirements for the active

    working set. (The I/O access rate is typically specified in terms of IOPS or

    throughput (MBPS), and the access time defined in terms of read/write

    response times.)

    2. Estimate the size of the active working set for the targeted applications.

    3. Identify the cache hit rate that supports access time requirements for the active

    working set.

    4. Configure DRAM and flash resources to help maintain the appropriate cache

    hit rate.

    5. Configure CPU and network resources to meet the data access rate requirements.

    6. Configure physical disk drives to meet the total capacity and cache miss

    performance requirements.

In a Sun Storage 7210 or 7410 system configured without read-optimized flash devices, the average I/O response time reflects the time for I/O operations to be serviced from cache and spinning disk. Thus a simple formula predicts the average I/O response time Tave as a function of cache hit rate x:

Tave(x) = Tcache * x + Tdisk * (1 - x)

Tave is the sum of the time it takes to service an I/O from cache (Tcache) times the cache hit probability, plus the time it takes to service an I/O from disk (Tdisk) times the miss probability. Tcache and Tdisk can be estimated from Figure 4 and Figure 6 by observing the 100% hit and 100% miss cases. From the data, Tcache equals 0.3 ms and Tdisk equals 45 ms. These results assume that there is enough network bandwidth to service I/O operations in the event of a hit and enough disk bandwidth to service I/O operations in the event of a miss. To plan storage solutions based on these heuristics, the system should run below 10,000 IOPS per 1Gb network interface and 50 IOPS per physical disk drive. Under these conditions the following formula estimates average I/O response time (in milliseconds):

Tave(x) = 0.3 * x + 45 * (1 - x)

Table 3 shows estimated response times based on the ARC cache hit rate.
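As a quick check, the single-tier model can be evaluated in a few lines of code. This sketch uses the measured Tcache and Tdisk values; the function name is ours:

```python
T_CACHE_MS = 0.3   # 100% ARC hit service time (from Figure 4)
T_DISK_MS = 45.0   # 100% miss, spinning-disk service time (from Figure 6)

def avg_response_ms(hit_rate):
    """Average I/O response time (ms) for an ARC hit rate in [0.0, 1.0]."""
    return T_CACHE_MS * hit_rate + T_DISK_MS * (1.0 - hit_rate)

# A 90% cache hit rate yields roughly the 4.8 ms shown in Table 3.
print(round(avg_response_ms(0.90), 1))  # 4.8
```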


Table 3. Estimated Response Time Versus ARC Hit Rate

    ARC Hit Rate (%)    Average Read Response Time (ms)
    100                 0.3
    95                  2.5
    90                  4.8
    80                  9
    70                  14
    60                  18
    50                  22
    40                  27
    30                  32
    20                  36
    10                  41
    0                   45

For Sun Storage 7410 system configurations with read-optimized flash devices, the average read response time can be estimated by adding a Tflash parameter that reflects the response time of the read-optimized flash device, and by including the probability of hits in the flash cache. In this model, the primary cache hit rate is x, the secondary flash cache hit rate is y, and a miss I/O operation occurs after a miss in the secondary cache:

Tave(x,y) = Tcache * x + Tflash * y + Tdisk * (1 - x - y)

where

x + y <= 1

Tflash for the Sun Storage 7410 system has been measured as 5 ms when running flash devices at 2,500 IOPS. Assuming less than 10,000 IOPS per 1Gb network interface and less than 50 IOPS per physical spindle, the following formula estimates the average I/O response time Tave as a function of the primary and secondary cache hit rates x and y:

Tave(x,y) = 0.3 * x + 5 * y + 45 * (1 - x - y)

Table 4 lists estimated response times based on ARC and L2ARC cache hit rates.


Table 4. Estimated Response Times Versus ARC and L2ARC Hit Rates

    ARC Hit Rate (%)    L2ARC Hit Rate (%) -> Read Response Time (ms)
    0      100 -> 5,  90 -> 9,  80 -> 13,  70 -> 17,  60 -> 21,  50 -> 25,  40 -> 29,  30 -> 33,  20 -> 37,  10 -> 41,  0 -> 45
    10     90 -> 4.5,  80 -> 8.5,  70 -> 13,  60 -> 17,  50 -> 21,  40 -> 25,  30 -> 29,  20 -> 33,  10 -> 37,  0 -> 41
    20     80 -> 4.1,  70 -> 8.1,  60 -> 12,  50 -> 16,  40 -> 20,  30 -> 24,  20 -> 28,  10 -> 32,  0 -> 36
    30     70 -> 3.6,  60 -> 7.6,  50 -> 12,  40 -> 16,  30 -> 20,  20 -> 24,  10 -> 28,  0 -> 32
    40     60 -> 3.1,  50 -> 7.1,  40 -> 11,  30 -> 15,  20 -> 19,  10 -> 23,  0 -> 27
    50     50 -> 2.7,  40 -> 6.7,  30 -> 11,  20 -> 15,  10 -> 19,  0 -> 23
    60     40 -> 2.2,  30 -> 6.2,  20 -> 10,  10 -> 14,  0 -> 18
    70     30 -> 1.7,  20 -> 5.7,  10 -> 9.7,  0 -> 14
    80     20 -> 1.2,  10 -> 5.2,  0 -> 9.2
    90     10 -> 0.8,  0 -> 4.8
    100    0 -> 0.3
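The two-tier estimates in Table 4 are a direct translation of the model, with the disk term weighted by the residual miss fraction 1 - x - y (which reproduces the tabulated values). A sketch, with names of our choosing:

```python
def avg_response_two_tier_ms(arc_hit, l2arc_hit,
                             t_cache=0.3, t_flash=5.0, t_disk=45.0):
    """Average read response time (ms) for ARC and L2ARC hit rates in [0.0, 1.0].

    An I/O is served from DRAM with probability arc_hit, from
    read-optimized flash with probability l2arc_hit, and from
    spinning disk otherwise (arc_hit + l2arc_hit <= 1).
    """
    assert arc_hit + l2arc_hit <= 1.0
    miss = 1.0 - arc_hit - l2arc_hit
    return t_cache * arc_hit + t_flash * l2arc_hit + t_disk * miss

# 70% ARC hits plus 30% L2ARC hits leave no disk I/O: ~1.7 ms, as in Table 4.
print(round(avg_response_two_tier_ms(0.70, 0.30), 1))  # 1.7
```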


Data Access Characteristics Using Oracle Database Workloads

The previous section estimates response times based on rates at which I/O requests are

    serviced from the primary ARC, secondary L2ARC, or disk. These estimates can help to

    provide a starting point for planning solutions that meet specific data access rate and

    access time requirements for Oracle database applications. To help storage architects to

    more accurately optimize Sun Storage 7000 system configurations for Oracle

    transaction processing systems, this section presents the results of testing against

    actual Oracle database systems. By analyzing these results, the architect can better

    understand how to apply the I/O service times and response time calculations in the

    previous sections to predict complex database application behavior.

    Storage performance was measured for five types of database activity: database load,

    index build, OLTP processing, parallel table scan, and mixed OLTP/parallel table scan.

The results help to set appropriate expectations for practical system response time and throughput, and demonstrate and quantify variations in performance that are often

    inherent with complex Oracle database workloads.

    The test results represent performance of a single Oracle instance running from a single

    share on the storage target. In practice, a Unified Storage System can support an

    increased level of throughput with minimal increases in response times when multiple

    servers access multiple shares.

    Environment for Oracle Database Testing

    In the Oracle test environment, little tuning was done to the application, the database

    instance, or the operating system. Thus the results represent conservative estimates for

    system performance. In practice, careful system tuning can produce increases in

    throughput and/or decreases in response time. However, system tuning

    recommendations are generally application-specific and are beyond the scope of

    this article.

    Database Server Configuration

    Figure 11 depicts the test configuration. The database servers used were either dual-

core dual-socket 3.1 GHz processors (Sun Fire 4100 servers) or dual-core 8-socket 2.4 GHz

    processors (Sun Fire V40z servers). Both servers were over-configured with CPU

    resources compared to storage subsystem resources, so the processor configuration did

    not significantly affect system throughput. Each system had 16-32 GB of RAM with 4 GB

    reserved for the database buffer cache. Client and storage connections were made over


    alternate 1Gb Ethernet interfaces. With the exception of parallel table scan tests, the

    network interfaces had more bandwidth than that required by the application, so

    performance results reflect storage subsystem response time for specific I/O workloads.

    Figure 11. Test Configuration for Oracle Database Testing.

[Figure 11 diagram: a database client (Sun Fire X4600 server) drives a database server (Sun Fire 4100 server or Sun Fire V40z server), which connects to the system under test: a Sun Storage 7110, 7210, or 7410 system.]

The test operating system was Solaris 10 OS Update 5. The default MTU (Maximum Transmission Unit) of 1500 was used for all network interfaces. The kernel was tuned per standard Oracle recommendations for a 12 GB SGA and up to 1 MB I/O transfers as follows:

set noexec_user_stack=1
set shmsys:shminfo_shmmax=12704901885

The TCP stack was tuned with ndd as follows:

ndd -set /dev/tcp tcp_conn_req_max_q 16384
ndd -set /dev/tcp tcp_conn_req_max_q0 16384
ndd -set /dev/tcp tcp_xmit_hiwat 131072
ndd -set /dev/tcp tcp_recv_hiwat 131072
ndd -set /dev/tcp tcp_naglim_def 1

Database Instance Configuration

The tested Oracle database was configured to emulate a small-scale system that would be a good candidate for consolidation on a Unified Storage System. At the database level, no specific cache tuning (such as reserving a keep pool for data that is known to benefit from caching) was implemented. A single file system mount point was used for all database data files. Oracle Managed Files (OMF) was used for file management, a separate share was created on the same project as the database data for the database flash recovery area, and a separate share was created for the archived redo logs. The Oracle


    database ran in archive log mode, and copies of the online redo log and control file

    were multiplexed between the db_create_file_dest and the db_recovery_file_dest.

The server parameter file, init.ora, contained the following entries:

log_archive_dest_1='LOCATION=/fishworks/oraarch'
log_archive_format=%t_%s_%r.dbf
db_block_size=8192
db_cache_size=4294967296
db_file_multiblock_read_count=16
cursor_sharing=force
open_cursors=300
db_domain=""
db_name=bench
background_dump_dest=/export/home/oracle/admin/bench/bdump
core_dump_dest=/export/home/oracle/admin/bench/cdump
user_dump_dest=/export/home/oracle/admin/bench/udump
db_create_file_dest=/fishworks/oradata
db_files=1024
db_recovery_file_dest=/fishworks/orafra
db_recovery_file_dest_size=214748364800
job_queue_processes=10
compatible=10.2.0.1.0
java_pool_size=0
large_pool_size=0
shared_pool_size=536870912
processes=1024
sessions=1131
log_buffer=104857600
audit_file_dest=/export/home/oracle/admin/bench/adump
remote_login_passwordfile=EXCLUSIVE
pga_aggregate_target=1707081728
undo_management=AUTO
undo_tablespace=UNDOTBS1

Database Application and Tables

Each storage architecture was tested using an order-entry system with the following specifications:

• Capacity: 200 GB production data

• Transaction profile: 30 reads and 10 writes per transaction

• Peak workload: 180 transactions per second

• Response time: new-order transaction completes in < 3.0 seconds

The test application was based on a classical order entry system. The application had nine tables (item, warehouse, stock, district, customer, history, orders, order line, and new order), and executed new-order, stock-level, payment, and order-status transactions. To prevent this implementation from creating an unrealistically simple workload for the storage subsystem, no special partitioning techniques were used to improve query execution.


    Database Test Approach

    Each storage subsystem configuration was characterized by its ability to support five

    use cases: database load, index build, OLTP response time as a function of workload,

    parallel table scan, and mixed OLTP/scan. The following accounting tools were used to

    analyze each use case:

    Application: wall clock time, new order rate, and new order response time

    Operating system: vmstat and iostat (sar -d and sar -u)

    Database: Oracle STATSPACK

    Oracle Database Tests

    Database Load Test

    The database load test subjects the storage system to deep queues (50-150 active I/O

    requests) of large-block synchronous write I/O operations. Writes to the online redo log

    pose the most important storage-based bottleneck in this test. Consequently, system

    throughput reflects the total database I/O throughput based on the throughput

    available to multiplexed copies of the online redo log. From the perspective of the

    Oracle STATSPACK report, log file sync is the most important wait event. This event is

influenced by write response time and the operating system's scheduling of Oracle's

    log writer process (LGWR), so the test results reflect the combined limits of the

    database server and the storage subsystem target.

The database load test is executed as follows:

1. Begin without tables, indexes, or constraints defined.

2. Build tables with pre-allocated extents and all table constraints disabled.

    3. Start iostat, vmstat, and create a STATSPACK snapshot; start wall clock.

    4. Use multiple (40) processes to simultaneously load each table.

    5. When all processes complete, stop vmstat, iostat, and wall clock, and take a

    STATSPACK snapshot.

    6. Create STATSPACK report; record vmstat, iostat, and wall clock results.

    Index Build Test

    The index build test performs full table scan and disk sort operations. This test drives

    large-block sequential I/O to the storage subsystem and shows how efficiently a storage

    system processes sequential transfers. Storage subsystem response time is a critical

    factor in the execution time for this test. The index build test is executed as follows:

    1. Start iostat, vmstat, and create a STATSPACK snapshot; start wall clock.

    2. Use multiple processes to simultaneously build indexes.

    3. When all processes complete, stop vmstat, iostat, and clock, and take a

    STATSPACK snapshot.

    4. Create STATSPACK report; record vmstat, iostat, and wall clock results.


    OLTP Workload Versus Response Time Test

    The OLTP workload versus response time test demonstrates the storage subsystem's

    ability to service the random, high-concurrency, and correlated I/O workload associated

    with online transaction processing systems. Since the data set is large with respect to

    the host and storage array cache sizes, this test also quantifies media access

    performance and how efficiently the storage subsystem uses the physical media. The

    OLTP workload versus response time test is executed as follows:

    1. Start iostat, vmstat, and create a STATSPACK snapshot.

2. Connect 100 clients to the database with a specified think time to control the workload (think time ranges from 0 to 12 seconds).

    3. Measure transaction rate and transaction response time over a specified interval

    (10 minutes) to determine an average rate and response time.

    4. Stop iostat and vmstat, and create a STATSPACK snapshot.

    5. Generate STATSPACK report.

    Parallel Table Scan Test

The parallel table scan issues a parallel query hint for a table scan operation against a stock table in the order-entry system as follows:

SELECT /*+PARALLEL( stock 16 )*/ COUNT( s_quantity ), SUM( s_quantity )
INTO count, sum
FROM stock;

The count of the records is divided by the total execution time to calculate throughput in terms of rows per second. The process to execute the parallel table scan is as follows:

1. Start iostat, vmstat, and create a STATSPACK snapshot; start wall clock.

2. Issue the select statement to measure the total number of rows and the sum of a non-indexed column.

3. When the scan completes, stop vmstat, iostat, and wall clock, and take a STATSPACK snapshot.

4. Create STATSPACK report; record vmstat, iostat, and calculated rows per second.

Oracle Database Testing Results

Database Load

For a single database running from a single file system, a single 1Gb network interface and a single write-optimized flash device provide sufficient bandwidth for write processing. Table 5 lists the results from measuring Sun Storage 7210 and 7410 systems with one and two write-optimized flash devices on 8kB and 64kB shares.


Table 5. Database Load Results

    Metric                              Result                           Measured by
    Average transaction rate            36-43 TPS                        STATSPACK
    Average redo generation rate        14-16 MBPS                       STATSPACK
    Log file parallel write wait time   140-160 ms                       STATSPACK
    Average physical writes per second  1800-2100                        STATSPACK
    Peak write throughput               3900-4400 IOPS and 75-89 MBPS    iostat
    Peak read throughput                300-700 IOPS and 10-22 MBPS      iostat
    Queue length and service time       80-110 active and 22-29 ms       iostat

The STATSPACK information reflects the average throughput over the entire load process, which includes small and large numbers of clients pushing information into the database. Consequently, the STATSPACK metrics do not accurately capture peak system load. The iostat results reflect the combined work of writing the redo log members and reading and writing the archived redo log. The iostat results cover a short time period compared to the load process and consequently show a more accurate indication of peak throughput. Write response time (defined as the active queue length divided by the average service time) ranged from 0.20-0.36 ms per 32kB I/O.

The Unified Storage Systems' cache architecture is designed and optimized to provide a large amount of throughput for a large number of clients. As a result, buffering and flushing data can result in decreased throughput and increased response time for individual clients. In practice, periodic flushing on a 5-30 second time scale can cause increased response time and reduced throughput for the client-side application during the flush (which typically lasts 1-3 seconds). The results given here reflect average throughput with this activity occurring. When quantifying performance requirements in service level agreements (SLAs), storage architects should take into account that a batch load process can incur increased latency because of more frequent data flushes.

Index Build

The index build test subjects the storage subsystem target to a large-block mixed read/write workload typical of disk sort operations. In this test, the data set is much larger than the buffer cache on the database server, so sort operations must be paged to temporary space allocated on the storage target. Ten different indexes are built in parallel, and each index is built with the PARALLEL( DEGREE 8 ) clause. The top Oracle STATSPACK wait events reported during the index build test are typically a direct path read or db file scattered read.


For a single database running from a single file system, a single 1Gb network interface and a single write-optimized flash device provide sufficient bandwidth for write processing. Since this workload is a scan and sort operation, the presence of read-cache devices does not strongly affect system throughput because the throughput of the spinning media is sufficient to saturate other system resources. For example, configurations with 16GB of DRAM perform comparably to systems with 64GB of DRAM, and systems with read-optimized flash devices perform comparably to systems without read-optimized flash devices. Share record size, however, strongly affects system throughput: index build throughput for systems with a 64kB record size is 40-75% greater than for systems with an 8kB record size. Table 6 lists results measured on Sun Storage 7210 and 7410 systems using 64kB and 8kB record sizes.

Table 6. Index Build Results

    Metric                                                       Result                             Measured by
    Average blocks per read                                      16                                 STATSPACK
    Average tablespace reads per second with 8kB record size     310-330                            STATSPACK
    Average tablespace reads per second with 64kB record size    460-540                            STATSPACK
    Tablespace read response time with 8kB record size           29-51 ms                           STATSPACK
    Tablespace read response time with 64kB record size          17-29 ms                           STATSPACK
    Peak write throughput with 8kB record size                   450-510 IOPS and 13-16 MBPS        iostat
    Peak write throughput with 64kB record size                  350-600 IOPS and 11-17 MBPS        iostat
    Peak read throughput with 8kB record size                    2700-3300 IOPS and 86-105 MBPS     iostat
    Peak read throughput with 64kB record size                   3200-3400 IOPS and 100-110 MBPS    iostat

The STATSPACK information represents the average throughput over the entire build process, which includes small and large numbers of clients pushing information into the database. Consequently, the STATSPACK information does not accurately capture peak load on the system. The iostat results reflect the combined work of reading from and writing to the storage target, and since the iostat results cover a short time period compared to the load process, they show a more accurate indication of peak throughput. I/O response time (defined as the active queue length divided by the average service time) was 6.9 ms per I/O in the case of the 8kB record size, and 3.7-5.1 ms in the case of the 64kB record size.


    OLTP Processing

Characterized by 8kB I/O operations to small isolated working sets that evolve with time, OLTP processing highlights the effectiveness of the Sun Storage 7000 systems' cache architecture. In cases where the active working data set is cached in DRAM or

    read-optimized flash devices, access time to that data is extremely fast compared to

    traditional storage systems. In practical cases, a subset of the active working set

    typically resides in cache and the remainder of the data is served from disk media.

    Consequently, the response time for a given I/O request is either bi-modal (in the case

    of a system configured with DRAM and spinning media), or tri-modal (in the case of a

    system configured with read-optimized flash in addition to DRAM and spinning media).

    The I/O distribution, however, is not easily captured in an average measurement using

    application-level or operating-system level tools such as STATSPACK or iostat, so

average measurements must be interpreted in the context of the underlying distribution. From a planning perspective, this means that the storage architect should

    define SLAs that provide an appropriate allocation for both cache hit and cache

    miss I/O.

    The test environment consists of a 200GB order entry system, and the Oracle instance

    maintains a 4GB buffer cache that produces a 90-95% cache hit ratio. As reported by the

    STATSPACK Load Profile, the physical I/O for the transaction profile is

    10-30 reads per transaction and 7-15 writes per transaction. The lightly loaded cases

    typically have fewer reads and more writes per transaction, and the heavier loaded

    cases have more reads and fewer writes per transaction.

    The most significant factors affecting storage system performance are the ratio of cache

    hit to cache miss I/O and the share record size. For systems where more I/O is served

    from cache, I/O response time is shorter and the variation in average I/O response time

    is less than in systems where fewer I/O requests are served from cache. To simplify

    analysis and discussion, reads from either the primary ARC cache stored in DRAM or the

    secondary L2ARC cache stored on read-optimized flash are treated as a cache hit. Misses

    in the L2ARC are treated as a cache miss. This process overstates I/O response time as a

    function of cache hit for systems with low hit rates in the primary ARC cache, but it is

    accurate for systems with higher hit rates in the primary ARC compared to the

secondary L2ARC. To capture the impact of caching, cache statistics are gathered through DTrace Analytics capabilities that are built into the Sun Storage 7000 Unified

    Storage Systems.


    Figure 12 shows database-reported read response times (db file sequential read)

    versus the cache hit rate for 8kB and 64kB record sizes on 1-tier and 2-tier systems

    running with a cache hit range of 50-100 percent. As Figure 12 shows, high cache hit

    rates result in shorter response times and less response time variation, and lower cache

    hit rates exhibit longer response times and greater response time variation.

    Figure 12. Database Read Response Time Versus Cache Hit Rate.

    Note that the I/O response time in the 100% cache hit case is greater than predictions

    made in the earlier section (see Estimating System Performance with Cache Hit and

Miss I/O on page 16). This is a consequence of the model's simplicity compared to the

    complexity of the actual database workload and the possibility of a 100% cache hit

    condition coming from a combination of DRAM and read-optimized flash resources.

    For planning purposes, the results suggest a practical limit of 1.25 ms for 100% cache

hit systems, a 5 ms response time for 90% cache hit systems, and an increase of 0.34 ms for each cache hit percentage point under 90%. The linear regression predicts a 35 ms

    response time in the 0% cache hit case. This is a shorter response time than that

    predicted from the model presented earlier and is a consequence of the disk drives

    being lightly loaded (e.g.,
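For rough planning, the piecewise rule above can be captured in a short helper. This is a sketch (the function name is ours); the linear interpolation between the 90% and 100% points is our assumption, not stated in the measurements:

```python
def estimated_read_response_ms(cache_hit_pct):
    """Planning heuristic from the measured OLTP read response times.

    100% hits: ~1.25 ms; 90% hits: ~5 ms; below 90%, add 0.34 ms per
    percentage point. Values between 90 and 100 are linearly
    interpolated (an assumption, not from the measurements).
    """
    if cache_hit_pct >= 100:
        return 1.25
    if cache_hit_pct >= 90:
        return 5.0 - (cache_hit_pct - 90) * (5.0 - 1.25) / 10.0
    return 5.0 + 0.34 * (90 - cache_hit_pct)

# At a 0% cache hit rate this gives roughly the 35 ms the regression predicts.
print(estimated_read_response_ms(0))
```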


    data set size is the same in both cases. The results show that a larger primary cache can

    support more cache-hit I/O under light workloads, and that a small primary cache with

    a larger secondary cache is more effective under heavy workload conditions.

    This effect is driven by two factors:

• The working set size becomes larger as more clients drive the database harder, and the smaller total cache size in a single-tier system cannot maintain the entire working set in cache

• The time to warm up the second-tier cache becomes shorter as the total workload on the system increases

    From a planning perspective, lightly loaded systems benefit more from a large primary

    cache tier and no secondary cache, and heavily loaded systems benefit from a large

    secondary cache and a smaller primary cache.

Figure 13. Cache Hit Rate as a Function of Database I/O Rate (rps + wps + 2*tps) for the 1-Tier (64GB) and 2-Tier (16GB/100GB) Systems.
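The I/O rate in these comparisons is a composite metric, rps + wps + 2*tps (physical reads per second, physical writes per second, and twice the transaction rate, the weighting taken from the figure's axis label). As a small illustration:

```python
def database_io_rate(rps: float, wps: float, tps: float) -> float:
    """Composite database I/O rate used on the x-axis of Figure 13:
    physical reads/s + physical writes/s + 2 * transactions/s."""
    return rps + wps + 2 * tps

print(database_io_rate(3000, 1000, 150))  # 4300
```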

    Write response time characteristics as a function of workload play an important role in

    database performance. The OLTP system under test has more read requests than write

    requests per transaction (30 reads per 10 writes), so log write response time to the

    write-optimized flash device is not the most significant factor in system performance.
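To see why log write latency is secondary here, consider a hedged back-of-the-envelope model in which only the reads and the commit's log write contribute to foreground transaction time. Treating datafile writes as deferred by the database writer is an assumption of this sketch, not a measurement from the tests:

```python
def foreground_io_ms(reads_per_txn: int, read_rt_ms: float,
                     log_write_rt_ms: float) -> float:
    """Serialized-I/O upper bound on per-transaction foreground time (ms):
    all reads plus one commit-time log write."""
    return reads_per_txn * read_rt_ms + log_write_rt_ms

# 30 cached reads at 1.25 ms each dominate a 1 ms log write
print(foreground_io_ms(30, 1.25, 1.0))  # 38.5
```

Of the roughly 38.5 ms of foreground I/O per transaction, 37.5 ms comes from reads, which is why read response time dominates this OLTP profile.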

    Notwithstanding, write response time can play a significant role in performance for

    lightly threaded applications, or for applications that perform few or no additional I/O


    operations per transaction. In these cases, it is useful to understand log write response

    time as a function of the database transaction rate in the context of small

    OLTP transactions.

    Figure 14 shows the variation in log write response time versus the database-reported

    transaction rate for systems running with one or two write-optimized flash devices and

    a single 1Gb network interface. In lightly loaded cases, log writes are serviced in 1 ms,

and response time stays below 4 ms for transaction rates below 100 transactions per

    second (TPS). Beyond 100 TPS, larger variations in log write response times require

    more careful planning and tuning to optimize the system for the more

    demanding workload.

Figure 14. Log Write Response Time (log file parallel write, ms) Versus Database Transaction Rate (TPS).

    Figure 15 shows how 8kB and 64kB record sizes affect OLTP throughput and transaction

    response time for an Oracle database formatted with an 8kB database block size. The

    results show that the 8kB record size delivers 80% more throughput at a 300 ms

    transaction response time, with a 60% reduction in response time at lower throughput

    levels. For systems running pure OLTP workloads, the 8kB record size is the most

    appropriate implementation. However, for systems running heterogeneous workloads

    with concurrent OLTP and scan operations, the 64kB record size offers a compelling

    trade-off to balance available throughput for both workloads.


Figure 15. Application Transaction Response Time (New Order Response Time, sec) Versus Application Transaction Rate (New Orders per Minute) for 8kB and 64kB Record Sizes.

    Parallel Table Scan

    Parallel table scan operations provide guidelines to help plan deployments that support

    Decision Support Systems (DSS). In combination with basic I/O service times and index

    build data described previously, the bandwidth measurements from the parallel table

    scan tests provide the storage architect with guidelines for the number of parallel I/O

    streams necessary to extract full system throughput, as well as estimating how much

    throughput can be expected.

    As with other tests, this test case reflects a single database read from a single share

    over a single 1Gb network interface. The table scan operates on the 60 GB stock table of

    an order entry system. The average row size in the table is 300 bytes, and the table has

    200 million rows. The table is subject to updates but not appends, so the scan reflects

    reading of a fixed working set size. The operating system tool iostat measures the scan

I/O rate for the NFS share. The database-reported physical read rate is measured with STATSPACK, and the user-level application records the number of rows per

    second scanned.

    Per I/O latency and network interface bandwidth are the critical factors affecting

    performance for isolated parallel table scans. In this workload, for both 8kB and 64kB

    shares, throughput for the scan was as follows:

• Application throughput: 270k-300k rows/second

• Database throughput: 9900-11000 reads/second
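These figures are mutually consistent with the single 1Gb network interface being the limiting factor, as a quick check using the table geometry described above shows:

```python
rows = 200_000_000          # rows in the 60 GB stock table
row_bytes = 300             # average row size
rows_per_sec = 285_000      # midpoint of the 270k-300k rows/s range

scan_mb_per_sec = rows_per_sec * row_bytes / 1e6
scan_minutes = rows / rows_per_sec / 60

print(round(scan_mb_per_sec, 1))  # 85.5 -- near 1Gb Ethernet wire speed
print(round(scan_minutes, 1))     # 11.7 -- minutes for a full table scan
```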


    cache, OLTP transaction response time increases 100% during the scan. These data

    show that the larger ARC cache provides greater benefit for mixed processing than a

    smaller ARC cache and a larger L2ARC cache.

    The size of the primary ARC cache and secondary L2ARC cache weakly affects

throughput for the table scan operation; however, the share record size strongly

    affects throughput for the table scan operation that runs concurrently with OLTP

    processing. In the case of the 64kB share record size, table scan throughput drops 25%

    when OLTP processing is added to the scan operation. In the other case with an 8kB

    record size, table scan throughput drops 50% when OLTP processing is added to the

    table scan operation. This data shows that significant performance gains can be made

    for scan throughput if the system is designed to support OLTP processing with a 64kB

    share record size.
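The record-size findings above can be condensed into a rule of thumb. This is a sketch of the guidance in this section, not a product default:

```python
def suggested_record_size(workload: str) -> str:
    """Record-size rule of thumb from the OLTP and mixed-workload results:
    pure OLTP favors 8 kB shares; mixed OLTP + scan workloads favor 64 kB,
    where concurrent scans lose ~25% of throughput instead of ~50%."""
    return "8k" if workload == "oltp" else "64k"

print(suggested_record_size("oltp"))   # 8k
print(suggested_record_size("mixed"))  # 64k
```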

Implementation Guidelines

This section provides an overview of implementation details that should be considered

    when deploying Oracle databases on Sun Storage 7000 Unified Storage Systems.

    Important topics include the choice of Unified Storage System model, data protection

    on the storage pool, write-optimized and read-optimized flash devices, access protocol,

    and project and share configuration when using replication and snapshot.

    Choosing a Unified Storage System Model

    A critical configuration decision involves selecting a particular model in the family of

Sun Storage 7000 Unified Storage Systems: the Sun Storage 7110, 7210, or 7410 systems. Of the three models, the Sun Storage 7110 and 7210 systems cannot be

    clustered, but the Sun Storage 7410 system can be deployed in a clustered

    configuration. For this reason, the Sun Storage 7410 platform is mandatory for

    applications that require a highly available NAS head. For applications that can tolerate

    outages, the Sun Storage 7110 and 7210 can be used instead. The Sun Storage 7110

    system, however, does not support write-optimized flash devices. Consequently,

    synchronous writes from the database require back-end media access to be complete

    before the write can be acknowledged to the database instance. While this does not

impede read throughput, it limits write performance, with synchronous write service times for typical database workloads on the order of 15 ms. For data access requirements that fit in this range, the Sun Storage 7110 system offers a small and low-cost platform. For applications that require solid-state write acceleration, the Sun Storage 7210 system is a more appropriate choice.

    Write-Optimized Flash Devices

    Write-optimized flash devices provide solid-state acceleration for synchronous write

    requests, and they are available on both the Sun Storage 7210 and 7410 platforms.

    Write-optimized flash devices may be deployed in two configurations: striped and


    mirrored. In the striped configuration, data on the flash device is protected by a second

    copy in DRAM on the NAS head. In the event of a loss of the write-optimized flash

    device, there is a short time (less than 60 seconds) where writes that have been

    acknowledged to the host are not stored on persistent media. Following the failure, all

    writes are acknowledged after being written to spinning media, and the system will run

    with reduced synchronous write throughput and increased synchronous write response

    times. For applications with requirements met by these availability and performance

    features, the striped configuration of the write-optimized flash device offers a low-cost

    and high-performance solution. For applications that require redundant copies of data

    on alternate persistent storage, or applications that cannot tolerate the performance

degradation associated with completely losing all write-optimized flash devices, a

    mirrored configuration of the write-optimized flash device is appropriate.

Read-Optimized Flash Devices

Read-optimized flash devices are available on the Sun Storage 7410 platform and offer a

    low-cost extension of the storage device's cache resources. Slower in speed and larger

    in size than comparably priced DRAM, read-optimized flash devices are a compelling

    storage technology since they afford a compromise between the high cost of DRAM and

    the long access time of spinning media. Technical optimizations of read-optimized flash

restrict share record sizes to less than 16kB; therefore, systems using read-optimized

    flash should be designed around the performance of 8kB-16kB record sizes. Since the

    read-optimized flash resides in the NAS head and is not replicated to an alternate NAS

    head, the loss of a NAS head in a cluster causes the loss of any performance gain due to

    the read-optimized flash devices. When defining service level agreements, the storage

    architect should take this point into consideration.

    Access Protocol

    The Unified Storage System supports NFS, CIFS, and iSCSI access protocols, and any of

    these protocols may be used to support an Oracle database. Careful optimizations in

    the NFS stack, however, result in the highest levels of performance over NFSv3 protocol.

    For applications that require CIFS or iSCSI protocol, system throughput may drop or

    system response time may increase compared to NFS depending on the specific

    workload. Furthermore, the iSCSI implementation provides a volatile cache option that

is disabled by default. This option should be left disabled for general-purpose applications, clustered applications, and applications that do not manage synchronous

    writes through the SCSI SYNCHRONIZE CACHE command.


    Project and Share Configuration for Snapshot and Replication

    The Sun Storage 7000 Unified Storage Systems provide a wide range of configurations

    to best match storage technology to business requirements. Several implementation

    choices are important for Oracle databases depending on the requirements for remote

    replication, snapshot, and horizontal scaling over multiple network interfaces or

    storage devices for increased throughput.

    On the Unified Storage Systems, replication is accomplished by periodic replication of

    project snapshots. Consequently, replication is asynchronous from the perspective of

    the application, and write-order is maintained only at the project level. To use

    automatic replication to protect an Oracle database, the entire database must be

    contained within the same project. Manual replication (controlled through scripts) can

    be coordinated with Oracle hot backup to replicate databases that span

    multiple projects.
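The single-project constraint for automatic replication can be expressed as a simple validation sketch (a hypothetical helper, assuming a mapping of database files to their containing projects):

```python
def safe_for_auto_replication(file_to_project: dict) -> bool:
    """Automatic replication preserves write order only within one project,
    so every file of the database must map to the same project."""
    return len(set(file_to_project.values())) == 1

layout = {"system.dbf": "proddb", "users.dbf": "proddb", "redo01.log": "proddb"}
print(safe_for_auto_replication(layout))  # True
```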

Snapshot can be executed at the project or share level. When combined with hot backup at the database level, databases that span multiple projects can be backed up

    with snapshot. For iSCSI implementations with Oracle ASM, however, hot backup is not

    available because Oracle ASM does not have a hot backup mode. In this case, snapshot

    backup can protect the database if the database resides on a single iSCSI LUN and

    snapshot is executed against the LUN.

    General Database Layout

    Although manual data placement can yield better throughput for specific workloads, a

    stripe and mirror everything (SAME) approach is an effective solution for general-

    purpose Oracle databases. The Unified Storage System, in addition to automatically

    caching frequently and recently accessed data on high-performance DRAM and read-

    optimized flash resources, implements a SAME architecture over the physical drive

    resources for all data when mirrored data protection is selected as a configuration

    option. Consequently, the logical layout of database data on Unified Storage System

    shares can be designed arbitrarily to optimize other business processes, such as

    backup or redeployment, without compromising the performance benefits of a

    SAME architecture.

    Although optimal deployment depends on the requirements for a specific system, the

    following guidelines provide a good starting point for Oracle implementations on

    Unified Storage Systems:

• Segregate different databases to different projects

• Store all database files for a specific database in the same project

• Segregate online redo log files and control files, production data files, temporary files, and recovery files to separate shares within the same project


• Segregate production data files to separate shares within the same project if the data files are subject to significantly different workloads or requirements

• Fine-tune share settings for recordsize, compression, checksumming, and security based on the requirements for the share
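As an illustration only, the guidelines above might translate into a project layout like the following. The project and share names and the per-share settings are hypothetical, chosen to show the separation of file classes, and would be tuned per deployment:

```python
# One project per database; one share per file class within that project.
oracle_project = {
    "project": "proddb",
    "shares": {
        "redo": {"contents": "online redo logs and control files"},
        "data": {"contents": "production data files", "recordsize": "8k"},
        "temp": {"contents": "temporary files"},
        "recovery": {"contents": "recovery and backup files"},
    },
}

# All shares live in one project, so write order is preserved for replication.
print(sorted(oracle_project["shares"]))  # ['data', 'recovery', 'redo', 'temp']
```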

    Storing all database files in the same project streamlines replication and snapshot

    backup implementations because write order is preserved within a specific project.

    Segregating data files from temp files, log files, control files, and recovery files allows

    for both flexibility and simplicity in rollback, restore, and deployment of alternate

    systems using snapshot and clone features. Also, segregating specific database

    components to separate file systems within the same project allows for optimal tuning

    of share recordsize, compression, checksumming, and security features to match the

    specific requirements of the component.

However, SAME may not be the right fit in all implementations. For example, SAME may not provide an optimal storage architecture for databases that store data with

    disparate value and diverse data access requirements. In these cases, segregating the

    database to specific storage tiers that can support different access requirements can

    help to provide a cost-effective solution that still meets business and operational

    requirements. In a tiered storage architecture, SAME is used over all elements in a

    specific storage tier, and multiple storage tiers support the database.

    Detailed DTrace Analytics

    Once Sun Storage 7000 Unified Storage Systems are deployed, storage architects and

administrators may find it necessary to troubleshoot application performance issues. DTrace Analytics is a dynamic tracing framework that helps organizations simplify the

    task of identifying causes behind intermittent or sustained application performance

    problems. With Analytics, administrators and application developers can instrument a

    live operating system kernel and running applications without rebooting the kernel,

    recompiling, or even restarting applications. Instrumentation can be activated as

    needed, without adding any overhead when tracing is turned off.

    For Sun Storage 7000 Unified Storage Systems, Analytics provides real-time

    observability for the entire system, including:

• NFS V3 and V4

• CIFS

• iSCSI

• ZFS

• CPU

• Memory utilization

• Networking


    The in-depth Analytics feature helps organizations to fine-tune deployed Sun Storage

    7000 Unified Storage Systems. For instance, Analytics can help administrators

    understand and optimize workloads in real time, aiding them in:

• Understanding the benefits of write-optimized and read-optimized flash devices for specific storage workloads

• Understanding when CPU, memory, and networking are creating bottlenecks

• Understanding the read/write/metadata mix of particular workloads

    For more information on Analytics, see the white paper entitled Architected for Open,

    Simple, and Scalable Enterprise Storage: Sun Storage 7110, 7210, and 7410 Unified

    Storage Systems.

    Summary

Sun Storage 7000 Unified Storage Systems offer incredible flexibility with respect to storage system configuration. Storage architects can design systems to optimize data

    placement, taking advantage of various combinations of system DRAM, flash, and

    spinning disk media. Understanding the characteristics of Oracle database applications

    can help system planners define data access requirements for data protection, access

    times, access rates, and storage capacities. After these requirements are quantified, the

    storage architect can more accurately match the specific configuration of the Unified

    Storage System. The test results and decision criteria presented here can help the

    architect in developing an optimal storage system configuration.

About the Authors

Jeff Wright works in the Application Integration Engineering (AIE) Group in Sun's

    Systems Organization. Jeff remains obsessed with performance topics involving Oracle

    and data storage technologies from product conception to final delivery. His recent

    focus has been on improving the timeliness and accuracy of storage solutions that Sun

    proposes and delivers to customers. Prior to joining Sun in 1997 through the StorageTek

    acquisition, Jeff worked as systems test engineer for disk storage products. His early

    career experiences in automotive manufacturing, software development, and

    experimental physics shaped his unique approach and perspectives on assessing

    storage performance and designing storage architectures for database systems.

    Sridhar Ranganathan also works in the Application Integration Engineering Group in

    Sun's Systems Organization. His group's charter is to continuously improve the

    competitive standing of Sun open storage products for key independent software

    vendor (ISV) applications. Sridhar works on feature, functionality, and performance

    tests involving Oracle database management systems in conjunction with traditional

    storage and open storage technologies. Prior to joining Sun in 2000, his early career

    experiences were in software development and database administration in the

    manufacturing and banking industries.


    References

    Bitar, Roger. Deploying Hybrid Storage Pools With Flash Technology and the Solaris ZFS

File System, Sun BluePrints Online, October 2008. To access this article online, go to http://wikis.sun.com/display/BluePrints/Deploying+Hybrid+Storage+Pools+With+Flash+Technology+and+the+Solaris+ZFS+File+System

    Shapiro, Mike. An Economical Approach to Maximizing Data Availability: Sun Storage

    7000 Unified Storage Systems, Sun BluePrints Online, November 2008. To access this

article online, go to http://wikis.sun.com/display/BluePrints/An+Economic+Approach+to+Maximizing+Data+Availability

    Wright, Jeffrey T. Balancing System Cost and Data Value With Sun StorageTek Tiered

Storage Systems, Sun BluePrints Online, February 2008. To access this article online, go to http://wikis.sun.com/display/BluePrints/Balancing+System+Cost+and+Data+Value+With+Sun+StorageTek+Tiered+Storage+Systems

    Sun Microsystems. Architected for Open, Simple, and Scalable Enterprise Storage: Sun

    Storage 7110, 7210, and 7410 Unified Storage Systems White Paper, November 2008.

To access this white paper online, go to http://www.sun.com/offers/details/Unified_Storage_Systems_Architecture.html

Sun Storage and Oracle Partnership: http://sun.com/storagetek/partnerships/oracle

Sun Unified Storage Systems: http://www.sun.com/storage/disk_systems/unified_storage

Ordering Sun Documents

The SunDocs℠ program provides more than 250 manuals from Sun Microsystems, Inc. If

    you live in the United States, Canada, Europe, or Japan, you can purchase

    documentation sets or individual manuals through this program.

Accessing Sun Documentation Online

The docs.sun.com web site enables you to access Sun technical documentation
online. You can browse the docs.sun.com archive or search for a specific book title

    or subject. The URL is http://docs.sun.com/

    To reference Sun BluePrints Online articles, visit the Sun BluePrints Online Web site at:

    http://www.sun.com/blueprints/online.html


Configuring Sun Storage 7000 Unified Storage Systems for ORACLE Databases

On the Web sun.com

    Sun Microsystems, Inc. 4150 Network Circle, Santa Clara, CA 95054 USA Phone 1-650-960-1300 or 1-800-555-9SUN (9786) Web sun.com