Reliability, Availability and Serviceability (RAS) Integration and Validation Guide for the Intel® Xeon® Processor E7- v3 Family Memory Address Range Mirroring April 2015

  • Reliability, Availability and

    Serviceability (RAS) Integration

    and Validation Guide for the

    Intel® Xeon® Processor E7- v3

    Family

    Memory Address Range Mirroring

    April 2015

  • 2

    No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

    Intel disclaims all express and implied warranties, including without limitation, the implied warranties of

    merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course

    of performance, course of dealing, or usage in trade.

    This document contains information on products, services and/or processes in development. All information

    provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast,

    schedule, specifications and roadmaps.

    The products and services described may contain defects or errors known as errata which may cause deviations

    from published specifications. Current characterized errata are available on request.

    Copies of documents which have an order number and are referenced in this document may be obtained by calling

    1-800-548-4725 or by visiting www.intel.com/design/literature.htm.

    Intel, the Intel logo, and Intel Xeon are trademarks of Intel Corporation in the U.S. and/or other countries.

    *Other names and brands may be claimed as the property of others

    © 2015 Intel Corporation.

  • 3

    Contents

    1 Introduction ........................................................................................ 4

    2 Address Range Mirroring Overview ..................................................... 6

    3 System Configuration .......................................................................... 8

    4 Software Integration ........................................................................... 9

    4.1 UEFI Variables for Memory Address Mirroring ................................... 9

    4.2 Management Interface Flow – BIOS .............................................. 13

    4.3 Management Interface Flow – OS ................................................. 13

    5 Intel® DFx Validation Recipe ............................................................ 14

    5.1 BIOS Setup ............................................................................... 15

    5.2 UCE Injection to TAD0 Primary Channel ........................................ 17

    5.3 UCE Injection to Non-Mirroring Range ........................................... 20

    5.4 Memory Mirror Failover Validation ................................................ 22

    Figures

    Figure 2-1. Typical Memory Layout ........................................................... 7

    Figure 2-2. Address Range Memory Mirroring............................................. 7

    Figure 5-1. Memory Mirroring Transaction Flow ........................................ 15

    Figure 5-2. BIOS Setup Menu for Interleave configuration setting .............. 15

    Figure 5-3. Enable Address Range Mirroring ............................................ 16

    Figure 5-4. Set Address Range Mirroring Size (size unit is 64 MB) .............. 16

    Figure 5-5. Independent Mode Memory Map ............................................ 17

    Figure 5-6. Address Range Mirror Enabled Memory Map. ........................... 18

    Figure 5-7. TAD Register of Address Range Mirror Enabled ........................ 19

    Tables

    Table 1-1. Definition, Business Value, and User Experience ......................... 4

    Table 1-2. Memory Address Range Mirroring Support Platform Scope .......... 5

    Table 3-1. Minimum System Configuration ................................................ 8

    Table 4-1. UEFI Run-time Services ......................................................... 10

  • 4

    1 Introduction

    This document describes the software and hardware integration and validation
    methodologies for the Memory Address Range Mirroring RAS feature on
    Intel® Xeon® Processor E7 – 8800/4800/2800 v3 based systems (code named
    Haswell EX). It provides integration and validation guidance for those
    responsible for BIOS development, system and platform validation, or
    product marketing.

    In the channel memory mirroring feature, mirroring is implemented across two
    memory channels. It requires allocating a significant amount of memory as
    redundant memory, which increases cost and power compared to a
    non-mirrored system.

    Memory Address Range Mirroring offers more granular mirroring control,
    allowing a cost-effective and power-optimized solution that retains the
    benefit of mirroring: system uptime is preserved even when an uncorrected
    fatal memory error is detected.

    At a high level, the business value and user experience are described in Table 1-1.

    Table 1-1. Definition, Business Value, and User Experience

    Definition: The Memory Address Range Mirroring feature provides further
    granularity to mirror memory by allowing the firmware or OS to determine a
    range of memory addresses to be mirrored, leaving the rest of the memory in
    the socket in non-mirror mode. Dynamic (without reboot) failover to the
    mirrored memory is transparent to the OS and applications. The processor
    supports up to two mirror ranges, one mirror range per iMC. Each mirror
    range is sized with 64MB granularity.

    Business Value: More cost-effective mirroring solutions by mirroring just
    the critical portion of memory versus mirroring the entire memory space,
    which ensures system uptime even when an uncorrected fatal memory error
    is detected.

    User Experience: Only the user-defined memory address range is mirrored,
    through the BIOS or OS, for the critical software execution portion. For
    example, users may choose to mirror only the OS kernel space, leaving the
    rest of the memory configured in independent mode.

    The Memory Address Range Mirroring feature description and platform support

    scope are highlighted in Table 1-2 below:

  • 5

    Table 1-2. Memory Address Range Mirroring Support Platform Scope

    Description: Memory Address Range Mirroring offers a more cost- and
    power-efficient mirroring option. It requires communication between the
    OS/VMM and the BIOS to determine the address range for mirroring.

    Platform | Supported
    Intel® Xeon® Processor E7 v2 | No
    Intel® Xeon® Processor E7 v3 | Yes
    Intel® Xeon® Processor E5 v2 | No
    Intel® Xeon® Processor E5 v3 | No

    The integration guides in this document cover the Unified Extensible Firmware
    Interface (UEFI) BIOS and OS interface setup flows for reporting and
    configuring the mirrored memory region during system boot and at runtime.

    The validation guides in this document contain Memory Address Range Mirroring
    validation recipes using the Intel® validation tool CScripts, which is
    developed based on Intel® Design for Test/Debug/Manufacturing (DFx)
    Technologies.

  • 6

    2 Address Range Mirroring Overview

    Memory address range mirroring is a new memory RAS feature on Intel®

    Xeon® Processor E7 – 8800/4800/2800 v3 based systems (code named

    Haswell EX) that allows greater granularity in choosing how much memory is

    dedicated for redundancy. In normal channel mirroring, the memory is split into

    two identical mirrors (primary and secondary). Half of all installed memory is

    reserved for redundancy and not reported in total system memory size.

    To reduce the amount of memory lost for redundancy, partial memory

    mirroring can be used. For partial memory mirroring, mirroring can be enabled

    selectively. If mirroring is not enabled, memory will be configured in

    Independent mode and will be part of the total system memory. Once mirroring

    is enabled, the memory is split into two identical mirrors. Half of the installed

    memory with mirroring enabled will be reserved for redundancy, and not be

    included in total system memory.

    To further reduce the amount of memory reserved for redundancy, memory

    address range mirroring can be used. It is similar to partial memory mirroring

    and can be enabled selectively. The size (range) of the primary and secondary

    mirrors can be defined using 64MB intervals. The range is defined by the value

    programmed in the Target Address Decoder 0 (TAD0) register. TAD0 defines

    the size of the primary and secondary mirror ranges. The secondary mirror

    range is reserved for redundancy and not reported in the total memory size.

    Because TAD0 is used to define the range for memory address range mirroring,
    a TOLM limitation applies to the first memory segment (the one where system
    address 0x0 begins). On this segment, the largest range allowed is limited
    by TOLM (~2GB). This is because TAD1 begins at system address 0x100000000
    (4GB), so TAD0 covers 0-4GB. However, only 0GB – TOLM (~2GB) is decoded to
    system memory; the address range from TOLM to 4GB is assigned to MMIO.
    Refer to Figure 2-1 and Figure 2-2 for an overview of the memory map and
    mirror region allocation.
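
    As a rough illustration of the sizing rule above, the following Python
    sketch converts a mirror size given in 64MB units to bytes and checks it
    against the TOLM limit that applies to the first memory segment. The
    function names and the 2GB default for TOLM are assumptions made for
    illustration; the actual TOLM value is platform specific.

```python
MB = 1 << 20
GB = 1 << 30
MIRROR_GRANULARITY = 64 * MB  # mirror ranges are sized in 64MB steps


def mirror_size_bytes(units):
    """Size in bytes of a mirror range given as a count of 64MB units."""
    if units < 0:
        raise ValueError("units must be non-negative")
    return units * MIRROR_GRANULARITY


def fits_first_segment(units, tolm_bytes=2 * GB):
    """True if the range fits below TOLM on the segment holding address 0x0.

    tolm_bytes defaults to 2GB only as an illustrative assumption.
    """
    return mirror_size_bytes(units) <= tolm_bytes
```

    For instance, 16 units correspond to a 1GB mirror range, which fits below
    a 2GB TOLM, while 33 units (2GB + 64MB) would not.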

    When the memory address range mirroring feature is correctly configured, a
    multi-bit ECC error injected into mirror region TAD0, which would be treated
    as an uncorrected error (UCE) in independent mode, is downgraded to a
    corrected error (CE) until mirror failover occurs. Refer to Chapter 5 for
    the detailed validation flows.

  • 7

    Figure 2-1. Typical Memory Layout

    Figure 2-2. Address Range Memory Mirroring

  • 8

    3 System Configuration

    The system environment in this document for Address Range Mirroring

    validation is based on the Intel® Xeon® Processor E7 – 8800/4800/2800 v3

    Family. Minimum hardware configurations are shown in Table 3-1.

    Table 3-1. Minimum System Configuration

    Component | Version/stepping | Quantity | Unit | Comments
    CPU | Intel® Xeon® Processor E7 – 8800/4800/2800 v3 | 4 | ea | Target is to use the latest stepping of the processor.
    Memory | 8 GB DDR4 RDIMM | 32 | ea | Each channel has one DIMM.
    BIOS | 63.R00 | 1 | ea | Latest BIOS at the time of releasing this document.
    OS | N/A | | |
    Intel® DFx Validation Tool | 492130 | | | Latest CScripts version at the time of releasing this document.
    Intel® In-Target Probe | Intel® DFx Abstraction Layer 1.9.4450.400 | | |

  • 9

    4 Software Integration

    This chapter describes BIOS and OS integration and the run-time setup of
    Memory Address Range Mirroring through a defined interface. Contact your
    local Intel® representative for the detailed interface specification and
    support.

    The legacy memory mirroring feature is transparent to the OS; however, the
    address range mirroring feature requires a firmware-OS interface for users to
    specify the desired subset of memory to mirror. The OS needs the following
    support to fully utilize the Address Range Mirroring feature:

    • Present partial or total mirrored memory on the platform to the OS

    • Provide the OS a method to request the amount of mirrored memory

    that takes effect on subsequent boots.

    Partial memory mirrored ranges are indicated by an attribute field in the

    EFI_MEMORY_DESCRIPTOR. Any mirrored memory will need to be associated

    with the following attribute.

    Total mirrored memory, not only address range mirroring, can be reported to
    the OS in this way.

    The UEFI call GetMemoryMap() returns to the OS all the address ranges
    presented by the platform, and it is proposed to be modified to include
    partially mirrored memory. Note that GetMemoryMap() is part of the UEFI
    Boot Services and hence cannot be invoked after ExitBootServices().
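
    To make the reporting mechanism concrete, the sketch below filters a list
    of memory descriptors by an attribute bit and returns the mirrored address
    ranges. The attribute value is not reproduced in this transcript; the
    sketch assumes the EFI_MEMORY_MORE_RELIABLE bit (0x10000) defined in the
    UEFI 2.5 specification. The MemoryDescriptor tuple is a simplified,
    hypothetical stand-in for EFI_MEMORY_DESCRIPTOR, not the real structure.

```python
from collections import namedtuple

# Assumption: attribute bit for mirrored ("more reliable") memory, UEFI 2.5.
EFI_MEMORY_MORE_RELIABLE = 0x10000
EFI_PAGE_SIZE = 4096  # UEFI memory map pages are 4KB

# Simplified stand-in for EFI_MEMORY_DESCRIPTOR (only the fields used here).
MemoryDescriptor = namedtuple(
    "MemoryDescriptor", "physical_start number_of_pages attribute"
)


def mirrored_ranges(memory_map):
    """Return (start, end_exclusive) for each descriptor flagged as mirrored."""
    ranges = []
    for desc in memory_map:
        if desc.attribute & EFI_MEMORY_MORE_RELIABLE:
            end = desc.physical_start + desc.number_of_pages * EFI_PAGE_SIZE
            ranges.append((desc.physical_start, end))
    return ranges
```

    An OS loader would run this kind of scan before ExitBootServices(), since
    the memory map is only available from the boot services.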

    4.1 UEFI Variables for Memory Address Mirroring

    UEFI Variables are used to allow the OS or system software to identify the

    memory blocks currently mirrored by the platform and also to allow the OS to

    request mirroring on a subsequent boot.

    Supported features are:

    • Enable and disable mirroring of memory below 4GB for a subsequent

    boot

    • Specify amount (up to 50%) of memory to be mirrored for a subsequent

    boot

    • Indicate whether memory below 4GB is mirrored for current boot

  • 10

    • Indicate amount of memory mirrored for current boot: The status of

    requested memory redundancy (% and below 4GB) during last boot and

    status of request – SUCCESS, FAILURE, PARTIAL…

    Table 4-1 lists the UEFI run-time services available for managing variables.

    Table 4-1. UEFI Run-time Services

    Use the following globally unique identifier (GUID) to associate with the
    variables:

    #define ADDRESS_RANGE_MIRROR_VARIABLE_GUID \
      { 0x7b9be2e0, 0xe28a, 0x4197, { 0xad, 0x3e, 0x32, 0xf0, 0x62, 0xf9, 0x46, 0x2c } }

    The variable data is contained in the following structure:

    typedef struct {
      UINT8      MirrorVersion;
      BOOLEAN    MirrorMemoryBelow4GB;
      UINT16     MirroredAmountAbove4GB;
      EFI_STATUS MirrorStatus;
    } ADDRESS_RANGE_MIRROR_VARIABLE_DATA;

    MirroredAmountAbove4GB is the amount of available memory above 4GB that
    needs to be mirrored, measured in basis points (hundredths of a percent,
    e.g. 12.75% = 1275).
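
    The following Python sketch shows one way the variable data could be
    serialized and parsed. The byte layout is an assumption: little-endian,
    naturally aligned fields and a 64-bit EFI_STATUS, which gives 16 bytes.
    Actual firmware may pack the structure differently, so treat the format
    string as illustrative only. The function names are hypothetical.

```python
import struct

# Assumed layout: UINT8, BOOLEAN, UINT16, 4 padding bytes, 64-bit EFI_STATUS.
LAYOUT = struct.Struct("<BBH4xQ")


def pack_mirror_data(version, below_4gb, amount_bp, status):
    """Serialize the four fields; amount_bp is in basis points."""
    return LAYOUT.pack(version, 1 if below_4gb else 0, amount_bp, status)


def unpack_mirror_data(raw):
    """Parse bytes produced under the same assumed layout."""
    version, below_4gb, amount_bp, status = LAYOUT.unpack(raw)
    return {
        "MirrorVersion": version,
        "MirrorMemoryBelow4GB": bool(below_4gb),
        "MirroredAmountAbove4GB": amount_bp,
        "MirrorStatus": status,
    }
```

    A request for 12.75% of the memory above 4GB would carry 1275 in the
    MirroredAmountAbove4GB field.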

    In a multi-socket system, the platform is required to distribute the mirrored
    memory range such that the amount mirrored is approximately proportional to
    the amount of memory on each NUMA node. For example, on a two-node
    machine with 64GB on node 0 and 32GB on node 1, a request for 12GB of
    mirrored memory should be allocated as 8GB of mirrored memory on node 0
    and 4GB on node 1.
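
    The proportional split in this example can be sketched as follows. This
    is a minimal illustration with a hypothetical function name; it does not
    model the rounding to 64MB granularity that a real platform would apply,
    which is why the text only requires an approximately proportional split.

```python
def distribute_mirror(request_gb, node_sizes_gb):
    """Split a mirror request across NUMA nodes in proportion to node size."""
    total = sum(node_sizes_gb)
    if total <= 0:
        raise ValueError("no memory installed")
    return [request_gb * size / total for size in node_sizes_gb]
```

    With node sizes of 64GB and 32GB and a 12GB request, the function returns
    8GB for node 0 and 4GB for node 1, matching the example above.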

    As another example, if the total memory in the system is 48GB and 12GB of
    mirrored memory is requested, the percentage of memory above 4GB that needs
    to be mirrored is:

  • 11

    Mirrored Memory Above 4GB / Total Memory Above 4GB. Assuming TOLM = 2GB,
    the percentage is (12 - (4 - TOLM)) / 44 = 10 / 44 = 22.72% = 2272 basis
    points.
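
    The conversion to basis points can be checked with a few lines of Python.
    This sketch simply reproduces the arithmetic of the example, using the
    document's figures of 10GB mirrored above 4GB out of 44GB; the function
    name is hypothetical, and truncation (rather than rounding) is assumed
    because the example reports 2272 for 22.7272...%.

```python
def basis_points(mirrored_above_4gb, total_above_4gb):
    """Ratio expressed in hundredths of a percent, truncated to an integer."""
    if total_above_4gb <= 0:
        raise ValueError("total must be positive")
    return mirrored_above_4gb * 10000 // total_above_4gb


TOLM_GB = 2                        # value assumed in the example
mirrored_gb = 12 - (4 - TOLM_GB)   # 10GB, as in the example
example_bp = basis_points(mirrored_gb, 44)
```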

    The size of ADDRESS_RANGE_MIRROR_VARIABLE_DATA is calculated as follows:

    #define ADDRESS_RANGE_MIRROR_VARIABLE_SIZE \
      sizeof(ADDRESS_RANGE_MIRROR_VARIABLE_DATA)

    The variables are associated with the following attributes. Note that the
    BIOS may additionally specify the authenticated write access attribute on
    “MirrorCurrent” to prevent the OS from updating or deleting it.

    #define ADDRESS_RANGE_MIRROR_VARIABLE_ATTRIBUTE \
      (EFI_VARIABLE_NON_VOLATILE | EFI_VARIABLE_BOOTSERVICE_ACCESS | \
       EFI_VARIABLE_RUNTIME_ACCESS)

    The following two UEFI run-time services are used:

    GetVariable() is invoked using these parameters:

    Prototype

  • 12

    SetVariable() is invoked using these parameters:

    Prototype

  • 13

    4.2 Management Interface Flow – BIOS

    On the first boot of a system that supports memory address range mirroring,
    the BIOS must create the “MirrorCurrent” variable and initialize it to the
    values supplied by the BIOS setup options. If there are none, it uses
    [1, false, 0, EFI_SUCCESS]. Note that if “MirrorCurrent” is corrupted or
    does not exist on a subsequent boot, the BIOS must create it again.

    On each subsequent boot, the BIOS should first check whether “MirrorRequest”
    has been set by the OS to request a change in mirror configuration. If it
    has, the BIOS should try to set up mirroring using the new parameters:

    • If successful, copy the parameters to “MirrorCurrent” and set
    MirrorCurrent.MirrorStatus = EFI_SUCCESS

    • If not successful, set MirrorRequest.MirrorStatus to an error code
    and fall through to the normal path to set up mirroring using the old
    parameters

    If MirrorRequest was not set (or if it was set and could not be used, as
    mentioned above), the BIOS should set up mirroring from the parameters in
    MirrorCurrent and set MirrorCurrent.MirrorStatus to indicate the
    success/fail status.
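
    The boot-time decision flow above can be summarized in a short Python
    sketch. It is a model of the logic only: the two UEFI variables are
    represented as dictionaries (or None when missing or corrupted),
    try_setup is a hypothetical hook standing in for the actual mirror
    programming, and "ERROR" is a placeholder because the text does not name
    a specific error code.

```python
DEFAULT = {"version": 1, "below_4gb": False, "amount_bp": 0,
           "status": "EFI_SUCCESS"}


def bios_boot(current, request, try_setup):
    """Return the (current, request) variable contents after one boot.

    current, request: dicts, or None if the variable is absent or corrupted.
    try_setup(params) -> bool: hypothetical hook that programs the mirror.
    """
    if current is None:
        current = dict(DEFAULT)  # (re)create the variable with defaults
    if request is not None:
        if try_setup(request):
            # New parameters accepted: they become the current configuration.
            return dict(request, status="EFI_SUCCESS"), request
        request = dict(request, status="ERROR")  # placeholder error code
    # Normal path: set up mirroring from the old (current) parameters.
    ok = try_setup(current)
    current = dict(current, status="EFI_SUCCESS" if ok else "ERROR")
    return current, request
```

    Whether the BIOS clears the request variable after a successful change
    is not stated in the text, so the sketch leaves it untouched.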

    4.3 Management Interface Flow – OS

    During early boot, the OS uses the attribute bits returned by GetMemoryMap()
    to locate which ranges of memory are mirrored (the same data is also
    available via _ATT attributes).

    During hot plug memory operations, the OS uses the _ATT attribute field to
    discover changes to the memory mirror configuration.

    Later, it examines MirrorCurrent and MirrorRequest to check for errors in
    setting up mirror ranges.

    For the OS to change the mirror configuration, it sets MirrorRequest to
    [1, desired-below-4GB, desired-amount-above-4GB, EFI_WARN_STALE_DATA] and
    then reboots to allow the BIOS to apply the new settings.

  • 14

    5 Intel® DFx Validation Recipe

    This chapter describes the methodology recommended for validating the
    Address Range Mirroring feature. Error injection tools and libraries, the
    data collector, and the analyzer are the key components described here. The
    primary objective is to validate the end-to-end flow from the time a
    hardware error is injected to the time the error is successfully recovered.
    Contact your local Intel® representative regarding the CScripts installation
    process, system integration, and detailed recipes with CScripts command
    types and results. For the signal transaction flows in mirror processing,
    refer to Figure 5-1.

  • 15

    Figure 5-1. Memory Mirroring Transaction Flow

    5.1 BIOS Setup

    The reference BIOS setup shown in Figure 5-2, Figure 5-3, and Figure 5-4 is

    based on the BIOS version used in Intel® Xeon® Processor E7 –

    8800/4800/2800 v3 based Customer Reference Platform. For different

    platforms, users need to check with their BIOS vendors for the correct system

    configurations.

    The BIOS menu setup is shown below (also see Figure 5-2 through 5-4):

    EDKII Menu -> Advanced -> Memory RAS Configuration:

    • Memory Interleaving: NUMA (1-way) Node Interleave or 2-way Node Interleave

    • Memory Mirroring: Partial CH Mirroring

    • Memory Mirroring SCKx MCx: Enabled

    • Partial Mirroring Size: select size here, in 64MB units

    Figure 5-2. BIOS Setup Menu for Interleave configuration setting

  • 16

    Figure 5-3. Enable Address Range Mirroring

    Figure 5-4. Set Address Range Mirroring Size (size unit is 64 MB)

  • 17

    5.2 UCE Injection to TAD0 Primary Channel

    For the precise execution commands, contact your local Intel®
    representative.

    Step 1: Configure Address Range Mirroring range

    Step 2: Check Memory Map

    Under the EFI shell, the current memory map can be shown using the memmap
    command. If the Address Range Mirroring feature is enabled, the memory
    map shrinks by the mirrored size. With the BIOS configuration in
    Section 5.1, the top memory address is 0x87fffffff (34 GB) in independent
    mode, as shown in Figure 5-5. In Address Range Mirroring mode, the top
    memory address is 0x83fffffff (33 GB), as shown in Figure 5-6. The memory
    size is reduced by 1 GB (16 x 64 MB = 1 GB). As there is a 2 GB MMIO
    region under the 4 GB address space, the memory size in independent mode
    is 32 GB (34 GB – 2 GB) and the memory size in Address Range Mirroring
    mode is 31 GB (33 GB – 2 GB).
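
    The arithmetic in this step can be reproduced with a short Python sketch,
    which may be useful when checking memmap output from other configurations.
    The function name is hypothetical, and the 2 GB MMIO region is taken from
    the configuration described above rather than being a general constant.

```python
MB = 1 << 20
GB = 1 << 30


def usable_memory(top_address, mmio_bytes=2 * GB):
    """Usable memory given the highest mapped address and the MMIO region
    that sits below 4 GB."""
    return (top_address + 1) - mmio_bytes


independent = usable_memory(0x87FFFFFFF)  # independent mode
mirrored = usable_memory(0x83FFFFFFF)     # Address Range Mirroring mode
reduction = independent - mirrored        # memory reserved for the mirror
```

    Here the reduction equals 16 units of 64 MB, which is the mirror size
    selected in BIOS setup.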

    Figure 5-5. Independent Mode Memory Map

  • 18

    Figure 5-6. Address Range Mirror Enabled Memory Map.

    Step 3: Get TAD0 System Address

    The TAD0 channel address is recorded in the TAD0 registers. The physical
    mirroring address range can be obtained by translating the channel address
    together with the socket address recorded in the SAD registers.

    TAD0 can be read by CScripts as shown below:

    >> sv.socket0.uncore0.ha0_tad_0.show

    Alternatively, use the register dump: as shown in Figure 5-7, TAD0 records
    0xF3E4. Based on the definition, the channel address limit is calculated as (0x0f + 1) x

    64 MB= 1 GB. As its node base address is 0, the physical range of the mirror

    area is 0 ~ 1 GB. Each node base address is recorded in SAD registers. SAD

  • 19

    registers describe each node’s limit address. In case there is no node level

    interleave, the base address of node[n] is equal to the limit address of

    node[n-1] plus 1. In this test configuration, since there is no DIMM installed

    on the other node, the base address of this node is 0.
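
    The decoding described in this step can be sketched in Python as follows.
    The bit position of the limit field is an assumption chosen to match the
    worked example, where 0xF3E4 yields a limit field of 0x0f; consult the
    register definition for the authoritative layout. Both function names
    are hypothetical.

```python
MB = 1 << 20


def tad_limit_bytes(tad_value, limit_shift=12):
    """Channel address limit derived from a TAD register value.

    limit_shift=12 is an assumption consistent with the example above.
    """
    limit_field = tad_value >> limit_shift
    return (limit_field + 1) * 64 * MB


def node_bases(sad_limits):
    """Base address of each node when there is no node-level interleave.

    Node 0 starts at 0; node n starts at the limit of node n-1 plus 1.
    """
    bases = [0]
    for limit in sad_limits[:-1]:
        bases.append(limit + 1)
    return bases
```

    For the register value in Figure 5-7, tad_limit_bytes(0xF3E4) evaluates
    to 1 GB, so with a node base of 0 the mirror area spans 0 to 1 GB.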

    Figure 5-7. TAD Register of Address Range Mirror Enabled

    Step 4: Inject multi-bit ECC error to TAD0

    An uncorrectable error within mirroring range will be treated as a correctable

    error.

    In this validation recipe, one uncorrectable error will be injected to mirror

    address range. The error should be treated as a correctable error. If not, the

    test fails.

    Step 5: Verify that the UCE is downgraded to a corrected error (CE)

    A new event ID is produced, and the error type shows a corrected error in
    Windows Event Viewer.

  • 20

    5.3 UCE Injection to Non-Mirroring Range

    For comparison with multi-bit ECC error injection to TAD0, a UCE is injected
    into the non-mirroring address range. The memory controller detects the
    uncorrectable error and triggers a machine check exception (MCE) to system
    software, and the system goes to a BSOD with code 124 under Windows
    Server 2012.

    Step 1 ~ 3: The same as steps 1 through 3 in Section 5.2

    Step 4: Inject multi-bit ECC error to the non-mirror range above 0x3fff_ffff

    Note that if a multi-bit ECC error is injected at the TAD0 boundary
    0x3ffffffe, this UCE should be downgraded to a CE as well.

    Once a UCE is injected to TAD1+, a system BSOD with error code ID 124 is
    expected.
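
    The pass/fail expectation for an injection address can be written down
    as a small Python check. It is only a model of the expected results in
    Sections 5.2 and 5.3 under this test configuration, where the mirror
    range ends at 0x3fffffff; the function name and return strings are
    illustrative.

```python
def expected_result(address, mirror_limit=0x3FFFFFFF):
    """Expected outcome of a multi-bit ECC injection while the mirror is intact.

    mirror_limit is the last address of the TAD0 mirror range in this setup.
    """
    if 0 <= address <= mirror_limit:
        return "corrected error event"  # UCE downgraded to CE
    return "BSOD 124"                   # UCE in the non-mirrored range
```

    A validation script could compare this expectation with the event found
    in Windows Event Viewer after each injection.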

    Step 5: Check system BSOD

    After injecting the UCE successfully, let the system go, then check
    Windows Event Viewer on Windows Server 2012.

  • 21

    Step 6: Check BSOD 124 event

    After a system reboot, go back to Windows Server 2012 and check whether
    a BSOD 124 event was generated in the system event log.

  • 22

    5.4 Memory Mirror Failover Validation

    To validate the memory mirror failover function, mirror scrub should be
    enabled in the BIOS setup menu. When a persistent UCE is injected until
    mirror failover occurs, the memory controller loses mirror redundancy and
    the system operates in independent mode. At this stage, if one more
    multi-bit ECC error is injected into the TAD0 region, the error is treated
    as an uncorrectable error. The detailed validation procedure is as follows.

    Step 1 ~ 3: Refer to steps 1 through 3 in Section 5.2

    Step 4: Inject mirror failover on TAD0

    Inject the mirror_failover error type at the TAD0 address.

    Step 5: Check system status

    Let the system go. At this point, the system should still be alive.

    Then check the event log in Event Viewer. An ID 47 event is generated by
    the OS CMCI handler.

    Step 6: Inject multi-bit ECC error after mirror failover occurs

    The system goes to a BSOD with ID 124 error, which is a WHEA type.
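
    The state change that this procedure validates can be modeled with a toy
    Python class. It is a conceptual sketch only, not a CScripts recipe: the
    class and method names are hypothetical, and the model tracks nothing
    beyond whether the mirror still has redundancy.

```python
class MirrorRegion:
    """Toy model of one mirror range; limit is the last mirrored address."""

    def __init__(self, limit):
        self.limit = limit
        self.redundant = True  # secondary copy still available

    def fail_over(self):
        """Redundancy is lost; the region now behaves as independent mode."""
        self.redundant = False

    def inject_multibit_ecc(self, address):
        """Return how a multi-bit ECC error at this address is treated."""
        if address <= self.limit and self.redundant:
            return "CE"
        return "UCE"
```

    Before failover an injection inside the region is reported as CE; after
    fail_over() the same injection is reported as UCE, which corresponds to
    the BSOD 124 expected in Step 6.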

  • 23