Oracle Flex ASM - What’s New and Best Practices by Jim Williams

Oracle Open World (OOW) 2014 presentation by Jim Williams (Oracle ASM Product Manager) on Oracle Flex ASM - What's New and Best Practices. The presentation gives an overview of the enhancements in Oracle ASM 12c, especially with respect to Oracle Flex ASM, and offers best practices that can be applied in any environment (Flex or Standard ASM). It also provides more background for some of the configuration recommendations that I made in my "Oracle RAC (12.1.0.2) Operational Best Practices" presentation.

Transcript of Oracle Flex ASM - What’s New and Best Practices by Jim Williams

Page 1: Oracle Flex ASM - What’s New and Best Practices by Jim Williams
Page 2: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |

Oracle Flex Automatic Storage Management: What’s New and Best Practices

Jim Williams, ASM Product Manager, October 1, 2014

Page 3: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

Safe Harbor Statement

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

Page 4: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

Program Agenda

1. Flex ASM

2. Best Practices - Avoiding 3:00 AM Calls

3. Q & A

Page 5: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

ASM History 101

• Provide an integrated volume manager and file system

• Stripe and mirror files across disks in an ASM Disk Group

• Automatic “Rebalance” after storage configuration changes

• Built on the Oracle instance architecture

• I/O operations DO NOT go through the ASM instance!

• Manage storage as a global cluster of shared Disk Groups

The Simple Idea

Page 6: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

ASM History 101 The Simple Idea

Page 7: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

ASM History 101 The Simple Idea

[Diagram: Oracle RAC servers, each running one ASM instance and one database instance (1-1 ASM to server). The ASM cluster presents a pool of storage: Disks 1 to 7 are organized into shared Disk Groups A and B, holding database Files 1 to 4 with wide file striping.]

Page 8: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

Pre-12c ASM Architecture

ASM architecture utilized an ASM instance on every server

– Database instances dependent on node-specific ASM instances

– ASM overhead scaled with size of cluster

– Cluster reconfiguration events increased with number of servers in a cluster

[Diagram: a cluster of four servers. Each server runs its own ASM instance alongside several database instances (DB1 to DB6 spread across the servers).]

Page 9: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

Flex ASM Architecture

Eliminates requirement for an ASM instance on every cluster server

– Database instances connect to any ASM instance in the cluster

– Database instances survive loss of ASM instance

– Administrators specify the cardinality of ASM instances (default is 3)

– Clusterware ensures ASM cardinality is maintained

Page 10: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

Flex ASM

• Maximum number of Disk Groups increased to 511

– Previous limit was 63

• Replicated physical metadata

– Improves reliability

– Virtual metadata has always been replicated with ASM mirroring

• Replace ASM Disk command

• Administrators can now specify a Failure Group repair time:

– Similar to existing disk repair time

– New disk group attribute - failgroup_repair_time

– Default setting is 24 hours

Other Flex Features

Page 11: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

Flex ASM

• In previous versions, database instances used OS-authentication to connect to ASM

– This worked because ASM clients and servers were always on the same server

• With Oracle Database 12c, database instances and ASM servers can be on different servers

– Flex ASM uses password file authentication

– Password file is in an ASM Disk Group

– A default configuration is created when the ASM cluster is configured

• Databases can use a shared password file as well!

Remote Access

Page 12: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

Program Agenda

1. Flex ASM

2. Avoiding 3:00 AM Calls

3. Q & A

Page 13: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

Deploying Flex ASM Do not accept the default – choose “Advanced Installation”

• Typical Installation

– Does not provide an option to use “Flex ASM”

• Advanced Option

– Recommended for all configurations

Page 14: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

Deploying Flex ASM “Advanced Installation” – Storage Options

• Four storage options are available:

1. Standard ASM

• Pre-12c ASM configuration mode

2. Oracle Flex ASM

• Recommended

3. ASM Client Cluster

• Ignore for now

4. Non-ASM managed storage

Page 15: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

Green Field Deployment

Isolate Flex ASM deployment from pre-12c databases

– Pre-12c databases require a local ASM instance

– Dedicated Flex ASM cluster

[Diagram: a pre-12c cluster and a separate 12c Flex ASM cluster, each made up of servers hosting database instances and ASM instances.]

Page 16: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

Mixed-mode Deployment

Flex ASM cluster supporting pre-12c databases

– Set cardinality to ALL

– ASM instance on every server

– Only 12c database instances can reconnect to a surviving ASM instance after a server failure

[Diagram: a mixed cluster whose servers host both pre-12c and 12c database instances, with ASM instances across the servers.]

Page 17: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

Deploying Flex ASM

• Fewer than four nodes

– ASM instance on every node

• Four or more nodes

– Three ASM instances, or ALL if pre-12c databases are present

• SRVCTL commands for:

• Checking ASM instance status

– srvctl status asm

• Setting cardinality

– srvctl modify asm -count ALL

• Starting, stopping, and relocating ASM instances

– srvctl start asm -n node-name

– srvctl stop asm -n node-name

How many ASM instances?
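
The sizing rule on this slide can be written down as a small decision function. This is only a sketch of the slide's guidance, not an Oracle tool: the function name `recommended_asm_count` and its parameters are hypothetical, and the returned string is simply the value one might pass to `srvctl modify asm -count`.

```python
def recommended_asm_count(node_count: int, has_pre_12c_databases: bool) -> str:
    """Return the ASM cardinality suggested by the slide for a cluster.

    Hypothetical helper: fewer than four nodes, or any pre-12c database,
    means an ASM instance on every node ("ALL"); otherwise three instances.
    """
    if node_count < 1:
        raise ValueError("a cluster needs at least one node")
    if has_pre_12c_databases or node_count < 4:
        return "ALL"
    return "3"


# Example: an eight-node cluster running only 12c databases.
count = recommended_asm_count(8, has_pre_12c_databases=False)
print(f"srvctl modify asm -count {count}")
```

The function only encodes the rule of thumb; the actual cardinality is still set by the administrator with SRVCTL.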

Page 18: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

Deploying Flex ASM

• Use Automatic Memory Management (AMM) “MEMORY_TARGET”

– Defaults to 1 GB – adequate for most configurations

– Exadata uses a custom memory configuration

• Process Count “PROCESSES”

– For #_DBs < 10, PROCESSES = 50 * #_DBs + 50

– For #_DBs >= 10, PROCESSES = 10 * #_DBs + 450

– In an Oracle Exadata environment, PROCESSES = MAX(450 + 10 * #_DBs, 1024)

• See the ASM Administration Guide for details on these settings

Memory and Process Count Considerations
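
The PROCESSES formulas above are easy to get wrong by hand, so here is a minimal sketch that evaluates them. The function name `asm_processes` and the `exadata` flag are illustrative assumptions; only the three formulas come from the slide.

```python
def asm_processes(db_count: int, exadata: bool = False) -> int:
    """Evaluate the slide's PROCESSES formulas for an ASM instance.

    db_count is the number of databases (#_DBs on the slide).
    """
    if db_count < 0:
        raise ValueError("db_count must not be negative")
    if exadata:
        # Exadata: MAX(450 + 10 * #_DBs, 1024)
        return max(450 + 10 * db_count, 1024)
    if db_count < 10:
        # Fewer than 10 databases: 50 * #_DBs + 50
        return 50 * db_count + 50
    # 10 or more databases: 10 * #_DBs + 450
    return 10 * db_count + 450


for n in (4, 10, 25):
    print(n, asm_processes(n), asm_processes(n, exadata=True))
```

The two non-Exadata formulas agree at exactly 10 databases (both give 550), so there is no jump at the boundary.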

Page 19: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

ASM Diskgroup Configuration

• ASM_DISKSTRING: search path that ASM uses to discover Disk Groups

– ASM examines every device that matches the search path as a candidate ASM Disk

• When the candidate list is excessively large, Disk Group discovery becomes unnecessarily long, causing slow ASM response

• The default value, “/dev/sd*”, is often sufficient; if you need to change it, don’t make it too inclusive, e.g. “/dev/*”

Disk Group Discovery
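
To see why an overly inclusive search path inflates the candidate list, the sketch below applies shell-style wildcard matching to a made-up device list. It is only an analogy: the device names are invented, and Python's `fnmatchcase` stands in for ASM's own discovery logic, whose exact matching rules are platform specific.

```python
from fnmatch import fnmatchcase

# Hypothetical device nodes on a server; not taken from the presentation.
DEVICES = [
    "/dev/sda", "/dev/sdb", "/dev/sdc1",
    "/dev/tty0", "/dev/null", "/dev/loop0", "/dev/random",
]


def candidates(diskstring: str, device_paths: list[str]) -> list[str]:
    """Return the device paths that match a shell-style search pattern."""
    return [path for path in device_paths if fnmatchcase(path, diskstring)]


narrow = candidates("/dev/sd*", DEVICES)
broad = candidates("/dev/*", DEVICES)
print(len(narrow), "candidates for /dev/sd*")
print(len(broad), "candidates for /dev/*")
```

Every extra candidate is a device that has to be examined during discovery, which is the cost the slide warns about.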

Page 20: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

ASM Diskgroup Configuration

• Best practice is DATA, FRA, GRID

– GRID DG for OCR, Voting File, SPFILE

– Use External Redundancy for most high-end storage arrays

• Separate Disk Groups for different storage tiers

– REDO Disk Group for flash storage

How many Disk Groups?

Page 21: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

ASM Diskgroup Configuration

• How many Failure Groups

– Engineered systems automatically configure Failure Groups

– Choose by hardware boundary

– Use the default of one disk per Failure Group for a small number of ASM Disks

– Failure Groups must be balanced: equal number of disks in each Failure Group

• Except Quorum Failure Groups, which need not be balanced

– Failure Groups for Extended Clusters need to be site-based and require a Quorum Failure Group

How many Failure Groups?

Page 22: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

ASM Diskgroup Configuration

• All disks must be the same size for Normal and High Redundancy

– Similar performance characteristics

• Minimum disks: 4 times the number of paths for each Disk Group

– Normal Redundancy Disk Group with 2-way multipathing >= 8 disks

• Maximum: < 1000 disks in a Disk Group

– Long disk discovery times and frequent capacity additions with too many disks

How big and how many disks in a Diskgroup?
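
The disk-count guidance on this slide reduces to two checks, sketched below. The helper names `minimum_disks` and `check_disk_count` are hypothetical; the "4 times the number of paths" floor and the "fewer than 1000 disks" ceiling are the only values taken from the slide.

```python
MAX_DISKS_EXCLUSIVE = 1000  # slide: fewer than 1000 disks in a Disk Group


def minimum_disks(path_count: int) -> int:
    """Slide rule: at least 4 disks per multipath path in each Disk Group."""
    return 4 * path_count


def check_disk_count(disk_count: int, path_count: int) -> list[str]:
    """Return a list of warnings; an empty list means both rules are met."""
    warnings = []
    if disk_count < minimum_disks(path_count):
        warnings.append(
            f"only {disk_count} disks; at least {minimum_disks(path_count)} suggested"
        )
    if disk_count >= MAX_DISKS_EXCLUSIVE:
        warnings.append(f"{disk_count} disks; keep below {MAX_DISKS_EXCLUSIVE}")
    return warnings


# The slide's own example: 2-way multipathing needs at least 8 disks.
print(minimum_disks(2))
print(check_disk_count(6, 2))
```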

Page 23: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

ASM Disk Groups with Mirroring

• With ASM Mirroring, the Partner Status Table (PST) is replicated

– 3 copies for Normal Redundancy, and 5 copies for High Redundancy

– Each copy in separate Failure Groups, special case for <3 FG (Normal) and <5 FG (High)

• Two Failure Groups are problematic because quorum cannot be established!

– Create Quorum Failure Group(s) to satisfy PST quorum

– Quorum Failure Groups are exempt from the homogeneity requirements

– Quorum Failure Groups need not be large – not used for data

Quorum Failure Groups

Page 24: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

ASM Capacity Management

SELECT name, type, total_mb, free_mb, required_mirror_free_mb, usable_file_mb FROM V$ASM_DISKGROUP_STAT;

NAME  TYPE    TOTAL_MB  FREE_MB  REQUIRED_MIRROR_FREE_MB  USABLE_FILE_MB
DATA  NORMAL  51180     42204    10236                    15984

Column meanings:

– NAME: Diskgroup name

– TYPE: Redundancy

– TOTAL_MB: Total raw space in Diskgroup

– FREE_MB: Unused raw space in Diskgroup

– REQUIRED_MIRROR_FREE_MB: Largest FG capacity

– USABLE_FILE_MB: Logical space that can be allocated and still have ASM restore redundancy after Failure Group failure

• USABLE_FILE_MB = (FREE_MB - REQUIRED_MIRROR_FREE_MB) / 2 [normal redundancy]
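
As a quick check of the formula, the sketch below recomputes USABLE_FILE_MB from the FREE_MB and REQUIRED_MIRROR_FREE_MB values in the sample output. The function name `usable_file_mb_normal` is an illustrative assumption, and the second set of input values is invented to show the negative case.

```python
def usable_file_mb_normal(free_mb: int, required_mirror_free_mb: int) -> float:
    """USABLE_FILE_MB for a Normal Redundancy Disk Group, per the slide."""
    return (free_mb - required_mirror_free_mb) / 2


# Values from the sample V$ASM_DISKGROUP_STAT row for Disk Group DATA.
usable = usable_file_mb_normal(free_mb=42204, required_mirror_free_mb=10236)
print(usable)

# Invented values: once free space drops below the reserve, the result is negative.
overcommitted = usable_file_mb_normal(free_mb=8000, required_mirror_free_mb=10236)
print(overcommitted < 0)
```

A negative result means the Disk Group no longer has enough free space to restore redundancy after losing a Failure Group.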

Page 25: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

ASM Diskgroup Configuration

• When creating a new Disk Group, 4 MB AUs provide minor benefit for data warehouse environments

– AU Size cannot be changed online and requires recreating the Disk Group

• Don’t invest effort in reconfiguring to accommodate 4 MB AU Disk Groups

– Variable extents introduced in 11.2 reduce benefit of larger AUs

– Storage array architecture affects I/O performance benefit of 4 MB AUs

AU Size

Page 26: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

ASM Diskgroup Configuration

• Use External Redundancy if you have complete confidence in your storage

– A small percentage of customers use Normal Redundancy with high-end storage arrays

• Mirror data across storage arrays

• Need to provide a Quorum Failure Group for Partner Status Table quorum

• Always use multi-pathing

– Provide ASM with the multipath (MP) O/S device path name

– Set MP timeout to less than clusterware heartbeat timeout (< 120 seconds)

– MOS note 294869.1

• Advance COMPATIBLE.ASM to 12.1

– Replicates physical metadata and makes External Redundancy Disk Groups more resilient to accidental corruption

Fault Tolerance

Page 27: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

ASM Diskgroup Configuration

• Disk Groups are containers with internal “versioned” data structures

– Support backward compatibility

– Compatibility settings determine the availability of ASM features (see matrix in ASM admin guide)

– Disk Group compatibility attributes can only be advanced

• BUT CANNOT BE REVERTED TO A PREVIOUS VERSION

• The COMPATIBLE.ASM attribute must be >= value of other disk group compatibility attributes

– Advanced with SQL, ASMCA, ASMCMD

Compatibility Settings

Page 28: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

ASM Diskgroup Configuration

• Two new features in 12.1 are support for LUNs larger than 2 TB and for 511 Disk Groups

– Large LUN support

• LUNs larger than 2 TB require COMPATIBLE.ASM >= 12.1 and COMPATIBLE.RDBMS >= 12.1

• ASM environments planning to use Disk Groups with large LUNs must have only databases at 12c or later

– 511 Disk Group support: not controlled by any compatibility attribute

• Disk Group numbers are assigned at mount time (1..511) in the order they are discovered

• Pre-12c databases cannot access a Disk Group that is numbered greater than 63

• ASM environments planning to use more than 63 Disk Groups must have only databases at 12c or later

12.1 Feature Compatibility
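
The two compatibility rules on this slide can be expressed as simple predicates. This is a sketch under stated assumptions: `parse_version`, `large_lun_supported`, and `pre_12c_can_access` are hypothetical helper names, and version strings are assumed to be dot-separated integers such as "12.1.0.2".

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version string such as '12.1.0.2' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def large_lun_supported(compatible_asm: str, compatible_rdbms: str) -> bool:
    """Slide rule: LUNs larger than 2 TB need both attributes at 12.1 or higher."""
    required = (12, 1)
    return (
        parse_version(compatible_asm) >= required
        and parse_version(compatible_rdbms) >= required
    )


def pre_12c_can_access(disk_group_number: int) -> bool:
    """Slide rule: pre-12c databases cannot use Disk Groups numbered above 63."""
    return 1 <= disk_group_number <= 63


print(large_lun_supported("12.1.0.2", "12.1"))
print(large_lun_supported("12.1", "11.2.0.4"))
print(pre_12c_can_access(64))
```

Because Disk Group numbers are assigned in discovery order at mount time, the second check cannot be guaranteed in advance, which is why the slide recommends having no pre-12c databases once more than 63 Disk Groups are in use.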

Page 29: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

ASM Diskgroup Reconfiguration

• Power determines the number of concurrent I/O operations

– Highly configuration dependent, but values greater than 32 often have declining benefit with respect to rebalance performance

– Can be dynamically changed to manage performance impact

• A “Power” setting can now be used for disk resync (disk online)

• Administrators can now replace a disk as a fast and efficient operation

– Disk Group reorganization is not required

– Replacement disk is populated with copies of ASM extents from online disks

ASM Power

Page 30: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

Preventing Accidental Corruption

• The most common cause of corruption is accidental administrative action made to the wrong disk

– Overwriting an ASM Disk with a File System

– Assigning an ASM Disk to an LVM

• Employ operational procedures that establish hard separation between ASM Disks and all others. A few ideas:

– Unique ASM Disk partitions, e.g. /dev/sdu2

• First partition is a small partition that aligns the second partition to a 1 MB boundary

• ASM is assigned the second partition

• NEVER use second partitions elsewhere

• ASM Filter Driver – prevents overwriting ASM Disks

Page 31: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

Checking for Corruption

• Silent data corruption is a fact of life in today’s storage world

• The database checks for logical consistency when reading data

– If a logical corruption is detected then automatic recovery can be performed using the ASM mirror copies

– For seldom accessed data, over time all mirror copies of data could be corrupted

• With Oracle 12c, ASM data can be proactively scrubbed:

– Scrubbing occurs automatically during rebalance operations

– Scrubbing of a Disk Group, individual files, or individual disks

– ALTER DISKGROUP <NAME> SCRUB [POWER AUTO|LOW|HIGH|MAX];

Silent Data Corruption

Page 32: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

Disk Failure

• Disk and Failure Group Repair Timers

– Disk Repair Time default value is 3.6 hours

– Failure Group Repair Time feature provides extra opportunity to avoid unnecessary rebalance – default value is 24 hours

– ALTER DISKGROUP <NAME> SET ATTRIBUTE 'DISK_REPAIR_TIME' = '12H';

– DROP AFTER clause of "OFFLINE DISK" | "OFFLINE DISKS IN FAILGROUP" can be used to reset an active timer

– Timer runs only while the Disk Group is mounted

– REPAIR_TIMER column in V$ASM_DISK reflects remaining time

– When timer expires and the disk is force dropped, you cannot use ONLINE DISK or REPLACE DISK

Repair Times when mirroring

Page 33: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

Recovery After Disk Failure

• Normal Redundancy Disk Group with two Failure Groups.

• USABLE_FILE_MB is negative

Page 34: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

Recovery After Disk Failure

• What happens when there is a disk failure?

• ASM takes Disk OFFLINE

• Timer counts for disk to be DROPPED from Disk Group

Page 35: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

Recovery After Disk Failure

• After timer expires disk is dropped from Disk Group

• ASM begins rebalancing Disk Group

Page 36: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

Recovery After Disk Failure

• Eventually, one disk will become full and allocations in Disk Group cannot be made

• ASM begins rebalancing Disk Group, but cannot continue because the Disk Group is full.

Page 37: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

Recovery After Disk Failure

• Even adding a new disk does not allow the rebalance of the Disk Group to continue, because of the partnering of individual extents.

Page 38: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

Recovery After Disk Failure

• Possible solution:

– Drop the disk that is full (disk E)

– Then undrop the disk after a brief period, once some of the extents have been relocated and space has been freed up

Page 39: Oracle Flex ASM - What’s New and Best Practices by Jim Williams

Q & A

Page 40: Oracle Flex ASM - What’s New and Best Practices by Jim Williams