© 2005 EMC Corporation. All rights reserved.

Project MegaGrid: Building the Oracle Grid Reference Architecture
Akira Hangai, Oracle Partner Engineering, EMC Corporation
Agenda

– Introduction
  – Enterprise Grid Computing Primer
  – Enterprise Grid Computing and Information Lifecycle Management
  – Project MegaGrid Overview
– Anatomy of Project MegaGrid
  – Capacity Planning
  – Infrastructure Design
  – Deployment
  – Provisioning
– Enterprise Grid Computing and Information Lifecycle Management
– Q&A
Introduction
Enterprise Grid Computing Primer
"Enterprise" vs. "Traditional" Grid Computing

                          Enterprise                              Traditional
Application Type          Business applications; i.e.,            Scientific applications; i.e.,
                          multi-tiered database applications      encryption cracking, intense
                          (e.g., CRM, ERP, OLTP)                  mathematical calculations
                                                                  (e.g., SETI, distributed.net)
Technological method      Clustering, federated, no "master"      Distributed, independently
                                                                  parallel, centrally managed
Technological focus       Scalability                             Manageability
Target Community          IT organizations                        "Net" users
Technological objective   Resource utilization
Business Objective        Lower TCO
What Problems Does IT Face?

– Traditional architecture: a series of "islands," with resources statically assigned to a specific service
– Demand for any service fluctuates over time, causing changes in the utilization of resources
– Consequently, it is hard to react to fluctuating demands, hard to utilize resources, and hard to scale

[Diagram: a data center partitioned into static resource islands]
What Is the Grid Computing Solution?

– Standardization: the server equipment, the OS platform, deployment methods, management procedures and tools
– Consolidation: network infrastructure, storage infrastructure, multiple application services
– Shared resources in a single environment for higher utilization
How Does Grid Computing Approach These Problems?

Traditional problems                        Grid approaches
Heterogeneous resources                     Standardization
Scattered islands                           Consolidation
Fluctuating demands                         Resource utilization
Components management                       Virtualization/automation
High cost of deployment and maintenance     Lower total cost of ownership
Enterprise Grid Computing and Information Lifecycle Management
Information Lifecycle Management

[Chart: data value (solid line, high to low) and data amount (dashed line, small to large) plotted against time for applications A through D]

Both the value and amount of data change over time!
Grid Computing and Information Lifecycle Management

[Diagram mapping the two models:]
– Enterprise Grid Computing Model: service consolidation, enterprise infrastructure, resource provisioning
– Grid Infrastructure Model (information management): tiered networked storage, end-to-end connectivity, data movement

Together, they deliver higher resource utilization with lower TCO.
Project MegaGrid Overview
What Is Project MegaGrid?

– Traditional IT problems: high maintenance, lack of scalability, lack of reusability…
– Collaborating on the Enterprise Grid Computing Model: standardization, consolidation, automation, utilization
– Industry leaders implementing grid technologies, allied as Project MegaGrid to develop best practices: merging technologies, integration, reducing technical risks
– Delivering white papers that provide design, deployment, and operational methodology recommendations
What Technical Areas Does Project MegaGrid Address?

Grid approaches               Project MegaGrid
Standardization               Standardizing on existing components and technologies from the partners
Consolidation                 Developing infrastructure design guidelines; deploying multiple applications
Resource utilization          Developing dynamic resource provisioning methodologies and guidelines
Virtualization/automation     Developing best practices for managing virtualized resources

Other standard operational issues: business continuity, backup/restore, disaster recovery.
What Is Project MegaGrid Trying to Accomplish?
– Consolidate partners' technologies
– Address each technical area introduced by employing the Grid Computing Model: How to design a large environment? How to build and scale? Performance? Management? Utilization? Provisioning? Why grid?
– Develop joint best practices for each technical area using real-world applications
– Deliver a series of joint technical white papers
– Accumulate the deliverables to build a joint reference architecture for Enterprise Grid Computing deployment
What Are the Project MegaGrid Activities?

Phase 1 (2004, 36 nodes): completed, with three white papers published
– Infrastructure design
– Deployment guidelines
– RAC scalability and performance

Phase 2 (2005, 66 nodes): completed, with another three white papers published
– Infrastructure scaling-out
– Performance monitoring and resource provisioning
– Multiple applications deployment
– Deployment guidelines refresh
– Configuration guidelines

Phase 3 (2006, 144 nodes): planning in progress
– Management integration
– Infrastructure scaling-out
– Business continuity
– Disaster recovery

Throughout: technology collaboration, best practices development, seamless integration.
What Components Are Used in the MegaGrid Environment?

– Oracle: Real Application Clusters, Database Control, Grid Control, E-Business Suite, Business Intelligence, Ultra Search
– Applications: Telco service-provisioning OLTP application
– Servers: Dell PowerEdge 1750/1850, Dell PowerEdge 7250
– Network: Cisco Catalyst 6509, Cisco/EMC MDS 9509
– Storage: EMC Symmetrix DMX 1000, CLARiiON CX700, Celerra CNS, Celerra NS700
– Software: EMC ControlCenter, EMC PowerPath, Red Hat Enterprise Linux 3.0
Anatomy of Project MegaGrid
Capacity Planning
Service Consolidation

[Diagram: clients accessing multiple applications]

– Applications: E-Business Suite, Telco OLTP, Business Intelligence, Ultra Search, Internet Directory
– Specific services are assigned a set of servers.
– A RAC database services each specific application: "EBS", "US", "BI", "OID", "TELCO"
– A site-wide single Oracle ASM configuration manages the disk groups for all the RAC databases: +EBS, +US, +BI, +OID, +TELCO
– Storage: Symmetrix DMX1000 device groups ("IA64", "IA32") and CLARiiON CX700 storage groups ("IA64", "IA32")
Capacity Planning Flow Primer

Services and servers (CPU):
1. Determine business functions and service-level objectives
2. Create services and define the workload based on the "Service, Module, and Action" model
3. Develop the server classes

Storage capacity:
1. Estimate initial data size and growth rate
2. Add the fault tolerance requirements to the size
3. Add the backup requirements to the size
4. Estimate required capacity

Storage performance:
1. Estimate aggregated IOPS
2. Break down IOPS per node
3. Calculate the total bandwidth requirement per node
4. Develop the storage classes

Design:
– Design the infrastructure

The "Capacity Planning" joint white paper contains detailed information.
Service and Server CPU Capacity Planning

1. Determine the business functions and service-level objectives (example below)
2. Create services and define the workload based on the "Service, Module, and Action" model
   – Instrumentation:
     execute dbms_application_info.set_module('RouteReport', 'DisplayRoute');
   – Service performance thresholds: DBMS_SERVER_ALERT.SET_THRESHOLD() with SERVICE_ELAPSED_TIME and SERVICE_CPU_TIME
   – Monitoring:
     execute dbms_monitor.serv_mod_act_stat_enable(service_name => 'Telco', module_name => 'RouteReport', action_name => 'DisplayRoute');
Application   Business Function   Response Time   Trans/h    Mode     Pct of load
Telco         Route Report        850 ms          254,840    online   46%
Telco         Status Change       1,100 ms        88,640     online   16%
…             …                   …               …          …        …
Telco         SUMMARY             < 1 sec         554,000
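As a quick sanity check on the table, the "Pct of load" column is just each business function's transactions per hour divided by the service total. A minimal sketch using the figures above:

```shell
# Percent of load = per-function trans/h over the service total (554,000 trans/h)
TOTAL=554000
ROUTE_REPORT=254840
STATUS_CHANGE=88640

echo "Route Report:  $((ROUTE_REPORT * 100 / TOTAL))%"   # 46%
echo "Status Change: $((STATUS_CHANGE * 100 / TOTAL))%"  # 16%
```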
Storage Capacity Planning

1. Estimate aggregated throughput and IOPS (e.g., 2 GB/sec, or 300,000 IOPS)
2. Estimate initial data size and growth rate for all the applications (e.g., 500 GB initial, doubling over two years, 1 TB total)
3. Add the fault tolerance requirements (e.g., 2 TB with RAID 1, 1.n TB with RAID 5)
4. Add the backup requirements to the size (e.g., an additional 1 TB for a full backup, another 1 TB for 5 incrementals)
5. Calculate the total bandwidth requirement per node (e.g., 2 GB/sec across 16 nodes = 128 MB/sec per node, or 300,000 / 16 = 18,750 IOPS per node)
6. Choose the appropriate storage class and build the configuration (e.g., 1,200 IOPS per spindle, 16-way striped = 19,200 IOPS per LUN)
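The per-node arithmetic in steps 5 and 6 can be sketched directly; the figures below are the slide's planning estimates, not measured values:

```shell
# Break the aggregate storage requirement down per node (example figures from the slide)
NODES=16
AGG_IOPS=300000
AGG_MB_PER_SEC=2048        # 2 GB/sec expressed in MB/sec
IOPS_PER_SPINDLE=1200
STRIPE_WIDTH=16

echo "IOPS per node:   $((AGG_IOPS / NODES))"                 # 18750
echo "MB/sec per node: $((AGG_MB_PER_SEC / NODES))"           # 128
echo "IOPS per LUN:    $((IOPS_PER_SPINDLE * STRIPE_WIDTH))"  # 19200
```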
Infrastructure Design
MegaGrid Service Architecture Concept

[Diagram, top to bottom:]
– Application Grid: application services (Payroll, Inventory, Report, Business Intelligence, Ultra Search) running on application servers; JDBC and SQL*Net connections down to the databases
– Database Grid: database servers running RAC database instances
– Storage Grid: Automatic Storage Management (ASM) disk groups and a recovery group over networked storage (FC-SW), backed by device groups on RAID 1+0, RAID 5, and ATA tiers, plus a NAS group
– Management and provisioning: EMC ControlCenter and Navisphere; SAN provisioning and ASM provisioning connect the layers
MegaGrid Infrastructure Concept

– Server farm: nodes a001–aNNN and b001–bNNN
– Storage farm: Storage 01 through Storage NN, plus NAS
– IP network: Public/App-DB, Private Interconnect, NAS/iSCSI, and Management segments; LAN/WAN uplinks
– SAN: Fabric 1 and Fabric 2

Server and storage farms are horizontally scalable ("scaling-out").
MegaGrid Topological Architecture

– Servers: Dell PowerEdge 1750/1850 and PowerEdge 7250, each running EMC PowerPath
– IP network: Cisco Catalyst 6509 with Management, Public, NAS, and Interconnect VLANs (1 GigE); each switch uplinks to 2 aggregation switches via 10 Gb
– SAN: two Cisco/EMC MDS 9509 switches (224 ports each) forming Fabric 1 and Fabric 2 (2 Gbps)
– Storage: Symmetrix DMX 1000 (32 ports, 16 per switch), CLARiiON CX700 (8 ports, 4 per switch), Celerra CNS (20 ports, 10 per switch), Celerra NS700 (8 IP ports, 4 per switch)
Infrastructure Design Summary

– Scalability design: multiple services = multiple server nodes; multi-tiered = sizable inter-server communication; clustering = multiple nodes accessing the same storage
– Capacity planning: application workloads; scale-out
– Design concerns: port density and bandwidth requirements between server nodes (ISL or big switches?); high availability (LACP, EMC PowerPath)
Storage Configuration and Provisioning
Storage Configuration

– Granular "building-block" model
– Volume Logix (Symmetrix) and Storage Group (CLARiiON) technology for provisioning
– Compliant with the existing "best practices" for performance
– Complementary to the Oracle ASM model

DMX 1000-P: 32 FA ports, 144 disks, 32 GB cache; 16-way metavolumes (RAID 1+0), 16-way BCV metavolumes (RAID 0)
CX 700: 8 front ports, 30 FC drives, 15 ATA drives; 4+1 RAID 5 groups on FC drives, 4+1 RAID 3 groups on ATA drives
NS 700: 12 GigE ports (2 x 6-port SP), 15 FC disks; 4+1 RAID 5 groups
Storage Provisioning Primer

[Diagram: server HBA interfaces, SAN switch, storage front-end adapters, cache, back-end adapters, and LUNs]

– LUN masking: creation of logical volumes and assignment of volumes to specific interfaces within the storage system; (1) the "export" ACL
– Port-based zoning: access control allowing communication between certain ports; (2) a "port-to-port" ACL
– WWN-based zoning: access control allowing communication between certain WWNs; (2) also a "port-to-port" ACL
– Device masking: access control allowing communication from certain WWNs to certain LUNs; (3) the "service" ACL, i.e., a LUN-to-WWN ACL (most granular)
MegaGrid Storage Provisioning Model: SAN

[Diagram: server nodes with dual HBA ports (P1, P2) attached to SAN Fabric 1 and Fabric 2, reaching DMX director ports on Director 3/SPA and Director 14/SPB, and device groups A and B]

1) LUN masking configuration on the DMX (one-time only): assign consistent device IDs per director port to all the devices.
2) Zoning configuration on the SAN switches (one-time only): create a zone per HBA (initiator) port (e.g., P1@1-001) with all the available DMX director ports (e.g., 3A0, 3B1, 14A1, 14B0) on the same switch. Note that in this model, zoning alone will not allow access to the devices; it merely allows communication between two points.
3) Device masking on the DMX, or Storage Group on the CX (on-demand, dynamic provisioning): allow access from a specific WWN (HBA port, or initiator) to certain devices by adding the entry to the Volume Logix database on the DMX or the Storage Group configuration on the CX.
MegaGrid Storage Provisioning Model: NAS

[Diagram: server nodes with dual NICs (eth2, eth3) attached to the NAS VLAN, reaching Celerra NS700 Data Movers 02 and 03 (ports cge0–cge3, aggregated with LACP), which export file systems /nas01–/nas04 combined as an FS Group]

1) Export configuration on the NS (one-time only): export certain NAS file systems through a certain Data Mover to a specific subnet (a security implication). Use an FS Group to combine file systems for ease of management with SnapSure.
2) Mount on server nodes (on-demand, dynamic provisioning): simply mount on server nodes using the "mount" command. To make the mount "permanent," add an entry to the /etc/fstab file.

NAS provisioning is the standard NFS procedure.
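Step 2 above is ordinary NFS client work: a one-off `mount -t nfs <server>:/nas01 /nas01`, made permanent with an /etc/fstab entry along these lines (the server name and mount options here are illustrative, not from the papers):

```
nas-dm02:/nas01  /nas01  nfs  rw,hard,intr  0 0
```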
Storage Configuration and Provisioning Summary

– Granular building-block configuration
– Manageable provisioning technology, device masking: Volume Logix, Storage Group
– High availability: PowerPath for multipath I/O; standard technologies such as LACP
Software Deployment and Configuration
MegaGrid Deployment and Provisioning Model

[Diagram: installable software packages (Oracle E-Business Suite, Oracle Database 10g, Telco OLTP, Oracle Internet Directory, Oracle Cluster Ready Services) installed on shared storage (applications), mounted on the server farm, which shares the data storage: +EBS, +TELCO, +OID, +US, +BI, and the recovery area]

1. Application software is installed on shared storage to avoid duplicate installation.
2. The software is mounted on a server when that server is dynamically allocated to run the service.
3. The server then accesses the data for that service on the shared storage.
Configuration Management

[Diagram: server farm, shared NAS devices, and Oracle ASM on shared SAN devices (disk groups +EBS, +OID, +TELCO)]

The shared NAS devices hold the executables, password files, init<SID>.ora files (pfiles), and system logs (trace).

1. Read the executables and the init.ora file from the NAS device.
2. Contact the ASM instance.
3. Return the spfile location on the ASM disk group.
4. Return the configuration.

Initialization parameters are stored on the ASM disk group; configuration for other apps is stored on the shared NAS devices. Then run the services.
Software Deployment and Configuration Summary

Technical issue            Lessons learned
Software installation      Utilize the shared NAS storage to avoid duplicating the installation, in terms of both the time consumed performing the tasks and the disk space.
Software deployment        Utilize the existing technologies built into Oracle Cluster Ready Services and Real Application Clusters.
Configuration management   Utilize ASM on the shared SAN to store the shared configuration for the database services. Utilize the shared NAS storage as the central location for configuration as well as log files. Utilize the existing techniques for naming conventions and other policy-based rules to manage the multiple instances.
Software Deployment and Configuration Summary

– Utilize NAS to install binaries once
– Naming conventions for consistency
Performance Monitoring and Scalability
Oracle Database 10g RAC Scalability

– Completed in Phase 1, announced at Oracle OpenWorld San Francisco 2004.
– A separate RAC database was deployed on each server-class cluster.
– PE1750 and PE7250 clusters were tested during Project MegaGrid Phase 1, using Dell/Intel servers with EMC platforms.
– An SMP cluster was tested on a Unix platform with FC-AL storage.
Oracle Database Performance Monitoring

– Monitoring business transactions
  – Externalized in V$SQL and V$SQLAREA
  – Set with OCIAttrSet, setEndToEndMetrics, or DBMS_APPLICATION_INFO
  – Hint: there is no direct way to correlate a SQL statement with a service. Specify the module name in such a way that the service can be identified.
– Database alerts
  – Thresholds can be defined with the DBMS_SERVER_ALERT PL/SQL package
  – The most important service levels are CPU time per call and response time per call

The "Performance Management" joint white paper has been published.
Oracle Database 10g Monitoring Using AWR

– AWR (Automatic Workload Repository)
  – "Statspack built into the kernel"
  – Metrics and statistics are collected in a database repository
  – The repository is stored in the SYSAUX tablespace
  – Snapshots are automatically collected every hour for the whole cluster
  – Across the cluster, each snapshot has the same snapshot ID
– Guidelines
  – If you want to keep snapshots for more than seven days, add more space to the SYSAUX tablespace
  – If more database instances are added to the database, add more space to the SYSAUX tablespace
  – After every installation and every update of one of the components, collect a new baseline. New baselines can be created with the CREATE_BASELINE procedure.
Storage Performance Monitoring

– Symmetrix DMX
  – EMC ControlCenter Performance Manager: throughput, service time, busy stats, and capacity utilization at per-physical-device granularity; trending information
  – EMC Solutions Enabler: simple command-line tools for throughput and busy stats; scriptable
– CLARiiON CX
  – EMC Navisphere Management Suite: throughput, service time, busy stats, and capacity utilization; trending information
  – NaviCLI: simple command-line tools for throughput and busy stats; scriptable
– Celerra NS/CNS
  – Celerra Manager: Web-based tool for throughput, service time, busy stats, and capacity utilization at the file system and device level

The "Performance Management" joint white paper has been published and is available at http://www.emc.com/megagrid/
Resource Provisioning
The Foundation of Resource Provisioning: Infrastructure

– End-to-end connectivity
  – All the servers can communicate with each other with no bottleneck (i.e., ISL).
  – All the servers can see all the storage devices, both SAN and NAS, so any combination of servers can assume running any application service.
– Key technologies
  – Enterprise-level storage systems (able to support a large number of clients)
  – Enterprise-level networking, both IP and SAN (avoiding the obvious physical limitation of an ISL bottleneck)
  – Advanced provisioning technology: Volume Logix (Symmetrix) and Storage Group (CLARiiON) to simplify storage provisioning
Enabler of Resource Provisioning: Application

– End-to-end monitoring
  – Mapping of the runtime environment, from the highest-level application down to the storage.
  – All the data have to be statically stored on specific storage, and they have to be easily identified with the storage management mechanism.
– Key technologies
  – Instrument the application: DBMS_APPLICATION_INFO.Set_Module()
  – Set alerts: DBMS_SERVER_ALERT.Set_Threshold()
  – Define Resource Manager plans: DBMS_RESOURCE_MANAGER
  – Enable more fine-grained statistics collection: DBMS_MONITOR
For More Information…

– White papers available since Q4 2004: Capacity Planning; Deployment Best Practices; Performance Monitoring for Large Clusters
– White papers newly available: Infrastructure Design; Resource Provisioning; Storage-Based Data Migration
– All are available from:
  – http://www.dell.com/megagrid
  – http://www.emc.com/megagrid
  – http://www.intel.com/go/megagrid
  – http://www.oracle.com/megagrid