Fault Tolerant CORBA (FT-CORBA) - Modeling and Analysis

István Majzik
Budapest University of Technology and Economics
Department of Measurement and Information Systems
June 2000




Introduction

• Basis:
  – FT-CORBA specification
  – UML-based automatic dependability modeling
• Topics:
  – Support for constructing optimal FT-CORBA schemes
  – Evaluation of existing architectures
• Part I: The FT-CORBA proposal
• Part II: UML-based dependability analysis
• Part III: Dependability modeling of FT-CORBA


Part I

The FT-CORBA Proposal


CORBA

• OMG CORBA: standard for open object-oriented systems
  – Provides transparent access to services of remote objects (like local method calls)
  – ORB: Object Request Broker; handles communication of requests/responses (location, activation, parameter passing, etc.)
    • IOR: interoperable object reference
    • GIOP: General Inter-ORB Protocol
    • IIOP: Internet Inter-ORB Protocol
  – IDL: Interface Definition Language; ensures consistency between client and server interfaces


FT-CORBA

• Goal: Fault tolerance in CORBA environment

• History:
  – April 1998: Request for Proposal issued
  – October 1998: Initial submissions
  – December 1999: Joint revised submission by Ericsson, Inprise, Iona, Lucent, Oracle, Sun, ...
  – April 2000: Final adopted specification


FT-CORBA Concepts

• Avoiding single points of failure (SPOF) in single (server) objects
• Fault tolerance by entity redundancy, fault detection and recovery
  – creation of (server) object groups
  – infrastructure to maintain object replicas
• Basic properties:
  – replication transparency (access independent of the number/location of replicas)
  – failure transparency (access independent of faulty server objects)


Fault Tolerance Domains

• FT domain:
  – Object groups of server object replicas
  – Single Replication Manager
• Object groups:
  – different hosts
  – single object per host
• Replication Manager:
  – Creation and management of object groups
  – Support of application-controlled management


Fault Tolerance Domain

[Figure: domains, object groups, hosts and replicas]


Architecture Overview

• Set of CORBA objects to support FT
  – Replication Manager
  – Fault Detector
  – Fault Notifier
  – Fault Analyzer
• ORB extensions
  – logging mechanism
  – recovery mechanism

• Commercial implementations?


Fault Tolerance Infrastructure


Replication Management

• Infrastructure-controlled case:
  – application: invokes the create_object() method of the RM
  – RM: invokes local factory objects on hosts
  – RM manages membership, consistency
• Application-controlled case:
  – application’s responsibility to manage replicas
• Parameters:
  – ReplicationStyle: stateless, cold / warm passive, active
  – MembershipStyle
  – ConsistencyStyle
  – InitialNumberReplicas, MinimumNumberReplicas
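The parameters listed above can be pictured as one record per object group. The following is only an illustrative Python sketch, not the IDL of the specification: the class names `GroupProperties` and `ControlStyle`, the assumption that membership and consistency styles each take an infrastructure-controlled or application-controlled value, and the two sanity checks are all hypothetical additions.

```python
from dataclasses import dataclass
from enum import Enum


class ReplicationStyle(Enum):
    # The four styles named on the slide.
    STATELESS = "stateless"
    COLD_PASSIVE = "cold passive"
    WARM_PASSIVE = "warm passive"
    ACTIVE = "active"


class ControlStyle(Enum):
    # Assumed value set, used here for MembershipStyle and ConsistencyStyle.
    INFRASTRUCTURE = "infrastructure controlled"
    APPLICATION = "application controlled"


@dataclass(frozen=True)
class GroupProperties:
    """Hypothetical container for the parameters of one object group."""
    replication_style: ReplicationStyle
    membership_style: ControlStyle
    consistency_style: ControlStyle
    initial_number_replicas: int
    minimum_number_replicas: int

    def __post_init__(self) -> None:
        # Sanity checks assumed for this sketch only.
        if self.minimum_number_replicas < 1:
            raise ValueError("MinimumNumberReplicas must be at least 1")
        if self.initial_number_replicas < self.minimum_number_replicas:
            raise ValueError("InitialNumberReplicas must not be below MinimumNumberReplicas")


props = GroupProperties(
    ReplicationStyle.WARM_PASSIVE,
    ControlStyle.INFRASTRUCTURE,
    ControlStyle.INFRASTRUCTURE,
    initial_number_replicas=3,
    minimum_number_replicas=2,
)
```

Keeping the parameters in one immutable record makes it easy to pass the same values to both a deployment step and a dependability model.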


Fault Detection and Notification

• Fault model:
  – object crash (incorrect results are not tolerated)
• Fault detection by polling
  – application objects inherit the PullMonitorable interface: is_alive() method
  – Fault Detector invokes it periodically
  – hierarchy of fault detectors
• Fault notification and fault analysis
• Parameters:
  – FaultMonitoring (Style, Granularity, IntervalAndTimeout)
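Pull-style monitoring can be sketched in a few lines of Python. This is a simplified illustration rather than the CORBA interfaces themselves: the `FaultDetector` class, its `monitor()` and `poll_once()` methods, and the use of an exception to stand in for a timeout are assumptions made for the example. Only `is_alive()` comes from the slide.

```python
from typing import Callable, Dict, List


class PullMonitorable:
    """Stand-in for the monitored interface: one is_alive() method."""

    def __init__(self) -> None:
        self.crashed = False

    def is_alive(self) -> bool:
        if self.crashed:
            # A crashed object would not answer at all; the exception
            # here plays the role of a timeout.
            raise TimeoutError("no response")
        return True


class FaultDetector:
    """Polls registered objects and reports the ones that do not answer."""

    def __init__(self, report: Callable[[str], None]) -> None:
        self._objects: Dict[str, PullMonitorable] = {}
        self._report = report  # e.g. the entry point of a fault notifier

    def monitor(self, name: str, obj: PullMonitorable) -> None:
        self._objects[name] = obj

    def poll_once(self) -> None:
        # One monitoring round; a real detector would repeat this once
        # per monitoring interval.
        for name, obj in list(self._objects.items()):
            try:
                alive = obj.is_alive()
            except Exception:
                alive = False
            if not alive:
                self._report(name)
                del self._objects[name]  # stop monitoring the crashed member


reports: List[str] = []
detector = FaultDetector(reports.append)
s1, s2 = PullMonitorable(), PullMonitorable()
detector.monitor("S1", s1)
detector.monitor("S2", s2)
detector.poll_once()   # both alive, nothing reported
s2.crashed = True
detector.poll_once()   # S2 is reported
detector.poll_once()   # S2 is no longer monitored
```

In this sketch a crash is reported exactly once, which is what a downstream consumer such as a replication manager would typically want.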


Logging and Recovery

• Application objects inherit:
  – Checkpointable interface: get_state(), set_state()
  – Updateable interface: get_update(), set_update()
• Logging Mechanism:
  – storing GIOP messages
  – periodically storing state of the objects
• Recovery Mechanism:
  – restore object state and retrieve stored messages
• Parameters:
  – CheckpointInterval
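The interplay of checkpoints and logged messages can be illustrated with a toy example in Python. The `Counter` and `Log` classes are hypothetical; only the method names `get_state()` and `set_state()` are taken from the slide, and plain integers stand in for GIOP messages.

```python
from typing import List, Optional


class Counter:
    """Toy application object with get_state()/set_state()."""

    def __init__(self) -> None:
        self.value = 0

    def add(self, amount: int) -> None:
        self.value += amount

    def get_state(self) -> int:
        return self.value

    def set_state(self, state: int) -> None:
        self.value = state


class Log:
    """Keeps the last checkpoint plus the requests received after it."""

    def __init__(self) -> None:
        self.checkpoint: Optional[int] = None
        self.messages: List[int] = []

    def record(self, amount: int) -> None:
        self.messages.append(amount)

    def take_checkpoint(self, obj: Counter) -> None:
        self.checkpoint = obj.get_state()
        self.messages.clear()  # older messages are covered by the checkpoint

    def recover(self) -> Counter:
        replica = Counter()
        if self.checkpoint is not None:
            replica.set_state(self.checkpoint)
        for amount in self.messages:  # replay requests logged since then
            replica.add(amount)
        return replica


primary, log = Counter(), Log()
for amount in (1, 2):
    log.record(amount)
    primary.add(amount)
log.take_checkpoint(primary)   # state 3 stored, log emptied
log.record(4)
primary.add(4)                 # state 7; assume the primary crashes here
backup = log.recover()
```

A shorter checkpoint interval means fewer messages to replay during recovery, at the cost of more frequent state transfers.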


Client Failover

• Identification of object groups:
  – IOGR: interoperable object group reference
  – multiple IIOP profiles addressing object group members or gateways
• Basic mechanisms of the client ORB:
  – retry all alternative IIOP profiles
  – transparent reinvocation of requests (“at most once” execution semantics at the server)
  – heartbeating of the server
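The retry and reinvocation mechanisms can be sketched as follows. This is a simplified Python illustration, not ORB code: the `Server` class, the `invoke_with_failover()` function, and the request-id based duplicate detection are assumptions chosen to show the idea of trying profiles in order while a repeated request is not executed twice on the same member.

```python
from typing import Dict, List, Sequence


class Server:
    """Toy group member that remembers replies by request id."""

    def __init__(self, name: str, up: bool = True) -> None:
        self.name = name
        self.up = up
        self.executed: List[str] = []
        self._replies: Dict[str, str] = {}

    def invoke(self, request_id: str, payload: str) -> str:
        if not self.up:
            raise ConnectionError(f"{self.name} unreachable")
        if request_id not in self._replies:
            # First time this request is seen: execute and remember it.
            self.executed.append(request_id)
            self._replies[request_id] = f"{self.name}:{payload.upper()}"
        return self._replies[request_id]  # repeated requests get the stored reply


def invoke_with_failover(profiles: Sequence[Server], request_id: str, payload: str) -> str:
    """Try each profile in order; fail only when all have been tried."""
    last_error: Exception = ConnectionError("no profiles")
    for server in profiles:
        try:
            return server.invoke(request_id, payload)
        except ConnectionError as exc:
            last_error = exc  # crash-like failure: move on to the next profile
    raise last_error


group = [Server("S1", up=False), Server("S2"), Server("S3")]
reply = invoke_with_failover(group, "req-1", "ping")
again = invoke_with_failover(group, "req-1", "ping")  # reinvocation, same id
```

Here the second call returns the stored reply from S2 without executing the request again, and S3 is never contacted.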



Part II

Dependability Modeling of Object-Oriented Systems Described in UML


Dependability Analysis

Approach by A. Bondavalli, I. Majzik, I. Mura
HIDE - High-level Integrated Design Environment for Dependability (ESPRIT Open LTR No. 27493)

• From UML-based models (class, object, deployment diagrams) to Timed Petri Nets
  – standard PN evaluation tools can be used
• Supports
  – comparison of design choices
  – identification of bottlenecks
• System-wide, structural model


Modeling Approach

1. UML model: diagrams with extensions
   – stereotypes to identify roles (variant, tester, ...)
   – tagged values to assign parameters
2. Intermediate model: simplified structure
   – elements: software, hardware, with/without states
   – dependencies: “uses the service of”, “is composed of”
   – class-based redundancy: fault tree
3. Dependability model: Timed Petri net
   – sub-nets for elements and dependencies


Failure/Propagation Sub-models

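[Figure: UML model elements (objects O1, O2; stereotype <<SF-SW>>) and the corresponding Petri net modules. Labels in the failure sub-model: H, E, F; fault, latency, restart. Labels in the propagation sub-model: E1, E2, H2; prop, no_prop; p, 1-p.]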


Repair Sub-model

[Figure: UML model (object O1, stereotype <<SF-HW>>) and the corresponding Petri net module. Labels: H, E, F, P1, P2, P3; fault, latency, t0, t1; perm, trans; implicit, explicit; m(E)=1, m(P1)=0, m(P1)=1.]


Redundancy Sub-models

[Figure: a redundancy structure shown as UML model, fault tree, and Petri net. Fault tree events: Variant1 failure, Variant2 failure, RedundancyManager failure, Subsystem failure. Other labels: G, P, SF, RM, V1, V2.]


Part III

Dependability Modeling of FT-CORBA Architectures


Approach

• UML models:
  – identification of elements/structures
  – additional parameters: support of automatic modeling
• Tailoring to FT-CORBA
  – subnets for specific mechanisms
  – based on the parameters
• Restrictions:
  – non-replicated client, static structure
  – infrastructure-controlled replication management


UML Modeling

• Identification of elements/structures
  – Fault Tolerance Domain: package
    • independent of deployment
  – Object groups: sub-package
  – Roles: stereotypes
• FT-CORBA properties as tagged values
  – ReplicationStyle
  – MembershipStyle
  – ConsistencyStyle
  – FaultMonitoring (Style, Granularity, Interval)
  – (Initial, Minimum) NumberReplicas


Overall Structure

[Figure: package structure of the model. Labels: FT Domain Alpha, Domain1, Domain2; FTI with RM, FN, FD; object groups OG1, OG2, OG3, OG4; S11, S12, FD1; C1, C2.]


Modularity

• Available building blocks:
  – failure subnet
  – propagation subnet
  – repair subnet
  – fault tree
• Sub-models in FT-CORBA:
  1. Client failover
  2. Server object failure
  3. Fault management (detection and notification)
  4. Recovery (replication management)


1. Client Failover

• Semantics:
  – Primary is tried first
  – Failover conditions: “crash”
    • Communication failure
    • No response
  – No failover: erroneous response
  – No failure exception until all profiles have been tried


Dependability Sub-model

Fault tree (passive replication):
– Top event: Client failure
– Basic events:
  • Server object crash
  • Server object erroneous response
– Composite events (OR), with n the number of profiles:
  • S1 (primary) erroneous
  • S1 crash AND S2 erroneous
  • S1 crash AND S2 crash AND S3 erroneous
  • ...
  • S1 crash AND S2 crash AND ... AND Sn crash
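The probability of the top event follows directly from the list of composite events, because the listed combinations exclude each other. The Python sketch below assumes that replicas fail independently and that each replica either crashes, answers erroneously, or answers correctly; the function name and the numeric values are illustrative only.

```python
from typing import Sequence


def client_failure_probability(crash: Sequence[float], erroneous: Sequence[float]) -> float:
    """Probability of the top event for profiles S1..Sn tried in order.

    crash[i] and erroneous[i] are the probabilities that replica i crashes
    or answers erroneously; the two outcomes exclude each other and the
    replicas are assumed independent.
    """
    if len(crash) != len(erroneous):
        raise ValueError("one crash and one erroneous probability per replica")
    total = 0.0
    all_previous_crashed = 1.0
    for c, e in zip(crash, erroneous):
        total += all_previous_crashed * e   # S1..S(k-1) crash AND Sk erroneous
        all_previous_crashed *= c
    return total + all_previous_crashed     # S1..Sn all crash


p_single = client_failure_probability([0.1], [0.01])
p_triple = client_failure_probability([0.1] * 3, [0.01] * 3)
```

With these example values, adding replicas reduces the contribution of crashes quickly, while the erroneous response of the primary remains as a floor on the client failure probability.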


2. Server Object Failure

• Distinction of failures:
  – Crash: failover in the client; error detected in the object group
  – Erroneous response (commission fault): propagated to clients; application-specific error detection


Dependability Sub-model

• Failure process:
  – failure subnet
  – distinguished cases: crash/erroneous response
• Propagation subnets
  – standard subnets (toward the client fault tree)

[Figure: failure subnet of a server object. Labels: H, E, F, C, ER; fault rate, latency; p, 1-p.]
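One possible reading of this failure subnet is a chain H → E → F followed by a probabilistic split into crash (C) and erroneous response (ER). The Python sketch below rests on several assumptions that are not stated on the slide: H, E and F are read as healthy, erroneous and failed, both timed transitions are exponential, p is the probability of a crash, and all numeric values are invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class FailureSubnet:
    fault_rate: float     # rate of the "fault" transition, H -> E (per hour)
    latency_rate: float   # rate of the "latency" transition, E -> F (per hour)
    p_crash: float        # assumed probability that the failure is a crash (C)

    def mean_time_to_failure(self) -> float:
        # Two exponential stages in sequence: mean time in H plus mean time in E.
        return 1.0 / self.fault_rate + 1.0 / self.latency_rate

    def outcome_probabilities(self) -> Dict[str, float]:
        return {"C": self.p_crash, "ER": 1.0 - self.p_crash}


server = FailureSubnet(fault_rate=1e-3, latency_rate=0.5, p_crash=0.9)
mttf = server.mean_time_to_failure()        # 1000 h in H plus 2 h in E
outcomes = server.outcome_probabilities()
```

The split matters for the rest of the model because only the crash branch triggers client failover, while the erroneous-response branch propagates to the client.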


3. Fault Management

• Fault detection + notification: chain of events
  – Source: Fault Detector
    • latency = MonitoringInterval
    • coverage depends on MonitoringGranularity: each member / single per host / single per host and type
  – Propagation: Fault Notifier(s)
    • communication failures
  – Destination: Replication Manager
• Hierarchy of Fault Detectors
• Infrastructure objects: replication is possible


Dependability Sub-model

• Error detection delay
  – timed PN transition
• Fault notification subsystem
  – fault tree (AND)
• Replicated infrastructure objects
  – local fault trees (AND)


4. Recovery in the Object Group

• Triggered by the Fault Notifier in the Replication Manager
• Goal: maintain the number of replicas
  – crashed object is removed
  – creation of new replica, restoring state
  – only a single replica on a given host!
• Repair is possible if:
  – current host is fault-free
  – current host is faulty, but there are available hosts, i.e. number of hosts >= NumberReplicas


Dependability Sub-model

• Repair subnet: explicit repair
  – latency: CheckpointInterval, ReplicationStyle
• Recovery of the replica:
  – Static deployment: standard repair subnet
  – Pool of identical hosts: logic condition for repair
    • Free hosts (PN place)
      – marking increased by host repair and server object crash
      – marking decreased by host crash and server object repair
    • Guard on the transition for explicit repair
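The bookkeeping of the “Free hosts” place and the guard on the repair transition can be mimicked with a small counter. The following Python sketch is a simplification under stated assumptions: the class and method names are hypothetical, and a host crash is assumed to remove a token only when a free host exists.

```python
class FreeHostsPlace:
    """Marking of the 'Free hosts' place, updated as described above."""

    def __init__(self, free_hosts: int) -> None:
        if free_hosts < 0:
            raise ValueError("marking cannot be negative")
        self.marking = free_hosts

    def host_repair(self) -> None:
        self.marking += 1

    def server_object_crash(self) -> None:
        self.marking += 1

    def host_crash(self) -> None:
        # Assumption: only the crash of a free host removes a token.
        if self.marking > 0:
            self.marking -= 1

    def repair_enabled(self) -> bool:
        # Guard on the explicit repair transition.
        return self.marking > 0

    def server_object_repair(self) -> bool:
        if not self.repair_enabled():
            return False
        self.marking -= 1
        return True


pool = FreeHostsPlace(free_hosts=1)
pool.host_crash()                       # the only spare host fails
blocked = pool.server_object_repair()   # guard is false, no repair
pool.host_repair()                      # the spare host is back
repaired = pool.server_object_repair()  # guard is true, one token consumed
```

The guard makes the repair of a server object depend on the state of the host pool, which is what distinguishes this case from the standard repair subnet used for static deployment.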


Overall Structure of Subnets

[Figure: overall structure of subnets. Blocks: Notification, Client Fault Tree (S1 err, S1 crash), Recovery, Repair, connected by propagation (Prop.) subnets. Parameters shown: NumberReplica, FaultMonitoringGranularity, FaultMonitoringInterval, ReplicationStyle, CheckpointInterval.]


System-wide Dependability Model

• Analysis of the Petri net:
  – standard tools (SPNP, PANDA, ...)
• Sensitivity analysis
  – system-wide reliability, availability
• Optimal selection of FT-CORBA parameters
  – replication (membership, consistency) styles
  – number of replicas
  – monitoring granularity, interval
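The idea of sweeping a parameter and picking the smallest value that meets a target can be shown with a deliberately simple stand-in model. The Python sketch below does not reproduce the Petri net analysis: it assumes independent replicas and a group that is available while at least one replica is up, and the function names and numbers are illustrative.

```python
from typing import Optional


def group_availability(replica_availability: float, replicas: int) -> float:
    # Toy stand-in for the Petri net analysis: the group is up while at
    # least one of its independent replicas is up.
    return 1.0 - (1.0 - replica_availability) ** replicas


def smallest_group(replica_availability: float, target: float,
                   max_replicas: int = 10) -> Optional[int]:
    """Smallest number of replicas that reaches the target, if any."""
    for n in range(1, max_replicas + 1):
        if group_availability(replica_availability, n) >= target:
            return n
    return None


sweep = {n: group_availability(0.99, n) for n in range(1, 5)}
needed = smallest_group(0.99, target=0.99999)
```

In a full analysis the availability function would be replaced by a run of a Petri net tool for each candidate parameter setting, with the same outer sweep.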