Split Brain Detection Version 00

Nigel Bragg, September 4th, 2012


Page 1: Split Brain Detection Version  00

Split Brain Detection

Version 00

Nigel Bragg

September 4th, 2012


Page 2: Split Brain Detection Version  00

Introduction
from: new-haddock-RNNI-split-brain-avoidance-1210-v1.pdf

A “split-brain” situation arises when:
1. In normal operation, two (or more) devices depend upon a control path to coordinate their operation such that they function as a single virtual entity with a single identity; and
2. Upon failure of the common control path, the two (or more) devices operate independently but
   a) Each assumes the full functionality of the single virtual entity; and/or
   b) Each continues to use the identity of the single virtual entity.

• Split-brain issues are avoided if the solution is designed so that conditions 2a and 2b do not occur.
  – There are two general approaches to achieving this.


Page 3: Split Brain Detection Version  00

Approach A: Easy Split-Brain Avoidance

• Prevent condition 2b by:
  – Assuring that all devices, or all but one pre-determined device, always switch to a unique identity (different from the identity of the single virtual device) upon failure of the control path.
• Prevent condition 2a by either:
  – Assuring one and only one device assumes the full functionality of the single virtual device upon failure of the control path; or
  – Assuring that each device deterministically assumes a subset of the functionality that does not overlap or conflict with the subset assumed by another device.
• Link Aggregation, using the standard protocol without any changes running across the NNI, achieves this.
• Characterized as “easy” because this approach does not require distinguishing whether a node failure or a link failure resulted in the loss of the control path.
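The identity rule of Approach A can be illustrated with a minimal sketch. It is not taken from the slides: the names `Device`, `keeps_virtual`, `identity_after_control_path_failure` and `check_no_shared_identity` are hypothetical, and the sketch assumes each device already knows its own unique identity and whether it is the one pre-determined device allowed to keep the virtual identity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    own_identity: str            # unique per device, e.g. "A" or "B"
    keeps_virtual: bool = False  # at most one device is pre-determined to keep it


def identity_after_control_path_failure(device: Device, virtual_identity: str) -> str:
    """Return the identity a device uses once the control path is lost.

    No attempt is made to tell a node failure from a link failure.
    """
    return virtual_identity if device.keeps_virtual else device.own_identity


def check_no_shared_identity(devices, virtual_identity: str) -> bool:
    """True if no two devices end up with the same identity (condition 2b avoided)."""
    identities = [
        identity_after_control_path_failure(d, virtual_identity) for d in devices
    ]
    return len(identities) == len(set(identities))
```

Because the rule depends only on static configuration, each device can apply it locally the moment the control path is lost, which is what makes this approach "easy".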


Page 4: Split Brain Detection Version  00

Approach B: Hard Split-Brain Avoidance

• Prevent condition 2b by:
  – Assuring that one and only one device continues to operate with the identity of the single virtual device upon failure of the control path.
  – Note that with hard split-brain avoidance there is always one device continuing to operate with the identity of the single virtual device, whereas with easy split-brain avoidance there may or may not be a device that continues to operate with the identity of the single virtual device.
• Prevention of condition 2a:
  – The options for prevention of condition 2a are the same for both easy and hard split-brain avoidance. This is because once the identity issue is resolved, there are many possible ways to resolve the division of functionality.
• Characterized as “hard” because this approach requires distinguishing whether a node failure or a link failure resulted in the loss of the control path.


Page 5: Split Brain Detection Version  00

The reference model: Two Systems with Distributed Aggregation

[Figure: System A and System B, each drawn with four ports, and Emulated System C drawn with six ports. Link labels in the figure: Network Link, (possible) Network Link, Intra-Portal Link (could be virtual), Gateway Link (virtual).]

Each Network Port on System A advertises:
1. Actor_System = A
2. Actor_Key = Ax
3. A Port ID for each port, unique within A

Each Network Port on System B advertises:
1. Actor_System = B
2. Actor_Key = Bx
3. A Port ID for each port, unique within B

Each (non-Gateway) Port on System C advertises:
1. Actor_System = C
2. Actor_Key = Cn
3. A Port ID for each port, unique within C
where Cn is the same value on all of the ports.
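The three advertised fields can be pictured as a small record per port. The following is only a sketch: `PortAdvertisement` and `advertise` are hypothetical names, the port counts follow the figure, and the even three-plus-three split of System C's Port ID range is an assumption made here purely to keep the IDs unique within C.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortAdvertisement:
    actor_system: str  # Actor_System
    actor_key: str     # Actor_Key
    port_id: int       # Port ID, unique within the advertising system


def advertise(system: str, key: str, port_count: int, first_port_id: int = 1):
    """Build one advertisement per port, with Port IDs unique within `system`."""
    return [
        PortAdvertisement(system, key, first_port_id + i)
        for i in range(port_count)
    ]


# Network ports of the real systems advertise their own identity.
ports_a = advertise("A", "Ax", 4)
ports_b = advertise("B", "Bx", 4)

# Ports of the emulated system all carry the same key Cn. The two halves are
# given disjoint Port ID ranges (an assumption of this sketch) so that the
# IDs stay unique within C.
ports_c = advertise("C", "Cn", 3, first_port_id=1) + advertise("C", "Cn", 3, first_port_id=4)
```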

Page 6: Split Brain Detection Version  00

Split Brain Detection (1)

It is desirable to solve the “hard” split brain problem to ensure that a portal continues to operate as a single virtual device whichever node within it might fail,
• which in turn requires that we have a robust way of determining that a node has failed, and not just been partially disconnected.

Assertion
• it is necessary to check for node reachability by all possible paths before being entitled to regard it as dead

So
• normal “keep-alive” can be limited to run on the inter-DAS link (1),
but if that fails (e.g. from the PoV of A seeking to establish the reachability of B above)
1. we need to probe for network connectivity between A and B (2), and
2. we need to ascertain reachability of B via the DRNI (3)

If B is unreachable by all routes, it doesn’t matter if it has failed or not.
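The probing order described above can be sketched as follows. This is an illustration only: the function name and its parameters are hypothetical, and each probe is modelled as a callable returning True when the mate answers, without saying how the probe itself is carried out.

```python
from typing import Callable, Iterable


def mate_unreachable_by_all_paths(
    keepalive_ok: Callable[[], bool],
    fallback_probes: Iterable[Callable[[], bool]],
) -> bool:
    """Return True only if every path to the mate has been tried and failed.

    keepalive_ok    -- result of the normal keep-alive on the inter-DAS link (1)
    fallback_probes -- probes tried only after (1) fails, e.g. the node's own
                       network (2) and the DRNI (3)
    """
    if keepalive_ok():
        return False  # path (1) is up; no further probing is needed
    # any() stops at the first probe that succeeds
    return not any(probe() for probe in fallback_probes)
```

Passing the fallback probes as callables keeps the steady-state cost at one keep-alive; the additional probes run only once path (1) has failed.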

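[Figure: topology with nodes A, B, Y and Z, the labels C, W and DRNI, X marks, and three numbered paths: 1 (inter-DAS link), 2 (network connectivity between A and B) and 3 (DRNI links). A “?” marks the reachability in question.]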

Page 7: Split Brain Detection Version  00

Split Brain Detection (2)

If the inter-DAS link (1) fails (e.g. from the PoV of A seeking to establish the reachability of B above):

1. We need to probe for network connectivity between A and B (2); this should be straightforward:
   – LBM from MEP(Sys ID A) to MEP(Sys ID B)?
2. We need to ascertain the reachability of B via the DRNI (3):
   – it is not clear how to probe B directly from A (and be sure to use all the links (3) of the DRNI),
   – W may believe all links are a distributed LAG – poisoned reverse,
   so propose:
   – A could harvest from W the full list of Port IDs being offered by C:
     • and needs to request that this information is “fresh”,
   – but the mechanism must also handle a dual-homed legacy real node W:
     • is there a mechanism to allow this?

What then?

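[Figure: topology with nodes A, B, Y and Z, the labels C, W and DRNI, X marks, and three numbered paths: 1 (inter-DAS link), 2 (network connectivity between A and B) and 3 (DRNI links). A “?” marks the reachability in question.]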

Page 8: Split Brain Detection Version  00

Split Brain Detection (3)

What then?
• “Assure that one and only one device continues to operate with the identity of the single virtual device on failure of the control path”.

a) If a Node sees zero connectivity to its “mate” Node, it picks up the DRNI identity C;
b) If a Node has lost the inter-DAS link (1) and connectivity via its own network (2),
   • but some physical connectivity to its “mate” is advertised by W over the DRNI (3),
   • or that information is not available:
     – and so we must assume that connectivity exists,
   then the network behind A and B is severed:
   • Node A reverts to its “real” LAG parameters as A,
   • or would it be less disruptive to run its part of C using “last agreed parameters”?
c) Else use own network (2) to negotiate roles, or exchange DRCP messages

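[Figure: topology with nodes A, B, Y and Z, the labels C, W and DRNI, X marks, and three numbered paths: 1 (inter-DAS link), 2 (network connectivity between A and B) and 3 (DRNI links). A “?” marks the reachability in question.]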

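State of paths 1, 2 and 3 (0 = path lost, 1 = path available) and the case that applies:

1 2 3   case
0 0 0   a)
0 0 1   b)
0 1 0   c)
0 1 1   c)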
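Cases a) to c) above amount to a small decision function, sketched here under stated assumptions. The function name and its parameters are hypothetical; the inter-DAS link (1) is taken as already failed, and `None` stands for the situation where W's information about the DRNI (3) is not available.

```python
from typing import Optional


def action_on_control_path_loss(
    own_network_up: bool, drni_reports_mate: Optional[bool]
) -> str:
    """Pick case "a", "b" or "c" once the inter-DAS link (1) has failed.

    own_network_up    -- connectivity to the mate via the node's own network (2)
    drni_reports_mate -- True/False if W reports connectivity to the mate over
                         the DRNI (3); None if that information is unavailable
    """
    if own_network_up:
        return "c"  # negotiate roles or exchange DRCP messages over (2)
    if drni_reports_mate is None or drni_reports_mate:
        return "b"  # mate must be assumed alive: network behind A and B is severed
    return "a"      # zero connectivity: pick up the DRNI identity C
```

The sketch only selects the case. What a node does in case b), reverting to its “real” LAG parameters or running its part of C on “last agreed parameters”, is left open, as it is on the slide.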