Disk Group Cannot Be Imported. Serial Split Brain Detected

Description : Disk Group Cannot be imported. Serial Split Brain Detected

Platform : Solaris

Software : Veritas Volume Manager

Category : How-To

Procedure

This document describes how to fix a serial split brain condition.

When you try to import the disk group, the import fails and vxconfigd logs errors like the following:

vxvm:vxconfigd: [ID 457036 daemon.notice] V-5-1-9576 Split Brain. da id is 0.2, while dm id is 0.1 for dm A0D5

May 10 16:34:33 tncdx15 vxvm:vxconfigd: [ID 220643 daemon.error] V-5-1-569 Disk group datadg, Disk c3t21d0s2: Cannot auto-import group:

As a result, the disk group is not imported automatically.

Cause : A disk was being replaced in the array.

A serial split brain condition arises when the "SSB_ID" values stored in the private region of the disks in a disk group do not match.

This can happen if a disk was taken out of the disk group (because of a failure, or to transfer data to another host).
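
The mismatch is already visible in the vxconfigd notice quoted earlier. The following is a minimal Python sketch that pulls the two ids and the disk media name out of such a line. The regular expression is modelled only on the sample message in this document, and the function name is hypothetical. Judging from the outputs in this document, the "da id" appears to be the value found on the disk and the "dm id" the value the configuration expects.

```python
import re

# Pattern modelled on the V-5-1-9576 notice quoted in this document.
# Assumption: other VxVM releases may word the message differently.
SPLIT_BRAIN_RE = re.compile(
    r"V-5-1-9576 Split Brain\.\s+da id is (?P<da_id>\S+?),\s+"
    r"while dm id is (?P<dm_id>\S+) for dm (?P<dm_name>\S+)"
)


def parse_split_brain(line):
    """Return da id, dm id and disk media name from one log line, or None."""
    match = SPLIT_BRAIN_RE.search(line)
    return match.groupdict() if match else None


sample = ("vxvm:vxconfigd: [ID 457036 daemon.notice] V-5-1-9576 Split Brain. "
          "da id is 0.2, while dm id is 0.1 for dm A0D5")
info = parse_split_brain(sample)
print(info)
```

Running this over the system log gives a quick list of the disk media names that vxconfigd complained about before you start working with vxsplitlines.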

Solution

First, run the "vxsplitlines" command on the disk group. Its output shows which disks are affected by the serial split brain.

# vxsplitlines -g <disk group name>

For example:

# vxsplitlines -g datadg

VxVM vxsplitlines NOTICE V-5-2-2708 There are 1 pools.

The Following are the disks in each pool. Each disk in the same pool

has config copies that are similar.

VxVM vxsplitlines INFO V-5-2-2707 Pool 0.

c3t0d0s2 A0D1

To see the configuration copy from this disk issue

# /etc/vx/diag.d/vxprivutil dumpconfig /dev/vx/dmp/c3t0d0s2

To import the diskgroup with config copy from this disk use the following command

# /usr/sbin/vxdg -o selectcp=1141218744.29.tncdx15 import datadg

The following are the disks whose ssb ids don't match in this config copy

A0D3

A0D5

The output above reports that disks A0D3 and A0D5 are affected by the split brain. To verify this, run the following command:

# vxsplitlines -g <disk group> -c <disk name>

For example:

# vxsplitlines -g datadg -c c3t0d0s2

VxVM vxsplitlines INFO V-5-2-2701 DANAME(DMNAME) || Actual SSB || Expected SSB

VxVM vxsplitlines INFO V-5-2-2700 c3t0d0s2( A0D1 ) || 0.1 || 0.1 ssb ids match

VxVM vxsplitlines INFO V-5-2-2700 c3t1d0s2( A0D2 ) || 0.1 || 0.1 ssb ids match

VxVM vxsplitlines INFO V-5-2-2700 c3t2d0s2( A0D3 ) || 0.2 || 0.1 ssb ids don't match

VxVM vxsplitlines INFO V-5-2-2700 c3t3d0s2( A0D4 ) || 0.1 || 0.1 ssb ids match

VxVM vxsplitlines INFO V-5-2-2700 c3t4d0s2( A0D5 ) || 0.2 || 0.1 ssb ids don't match

VxVM vxsplitlines INFO V-5-2-2700 c3t5d0s2( A0D6 ) || 0.1 || 0.1 ssb ids match

VxVM vxsplitlines INFO V-5-2-2700 c3t6d0s2( A0D7 ) || 0.1 || 0.1 ssb ids match

VxVM vxsplitlines INFO V-5-2-2700 c3t7d0s2( A0D8 ) || 0.1 || 0.1 ssb ids match

VxVM vxsplitlines INFO V-5-2-2700 c3t9d0s2( A0D10 ) || 0.1 || 0.1 ssb ids match

VxVM vxsplitlines INFO V-5-2-2700 c3t16d0s2( A0D12 ) || 0.1 || 0.1 ssb ids match

The output above shows that A0D3 and A0D5 have an ssb_id (0.2) that differs from the expected value (0.1).
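
With many disks in a group, the mismatching rows are easy to overlook. The sketch below, in Python, filters them out of saved "vxsplitlines -c" output. It assumes the row layout shown above (message id V-5-2-2700, then DANAME( DMNAME ) || actual || expected), and the function name is made up for illustration.

```python
import re

# One data row of "vxsplitlines -g <dg> -c <disk>" as shown in this document.
# Assumption: rows always carry message id V-5-2-2700 in this layout.
ROW_RE = re.compile(
    r"V-5-2-2700\s+(\S+?)\(\s*(\S+)\s*\)\s*\|\|\s*(\S+)\s*\|\|\s*(\S+)"
)


def mismatched_disks(output):
    """Return (daname, dmname, actual, expected) for rows whose ids differ."""
    rows = []
    for line in output.splitlines():
        match = ROW_RE.search(line)
        if match and match.group(3) != match.group(4):
            rows.append(match.groups())
    return rows


sample = """\
VxVM vxsplitlines INFO V-5-2-2701 DANAME(DMNAME) || Actual SSB || Expected SSB
VxVM vxsplitlines INFO V-5-2-2700 c3t0d0s2( A0D1 ) || 0.1 || 0.1 ssb ids match
VxVM vxsplitlines INFO V-5-2-2700 c3t2d0s2( A0D3 ) || 0.2 || 0.1 ssb ids don't match
VxVM vxsplitlines INFO V-5-2-2700 c3t3d0s2( A0D4 ) || 0.1 || 0.1 ssb ids match
VxVM vxsplitlines INFO V-5-2-2700 c3t4d0s2( A0D5 ) || 0.2 || 0.1 ssb ids don't match
"""
bad = mismatched_disks(sample)
for daname, dmname, actual, expected in bad:
    print(daname, dmname, "actual", actual, "expected", expected)
```

The comparison is done on the two id columns rather than on the trailing "ssb ids match" text, so it does not depend on the exact wording of that remark.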

The ssb_id can also be verified by running "vxdisk list" on the disk.

# vxdisk list c3t0d0s2

devicetag: c3t0d0

type: auto

hostid: tncdx15

disk: name= id=1141218744.29.tncdx15

group: name=datadg id=1141218774.31.tncdx15

info: format=cdsdisk,privoffset=256,pubslice=2,privslice=2

flags: online ready private autoconfig autoimport

pubpaths: block=/dev/vx/dmp/c3t0d0s2 char=/dev/vx/rdmp/c3t0d0s2

version: 3.1

iosize: min=512 (bytes) max=2048 (blocks)

public: slice=2 offset=2304 len=71124864 disk_offset=0

private: slice=2 offset=256 len=2048 disk_offset=0

update: time=1178813226 seqno=0.123415

ssb: actual_seqno=0.1
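
Two fields of this output matter for the rest of the procedure: the disk id on the "disk:" line, which is the value later passed to selectcp, and the actual_seqno on the "ssb:" line. As a rough Python sketch, assuming the "key: value" layout shown above (the function name is hypothetical), they can be extracted like this:

```python
import re


def disk_id_and_ssb(vxdisk_output):
    """Return (disk id, ssb actual_seqno) from 'vxdisk list <disk>' output."""
    disk_id = None
    ssb = None
    for line in vxdisk_output.splitlines():
        line = line.strip()
        if line.startswith("disk:"):
            # The "disk:" line looks like "disk: name= id=<disk id>".
            match = re.search(r"\bid=(\S+)", line)
            if match:
                disk_id = match.group(1)
        elif line.startswith("ssb:"):
            match = re.search(r"actual_seqno=(\S+)", line)
            if match:
                ssb = match.group(1)
    return disk_id, ssb


# Excerpt of the output shown in this document.
sample = """\
devicetag: c3t0d0
hostid: tncdx15
disk: name= id=1141218744.29.tncdx15
group: name=datadg id=1141218774.31.tncdx15
update: time=1178813226 seqno=0.123415
ssb: actual_seqno=0.1
"""
print(disk_id_and_ssb(sample))
```

Collecting this pair for every disk in the group gives a compact view of which disks share an ssb_id.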

Compare the "vxdisk list" output of the various disks in the disk group. Several disks may share the same ssb_id, but that does not necessarily mean those disks hold the latest configuration copy.

To find out which disk has the latest configuration copy, run the following command on several disks in the disk group:

# /etc/vx/diag.d/vxprivutil dumpconfig /dev/rdsk/c3t0d0s2 > dump_c3t0d0s2

(Make sure the device path points at the slice that holds the private region; otherwise dumpconfig does not produce proper output.)

# /etc/vx/diag.d/vxprivutil dumpconfig /dev/rdsk/c3t2d0s2 > dump_c3t2d0s2

# /etc/vx/diag.d/vxprivutil dumpconfig /dev/rdsk/c3t7d0s2 > dump_c3t7d0s2

From the various dumpconfig outputs, make a note of the following information:

              dump_c3t0d0s2   dump_c3t2d0s2   dump_c3t3d0s2
update_tid    0.1027          0.1027          0.1027
config_tid    0.1355          0.1357          0.1355
ssb_id        0.1             0.2             0.1
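
The comparison of the noted values can be sketched in Python as follows. This is an illustration only: the dump texts below contain just the three values noted above rather than full dumpconfig output, the helper names are hypothetical, and it assumes that a tid is a pair of integers separated by a dot, so it is compared part by part rather than as a decimal number. Where a key occurs more than once in a real dump, this sketch simply takes the first occurrence.

```python
import re


def tid(text):
    """Turn '0.1357' into (0, 1357) so that ids compare part by part."""
    return tuple(int(part) for part in text.split("."))


def read_ids(dump_text):
    """Pick the first update_tid, config_tid and ssb_id found in a dump."""
    ids = {}
    for key in ("update_tid", "config_tid", "ssb_id"):
        match = re.search(r"\b%s\s*=\s*(\d+\.\d+)" % key, dump_text)
        if match:
            ids[key] = match.group(1)
    return ids


# Only the values noted in this document, not complete dumpconfig output.
dumps = {
    "dump_c3t0d0s2": "update_tid = 0.1027\nconfig_tid = 0.1355\nssb_id =0.1\n",
    "dump_c3t2d0s2": "update_tid=0.1027\nconfig_tid =0.1357\nssb_id =0.2\n",
    "dump_c3t3d0s2": "update_tid = 0.1027\nconfig_tid = 0.1355\nssb_id=0.1\n",
}

latest = max(dumps, key=lambda name: tid(read_ids(dumps[name])["config_tid"]))
print(latest, read_ids(dumps[latest]))
```

Comparing the parts as integers avoids the trap that a plain decimal comparison would rank 0.9 above 0.10.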

This is where it becomes a bit confusing: dump_c3t2d0s2 has the latest config_tid (0.1357), yet its ssb_id (0.2) does not match the expected ssb_id (0.1).

To resolve this, reconstruct vxprint output from the "vxprivutil" dump:

# cat dump_c3t2d0s2 | vxprint -ht -D -

Disk group: datadg

DG NAME NCONFIG NLOG MINORS GROUP-ID

ST NAME STATE DM_CNT SPARE_CNT APPVOL_CNT

DM NAME DEVICE TYPE PRIVLEN PUBLEN STATE

RV NAME RLINK_CNT KSTATE STATE PRIMARY DATAVOLS SRL

RL NAME RVG KSTATE STATE REM_HOST REM_DG REM_RLNK

CO NAME CACHEVOL KSTATE STATE

VT NAME NVOLUME KSTATE STATE

V NAME RVG/VSET/CO KSTATE STATE LENGTH READPOL PREFPLEX UTYPE

PL NAME VOLUME KSTATE STATE LENGTH LAYOUT NCOL/WID MODE

SD NAME PLEX DISK DISKOFFS LENGTH [COL/]OFF DEVICE MODE

SV NAME PLEX VOLNAME NVOLLAYR LENGTH [COL/]OFF AM/NM MODE

SC NAME PLEX CACHE DISKOFFS LENGTH [COL/]OFF DEVICE MODE

DC NAME PARENTVOL LOGVOL

SP NAME SNAPVOL DCO

dg datadg default default 55000 1141218774.31.tncdx15

dm A0D1 - - - - -

dm A0D2 - - - - -

dm A0D3 - - - - -

dm A0D4 - - - - -

dm A0D5 - - - - -

dm A0D6 - - - - -

dm A0D7 - - - - -

dm A0D8 - - - - -

dm A0D9 - - - - -

dm A0D10 - - - - -

dm A0D11 - - - - -

dm A0D12 - - - - -

dm A0D13 - - - - -

dm A0D14 - - - - SPARE

dm A0D15 - - - - SPARE

dm A0D16 - - - - REMOVED

dm A0D17 - - - - -

dm A0D18 - - - - -

dm A0D19 - - - - -

dm A0D20 - - - - -

dm A0D21 - - - - -

dm A0D22 - - - - -

v db001_v - DISABLED ACTIVE 25165824 SELECT - fsgen

pl db001_v-01 db001_v DISABLED RECOVER 25165824 CONCAT - RW

sd A0D1-01 db001_v-01 A0D1 0 20971520 0 - DIS

sd A0D1-08 db001_v-01 A0D1 20974688 4194304 20971520 - DIS

pl db001_v-02 db001_v DISABLED RECOVER 25165824 CONCAT - RW

sd A0D16-01 db001_v-02 A0D16 0 25165824 0 - DIS

pl db001_v-03 db001_v DISABLED RECOVER LOGONLY CONCAT - RW

sd A0D1-02 db001_v-03 A0D1 20971520 528 LOG - DIS

v db002_v - DISABLED ACTIVE 6291456 SELECT - fsgen

pl db002_v-01 db002_v DISABLED RECOVER 6291456 STRIPE 2/128 RW

sd A0D2-01 db002_v-01 A0D2 0 3145728 0/0 - RLOC

sd A0D3-01 db002_v-01 A0D3 0 3145728 1/0 - DIS

pl db002_v-02 db002_v DISABLED ACTIVE 6291456 STRIPE 2/128 RW

sd A0D17-01 db002_v-02 A0D17 0 3145728 0/0 - DIS

sd A0D18-01 db002_v-02 A0D18 0 3145728 1/0 - DIS

pl db002_v-03 db002_v DISABLED RECOVER LOGONLY CONCAT - RW

sd A0D1-03 db002_v-03 A0D1 20972048 528 LOG - DIS

v db003_v - DISABLED ACTIVE 8388608 SELECT - fsgen

pl db003_v-01 db003_v DISABLED RECOVER 8388608 CONCAT - RW

sd A0D4-01 db003_v-01 A0D4 0 8388608 0 - DIS

pl db003_v-02 db003_v DISABLED RECOVER 8388608 CONCAT - RW

sd A0D19-UR-001 db003_v-02 A0D19 0 8388608 0 - RLOC

pl db003_v-03 db003_v DISABLED RECOVER LOGONLY CONCAT - RW

sd A0D1-04 db003_v-03 A0D1 20972576 528 LOG - DIS

v db004_v - DISABLED ACTIVE 6291456 SELECT - fsgen

pl db004_v-01 db004_v DISABLED ACTIVE 6291456 CONCAT - RW

sd A0D5-01 db004_v-01 A0D5 0 6291456 0 - DIS

pl db004_v-02 db004_v DISABLED ACTIVE 6291456 CONCAT - RW

sd A0D20-01 db004_v-02 A0D20 0 6291456 0 - DIS

pl db004_v-03 db004_v DISABLED RECOVER LOGONLY CONCAT - RW

sd A0D1-05 db004_v-03 A0D1 20973104 528 LOG - DIS

v db005_v - DISABLED ACTIVE 12582912 SELECT - fsgen

pl db005_v-01 db005_v DISABLED ACTIVE 12582912 CONCAT - RW

sd A0D6-01 db005_v-01 A0D6 0 12582912 0 - DIS

pl db005_v-02 db005_v DISABLED ACTIVE 12582912 CONCAT - RW

sd A0D21-01 db005_v-02 A0D21 0 12582912 0 - DIS

pl db005_v-03 db005_v DISABLED RECOVER LOGONLY CONCAT - RW

sd A0D1-06 db005_v-03 A0D1 20973632 528 LOG - DIS

v db006_v - DISABLED ACTIVE 10485760 SELECT - fsgen

pl db006_v-01 db006_v DISABLED ACTIVE 10485760 CONCAT - RW

sd A0D7-01 db006_v-01 A0D7 0 10485760 0 - DIS

pl db006_v-02 db006_v DISABLED ACTIVE 10485760 CONCAT - RW

sd A0D22-01 db006_v-02 A0D22 0 10485760 0 - DIS

pl db006_v-03 db006_v DISABLED RECOVER LOGONLY CONCAT - RW

sd A0D1-07 db006_v-03 A0D1 20974160 528 LOG - DIS

v repository - DISABLED ACTIVE 20971520 SELECT - fsgen

pl repository-01 repository DISABLED ACTIVE 20971776 STRIPE 6/128 RW

sd A0D8-01 repository-01 A0D8 0 3495296 0/0 - DIS

sd A0D9-01 repository-01 A0D9 0 3495296 1/0 - DIS

sd A0D10-01 repository-01 A0D10 0 3495296 2/0 - DIS

sd A0D11-01 repository-01 A0D11 0 3495296 3/0 - DIS

sd A0D12-01 repository-01 A0D12 0 3495296 4/0 - DIS

sd A0D13-01 repository-01 A0D13 0 3495296 5/0 - DIS

#

Check with the customer whether the generated output looks correct. If it does, you can import the disk group with the configuration copy present on this disk.

# vxdg -o selectcp=<disk id> import <diskgroup>

For example:

# /usr/sbin/vxdg -o selectcp=1141219312.37.tncdx15 import datadg

[ Note that the import may still fail at this point. If it does, add the -Cf options to vxdg (-C clears the existing import locks, -f forces the import):

# /usr/sbin/vxdg -Cf -o selectcp=1141219312.37.tncdx15 import datadg ]

Confirm that the disk group is imported.

# vxdisk list

Start the volume.

# vxvol -g <diskgroup> start <volume name>

(If plexes are in the RECOVER state, you need to follow the plex recovery procedure.)
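
To see at a glance which volumes have plexes in the RECOVER state, the "pl" rows of "vxprint -ht" output can be filtered. The Python sketch below assumes the column order given by the PL header line of the vxprint output earlier in this document (NAME, VOLUME, KSTATE, STATE), uses rows copied from that output as sample data, and has a made-up function name.

```python
def plexes_in_recover(vxprint_output):
    """Map volume name -> list of its plexes whose STATE is RECOVER."""
    result = {}
    for line in vxprint_output.splitlines():
        fields = line.split()
        # Plex rows: pl NAME VOLUME KSTATE STATE LENGTH LAYOUT NCOL/WID MODE
        if len(fields) >= 5 and fields[0] == "pl" and fields[4] == "RECOVER":
            result.setdefault(fields[2], []).append(fields[1])
    return result


# Rows copied from the reconstructed vxprint output in this document.
sample = """\
PL NAME VOLUME KSTATE STATE LENGTH LAYOUT NCOL/WID MODE
v db001_v - DISABLED ACTIVE 25165824 SELECT - fsgen
pl db001_v-01 db001_v DISABLED RECOVER 25165824 CONCAT - RW
pl db001_v-02 db001_v DISABLED RECOVER 25165824 CONCAT - RW
pl db001_v-03 db001_v DISABLED RECOVER LOGONLY CONCAT - RW
v db004_v - DISABLED ACTIVE 6291456 SELECT - fsgen
pl db004_v-01 db004_v DISABLED ACTIVE 6291456 CONCAT - RW
pl db004_v-02 db004_v DISABLED ACTIVE 6291456 CONCAT - RW
pl db004_v-03 db004_v DISABLED RECOVER LOGONLY CONCAT - RW
"""
recover = plexes_in_recover(sample)
for volume, plexes in recover.items():
    print(volume, plexes)
```

In this sample every plex of db001_v needs recovery, whereas db004_v has two ACTIVE data plexes and only its log plex in RECOVER.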

Mount the volume.

# mount -F <fs type> /dev/vx/dsk/<diskgroup>/<volume name> <mount point>

(You may be asked to run fsck first.)

Note : When the disk group is imported, the ssb_id parameter on all disks is reset to 0.0.