VSX Troubleshooting - CPUG

(c) Valeri Loukine 2010 CPUG 2010 Chur Switzerland VSX Troubleshooting Quick guide

Transcript of VSX Troubleshooting - CPUG

Page 1: VSX Troubleshooting - CPUG

(c) Valeri Loukine 2010 CPUG 2010 Chur Switzerland

VSX Troubleshooting

Quick guide

Page 2: VSX Troubleshooting - CPUG

Agenda

How VSX is built (in brief)

Management scheme

Gateway architecture

Licensing

Issues to fix

Tools and methods

Page 3: VSX Troubleshooting - CPUG

Reference Note

Pictures from publicly available Check Point documents are used in this presentation

Information from Check Point troubleshooting documentation is used in this presentation

Page 4: VSX Troubleshooting - CPUG

Management Side

VSX Architecture

Page 5: VSX Troubleshooting - CPUG

MGMT model of VSX

Three-tier infrastructure

Two types

SmartCenter

Provider-1

Page 6: VSX Troubleshooting - CPUG

SmartCenter Model

Nothing special

Page 7: VSX Troubleshooting - CPUG

Provider-1 Model

Virtual Systems are managed by different CMAs

A special, so-called “Main CMA” manages the VSX cluster objects

“Target CMAs” to manage particular VSs

Page 8: VSX Troubleshooting - CPUG

Provider-1 Model

Page 9: VSX Troubleshooting - CPUG

MGMT DB objects

Two object types instead of the single type used for a regular FW

network_object - security aspects of a Virtual Device

vs_slot_objects - networking aspects of a Virtual Device

network_object - on the Target CMA

vs_slot objects - on Main CMA

Page 10: VSX Troubleshooting - CPUG

vs_slot objects

Special DB table called vs_slot_objects

Network interfaces of VS

Routes of VS

Other VS specific attributes, such as reference to hosting VSX object

Vital info for creation and change of VS

Page 11: VSX Troubleshooting - CPUG

Network Configuration Scripts

Configuration changes are pushed to VSX from MGMT with NCS

NCS are generated on MGMT

On VSX GW NCS are parsed and executed

Page 12: VSX Troubleshooting - CPUG

Types of NCS

local.vs - NCS file - last configuration change

local.vsall - NCS file, full configuration, executed on startup

local.vskeep - contains list of existing VSIDs, used at system startup

Page 13: VSX Troubleshooting - CPUG

local.vs

Each vs_slot object has 2 attributes containing interface lists:
- interfaces - interface configuration to be installed
- interfaces_installed - existing interface configuration

Each vs_slot object has 2 attributes containing route lists:
- routes - routing table to be installed
- routes_installed - existing routing table

The local.vs file is created by calculating the difference between “interfaces” and “interfaces_installed” and between “routes” and “routes_installed”
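
The comparison behind local.vs can be sketched in Python. This is only an illustration, not Check Point's implementation: the function name ncs_delta and the text format of the entries are assumptions; only the four attribute names come from the slide.

```python
def ncs_delta(desired, installed):
    """Return (to_add, to_remove) between a desired and an installed list.

    The order of the input lists is preserved so the result is stable.
    """
    installed_set = set(installed)
    desired_set = set(desired)
    to_add = [item for item in desired if item not in installed_set]
    to_remove = [item for item in installed if item not in desired_set]
    return to_add, to_remove


# Hypothetical vs_slot attributes; the entry format is invented for illustration.
vs_slot = {
    "interfaces": ["eth1.100 10.1.1.1/24", "eth1.200 10.2.2.1/24"],
    "interfaces_installed": ["eth1.100 10.1.1.1/24", "eth1.300 10.3.3.1/24"],
    "routes": ["0.0.0.0/0 via 10.1.1.254"],
    "routes_installed": ["0.0.0.0/0 via 10.1.1.254"],
}

if_add, if_remove = ncs_delta(vs_slot["interfaces"], vs_slot["interfaces_installed"])
rt_add, rt_remove = ncs_delta(vs_slot["routes"], vs_slot["routes_installed"])
print("interfaces to add:", if_add)
print("interfaces to remove:", if_remove)
print("route changes:", rt_add, rt_remove)
```

When the two lists of a pair are identical, both results are empty, which corresponds to "nothing to push" for that pair.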

Page 14: VSX Troubleshooting - CPUG

local.vsall and local.vskeep

Each Virtual Device has 2 NCS files kept on the management:
- VS_name.vsnew - NCS file containing interfaces
- VS_name.vsrt - NCS file containing routes

These files are updated each time a configuration change is made in SmartDashboard.

local.vsall is created by concatenating all the files of all the Virtual Devices.

local.vskeep is created by going over the vs_slot_objects table and writing the VSIDs of all the Virtual Devices to it.
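
A rough Python sketch of how these two files could be assembled. Everything beyond the file suffixes is an assumption made for illustration: the function names, the order in which .vsnew and .vsrt are joined, the one-VSID-per-line layout and the placeholder file contents.

```python
import tempfile
from pathlib import Path


def build_vsall(repo_dir, vs_names):
    """Concatenate <name>.vsnew and <name>.vsrt for every Virtual Device."""
    parts = []
    for name in vs_names:
        for suffix in (".vsnew", ".vsrt"):  # order is an assumption
            path = Path(repo_dir) / (name + suffix)
            if path.exists():
                parts.append(path.read_text())
    return "".join(parts)


def build_vskeep(vs_slot_objects):
    """Write the VSID of every entry of a (hypothetical) vs_slot_objects table."""
    return "".join(f"{obj['vsid']}\n" for obj in vs_slot_objects)


# Demo with placeholder file contents in a temporary directory.
with tempfile.TemporaryDirectory() as repo:
    for name in ("vs_a", "vs_b"):
        (Path(repo) / f"{name}.vsnew").write_text(f"# interfaces of {name}\n")
        (Path(repo) / f"{name}.vsrt").write_text(f"# routes of {name}\n")
    vsall = build_vsall(repo, ["vs_a", "vs_b"])

vskeep = build_vskeep([{"name": "vs_a", "vsid": 1}, {"name": "vs_b", "vsid": 2}])
print(vsall, end="")
print(vskeep, end="")
```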

Page 15: VSX Troubleshooting - CPUG

Location of NCS files

All .vsnew and .vsrt files - in $FWDIR/conf/vs_repository/VSX_NAME of Main CMA

MGMT: local.vs, local.vsall and local.vskeep files - in $FWDIR/state/VSX_NAME/VSX/

VSX GW: local.vs, local.vsall and local.vskeep files - in $FWDIR/state/__tmp/VSX/

If the scripts are processed successfully, they are copied to the $FWDIR/state/local/VSX/ directory
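
The staging behaviour on the gateway can be pictured with a short Python sketch. It only mimics the copy step against a throwaway directory standing in for $FWDIR; the function name promote_ncs and the processed_ok flag are hypothetical, and the real gateway logic is not shown in the slides.

```python
import shutil
import tempfile
from pathlib import Path

NCS_FILES = ("local.vs", "local.vsall", "local.vskeep")


def promote_ncs(fwdir, processed_ok):
    """Copy NCS files from state/__tmp/VSX to state/local/VSX on success."""
    staging = Path(fwdir) / "state" / "__tmp" / "VSX"
    active = Path(fwdir) / "state" / "local" / "VSX"
    if not processed_ok:
        return []  # leave the active copies untouched
    active.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in NCS_FILES:
        src = staging / name
        if src.exists():
            shutil.copy2(src, active / name)
            copied.append(name)
    return copied


# Demo: a temporary directory plays the role of $FWDIR.
with tempfile.TemporaryDirectory() as fwdir:
    staging = Path(fwdir) / "state" / "__tmp" / "VSX"
    staging.mkdir(parents=True)
    for name in NCS_FILES:
        (staging / name).write_text("placeholder\n")
    failed = promote_ncs(fwdir, processed_ok=False)
    copied = promote_ncs(fwdir, processed_ok=True)
    active_dir = Path(fwdir) / "state" / "local" / "VSX"
    promoted = sorted(p.name for p in active_dir.iterdir())
print(failed, copied, promoted)
```

In practice this suggests a quick check: if the files under __tmp differ from those under local, the last push may not have been processed successfully.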

Page 16: VSX Troubleshooting - CPUG

Provider-1 forwarding

Page 17: VSX Troubleshooting - CPUG

VSX private network

“Funny IPs”

For internal communication

Default cluster private network: 192.168.196.0/22

Cluster Private Network can be changed:
- In SmartDashboard, if there is no VD on the VSX
- By using “vsx_util change_private_net”
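
To recognise “funny IPs” in captures or routing tables, it helps to know the exact range of the default private network. A small Python sketch using the standard ipaddress module; the helper names are made up for illustration, and the network constant must be adjusted if the private network was changed.

```python
import ipaddress

# Default cluster private network, as stated on the slide.
PRIVATE_NET = ipaddress.ip_network("192.168.196.0/22")


def is_funny_ip(address, network=PRIVATE_NET):
    """True if the address belongs to the VSX cluster private network."""
    return ipaddress.ip_address(address) in network


def conflicts_with_private_net(subnet, network=PRIVATE_NET):
    """True if a production subnet overlaps the cluster private network."""
    return ipaddress.ip_network(subnet).overlaps(network)


print(PRIVATE_NET[0], "-", PRIVATE_NET[-1], PRIVATE_NET.num_addresses)
print(is_funny_ip("192.168.197.10"), is_funny_ip("192.168.200.1"))
print(conflicts_with_private_net("192.168.198.0/24"))
```

The default /22 spans 192.168.196.0 to 192.168.199.255, so any production subnet inside that range is a candidate for changing the private network.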

Page 18: VSX Troubleshooting - CPUG

Gateway Side

VSX Architecture

Page 19: VSX Troubleshooting - CPUG

Important note

VS is NOT a virtual machine

Common file space

Common kernel

VRFs and kernel contexts are different

Page 20: VSX Troubleshooting - CPUG

VRFs

Each of the multiple routing domains (VRFs) has a separate:

VRF ID

Interfaces

Unicast routing table

Routing cache

Multicast forwarding cache

ARP table

Loopback interface

Sockets

VRFs enable overlapping IP addresses.

Page 21: VSX Troubleshooting - CPUG

VRF: CLI changes

Mind the context when working with ip route, iputils, ping, traceroute, arp, route, ifconfig and netstat:

- traceroute -Z vrfid
- ip route vrf vrfid
- “-z vrfid” for the rest (arp -z 1, netstat -z 2 -rn)

Use “all” instead of “vrfid” to show information for all VRFs

The VRF context can be changed with the “vsx set <vsid>” or “vrfctl -s <vrf>” commands.

VRF 0 - physical machine
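
Because the flag differs per tool, a tiny helper can avoid typing mistakes. The Python sketch below only builds the argument list according to the rules on this slide and does not run anything; the function name vrf_command is hypothetical, and the flags exist only on a VSX gateway.

```python
def vrf_command(tool, vrf, *args):
    """Build an argv list for a network tool in a given VRF.

    vrf may be a number or "all". Flag placement follows the slide:
    traceroute -Z <vrf>, ip route vrf <vrf>, and -z <vrf> for the rest.
    """
    vrf = str(vrf)
    if tool == "traceroute":
        return ["traceroute", "-Z", vrf, *args]
    if tool == "ip route":
        return ["ip", "route", "vrf", vrf, *args]
    if tool in ("arp", "netstat", "route", "ifconfig", "ping"):
        return [tool, "-z", vrf, *args]
    raise ValueError(f"no VRF flag known for {tool!r}")


print(" ".join(vrf_command("netstat", 2, "-rn")))
print(" ".join(vrf_command("traceroute", 3, "10.0.0.1")))
print(" ".join(vrf_command("arp", "all")))
```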

Page 22: VSX Troubleshooting - CPUG

VSX user mode

Processes:

Multi context: fwd, cpd, cplogd and fibmgr

Single context: vpnd and gated

System resources like CPU and HDD space are shared

Page 23: VSX Troubleshooting - CPUG

File Structure

VS folders - CTX under $FWDIR / $CPDIR

CTX00xxx - Virtual Device ID xxx.

VSX machine (VSID 0) - $FWDIR / $CPDIR
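
The naming rule can be written down as a short Python helper. It assumes that “CTX00xxx” means the VSID zero-padded to five digits under a CTX subdirectory; the function name vs_dir and the base path in the demo are placeholders, not real installation paths.

```python
from pathlib import PurePosixPath


def vs_dir(base, vsid, *parts):
    """Directory of a Virtual Device under base (the value of $FWDIR or $CPDIR)."""
    if vsid < 0:
        raise ValueError("VSID must not be negative")
    root = PurePosixPath(base)
    if vsid != 0:  # VSID 0 is the VSX machine itself and uses the base directory
        root = root / "CTX" / f"CTX{vsid:05d}"
    return root.joinpath(*parts)


# "/opt/fw1" is a placeholder for the real value of $FWDIR.
print(vs_dir("/opt/fw1", 0, "log"))
print(vs_dir("/opt/fw1", 3, "log"))
print(vs_dir("/opt/fw1", 12, "conf"))
```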

Page 24: VSX Troubleshooting - CPUG

Creating VS object 1

License validation

Update context database with Virtual Device information ($CPDIR/conf/ctxdb.C)

Create Virtual Device registry entries (OTP for SIC certificate)

Create Virtual Device directories and soft-links:

$CPDIR/CTX/CTX00xxx/conf

$FWDIR/CTX/CTX00xxx/log, database, …

Page 25: VSX Troubleshooting - CPUG

Creating VS object 2

Create initial policy

Create the OS VRF instance

Create the VS instance in the FW kernel and load security policy

Send a message that notifies cpd and fwd that a new context was added. cpd adds the new context to its db.

Page 26: VSX Troubleshooting - CPUG

Troubleshooting techniques

Page 27: VSX Troubleshooting - CPUG

Useful knowledge

Management debugging (ref. P-1 lecture)

Gateway architecture and troubleshooting techniques

ClusterXL

SecureXL

Page 28: VSX Troubleshooting - CPUG

Things to Check First

Licensing on both MGMT and GW sides

Connectivity between VSX and MGMT

All the jazz:

- local time settings
- static routes
- IP addressing
- mind funny IPs
- etc...

Page 29: VSX Troubleshooting - CPUG

Management Debugging

Page 30: VSX Troubleshooting - CPUG

Management Issues

Provisioning

Changes

vsx_util operations

policy installation

Page 31: VSX Troubleshooting - CPUG

Important

Do not lock Main CMA while working with VSX on Target CMAs

Page 32: VSX Troubleshooting - CPUG

Debugging fwm

TDERROR_ALL_ALL - might be too much

vsx provisioning and vsx_util: TDERROR_ALL_VSXM

Policy installation: TDERROR_ALL_INSTMGR

Page 33: VSX Troubleshooting - CPUG

How to set debug flags

Mind context!!!

fw debug fwm on TDERROR_ALL_VSXM=INFO

Or: export TDERROR_ALL_VSXM=INFO and restart the fwm process

Page 34: VSX Troubleshooting - CPUG

Debug output

$FWDIR/log/fwm.elg

Page 35: VSX Troubleshooting - CPUG

Turning it off

fw debug fwm TDERROR_ALL_VSXM=0

fw debug fwm off

Page 36: VSX Troubleshooting - CPUG

Which CMA?

In most cases - Main CMA

Policy installation - Target CMA

Page 37: VSX Troubleshooting - CPUG

Gateway Debugging

Page 38: VSX Troubleshooting - CPUG

Common Issues

Connectivity

Policy

Interfaces

Clustering

Page 39: VSX Troubleshooting - CPUG

To check first

Connectivity

Topology of VSX cluster and adjacent networks

Local times

Licenses

Page 40: VSX Troubleshooting - CPUG

Overview

vsx stat -v

VSX Gateway Status
==================
Name: test1
Security Policy: Standard
Installed at: 25Jul2010 3:42:11

SIC Status: Trust
Number of Virtual Systems allowed by license: 25
Virtual Systems [active / configured]: 7 / 7
Virtual Routers and Switches [active / configured]: 0 / 0
Total connections [current / limit]: 4994 / 135000

Virtual Devices Status
======================
 ID  | Type & Name             | Security Policy   | Installed at    | SIC Stat
-----+-------------------------+-------------------+-----------------+---------
   1 | S test1_XXXXXXXXXXXX1...| Standard          | 25Jul2010 3:42  | Trust
   2 | S test1_XXXXXXXXXXXX2...| Standard          | 25Jul2010 3:42  | Trust
   3 | S test1_XXXXXXXXXXXX3...| Standard          | 25Jul2010 3:42  | Trust
   4 | S test1_XXXXXXXXXXXX2...| Standard          | 25Jul2010 3:42  | Trust
   5 | S test1_XXXXXXXXXXXX2...| Standard          | 25Jul2010 3:42  | Trust
   6 | S test1_XXXXXXXXXXXX2...| Standard          | 25Jul2010 3:42  | Trust
   7 | S test1_XXXXXXXXXXXX2...| Standard          | 25Jul2010 3:42  | Trust
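
The two things to look for in this output are a mismatch between active and configured counts and any Virtual Device whose SIC state is not “Trust”. A Python sketch of that check follows. It assumes the line layout shown on this slide (one field per line, "|"-separated device rows); other versions may print it differently, and the function names are made up for illustration.

```python
import re

# Excerpt of the slide's sample output, shortened to two devices.
SAMPLE = """\
Number of Virtual Systems allowed by license: 25
Virtual Systems [active / configured]: 7 / 7
Virtual Routers and Switches [active / configured]: 0 / 0
Total connections [current / limit]: 4994 / 135000
 ID | Type & Name | Security Policy | Installed at | SIC Stat
-----+-------------------------+-------------------+-----------------+---------
 1 | S test1_XXXXXXXXXXXX1...| Standard | 25Jul2010 3:42 | Trust
 2 | S test1_XXXXXXXXXXXX2...| Standard | 25Jul2010 3:42 | Trust
"""

# Matches lines such as "Virtual Systems [active / configured]: 7 / 7".
PAIR = re.compile(r"^(.+?) \[\w+ / \w+\]:\s*(\d+) / (\d+)$")


def parse_vsx_stat(text):
    """Return (counters, devices) extracted from 'vsx stat -v' style text."""
    counters, devices = {}, []
    for line in text.splitlines():
        match = PAIR.match(line.strip())
        if match:
            counters[match.group(1)] = (int(match.group(2)), int(match.group(3)))
            continue
        cells = [cell.strip() for cell in line.split("|")]
        if len(cells) == 5 and cells[0].isdigit():
            devices.append({"id": int(cells[0]), "name": cells[1],
                            "policy": cells[2], "sic": cells[4]})
    return counters, devices


def findings(counters, devices):
    """List the problems worth a closer look."""
    issues = []
    active, configured = counters.get("Virtual Systems", (0, 0))
    if active != configured:
        issues.append(f"only {active} of {configured} Virtual Systems active")
    for dev in devices:
        if dev["sic"] != "Trust":
            issues.append(f"VS {dev['id']}: SIC status is {dev['sic']}")
    return issues


counters, devices = parse_vsx_stat(SAMPLE)
print(counters["Virtual Systems"], len(devices), findings(counters, devices))
```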

Page 41: VSX Troubleshooting - CPUG

Tools

tcpdump -i <IF name>

fw monitor [-v <vsid>] -e <your filter>

Example: fw monitor -v 2 -e "port(80) and ip_p=17, accept;"

Note: changing context does NOT help “fw monitor” to limit output

Page 42: VSX Troubleshooting - CPUG

Acceleration

Most traffic is accelerated, so fw monitor will often show just the first packet of a connection.

fwaccel [-vs <vsid>] [conns | templates | stat | on | off]

Mind VS number

Page 43: VSX Troubleshooting - CPUG

Cluster Issues

Get status: cphaprob [-vs vsid] stat

Interfaces status: cphaprob -a [-vs vsid] if

Problem notification list: cphaprob [-vs vsid] list

Force member UP or DOWN, for failover tests: clusterXL_admin up/down

Page 44: VSX Troubleshooting - CPUG

Kernel Debug

One kernel for all!

Massive output, mind performance

Some express debugs: fw ctl zdebug drop | grep <your filter> - to see the drop reason for specific traffic

Mind kernel buffer size

Page 45: VSX Troubleshooting - CPUG

System Tools

arp, route, netstat, ifconfig - take a “-z X” flag, where X is the VS number

-z all prints info for all VSs

ping -z...

traceroute -Z... (Capital letter)

Page 46: VSX Troubleshooting - CPUG

VS Policy

To fetch the last installed policy: fw [-vs <vsid>] fetch local

To fetch the last policy that failed to install: fw fetchlocal -d $FWDIR/state/__tmp/FW1/

To unload policy: fw [-vs <vsid>] unloadlocal

To unload policy for all VSs: fw vsx unloadall

Page 47: VSX Troubleshooting - CPUG

VS configuration

To fetch configuration: fw vsx fetch

For a specific VS: fw vsx fetchvs -vs 2

To see NCS script for a specific VS: fw vsx showncs <vsid>

Page 48: VSX Troubleshooting - CPUG

Other tips

Double check topology

If you cannot figure out a connectivity issue, especially traffic degradation, suspect ClusterXL first

Page 49: VSX Troubleshooting - CPUG

Questions and Answers

Page 50: VSX Troubleshooting - CPUG

Thank You For Your Time!