PLNOG15: Assuring Performance, Scalability and Reliability in NFV Deployments, Ronald Mai

PLNOG 2015, SDN and NFV, Krakow, 28th & 29th September 2015

Page 1

PLNOG 2015
SDN and NFV
Krakow, 28th & 29th September 2015

Page 2

Assuring Performance, Scalability and Reliability in NFV Deployments

Ronald Mai
Senior Systems Engineer - EMEA

Page 3: Agenda

Testing NFV
• Benefits of NFV and the Testing Implications
• Challenges for NFV and the Testing Requirements

Test Tools Old & New
• What Existing Tools Do
• Hardware Tester Architecture – benefits/challenges
• Virtual Machine Tester Architecture – benefits/challenges
• What can we learn and what do we lose?

New NFV Test Methodologies

Page 4: TESTING NFV

Page 5: PASS Methodology

Performance
• Data-plane throughput, latency, latency variation etc.

Availability
• Control-plane convergence and failover mechanisms
• Data-plane reliability under load

Security
• VLAN/VPN leakage
• Firewall Performance

Scale
• Control-plane peer scale
• Routing table scale
• Session quantity and establishment rate
• Flow scale

Conformance

Interoperability

Page 6: Benefits of NFV and the Testing Implications

Benefit: Reduced Equipment Cost and Reduced Power Consumption
Impact: Equivalent Testing Costs must fall

Benefit: Reduced Time-to-Market for Innovative New Services
Impact: Test systems must integrate with new lab platforms and be capable of automation

Benefit: Possibility of Running Production, Test and Reference Facilities on the same infrastructure
Impact: As above. Integrating Test System with Orchestration is key

Virtual Test Ports, Standard APIs and Orchestration Integration are key

Page 7: Benefits of NFV and the Testing Implications (cont.)

Benefit: Optimizing network configuration/topology in near real-time based on traffic and service demand
Impact: Test it! What effect does this have on QoE for the service user?

Benefit: Temporarily repair failures by automated re-configuration and moving network workloads onto spare capacity
Impact: Test it! Do the failover mechanisms work? What is the service impact during re-configuration?

Benefit: Rapid Scaling of Services to meet real-time demand. Scaling-up and scaling-out of capacity under orchestration control
Impact: Test it! New methodologies required. Does the orchestration mechanism respond correctly to demand? How is existing traffic affected when it does?

Page 8: Challenges for NFV and the Testing Requirements

Challenge: Portability/Interoperability
Requirement: Test the functionality and performance in all data centre environments that will be encountered in service

Challenge: Performance Trade-Offs when using industry standard hardware
Requirement: Benchmark existing services (e.g. latency, delay variation, power consumption for different service levels). Determine the resources required to continue meeting SLAs

Challenge: Migration and Co-existence/Compatibility with legacy platforms
Requirement: Test services using a mixture of virtual and physical network appliances

Page 9: Challenges for NFV and the Testing Requirements (cont.)

Challenge: Network Stability
Requirement: Determine stability of data and control-planes when large numbers of VMs are being created or re-located

Challenge: Integration
Requirement: Test service chains as well as individual VNFs. Requires complex protocol support from test ports

Challenge: Security and Resilience
Requirement: Induce failure and test service downtime (while network function is re-created). Test servers, hypervisors, virtual appliances and orchestration mechanisms against security attacks.

Page 10: Testing Within the NFV Infrastructure

Test Path Possibilities

vSwitch performance, availability and scalability

VNF performance, availability and scalability

Server performance, availability and scalability

Page 11: PASS Methodologies for NFV

Performance
• Data-plane throughput, latency, latency variation etc.
• VNF vs Dedicated hardware
• Effect of real-time optimization on QoE
• Performance per environment
• Service Chain performance
• Power Consumption

Availability
• Control-plane convergence
• Data-plane reliability under load
• Migration and Auto-scaling (SLA Maintenance)

Security
• VLAN/VPN leakage
• Firewall Performance
• Security of virtual infrastructure

Scale
• Control-plane peer scale
• Routing table scale
• Session quantity and establishment rate
• Capacity of NFVI

Page 12: TEST TOOLS OLD AND NEW

Page 13: Test Ports Emulate Complex Environments

1G or 10G Ethernet
V4 & V6 Addresses
RIP, BGP, IS-IS or OSPF

10G, 40G or 100G Ethernet
MPLS Label Stack
IS-IS or OSPF
Multi-Protocol iBGP
LDP
BFD

VRFs
Firewall Functions
Border Relay

Page 14: Hardware-based Tester Model

Advantages
Repeatable results
Line rate traffic
High-scale control-plane
Accurate (to ~5 ns) across millions of streams
Single management interface
Easily automated
Cost effective
• Emulate realistic environment
• Power, real-estate

[Diagram labels: Hardware-based Test Device; Data-plane traffic; Control-plane peering, updates etc.]

Page 15: Architecture of a Hardware Test Device

[Diagram labels: Controller Module; Module (×4); CPU / MEM (×4); GPS; PTP; Compute Resource; CPU Core (×3); PHY (×3); FPGA (×3)]

Page 16: Realizing Test Functionality in VM Equivalents

CPU
Stateful control-plane protocols
Emulated and Simulated devices (L2-7)
• 1000s of peers per port
• Millions of routes per port
Test Configuration and control
Results processing and database

FPGA
Line rate performance
Traffic Generation and Analysis
• Sophisticated scheduling
• 1 Million flows per port individually measurable in real-time
Accurate & Stable time stamping
High-resolution sampling
Line Rate Capture buffers

[Diagram labels: Constant Bitrate Traffic (CBR); Variable Bitrate Traffic (VBR); Continuous Burst; Microburst]

Work is in progress to enhance soft FPGA performance

In cases where replicating hardware performance is not possible, new methodologies are being developed

Page 17: SAMPLE NFV TEST METHODOLOGIES

Page 18: Forwarding Performance Benchmarking of a VNF

Target D/SUTs
• vBNG (PPPoE/DHCP)
• vCPE [vFW, vLB, vRouter] (IGMP, DHCP, OSPF/BGP, Stateful traffic)
• vPE (BGP, MPLS VPN, VPLS)
• System Infrastructure performance – Hypervisor, OS, vSwitch, vNIC

Measure
• Forwarding throughput (RFC 2544)
• Latency/Jitter – TWAMP Latency
• Orchestration with VM/VNF/(V)TA auto-scaling

Test Topology
[Diagram labels: (Virtual) Test Appliance, (V)TA; Traffic; VNF under test; (Virtual) Test Appliance, (V)TA]
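
A minimal sketch of how the RFC 2544 throughput figure listed under "Measure" could be searched for. RFC 2544 defines throughput as the highest offered rate at which no frames are lost; the binary search, the `run_trial` callable (standing in for whatever drives the test appliances) and the parameter names are illustrative assumptions, not the presenter's tooling.

```python
from typing import Callable, Tuple


def rfc2544_throughput(
    run_trial: Callable[[float], Tuple[int, int]],
    line_rate: float,
    resolution: float = 1.0,
) -> float:
    """Return the highest offered rate with zero frame loss.

    run_trial(rate) is a hypothetical hook: it offers traffic at `rate`
    (same units as line_rate) and returns (frames_sent, frames_received).
    """
    # Try line rate first; if nothing is lost there is nothing to search.
    sent, received = run_trial(line_rate)
    if sent == received:
        return line_rate

    low, high = 0.0, line_rate   # low: known loss-free, high: known lossy
    best = 0.0
    while high - low > resolution:
        rate = (low + high) / 2
        sent, received = run_trial(rate)
        if sent == received:
            best = rate          # loss-free: search higher
            low = rate
        else:
            high = rate          # loss seen: search lower
    return best
```

In practice the search would be repeated per frame size and per D/SUT from the list above, with each trial long enough for the result to be repeatable on a virtual test appliance.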

Page 19: Fail-over Convergence Measurement

Objective - Test the fail-over convergence time when one of the VNFs fails and a backup path has been configured for the test topology

Convergence Configurations
• ECMP Load sharing over Active/Active Paths
• Active/Standby Paths – Failover to Standby Path

Measure:
• Convergence Time
• Impact on convergence time of route/VRF table size

[Diagram labels: (Virtual) Test Appliance; Test Appliance; VTA/TA; Simulated Endpoints; DUT Virtual Routers (VNFs); Emulated Router; Traffic]
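
The slide does not say how convergence time is derived. One widely used option is the loss-derived method: packets lost during the failover divided by the offered packet rate. The sketch below assumes that method, a constant offered rate, and that all loss is caused by the failover; the function and parameter names are hypothetical.

```python
def loss_derived_convergence_ms(tx_packets: int, rx_packets: int,
                                offered_pps: float) -> float:
    """Estimate convergence time (ms) from packet loss at a constant rate.

    Assumption: every lost packet was dropped while the network converged,
    so lost_packets / offered_pps is the length of the outage in seconds.
    """
    lost = tx_packets - rx_packets
    return 1000.0 * lost / offered_pps


def convergence_by_table_size(results: dict) -> dict:
    """Map route/VRF table size -> convergence time in ms.

    `results` maps a table size to (tx_packets, rx_packets, offered_pps)
    collected from one fail-over run at that size.
    """
    return {
        size: loss_derived_convergence_ms(tx, rx, pps)
        for size, (tx, rx, pps) in sorted(results.items())
    }
```

Running the same fail-over at several table sizes and comparing the values gives the "impact of route/VRF table size" measurement listed above.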

Page 20: Performance Impact of VM Migration

Objective - Determine the performance of a distributed VNF during and after the migration of one or more constituent VMs
• Migration of a constituent VM from one physical server to another
• Migration of a VNF in service chain from one physical server to another
• Migration of VM or VNF across data centres

Measure (during a scheduled VNF VM Migration)
• Throughput and Latency before and after migration
• Service disruption time

[Diagram labels: Test Appliance (VTA/TA); Simulated Workload Clients; Server 1; Server 2; Simulated Server Cloud; Traffic]
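
A minimal sketch of the two measurements above. It assumes the test appliance exports packet arrival timestamps as integer microseconds and treats the longest gap between consecutive arrivals as the service disruption time; both that definition and the function names are assumptions made for illustration.

```python
from typing import Sequence


def service_disruption_us(arrivals_us: Sequence[int]) -> int:
    """Longest gap between consecutive packet arrivals, in microseconds.

    Assumption: traffic is offered at a steady rate, so the largest
    inter-arrival gap corresponds to the interruption during migration.
    """
    return max(b - a for a, b in zip(arrivals_us, arrivals_us[1:]))


def percent_change(before: float, after: float) -> float:
    """Relative change of a metric (throughput or latency) after migration."""
    return 100.0 * (after - before) / before
```

For example, `percent_change` applied to latency measured before and after the migration shows whether the VNF settles back to its original performance on the new server.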

Page 21: Auto Scaling of VMs in a VNF

Objective - Test the auto-scaling functionality of the VNF.
• Auto scaling triggered by mechanisms such as an embedded monitoring function/threshold crossing detection & event notification
• Example – an increase in the number of PPPoE or DHCP incoming session requests (beyond the scale supported by one VM)

Measure
• Disable the auto-scaling feature on DUT in order to baseline the performance
• Re-enable auto-scaling and gradually increase load
• Record the transactions/sec, average, min and max response time
• Record the total number of VMs instantiated by the VNF
• Record the NFVI resources used by the VNF (processor, memory, storage)

[Diagram labels: Virtual Test Appliance (VTA); Simulated Workload Clients; VNF Under Test; Simulated Server Cloud]
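
The recording steps above can be sketched as a small per-load-step summary. The `StepResult` record, its field names and the idea of sampling the VM count once per step are illustrative assumptions; how the samples are collected from the test appliance and the orchestrator is left open.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Sequence


@dataclass
class StepResult:
    offered_tps: float    # load the test appliance tried to generate
    achieved_tps: float   # completed transactions per second
    avg_ms: float
    min_ms: float
    max_ms: float
    vm_count: int         # VMs instantiated by the VNF during this step


def summarize_step(offered_tps: float, duration_s: float,
                   response_times_ms: Sequence[float],
                   vm_count: int) -> StepResult:
    """Reduce one load step to the figures the methodology asks to record."""
    return StepResult(
        offered_tps=offered_tps,
        achieved_tps=len(response_times_ms) / duration_s,
        avg_ms=mean(response_times_ms),
        min_ms=min(response_times_ms),
        max_ms=max(response_times_ms),
        vm_count=vm_count,
    )


def scale_out_points(steps: List[StepResult]) -> List[float]:
    """Offered loads at which the VM count rose compared with the previous step."""
    return [cur.offered_tps
            for prev, cur in zip(steps, steps[1:])
            if cur.vm_count > prev.vm_count]
```

Comparing these summaries against the baseline run with auto-scaling disabled shows whether response times recover once additional VMs come into service.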

Page 22: Conclusions

NFV brings new benefits and challenges that require new testing techniques

Existing testing technology has been virtualized, thus building on many years of experience

Virtualized test environments are challenging. The test community is:
• Addressing the challenges where this is technically feasible
• Creating new methodologies where it is not

Page 23: Thank You!

[email protected]
Spirent Communications Munich

Page 24: Back-up Slides

Page 25: Control-plane Functionality of a Hardware Tester

Application Layer Protocols and Triple Play
HTTP, FTP, SIP, Video, DPG, XMPP, CIFS, Storage IO
IPTV & Video Quality Analysis

Switching
OpenFlow, TRILL, FC, FCoE, LACP, LLDP/DCBX, SPB, STP, VEPA, VIC

Carrier Ethernet
EOAM, IEEE 1588v2, Link-OAM, SyncE, TWAMP

MPLS & MPLS-TP
6PE/6VPE, LDP, BGP VPLS, LDP VPLS, GMPLS, RSVP-TE, Multicast VPN, LSP-Ping, MPLS-TP Y.1731 OAM

Routing
BGP, OSPFv2 & v3, ISIS, RIP(NG), BFD, PIM, LISP

Access
ANCP, DHCP, DHCPv6/PD, L2TP, PPPoX, IGMP/MLD, 802.1X, IPv6 Autoconfiguration

Conformance | Functional | Performance

Next generation platform from Spirent

Page 26: VNF Functional & Performance Testing 1. Using physical test devices to validate performance of virtual BNG

Physical test devices emulate DSLAMs and 1000s of PPPoE clients on one side and edge and core routers on the other side

In the above example, the VNFs under test are virtualized BNG/PE running on a standard server
• PPPoE and MPLS connections formed between test device and VNF under test

[Diagram labels: Test system emulates DSLAMs & PPPoE clients; PPPoE connections; MPLS tunnels; BFD for fault detection; Test system emulates Edge and core routers]

Page 27: VNF Functional & Performance Testing 2. Using virtual test devices to validate performance of a Service Chain

Virtual test appliances emulate realistic video and web clients and servers generating stateful L4-7 traffic

In the above example, the VNFs under test are a virtualized Firewall, Load Balancer and CE router running as a service chain inside a standard server

[Diagram labels: Emulated Video/Web Clients; Service chain; vLoad Balancer; vFirewall; vCE; Emulated Video/Web Server]

Page 28: Service chain validation (e.g. vFirewall, vLB & vCE)

The following metrics are measured/verified by the test appliances for service chains that include virtual appliances such as Firewalls, IDS/IPS, DPI, Load Balancers, Traffic Classifiers, WAN Accelerators and CE devices
• Sustained packet forwarding rate
• Connection establishment rate & transactions per second
• Total number of connections
• Round trip time and goodput
• Denial of service handling & packet loss
• Service chain scale (and interference)
• Packet leakage across service chains
• Time between VM instantiation and first available packet

Page 29: Service Chain Stability, Portability and Scalability

The following methodology is used to ensure portability of VNFs and stability of the NFV environment
• Virtual Test Appliance is connected to Service Chain as
• “x” service chains are created. The test appliances ensure that adding the “x + 1”th service chain does not degrade the performance of the first “x” service chains by more than expected levels
• Tests are repeated for a number of different hypervisors and vSwitches and the test appliances verify that the VNF performance is consistent across different hypervisors
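
The two checks above can be sketched as simple comparisons. The throughput dictionaries, the percentage thresholds and the function names are illustrative assumptions; the slide does not define what counts as an "expected level" of degradation or as "consistent".

```python
from typing import Dict, List


def degraded_chains(before: Dict[str, float], after: Dict[str, float],
                    max_drop_pct: float) -> List[str]:
    """Existing chains whose throughput fell by more than max_drop_pct
    once the "x + 1"th service chain was added.

    `before` holds the first x chains; `after` may also contain the new one.
    """
    return [
        name for name, base in before.items()
        if 100.0 * (base - after[name]) / base > max_drop_pct
    ]


def consistent_across(results: Dict[str, float], tolerance_pct: float) -> bool:
    """True if the same VNF metric, measured per hypervisor/vSwitch
    combination, stays within tolerance_pct of the best result."""
    best, worst = max(results.values()), min(results.values())
    return 100.0 * (best - worst) / best <= tolerance_pct
```

An empty list from `degraded_chains` means adding the extra chain stayed within the chosen threshold for every existing chain.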

Page 30: Testing reliability and availability of VNFs

Virtual test devices form connections with primary and backup VNFs over a LAG

High frequency BFD running between the test devices and VNFs constantly monitors the connection liveness

[Diagram labels: LAG 1; LAG 2; Server A; Server B; BFD]

Page 31: Reliability & availability of VMs (VM Migration)

The following methodology is used to ensure availability of VNFs
• Initially Port 1 is active on both LAGs 1 and 2
• High frequency BFD monitors connection liveness
• VM Migration is initiated from Server A to Server B
• Port 2 becomes active and Port 1 becomes standby on both LAGs
• Number of packets lost in forward direction is TX packets on Stream ID 1 on LAG 1 minus RX packets on Stream ID 1 on LAG 2
• Number of packets lost in reverse direction is TX packets on Stream ID 2 on LAG 2 minus RX packets on Stream ID 2 on LAG 1
• VM migration time is the greater of [Time of arrival of first packet on Port 2 of LAG 2 – Time of arrival of last packet on Port 1 of LAG 2] and [Time of arrival of first packet on Port 2 of LAG 1 – Time of arrival of last packet on Port 1 of LAG 1]
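
The loss and migration-time arithmetic in the last three bullets can be written out directly. A minimal sketch: the `LagTimestamps` container, its field names and the use of integer microseconds are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class LagTimestamps:
    last_on_port1_us: int    # arrival of the last packet on Port 1 (old active)
    first_on_port2_us: int   # arrival of the first packet on Port 2 (new active)


def packets_lost(tx_packets: int, rx_packets: int) -> int:
    """Loss in one direction, e.g. TX on Stream ID 1 / LAG 1 minus
    RX on Stream ID 1 / LAG 2 for the forward direction."""
    return tx_packets - rx_packets


def vm_migration_time_us(lag1: LagTimestamps, lag2: LagTimestamps) -> int:
    """Greater of the two per-LAG switchover gaps, as defined above."""
    gap_lag2 = lag2.first_on_port2_us - lag2.last_on_port1_us
    gap_lag1 = lag1.first_on_port2_us - lag1.last_on_port1_us
    return max(gap_lag2, gap_lag1)
```

Taking the greater of the two gaps means the reported migration time reflects whichever direction took longer to resume.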

Page 32: NFV Service Assurance

On an on-demand basis, insert virtual monitoring probes in the service chain, for active or passive monitoring

Performance monitoring interface to OSS/BSS

[Diagram labels: Monitoring probe; vMonitoring Probe; vLoad Balancer; vFirewall; vCE]

Page 33: Active and passive monitoring of NFV environments

The following methodology is used to perform active and passive monitoring of NFV environments
• A combination of virtual and physical monitoring probes is used
• Probes provide information to OSS/BSS systems
• Virtual monitoring probes are inserted on an on-demand basis at various points in the service chain to test a subset of or all of the functions of a service chain
• For active monitoring, the virtual probes originate and terminate packets; for passive monitoring, the virtual probes just tap in to the service chain

Page 34: Traditional Router Architecture

[Diagram labels: Router; App, App, App; Network OS; Packet Forwarding Hardware]

Custom Designed Specialized Hardware Based on ASIC, FPGA or Network Processors

Proprietary Network Operating System e.g. Cisco IOS or JUNOS

Embedded Software - Routing Protocols, Routing Data Bases, SPF Algorithms, Firewall Functionality etc.

Page 35: Conventional Routing - The Control and Data Plane

[Diagram: multiple Routers, each showing App, App, App; Network OS; Packet Forwarding Hardware]

Routers ‘talk’ to one another via routing protocols to discover neighbours and topology

Each Router builds a database of the network topology which it uses to determine how to switch data packets

Page 36: Conventional Routing Pros and Cons

For
• Established - Tried and Tested
• Bomb Proof!

Against
• Inflexible – Changes require weeks to implement
• Expensive - Every node requires compute resources
• Proprietary - Every vendor implements routing algorithms in their own way
• Hard to Maintain - Every node must be visited for software maintenance
• Vulnerable to control-plane attack

Page 37: SDN – What Changes?

[Diagram labels: Controller; Firewall; Load Balancer; Router; Network OS; Packet Forwarding Hardware (×6)]

Page 38: SDN and OpenFlow

Switches built from cheap merchant (off-the-shelf) silicon

OpenFlow is a component of SDN

Applications perform path calculations (like SPF today)

Much greater flexibility to add new functionality (e.g. SJ-BPF)

[Diagram labels: Controller; SDN Controller (South-bound interface); Packet Forwarding Hardware (×6)]

Page 39: Simplified Provisioning of Complex Topology

SDN will enable dynamic provisioning across network layers

[Diagram labels: Data Centre A; Data Centre B; App, App, App; SDN Controller]