Transcript of Mastering Data Center QoS (BRKRST-2509)

Page 1:

Mastering Data Center QoS

BRKRST-2509

V5.2

Slides: https://db.tt/xN7Lw9AJ

Lucien Avramov @flying91

Distinguished Speaker and Technical Marketing Engineer – INSBU

Nexus 9000 and ACI

[email protected]

Page 3: Session Objective: WHY, WHEN and HOW of QoS

At the end of the session, the participants should:

• Understand Data Center QoS Requirements and Capabilities

• Understand QoS implementation on Nexus platforms

• Understand how to configure QoS on Nexus

Page 4: Mastering Data Center QoS – BRKRST-2509 (Agenda)

• Data Center QoS Requirements

• Nexus QoS Capabilities

• Nexus QoS Configuration

–Nexus Configuration Model: MQC

–Platform Configuration Examples


Page 5: Evolution of QoS Design – Where do we Start?

• Quality of Service is not just about voice and video anymore

• Campus Specialization

Desktop based Unified Communications

Blended Wired & Wireless Access

• Data Center Specialization

Compute and Storage Virtualization

Cloud Computing

• Protocol convergence onto the fabric

Storage – FCoE, iSCSI, NFS

Inter-Process and compute communication (RoCE, vMotion, …)

• Switching Evolution and Specialization

Page 6: Data Center QoS Design Requirements – Where are we starting from?

• VoIP and Video are now mainstream technologies

• Ongoing evolution to the full spectrum of Unified Communications

• High-definition executive communication applications require a stringent Service-Level Agreement (SLA)

– Reliable Service—High Availability Infrastructure

– Application Service Management—QoS

Page 7: Media-based Application Requirements – Voice vs. Video at the Packet Level

[Figure: packet size (bytes, 200–1400) over time. Voice: one fixed-size audio-sample packet every 20 msec. Video: a burst of variable-size packets per video frame, one frame every 33 msec.]

Page 8: Enterprise / Campus QoS Design Requirements – QoS for Voice / Video is implicit (Medianet, RFC 4594)

| Application Class | Per-Hop Behavior | Admission Control | Queuing & Dropping | Application Examples |
| VoIP Telephony | EF | Required | Priority Queue (PQ) | Cisco IP Phones (G.711, G.729) |
| Broadcast Video | CS5 | Required | (Optional) PQ | Cisco IP Video Surveillance / Cisco Enterprise TV |
| Realtime Interactive | CS4 | Required | (Optional) PQ | Cisco TelePresence |
| Multimedia Conferencing | AF4 | Required | BW Queue + DSCP WRED | Cisco Unified Personal Communicator, WebEx |
| Multimedia Streaming | AF3 | Recommended | BW Queue + DSCP WRED | Cisco Digital Media System (VoDs) |
| Network Control | CS6 | — | BW Queue | EIGRP, OSPF, BGP, HSRP, IKE |
| Call-Signaling | CS3 | — | BW Queue | SCCP, SIP, H.323 |
| Ops / Admin / Mgmt (OAM) | CS2 | — | BW Queue | SNMP, SSH, Syslog |
| Transactional Data | AF2 | — | BW Queue + DSCP WRED | ERP Apps, CRM Apps, Database Apps |
| Bulk Data | AF1 | — | BW Queue + DSCP WRED | E-mail, FTP, Backup Apps, Content Distribution |
| Best Effort | DF | — | Default Queue + RED | Default Class |
| Scavenger | CS1 | — | Min BW Queue (Deferential) | YouTube, iTunes, BitTorrent, Xbox Live |

Page 9: Data Center QoS Requirements – Goodput

• A balanced fabric is a function of maximal throughput ‘and’ minimal loss => “Goodput”

• Application-level throughput (goodput): the total bytes received from all senders divided by the finishing time of the last sender

– “Understanding TCP Incast Throughput Collapse in Datacenter Networks”
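Restated as a formula (with N senders, B_i the bytes received from sender i, and T_last the finishing time of the last sender):

\[
\text{goodput} = \frac{\sum_{i=1}^{N} B_i}{T_{\text{last}}}
\]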

[Figure: a 5-millisecond view of queue occupancy as the congestion threshold is exceeded.]

Data Center Design Goal: optimize the balance of end-to-end fabric latency with the ability to absorb traffic peaks and prevent any associated traffic loss.

http://www.ietf.org/id/draft-dcbench-def-00.txt

Page 10: Diversity of Data Center Application Flows

• Small Flows/Messaging

(Heart-beats, Keep-alive, delay sensitive application messaging)

• Small – Medium Incast

(Hadoop Shuffle, Scatter-Gather, Distributed Storage)

• Large Flows

(HDFS Insert, File Copy)

• Large Incast

(Hadoop Replication, Distributed Storage)

Page 11: Data Center QoS Design Requirements – what else do we need to consider?

• The Data Center adds a number of new traffic types and requirements

– No Drop, IPC, Storage, vMotion, …

• New Protocols and mechanisms

– 802.1Qbb, 802.1Qaz, ECN, …

Spectrum of Design Evolution:

Ultra Low Latency
• Queueing is designed out of the network whenever possible
• Nanoseconds matter

Warehouse Scale
• ECN & Data Center TCP
• Hadoop and Incast loads on the server ports

Virtualized Data Center
• vMotion, iSCSI, FCoE, NAS, CIFS
• Multi-tenant applications
• Voice & Video

HPC/GRID
• Low latency
• Bursty traffic (workload migration)
• IPC
• iWARP & RoCE

Page 12: Trust Boundaries – What have we trusted?

Access-Edge Switches:

• Conditionally Trusted Endpoints — example: IP Phone + PC
• Secure Endpoint — example: software-protected PC with centrally administered QoS markings
• Unsecure Endpoint

[Figure: the trust boundary sits at the access-edge switch port, or extends to the conditionally trusted / secure endpoint.]

Page 13: The Evolving Data Centre Architecture

Re-Visit - Where Is the Edge?

[Figure: four stages of the server edge.
1. Physical server: NIC (Eth 2/12) and FC HBA (FC 3/11) on the PCI-E bus — the edge of the network and fabric sits at the physical adapters.
2. Hypervisor: the hypervisor virtualizes the PCI-E resources (VMFS/SCSI, vNIC/vEth) over a pNIC and HBA — still two PCI addresses on the bus; the edge of the fabric moves into the host.
3. Converged Network Adapter: one 10GbE link carries both Ethernet (Eth 2/12) and Fibre Channel (vFC 3) — the CNA virtualizes the physical media.
4. SR-IOV adapter: provides multiple PCIe resources (Eth 1, FC 2, Eth 3, FC 4, … Eth 126; vFC 2, vFC 3, … vFC 126) over a 10GE VNTag link, with pass-thru vNIC/vEth into the VMs.]

Compute and Fabric Edge are Merging

Page 14: What do we trust and where do we classify and mark?

• The Data Centre architecture can provide a new set of trust boundaries

• Virtual Switch extends the trust boundary into the memory space of the Hypervisor

• Converged and Virtualized Adapters provide for local classification, marking and queuing

[Figure: trust boundary extended through the virtual access layer.
– N1KV: classification, marking & queuing
– CNA/Adapter-FEX: classification and marking
– N2K: CoS marking; CoS-based queuing in the extended fabric
– N5K: CoS/DSCP marking, queuing and classification
– N7K/N6K: CoS/DSCP marking, queuing and classification; CoS/DSCP-based queuing in the extended fabric]

Page 15: Data Center QoS Model: 4 and 8 Class Model

[The figure shows the class models expanding over time:]

4-Class Model: Realtime; Signaling / Control (Call Signaling); Critical Data; Best Effort

8-Class Model: Voice; Interactive Video; Call Signaling; Network Control; Critical Data; Streaming Video; Scavenger; Best Effort

12-Class Model: Voice; Realtime Interactive; Multimedia Conferencing; Multimedia Streaming; Broadcast Video; Network Control; Call Signaling; Network Management; Transactional Data; Bulk Data; Scavenger; Best Effort

http://www.cisco.com/en/US/docs/solutions/Enterprise/WAN_and_MAN/QoS_SRND_40/QoSIntro_40.html#wp61135

Page 16: Evolving Data Centre Architecture – Where and How is Storage Connected?

[Figure: five host storage stacks over a common fabric — the flexibility of a Unified Fabric transport, 'Any RU to Any Spindle':
– iSCSI Appliance: Application → File System → Volume Manager → SCSI Device Driver → iSCSI Driver → TCP/IP Stack → NIC; the appliance terminates iSCSI over TCP/IP.
– iSCSI Gateway: same host stack; the gateway converts iSCSI to Fibre Channel toward an FC HBA target.
– NAS Appliance: Application → File System → I/O Redirector → NFS/CIFS → TCP/IP Stack → NIC; file I/O terminates on the NAS appliance.
– NAS Gateway: same host stack; the gateway performs the file-to-block (FC) conversion.
– FCoE SAN: Application → File System → Volume Manager → SCSI Device Driver → FCoE Driver → NIC; block I/O over FCoE to the SAN.]

Page 17: NX-OS QoS Requirements: COS or DSCP?

• We have non-IP-based traffic to consider again

– FCoE – Fibre Channel Over Ethernet

– RoCE – RDMA over Ethernet

• DSCP is still marked but CoS will be required in Nexus Data Center designs

| PCP/CoS | Network priority | Acronym | Traffic characteristics |
| 1 | 0 (lowest) | BK | Background |
| 0 | 1 | BE | Best Effort |
| 2 | 2 | EE | Excellent Effort |
| 3 | 3 | CA | Critical Applications |
| 4 | 4 | VI | Video, < 100 ms latency |
| 5 | 5 | VO | Voice, < 10 ms latency |
| 6 | 6 | IC | Internetwork Control |

IEEE 802.1Q-2005

Page 18: NX-OS QoS Requirements – Where do we put the new traffic types?

• In this example of a Virtualized Multi-Tenant Data Center there is a potential overlap/conflict with Voice/Video queuing assignments, e.g.

– COS 3 – FCoE ‘and’ Call Control

| Traffic Type | Network Class | CoS | Class, Property, BW Allocation |
| Infrastructure | Control | 6 | Platinum, 10% |
| Infrastructure | vMotion | 4 | Silver, 20% |
| Tenant | Gold, Transactional | 5 | Gold, 30% |
| Tenant | Silver, Transactional | 2 | Bronze, 15% |
| Tenant | Bronze, Transactional | 1 | Best Effort, 10% |
| Storage | FCoE | 3 | No Drop, 15% |
| Storage | NFS datastore | 5 | Silver |
| — | Non-Classified Data | 1 | Best Effort |

Page 19: Mastering Data Center QoS – BRKRST-2509 (Agenda)

• Data Center QoS Requirements

• Nexus QoS Capabilities

• Nexus QoS Configuration


Page 20: Priority Flow Control – FCoE Flow Control – 802.1Qbb

• Enables lossless Ethernet using PAUSE based on a COS as defined in 802.1p

• When link is congested, CoS assigned to “no-drop” will be PAUSED

• Other traffic assigned to other CoS values will continue to transmit and rely on upper layer protocols for retransmission

• PFC is not only for FCoE traffic

[Figure: PFC compared to Fibre Channel buffer-to-buffer (B2B) credits. An Ethernet link carries eight virtual lanes (transmit queues into receive buffers); a PAUSE stops only the congested lane while the other lanes keep transmitting, analogous to FC R_RDY credit flow control.]

Page 21: Enhanced Transmission Selection (ETS) – Bandwidth Management – 802.1Qaz

• Prevents a single traffic class from hogging all the bandwidth and starving other classes

• When a given load doesn't fully utilize its allocated bandwidth, it is available to other classes

• Helps accommodate classes of a "bursty" nature

[Figure: offered traffic vs. realized utilization on a 10GE link across three intervals (t1, t2, t3). HPC, storage and LAN traffic are each guaranteed 3G/s; when one class offers less than its allocation (e.g., HPC drops to 2G/s), the unused share is lent to the busy LAN class (realized 3G/s, 4G/s, then 6G/s).]
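A minimal sketch of how this maps to an egress queuing policy (Nexus 5x00/6000-style syntax; the class names and qos-group numbers are illustrative, and the percentages roughly mirror the per-class guarantees in the figure):

class-map type queuing class-hpc
  match qos-group 2
class-map type queuing class-storage
  match qos-group 3
policy-map type queuing ets-out
  class type queuing class-hpc
    bandwidth percent 30          ! guaranteed minimum; unused share is lent to busy classes
  class type queuing class-storage
    bandwidth percent 30
  class type queuing class-default
    bandwidth percent 40
system qos
  service-policy type queuing output ets-out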

Page 22: Unified Fabric

• Storage I/O - Flexibility and Serialized Re-Use

[Figure: traditional server I/O over the life cycle (Boot 3 Gb, Production 2 Gb, vMotion 4 Gb) — separate SAN, front-end network, back-end network and vMotion links with fixed per-function bandwidths (2 Gb, 8 Gb, 14 Gb).]

Page 23: Unified Fabric

• Storage I/O - Flexibility and Serialized Re-Use

[Figure: consolidated I/O — the same life cycle (Boot 10 Gb, Production 20 Gb, vMotion 10 Gb) carried over 2 cables of 20 Gb each on a unified network.]

Page 24: Data Center Bridging Control Protocol – 802.1Qaz

• Negotiates Ethernet capabilities (PFC, ETS, CoS values) between DCB-capable peer devices

• Simplifies Management : allows for configuration and distribution of parameters from one node to another

• Responsible for Logical Link Up/Down signaling of Ethernet and Fibre Channel

• DCBX is LLDP with new TLV fields

• The original pre-standard CIN (Cisco, Intel, Nuova) DCBX utilized additional TLVs

• DCBX negotiation failures result in:
– per-priority-pause not enabled on CoS values
– the vFC not coming up (when DCBX is used in an FCoE environment)

[Figure: DCBX exchange between a switch and a CNA adapter.]

dc11-5020-3# sh lldp dcbx interface eth 1/40
Local DCBXP Control information:
 Operation version: 00 Max version: 00 Seq no: 7 Ack no: 0
 Type/Subtype  Version  En/Will/Adv  Config
 006/000       000      Y/N/Y        00
<snip>

https://www.cisco.com/en/US/netsol/ns783/index.html

Page 25: Explicit Congestion Notification (ECN) – TCP

ECN is an extension to TCP that provides end-to-end congestion notification without dropping packets. Both the network infrastructure and the end hosts must support ECN for it to function properly. ECN uses the two least significant bits of the DiffServ field in the IP header to encode four values. During periods of congestion, a router marks these ECN bits with Congestion Encountered (binary 11) in the packet; the receiving host then notifies the source host so it reduces its transmission rate.

N3K-1(config)# policy-map type network-qos traffic-priorities

N3K-1(config-pmap-nq)# class type network-qos class-gold

N3K-1(config-pmap-nq-c)# congestion-control random-detect ecn

The configuration for enabling ECN is very similar to the previous WRED example, so only the policy-map configuration with the ecn option is displayed for simplicity.

ECN Configuration

ECN field values (two least significant bits of the DiffServ byte in the IP header):

00 – Non ECN-Capable Transport
10 – ECN-Capable Transport (0)
01 – ECN-Capable Transport (1)
11 – Congestion Encountered

WRED and ECN are always applied to the system policy.

Note: when configuring ECN, ensure there are no queuing policy-maps applied to the interfaces; only configure the queuing policy under the system policy.
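For context, a sketch of the configuration around the policy-map above (Nexus 3000-style syntax; the class-gold definition and its qos-group match are assumptions for illustration — the slide shows only the policy-map step):

class-map type network-qos class-gold
  match qos-group 2                              ! assumed mapping for class-gold
policy-map type network-qos traffic-priorities
  class type network-qos class-gold
    congestion-control random-detect ecn         ! mark ECN instead of dropping
system qos
  service-policy type network-qos traffic-priorities   ! applied as the system policy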

Page 26: ECN in Action! Incast Results

[Chart: goodput (Mbps, 0–12,000) vs. server incast count (1–23), comparing plain TCP against TCP with ECN.]

Page 27: ACI Fabric – next generation DC QoS

• Topology and traffic pattern changes require us to re-evaluate the assumptions of congestion management within the Data Center

• A higher density of uplinks with a greater multi-pathing ratio results in more variability in congestion patterns

• Distribution of workload adds another dimension of traffic patterns

• Two options:
– Spend the time to statically engineer marking, queuing and traffic patterns to accommodate these new patterns
– Build a more systems-based, reactive approach to congestion management for traffic within the Data Center

[Figure: 40G links with 10G sinks and sources; higher density of multi-pathing; increasing distribution of workload; APIC-managed fabric.]

Page 28: ACI Fabric Load Balancing – Flowlet Switching

• State-of-the-art ECMP hashes flows (5-tuples) to a single path to prevent reordering TCP packets.

• Flowlet switching* routes bursts of packets from the same flow independently.

• No packet re-ordering results, provided the idle gap between bursts satisfies Gap ≥ |d1 – d2| (the difference in path delays).

[Figure: a TCP flow between H1 and H2 split into flowlets across two paths with delays d1 and d2.]

*Flowlet Switching (Kandula et al. '04)

Page 29: ACI Fabric Load Balancing – Congestion Monitoring

Congestion monitoring via a per-packet DRE (congestion metric):

1. Every packet is sourced from an ingress leaf (iLeaf) with an identifier for its ingress port to the fabric (LBTag) and a DRE metric of 0.
2. At every ASIC hop along the path, the DRE field is updated with the local congestion metric (if my DRE > current DRE, then update the DRE field; else keep the current DRE).
3. On receipt of each packet, the egress leaf updates the DRE weight in a feedback table, keyed by the LBTag (source identifier).
4. The next packet sent back to the originating iLeaf carries feedback on the max load recorded on the ingress path.
5. On receipt of the metric feedback, the original leaf updates the flowlet load-balancing weights (it also receives an indication of the load from the other direction).

Page 30: ACI Fabric Load Balancing – Dynamic Flow Prioritization

Real traffic is a mix of large (elephant) and small (mice) flows.

• Standard (single priority): large flows severely impact performance (latency & loss) for small flows.

• Dynamic Flow Prioritization: the fabric automatically gives a higher priority to small flows.

• Key idea: the fabric detects the initial few flowlets of each flow and assigns them to a high-priority class.

[Figure: flows F1–F3 sharing a single standard-priority queue vs. small flows lifted into a high-priority queue.]

Page 31: ACI Fabric Load Balancing – Application Performance Improvements

[Charts: normalized flow completion time (FCT) and slowdown (× times slower) vs. traffic load (30–80%), comparing standard ECMP with no priority, ECMP with priority, and dynamic load balancing with priority.]

• IFLB results in ~20–35% better throughput efficiency in steady state
• During link loss, application flow completion time is significantly reduced
• As traffic volumes increase, application impact is significantly reduced

Page 32: Hadoop Symmetric Topology (no link failure)

[Chart: job completion time (sec, 0–500) across 40 trials, comparing ECMP and DLB/Flowlet on a symmetric topology.]

Page 33: Hadoop Asymmetric Topology (link failure)

[Chart: job completion time (sec, 0–500) across 40 trials; with a failed link, DLB/Flowlet shows roughly a 2x improvement over ECMP.]

Page 34: Application Performance Improvements – MemCacheD Throughput Improvement

[Chart: Memcached throughput (MB/s, 0–200) over time (0–700 seconds). Throughput collapses when background iperf traffic starts, then recovers — roughly a 10x improvement — once Dynamic Packet Prioritization is enabled.]

Page 35: When are buffers needed?

• Speed Mismatch

• Incast / Many to One conversations

• Storage

[Figure: speed mismatch and incast — 10GE-attached sources feeding 1GE access ports, and multiple 10GE senders converging on a single 10GE port.]

Page 36: Buffer Amount – 1GE vs. 10GE Buffer Usage

[Chart: buffer cell usage and Hadoop job completion over time for 1G- vs. 10G-attached servers. Legend: 1G Buffer Used, 10G Buffer Used, 1G Map %, 1G Reduce %, 10G Map %, 10G Reduce %. The 1GE buffer peak is far higher than the 10GE buffer peak.]

Going to 10GE lowers the buffer utilization on the switching layer.

Page 37: Buffer Amount – the Buffer Bloat

[Figure: switch ASIC floorplan — buffer memory shares die area with the forwarding tables (L2, L3, multicast, etc.) and the remaining logic (forwarding, etc.).]

Page 38: Buffer Amount – The Switch Architecture

[Figure: three switch buffering architectures.
– Egress per-port buffer with crossbar and scheduler (e.g., Cat6k, N7K M-series)
– Ingress per-port buffer with crossbar and scheduler (e.g., N5K, N6K, N7K F-series)
– Shared memory buffer with scheduler (e.g., N3K, N9K multi-SoC)]

Page 39: Nexus 7000 F2/F2e QoS – Ingress Queuing

• An ingress-buffered architecture implements a large, distributed buffer pool to absorb congestion

• Ingress-buffered architectures absorb congestion at every ingress port contributing to the congestion, leveraging all per-port ingress buffer — versus egress-buffered architectures (e.g., Catalyst 6500), which absorb congestion at each egress port and therefore require large per-port egress buffers

• Excess traffic does not consume fabric bandwidth, only to be dropped at the egress port

[Figure: with 2:1 and 8:1 ingress:egress oversubscription across the fabric, the buffer available for congestion management is the ingress VOQ buffer of every contributing ingress port, rather than a single egress port buffer.]

Page 40: Nexus 6000/5500/5000 – Ingress Queuing

• In typical Data Center access designs, multiple ingress access ports transmit to a few uplink ports

• Nexus 5000 and 5500 utilize an ingress queuing architecture: packets are stored in ingress buffers until the egress port is free to transmit

• Ingress queuing provides an additive effect: the total queue size available equals [number of ingress ports x queue depth per port]

• Statistically, ingress queuing provides the same advantages as shared-buffer memory architectures

[Figure: egress queue 0 is full and the link congested; traffic is queued on all ingress interface buffers, providing a cumulative scaling of buffers for congested ports.]
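Restating the bracketed relation above as a formula:

\[
Q_{\text{effective}} = N_{\text{ingress ports}} \times Q_{\text{per-port}}
\]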

Page 41: Nexus 5500 QoS Defaults

• QoS is enabled by default (not possible to turn it off)

• Three default class of services defined when system boots up

– Two for control traffic (CoS 6 & 7)

– Default Ethernet class (class-default – all others)

• Cisco Nexus 5500 switch supports five user-defined classes and the one default drop system class

• FCoE queues are ‘not’ pre-allocated

• When configuring FCoE the predefined service policies must be added to existing QoS configurations

# Predefined FCoE service policies

service-policy type qos input fcoe-default-in-policy

service-policy type queuing input fcoe-default-in-policy

service-policy type queuing output fcoe-default-out-policy

service-policy type network-qos fcoe-default-nq-policy
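A sketch of where those predefined policies are typically attached — under the system qos target (shown here for completeness; this placement is standard practice rather than something on the slide):

system qos
  service-policy type qos input fcoe-default-in-policy
  service-policy type queuing input fcoe-default-in-policy
  service-policy type queuing output fcoe-default-out-policy
  service-policy type network-qos fcoe-default-nq-policy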


Page 42: Nexus 6000 – Increased Packet Buffer

[Figure: ingress UPC and egress UPC around the Unified Crossbar Fabric (448 Gbps / 224 Gbps), with unicast and multicast VOQs; 16 MB of buffer at ingress, 9 MB at egress.]

• A 25-MB packet buffer is shared by every three 40 GE ports or twelve 10 GE ports
• The buffer is 16 MB at ingress and 9 MB at egress
• Unicast packets can be buffered at both ingress and egress
• Multicast is buffered at egress

Page 43: Nexus 6000/5000/5500 QoS – PFC

• Actions when congestion occurs depend on the policy configuration:
– PAUSE the upstream transmitter for lossless traffic
– Tail drop for regular traffic when the buffer is exhausted

• Priority Flow Control (PFC) or 802.3x PAUSE can be deployed to ensure lossless behavior for applications that can't tolerate packet loss

• The buffer management module monitors buffer usage for no-drop classes of service; it signals the MAC to generate PFC (or link-level PAUSE) when buffer usage crosses the threshold

• FCoE traffic is assigned to class-fcoe, which is a no-drop system class

• Other classes of service have normal drop behavior (tail drop) by default but can be configured as no-drop

[Figure: PFC operation across the Unified Crossbar Fabric:
1. Congestion or flow control on the egress port
2. The egress UPC stops allowing fabric grants
3. Traffic is queued at ingress
4. If the queue is marked no-drop (or flow control), a PAUSE is sent upstream]

Page 44: Nexus QoS Key Concepts – Ingress Model (For Your Reference)

Nexus 6000/5000/5500/7000 F-series:

• Ingress buffering and queuing (as defined by the ingress queuing policy) occurs at the VOQ of each ingress port
• Ingress VOQ buffers are the primary congestion-management point for arbitrated traffic
• Egress scheduling (as defined by the egress queuing policy) is enforced by the egress port
• Egress scheduling dictates the manner in which egress port bandwidth is made available at ingress
• Per-port, per-priority grants from the arbiter control which ingress frames reach the egress port

Page 45: Nexus 3000 / 3500 – Shared Memory Architecture

[Figure: shared-memory architecture. A buffer/queuing block holds 8 unicast and 4 multicast queues per egress port (ports 1–64), drained by multi-level scheduling: per-port and per-group deficit round robin. A pool of 18 MB / 9 MB of buffer space is divided between per-port reserved buffer and a dynamically shared buffer (9 MB total shared).]

Page 46: Nexus 2248TP-E – 32MB Shared Buffer

• Speed mismatch between 10G NAS and 1G server requires QoS tuning

• Nexus 2248TP-E utilizes a 32MB shared buffer to handle larger traffic bursts

• Hadoop, NAS, AVID are examples of bursty applications

• You can control the queue limit for a specified Fabric Extender for egress direction (from the network to the host)

• You can use a lower queue limit value on the Fabric Extender to prevent one blocked receiver from affecting traffic sent to other, non-congested receivers ("head-of-line blocking")

N5548-L3(config-fex)# hardware N2248TPE queue-limit 4000000 rx

N5548-L3(config-fex)# hardware N2248TPE queue-limit 4000000 tx

N5548-L3(config)# interface e110/1/1
N5548-L3(config-if)# hardware N2348TP queue-limit 4096000 tx

[Figure: a 10G-attached NAS array sending NFS/iSCSI traffic through the FEX to a 1G-attached server hosting multiple VMs.]

Tune the 2248TP-E to support an extremely large burst (Hadoop, AVID, …)

Page 47: Nexus 2248TP-E Counters

N5596-L3-2(config-if)# sh queuing interface e110/1/1
Ethernet110/1/1 queuing information:
  Input buffer allocation:
  Qos-group: 0
  frh: 2
  drop-type: drop
  cos: 0 1 2 3 4 5 6
  xon       xoff      buffer-size
  ---------+---------+-----------
  0         0         65536            <-- ingress queue limit (configurable)
  Queueing:
  queue   qos-group   cos             priority   bandwidth   mtu
  --------+------------+--------------+---------+---------+----
  2       0           0 1 2 3 4 5 6   WRR        100         9728   <-- egress queues: CoS-to-queue mapping, bandwidth allocation, MTU
  Queue limit: 2097152 bytes           <-- egress queue limit (configurable)
  Queue Statistics:                    <-- per-port, per-queue counters
  ---+----------------+-----------+------------+----------+------------+-----
  Que|Received /      |Tail Drop  |No Buffer   |MAC Error |Multicast   |Queue
  No |Transmitted     |           |            |          |Tail Drop   |Depth
  ---+----------------+-----------+------------+----------+------------+-----
  2rx|         5863073|          0|           0|         0|     -      |    0
  2tx|    426378558047|   28490502|           0|         0|           0|    0   <-- Tail Drop: drops due to oversubscription
  ---+----------------+-----------+------------+----------+------------+-----

Page 48: Mapping the switch architecture to 'show queuing'

dc11-5020-4# sh queuing int eth 1/39
Interface Ethernet1/39 TX Queuing          <-- egress (Tx) queuing configuration
  qos-group  sched-type  oper-bandwidth
      0       WRR            50
      1       WRR            50

Interface Ethernet1/39 RX Queuing
  qos-group 0
  q-size: 243200, HW MTU: 1600 (1500 configured)
  drop-type: drop, xon: 0, xoff: 1520
  Statistics:
    Pkts received over the port             : 85257
    Ucast pkts sent to the cross-bar        : 930
    Mcast pkts sent to the cross-bar        : 84327
    Ucast pkts received from the cross-bar  : 249
    Pkts sent to the port                   : 133878
    Pkts discarded on ingress               : 0     <-- packets arriving on this port but dropped from the ingress queue due to congestion on an egress port
  Per-priority-pause status : Rx (Inactive), Tx (Inactive)

<snip – other classes repeated>

Total Multicast crossbar statistics:
  Mcast pkts received from the cross-bar : 283558

Page 49: Mastering Data Center QoS – BRKRST-2509 (Agenda)

• Data Center QoS Requirements

• Nexus QoS Capabilities

• Nexus QoS Configuration

–Nexus Configuration Model: MQC
–Platform Configuration Examples
 • Nexus 7x00
 • Nexus 6000 / 5x00 / 3000


Page 50: Nexus QoS – Capabilities and Configuration

• Nexus 1000v/3000/5000/6000/7000 supports a set of QoS capabilities designed to provide per-system-class traffic control:
– Lossless Ethernet — Priority Flow Control (IEEE 802.1Qbb)
– Traffic Protection — Bandwidth Management (IEEE 802.1Qaz)
– Configuration signaling to end points — DCBX (part of IEEE 802.1Qaz)

• These capabilities are added to and managed by the common Cisco MQC (Modular QoS CLI), which defines a three-step configuration model (see the skeleton below):
1. Define matching criteria via a class-map
2. Associate an action with each defined class via a policy-map
3. Apply the policy to the entire system or an interface via a service-policy

• Nexus leverages the MQC qos-group capability to identify and define traffic in policy configuration
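The three steps in skeleton form (the names and match criteria are placeholders; a full worked example follows on the "Software QoS Model – Configuration Steps" slide):

class-map type qos my-class              ! step 1: define matching criteria
  match dscp 18
policy-map type qos my-policy            ! step 2: associate an action with the class
  class my-class
    set qos-group 2
interface ethernet 1/1                   ! step 3: apply via service-policy
  service-policy type qos input my-policy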

Page 51: Nexus QoS – Configuration Principles

3 MQC building blocks:
• Class-Map
• Policy-Map
• Service-Policy

Classes:
• N1000v: 64 classes (8 pre-defined)
• N3K: 8 classes / qos-groups (4 multicast)
• N6K/N5K: 6 classes
• N7K: 2 to 8 classes

Policies:
• Type Network-QoS
• Type Queuing
• Type QoS

Page 52: Nexus QoS – Overview

| Type (CLI) | Description | Applied To… |
| QoS | Packet classification based on Layer 2/3/4 (ingress) | Interface or System |
| Network-QoS | Packet marking (CoS), congestion control WRED/ECN (egress) | System |
| Queuing | Scheduling – queuing bandwidth % / priority queue (egress) | Interface or System |

• QoS is enabled by default (NX-OS default)
• The qos policy defines how the system classifies traffic, assigned to qos-groups
• The network-qos policy defines system policies, e.g. which CoS values ALL ports treat as drop versus no-drop
• The ingress queuing policy defines how an ingress port buffers ingress traffic for ALL destinations over the fabric
• The egress queuing policy defines how an egress port transmits traffic on the wire
‒ Conceptually, it controls how all ingress ports schedule traffic toward the egress port over the fabric (by controlling the manner in which bandwidth availability is reported to the arbiter)

Page 53: Nexus QoS Methodology

1. type qos
– Classification: ACL, CoS, DSCP, IP RTP, Precedence, Protocol
– Policy: sets the qos-group of the system class this traffic flow is mapped to

2. type network-qos
– Classification: system class matched by qos-group
– Policy: MTU; queue-limit (5K); set CoS (mark 802.1p); set DSCP (5500/3K); ECN-WRED (3K)

3. type queuing
– Classification: system class matched by qos-group
– Policy: ETS guaranteed scheduling – deficit weighted round robin (DWRR) percentage; Priority – strict priority scheduling (only one class can be configured for priority in a given queuing policy)

4. Apply the service-policy

Page 54: Software QoS Model – Configuration Steps

STEP 1 – type qos (ingress classification):

class-map type qos class-app-1
  match access-group app-1
class-map type qos class-app-2
  match access-group app-2
policy-map type qos policy-qos
  class type qos class-app-1
    set qos-group 1
  class type qos class-app-2
    set qos-group 2

STEP 2 – type network-qos:

class-map type network-qos class-app-1
  match qos-group 1
class-map type network-qos class-app-2
  match qos-group 2
policy-map type network-qos policy-nq
  class type network-qos class-app-1
    pause no-drop
    mtu 9216
  class type network-qos class-app-2
    set cos 2
    queue-limit 81920 bytes

STEP 3 – type queuing (ingress/egress):

class-map type queuing class-app-1
  match qos-group 1
class-map type queuing class-app-2
  match qos-group 2
policy-map type queuing policy-queue
  class type queuing class-default
    bandwidth percent 10
  class type queuing class-app-1
    bandwidth percent 50
  class type queuing class-app-2
    bandwidth percent 40

STEP 4 – apply QoS (globally, or per interface):

system qos
  service-policy type qos input policy-qos
  service-policy type network-qos policy-nq
  service-policy type queuing output policy-queue

(Net effect for app-2 traffic: matched by ACL app-2, assigned qos-group 2, marked CoS 2, and given an ~82 KB queue-limit.)

Page 55: Configuring QoS – type 'network-qos' Policies

• Define global queuing and scheduling parameters for all interfaces in the switch – identify drop/no-drop classes, instantiate specific default queuing policies, etc.

• One network-qos policy per system; it applies to all ports in all VDCs

• The assumption is that the network-qos policy is defined and applied consistently network-wide – particularly for no-drop applications, end-to-end consistency is mandatory

[Figure: Switch 1, Switch 2 and Switch 3, each with ingress modules, fabric, and an egress module — network-qos policies should be applied consistently on all switches network-wide.]

Page 56: Mastering Data Center QoS – BRKRST-2509 (Agenda)

• Data Center QoS Requirements

• Nexus QoS Capabilities

• Nexus QoS Configuration

–Nexus Configuration Model: MQC
–Platform Configuration Examples
 • Nexus 7x00
 • Nexus 6000 / 5x00 / 3000


Page 57: Ingress/Egress Queuing Class-Maps

• class-map type queuing – Configures COS to Queue mapping

• Queuing class-map names are static, based on port-type and queue (Predefined)

• Configurable only in default VDC

– Changes apply to ALL ports of specified type in ALL VDCs

– Changes are traffic disruptive for ports of specified type

N7k-ADMIN(config)# class-map type queuing match-any

1p3q4t-out-pq1 1p7q4t-out-q-default 1p7q4t-out-q6 8q2t-in-q1 8q2t-in-q6

1p3q4t-out-q-default 1p7q4t-out-q2 1p7q4t-out-q7 8q2t-in-q2 8q2t-in-q7

1p3q4t-out-q2 1p7q4t-out-q3 2q4t-in-q-default 8q2t-in-q3

1p3q4t-out-q3 1p7q4t-out-q4 2q4t-in-q1 8q2t-in-q4

1p7q4t-out-pq1 1p7q4t-out-q5 8q2t-in-q-default 8q2t-in-q5

N7k-ADMIN(config)# class-map type queuing match-any 1p7q4t-out-pq1

N7k-ADMIN(config-cmap-que)# match cos 7

N7k-ADMIN(config-cmap-que)#

[Figure: the predefined queuing class-map names map onto the 1G and 10G/40G/100G ingress and egress queue structures (2q4t-in / 1p3q4t-out for 1G ports; 8q2t-in / 1p7q4t-out for 10G/40G/100G ports).]

Page 58: Ingress Queuing – Logical View

CoS-to-ingress-queue mapping by network-qos template:

• 8e Template: Q1 – CoS 5-7; Q-Default – CoS 0-4
• 8e-4q4q Template: Q1 – CoS 5-7; Q3 – CoS 3-4; Q4 – CoS 2; Q-Default – CoS 0-1
• 7e Template: Q1 – CoS 5-7; Q3 – CoS 2,4; Q4 – CoS 3 (no-drop); Q-Default – CoS 0,1
• 6e Template: Q1 – CoS 5-7; Q3 – CoS 4; Q4 – CoS 3; Q-Default – CoS 0-2
• 4e Template: Q1 – CoS 5-7; Q3 – CoS 4; Q4 – CoS 1-3; Q-Default – CoS 0

Legend: drop queues use high & low drop thresholds; no-drop queues use skid buffers with a high (pause) threshold and a low (resume) threshold.

Page 59: Egress Queuing – Logical View

Default egress queuing policies by template (no-drop queues noted; in the figure, red indicates no-drop):

• default-4q-8e-out-policy: PQ1 (CoS 5-7) – priority; Q2 (CoS 3,4), Q3 (CoS 2), Q-Default (CoS 0,1) – DWRR 33% / 33% / 33%
• default-4q4q-8e-out-policy: PQ1 (CoS 5-7) – priority; Q2 (CoS 3,4), Q3 (CoS 2), Q-Default (CoS 0,1) – DWRR 33% / 33% / 33%
• default-4q-7e-out-policy: PQ1 (CoS 5-7) – priority; Q2 (CoS 0,1), Q3 (CoS 2,4) – DWRR 50% / 50%; Q-Default (CoS 3) – no-drop
• default-4q-6e-out-policy: PQ1 (CoS 5-7), PQ2 (CoS 0,1,2), PQ3 (CoS 3, no-drop) – priority; Q-Default (CoS 4, no-drop) – DWRR 100%
• default-4q-4e-out-policy: PQ1 (CoS 5-7, high), PQ2 (CoS 0, low) – priority; Q3 (CoS 1,2,3, no-drop), Q-Default (CoS 4, no-drop) – DWRR 100% / 100%

Page 60: Queuing Policies

• policy-map type queuing – Define per-queue behavior such as queue size, WRED, shaping

• priority – defines queue as the priority queue

• bandwidth – defines WRR weights for each queue

• shape – defines SRR weights for each queue

• queue-limit – defines queue size and defines tail-drop thresholds

• random-detect – sets WRED thresholds for each queue

N7k(config)# policy-map type queuing pri-q
N7k(config-pmap-que)# class type queuing 1p7q4t-out-pq1
N7k(config-pmap-c-que)# ?
  bandwidth      exit           no             priority
  queue-limit    random-detect  set            shape
N7k(config-pmap-c-que)#

Note that some “sanity” checks are only performed when you attempt to tie the policy to an interface

Example: WRED on ingress 10G ports
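A sketch of such a policy (the queue class, CoS value and thresholds are illustrative; whether WRED is supported on a given queue and port type is validated only when the service-policy is attached, per the note above):

N7k(config)# policy-map type queuing wred-in-q
N7k(config-pmap-que)# class type queuing 8q2t-in-q1
N7k(config-pmap-c-que)# random-detect cos-based
N7k(config-pmap-c-que)# random-detect cos 5 minimum-threshold percent 60 maximum-threshold percent 90
N7k(config-pmap-c-que)# interface e1/1
N7k(config-if)# service-policy type queuing input wred-in-q

(Any unsupported-combination error surfaces at the attach step, not when the policy-map is defined.)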

Page 61: Queuing Service Policies

• service-policy type queuing – Attach a queuing policy-map to an interface

• Queuing policies always tied to physical port

• No more than one input and one output queuing policy per port

tstevens-7010(config)# int e1/1

tstevens-7010(config-if)# service-policy type queuing input my-in-q

tstevens-7010(config-if)# service-policy type queuing output my-out-q

tstevens-7010(config-if)#


Page 62: Modifying MTU – N7K – F-series linecards

• MTU in network-QoS policy applies to all F1/F2 interfaces in absence of per-port MTU configuration. User-configured per-port MTU overrides any MTU in network-QoS policy (for that port)

• Per-port or network-QoS defined MTUs must be less than or equal to configured system jumbomtu value

• L2 switchport MTU must be 1518 or the “system jumbomtu” value if MTU configured per-port

• Example of per-port MTU (modifies MTU only on specified port):

N7K(config)# interface e3/1
N7K(config-if)# mtu 9216
N7K(config-if)#
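For reference, the global ceiling mentioned above is set with the system jumbomtu command (the value shown is illustrative):

N7K(config)# system jumbomtu 9216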

Page 63: Modifying MTU – N7K – F1 and F2 series linecards

• Example of network-QoS MTU (modifies MTU for specified class on all F1/F2 ports)

N7K# !Clone the 7E policy (cannot modify default policies)
N7K# qos copy policy-map type network-qos default-nq-7e-policy prefix new-
N7K# conf
Enter configuration commands, one per line. End with CNTL/Z.
N7K(config)# !Modify the newly cloned policy-map
N7K(config)# policy-map type network-qos new-nq-7e
N7K(config-pmap-nqos)# !Modify the 7E drop class
N7K(config-pmap-nqos)# class type network-qos c-nq-7e-drop
N7K(config-pmap-nqos-c)# mtu 8000
N7K(config-pmap-nqos-c)# !Apply the new policy-map to the system qos target
N7K(config-pmap-nqos-c)# system qos
N7K(config-sys-qos)# service-policy type network-qos new-nq-7e
N7K(config-sys-qos)#

Page 64: What Is a Strict Priority Queue?

• In the classic definition, the SP queue gets complete, unrestricted access to all interface bandwidth and is serviced until empty
– Can theoretically starve all other traffic classes

• Depending on the hardware implementation, additional options for the SP queue exist:
– Multiple PQs with a hierarchical relationship (e.g., level 1 vs. level 2)
– Multiple PQs with bandwidth sharing according to DWRR weights
– Optional SP queue shaping

M1 modules:
• The SP queue adheres to the classic SP queue definition – you cannot limit how much interface bandwidth traffic mapped to the SP queue consumes
• Use care in mapping traffic to the SP queue – SP traffic should be low volume

F1/F2 modules:
• Multiple SP queues can exist, depending on the active network-qos template
• SP queue(s) can be shaped to prevent complete starvation of other classes
– Note that a shaped queue cannot exceed the shaped rate even if no congestion exists

Page 65: Modifying Queuing Behavior – Shape the SP Queue on F1/F2 Modules

1. Clone a default egress "type queuing" policy-map – creates a copy of a default egress queuing policy
2. Shape the SP queue in the new (cloned) "type queuing" policy – limits SP queue bandwidth consumption
3. Apply the new "type queuing" policy to the target interface(s) – applies the new queuing policy to F1/F2 interfaces

Important: applying a new queuing policy takes effect immediately and is disruptive to any ports to which the policy is applied

Page 66: Modifying Queuing Behavior – Shape the SP Queue on F1/F2 Modules

• Example: Shape the SP queue to 2Gbps on an interface, using a queuing policy cloned from the default “8E” egress queuing policy

N7K# !Clone the 8E egress queuing policy
N7K# qos copy policy-map type queuing default-4q-8e-out-policy prefix new-
N7K# conf t
Enter configuration commands, one per line. End with CNTL/Z.
N7K(config)# !Modify new queuing policy
N7K(config)# policy-map type queuing new-4q-8e-out
N7K(config-pmap-que)# !Modify the SP queue
N7K(config-pmap-que)# class type queuing 1p3q1t-8e-out-pq1
N7K(config-pmap-c-que)# !Shape the queue to 20% (2G)
N7K(config-pmap-c-que)# shape percent 20
N7K(config-pmap-c-que)# !Apply the new policy to target interface
N7K(config-pmap-c-que)# int e 2/1
N7K(config-if)# service-policy type queuing output new-4q-8e-out
N7K(config-if)#
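To verify that the shaper took effect on the port (a sketch; the interface follows the example above):

N7K# show queuing interface ethernet 2/1
N7K# !the egress PQ should report the configured 20% shape rate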


Modifying Queuing Behavior – Make an Interface “Untrusted”


• Create an ingress “type queuing” policy-map to set CoS to 0 – rewrites the CoS of all frames to 0 (only needed if the ingress port is an 802.1Q trunk)
• Create a “type qos” marking policy to set DSCP to 0 – rewrites the DSCP of all IP packets to 0
• Apply the new policies to the target interface(s)

Important: applying new queuing policy takes effect immediately and is disruptive to any ports to which the policy is applied


Modifying Queuing Behavior – Make an Interface “Untrusted” – F1/F2 Modules


N7K# !Clone the default input queuing policy, or create a new one from scratch
N7K# qos copy policy-map type queuing default-4q-8e-in-policy prefix untrusted-
N7K# conf
Enter configuration commands, one per line. End with CNTL/Z.
N7K(config)# !Modify the cloned policy
N7K(config)# policy-map type queuing untrusted-4q-8e-in
N7K(config-pmap-que)# !For F1/F2, must specify all queues even for untrusted policy
N7K(config-pmap-que)# class type queuing 2q4t-8e-in-q1
N7K(config-pmap-c-que)# !Give q1 the minimum buffer space
N7K(config-pmap-c-que)# queue-limit percent 1
N7K(config-pmap-c-que)# class type queuing 2q4t-8e-in-q-default
N7K(config-pmap-c-que)# !Give q-default maximum buffer space
N7K(config-pmap-c-que)# queue-limit percent 99
N7K(config-pmap-c-que)# !Set COS 0 for all frames
N7K(config-pmap-c-que)# set cos 0
N7K(config-pmap-c-que)# policy-map type qos untrusted
N7K(config-pmap-qos)# !use class-default to match everything
N7K(config-pmap-qos)# class class-default
N7K(config-pmap-c-qos)# !change DSCP of all packets to 0
N7K(config-pmap-c-qos)# set dscp 0
N7K(config-pmap-c-qos)# int e1/1-32
N7K(config-if-range)# !tie the queuing & qos policies to the target interface(s)
N7K(config-if-range)# service-policy type queuing input untrusted-4q-8e-in
N7K(config-if-range)# service-policy type qos input untrusted

Note: for an access switchport, the queuing policy is not necessary, since no CoS is received
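To confirm the marking policy is attached and matching traffic (a sketch; the counter display and exact syntax vary by platform and release):

N7K# show policy-map interface ethernet 1/1 input
N7K# !class-default under the untrusted policy should show set dscp 0 with increasing matches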


Priority Flow Control – Nexus 7K Operations – Switch-Level Configuration


• No-drop PFC, with an MTU of ~2K set for Fibre Channel

N7K-50(config)# system qos
N7K-50(config-sys-qos)# service-policy type network-qos default-nq-7e-policy

show policy-map system

Type network-qos policy-maps
=====================================
policy-map type network-qos default-nq-7e-policy
  class type network-qos c-nq-7e-drop
    match cos 0-2,4-7
    congestion-control tail-drop
    mtu 1500
  class type network-qos c-nq-7e-ndrop-fcoe
    match cos 3
    match protocol fcoe
    pause
    mtu 2112

Policy Template choices:

Template              Drop CoS         (Priority)  No-Drop CoS  (Priority)
default-nq-8e-policy  0,1,2,3,4,5,6,7  5,6,7       -            -
default-nq-7e-policy  0,1,2,4,5,6,7    5,6,7       3            -
default-nq-6e-policy  0,1,2,5,6,7      5,6,7       3,4          4
default-nq-4e-policy  0,5,6,7          5,6,7       1,2,3,4      4

show class-map type network-qos c-nq-7e-ndrop-fcoe

Type network-qos class-maps
=============================================
class-map type network-qos match-any c-nq-7e-ndrop-fcoe
  Description: 7E No-Drop FCoE CoS map
  match cos 3
  match protocol fcoe
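With a no-drop template active, per-port PFC operation can be verified (a sketch):

N7K# show interface priority-flow-control
N7K# !lists the PFC mode and RxPPP/TxPPP pause-frame counters per interface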


Hierarchical Queuing Policies for ETS – F1 and F2


Enhanced Transmission Selection (ETS) provides priority group mappings and bandwidth ratios
‒Controls hierarchical queuing policies for drop versus no-drop traffic classes
‒Defines bandwidth ratios advertised in DCBX for drop versus no-drop classes

Only active when a no-drop network-qos policy is active (7E/6E/4E)

Top-level policy-map defines the overall queue-limit and bandwidth ratios for drop versus no-drop classes

Second-level policy-maps define priority, queue-limit, and bandwidth ratios for individual drop and no-drop classes


Example of ETS Queuing Policy – F linecards


• Example using the default queuing policies under the 6E network-QoS template

• Top-level policy-map – defines the overall bandwidth ratios for the drop classes versus the no-drop classes:

policy-map type queuing default-4q-6e-out-policy
  class type queuing c-4q-6e-drop-out
    service-policy type queuing default-4q-6e-drop-out-policy
    bandwidth remaining percent 70
  class type queuing c-4q-6e-ndrop-out
    service-policy type queuing default-4q-6e-ndrop-out-policy
    bandwidth remaining percent 30

• Second-level policy-maps – define the priority and bandwidth ratios for the individual drop and no-drop classes:

policy-map type queuing default-4q-6e-drop-out-policy
  class type queuing 3p1q1t-6e-out-pq1
    priority level 1
  class type queuing 3p1q1t-6e-out-q-default
    bandwidth remaining percent 100

policy-map type queuing default-4q-6e-ndrop-out-policy
  class type queuing 3p1q1t-6e-out-pq2
    priority level 1
  class type queuing 3p1q1t-6e-out-pq3
    priority level 2
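These defaults are attached automatically while the 6E template is active; to change the ratios, the top-level policy would be cloned and re-applied just like the SP-shaping example earlier (a sketch – the interface number is illustrative, and the cloned name follows the qos copy naming pattern shown before):

N7K# qos copy policy-map type queuing default-4q-6e-out-policy prefix new-
N7K# conf t
N7K(config)# interface ethernet 3/1
N7K(config-if)# service-policy type queuing output new-4q-6e-out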


Mastering Data Center QoS – BRKRST-2509


• Data Center QoS Requirements

• Nexus QoS Capabilities

• Nexus QoS Configuration

–Nexus Configuration Model: MQC

–Platform Configuration Examples
• Nexus 7x00

• Nexus 6000 / 5x00 / 3000



Configuration Templates – MTU 6000/5000/3000/2000


• MTU can be configured for each class of service (there is no interface-level MTU)

• No fragmentation, since the Nexus 5000 / 3000 is an L2 switch

• Cut-through switching: frames larger than the MTU are truncated

• Store-and-forward switching: frames larger than the MTU are dropped

• With an L3 module (5000) or L3 license (3000), the L3 MTU is set at the routed interface / SVI level

Each CoS queue on the Nexus 5000 supports a unique MTU

class-map type qos iSCSI
  match cos 2
class-map type queuing iSCSI
  match qos-group 2
policy-map type qos iSCSI
  class iSCSI
    set qos-group 2
class-map type network-qos iSCSI
  match qos-group 2
policy-map type network-qos iSCSI
  class type network-qos iSCSI
    mtu 9216
system qos
  service-policy type qos input iSCSI
  service-policy type network-qos iSCSI
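The per-class MTU can then be verified per port (a sketch; the interface number and the generic N5K# prompt are illustrative):

N5K# show queuing interface ethernet 1/1
N5K# !each qos-group entry reports its own MTU – 9216 for qos-group 2 here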


Configuration Templates – MTU 6000/5000/3000/2000


• Nexus 5000 / 3000 support a different MTU for each system class

• MTU is defined in the network-qos policy-map

• L2: no interface-level MTU support on the Nexus 5000

• L3 MTU: set at the SVI / routed port level

policy-map type network-qos jumbo
  class type network-qos class-default
    mtu 9216
system qos
  service-policy type network-qos jumbo

interface ethernet 1/x
  mtu 9216

Each qos-group on the Nexus 5000/3000 supports a unique MTU


Configuration Templates – CoS Marking

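The configuration screenshot on this slide did not survive extraction. A minimal sketch of CoS marking under MQC, following the same class-map / policy-map pattern as the iSCSI template above (the voice class name, the dscp 46 match, and the cos 5 value are illustrative assumptions, not the slide's actual content):

class-map type qos class-voice
  match dscp 46
policy-map type qos classify-voice
  class class-voice
    set qos-group 2
class-map type network-qos class-voice
  match qos-group 2
policy-map type network-qos mark-voice
  class type network-qos class-voice
    set cos 5
system qos
  service-policy type qos input classify-voice
  service-policy type network-qos mark-voice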


ETS – Strict Priority and Bandwidth management


• Create classification rules first by defining and applying policy-map type qos

• Define and apply policy-map type queuing to configure strict priority and bandwidth sharing

pod3-5010-2(config)# class-map type queuing class-voice
pod3-5010-2(config-cmap-que)# match qos-group 2
pod3-5010-2(config-cmap-que)# class-map type queuing class-high
pod3-5010-2(config-cmap-que)# match qos-group 3
pod3-5010-2(config-cmap-que)# class-map type queuing class-low
pod3-5010-2(config-cmap-que)# match qos-group 4
pod3-5010-2(config-cmap-que)# exit
pod3-5010-2(config)# policy-map type queuing policy-BW
pod3-5010-2(config-pmap-que)# class type queuing class-voice
pod3-5010-2(config-pmap-c-que)# priority
pod3-5010-2(config-pmap-c-que)# class type queuing class-high
pod3-5010-2(config-pmap-c-que)# bandwidth percent 50
pod3-5010-2(config-pmap-c-que)# class type queuing class-low
pod3-5010-2(config-pmap-c-que)# bandwidth percent 30
pod3-5010-2(config-pmap-c-que)# class type queuing class-fcoe
pod3-5010-2(config-pmap-c-que)# bandwidth percent 20
pod3-5010-2(config-pmap-c-que)# class type queuing class-default
pod3-5010-2(config-pmap-c-que)# bandwidth percent 0
pod3-5010-2(config-pmap-c-que)# system qos
pod3-5010-2(config-sys-qos)# service-policy type queuing output policy-BW

[Figure: a traditional server with 1Gig FC HBAs and 1Gig Ethernet NICs converging onto a 10GE link; FCoE traffic is given 20% of the 10GE link]


Priority Flow Control – Nexus 6000/5500/5000


• On the Nexus 5000, once feature fcoe is configured, two classes are created by default

• class-fcoe is configured to be no-drop, with an MTU of 2158

• Enabling the FCoE feature on the Nexus 5548/5596 does not create the no-drop policies automatically, as it does on the Nexus 5010/5020

• You must apply the FCoE default policies under system qos:

policy-map type qos fcoe-default-in-policy
  class type qos class-fcoe
    set qos-group 1
  class type qos class-default
    set qos-group 0

policy-map type network-qos fcoe-default-nq-policy
  class type network-qos class-fcoe
    pause no-drop
    mtu 2158

system qos
  service-policy type qos input fcoe-default-in-policy
  service-policy type queuing input fcoe-default-in-policy
  service-policy type queuing output fcoe-default-out-policy
  service-policy type network-qos fcoe-default-nq-policy

[Figure: FCoE DCB switch connected to a server with a DCB CNA adapter]


Configuring QoS on the Nexus 6000/5500


• Check System Classes
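The screenshot for this step did not survive extraction; the active system classes can be checked along these lines (a sketch – the exact output varies by release, and the N5K# prompt is generic):

N5K# show policy-map system
N5K# show queuing interface ethernet 1/1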


QoS On ACI Fabric - Topology


Traffic marking can be performed by the fabric, simplifying inbound and outbound QoS policy enforcement in the Data Center

[Figure: ACI fabric topology – an APIC cluster controlling the fabric, with attached VMs]


DSCP Marking on ACI Fabric – Create Policy


Under Custom QoS, verify there is a policy named “Marking_Test”. Double-click on it. You'll observe that the policy specifies that all packets marked with DSCP 0 to 63 will be remarked to AF23.


DSCP Marking on ACI – Apply Policy


Tenant -> Application Profiles: verify that the Custom QoS policy “Marking_Test” is associated with that EPG.


DSCP Marking – Verify Marking


Once the policy is associated, go to the Web VM console and issue a ping towards the App VM. In addition, enable TCPDUMP on the App console, so it is possible to see that the marking changed.
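A minimal capture on the App VM would look along these lines (the eth0 interface name is an assumption; AF23 is DSCP 22, which tcpdump displays as ToS byte 0x58):

tcpdump -i eth0 -v -n icmp
# look for “tos 0x58” in the IP header decode to confirm the AF23 remark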


Summary


• Data Center QoS requires traffic characterization beyond voice and video

• New capabilities: PFC, ETS, DCBX

• Platform consistency: MQC

• Platform dependencies: where PFC is supported, PQ behavior, queue structure

• Different types of congestion / traffic flows

• More to QoS than buffer tuning: application and transport tuning

• How to configure QoS on Nexus switches


Recommended Readings



Participate in the “My Favorite Speaker” Contest

• Promote your favorite speaker through Twitter and you could win $200 of Cisco Press products (@CiscoPress)

• Send a tweet and include

– Your favorite speaker’s Twitter handle <@flying91>

– Two hashtags: #CLUS #MyFavoriteSpeaker

• You can submit an entry for more than one of your “favorite” speakers

• Don’t forget to follow @CiscoLive and @CiscoPress

• View the official rules at http://bit.ly/CLUSwin

Promote Your Favorite Speaker and You Could be a Winner



Complete Your Online Session Evaluation

• Give us your feedback and you could win fabulous prizes. Winners announced daily.

• Complete your session evaluation through the Cisco Live mobile app or visit one of the interactive kiosks located throughout the convention center.

Don’t forget: Cisco Live sessions will be available for viewing on-demand after the event at CiscoLive.com/Online



Continue Your Education

• Demos in the Cisco Campus

• Walk-in Self-Paced Labs

• Table Topics

• Meet the Engineer 1:1 meetings

