PC-based Software Routers: High Performance and Application Service Support. Author: Raffaele Bolla, Roberto Bruschi. Publisher: PRESTO’08. Presenter: Hsin-Mao Chen. Date: 2010/02/24

1

PC-based Software Routers: High Performance and Application Service Support

Author:

Raffaele Bolla, Roberto Bruschi

Publisher:

PRESTO’08

Presenter:

Hsin-Mao Chen

Date: 2010/02/24

2

Outline

Introduction

Architectural Bottlenecks

Multi-CPU/Core Enhancements

Performance Evaluation

3

Introduction

Linux

Network boards: packet reception or transmission raises a HW interrupt (IRQ).

Kernel: software IRQs (SoftIRQs) carry out packet processing.

RAM: holds the TxRing and RxRing.

4

Introduction

A SoftIRQ executes two main tasks.

1. De-allocating already-transmitted packets still held in the TxRing.

2. Performing all the actual packet forwarding operations. This task handles the received packets waiting in the RxRing.
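The two tasks above can be pictured with a small user-space model. This is only a sketch under stated assumptions, not kernel code: the function `run_softirq`, the `budget` parameter, the `route` callback, and the dictionary-based packet descriptors are all hypothetical names introduced for illustration.

```python
from collections import deque


def run_softirq(tx_rings, rx_ring, route, budget=64):
    """One simplified SoftIRQ pass (hypothetical model, not kernel code).

    tx_rings: dict mapping an output device name to a deque of descriptors
    rx_ring:  deque of received packets
    route:    callable returning the output device name for a packet
    """
    # Task 1: de-allocate descriptors of packets that were already sent.
    freed = 0
    for ring in tx_rings.values():
        while ring and ring[0]["sent"]:
            ring.popleft()
            freed += 1

    # Task 2: forward received packets, at most `budget` per pass.
    forwarded = 0
    while rx_ring and forwarded < budget:
        pkt = rx_ring.popleft()
        out_dev = route(pkt)
        tx_rings[out_dev].append({"data": pkt, "sent": False})
        forwarded += 1

    return freed, forwarded
```

A pass with a budget smaller than the RxRing backlog leaves the remaining packets for the next SoftIRQ, which is why the function returns both counters.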

5

Architectural Bottlenecks

SR architecture based on a single CPU/core.

1. The SR computational capacity.

2. The bandwidth/latency of the I/O buses.

SR architecture based on multiple processors/cores.

Typical performance issues may sap the parallelization gain.

1. Data access serialization.

2. CPU/core cache coherence.

6

Architectural Bottlenecks

Data access serialization

SoftIRQ accesses to each TxRing are serialized by a code-locking procedure (the LLTX lock). This lock guarantees that each TxRing can be read or modified by only one SoftIRQ at a time.
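The serialization effect can be illustrated with a user-space analogy. In this minimal sketch a `threading.Lock` stands in for the LLTX lock and threads stand in for SoftIRQs running on different CPUs; the class `TxRing` and the helpers `softirq` and `run_demo` are hypothetical names, not kernel symbols.

```python
import threading


class TxRing:
    """Toy TxRing whose accesses are serialized by one lock."""

    def __init__(self):
        self._lock = threading.Lock()  # stands in for the LLTX lock
        self.slots = []

    def enqueue(self, pkt):
        # Only one caller at a time may read or modify the ring.
        with self._lock:
            self.slots.append(pkt)


def softirq(ring, cpu, n_packets):
    """Pretend SoftIRQ on `cpu` that pushes n_packets to a shared ring."""
    for seq in range(n_packets):
        ring.enqueue((cpu, seq))


def run_demo(n_cpus=4, n_packets=1000):
    ring = TxRing()
    threads = [
        threading.Thread(target=softirq, args=(ring, cpu, n_packets))
        for cpu in range(n_cpus)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ring
```

Every packet ends up in the ring, but all CPUs queue behind the same lock, so sharing one TxRing among several cores limits how much they can work in parallel.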

7

Architectural Bottlenecks

CPU/core cache management

Whenever a CPU/core loads a TxRing into its local cache and updates it, all the other processors also caching it must invalidate their cached copies.

8

Multi-CPU/core Enhancements

HW evolution

Intel® Advanced Smart Cache: It consists of a mechanism that allows level 2 cache-sharing among all the cores in the same processor.

Intel PRO 1000 adapters: They support multiple TxRings and RxRings and multiple HW IRQs per network interface.

9

Multi-CPU/core Enhancements

SW architecture

1. To bind all the operations carried out in forwarding a packet entirely to a single CPU.

2. To reduce LLTX lock contention as much as possible.

3. To distribute the computational load equally among all the processors/cores in the system.
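One way to picture these three goals is a static binding table. The sketch below assumes, purely for illustration, that every interface exposes one RxRing and one TxRing per CPU/core and that ring index `i` is bound to CPU `i`; the functions `build_bindings` and `forward_path` are hypothetical.

```python
def build_bindings(devices, n_cpus):
    """Return (rx_owner, tx_ring_for) binding tables.

    rx_owner[(dev, ring)]   -> CPU that serves that RxRing
    tx_ring_for[(cpu, dev)] -> TxRing index that CPU uses on dev
    Assumption: each device has n_cpus RxRings and n_cpus TxRings.
    """
    rx_owner = {}
    tx_ring_for = {}
    for dev in devices:
        for cpu in range(n_cpus):
            rx_owner[(dev, cpu)] = cpu      # RxRing `cpu` is served by CPU `cpu`
            tx_ring_for[(cpu, dev)] = cpu   # CPU `cpu` has a private TxRing
    return rx_owner, tx_ring_for


def forward_path(rx_dev, rx_ring, out_dev, rx_owner, tx_ring_for):
    """CPU and TxRing used by a packet from (rx_dev, rx_ring) to out_dev."""
    cpu = rx_owner[(rx_dev, rx_ring)]
    return cpu, tx_ring_for[(cpu, out_dev)]
```

With such a table, a packet stays on one CPU from reception to transmission, and no two CPUs ever write to the same TxRing, so the lock on each ring is never contended.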

10

Multi-CPU/core Enhancements

CPU/core binding to TxRing: Bind each CPU/core to a different TxRing on each output device.

CPU/core binding to RxRing: Bind each RxRing to a different CPU/core.

Xeon core: 1 Mpkt/s

Gigabit Ethernet interface: 1.488 Mpkt/s with 64 B frames

Fast Ethernet interface: 148.8 kpkt/s with 64 B frames
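The frame rates above follow from the Ethernet line rate: each 64 B frame also occupies 20 B on the wire (7 B preamble, 1 B start-of-frame delimiter, 12 B inter-frame gap). The sketch below reproduces the arithmetic; the helper names are hypothetical, and the per-core comparison is only a reading of the 1 Mpkt/s figure on this slide, not a measured result.

```python
import math


def max_frame_rate(link_bps, frame_bytes=64, overhead_bytes=20):
    """Maximum frames per second on an Ethernet link.

    overhead_bytes = 7 B preamble + 1 B SFD + 12 B inter-frame gap.
    """
    return link_bps / ((frame_bytes + overhead_bytes) * 8)


def cores_needed(pkt_rate, core_rate=1e6):
    """Cores required if one core forwards core_rate packets per second."""
    return math.ceil(pkt_rate / core_rate)


gbe_rate = max_frame_rate(1e9)    # Gigabit Ethernet, about 1.488 Mpkt/s
fe_rate = max_frame_rate(100e6)   # Fast Ethernet, about 148.8 kpkt/s

gbe_cores = cores_needed(gbe_rate)            # cores per GbE interface
fe_ports_per_core = math.floor(1e6 / fe_rate) # FE interfaces per core
```

Under these assumptions a single core cannot keep up with one Gigabit Ethernet interface at minimum frame size, which suggests splitting its traffic over several RxRings, while one core could serve several Fast Ethernet interfaces.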

11

Multi-CPU/core Enhancements

12

Performance Evaluation

Standard SR architecture

Agilent N2X router tester

13

Performance Evaluation

Standard SR architecture

14

Performance Evaluation

Enhanced SR architecture

15

Performance Evaluation

Enhanced SR architecture

16

Performance Evaluation

Enhanced SR architecture

17

Performance Evaluation

Multi-layer service support

18

Performance Evaluation

Multi-layer service support