XStream: Rapid Generation of Custom Processors for ASIC Designs

Post on 16-Mar-2016


Binu Mathew

* ASIC: Application Specific Integrated Circuit

Overview

- What is XStream?
- Comparison to Network Processors
- Design Flow
- Design Example: Ethernet Bridge/VLAN Switch

What is XStream?

- Software tool to rapidly generate high-performance custom stream processors
- Stream processing: repeated application of an algorithm kernel to a sequence of packets, subject to throughput specifications
- Resulting custom processors:
  - 40-90% of the performance of a custom ASIC
  - < 5% of the design effort of a custom ASIC
- Rapidly develop your own ultra-high-performance network processors!
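As a rough illustration of the stream-processing definition above, the following Python sketch applies one kernel function to every packet in a sequence. The `Packet` type, the `run_stream` helper, and the toy `pick_port` kernel are assumptions made up for this example; the slides do not show XStream's actual programming model.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass(frozen=True)
class Packet:
    """Hypothetical packet record; field names are illustrative only."""
    dst: int        # destination address (stand-in for a MAC address)
    payload: bytes


def run_stream(kernel: Callable[[Packet], int],
               packets: Iterable[Packet]) -> Iterator[int]:
    """Apply the same kernel to each packet, in arrival order."""
    for pkt in packets:
        yield kernel(pkt)


def pick_port(pkt: Packet) -> int:
    """Toy kernel: choose one of 16 output ports from the destination."""
    return pkt.dst % 16


if __name__ == "__main__":
    stream = [Packet(dst=d, payload=b"x" * 64) for d in (3, 18, 35)]
    print(list(run_stream(pick_port, stream)))  # [3, 2, 3]
```

In hardware the throughput specification would constrain how often the kernel can accept a new packet; this software sketch only shows the "same kernel, many packets" structure.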

When you use a Network Processor

What your product looks like
What your competitor’s product looks like

XStream vs Network Processor

What if my application does not look like this?

- Network processor: no help
- XStream: make a system that looks like my app in days

XStream vs Network Processor

What if I want to use cheaper DDR2 instead of RDRAM, or need more bandwidth?

- Network processor: no help
- XStream: select a different controller from the GUI and plop it on the chip

XStream vs Network Processor

What if I need:

- A different type/number of micro-engines
- A more capable control processor
- Additional high-performance processors for value-added services
- More crypto cores
- Different trie lookup hardware
- Different DRAM bandwidth
- Etc., etc., etc.

Network processor: no help
XStream: yes

Design Flow

- Draw an architecture diagram for your application
- Select processors, interfaces, IP blocks, etc. from a GUI
- Specify parameters, throughput requirements, etc.
- Specify the high-level function of any additional custom coprocessors you need
- Press a button and wait...
- XStream generates the h/w for you

Design Example

- Objective: design a platform chip that is shared across different products to save cost
  - Product 1: 16-port Ethernet bridge
  - Product 2: 16-port VLAN switch with advanced filtering abilities
- Major differences:
  - Wimpy ingress/egress processors are OK on the bridge
  - VLAN switch needs high-performance ingress/egress processors
  - VLAN switch needs a high-performance filter rule engine

XStream: Designing a Platform Chip

[Block diagram: 16 ports, each with a Link Interface, a Port Ingress Processor and a Port Egress Processor; Ingress Queue; Egress Queue; Crossbar; Stream Processor for Switching Decisions; Control Processor; External DRAM]

The Streams in XStream

[Figure, shown on three consecutive slides: the platform chip block diagram (16 ports, each with a Link Interface, a Port Ingress Processor and a Port Egress Processor; Ingress Queue; Egress Queue; Crossbar; Stream Processor for Switching Decisions; Control Processor; External DRAM)]

XStream: Mapping the core processor

[Figure: the platform chip block diagram (Link Interfaces, Port Ingress/Egress Processors for 16 ports, Ingress Queue, Egress Queue, Crossbar, Stream Processor for Switching Decisions, Control Processor, External DRAM)]

XStream: Mapping the core processor...

[Figure: Ingress Queue, Egress Queue, Stream Processor for Switching Decisions]

Imagine a snazzy GUI here.

Designer says:

- Stream processor, 8 issue
- Stream 1: Input, 16x1 queue, N deep
- Stream 2: Output, 16x1 queue, M deep
- Stream 3: Inout, RISC processor interface
- Add a CAM: 2 port, 48-bit keys, 1024 entries, 4-way associative, hash=F(…)

The tool ponders for a while…
Says: “Yes master”
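To make the CAM parameters above concrete, here is a minimal software model, assuming a set-associative organisation: 1024 entries at 4 ways gives 256 sets. The slide leaves the hash as F(…), so `fold_hash` below is a made-up placeholder, and the oldest-first eviction policy is likewise an assumption; neither is taken from XStream.

```python
from typing import List, Optional, Tuple

ENTRIES = 1024           # from the slide
WAYS = 4                 # from the slide
KEY_BITS = 48            # from the slide (e.g. an Ethernet MAC address)
SETS = ENTRIES // WAYS   # 256 sets of 4 entries each


def fold_hash(key: int) -> int:
    """Placeholder for the unspecified F(...): XOR-fold the key into 8 bits."""
    h = 0
    while key:
        h ^= key & 0xFF   # 8 bits index exactly SETS = 256 sets
        key >>= 8
    return h


class SetAssociativeCam:
    """Software model only; a hardware CAM compares all ways in parallel."""

    def __init__(self) -> None:
        # Each set holds up to WAYS (key, value) pairs, oldest first.
        self.sets: List[List[Tuple[int, int]]] = [[] for _ in range(SETS)]

    def insert(self, key: int, value: int) -> None:
        assert 0 <= key < (1 << KEY_BITS)
        ways = self.sets[fold_hash(key)]
        ways[:] = [(k, v) for k, v in ways if k != key]  # drop a stale copy
        if len(ways) == WAYS:
            ways.pop(0)                                  # assumed policy: evict oldest
        ways.append((key, value))

    def lookup(self, key: int) -> Optional[int]:
        for k, v in self.sets[fold_hash(key)]:
            if k == key:
                return v
        return None


if __name__ == "__main__":
    cam = SetAssociativeCam()
    cam.insert(0x001122334455, 7)      # MAC address -> port number
    print(cam.lookup(0x001122334455))  # 7
```

In a bridge, a table like this would typically map destination MAC addresses to output ports; a fifth key hashing to a full set displaces the oldest entry in that set.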

XStream: Mapping the core processor...

[Figure: Ingress Queue, Egress Queue, Stream Processor for Switching Decisions]

Imagine a snazzy GUI here.

- Designer writes 15 lines of code for the data plane, say in a subset of C
- Designer says: Schedule and report
- The tool ponders for a while… Says:
  - Compiled 45 instructions
  - Using modulo accelerator
  - Initiation interval = 8 cycles
  - Clock speed: 500 MHz
  - Throughput based on 64-byte (worst case) packet size: 500 MHz / 8 * 64 * 8 = 32 Gb/s
  - Area: 2.5 mm x 2.5 mm
  - Power: 1.2 W
- Single stream processor @ 500 MHz = 32 Gb/s
- Have designed up to 1 GHz processor in 0.13 µm process
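The throughput figure reported by the tool can be reproduced with a few lines of arithmetic. The sketch below assumes that one packet completes every initiation interval, which is how the slide's formula reads; the function name and its parameters are chosen for this example.

```python
def throughput_bps(clock_hz: float, initiation_interval: int,
                   packet_bytes: int) -> float:
    """Bits per second when one packet completes every `initiation_interval` cycles."""
    packets_per_second = clock_hz / initiation_interval
    return packets_per_second * packet_bytes * 8


if __name__ == "__main__":
    bps = throughput_bps(clock_hz=500e6, initiation_interval=8, packet_bytes=64)
    print(f"{bps / 1e9:.0f} Gb/s")  # 32 Gb/s
```

With these numbers the processor accepts 62.5 million packets per second, and 64 bytes x 8 bits per packet gives the quoted 32 Gb/s.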

XStream: Mapping the ingress processor...

[Figure: the platform chip block diagram (Link Interfaces, Port Ingress/Egress Processors for 16 ports, Ingress Queue, Egress Queue, Crossbar, Stream Processor for Switching Decisions, Control Processor, External DRAM)]

XStream: Mapping the ingress processor...

[Figure: Port Ingress Processor, Filter Rule Engine]

Imagine a snazzy GUI here.

Designer says:

- RISC processor engine, no-cache, 2 issue, scratchpad memory
- Stream 1: Input, link interface
- Stream 2: Output, StreamProc:Ingress Queue
- Add a Filter Rule Engine: Rule complexity = 64 terms, …

The tool ponders for a while… Says:

- RISC core and compiler generated
- Area: 1 mm x 1 mm (i.e. this can be replicated 100x on a 10 x 10 mm chip)
- Power: 250 mW
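The replication claim and a per-chip budget follow directly from the reported area and power. The sketch below assumes one ingress processor per port, as in the block diagram, and ignores everything else on the chip (egress processors, queues, crossbar, wiring); the function names are chosen for this example.

```python
CHIP_MM = (10.0, 10.0)   # chip size used in the slide's example
CORE_MM = (1.0, 1.0)     # ingress processor area from the slide
CORE_POWER_W = 0.250     # ingress processor power from the slide
PORTS = 16               # port count of the design example


def max_copies(chip_mm, core_mm) -> int:
    """How many cores tile the chip if nothing else occupies it."""
    return int(chip_mm[0] // core_mm[0]) * int(chip_mm[1] // core_mm[1])


def ingress_budget(copies: int):
    """Total area (mm^2) and power (W) for `copies` ingress processors."""
    area_mm2 = copies * CORE_MM[0] * CORE_MM[1]
    power_w = copies * CORE_POWER_W
    return area_mm2, power_w


if __name__ == "__main__":
    print(max_copies(CHIP_MM, CORE_MM))  # 100
    print(ingress_budget(PORTS))         # (16.0, 4.0)
```

Under these assumptions, sixteen ingress processors would take 16 mm² and 4 W, a small fraction of the area of a 10 x 10 mm chip.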

Summary

- Showed network processor design
  - But might as well be multimedia or wireless product design
- Very high performance custom processors replace ASIC modules
  - Reduce design time for stream-oriented ASIC modules by 95%
  - Retain 40-90% of ASIC performance
- Software replaces hardware design
  - Software prototype already exists
  - Flexible, fast bug fixes, feature upgrades
  - Share chip across product family