2013 ARM Symposium Alteraarmtechforum.com.cn/2013/3_Altera.pdfsubsystem JTAG Dedicated JTAG...

22
Altera SoC Your User-Customizable System on Chip

Transcript of 2013 ARM Symposium Alteraarmtechforum.com.cn/2013/3_Altera.pdfsubsystem JTAG Dedicated JTAG...

Page 1: 2013 ARM Symposium Alteraarmtechforum.com.cn/2013/3_Altera.pdfsubsystem JTAG Dedicated JTAG connection Visualize & control FPGA ARM® DS-5 Toolkit Altera Quartus ... Load balancing

Altera SoC

Your User-Customizable System on Chip

Page 2: 2013 ARM Symposium Alteraarmtechforum.com.cn/2013/3_Altera.pdfsubsystem JTAG Dedicated JTAG connection Visualize & control FPGA ARM® DS-5 Toolkit Altera Quartus ... Load balancing

Providing the Best of Both Worlds

2

SoC FPGA

Dual Core ARM Cortex® -A9

MPCore Processor

Hard

Memory

Controller

Peripherals

ARM Processor System FPGA

ARM + Altera = SoC

®

Page 3: 2013 ARM Symposium Alteraarmtechforum.com.cn/2013/3_Altera.pdfsubsystem JTAG Dedicated JTAG connection Visualize & control FPGA ARM® DS-5 Toolkit Altera Quartus ... Load balancing

28nm SoC System Architecture

3

Processor

Dual-core ARM® Cortex-A9 MPCore™ processor

4,000 MIPS (up to 800 MHz per core)

NEON coprocessor with double-precision FPU

32-KB/32-KB L1 caches per core

512-KB shared L2 cache

Multiport SDRAM controller

Up to 533-MHz DDR3 and LPDDR2

Up to 400-MHz DDR2

Up to 200-MHz Mobile DDR

Integrated ECC support

High-bandwidth on-chip interfaces

> 125-Gbps HPS-to-FPGA interface

> 125-Gbps FPGA-to-SDRAM interface

Cost- and power-optimized FPGA fabric

Lowest power transceivers

Up to 1,600 GMACS, 300 GFLOPS

Up to 25Mb on-chip RAM

More hard intellectual property (IP): PCIe® and memory

controllers

PCIe Multiport DDR SDRAM

Controller

Multiport DDR SDRAM

Controller

Hard Processor System (HPS)

ARM Cortex-A9

NEON / FPU

L1 Cache

L2 Cache

USB

OTG

(x2) (1)

64-KB

RAM

DMA (8 Channels)

FPGA

Shared Multiport DDR

SDRAM Controller (2)

JTAG

Debug /

Trace (1)

ARM Cortex-A9

NEON / FPU

L1 Cache

SD /

SDIO/

MMC (1)

I2C

(x2)

CAN

(x2)

GPIO

SPI

(x2)

Hard

PCIe

Timers

(x11)

HPS to

FPGA

FPGA

to HPS

FPGA

Configu

ration

UART

(x2)

Ethernet

(x2) (1)

QSPI

Flash

Control

NAND

Flash (1) (2)

Hard Multiport DDR

SDRAM Controller (2)

3-, 5-, 6-,

and 10-Gbps

Transceivers

Notes:

(1) Integrated direct memory access (DMA)

(2) Integrated ECC

•28LP process

• 8-input ALMs

• Variable-precision DSP

• M10K memory and

640-bit MLABs

• fPLLs

HP

S I/O

s

FP

GA

Ge

ne

ral P

urp

os

e I/O

s

Page 4: 2013 ARM Symposium Alteraarmtechforum.com.cn/2013/3_Altera.pdfsubsystem JTAG Dedicated JTAG connection Visualize & control FPGA ARM® DS-5 Toolkit Altera Quartus ... Load balancing

Embedded SW Availability

4

Vendor OS/RTOS Development

Tools

Open Source Linux 3.10 Linaro compiler

Wind River Systems VxWorks 6.9.2 Workbench 3.3.2

Micriµm µC/OS-II, µC/OS-III GNU compiler

Enea OSE 5.5.3 Optima 2.6

Express Logic ThreadX 5.5.3 GNU compiler

Wind River Systems Wind River Linux 5 Workbench/GNU

QNX QNX/Neutrino 6.5.3 Momentics

Fujisoft Android GNU compiler

Green Hills INTEGRITY Multi/Green Hills

Microsoft Windows Embedded 7 Microsoft/Studio

Page 5: 2013 ARM Symposium Alteraarmtechforum.com.cn/2013/3_Altera.pdfsubsystem JTAG Dedicated JTAG connection Visualize & control FPGA ARM® DS-5 Toolkit Altera Quartus ... Load balancing

System Development Flow

Standard FPGA Flow Standard Software Flow

Hardware

Development

Software

Development

Release Release • Quartus II Programmer

• In-system Update • Flash Programmer

Simulate Simulate • ModelSim, VCS, NCSim, etc.

• AMBA-AXI and Avalon bus

functional models (BFMs)

• Virtual Target

Debug Debug • SignalTap™ II logic analyzer

• System Console

• GNU, Lauterbach, DS5

and ARM ecosystem

• Quartus II design software

• Qsys system integration tool

• Standard RTL flow

• Altera and partner IP

• ARM Development Studio 5

• GNU toolchain

• OS/BSP: Linux, VxWorks

• Etc…

Design Design

FP

GA

in

th

e L

oo

p

Page 6: 2013 ARM Symposium Alteraarmtechforum.com.cn/2013/3_Altera.pdfsubsystem JTAG Dedicated JTAG connection Visualize & control FPGA ARM® DS-5 Toolkit Altera Quartus ... Load balancing

ARM® DS-5™

Altera® Edition Toolkit

Page 7: 2013 ARM Symposium Alteraarmtechforum.com.cn/2013/3_Altera.pdfsubsystem JTAG Dedicated JTAG connection Visualize & control FPGA ARM® DS-5 Toolkit Altera Quartus ... Load balancing

One Device, Two Debugging Tools?

7

Dedicated JTAG connection

Visualize & control CPU

subsystem

JTAG

Dedicated JTAG connection

Visualize & control FPGA

ARM® DS-5™ Toolkit Altera Quartus® II Software

JTAG DSTREAM™

Page 8: 2013 ARM Symposium Alteraarmtechforum.com.cn/2013/3_Altera.pdfsubsystem JTAG Dedicated JTAG connection Visualize & control FPGA ARM® DS-5 Toolkit Altera Quartus ... Load balancing

One Device, Two Debugging Tools?

8

Dedicated JTAG connection

Visualize & control CPU

subsystem

JTAG

Dedicated JTAG connection

Visualize & control FPGA

ARM® DS-5™ Toolkit Altera Quartus™ II Software

JTAG DSTREAM™

Page 9: 2013 ARM Symposium Alteraarmtechforum.com.cn/2013/3_Altera.pdfsubsystem JTAG Dedicated JTAG connection Visualize & control FPGA ARM® DS-5 Toolkit Altera Quartus ... Load balancing

ARM® Development

Studio 5 (DS-5™) Altera® Edition Toolkit

Industry’s First FPGA-Adaptive Software Toolkit

Removes debugging barrier between CPUs and FPGA

Unique OEM agreement between Altera and ARM

Result of innovation in silicon, software, and business model

9

Altera USB-Blaster™

Connection

Page 10: 2013 ARM Symposium Alteraarmtechforum.com.cn/2013/3_Altera.pdfsubsystem JTAG Dedicated JTAG connection Visualize & control FPGA ARM® DS-5 Toolkit Altera Quartus ... Load balancing

Unprecedented Combination of Productivity-

Boosting Features

Single USB-Blaster target connection

for software and hardware debug

Automatic creation of register views

of FPGA peripherals

Non-intrusive trace of CPU software instructions correlated

with application events and FPGA hardware events

Hardware cross-triggering between the CPU and FPGA

domains

Simultaneous debug and trace for Cortex-A9 cores and

CoreSight™-compliant cores synthesized on FPGA

Streamline support: Statistical analysis of software load and

bus traffic spanning the CPUs and FPGA

10

Page 11: 2013 ARM Symposium Alteraarmtechforum.com.cn/2013/3_Altera.pdfsubsystem JTAG Dedicated JTAG connection Visualize & control FPGA ARM® DS-5 Toolkit Altera Quartus ... Load balancing

Altera SoC Roadmap

Page 12: 2013 ARM Symposium Alteraarmtechforum.com.cn/2013/3_Altera.pdfsubsystem JTAG Dedicated JTAG connection Visualize & control FPGA ARM® DS-5 Toolkit Altera Quartus ... Load balancing

Long-Term SoC Roadmap Commitment

12

Featu

res &

Perf

orm

ance

1st Generation 28 nm TSMC

1.05 GHz Dual

ARM Cortex-A9 MPCore™

2nd Generation 20 nm TSMC

3rd Generation 14 nm Intel Tri-Gate

Quad ARM Cortex-A53 ARMv8

Processor

1.5 GHz Dual ARM Cortex-A9 MPCore™

More Coming

Page 13: 2013 ARM Symposium Alteraarmtechforum.com.cn/2013/3_Altera.pdfsubsystem JTAG Dedicated JTAG connection Visualize & control FPGA ARM® DS-5 Toolkit Altera Quartus ... Load balancing

13

Arria 10 SoCs: Hard Processor Subsystem

Faster 1.5 GHz per core, total 7500 MIPS

Software Compatible Extensive of software, OS/BSP, tools reuse with

28nm SoC

More Secure Encryption (AES)

SHA authentication, EC DSA Authentication

Root of Trust Support (Certification Authority)

Anti-Tamper

Improved Architecture Device IO can be shared between FPGA logic and HPS

3rd EMAC

256 KB Scratch RAM

8 and 16 Bit NAND Flash device supported

eMMC SD Card Support

50% More Processor Performance

Full Software Compatibility

Superior Architecture

Page 14: 2013 ARM Symposium Alteraarmtechforum.com.cn/2013/3_Altera.pdfsubsystem JTAG Dedicated JTAG connection Visualize & control FPGA ARM® DS-5 Toolkit Altera Quartus ... Load balancing

ARM Cortex® -A53 on Intel 14 nm Tri-Gate Process

High Performance + Power Efficiency

>6x throughput improvement

Highest power efficiency of any 64-bit processor

Wide portfolio reusability

Software compatibility with previous generation (32-bit mode)

Cortex-A53 target markets overlap with Stratix 10 SoCs (communications infrastructure,

enterprise, datacenter)

14

Page 15: 2013 ARM Symposium Alteraarmtechforum.com.cn/2013/3_Altera.pdfsubsystem JTAG Dedicated JTAG connection Visualize & control FPGA ARM® DS-5 Toolkit Altera Quartus ... Load balancing

Summary

Altera Cyclone V / Arria V SoC Silicon in Mass Production

Comprehensive Ecosystem

ARM DS5 Altera Edition Industry’s First FPGA-Adaptive Software Toolkit

Hardware cross-triggering between the CPU and FPGA domains

Altera SoC roadmap Commitment 28nm SoC

20nm SoC

14nm SoC

And more…

15

Page 16: 2013 ARM Symposium Alteraarmtechforum.com.cn/2013/3_Altera.pdfsubsystem JTAG Dedicated JTAG connection Visualize & control FPGA ARM® DS-5 Toolkit Altera Quartus ... Load balancing

Thank You Thank You

Page 17: 2013 ARM Symposium Alteraarmtechforum.com.cn/2013/3_Altera.pdfsubsystem JTAG Dedicated JTAG connection Visualize & control FPGA ARM® DS-5 Toolkit Altera Quartus ... Load balancing

Qsys User Interface

17

Validation Display

Interfaces Exported for Hierarchy

Toolbar

Tabs IP Library System Interconnect

Page 18: 2013 ARM Symposium Alteraarmtechforum.com.cn/2013/3_Altera.pdfsubsystem JTAG Dedicated JTAG connection Visualize & control FPGA ARM® DS-5 Toolkit Altera Quartus ... Load balancing

A10 SoC Logic and Processor Functions By Application

Wireless

Infrastructure

Access, Metro &

Core

Transmission Cloud Servers

and Storage

Broadcast

Target Applications

• Remote Radio Head

• Mobile Backhaul

• Active Antenna

• Basestation (BTS)

• 4G/LTE Marco eNB

• 4G/LTE Micro eNB

• 40G GPON, EPON,

FFTH, Switch

• 100G / 200G NGPON

• 100G Traffic

Management

• NX 100G OTU 4

• 2 X OTU 4

• 4 X OTU 4

• Flash Cache

• Cloud

• Server

• Acceleration

• Pro A/V Equipment

• Switcher

• Server

• Transport

• Head End

• VoD Mux

Logic Functions

• RF Processing

• Digital Pre-Distortion

(DPD)

• Baseband Interface

• Aggregation

• Bridging

• Switching

• Traffic Management

• IO

• FEC

• Aggregation

• Muxponding

• IO

• Flash Cache

Processing

• Acceleration

• SATA/SAS

• PCIe Gen 3

• Video Format

Conversion

• Muxing

• Switching

• Bridging

Processor Functions

• OAM & Link

• Digital Pre-Distortion

(DPD)

• L2 Switch

• IO, Protocol Control

• Host Offload

• OAM & Link

• L2 Switch

• IO, Protocol Control

• Chassis Mgmt

• Host Offload

• OAM & Link

• IO Control

• Chassis Mgmt

• Flash Cache Control

• Host Offload

• Co-processing &

Acceleration Control

• Audio Processing

• Video Compression

• Link Management

18

Page 19: 2013 ARM Symposium Alteraarmtechforum.com.cn/2013/3_Altera.pdfsubsystem JTAG Dedicated JTAG connection Visualize & control FPGA ARM® DS-5 Toolkit Altera Quartus ... Load balancing

Stratix 10 SoC: Altera’s Most Advanced Product Ever

Delivering exceptional adaptability, performance, and power efficiency

19

QUAD 64-BIT

PROCESSOR

ADVANCED TOOLS

1GHZ FPGA LOGIC

>10 TFLOPS DSP

+

+

+

+

ALTERA®

SDK

FOR

OPENCL ™

Page 20: 2013 ARM Symposium Alteraarmtechforum.com.cn/2013/3_Altera.pdfsubsystem JTAG Dedicated JTAG connection Visualize & control FPGA ARM® DS-5 Toolkit Altera Quartus ... Load balancing

Stratix 10 SoC In Datacenter Compute Acceleration

Design Challenge

Surging demand for compute-

intensive applications (eg., search

acceleration)

Power budgets critical

Workload balancing between

host processor and accelerators

Stratix 10 SoC

Heterogeneous Computing

High performance search

acceleration engines implemented

in high performance logic and

hard floating point DSP blocks

Highest performance per watt vs.

competing solutions

Load balancing and flow control

with integrated A53 processor

20

A53

Layer

DSP

Layer

Logic

Layer

Embedded OpenCL host

Flow control & load balancing

FPGA link, config, power

management

Search rules

Data mining

Deep packet inspection (page

level)

Floating point computation

Matrix manipulations

Deep packet inspection (bit level)

Search acceleration engines

WO

RK

LO

AD

DI

ST

RI

BU

TI

ON

Host

Processor

(Control

Plane)

Page 21: 2013 ARM Symposium Alteraarmtechforum.com.cn/2013/3_Altera.pdfsubsystem JTAG Dedicated JTAG connection Visualize & control FPGA ARM® DS-5 Toolkit Altera Quartus ... Load balancing

Stratix 10 SoC In Networking and Transmission

Design Challenge

Subscriber bandwidths increasing

Cooling & Opex challenges due to system power & cost

Network complexity

Stratix 10 SoC Smart Line Card Processing

Increases subscriber bandwidth per line card with high-performance transceivers

Reduce system power & cost by integrating local line card processor with FPGA

Simplify networking by offloading local line card processing functions to on-chip A53 processor

21

A53

Layer

DSP

Layer

Logic

Layer

Line card chassis management

Operations & Maintenance (OAM)

Link management (EMAC, PCIe)

FPGA config and remote update

Control plane processing

Packet processing

Quality of Service management

Limited usage of DSP blocks

Bridging Transponding Forward Error Correction (FEC) MAC control PHY

Host

Processor

(Control

Plane)

WO

RK

LO

AD

DI

ST

RI

BU

TI

ON

Page 22: 2013 ARM Symposium Alteraarmtechforum.com.cn/2013/3_Altera.pdfsubsystem JTAG Dedicated JTAG connection Visualize & control FPGA ARM® DS-5 Toolkit Altera Quartus ... Load balancing

Host

Processor

(Control

Plane)

Stratix 10 SoC In Radar Applications

Design Challenge

Performance and power

efficient SoC design

TFLOPs-class floating point

performance

Security & Encryption

Stratix 10 SoC

Heterogeneous Computing Highest performance per watt vs.

competing solutions

High performance waveform and

signal processing with up to 10

TFLOPS hard floating point DSP

blocks

Secure boot, DSP and flow control

with integrated A53 processor

Design productivity with OpenCL

22

Embedded OpenCL host

Flow control & load balancing

Secure Boot

FPGA config & power management

Secure communication

Guidance and control

Waveform processing

DSP transforms, matrix operations

Floating point computations

Deep Packet Inspection

Waveform processing

Crypto engines

A53

Layer

DSP

Layer

Logic

Layer

WO

RK

LO

AD

DI

ST

RI

BU

TI

ON