Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

WEBINAR Traditional vs. SoC FPGA Design Flow with A Video Pipeline Case Study May 8, 2014

description

This presentation compares a traditional FPGA engineering design flow with one that employs an SoC FPGA. The two approaches are contrasted in terms of their impact on system architecture design, debugging, risk mitigation, system integration, bring-up, feature enhancements, design obsolescence, and engineering effort. A case study explores these impacts within a video pipeline development effort.

Transcript of Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

Page 1: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

WEBINAR: Traditional vs. SoC FPGA Design Flow with A Video Pipeline Case Study

May 8, 2014

Page 2: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

2

SPEAKERS

Allan DubeauPrincipal Design

Engineer, Nuvation Engineering

Todd KoellingSr. Marketing Manager,

SoC Products, Altera

Stefan RosingerProduct Manager,

CPU, ARM

Joseph XavierMarketing Manager, Nuvation Engineering

Moderator:

Page 3: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

SOC FPGA VS. TRADITIONAL FPGA DESIGN FLOW

AGENDA

1. Introductions

2. Compare the impacts of a traditional FPGA engineering design flow to one employing an SoC FPGA, from the perspective of:

• ARM and Altera design/development
• System architecture design
• Risk mitigation
• System integration & bring-up
• Achieving customer satisfaction through requirements understanding

3. Case Study: Video pipeline evaluation platform

Page 4: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

WHO WE ARE: COMPANY PROFILE

NUVATION SPECIALIZES IN PRODUCT REALIZATION AND CUTTING-EDGE DESIGNS.

Established in 1997

Offices in Silicon Valley and Waterloo, Canada

800+ successful projects completed

Clients ranging from start-ups to Fortune-50 companies

Exceptional Engineers and Program Managers

Flexible business models to match customer and market requirements

SUNNYVALE, CA, USA | WATERLOO, ON, CAN

Page 5: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

WHO WE ARE: NUVATION ENGINEERING

A top-tier Altera Design Services Network partner since 1998, with projects delivered using virtually all device families

Experts with ARM-based microprocessors and SoCs

Altera Partner & EDS Provider

ARM Connected Community Member

Page 6: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

THE SoC FPGA

SoC FPGAs integrate an ARM-based processor system with FPGA fabric.

The benefits of a single-chip solution are higher performance, lower power, lower cost, and smaller board space.

[Figure: Before: a separate FPGA and CPU, each with its own devices and memory. After: a single FPGA + CPU chip with devices and memory.]

Page 7: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

Altera Embedded Technology Centers of Excellence

World Class Embedded Team Bringing Customizable SoCs to Market

San Jose HQ (Premal Buch, VP Engineering): FPGA IP and Development Tools
• FPGA Tools
• Qsys
• System Console
• AXI/Avalon IP interface

Austin Technology Center (Ty Garibay, VP Engineering): Embedded IC Design and Software
• SoC FPGA IC Design
• Embedded Software
• Nios II Soft Processor

Europe Technology Center (Mark Dickinson, VP Engineering): Embedded System Solutions
• Industrial Applications
• Automotive Applications
• Communications Solutions
• Video Processing

Page 8: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

THE ARCHITECTURE FOR THE DIGITAL WORLD®

From sensors to servers, ARM designs processor technology that lies at the heart of advanced consumer products

Page 9: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

ARM TECHNOLOGY

Advanced consumer products are incorporating more and more ARM technology – from processor and multimedia IP to software

• Software development tools
• Physical IP – Design of the building blocks of the chip
• Processor and Graphics IP – Design of the brain of the chip

[Figure: consumer device block diagram with labels: Power Mgmt, Bluetooth, Cellular Modem, WiFi, SIM, GPS, Flash Controller, Touchscreen & Sensor Hub, Sensor Hub, Camera, Apps Processor]

Page 10: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

THE CHIP IS THE SYSTEM

• ARM delivers technology to drive scalable, efficient system-on-chip solutions:

– Software increasing system efficiency with optimized software solutions

– Diverse components, including CPUs and GPUs designed for specific tasks

– Interconnect System IP delivering coherency and the quality of service required for lowest memory bandwidth

– Physical IP for a highly optimized processor implementation

Page 11: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

ARM and Altera Design/Development:

Co-Design and Co-Development

Page 12: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

ARM AND ALTERA DESIGN/DEVELOPMENT: CO-DESIGN AND CO-DEVELOPMENT

ARM Development Studio 5 (DS-5) Altera Edition Toolkit

Altera USB-Blaster™ II Connection

Page 13: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

Industry First: FPGA-Adaptive Debugging

Removes debugging barrier between CPUs and FPGA

• Synchronizes debug between the FPGA and ARM processor software

• Cross-triggering between the HPS and FPGA fabric allows inter-domain synchronization, for instance, to stop all hard and soft processors simultaneously.

Automatic creation of register views of FPGA peripherals

• Qsys-generated FPGA peripheral registers definition can be imported into DS-5 Debugger for system register visibility while debugging software

ARM AND ALTERA DESIGN/DEVELOPMENT: CO-DESIGN AND CO-DEVELOPMENT

ARM® Development Studio 5 (DS-5™) Altera® Edition Toolkit

Page 14: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

ARM AND ALTERA DESIGN/DEVELOPMENT: CO-DESIGN AND CO-DEVELOPMENT

System-Level Analysis with Streamline Performance Analyzer

CUSTOMIZABLE DATA COLLECTION TO ENABLE SYSTEM WIDE OPTIMIZATION

• OS counters: CPU load, thread wait, network, memory
• CPU counters, aggregated, per core or cluster: clock cycles, instructions, branch prediction, cache & TLB misses
• System and custom counters: level 2 cache, interconnect, custom metrics
• Visual annotations: from bitmaps to frame buffers
• Process/thread heat map: color reflects CPU/GPU activity
• Text annotations: signal high-level events in the application

Connection options:
• Capable and Cost Effective: Altera USB-Blaster™ II
• Elaborate: ARM DSTREAM

Page 15: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

System Architecting

Page 16: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

SYSTEM ARCHITECTING

Profiling assessment in relation to a client’s requirements

• Initial requirements may be biased by pre-conceived feature limitations or by the bandwidth limitations of external interfaces.

• Requirements can be enhanced by removing the bottleneck associated with external multi-chip solutions

• The tight integration between the HPS and FPGA fabric provides over 125-Gbps peak bandwidth with integrated data coherency between the processors and the FPGA.

• Computation profiling is an available feature of the development tools

—Streamline support: Statistical analysis of software load and bus traffic spanning the CPUs and FPGA

• Many elements of the hardware and software can be independently developed

Page 17: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

SYSTEM ARCHITECTING

FPGA Design Flow (Hardware Development) and Software Design Flow (Software Development)

Design (HW/SW Handoff)
  Hardware Development:
  • Quartus II design software
  • Qsys system integration tool
  • Standard RTL flow
  • Altera and partner IP
  Software Development:
  • ARM Development Studio 5
  • GNU tool chain
  • OS/BSP: Linux, VxWorks
  • Hardware Libraries
  • Design Examples

Simulate
  Hardware Development:
  • ModelSim, etc.
  • AMBA-AXI and Avalon bus functional models (BFMs)
  Software Development:
  • Virtual Target

Debug
  Hardware Development:
  • SignalTap™ II logic analyzer
  • System Console
  Software Development:
  • ARM Development Studio 5
  • GNU, Lauterbach
  • iSystem, SEGGER

Release
  Hardware Development:
  • Quartus II Programmer
  • In-system Update
  Software Development:
  • Flash Programmer

Page 18: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

SYSTEM ARCHITECTING

HW/SW Design Partitioning

Partitioning can be postponed and refined at a later phase of the design process and is not part of the critical path.

Page 19: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

SYSTEM ARCHITECTING

HW/SW Design Partitioning (cont.)

The architecture and development flow will lead towards an earlier IDM deployment, effort re-use, as well as a more IDM friendly architecture and deployment strategy.

Page 20: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

SYSTEM ARCHITECTING

HW/SW Design Partitioning (cont.)

• Data coherency maximizes partitioning options

—Increases power-efficient computing

o Co-processing certain tasks in the FPGA fabric not only improves performance but also reduces the power consumed per computational element

—Lower and much finer-grained latency potential (no protocol overheads to deal with)

—Saves on board-level debugging

—Is not I/O restrictive and removes the need to commit to a board-level CPU-to-FPGA interface

—Expedites board design

[Figure: block diagram with labels: Hard Processor System (HPS), FPGA, Shared Multiport DDR SDRAM Controller, HPS to FPGA, FPGA to HPS, FPGA Config.]

Page 21: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

SYSTEM ARCHITECTING

Monolithic SoC advantages

• Retains some of the characteristics of a multi-chip solution

—Independent powering

—Flexible configuration and boot sequencing

• Reduced power, voltage rail requirements, clocking, BOM, layout, etc.

Page 22: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

SYSTEM ARCHITECTING

Integrated Hard IP peripherals

• Increased performance

• Reduces implementation time and cost

• Reduces FPGA resource and IP cost requirements

—PCIe multi-function support can save around 20K logic elements alone

• Can share external DDR memory

• FPGA can also access some of the other HPS Hard IP peripherals

Page 23: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

Risk Mitigation

Page 24: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

RISK MITIGATION

An SoC lends itself to a better step-wise development and integration process

Integrated data coherency, together with not having to commit to a multi-chip interface, maximizes flexibility to absorb functional requirement changes

• Not having to commit to the FPGA and CPU I/O at the board level.

Earlier customer-wide deployment, taking advantage of both in-the-field updates and extended, flexible functional requirement changes and enhancements.

Page 25: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

RISK MITIGATION

Extends integration and definition of final functionality further along the design process cycle.

More easily adapts to handle unforeseen changes in external requirements or expected signaling behavior.

Reduces initial commitment directly associated with “What-if” scenarios

• Can allocate effort only when certain criteria become apparent or are part of the critical path.

Based on well-established HW/SW manufacturers and distributors with mature and extensive ecosystems.

A well-established SoC-friendly architecture can raise the abstraction level for easier and faster incorporation of design enhancements, as well as test and debug

Page 26: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

System Integration and Bring-Up

Page 27: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

SYSTEM INTEGRATION AND BRING-UP

Faster and less costly when factoring in elements such as:

• Board and layout issues of the multi-chip interface being eliminated

• Using extensive co-verification real-time debugging options

Easier systematic debugging process through incremental and step-wise development

• Taking advantage of the Cyclone V SoC ARM core’s computational power.

Can incorporate and leverage a familiar, extensive and powerful Linux-based application level for debugging.

• The interfaces between many of the hard-core peripherals and software drivers are mature and can be used as-is without additional debugging

Page 28: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

SYSTEM INTEGRATION AND BRING-UP: 3RD PARTY VENDOR EXAMPLE USING CUSTOM HW/SW VALIDATION

3rd party example of a HW/SW validation platform from Flexibilis that makes use of an external custom processor environment. Many benefits of this system-level validation solution can be realized and incorporated within the Cyclone V SoC itself. This includes system solutions that would not necessarily populate a processor.

Page 29: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

Achieving Customer Satisfaction through Requirements Understanding

Page 30: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

ACHIEVING CUSTOMER SATISFACTION THROUGH REQUIREMENTS UNDERSTANDING

Progression milestones that fit with the client’s demonstrations and establish confidence in the progress of deliverable(s)

• The SoC architecture lends itself to a more incremental functional bring-up process. Initial conceptual functionality and proofs of concept can be implemented using the powerful ARM core and incorporated as milestones for demonstration purposes.

• The FPGA fabric can incorporate “What-if” scenarios for SW corner cases that may be present in the field, so that they can be tested in a controlled and observable manner.

—example shown on the next slide

Page 31: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

ACHIEVING CUSTOMER SATISFACTION THROUGH REQUIREMENTS UNDERSTANDING

Design for testability (Actual Example)

• ARM can be used

—to prototype FPGA-destined algorithms

o developed faster in SW than in RTL

o can later serve to verify HW with “Golden Test Vectors”

• FPGA can create Fault-Insertion Modules to profile and verify SW flow control handling and corner cases
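The golden-test-vector and fault-insertion ideas on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 3-tap smoothing filter stands in for an FPGA-destined algorithm, the names `golden_filter`, `make_vectors`, `check_hw_output` and `inject_fault` are hypothetical, and the "hardware" is simulated by calling the software model itself.

```python
from typing import Callable, List, Tuple

Samples = List[int]
Vector = Tuple[Samples, Samples]  # (stimulus, expected output)


def golden_filter(samples: Samples) -> Samples:
    """Software reference model: 3-tap moving average with edge clamping."""
    n = len(samples)
    out = []
    for i in range(n):
        left = samples[max(i - 1, 0)]
        right = samples[min(i + 1, n - 1)]
        out.append((left + samples[i] + right) // 3)
    return out


def make_vectors(stimuli: List[Samples]) -> List[Vector]:
    """Pair each stimulus with the output of the software model."""
    return [(s, golden_filter(s)) for s in stimuli]


def check_hw_output(vectors: List[Vector],
                    hw_run: Callable[[Samples], Samples]) -> List[int]:
    """Return the indices of vectors whose hardware output differs."""
    return [i for i, (stim, expected) in enumerate(vectors)
            if hw_run(stim) != expected]


def inject_fault(hw_run: Callable[[Samples], Samples],
                 bad_index: int) -> Callable[[Samples], Samples]:
    """Model a fault-insertion module: flip the LSB of one output sample."""
    def faulty(stim: Samples) -> Samples:
        out = list(hw_run(stim))
        if bad_index < len(out):
            out[bad_index] ^= 0x1
        return out
    return faulty


vectors = make_vectors([[0, 30, 60, 90], [255, 255, 0, 0]])
mismatches_ok = check_hw_output(vectors, golden_filter)
mismatches_faulty = check_hw_output(vectors, inject_fault(golden_filter, 2))
print(mismatches_ok, mismatches_faulty)
```

In a real bring-up, `hw_run` would be replaced by whatever mechanism reads results back from the FPGA fabric; the comparison logic stays the same.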

Page 32: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

ACHIEVING CUSTOMER SATISFACTION THROUGH REQUIREMENTS UNDERSTANDING

Reliability

• Apply non-intrusive tasks between both CPU and FPGA

—Data and flow control integrity health monitoring

o Many FPGA designs omit this due to the impact on additional resources and development time

o A small subset can be implemented on latency-sensitive signaling and exposed to SW routines

o Software-based health monitoring is faster/easier and less resource intensive to implement

Page 33: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

Cyclone V SoC Case Study: A Video Pipeline Evaluation Platform

Page 34: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

CYCLONE V SoC CASE STUDY: VIDEO PIPELINE EVALUATION PLATFORM

Nuvation Design Engagement: Video Pipeline Evaluation Platform

17 Calendar Week Engagement

1600 Eng-Hours (40 Eng-Weeks)

2 FPGA Designs

1 Controller (NIOS II) Software Design

Page 35: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

CYCLONE V SoC CASE STUDY: VIDEO PIPELINE EVALUATION PLATFORM

BLOCK DIAGRAM OF FPGA ARCHITECTURE

The FPGA ‘A’ board shown below is the board targeted for the Cyclone V SoC solution

Page 36: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

ARM AND ALTERA DESIGN/DEVELOPMENT: CO-DESIGN AND CO-DEVELOPMENT

DEVELOPMENT AND DEBUG TOOL CHAIN BASED ON MATURE ECOSYSTEM OF TOOLS FOR BOTH ARM AND ALTERA

Cyclone V SoC

Familiarity with the tool chains means no extra engineering time and effort is incurred

Allows more flexibility for engineering resourcing

Complex interaction between the PC and the on-board FPGA ARM controller will benefit from the enhanced debugging between the FPGA fabric and the ARM controller

Can utilize Linux and the Ethernet interface to develop and debug application SW for both the low-level and PC interfaces

FPGA only

Embedded OS

• The PC-based application required both low-level and high-level control of the FPGA Video Pipeline

• The NIOS solution posed a problem in horsepower and required additional time in the Initial Design Engagement (IDE) phase to nail down requirements

• The initial design UI was not ready until the late stages of the design

Page 37: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

SYSTEM ARCHITECTING

HW/SW DESIGN PARTITIONING

Cyclone V SoC

An SoC solution allows some of the implementation details to be optimized during implementation, with an initial architecture approach unblocking the development phase of the FPGA.

FPGA only

The Initial Design Engagement phase was over 8 engineering weeks of effort spanning an initial 5 week period

• There was considerable effort spent on the initial architecture due to the custom protocol between the PC and the on-board NIOS II

• The architecture partitioning affected the FPGA implementation and both the Software and Hardware Detailed Design Specifications

• Several architecture decisions could not be postponed

Page 38: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

SYSTEM ARCHITECTING

INTEGRATED HARD IP PERIPHERALS (MEMORY)

Cyclone V SoC

Cyclone V SoC Hard IP

• The design’s memory bandwidth requirements can be met by a single HPS hard memory controller serving both ARM program execution and Video Capture storage

• The FPGA DDR Pipeline buffer can be accessed by the ARM to implement incremental bring-up

Using the hard memory controller reduces implementation and debugging time, in both calendar time and engineering effort, by 2-3 weeks. Details to follow.

FPGA only

The design required storage for Video Frame buffering for the Video Pipeline and for Video Capture, controlled by the user on the Host PC via the NIOS II

The original design necessitated 2 external interfaces and 2 DDR soft cores

• Timing closure required a reduction in speed.

Page 39: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

SYSTEM ARCHITECTING

INTEGRATED HARD IP PERIPHERALS (I2C AND SPI)

Cyclone V SoC

This solution is available as Hard IP connected directly to the ARM block

• Linux access to the I2C and SPI modules

—Low level drivers already exist

—Functional HW/SW blocks that do not need to be debugged

—Already available to perform board level tests of the external interfaces

• Another example of savings

—Time and effort costs

—FPGA internal resource reduction

FPGA only

The original design used FPGA soft core modules in conjunction with the NIOS II to interface with the second FPGA sensor board and configure external HDMI chips.

• This also involved Software Driver development work for the NIOS II developer

Page 40: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

SYSTEM ARCHITECTING

INTEGRATED HARD IP PERIPHERALS (USB 2.0/GIGE ETHERNET/CUSTOM)

Cyclone V SoC

The Cyclone V SoC GIGE Ethernet interface is a “slam dunk” winner for the client

• Reduces complexity and image transfer time to the Host PC.

• Leverages the existing Linux Ethernet interface connected to the Host PC

• Drastically reduces controller software development complexity and the need for a custom-developed interface to the FPGA fabric

• Early access for integration and debug

FPGA only

The client was originally looking to use USB 2.0 to interface between the host PC and the FPGA system board

• The board USB 2.0 was not functional, so another means was looked into

• GIGE Ethernet was available but the client was concerned about the additional development time, costs and feasibility

• The client opted for a Custom Interface using the FPGA GPIO pins and an external board they decided to design themselves.

Page 41: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

SYSTEM ARCHITECTING

FPGA “A” RESOURCE MAPPING INTO CYCLONE V SOC SE FAMILY OFFERING

Cyclone V SoC Device Selection Options

Device Resources                              5CSEA2       5CSEA4       5CSEA5       5CSEA6
LEs (K)                                       25           40           85           110
Adaptive logic modules (ALMs)                 9,434        15,094       32,075       41,509
M10K memory blocks                            140          224          397          514
M10K memory (Kb)                              1,400        2,240        3,972        5,140
MLABs (Kb)                                    138          220          480          621
18-bit x 19-bit multipliers                   72           116          174          224
Variable-precision DSP blocks (1)             36           58           87           112
FPGA PLLs                                     4            5            6            6
HPS PLLs                                      3            3            3            3
Maximum FPGA user I/Os                        145          145          288          288
Maximum HPS I/Os                              188          188          188          188
FPGA hard memory controllers                  1            1            1            1
HPS hard memory controllers                   1            1            1            1
Processor cores (ARM Cortex™-A9 MPCore™)      Single/Dual  Single/Dual  Single/Dual  Single/Dual

Design Requirements

Device Resources                              FPGA ‘A’
LEs (K)                                       34
Adaptive logic modules (ALMs)                 12,600
M10K memory blocks                            90
M10K memory (Kb)                              900
MLABs (Kb)
18-bit x 19-bit multipliers                   74
Variable-precision DSP blocks (1)             37
FPGA PLLs                                     2
HPS PLLs                                      2
Maximum FPGA user I/Os                        140
Maximum HPS I/Os                              130
FPGA hard memory controllers                  1
HPS hard memory controllers                   1
Processor cores (ARM Cortex™-A9 MPCore™)      Single
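The resource mapping on this slide amounts to checking each FPGA ‘A’ requirement against each device column. A minimal sketch of that check follows, assuming a simple "every listed resource must fit" rule; the presentation does not state which device was actually selected, and the dictionary keys and function names are illustrative only. The MLAB row is left out because the requirement value is blank in the slide.

```python
# Resource figures copied from the slide's device table.
DEVICES = {
    "5CSEA2": {"les_k": 25, "alms": 9434, "m10k_blocks": 140, "multipliers": 72,
               "dsp_blocks": 36, "fpga_plls": 4, "fpga_ios": 145, "hps_ios": 188},
    "5CSEA4": {"les_k": 40, "alms": 15094, "m10k_blocks": 224, "multipliers": 116,
               "dsp_blocks": 58, "fpga_plls": 5, "fpga_ios": 145, "hps_ios": 188},
    "5CSEA5": {"les_k": 85, "alms": 32075, "m10k_blocks": 397, "multipliers": 174,
               "dsp_blocks": 87, "fpga_plls": 6, "fpga_ios": 288, "hps_ios": 188},
    "5CSEA6": {"les_k": 110, "alms": 41509, "m10k_blocks": 514, "multipliers": 224,
               "dsp_blocks": 112, "fpga_plls": 6, "fpga_ios": 288, "hps_ios": 188},
}

# FPGA 'A' design requirements from the same slide.
FPGA_A = {"les_k": 34, "alms": 12600, "m10k_blocks": 90, "multipliers": 74,
          "dsp_blocks": 37, "fpga_plls": 2, "fpga_ios": 140, "hps_ios": 130}


def shortfalls(device, required):
    """Names of the resources where the device offers less than required."""
    return sorted(k for k, need in required.items() if device[k] < need)


def smallest_fit(devices, required):
    """Smallest device (by logic elements) that satisfies every requirement."""
    for name in sorted(devices, key=lambda n: devices[n]["les_k"]):
        if not shortfalls(devices[name], required):
            return name
    return None


print(shortfalls(DEVICES["5CSEA2"], FPGA_A))
print(smallest_fit(DEVICES, FPGA_A))
```

A real selection would also weigh headroom for later feature growth, package, and cost, none of which this check models.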

Page 42: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

RISK MITIGATION

..FUNCTIONAL REQUIREMENT CHANGES AND ENHANCEMENTS.

Cyclone V SoC

Almost all aspects of risk mitigation that we highlighted earlier would be applicable to this project by utilizing a Cyclone V SoC based solution

Image Capture Rate optimization could have been handled more easily in an SoC based solution with the ARM and Linux interface

FPGA only

The client wanted to increase the Image Capture Rate transfer to the Host PC system but was limited by the initial HW interface selection.

Page 43: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

RISK MITIGATION

MORE EASILY ADAPTS TO HANDLE UNFORESEEN CHANGES IN EXTERNAL REQUIREMENTS OR EXPECTED SIGNALING BEHAVIOR

Cyclone V SoC

The GIGE Ethernet and Linux environment available in a Cyclone V SoC are a “slam dunk” solution.

FPGA only

The USB 2.0 did not work, and the client opted for a custom interface using an 8 bit wide FIFO GPIO with an internally developed custom board design

• This added undesirable delays for the client in the overall debugging and development before they were able to use our system as delivered

—The majority of the problem was attributed directly to signal integrity issues of the 3rd party custom board design

Page 44: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

SYSTEM INTEGRATION AND BRING-UP

FASTER AND LESS COSTLY WHEN FACTORING ELEMENTS SUCH AS USING EXTENSIVE CO-VERIFICATION REAL-TIME DEBUGGING OPTIONS

Cyclone V SoC

We can take advantage of the shared multi-port memory controller to insert and verify video frames

The next few slides use a brief “gut feel check” of data rates and throughputs

FPGA only

Elements of the Video Pipeline that were difficult to debug

• Capture Frame buffer

• Corner cases in some of the Video Pipeline Blocks

• Auto White Balancing

• Sensor Raw Data interface

Page 45: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

SYSTEM INTEGRATION AND BRING-UP

FPGA “A” PARTIAL BLOCK DIAGRAM: THE VIDEO PROCESSING PIPELINE

Page 46: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

SYSTEM INTEGRATION AND BRING-UP

Easier systematic debugging process through incremental and step-wise development

• Taking advantage of the ARM core’s computational power.

—The ARM is both powerful and fast enough to implement many of the sub-blocks of the Video Pipelining flow and manipulate the Pipeline Frame buffer

o Use reduced video rates when the system-wide solution utilizes software sub-modules

o FPGA sub-blocks are all designed to support per-module bypass

Frame buffer data rate by video resolution and pixel width

Video Res.    36 bit               32 bit               48 bit
              (12 bits / color)    (10 bits / color)    (16 bits / color)
1080p60       4.5 Gb/s             4.0 Gb/s             6.0 Gb/s
1080p50       3.7 Gb/s             3.4 Gb/s             5.0 Gb/s
1080p30       2.3 Gb/s             2.0 Gb/s             3.0 Gb/s
1080p25       1.9 Gb/s             1.7 Gb/s             2.5 Gb/s
1080p24       1.8 Gb/s             1.6 Gb/s             2.4 Gb/s
720p60        2.0 Gb/s             1.7 Gb/s             2.7 Gb/s
720p50        1.7 Gb/s             1.5 Gb/s             2.3 Gb/s
720p30        1.0 Gb/s             0.9 Gb/s             1.4 Gb/s
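The data rates in this table follow from pixels per frame times frames per second times bits per pixel. The sketch below reproduces that arithmetic under assumptions not stated on the slide: only active pixels are counted (1920x1080 and 1280x720, no blanking) and 1 Gb is taken as 10^9 bits. The slide's figures appear to be rounded to one decimal place, so computed values may differ from the table in the last digit. The function name is illustrative.

```python
# Active pixel dimensions assumed for each format (blanking not included).
FORMATS = {"1080p": (1920, 1080), "720p": (1280, 720)}


def frame_buffer_rate_gbps(fmt: str, fps: int, bits_per_pixel: int) -> float:
    """Raw frame-buffer data rate in Gb/s for one direction of traffic."""
    width, height = FORMATS[fmt]
    return width * height * fps * bits_per_pixel / 1e9


rate_1080p60_36 = frame_buffer_rate_gbps("1080p", 60, 36)
rate_720p30_32 = frame_buffer_rate_gbps("720p", 30, 32)

for fmt, fps in [("1080p", 60), ("1080p", 30), ("720p", 60), ("720p", 30)]:
    rates = [frame_buffer_rate_gbps(fmt, fps, bpp) for bpp in (36, 32, 48)]
    print(f"{fmt}{fps}: " + "  ".join(f"{r:.2f} Gb/s" for r in rates))
```

Note that a frame buffer that is both written and read back sees this rate in each direction.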

Page 47: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

SYSTEM INTEGRATION AND BRING-UP

The ARM Hard IP memory interface “Gut Feel Check” for manipulating and sharing the Pipeline Frame Buffer DDR3.

Showing an example of a 32 bit external interface to DDR3 and a sample memory size of 256 MB

The design can consider Ping Pong buffering schemes

Many decisions are deferred and effort is modular, depending on potential bring-up issues, risk mitigation aspects as well as feature demonstration requirements

The example below shows a full frame of video at the lowest rate being read and written to/from memory in around 6 ms, allowing a minimum of 26 ms of processing in a 30 fps requirement.

Note – a conservative DDR3 efficiency of 50% is shown whereas 75% is typical

Memory    Freq. (MHz)    Ext. Width (Bits)    Data Rate    Efficiency (50%)    Shared Access Rate (1/2)
DDR3      300            32                   19.2 Gb/s    9.6 Gb/s            4.8 Gb/s
DDR3      400            32                   25.6 Gb/s    12.8 Gb/s           6.4 Gb/s

Video Res.    Pixel Storage (Bits)    Memory Storage Req. (MB)    Sample Storage 256 MB (Frames)
1080p         48                      12.5                        20
1080p         32                      8.3                         31
720p          48                      5.6                         46
720p          32                      3.7                         69
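The "gut feel check" on this slide can be written out as a short calculation. This is a sketch under stated assumptions: DDR3 transfers data on both clock edges, 1 Gb is 10^9 bits, 1 MB is 10^6 bytes, and the 50% efficiency and 1/2 sharing factors are taken from the slide. The slide does not say which resolution and pixel width its "around 6 ms" figure refers to; a 720p frame at 32 bits per pixel at the 4.8 Gb/s shared rate is used here only because it lands near that value. All function names are illustrative.

```python
def ddr3_rate_gbps(clock_mhz: float, width_bits: int) -> float:
    """Peak DDR data rate: two transfers per clock across the bus width."""
    return clock_mhz * 1e6 * 2 * width_bits / 1e9


def shared_rate_gbps(peak_gbps: float, efficiency: float = 0.5,
                     sharers: int = 2) -> float:
    """Derate the peak for bus efficiency, then split between the sharers."""
    return peak_gbps * efficiency / sharers


def frame_size_mb(width: int, height: int, bits_per_pixel: int) -> float:
    """Storage for one frame in MB (10^6 bytes)."""
    return width * height * bits_per_pixel / 8 / 1e6


def transfer_ms(width: int, height: int, bits_per_pixel: int,
                rate_gbps: float) -> float:
    """Time to move one frame across the memory interface, in ms."""
    return width * height * bits_per_pixel / (rate_gbps * 1e9) * 1e3


peak = ddr3_rate_gbps(300, 32)            # 300 MHz, 32-bit interface
shared = shared_rate_gbps(peak)           # 50% efficiency, shared 1/2
frames_1080p48 = int(256 // frame_size_mb(1920, 1080, 48))
move_720p32_ms = transfer_ms(1280, 720, 32, shared)
frame_period_ms = 1000 / 30               # time budget per frame at 30 fps

print(f"peak {peak:.1f} Gb/s, shared {shared:.1f} Gb/s")
print(f"1080p/48-bit frames in 256 MB: {frames_1080p48}")
print(f"720p/32-bit frame move: {move_720p32_ms:.2f} ms "
      f"of a {frame_period_ms:.1f} ms frame period")
```

Changing the efficiency argument to 0.75 shows how much margin the slide's conservative 50% assumption leaves.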

Page 48: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

ACHIEVING CUSTOMER SATISFACTION THROUGH REQUIREMENTS UNDERSTANDING

ELABORATING VIA COHESIVE INITIAL DESIGN ENGAGEMENT PHASE AND ADDING CLARITY TO THE FEATURE REQUIREMENTS

Cyclone V SoC

The Cyclone V SoC solution introduced in this sample design would have been an ideal platform choice for this client. Additional and enhanced features could easily have been implemented within committed budgets and timeframes.

FPGA only

Nuvation is able to clearly establish and refine client requirements and align our engineering effort and deliverables during the IDE phase.

Some compromises were made early during the IDE phase for both cost and effort. Examples include initial support for HD-SDI and initial implementation of the auto white balance module. We were able to postpone decisions on these modules to the post-implementation stages of the design.

Page 49: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

SoC FPGA VS TRADITIONAL FPGA DESIGN FLOW: CASE STUDY CONCLUSIONS

Incorporating trade-offs and benefits from an in-depth understanding of the full system level design process

• Time to evaluate our SoC-based solution

—We will utilize the metrics directly from our project timetable and effort and make the necessary modifications where the SoC would impact the schedule

—We will utilize this information to directly quantify the impact of an SoC-based solution

—The following slides illustrate the evaluation process as it relates to the Design phases of the project

Page 50: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

SoC FPGA VS TRADITIONAL FPGA DESIGN FLOW: CASE STUDY CONCLUSIONS

Quantifying effort at all the various levels of the design process

• ARM estimates are compared to the NIOS II actuals columns, and actuals are used if the schedule is not directly impacted or the effect is minimal

• The estimates imply a potential reduction in both calendar weeks and engineering hours, respectively:

—12 w vs. 17 w = 70%

—1111 hrs. vs. 1711 hrs. = 65%

• Estimated reduction of 30% in calendar time and 35% in cost
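The percentages on this slide are the SoC estimate expressed as a share of the traditional-flow actuals, with the reduction being the remainder. A small sketch of that arithmetic, using only the figures quoted above (the function name is illustrative):

```python
def ratio_and_reduction(estimate: float, actual: float):
    """Return (estimate as % of actual, % reduction relative to actual)."""
    ratio = estimate / actual * 100
    return ratio, 100 - ratio


weeks_ratio, weeks_reduction = ratio_and_reduction(12, 17)
hours_ratio, hours_reduction = ratio_and_reduction(1111, 1711)

print(f"calendar: {weeks_ratio:.1f}% of actual, {weeks_reduction:.1f}% reduction")
print(f"hours:    {hours_ratio:.1f}% of actual, {hours_reduction:.1f}% reduction")
```

The results round to the 70%/30% and 65%/35% figures stated on the slide.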

Page 51: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

SoC FPGA VS TRADITIONAL FPGA DESIGN FLOW: CASE STUDY CONCLUSIONS

LEVERAGING ON EXISTING DESIGN EXPERIENCE AND AVAILABLE IP MINIMIZES AND ACHIEVES ESTIMATE METRICS FOR THE DELIVERABLE COSTS AND TIME-FRAMES

Cyclone V SoC: FPGA “A”

• Engineering hours effort reduction

—260 hrs. vs. 410 hrs. = 63.5% of the original effort

—Soft IP replaced by Hard IP

—Emulation modules reduction

Cyclone V SoC: Controller NIOS II replaced by embedded ARM

• Engineering hours effort reduction

—70 hrs. vs. 170 hrs. = 41% of the original effort

—using GIGE Ethernet

—using Linux environment for Development and Host PC interface

Page 52: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

CASE STUDY CONCLUSIONS: ON-TIME DELIVERABLES – FIRST TIME RIGHT

DESIGN FOR TESTABILITY AND RELIABILITY

Cyclone V SoC

An inherent part of Nuvation’s design philosophy. The Cyclone V SoC solution reduces engineering effort and calendar time for this particular project as it is the right fit for the desired end product.

FPGA only

Made use of development kits

Additional effort involved in FPGA modules implementing test modules

• Effort involved is higher using FPGA coding than C-based higher-level abstractions

Page 53: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

CLIENT HANDOFF

SIMPLIFYING AND ABSTRACTING THE MORE COMPLEX FINER GRANULAR COMPONENTS OF COMPLEX SYSTEM DESIGN ELEMENTS

Cyclone V SoC

A working system containing a Cyclone V SoC, with the initial low-level design details abstracted, will further enable end-users to exploit, at a much higher programming level, the numerous advantages a coherent FPGA/ARM processor has to offer.

FPGA only

Deliverables in conjunction with working prototypes allowed the client to use a GUI based interface on a Host PC to completely control the system. The details of the implementation are abstracted.

• New commands between the Host and the FPGA system were easily implemented by using the NIOS II code headers, recompiling the NIOS II code and updating the Flash on the system board

—again abstracting the FPGA details

Page 54: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

Questions?

Allan Dubeau, Principal Design Engineer, Nuvation Engineering

Todd Koelling, Sr. Marketing Manager, SoC Products, Altera

Stefan Rosinger, Product Manager, CPU, ARM

Page 55: Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study

[email protected]

888.669.0828

Silicon Valley Headquarters: 151 Gibraltar Court, Sunnyvale, CA 94089 USA

Waterloo Design Center: 332 Marsland Drive, Suite 200, Waterloo, ON N2J 3Z1 Canada