DSP VLSI Design
Overview of DSP Processors
Byungin Moon
Yonsei University
YONSEI UNIVERSITY, DSP VLSI Design
Overview of DSP Processors
Outline
Introduction to DSP processors’ features and architectures
  General microprocessor vs. DSP processor
  Exemplified by the SHARC® DSP
Basic features common to DSP processors
DSP processor embodiments
  Single-chip processors, multichip modules, multiple processors on a chip, chip sets, DSP cores, and multiprocessors
Alternatives to DSP processors
  As a chip
    GPPs, ASICs, ASSPs, and FPGAs
  As a system
    PC and workstation, PCB (Printed Circuit Board) using off-the-shelf components
General Microprocessors vs. Digital Signal Processors
Data manipulation
  Word processing and database management
  Rearranging stored data
Mathematical calculation
  Science, engineering, and digital signal processing
  Algorithms such as digital filtering and Fourier analysis
All microprocessors can perform both tasks; however, it is difficult (expensive) to make a device that is optimized for both
Digital signal processors are microprocessors designed to perform the mathematical calculations needed in digital signal processing
Two Tasks by Digital Computers
Source: DSPGuide
Data Manipulation
Main operations
  Storing and sorting information
For instance, consider a word processing program
  Basic tasks
    Store, organize (cut and paste, spell checking, page layout, etc.), and retrieve (saving the document or printing it) information
    Accomplished by moving data from one location to another, and testing for inequalities (A=B, A<B, etc.)
  Sorting a list of words into alphabetical order
    Test two adjacent entries for being in alphabetical order (IF A>B THEN)
    Switch them if the two entries are not in alphabetical order
While mathematics is occasionally used in this type of application, it is infrequent and does not significantly affect the overall execution speed
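The adjacent-entry test and switch described above can be sketched as follows. This is a minimal illustration, not code from the course; the function name `sort_words` is hypothetical. Note that the loop body consists only of comparisons and data moves, with no arithmetic on the data itself.

```python
def sort_words(words):
    """Sort by repeatedly testing adjacent entries and switching them
    when they are out of alphabetical order (a simple bubble sort)."""
    items = list(words)          # work on a copy; the input is not modified
    swapped = True
    while swapped:               # keep passing over the list until no switch occurs
        swapped = False
        for i in range(len(items) - 1):
            if items[i] > items[i + 1]:          # IF A>B THEN
                items[i], items[i + 1] = items[i + 1], items[i]  # switch them
                swapped = True
    return items


print(sort_words(["pear", "apple", "fig"]))
```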
DSP Algorithms
The execution speed of most DSP algorithms is limited almost completely by the number of multiplications and additions required
For instance, consider an FIR digital filter
  The task is to calculate sample n of the output signal by multiplying appropriate samples from the input signal by a group of coefficients and adding the products
  While there is some data transfer and inequality evaluation (such as keeping track of intermediate results and controlling the loops), the math operations dominate the execution time
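As a rough sketch of the FIR calculation described above, output sample n is the sum of products of the coefficients with the most recent input samples. The names `fir_output`, `x`, and `h` are assumptions for illustration, and samples before the start of the input are treated as zero.

```python
def fir_output(x, h, n):
    """Compute output sample n of an FIR filter.

    x: list of input samples, h: list of coefficients.
    Each loop iteration is one multiplication and one addition,
    so an output sample costs len(h) multiply-accumulate operations.
    """
    acc = 0.0
    for k in range(len(h)):
        if 0 <= n - k < len(x):      # samples outside the input count as zero
            acc += h[k] * x[n - k]   # multiply, then accumulate
    return acc


x = [1.0, 2.0, 3.0, 4.0]
h = [0.5, 0.5]                       # two-point moving average
print(fir_output(x, h, 2))           # averages x[2] and x[1]
```

The only non-arithmetic work in the loop is the index bookkeeping and the loop test, which is the overhead the rest of these slides set out to hide.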
FIR Digital Filter (in the Time Domain)
Source: DSPGuide
FIR Digital Filter (Operations)
Source: BDTi
Off-line vs. Real-time
Off-line processing
  The entire input signal resides in the computer at the same time
  The output is produced at a later time
  Seismometer, computed tomography, and MRI
  The realm of personal computers and mainframes
Real-time processing
  The output signal is produced at the same time that the input signal is being acquired
  Telephone communication, hearing aids, and radar
  These applications must have the information immediately available (although it can be delayed by a short amount)
  The world of digital signal processors
Consider an FIR Filter
Real-time applications input a sample (or group of samples), perform the algorithm, and output a sample (or group of samples), over and over
FIR filter with eight coefficients
  Must store the values of the eight most recent samples from the input signal (x[n], x[n-1], … x[n-7])
  Produce an output sample
    Eight iterations of multiplication and addition
  Store the output sample
  A new sample is acquired and stored
  Repeat the above three steps
What is the best way to manage these stored samples?
  Circular buffering
Circular Buffering for FIR Filters
Source: DSPGuide
Four Parameters Needed to Manage a Circular Buffer
First, a pointer indicating the start of the circular buffer in memory (20041)
Second, a pointer indicating the end of the array (20048), or a variable that holds its length (8)
Third, the step size of the memory addressing (data size)
  The above three values define the size and configuration of the circular buffer, and will not change during program operation
Fourth, a pointer to the most recent sample
  Must be modified as each new sample is acquired
While this logic is quite simple, it must be very fast
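The four parameters can be illustrated with a small software model. This is a sketch only: the class `CircularBuffer`, its method names, and the dictionary standing in for memory are assumptions made for illustration, while the addresses 20041 to 20048 are the ones used on the slide. On a DSP the pointer update and wrap-around are done by address-generation hardware rather than by code like this.

```python
class CircularBuffer:
    """Software model of a circular buffer managed by four parameters."""

    def __init__(self, start, length, step=1):
        self.start = start        # 1st: start of the buffer in memory
        self.length = length      # 2nd: length (equivalently, the end pointer)
        self.step = step          # 3rd: step size of the memory addressing
        self.newest = start       # 4th: pointer to the most recent sample
        # A dictionary keyed by address stands in for data memory.
        self.memory = {start + i * step: 0.0 for i in range(length)}

    @property
    def end(self):
        return self.start + (self.length - 1) * self.step

    def push(self, sample):
        """Advance the newest-sample pointer, wrapping at the end, then store."""
        nxt = self.newest + self.step
        if nxt > self.end:
            nxt = self.start      # wrap around to the start of the buffer
        self.newest = nxt
        self.memory[nxt] = sample

    def recent(self, k):
        """Return x[n-k], the sample acquired k steps before the newest one."""
        offset = (self.newest - self.start) // self.step
        addr = self.start + ((offset - k) % self.length) * self.step
        return self.memory[addr]


def fir_from_buffer(buf, h):
    """One FIR output sample: one multiply-accumulate per coefficient."""
    return sum(h[k] * buf.recent(k) for k in range(len(h)))


buf = CircularBuffer(start=20041, length=8)
for sample in range(1, 11):       # acquire ten samples: 1, 2, ..., 10
    buf.push(float(sample))
print(buf.newest, buf.recent(0), buf.recent(7))
```

Only `newest` changes as samples arrive; no stored sample is ever moved, which is why the scheme is attractive for real-time filtering.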
Steps Needed to Implement an FIR Filter Using Circular Buffers
Source: DSPGuide
Explanations for 14 Steps
The goal of DSP processors is to make these steps execute quickly
Steps 6-12
  Repeated many times
  Special attention must be given to these operations
Traditional microprocessors
  Carry out these 14 steps in serial
DSPs
  Designed to perform them in parallel
  In some cases, all of the operations within the loop (steps 6-12) can be computed in a single clock cycle
Now, let’s look at the internal architecture
Memory Architecture
One of the biggest bottlenecks in executing DSP algorithms is transferring information to and from memory
  Data such as samples from the input signal and the filter coefficients must be transferred, as well as program instructions
  Must fetch three binary values for one multiplication
    Two numbers to be multiplied, plus the program instruction describing what to do
Memory architectures for microprocessors
  Von Neumann architecture
  Harvard architecture
  Super Harvard architecture
Von Neumann Architecture and Harvard Architecture
Von Neumann architecture
  A single memory and a single bus for transferring data into and out of the CPU
  Memory architecture of early microprocessors and many low-cost microprocessors
  Slower than the other two architectures
    At least three cycles are needed to multiply two numbers
Harvard architecture
  Separate memories for data and program instructions, with separate buses for each
  Used in most present-day microprocessors (also DSPs)
  Faster than Von Neumann, but not satisfactory
    Two cycles are needed to transfer two numbers
Super Harvard Architecture
Coined by Analog Devices to describe the internal operation of their ADSP-2106x and ADSP-211xx families (SHARC® DSPs)
Relocating part of the data (e.g. filter coefficients) to program memory
  One value over the data memory bus, but two values over the program memory bus
  Conflicts between the program instructions and the coefficients
Instruction cache
  Small memory that contains about 32 of the most recent program instructions
  DSP algorithms generally spend most of their execution time in loops (steps 6-12)
  After the initial executions of the loop, the program instructions can be pulled from the instruction cache
  All three transfers can be accomplished in a single cycle
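The cycle counts quoted for the three memory architectures can be reproduced with a toy model. It assumes, as a simplification, that each path (bus or cache) delivers one value per cycle, so the busiest path sets the cost of one multiplication. The path names `bus`, `pm_bus`, `dm_bus`, and `cache` and the function name are hypothetical labels, not vendor terminology.

```python
# Transfers needed for one multiplication, as listed on the slides:
# the program instruction plus the two numbers to be multiplied.
TRANSFERS = ("instruction", "sample", "coefficient")


def cycles_per_multiply(routing):
    """Return the cycle count when each path moves one value per cycle.

    `routing` maps each transfer to the path that carries it; the most
    heavily loaded path determines the number of cycles.
    """
    load = {}
    for transfer in TRANSFERS:
        path = routing[transfer]
        load[path] = load.get(path, 0) + 1
    return max(load.values())


# Von Neumann: a single bus carries everything.
von_neumann = {"instruction": "bus", "sample": "bus", "coefficient": "bus"}
# Harvard: instruction on the program bus, both operands on the data bus.
harvard = {"instruction": "pm_bus", "sample": "dm_bus", "coefficient": "dm_bus"}
# Coefficients moved to program memory, no cache: the program bus is contended.
shared_pm = {"instruction": "pm_bus", "sample": "dm_bus", "coefficient": "pm_bus"}
# Inside a loop, the instruction comes from the instruction cache instead.
super_harvard = {"instruction": "cache", "sample": "dm_bus", "coefficient": "pm_bus"}

for name, routing in [("Von Neumann", von_neumann), ("Harvard", harvard),
                      ("Coefficients in PM, no cache", shared_pm),
                      ("Super Harvard (cache hit)", super_harvard)]:
    print(name, cycles_per_multiply(routing))
```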
Super Harvard Architecture
I/O controller connected to data memory
  Serial and parallel communication ports
    Extremely high-speed connections
  On-board or on-chip ADC and DAC
  DMA (Direct Memory Access)
    Allows data streams to be transferred directly into and out of memory without having to pass through the CPU’s registers
    Steps 1 & 14 on the list happen independently of, and simultaneously with, the other steps (no cycles are stolen from the CPU)
This type of high-speed I/O is a key characteristic of DSPs
Block Diagram of Memory Architectures
Source: DSPGuide
Simplified Diagram of the SHARC® DSP Architecture
Source: DSPGuide
Architecture of the SHARC DSP
Two DAGs (Data Address Generators)
  Control the addresses sent to the program and data memories
  Each DAG can control eight circular buffers
    Each DAG holds 32 variables (4 per buffer)
    Multiple-stage algorithms require multiple circular buffers for the fastest operation
  Bit-reversed addressing
    For the FFT algorithm
16 general-purpose registers of 40 bits each
  Hold intermediate calculations, prepare data for the math processor, serve as a buffer for data transfer, hold flags for program control, and so on
Extra hardware registers used to control loops and counters
Architecture of the SHARC DSP
Math processing
  A multiplier, an ALU (Arithmetic Logic Unit), and a barrel shifter
  The multiplier and the ALU can be accessed in parallel
  80-bit accumulator
    Reduces the round-off error associated with multiple fixed-point math operations
Shadow registers for all the CPU’s key registers
  Used for fast context switching, the ability to handle interrupts quickly
  An interrupt is handled by moving the internal data into the shadow registers in a single clock cycle
  Allow step 4 to be handled very quickly and efficiently
Steps 1-14 on the SHARC DSP
Steps 6-12 are carried out in a single clock cycle
  The SHARC DSP can perform a multiply (step 11), an addition (step 12), two data moves (steps 7 and 9), update two circular buffer pointers (steps 8 and 10), and control the loop (step 6) within a single clock cycle
Extra clock cycles are associated with beginning and ending the loop (steps 2, 3, 4, 5 and 13, plus moving initial values into place)
  Also handled very efficiently
  If the loop is executed more than a few times, the overhead for these steps will be negligible
As an example, consider an efficient FIR filter program using 100 coefficients
  It is expected to require about 105 to 110 clock cycles per sample to execute (i.e., 100 coefficient loops plus overhead). Really!!
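The arithmetic behind this estimate can be written out as a short calculation. It assumes one clock cycle per coefficient plus a fixed loop overhead, as stated above. The 40 MHz clock rate in the example is a hypothetical figure chosen only for illustration; the slides do not give one.

```python
def cycles_per_sample(taps, overhead_cycles):
    """One clock cycle per coefficient, plus loop set-up and tear-down."""
    return taps + overhead_cycles


def max_sample_rate_hz(clock_hz, taps, overhead_cycles):
    """Highest input sample rate the filter can sustain in real time."""
    return clock_hz / cycles_per_sample(taps, overhead_cycles)


# 100 coefficients with 5 to 10 cycles of overhead gives 105 to 110 cycles.
print(cycles_per_sample(100, 5), cycles_per_sample(100, 10))

# With an assumed 40 MHz clock (hypothetical), the sustainable sample rate
# at 110 cycles per sample is 40e6 / 110, roughly 364 kHz.
print(max_sample_rate_hz(40e6, 100, 10))
```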
Basic Features Common to DSP Processors
Fast multiply-accumulate (MAC)
  Most DSP algorithms, including filtering, transforms, etc., are multiplication-intensive
  A multiplier and accumulator integrated into the main arithmetic processing unit
  Extra bits in their accumulator registers
    Accommodate growth of the accumulated result without the possibility of arithmetic overflow
Multiple-access memory architecture
  Many data-intensive DSP operations require reading a program instruction and multiple data items during each cycle for best performance
  Multiple on-chip buses, multiported on-chip memories, and multiple independent memory banks
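To see why extra accumulator bits matter for the multiply-accumulate operation, the following sketch applies the standard worst-case bound: summing N two's-complement products of p bits each needs p + ceil(log2 N) bits to rule out overflow. The function name is hypothetical, and treating this bound as the sizing rule is an assumption of the sketch, not something the slides state.

```python
def accumulator_bits(product_bits, num_terms):
    """Bits needed to sum `num_terms` signed products of `product_bits`
    bits each with no possibility of overflow (worst-case bound)."""
    guard_bits = (num_terms - 1).bit_length()   # equals ceil(log2(num_terms))
    return product_bits + guard_bits


def worst_case_fits(product_bits, num_terms, acc_bits):
    """Check the bound directly using the most negative product."""
    most_negative = -(1 << (product_bits - 1))
    total = most_negative * num_terms
    return -(1 << (acc_bits - 1)) <= total


# 16-bit operands give 32-bit products; 8 guard bits cover 256 accumulations.
print(accumulator_bits(32, 256))
# 32-bit operands give 64-bit products; 16 guard bits cover 65536 accumulations.
print(accumulator_bits(64, 65536))
```

Under this bound, an 80-bit accumulator holding 64-bit products leaves 16 guard bits, enough for 65536 worst-case accumulations.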
Basic Features Common to DSP Processors
Specialized addressing modes
  Efficient handling of data arrays and FIFO buffers in memory
Dedicated address generation units
  Once the appropriate addressing registers have been configured, the address generation units operate in the background, forming the addresses required for operand accesses in parallel with the execution of arithmetic instructions
  Allow arithmetic processing to proceed at maximum speed
  Allow specification of multiple operands in a small instruction word
Addressing modes tailored to DSP applications
  Register-indirect addressing with post-increment
    Used in situations where a repetitive computation is performed on a series of data stored sequentially in memory
  Circular or modulo addressing
    Simplifies the use of data buffers
  Bit-reversed addressing for the FFT algorithm
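Bit-reversed addressing can be illustrated in software as follows. A radix-2 FFT reads or writes its data in an order where each index has its binary digits reversed; a DSP's address generator produces that sequence in hardware, whereas the function below (a hypothetical helper named `bit_reverse`) computes it explicitly.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` binary digits of `index`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)   # move the lowest bit of index
        index >>= 1                            # onto the low end of result
    return result


# Access order for an 8-point FFT (3 address bits).
order = [bit_reverse(i, 3) for i in range(8)]
print(order)   # [0, 4, 2, 6, 1, 5, 3, 7]
```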
Basic Features Common to DSP Processors
Specialized execution control
  Efficient control of loops for many iterative DSP algorithms
    A special loop or repeat instruction is provided that allows the programmer to implement a for-next loop without expending any instruction cycles for updating and testing the loop counter or jumping back to the top of the loop
  Fast interrupt handling for frequent I/O operations
    Fast context switching and low-latency, low-overhead interrupts for fast input/output
    E.g. shadow registers in the SHARC DSP
Basic Features Common to DSP Processors
Peripherals and input/output interfaces
  I/O interfaces tailored for common peripherals
    One or more serial or parallel I/O interfaces
    Allow low-cost, high-performance, clean interfaces to off-chip I/O devices
  Specialized I/O handling mechanisms such as direct memory access (DMA)
    Data transfers from I/O to memory or from memory to I/O can be carried out independently of, and simultaneously with, other CPU tasks, without wasting CPU cycles
  On-chip peripherals
    Allow for small, low-cost system designs
    E.g. ADC and DAC
DSP Processors (Excluding DSP Cores) Used as Examples in the Course Textbook
Source: DSPFundamentals
DSP Processor Embodiments
DSP processors can now be found in many different forms, sometimes masquerading as something else entirely
Types of DSP processor embodiments
  Single-chip processors
  Multichip modules
  Multiple processors on a chip
  Chip sets
  DSP cores
  Multiprocessors
Multichip Modules & Multiple Processors on a Chip
Multichip modules (MCM)
  Combine multiple, bare (unpackaged) die into a single package
  Higher packaging density
    More circuits per unit area of printed circuit board (PCB)
  Increased operating speed and reduced power dissipation
  An MCM from Texas Instruments (TI)
    Includes two TMS320C40 processors and 128 Kwords of 32-bit SRAM
Multiple processors on a chip
  Combine multiple processors on a single IC
  Increased performance and reduced power consumption
  Motorola and Zilog offer parts that combine a DSP and a microprocessor or microcontroller on a single chip
Chip Sets
Divide the DSP into two or more separate packages
Make sense if the processor is very complex and if the number of I/O pins is very large
Advantages
  Allow the use of much less expensive IC packages
  Added flexibility
    Allow the system designer to combine the individual ICs in the configuration best suited for the application
  Potential of providing more I/O pins than individual chips
Butterfly DSP formerly sold by Sharp Microelectronics
  Consists of the LH9320 address generator and the LH9124 processor
  Multiple address generator chips can be used in conjunction with one processor chip
DSP Cores
A DSP processor intended for use as a building block in creating a chip, as opposed to being packaged by itself as an off-the-shelf chip
Combine the benefits of a DSP processor (programmability, development tools, software libraries) with the benefits of custom circuits (low production cost, small size, and low power consumption)
Classifications of DSP cores
  DSP core-based application-specific integrated circuits & customizable DSP processors
  Foundry-captive & licensable
Example Integrating a DSP Core (TI OMAP5910)
Source: BDTi
DSP Core-based ASIC Customized for Different Applications
Source: DSPFundamentals
Currently Available DSP Cores
Source: DSPFundamentals
DSP Core-based ASIC
ASIC that incorporates a DSP core as one element of the overall chip
  The system designer integrates a programmable DSP, interface logic, peripherals, memory, and other custom elements onto a single IC
  The DSP core is not modified
Definitions of DSP cores depend on their vendors
  TI’s DSP core includes not only the processor, but also memory and peripherals
  Clarkspur Design’s and DSP Group’s cores include memory but not peripherals
  SGS-Thomson’s core includes only the processor and no peripherals or memory
Customizable DSP Processors
DSP core that a designer is allowed to modify or extend (as opposed to adding external circuitry surrounding the core)
  Certain features selectable by the chip designer (e.g., a 2nd MAC unit, cache)
  Data path, other features often customizable
  Synthesizable HDL description generated
  Software tools automatically customized
Foundry-captive
  The customer may not have access to the internal design of the core
Licensable
  The designers may not be sufficiently expert in the core’s internal design
Strengths
  DSP characteristics mean that customization can yield huge gains
    Speed, energy efficiency, cost/performance
  Can use any foundry
Customizable DSP Processors
Weaknesses
  Requires a very large investment
    Must design own chip
  Unproven technology
  Uncertain company/technology roadmaps
  Requires that corresponding changes be made to the core’s software development tools (assembler, linker, simulator, and so on)
AT&T’s DSP1600
  Permits easy attachment of new execution units to its data path
  Software development tools designed to facilitate the addition of support for new execution units into the tools
Philips’ EPICS core
  Versions with different word widths
Foundry-captive DSP Cores
The vendor of the core is also the provider of foundry services used to fabricate the chip containing the core
TI’s TEC320C52 for lower-volume designs
  16-bit fixed-point DSP with a gate array
  Custom circuitry to surround the core is implemented in the gate array portion of the chip (typically using logic synthesis tools)
  Enjoys advantages in the area of design cost, production cost, and manufacturing lead time
TI’s TMS320C1x, TMS320C2x, and so on, for high-volume designs
  Core macrocells can be surrounded by full-custom layouts, standard cells, gate arrays, etc.
SGS-Thomson’s D950-CORE as a macrocell
  Application-specific hardware designed by the customer is crafted from standard cells or in a gate array
Licensable DSP Cores
The core vendor licenses the core design to the customer, who is then responsible for selecting an appropriate foundry
  The customer receives a complete design description of the DSP core
  Fabricated as part of an ASIC using the IC foundry of the customer’s choice
Growing importance of SoCs
Growing cost of in-house processor architectures
The form of licensable cores
  Optimized full-custom layout compatible with the fabrication processes of a particular foundry
  Synthesizable HDL design descriptions
Multiprocessors
The needs of a large class of important applications cannot be met by a single processor
If programmability is important, a multiprocessor based on commercial DSPs may be an effective solution
Features of DSPs that are especially well suited to multiprocessor systems
  Multiple external buses and bus-sharing logic
  Multiple, dedicated parallel ports designed for interprocessor communication that simplify hardware design and improve performance
TI’s TMS320C4x, Motorola’s DSP96002, and Analog Devices’ ADSP-2106x
Alternatives to Commercial DSP Processors
There are many alternatives to DSP processors available to the designer contemplating a new application
Key alternatives
  As a chip
    GPPs (General-Purpose Processors) and DSP-enhanced GPPs
      Embedded CPU
      PC CPU
    ASICs (Application-Specific Integrated Circuits)
    ASSPs (Application-Specific Standard Products)
    FPGAs (Field-Programmable Gate Arrays)
  As a system
    PC and workstation
    PCB (Printed Circuit Board) using off-the-shelf components
GPPs
As general-purpose microprocessors become faster they are increasingly able to handle some less-demanding DSP applications
Many electronic products (from telephones to automotive engine controllers) are currently designed using general-purpose microprocessors for control, user interface, and communication functions
Software development tools for GPPs are generally much more sophisticated and powerful than those for DSP processors
When the ease of development is a critical consideration, this can be an important factor
GPPs without DSP enhancements are generally slower and less efficient than DSP processors for DSP applications
DSP-enhanced GPPs (Hybrids)
General-purpose microprocessors with DSP capabilities
Need for GPPs with DSP enhancements
  Many products contain both a DSP and a GPP
    Eliminating one can reduce cost
  System designers often must choose between a GPP and a DSP, although both functions are needed
Today many GPPs have strong DSP capabilities
  Motorola/IBM PowerPC 601, MIPS R10000, Sun UltraSPARC, and HP PA-7100LC
    Floating-point multiply-accumulate in one instruction cycle
    Special instructions aimed at multimedia applications
Embedded CPUs
Can have strong DSP performance
  SH3-DSP and ARM11
Dynamic features complicate programming
  Complicate optimization and ensuring real-time behavior
Good tools, but generally lacking DSP support
32-bit GPPs are better targets for non-DSP tasks
  TCP/IP network stacks
Very good 3rd-party non-DSP software component support
Compatibility more common
High-integration parts increasingly common
Example Embedded CPU
Source: BDTi
PC (High-performance) CPUs
High-performance GPPs can implement demanding DSP tasks
  Pentium 4 and PowerPC 7xxx
May be as fast as or faster than DSPs
Cost & power consumption may be higher
Dynamic features complicate optimization and real-time design
Many options for OS, 3rd-party application software
Development tools mature, powerful
  But typically lack DSP-oriented features
Example PC CPU
Source: BDTi
ASICs
Chips designed for a specific end product or group of end products
Designed by the system developer
“ASIC” does not imply an architecture
  Traditionally DSP ASICs have used hard-wired logic with varied architectures
  Sometimes with proprietary processor cores
Increasing licensed IP content
  Processor cores, accelerators
  On-chip peripherals, I/O interfaces
  Buses
Plus dedicated, custom logic
ASICs
Strengths
  Offers the ultimate in tailored hardware
    Speed, energy efficiency, cost/performance
    Integration to match the product requirements
  Vast architectural options
Weaknesses
  Large development costs and risks vs. off-the-shelf hardware; mask-making costs increasing
    Iteration is costly and time-consuming
  Lengthy development cycles
    Hardware/software integration and whole-chip verification are particularly challenging
    Hardware/software partitioning typically must be done early
  Inflexibility
    Long, costly development precludes frequent design changes
  Complex, costly, unreliable tools
Example of ASIC Using Customizable DSP Processor (Tensilica Xtensa)
Source: BDTi
ASSPs
Off-the-shelf, fixed-function chips specialized for an application
Similar to an ASIC in design
  Many architecture possibilities
  May contain one or more processor cores
    Which may or may not be user-programmable
  May be a SoC with memory, peripherals, special I/O, etc.
  … or a building block, like a video decoder
Similar to off-the-shelf processors in business model
ASSPs
Strengths
  Often very well matched to the application
    SoCs with extensive integration
    Architecture tuned for the application
  Ease of use
    Reduce system development costs
    Reduce required implementation expertise
Weaknesses
  Often inflexible
  Limited differentiation opportunities for system designer
  Usually single-source
  Roadmap often unclear
Example ASSP (Micronas MDE9500)
Source: BDTi
Alternative Systems to DSPs (Superfluous Page)
PC and workstation
  Some DSP applications may be directly implemented on PCs or workstations that are not equipped with DSP processors
    Software-only DSP-based products are appropriate for undemanding applications
    Non-real-time applications such as scientific and engineering applications
  Have a tendency to add more DSP capabilities
    More opportunities for software-only DSP products
Custom boards
  Consist of off-the-shelf components such as standard logic devices, fixed-function or configurable arithmetic units, FPGAs, and ASICs (including DSP processors containing custom, mask-programmed software in ROM, and proprietary processors)
FPGAs
An amorphous “sea” of reconfigurable logic with reconfigurable interconnect
Possibly interspersed with fixed-logic resources, e.g., processors, multipliers
Potential for very high parallelism
Historically used for prototyping and “glue logic”, but becoming more sophisticated
  DSP-oriented architecture features
  DSP-oriented tools and design libraries
    Viterbi, Turbo, and Reed-Solomon coders and decoders, FIR filters, FFTs, …
Key DSP players: Altera and Xilinx
FPGAs
Strengths
  Massive performance gains on some algorithms
  Architectural flexibility can yield efficiency
    Adjust data widths throughout algorithm
    Parallelism where you need it; distributed storage
  Re-use hardware for diverse tasks
Weaknesses
  Slow time-to-market compared to DSPs
  Cumbersome design flow that’s unfamiliar to most DSP engineers
  Suitability for single-channel, low-power, cost-sensitive DSP applications unclear
Example FPGA (Altera Stratix)
Source: BDTi
Summary of Alternatives to DSPs
DSPs face growing competition from many directions
  GPPs, ASICs, ASSPs, FPGAs…
Software, not hardware, is often the key
  Performance advantage for DSPs over GPPs and FPGAs is diminishing
  As application complexity increases, development costs and effort shift to software
  Cutting-edge compilers and other tools are critical
There is no ideal processor
  The best processor depends on the application
  Heterogeneous solutions will become more common