Salvetat Richard · Barrel Shifter Sixteen 16-bit Math registers / Eight 32-bit Math Registers Two...

The World Leader in High-Performance Signal Processing Solutions. Developing with Digital Signal Processor. Salvetat Richard, Captronic – INSAVALOR seminar, 10 March 2011.

Transcript of Salvetat Richard · Barrel Shifter Sixteen 16-bit Math registers / Eight 32-bit Math Registers Two...

Page 1: Salvetat Richard · Barrel Shifter Sixteen 16-bit Math registers / Eight 32-bit Math Registers Two DAGs, byte addressing Eight 32-bit pointer registers Four Sets of 32-bit Index,

The World Leader in High-Performance Signal Processing Solutions

Developing with Digital Signal Processor

Salvetat Richard

Captronic – INSAVALOR seminar

10 March 2011

Page 2:

DSP training course, March 2011, Lyon

Agenda

Introduction
Section 1: DSP “Demystification”
Section 2: DSP Core Architecture Overview
Section 3: DSP Core Example: Blackfin BF53x
Section 4: Memory Manager / DMA
Section 5: Program Sequencer / Interrupt Manager
Section 6: Embedded I/O
Section 7: Software Environment
Section 8: C Compiler, Rules for Optimal Software
Conclusion
Practical exercises

Page 3:

Section 1: DSP “Demystification”

Page 4:

Digital sampling of an analogue signal

So what problem has to be solved?

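[Figure: signal chain. Analogue input (amplitude A against time t, frequency fa) → BPF or LPF → N-bit ADC (sampling rate fs) → DSP → N-bit DAC (rate fs) → BPF or LPF]

Most DSP algorithms can be expressed with MAC:

$$Y = \sum_{i=1}^{\mathrm{count}} a_i \cdot x_i$$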

Page 5:

Typical DSP Algorithms

Complex Fourier Transform

$$X_m = \sum_{n=0}^{N-1} x_n \cdot e^{-j 2\pi n m / N}$$

Infinite Impulse Response Filters (IIR)

$$y_n = \sum_{k=0}^{N} a_k \cdot x_{n-k} + \sum_{k=1}^{N} b_k \cdot y_{n-k}$$

Convolution

$$y_n = \sum_{k=0}^{N_k + N_n - 2} h_k \cdot x_{n-k}$$

Page 6:

Mathematical Primitive

Basic Problem of most DSP Algorithms

Multiply and Accumulate (MAC) Operation

y := 0;
FOR n := 0 TO N-1 DO
    y := y + a[n]·x[n];

$$y = \sum_{n=0}^{N-1} a_n \cdot x_n$$
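The MAC loop above maps directly onto C. The following is a minimal sketch rather than DSP-optimised code; the function name `mac` and the use of `double` operands are assumptions made for illustration only.

```c
#include <stddef.h>

/* Multiply-and-accumulate: y = sum over n of a[n] * x[n].
   The name `mac` and the double type are illustrative assumptions. */
double mac(const double *a, const double *x, size_t count)
{
    double y = 0.0;                     /* accumulator starts at zero */
    for (size_t n = 0; n < count; n++) {
        y += a[n] * x[n];               /* one multiply and one add per term */
    }
    return y;
}
```

Each loop iteration needs two data reads and one multiply-add, which is why DSP cores are built to fetch two operands and execute a MAC in a single cycle.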

Page 7:

The Harvard Architecture

Harvard Architecture: Simultaneous Access of Data and Instruction

Modified Harvard Architecture: Single-cycle Access of 2 Pieces of Data and 1 Instruction (Three-Bus Performance)

[Figure: DSP µProcessor connected through separate address and data buses to Data Storage (DM) and Program Storage (PM)]

Page 8:

What are the characteristics of a good DSP?

• Fast, flexible arithmetic computation units
• Unconstrained data flow to and from the computation units
• Extended precision and dynamic range in the computation units
• Dual address generators
• Efficient program sequencing
• Ease of programming
• Efficient I/O processing

Page 9:

Floating vs. Fixed point processors

Applications which require:
• High precision
• Wide dynamic range
• High signal-to-noise ratio
• Ease of use
need a floating point processor.

Drawbacks of floating point processors:
• Higher power consumption
• Can be more expensive
• Can be slower than fixed-point counterparts and larger in size

Page 10:

General Purpose DSP vs. DSP in ASIC

Application Specific Integrated Circuits (ASICs) are semiconductors designed for dedicated functions. The advantages and disadvantages of using ASICs are listed below:

Advantages
• High throughput
• Lower silicon area
• Lower power consumption
• Improved reliability
• Reduction in system noise
• Low overall system cost

Disadvantages
• High investment cost
• Less flexibility
• Long time from design to market

Page 11:

Section 2: DSP Core Architecture Overview

Page 12:

Different Needs? Multiple Families

[Figure: processor families plotted by performance against power]

Families shown:
• TigerSHARC: High Performance
• SHARC: Low Cost, Floating Point
• ADSP-21xx: Power efficient, Fixed Point
• Blackfin: Media enabled, Fixed Point

Application groups shown:
• Wired Voice, Wireless Voice, VOIP/VON, Industrial Control
• Image compression, 3G Terminals, Digital Still/Video Camera, MMOIP, Biometrics
• 2.5G/3G Infrastructure, Medical Imaging, Industrial Imaging, Multiprocessing
• Audio, Infotainment, Industrial

Page 13:

Configurable Memory System

• Supports a Cache Memory Model and an SRAM Memory Model
• Sustained Dual Data Accesses for DSP Applications
• Supports accesses of 8-, 16-, and 32-bit Data
• Separate Multi-ported L1 Instruction and Data Memories

[Figure: Processor Core connected to L1 Instruction SRAM & Cache, L1 Data SRAM & Cache, Scratchpad SRAM, DMA, and L2 Instruction & Data SRAM]

Page 14:

The Blackfin Core – Microcontroller and DSP?

[Figure: Blackfin core block diagram. Address Arithmetic Unit (DAG0, DAG1; I0-I3, L0-L3, B0-B3, M0-M3; P0-P5, FP, SP), Data Arithmetic Unit (R0-R7, multipliers, ALUs, video ALUs, barrel shifter, 40-bit accumulators Acc0 and Acc1), Sequencer]

• Two 16-bit Multipliers
• Two 32-bit ALUs
• Four 8-bit Video ALUs
• Barrel Shifter
• Sixteen 16-bit Math registers / Eight 32-bit Math Registers

• Two DAGs, byte addressing
• Eight 32-bit pointer registers
• Four Sets of 32-bit Index, Modify, Length, Base

• 16-bit Instructions
• 32-bit Instructions
• Multi-Issue, 64-bit Instructions

• Interlocked Pipeline

Page 15:

Great Performance Value
• Highest Frequency (600 MHz)
• Highest MMAC/$
• Lowest mW/MMAC of any conventional DSP (0.15 mW/MMAC)

High System Integration
• Video I/O connects directly to ITU-R 656 encoders and decoders
• SPORTs support 8 Channels of I2S Audio
• Core Voltage Regulator
• Microcontroller features include WDT, RTC, SDRAM controller

[Figure: ADSP-BF531/BF532/BF533 Enhanced Blackfin Processors block diagram, with groups labelled System Peripherals, User Peripherals, and Memory]
• Blackfin Processor Core: up to 600 MHz
• Memory blocks: up to 80 KBytes PM, 32 KBytes PM ROM, up to 64 KBytes DM, 4 KBytes
• Other blocks shown: SDRAM and FLASH/SRAM Interfaces, RTC, Watchdog, JTAG, Enhanced DMA, PLL, Dynamic Power Management, Switching Regulator, SPI (1), UART (1), Timers (3), GPIO (16), Enhanced SPORTs (2), PPI Video I/O

Page 16:

Registers and Data Types

Page 17:

32-bit Fixed-Point Formats

Signed Integer (bit 31 is the sign bit)
Bit:      31      30     29     ···  2      1      0
Weight:   -2^31   2^30   2^29   ···  2^2    2^1    2^0
Examples: -3 = 0xfffffffd, 3 = 0x00000003

Unsigned Integer
Bit:      31      30     29     ···  2      1      0
Weight:   2^31    2^30   2^29   ···  2^2    2^1    2^0
Examples: 2^31 = 0x80000000, 3 = 0x00000003

Signed Fractional (bit 31 is the sign bit)
Bit:      31      30     29     ···  2      1      0
Weight:   -2^0    2^-1   2^-2   ···  2^-29  2^-30  2^-31
Examples: .25 = 0x20000000, .75 = 0x60000000

Unsigned Fractional
Bit:      31      30     29     ···  2      1      0
Weight:   2^-1    2^-2   2^-3   ···  2^-30  2^-31  2^-32
Examples: .25 = 0x40000000, .75 = 0xc0000000
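The fractional examples can be reproduced by scaling a real value by the weight of the least significant bit. The sketch below is illustrative only: the helper names `to_q31`, `to_uq32`, and `from_q31` are assumptions, the conversion truncates instead of rounding, and the caller is assumed to keep the input inside the representable range.

```c
#include <stdint.h>

/* Signed fractional (1.31): value in [-1, 1), scaled by 2^31.
   Hypothetical helper; no range check, truncation toward zero. */
int32_t to_q31(double v)
{
    return (int32_t)(v * 2147483648.0);
}

/* Unsigned fractional (0.32): value in [0, 1), scaled by 2^32. */
uint32_t to_uq32(double v)
{
    return (uint32_t)(v * 4294967296.0);
}

/* Back from signed fractional to a real value. */
double from_q31(int32_t q)
{
    return (double)q / 2147483648.0;
}
```

With these helpers, 0.25 becomes 0x20000000 in the signed format and 0x40000000 in the unsigned format, matching the examples on the slide.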

Page 18:

Floating-Point Formats

32-bit format (bits 39..8): sign s, exponent e7..e0, fraction f22..f0 (1.f with hidden bit)
40-bit format (bits 39..0): sign s, exponent e7..e0, fraction f30..f8 f7..f0 (1.f with hidden bit)

Type      Exponent      Fraction  Value                     Example     Representation
Normal    1 ≤ e ≤ 254   any       (-1)^s (1.f) 2^(e-127)    0x3f800000  1.0
Zero      0             0         zero                      0x00000000  0.0
Infinity  255           0         infinity                  0x7f800000  1.#INF
NAN       255           non-zero  undefined                 0x7f800001  1.#NAN
Denormal  0             any       zero                      0x00000001  1.#IND
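The sign, exponent, and fraction fields of the 32-bit format can be inspected on a host machine. The sketch below assumes that the C `float` type is an IEEE 754 single-precision value, which holds on most platforms but is not guaranteed by the C standard; the type and function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    unsigned sign;       /* s: 1 bit */
    unsigned exponent;   /* e7..e0, biased by 127 */
    uint32_t fraction;   /* f22..f0, hidden leading 1 not stored */
} float_fields;

/* Split a 32-bit float into its three fields (assumes IEEE 754 binary32). */
float_fields decode_float(float value)
{
    uint32_t bits;
    float_fields f;
    memcpy(&bits, &value, sizeof bits);   /* reinterpret without aliasing issues */
    f.sign = bits >> 31;
    f.exponent = (bits >> 23) & 0xFFu;
    f.fraction = bits & 0x7FFFFFu;
    return f;
}
```

For 1.0 (0x3f800000) the fields are sign 0, exponent 127, fraction 0, so the value is (1.0) · 2^(127-127).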

Page 19:

Integer/Fractional Fixed Point Multiplication

• Integer Multiplier Fixed-Point Result Placement
• Fractional Multiplier Fixed-Point Result Placement

[Figure: for each mode, placement of the multiplier result in the 80-bit MRF or MRB register (MR2, MR1, MR0; bit positions 79, 63, 31, 0; fields labelled OVERFLOW and INTEGER RESULT or FRACTIONAL RESULT, with the binary point marked) and in the register file (a ureg holding 32 bits of result plus 8 bits of ZEROS). In register file placement the integer OVERFLOW is lost and the fractional UNDERFLOW is lost. Both diagrams carry the label "MV set".]

Page 20:

Section 3: DSP Core Example

Page 21:

Accessing Registers

There are two ways to access registers on the ADSP-2153x:

• A majority of registers are memory mapped (MMRs) and must be accessed indirectly
  • Core MMRs are used to configure the core registers. They are listed in Appendix A of the HRM
  • System MMRs are used to configure all other peripherals. They are listed in Appendix B of the HRM
  • The addresses of the core and system MMRs are part of the def21533.h and defblkfin.h header files
  • MMRs can only be accessed in Supervisor mode
• The remaining registers are accessed directly, by name

Page 22:

System Registers

[Figure: register overview. The legend marks some registers as "Supervisor mode access only".]

• LC0, LT0, LB0 and LC1, LT1, LB1: Loop Counter, Loop Top, Loop Bottom
• ASTAT: Arithmetic Status
• RETS: Subroutine Return
• RETI: Interrupt Return
• RETX: Exception Return
• RETN: NMI Return
• RETE: Emulation Return
• SYSCFG: System Config
• SEQSTAT: Sequencer Status

Address Registers (32 bits each, bits 31..0)
• Index Registers: I0-I3, with L0-L3, B0-B3, M0-M3
• Pointer Registers: P0-P5, plus FP, SP, USP

Data Registers
• Data Registers: R0-R7, each addressable as halves Rn.H (bits 31..16) and Rn.L (bits 15..0)
• Accumulators A0 and A1, each split into an extension (A0X, A1X), a high half (A0.H, A1.H), and a low half (A0.L, A1.L)

Page 23:

Micro Signal Architecture Core

[Figure: Blackfin core block diagram. Address Arithmetic Unit (DAG0, DAG1; I0-I3, L0-L3, B0-B3, M0-M3; P0-P5, FP, SP), Data Arithmetic Unit (R0-R7, multipliers, ALUs, video ALUs, barrel shifter, 40-bit accumulators Acc0 and Acc1), Sequencer]

Blackfin DSP Core based on the Micro Signal Architecture, Jointly Developed With Intel Corporation

• Two 16-bit Multipliers
• Two 32/40-bit ALUs
• Four 8-bit Video ALUs
• Barrel Shifter
• Sixteen 16-bit Math registers / Eight 32-bit Math Registers

• Two DAGs, byte addressing
• Eight 32-bit pointer registers
• Four Sets of 32-bit Index, Modify, Length, Base

• 16-bit Instructions
• 32-bit Instructions
• Multi-Issue, 64-bit Instructions

Page 24:

Arithmetic Logic Unit (ALU)

[Figure: Data Arithmetic Unit block diagram: data registers R0-R7, 40-bit accumulators A0 and A1, two 16-bit multipliers, four 8-bit video ALUs, barrel shifter]

Page 25:

Arithmetic Logic Unit (ALU)

Two ALUs operating on 16-bit, 32-bit, and 40-bit input operands and producing 16-bit, 32-bit, and 40-bit results.

Functions
• Fixed-point addition and subtraction
• Addition and subtraction of immediate values
• Accumulation and subtraction of multiplier results
• Logical AND, OR, NOT, XOR, bitwise XOR, Negate
• Functions: ABS, MAX, MIN, Round, division primitives

Features
• Supports conditional instructions
• 8-bit video ALU operations

Page 26:

ALU Operations: Single 16-Bit Operations

Single 16-bit Addition, Subtraction Operations
• Any two 16-bit register halves may be used as inputs.
• One 16-bit result is deposited in the designated 16-bit register half.

General Form:
Dreg_lo_hi = Dreg_lo_hi + Dreg_lo_hi;

Example:
R6.H = R3.H + R2.L;

[Figure: single 16-bit addition. R3.H (bits 31..16 of R3) + R2.L (bits 15..0 of R2) → R6.H (bits 31..16 of R6)]

Page 27:

ALU Operations: Dual 16-Bit Operations

Dual 16-bit Addition, Subtraction Operations
• Any two 32-bit registers may be used as inputs.
• Two 16-bit results are deposited in the designated 32-bit register.

General Form:
Dreg = Dreg +|+ Dreg;
Dreg = Dreg -|- Dreg;
Dreg = Dreg +|- Dreg;
Dreg = Dreg -|+ Dreg;

Example:
R6 = R2 +|- R3;

[Figure: dual 16-bit addition. R6.H = R2.H + R3.H and R6.L = R2.L - R3.L]
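The effect of the `+|-` form can be emulated in C to see that the two halves are computed independently, with no carry or borrow crossing bit 16. This is a sketch under the assumption of plain wrap-around arithmetic (no saturation option); the function name `dual_add_sub` is hypothetical.

```c
#include <stdint.h>

/* Emulates Dreg = Dreg +|- Dreg: add the high halves, subtract the low halves.
   Assumes wrap-around (non-saturating) 16-bit arithmetic. */
uint32_t dual_add_sub(uint32_t a, uint32_t b)
{
    uint16_t hi = (uint16_t)((a >> 16) + (b >> 16));          /* upper halves added */
    uint16_t lo = (uint16_t)((a & 0xFFFFu) - (b & 0xFFFFu));  /* lower halves subtracted */
    return ((uint32_t)hi << 16) | lo;
}
```

For example, `dual_add_sub(0x00030005, 0x00010002)` yields 0x00040003: the upper halves give 3 + 1 and the lower halves give 5 - 2.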

Page 28:

ALU Operations: Quad 16-Bit Operations

Quad 16-bit Addition, Subtraction Operations
• Any two 32-bit registers may be used as inputs.
• Four 16-bit results are deposited in two designated 32-bit registers.

General Form:
Dreg = Dreg +|+ Dreg, Dreg = Dreg -|- Dreg;
Dreg = Dreg +|- Dreg, Dreg = Dreg -|+ Dreg;

Example:
R3 = R0 +|+ R1, R2 = R0 -|- R1;

[Figure: quad 16-bit addition. R3 receives the two half-word sums and R2 the two half-word differences of R0 and R1]

Page 29:

ALU Operations: Single 32-Bit Operations

Single 32-bit Addition, Subtraction Operations
• Any two 32-bit registers may be used as inputs.
• One 32-bit result is deposited in the designated 32-bit register.

General Form:
Dreg = Dreg + Dreg;
Dreg = Dreg - Dreg;

Example:
R6 = R2 + R3;

[Figure: 32-bit addition, R2 + R3 → R6]

Page 30:

ALU Operations: Dual 32-Bit Operations

Dual 32-bit Addition, Subtraction Operations
• Any two 32-bit registers may be used as inputs.
• Two 32-bit results are deposited in the two designated 32-bit registers.

General Form:
Dreg = Dreg + Dreg, Dreg = Dreg - Dreg;

Example:
R3 = R1 + R2, R4 = R1 - R2;

[Figure: dual 32-bit operation. R3 = R1 + R2 and R4 = R1 - R2]

Page 31:

Other ALU Operations

Rounding Instructions
Dreg_lo_hi = Dreg (RND);
e.g. R1.L = R5 (RND);

Pointer Register Example Instructions
P5 = P3 + P0;   // add two 32-bit pointer registers
P5 += -4;       // add immediate value to P register

Video ALU Instructions
4 Adds or Subtracts with 8-bit inputs

Page 32:

32-bit ALU Logical Operations

AND
General Form: Dreg = Dreg & Dreg;
Example: R4 = R4 & R3;

NOT
General Form: Dreg = ~Dreg;
Example: R3 = ~R4;

OR
General Form: Dreg = Dreg | Dreg;
Example: R4 = R4 | R3;

XOR
General Form: Dreg = Dreg ^ Dreg;
Example: R4 = R4 ^ R3;

Page 33:

ASTAT - Register

Arithmetic status is held in the ASTAT register:

• AZ: Zero Result
• AN: Negative Result
• AC0: ALU0 Carry
• AC1: ALU1 Carry
• AV0: A0 Overflow
• AV1: A1 Overflow
• CC: Condition Code
• AQ: Quotient Bit
• RND_MOD: Rounding Mode

Page 34:

Condition Code (CC) Bit in ASTAT

The CC bit is used in several instructions. The action taken by the instruction depends on the value of CC:

If CC jump here;   // if CC = 1, jump to label "here"
If CC R3 = R0;     // perform the move if CC = 1

The CC bit value is based on a comparison of two registers, pointers or accumulators:

CC = R3 == R2;
CC = R3 < R2;

The CC bit can be moved to and from a data register or ASTAT bit:

CC = R1;
R1 = CC;
CC = AZ;

Page 35:

Multiply-Accumulators (MAC)

[Figure: Data Arithmetic Unit block diagram: data registers R0-R7, 40-bit accumulators A0 and A1, two 16-bit multipliers, four 8-bit video ALUs, barrel shifter]

Page 36:

Multiply-Accumulators (MAC)

Two identical MACs
• Each can perform fixed-point multiplication and multiply-and-accumulate operations on 16-bit fixed-point input data, and outputs 32-bit or 40-bit results depending on the destination.

Functions
• Multiplication
• Multiply-and-accumulate with addition (optional rounding)
• Multiply-and-accumulate with subtraction (optional rounding)
• Dual versions of the above

Features
• Saturation of accumulator results
• Optional rounding of multiplier results

Page 37:

Multiplication Modes: Fractional Mode

Mode 1: fractional mode
• Multiplier assumes all numbers are in 1.15 format
• Multiplier automatically shifts the product 1 bit left before accumulation (result forced to 1.31 format)
• Example: A0 = R0.L * R1.L;

[Figure: R0.L = 0x4000 (= 0.5) multiplied by R1.L = 0x4000 (= 0.5) gives A0 = 0x00 2000 0000 (= 0.25), with A0.X = 0x00, A0.H = 0x2000, A0.L = 0x0000. The 16-bit result is taken from A0.H; A0.X is labelled overflow and A0.L underflow.]

Page 38:

Multiplication Modes: Integer Mode

Mode 2: integer mode
• Multiplier assumes all numbers are in 16.0 format
• No automatic left shift necessary
• Example: A0 = R0.L * R1.L (IS);

[Figure: R0.L = 0x4000 (= 2^14) multiplied by R1.L = 0x4000 (= 2^14) gives A0 = 0x00 1000 0000 (= 2^28). The 16-bit result taken from A0.L is 0x0000; A0.X and A0.H are labelled overflow.]
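The difference between the two multiplier modes comes down to the one-bit left shift. The sketch below models both for a 32-bit result; the function names are hypothetical, and saturating the single overflowing fractional case (-1 × -1) to 0x7FFFFFFF is an assumption about a 32-bit destination, not something the slides state.

```c
#include <stdint.h>

/* Signed fractional multiply: 1.15 x 1.15 -> 1.31, including the 1-bit left shift. */
int32_t mult_fractional(int16_t a, int16_t b)
{
    if (a == INT16_MIN && b == INT16_MIN) {
        return INT32_MAX;                        /* assumed saturation of -1 x -1 */
    }
    int32_t product = (int32_t)a * (int32_t)b;   /* 2.30 format, redundant sign bit */
    return product * 2;                          /* shift left by one bit: 1.31 */
}

/* Signed integer multiply: 16.0 x 16.0 -> 32.0, no shift. */
int32_t mult_integer(int16_t a, int16_t b)
{
    return (int32_t)a * (int32_t)b;
}
```

With both operands equal to 0x4000, `mult_fractional` returns 0x20000000 (0.5 × 0.5 = 0.25 in 1.31) and `mult_integer` returns 0x10000000 (2^14 × 2^14 = 2^28), the same values as on the two slides.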

Page 39:

Multiply Operations Example Instructions

Example input operand combinations. An accumulator, data register or half-register can be the destination:

A0 = R2.L * R3.L;
R0 = R2.L * R3.H;
R0.H = R2.H * R3.L;
A1 = R2.H * R3.H;

Default data type is signed fractional. Options:
• (FU) fractional unsigned
• (IS) integer signed
• (IU) integer unsigned

Page 40:

MAC Operations Example Instructions

Example input operand combinations:

A0 += R2.L * R3.L;
A0 -= R2.L * R3.H;
A1 += R2.H * R3.L;
A1 += R2.H * R3.H;

Page 41:

Multiply and MAC Operations

When the Result is Transferred From the Accumulator to a 16-bit Data Register:

R4.L = (A0 += R2.L * R3.L);
R4.H = (A1 += R2.L * R3.L);

Page 42:

Multiply and MAC Operations

When the Result is Transferred From the Accumulator to a 32-bit Data Register:

R0 = (A0 += R2.L * R3.H);
R1 = (A1 += R2.L * R3.H);

• When A0 is used, the destination must be an even Data Register, e.g. R0, R2, R4, R6
• When A1 is used, the destination must be an odd Data Register, e.g. R1, R3, R5, R7
• In both cases, the accumulate can be removed or replaced by a subtraction

Page 43:

Dual Multiply Operations Example

Both multipliers can be used in the same operation to double the throughput. The same two 32-bit input registers must be used.

A1 = R2.H * R3.H, A0 = R2.L * R3.L;
R0 = R2.H * R3.H, R1 = R2.L * R3.L;

32-bit Data Register destinations must be used in pairs, e.g. R0:R1 or R2:R3 or R4:R5 or R6:R7

Page 44:

Dual MAC Operations Example

Both MACs can be used in the same operation to double the MAC throughput. The same two 32-bit input registers must be used (R2 and R3 in this example).

A1 -= R2.H * R3.H, A0 += R2.L * R3.L;

In both cases, the accumulate and subtraction are interchangeable

Page 45:

32 Bit Multiplication Example

This instruction needs 5 CCLK cycles and works only in integer mode

R2 *= R3;

Page 46:

Barrel-Shifter (Shifter)

[Figure: Data Arithmetic Unit block diagram: data registers R0-R7, 40-bit accumulators A0 and A1, two 16-bit multipliers, four 8-bit video ALUs, barrel shifter]

Page 47:

Barrel-Shifter (Shifter)

The shifter performs bitwise shifting for 16-bit, 32-bit or 40-bit inputs and yields 16-bit, 32-bit, or 40-bit outputs.

Functions
• Arithmetic Shift: The Arithmetic Shift instruction shifts a registered number a specified distance and direction while preserving the sign of the original number. The sign bit value back-fills the left-most bit positions vacated by the arithmetic right shift.
• Logical Shift: The Logical Shift instruction logically shifts a registered number a specified distance and direction. Logical shifts discard any bits shifted out of the register and backfill vacated bits with zeros.
• Rotate: The Rotate instruction rotates a registered number through the CC bit a specified distance and direction.
• Bit Operations
• Field Extract and Deposit
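The difference between the arithmetic and logical right shifts described above can be shown in portable C. This is a sketch for 32-bit values only; the function names are hypothetical, and the shift distance is assumed to lie between 0 and 31.

```c
#include <stdint.h>

/* Logical right shift: vacated bits are filled with zeros. */
uint32_t logical_shift_right(uint32_t value, unsigned distance)
{
    return value >> distance;            /* distance assumed to be 0..31 */
}

/* Arithmetic right shift: vacated bits are filled with copies of the sign bit. */
uint32_t arithmetic_shift_right(uint32_t value, unsigned distance)
{
    uint32_t shifted = value >> distance;
    if ((value & UINT32_C(0x80000000)) && distance > 0) {
        shifted |= ~(UINT32_C(0xFFFFFFFF) >> distance);   /* replicate the sign bit */
    }
    return shifted;
}
```

Shifting 0x80000000 right by 4 gives 0x08000000 logically and 0xF8000000 arithmetically, so only the arithmetic shift keeps a negative number negative.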

Page 48:

Arithmetic Shift Example

Immediate Shift Magnitude
R3.L = R0.H >>> 7;   /* arithmetic right shift, half word */
R5 = R2 << 24 (S);   /* arithmetic left shift */

Registered Shift Magnitude
R3.L = ashift R0.H by R7.L;   /* arithmetic shift, half word */
A0 = ashift A0 by R7.L;       /* arithmetic shift, accumulator */

Page 49:

Logical Shift Example

Pointer shift, fixed magnitude
P3 = P2 >> 1;   /* pointer right shift by 1 */
P0 = P1 << 2;   /* pointer left shift by 2 */

Data shift, immediate shift magnitude
R3.L = R0.L >> 4;   /* data right shift, half word register */
R3 = R0 << 12;      /* data left shift, 32-bit word */
A0 = A0 << 7;       /* accumulator left shift */

Data shift, registered shift magnitude
R3.H = lshift R0.L by R2.L;   /* logical shift, half word register */
A1 = lshift A1 by R7.L;       /* logical shift, accumulator */

Page 50:

Rotate Example Instruction

Immediate Rotate Magnitude
R4 = rot R1 by 8;    /* rotate left by 8 */
A0 = rot A0 by -5;   /* rotate right by 5 */

Registered Rotate Magnitude
R4 = rot R1 by R2.L;   /* rotate by value in R2.L */
A1 = rot A1 by R7.L;   /* rotate by value in R7.L */

Each bit that rotates out of the register is stored in the CC bit.
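Rotating through the CC bit means the register and CC together behave like a 33-bit ring. The following sketch models a left rotate of a 32-bit register one bit at a time; the struct and function names are hypothetical, and the model is meant to illustrate the principle, not to reproduce the hardware cycle for cycle.

```c
#include <stdint.h>

typedef struct {
    uint32_t reg;   /* 32-bit register contents */
    unsigned cc;    /* CC bit, 0 or 1 */
} rot_state;

/* Rotate left through CC by `count` positions, one bit per iteration. */
rot_state rotate_left_through_cc(rot_state s, unsigned count)
{
    while (count-- > 0) {
        unsigned out = (s.reg >> 31) & 1u;   /* bit leaving the register */
        s.reg = (s.reg << 1) | s.cc;         /* previous CC enters at bit 0 */
        s.cc = out;                          /* CC keeps the bit that left */
    }
    return s;
}
```

Because 33 bits take part, rotating by 33 positions returns both the register and CC to their starting values.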

Page 51:

Bit Operations Example Instructions

Bit Clear: BITCLR(R2, 3);

Bit Set: BITSET(R2, 7);

Bit Toggle: BITTGL(R2, 24);

Bit Test: CC = BITTST (R7, 15);

Bit Test: CC = !BITTST (R3, 0);
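For readers more familiar with C, the bit instructions above correspond to the usual mask operations. The helper names below are hypothetical, and the bit number is assumed to be in the range 0 to 31.

```c
#include <stdint.h>

/* C equivalents of BITCLR, BITSET, BITTGL and BITTST on a 32-bit value. */
uint32_t bit_clear(uint32_t r, unsigned bit)  { return r & ~(UINT32_C(1) << bit); }
uint32_t bit_set(uint32_t r, unsigned bit)    { return r | (UINT32_C(1) << bit); }
uint32_t bit_toggle(uint32_t r, unsigned bit) { return r ^ (UINT32_C(1) << bit); }
int      bit_test(uint32_t r, unsigned bit)   { return (int)((r >> bit) & 1u); }
```

The result of `bit_test` plays the role of CC: `CC = BITTST (R7, 15)` corresponds to `bit_test(r7, 15)`, and the `!BITTST` form to its logical negation.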

Page 52:

Field Extract and Deposit Example

Bit Field Extraction
R7 = extract (R4, R3.L) (z);   // zero-extended
R7 = extract (R4, R3.L) (x);   // sign-extended

Bit Field Deposit
R7 = deposit (R4, R3);       // zero-extended
R7 = deposit (R4, R3) (x);   // sign-extended

Page 53:

Section 4: Memory Management / DMA

Page 54:

Address Registers

[Figure: address registers, each 32 bits wide (bits 31..0): I0-I3, L0-L3, B0-B3, M0-M3, P0-P5, FP, SP, USP]

• One set of general purpose pointer registers
  • P0-P5, SP and FP
• One set of DSP addressing index registers
  • I0-I3, B0-B3, L0-L3, M0-M3
• All addresses are byte addresses

SP points to the supervisor stack in Supervisor mode and to the user stack in User mode.
USP is accessible in Supervisor mode only. It allows access to the user stack location while in Supervisor mode.

Page 55:

Data Types

8-bit bytes
• signed or unsigned integers

16-bit half-words (little endian)
• signed or unsigned integers
• signed fractional (1.15)

32-bit words (little endian)
• signed or unsigned integers
• signed fractional (1.31)

Page 56:

Addressing Methods

Indirect Addressing
• Index Registers (32-bit and 16-bit accesses)
• Pointer Registers P0 - P5 (32-bit, 16-bit, and 8-bit accesses)
• Stack and Frame Pointer Registers (32-bit accesses)

Types of address modify
• Post-Modify
• Pre-Modify using Stack Pointer
• Pre-Modify without update

Circular buffering/modulo addressing
• Base registers for circular buffer wrap-arounds

Bit Reversal
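Circular buffering can be pictured as a post-modify operation followed by a wrap check against the base and length of the buffer. The sketch below models that behaviour in C using the I, L, B, and M register roles; the struct layout and function name are hypothetical, and the modify value is assumed to be no larger in magnitude than the buffer length.

```c
#include <stdint.h>

typedef struct {
    uint32_t index;   /* I: current byte address */
    uint32_t length;  /* L: buffer length in bytes */
    uint32_t base;    /* B: buffer start address */
} circ_regs;

/* Return the address used for this access, then post-modify `index` by
   `modify` (M) bytes, wrapping inside [base, base + length). */
uint32_t circ_post_modify(circ_regs *r, int32_t modify)
{
    uint32_t used = r->index;
    int64_t next = (int64_t)r->index + modify;
    if (next >= (int64_t)r->base + r->length) {
        next -= r->length;                 /* ran past the end: wrap to the start */
    } else if (next < (int64_t)r->base) {
        next += r->length;                 /* ran before the start: wrap to the end */
    }
    r->index = (uint32_t)next;
    return used;
}
```

With a 16-byte buffer at 0x1000 and the index at 0x100C, a modify of +4 wraps the index back to 0x1000 without any explicit comparison in the application loop.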

Page 57:

Post-Modify and Pre-Modify Operations

Post-Modify Instructions

32-bit accesses
R0 = [P0++];       /* Increments the value of P0 by 4 after the read */
R0 = [P1 ++ P2];   /* Increments P1 by P2 after reading a 32-bit word from P1 */

16-bit accesses
R0 = W[I0--];      /* Decrements the value of I0 by 2 after the read */
R0 = W[I2++M2];    /* Increments the value of I2 by M2 after reading a 16-bit word from I2 only */

8-bit accesses
R0 = B[P0++];      /* Increments the value of P0 by 1 after the read */
R2 = B[P4 ++ P5];  /* Increments P4 by P5 after reading an 8-bit byte from P4 only */

Analogous store instructions exist.

The only pre-modify instruction with update supported uses the Stack Pointer:
[--SP] = R0;       /* Decrements the current value in SP by 4, then writes the value in R0 to the updated address in SP */

Stack Instructions

Push instruction: [--SP] = src_reg;
• The push instruction stores the contents of a specified register or registers on the stack
• The instruction first pre-decrements the stack pointer to the next available location on the stack
• The push-multiple instruction places multiple registers on the stack with a single instruction

Pop instruction: dest_reg = [SP++];
• The pop instruction loads the contents of the stack location indexed by the current stack pointer into a specified register
• The instruction post-increments the stack pointer to the next occupied location on the stack before concluding
• The pop-multiple instruction pops multiple registers from the stack with a single instruction
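The push and pop forms map directly onto the C idioms `*--sp = x` and `x = *sp++`. The following sketch models a full-descending stack of 32-bit words; `stack_model` and its functions are hypothetical names, and bounds checking is left out to keep the correspondence with the two instructions visible.

```c
#include <stdint.h>

#define STACK_WORDS 16

/* Hypothetical software model of a full-descending stack of 32-bit words. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    uint32_t *sp;                   /* points at the last occupied word */
} stack_model;

void stack_init(stack_model *s)
{
    s->sp = s->mem + STACK_WORDS;   /* empty: one past the highest word */
}

/* [--SP] = src_reg;   pre-decrement SP, then store */
void stack_push(stack_model *s, uint32_t src_reg)
{
    *--s->sp = src_reg;
}

/* dest_reg = [SP++];  load from SP, then post-increment */
uint32_t stack_pop(stack_model *s)
{
    return *s->sp++;
}
```

Values come back in the reverse order of the pushes, and SP returns to its starting value once every push has been matched by a pop.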

[Figure: memory map. Internal memory: 32K bytes instruction ROM, 32K bytes instruction SRAM, 16K bytes instruction SRAM/cache, 32K bytes data SRAM/cache, 4K bytes scratchpad SRAM. External memory, through the External Memory Interface (16): four 1M-byte asynchronous banks and 16M to 128M bytes of synchronous memory.]

BF531 Memory Levels

Internal L1 memory
• Closest to the processor
• Can be configured as cache or SRAM
• Smallest memory capacity (32KB instruction, 20KB data)
• Single-cycle access

External L2 memory
• Off chip
• Largest memory capacity (synchronous and asynchronous)
• Slowest access time

Configurable Memory

As processor speeds increase (300 MHz - 1 GHz), it becomes increasingly difficult to have large memories running at full speed.

Two methods can be used to fill the L1 memory - Caching and Dynamic Downloading - Blackfin Supports Both.

Micro-controllers have typically used the caching method, as they have large programs often residing in external memory and determinism is not as important.

DSPs have typically used Dynamic Downloading as they need direct control over which code runs in the fastest memory.

Blackfin allows the programmer to choose one or both methods to optimize system performance.

BF531 Internal SRAM

[Figure: processor connected to its L1 memory blocks]

L1 Data Bank A SRAM/Cache (8KB)    0xFF80 0000 - 0xFF80 3FFF
L1 Data Bank B SRAM/Cache (8KB)    0xFF90 0000 - 0xFF90 3FFF
L1 Code SRAM/Cache (32KB)          0xFFA0 0000 - 0xFFA0 3FFF
L1 Scratch SRAM (4KB)              0xFFB0 0000 - 0xFFB0 0FFF

Configurable Memory System

• Supports a cache memory model and an SRAM memory model
• Sustained dual data accesses for DSP applications
• Supports accesses of 8-, 16-, and 32-bit data
• Separate multi-ported L1 instruction and data memories

[Figure: processor core with L1 instruction SRAM & cache, L1 data SRAM & cache, and scratchpad SRAM; DMA; L2 instruction & data SRAM]

Why Do Blackfin DSPs Have Cache?

• To allow users to take advantage of single-cycle memory without having to move instructions and/or data "manually"
• On-chip and off-chip L2 memory can be used to hold large programs and data sets
• The paths to and from L1 memory are optimized to perform with cache enabled
• Automatically optimizes code that reuses recently used or nearby data

Memory hierarchy:
• Internal L1 memory: smallest capacity, single-cycle access
• Internal L2 memory: larger capacity, higher latency
• External L2 memory: largest capacity, highest latency

L1 Instruction Memory

16 KB cache
• 4-way set-associative with arbitrary locking of ways
• LRU replacement
• No DMA access when configured as cache

16 KB SRAM
• Four 4KB single-ported sub-banks
• Allows simultaneous DMA access to different banks

[Figure: four 4KB sub-banks with instruction, DMA, and fill ports]

L1 Data Memory

Two 16KB banks

Each bank can be cache or SRAM

4KB scratch SRAM (stack can be located here for fast context switching)

[Figure: 16KB Bank A with DMA A and Fill A ports, 16KB Bank B with DMA B and Fill B ports, and the 4KB SRAM, connected to the Data 0 and Data 1 buses]

16KB Super-bank Data Architecture

Four 4KB single-ported sub-banks

Multi-ported data access when using different sub-banks

[Figure: four 4KB sub-banks with Data 0, Data 1, DMA, and fill ports]

When used as cache
• Each bank is 2-way set-associative
• No DMA access

When used as SRAM
• Dual data access
• DMA access

Cache Hits and Misses

A cache hit occurs when the address for an instruction fetch request from the core matches a valid entry in the cache.

When a cache hit occurs, the target 64-bit instruction word is sent to the instruction alignment unit where it is stored in one of two 64-bit instruction buffers.

When a cache miss occurs, the instruction memory unit generates a cache line-fill access to retrieve the missing cache line from internal or external L2 memory and pass it to the core.
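The hit/miss decision can be sketched as a generic 4-way set-associative lookup. In the C model below, the 32-byte line size, the way the address is split into set and tag, and all names (`cache_model`, `cache_hit`, `cache_fill`) are assumptions for illustration; the real Blackfin tag layout and LRU logic are not reproduced.

```c
#include <stdbool.h>
#include <stdint.h>

#define LINE_BYTES  32u                     /* assumed line size */
#define NUM_WAYS    4u
#define CACHE_BYTES (16u * 1024u)
#define NUM_SETS    (CACHE_BYTES / (NUM_WAYS * LINE_BYTES))   /* 128 sets */

typedef struct {
    bool     valid;
    uint32_t tag;
} cache_line;

typedef struct {
    cache_line line[NUM_SETS][NUM_WAYS];
} cache_model;

static uint32_t set_of(uint32_t addr) { return (addr / LINE_BYTES) % NUM_SETS; }
static uint32_t tag_of(uint32_t addr) { return addr / (LINE_BYTES * NUM_SETS); }

/* Hit: some way of the selected set is valid and holds the same tag. */
bool cache_hit(const cache_model *c, uint32_t addr)
{
    for (uint32_t w = 0; w < NUM_WAYS; w++) {
        const cache_line *l = &c->line[set_of(addr)][w];
        if (l->valid && l->tag == tag_of(addr))
            return true;
    }
    return false;
}

/* Miss handling: model the line fill by installing the tag in a given way.
   Choosing the way (for example by LRU) is left out of this sketch. */
void cache_fill(cache_model *c, uint32_t addr, uint32_t way)
{
    c->line[set_of(addr)][way].valid = true;
    c->line[set_of(addr)][way].tag = tag_of(addr);
}
```

After a fill, every address inside the same line hits, while an address with the same set index but a different tag still misses.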

Cacheability Protection Lookaside Buffers (CPLBs)

Divide the entire Blackfin memory map into regions that have cacheability and protection properties.

• 16 pages in instruction memory plus 16 pages in data memory
• Page sizes: 1KB, 4KB, 1MB, 4MB
• Each CPLB has 2 associated registers:
  − 32-bit start address: ICPLB_ADDRx, DCPLB_ADDRx
  − Cache/protection properties: ICPLB_DATAx, DCPLB_DATAx

[Figure: memory divided into pages CPLB0, CPLB1, and CPLB2, with addresses 0x0000, 0x1000, and 0x1400 marked; each page is described by its ICPLB_ADDRn and ICPLB_DATAn registers (n = 0, 1, 2)]

Using CPLBs

• Cache enabled: CPLBs must be used to define cacheability properties
• Cache disabled: CPLBs can be used to protect pages of memory

For both cases, a valid CPLB must exist before an access to a specific memory location is attempted. Otherwise, an exception will be generated.

User and Supervisor mode protection is available without using CPLBs.
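The rule that a valid CPLB must cover an address before it is accessed can be pictured as a table lookup. The C sketch below is a software analogy only: `cplb_entry` and `cplb_lookup` are hypothetical names, and the real bit layout of the ICPLB_DATAx / DCPLB_DATAx registers is not modeled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software view of one CPLB entry. */
typedef struct {
    uint32_t start;        /* page start address (the ADDR register) */
    uint32_t page_bytes;   /* 1KB, 4KB, 1MB or 4MB */
    bool     valid;
    bool     cacheable;
} cplb_entry;

/* Return the valid entry whose page covers addr, or NULL when there is
   none, which is the case where the hardware would raise an exception. */
const cplb_entry *cplb_lookup(const cplb_entry *table, size_t n, uint32_t addr)
{
    for (size_t i = 0; i < n; i++) {
        /* Unsigned subtraction wraps for addr < start, so one compare
           checks both ends of the page. */
        if (table[i].valid && addr - table[i].start < table[i].page_bytes)
            return &table[i];
    }
    return NULL;
}
```

A handler for the resulting exception could use the same table to decide whether to install a new page descriptor or to report a protection violation.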

Example Protection Operation

• Set up CPLBs to define regions and properties, including CPLBs to protect system registers and MMRs and to assign exception-handling memory.
• Disable all memory other than the desired memory space.
• Execute code.

If code tries to access memory that has been disabled or protected, then a memory protection violation occurs as an exception.

Example CPLB Setup

[Figure: two columns, "Instruction CPLB setup" and "Data CPLB setup", built from the page entries below]

• Async: non-cacheable, one 4MB page
• SDRAM: cacheable, eight 4MB pages
• L1 Instruction: non-cacheable, 1MB page
• L2 Memory: cacheable, 1MB page
• PCI: non-cacheable, two 4MB pages
• Async: non-cacheable, one 4MB page
• SDRAM: cacheable, eight 4MB pages
• MMRs: non-cacheable, 4MB page
• L1 Data: non-cacheable, one 4MB page
• Scratchpad: non-cacheable, 4KB page
• L2 Memory: cacheable, 1MB page
• Async: cacheable, two 4MB pages
• Async: cacheable, one 4MB page
• Reset area (e.g. 0xEF000000): 4MB page

Direct Memory Access (DMA)

• Transfers data within memory or between memory and a peripheral
• Core-independent 8-bit, 16-bit, or 32-bit transfers
• Support for separate data and error interrupts
• Support for two-dimensional (2D) DMA
  − Interrupt at end of row or buffer
• Descriptor-based DMA
  − Set of registers stored within memory
  − Flexible descriptor structure
  − Can chain together multiple DMA sequences
• Register-based DMA
  − Directly program DMA control registers
  − On completion, the original values can automatically be updated for continuous transfer

12 Prioritized DMA Channels

• Default DMA channel priorities exist (PPI is highest)
• These priorities can be reprogrammed based on the application

DMA Channel             Default Peripheral Mapping    Reprogrammable
0 (highest priority)    PPI                           YES
1                       SPORT0 RX                     YES
2                       SPORT0 TX                     YES
3                       SPORT1 RX                     YES
4                       SPORT1 TX                     YES
5                       SPI                           YES
6                       UART RX                       YES
7                       UART TX                       YES
8                       MemDMA Stream 0 TX            NO
9                       MemDMA Stream 0 RX            NO
10                      MemDMA Stream 1 TX            NO
11 (lowest priority)    MemDMA Stream 1 RX            NO

ADSP-BF531/2/3 DMA Performance Increases

• DMA controller write-back to memory descriptor not required
  − Reduces core intervention
• Autobuffer available in all channels
  − Increases performance and eliminates core interaction
• Descriptor load time GREATLY reduced
• Memory DMA throughput increased
  − Write from source to destination as soon as it is read
• Additional memory DMA channels
  − Reduces core intervention
• Prioritized DMA channels
  − Default DMA channel priorities will exist (PPI is highest)
  − Customer will be able to re-program these priorities based on application (similar to the way interrupts can be re-programmed)

ADSP-BF531/2/3 DMA Engine

• Two descriptor-based methods
  − "Linked-list" model (similar to ADSP-BF535)
  − "Array of descriptors" model (descriptors follow one another until the array ends)
• Register-based
  − Autobuffer (similar to ADSP-BF535)
• Interrupts
  − Separate levels for DMA completion and DMA error
  − Allows separate ISRs for each

ADSP-BF531/2/3 Descriptor Enhancements

• Flexible size allows up to 9 elements within each descriptor
  − Customer can use only what is needed
• In addition to the ADSP-BF535 source, destination, and number-of-transfers model, the ADSP-BF531/2/3 DMA controller supports:
  − One-dimensional strides (e.g. skip every n samples)
  − Two-dimensional transfers (Xcount, Xmodify, Ycount, Ymodify)

Scalable Descriptors of the ADSP-BF531/2/3

• DMA descriptors are variable-sized data structures whose contents are loaded into DMA registers.

• The sequence of registers in the descriptor is fixed, while the length of the descriptor is programmable.

• The user may select a descriptor size from one register (Base Address Low) to nine registers (all the DMA parameters).

• The three variations depend on what kind of Next Descriptor Pointer is included:
  − None (called "Descriptor Array mode")
  − The low 16 bits of the Next Descriptor pointer (Descriptor List, Small Model)
  − All 32 bits of the Next Descriptor pointer (Descriptor List, Large Model)

ADSP-BF531/2/3 Descriptor Definition

• NextDscPtrLow: Link pointer to next descriptor.
• NextDscPtrHigh
• BaseAddrLow: Base address of current buffer.
• BaseAddrHigh
• DMA Configuration Register
• Xcount: Inner loop count.
• Xmodify: Inner loop address increment, in bytes. (This is a signed, 2's complement value.)
• Ycount: Outer loop count (2D only).
• Ymodify: Outer loop address increment, in bytes. (This is a signed, 2's complement value.)

Examples of ADSP-BF531/2/3 “Flex DMA”

• A single, linear buffer that stops upon completion
• A linear buffer with stride greater than 1
• A circular, auto-refreshing buffer that interrupts on each full buffer
• A similar buffer that interrupts on fractional buffers (e.g. 1/2, 1/4)
• 1-D DMA using a set of identical ping-pong buffers defined by a linked ring of 3-word descriptors, each containing { link pointer, 32-bit address }.
• 1-D DMA using a linked list of 5-word descriptors containing { link pointer, 32-bit address, length, config }. (BF535 style)
• 2-D DMA using an array of 1-word descriptors specifying only the base DMA address within a common data page.
• 2-D DMA using a linked list of 9-word descriptors specifying everything.
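The ping-pong case, a linked ring of descriptors that each hold a link pointer and a buffer address, can be sketched in C as follows. This is a software analogy: `dma_desc` and `dma_ring_init` are hypothetical names, and native C pointers stand in for the 16-bit descriptor words the hardware actually loads.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory descriptor: link pointer plus buffer address. */
typedef struct dma_desc {
    struct dma_desc *next;
    void            *buffer;
} dma_desc;

/* Link n descriptors into a ring over n equally sized, contiguous buffers. */
void dma_ring_init(dma_desc *d, uint8_t *buffers, unsigned n, unsigned buf_bytes)
{
    for (unsigned i = 0; i < n; i++) {
        d[i].buffer = buffers + (size_t)i * buf_bytes;
        d[i].next = &d[(i + 1) % n];    /* the last entry links back to the first */
    }
}
```

With n = 2 this gives the classic ping-pong arrangement: while the DMA fills one buffer, the core processes the other, and following the link pointers swaps the roles.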

Operation of 2-D Direct Memory Access

[Figure: data capture and storage to linear L2 memory as elements A, B, C, ... P; a 2-D DMA with programmable X and Y count and modify values moves the block A, B, I, J to L1 memory]

2-D DMA significantly decreases S/W overhead in video applications!
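The block extraction in the figure can be reproduced with a short C model of 2-D address generation. The function name `dma_2d_copy` is hypothetical, and the model assumes that the X modify value is applied after every element except the last one of a row, where the Y modify value is applied instead. It also assumes the source holds A to P in rows of eight elements.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy a 2-D block of bytes out of a linear source buffer.
   Assumed semantics: x_modify is added after every element except the
   last one of a row, where y_modify is added instead. */
size_t dma_2d_copy(const uint8_t *src, uint8_t *dst,
                   unsigned x_count, int x_modify,
                   unsigned y_count, int y_modify)
{
    size_t n = 0;
    ptrdiff_t off = 0;                      /* current source offset in bytes */
    for (unsigned y = 0; y < y_count; y++) {
        for (unsigned x = 0; x < x_count; x++) {
            dst[n++] = src[off];
            off += (x + 1 < x_count) ? x_modify : y_modify;
        }
    }
    return n;
}
```

Under these assumptions, Xcount = 2, Xmodify = 1, Ycount = 2, Ymodify = 7 picks A, B, I, J out of the linear sequence A to P: the jump of 7 bytes from B lands on I, the first element of the next row.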

Section 5: Program Sequencer / Interrupt Manager

Program Sequencer Features

The Program Sequencer controls all program flow:

Maintains Loops, Subroutines, Jumps, Idle, Interrupts and Exceptions

Contains an 8-stage instruction pipeline

Includes Zero-Overhead Loop Registers

Sequencer-Related Registers

Register Name      Description
RETX               Return address register: exception return
RETN               Return address register: NMI return
RETI               Return address register: interrupt return
RETE               Return address register: emulation return
RETS               Return address register: subroutine return
LC0, LC1           Zero-overhead loop registers: loop counters
LT0, LT1           Zero-overhead loop registers: loop tops
LB0, LB1           Zero-overhead loop registers: loop bottoms
FP, SP             Frame Pointer, Stack Pointer
SYSCFG             System Configuration Register
CYCLES, CYCLES2    Cycle Counters

Program Flow Instructions

Program Flow Instruction     Instruction Function
JUMP                         Unconditional branch
IF CC JUMP, IF !CC JUMP      Conditional branch
CALL                         Subroutine call
RTS, RTI, RTX, RTN, RTE      Return from flow interrupter
LSETUP                       Set up hardware loop

    Jump (P5);            /* indirect jump instruction */
    Jump (PC + P3);       /* indirect jump with offset (PC-relative) */
    Call (P5);            /* RETS register is loaded with address of instruction after call */
    Call (PC + P3);       /* RETS register is loaded with address of instruction after call */
    IF CC Jump <label>;   /* jump on condition cc=1 */
    Call <label>;         /* OK within 24-bit offset from PC */

Conditional Execution – CC Bit

Condition Code Flag (CC bit) resolves branch direction

5 ways to access CC to control program flow:
• Conditional branch is resolved by the value of CC
• A Dreg value can be copied to CC, and vice-versa
• A status flag can be copied into CC, and vice-versa
  − e.g., CC = AV1;
• CC can be set to the result of a Preg comparison
• CC can be set to the result of a Dreg comparison
  − e.g., CC = R3==R2;

SSYNC and CSYNC instructions

• The SSYNC instruction synchronizes "the System", executing everything in the processor pipeline and completing all pending reads and writes from peripherals.
  − Until SSYNC completes, no further instructions can enter the pipeline.
• The CSYNC instruction synchronizes "the Core", executing everything in the processor pipeline.
  − CSYNC is typically used after Core MMR writes to prevent imprecise behavior.

Multi-Cycle Instructions

• Multi-cycle instructions will not execute faster through rescheduling.
  ! See EE-171 Appnote for a complete list of stalls and multicycle instructions
• A 32-bit multiply operation is available
    r0 *= r1;                 // 5 cycles to execute
• The Push Multiple and Pop Multiple instructions take n cycles to complete, where n is the number of registers pushed or popped, assuming L1 memory.
    [--SP] = (R7:0, P5:0);    // 14 cycles to execute

BF533 Execution Pipeline

• 10-stage super-pipeline
• The sequencer ensures that the pipeline is fully interlocked and that all the data hazards are hidden from the programmer
• If executing an instruction that requires data to be fetched, the pipeline will stall until that data is available

Instruction Pipeline

Pipeline Stage               Description
Instruction Fetch 1 (IF1)    Start instruction memory access
Instruction Fetch 2 (IF2)    Finish L1 instruction memory access and align instruction
Instruction Decode (DEC)     Frame Pointer Stack Pointer
Address Calculation (AC)     Calculate data addresses and branch target address
Execute 1 (EX1)              Read data and start access of data memory
Execute 2 (EX2)              Finish access of data memory and start execution of dual-cycle instructions
Execute 3 (EX3)              Execute single-cycle instruction
Write Back (WB)/Commit       Write states to Data and Pointer register files and process events

Pipeline Events

• Stall
  − A latency stall condition can occur when two instructions require extra cycles to complete because they are close to each other in the assembly program. Other stalls can be memory- or loop-related. Stalls can be diagnosed with the Pipeline Viewer and remedied with some rescheduling.
• Kill
  − Instructions after a branch are invalidated in the pipeline, because they will have entered the pipeline before the actual branch instruction gets serviced.
• Multicycle instruction
  − Some instructions take more than one cycle to complete. These extra cycles cannot be avoided without removing the instruction that caused them.

! See EE-171 Appnote for a complete list of stalls and multicycle instructions.

Some Examples of Stall Conditions

• Use of a Preg loaded in the previous instruction causes a 3-cycle stall
    P0 = [P1++];
    R0 = [P0];

• Use of a Preg which was transferred from a Dreg in the previous instruction causes a 3-cycle stall
    P0 = R0;
    P1 = P0 + P2;

• Use of a DAG register which was transferred from a Dreg in the previous instruction causes a 3-cycle stall
    I0 = R0;
    R1 = [I0++];

• Back-to-back multiplication where the result of the first multiplication is used as an operand of the second causes a 1-cycle stall
    R0 = A1 += R1.L * R2.L;
    R1 = A1 += R0.L * R2.L;

Avoiding Pipeline Stalls

Most common numeric operations have no instruction latency

Application note EE-171 available on avoiding stalls

Gives instruction combinations with associated stall info

VDSP++ 3.0 Pipeline Viewer highlights Stall, Kill conditions

Change of Instruction Flow

When a change of flow happens, a new address is presented to the Instruction Memory Unit

There will be a minimum of three cycles before the new instructions appear in the decoder (except when utilizing the hardware loop buffers)

When an instruction in a given pipeline stage is killed, all the instructions in stages above it will also be killed

Hardware Loop Buffers

• The BF53x DSP provides two sets of dedicated registers to support two zero-overhead nested loops

• One way to load these registers is by using the Loop Setup (LSETUP) instruction

• If the desired loop size exceeds the largest LSETUP size in the table above, LT[1:0], LB[1:0], LC[1:0] can be set manually

• If more than 2 nested loops are required, the stack must be used

Operating Modes

User mode causes exceptions when protected resources are accessed. May be used for algorithm/application code.

Supervisor mode has unprotected access to all resources. May be used for O/S kernel, device drivers, debuggers, ISRs.

Emulator (or Debug) mode has supervisor abilities and is accessible via JTAG.

Operating modes provide a feature to implement RTOS architectures and multitasking schemes. Smaller applications may simply run in Supervisor mode all the time.

Operating Modes

[Figure: state diagram of the operating modes. States: User mode (application code), Supervisor mode (system code and event handlers), Emulator/Debug mode, Sleep, and RESET. Labelled transitions: reset inactive, interrupt, interrupt or exception, RTI, RTX, emulation event, RTE.]

Events (Interrupts / Exceptions)

The Event Controller manages 5 types of events:
• Emulation (via SW or external pin)
• Reset (via SW or external pin)
• Non-Maskable Interrupt (NMI), for events that require immediate processor attention (via SW or external pin)
• Exception
• Interrupts
  − Global Interrupt Enable
  − Hardware Error
  − Core Timer
  − 9 General-Purpose Interrupts for servicing peripherals

Event Processing Flow

[Figure: event processing flow for "Event A". Peripheral sources (RTC, USB, PCI, SPORT0 RX/TX, SPORT1 RX/TX, SPI0, SPI1, UART0 RX/TX, UART1 RX/TX, TIMER0, TIMER1, TIMER2, PF A, PF B, MEMDMA, WATCHDOG) pass through SIC_ISR, SIC_IWR (awake from IDLE), SIC_IMASK, and SIC_IARx, which maps them onto IVG7 to IVG15. These join EMU, RST, NMI, EVSW, IVHW, and IVTMR at ILAT, IMASK, and IPEND; the EVT entry points to the ISR at 0xFF80 0000, which enters the core pipeline. Annotations: keep ILAT cleared ("0"), clear IPEND, clear SIC_ISR.]

Interrupt masking shown in the figure:
    CLI R0;    // "0000"
    STI R0;    // IMASK = R0;

ISR for "Event A" shown in the figure:
    FF80 0000    [--SP] = P4;
    FF80 0002    [--SP] = R4;
    FF80 0004    R4.L = 0x1010;
    FF80 0008    P4.H = hi(FLAG_C);
    FF80 000C    P4.L = lo(FLAG_C);
    FF80 0010    W[P4] = R4;
    FF80 0012    R4 = [SP++];
    FF80 0014    P4 = [SP++];
    FF80 0016    RTI;

Event Processing Flow

1. SIC_ISR logs Interrupt A's request and tracks system interrupts that are asserted but not yet serviced.
2. SIC_IWR checks if it should wake the core from an idled state based on this interrupt request.
3. SIC_IMASK masks off or enables interrupts from peripherals at the system level. If Interrupt A isn't masked, the request proceeds to Step 4.
4. SIC_IARx MMRs determine the core priority of Interrupt A.
5. ILAT adds Interrupt A to its log of interrupts latched by the core but not yet being serviced.
6. IMASK masks off or enables events of different core priorities. If the IVGx event corresponding to Interrupt A isn't masked, the request proceeds to Step 7.
7. The Event Vector Table (EVT) is accessed to look up the appropriate vector for Interrupt A's interrupt service routine (ISR).
8. When the event vector for Interrupt A has entered the core pipeline, the appropriate IPEND bit is set, which clears the respective ILAT bit.
9. When Interrupt A's ISR is finished, the RTI instruction clears the appropriate IPEND bit. But the relevant SIC_ISR bit isn't cleared unless the ISR clears the mechanism that generated Interrupt A.
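The steps above can be followed in a small bit-mask model. The C sketch below is illustrative only: `event_regs` and its functions are hypothetical, the `ivg_of` array stands in for the SIC_IARx mapping, a set mask bit is taken to mean "enabled", and the wake-up check of step 2 and the EVT lookup of step 7 are not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register snapshot: one bit per peripheral in the SIC
   registers, one bit per IVG level in the core registers. */
typedef struct {
    uint32_t sic_isr, sic_imask;
    uint8_t  ivg_of[32];        /* stands in for SIC_IARx: peripheral -> IVG */
    uint16_t ilat, imask, ipend;
} event_regs;

/* Steps 1 to 6: latch a peripheral request and report whether it reaches
   the core unmasked. */
bool raise_interrupt(event_regs *r, unsigned periph)
{
    r->sic_isr |= 1u << periph;                            /* step 1 */
    if (!(r->sic_imask & (1u << periph))) return false;    /* step 3 */
    unsigned ivg = r->ivg_of[periph];                      /* step 4 */
    r->ilat |= (uint16_t)(1u << ivg);                      /* step 5 */
    return (r->imask & (1u << ivg)) != 0;                  /* step 6 */
}

/* Step 8: the vector enters the pipeline: set IPEND, clear ILAT. */
void enter_isr(event_regs *r, unsigned ivg)
{
    r->ipend |= (uint16_t)(1u << ivg);
    r->ilat &= (uint16_t)~(1u << ivg);
}

/* Step 9: RTI clears IPEND; SIC_ISR stays set until the ISR clears the
   request at the peripheral. */
void return_from_isr(event_regs *r, unsigned ivg)
{
    r->ipend &= (uint16_t)~(1u << ivg);
}
```

The model makes the last point of step 9 easy to see: after `return_from_isr`, the `sic_isr` bit is still set, so an ISR that forgets to clear the peripheral request will be re-entered immediately.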

Nested Interrupts

The state of the processor needs to be saved onto the stack:

ISR:
    [--SP] = RETI;             // Interrupts enabled
    [--SP] = ASTAT;
    [--SP] = FP;
    [--SP] = (Rx,Ax,Px,Ix);
    …
    …
    (Rx,Ax,Px,Ix) = [SP++];
    FP = [SP++];
    ASTAT = [SP++];
    RETI = [SP++];             // Interrupts disabled
    SSYNC;                     // Wait until RETI load takes effect
    RTI;                       // Interrupts enabled

Section 6: I/O Embedded

Serial Ports

[Figure: serial port signals: primary and secondary TX with Tx clock and Tx sync; primary and secondary RX with Rx clock and Rx sync]

• Two dual-channel synchronous ports supporting 8 stereo I2S channels
• Supports 3- to 32-bit data widths
• 100 MHz operation from external clock
• SCLK/2 operation from internal clock (up to 66 MHz)

ADSP-21533 SPORTs

• Two synchronous serial ports
• Fully independent receive and transmit, double buffered
• Internally or externally generated serial clocks and frame syncs
• Programmable internal/external frame syncs
• Built-in hardware for u-law and A-law companding
• Support for multi-channel TDM interfaces
• Dedicated DMA engine capable of chained transfers
• Generates optional interrupts
• Operates at up to 1/2 the system bus clock rate (SCLK)

ADSP-21533 Serial PORTs Features

• Interrupt-driven, single-word transfers to/from on-chip memory controlled by the ADSP-21535 core
• Block word transfers to/from memory controlled by the DMA controller
• Several modes of operation
  − Programmable serial word length, 3 to 16 bits
  − Either MSB or LSB first
  − Early frame sync
  − Late frame sync
  − No frame sync
• 128 time slot multi-channel capability for TDM interfaces (for example, a T1 interface)

Serial Port - Block Diagram

Serial Clock Divisor

Used For Internally generated Serial Clock

Frame Sync Divisor

Used for internally generated Frame Sync

Number of Serial CLK cycles between FS assertions (= xFSDIV + 1)

Serial Clock
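The two divisors can be worked out with a few lines of C. This is a minimal sketch, not vendor code: the frame sync relation (xFSDIV + 1 serial clocks per frame) is the one stated above, while the serial clock relation SCLK / (2 x (xCLKDIV + 1)) is an assumption taken from the usual Blackfin convention and should be checked against the hardware reference. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Serial clock divisor for an internally generated serial clock.
   Assumption: f_serial = SCLK / (2 * (xCLKDIV + 1)). */
uint32_t sport_clkdiv(uint32_t sclk_hz, uint32_t serial_clk_hz)
{
    assert(serial_clk_hz != 0 && serial_clk_hz <= sclk_hz / 2u);
    return sclk_hz / (2u * serial_clk_hz) - 1u;
}

/* Frame sync divisor: FS is asserted every (xFSDIV + 1) serial clocks. */
uint32_t sport_fsdiv(uint32_t serial_clocks_per_frame)
{
    assert(serial_clocks_per_frame >= 1u);
    return serial_clocks_per_frame - 1u;
}
```

With SCLK = 100 MHz and a 2.5 MHz serial clock the clock divisor is 19; a frame of 32 serial clocks needs a frame sync divisor of 31.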

Serial Port Timing Characteristics: Early vs. Late Framing

Early framing: frame sync precedes data by one cycle

Late framing: frame sync checked on first bit only

[Timing diagram: xCLK, Late Frame Sync, Early Frame Sync, Data (B3 B2 B1 B0 ...)]

Multichannel Operation

[Timing diagram: RCLK, RFS, TFS and DT=DR carrying D3 D2 D1 D0 in each of Slot 0, Slot 1 and Slot 2. Example: receive on Slots 0 and 2, transmit on Slot 1]

• TDM method where serial data is sent or received on different channels sharing the same serial bus
• Up to 128 channels can be enabled for transmit or receive
• RFS signals start of frame
• TFS is used as Transmit Data Valid (TDV) for external logic. Active only during transmit channels
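The slot example (receive on Slots 0 and 2, transmit on Slot 1) amounts to setting bits in per-direction channel-enable masks. The sketch below models 128 channels as four 32-bit words; the type and function names are hypothetical and the mapping to actual SPORT channel-select registers is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define TDM_CHANNELS 128
#define TDM_WORDS    (TDM_CHANNELS / 32)

/* Hypothetical model of a 128-channel enable mask (one bit per slot). */
typedef struct {
    uint32_t word[TDM_WORDS];
} tdm_mask_t;

/* Enable one time slot in the mask. */
void tdm_enable(tdm_mask_t *m, unsigned slot)
{
    assert(m != 0 && slot < TDM_CHANNELS);
    m->word[slot / 32] |= (uint32_t)1 << (slot % 32);
}

/* Return 1 if the slot is enabled, 0 otherwise. */
int tdm_is_enabled(const tdm_mask_t *m, unsigned slot)
{
    assert(m != 0 && slot < TDM_CHANNELS);
    return (int)((m->word[slot / 32] >> (slot % 32)) & 1u);
}
```

For the example above, the receive mask ends up with word 0 equal to 0x5 and the transmit mask with word 0 equal to 0x2.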

Example for Multichannel Connection

[Figure: three ADSP-21535 devices (Master, Slave1, Slave2), each with RCLK, TCLK, RFS, TFS, DT and DR pins; each TFS output is labelled TDV]

BF53x SPI Features
• 1 SPI-Compatible Port
• 4 Pin Interface (MOSI, MISO, ~SPISS, SCK)
• Master and Slave Mode Operation
• Supports Multimaster Environments
• Can Use 16 GP Flag Pins As Slave-Select Lines
  − 2 Slave Select Input Pins (One Per SPI Port)
  − 14 Slave Select Output Pins (Seven Per SPI Port)
• Gated SPI Clock (Only Active During Transfers)
• DMA Support
  − Each Port Has One DMA Channel (Input or Output)
• Programmable Baud Rate
• Programmable Clock Polarity and Phase
• Programmable Serial Word Length (8 or 16 Bits)

A Closer Look At How The Data Is Moved

Shift Registers Simultaneously Shift Data In And Out

Universal Asynchronous Receiver/Transmitter Interface (UART)

UART options
• 5-8 data bits
• 1, 1½ or 2 stop bits
• None, even or odd parity
• Baud rate = SCLK / (16 x DIVISOR)

BF 53x
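The baud rate relation can be inverted to pick a divisor for a target baud rate. A minimal sketch, with hypothetical function names; rounding to the nearest divisor is a design choice made here, not something the slide prescribes.

```c
#include <assert.h>
#include <stdint.h>

/* Nearest integer DIVISOR for a requested baud rate,
   from baud = SCLK / (16 * DIVISOR). */
uint32_t uart_divisor(uint32_t sclk_hz, uint32_t baud)
{
    assert(baud != 0);
    return (sclk_hz + 8u * baud) / (16u * baud);
}

/* Baud rate actually produced by a given divisor (truncated to Hz). */
uint32_t uart_actual_baud(uint32_t sclk_hz, uint32_t divisor)
{
    assert(divisor != 0);
    return sclk_hz / (16u * divisor);
}
```

With SCLK = 100 MHz and a 115200 baud target, the divisor is 54 and the resulting rate is about 115740 baud, an error of roughly 0.5 %.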

UART Features
• Full Duplex, Industry Standard 16450 Compatible
• Asynchronous Serial Communication (7-12 Bits / Word)
  − 1 Low-Going Start Bit
  − 5 - 8 Data Bits
  − None or 1 Parity Bit (Even or Odd Parity Supported)
  − 1 or 2 Stop Bits
• Data Is ALWAYS Transmitted or Received LSB First
• DMA Capable
  − Supports Separate TX and RX DMA Master Channels
  − Used In Either Programmed I/O Mode or In DMA Mode
• Data Is Double Buffered On Both Ends
  − Transmit Holding (THR) & Receive Buffer (RBR) Registers
  − Transmit Shift (TSR) & Receive Shift (RSR) Registers
  − These Pairs Allow For Synchronization Delay

BF-533 Core Timer

[Figure: core timer registers TSCALE (8 bit), TCOUNT (32 bit) and TPERIOD (32 bit), clocked by CCLK, raising IRQ 6]

• Used to generate interrupts at multiples of the CCLK period
• 32-bit tick timer
• Dedicated Interrupt Priority 6 (fixed)
• Autoreload is optional
• Interrupt period = (TSCALE + 1) x TPERIOD / CCLK
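Read as a period, the relation says that one interrupt occurs every (TSCALE + 1) x TPERIOD core clock cycles. The sketch below computes that cycle count and a TPERIOD value for a wanted period in microseconds; the function names are hypothetical and the code only does the arithmetic, it does not touch any register.

```c
#include <assert.h>
#include <stdint.h>

/* CCLK cycles between two core timer interrupts. */
uint64_t core_timer_cycles(uint8_t tscale, uint32_t tperiod)
{
    return ((uint64_t)tscale + 1u) * tperiod;
}

/* TPERIOD value giving (approximately) the requested period. */
uint32_t core_timer_tperiod(uint32_t cclk_hz, uint8_t tscale,
                            uint32_t period_us)
{
    uint64_t cycles  = (uint64_t)cclk_hz * period_us / 1000000u;
    uint64_t tperiod = cycles / ((uint64_t)tscale + 1u);
    assert(tperiod >= 1u && tperiod <= UINT32_MAX);
    return (uint32_t)tperiod;
}
```

At CCLK = 600 MHz, a 1 ms tick needs TPERIOD = 600000 with TSCALE = 0, or TPERIOD = 60000 with TSCALE = 9.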

BF533: Watchdog Timer

The watchdog timer can be used to improve system reliability by generating an event to the Blackfin core if the timer expires before being updated by software.

The event generated can be programmed to be:
• a reset (software reset takes place)
• a nonmaskable interrupt
• a general purpose interrupt

The Watchdog timer is clocked by the system clock (SCLK).

BF533 Watchdog Timer

[Figure: watchdog registers WDOG_CNT (32 bit), WDOG_STAT (32 bit, readable) and WDOG_CTL (16 bit), clocked by SCLK, with outputs IRQ, NMI and Reset]

• Needs to be serviced by software periodically
• Any write to WDOG_STAT triggers reload from WDOG_CNT
• Unique pattern of 4 bits required to disable watchdog timer
• No auto-reload. Sticky W1C bit indicates expiration
• Halted in emulator mode
• Watchdog Interval = WDOG_CNT / SCLK
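The interval relation WDOG_CNT / SCLK can be turned around to choose a count for a wanted timeout. A minimal sketch with hypothetical function names; it only computes values and assumes SCLK is known in Hz.

```c
#include <assert.h>
#include <stdint.h>

/* WDOG_CNT value for a timeout given in milliseconds. */
uint32_t wdog_cnt_for_ms(uint32_t sclk_hz, uint32_t timeout_ms)
{
    uint64_t cnt = (uint64_t)sclk_hz * timeout_ms / 1000u;
    assert(cnt >= 1u && cnt <= UINT32_MAX);
    return (uint32_t)cnt;
}

/* Watchdog interval in milliseconds for a given WDOG_CNT (truncated). */
uint32_t wdog_interval_ms(uint32_t wdog_cnt, uint32_t sclk_hz)
{
    assert(sclk_hz != 0);
    return (uint32_t)((uint64_t)wdog_cnt * 1000u / sclk_hz);
}
```

At SCLK = 133 MHz a 100 ms timeout corresponds to WDOG_CNT = 13300000. Software then has to write WDOG_STAT more often than that to keep the timer from expiring.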

BF533 RTC

RTC Interrupt may be issued by 6 different events

Dedicated RTC interrupt registers RTC_ICTL and RTC_ISTAT

Interval Interrupts (once per second, minute or 24 hours)

Alarm function supported by RTC_ALARM register

Alarm IRQ issued when hours, minutes and seconds match

Day IRQ issued when also 8-bit day matches

Additional Stopwatch functionality

Additional 8-bit counter decrements every minute

Interrupt when reaches zero, no autoreload

Three Peripheral Timers of the BF53X
• Three identical timers
• Timer Modes
  − Pulse Width Modulation (PWM_OUT)
  − Width and Period Capture (WDTH_CAP)
  − External Event Counter (EXT_CLK)
• Dedicated Pins TMR2, TMR1, TMR0
• One programmable interrupt each
• Three 32-bit registers each (2 x 16 Bit)
  − Width
  − Period
  − Counter (read-only)
• One 16-bit Configuration Register each
• One 16-bit Status Register each (sticky)
  − IRQ, Timer Enable, Overflow

Programmable Flags

• Features:
  − 16 bi-directional programmable flags
  − Write-1-to-set Flag Set register
  − Write-1-to-clear Flag Clear register
  − Two independent interrupt channels
  − Level or edge sensitive trigger of input source
  − Rising or falling edge trigger of input source
  − Single edge or both edges trigger of input source
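The write-1-to-set and write-1-to-clear registers let software change individual flags without a read-modify-write sequence: bits written as 1 act, bits written as 0 are ignored. The sketch below models that behaviour on a plain 16-bit variable; the structure and function names are hypothetical and stand in for the real memory-mapped registers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software model of the 16 programmable flags. */
typedef struct {
    uint16_t flags;
} pf_port_t;

/* Model of a write to the Flag Set register: 1 bits set the flag. */
void pf_write_set(pf_port_t *p, uint16_t mask)
{
    assert(p != 0);
    p->flags |= mask;
}

/* Model of a write to the Flag Clear register: 1 bits clear the flag. */
void pf_write_clear(pf_port_t *p, uint16_t mask)
{
    assert(p != 0);
    p->flags &= (uint16_t)~mask;
}
```

Because zero bits have no effect, an interrupt handler and the main program can each update their own flags without disturbing the other's.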

Flag Pin Functionality

• 16 bi-directional general-purpose I/O pins available
• Each can be configured as an output, input, or an interrupt pin
• PF7:0 operate as MSEL6:0 and DF for PLL at reset
• PF15:0 also function as 2 Slave Select input pins SPISSx (Slave Mode) and 14 Slave Select Output Pins SPIxSELy (Master Mode)
• Two Interrupt Requests (FLAGA/FLAGB)

[Figure: ADSP-21535 pins PF0 to PF15 with alternate functions SPISS0, SPISS1, SPI0SEL1, SPI1SEL3, MSEL0, MSEL6, DF, SPI0SEL4, SPI1SEL7]

Parallel Peripheral Interface

[Figure: Video Port (PPI) with 2-D DMA, clock and syncs, external clock up to 65 MHz, connected to appliances]

• Supports glueless ITU-R 656 Video Converter Interface directly at 27 MHz
• Provides general purpose parallel ADC / DAC interface at up to 65 MSPS
• Can be alternately configured as 16 general purpose Flag In/Out pins
• Integral 2-D DMA capability: significant reduction in software overhead for video applications

Parallel Peripheral Interface (PPI)

• Bi-directional interface
  − 4 dedicated pins (PPI3 – PPI0)
  − 12 multiplexed PF pins (PF4 – PF15)
  − PPI_CLK (always sourced externally) operating up to SCLK/2
• Supports bit-parallel ITU-R 656 recommendation
• Supports a flexible General Purpose mode
  − Up to 16 data lines, 1 clock, 3 Frame Syncs
  − Frame Syncs 1&2 muxed with Timer 1&2 pins
  − Frame Sync 3 muxed with PF3

Examples of General-Purpose I/O Modes

Output, 1 Sync

Input, 3 Sync

Frame Capture General-Purpose Mode

General-Purpose PPI Modes

GP MODE            PPI_FS1   PPI_FS2   PPI_FS3   DATA
Input, 1 Sync      Input     ---       ---       Input
Infinite Capture   Input     ---       ---       Input
Input, 3 Syncs     Input     Input     Input     Input
Frame Capture      Output    Output    ---       Input
Output, 1 Sync     Output    ---       ---       Output
Output, 3 Syncs    Output    Output    Output    Output

Additional PPI Features
• Can optionally ignore Field 2 (in Active Field Mode)
• 4 control signal polarity choices (H,V,CLK)
• Can skip even or odd data elements
• Supports 16-bit data packing mode
• Supports 32-bit DMA mode (2 bursts of 16-bit DMA)
• Works hand-in-hand with 2D DMA Engine
• Can interrupt at end of row, frame, or partial-frame transfer

PPI GP Input Modes

[Figure: Video Source (8-16 bits data, CLK, HSYNC, VSYNC, FIELD) connected to the PPI (PPIx, PPI_CLK, PPI_FS1, PPI_FS2, PPI_FS3)]

[Figure: A/D Converter (8-16 bits data, CLK, HSYNC) connected to the PPI (PPIx, PPI_CLK, PPI_FS1)]

PPI GP Output Modes

[Figure: PPI (PPIx, PPI_CLK, PPI_FS1, PPI_FS2, PPI_FS3) connected to a Video Display (8-16 bits data, CLK, HSYNC, VSYNC, FIELD)]

[Figure: PPI (PPIx, PPI_CLK, PPI_FS1) connected to a D/A Converter (8-16 bits data, CLK, HSYNC)]

Possible Data Transfer Scenarios

[Figure: scenarios combining ADC, PPI, SPORT, DMA, L1 Memory, SDRAM, Filtering and External Processor blocks, linked by data and control paths]

Connection to ADC/DAC (Bidirectional interface)

[Figure: ADSP-BF533 PPI (PPIx, PPI_CLK, PPI_FS1, PPI_FS2) connected to the AD9975 RX/TX path (ADIOx, RX/TX CLK, TX_EN, RX_EN); ADSP-BF533 SPI (MISO, MOSI, SCLK) connected to the AD9975 SPI (SDATA, SDO, SCK)]

• External or AD9975 internal clock
• Serial link for configuration (modes, powerdown …)
• Frames controlled by Timer interrupts
• Needs software switching between Frame Capture and Output modes

ADSP-21533 Booting

Booting Methods

BMODE Pins [2:0]   Description
000                Execute from 16-bit External ASYNC Bank0 memory (Bypass Boot ROM)
001                Boot from 8/16-bit Prom/Flash
010                Boot from an 8-bit Addressable SPI0 Device
011                Boot from a 16-bit Addressable SPI0 Device
100-111            Reserved

Behavior Upon RESET

ADSP-21533 RESET
• BMODE = 000: Jump To 16-Bit External Memory For Execution
• BMODE = 001, 010, 011: Jump To On-Chip Boot ROM

Boot Sequence

On-chip boot ROM loads N bytes into the start of L2 memory and, once completed, jumps to the start of L2 to begin execution

[Figure: BlackFin with On-Chip Boot ROM (0xEF00 0000) and L2 Memory (0xF000 0000) holding Application Code or 2nd Stage Loader; Prom/Flash or SPI Device holding, from address 0x0, a 4 Byte Header (N) followed by N Bytes of Application Code or 2nd Stage Loader]

Creating a Loader File

Using the Loader Property Page under Project Options

Dynamic Power Management

Dynamic Power Management - Variable Frequency

[Figure: CLKIN feeds the PLL (1x - 64x); the PLL output is divided by 1, 2, 4 or 8 to give CCLK and by 1 to 15 to give SCLK]

• Dynamic modification on the fly
• Dynamic modification requires PLL sequencing
• SCLK ≤ CCLK
• SCLK ≤ 133 MHz
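The clock tree and its two constraints are easy to check in code before a configuration is written to the part. The sketch below assumes that both dividers act on the PLL output, as the diagram suggests; the structure, the function name and the 133 MHz constant are illustrative, and no register encoding is implied.

```c
#include <assert.h>
#include <stdint.h>

#define SCLK_MAX_HZ 133000000u

typedef struct {
    uint32_t vco_hz;   /* PLL output: CLKIN x multiplier */
    uint32_t cclk_hz;  /* core clock                     */
    uint32_t sclk_hz;  /* system clock                   */
} pll_clocks_t;

/* Returns 1 and fills *out if the settings are in range and satisfy
   SCLK <= CCLK and SCLK <= 133 MHz; returns 0 otherwise. */
int pll_compute(uint32_t clkin_hz, unsigned mult, unsigned cclk_div,
                unsigned sclk_div, pll_clocks_t *out)
{
    assert(out != 0);
    if (mult < 1 || mult > 64)
        return 0;
    if (cclk_div != 1 && cclk_div != 2 && cclk_div != 4 && cclk_div != 8)
        return 0;
    if (sclk_div < 1 || sclk_div > 15)
        return 0;

    uint64_t vco = (uint64_t)clkin_hz * mult;
    if (vco > UINT32_MAX)
        return 0;

    out->vco_hz  = (uint32_t)vco;
    out->cclk_hz = out->vco_hz / cclk_div;
    out->sclk_hz = out->vco_hz / sclk_div;
    return out->sclk_hz <= out->cclk_hz && out->sclk_hz <= SCLK_MAX_HZ;
}
```

For example, CLKIN = 27 MHz with a 22x multiplier gives a 594 MHz PLL output; a core divider of 1 and a system divider of 5 yield CCLK = 594 MHz and SCLK = 118.8 MHz, while a system divider of 4 would give 148.5 MHz and is rejected.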

Power Management--Variable Voltage

On-chip Voltage Regulation
• Generates core voltage from external 2.25V to 3.6V input
• Core voltage programmable in 50mV increments
• Optional bypass
• Minimal external components required

[Figure: regulator circuit with VDDEXT (2.25V -> 3.6V), VREF comparator, VDDCTRL and VDDINT feeding the DSP internal circuit; external components: 10 µH inductor, Uz = 4V, 10 µF tantalum or electrolytic capacitor, 0.1 µF ceramic capacitor]

Power Management Options

Low Active Power
• Flexible power management with selectable power-down for each peripheral section
• Dynamic Power Management allows dynamic modification of both frequency and voltage (power management companion chip: ADP3053)

Low Standby Power
• 4 power modes
• Real Time Clock with alarm and wakeup features

Dynamic Power Management

Blackfin DSPs Optimize Power Consumption

[Figure: power consumption and Vdd over time t. DSP operation at 1.5V, 300MHz, then at 1.0V, 100MHz, then at 1.3V, 225MHz, with a regulator transition and PLL settling between operating points. Two curves are compared: just vary the frequency; vary the voltage and frequency]
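Why lowering the voltage pays off more than lowering the frequency alone can be illustrated with the common first-order model for dynamic CMOS power, P proportional to V² x f. This model is an assumption brought in here for illustration (it ignores leakage and is not a figure from the slides); the function name is hypothetical.

```c
#include <assert.h>

/* Dynamic power relative to a reference operating point,
   using the first-order model P ~ V^2 * f. */
double relative_power(double v, double f_mhz, double v_ref, double f_ref_mhz)
{
    assert(v_ref > 0.0 && f_ref_mhz > 0.0);
    return (v * v * f_mhz) / (v_ref * v_ref * f_ref_mhz);
}
```

Relative to 1.5V at 300MHz, dropping only the frequency to 100MHz leaves about 33 % of the dynamic power, whereas 1.0V at 100MHz leaves roughly 15 %, and 1.3V at 225MHz about 56 %.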

Power Management States

Mode         Relative Power Savings   Notes
Full On      Min                      Max performance
Active       Low                      Full core operation at CLKIN/2. System DMA to L1 supported. PLL is bypassed.
Sleep        High                     Core idle. All clocks to core disabled. SCLK enabled.
Deep Sleep   Max                      Core idle. Async. peripherals enabled for limited operation. Exit only via HW reset or RTC interrupt.

Clock States in Different Power Modes

Mode         PLL                   PLL Bypassed?                            Core Clock (CCLK)   System Clock (SCLK)
Full On      Enabled               No                                       Enabled             Enabled
Active       Enabled or Disabled   Yes                                      Enabled             Enabled
Sleep        Enabled               Yes or No (depending on previous mode)   Disabled            Enabled
Deep Sleep   Disabled              Yes                                      Disabled            Disabled

System Clocking

2 internal clock domains: CCLK and SCLK

CCLK is derived from PLL or directly from CLKIN pin

SCLK is divided down from CCLK & must not exceed 133 MHz

CLKIN can be driven from external oscillator (not a crystal)

Programmable PLL supports 1x to 31x frequency multiplication, enabling high-speed operation with low-frequency clock inputs

Programming PLL Transitions
1) Program the wakeup event to bring the core out of idle state
2) Program the new PLL parameters
3) Disable interrupts
4) Flush the pipeline, and place the core into an idle state
5) Reinstate interrupts
6) Have now transitioned into the new PLL mode

IDLE state
• After executing IDLE and SSYNC instructions, the DSP core stops executing instructions, retains the contents of the pipeline and waits for an interrupt or wakeup. PLL, CCLK and SCLK continue running

2 ways to leave an IDLE state
• DSP services an interrupt. DSP will return to the instruction after the IDLE;SSYNC; after executing the RTI instruction.
• A peripheral wakes DSP up (based on SIC_IWR settings), but no interrupt occurs. DSP returns to the instruction that follows IDLE;SSYNC;

Note that the IDLE instruction relates to Power-down modes too

Section 7
Software Environment: VDSP++ 5v0

Assembler
• Parses Input Source Code And Creates Object Files
  − Raw Assembly Input (.ASM File)
  − Compiler-Generated Assembly Input (.S File)
  − Object File Output (.DOJ File)

Preprocessor
• Substitutes Code/Data For Macros/Constants
• Preprocessor Directives
  − #define - define a macro or constant
  − #undef - undo macro definition
  − #if, #endif - conditional assembly
  − #else, #elif - multiple conditional blocks
  − #ifdef, #if defined - condition based on macro definition
  − #ifndef - condition based on macro not defined
  − #include - include source code from another file

Assembler

[Figure: inputs: Assembler file (.asm), Data File (.dat), Header file (.h); stages: Preprocessor (Intermediate .is), Assembler; outputs: Object File .doj (binary), Listing File .lst (ASCII)]

Assembler Property Page

#include <def21535.h>
#include "myheader.h"
#ifdef mydef
R0 += 1;
#else
R0 += -1;
#endif

• If you want to get the intermediate .is file, select here
• Depending on definitions (here: mydef), you can select different codes
• If chosen, a listing file will be created
• If chosen, you are able to debug in the source code

Directives for Assembler

Directive   Function
.ALIGN      Specifies a byte alignment requirement
.SECTION    Mark start and end of section
.VAR        Declare variable or buffer (32 bit)
.BYTE       Declare 8 bit data
.BYTE2      Declare 16 bit data
.EXTERN     Reference to a symbol declared in another file
.GLOBAL     Make a symbol visible to other source files

Sections in Assembler Files

.SECTION data_a;
.BYTE data_array[N];

.SECTION data_b;
.VAR coeff_array[N];
.VAR x = 0x12345689;

.SECTION program;
_main: B0 = data_array;
L0 = length(data_array);
. . .

The .SECTION directive assigns data and code
- Multiple sections may be used within a single source file
- Any section name may be chosen

The defbf532.h Header File
• Allows Programmer to Use Symbols for Memory Mapped Registers
• Located in: \\VisualDSP\Blackfin\include\
• To include it use: #include <defbf532.h> or #include <Cdefbf532.h>
• Example:
P0.H = HI(EBIU_SDRRC);
P0.L = LO(EBIU_SDRRC);
R0 = 0x2345;
[P0] = R0; // Write 0x2345 to EBIU_SDRRC

Assembler Source File Example

#include <defbf532.h>
#define N 20 // replace N by 20
.GLOBAL start;
.SECTION data1; // data in L1 A memory
.byte2 buffer[N]="fill.dat"; // initialize data from file
.SECTION data2; // data in L1 B memory
.VAR xy = 0x12345678; // initialize var with 32-bit value
.SECTION/PM program; // instructions in L1 instruction memory
start: i0.l = LO(buffer); // get low address word of array and load index register
i0.h = HI(buffer); // get high address word of array
B0 = I0; // load base register with address
L0 = N*2; // size of array (circular buffer!) in bytes: N 16-bit words
R0 = 0 (z);
P2 = N;
LSETUP (loopstart,loopend) LC0 = P2; // setup loop
loopstart: R0 += 1; // 1st instruction in loop
loopend: W[I0++] = R0.L; // last instruction in loop: store the low 16 bits
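The circular addressing set up by the index, base and length registers can be modelled in C to see what the post-increment does at the end of the buffer. This is a simplified sketch of the wrap rule, assuming the modifier never exceeds the buffer length; the structure and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model of one index/base/length register set. */
typedef struct {
    uint32_t i;  /* index: current address          */
    uint32_t b;  /* base: start address of buffer   */
    uint32_t l;  /* length of the buffer in bytes   */
} dag_t;

/* Post-modify: return the address used for the access, then advance
   the index by m bytes, wrapping inside [b, b + l). */
uint32_t dag_post_modify(dag_t *d, int32_t m)
{
    assert(d != 0 && d->l > 0);
    assert((int64_t)m <= (int64_t)d->l && (int64_t)m >= -(int64_t)d->l);

    uint32_t used = d->i;
    int64_t next = (int64_t)d->i + m;
    if (next >= (int64_t)d->b + d->l)
        next -= d->l;              /* ran past the end: wrap to start */
    else if (next < (int64_t)d->b)
        next += d->l;              /* ran before the start: wrap to end */
    d->i = (uint32_t)next;
    return used;
}
```

For a buffer of 20 16-bit words (40 bytes), twenty post-increments by 2 bring the index back to the base address, which is why the loop can run repeatedly without reloading the index register.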

Listing file (.lst)

Page 1 .\test.asm
ADI easmblkfn (2.1.5.0) 02 Apr 2002 15:32:00

offset opcode line
====== ====== ====
                 1   #include <def21535.h>;
                 2   #define N 20 //replace N by 20
                 3   .GLOBAL start;
                 4   .SECTION data_a; //data in L1 memory bank A
                 5   .VAR buffer[N]="fill.dat"; //initialise data from file
                 5
                 6   .SECTION data_b; //data in L1 memory bank B
                 7   .VAR x = 0x12345678; //initialise variable
                 8   .SECTION L2_program; //instructions in L2 memory
     0           9   start: I0 = buffer (z); //get low address word of array
     0   90e1    9
     2   0000    9
     4   50e1   10   I0.H = buffer; //get high address word of array
     6   0000   10
     8   8036   11   B0=I0; //load base register
     a   3ce1   12   L0=N*4; // size of array (circular buffer!) in bytes
     c   5000   12
     e   0060   13   R0=0;
    10   a068   14   P0=N;
    12   b0e0   15   lsetup(loopstart,loopend) LC1 = P0; // setup loop
    14   0000   15
    16          16   loopstart: R0 += 1; // 1st instruction in loop
    16   0864   16
    18          17   loopend: [I0++]=R0; // last instruction in loop
    18   009e   17

Line Nr. in the source code

Offset within the specified section

Generated opcode

Source code

Macros

#define mymacro(x,y) R0 = x; R1 = y; R2 = R0 + R1
.SECTION program;
start: mymacro(0x4,P0);
[I0++] = R2;

The Preprocessor will create the following:
start: R0 = 0x4 (Z);
R1 = P0;
R2 = R0 + R1;
[I0++] = R2;

The final semicolon goes either at the end of the macro definition or after the macro call

Linker

• Generates a Complete Executable DSP Program (.dxe)
• Resolves All External References
• Assigns Addresses to re-locatable Code and Data Spaces
• Generates Optional Memory Map
• Output Can Be Read by Loader, Simulator, and Debugger
• Controlled by linker commands contained in a linker description file (LDF)

Linker

[Figure: Linker inputs: Object File .DOJ, Library Files .DLB, Linker Description Files .LDF; Linker outputs: Memory Image File .DXE (binary), Memory Map File .MAP (ASCII)]

The Linker Description File (LDF)

The link process is controlled by a linker command language

Provides complete specification of mapping between the linker's input files and its output.

It controls
• input files
• output file
• target memory configuration

Preprocessor Support

LDF consists of three primary parts
• Global Commands
  − Defines architecture or processor
  − Directory search paths
  − Libraries and object files to include
• Memory Description
  − Defines memory sections
• Link Project Commands
  − Link against object file list
  − Output file name
  − Mapping of input sections to memory segments

Example Global Commands

ARCHITECTURE (ADSP-21535) // Processor Used

SEARCH_DIR( $ADI_DSP\Blackfin\lib ) // Directories to search for files

$OBJECTS = bootup.doj, $COMMAND_LINE_OBJECTS; // Macro listing all command line objects and bootup

Linker Description File Macros
• $COMMAND_LINE_OBJECTS: List of objects (.DOJ) and libraries (.DLB) passed on command line.
• $COMMAND_LINE_OUTPUT_FILE: Output executable file name specified on the command line with the -o switch.
• $ADI_DSP: Path to VisualDSP installation directory.
• $macro: User defined macro for a list of files, e.g.: $OBJECTS

Example LDF: Global Commands & Memory Description

ARCHITECTURE (BF-533)
SEARCH_DIR ($ADI_DSP\Blackfin\lib)
$OBJECTS = $COMMAND_LINE_OBJECTS;
MEMORY
{
seg_data_a { TYPE(RAM) START(0xFF800000) END(0xFF803FFF) WIDTH(8) }
seg_data_b { TYPE(RAM) START(0xFF900000) END(0xFF903FFF) WIDTH(8) }
seg_data_scr { TYPE(RAM) START(0xFFB00000) END(0xFFB00FFF) WIDTH(8) }
seg_prog_L1 { TYPE(RAM) START(0xFFA00000) END(0xFFA03FFF) WIDTH(8) }
seg_prog_L2 { TYPE(RAM) START(0xF0000000) END(0xF003FFFF) WIDTH(8) }
}

• The lines before MEMORY are the Global Commands
• Each MEMORY entry gives the segment name, start address, end address and memory width

Example LDF (cont'd): Link Commands

PROCESSOR p0
{
OUTPUT( $COMMAND_LINE_OUTPUT_FILE )
SECTIONS
{
sec_data_a { INPUT_SECTIONS( $OBJECTS(data_a) ) } > seg_data_a
sec_data_b SHT_NOBITS { INPUT_SECTIONS( $OBJECTS(data_b) ) } > seg_data_b
sec_data_scr { INPUT_SECTIONS( $OBJECTS(data_scr) ) } > seg_data_scr
sec_prog_L1 { INPUT_SECTIONS( $OBJECTS(prog_L1) ) } > seg_prog_L1
sec_prog_L2 { INPUT_SECTIONS( $OBJECTS(prog_L2) ) } > seg_prog_L2
}
}

• DXE SECTION NAMES (e.g. sec_data_a): used in .map file
• OBJECT SECTIONS (e.g. data_a): from assembly files
• MEMORY SEGMENTS (e.g. seg_data_a): declared in the LDF
• Keyword SHT_NOBITS: data in that SECTION will not be initialized

Expert Linker Features

Expert Linker is a graphical tool that can:

Use wizards to create LDF files

Define a DSP’s target memory map

Drag and Drop object sections into the memory map

Present watermarks for max Heap and Stack usage

Graphically Manage Overlay support

Import Legacy LDF files

Graphically highlights code elimination of unused objects

Launch the LDF Wizard

Using Expert Linker to create an LDF
• Launch the wizard by selecting: Tools -> Expert Linker -> Create LDF
• The wizard guides through the 3 step process to generating an LDF:
  − Step 1: Name the LDF filename and choose project programming language type (i.e. C, C++, Assembly)
  − Step 2: Select processor type and other properties
  − Step 3: A confirmation summary of the choices above is shown. Click on the finish button to generate LDF


Expert Linker


Project Development

All development in VisualDSP++ occurs within a project.

The project file (.DPJ) stores your program’s build information: source files list and development tools option settings


How to start a new project

- Create a new folder
- Copy the standard .ldf from \VisualDSP\Blackfin\ldf to the new folder
- Open VisualDSP++
- Choose Project -> New
- Switch to your newly created folder
- Enter your project name at: File Name
- The Project Options window will appear


Project Options

Choose your DSP

Set Type to executable file, which is expected by the debugger

Then press OK

A new window appears. If you don't want to use the VDK operating system, press NO.


VisualDSP++ Menu

Add source, header and .ldf files to your project.

Build the project

File-specific options: select the file, press the right mouse button, and choose File Options

Project Options


Debug Features

- Single Step
- Run
- Halt
- Set Breakpoints
- Profiling
- Plotting
- Pipeline Viewer
- Cache Viewer
- Memory Viewing
- Register Viewing


Using Statistical Profiling

- Statistical Profiling is accessed through Tools -> Statistical Profiling -> New Profile in an emulator session
- Enable the Statistical Profiler through the context menu
- Run and watch as the results are updated in real time; halting keeps the last snapshot on the screen


C/C++ Profiler

- The profiler is very useful in C/C++ mode because it makes it easy to benchmark a system from a module-by-module (i.e. C/C++ function) standpoint
- Assembly modules can be wrapped in C/C++ functions to take advantage of this
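
The function-by-function benchmarking idea can be sketched in portable C. This is only a host-side illustration, not the VisualDSP++ profiler: it assumes the standard clock() function from <time.h>, and the names sum_of_squares and time_function are hypothetical.

```c
#include <time.h>

/* Hypothetical workload standing in for a C-wrapped module. */
long sum_of_squares(int n)
{
    long acc = 0;
    for (int i = 1; i <= n; i++) {
        acc += (long)i * i;
    }
    return acc;
}

/* Times one call of fn(n) with the standard clock() function and
   stores the value returned by fn in *result.
   Returns the elapsed processor time in clock ticks. */
clock_t time_function(long (*fn)(int), int n, long *result)
{
    clock_t start = clock();
    *result = fn(n);
    return clock() - start;
}
```

Calling time_function(sum_of_squares, 1000, &r) gives one timing per wrapped function, which is the granularity the C/C++ profiler reports.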


Section 8: C compiler / Rules for optimal software


C/C++ Programming for DSPs

Why Program In C/C++?
- Maintainability
- Portability
- Learning Curve

What Are the Tradeoffs in C/C++ Programming for DSPs?
- Code Efficiency: Size and Cycle Count
- Slower Interrupt Handling

Develop Programs Using a Mix of C/C++ and Assembly Languages


Blackfin C/C++ Compiler

Compiler
- Invoked Via IDDE Using Settings from Compiler Property Page
- Invoked from a DOS Command Line (ccblkfn.exe)

Linker Description File (LDF)
- Defines Segments in Memory for Code and Data
- Defines Segment in Memory for the Run-Time Stack
- Defines Segment in Memory for Heap

Run-Time Header
- Several Separate Run-Time Headers To Choose From
- Linker Options Determine Which C Run-Time Libraries To Use: Size, File I/O, C++ Are All Selectable
- Provides Interrupt Handling
- Initializes C/C++ Run-Time Environment
- Must Be Linked With C/C++ Code

C/C++/Assembly Interface
- Prologue/Epilogue
- Example


Build Process

[Diagram: build flow]
Text Editor -> File.c -> C Preprocessor -> Compiler -> File.s
Text Editor -> File.asm
File.s / File.asm -> Assembly Preprocessor -> Assembler -> File.doj
File.doj + File.ldf + lib.dlb + crtx.doj -> Linker -> File.dxe

crtx.doj is short for the various user-selectable run-time headers supported by Blackfin


Compiler Property Pages

Compile/General Property Page
- Enable Debug and Optimization
- Stop After: Preprocessing, Compiling

Compile/Preprocessor Property Page
- Provide Processor Definitions
- Specifies Directory Paths for Include Files

Compile/Warning Property Page
- Turn On/Off Levels of Warning Messages


Compile / General Property Page


Supported Data Formats


Compiler-Generated Memory Section Names

The compiler generates default section names to be used by the linker:

program - contains all program instructions
data1 - contains all global and "static" data
constdata - contains all data declared as "const"
stack - storage for local data and return addresses (user mode)
sysstack - storage for local data and return addresses (supervisor mode)
heap - memory space to allocate from during run-time
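
As an illustration of the list above, the following plain C sketch annotates each object with the default section it would be expected to land in. The identifiers are hypothetical, and the mapping in the comments simply restates the list; locals may also be kept in registers when the optimizer is enabled.

```c
#include <stdlib.h>

int global_counter = 3;             /* global data        -> data1     */
static int call_count;              /* "static" data      -> data1     */
const int table[4] = {1, 2, 3, 4};  /* "const" data       -> constdata */

/* The instructions of this function -> program */
int sum_table_scaled(int scale)
{
    int local_sum = 0;                    /* local data -> stack (or sysstack) */
    int *scratch = malloc(sizeof table);  /* run-time allocation -> heap       */

    if (scratch == NULL) {
        return -1;
    }
    for (int i = 0; i < 4; i++) {
        scratch[i] = table[i] * scale;
        local_sum += scratch[i];
    }
    free(scratch);
    call_count++;
    return local_sum + global_counter;
}
```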


C Source with Alternate Sections

foo.C:

    section ("extern") int array[256];

    section ("foo") void bar(void)
    {
        int foovar;
        foovar = 1;
        foovar++;
    }

[Diagram: foo.C -> C-Compiler -> foo.S -> Assembler -> foo.DOJ]

foo.DOJ contains:

    Object Section = foo, Type = RAM, Width = 8
    _bar: p0 = _foovar; r0 = w[p0]; r0 = r0 + 1; w[p0] = r0;

    Object Section = extern, Type = RAM, Width = 8
    _array[0], _array[1], …, _array[255]

    Object Section = mem_stack, Type = RAM, Width = 8
    _foovar: 1

Note: The section( ) directive is used to place data or code into a section other than the default section used by the compiler.


C Run Time Headers

basiccrt.s Sets Up the C Runtime Environment
- Installs Default Event Handlers
- Initializes Both the User and Supervisor Stacks
- Allows User to Select Supervisor or User Mode
- Configures Cache, If Necessary
- Initializes Profiling Support, If Necessary
- Initializes argc/argv Support, If Necessary
- Initializes Global C++ Objects, Sets Up Destructor Call For Clean-Up
- Calls _main To Start The Actual Program
- Calls _exit When Program Terminates

basiccrt.s Combined With C Library Function EX_INTERRUPT_HANDLER Provides Interrupt Support
- Enables Global Interrupts
- Saves / Restores Context
- Services Interrupts


Interrupt Enabling

Interrupts Enabled at Runtime Using:

    EX_INTERRUPT_HANDLER(sig, isr_name);

Example:

    #include <signal.h>
    EX_INTERRUPT_HANDLER(ik_timer, MyTimerISR);

EX_INTERRUPT_HANDLER Does the Following
- Enables Global Interrupts
- Sets Appropriate Bit in IMASK Register
- Maps Specified Function (isr_name) as the Interrupt Service Routine (ISR)
- Prior to ISR Call, Interrupt Dispatcher SAVES Current Processor State
- When ISR Completes, Interrupt Dispatcher RESTORES Processor State

Context Switching Requires 32 Total Cycles To Save/Restore
- All Data Registers (R0, R1, R2, R3, R4, R5, R6, R7)
- All Pointer Registers (P0, P1, P2, P3, P4, P5)
- Frame Pointer (FP)
- Arithmetic Status Register (ASTAT)
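
The install-then-dispatch sequence can be tried on a host with the standard C signal facility. This is an analogy only, not the Blackfin mechanism: it assumes signal() and raise() from <signal.h>, uses SIGINT as a stand-in for a timer interrupt, and the names my_timer_isr and simulate_timer_interrupt are hypothetical.

```c
#include <signal.h>

/* Written by the handler; volatile sig_atomic_t is the type standard C
   allows a signal handler to modify safely. */
static volatile sig_atomic_t timer_events = 0;

/* Hypothetical ISR stand-in: keep it short and only record the event. */
static void my_timer_isr(int sig)
{
    (void)sig;
    timer_events = timer_events + 1;
}

/* Maps the handler to the signal, triggers it synchronously with raise(),
   then restores the default handler.
   Returns the number of events seen so far, or -1 if installation failed. */
int simulate_timer_interrupt(void)
{
    if (signal(SIGINT, my_timer_isr) == SIG_ERR) {
        return -1;
    }
    raise(SIGINT);
    signal(SIGINT, SIG_DFL);
    return (int)timer_events;
}
```

As on the target, the handler does minimal work and the main program observes the result afterwards through a volatile variable.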


Assembly Language Interface

C-Callable Assembly Language Functions

Assembly Language Statements Within a C Function (In-Line Assembly)

Associate C Variables with Assembly Language Symbols


C-Callable Assembly Language Functions

Several Issues Involved When Writing C-Callable Assembly Language Functions

Register Usage
- “Dedicated” Registers
- “Call Preserved” Registers
- “Scratch” Registers

Argument Passing
- First Three Arguments Passed in R0, R1 and R2, respectively
- Arguments Four and Beyond Passed on Stack: 4th Parameter Is Closest to SP at [FP+20], 5th at [FP+24], etc.

Return Values
- Return Values of 32 Bits or Less Stored in R0
- Overflows To R1 for Return Values of 33 to 64 Bits
- Anything Over 64 Bits Is Allocated on Stack but Passed as Pointer in a Hidden Argument in P0
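
The argument-passing rule can be written down as a small lookup. This is a sketch under one assumption: the "etc." above is read as one 32-bit word (4 bytes) per further argument. The type and function names are hypothetical.

```c
/* Where the n-th argument (1-based) of a C-callable function is found. */
typedef enum {
    ARG_IN_R0,
    ARG_IN_R1,
    ARG_IN_R2,
    ARG_ON_STACK,
    ARG_INVALID
} arg_location;

arg_location arg_where(int n)
{
    if (n < 1)  return ARG_INVALID;
    if (n == 1) return ARG_IN_R0;
    if (n == 2) return ARG_IN_R1;
    if (n == 3) return ARG_IN_R2;
    return ARG_ON_STACK;
}

/* Offset from FP, in bytes, of a stack-passed argument:
   4th at FP+20, 5th at FP+24, then 4 more bytes per argument (assumed).
   Returns -1 for arguments that are not passed on the stack. */
int arg_fp_offset(int n)
{
    if (n < 4) {
        return -1;
    }
    return 20 + 4 * (n - 4);
}
```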


C/C++ Compiler Register Uses: Dedicated Registers

Registers that the C/C++ Compiler Reserves for its Own Use

REGISTER    VALUE            MODIFICATION RULES
L0 – L3     0                Do Not Modify
SP          Stack Pointer    Stack Management Only, Restore
FP          Frame Pointer    Stack Management Only, Restore


C/C++ Compiler Register Uses: Call Preserved Registers

- May Be Used in an Assembly Function
- Contents Should Be Saved and Restored
- Values Assumed to Be Preserved Across Function Calls

P3, P4, P5

R4, R5, R6, R7


C/C++ Compiler Register Uses: Scratch Registers

- Contents DO NOT Need to Be Saved/Restored
- Use Freely in Assembly Sub-Routines

P0 – P2
R0 – R3
LC0 – LC1
CC
A0 – A1
I0 – I3
B0 – B3
M0 – M3


C-Callable Assembly Language Functions

Macros in asm_sprt.h Provided to Make Function Calling Easier
- Save/Restore Preserved Registers (PUTS, POPS)
- Restore Frame and Stack Pointers (EXIT)

PUTS = x;  Pushes the value in register x onto the stack
    PUTS = R5;  ->  [--SP] = R5;

POPS(x);  Reads the value off the top of the stack into register x
    POPS(R5);  ->  R5 = [SP++];

EXIT;  Restores stack and frame pointers and jumps to the return address
    EXIT;  ->  P0 = [FP + 0x4]; JUMP (P0);
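
The pre-decrement push and post-increment pop behind PUTS and POPS can be simulated in C. This is a host-side sketch of the addressing pattern only; the sim_stack type, its size and the function names are hypothetical.

```c
#include <stdint.h>

#define STACK_WORDS 16

/* Simulated descending stack: sp is the index of the last pushed word,
   or STACK_WORDS when the stack is empty. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    int sp;
} sim_stack;

void stack_init(sim_stack *s)
{
    s->sp = STACK_WORDS;
}

/* Mirrors "[--SP] = Rx;": decrement first, then store.
   Returns 0 on success, -1 if the stack is full. */
int stack_puts(sim_stack *s, uint32_t value)
{
    if (s->sp == 0) {
        return -1;
    }
    s->mem[--s->sp] = value;
    return 0;
}

/* Mirrors "Rx = [SP++];": load first, then increment.
   Returns 0 on success, -1 if the stack is empty. */
int stack_pops(sim_stack *s, uint32_t *value)
{
    if (s->sp == STACK_WORDS) {
        return -1;
    }
    *value = s->mem[s->sp++];
    return 0;
}
```

Registers saved with PUTS must be restored with POPS in the reverse order, which the last-in, first-out behaviour of the sketch makes visible.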


In-Line Assembly Language

In-Line Assembly Is Accomplished Using the asm( ) Construct

Example:

    asm("R0 = w[p0];");
    asm("BITSET(R0,7);");
    asm("ssync;");

Note: Can Produce Less Efficient Compiled Code; Optimizer Might Re-Sequence Instructions for Optimal Performance


Mixed C/Assembly Naming Conventions

To name an assembly symbol that corresponds to a C symbol, add an underscore prefix to the C symbol.

To use a C variable in an assembly routine, declare it as a global variable in the C program and as EXTERN in the assembly routine.

To use an assembly function or variable in your C program, declare the symbol with the .GLOBAL directive in the assembly routine and as EXTERN in the C program.


VOLATILE Extension to ANSI C

• VOLATILE Keyword Prevents Compiler from Relocating, Deleting, or Combining Assembly Instructions or Variable Declarations

    volatile int x;
    asm volatile ("ssync;");


Optimizing C Code

Optimization Can Decrease Code Size or Lead to Faster Execution

- Can Be Controlled by Optimization Switch
    no switch    optimization disabled
    -O           optimization enabled
    -ipa         inter-procedural optimization enabled

- Can Be Further Controlled In C Source Code Using Pragmas
    #pragma optimize_off          - Disables Optimizer
    #pragma optimize_for_space    - Decreases Code Size
    #pragma optimize_for_speed    - Increases Performance

Other Optimization Ideas
- Take Advantage of Existing Assembly Library Functions
- Write Time-Critical Routines in Assembly as a C-Callable Subroutine
- App Note, “Tuning C Source Code For The Blackfin DSP Compiler”: http://www.analog.com/library/applicationNotes/dsp/blackfin/EE-149.pdf


Example -- Add 5 Numbers in an Assembly Function

Example C Program That Calls an Assembly Function (add5), Which Adds 5 Integers Passed From the C Calling Routine As Arguments

C code:

    #include <stdlib.h> /* exit() */

    extern int add5(int, int, int, int, int); /* Function is located in assembly module */

    volatile int sum; /* Variable only used in assembly sub-routine */
                      /* volatile keeps sum from being optimized out */

    int main(void)
    {
        int a = 1; int b = 2; int c = 3; int d = 4; int e = 5; /* Initialize parameters */

        int result = 0; /* result and sum will have the same value */

        result = add5(a, b, c, d, e); /* Call to the ADD5 function */

        exit(0);
    }


Assembly Routine

    /* Assembly Routines with Parameters Example - _add5 */
    /* int add5 (int a, int b, int c, int d, int e); */
    /* This is an assembly language routine that will add 5 numbers */
    #include <asm_sprt.h> /* Header file that defines the stack manipulation macros */
    .section program;
    .global _add5;
    .extern _sum;

    _add5:
        r0 = r0 + r1;   /* Add the first and second parameter */
        r0 = r0 + r2;   /* Add the third parameter */
        r1 = [FP+20];   /* Put the fourth parameter in R1 */
        r0 = r0 + r1;   /* Add the fourth parameter */
        r1 = [FP+24];   /* Put the fifth parameter in R1 */
        r0 = r0 + r1;   /* R0 is always the return value, variable "result" from C will get r0 value */
        p0.h = _sum;    /* we can also write directly to a globally defined variable as well */
        p0.l = _sum;    /* could be used if this function was implemented with no return type */
        w[p0] = r0;     /* Place the sum in the global variable (C is unaware of this assignment) */
        exit;           /* Restores frame and stack pointers */
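
When checking an assembly routine like this one, a plain C reference with the same signature is a convenient comparison point on the host. The function below is such a sketch; the name add5_ref is hypothetical, and the comments map each step to the register or stack location used by the routine above.

```c
/* Host-side reference for _add5: same arguments, same result. */
int add5_ref(int a, int b, int c, int d, int e)
{
    int sum = a + b;  /* first and second arguments (R0, R1 on the target) */
    sum += c;         /* third argument (R2)                               */
    sum += d;         /* fourth argument (stack, [FP+20])                  */
    sum += e;         /* fifth argument (stack, [FP+24])                   */
    return sum;       /* return value (R0)                                 */
}
```

With the inputs of the example program (1, 2, 3, 4, 5), both versions should produce 15.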


C Header Files (MMR Access)

In previous compilers for other architectures, there was no straightforward way of accessing and changing MMR values

    #include <Cdef21161.h>
    #define pIM0A ((volatile unsigned int *) 0x61) /* Internal DMA0 memory access modifier */

    main ( )
    {
        *pIM0A = 0x010; // set up IM DMA #0
    }

This method, though functional, consumes memory for each MMR accessed this way, which is a waste of space


C Header Files (cdefbf53x.h)

    // include all Core registers and bit definitions
    #include <defbf532.h>

    // include core specific register pointer definitions
    #include <cdef_LPBlackfin.h>

    // Clock and System Control (0xFFC0 0400-0xFFC0 07FF)
    #define pPLL_CTL ((volatile unsigned short *)PLL_CTL)
    #define pPLL_STAT ((volatile unsigned short *)PLL_STAT)
    #define pPLL_LOCKCNT ((volatile unsigned short *)PLL_LOCKCNT)

    #include <cdefBF533.h>
    main ( )
    {
        *pPLL_CTL |= 0x0200; // Set BYPASS Bit
    }

The cdef header files use the compiler's pre-processor to set up MMR accesses without using memory for the pointers

If users follow the p-prefix notation, every MMR is set up for access based on its size (char = 8 bit, short = 16 bit, int = 32 bit)
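
The p-prefix pattern can be exercised on a host by pointing the macro at an ordinary variable instead of a hardware address. This is a simulation only: fake_pll_ctl, pFAKE_PLL_CTL, FAKE_CTL_BIT and set_fake_ctl_bit are hypothetical names, and the bit value is simply the constant used in the example above.

```c
#include <stdint.h>

/* Stand-in for a 16-bit memory-mapped register. On the target, the
   address would come from the def header rather than from a variable. */
static uint16_t fake_pll_ctl = 0;

/* Pointer macro in the style of the cdef headers: the pre-processor
   expands it in place, so no pointer variable is stored in memory. */
#define pFAKE_PLL_CTL ((volatile uint16_t *)&fake_pll_ctl)

#define FAKE_CTL_BIT 0x0200u

/* Read-modify-write through the macro; returns the new register value. */
uint16_t set_fake_ctl_bit(void)
{
    *pFAKE_PLL_CTL |= FAKE_CTL_BIT;
    return *pFAKE_PLL_CTL;
}
```

The volatile qualifier in the macro makes the compiler perform every read and write, which is what a real MMR access requires.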