Digital Predistortion
© 2002 ®

Agenda
• Introduction
• Algorithm
  • Standard lookup table method
  • Phase related errors
  • Memory effect
• Implementation
  • Multipliers
  • Memory
  • CORDIC
  • Processors

Introduction: Purpose
• Technology demonstrator
• Show DPD can be done efficiently on PLD
• Provide starting point for design
• Show efficient implementation of key components
• Provide FPGA benchmark for customer design
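
Introduction: Predistortion
[Figure: transfer curves (V_RF versus V_in, V_PA versus V_d, V_d versus V_in) with block diagrams of the ideal PA, the real PA and the predistorter]
• Ideal PA: V_RF = k * V_in
• Real PA: V_RF = f_nl * k * V_d
• Predistorter (DPD): V_d = (1 / f_nl) * V_in
• Overall linear response: V_RF = k * V_in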
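
The cascade on this slide can be checked numerically. The following is a minimal sketch, not the reference design: the gain-compression factor `f_nl`, the gain `K` and the fixed-point solver in `predistort` are all assumptions chosen for illustration. Because `f_nl` depends on the drive level V_d, the predistorter has to solve V_d = V_in / f_nl(V_d), which is done here by simple iteration.

```python
K = 2.0  # assumed linear target gain k


def f_nl(v):
    """Toy gain-compression factor (an assumption, not a measured PA)."""
    return 1.0 / (1.0 + 0.3 * v * v)


def real_pa(vd):
    """Real PA from the slide: V_RF = f_nl * k * V_d."""
    return f_nl(vd) * K * vd


def predistort(vin, iterations=60):
    """Solve V_d = V_in / f_nl(V_d) by fixed-point iteration.

    Converges for the toy f_nl above as long as V_in stays below the
    PA saturation level (roughly 0.9 here).
    """
    vd = vin
    for _ in range(iterations):
        vd = vin / f_nl(vd)
    return vd


def cascade(vin):
    """Predistorter followed by the real PA."""
    return real_pa(predistort(vin))
```

For inputs below saturation, `cascade(vin)` should track `K * vin`, while `real_pa(vin)` alone falls short of it as the drive level rises.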

Algorithms: Overview
• Adaptive Lookup Table (LUT)
  • Lookup table for phase and magnitude correction values
  • Deals with magnitude dependent errors
• Volterra modelling of PA
  • Direct implementation
  • Indirect LUT implementation

Algorithm: Distance Gradient Method
Assumption: Error only depends on magnitude
[Block diagram. Forward path: I,Q in, conversion to polar (r = (I^2 + Q^2)^(1/2), φ = arctan(I/Q)), LUT (∆r & ∆φ) addressed by r, conversion back to Cartesian (I = r*sin(φ), Q = r*cos(φ)), I & Q mod, PA, I,Q out. Feedback path: I & Q demod, conversion to polar, and subtraction (·(-1) and add) from the delayed input r and φ to update the LUT.]

Algorithm: Distance Gradient Method
• MATLAB simulation results using the Saleh PA model
• EVM improved in the region of 90%
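
The adaptive lookup table loop from the distance gradient slides can be sketched as follows. This is an illustrative simulation under stated assumptions, not the MATLAB model behind the quoted results: the Saleh coefficients are the commonly quoted TWT values, the magnitude correction is stored as a multiplicative gain, the error is normalised by the input magnitude, and `MU`, `R_MAX` and all function names are hypothetical. It uses Python's standard polar convention rather than the φ = arctan(I/Q) convention of the block diagram, and it omits the delay blocks because the simulated loop has no latency.

```python
import cmath
import random

N_ADDR = 64   # LUT depth, as in the reference design estimate
R_MAX = 0.8   # assumed full-scale input magnitude
MU = 0.2      # assumed adaptation step size

gain_lut = [1.0] * N_ADDR    # magnitude correction (multiplicative, assumed)
phase_lut = [0.0] * N_ADDR   # phase correction in radians


def saleh_pa(v):
    """Saleh PA model with the commonly quoted TWT coefficients."""
    r, phi = cmath.polar(v)
    amp = 2.1587 * r / (1.0 + 1.1517 * r * r)        # AM/AM
    dphi = 4.0033 * r * r / (1.0 + 9.1040 * r * r)   # AM/PM
    return cmath.rect(amp, phi + dphi)


def address(v):
    """LUT address depends on the input magnitude only."""
    return min(int(abs(v) / R_MAX * N_ADDR), N_ADDR - 1)


def predistort(v):
    a = address(v)
    return v * cmath.rect(gain_lut[a], phase_lut[a])


def adapt(v_in, v_out):
    """Move the addressed LUT entry along the magnitude and phase error."""
    r_in = abs(v_in)
    if r_in < 1e-9:
        return
    a = address(v_in)
    gain_lut[a] += MU * (r_in - abs(v_out)) / r_in
    # phase(v_in * conj(v_out)) is the wrapped difference phi_in - phi_out
    phase_lut[a] += MU * cmath.phase(v_in * v_out.conjugate())


def run(n, rng, train=True):
    """Drive n random samples through DPD + PA; return RMS EVM (unit target gain)."""
    err = ref = 0.0
    for _ in range(n):
        v_in = cmath.rect(rng.uniform(0.0, R_MAX),
                          rng.uniform(-cmath.pi, cmath.pi))
        v_out = saleh_pa(predistort(v_in))
        if train:
            adapt(v_in, v_out)
        err += abs(v_out - v_in) ** 2
        ref += abs(v_in) ** 2
    return (err / ref) ** 0.5
```

Calling `run(2000, rng, train=False)` before and after a training pass such as `run(20000, rng)` gives a before/after EVM comparison for this toy setup; the numbers it produces are not the results reported on the slide.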

Dist – Gradient LUT Freq Plots
• Predistortion improves ACLR by 70 dB (considering 700-900 and 1100-1300 as side-band regions for measurements)
• Simplified simulation environment (using the Saleh PA model)

Algorithm: Phase related error
[Plots: LUT content, magnitude correction (axis 0.9 to 1.15) and phase correction (axis -0.8 to 0) versus addr(magn) (0 to 70). Panels: without phase error compensation, with phase error compensation.]
• Adds dimension to lookup table
• Increases memory, logic same size as before
• Increases time to converge
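
To illustrate why the extra dimension increases memory while the logic stays the same, here is a small sketch of two-dimensional LUT addressing. The number of phase bins `N_PHASE` is an assumption (the slides do not give one); the 64-deep, 32-bit-wide table comes from the resource estimate slide. The function name `address_2d` is hypothetical.

```python
import math

N_MAG = 64       # magnitude bins (LUT depth from the resource estimate)
N_PHASE = 16     # phase bins: an assumption, not given in the slides
WORD_BITS = 32   # LUT word width from the resource estimate


def address_2d(r, phi, r_max=1.0):
    """Combine a magnitude index and a phase index into one flat address."""
    mag_idx = min(int(r / r_max * N_MAG), N_MAG - 1)
    turn = (phi % (2.0 * math.pi)) / (2.0 * math.pi)   # phase as a fraction of a turn
    phase_idx = int(turn * N_PHASE) % N_PHASE
    return mag_idx * N_PHASE + phase_idx


bits_1d = N_MAG * WORD_BITS             # magnitude-only table: 2048 bits
bits_2d = N_MAG * N_PHASE * WORD_BITS   # magnitude x phase table: 32768 bits
```

The address arithmetic stays a handful of operations, but the table grows by a factor of `N_PHASE`, and each entry is visited less often during adaptation, which is consistent with the longer convergence time noted on the slide.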

Algorithm: Memory effect
• Models effect of short term temperature increase on silicon
• Three possible approaches
  • Add “delta look-up table” (Intersil solution)
  • Add new dimension to LUT (fully adaptive)
  • Use FIR in magnitude address calculation

Algorithm: Memory effect
• Error mechanism:
  • Error depends on temperature
  • Temperature depends on previous magnitudes
• New PA model: Altera model
  • Error depends on current and previous magnitudes
  • Error independent of input phase
• Solution
  • FIR filter in address calculation
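
A minimal sketch of the FIR-in-address idea: the LUT address is derived from a weighted sum of the current and previous magnitudes instead of the current magnitude alone. The tap values, the class name `FirAddress` and the full-scale value `R_MAX` are assumptions for illustration; the slides do not specify the filter.

```python
from collections import deque

N_ADDR = 64                # LUT depth from the resource estimate
R_MAX = 1.0                # assumed full-scale magnitude
TAPS = (0.5, 0.25, 0.25)   # assumed taps: current sample first, then older ones


class FirAddress:
    """LUT address computed from an FIR-filtered magnitude history."""

    def __init__(self, taps=TAPS):
        self.taps = tuple(taps)
        # History starts at zero, newest sample at index 0
        self.history = deque([0.0] * len(self.taps), maxlen=len(self.taps))

    def update(self, magnitude):
        self.history.appendleft(magnitude)   # oldest sample drops off the right
        filtered = sum(t * m for t, m in zip(self.taps, self.history))
        index = int(filtered / R_MAX * N_ADDR)
        return max(0, min(index, N_ADDR - 1))
```

With these taps, a step from zero to full scale moves the address through 32 and 48 before settling at 63, so the correction applied depends on recent signal history as well as the current magnitude.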

Predistortion Reference Design
[Block diagram. Forward path: I,Q in, conversion to polar (r = (I^2 + Q^2)^(1/2), φ = arctan(I/Q)), FIR in the address path, LUT (∆I & ∆Q) with ∆I = ∆r*sin(∆φ) and ∆Q = ∆r*cos(∆φ), I & Q mod, PA, I,Q out. Feedback path: I & Q demod, conversion to polar, subtraction (·(-1) and add) from the delayed input. Labels: Processor + hardware acceleration, DSP Blocks, Sync NCO, FIR.]

DSP Block Architecture & Resources
[Figure: DSP block with Input Registers, Optional Pipelining, add/subtract stages, Output MUX and Output Registers]
• High Performance DSP Operation
  • 18x18 Functions at 282 MHz
  • Input, Output & Pipelining registers
• Reduce overall Logic usage
  • Add/Accumulate/Subtract
  • Signed & unsigned operations
  • Dynamically change between Add & Subtract
• Support complex multiplications
  • (Ar + jAi) x (Br + jBi) = (Ar Br – Ai Bi) + j(Ai Br + Ar Bi)
  • 4 Multiplications, 1 Addition & 1 Subtraction
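
The operation count for a complex multiplication can be made explicit with a short sketch. The function name `complex_multiply` is hypothetical; the arithmetic follows the identity on the slide directly.

```python
def complex_multiply(ar, ai, br, bi):
    """(Ar + jAi) x (Br + jBi) using 4 multiplications, 1 subtraction, 1 addition."""
    p0 = ar * br   # multiplication 1
    p1 = ai * bi   # multiplication 2
    p2 = ai * br   # multiplication 3
    p3 = ar * bi   # multiplication 4
    real = p0 - p1   # the one subtraction
    imag = p2 + p3   # the one addition
    return real, imag
```

For example, `complex_multiply(3, 4, 1, -2)` returns `(11, -2)`, matching `(3 + 4j) * (1 - 2j)`.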

Predistortion Reference Design
[Block diagram repeated, with labels: Processor + hardware acceleration, DSP Blocks, RAM, Sync NCO, FIR]

TriMatrix™ Memory
• Today’s applications need more high performance memory
• One size does not fit all
• Wide choice of modes and widths
• On-chip blocks: M512 Blocks, M4K Blocks, M-RAM
• External Memory Devices: DDR SDRAM & SRAM, SDR SDRAM, QDR & QDRII SRAM, ZBT SRAM, DDR FCRAM
• Block features:
  • True Dual Port RAM, Embedded Shift Register Mode, 512K bits, operates up to 300 MHz
  • True Dual Port RAM, Embedded Shift Register Mode, operates up to 312 MHz
  • Rate Changing, Embedded Shift Register Mode, operates up to 312 MHz
• More Bits For Larger Memory Buffering
• More Data Ports for Greater Memory Bandwidth

Predistortion Reference Design
[Block diagram repeated, with labels: Processor + hardware acceleration, DSP Blocks, RAM, CORDIC, Sync NCO, FIR]

CORDIC
• Hardware efficient algorithm for computing functions such as:
  • Trigonometric
  • Hyperbolic
  • Logarithmic
• Iterative solution that uses only shifts and adding/subtracting
• High performance as no multiplications and divisions
• Simple/less hardware required

Altera CORDIC solution for DPD
[Block diagram: CORDIC block with inputs X_in, Y_in, Z_in, mode and outputs X_out, Y_out, Z_out]
• Cartesian to Polar conversion
  • X_in, Y_in = Cartesian values, Z_in = 0, mode = 0
  • X_out = magnitude, Z_out = phase
• Polar to Cartesian conversion
  • X_in = magnitude, Z_in = phase, Y_in = 0, mode = 1
  • X_out, Y_out = Cartesian values
• Mode selects conversion direction
• Pipelined, so new inputs can be applied on every clk cycle
• After initial latency, valid outputs appear on every clk cycle
• Timesharing: the mode of the CORDIC can be changed on each clk cycle
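
The two modes can be sketched in software as a single shift-and-add loop whose rotation direction is chosen by `mode`. This is a behavioural sketch, not the Altera core: the fixed-point format (`FRAC_BITS`), the iteration count, the helper names and the floating-point gain compensation at the end are assumptions, and the phase follows the standard atan2(Y, X) convention. It only handles vectors with a positive X component.

```python
import math

FRAC_BITS = 16   # assumed fixed-point fraction width
N_ITER = 16      # assumed number of iterations
ONE = 1 << FRAC_BITS

# arctan(2^-i) in fixed-point radians, one entry per iteration
ATAN_TABLE = [round(math.atan(2.0 ** -i) * ONE) for i in range(N_ITER)]
# Accumulated CORDIC gain, the product of sqrt(1 + 2^-2i)
GAIN = math.prod(math.sqrt(1.0 + 2.0 ** (-2 * i)) for i in range(N_ITER))


def cordic(x_in, y_in, z_in, mode):
    """mode 0: drive y to zero (Cartesian to polar).
    mode 1: drive z to zero (polar to Cartesian).
    All values are integers in fixed point; outputs are scaled by GAIN."""
    x, y, z = x_in, y_in, z_in
    for i in range(N_ITER):
        if mode == 0:
            d = 1 if y < 0 else -1
        else:
            d = 1 if z >= 0 else -1
        # Only shifts, additions and subtractions inside the loop
        x, y, z = x - d * (y >> i), y + d * (x >> i), z - d * ATAN_TABLE[i]
    return x, y, z


def to_polar(x, y):
    """Cartesian to polar: X_in, Y_in = Cartesian values, Z_in = 0, mode = 0."""
    x_out, _, z_out = cordic(round(x * ONE), round(y * ONE), 0, 0)
    return x_out / ONE / GAIN, z_out / ONE


def to_cartesian(r, phi):
    """Polar to Cartesian: X_in = magnitude, Z_in = phase, Y_in = 0, mode = 1."""
    x_out, y_out, _ = cordic(round(r * ONE), 0, round(phi * ONE), 1)
    return x_out / ONE / GAIN, y_out / ONE / GAIN
```

In hardware each loop iteration would typically be its own pipeline stage, which is what lets a new input, possibly with a different `mode`, enter on every clock cycle.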

CORDIC Architecture
[Block diagram: Quadrant detect & IP modify, then Add/Sub & Shift stages with registers (Iteration 1 to Iteration n), then Quadrant Adjust]
• Parallel architecture enabling high performance
• CORDIC algorithm can only deal with vector rotations of -90 to +90 degrees
• Requires additional logic (Quadrant blocks) to deal with vectors in any of the four quadrants
• Parameterisable code
  • Input vector widths and number of iterations can be changed
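
The role of the quadrant blocks can be sketched as a wrapper around a core that only accepts vectors in the right half plane. In this sketch the core is a floating-point stand-in (`core_right_half_plane`, a hypothetical name) rather than a real CORDIC pipeline, and the 180-degree pre-rotation is one possible way to implement the detect and adjust steps; the slides do not say which scheme the reference design uses.

```python
import math


def core_right_half_plane(x, y):
    """Stand-in for the CORDIC core: only accepts vectors with x >= 0."""
    assert x >= 0, "core only handles angles from -90 to +90 degrees"
    return math.hypot(x, y), math.atan2(y, x)


def to_polar_any_quadrant(x, y):
    """Cartesian to polar for vectors in any of the four quadrants."""
    flip = x < 0                  # quadrant detect
    if flip:
        x, y = -x, -y             # input modify: rotate the vector by 180 degrees
    r, phi = core_right_half_plane(x, y)
    if flip:                      # quadrant adjust: undo the 180 degree rotation
        phi += math.pi if phi <= 0 else -math.pi
    return r, phi
```

The magnitude is unaffected by the pre-rotation, so only the phase output needs correcting after the iterations.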

CORDIC Implementation
• LEs in Altera PLDs
  • Each LE is suited for implementing the required adders/subtractors.
  • LEs can dynamically change from operating as an adder to subtractor
  • Each LE contains a register
• Performance

Vector width   Iterations   Fmax      LEs required (abs no.)   LEs required (% of total in 1S10)
16             16           219 MHz   1300                     12%
32             32           189 MHz   4600                     43%

Predistortion Reference Design
[Block diagram repeated, with labels: Processor + hardware acceleration, DSP Blocks, RAM, CORDIC, Sync NCO, FIR]

Implementation: Processor?
Should we use processor?
• For
  • Flexibility
  • Easy to add custom interpolation or similar
  • Low data rate in feedback path at base band
• Against
  • Straightforward data path (few “IF” branches)
  • Too slow at IF
  • No clear size advantage
  • Difficult to exploit deeply pipelined CORDIC

Excalibur™ Embedded Processor Cores
[Chart: Performance (Dhrystone MIPS 2.1), axis ticks 0, 20, 50, 100, 200, comparing Soft Core and Hard Core]
• Soft Core Advantages
  • Flexibility
  • Low Cost
  • Portable Design
  • Scalability
  • Obsolescence Proof
  • Fits Broad Range of Altera PLD Families
• Hard Core Advantages
  • High Performance 922TDMI
  • Time-to-Market
  • Lots of On-Chip Memory
  • Leverage Large Existing Code Base

Target devices
• Stratix
  • Contains DSP Blocks
  • TriMatrix RAM allows for large lookup tables (multiple dimensions)
  • Suitable if up/down converters are also integrated
• Cyclone
  • Extensive use of CORDIC
  • Lowest cost

Ref Design Resource Utilisation Estimates
• 5000 LEs (50% of avail in 1S10)
• 4 DSP blocks (67% of avail in 1S10)
• 3 M4K RAM blocks (5% of avail in 1S10)
• 2 M512 RAM blocks (2% of avail in 1S10)
Assumes 18 bit wide I/Q, 64 deep x 32 bit wide LUT.
The ref design only contains the adaptive lookup table algorithm.

Summary
• Reference design based on lookup tables on Stratix 1S10
• Works for memoryless PA
• Compensates for memory effect
• Assumption: Errors independent of phase
• Can be tuned and modified
• Open Source – extract key components, leave the rest