


A Primer on Turbo Code Concepts

Bernard Sklar, Communications Engineering Services

ABSTRACT

The goal of this article is to describe the main ideas behind the new class of codes called turbo codes, whose performance in terms of bit error probability has been shown to be very close to the Shannon limit. In this article, the mathematical measures of a posteriori probability and likelihood are reviewed, and the benefits of turbo codes are explained in this context. Since the algorithms needed to implement the decoders have been well documented by others, they are only referenced here, not described in detail. A numerical example, using a simple concatenated coding scheme, provides a vehicle for illustrating how error performance can be improved when soft outputs from the decoders are used in an iterative decoding process.

Concatenated coding schemes were first proposed by Forney [1] for achieving large coding gains by combining two or more relatively simple component or building-block codes. The resulting codes had the error correction capability of much longer codes, and were endowed with a structure that permitted relatively easy to moderately complex decoding. A serial concatenation of codes is most often used for power-limited systems such as deep space probes. The most popular of these schemes consists of a Reed-Solomon outer code (applied first, removed last) followed by a convolutional inner code (applied last, removed first) [2]. A turbo code can be thought of as a refinement of the concatenated encoding structure plus an iterative algorithm for decoding the associated code sequence.

Turbo codes were introduced by Berrou, Glavieux, and Thitimajshima [3, 4]. For a bit error probability of 10^-5 and code rate 1/2, the authors report a required Eb/N0 of 0.7 dB. The codes are constructed by applying two or more component codes to different interleaved versions of the same information sequence. For any single traditional code, the final step at the decoder yields hard-decision decoded bits (or, more generally, decoded symbols). In order for a concatenated scheme such as a turbo code to work properly, the decoding algorithm should not limit itself to passing hard decisions among the decoders. To best exploit the information learned from each decoder, the decoding algorithm must effect an exchange of soft rather than hard decisions. For a system with two component codes, the concept behind turbo decoding is to pass soft decisions from the output of one decoder to the input of the other, and to iterate this process several times to produce better decisions.

A REVIEW OF LIKELIHOODS

The mathematical foundation of hypothesis testing rests on Bayes' theorem, which is derived from the relationship between the conditional and joint probabilities of events A and B, as follows:

P(A|B) P(B) = P(B|A) P(A) = P(A, B)    (1)

A statement of the theorem yields the a posteriori probability (APP), denoted P(A|B):

P(A|B) = P(B|A) P(A) / P(B)    (2)

which allows us to infer the APP of an event A conditioned on B from the conditional probability, P(B|A), and the a priori probabilities, P(A) and P(B).

For communications engineering applications having additive white Gaussian noise (AWGN) in the channel, the most useful form of Bayes' theorem expresses the APP in terms of a continuous-valued random variable x in the following form:

P(d = i | x) = p(x | d = i) P(d = i) / p(x),    i = 1, ..., M    (3)

and

p(x) = Σ (i = 1 to M) p(x | d = i) P(d = i)    (4)

where d = i represents data d belonging to the ith signal class from a set of M classes, and p(x | d = i) represents the probability density function (pdf) of a received continuous-valued data-plus-noise signal, x, conditioned on the signal class d = i. p(x) is the pdf of the received signal x over the entire space of signal classes. In Eq. 3, for a particular received signal, p(x) is a scaling factor, since it has the same value for each class. Lowercase p is used to designate the pdf of a continuous-valued signal, and uppercase P is used to designate probability (a priori and APP). Equation 3 can be thought of as the result of an experiment involving a received signal and some statistical knowledge of the signal classes to which the signal may belong. The probability of occurrence of the ith signal class, P(d = i), before the experiment, is the a priori probability. As a result of examining a particular received signal, we can compute the APP, P(d = i | x), which can be thought of as a "refinement" of our prior knowledge.

THE TWO-SIGNAL CLASS CASE

Let the binary logical elements 1 and 0 be represented electronically by voltages +1 and –1, respectively. This pair of transmitted voltages is assigned to the variable d, which may now take on the values d = +1 and d = –1. Let




the binary 0 (or the voltage value –1) be the null element under addition. For an AWGN channel, Fig. 1 shows the conditional pdfs, referred to as likelihood functions. The rightmost function, p(x | d = +1), shows the pdf of the random variable x given the condition that d = +1 was transmitted. The leftmost function, p(x | d = –1), illustrates a similar pdf given that d = –1 was transmitted. The abscissa represents the full range of possible values of x received at the detector. In Fig. 1 is shown one such arbitrary predetection value xk. A line subtended from xk intercepts the two likelihood functions, yielding two likelihood values λ1 and λ2. A popular hard decision rule, known as maximum likelihood, is to choose the symbol d = +1 or d = –1 associated with the larger of the two intercept values, λ1 or λ2. This is tantamount to deciding dk = +1 if xk falls on the right side of the decision line labeled γ0, otherwise deciding dk = –1.

Figure 1. Likelihood functions p(x | d = +1) and p(x | d = –1), showing the likelihood values λ1 and λ2 at an arbitrary predetection value xk, and the decision threshold γ0.

A similar decision rule, known as maximum a posteriori (MAP), or the minimum-error rule, takes into account the a priori probabilities, as seen below in Eq. 6. The MAP rule is expressed in terms of APPs as follows:

P(d = +1 | x) ≷ P(d = –1 | x)    (5)

where > selects hypothesis H1 and < selects hypothesis H2.

Equation 5 states that one should choose hypothesis H1 (d = +1) if the APP P(d = +1 | x) is greater than the APP P(d = –1 | x); otherwise, choose hypothesis H2 (d = –1). Using the Bayes' theorem of Eq. 3, the APPs in Eq. 5 can be replaced by their equivalent expressions, yielding:

p(x | d = +1) P(d = +1) ≷ p(x | d = –1) P(d = –1)    (6)

Equation 6 is generally expressed in terms of a ratio, called the likelihood ratio test, as follows:

p(x | d = +1) / p(x | d = –1) ≷ P(d = –1) / P(d = +1)

or

[p(x | d = +1) P(d = +1)] / [p(x | d = –1) P(d = –1)] ≷ 1    (7)

LOG-LIKELIHOOD RATIO

By taking the logarithm of the likelihood ratio in Eq. 7, we obtain a useful metric called the log-likelihood ratio (LLR). It is the real number representing a soft decision out of a detector, designated L(d|x), as follows:

L(d|x) = log [P(d = +1 | x) / P(d = –1 | x)] = log { [p(x | d = +1) P(d = +1)] / [p(x | d = –1) P(d = –1)] }    (8)

L(d|x) = log [p(x | d = +1) / p(x | d = –1)] + log [P(d = +1) / P(d = –1)]    (9)

L(d|x) = L(x|d) + L(d)    (10)

where L(x|d) is the LLR of the channel measurement of x under the alternate conditions that d = +1 or d = –1 may have been transmitted, and L(d) is the a priori LLR of the data bit d.

To simplify the notation, we represent Eq. 10 as follows:

L′(d̂) = Lc(x) + L(d)    (11)

where the notation Lc(x) emphasizes that this LLR term is the result of a channel measurement made at the detector. Equations 3 through 11 were developed with only a data detector in mind. Next, the introduction of a decoder will typically yield decision-making benefits. For a systematic code, it can be shown [3] that the LLR (soft output) L(d̂) out of the decoder is equal to

L(d̂) = L′(d̂) + Le(d̂)    (12)

where L′(d̂) is the LLR of a data bit out of the detector (input to the decoder), and Le(d̂), called the extrinsic LLR, represents extra knowledge that is gleaned from the decoding process. The output sequence of a systematic decoder is made up of values representing data and parity. Equation 12 partitions the decoder LLR into the data portion, represented by the detector measurement, and the extrinsic portion, represented by the decoder contribution due to parity. From Eqs. 11 and 12, we write

L(d̂) = Lc(x) + L(d) + Le(d̂)    (13)

The soft decision L(d̂) is a real number that provides a hard decision as well as the reliability of that decision. The sign of L(d̂) denotes the hard decision; that is, for positive values of L(d̂) decide +1, and for negative values decide –1. The magnitude of L(d̂) denotes the reliability of that decision.
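As a concrete illustration of Eqs. 9 through 13, the short Python sketch below forms a soft decision from the channel, a priori, and extrinsic terms, and then splits it into a hard decision and a reliability. It is only a sketch of the relationships in the text; the function names are ours, and the channel LLR uses the 2x/σ² result derived later in Eq. 20.

    def channel_llr(x, sigma=1.0):
        # Lc(x): channel LLR of a bipolar (+1/-1) signal in AWGN,
        # which Eq. 20 reduces to 2x/sigma^2.
        return 2.0 * x / sigma ** 2

    def soft_decision(x, prior_llr=0.0, extrinsic_llr=0.0, sigma=1.0):
        # Eq. 13: L(d^) = Lc(x) + L(d) + Le(d^).
        L = channel_llr(x, sigma) + prior_llr + extrinsic_llr
        hard = +1 if L > 0 else -1   # the sign of L(d^) is the hard decision
        return L, hard, abs(L)       # the magnitude is its reliability

    # Example: x = 0.75 with no prior or extrinsic knowledge
    # gives L = 1.5, hard decision +1, reliability 1.5.
    print(soft_decision(0.75))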

PRINCIPLES OF ITERATIVE (TURBO) DECODING

For the first decoding iteration of the soft input/soft output decoder in Fig. 2, one generally assumes the binary data to be equally likely,




yielding an initial a priori LLR value of L(d) = 0 for the third term in Eq. 9. The channel predetection LLR value, Lc(x), is measured by forming the logarithm of the ratio of λ1 and λ2, seen in Fig. 1, and appearing as the second term of Eq. 9. The output L(d̂) of the Fig. 2 decoder is made up of the LLR from the detector, L′(d̂), and the extrinsic LLR output, Le(d̂), representing knowledge gleaned from the decoding process. As illustrated in Fig. 2, for iterative decoding the extrinsic likelihood is fed back to the decoder input, to serve as a refinement of the a priori value for the next iteration.

Figure 2. Soft input/soft output decoder (for a systematic code). Channel values Lc(x) and a priori values L(d) enter; the detector a posteriori LLR is L′(d̂) = Lc(x) + L(d); the output LLR is L(d̂) = L′(d̂) + Le(d̂), and the extrinsic values Le(d̂) are fed back for the next iteration.

Consider the two-dimensional code (product code) depicted in Fig. 3. The configuration can be described as a data array made up of k1 rows and k2 columns. Each of the k1 rows contains a code vector made up of k2 data bits and n2 – k2 parity bits. Similarly, each of the k2 columns contains a code vector made up of k1 data bits and n1 – k1 parity bits. The various portions of the structure are labeled d for data, ph for horizontal parity (along the rows), and pv for vertical parity (along the columns). Additionally, there are blocks labeled Leh and Lev which house the extrinsic LLR values learned from the horizontal and vertical decoding steps, respectively. Notice that this product code is a simple example of a concatenated code: its structure encompasses two separate encoding steps, horizontal and vertical.

Figure 3. Two-dimensional product code: a k1-row by k2-column data array d, with n2 – k2 columns of horizontal parity ph and extrinsic values Leh, and n1 – k1 rows of vertical parity pv and extrinsic values Lev.

The iterative decoding algorithm for this product code proceeds as follows (a short program sketch of the loop appears after the list):

1. Set the a priori information L(d) = 0.
2. Decode horizontally, and using Eq. 13 obtain the horizontal extrinsic information: Leh(d̂) = L(d̂) – Lc(x) – L(d).
3. Set L(d) = Leh(d̂).
4. Decode vertically, and using Eq. 13 obtain the vertical extrinsic information: Lev(d̂) = L(d̂) – Lc(x) – L(d).
5. Set L(d) = Lev(d̂).
6. If there have been enough iterations to yield a reliable decision, go to step 7; otherwise, go to step 2.
7. The soft output is

L(d̂) = Lc(x) + Leh(d̂) + Lev(d̂)    (14)
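The loop below is a minimal Python sketch of steps 1 through 7. The helper extrinsic() stands for the per-axis soft decoding that produces Leh or Lev for every data bit; it is a placeholder of ours (its internals use the log-likelihood algebra introduced later), not a routine from the original text.

    def iterative_product_decode(Lc, extrinsic, enough_iterations=2):
        # Lc: channel LLRs of the data bits.
        # extrinsic(Lc, L, axis): returns the extrinsic LLR of every
        # data bit for one horizontal ('h') or vertical ('v') decoding.
        L = [0.0] * len(Lc)                    # step 1: L(d) = 0
        for _ in range(enough_iterations):
            Leh = extrinsic(Lc, L, axis='h')   # step 2
            L = Leh                            # step 3
            Lev = extrinsic(Lc, L, axis='v')   # step 4
            L = Lev                            # step 5
        # step 7 (Eq. 14): soft output
        return [lc + eh + ev for lc, eh, ev in zip(Lc, Leh, Lev)]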

TWO-DIMENSIONAL SINGLE-PARITY CODE EXAMPLE

At the encoder, let the data bits and parity bits take on the values shown in Fig. 4, where the relationships between the data and parity, expressed as binary digits (0, 1), are as follows:

di ⊕ dj = pij    (15)

di = dj ⊕ pij    (16)

dj = di ⊕ pij    (17)

where ⊕ denotes modulo-2 addition. As shown in Fig. 4, the transmitted symbols are represented by the sequence d1 d2 d3 d4 p12 p34 p13 p24. At the receiver input, the received symbols are represented by the sequence {xi, xij}, where xi = di + n for each received data signal, xij = pij + n for each received parity signal, and n represents independent and identically distributed noise. For notational simplicity, we shall denote the received sequence with a single index, as {xk}, where k can be treated as a time index. Using the relationships developed in Eqs. 9 through 11, and assuming Gaussian noise, we can write the LLR for the channel measurement of a received signal xk as follows:

Lc(xk) = log_e [ p(xk | dk = +1) / p(xk | dk = –1) ]    (18)

= log_e { [1/(σ√2π)] exp[–(1/2)((xk – 1)/σ)²] / [1/(σ√2π)] exp[–(1/2)((xk + 1)/σ)²] }    (19)

= –(1/2)((xk – 1)/σ)² + (1/2)((xk + 1)/σ)² = (2/σ²) xk    (20)

where the natural logarithm is used. If we further make the simplifying assumption that the noise variance σ² is unity, then

Lc(xk) = 2xk    (21)

Consider the following example, where the data sequence d1 d2 d3 d4 is made up of the binary digits 1 0 0 1, as shown in Fig. 4. By the use of Eq. 15, it is seen that the parity sequence p12 p34 p13 p24 must be made up of the digits 1 1 1 1. Thus, the transmitted sequence is

{di, pij} = 1 0 0 1 1 1 1 1

which is shown in Fig. 4 as the encoder output. Expressed in terms of bipolar voltage values, the transmitted sequence is

+1 –1 –1 +1 +1 +1 +1 +1

Let us assume that the noise transforms this data-plus-parity sequence into the received sequence

{xk} = 0.75, 0.05, 0.10, 0.15, 1.25, 1.0, 3.0, 0.5

From Eq. 21, the assumed channel measurements yield the LLR values

{Lc(xk)} = 1.5, 0.1, 0.2, 0.3, 2.5, 2.0, 6.0, 1.0

which are shown in Fig. 4 as the decoder input measurements. It should be noted that, if hard decisions are made on the {xk} or the {Lc(xk)} values shown above, such a detection process would result in two errors, since d2 and d3 would each be incorrectly classified as binary 1.
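These numbers are easy to verify. A few lines of Python (a sketch under the Eq. 21 assumption σ² = 1) reproduce the decoder input measurements and exhibit the two hard-decision errors:

    x = [0.75, 0.05, 0.10, 0.15, 1.25, 1.0, 3.0, 0.5]  # received {xk}
    Lc = [2 * xk for xk in x]                          # Eq. 21: Lc(xk) = 2xk
    hard = [1 if llr > 0 else 0 for llr in Lc]         # sign -> binary digit
    print(Lc)    # [1.5, 0.1, 0.2, 0.3, 2.5, 2.0, 6.0, 1.0]
    print(hard)  # [1, 1, 1, 1, 1, 1, 1, 1]: d2 and d3 come out wrong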

LOG-LIKELIHOOD ALGEBRA

To best explain the iterative feedback of soft decisions, we introduce the idea of a log-likelihood algebra [5]. For statistically independent d, we define the sum of two log-likelihood ratios (LLRs) as the LLR of the modulo-2 sum of the underlying data bits:

L(d1) ⊞ L(d2) ≜ L(d1 ⊕ d2) = log_e { [e^L(d1) + e^L(d2)] / [1 + e^(L(d1)+L(d2))] } ≈ (–1) × sign[L(d1)] × sign[L(d2)] × min(|L(d1)|, |L(d2)|)

where the factor (–1) arises because the null element under addition, binary 0, is represented by the voltage –1. The sign-min approximation on the right is the form used in the numerical calculations that follow.
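In Python, the exact LLR sum and its sign-min approximation can be sketched as follows (the function names are ours):

    import math

    def boxplus_exact(L1, L2):
        # L(d1) [+] L(d2) = ln[(e^L1 + e^L2) / (1 + e^(L1 + L2))]
        return math.log((math.exp(L1) + math.exp(L2)) /
                        (1.0 + math.exp(L1 + L2)))

    def boxplus(L1, L2):
        # Approximation: (-1) * sign(L1) * sign(L2) * min(|L1|, |L2|)
        sign = lambda v: 1.0 if v >= 0 else -1.0
        return -sign(L1) * sign(L2) * min(abs(L1), abs(L2))

For example, boxplus(1.5, 2.5) returns –1.5, and boxplus_exact(1.5, 2.5) returns approximately –1.2: the combined belief about a modulo-2 sum is never more reliable than the weaker of the two beliefs entering it.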




The original LLRs plus the horizontal extrinsic LLRs yield the improvement shown in Table 4 (the extrinsic vertical terms are not yet being considered):

Table 4. Improved LLRs due to Leh(d̂):

     1.4    –1.4
    –0.1     0.1

The original LLRs plus both the horizontal and vertical extrinsic LLRs yield the improvement shown in Table 5:

Table 5. Improved LLRs due to Leh(d̂) + Lev(d̂):

     1.5    –1.5
    –1.5     1.1

For this example, it is seen that the horizontal parity alone yields the correct hard decisions out of the decoder, but with very low confidence for data bits d3 and d4. After enhancing the decoder LLRs with the vertical extrinsic LLRs, the new LLR solutions have better confidence. Let us pursue one additional horizontal and vertical decoding iteration to see if there are any significant changes in the results. We again use the relationships shown in Eqs. 26 through 29 and proceed with the second horizontal calculations for Leh(d̂), using the new L(d) from the first vertical calculations, shown in Eqs. 34 through 37:

Leh(d̂1) = (0.1 – 0.1) ⊞ 2.5 = 0 = new L(d1)    (38)

Leh(d̂2) = (1.5 + 0.1) ⊞ 2.5 = –1.6 = new L(d2)    (39)

Leh(d̂3) = (0.3 + 1.0) ⊞ 2.0 = –1.3 = new L(d3)    (40)

Leh(d̂4) = (0.2 – 1.4) ⊞ 2.0 = 1.2 = new L(d4)    (41)

Next, we proceed with the second vertical calculations for Lev(d̂), using the new L(d) from the second horizontal calculations, shown in Eqs. 38 through 41:

Lev(d̂1) = (0.2 – 1.3) ⊞ 6.0 = 1.1 = new L(d1)    (42)

Lev(d̂2) = (0.3 + 1.2) ⊞ 1.0 = –1.0 = new L(d2)    (43)

Lev(d̂3) = (1.5 + 0) ⊞ 6.0 = –1.5 = new L(d3)    (44)

Lev(d̂4) = (0.1 – 1.6) ⊞ 1.0 = 1.0 = new L(d4)    (45)

After the second iteration of horizontal and vertical decoding calculations, the soft-output likelihood values are again calculated from Eq. 14, rewritten below:

L(d̂) = Lc(x) + Leh(d̂) + Lev(d̂)    (46)

The final horizontal and vertical extrinsic LLRs, and the resulting decoder LLRs, are displayed in Tables 6 through 8. For this example, the second full iteration of both codes (yielding a total of four iterations) suggests a modest improvement over one full iteration. Now there seems to be a balancing of the confidence values among each of the four data decisions.

Table 6. Original Lc(x) measurements:

     1.5     0.1
     0.2     0.3

Table 7. Leh(d̂) after second horizontal decoding:

     0      –1.6
    –1.3     1.2

Table 8. Lev(d̂) after second vertical decoding:

     1.1    –1.0
    –1.5     1.0

The soft output is shown in Table 9.

Table 9. L(d̂) = Lc(x) + Leh(d̂) + Lev(d̂):

     2.6    –2.5
    –2.6     2.5
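The entire numerical example can be reproduced with a short program. The sketch below is ours: it uses the sign-min approximation of the log-likelihood algebra, and it encodes the row/column pairings of the Fig. 4 array (each extrinsic LLR is the soft modulo-2 sum of a bit's partner, data plus a priori, with the corresponding parity). Running it prints the values of Tables 7, 8, and 9.

    def boxplus(L1, L2):
        s = lambda v: 1.0 if v >= 0 else -1.0
        return -s(L1) * s(L2) * min(abs(L1), abs(L2))

    # Channel LLRs (Eq. 21) for d1 d2 d3 d4 and p12 p34 p13 p24:
    Lc = {'d1': 1.5, 'd2': 0.1, 'd3': 0.2, 'd4': 0.3}
    Lp = {'p12': 2.5, 'p34': 2.0, 'p13': 6.0, 'p24': 1.0}

    # Partner data bit and parity bit for each data bit (Eqs. 16-17):
    H = {'d1': ('d2', 'p12'), 'd2': ('d1', 'p12'),   # horizontal
         'd3': ('d4', 'p34'), 'd4': ('d3', 'p34')}
    V = {'d1': ('d3', 'p13'), 'd2': ('d4', 'p24'),   # vertical
         'd3': ('d1', 'p13'), 'd4': ('d2', 'p24')}

    L = {d: 0.0 for d in Lc}               # step 1: L(d) = 0
    for iteration in range(2):             # two full iterations
        Leh = {d: boxplus(Lc[j] + L[j], Lp[p]) for d, (j, p) in H.items()}
        L = Leh                            # steps 2-3
        Lev = {d: boxplus(Lc[j] + L[j], Lp[p]) for d, (j, p) in V.items()}
        L = Lev                            # steps 4-5

    soft = {d: Lc[d] + Leh[d] + Lev[d] for d in Lc}   # Eq. 14 / Eq. 46
    print({d: round(v, 1) for d, v in Leh.items()})   # Table 7
    print({d: round(v, 1) for d, v in Lev.items()})   # Table 8
    print({d: round(v, 1) for d, v in soft.items()})  # Table 9

After the first pass of the loop, Leh and Lev hold the first-iteration extrinsic values behind Tables 4 and 5; after the second pass, the printed soft outputs {2.6, –2.5, –2.6, 2.5} match Table 9, and all four hard decisions are correct with comparable confidence.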

COMPONENT CODES FOR IMPLEMENTING TURBO CODES

We have described the basic concepts of concatenation, iteration, and soft decisions using a simple product-code model. We next apply these ideas to the implementation of turbo codes. We form a turbo code by the parallel concatenation of component codes (building blocks) [3, 6].

First, consider a simple binary rate-1/2 convolutional encoder with constraint length K and memory K – 1. The input to the encoder at time k is a bit dk, and the corresponding codeword is the bit pair (uk, vk), where

uk = Σ (i = 0 to K–1) g1i d(k–i)  mod 2,    g1i = 0, 1    (47)

vk = Σ (i = 0 to K–1) g2i d(k–i)  mod 2,    g2i = 0, 1    (48)

G1 = {g1i} and G2 = {g2i} are the code generators, and dk is represented as a binary digit. This encoder has a finite impulse response (FIR), and gives rise to the familiar nonsystematic convolutional (NSC) code, an example of which is seen in Fig. 5. In this example, the constraint length is K = 3, and the two code generators are described by G1 = {111} and G2 = {101}. It is well known that, at large Eb/N0, the error performance of an NSC is better than that of a systematic code with the same memory. At small Eb/N0, it is generally the other way around [3]. A class of infinite impulse response (IIR) convolutional codes has been proposed [3] for the building blocks of a turbo code. Such codes are also referred to as recursive systematic convolutional (RSC) codes, because previously encoded information bits are continually fed back to the encoder's input. For high code rates, RSC codes result in better error performance than the best NSC codes at any Eb/N0. A binary rate-1/2 RSC code is obtained from an NSC code by using a feedback loop and setting one of the two outputs (uk or vk) equal to dk. Figure 6 illustrates an example of such an RSC code, with K = 3, where ak is recursively calculated as

ak = dk + Σ (i = 1 to K–1) gi′ a(k–i)  mod 2    (49)

and where gi′ is respectively equal to g1i if uk = dk, and to g2i if vk = dk.

We assume that the input bit dk takes on values of 0 or 1 with equal probability. It is stated in [3] that ak exhibits the same statistical properties as dk. The trellis structure is identical for the RSC code of Fig. 6 and the NSC code of Fig. 5, and these two codes have the same free distance. However, the two output sequences {uk} and {vk} do not correspond to the same input sequence {dk} for RSC and NSC codes. For the same code generators, we can say that RSC codes do not modify the output weight distribution of the output codewords compared to NSC codes. They only change the mapping between input data sequences and output codeword sequences.

Figure 5. Nonsystematic convolutional (NSC) code: K = 3 encoder with shift register stages dk-1, dk-2 and outputs uk, vk.

Figure 6. Recursive systematic convolutional (RSC) code: K = 3 encoder with feedback register stages ak-1, ak-2, systematic output uk = dk, and parity output vk.
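The following Python sketch implements the two encoders of Figs. 5 and 6 for K = 3, G1 = {111}, G2 = {101} (function names and register handling are ours). It makes the FIR/IIR distinction concrete: the NSC outputs are mod-2 convolutions of the data (Eqs. 47 and 48), while the RSC feeds ak back through the register (Eq. 49) and transmits uk = dk.

    def nsc_encode(d, g1=(1, 1, 1), g2=(1, 0, 1)):
        # Eqs. 47-48: uk and vk are mod-2 FIR convolutions of {dk}.
        reg = [0, 0]                       # d(k-1), d(k-2)
        out = []
        for dk in d:
            taps = [dk] + reg
            uk = sum(g * t for g, t in zip(g1, taps)) % 2
            vk = sum(g * t for g, t in zip(g2, taps)) % 2
            out.append((uk, vk))
            reg = [dk] + reg[:-1]
        return out

    def rsc_encode(d, g1=(1, 1, 1), g2=(1, 0, 1)):
        # Eq. 49 with uk = dk: the feedback taps g' come from G1,
        # and the register holds a(k-1), a(k-2) instead of past data.
        reg = [0, 0]
        out = []
        for dk in d:
            ak = (dk + sum(g * a for g, a in zip(g1[1:], reg))) % 2
            vk = (g2[0] * ak + sum(g * a for g, a in zip(g2[1:], reg))) % 2
            out.append((dk, vk))           # systematic: uk = dk
            reg = [ak] + reg[:-1]
        return out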





CONCATENATION OF RSC CODES

Consider the parallel concatenation of two RSC encoders of the type shown in Fig. 6. Good turbo codes have been constructed with component codes having quite short constraint lengths (K = 3 to 5). An example of such a turbo encoder is shown in Fig. 7, where the switch yielding vk punctures the code, making it rate 1/2; without the switch, the code would be rate 1/3. Additional concatenations can be continued with multiple component codes. In general, the component encoders need not be identical with regard to constraint length and rate. In designing turbo codes, the goal is to choose the best component codes by maximizing the effective free distance of the code [7]. At large values of Eb/N0, this is tantamount to maximizing the minimum-weight codeword. However, at low values of Eb/N0 (the region of greatest interest), optimizing the weight distribution of the codewords is more important than maximizing the minimum weight [6].

Figure 7. Parallel concatenation of two RSC encoders: the data dk feed encoder C1 directly and encoder C2 through an interleaver (producing d′k); the parity outputs v1k and v2k are punctured to form vk.

The turbo encoder in Fig. 7 produces codewords from each of two component encoders. The weight distribution for the codewords out of this parallel concatenation depends on how the codewords from one of the component encoders are combined with codewords from the other encoder(s). Intuitively, we would like to avoid pairing low-weight codewords from one encoder with low-weight codewords from the other encoder. Many such pairings can be avoided by proper design of the interleaver. An interleaver that permutes the data in a random fashion provides better performance than the familiar block interleaver [8].
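A minimal Python sketch of the Fig. 7 structure, reusing rsc_encode from the previous sketch, with a random interleaver; the alternating puncture pattern is our assumption for illustration, since the article does not specify the switch schedule:

    import random

    def turbo_encode(d, interleaver, puncture=True):
        # C1 encodes d; C2 encodes the interleaved data d'.
        v1 = [v for (u, v) in rsc_encode(d)]
        d_prime = [d[i] for i in interleaver]
        v2 = [v for (u, v) in rsc_encode(d_prime)]
        if puncture:   # rate 1/2: keep one parity bit per data bit
            return [(dk, v1[k] if k % 2 == 0 else v2[k])
                    for k, dk in enumerate(d)]
        return list(zip(d, v1, v2))        # rate 1/3: keep both parities

    data = [random.randint(0, 1) for _ in range(16)]
    pi = random.sample(range(16), 16)      # random interleaver permutation
    codeword = turbo_encode(data, pi)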

If the encoders are not recursive, a low-weight codeword generated by the input sequence d = (0 0 ... 0 0 1 0 0 ... 0 0), with a single binary 1, will always appear again at the input of the second encoder, for any choice of interleaver. In other words, the interleaver would not influence the codeword weight distribution if the codes were not recursive. However, if the component codes are recursive, a weight-1 input sequence generates an infinite impulse response (infinite-weight output). Therefore, for the case of recursive codes, the weight-1 input sequence does not yield the minimum-weight codeword out of the encoder. The encoded output weight is kept finite only by trellis termination, a process that forces the coded sequence to terminate in the zero state. In effect, the convolutional code is converted to a block code.

For the Fig. 7 encoder, the minimum-weight codeword for each component encoder is generated by the weight-3 input sequence (0 0 ... 0 0 1 1 1 0 0 ... 0 0), with three consecutive 1s. Another input that produces fairly low-weight codewords is the weight-2 sequence (0 0 ... 0 0 1 0 0 1 0 0 ... 0 0). However, after the permutations of an interleaver, either of these deleterious input patterns is not likely to appear again at the input to another encoder, so it is unlikely that a minimum-weight codeword will be combined with another minimum-weight codeword. The important aspect of the building blocks used in turbo codes is that they are recursive (the systematic aspect is merely incidental). It is the RSC code's IIR property that protects against those low-weight encodings which cannot be remedied by an interleaver. One can argue that turbo code performance is determined largely by the minimum-weight codewords that result from the weight-2 input sequence. The argument is that weight-1 inputs can be ignored, since they yield large codeword weights due to the IIR code structure, and for input sequences having weight 3 and larger, a properly designed interleaver makes the number of such input words relatively rare [7–11].

A FEEDBACK DECODER

The Viterbi algorithm (VA) is an optimal decoding method for minimizing the probability of sequence error. Unfortunately, the VA is not able to yield the APP or soft-decision output for each decoded bit. A relevant algorithm for doing this has been proposed by Bahl et al. [12]. The Bahl algorithm was modified by Berrou et al. [3] for use in decoding RSC codes. The APP of a decoded data bit dk can be derived from the joint probability Λk^i(m) defined by

Λk^i(m) = P{dk = i, Sk = m | R1^N}    (50)

where Sk is the state at time k, and R1^N is a received sequence from time k = 1 through some time N.

Thus, the APP for a decoded data bit dk, represented as a binary digit, is equal to

P{dk = i | R1^N} = Σ (over m) Λk^i(m),    i = 0, 1    (51)

The log-likelihood function is written as the logarithm of the ratio of APPs, as follows:




L(d̂k) = log [ Σ (over m) Λk^1(m) / Σ (over m) Λk^0(m) ]    (52)

The decoder can make a decision by comparing L(d̂k) to a zero threshold:

d̂k = 1 if L(d̂k) > 0
d̂k = 0 if L(d̂k) < 0    (53)

For a systematic code, the LLR L(d̂k) associated with each decoded bit d̂k can be described as the sum of the LLR of d̂k out of the detector and of other LLRs generated by the decoder (extrinsic information), as was expressed in Eqs. 12 and 13. Consider the detection of a noisy data sequence that stems from the encoder of Fig. 7. The decoder is shown in Fig. 8. For a discrete memoryless Gaussian channel and binary modulation, the decoder input is made up of a couple Rk of two random variables, xk and yk. Where dk and vk are bits, we can express the received bit-to-bipolar-pulse conversion at time k as follows:

xk = (2dk – 1) + ik    (54)

yk = (2vk – 1) + qk    (55)

where ik and qk are two independent noises with the same variance σ². The redundant information, yk, is demultiplexed and sent to decoder DEC1 as y1k when vk = v1k, and to decoder DEC2 as y2k when vk = v2k. When the redundant information of a given encoder (C1 or C2) is not emitted, the corresponding decoder input is set to zero. Note that the output of DEC1 has an interleaver structure identical to the one used at the transmitter between the two encoders. This is because the information processed by DEC1 is the noninterleaved output of C1 (corrupted by channel noise). Conversely, the information processed by DEC2 is the noisy output of C2, whose input is the same data going into C1, but permuted by the interleaver. DEC2 makes use of the DEC1 output, provided this output is time-ordered in the same way as the input to C2.

Figure 8. Feedback decoder: DEC1 and DEC2 in series, with a demultiplexer splitting yk into y1k and y2k, interleaving between DEC1 and DEC2, deinterleaving of the DEC2 outputs, and a feedback loop carrying zk = Le2(d̂k) back to DEC1 for the next iteration.
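The demultiplexing rule can be written out directly. The sketch below is ours, and it assumes the same alternating puncture pattern as the encoder sketch earlier; it routes each received parity sample to its decoder and inserts zeros where a parity bit was never emitted:

    def demux_parity(y):
        # y: received parity samples yk. Even k came from C1, odd k from C2
        # under alternating puncturing; missing positions carry no channel
        # information, so the corresponding decoder input is set to zero.
        y1 = [yk if k % 2 == 0 else 0.0 for k, yk in enumerate(y)]
        y2 = [yk if k % 2 == 1 else 0.0 for k, yk in enumerate(y)]
        return y1, y2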

DECODING WITH A FEEDBACK LOOP

We rewrite Eq. 13 for the soft-decision output at time k, with the a priori LLR L(dk) initially set to zero; this follows from the assumption that the data bits are equally likely:

L(d̂k) = Lc(xk) + Le(d̂k) = log [ p(xk | dk = 1) / p(xk | dk = 0) ] + Le(d̂k)    (56)

where L(d̂k) is the soft-decision output, and Lc(xk) is the LLR channel measurement, stemming from the ratio of likelihood functions p(xk | dk = i) of the discrete memoryless channel. Le(d̂k) = L(d̂k)|xk=0 is a function of the redundant information. It is the extrinsic information supplied by the decoder, and it does not depend on the decoder input xk. Ideally, Lc(xk) and Le(d̂k) are corrupted by uncorrelated noise, and thus Le(d̂k) may be used as a new observation of dk by another decoder for an iterative process. The fundamental principle for feeding back information to another decoder is: never feed a decoder with information that stems from itself (because the input and output corruption will be highly correlated).

For the Gaussian channel, we use the natural logarithm in Eq. 56 to describe the channel LLR, Lc(xk), as was done in Eqs. 18 through 20. We rewrite the Eq. 20 LLR result below:

Lc(xk) = –(1/2)((xk – 1)/σ)² + (1/2)((xk + 1)/σ)² = (2/σ²) xk    (57)

Both decoders, DEC1 and DEC2, use the modified Bahl algorithm [6]. If the inputs L1(d̂k) and y2k to decoder DEC2 are independent, then the log-likelihood L2(d̂k) at the output of DEC2 can be written as

L2(d̂k) = f[L1(d̂k)] + Le2(d̂k)    (58)

with




L1(d̂k) = (2/σ²) xk + Le1(d̂k)    (59)

where f[·] indicates a functional relationship. The extrinsic information Le2(d̂k) out of DEC2 is a function of the sequence {L1(d̂n)}, n ≠ k. Since L1(d̂n) depends on the observation R1^N, the extrinsic information Le2(d̂k) is correlated with the observations xk and y1k. Nevertheless, the greater |n – k| is, the less correlated are L1(d̂n) and the observations xk, yk. Thus, due to the interleaving between DEC1 and DEC2, the extrinsic information Le2(d̂k) and the observations xk, y1k are weakly correlated. Therefore, they can be jointly used for the decoding of bit dk.

The quantity zk = Le2(d̂k) acts as a diversity effect in an iterative process. In general, Le2(d̂k) will have the same sign as dk; therefore, Le2(d̂k) may improve the LLR associated with each decoded data bit.

The algorithmic details for computing the LLR, L(d̂k), of the APP for each bit have been described by several authors [3–5], and suggestions for decreasing the implementational complexity are still ongoing [13–15]. A reasonable way to think of the process that produces APP values for each bit is to imagine computing a maximum likelihood sequence estimate (MLSE), or VA, in two directions over the block of coded bits. Having metrics associated with states in the forward and backward directions allows us to compute the APP of a bit if there is a transition between two given states. Thus, we can proceed with this bidirectional VA and, in a sliding-window fashion, compute the APP for each code bit in the block. With this view in mind, we can estimate that the complexity of decoding a turbo code is at least two times that of decoding one of its component codes using the VA.

TURBO CODE ERROR PERFORMANCE EXAMPLE

Monte Carlo simulations have been presented for a rate-1/2, K = 5 encoder with generators G1 = {11111} and G2 = {10001}, and parallel concatenation. The interleaver was a 256 × 256 array. The modified Bahl algorithm was used with a data block length of 65,536 bits [3]. For 18 iterations, the bit error probability PB is lower than 10^-5 at Eb/N0 = 0.7 dB. The error performance improvement as a function of iterations is seen in Fig. 9. Note that, as we approach the Shannon limit of –1.6 dB, the required bandwidth approaches infinity, and the capacity (code rate) approaches zero. For binary modulation, several authors use PB = 10^-5 and Eb/N0 = 0 dB as the Shannon limit reference for a rate-1/2 code. Thus, with parallel concatenation of RSC convolutional codes and feedback decoding, the error performance of a turbo code is at 0.7 dB from this Shannon limit. Recently, a class of codes that uses serial instead of parallel concatenation of the interleaved building blocks has been proposed; it has been suggested that these codes may have superior performance to turbo codes [14].

Figure 9. Bit error probability as a function of Eb/N0 and multiple iterations: simulated curves for uncoded transmission and for iterations 1, 2, 3, 6, and 18, with PB spanning 10^-1 down to 10^-5 over Eb/N0 from 0 to 5 dB.

SUMMARY

In this article basic statistical measures, such as a posteriori probability and likelihood, are reviewed. We then use these measures for describing the error performance of a soft input/soft output decoder. A numerical example helps to illustrate how performance is improved when soft outputs from concatenated decoders are used in an iterative decoding process. We next proceed to apply these concepts to the parallel concatenation of recursive systematic convolutional (RSC) codes, and explain why such codes are the preferred building blocks in turbo codes. A feedback decoder is described in general ways, and its remarkable performance is presented.

REFERENCES

[1] G. D. Forney, Jr., Concatenated Codes, Cambridge, MA: MIT Press, 1966.
[2] J. H. Yuen et al., "Modulation and Coding for Satellite and Space Communications," Proc. IEEE, vol. 78, no. 7, July 1990, pp. 1250–65.
[3] C. Berrou, A. Glavieux, and P. Thitimajshima, "Near Shannon Limit Error-Correcting Coding and Decoding: Turbo Codes," Proc. IEEE ICC '93, Geneva, Switzerland, May 1993, pp. 1064–70.
[4] C. Berrou and A. Glavieux, "Near Optimum Error Correcting Coding and Decoding: Turbo-Codes," IEEE Trans. Commun., vol. 44, no. 10, Oct. 1996, pp. 1261–71.
[5] J. Hagenauer, "Iterative Decoding of Binary Block and Convolutional Codes," IEEE Trans. Info. Theory, vol. 42, no. 2, Mar. 1996, pp. 429–45.
[6] D. Divsalar and F. Pollara, "On the Design of Turbo Codes," TDA progress rep. 42-123, Jet Propulsion Lab., Pasadena, CA, Nov. 15, 1995, pp. 99–121.
[7] D. Divsalar and R. J. McEliece, "Effective Free Distance of Turbo Codes," Elect. Lett., vol. 32, no. 5, Feb. 29, 1996, pp. 445–46.
[8] S. Dolinar and D. Divsalar, "Weight Distributions for Turbo Codes Using Random and Nonrandom Permutations," TDA progress rep. 42-122, Jet Propulsion Lab., Pasadena, CA, Aug. 15, 1995, pp. 56–65.
[9] D. Divsalar and F. Pollara, "Turbo Codes for Deep-Space Communications," TDA progress rep. 42-120, Jet Propulsion Lab., Pasadena, CA, Feb. 15, 1995, pp. 29–39.
[10] D. Divsalar and F. Pollara, "Multiple Turbo Codes for Deep-Space Communications," TDA progress rep. 42-121, Jet Propulsion Lab., Pasadena, CA, May 15, 1995, pp. 66–77.
[11] D. Divsalar and F. Pollara, "Turbo Codes for PCS Applications," Proc. ICC '95, Seattle, WA, June 18–22, 1995.
[12] L. R. Bahl et al., "Optimal Decoding of Linear Codes for Minimizing Symbol Error Rate," IEEE Trans. Info. Theory, vol. IT-20, Mar. 1974, pp. 284–87.
[13] S. Benedetto et al., "Soft Output Decoding Algorithm in Iterative Decoding of Turbo Codes," TDA progress rep. 42-124, Jet Propulsion Lab., Pasadena, CA, Feb. 15, 1996, pp. 63–87.
[14] S. Benedetto et al., "A Soft-Input Soft-Output Maximum A Posteriori (MAP) Module to Decode Parallel and Serial Concatenated Codes," TDA progress rep. 42-127, Jet Propulsion Lab., Pasadena, CA, Nov. 15, 1996, pp. 63–87.
[15] S. Benedetto et al., "A Soft-Input Soft-Output APP Module for Iterative Decoding of Concatenated Codes," IEEE Commun. Lett., vol. 1, no. 1, Jan. 1997, pp. 22–24.

BIOGRAPHY

BERNARD SKLAR [LSM] has over 40 years of experience in technical design and management positions at Republic Aviation Corp., Hughes Aircraft, Litton Industries, and The Aerospace Corporation. At Aerospace, he helped develop the MILSTAR satellite system, and was the principal architect for EHF Satellite Data Link Standards. Currently, he is head of advanced systems at Communications Engineering Services, a consulting company he founded in 1984. He has taught engineering courses at several universities, including the University of California, Los Angeles and the University of Southern California, and has presented numerous training programs throughout the world. He has published and presented scores of technical papers. He is the recipient of the 1984 Prize Paper Award from the IEEE Communications Society for his tutorial series on digital communications, and he is the author of the book Digital Communications (Prentice-Hall). He is past chair of the Los Angeles Council IEEE Education Committee. His academic credentials include a B.S. degree in math and science from the University of Michigan, an M.S. degree in electrical engineering from the Polytechnic Institute of Brooklyn, New York, and a Ph.D. degree in engineering from the University of California, Los Angeles.
