Optimizing LDPC Codes for message-passing decoding.

Jeremy Thorpe

Ph.D. Candidacy

2/26/03

Overview

Research Projects

Background to LDPC Codes

Randomized Algorithms for designing LDPC Codes

Open Questions and Discussion

Data Fusion for Collaborative Robotic Exploration

Developed a version of the Mastermind game as a model for autonomous inference.

Applied the Belief Propagation algorithm to solve this problem.

Showed that the algorithm had an interesting performance-complexity tradeoff.

Published in JPL's IPN Progress Reports.

Dual-Domain Soft-in Soft-out Decoding of Conv. Codes

Studied the feasibility of using the Dual SISO algorithm for high rate turbo-codes.

Showed that reduction in state-complexity was offset by increase in required numerical accuracy.

Report circulated internally at DSDD/HIPL S&S Architecture Center, Sony.

Short-Edge Graphs for Hardware LDPC Decoders.

Developed criteria to predict performance and implementational simplicity of graphs of Regular (3,6) LDPC codes.

Optimized criteria via randomized algorithm (Simulated Annealing).

Achieved codes of reduced complexity and superior performance to random codes.

Published in ISIT 2002 proceedings.

Evaluation of Probabilistic Inference Algorithms

Characterize the performance of probabilistic algorithms based on observable data

Axiomatic definition of "optimal characterization"

Existence, non-existence, and uniqueness proofs for various axiom sets

Unpublished

Optimized Coarse Quantizers for Message-Passing Decoding

Mapped 'additive' domains for variable and check node operations

Defined quantized message passing rule in these domains

Optimized quantizers for 1-bit to 4-bit messages

Submitted to ISIT 2003

Graph Optimization using Randomized Algorithms

Introduce Proto-graph framework

Use approximate density evolution to predict performance of particular graphs

Use randomized algorithms to optimize graphs (Extends short-edge work)

Achieves new asymptotic performance-complexity mark

Background to LDPC codes

The Channel Coding Strategy

Encoder chooses the mth codeword in codebook C and transmits it across the channel

Decoder observes the channel output y and generates m’ based on the knowledge of the codebook C and the channel statistics.

[Diagram: $m \in \{0,1\}^k$ → Encoder → $x \in C \subseteq X^n$ → Channel → $y \in Y^n$ → Decoder → $m' \in \{0,1\}^k$]

Linear Codes

A linear code C (over a finite field) can be defined in terms of either a generator matrix or parity-check matrix.

Generator matrix G (k×n): $C = \{mG\}$

Parity-check matrix H ((n-k)×n): $C = \{c : Hc' = 0\}$

LDPC Codes

LDPC Codes -- linear codes defined in terms of H

H has a small average number of non-zero elements per row or column.

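$$H = \begin{pmatrix}
1 & 0 & 1 & 1 & 0 & 0 & 1 & 0 & 0 & 1 \\
0 & 1 & 0 & 0 & 1 & 1 & 1 & 0 & 1 & 0 \\
0 & 1 & 1 & 1 & 0 & 1 & 0 & 0 & 1 & 1 \\
1 & 1 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 0
\end{pmatrix}$$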
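
The parity-check definition can be tried directly on the example matrix. The following is a minimal sketch, assuming the rows of H are exactly as they appear in the extracted slide (the row order in the original figure may differ, which does not change the code). The helper name `is_codeword` is illustrative and not from the talk.

```python
from itertools import product

# Rows of H as they appear in the extracted slide (assumed order).
H = [
    [1, 0, 1, 1, 0, 0, 1, 0, 0, 1],
    [0, 1, 0, 0, 1, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 1, 0, 0, 1, 1],
    [1, 1, 0, 0, 0, 0, 1, 1, 0, 1],
    [0, 0, 0, 0, 1, 0, 0, 1, 1, 0],
]

def is_codeword(H, c):
    """Return True if every parity check (row of H) sums to 0 mod 2 on c."""
    return all(sum(h * x for h, x in zip(row, c)) % 2 == 0 for row in H)

# Brute-force enumeration of C = {c : Hc' = 0}; feasible only for tiny n.
codewords = [c for c in product([0, 1], repeat=len(H[0])) if is_codeword(H, c)]
print(len(codewords), "codewords of length", len(H[0]))
```

Enumeration is only practical for very short codes; it is used here because it mirrors the set definition of C term by term.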

Graph Representation of LDPC Codes

H is represented by a bipartite graph.

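There is an edge from v to c if and only if: $H_{c,v} \neq 0$

A codeword is an assignment of v's s.t.: $\forall c: \sum_{v|c} x_v = 0$ (the sum runs over the variable nodes v adjacent to c)

[Figure: bipartite graph with variable nodes v on one side and check nodes c on the other.]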

Message-Passing Decoding of LDPC Codes

Message Passing (or Belief Propagation) decoding is a low-complexity algorithm which approximately answers the question “what is the most likely x given y?”

MP recursively defines messages $m_{v,c}^{(i)}$ from each variable node v to each adjacent check node c, and messages $m_{c,v}^{(i)}$ in the reverse direction, for iteration i=0,1,...

Two Types of Messages...

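Likelihood Ratio: $\lambda_{x,y} = \frac{p(y|x=1)}{p(y|x=0)}$

For y1,...yn independent conditionally on x: $\lambda_{x,y} = \prod_{i=1}^{n} \lambda_{x,y_i}$

Probability Difference: $\delta_{x,y} = p(x=0|y) - p(x=1|y)$

For x1,...xn independent: $\delta_{\oplus_i x_i, y} = \prod_{i} \delta_{x_i,y_i}$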
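
The product rule for probability differences is the less familiar of the two, so a quick numerical check may help. This is a sketch under stated assumptions: the per-bit probabilities in `p_zero` are made-up values, and the probability difference is taken as p(x=0) minus p(x=1).

```python
from itertools import product

def prob_difference(p0):
    """Probability difference p(x=0) - p(x=1) for a bit with p(x=0) = p0."""
    return p0 - (1.0 - p0)

# Hypothetical values of p(x_i = 0 | y_i) for three independent bits.
p_zero = [0.9, 0.6, 0.7]

# Probability that the XOR of the bits is 0, by direct enumeration.
p_xor_zero = 0.0
for bits in product([0, 1], repeat=len(p_zero)):
    prob = 1.0
    for b, p0 in zip(bits, p_zero):
        prob *= p0 if b == 0 else 1.0 - p0
    if sum(bits) % 2 == 0:
        p_xor_zero += prob

delta_xor = prob_difference(p_xor_zero)

# Product of the individual probability differences.
delta_product = 1.0
for p0 in p_zero:
    delta_product *= prob_difference(p0)

print(delta_xor, delta_product)  # the two values should agree
```

This is why check nodes, which constrain an XOR of bits, are naturally handled in the probability-difference domain, while variable nodes are handled with likelihood ratios.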

...Related by the Bilinear Transform

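Definition: $B(x) = \frac{1-x}{1+x}$

Properties:

$B(B(x)) = x$

$B(\lambda_{x,y}) = \delta_{x,y}$

$B(\delta_{x,y}) = \lambda_{x,y}$

With $\lambda_{x,y}$ the likelihood ratio, $\delta_{x,y}$ the probability difference, and x equally likely to be 0 or 1:

$B(\lambda_{x,y}) = B\left(\frac{p(y|1)}{p(y|0)}\right) = \frac{p(y|0) - p(y|1)}{p(y|0) + p(y|1)} = \frac{2p(0|y)p(y) - 2p(1|y)p(y)}{2p(y)} = p(0|y) - p(1|y) = \delta_{x,y}$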
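
The properties of the transform can be checked numerically. The sketch below assumes a uniform prior on x, the likelihood ratio taken as p(y|x=1)/p(y|x=0), and a made-up posterior of 0.8 / 0.2; none of these numbers come from the talk.

```python
def B(x):
    """Bilinear transform B(x) = (1 - x) / (1 + x); undefined at x = -1."""
    return (1.0 - x) / (1.0 + x)

# Hypothetical posterior for one bit, assuming a uniform prior on x.
p0, p1 = 0.8, 0.2      # p(x=0|y), p(x=1|y)
lam = p1 / p0          # equals p(y|x=1)/p(y|x=0) under the uniform prior
delta = p0 - p1        # probability difference

print(B(lam), delta)   # B maps the likelihood ratio to the probability difference
print(B(delta), lam)   # and back again
print(B(B(0.3)))       # B is its own inverse
```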

Message Domains

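Likelihood Ratio: $\lambda = \frac{P(y|x=1)}{P(y|x=0)}$

Probability Difference: $\delta = P(x=0|y) - P(x=1|y)$

Log Likelihood Ratio: $\log(\lambda)$

Log Prob. Difference: $\log(\delta)$

[Diagram: $B(\cdot)$ maps between Likelihood Ratio and Probability Difference; $B'(\cdot)$ maps between Log Likelihood Ratio and Log Prob. Difference; $\log(\cdot)$ and $e^{(\cdot)}$ map between each domain and its log domain.]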

Variable to Check Messages

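On any iteration i, the message from v to c is:

$m_{v,c}^{(i)} = \lambda_{x_v,y_v} \prod_{c'|v,\, c' \neq c} B(m_{c',v}^{(i-1)})$

In the additive domain:

$m_{v,c}^{(i)} = \log(\lambda_{x_v,y_v}) + \sum_{c'|v,\, c' \neq c} B'(m_{c',v}^{(i-1)})$

Here $\lambda_{x_v,y_v}$ is the channel likelihood ratio for variable v, and $c'|v$ ranges over the check nodes adjacent to v.

[Figure: variable node v sending a message to check node c.]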

Check to Variable Messages

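On any iteration, the message from c to v is:

$m_{c,v}^{(i)} = \prod_{v'|c,\, v' \neq v} B(m_{v',c}^{(i)})$

In the additive domain:

$m_{c,v}^{(i)} = \sum_{v'|c,\, v' \neq v} B'(m_{v',c}^{(i)})$

Here $v'|c$ ranges over the variable nodes adjacent to c.

[Figure: check node c sending a message to variable node v.]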

Decision Rule

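After sufficiently many iterations, return the likelihood ratio:

$\hat{x}_v = \begin{cases} 0, & \text{if } B\left(\lambda_{x_v,y_v} \prod_{c|v} B(m_{c,v}^{(i-1)})\right) > 0 \\ 1, & \text{otherwise} \end{cases}$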
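
The variable-node update, check-node update, and decision rule can be put together in a few lines. This is a minimal sketch, not the author's implementation. It assumes variable-to-check messages are likelihood ratios p(y|x=1)/p(y|x=0), check-to-variable messages are probability differences, and messages start with no information. The function name `decode`, the 3-bit repetition code, and the channel values in `lam` are all made up for illustration.

```python
def B(x):
    """Bilinear transform; undefined at x = -1 (a bit known with certainty)."""
    return (1.0 - x) / (1.0 + x)

def decode(H, lam, iterations):
    """H: parity-check matrix as a list of rows.
    lam[v]: channel likelihood ratio for variable v.
    Returns (bit estimates, final likelihood ratios)."""
    edges = [(c, v) for c, row in enumerate(H) for v, h in enumerate(row) if h]
    c2v = {e: 0.0 for e in edges}  # probability difference 0 = no information
    for _ in range(iterations):
        # Variable to check: channel ratio times transformed messages
        # from all other adjacent checks.
        v2c = {}
        for c, v in edges:
            msg = lam[v]
            for c2, v2 in edges:
                if v2 == v and c2 != c:
                    msg *= B(c2v[(c2, v)])
            v2c[(c, v)] = msg
        # Check to variable: product of transformed messages
        # from all other adjacent variables.
        new_c2v = {}
        for c, v in edges:
            msg = 1.0
            for c2, v2 in edges:
                if c2 == c and v2 != v:
                    msg *= B(v2c[(c, v2)])
            new_c2v[(c, v)] = msg
        c2v = new_c2v
    # Decision: combine the channel ratio with all incoming check messages.
    ratios = []
    for v in range(len(lam)):
        ratio = lam[v]
        for c2, v2 in edges:
            if v2 == v:
                ratio *= B(c2v[(c2, v)])
        ratios.append(ratio)
    bits = [0 if B(r) > 0 else 1 for r in ratios]
    return bits, ratios

# Hypothetical example: 3-bit repetition code, whose graph is loop-free.
H = [[1, 1, 0], [0, 1, 1]]
lam = [0.2, 3.0, 0.5]
bits, ratios = decode(H, lam, iterations=5)
print(bits, ratios)
```

Because this small graph has no loops, every final ratio equals the product of all three channel ratios, which is the exact posterior ratio between the two codewords. The nested loops over `edges` are quadratic and chosen for readability, not speed.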

Theorem about MP Algorithm

If the algorithm stops after r iterations, then it returns the maximum a posteriori probability estimate of xv given the part of y within radius r of v.

However, this holds only if the variables within radius r of v are dependent only through the equations within radius r of v (that is, the neighborhood is loop-free).

[Figure: neighborhood of radius r around variable node v.]

Regular (λ,ρ) LDPC codes

Every variable node has degree λ, every check node has degree ρ.

The best regular rate 1/2 code is (3,6), with threshold 1.09 dB.

This code was invented by Robert Gallager by 1962.

Regular LDPC codes look the same from anywhere!

The neighborhood of every edge looks the same.

If the all-zeros codeword is sent, the distribution of any message depends only on its neighborhood.

We can calculate a single message distribution once and for all for each iteration.

Analysis of Message Passing Decoding (Density Evolution)

We assume that the all-zeros codeword was transmitted (requires a symmetric channel).

We compute the distribution of likelihood ratios coming from the channel.

For each iteration, we compute the message distributions from variable to check and check to variable.

D.E. Update Rule

The update rule for Density Evolution is defined in the additive domain of each type of node.

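Whereas in B.P, we add (log) messages:

$m_{c,v}^{(i)} = \sum_{v'|c,\, v' \neq v} B'(m_{v',c}^{(i)})$

$m_{v,c}^{(i)} = \log(\lambda_{x_v,y_v}) + \sum_{c'|v,\, c' \neq c} B'(m_{c',v}^{(i-1)})$

In D.E, we convolve message densities:

$P_{M_{c,v}^{(i)}} = \mathop{\ast}_{v'|c,\, v' \neq v} P_{B'(M_{v',c}^{(i)})}$

$P_{M_{v,c}^{(i)}} = P_{\log(\lambda)} \ast \left( \mathop{\ast}_{c'|v,\, c' \neq c} P_{B'(M_{c',v}^{(i-1)})} \right)$

Here $\lambda$ is the channel likelihood ratio and $\ast$ denotes convolution of densities.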

Familiar Example:

If one die has density function given by:

[Figure: density over outcomes 1 to 6.]

The density function for the sum of two dice is given by the convolution:

[Figure: density over sums 2 to 12.]
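
The dice example is easy to reproduce. The sketch below assumes a fair die (each outcome has probability 1/6), which the slide suggests but the extracted text does not state; the helper `convolve` is written out by hand to keep the example dependency-free.

```python
def convolve(p, q):
    """Discrete convolution: density of the sum of two independent variables."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

die = [1.0 / 6.0] * 6            # assumed fair die, outcomes 1..6
two_dice = convolve(die, die)    # outcomes 2..12

for total, prob in zip(range(2, 13), two_dice):
    print(total, round(prob, 4))
```

Density evolution applies the same operation to message densities instead of dice, once the messages have been mapped to the domain in which the node simply adds them.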

D.E. Threshold

Fixing the channel message densities, the message densities will either "converge" to minus infinity, or they won't.

For the Gaussian channel, the smallest SNR for which the densities converge is called the density evolution threshold.
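
The idea of a threshold can be illustrated on a simpler channel than the one used in the talk. The sketch below swaps in the binary erasure channel, where density evolution for a regular (dv, dc) code reduces to a scalar recursion on the erasure probability of a message (a standard result, not taken from these slides). The threshold is then the largest channel erasure probability for which the recursion converges to zero. The function names, iteration limit, and tolerance are illustrative choices.

```python
def converges(epsilon, dv=3, dc=6, max_iters=2000, tol=1e-9):
    """Scalar density evolution on the binary erasure channel.
    x is the probability that a variable-to-check message is an erasure."""
    x = epsilon
    for _ in range(max_iters):
        x = epsilon * (1.0 - (1.0 - x) ** (dc - 1)) ** (dv - 1)
        if x < tol:
            return True
    return False

def threshold(lo=0.0, hi=1.0, steps=30):
    """Bisection for the largest erasure probability that still converges."""
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if converges(mid):
            lo = mid
        else:
            hi = mid
    return lo

print(converges(0.42), converges(0.43))
print(threshold())
```

Because the iteration count is capped, the bisection slightly underestimates the true threshold; convergence becomes arbitrarily slow as the channel parameter approaches it.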

D.E. Simulation of (3,6) codes

Threshold for regular (3,6) codes is 1.09 dB

Set SNR to 1.12 dB (.03 above threshold)

Watch fraction of "erroneous messages" from check to variable

Improvement vs. current error fraction for Regular (3,6)

Improvement per iteration is plotted against current error fraction

Note there is a single bottleneck which took most of the decoding iterations

Irregular (λ, ρ) LDPC codes

A fraction λi of variable nodes have degree i, and a fraction ρi of check nodes have degree i.

Edges are connected by a single random permutation.

Nodes have become specialized.

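[Figure: variable nodes grouped by degree (λ2, λ3, ..., λn) connected to check nodes grouped by degree (ρ4, ..., ρm) through a single random permutation π.]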

D.E. Simulation of Irregular Codes (Maximum degree 10)

Set SNR to 0.42 dB (~.03 above threshold)

Watch fraction of erroneous check to variable messages.

This code was designed by Richardson et al.

Comparison of Regular and Irregular codes

Notice that the Irregular graph is much flatter

Note: Capacity-achieving LDPC codes for the erasure channel were designed by making this line exactly flat

Constructing LDPC code graphs from a proto-graph

Consider a bipartite graph G, called a "proto-graph"

Generate a graph Gα, called an "expanded graph":

replace each node by α nodes.

replace each edge by α edges, permuted at random

[Figure: a proto-graph G and its expansion G2 for α=2.]
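
The expansion step can be sketched in a few lines. This is an illustration under stated assumptions, not the author's code: the proto-graph is given as a list of (variable, check) edges, each node copy is labeled by a (node, copy index) pair, and the small proto-graph `proto` is made up.

```python
import random

def expand(proto_edges, alpha, rng):
    """Replace each proto-graph edge (v, c) by alpha edges, permuted at random."""
    expanded = []
    for v, c in proto_edges:
        perm = list(range(alpha))
        rng.shuffle(perm)
        for j in range(alpha):
            # Copy j of variable v connects to copy perm[j] of check c.
            expanded.append(((v, j), (c, perm[j])))
    return expanded

# Hypothetical proto-graph with 3 variable nodes and 2 check nodes.
proto = [(0, 0), (1, 0), (1, 1), (2, 1)]
g2 = expand(proto, alpha=2, rng=random.Random(0))
for edge in g2:
    print(edge)
```

Since each proto-graph edge uses its own permutation, every copy of a node keeps the degree and the neighbor types of the original node, which is what makes the local structure of the expanded graph readable from the proto-graph.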

Local Structure of Gα

The structure of the neighborhood of any edge in Gα can be found by examining G

The neighborhood of radius r of a random edge is loop-free with probability approaching 1 as α→∞.

Density Evolution on G

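For each edge (c,v) in G, compute:

$P_{M_{c,v}^{(i)}} = \mathop{\ast}_{v'|c,\, v' \neq v} P_{B'(M_{v',c}^{(i-1)})}$

and:

$P_{M_{v,c}^{(i)}} = P_{\log(\lambda)} \ast \left( \mathop{\ast}_{c'|v,\, c' \neq c} P_{B'(M_{c',v}^{(i)})} \right)$

Here $\lambda$ is the channel likelihood ratio and $\ast$ denotes convolution of densities.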

Density Evolution without convolution

One-dimensional approximation to D.E, which requires:

A statistic that is approximately additive for check nodes

A statistic that is approximately additive for variable nodes

A way to go between these two statistics

A way to characterize the message distribution from the channel

Optimizing a Proto Graph using Simulated Annealing

Simulated Annealing is an iterative algorithm that approximately minimizes an energy function

Requirements:

A space S over which to find the optimum point

An energy function E(s):S→R

A random perturbation function p(s):S→S

A "temperature profile" t(i)

Optimization Space

Graphs with a fixed number of variable and check nodes (rate is fixed)

Optionally, we can add untransmitted (state) variables to the code

Typical Parameters:

32 transmitted variables

5 untransmitted variables

21 parity checks

Energy function

Ideal: density evolution threshold.

Practical:

Approximate density evolution threshold

Number of iterations to converge to fixed error probability at fixed SNR

Perturbations

Types of operation:

Add an edge

Delete an edge

Swap two edges

Note: Edge swapping operation not necessary to span the space

Basic Simulated Annealing Algorithm

Take s0 = a random point in S

For each iteration i, define si' = p(si)

if E(si') < E(si) set si+1 = si'

if E(si') > E(si) set si+1 = si' w.p. $e^{-(E(s_i') - E(s_i))/t(i)}$, and si+1 = si otherwise
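
The loop above translates almost line for line into code. The sketch below is generic and makes several assumptions that are not from the talk: the toy search space (integers), the energy function, the perturbation, and the temperature profile are placeholders, and the routine additionally remembers the best state visited. In the talk's setting the state would be a proto-graph and the energy an approximate density evolution threshold.

```python
import math
import random

def simulated_annealing(s0, energy, perturb, temperature, iterations, rng):
    """Basic simulated annealing; returns the best state seen and its energy."""
    s, e = s0, energy(s0)
    best, best_e = s, e
    for i in range(iterations):
        s_new = perturb(s, rng)
        e_new = energy(s_new)
        # Always accept improvements; accept a worse state with
        # probability exp(-(E(s') - E(s)) / t(i)).
        if e_new < e or rng.random() < math.exp(-(e_new - e) / temperature(i)):
            s, e = s_new, e_new
            if e < best_e:
                best, best_e = s, e
    return best, best_e

# Toy problem (hypothetical): minimize (x - 3)^2 over the integers.
best, best_e = simulated_annealing(
    s0=20,
    energy=lambda x: (x - 3) ** 2,
    perturb=lambda x, rng: x + rng.choice((-1, 1)),
    temperature=lambda i: 1.0 / (1 + i),
    iterations=2000,
    rng=random.Random(0),
)
print(best, best_e)
```

The temperature profile controls how often uphill moves are accepted; with the rapidly cooling profile used here the search behaves almost greedily, which is enough for a toy problem with a single minimum but would likely be too aggressive for graph optimization.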

Degree Profile of Optimized Code

The optimized graph has a large fraction of degree 1 variables

Check nodes range from degree 3 to degree 8

(recall that the graph is not defined by the degree profile)

[Figure: bar chart over degrees 1 to 8, vertical axis 0 to 12, with series for variable nodes and check nodes.]

Threshold vs. Complexity

Designed codes of rate .5 with threshold 8 mB from channel capacity on AWGN channel

Low complexity (maximum degree = 8)

Improvement vs. Error Fraction Comparison to Regular (3,6)

The regular (3,6) code has a dramatic bottleneck.

The irregular code with maximum degree 10 is flatter, but has a bottleneck.

The optimized proto-graph based code is nearly flat for a long stretch.

Simulation Results

n=8192, k=4096

Achieves bit error rate of about 4×10^-4 at SNR=0.8 dB.

Beats the performance of n=10000 code in [1] by a small margin.

There is evidence that there is an error floor

Review

We introduced the idea of LDPC graphs based on a proto-graph

We designed proto-graphs using the Simulated Annealing algorithm, using a fast approximation to density evolution

The design handily beats other published codes of similar maximum degree

Open Questions

What's the ultimate limit to the performance vs. maximum degree tradeoff?

Can we find a way to achieve the same tradeoff without randomized algorithms?

Why do optimizing distributions sometimes force the codes to have low-weight codewords?

A Big Question

Can we derive the Shannon limit in the context of MP decoding of LDPC codes, so that we can meet the inequalities with equality?

Free Parameters within S.A.

Rate

Maximum check, variable degrees

Proto-graph size

Fraction of untransmitted variables

Channel Parameter (SNR)

Number of iterations in Simulated Annealing

Performance of Designed MET Codes

Shows performance competitive with best published codes

Block error probability < 10^-5 at 1.2 dB

A soft error floor is observed at very high SNR, but not due to low-weight codewords

Multi-edge-type construction

Edges of a particular "color" are connected through a permutation.

Edges become specialized. Each edge type has a different message distribution each iteration.

MET D.E. vs. decoder simulation