8/6/2019 DEX-SMI-wadayama (2)

    Optimization Approach to Bit Flipping Algorithms

    for Decoding LDPC Codes

    Tadashi Wadayama

    Nagoya Institute of Technology, Gokiso, Showa-ku, Naogya, Aichi, Japan, 466-8555

    E-mail: [email protected]

1. Introduction

A graphical model (or Markov random field (MRF) model) can be considered as a probabilistic model which captures the behavior of a system composed of many objects with mutual interaction. Graphical models and related inference problems are becoming a solid basis for inter-field research activities spanning statistics, statistical physics, and information theory.

Typical computational problems related to a graphical model (or MRF) can be classified into the following three categories: (i) evaluation of the log partition function (free energy, cumulant generating function); (ii) evaluation of marginal probabilities; (iii) evaluation of the mode (maximum assignment) of the Gibbs distribution corresponding to the graphical model.

If the graph is a tree, all three problems shown above can be efficiently solved with belief propagation (BP). Unfortunately, if the graph is not a tree, those problems become computationally intractable when the number of nodes is large. Many approximate algorithms for these problems have been developed so far, such as mean field approximation, loopy BP, and MCMC. A fast approximate algorithm with sufficient accuracy is important not only from a theoretical point of view but also from an engineering point of view, because it yields practical applications in fields where statistical inference plays an important role. Decoding of low-density parity-check (LDPC) codes [5, 6] is a notable example; loopy BP, which approximately solves problem (ii), provides powerful error correction.

Obtaining a deep understanding of the behavior of such approximate algorithms may be necessary both to improve an algorithm and to develop new related algorithms. In many cases, an approximate algorithm can be regarded as a numerical minimization algorithm for an underlying continuous optimization problem. The hidden objective function to be minimized provides a clue to understanding the behavior of the algorithm.

In the case of loopy BP, Yedidia et al. [1] unveiled that the objective function associated with the BP rule is the Bethe free energy. They proved that the BP rules can be naturally derived from the stationarity condition on the Bethe free energy. Once the objective function was recognized, it triggered the development of algorithms which directly minimize the Bethe free energy, such as the CCCP algorithm by Yuille [2].

Another example is the variational formulation of the log partition function discussed in the paper by Wainwright and Jordan [3]. They showed that relaxed optimization problems induced from the original variational problem equivalent to problem (i) lead to some known approximate methods, such as mean field approximation and loopy BP, and also to a new class of approximate algorithms for bounding the log partition function based on linear programming and semi-definite programming.


The aim of this tutorial is to show that knowing an underlying objective function brings us a new perspective for understanding a class of approximate algorithms, and that the optimization viewpoint is useful for designing new algorithms. To make the discussion concrete and simple enough, we will focus on a specific topic here: a class of decoding algorithms for binary linear codes called bit flipping (BF) algorithms. The BF algorithms are very simple to describe (actually much simpler than loopy BP decoding and its variants) and easy to understand without detailed knowledge of coding theory. Recently, an underlying optimization problem for BF algorithms was identified in [4]. Therefore, it appears a suitable topic for pursuing the aim presented above.

2. Preliminaries

In this section, the background and notation required for this paper are first introduced, and then a brief review of known BF algorithms is given.

2.1. BF algorithms

BF algorithms for decoding LDPC codes have been extensively studied, and many variants such as weighted BF (WBF) [7], modified weighted BF (MWBF) [8], and others [9, 10, 11] have been proposed. The first BF algorithm was developed by Gallager [5]. In a decoding process of Gallager's algorithm, some unreliable bits (in a binary quantized received word) corresponding to unsatisfied parity checks are flipped in each iteration. The successors of Gallager's BF algorithm inherit its basic strategy: find one (or some) unreliable bit(s) and then flip it (them). Although the bit error rate (BER) performance of a BF algorithm is in general inferior to that of the sum-product algorithm or the min-sum algorithm, BF algorithms enable us to design a much simpler decoder, which is easier to implement.

2.2. Notation

Let H be a binary m × n parity check matrix, where n > m ≥ 1. The binary linear code C is defined by C = {c ∈ F₂ⁿ : Hc = 0}, where F₂ denotes the binary Galois field. In this paper, a vector is assumed to be a column vector. For convenience, we introduce the bipolar code C̃ corresponding to C as follows:

C̃ = {(1 − 2c1, 1 − 2c2, . . . , 1 − 2cn) : c ∈ C}. (1)

Namely, C̃, which is a subset of {+1, −1}ⁿ, is obtained from C by the binary (0, 1) to bipolar (+1, −1) conversion.

The binary-input AWGN channel is assumed in this paper, which is defined by y = c̃ + z, where c̃ ∈ C̃. The vector z = (z1, . . . , zn) is a white Gaussian noise vector, where zj (j ∈ [1, n]) is an i.i.d. Gaussian random variable with zero mean and variance σ². The notation [a, b] denotes the set of consecutive integers from a to b.

The notation for index sets associated with H is required for describing BF algorithms. Let

N(i) = {j ∈ [1, n] : hij = 1}, i ∈ [1, m], (2)

M(j) = {i ∈ [1, m] : hij = 1}, j ∈ [1, n], (3)

where hij is the ij-element of the parity check matrix H. Using this notation, we can write the parity condition in this way:

∀i ∈ [1, m], ∏_{j∈N(i)} xj = 1, (4)

which is equivalent to (x1, . . . , xn) ∈ C̃. The value ∏_{j∈N(i)} xj ∈ {+1, −1} is called the i-th bipolar syndrome of x.
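As a concrete illustration, the bipolar syndromes of (4) can be computed directly from H and a bipolar vector x. The following sketch is illustrative only: the small parity check matrix and the vectors are toy assumptions, not taken from the paper.

```python
import numpy as np

def bipolar_syndromes(H, x):
    """i-th bipolar syndrome of x: product of x_j over j in N(i), per check i."""
    return np.array([np.prod(x[H[i] == 1]) for i in range(H.shape[0])])

H = np.array([[1, 1, 0, 1, 0, 0],
              [0, 1, 1, 0, 1, 0],
              [1, 0, 1, 0, 0, 1]])
c = np.zeros(6, dtype=int)        # the all-zero binary codeword of this toy code
x = 1 - 2 * c                     # binary-to-bipolar conversion of eq. (1): all +1
print(bipolar_syndromes(H, x))    # all checks satisfied: [1 1 1]
x[0] = -1                         # corrupt one bit
print(bipolar_syndromes(H, x))    # the two checks containing bit 0 now fail
```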


2.3. Known BF algorithms

A number of variants of BF algorithms have been developed so far. We can classify them into two classes: single bit flipping (single BF) algorithms and multiple bit flipping (multi BF) algorithms. In a decoding process of a single BF algorithm, only one bit is flipped per iteration according to the bit flipping rule. On the other hand, a multi BF algorithm allows multiple bits to be flipped per iteration. In general, a multi BF algorithm shows faster convergence than a single BF algorithm, but it suffers from oscillation of the decoder state, which is not easy to control.

    The framework of the single BF algorithms is summarized as follows:

    Single BF algorithm

Step 1 (Hard decision) For j ∈ [1, n], let xj := sign(yj), where

sign(x) = +1 if x ≥ 0, −1 if x < 0. (5)

Let x = (x1, x2, . . . , xn).

Step 2 (Parity check) If the parity equation ∏_{j∈N(i)} xj = 1 holds for all i ∈ [1, m], output x and then exit.

Step 3 (Bit flip) Find the flip position given by

ℓ := arg min_{k∈[1,n]} Δk(x), (6)

and then flip the symbol: xℓ := −xℓ. The function Δk(x) is called an inversion function.

Step 4 (Check on number of iterations) If the number of iterations is under the maximum number of iterations Lmax, return to Step 2; otherwise output x and exit.

In a decoding process of a single BF algorithm, hard decision decoding for a given y is first performed and x is initialized to the hard decision result. Then, the minimum of the inversion function Δk(x) for k ∈ [1, n] is found¹. An inversion function Δk(x) can be seen as a measure of the invalidness of the bit assignment on xk. The bit xℓ, where ℓ gives the smallest value of the inversion function, is then flipped.
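The framework above can be sketched as a short routine parameterized by the inversion function. Everything concrete here (the toy parity check matrix, the received vector, and the correlation-plus-syndrome inversion function used for the demonstration) is an illustrative assumption, not taken from the paper.

```python
import numpy as np

def single_bf_decode(H, y, inv_fn, L_max=100):
    """Generic single BF decoder; inv_fn(H, y, x) returns the n inversion
    function values Delta_k(x), and the bit minimizing them is flipped."""
    x = np.where(y >= 0, 1, -1)                      # Step 1: hard decision
    for _ in range(L_max):                           # Step 4: iteration cap
        syn = np.array([np.prod(x[H[i] == 1]) for i in range(H.shape[0])])
        if np.all(syn == 1):                         # Step 2: parity check
            return x, True
        x[np.argmin(inv_fn(H, y, x))] *= -1          # Step 3: single flip
    return x, False

# Illustrative inversion function: bit correlation plus the sum of the
# bipolar syndromes of the checks containing bit k.
def demo_inv(H, y, x):
    syn = np.array([np.prod(x[H[i] == 1]) for i in range(H.shape[0])])
    return x * y + H.T.astype(float) @ syn

H = np.array([[1, 1, 0, 1, 0, 0],
              [0, 1, 1, 0, 1, 0],
              [1, 0, 1, 0, 0, 1]])
y = np.array([0.9, 0.8, -0.2, 1.1, 0.7, 1.0])  # all-(+1) word, bit 2 corrupted
x_hat, ok = single_bf_decode(H, y, demo_inv)
print(ok, x_hat)                               # decoding succeeds: all +1
```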

The inversion function of WBF [7] is defined by

Δk^(WBF)(x) = ∑_{i∈M(k)} βi ∏_{j∈N(i)} xj. (7)

The values βi (i ∈ [1, m]) are the reliabilities of the bipolar syndromes, defined by βi = min_{j∈N(i)} |yj|. In this case, the inversion function Δk^(WBF)(x) gives the measure of invalidness of the symbol assignment on xk, which is given by the sum of the weighted bipolar syndromes.

The inversion function of MWBF [8] has a form similar to that of WBF, but it contains a term corresponding to a received symbol. The inversion function of MWBF is given by

Δk^(MWBF)(x) = α|yk| + ∑_{i∈M(k)} βi ∏_{j∈N(i)} xj, (8)

where the parameter α is a positive real number.
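Under the same notation, (7) and (8) can be written as one routine: with α = 0 it reduces to the WBF inversion function, and α > 0 gives MWBF. This is a sketch; the toy H and y below are illustrative assumptions, not data from the paper.

```python
import numpy as np

def wbf_inversion(H, y, x, alpha=0.0):
    """Inversion functions (7)/(8): alpha = 0 gives WBF, alpha > 0 gives MWBF."""
    # beta_i = min_{j in N(i)} |y_j|: reliability of the i-th bipolar syndrome.
    beta = np.array([np.min(np.abs(y[H[i] == 1])) for i in range(H.shape[0])])
    syn = np.array([np.prod(x[H[i] == 1]) for i in range(H.shape[0])])
    # For each k, H.T sums the weighted syndromes over the checks in M(k).
    return alpha * np.abs(y) + H.T.astype(float) @ (beta * syn)

H = np.array([[1, 1, 0, 1, 0, 0],
              [0, 1, 1, 0, 1, 0],
              [1, 0, 1, 0, 0, 1]])
y = np.array([0.9, 0.8, -0.2, 1.1, 0.7, 1.0])
x = np.where(y >= 0, 1, -1)        # hard decision; bit 2 disagrees with checks
print(np.argmin(wbf_inversion(H, y, x)))             # WBF picks bit 2 to flip
print(np.argmin(wbf_inversion(H, y, x, alpha=0.2)))  # MWBF agrees here
```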

¹ When Δk(x) is an integer-valued function, we need a tie-breaking rule to resolve ties.


3. Gradient descent formulation of BF algorithms

It looks natural to regard the dynamics of a BF algorithm as a minimization process of a hidden objective function. This observation leads to a gradient descent formulation of BF algorithms.

3.1. Objective function

The maximum likelihood (ML) decoding problem for the binary-input AWGN channel is equivalent to the problem of finding a (bipolar) codeword in C̃ which has the largest correlation with a given received word y. Namely, the MLD rule can be written as²

x̂ = arg min_{x∈C̃} ∑_{j=1}^{n} (yj − xj)² = arg max_{x∈C̃} ∑_{j=1}^{n} xj yj. (9)

Based on this correlation decoding rule, we here define the following objective function:

f(x) = ∑_{j=1}^{n} xj yj + ∑_{i=1}^{m} ∏_{j∈N(i)} xj. (10)

The first term of the objective function corresponds to the correlation between a bipolar codeword and the received word, which should be maximized. The second term is the sum of the bipolar syndromes of x. If and only if x ∈ C̃, the second term attains its maximum value ∑_{i=1}^{m} ∏_{j∈N(i)} xj = m. Thus, this term can be considered as a penalty term which forces x to be a valid codeword. Note that this objective function is a non-linear function with many local maxima. These local maxima become a major source of the sub-optimality of the GD-BF algorithm presented later.
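The objective function (10) is straightforward to evaluate. A small numerical check (with an illustrative H and y, not from the paper) confirms that the penalty term attains its maximum m exactly on bipolar codewords:

```python
import numpy as np

def objective(H, y, x):
    """f(x) of eq. (10): correlation with y plus the sum of bipolar syndromes."""
    syn = np.array([np.prod(x[H[i] == 1]) for i in range(H.shape[0])])
    return float(x @ y + np.sum(syn))

H = np.array([[1, 1, 0, 1, 0, 0],
              [0, 1, 1, 0, 1, 0],
              [1, 0, 1, 0, 0, 1]])
y = np.array([0.9, 0.8, -0.2, 1.1, 0.7, 1.0])
x = np.ones(6, dtype=int)          # a bipolar codeword (all-zero binary word)
print(objective(H, y, x))          # penalty term attains its maximum m = 3
x[2] = -1                          # leave the code: two syndromes become -1
print(objective(H, y, x))          # penalty term drops to 1 - 1 - 1 = -1
```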

Based on the objective function f(x), we can define the following approximate version of the MLD rule.

Definition 1 (Approximate MLD rule) The approximate MLD rule for the binary-input AWGN channel is defined by

x̂ = arg max_{x∈{+1,−1}ⁿ} f(x). (11)

The optimization problem defined in (11) is a discrete optimization problem, but the underlying continuous optimization problem

max f(x) subject to x ∈ Rⁿ (12)

gives us a clue for solving the combinatorial optimization problem (11).

3.2. Gradient descent BF algorithm

For a numerical optimization problem for a differentiable function such as (12), the gradient descent method is a natural first choice. The partial derivative of f(x) with respect to the variable xk (k ∈ [1, n]) can be immediately derived from the definition of f(x):

∂f(x)/∂xk = yk + ∑_{i∈M(k)} ∏_{j∈N(i)\{k}} xj. (13)

² This decoding rule is well known as correlation decoding.


Let us consider the product of xk and the partial derivative of f(x) with respect to xk, namely

xk ∂f(x)/∂xk = xk yk + ∑_{i∈M(k)} ∏_{j∈N(i)} xj. (14)

For a small real number s, we have the first order approximation:

f(x1, . . . , xk + s, . . . , xn) ≃ f(x) + s ∂f(x)/∂xk. (15)

When ∂f(x)/∂xk > 0, we need to choose s > 0 in order to have

f(x1, . . . , xk + s, . . . , xn) > f(x). (16)

On the other hand, if ∂f(x)/∂xk < 0 holds, we should choose s < 0 to obtain the inequality (16). Therefore, if xk ∂f(x)/∂xk < 0, then flipping the k-th symbol (xk := −xk) may increase the objective function value³. Figure 1 shows an example. The upper figure shows the polarity of xk (k ∈ [1, n]) and the lower figure represents the gradient at x. In the dotted circles (indexed by 2, j, n), xk and the corresponding partial derivative have different signs. Thus, flipping one of those bits can increase the objective function value.

[Figure omitted: the search point (polarities of x1, . . . , xn) shown above the corresponding gradient vector components.]

Figure 1. Gradient and flip position

One reasonable way to find a flipping position is to choose the position where the absolute value of the partial derivative is largest. This flipping rule is closely related to the steepest descent algorithm based on the 1-norm (also known as the coordinate descent algorithm) [13]. According to this observation, we have the following rule to choose the flipping position.

Definition 2 (Inversion function of GD-BF algorithm [4]) The single BF algorithm based on the inversion function

Δk^(GD)(x) = xk yk + ∑_{i∈M(k)} ∏_{j∈N(i)} xj (17)

is called the gradient descent BF (GD-BF) algorithm.
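In matrix form, (17) can be evaluated for all k at once, since Hᵀ applied to the syndrome vector sums the bipolar syndromes over M(k). A sketch with an illustrative H and y (toy values, not from the paper):

```python
import numpy as np

def gd_inversion(H, y, x):
    """Delta_k^(GD)(x) = x_k y_k + sum over i in M(k) of the i-th bipolar
    syndrome, eq. (17), evaluated for all k simultaneously."""
    syn = np.array([np.prod(x[H[i] == 1]) for i in range(H.shape[0])])
    return x * y + H.T.astype(float) @ syn

H = np.array([[1, 1, 0, 1, 0, 0],
              [0, 1, 1, 0, 1, 0],
              [1, 0, 1, 0, 0, 1]])
y = np.array([0.9, 0.8, -0.2, 1.1, 0.7, 1.0])
x = np.where(y >= 0, 1, -1)          # hard decision: bit 2 is -1
delta = gd_inversion(H, y, x)
print(np.argmin(delta))              # bit 2 has the most negative value
```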

³ There is a possibility that the objective function value decreases, because the step size is fixed (a single flip).


Thus, a decoding process of the GD-BF algorithm can be seen as a minimization process of −f(x) (which can be considered the energy of the system) based on a bit-flipping gradient descent method.

It is interesting to see that the combination of the objective function f(x) defined by

f(x) = ∑_{j=1}^{n} xj |yj| + ∑_{i=1}^{m} βi ∏_{j∈N(i)} xj (18)

and the gradient descent argument presented above gives the inversion functions of the conventional algorithms such as the WBF algorithm (7) and the MWBF algorithm (8). However, this objective function (18) looks less meaningful compared with the objective function (10). In other words, the inversion function Δk^(GD)(x) defined in (17) has a more natural interpretation than those of the conventional algorithms: Δk^(WBF)(x) in (7) and Δk^(MWBF)(x) in (8). Actually, the new inversion function Δk^(GD)(x) is not only natural but also effective in terms of bit error performance and convergence speed.

3.3. Multi GD-BF algorithm

A decoding process of the GD-BF algorithm can be regarded as a maximization process of the objective function (10) in a gradient descent manner. Thus, we can utilize the objective function value to track the convergence behavior; for example, it is possible to monitor the value of the objective function at each iteration. In the first several iterations, the value increases as the iterations proceed. However, the value eventually stops increasing when the search point arrives at the point in {+1, −1}ⁿ nearest to a local maximum of the objective function. We can easily detect such convergence to a local maximum by observing the value of the objective function.

The BF algorithms reviewed in the previous section and the GD-BF algorithm flip only one bit per iteration. In terms of numerical optimization, in these algorithms a search point moves towards a local maximum with a very small step (i.e., a 1-bit flip) in order to avoid oscillation around the local maximum (see Fig. 2 (A)). However, the small step size leads to slower convergence to a local maximum. In general, compared with the min-sum algorithm, BF algorithms (single flip per iteration) require a larger number of iterations to achieve the same bit error probability.

A multi bit flipping algorithm is expected to give faster convergence than a single bit flipping algorithm because of its larger step size. However, if the search point is close to a local maximum, a fixed large step is not suitable for finding the (near) local maximum point; it leads to oscillation of a multi-bit flipping BF algorithm (Fig. 2 (B)). We need to adjust the step size dynamically from large to small during the optimization process (Fig. 2 (C)).

The objective function is a useful guideline for adjusting the step size (i.e., the number of flipped bits). The multi GD-BF algorithm is the GD-BF algorithm including this multi-bit flipping idea [4].

In the following, we assume the inversion function Δk^(GD)(x) defined by (17) (the inversion function for the GD-BF algorithm). The flow of the multi GD-BF algorithm is almost the same as that of the GD-BF algorithm previously presented. To distinguish the two decoding algorithms clearly, the GD-BF algorithm presented in the previous subsection is called the single GD-BF algorithm if necessary.

In order to define the multi GD-BF algorithm, we need to introduce new parameters θ and μ. The parameter θ is a negative real number, which is called the inversion threshold. The binary (0 or 1) variable μ, which is called the mode flag, is set to 0 at the beginning of a decoding process. Step 3 of the BF algorithm should be replaced with the following multi-bit flipping procedure.

Step 3 (multi-bit flipping) Evaluate the value of the objective function and let f1 := f(x). If μ = 0 then execute sub-step 3-1, else execute sub-step 3-2.


[Figure omitted: (A) single flipping, converging to the local maximum but slow; (B) multiple flipping with a fixed step, oscillating around the local maximum; (C) multiple flipping with a dynamic step. (A) converging but slow, (B) not converging but fast, (C) converging and fast.]

Figure 2. Convergence behavior

3-1 (multi-bit mode) Flip all the bits satisfying Δk^(GD)(x) < θ (k ∈ [1, n]). Evaluate the value of the objective function again and let f2 := f(x). If f1 > f2 holds, then let μ := 1.

3-2 (single-bit mode) Flip the single bit at position arg min_{k∈[1,n]} Δk^(GD)(x).

Usually, at the beginning of a decoding process, the objective function value increases with the number of iterations in multi-bit mode; namely, f1 < f2 holds for the first few iterations. When the search point eventually arrives at a point satisfying f1 > f2, the bit flipping mode is changed from multi-bit mode (μ = 0) to single-bit mode (μ = 1). This mode change amounts to an adjustment of the step size; it helps a search point converge to a local maximum when the search point is located close to that local maximum.
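Putting the mode switch together with the GD-BF loop gives the following sketch. The stopping logic follows the steps above; the toy H, y, and the threshold value are illustrative assumptions, not parameters from the paper.

```python
import numpy as np

def multi_gdbf_decode(H, y, theta, L_max=100):
    """Multi GD-BF sketch: multi-bit mode (mu = 0) until the objective value
    drops, then single-bit mode (mu = 1). theta is the negative threshold."""
    def syndromes(x):
        return np.array([np.prod(x[H[i] == 1]) for i in range(H.shape[0])])
    def f(x):                                    # objective (10)
        return x @ y + np.sum(syndromes(x))
    def delta(x):                                # inversion function (17)
        return x * y + H.T.astype(float) @ syndromes(x)
    x = np.where(y >= 0, 1, -1)                  # Step 1: hard decision
    mu = 0                                       # mode flag
    for _ in range(L_max):
        if np.all(syndromes(x) == 1):            # Step 2: parity check
            return x, True
        f1 = f(x)                                # Step 3: evaluate objective
        if mu == 0:
            x[delta(x) < theta] *= -1            # 3-1: flip all bits below theta
            if f(x) < f1:                        # f1 > f2: shrink the step
                mu = 1
        else:
            x[np.argmin(delta(x))] *= -1         # 3-2: single flip
    return x, False

H = np.array([[1, 1, 0, 1, 0, 0],
              [0, 1, 1, 0, 1, 0],
              [1, 0, 1, 0, 0, 1]])
y = np.array([0.9, 0.8, -0.2, 1.1, 0.7, 1.0])
x_hat, ok = multi_gdbf_decode(H, y, theta=-1.0)
print(ok, x_hat)
```

A production version would also need to handle the corner case where no inversion value falls below θ in multi-bit mode.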

4. Behavior of GD-BF algorithms

In this section, the behavior and decoding performance of the (single and multi) GD-BF algorithms obtained from computer simulations are presented.

Figure 3 presents objective function values (10) as a function of the number of iterations in single and multi GD-BF processes. The code used in this experiment is a regular LDPC code with n = 1008, m = 504 (called PEGReg504x1008 in [14]). The column weight of the code is 3. In both cases (single and multi), we tested the same noise instance, and both algorithms output the correct codeword (i.e., successful decoding).

In the case of the single GD-BF algorithm, the objective function value gradually increases over the first 50–60 iterations. After this slope, the increase of the objective function value eventually stops and a flat part, which corresponds to a local maximum, appears. In the flat part of the curve, oscillation of the objective function value can be seen. Due to the constraint that a search point x must lie in {+1, −1}ⁿ, a GD-BF process cannot find a true local maximum point (a point where the gradient of the objective function becomes the zero vector). Thus, the search point moves around the local maximum point; this causes the oscillation observed in a single GD-BF process. The curve corresponding to the multi GD-BF algorithm shows much faster convergence than the single GD-BF algorithm. It takes only 15 iterations for the search point to come very close to the local maximum point.

Figure 4 presents the bit error curves of the single and multi GD-BF algorithms (Lmax = 100, θ = −0.6). As references, the curves for the WBF algorithm (Lmax = 100), the MWBF algorithm


[Plot omitted: value of the objective function (approx. 1250–1550) versus number of iterations (0–100) for multi GD-BF and single GD-BF; n = 1008, m = 504, regular (PEGReg504x1008 [14]), SNR = 4 dB.]

Figure 3. Objective function values in GD-BF processes as a function of the number of iterations

(Lmax = 100, α = 0.2), and the normalized min-sum algorithm (Lmax = 5, scale factor 0.8) are included as well. The parameter Lmax denotes the maximum number of iterations for each algorithm. The code used in the simulation is, again, PEGReg504x1008 (n = 1008, m = 504). We can see that the GD-BF algorithms perform much better than the WBF and MWBF algorithms. For example, at BER = 10⁻⁶, the multi GD-BF algorithm offers approximately a 1.6 dB gain compared with the MWBF algorithm. Compared with the single GD-BF algorithm, the multi GD-BF algorithm shows a steeper slope in its error curve. Unfortunately, there is still a large performance gap between the error curves of the normalized min-sum algorithm and the GD-BF algorithms. A GD-BF algorithm fails to decode when a search point is attracted to an undesirable local maximum of the objective function. This large performance gap suggests the existence of some local maxima relatively close to a bipolar codeword which degrade the BER performance.

In order to evaluate the convergence speed of BF algorithms, the average number of iterations is an appropriate measure. Figure 5 presents the average number of iterations (as a function of SNR) of the GD-BF algorithms (single and multi) and the WBF and MWBF algorithms. The code is the regular code PEGReg504x1008 (n = 1008, m = 504). Firstly, it can be pointed out that the multi GD-BF algorithm certainly has the fast convergence property; large gaps between the curve of the multi GD-BF algorithm and the other curves can be observed. This result suggests that the BER performance of the multi GD-BF algorithm overwhelms those of the other algorithms if Lmax is limited to a small number such as 10.

5. Escape from a local maximum

As we have discussed, a decoding failure occurs when a search point is captured by a local maximum which is not a transmitted codeword. Thus, it is desirable to know the effect of such local maxima. Figure 6 presents three trajectories of the weight and syndrome weight of a search point in three decoding processes corresponding to decoding failures. The weight of a search point x is defined by w1(x) = |{j ∈ [1, n] : xj = −1}|. In a similar way, the syndrome weight of x is given by w2(x) = |{i ∈ [1, m] : ∏_{j∈N(i)} xj = −1}|. We assume that the all-(+1) bipolar codeword (i.e., the all-zero binary codeword) is transmitted, without loss of generality.
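These two statistics are easy to compute for a given search point; the small H below is an illustrative toy matrix, not the code used in the experiments.

```python
import numpy as np

def w1(x):
    """Weight of the search point: number of -1 entries."""
    return int(np.sum(x == -1))

def w2(H, x):
    """Syndrome weight: number of checks whose bipolar syndrome is -1."""
    syn = np.array([np.prod(x[H[i] == 1]) for i in range(H.shape[0])])
    return int(np.sum(syn == -1))

H = np.array([[1, 1, 0, 1, 0, 0],
              [0, 1, 1, 0, 1, 0],
              [1, 0, 1, 0, 0, 1]])
x = np.ones(6, dtype=int)
x[0] = -1                       # one bit away from the all-(+1) codeword
print(w1(x), w2(H, x))          # weight 1; the two checks containing bit 0 fail
```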


[Plot omitted: bit error rate (10⁻⁷ to 10⁰) versus SNR (2–8 dB) for WBF (Lmax = 100), MWBF (Lmax = 100), single GD-BF (Lmax = 100), multi GD-BF (Lmax = 100), and normalized min-sum (Lmax = 5); n = 1008, m = 504, regular (PEGReg504x1008 [14]).]

Figure 4. Bit error rate of GD-BF algorithms: regular LDPC code

[Plot omitted: average number of iterations (0–100) versus SNR (2–6.5 dB) for WBF, MWBF, single GD-BF, and multi GD-BF; n = 1008, m = 504, regular (PEGReg504x1008 [14]), Lmax = 100.]

Figure 5. Average number of iterations

We can obtain the following observations from Fig. 6: (i) a decoding process starts from a position where both w1(x) and w2(x) are large; (ii) w1(x) and w2(x) decrease as the iterations proceed; (iii) the final states of the search point have relatively small values of w1(x) and w2(x).

From these observations, we may conjecture that a search point is finally trapped, with high probability, by a local maximum close to a near codeword⁴. Near codewords [15] are words in {+1, −1}ⁿ which have both small weight and small syndrome weight. Since the weight

⁴ Note that other experiments also support this conjecture.


[Plot omitted: syndrome weight of x (0–100) versus weight of x (0–40), three trajectories; n = 1008, m = 504, regular (PEGReg504x1008 [14]).]

Figure 6. Trajectories of weight and syndrome weight of search points

of a final position of a search point is so small, a small perturbation of a captured search point appears to be helpful for the search point to escape from the undesirable local maximum.

One of the simplest ways to perturb a trapped search point is to switch from single-bit mode to multi-bit mode forcibly when the number of iterations reaches a pre-defined number. This additional process is called the escape process. Figure 7 shows the BER curve of such a decoding algorithm (labeled multi GD-BF with escape). In this decoding algorithm, a decoder changes its flipping mode (to multi-bit) when ℓ = 25, 75, 125, . . . (ℓ denotes the number of iterations). The parameter θ = −0.8 is used in the first multi-bit mode and θ = −1.7 is used in the other cases (when ℓ = 25, 50, . . .).

We can see that the BER curve of multi GD-BF with escape is much steeper than that of the naive multi GD-BF algorithm. This result implies that perturbation can actually help some trapped search points converge to the desirable local maximum corresponding to the transmitted codeword. It is an interesting open problem to optimize the flipping schedule to narrow the gap between the min-sum BER curve and the GD-BF BER curve.

6. Conclusion

A distinct advantage of GD-BF algorithms over the conventional BF algorithms is their formulation. GD-BF algorithms are derived from simple non-linear objective functions. A decoding process of a GD-BF algorithm can be regarded as a maximization process of the objective function using a bit-flipping gradient descent method (i.e., bit-flipping dynamics which minimize the energy −f(x)). This property has been fully utilized to design the multi bit flipping GD-BF algorithm, which offers very fast (maybe the fastest among known BF algorithms) convergence. The gradient descent formulation would also be useful for designing new BF algorithms for non-AWGN channels such as channels with memory. Furthermore, the optimization viewpoint brought us a new way to understand the convergence behavior of BF algorithms. One lesson we have learned from this work is that fine control of the flipping schedule is indispensable for improving decoding performance.

In this paper, we have discussed BF algorithms for decoding LDPC codes as an example showing the importance of knowing the underlying objective function. In the field of


[Plot omitted: bit error rate (10⁻⁷ to 10⁰) versus SNR (1–7 dB) for multi GD-BF with escape (Lmax = 300), multi GD-BF (Lmax = 100), normalized min-sum (Lmax = 5), and normalized min-sum (Lmax = 100); n = 1008, m = 504, regular (PEGReg504x1008 [14]).]

Figure 7. Bit error rate of GD-BF algorithm with escape process

non-linear optimization, many classes of optimization algorithms for non-linear functions have been developed, such as the gradient descent method, the Newton method, and the interior point method for convex functions [13]. These sophisticated optimization algorithms may improve a known approximate algorithm in both convergence speed and accuracy. This optimization approach still appears promising for many classes of problems and deserves to be studied in more depth. Furthermore, once a non-linear optimization based formulation is established, a necessary (and possibly sufficient) condition for global (or local) optimality, such as the Karush-Kuhn-Tucker condition, characterizes the behavior of the approximate algorithm. Deeper understanding of the algorithm may come from such an analysis.

Acknowledgement

The author would like to express his appreciation to Masayuki Yagita, Keisuke Nakamura (Nagoya Institute of Technology) and Yuuki Funahashi (Meijo University) for inspiring discussion and cooperative work on [4]. This work was supported by the Ministry of Education, Science, Sports and Culture, Japan, Grant-in-Aid for Scientific Research on Priority Areas (Deepening and Expansion of Statistical Informatics) 180790091, and partly supported by a research grant from SRC (Storage Research Consortium).

[1] J. S. Yedidia, W. T. Freeman, and Y. Weiss, Constructing free-energy approximations and generalized belief propagation algorithms, IEEE Trans. Inform. Theory, vol. 51, pp. 2282–2312, 2005.

[2] A. L. Yuille, CCCP algorithms to minimize the Bethe and Kikuchi free energies: convergent alternatives to belief propagation, Neural Computation, vol. 14, pp. 1691–1722, 2002.

[3] M. J. Wainwright and M. I. Jordan, Graphical models, exponential families, and variational inference, UC Berkeley, Dept. of Statistics, Technical Report 649, September 2003.

[4] T. Wadayama, M. Yagita, Y. Funahashi, S. Usami, and I. Takumi, Gradient descent bit flipping algorithms for decoding LDPC codes, arXiv:0711.0261, 2007.

[5] R. G. Gallager, Low-Density Parity-Check Codes, Research Monograph Series, Cambridge, MIT Press, 1963.

[6] D. J. C. MacKay, Good error-correcting codes based on very sparse matrices, IEEE Trans. Inform. Theory, vol. 45, pp. 399–431, 1999.

[7] Y. Kou, S. Lin, and M. P. C. Fossorier, Low-density parity-check codes based on finite geometries: a rediscovery and new results, IEEE Trans. Inform. Theory, vol. 47, pp. 2711–2736, Nov. 2001.

[8] J. Zhang and M. P. C. Fossorier, A modified weighted bit-flipping decoding of low-density parity-check codes, IEEE Communications Letters, vol. 8, pp. 165–167, Mar. 2004.

[9] M. Jiang, C. Zhao, Z. Shi, and Y. Chen, An improvement on the modified weighted bit flipping decoding algorithm for LDPC codes, IEEE Communications Letters, vol. 9, no. 9, pp. 814–816, 2005.

[10] F. Guo and L. Hanzo, Reliability ratio based weighted bit-flipping decoding for low-density parity-check codes, Electronics Letters, vol. 40, no. 21, pp. 1356–1358, 2004.

[11] C. H. Lee and W. Wolf, Implementation-efficient reliability ratio based weighted bit-flipping decoding for LDPC codes, Electronics Letters, vol. 41, no. 13, pp. 755–757, 2005.

[12] A. Nouh and A. H. Banihashemi, Bootstrap decoding of low-density parity check matrix, IEEE Communications Letters, vol. 6, no. 9, pp. 391–393, 2002.

[13] S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, 2004.

[14] D. J. C. MacKay, Encyclopedia of sparse graph codes, online: http://www.inference.phy.cam.ac.uk/mackay/codes/data.html

[15] D. J. C. MacKay and M. S. Postol, Weaknesses of Margulis and Ramanujan-Margulis low-density parity-check codes, Electronic Notes in Theoretical Computer Science, 2003.