Rate gains in block-coded modulation systems with interblock memory


IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 46, NO. 3, MAY 2000 851

Rate Gains in Block-Coded Modulation Systems with Interblock Memory

Chia-Ning Peng, Member, IEEE, Houshou Chen, Member, IEEE, John T. Coffey, Member, IEEE, and Richard G. C. Williams, Member, IEEE

Abstract—This paper examines the performance gains achievable by adding interblock memory to, and altering the mapping of coded bits to symbols in, block-coded modulation systems. The channel noise considered is additive Gaussian, and the twin design goals are to maximize the asymptotic coding gain and to minimize the number of apparent nearest neighbors. In the case of the additive white Gaussian noise channel, these goals translate into the design of block codes of a given weighted or 'normalized' distance whose rate is as high as possible, and whose number of codewords at minimum normalized distance is low.

The effect of designing codes for normalized distance rather than Hamming distance is to ease the problem of determining the best codes for given parameters in the cases of greatest interest, and many such best codes are given.

Index Terms—Block-coded modulation, code constructions, gluing, multilevel codes, normalized distance, split weight enumerators.

I. INTRODUCTION

BLOCK-coded modulation (BCM) systems derived from the multilevel construction method form a class of high-performance bandwidth-efficient communication schemes. This method provides a way to generate systematic constructions of block modulation codes with arbitrarily high asymptotic minimum Euclidean distance, by linking the design to the theory of binary block codes. The resulting codes can be decoded easily using decoders for the individual component binary block codes. These modulation codes were originally constructed by Imai and Hirakawa in 1977 [14], and thus appear at the very beginning of the coded modulation field. The codes can also be seen as a special case of more general constructions, such as the generalized concatenated code and related constructions, as in, e.g., Blokh and Zyablov [1], Zinov'ev [37], and Ginzburg [11]; these more general

Manuscript received October 30, 1998; revised September 7, 1999. This work was supported in part by the U.S. Army Research Office under Grant DAAH04-96-1-0377. The material in this paper was presented in part at the IEEE International Symposium on Information Theory, Ulm, Germany, June–July 1997, and at the Workshop on Coding and Cryptography, Paris, France, January 1999.

C.-N. Peng is with Texas Instruments Broadband Access Group, San Jose, CA 95124 USA (e-mail: [email protected]).

H. Chen is with the Department of Electrical Engineering, National Chi-Nan University, Nantou, Taiwan 545, R.O.C. (e-mail: [email protected]).

J. T. Coffey is with the Electrical Engineering and Computer Science Department, University of Michigan, Ann Arbor, MI 48109 USA (e-mail: [email protected]).

R. G. C. Williams is with 3-COM, San Diego, CA 92128 USA (e-mail: [email protected]).

Communicated by E. Soljanin, Associate Editor for Coding Techniques.
Publisher Item Identifier S 0018-9448(00)02891-1.

constructions are extensively discussed in the review chapter of Dumer [8]. The codes have been very widely studied since then [2]–[4], [6], [16]–[23], [25], [27], [30], [32]–[36]. We refer the reader interested in a review of the extensive literature on this subject to the articles of Kasami et al. [17], Huber et al. [13], and Williams [34].

We will treat the basic concepts of block-coded modulation schemes as standard, apart from a brief review in Section II. In summary, in the standard block-coded modulation scheme, a coded sequence of n symbols from a 2^ℓ-ary modulation scheme is obtained by using ℓ binary codes, each of length n, with an assignment of bit labels to signal constellation points obtained via a set-partitioning scheme. Each binary code involves bits of the same significance from the bit labels of the n transmitted symbols. Each transmitted symbol, conversely, has a bit label containing one bit from each of the ℓ component codes of length n.

The resulting minimum squared Euclidean distance can be shown to be min_i Δ_i d_i, where d_i is the minimum distance of the ith binary block code and Δ_i is obtained from the set-partitioning labeling scheme. Thus the minimum squared Euclidean distance is determined by the minimum of the distances of the component binary block codes weighted by the set-partitioning constants.

Two simple modifications to these systems proposed by Lin [19] (see also [20], [21], [23], and [35]), namely, the introduction of "interblock memory" combined with a staggered mapping of code bits to transmitted symbols, have the effect of transforming the code design problem into one in which the criterion of interest is a weighted or "normalized" Hamming distance of a single binary block code of length ℓn. Thus a codeword that has Hamming weight w_i in bits (i − 1)n + 1 to in for each i has normalized weight Σ_i Δ_i w_i; the minimum Euclidean distance of the overall coded modulation system will be shown to be governed by the minimum normalized distance of this code. Compared to regular BCM systems, the systems with interblock memory support substantially higher transmission rates with given probability of error. The main cost is a more complicated structure, and hence more expensive decoding.

This paper will derive the relationship between the motivating coded modulation problem and the normalized distance coding problem, and develop constructions, bounds, and many optimal codes in this framework. It is an interesting fact that two extremely simple bounds turn out to be exact in very many of the cases of interest.

It should be acknowledged here that the relevance of the normalized distance criterion is directly tied to the goal of maximizing the minimum Euclidean distance of the system. Several alternative approaches have been proposed recently to the code design problem for regular BCM: for example, the capacity rule [13] in which rates are assigned to the component codes according to random coding principles, and the error probability rule [2], [3], [32] in which the component codes are chosen to balance the individual error probabilities at all coding levels. (Indeed, Wachsmann and Huber [32] go so far as to suggest that virtually any technique except the traditional Euclidean distance approach will work well!)

Kasami et al. [17] discuss block-coded modulation schemes in which there is interblock memory between the component row codes, but in which the coded bits are assigned to symbols in the usual way. The resulting schemes do not have higher overall Euclidean distance in general, but do have far fewer apparent nearest neighbors, no smaller a Euclidean distance, and no more complex a decoder.

The interblock memory systems considered in this paper will all use the "staggered" assignment of coded bits to symbols, in which each coded bit affects exactly one transmitted symbol. The abbreviation BCMIM will refer to interblock memory systems of this type, rather than the kind considered by Kasami et al. Previous work on such block-coded modulation with interblock memory systems includes the original paper of Lin [19] and subsequent papers by Yamaguchi and Imai [35] and by Lin et al. [20], [21], [23] that give codes, with associated trellis decoding structures and performance simulations, though not general bounds and constructions. This paper generalizes the performance metric and explores the limits of such schemes.

II. DEFINITIONS OF BCM AND BCMIM SYSTEMS

We assume throughout that the modulation scheme is 2^ℓ-ary, with bit labels assigned by a set-partitioning scheme in which the intrasubset squared Euclidean distances rise in the ratio Δ_1 : Δ_2 : ⋯ : Δ_ℓ.

A baseline BCM "codeword array" is constructed by taking ℓ binary linear codes, each of length n. A sequence of n transmitted symbols is obtained by filling the rows of an ℓ × n array with codewords from these respective codes, then reading labels upwards, column by column.

The codeword array thus has the form of an ℓ × n array whose rows are the component codewords, where LSLB denotes the least significant label bits and MSLB denotes the most significant label bits.

The minimum squared Euclidean distance between two sequences of symbols in a baseline BCM system is well known to be

min_{1 ≤ i ≤ ℓ} Δ_i d_i

where d_i is the minimum Hamming distance of code i. (See, for example, Sayegh [27].) The quantity min_i Δ_i d_i is the distance gain of the BCM system over the uncoded system.

For a baseline BCM system using length-n codes and a 2^ℓ-ary modulation scheme to have a distance gain γ, we simply choose the codeword array such that each row code has minimum Hamming distance d_i ≥ γ/Δ_i. The maximum rate of a baseline BCM system with distance gain γ is then

(1/n) Σ_{i=1}^{ℓ} k(n, ⌈γ/Δ_i⌉)

bits per symbol, where k(n, d) is the maximum dimension of a length-n binary linear code with minimum Hamming distance d.
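The two quantities above can be computed mechanically. A minimal sketch, assuming QAM-style constants Δ_i = 2^(i−1) for a three-level scheme and a small table of best dimensions k(8, d) (the table entries correspond to well-known codes such as the [8,4,4] extended Hamming code, but they are assumptions of this sketch, not values taken from the paper):

```python
# Sketch of the distance gain and maximum-rate formulas above,
# under assumed QAM constants Delta_i = 2^(i-1) for three levels.

DELTAS = [1, 2, 4]

def distance_gain(row_distances, deltas=DELTAS):
    """Minimum squared Euclidean distance gain: min_i Delta_i * d_i."""
    return min(delta * d for delta, d in zip(deltas, row_distances))

# Assumed table of k(8, d): max dimension of a length-8 binary linear code
# with minimum distance >= d (e.g., [8,7,2] even-weight code, [8,4,4]
# extended Hamming code, [8,1,8] repetition code).
K8 = {1: 8, 2: 7, 4: 4, 8: 1}

def max_rate(n, gain, deltas=DELTAS, k_table=K8):
    """Max bits/symbol: row i needs Hamming distance >= ceil(gain/Delta_i)."""
    total = 0
    for delta in deltas:
        d_needed = -(-gain // delta)   # ceiling division
        total += k_table[d_needed]
    return total / n

# Example: rows with distances (4, 2, 1) give gain min(4, 4, 4) = 4,
# and the maximum rate at gain 4 is (4 + 7 + 8) / 8 bits per symbol.
```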

A. Block-Coded Modulation with Interblock Memory

BCMIM systems differ from BCM systems in two ways: the generator matrix is allowed a more general structure, and the codewords are mapped to symbols in a different way.

A BCMIM system with interblock memory between the first m blocks is one with a block-diagonal generator matrix in which the generator matrices of the first m row codes are replaced by the generator matrix of a single binary linear code of length mn, with the generator matrices of the remaining row codes unchanged.

The resulting codewords are mapped to symbols in the following staggered way (see the top of this page), where the symbols will be read and transmitted going from right to left. (Thus the rightmost bit of the codeword affects the transmitted stream first.)

Thus the difference in generator matrix is that we allow interdependencies between the first m rows of the codeword array; and the difference in mapping is that we allow a codeword of length mn to affect mn transmitted symbols, rather than a block of n symbols as in regular BCM.

The system is decoded one codeword at a time, and it is assumed that we do not use the results of "later" codewords to redecode "earlier" codewords. This is similar to the standard "staged decoder" assumption in BCM systems [27]. The purpose of transmitting blocks from right to left is that each codeword can be decoded as soon as it is received.

B. Asymptotic Performance and Normalized Weight

Definition (Normalized Weight): Given the basic block size n and the sequence Δ_1 ≤ Δ_2 ≤ ⋯ ≤ Δ_ℓ of increasing intrasubset squared Euclidean distances, the normalized weight of a word v of length ℓn is defined as

w_N(v) = Σ_{i=1}^{ℓ} Δ_i w_i(v)    (1)

where w_i(v) is the Hamming weight of the ith length-n basic block of v. The normalized distance between two codewords and the minimum normalized distance of a code are defined analogously.
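The definition translates directly into code; a minimal sketch (the block length and Δ values in the example are illustrative choices):

```python
# Normalized weight per (1): w_N(v) = sum_i Delta_i * w_i(v), where w_i(v)
# is the Hamming weight of the i-th length-n basic block of v.

def normalized_weight(v, n, deltas):
    """v: bit sequence of length len(deltas) * n."""
    assert len(v) == len(deltas) * n
    return sum(delta * sum(v[i * n:(i + 1) * n])
               for i, delta in enumerate(deltas))

# Example with n = 4 and (Delta_1, Delta_2) = (1, 2) as for QAM:
# first block weight 2 contributes 2, second block weight 2 contributes 4.
w = normalized_weight([1, 1, 0, 0, 0, 1, 0, 1], n=4, deltas=[1, 2])
```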

The minimum normalized distance derives its significance from the fact that it represents the asymptotic (in signal-to-noise ratio) improvement in squared Euclidean distance of the BCMIM system over the uncoded system. Assuming that all past codewords have been decoded correctly, i.e., assuming that all blocks above the one to be decoded in the codeword array are known and correct, the squared Euclidean distance between two codewords is bounded below in terms of their normalized distance; i.e., the minimum squared Euclidean distance is higher than in the uncoded case by exactly the minimum normalized weight of the code, as claimed.

The probability of error is not exactly predicted by this minimum squared Euclidean distance due to the conditioning on previous decoding results. The conditional distribution of the additive noise given that previous blocks were correct is not Gaussian; more seriously, the conditional noise distribution given that errors did occur in the blocks above the current one in the codeword array will be both non-Gaussian and more severe than the unconditional distribution. (Kofman et al. [18] develop rigorous analytical bounds on bit-error probability for a bit-interleaved coded modulation system using convolutional codes with 8-PSK, taking these complicating factors into account.) Thus there will be an error propagation effect in the system. This effect may be ameliorated by the use of decoding algorithms that include soft output [12], or by occasional insertion of known symbols. At higher signal-to-noise ratios, however, the previous few symbols will be correct with high probability and the conditional noise distribution will be close to the unconditional one. Thus normalized distance has operational significance in representing the exact gain of a BCMIM system over a baseline BCM system at asymptotically high signal-to-noise ratios.

Note that if we take a baseline BCM system and adopt the staggered assignment of the codeword array to symbols as above, then the minimum Euclidean distance of the resulting system is as for BCMIM; but since the codeword array consists of independently coded rows, this reduces to the same value as for BCM systems without the staggered assignment. This gives an interpretation of the advantage achieved by BCMIM systems: if the minimum normalized distance is fixed, a BCMIM system allows us to augment the code by codewords that span several of the basic blocks, and that each have at least the target normalized distance. This gives us a system with the same asymptotic distance gain, but higher rate. Thus on a graph of rate versus signal-to-noise ratio, at low error probability, the BCMIM operating point would be directly above the BCM operating point. On a graph of error probability versus signal-to-noise ratio, the BCMIM operating point would be asymptotically above and to the left of the BCM operating point, with the horizontal distance being the corresponding gain in decibels.

The net coding gain of a block-coded modulation system at a given block error probability can be defined for the purposes of this discussion to be the distance on an error probability versus signal-to-noise ratio graph between the operating point of the coded modulation scheme at that error probability and an operating point at the same rate on a curve interpolated between uncoded systems. Thus in Fig. 1, the uncoded QAM curve is given by an interpolating expression in decibels, with the constant chosen to fit the uncoded quadrature phase-shift keying (QPSK) error probability to the target error probability.

C. Nearest Neighbors

It is widely known that the most serious single disadvantage of block-coded modulation schemes is the "nearest neighbor" problem [9], [10], [17], [34]. The standard staged decoding procedure, in which each row of the codeword array is decoded in sequence, beginning with the top row, has the effect of producing a large number of "apparent nearest neighbors," i.e., a large multiplicity of the most likely error events. These cause the coding gain at finite signal-to-noise ratios to be less than the asymptotic coding gain. The exact difference varies according to target probability of error; the usual rule of thumb, proposed by Forney [9], is that the required signal-to-noise ratio increases by 0.2 dB for each doubling in the number of apparent nearest neighbors, for error probabilities in the typical operating range.

We will take rough account of the effect of the number of nearest neighbors by seeking codes that have as few codewords of minimum normalized weight as possible. The number of apparent nearest neighbors depends on both the transmitted sequence and the overall coded modulation system into which the code is embedded, as noted in the Appendix. Thus the number of minimum normalized weight codewords bears on the number of apparent nearest neighbors only indirectly. It is, however, a property of the code rather than of the associated modulation schemes.
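Forney's rule of thumb can be applied numerically; the neighbor counts in the example below are illustrative, not taken from the paper:

```python
import math

def snr_penalty_db(n_neighbors, n_reference, db_per_doubling=0.2):
    """Estimated extra SNR (dB) needed relative to a reference system,
    per the 0.2 dB-per-doubling rule of thumb."""
    return db_per_doubling * math.log2(n_neighbors / n_reference)

# A system with 16 times as many apparent nearest neighbors needs about
# 0.2 * log2(16) = 0.8 dB more signal-to-noise ratio at the same
# target error probability.
penalty = snr_penalty_db(64, 4)
```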

D. Complexity Comparisons

In adding extra codewords to a baseline BCM system to obtain a BCMIM system, we almost always increase the state complexity of a trellis decoding algorithm. Thus the extra performance of BCMIM over BCM comes at some price. However, it is, of course, possible to decode the resulting codes via other methods, maximum-likelihood or suboptimal, thus producing a very wide variety of performance/complexity tradeoffs. We do not examine these in this paper, and instead focus on finding codes and evaluating the extra rate achievable by adding interblock memory for a fixed basic block length.

Fig. 1. Performance of BCMIM and BCM systems.

One more comparison is particularly relevant. Codes of basic block length n and interblock memory between the first two basic blocks are effectively length-2n codes. Instead of comparing them in performance to the best baseline systems of basic block length n as above, we can compare the resulting BCMIM system to the best baseline BCM system with basic block length 2n. The reasoning here is that if we take the block length to be a very rough guide to decoding complexity, the BCMIM system will involve decoding a length-2n code plus some length-n codes, while the baseline system will involve a collection of length-2n codes. (Thus the BCMIM system is in some way intermediate in decoding complexity between baseline BCM systems of basic block lengths n and 2n.)

We discuss this point in the Appendix and simply state the conclusion here: the BCMIM systems in most cases have higher rate, and in the cases where the information is available have fewer apparent nearest neighbors than the longer baseline counterpart. Thus BCMIM can be seen either as adding rate to a baseline system of basic block length n, or as reducing the number of apparent nearest neighbors of a baseline system of basic block length 2n.

E. Statement of Problem

The above discussion of coding gains motivates the normalized distance coding problem. (We do not translate from normalized distance back to coding gains in the remainder of the paper.) The discussion will be confined to linear codes.

We thus consider the following problem in "classical" coding theory.

Given a nondecreasing sequence Δ_1 ≤ Δ_2 ≤ ⋯ ≤ Δ_ℓ, find binary linear codes of length ℓn and basic block length n, of highest dimension for given minimum normalized distance, where the normalized weight of a codeword is defined as Σ_i Δ_i w_i, with w_i the ordinary Hamming weight of the codeword in bits (i − 1)n + 1 to in. Among those codes of highest dimension, find the ones with the smallest number of codewords of minimum normalized weight.
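For very small parameters the stated problem can be attacked by exhaustive search. The sketch below, with assumed toy parameters n = 2, ℓ = 2, and (Δ_1, Δ_2) = (1, 2), finds the highest dimension of a binary linear code of length 4 meeting a target minimum normalized distance:

```python
from itertools import combinations, product

N_BLOCK, DELTAS = 2, (1, 2)           # assumed toy parameters
LENGTH = N_BLOCK * len(DELTAS)

def nweight(v):
    return sum(d * sum(v[i * N_BLOCK:(i + 1) * N_BLOCK])
               for i, d in enumerate(DELTAS))

def span(rows):
    """All nonzero GF(2) combinations of the given generator rows."""
    words = set()
    for coeffs in product([0, 1], repeat=len(rows)):
        if any(coeffs):
            words.add(tuple(sum(c * r[j] for c, r in zip(coeffs, rows)) % 2
                            for j in range(LENGTH)))
    return words

def best_dimension(d_target):
    """Highest dimension of a binary linear code of length LENGTH whose
    minimum normalized distance is at least d_target (brute force)."""
    vectors = [v for v in product([0, 1], repeat=LENGTH) if any(v)]
    for k in range(LENGTH, 0, -1):
        for rows in combinations(vectors, k):
            words = span(rows)
            if len(words) == 2 ** k - 1 and all(nweight(w) >= d_target
                                                for w in words):
                return k
    return 0
```

For example, the search confirms that dimension 2 is the best achievable at minimum normalized distance 4 for these parameters, while dimension 3 is achievable at distance 2.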

When discussing codes it is more convenient to speak of dimension, while in comparing overall systems it is convenient to speak of rate; since it is trivial to convert from one to the other we use both terms throughout.

F. Notation

The notation for codes will be of the form [n + ⋯ + n, k, d]_{Δ_1, …, Δ_ℓ}, indicating a code of length ℓn, dimension k, and normalized distance (i.e., minimum normalized weight) at least d. The default values of the Δ_i's are as for quadrature amplitude-modulated (QAM) systems; thus if the subscript is omitted, it is implied that Δ_i = 2^{i−1}. Then, for example, [n + n, k, d] will indicate a code of length 2n, dimension k, in which for every nonzero codeword the weight on the left plus twice the weight on the right is at least d. The notation [n, k, d] will always refer to a binary linear code of length n, dimension k, and Hamming distance d.

In 8-PSK, we have a different pair of constants Δ_1 and Δ_2. We denote this set of Δ's by the subscript P, so that, for example, [n + n, k, d]_P denotes a length-2n code of dimension k, in which for every nonzero codeword the weight in the first n bits plus Δ_2/Δ_1 times the weight in the last n bits is at least d.

For an [n_1 + n_2, k, d]_Δ code, the punctured future subcode is the code obtained by deleting all nonzero codewords that have weight 0 in the last n_2 bits, then deleting the first n_1 bits in all remaining codewords. The punctured past code is defined similarly. The shortened future subcode is the code obtained by deleting all codewords of the code that have nonzero weight in the first n_1 bits (and then deleting those bits). The shortened past subcode is defined similarly.

The notation p_j will denote the shortened past subcode of length jn, and similarly f_j will denote the shortened future code of length jn. Any linear code of overall length ℓn may be represented uniquely as p ⊕ f ⊕ g, i.e., as the direct sum of the past and future shortened codes and some gluing code g. The unsubscripted codes p and f will denote the shortened past subcode and the shortened future subcode of their respective maximal lengths.
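The derived subcodes can be computed directly; a minimal sketch with an assumed toy [2+2, 2] code (the punctured subcode is realized here simply as a projection onto the future coordinates):

```python
from itertools import product

GEN = [(1, 0, 1, 0), (0, 1, 0, 1)]   # assumed toy [2+2, 2] code
N_PAST = 2                            # length of the "past" part

def codewords(gen):
    """All codewords of the binary linear code with generator rows gen."""
    L = len(gen[0])
    return {tuple(sum(c * r[j] for c, r in zip(coeffs, gen)) % 2
                  for j in range(L))
            for coeffs in product([0, 1], repeat=len(gen))}

def punctured_future(code, n_past):
    """Projection of the code onto the last (future) coordinates."""
    return {w[n_past:] for w in code}

def shortened_future(code, n_past):
    """Codewords that vanish on the first n_past bits, restricted to the rest."""
    return {w[n_past:] for w in code if not any(w[:n_past])}

C = codewords(GEN)
```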

III. BOUNDS AND CONSTRUCTIONS FOR NORMALIZED WEIGHT CODES

This section will focus on using the concept of normalized weight to design the code in a BCMIM system with a given distance gain so that the rate of the designed code is maximum. We will consider a general BCMIM system employing a 2^ℓ-ary signal constellation and using a code of length ℓn with the ith n-bit basic block protecting the ith least significant label bits. We assume the minimum squared intrasubset Euclidean distance within the signal set increases as Δ_1 ≤ Δ_2 ≤ ⋯ ≤ Δ_ℓ (Δ_1 is normalized to 1).

The problem is obviously related to the regular Hamming distance problem, particularly when the interblock memory is between just two basic blocks. However, techniques that are best for Hamming distance are not necessarily best for normalized distance, and vice versa. We identify the related Hamming distance construction where applicable.

When we have interblock memory between three or more basic blocks, on the other hand, a simple result (Lemma 1) turns out to be very useful.

A. Upper Bounds on the Maximum Rate Gain in Introducing Interblock Memory to a Baseline BCM System

Lemma 1 (Full Dimension Lemma): The maximum dimension gain obtainable in extending interblock memory from between the first m basic blocks to between the first m + 1 blocks is upper-bounded by n bits.

Proof: More generally, we compare a code with parameters [n_1 + ⋯ + n_{m+1}, k, d]_Δ (with interblock memory between the first m + 1 blocks) to the best system with interblock memory extending only to block m: a direct sum of an [n_1 + ⋯ + n_m, k_m, d]_Δ code of highest dimension and an appropriate code of dimension k_{m+1} on block m + 1. Since shortening the last n_{m+1} positions in the code gives a code with interblock memory between the first m basic blocks and normalized distance at least d, we must have k − n_{m+1} ≤ k_m. The gain obtained by extending the interblock memory to include the (m + 1)th basic block, k − (k_m + k_{m+1}), is then at most n_{m+1} bits. Making all the n_i's equal reduces to the assertion of the lemma.

Note that the normalizing constants Δ_i and the normalized distance d have not been used, beyond the implicit assumption that they are all positive. Note also that we have not assumed that the code has been constructed by starting with the best code for the memory-m case, and then adding a gluing code: this is not guaranteed in general to produce the best memory-(m + 1) code. We get equality in the above expression, however, only if (but not necessarily if) the past subcode obtained by shortening the positions in the last basic block is optimal in dimension for the memory-m case. A second necessary condition for equality is that the bits in the (m + 1)th basic block are linearly independent, i.e., the punctured future code on that block is the full code. Finally, note that since the Δ_i's form an increasing sequence, the upper bounds on dimension gain follow a "law of diminishing returns" with increasing m.
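The shortening step in the argument above can be checked mechanically on small examples. A minimal sketch (the generator matrix and block sizes are assumed examples, not taken from the paper), verifying that shortening a basic block of length n_last costs at most n_last dimensions:

```python
from itertools import product

# Assumed toy example: a [2+2+2, 3] binary code; we shorten the last
# basic block (n_last = 2) and check that the dimension drops by at most 2.
GEN = [(1, 0, 0, 1, 1, 0),
       (0, 1, 0, 1, 0, 1),
       (0, 0, 1, 0, 1, 1)]

def shortened_dimension(gen, n_last):
    """Dimension of the subcode of codewords that are zero on the
    last n_last positions (the shortened code)."""
    words = set()
    for coeffs in product([0, 1], repeat=len(gen)):
        w = tuple(sum(c * r[j] for c, r in zip(coeffs, gen)) % 2
                  for j in range(len(gen[0])))
        if not any(w[-n_last:]):
            words.add(w)
    return len(words).bit_length() - 1   # |subcode| = 2^dim

k = len(GEN)
k_short = shortened_dimension(GEN, 2)
# Dimension loss from shortening is at most n_last, the key step in Lemma 1:
assert k - k_short <= 2
```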

1) Plotkin Bound: Suppose we have a code with n_1 bits in the first basic block, n_2 in the second, and so on. A straightforward application of the classical Plotkin bound argument, in which we evaluate the sum of the normalized weights of all codewords via both rows and columns [24, pp. 41–42], gives the result

(2^k − 1) d_N ≤ 2^{k−1} Σ_i Δ_i n_i

with equality if and only if every nonzero codeword has the minimum normalized weight d_N.

In the case where all the Δ_i's are integers, we can transform a code with basic block lengths n_1, …, n_ℓ and normalized distance d into a code of length Σ_i Δ_i n_i and Hamming distance d, by replicating each block Δ_i times. The Plotkin bound above is then simply the regular Plotkin bound applied to the corresponding lengthened code.

We will see later (Section IV) that this gives tight results in many cases when applied to shortened codes.
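Both the bound and the replication transform can be verified on a toy example; the generator matrix, n = 2, and (Δ_1, Δ_2) = (1, 2) below are assumptions of the sketch:

```python
from itertools import product

N, DELTAS = 2, (1, 2)
GEN = [(1, 1, 1, 0), (0, 0, 1, 1)]   # assumed [2+2, 2] example code

def words(gen):
    L = len(gen[0])
    return [tuple(sum(c * r[j] for c, r in zip(coeffs, gen)) % 2
                  for j in range(L))
            for coeffs in product([0, 1], repeat=len(gen)) if any(coeffs)]

def nweight(v):
    return sum(d * sum(v[i * N:(i + 1) * N]) for i, d in enumerate(DELTAS))

def replicate(v):
    """Repeat basic block i Delta_i times: normalized weight -> Hamming weight."""
    out = []
    for i, d in enumerate(DELTAS):
        out.extend(v[i * N:(i + 1) * N] * d)
    return tuple(out)

C = words(GEN)
d_N = min(nweight(w) for w in C)
k = len(GEN)
# Plotkin-type bound (2^k - 1) d_N <= 2^(k-1) sum_i Delta_i n_i, met with
# equality here since every nonzero codeword has normalized weight d_N:
assert (2 ** k - 1) * d_N == 2 ** (k - 1) * sum(d * N for d in DELTAS)
# Replication turns normalized weight into plain Hamming weight:
assert all(sum(replicate(w)) == nweight(w) for w in C)
```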

2) Griesmer-Type Bound: Suppose a codeword of minimum normalized weight d_N has weight distribution (w_1, …, w_ℓ), i.e., has weight w_i in basic block i for 1 ≤ i ≤ ℓ. Then we can obtain a code with basic block lengths (n_1 − w_1, …, n_ℓ − w_ℓ), minimum normalized weight at least d_N/2, and dimension one less. This follows by taking a generator matrix for the original code in which the given minimum-weight codeword is one row; then deleting the given codeword and all the columns in its support.

(Equivalently, we can transfer bits in higher basic blocks to the first basic block, replicating Δ_i times, to obtain a code with an enlarged first basic block which contains a codeword of weight d_N whose support is in the first basic block. Since for each remaining codeword, either it or its sum with the given codeword has at most d_N/2 ones in these positions, we have the claimed normalized distance in the residual code (cf. [31, p. 46]).)


In the usual situation, we have Δ = (1, 2) and interblock memory between the first two basic blocks. If the code contains the codeword whose support is exactly the first basic block, then the procedure gives us a shorter binary linear code, and a necessary condition for an [n + n, k, d] code to exist, and to contain this codeword, is that the corresponding residual code exists. The dimension gain is therefore upper-bounded accordingly. It is easy to verify that the degenerate case cannot occur for integer parameters, however. Thus for QAM constellations, the dimension gain obtained by adding interblock memory is upper-bounded whenever the code contains this codeword.

The most natural way to construct a BCMIM system is based on the corresponding baseline BCM system with highest rate, utilizing the trellis structures of the row codes in the baseline BCM system. In this case, the codeword in question is always present.

3) Shortening–Puncturing: With interblock memory between the first blocks, the best code with basic block length n + 1 and a given normalized distance has dimension at most a bounded amount higher than the best code with basic block length n and the same normalized distance, since we can puncture the longer code in one position of the first basic block and shorten it in one position of each of the other basic blocks. Thus in the case ℓ = 2 with interblock memory between the first two basic blocks, the dimension cannot jump arbitrarily when we move to basic block length n + 1.

Note that this relation does not depend on the normalizing constants Δ_i.

4) Split Linear Programming: The most powerful general upper bound for codes with ordinary Hamming distance is the linear programming bound, and a suitable modification provides good results for normalized distance also. Following Jaffe [15], we fix a partition of , and let be a linear code. Then if denotes the number of codewords of that have weight in partition element for each , the split weight enumerator of is

where the 's are indeterminates. Then the split weight enumerator of is

Since the coefficients of this polynomial must be nonnegative, we have constraints that can be used with linear programming.
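For a small linear code given by a generator matrix, the split weight enumerator can be tabulated by direct enumeration; the sketch below uses a toy [4, 2] code with a two-element partition (not one of the paper's codes).

```python
from itertools import product
from collections import Counter

def split_weight_enumerator(gen, blocks):
    """Count codewords by their tuple of weights, one entry per partition element."""
    counts = Counter()
    for msg in product([0, 1], repeat=len(gen)):
        cw = [0] * len(gen[0])
        for bit, row in zip(msg, gen):
            if bit:
                cw = [c ^ r for c, r in zip(cw, row)]
        counts[tuple(sum(cw[i] for i in blk) for blk in blocks)] += 1
    return counts

# [4, 2] code generated by 1100 and 1111, partition {0,1} | {2,3}.
gen = [(1, 1, 0, 0), (1, 1, 1, 1)]
swe = split_weight_enumerator(gen, [range(0, 2), range(2, 4)])
# swe maps (weight in element 1, weight in element 2) -> number of codewords
```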

In our situation, the 's all have the same size (the basic block length), and we seek the maximum of subject to the linear constraints for all such that , and for all and all with , where

is the usual Krawtchouk coefficient.
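The Krawtchouk coefficients that enter these constraints can be evaluated directly from the standard formula K_k(x; n) = Σ_j (−1)^j C(x, j) C(n−x, k−j); a short sketch:

```python
from math import comb

def krawtchouk(k, x, n):
    """Binary Krawtchouk polynomial K_k(x; n)."""
    return sum((-1) ** j * comb(x, j) * comb(n - x, k - j) for j in range(k + 1))
```

For example, K_k(0; n) = C(n, k), and the polynomials satisfy the orthogonality relation Σ_x C(n, x) K_k(x) K_l(x) = 2^n C(n, k) δ_kl, a useful sanity check on an implementation.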

B. Lower Bounds on the Maximum Rate Gain in Introducing Interblock Memory to a Baseline BCM System

The extension bound below relates the normalized and Hamming distance problems directly, and is particularly useful when the interblock memory is between the first two basic blocks. We can also apply the standard constructions for producing new codes from old ones from the Hamming distance problem, though the normalized weight criterion alters the relative usefulness of these. We list below the constructions that were found most useful in the main case. The distance bounds follow by elementary adaptation of the reasoning for the ordinary Hamming distance case (see, for example, [24]).

1) Extension Bound: This bound relates the normalized and Hamming distance problems directly. The bound is normally useful only in the case of interblock memory between the first two basic blocks, though it can be extended in obvious ways. Assume that is an integer dividing both and . Then if an code exists, we can take any bits, replicate them times, and take the resulting bits to form the first basic block, with the remaining bits forming the second block. The normalized distance of this code is clearly , so we have constructed an code. Note also that the normalized weight enumerator of this code is the same as the Hamming weight enumerator for the original code.

This bound gives the highest dimension possible for and in the QAM problem (Table I). The bound falls one short in the cases , and , and at least one short in the cases and .
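The replication step in the extension bound is purely mechanical: a chosen set of coordinates is repeated a fixed number of times to form the first basic block. A sketch, where s and m stand in for the (omitted) number of replicated bits and number of copies:

```python
def extend_code(codewords, s, m):
    """Repeat each of the first s coordinates m times to form the first basic
    block; the remaining coordinates form the second block unchanged."""
    return [tuple(b for b in cw[:s] for _ in range(m)) + tuple(cw[s:])
            for cw in codewords]

# Toy example: replicate the first two bits of 1011 twice each.
ext = extend_code([(1, 0, 1, 1)], s=2, m=2)
```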

2) Constructions: If and are codes of the same length, with parameters and , respectively, then the code consisting of all words of the form , with , has parameters

A regular construction does not by itself produce many good codes, since the distance guarantee becomes , and the negative term tends to rule out good codes.

Another construction is to take codes with parameters and and to apply a construction, i.e., a regular construction on each of the halves. This results in a code with parameters .
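As a concrete sketch of the basic (u | u + v) step underlying these constructions (using toy component codes, not the paper's):

```python
def u_u_plus_v(C1, C2):
    """All words (u | u + v), u in C1, v in C2: length doubles, dimensions add,
    and the Hamming distance becomes min(2*d1, d2)."""
    return {u + tuple(a ^ b for a, b in zip(u, v)) for u in C1 for v in C2}

C1 = {(0, 0), (1, 0), (0, 1), (1, 1)}  # the [2, 2, 1] code
C2 = {(0, 0), (1, 1)}                  # the [2, 1, 2] repetition code
C = u_u_plus_v(C1, C2)                 # yields the [4, 3, 2] even-weight code
```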


TABLE I
CODES FOR QAM-BASED SYSTEMS

3) and Reverse- Constructions: The construction [24, p. 581] takes codes , with and . Then the new code consists of the words , where the 's are coset representatives of in , and .

This code has normalized parameters

In a reverse- construction, we interchange the blocks in the generator matrix above to get a code with normalized parameters

4) Construction: This construction takes four codes , with a union of distinct cosets of , with coset representatives , and a union of distinct cosets of , with coset representatives . The new code consists of all words of the form , where . The 's and 's can be paired to make the resulting code linear. When this is done, the resulting generator matrix can be taken to have the form

where and [29].


The normalized parameters are

IV. BEST BCMIM CODES

In this section we find a large number of best codes for normalized distance. We concentrate on the natural case in which the normalized distance is equal to the basic block length. The results are summarized in Table I.

A. Best Codes for QAM-Based BCMIM Systems with Basic Block Length

The codes in the baseline BCM system will be , , , and codes, as given by Cusack [6].

The case is a main focus of the papers of Lin et al. Lin and Ma give an code [20], and Lin, Wang, and Ma follow by extending this to an code [21]. These papers give efficient trellis representations and performance simulations, but do not discuss general constructions, bounds, or optimality.

1) Best Codes: If we allow interblock memory between the first two basic blocks, we are seeking the best code. The baseline BCM case with no interblock memory is obtained by taking the direct sum of the code and the code, and so has dimension .

From the full-dimension lemma, the gain in dimension when interblock memory is introduced is upper-bounded by

The first necessary condition for equality in this bound is that the shortened past subcode has highest possible dimension, i.e., that . This implies that the codeword would be in the code, so that we can apply the Griesmer argument, from which the maximum rate gain is upper-bounded by

i.e., we have for an code. This upper bound can be achieved using the code discussed by Lin and Ma [20] with generator matrix (see Table II). This code can be obtained from an construction, with an code, and codes, and an code. (Lin and Ma [20] cite [29], which introduces the construction.)

(None of the other constructions of Section III give . The expansion construction gives , the three constructions based on the method give dimensions and , respectively, and the and reverse- constructions give dimensions and , respectively.)
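The Griesmer argument used here adapts the standard Griesmer bound, n ≥ Σ_{i=0}^{k−1} ⌈d/2^i⌉ for a binary [n, k, d] linear code. As a sketch for checking whether given (ordinary Hamming) parameters are feasible:

```python
from math import ceil

def griesmer_min_length(k, d):
    """Smallest length the Griesmer bound allows for a binary [n, k, d] code."""
    return sum(ceil(d / 2 ** i) for i in range(k))

def griesmer_allows(n, k, d):
    return n >= griesmer_min_length(k, d)
```

For instance, griesmer_min_length(4, 4) returns 8, met with equality by the [8, 4, 4] extended Hamming code, while a [15, 5, 8] code is ruled out since the bound requires length at least 16.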

This code has split weight enumerator

and thus has 71 codewords of minimum normalized weight (one of type - , 56 of type - , and 14 of type - ). This is the minimum number possible. In fact, it is also the maximum number possible: this code is (up to permutations) the unique code with normalized distance .

TABLE II
GENERATOR MATRIX FOR [8|8; 8; 8] CODE

The uniqueness follows from the following reasoning. We have , where and are the shortened past and future subcodes of length , and is an appropriate gluing code. We must have a of dimension , for if not we would have , could take a generator matrix for the length code that had an identity matrix in the right half, and would need eight rows of weight and each distance apart in the left half of the generator matrix. But [24, p. 684], so this is impossible. Hence , and . Since the shortened past subcode has dimension and length equal to the target normalized distance, the overall code must contain the codeword . Then for every nonzero sequence on the right, there corresponds at least one sequence on the left of weight . Thus we must have at least two nonzero bits on the right in every nonzero codeword, so the punctured future code must be an even-weight code. Taking a systematic generator matrix for this, together with the codeword , we need seven more rows of length , weight , and distance exactly apart for the left of the generator matrix. These can only be obtained by taking the 14 codewords of weight from an Hamming code and selecting one word out of every complementary pair. All such choices lead to the same code up to permutation.

We note that the best code in terms of ordinary Hamming distance, a shortened quadratic residue code, also unique [28], cannot by the above discussion be partitioned into two halves to get normalized distance (though is possible). This shows that we cannot solve the normalized distance problem in general by simply taking the best (linear) code for ordinary Hamming distance and finding the best partition for this code. (It is possible to partition the nonlinear code to get an code with 57 minimum normalized weight codewords, by taking the first 8 bits to be any eight bits that form the support of a codeword.)

2) Best Codes: The baseline code is obtained by a direct sum of an code for the first two basic blocks and an code for the third basic block. Thus using the above code, we can achieve dimension without extending the interblock memory to include the third basic block.

The full-dimension lemma indicates that the dimension of any code will be at most more than the dimension of the best code. Thus with interblock memory extended to the third basic block, we can potentially increase the dimension by at most .

This can, in fact, be achieved by the code considered by Lin et al. [21] in which the code for the first two basic blocks and the baseline code for the third basic block are glued by the word

This code has the disadvantage that it has 131 codewords of minimum normalized weight. This can be reduced to a minimum of 87 with no loss in rate, though at the cost of increasing the dimension of the gluing code. (Other things being equal, a smaller dimension gluing code is to be preferred, as the decoder is then usually less complex.) We seek the minimum number of codewords of normalized weight in any linear code with basic block length and length with dimension . The past shortened subcode must be the unique linear code given above, and the future subcode must have the identity matrix as generator matrix. Suitable row operations can reduce any candidate generator matrix to the form

where denotes a single column of 's and 's. The rows of have weight without loss of generality, since contains the codeword . The rows of that are paired with rows of weight in the middle must have weight exactly , and the others have weight - . If one row of has weight , then adding to a suitably chosen codeword of we would get a codeword with weights , i.e., normalized weight , thus ruling out this possibility. Also any row of weight of must be at distance from the Hamming code at the left of . We conclude that without loss of generality the rows of have weight either or , and the last eight rows of the overall matrix have weights of the form , , or .

The rows of type each contribute four codewords of minimum normalized weight if they are distinct in the left eight components, and more otherwise. (Each word of weight is distance from exactly three codewords of , providing three minimum normalized weight codewords, plus the row itself.) Similarly, the rows of type each provide four such codewords if distinct in the left eight components, and more otherwise. The rows of type each contribute two such codewords if distinct from each other: the row itself, and the row plus the word . We therefore fill all rows of with words of weight that are distinct from each other and their converses and the words of the Hamming code above (there are more than enough to make this possible) and get a code with 87 minimum normalized weight codewords (71 from and 16 more from the rows involving ).

A generator matrix that results from this procedure is given in Table III.

Together with eight uncoded bits for the fourth basic block, we can build a BCMIM system with the parameters . This is therefore a 16-QAM system that transmits 24 information bits per eight symbols.

TABLE III
GENERATOR MATRIX FOR

Making these most significant label bits uncoded adds eight minimum normalized weight codewords, whereas adding the most significant label bits to the least significant label bits adds none. Thus with interblock memory between the first four basic blocks and a distance gain of , we have a maximum of 24 information bits and a minimum of 87 codewords of normalized weight . (The full-dimension lemma indicates that we cannot hope to gain dimension by extending the interblock memory to include the fourth basic block; we can still improve the code in the sense of reducing the number of minimum normalized weight codewords.)

B. Best Codes for QAM-Based BCMIM Systems with Basic Block Length

The codes in a baseline BCM system are , , , and .

1) Best Codes: With interblock memory between the first two blocks, we seek the best code. A direct sum of the and codes gives .

Applying split weight linear programming to this case indicates that . On fixing the number of codewords accordingly at 128 and using split weight linear programming to minimize the number of codewords with normalized weight , we find that the minimum of the objective is . This suggests that we might be able to find a code. This is in fact possible, and we will give a couple of different constructions.

This case is also interesting in that it is the first case in which the Griesmer-type bound of Section III-A2 is not tight (see Table I).

Since the Griesmer argument indicates that if , we must have , and so . The punctured future code is then a code, and its dual is a code. Shortening the overall code in any set of positions holding a codeword of gives a code with , , and (this is construction Y1 from MacWilliams and Sloane [24, p. 592]). Suppose that had minimum distance . Then applying Y1 with gives a code with , , and . But the Plotkin bound for this case gives . Thus we conclude that is the unique linear code.

We can take the generator matrix for to be

Now applying Y1 again, with , we get a code with , , and . Applying the Plotkin bound to this case, we find that ; thus the shortened code would meet the Plotkin bound with equality. This in turn would imply that every nonzero codeword has the minimum nonzero normalized weight, as noted earlier. This means that in each of the rows of weight on the right, we would have to have a row of weight exactly on the left. This further narrows the possibilities to generator matrices where, with the above matrix on the right, rows 2 and 3 on the left generate a linear code, as do rows 4 and 5, and rows 6 and 7.
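The Plotkin bound invoked in these steps, in its standard binary form, states that a code of length n and minimum distance d > n/2 has at most 2⌊d/(2d − n)⌋ codewords; a sketch:

```python
def plotkin_max_size(n, d):
    """Plotkin bound for binary codes with d > n/2."""
    assert 2 * d > n, "this form of the bound needs d > n/2"
    return 2 * (d // (2 * d - n))
```

Meeting the bound with equality forces the pairwise distances to equal d in the underlying averaging argument, which is essentially the "every nonzero codeword has minimum weight" observation used above (for a linear code containing 0).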

There are at least two inequivalent codes that satisfy these constraints. The first is obtained by a version of Piret's construction [24, pp. 588–589] applied to the irreducible cyclic code. We first construct the code consisting of and all words of the form

for , where is the idempotent of the given irreducible code, and . A generator matrix for this code can be taken as having rows

. The key property is that all nonzero codewords are either rows or permutations of rows of the generator matrix, and so we need only select the remaining parameter to maximize the minimum distance of the rows of the generator matrix

where . We have , and for all , so the (unique) best choice for in this case is , giving . (The factor on the right of the expression for and the corresponding optimal choice of are the only differences with the development in [24].) This results in a code, and adding the row

to the generator matrix does not affect the minimum normalized distance, thus giving a code. (A brute-force search showed that adding the row for any gives a lower minimum normalized distance.) From the construction, taking the same cyclic shift of the left and right sides of a codeword produces a codeword. The codewords of minimum normalized weight come in sets of 9, and there are four such sets, with weights - , - , - , and - , thus producing 36 codewords of minimum normalized weight. A generator matrix for this code, , is shown at the bottom of this page.

A slightly smaller number of minimum normalized weight codewords can be obtained from an optimal code partitioned in an appropriate way. The generator matrix, , that results is also shown below, and has 33 codewords of minimum normalized weight. (This code has more codewords of the next higher weight than the code based on Piret's construction, however; 36 codewords of normalized weight , versus 27.) The even (Hamming) weight subcode obtained by deleting the first row is an code. It is known [7] that there are exactly two such codes up to equivalence, both obtainable by shortening the Golay code. The corresponding Hamming weight enumerators are and . The code generated by the last six rows of the matrix above has the second of these weight enumerators. There is no way to partition and augment the first type to get a code. (For example, split linear programming shows that no code can contain a codeword with Hamming weight .)

Split linear programming, constraining the code to have 128 codewords, normalized distance , and to have a code as dual of the punctured future code, gives a lower bound of 31 codewords of minimum normalized weight. We remark that in the case of a code, there is a particularly large number of solutions to the linear programming constraints that are feasible candidates for codes, in the sense of having nonnegative integer weights in both code and dual code. Many are ruled out by the extra constraint that the punctured future code is the dual of the code. We do not know of any such code with 31 codewords of minimum normalized weight, though an integer weight enumerator, with integer dual weight enumerator, exists.

The Golay-based code can be extended to form and codes with 56 minimum normalized weight codewords, as indicated in Table I. We omit the details.

C. Best Codes for QAM-Based BCMIM Systems with Basic Block Length

The codes in a baseline BCM system are , , , and .

1) Best Codes: With no interblock memory between the first two blocks, the baseline BCM system gives . Split linear programming gives , and this bound can easily be achieved by modifying the Golay code. Partitioning the Golay code so that each half holds a codeword of weight , and then shortening two positions on each side, gives a code, where the normalized distance follows because no codeword of weight in the Golay code can split - and every codeword has even weight on each side. This code has 20 codewords of minimum normalized weight.

A slightly more complicated construction allows us to reduce the number of normalized weight codewords. We partition the Golay code in such a way that each side forms an , in the notation of [5]. Each side contains exactly three codewords of weight : and up to permutation. Shortening in two appropriately chosen bits (e.g., bits and ) eliminates these three codewords and gives a code with . Split integer programming gives a lower bound of on the number of minimum normalized weight codewords.

2) Best Codes: The full-dimension lemma indicates that the dimension is at most greater than in the case above, i.e., . This upper bound is achievable. We form a code by partitioning the Golay code into two 's, then shortening in two positions on either side. The punctured past and future codes are both codes in which one of the cosets has weight enumerator

We choose any ten distinct words of weight from this coset, and form the generator matrix

...

This code has minimum normalized weight and 68 codewords of this normalized weight: 18 with weight distribution , 30 with weight distribution , and 20 from the shortened past subcode.

For four basic blocks, we can again achieve the bound of the full-dimension lemma, obtaining a code with parameters . This is achieved by the code with generator matrix

...

...

where are distinct words of weight from the same coset as the 's. No codeword with positive weight in the fourth block has normalized weight lower than . The code therefore has minimum normalized weight with 68 codewords of this normalized weight.

D. Best Codes for QAM-Based BCMIM Systems with Basic Block Lengths and

The codes in the baseline systems are , , , and ; and , , , and , respectively. The systems without interblock memory therefore give , , and codes for basic block length , and , , and codes for basic block length .

The case is interesting as the first case in which the split linear programming bound on dimension is not tight when . Split linear programming gives a maximum of 512 codewords even when the 's are constrained to be integers, suggesting that a dimension gain of five might be possible. However, if such a (linear) code existed, it would have a past shortened subcode of dimension by the Griesmer argument, so the punctured future code would be an code. The dual would be an code, and a Y1 shortening would give an code, which the Plotkin bound shows cannot exist.

This also rules out a basic block length , normalized distance , code with interblock memory between the first two blocks and dimension (again, split linear programming gives maximum objective exactly 1024), since the optimum cannot jump by two by the shortening/puncturing argument.

We can, however, construct codes with the next higher dimensions, i.e., and (by shortening/puncturing) codes. We use a reverse- construction. Taking a simplex code and puncturing in three positions that hold a codeword in the dual code gives a code with weight enumerator , which we use as both and . Taking the dual of this simplex code and shortening in the same three positions gives a shortened Hamming code that contains as a subcode. Applying the reverse- construction gives a code; we need to augment this to get a code.

If we augment by adding the codeword , the punctured past code has weight enumerator

The only potential problem is if codewords of minimum weight in each half are paired, i.e., if we get a - break in some codeword. Since there are only three codewords of minimum weight in the punctured past code, we can avoid this, choosing a generator matrix that pairs the three words of weight on the left with words of weight on the right.

One generator matrix that results from this procedure is shown at the bottom of this page, and has 60 codewords of minimum normalized weight ( ; ; ; and ). Split integer programming gives a minimum of 6. (Constraining to be , as in the code above, gives a minimum of 29 codewords of minimum normalized weight.)

One extra property of this code will be used below. Codewords of odd weight in the right basic block must involve the last row of the generator matrix. Since the last row on the left is distance exactly from all linear combinations of the rows above it, we see that codewords with odd weight in the right basic block have weight exactly in the left basic block.

1) An Code: We can obtain an code from the code via shortening/puncturing. However, it is possible to increase the minimum normalized distance to without sacrificing rate. One such code is obtained by taking the Golay code and partitioning to get an code with the codeword . All other codewords of weight from the original Golay code have distribution - , - , or - across the basic blocks and thus have normalized weight at least . Shortening in one position on the left therefore gives a code. Shortening in three positions on the right gives a code, and then transferring two bits from right to left and doubling gives an code, with 55 minimum normalized weight codewords.

One can verify that although this procedure involves the choice of six positions, the normalized weight enumerator of the resulting code does not depend on the choices.

There are at least two more inequivalent codes, though none are known that have as few minimum normalized weight codewords. Split integer programming gives a lower bound of 24 on the number of minimum normalized weight codewords.

The code described above can be extended to form and codes, meeting the upper bound of the full-dimension lemma in each case, with 198 codewords of minimum normalized weight. We omit the details.

2) Best and Codes: We will use the same approach to construct a code as in the basic block length case. The punctured simplex code above has 48 cosets with weight enumerator

We choose to be words of weight from distinct cosets among these. We form the generator matrix

...

The 's are distance from each other, and distance from . They are distance at least from . Also codewords of odd weight in are paired with words of weight exactly in the first basic block. Then the weight distributions between basic blocks are of the form , , , , and . The normalized weight is thus at least . There are 214 codewords of minimum normalized weight.


To form a code, we take a generator matrix of the form

...

...

The words are words of weight , each from distinct cosets with weight enumerator

of the code above. The only case that needs to be checked is the case , which has weight distribution . Thus none of the codewords with ones in the fourth basic block have normalized weight as low as . The code is thus with 214 codewords of normalized weight .

E. Best Codes for QAM-Based BCMIM Systems with Basic Block Lengths –

The codes of highest dimension with basic block lengths in this range can all be obtained from the case below via shortening and puncturing. There is one case that is not covered: the optimum dimension for a code is . However, the normalized distance can be increased to without sacrificing rate.

To construct such a code, we begin with the (unique) code. We apply the expansion construction, taking five positions and doubling them for the left block, and leaving the other 14 for the right block. This produces a code. We then augment by adding to the generator matrix a word with a in an eleventh position on the left, a in each pair of doubled positions on the left, and a word in the rightmost 14 positions that is a coset leader in the punctured future code of the code above. The augmented word is normalized distance at least from codewords of the code. We therefore choose the five positions from the code to double so that the punctured future code has covering radius . Equivalently, we seek to puncture the code in five positions in such a way that the resulting code has covering radius at least . Some trial and error shows that this is possible. Thus we obtain an code, and transferring one bit from right to left and doubling gives a code. One such generator matrix is shown at the bottom of this page.

We obtain a code with 96 codewords of minimum normalized weight. Split linear programming gives a lower bound of 26 such codewords in any code.

F. Best Codes for QAM-Based BCMIM Systems with Basic Block Length

The codes in the baseline BCM system will be , , , and .

1) Best Codes: Either split linear programming or the Y1/Plotkin argument shows that . This can be achieved using the duplication construction, starting with the Golay code, and forming the first block by duplicating an arbitrary subset of eight bits. This gives 759 codewords of normalized weight .

This number can be reduced to 503 as follows. We use an construction with the code, the first-order Reed–Muller code, and the extended Hamming code. This still leaves to be chosen. We choose to be the doubly even code formed by taking the first-order Reed–Muller code, plus two linearly independent cosets; we choose the cosets so that they both have weight enumerators

and their “sum” (i.e., the translate of one coset by any element of the other coset) has the same weight enumerator. This is possible if we choose one coset to be the one represented by the Boolean function and the other to be the one represented by the Boolean function [24, p. 418]. The “sum” coset is represented by the Boolean function . Letting

, we see that , with

from which we conclude (see [24, Theorem 14.4, p. 417]) that the cosets represented by and have the same weight distribution.

The code thus has weight enumerator

A straightforward application of the bound gives the lower bound on normalized distance with these codes; however, only codewords with weight distribution can have normalized weight this low, and all others have normalized weight at least . Since the number of codewords of weight in is low, we can match each to words of weight greater than on the right, with some trial and error.

A resulting generator matrix is shown at the bottom of this page. This matrix has normalized weight enumerator

(Once we have chosen , this number of minimum normalized weight codewords is minimum: split linear programming constraining the left code to have the weight enumerator of gives a minimum of exactly 503 for the number of codewords of normalized weight . Furthermore, this number is the only feasible solution to the split linear programming problem when the past subcode is constrained to have the weight enumerator of .)

Split linear programming gives an overall lower bound of 271 on the number of minimum normalized weight codewords.

2) Best Codes: The full-dimension lemma combined with the result above indicates that in this case. We can achieve this upper bound using the following construction. We begin by forming a code from the Golay code by partitioning the 24 positions of the Golay code into a and , again in the notation of [5], then bringing the to the left and doubling it. The resulting code has generator matrix

where generates the code . We find from [5, Table III.B] that a intersects every codeword of weight from the Golay code in at least two places, and there are 16 such codewords that intersect in exactly two places. Thus the punctured future code is a (even) code, and, furthermore, has covering radius since there are too many codewords of weight for the covering radius to be .

We now select a coset leader of weight of the punctured future code and select a generator matrix for the code of the form

......

Our requirements for the 's are that i) they all have weight at least (for codewords of the form ); ii) they are distance at least from (for codewords of the form ); iii) they are distance at least apart (for codewords of the form ); iv) for codewords of the form , we must avoid weights . Note that other cases give normalized weight regardless of the choice of the 's. We can find such 's by concatenating the Hamming code with the nonlinear code . The resulting 's all have weight exactly , are distance exactly or apart, and distance exactly from . Furthermore, codewords of the form that have weight in the middle basic block must come from codewords of the original Golay code with weight distribution across the partition. The word is a word of weight or concatenated with the repetition code, and is thus distance at least from the word of weight concatenated with the repetition code.

TABLE IV
CODES FOR 8-PSK-BASED SYSTEMS

We conclude that this is a code. There are 1671 codewords of minimum normalized weight.

This code can easily be extended to more basic blocks, forming and codes, i.e., the full-dimension lemma bound can be achieved in each case.

V. BEST CODES FOR8-PSK-BASED BCMIM SYSTEM

In an 8-PSK modulation system, the minimum squared intra-subset Euclidean distance increases with the ratio

. The results for basic blocks lengthsto are shownin Table IV. Note that in this case the codes of highest rateandminimum number of codewords at the target minimum normal-ized distance are known in each case in this list.

The results in the columns listed as split linear programmingwere obtained as usual by rounding a noninteger optimumnumber of codewords down to the nearest power of two. It isinteresting to note that in every case in the table, integer splitlinear programming gave a maximum that was a power oftwo (the same one as the noninteger split linear programmingoptimum rounded down).

The column titled “integer split linear programming” indicates the case in which the past subcode is constrained to have smaller dimension, i.e., the relevant word is not in the code. This helps establish whether a larger minimum normalized distance is possible without sacrificing rate. In three cases it is not. This is tight for the cases considered: we can achieve the larger distance whenever it is not ruled out by this split linear programming result, and in the cases where it is ruled out, we can find codes with exactly one codeword of minimum normalized weight.

The case above is straightforward. For the first of these lengths, the baseline codes are as listed. To construct the improved code, we take the Golay code and partition it into two halves. Since these are codewords in the code, the possible weight distributions of words of minimum Hamming weight fall into three classes, all of which have normalized weight greater than the target. This leaves one codeword of minimum normalized weight, but the discussion above shows that this cannot be avoided.
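The constructions in this section lean repeatedly on weight distributions of the binary Golay code. As a self-contained check (not part of the paper's construction), the following sketch builds the (24, 12, 8) extended Golay code by extending the cyclic (23, 12, 7) code, using one of its two standard generator polynomials, and tallies the weight distribution:

```python
# Build the (24,12,8) extended binary Golay code and tally its weights.
# g(x) = x^11 + x^10 + x^6 + x^5 + x^4 + x^2 + 1 is one of the two
# generator polynomials of the cyclic (23,12,7) binary Golay code.
G_POLY = 0b110001110101

def gf2_mul(a, b):
    """Carry-less (GF(2)) multiplication of bit-packed polynomials."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

weight_dist = {}
for msg in range(1 << 12):            # all 2^12 message polynomials
    w = bin(gf2_mul(msg, G_POLY)).count("1")
    w += w & 1                        # overall parity bit makes every weight even
    weight_dist[w] = weight_dist.get(w, 0) + 1

print(sorted(weight_dist.items()))
# → [(0, 1), (8, 759), (12, 2576), (16, 759), (24, 1)]
```

The counts 759 and 2576 are exactly the classical weight enumerator of the extended Golay code, which is what the partition arguments above draw on.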

Shortening and puncturing produces codes of highest dimension for all basic block lengths in this range. For some of these basic block lengths, however, it produces one codeword at the minimum normalized distance, rather than the zero indicated in the table.

Codes of highest dimension in the remaining range are produced by shortening and puncturing a single longer code. To construct this code, we note that since the past subcode must have dimension zero, the punctured future subcode must be the stated code. Taking an identity matrix on the right of the generator matrix, the rows on the left must have sufficient Hamming weight, and be sufficiently far apart. Then all nonzero codewords have one of three weight distributions, and thus have the required normalized distance. Since the relevant bound holds [24, p. 684], we can find ten such words. We obtain a code with the stated normalized weight enumerator.
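The normalized weight computations used in these arguments are mechanical and easy to automate. The sketch below (toy parameters of our own choosing, not the paper's codes) enumerates a small binary linear code whose coordinates are split into basic blocks and tallies the normalized weight, i.e., the sum over blocks of each block's Hamming weight times that block's normalizing constant:

```python
from itertools import product
from collections import Counter

def normalized_weight_enum(gen_rows, block_lens, mu):
    """Enumerate a binary linear code (rows bit-packed, MSB = coordinate 0)
    and return a Counter of normalized weights, where the normalized weight
    of a codeword is sum_i mu[i] * (Hamming weight in basic block i)."""
    n = sum(block_lens)
    masks, start = [], 0          # bit mask selecting each basic block
    for L in block_lens:
        masks.append(((1 << L) - 1) << (n - start - L))
        start += L
    enum = Counter()
    for coeffs in product([0, 1], repeat=len(gen_rows)):
        c = 0
        for a, row in zip(coeffs, gen_rows):
            if a:
                c ^= row
        nw = sum(m * bin(c & msk).count("1") for m, msk in zip(mu, masks))
        enum[nw] += 1
    return enum

# Toy example: length 6 split into two blocks of 3, constants mu = (1, 2).
rows = [0b100110, 0b010101, 0b001011]   # hypothetical generator matrix
print(normalized_weight_enum(rows, (3, 3), (1, 2)))
```

Exhaustive enumeration like this is feasible for the code dimensions considered in this paper, so reported normalized weight enumerators can be checked directly.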

The nonexistence of two further codes can be demonstrated in a similar way, without using split linear programming. The first would have 12 codewords with one weight distribution and 66 codewords with another. The rows of the first type would have to have a specific weight distribution, but


TABLE V: COMPARISON TO LONGER BASELINE BCM FOR QAM-BASED SYSTEMS

, so we cannot find a set of twelve such words. (Similarly, for the second code, the number of available words falls short of what is required.)

For two basic block lengths, the minimum normalized distance can be extended without sacrificing dimension. For the first case, we construct the code by taking a generator matrix with an identity matrix at the right and seven distinct rows of equal weight at the left. This gives 21 codewords of minimum weight, which integer split linear programming shows is the minimum possible. For the second, we take an identity matrix at the right of the generator matrix, and eight distinct rows of equal weight at the left. This gives eight codewords of minimum weight, whereas integer split linear programming gives a slightly smaller lower bound.

A. Extension to Third Basic Block

The normalizing constant is so high that for each of the basic block lengths considered, the full-dimension lemma gives an upper bound of one on the dimension gain achievable by extending interblock memory to the third basic block. This can be achieved in each case, without increasing the number of minimum normalized weight codewords, by a gluing code of dimension one. Thus in each case, we take the best code for the first two basic blocks, an even-weight code as shortened future code, and glue these together by adding a single word spanning all three basic blocks to the generator matrix.
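The gluing step itself is a simple generator-matrix operation. The toy sketch below (the component codes and glue row are hypothetical choices of ours, purely to illustrate the mechanics) forms a block-diagonal generator from a two-block code and a third-block even-weight code, then adjoins one glue row spanning all three blocks, raising the dimension by exactly one:

```python
from itertools import product

def span(rows):
    """All codewords of the binary linear code generated by bit-packed rows."""
    words = set()
    for coeffs in product([0, 1], repeat=len(rows)):
        c = 0
        for a, r in zip(coeffs, rows):
            if a:
                c ^= r
        words.add(c)
    return words

# Hypothetical length-7 layout: [first two basic blocks (4 bits) | third block (3 bits)]
first_two = [0b1100000, 0b0110000]   # toy code on the left four coordinates
third     = [0b0000110, 0b0000011]   # even-weight code on the right three
glue_row  = 0b1010100                # one glue word spanning all the blocks

base  = first_two + third
glued = base + [glue_row]
assert len(span(glued)) == 2 * len(span(base))   # gluing raises the dimension by one
```

The dimension grows because the glue row lies outside the span of the block-diagonal generators; the minimum normalized weight is then controlled by choosing the glue row's per-block weights, as in the text.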

For the first length, we glue the code for the first two basic blocks and the code for the third basic block using the gluing code with the generator matrix shown at the bottom of this page, and similarly for the next length.

For the following length, the gluing code again has the required weight in the middle block, and the required weight in the third block. The first block can be chosen as any word of the appropriate weight that is sufficiently far from each of the left-hand rows in the code above. Since there are thirteen such words and only ten of them have been used, the first block can be chosen to be any of the remaining three words. This gives a code with ten codewords of minimum normalized weight. Shortening and puncturing gives a code with eight codewords of minimum normalized weight.

For the final length, the gluing code generator matrix can be taken to be any row with the required weight distribution across the three basic blocks. This produces codewords whose weight distributions all give normalized weight greater than the minimum. We thus obtain a code with one codeword of minimum normalized weight; this is the best possible. Shortening and puncturing gives a code that has one codeword of minimum normalized weight.

VI. CONCLUSION

We have considered in this paper the addition of interblock memory to block-coded modulation systems, combined with the staggered assignment of code bits to code symbols; this is known to allow significantly higher rates in block-coded modulation systems at the cost of some increase in decoding complexity and potential error propagation.

For the analysis of the performance of such systems at high signal-to-noise ratios, we have introduced the notion of normalized distance. We have developed bounds and constructions for codes designed with the normalized distance criterion, and have


used these to find many optimal codes for the situations of most interest, for both QAM- and PSK-based systems.

APPENDIX

COMPARISONS WITH BASELINE SYSTEMS WITH LONGER BLOCKS

We noted in Section II-D that a BCMIM code will involve decoding a code of longer length. Thus one comparison that might be made is to the rates achievable with baseline BCM systems of that longer basic block length, which also involve decoding codes of that length (albeit two distinct codes rather than one as in the BCMIM system).

The results obtained for BCMIM codes in Table I are compared on this basis to longer baseline BCM systems in Table V.

The number of apparent nearest neighbors in a BCM system depends on the transmitted sequence and the overall modulation scheme into which the codes are embedded. For a 16-QAM system, for example, the maximum number of nearest neighbors is known to be

(where the coefficients denote the numbers of codewords of the given weight in each component code) and the average number over all transmitted sequences is

assuming all codewords are equally likely [22], [34]. We thus give the number of minimum-weight codewords for each of the component codes of the BCM system in Table V: the notation indicates how many codewords of minimum Hamming weight appear in the first code listed, and how many in the second.

The number of apparent nearest neighbors for a code with interblock memory may be derived similarly as, for the 16-QAM case, a maximum of

and an average of

where the weights in the expression denote the Hamming weights of the word in each basic block [26, pp. 152–155].

The star on some BCM component codes indicates that the code is known to be unique. A number as superscript indicates that there are exactly that many inequivalent codes with the given parameters. This information is collected in Jaffe's table [15].

From the table, the BCMIM system has higher rate in all cases except two basic block lengths. The rates are the same for the first of these, and the second provides the only example in this list in which a higher rate is achievable using the longer baseline BCM system. For that basic block length, the maximum number of nearest neighbors of the baseline BCM system is

whereas for the BCMIM system, using the weight distribution of minimum normalized weight codewords given in Section IV-D, we have a maximum of

The effective coding gain is thus 0.78 dB better for the BCMIM system. Similarly, the other basic block length in question gives the same rates, but due to the smaller number of nearest neighbors the BCMIM system has an effective coding gain of 0.55 dB over the longer baseline BCM system.
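Figures of this kind can be reproduced, approximately, with the standard rule of thumb (due to Forney) that each doubling of the number of nearest neighbors costs about 0.2 dB of effective coding gain at error rates near 10^-6. A sketch of that calculation, with illustrative neighbor counts of our own rather than the paper's exact values:

```python
import math

def effective_gain_delta_db(n_ref, n_sys):
    """Approximate effective-coding-gain advantage (dB) of a system with
    n_sys apparent nearest neighbors over a reference with n_ref, using the
    rule of thumb of ~0.2 dB lost per factor-of-two increase in neighbors."""
    return 0.2 * math.log2(n_ref / n_sys)

# Illustrative only: a system with 16x fewer apparent nearest neighbors
# than the baseline picks up roughly 0.8 dB of effective coding gain.
print(round(effective_gain_delta_db(4096, 256), 2))
```

The rule of thumb ignores the detailed error-rate curve, so it is a first-order estimate; the dB figures quoted in the text come from the actual neighbor counts of the codes involved.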

Codes in which the interblock memory is extended to the third and higher basic blocks usually have a relatively simple structure in the higher basic blocks, as shown earlier. Thus we would expect that such a code would in general be much less complex to decode than most codes of the corresponding full length, and a comparison to baseline BCM codes of that basic block length would not be meaningful.

Similar results hold for 8-PSK systems: the best BCMIM systems of Table IV all have higher rates than the corresponding BCM systems, except in one case.

REFERENCES

[1] E. L. Blokh and V. V. Zyablov, “Coding of generalized concatenated codes,” Probl. Pered. Inform., vol. 10, no. 3, pp. 45–50, 1974. (In Russian. English translation in Probl. Inform. Transm., vol. 10, no. 3, pp. 218–222, July–Sept. 1974.)

[2] A. G. Burr, “Maximum likelihood and multistage decoding for optimized multilevel coded modulation,” in Proc. 1998 Information Theory Workshop, Killarney, Ireland, June 1998, pp. 81–82.

[3] A. G. Burr and T. J. Lunn, “Block-coded modulation optimized for finite error rate on the white Gaussian noise channel,” IEEE Trans. Inform. Theory, vol. 43, pp. 373–385, Jan. 1997.

[4] A. R. Calderbank, “Multi-level codes and multi-stage decoding,” IEEE Trans. Commun., vol. 37, pp. 222–229, Mar. 1990.

[5] J. H. Conway and N. J. A. Sloane, “Orbit and coset analysis of the Golay and related codes,” IEEE Trans. Inform. Theory, vol. 36, pp. 1038–1050, Sept. 1990.

[6] E. L. Cusack, “Error control codes for QAM signaling,” Electron. Lett., vol. 20, no. 2, pp. 62–63, Jan. 1984.

[7] S. M. Dodunekov and S. B. Encheva, “Uniqueness of some linear subcodes of the extended binary Golay code,” Probl. Pered. Inform., vol. 29, no. 1, pp. 45–51, Jan.–Mar. 1993. (In Russian. English translation in Probl. Inform. Transm., vol. 29, no. 1, pp. 38–43, Jan.–Mar. 1993.)

[8] I. Dumer, “Concatenated codes and their multilevel generalizations,” in Handbook of Coding Theory, V. S. Pless and W. C. Huffman, Eds. Amsterdam, The Netherlands: North-Holland, 1998, vol. I.

[9] G. D. Forney, Jr., “Coset codes—Part I: Introduction and geometrical classification,” IEEE Trans. Inform. Theory, vol. 34, pp. 1123–1151, Sept. 1988.

[10] G. D. Forney, Jr. and G. Ungerboeck, “Modulation and coding for linear Gaussian channels,” IEEE Trans. Inform. Theory, vol. 44, pp. 2384–2415, Oct. 1998.


[11] V. V. Ginzburg, “Multidimensional signals for a continuous channel,” Probl. Pered. Inform., vol. 20, no. 1, pp. 28–46, 1984. (In Russian. English translation in Probl. Inform. Transm., vol. 20, no. 1, pp. 20–34, Jan.–Mar. 1984.)

[12] J. Hagenauer, “Source-controlled channel coding,” IEEE Trans. Commun., vol. 43, pp. 2449–2457, Sept. 1995.

[13] J. Huber, U. Wachsmann, and R. Fischer, “Coded modulation by multilevel codes: Overview and state of the art,” in ITG-Fachbericht: Codierung für Quelle, Kanal und Übertragung, Aachen, Germany, Mar. 1998, pp. 255–266.

[14] H. Imai and S. Hirakawa, “A new multilevel coding method using error-correcting codes,” IEEE Trans. Inform. Theory, vol. IT-23, pp. 371–377, May 1977.

[15] D. B. Jaffe, “Binary linear codes: New results on nonexistence,” Dept. Math. Stat., Univ. Nebraska, Lincoln, Tech. Rep., Oct. 1996. [Online]. Available: http://math.unl.edu/~djaffe/codes/codes.ps.gz

[16] T. Kasami, T. Takata, T. Fujiwara, and S. Lin, “A concatenated coded modulation scheme for error control,” IEEE Trans. Commun., vol. 38, no. 6, pp. 752–763, June 1990.

[17] T. Kasami, T. Takata, T. Fujiwara, and S. Lin, “On multilevel block modulation codes,” IEEE Trans. Inform. Theory, vol. 37, pp. 965–975, July 1991.

[18] Y. Kofman, E. Zehavi, and S. Shamai, “Performance analysis of a multilevel coded modulation system,” IEEE Trans. Commun., vol. 42, pp. 299–312, Feb./Mar./Apr. 1994.

[19] M.-C. Lin, “A coded modulation scheme with interblock memory,” in Proc. 1990 IEEE Int. Symp. Information Theory, San Diego, CA, Jan. 1990, p. 51.

[20] M.-C. Lin and S.-H. Ma, “A coded modulation scheme with interblock memory,” IEEE Trans. Commun., vol. 42, pp. 911–916, Feb./Mar./Apr. 1994.

[21] M.-C. Lin, J.-Y. Wang, and S.-C. Ma, “On block-coded modulation with interblock memory,” IEEE Trans. Commun., vol. 45, pp. 1401–1411, Nov. 1997.

[22] T. J. Lunn and A. G. Burr, “Number of neighbors for staged decoding of block-coded modulation,” Electron. Lett., vol. 29, no. 20, pp. 1830–1831, Oct. 1993.

[23] S.-C. Ma and M.-C. Lin, “A trellis coded modulation scheme constructed from block-coded modulation with interblock memory,” IEEE Trans. Inform. Theory, vol. 40, pp. 1348–1363, Sept. 1994.

[24] F. J. MacWilliams and N. J. A. Sloane, The Theory of Error-Correcting Codes. Amsterdam, The Netherlands: North-Holland, 1977.

[25] R. H. Morelos-Zaragoza, T. Kasami, S. Lin, and H. Imai, “On block-coded modulation using unequal error protection codes over Rayleigh fading channels,” IEEE Trans. Commun., vol. 46, pp. 1–4, Jan. 1998.

[26] C.-N. Peng, “Block coded modulation schemes with interblock memory,” Ph.D. dissertation, Univ. Michigan, Ann Arbor, 1998.

[27] S. I. Sayegh, “A class of optimum block codes in signal space,” IEEE Trans. Commun., vol. COM-34, pp. 1043–1045, Oct. 1986.

[28] J. Simonis, “The [18, 9, 6] code is unique,” Discr. Math., vol. 106/107, pp. 439–448, Sept. 1992.

[29] N. J. A. Sloane, S. M. Reddy, and C.-L. Chen, “New binary codes,” IEEE Trans. Inform. Theory, vol. IT-18, pp. 503–510, July 1972.

[30] T. Takata, S. Ujita, T. Kasami, and S. Lin, “Multistage decoding of multilevel block M-PSK modulation codes and its performance analysis,” IEEE Trans. Inform. Theory, vol. 39, pp. 1204–1218, July 1993.

[31] J. H. van Lint, Introduction to Coding Theory. New York: Springer-Verlag, 1982.

[32] U. Wachsmann and J. Huber, “Distance profile of multilevel coded transmission and rate design,” in Proc. 1998 Information Theory Workshop, Killarney, Ireland, June 1998, pp. 10–11.

[33] R. G. C. Williams, “Low complexity block coded modulation,” Ph.D. dissertation, Univ. Manchester, Manchester, U.K., 1988.

[34] R. G. C. Williams, “Block coding for voiceband modems,” British Telecom Tech. J., vol. 10, no. 1, pp. 101–111, Jan. 1992.

[35] K. Yamaguchi and H. Imai, “A new block-coded modulation scheme and its soft decision decoding,” in Proc. 1993 IEEE Int. Symp. Information Theory, San Antonio, TX, Jan. 1993, p. 64.

[36] L. Zhang and B. Vucetic, “Multilevel block codes for Rayleigh fading channels,” IEEE Trans. Commun., vol. 43, pp. 24–31, Jan. 1995.

[37] V. A. Zinov'ev, “Generalized concatenated codes,” Probl. Pered. Inform., vol. 12, no. 1, pp. 5–15, 1976. (In Russian. English translation in Probl. Inform. Transm., vol. 12, no. 1, pp. 2–9, Jan.–Mar. 1976.)