[Universitext] Block Error-Correcting Codes ||

273

Transcript of [Universitext] Block Error-Correcting Codes ||

Universitext

Springer-Verlag Berlin Heidelberg GmbH

Sebastia Xamb6-Descamps

Bloek Error-Correeting Codes

A Computational Primer

, Springer

Sebastia Xamb6-Descamps Universitat Politecnica de Catalunya C. Pau Gargallo 5 08028 Barcelona Spain e-mail: [email protected]

Cataloging·in-Publication Data applied for

A catalog record for this book is available from the Iibrary of Congress.

Bibliographic information published by Oie Deutsche Bibliothek Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data is available in the Internet at <http://dnb.ddb.de>.

ISBN 978-3-540-00395-3 ISBN 978-3-642-18997-5 (eBook) DOI 10.1007/978-3-642-18997-5

Mathematics Subject Classification (2000): 94Bxx, 68P30, Il T71

This work is subject to copyright. AII rights are reserved, whether the whole or part of the material is concemed, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any otherway, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law ofSeptember 9, 1965, in its currentversion, and permission foruse must always be obtainedfromSprin­ger-Verlag. Violations are liable for prosecution under the German Copyright Law.

http://www.springer.de

© Springer-VerlagBerlinHeidelberg 2003 Originally published by Springer-Verlag Berlin Heidelberg New York in 2003

The use of general descriptive names, registered names, trademarks etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Cover design: design & production, Heidelberg Typesetting by the author using a lF,X macro package Printed on acid-free paper 40/3142ck-543210

Preface

There is a unique way to teach : to lead the other person through

the same experience with which you learned.

Oscar Villarroya, cognitive scientist.'

In this book, the mathematical aspects in our presentation of the basic theoryof block error-correcting codes go together, in mutual reinforcement, withcomputational discussions, implementations and examples ofall the relevantconcepts, functions and algorithms. We hope that this approach will facilitatethe reading and be serviceable to mathematicians, computer scientists andengineers interested in block error-correcting codes.i

In its digital form, which is a pdf document with hyperlinks, the exam­ples included in the book can be run with just a mouse click. Moreover, theexamples can be modified by users and saved to their local facilities for laterwork. For the convenience of readers, the site

http://www.wiris.com/cc/

has been set up not only to allow a free downloading of the digital version,but also , and primarily, in order to provide direct and free access to the ex­amples, and to the services provided to run them .

The program that handles the computations in the examples, togetherwith the interface that handles the editing (mathematics and text), here willbe called WIRIslcc . More specifically, WIRIS stands for the interface, the(remote) computational engine and the associated language, and cc standsfor the extension of WIRIS that takes care of the computational aspects thatare more specific of error-correcting codes .

WIRIslcc is an important ingredient in our presentation, but we do notpresuppose any knowledge of it. As it is very easy to learn and use , it is

'Cognitive Research Center at the Universitat Autonoma de Barcelona, interviewed byLluis Arniguet in"la contra", LA VANGUARDIA, 22/08/2002 .

2A forerunner of this approach wasoutlined in [31].

ii Preface

introduced gradually, according to the needs of our presentation. A succintdescription of the functions used can be found in the Index-Glossary at theend. The Appendix is a summary of the main features of the system.

This book represents the author's response to the problem of teaching aone-semester course in coding theory under the circumstances that will beexplained in a moment. Hopefully it will be useful as well for other teachersfaced with similar challenges.

The course has been taught at the Facultat de Matematiques i Estadistica(FME) of the Universitat Politecnica de Catalunya (UPC) in the last fewyears. Of the sixty sessions of fifty minutes each, nearly half are devoted toproblem discussions and problem solving. The students are junior or seniormathematics majors (third and fourth year) and some of them are pursuing adouble degree in mathematics and telecommunications engineering.

The nature of the subject, the curriculum at the FME and the contextat the UPC advise that, in addition to sound and substantial mathematicalconcepts and results, a reasonable weight should be given to algorithms andthe effective programming of them. In other words, learning should span awide spectrum ranging from the theoretical framework to aspects that maybe labeled as 'experimental' and 'practical'.

All these various boundary conditions, especially the stringent time con­straints and the huge size of the subject, are to be pondered very carefully inthe design of the course. Given the prerequisites that can be assumed (saylinear and polynomial algebra, some knowledge of finite fields, and basicfacts about probability theory), it is reasonable to aim at a good understand­ing of some of the basic algebraic techniques for constructing block error­correcting codes, the corresponding coding and decoding algorithms and agood experimental and practical knowledge of their working . To give a moreconcrete idea, this amounts to much of the material on block error correct­ing codes included in, say, chapters 3-6 in [25] (or chapters 4 to 8 in [20]),supplemented with a few topics that are not covered in these books , plus thecorresponding algorithmics and programming in the sense explained above.

Of course, this would be impossible unless a few efficiency principlesare obeyed. The basic one, some sort of Ockam's razor, is that in the theo­retical presentation mathematics is introduced only when needed . A similarprinciple is applied to all the computational aspects .

The choice of topics is aimed at a meaningful and substantial climax ineach section and chapter, and in the book as a whole, rather than at an obvi­ously futile attempt at completeness (in any form), which would be senselessanyway due to the immense size of the fields involved.

WIRIslcc makes it possible to drop many aspects of the current presenta­tions as definitely unnecessary, like the compilations of different sorts of ta­bles (as for example those needed for the hand computations in finite fields).

Preface iii

As a consequence, more time is left to deal with conceptual matters.It is perhaps also interesting to point out that the examples provided at

http ://www.wiris.com/cc/

can be a key tool for the organization oflaboratory sessions, both for individ­ual work assignments and for group work during class hours. The system,together with the basic communications facilities that today can be assumedin most universities, makes it easier for students to submit their homework indigital form, and for teachers to test whether the computational implementa­tions run properly.

The basic pedagogical assumptions underlying the whole approach arethat the study of the algorithms leads to a better understanding of the mathe­matics involved, and that the discipline of programming them in an effectiveway promotes a better understanding of the algorithms . It could be arguedthat the algorithms and programs are not necessary to solve problems, atleast not in principle, but it can also be argued that taking them into accountreinforces learning, because it introduces an experimental component in themathematical practice, and because it better prepares learners for future ap­plications.

Ofcourse , we do not actually know whether these new tools and methodswill meet the high expectations of their capacity to strengthen the students'understanding and proficiency. But part of the beauty of it all, at least forthe author, stems precisely from the excitement of trying to find out, by ex­perimenting together with our students , how far those tools and methods canadvance the teaching and learning ofmathematics, algorithms and programs.

The growing need for mathematicians and computer scientists

in industry will lead to an increase in courses in the area of

discrete mathematics. One of the most suitable and fascinating

is, indeed, coding theory.

1. H. van Lint, [25], p. ix.

Acknowledgements

The character of this book , and especially the tools provided at

http://www.wiris.com/cc/.

would hardly be conceivable without the WIRIS system, which would nothave been produced without the sustained, industrious and unflappable col­laboration ofDaniel Marques and Ramon Eixarch, first in the Project OMEGAat the FME and afterwards, since July 1999, as partners in the firm Mathsfor More.

iv Preface

On the Springer-Verlag side, the production of this book would neverhave been completed without the help, patience and know-how of ClemensHeine . Several other people at Springer have also been supportive in dif­ferent phases of the project. My gratitude to all, including the anonymouspeople that provided English corrections in the final stage.

It is also a pleasant duty to thank the FME and the Departament deMatematica Aplicada II (MA2) of the UPC for all the support and encour­agement while in charge of the Coding Theory course, and especially thestudents and colleagues for the everyday give-and-take that certainly has in­fluenced a lot the final form of this text. Joan Bruna, for example, suggestedan improvement in the implementation of the Meggitt decoder that has beenincluded in Section 3.4 .

Part of the material in this text, and especially the software aspects, werepresented and discussed in the EAGER school organized by Mina Teicherand Boris Kunyavski at Eilat (Israel, January 12-16, 2003). I am gratefulto the organizers for this opportunity, and to all the participants for their ea­ger inquiring about all aspects of the course. In particular, I have to thankShmu1ik Kaplan for suggesting an improvement of the altemant decoder im­plementation presented in Section 4.3 .

Grateful thanks are due to Thomas Hintermann for sharing his enlight­ening views on several aspects ofwriting, and especially on English writing.They have surely contributed to improve this work, but ofcourse only the au­thor is to be found responsible for the mistakes and blunders that have gonethrough undetected.

And thanks to my wife , Elionor Sedo. Without her enduring supportand love it would not have been possible to dedicate this work to her on theoccasion of our thirtieth wedding anniversary, for it would hardly have beenfinished .

The authorL'Escala20/1/03

Contents

Preface

Introduction 1

1 Block Error-correcting Codes 131.1 Basic concepts . 141.2 Linear codes. . . 341.3 Hadamard codes. 631.4 Parameter bounds 78

2 Finite Fields 1012.1 Zn and Fp • • . • • • • • • • • • • . • . • • • . . • 1032.2 Construction of finite fields . . . . . . . . . . . . . 1082.3 Structure of the multiplicative group of a finite field 1242.4 Minimum polynomial. . . . . . . . . . . . . . . . 133

3 Cyclic Codes 1413.1 Generalities ... . .. . .. . .. 1423.2 Effective factorization of X" - 1 . 1543.3 Roots of a cyclic code. 1643.4 The Meggitt decoder . . . . . . . 173

4 Alternant Codes 1794.1 Definitions and examples . . . . . . . . . . . . . . . 1804.2 Error location, error evaluation and the key equation. 1954.3 The Berlekamp-Massey-Sugiyama algorithm 2034.4 The Peterson-Gorenstein-Zierler algorithm . . . . . 215

Appendix: The WIRISlcc system 221Bibliography 235

Index of Symbols 237

Alphabetic Index, Glossary and Notes 241

Introduction

Error-correcting codes have been incorporated in numerous

working communication and memory systems.

W. Wesley Peterson and E. 1. Weldon, Jr., [16], p. v.

This chapter is meant to be an informal introduction to the main ideas oferror-correcting block codes.

We will see that if redundancy is added in a suitable form to the infor­mation to be sent through a communications channel (this process is calledcoding), then it is possible to correct some ofthe errors caused by the chan­nel noise if the symbols received are processed appropriately (this process iscalled decoding).

The theoretical limit ofhow good coding can be is captured by Shannon'sconcept of (channel) capacity and Shannon's epoch-making channel codingtheorem.'

At the end we will also explain how are we going to present the compu­tational discussions and materials.

How do we correct text mistakes?

'A trap dt ond milrian cileq begens with a sinqle soep',

Chinese saying

M. Bossert, [5], p. xiii .

Assume that you are asked to make sense of a 'sentence' like this : "Codingtheary zs bodh wntelesting and challenning" (to take a simpler example thanthe Chinese saying in the quotation) .

3"While his incredible inventive mind enriched many fields, Claude Shannon 's enduringfame will surely rest on his 1948 paper A methematical theory ofCommuni cation and the on­going revolution in information technology it engendered" (Solomon W.Golomb , in "ClaudeElwood Shannon (1916-2001)", Notices of the AMS, volume 49, number I, p. 8).

2 Introduction

You realize that this proposed text has mistakes and after a little inspec­tion you conclude that most likely the writer meant "Coding theory is bothinteresting and challenging".

Why is this so? Look at 'challenning', for example. We recognize it isa mistake because it is not an English word. But it differs from the Englishword 'challenging' in a single letter. In fact, we do not readily find otherEnglish words that differ from 'challenning' in a single letter, and thus anyother English word that could be meant by the writer seems much moreunlikely than 'challenging'. So we 'correct' 'challenning' to 'challenging'.A similar situation occurs with the replacement of 'theary' by 'theory'. Nowin 'bodh', if we were to change a single letter to get an English word, wewould find 'bode, 'body" or 'both', and from the context we would choose'both', because the other two do not seem to make sense.

There have been several implicit principles at work in our correction pro­cess, including contextual information. For the purposes of this text, how­ever, the main idea is that most English words contain redundancy in thesense that altering one letter (or sometimes even more) will usually still al­low us to identify the original word. We take this word to be the correct one,because, under reasonable assumptions on the source oferrors, we think thatit is the most likely .

The repetition code Rep(3)

Similar principles are the cornerstones of the theory and practice of error­correcting codes. This can be illustrated with a toy example, which, despiteits simplicity, has most of the key features of channel coding.

Suppose we have to transmit a stream of bits Ul U2U3 . . . (thus U i E

{a, I} for all i) through a 'noisy channel' and that the probability that onebit is altered is p. We assume that p is independent of the position of the bitin the stream, and of whether it is 0 or 1. This kind of channel is called abinary symmetric channel with bit-error rate p.

In order to try to improve the quality of the information at the receivingend, we may decide to repeat each bit three times (this is the redundancy weadd in this case). Since for each information bit there are three transmissionbits, we say that this coding scheme has rate 1/3 (it takes three times longerto transmit a coded message than the uncoded one). Thus every informationbit U is coded into the code word UUU (or [uu u] if we want to representit as a vector). In particular, we see that there are two code words, 000and Ill . Since they differ in all three positions, we say that the minimumdistance of the code is 3. The fact that the code words have 3 bits, that eachcorresponds to a single information bit, and that the two code-words differ in

Introduction 3

3 positions is summarized by saying that this repetition code, which we willdenote Rep(3), has type [3,1,3).

If we could assume that at most one error is produced per each blockof three bits, then the redundancy added by coding suggests to decode eachblock of three bits by a 'majority vote' decision: 000, 100, 010, 001 aredecoded as a and Ill, all, 101, 110 are decoded as 1. Of course, if two orthree errors occurred -a possibility that usually is much more unlikely thanhaving at most one error-, then the decoder would give the wrong answer.

All seems very good, but nevertheless we still have to ask whether thisscheme really improves the quality of the received information. Since thereare three times more bits that can be in error, are we sure that this coding anddecoding is helping?

To decide this, notice that p represents the proportion of error bits if nocoding is used, while ifwe use Rep(3) the proportion ofbit errors will equalthe proportion p' of received 3-bit blocks with 2 or 3 errors (then we maycall the decoded bit a code-error). Since

we see that p' / p, which can be called the error-reduction factor of the code,is equal to

3p(1 - p) + p2 = 3p - 2p2 = 3p(1 - ~p)

If p is small , as it usually is, this express ion is also small , and we canuse the (excess) approximation 3p as its value. For example , 3p = 0.003 forp = 0.001, which means that, on average, for every 1000 errors producedwithout coding we will have not more than 3 with coding.

This is definitely a substantial improvement, although we are paying aprice for it: the number of transmitted bits has been tripled, and we also haveadded computing costs due to the coding and decoding processes.

Can we do better than in the Rep(3) example? Yes, of course, and this iswhat the theory of error-correcting codes is about.

Remarks. There is no loss of generality in assuming, for a binary symmet­ric channel , that p ~ 1/2 (if it were p > 1/2, we could modify the channelby replacing any 0 by 1 and any 1 by a at the receiving end). If this conditionis satisfied, then p' ~ p, with equality only for p = aand p = 1/2 (see Fig­ure 1). It is clear, therefore, that we can always assume that Rep(3) reducesthe proportion of errors , except for p = 1/2, in which case it actually doesnot matter because the channel is totally useless. In Figure 1 we have alsodrawn the graph of p'[p (the error reduction factor) . Note that it is < 1 forp < 1/2 and < 1/2 for p < 0.19.

4 Introduction

Figure 1: The graph ofp' = 3p2 - 2p3 as a function of p in the interval [0, ~1is the concave arc. The convex arc is the graph in the same interval ofp' /p (theerror-reduction factor). It is < 1for p < 1/2 and < 1/2for p < 0.19. Note that inthe interval [0,0.19J it decreases almost linearly to O.

Channel capacity and Shannon's channel coding theorem

Shannon, in his celebrated paper [21] , introduced the notion of capacity ofa channel- a number C in the interval [0,1] that measures the maximumfraction of the information sent through the channel that is available at thereceiving end. In the case of a binary symmetric channel it turns out that

C = 1 + p logz(p) + (1 - p) logz(1 - p),

where logz is the base 2 logarithm function. Notice that C = C(p) isa strictly decreasing function in the interval [0,1/2] , with C(O) = 1 andC(1/2 ) = 0, and that C(p) = C(l - p) (see Figure 2).

Figure 2: Graph ofthe capacity C (p). p E [0, l],for a binary symmetric channel

Introduction 5

Shannon's channel coding theorem states that if R is a positive real num­ber less than the capacity C of a binary symmetric channel (so 0 < R < C),and e is any positive real number, then there are 'codes' with rate at least Rand with a probability ofcode-error less than € (we refer to [25], [12] or [20]for details).

Shannon's theorem shows that in theory it is possible to transmit infor­mation with sufficient confidence and with a transmission time increase bya factor that can be as close to l/C as desired. Unfortunately, his methodsonly show the existence of such codes, but do not produce them, nor theircoding and decoding, in an effective way. It can be said that the main motiva­tion of the theory of error-correcting codes in the last half century has been,to a great extent , to find explicit codes with good rates, small code-errorprobabilities and with fast coding and decoding procedures.

Comments on our computational approach and Web services

Algorithms ... are a central part of both digital signal processing

and decoders for error-control codes...

R. E. Blahut, [4] , p. vii.

Let us return to the Rep(3) example in order to explain how we are goingto deal with the algorithmic and computing issues. In this case we would liketo define two functions , say f and g, that express, in a computational sense,the coding and the decoding we have been discussing. The result could bethe listing The code Rep(3). In this case it is a Library, with a Library Area(the area inside the rectangle below the title) and a Computation Area (thearea below the LibraryArea).

AI Library I The code Rep(3)

Coding functionI f(u) :=[u, u, u]

Majority decoderg([a, b, c]) :=

if a=b Ia=c thena

else if b=c thenb

else"Decoder error"

end

[Write the expressions you want to evaluate.in this area, starting at the block below

~

6 Introduction

The syntax and semantics of the expressions in the listing is explained inmore detail in the Appendix, but it should be clear enough at this point. Therelevant comment to be made here is that in the digital version we do not geta listing, but a label like The code Rep(3) . This label is linked to a suitablefile and when we click on the label, t at e is displayed on an WIRIS screen.The result is equivalent to the listing The code Rep(3) , but now with all thepower of WIRIslcc is available to us.

There are two main kinds of tasks that we can do.One is modifying the code in the Library area. This is illustrated in the

listing An alternative decoder for Rep(3) , where we have added the func­tion h, which is another implementation of the majority decoder for length 3binary vectors. In this case we are using the function weight that returns thenumber of nonzero entries of a vector (it is just one of the many functionsknown to WIRIslcc).4

"" Library I An alternative decoder for Rep(3)

Coding functionI f(u) := [u, u, u]

Majority decoderg([a, b, cJ) :=

if a=b Ia=c thena

else if b=c thenb

endh(x) :=

if weight(x»1 then1

elseo

end

The other kind of tasks we can do is writing expressions in the Compu­tation Area (in this case the empty block underneath the Library Area) andask for their evaluation. This is illustrated in the listings Some coding anddecoding requests and Results of the requests.

A WIRIslcc document can be saved at any time as an html document inthe facilities available to the user. These html files can be loaded and readwith a Web browser with Java, or can be 'pasted', without loosing any oftheir functionalities, to other html documents . In fact, this is how we havegenerated the listings we have been considering so far, and all the remainingsimilar listings in this book.

4Here for simplicity we assume that the vector [a be] is binary. Thus we have also omittedthe error handling message "Decoder error" when a, b, c are different.

Introduction

"'1 Library I Some codinq and decodinq requests

Coding functionI f(u) :=[u, u, u]

Majority decoderg([a, b, c]) :=

if a=b Ia=c thena

else if b=c thenb

endh(x) := if weight(x»1 then

1else

°end

Regardless of whether the Library folder is open or closed,the functions defined in it, and all other WIRIS/cc functions,can be used in this area

I f(O)I g([1,O,1])I h([1,O,1J)

Results of evaluating the requestsI f(O) ... [0 , 0, 0]I g( [1,0,1]) 1I h( [1,0,1]) 1

WIRIS/cc

7

As indicated before, we use WIRIslcc for all the algorithmic and program­ming aspects . It may be thought of as the conjunction of the general purposelanguage ofWIRIS and the special library cc for the treatment ofblock error­correcting codes. This language has been designed to be close to acceptedmathematical practices and at the same time encompass various program­ming paradigms. As with the mathematics, its features will be describedwhen they are needed to understand the examples without assuming pro­gramming skills on the part of the reader.

As an illustration , let us look at the decoder function 9 for the Rep(3)code which, for the reader 's convenience, we reproduce in the listing De­coder for Rep(3). The formal parameter of 9 is a length 3 vector [a be].Since the components of this vector are named symbolically, and the namescan be used in the body of the function (that is, the code written after := ),we see that WIRIS supports 'pattern-matching' . In this case, the body of thefunction is an if ... then ... else ... end expression. The value of this expres­sion is a when a and b are equal or when a and c are equal, and it is b whenband c are equal.

8

A'Library I Decoder for Rep(3)

g([a, b, c]):= if a=b Ia=c then

aelse if b=c then

bend

~ \caret

Introduction

Ofcourse, WIRIS can be used for many purposes other than codes. Figure2 on page 4, for example, has been generated as indicated in the listing Thecapacity functlon.f

AILibrary I The capacity function Ctn)

capacity (p :IR)check O~p & p~1

:= if p=O Ip=1 then1

else1+p ·log2(p) + (1-p) · log2 (1- p)

end

Drawing the graph of the capacity functionI C=curve(capacity,O..1);

Iplotter({width=1 .2, height=1 .2, center=point(O.5,O.5),

window_width=150, window_height=150});I plot(C) ;

Goals ofthis book

The goal of this text is to present the basic mathematical theory of blockerror-correcting codes together with its algorithmic and computational as­pects. Thus we will deal not only with the mathematics involved, but alsowith the related algorithms and with effective ways of programming thesealgorithms . This treatment has been illustrated already in the case of therepetition code Rep(3) above: we have listed there, after explaining the con­cepts ofcoder and decoder, a translation ofthem as algorithms (the functions

SIn the listing we have included only the parts that are essential for the graph. For example,we have omitted how labels, or the auxiliary segments, have been included in the final picture.In the digital book, the graph is generated by evaluating the expressions that appear on theWIRIS screen when we click on the link.

Introduction 9

f and g) and programs (in this case, the fact that f and 9 can be loaded andrun by a computer) .

The book has four chapters. The first is a broad introduction to the subjectof block error-correcting codes. After a first section devoted to the presen­tation of the basic concepts , we study linear codes and the basic computa­tional issues of the syndrome-leader decoder. The third section is devoted toHadamard codes, which in general are non-linear, and in the fourth sectionwe study the more outstanding bounds on the parameters of codes.

The second chapter is an independent introduction to finite fields andtheir computational treatment to the extent that they are needed later. Usuallyit is prefarable to cover its contents as the need arises while working on thematerial of the last two chapters.

Chapter 3 is devoted to cyclic codes, and to one of their practical decod­ing schemes, namely the Meggitt decoder. This includes a presentation ofthe two Golay codes, including its Meggitt decoding. The important Bose­Chaudhuri-Hocquenghem codes, a subfamily of the cyclic codes, are alsoconsidered in detail.

Chapter 4 is the culmination of the text in many respects. It is de­voted to altemant codes, which are generally not cyclic, and to their maindecoders (basically the Berlekamp-Massey-Sugiyama decoder, based onthe Euclidean division algorithm, and the Peterson-Gorenstein-Zierler al­gorithm, based on linear algebra, as well as some interesting variations ofthe two). It is to be noted that these decoders are applicable to the classi­cal Goppa codes, the Reed-Solomon codes (not necessarily primitive), andthe Bose-Chaudhuri-Hocquenghem codes (that include the primitive Reed­Solomon codes).

Chapter summary

• We have introduced the repetition code Rep(3) for the binary sym­metric channel, and Shannon's celebrated formula for its capacity,C = 1 + plog2(p) + (1 - P)log2(1 - p), where p is the bit-errorrate of the channel.

• Assuming p is small, the quotient of the code-error rate pi of Rep(3)over the bit-error rate p is aproximately 3p. Hence, for example, ifthere is one bit error per thousand bits transmited (on average), thenusing the code will amount to about three bit errors per million bitstransmitted (on average).

• The repetition code Rep(3) shows that error correction is possible at

10 Introduction

the expense of the information transmission rate and of computationalcosts. Shannon's channel coding theorem, which says that if Rand edenote positive real numbers such that 0 < R < C, where C is thecapacity, then there are 'codes' with a rate not less than R and witha code-error probability not higher than e, expresses the theoreticallimits of coding.

• Shannon only showed the existence of such codes, but no efficientways for constructing, coding and decoding them.

• The language in which the functions and algorithms are programmed,WIRIslcc, is very easy to learn and use, is explained gradually alongthe way and does not require any programming skills .

• There is a pdf hypertext version of this book freely accessible at

http://www.wiris.com/cc/.

This version allows the reader to run all the examples in the book,as well as her own examples and programs . The examples are alsoaccessible directly at the same site.

• The goal of this text is to present in the simplest possible terms, andcovering systematically the most relevant computational aspects, someof the main breakthroughs that have occurred in the last fifty years inthe quest of explicit codes, and efficient procedures for coding anddecoding them, that approach Shannon's theoretical limit ever moreclosely.

Conventions and notations

Exercises and Problems. They are labeled E.m.n and P.m.n, respectively,where n is the n-th exercise or problem within chapter m. Problems aregrouped in a subsection at the end of each section, whereas exercises areinserted at any place that seems appropriate.

Mathematics and wmts/cc. In the text, the expression of mathematicalobjects and the corresponding WIRIsicc expression are written in differenttypes. This translation may not be done if the mathematical and WIRIS syn­taxes are alike, especially in the case of symbolic expressions. For example,we do not bother in typesetting a ..b or xlx' for the WIRIsicc expressions ofthe mathematical range a..b or the concatenation x ix' ofthe vectors x and x' .

Note that there may be WIRIsicc objets (like Range, a type) that are notassigned an explicit formal name in the mathematical context.

Introduction 11

The ending symbol D. In general this symbol is used to signal the endof a block (for example a Remark) that is followed by material that is notclearly the beginning of another block. An exception is made in the case ofproofs : we always use the symbol to denote the end of a proof, even if afterit there is a definite beginning of a new block. On the other hand, since theend of theorems without proof is always clear, the symbol is never used inthat case.

Quality improvement. We would be very appreciative iferrors and sugges­tions for improvements were reported to the author at the following emailaddress: [email protected].

1 Block Error-correcting Codes

Channel coding is a very young field. However, it has gained

importance in communication systems, which are almost incon­

ceivable without channel coding.

M. Bossert, [5],p. xiv.

The goal of this chapter is the study of some basic concepts and construc­tions pertaining to block error-correcting codes, and of their more salientproperties , with a strong emphasis on all the relevant algorithmic and com­putational aspects.

The first section is meant to be a general introduction to the theory ofblock error-correcting codes. Most of the notions are illustrated in detailwith the example of the Hamming [7,4,3] code. Two fundamental results onthe parameters of a block code are proved: the Hamming upper-bound andthe Gilbert lower-bound.

In the second section, we look at linear codes, also from a general pointof view. We include a presentation of syndrome decoding. The main exam­ples are the Hamming codes and their duals and the (general) Reed-Solomoncodes and their duals. We also study the Gilbert-Varshamov condition on theparameters for the existence of linear codes and the MacWilliams identitiesbetween the weight enumerators of a linear code and its dual.

Special classes of non-linear codes, like the Hadamard and Paley codes,are considered in the third section. Some mathematical preliminairies onHadamard matrices are developed at the beginning. The first order Reed­Muller codes, which are linear, but related to the Hadamard codes, are alsointroduced in this section.

The last section is devoted to the presentation of several bounds on theparameters of block codes and the corresponding asymtotic bounds. Sev­eral procedures for the construction of codes out of other codes are alsodiscussed.

14 Chapter 1. Block Error-correcting Codes

1.1 Basic concepts

Although coding theory has its origin in an engineering prob­

lem, the subject has developed by using more and more so­

phisticated mathematical techniques.

F.1. MacWilliams and N. J. A. Sloane, [II] , p. vi.

Essential points

• The definition of block code and some basic related notions (code­words, dimension, transmission rate, minimum distance , equivalencecriteria for block codes) .

• The definition of decoder and of the correcting capacity of a decoder.

• The minimum distance decoder and its error-reduction factor.

• The archtypal Hamming code [7,4,3] and its computational treatment.

• Basic dimension upper bounds (Singleton and Hamming) and the no­tions of MDS codes and perfect codes.

• The dimension lower bound of Gilbert.

Introductory remarks

The fundamental problem of communication is that of repro­

ducing at one point either exactly or approximately a message

selected at another point.

C. E. Shannon 1948, [21].

Messages generated by an information source often can be modelled as astream 8 1 , 82 , . . . of symbols chosen from a finite set S called the source al­phabet. Moreover, usually we may assume that the time ts taken by thesource to generate a symbol is the same for all symbols .

To send messages we need a communications channel. A communica­tions channel can often be modelled as a device that takes a stream of sym­bols chosen from a finite set T, that we will call the channel (or transmission)alphabet, and pipes them to its destination (called the receiving end of the

1.1. Basicconcepts 15

channel) in some physical form that need not concern us here. I As a result, astream of symbols chosen from T arrives at the receiving end of the channel.

The channel is said to be noiseless if the sent and received symbols al­ways agree. Otherwise it is said to be noisy, as real channels almost alwaysare due to a variety of physical phenomena that tend to distort the physicalrepresentation of the symbols along the channel.

The transfer from the source to the sending end of the channel requiresan encoder, that is, a function f : S ---4 T* from the set of source symbolsinto the set T* offinite sequences of transmission symbols. Since we do notwant to loose information at this stage, we will always assume that encodersare injective. The elements of the image of f are called the code-words ofthe encoder.

In this text we will consider only block encoders, that is, encoders withthe property that there exists a positive integer n such that C ~ T", whereC = f (S) is the set of code words of the encoder. The integer n is called thelength of the block encoder.

For a block encoding scheme to make sense it is necessary that the sourcegenerates the symbols at a rate that leaves enough time for the opperation ofthe encoder and the channel transmission of the code-words. We alwayswill assume that this condition is satisfied, for this requirement is taken intoaccount in the design of communications systems.

Since for a block encoder the map f : S ---4 C is bijective, we may con­sider as equivalent the knowledge of a source symbol and the correspondingcode word. Generally speaking this equivalence holds also at the algorithmiclevel, since the computation of f or of its inverse usually can be efficientlymanaged . As a consequence, it will be possible to phrase the main issuesconcerning block encoders in terms of the set C and the properties of thechannel. In the beginning in next section, we adopt this point of view as ageneral starting point of our study of block codes and, in particular, of thedecoding notions and problems .

1.1 Remarks. The set {O, I} is called the binary alphabet and it is verywidely used in communications systems. Its two symbols are called bits.With the addition and multiplication modulo 2, it coincides with the field Z2of binary digits . The transposition of the two bits (0 ~ 1, 1 ~ 0) is callednegation. Note that the negation of a bit b coincides with 1 + b.

1.2Example. Consider the Rep(3) encoder f considered in the Introduction(page 2). In this case Sand T are the binary alphabet and C = {OOO, 111},respectively. Note that the inverse of the bijection f : S ---4 C is the map000 ~ 0, 111 ~ 1, which can be defined as x ~ Xl.

IThe interested reader may consult [19, 5].

16

Block codes

Chapter 1. Block Error-correcting Codes

Only block codes for correcting random errors are discussed

F.1. MacWilliams and N. 1. A. Sloane , [11], p. vii.

Let T = {tl , "" tq } (q ~ 2) be the channel alphabet. By a (block) codeof length n we understand any non-empty subset C ~ T" : If we want torefer to q explicitely, we will say that C is q-ary (binary if q = 2, ternaryif q = 3). The elements of C will be called vectors, code-words or simplywords.

Usually the elements of T " are written in the form x = (Xl"" , xn ) ,

where Xi E T. Since the elements of T " can be seen as length n sequencesof elements ofT, the element X will also be written as X IX2 . •. Xn (concate­nated symbols), especially when T = {O, I}.

If X E T" and x' E Tn' , the expression xix' will be used as an alternativenotation for the element (x, x' ) E T n+n ' . Similar notations such as xlx'lx"will be used without further explanation.

Dimension, transmission rate and channel coding

If C is a code of length n, we will set kc = logq(IC I) and we will say thatkc is the dimension ofC. The quotient Rc = kcln is called transmissionrate, or simply rate , of C.

1.3 Remarks. These notions can be clarified in terms of the basic notionsexplained in the first subsection (Introductory remarks). Indeed, 10gq (ITnl) =logq(qn) = n is the number of transmission symbols of any element of T" ,So kc = logq(ICI) = 10gq(ISI) can be seen as the number of transmis­sion symbols that are needed to capture the information of a source symbol.Consequently, to send the information content of kc transmission symbols(which , as noted, amounts to a source symbol) we need n transmission sym­bols, and so He represents the proportion of source information containedin a code word before transmission. An interesting and useful special caseoccurs when S = r-, for some positive integer k. In this case the sourcesymbols are already resolved explicitely into k transmission symbols and wehave kc = k and Rc = kin. D

If C has length n and dimension k (respectively IC I = M) , we say thatC has type tn , k] (respectively type (n ,M ) . Ifwe want to have q explicitelyin the notations, we will write tn , k]q or (n, M )q. We will often write C rv

tn , k] to denote that the type of Cis tn , k], and similar notations will be usedfor the other cases.

1.1. Basicconcepts

Minimum distance

17

Virtually all research on error-correcting codes has been based

on the Hamming metric .

w.w. Peterson and E.J. Weldon, Jr., [16], p. 307.

Given x , y E T " ; the Hamm ing distance between x and y, which we willdenote hd(x , y), is defined by the formula

hd(x , y) = I{i E l..n IXi i- ydl·

In other words , it is the number of positions i in which x and y differ.

E.l.!. Check that hd is a distance on T", Recall that d : X x X ---+ lR is saidto be a distance on the set X if, for all x, y, z E X,

I) d( x, y) ~ 0, with equality if and only if x = y;

2) d(y, x) = d(x ,y) ; and

3) d(x ,y) ::;; d(x,z) + d(z,y).

The last relation is called triangl e inequality.

E.l.2. Let x , y E Z2 be two binary vectors. Show that

hd(x , y) = Ixl+ Iyl - 21x, yl

where Ixl is the number of non-zero entries of x (it is called the weight of x )and x . y = (XlYl, ... ,xnYn) . 0

We will set d = de to denote the minimum of the distances hd(c, c' ), wherec, dEe and c i- c', and we will say that d is the minimum distance of e.Note that for the minimum distance to be defined it is necessary that lei ~ 2.For the codes e that have a single word, we will see that it is convenient toput de = n + 1 (see the Remark 1.12).

We will say that a code e is of type tn,k, d] (or of type (n,M, d)), ife has length n, minimum distance d, and its dimension is k (respectivelylei = M). If we want to have q explicitely in the notations, we will write[n, k,d]q or (n, M ,d)q. In the case q = 2 it is usually omitted. Sometimeswe will write e rv tn, k,d]q to denote that the type of e is [n,k, dJq , andsimilar notations will be used for the other cases.

The rational number 6e = de / n is called relative distance of e.

E.1.3. Let e be a code of type (n, M , d). Check that if k = n, then d = 1.Show also that if M > 1 (in order that d is defined) and d = n then M ::;; q,

hence ke ::;; 1.

18 Chapter 1. Block Error-correcting Codes

1.4 Example (A binary code (8,20,3)). Let C be the binary code (8, 20) con­sisting of 00000000, 11111111 and all cyclic permutations of 10101010,11010000 and 11100100. From the listing Cyclic shifts of a vector it iseasy to infer that de = 3. Thus C is a code of type (8,20,3) .

AI Library I Cyclic shifts of a vector

Icyclic_shift( x :Vector ) := [ Xlenglh(x)] I take(x,length(x)-1)

cyclic_shifts( x :Vector ) :=begin

local X={x}, y=cyclic_shift(x)while y=t=x do

X=XI{y}y=cyclic_shift(y)

endX

end

I \careta=[1 ,1,0,1,0,0,0,0] ; X=cyclic_shifts(a);I b=[1,1,1,0,0,1,0,0]; Y=cyclic_shifts(b);I c=[1 ,0,1,0,1,0,1,0]; Z=cyclic_shifts(c);I {hd(a, x) with x in tail(X)} ..... {4, 4, 4, 6, 4, 4, 4}I {hd(a,y) with y in Y} {3, 3, 5, 3, 5, 7, 3, 3}I {hd(a,z) with Z in Z} {5, 3}I {hd(b,y) with y in tail(Y)} ..... {4, 6,4,4,4,6, 4}I {hd(b,z) with Z in Z} ..... {4,4}I {hd(c,z) with Z in tail(Z)} ..... {8}

1.5 Remark. A code with minimum distance d detects up to d - 1 errors, inthe sense that the introduction of a number of errors between 1 and d - 1 inthe transmission gives rise to a word that is not in C. Note that d - 1 is thehighest integer with this property (by definition of d).

However, [the Hamming distance] is not the only possible and

indeed may not always be the most appropriate. For exam­

ple, in (FlO )3 we have d(428 ,438) = d(428 ,468), whereas in

practice, e.g. in dialling a telephone number, it might be more

sensible to use a metric in which 428 is closer to 438 than it is

to 468.

R. Hill, [9], p. 5.

Equivalence criteria

We will say that two codes C and C' of length n are strictly equivalent ifC' can be obtained by permuting the entries of all vectors in C with somefixed permutation. This relation is an equivalence relation on the set of allcodes of length n and by definition we see that it is the equivalence relation

1.1. Basic concepts 19

that corresponds to the natural action ofSn on T", and hence also on subsetsof t».

In the discussion of equivalence it is convenient to include certain per­mutations of the alphabet T in some specified positions. This idea can beformalized as follows. Let r = (I'1 , .. . , I'n), where I', is a subgroup ofper­mutations ofT, that is, a subgroup of Sq (1 ~ i ~ n). Then we say that twocodes C and C' are T-equivalent if C' can be obtained from C by a permuta­tion (J' E Sn applied, as before, to the entries of all vectors of C, followed bypermutations Ti E I', of the symbols ofeach entry i, i = 1, .. . , n . If I' , = Sqfor all i, instead of r -equivalent we will say Sq-equivalent , or simply equiv­alent . In the case in which T is a finite field F and I', = F - {O}, acting on Fby multiplication, instead of r -equivalent we will also say F-equivalent, orF*-equivalent, or scalarly equivalent.

Note that the identity and the transposition of:1:2 = {O, I} can be repre­sented as the operations x 1--+ x+O and x 1--+ x+1, respectively, and thereforethe action ofa sequence TI, . . . , Tn ofpermutations of:1:2 is equivalent to theaddition of the vector T E :1:~ that corresponds to those permutations.

In general it is a relatively easy task, given I', to obtain from a code Cother codes that are I'-equivalent to C, but it is much more complicated todecide whether two given codes are T-equivalent.

KIA. Check that two (strongly) equivalent codes have the same parametersn , k and d.

E.1.5. Show that given a code C ~ T" and a symbol t E T, there exists acode equivalent to C that contains the constant word tn.

K1.6. Can there be codes of type (n, q, n)q which are not equivalent to theq-ary repetition code? How many non-equivalent codes of type (n ,2, dhthere are?

Decoders

The essential ingredient in order to use a code C ~ T" at the receiving endof a channel to reduce the errors produced by the channel noise is a decodingfunction . In the most general terms, it is a map

g: D ~ C, where C ~ D ~ Tn

such that g( x) = x for all x E C. The elements of D, the domain of g, aresaid to be g-decodable. By hypothesis, all elements of Care decodable. Incase D = T", we will say that 9 is afull decoder (or a complete decoder) .

We envisage 9 working, again in quite abstract terms , as follows. Givenx E C , we imagine that it is sent through a communications channel. Let

20 Chapter 1. Block Error-correcting Codes

Y E T" be the vector received at the other end of the channel. Since thechannel may be noisy, y may be different from x, and in principle can be anyvector ofT" , Thus there are two possibilities:

• ify E D, we will take the vector x' = g(y) E C as the decoding ofy;

• otherwise we will say that y is non-decodable, or that a decoder errorhas occurred.

Note that the condition g(x) = x for all x E C says that when a codeword is received, the decoder returns it unchanged. The meaning of this isthat the decoder is assuming , when the received word is a code-word, that itwas the transmitted code-word and that no error occurred.

If we transmit x, and y is decodable, it can happen that x' =1= x . In thiscase we say that an (undetectable) code error has occurred.

Correction capacity

We will say that the decoder 9 has correcting capacity t, where t is a positiveinteger, if for any x E C, and any y E T" such that hd (x , y) :::; t, we havey E D andg(y) = x .

1.6 Example. Consider the code Rep(3) and its decoder 9 considered in theIntroduction (page 2). In this case T = {O, I}, C = {OOO, Ill} and D =T 3 , so that it is a full decoder. It corrects one error and undetectable codeerrors are produced when 2 or 3 bit-errors occur in a single code-word .

1.7 Remark. In general, the problem ofdecoding a code C is to construct Dand 9 by means of efficient algorithms and in such a way that the correctingcapacity is as high as possible.

Minimum distance decoder

Let us introduce some notation first. Given w E T" and a non-negativeinteger r, we set

B (w, r) = {z E T" Ihd (w , z) :::; r} .

The set B (w, r) is called the ball ofcenter wand radius r.If C = {xl , . . . , xM}, let D, = B(xi , t), where t = L(d - I)/2J, with

d the minimum distance of C. It is clear that C n D, = {xi } and thatD, n Dj = 0 if i =1= j (by definition of t and the triangular inequality ofthe Hamming distance) . Therefore, if we set Dr; = Ui Di, there is a uniquemap g : Dr: -t C such that g(y) = xi for all y E Di, By construction, 9is a decoder of C and it corrects t errors. It is called the minimum distancedecoder ofC.

1.1. Basic concepts 21

E.1.7. Check the following statements:

1. g(y) is the word x' E e such that hd (y , x' ) :::;; t, if such an x' exists,and otherwise y is non-decodable for g.

2. Ify is decodable and g( y) = x', then

hd (x , y) > t for all x E e - { x'}.

1.8 Remarks. The usefulness of the minimum distance decoder arises fromthe fact that in most ordinary situations the transmissions x f---> y that lead toa decoder error (y tJ. D), or to undetectable errors (y E D, but hd(y , x ) > t)will in general be less likely than the transmissions x f---> y for which y isdecodable and g(y) = x .

To be more precise, the minimum distance decoder maximizes the like­lihood of correcting errors if all the transmission symbols have the sameprobability of being altered by the channel noise and if the q - 1 possibleerrors for a given symbol are equally likely. If these conditions are satis­fied, the channel is said to be a (q-ary) symmetric channel. Unless otherwisedeclared , henceforth we will understand that 'channel ' means 'symmetricchannel ' .

From the computational point of view, the minimum distance decoder,as defined above, is inefficient in general, even if de is known, for it has tocalculate hd (y , x) , for x E C, until hd(y, x ) :::;; t, so that the average numberof distances that have to be calculated is of the order of lei = q"<. Note alsothat this requires having generated the qk elements ofe.

But we also have to say that the progress in block coding theory in thelast fifty years can, to a considerable extent, be seen as a series ofmilestonesthat signal conceptual and algorithmic improvements enabling to deal withthe minimum distance decoder, for large classes of codes, in ever more ef­ficient ways. In fact, the wish to collect a representative sample of thesebrilliant achievements has guided the selection of the decoders presented insubsequent sections of this book, starting with next example.

1.9 Example (The Hamming code [7,4,3]). Let K = ;[;2 (the field of binarydigits). Consider the K -matrix

110)101011

Note that the columns of R are the binary vectors oflength 3 whose weight isat least 2. Writing I; to denote the identity matrix of order r, let G = I41RT

and H = R lh (concatenate 14 and RT , and also Rand h , by rows). Note

22 Chapter I. Block Error-correcting Codes

that the columns of H are precisely the seven non-zero binary vectors oflength 3.

Let S = K 4 (the source alphabet) and T = K (the channel alphabet).Define the block encoding f : K 4 -+ K 7 by u 1-+ uG = uluRT . The imageof this function is C = (G), the K -linear subspace spanned by the rows ofG, so that C is a [7, 4J code. Since

because the arithmetic is mod 2, we see that the rows of G, and hence theelements ofC, are in the kernel of H T . In fact,

as the right-hand side contains C and both expressions are K-linear sub­spaces of dimension 4. From the fact that all columns of H are distinct ,it is easy to conclude that de = 3. Thus C has type [7,4,31. Since ICI .vO/(7 , 1) = 24 (1 + 7) = 28 = IK81, we see that De = K 7 .

As a decoding function we take the map 9 : K 7 -+ C defined by thefollowing recipe:

I. Let s = yHT (this length 3 binary vector is said to be the syndrome ofthe vector y).

2. If s = 0, return y (as we said above, s = °is equivalent to say thaty E C).

3. If s =1= 0, let j be the index of s as a row of H T .

4. Negate the j-th bit of y.

5. Return the first four components ofy.

Let us show that this decoder, which by construction is a full decoder(D = K7), coincides with the minimum distance decoder. Indeed, assumethat x E C is the vector that has been sent. Ifthere are no errors, then y = x,s = 0, and the decoder returns x . Now assume that the j-th bit of x E C ischanged during the transmission, and that the received vector is y. We canwrite y = x + ej, where ej is the vector with I on the j-th entry and°on theremaining ones. Then s = yHT = xHT + ejHT = ejHT , which clearlyis the j-th row of HT . Hence g(y) = y + ej = x (note that the operationy 1-+ Y + ej is equivalent to negating the j-th bit of y). 0

The expression of this example in WIRIsicc is explained in the listing Ham­ming code [7,4,3]. The WIRIS elements that are used may be consulted inthe Appendix, or in the Index-Glossary.

1.1. Basic concepts 23

"-I Library I Hamming code [7,4,31 1"-Let R be the binary matrix whose columns are the binary vectors of length threeand weight at least two

R= (~ ~ 6~) : Matrix(712)1 01 1

The matrices G and H

IG= I4 1RT; H=R 1'3;Encoding function

I hamming_encoder(u) := u·GDecoding functionhamming_decoder(y) :=

beginlocal r, n, s, j(r,n)=dimensions(G)s=y ·HTif not zero? (s) then

j=index(s, HT)

Yj=Yj-1endtake(y, r)

end

\caretExampleI u=[1, 1,0,1) - [1,1,0,1)I x=hamming3ncoder(u) - [1,1 ,0,1,0,1,0)I hamming_decoder(x) - [1, 1,0, 1)

Let us simulate an error in position 4I e=epsilon_vector(7,4) ; y=x+e - [1,1,0,0,0,1,0)I hamming_decoder(y) - [1, 1, 0, 1)

The notion of an correcting code was introduced by Hamming

[in 1950)

Y.D. Goppa, [8], p. 46.

Error-reduction factor

Assume that p is the probability that a symbol ofT is altered in the transmis­sion. The probability that j errors occur in a block of length n is

Therefore

Pe(n, t ,p) = t (~) pi (l - p)n- j = 1 - t (~)pi (l - p)n- j [1.1]j=t+l J j =O J

24 Chapter 1. Block Error-correcting Codes

gives the probability that t + 1 or more errors occur in a code vector, andthis is the probability that the received vector is either undecodable or thatan undetectable error occurs .

If we assume that N blocks of k symbols are transmitted, the number ofexpected errors is pkN if no coding is used, while the expected number oferrors if coding is used is at most (N . Pe(n, t,p)) . k (in this product wecount as errors all the symbols corresponding to a code error, but of coursesome ofthose symbols may in fact be correct). Hence the quotient

p(n, t ,p) = Pe(n, t ,p)/p,

which we will call the error reduction factor of the code, is an upper boundfor the average number of errors that will occur in the case of using cod­ing per error produced without coding (cf. E.I.8). For p small enough,p(n, t,p) < 1 and the closer to 0, the better error correction resulting fromthe code . The value ofp(n, t ,p) can be computed with the function erf(n,t,p).

1.10 Example (Error-reduction factor of the Hamming code). According tothe formula [1.1], the error-reduction factor of the Hamming code C[7,4, 3] for a bit error probability p has the form

G)P(l -p)5 + ...

which is of the order 21p for p small. Note that this is 7 times the error­reduction factor of the code Rep(3), so that the latter is more effective inreducing errors than C, even if we take into account that the true error re­duction factor ofC is smaller than 21p (because in the error-reduction factorwe count all four bits corresponding to a code error as errors, even thoughsome of them may be correct). On the other hand the rates of C and Rep(3)are 4/7 and 1/3, respectively, a fact that may lead one to prefer C to Rep(3)under some circumstances.

E.1.8. Let p' = p' (n, t, p) be the probability of a symbol error using a codeof length n and correcting capacity t. Let p = Pe(n, t,p) . Prove that p ~

p' ~ p/k.

Elementary parameter bounds

In order to obtain efficient coding and decoding schemes with small errorprobability, it is necessary that k and d are high , inasmuch as the maximumnumber of errors that the code can correct is t = L(d - 1)/ 2Jand that thetransmission rate is, given n, proportional to k.

1.1 . Basic concepts 25

"'I Library I Probabilaty of code error (an upper bound) and error reduction factor I'"n

P_e(n,t,p) := I (~}pi '(1-p)n-jj=t+1

erf(n,t,p) := if p=O theno

elseP_e(n,t,p)/ p

end!p=erf

I n=3; t=1 ; p=0.01;I P_e(n,t, p), p(n,t,p) ~ 0.000298, 0.0298In=7; t=1 ; p=0.01;I P_e(n,t, p), p(n,t,p) ~ 0.002031, 0.2031I n=23; t=3; p=0.01;

IP_e(n,t, p), p(n,t,p) ~ 7.6053 ' 10- 5, 0.0076

It turns out, however, that k and d cannot be increased independentlyand arbitrarily in their ranges (for the extreme cases k = n or d = n, seeE.I .3). In fact, the goal of this section, and of later parts of this chapter, isto establish several non trivial restrictions of those parameters. In practicethese restrictions imply, for a given n, that if we want to improve the ratethen we will get a lower correcting capability, and conversely, if we want toimprove the correcting capability, then the transmission rate will decrease.

Singleton bound and MDS codes

Let us begin with the simplest of the restrictions we will establish.

1.11 Proposition (Singleton bound). For any code oftype [n, k , dj,

k+ d :::; n + l.

Proof Indeed, if C' is any code of type (n, M, d), let us write C' ~ T n - d+1

to denote the subset obtained by discarding the last d - 1 symbols of eachvector of e. Then C' has the same cardinal as C, by definition of d, and soqk = M = lei = le'l :::; qn- d+l. Hence k :::; n - d+ 1, which is equivalentto the stated inequality. 0

MDS codes. Codes that satisfy the equality in the Singleton inequality arecalled maximum distan ce separable codes, or MDS codes for short. Therepetition code Rep(3) and the Hamming code [7,4,3] are MDS codes. Therepetition code of any length n on the alphabet T , which by definition isRepq(n ) = {tn It E T}, is also an MDS code (since it has q elements,its dimension is 1, and it is clear that the distance between any two distinctcode-words is n) .

26 Chapter 1. Block Error-correcting Codes

Values of A2(n, d)n d=3 d=5 d=75 4 2 -6 8 2 -

7 16 2 28 20 4 29 40 6 210 72-79 12 211 144-158 24 412 256 32 413 512 64 814 1024 128 1615 2048 256 3216 2720-3276 256--340 36--37

Table 1.1: Some known values or bounds/or A2 (n ,d)

1.12 Remark. If e is a code with only one word , then ke = 0, but deis undefined . If we want to assign a conventional value to de that satisfiesthe Singleton bound, it has to satisfy de :( n + 1. On the other hand, theSingleton bound tells us that de :( n for all codes e such that lei ;? 2. Thissuggests to put de = n + 1 for one-word codes C, as we will do henceforth.With this all one-word codes are MDS codes .

The function A q (n,d)

Given positive integers nand d, let Rq(n, d) denote the maximum of therates Re = kc / n for all codes e of length n and minimum distance d.

We will also set kq(n, d) and Aq(n, d) to denote the corresponding num­ber of information symbols and the cardinal of the code, respectively, so thatRq(n,d) = kq(n,d)/nandAq(n,d) = qk(n,d).

A code e (of length nand mimimum distance d) is said to be optimal ifRe = Rq(n , d) or, equivalently, if either kc = kq(n, d) or Me = Aq(n , d) .

E.1.9. If q ;? 2 is an integer, show that Aq(3 , 2) = q2. In particular we havethat A 2(3 , 2) = 4 and A3(3, 2) = 9. Hint: ifT = Zq, consider the codee = {(x, y,x + y) E T 3 Ix, YET} .

E.1.10. Show that A2(3k , 2k) = 4 for all integers k ;? 1.

E.1.ll. Show that A2(5 , 3) = 4.

The exact value of Aq(n, d) is unknown in general (this has often beencalled the main problem of coding theory) . There are some cases which are

1.1. Basicconcepts 27

very easy, like Aq (n, 1) = q", A2(4, 3) = 2, or the cases considered in E.I .9and E.1.11. The values A2(6, 3) = 8 and A2(7, 3) = 16 are also fairily easyto determine (see E.l .19), but most of them require, even for small n, muchwork and insight. The table 1.1 gives a few values of A2 (n,d), when theyare known, and the best known bounds (lower and upper) otherwise . Someof the facts included in the table will be established in this text; for a morecomprehensive table, the reader is referred to the table 9.1 on page 248 ofthe book [6]. It is also interesting to visit the web page

http://www.csl.sony.co.jp/person/morelos/ecc/codes.html

in which there are links to pages that support a dialog for finding the bestbounds known for a given pair (n, d), with indications of how they are as­certained.

E.1.12. In the Table 1.1 only values of A2(n, d) with d odd are included.Show that if d is odd, then A2(n, d) = A2(n + 1, d + 1). Thus the tablealso gives values for even minimum distance. Hint: given a binary code Cof length n, consider the code C obtained by adding to each vector of C thebinary sum of its bits (this is called the parity completion of C) .

The Hamming upper bound

Before stating and proving the main result of this section we need an auxil­iary result.

1.13 Lemma. Let r be a non-negative integer and x E T" , Then

IB(x, r)1 = t (~) (q - 1)i.i=O ~

Proof: The number ofelements ofT" that are at a distance i from an elementx E T" is

and so

as claimed.

IB(x , r)1 = t (~) (q - 1)i,i=O ~

oThe lemma shows that the cardinal of B(x, r) only depends on nand r,

and not on x, and we shall write volq(n , r) to denote it. By the preceedinglemma we have

vOlq(n, r) = IB(x, r)1 = to (:) (q - 1)i . [1.2]

28

"'1 Library I Computation of vola(n,r)

r

volurnetn.r.q) . « I (7) (q-1)ij=O

r

volume(n,r) := I (7)j=O

" \caretvolume(9,3,3) -+ 835~ volume(9,3) -+ 130

Chapter 1. Block Error-correcting Codes

I'"

1.14 Theorem (Hamming upper bound). If i = l(d-l)/2J, then the/al­lowing inequality holds:

Proof Let C be a code of type (n, M , d)q. Taking into account that theballs of radius t = l(d - 1)/2J and center elements of C are pair-wise dis­joint (this follows from the definition t and the triangular inequality of theHamming distance), it turns out that L IB(x, t)1 :::;; ITnl = qn.

xECOn the other hand we know that

IB(x, t)1 = vOlq(n, t)

and henceqn :;:: L IB(x, t)1 = M . vOlq(n, t).

xEC

Now if we take C optimal, we get

which is equivalent to the inequality in the statement. o1.15 Remark. The Hamming upper bound is also called sphere-packing up­per bound, or simply sphere upper bound.

E.1.13 . Let m and s be integers such that 1 :::;; s :::;; m and let Cl, . . . ,em E

F", where IF is a finite field with q elements. Show that the number ofvectorsthat are linear combinations of at most s vectors from among cj , . . . ,em isbounded above by vOlq(m, s).

[Hamming] established the upper bound for codes

Y.D. Gappa, [8], p. 46.

1.1. Basic concepts 29

"'I Library I Hamminq upper bound (also called sphere upper bound) ''''

Iub_sphere(n,d,q) :=l volume(n, L~:-1)/2J, q) J

Iub_sphere(n,d) :=lVOIUme(n,t~d-1)/2J) J

I \caretub_sphere(8,3) -+ 28I ub_sphere(9,3) -+ 51I ub_sphere(10,3) -+ 93I ub_sphere(11,3) -+ 170I ub_sphere(11,3,3) -+ 7702I ub_sphere(21,5) -+ 9039

Perfect codes

In general Dc is a proper subset ofT", which means that there are elementsy E T" for which there is no x E e with hd(y, x) ~ t . If Dc = T", thene is said to be perfect. In this case , for every y E T " there is a (necessarilyunique) x E e such that hd (y , x) ~ t .

Taking into account the reasoning involved in proving the sphere-bound,we see that the necessary and sufficient condition for a code e to be perfectis that to (~) (q - l )i = qn/M (= qn- k),

where M = le i = qk (this will be called the sphere or perfect condition).The total code T" and the binary repetion code of odd length are exam­

ples of perfect codes, with parameters (n ,q" ;1) and (2m + 1,2, 2m + 1),respectively. Such codes are said to be trivial perfect codes. We have alsoseen that the Hamming code [7,4,3] is perfect (actually this has been checkedin Example 1.9).

E.1.14. In next section we will see that if q is a prime-power and r a positiveinteger, then there are codes with parameters

[(qr _ l)/(q - 1), (qr - l)/(q - 1) - r, 3]).

Check that these parameters satisfy the condition for a perfect code. Notethat for q = 2 and r = 3 we have the parameters [7,4,3].

E.1.15. Show that the parameters [23, 12, 7], [90,78,5] and [11, 6,513 satisfythe perfect condition.

E.1.16. Can a perfect code have even minimum distance?

E.1.17. Ifthere is a perfect code oflength n and minimum distance d, whatis the value of Aq(n,d)? What is the value of A2 (7, 3)?

30 Chapter 1. Block Error-correcting Codes

E.1.18. Consider the binary code consisting of the following 16 words:

0000000011000100010110100111

1111111101100001110101010011

1000101010110000111011101001

1100010001011010011101110100

Is it perfect? Hint : show that it is equivalent to the Hamming [7,4,3] code.

1.16 Remarks. The parameters of any non-trivial q-ary perfect code, with q

a prime power, must be those ofa Hamming code, or [23,12,7], or [11, 6, 513(van Lint and Tietavainen (1975); idependently established by Zinovi'ev andLeont'ev (1973» . There are non-linear codes with the same parameters asthe Hamming codes (Vasili'ev (1962) for binary codes; Schonheim (1968)and Lindstrom (1969) in general). Codes with the parameters (23,212,7) and(11,36 ,5h exist (binary and ternary Golay codes; see Chapter 3), and theyare unique up to equivalence (Pless (1968), Delsarte and Goethals (1975);see [25], Theorem 4.3.2, for a rather accessible proof in the binary case, and[II], Chapter 20, for a much more involved proof in the ternary case).

One of the steps along the way of characterizing the parameters of q-ary

perfect codes, q a prime power, was to show (van Lint, H.W. Lenstra, A.M.Odlyzko were the main contributors) that the only non-trivial parametersthat satisfy the perfect condition are those of the Hamming codes, the Golaycodes (see E.1.l5, and also P.l.7 for the binary case) , and (90 , 278 , 5h. Thelatter, however, cannot exist (see P.l.8).

Finally let us note that it is conjectured that there are no non-trivial q­

ary perfect codes for q not a prime power. There are some partial resultsthat lend support to this conjecture (mainly due to Pless (1982», but theremaining cases are judged to be very difficult.

The Gilbert inequality

We can also easily obtain a lower bound for Aq(n,d). If C is an optimalcode, any element of T" lies at a distance s; d - 1 of an element of C , forotherwise there would be a word y E T" lying at a distance ;? d from allelements of C and C U {y} would be a code of length n, minimum distanced and with a greater cardinal than ICI. contradicting the optimality ofC. Thismeans that the union of the balls ofradius d - 1 and with center the elementsof C is the whole T " : From this it follows that A q(n, d) .volq (n, d -1) ;? q",Thus we have proved the following:

1.17 Theorem (Gilbert lower bound). Thefunction A q(n, d) satisfies thefol-

1.1. Basic concepts

lowing inequality:

31

qnAq (n, d) ~ (d )'vOlq n , - 1

What is remarkable about the Gilbert lower bound, with the improvementwe will find in the next chapter by means of linear codes , is that it is theonly known general lower bound. This is in sharp con trast with the varietyof upper bounds that have been discovered and of which the Singleton andsphere upper bounds are just two cases that we have already established.

AI Library I Gilbert lower bound

Ilb_9ilbert(n,d,Q) :=f I td-1 ) 1voume n, ,q

Ilb 9i1bert(n,d) := f I 2(n d 1)1- vo ume n, -

I \caretlb_9i1bert(8 ,3) .... 7Ilb_9ilbert(9,3) 12Ilb_9ilbert(10,3) 19I Ib_9i1bert(11 ,3) 31I Ib_9ilbert(11 ,3.3) 729Ilb_9ilbert(21 .5) .... 278

1.18 Remark. The Hamming and Gilbert bounds are not very close . Forexample, we have seen that 7 ~ A 2 (8,3) ~ 28, 12 ~ A2(9, 3) ~ 51,19 ~ A2(10, 3) ~ 93 and 31 ~ A2 (1l , 3) ~ 170. But in fact A2(8 , 3) =20 (we will get this later, but note that we already have A2(8, 3) ~ 20 byE.1.4), A2(9,3) = 40 and the best known intervals for the other two are72 ~ A 2 (1O, 3) ~ 79 and 144 ~ A2(1l , 3) ~ 158 (see Table 1.1).

E.1.19. The sphere upper bound for codes of type (6, M , 3) turns out to beM ~ 9 (check this), but according to the table 1.1 we have A2 (6,3) = 8,so that there is no code of type (6,9 , 3). Prove this. Hint : assuming such acode exists, show that it contains three words that have the same symbols inthe last two positions.

Summary

• Key general ideas about information generation, coding and transmis­sion.

• The definition of block code and some basi c related notions (code­words, dimension, transmission rate , minimum distance and equiva­lence criteria between codes).

32 Chapter 1. Block Error-correcting Codes

• Abstract decoders and error-correcting capacity.

• The minimum distance decoder and its correcting capacityto = l(d - 1)/2J .

• The Hamming code [7,4,3] (our computational approach includes theconstruction of the code and the coding and decoding functions) .

• Error-reduction factor of a code.

• Singleton bound: k ~ n + 1 - d.

• The function A q (n, d), whose determination is sometimes called the'main problem ' ofblock error-correcting codes.Table 1.1 provides someinformation on A 2 (n ,d).

• Hamming upper bound:

Aq(n ,d) ~ q" /volq(n , t), where t = L(d - 1)/2J.• Perfect codes: a code (n , M ,d)q is perfect if and only if

• Gilbert lower bound: Aq(n ,d) ~ qn/volq(n , d - 1).

Problems

P.l.I (Error-detection and correction). We have seen that a code C of type(n, M ,d) can be used to detect up to d - 1 errors or to correct up to t =L(d - 1)/2J errors. Show that C can be used to simultaneously detect up tos ~ t errors and correct up to t errors if t + s < d.

P.l.2 Prove that A 2(8 ,5) = 4 and that all codes of type (8,4, 5h are equiv­alent. Hint: by replacing a binary optimal code of length 8 and minimumdistance 5 by an equivalent one, we may assume that 00000000 is a codeword and then there can be at most one word of weight ~ 6.

P.l.3. Show that for binary codes of odd minimum distance the Hammingupper bound is not worse than the Singleton upper bound. Is the same truefor even minimum distance? And for q-arycodes with q > 2?

P.l.4 Prove that A 2 (n,d) ~ 2A2 (n - 1, d). Hint: if C is an optimal binarycode of length n and minimum distance d, we may assume, changing C intoan equivalent code if necessary, that both 0 and 1 appear in the last positionof elements of C, and then it is useful to consider, for i = 0, 1, the codesC, = {x E C IX n = i}.

1.1. Basic concepts 33

P.l.S (Plotkin construction, 1960). Let C1 and C2 be binary codes of types(n , M 1 , dI) and (n , M2, d2), respectively. Let C be the length 2n code whosewords have the form xl( x + y) , for all x E C1 and y E C2 . Prove thate rv (2n, M 1M2, d), where d = min(2dI, d2).

P.l.6 Show that A 2(16, 3) ;::: 2560 . Hint : use Example E.1A and the Plotkinconstruction.

P.l.7 Ife rv (n , M , 7) is a perfect binary code , prove that n = 7 or n = 23.Hint : use the sphere upper bound.

P.l.8. Show that there is no code with parameters (90 ,278 ,5). Hint: Ex­tracted from [9], proof of Theorem 9.7: if C'were a code with those param­eters, let X be the set ofvectors in e that have weight 5 and begin with two1s, Y the set ofvectors in Z~o that have weight 3 and begin with two 1s, andD = {(x ,y) E X x Y IS(y) c S(x)}, and count IDI in two ways (S(x) isthe support of x, that is, the set of indices i in 1..90 such that X i = 1).

34 Chapter 1. Block Error-correcting Codes

1.2 Linear codes

In an attempt to find codes which are simultaneously good in

the sense of the coding theorem, and reasonably easy to im­

plement, it is natural to impose some kind of structure on the

codes.

R. 1. McEliece, [12], p. 133.

Essential points

• Basic notions: linear code, minimum weight, generating matrix, dualcode, parity-check matrix .

• Reed-Solomon codes and their duals.

• Syndrome-leader decoding algorithm.

• Gilbert-Varshamov condition on q, n , k ,d for the existence of a linearcode [n ,k, d]q.

• Hamming codes and their duals.

• Weight enumerators and the MacWilliams theorem (rv identities) .

Introduction

The construction of codes and the processes of coding and decoding arecomputationally intensive. For the implemetation of such opperations anyadditional structure present in the codes can be of much help.

In order to define useful structures on the codes, a first step is to enrichthe alphabet T with more structure than being merely a set. If we want, forexample , to be able to add symbols with the usual properties, we could usethe group Zq as T . Since this is also a ring, the alphabet symbols couldalso be multiplied.? If we also want to be able to divide symbols, then weare faced with an important restriction, because division in Zq by nonzeroelements is possible without exception ifand only if q is prime.

We will have more room in that direction if we allow ourselves to usenot only the fields Zp, p prime, but also all other finite fields (also calledGalo is jields). As we will see in the next chapter in detail, there is a field ofq elements if and only if q is a power of a prime number, and in that case the

2For an interesting theory for the case T = Z4, see Chapter 8 of [25].

1.2. Linear codes 35

field is unique up to isomorphism. Henceforth this field will be denoted IF'q(another popular notation is GF(q) .

Prime powers less than 50I N=50;

I{n in 2..N where is_prime_power?(n)}-+ {2,3 ,4,5, 7,8,9,11 ,13,16, 17, 19, 23,25,27,29,31,32,37.41,43,47,49}

So let us assume that the alphabet T is a finite field IF'q - Then we canconsider the codes C ~ IF'~ that are an IF'q-vector subspace of IF'~ . Suchcodes are called linear codes and we devote this section to the study someof the basic notions related to them. Note that if C is a linear code andits dimension as an IF'q-vector space is k, then ICI = qk and therefore kcoincides with the dimension of C as a code: dim IFq (C) = dim (C).

Linear codes are important for diverse reasons. The more visible arethat they grant powerful computational facilities for coding and decoding.If we revisit the Hamming code [7,4,3] (example 1.9), we will realize thatlinear algebra over Z2 plays a major role there, and similar advantages willbe demonstrated for several families of codes from now on. A more subtlemotive, which is however beyond the scope of this introductory text, is thefact that Shannon's channel coding theorem can be achieved by means oflinear codes .

1.19 Example (The International Standard Book Number). The ISBN codeof a book is 10-digit codeword assigned by the publisher, like the 3-540­00395-9 of this book. For the last digit the symbol X (whose value is 10) isalso allowed. For example, the ISBN of [6] is 0-387-96617-X. The hyphensdivide the digits into groups that have some additional meaning, like the first0, which indicates the language (English) , and the groups 387-96617, whichidentify the publisher and the book number assigned by the publisher. Thelast digit is a check digit and is computed by means ofthe following weightedchecksum:

10

L i . X i == 0 (mod 11),i=l

or equivalently9

X lO = L i . Xi (mod 11) ,i=l

with the convention that the remainder 10 is replaced by X. For some in­teresing properties ofthis coding scheme, which depend crucially on the factthat Zll is a field, see P.1.9. See also the listing ISBN check digit for thecomputation of the ISBN check digit and the examples of the two booksquoted in this Example.

36

41 Library I ISBN check digit

isbn(x:Vector)check length(x) =9:= begin

9

local r= I i-x, mod 11

i = 1if r<10 then

relse

Xend

end

rl isbn ( [ 3, 5, 4, 0, 0, 0, 3, 9, 5]) -+ 9~ isbn ([ 0, 3, 8, 7, 9, 6, 6,1,7]) -+ X

Notations and conventions

Chapter 1. Block Error-correcting Codes

To ease the exposition, in this section code will mean linear code unless ex­plicitely stated otherwise. Often we will also write IF instead of IFq. We willsimply write M~(IF) to denote the matrices of type k x n with coefficientsin IF. Instead of Mf(IF) we will write Mk(IF).

The usual scalar product of x , y E IFn will be denoted (xly) :

(xly) = xyT = XlYl + .. .+ XnYn'

Weights

We will let II : IFn ~ N denote the map x I--> d(x,O). By definition, Ixl isequal to the number of non-zero symbols of x E IFn. We will say that [z],which sometimes is also denoted wi (x), is the weight of x .

E.1.20. The weight is a norm for the space IFn : for all x ,Y, Z E F", 101 = 0,Ixl > 0 if x =I- 0, and Ix + yl ~ [z] + Iyl (triangle inequality). 0

If G is a linear code, then the minimum weight of G, We, is defined asthe minimum of the weights [z] of the non-zero x E G.

1.20 Proposition. The minimum weight of G coincides with the minimumdistance ofG: de = we·

Proof Let x , y E G. IfG is linear, then x - y E G and we have hd(x , y) =Ix - yl . Since x - y =I- 0 if and only if x =I- y, we see that any distancebetween distinct elements is the weight of a nonzero vector. Conversely,

1.2. Linear codes 37

since Ixl = hd(x ,O), and 0 E e because e is linear, the weight of anynonzero vector is the distance between two distinct elements of e. Now theclaimed equality follows from the definitions of de and We. 0

1.21 Remark. For a general code e of cardinal M, the determination ofdeinvolves the computation of the M(M - 1)/2 Hamming distances betweenits pairs of distinct elements. The proposition tells us that if e is linear thenthe determination of de involves only the computation of M - 1 weights.This is just an example of the advantage ofhaving the additional linear struc­ture .

It is important to take into account that the norm II and the scalar product( I) are not related as in Euclidean geometry. In the geometrical context wehave the formula Ixl2 = (xix) (by definition of Ix!), while in the presentcontext Ixl and (xix) have been defined independently. In fact they are quiteunrelated, if only because the values of the norm are non-negative integersand the values of the scalar product are elements oflF.

E.1.21. Let e be a linear code of type In,k] over IF = IFq. Fix any integerj such that 1 ~ j ~ n . Prove that either all vectors of e have 0 at thej-th position or else that every element oflF appears there in precisely qk-l

vectors of e.

E.1.22. Given a binary linear code C, show that either all its words haveeven weight or else there are the same number of words with even and oddweight. In particular, in the latter case lei is even.

Generating matrices

Given a code e of type In, k], we will say that a matrix G E M~(lF) is agenerating matrix ofe if the rows of G form a linear basis of e.

Conversely, given a matrix G E M~ (IF) the subspace (G) s.:;; lFn gener­ated by the rows of G is a code of type In,k], where k is the rank of G. Wewill say that (G) is the code generated by G.

1.22 Remark. For the knowledge of a general code we need to have anexplicit list of its M vectors. In the case of linear codes of dimension k,M = qk, but the M code vectors can be generated out of the k rows of agenerating matrix . This is another example of the advantage provided by theadditional linear structure.

1.23 Examples. a) The repetition code of length n is generated by I n.b) A generating matrix for the Hamming code [7,4,3] is G = 141RT

(notations as in the example 1.9, p. 21).

38 Chapter 1. Block Error-correcting Codes

c) If C s: lFn is a code of dimension k, let C s: lFn +1 be the imageof C by the linear map lFn

-t lFn x IF = lFn +1 such that x f-t z ] - s( x),where s(x ) = Li Xi. Then C is a code of type In + 1, k] which is called theparity extension ofC (the symbol - s( x) is called the parity check symbol ofthe vector x). If G is a generating matrix of C, then the matrix G obtainedby appending to G the column consisting of the parity check symbols of itsrows is a generating matrix of C . The matrix G will be called the paritycompletion (or parity extension) of G. The listing Parity completion of amatrix shows a way of computing it.

AI Library I Paritv completion of a matrix IA

The expression G I[) below yields the matrix G augmented with the columnvector [ )Tat the end. An expression like [ )1 G works similarly, but this timeG is augmented with the column vector [)T at the beginning.

Iparity_completion (G :Matrix) := G I [-sum(g) with g in G)[leftparltycornpletton (G :Matrix) := [-sum(g) with g in G) IG

(

1 1 0 0 0 0 1) (1100001)0110010 . 0110010G= 1 0 1 0 1 0 0 : Matnx(iZ2) ~ 1 0 1 0 1 0 0

1111000 1111000

(

1 1 0 0 0 0 1 1). . 01100101

panty_completlon(G) ~ 1 01 01 001

111 1 0 0 0 0

left_parity_completion(diagonal_matrix([1,2,3,4)):Matrix(iZs)) ~(

4 1 0 0 0 )302002003010004

E.1.23. The elements x E lFn +1 such that s(x ) = 0 form a code C of typeIn + 1, n] (it is called the zero-parity code oflength n + 1, or of dimensionn) . Check that the matrix (InI1;') is a generating matrix ofC.

Coding

It is clear that if G is a generating matrix of C, then the map

induces an isomorphism of lFk onto C and hence we can use f as a codingmap forC.

If A E Mk(lF) is an invertible matrix (in other words, A E GLk(lF», thenAG is also a matrix of C . From this it follows that for each code C there

1.2. Linear codes 39

exists an equivalent code which is generated by a matrix that has the formG = (hiP), where h is the identity matrix of order k and P E M~_k(F) .

Since in this case f(u) = uG = (uluP), for all u E Fk, we see that thecoding of u amounts to appending the vector uP E Fn - k to the vector u(we may think ofuP as a 'redundancy' vector appended to the 'informationvector' u) .

The codes C (this time not necessarily linear) of dimension k for whichthere are k positions in which appear, when we let x run in C, all sequencesof k symbols , are said to be systematic (with respect to those k positions) .According to the preceding paragraph, each linear code of dimension k isequivalent to a systematic code with respect to the first k positions.

1.24 Example (Reed-Solomon codes). Let n be an integer such that 1 ~

n ~ q and Q = a1 , . .. , an E F a sequence of n distinct elements of F.For every integer k > 0, let F[X]k be the F-vector space whose elementsare polynomials of degree < k with coefficients in F. We have F[X]k(1,X , . . . ,X k- 1h.. and dim(F[X]k) = k . If k ~ n, the map

is injective, since the existence of a non-zero polynomial of degree < kvanishing on all the ai implies n < k (a non-zero polynomial of degreer with coefficients in a field can have at most r roots). The image of E istherefore a linear code C of type tn, k].

We claim that the minimum distance ofC is n - k +1. Indeed, a non-zeropolynomial f of degree < k can vanish on at most k - 1 of the elements a iand hence the weight of (J(at) , . .. , f(an ) ) is not less than n - (k - 1) =n - k + 1. On the other hand, this weight can be exactly n - k + 1 for somechoices of f , like f = (X - (1) . . . (X - ak-1), and this proves the claim.

We will say that C is a Reed-Solomon (RS) code oflength n and dimen­sion k, and will be denoted RSa(k). When the ai can be understood by thecontext, we will simply write RS(k), or RS(n, k) if we want to display thelength n. It is clear that the RS codes satisfy the equality in the Singletonbound, and so they are examples of MDS codes. On the other hand we haven ~ q, and so we will have to take high values of q to obtain interestingcodes.

Since 1, X , . . . , X k- 1 is a basis ofF[X]k, the Vandermonde matrix

is a generating matrix for RSa(k).

40

The dual code

Chapter 1. Block Error-correcting Codes

The linear subspace ofF" ortogonal to a subset Z ~ r will be denoted Z1..Let us recall that the vectors of this subspace are the vectors x E JFn suchthat (xlz) = °for all z E Z .

If C is a code, the code C1. is called the dual code of C .Since the scalar product ( I) is non-degenerate, by linear algebra we

know that C1. has dimension n - k if C has dimension k. In other words,C1. is of type In,n - k] if Cis of type In,k].

As C ~ C1.1., tautologically, and both sides of this inclusion have di­mension k , we infer that C1.1. = C.

Usually it happens that C and C1. have a non-zero intersection. Evenmore, it can happen that C ~ C1., or that C = C1.. In the latter case wesay that C is selfdual. Note that in order to be self-dual it is necessary thatn= 2k.

E.1.24. If C is the length n repetition code over C, check that C1. is thezero-parity code of length n.

E.1.2S. If G is a generating matrix of C, prove that C ~ C1. is equivalentto the relation GGT = 0. If in addition we have n = 2k, prove that thisrelation is equivalent to C = C1.. As an application, check that the parityextension of the Hamming code [7,4 ,3] is a self-dual code.

E.1.26.LetCl,C2 ~ JF~belinearcodes.Show that (C1+C2)1. = CfnCfand (C1 n C2)1. = cf + Cf.

Parity-check matrices

If H is a generating matrix of C1. (in which case H is a matrix of type(n - k) x n) we have that

x E C if and only if x H T = 0,

because if h is a row of H we have xhT = (xlh). Said in other words,the elements x E C are exactly those that satisfy the n - k linear equations(xlh) = 0, where h runs through the rows of H .

Given h E C1., the linear equation (x lh) = 0, which is satisfied for allx E C, is called the check equation of C corresponding to h . By the previousparagraph, C is determined by the n - k check equations corresponding tothe rows of H, and this is why the matrix H is said to be a check matrix ofC (or also, as most authors say, a parity-check matrix of C , in a terminologythat was certainly appropriate for binary codes, but which is arguably notthat good for general codes).

1.2. Linearcodes

The relation

41

can also be interpreted by saying that C is the set oflinear relations satisfiedby the rows ofH T , that is, by the columns ofH. In particular we have:

1.25 Proposition. Ifany T - 1 columns ofH are linearly independent, thenthe minimum distance ofC is at least T, and conversely.

1.26 Example (The dual of an RS code). Let C = RSQI,...,Qn(k),where a =aI , .. . , an are distinct nonzero elements of a finite field K. Then we knowthat G = Vk(a1 , . . . , an) is a generating matrix of C (Example 1.24). Notethat the rows of G have the form (at , ... , a~ ) , with i = 0, . . . , k - l.

Now we are going to describe a check matrix H ofC , that is, a generatingmatrix of C..L. Recall that ifwe define D(aI, . . . , an) as the determinant ofthe Vandermonde matrix Vn(aI , .. . ,an) (Vandermonde determinant), then

D(a1, . .. ,an) = II(aj - a i) .i<j

Define the vector h = (hI, . . . , hn) by the formula

hi = (_1)i-1 D(a1 , .. . , a i-I , ai+1, . .. , a n)/D(a1, . .. , an)

= l /(II (aj - ai))Hi

(remark that in the last product there are precisely i-I factors with theindices reversed, namely the aj - a i with j < i) . Then the matrix

is a check matrix of C .To see this, it is enough to prove that any row of G is orthogonal to any

row of H because H has clearly rank n - k . Since the rows of H have theform (h 1a-{ ,... ,hn~), with j = 0, . .. , n - k - 1, we have to prove thatn . .2: a;+Jhl = OifO ~ i ~ k-1 and 0 ~ j ~ n-k-l. Thus it will be enough1=1

nto prove that 2: afhi = 0 (0 ~ s ~ n - 2). Multiplying throughout by the

1=1nonzero determinant D(aI, . . . , an ), and taking into account the definitionof hi, we wish to prove that

n2: af( _1)1-1 D(a1 , . .. , al-1, aIH, . . . , a n) = O.l=l

42 Chapter I. Block Error-correcting Codes

But finally this is obvious, because the left hand side coincides with thedeterminant

as a sI n

1 1al an

n-2 n-2a l an

(developed along the first row), and this determinant has a repeated row.As we will see in Chapter 4, the form of the matrix If indicates that

RSo:(k) is an altemant code, and as a consequence it will be decodable withany of the fast decoders for altemant codes studied in that chapter (sections4.3 and 4.4) .

E.1.27. Check that the dual of an RS(n , k) code is scalarly equivalent to anRS(n, n - k). In particular we see that the dual of a RS code is an MDScode . 0

Let us consider now the question ofhow can we obtain a check matrix froma generating matrix G. Passing to an equivalent code if necessary, we mayassume that G = (hiP) and in this case we have:

1.27 Proposition. IfG = (hiP) is a generating matrix of C, then If =(-pTlln_k) is a generating matrix of c- (note that _pT = pT in thebinary case).

Proof Indeed, If is a matrix of type (n - k) x n and its rows are clearlylinearly independent. From the expressions of G and If it is easy to checkthat GIfT = 0 and hence (If) S;; CT. Since both terms in this inclusionhave the same dimension, they must coincide. 0

1.28 Example. In the binary case, the matrix If = (lkI1) is a check matrixof the parity-check code C rv [k + 1, kj, which is defined as the parity ex­tension of the total code of dimension k. The only check equation for C isL:7,:;l Xi = 0, which is equivalent to say that the weight Ixi is even for all x.

For the repetition code of length n relative to an arbitrary finite field IF,the vector In = (llln-d is a generating matrix and If = (1~_IIIn-l) is acheck matrix. Note that the check equations corresponding to the rows of Ifare X i = X l for i = 2, .. . , n .

1.29 Example. If H is a check matrix for a code C of type tn, kj , then

If = (In i)H 0n-k

is a check matrix of the parity extension C of C.

1.2. Linear codes 43

1.30 Example. Let IF be a field of q elements and let IF* = {al ,"" an},n = q - 1. Let C = RS0 1 . . .. .an (k). Then we know that the Vandermondematrix G = Vk(al , .' " an) is a generating matrix of C. Now we will seethat the Vandermonde matrix

is a check matrix of C .Indeed, as seen in Example 1.26, there is a check matrix of C that has

the form Vn-k(al, . . . , an)diag (hI , ... , hn) , with hi = 1/TIjii(aj - ai),and in the present case we have hi = ai (as argued in a moment), so thatVl.n-k(al, . .. , an) = Vn-k(al, ... , an)diag (al, . ··, an) is a check ma­trix ofC.

To see that hi = ai, first note that TIjii (aj - ai) is the product of allthe elements oflF*, except -ai. Hence hi = -adP, where P is the productof all the elements oflF*. And the claim follows because P = -1, as eachfactor in P cancels with its inverse, except I and -1, that coincide with theirown inverses.

Remark that the dual of RSal ,....an(k) is not an RSal ,...,On (n - k), be­cause H = Vl .n-k(al, . . . , an ) is not Vn-dal , . .. , an). But on dividingthe j-th column of H by a j we see that the dual of RSal ,...,On(k) is scalarlyequivalent to RSal ,...,an (n - k) .

E.1.28. Let G be the generating matrix of a code C rv In,k) and assume ithas the form G = AlB, where A is a k xk non-singular matrix (det(A) i= 0).Show that if we put P = A-I B , then H = (-pT)IIn_k is a check matrixofC.

E.1.29. In Proposition 1.27 the matrix h occupies the first k columns, andP the last n - k. This restriction can be easily overcome as follows. Assumethat h is the submatrix of a matrix G E M~ (IF) formed with the columnsis < . . . < ii, and let P E M~_k(IF) be the matrix left from G after remov­ing these columns. Show that we can form a check matrix H E M;:-k(IF) byplacing the columns of - p T successively in the columns j1, . .. , i» and thecolumns of In-k successively in the remaining columns . [For an illustrationof this recipe, see the Example 1.36]

E.1.30. We have seen (E.1.27 ; see also Example 1.30) that the dual of anRS code is scalarly equivalent to a RS code. Since RS codes are MDS,this leads us to ask whether the dual of a linear MDS code is an MDS code.Prove that the answer is affirmative and then use it to prove that a linear codeIn,k) is MDS ifand only if any k columns of a generating matrix are linearlyindependent.

44 Chapter 1, Block Error-correcting Codes

1.31 Remark. The last exercise includes a characterization of linear MDScodes. Several others are known, An excellent reference is [20] , § 5.3. Notethat Theorem 5.3.13 in [20] gives an explicit formula for the weight enumer­ator of a linear In,k]q, hence in particular for RS codes:

( )i - d ( ')n ' t - 1 . ,

a i = i (q - 1)~(-1)1 j q~-d-J.

AI Library I Weight enumerator of an MDS linear code

weight_enumerator_mds(n.k.q)check 2~min(k,n-k) & max(k.n-k)~q-1

beginlocal d=n-k+1

a

a=[(;)- (q-1)~ (-1 Ii.('j')-q;-d-i wlth l ln d..n][1] Iconstant_vector(d-1 .0) I a

end

~Weight enumerator of the RS code of dimension 4on the non-zero elements of Fg

I weighCenumeratocmds(8,4,9) - [1,0,0,0,0,448,896.2688.2528]

Syndrome decoding

Let C be a code of type In,k] and H a check matrix of C. Given y E JFn,

the elementyHT E JFn-k

is called the syndrome of y (with respect to H). From what we saw in the lastsubsection, the elements of C are precisely those that have null syndrome:C = {x E PI xHT = O}.

More generally, given s E JFn - k , let C, = {z E JFn IzHT = s} (henceCo = C). Since the map CT : JFn -+ JFn-k such that y f-7 yHT is linear,surjective (because the rank of His n - k) and its kernel is C, it follows thatC, is non-empty for any s E p-k and that C s = Zs + C for any Zs E C s .

In other words, Cs is a class modulo C (we will say that it is the class of thesyndrome s).

The notion of syndrome is useful in general for the purpose of minimumdistance decoding of C (as it was in particular for the Hamming [7,4,3] codein example 1.9). The key observations we need are the following. Let 9 be

1.2. Linear codes 45

the minimum distance decoder. Let x E C be the transmitted vector and ythe received vector. We know that y is g-decodable if and only if there existsx' E C such that hd(y , X l ) :::; t, and in this case z ' is unique and g(y) = x',With these notations we have:

1.32 Lemma. The vector y is g-decodable ifand only if there exists e E JFnsuch that yHT = eH T and lei :::; t, and in this case e is unique and g(y) =

y - e.

Proof If y is g-decodable, let X l = g(y) and e = y - x' , Then eH T =

yHT - x'H T = yHT , because the syndrome of x' E Cis 0, and lei =d(y, X l ) :::; t.

Conversely, let e satisfy the conditions in the statement and consider thevector x' = Y - e. Then x' E C, because Xl H T = yHT - eH T = 0 (by thefirst hypothesis on e) and d(y, Xl) = lei:::; t. Then we know that x' is unique(hence e is also unique) and that g(y) = Xl (so g(y) = y - e). 0

E.1.3!. If e , el E Cs and lei, [e'] :::; t, prove that e = e' , In other words, theclass Cs can contain at most one element e with lei :::; t.

Leaders' table

The preceeding considerations suggest the following decoding scheme. Firstprecompute a table E = {s ~ e S} s ElFn- k with es E C; and in such a waythat es has minimum weight among the vectors in C; (the vector es is saidto be a leader of the class Cs and the table E will be called a leaders' tablefor C) . Note that es is unique if Ies I :::; t (exercise E.l .31); otherwise we willhave to select one among the vectors of C, that have minimum weight. Nowthe syndrome decoder can be described as follows:

Syndrome-leader decoding algorithm

1) find the syndrome of the received vector y, say s = yHT ;

2) look at the leaders' table E to find es , the leader of the class corre­sponding to s;

3) return y - es .

1.33 Example. Before examining the formal properties of this decoder, itis instructive to see how it works for the code C = Rep(3) . The matrix

H = G~ ~) is a check matrix for the binary repetition code C =

{OOO, 111}. For each s E Z~, it is clear that e~ = Ols has syndrome s.Thus we have Coo = C = {OOO,I1l}, ClO = e~o + C = {OlO,I01},

46 Chapter I. Block Error-correcting Codes

COl = e~ l + C = {001 , 1l0}, Cll = e~ l + C = {OIl , 100}. The leadersof these classes are eOO= e~o = 000, e io = e~o = 010 , eoi = e~ l = 001,e ll = 100. Hence the vectors y in the class C, are decoded as y + e s

(we are in binary). For example, 011 is decoded as 011 + 100 = 111 and100 as 100 + 100 = 000, like the majority-vote decoding explained in theIntroduction (page 3). In fact, it can be quickly checked that this coindidenceis valid for all y E Z~ . 0

The facts discovered in the preceeding example are a special case of thefollowing general result:

1.34 Proposition. If D = U XEC B (x, t) is the set ojdecodable vectors bythe minimum distance decoder, then the syndrome-leader decoder coincideswith the minimum distance decoderJor all y E D .

Proof Let x E C and y E B(x, t). If e= y - x, then lei ~ t by hypothesis.Moreover, if s = yHT is the syndrome of y, then the syndrome of e is alsos,because

eHT = yHT - x H T = s.

Since es has minimum weight among the vectors ofsyndrome s, lesl ~ lei ~t . So e = es (E.I.3I) and g(y) = y - es = y - e = x, as claimed. 0

1.35 Remarks. With the syndrome decoder there are no docoder errors, forthe table E = {s -+ es } is extended to all s E lFn - k , but we get an unde­tectable error if and only if e = y - x is not on the table E.

For the selection of the table values es the following two facts may behelpful:

a) Let e~ be any element with syndrome s. If the class e~ + C containsan element e such that lei ~ t , then e is the only element in Cs withthis property (E. 1.3I) and so es = e.

b) If the submatrix of H formed with its last n - k columns is the identitymatrix In - k (note that this will happen if H has been obtained froma generating matrix G of C of the form (hiP)), then the elemente~ = (Okl s) has syndrome s and hence the search of es can be carriedout in C; = e~ +C . Moreover, it is clear that for all ssuch that lsi ~ twe have es = e~.

1.36 Example. See the listing A binary example of syndrome decodingfor how the computations have been arranged. Let C = (G), where

(

1 1G= 0 0

1 0

1.2. Linear codes 47

In the listing, 0 is returned as the list of its eight vectors.Then we compute the table E = {s ~ OSLEZ~ of classes modulo 0

and get

x = { [0, OJ -> { [0,0,0,0,0], [1,0 ,0,1 ,1], [0,0 ,1 ,0,1], [1,0,1 ,1 ,0] ,[1,1 ,0,0,1], [0, 1,0,1 , OJ , [1, 1, 1,0, OJ , [0, 1, 1, 1,1] },

[0, IJ -> { [0,0 ,0,0,1], [1,0,0,1 ,0], [0,0,1,0,0] , [1,0, 1,1 ,1 ],[1,1, 0,0,0]' [0, 1,0, 1, 1], [1, 1, 1,0 , IJ, [0, 1, 1, 1, 0J },

[1, OJ -> { [1,0,0,0,0], [0,0,0,1,1], [1,0,1 ,0,1] , [0,0, 1,1 ,0],[0,1 ,0,0,1], [1, 1, 0, 1,0J, [0, 1, 1,0 ,OJ , [1, 1, 1, 1, IJ} ,

[1,1] -> { [1,0,0,0,1 ], [0,0,0,1 ,0], [1,0,1 ,0, OJ , [0,0 ,1 ,1 ,1] ,[0,1 ,0,0,0], [1, 1,0, 1, IJ, [0, 1, 1,0,1] , [1, 1, 1, 1,0] } }

Note that to each s there corresponds the list of vectors of Os' Of course,the list corresponding to [0,0] is O.

AI Library I A binary example of syndrome decodinq

I min_weight(X) := min( {weight(x) with x in X})min_weights(X):= begin

local m=min_weight(X){x in X such_that weight(x) =m}

end

IB=[O,1] : Vector(712) ;

(1 1 0 0 1 )

G= 00 1°1 : Matrix(712) ;1 001 1

IH=GT{1, 5}; H=[1 , 0ll H 1[0, 1] ;

IC={ a ·G 1+b·G2+c ·G3 with (a, b, c) in (B,B,B)} ;

IX={ [a, b]~{[a , 0, 0, 0, b]+x with x in C} with (a,b) in (B,B)} ;

IM={ x1~min_weights(x2) with x in X};

IE={ x1~x2,1 with x in M} ;

I y= [0, 1, 0, 0, 1];

Is=y ·HT [1,0]I y+E(s) [1,1 ,0,0,1]

Let M be the table {s ~ M s } SEZ~' where M; is the set of vectorsof minimum weight in Os' This is computed by means of the functionmin_weights, which yields, given a list of vectors, the list of those that haveminimum weight in that list. We get:

M = {[O, OJ -> {[O, 0, 0, 0, OJ} , [0, 1] -> {[O, 0, 0, 0,1] , [0,0, 1,0, OJ} ,

[1,0] -> {[I , 0, 0, 0, OJ} , [1, 1] -> {[O, 0, 0,1,0], [0, 1,0,0, OJ}}.

48 Chapter 1. Block Error-correcting Codes

A leaders ' table E = {s -4 es } sEZ~ is found by retaining the first vector ofMs. We get:

E = {[O, 0] -t [0, 0,0, 0, OJ, [0, IJ -t [0, 0,0, 0,1 ],

[1 ,0] -t [1, 0, 0, 0, 0], [1, 1] -t [0,0, 0, 1, OJ} .

We also need a check matrix. Since the identity 13 is a submatrix of G,although this time at the second, third and forth columns, we can construct acheck matrix of C using E.1.29, as we do in the listing :

(1 1 0 1 0)

H = 0 1 1 1 1 .

Now we can decode any vector y as y + E(yHT ) . For example , if y =

[0,1 ,0 ,0 , 1J, then yHT = [1 , OJ and the decoded vector is [0,1 ,0 ,0, 1J +[1 ,0 ,0 ,0 , OJ = [1 ,1,0 ,0 ,11·

1.37 Example. The listing Ternary example of syndrome decoding issimilar to the previous example. Here the list C contain s 33 elements ; thetable X has 9 entries, and the value assigned to each syndrome is a list of 33

elements ; the classes of the syndromes

[0,1], [0, 2], [1 ,0], [1, 1], [2,0],[2, 2J

contain a unique vector of minimum weight, namely

[0, 0, 0, 0,1 ], [0,0,0,0,2], [0,0,0,1 ,0], [1,0,0,0,0], [0,0,0,2,0], [2,0,0,0, 0],

while

C [1,2] = {l0, 0, 0, 1, 2], [0,0 ,2 ,0,2], [0, 1,0,1 , 0], [0, 1, 2, 0, 0],

[1, 0, 0, 0, 1], [1, 2,0, 0, 0], [2,0,0,2,0] , [2, 0,1 , 0, OJ}

(C[2,1] is obtained by multiplying all the vectors in C [1,2] by 2). With all this,it is immediate to set up the syndrome-leader table E. The listing containsthe example of decoding the vector y = [1 ,1 ,1 ,1 , 1J.

Complexity

Assuming that we have the table E = {es } , the decoder is as fast as theevaluation of the product s = yHT and the looking up on the table E . Butfor the construction of E we have to do a number of operations of the orderq": there are qn-k entries and the selection of es in C, needs a number ofweight evaluations of the order qk on the average. In case we know thatH = (- pTlln_k) , the time needed to select the es can be decreased a littlebit for all sE lFn

-k such that lsi ~ t, for we may take es = Ok ls (note that

this holds for vOlq(n - k , t ) syndromes). In any case, the space needed tokeep the table E is ofthe order of qn-k vectors oflength n .

1.2. Linear codes

Ternary example of syndrome decodingIK=[O,1,2] : Vector(713);

(1 0 0 2 2 )

G= 0 1 00 1 : Matrix(713);0011 0

IH=-GT{4,S}II2 ;

IC=[a·G1+b·G2+c ·G3 with (a.b,c) in (K,K,K)] ;

I X={ [a,b]"{ to.o.o.e.st-x with x in C} with (a,b) in {K.K) } ;IM= {x1"min_weights(x2) with x in X} ;

IE={ x1"x2.1 with x in M} ;

Iy=[1 ,1.1.1.1]; s=y·HT - [1,1]I E(s) - [1,0,0, O. 0]I y-E(s) - [0,1,1,1,1]

49

Bounding the error for binary codes

As we noticed before (first of Remarks 1.35), an error occurs if and only ifthe error pattern e = y - x does not belong to the table E = {e,}. If we letO::i denote the number of es whose weight is i, and we are in the binary case,then

n

r; = L O::i (1- pt - ipii = O

is the probability that x' = x, assuming that p is the bit error probability.Note that (1 - p)n-ipi is the probability of getting a given error-pattern.Hence Pe = 1 - Pc is the code error probability. Moreover, since O::i = (7)forO ~ i ~ t, and 1 = L~o (7)pi( l - p)n-i,

If p is small, the dominant term in this sum can be approximated by

and the error reduction factor by

In general the integers O::i are difficult to accertain for i > t .

50 Chapter I. Block Error-correcting Codes

Incomplete syndrome-leader decoding

Ifasking for retransmission is allowed, sometimes it may be useful to modifythe syndrome-leader decoder as follows: If s is the syndrome of the receivedvector and the leader es has weight at most t, then we return y - es , otherwisewe ask for retransmission. This scheme is called incomplete decoding. Fromthe preceding discussions it is clear that it will correct any error-pattern ofweigt at most t and detect any error-pattern of weight t + 1 or higher that isnot a code word .

E.1.32. Given a linear code of length n, show that the probability of anundetected code error is given by

n

Pu(n,p) = 2::>ipi(1 - p)n-i,i=l

where ai is the number ofcode vectors that have weight i (1 ::;; i ::;; n). Whatis the probability of retransmission when this code is used in incompletedecoding?

Gilbert-Varshamov existence condition

1.38 Theorem. Fix positive integers n, k, d such that k ::;; nand 2 ~ d ~

n + 1. Ifthe relation

vOlq(n - 1, d - 2) < qn-k

is satisfied (this is called the Gilbert-Varshamov condition), then there exists

a linear code oftyp e [n, k, d'] with d' ~ d.

Proof It is sufficient to see that the condition allows us to construct a matrixH E M;:-k(JF) of rank n - k with the property that any d - 1 of its columnsare linearly independent. Indeed, if this is the case, then the code C definedas the space orthogonal to the rows of H has length n, dimension k (fromn - (n - k) = k) , and minimum weight d' ~ d (since there are no linearrelations of length less than d among the columns of H, which is a checkmatrix of C) .

Before constructing H, we first establish that the Gilbert-Varshamovcondition implies that d - 1 ::;; n - k (note that this is the Singleton bound

1.2. Linear codes

for the parameters d and k) . Indeed,

d-2 ( )n-l .vOlq(n - 1, d - 2) = L . (q - 1)1

j = o ]

d-2 (d 2)~ L ~ (q - l )j

j = O ]

= (1+ (q _ 1) )d- 2 = qd- 2,

51

and hence qd-2 < qn-k if the Gilbert-Varshamov condit ion is satisfied.Therefore d - 2 < n - k, which for integers is equivalent to the claimedinequality d - 1 ~ n - k .

In order to construct H, we first select a basis CI , . . . , Cn-k E IFn-k.

Since d - 1 ~ n - k, any d - 1 vectors extracted from this basis are linearlyindependent. Moreover, a matrix H of type (n - k) x m with entries in IFand which contains the columns cT, j = 1, . . . , n - k, has rank n - k.

Now assume that we have constructed, for some i E In - k ,n], vectorsCI , . . . ,Gi E IFn-k with the property that any d - 1 from among them arelinearly independent. If i = n , it is sufficient to take H = (cT , . .. , c~)and our problem is solved. Otherwise we will have i < n . In this case, thenumber of linear combinations that can be formed with at most d - 2 vectorsfrom among CI , . .. , Ci is not greater than volq(i, d - 2) (see E.l .13). Sincei ~ n - 1, vOlq(i , d - 2) ~ vOlq(n - 1, d - 2). If the Gilbert-Varshamovcondtion is satisfied , then there is a vector Ci+ I E IFn - k which is not a linearcombination of any subset of d - 2 vectors extracted from the list CI , . . . , Gi,

and our claim follows by induction. 0

Since Aq(n, d) ~ Aq(n, d') if d' ~ d, the Gilbert-Varshamov conditionshows that Aq(n, d) ~ qk. This lower bound, called the Gilbert-Varshamovbound, often can be used to improve the Gilbert bound (theorem 1.17). Inthe listing The Gilbert-Varshamov lower bound we define a function thatcomputes this bound (Ib_gilbert_varshamov).

Hamming codes

Hamming codes were discovered by Hamming (1950) and Go­

lay (1949) .

R. Hill, [9], p. 90.

We will say that a q-ary code C of type In,n - r ] is a Hamming codeojcodimension r if the columns of a check matrix H E M~ (IF) of C form

52 Chapter I. Block Error-correcting Codes

AI Library I The Gilbert-Varshamov lower bound

Ib_gilberCvarshamov(n, d. q):= begin

local k=O, v=volume(n-1, d-2, q)while qn-k>v do

k=k+1endqk-1

endIlb_gilberCvarshamov(n. d):= Ib_gilberCvarshamov(n, d, 2)

~l l b_g i l bert_varshamOV ( 1 0 .3 ) - 64Hence A(10,3) ~ 64

Ilb_gilbert(10,3) - 19

a maximal set among the subsets ofF" with the property that any two of itselements are linearly independent (of such a matrix we will say that it is aq-ary Hamming matrix ofcodimension r).

E.1.33. For i = 1, . . . , r , let Hi be the matrix whose columns are the vectorsof lFr that have the form (0, .. ., 0, 1, D:i+ b .. . , D:r ) , with D:i H , . . . , D:n E IFarbitrary. Let H = H 1IH2 !···IHr. Prove that H is a q-ary Hamming matrixofcodimension r (we will say that it is a normalized q-ary Hamming matrixof codimension r). Note also that H i has «:' columns and hence that Hhas (qr - 1)/(q - 1) columns. The listing Normalized Hamming matrixdefines a function based on this procedure, and contains some examples.

1.39 Remark. Since two vectors of'F" are linearly independent ifand only ifthey are non-zero and define distinct points of the projective space lP'(lFr

) , wesee that the number n ofcolumns ofa q-ary Hamming matrix ofcodimension

r coincides with the cardinal oflP'(JFT), which is n = qr - 1. This expressionq -1

agrees with the observation made in E.l .33. It also follows by induction fromthe fact that lP'r-l = lP'(lFr ) is the disjoint union of an affine space A r-l,

whose cardinal is «:'.and a hyperplane, which is a lP'r-2. 0

It is clear that two Hamming codes ofthe same codimension are scalarlyequivalent. We will write Hamq(r) to denote anyone of them and Ham~(r)

to denote the corresponding dual code, that is to say, the code generatedby the check matrix H used to define Hamq(r) . By E.!.33 , it is clear thatHamq(r ) has dimension k = n - r = (qr - 1)/ (q - 1) - r . Its codimension,which is the dimension of Ham~ (r ), is equal to r .

1.40 Example. The binary Hamming code of codimension 3, Ham2(3), is

1.2. Linear codes 53

AILibrary I Normalized HamminQ matrix of codimension r over a finite field K IA

normalized_hamming_matrix(K :GF, r :71 )check r>O:=begin

local B=[constanCvector(r-1, 0)\ [1]]. H=B, S, sK= {element(j, K) with j in 0 .. cardinal (K) -1}for j in 1 .. r-1 do

S=nullfor b in B do

s=null; b=tail(b)for tin K do

s=(append(b,t). s)endS=(S, s)

endB=[S]H=H&B

endreverse (H) T

end

normalized_hamming_matrix(71z.3) ~

normalized_hamming_matrix(71zA) ~

the code [7,4J that has

(

1 1 1 1 0 0 0 )00111100101011

(

1 1 1 1 1 1 1 1000000 OJ000011111111000001100110011110010101010101011

(

1 1 1 1 1 1 1 1 1 0 0 0 0 )00011122211100120120120121

(

1 0 0 1 1 0 1)H= 0 1 0 1 0 1 1

o 0 1 0 1 1 1

as check matrix. Indeed, the columns of this matrix are all non-zero binaryvectors of length 3 and in the binary case two non-zero vectors are linearlyindependent if and only if they are distinct.

E.1.34. Compare the binary Hamming matrix of the previous example witha binary normalized Hamming matrix ofcodimension 3, and with the matrixH of the Hamming [7,4,31 code studied in the Example 1.9.

1.41 Proposition. If C is any Hamming code, then de = 3. In particular,the error-correcting capacity ofa Hamming code is 1.

Proof: If H is a check matrix of C, then we know that the elements of Carethe linear relations satisfied by the columns of H. Since any two columns

54 Chapter 1. Block Error-correcting Codes

of H are linearly independent, the minimum distance of G is at least 3. Onthe other hand, G has elements ofweight 3, because the sum of two columnsof H is linearly independent of them and hence it must be proportional toanother column of H (the columns of H contain all non-zero vectors of lFr

up to a scalar factor). 0

1.42 Proposition. The Hamming codes are perfect.

Proof Since Hamq(r) has type [(qr - 1)/(q - 1), (qr - 1)/(q - 1) - r, 3],we know from E.1.l4 that it is perfect. It can be checked directly as well :the ball of radius 1 with center an element ofF" contains 1 + n(q - 1) = qrelements, so the union of the balls of radius I with center an element of G isa set with qn-rqr = qn elements. 0

1.43 Proposition. IfG' = Ham~ (r) is the dual ofa Hamming code G =Hamq(r), the weight ofany non-zero element ofC' is qr-I. In particular,the distance between any pair ofdistinct elements ofG' is qr-I .

Proof Let H = (hij) be a check matrix of G. Then the non-zero vectors ofG' have the form z = aH, a E ]FT, a i=- O. So the i-th component of z has theform Zi = aIhli + .. .+ arhri' Therefore the condition Zi = 0 is equivalentto say that the point Pi of pr-l = P(lFr) defined by the i-th column of Hbelongs to the hyperplane ofpr- I defined by the equation

Since {PI, ... ,Pn } = pr-I (cf. Remark 1.39), it follows that the number ofnon-zero components of Z is the cardinal of the complement of a hyperplaneofpr-I . Since this complement is an affine space Ar- I , its cardinal is qr-Iand so any non-zero element of G' has weight «:'. 0

Codes such that the distance between pairs of distinct elements is a fixedinteger d are called equidistant of distance d. Thus Ham~ (r) is equidistantof distance qr-I .

E.1.35. Find a check matrix of Ham7(2), the Hamming code over lF7 ofcodimension 2, and use it to decode the message

3523410610521360.

Weight enumerator

The weight enumerator of a code G is defined as the polynomial

n

A(t) = LAiti,i= O

1.2. Linear codes

where Ai is the number of elements of C that have weight i .It is clear that A(t) can be expressed in the form

A(t) = L t 1xI,x EC

55

for the term t i appears in this sum as many times as the number of solutionsof the equation Ixl = i, x E C.

Note that Ao = 1, since On is the unique element of C with weight O.On the other hand Ai = 0 if 0 < i < d, where d is the minimum distanceof C. The determination of the other A i is not easy in general and it is oneof the basic problems of coding theory.

MacWilliams identities

A situation that might be favorable to the determination of the weight enu­merator of a code is the case when it is constructed in some prescribed wayfrom another code whose weight enumerator is known. Next theorem (7),which shows how to obtain the weight enumerator of C1.. from the weightenumerator ofC (and conversely) is a positive illustration of this expectation.

Remark. In the proof of the theorem we need the notion of character ofa group G, which by definition is a group homomorphism of G to U(I) =

{z E C Ilzl = I}, the group ofcomplex numbers ofmodulus 1. The constantmap 9 ~ 1 is a character, called the unit character. A character differentfrom the unit character is said to be non-trivial and the main fact we willneed is that the additive group of finite field IFq has non-trivial characters.Actually any finite abelian group has non-trivial characters. Let us sketchhow this can be established.

It is known that any finite abelian group G is isomorphic to a product ofthe form

Znl X . . . X Znk'

with k a positive integer and nl , . . . , nk integers greater than 1. For example,in the case of a finite field IFq» we have (if q = pr, p prime),

since IFq is vector space of dimension T over Zp. In any case , it is clear that ifwe know how to find a non-trivial character ofZn1 then we also know how tofind a non-trivial character of G (the composition of the non-trivial characterof Znl with the projection of Znl X ' " X Znk onto Znl gives a non-trivialcharacter of G). Finally note that if n is an integer greater than I and ~ =I- 1is an n-th root of unity then the map x :Zn --+ U(I) such that X(k) = ~k iswell defined and is a non-trivial character of Zn.

56 Chapter 1. Block Error-correcting Codes

E.1.36. Given a group G, the set G V of all characters of G has a naturalgroup structure (the multiplication XX' of two characters X and X' is definedby the relation (XX')(g) = X(g) X'(g)) and with this structure GV is calledthe dual group ofG. Prove that ifn is an integer greater than I and ~ E U (1)is a primitive n-th root of unity, then there is an isomorphism Zn ~ Z~,

m ~ Xm, where Xm(k) = ~mk . Use this to prove that for any finite abeliangroup G there exists an isomorphism G ~ G V

1.44 Theorem (F. J. MacWilliams). The weight enumerator B(t) ofthe dualcode C1.. oja code C ojtype In, k] can be determinedJrom the weight enu­merator A(t) oJC according to theJollowing identity :

(1- t )qkB(t) = (1 +q*ttA ,

1 + q*tq* = q - 1.

Proof Let X be a non-trivial character of the additive group of IF = IFq

(see the preceeding Remark). Thus we have a map x: IF --t U(l) such thatx(a + (3) = x(a)x((3) for any a ,(3 E IF with X(r) :f:- 1 for some, E IF.Observe that I:a EF x(a) = 0, since

x(r) L x(a) = L x(a +,) = L x(a)aEIF aEIF aEIF

and X(r) :f:- l.Now consider the sum

S = L L X(x, y)t IY1,xEC yEFn

where X(x,y) = X((xly)). After reordering we have

S = L t1y! L X(x, y) .yEFn x EC

Ify E C1.. , then (xly) = 0 for all x E C and so X(x,y) = 1 for all x E C,and in this case I:xEC X(x, y) = ICI. Ify ~ C1.., then the map C --t IF suchthat x ~ (xly) takes each value of IF the same number of times (becauseit is an IF-linear map) and hence, using that I:aEF x(a) = 0, we have thatI:xEC X(x,y) = O. Putting the two cases together we have that

S = ICI L t 1Y1= l B(t) .yEC.l

1.2. Linear codes 57

On the other hand, for any given x E C, and making the convention, forall 0: E IF, that 10:1= 1 if 0: =I- 0 and 10:1= 0 if 0: = 0, we have

L X(x , y)t lYI = L X(XIYl + . . . + XnYn)tIYll+ ...+ IYnl

yEFn Yl,....YnEFn

= L II X(XiYi)tIYi l

Yl ,...,ynEFi= ln

= II (L x(xio:) tln l

) .i= l nE F

But it is clear that

{I + q*tL X(xio:)t1nl =

nEF 1 - t

if Xi = 0

if Xi =I- 0

because 2:n EIF* X(0:) = -1. Consequently

(I t ) Ixl

"'" X(x 'o: )t ln l = (1 + q*tt -L-IF

t 1 + q*tnE

and summing with respect to X E C we see that

S = (I +q*tt A( I-t) ,1 + q*t

as stated . o1.45 Example (Weight enumerator of the zero-parity code). The zero-paritycode of length n is the dual of the repetition code C of length n. Since theweight enumerator of C is A(t ) = 1 + (q - l )tn, the weight enumeratorB (t ) of the zero-parity code is given by the relation

qB(t) = (1 + (q -1)tt(1 + (q -1)C +1(;~ l)tf)= (1 + (q - l)tt + (q - 1)(1 - tt .

In the binary case we have

2B(t) = (1+ t)n + (1 - t)n ,

which yieldsi";"n/2

B(t) = ~ (~)t2i .t=O

Note that this could have been written directly, for the binary zero-paritycode oflength n has only even-weight words and the number of those havingweight 2i is (~) .

58 Chapter I. Block Error-correcting Codes

r -lB(t)=I+(qT-I)tq,

1.46 Example (Weight enumerator of the Hamming codes). We know thatthe dual Hamming code Ham~(r) is equidistant, with minimum distanceqT-I (Proposition 1.43). This means that its weight enumerator, say B(t), isthe polynomial

because qT - 1 is the number of non-zero vectors in Ham~(r) and each ofthese has weight qT-I. Now the MacWilliams identity allows us to determinethe weight enumerator, say A(t), of Hamq(r) :

qTA(t) = (1 + q*ttB (_II_-_t_)+ q*t

(1 _ t)qr-l= (1 + q*t)n + (qT - 1)(1 + q*t)n _

(1 + q*t)qr 1

r - l r-l= (1 + q*tt + (qT - 1)(1 + q*tt-q (1 - t)q

= (1 + q*tt-qr- 1 ((1 + q*t)qr-l + (qT _ 1)(1- t)qr-l)

= (1 + q*t) qr;~l-l ((1 + q*t)qr-l + (qT -1)(1- t)qr-l) ,

since n = (qT - l)/(q - 1) and n - qT-I = (qT-I - I)/(q - 1).In the binary case the previous formula yields that the weight enumerator

A(t) ofHam(r) is

2T A(t) = (1 + t)2r

-1-1 ((1 + t)2

r-

1 + (2T_ 1)(1 _ t)2

r-

1) .

For r = 3, Ham(3) has type [7,4] and

8A(t) = (1 + t )3 ((1 + t)4 + 7(1 - t)4) ,

orA(t) = 1 + 7t3+ 7t4+e.

This means that Ao = A7 = 1, Al = A2 = A5 = A6 = °and A3 = A4 =7. Actually it is easy to find, using the description of this code given in theexample 1.40 (see also the example 1.9) that the weight 3 vectors of Ham(3)are

[1 , 1,0 , 1,0 ,0,0], [0,1 , 1,0 , 1,0,0], [1,0,1 ,0,0,1,0], [0,0, 1,1 ,0,0 ,1],[1 ,0,0 ,0,1 ,0, I], [0,1 ,0 ,0 ,0,1 , 1J , [0,0,0,1 ,1 ,1,0]

and the weight 4 vectors are

[1 ,1 ,1,0,0,0,1]' [1 ,0,1,1,1,0,0], [0,1,1 ,1,0,1,0], [1 ,1 ,0 ,0 ,1,1 ,0],[0,1,0,1,1 ,0 ,1], [1 ,0,0,1,0,1 , 1J, [0,0,1,0,1 ,1 , IJ

Note that the latter are obtained from the former, in reverse order, by inter­

changing °and 1.

1.2. Linear codes

AI Library I Weightenumerator of the Hamming codes

hamming_weight_enumerator(q, r, T):=begin

r-1 1local n=~ a=1+(q-1) ·T e=qr-1q-1 ' ,an. (ae+(qr-1) ' (1-T)e)

qrend

Ihamming_weight_enumerator(r,T):= hamming_weight_enumerator(2, r, T)

{1-T}(1+(q-1) ·T)n. T~ 1+ -1 .T (A)

macwilliams(n,k,q,A,T):= k (q )q

Imacwiliiams(n,k.A,T) := macwiliiams(n,k,2,A,T)

rl hamming_weight_enumerator(3,T) ..... T7+7 ·T4+7 'T3+1~ macwilliams(7,3,1+7 ·T4,T) ..... T7+7 ·T4+7 ·T3+1

Summary

59

• Basic notions: linear code, minimum weight, generating matrix, dualcode, check matrix.

• Syndrome-leader decoding algorithm. It agrees with the minimumdistance decoder 9 on the set D ofg-decodable vectors.

• The Gilbert-Varshamov condition vOlq(n - 1, d - 2) < qn-k guar­antees the existence of a linear code of type In, k, dJ (n, k,d positiveintegers such that k ~ nand 2 ~ d ~ n). In particular the conditionimplies that Aq(n ,d) ;;? qk, a fact that often can be used to improvethe Gilbert lower bound.

• Definition of the Hamming codes

Hamq(r) rv [(qr - 1)/(q - 1), (qr - 1)/(q - 1) - r ,3J

by specifying a check matrix (a q-ary Hamming matrix ofcodimensionr) . They are perfect codes and the dual code Ham~(r) is equidistantwith distance qr-l .

• MacWilliams indentities (Theorem 7): If A(t) and B(t) are the weightenumerators of a code C "" In,kJ and its dual , then

qkB(t) = (1 + q*t)nA ( 1 - t ) , q* = q - 1.1 + q*t

The main example we have considered is the determination of theweight enumerator of the Hamming codes (example 1.46.

60

Problems

Chapter 1. Block Error-correcting Codes

P.1.9 The ISBN code has been described in the Example 1.19. Prove thatthis code :

1. Detects single errors and transpositions of two digits .

2. Cannot correct a single error in general.

3. Can correct a single error ifwe know its position.

If instead of the ISBN weighted check digit relation we used the the relation

L:i~l Xi == a (mod 11),

which of the above properties would fail?

P.1.IO (The Selmer personal registration code, 1967). It is used in Norwayto assign personal registration numbers and it can be defined as the set R C(Zll)ll formed with the vectors X such that Xi =I- 10 for i = 1, . .. ,9 and

T (3 7 6 1 8 9 4 5 2 1 0)xH = 0, where H = 5 4 3 2 7 6 5 4 3 2 1 . Show that

Rcan:1) Correct one error if and only if its position is not 4 or 10.

2) Detect a double error if and only if it is not of the form t(64 - 610),

t E Zh.What can be said about the detection and correction of transpositions?

P.1.11 Encode pairs of information bits 00, 01, 10, 11 as 00000, 01101,10111,11010, respectively.

1. Show that this code is linear and express each of the five bits of thecode words in terms of the information bits.

2. Find a generating matrix and a check matrix.

3. Write the coset classes of this code and choose a leaders' table.

4. Decode the word 11111.

P.1.12 The matrix

(

1 a aH = 1 1 0

101all

is the check matrix of a binary code C.

1.2. Linearcodes 61

1. Find the list of code-words in the case a = b = c = d = l.

2. Show that it is possible to choose a, b,c, d in such a way that C correctssimple errors and detects double errors. Is there a choice that wouldallow the correction of all double errors?

P.l.13 Let G I and G2 be generating matrices oflinear codes oftypes Inl' k, ddqand [n2, k, d2]q , respectively. Prove that the matrices

G3=(~1 ~2)'generate linear codes of types [nl + n2, 2k, d3]q and [nl + n2, k, d 4]q withd3 = min(d l, d2) and d 4 ~ d l + d 2•

P.1.14 Let n = rs, where rand s are positive integers. Let C be the binarycode of length n formed with the words x = ala2 . . . an such that, whenarranged in an r x s matrix

a(s-l)r+ l asr

the sum of the elements of each row and of each column is zero.

1. Check that it is a linear code and find its dimension and minimumdistance.

2. Devise a decoding scheme .

3. For r = 3 and s = 4, find a generating matrix and a check matrixofC.

P.1.IS A binary linear code C ofiength 8 is defined by the equations

X5 = X2 + X3 + X4

X6 = Xl + X2 + X 3

X7 = Xl + X 2 + X4

Xs = X l + X3 + X4

Find a check matrix for C, show that de = 4 and calculate the weight enu­merator of C.

P.1.16 (First order Reed-Muller codes). Let L m be the vector space of poly­nomials of degree ~ 1 in m indeterminates and with coefficients in IF. Theelements of L m are expressions

ao + alXI + ... + amXm,

62 Chapter 1. Block Error-correcting Codes

where X l, . . . , X m are indeterminates and ao, all . . . , am are arbitrary ele­ment s of IF. So 1, Xl, . . . ,Xm is a basis of Lm over IF and, in particular, thedimension of Lm is m + 1.

Let n be an integer such that qm- l < n ~ qm and pick distinct vectorsx = x l, . . . , x n E JFffi. Show that:

1) The linear map E: Lm -t IFn such that E(J) = (J (z '} , . . . , f (xn )) isinjective.

2) The image of E is a linear code of type In ,m + 1, n _ qm-l] .

Such codes are called (first order) Reed-Muller codes and will be denotedRM~(m). In the case n = qm, instead of RM~(m) we will simply writeRMq (m) and we will say that this is afull Reed-Muller code. Thus RMq (m ) rv

(qm, m+ 1,qm-l(q -1)) .

1.47 Remark. There are many references that present results on the decod­ing ofRM codes . The following two books , for example, include a lucid andelementary introduct ion: [1] (Chapter 9) and [27] (Chapter 4).

1.3. Hadamard codes

1.3 Hadamard codes

63

The first error control code developed for a deep space appli­

cation was the nonlinear (32,64) code used in the Mariner '69

mission .

S.B. Wicker, [28], p. 2124.

Essential points

• Hadamard matrices. The matrices u'».

• The order of a Hadamard matrix is divisible by 4 (if it is not 1 or 2).

• Paley matrices.

• Uses of the Paley matrices for the construction of Hadamard matrices.

• Hadamard code s. Example: the Mariner code.

• The Hadamard decoder.

• Equivalence of some Hadamard codes with first order Reed-Mullercodes .

Hadamard matrices

A Hadamard matrix of order n is a matrix of type n x n whose coefficientsare 1 or -1 and such that

HHT = nI.

Note that this relation is equivalent to say that the rows of H have normvn and that any two of them are orthogonal. Since H is invertible andHT = nH-1, we see that HT H = nH-1H = nI and hence HT is alsoa Hadamard matrix . Therefore H is a Hadamard matrix if and only if itscolumns have norm vn and any two of them are orthogonal.

If H is a Hadamard matrix of order n, then

det (H) = ±nn/2.

In part icular we see that Hadamard matrices satisfy the equality in the Hada­mard inequality

n n

Idet(A)I :::; II(I: a7j) 1/ 2,i= l i = l

64 Chapter I. Block Error-correcting Codes

which is valid for all n x n real matrices A = (aij) .

E.1.37. Show that nn /2 is an upper-bound for the value of det(A) when Aruns through all real matrices such that !aij I ~ 1, and that the equality isattained only if A is a Hadamard matrix . 0

If the signs ofa row (column) ofa Hadamard matrix are changed, then the re­sulting matrix is clearly a Hadamard matrix again. If the rows (columns) ofaHadamard matrix are permuted in an arbitrary way, then the resulting matrixis a Hadamard matrix . Two Hadamard matrices are said to be equivalent (orisomorphic) if it is possible to go from one to the other using a sequence ofthese two operations . Observe that by changing the sign of all the columnswhose first term is -1, and then of all the rows whose first term is -1, weobtain an equivalent Hadamard matrix whose first row and column are Inand I~, respectively. Such Hadamard matrices are said to be normalizedHadamard matrices. For example ,

is the only normalized Hadamard matrix of order 2. More generally, if wedefine H(n) (n ~ 2) recursively by the formula

H(n-l»)_H(n-l) ,

then H(n) is a normalized Hadamard matrix oforder 2n for all n (see the list­ing Hadamard matrix H(n), where we define the function hadamard_matrix_that later will be superseded by hadamard_matrix).

It is also easy to check that if Hand H' are Hadamard matrices ofordersnand n' , then the tensor product H 0 H' is a Hadamard matrix of order nn'(the matrix H 0 H', also called the Kronecker product of Hand H', can bedefined as the matrix obtained by replacing each entry aij of H by the matrixaijH'). For example, if H is a Hadamard matrix of order n , then

H(l) 0 H = (H H)H -H

is a Hadamard matrix of order 2n. Note also that we can write

E.1.38. Check that the function tensor defined in the listing Tensor productof two matrices yields the tensor or Kronecker product of two matrices.

I .3. Hadamard codes

AI Library I Hadamard matrix H(n)

Ihadamard_matrix_ (1) := (11 1 )-1hadamard_matrix_(n :71 )check n>1:=begin

local H=hadamard_matrix_ (n-1 )(H I H)&(H I-H)

end

Ihadamard_matrix_ (1) -+ (~ - ~ )

( ~ -~ ~ -~)hadamard_matrix_(2) -+ 1 1 -1 -1

1 -1 -1 1

I [hadarnardjnatrix, (3) I -+ 4096I Ihadamard_matrix_(5) I -+ 1208925819614629174706176

I(25) ; -+ 1208925819614629174706176

AI Library I Tensor or Kronecker product of two matrices

tensor(H :Matrix, H' :Matrix) :=begin

local S, T=[]for h in H do

S=[]for a in h do

S=Sla ·WendT=T&S

endend

IH=hadamard_matrix(1) -+ (~ -~)

(

1 1 1 1)1 -1 1-1tensor(H, H) -+ 1 1 -1 -1

1 -1 -1 1

65

IA

1.48 Proposition. If H is a Hadamard matrix oforder n ;;:: 3, then n is amultiple of4.

Proof We can assume, without loss of generality, the H is normalized.First let us see that n is even. Indeed, if rand r' are the number of times

that 1 and -1 appear in the second row of H, then on one hand the scalarproduct of the first two rows of H is zero, by definition of Hadamard matrix,

66 Chapter 1. Block Error-correcting Codes

and on the other it is equal to r - r', since H is normalized. Hence r' = rand n = r + r' = 2r.

Now we can assume, after permuting the columns of H if necessary, thatthe second row of H has the form

and that the third row (which exists because n ;?: 3) has the form

with8 + t = 8' + t' = r .

Since the third row contains r times 1 and r times -1, we also have that

8 + 8 ' = t + t' = r .

The equations 8 + t = rand 8 + 8 ' = r yield 8 ' = t. Similarly, t' = 8 from8 + t = rand t + t' = r. Hence the third row of H has the form

Finally the condition that the second and third row are orthogonal yields that2(8 - t) = 0 and hence 8 = t, r = 2 8 and n = 48. 0

E.1.39. For n = 4 and n = 8, prove that any two Hadamard matrices oforder n are equivalent.

The Paley construction ofHadamard matrices

Assume that q, the cardinal of IF, is odd (that is, q = p", where p is a primenumber greater than 2 and r is a positive integer). Let

X: IF* -t {±1}

be the Legendre character ofIF*, which, by definition, is the map such that

X(x) = { 1-1

if x IS a square

if x is not a square

We also extend X to IF with the convention that X(O) = O.

1.49 Proposition. Ifwe set r = (q - 1)/2, then X(x) = x T for all x E IF.Moreover, X takes r times the value 1 and r times the value -1 and hence

Z=xEIF X(x) = O.

1.3. Hadamard codes 67

Proof Since (x r )2 = x 2r = x(q-l) = 1 (because F* has order q - 1; cf.Proposition 2.29), we see that xr = ± 1. If x is a square, say x = w2, thenxr = w 2r = 1.

Note that there are exactly r squares in F*. Indeed, since (-x) 2 = x2 ,

the number of squares coincides with the number of distinct pairs {x, - x }with x E F* and it is enough to observe that x =j:. -x, for all x , because pisodd.

Since the equation x" = 1 can have at most r solutions in F*, and the rsquares are solutions, it turns out that x" =j:. 1 for the r elements of F* thatare not squares, hence z" = -1 for all x that are not squares. Thus the valueof x" is 1 for the squares and -1 for the non-squares, as was to be seen. 0

The elements of F that are squares (non-squares) are also said to bequadratic residues (quadratic non-residues) of F. The listing Legendrecharacter and the functions QR and NQR includes the function Legen­dre(a,F), that computes a(q-l)/2 (q = IFI) for a E F*, and the functionsQR(F) and QNR(F) that return, respectively, the sets of quadratic residuesand quadratic non-residues of any finite field F.3

"'1 Librarv I Leuendre character and the functions QR and NQR

Legendre (a :Element(Field) , F: GF)check subextension?(field(a) , F ):= if a=O then

oelse if a(cardinal(F)-1)/2=1 then

1else

-1end

QR(F :GF):=begin

local x, q=cardinal(F)x(j):= element(j , F){x(j) with j in 1 .. (q-1) such_that Legendre(x(j), F) =1 }

endQNR(F :GF):=

beginlocal x, q=cardinal(F)x(j):= element(j,F){x(j) with j in 2 .. (q-1) such_that Legendre(x(j), F) *1 }

end

rl QR(7129) {1, 4, 5, 6, 7, 9,13,16,20,22,23,24,25, 28}

~ QNR(7129) {2, 3, 8,10,11 ,12,14,15,17,18,19,21,26, 27}

3por the convenience of the reader, the function Legendre is defined and used here in­stead of the equivalent built-in function legendre.

68 Chapter 1. Block Error-correcting Codes

Paley matrix of a finite field of odd characteristic

Let Xo = 0, Xl , . . . ,Xq-l denote the elements oflF, in some order, and definethe Paley matrix oflF,

by the formula

Sij = X(Xi - Xj ).

For its computation, see the listing Paley matrix of a finite field .

4' Library I Paley matrix of a finite field of char '* 2

paley_matrix(F :GF)check characteristic(F)*2:=begin

local X, x, n=cardinal(F)-1xU) := element(j,F)x(a):= legendre(a,F)[[X(x(i) -x(j) )with j in O..n]with i in O..n)

end

14

(

0 11 0

paley_matrix(71s) .... -1 1-1 -1

1 -1

-1 -1 1)1 -1 -1o 1-1101

-1 1 0

1.50 Proposition. IfU = Uq E Mq(lF) is the matrix whose entries are all 1,then S = Sq has the following properties:

1) SIr = or 2) IqS = Oq3) SST = qlq - U 4) ST = (-I)(q-I) /2S.

Proof Indeed, 1 and 2 are direct consequences of the equality

LX(X) = 0,xEF

which in tum was established in proposition 1.49.The relation 3 is equivalent to say that

if j = i

if j i= i

The first case is clear. To see the second, let X = tu - ak and y = aj - ai, sothat aj - ak = X+y . With these notations we want to show that the value of

S = LX(x)X(x + y)x

1.3. Hadamard codes 69

is -1 for all y i O. To this end, we can assume that the sum is extendedto all elements x such that x i 0, -yo We can write x + y = XZx, whereZx = 1 + y/x runs through the elements i 0,1 when x runs the elementsi 0, -yoTherefore

S = L X(x)2X(zx) = L X(z) = -l.x # O,- y z#O,l

Finally the relation 4 is a consequence of the the formula X(-1)(-1 )(q-l)/2 (proposition 1.49). 0

Construction of Hadamard matrices using Paley matrices

1.51 Theorem. Ifn = q + 1 == 0 (mod 4), then the matrix

n; = (If _Iql~ sJis a Hadamard matrix ojorder n.

Proof The hypothesis n = q + 1 == 0 (mod 4) implies that (q - 1)/2 is oddand hence S = Sq is skew-symmetric (ST = -S). Writing 1 instead of l qfor simplicity, we have

and the preceeding proposition implies that this matrix is equal to nIno Notethat 1 . IT = q and IT. 1 = Uq. We have shown the case +S; the case -Sis analogous. 0

For the computation of the matrix H n in the theorem, see the listingHadamard matrix of a finite field with q+1 mod 4 =O.

1.52 Remark. When q + 1 is not divisible by 4, in E.1.42 we will see thatwe can still associate a Hadamard matrix to IFq» but its order will be 2q + 2.Then we will be able to define hadamard_matrix(F:Field) for any finite fieldof odd characteristic by using the formula in the Theorem 1.51 when q + 1is divisible by 4 (in this case it returns the same as hadamard_matrix_) andthe formula in E.l.42 when q + 1 is not divisible by 4.

The name conference matrix originates from an application to

conference telephone circuits.

lH. van Lint and R.M. Wilson, [26], p. 173.

70 Chapter 1. Block Error-correcting Codes

AILibrary I Hadamard matrix of a finite field with q+1 - amod 4 IA

hadamard_matrix_(F :GF)check cardinal(F)+ 1 mod 4 =a:=begin

local q=cardinal(F), u=constant_vector(q,1), A= Iq +paley_matrix(F)

([1]1 u)&( u I-A)

end

1 1 1 1 1 1 1-1 1 1 -1 1 -1 -1-1 - 1 1 1 -1 1 -1

hadamard_matrix_(717) --1 -1 -1 1 1 -1 1

1 -1 -1 -1 1 1 -1-1 1 -1 -1 -1 1 1

1 -1 1 -1 -1 -1 11 1 -1 1 -1 -1 -1

Conference matrices and their use in contructing Hadamard matrices

E.l.40. A matrix C oforder n is said to be a conference matrix if its diagonalis zero, the other entries are ±1 and CCT = (n - l)In . Prove that if aconference matrix of order n > 1 exists, then n is even. Prove also thatby permuting rows and columns of C , and changing the sign of suitablerows and columns, we can obtain a symmetric (skew-symmetric) conferencematrix of order n for n == 2 (mod 4) (n == 0 (mod 4)) .

E.1.41. Let S be the Paley matrix of a finite field IFq, q odd, and let e = 1 ifS is symmetric and e = -1 if S is skew-symmetric. Prove that the matrix

C= (0 lq)cIT S

q

is a conference matrix of order q + 1 (for the computation of this matrix, seethe listing Conference matrix of a finite field). This matrix is symmetric(skew-symmetric) if q == 1 (mod 4) (q == 3 (mod 4)) . Note that in the caseq == 3 (mod 4) the matrix In + C is equivalent to the Hadamard matrix H nof theorem 1.51.

E.1.42. If C is a skew-symmetric conference matrix of order n, show thatH = In + C is a Hadamard matrix (cf. the remark at the end of the previousexercise) . If C is a symmetric conference matrix of order n , show that

is a Hadamard matrix of order 2n . o

1.3. Hadamard codes

"'1 Library I Conference matrix of a finite field

conference_matrix(F :GF) :=begin

local S=paley_matrix(F) , q=cardinal(F),e= (-1 )(q-1) /2, u=constant_vector(q,1)

([011 u)&(e 'uIS)

end

71

(

0 1 1-1 0-1

\caretconference_matrix(713) .... -1 1 0

-1 -1 1 -DThe last exercise allows us, as announced in Remark 1.52, to associate

to a finite field IFq» q odd, a Hadamard matrix of order q + 1 or 2q + 2according to whether q + 1 is or is not divisible by 4, respectively. For thecomputational details, see the listing Hadamard matrix of a finite field .The function in this listing is called hadamard_matrix(F), and extends thedefinition of hadamard_matrix_ that could be applied only if q + 1 == 0mod 4.

..., Library I Hadamard matrix of a finite field

hadamard_matrix(F :GF) :=begin

local q=cardinal(F) , r=(q-1)/2, e=(-1V,1=Iq+ 1 , C=conference_matrix(F)

if e=1 then( I+C I -I+C)&(-I+C I-I-C)

elseI+C

endend

Suggested examples (matrices of order 12 and 20which are not displayed)

Ihadamard_matrix (71s)Ic1ear(x); F=713[xj/(x

2+1);

I hadamard_matrix(F)

1.53 Remark. The Hadamard matrices H (2), H (3) , H (4) , H (S) have orders4, 8,16,32. Theorem 1.51 applies for q = 7,11 ,19,23,27,31 and yieldsHadamard matrices H 8, H 12, H 20, H 24, H 28 and H 32. The second part ofthe exercise £.1.42 yields Hadamard matrices H 12, H 2o, H 28 from the fieldslFs, lFg and lF13.

72 Chapter 1. Block Error-correcting Codes

It is conjectured that for all positive integers n divisible by 4 there is aHadamard matrix of order n . The first integer n for which it is not knownwhether there exists a Hadamard matrix of order n is 428 (see [26] or [2]).

The matrices H(3) and Hs are not equal, but we know that they areequivalent (see £.1.39). The matrices H 12 and H 12 are also equivalent (seeP.1.19). In general, however, it is not true that all Hadamard matrices of thesame order are equivalent (it can be seen, for example, that H20 and H20 arenot equivalent) .

For more information on the present state ofknowledge about Hadamardmatrices, see the web page

http://www.research.att.comrnjas/hadamard/

Hadamard codes

One of the recent very interesting and successful applications

of Hadamard matrices is their use as co-called error-correcting

codes.

J.H. van Lint and R.M. Wilson, [26], p. 181.

We associate a binary code C = CH of type (n,2n) to any Hadamardmatrix H of order n as follows: C = C' U C", where C' is obtained byreplacing all occurrences of -1 in H by 0 and C" is the result of comple­menting the vectors in C' (or replacing -1 by 0 in - H).

Since any two rows of H differ in exactly n/2 positions, if follows thatthe minimum distance ofCH is n/2 and so the type ofCH is (n ,2n ,n/2) .

For the construction of CH as a list of code-words, see the listing Con­struction of Hadamard codes (to save space in the examples, instead ofthe matrix of code-words we have written its transpose).

E.1.43. Are the Hadmanard codes equidistant? Can they have vectors ofweight less than nj2?

E.1.44. Show that the code CHs is equivalent to the parity extension of theHamming [7,4,3] code. It is therefore a linear code (note that a Hadamardcode of length n cannot be linear unless n is a power of 2.

1.54 Example. The code C = CH(5), whose type is (32,64,16), is thecode used in the period 1969-1972 by the Mariner space-crafts to trans­mit pictures of Mars (this code turns out to be equivalent to the linear codeRM2(5); see P.1.21). Each picture was digitalized by subdividing it intosmall squares (pixels) and assigning to each of them a number in the range0..63 (00000000111111 in binary) corresponding to the grey level. Finallythe coding was done by assigning each grey level to a code-word of C in a

1.3. Hadamard codes 73

AI Library I Construction of Hadamard codes IA

~hadamard_code(H:Matrix) := {-1~0}(H&-H)

I hadamard_code(n :71) check n>O := hadamard_code (hadamard_matrix (n) )I hadamard_code(F :GF) := hadamard_code (hadamard_matrix (F) )

hadamard_code (2) T _

hadamard_code(717) T _

(

1 1 1 1 0 0 0 0 )101001011100001110010110

100 0 0 0 0 001 1 111 1 11111010000001011101110100100010110011101011000101100111000110001101001110101100011010011001011001110100100010110

one-to-one fashion. Hence the rate of this coding is 6/32 and its correctingcapacity is 7. See figure 1.1 for an upper-bound of the error-reduction factorof this code. Decoding was done by the Hadamard decoder explained in thenext subsection. For more information on the role and most relevant aspectsof coding in the exploration of the solar system, see [28].

0.01 0.02 0.03

Figure 1.1: The error-reduction factor of the Mariner code is given by

L:~:8 ef)(1 - p)32-ipJ-l. For p = 0.01, its value is 8.49 x 10-8, and for

p = 0.001 it is 1.03 X 10- 14 .

Decoding of Hadamard codes

Let C = CH be the Hadamard code associated to the Hadamard matrix H.Let n be the order of Hand t = l(n/2 - 1)/2J = l(n - 2)/4J. Sincede = n/2, t is the correcting capacity of C.

74 Chapter I. Block Error-correcting Codes

Given a binary vector y of length n, let iJ E {I , -l}n be the result ofreplacing any occurrence of 0 in y by -1. For example, from the definitionofG it follows that if x E G' (or x E Gil), then x (or -x) is a row of H.

E.l.45. Let y be a binary vector of length n, z be the result of changing thesign ofone component ofy, and h be any row of H . Then (ilh) = (iJlh) ±2.

1.55 Proposition. Let G = GH, where H is a Hadamard matrix ofordern. Assume x EGis the sent vector, that e is the error vector and thaty = x +e is the received vector. Let lei = s and assume s :::;; t. Then there isa unique row hy ofH such that I (iJlh) I > n/2, and all other rows h' ofHsatisfy I(iJlh) I < n/2. Moreover, x= hy or x = -hy according to whether(iJlh) > 0 or (iJlh) < O.

Proof Assume x E G' (the case x E Gil is similar and is left as an exercise) .Then h = x is a row of Hand (xlh) = (hlh) = n . Since iJ is the result ofchanging the sign of s components ofx,E.l,45 implies that (f)lh) = n ± 2r,with 0 :::;; r :::;; s :::;; t , and so (iJ lh) ;) n ± 2t > n/2. For any other row h'of H we have that (xlh') = (hlh') = 0 and again by E.1.45 we obtain that(iJlh') = ±2r, with °:::;; r :::;; s :::;; t, and hence I(iJlh') I :::;; 2t < n/2. 0

The preceding proposition suggests decoding the Hadamard code G =GH with the algorithm below (for its implementation, see The Hadamarddecoder).

Hadamard decoder

0) Let y be the received vector.

1) Obtain iJ.

2) Compute c = (iJ lh) for successive rows h of H, and stop as soon aslei > n /2, or if there are no further rows left. In the latter case returna decoder error message.

3) Otherwise, output the result of replacing -1 by 0 in h if c > °or in-h if c < 0.

1.56 Corollary. The Hadamard decoder corrects all error patterns ofweightup to t.

E.1.46 (Paley codes). Let 3 = 3q be the Paley matrix of order q associatedto IFq - q odd. Let Gq be the binary code whose elements are the rows of thematrices !(3 + I + U) and !(-3 + 1+ U), together with the vectors Onand In ' Show that Gq has type (q,2q + 2, (q - 1)/2) (for examples, see thelisting Construction of Paley codes; to save space, the example has beenwritten in columns rather than in rows). Note that for q = 9 we obtain a(9,20,4) code, hence also an (8,20,3) code (cf. E.1.4) .

1.3. Hadamard codes

AI Library I The Hadamard decoder

hadamard_decoder(y, H):=begin

local r=length(H)/2, cy={0=>-1}(y)for h in H do

c=y·hif [cp-r then

if c>O thenreturn({-1 =>O}(h»

elsereturn({-1 =>O}( -h»

endend

end"Decoder error"

end

I\caretH=hadamard_matrix(71s); C=hadamard_code(H);

Ix=Cs; x, hadamard_decoder(x,H)

- [1,0,0,1,1 ,1 ,1,0,0,1,0,1],[1,0,0,1 ,1,1,1,0,0,1,0,1]

Ix=C1S; y=fIip(x,{3}) ; x, y

- [0,0,0,0,1 ,1,0,0,1 ,0,1,1],[0,0,1,0,1 ,1,0,0,1,0,1,1]I hadamard_decoder(y,H) - [0,0,0,0,1 ,1,0,0,1,0,1,1]

Ix=Cg: y=fIip(x ,{2,7}); x, y

- [1,1 ,0,1,0,0,0,0,0,0,1,1],[1,0,0,1 ,0,0,1 ,0,0,0,1 ,1]I hadamard_decoder(y,H) - [1,1,0,1 ,0,0,0,0,0,0,1 ,1]

Ix=C18; y=fIip(x ,{7,12}); x, y- [0,0,1 ,1,0,0,0,0,1 ,1 ,0,1],[0,0,1 ,1 ,0,0,1 ,0,1 ,1 ,0,0]

I hadamard_decoder(y,H) - [0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1]

Ix=C18; y=fIip(x ,{2,7,9}); x, y

- [0,0,1 ,1 ,0,0,0,0,1 ,1 ,0,1],[0,1,1 ,1 ,0,0,1 ,0,0,1 ,0,1]I hadamard_decoder(y,H) - "Decoder error"

Summary

75

• The definition of Hadamard matrices (for example the matrices u'»,n ~ 1) and the fact that the order of a Hadamard matrix is divisible by4 (or is equal to I or 2).

• The Paley matrix of a finite field of odd order q and their properties(Proposition 1.50).

• The Hadamard matrix associated to a finite field of odd order q suchthat q + 1 is divisible by 4 (Theorem 1.51).

• Conference matrices. Symmetric and skew-symmetric conference ma­trices of order n have an associated Hadamard matrix of order n and2n, respectively (E.1.42).

76 Chapter I. Block Error-correcting Codes

AI Library I Construction of Paley codes IA

paley_code(F :GF):= begin

local S=paley_matrix(F), q=cardinal(F), 1=Iq , U=constant_matrix(q,1)

S+I+U -S+I+Uconstantvectortq.O) & 2 & 2 &constant_vector(q,1)

end

I\caretF= 7l3[x]! (x2+ 1);

01111001001000110111011101001001010110110111001001001110110101001111000111000111

paley_code(F)T.... 001 0 1 1 1 01 01 01 01 01 01100011110011100011101010010011101101110010010010111101101010100010011111101100011

• To each Hadamard matrix of order n there is associated a code

CH r-J (n , 2n,n/2) .

The codes CH are the Hadamard codes.

• To each finite field of odd order q there is associated a Paley codeCq r-J (q, 2q + 2, (q - 1)/2).

• The Hadamard code of the matrix H(m) is equivalnent to the (linear)Reed-Muller code RM2(m) (P.1.2l).

Problems

P.1.17. Compare the error-reduction factor of the binary repetition code oflength 5 with the Hadamard code (32,64,16) .

P.1.18. Prove that the results in this section allow us to construct a Hadamardmatrix of order n for all n ~ 100 divisible by 4 except n = 92.

P.1.19. Prove that any two Hadamard matrices of order 12 are equivalent.

P.1.20. Fix a positive integer r and define binary vectors uo, . . . ,Ur- l so that

Uk = [bo(k) , ..., br-l(k)],

where bi(k) is the i-th bit in the binary representation of k. Show that H(r)coincides with the 2r x 2r matrix (_l) (ui!Uj). For the computations related tothis, and examples, see the listing Bit-product of integers and Hadamardmatrices .

1.3. Hadamard codes

ATLibrarvT Bit-oroduct of inteaers and Hadamard matricesr-1

bit_product(a,b,r) := I bit(a,i) .bit(b,i) mod 2

i=O

hadamard (r:7l)check r>O:=begin

local n=2r

[[ (-1 )bicproduct(i,j.n) with j in 0..n-1) with i in 0..n-1)end

(

1 1 1 1)1 -1 1-1hadamard(2).... 1 1 -1 -1

1 -1 -1 1

77

P.l.2!. As seen in P.1.l6, the code RM2(m) has type (2m , 2m +1 , 2m - I ) .

Prove that this code is equivalent to the Hadamard code corresponding tothe Hadamard matrix ttv» . Thus, in particular, RM2(5) is equivalent to the~ariner code. Hint : If H is the control matrix of Ham2(m), then the matrixH obtained from H by first adding the column Om to its left and then therow 1 2m on top is a generating matrix of RM2(m ).

78 Chapter 1. Block Error-correcting Codes

1.4 Parameter bounds

Probably the most basic problem in coding theory is to find the

largest code of a given length and minimum distance.

F. MacWilliams and N.J.A. Sloane, [11], p. 523

Essential points

• The Griesmer bound for linear codes .

• The Plotkin and Elias bounds (for general q-ary codes) and the John­son bound (for general binary codes).

• The linear programming upper bound of Delsarte and the functionLP(n,d) implementing it (follwing the the presentation in [11]).

• Asymptotic bounds and related computations: the lower bound ofGilbert-Varshamov and the upper bounds of Singleton, Hamming,Plotkin, Elias , and McEliece-Rodernich-Rumsey-Welch.

Introduction

The parameters n, k (or M ) and d of a code satisfy a variety of relations.Aside from trivial relations, such as 0 :::; k :::; n and 1 :::; d :::; n , we knowthe Singleton inequality

d :::;n-k+l

(Proposition 1.11), the sphere upper bound

M :::; qn/volq(n, t)

(Theorem 1.14) and the Gilbert lower bound

M ~ q" / volq(n ,d - 1)

(Theorem 1.17). We also recall that if q is a prime power and

vOlq (n - 1, d - 2) < qn- k,

then there exists a linear code of type [n , k,d'] with d' ~ d and hence thatA(n ,d) ~ qk (Theorem 1.38). Moreover, it is not hard to see that this boundis always equal or greater than the Gilbert bound (see P.I .22).

IA. Parameterbounds 79

In this section we will study a few other upper bounds that provide a bet­ter understanding of the function Aq(n ,d) and the problems surrounding itsdetermination. We include many examples. Most of them have been chosento illustrate how far the new resources help (or do not) in the determinationof the values on Table 1.1.

Redundancy adds to the cost of transmitting messages , and the

art and science of error-correcting codes is to balance redun­

dancy with efficiency.

From the 2002/08/20press release on the Fields Medalawards (L. Laforgue and V. Voevodski) and Nevanlinna Prizeawards (Madhu Sudan). The quotation is from the descriptionof Madhu Sudan's contributions to computer science, and inparticular to the theory of error-correcting codes.

The Griesmer bound

Consider a linear code C of type [n, k,d] and a vector z E C such that[z] = d. Let

S (z) = {i E l..n IZi i- O}

(this set is called the support of z ) and

the linear map that extracts, from each vector in IFn, the components whoseindex is not in S (z). Then we have a linear code pz(C ) ~ IFn-d (we will saythat it is the residue of C with respect to z).

IfG is a generating matrix of C , then pz(C) = (Pz(G)) , where pAG)denotes the matrix obtained by applying pz to all the rows of G (we will alsosay that pz(G) is the residue of G with respect to z ). Since Pz(z) = 0, it isclear that dim (pz(C)) ~ k - 1.

1.57 Lemma. The dimension ofpz(C) is k - 1 and its minimum distance isat least rd/ql

Proof Choose a generating matrix G ofC whose first row is z , and let G' bethe matrix obtained from pz(G) by orniting the first row (which is Pz(z) =0). It will suffice to show that the rows of G' are linearly independent.

To see this, let us assume, to argue by contradiction, that the rows of G'are linearly dependent. Then there is a non-trivial linear combination x ofthe rows of G other than the first such that Xi = 0 for i f/. S (z ) (in otherwords , S(x) ~ S(z) and hence Ixl ~ Izi = d). Since the rows of G arelinearly independent, we have both x i- 0 and AX i- z for all A E IF. But

80 Chapter 1. Block Error-correcting Codes

the relation x =t 0 allows us to find A E IF* such that Zi = A Xi for somei E S(z), and from this it follows that [z - Axl < d, which is a contradictionbecause z - AX E C - {O}.

To bound below the minimum distance d' of pz(C) , we may assume,after replacing C by a scalarly equivalent code if necessary, that S(z) ={1, . .. , d} and that z, = 1 for all i E S(z). Then, given a non-zero vec­tor x' E pz(C) of minimum weight d', let x E C be any vector such thatpz(x) = x' , We can write x = Ylx', with Y E lFd. We claim that we canchoose Y in such a way that Iyl ~ d - rd/q1. Indeed, let a E IF be an elementchosen so that the number s of its ocurrences in y is maximum. With thiscondition it is clear that sq ~ d (which is equivalent to s ~ rd]q1) and thatx = x - a z E C has the form x = y'lx' with Iy'l ~ d - s ~ d - rd/qlNow we can bound d' below:

d' = Ix'i = Ixl-Iy'l ~ d - (d - rd/q1) = rd/q1,

which is what we wanted.

E.1.47. Let i be a positive integer. Check that

o

1.58 Proposition (Griesmer, 1960). If In,k , d] are the parameters ofa lin­ear code, and we let

k-l rd1Gq(k ,d) = I: i '

i= O q

thenGq(k ,d) ~ n.

Proof If we let Nq(k , d) denote the minimum length n among the linearcodes of dimension k ~ 1 and minimum distance d, it will be sufficient toshow that Gq(k , d) ~ Nq(k, d). Since this relation is true for k = 1, for thevalue of both Nq(l ,d) and Gq(l ,d) is d, we can assume that k ~ 2.

Suppose, as induction hypothesis, that Gq(k - 1, d) ~ Nq(k - 1, d) forall d. Let C be a code of type In,k , d]q with n = Nq(k, d). Let C' be theresidue of C with respect to a vector z E C such that [z] = d. We knowthat C' is a code of type [n - d, k - 1, d']q with d' ~ rd/q1(Lemma 1.57).Therefore

Nq(k , d) - d = n - d ~ Nq(k -1, d') ~ Gq(k -1, d') ~ Gq(k -1, rd/q1).

Taking into account E.1.47, the last term turns out to be Gq(k, d) - d andfrom this the stated relation follows . 0

104. Parameter bounds 81

1.59 Remark. Let us see that the Griesmer bound is an improvement ofthe Singleton bound. The term corresponding to i = °in the summationexpression of Gq(k, d) in Proposition 1.58 is d and each of the other k - 1terms is at least 1. Hence Gq(k ,n) ;:: d + k - 1 and so G(k , n) ~ n impliesthat d + k - 1 ~ n, which is equivalent to the Singleton relation. 0

1.60 Remark. In the proofofProposition 1.58 we have introduced the func­tion Nq(k, d), which gives the minimum possible length of a q-ary linearcode of dimension k and minimum distance d, and we have established that

Nq(k, d) ;:: d + Nq(k - 1, rd/ql) .

Using this inequality recursively, we have proved that Nq(k ,d) ;:: Gq(k , d).There are two main uses ofthis inequality. One is that if a code In, k, d]q

exists, then Gq(d, k) ~ n, and hence Gq(d, k) yields a lower bound for thelength n of codes of dimension k and miminurn distance d. In other words,to search for codes In, k , d]q , with k and d fixed, the tentative dimension tostart with is n = Gq(k , d). For example, G2(4,3) = 3 + r3/21 + f3/221 +f3/2 31 = 3 + 2 + 1 + 1 = 7 and we know that there is a [7,4,3] code, soN2(4,3) = 7, that is, 7 is the minimum length fora binary code ofdimension4 with correcting capacity I . For more examples, see 1.61.

The other use of the Griesmer inequality is that the maximum k thatsatisfies Gq(k, d) ~ n, for fixed nand d, yields an upper bound qk for thefunction Bq(n, d) defined as Aq(n ,d), but using only linear codes instead ofgeneral codes. Note that Aq(n, d) ;:: Bq(n , d).

We refer to Griesmer bound for implementations and some examples.The function griesmer(k,d,q) computes Gq(k, d), which we will call Gries­merfunction, and ub_griesmer(n,d,q) computes the upper bound on B q ( n , d)explained above.

1.61 Examples. According to the listing Griesmer bound, and Remark 1.60,we have B2(8, 3) ~ 32, B2(12, 5) ~ 32, B2(13,5) ~ 64, B2(16, 6) ~ 256.As we will see below, none of these bounds is sharp .

The first bound is worse than the Hamming bound (A2 (8,3) ~ 28), andin fact we will see later in this section, using the linear programming bound,that A2(8 ,3) = 20 (which implies, by the way, that B2(8 , 3) = 16)).

The bound B2(13,5) ~ 64 is not sharp, because if there were a [13,6,5]code, then taking residue with respect to a weight 5 code vector would leadto a linear code [8,5, d] with d ;:: 3 (as rd/21 = 3) and d ~ 4 (by Singleton),but both [8,5,4] and [8,5,3] contradict the sphere bound.

Let us see now that a (16 ,256,6) code cannot be linear. Indeed, if we hada linear code [16,8,6] , then there would also exist a [15,8,5] code . Selectingnow from this latter code all words that end with 00, we would be able to

82 Chapter 1. Block Error-correcting Codes

A' Library I Griesmer upper bound and Griesmer function

ub_griesmer (n,d,q):= begin

local i=O, P=1while n>O do

n=n-rd/Pl; P=P 'q; i= i+1endP

endIub_griesmer(n,d) := ub_griesmer(n,d,2)

k-1

griesmer(k,d,q):= I rd/q']i=O

Igriesmer(k,d) := griesmer(k,d.2)

I ub_griesmer(8,3) 32I ub_griesmer(12.5) 32I ub_griesmer(13,5) 64I ub_griesmer(16,6) 256I ub_griesmer(14,9,3) 81Igriesmer(4,3) 7Igriesmer(5,7) 15Igriesmer(12,7) 22Igriesmer(6.5.3) 11

construct a code [13,6, 5] that we have shown not to exist in the previousparagraph.

The fact that the bound B2(12,5) ~ 32 is not sharp cannot be settledwith the residue operation, because it would lead to a [7,4,3] code, whichis equivalent to the linear Ham(3). The known arguments are much moreinvolved and we omit them (the interested reader may consult the referencesindicated in [16], p. 124, or in [25], p. 52).

The last expresion in the same listing says that B3(14, 9) ~ 4, but it isnot hard to see that this is not sharp either (cf. [25], p. 53, or Problem P.l .23).

In the listing we also have that G2(5, 7) = 15, G2(12,7) = 22 andG(6,5, 3) = 11. The techniques developed in Chapter 3 will allow us toconstruct codes [15,5 ,7] (a suitable BCH code) and [11,6,513 (the ternaryGolay code). This implies that N2(5, 7) = 15 and N3(6, 5) = 11. On theother hand, there is no [22,12,7] code, because these parameters contradictthe sphere bound, but in Chapter 3 we will construct a [23,12 ,7] code (thebinary Golay code) and therefore N2(12, 7) = 23.

1.62 Remark. If we select all words of a code C whose i-th coordinate isa given symbol A, and then delete this common i-th coordinate A from theselected words, we get a code C' ,....., (n - I ,M',d), with M' ~ M . Wesay that this C' is a shortening of C (or the A-shortening of C in the i-th

104. Parameter bounds 83

coordinate). If C is linear, a O-shortening is also linear, but a A-shorteningof a linear code is in general not linear for A =I O. In the previous exampleswe have used this operation twice in the argument to rule out the existenceoflinear codes [16,8, 6] .

The Plotkin bound

We will write (3 to denote the quotient (q - 1)/ q = 1 - q-l. Althoughthe proof of the proposition below can be found in most of the books onerror correcting codes (see, for example, [25] (5.2.4) or [20], Th . 4.5.6),we include it because it illustrates an interesting method that is relevant forsubsequent exercises, examples and computations.

1.63 Proposition (Plotkin, 1951). For any code C oftype (n, M, d), we have

d ~ (3n M"" M-1 '

or, equivalently,M(d - (3n) :::;; d.

Proof The basic idea is that the average of the distances between pairs ofdistinct elements of C is at least d.

In order to calculate this average , arrange the elements of C in an M x nmatrix . For a given column c of this matrix, let m>. denote the number oftimes that A E IF appears in that column, and note that

Lm>. = M.>'EF

The contribution of c to the sum of the distances between pairs of distinctelements of C is

Be = Lm>.(M - m>.) = M 2 - Lml.>'EF >'EF

ButM 2 = (L m>.)2 :::;; q L ml,

>'EIF >'EF

by the Cauchy-Schwartz inequality, and so Be :::;; (3M 2. If we let B denotethe sum of the Be, where c runs through all columns of the matrix C, thenB:::;; n(3M 2. Now it is clear that M(M - l)d :::;; B and so

and this immediately yields the stated inequality. o

84 Chapter I. Block Error-correcting Codes

1.64 Corollary (Plotkin upper bound). Ifd > (3n, then

E.1.48. Prove that the inequality d ~ (3nM/ (M - 1) is an equality if andonly if the code C is equidistant and the numbers rnA are the same for all X.

1.65 Example. For the dual Hamming codes the Plotkin inequality is anequality. Indeed, ifC = Ham~(r), we know that C' has type [n ,r,d], withd = qr-l and n = (qr - 1)/(q - 1). Since (3n = qr-l _ «: < «:' = d,Plotkin's upper bound can be applied. As died - (3n) = qr, which is thecardinal of C,

A (qr - 1 r-l) _ rq 1 ,q -q.

q-

This formula explains the entries A2 (6,3) = A2 (7,4) = 8 and A2(14, 7) =A2(15 ,8) = 16 on the Table 1.1. Another example: with q = 3 and r = 3we find A3(13, 9) = 27.

E.1.49 (Improved Plotkin bound). For binary codes, (3 = 1/2, and thePlotkin upper bound, which is valid if 2d > n, says that

A2(n , d) ~ L2d/(2d - n)J .

But this can be improved to

A2(n,d) ~ 2 Ld/(2d - n)J .

Moreover, ifd is odd and n < 2d+ 1, then A2(n, d) = A2(n+ 1, d+1) andn + 1 < 2(d+ 1), so that

A2(n, d) ~ 2 L(d + 1)/(2d + 1 - n)J .

Hint : With the notations of the proof of Proposition 1.63, we have S e =

2rno(M - rno) and so Se ~ M 2/2 if M is even and S; ~ (M 2 - 1)/2 ifM is odd ; it follows that M(M - 1)d ~ S ~ nM2/2 if M is even andM(M - 1)d ~ S ~ n(M2 - 1)/2 if M is odd .

E.1.50. Let C be a code of type (n,M ,d)q. Prove that there is a code oftype (n - 1, M', d)q with M' ~ M / q. If C is optimal, we deduce thatAq(n, d) ~ qAq(n - 1, d) Hint: This can be achieved with the operation ofshortening C: choose a symbol .\ E IFwhose occurrence in the last positionis maximum, and let C' = {x E IFn-l I (xl.\) E C}.

1.4. Parameterbounds 85

1.66 Remark. If C is a code of type (n, M , d) and d ~ {3n, then we can­not apply the Plotkin bound as it is. But we can use the exercise E.l.50 tofind a code C' of type (n - m , M' , d) such that d > (3(n - m) and withM' ~ M / qm, whereupon we can bound M by qmM' and M' by the Plotkinbound corresponding to (n - m, M' , d) . Note that d > (3(n - m) is equiv­alent to m > n - d/{3 = (n{3 - d)/{3. These ideas , together with those ofthe previous exercise in the binary case , are used to optimize the functionub_plotkin defined in Plotkin upper bound.

AILibrary I Plotkin upper bound

ub_plotkin (n,d,q) :=begin

local m, p=(q-1)/qif d>P' n then

if q=2 thenif odd?(d) then

n=n+1; d=d+1endreturn 2 ·L d / (z -d-n) J

elsereturn L d / (d-P' n) J

endendm=(p ,n-d)/pif m 01 then

m=m+1else

m=rmlendqm'ub_plotkin(n-m, d, q)

endBinary case:

I ub_plotkin(n,d):= ub_plotkin(n,d,2)

I ub_plotkin(13,5) -+ 96Iub_plotkin(32,16) -+ 64I {ub_plotkin(n,3) with n in 5..8} -+ {4, 8, 16, 32}I {ub_plotkin(n,5) with n in 5..12} -+ {2, 2, 2, 4, 6, 12, 24, 48}I {ub_plotkin(n,7) with n in 7..16} -+ {2, 2, 2, 2, 4, 4, 8, 16,32, 64}

E.1.51. Show that ford even, A2 (2d, d) ~ 4d, and for dodd A2(2d+1, d) ~4d+4.

1.67 Remark. The second expression in the listing Plotkin upper boundshows that A2 (32, 16) ~ 64. But we know that there is a code (32,64,16),and hence A2(32 , 16) = 64. In other words, the Mariner code RM(5) isoptimal.

The same listing shows that the Plotkin bound gives the correct values

86 Chapter 1. Block Error-correcting Codes

4,8, 16 (for d = 3),2,2,2,4,6, 12,24 (for d = 5) and 2, 2, 2, 2, 4, 4, 8, 16,32(for d = 7) on the Table 1.1. Note that it gives good results when d is largewith respect to n. When d is small with respect to n, the sphere bound is bet­ter. For example, the Plotkin bound gives A2(8, 3) :s; 32, while the spherebound is A 2(8, 3) :s; 28.

Elias upper bound

From the proof of the Plotkin bound it is clear that we cannot hope that itis sharp when the distances between pairs of distinct elements of the codeare not all equal (cf. E.1.48). The idea behind the Elias bound (also calledBassalygo-Elias bound) is to take into account, in some way, the distributionof such distances in order to get a better upper estimate.

vOlq(n, r)'

Proof It can be found in many reference books, as for example [25] (5.2.11),or [20], Th. 4.5.24, and we omit it. 0

1.68 Proposition (Elias, 1954, Bassalygo, 1965). Assume that q, n, d and rare positive integers with q :? 2 and r :s; {3n. Suppose, moreover, thatr2 - 2{3nr + {3nd > O. Then

{3ndAq(n, d) :s; r2 _ 2{3nr + {3nd

1.69 Remark. The methods used by Elias and Bassalygo are a refinementofthose used in the proofofthe Plotkin bound, but are a little more involved,and the result improves some of the previous bounds only for large valuesonn.

For example, for all values of nand d included in Table 1.1, the Eliasbound is strictly greater than the corresponding value in the table, and thedifference is often quite large . Thus the Elias bound for A2(13, 5) is 162,when the true value of this function is 64 (this will be establshed with thelinear programming method later in this section).

The first value ofn for which the Elias bound is better than the Hamming,Plotkin and Johnson bounds (the Johnson bound is studied below) is n = 39,and in this case only for d = 15. For n = 50 (n = 100), the Elias bound isbetter than the same three bounds for d in the range 15..20 (21..46). For theprecise asymptotic behaviour of this bound, see E.l .58.

Johnson's bound

Let us just mention another approach, for binary codes, that takes into ac­count the distribution of distances. This is done by means of the auxiliary

1.4. Parameter bounds

41 Librarv I Elias upper bound

ub_elias(n,d,q):=begin

local f. R. M, A, 13=1-1/qf(r) := rL2 ·13 ·n ·r+13 ·n·dR=O .. L13 ·nJM=qnfor r in R do

if f(r»O then

l13 ·n ' d qn JA= f(r) ' volumetn.r.q)if A<M then

M=Aend

endendif q=2 & odd?(d) then

M=min(M. ub_elias(n+1.d+1))endM

endBinary case :

I ub_elias(n,d) :=ub_elias(n.d,2)

~ ub_elias(13.5) 162~ ub_elias(14,6) 162

87

function A(n ,d,w) that is defined as the greatest cardinal M of a binarycode (n, M) with the condition that its minimum distance is ;;::: d and all itsvectors have weight w (we will say that such a code has type (n, M , d, w)).

1.70 Remark. Since the Hamming distance between two words ofthe sameweight w is even, and not higher than 2w, we have A(n,2h - 1, w) =A(n , 2h, w) and, if w < h, then A(n, 2h,w) = 1. So we may restrictthe study of A(n ,d,w) to even d, say d = 2h, and w ;;::: h. Now we haveA(n, 2h, w) = A(n, 2h,n - w) (taking complements of the words of an(n , M , d, w) code gives an (n, M, d,n - w) code and viceversa).

E.1.52. Show that A(n , 2h,h) = ln]hJ.

1.71 Proposition (Johnson, 1962). The/unction A(n,d,w) satisfies the re­lation

A(n,2h,w) ~ l2A(n-1,2h,w-1)J.

Hence, by induction,

88 Chapter 1. Block Error-correcting Codes

Proof Let C be a code (n, M, 2h, w) with M = A(n, 2h, w). Let C' ={x' E Z~-ll (1Ix') E C} and set M' = IC'I. Then C' is an (n-l, M', 2h, w­I) code, and hence M' ~ A(n - 1, 2h, w - 1). Arguing similarly with allother components, we see that there are at most nA(n - 1, 2h, w - 1) ones inall words of C, and thus wM ~ nA(n - 1, 2h, w - 1), which is equivalentto the claimed bound. 0

1.72 Theorem (Johnson, 1962). For d = 2t + 1,

Az(n,d) ~ ( n) (d) .t ( n) t + 1 - t A(n, d,d)

~ i + l-.!!:..-Jt+l

Note that this would be the binary sphere bound if the fraction in the denom­inator were omitted.

Proof See, for example, [25] (5.2.15) or [20], Th. 4.5.14). o

The bounds in Proposition 1.71 and Theorem 1.72 are computed in thelisting Johnson upper bound with the function ubjohnson, the first boundwith the arguments (n,d,w) and the second with the arguments (n,d). In theJohnson bound the generally unknown term A(n, d,d) is replaced by its up­per bound given in Proposition 1.71 (notice that this replacement does notdecrease the right hand side of the Johnson bound). The case where d iseven, say d = 2t + 2, can be reduced to the odd d case with the equalityA2(n , 2t + 2) = Az(n - 1, 2t + 1).

In the examples included in Johnson upper bound we can see that onlya few values on the Table 1.1 are given correctly by the Johnson bound. Butin fact the overall deviations are not very large and for a fixed d they becomerelatively small when n increases.

Linear programming bound

The best bounds have been obtained taking into account, for an (n, M , d)code C, the sequence Ao, . . . , An of non-negative integers defined as

Ai = I{(x , y) E C x C Id(x , y) = i} I·

104. Parameterbounds

"" Library I Johnson upper bound

ubjohnson (n,d,w):=begin

local h=L(d+1)/2J, b=1if h~w then

for i in w-h .. 0 .. -1 do

b=l b· (n~i) JW-I

endelse

1end

endubjohnson(n,d):=begin

local t, xif even?(d) then

n=n-1; d=d-1endt=Ld/2J

,= i: (;)+ (I~1HnUb~Ohn,on (n,d,d)

i = O h+dL2n/xJ

end

I ubjohnson(13,5,5) -+ 23I ubjohnson(13,5) -+ 77

I{ubjohnson(n,3) with n in 5..16}-+ {4,8, 16,25,51,83, 167,292,585, 1024,2048, 3615}

I{ubjohnson(n,5) with n in 5..16}-+ {2,2,3,5, 10, 14,28,47,77, 153, 256,428}

I {ubjohnson(n,7) with n in 7..16} -+ {2, 2, 2, 3, 7, 12, 18, 26, 51, 86}

89

''''

We will set ai = AdM and we will say that aD, .. . , an is the distancedistribution, or inner distribution, of C. We have:

AD = M , aD = 1; [1.3]

Ai = ai = 0 for i = 1, . .. , d - 1; [1.4]

AD + Ad + .. . + An = M 2, aD + ad + . .. + an = M . [1.5]

1.73 Remark. For some authors the distance distribution of a code is thesequence AD , .. . , An.

E.1.53. Given x E C, let Ai (x) denote the number ofwords x' E C such thathd(x, x') = i. Check that Ai = E XEC Ai(x) . Show that Ai(x) :( A(n,d, i)and that ai :( A(n,d, i) .

90 Chapter 1. Block Error-correcting Codes

E.1.54. Show that if C is linear then the distance distribution of C coincideswith the weight enumerator.

The Krawtchouk polynomials

If C is a linear code with weight enumerator ao, ... , an, and bo, .. . , bn isthe weight enumerator of c-, then we know (Theorem 7) that

n n

Lbjzj= Lai(l- z)i(l + (q - l)zt-i.

j = O i=O

The coefficient of zj in (1 - z )i(l + (q - l)z )n-i is

and hence

for j = 0, . .. , n. Note that

n

bj = L aiKj(i)i=O

[1.6]

is a polynomial in x of degree j, which will be denoted Kj(x , n, q) if weneed to make the dependence on nand q explicit.

Since Kj(O) = (j) and ai = 0 for i = 1, ... , d - 1, the formula [1.6]can be rewritten as

[1.8]

.1 Librarv I Krawtchouk oolvnorn ials

The auxiliary Newton binomial functionj

N(x,j) := ~n(x-i+1)J .

i=1

The Krawtchouk polynomialsk

krawtchouk(x,k,n,q) := L (-1 )j .N(x,j) ·N(n- x, k-j) ' (q-1 )k-j

j=oI krawtchouk(x,k,n) := krawtchouk(x,k,n,2)

The next proposition yields another interpretation ofthe expressions Kj (i) .

lA. Parameterbounds 91

1.74 Proposition (Delsarte 's lemma). Let wEe be a primitive q-th root ojunity and T = Z q. For i = 0, . . . , q, let Wi be the subset of T " whoseelements are the vectors ofweight i. Then,Jar any given x E Wi,

L w(xlY) = K j (i ).YEWj

Proof Passing to an equivalent code if necessary, we may assume that x i =o for l = i + 1, ... , n . Let v = {VI , . . . , Vj } be the support ofy, withVI < .. . < Vj, and let s in O..j be the greatest subindex such that Vs ~ i(with the convention s = 0 if i < VI). Ifwe restrict the sum in the statementto the set Wj(v) formed with the vectors ofweight j whose support is v, wehave (with T' = T - {O})

L w (xl Y) = L ... L w XvIYvl+,,-+x VjYVj

YEWj(V) YVI ET' Yvj ET'

= (q - l)j- s L ... L WXvlYvl + " -+xvsYvs

YVl ET ' s», ET '

s

= (q - l )j-sII L wXveY

£= 1 yE T'

= (_ l)S(q _ l)j -s .

For the last equality note that the map Xa : T -t C such that y I--t way,

where a E T is fixed , is a character of the additive group of T , and hence ,for a =I- 0, 2:y ET X a (Y ) = 0 and 2:y ET ' X a (Y ) = -Xa(O) = - 1 (cf. thebeginning of the proof ofTheorem 7). Now the result follows , because thereare e)(';~;) ways to choose t/ , 0

Delsarte's theorem

With the notations above, notice that if aD, ... , an is the weight distributionof a linear code, then 2:7=0 aiKj (i) ~ 0 (since bj ~ 0). The surprising factis that if C is an unrestricted code and aD, . .. , an is its inner distribution,then the inequality is still true:

1.75 Theorem (Delsarte, 1973). Aq(n ,d) is bounded above by the maxi­mum of

subjet to the constraints

ai~ O (d ~ i ~n)

92

and

Chapter 1. Block Error-correcting Codes

(~) + t aiKj (i ) ~ O for jE{O,l, .. . ,n},J i=d

where Kj = Kj(x, n, q), j = 0, . .. . n; are the Krawtchouk polynomials.

Proof We only have to prove the inequalities in the last display. Let D; ~C X C be the set of pairs (x, y) such that hd(x ,y) = i . From the definitionof the ai, and using Delsarte's lemma, we see that

n n

MLaiKj(i) = L L L w(x-ylz)

i=O i= O (x ,y)EDi zEWj

= L ILw(x1z) 12 ,

ZEWj xEC

which is clearly non-negative. o1.76 Remark. In the case of binary codes and d even, we may assume thata; = 0 for odd i . Indeed, from a code (n, M, d) with d even we can get acode (n - 1, M, d - 1) (as in E.1.l2) and the parity extension of the latter isa code (n, M , d) which does not contain vectors of odd weight. The listingLinear programming bound contains an implementation of the linear pro­gramming bound for the binary case, basically following [11], Ch. 17, §4.Notice that in the case the constrained maximum is odd, then the functionLP goes through a second linear programming computation that decreasesthe linear terms (j) of the relations. For this clever optimization (due toBest, Brouwer, MacWilliams, Odlyzko and Sloane, 1977), see [11], p. 541.

1.77 Remark. The expression a$$i in the listing Linear programming up­per bound constructs an identifier by appending to the identifier a the integeri regarded as a character string. For example, a$$3 is the identifier a3.

1.78 Examples. So we see that A2(13, 5) = A2(14,6) :s; 64. Since thereis a code Y such that IYI = 64, as we will see in Chapter 3, it follows thatA2(13,5) = A2(14,6) = 64.

The third and fourth expressions in the listing Linear programmingbound show that we can pass a relation list as a third argument to LP. Suchlists are meant to include any further knowledge we may gain on the numbersai, for particular values ofnand d, beyond the generic constraints introducedin Theorem 1.75.

In the case A2(12, 5) = A2(13 ,6) , it can be easily seen that a12 :s; 1,cio :s; 4, with alO = 0 when a12 = 1, and so we have an additional relationa12 + 4alO :s; 4. With this we find that the linear programming bound gives

104. Parameter bounds

AiLibrarvT Linear oroorarnmino UDDer bound

I LP(n,d) := LP(n,d,O)LP(n,d,R'):=begin

local K, A, a, a, R, m, b, pif odd?(d) then n=n+1; d=d+1 endK= {krawtchouk(x,k,n) with kin 1.. n/2}A=[[evaluate(k,O)]I [evaluate (k,j) with j in d..n..2]with kin K]a=[1] I [a$$i with i in d..n..2]R={<a,a);;!:O with a in A}m=constrained_maximum(sum(a),RIR')

b=Lm1Jif odd?(b) then

P=[1-b-1] I tail(a)R={<p,a);;!:O with a in A}m=constrained_maximum(sum(a),RIR')

end{Lm1J, m2}

end

I LP(13,5) - {64, {a10~14,a12~O,a14~O,a6~42,a8~7}}I LP(14,6) - {64, {a10~14,a12~O,a14~O,a6~42,a8~7}}I LP(13,6,{a10+4 'a12~4}) - {32, {a10~4,a12~O,a6~24,a8~3}}

ILP(9,4,{a8~1, a6+4'a8~12}) - {20 , {a4~ 9: ,a6~ ~16 ,a8~1}}

93

'A

A2(12 , 5) ~ 32. Since we will construct a code (12,32,5) in Chapter 3 (theNadler code) , it turns out that A2(12 , 5) = 32.

Finally we find that A2(8, 3) = A2(9 ,4) ~ 20 using the extra relationsas ~ 1 and a6 + 4as ~ 12. Here as ~ A(9, 4, 8) = A(9, 4,1) = 1 (seeRemark 1.70 and E.l.53), a6 ~ A(9 ,4, 6) = A(9 ,4, 3) ~ 3A(8 ,4, 2) ~ 12,and if as = 1 then it is easy to see that a6 ~ A(8, 3, 5), so that a6 ~

A(8 ,3,5) = A(8 ,4 ,5) = A(8 ,4 ,3) ~ l~A(7,4,2)J ~ l~ l~JJ = 8.Since we know that there are codes (8,20,3) (Example lA, or E.1.46),

we conclude that A2(8 , 3) = 20.

The linear programming bound also gives, without any additional rela­tion, the values 128 and 256 (for d = 5) and 32 (for d = 7) on Table 1.1.

Asymptotic bounds

Let us consider an example first. By the Singleton bound we have k ~

n - d + 1. On dividing by n, we get

k d 1R = - ~ 1- - +-.

n n n

94 Chapter 1. Block Error-correcting Codes

Ifwe now keep 8 = :!. fixed and let n go to infinity, then we see that 0:(8) ~n

1 - 8, where 0:(8) = limsupR is the asymptotic rate corresponding to 8.Thus we see that the Singleton bound gives rise to the bound 0:(8) ~ 1 - 8of the asymptotic rate , and this inequality is called the asymptotic Singletonbound.

In general we may define the asymptotic rate 0:(8), 0 ~ 8 ~ 1, aslimsuPn->oo logq Aq(n, l8nJ)/n and then there will be a bound of 0:(8),given by some function 0:'(8) , corresponding to each of the known boundsof Aq(n, d) . These functions 0:'(8) are called asymptotic bounds . For exam­ple, 0:'(8) = 1 - 8 in the case of the Singleton bound.

Below we explain the more basic asymptotic bounds (for the correspond­ing graphs, see figure 1.2). We also provide details ofhow to compute them.

We will need the q-ary entropy function Hq(x), which is defined as fol­lows (0 ~ x ~ 1):

The importance of this function in this context stems from the fact that foro~ 8 ~ (3, (3 = (q - 1)/ q, we have

[1.9]

(see, for example, [25] (5.1.6), or [20], Cor. A.3.1l). It is immediate tocheck that Hq(x) is strictly increasing and H q(x) ~ x / (3 for 0 ~ x ~ (3,with equal ity if and only if x = 0 or x = (3.

E.1.55 (Gilbert asymptotic bound). Use the formula [1.9] and the Gilbertbound (Theorem 1.17) to prove that

0:(8) ~ 1 - Hq(8) ,

for 0 ~ 8 ~ (3.

1.79 Remark. If 0 ~ 8 ~ (3, the Gilbert asymptotic bound shows that thereexists a sequence of codes [nj , kj , djl, j = 1,2 , .. ., that has the followingproperty: given EO > 0, there exists a positive integer je; such that

for all j > je;. Any family of codes with this property is said to meet theGilbert-Varshamov bound. The family is said to be asymptotically good ifitsatisfies the following weaker property: there exist positive real numbers Rand 8 such that kj/nj ~ Rand dj/nj ~ 8 for all j . A family which is not

1.4. Parameterbounds 95

asymptotically good is said to be asymptotically bad, and in this case eitherkj/nj or dj/nj approaches 0 when j ---> 00.

Long ago it was conjectured that a( 0) would coincide with 1-Hq(0), butthis turned out to be incorrect. Indeed, in 1982 Tsfasman, Vladut and Zinkproved, using modem methods of algebraic geometry, that the bound can beimproved for q ;;:;: 49. For lower values of q, however, and in particular forbinary codes, the Gilbert bound remains the best asymptotic lower bound.

Asymptotic upper bounds

E.1.56 (Hamming asymptotic bound). Use the formula [1.9] and the sphereupper bound (Theorem 1.14) to show that

a(o) ~ 1 - Hq(0/2) ,

for 0 ~ 0 ~ 1.

E.1.57 (Plotkin asymptotic bound). Show that the Plotkin bound (Proposi­tion 1.63) implies that

a( 0) ~ 1 - 0/{3 if 0 ~ 0 ~ (3

a(o) = 0 if {3 ~ 0 ~ 1.

So in what follows we may restrict ourselves to the interval 0 ~ 0 ~ {3.

E.1.58 (Elias asymptotic bound). Use the formula [1.9] and the Elias bound(Proposition 1.68) to prove that

a(o) ~ 1 - Hq ({3 - J (3({3 - 0)) if 0 ~ 0 < (3

a( 0) = 0 if {3 ~ 0 ~ 1.

Note that the second part is known from the Plotkin asymptotic bound.

McEliece et al. asymptotic bound

This result was obtained in 1977 by McEliece, Rodernich, Rumsey andWelsh (MRRW) with a delicate analysis of the asymptotic behaviour of thelinear programming bound for binary codes and it remains the best of theknown asymptotic upper bounds for such codes (cf. [6], chapter 9). Theresult is the following :

a(o) ~ min (1+ g(u2) - g(u2 + 20u + 20)),

O:(u:(1-28

where g(x) = H2((1 - (1 - x)1/2)/2) .

96 Chapter 1. Block Error-correcting Codes

Van Lint asymptotic bound

The authors of the previous bound also showed that for 0.273 < 0 ::;; 0.5the minimum is attained at u = 1 - 20, in which case the bound takes thefollowing form:

This bound can be considered in the interval 0 ::;; 0 ::;; 0.273 as well, but it isslightly above the MRRW bound in this range, and near 0 it even becomesworse than the Hamming bound. In any case, this bound was establisheddirectly by van Lint using fairly elementary methods oflinear programming(see [25] (5.3.6)).

For a further discussion on "perspectives on asymptotics", the reader canconsult Section 4, Lecture 9, of[22].

AI Librarv I The entroov function and the asvmototic bounds IA

entropy (x :IR.q :71)check ocx& x:s:;1 & q~2

:= if x=O then 1 elif x=1 then logq(q-1)else x ·logq(q-1)-x ·logq(x)-(1-x) ·logq(1-x) end

Ientropy(x) := entropytx.Z)Igilbert(x: IR) check O:s:;x & x:s:; 1 := if x:s:; 1/2 then 1-entropy(x) else 0 endI hamming(x :IR) check O:s:;x & x:s:;1 := 1-entropy(x/2)

Ielias(x :IR) checkO:S:;X&X:S:;~:= 1 -entropy((1-~)/2)

Ivanlint(x :IR) check O:s:;x & x:s:;1 := entropyC - ";x · (1-x) )

mceliece(x) check O:s:;x & x:s:;1:=begin

local g, e=0.005. 0=0 .04

(1- y'"1"=t )9 (t) :=entropy 2

infimum(uH1+g(u2)_g(u2+2 ·x ·u+2 ·x). e .. (1-2 ·x-e) .. o· (1-2'x))end

rl entropy(O.2) - 0.72193~ mceliece(0.2) - 0.46137

The results . .. somehow bound the possible values of param­

eters. Still the problem of actually finding out what is possible

and what is not is very difficult

Tsfasman-Vladut, [24], p. 35

104. Parameter bounds 97

0.5

P=Plotkin

oL-----------=::~~-

1 •..~~

"'-,'-.~ Ls-van Lint~ IvlocMcEJiccc

~G=Gilbert

S~

Figure 1.2: Asymptotic bounds

Summary

• Griesmer bound: if [n, k, d] are the parameters of a linear code, then

k-l r 1L ~ ~n.i=O q

This gives an upper bound for Bq(n, d), the highest cardinal qk for alinear code [n, k, d]q.

• Plotkin bound: for any code C of type (n, M, d), d ~ f3nM ,whichM-l

is equivalent to M (d - f3n) ~ d. As a consequence, if d > f3n, then

Aq (n, d) ~ ld _df3n J. For binary codes this can be improved to

A2(n, d) ~ 2ld/(2d - n)J.

• Computations and examples related to the Elias, Johnson and linearprogramming upper bounds.

• Computations related to the asymptotic bounds.

98

Problems

Chapter I. Block Error-correcting Codes

P.1.22. Let C ~ JF~ be a linear code tn, k,d] and assume k is maximal (fornand d fixed). Show that ICI ~ qn/volq(n , d - 1), which proves that theGilbert lower bound can always be achieved with linear codes when q is aprimer power. Show also that the Gilbert-Varshamov lower bound is neverworse than the Gilbert bound. Hint : Ifit were ICI < qn/volq(n, d - 1), thenthere would exist y E ~ such that hd (y, x) ~ d for all x E C, and the linearcode C + (y) would have type tn,k + 1, d].

P.1.23 . Show that there does not exist a binary linear code [14,4, 5h (34 =81 is the value of the Griesmer bound for n = 14 and d = 5). Hint: Formingthe residue with respect to a word ofweight 9, and using the Singleton bound,we would have a [5,3 , 3h code, and this cannot exist as can be easily seenby studying a generating matrix ofthe form hiP, where P is a ternary 3 x 2matrix.

P.1.24 (Johnson, 1962). Prove that

A(n, 2h, w) ~ lh h~ )J 'n-wn-w

provided hn > w(n - w) . Hint : First notice that if x, y E Z~ satisfyIxl = Iyl = wand hd(x, y) ~ 2h then Ix·YI ~ w-h; next, given a code C rv

(n , M, 2h, w) with M = A(n, 2h, w), bound above by (w - h)M(M - 1)the sum 8 of all Ix .x ' i for x , x' E C, x f= x'; finally note that if mi is thenumber of Is in position i for all words ofC, then 8 = L~l m; - wM and8 ~ wM(wM/n - 1).

P.1.25 (The Best code, 1978). We know that A2(8, 3) = 20, hence A 2(9 ,3) ~2A2(8 ,3) = 40. In what follows we will construct a code (10,40,4) . Con-

sider the matrix G = MIM, where M = 131RT and R = (~ ~ ~) . Let

C' = (G )U{1000000100}. Ifwe let 0- = (1,2,3,4,5)(6,7,8,9,10) E 8 10 ,4

let C, = o-i(C') , i = 0, 1,2,3,4, and set C = U Ci. Show that C has typei=O

(10,40,4).

P.1.26. Write a function that returns the A-shortening C' of a code C withrespect to a position. If C is given as a list, C' should be given as a list. If Cis linear with control matrix H (generating matrix G), C' should be given asa list in the case A f= °and by a generating matrix if A = o.P.1.27. Find the linear programming bounds of A 2 (n,3) for n = 9,10,11and compare the results with the corresponding values in Table 1.1.

1.4. Parameter bounds 99

1

0.5

M······McElicceG=Gilbcrt

oL------------===="""--

Figure 1.3: Asymptotic bounds: MBBW and Elias

1

MMcElicccGc=Gilbcrt

G

o 0.5

Figure 1.4: Asymptotic bounds: MBBW and van Lint

2 Finite Fields

During the last two decades more and more abstract algebraic

tools such as the theory of finite fields and the theory of poly­

nomials over finite fields have influenced coding.

R. Lidl and H. Niederre iter, [10], p. 305.

This chapter is devoted to the presentation of some of the basic ideas and re­sults ofthe theory ofjinitejields that are used in the theory oferror-correctingcodes . It is a self-contained exposition, up to a few elementary ideas on ringsand polynomials (for the convenience of the reader, the latter are summarizedbelow) . On the computational side, we include a good deal ofdetails and ex­amples ofhow finite fields and related objects can be constructed and used.

Notations and conventions

A ring is an abelian group (A , +) endowed with a product A x A ---t A,(x, y) f--' X • y, that is associative, distributive with respect to the sum onboth sides, and with a unit element lA which we always assume to be dif­ferent from the zero element OA of (A, +) . If x . y = y . x for all x, yEAwe say that the ring is commutative. Ajield is a commutative ring for whichall nonzero elements have a multiplicative inverse. Examples: Z (the ring ofintegers) ; (11, JR ,C (the fields of rational , real and complex numbers); if A isa commutative ring , the polynomials in one variable with coefficients in Aform a commutative ring, denoted A[X], with the usual addition and multi ­plication ofpolynomials, and the square matrices oforder n with coefficientsin A form a ring, denoted Mn(A), with the usual addition and multiplicationof matrices, and which is non-commutative for n > 1.

An additive subgroup B of a ring A is said to be a subring if it is closedwith respect to the product and lA E B. If A is a field, a subring B is saidto be a subjield if b-1 E B for any nonzero element b of B. For example,Z is a subring of (11; Q is a subfield of JR; if A is a commutative ring, A is

102 Chapter2. FiniteFields

a subring of the polynomial ring A [X] ; if B is a subring of a commutativering A, then B[X] is a subring of A[X] and Mn(B) is a subring of Mn(A) .

An additive subgroup I of a commutative ring A is said to be an idealif a . x E I for all a E A and x E I. For example, if a E A, then theset (a) = {b . a Ib E A} is an ideal of A, and it is called the principal idealgenerated by a. In general there are ideals that are not principal, but there arerings (called principal rings) in which any ideal is principal, as in the case ofthe ring of integers Z. If I and J are ideals of A, then I n J is an ideal of A.The set 1+ J of the sums x + y formed with x E I and y E J is also an idealof A. For example, if m and n are integers, then (m) n (n) = (l) , wherel = Icm(m,n), and (m) + (n) = (d), where d = gcd(m, n). A field K onlyhas two ideals, {a} and K, and conversely, any ring with this property is afield.

Let A and A' be rings and f : A ---> A' a group homomorphism. We saythat f is a ring homomorphism if f(x · y) = f( x) . f(y) for all x, yEA andf(IA) = IA' . For example, if A is a commutative ring and I is an ideal of A,I 1= A, then the quotient group AII has a unique ring structure such that thenatural quotient map A ---> AII is a homomorphism. The kernel ker (I) of aring homomorphism f : A ---> A' is defined as {a E A I f (a) = aAt} and it isan ideal ofA. One ofthe fundamental results on ring homomorphisms is thatif I is an ideal of A and I ~ ker (I), then there is a unique homomorphismJ:AII ---> A' such that f(-lr(a)) = f(a) for all a E A, where 11" : A ---> AIIis the quotient homomorphism. The homomorphism J is injective ifand onlyif I = ker (I) and in this case, therefore, J induces an isomorphism betweenAlker (I) and f(A) , the image of A by f . Note that any homomorphism ofa field K into a ring must be injective, for the kernel of this homomorphismis an ideal ofK that does not conatain I K and hence it must be the ideal {a}.

If f : A ---> A' is a ring homomorphism, and B is a subring of both Aand A', then we say that f is a B-homomorphism if f(b) = b for all s « B.We also say that A and A' are extensions of B, and write AlB and A'lBto denote this fact, and a B -homomorphism from A to A' is also called ahomomorphism of the extension AlB to the extension A'lB. The homo­morphisms AlB ---> AlB that are bijective are said to be automorphismsof the extension AlB. The set of automorphisms of an extension AlB is agroup with the composition ofmaps.

If we copy the definition of K -vector space using a commutative ring Ainstead of a field K , we get the notion of A-module. For example, just asK[X] is a K-vector space, A[X] is and A-module. An A-module M isfreeif it has a basis, that is, if there exists a subset of M such that any elementof M can be expressed in a unique way as a linear combination of elementsof that subset with coefficients of A . For example, A[X] is a free A-module,because {I, X, X 2 , . . .} is a basis.

103

2.1 z, and IFp

The real world is full of nasty numbers like 0.79134989 ...• while

the world of computers and digital communication deals with

nice numbers like 0 and 1.

Conway-Sloane, [6], p. 56.

Essential points

• Construction ofthe ring Zn and computations with its ring operations.

• Zp is a field if and only if p is prime (Zp is also denoted IFp).

• The group Z~ of invertible elements of Zn and Euler's function cp(n),which gives the cardinal of Z~.

Let n be a positive integer. Let Zn denote Z/ (n) , the ring of integers modulon . The elements of this ring are usually represented by the integers a suchthat °:;;; a :;;; n - 1. In this case the natural quotient map Z ~ Zn is givenby x 1---+ [x]n, where [x]n is the remainder of the Euclidean division of x byn . This remainder is also denoted x mod n and it is called the reduction ofx modulo n. The equality [x]n = [Y]n is also written x == Y (mod n) and itis equivalent to say that Y - x is divisible by n . The operations of sum andproduct in Zn are the ordinary operations of sum and product of integers, butreduced modulo n .

For example, ifn = 11 we have 3+7 == 10,3 ·7 == 10,52 == 3 (mod 11) .See Examples of ring and field computations. For the notations see A.6and A.S (p. 223). The function geometric_series(a.r) returns the vector[1,a, . . . ,ar - 1] .

The decision of whether an integer x does or does not have an inversemodulo n, and its computation in case such an inverse exists, can be doneefficiently with a suitable variation of the Euclidean division algorithm. Thisalgorithm is implemented in the woos function bezout(m,n) , which returns ,given integers m and n, a vector [d, a, b] of integers such that d = gcd (m ,n)and d = am + bn , In case d = 1, a is the inverse of m (mod n), andthis integer is the value returned by inverse(m,n) . The function inverse(m),where m E Zn, returns the same value, but considered as an element of Zn,rather than as an integer. See More examples on ring computations. Thefunction invertible?(a) returns true or false according to whether a is or is not

2.1. z; and Fp

104 Chapter 2. Finite Fields

Examples of ring and field computationsIA=7135 .... Z35

143 mod 35 .... 8I a=43 :A .... 8Igeometric_series(a,12) .... [1,8,29,22,1,8,29,22,1,8,29,22]

Ia3+5·a7 27Ix=25 :7117 8

IY=~ 15

Ix·Y 1Igeometric_series(x ,17) .... [1,8,13,2,16,9,4,15,1 ,8,13,2,16,9,4,15,1]

an invertible element (in the ring to which it belongs). Note that if we askfor Y]« and a is not invertible, than the expression is returned unevaluated(as 1/(7:A)is the listing).

More examples on ring computationsIA=7135; a=8 :A;

I invertible?(a) .... trueI inverse(a) .... 22I1/a .... 22I inverse(8,35) 22I invertible?(7 :A) false11/(7 :A)I n=2345; m=1234;I [d,a,b] = bezout(n,m) .... [1,311 , -591]I a -n--b-m .... 1

The invertible elements of Zn form a group, Z~. Since an [x]n E Zn isinvertible if and only if gcd (x, n) = 1, the cardinal ofZ~ is given by Euler'sfunction cp(n), as this function counts, by definition, the number of integersx in the range 1..(n - 1) such that gcd (x, n) = 1. In particular we see thatZn is a field if and only if n is prime. For example, in Zw the invertibleelements are {I , 3, 7, 9} (hence cp(10) = 4) and 3-1 == 7 (mod 10), 7-1 ==3 (mod 10), 9-1 == 9 (mod 10). On the other hand, in Zll all non-zeroelements are invertible (hence cp(ll) = 10) and Zll is a field.

If p > 0 is a prime integer, the field Zp is also denoted IFp '

2.1 Proposition. The value ofcp(n) is determined by the following rules:

1) Ifgcd(n1,n2) = 1, thencp(n1n2) = cp(ndcp(n2)'

2) Ifp is a prim e number and r a positive integer, then

105

Proof The map tt : Z ---. Znl X Zn2 such that k 1-+ ([k] nl' [k]n2) is a ringhomomorphism and its kernel is (nl)n(n2) = (m), where m = Icm(nl' n2).Since nl and n2 are coprime, we have m = nln2 and hence there exists aninjective homomorphism Zj(m) = Znln2 ---. Z n l X Zn2' But this homo­morphism must be an isomorphism, as the cardinals of Z nl n2 and Z nl X Zn2are both equal to nln2 .

For part 2, observe that the elements which are not invertible in Zpr are

O,p, 2p, ..., (pr-l - l)p,

hence cp(pr) = pr _ pr-l = pr-l(p - 1). 0

For example, cp(10) = cp(2 . 5) = cp(2)cp(5) = (2 - 1)(4 - 1) = 4, inagreement with what we already checked above, and cp(72) = cp(8 . 9) =cp(8)cp(9) = (8 - 4)(9 - 3) = 24. In WIRIS the function cp(n) is given byphLeuler(n).

I q>=phi_euler;Iq>(105) .... 48Iq>(123456789) .... 822600721q>(1234567890123456789) .... 814197031044480000

2.2 Remark. Since cp(p) = p - 1 for p prime, and since there are infinitelymany primes , the quotient cp(n )j n can take values as close to I as wanted.On the other hand, cp(n)jn can be as close to 0 as wanted (see P.2.I). For acomputational illustration, see Small values of ij/(n)/n .

"'lLibrarvT Small values ofm7riHn

The primes in 2..NIP(N) := {n in 2..N such_that prime?(n)}

m(N)=q>(PN)fPN' PN the product of the primes in 2..N

Im(N) := Il P~1 .Float

p in P(N)

~m(10) 0.22857

Im(100) 0.12032Im(1000) 0.080965

,...

2.3 Proposition. For each positive integer n, 2:dlncp(d) = n (the sum isextended to all positive divisors of n).

Proof It is enough to notice that the denominators of the n fractions ob­tained by reducing lin, 2/n, ... ,n j n to irreducible terms are precisely thedivisors d of n and that the denominator d appears cp(d) times. 0

2.1. z; and Fp

106 Chapter 2. Finite Fields

I cp=phLeuler;I n=9 ·5 ·7;ID=divisors(n) .... {1, 3, 9, 5,15,45,7,21,63,35,105, 315}

Id~D cp(d) .... 315

E.2.1. Show that cp(n) is even for all n > 2. In particular we have thatcp(n) = 1 only happens for n = 1,2.

E.2.2. Show that {n E Z+ Icp(n) = 2} = {3, 4, 6} and {n E Z+ Icp(n) =4} = {5, 8, 10, 12}.

E.2.3. The groups Zs, Zg, Zio and Zh have order 4. Determine which ofthese are cyclic and which are not (recall that a group of order four is eithercyclic, in which case it is isomorphic to Z4, or isomorphic to Z§).

E.2.4. Given a finite group G with identity e, the order of an element a isthe smallest positive integer r such that aT E {e = aO , . . . ,aT-I}. Showthat actually aT = e and that r divides IGI. Hint: For the latter use thatH = {e = aO , . . . ,aT-I} is a subgroup of G and remember that the cardinalof any subgroup divides the cardinal of G (Lagrange theorem).

E.2.5. Fix a positive integer n and let a be any integer such that gcd (a , n) =1. Show that a'P(n) == 1 (mod n) .

E.2.6. LetFbe a finite field and q = IFI. Show that x q- 1 = 1 for all x E F*and that xq - x = 0 for all x E F.

Summary

• Integer arithmetic modulo n (the ring Z/(n), which we denote Zn).The ring Zn is a field if and only if n is prime and in this case is alsodenoted Fn .

• The group Z~ of invertible elements ofZn . Its cardinal is given by Eu­ler's funct ion cp(n), whose main properties are given in Propositions2.1 and 2.3.

• Computation of the inverse of an element in Z~ by means of Bezout'stheorem.

107

Problems

P.2.I. For k = 1,2,3, ..., let Pk denote the k-th prime number and let

Prove that the minimum of<p(n) j n in the range (Pk) . . (Pk+1 -1) is Nkj Pk.Deduce from this that in the range 1..(2 x 1011) we have <p(n) > 0.I579n.

P.2.2. Prove that lim inf.,.....oo <p(n)jn = O.

P.2.3. Show that the values ofn for which Z~ has 6 elements are 7,9,14,18.Since any abelian group of order 6 is isomorphic to Z6, the groups Z7' Z9'Zi4' Zis are isomorphic . Find an isomorphism between Z7 and Z9'

P.2.4. Show that the values of n for which Z~ has 8 elements are 15, 16, 20,24, 30. Now any abelian group of order 8 is isomorphic to Zs, Z2 X Z4 orZ~ . Show that Z24 ~ Z~ and that the other four groups are isomorphic toZ2 x Z4.

2.1. z; and Fp

108 Chapter 2. Finite Fields

2.2 Construction of finite fields

The theory of finite fields is a branch of modern algebra that has

come to the fore in the last 50 years because of its diverse ap­

plications in combinatorics, coding theory, cryptology, and the

mathematical theory of switching circuits, among others .

R. Lidl and H. Niederreiter, [10], p. ix.

Essential points

• Characteristic of a finite field.

• The cardinal of a finite field is a power of the characteristic.

• Frobenius automorphism (absolute and relative) .

• Construction of a field of qT elements if we know a field ofq elementsand an irreducible polynomial of degree r over it (basic construction).

• Splitting field of a polynomial with coefficients in a finite field andconstruction of a field of pT elements for any prime number p and anypositive integer r .

• Existence of irreducible polynomials over 'Zp (or over any given finitefield) of any degree . This guarantees the construction of a field ofpT elements that is more explicit and effective than using the splittingfield method.

Ruffini's rule

Let us begin with two elementary propositions and some examples. Recallthat if K is a field, X an indeterminate and 1 E K[X] a polynomial ofdegree r > 0, then 1 is said to be irreducible if it cannot be expressed as aproduct 1 = gh with g,hE K[X] and deg (g),deg (h) < r (otherwise it issaid to be reducible).

2.4 Proposition (Ruffini's rule). Let K be a fi eld. Given a polynomial 1 EK[X], an element 0: E K is a root oj1 (that is, 1(0:) = 0) if and only if1 isdivisible by X - 0:. In particular we see that if1 E K[X] is irreducible anddeg (j) > I , then 1 has no roots in K .

2.2. Construction of finite fields 109

Proof For any 0: E K, the Euclidean division of f E K[X] by X - 0: tellsusthatthereisg E K[X] andc E Ksuchthatf = (X -o:)g+c. ReplacingX by 0: yields c = f (0:). Thus we see that 0: is a root of f if and only ifc = 0, that is, if and only if f is divisible by (X - 0:) .

If f is irreducible and deg (J) > 1, then it cannot have a root in K, forotherwise it would have the form (X - o:)g with 0: E K and g E K[X] ofpositive degree , and this would contradict the irreducibility of f. 0

Thus the condition that f E K[X] (K a field, deg (J) > 1) has no rootin K is necessary for f to be irreducible, but in general it is not sufficient.We will see, however, that it is sufficient when the degree of f is 2 or 3.

2.5 Proposition. Let f E K[X] be a polynomial ofdegree 2 or 3. Iff isreducible, then f has a root in K. Thus f is irreducible if (and only if) it hasno root in K .

Proof A reducible polynomial of degree 2 is the product of two linear fac­tors, and a reducible polynomial of degree 3 is the product of a linear and aquadratic factor. In both cases f has a linear factor. But ifaX +b E K[X] isa linear factor of f (this means a =1= 0), then so is X - 0:, where 0: = -bfa,and hence 0: is a root of f . 0

2.6 Examples. If a E K, the polynomial X 2 - a is irreducible over K if andonly ifa is not the square ofan element of K. For example, the squares in 'Zqare 0, 1,4,2. Therefore the polynomials X 2 - 3 = X 2 +4, X 2 - 5 = X 2 + 2and X 2 - 6 = X 2 + 1 are irreducible over Z7. Analogously, the polynomialX3 - a is irreducible over K if and only if a is not a cube in K. Since thecubes in Z7 are 0, 1,6, the polynomials X3 - 2 = X3 +5, X 3- 3 = X 3+4,X 3 - 4 = X 3 + 3 and X 3 - 5 = X 3 + 2 are irreducible over Z7.

E. 2.7. Show that in a finite field of characteristic 2 every element is a square(recall that in a field of odd characteristic, as established in Proposition 1.49,half of the nonzero elements are squares and the other half are not). Showalso that every polynomial of the form X 2 + a is itself a square, hence re­ducible . Hint: If F is a field of characteristic 2, then the map F -i F suchthat x f---> x 2 is a homomorphism.

E. 2.8. Let K be a field ofodd characteristic and f = X 2 +aX +b E K [X].Prove that f is irreducible ifand only if the discriminant of f, namely a2 - 4b,is not a square in K .

Characteristic ofa finite field

We will use the characteristic homomorphism <PA : Z -i A of any ring A,that is, the map such that <PA(n) = n· 1A for all n E Z. Note that <PA is

110 Chapter 2. Finite Fields

the only ring homomorphism of Z to A. Moreover, the image of cPA is thesmallest subring of A, in the sense that it is contained in any subring of A,and it is called the prime ring of A.

The kernel of cPA is an ideal of Z and hence there is a uniquely deter­mined non-negative integer nA such that ker (cPA) = (nA). We say that nAis the characteristic of A. For example, the characteristic of any domain is°(like Z, (2), lRand C), while Zn has characteristic n for any integer n > 1.

I characteristic(71) 0

Icharacteristic (717) 7

Icharacteristic(71a) 8

Icardinal(Q) .... +00

Icardinal (7117 ) .... 17

In the case of rings of characteristic 0, cPA is injective and hence an iso­morphism of Z with the prime ring of A. Thus we may identify Z with theprime ring of A (the integer n is identified with n . lA E A). For exam­ple, if A = Mr(Q) (the ring of rational matrices of order r), then A hascharacteristic °and n E Z is identified with the diagonal matrix til» ,

In the case when nA > 0, then nA > 1 (for lA is assumed to be differentfrom OA) and by definition we have that the relation n· lA = OA, nEZ, isequivalent to nAln. Note also that in this case we have a unique homomor­phism (necessarily injective) ZnA C-+ A whose image is the prime ring of A,and so we can identify ZnA with the prime ring of A.

2.7 Proposition. The characteristic n A ofa ring A has the following prop­erties :

1) IfA isfinite, then nA > O.

2) IfA is a domain, then either nA = 0 or nA is a prime number, and itis a prime number ifA is finite .

3) The characteristic ofafinitefield F is a prime number p, and so in thiscase there is a unique homomorphism (necessarily injective) Zp C-+ Fwhose image is the prime ring ofF (in this case a subfield).

Proof If A is finite, then cPA (Z) is finite. Since Z is infinite , cPA cannot beinjective. Hence ker (cPA) =1= 0 and therefore nA =1= O.

If A is a domain and n = nA > 0, then Zn is also a domain, because itis isomorphic to a subring of A. But Zn is not a domain unless n is prime(and if n is prime then we know that Zn is a field) .

Since a finite field F is a finite domain, its characteristic p = nF isa prime number. The other statements are clear by what we have said somL 0

2.2. Construction of finite fields 111

2.8 Proposition. IfF is afinitefield, then its cardinal has theform pT, wherep is a prime number and r is a po sitive integer. Moreover, p coincides withthe characteristic ofF .

Proof If p is the characteristic of F, then p is a prime number and F has asubfield isomorphic to 7lp • Thus we can regard F as 7lp-vector space. SinceF is finite, the dimension of F over 7lp , say r, is also finite and we haveF ~ 7l~ as 7lp -vector spaces. Hence IFI = 171~1 = p", 0

E.2.9. Let L be a finite field and K a subfield of L. If IKI = q, show thatILl has the form qT for some positive integer r , Deduce from this that thecardinal of any subfield F of L such that K ~ F has the form s' , with sir.What does this say when K = 7lp , p the characteristic of L?

E.2.10 (Absolute Frobenius automorphism). Show that in a field K of char­acteristic p the map

F : K ~ K such that x ~ xp

satisfies(x + y )p = xp + yP and (xy)P = xPyP

for all x, y E K and)',P = >. for all >. E 7lp

(for this relation use E.2.6). If K is finite, then F is an automorphism ofK / 71p (it is called the Frobenius automorphism of K) .

E.2.1l (Relative Frobenius automorphism). Let L be a field of characteristicp and K ~ L a finite subfield. If q = IKI, show that the map

FK : L ~ L such that x ~ x q

satisfies(x + y)q = xq + yq and (xy)q = xqyq

for all x , y ELand>.q = >. for all >. E K

(for this relation use E.2.6). If L is finite, then FK is an automorphism ofL/K (it is called the Frobenius automorphism of L relative to K). 0

If K has cardinal q, the function frobenius(K) yields the map x ~ x q ,

that is, the Frobenius map relative to K . When applied to an element x ,frobenius(x) is equivalent to the Frobenius map relative to the prime field ofthe field of x. For examples, see the listing Frobenius automorphisms.

112

Frobenius automorphisms

IK=712[x]/(x2+x+1); L=K[y]/(y2+x 'y+1);

f is the Frobenius relative to the prime fieldI f=frobenius -+ frobeniusCK is the Frobenius relative to K

ICK=frobenius (K) -+ xHx4I f(x) -+ x+1If(y) -+ x 'y+1ICK(x) -+ xICK(y) -+ y+x

The basic construction

Chapter 2. Finite Fields

We will use the general ideas from polynomial algebra summarized in thepoints 1-5 below. IfK is any field and f E K[X] is any polynomial over Kof degree r > 0, then the quotient ring K' = K[X]/(f) has the followingproperties:

1) The natural homomorphism K ---. K' is injective and so we may iden­tify K with a subfield ofK'.

2) If x = [X] is the class of X modulo I, then

{I, x, ... ,xr - I }

is a basis of K' considered as a vector space over K . This meansthat every element of K' can be uniquely expressed in the form ao +aIX + ... + ar_IXr- l , ao, ... ,ar-I E K . So the elements of K' canbe represented as polynomials in x of degree < r with coefficients inK or, in other words, K' can be identified with the vector space K[x]rof polynomials in x of degree < r with coefficients in K .

3) In terms ofthe representation in the previous paragraph, the sum oftwoelements ofK' agrees with the sum ofthe corresponding polynomials,and its product can be obtained by reducing modulo f the ordinaryproduct of polynomials. This reduction can be done as follows. Iff = ao + alX + . .. + ar_Ixr-1 + X r, then we have xr = -(ao +alx + ...+ ar_IXr- I), and this relation can be used recursively toexpress xi, for all j ~ r, as a linear combination of 1, x , ... , xr- I.

4) The ring K' is afield ifand only iff is irreducible over K. In this case,the inverse ofa = aO+aIx+" ·+ar_IXr-1 can be obtained, as in thecase of modular arithmetic, with a suitable variation of the Euclideandivision algorithm applied to f and a . The field K' is usually denotedK (x), because it coincides with the minimum field that contains Kand x, but note that it coincides with K[x], the ring of polynomialexpressions in x with coefficients in K .

2.2. Construction of finite fields 113

5) If K is finite and q = IKI, then IK'I = qT, where r is the degree of I.

If p is a prime number and we apply 5 to K = 'Lp, we see that if we knowan irreducible polynomial I E 'Lp[X], then we also have a finite field K' ofcardinal pT. For example, I = x 3 - X +1 is an irreducible polynomial over'L3 (use Proposition 2.5) and so K' = 'L3[X]!(I) is a field and if x is theclass of X mod I then the 27 elements of K' have the form a + bx + cx 2 ,

a, b,c E 'L3. At the end of this section we include several other examples.For the computational implementation ofthe basic construction, see A.7,

and for sample computations related to the preceeding example, see the list­ing The function extension(K,f). Note that extension(K,f) is equivalent tothe expression K[x]/(f), where x is the variable ofthe polynomial f . The nameof this object is still K[x], but the x here is now bound to the class [x] moduloI of the initial variable x of I .

The function extension(K,f)Ilet K=713;

IF=extension(713, x3+2 x+1) .... K[x]

I characteristic(F) 3Iq=card inal(F) 27

Igeometric_series(2x+1,4) .... [1, 2'x+1 , x2+x+1, 2 'x+2]

IL= {xi-.+j with j in O..(q-1)} ;

Ix11, L(x2+x+2) .... x2+x+2,11

IF=713 [t] / (t3+2 ·t+1) .... K[t]

Icardinal(F) .... 27

Igeometric_series(2t+1,4) .... [1, 2 ·t+1, t2+t+1, 2 ·t+2]

2.9 Remark. In the points 1-3 above, K can be replaced by any commuta­tive ring and I by a monic polynomial (a polynomial I is monic if its thecoefficient of its highest degree monomial is 1). Then K' is a ring, and afreemodule over K with basis {I, x, . . . , xT

-1 } . In other words , assuming that

the ring K has already been constructed, that x is a free identifier (this canbe achieved , if in doubt, with the sentence clear x) and that I (x) is a monicpolynomial over K, then K' can be constructed as shown in The functionextension(K,f) also works for rings.

We end this subsection with a result that establishes a useful relation betweenthe field K' = K[X]/(I), IE K[X] irreducible, and any root of I .

2.10 Proposition. Let L be afield and K a subfield. Let I E K[X] be amonic irreducible polynomial, and set r = deg (I), K' = K[X]/ (I). Iff3 E L is a root of I, and x is the class ofX modulo I, then there exists a

114

The function extension(K,f) also works for ringsIlet A=7112 ~ A

IB=extension(A, x3+x+5) ~ A[x]

Ib=x10+X3 ~ 10'x2+2'x+10

Ib+b4 ~ 6 'x2+6 'x+6

I x-1 ~ 7 'x2+7

IB=A[t]f(t3+t+5) ~ A[t]

11ft ~ 7·t2+7

Chapter 2. Finite Fields

unique K -homomorphism K' -+ K[,6] such that x I---t ,6 (K-homomorphismmeans that it is the identity on K). This homomorphism is necessarily anisomorphism. Hence K[,6] is afield and therefore K[,6] = K(,6).

Proof Since the image of ao + alX + ... + ar_Ixr-1 E K' by a K­homomorphism such that x I---t ,6 is ao +al,6 +...+a r_l,6r-l, uniquenessis clear. For the existence, consider the map K[X] -+ L such that h I---t

h(,6) . It is a ring homomorphism with image K[,6] and its kernel containsthe principal ideal (I), because f( ,6) = 0 by hypothesis. Hence there is aninduced homomorphism K' -+ L such that

r-l a + ar-lao + alX + ...+ ar-Ix I---t ao + ali"' +... ar-If-' .

Since K' is a field, this homomorphism is necessarily injective, hence anisomorphism between K' and the image K[,6]. 0

E.2.12. Show that the kernel of the ring homomorphism h I---t h(,6) definedin the preceeding proposition coincides with the principal ideal (I).

2.11 Example. The quadratic polynomial X 2+ X + 1 has no roots in /Z2,hence it is irreducible. The field /Z2(X) = /Z2[X]/(X 2+ X + 1), where wewrite x for the class of X, has 4 elements : 0,1 , x , x + 1. The multiplicationtable is easy to find using the basic relation x 2 = x+1: we have x(x+1) = 1and (x + 1)2 = x . Note that {xO, xt,x2} = {1,x,x + I}, so that themultiplicative group of /Z2 (x) is cyclic .

Construction of F4

I c1ear(x);

If=x2+x+1 ~ x2+x+1

Ilet F4= 7l2[xjf(f) ~ F4

Igeometric_series(x,3) ~ [1, x, x+1j

2.12 Example. The polynomial T 3+T + 1 E /Z2 [T] is irreducible, becauseit has degree 3 and does not vanish for T = 0 or for T = 1. Hence /Z2(t) =

2.2. Construction of finite fields 115

Z2[T]/ (T 3 +T + 1) is a field of 8 elements, where we let t denote the classofT. The elements of Z2(t) have the form a + bt + ct 2

, with a, b, C E Z2,and the multiplication table is determined by the basic relation t 3 = t + 1.Note that t4 = (t + l )t = t2+ t . Since the cardinal of the multiplicativegroup of this field has 7 elements, it must be cyclic. In fact, as the followinglisting shows, it is a cyclic group generated by t.

Construction of Fe

I c1ear(x) ;

If=x3+x+1 ~ x3+x+1

Ilet F8= 7l2 [x] / (f) ~ F8

Igeometric_series(x,7) ~ [1, X, x2, x+1, x2+x, x2+x+1, x2+1]

2.13 Example. Let K = Z2(X) be the field introduced in Example 2.11.Consider the polynomial f = y 2 + xY + 1 E K[Y]. Since f (a) =I- 0 forall a E K, f is irreducible over K (see 2.5). Therefore K (y) = K [Y]/(J)is a field of 42 = 16 elements. For later examples, this field will be denotedZ2(X, y ). Its elements have the form a + by , with a, b E K , and its multi­plication table is determined by the multiplication table of K and the basicrelation y2 = x y + 1. The listing below shows that the multiplicative groupof this field is also cyclic : y + 1 is a generator, but note that y is not.

Construction of F16 in two steps

If=x2+x+1 ~ x2+x+1

Ilet F4=712[x] I (f) ~ F4

Ig=y2+x 'y+1 ~ y2+x·y+1IletF16=F4[y]/(g) ~ F16

Igeometric_series(y,15)~ [1, y, x'y+1 , x-y-sx , Y+x, 1, y, x'y+1, x-y-sx , Y+x, 1, y, x 'y+1, x -y-rx, y+x]

Igeometric_series(y+ 1,10)~ [1, y+1 , x-v, Y+x, y+(x+1) , X, x-y-rx, (x+1) 'y, x 'y+(x+1), x'y+1]

2.14 Example. Since - 1 is not a square in Z3, Z3(X) = Z3[X]/(X 2 + 1)is a field of 9 elements , where x denotes the class of X. Its elements havethe form a + bx, with a, b E Z3 and the basic relation for the multiplicationtable is x 2 = 2. The listing below shows that the multiplicative group of thisfield is also cyclic : x + 1 is a generator, but note that x is not.

2.15 Example. The polynomial f = X 3 + 5 is irreducible over Z7. There­fore Z7(X) = Z7[X]/(J) is a field of 73 = 343 elements, where x denotesthe class of X . The elements of Z7(X) have the form a + bx + cx2

, with

116 Chapter 2. Finite Fields

Construction of Fg

If=x2+1 .... x2+1

Ilet F9=7Z3 [x]1 (f) .... F9

Igeometric_series(x,8) .... [1, x, 2, zx, 1, x, 2, 2'x]Igeometric_series(x+1 ,8) .... [1, x+1, 2'x, 2 'x+1 , 2, 2'x+2, x, x+2]

a, b, C E Z7. The multiplication table ofZ7(x) is determined by the relationsx 3 = 2, which is the form it takes the relation f(x) = 0, and x 4 = 2x .

Construction of F343

Ilet F343=7Z7 [x]1 (x3+5) .... F343

I{element(j,F343) with j in 60..65}.... {x2+x+4, x2+x+5, x2+x+6, x2+2'x, x2+2'x+1, x2+2 'x+2}

E.2.13. Find an irreducible polynomial ofZ3[T ]ofthe form T 3+a, and useit to construct a field Z3(t) of27 elements .

E.2.14. Show that the polynomial S4 + S + 1 E Z[S] is irreducible anduse it to construct a field Z2(S) of 16 elements. Later we will see that thisfield must be isomorphic to the field Z2(X, y) of example 2.13. For a methodto find isomorphisms between two finite fields with the same cardinal, seeP.2.5.

Splitting fields

Given a prime number p and a positive integer r, the basic constructionwould yield a field of pT elements if we knew an irreducible polynomialf of degree rover Zp. We have used this in the examples in the previoussubsection for a few p and r , by choosing f explicitely, but at this point wedo not know whether an f exists for every p and r.

In this subsection we will show that we can use the basic construction inanother way (the method ofthe splittingjield) that leads quickly to a proof ofthe existence of a field whose cardinal is pT for any p and r. From this con­struction, which is difficult to use in explicit computations, we will deducein the next subsection a formula for the number of monic irreducible poly­nomials of any degree over a field of q elements, and procedures for findingthese polynomials. We will then be able to construct a field of pT elementsfor any p and r which can be handled explicitely and effectively.

2.16 Theorem (Splitting field of a polynomial). Given ajield K and a monic

2.2. Construction of finite fields

polynomial f E K[X], there is afield extension L/K and elements

such that

117

T

f= II(X-aj) and L=K(aI, . . . ,aT ) .

j=l

Proof Let r be the degree of f . If r = 1, it is enough to take L = K. Soassume r > 1 and that the theorem is true for polynomials of degree r - 1.If all irreducible factors of f have degree I, then f has r roots in K and wecan take L = K. Thus we can assume that f has at least one irreduciblefactor 9 of degree greater than I . By the fundamental construction applied tog, there is an extension K' / K and an element a E K' such that K' = K (a)and g(a) = 0, hence also f(a) = O. Now the theorem follows by inductionapplied to the polynomial f' = f /(X - a) E K'[X] (since a is a root of f,f is divisible by X - a, by Proposition 2.4). 0

2.17 Theorem (Splitting field ofx« - X). Let K be afinitefield and q =IKI. Let L be the splittingfield ofh = x« - X E K[X]. Then ILl = q",

Proof By definition of splitting field, there are elements ai E L, i =

1, ... , qT, such that h = rU:l (X - ai) and L = K(al, " " aqr). Nowthe elements ai must be different, for otherwise h and its derivative h' wouldhave a common root, which is impossible because h' is the constant -1. Onthe other hand the set {aI, . . . ,aqr } of roots of h in L form a subfield of L,because if a and fJ are roots, then (a - fJ)qr = aqr - fJqr = a - fJ and(afJ)qr = aqrfJqr = afJ (hence a - fJ and afJ are also roots of h) and if a isa nonzero root then (1/a)qr = 1/aqr = l/a (hence l/a is also a root of h).Since >..q = >.. for all >.. E K, the elements of K are also roots of h. It followsthat L = K(aI, . . . , aqr) = {al, . .. , aqr} and consequently ILl = qT. 0

2.18 Corollary (Existence offinite fields). Let p be a primer number and ra nonnegative integer. Then there exists afinite field whose cardinal is p",

Proof By the theorem, the splitting field of xr - X over Zp has cardi­nal p" . 0

E.2.15. Given a finite field L such that ILl = pT, the cardinal of any subfieldis v'. where sir (E.2.9). Prove that the converse is also true: given a divisors of r, there exists a unique subfield of L whose cardinal is pS.

Existence ofirreducible polynomials over Zp

In this section we will establish a formula for the the number Iq(n) ofmonicirreducible polynomials of degree n over a finite field K of q elements.

118 Chapter 2. Finite Fields

2.19 Proposition. Let f be a monic irreducible polynomial ofdegree roverK. Then f divides x« - X in K[X] ifand only ifrln.

Proof If L is the splitting field ofx« - X over K, then Theorem 2.17 tellsus that 1£1 = qn and x« - X = TIoEL(X - a). Let f E K[X] be monicand irreducible, set K' = K[XJI(J) and write x E K' for the class of Xmod j'.

If we assume that f divides x-: - X, then there exists (3 E L suchthat f({3) = 0 and by Propostion 2.10 we have that K[{3] is isomorphic toK' = K[XJI(J). Thus L has a subfield with cardinal qr, which implies thatrln.

Assume, for the converse, that rln. Then x« - X divides xc - X(because qr - 1 divides q" - 1 and hence xqr-l - 1 divides xqn-l - 1).To see that f divides xe - X it is thus enough to prove that f dividesx« _ X . Let h be the remainder of the Euclidean division of x« - Xby f . Since K' has qr elements, x is a root ofx« - X and hence also a rootof h. This implies that h is zero, because otherwise 1, x , . . . ,xr - 1 would notbe linearly independent, and hence f divides x« - X. 0

2.20 Corollary. Wehave the following decomposition:

xqn - X = II II t,din fElrrK(d)

where IrrK (d) is the set ofmonic irreducible polynomials ofdegree dover K.Consequently,

qn = L dlq(d).

din

2.21 Example. Since 23 = 8, the irreducible factors X 8 - X over Z2 coin­cide with the irreducible polynomials of degrees 1 and 3 over Z2 . Similarly,the irreducible factors X 16 - X over Z2 coincide with the irreducible poly­nomials of degrees 1,2 and 4 over Z2 . Now the factoring of f E K[X] intoirreducible factors can be obtained with the woos function factor(f,K) . Wecan see the results of this example in the listing below.

Irreducible polynomials of degree 1 and 3 over 71.2Ik=71.2;

Ifactor(XLX, k) ~ X· (X+1) ' (X3+X+1) ' (X3+X2+1)

Irreducible polynomials of degree 1, 2 and 4 over 71.2

Ifactor(X16-X,k)

-+ X· (X+1)' (X2+X+1) ' (X4+X+1) ' (X4+X3+1) ' (X4+X3+X2+X+1)

2.2. Construction of finite fields 119

To obtain an expression for Iq(n) we need the Mobius u function, which isdefined as follows : if d is a positive integer, then

if n = 1

if d = Pi ' . . Pk, with the Pi different primes

otherwise.

The function f.L is implemente in the WIRIS function rnurnoeblus.

Some values of 11II1=mu_moebius;

111(1) 1111(101) -1111(35) 1111(101101101) .... -1I factor(101101101) .... 3 ·101·333667

E.2.16. Show that if gcd(d,d') = 1, then f.L(dd') = f.L(d)f.L(d') .

E.2.17. For all integers n > 1, prove that Ldln f.L(d) = O.

E.2.18 (Mobius inversion formula). Let A be an abelian additive group andh : N -+ A and h' : N -+ A maps such that

h'(n) = L h(d)din

for all n. Prove that then

h(n) = L f.L(d)h'(njd)din

for all n, and conversely. There is a multiplicative version of Mobius inver­sion formula: Let A be an abelian multiplicative group and h : N -+ A andh' : N -+ A maps such that

h'(n) = II h(d)din

for all n . Prove that then

h(n) = IIh'(njd)IJ.(d)din

for all n, and conversely.

120 Chapter 2. Finite Fields

2.22 Example. Since for any positive integer n we have n = 2':dln!P(d)(Proposition 2.3), the Mobius inversion formula tells us that we also have

!pen) = L JL(d)~.din

I n=9 ·5 ·7;I D=divisors(n);I phi_euler(n) .... 144

II mu_mOebiUS(d) .% .... 144din D

Ifwe apply the Mobius inversion formula to the relation qn = 2':dlndIq(d)in Corollary 2.20, we get:

2.23 Proposition. The number Iq(n) ofirreducible polynomials ofdegree nover a field K ofq elements is given by the formula

1Iq(n) = - L JL(d)qn/d.

ndin

2.24 Corollary. For all n > 0, Iq(n) > O. In other words, ifK is afinitefield and q = IKI, then there are irreducible polynomials over K ofdegreen f or all n ~ 1.

Proof Since for d > 1 we have JL(d) ~ -1, it is easy to see that the formulain Proposition 2.23 implies that nIq(n) ~ q" - (qn - l)/(q - 1) > O. 0

"" Library I Computing IQ(n) ''''

I IJ.=mu_moebius;

Ilrr (q,n) := * . I IJ.(d)·qn/d

d in divisors(n)

[lrrtn) :=lrr(2,n)

~ Irr(q,35) .... 3~ .q3L 315

.q7_ 3~ .q5+ 3~ ·q~ [Irr(n)with n in 1..12] .... [2,1,2,3,6,9,18,30,56,99,186,335]

Ifwe apply the Mobius multiplicative inversion formula to the relation xc­X = Ildln Il!ElrrK(d) f in Corollary 2.20, we get:

2.25 Proposition. The product PK (n) ofall monic irreducible polynomialsofdegree n over K is given by the formula

PK(J) = IT(X qn/

d- X)J1,(d).

din

2.2. Construction of finite fields 121

"'I Library I Product of the monic irreducible polynomials of degree n over FQ I'"I J.L=mu_moebius

IP{q,n,X):= n (xqn/d _ X) Il(d )

d in divisors(n)

I P{n,X) :=P{2,n,X)

~P4=P{4,X) X1Z+X9+X6+X3+1

Ifactor{P4,7ZZ) (X4+X+1)· (X4+X3+1)' (X4+X3+XZ+X+1)

IP{3,4,X) ..... X72+X64+X56+X48+X40+X3Z+XZ4+X16+X8+1

2.26 Remark. Given a finite field K with q = IKI and a positive integer r ,to construct a field with qT elements we can pick a monic irreducible poly­nomial f E K[X], which exists because I q{r) > 0, and form K[XJI(J).Usually this construction is more effective than the splitting field methodintroduced in Corollary 2.18.

To find irreducible polynomials, wmrs has, in addition to the possibilityof factoring xe - X and which has already been illustrated above, two mainfunctions that are very useful for dealing with irreducible polynomials: thepredicate irreducible?{f,K), that returns true or false according to whether f E

K[x] is or is not irreducible over K , and irreducible_polynomial(K,r,X), thatreturns a monic polynomial of degree r in the variable X that is irreducibleover K . The latter function generates monic polynomials of degree r inthe variable X and coefficients in K and returns the first that is found tobe irreducible. This works well because the density of monic irreduciblepolynomials of degree r among all monic polynomials of degree r is, by theformula in Proposition 2.23, of the order ofl/r.

Iirreducible?{x3+xZ+1,7Zz) ..... true

If=sum{geometric_series{x,7)) ..... x6+x5+x4+x3+xZ+x+1Iirreducible?{f,7Zz) ..... false

Iirreducible_polynomial{7Zz,5,X) X5+X4+XZ+X+1

Iirreducible_polynomial{7Zz,6,X) X6+X+1

Iirreducible_polynomial{7Zz,7,X) X7+X+1

Iirreducible_polynomial{71z,8,X) X8+X7+X3+Xz+1

2.27 Remark. The function irreducible_polynomial(K,r,X) may give a differ­ent results (for the same K, r and X) for different calls, because it is based(unless rand K are rather small) on choosing a random polynomial andchecking whether it is irreducible or not.

122

Summary

Chapter2. Finite Fields

• The characteristic of a finite field K is the minimum positive numberp such that plK = OK. It is a primer number and there is a uniqueisomorphism between 7l..p and the minimum subfield of K (which iscalled the prime field of K). From this it follows IKI = p", where r isthe dimension of K as a 7l..p vector space .

• The (absolute) Frobenius automorphism of a finite field K is the mapK ---+ K such that x ~ xP, where p is the characteristic of K . If Lis a finite field and K is a subfield with q = IKI, then the Frobeniusautomorphism of L over K (or relative to K) is the map x ~ x q •

• Basic construction: if K is a finite field of cardinal q and f E K[X] isa monic irreducible ofdegree r, then K [X]/ (J) is a field that containsK as a subfield and its cardinal is q",

• For any finite field K and any positive integer r, there exist monicirreducible polynomials f E K[X] of degree r , and hence there arefields of cardinal qr (q = IKI) for all r ~ 1. In fact, we have derivedan expression for the number Ir(q) of monic irreducible polynomialsof degree r over a field of q elements (Proposition 2.23). From this itfollows that a positive integer can be the cardinal ofa finite field if andonly if it is a positive power of a prime number.

• Splitting field of a polynomial with coefficients in a finite field (Theo­rem 2.16) .

• Using the splitting field ofx« - X E K[X], where K is a finite fieldofq elements, we have Seen that xc - X E K [X] is the product ofallmonic irreducible polynomials with coefficients in K whose degree isa divisor of n. We also have found a formula for the product PK (n)of all monic irreducible polynomials in K[X] of n .

Problems

P.2.S (Finding isomorphisms between finite fields). Let L be a finite fieldand K ~ L a subfield. Set q = IKI and let r be the positive integer such that1£1 = qr. Assume that we have a monic irreducible polynomial f E K[X]and that We can find (3 E L such that f({3) = O. Show that there is a uniqueK-isomorphism K[X]/(J) ~ L such that x ~ (3, where x is the class of Xmodulo f. Prove also that the K-isomorphisms between K[X]j(J) and Lare in one-to-one correspondence with the roots of f in L,

2.2. Construction of finite fields 123

P.2.6. The polynomials f = X 3+X+I,g = X 3+X2+1 E Z2[X] are irre­ducible (as shown in Example 2.21 they are the only irreducible polynomialsof degree 3 over Z2). Use the previous problem to find all the isomorphismsbetween Z2[XJI(J) and Z2[XJI(g).

P.2.7. Use P.2.5 to find all the isomorphisms between the field described inE.2.14 and the field Z2 (x,y) defined in Example 2.13.

P.2.S. A partition of a positive integer n is a set of pairs of positive in­tegers A = {(TI ,ml), . . . , (Tk, m kn such that TI , .. . ,Tk are distinct andn = mlTI + ... + mkTk. The partition {(n, In is said to be improper, allothers are said to be proper. Let now K be a finite field of q elements. Usethe uniqueness (up to order) of the decomposition of a monic f E K[X] asa product of monic irreducible factors to prove that

gives the number of monic polynomials in K[X] that are the product ofmiirreducible monic polynomials of degree Ti, i = 1, ... ,k. Deduce from thisthat Rq(n) = I:>. Pq(A), where the sum ranges over all proper partitions ofn, is the number of polynomials in K[X] that are reducible, monic and ofdegree n. Finally show that Rq(n ) + Iq(n) = qn and use this to check theformula for Iq(n) obtained in Proposition 2.23 for n in the range 1..5.

124 Chapter 2. Finite Fields

2.3 Structure of the multiplicative group of a finitefield

This group could not have a simpler structure-it is cyclic.

S. Roman, [20], p. 288.

In the examples of finite fields presented in the preceeding section we haveseen that the multiplicative group was cyclic in all cases. In this section wewill see that this is in fact the case for all finite fields . We will also considerconsequences and applications of this fact.

Essential points

• Order of a nonzero element.

• Primitive elements (they exist for any finite field).

• Discrete logarithms.

• Period (or exponent) of a polynomial over a finite field.

Order ofan element

If K is a finite field and a is a nonzero element of K, the order ofa, ord(a),is the least positive integer r such that o" = 1 (in E.2.4 we saw that this rexists). Note that ord(a) = 1 if and only if a = 1.

2.28Example. in Z5 we have ord(2) = 4, because 2 =1= 1, 22 = 4 =1= 1,23 = 3 =1= 1 and 24 = 1. Similarly we have ord(3) = 4 and ord(4) = 2.

2.29 Proposition. If the cardinal ofK is q and a E K* has order r, thenrl(q - 1).

Proof It is a special case ofE.2.4. 0

2.30 Remark. This proposition implies that a q- 1 = 1 for all nonzero ele­ments a E K, a fact that was already established in E.2.6. On the other hand,if q - 1 happens to be prime and a =1= 1, then necessarily ord(a) = q - 1.More generally, r is the least divisor ofq - 1 such that ar = 1.

2.31 Examples. 1) In the field Z2(X) ofexample 2.11 we have, since q-1 =3 is prime, ord(l) = 1, ord(x) = 3 and ord(x + 1) = 3. Note that in 2.11we had already checked this from another perspective.

2.3. Structure of the multiplicative group of a finite field 125

2) In the field Z7(X) of the example 2.15, q - 1 = 342 = 2.32 .19.Since x 3 = 2 and 23 = 1, we have x 9 = 1 and hence ord(x )19. Finallyord(x ) = 9, as x 3 #- 1.

3) In the field Z2(X,y) of the example 2.13, which has cardinal 16,ord(y ) = 5, because ord(y ) has to be a divisor of16 -1 = 15 greater than 1and it is easy to check that y3 = x y + x #- 1 and y5 = (xy + 1)(x y + x) = 1.

Computationally, ord(a ) is returned by the function order(o:). It imple­ments the idea that the order r of a is the least divisor of q - 1 such thataT = 1. See the listing Order of field elements for examples.

Order of field elements

Iirreducible?(t5-t+1, 7Z5) .... true

IF=7Z5[tj/(t5-t+1); q=cardinal(F) .... 3125

I r=order(t) .... 1562

I q~1 .... 2

I{x,x"}=square_roots(t) .... {2·t4+t3+2·t2+3·t+4,3·t4+4·t3+3·t2+2·t+1}

Iorder(x) 3124lorder(2 ·t) 3124

E.2.19. In a finite field K every element a satisfies the relation a q = a,where q = IKI (E.2.6). Hence the elements ofK are roots of the polynomialX" - X . Show that

X" - X = II (X - a) and Xq- l - 1 = II (X - a).aEK aEK*

Note finally that these formulae imply that L aEK a = a if K #- Z2 (asX " - X has no term of degree q - 1 if q #- 2) and that TI aEK * a = - 1 (Seealso Example 1.30, where this fact was proved in a different way).

2.32 Proposition. Let K be a finite fi eld, a a nonzero element of K andr = ord( a) .

1) If x is a nonzero element such that x T = 1, then x = a k f or someinteger k.

2) Forany integer k, ord (a k ) =r/ gcd( k,r) .

3) The elements oforder r of K have the form a k , with gcd (k , r) = 1.In particular we see that if there is an element oforder r, then thereare precisely <p(r) elements oforder r.

Proof Consider the polynomial f = X T -1 E K [X ]. Since f has degree rand K is a field, f has at most r roots in K. At the same time , since r is the

126 Chapter 2. Finite Fields

order of a, all the r elements of the subgroup R = {I, a, ..., a r-

1} are rootsof f and so f has no roots other than the elements of R. Now by hypothesisx is a root of f, and hence x E R. This proves 1.

To establish 2, let d = gCd(r,k) and 8 = rid. We want to show that ak

has order 8 . If (ak)m = 1, then a mk = 1 and therefore rlmk. Dividing byd we get that 81mkld. Since (8,kid) = 1, we also get that 81m. Finally it isclear that (ak) S = a ks = (ar)k/d = 1, and this ends the proof.

Part 3 is a direct consequence of 1,2 and the definition of<p(r) . 0

Primitive elements

A nonzero element of a finite field K of cardinal q = pr is said to be aprimitive element of K if ord(a) = q - 1. In this case it is clear that

K * - {I 2 q-2}- , a , a , ... , a .

Of this representation of the elements of K we say that it is the exponen­tial representation (or also the polar representation) relative to the primitiveelement a . With such a representation, the product of elements of K is com­puted very easily: a iai = a k, where k = i + j mod q - l.

2.33 Examples. The elements 2 and 3 are the only two primitive elementsof Z5 and x and x + 1 are the only two primitive elements ofthe field Z2(x)introduced in Example 2.11. The element x of the field Z7(X) introduced inExample 2.15 is not primitive, since we know that ord(x) = 9 (see Example2.31). Finally, the element y of the field Z2(X,y) considered in Example2.13 is not primitive, since we saw that ord(y) = 5 (Example 2.31.3).

2.34 Theorem. Let K be a finite field ofcardinal q and d is a positive inte­ger. Ifdl(q - 1), then there are exactly <p(d) elements ofK that have orderd. In particular, exactly <p(q - 1)ofits elements are primitive. In particular,any finite field has a primitive element.

Proof Let p(d) be the number ofelements of K that have order d. It is clearthat

L p(d)=q-1,dl(q-l)

since the order of any nonzero element is a divisor of q - 1. Now observethat ifp(d) =1= 0, then p(d) = <p(d) (see Proposition 2.32.3). Since we alsohave l:dl(q-l) <p(d) = q - 1 (Proposition 2.3), it is clear that we must havep(d) = <p(d) for any divisor d ofq - 1. 0

The WIRIS function primitive_element(K) returns a primitive element of K .Here are some examples:

2.3. Structure of the multiplicative group of a finite field

Iirreducible?(t7-H1 ,717 ) -+ Irue

Ilet K=717 [I] I (t7-H1); q=cardinal(K);

Iu = primilive_element(K) -+ 2 ·1Iorder(u), q-1 -+ 823542,823542Iorder(l) -+ 274514

127

The algorithm on which the function primitive_element(K) is based is calledGauss' algorithm and it is explained and analyzed in P.2.1O.

2.35 Proposition. Let L be afinitefield, K a subfield ofLand q = IKI. Letr be the positive integer such that 1£1 = qr. Ifa is a primitive element ofL ,then 1, a, . .. , a r

- 1 is a basis ofL as a K-vector space.

Proof If 1, a , . . . , a r - 1 are linearly dependent over K, there exist

ao, aI, . . . , ar-l E K ,

not all zero, such that

Letting h = ao+a1X +...+ar_1xr-l E K[X], we have a polynomial ofpositive degree such that h(a) = O. From this it follows that there exists amonic irreducible polynomial f E K[X] of degree < r such that f( a) = O.Using now Proposition 2.10, we get an isomorphism from K' = K[XJI(I)onto K[a] such that x f---t a. But this implies that ord(a) = ord( x) ~

qr-l _ 1, which contradicts the fact that ord(a) = qr - 1. D

E.2.20. Let K be a finite field different from 1F2 . Use the existence of primi­

tive element to prove again that L:xEK x = O. D

Primitive polynomials

If f is an irreducible polynomial of degree rover 'Zp , p prime, then 'Zp(x) ='Zp[XJI(I) is a field of cardinal p", where x is the class of X modulo f . Weknow that x may be a primitive element, as in the case of the field 'Z2 (x)of example 2.11, or it may not, as in the case of the field 'Z7(X) studied inthe example 2.15 (cf. the examples 2.33). If x turns out to be a primitiveelement, we say that f is primitive over 'Zp. It is interesting, therefore, toexplore how to detect whether or not a given polynomial is primitive. Firstsee Example of a primitive polynomial (of degree 4 over 'Z3).

E.2.21. Let K be a finite field and f E K[X] be a monic irreducible polyno­mial, f =I X . Let x be the class of X in K[XJI(I) . If m = deg (I), showthat the order ofx is the minimum divisor d ofqm -1 such that f IX d -1. For

128

Example of a primitive polynomial

If=X4+X-1 ;

IF81=extension( 7Z3 'x, f);

Iorder(x) ~ 80Iprimitive_element(F81) ~ x

Chapter 2. Finite Fields

the computation of this order, which is also called the period (or exponent)of j , see The function period(f,q) .

"" Library I The function periodff.q)

period (f,q) :=begin

local r=degree(f), D=set(divisors (q' -1)), X=variable(f)for din D do

if remainder(Xd-1, f) =0 thenreturn d

endend

end

I p=5;

If = xP-x+1 : 7Zplx] ~ x5+4'x+1

IpP-1 , period(f,p) ~ 3124,1562

The discrete logarithm

Suppose L is a finite field and a E L a primitive element of L. Let K be asubfield of L and let q = IKI and r the dimension of Lover K as a K -vectorspace. We know that 1, a, . .. , a r - l form a basis of L as a K-vector space(Proposition 2.35), and thereby every element of L can be written in a uniqueway in the form ao + ala +...+ ar_la

r - l, ao, .. ., ar-l E K. We say thatthis is the additive representation over K of the elements of L , relative to theprimitive element a .

With the additive representation, addition is easy, as it reduces the sum oftwo elements ofL to r sums ofelements ofK. In order to compute products,however, it is more convenient to use the exponential representation withrespect to a primitive element a. More precisely, if x, y E L *, and we knowexponents i , j such that x = a i and y = a j, then xy = ai+j .

In order to be able to use the additive and exponential representationscooperatively, it is convenient to compute a table with the powers a i (r :(i :( q - 2, q = ILl) in additive form, say

i r-la = aiD+ ail a + ...+ ai ,r-la ,

2.3. Structure of the multipl icative group of a finite field 129

since this allows us to transfer, at our convenience, from the exponential tothe additive representation and vice versa. Ifwe let ind( x) (or inda(x) if wewant to declare a explicitely) to denote the exponent i such that x = ai, thenthe product xy is equal to ak, where k = ind(x) + ind(y) (mod q - 1).

The listing Computation of products by means of an index table in­dicates a possible computational arrangement of this approach, and the ex­ample below presents in detail IF16 done 'by hand'. Note that in the listing thefunction ind_table(x) constructs the table {xi ---+ j}j=o,... ,r - l> r = ord( x),augmented with {O ---+ _} in order to be able to include OK (in the definitionof the function we use 0 .x because this is the zero element ofthe field wherex lives, whereas 0 alone would be an integer). We have used the underscorecharacter to denote the index of OK, but in the literature it is customary touse the symbol 00 instead.

A' Library I Computation of products by means of an index table

~ ind_table(x) := {O ·x~_} I {xj~j withj in O..(order(x)-1)}

IK=713[x]/(x3-x+1); q=cardinal(K);

Set up an index table L and an exponential table E (inverse of L)Ia=primitive_element(K); L=ind_table(a) ; E=inverse(L);The product p(a,b) of a and b using table lookup

I p(a,b) := E( L(a)+L(b) mod q-1 );Example

Ia=a6; b=a15;

I L(a), L(b) .... 6,15

Ia -b, p(a,b) -+ x2+1, x2+1

By analogy with the logarithms, inda(x) is also denoted loga(x), orlog( x) if a can be understood. We say that it is the discrete logarithm ofx with respect to a .

2.36Example. Consider again the field Zjtr] = Z2[TJ!U),f = T 4+T+1,

studied in E.2.14. The element t is primitive, as ord(t)115 and t3 =I 1,t5 = t2 + t =I 1. So we have an exponential representation

Z2(t)* = {1, t, ..., t 14} .

Consider now Table 2.1. The column on the right contains the coeffi­cients of t k , k = 0,1, ... , 14, with respect to the basis 1, t, t 2 , t 3 . For exam­ple , t8 == t2 + 1 == 0101. The first column contains the coefficients withrespect to 1, t, t2 , t3 of the nonzero elements x of Z2 (t) ordered as the inte­gers 1, ... , 15 written in binary, together with the corresponding log(x) . Forexample, 5 == 0101 == t2 + 1, and its index is 8.

130 Chapter 2. Finite Fields

Table 2.1: Discrete logarithm oj'ldt), t4 = t + 1

x log(x) k tf<;

0001 0 0 00010010 1 1 00100011 4 2 01000100 2 3 10000101 8 4 00110110 5 5 01100111 10 6 11001000 3 7 1011

x log (x) k t k

1001 14 8 01011010 9 9 10101011 7 10 01111100 6 11 11101101 13 12 11111110 11 13 11011111 12 14 1001

2.37 Remark. Index tables are unsuitable for large fields because the corre­sponding tables take up too much space. They are also unsuitable for smallfields if we have the right computing setup because the field operations ofa good implementation are already quite efficient. As an illustration of thecontrast between pencil and machine computations, see Examples of Lidl­Niederreiter (cf. [10], p. 375).

Examples of Lidl-Niederreiter

If =x6+x5+x3+x2+1 ;IF64=extension(712,b,f); L=ind_table(b);

Is=(b6+b25+b44)-(1 +b35)-1 +b28 -+ b5+1

I L(s) -+ 38

(in agreement with the result s=b38)

Zech logarithm for F64

IZ={j"L(1 +t>i)with j in 0..62};IZ(4), Z(19), Z(35), Z(50) -+ 32,34, 31, 60

(thus, for example, 1+b19=b34)

Note in particular how to set up a table ofJacobi logarithms,

Z = {j 1-+ ind(l + a j)}

(also called Zech logarithms), a a primitive element. Since we have theidentity 1+ a j = aZ(j), they are useful to transform from the additive formto the exponential form. Here is a typical case:

a i + d = a i (l + a j-i) = ai+Z(j-i)

(assuming i :s; j).

2.3 . Structure of the multiplicative group of a finite field 131

Nevertheless, the problem of computing inda(x) for any x is very in­teresting and seemingly difficult. Part of the interest comes from its usesin some cryptographic systems that exploit the fact that the discrete expo­nential k I---t ci is easy to compute, while computing the inverse functionx I---t inda(x) is hard. See, for example, [10], Chapter 9.

Summary

• Order of a nonzero element of a field of q elements, and the fact thatit always divides q - 1. If d is a divisor of q - 1, there are precisely<pCd) elements of order d.

• Primitive elements (they exist for any finite field). Since they are theelements of order q - 1, there are exactly <pCq - 1) primitive elements .

• Period (or exponent) of an irreducible polynomial over a finite field.Primitive polynomials.

• Discrete and Jacobi (or Zech) logarithms .

Problems

P.2.9. Let K be a finite field, a, (3 E K *, r = ord(a), s = ord«(3) . Lett = ord(a(3), d = ged(r, s), m = lern(r, s) and m' = mid. Prove that m/[zand tim. Hence ord(a(3) = rs if d = 1. Give examples in which d > 1 andt = m' (repectively t = m) .

We now present an algorithm for finding primitive roots in an

arbitrary finite field. It is due to Gauss himself.

R.J. McEliece, [13], p. 38.

P.2.tO (Gauss algorithm). This is a fast procedure for constructing a primi­tive element of a finite field K of q elements.

1) Let a be any nonzero element of K and r = ord(a). If r = q - 1,return a. Otherwise we have r < q - 1.

2) Let b an element which is not in the group (a) = {I , a, ... ,ar - 1 }

(choose a nonzero b at random in K and compute br ; if b" ¥- 1, thenb <t (a), by proposition 2.32, and otherwise we can try another b; sincer is a proper divisor of q - 1, there are at least half of the nonzeroelements of K that meet the requirement).

3) Let s be the order of b. If m = q - 1, return b. Otherwise we haves <q-1.

132 Chapter 2. Finite Fields

4) Compute numbers d and e as follows. Start with d = 1 and e = 1 andlook successively at each common prime divisor p of rand s. Let mbe the minimum of the exponents with which p appears in rand sandset d = dpm ifm occurs in r (in this case p appears in s with exponentm or greater) and e = epm otherwise (in this case p appears in s withexponent m and in r with exponent strictly greater than m) .

5) Replace a by adbe , and reset r = ord(a) . Ifr = q - 1, then return a.Otherwise go to 2.

To see that this algorithm terminates, show that the order of adbe in step 5 islern(r, s) (use P.2.9) and that lern(r, s) > max(r, s).

P.2.H. Let K be a finite field, q = IK!, and f E K[X] a monic polynomialof degree m ~ 1 with f(O) f= 0. Prove that there exists a positive integere ~ qm - 1 such that fixe- 1. The least e satisfying this condition isthe period, or exponent, of f . If f is reducible , e need not be a divisor ofqm _ 1. See The function coarse_period(f,q) , which calls period(f,q) iff is irreducible and otherwise proceeds by sluggishly trying the conditionf I(xe- 1) for all e in the range deg (J) ..(qm - 1). Hint: If x is the class ofX in K' = K[XI!(J), the qm classes xi (j = 0, . . . , qm - 1) are nonzeroand K' has only qm - 1 nonzero elements .

A' Library I The function coarse .periodtf.q)

coarse_period (f,q) :=begin

local r=degree(f), X=variable(f), E=1 ..qr-1if irreducible?(f,coefficienUing(f») then

return period(f,q)endfor e in E do

if remainder(Xe-1,f)=0 thenreturn e

endend

end

Iu=1 :712;

Icoarse_period «X2+X+u)2, 2)

Icoarse_period (X10+X9+X3+X2+u, 2)

If=x4+x-1 : 7l3[x);

Iirreducible?(f,713) ..... true

I period(f.3) ..... 80Icoarseperlodtts) ..... 80

P.2.l2. Find all primitive irreducible polynomials ofdegree r = 2..6 over Z2.

2.4. Minimum polynomial

2.4 Minimum polynomial

133

The definitions, theorems, and examples in this chapter can be

found in nearly any of the introductory books on modern (or

abstract) algebra.

R. Lidl and H. Niederreiter, [10], p. 37.

Essential points

• Minimum polynomial of an element of a field with respect to a sub­field.

• The roots ofthe minimum polynomial ofan element are the conjugatesof that element.

• Between two finite fields of the same order p" there exist precisely risomorphisms.

• Norm and trace of an element of a field with respect to a subfield.

Let L be a finite field and K a subfield of L. If the cardinal of K is q, weknow that the cardinal of L has the form qm, where m is the dimension of Las a K -vector space.

Let a E L. Then the m + 1 elements 1, a , , am are linearly dependentover K and so there exists elements ao, a I, , am E K , not all zero, suchthat ao+al a +...+amam = O. This means that ifwe set f = ao+a1X +... + amXm, then f is a nonzero polynomial such that f(a) = O. With thisobservation we can now prove the next proposition.

2.38 Proposition. There is a unique monic polynomial p E K[X] that sat­isfies the following two conditions:

1) p(a) = O.

2) Iff E K[X] is a polynomial such that f(a) = 0, then pi/.

The polynomial p is irreducible and satisfies that

3) deg (p) ::::: m .

Proof Among all the nonzero polynomials f such that f(a) = 0, chooseone of minimum degree, p. The degree of p cannot be greater than m , aswe know that there are nonzero polynomials f of degree m or less such thatf(a) = O.

134 Chapter2. FiniteFields

To see 2, assume f (a) = °and let r be the residue of the Euclideandivision of f by p, say f = gp + r and with either r = aor deg (r) <deg (p). Since p(a) = 0, we also have r(a) = 0. Hence the only possibilityis r = 0, as otherwise we would contradict the way p was chosen.

To prove the uniqueness of p, let p' be another monic polynomial thatsatisfies 1 and 2. Then we have pip' and p'lp, and so p and p' are proportionalby a nonzero constant factor. But this factor must be 1, because both p andp' are monic.

Finally if p = fg is a factorization of p, then f(a) = °or g(a) = 0,and so plf or pig. If plf, then f = pI', for some l' E K[X], and sop = fg = pf'g. Thus 1 = l'g, f' and 9 are constants and p = fg is not aproper factorization of p. The case pig leads with a similar argument to thesame conclusion . Therefore p is irreducible. 0

The polynomial p defined in Proposition 2.38 is called the minimum poly­nomial of a over K, and is usually denoted

Pa. The degree of Pa is also known as the degree of a over K and isdenoted r = deg (a) .

E.2.22. Show that there exits a unique K -isomorphism K[X]/ (Pa) ~ K[a]such that x I---t a, where x is the class of X.

E.2.23. Show that if a is a primitive element of L , then a has degree mover K.

E.2.24. If f is an irreducible polynomial over a field K, and a is a root of fin an extension L of K, show that f is the minimum polynomial of a overK (note in particular that if K(x) = K[X]/(J), then f is the minimumpolynomial of x over K).

E.2.25. With the same notations as in the Proposition 2.38, prove that r =deg (p) is the first positive integer r such that a r E (1, a , ..., a r

-1) K . As a

result , r is also the first positive integer such that K[a] = (1, a, ..., a r - 1) K .

2.39 Example. Let us find the minimum polynomial, p, over Z2 of the ele­ment Y of Z2(X, y) introduced in the example 2.13 (cf. The function min­imum_polynomial). We have y2 = xy + 1. Hence y2 E (1, y)Z2(X) ' afact that is nothing but a rediscovery that the minimum polynomial of y overZ2(X) is y2 +x y + 1. But y2 tj. (1, Y)Z2' and so p has degree higher than 2.We have y3 = xy2 +Y = x2y+X+Y = xy +x and, as it is easy to see, y3 tj.(1,y ,y2)z2 ' Now y4 = xy2+ xy = x 2y + x + x y = y+ x = y3+ y2+ y+1,and therefore p = y4 +y3 +y2 +Y +1. As ord (y) = 5, p is not a primitivepolynomial. 0

2.4. Minimum polynomial

The function minimum_polynomial

Ik=712; K=k[xj/(x2+x+1); let L=K[yj/(y2+X•y+1);

Iminimum_polynomial(y,k,X) X4+X3+X2+X+1

Iminimum_polynomial(y,T) T4+T3+T2+T+1

Iminimum_polynomial(y,K,Y) y2+x 'Y+1

IK=713[aj/(a3-a+1); ~=a2+1;

Irnlnimumpolynornlalfji.X) .... X3+X2+2,X+1

The set of conjugates over K of an element a E L is defined to be

C _ { q q2 qr-l}a - a , a ,a , . . . , a ,

where r is the least positive integer such that aqr = a.

The calls conjugates(x) and conjugates(x,k)k, x, K, y, L: same meaning as in the previous listing

Ik=712; K=k[xj/(x2+x+1); L=K[y]/(y2+x' y+1);

Iconjugates(x,k) {x, x+1}Iconjugates(x) {x, x+1}Iconjugates(y,k) {y, x'y+1 , y--x, x'y+x}Iconjugates(y) {y, x 'y+1, Y+x, x-y--x}I conjugates(y,K) {y, y+x}Iconjugates(x,K) {x}

135

2.40 Proposition. Let K be afinitefield and q = IKI. Let L be afinitefieldextension ofK and a E L. !fCa is the set ofconjugates ofa over K , thenthe minimum polynomial Pa ofa over K is given by the expression

Pa = II (X - (3).{3EC",

Proof Extend the Frobenius automorphism ofL/K to a ring automorphismof L[X] by mapping the polynomial L:~=o aiXi to L:~=o aix i. Then thepolynomial f = Il{3EC",(X - (3) is invariant by this automorphism, because{3q E Ca if {3 E Ca. Therefore f E K[X] . If {3 E L is a root of'pj, then (3q isalso a root ofPo<> as it is easily seen by mapping the Frobenius automorphismto the relation Pa({3) = O. Using this observation repeatedly it follows, sincea is a root of Pa, that Pa({3) = 0 for all {3 E Ca and so flPa . But Pa isirreducible and f has positive degree, so f = Pa, as both polynomials arem~. 0

136

Uniqueness offinite fields

Chapter 2. Finite Fields

Next statement shows that two finite fields of the same cardinal are isomor­phic.

2.41 Theorem. Let K and K' be twofin ite fields with the same cardinal, q.Then there exists an isomorphism <p : K -7 K'.

Proof Consider the polynomial X" - X E Zp[X]. Regarded as a polyno­mial with coefficients in K, we have X'' -X = I1aEK(X -a) (see E.2.19).Similarly we have X" - X = I1a/EK'(X - a').

Let a be a primitive element of K and let f E Zp[Xj be its minimumpolynomial. As IKI = IK'I = pT , f has degree r (Proposition 2.35), andsince all the roots of f are in K (Proposition 2.40), we also have that

fl( T pT- l - 1)

as polynomials with coefficients in K. But since these polynomials aremonic with coefficients in Zp, this relation is in fact valid as polynomialswith coefficients in Zp. Now (TpT-1 - 1) also decomposes completely inK' and hence f has a root a' E K'. Now Proposition 2.10 says that thereis a unique isomorphism Zp[Xl/(f) ::::::: Zp[a'] = K' such that x I--t a',where x is the class of X modulo f. But there is also a unique isomorphismZp[Xl/(f) ::::::: Zp[a] = K and so there exists a unique isomorphism K ::::::: K'such that a I--t a'. 0

2.42 Remark. The proofabove actually shows that there are precisely r dis­tinct isomorphisms between K and K'.

E.2.26. The polynomials X3 + X + 1, X 3 + X 2 + 1 E Z2[Xj are the twoirreducible polynomials of degree 3 over Z2. Construct an explicit isomor­phism between the fields Z2[X]/(X3+X + 1) and Z2[Xl/(X3 +X2 + 1).(cf. the listing below).

~K=2Z2[~1/(~3+~2+1) ;

Iroots(X3+X+1 , K) -+ {~+1, ~2+1, ~2+~}

E.2.27. Prove that x2+x +4 and x2+1 are irreducible polynomials over thefield F 11 and construct an explicit isomorphism between F 11 [xl/(x2+x+4)and F11 [xl/(X2+ 1).

Trace and norm

Let L be a finite field and K a subfield. If IKI = q, then we know thatILl = qm, where m = [L : Kj, the dimension of L as a K-vector space.

2.4. Minimum polynomial 137

Definitions

Let a E L. Then we can consider the map m a : L ---+ L such that x 1-+ ax .This map is clearly K -linear and we define the trace and the norm of awith respect to K, denoted TrL/K(a) and NmL/K(a) respectively, by theformulae

TrL/K(a) = trace(ma ) , NmL/K(a) = det(ma ) .

Let us recall that the right hand sides are defined as the sum of the diagonalelements and the determinant, respectively, of the matrix of m a with respectto any basis of L as a K -vector space.

E.2.28. Check that TrL/K : L ---+ K is a K-linear map and that TrL/K(A) =ni). for all A E K .

E.2.29. Similarly, check that NmL/K : L* ---+ K* is a homomorphism ofgroups and that NmL/K(A) = Am for all A E K .

Formulas in terms of the conjugates of a

Let Po = X" + Cl X r - l + + Cr be the minimum polynomial of a withrespect to K . If we put {aI, , o,} to denote the set of conjugates of awith respect to K (al = a), then we know that Po = II=l (X - ad. Inparticular we have that

Cl = -a(a) and Cr = (-lt1r(a),

where a(a) = al + .. .+ ar and 1r(a) = al . .... ar.

2.43 Theorem. With the notations above, and defining s = m]r, we have:

TrL/K(a) = sa(a ) and NmL/K(a) = (1r(a)) 8.

Proof The set a = {l,a,··· , a r- l } is a basis of K(a) as a K-vectorspace. Ifwe let (31, ... ,(38 ' S = mfr , denote a basis of L as a K(a)-vectorspace, then L = f11j=lK (a )(3j (as K(a)-vector spaces) . It follows that m o

has a matrix that is the direct sum of s identical blocks, each of which isthe matrix A of m a : K(a) ---+ K(a) with respect to the basis a. Sincema(ai ) = ai+l, which for i = r - 1 is equal to

-(Cr + Cr- l a + ...+ clar- l) ,

we have that0 -Cr

1 -Cr-lA=

1 0 - C2

1 -Cl

138

and so

and

as stated.

Chapter 2. Finite Fields

trace(maJ = strace(A) = S(-Cl) = sa(a),

oIfLand K are constructed fields, then TrL/K(a) is returned by Tr(a,L,K),

or trace(a,L,K). If K is not specified, it is taken as the prime field of L. Andif neither K nor L are specified, it is assumed that L is the field of a and Kits prime field. Similar notations and conventions hold for NmL/K (a) andNm(a,L,K), or norm(a,L,K).

Trace examples

Ik=713; K=k[a)/(a2+1) ; L=K[~)/(~L~+1);

I u=1:k .... 1

ITr(u), Tr(a), Tr(~), Tr(a '~)' Tr(~2), Tr(a·~2) .... 1,0,0,0,2,0ITr(u,K), Tr(a,K) .... 2,0ITr(u,K,k), Tr(a,K,k) .... 2,0

ITr(u,L), Tr(a,L), Tr(~,L) , Tr(a·~,L), Tr(~2,L), Tr(a ·~2,L) .... 0,0,0,0,1,0

ITr(u,L,k), Tr(a,L,k), Tr(~,L ,k) , Tr(a·~,L,k) , Tr(~2,L,k) , Tr(a·~2,L,k) .... 0,0,0,0,1,0

ITr(u,L,K), Tr(a,L,K), Tr(~,L,K), Tr(a·~,L,K), Tr(~2,L,K) , Tr(a·~2,L,K).... 0,O,O,O,2,2 'a

Norm examples

Ik=713; K=k[a)/(a2+1); L=K[~)/(~3_~+1);

I ue t ik .... 1

INm(u), Nm(a), Nm(~), Nm(a'~)' Nm(~2), Nm(a·~2) .... 1,1,2,1,1,1INm(u,K), Nm(a,K) .... 1,1INm(u,K,k), Nm(a,K,k) .... 1,1

INm(u,L), Nm(a,L), Nm(~,L), Nm(a·~,L), Nm(~2,L), Nm(a·~2,L) .... 1,1,1,1,1,1

INm(u,L,k), Nm(a,L,k), Nm(~,L,k), Nm(a ·~,L ,k), Nm(~2,L,k), Nm(a·~2,L,k).... 1,1,1,1,1,1

INm(u,L,K), Nm(a,L,K), Nm(~,L,K), Nm(a·~,L,K) , Nm(~2,L,K), Nm(a·~2,L,K).... 1,2'a,2,a, 1,2 'a

2.4. Minimum polynomial

Summary

139

• Minimum polynomial of an element of a field with respect to a sub­field .

• The roots ofthe minimum polynomial ofan element are the conjugatesof that element with respect to the subfield.

• Between two finite fields of the same order pT, there exist precisely r

isomorphisms.

• Trace and norm of an element of a field L with ILl = qm with respectto a subfield K with IKj = p": if we set s = mfr, then the trace isthe sum ofthe conjugates of a multiplied by s and the norm is the s-thpower of the product of the conjugates of a.

Problems

P.2.13 . Let K be a finite field and q = IKI. Let f E K[X] be monic andirreducible and r = deg(j) . Show that K' = K[X]/(j) is the splittingfield of f.

P.2.14. Find the trace ofao+alt+a2t2+a3t3+a4t4 E Z2[T]/(T5+T2+1) ,where t is the class ofT.

P.2.15. Let K = Z2[X]/(X4 + X + 1) and a = [X] . Compute the matrixM = Tr(aia j ) for 0 :(: i,j :(: 3 and show that det(M) = 1. Deduce thatthere exists a unique Z2-linear basis f3o,/31 , f32, f33 ofK such that Tr(a i f3j) =6ij (0 :(: i, j :(: 3).

P.2.16. Let K = Z3, L = K[X]/(X2+ 1) and f3 = a + 1, where a is theclass of X. Find Nm(f3i) for 0:(: i :(: 7.

3 Cyclic Codes

The codes discussed in this chapter can be implemented rela­

tively easily and possess a great deal of well-understood math­

ematical structure.

W. W. Petersonand E. J. Weldon, Jr., [16], p. 206.

Cyclic codes are linear codes that are invariant under cyclic permutationsof the components of its vectors . These codes have a nice algebraic structure(after reinterpreting vectors as univariate polynomials) which favors its studyand use in a particularly effective way.

More specifically, we will see that cyclic codes of length n over a finitefield K are in one-to-one correspondence with the monic divisors g of

x» -1 E K[X],

and we will establish how to construct generating and control matrices interms of g. Then we will investigate, in order to generate all factors g, howto effectively factor X" - 1 E K[X] into irreducible factors .

Not surprisingly, several ofthe better known codes (like the Golay codes) ,or families of codes (like the BCH codes), are cyclic.

In the last section we will present the Meggitt decoding algorithm andits computational implementation. Moreover, we will illustrate in detail itsapplication in the case of the Golay codes .

142 Chapter 3. Cyclic Codes

3.1 Generalities

During the last two decades more and more algebraic tools

such as the theory of finite fields and the theory of polynomi­

als over finite fields have influenced coding.

Lidl-Niederreiter, [10], p. 470.

Essential points

• Cyclic ring structure of JF1l .

• Generating and control polynomials of a cyclic code .

• Parameters of a cyclic code.

• Generating matrices and systematic encoding.

• Control matrices.

We will say that a linear code C ~ JFn is cyclic if a(a) E C for all a E C ,where

a(a) = (an ,al , a2 ,· ··, an-l) if a = (al,"" an). [3.1]

E.3.1. Assume that a(a) E C for all rows a of a generating matrix G of C.Show that C is cyclic .

Cyclic ring structure ofJFn

In order to interpret the condition [3.1] in a more convenient form, we willidentify JFn with JF[x]n (the vector space of univariate polynomials of de­gree < n with coefficients in JF in the indeterminate x) by means of theisomorphism that maps the vector a = (al , " " an) E JF1l to the polynomiala(x) = al + a2X + .. .+ anxn- l (note that this is like the interpretation of357 as 3 x 102 + 5 x 10 + 7). SeeThe isomorphism F" ~ P[X]n for howwe can deal with this in computational terms.

Moreover, we will endow IF[x]n with the ring product obtained by identifyingIF[x]n with the underlying vector space of the ring IF[X]/(Xn - 1) via theunique JF-linearisomorphism that maps x to the class [X] ofX. If f E JF[X] ,its image f = f(x) in JF[x]n, which we will cal the n-th cyclic reduction off, is obtained by replacing, for all non-negative integers j, the monomial Xi

3.1. Generalities

The isomorphism Fn .. F[x] n

I reverse_print(true) ;Ia=[1 , 2,3] ;

Ia=polynomial(a,X) .... H2·X+3·X2Ivector(a) .... [1,2,3]I reverse_print (false) ;

143

by the monomial xU]n, where [j]n is the remainder of the Euclidean divisionof j by n. We will denote by 1r : F[X] ......... F[x]n the map such that f I-t f(see Cyclic reduction of two polynomials).

AILibrary I Cyclic reduction of polynomials

Cyclic reduction of a polynomial with respect to a variable xcyclic_reduction(f, n :71, x :Identifier)check n>O& collect(f,x) E- Polynomial:=begin

local cfor j in n..degree(f,x) do

c= coefficient (f,x,Dif c=O then

continueendf=f-c .xi ; f=f+c .xj mod n

endcollect(f,x)

endCyclic reduction of a univariate polynomial fcyclic_reduction (f, n:7l)check n>O& length (variables (f) ):::>1:= if variables (f) = { } then

felse

cyclic_reduction (f,n,variable (f) )end

Icyclic_reduction(1+2·x+3-x2+4·x3,3) .... 3·x2+2·x+5

Icyclic_reduction (sum (geometric_series (x,25) ) ,10)

.... 2 ·x9+2·x8+2 ·x7+2·x6+2·x5+3·x4+3 ·x3+ 3 ·x2+ 3 ·x+ 3

Icyclic_reduction(aO+a1 ·x+a2·x2+a3 ·x3+a4 ·x4, 3, x)

.... a2·x2+(a1+a4) 'x+aO+a3

Icyclic_reduction(aO+a1·x+a2 -x2+a3 ·x3+a4 ·x4, 1, x) .... aO+aHa2+a3+a4

For convenience we will also write, if f E F[X] and a E F[x]n, fa asmeaning the same as fa .

Next we will consider the functions that allow us to deal with the productin F[x]n, and a couple of examples (Cyclic product of polynomials).

In terms of vectors ofF", the product in F[x]n can be expressed directlyas follows :

144 Chapter 3. Cyclic Codes

""I Library I Cyclic product of two polynomials with resoect to a variable I""

~cyclic_product(a, b, n :71, x : Identifier) :=cyclic_reduction (a -b, n, x)Special call for univariatepolynomials

I cyclic_product(a, b, n :71) :=cyclic_reduction(a 'b, n)

Icyclicproductfa-eb-x-i-c-x'' , x2, 3, x) .... a'x2+c-x+b

Icyclic_product(1+2 x2+3 x3+5 x5, 1+x+7 x7, 6)

.... 5'x5+24-x4+19-x3+2 -x2+8-x+41

""I Library I Cyclic productof vectors '""

cyclic_product(a :Vector, b :Vector)check length(a) =Iength (b):= begin

local v, x, n=length(a)a=polynomial (a.x)b=polynomial(b,x)v=vector(cyclic_reduction(a -b, n))pad(v,n)

end

~cyclic_product([a , b, c],[d, e, f]).... [a -d-eb-f--c -e., a-e-eb-d-i-c -L, a·f+b·e+c·d]

I cyclic_product([1 , -2,3, -4,5],[2,3,5,7,11]) .... [-4" 29" -4" 53" 10]

3.1 Lemma. Ifwe let CT denote the cyclicpermutation [3.1], then CT(a) = x afor all a E IF[x]n.

Proof With the representation of a as the polynomial

the product xa is

Since xn = 1, we have

which is the polynomial corresponding to the vector CT(a) _ o3.2 Proposition. A linear code C oflength n is cyclic if and only if it is anidealoflF[x]n_

Proof If C ~ IF[x]n is cyclic, then Lemma 3.1 implies that x C ~ C. In­ductively we also have that xi C ~ C for all non-negative integers j _ SinceC is a linear subspace, we also have aC ~ C for all a E IF[x]n, which means

3.1. Generalities 145

that C is an ideal of IF[x]n. Conversely, if C S;;; IF[x]n is an ideal, then inparticular it is a linear subspace such that xC S;;; C. But again by Lemma 3.1this means that C is cyclic. 0

Given any polynomial f E IF[X], let Ct = (f) S;;; IF[x]n denot e the idealgenerated by J = f (x), which can also be described as the image in IF[x]nof the ideal (I) S;;; IF[X ] by the map 11" : IF[X] ----t IF[x]n .

E.3.2. If g and g' are monic divisors of X " - 1, show thatI) c, S;;; c, ifand only if g' lg.

2) Cg = Cg' ifand only if g = g'.

3.3 Proposition. Given a cyclic code C oflength n, there is a unique monicdivisor g ofX" - 1 such that C = Cg .

Proof Exercise E.3.2 tells us that we must only prove the existence of g.Given C, let g E IF[X] be a nonzero polynomial of minimum degree

among those that satisfy g E C (note that 1I"(xn - 1) = xn - 1 = 0 E C sothat g exists and deg (g) ~ n) . After dividing g by its leading coefficient, wemay assume that g is monic. Since C is an ideal, we have Cg = (g) S;;; C andso it will be enough to prove that g is a divisor of X" - 1 and that C S;;; Cg .

To see that g is a divisor of X" - 1, let q and r be the quotient andremainder of the Euclidean division of X " - 1 by g:

X" -1 = qg+ r.

Henceo= xn

- 1 = qg + f

and so f = -qg E Cg S;;; C . But deg (r) < deg (g), by definition of r . Thusr = 0, by definition of g, and this means that g divides X" - 1.

Now let a E C. We want to prove that a E Cg• Let

ax = ao + a1X + . . . + an_1X n-1,

so thata = ao + al X + . .. + an_l Xn- 1 = ax .

Let qa and r« be the quotient and remainder of the Euclidean division ofax

byg:

This implies that

a = ax = (jag + fa

and thereforefa = a - (jag E C.

This and the definition ofra and g yield that r a = 0 and hence a = (jag E Cg ,

as claimed. 0

146 Chapter 3. Cyclic Codes

The monic divisor 9 of X" - 1 such that 0 = Og is said to be thegenerator polynomial of Og .

E.3.3. For any I E IF[X], prove that the generator polynomial of Of is9 = gcd(j,xn -1). 0

In the sequel we shall write, if9 is a monic divisor of X n - 1,

g= (X" -1)/9

and we w~ say that it is the control polynomial, or checkpolynomial, of Og.Note that 9 = 9.

Further notations and conventions

Let us modify the notations used so far in order to better adapt them to thepolynomial representation of vectors. Instead of the index set {I, ... ,n},we will use {O,1, ... ,n - I} . In this wayan n-dimensional vector a(ao, . . . ,an-I) is identified with the polynomial

( )n-la x = ao + alX + ...+ an-IX .

The scalar product (alb) of a,b E IF[x]n can be determined in terms ofthe polynomial representation as follows: Given a E IF[X]n, let£(a) = an-l(the leading term of a) and

Then we have the relation

(alb) = £(ab),

[3.2]

[3.3]

as it is easy to check.In the remainder of this section we will assume that gcd(n ,q) = 1 and

we let p denote the unique prime number such that q = pT, for some positiveinteger r . This hypothesis implies that n i- 0 in IF, therefore

D(Xn - 1) = nXn-1rv X n-l

does not have non-constant common factors with X" - 1. Consequently theirreductible factors of X" - 1 are simple (have multiplicity 1). In particularwe have, if II ,... ,i- are the distinct irreducible monic factors of X" - 1,that X" - 1 = II · · · Ir (for the case in which nand q are not relativelyprime, see £.3.13 on page 158).

Thus the monic divisors of X" - 1 have the form

9 = l il . .. Ii., with 1 ~ i l < .. . < is ~ r,

3.1. Generalities 147

So there are exactly 2r cyclic codes of length n, although some of thesecodes may be equivalent (for a first example, see Example 3.6).

Since the ideals (Ji) are maximal ideals oflF[x]n = F[X]j(xn -1), thecodes M, = (Ii) are said to be maximal cyclic codes. In the same way, the

codes M = (h), h= (xn - 1)/ h , are called minimal cyclic codes, as theyare the minimal nonzero ideals oflF[XJn.

E.3.4. How many ternary cyclic codes of length 8 are there?

E.3.5. Let G be a binary cyclic code of odd length. Show that G contains anodd weight vector if and only if it contains 11 ... 1.

E.3.6. Let G be a binary cyclic code that contains an odd weight vector.Prove that the even-weight subcode of G is cyclic .

Dimension o/Og

Given a monic divisor g of X" -1, the elements ofGg are, by definition, thepolynomials that have the form ag, where a E F[x]n is an arbitrary element.In other words, the map I-" : F[X] --t Gg such that U f-+ fig is surjective.

On the other hand, we claim that ker (I-") = (g). Indeed, fig = 0 if andonly if (X" - l)lug. But X" - 1 = ggand gcd (g,g) = 1, so (X" - l)lugif and only if9lu, that is, if and only ifu E (g).

Hence we have that

Gg = im(l-") ~ IF[X]/ker (I-") ~ F[X]j(g) ~ IF[X]k ,

where k = deg (g). Thus we have proved:

3.4 Proposition. The dimension olGg coincides with the degree ofthe con­trol polynomial 9olGg :

dim (Gg ) = deg (g).

Since deg (g) = n - deg (g), deg (g) is the codimension ofOi ; 0

E.3.7. Prove that the linear map F[X]k --t Gg such that u f-+ ilg is an F­linear isomorphism.

Generating matrices

With the same notations as in the preceeding subsection, the polynomialsUi = {xig}o~i<k form a basis ofGg • Therefore, if

g = go +glX + ...+ gn_kx n- k,

148 Chapter3. CyclicCodes

then the matrix

go g1 gn-k 0 0 00 go g1 gn-k 0 0

G=0 0 go g1 gn-k 00 0 go g1 gn-k

(k rows and n columns) generates C. Note that gn-k = 1, since 9 is monic,but we retain the symbolic notation gn-k in order to display explicitely thedegree ofg.

AI Library I Cyclic aeneratina matrix of a cvclic code IA

cyclic_matrix(g :Vector, n :71 )check n~length(g)

:=beginlocal k=n-Iength(g), Gif k>O then

g=g IconstantvectorIk.O)endG=gfor i in 1..k do

g= [Oll take(g,n-1)G=(G,g)

end[G]

endI cyclic_matrix(g :Univariate_polynomial, n :71 ):= cyclic_matrix (vector(g) , n)

I v= [gO, g1, g2, g3, g4] ;

If=polynomial(v,x) -+ g4 ·x4+g3 ·x3+g2 ·x2+g1 ·x+gOIcyclic_matrix(f,5) -+ (gO g1 g2 g3 g4)

I.. (gO g1 g2 g3 g4 0)

cychc_matnx(v,6) -+ 0 gO g1 g2 g3 g4

(

gO g1 g2 g3 g4 0 0)cyclic_matrix(f,7) -+ 0 gO g1 g2 g3 g4 0

o 0 gO g1 g2 g3 g4

E.3.8. Show that the coding map IFk-t cg , U f--+ uG, can be described in

algebraic terms as the map IF[x]k -t cg such that u f--+ ug.

Normalized generating and control matrices

Ifwe want a normalized generating matrix (say in order to facilitate decod­ing), we can proceed as follows : For 0 ~ j < k, let

3.1. Generalities

with deg (rj) < n - k. Then the k polynomials

149

form a basis of Og and the corresponding coefficient matrix G' is normal­ized, in the sense that the matrix formed with the last k columns of G' isthe identity matrix h (See the listing Normalized generating and controlmatrices).

"'I Library I Cyclic normalized oeneratino and control matrices

cyclic_normalized_matrix(g, n)check n>degree(g) & monic?(g):=begin

local r, x=variable(g), k=n-degree(g), G=nullfor j in O.•k-1 do

r=remainder(xn-k+j,g)r=pad (vector(r) ,n- k)G=(G,-r)

end[Glilk

endcyclic_normalized_controLmatrix (g, n)check n>degree(g) & monic?(g):=begin

local r=degree(g), k=n-r,G=cyclic_normalized_matrix (g,n) ,R=G1..r

I I-RTn-k

end

Ig=1+2 X+X3 ; n=7;

G=cyclic_normalized_matrix(g,n) ~ (j J ~ ~ !~ ~)-1 -4 -4 0 0 0 1

(

1 0 0 -1 0 2 1)H=cyclic_normalized_controLmatrix(g,n) ~ 0 1 0 -2 -1 4 4

o 0 1 0 -2 -1 4

E.3.9. Let u E IFk ~ F[X]k. Show that the coding of u using the matrix G'can be obtained by replacing the monomials xj ofu by vs (0 :,;; j < k) :

E.3.10. Let H' be the control matrix of Og associated to G'. Prove that thesyndrome of s E IFn-k ~ IF[X]n-k of an element a E IFn ~ F[x]n coincideswith the remainder of the Euclidian division of a by g.

150

The dual code and the control matrix

Chapter 3. Cyclic Codes

Using the same notations, the control polynomial 9 satisfies that ga = 0is equivalent to a E Og (for a E F[x]n). Indeed, the relation ga = 0 isequivalent to say that X" - 1 = gglgax . Since IF[X] is a domain, the latteris equivalent to glax, and so we have gla, or a E Og.

Let us see that from this it follows that

[3.4]

where Og is the image of 0 9 by the map a 1--+ a(cf. [3.2]). Indeed, sinceboth sides have dimension n - k, it is enough to see that

- 1.0 9 ~ Og.

But if a E 09 and bE Og, then clearly ab = 0 and then (alb) = £(ab) = 0(cf. [3.3]).

Since gxn - k - 1, ... ,gx,9 form a basis of 09, if we set

9 = ho+ hlX + ...+ hkXk,

then

hk hk- 1 ho 0 0 00 hk hk- 1 ho 0 0

H=0 0 hk hk- 1 ho 00 0 hk hk- 1 ho

is a control matrix of Og (here hk = 1, but we write hk in order to displayexplicitely the indices of the coefficients; see Cyclic control matrix).

3.5 Remark. There are authors that call the ideal (g) the dual code of Og.This is confusing, because the formula [3.4] shows that it does not go alongwell with the general definition of the dual code . On the other hand, notethat the same formula shows that 09 is equivalent to the dual of Og.

In the next example we just need that the polynomial

is a factor ofXU -lover Z3. This can be checked directely, but note that inthe next section we will prove more, namely, that g is irreducible and that theother irreducible factors ofXU - 1 are X-I and X 5+X 4 - X 3 +X 2 - 1.

3.1. Generalities

AI Library I Cyclic control matrix

Icyclic_control_matrix(h :Vector, n :71) check n~length(h)

:= cyclic_matrix(reYerse(h), n) ;

Icyclicjcontrolmatrixth :monic?, n :71) check n>degree(g):= cyclic_control_matrix(Yector(h) , n) ;

I h= [hO, h1, h2, h3] ; n=7;

(

h3 h2 h1 hO 0 0 0). . 0 h3 h2 h1 hO 0 0

cychc_controLmatnx(h,n) - 0 0 h3 h2 h1 hO 0

o 0 0 h3 h2 h1 hO

151

3.6 Example (The ternary Golay code). Let q = 3, n = 11 and 0 = Og,where 9 is the factor of XU - 1 described in the preceding paragraph. Thetype of the cyclic code 0 is [11,6] .

Now we will prove that the minimum distance of this code is 5. Ifwe letG denote the normalized generating matrix of 0 corresponding to g, then it

is easy to check that GGT

= 0, where G is the parity completion of G (inorder to preserve the submatrix 16 at the end, we place the parity symbols onthe left):

In=11 ; g=factor(x"-1, 7l3}2.1 - x5+2 'x3+x2+2 'x+2

I G=cyclic_normalized_matrix(g , n) ;

1221201000001102212010000222011001000

G=left_parity_completion(G) - 1 1 01 1 1 0001 00

012221000010212102000001

000000)000000000000000000000000000000

Thus the code 0 = (G) is selfdual and in particular the weight of any vectorof 0 is a multiple of 3. Since the rows G have weight 6, the minimumdistance of 0 is either 3 or 6. But each row of G has exactly one zero in thefirst 6 columns, and the position of this 0 is different for the different rows ,so it is clear that a linear combination of two rows of G has weight at least2 + 2, hence at least 6. Since the weight of this combination is clearly notmore than 12 - 4 = 8, it has weight exactly 6. In particular, for each suchcombination there appear exactly two zeros in the first 6 positions. Now alinear combination of 3 rows will have weight at least 1+3, and so at least 6.

152 Chapter 3. Cyclic Codes

All other linear combinations of rows of G have weight at least 4, and so atleast 6. So C has type [12,6, 6J and as a consequence C has type [12,6, 5J.It is a perfect code. It is known (but rather involved) that all codes of type(11, 36 , 5) are equivalent to C (cf. [11]) and any such code will be said to bea ternary Golay code. 0

E.3.11. Find all binary cyclic codes oflength 7. In each case, give the gener­ating and control polynomials, a normalized generating matrix and identifythe dual code.

Summary

• Cyclic ring structure of JF1t .

• Generating and control polynomials of a cyclic code.

• The dimension ofa cyclic code is the degree ofthe control polynomial,hence the degree of the generating polynomial is the codimension.

• Cyclic and normalized generating matrices. Systematic encoding.

• The dual of the cyclic code Cg , 9 a divisor of X" - 1, is the cycliccode Cg, but written in reverse order. This allows to produce a controlmatrix of G from the generating matrix of Cg.

• The ternary Golay code, which has type [11,6, 5]s. It is perfect and itsparity completion, which has type [12,6, 6]s, is a selfdual code.

3.1. Generalities

Problems

153

P.3.! Consider two linear codes C1 and C2 of the same length and withcontrol matrices HI and H2.

1) Check that C1 n C2 is a linear code and find a control matr ix for it interms of HI and H 2 .

2) Prove that if C1 and C2 are cyclic with generating polynomials gl andtn, then C1 n C2 is cyclic. Find its generating polynomial in terms of

gl and g2·

P.3.2. Let G = hl (~ ) , where 85 is the Paley matrix of Z5. Prove that (G)is a ternary Golay code .

P.3.3. Use the MacWilliams identities to determine the weight enumeratorof the parity completion of the ternary Golay code .

P.3.4. The order of p = 2 mod n = 9 is k = 6. Let q = 2k = 64 and a aprimitive n-th root of 1 in K = IFq . Let T : K -+ Z2 be the map defined byT(~) = ( Tr (~), Tr (a~) , ... , Tr (an-l~) ) .

1) Show that T is linear and injective.

2) IfC is the image ofT, prove that C is a cyclic code [n,k].

3) Find the generator polynomial of C.

4) What is the minimum distance of C?

5) Generalize the construction of T for any prime p and any posi tive in­teger n not divisible by p and show that 1 and 2 still hold true for thisgeneral T.

E. Prange [as early as 1962] seems to have been the first to

study cyclic codes.

F. MacWilliams and N.JA Sloane, [11], p. 214.

154 Chapter 3. Cyclic Codes

3.2 Effective factorization of X " - 1

Cycliccodes are the most studied of all codes.

F.MacWilliams and N.J.A. Sloane, [11], p. 188.

Essential points

• The splitting field of X " - lover a finite field.

• Ciclotomic classes modulo n and the factorization theorem.

• The factorization algorithm.

• Cyclotomic polynomials.

Introduction

In the previous section we learnt that the generator polynomials 9 of cycliccodes of length n over IFq are, assuming gcd (n ,q) = 1, the monic divisorsof X " - 1 and that if it ,... ,Ir are the distinct monic irreducible factorsof X " - lover IFq then 9 has the form 9 = iiI " ' !is , with 0 ~ s ~ T ,

1 ~ i 1 < . . . < is ~ T .

The goal ofthis section is to explain an algorithm that supplies the factorsit ,. .. ,Ir in an efficient way. It is basically the method behind the functionfactor(f,K). Examples of factorization illustrates how it works .

Examples of factorization

IF2= Zl2 ; F3=Zl3; F4=F2[t]/(t 2+t+1);

F8=F2[u]1(u3+u+1); F16=F2[s]1(54+53+52+5+ 1);

Ifactor(x5-1, F2) (x+1)' (x4+x3+x2+x+1)

Ifactor(x5-1, F4) (x+1)' (x2+t 'x+1)' (x2+(t+1) 'x+1)

Ifactor(x5-1, F16) (x+1) ' (x+s) · (x+s2). (x+s3). (x+(s3+s2+ s+1»

Itactortx?-1, F2) (x+ 1) . (x3+x+1) . (x3+x2+ 1)

Ifactor(x7-1, F3) (x+2)' (x6+x5+x4+x3+x2+x+ 1)

Ifactor(x2L1, F2)

.... (x+1) ' (X11+x9+x7+x6+x5+x+1)' (x11+x10+x6+x5+x4+x2+1)

Ifactor(x 11-1, F3) .... (x+2) ' (x5+2 'x3+x2+2'x+2) ' (x5+x4+2'x3+x2+2)

3.2. Effective factorization of X" - 1 155

3.7 Example. We first illustrate the basic ideas by factoring X7 -lover Z2.Since we always have X" - 1 = (X - 1)(xn-1 +...+X + 1) it is enoughto factor I = X6 + ... + X + 1.

First let us consider a direct approach, using the special features of thiscase. Since I has 7 terms, it has no roots in Z2 and hence it does not havelinear factors. Since the only irreducible polynomial of degree 2 is

h = X 2 +X + 1

and it does not divide I (the remainder of the Euclidean division of I byh is 1), I is either irreducible or factors as the product of two irreduciblepolynomials of degree 3. But there are only two irreducible polynomials

of degree 3 over Z2 (h = X 3 + X + 1 and 12 = X3 + X 2 + 1) and ifI factors the only possibility is I = hh (since I does not have multiplefactor, I =F J'f and I =F Ii)· Finally, this relation is easily checked and sowe have that X7 - 1 = (X - 1)/112 is the factorization we were seeking.

Now let us consider an indirect approach. It consists in trying to finda finite field extension IF'IIF (here IF = IFq = Z2) that contains all roots ofX 7 - 1. If IF' exists, it contains, in particular, an element of order 7. If q' isthe cardinal ofIF', we must have that q' - 1 is divisible by 7. But q' = qm, forsome positive integer m, and hence m must satisfy that qm - 1 is divisibleby 7. As q = 2, the first such m is 3, and so it is natural to try IF' = IFa.

Now the group IFgis cyclic of order 7, and hence

X 7- 1 = II (X - 0:).

a ElFg

The root 1 corresponds to the factor X -1. Ofthe other 6 roots, 3 must be theroots of h and the other 3 the roots of h . The actual partition depends, ofcourse, on the explicit construction of IF' . Since [IF' : IF] = 3, we can defineIF' as the extension of IF by an irreducible polynomial of degree 3, say h :IF' = IFI (X3 + X + 1) . Ifwe let u = [X] (to use the notation in Examplesof factorization), then u is a root of h. Being in characteristic 2, u 2 and

u4 = u2 + u are the other roots of hand h = (X - u)(X - u2)(X - u4 ) .

So u3 = u + 1, (u + 1)2 = u2 + 1 and (u2 + 1)2 = u4 + 1 = u2 +u + 1 arethe roots of12 and 12 = (X -(u+1))(X -(u2+1))(X -(u2+u+1)). 0

The ideas involved in the last two paragraphs of the preceeding examplecan be played successfully to find the factorization of X" - lover IFq for allnand q (provided gcd (n, q) = 1), as we will see in the next subsection.

156

The splittingfield ofX" - 1

Chapter 3. Cyclic Codes

The condition gcd (n, q) = 1 says that [q]n E Z~ and hence we may considerthe order m of [q]n in that group. By definition, m is the least positive integersuch that qm == 1 (mod n) . In other words, m is the least positive integer suchthat nl(qm -1) and we will write en(q) to denote it. For example, e7(2) = 3,because 23 = 8 == 1 (mod 7). The value of en(q) can be obtained with thefunction order(q,n).

lorder(2,7) -+ 3lorder(2,15) -+ 4Iorder(2.23) -+ 11Iorderts.t t) -+ 5

Let now h E IF[X] be any monic irreducible polynomial of degree m =en(q) and define IF' = lFqm as IF[XJI(h). Let a be a primitive element ofIF' (if h were a primitive polynomial, we could choose a = [X]h). Thenord(a) = qm - 1 is divisible by n, by definition ofm. Let r = (qm - l)/nandw = o" .

3.8 Proposition. Over IF' we have

n-l

x» -1 = II (X - wi).i=O

Proof: Since ord(w) = (qm - l)/r = n, the set R = {wi I0 ~ j ~ n - I}has cardinal n. Moreover, wi is a root of X" - 1 for all j, because (wi)n =

(wn)i = 1. So the set R supplies n distinct roots of X" - 1 and henceIli,:-J (X - wi) is monic polynomial of degree n that divides X" - 1. Thusthese two polynomials must be the same. 0

3.9 Proposition. IF' = IF[w] and so IF' is the splittingfield ofX" - lover IF.

Proof: Indeed, if IlF[wJl = s', then n = ord(w) divides qS - 1 and bydefinition of m we must have s = m . 0

Cyclotomic classes

To proceed further we need the notion of cyclotomic classes . Given an inte­ger j in the range O..(n -1), the q-cyclotomic class of j modulo n is definedas the set

0i = {j ,qj .. . , qT- l j } (modn),

3.2. Effective factorization of X" - 1 157

where r is the least positive integer such that qTj == j (mod n) . For example,Co = {O} always, and if n = 7 and q = 2, then C1 = {l ,2 ,4} andC3 = {3, 6, 5}. If n = 11 and q = 3, then C1 = {l , 3,9, 5,4} and C2 ={2 , 6, 7, 10, 8}.

E.3.12. Show that if Cj n Ck i 0 then Cj = Ci , So the distinc t cyclotomicclasses form a partition of Zn . 0

The function cyclotomlcctassq.n.q) yields the cyclotomic class Cj . In thecase q = 2 we can use cyclotomlcclassq.n).

I cyclotomic_class (1,7) ~ {1, 2, 4}Icyclotomic_c1ass(3,7) ~ {3, 6, 5}Icyclotomic_c1ass(1,11,3) ~ {1 , 3, 9, 5, 4}Icyclotomic_c1ass(2,11,3) ~ {2, 6, 7,10, 8}

The function cyclotomic_c1asses(n,q) yields the list of q-cyclotomic classesmodulo n. In the case q = 2 we can use cyclotom ic_c1asses(n).

~ cyclotomic_c1asses(7) ~ {{O}, {1, 2, 4}, {3 , 6, 5}}~ cyclotomic_c1asses(11,3) ~ {{O}, {1, 3, 9, 5, 4}, {2 , 6, 7, 10, 8}}

The factorization theorem

If C is a q-cyclotomic class modulo n, define fe as the polynomial withroots wj , for j E C:

fe = II (X - wj ) .jEe

3.10 Lemma. The polyn omial fe has coeffic ients in F fo r all C .

Proof: Let us apply the Frobenius automorphism x I---t x q to all coefficientsof Ic. The result coincides with

But from the definition of q-cyclotomic class we have {j q Ij E C } = C andso

II (X - wjq) = fe·jEe

This means that aq = a for all coefficients a of fe and we know that thishappen s if and only if a E F . 0

158 Chapter 3. Cyclic Codes

3.11 Theorem. The correspondence C f---t fe is a bijection between the setofq-cyclotomic classes modulo n and the set ofmonic irreducible factors ofX" - lover lFq.

Proof The factorization of X" -lover IF' established in lemma 3.8 and thefact that the q-cyclotomic classes modulo n form a partition ofZn imply that

x»-1 = lIfe,e

where the product runs over all q-cyclotomic classes modulo n. Thus itis enough to prove that fe E IF[X] is irreducible. To see this, note that{wi Ij E C} is the conjugate set of any of its elements and so fe is, byproposition 2.40, the minimum polynomial over IF of wi for any j E C.Therefore fe is irreducible . 0

E.3.13. If n is divisible by p (the characteristic of IF), the factorization ofX" - 1 can be reduced easily to the case gcd (n , q) = 1. Indeed, if n = n' p",where p is prime and gCd(n',p) = 1, show that X" - 1 = (xn' - l)ps.Consequently the number of distinct irreducible factors of X" - 1 coincideswith the number of q-cyclotomic classes modulo n' and each such factor hasmultiplicity p" ,

The factorization algorithm

Here are the steps we can follow to factor X" - lover IFq. If n = n'pS withgcd (n' , q) = 1, factor xn' - 1 by going through the steps below for the casegcd (n , q) = 1 and then count every factor with multiplicity v' .

So we may assume that gcd (n, q) = 1 and in this case we can proceedas follows :

1) Find m = en(q), the order of q in Z~ . Define r = (qm - l)jn.

2) Find and irreducible polynomial h E IF[X] of degree m and let IF' =

IF[X]j(h).

3) Find a primitive element a of IF' and let w = o" .

4) Form the list C of q-cyclotomic classes C modulo m , and for eachsuch class C compute fe = TIiEdX - wi) .

5) The {fe}cEe is the set ofmonic irreducible factors of X" - 1 (hence

x» -1 = TIeEc Ic).

It is instructive to compare this algorithm with the listing Script for thefactorization algorithm and to look at the example attached there.

For another example, see Factorization of X 17 - lover Z13 '

3.2. Effective factorization ofX" - 1

"'I Librarv I Scriot for the factorization akiorithrn of Xn-1 over K

m=order(q,n)h=irreducible_polynomial (K,m,x)

F=K[x]/(h); r=(qm-1 )/n

a=primitive_element(F); ro=ar

C=cyclotomic classes(n,q)

In(X-roi) with c in c)j m c

159

ExampleIn= 11; q=3; K=71q;

Im=order(q,n);I h=irreducible_polynomial(K, rn, x):

IF=K[x]/(h); r=(qm-1)/n .... 22

Ia=primitive_element(F); ro=ar; order(ro) .... 11IC=cyclotomic_c1asses(n,q);

In(X-roi) with c in c)pnc.... {X+2, XS+X4+2 ·X3+X2+2, XS+2·X3+X2+2 ·X+2}

Ifactor(X 1L1,713) .... (X+2) ' (XS+2·X3+X2+2·X+2) · (XS+X4+2 ·X3+X2+2)

Factorization of X17-1 over F=7113In=17; q=13; K=71q; m=order(q,n) 4

Ih=irreducible_polynomial(K,m,x) x4+11

IF=K[x]/(h); r=(qm-1)/n 1680

Ia=primitive_element(F) 6 'x3+3'x2+2 'x+1

Iro=ar .... 10 'x3+2'x2+8 'x+1I order(ro) .... 17

IC=cyclotomic_c1asses(n,q).... {{O}. {1, 13,16, 4}. {2, 9,15, 8}. {3, 5,14, 12}. {6,10, 11, 7}}

If(c):= Il (x-roi);

J In c

If(C1), f(C2), f(C3) .... X+12, X4+9 'X3+9 'X+1 , X4+10 ·X3+9 ·X2+10·X+1

If(C4), f(Cs) .... X4+2·X3+5 ·X2+2·X+1 , X4+6 ·X3+6 ·X2+6 ,X+1

160

Summary

Chapter3. CyclicCodes

• The splitting field of X " - lover a finite field IF = IFq is, assuminggcd (n, q) = 1 and setting m = en(q), the field IF' = lFqrn. If ais a primitive element of IF', r = (qm - 1) / nand w = o", thenX" - 1 = IT~ (X - wi)J=l .

• Cyclotomic classes modulo n and the factorization theorem, which as­serts (still with the condition gcd (n , q) = 1) that the monic irreduciblefactors of X n - lover IF are the polynomials fe = ITiEdX - wi ),where C runs over the set of q-cyclotomic classes modulo n .

• The factorization algorithm. If p is the characteristic of IF and n =n'ps with gCd(n',p) = 1, then X" - 1 = (xn' - I)PS reduces theproblem to the factorization of x« - 1. If n is not divisible by p(this is equivalent to saying that gcd (n , q) = 1), then the factorizationtheorem gives an effective factorization procedure.

• Cyclotomic polynomials (they are introduced and studied in the prob­lems P.3.7-P.3.8).

Problems

P.3.5. Show that a q-ary cyclic code C of length n is invariant under thepermutation 0" such that O"(j) = qj (mod n).

P.3.6. We have seen that over Z, we have Xn - 1 = (X - 1)9091, where

90 = (X5- X 3 + X 2

- X-I) and 91 = (X5 + X 4- X 3 + X 2

- 1).

1) With the notations introduced in the example included in Script forthe factorization algorithm, show that the roots of 90 (91) are wi,where j runs over the nonzero quadratic residues (over the quadraticnon-residues) of :In.

2) If k E :li1 is a quadratic non-residue (so k E {2, 6, 7,8, 10}) and 1rk isthe permutation i ~ ki of :lll = {O, 1, ... , 10}, prove that 1rk(Cgo) =C91 (here then are two distinct cyclic codes that are equivalent) .

P.3.7. In the factorization of X" - 1 E lFq[X] given by lemma 3.8, let

a; = IT (X -wi)gcd (j ,n)=l

(this polynomial has degree <p(n) and is called the n-th cyclotomic polyno­mial over lFq). Prove that:

3.2. Effective factorization of X" - 1 161

1) x» - 1 = I1dln Qd·

2) Deduce from 1 that Qn E Zp[X] for all n , where p is the characteristicoflFq • Hint: Use induction on n .

3) Prove thato; = II(Xnjd - l)fl(d).

din

Note in particular that Qn is independent of the field IFq, and that itcan be computed for any positive integer n, although it can be then-th cyclotomic polynomial over IFq (or over Zp) if and only if n isnot divisible by p. Hint: Use the multiplicative version of Mobiusinversion formula.

4) Use the formula in 3 to prove that if n is prime then

Qn = X n-1 + X n-2 + ...+ X + 1

and

For example, Q5 = X 4 + X3 + X 2 + X + 1 and

5) If m denotes the order of q modulo n (so we assume gcd (n, q) = 1),prove that Qn factors over IFq into ep(n)jm distinct irreducible poly­nomials of degree m each. For example, e25 (7) = 4 and ep(25) = 20,and so Q25 splits over Z7 as the product of five irreducible polynomi­als of degree 4 each (see Computation of cyclotomic polynomials).Hint: the degree over IFq of any root of Qn is equal to m .

P.3 .8. If Pq(n, X) is the product of all monic irreducible polynomials ofdegree n over IFq, prove that

Pq(n, X) = IIQm(X)m

for all n > 1, where the product is extended over the set M(q, n) of alldivisors m > 1 of qn - 1 for which n is the multiplicative order of q modulom (see Computation of the set M(q,n)). Hint : First show that the roots ofPq(n ,X) are the elements of degree n in IFs":

162

"'I Library I Computation of cyclotomic polvnomlals

Q(n :71, X :Identifier) check n."O .­begin

local uernurnoeblusif n=1 then

XOelse Il (xn/d_1 )ll(d)

d in divisors(n)end

end

Chapter 3. Cyclic Codes

I'"

I{Q(n, X) with n in 1..10 such_that not prime?(n)}... {1, X2+1, X2-X+1, X4+1 , X6+X3+1 , XLX3+XLX+1}

IQ(2S, X) ... X20+X15+X10+X5+1

If=factor(X20+X15+X10+X5+1, 7l7 ) : f={r1 with r in f};

If1' f2, f3... X4+2·X3+4·X2+2·X+1, X4+4 'X3+4 'X+1, X4+4·X3+3·X2+4,X+1

If4, f5 ... X4+S·X3+S·X2+S ·X+1, X4+6·X3+S ·X2+6 ·X+1

"'I Library I Computation of the set M(Q,n) I'"

M(q,n) :=begin

local D=tail(divisors(qn-1)) , M={}for m in 0 do

if gcd(q,m)=1 & order(q,m)=n thenM=M I {m}

endend

endSpecial call for the binary case

IM(n) := M(2,n)

~{M(n) with n in 2..7} ... {{3}, {7}, {S, 1S}. {31}, {g, 21, 63}. {127}}

I{M(3,n) with n in 2..S}... {{4, B}, {13, 26}, {16, S,10, 20, 40, BO}, {11, 22,121, 242}}

3.12 Remark (On computing cyclotomic polynomials). The procedure ex­plained in P.3.7 to find the n-th cyclotomic polynomial involves cumbersomecomputations for large n. For example, if n = II! then Q(n, X) is a poly­nomial of degree <p(n) = 8294400 with 540 terms (the number of divisorsof n) and it took 45 seconds to evaluate in a Compaq Armada E500. For­tunately there is a better scheme based on P.3.9. It is presented in Efficientcomputation of cyclotomic polynomials and is the one implemented inthe internal function cyclotomic_polynomial(n,T).

3.2. Effective factorization of X " - 1 163

For the key mathematical properties of the cyclotomic polynomials on whichthis function is based, see P.3.9.With the internal function, and n = 111, Q(n, X) was obtained in 0.23 sec­onds, while the same function coded externally (the last quoted listing) took0.25 seconds. If we want to calculate a list of cyclotomic polynomials, thedirect method may be much better, because it does not have to factor andfind the divisors of many integers .

A' Library I Efficient computation of cyclotomic polynomials

cyclotomic_polynomial (n :71., T: Identifier) check n>O :=begin

P

local P=factor(n), s=length(P), p=P1,1' m=p, i=1, f= I Tp-j

j=1while i<s do

i=i+1; P=Pi,1; m=m -p

evaluate (f,TP)f f

endevaluate (f,Tn/m)

end

P.3.9. Let Qn(X) be the n-th cyclotomic polynomial (n ~ 1). Prove that:

1) Qmn(X) divides Qn(xm) for any positive integers m and n.

2) Qmn(X) = Qn(Xm) if every prime divisor ofm also divides n.

3) For any n, Qn(X) = Qr(xn/r), where r is the product of the primedivisors ofn.

164 Chapter 3. Cyclic Codes

3.3 Roots of a cyclic code

An alternative specification of a cyclic code can be made in

terms of the roots, possibly in an extension field, of the genera­

tor g(X) of the ideal.

w.w. Peterson and E.J. Weldon, Jr., [16], p. 209.

Essential points

• Determination of a cyclic code by the set of roots of its generatingpolynomial.

• Bose-Chaudhuri-Hocquenghem (BCH) codes. The BCH lower boundson the minimum distance and the dimension.

• The Reed-Solomon primitive codes are BCH codes.

Introduction

Let C be the cyclic code of length n and let 9 be its generating polynomial.By definition, the roots of C = Cg are the roots of 9 in the splitting fieldlFt = IFqm of X" - lover IFq ' If w E IFqm is a primitive n-th root of unityand we let Eg denote the set of all k E Zn such that wk is a root of g, thenwe know that Eg is the union of the q-cyclotomic classes corresponding tothe monic irreducible factors of g.

Let E~ ~ Eg be a subset formed with one element of each q-cyclotomic

class contained in Eg • Then we will say that M = {wk IkE E~} is aminimal set ofroots of Cg .

3.13 Proposition. IfM is a minimal set ofroots ofa cyclic code C oflengthn, then

C = {a E IF[x]n Ia(~) = 0 for all ~ EM}.

Proof If a E C, then a is a multiple of g, where 9 is the generating polyno­mial of C, and hence it is clear that a(~) = 0 for all ~ E M. So we have theinclusion

C ~ {a E IF[x]n Ia(~) = 0 for all ~ EM}.

Now let a E IF[x]n and assume that ax(~) = a(~) = 0 for all ~ E M .By definition of M, there is j E E~ such that ~ = wi, and so wi is a root

3.3. Roots ofa cyclic code 165

of ax . Since the minimal polynomial of wi over IF is lop we have thatIOj lax and hence ax is divisible by rriEE~ IOj = g, and this means that

a E (g) = Cg • 0

E.3.14. If we define a generating set ofroots of Cg as a set of roots of Cg

that contains a minimal set ofroots, check that Proposition 3.13 remains trueif we replace 'minimal set' by 'generating set'. 0

The description of Cg given in Proposition 3.13 can be inverted. If

are n-th roots of unity, then the polynomials a E IF[x]n such that a(~i) = 0,i = 1, . . . ,r, form an ideal Ce oflF[x]n, which will be called the cyclic codedetermined bye. If gi is the minimum polynomial of ~i over IFq , then Ce =Cg , where 9 is the fern of the polynomials g1, . . . , gr . This observation willbe used below for the construction of codes with predetermined properties.

Finding a control matrix from the roots

Let us find a control matrix of Ceo For each i E {I, .. . , r}, the con­dition a(~i) = 0 can be understood as a linear relation, with coefficients1, ~i, " " ~r-l E IFq(~d, that is to be satisfied by the coefficients (that we

will regard as unknowns) ao, . . . , an- I of a. If we express each ~{ as alinear combination of a basis of IFqm over IFq» the relation in question canbe understood as m linear relations that are to be satisfied by ao, .. . ,an-I .Therefore the conditions that define Ce are equivalent to mr linear equa­tions with coefficients in IFq • A control matrix for Ce can be obtained byselecting a submatrix H of the matrix A ofthe coefficients ofthis linear sys­tem of equations such that the rows of H are linearly independent and withrank(H) = rank(A) (for an implementation of this method, see the listingin Remark 4.2).

3.14 Example (The Hamming codes revisited). Let m be a positive integersuch that ged (m, q-1) = 1, and define n = (qm -1)/(q-1) . Letw E IFq1n

be a primitive n-th root of unity (if 0: is a primitive element of IFq1n, takew = o:q-I). Then Cw is equivalent to the Hamming code of codimension m,Hamq(m) .

Indeed, n = (q - 1)(qm-2 + 2qm- 3 + . . . + m - 1) + m , as it caneasily be checked, and hence ged (n, q - 1) = 1. It follows that wq -

1 is aprimitive n-th root of unity and hence that wi(q-I) =11 for i = 1, .. . , n - 1.In particular, wi ~ IFq. Moreover, wi and wi are linearly independent over IFq

if i =I j. Since n is the maximum number ofvectors of IFq1n that are linearlyindependent over IFq (cf. Remark1.39), the result follows from the above

166 Chapter 3. Cyclic Codes

description of the control matrix of Cw and the definition of the Hammingcode Hamq(m) (p. 51).

BCHcodes

An important class of cyclic codes, still used a lot in practice,

was discovered by R. C. Bose and D. K. Chaudhuri (1960) and

independently by A. Hocquenghem (1959) .

J. H. van Lint, [25], p. 91.

Let w E IFq"' be a primitive n-th root of unity. Let 0 ;:: 2 and e ;:: 1 beintegers. Write BCHw(0,e) to denote the cyclic code of length n generatedby the least common multiple 9 of the minimal polynomials gi = Pwe+i,i E {O, ... , 0 - 2}. This code will be said to be a BCH code (for Bose­Chaudhuri-Hocquenghem) with designed distance 0 and offset e. In the casee = 1, we will write BCHw(o) instead of BCHw(o, 1) and we will say thatthese are strict BCH codes. A BCH code will be said to be primitive ifn = qm - 1 (note that this condition is equivalent to say that w is a primitiveelement of IFq"') '

3.15 Theorem (BCH bound). If d is the minimum distance ofBCHw(o, e),then d ;:: s.Proof' First note that an element a E IF[x]n belongs to BCHw(O,e) if andonly if a(wH i) = 0 for all i E {O, . . . ,0 - 2}. But the relation a(wH i) = 0is equivalent to

and therefore(1,wH i , w2(Hi) , . . . ,w(n-l)(Hi»)

is a control vector of BCHw(0,e). But the matrix H whose rows are thesevectors has the property, by the Vandermonde determinant, that any 0 - 1 ofits columns are linearly independent. Indeed, the determinant ofthe columnsjI, . . . .i s- i is equal to

Wi O- 1l

wio-d£+l)

w i O-1(£+0- 2)

1wi O- 1

which is nonzero if jl, ' " ,jO-l are distinct integers . o

3.3. Roots ofa cyclic code 167

3.16 Example. The minimum distance of a BCH code can be greater thanthe designed distance. Let q = 2 and m = 4. Let w be a primitive elementof IF16. Since w has order 15, we can apply the preceeding results in thecase q = 2, m = 4 and n = 15. The 2-cyclotornic classes modulo n are{I, 2, 4, 8}, {3, 6,12, 9}, {5,1O}, {7, 14, 13, 11}. This shows, if we writeCo = BCHw (i5) and do = dC6 for simplicity, that C4 = C5, hence d4 =d5 ~ 5, and C6 = C7, hence d6 = d7 ~ 7. Note that the dimension ofC4 = C5 is 15-2·4 = 7, and the dimension ofC6 = C7 is 15-2·4-2 = 5.

3.17 Example. This example is similar to the previous one, but now q = 2and m = 5. Let w be a primitive element of1F32. The 2-cyclotomic classesmodulo 31 are

{1,2,4,8,16},{3,6,12,24,17},{5,10,20,9,18},

{7,14,28,25, 19}, {11,22, 13,26,21},{15,30,29,27,23}

and so we see, with similar conventions as in the previous example, thatC4 = C5, C6 = C7, Cs = Cg = ClO = Cn and CI 2 = CI 3 = CI4 = C15.Therefore d4 = d5 ~ 5, d6 = d7 ~ 7, ds = dg = d lO = dn ~ 11 anddI2 = dI3 = dI4 = dI5 ~ 15. If we write ko = dim (Co), then we havek4 = 31 - 2 . 5 = 21, k6 = 31 - 3 . 5 = 16, ks = 31 - 4 . 5 = 11 andkI 2 = 31 - 5 . 5 = 6.

E.3.15. Let C be a cyclic code of length n and assume that w i is a root of Cfor all i E I, where I ~ {a, 1, . . . , n -I}. Suppose that III = 8 -1 and that(i + 1 mod n) E I for all i E I. Show that the minimum distance of C isnot less than 8.

E.3.16. If w is a primitive element of1F64, prove that the minimum distanceof BCHw(16) is ~ 21 and that its dimension is 18. 0

In relation to the dimension of BCHw(8, f), we have the following bound:

3.18 Proposition. Ifm = ordn(q), then

dim BCHw(8,f ) ~ n - m(8 -1).

Proof If9 is the generating polynomial of BCHw (8, f), then

dim BCHw (i5, f) = n - deg (g).

Since 9 is the least common multiple of the minimum polynomials

Pi = Pwl+i, i = 1, . . . ,f -1,

and deg (Pwl+i) ~ [IFqrn : IFq] = m, it is clear that deg (g) ~ m(8 - 1), andthis clearly implies the stated inequality. 0

168 Chapter 3. Cyclic Codes

Improving the dimension bound in the binary case

The bound in the previous proposition can be improved considerably forbinary BCH codes in the strict sense . Let C, be the 2-cyclotomic class of imodulo n . If we write Pi to denote the mimimum polynomial ofwi, where wis a primitive n-th root of unity, then Pi = P2i, since 2i mod n E Ci. It turnsout, if t ;::: 1, that

Icm(pl ,p2, .. . ,P2t) = Icm(pl,p2, ' . . ,Pu-d = Icm(pl,P3, ... ,P2t-l) . [3.5]

Now the first equality tells us that BCHw(2t + 1) = BCH w(2t), so that it isenough to consider, among the binary BCH codes in the strict sense, thosethat have odd designed distance.

3.19 Proposition. In the binary case, the dimension k of a binary code

BCH w(2t + 1) of length n and designed distance 8 = 2t + 1 satisfy thefollowing inequality:

k ;::: n-tm,

where m = ordn(2).

Proof Let 9 be the polynomial [3.5]. Since the first expression ofgin [3.5]is the generating polynomial of BCHw(2t + 1), we know that k = n ­deg (g) . Butlooking at the third expression ofgin [3.5] we see that deg (g)is not higher than the sum of the degrees of PI , P3, ... ,P2t-1 and thereforedeg (g) ~ tm because the degree of each Pi is not higher than m . 0

Table 3.1: Strict BCH binary codes of length 15 and 31, where P il ,... ,i r

8 9 k dI 1 15 13 PI 11 35 PI,3 7 57 PI ,3,5 5 79-15 X l 5 -1 0 -

8 9 k d1 1 31 13 PI 26 35 PI,3 21 57 PI ,3,5 16 79,11 PI ,3,5,7 11 1113,15 PI,3,5,7,9 6 1517-31 X31 -1 0 -

3.20 Example. The lower-bound of the above proposition, when applied tothe code BCH w(8) = BCHw(9) of the example 3.17, gives us that

k ;::: n - tm = 31 - 4 . 5 = 11.

Since the dimension of this code is exactly 11, we see that the bound inquestion cannot be improved in general (cf. P.3.18). 0

3.3. Roots ofa cyclic code 169

E.3.17. Let f = x 4 + x 3 + x 2 + X + 1 E Z2[XJ, F = Z2[Xj/(J) anda a primitive element of F. Find the dimension and a control matrix ofBCHcr(5).

The Golay code is probably the most important of all codes, for

both practical and theoretical reasons.

FJ. MacWilliams, N.J.A. Sloane, [II], p. 64.

3.21 Example (The binary Golay code). Let q = 2, n = 23 and m = ordn(2)

= 11. So the splitting field of X 23 - 1 E Z2[X] is L = IF211. The 2­cyclotomic classes modulo 23 are as follows :

Go = {O}Gl = {1,2,4,8,16,9,18,13,3,6,12}

G5 = {5,10,20,17,11,22,21,19,15,7,14}.

If w E L is a primitive 23rd root of unity, the generating polynomial ofG = BCHw(5 ) is

g = Icm(Pl ,P2,P3 ,P4) = Pl·

As deg (PI) = IGII = 11, it turns out that dim (G) = 23 - 11 = 12.Moreover, G has miminum distance 7 (see E.3.18), and so G is a perfectbinary code of type [23,12,7]. It is known that all codes of type (23,212 ,7 )are equivalent to G (cf. [25] for an easy proof) and any such code will besaid to be a binary Golay code.

E.3.18. Show that the minimum distance of the binary code introduced in theexample 3.21 is 7. Hint: Prove that the parity extension of G has minimumdistance 8 arguing as in Example 3.6 to prove that the extended ternary Golaycode has minimum distance 6 (for another proof, see P.3.l1).

Primitive Reed-Solomon codes

NASA uses Reed-Solomon codes extensively in their deep space

programs, for instance on their Galileo, Magellan and Ulysses

missions.

Roman [1992] , pag. 152.

We have introduced RS codes in Example 1.24. It turns out that in thecase n = q - 1 the RS codes are a special type of BCH primitive codes .Before proving this (see Proposition 3.22) , it will help to have a closer lookat such codes . Let IF be a finite field and let w E IF be a primitive element.

170

The binary Golay code

In=23; g=factor(xn-1 ,712)2,1;

G=cyclic_normalized_matrix (g,n)1100011101010000000000001100011101010000000000111101101000010000000000111101101000010000000000111101101000010000000

-+ 11011001100000001000000011011001100000001000000011011001100000001000011011100011000000001000101010010110000000001001001001111100000000001010001110101000000000001

Chapter 3. Cyclic Codes

Let 8 be an integer such that 2 ~ 8 ~ n . Consider the code C = BCHw(8).The length of this code is n = q - 1 and its generating polynomial is

as the minimum polynomial of wi over IF is clearly X - wi . In particular,k = n - 8 + 1, where k is the dimension of C . Thus 8 = n - k + 1, andsince d ~ 8, by the BCH bound, and d ~ n - k + 1, by the Singleton bound,we have that d = 8 = n - k + 1, and so C is an MDS code.

3.22 Proposition. The code BCHw (8) coincides with the Reed-Solomon code

RSI ,w,oo .,wn-1 (n - 8 + 1).

Proof' According to Example 1.30, the Vandermonde matrix

is a control matrix of C = RSI ,w,oo .,wn-1 (n - 8 + 1). The i-th row of His (1,wi, . . . ,wi(n-I») and so the vectors a = (ao, all . . . ,an-I) of Carethose that satisfy ao + alwi + ...+ an_Iwi(n-l) = 0, i = 1, . . . , 8 - 1. Interms of the polynomial ax , this is equivalent to say that wi is a root of ax

for i = 1, . . . , 8 - 1 and therefore C agrees with the cyclic code whose rootsare w, ... ,wo- I . But this code is just BCH w(8) . 0

Summary

• Determination of a cyclic code by the set of roots of its generatingpolynomial. Construction of a control matrix from a generating set ofroots.

3.3. Roots ofa cyclic code 171

• The BCH codes and the lower bound on the minimum distance (theminimum distance is at least the designed distance) . Lower bounds onthe dimension (the general bound and the much improved bound forthe binary case).

• The primitive RS codes are BCH codes.

Problems

P.3.10 (Weight enumerator of the binary Golay code). Let a = L;~o aizi

and a = L;~o aizi be the weight enumerators of the binary Golay code Cand its parity completion C, respectively.

1) Prove that ai = a23-i for i = 0, . .. , 23 and ai = a24- i for i =0, . . . , 24.

2) Since C only has even weight vectors and its minimum distance is 8,we have

Use the MacWilliams identities for C to show that as = 759,0,10 = 0and 0,12 = 2576 .

3) Show that a7 + as = as, ag = alO = 0 and all = a12 = 0,12/2.

4) Use the MacWilliams identities for C to prove that a7 = 253 andas = 506. Hence the weight enumerator of C is

P.3.11. The complete factorization over &::2 of X 23 - 1 has the form

where 90 and 91 have degree 11. The binary Golay code C has been definedas Cgo .

1) With the notations of the example 3.21, show that the roots of 90 (91)have the form wi, where j is a nonzero quadratic residue (a quadraticnon-residue) of &::23.

2) Using that k = -1 is not a quadratic residue modulo 23, check that1r-1 is the map a I--t a that reverses the order of vectors. In partic­ular we have go = 91 (this can be seen directly by inspection of thefactorization of X 23 - lover &::2).

172 Chapter 3. Cyclic Codes

P.3.12 A code C oflength n is called reversible ifa = (ao, a I, . . . , an- I) EC implies that a= (an- I, an- 2, .. . , ao) E C.

l) Show that a cyclic code is revers ible if and only if the inverse of eachroot of the generating polynomial 9 of C is again a root ofg.

2) Describe the form that the divisor 9 of X" - 1 must have in order thatthe cyclic code Cg is reversible.

P.3.13 . For n = 15 and n = 31, justify the values of g, k and d on the table3.1 corresponding to the different values ofo.

P.3.14 Find a control matrix for a binary BCH code oflength 31 that corrects2 errors.

P.3 .15. Find a generating polynomial of a BCH code of length 11 and de­signed distance 5 over Z2.

P.3.16. Find the dimension ofa BCH code over !Fa oflength 80 and with thecapacity to correct 5 errors.

P.3.17. Calculate a generating polynomial of a ternary BCH code oflength26 and designed distance 5 and find its dimension.

P.3.18 . Let m ;:: 3 and t ;:: 1 be integers, a a primi tive element of !F2m andk(m, t) the dimension of the code BCHa (2t + 1) (thus we assume that 2t <2m - I ). Let! = Lm/2J andtm = 2/- 1• Prove that k(m,t) = 2m -l-m tif t :::;; tm .

3.4. The Meggitt decoder

3.4 The Meggitt decoder

173

The basic decoder of Section 8.9 was first described by Meggitt

(1960).

w.w. Peterson and EJ. Weldon, Jr., [16], p. 265.

Essential points

• Syndrome of the received vector (polynomial).

• The Meggitt table .

• The Meggitt decoding algorithm.

Syndromes

Let 9 E IF[x] be the generating polynomial of a cyclic code C of length nover IF. We want to implement the Meggitt decoder for C (as presented,for example, in [15], Ch. XVII). In this decoder, a received vector Y =[Yo, ... ,Yn-l] is seen as a polynomial YO+YIX+' . '+ Yn_l Xn- 1 E IF[x]n andby definition the syndrome ofy, S(y), is the remainder ofthe Euclidean divi­sion ofY by 9 (in computational terms, remainder(y,g). The vectors with zerosyndrome are, again by definition , the vectors ofC. Note that since 9 dividesX" - 1, S(y) coincides with the n-cyclic reduction ofremainder(y(X),g(X).In the sequel we will not distinguish between both interpretations.

3.23 Proposition. We have the identity S(xy) = S(xS(y)).

Proof By definition ofS(y), there exists q E IF[x]n such that y = qg+S(y) .Multiplying by x, and taking residue mod g, we get the result. 0

3.24 Corollary. Ifwe set So = S(y) and Sj = S(x jy), j = 1, . .. ,n - 1,then Sj = S(XSj-l). 0

The Meggitt table

Ifwe want to correct t errors, where t is not greater than the error-correctingcapacity, then the Meggitt decoding scheme presupposes the computationof a table E of the syndromes of the error-patterns of the form axn - 1 + e,

174 Chapter 3. Cyclic Codes

where a E IF* and e E IF[x] has degree n - 2 (or less) and at most t - 1non-vanishing coefficients.

3.25 Example (Meggitt table of the binary Golay code). The binary Golay codecan be defined as the length n = 23 cyclic code generated by

9 = xll + x9 + x 7 + x6 + x5 + X + 1 E Z2[X]

and in this case, since the error-correcting capacity is 3 (see Example 3.21),the Meggitt table can be encoded in a Divisor as follows:

Meggitt table for the binary Golay codeI n=23; R=O..(n-2) ;

Ig=x11 +x9+x7+X6+x5+x+ 1 : 7l2[xj ;

The table

IE1=[remainder(xn- 1,g)-+xn- 1j ;

IE2= [remainder(xn- 1+xi, g) -+xn- 1+xi with i in Rj;

IE3= [remaindertx'l"" +xi+xi, g) -+xn- 1+xi+xi with (i,j) in (R,R) where j<i l :I E=E1 +E2+E3 ;

Example

Is=remainder(xn- 1+x14+x3,g) - X10+x9+ x8+ x7+x5+x2+x+ 1

IE(s) _ x22+x14+x3

Thus we have that E (s) is°for all syndromes s that do not coincide withthe syndrome of x 22 , or of x 22 + xi for i = 0, ... ,21, or of x 22 + xi + xifor i,j E {O, 1, . . . , 21} and i > j. Otherwise E(s) selects, among thosepolynomials, the one that has syndrome s.

3.26 Example (Meggitt table of the ternary Golay code). The ternary Golaycode can be defined as the length II cyclic code generated by

and in this case, since the error-correcting capacity is 2, the Meggitt tablecan be defined as in the listing Meggitt table for the ternary Golay code.

The Meggitt algorithm

If y is the received vector (polynomial), the Meggitt algorithm goes as fol­lows:

I) Find the syndrome s = So of y.

2) If s = 0, return y (we know y is a code vector).

3.4. The Meggittdecoder

Meggitt table for the ternary Golay codeI n=11; R=O..(n-2) ;I U={1,-1};

Ig=X5+X4_X3+x2-1 : 7l3[X) ;

The table

IE1=[remainder(a ·xn- 1, g)-+xn- 1 with a in U);

IE2= [remainder(a'xn- 1+b'xi, g)-+a·xn- 1+b ·xi with (i,a,b) in (R,U,U)J ;

I E=E1+E2;Example

Is=remainder(-xn- 1+x5,g) ... x4+2·x+1

IE(s) .... -x10+x5

175

3) Otherwise compute , for j = 1,2, . . . , n -1, the syndromes Sj of xjy,and stop for the first j ~ 0 such that e = E(s) =I O.

4) Return y - ejxj .

In The Meggitt decoder we can see the definition of the function meg­gitl(y,g,n) which implements the Meggitt decoding algorithm. Its parametersare y, the polynomial to be decoded, g, the polynomial generating the code,and n, the length. That listing also contains an example of decoding theternary Golay code. This example can be easily modified so as to produce adecoding example for the binary Golay code. Note that Meggitt decoding ofthe Golay codes is complete, as these codes are perfect.

Summary

• The syndrome S(y) ofthe received polynomial y is defined as remain­der(y,g). It is zero ifand only if y is a code polynomial. The syndromesSj = S( x jy) satisfy the relation Sj = S(XSj-l ) for j > O.

• The Meggitt table . The entries in this table have the form

S(ax n - 1 + e) -+ axn - 1 + e,

where e is a polynomial of degree < n - 1 and with at most t nonzeroterms (t is the correcting capacity) and a is a nonzero element of thefield.

• The Meggitt decoding algorithm. If S(y) =I 0, finds the smallest shiftxjy such that Sj = S( x j y ) =I O. Then we know that the term ofdegree n - 1 - j of y is an error and that the error coefficient is theleading term of E (Sj ).

176 Chapter 3. Cyclic Codes

"'I Library I The Meggit decoder (presupposes a Meggitt table E)

meggitt(y,g,n):= begin

local x=variable(g), s=remainder(y,g), j=O, eif s=o then

say("Code vector II Iy); return yendwhile E(s)=O do

j=j+1 ; s=remainder(x's,g)ende=E(s)/xi; say("Error pattern: "Ie)y=y-e

end

Example relative to the ternary Golay codeI n=11 ; R=0 ..(n-2);I U={1,-1};

Ig=x5+x4-x3+x2-1 : 7l3[x];

I S(y):=remainder(y,g);Setting up the Meggit table

IE1=[S(a ·x n- 1)-+a·xn- 1 with a in U];

IE2=[S(a ·xn- 1+b ·x i) -+ a ·xn- 1+b·xi with (l.a.b) in (R,U,U)];IE=E1+E2;

Received vector and decoding

Iy=x4+x3+x+1 x4+x3+x+1

Imeggitt(y,g,n) x9+x5+x4+x3+x+1

Problems

''''

P.3.19. Consider the Hamming code Ham2(r), regarded as the cyclic codeCa. , where Q is a primitive root oflF2r. What is the Megill table in this case?For r = 4, with Q4 = Q + 1, decode the vector l sl010 with the Meggittalgorithm.

P.3.20. Let Q be a primitive element oflF16 such that Q4 = Q + 1 and let

be the generating polynomial ofa binary BCH code of type [15,5] . Assum­ing that we receive the vector

v =000101100100011,

find the nearest code vector and the information vector that was sent.

P.3.21 Let C = BCHa.(5) , where Q E lF32 is a root of the irreduciblepolynomial X S + X 2 + 1 E Z2[X], Thus C corrects two errors . Whatis the dimension of C? Assuming that the received vector has syndrome

3.4. The Meggitt decoder 177

1110011101, and that at most two errors have occurred, find the possibleerror polynomials.

P.3.22 Let C = BCHa (7), where a E lF32 is a root of X 5+X2+1 E lF2[X J.

1) Find the generating polynomial ofC.

2) Decode the vector 0000000111101011111011100010000.

4 Alternant Codes

Alternant codes are a [...J powerful family of codes. In fact, it

turns out that they form a very large class indeed, and a great

deal remains to be discovered about them

F. MacWilliams and N.lA. Sloane, [11], p. 332

In this chapter we will introduce the class of alternant codes over IFq bymeans of a (generalized) control matrix. This matrix looks like the controlmatrix of a general Reed-Solomon code (Example 1.26), but it is definedover an extension IFqm ofIFq and the hi need not be related to the Qi. Lowerbounds for the minimum distance and the dimension follow easily from thedefinition. In fact these bounds generalize the corresponding lower boundsfor BCH codes and the proofs are similar.

The main reasons to study alternant codes are that they include, in addi­tion to general Reed-Solomon codes, the family of classical Goppa codes (aclass that includes the strict BCH codes), that there are good decoding algo­rithms for them, and that they form a family of asymptotically good codes.

For decoding, the main concepts are the error-location and the error­evaluation polynomials, and the so-called key equation (a congruence) thatthey satisfy. Any method for solving the key equation amounts to a de­coding algorithm. The most effective methods are due to Berlekamp andSugiyama. Although Berlekamp's method is somewhat better for very largelengths, here we will present Sugiyama's method, because it is consider­ably easier to understand (it is a variation of the Euclidean algorithm for thegcd) and because it turns out that Berlekamp's method, which was discov­ered first (1967), can be seen as an optimized version of Sugiyama's method(1975). Since the decoder based on Berlekamp's method is usually called theBerlekamp-Massey decoder, here the decoder based on Sugiyama's methodwill be called Berlekamp-Massey-Sugiyama decoder.

We also present an alternative method to solve the key equation based onlinear algebra, which is the basis ofthe Peterson-Gorenstein-Zierler decoderdeveloped by Peterson (1960) and Gorenstein-Zierler (1961).

180 Chapter 4. Altemant Codes

4.1 Definitions and examples

The alternant codes were introduced by H. J. Helgert (1974).

J. H. van Lint, [25], p. 147.

Essential points

• Altemant matrices. Altemant codes.

• Lower-bounds on the dimension and on the minimum distance (alter­nant bounds) .

• General RS are altemant codes . Generalized Reed-Solomon codes(GRS) and its relation to altemant codes .

• BCH codes and classical Gappa codes are altemant codes.

• The classical Goppa codes include the BCH codes.

Let K = IFq and k = IFqm • Let a I, ... , an and hI , . . . , hn be elements ofk such that hi =I 0 for all i and ai =I aj for all i =I j and consider thematrix

H = v;.(aI, ... , an) diag (hI, . .. , hn) E M~(k) , [4.1]

that is,

h

n

)'r:h r -lnan

[4.2]

We say that H is the alternant control matrix of order r associated with thevectors

h = [hI, .. . , hnJ and a = [aI, . .. , an].

To make explicit that the entries of h and a (and hence of H) lie in ic, wewill say that H is defined over k .

In order to construct the matrix 4.2, observe that each of its r rows is, ex­cept the first, the componentwise product of the previous row by a. Hencewe introduce the function componentwise_product(a, b), then use it to con­struct the Vandermonde matrix v;.(a), with vandermonde_matrix(a , r ), andfinally the formula [4.1] is used to define alternant_matrix(h,a , r ), our mainconstructor of altemant matrices. See Constructing alternant matrices.

4.1. Definitions and examples

"'I Library 1 Constructinq alternant matrices

Ialternant jnatrtxth :Vector, u .Vector. r :71)check length(h)=length(a)& r>O:=vandermonde_matrix (a, r) .diagonal_matrix (h)

Ialternant_matrix (a, r) :=alternant_matrix (a , a, r)

vandermonde_matrix(a :Vector, r :71) check r>O:= begin

local n=length(a), h=constant_vector(n,1), H= [h)for i in 1..r-1 do

h=componentwise_product(h, a)H=H &h

endend

componentwise_product(a: Vector I List, b : Vector I List)check length (a) = length (b):= [aj ' bj with i in range(a)]

(

1 1 1 1 1)abc d evandermonde_matrix( [a, b, c, d, e),4).... a2 b2 c2 d2 e2

a3 b3 c3 d3 e3

Ip=19; K=71p;

I a=[element(j,K)where prime?(j)withj in 2..p-1] .... [2,3,5,7,11 ,13,17]

(

2 3 5 7 11 13 17]4 9 6 11 7 17 4

alternant_matrix(a,5).... 8 8 11 1 1 12 1116 5 17 7 11 4 1613 15 9 11 7 14 6

181

I'"

We will also need (in the decoding processes studied in sections 2-4) thevector

which of course is defined only if all the ai are not zero. The computationof this vector is done with the function invert_entries, which yields, whenapplied to a vector v, the result of mapping the pure function x -+ l/x to allcomponents of v (see The function lnvertentrlestvj).

"'I Library 1The function invert entries

~ invert_entries ( [1..7]) .... [1 , ~, f, ~, ~ , ~ , +]

''''

182

Alternant codes

Chapter 4. Altemant Codes

The K -code A K(h , Q , r) defined by the control matrix H is the subspace ofK" whose elements are the vectors x such that x H T = O. Such codes willbe called alternant codes. Ifwe define the H-syndrome of a vector y E Kas s = yH T E K , then AK(h, Q, r) is just the sub space of K" who seelements are the vectors with zero H-syndrome. If h = Q , we will wri teAK(Q, r) instead of A K(Q , Q , r). On the other hand, we will simply writeA(h, Q , r ) or A (Q , r) when K = Z2.

The main constructor of alternant codes is altemantcodetn,0:, r ). It re­turns a table C={h_=h, a_ =0:, r_=r , b_=13, H_=H}, where 13=inverCentries(o:)and H=alternanCmatrix(h, 0: , r ). The fields of this table, say H_, can beaccessed as either C(H_) or, more conveniently, H_(C). As it will becomeclear later on, the table C stores the data constructed from h , Q and r thatare needed to decode alternant codes, and in particular by means of theBerlekamp-Massey-Sugiyama algorithm. IfK is a finite field, the functioncall alternant_code(h, 0:, r,K) adds to the table the entry K_= K, which ismeant to contain the base field of the code .

It should be remarked that in expressions such as C(H_), H_ is localto C, in the sense that there is no interference with another identifier H_used outside C, even if this identifier is bound to a value. The alternativeexpression H_(C) , however, only works properly if H_ has not been assigneda value outs ide C.

"', Library I ConstructinQ alternantcodes I'"

Ialternantcode(h :Vector, a :Vector, r :71 )check length(h)=length(a) & r>O :=

{h_=h, a_=a, c=r, b_=invert_entries(a),H_=alternant_matrix(h, a , r)}We can include a base-field as a fourth parameter

Ialternant codethVector, a:Vector, r :71, K:Field):= alternant_code(h,a,r) I {K_=K}h=a by default

I alternant_code(a:Vector, r:71):= alternant.codeto, a , r)I alternant_code(a:Vector, r :71, K:Field):= alternantcodetc.r) I {K_=K}

IC=alternant_code([2,3.4,5],3);{h_(C) , a_(C), UC), b_(C), H_(C)}

-+ {[2,3,4,5J, [2,3,4,5J,3,[~ ,~, ~, ~J. (.~ ~ 1~ ;5)}8 27 64 125

IC(U -+ 3ID=alternant_code([1 ,-1 , 1,-1], [2, 3,4,5],3, 7l2):

IK_(D) -+ 712

4.1. Definitions and examples 183

4.1 Proposition (Altemant bounds). If C = AK(h, n, r-), then n - r ~

dim C ~ n - rm and de ~ r + 1 (minimum distance altemant bound).

Proof Let H' be the rm x n matrix over K obtained by replacing eachelement of H by the column of its components with respect to a basis of tcover K. Then C is also the code associated to the control matrix H' andsince the rank of H' over K is at most rm, it is clear that dim (C) ~ n - rm.

Now the r x r minor of H corresponding to the columns il, ... , ir isequal to

1 1

h r-l r-l r-l4~ ~l ~

= hil ... hirDr(ail'" ., air),

which is non-zero (we write Dr(aill"" a ir) to denote the determinant ofthe Vandermonde matrix v;. (ail , ... , a ir)) ' This means that any r columnsof H are linearly independent over K and consequently the minimum dis­tance of C is at least r + 1 (proposition 1.25).

Finally, dim (C) ~ n + 1 - de by the Singleton bound, which togetherwith de ~ r +1 gives dim (C ) ~ n - r, And with this the proof is complete.D

4.2 Remark. The map H I---' H' explained at the beginning of the previousproof (recall also the method for finding a control matrix for the cyclic codeCe explained on page 165) can be implemented as follows: Assume K isfinite field and that F = tc is obtained as F=extension(K,f(t)), where f isan irreducible polynomial of degree mover K. Let H be a matrix of typer x n over F . Then the matrix H' of type rm x n over K obtained byreplacing each entry of H by the column of its components with respect tothe basis 1, t, . .. , tm - 1 is provided by the function blow(H,F). Note that thecall blow(h,F), where h is a vector with components in F , is the auxiliaryroutine that delivers the result of replacing each entry of h by the row ofits components with respect to 1, t, . .. ,tm - 1 (see The function blow(H,F)) .We also include the function prune(H) that selects, given a matrix H of rankr , r rows of H that are linearly independent (see The function prune(H)).

4.3 Example. Let a E IF8 and assume that a 3 = a + 1. Consider the matrix

184 Chapter 4. Altemant Codes

"'I Library I The tunction calls blowtl-l, F) and blowth, F)

blow (H :Matrix, F :Field):= begin

local X=null. H=HTfor h in H do

X= (X.blow(h.F))end[X]T

endblow(h .Vector, F :Field):=begin

local h' = []for x in h do

h'=h'l components (x, F)end

end

"'I Library 1 The function prune(H)

prune(H :Matrix):=begin

local R. r, 5=1if zero?(H) then

return []endwhile zero?(H1) do

H=tail(H)endr=rank(H) ; R=H 1; H=tail(H)while H,* [] do

if rank([R .H1]»rank([R]) then

R=(R.H 1) ; s=s+1endH=tail(H)if s=r then

breakend

end[R]

end

I'"

and let C be the altemant binary code associated to H. Let us see that C ==[7,3,4], so that d = 4 > 3 = r + 1.

First the minimum distance d of C is ;:: 4, as any three columns of Harelinearly independent over lF2 . On the other hand, the first three columns andthe column of 0:5 are linearly dependent, for 0:5 = 0:

2 + 0: + 1, and so d = 4.Finally the dimension ofCis 3, because it has a control matrix ofrank 4 over

4.1. Definitions and examples 185

IF2, as the listing Computing the dimension of alternant codes shows.

Computing the dimension of alternant codesusing the blow and prune functions

I n=7 ; r=2;

IK=712; F=K[a]/(a3+a+1);

I h=constant_vector(n,1); a=geometric_series(a, n);IH= [h, a];

(

1 1 1 1 1 1 1)1001011

prune(blow(H,F)) -+ 01 01 11 0

0010111Hence AK(h,a,r)has dimension 3

Reed-Solomon codes

Given distinct elements at, . .. , an E K, we know from Example 1.26 thatthe Reed-Solomon code C = RSCX1 ,...,CXn (k) ~ K" (see Example 1.24) hasa control matrix of the form

with

Hence

hi = 1/(II(aj - ai)) 'Hi

[4.3]

RSCXl,...,CXn(k) = AK(h,a, n - k),

where h = (hI, . . . , hn ) is given by [4.3]. Note that in this case K = K ,hence m = 1, and that the altemant bounds are sharp, because we know thatthe minimum distance of Cis n - k + 1 = r + 1, where r is the numberofrows of H, and k = n - r. The idea that Reed-Solomon codes can bedefined as a special case of altemant codes is implemented in the functioncall RS(a,k) defined in General RS codes.

4.4 Remark (Generalized Reed-Solomon codes). The vector h in the def­inition of the code RSCX1,...,CXn(k) as an altemant code is obtained from a(formula [4.3]). If we allow that h can be chosen possibly unrelated to a ,but still with components in K, the resulting codes AK(h,a,n - k) arecalled Generalized Reed-Solomon codes, and we will write GRS(h, a, k)to denote them. See GRS codes for an implementation. It should be clearat this point that GRS(h,a , k) is scalarly equivalent to RSo,(k) .

186 Chapter 4. Altemant Codes

AILibrary I General RS codes IA

RS(a:Vector, k :71)check length(a»k & k>O:= begin

local h, n=length(a)

h=[ n (Uj-Uj) with i in 1..n]j in [U-11 1~+1 ..n]

h= inverl_entries(h)anernant codeth.o .n-k) I {G_=vandermonde(a,k)} I {K_=field(a1)}

end

Ia=[elementO,7111) with j in 1..9];

I C=RS(a,5);

(

1 1 1 1 1 1 1 1 1) (9 5 10 2 32 1059)1 23456789 9 10 8 8 4 1 4 7 4

G_(C), H_(C) -+ 149533594, 9 9 2 10 9 6 6 1 3185947263 976713985154399345

Ih_(C), a_(C) -+ [9,5,10,2,3,2,10,5,9],[1,2,3,4,5,6,7,8,9]

(

0 0 0 0 )0000G_(C)'H_(C? -+ 0000

00000000

Note that by definition ofalternant codes, we have the following relation:if K is a finite field, r a positive integer and h , a E K , the linear code overK defined by the alternating control matrix H of order r assoc iated to handa is the generalized Reed-Solomon code GRS(h , a , r ) and

AK(h,a,r) = GRS(h,a,r)nKn .

AI Library I GRS codes IA

~GRS(h :Vector, a :Vector, k :71)check length(h)=length(a) &k~O &k:':;;;length(h):= alternantcodeth, a, length(h)-k, field(h1)

IK=717; a=[elementO,K) with j in 1..6]; h=geometric_series(4:K ,6);

IC=GRS(h,a,3);H_(C), K_(C), h_(C), a_(C)

(

1 4 2 1 4 2 )-+ 1 1 64 6 5 ,717, [1, 4, 2, 1, 4, 2], [1, 2, 3, 4, 5, 6]

124222

4.1 . Definitions and examples

EA.!. If h' = (hI, ... ,hn ) is a nonzero vector in the kernel of

187

prove that the dual of GRS(h , 0 , r) is GRS(h', 0 , n - r) . This result canbe seen as a generalization of the formula for the control matrix of a Reed­Solomon code.

BCH codes

BCH codes are of great practical importance for error correc­

tions, paticularly if the expected number of errors is small com­

pared with the length.

F. MacWilliams and N. 1. A. Sloane, [11], p. 257 .

Let now a be an element of R, d a positive integer and l a non-negativeinteger. Let n denote the order of a . Then we know that a control matrixof C = aCHa(d, l) (the aCH code over K associated to a with designeddistance d and offset exponent l) is

a 2(I+d-2)

a(n-l)1 )a(n-l)(I+l)

a(n-I)~I+d-2)

which is the alternant control matrix of order d - 1 associated to the vectors

h - (1 1 21 (n-l)l) d - (1 2 n-l)- , a, a , ... , a an ° - ,a , a , . . . , a .

Thus we see that C can be defined as alternantcodeta , 0 , r) with h set tothe vector geometric_series(a/ ,n), ° to the vector geometric_series(a, n),and r to d - 1 (see Constructing BCH codes). By default we take l = 1eaCH codes in the strict sense). Note that the alternant bound coincides withthe aCH bound .

EA.2. Prove that the minimum distance of the aCH code defined in Con­structing BCH codes is 7.

188 Chapter 4. Altemant Codes

AILibrary I ConstructingBCH codes IA

BCH(a :Element(Field). d :71,I:71) check a=l=O & d>1:=begin

local n=order(a)alternant code(geometric_series(ai, n) , geometric_series(a, n). d-1)

end

IBCH(a :Element(Field), d :71, I :71, K :Field) check a=l=O & d>1:= BCH(a, d, I) I {K_=K}1=1 by default (strict BCH)

I BCH(a:Element(Field), d :71) := BCH(a, d, 1)I BCH(a :Element(Field), d :7l, K: Field):= BCH(a, d, 1) I {K_=K}

Iu=1 :7117 ;

I a=7u; d=7 ; 1=3;IC=BCH(a, d.I) : H=H_(C);I rank(H) , order(a) .... 6,16Idimensions(H) .... 6,16

Primitive RS codes

Beside serving as illuminating examples of BCH codes, they

are of considerablepractical and theoretical importance.

F.MacWilliams and N. J. A. Sloane, [11], p. 294.

Given a finite field K = IFq and an integer k such that 0 (; k (; n,n = q - 1, the Reed-Solomon code of dimension k associated to K isRS1 w ... wn - 1 (k), wherew is a primitive element ofK . But we know (Propo -, , ,sition 3.22) that this code coincides, if r = n - k, with BCHw(r + 1). Thisis the formula used in Constructing primitive Reed-Solomon codes todefine the funtion RS(K,k). Since in this case we know for certain that K isthe field over which the code is defined, we include the table-field K_= K inthe definition of the code.

AI Library I ConstructingReed-Solomon codes

~I RS(K:Field, k :71) check k~O & k<cardinal(K)~ := BCH(primitive_element(K), n-k+1) I {K_=K}

C=RS(7111 ,5); H_(C) .... (~ ~ i i ~ :~ ~ ~ i ~)1 534 9 1 534 91 10 1 10 1 10 1 10 1 10

E.4.3. Show that if ex is a primitive element ofa finite field F, then the matrixVn(ex , .. . , exn-kf is a control matrix for RS(F,k).

4.1. Definitions and examples 189

E.4.4. Let p be a real number such that °< P < 1 and t a positive integer.If K denotes an arbitrary finite field, show that the minimum q = IKI suchthat the code RS(K, k) has rate at least p and corrects t errors satisfies

2tq ? 1+--.

1-p

Conversely, given p E (0,1) and a positive integer t , if q satisfies the condi­tion and K = lFq, then the code RS(K, n - 2t), n = q - 1, has rate at leastp and it corrects t errors. See The function nexCq(p, t). For example, if wewant a rate of 3/4 and that 7 errors are corrected, then we need q = 59. Forcorrecting 10 errors we need q = 81.

AI Librarv I The function next 0(0, t)

Inext q(p, t):= nexCq(1+ 12~~ )

where for a given x E-IR+next_q(x):=begin

local q=rxlwhile prime_power?(q) =false do

q=q+1endq

end

IR=[0 .50, 0.55, 0.60, 0.65, 0.70, 0.75, 0.80];[[nexCq(p,t)with t in 1..12] with p in R]

5 9 13 17 23 25 29 37 37 41 47 49 \7 11 16 19 25 29 37 37 43 47 53 597 11 16 23 27 31 37 41 47 53 59 61

.... 7 13 19 25 31 37 41 47 53 59 64 718 16 23 29 37 41 49 59 61 71 79 819 17 25 37 41 49 59 67 73 81 89 9713 23 32 43 53 64 73 83 97 103 113 125,

Generating matrix. Let w be the primitive element of K used to constructthe control matrix H ofC = RS(K, k) . Since

C = RS1,w,...,wn- 1(k ) (n = q - 1, k = n - r),

a generating matrix of C is given by the Vandermonde matrix

This matrix is supplied by the function rs_G(w, k).

190

..., Library I Generating matrix of RS1,0l •. . ••Oln-l(k)

rs_G(ro, k):= begin

local n=order(ro)[geometric_series (roi, n)with j in O..k-1]

end

Chapter 4. Altemant Codes

Mattson-Solomon matrix. If a is an element of a finite field K, and n =ord(a), the n xn matrix (a i j ) , where 0 ::;; i ,j ::;; n-l, is called the Mattson­Solomon matrix of a . The Mattson-Solomon matrix of K is defined to bethe Mattson-Solomon matrix ofa primitive element of F.

"'1 Library I The Mattson-Solomon matrix

mattson_solomon_matrix (ex. :Element (Field) ) :=begin

local n=order(ex.)[[ex.i ·j withj in O..n-1]with i in O..n-1]

end

Imattson_solomon_matrix(K .Fleld) >

mattson_solomon_matrix(primitive_element(K) )

(1 1 1 )

mattson_solomon_matrix(2 :717) .... 124142

(

1 1 1 1 ). 1243mattson_solomon_matnx(71s).... 1414

1342

E.4.5 (Mattson-Solomon transform). Let K be a finite field and a E K anelement of order n. Let M = Mo: be the Mattson-Solomon matrix of a .Then the linear map K" ----t K" such that x 1--+ x M is called the Mattson­Solomon transform of K" relative to a . Show that if y = x M , then yM =

nx, where x= (xo, Xn-l, .. · ,Xl).

Classical Goppa codes

In order to solve the problem of coding and decoding powerful

methods from modern algebra , geometry, combinatorics and

number theory have been employed.

Y.D. Goppa, [8], p. 75.

Let g E R[T] be a polynomial ofdegree r > 0 and let ex = aI , ... ,an E

R be distinct elements such that g(ai) =f. 0 for all i. Then the classical

4.1. Definitions and examples 191

Gappa code over K associated with 9 and a , which will be denoted r(g , a ),can be defined (see [11], Chapter 12) as alternant_code(h, 0: , r ), where h isthe vector [l/g(O:l ) ," " l / g (O:n )] . In this way it is clear from Proposition4.1 that the minimum distance of r(g , a ) is ~ r + 1 and that its dimensionk satisfies n - rm ~ k ~ n - r.

The code r (g,a ) is provided by the function goppa(g,o:). See Con­structing classical Gappa codes.

AI Library I Constructing classical Goppa codes IA

~Igoppa(g :Univariate_polynomial, a :Vector):= alternant_code (lnvertentrles(evaluate (g,a)), a, degree(g) )

I goppa(g :Univariate_polynomial, a :Vector, K:Field) := goppa(g, a) I{K_=K}

Example of a Goppa code with d~7

IK=71s[xl/(x2-Z); q=cardinal(K);

Ig=T6+T3+T+1 ;Ia=[elementU, K)w ithj in 1..q-1l;Ia=[t in a such_that evaluate(g,t)::t:Ol;IC=goppa(g,a) ;The following computation shows that C has dimension 7

IH=blow(H_(C), K) ;Ilength(a)-rank(H) -+ 7

E.4.6. Let K = Z11 and 0: = [4] 11. Then 0: has order 5 and if a =[1,0:, . . . ,a4 ], then a = [1,4,5,9,3] . Let 9 = (X - 2)(X - 6)(X - 7) EK[X] and C = r (g,a ). Show that C rv [5, 2,4]11 , hence MDS, and checkthat C is not cyclic.

The next propos ition shows that the (strict) BCH codes are Goppa codes.

4.5 Proposition (Strict BCH codes are Goppa codes). Jfw is a primitive el­ement ojK = IFqm and 8 is an integer such that 2 ~ 8 ~ n, then the codeC = BCHw(8) coincides with C' = r(XJ-l, a), where

a = (1,w- 1, . .. , w - (n- 1» .

Proof Since the h-vector of the control matrix H' of C' is

( l, wJ- l, .. . ,w(J-1)(n-1»,

the i-th row of H' is equal to (1,wJ-i , . . . , w(J- i )(n-1» . Thus we see thatH' is the control matrix H that defines C, but with the rows in reversed order(note that the number of rows of H' is deg (XJ - 1) = 8 - 1). 0

4.6 Example (A non-strict BCH code that is not Goppa ). Let C be the bi­nary cyclic code of length 15 generated by 9 = x 2 + X + 1. Let a E IF16 be

192 Chapter 4. Altemant Codes

such that CIA = a + 1. Then the roots ofgin IFl6 are;3 = a 5 and;32 = alaand hence C = C(3 = BCH a(2, 5) (the designed distance is 2 and the offsetis 5). The (generalized) control matrix of this code is [1,;3,;32, ... , 1,;3 , ;32].This cannot have the form [l/g(ao), ... , 1/g(aI4)], with the ai distinct andf a linear polynomial g with coefficients in IF16 . Hence C is not a Goppacode.

Summary

• Alternant matrices. Alternant codes. Reed-Solomon codes (not nec­essarily primitive) and generalized Reed-Solomon codes (GRS).

• Lower-bounds on the minimum distance (d ~ r + 1, where r is theorder of the alternant control matrix used to define the code) and onthe dimension (k ~ n - mr, where m is the degree ofKover K). Wealso have the dimension upper-bound k ~ n - r .

• The classical Goppa codes are alternant codes. The BCH codes areclassical Goppa codes (Proposition 4.5) . Non-strict BCH codes, how­ever, need not be alternant (4.6).

• In particular we will be able to decode Reed-Solomon codes and clas ­sical Goppa codes (hence also BCH codes) as soon as we know a gen­eral decoder for the alternant codes (fast decoders of such codes areconstructed in the last two sections of this chapter).

Problems

P.4.1. Consider a Reed-Solomon code over IFq and assume its minimumdistance is odd, d = 2t + 1.

1) If we fix q and need a rate of at least p, show that t is bounded above

by l(l - p)(q - 1)/2J .

2) If we fix q and the error-correcting capacity t, show that the rate p isbounded above by 1 - 2t/(q - 1).

3) Fix q = 256and regard the elements oflF256 as bytes (8 bits) by takingcomponents of the field elements with respect to a basis over Z2. LetC be a RS code over IF256. What is the maximum number ofbytes thatcan be corrected with C if we need a rate of at least 9/10? What is themaximum rate of C if we need to correct 12 bytes? In each case, whatis the length of C in bits?

4.1. Definitions and examples 193

PA.2. Use K = Ft6 and 9 = X 3 + X + 1 to construct a binary Goppacode C = r(g, a) of maximal length and find its dimension and minimumdistance.

PA.3. Use K = F27 and 9 = X 3 - X + 1 to construct a ternary Goppacode C = r(g, a) of maximal length and find its dimension and minimumdistance.

P.4.4. By the alternant bound, the minimum distance distance d of the Goppacode defined in the listing Constructing Gappa codes satisfies d ~ 7.Prove that d = 8.

PA.5 (Examples of Goppa codes that are not cyclic). Let n be a positivedivisor of q - 1 (q a prime power), 0: E JFq a primitive n-th root of unityand a = [1,0:,... ,o:n-1]. Let C = r(g,a) be the Goppa code over JFassociated to a monic polynomial 9 E JF[X] of degree r and the vector a(hence we assume g(o:i) i= 0 for i = 0, ... ,n - 1).

1) Show that G = (o:ijg(o:i)), forO ~ i ~ n r- r -1 and 0 ~ j ~ n-1,is a generating matrix of C.

2) Deduce that the elements of C have the form

V f = [I(l)g(l) , f(o:)g(o:), ... , f(o:n-1)g(o:n-1)],

with f E JF[X]n-r .

3) Prove that if C is cyclic then 9 = X", and hence that C is a BCHcode. Hint: If C is cyclic, the vector [g(o:n-1) ,g(1), . .. ,g(o:n- 1)]has to coincide with vI> for some t, but this can only happen if f isconstant and from this one can deduce that 9 = X",

PA.6 (Alternative definition of Goppa codes). Let K = JFq and K = JFqm,m a positive integer. Let 9 E K[T] be a polynomial of degree r > 0 andlet a = 0:1 , .. . , O:n E K be distinct elements such that g(O:i) i= 0 for all i .Originally, Goppa defined the code r(g, a) as

[4.4]

In this problem we will see that the code defined by this formula is indeedr(g,a).

1) Given 0: E K such that g(0:) i= 0, show that x - 0: is invertible modulo9 and that

1 = __l_g(x) - g(o:) mod 9x-o: g(o:) x - o:

(note that (g(x) - g(0:)) / (x - 0:) is a polynomial of degree < r withcoefficients in K) .

194 Chapter 4. Alternant Codes

2) Show that the condition L~=o~ == 0 mod 9 in the definitionx - ai

[4.4] is equivalent to

~~g(x) - g(a) = O.f:J' g(ai) x - a

3) Use the relation in 2 to prove that the code defined by [4.4] has acontrol matrix of the form

H* = U· H = U· v;.(aI , ... , an) ' diag (hI, . . . , hn),

where hi = l/g(ai) and, if9 = gO+gIX +.. '+ grx r, U = (gr-i+j)(1 ~ i, j ~ n), with the convention that g/ = 0 if l > r .

4) Since U is invertible, the code defined by [4.4] has also a control ma­trix ofthe form H = Vr(aI, .. . , an)diag (hI, . .. , hn), and hence thatcode coincides with reg,a) .

P.4.7 (Improved minimum distance bounds for binary Goppa codes). Withthe same notations as in the preceeding problem, let 9 be the square closureof 9 (g is the lowest degree perfect square that is divisible by 9 and can beobtained from 9 by replacing each odd exponent in the irreducible factoriza­tion of 9 by its even successor). We will see (cf. [11] , Ch . 12,§3) that ifK = Z2, then reg, a) = reg,a) and hence that d ~ r + 1, where d is theminimum distance of reg,a) and r the degree of g. In particular we willhave d ~ 2r + 1 if9 has r distinct roots .

1) Let a E K " satisfy L~=o~ == 0 mod 9 and set S = Sea)x - ai

to denote the support of a, so that 1SI = s is the weight of a. Let

fa = IliEs(X - ai )' Show that

""' 1 Ifa L...J X-a ' = faiES t

(the derivative of fa).

2) Use the fact that fa and f~ have no cornmon factors to show that

LiES X ~ ai == 0 mod 9 if and only if glf~·3) Show that f~ is a square and hence that 9 1f~ if and only if 91f~. Con-

clude from this that reg,a) = reg,a), as wanted.

P.4.8 (MacWilliams-Sloane, [II], p. 345). Let C = reg ,a) be a binaryGoppa code and suppose 9 = (X - !3I) . .. (X - !3r), where !3I , . . . ,!3r E

f< = IF2m are distinct elements. Show that the Cauchy matrix (1/(!3i - aj )),1 ~ i ~ r, 1 ~ j ~ n , is a control matrix for C.

4.2. Error location, error evaluation and the key equation 195

4.2 Error location, error evaluation and the key equa­tion

Helgert (1975) gave the key equation for decoding altemant

codes.

F.MacWilliams and N.lA. Sloane, [11], p. 367

Essential points

• Basic concepts: error locators, syndrome (vector and polynomial forms),error locator polynomial and error evaluator polynomial.

• Basic results: Forney's formula and the key equation .

• Main result: the solution of the key equation (Berlekamp-Massey­Sugiyama algorithm).

Basic concepts

Let C be the altemant code associated to the alternant matrix H of order rconstructed with the vectors h and a (their components lie in K' = IFq"" ).

Let t = lr/ 2J, that is, the highest integer such that 2t ::;; r. Note that if welet t' = rr /21, then t + t' = r (we will use the equivalent equality r - t = t'in the proof of Lemma 4.12).

Let x E C (sent vector) and e E IFn (error vector, or error pattern). Lety = x + e (received vector) . The goal of the decoders that we will presentin this chapter is to obtain x from y and H when s = lei ::;; t.

Let M = {ml, ... ,m s } be the set of error positions, that is, m E Mif and only if em =1= O. Let us define the error locators r/i, i = 1, . . . ,s, bythe relation TJi = ami' Since the a j are different, the knowledge of the errorlocators is equivalent to the knowledge of the error positions (given the a's).

Define the syndrome vector S = (So, .. . , Sr-I) by the formula

(So, . . . ,Sr-I) = yHT.

Note that 8 = eHT, as xHT = O. Consider also the syndrome polynomial

8(z) = 80 + 8 I z + ...+ 8r _ I zr - l .

Since 8 = 0 is equivalent to saying that y is a code vector, henceforth wewill assume that S =1= o.

196 Chapter 4. Altemant Codes

E.4.7. Check that Sj = 2::=1 hmiemirn (0 :( j :( r - 1).

E.4.8. Assuming S f. 0, prove that the least j such that Sj f. °satisfiesj < s, hence also j < t. Since gcd(zr,S(z)) = zj, gcd(zr,S(z)) hasdegree strictly less than s, hence also stricly less than t. Similarly, prove thatdeg (S(z)) ~ t. Hint: Ifit were j ~ s we would have So = ... = Ss-l = 0and the expressions for Sj in E.4.7 would imply a contradiction. 0

The error locator polynomial CT(z) is defined by the formula

s

CT(Z) = II(1 - r/iz).i=l

Thus the roots of CT are precisely the inverses of the error locators.We also define the error evaluator polynomial by the formula

s s

E(Z) = Lhmiemi II (1- rli z ).i=l j=l ,j# i

4.7 Remark. It is possible to define the error-locator polynomial as the monicpolynomial whose roots are the error-locators. In that case the definitionof the error-evaluator and syndrome polynomials can be suitably modifiedto insure that they satisfy the same key equation (Theorem 4.9), and thatthe errors can be corrected with an apropriate variation of Forney's formula(Proposition 4.8). For the details of this approach, see P.4.9

Basic results

4.8 Proposition (Forney's formula). For k = 1, . . . , s we have

where CT' is the derivative ofCT.

Proof The derivative of CT is

CT'(Z) = - LTJiII(l-TJjz)i j#i

and from this expression we get that

CT'(TJ;;l) = -TJk II(1- r/j/TJk)'j#k

4.2. Error location, error evaluation and the key equation

On the other hand we have, from the definition of E, that

E('TJ;;l) = hmkemk II(1- 'TJj/'TJk).j#

Comparing the last two expressions we get the relation

which is equivalent to the stated formula.

197

o4.9 Theorem (key equation). The polynomials E(Z) i O"(z) satisfy the con­gruence

Proof From the definition of E it is clear that we also have

( )_ ()I:s hmiemiE Z -O"Z .

1 - 71'Zi= l ·It

But

i: -..:.:.hm.:..:-ie~mi I:s I:()j= hm·em· "liZ. 1- "liZ . " .~= 1 ~=1 ) )0

S r-1

== I: hmiemi I:('TJiZ)j mod zri=l j=Or-1 S

= I: (I:hmi emirri) zjj=O i=lr-1

= L Sjzj = S(z),j=O

where we have used EA.? for the last step. o4.10 Remark. The key equation implies that deg (gcd (z", S)) < t ,becausegcd (zr ,S) divides £ (cf. the first part ofEA.8).

E.4.9. In the case of the code BCH w (8, l) over IFq, prove that the syndromesSo, . . . , S8- 2 are the values of the received polynomial (or also of the errorpolynomial) onwl, . . . ,wl+8- 2. Deduce from this that Sqj = (Sj)q ifi ,qj E

{l , ... ,l + 8 - 2}.

198

Solving the key equation

Chapter 4. Alternant Codes

Nevertheless, decoding using the Euclidean algorithm is by far

the simplest to understand , and is certainly at least comparable

in speed with the other methods (for n < 106) and so it is the

method we prefer.

F. MacWilliams and N.lA. Sloane, [11], p. 369.

The key equation shows that there exists a unique polynomial r (z ) such that

E(Z ) = r( z)zr + 0-(z)8(z).

This equation is equivalent to the key equation and one of the main steps inthe decoding ofaltemant codes is to find its solution (0- and E) in terms of z"and 8(z) .

Here we are going to consider the approach based on a modification ofthe Euclidean algorithm to find the gcd oftwo polynomials. This modifica­tion is usually called Sugiyama 's algorithm (for classical Goppa codes it waspublished for the first time in [23]).

Sugiyama's variation on the Euclidean algorithm

The input of this algorithm is the pair ofpolynomials TO = zr and TI = 8 (z)(remember that we have assumed 8 i= 0) and it works as follows :

I) Let Ti , i = 0, . .. .i . be the polynomials calculated by means of theEuclidean algorithm applied to TO and TI, but with j the first indexsuch that deg (Tj) < t . For i = 2, ... , j, let qi be the quotient oftheEuclidean division of Ti - 2 by Ti -I , so that Ti = Ti -2 - qi Ti -l .

Note that since deg (ri ) = deg (8) ~ t by EA.8, we must have j ~ 2.On the other hand the integer j exists, for the greatest common divisord of TO and TI has degree less than t (see EA.8, or Remark 4.10) andwe know that the iteration of the Euclidean algorithm yields d.

2) Define vo, v b .. . , Vj so that Vo = 0, VI = 1 and Vi = Vi-2 - qiVi-l .

3) Output the pair { Vj , Tj} .

In order to prove that Sugiyama's algorithm yields the required solutionof the key equation (Lemma 4.12) , it is convenient to compute, in parallelwith v o, v I, . .. , Vj, the sequence u o , uI, .. . , Uj such that Uo = 1, UI = 0,and Ui = Ui-2 - qiui-I for i = 2, . . . , j .

E.4.10. Prove that for all i = 0, . . . .i we have UiTO+ ViTI = Ti.

4.2. Error location, error evaluation and the key equation 199

"'I Library I The Sugiyama variation of the Euclidean division algorithm I'"sugiyama(rO,r1 .t):=begin

local q, r, v, vO=O, v1= 1while t~degree (r1) do

{q,r}=quo_rem (rO,r1); v=vO-q ' v1rO=r1; r1=r; vO=v1; v1=v

end{v1, r1}

end

Ir=4; S=z+z3; t=2;

I{o, E}=sugiyama (z', S, t) ... {-z2+ 1, z}

Icr·S-E mod zr ... 0

4.11 Remark (Extended Euclidean algorithm). If in the description of theSugiyama algorithm we let j be the last integer such that rj ::I 0, then d = "iis the gcd(ro ,rI) and E.4.10 shows that ao = Uj and al = Vj yield asolution of a Bezout identity coro + alrI = d. The vector [d, ao, aI] isreturned by the function bezout.

Using the function bezout(rO,r1)

IrO=Z7; r1=z3+z+1;

I[d, aO, a1] =bezout(rO, r1)... [1, -6 'z2+4 'z-9, 6·z6_4 ·z5+3 ·z4_2 ·z3+z2_z+1]

IaO'rO+a1 'r1 ... 1Id'" 1

Main results

We keep using the notations introduced in the description of Sugiyama'salgorithm. Recall also that t' = rr /21 and that r - t = t '.

4.12 Lemma. Let E" = rj, f = Uj, 0- = Vj . Then

E"(z) = f(Z)ZT + o-(z)S(z) and deg (0-) ~ t ', deg (E") < t .

Proof According to E.4.10, UirO + Virl = r i for i = 0, ... .i. which fori = j is the claimed equality.

Now let us prove by induction on i that

for i = 1, . .. , j (hence, in particular, that deg (Vi ) is strictly increasingwith i). Indeed, the relation is certaily true for i = 1. We can thus assume

200

that i > 1. Then Vi = Vi-2 - qiVi-l and

Chapter 4. Altemant Codes

deg (Vi) = deg (qi) + deg (Vi-l) (deg (Vi-2 < deg (Vi-d) by induction)

= deg (ri-2) - deg (ri-l) + r - deg (ri-2)

= r - deg (ri-d

(in the second step we have used the definition of qi and the induction hy­pothesis). In particular we have

deg (0-) = deg (Vj) = r - deg (rj-l) ~ r - t = t',

which yields the first inequality. The second inequality is obvious, by defi­nition of j and E. 0

E.4.11. With the same notations as in Lemma 4.12, use induction on i toshow that

i 1, . . . , j , and deduce from it that gcd (ui ,vi ) = 1. In particular weobtain that gcd (1", 0-) = 1.

4.13 Theorem. With the notations as in Lemma 4.12, there exists p E IF;=such that a = pO- and E = pE.

Proof Multiplying the key equation (Theorem 4.9) by 0-, the equality inLemma 4.12 by a, and subtracting the results, we obtain the identity

Now the degree of the polynomial on the left is < r, because

deg (o-E) = deg (0-) + deg (E) ~ t' + t - 1 = r - 1

anddeg (aE) = deg (a) + deg (E) ~ t + t - 1 ~ r - 1.

Since the right hand side of the identity contains the factor z", we infer that

o-E = ea, irr = i a.

Thereforealo-E, o-l1"a.

Since gcd (a, E) = 1, for no root of a is a root of E, and gcd (1",0-) = 1, byEA.ll, we get that alo- and o-Ia, and hence a = pO- and E = PE for somep E IF;=. 0

4.2. Error location, error evaluation and the key equation 201

4.14 Remark. Theorem 4.13 shows that (j and (J' have the same roots, sowe can use (j instead of (J' in order to find the error locators. In addition,Forney's formula proves that we can use (j and € instead of (J' and E to findthe error values, because it is clear that

Summary

• Basic concepts: error locators ("1i), syndrome (vector S and polyno­mial S(z)), error locator polynomial «(J'(z)) and error evaluator poly­nomial (E(z)).

• Basic results : Forney's formula (Proposition 4.8) and the key equation(Theorem 4.9).

• Main result: Sugiyama's algorithm (which in turn is a variation ofthe Euclidean gcd algorithm) yields a solution of the key equation(Theorem 4.13) that can be used for error location and error correction(Remark 4.14) .

• The error-locator polynomial (J'(z) can also be defined so that its rootsare the error-locators (rather than the inverses of the error locators).Ifwe do so, and adopt suitable definitions of S(z) and E(Z ), we stillget a Forney's formula for error-correction and a key equation E(Z) ==(J'(z)S(z) mod z" that can be solved with the Sugiyama algorithm(PA.9).

Problems

P.4.9 (Alternantive syndrome, Forney's formula and key equation). Supposewe define the syndrome polynomial S(z), the error locator (J' (z) and the errorevaluator as follows :

S(z) = Sozr-1 + ...+ Sr-l,8

i = l8

E(Z) = - L hm i em i "1[II(z - "1i)i = l Hi

(now (J' is the polynomial whose roots ara "11 , . .. ,"18)'

202 Chapter 4. Altemant Codes

1) Show that

(j' ('TJk) = II('TJk - 'TJj) and E('TJk ) = -hmkemk'TJkII('TJk - 'TJj )j#k j# k

and hence that we have the following alternative Forney's formula:

2) Prove that we still have the same key equation

E(Z) == (j(z)S(z) mod ZT .

Note that if we let {E, i7} be the pair returned by sugiyama(zr ,S,t), then thezero s of i7 give the error locators, and the alternative Forney's formula withEand i7 finds the error values.

P.4.10. With the notations introduced in PA.6, prove that the syndrome S;with respect to the control matrix H * = U . H = U . v;. (Q) . diag (h ) of thereceived vector y corresponds to the polynomial

S*(Z) = ~ .ss:g(z) - g(ai) ,Y LJ g(a ·) z - a '

i=O t t

which is the unique polynomial of degree < r such that

n - 1

S;(z) == - L~ mod g(z).i=O z - ai

Show also that S;(z) = S;(z), where e is the error vector.

P.4.11. Continuing with the previous problem, define an error-locator poly­nomial (j*( z) = I1t=l (z - 'TJi) (as in PA.9) and an error-evaluator polyno­

mial E*(Z) = 2:t=l em; I1# i(z - 'TJi )' Then deg (o") = s, deg(c*) < S

and gcd((j*,c*) = 1. Prove that em; = E*('TJi )/ (j*'( 'TJi) (the form taken byForney's formula in this case) and the following key equation:

(j*(Z)S*(Z) == E*(Z) mod g(z) .

P.4.12. Modify Sugiyama's algorithm to solve the key equation in the pre­ceeding problem.

4.3. The Berlekamp-Massey-Sugiyama algorithm

4.3 The Berlekamp-Massey-Sugiyama algorithm

203

Decoding. The main source is Berlekamp [Algebraic Coding

Theory, [3] in our references].

F.MacWilliams and N.J.A. Sloane, [11], p. 292.

Essential points

• Algorithm BMS (Berlekamp-Massey-Sugiyama)

• Main result : BMS corrects at least t = Lr/2J errors.

• Altemant decoder (AD): the function alternantdecoderty) that imple­mentsBMS.

• Auxilary functions for experimenting with AD.

Introduction

It is fairly clear that the deepest and most impressive result in

coding theory is the algebraic decoding of BCH-Goppa codes.

R.J. McEleice, [12], p. 264.

As indicated at the beginning of this chapter, one of the nice features ofaltemant codes is that there are fast decoding algorithms for them.

Our function alternantdecoder (AD) is a straightforward implementa­tion of the algorithm, obtained by putting together the results in the last sec­tion, with some extras on error handling. This is a fundamental algorithmwhich we will call, for lack ofa better choice, Berlekamp-Massey-Sugiyamaalgorithm , BMS (cf. [29], §1.6, p. 12-15). There are other fast decoders foralternant codes . However, they can be regarded, with the benefit of hind­sight, as variations of the BMS algorithm, or even of the original Berlekampalgorithm. The next section is devoted to one ofthe earliest schemes, namelythe Peterson-Gorenstein-Zierler algorithm.

Let H be the alternant control matrix of order r associated to the vec­tors h , 0: E k», and let C = CH ~ K" be the corresponding code.As explained at the beginning of Section 4.1 (page 181) we will set {3 =[lh, . . . ,IJnl to denote the vector formed with the inverses of the elementsO:i , that is, IJi = 0:;1 . We also set t = Lr/2J.

204 Chapter 4. Altemant Codes

Next we will explain in mathematical tenus the BMS decoding algorithmto decode CH. It takes as input a vector y E K " and it outputs, if y turnsout to be decodable, a code vector x . Morevover, x coincides with the sentvector if not more than t errors occurred.

The BMS algorithm

In the description below, Error means "a suitable decoder-error message".

1) Find the syndrome s = yHT , say s = [so , . . . , sr-d .

2) Transform s into a polynomial S = s(z) in an indeterminate z (8 iscalled polynomial syndrome):

8 = So + SIZ + ...+ Sr_IZr- i .

3) Let {c, E} be the pair returned by suqlyarnaiz",S,t).

4) Make a list M = {mI , ... , ms } of the indices j E {I, ... , n} suchthat o-(f3j) = O. These indices are called error locations. If s is lessthan the degree of 0-, return Error.

5) Let x be the result of replacing the value Yj by Yj +ej, for each j E L,where

and 0-' is the derivative of 0-, provided ej lies in K . If for some j wefind that ej rt K , return Error. Otherwise return x ifx is a code-vector,or else Error.

4.15 Theorem. The Euclidean decoding algorithm corrects at least t =

lr/2J errors.

Proof Clear from the discussions in the previous sections.

Implementing BMS

o

Given an alternant code C, we wish to be able to call alternant_decoder(C)and have it return the pure function Y f--7 x',where x' is the result ofapplyingthe BMS algorithm for the code C to y. If this were the case, we could setg=alternant_decoder(C) and then apply 9 to the various vectors y that weneed to decode .

4.3. The Berlekamp-Massey-Sugiyama algorithm 205

Auxiliary functions

To work out this plan, we first explain two auxiliary functions. The first,zero_positions(f,a), returns, given a univariate polynomial f and a vector a,the list of indices j for which aj is a root of f . It goes through all the aj

and retains the j for which the value of f is 0 (this is often referred to as theChien search).

"'I Library I The function zerojiosifionsff', a)

~I zero_positions(f :Univariate_polynomial, a :Vector) :=~ {j in range(a) such_that evaluate(f, aj) =O}

~ f=(x-1)'(x-2) '(x-3); a=[O,1,7,2,9,3,11,1,2];~ zero_positions(f , a) -+ {2, 4, 6, 8, 9}

We will also need f1ip(v,L) . This function replaces, for all j in the list L, thevalue Vj of the vector v by 1 - Vj . In fact, v can also be a list, in which casethe value of flip is a list, and in either case L can be a vector.

"'I Library I The function mp(v, II I'"

~flip (v :List IVector, L: List IVector I Range):= for j in L do Vj = 1-vj end

By default, L=[range v]I flip (v :Vector IList) :=fIip(v,range(v))

~ f1ip([1 ,1,0,0,1,0]) -+ [0,0,1,1,0,1]~ f1ip([1 ,1,0,0,1,0], {3,4}) -+ [1,1,1,1,1,0]

The function alternantdecoder

As we said before, this function (AD in shorthand) implements the BMSdecoding algorithm and can be found in the listing The function alter­nantdecoder at the end of this section. Following an observation by S.Kaplan, the present version includes the computation of Forney values, andchecking whether they lie in the base field, even for the base field 1Z2. Know­ing the error positions in the binary case would be enough to correct thereceived vector ifwe knew that the error values given by Forney's formulaare I, but the point is that these values may lie in some extension of 1Z2 if thenumber of errors exceeds the correcting capacity.

If C is an alternant code of length n over the field K and y E K",then alternanCdecoder(y,C) yields the result of decoding y according to theBMS algorithm relative to C (or the value 0 and some suitable error messageif some expected conditions are not satisfied during decoding). The callalternant_decoder(C) yields the function y~alternanCdecoder(y,C) .

206 Chapter 4. Alternant Codes

The listing A BCH code that corrects 3 errors is adapted from [20](Examples 8.1.6 and 8.3.4). The reader may wish to work it by hand in orderto follow the operation ofall the functions. Note that in polynomial notationsthe received vector is 1+x 6 + x 7 +x 8 +x 12 and the error vector is x4 +x 12.

A BCH code that corrects 3 errors

IF=71z[x]I (x4+x+ 1);

IC=BCH(x,5,71z) ;

Ix=alternant_decoder( [1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0LC)-+ [1,0,0,0,1,0,1,1,1,0,0,0,0,0,0]

-corrected positions: {5,13}Ialternant_decoder(x,C) -+ [1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0]

- x is a code vector

A few more examples can be found in Goppa examples of the alter­nantdecoder.

A Goppa code that corrects 3 errors

IF=71z[t] / (t4+t+1); n=cardinal(F)-1 ;

Ig=X6 ; a= [elernentfj.Fjwith j in 1..n]; C=goppa (g ,a,71z) ;

- C corrects 3 errorsIy= [0,1,1,0,0,0,0,0,0,0,0,0,0,0,1] : Vector(71z);

I alternanCdecoder(y,C) -+ [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]-corrected positions : {2,3,15}

Iy= [0,1,1,0,0,1,0,0,0,0,0,0,0,0,1] : Vector(71z);

I x=alternant_decoder(y,C) -+ [1,1,1 ,0,0,1,1,1,0,0,0,0,0,0,1]-corrected positions : {1,7,B}

Ialternant_decoder(x,C) -+ [1,1,1 ,0,0,1,1,1,0,0,0,0,0,0,1]- x is a code vector

I y=x+epsilon_vector(n ,7) -+ [1,1,1,0,0,1,0,1,0,0,0,0,0,0,1]I alternant_decoder(y,C) -+ [1,1,1,0,0,1,1,1,0,0,0,0,0,0,1]

-corrected positions: {7}I y=y+epsilon_vector(n,10) -+ [1,1,1,0,0,1,0,1,0,1,0,0,0,0,1]I alternanCdecoder(y,C) -+ [1,1,1,0 ,0,1,1,1,0,0,0,0,0,0,1]

-corrected positions : {7,10}I y=y+epsilon_vector(n,12) -+ [1,1,1,0,0,1,0,1,0,1,0,1,0,0,1]Ialternant_decoder(y,C) -+ [1,1,1,0,0,1,1,1,0,0,0,0,0,0,1]

-corrected positions: {7,10,12}I [0,1,1,0,0,1,0,0,0,1,0,0,0,0,1] ;Ix=alternanCdecoder(y,C) -+ [1,1,1,0,0,1,1,1,0,0,0,0,0,0,1]

-nondecodable vector: too few rootsIalternant_decoder(x,C) -+ [1,1,1,0,0,1,1,1,0,0,0,0,0,0,1]-0 is not a vector

4.3. The Berlekamp-Massey-Sugiyama algorithm

Some tools for experimenting with the alternant decoder

207

To experiment with altemantdecoder, it is convenient to introduce a fewauxiliary functions, which incidentally may be useful in other contexts . Wewill use the function call random(1,k) (k a positive integer), which producesa (pseudo)random integer in the range l..k.

Random choice ofm elements ofa given set If X is a non empty set, thefunction rd_choice(X) chooses an element ofX at random, and rd_choice(X,m)(for a positive integer m less then the cardinal ofX) extracts a random subsetof m elements of X. Finally, if n is a positive integer, then rd_choice(n,m)(for a positive integer m less than n) chooses m elements at random of theset {I, . .. , n}.

AILibrary I Choosing elements of a set at random (rd choice) 104

Ird_choice(X :List) := Xrandom(1,lenglh(X»

rd_choice(X :List, m :71) check m>O & m<length(X):=begin

local x=rd_choice(X), Cif m=1 then

C={x}else

X=X/{x} ; C={x} I rd_choice(X , m-1)endset(C)

endI rd_choice(n :71, m :71 ) check m>O&m<n :=rd_choice(m, list (1..n) )

~X={p in 2..30 such_that prime?(p)};

I rd_choice(X) -+ 19I rd_choice(X,3) -+ {5, 7, 17}

Random list ofgiven length ofelements ofa finite field. For a finite field(or ring) K of cardinal q, the function elernenttj.K) returns the j-th elementofK, where 0 ~ j ~ q - 1, with a built-in natural order (for example, in therange j = 0, . . . ,p - 1, P the characteristic of K, elernenttj.K) is the residueclass of j mod pin Zp ~ K). Hence we can generate a random element ofKby picking j at random in the range 0 ~ j ~ q - 1. Similarly, we can choosea nonzero element ofK by choosing j at random in the range 1 ~ j ~ q - 1.

Using rd (or rd_nonzero) 8 times (8 a positive integer) we can produce a listof 8 elements (or 8 nonzero elements) of K (see the function calls rd(s,K)and rd_nonzero(s,K)) in Choosing a list of elements of a finite field atrandom.

208 Chapter 4. Aitemant Codes

AI Library I Choosing an element of a finite field at random IA

r, rd(K:Field) :=element(random(0,cardinal(K)-1), K)~ rd_nonzero( K :Field) :=element (random (1,cardinal (K) -1 ), K)

~ rd(71u) ~ 5

AI Library I Choosina a random list of elements of a field IA

~ rd(s:71, K tFleld) checks>O:={rd(K) with i in 1..s}~ rd_nonzero(s :71, K :Field) check s>O:={rd_nonzero(K) with i in f .s}

~ rd(7,713) ~ {O, 2,1,0,2,2, 1}

~ rd_nonzero(7.713) ~ {1, 2, 2, 2, 1, 1, 1}

Random error patterns. Given the length of the code , n, a weight s with1 ~ s ~ n, and the field (or even a ring) K, a random error pattern oflength n, weight s and entries in K can be produced by picking randomly spositions in the range l..n and s nonzero elements of K and using them toform a length n vector whose nonzero entries are precisely those s elementsof K placed at those s positions.

AI Library I Generatina random error-patterns IA

rd_error;vector (n :7l, s :7l, K: Ring) check s>O & s~n:=begin

local k, E=rd_nonzero(s,K), L=rd_choice(n, s) , e=constant_vector(n,O)for i in range(E) do

k=Lj

ek=ek+Ejend

endBy default , n=card(K)-1

I rd_error_vector(s:71, K:Ring) := rd_erroc vector (cardinal (K) - 1, s, K)By default, K=712

Ird_error_vector(n:71, s:71) := rd_error_vector (n, S, 7l2 )

~rd_errocvector(7, 3, 7ls) ~ [0,0, 3, 1, 0, 2, 0]

Ird_error_vector(3, 7ls)~ [1,4,0,1]

I rd_errocvector(5,2) ~ [1,0,0,0,1]

Finding code vectors. If G is the generating matrix of a linear code Coverthe finite field K, the function rd_lineaccombination(G,K) returns a randomlinear combination of the rows of G, that is, a random vector of C.

Decoding trials. To simulate the coding + transmission (with noise) + de­coding situation for an alternant code we can set up a function that generates

4.3. The Berlekamp-Massey-Sugiyama algorithm

"" Library I Generating random linear combinations

rd_linear_combination(G :Matrix, K:Field) :=begin

local k=length(G) , u=vector(rd(k, K))u ·G

end

~ a=[2,3,5,7]; G=vandermonde_matrix(a,3); K=7Zs;~ rd_linear30mbination(G,K) .... [3,4, 0, 3]

209

a random code vector x , that adds to it a random error pattern e of a pre­scribed number s of errors, and that calls alternant_decoder on y = x + e.If the result is a vector x', to make the results easier to interprete we returnthe matrix [x,e, x + e] in case x' = x and the matrix [x, e, x + e, x'] in casethe simulation has resulted in an undetectable error. In case x' is not a vector(hence x' = 0), there has been a decoder error and then we simply return thematrix [x,e].

"" Library I A checker for the alternant decoder I'"

IdecodeUrial(C: Code, s :7Z) check s>O := decodeUrial(C, s, K_(C))decodeUrial(C .Code, s :7Z , K :Field) check s>O:=begin

local x' , G=kernel(H_(C)?, n=n_columns(G),x=rd_linear_combination(G, K) ,e=rd_error_vector(n, s, K)

x'=alternant_decoder(x+e, C)if is? (x' ,Vector) then

if x'=x thensay( "Decoder trial was succesful" )[x, e, x+e]

elsesay( "Decoder trial produced an undetectable error" )[x, e, x+e, x']

endelse

say( "Decoder trial produced a decoder error")[x, e]

endend

Examples

In this subsection we will look at a few examples of RS codes that show howthe machinery developed in this section works .

First we will consider RS codes of the form C = RS(K,r), where Kis a finite field and r is a positive integer less than n = q - 1, q = IFI (see

210 Chapter 4. Altemant Codes

EA.3) . Recall that C has type [n,n - r, r + 1] and that it corrects lr/2Jerrors. We will also consider an example of the form RS(a, k), where k isthe dimension and a E K",

Example: RS[12,6,7]

Suppose we want an RS code with rate at least 1/2 which can correct 3 errors.The table in the listing after EAA tells us that the least possible q is 13. So letus take q = 13, hence n = 12. Since t = 3, the least possible codimensionis r = 6, thus k = 6.

We can construct this code with the function call RS(Z13,6), and we canobserve its working by means of calls to decoder_trial for diverse s , as illus­trated in the next example.

(

1 5 1399588111011)decodeUrial{C,1) ~ 0 11 0 0 0 0 0 0 0 0 0 0

1 3 1 3 9 9 5 8 8 11 10 11

(

2 6 5 3 7 8 0 11 11 8 7 9)decodeUrial{C,2) ~ 1 5 0 0 0 0 0 0 0 0 0 0

3 11 5 3 7 8 0 11 11 8 7 9

(

5 7 8 8 2 3 9 1 10 9 5 11)decodeUrial{C,3) ~ 0 0 10 0 0 0 0 1 0 0 0 5

5 7 5 8 2 3 9 2 10 9 5 3

(

6 5 4 0 7 7 10 5 0 8 10 12)decodeUrial{C,4) ~ 10 0 0 0 0 6 0 0 0 9 11 0

3 5 4 0 7 0 10 5 0 4 8 123 8 4 9 7 0 3 5 0 4 8 12

Idecoder_trial{C,4) ~ (5 12 11 12 9 5 9 8 2 10 4 6)o 0 0 0 4 0 0 3 0 0 4 10

We see that for an error-pattern ofweight 4, which is beyond the correct­ing capacity of the code, in one case we got an undetectable error and in theother a decoder error. Error-correction in the case of I, 2 or 3 errors worksas expected .

Example: RS[26,16,1l]

Suppose now that we want an RS code with rate at least 3/5 which cancorrect at least 5 errors. This time the table in the listing after EAA tells usthat the least possible q is 27. So let us set q = 27, thus n = 26. Since t = 5,the least possible codimension is r = 10 and so k = 16.

The construction of this code, and also the observation of its workingby means of calls to decoder_trial(s), is illustrated in the next example forvarious s. Note that we use the shorthand notation given by the ina, table.

4.3. The Berlekamp-Massey-Sugiyama algorithm 211

Note that up to 5 errors it behaves as expected, but that in the case of 6 errorsit lead to a decoder error.

RS[26,16,11j. Corrects up to 5 errors

IK=713[tj / (t3+2 ·t+1); E=ind_table(t);

I C=RS(K, 10) ;shorthand(decodeUrial(C,3), E)

(

12 25 3 23 20 15 9 19 8 5 2 17 2 13 6 2 9 10 0 15 4 13 7 11 2 22)... 14 7 6

12 25 3 23 20 15 9 19 8 8 2 17 2 13 15 2 9 10 0 15 25 13 7 11 2 22shorthand(decodeUrial (C,5), E)

(

19 1 15 8 7 2 2 25 12 5 _ 15 _ 10 6 4 6 1 _ 9 6 13 20 22 19 23)... _ 21 14 _ _ 15 __ 4 23 _ _ _ _ _ _ _ _ _ _ _ _

19 6 15 8 7 2 4 25 12 11 _ 15 4 _ 6 4 6 1 _ 9 6 13 20 22 19 23

Ishorthand(decodeUrial(C,6), E)... (4 18 8 25 23 8 10 12 0 24 9 11 19 4 11 5 8 1 25 8 24 10 7 22 13 18)

_23 99_0_20 3 _

Example RS[80,60,21]

Suppose we want a RS code with rate 3/4 and correcting 10 errors. Thend = 21, hence k = n - 20. On the other hand k = 3n/4 and so n =80. We could take K = !FsI , but since ZS3 has simpler arithmetic, we canchoose this latter field as K and use only 80 of its nonzero elements. Wecan let n = 80 and a = (al," " an), where aj = [j]n and use the codeRS(a,60) . Happily, this code is alternant (and in general such codes areneither Goppa nor BCH) and hence that can be decoded with the ADdecoder.In the following example we have indicated a decoder trial of 10 errors, butwe have not displayed the result because of its length.

RS[80 ,60,21]. Corrects 10 errorsIK=7183; a=[elementO,K) with j in 1..80];

IC=RS(a, 60);IdecodeUrial(C,10)

Detractors of coding research in the early 1960s had used the

lack of an efficient decoding algorithm to claim that Reed-Sol­

omon codes were nothing more than a mathematical curiosity.

Fortunately the detractors' assessment proved to be completely

wrong .

Wicker-Bhargava, [29], p. 13.

212 Chapter 4. Altemant Codes

Example: A Goppa code [24, 9, ~ 11J

In this example we construct the code C = r(T lO , a), where a contains allthe nonzero elements of1F25. Ifw is a primitive root of1F25, C coincides withBCHw(l1), which could be constructed as BCH(w,11).

Example of a Goppa code with d~7

IK=71s[xI/(xLZ); q=cardinal(K) ;

Ig="rG+T3+T+1 ;Ia=[elementQ, K)withj in 1..q-1];Ia= [t in a such_that evaluate(g,t) *01;IC=goppa(g,a);The following computation shows that C has dimension 7

IH=blow(H_(C), K);Ilength(a)-rank(H) .... 7

Summary

• The Berlekamp-Massey-Sugiyama algorithm (BMS) for altemant codes,which corrects up to t = lr/2J errors . Here r is the order of the al­temant matrix used to define the code. In the special case of (general)RS codes, r is the codimension. In the case of Goppa codes, r is thedegree of the polynomial used to define the code. For BCH codes,possibly not strict, r is 8 - 1, where 8 is the designed distance .

• Implementation ofthe BMS algorithm (the function alternant_decoderand several auxiliary functions that allow to experiment with the algo­rithm).

• Examples of decoding RS codes, both general and primitive, Gappacodes and BCH codes .

4.3. The Berlekamp-Massey-Sugiyama algorithm

Problems

213

P.4.13 . Find a RS code that has rate 3/5 and which corrects 25 errors. Whatare its parameters? Which is the field with smallest cardinal that can be used?Estimate the number of field operations needed to decode a vector with theAD decoder.

P.4.14. Consider the code C = r(T10 +T 2 +T +3, a) over Z5, where a isthe vector formed with the nonzero elements of IF25. Show that dim (C) = 4.Note that this coincides with the lower bound k ~ n - mr for the dimensionof alternant codes, as n = 24, r = 10 and m = 2. Note also that with thepolynomial TlO instead ofT10+T 2 +T +3 the corresponding Goppa codehas dimension 9, as seen in the last example above.

P.4.15. Write a function alternative_decoder to decode alternant codes usingthe alternative definitions ofS(z), a(z) and €(z), and the alternative Forneyformula and key equation that were introduced in PA.9.

P.4.16. Write a function goppa_decoder to decode Goppa codes based onthe polynomials S*(z), a*(z) and f*(Z) introduced in PA.lO and P.4.II, onForney's formula and the key equation established in P.4.11, and on the so­lution to this key equation given in PA.II.

214 Chapter 4. Alternant Codes

"'I Library I Implementing the BMS algoritm I'"

Ialternant_decoder(C .Code) :« yHalternant_decoder(y,C)alternanCdecoder(y, C :Code):=begin

local H, h, r, a, b, K, s, z, S, o, E, M, E, x, yOif Y 1: Vector then

say ("AD> Argument is not a vector"); return(O)endH=H_(C) ; h=h_(C); r=UC) ; a=a_(C); b=b_(C); K=K_(C)if length(y) *n30Iumns(H) then

say("AD> Vector argument has the wrong length") ; return(0)ends=y ·HTif zero?(s) then

say ( "AD> Code vector"); return(y)endS=polynomial (s,z)if degree(S)< Lr/2J 1trailing_degree(S);;,Lr/2J then

say ("AD> Nondecodable vector: extraneous syndrome") ; return(O)end{o, E}=sugiyama (z', S, Lr/2J)M=zero_positions (o.b)if length(M)<degree(cr) then

say ( "AD> Nondecodable vector: too few roots"); return(O)endcr=cr'; E=null; yO=yfor min M do

am·evaluate (E,bm)X=---'-''--------'-'.:-.

hm·evaluate (cr,bm)if not subfield?(field(x) ,K) then

say("AD> Nondecodable vector: Forney value not in base field")return 0

endE=(E,-x); Ym=Ym+x

endE={E}if zero?(y·HT) then

say("AD> Successful decoding, with error table "I {M,E}); return(y)else

say("AD> Decoder error : x' has nonzero syndrome"); return(O)end

end

4.4. The Peterson-Gorenstein-Zierler algorithm

4.4 The Peterson-Gorenstein-Zierler algorithm

215

Furthermore, a simply implemented decoding procedure has

been devised for these codes [BCH codes].

W. W. Peterson and E. J. Weldon, Jr., [16], p. 269.

Essential points

• Finding the number of errors by means of the matrices Ae (formula[4.6]).

• Finding the error-locator polynomial by solving a system of linearequations .

• The algorithm of Peterson-Gorenstein-Zierler (PGZ).

We will present another algorithm for decoding altemant codes. Let us usethe notations introduced at the beginning Section 4.2, but define the error­locator polynomial CT as:

s

CT(Z) = II(z - 1li) ,i = l

so that the roots of CT are now the error locators.We will assume henceforth that s ~ t.

Finding the number oferrors

For any integer .e such that s ~ .e ~ t, define

[4.5]

Ae= [4.6]

This matrix is usually called the Hankel matrix associated with the vector(So, S1, . .. ,SU-2).

4.16 Lemma. We have that det(Ae) = Ofor s < .e ~ t and det(A s ) -; O.In other words, s is the highest integer (among those satisfying s ~ t) suchthat det(A s ) -; O.

216 Chapter 4. Altemant Codes

Proof Let M' = {m~ , ,ma ~ {O, ...,n - I} be any subset such thatM ~ M' . For i = 1, ,e, set "li am~' As we have seen , for j =0, ... , r - 1 we have

5 l

s, = L b-«, emkaimk = L hm~em~ rtfck=l k=l

Let

D = diag(hm~ em~, ... , hm~em~) ,

so that it is clear that det(D) =I 0 ife= s and that det(D) = 0 ife> s. Letus also write

W = Vi("ll, ...,"ll),

where Vi(tn , ... ,"ll) is the Vanderrnonde matrix oferows associated with theelements "li. Note that in particular we have det(W) =I O. We also have

WDWT = Al ,

since the i-th row of W (i = 0, .." e- 1) is (171, ..., "l~) , the j-th row of

DWT (j = 0, ...,e-1) is (hm, em' ~l" ... , hm, em, rf, )T, and their product islI e e <.

l i+ 'L k=l hm~ em~"lk J = Si+j .

Thus we have det(Al) = det(D) det(W)2, which vanishes ife> s (inthis case det(D) = 0), and is nonzero ife= s (in this case det(D) =I 0 anddet(W) =I 0). 0

Finding the error locator polynomial

Once the number oferrors, s , is known (always assuming that it is not greaterthan t) , we can see how the coefficients of the error locator polynomial [4.5]are obtained. Note that

5

a( z) = II(z - "li) = ZS + alzs- l + a2zs-2 + ...+ as, [4.7]i=l

where aj = (-l)j aj = aj ("ll' ... ,"ls) is the j -th elementary symmetricpolynomial in the "li (0 ~ j ~ s). .

E.4.12. Show that 1+alz+ . ·+aszs is the standard error locator polynomial

n :=o(l - "liZ).

4.17 Proposition. If a = (as, ... ,ad is the vector of coefficients of a andb = (Ss , ... , S2s-l). then a satisfies the relation

Asa =-b

and is uniquely determined by it .

4.4. The Peterson-Gorenstein-Zierler algorithm 217

Proof Since det A s =1= 0, we only have to show that the relation is satisfied.Substituting z by 'TJi in the identity

s

II(z - 'TJi) = ZS + a1zs-1+ ...+ asi=l

we obtain the relations

s + s-l °'TJi a1'TJi + ... + as = ,

where i = 1, ... , s. Multiplying by hm;em;rd, and adding with respect to i,we obtain the relations

where j = 0, .. ., s - 1, and these relations are equivalent to the stated matrixrelation. 0

Algorithm PGZ

Putting together Lemma 4.16 and Proposition 4.17 we obtain an algorithm todecode alternant codes . In essence this algorithm is due to Peterson, Goren­stein and Zierler (see [16]) . In detail, and with the same conventions as inthe BMS algorithm concerning the meaning ofError, we have:

1) Calculate the syndrome vector, yHT = (80 , .. . , 8r - 1). If 8 = 0,return y.

2) Thus we can assume that 8 =1= 0. Starting with s = t, and whiledet(As) is zero, set s = s - 1. At the end of this loop we still haves > °(otherwise 8 would be 0) and we assume that s is the numberof errors.

3) Solve for a in the matrix equation Asa = -b (see Proposition 4.17).After this we have ai, ... , as , hence also the error-locator polynomial CT.

4) Find the elements Ctj that are roots of the polynomial CT. If the numberof these roots is < s, return Error. Otherwise let 'TJ1, ... , 'TJs be theerror-locators corresponding to the roots and set M = {m1, . .. , m s } ,

where 'TJi = Ctm;'

5) Find the error evaluator €(z) by reducing the product

mod zr (see E.4.12).

218 Chapter 4. Altemant Codes

6) Find the errors em; using Forney's formula with the error-locator poly­nomiall +a1z +...+asz s and the error-evaluator c:(z). If any of theerror-values is not in K, return Error. Otherwise return y - e.

4.18 Proposition. The algorithm PGZ corrects up to terrors.

Proof It is an immediate consequence ofwhat we have seen so far. 0

4.19 Remark. In step 5 of the algorithm we could use the alternative poly­nomial evalurator 8(z) = 80zr + 8 1zr-1+ ...+ 8r-1, find the alternativeerror evaluator as the remainder of the division of a(z)8(z) by zr and then,in step 6, we would use the alternative Forney formula (see Remark 4.7).

4.20 Remark. In step 5 of the algorithm we could determine the error coef­ficients em I , . . . , ems by solving the system of linear equations

which is equivalent to the matrix equation

i-: hm2 hms e m I 80

hmI1]l hm21]2 hms1]s em2 81

hm I1]r hm21]~ hms1]; em3 82

h s-s l h s-l hms1]ss-l ems 8s-1mI1]l m21]2

and then return y - e (or an error if one or more of the components of e isnot in K).

Summary

• Assuming that the received vector has a nonzero syndrome, the num­ber oferrors is determined as the first ein the sequence t, t - l , ... suchthat det(Ae) i= 0, where Ae is the matrix [4.6].

• The error-locator polynomial [4.7] is found by solving a system oflinear equations Asa = -b defined in Proposition 4.17.

• The error values are obtained by finding the error evaluator polyno­mial via the key equation, and then applying Forney's formula, or bysolving the system of linear equations in Remark 4.20.

• The three steps above summarize the PGZ algorithm.

x

4.4. The Peterson-Gorenstein-Zierler algori thm

AI Library I The Peterson-Gorenstein-Zierler decoder

pgz(y, C:Code) :=begin

local H, h, a, K. r=UC), t=lr/2J, s=t+1 ,S, A, D, b, o, z, M, E, X, E, S'=O

if Y ¢- Vector thensay("pgz> " Iy I" is not a vector")return 0

endH=H_(C)if length(y)*n_columns(H) then

say("pgz> Vector "I y I" has the wrong length")return 0

endh=h_(C) ; a=a_(C) ; K=K_(C)S=y·HTif zero?(S) then

say("pgz> Code vector")return y

endrepeat

s=s-1 ; A= [[Si+j_1with j in 1..s]with i in 1..s] ; D= IAIuntil D*Ob= [Sj with j in 5+1 ..2 s]cr=solve(A,-b); 0'=1+z ·polynomial (reverse (0'), z)b=invert_entries (a) ; M=zero_positions (o.b)if length(M)<degree(O') then

say ("pgz> Nondecodable vector : too few roots")return 0

endE=remainder(O" polynomial(S,z) .z')E={}O'=cr'for min M do

am' evaluate (E,bm)

hm' evaluate (O',bm)if not subfield?(field(x) ,K) then

say("pgz> nondecodable vector: Forney's value not in base field")return 0

endE=EI{-x}Ym=Ym+x

endsay("pgz> error pattern: ",{M, E})return y

end

219

220

Problems

Chapter4. Altemant Codes

P.4.17. We have explained the PGZ algorithm under the assumption thats ::;; t. If the received vector is such that s > t, analyze what can go wrongat each step of the algorithm and improve the algorithm so that such errorsare suitably handled .

P.4.IS. Let a be a primitive root of IF16 satisfying a 4 = a + 1 and considerthe code C = BCH,A5) . Decode the vector 100100110000100.

1) with the Euclidean algorithm;

2) with the PGZ algorithm.

Appendix: The WIRIS/cc system

The syntax rules of the language state how to form expressions,

statements, and programs that look right.

C. Ghezzi and M. Jazayeri, [7], p. 25.

A.I (User interface). The woos user interface has several palettes and eachpalette has several icons. Palette icons are useful to enter expressions thatare typeset on the screen according to standard mathematical conventions.In the Operations palette, for example, we have icons for fractions, pow­ers, subindices, roots, and symbolic sums and products. Here is a sampleexpression composed with that palette :

In addition to the above style (woos or palette style) the woos interfacealso supports a keyboard style , which only requires the keyboard to composeexpressions . In some cases the keyboard style is indistinguishable from thepalette style, as for example the = operator, but generally it is quite different ,as a.i for the subindex expression ai, or Zn(n) for :En (the ring of integersmodulo n) .

A.2 (Blocks and statements). At any given moment , a woos session has oneor more blocks. A block is delimited by a variable height' ['. Each block con­tains one or more statements. A statement is delimited by a variable lengthI. A statement is either an empty statement or an expression or a successionof two or more expressions separated by semicolons. The semicolon is notrequiered if next expression begins on a new line. An expression is either

222 Appendix

a formula or an assignment. A formula is a syntactic construct intended tocompute a new value by combining operations and values in some explicitway. An assignment is an expression that binds a value to a variable (see A.5for more details).

The active block (active statement) displays the cursor. Initially there isjust one empty block (that is, a block that only contains the null statement).Blocks can be added with the [I icon in the Edit palette (the effect is a newempty block following the active block, or before the first if the cursor was atthe beginning ofthe first block). Pressing the Enter key adds a null statementjust before or just after the current statement according to whether the cursoris or is not at the beginning of that statement.

The semantic rules of the languge tell us how to build meaning­

ful expressions. statements, and programs.

C. Ghezzi and M. Jazayeri, [7], p. 25.

A.3 (Evaluation). The evaluation of all the statements in the active blockis requested with either Ctrl+Enter or by clicking on the red arrow. Thevalue of each statement in the block is computed and displayed next to itwith the red arrow in between (the value ofa statement is the value of its lastexpression). After evaluation, the next block, if there is one, becomes theactive block; otherwise an empty block is appended.

In the following we will typeset input expressions in a sans serif font,like in a+b, and the output values in the usual fonts for mathematics, like ina + b, if this distinction may possibly be helpful to the reader.

A.4 . Integers, like 314, and operations with integers, like (2+ 3) . (11 - 7)or 264 , are represented in the usual way:

Usual representation of integers and their opperations1(2+3)'(11-7) ..... 20

1264 ..... 18446744073709551616WIRIS ignores white space between an operant and an operator

I (2 + 3) . (11 - 7)..... 20

12 64 ..... 18446744073709551616

In the listing above there are two text lines. These lines were entered inthe usual way and converted to text with the T icon in the Edit menu. Ingeneral, this icon converts the active statement into text and vice versa. Textis not evaluated and retains all its typographical features . In particular, allthe services for composing expressions are available while editing it. Finallynote that text is not delimited by a variable height I in the way statementsare.

Appendix 223

Why should a book on The Principles ofMathematics contain a

chapter on 'Proper Names, Adjectives and Verbs'?

A. Wood, [30], p. 191.

A.5 (Naming objects). Let x be an identifier (a letter, or a letter followedby characters that can be a letter, a digit, an underscore or a question mark;capital and lower case letters are distinct). Let e be a formula expression.Then the construct x=e binds e (the value ofe) to x. On evaluating, the valueofx is e. On the other hand the construct x:=e binds e (not e) to x. When werequest the value of x, we get the result of evaluating e at that momement,not the result of having evaluated e once for all at the moment of formingthe construct (as it happens with x=e).

If instead of x=e we use let x=e, then all works as with x=e, but in addi­tion x is used to denote the value e.

An identifier that has not been bound to a value is said to be afree iden­tifier (or free variable). Otherwise it is said to be a bound identifier (or abound variable).

Syntactic constructs that have the form x=e, x:=e or let x=e are calledassignments, or assignment expressions.

Ia=2 .... 2I let b=3 .... b13 .... bIx :=a .... a[x .... 2Ia=5 .... 5[x .... 5

I a=b .... b[x .... b

A.6 (Modular arithmetic and finite prime fields). The construction ofthe ringZp of integers modulo the positive integer p is performed by the functionZn(p), in keyboard style, and with the subindex operation to the symbol Z inwoos style. If we put A=Zn(p) and n is an integer, then the class of n modp is the value of the expression n:A (for more details on this construct, seeA.15). Operations in the ring A are denoted in the usual way:

IA=7135;

Ix=7 :A; y=22 :A;Ix1200.y350+x1000 .y300+500 .... 25

11/y .... 8

11/x

224 Appendix

Note that the inverse of 22 mod 35 is 8, and that the inverse of 7 mod 35does not exist (this is why 1/x is left unevaluated and an error message isreported) . Note also that woos does not display the value of a statementwhich ends with a semicolon. The reason for this is that in this case the lastexpression of the statement actually is null (cf. A.2) .

For j = 0, .. . , n - 1, the value of elementU,Zn) is the class [j]n.The ring Zp is a field if and only if p is prime . Note, in particular, that

Zn(2) constructs the field ofbinary integers.

~[1,1/2,1/3,1/4,1/5,1/6] :Vector(Z'l7) ~ [1,4,5,2,3,6]

A.7 (Construction offield and ring extensions). Suppose K is a ring and thatfE K[x] is a monic polynomial of degree r. Then the function extension(K,f),which can be also called with the expression K[x]/(f), constructs the ring K' =K[xl/(f) . The variable x is assigned the class ofx mod f and K is a subringof K'. The name of K[x]/(f) is still K[x], but here x is no longer a variable,but an element of K' that satisfies the relation f(x) = O. When K is finitewith q elements, then K' is finite with qr elements. If K is a field and f isirreducible over K, then K' is a field. If we want to use a different namefor the variable x of the polynomials and the variable a to which its classmodulo f is bound, we can use extension(K,a,f) .

If R is a finite ring, cardinal(R) yields the number of elements of Randcharacteristic(R) yields the minimum positive number n such that n:R iszero. If n is the cardinal of R, the elements of R can be generated withelementij .R), for j in the range O..(n-1).

To test whether a ring K is a finite field, we can use the Boolean functionFinite_field(K), or its synonym GF(K), which is defined as the expressionis?(K,Field) & (cardinal(K)<infinity)?

If=x5-x+1;Ilet F=extension(Z'ls,a,f) ~ F

Ix10 ~ x10

Ia10 ~ a2+3 'a+1

~f=~5-~+1;

Ilet L=Z'ls[~]/(f) ~ L

I~10 ~ ~2+3'~+1

A.8 (Sequences). Sequences are generated with the comma operator ',' . Theonly sequence with 0 terms is null. If sand t are sequences with m and nterms, respectively, then s.t is the sequence with n + m terms obtained by

Appendix 225

appending t to s. In the latter construction, any object which is not a sequenceacts as a sequence ofone term, The number ofterms ofa sequence s is givenby number_oUtems(s) (or its synonym n_items(s)) and its j-th element byitemtj.s).

A.9 (Ranges). If a, band d are real numbers, d f= 0, then the range r definedby a,b,d is the value of the expression a..b..d. It represents the sequenceof real numbers x in the interval [a, b] that have the form x = a + jd forsome nonnegative integer j. Note that the sequence is increasing if d > 0and decreasing if d < o. It is non-empty if and only if either a = b or b - aand d have equal sign. The expression a..b denotes a sequence of valuesin the interval [a, b] that depends on the function using a ..b as an argument.However, it is usually interpreted as a ..b..1, as we will continue to do unlessstated otherwise.

A.tO (Aggregates and pairings). In the sequel, we will say that an object isan aggregate if it is a range or a sequence of objects enclosed in braces or asequence of objects enclosed in brackets .

The sequence of objects that we enclose in braces or brakets is unre­stricted with the exception of a special kind that here will be called pairingsand which may belong to four flavours: relations, tables, substitutions anddivisors. The table summarizes how these flavours are constructed .

Pairingflavours

e -sb x=a a*b

10 relation table substitutionI [ ] divisor table

Here is a more detailed description . Ifone term ofan aggregate delimitedwith braces has the form a ~ b (a->b in keyboard style), where a and b araarbitrary objects , then all other terms must have the same form (otherwisewe get an error on evaluation) and in this case the resulting object is called arelation (it implements a list ofordered pairs). Divisors are like relations , butusing brakets instead of braces. Tables are like divisors (hence delimited bybrakets), but instead of terms of the form a ~ b they are formed with termsof the form x = a, where x is an identifier and a any object (tables can alsobe delimited with braces) . Substitutions are like relations (hence delimitedwith braces), but its terms have the form x * a (x=>a in keyboard style),with x an identifier and a any object. In this case, x occurs more than once,only the last term in which it occurs is retained.

Given a pairing P, the set of values a for which there exists an object bsuch that a~b is a term in P is called the domain of P and it is the value ofdomain(P) . In the case of divisors, we get the same value with support(P).

226 Appendix

Pairings have a functional charater. To see this, let us consider eachflavor separately. If R is a relation (respectively a table) and a is an object(an identifier), the value ofthe expression R(a) is the sequence (possibly thenull sequence) formed with the values b such that a-.b (a=b) is a term in R.The value ofR is the relation (table) R whose terms have the form a -. R(a)(a = R(a)) where a runs over domain(R) .

I R={H1 , s-z,H1} -+ {H(1,1) , b~2}I R(1) -+ 1,1I R(c) -+ nullI R(c)=5 -+ {H(1,1), e-z, c~5}I R -+ {H(1,1) , b~2, c~5}

A nice additional feature of tables is that if x is any identifier, then T(x)can also be obtained as x(T).This often makes clearer the meaning ofexpres­sions .

I T={a=1 , b=2, a=1} -+ {a=(1,1), b=2}IT(a) -+ 1,1IT(c) -+ nullIT(c)=5 -+ {a=(1,1), b=2, c=5}I c(T) -+ 5

Now let us consider a divisor D. Ifa is an object, the value ofthe expres­sion D(a) is the sum (possibly 0) ofthe values b such that a-sb is a term in D.The value of D is the divisor D whose terms have the form a -. D(a), wherea runs over domain(D). Divisors can be added. The sum D+E of the divisorsD and E is characteristized by the rule (D+E)(a)=D(a)+E(a) for all a .

I D=[ 1-+1,b-+2, 1-+1] -+ [H2, b~2]I D(1) -+ 2I D(c) -+ 0I D(c)=5 -+ [1-+2, b-+2, c-+5]I E=[c"'-3] -+ [c~-3]

I D+E -+ [1-+2, b-+2, c-+2]

Finally if S is a substitution and b is any object, S(b) returns the resultof replacing any occurrence ofx in b by the value a corresponding to x in S,for all x in the domanin of S.

In all cases, the ordering of the value P of a pairing P is done accordingto the internal ordering of domain(P).

If x is an aggregate, the number of elements in x, say n, is the value oflength(x). We remark that the number of terms in a sequence s is not given

Appendix

IS={a~1, b~2} -+ {a~1,b~2}

IS(a4·b3 ·c2) -+ 8'c2

IS(C) -+ cIS(c)=5 -+ {a~1,b~2,c~5}

IS(a4·b3 .c2) -+ 200

227

by length(s), but by number_oUtems(s) (with synonym n_items(s)) . Thefunction range(x) returns the range Ln. Moreover, if i is in the range 1..n ,then Xi (or x.i in keyboard style) yields the i-th element of r; otherwise itresults in an error.

Let x still denote an aggregate of length n, and k a positive integer in therange Ln. Then take(x,k) returns an aggregate of the same kind (except,as we explain below, when x is a range) formed with the first k terms of x .To return the last k terms of x, write -k instead of k. The function tail(x)is defined to be equivalent to take(x,-(n-1)). When x is a range of length n,take(x,k) is the list of the first k terms of x (and take(x,-k) is the list of thelast k terms of x; in particular, tail(x) is the list ofthe last n - 1 terms of x) .

If x and yare aggregates of the same kind, but not ranges, then xly (orjoin(x,y)) yields the concatenation ofx and y.

~ R={H1, b~2, H1, c~3} -+ {H(1 ,1), s-z, c~3}~ length(R), range(R), R3 -+ 3, 1..3, c-+3

~0=[ 1-+1 , b-+2, 1-+1] -+ [H2, b-+2]

101 -+ 1-+2

Idomain(O),5Upport(0) -+ {1, b} , {1 , b}

~T={a=1 , b=2, a=1, c=5} -+ {a=(1 ,1) , b=2 , c=5}

Idomain(T) -+ {a, b, c}I take(T,-2) -+ {b=2, c=5}

IS={a~1, b~2} -+ {a~1, b~2}

IS=S I{b~3 , c~4 , d~5} -+ {ae t, b~3, c~4, d~5}

I r=range(S) -+ 1..4I take(r,3), tail(r) -+ {1, 2, 3}, {2, 3, 4}

A.II (Lists and vectors). Aggregates delimited by braces (brakets) and whichare not pairings are called lists (vectors) .

Let x denote a range, a list or a vector. Then the call index(a,x) givesoif a does not appear in x and otherwise yields the position index of thefirst occurrence of a in x. Here we again remark that sequences behavedifferently, for the i-th element of a sequence s is given by item(i,s). Toreverse the order ofx we have the function reverse(x). In the case of a range,reverse(a..b..d) yields b..a..-d.'

228

I r=1 ..17..3 .... 1..17..3Ilength(r), reverse(r), index(7,r) .... 6,17..1..-3,311={a, 1, b, 2, c, 3} {a, 1, b, 2, c, 3}I u=[a, 1, b, 2, c, 3] [a, 1, b, 2, c, 3]I index(3,u) 6I reverse(l) {3, c, 2, b, 1, a}

Appendix

For a list or a range x, vector(x) returns the vector formed with the ele­ments of x (this is the same as [x] if x is a range; when x is a list, however,[x] is a vector with x as unique component). On the other hand, Iist(x) yieldsthe list with the same elements as x when x is a vector, and just a list whoseonly element is x when x is a range. More generally, if x is a sequence ofranges, then [x] is the vector formed with the elements of all the ranges in x,while {x} is just the list of the ranges in the sequence.

To append an object a to a list or vector x, we have the function ap­pend(x,a). It can also be obtained with xl{a} if x is a list or xl[a] if x is avector.

I r=1 ..9..2 .... 1..9..2Ivector(r), [r] .... [1,3,5, 7, 9]. [1, 3, 5, 7, 9]I Iist(r), {r} .... {1, 3, 5, 7, 9}, {1..9..2}11=list(r);vector(I), [ij .... [1,3,5,7,9], [{1 , 3, 5, 7, 9}]Iappend(I,11) .... {1, 3, 5, 7,9, 11}I [r] I [11] .... [1,3,5,7,9,11]

The function constant,vector(n,a) yields, if n is a positive integer and aany object, a vector oflength n whose entries are all equal to a (it is denotedan). The function constanUist(n,a) behaves similarly, but produces a listinstead of a vector.

If j lies in the range 1..n, then the expression epsilon_vector(n,j) yieldsthe vector of length n whose entries are all 0, except the j -th, which is 1. Thesame expression for j not in the range 1..n yields the zero-vector of length n,which can also be constructed with the expression constantvectortn.O), orsimply zero_ vector(n).

The sum of the elements of a range, list or vector x is given by sum(x).Similarly, min(x)yields the minimum element of x.

IThe function index(a,x) also works for pairings, but the returned values are usually dif­ferent from what we would expect from the order in which they were defined, as the tenusin pairings are internally ordered according to the internal order given to the domain of thepairing. This is especially manifest with the function reverse(x), which returns x when x is apairing.

Appendix 229

To form the set associated to a list L we have set(L) . This function dis­cards repetions in L and returns the ordered list formed with the remaining(distinct) values.

Finally, the call zero?(x) returns true if the vector x is zero and falseotherwise.

I constant_vector(4,t), constanUist(4,t) .... [t, t, t, t], {t, t, t, t}I epsilon_vector(5,3) .... [0,0, 1,0, 0]Isum({1,2,3.4,5}) .... 15Imin([1000..500..-23]) .... 517Izero?([O,O,1]), zero?([O,O,1-1]) .... false, true

A.12 (Hamming distance, weight and scalar product). The Hamming dis­tance between vectors x and y of the same length is given by the functionhamming_distance(x,y) (or by its synonym hd(x,y». Similarly, weight(x) (orits synonym wt(x») yields the weight of vector x. The scalar product of twovectors x and y is delivered by (x,y) (or x·y in keyboard style) .

Hamming distance and weight I ExamplesIx=[1,0,1,O ,1,O,1] [1,0,1,0,1 ,0,1]I y=[1,1,1,1,1,1 ,1] [1,1 ,1 ,1 ,1 ,1,1]Iwt(x) 4I hd(x,y) 3I (x,y), x·y .... 4,4

A.13 (Matrices). Structurally, matrices are vectors whose entries are vectorsofthe same length. So ifwe set A=[[1,2],[2,-3]], then A is a 2 x 2 matrix whoserows are the vectors [1,2] and [2,-3]. Writing matrices and working with themcan be done quickly using the Matrices palette (see Matrix expressions).

The expression dimensions(A), A a matrix, gives a sequence m , n suchthat m and n are the number ofrows and columns of A, respectively. The in­tegers m and n can also be obtained separately as the values of the functionsn_rows(A) and n30Iumns(A), which are abbreviations ofnumbecoCrows(A)and number_oCcolumns(A). The transpose of A is AT (or transposetxj).

If A is a matrix, x is a vector, and the number of columns (rows) ofA isequal to the length of x, then the values AxT (respectively xA) are obtainedwith the expressions A·x (respectively x·A). In the expression A·x the vectorx is automatically interpreted as a column vector. Similarly, if A and Barematrices, and the number of columns of A is equal to the number of rows ofB, then A·B computes the product AB.

The element in row i column j of A can be obtained with the expressionAj,j (or A.i.j in keyboard form). On the other hand, if Iand J are lists , AI forms

230 Appendix

91969161193855646

28429075775814

1611938

Matrix expressions

A=(~ : -~ =~);5 4 -3 -5

Idimensions(A) ... 3,4I n_rows(A), n_columns(A) ... 3,41[2,-3, 5]·A ... [21,12,9, -26]IA·[1..4] [6, -12, -16]

B=A·AT (~~ ~~ ~~)14 49 75

IB2 [-6,57,49]

IB2,3 49

IB{2,3}1I2 ... (~~ ~~ i~ ~ nB~,3}{1,3}&I2 -- (~; ~jJ

I IBI,determinant(B) ... 11628,11628I trace(B) ... 150

9375814

B-1... 2842907

91969

a matrix with the intersections of the rows of A specified by I (if these rowsexist) and A1,J forms a matrix with the intersections of the rows and columnsof A specified by I and J, respectively (provided these rows and columnsexist).

IfA and B are matrices with the same number ofrows (columns), then the

matrix AIB (respectively ~) is the value of the expression AlB (respectively

A&B).The expression identity_matrix(r) (or also lr in WIRIS) yields L, the iden­

tity matrix of order r . Another basic constructor is constantrnatrlxtrn.n.a),which yields an m x n matrix with all entries equal to a. The expressionconstantjnatrixtn.a) is equivalent to constant_matrix(n,n,a) .

In the case of square matrices A, its determinant is given by IAI (or de­terminant(A». Similarly, trace(A) returns the trace of A. If IAI =1= 0, then A- 1

(or inverse(A» gives the inverse of A.

A.14 (Boolean expressions). The basic Boolean values are true and false. Ingeneral, a Boolean expression (a Boolean function in particular) is defined asany expression whose value is true or false .

Appendix 231

Ifa and b are Boolean expressions, then not a (or not(a)) is the negationof a, and a&b, alb are the conjunction and disjunction of a and b, respec­tively. The negation of a is true if and only if a is false. Similarly, the con­junction (disjunction) ofa and b is true (false) ifand only ifboth expressionsare true (false).

Another way ofproducing Boolean expressions is with the question mark(?) placed after a binary relation, as for example 3>2 ?, whose value is true,or 3==2?, whose value is false. Since? is considered a literal character,and hence it can be used to form identifiers, the space before? is requiredif, appended to the preceeding token, it would form a valid identifier. If indoubt, use parenthesis to surround the binary relation . On the other hand, ?may be omitted when the binary relation is used in a context that expects aBoolean expression, as for example in the iffield ofa conditional expression.

There are a few Boolean functions whose identifiers end with a questionmark. For example:

even?(n)Yields true if the integer n is even and false otherwise .

invertible?(x)For a ring element x, it yields true when x is invertible in its ring, and falseotherwise.

irreducible?(f,K)For a field K and a polynomial f with coefficients in K, tests whether f isirreducible over K or not.

is?(x,T)Tests whether x has type T (see page 232)

monic?(f)Decides whether a polynomial f is monic or not.

odd?(n)Yields true if the integer n is odd and false otherwise .

prime?(n)Decides whether the integer n is prime or not.

zero?(x)For a vector x, it returns true if x is the zero vector and false otherwise.

Most of the deep differences among the various programming

languages stem from their different semantic underpinnings.

C. Ghezzi , M. Jazayeri, [7], p. 25

232 Appendix

A.IS (Functions). A woos function f (where f stands for any identifier) withargument sequence X is defined by means of the syntax f(X) := F, where F(the body of the function) is the expression of what the function is supposedto do with the parameters X. The same name f can be used for function s thathave a different number of arguments.

Often the value of a function is best obtained through a series S of con­secutive steps. Formally we take this to mean that S is a statement (see A.2,page 221). But a statement S with more than one expression is not con­sidered to be an expression and hence cannot by itself form the body of afunction . To make an expression out of S we have the construct begin Send.

As any statement, S may optionally be preceeded by a declaration of theform local V, where V is a sequence of indentifiers. These identifiers can beused as variables inside the statement S. This means that these variables willnot interfere with variables outside S that might happen to have the samename. The identifiers v in V may be optionally replaced by assignments v=e,e any expression.

A.I6 (Functional programming constructs). Each object is assigned a type(or domain), so that all objects are first class objects. Some of the basic pre­defined types are: Boolean, Integer, Float, Rational, Real, Complex, Field,Ring, Polynomial, Sequence, Range, List,Vector, Matrix, Relation, Table, Di­visor, Function, Substitution, Rule. All predefined types begin with a capitalletter.

The main use of types is in the definition of functions (A.IS). It is im­portant to realize that any object (in particular a function) can be returned asa value by functions (or passed as an argument to functions).

To test whether an object x is of a type T, we have the Boolean funtionis?(x,T), or XET in palette style . For a given T, this is a Boolean function ofx.

~ is?(3/14, Rational), is?(3.14, Q), is?(3.14, Rational) .... true, false, false

~ is?(3.14, Float), is?(3.14, IR) .... true, true

An argument x of a function can be checked to belong to a given typeT with the syntax x:T. In fact T can be any Boolean expression formed withtypes and Boolean functions. This checking mechanism, which we call aftl­ter, allows us to use the same name for functions that have the same numberof arguments but which differ in the type of at least one of them .

A further checking mechanism for the arguments X of a function is thatbetween X and the operator := it is allowed to include a directive of theform check b(X), where b(X) stands for a Boolean expression formed with

Appendix 233

the parameters of the function. Several of the features explained above areillustrated in the listing A sample function.

AILibrary I A samole function 1.1.

Given a finite field K of q elements and a positive integer r that dividesq-1, it returns an element of K of order r

geCelemenCoCorder(r :7Z, K:GF)check r>O& cardinal(K)<+OO & remainder(cardinal(K)-1 , r) =0:= begin

local ro=primitive_element(K) , n=cardinal(K)-1ron/r

end

~ \caretget_element_oCorder(7, 7Z29) .... 16

A.17 (Structured programming constructs). Let us have a look at the basicconstructs that allow a structured control ofcomputations. In the descriptionsbelow we use the following conventions: e is any expression, b is a Booleanexpression, Sand T are statements and X is an aggregate.

ifb then SendEvaluate S in case the value of b is true .

ifb then S else TendEvaluate of S or T according to whether the value of the Boolean expressionb is true or false .

while b do SendKeep evaluating S as long as b evaluates to true.

repeat S until bEvaluate S once and keep doing it so long as the value of b is false.

for x in X do SendEvaluate S for any x running over the items ofX.

for I; b; U do SendStart evaluating I, which must be a sequence of expressions. Then evaluateSand U (U must also be a sequence of expressions) as long as b evaluates totrue.

continueThe presence of this directive somewhere in the S of a while, repeat or foriterators skips the evaluation process of the remaining expressions of Sandcontinues from there on.

breakThe presence of this directive somewhere in the S of a while, repeat or for

234 Appendix

iterators exits the iteration and continues as if it had been completed.

with. such_thatThe expression {e with x in X such_that b } groups with braces the expres­sions e that satisfy b, where the Boolean expressionn b depends on x, whichin turn runs over the terms ofX. The expression {x in X such_that b } groupswith braces the terms x of X that satisfy b. There are similar constructionsusing brakets instead of braces. The directives such_that and where are syn­omyms.

return RExit a function with the value of the expression R.

A.IS (Symbolic sums and related expressions). Let e denote an expression,i a variable and r a list, vector or range. For each value i of r, let e(i) denotethe result of substituting any occurrence of i in e by i . Then the expression

sigma e with i in r (keyboard style) or L e (palette style) yields the sumi in r

2:iEr e(i ). The expression product e with i in r works similarly, but takingproducts instead of sums.

If b is a Boolean expression, then sigma e with i in r such_that b yield asum like 2:i Er e(i), but restricted to the i E r such that b(i) is true. In palette

style, II e (to get a new line for the condition b, press Shift-Enter). The

i in rb

case ofproducts works similarly.

I n=10 - 10n

Ii3- 3025

i=1

I i if;n j3 - 3025

II j3 - 503

i in 1..n where prime?(i)

~1 u _p_-_1_

"2 ' p in 3..20..2 pprime?(p)

55296323323

Bibliography

[1] J. Adamek. Foundations of Doding: Theory and Applications of Error­Correcting Codes with an Introduction to Cryptography and Information The­ory. John Wiley & Sons, 1991.

[2] 1. Anderson. Combinatorial Designs and Tournaments. Clarendon Press,1997.

[3] E.R. Berlekamp. Algebraic Coding Theory. Aegean Park Press, 1984.

[4] R.E. Blahut. Algebraic Methods for Signal Processing and CommunicationsCoding. Springer-Verlag, 1992.

[5] M. Bossert. Channel Coding for Telecommunications. John Wiley & Sons,1999.

[6] J.H. Conway and N.J .A. Sloane. Sphere Packings, Lattices and Groups. Num­ber 290 in Grundlehren der mathematischen Wissenschaften. Springer-Verlag,1988.

[7] C. Ghezzi and M. Jazayeri. Programming Language Concepts. John Wiley &Sons, 3rd edition edition, 1998.

[8] V.D. Goppa. Geometry and Codes. Mathematics and Its Applications. KluwerAcademic Publishers, 1988.

[9] R. Hill. A first course in coding theory. Oxford applied mathematics andcomputing science series. Clarendon Press at Oxford, 1988.

[10] R. Lidl and H. Niederreiter. Introduction tofinite fields and their applications.Cambridge University Press, 1986.

[11] EJ. MacWilliams and N.lA. Sloane. The Theory ofError-correcting Codes.Number 6 in Mathematical Library. North-Holland, 1977.

[12] R.J. McEliece. The Theory of Information and Coding: A MathematicalFramework for Communication. Number 3 in Encyclopedia of Mathematicsand its Applications. Addison-Wesley, 1977.

[13] R.l McEliece. Finite Fields for Computer Scientists and Engineers. KluwerAcademic Publishers, 1987.

236 Bibliography

[14] A.l Menezes, editor. Applications offinitefields. Kluwer Academic Publish­ers, 1993.

[15] O. Papini and J. Wolfmann. Algebre discrete et codes correcteurs. Springer­Verlag, 1995.

[16] W.W. Peterson and E.J. Weldon. Error-Correcting codes. MIT Press, 1972.2nd edition.

[17] A. Po1i. Exercices sur les codes correcteurs. Masson, 1995.

[18] A. Poli and LL. Huguet. Codes correcteurs, theorie etapplications. Number 1in LMI. Masson, 1989.

[19] J.G. Proakis and M. Salehi. Communications Systems Engineering. PrenticeHall, 2nd edition edition, 2002.

[20] S. Roman. Coding and Information Theory. Number 134 in Graduate Texts inMathematics . Springer-Verlag, 1992.

[21] C.E. Shannon. A mathematical theory of communication. Bell System Tech.J., 27:379-423,623-656, 1948.

[22] M. Sudan. Algorithmic Introduction to Coding Theory. MlTlWeb,2002.

[23] Y. Sugiyama, M. Kasahara, S. Hirasawa, and T. Namekawa . A method forsolving the key equation for decoding goppa codes. Information and Control,27:87-99,1975.

[24] M. Tsfasman and S. Vladut. Algebraic-geometric codes. Kluwer, 1991.

[25] lH. van Lint. Introduction to Coding Theory. Number 82 in GTM. Springer­Verlag, 1999. 3rd edition.

[26] lH. van Lint and R.M. Wilson. A course in combinatorics. Cambridge Uni­versity Press, 1992.

[27] S.A. Vanstone and P.C. van Oorschot. An Introduction to Error CorrectingCodes with Applications. The K1uwer Intern. Ser. in Engineering and Com­puter Science. Kluwer Academic Publishers, 1989.

[28] S.B. Wicker. Deep space applications. In V.S. Pless and W.C. Huffinan, edi­tors, Handbook ofCoding Theory, volume II, pages 2119-2169. Elsevier Sci­ence n.v 1998.

[29] S.B. Wicker and v,K. Bhargava, editors. Reed-Solomon codes and their ap­plications. The Institute of Electrical and Electronics Engineers, 1994.

[30] A. Wood. Preliminary notes to Russell's Philosophy : a Study of its Develop­ment. 1959. Appended to Bertrand Russell 's My Philosophical Developement,Unwin Books.

[31] S. Xambo-Descamps . OMEGA: a system for effective construction , cod­ing and decoding of block error-correcting codes. Contributions to Science,1(2):199-224, 2001. An earlier version of this paper was published in theproceedings of the EACA-99 (Fifth "Encuentro de Algebra Computaciona1 yApplications" (Meeting on Computational Algebra and Applications) , SantaCruz de Tenerife, September 1999).

Index of Symbols

(~)lxJfxlt n

xly(xly)IXIA*A[X]ZnIFq , GF(q)IF*q[L :K]I r

M~(A)

n!/k!(n - k)! (binomial number)Greatest integer that is less than or equal xSmallest integer that is greater than or equal xVector [t, ..., t] oflength n . Examples used often: On and InConcatenation of x and y (x and y may be vectors or lists)X l Y I + ...+ X n Yn (scalar product of vectors x and y)Number of elements of a set XGroup of invertible elements of the ring ARing ofpolynomials in X with coefficients in ARing of integers modulo nField of q elements (q must be a prime power)Group of nonzero elements of IFq

Degree of a field L over a subfield KIdentity matrix of order rMatrices of type k x n with entries in A

Other than the general mathematical notations explained above, we list sym­bols that are used regularly in the book, followed by a short description and,enclosed in parenthesis, the key page number(s) where they appear. We donot list symbols that are used only in a limited portion of the text.

pSTCq

k,kcR,Rchd(x, y)d,dc(n,M,d)q[n,k ,d]qIxlx ·yJ,JctB(x,r)DcPe(n , t ,p)

Probability of a symbol error in a symmetric channel (2)Source alphabet (14)Channel alphabet (14,16)A general code (15)Cardinal of the channel alphabet T (16)Dimension of a code C (16)Rate ofa code C (16)Hamming distance between vectors x and y (17)Minimum distance of a code C (17)Type of a code, with short form (n,M)q (17)Type of a code, with short form [n ,k]q (17)Weight of vector x (17, 36)Componentwise product of vectors x and y (17)Relative distance of a code C (17)Correcting capacity of a code (20)Hamming ball of center x and radius r (20)Set ofminimum distance decodable vectors (20)Upper bound for the code error probability (23)

238

p(n, t,p)Rq(n,d)kq(n, d)Aq(n, d)Bq(n,d)CvOlq(n, r)(G)RSo:(k)Vk(O:)D(o:)Hamq(r)Ham~(r)

RM~(m)

pzCC)Kj(x,n, q)cp(n)J.L(n)Iq(n)ord(a)indo: (x)TrL/K(a)NmL/K(a)

CfBCHw(o, t)

r(g ,0:)

T/iS,S(z)a(z)E(Z)

Index ofSymbols

Upper bound for the error-reduction factor (24)Maximum rate R among codes (n, M, d)q (26)Maximum dimension k among codes In,k, d]q (26)Maximum cardinal M among codes (n, M, d)q (26)Maximum qk among linear codes In, k , d]q (81)

Parity completion of a binary code (27)Cardinal of B(x, r) (27)Linear code generated by the rows of matrix G (37)Reed-Solomon code of dimension k on 0: (39)Vandermonde matrix of k rows on the vector 0: (39)Vandermonde determinant of 0: = al , "" an (41)The q-aryHamming code of codimension r (51)The q-arydual Hamming code (51)First order Reed-Muller code of dimension m + 1 (62)Residue of C with respect to Z E C (79)Krawtchouk polynomials (90)Euler 's function (104)Mobius' function (119)Number of irreducible polynomials of degree n over IFq (120)Order of a (124)Index of x with respect to the primitive element a (129)Trace of a E Lover K (137)Norm of a E Lover K (137)Cyclic code defined by a polynomial f (145)BCH code of designed distance 0 and offset t,w a primitive n-th root of unity (166)Altemant code defined over K corresponding tothe altemant control matrix of order r associatedto the vectors h , 0: E f< (182)Classical Goppa code (191)Error locators (195)Syndrome vector and polynomial for alternant codes (195)Error-locator polynomial (196)Error-evaluator polynomial (196)

The table below includes the basic WIRISicc expressions explained in the Ap­pendix which might be non-obvious. The first field contains the expressionin a keyboard style, optionally followed by the same expression in palettestyle (A.1, page 221). The second field is a brief description of the expres­sion. The WIRISicc functions used in the text are included in the alphabeticalindex.

Index ofSymbols

=-. ,~

==, =->, ---+

!=, i=>=, ~

<=, ~

=>,=?$$

I&Zn(p), Zp

n:Ax=ea..b..da..b[Xl ,' " , Xn ]

{Xl , '" ,Xn }

A"1ATIAIAlB

A&Bx·y, (x,y)

Assignment operatorDelayed assignment operatorBinary operator to create equationsOperator to create pairs in a relation or divisorNot equal operatorGreater than or equal operatorLess than or equal operatorOperator to create pairs in a substitution or ruleOperator to create compound identifiers (92)Disjunction (and concatenation) operator (231)Boolean conjunction operator (231)Ring of integers mod p, IFp ifp is prime (223)Natural image of n in A (223)Bounds the value of e to X (223){x E [a , b] Ix = a+ jd,j E Z,j ~ O}(225)Usually a ..b..1 (225)Vector whose components are Xl, . . . , Xn (227)List of the objects xi , . .. , Xn (227)The inverse of an invertible matrix A (230)The transpose of a matrix A (229)The determinant of a square matrix A (230)If A and B are matrices with the same number of rows,the matrix obtained by concatenating each row ofAwith the corresponding row of B (230)Matrix A stacked on matrix B (A.B, page 230)The scalar product ofvectors X and y (A.12, page 229)

239

Alphabetic Index, Glossary and Notes

The entrie s of the index below are ordered alphabetically, with the convention thatwe ignore blanks, underscores, and the arguments in parenthesis of functions. Eachentry is followed by information arranged in two fields. The first field, which is op­tional , consists of the page number or numbers, enclosed in parenthesis, indicatingthe more relevant occurrences of the entry. The second field may be a pointer toanother index entry or a brief summary of the concept in question.

There are two kinds of of entries: those that refer to mathematical concepts,which are typeset like alphabet, and those that refer to WIRIslcc constructs, whichare typeset like aggregate. Those entries that do not refer directly to a conceptintroduced in the text (see, for example, covering radius or Fermat's little theorem)constitute a small set of supplementary notes.

In the summaries, terms or phrases written in italics are mathematical entries.Acronyms are written in italic boldface, like sse. Terms an phrases typeset as ag­gregate refer to WIRIslcc entries.

A

active block (222) In a WIRIS session, the block that displays the blinking cursor (athick vertical bar).

active statement (222) In a WIRIS session, the statement that displays the blinkingcursor (a thick vertical bar).

AD(y,C) (205) Equivalent to alternant decoderty.O).

additive representation (128) [ofan element ofa finite field with respect to a subfield]The vector of components of that element relative to a given linear basis of the fieldwith respect to the sufield. Via this representation, addition in the field is reduced tovector addition in the subfield.

aggregate (225) A range , list, vector or pairing.

alphabet A finite set. See source alphabet and channel alphabet. See also binaryalphabet.

a/ternant bounds (182) For an alternant code C , de ;;::: r + 1 (alternant bound for theminimum distance) and n -r ;;::: ke ;;::: n-rm (alternant bounds for the dimension),where r is the order on the altern ant control matrix H E M~(K) that defines C andm=[R:K].

a/ternant codes (182) An alternant code over a finite field K is a linear code of theform {x E K" IxHT = O}, where H E M:;'(K) is an altemant control matrixdefined over a finite field R that contains K . General RS codes and BCH codes arealternant codes (see page 185 for RS codes and page 187 for BCH codes) .

alternanCcode(h,a,r) (182) Creates the alternant code associated to the matrix con­trol_matrix(h,a ,r).

a/ternant control matrix (180) A matrix H of the form H = Vr (o:)diag (h ) EM:;,(R), where r is a positi ve integer (called the order of the control matrix) andh ,O: E Rn (we say that H is defined over R).

alternanCdecoder(y,C) (205) Decodes the vector y according to the BMS decoder forthe alternant code C.

242 Index-Glossary

alternantmatrbqh.a.r) (180) Creates the alternant control matrix of order r associ­ated to the vectors h and a (which must be of the same length) .

append(x,a) (228) Yields the result of appending the object a to the list or vector x. Itcan also be expressed as xl{a} or xl[a] ifx ifx is a list or vector, respectively.

assignment expression See assignment.asymptotically bad I good families (94) See Remark 1.79.asymptotic bounds (96) Any function of 8 = din that bounds above lim sup kin

(when we let n -+ 00 keeping din fixed) is called an asymptotic upper bound(asymptotic lower bounds are defined similarly). To each of the upper (lower)bounds for Aq(n, d) there is a corresponding asymptotic upper (lower) bound.

McEliece, Rodemich, Rumsey and Welsh (page 95) have obtained the best up­per bound. Corresponding to the Gilbert-Varshamov lower bound for Aq (n, d) thereis an asymptotic Gilbert-Varshamov bound (see E.1.55).

B

Bassalygo-Elias bound See Elias bound.

BCH Acronym for Bose-Chaudhuri-Hocquenghem.BCH(a,d,l) (187) Creates the code BCHe,,(d, l) of designed distance d and offset l

based on the element 0: .

BCH bounds See BCH codes.BCH codes The BCH code of length n, designed distance 8 and offset £ over lFq ,

which is denoted C = BCHw ( 8, f) , is the cyclic code with roots wf+i, for i =0, . .. , 8 - 2, where w is a primitive n-th root of unity in a suitable extension ofIFq - The minimum distance of C is at least 8 (BCH bound of the minimum distance,Theorem 3.15). The dimension k ofC satisfies n - (8 -1) ~ k ~ n - m(8 - 1),where m is the degree of w over IFq (BCH bounds of the dimension, Proposition3.18). In the binary case, the dimension lower bound can be improved to k ~ n-mtif 8 = 2t + 1 (Proposition 3.19). When £ = 1, we write BCH w(8) instead ofBCHw(8, 1) and we say that these are strict BCH codes. Primitive BCH codes areBCH codes with n = qm - 1 for some m.

begin Send (232) This construct builds an expression out of statement S .BER Acronym for bit error rate .

Berlekamp algorithm [for factoring a polynomial over lFq ] In this text we have onlydeveloped a factorization algorithm for X n -1 (page 158). There are also algorithmsfor the factorization of arbitrary polynomials. For an introduction to this subject,including detailed references , see [14], Chapter 2.

bezout(m,n) This function can be called on integers or univariate polynomials m, n.It returns a vector [d, a, b] such that d = gcd (m, n) and d = am + bn (Bezout'sidentity). In fact a and b also satisfy lal < Inl and Ibl < Iml in the integer case, anddeg (a) < deg nand deg (b) < deg m in the polynomial case.

binary alphabet (15) The set {O, I} of binary digits or bits. Also called binary num­bers.

binary digit (2,15) Each of the elements of the binary alphabet {O, I}. With additionand multiplication modulo 2, {O, I} is the field Z2 of binary numbers.

binary symmetric channel (2) In this channel, the symbols sent and received are bitsand it is characterized by a probability p E [0,1] that a bit is altered (the same for

Index-Glossary 243

oand 1). Since p denotes the average proportion of erroneous bits at the receivingend, it is called the bit error rate (BER) of the channel.

bit (2, 15) Acronym for !2.inary dig!!.. It is also the name of the fundamental unit ofinformation.

bit error rate (2) See binary symmetric channel.block (221) In a WOOS session, each ofthe sets of statements delimited by a variable

height T. Each statement in a block is delimited with a variable height 'I'.block code (16) If T is the channel alphabet and q = ITI, a q-ary block code of

length n is a subset C ofT" .block encoder (15) An encoder f : 5 -+ T* such that f(5) ~ T" for some n (called

the length of the block encoder).blow(h,F) (183) For a finite field F=K[x]/(f) and a vector h with entries in F, it creates

a vector with entries in K by replacing each entry of h by the sequence of its com­ponents with respect to the basis 1, x , . . . ,xr - l of F as a K-vector space . If H is amatrix with entries in F, then blow(H,F) is the result of replacing each column h ofH by blow(h,F).

BMS (203) Acronym for Berlekamp-Massey-Sugiyama.BMS algorithm (204) A decoding algorithm for altemant codes based on Sugiyama's

method for solving the key equation and on Forney's formula for the determinationof the error values.

Boolean expression (230) An expression whose value is true or false .Boolean function (230) An function whose value is true or false .BSC Acronym for binary symmetric channel .

ccapacity (4) It measures the maximum fraction ofsource information that is available

at the receiving end of a communication channel. In the case of a binary symmet­ric channel with bit error probability p, it is a function C(p) given by Shannon'sformula C(p) = 1 + plog2(p) + (1- p) log2(1- p).

cardinal(R) (224) The cardinal of a ring R.Cauchy matrix (194) A matrix of the form (1/(a; - f3j )) , where a l , ' " ,an and

131, . . . ,f3n belong to some field and a; =1= f3j for 1 ~ i, j ~ n.cauchy_matrix(a,b) Given vectors a and b, this function constructs the Cauchy matrix

(1/(a; - bj)), provided that a; =1= bj for all i, j with i in range(a) and all j inrange(b).

channel Short form of communication channel.channel alphabet (14) The finite set T of symbols (called channel symbols) with

which the information stream fed to the communication channel is composed. It isalso called transmission alphabet. Although we do make the distinction in this text,let us point out that in some channels the input alphabet may be different from theoutput alphabet.

channel coding theorem (4) If C is the capacity of a BSC, and Rand e are positivereal numbers with R < C , then there exist block codes for this channel with rate atleast R and code error rate less than c.

channel symbol See symbol.characteristic [of a ring or field] See characteristic homomorphism.characteristic(R) (224) The characteristic of a finite ring R.

244 Index-Glossary

characteristic homomorphism (109) For a ring A , the unique ring homomorphismc/>A : Z ~ A. Note that necessarily c/>A(n) = n·1A. The image of'Z is the minimumsubring ofA (which is called the prime ring of A). Ifker (c/>A) = {O}, then A is saidto have characteristic o. Otherwise the characteristic of A is the minimum positiveinteger such that n . lA = O. In this case, there is a unique isomorphism from Znonto the prime ring of A. If A is a domain (has no zero-d ivisors), then the primering is Z if the characteristic is 0 and the field Zp if the characteristic p is positive(p is necessarily prime, because otherwise Zp would not be a domain) .

check b(X) (232) In the definition of a function, between the argument sequence (X)and the operator := preceeding the body ofthe function, it is allowed to include a di­rective of the form check b(X), where b(X) stands for a Boolean expression formedwith the arguments of the function. This mechanism extends the filtering mecha­nism with types (A.l6, page 232) .

check matrix (40) For a linear code C, a matrix H such that x E C if and only ifxHT = O. Also called parity-check matrix , or control matrix. For an introductoryillustration, see Example 1.9.

classical Goppa codes (191) Altemant codes C = AK(h,a, r), h, a ERn, suchthat there exists a polynomial 9 E R[X] of degree r with g(ai) =1= 0 and hi =

I/g(ad, i = 1, .. . ,n. As shown in PA.6, this definition gives the same codesas Goppa's orig inal definition, which looks quite different at a first glance. Thealtemant bounds give de ~ r + 1 and n - r ~ ko ~ n - rm, m = [R : K] .Strict BCH codes are classical Goppa codes (Proposition 4.5) . In the case of binaryclassical Goppa codes we have de ~ 2r + 1 if 9 does not have multiple roots(PA.7).

clear X Assuming that X is a sequence of identifiers , this command undoes all theassignments made earlier to one or more of these identifiers .

code In this text, short form of block code.code efficiency See code rate.code error (3) It occurs when the received vector either cannot be decoded (decoder

error) or else its decoded vector differs from the sent vector (undetectable error).The number of erroneous symbols at the destination due to code errors is usuallybounded above using the observation that for a code of dimension k a code errorproduces at most k symbol errors .

code error rate (3) The probability of a code error. For an introductory example, seeRep(3).

code rate (16) For a code C, the quotient kin, where k and n denote the dimensionand the length of C, respectively. It is also called code efficiency or transmissionrate. For an introductory example, see Rep(3).

code vector (16) A code word.code word (16) An element of the image of an encoder, or an element of a block

code. It is also called a code vector.coefficient(f,x,j) For a polynomial f, the coefficient of xi . If f is multivariate, it

is considered as a polynomial in x with coefficients which are polynomials in theremaining variables.

coefficienCring(f) This function gives the ring to which the coefficients of a polyno­mial f belong.

collect(f,X) (143) Assuming X is a list of variables and f a multivariate polynomial,

Index-Glossary 245

this function returns f seen as a polynomial in the variables X (with coefficientswhich are polynomials in the variables off which are not in X. The argument X canbe a single variable x, and in this case the call is equivalent to collect(f,{x}).

communication channel (14) The component of a communication system that effectsthe transmission of the source information to the destination or receiver.

communication system According to Shannon's, widely accepted model, it has aninformation source, which generates the messages to be sent to the information des­tination, and a communication channel, which is the device or medium that effectsthe transmission of that information.

complete decoder Afull decoder .Complex (232) The type associated to complex numbers. Denoted C in palette style .conference matrix (70) A square matrix C with zeroes on its diagonal, ±1 entries

otherwise, and such that CCT = (n - l)In , where n is the order of C .conference_matrix(F) (70) Computes the conference matrix associated to a finitefield

F of odd cardinal.conjugates (135) The set of conjugates of a E L, L a finite field, over a subfield K

of Lis{

q q2 qr-l}a ,a , a , . .. ,a ,

where q = IKI and r is the least positive integer such that aqr

= a . If K is notspecified, it is assumed that it is the prime field of L .

conjugates(a,K) (135) Yiels the list of conjugates of a over K. The element a is as­sumed to belong to a finite field that contains Kas a subfield . The call conjugates(a)is equivalent to conjugates(a,K), where K is the prime field of the field of a.

constanUist(n,a) (228) The list of length naIl of whose terms are equal to a .constanCmatrix(m,n,a) (230) The matrix with m rows and n columns all whose en­

tries are equal to a. constant_matrix(n,a) is equivalent to constant rnatrbqn.n.a).constanCvector(n,a) (228) The vector an (the vector oflength n all of whose com­

ponents are equal to a) .constrained_maximum(u,C) Computes the maximum of a linear polynomial u with

respect to the constraints C (a list oflinear inequalities). This function is crucial forthe computation of the linear programming bound LP(n,s) .

control matrix See check matrix.control polynomial (146) For a cyclic code Cover K, the polynomial (X" - l)/g,

where n is the length of C and 9 its generator polynomial. The control polynomialis also called check polynomial.

correcting capacity (20) A decoder 9 : D ~ C for a q-ary block code C ~ T" hascorrecting capacity t (t a positive integer) if for any x E C and any y E T " suchthat hd(x, y) ~ t we have y ED and g(y) = x.

covering radius If C ~ T ", where T is finite set, the convering radius of C is theinteger

r = max minhd(y, x ).yETn xEC

Thus r is the minimum positive integer such that U B(x, r) = T ": The GilbertxEC

bound (Theorem 1.17) says that if C is optimal then r ~ d - 1, where d is theminimum distance of C .

crossover probability In a binary symmetric channel, the bit error probability.

246 Index-Glossary

cyclic code (142) Said of a linear code C ~ IF~ such that (an , aI, . .. , an-dECwhen taj , .. . ,an-l ,an) E C .

cyclic_control_matrix(h,n) (150) Constructs the control matrix of the cyclic code oflength n whose control polynomial is h.

cyclic_matrix(g,n) (148) Constructs a generating matrix of the cyclic code oflengthn whose generator polynomial is g.

cyclic_normalized30ntrol_matrix(g,n) (149) Constructs the normalized control gen­erating matrix of the cyclic code oflength n whose generator polynomial is 9 corre­sponding to cyclic_normalized_matrix(g,n).

cyclic_normalized_matrix(g,n) (149) Constructs a normalized generating matrix ofthe cyclic code oflength n whose generator polynomial is g.

cyclic_product(a,b,n,x) (143) Equivalent to cycllcreductlorua-b.n.x), assuming thata and b are polynomials, n a positive integer and x a variable. We may omitx for univariate polynomials . Finally the call cyclic_product(a,b) works for twovectors a and b of the same length, say n, and it yields the coefficient vector ofcyclicreductiorua -b'vn), where a' and b' are univariate polynomials whose coeffi­cient vectors a and b. The result is padded with zeros at the end to make a vector oflength n.

cyclicJeduction(f,n,x) (142) Replaces any monomial xi in the polynomial f by x [j ]n .The polynomial f can be multivariate. The call cyclic_reduction(f,n) works for uni­variate polynomials and is equivalent to cyclic_reduction(f,n,x), where x is the vari­able off.

ciclic_shift(x) (18) Ifx is the vector (Xl , . . . , xn), the vector (xn,X l, . . . ,xn- d.ciclic_shifts(x) (18) Ifx is a vector, the list of cyclic permutations ofx.ccnstantjnatrbqrn.n.a) (230) The m x n matrix whose entries are all set equal to a.

The call constant_matrix(n,a) is equivalent to constant_matrix(n,n,a).cyclotomic class (157) For an integer q such that gCd(q,n) = 1, the q-cyclotomic

class of an integer j modulo n is the set {jqi mod n }O~i<r, where r is the leastpositive integer such that jqr == j mod n.

cyclotomlcclassq.n.q) (157) The value of this function is the q-cyclotomic class ofan integer j mod a positive integer n, where q is required to be a positive intgerrelatively prime with n.

cyclotomic_classes(n,q) (157) For two relatively prime positive integers nand q, itgives the list of the q-cyclotomic classes mod n.

cyclotomic polynomial (160) The n-th cyclotomic polynomial over IFq (see P.3.7) isQn = flgcd U,n)=l (X - wi), where w is a primitive n-th root ofunity in a suitableextension field oflFq (for example IFqm , m = ordn(q)).

cyclotomic_polynomial(n,T) (162) Provided n is a positive integer and T a variable,it returns the n-th cyclotomic polynomial Qn(T) E ZIT].

D

decoder A decoding function .decoder error (20) It occurs when the received vector is non-decodable .decodectrial(C,s,K) (209) For a code C over a finite field Kand a positive integer s,

this function generates a random vector X of C, a random error vector e of weights, and calls alternant_decoder(x+e,C). If the result x' coincides with x, [x,e, X +e] is returned. If x' =1= x, but otherwise is a vector, [x,e, x + e, x'] is returned.

Index-Glossary 247

If x' is not a vector, [x,ej is returned. The call decodeUrial(C,s) is equivalentdecodeUrial(C,s,K), where K is the base field ofC.

decoding function (19) For a block code C ~ T " , a map 9 : D --> C such thatC ~ D ~ T" and g( x ) = x for all x E C. The elements c f D , the domain of g, aresaid to be g-decodable.

degree (134) 1 The degree of a finite field L over a subfield K, usually denoted[L : K j, is the dimension of L as a K-vector space . If K is not specified, it isassumed to be the prime field of L. 2 The degree of a E L over a subfield K of Lis [K[aj : K] . It coincides with degree of the minimum polynomial ofa over K . IfK is not specified , it is assumed to be the prime field of L.

degree(f) Yields the degree of a polynomial f.designed distance See BCH codes.determinant(A) (230) Assuming A is a square matrix, the determinant of A (IAI in

palette style) .diagonal_matrix(a) (180) Creates the diagonal matrix whose diagonal entries are the

components of the vector a .dimension (16) The dimension ofa q-ary block code C oflength n is kc = logq(M),

where M = ICI.dimensions(A) (229) For a matrix A with m rows and n columns, the sequence m,n.discrete logarithm See exponential representation.Divisor (232) The type associated to divisors .divisor (225) A WOOS divisor is a sequence of terms of the form a --> b, where

a and b are arbitrary objects, enclosed in brakets. . Divisors can be thought of asa kind of function . Indeed, if 0 is a divisor and x any object, then O(x) returnsthe sum (possibly 0) of the values y such that x-+y appears in O. Divisors can beadded . The sum 0+0' of two divisors 0 and 0 ' is characterized by the relation(O+O')(a)=O(a)+O'(a),. for all a.

divisors(n) (106) The list of positive divisors of a nonzero integer n.domain A ring with no zero-divisors. In other words, a ring in which x . Y = 0 can

only occur when x = 0 or y = O.domain(R) (225) For a relation R, the set of values a for which there is a term of the

form a-+b in R. A similar definition works for the other pairing flavours. In case Ris a divisor, it coincides with supporteR).

dual code (40) IfC is a linear code over II"q, the code

Cl-={hEF~1 (xlh) =0 forall X E C }.

dual Hamming code (54) Denoted Ham~(r), it is the dual of the Hamming codeHamq(r). It is equidistant with minimum distance «:' (Proposition 1.43). Thesecodes satisfy equality in the Plotkin bound (Example 1.65).

E

Element(Field) The type of objects that happen to be an element of some field.elementO,R) (224) For a finite ring R of cardinal n, this function yields the elements

ofR when j runs over the range 0..(n-1).elias(x) (95, 96) Computes the asymptotic Elias upper bound.

248 Index-Glossary

Elias bound (86) For positive integers q,n ,d, r with q ~ 2, r :( (3n (where (3 =(q - l)/q) and r 2

- 2(3nr + (3nd > 0,

(3nd qnAq(n, d) :( 2 2(3 (3 d ()"r - nr + n vOlq n, r

encoder (15) In a communication system, an injective map f : S ....... T* from thesource alphabet S to the set T* of finite sequences of channel symbols .

entropy (94) The q-ary entropy function is

for 0 :( x :( 1. For q = 2, H2(x) = 1- C(x), where C(x) is the capacity function.The entropy plays a basic role in the expression of some of the asymptotic bounds.

entropy(x,q) (94, 96) Computes the entropy function Hq(x). The call entropy(x) isequivalent to entropy(x,2) .

epsilon_vector(n,j) (228) If j is in the range 1..n, the vector oflength n whose com­ponents are all 0, except the j-th, which is I. Otherwise, the vector On.

equidistant code (54) A code such that hd(x, y) is the minimum distance for alldistinct code words x and y.

equivalent codes See equivalence criteria.equivalence criteria (18, 19) Two q-ary block codes C and C' oflength n are said to

be equivalent if there exist permutations a E Sn and Ti E Sq (i = 1, . .. , n) suchthat C' is the result of applying Ti to the symbols in position i for all words of C,i = 1, ... ,n, followed by permuting the n components of each vector accordingto a . In the case where the q-ary alphabet T is a finite field and each Ti is thepermutation of T given by the multiplication with a nonzero element of T, thecodes are said to be scalarly equivalent.

erf(n,t,p) 24 Computes the error-reduction factor of a code of length n and errorcorrecting capacity t for a symmetric channel with symbol error probability p.

error detection (18, 32) A block code of minimum distance d can detect up to d - 1errors . It can also simultaneously correct t errors and detect s ~ t errors if t + s < d(P.l.l).

error-locator polynomial (196 ,215) [in relation to an alternant code AK(h,o,r)and an error pattern e] 1 In the BMS decoding approach, the polynomial a(z) =

n ;=1 (1 - 'fJi Z), where 'fJ1 , .. . , tt » are the error locators corresponding to the errorpattern e. 2 In the PGZ approach, the polynomial a(z) = n ;=1 (z - 'fJi)' This defi­nition can also be used, with suitable modifications, in the BMS approach (Remark4.7 and PA.9).

error locators (195) [in relation to an alternant code AK(h, 0 , r) and an error patterne] The values 'fJi = Q m i (1 :( i :( s) where {m1 , ... .rn,} ~ {I,··· ,n} is thesupport of e (the set of error positions).

error-evaluator polynomial (196) [in relation to an alternant code AK(h, 0 , r ) andan error pattern e] The polynomial f(Z) = 2:;=1 hmiem i n j=1 ,Ni(1- 'fJiZ), where'fJ1, ... , 'fJs are the error locators corresponding to the error pattern e.

error pattern (195) See error vector.error-reduction factor (3, 24) The quotient of the symbol error rate obtained with a

coding/decoding scheme by the symbol error rate without coding.

Index-Glossary 249

error vector (195) Also called error pattern, it is defined as the difference y - xbetween the sent code vector x and the received vector y.

Euler's function <pen) (104) For an integer n > 1, cp(n) counts the number of integersk such that 1 ~ k ~ n - 1 and gcd (k ,n) = 1. It coincides with the cardinal of thegroup Z~ . The value of cp(l) is conventionally defined to be 1.

evaluate(f,a) Yields the value of a polynomial f at a . Here a is a list with as manyterms as the number of variables of f, as in evaluate(x3+y5,{3,2}). In the case ofa single variable, the braces surrounding the unique object can be omitted (thusevaluate(x3+x5,{17})and evaluate(x3+x5,17) are equivalent).

evaluation (222) In a WIRIS session, the action and result of clicking on the red ar­row (or by pressing Ctrl+Enter). The value of each statement of the active blockis computed and displayed, preceeded by a red arrow icon, to the right of the state­ment.

even?(n) (231) This Boolean function tests whether the integer n is even.exponent [of a monic irreducible polynomial] See primitive polynomial.exponential representation (128) The expression x = a i ofa nonzero element x of

a finite field L as a power of a given primitive element a . The exponent i is calledthe index (or discrete logarithm) of x with respect to a, and is denoted indo(x) . Ifwe know i = indo(x) andj = indo(y), then xy = a k , where k = i + j mod q-l(q the cardinal of L). This scheme for computing the product, which is equivalentto say that indo(xy) = indo+indo(Y) mod q - 1, is called index calculus (withrespect to a). In practice, the index calculus is implemented by compiling a table,called an index table, whose entries are the elements x of L (for example in additivenotation) and whose values are the corresponding indices indo(x) .

expression (221) A formula or an assignment (see statement).extension(K,f) (113, 224) Given a ring K and a polynomial fEK[x] whose leading

coefficient is invertible (a monic polynomial in particular), then extension(K,f}, orthe synomym expression K[x]/(f}, constructs the quotient K[x]/(J) and assigns theclass ofx tox itself (thus the result is equivalent to the ring K[x], with f( x) = O. Ifwe want another name a for the class ofx, then we can use the call extension(K,a,f} .

F

facto r(f,K) Given a field K and a polynomial f with coefficients in K, finds the factor­ization off into irreducible factors.

factorization algorithm (158) When gcd(q,n) = 1, the irreducible factors of thepolynomial X" - 1 E IFq[X] are fe = lliEdX - wi), where C runs over theq-cyclotomic classes mod nand w is a primitive n-th root of unity in IFqm, m =ordn(q).

false (230) Together with true forms the set of Boolean values .Fermat's little theorem It asserts that if p is a prime number and a is any integer

not divisible by p, then aP -1 == 1 (mod p). This is equivalent to say that any

nonzero element a of the field Zp satisfies a P- 1 = 1. We have established twogeneralizations of this statement: one for the group Z~ (see E.2.5) and another forthe multiplicative group of any finite field (see E.2.6).

Field (232) The type associated to fields.field (101) A commutative ring for which all nonzero elements have a multiplicative

inverse.

250 Index-Glossary

field(a) Assuming a is an element of some field F, this function returns F. For exam­ple, the value offield(2/3) is <ell, and the value offield(5:Z11 ) is Zn .

filter (232) Any Boolean expression F, which may include types, used in the form x:Fto check that an argument x of a function has the required properties.

Finite_field(K) (224) This Boolean function, or its synonym GF(K), tests whether Kis a Galois field.

finite fields (Ill) A finite field (also called a Galois field) is a field K whose cardinalis finite. If q = IKI, then q must be a prime power (Proposition 2.8): q = p", pthe characteristic of K and r a positive integer. Conversely, if p is a prime numberand r a positive integer, then there is a finite field of p" elements (Corollary 2.18).Furthermore, any two fields whose cardinal is pr are isomorphic (Theorem 2.41).This field is denoted IFq » or GF(q) . Note that IFp = Zp, the field of integers mod p.For any divisor s of r there is a unique subfield of IFpr whose cardinal is p", andthese are the only subfields ofIFpr (E.2.I5).

f1ip(x,l) The components Xi of the vector x indicated by the elements i of the list I arereplaced by 1 - Xi . For a binary vector this amounts to complement the bits at thepositions indicated by I. We remark that I can be a range or a vector. On the otherhand,flip (x) is equivalent to f1ip(x, 1..n), n the length ofx.

Float (232) The type of decimal numbers .formula (222) See statement.formula expression (222) See statement.Forney's formula (196) Expression for the error values in terms of the error-locator

and error-evaluator polynomials (proposition 4.8).Frobenius automorphism (III) 1 If L is a finite field of characteristic p, the map

L -+ L such that X I-t x P (absolute Frobenius automorphism of L). 2 If K is asubfield of L with IKj = q, the map L -+ L such that x I-t x q is an automorphismof Lover K (Frobenius automorphism of Lover K) .

frobenius(x) (III) If x lies in a field of characteristic p, the value x" , The call trobe­nius(K,x), where K is a finite field of q elements, yields x" ,

full decoder (I 9) A decoder 9 : D -+ C such that D = T" (hence all length nvectors are decodable) .

Function (232) The type associated to functions , in the sense of A.I5, page 232.

G

gcd(m,n) The greatest common divisor of two integers (or of two univariate polyno­mials) m and n.

geometric_progression See (geometric_series).geometric_series(x,n) (103) Given expression x and a non-negative integer t, returns

the vector [1 ,x, . . . , xn-1j (note that n is its length) . The call geometr ic_series(a,x,n),where a is an expression, is equivalent to a-qeornetrtcserlestx.n). Synonym: geo­metric_progression.

gilbert(x) (94, 96) Computes the asymptotic Gilbert lower bound.g-decodable See decoding function .Galois field (34) Seefinitefields.Gauss algorithm (131) A fast procedure for finding a primitive element of a finite

field (P.2.I 0).

Index-Glossary 251

generalized Reed-Solomon codes (185) Alternant codes AK(h,a,r) for whichK = K (and hence such that h , a E K").

generating matrix (37) For a linear code C , a matrix G whose rows form a linearbasis of C. In particular, C = (G) .

generator polynomial (146) For a cyclic code Cover K , the unique monic divisor 9of X " - 1 such that C = Cg (Proposition 3.3).

GF(K) (224) Synonym of Finite_field(K), which tests whether K is a Galois field.Gilbert lower bound (30) Aq (n ,d) ~ q" / volq (n, d - 1).Gilbert-Varshamov bound (51) Aq(n ,d) ~ qk if (n, k ,d) satisfy the Gilbert-Var­

shamov condition .Gilbert-Varshamov condition (50) For positive integers n, k, d such that k ~ nand

2 ~ d ~ n + 1, the relation vOlq(n - I ,d - 2) < «:". It implies the existence ofa linear In, k, d]q code (Theorem 1.38).

Golay codes (151, 169) The binary Golay code is a cyclic [23,12,7] code (see Ex­ample 3.21, page 169). The ternary Golay code is a cyclic [11,6, 5h code (seeExample 3.6, page 151). For the weight enumerators of these codes, and their paritycompletions, see P.3.3 (ternary) P.3.l0 (binary).

Goppa codes See classical Goppa codes.Griesmer bound (80) If In, k, d] are the parameters of a linear code, then we have

(Proposition 1.58) Gq(k,d) ~ n, where Gq(k, d) = E~:~ rd/qil (this expressionis called Griesmer function).

griesmer(k,d,q) (80, 81) Computes the Griesmer function Gq(k, d) = E~:~ r: lwhich is a lower bound on the length nof a linear code In,k,d]q.

Griesmer function See Griesmer bound.GRS Acronym for Generalized Reed-Solomon.GRS(h,a,r) (186) Creates the GRS code of codimension r associated to the vectors h

and a, which must have the same length.

H

Hadamard codes (72) If H is a Hadamard matrix of order n, the binary code CH rv

(n ,2n,n/2) formed with the rows of Hand - H after replacing -1 by 0 is theHadamard code associated to H . A Hadamard (32,64,16) code, which is equivalentto the Reed-Muller code RM2(5), was used in the Mariner missions (1969-1972).For the decoding of Hadamard codes, see Proposition 1.55 (page 74).

hadamard_code(H) (72) Returns the Hadamard code associated to the Hadamardmatrix H.

hadamard_code(n) (72) For positive integer n, this function returns the Hadamardcode associated to the Hadamard matrix us».

hadamard_code(F) (72) For a finite field F of odd cardinal, we get the Hadamardcode associated to the Hadamard matrix of F.

Hadamard matrix (63, 71) A square matrix H whose entries are ±1 and such thatH HT = nIn , where n is the order of H .

hadamard_matrix(n) (64) The 2n x 2n Hadamard matrix ni» ,hadamard_matrix(F) (71) The Hadamard matrix of a finite field F of odd cardinal.Hadamard decoder (74) The decoder for Hadamard codes based on Proposition 1.55.

252 Index-Glossary

hadamard_decoder(y,H) (74) Decodes vector y by the Hadamard decoder of theHadamard code associated to the Hadamard matrix H.

hamming(x) (95, 96) Computes the asymptotic Hamming upper bound.Hamming code (21, 51) Any linear code over IF = IFq that has a check matrix H EM~(IF) whose columns form a maximal set with the condition that no two of themare proportional. They are perfect codes with length n = (qr - l)/(q - 1) andminimum distance 3, and are denoted Hamq(r). The check matrix H is said to be aHamming matrix . The binary Hamming code Ham2(3) is the [7,4,3] code studiedin Example 1.9. For the weight enumerator of the Hamming codes, see Example1.46. The Hamming codes are cyclic (Example 3.14, page 165).

Hamming distance (17) Given a finite set T and x, y E T", it is the number ofindicesi in the range l..n such that Xi =1= Yi and it is denoted hd(x, V).

hamming_distance(x,y) (229) The Hamming distance between the vectors x and y.For convenience can be called with the synonym hd(x,y).

Hamming matrix (52) See Hamming code.Hamming's upper bound (28) Aq(n ,d) ~ q" /volq(n, t), t = L(d - 1)/2J .hamming_weighCenumerator(q,r,T) (59) The weight enumerator, expressed as a poly-

nomial in the variable T, of the codimension r Hamming code over a field of cardi­nal q. The call hamming_weighCenumerator( r,T) is equivalent to hamming_weight_­enumerator(2,r,T).

Hankel matrix (215) The matrix (Si+i) , 0 ~ i, j ~ £. - 1, associated to the vector(So, . . . , S£-I) .

hankel(S) (215) Produces the Hankel matrix associated to the vector S.hd(x,y) (229) Synonym of hamming_distance(x,y), which yields the Hamming dis­

tance between the vectors x and y.

I

icon (221) Each of the choices in a palette menu of the WIRIS user interface, oftenrepresented with an ideogram.

ideal (102) An additive subgroup I of a commutative ring A such that a . x E I forall a E A and x E I .

identifier (223) A letter, or a letter followed by characters that can be a letter, a digit,an underscore or a question mark. Capital and lower case letters are distinct. Iden­tifiers are also called variables.

identity_matrix(r) (230) The identity matrix of order r, L:if b then Send (233) Evaluates statement S when the value ofthe Boolean expression

b is true, otherwise it does nothing .if b then S else Tend (233) Evaluates statement S when the value of the Boolean

expression b is true, otherwise evaluates statement T.incomplete decoding (50) A correction-detection scheme for linear codes used when

asking for retransmission is possible . It corrects error-patterns of weight up to theerror-correcting capacity and dectects all error-patterns of higher weight which arenot code vectors .

index(a,x) (227) For a vector or list x, 0 if the object a does not belong to x, otherwisethe index of first occurrence of a in x.

index calculus See exponential representation.index table See exponential representation .

Index-Glossary 253

ind_table(a) (129) Given an element a of a finite field, it returns the relation {ai -+j}o~i~n-I, n the order of a. In order to incluce 0, we also add the pair {O->_}(herewe use the underscore character, but often the symbol 00 is used instead).

infimum(f,r) (96) For a function f and a range r, it returns the minimum of the valuesf(t) when t runs over r. Iff is continuous on a closed interval [a, b] , we can approx­imate the minimum off in [a ,b], as closely as wanted, by choosing r=a..b..ewith E

small enough.information source (14) The generator of messages in a communication system. A

(source) message is a stream ofsource symbols.inner distribution (89) For a a code C '" (n , M)q , the set of numbers a; = AdM,

i = 1, . .. , n, where Ai is the set ofpairs (x, y) E ex C such that hd(x, y) = i . Inthe case of a linear code it coincides with the weight distribution.

input alphabet See channel alphabet.Integer (232) The type associated to integers. Denoted Z in palette style.inverse(x) 1 The inverse, x-lor 1/x, of an invertible ring element x. 2 If R is a

relation, the relation formed with all pairs b-+a such that a->b is a pair of R.inverCentries(x) (181) Gives the vector whose components are the inverses of the

components of the vector x.invertible?(x) (231) For a ring element x, it yields true when x is invertible in its ring,

and false otherwise.is?(x,T) (232) Tests whether x has type T or not. Takes the form xET in palette style.ISBN (35) Acronym for International Standard Book Number (see Example 1.19).is_prime_power?(n) (35) For an integer n, decides whether it is a prime power or

not.irreducible?(f,K) (231) For a field K and a polynomial f with coefficients in K, tests

whether f is irreducible over K or not.irreducible_polynomial(K,r,T) (121) Given a finite field K, a positive integer r and a

variable T, returns a monic irreducible polynomial ofdegree r in the variable T andwith coefficients in K . Since it makes random choices, the result may vary fromcall to call.

item(i,s) (225) The l-th term of a sequence s .

J

Jacobi logarithm See Zech logarithm.Johnson's upper bound (87,88) An upper bound for binary codes which improves

Hamming's upper bound (Theorem 1.72, and also Proposition 1.71).join(x,y) (227) The concatenation of x and y (both vectors or both lists). It coincides

with xly.

K

keyboard style (221) Conventions for representing the action of palette icons bymeans of a keyboard character string. For example, the symbol Z for the ring ofintegers that can be input from the Symbol palette has the same value as Integer.Similarly, the fraction ~ composed by clicking the fraction icon in the Operationsmenu has the same value as a/b.

key equation (197) For alternant codes defined by an alternant matrix of order r,the error-evaluator polynomial E(Z) is, modulo z", the product of the error-locator

254 Index-Glossary

polynomial a-(z) times the syndrome polynomial S(z) (Theorem 4.9).Krawtchouk polynomials (90) The q-ary degree j Krawtchouk polynomial for the

length n is

K j(x,n,q) = ~(-l)SG)C=:)(q _l)j-s .

As shown by Delsarte, these polynomials playa fundamental role in the linear pro­gramming bound (cf. Propostion 1.74 and Theorem 1.75).

L

Ib_gilbert(n,d,q) (31) The value of rqn/vo/(n ,d - 1, q)l , which is the Gilbert lowerbound for Aq(n, d). The expression Ib_gilbert(n,d)is defined to be Ib_gilbert(n,d,2).

Ib_gilbert_ varshamov(n,d,q) (31) The value qk, where k is the highest positive in­teger such that volq(n - 1, d - 2) < qn-k (Gilbert-Varshamov condition). Theexpression Ib_gilbert_varshamov(n,d) is defined to be Ib_gilbert3arshamov(n,d,2) .

Icm(m,n) The least common multiple of two integers (or of two univariate polynomi­als) m and n.

leaders' table (45) A table E = {s -. eS}SEIF~-k formed with a vector e, E IF~

of syndrome s with respect to a check matrix H of a code tn,k]q and which hasminimum weight with that condition. The vector e, is called a leader of the set ofvectors whose H-syndrome is s.

Legendre character (66,67) Given a finite field IF of odd cardinal q, the map

X : IF* -. {±l}

such that X(x) = 1 if x is a square (also called a quadratic residue) and X(x) = -1if x is not a square (also called a quadratic non-residue) . Conventionally X(O) isdefined to be O.

lefCparity_extension(G) See parity_extension(G).legendre(a,F) (67) Yields the Legendre charater of the element a of a finite field F.length (15) 1 The length ofa vector, word, or list is the number of its elements. 2 The

length of a block encoder or of a block code is the length of any of its words .length(x) (226) The length ofx (an aggregate) .let (223) The sentence let x=e binds the value of e to x and assigns the name x to this

value.linear codes (35) A code whose alphabet T is a Galois field and which is a linear

subspace ofT" .linear encoding (38) If G is a generating matrix of a linear code C over IFq, the

encoding IF~ -. IF~ given by u f-t uG .linear programming bound (91) Aq(n ,d) is bounded above by the maximum of the

linear function 1 + ad +...+ an subject to the constraints ai ~ 0 (d :( i :( n) and

(~) + t aiKj (i) ~ O for jE{O,l , ... ,n} ,J i=d

where K j = Kj(x, n, q), j = 0, . .. , n, are the Krawtchoukpolynomials.List (232) The type associated to lists .

Index-Glossary 255

Iist(x) (228) Ifx is a range or vector, the list whose components are the elements ofx.The expression (x} is, in both cases, a list oflength I with x as its only element.

local X (232) A directive to insert at the beginning of a statement a sequence X oflocal variables for that statement. These variables do not interfere with homonymvariables defined outside the statement. The variables introduced with the localdirective can be assigned values, as in local x, y=O (x is left free, while y is assignedthe value 0).

LP(n,d) (91, 92) Computes the linear programming bound of A q (n, d) . The three­argument call LP(n,d,X), with X a list of inequalities, finds a constrained maximumlike LP(n,d), but with the Delsarte constraints (see Theorem 1.75) enlarged with theinequalities X.

M

macwilliams(n,k,q,A,T) (59) The polynomial q-k(l + (q - l )T )nA(T), where A(T)is the expression obtained after substituting the variable T in A by (1 - T) 1(1 +(q - l)T). The call macwilliams(n,k,A,T) is equivalent to macwilliams(n,k,2,A,T).

MacWilliams identities (59) These determine the weight enumerator of the dual ofalinear code C in terms of the weight enumerator of C (see Theorem 7).

main problem (26) See optimal code.Mariner code (72) See also Hadamard codes.Matrix (232) The type associated to matrices . If R is a ring, Matrix(R) is the type of

the matrices whose entries are in R.Mattson-5olomon matrix (190) A matrix of the form M = (O'ij ) , 0 ~ i ,j ~ n - 1,

where 0' is an element of a finite field with ord(0') = n . The map x f-> xM is calleda Mattson-Solomon transform.

mattson_solomon_matrix(a) (190) Creates the Mattson-Solomon matrix on the ele­menta.

Mattson-5olomon transform (190) See Mattson-Solomon matrix.maximum distance separable (25) Codes [n , k , d]q that satisfy equality (k + d =

n + 1) in the Singleton bound (k + d ~ n + 1).mceliece(x) (95, 96) Computes the McEliece, Rodemich, Rumsey and Welsh asymp­

totic linear programming upper bound.MDS Acronym for maximum distance separable .Meggitt algorithm (174) A decoding algorithm for cyclic codes based on a Meggitt

table.Meggitt decoder (175) A decoder for cyclic codes that implements the Meggitt algo­

rithm.meggitt(y,g,n) (175) For a positive integer n and a polynomial y=y(x) of degree < n

representing the received vector, this function returns the result of decoding y withthe Meggitt decoder of the cyclic code defined by the monic divisor 9 of x n - 1. Itpresupposes that a Meggitt table E is available.

Meggitt table (174) Given a cyclic code of length n over IFq - the table E = {s -t

ax n - 1 + e}, where a is any nonzero element of IFq» e is any polynomial of degree< n - 1 and weight < t, where t is the error correcting capacity of the code , and sis the syndrome ofax n - 1 + e.

min(x) (228) For a range, list or vector x, the minimum among its terms .

256 Index-Glossary

min_weight(X) (47) Given a list of vectors X, the minimum among the weights ofthevectors in X.

min_weights(X) (47) The sublist ofa list ofvectors X whose weight is min_weight(X).minimum distance (17) Given C ~ T " ; where T is a finite set, the minimum distance

of C is

dc = m in{hd(x, x') IX, x ' E C, z =I- x ' } ,

where hd (x , x ') is the Hamming distance between z and x' ,minimum distance decoder (20) For a q-ary block code C ~ T " , the decoder

g : Dc -> C (Dc = UXEC B (x , t), t = L(d - 1)/2j) such that g(y) = x forall y E B (x ,t) . Its correcting capacity is t .

minimum polynomial (134) Given a finite field L and a subfield K , the minimumpolynomial of a E Lover K is the unique monic polynomial Po; E K[X] of leastdegree that has a as a root. Note that Po; is irreducible.

minimum_polynomial(a,K,X) (134) Calculates the minimum polynomial of a overK, with variable X. The element a is assumed to belong to a finite field that con­tains K as a subfield. The call minimum_polynomial(a,X) is equivalent to mini­mum_polynomial(a,K,X) with K set to the prime field of the field of a.

minimum weight (36) For a linear code, the minimum of the weights of its nonzerovectors. It coincides with the minimum distance.

Mobius function J.L (n) (119) If n > 1 is an integer, J1- (n) is 0 if n has repeated primefactors. Otherwise is I or -1 according to whether the number of prime factorsis even or odd. The value of J1- (I ) is conventionally defined as 1. If nand n' areposi tive integers such that gcd (n,n') = 1, then J1- (nn' ) = J1- (n )J1-(n' ). One usefulproperty of the function J1- is that it satisfies the Mobius inversion formula (E.2.18) .

Mobius inversion formula See Miibius function J1- (n ).mod See remainder.monic?(f) (231) Tests whether the polynomial f is monic or not.mu_moebius(n) Computes, for a posi tive integer n, Mobiusfun ction J1-(n) .

N

n_items(s) (224) The number of terms of the sequence s .n_columns(A) (229) Abreviation of number_oCcolumns(A) . Yields the number of

columns of the matrix A.n_rows(A) (229) Abreviation of number_oUows(A). Yields the number of rows of

the matrix A.

Nm(Q,L,K) (138) For an element a and a subfield K of a finite field L, this functionreturns NmL/K (a) . The call Nm(Q,L) is equivalent to Nm(Q,L,K) is with Kthe primefield of L. The call Tr(Q) is equivalent to Tr(Q,L) with L=field(Q).

noiseless channel (15) A channel that does not alter the symbols sent through it.noisy channel ( IS) A channel that is not noiseless (symbols sent through it may thus

be altered).non-decodable (20) Given a decoder, said of a vector that is not in the domain of this

decoder.norm (36, 137) 1 The weight is a norm on F~ (E.1.20). 2 The norm of a finite

field L over a subfield K is the homomorphism NmL/ K : L* -> K* such that

Index-Glossary 257

NmL/K(a) = det(ma:) , where ma: : L ---* L is the multiplication by a (the K­linear map x t-+ ax). For an expression ofNmL/ K (a) in terms of the conjugates ofa, see Theorem 2.43.

normalized Hamming matrix (52) See E.l.33.normalized_hamming_matrix(K,r) (52) The normalized Hamming matrix of r rows

with entries in the finite field K.not b (231) Negation ofthe Boolean expression b. It is equivalent to not(b).null (224) The empty sequence (or the only sequence having 0 terms).numbecoCcolumns(A) (229) The number of columns of the matrix A.number_oUtems(s) (224) The number ofterms of the sequence s. Can be shortened

to n_items(s).numbecoCrows(A) (229) The number ofrows of the matrix A.

oodd?(n) (231) This Boolean function tests whether the integer n is odd.offset See BCH codes.optimal code (26) For a given length n and minimum distance d, the highest M such

that there exists a code (n, M, d)q. This M is denoted Aq(n, d). The determinationof the function Aq (n, d) is often called the main problem of coding theory.

order (124) 1 The order of an element a of a finite group G is the least positiveinteger r such that aT = e (e the identity element of G) . Ifn = IGI, then rln . 2 Inthe case of the multiplicative group K* of a finite field K, the order of o, denotedord(a) , divides q - 1. Conversely, if r divides q - 1, then there are exactly <p(r)elements of order r in K (Proposition 2.34). 3 If n > 1 is an integer, and a isanother integer such that gcd (a,n) = 1, the order of a = [aln in the group Z~,

denoted en(q) (or also ordn(a», divides <p(n). 4 The order of an alternant controlmatrix is the number of rows of this matrix .

order(a) (125) If a is a nonzero element of a finite field, it computes the order of a .order(q,n) (156) The order ofthe integer q in the group of invertible elements modulo

the positive integer n, provided q and n are relatively prime .output alphabet See channel alphabet.

p

pad(a,r,t) (143) From a vector a, say oflength n, a positive integer r and an expressiont, it constructs a vector of length r as follows. If n ~ r, the result is equivalent totake(a,r) . Otherwise it is equivalent to alconstant; vector(n-r,t). The call pad(a,r) isequivalent to pad(a,r,O).

pairing (225) A relation, a table, a substitution or a divisor.palette (221) Each of the icon menus of the WOOS user interface . In this release

(November 2002) the existing palettes are, in alphabetical order, the following:Analysis, Combinatorics, Edit, Format, Geometry, Greek, Matrix, Operations, Pro-gramming, Symbols and Units. .

palette style (221) The characteristic typographical features and conventions of inputexpressions composed with palette icon menus. Designed to be as close as possibleto accepted mathematical practices and conventions, it is also the style in whichoutput expressions are presented.

Paley codes (74) See E.l.46.

258 Index-Glossary

paley_code(F) (74) Constructs the Paley code (E.I.46) associated to a Finite field.Paley matrix (68) The Paley matrix of a finite field IF = {Xl, . .. ,Xq } , q odd, is

(X(Xi - Xj)) E Mq(IF ), 1 :::; i, j :::; q, where X is the Legendre character of IF.paley_matrix(F) (68) This function computes the Paley matrix ofa finitefield F ofodd

cardinal.parity-check matrix See check matrix .parity completion (151, 38) 1 The result of appending to each vector ofa binary code

the binary sum of its component bits . Thus the parity completion ofan (n, M ) codeis an (n + 1, M) code with the property that the binary sum of the component bitsof each of its elements is zero. 2 More generally, the result of adding to each vectorof a code defined over a finite field the negative of the sum of its components . Thusthe parity extension of an (n , M )q code is an (n + 1, M )q code with the propertythat the sum of the components of each of its vectors is zero.

parity_completion(G) (lSI) The argument G is assumed to be a matrix r x n and thefunction returns a matrix r x (n + 1) whose last column is the negative of the sumsof the rows ofG. The function left_parity_extension(G) is defined in the same way,but the extra column is inserted at the beginning of G.

parity extension Parity comp letion.perfect codes (29) Codes that satisfy equality in the Hamming upper bound. For a

review of what is known about these codes, see Remark 1.16, page 30.period [of a monic irreducible polynomial] See primitive polynomial.period(f,q) (128) Assum ing f is a moni c irreducible polynomial with coefficients in

field of q elements, this function computes the period (or exponent) of f.PGZ Acronym for Peterson-Gorenstein-Zierler.PGZ algorithm (217) A decoding scheme for alternant codes based on a solving a

linear system of equations (cf. Proposition 4.17, page 216) for the determination ofthe error-locator polynomial.

phi_euler(n) (104) Computes Euler 's fun ction <p(n) .plotkin(x) (95) Computes the asymptotic Plotkin upper bound.Plotkin bound (83) If (n , M , d)q are the parameters of a code, then we have the

inequality d :::; /3nM / (M - 1), or M(d - /3n) :::; d, where /3 = (q -1 )/q. The dualHamming codes meet the Plotkin bound (Example 1.65). For improvements of thePlotkin bound, see E.I .50 and, for binary codes, E.l.49.

Plotkin construction (33) See P.1.5.polar representation Exponential representation .Polynomial (232) The type of (possibly multivariate) polynomials.polynomial(a ,X) (142) Given a vector a and an indeterminate X, returns a l + a2X +...+anxn- l (the polynomial in X with coefficients a) .

prime?(n) (231) Test whether the integer n is prime or not.prime ring I field See characteristic homomorphism.primitive BCH codes see BCH codes .primitive element (126) In a finite field IFq' any element of order q - 1. A primitive

element is also called a primit ive root. In IFq there are exactly <p(q - 1) primitiveelements (Proposition 2.34).

primitive_element(K) (126) Returns a primitive element of the finite field K.primitive polynomial (127) Any monic irreducible polynomial f E K [X ], where K

is a finite field, such that Q = [X], is a primitive element of the field K [X ]/(f ).

Index-Glossary 259

Since the order of Q coincides with the period (or exponent) of f (E.2.21), which bydefinition is the least divisor d of qr - 1 such that f divides x» - 1 (r = deg U)),f is primitive if and only if f does not divide x» - 1 for any proper divisor d ofq" _ l.

primitive root See primitive element.primitive RS codes See Reed-Solomon codes.principal ideal (102) Given a E A , A a ring, the set {xa Ix E A} is an ideal of A. It

is denoted (a)A' or just (a) if A can be understood, and is called the principal idealgenerated by a.

prune(H) (183) The result of dropping from the matrix H the rows that are linearcombination of the privious rows.

Q

q-ary See block code.q-ary symmetric channel See symmetric channel.QNR(F) (67) When F is a finite field of odd cardinal, it yields the list of quadratic

non-residues ofF.QR(F) (67) When F is a finite field of odd cardinal, it yields the list of quadratic

residues ofF.quadratic residue See Legendre characterquadratic non-residue See Legendre characterquotient(m,n) Given integers m and n (or univariate polynomials with coefficients in

a field), computes the quotient of the Euclidean division of m by n.quotienCand_remainder(m,n) Given integers m and n (or univariate polynomials

with coefficients in a field), yields a list with the quotient and the remainder of theEuclidean division of m by n.

R

Range (227, 232) The type associated to ranges.range (225) A construct ofthe form a..b..d. It represents the sequence ofreal numbers

x in the interval [a, b] that have the form x = a + jd for some nonnegative integerj. The construct a ..b usually is interpreted as a ..b,,1, but it can be interpreted asdifferent sequence of values in the interval [a, b] when it is passed as an argument tosome functions.

range(x) (227) For a vector, list or pairing of length n, the range 1..n.rank(A) Returns the rank of the matrix A.rate (16) Short form of code rate.Rational (232) The type associated to rational numbers. Denoted Q in palette style.rd(K) (207) Picks an element at random from the finite field K. The call rd(s,K) picks

s elements at random from K.rd_choice(X,m) (207) Given a set X of cardinal n and a positive integer m less than

n, makes a (pseudo)random choice of m elements of X. The call rd_choice(n,m),where now n is an integer, chooses m elements at random of the set {I, ... , n}.

rd_error_vector(n,s,K) (208) Produces a random vector of weight s in K" .rd_lineaccombination(G,K) (208) Generates a random linear combination of the

rows of the matrix G with coefficients from the finite field K.

260 Index-Glossary

rd_nonzero(K) (207) Picks a nonzero element at random from the finite field K. Thecall rd_nonzero(s ,K) picks s nonzero elements at random from K.

Real (232) The type associated to real numbers. Denoted lR in palette style.Reed-Muller codes (62) The first order Reed-Muller codes are introduced in P.1.l6.

They are denoted RM~(m). This code is the result of evaluating the polynomialsof degree < 1 in m variables Xl , . . . , X m at n points x = x!, .. . ,z" E JF:;'(qm-1 < n < qm). The Reed-Muller codes of order s, which are not studied inthis book, are defined in a similar way, but using polynomials of degree < s inXl , . . . , X m (see, for example, [25]).

Reed-Solomon codes (39) Introduced in Example 1.24, they are denoted RSo:(k).This code is the result of evaluating the polynomials of degree < k in one variableX over JFq on n distinct elements a: = a1 , " " an of JFq- This code has type[n , k, n - k + 1]q, and hence is MDS. Ifn = q - 1 and a1, .. . , a n are the nonzeroelements ofJFq, RSo:(k) is said to be a primitive RS code. Primitive RS codes areBCH codes (Proposition 3.22).

Relation (232) The type associated to relations.relation (225) A WIRIS relation is a sequence of objects of the form a-sb enclosed

in braces, where a and b are arbitrary objects . Relations can be thought as a kind offunctions . Indeed, if R is a relation and x any object, R(x) returns the sequence ofvalues y (which may be null)such that x-->y appears in the definition of R.

relative distance (17) For a block code C of length n and minimum distance d, thequotient Oc = din.

remainder(m,n) Given integers m and n (or univariate polynomials with coefficientsin a field), computes the remainder of the Euclidean division of m by n.

Rep(3) (2) The binary repetition code oflength 3, or Rep2(3).repetition code (25) The code Repq(n), of type [n, 1, n]q, that consists of the q con­

stant vectors t T" t E T .residue code (79) The residue of a linear code C ~ JF~ with respect to a z E C is the

image of C under the linear map pz : JF~ -+ JF~-s which extracts, for each y E JF~ ,

the n - s components whose index lies outside the support of z (hence s denotesthe length of this support) .

reverse(x) (227) This yields , for a list or vector x, the list or vector with the objectsofx written from the end to the beginning. The call reverse(a..b..d) gives the rangeb..a..-d.

reverse_print(b) (142) By default, polynomials are displayed by decreasing degreeof its terms. This order can be reversed with reverse_print(true). The commandreverse_print(false) restitutes the default order.

Ring (232) The type associated to rings.ring (101) An abelian group (A, +) endowed with a product A x A --> A, (x , y) l--+

X . y, that is associative, distributive with respect to the sum on both sides, and witha unit element 1A which we always assume to be different from the zero elementOA of the group (A, +) . If x . y = y . x for all x, yEA we say that the ring iscommutative .

ring(x) Returns the ring where the element x belongs .ring homomorphism (102) A map f : A -+ A' from a ring A to a ring A' such

that f(x + y) = f( x) + f(y) and f(x · y) = f( x) · f(y) , for all x, y E A, andf(lA) = 1A' .

Index-Glossary 261

RM Acronym for Reed-Muller codes.roots (164) For a cyclic code C of length n over IF'q » the roots of the generator poly­

nomial of C in the splitt ing field of X" - lover IF'q -

roots(f,K) (136) For a univariate polynomial f and field K, it returns the list of roots offin K.

RS Acronym for Reed-Solomon codes.RS(a,k) (185) Creates the Reed-Solomon code ofdimension k associated to the vector

a.Ruffini's rule (108) A polynomial IE K[X], K a field, is divisible by X - a if and

only if I(a) = O. It follows that I has a factor hE K[X] of degree 1 if and only ifI has a root in K.

Rule The type of rules . See Substitution.

ssay(t) This command displays a text t at the message window at the bottom of the

WIRIS screen. The text string correspending to an expression e can be obtainedwith string(e) . It may be useful to note that ift is a string and e an expression, thentie is equivalent to tlstring(e).

scalarly equivalent codes (19) See equivalent codes.sequence(x) (224) Ifx is a range, list or vector, the sequence formed with the com­

ponents ofx.set(L) (229) From a list L, it forms an ordered list with the distinct values ofthe terms

ofL (repetitions are discarded).Shannon (4) Claude Elwood Shannon (1916-2001) is considered the father of the

Information Age . One of his fundamental results is the celebrated channel codingtheorem. "Look at a compact disc under a microscope and you will see music repre­sented as a sequence ofbits, or in mathematical terms, as a sequence ofO's and 1's,commonly referred to as bits . The foundation of our Information Age is this trans­formation of speech, audio, images and video into digital content, and the man whostarted the digital revolution was Claude Shannon" (from the obituary by RobertCalderbank and Neil 1. A. Sloane).

shortening (84) 1fT is a q-ary alphabet and A E T, the code {x E T n - 1 I(xIA) E C}is said to be the A-shortening of the code C ~ T" with respect to the last position.Shortenings with respect to other positions are defined likewise.

shorthand(X,E) (211) Represents the entries of the vector or matrix X by the corre­sponding indices on table E (Zech logarithms). The element 0 is represented by theunderscore character.

Singleton bound (25) A code [n , k , d]q satisfies k + d :( n + 1 (Proposition 1.11,page 25).

source alphabet (14) The finite set S, whose elements are called source symbols,used to compose source messages. A source message is a stream 81> 82 , ' " withs; E S (cf. information source).

source symbol See symbol.sphere-packing upper bound The Hamming's upper bound.sphere upper bound The Hamming's upper bound.splitting field (116) If K is a field and I E K[X] a polynomial of degree r, there

exists a field L that contains K and elements al, . .. , a r E L such that I =

262 Index-Glossary

n~=1 (X - Q i) and L = K(QI l " " Q r ) . This field L is the splitting field of fover K. It can be seen that it is unique up to isomorphisms. For the splitting field ofX" - 1, which plays a key role in the theory of cyclic codes, see Proposition 3.9.

square_roots(a) (125) If a is a field element, it returns a list with its square roots .statement (221) Either an expression or a succession of two or more expressions

separated by semicolons (the semicolon is not requiered if next expression beginson a new line) . An expression is either a formula or an assignment. A formula is asyntactic construct intended to compute a new value by combining operations andvalues in some explicit way. An assignment is a construct of one of the forms x=e,x:=e or let x=e, where x is an identifier and e is an expression.

strict BCH codes BCH codes with offset 1 (see BCH codes).string(e) See say(t).subextension?(K,F) The value of this Boolean function is true if and only ifK coin­

cides with one of the subextensions ofF. The subextensions are defined recursivelyas follows : if E is a field and F=E[x]/(f), then the subextensions of Fare F and thesubextensions of E.

subfield (101) A subring that is a field.sUbring (101) An addtive subgroup of a ring which is closed with respect to the

product and contains the unit of the ring .Substitution (232) The type of substitutions.substitution (225) 1 A WOOS substitution is a sequence of objects of the form x=>a

enclosed in braces, where x is an identifier and a is any object. If an identifier xappears more than once, only its last occurrence is retained. Subtitutions can bethought as a kind of functions . Indeed, ifS is a substitution and b is any object, S(b)returns the result of replacing any occurrence ofx in b by the value a correspondingto x in S. 2 Ifsome x of the pairs x => a is not an identifier, then WIRIS classifies Sas a Rule. Rules work like substitutions, but they look for patterns to be substituted,rather than variables. The type of rules is Rule.

such_that (234) A directive to force that a Boolean condition is satisfied.Sugiyama algorithm (198) Solves the key equation. It is variation of the extended

Euclidean algorithm (see Remark 4.11) for the determination of the gcd of twopolynomials .

sum(x) (228) For a range, list or vector x, the sum of its terms.support (79) The support of a vector x is the set of indices i such that Xi =I O.symbol (14) An element of the source alphabet (source symbol) or of the channel

alphabet (channel symbol).symmetric channel (21) A channel with q-ary alphabet T is symmetric ifany symbol

in T has the same probability of being altered and if any of the remaining q - 1symbols are equally likely.

syndrome (22, 44, 173, 195) 1 If H is a check matrix for a linear code C oflength n,the syndrome with respect to H ofa vector y oflength n is the vector s = yHT . 2 If9 is the generator polynomial ofa cyclic code, the syndrome ofa received vector y isthe rema inder of the Euclidean division of y by 9 (interpreting y as a polynomial).

syndrome-leader decoder (45) The decoder y f--> Y - E(yHT) , where H is a checkmatrix ofa linear code C and E is a leader's table for C.

syndrome polynomial (195) If S = (So, ... I Sr-I) is the syndrome of a receivedvector, the polynomial S(z) = So+ SIZ + ...+ Sr_IZr-l .

Index-Glossary 263

systematic encoding (39) An encoding of the form f : T k -> T" with the propertythat u E T k is the subword of x = f( u ) formed with the components of x at somegiven set of k indices. For example, if the generating matrix G of a linear code ofthe form G = (hiP), then the linear encoding u ....... uG is systematic (with respectto the first k positions) because uG = uluP.

T

Table (232) The type of tables .table (225) A WIRIS table is a sequence of objects of the form x=a enclosed in

brakets (braces are also allowed), where x is an identifier and a is any object. Tablescan be thought as a kind of functions. Indeed, if x is any identifier, T(x) returns thesequence ofvalues a such that x=a is on T.

tail(x) (227) Equivalent to x,n-1 if x is a list, vector or range oflength n .take(x,k) (227) If x is a list or vector of length n, and k is in the range 1..n, the list or

vector formed with the first k terms of x. If instead of k we write -k, we get the listor vector of the last k terms of x. This function also makes sense when x is a range,and in this case it yields the list of the first k terms of x (or the last k for the calltake(x,-k)).

tensor(A,B) (64) Returns the tensor product of the matrices A and B.Tr(a,L,K) (138) For an element a and a subfield K of a finite field L, this function

returns TrL/K(a) . The call Trto .L) is equivalent to Tr(a ,L,K) with K the prime fieldofL. The call Tr(a) is equivalent to Tr(a,L) with L=field(a).

trace (137) The trace of a finite field L over a subfield K is the K-linear mapTrL/K : L -> K such that TrL/K (a) = Tr(m et), where met : L -> L is the multi­plication by a (the K-Iinear map x ....... ax). For an expression of TrL/ K (a) in termsof the conjugates of a , see Theorem 2.43.

trace(A) (230) Assuming A is a square matrix, the trace of A (the sum of its diagonalentries) .

transmission alphabet See channel alphabet.transmission rate See code rate.transpose(A) (229) The matrix obtained by writing the rows of A as columns. De­

noted AT.trivial perfect codes (29) The total q-ary codes T" , the one-word codes and the odd-

length binary repetion codes .true (230) Together with false, they form the set of Boolean values.ub_elias(n,d,q) (86) Gives the Elias upper bound for Aq(n, d).ub_griesmer(n,d,q) (80, 81) The Griesmer upper bound for the function Bq(n, d) .

This bound has the form qk, where k is the highest non-negative integer such that

Gq(k,d) ~ nand Gq(k,d) = I:~:~ r; 1(the Griesmer function).

ub~ohnson(n,d) (88) Gives the Johnson upper bound for A 2(n, d). There is also thecall ubjohnson(n,d ,w), which computes the upper bound introduced in Proposition1.71 for the function A (n , d,w). This function is, by definition , the maximum Mfor binary codes (n , M , d) all whose words have weight w .

ub_sphere(n,d,q) (28) The value of Lqn/vo/(n, t , q)J , t = L(d - 1)/2J, which is thesphere-packing upper bound for Aq(n, d). The expression ub_sphere(n,d) is definedto be ub_sphere(n,d,2).

264 Index-Glossary

undetectable code error See code error.Univariate_polynomial(f) (148) The value of this Boolean funtion is true if f is a

univariate polynomial polynomial and false otherwise .

vVandermonde determinant (41) The determinant, denoted D(al, " ., an), ofa square

Vandermonde matrix Vn(al,"" an) .Vandermonde matrix (39) The Vandermonde matrix of r rows on the elements 0: =

01, .' " an is (a;) , with 0 ~ i ~ r - 1 and 1 ~ j ~ n . It is denoted Vr(o:).vandermonde_matrix(a,r) (180) Creates the Vandermonde matrix of order r associ-

ated to the vector a.vanlint(x) (96, 96) Computes the asymptotic van Lint upper bound.variable Identifier.variable(f) Retrieves the variable of a univariate polynomial.variables(f) Retrieves the list of variables of a polynomial.Vector (232) The type of vectors . If R is a ring, the type of the vectors with compo­

nents in R is Vector(R).vector See code word.vector(x) (228) If x is a range or list, the vector whose components are the elements

ofx. In the case in which x is a range, vector(x) is the same as [xl. When x is a list,however, [xl is a vector oflength I with x as its only component.

vector(p) (142) Ifp is a univariate polynomial al +a2X +...+anXn- l, the vector[al,'" ,anI.

volume(n,r,q) (27) The value of I:~=o (7) (q - l)i (for a q-ary alphabet , this is thenumber of length n vectors that lie at a Hamming distance at most r of any givenlength n vector. The value ofvolume(n,r) coincides with volume(n,r,2).

wweight (17, 36) Given a vector x , the number of nonzero components of x . It is

denoted Ixl or wt(x).weight(x) (229) Yields the weight of the vector x. Synomym expression : wt(x).weight distribution See weight enumerator.weight enumerator (54) Given a linear code C of length n , the polynomial a( t) =

I::::oaiti , where ai is the number of vectors in C which have weight i. The se­quence ao, aI, ... ,an is called the weight distribution ofthe code.

where Synonym of such_that.WIRIS/cc (i, 7) The general mathematical system used to carry out the computations

sollicited by the examples in this book. It includes the interface that takes care ofthe editing (mathematics and text). It is introduced gradually along the text. For asummary, see the Appendix . The examples in this book can be accessed and run athttp://www.wiris.com/cc/.

with (234) A directive to make a variable or variables run over an aggregate or list ofaggregates . For more details see A.17.

word See code word.wt(x) Synonym of weight, yields the weight of the vector x.

Index-Glossary

z

265

zero?(x) (229) Ifx is a vector, returns true if x is zero and false otherwise .zero_positions(f,a) (205) Given a univariate polynomial f and a vector a of length n,

finds the list of indices i in the range l..n such that !(ai) = O.zero_vector(n) (228) The vector On'Zn(p) (223) Ring Zp ofof integers mod p (a field ifp is prime, also denoted IFpo

Zech logarithms (130) In the exponential representation with respect to a primitiveelement a of a finite field, the Zech (or Jacobi) logarithm Z (j) of an exponent j isdefined to be ind,A l + aj) , and so 1 + a j = aZU). Thus a table {j f---* Z(j)} ofZech logarithms allows to find the exponential representation of a sum of terms alsoexpressed in the exponential representation.

zero-parity code (38) For a given length n, the subcode oflF~ formed with the vectorsthe sum ofwhose components is zero. For its weight enumerator see Example 1.45.

Universitext

Aksoy, A.; Khamsi, M. A.: Methods in FixedPoint Theory

Alevras, D.; Padberg M. w.: Linear Opti­mization and Extensions

Andersson, M.: Topics in Complex Analysis

Aoki, M.: State Space Modeling of Time Se­r ies

Audin, M.: Geometry

Aupetit, R : A Primer on Spectral Theory

Bachem, A.; Kern, w.: Linear ProgrammingDuality

Bachmann, G.; Narici, L.; Beckenstein, E.:Fourier and Wavelet Analysis

Badescu, L.: Algebraic Surfaces

Balakrishnan, R.; Ranganathan, K.: A Text­book of Graph Theory

Balser, w.: Formal Power Series and LinearSystems ofMeromorphic Ordinary Differen ­tial Equations

Bapat, R.R: Linear Algebra and Linear Mod ­els

Benedetti, R.; Petronio, c.: Lectures on Hy­perbolic Geometry

Berberian, S.K.: Fundamentals of Real Anal ­ysis

Berger, M.: Geometry I, and II

Bliedtner,J.;Hansen, w.: Potential Theory

Blowey, J. F.; Coleman, J.P.; Craig, A. W.(Eds.): Theory and Numerics of DifferentialEquations

Borger, E.; Gradel, E.; Gurevich, Y.: The Clas­sical Decision Prob lem

Bottcher,A; Silbermann, R : Introduction toLarge Truncated Toeplitz Matr ices

Boltyanski, v.; Martini, H.; Soltan, P.S.: Ex­cur sions into Combinatorial Geometry

Boltyanskii, V. G.; Efremovich, V.A.: Intu­itive Combinatorial Topolog y

Booss, R ; Bleecker, D. D.: Topology andAnalysis

Borkar, V.S.: Probability Theory

Carleson, L.; Gamelin, T. w.: Complex Dy­namics

Cecil, T. E.: Lie Sphere Geometry: With Ap­plications of Subm anifolds

Chae, S.R : Lebesgue Integration

Chandrasekharan, K.: Classical FourierTransform

Charlap, L. S.: Bieberbach Groups and FlatManifolds

Chern, S.: Complex Manifolds without Po­tential Theory

Chorin, A.J.; Marsden, J.E.: MathematicalIntroduction to Fluid Mechanics

Cohn,H.: A Classical Invitation to AlgebraicNumbers and Class Fields

Curtis,M. L.: Abstract Linear Algebra

Curtis, M. L.: Matrix Groups

Cyganowski, S.; Kloeden, P.; Ombach, J. :From Elementary Probability to StochasticDifferential Equations with MAPLE

Dalen, D. van: Logic and Structure

Das, A. : The Special Theory of Relativity: AMathematical Exposition

Debarre, 0.: High er-Dim ens ional AlgebraicGeometry

Deitmar, A.: A First Course in HarmonicAnalysis

Demazure, M.: Bifurcations and Catastro­phes

Devlin, K.J.: Fundamentals of Contempo­rary Set Theory

Dibenedetto, E.: Degenerate ParabolicEquations

Diener, F.; Diener, M.(Eds.): NonstandardAnalysis in Practice

Dimca, A.: Singularities and Topology ofHypersurfaces

DoCarmo, M. P.: Differential Forms and Ap­plications

Duistermaat, J.J.; Kolk, J.A. c. Lie Groups

Edwards, R. E.: A Formal Background toHigher Mathematics Ia, and Ib

Edwards, R. E.: A Formal Background toHigher Mathematics Ila, and Ilb

Emery,M.: Stochastic Calculus in Manifolds

Endler, 0 .: Valuation Theory

Erez, R : Galois Modules in Arithmetic

Everest, G.; Ward, T.: Heights of Polynomialsand Entropy in Algebraic Dynamics

Farenick, D.R.: Algebras of Linear Transfor­mations

Foulds, 1. R.: Graph Theory Applications

Frauenthal,f. c.: Mathematical Modeling inEpidemiology

Friedman, R.: Algebraic Surfaces and Holo­morphic Vector Bundles

Fuks,D. R; Rokhlin, V. A.: Beginner's Coursein Topology

Fuhrmann, P.A.: A Polynomial Approach toLinear Algebra

Gallot, S.;Hulin, D.; Lafontaine, l.: Rieman ­nian Geometry

Gardiner, C.F.: A First Course in Group The­ory

Garding, 1.; Tambour, T.: Algebra for Com­puter Science

Godbillon, c.: Dynamical Systems on Sur­faces

Goldblatt,R.: Orthogonality and SpacetimeGeometry

Gouvea,F.Q.: p-Adic Numbers

Gustafson, K. E.; Rao, D. K. M.: NumericalRange. The Field of Values ofLinear Opera ­tors and Matrices

Hahn, A.[: Quadratic Algebras, Clifford Al­gebras , and Arithmetic Witt Groups

Hajek, P.; Havranek, T.: Mechanizing Hy­pothesis Formation

Heinonen,[: Lectures on Analysis on MetricSpaces

Hlawka, E.; Schoifiengeier, f.; Taschner, R.:Geometric and Analytic Number Theory

Holmgren, R. A.: A First Course in DiscreteDynamical Systems

Howe, R., Tan, E.Ch.: Non-Abelian Har­monic Analysis

Howes, N. R.: Modern Analysis and Topol­ogy

Hsieh, P.-F.; Sibuya, Y. (Eds.): Basic Theoryof Ordinary Differential Equations

Humi, M., Miller, w': Second Course in Or­dinary Differential Equations for Scientistsand Engineers

Hurwitz, A.; Kritikos, N.: Lectures on Num­berTheory

Iversen, R: Cohomology of Sheaves

[acod, l-; Protter, P.: Probability Essentials

Iennings, G. A.: Modern Geometry with Ap­plications

lones, A.; Morris, S. A.; Pearson, K.R.: Ab­stract Algebra and Famous Inposs ibilities

[ost, f.: Compact Riemann Surfaces

lost, l.: Postmodern Analysis

lost, l.: Riemannian Geometry and Geomet­ric Analysis

Kac, V.; Cheung,P.: Quantum Calculus

Kannan, R.; Krueger, C.K.: Advanced Anal­ysis on the Real Line

Kelly, P.; Matthews, G.: The Non-EuclideanHyperbolic Plane

Kempf, G.: Complex Abelian Varieties andTheta Functions

Kitchens, R P.: Symbolic Dynamics

Kloeden, P.; Ombach, f.; Cyganowski, S.:From Elementary Probability to StochasticDifferential Equations with MAPLE

Kloeden, P.E.; Platen; E.; Schurz, H.: Nu­merical Solution of SDEThrough ComputerExperiments

Kostrikin,A. 1.: Introduction to Algebra

Krasnoselskii, M. A.; Pokrovskii, A. v.: Sys­tems with Hysteresis

Luecking, D.H., Rubel,1. A.: Complex Anal­ysis. A Functional Analysis Approach

Ma, Zhi-Ming; Roeckner, M.: Introductionto the Theory of (non-symmetric) DirichletForms

Mac Lane,S.;Moerdijk, 1.: Sheaves in Geom­etry and Logic

Marcus,D.A.: Number Fields

Martinez, A.: An Introduction to Semiclas­sical and Microlocal Analysis

Matousek, [: Using the Borsuk-Ulam Theo­rem

Matsuki, K.: Introduction to the Mori Pro­gram

Mc Carthy, P. l.: Introduction to Arithmeti­cal Functions

Meyer, R. M.: Essential Mathematics for Ap­plied Field

Meyer-Nieberg; P. : Banach Lattices

Mines, R.; Richman, F.; Ruitenburg, w.: ACourse in Constructive Algebra

Moise, E. E.: Introductory Problem Coursesin Analysis and Topology

Montesinos-Amilibia, t. M.: Classical Tessel­lations and Three Manifolds

Morris, P. : Introduction to Game Theory

Nikulin, V. v.;Shafarevich, 1.R.: Geometriesand Groups

Oden, t.t. Reddy,t.N.: Variational Methodsin Theoretical Mechanics

0ksendal, R : Stochastic Differential Equa­tions

Poizat, R : A Course in Model Theory

Polster, R : A Geometrical Picture Book

Porter, j . R.; Woods, R. G.: Extensions andAbsolutes of Hausdorff Spaces

Radjavi, H.;Rosenthal,P.: Simultaneous Tri­angularization

Ramsay, A ; Richtmeyer, R. D.: Introductionto Hyperbolic Geometry

Rees, E. G.: Notes on Geometry

Reisel, R. R : Elementary Theory of MetricSpaces

Rey, W. j. j. : Introduction to Robust andQuasi-Robust Statistical Methods

Ribenboim, P.: Classical Theory of AlgebraicNumbers

Rickart, C.E.: Natural Function Algebras

Rotman, l- l.: Galois Theory

Rubel, L. A.: Entire and Meromorphic Func­tions

Rybakowski,K. P.: The Homotopy Index andPartial Differential Equations

Sagan,H.: Space-Filling Curves

Samelson, H.: Notes on Lie Algebras

Schiff,[; L.: Normal Families

Sengupta, j. K.: Optimal Decisions underUncertainty

Seroul, R.: Programming for Mathemati­cians

Seydel,R.: Tools for Computational Finance

Shafarevich, 1.R.: Discourses on Algebra

Shapiro, I.H.: Composition Operators andClassical Function Theory

Simonnet, M.: Measures and Probabilities

Smith, K.E.; Kahanpiiii, L.; Kekiiliiinen, P.;Traves, w.: An Invitation to Algebraic Ge­ometry

Smith, K. T.: Power Series from a Computa­tional Point of View

Smorynsk i, C.: Logical Number Theory I. AnIntroduction

Stichtenoth, H.: Algebraic Function Fieldsand Codes

Stillwell, j .: Geometry of Surfaces

Stroock, D. w.: An Introduction to the The­ory of Large Deviations

Sunder, V.S.: An Invitation to von NeumannAlgebras

Tamme, G.: Introduction to Etale Cohomol­ogy

Tondeur, P.: Foliations on Riemannian Man­ifolds

Verhulst, F.: Nonlinear Differential Equa­tions and Dynamical Systems

Wong, M. w.: WeylTransforms

Kambo-Descamps, S.: Block Error-Cor­recting Codes

Zaanen, A.C.: Continuity, Integration andFourier Theory

Zhang, F.: Matrix Theory

Zong, c.: Sphere Packings

Zong, C: Strange Phenomena in Convex andDiscrete Geometry