
Topics in Discrete Mathematics

Spring Term 2018/19

ERROR-CORRECTING CODES

ALEXANDER J. MALCOLM

Abstract. These lecture notes form the first topic in the 2018/19 course, Topics in Discrete Mathematics. The second topic, lectured by Dr. Dan Martin, is on cryptography.

Contents

0.1. Course Summary

1. Introduction

2. Linear Algebra Revision

2.1. Fields

2.2. Vector Spaces

3. Linear codes

3.1. Preliminaries

3.2. Two presentations of a linear code

4. Hamming codes and some decoding

5. Dual codes

6. Perfect codes and other properties

7. Equivalent codes and the weight enumerator

Date: February 14, 2019.



0.1. Course Summary.

- Unit code: MATH30002 & MATHM0009
- Level of study: H/6 & M/7
- Credit points: 10
- Teaching block (weeks): 2 (13-18)
- Lecturer: Alex Malcolm
- Email: [email protected]
- Course homepage: https://alexmalcolmblog.wordpress.com/teaching/
- Assessment: This unit is examined entirely by a single exam in May/June 2019. The pass mark is 40 for MATH30002 and 50 for MATHM0009.

- Lectures
  - Tuesdays 10am - SM2, Maths Building
  - Thursdays 5pm - SM1, Maths Building
  - Fridays 5pm - SM1, Maths Building

- Maths Cafe
  - Thursdays 11am - Portacabin 2, Maths Building rear (week 14 onwards)

Although these lecture notes are entirely self-contained, outside reading is of course encouraged. There is a strong literature on error-correcting codes, and the following are particularly recommended:

(1) F.J. MacWilliams and N.J.A. Sloane: The Theory of Error-Correcting Codes;
(2) J.H. van Lint: Introduction to Coding Theory;
(3) R.H. Morelos-Zaragoza: The Art of Error-Correcting Coding.

1. Introduction

Transmitting messages is an important practical problem. Information is a valuable asset for everyday life, but when it is shared between people or machines it can pick up errors. For example, it is very easy for us to spell a word incorrectly or even use a wrong word entirely. Likewise, messages sent by a computer in binary may pick up an error due to a transceiver fault, e.g. a "bit-flip" interchanging a 0 for a 1 with a given probability.

Some errors can be detected but others cannot. For example, suppose that Alice and Bob are friends, and Bob receives the following message from Alice:

Pub tonihgt? The Red Loin at 8pm?

Bob would immediately recognise that an error has been made in the message, but how many? He can easily detect an error in the word "tonihgt" as this isn't in the English language. Furthermore, as it very closely resembles the word "tonight", he can be confident that this was the intended message and can correct it appropriately. But what about "Red Loin"? This is a strange phrase, but it is correct English. With a little extra information, i.e. the knowledge that "Red Lion" is the most common name for an English pub, Bob can probably correct this too. But he can't detect any possible error in the time, e.g. changing "8" to "7" would give an equally valid message. To be confident that the time was correct, he would have to assume that only the two errors he has already detected have occurred.

To combat problems such as this, we design systems of messages that remain legible even in the presence of errors. These are known as error-correcting codes.

A simple example - the Yes/No code: Bob wants to respond to Alice's question, and decides to use a simple system where the message "1" corresponds to answer "yes", and "0" corresponds to "no". But just sending the 1 or 0 is a terrible system: if Bob makes a single mistake and presses the wrong key, the message changes completely. Furthermore, whichever letter Alice receives, she has no way of telling if an error has occurred.


We can improve the process by repeating the choice. Let C = {(0, 0, 0), (1, 1, 1)} ⊂ F_2^3 = {0, 1}^3; these two vectors will be the codewords. The process of transmitting encoded messages is then given by Fig. 1.

Figure 1. Repetition code R_3: Bob chooses message 0 or 1; the message is encoded as (0,0,0) or (1,1,1); after transmission, Alice receives a binary vector of length three; the vector is decoded and the original message found.

It should be clear that this system can detect up to two "bit-flip" errors. E.g. if Alice receives the vector (0, 1, 0) then either one error has occurred (the middle 0 changing to a 1) or two have occurred (the outside 1s turning to 0s). But more importantly, it can also correct one error. How do we do this? We take our received vector and count the number of 1s and 0s. If there are more 1s, then the intended message is (1,1,1), and vice versa. This is called majority logic decoding; intuitively it makes sense, as we would expect a low number of such errors to occur.
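As a quick illustration, majority logic decoding fits in a few lines of Python (the function name and list representation here are ours, not from the notes):

```python
def majority_decode(received):
    """Majority-logic decoding for the binary repetition code R_n.

    Count the 1s in the received vector: if they outnumber the 0s,
    decode to the all-ones codeword, otherwise to all-zeros.
    (For even n a tie is possible; this sketch decodes ties to all-zeros.)
    """
    ones = sum(received)
    bit = 1 if ones > len(received) - ones else 0
    return [bit] * len(received)

# Alice receives (0, 1, 0): the two 0s outvote the single 1.
print(majority_decode([0, 1, 0]))  # [0, 0, 0]
```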

Example 1.1. We can generalise this idea to arbitrary lengths. Let

R_n := {(0, 0, . . . , 0), (1, 1, . . . , 1)} ⊂ F_2^n,

for any n. This is called the n-bit repetition code. Intuitively these codes can detect n − 1 errors and correct ⌊(n − 1)/2⌋ errors.

A drawback of the repetition code is that you can't send many messages.

The following example allows us to send 8 messages while still correcting 1 error.

Example 1.2. The simplex code C_3: Suppose we want to send the message (x, y, z), a vector of length three where x, y, z ∈ F_2 = {0, 1}. Let C_3 be the code consisting of vectors of length 7, (x, y, z, a, b, c, d) ∈ F_2^7, satisfying the following:

x + y = a, x + z = b, y + z = c;

x + y + z + a + b + c = d.

These equations are called the check-sums. So our encoding is (x, y, z) → (x, y, z, a, b, c, d), and we easily check that there are 8 messages and 8 corresponding codewords.
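The check-sum encoding is easy to carry out by machine; here is a small Python sketch of a C_3 encoder (the function name is ours):

```python
def encode_C3(x, y, z):
    """Encode the message (x, y, z) over F_2 using the C3 check-sums."""
    a = (x + y) % 2
    b = (x + z) % 2
    c = (y + z) % 2
    d = (x + y + z + a + b + c) % 2
    return (x, y, z, a, b, c, d)

# All 8 messages give 8 distinct codewords of length 7.
codewords = {encode_C3(x, y, z)
             for x in (0, 1) for y in (0, 1) for z in (0, 1)}
print(len(codewords))  # 8
```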

Suppose we receive (0111011). Well,

x + y = 1 = a

x + z = 1 ≠ b

y + z = 0 ≠ c

x + y + z + a + b + c ≠ d.


So there is an error, but where? Well, it is in z, because this breaks the b, c and d check-sums. We correct z to get the corrected codeword (0101011), corresponding to the message (010).

Exercise 1.3. Come up with a decoding/error-correction method to show that C_3 corrects any single error.

HINT: Errors in different positions x, y, z, a, b, c, d correspond to different sets of check-sums failing.

Now that we have seen some examples, we should make the formal definition of a code clear.

Definition 1.4. Let A be a finite set (an "alphabet"). A code of length n over A is a subset C ⊆ A^n. An element of A^n is called a vector (or word) and the elements of C are called codewords.

Examples 1.1 and 1.2 are both codes over the alphabet A = F_2. They have lengths n and 7 respectively.

Notation: We will typically write elements of A^n in vector form, i.e. a = (a_1, . . . , a_n) where a_i ∈ A. This is natural as we will often restrict our attention to the case where A is a finite field and A^n is a vector space. However, when A = F_2 = {0, 1} we may also use the shorthand (0, 1, 1, 1, 0) = 01110. This shorthand is very common in the literature.

Let's also be clear about the exact nature of an "error".

Definition 1.5. Let C ⊆ A^n be a code of length n over alphabet A. A single error in a codeword c = (c_1, . . . , c_n) is a change of one of the components c_j.

Example 1.6. Suppose that Alice sends the codeword c = (1, 0, 1, 1). If c′ = (1, 1, 1, 1) is received by Bob, then the codeword has picked up 1 error in the 2nd position. If instead c′ = (1, 1, 1, 0) then there are two errors (in the 2nd and 4th positions).

Other types of errors are possible, e.g. a codeword could be shortened by the nth component disappearing completely. However, these will not be the focus of this course.

Roughly speaking, the English language is a code over the alphabet A = {a, b, . . . , z}, but here the codewords do not have the same length. We could remedy this by padding words with extra zs at the end, but this would be very inefficient - the average word length in English is 5, compared to the longest word with 45 letters. Furthermore, comparatively few of the 26^45 choices in A^45 are actually real English words. On the other hand, it is this sparsity that allows for relatively easy "decoding" when we make a spelling mistake.

In this course we will focus on the mathematical properties of error-correcting codes (more so than the encoding/decoding procedures). We'll see that most of the best codes are linear, that is, they possess the structure of vector spaces.

2. Linear Algebra Revision

Linear algebra will be an important tool for studying the properties of error-correcting codes. Therefore in this section we shall briefly recap some of the fundamentals of fields and vector spaces. If you need a refresher, there are a few revision exercises on the first problem sheet.

2.1. Fields.

Definition 2.1. A field is a set F with two binary operations +, × such that (F, +) and (F\{0}, ×) are abelian groups, and multiplication distributes over addition, i.e. a(b + c) = ab + ac for all a, b, c ∈ F.


Intuitively, a field is simply a set on which the operations of addition, subtraction, multiplication and division behave "naturally" (i.e. as they do in the real numbers).

There are many examples:

Example 2.2. The sets Q, R, C are fields. The set of integers modulo a prime number p is also a field, and we denote it by F_p.

Exercise 2.3. The sets Z, N are not fields. Neither are the integers modulo n, for n composite. Why is this?

We can build extension fields out of old fields.

Definition 2.4. Let X be a set and F a field. Then the extension of F by X is defined to be the smallest field containing F and X. We denote this F(X).

We are only concerned with fields of the form F(α) where α is algebraic over the base field F, i.e. α is the root of a polynomial with coefficients in F.

For example:

Example 2.5.

(1) Q(√2) = {a + b√2 | a, b ∈ Q};
(2) C = R(i) = {a + bi | a, b ∈ R}.

Most of the examples listed above are infinite, but in this course we will focus on finite fields. Now it is clear that F_p, the integers mod p, forms a finite set, but these are not the only examples.

Fact:

(1) Every finite field is of the form F_p(α) where p is a prime, and α is the root of an irreducible polynomial f over F_p. If degree(f) = l, then

F_p(α) = {a_{l−1}α^{l−1} + a_{l−2}α^{l−2} + · · · + a_1α + a_0 | a_i ∈ F_p}.   (1)

(2) It's easy to see that |F_p(α)| = p^l. But in fact, there is exactly one finite field of size p^l up to isomorphism, for each p and l. We denote this field by F_{p^l}; usually we define q = p^l and use the shorthand F_q.

Example 2.6. Let's construct the finite field of 4 elements, i.e. F_4 = F_{2^2}: Well, f = x^2 + x + 1 ∈ F_2[x] is irreducible (check f(0), f(1) ≠ 0) and degree(f) = 2 as required. Let α be a root of f. Then

F_4 = {a_1α + a_0 | a_i ∈ F_2} = {0, 1, α, α + 1}.

To see that this field is closed under multiplication, simply note that powers of α can be reduced by applying the rule α^2 = α + 1 (as given by f).
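One can check the closure claim mechanically. In the sketch below (the pair representation and function names are ours), an element a_1α + a_0 of F_4 is stored as a pair (a_1, a_0), and products are reduced using the rule α^2 = α + 1:

```python
# Elements a1*α + a0 of F_4 are stored as pairs (a1, a0) with a1, a0 in {0, 1}.
def f4_add(u, v):
    """Addition in F_4 is coefficient-wise addition mod 2."""
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)

def f4_mul(u, v):
    """Multiply (u1*α + u0)(v1*α + v0), then reduce with α² = α + 1."""
    hi = u[0] * v[0]                 # coefficient of α²
    mid = u[0] * v[1] + u[1] * v[0]  # coefficient of α
    lo = u[1] * v[1]                 # constant term
    # Substituting α² = α + 1 adds `hi` to both remaining coefficients.
    return ((mid + hi) % 2, (lo + hi) % 2)

alpha = (1, 0)
print(f4_mul(alpha, alpha))                  # (1, 1), i.e. α² = α + 1
print(f4_mul(alpha, f4_add(alpha, (0, 1))))  # (0, 1), i.e. α(α + 1) = 1
```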

Remark. It's important to note from (1) that all the coefficients a_i are contained in F_p. Therefore any addition or multiplication carried out in the finite field is done "modulo p". The prime p is known as the characteristic of the field.

Warning! The finite field F_{p^l} is not the same as the integers mod p^l!

2.2. Vector Spaces. Just as fields were defined to be analogous to Q or R, we want an abstract notion of vector space that captures the essential properties of n-dimensional real space R^n.

Definition 2.7. A vector space over a field F is a non-empty set V with two operations, addition and scalar multiplication by elements of F, such that:

(1) (V,+) is an abelian group;


(2) α(βv) = (αβ)v for all α, β ∈ F and v ∈ V;
(3) 1v = v for all v ∈ V;
(4) α(v + v′) = αv + αv′ for all α ∈ F and v, v′ ∈ V;
(5) (α + β)v = αv + βv for all α, β ∈ F and v ∈ V.

Again, there are many natural (and hopefully familiar) examples:

Example 2.8.

(1) Vectors: F^n is a vector space over F under the usual addition and scaling of vectors.
(2) Matrices: M_{a,b}(F) is a vector space over F under the usual addition and scaling operations of matrices.
(3) Polynomials: F[x] is a vector space over F under the usual addition and scaling operations of polynomials.

We have the natural inclusion of vector spaces through the notion of subspace:

Definition 2.9. Let V be a vector space over a field F, and let W be a subset of V. We say that W is a subspace of V if W is closed under addition and scalar multiplication by F, i.e. if αw + βw′ ∈ W for all α, β ∈ F and w, w′ ∈ W.

Notation: In some texts the symbol "⊆" is used to denote inclusion of subspaces, but this can be ambiguous and so we do not do that here. In this course "⊆" will only indicate inclusion as sets. We shall use "≤" to denote inclusion of vector spaces.

Example 2.10.

(1) Let V = R^2. Then W = {a(1, 0) | a ∈ R} ≤ V.
(2) Let V = F[x] for some field F, and let W denote the polynomials in V of degree at most 3. Then W ≤ V.
(3) Let V be a vector space and let X := {v_1, . . . , v_k} ⊆ V be a collection of vectors. Then the span of X over F, defined

⟨X⟩_F := ⟨v_1, . . . , v_k⟩_F := {Σ_{i=1}^k a_i v_i | a_i ∈ F},

is a vector subspace of V.

When studying R^n, we usually fix a co-ordinate system to allow for an easy description of vectors. For example, it is common to use Cartesian co-ordinates e_i := (0, . . . , 0, 1, 0, . . . , 0), so that R^n = ⟨e_1, . . . , e_n⟩_R. Furthermore, every vector v has a unique decomposition v = a_1e_1 + · · · + a_ne_n.

Now there are many other choices of spanning vectors for R^n, but note that the set of e_i has the stronger property that there is no redundancy: if we remove e_1 (or indeed any e_i) then it's clear that R^n ≠ ⟨e_2, . . . , e_n⟩_R.

Example 2.11. Let V = R^2 and consider X_1 = {(1, 0), (0, 1)}, X_2 = {(1, 0), (0, 1), (1, 1)}. Clearly ⟨X_1⟩_R = ⟨X_2⟩_R, but decompositions in the second spanning set X_2 are not unique! For a general v ∈ R^2 there will be multiple solutions a, b, c ∈ R to

v = a(1, 0) + b(0, 1) + c(1, 1).

This is because we have a redundancy or dependence in X_2: clearly (1, 1) = (1, 0) + (0, 1), and so we don't really need this extra vector.

This motivates the following definition:

Definition 2.12. Let V be a vector space and v_1, . . . , v_n ∈ V.


• We say that v_1, . . . , v_n ∈ V are linearly independent if the only solution to the equation a_1v_1 + · · · + a_nv_n = 0 is a_i = 0 for all i.
• We say that v_1, . . . , v_n ∈ V is a basis for V if they are linearly independent and ⟨v_1, . . . , v_n⟩_F = V.

Example 2.13. Let F be a field and let V = F^n be the vector space of n-dimensional vectors over F. Then e_1, . . . , e_n (where e_i := (0, . . . , 0, 1, 0, . . . , 0) as before) form a basis of V. We call the e_i the standard basis vectors of F^n.

Fact: If a vector space has a finite basis then all bases have the same size - this is known as the dimension of the vector space.

This is exactly why we had the redundancy in the spanning set X_2 in Example 2.11 above. As dim(R^2) = 2, any set of 3 or more (non-zero) vectors will not be linearly independent.

We can use matrices to generate vector spaces (over a field F) in the following two ways. For M ∈ M_{a,b}(F):

• Let r_i, 1 ≤ i ≤ a, denote the rows of M. Then RowSpace(M) = ⟨r_1, . . . , r_a⟩_F ≤ F^b. The dimension of this space is exactly rank(M). Warning! rank(M) ≤ a; it is not necessarily equal to a, i.e. the number of rows in M.
• Recall that NullSpace(M) is the set of vectors v ∈ F^b such that Mv^T = 0. Furthermore, NullSpace(M) ≤ F^b and has dimension nullity(M), called the nullity of M.

Lastly, let's recall an important result from linear algebra that we'll use (but not prove) repeatedly in this course.

Theorem 2.14 (The Rank-Nullity Theorem). Let M ∈ M_{a,b}(F). Then rank(M) + nullity(M) = b.

Intuitively, what the rank-nullity theorem is saying is that if you have a set of k independent homogeneous linear equations in n variables, then you expect the space of solutions to be (n − k)-dimensional.

3. Linear codes

In Section 1 we defined a code and saw some examples. Now, after the revision of Section 2, we are ready to study these codes in further detail.

3.1. Preliminaries.

Definition 3.1. Let u, v ∈ A^n. We define the Hamming distance (usually shortened simply to "distance") between the two vectors to be the number of co-ordinates in which they differ, i.e.

d(u, v) := |{i | u_i ≠ v_i}|.

Example 3.2. Some distances in F_2^5:

d(01110, 01001) = 3;

d(11111, 11001) = 2.
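The definition translates directly into code; a minimal Python version (the function name is ours) that reproduces these two distances:

```python
def hamming_distance(u, v):
    """Number of coordinates in which u and v differ."""
    assert len(u) == len(v)
    return sum(1 for a, b in zip(u, v) if a != b)

print(hamming_distance("01110", "01001"))  # 3
print(hamming_distance("11111", "11001"))  # 2
```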

The Hamming distance on A^n satisfies a number of natural properties.

Lemma 3.3. The Hamming distance on A^n is a metric. That is, it satisfies the following three properties:

(1) d(u, v) ≥ 0, with equality if and only if u = v;
(2) d(u, v) = d(v, u);
(3) d(u, v) ≤ d(u, s) + d(s, v) for all s ∈ A^n.


Proof. Left as an exercise. □

It is of course also natural to consider the minimum possible distance between any two codewords in C, as well as spheres centred at vectors:

Definition 3.4. Let C ⊆ A^n be a code. The minimum distance d(C) of C is

d(C) := min{d(c, d) | c, d ∈ C, c ≠ d}.

Definition 3.5. Let v ∈ A^n. The Hamming sphere with centre v and radius r ≥ 0 is the collection of vectors

S_r(v) := {u ∈ A^n : d(u, v) ≤ r}.

As you probably expect, the notion of distance is intrinsically linked to the error-detection/correction properties of a given code. Suppose Alice wants to message Bob: she sends the codeword c, but some errors occur and Bob receives a vector v. Intuitively, if v is "close" to multiple codewords it is going to be hard to decide which is the correct codeword that Alice sent.

Definition 3.6. Let C ⊆ A^n be a code.

We say that C detects r errors if for each c ∈ C, changing any r letters does not produce another codeword, i.e.

S_r(c) ∩ C = {c}, for all c ∈ C.

We say that C corrects r errors if for each v ∈ A^n, changing up to r letters produces at most one codeword c ∈ C, i.e.

|S_r(v) ∩ C| ≤ 1, for all v ∈ A^n.

Example 3.7. Recall the binary repetition code R_n from Example 1.1. We see that this indeed detects n − 1 errors and corrects ⌊(n − 1)/2⌋ errors, by the above definitions.

Notice from this example the importance of "at most one codeword" in the definition of correction. For example, if C = R_6 then we can clearly correct up to 2 errors, but |S_2(000111) ∩ C| = 0.

The first theorem of this course confirms the intuition described above, and highlights the fundamental importance of minimum distance.

Theorem 3.8. Let C ⊆ A^n be a code with minimum distance d(C) = d.

(1) C detects r errors if and only if r ≤ d − 1.
(2) C corrects r errors if and only if r ≤ (d − 1)/2.

Proof.

(1) Here the proof is a simple unpacking of the two definitions: C detects r errors if and only if S_r(c) ∩ C = {c} for all c ∈ C. This is equivalent to d(c, c′) > r for all distinct c, c′ ∈ C. And this is true if and only if r < d(C) = d.

(2) Firstly suppose that d(C) ≥ 2r + 1 and assume for a contradiction that C does not correct r errors. Then by definition there exists v ∈ A^n such that |S_r(v) ∩ C| ≥ 2. Take two distinct codewords c, c′ ∈ S_r(v) ∩ C. It follows by the triangle inequality that

d ≤ d(c, c′) ≤ d(c, v) + d(v, c′) ≤ 2r ≤ d − 1.

This gives the desired contradiction, proving the right to left implication of (2). The reverse implication is left as an exercise. □

Definition 3.9. From Theorem 3.8, we see that a linear code with minimum distance d(C) = d can correct at most e := ⌊(d − 1)/2⌋ errors. We call e the error-correcting index of C.


Looking at Theorem 3.8, we might think that the best thing to do is pick codes that maximise distance, as this will yield the ability to correct/detect large numbers of errors. However, as the repetition code R_n demonstrates, this isn't always the best idea - it has the maximal possible distance for a code of length n, but we can only send two messages! So other factors can be important; we'll discuss this in further detail in Section 6.

However, to make it easier to study their properties, we can start by restricting our attention to codes that have at least some simple mathematical structure. It turns out that most of the best codes are vector subspaces of F_q^n, and that introducing this "modest" condition leads to significant developments in the theory.

Definition 3.10. We call C ≤ F_q^n a linear code if C is a vector subspace of F_q^n. If C is a subspace of dimension k, then we correspondingly say that C has dimension k, or is an [n, k]_q-linear code. Furthermore, if it also has minimum distance d = d(C), we call it an [n, k, d]_q-linear code.

The [n, k, d]_q-notation is simply a concise way of denoting four fundamental properties of the code.

Remark. From this point on, all codes considered in this course will be linear codes over finite fields. That is, alphabets A = F_q for a prime power q, and C ≤ F_q^n.

Now what can we immediately deduce about linear codes? Firstly, recall from Definition 1.4 that a general code can be of any size between 1 and |A|^n. But for linear codes there are far fewer possible sizes.

Lemma 3.11. Let C be an [n, k]_q-linear code. Then |C| = q^k.

Proof. We simply need to show that a subspace C ≤ F_q^n of dimension k has size q^k. Well, as it is a vector space, C has a basis, say c_1, . . . , c_k. It follows that

C = {λ_1c_1 + · · · + λ_kc_k | λ_i ∈ F_q}.

As the basis vectors {c_i} are linearly independent, any choice of scalars (λ_1, . . . , λ_k) produces a distinct vector in C. But there are q^k such choices, and therefore |C| = q^k. □

Another immediate advantage of linear codes is that it is much quicker to find the minimum distance. For a general code, we need to compare the distance between every pair of distinct codewords, i.e. compute (|C| choose 2) = |C|(|C| − 1)/2 distances. We claim that if C is linear, then the work can be reduced to |C| − 1 calculations; far more efficient. To see this we use the notion of codeword weight.

Definition 3.12. The weight of a vector v = (v_1, . . . , v_n) ∈ F_q^n, denoted wt(v), is

wt(v) := |{i | v_i ≠ 0}| = d(v, 0).

Lemma 3.13. Let C be an [n, k]_q-linear code. Then

d(C) = min{wt(c) | c ∈ C, c ≠ 0}.

Proof. Firstly observe that as C is linear, if c, c′ ∈ C then c − c′ ∈ C. Furthermore, c − c′ has non-zero entries precisely in the positions where c and c′ differ. Hence d(c, c′) = d(c − c′, 0) = wt(c − c′). So the distance between any two codewords can be realised as the weight of another codeword. The result follows. □

Remark. When calculating d(C) with this method, it is important to range over all codewords in C. Taking a minimum over a basis is not enough!

E.g. consider the [5, 2]_2-linear code C spanned by c_1 = (1, 1, 1, 1, 1) and c_2 = (1, 1, 0, 0, 1). Clearly wt(c_1) = 5 and wt(c_2) = 3, but c_1 − c_2 ∈ C and wt(c_1 − c_2) = 2.
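We can verify this example (and Lemma 3.13 by brute force): enumerate the whole code as the span of the basis and take the minimum weight over all nonzero codewords (helper name ours):

```python
from itertools import product

def span_f2(basis):
    """All F_2-linear combinations of the given basis vectors."""
    n = len(basis[0])
    return {tuple(sum(c * b[i] for c, b in zip(coeffs, basis)) % 2
                  for i in range(n))
            for coeffs in product([0, 1], repeat=len(basis))}

c1 = (1, 1, 1, 1, 1)
c2 = (1, 1, 0, 0, 1)
code = span_f2([c1, c2])
min_weight = min(sum(c) for c in code if any(c))
print(min_weight)  # 2, smaller than wt(c1) = 5 and wt(c2) = 3
```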


3.2. Two presentations of a linear code. Linear codes are convenient in that they can be easily described using a basis (this is far more efficient than writing down all the codewords!). We store the basis in a generator matrix:

Definition 3.14. Let C be an [n, k]_q-code. Then a generator matrix for C is a matrix G ∈ M_{k,n}(F_q) whose rows form a basis for C, i.e. C = RowSpace(G) and rank(G) = dim(C) = k.

So we can describe all codewords in C by taking linear combinations of the basis vectors; this is achieved through easy matrix multiplication:

Lemma 3.15. Let C be an [n, k]q-code with generator matrix G. Then

C = {wG | w ∈ F_q^k}.

The matrix multiplication wG can be viewed as the encoding process: we start with a message w, which has length k as it lies in the message space F_q^k. Then multiplying by G encodes the message to give wG = c, a codeword that lives in the codespace C. The ratio k/n can then be viewed as the proportion of useful (non-redundant) information contained in a codeword. The value k/n is called the rate of C.
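As a sketch of the encoding map w ↦ wG over F_2, using the generator matrix of the simplex code C_3 (the helper name is ours):

```python
from itertools import product

# A generator matrix of the simplex code C_3.
G = [(1, 0, 0, 1, 1, 0, 1),
     (0, 1, 0, 1, 0, 1, 1),
     (0, 0, 1, 0, 1, 1, 1)]

def encode(w, G):
    """Return the codeword wG, computed over F_2."""
    return tuple(sum(wi * row[j] for wi, row in zip(w, G)) % 2
                 for j in range(len(G[0])))

# Ranging over all q^k = 2^3 messages produces the whole code.
code = {encode(w, G) for w in product([0, 1], repeat=len(G))}
print(len(code))  # 8 codewords; the rate is k/n = 3/7
```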

Exercise 3.16. Let C be an [n, k]_q-code with generator matrix G. Show that if wG = w′G then w = w′. Why is this important for the encoding process?

Example 3.17. Some examples

(1) R_5 - the repetition code of length 5 - has generator matrix

G = (1 1 1 1 1).

Here it is easy to see how the encoding process works: we let w = 0 or 1, and we get output codewords (00000) and (11111) respectively.

(2) Recall the simplex code C_3, as defined in Example 1.2. A natural choice of generator matrix is

G =
( 1 0 0 1 1 0 1 )
( 0 1 0 1 0 1 1 )
( 0 0 1 0 1 1 1 ).

But there are other basis choices for this subspace, and therefore other choices of generator matrix, e.g.

G′ =
( 1 0 0 1 1 0 1 )
( 1 1 0 0 1 1 0 )
( 0 0 1 0 1 1 1 )

would also work.

So evidently, there is not always a unique generator matrix. In fact we have the following:

Exercise 3.18. Let C be an [n, k]_q-code and suppose that G is the unique generator matrix for C. Show that n = k = 1 and q = 2.

More generally we can ask

Exercise 3.19. How many generator matrices does an [n, k]q-code have?

If we perform Gaussian elimination on a given generator matrix, we get the so-called standard form:

Definition 3.20. We say that a generator matrix is in standard form if it is of the form G = (I_k A), where I_k is the k × k identity matrix and A ∈ M_{k,n−k}(F_q).


The generator matrix is the first convenient way of defining a linear code. There is however a second method, where rather than specifying a basis, we specify a set of equations that the co-ordinates of the codewords must solve. We describe these equations in the form of a parity check matrix, usually denoted H. Now, a codeword solving our equations is equivalent to it being in the null space of H.

Definition 3.21. We say that a matrix H ∈ M_{n−k,n}(F_q) is a parity check matrix for the [n, k]_q-code C if

C = {v ∈ F_q^n | Hv^T = 0},

i.e. C = NullSpace(H).

Example 3.22. Consider the binary 4-bit repetition code R_4. A vector v ∈ F_2^4 is a codeword if and only if v_i = v_j for all 1 ≤ i, j ≤ 4. Throwing out unnecessary equations (recall that dim(R_4) = 1 and hence by rank-nullity we require 3 independent equations), we have that v ∈ R_4 if and only if

v_1 = v_2, v_2 = v_3 and v_3 = v_4.

This then gives the parity check matrix

H =
( 1 1 0 0 )
( 0 1 1 0 )
( 0 0 1 1 ).

We can clearly generalise this to n dimensions. Let H ∈ M_{n−1,n}(F_2) be the matrix with entries H_{i,i} = H_{i,i+1} = 1, 1 ≤ i ≤ n − 1, and zero otherwise. Then H is a parity check matrix for R_n.

A great property of the parity check matrix formulation of a code is that it is very quick to test whether a vector is a codeword or not. We simply compute Hv^T and check that it's the zero vector. The trade-off is that using H is not as efficient at producing many codewords as the generator matrix G.
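The membership test Hv^T = 0 is indeed cheap; a short Python sketch for R_n (function names ours):

```python
def repetition_parity_check(n):
    """Parity check matrix of R_n: H[i][i] = H[i][i+1] = 1, zero elsewhere."""
    H = [[0] * n for _ in range(n - 1)]
    for i in range(n - 1):
        H[i][i] = H[i][i + 1] = 1
    return H

def is_codeword(H, v):
    """Membership test: v is a codeword iff Hv^T = 0 over F_2."""
    return all(sum(h * x for h, x in zip(row, v)) % 2 == 0 for row in H)

H = repetition_parity_check(4)
print(is_codeword(H, (1, 1, 1, 1)))  # True
print(is_codeword(H, (1, 0, 1, 1)))  # False
```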

It is easy to see that given any matrix H ∈ M_{n−k,n}(F_q) of full rank, we can construct an [n, k]_q-code. We simply define C := {v ∈ F_q^n | Hv^T = 0}, and as the null space of a matrix is closed under addition/scalar multiplication, C is a subspace of F_q^n and by definition a linear code. It has the required dimension k by the rank-nullity theorem (Thm. 2.14).

Example 3.23. Consider the linear code C ≤ F_5^4 defined by parity check matrix

H =
( 1 2 3 0 )
( 0 0 4 1 ).

So C consists of vectors (v_1, . . . , v_4) ∈ F_5^4 such that

v_1 + 2v_2 + 3v_3 = 0 = 4v_3 + v_4.

We first see that the latter of these two conditions is equivalent to v_3 = v_4 (we are working mod 5). Now, letting (v_2, v_3) ∈ {(1, 0), (0, 1)} we get from the first condition that c_1 = (3, 1, 0, 0) and c_2 = (2, 0, 1, 1) are both in C. Thus, as c_1 and c_2 are clearly linearly independent, they form a basis of C, and can therefore be used to define a generator matrix for the code

G =
( 3 1 0 0 )
( 2 0 1 1 ).
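It is worth sanity-checking this example by machine: every row of G (and hence every codeword) must have zero syndrome with respect to H, working mod 5 (a small sketch; the function name is ours):

```python
H = [(1, 2, 3, 0),
     (0, 0, 4, 1)]
G = [(3, 1, 0, 0),
     (2, 0, 1, 1)]

def syndrome_mod5(H, v):
    """Compute Hv^T over F_5."""
    return tuple(sum(h * x for h, x in zip(row, v)) % 5 for row in H)

for g in G:
    print(g, syndrome_mod5(H, g))  # both rows give syndrome (0, 0)
```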

Exercise 3.24. Try some more examples of your own. What happens if H ∈ M_{n−k,n}(F_q) but is not of full rank?


4. Hamming codes and some decoding

We can now define one of the most well-known families of error-correcting codes, the Hamming codes.

Definition 4.1. Let q be a prime power, k ∈ N_{≥1} and n = n(q, k) := (q^k − 1)/(q − 1). A Hamming code H_k(q) is the [n, n − k]_q-linear code for which the parity check matrix has columns that form a maximal set of pairwise linearly independent vectors in F_q^k.

Exercise 4.2. Double-check that a code defined by such a parity check matrix does indeed have length n(q, k).

For example, consider a binary Hamming code of rank 3, i.e. H_3(2). This is the [7, 4]_2-linear code H_3(2) = {v ∈ F_2^7 | Hv^T = 0}, where H has columns consisting of all non-zero vectors of F_2^3 (recall that over F_2, two vectors are linearly independent if and only if they are non-equal). So for example

H :=
( 1 0 0 1 1 0 1 )
( 0 1 0 1 0 1 1 )
( 0 0 1 0 1 1 1 ).

The binary Hamming codes are the most commonly found Hamming codes in the literature (and also in this course). We shall therefore often drop the field value q = 2 in the notation, i.e. we'll typically write H_k(2) = H_k.

Now, we could have ordered the columns in H differently. This would produce a different but equivalent (we'll define this later in the course, see Definition 7.2) Hamming code.
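We can confirm the parameters of H_3(2) by brute force: build H with columns the non-zero vectors of F_2^3 (ordered as in the matrix above), enumerate its null space, and check that the result is a [7, 4, 3]_2-code (a sketch; the exhaustive enumeration approach is ours):

```python
from itertools import product

# Columns of H: the non-zero vectors of F_2^3.
cols = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0),
        (1, 0, 1), (0, 1, 1), (1, 1, 1)]
H = [[col[i] for col in cols] for i in range(3)]  # a 3 x 7 matrix

# H_3(2) = NullSpace(H): enumerate F_2^7 and keep the v with Hv^T = 0.
code = [v for v in product([0, 1], repeat=7)
        if all(sum(h * x for h, x in zip(row, v)) % 2 == 0 for row in H)]

print(len(code))                            # 16 = 2^4, so k = 4
print(min(sum(v) for v in code if any(v)))  # minimum distance d = 3
```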

We can exhibit a method of correcting 1 error using the Hamming codes.

Firstly, recall that for an [n, k]_q-linear code C with parity check matrix H, c ∈ C if and only if Hc^T = 0. In other words, if v ∈ F_q^n is such that Hv^T ≠ 0, then we know that v ∉ C.

Definition 4.3. Let C be an [n, k]_q-linear code with parity check matrix H, and let v ∈ F_q^n. We call Hv^T the syndrome of v.

The value of the syndrome is important: firstly, to restate the above, a given vector is a codeword if and only if it has zero syndrome. Furthermore, suppose that Alice and Bob are communicating using a linear code C; Alice sends a codeword c, but it picks up an error (or errors) and Bob receives a vector v ≠ c. Provided not too many errors have occurred, we can use the syndrome of v to work out the original codeword c.

Let's demonstrate how to do this with one error:

Proposition 4.4. Let C ≤ F_q^n be a linear code with parity check matrix H.

(1) Suppose that every set of m − 1 columns of H is linearly independent. Then the minimum distance d(C) ≥ m.
(2) Suppose in addition to (1) that H contains a set of m linearly dependent columns. Then d(C) = m.

Proof.

(1) Suppose for a contradiction that d(C) ≤ m − 1. So by Lemma 3.13 there exists a codeword 0 ≠ c ∈ C such that wt(c) = r ≤ m − 1. Now we can write c as a sum of r standard basis vectors

c = λ_{i_1} e_{i_1} + · · · + λ_{i_r} e_{i_r},

Page 13: opicsT in Discrete Mathematics Spring ermT 2018/19 ERROR ... › 2019 › ... · opicsT in Discrete Mathematics Spring ermT 2018/19 ERROR-CORRECTING CODES ALEXANDER J. MALCOLM Abstract.

ERROR-CORRECTING CODES 13

for some λij ∈ F∗q . Hence

0 = HcT = H(λi1eTi1) + · · ·+H(λire

Tir)

=

r∑j=1

λij × ij-th column of H.

This is clearly a contradiction against the hypothesis of (1).(2) Suppose that columns i1, . . . , im are linearly dependent. Then there exists λij ∈ Fq

such thatm∑j=1

λij × ij-th column of H = 0.

As any m− 1 subset of these columns is linearly independent by (1), all of λij 6= 0.Hence

0 =m∑j=1

λij × ij-th column of H

=m∑j=1

H(λijeTij ) = H(

m∑j=1

λijeTij ).

Therefore c =∑m

j=1 λijeij ∈ C and wt(c) = m.

Corollary 4.5. The Hamming codes correct exactly one error.

Proof. Let H be a parity check matrix for Hk(q). By Theorem 3.8 and Proposition 4.4, it suffices to show that every pair of columns in H is linearly independent, but that there exist three linearly dependent columns. Well, the former condition follows immediately from the definition of the Hamming codes. It is also clear by definition that amongst its columns, H contains scalar multiples of the vectors e_1, e_2 and e_1 + e_2. These columns satisfy the latter condition and we are done. □

So the Hamming codes correct one error in theory, but how do we actually complete the decoding in practice? Well, suppose that a codeword c ∈ Hk(q) is sent but it picks up one error during transmission. That is, we receive the vector v = c + λe_i, where 0 ≠ λ ∈ F_q and e_i is the i-th standard basis vector. We compute the syndrome of v:

Hv^T = H(c + λe_i)^T = Hc^T + λHe_i^T = λHe_i^T = λ × (i-th column of H).

But the columns of H are pairwise independent, and so it is easy to read off both i and λ (i.e. to identify the error λe_i) by comparing with the columns of H. We then finally make the correction v − λe_i to retrieve the original codeword.
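The decoding recipe above is easy to mechanise in the binary case (where λ = 1). A sketch in Python; the column ordering of H and the helper name `correct_one_error` are illustrative choices:

```python
# Parity check matrix for H3 with the column ordering used earlier.
H = [[1, 0, 0, 1, 1, 0, 1],
     [0, 1, 0, 1, 0, 1, 1],
     [0, 0, 1, 0, 1, 1, 1]]

def syndrome(v):
    return tuple(sum(h * x for h, x in zip(row, v)) % 2 for row in H)

def correct_one_error(v):
    """Correct at most one bit error (q = 2, so lambda = 1)."""
    s = syndrome(v)
    if not any(s):
        return list(v)                 # zero syndrome: already a codeword
    # The syndrome equals the i-th column of H; flipping bit i is v - e_i.
    columns = {tuple(row[i] for row in H): i for i in range(7)}
    w = list(v)
    w[columns[s]] ^= 1
    return w

c = [1, 1, 0, 1, 0, 0, 0]              # a codeword of H3 (check: Hc^T = 0)
v = [1, 1, 0, 1, 1, 0, 0]              # c with an error in position 5
print(correct_one_error(v) == c)       # True
```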

Correcting more than one error

The method for the Hamming codes is indicative of how to correct more than one error. Firstly, note that just as all codewords c ∈ C share the zero syndrome, the syndromes of vectors outside of the code-space are also non-unique.

Lemma 4.6. Two vectors u, v have the same syndrome if and only if u − v ∈ C.

Proof. An easy exercise. �


It follows that the syndrome of a vector v is uniquely determined by its coset

v + C = {v + c | c ∈ C}.

We can however distinguish some vectors by their syndromes, using the following result.

Proposition 4.7. Let C be a linear code with error correcting index e. Then the syndromes of vectors of weight at most e are distinct.

Proof. Assume for a contradiction that the statement is false; that is, there exist distinct u, v with the same syndrome such that wt(u), wt(v) ≤ e. It follows that u − v = c for some 0 ≠ c ∈ C by Lemma 4.6. But then

d ≤ wt(c) = wt(u − v) = d(u, v) ≤ d(u, 0) + d(0, v) = wt(u) + wt(v) ≤ 2e ≤ d − 1,

a contradiction. □

This result allows us to design an algorithm for correcting up to e errors.

(1) Find a set of minimal weight representatives v_i for the cosets of C. These are called the coset leaders.

(2) Compute the syndromes of the coset leaders v_i and store these in a look-up table whose rows are indexed by syndrome.

(3) The error-correction process is as follows: suppose we receive a message v with at most e errors. Compute the syndrome Hv^T; this corresponds uniquely to a row of our table, i.e. a unique coset leader v_j.

(4) It follows that v = v_j + c for a unique codeword c ∈ C. This c = v − v_j is the intended message.
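The four steps above can be sketched in Python on a toy example; here we use the binary 3-bit repetition code (this choice of code and of parity check matrix is illustrative, not from the notes):

```python
import itertools

# Toy example: the binary 3-bit repetition code {000, 111}, which
# corrects e = 1 error.  H is one choice of parity check matrix for it.
H = [[1, 1, 0],
     [1, 0, 1]]
n = 3

def syndrome(v):
    return tuple(sum(h * x for h, x in zip(row, v)) % 2 for row in H)

code = [v for v in itertools.product([0, 1], repeat=n)
        if not any(syndrome(v))]       # the null space of H: {000, 111}

# Steps (1)-(2): pick a minimal-weight representative (coset leader) for
# each syndrome, giving the look-up table indexed by syndrome.
leaders = {}
for v in sorted(itertools.product([0, 1], repeat=n), key=sum):
    leaders.setdefault(syndrome(v), v)

# Steps (3)-(4): decode a received vector by subtracting its coset leader.
def decode(v):
    leader = leaders[syndrome(v)]
    return tuple((x - l) % 2 for x, l in zip(v, leader))

print(decode((1, 0, 1)))  # one error in the middle bit -> (1, 1, 1)
```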

5. Dual codes

We have seen that any given matrix can be treated as a parity check matrix, defining a linear code. But is the reverse always possible? That is, does a linear code always have a parity check matrix that defines it? The answer is yes, but to see this we first need to develop the notion of duality. Furthermore, we shall see that duality provides a link between the two methods of constructing a code; in particular, the roles of the generator matrix G and the parity check matrix H are reversed by "taking duals".

Definition 5.1. Let C ≤ F_q^n be a linear code. Define the dual code to C, denoted C⊥, by

C⊥ := {v ∈ F_q^n | cv^T = 0 for all c ∈ C}.

Exercise 5.2. Prove that C⊥ is also a linear code.

Example 5.3. Consider the linear code

C = {(0, 0, 0), (1, 1, 0), (0, 1, 1), (1, 0, 1)} ≤ F_2^3.

Suppose that v = (v_1, v_2, v_3) ∈ C⊥. By computing cv^T for the 2nd and 3rd codewords in C, it follows that v_1 = v_2 = v_3. It is then easy to check that C⊥ = {(0, 0, 0), (1, 1, 1)}.
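The computation in Example 5.3 can be replicated by brute force:

```python
import itertools

# Example 5.3: compute the dual of C = {000, 110, 011, 101} over F_2 by
# brute force, checking c . v = 0 for every codeword c.
C = [(0, 0, 0), (1, 1, 0), (0, 1, 1), (1, 0, 1)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v)) % 2

dual = [v for v in itertools.product([0, 1], repeat=3)
        if all(dot(c, v) == 0 for c in C)]

print(dual)  # [(0, 0, 0), (1, 1, 1)], as computed in the example
```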

Remark. We should be careful not to think of C⊥ as an orthogonal complement in the sense of real vector spaces: it is quite possible that over a finite field C ∩ C⊥ is more than just {0}. In fact, it is possible that C = C⊥, in which case C is called a self-dual code.


Exercise 5.4. See if you can construct some self-dual codes of your own. What elementary properties of such codes can be deduced?

Rather than calculating the dot product with all codewords, we can in fact check if a vector is in the dual code using a generator matrix for C.

Lemma 5.5. Let C be an [n, k]_q-linear code with generator matrix G. Then v ∈ C⊥ if and only if Gv^T = 0.

Proof. Let r_i ∈ C, 1 ≤ i ≤ k, be the basis vectors of the code forming the rows of the generator matrix G ∈ M_{k,n}(F_q). That is,

G = \begin{pmatrix} ← r_1 → \\ ⋮ \\ ← r_k → \end{pmatrix}.

Now if v ∈ C⊥, then by definition r_i v^T = 0. Hence

Gv^T = \begin{pmatrix} ← r_1 → \\ ⋮ \\ ← r_k → \end{pmatrix} v^T = \begin{pmatrix} r_1 v^T \\ ⋮ \\ r_k v^T \end{pmatrix} = 0.   (2)

To prove the reverse implication, assume that Gv^T = 0 and let c ∈ C. As the r_i form a basis of C, there exist λ_i ∈ F_q such that c = ∑_{i=1}^{k} λ_i r_i. But then

cv^T = ∑_{i=1}^{k} λ_i r_i v^T = 0

by (2). As c was chosen arbitrarily, it follows that v ∈ C⊥. □

Example 5.6. Recall from Example 3.17 that

G = \begin{pmatrix} 1 & 0 & 0 & 1 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 & 1 & 1 & 1 \end{pmatrix}

is a generator matrix for the simplex code C3. But we also saw that this was a choice of parity check matrix for the binary Hamming code H3! Hence v ∈ C3⊥ (for v ∈ F_2^7) if and only if v ∈ H3, and so C3⊥ = H3 by Lemma 5.5.

We summarise the main properties of the dual code in the following theorem.

Theorem 5.7. Let C be an [n, k]_q-linear code.

(1) Any generator matrix for C is a parity check matrix for C⊥. In particular, C⊥ is an [n, n − k]_q-linear code;
(2) any parity check matrix for C is a generator matrix for C⊥;
(3) C = (C⊥)⊥.

Proof.

(1) The first part of this statement is exactly Lemma 5.5, and so it remains to confirm the parameters of C⊥. Well, firstly it is clear by definition that C⊥ ≤ F_q^n. Now if G is a generator matrix of C then rank(G) = k. Then as G is also a parity check matrix for C⊥, we have by the rank-nullity Theorem 2.14 that

rank(C⊥) = nullity(G) = n − rank(G) = n − k.


(2) Let H ∈ M_{n−k,n}(F_q) be a parity check matrix for C. It suffices to show that RowSpace(H) = C⊥. Now recall that Hc^T = 0 for all c ∈ C by definition. So in particular r_i c^T = c r_i^T = 0, where the r_i denote the rows of H. Hence r_i ∈ C⊥ for each i and we have the inclusion of subspaces

⟨r_1, . . . , r_{n−k}⟩_{F_q} = RowSpace(H) ≤ C⊥.

But these two subspaces have the same dimension n − k and are hence equal.

(3) Left as an exercise. □

Corollary 5.8.

(1) Generator/parity check matrices for C correspond to parity check/generator matrices for C⊥ respectively.
(2) Every linear code has a parity check matrix.

Proof.

(1) This follows immediately from Theorem 5.7.
(2) Let C be a linear code. Now every linear code has a basis and therefore a generator matrix. In particular, C⊥ has a generator matrix, say G. But then G is a parity check matrix for C by the first part of this corollary. □

6. Perfect codes and other properties

The previous sections provided methods for constructing linear codes. Furthermore, we have developed techniques for calculating parameters such as dimension and minimum distance. But which combinations of parameters are actually possible? There's no reason to believe that we can construct an [n, k, d]_q-linear code for any n, k, d, q of our choosing.

Ideally we want a code where we can send many messages, i.e. |C| is large, as well as a high minimum distance d(C), as this allows for greater error correction. But we also want to minimise the length n, as this will affect the practical cost of transmitting codewords. A compromise needs to be found: in this section we'll see two important results concerning the existence and non-existence of linear codes with these desirable parameters.

Recall from Definition 3.5 that a sphere with centre v ∈ F_q^n and radius r ≥ 0 is the collection of vectors

S_r(v) := {u ∈ F_q^n : d(u, v) ≤ r}.

As the Hamming distance between two points is always a non-negative integer, it is enough to take the radius r to be an integer too. It is now easy to compute the size of these spheres.

Lemma 6.1. Let v ∈ F_q^n and r ∈ N. Then

|S_r(v)| = ∑_{i=0}^{r} \binom{n}{i} (q − 1)^i.

Proof. For each 0 ≤ i ≤ r, let S_i = {u ∈ F_q^n : d(u, v) = i}, i.e. the points that are Hamming distance exactly i from v. Clearly |S_r(v)| = |S_0| + |S_1| + · · · + |S_r|, and so it suffices to show that |S_i| = \binom{n}{i}(q − 1)^i. Well, how many ways can we change exactly i entries of v? Firstly we need to pick i of the co-ordinates to change: as there are n options, this gives \binom{n}{i} choices. Then for each position we wish to change, we have |F_q| − 1 = q − 1 choices of field element. This gives \binom{n}{i}(q − 1)^i distinct choices in total, and the result follows. □
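Lemma 6.1 translates directly into a one-line computation; a sketch (the function name `sphere_size` is an illustrative choice):

```python
from math import comb

# |S_r(v)| in F_q^n, per Lemma 6.1: sum over i of C(n, i) * (q-1)^i.
def sphere_size(n, q, r):
    return sum(comb(n, i) * (q - 1) ** i for i in range(r + 1))

print(sphere_size(7, 2, 1))  # 1 + 7 = 8, the radius-1 spheres used by H3
```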


Theorem 6.2 (The sphere-packing bound). Let C ⊆ F_q^n be a (not necessarily linear) code with error correcting index e. Then

|C| ≤ q^n / ∑_{i=0}^{e} \binom{n}{i}(q − 1)^i.

In particular, if C is also an [n, k]_q-linear code then

1 ≤ q^{n−k} / ∑_{i=0}^{e} \binom{n}{i}(q − 1)^i.

Proof. As C corrects e errors, the spheres S_e(c) are disjoint for all c ∈ C. It follows that

|⋃_{c∈C} S_e(c)| = |C| |S_e(c)| = |C| ∑_{i=0}^{e} \binom{n}{i}(q − 1)^i,

by Lemma 6.1. But ⋃_{c∈C} S_e(c) ⊆ F_q^n, and so

|C| ∑_{i=0}^{e} \binom{n}{i}(q − 1)^i ≤ q^n. □

Clearly a code that attains the sphere-packing bound is very useful, as it can send the largest number of codewords possible for fixed n, q and e.

Definition 6.3. A code C that attains the sphere-packing bound is called perfect, or e-perfect if the error-correcting index is given.

We have already seen an infinite family of perfect codes:

Example 6.4. The Hamming codes Hk(q) are all 1-perfect.

Proof. This is a straightforward computation: as n = (q^k − 1)/(q − 1) and |Hk| = q^{n−k}, we simply need to check that

∑_{i=0}^{1} \binom{n}{i}(q − 1)^i = 1 + n(q − 1) = 1 + ((q^k − 1)/(q − 1))(q − 1) = q^k. □
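This check is easy to run for small parameters; a Python sketch (the function name is illustrative):

```python
from math import comb

# Check the sphere-packing bound holds with equality for Hamming
# parameters: n = (q^k - 1)/(q - 1), |H_k(q)| = q^(n-k), e = 1.
def is_one_perfect_hamming(k, q):
    n = (q ** k - 1) // (q - 1)
    sphere = sum(comb(n, i) * (q - 1) ** i for i in range(2))  # e = 1
    return q ** (n - k) * sphere == q ** n

print(all(is_one_perfect_hamming(k, q) for k in (2, 3, 4) for q in (2, 3, 5)))
# -> True
```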

The sphere-packing bound is useful for showing the non-existence of codes:

Example 6.5. Let C be a binary linear code of length 15 that corrects 2 errors. What is the maximum dimension of C? Well, by Theorem 6.2,

|C| ≤ 2^{15} / (1 + 15 + \binom{15}{2}) = 2^{15}/121 < 2^9.

Hence |C| ≤ 2^8, i.e. dim C ≤ 8.

But this is not to say that such codes of dimensions 1, . . . , 8 all do actually exist!

We can get a bit closer to the truth using the following famous existence result for linear codes.


Theorem 6.6 (The Gilbert-Varshamov bound). Let n, k, d be positive integers, and q a prime power such that

∑_{j=0}^{d−2} \binom{n−1}{j}(q − 1)^j < q^{n−k}.   (3)

Then there exists an [n, k]_q-linear code C with minimum distance d(C) ≥ d.

Proof. The proof of the G-V bound is constructive: first we assume that the positive integers n, k, d, q satisfy condition (3). Now to construct an [n, k]_q-linear code C of the correct distance, it is sufficient by Proposition 4.4 to construct a parity check matrix H satisfying

(a) H ∈ M_{n−k,n}(F_q) of rank n − k;
(b) any d − 1 columns of H are linearly independent.

We construct H inductively, column by column. Begin by choosing the first n − k columns to be the standard basis vectors e_i^T, 1 ≤ i ≤ n − k. These are clearly all linearly independent.

Inductive step. Suppose that we've chosen some i columns h_1^T, . . . , h_i^T ∈ F_q^{n−k}, such that n − k ≤ i ≤ n − 1 and any collection of d − 1 columns is linearly independent. Then

H_i = \begin{pmatrix} ↑ & & ↑ \\ h_1^T & · · · & h_i^T \\ ↓ & & ↓ \end{pmatrix}

is (n − k) × i and satisfies (b). We now need to choose a further column h_{i+1} ∈ F_q^{n−k} such that H_{i+1} still satisfies (b). Well, how many choices of column won't work?

We can't take any vector that is contained in the span of up to d − 2 columns from {h_1, . . . , h_i}. Consider a span of j ≤ d − 2 columns: firstly note that we can choose j distinct columns from {h_1, . . . , h_i} in \binom{i}{j} ways. Furthermore, these columns span a set of size (q − 1)^j, as we can pick any non-zero coefficient from F_q for each. It follows that the number of vectors that won't work is at most

1 + i(q − 1) + \binom{i}{2}(q − 1)^2 + · · · + \binom{i}{d−2}(q − 1)^{d−2}.

But since i ≤ n − 1, this number is less than q^{n−k} by the assumption in the statement of the theorem. Hence there exists a vector that does work, i.e. there exists h_{i+1} ∈ F_q^{n−k} such that

H_{i+1} = \begin{pmatrix} ↑ & & ↑ \\ h_1^T & · · · & h_{i+1}^T \\ ↓ & & ↓ \end{pmatrix}

does satisfy (b). This completes the inductive step, and we construct H_i for i = n − k, . . . , n. The matrix H = H_n is the final parity check matrix we require. □

Example 6.7. Let's apply this to our problem of looking for codes in Example 6.5: let q = 2, n = 15 and d = 5. We see that

1 + 14 + \binom{14}{2} + \binom{14}{3} = 1 + 14 + 91 + 364 = 470 < 512 = 2^9 = 2^{15−6}.

Hence by Theorem 6.6 it follows that such a code (i.e. a binary code of length 15 that corrects 2 errors) of dimension 6 does exist. Unfortunately, neither the sphere-packing nor the G-V bound tells us anything about the dimensions 7 and 8.
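Condition (3) is a one-line check; here is a sketch reproducing the numbers of this example (`gv_condition` is an illustrative name):

```python
from math import comb

# Gilbert-Varshamov condition (3): if sum_{j=0}^{d-2} C(n-1, j)(q-1)^j
# < q^(n-k), then an [n, k]_q code with minimum distance >= d exists.
def gv_condition(n, k, d, q):
    return sum(comb(n - 1, j) * (q - 1) ** j for j in range(d - 1)) < q ** (n - k)

# Example 6.7: n = 15, d = 5, q = 2; dimension 6 passes, 7 does not.
print(gv_condition(15, 6, 5, 2), gv_condition(15, 7, 5, 2))  # True False
```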


7. Equivalent codes and the weight enumerator

Properties such as dimension and minimum distance are useful for comparing two codes C1 and C2, especially when it comes to technological applications. However, as mathematicians interested in classifying algebraic objects, we may wish for a stronger measure of whether C1 is "like" C2, perhaps something resembling the property of two groups being isomorphic, or two matrices being conjugate. For this we introduce the notion of equivalent codes.

Firstly we need to define the action of the symmetric group, Sn, on a code C ⊆ F_q^n. Recall that Sn is simply the group of permutations of the letters {1, . . . , n}.

Definition 7.1. Let C ≤ F_q^n, let c = (c_1, . . . , c_n) ∈ C be a codeword, and let σ ∈ Sn. Then σ(c) := (c_{σ(1)}, . . . , c_{σ(n)}).

We are now ready to define code equivalence.

Definition 7.2. Let C1, C2 be two [n, k]_q-linear codes. We say that C1 and C2 are equivalent if they differ only by a fixed reordering of the components of the codewords. More formally, we say that C2 is equivalent to C1 if there exists a fixed σ ∈ Sn such that for every c ∈ C2 there exists d ∈ C1 with σ(d) = c. If this holds, we denote equivalence by C1 ≅ C2.

Example 7.3. Let

C1 := {(0000), (0011), (1100), (1111)} ⊂ F42;

C2 := {(0000), (0101), (1010), (1111)} ⊂ F42.

Clearly, C1 ∼= C2 via the permutation (2, 3) ∈ S4.
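The claimed equivalence can be verified directly (the helper `apply_perm` and the dictionary encoding of σ are illustrative choices):

```python
# Example 7.3: applying the transposition (2 3) to each codeword of C1
# yields exactly C2, so the codes are equivalent.
C1 = {(0, 0, 0, 0), (0, 0, 1, 1), (1, 1, 0, 0), (1, 1, 1, 1)}
C2 = {(0, 0, 0, 0), (0, 1, 0, 1), (1, 0, 1, 0), (1, 1, 1, 1)}

def apply_perm(sigma, c):
    # Component i of the image is c_{sigma(i)} (positions numbered 1..n).
    return tuple(c[sigma[i] - 1] for i in range(1, len(c) + 1))

sigma = {1: 1, 2: 3, 3: 2, 4: 4}   # the transposition (2 3) in S_4
print({apply_perm(sigma, c) for c in C1} == C2)  # True
```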

We can check for code equivalence in a natural way using generator matrices.

Lemma 7.4. Let C1, C2 ≤ F_q^n be linear codes with generator matrices G1, G2 ∈ M_{k,n}(F_q), respectively. Suppose that there exists an n × n permutation matrix P (*) such that G1P = G2. Then C1 ≅ C2.

Proof. Left as an exercise. □

Remark. (*) Recall that a permutation matrix is an invertible square matrix such that each row and each column contains exactly one entry 1, with all other entries equal to 0. Clearly there is a bijection between such matrices and the elements of Sn.

Exercise 7.5. Recall from Definition 4.1 that there is more than one binary Hamming code Hk for a fixed k: they correspond to different choices of parity check matrix. Show that all of these Hamming codes are equivalent.

In addition to the minimum distance of a linear code, we may also consider finer structure that it possesses. For example, given a code C it is natural to ask: how many codewords are there of weight m?

We create a "book-keeping device" that keeps track of these quantities, called the weight enumerator.

Definition 7.6. Let C ≤ F_q^n be a linear code. The weight enumerator, W_C(x, y), is the polynomial

W_C(x, y) := ∑_{c∈C} x^{n−wt(c)} y^{wt(c)} = ∑_{m=0}^{n} A_m x^{n−m} y^m,

where A_m = |{c ∈ C | wt(c) = m}|.

Example 7.7.

(1) The n-bit repetition code C = R_n over F_2 has W_C(x, y) = x^n + y^n.

(2) Consider the simplex code C3 of length 7 and dimension 3 (see Example 1.2). It is easy to see from the natural generator matrix that each non-zero codeword has weight four. Hence W_{C3}(x, y) = x^7 + 7x^3y^4.
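Weight distributions of small codes are easy to tabulate by brute force; a sketch for C3, using the generator matrix from Example 5.6:

```python
import itertools
from collections import Counter

# Weight distribution A_m of the simplex code C3, generated by the rows
# of G (the generator matrix from Example 5.6).
G = [(1, 0, 0, 1, 1, 0, 1),
     (0, 1, 0, 1, 0, 1, 1),
     (0, 0, 1, 0, 1, 1, 1)]

code = set()
for coeffs in itertools.product([0, 1], repeat=3):
    c = tuple(sum(a * g for a, g in zip(coeffs, col)) % 2
              for col in zip(*G))
    code.add(c)

weights = Counter(sum(c) for c in code)
print(sorted(weights.items()))  # [(0, 1), (4, 7)]: W_C3 = x^7 + 7 x^3 y^4
```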

We deduce the following properties of the weight enumerator.

Lemma 7.8. Let W_C(x, y) be the weight enumerator for C, as above. The following hold:

(1) W_C(1, 1) = ∑_{m=0}^{n} A_m = |C|;
(2) d(C) = min{m ≠ 0 | A_m ≠ 0}.

Proof. The proofs of both statements follow easily from the definition. □

Lemma 7.9. Let C ≤ F_2^n be a binary linear code. Then W_C(x, y) = W_C(y, x) if and only if 1 = (1, . . . , 1) ∈ C.

Proof. As C is linear, 0 ∈ C, and therefore W_C(x, y) contains the term x^n. So assuming W_C(x, y) = W_C(y, x), it follows that the weight enumerator contains the term y^n, and by definition 1 ∈ C. For the reverse implication, notice that the map f : C → C given by f(c) = c + 1 is a bijection that maps codewords of weight i to codewords of weight n − i. Hence if 1 ∈ C then

|{c ∈ C | wt(c) = i}| = |{c ∈ C | wt(c) = n − i}|

for all 0 ≤ i ≤ n. The weight enumerator is then symmetric by definition. □

Lemma 7.10. Let C1 and C2 be equivalent linear codes. Then W_{C1}(x, y) = W_{C2}(x, y).

Proof. This follows easily from the definition of equivalence once we notice that the permutation action preserves codeword weight. □

So we have an (incredibly strong!) relationship between the weight enumerators of equivalent codes. Can we also find a relationship concerning the enumerators of dual codes? When the codes are binary, we have the following well-known and elegant result.

Theorem 7.11 (MacWilliams Identity). Let C be an [n, k]_2-code. Then

W_{C⊥}(x, y) = \frac{1}{2^k} W_C(x + y, x − y).

Example 7.12. Recall that the simplex code C3 is dual to H3, the binary Hamming code of length 7. Hence

W_{H3}(x, y) = \frac{1}{2^3} W_{C3}(x + y, x − y) = \frac{1}{2^3} \left[ (x + y)^7 + 7(x + y)^3 (x − y)^4 \right] = x^7 + 7x^4y^3 + 7x^3y^4 + y^7.
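As a sanity check, the identity can be tested numerically: compute C3 and its dual by brute force and evaluate both sides at a few points (helper names are illustrative):

```python
import itertools

# Check the MacWilliams identity for C = C3 and its dual H3, both
# computed by brute force over F_2^7.
G = [(1, 0, 0, 1, 1, 0, 1),
     (0, 1, 0, 1, 0, 1, 1),
     (0, 0, 1, 0, 1, 1, 1)]

C = {tuple(sum(a * g for a, g in zip(coeffs, col)) % 2 for col in zip(*G))
     for coeffs in itertools.product([0, 1], repeat=3)}
dual = {v for v in itertools.product([0, 1], repeat=7)
        if all(sum(a * b for a, b in zip(c, v)) % 2 == 0 for c in C)}

def W(code, x, y):
    n = 7
    return sum(x ** (n - sum(c)) * y ** sum(c) for c in code)

# Check W_{C^perp}(x, y) = (1/2^k) W_C(x + y, x - y) at a few points;
# both sides are polynomials, so exact equality should hold everywhere.
k = 3
checks = [(2, 1), (3, 2), (5, 1)]
print(all(W(dual, x, y) * 2 ** k == W(C, x + y, x - y) for x, y in checks))
```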

Example 7.13. More generally, consider Hk, the binary Hamming code of length n = 2^k − 1. Here the weight enumerator has the form

W_{Hk}(x, y) = \frac{x^n}{n + 1}\left(1 + \frac{y}{x}\right)^n + \frac{n x^n}{n + 1}\left(1 + \frac{y}{x}\right)^{(n−1)/2}\left(1 − \frac{y}{x}\right)^{(n+1)/2}.   (4)

The proof of (4) is beyond the scope of this course (it involves solving a differential equation!) but if you are interested, the details can be found in Van Lint (page 41).

Let's compute the weight enumerator for H4: well n = 15 and hence


W_{H4}(x, y) = \frac{x^{15}}{15 + 1}\left(1 + \frac{y}{x}\right)^{15} + \frac{15 x^{15}}{15 + 1}\left(1 + \frac{y}{x}\right)^{7}\left(1 − \frac{y}{x}\right)^{8}

= x^{15} + 35x^{12}y^3 + 105x^{11}y^4 + 168x^{10}y^5 + 280x^9y^6 + 435x^8y^7 + 435x^7y^8 + 280x^6y^9 + 168x^5y^{10} + 105x^4y^{11} + 35x^3y^{12} + y^{15}.

Notice that W_{H3} and W_{H4} are symmetric, that is, they are fixed under exchanging the variables x and y.

Exercise 7.14. Prove this holds in general, i.e. for Hk, k ≥ 3.

Alexander J. Malcolm, School of Mathematics, University of Bristol, Bristol, BS8 1TW,

UK, and the Heilbronn Institute for Mathematical Research, Bristol, UK.

E-mail address: [email protected]