
Lectures on Finite Fields

Xiang-dong Hou

Department of Mathematics, University of South Florida, Tampa, Florida 33620

E-mail address: [email protected]

Abstract.

Contents

Chapter 1. Preliminaries
  1.1. Basic Properties of Finite Fields
  1.2. Partially Ordered Sets and the Mobius Function
  1.3. Tensor
  Exercises

Chapter 2. Polynomials over Finite Fields
  2.1. Number of Irreducible Polynomials
  2.2. Berlekamp's Factorization Algorithm
  2.3. Functions from $\mathbb{F}_q^n$ to $\mathbb{F}_q$
  2.4. Permutation Polynomials
  2.5. Linearized Polynomials
  2.6. Payne's Theorem
  Exercises

Chapter 3. Exponential Sums
  3.1. Characters of a Finite Abelian Group
  3.2. Gauss Sums
  3.3. Evaluation of the Gauss Quadratic Sum over $\mathbb{F}_p$
  3.4. Formal Power Series
  3.5. The Davenport-Hasse Theorem and Evaluation of the Gauss Quadratic Sum over $\mathbb{F}_q$
  3.6. Dedekind Domains and Number Fields
  3.7. Cyclotomic Fields
  3.8.
  Exercises

Chapter 4. Zeros of Polynomials over Finite Fields
  4.1. Ax's Theorem

Hints for the Exercises

Bibliography

CHAPTER 1

Preliminaries

1.1. Basic Properties of Finite Fields

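Existence and uniqueness. Let $F$ be a field with $|F| < \infty$. Define a ring homomorphism

$$f : \mathbb{Z} \longrightarrow F, \qquad n \longmapsto n 1_F,$$

where $1_F$ is the identity of $F$. By the first isomorphism theorem, we have an embedding $\mathbb{Z}/\ker f \hookrightarrow F$. Thus, $\mathbb{Z}/\ker f$ is an integral domain. Therefore, $\ker f$ is a prime ideal of $\mathbb{Z}$, and it is nonzero because $F$ is finite; i.e., $\ker f = p\mathbb{Z}$ for some prime $p$. Since the field $\mathbb{Z}/p\mathbb{Z}$ is embedded in $F$, we may simply assume that $F$ contains $\mathbb{Z}/p\mathbb{Z}$ as a subfield. Clearly, $F$ is a vector space over $\mathbb{Z}/p\mathbb{Z}$. Since $F$ is finite, $[F : \mathbb{Z}/p\mathbb{Z}] = \dim_{\mathbb{Z}/p\mathbb{Z}} F < \infty$. Let $n = [F : \mathbb{Z}/p\mathbb{Z}]$. Then $F \cong (\mathbb{Z}/p\mathbb{Z})^n$ as a $(\mathbb{Z}/p\mathbb{Z})$-vector space. In particular, $|F| = p^n$.

To sum up, if $F$ is a finite field, then $|F| = p^n$ for some prime $p$ and integer $n > 0$.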

An immediate question is: given a prime $p$ and an integer $n > 0$, does there exist a field $F$ with $|F| = p^n$? The answer is positive.

Theorem 1.1. Let $p$ be a prime and $n$ a positive integer. The splitting field of $x^{p^n} - x \in (\mathbb{Z}/p\mathbb{Z})[x]$ has precisely $p^n$ elements.

Proof. Let $f = x^{p^n} - x$ and let $F$ be the splitting field of $f$ over $\mathbb{Z}/p\mathbb{Z}$. Note that $(f', f) = (-1, f) = 1$. Thus, $f$ has $p^n$ distinct roots in $F$. Let

$$E = \{a \in F : f(a) = 0\}.$$

We will show that $F = E$. It suffices to show that $E$ is a field. (Then $f$ splits in $E$. Since $F$ is the smallest field in which $f$ splits, we must have $F = E$.)

We claim that

$$\phi : F \longrightarrow F, \qquad a \longmapsto a^{p^n}$$

is an automorphism of $F$. Clearly, $\phi(1) = 1$. Let $a, b \in F$. We have

$$\phi(ab) = (ab)^{p^n} = a^{p^n} b^{p^n} = \phi(a)\phi(b).$$

Since $p = 0$ in $F$, we also have

$$\phi(a + b) = (a + b)^{p^n} = a^{p^n} + b^{p^n} = \phi(a) + \phi(b).$$

Hence, $\phi : F \to F$ is a ring homomorphism. Clearly, $\ker \phi = \{0\}$. Thus, $\phi$ is one-to-one. Since $F$ is a finite extension over $\mathbb{Z}/p\mathbb{Z}$, $|F| < \infty$. Therefore, $\phi$ must be onto, making it an automorphism of $F$.

Now, $E$ is the fixed field of $\phi$ in $F$. Hence, $E$ is a field. □


A finite field with a given order (number of elements) is unique up to isomorphism.

Theorem 1.2. Given a prime $p$ and an integer $n > 0$, all finite fields of order $p^n$ are isomorphic.

Proof. Let $F$ be a finite field with $|F| = p^n$. As seen at the beginning of this subsection, $\mathbb{Z}/p\mathbb{Z} \subset F$. Since $F \setminus \{0\}$ is a multiplicative group of order $p^n - 1$, we have $a^{p^n - 1} = 1$ for all $a \in F \setminus \{0\}$. Thus,

$$a^{p^n} = a \quad \text{for all } a \in F.$$

Namely, all elements of $F$ are roots of $f = x^{p^n} - x \in (\mathbb{Z}/p\mathbb{Z})[x]$. Since $(f', f) = 1$, $f$ has precisely $p^n$ distinct roots. Hence, $F$ consists of all the roots of $f$. Therefore, $F$ is a splitting field of $f$ over $\mathbb{Z}/p\mathbb{Z}$.

Since all splitting fields of $f$ over $\mathbb{Z}/p\mathbb{Z}$ are isomorphic, the conclusion of the theorem follows. □

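We denote the finite field with $p^n$ elements by $\mathbb{F}_{p^n}$. Thus, $\mathbb{F}_p = \mathbb{Z}/p\mathbb{Z}$. We have an $\mathbb{F}_p$-vector space isomorphism (not a ring isomorphism) $\mathbb{F}_{p^n} \cong \mathbb{F}_p^n$.

The multiplicative group of $\mathbb{F}_{p^n}$. The multiplicative group of $\mathbb{F}_{p^n}$ is denoted by $\mathbb{F}_{p^n}^*$.

Theorem 1.3. $\mathbb{F}_{p^n}^*$ is cyclic. A generator of $\mathbb{F}_{p^n}^*$ is called a primitive element of $\mathbb{F}_{p^n}$.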

Proof. Assume to the contrary that $\mathbb{F}_{p^n}^*$ is not cyclic. By the fundamental theorem of finite abelian groups, we must have

$$\mathbb{F}_{p^n}^* \cong A \times B, \tag{1.1}$$

where $|A| = a$, $|B| = b$ and $(a, b) \ne 1$. (The fundamental theorem of finite abelian groups: Every finite abelian group $G$ is isomorphic to

$$(\mathbb{Z}/p_1^{e_1}\mathbb{Z}) \times \cdots \times (\mathbb{Z}/p_k^{e_k}\mathbb{Z})$$

for some primes $p_1, \dots, p_k$ and integers $e_1, \dots, e_k > 0$. $G$ is cyclic if and only if $p_1, \dots, p_k$ are all distinct.) It follows that $p^n - 1 = |\mathbb{F}_{p^n}^*| = ab > \mathrm{lcm}(a, b)$. By (1.1), we have

$$x^{\mathrm{lcm}(a,b)} = 1 \quad \text{for all } x \in \mathbb{F}_{p^n}^*. \tag{1.2}$$

However, the polynomial $x^{\mathrm{lcm}(a,b)} - 1$ can have at most $\mathrm{lcm}(a, b)$ roots in $\mathbb{F}_{p^n}$, which is a contradiction to (1.2). □

Representation of elements. Let $\alpha$ be a primitive element of $\mathbb{F}_{p^n}$. Then $\mathbb{F}_{p^n} = \{0, 1, \alpha, \dots, \alpha^{p^n - 2}\}$. Multiplications in $\mathbb{F}_{p^n}$ are easily performed under this representation of the elements of $\mathbb{F}_{p^n}$. However, to perform additions in $\mathbb{F}_{p^n}$, we need to treat $\mathbb{F}_{p^n}$ as an extension of $\mathbb{F}_p$ by an irreducible polynomial of degree $n$.
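As a small illustration of the exponent representation, the following Python sketch treats the prime field $\mathbb{F}_p = \mathbb{Z}/p\mathbb{Z}$ (the case $n = 1$), finds a primitive element by brute force, and multiplies nonzero elements by adding exponents modulo $p - 1$. The function names are illustrative choices and do not come from these notes; the search assumes that $p$ is a small prime.

```python
def multiplicative_order(a, p):
    """Order of a in (Z/pZ)^*; assumes p is prime and p does not divide a."""
    order, power = 1, a % p
    while power != 1:
        power = (power * a) % p
        order += 1
    return order


def find_primitive_element(p):
    """Smallest generator of the cyclic group (Z/pZ)^*, found by brute force."""
    for a in range(1, p):
        if multiplicative_order(a, p) == p - 1:
            return a
    raise ValueError("no generator found; p is assumed to be prime")


def log_tables(p):
    """Return (alpha, exp, log) with exp[k] = alpha^k and log[alpha^k] = k."""
    alpha = find_primitive_element(p)
    exp = [pow(alpha, k, p) for k in range(p - 1)]
    log = {value: k for k, value in enumerate(exp)}
    return alpha, exp, log


def mul_via_logs(a, b, p, exp, log):
    """Multiply in Z/pZ by adding exponents of the primitive element."""
    if a == 0 or b == 0:
        return 0
    return exp[(log[a] + log[b]) % (p - 1)]


if __name__ == "__main__":
    p = 7
    alpha, exp, log = log_tables(p)
    print("primitive element:", alpha)
    print("powers of alpha:", exp)
    print("3 * 5 in F_7:", mul_via_logs(3, 5, p, exp, log))
```

Addition has no comparably simple description in terms of exponents, which is why the polynomial representation is needed as well.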

Lemma 1.4. Let $p$ be a prime and $n > 0$ an integer. Then there exists an irreducible polynomial $f \in \mathbb{F}_p[x]$ of degree $n$.

Proof. Let $\alpha \in \mathbb{F}_{p^n}$ be a primitive element. Clearly, $\mathbb{F}_{p^n} = \mathbb{F}_p(\alpha)$. ($\mathbb{F}_p(\alpha)$ is the extension of $\mathbb{F}_p$ obtained by adjoining $\alpha$ to $\mathbb{F}_p$.) Let $f \in \mathbb{F}_p[x]$ be the minimal polynomial of $\alpha$ over $\mathbb{F}_p$. Then $f$ is irreducible and $\deg f = [\mathbb{F}_p(\alpha) : \mathbb{F}_p] = [\mathbb{F}_{p^n} : \mathbb{F}_p] = n$. □

Lemma 1.4 is an existence result. In Chapter 2, we will determine the exact number of irreducible polynomials of degree $n$ over a finite field. However, finding irreducible polynomials of large degrees over a finite field is not easy.

Let $f = x^n + a_{n-1}x^{n-1} + \cdots + a_0 \in \mathbb{F}_p[x]$ be a monic irreducible polynomial of degree $n$. Then $\mathbb{F}_p[x]/(f)$ is a field and every element of $\mathbb{F}_p[x]/(f)$ is uniquely of the form

$$c_0 + c_1 x + \cdots + c_{n-1} x^{n-1},$$

where $x$ denotes the coset $x + (f) \in \mathbb{F}_p[x]/(f)$ and $c_0, \dots, c_{n-1} \in \mathbb{F}_p$. Since $|\mathbb{F}_p[x]/(f)| = p^n$, by Theorem 1.2, $\mathbb{F}_p[x]/(f) = \mathbb{F}_{p^n}$. An element $g + (f) \in \mathbb{F}_p[x]/(f)$, where $g \in \mathbb{F}_p[x]$, is simply written as $g$ when the meaning is clear from the context. Thus, the elements of $\mathbb{F}_p[x]/(f)$ are polynomials of degree $< n$ in $\mathbb{F}_p[x]$; the addition of two such elements is simply the polynomial addition; the multiplication of two such elements is the polynomial multiplication followed by a reduction modulo $f$.

Example 1.5. $f(x) = x^3 + x + 1 \in \mathbb{F}_2[x]$ is irreducible. (A polynomial of degree 2 or 3 over a field $F$ having no root in $F$ is irreducible over $F$.) Hence, $\mathbb{F}_{2^3} = \mathbb{F}_2[x]/(f)$. Let $g = x^2 + x + 1$, $h = x^2 + 1 \in \mathbb{F}_2[x]/(f)$. We have

$$\begin{aligned}
gh &= (x^2 + x + 1)(x^2 + 1) \\
&= x^4 + x^3 + x + 1 \\
&= x(x + 1) \quad (\text{since } x^3 + x + 1 = 0) \\
&= x^2 + x.
\end{aligned}$$

The multiplication table of $\mathbb{F}_{2^3} = \mathbb{F}_2[x]/(f)$ is given below, where $c_2x^2 + c_1x + c_0$ is abbreviated as $c_2c_1c_0$.

Table 1.1. Multiplication Table of $\mathbb{F}_{2^3} = \mathbb{F}_2[x]/(x^3 + x + 1)$

   ·  | 000 001 010 011 100 101 110 111
  ----+--------------------------------
  000 | 000 000 000 000 000 000 000 000
  001 | 000 001 010 011 100 101 110 111
  010 | 000 010 100 110 011 001 111 101
  011 | 000 011 110 101 111 100 001 010
  100 | 000 100 011 111 110 010 101 001
  101 | 000 101 001 100 010 111 011 110
  110 | 000 110 111 001 101 011 010 100
  111 | 000 111 101 010 001 110 100 011
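The table can be regenerated mechanically. The sketch below is one possible encoding, not something prescribed by the notes: an element $c_2x^2 + c_1x + c_0$ is stored as the 3-bit integer with binary digits $c_2c_1c_0$, addition is bitwise XOR, and multiplication is polynomial multiplication followed by reduction modulo $x^3 + x + 1$. The names `gf8_add`, `gf8_mul` and `multiplication_table` are hypothetical.

```python
MODULUS = 0b1011  # x^3 + x + 1
DEGREE = 3


def gf8_add(a, b):
    """Polynomial addition over F_2 is coefficientwise, i.e. bitwise XOR."""
    return a ^ b


def gf8_mul(a, b):
    """Multiply two polynomials over F_2, then reduce modulo x^3 + x + 1."""
    product = 0
    for i in range(DEGREE):
        if (b >> i) & 1:
            product ^= a << i  # add a * x^i
    # The raw product has degree at most 4; cancel x^4 and then x^3.
    for i in range(2 * DEGREE - 2, DEGREE - 1, -1):
        if (product >> i) & 1:
            product ^= MODULUS << (i - DEGREE)
    return product


def multiplication_table():
    """Rows and columns are indexed by 000, 001, ..., 111 as in Table 1.1."""
    return [[gf8_mul(a, b) for b in range(8)] for a in range(8)]


if __name__ == "__main__":
    for row in multiplication_table():
        print(" ".join(format(value, "03b") for value in row))
```

For instance, `gf8_mul(0b111, 0b101)` evaluates the product $gh$ of Example 1.5.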

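Lattice of finite fields. In $\mathbb{F}_{p^n}$, the additive order of 1 is $p$. Thus, the characteristic of $\mathbb{F}_{p^n}$ is $p$. To describe the relations among all finite fields of characteristic $p$, we put all such fields in one ambient field. Let $\overline{\mathbb{F}}_p$ be the algebraic closure of $\mathbb{F}_p$. For each integer $n > 0$, since $\overline{\mathbb{F}}_p$ contains a splitting field of $x^{p^n} - x$ over $\mathbb{F}_p$, $\mathbb{F}_{p^n}$ is a subfield of $\overline{\mathbb{F}}_p$.

Theorem 1.6. Let $p$ be a prime and let $\overline{\mathbb{F}}_p$ be the algebraic closure of $\mathbb{F}_p$.
(i) For each integer $n > 0$, $\overline{\mathbb{F}}_p$ has a unique subfield of order $p^n$.
(ii) Let $\mathbb{F}_{p^m} \subset \overline{\mathbb{F}}_p$ and $\mathbb{F}_{p^n} \subset \overline{\mathbb{F}}_p$. Then $\mathbb{F}_{p^m} \subset \mathbb{F}_{p^n}$ if and only if $m \mid n$. In general,

$$\mathbb{F}_{p^m} \cap \mathbb{F}_{p^n} = \mathbb{F}_{p^{(m,n)}}, \tag{1.3}$$

$$\mathbb{F}_{p^m}\mathbb{F}_{p^n} = \mathbb{F}_{p^{[m,n]}}, \tag{1.4}$$

where $\mathbb{F}_{p^m}\mathbb{F}_{p^n}$ is the subfield of $\overline{\mathbb{F}}_p$ generated by $\mathbb{F}_{p^m} \cup \mathbb{F}_{p^n}$, $(m, n) = \gcd(m, n)$ and $[m, n] = \mathrm{lcm}(m, n)$.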

Note. We already know that a finite field of order $p^n$ is unique up to isomorphism. However, Theorem 1.6 (i) states that in a given algebraic closure of $\mathbb{F}_p$, a finite field of order $p^n$ is not only unique up to isomorphism, but also unique as a set.

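Proof of Theorem 1.6. (i) By the proof of Theorem 1.2, a subfield of $\overline{\mathbb{F}}_p$ of order $p^n$ must be $\{a \in \overline{\mathbb{F}}_p : a^{p^n} = a\}$.

(ii) If $\mathbb{F}_{p^m} \subset \mathbb{F}_{p^n}$, then $\mathbb{F}_{p^n}$ is an $[\mathbb{F}_{p^n} : \mathbb{F}_{p^m}]$-dimensional vector space over $\mathbb{F}_{p^m}$. Hence,

$$p^n = |\mathbb{F}_{p^n}| = |\mathbb{F}_{p^m}|^{[\mathbb{F}_{p^n} : \mathbb{F}_{p^m}]} = p^{m[\mathbb{F}_{p^n} : \mathbb{F}_{p^m}]}.$$

Thus $n = m[\mathbb{F}_{p^n} : \mathbb{F}_{p^m}]$, so $m \mid n$.

If $m \mid n$, then $(p^m - 1) \mid (p^n - 1)$ and

$$\begin{aligned}
x^{p^n} - x &= x\bigl(x^{p^n - 1} - 1\bigr) \\
&= x\Bigl(x^{\frac{p^n - 1}{p^m - 1}(p^m - 1)} - 1\Bigr) \\
&= x\bigl(x^{p^m - 1} - 1\bigr) \sum_{i=0}^{\frac{p^n - 1}{p^m - 1} - 1} x^{(p^m - 1)i} \\
&= \bigl(x^{p^m} - x\bigr) \sum_{i=0}^{\frac{p^n - 1}{p^m - 1} - 1} x^{(p^m - 1)i}.
\end{aligned}$$

Therefore, in $\overline{\mathbb{F}}_p$, the splitting field of $x^{p^m} - x$ is contained in the splitting field of $x^{p^n} - x$, i.e., $\mathbb{F}_{p^m} \subset \mathbb{F}_{p^n}$.

To prove (1.3), first observe that $\mathbb{F}_{p^{(m,n)}} \subset \mathbb{F}_{p^m} \cap \mathbb{F}_{p^n}$. Let $\mathbb{F}_{p^m} \cap \mathbb{F}_{p^n} = \mathbb{F}_{p^s}$. Since $\mathbb{F}_{p^s} \subset \mathbb{F}_{p^m}$ and $\mathbb{F}_{p^s} \subset \mathbb{F}_{p^n}$, from the above, $s \mid m$ and $s \mid n$; hence $s \mid (m, n)$. Therefore, $\mathbb{F}_{p^m} \cap \mathbb{F}_{p^n} = \mathbb{F}_{p^s} \subset \mathbb{F}_{p^{(m,n)}}$.

Equation (1.4) is proved in the same way. □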
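Theorem 1.6 can be checked by hand in a small case. The sketch below works in $\mathbb{F}_{16}$, modelled as $\mathbb{F}_2[x]/(x^4 + x + 1)$; the choice of this modulus (a standard irreducible quartic over $\mathbb{F}_2$), the bit encoding of elements, and the names `mul`, `power` and `fixed_points` are assumptions made for the illustration. It lists the roots of $x^{2^k} - x$ inside $\mathbb{F}_{16}$.

```python
MODULUS = 0b10011  # x^4 + x + 1, taken here as the defining polynomial of F_16
DEGREE = 4


def mul(a, b):
    """Multiply in F_2[x]/(x^4 + x + 1); elements are 4-bit integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> DEGREE:  # degree reached 4, reduce once
            a ^= MODULUS
    return result


def power(a, e):
    """Square-and-multiply exponentiation in F_16."""
    result = 1
    while e:
        if e & 1:
            result = mul(result, a)
        a = mul(a, a)
        e >>= 1
    return result


def fixed_points(k):
    """Elements a of F_16 with a^(2^k) = a, i.e. roots of x^(2^k) - x."""
    return {a for a in range(1 << DEGREE) if power(a, 2 ** k) == a}


if __name__ == "__main__":
    for k in range(1, 5):
        print(k, sorted(fixed_points(k)))
```

For $k = 1, 2, 4$ (the divisors of 4) the root set is the subfield of order $2^k$; for $k = 3$ it collapses to $\mathbb{F}_2$, in line with (1.3), since $(3, 4) = 1$.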

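Proposition 1.7. Let $\mathbb{F}_{p^m} \subset \mathbb{F}_{p^n}$, where $m \mid n$. If $\alpha$ is a primitive element of $\mathbb{F}_{p^n}$, then $\alpha^{\frac{p^n - 1}{p^m - 1}}$ is a primitive element of $\mathbb{F}_{p^m}$.

Proof. Since $o(\alpha) = p^n - 1$, $o\bigl(\alpha^{\frac{p^n - 1}{p^m - 1}}\bigr) = p^m - 1$. Since $\mathbb{F}_{p^n}^*$ is cyclic, $\mathbb{F}_{p^m}^*$ is the only subgroup of $\mathbb{F}_{p^n}^*$ of order $p^m - 1$. Thus, $\mathbb{F}_{p^m}^* = \bigl\langle \alpha^{\frac{p^n - 1}{p^m - 1}} \bigr\rangle$. □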

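The automorphism group. Define a map

$$\sigma : \mathbb{F}_{p^n} \longrightarrow \mathbb{F}_{p^n}, \qquad a \longmapsto a^p.$$

It is obvious that $\sigma$ is an automorphism of $\mathbb{F}_{p^n}$. $\sigma$ is called the Frobenius map of $\mathbb{F}_{p^n}$ over $\mathbb{F}_p$.

Theorem 1.8. The extension $\mathbb{F}_{p^n}/\mathbb{F}_p$ is Galois and $\mathrm{Aut}(\mathbb{F}_{p^n}/\mathbb{F}_p) = \langle\sigma\rangle$. More generally, if $m \mid n$, then the extension $\mathbb{F}_{p^n}/\mathbb{F}_{p^m}$ is Galois and $\mathrm{Aut}(\mathbb{F}_{p^n}/\mathbb{F}_{p^m}) = \langle\sigma^m\rangle$.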


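Proof. Since $x^{p^n} - x$ is a separable polynomial in $\mathbb{F}_p[x]$ and since $\mathbb{F}_{p^n}$ is the splitting field of $x^{p^n} - x$ over $\mathbb{F}_p$, $\mathbb{F}_{p^n}$ is Galois over $\mathbb{F}_p$. Thus, $|\mathrm{Aut}(\mathbb{F}_{p^n}/\mathbb{F}_p)| = [\mathbb{F}_{p^n} : \mathbb{F}_p] = n$. Since $\sigma \in \mathrm{Aut}(\mathbb{F}_{p^n}/\mathbb{F}_p)$, to prove that $\mathrm{Aut}(\mathbb{F}_{p^n}/\mathbb{F}_p) = \langle\sigma\rangle$, it suffices to show that $o(\sigma) = n$, or, equivalently, $o(\sigma) \ge n$. Since $\sigma^{o(\sigma)} = \mathrm{id}$, we have

$$0 = \sigma^{o(\sigma)}(a) - a = a^{p^{o(\sigma)}} - a \quad \text{for all } a \in \mathbb{F}_{p^n}. \tag{1.5}$$

The polynomial $x^{p^{o(\sigma)}} - x$, being of degree $p^{o(\sigma)}$, has at most $p^{o(\sigma)}$ roots in $\mathbb{F}_{p^n}$. Thus, (1.5) implies that $p^n \le p^{o(\sigma)}$, i.e., $n \le o(\sigma)$.

If $m \mid n$, then $\mathbb{F}_p \subset \mathbb{F}_{p^m} \subset \mathbb{F}_{p^n}$. Since $\mathbb{F}_{p^n}/\mathbb{F}_p$ is Galois, so is $\mathbb{F}_{p^n}/\mathbb{F}_{p^m}$. Moreover, $\mathrm{Aut}(\mathbb{F}_{p^n}/\mathbb{F}_{p^m})$ is a subgroup of $\mathrm{Aut}(\mathbb{F}_{p^n}/\mathbb{F}_p)$ of order $n/m$. Since $\mathrm{Aut}(\mathbb{F}_{p^n}/\mathbb{F}_p) = \langle\sigma\rangle$ is cyclic, its only subgroup of order $n/m$ is $\langle\sigma^m\rangle$. Thus, $\mathrm{Aut}(\mathbb{F}_{p^n}/\mathbb{F}_{p^m}) = \langle\sigma^m\rangle$. □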

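Note. The automorphism $\sigma^m \in \mathrm{Aut}(\mathbb{F}_{p^n}/\mathbb{F}_{p^m}) = \langle\sigma^m\rangle$ is defined by $\sigma^m(a) = a^{p^m}$, $a \in \mathbb{F}_{p^n}$, and is called the Frobenius map of $\mathbb{F}_{p^n}$ over $\mathbb{F}_{p^m}$.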

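Trace and norm. Let $\mathbb{F}_{p^s} \subset \mathbb{F}_{p^t}$, where $s \mid t$. We usually write two such fields as $\mathbb{F}_q \subset \mathbb{F}_{q^n}$, where $q = p^s$ and $n = t/s$. By Theorem 1.8, $\mathrm{Aut}(\mathbb{F}_{q^n}/\mathbb{F}_q) = \langle\tau\rangle$, where

$$\tau : \mathbb{F}_{q^n} \longrightarrow \mathbb{F}_{q^n}, \qquad a \longmapsto a^q$$

is the Frobenius map of $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$. For each $a \in \mathbb{F}_{q^n}$, define

$$\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}(a) = \sum_{\phi \in \mathrm{Aut}(\mathbb{F}_{q^n}/\mathbb{F}_q)} \phi(a) = \sum_{i=0}^{n-1} \tau^i(a) = \sum_{i=0}^{n-1} a^{q^i}$$

and

$$N_{\mathbb{F}_{q^n}/\mathbb{F}_q}(a) = \prod_{\phi \in \mathrm{Aut}(\mathbb{F}_{q^n}/\mathbb{F}_q)} \phi(a) = \prod_{i=0}^{n-1} \tau^i(a) = a^{q^0 + \cdots + q^{n-1}} = a^{\frac{q^n - 1}{q - 1}}.$$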

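For each $\psi \in \mathrm{Aut}(\mathbb{F}_{q^n}/\mathbb{F}_q)$, we have

$$\begin{aligned}
\psi\bigl(\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}(a)\bigr) &= \psi\Bigl(\sum_{\phi \in \mathrm{Aut}(\mathbb{F}_{q^n}/\mathbb{F}_q)} \phi(a)\Bigr) \\
&= \sum_{\phi \in \mathrm{Aut}(\mathbb{F}_{q^n}/\mathbb{F}_q)} (\psi\phi)(a) \\
&= \sum_{\phi' \in \mathrm{Aut}(\mathbb{F}_{q^n}/\mathbb{F}_q)} \phi'(a) \quad (\text{let } \phi' = \psi\phi) \\
&= \mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}(a).
\end{aligned}$$

Since $\mathbb{F}_{q^n}/\mathbb{F}_q$ is Galois, we must have $\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}(a) \in \mathbb{F}_q$. By the same argument, $N_{\mathbb{F}_{q^n}/\mathbb{F}_q}(a) \in \mathbb{F}_q$.

For $a \in \mathbb{F}_{q^n}$, $\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}(a)$ is called the trace of $a$ from $\mathbb{F}_{q^n}$ to $\mathbb{F}_q$, and $N_{\mathbb{F}_{q^n}/\mathbb{F}_q}(a)$ is called the norm of $a$ from $\mathbb{F}_{q^n}$ to $\mathbb{F}_q$.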

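Theorem 1.9.
(i) The map $\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q} : \mathbb{F}_{q^n} \to \mathbb{F}_q$ is an onto $\mathbb{F}_q$-map.
(ii) If $a \in \mathbb{F}_q$, then $\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}(a) = na$.
(iii) For all $a \in \mathbb{F}_{q^n}$ and $\phi \in \mathrm{Aut}(\mathbb{F}_{q^n}/\mathbb{F}_q)$, $\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}(\phi(a)) = \mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}(a)$. In particular, $\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}(a^q) = \mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}(a)$.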

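Proof. (i) Every $\phi \in \mathrm{Aut}(\mathbb{F}_{q^n}/\mathbb{F}_q)$ is $\mathbb{F}_q$-linear, and the trace takes values in $\mathbb{F}_q$, so $\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q} = \sum_{\phi \in \mathrm{Aut}(\mathbb{F}_{q^n}/\mathbb{F}_q)} \phi \in \mathrm{Hom}_{\mathbb{F}_q}(\mathbb{F}_{q^n}, \mathbb{F}_q)$. We claim that $\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q} \ne 0$. This is true since

$$\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}(a) = a^{q^0} + a^{q^1} + \cdots + a^{q^{n-1}},$$

being a polynomial of degree $q^{n-1}$ in $a$, has at most $q^{n-1} < q^n$ roots and so cannot be 0 for all $a \in \mathbb{F}_{q^n}$. Thus, $\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q} : \mathbb{F}_{q^n} \to \mathbb{F}_q$ is onto since the target $\mathbb{F}_q$ is of dimension 1 over $\mathbb{F}_q$.

(ii) We have

$$\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}(a) = \sum_{\psi \in \mathrm{Aut}(\mathbb{F}_{q^n}/\mathbb{F}_q)} \psi(a) = \sum_{\psi \in \mathrm{Aut}(\mathbb{F}_{q^n}/\mathbb{F}_q)} a = na.$$

(iii) We have

$$\begin{aligned}
\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}(\phi(a)) &= \sum_{\psi \in \mathrm{Aut}(\mathbb{F}_{q^n}/\mathbb{F}_q)} \psi(\phi(a)) \\
&= \sum_{\psi \in \mathrm{Aut}(\mathbb{F}_{q^n}/\mathbb{F}_q)} (\psi\phi)(a) \\
&= \sum_{\psi \in \mathrm{Aut}(\mathbb{F}_{q^n}/\mathbb{F}_q)} \psi(a) \\
&= \mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}(a).
\end{aligned}$$

□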

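Theorem 1.10.
(i) $N_{\mathbb{F}_{q^n}/\mathbb{F}_q}(0) = 0$ and the map $N_{\mathbb{F}_{q^n}/\mathbb{F}_q} : \mathbb{F}_{q^n}^* \to \mathbb{F}_q^*$ is an onto group homomorphism.
(ii) If $a \in \mathbb{F}_q$, then $N_{\mathbb{F}_{q^n}/\mathbb{F}_q}(a) = a^n$.
(iii) For all $a \in \mathbb{F}_{q^n}$ and $\phi \in \mathrm{Aut}(\mathbb{F}_{q^n}/\mathbb{F}_q)$, $N_{\mathbb{F}_{q^n}/\mathbb{F}_q}(\phi(a)) = N_{\mathbb{F}_{q^n}/\mathbb{F}_q}(a)$. In particular, $N_{\mathbb{F}_{q^n}/\mathbb{F}_q}(a^q) = N_{\mathbb{F}_{q^n}/\mathbb{F}_q}(a)$.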

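Proof. (i) Clearly, $N_{\mathbb{F}_{q^n}/\mathbb{F}_q}(0) = 0$. Since

$$N_{\mathbb{F}_{q^n}/\mathbb{F}_q}(a) = a^{\frac{q^n - 1}{q - 1}}, \quad a \in \mathbb{F}_{q^n}^*,$$

$N_{\mathbb{F}_{q^n}/\mathbb{F}_q} : \mathbb{F}_{q^n}^* \to \mathbb{F}_q^*$ is a group homomorphism. By Proposition 1.7, $N_{\mathbb{F}_{q^n}/\mathbb{F}_q}$ maps a generator of $\mathbb{F}_{q^n}^*$ to a generator of $\mathbb{F}_q^*$. Thus, $N_{\mathbb{F}_{q^n}/\mathbb{F}_q} : \mathbb{F}_{q^n}^* \to \mathbb{F}_q^*$ is onto.

The proofs of (ii) and (iii) are the same as the proofs of (ii) and (iii) of Theorem 1.9. □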
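Theorems 1.9 and 1.10 can be observed numerically in a small case. The sketch below computes traces and norms from $\mathbb{F}_{16}$ down to $\mathbb{F}_4$ and $\mathbb{F}_2$ directly from the formulas $\sum_i a^{q^i}$ and $\prod_i a^{q^i}$. It assumes the model $\mathbb{F}_{16} = \mathbb{F}_2[x]/(x^4 + x + 1)$ with elements stored as 4-bit integers; in that model the subfield $\mathbb{F}_4$ is $\{0, 1, 6, 7\}$. The function names are illustrative.

```python
MODULUS = 0b10011  # x^4 + x + 1, an assumed defining polynomial for F_16
DEGREE = 4


def mul(a, b):
    """Multiply in F_2[x]/(x^4 + x + 1)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> DEGREE:
            a ^= MODULUS
    return result


def power(a, e):
    """Square-and-multiply exponentiation in F_16."""
    result = 1
    while e:
        if e & 1:
            result = mul(result, a)
        a = mul(a, a)
        e >>= 1
    return result


def trace(a, s):
    """Trace from F_16 to F_(2^s), s | 4: sum of a^(q^i), q = 2^s, 0 <= i < 4/s."""
    q, n = 2 ** s, DEGREE // s
    total = 0
    for i in range(n):
        total ^= power(a, q ** i)  # addition in characteristic 2 is XOR
    return total


def norm(a, s):
    """Norm from F_16 to F_(2^s), s | 4: product of a^(q^i), 0 <= i < 4/s."""
    q, n = 2 ** s, DEGREE // s
    result = 1
    for i in range(n):
        result = mul(result, power(a, q ** i))
    return result


if __name__ == "__main__":
    print("traces to F_4:", sorted({trace(a, 2) for a in range(16)}))
    print("norms to F_4: ", sorted({norm(a, 2) for a in range(1, 16)}))
```

Both maps land in the subfield and hit every admissible value, as parts (i) of the two theorems assert.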

Theorem 1.11 (Transitivity of trace and norm). Let F ⊂ K ⊂ L be finitefields and let a ∈ L. Then

(1.6) TrK/F (TrL/K(a)) = TrL/F (a),

(1.7) NK/F (NL/K(a)) = NL/F (a).

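Proof. Let $F = \mathbb{F}_q$, $K = \mathbb{F}_{q^s}$, $L = \mathbb{F}_{q^{st}}$. Let $\tau$ be the Frobenius map of $L$ over $F$. Then $\tau^s$ is the Frobenius map of $L$ over $K$ and $\tau|_K$ is the Frobenius map of $K$ over $F$. Thus,

$$\begin{aligned}
\mathrm{Tr}_{K/F}(\mathrm{Tr}_{L/K}(a)) &= \mathrm{Tr}_{K/F}\Bigl(\sum_{i=0}^{t-1} \tau^{si}(a)\Bigr) \\
&= \sum_{j=0}^{s-1} \tau^j\Bigl(\sum_{i=0}^{t-1} \tau^{si}(a)\Bigr) \\
&= \sum_{i=0}^{t-1} \sum_{j=0}^{s-1} \tau^{si+j}(a) \\
&= \sum_{k=0}^{st-1} \tau^k(a) \quad (k = si + j) \\
&= \mathrm{Tr}_{L/F}(a).
\end{aligned}$$

The proof of (1.7) is the same. □

The next two theorems describe the kernels of $\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}$ and $N_{\mathbb{F}_{q^n}/\mathbb{F}_q}$.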

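Theorem 1.12. Let $\phi$ be any generator of $\mathrm{Aut}(\mathbb{F}_{q^n}/\mathbb{F}_q)$. Then

$$\ker(\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}) = \{\phi(x) - x : x \in \mathbb{F}_{q^n}\}. \tag{1.8}$$

Proof. Let $f = \phi - \mathrm{id} \in \mathrm{Hom}_{\mathbb{F}_q}(\mathbb{F}_{q^n}, \mathbb{F}_{q^n})$. Then the right side of (1.8) is $f(\mathbb{F}_{q^n})$. By Theorem 1.9 (iii), $\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q} \circ f = \mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q} \circ \phi - \mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q} = 0$. Hence, $f(\mathbb{F}_{q^n}) \subset \ker(\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q})$. By Theorem 1.9 (i), $\dim_{\mathbb{F}_q} \ker(\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}) = n - 1$. Thus, to prove (1.8), it suffices to show that $\dim_{\mathbb{F}_q} f(\mathbb{F}_{q^n}) = n - 1$. Note that $\ker f = \{x \in \mathbb{F}_{q^n} : \phi(x) = x\} = \mathbb{F}_q$ since $\mathbb{F}_{q^n}/\mathbb{F}_q$ is Galois with $\mathrm{Aut}(\mathbb{F}_{q^n}/\mathbb{F}_q) = \langle\phi\rangle$. Thus,

$$\dim_{\mathbb{F}_q} f(\mathbb{F}_{q^n}) = n - \dim_{\mathbb{F}_q} \ker f = n - 1.$$

□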

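Theorem 1.13 (Hilbert Theorem 90 for finite fields). Let $\phi$ be any generator of $\mathrm{Aut}(\mathbb{F}_{q^n}/\mathbb{F}_q)$. Then

$$\ker(N_{\mathbb{F}_{q^n}/\mathbb{F}_q} : \mathbb{F}_{q^n}^* \to \mathbb{F}_q^*) = \Bigl\{\frac{\phi(x)}{x} : x \in \mathbb{F}_{q^n}^*\Bigr\}. \tag{1.9}$$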

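Proof. The proof is similar to that of Theorem 1.12. Define a group homomorphism

$$f : \mathbb{F}_{q^n}^* \longrightarrow \mathbb{F}_{q^n}^*, \qquad x \longmapsto \frac{\phi(x)}{x}.$$

Then the right side of (1.9) is $f(\mathbb{F}_{q^n}^*)$. It is easy to see that

$$f(\mathbb{F}_{q^n}^*) \subset \ker(N_{\mathbb{F}_{q^n}/\mathbb{F}_q} : \mathbb{F}_{q^n}^* \to \mathbb{F}_q^*).$$

Thus, to prove (1.9), it suffices to show that

$$|f(\mathbb{F}_{q^n}^*)| = |\ker(N_{\mathbb{F}_{q^n}/\mathbb{F}_q} : \mathbb{F}_{q^n}^* \to \mathbb{F}_q^*)| = \frac{|\mathbb{F}_{q^n}^*|}{|\mathbb{F}_q^*|}.$$

We have $\ker f = \{x \in \mathbb{F}_{q^n}^* : \phi(x) = x\} = \mathbb{F}_q^*$ since $\mathbb{F}_{q^n}/\mathbb{F}_q$ is Galois with $\mathrm{Aut}(\mathbb{F}_{q^n}/\mathbb{F}_q) = \langle\phi\rangle$. Thus,

$$|f(\mathbb{F}_{q^n}^*)| = \frac{|\mathbb{F}_{q^n}^*|}{|\ker f|} = \frac{|\mathbb{F}_{q^n}^*|}{|\mathbb{F}_q^*|}.$$

□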
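The two kernel descriptions can be tested by enumeration. The sketch below again assumes the model $\mathbb{F}_{16} = \mathbb{F}_2[x]/(x^4 + x + 1)$ with 4-bit integer encoding (an assumption of the illustration, as are all function names). It compares $\ker \mathrm{Tr}_{\mathbb{F}_{16}/\mathbb{F}_2}$ with the image of $x \mapsto \sigma^j(x) - x$ for the two generators $\sigma, \sigma^3$ of the automorphism group, and $\ker N_{\mathbb{F}_{16}/\mathbb{F}_4}$ with the image of $x \mapsto x^4/x$.

```python
MODULUS = 0b10011  # x^4 + x + 1, an assumed defining polynomial for F_16
DEGREE = 4


def mul(a, b):
    """Multiply in F_2[x]/(x^4 + x + 1)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> DEGREE:
            a ^= MODULUS
    return result


def power(a, e):
    """Square-and-multiply exponentiation in F_16."""
    result = 1
    while e:
        if e & 1:
            result = mul(result, a)
        a = mul(a, a)
        e >>= 1
    return result


def trace_kernel():
    """Kernel of Tr from F_16 to F_2, where Tr(a) = a + a^2 + a^4 + a^8."""
    return {a for a in range(16)
            if a ^ power(a, 2) ^ power(a, 4) ^ power(a, 8) == 0}


def additive_image(j):
    """Image of x -> sigma^j(x) - x, where sigma(x) = x^2 (minus is XOR here)."""
    return {power(x, 2 ** j) ^ x for x in range(16)}


def norm_kernel():
    """Kernel of N from F_16^* to F_4^*, where N(a) = a^5."""
    return {a for a in range(1, 16) if power(a, 5) == 1}


def multiplicative_image():
    """Image of x -> x^4 / x on F_16^*; the inverse of x is x^14."""
    return {mul(power(x, 4), power(x, 14)) for x in range(1, 16)}


if __name__ == "__main__":
    print(sorted(trace_kernel()), sorted(additive_image(1)))
    print(sorted(norm_kernel()), sorted(multiplicative_image()))
```

With $j = 2$ the map $\sigma^2$ is not a generator, and the image is then strictly smaller than the kernel of the trace, which shows that the hypothesis on $\phi$ is needed.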

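By Theorem 1.9 (i), Theorem 1.10 (i), Theorems 1.12 and 1.13, for any generator $\phi$ of $\mathrm{Aut}(\mathbb{F}_{q^n}/\mathbb{F}_q)$, we have exact sequences

$$\mathbb{F}_{q^n} \xrightarrow{\ \phi - \mathrm{id}\ } \mathbb{F}_{q^n} \xrightarrow{\ \mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}\ } \mathbb{F}_q \longrightarrow \{0\}$$

and

$$\mathbb{F}_{q^n}^* \xrightarrow{\ \phi/\mathrm{id}\ } \mathbb{F}_{q^n}^* \xrightarrow{\ N_{\mathbb{F}_{q^n}/\mathbb{F}_q}\ } \mathbb{F}_q^* \longrightarrow \{1\}.$$

The trace and norm can also be characterized in terms of a linear transformation.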

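Theorem 1.14. Let $a \in \mathbb{F}_{q^n}$ and define an $\mathbb{F}_q$-linear map

$$T_a : \mathbb{F}_{q^n} \longrightarrow \mathbb{F}_{q^n}, \qquad x \longmapsto ax.$$

Then $\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}(a) = \mathrm{Tr}(T_a)$ and $N_{\mathbb{F}_{q^n}/\mathbb{F}_q}(a) = \det(T_a)$. (The trace and determinant of a linear transformation $T$ of a finite dimensional vector space $V$ are defined to be the trace and determinant of the matrix of $T$ with respect to any basis of $V$.)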

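Proof. Consider the tower $\mathbb{F}_q \subset \mathbb{F}_q(a) \subset \mathbb{F}_{q^n}$ and let $[\mathbb{F}_q(a) : \mathbb{F}_q] = s$, $[\mathbb{F}_{q^n} : \mathbb{F}_q(a)] = t$. Then $1, a, \dots, a^{s-1}$ is a basis of $\mathbb{F}_q(a)$ over $\mathbb{F}_q$. Let

$$f(x) = x^s + b_{s-1}x^{s-1} + \cdots + b_0 \in \mathbb{F}_q[x] \tag{1.10}$$

be the minimal polynomial of $a$ over $\mathbb{F}_q$. Then

$$T_a \begin{bmatrix} 1 \\ a \\ \vdots \\ a^{s-1} \end{bmatrix} = A \begin{bmatrix} 1 \\ a \\ \vdots \\ a^{s-1} \end{bmatrix},$$

where

$$A = \begin{bmatrix}
0 & 1 & & & \\
 & 0 & 1 & & \\
 & & \ddots & \ddots & \\
 & & & 0 & 1 \\
-b_0 & -b_1 & \cdots & \cdots & -b_{s-1}
\end{bmatrix}.$$

Let $\epsilon_1, \dots, \epsilon_t$ be a basis of $\mathbb{F}_{q^n}$ over $\mathbb{F}_q(a)$. Then $\epsilon_i a^j$, $1 \le i \le t$, $0 \le j \le s - 1$, is a basis of $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$. With respect to this basis, we have

$$T_a \begin{bmatrix} \epsilon_1 a^0 \\ \vdots \\ \epsilon_1 a^{s-1} \\ \vdots \\ \epsilon_t a^0 \\ \vdots \\ \epsilon_t a^{s-1} \end{bmatrix}
= \underbrace{\begin{bmatrix} A & & \\ & \ddots & \\ & & A \end{bmatrix}}_{t \text{ blocks}}
\begin{bmatrix} \epsilon_1 a^0 \\ \vdots \\ \epsilon_1 a^{s-1} \\ \vdots \\ \epsilon_t a^0 \\ \vdots \\ \epsilon_t a^{s-1} \end{bmatrix}.$$

Thus,

$$\mathrm{Tr}(T_a) = t\,\mathrm{Tr}(A) = t(-b_{s-1}), \tag{1.11}$$

$$\det(T_a) = (\det A)^t = \bigl[(-1)^s b_0\bigr]^t. \tag{1.12}$$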

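Let $\tau$ be the Frobenius map of $\mathbb{F}_q(a)$ over $\mathbb{F}_q$. Then $\tau^0(a), \dots, \tau^{s-1}(a)$ are all roots of $f$ and are all distinct. (If, to the contrary, $\tau^i(a) = \tau^j(a)$ for some $0 \le i < j \le s - 1$, then $\tau^{j-i}(a) = a$. Since $\tau^{j-i} \in \mathrm{Aut}(\mathbb{F}_q(a)/\mathbb{F}_q)$ fixes $\mathbb{F}_q$ and $a$, we must have $\tau^{j-i} = \mathrm{id}$, which is a contradiction since $o(\tau) = s$.) Therefore,

$$\begin{aligned}
f(x) &= \prod_{i=0}^{s-1} \bigl(x - \tau^i(a)\bigr) \\
&= x^s - \Bigl(\sum_{i=0}^{s-1} \tau^i(a)\Bigr)x^{s-1} + \cdots + (-1)^s \prod_{i=0}^{s-1} \tau^i(a) \\
&= x^s - \mathrm{Tr}_{\mathbb{F}_q(a)/\mathbb{F}_q}(a)\,x^{s-1} + \cdots + (-1)^s N_{\mathbb{F}_q(a)/\mathbb{F}_q}(a).
\end{aligned} \tag{1.13}$$

A comparison of (1.10) and (1.13) yields

$$-b_{s-1} = \mathrm{Tr}_{\mathbb{F}_q(a)/\mathbb{F}_q}(a) \quad \text{and} \quad (-1)^s b_0 = N_{\mathbb{F}_q(a)/\mathbb{F}_q}(a).$$

Thus, from (1.11), (1.12) and the above, we have

$$\begin{aligned}
\mathrm{Tr}(T_a) &= t\,\mathrm{Tr}_{\mathbb{F}_q(a)/\mathbb{F}_q}(a) \\
&= \mathrm{Tr}_{\mathbb{F}_q(a)/\mathbb{F}_q}(ta) \\
&= \mathrm{Tr}_{\mathbb{F}_q(a)/\mathbb{F}_q}\bigl(\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q(a)}(a)\bigr) \quad (\text{by Theorem 1.9 (ii)}) \\
&= \mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}(a) \quad (\text{by Theorem 1.11})
\end{aligned}$$

and

$$\begin{aligned}
\det(T_a) &= \bigl[N_{\mathbb{F}_q(a)/\mathbb{F}_q}(a)\bigr]^t \\
&= N_{\mathbb{F}_q(a)/\mathbb{F}_q}(a^t) \\
&= N_{\mathbb{F}_q(a)/\mathbb{F}_q}\bigl(N_{\mathbb{F}_{q^n}/\mathbb{F}_q(a)}(a)\bigr) \quad (\text{by Theorem 1.10 (ii)}) \\
&= N_{\mathbb{F}_{q^n}/\mathbb{F}_q}(a) \quad (\text{by Theorem 1.11}).
\end{aligned}$$

□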
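Theorem 1.14 is easy to test for $\mathbb{F}_8$ over $\mathbb{F}_2$. The sketch below assumes the model $\mathbb{F}_8 = \mathbb{F}_2[x]/(x^3 + x + 1)$ with basis $1, x, x^2$ and elements stored as 3-bit integers; it builds the $3 \times 3$ matrix of $T_a$ over $\mathbb{F}_2$ and compares its trace and determinant with $a + a^2 + a^4$ and $a \cdot a^2 \cdot a^4$. The helper names are hypothetical.

```python
MODULUS = 0b1011  # x^3 + x + 1
DEGREE = 3


def mul(a, b):
    """Multiply in F_2[x]/(x^3 + x + 1)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> DEGREE:
            a ^= MODULUS
    return result


def power(a, e):
    """Square-and-multiply exponentiation in F_8."""
    result = 1
    while e:
        if e & 1:
            result = mul(result, a)
        a = mul(a, a)
        e >>= 1
    return result


def matrix_of_multiplication(a):
    """Matrix over F_2 whose column j holds the coordinates of a * x^j
    with respect to the basis 1, x, x^2."""
    columns = [mul(a, 1 << j) for j in range(DEGREE)]
    return [[(columns[j] >> i) & 1 for j in range(DEGREE)] for i in range(DEGREE)]


def trace_mod2(m):
    return sum(m[i][i] for i in range(DEGREE)) % 2


def det_mod2(m):
    """Determinant of a 3x3 matrix by cofactor expansion, reduced mod 2."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return (a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)) % 2


def field_trace(a):
    """Tr from F_8 to F_2; the result is 0 or 1."""
    return a ^ power(a, 2) ^ power(a, 4)


def field_norm(a):
    """N from F_8 to F_2; the result is 0 or 1."""
    return mul(a, mul(power(a, 2), power(a, 4)))


if __name__ == "__main__":
    for a in range(8):
        m = matrix_of_multiplication(a)
        print(format(a, "03b"), trace_mod2(m), field_trace(a), det_mod2(m), field_norm(a))
```

The matrix here acts on coordinate columns, which is the transpose of the convention used in the proof; trace and determinant are unaffected by transposition.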

Normal basis. Let $\tau$ be the Frobenius map of $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$ and let $a \in \mathbb{F}_{q^n}$. In general, $\tau^0(a), \tau^1(a), \dots, \tau^{n-1}(a)$ do not necessarily form a basis of $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$; if they do, the basis is called a normal basis of $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$.

Theorem 1.15 (Existence of a normal basis). There exists a normal basis of $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$.

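Proof. Let $\tau$ be the Frobenius map of $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$ and view $\tau$ as an $\mathbb{F}_q$-linear transformation of $\mathbb{F}_{q^n}$. Since $\tau^n = \mathrm{id}$, the polynomial $x^n - 1$ annihilates $\tau$. We claim that $x^n - 1$ is the minimal polynomial of $\tau$. Assume to the contrary that the minimal polynomial of $\tau$ is $f(x) = x^m + a_{m-1}x^{m-1} + \cdots + a_0 \in \mathbb{F}_q[x]$, where $0 < m < n$. Then for all $y \in \mathbb{F}_{q^n}$,

$$0 = f(\tau)(y) = (\tau^m + a_{m-1}\tau^{m-1} + \cdots + a_0\tau^0)(y) = y^{q^m} + a_{m-1}y^{q^{m-1}} + \cdots + a_0 y. \tag{1.14}$$

But this is impossible since the right side of (1.14) is a polynomial of degree $q^m$ in $y$ and thus has at most $q^m < q^n$ roots in $\mathbb{F}_{q^n}$.

Let $A$ be the matrix of $\tau$ with respect to any basis of $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$. Then the minimal polynomial of $A$ is $x^n - 1$. It follows that

$$A \sim \begin{bmatrix}
0 & 1 & & & \\
 & 0 & 1 & & \\
 & & \ddots & \ddots & \\
 & & & 0 & 1 \\
1 & 0 & \cdots & \cdots & 0
\end{bmatrix}. \tag{1.15}$$

(The symbol $\sim$ means matrix similarity.) Similarity (1.15) holds since both matrices have the same invariant factor $x^n - 1$. Therefore, there is a basis $\epsilon_1, \dots, \epsilon_n$ of $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$ with respect to which the matrix of $\tau$ is the matrix at the right side of (1.15). Since

$$\tau \begin{bmatrix} \epsilon_1 \\ \vdots \\ \epsilon_n \end{bmatrix}
= \begin{bmatrix}
0 & 1 & & & \\
 & 0 & 1 & & \\
 & & \ddots & \ddots & \\
 & & & 0 & 1 \\
1 & 0 & \cdots & \cdots & 0
\end{bmatrix}
\begin{bmatrix} \epsilon_1 \\ \vdots \\ \epsilon_n \end{bmatrix}
= \begin{bmatrix} \epsilon_2 \\ \vdots \\ \epsilon_n \\ \epsilon_1 \end{bmatrix},$$

we have $\epsilon_2 = \tau(\epsilon_1)$, $\epsilon_3 = \tau(\epsilon_2) = \tau^2(\epsilon_1)$, ..., $\epsilon_n = \tau^{n-1}(\epsilon_1)$. Thus, $\epsilon_1, \tau(\epsilon_1), \dots, \tau^{n-1}(\epsilon_1)$ is a normal basis of $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$. □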

1.2. Partially Ordered Sets and the Mobius Function

Definition 1.16. A partially ordered set (poset) is a nonempty set X equippedwith a binary relation ≤ satisfying the following conditions:

(i) (reflexivity) x ≤ x for all x ∈ X,(ii) (transitivity) if x ≤ y and y ≤ z, where x, y, z ∈ X, then x ≤ z,(iii) (anti-symmetry) if x ≤ y and y ≤ x, where x, y ∈ X, then x = y.

Let $(X, \le)$ be a poset and $x, y \in X$. We write $x < y$ to mean that $x \le y$ and $x \ne y$. We define $[x, y] = \{z \in X : x \le z \le y\}$, $[x, y) = \{z \in X : x \le z < y\}$, etc., and call them intervals. A poset $(X, \le)$ is called locally finite if for all $x, y \in X$, $|[x, y]| < \infty$.

Definition 1.17 (The Mobius function). Let $(X, \le)$ be a locally finite poset. The Mobius function of $(X, \le)$ is a function

$$\mu : X \times X \longrightarrow \mathbb{Z}$$

such that if $x \not\le y$, $\mu(x, y) = 0$ and if $x \le y$,

$$\sum_{z \in [x, y]} \mu(x, z) = \delta(x, y),$$

where

$$\delta(x, y) = \begin{cases} 1 & \text{if } x = y, \\ 0 & \text{if } x \ne y \end{cases}$$

is the Kronecker symbol.

The Mobius function of a locally finite poset $(X, \le)$ exists and is unique. In fact, with a fixed $x \in X$, $\mu(x, y)$, where $y \ge x$, is inductively given by

$$\begin{cases}
\mu(x, x) = 1, \\
\mu(x, y) = -\displaystyle\sum_{z \in [x, y)} \mu(x, z) & \text{if } y > x.
\end{cases} \tag{1.16}$$

Since $(X, \le)$ is locally finite, (1.16) does give $\mu(x, y)$ for all $y \ge x$.

The usefulness of the Mobius function lies in the so-called Mobius inversion formula.
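The recursion (1.16) translates directly into code. The sketch below is a minimal illustration for a finite poset given by a list of elements and an order predicate; the function name `mobius_function` and its interface are choices made here, not part of the notes. As an example it uses the divisors of 12 ordered by divisibility.

```python
from functools import lru_cache


def mobius_function(elements, leq):
    """Return mu(x, y) for the finite poset (elements, leq), computed via (1.16)."""
    elements = list(elements)

    @lru_cache(maxsize=None)
    def mu(x, y):
        if not leq(x, y):
            return 0
        if x == y:
            return 1
        # Sum over the half-open interval [x, y).
        return -sum(mu(x, z) for z in elements
                    if leq(x, z) and leq(z, y) and z != y)

    return mu


if __name__ == "__main__":
    divisors_of_12 = [1, 2, 3, 4, 6, 12]
    mu = mobius_function(divisors_of_12, lambda x, y: y % x == 0)
    for y in divisors_of_12:
        print(y, mu(1, y))
```

The recursion terminates because each call only refers to values $\mu(x, z)$ with $z$ strictly below $y$ in a finite interval.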

Theorem 1.18 (The Mobius inversion). Let $(X, \le)$ be a locally finite poset with Mobius function $\mu$. Let $A$ be an abelian group and $N_= : X \to A$ a function. Let $l, m \in X$ be fixed and for $x \in X$ define

$$N_\ge(x) = \sum_{y \in [x, m]} N_=(y)$$

and

$$N_\le(x) = \sum_{y \in [l, x]} N_=(y).$$

Then

$$N_=(x) = \sum_{y \in [x, m]} \mu(x, y) N_\ge(y) \quad \text{for all } x \in X \text{ with } x \le m \tag{1.17}$$

and

$$N_=(x) = \sum_{y \in [l, x]} \mu(y, x) N_\le(y) \quad \text{for all } x \in X \text{ with } x \ge l. \tag{1.18}$$

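Proof. Let $x \in X$ such that $x \le m$. We have

$$\begin{aligned}
\sum_{y \in [x, m]} \mu(x, y) N_\ge(y) &= \sum_{x \le y \le m} \mu(x, y) \sum_{y \le z \le m} N_=(z) \\
&= \sum_{x \le y \le z \le m} \mu(x, y) N_=(z) \\
&= \sum_{x \le z \le m} N_=(z) \sum_{x \le y \le z} \mu(x, y) \\
&= \sum_{x \le z \le m} N_=(z)\,\delta(x, z) \\
&= N_=(x).
\end{aligned}$$

To prove (1.18), we define a partial order $\ge$ on $X$ such that $x \ge y$ if and only if $y \le x$. It is obvious that the Mobius function of the poset $(X, \ge)$ is $\eta(x, y) = \mu(y, x)$. Thus, (1.18) follows from (1.17) applied to $(X, \ge)$. □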
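Formula (1.18) can be exercised on a concrete poset. The sketch below uses the divisors of a fixed integer ordered by divisibility, with $l = 1$, so that $[l, x]$ is the set of divisors of $x$. The values of $N_=$ are arbitrary integers chosen for the demonstration, and the helper names are hypothetical; the Mobius function is computed from (1.16).

```python
def divisors(n):
    """Positive divisors of n in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def mobius_on_divisors(n):
    """Mobius function of (divisors of n, |), computed with recursion (1.16)."""
    divs = divisors(n)
    mu = {}
    for x in divs:
        for y in divs:  # increasing order, so mu[x, z] is known for z | y, z < y
            if y % x != 0:
                mu[x, y] = 0
            elif x == y:
                mu[x, y] = 1
            else:
                mu[x, y] = -sum(mu[x, z] for z in divs
                                if z % x == 0 and y % z == 0 and z != y)
    return divs, mu


def invert(n, n_eq):
    """Build N_<= from N_= and recover N_= through formula (1.18) with l = 1."""
    divs, mu = mobius_on_divisors(n)
    n_le = {x: sum(n_eq[y] for y in divs if x % y == 0) for x in divs}
    recovered = {x: sum(mu[y, x] * n_le[y] for y in divs if x % y == 0)
                 for x in divs}
    return n_le, recovered


if __name__ == "__main__":
    n = 60
    n_eq = {d: d * d + 1 for d in divisors(n)}  # arbitrary sample values
    n_le, recovered = invert(n, n_eq)
    print(recovered == n_eq)
```

Only the partial sums $N_\le$ enter the final step, yet the original values are recovered exactly.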

Let $(X_1, \le_1)$ and $(X_2, \le_2)$ be two posets. A bijection $f : X_1 \to X_2$ is called an isomorphism if for $x, y \in X_1$, $x \le_1 y$ if and only if $f(x) \le_2 f(y)$. The posets $(X_1, \le_1)$ and $(X_2, \le_2)$ are called isomorphic, denoted $(X_1, \le_1) \cong (X_2, \le_2)$, if there is an isomorphism from $(X_1, \le_1)$ to $(X_2, \le_2)$.

Clearly, isomorphic locally finite posets have the "same" Mobius function. More precisely, let $(X_i, \le_i)$ be a locally finite poset with Mobius function $\mu_i$, $i = 1, 2$, and $f : (X_1, \le_1) \to (X_2, \le_2)$ an isomorphism. Then for $x, y \in X_1$,

$$\mu_2\bigl(f(x), f(y)\bigr) = \mu_1(x, y). \tag{1.19}$$

On the other hand, the partial order of a locally finite poset is completely determined by its Mobius function.

Proposition 1.19. Let $(X, \le)$ be a locally finite poset with Mobius function $\mu$. Then for distinct $x, y \in X$, $x < y$ if and only if there is a finite sequence $x = x_1, x_2, \dots, x_n = y$ such that $\mu(x_i, x_{i+1}) = -1$ for all $1 \le i < n$.

Proof. ($\Leftarrow$) Since $\mu(x_i, x_{i+1}) \ne 0$, we have $x_i \le x_{i+1}$, $1 \le i < n$. By the transitivity of the partial order, $x = x_1 \le x_n = y$; since $x \ne y$, this gives $x < y$.

($\Rightarrow$) Let $x_1 = x$. Choose $x_2 \in (x_1, y]$ such that $(x_1, x_2) = \emptyset$. (Such an $x_2$ exists since $|(x_1, y]| < \infty$.) Then by (1.16), $\mu(x_1, x_2) = -1$. In the same way, choose $x_3, x_4, \dots$ such that $x_1 < x_2 < x_3 < \cdots \le y$ and $\mu(x_i, x_{i+1}) = -1$, $i = 1, 2, \dots$. Since $|[x_1, y]| < \infty$, the sequence $x_1, x_2, \dots$ must stop with $x_n = y$. □

Let Pi = (Xi,≤i), i = 1, 2, be posets. For (x1, x2), (y1, y2) ∈ X1 ×X2, define(x1, x2) ≤ (y1, y2) if and only if x1 ≤1 y1 and x2 ≤2 y2. Clearly, (X1 × X2,≤) isalso a poset; it is called the product of P1 and P2 and is denoted by P1 × P2.

Theorem 1.20. Let $P_i = (X_i, \le_i)$ be a locally finite poset with Mobius function $\mu_i$, $i = 1, 2$. Then $P_1 \times P_2$ is a locally finite poset with Mobius function

$$(\mu_1 \times \mu_2)\bigl((x_1, x_2), (y_1, y_2)\bigr) := \mu_1(x_1, y_1)\,\mu_2(x_2, y_2), \quad (x_1, x_2), (y_1, y_2) \in X_1 \times X_2. \tag{1.20}$$

Proof. For $(x_1, x_2), (y_1, y_2) \in X_1 \times X_2$, we have $[(x_1, x_2), (y_1, y_2)] = [x_1, y_1] \times [x_2, y_2]$, which is finite. Thus $P_1 \times P_2$ is locally finite.

To prove that $\mu_1 \times \mu_2$ is the Mobius function of $P_1 \times P_2$, first note that if $(x_1, x_2) \not\le (y_1, y_2)$, then $\mu_1(x_1, y_1)\mu_2(x_2, y_2) = 0$. Now assume $(x_1, x_2) \le (y_1, y_2)$. We have

$$\begin{aligned}
\sum_{(z_1, z_2) \in [(x_1, x_2), (y_1, y_2)]} \mu_1(x_1, z_1)\,\mu_2(x_2, z_2)
&= \sum_{z_1 \in [x_1, y_1]} \mu_1(x_1, z_1) \sum_{z_2 \in [x_2, y_2]} \mu_2(x_2, z_2) \\
&= \delta(x_1, y_1)\,\delta(x_2, y_2) \\
&= \delta\bigl((x_1, x_2), (y_1, y_2)\bigr).
\end{aligned}$$

Thus, $\mu_1 \times \mu_2$ is indeed the Mobius function of $P_1 \times P_2$. □

We end this section with some well known examples of locally finite posets andtheir Mobius functions.


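Example 1.21. Let $\le$ be the ordinary order in $\mathbb{Z}$. It follows immediately from (1.16) that the Mobius function of $(\mathbb{Z}, \le)$ is

$$\mu_{\mathbb{Z}}(x, y) = \begin{cases} 1 & \text{if } y = x, \\ -1 & \text{if } y = x + 1, \\ 0 & \text{otherwise} \end{cases}
\;=\; \begin{cases} (-1)^{y - x} & \text{if } y = x \text{ or } y = x + 1, \\ 0 & \text{otherwise.} \end{cases}$$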

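Example 1.22. Let $X$ be a finite set and $\mathcal{P}(X)$ the set of all subsets of $X$. Then $(\mathcal{P}(X), \subset)$ is a locally finite poset. To determine the Mobius function $\mu$ of $(\mathcal{P}(X), \subset)$, write $X = \{x_1, \dots, x_n\}$ and define

$$f : \mathcal{P}(X) \longrightarrow \{0, 1\}^n, \qquad A \longmapsto (a_1, \dots, a_n), \tag{1.21}$$

where

$$a_i = \begin{cases} 1 & \text{if } x_i \in A, \\ 0 & \text{if } x_i \notin A. \end{cases}$$

We make $\{0, 1\}$ into a poset $E$ by defining $0 \le 1$. The Mobius function of $E$ is

$$\eta(a, b) = \begin{cases} (-1)^{b - a} & \text{if } a \le b, \\ 0 & \text{otherwise.} \end{cases}$$

It is easy to see that the map $f$ in (1.21) is an isomorphism from $(\mathcal{P}(X), \subset)$ to $\underbrace{E \times \cdots \times E}_{n}$. Let $A, B \in \mathcal{P}(X)$ such that $A \subset B$. Write $f(A) = (a_1, \dots, a_n)$ and $f(B) = (b_1, \dots, b_n)$. We have

$$\begin{aligned}
\mu(A, B) &= (\eta \times \cdots \times \eta)\bigl((a_1, \dots, a_n), (b_1, \dots, b_n)\bigr) \quad (\text{by (1.19) and (1.20)}) \\
&= \prod_{i=1}^n \eta(a_i, b_i) \\
&= \prod_{i=1}^n (-1)^{b_i - a_i} \\
&= (-1)^{\sum_{i=1}^n b_i - \sum_{i=1}^n a_i} \\
&= (-1)^{|B| - |A|}.
\end{aligned}$$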

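Example 1.23. Let $\mathbb{Z}^+$ be the set of all positive integers. Then $(\mathbb{Z}^+, \mid)$ is a locally finite poset where $x \mid y$ ($x, y \in \mathbb{Z}^+$) means that $x$ divides $y$. Let $x, y \in \mathbb{Z}^+$ such that $x \mid y$. To determine the value $\mu(x, y)$ of the Mobius function $\mu$ of $(\mathbb{Z}^+, \mid)$, write $x = p_1^{a_1} \cdots p_n^{a_n}$, $y = p_1^{b_1} \cdots p_n^{b_n}$, where $p_1, \dots, p_n$ are distinct primes and $0 \le a_i \le b_i$, $1 \le i \le n$. With $p_1, \dots, p_n$ fixed, let

$$X = \{p_1^{c_1} \cdots p_n^{c_n} : c_i \ge 0,\ 1 \le i \le n\} \subset \mathbb{Z}^+.$$

Then

$$f : X \longrightarrow \mathbb{N}^n, \qquad p_1^{c_1} \cdots p_n^{c_n} \longmapsto (c_1, \dots, c_n)$$

is an isomorphism from the poset $(X, \mid)$ to $(\mathbb{N}, \le) \times \cdots \times (\mathbb{N}, \le)$, where $(\mathbb{N}, \le)$ is a sub-poset of $(\mathbb{Z}, \le)$. Therefore,

$$\begin{aligned}
\mu(p_1^{a_1} \cdots p_n^{a_n},\ p_1^{b_1} \cdots p_n^{b_n})
&= (\mu_{\mathbb{Z}} \times \cdots \times \mu_{\mathbb{Z}})\bigl((a_1, \dots, a_n), (b_1, \dots, b_n)\bigr) \\
&= \mu_{\mathbb{Z}}(a_1, b_1) \cdots \mu_{\mathbb{Z}}(a_n, b_n) \\
&= \begin{cases} (-1)^{\sum_{i=1}^n (b_i - a_i)} & \text{if } b_i - a_i \in \{0, 1\} \text{ for all } 1 \le i \le n, \\ 0 & \text{if } b_i - a_i \ge 2 \text{ for some } 1 \le i \le n. \end{cases}
\end{aligned}$$

Equivalently,

$$\mu(x, y) = \begin{cases} (-1)^s & \text{if } \frac{y}{x} \text{ is a product of } s \text{ distinct primes,} \\ 0 & \text{if } \frac{y}{x} \text{ is divisible by a square of a prime.} \end{cases} \tag{1.22}$$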

Example 1.24. Let $F$ be a field and $F[x]_m$ the set of all monic polynomials in $F[x]$. Then $(F[x]_m, \mid)$ is a locally finite poset where $f \mid g$ ($f, g \in F[x]_m$) means that $f$ divides $g$. For $f, g \in F[x]_m$ with $f \mid g$, the value $\mu(f, g)$ of the Mobius function $\mu$ of $(F[x]_m, \mid)$ is given by

$$\mu(f, g) = \begin{cases} (-1)^s & \text{if } \frac{g}{f} \text{ is a product of } s \text{ distinct irreducibles in } F[x]_m, \\ 0 & \text{if } \frac{g}{f} \text{ is divisible by a square of an irreducible in } F[x]_m. \end{cases}$$

The above formula follows from the same argument as in Example 1.23.

Example 1.25. Let $V$ be an $n$-dimensional vector space over $\mathbb{F}_q$ and let $L(V)$ be the set of all subspaces of $V$. Clearly, $(L(V), \subset)$ is a locally finite poset. Denote the Mobius function of $(L(V), \subset)$ by $\mu_{L(V)}$. First, note that for $U, W \in L(V)$ with $U \subset W$, $\mu_{L(V)}(U, W)$ is determined by $\dim_{\mathbb{F}_q} W/U$. In fact, the poset $([U, W], \subset)$ is isomorphic to $(L(W/U), \subset)$ by the correspondence between the subspaces of $W/U$ and the subspaces between $U$ and $W$, and $(L(W/U), \subset)$ is further isomorphic to $(L(\mathbb{F}_q^m), \subset)$, where $m = \dim_{\mathbb{F}_q} W/U$. Thus, $\mu_{L(V)}(U, W) = \mu_{L(\mathbb{F}_q^m)}(\{0\}, \mathbb{F}_q^m)$, which is determined by $m$. Put $\mu_m = \mu_{L(V)}(U, W)$, where $U, W \in L(V)$, $U \subset W$ and $\dim_{\mathbb{F}_q} W/U = m$.

The method used here to determine $\mu_m$ is taken from [2]. Let $\mathbb{F}_q^k$ be the $k$-dimensional vector space over $\mathbb{F}_q$. For each $U \in L(V)$, let

$$N_=(U) = |\{f \in \mathrm{Hom}_{\mathbb{F}_q}(V, \mathbb{F}_q^k) : \ker f = U\}|$$

and

$$N_\supset(U) = \sum_{\substack{W \in L(V) \\ W \supset U}} N_=(W) = |\{f \in \mathrm{Hom}_{\mathbb{F}_q}(V, \mathbb{F}_q^k) : \ker f \supset U\}|.$$

Let $\epsilon_1, \dots, \epsilon_s \in V$ such that their images in $V/U$ form a basis of $V/U$, where $s = \dim_{\mathbb{F}_q} V/U = n - \dim_{\mathbb{F}_q} U$. Then an $\mathbb{F}_q$-map $f \in \mathrm{Hom}_{\mathbb{F}_q}(V, \mathbb{F}_q^k)$ with $\ker f \supset U$ is uniquely determined by $f(\epsilon_1), \dots, f(\epsilon_s) \in \mathbb{F}_q^k$, which can be arbitrarily chosen. Thus,

$$N_\supset(U) = |\mathbb{F}_q^k|^{n - \dim_{\mathbb{F}_q} U} = (q^k)^{n - \dim_{\mathbb{F}_q} U}.$$

Let $L_d(V) = \{U \in L(V) : \dim_{\mathbb{F}_q} U = d\}$. By (1.17), we have

$$\begin{aligned}
N_=(\{0\}) &= \sum_{U \in L(V)} \mu_{L(V)}(\{0\}, U)\,N_\supset(U) \\
&= \sum_{d=0}^n \sum_{U \in L_d(V)} \mu_{L(V)}(\{0\}, U)\,N_\supset(U) \\
&= \sum_{d=0}^n |L_d(V)|\,\mu_d\,(q^k)^{n-d}.
\end{aligned} \tag{1.23}$$

Note that $N_=(\{0\})$ is the number of injections in $\mathrm{Hom}_{\mathbb{F}_q}(V, \mathbb{F}_q^k)$. Let $\delta_1, \dots, \delta_n$ be a basis of $V$. Then every injection $f \in \mathrm{Hom}_{\mathbb{F}_q}(V, \mathbb{F}_q^k)$ is uniquely determined by a linearly independent list $f(\delta_1), \dots, f(\delta_n) \in \mathbb{F}_q^k$. The number of choices for $f(\delta_1)$ is $q^k - 1$, the number of choices for $f(\delta_2)$ is $q^k - q$, ..., the number of choices for $f(\delta_n)$ is $q^k - q^{n-1}$. Thus,

$$N_=(\{0\}) = (q^k - 1)(q^k - q) \cdots (q^k - q^{n-1}).$$

Thus (1.23) can be written as

$$(x - 1)(x - q) \cdots (x - q^{n-1}) = \sum_{d=0}^n |L_d(V)|\,\mu_d\,x^{n-d}, \tag{1.24}$$

where $x = q^k$. Since $k$ can be any nonnegative integer, $x$ takes infinitely many values in (1.24). Thus, (1.24) holds for all $x \in \mathbb{R}$. Letting $x = 0$, we have

$$\mu_n = (-1)(-q) \cdots (-q^{n-1}) = (-1)^n q^{\binom{n}{2}}.$$
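Identity (1.24) can be checked numerically once $|L_d(V)|$ is known. The sketch below assumes the standard count of $d$-dimensional subspaces of an $n$-dimensional space over $\mathbb{F}_q$, namely the Gaussian binomial coefficient $\prod_{i=0}^{d-1} (q^{n-i} - 1)/(q^{i+1} - 1)$; this count is not derived in the text above and is used here as an outside fact. With $\mu_d = (-1)^d q^{\binom{d}{2}}$, both sides of (1.24) are evaluated at integer points. The function names are illustrative.

```python
from math import comb


def gaussian_binomial(n, d, q):
    """Number of d-dimensional subspaces of an n-dimensional space over F_q
    (standard formula, assumed here rather than taken from the notes)."""
    numerator = denominator = 1
    for i in range(d):
        numerator *= q ** (n - i) - 1
        denominator *= q ** (i + 1) - 1
    return numerator // denominator


def mu(d, q):
    """Value mu_d = (-1)^d q^C(d, 2) obtained at the end of Example 1.25."""
    return (-1) ** d * q ** comb(d, 2)


def left_side(x, n, q):
    """(x - 1)(x - q) ... (x - q^(n-1))."""
    product = 1
    for i in range(n):
        product *= x - q ** i
    return product


def right_side(x, n, q):
    """Sum over d of |L_d(V)| * mu_d * x^(n-d)."""
    return sum(gaussian_binomial(n, d, q) * mu(d, q) * x ** (n - d)
               for d in range(n + 1))


if __name__ == "__main__":
    for q in (2, 3, 4, 5):
        for n in range(6):
            assert all(left_side(x, n, q) == right_side(x, n, q)
                       for x in range(-3, 10))
    print("identity (1.24) holds at all sampled points")
```

Since both sides are polynomials of degree $n$ in $x$, agreement at more than $n$ points already forces equality for the sampled $q$ and $n$.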

1.3. Tensor

All rings are with identity and all modules are unitary. A subring has the sameidentity as the super ring. All ring homomorphisms map identity to identity.

Let $R$ be a commutative ring and let $A, B, C$ be $R$-modules. A function $f : A \times B \to C$ is called a bilinear map if

f(ra1 + sa2, b) = rf(a1, b) + sf(a2, b),

f(a, rb1 + sb2) = rf(a, b1) + sf(a, b2)

for all a, a1, a2 ∈ A, b, b1, b2 ∈ B and r, s ∈ R.

Theorem 1.26. Let A,B be modules over a commutative ring R.

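(i) There is an $R$-module $F$ and a bilinear map $f : A \times B \to F$ such that for any $R$-module $C$ and bilinear map $g : A \times B \to C$, there is a unique $R$-map $\phi : F \to C$ such that the following diagram commutes.

$$\begin{array}{ccc}
A \times B & \xrightarrow{\ f\ } & F \\
 & {\scriptstyle g}\searrow & \downarrow{\scriptstyle \phi} \\
 & & C
\end{array} \tag{1.25}$$

(ii) If $f' : A \times B \to F'$ is another bilinear map, where $F'$ is an $R$-module, having the same property as $f : A \times B \to F$, then there is a unique $R$-isomorphism $\alpha : F \to F'$ such that the following diagram commutes.

$$\begin{array}{ccc}
A \times B & \xrightarrow{\ f\ } & F \\
 & {\scriptstyle f'}\searrow & \downarrow{\scriptstyle \alpha} \\
 & & F'
\end{array}$$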

Proof. (i) Let $M$ be the free $R$-module generated by the elements of $A \times B$. Let $N$ be the submodule of $M$ generated by all elements of the forms

$$(ra_1 + sa_2, b) - r(a_1, b) - s(a_2, b),$$

$$(a, rb_1 + sb_2) - r(a, b_1) - s(a, b_2)$$

for all $a, a_1, a_2 \in A$, $b, b_1, b_2 \in B$ and $r, s \in R$. Let $F = M/N$ and define

$$f : A \times B \longrightarrow F, \qquad (a, b) \longmapsto (a, b) + N.$$

We first show that $f$ is bilinear. Let $a_1, a_2 \in A$, $b \in B$ and $r, s \in R$. We have

$$f(ra_1 + sa_2, b) - rf(a_1, b) - sf(a_2, b) = (ra_1 + sa_2, b) - r(a_1, b) - s(a_2, b) + N = 0 + N,$$

i.e., $f(ra_1 + sa_2, b) = rf(a_1, b) + sf(a_2, b)$. The $R$-linearity of $f$ in the second variable is proved the same way.

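Now, let $C$ be an $R$-module and $g : A \times B \to C$ a bilinear map. Define an $R$-map $\Phi : M \to C$ such that $\Phi(a, b) = g(a, b)$ for all $(a, b) \in A \times B$. (Since $M$ is the free $R$-module generated by the elements of $A \times B$, the map $\Phi$ exists.) Since $g$ is bilinear, it is easy to see that $N \subset \ker \Phi$. Thus $\Phi$ induces an $R$-map $\phi : F = M/N \to C$. For each $(a, b) \in A \times B$, we have

$$(\phi \circ f)(a, b) = \phi\bigl((a, b) + N\bigr) = \Phi(a, b) = g(a, b).$$

Thus $\phi \circ f = g$, i.e., diagram (1.25) commutes.

To prove the uniqueness of $\phi$, assume that $\phi' : F \to C$ is another $R$-map such that $\phi' \circ f = g = \phi \circ f$. Then for all $(a, b) \in A \times B$,

$$\phi\bigl((a, b) + N\bigr) = (\phi \circ f)(a, b) = (\phi' \circ f)(a, b) = \phi'\bigl((a, b) + N\bigr).$$

Note that $F$ is generated by $\{(a, b) + N : (a, b) \in A \times B\}$. Thus $\phi = \phi'$.

(ii) The proof of this part is the standard argument for the uniqueness of a universal object in category theory.

First, by the properties of $f : A \times B \to F$ and $f' : A \times B \to F'$, there exist unique $R$-maps $\alpha : F \to F'$ and $\beta : F' \to F$ such that the following diagram commutes.

$$\begin{array}{ccc}
 & & F \\
 & {\scriptstyle f}\nearrow & \downarrow{\scriptstyle \alpha} \\
A \times B & \xrightarrow{\ f'\ } & F' \\
 & {\scriptstyle f}\searrow & \downarrow{\scriptstyle \beta} \\
 & & F
\end{array}$$

We compare two commutative diagrams:

$$\begin{array}{ccc}
A \times B & \xrightarrow{\ f\ } & F \\
 & {\scriptstyle f}\searrow & \downarrow{\scriptstyle \beta \circ \alpha} \\
 & & F
\end{array}
\qquad\qquad
\begin{array}{ccc}
A \times B & \xrightarrow{\ f\ } & F \\
 & {\scriptstyle f}\searrow & \downarrow{\scriptstyle \mathrm{id}_F} \\
 & & F
\end{array}$$

The uniqueness of the vertical maps (part of the property of $f : A \times B \to F$) dictates that $\beta \circ \alpha = \mathrm{id}_F$. In the same way, $\alpha \circ \beta = \mathrm{id}_{F'}$. Thus, $\alpha : F \to F'$ is an $R$-isomorphism. The uniqueness of $\alpha$ is part of the property of $f : A \times B \to F$. □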

Definition 1.27. The $R$-module $F$ in Theorem 1.26 (i) is called the tensor product of $A$ and $B$ and is denoted by $A \otimes_R B$. The bilinear map $f : A \times B \to A \otimes_R B$ is called the canonical bilinear map. For each $(a, b) \in A \times B$, the element $f(a, b) \in F = A \otimes_R B$ is denoted by $a \otimes b$.

Remark. Note that $A \otimes_R B$ is generated by $\{a \otimes b : (a, b) \in A \times B\}$. Thus, elements of $A \otimes_R B$ are of the form $\sum_{i=1}^n r_i(a_i \otimes b_i)$ where $n \ge 0$, $r_i \in R$, $a_i \in A$, $b_i \in B$, $1 \le i \le n$. The module operations in $A \otimes_R B$ are governed by the rules

$$(ra_1 + sa_2) \otimes b = r(a_1 \otimes b) + s(a_2 \otimes b),$$

$$a \otimes (rb_1 + sb_2) = r(a \otimes b_1) + s(a \otimes b_2).$$

Using the notation in Definition 1.27, we can rephrase Theorem 1.26 (i) as follows.

Theorem 1.28. Let $A, B, C$ be modules over a commutative ring $R$. If $g : A \times B \to C$ is a bilinear map, then there is a unique $R$-map $\phi : A \otimes_R B \to C$ such that

$$\phi(a \otimes b) = g(a, b) \quad \text{for all } (a, b) \in A \times B.$$

Here are some basic properties of the tensor product.

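Theorem 1.29. Let $R$ be a commutative ring and let $A, B, C$ be $R$-modules.
(i) $A \otimes_R B \cong B \otimes_R A$.
(ii) $(A \otimes_R B) \otimes_R C \cong A \otimes_R (B \otimes_R C)$.
(iii) $(A \oplus B) \otimes_R C \cong (A \otimes_R C) \oplus (B \otimes_R C)$. More generally, if $A_i$, $i \in I$, are $R$-modules, then there is an isomorphism

$$\Bigl(\bigoplus_{i \in I} A_i\Bigr) \otimes_R C \cong \bigoplus_{i \in I} (A_i \otimes_R C)$$

which maps $\bigl(\sum_{i \in I} a_i\bigr) \otimes c$ to $\sum_{i \in I} (a_i \otimes c)$, where $a_i \in A_i$ are nonzero for only finitely many $i \in I$ and $c \in C$.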

Proof. (i) Define

g : A×B −→ B ⊗R A(a, b) 7−→ b⊗ a

Clearly, g is bilinear. By Theorem 1.28, there is an R-map φ : A⊗R B → B ⊗R Asuch that φ(a⊗ b) = g(a, b) = b⊗ a for all (a, b) ∈ A×B. In the same way, there isan R-map ψ : B ⊗R A→ A⊗R B such that ψ(b⊗ a) = a⊗ b for all (b, a) ∈ B ×A.Thus (ψ ◦ φ)(a ⊗ b) = a ⊗ b for all (a, b) ∈ A × B. Since A ⊗R B is generated

18 1. PRELIMINARIES

by {a ⊗ b : (a, b) ∈ A × B}, we must have ψ ◦ φ = idA⊗RB . In the same way,φ ◦ ψ = idB⊗RA. Hence φ : A⊗R B → B ⊗R A is an isomorphism.

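(ii) For each $c \in C$, define

\[ g_c : A \times B \longrightarrow A \otimes_R (B \otimes_R C), \qquad (a, b) \longmapsto a \otimes (b \otimes c). \]

Clearly, $g_c$ is bilinear. Thus, there is an R-map $\varphi_c : A \otimes_R B \to A \otimes_R (B \otimes_R C)$ such that

\[ \varphi_c(a \otimes b) = a \otimes (b \otimes c) \quad \text{for all } (a, b) \in A \times B. \]

Define

\[ h : (A \otimes_R B) \times C \longrightarrow A \otimes_R (B \otimes_R C), \qquad (x, c) \longmapsto \varphi_c(x). \]

Then h is bilinear. (Check this claim.) By Theorem 1.28 again, there is an R-map $\varphi : (A \otimes_R B) \otimes_R C \to A \otimes_R (B \otimes_R C)$ such that

\[ \varphi(x \otimes c) = \varphi_c(x) \quad \text{for all } x \in A \otimes_R B \text{ and } c \in C. \]

Letting $x = a \otimes b$, $(a, b) \in A \times B$, we have

\[ \varphi\bigl((a \otimes b) \otimes c\bigr) = \varphi_c(a \otimes b) = a \otimes (b \otimes c) \quad \text{for all } (a, b, c) \in A \times B \times C. \]

In the same way, there is an R-map $\psi : A \otimes_R (B \otimes_R C) \to (A \otimes_R B) \otimes_R C$ such that

\[ \psi\bigl(a \otimes (b \otimes c)\bigr) = (a \otimes b) \otimes c \quad \text{for all } (a, b, c) \in A \times B \times C. \]

Thus,

\[ (\psi \circ \varphi)\bigl((a \otimes b) \otimes c\bigr) = (a \otimes b) \otimes c \quad \text{for all } (a, b, c) \in A \times B \times C. \]

Since $(A \otimes_R B) \otimes_R C$ is generated by $\{(a \otimes b) \otimes c : (a, b, c) \in A \times B \times C\}$, we have $\psi \circ \varphi = \mathrm{id}_{(A \otimes_R B) \otimes_R C}$. In the same way, $\varphi \circ \psi = \mathrm{id}_{A \otimes_R (B \otimes_R C)}$. Thus, $\varphi : (A \otimes_R B) \otimes_R C \to A \otimes_R (B \otimes_R C)$ is an isomorphism.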

(iii) Exercise. �

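Example 1.30. Let A be a module over a commutative ring R. Then

\[ R \otimes_R A \cong A. \]

In fact,

\[ g : R \times A \longrightarrow A, \qquad (r, a) \longmapsto ra \]

is bilinear. Thus, there is an R-map $\varphi : R \otimes_R A \to A$ such that $\varphi(r \otimes a) = ra$ for all $(r, a) \in R \times A$. On the other hand,

\[ \psi : A \longrightarrow R \otimes_R A, \qquad a \longmapsto 1 \otimes a \]

is an R-map. Clearly, $\varphi \circ \psi = \mathrm{id}_A$ and $\psi \circ \varphi = \mathrm{id}_{R \otimes_R A}$. It follows that $\psi : A \to R \otimes_R A$ is an isomorphism. Consequently, every element in $R \otimes_R A$ can be uniquely written as $1 \otimes a$ for some $a \in A$.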

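Theorem 1.31. Let R be a commutative ring. If A and B are free R-modules with bases $\{u_i\}_{i \in I}$ and $\{v_j\}_{j \in J}$, then $A \otimes_R B$ is a free R-module with a basis $\{u_i \otimes v_j\}_{i \in I, j \in J}$.

Proof. We have $A = \sum_{i \in I} R u_i$ and $B = \sum_{j \in J} R v_j$. By Theorem 1.29 (iii), there is an isomorphism

\[ A \otimes_R B = \Bigl(\sum_{i \in I} R u_i\Bigr) \otimes_R \Bigl(\sum_{j \in J} R v_j\Bigr) \cong \sum_{i \in I,\, j \in J} (R u_i) \otimes_R (R v_j) \]

which maps $u_i \otimes v_j \in A \otimes_R B$ to $(1u_i) \otimes (1v_j) \in \sum_{i \in I,\, j \in J} (R u_i) \otimes_R (R v_j)$. Note that there are isomorphisms

\[ (R u_i) \otimes_R (R v_j) \cong R \otimes_R R \cong R \]

which map $(1u_i) \otimes (1v_j) \in (R u_i) \otimes_R (R v_j)$ to $1 \otimes 1 \in R \otimes_R R$ and then to $1 \cdot 1 = 1 \in R$ (by Example 1.30). Thus, $(1u_i) \otimes (1v_j)$ is a basis of $(R u_i) \otimes_R (R v_j)$ and it follows that $\{(1u_i) \otimes (1v_j)\}_{i \in I, j \in J}$ is a basis of $\sum_{i \in I,\, j \in J} (R u_i) \otimes_R (R v_j)$. Consequently, $\{u_i \otimes v_j\}_{i \in I, j \in J}$ is a basis of $A \otimes_R B$. □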

The tensor product can be used to extend the ring of scalars of a module.

Theorem 1.32. Let S be a subring of a commutative ring R and let A be an S-module. For each u ∈ R, there is a unique S-map gu : R ⊗S A → R ⊗S A such that gu(r ⊗ a) = (ur) ⊗ a for all (r, a) ∈ R × A. Moreover, R ⊗S A is an R-module with the scalar multiplication defined by

(1.26) ux := gu(x), u ∈ R, x ∈ R⊗S A.

Note.

(i) When $R \otimes_S A$ is viewed as an R-module, the scalar multiplication is given by

\[ u \sum_{i=1}^{n} (r_i \otimes a_i) = \sum_{i=1}^{n} (u r_i) \otimes a_i \]

for all $u, r_i \in R$ and $a_i \in A$.

(ii) As an S-module, A is not necessarily embedded in $R \otimes_S A$. In Example 1.35 (i), we will see that $\mathbb{Q} \otimes_{\mathbb{Z}} (\mathbb{Z}/n\mathbb{Z}) = 0$.

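Proof of Theorem 1.32. For each $u \in R$, define

\[ h_u : R \times A \longrightarrow R \otimes_S A, \qquad (r, a) \longmapsto (ur) \otimes a. \]

Clearly, $h_u$ is S-bilinear. Thus, there is a unique S-map $g_u : R \otimes_S A \to R \otimes_S A$ such that

\[ g_u(r \otimes a) = (ur) \otimes a \quad \text{for all } (r, a) \in R \times A. \]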

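It remains to show that $R \otimes_S A$ is an R-module under the scalar multiplication (1.26). First, observe that for $u \in R$ and $x, y \in R \otimes_S A$, we have

\[ u(x + y) = g_u(x + y) = g_u(x) + g_u(y) = ux + uy. \]

Let $x = \sum_{i=1}^{n} r_i \otimes a_i \in R \otimes_S A$, where $r_i \in R$, $a_i \in A$, and let $u, v \in R$. We have

\[ 1x = g_1\Bigl(\sum_{i=1}^{n} r_i \otimes a_i\Bigr) = \sum_{i=1}^{n} g_1(r_i \otimes a_i) = \sum_{i=1}^{n} (1 r_i) \otimes a_i = \sum_{i=1}^{n} r_i \otimes a_i = x, \]

\[ (u + v)x = (u + v) \sum_{i=1}^{n} r_i \otimes a_i = \sum_{i=1}^{n} \bigl((u + v) r_i\bigr) \otimes a_i = \sum_{i=1}^{n} (u r_i + v r_i) \otimes a_i = \sum_{i=1}^{n} (u r_i) \otimes a_i + \sum_{i=1}^{n} (v r_i) \otimes a_i = ux + vx, \]

and

\[ u(vx) = u \sum_{i=1}^{n} (v r_i) \otimes a_i = \sum_{i=1}^{n} \bigl(u(v r_i)\bigr) \otimes a_i = \sum_{i=1}^{n} \bigl((uv) r_i\bigr) \otimes a_i = (uv)x. \]

Hence, $R \otimes_S A$ is an R-module. □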

Definition 1.33. Let R be a commutative ring. A ring A is called an algebra over R (R-algebra) if A is also an R-module and

r(ab) = (ra)b = a(rb) for all r ∈ R, a, b ∈ A.

Theorem 1.34. Let A and B be algebras over a commutative ring R. Then A ⊗R B can be made into an R-algebra with a unique multiplication operation that satisfies

(1.27) (a1 ⊗ b1)(a2 ⊗ b2) = (a1a2)⊗ (b1b2) for all (a1, b1), (a2, b2) ∈ A×B.

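Proof. For each $(a_1, b_1) \in A \times B$, define

\[ g_{(a_1, b_1)} : A \times B \longrightarrow A \otimes_R B, \qquad (a, b) \longmapsto (a_1 a) \otimes (b_1 b). \]

Clearly, $g_{(a_1, b_1)}$ is bilinear; hence, there is an R-map $m_{(a_1, b_1)} : A \otimes_R B \to A \otimes_R B$ such that

\[ m_{(a_1, b_1)}(a \otimes b) = (a_1 a) \otimes (b_1 b) \quad \text{for all } (a, b) \in A \times B. \]

It is routine to check that

\[ f : A \times B \longrightarrow \mathrm{Hom}_R(A \otimes_R B, A \otimes_R B), \qquad (a_1, b_1) \longmapsto m_{(a_1, b_1)} \]

is bilinear. Hence, there is an R-map $m : A \otimes_R B \to \mathrm{Hom}_R(A \otimes_R B, A \otimes_R B)$ such that

\[ m(a_1 \otimes b_1) = m_{(a_1, b_1)} \quad \text{for all } (a_1, b_1) \in A \times B. \]

Now, for $x, y \in A \otimes_R B$, we define

\[ xy := \bigl[m(x)\bigr](y). \tag{1.28} \]

For $(a_1, b_1), (a_2, b_2) \in A \times B$, we have

\[ (a_1 \otimes b_1)(a_2 \otimes b_2) = \bigl[m(a_1 \otimes b_1)\bigr](a_2 \otimes b_2) = m_{(a_1, b_1)}(a_2 \otimes b_2) = (a_1 a_2) \otimes (b_1 b_2). \]

So (1.27) is satisfied. It is also routine to check that with the multiplication defined in (1.28), $A \otimes_R B$ is an R-algebra.

Since every element in $A \otimes_R B$ is of the form $\sum_{i=1}^{n} a_i \otimes b_i$, where $(a_i, b_i) \in A \times B$, $1 \le i \le n$, a multiplication in $A \otimes_R B$ that satisfies (1.27) and the distributive law is unique. □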

Note.

(i) In Theorem 1.29 and Example 1.30, if the modules are R-algebras, then the isomorphisms are also R-algebra isomorphisms.

(ii) If S is a subring of a commutative ring R and A is an S-algebra, by Theorems 1.32 and 1.34, R ⊗S A is an R-algebra in which

u(r ⊗ a) = (ur)⊗ a for all u ∈ R and (r, a) ∈ R×A,

(r1 ⊗ a1)(r2 ⊗ a2) = (r1r2)⊗ (a1a2) for all (r1, a1), (r2, a2) ∈ R×A.

Example 1.35. Let R be an integral domain and F the fraction field of R.

(i) Let A be a torsion R-module, i.e., for every $a \in A$, there is $0 \ne r \in R$ such that $ra = 0$. Then

\[ F \otimes_R A = 0. \]

In fact, for any $(u, a) \in F \times A$, there is $0 \ne r \in R$ such that $ra = 0$. Thus,

\[ u \otimes a = \frac{u}{r} \otimes (ra) = \frac{u}{r} \otimes 0 = 0. \]

(ii) There is an F-algebra isomorphism F ⊗R R → F which maps u ⊗ r to ur for all (u, r) ∈ F × R. More generally, if S is a subring of a commutative ring T, there is a T-algebra isomorphism T ⊗S S → T which maps t ⊗ s to ts for all (t, s) ∈ T × S.

(iii) Assume that R is a PID and A is a finitely generated R-module. Then

\[ A \cong A_{\mathrm{tor}} \oplus R^n \tag{1.29} \]

where $A_{\mathrm{tor}} = \{a \in A : ra = 0 \text{ for some } 0 \ne r \in R\}$ is the torsion submodule of A and $n \ge 0$ is an integer (the fundamental theorem of finitely generated modules over a PID). The integer n is called the rank of A and is denoted by $\operatorname{rank} A$. By (1.29), we have R-module isomorphisms

\[ F \otimes_R A \cong F \otimes_R (A_{\mathrm{tor}} \oplus R^n) \cong (F \otimes_R A_{\mathrm{tor}}) \oplus \Bigl(\bigoplus_{i=1}^{n} F \otimes_R R\Bigr) \cong F^n \quad \text{(by (i) and (ii))}. \]

It is easy to see that the above isomorphisms are also F-isomorphisms. Hence,

\[ \operatorname{rank} A = n = \dim_F (F \otimes_R A). \tag{1.30} \]

So, we can define $\operatorname{rank} A = \dim_F (F \otimes_R A)$ without referring to the isomorphism (1.29); such a definition is intrinsic.

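Example 1.36. Let m, n be positive integers. There is a $\mathbb{Z}$-algebra isomorphism

\[ (\mathbb{Z}/m\mathbb{Z}) \otimes_{\mathbb{Z}} (\mathbb{Z}/n\mathbb{Z}) \cong \mathbb{Z}/(m, n)\mathbb{Z}. \]

To see this isomorphism, first define

\[ \Phi : (\mathbb{Z}/m\mathbb{Z}) \times (\mathbb{Z}/n\mathbb{Z}) \longrightarrow \mathbb{Z}/(m, n)\mathbb{Z}, \qquad ([x]_m, [y]_n) \longmapsto [xy]_{(m,n)}, \]

where $x, y \in \mathbb{Z}$ and $[x]_m$ is the image of x in $\mathbb{Z}/m\mathbb{Z}$. Clearly, $\Phi$ is well defined and is $\mathbb{Z}$-bilinear. So there is a $\mathbb{Z}$-map $\varphi : (\mathbb{Z}/m\mathbb{Z}) \otimes_{\mathbb{Z}} (\mathbb{Z}/n\mathbb{Z}) \to \mathbb{Z}/(m, n)\mathbb{Z}$ such that

\[ \varphi([x]_m \otimes [y]_n) = [xy]_{(m,n)} \quad \text{for all } x, y \in \mathbb{Z}. \]

Clearly, $\varphi$ is also a $\mathbb{Z}$-algebra homomorphism.

Consider the $\mathbb{Z}$-map

\[ f : \mathbb{Z} \longrightarrow (\mathbb{Z}/m\mathbb{Z}) \otimes_{\mathbb{Z}} (\mathbb{Z}/n\mathbb{Z}), \qquad x \longmapsto [x]_m \otimes [1]_n. \]

Then $(m, n)\mathbb{Z} \subset \ker f$. In fact, write $(m, n) = am + bn$, $a, b \in \mathbb{Z}$. Then

\[ f\bigl((m, n)\bigr) = f(am + bn) = [am + bn]_m \otimes [1]_n = n[b]_m \otimes [1]_n = [b]_m \otimes n[1]_n = [b]_m \otimes [0]_n = 0. \]

Thus, f induces a $\mathbb{Z}$-map $\bar f : \mathbb{Z}/(m, n)\mathbb{Z} \to (\mathbb{Z}/m\mathbb{Z}) \otimes_{\mathbb{Z}} (\mathbb{Z}/n\mathbb{Z})$ such that $\bar f([x]_{(m,n)}) = [x]_m \otimes [1]_n$. It is easy to see that $\varphi \circ \bar f = \mathrm{id}_{\mathbb{Z}/(m,n)\mathbb{Z}}$ and $\bar f \circ \varphi = \mathrm{id}_{(\mathbb{Z}/m\mathbb{Z}) \otimes_{\mathbb{Z}} (\mathbb{Z}/n\mathbb{Z})}$. Therefore, $\varphi$ is an isomorphism.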
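The two computational ingredients of Example 1.36 can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names `Phi`, `check_example` and `bezout` are illustrative choices, and elements of Z/mZ are represented by integers. It checks that Φ is well defined and additive in each argument, and computes the Bezout coefficients a, b with am + bn = (m, n) that the example uses to show (m, n)Z ⊂ ker f.

```python
from math import gcd

def Phi(x, y, m, n):
    """Phi([x]_m, [y]_n) = [x*y]_{gcd(m, n)}, computed on integer representatives."""
    return (x * y) % gcd(m, n)

def check_example(m, n):
    """Return True if Phi is well defined and additive in each argument."""
    d = gcd(m, n)
    for x in range(m):
        for y in range(n):
            v = Phi(x, y, m, n)
            # The value must not depend on the representatives of [x]_m and [y]_n.
            if Phi(x + m, y, m, n) != v or Phi(x, y + n, m, n) != v:
                return False
            # Additivity in each argument (this is Z-bilinearity).
            for x2 in range(m):
                if Phi(x + x2, y, m, n) != (v + Phi(x2, y, m, n)) % d:
                    return False
            for y2 in range(n):
                if Phi(x, y + y2, m, n) != (v + Phi(x, y2, m, n)) % d:
                    return False
    return True

def bezout(m, n):
    """Extended Euclid: return (a, b) with a*m + b*n == gcd(m, n)."""
    a0, b0, a1, b1 = 1, 0, 0, 1
    while n:
        q = m // n
        m, n = n, m - q * n
        a0, a1 = a1, a0 - q * a1
        b0, b1 = b1, b0 - q * b1
    return a0, b0
```

For instance, `check_example(4, 6)` tests the case (m, n) = 2, and `bezout(4, 6)` returns a pair (a, b) with 4a + 6b = 2.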

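Lemma 1.37. Elements $\varepsilon_1, \dots, \varepsilon_n \in \mathbb{F}_{q^n}$ are linearly independent over $\mathbb{F}_q$ if and only if the matrix

\[ A = \begin{bmatrix} \varepsilon_1 & \cdots & \varepsilon_n \\ \varepsilon_1^{q} & \cdots & \varepsilon_n^{q} \\ \vdots & & \vdots \\ \varepsilon_1^{q^{n-1}} & \cdots & \varepsilon_n^{q^{n-1}} \end{bmatrix} \]

is nonsingular.

Proof. (⇐) Suppose $[\varepsilon_1, \dots, \varepsilon_n][b_1, \dots, b_n]^T = 0$ for some $[b_1, \dots, b_n] \in \mathbb{F}_q^n$. Raising this relation to the $q^k$-th power, $0 \le k \le n-1$, and using $b_j^{q} = b_j$, we have $A[b_1, \dots, b_n]^T = 0$, which forces $[b_1, \dots, b_n]^T = 0$.

(⇒) The (i, j) entry of $A^T A$ is

\[ \sum_{k=0}^{n-1} (\varepsilon_i \varepsilon_j)^{q^k} = \mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}(\varepsilon_i \varepsilon_j). \]

Since $(x, y) \mapsto \mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}(xy)$ is a nondegenerate $\mathbb{F}_q$-bilinear form on $\mathbb{F}_{q^n}$ (Exercise 1.1), $\det(A^T A) \ne 0$. So $\det A \ne 0$. □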

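Proposition 1.38. Let q > 1 be a prime power and let m, n be positive integers. Then there is an $\mathbb{F}_q$-algebra isomorphism

\[ \mathbb{F}_{q^m} \otimes_{\mathbb{F}_q} \mathbb{F}_{q^n} \cong \underbrace{\mathbb{F}_{q^{[m,n]}} \times \cdots \times \mathbb{F}_{q^{[m,n]}}}_{(m,n)}. \tag{1.31} \]

Proof. Define

\[ f : \mathbb{F}_{q^m} \times \mathbb{F}_{q^n} \longrightarrow \mathbb{F}_{q^{[m,n]}} \times \cdots \times \mathbb{F}_{q^{[m,n]}}, \qquad (x, y) \longmapsto \bigl(xy,\ x^{q} y,\ \dots,\ x^{q^{(m,n)-1}} y\bigr). \]

Clearly, f is $\mathbb{F}_q$-bilinear. So there is an $\mathbb{F}_q$-map

\[ \varphi : \mathbb{F}_{q^m} \otimes_{\mathbb{F}_q} \mathbb{F}_{q^n} \longrightarrow \mathbb{F}_{q^{[m,n]}} \times \cdots \times \mathbb{F}_{q^{[m,n]}} \]

such that

\[ \varphi(x \otimes y) = \bigl(xy,\ x^{q} y,\ \dots,\ x^{q^{(m,n)-1}} y\bigr). \]

Obviously, $\varphi$ is also an $\mathbb{F}_q$-algebra homomorphism. Thus, it suffices to show that $\varphi$ is a bijection. Since the two sides of (1.31) have the same $\mathbb{F}_q$-dimension, it suffices to show that $\varphi$ is onto. We will show that for each $1 \le i \le (m, n)$,

\[ \{0\} \times \cdots \times \{0\} \times \underbrace{\mathbb{F}_{q^{[m,n]}}}_{i\text{-th factor}} \times \{0\} \times \cdots \times \{0\} \subset \mathrm{im}(\varphi). \]

Without loss of generality, let i = 1. Let $\varepsilon_1, \dots, \varepsilon_{(m,n)}$ be a basis of $\mathbb{F}_{q^{(m,n)}}$ over $\mathbb{F}_q$. By Lemma 1.37, the vectors $\bigl(\varepsilon_i, \varepsilon_i^{q}, \dots, \varepsilon_i^{q^{(m,n)-1}}\bigr) \in \mathbb{F}_{q^{(m,n)}}^{(m,n)}$, $1 \le i \le (m, n)$, are linearly independent over $\mathbb{F}_{q^{(m,n)}}$. Therefore, there exist $y_1, \dots, y_{(m,n)} \in \mathbb{F}_{q^{(m,n)}}$ such that

\[ \sum_{i=1}^{(m,n)} y_i \bigl(\varepsilon_i, \varepsilon_i^{q}, \dots, \varepsilon_i^{q^{(m,n)-1}}\bigr) = (1, 0, \dots, 0). \]

The left side of the above is $\varphi\bigl(\sum_{i=1}^{(m,n)} \varepsilon_i \otimes y_i\bigr)$. So $(1, 0, \dots, 0) \in \mathrm{im}(\varphi)$. Now, for any $x \in \mathbb{F}_{q^m}$ and $y \in \mathbb{F}_{q^n}$, we have

\[ (x, 0, \dots, 0) = \varphi(x \otimes 1)(1, 0, \dots, 0) \in \mathrm{im}(\varphi), \]

\[ (y, 0, \dots, 0) = \varphi(1 \otimes y)(1, 0, \dots, 0) \in \mathrm{im}(\varphi). \]

It follows that $(\mathbb{F}_{q^m} \cup \mathbb{F}_{q^n}) \times \{0\} \times \cdots \times \{0\} \subset \mathrm{im}(\varphi)$. Since $\mathbb{F}_{q^{[m,n]}}$ is generated by $\mathbb{F}_{q^m} \cup \mathbb{F}_{q^n}$, we have $\mathbb{F}_{q^{[m,n]}} \times \{0\} \times \cdots \times \{0\} \subset \mathrm{im}(\varphi)$ and the proof is complete. □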

Theorem 1.39. Let R be a commutative ring and let $f : A \to A'$, $g : B \to B'$ be R-maps where $A, A', B, B'$ are R-modules. Then there is a unique R-map from $A \otimes_R B$ to $A' \otimes_R B'$, denoted by $f \otimes g$, such that

\[ (f \otimes g)(a \otimes b) = f(a) \otimes g(b) \quad \text{for all } (a, b) \in A \times B. \]

If $A, A', B, B'$ are R-algebras and f, g are R-algebra homomorphisms, then $f \otimes g$ is also an R-algebra homomorphism.

Proof. Define

\[ \alpha : A \times B \longrightarrow A' \otimes_R B', \qquad (a, b) \longmapsto f(a) \otimes g(b). \]

Then $\alpha$ is bilinear. So there is a unique R-map $f \otimes g : A \otimes_R B \to A' \otimes_R B'$ such that

\[ (f \otimes g)(a \otimes b) = f(a) \otimes g(b) \quad \text{for all } (a, b) \in A \times B. \]

The second half of the theorem is obvious. □

Exercises

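1.1. (i) Clearly, $f : \mathbb{F}_{q^n} \times \mathbb{F}_{q^n} \to \mathbb{F}_q$, $(x, y) \mapsto \mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}(xy)$, is a symmetric $\mathbb{F}_q$-bilinear map. Prove that f is nondegenerate, i.e., $f(x, y) = 0$ for all $y \in \mathbb{F}_{q^n}$ implies $x = 0$.

(ii) Define

\[ \alpha : \mathbb{F}_{q^n} \longrightarrow \mathrm{Hom}_{\mathbb{F}_q}(\mathbb{F}_{q^n}, \mathbb{F}_q), \qquad x \longmapsto \mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}(x\,\cdot\,), \]

where $\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}(x\,\cdot\,)$ maps $y \in \mathbb{F}_{q^n}$ to $\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}(xy)$. Prove that $\alpha$ is an $\mathbb{F}_q$-module isomorphism.

(iii) Assume $g \in \mathrm{Hom}_{\mathbb{F}_q}(\mathbb{F}_{q^n}, \mathbb{F}_q)$ such that $g \circ \tau = g$, where $\tau$ is the Frobenius map of $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$. Prove that $g = a\,\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}$ for some $a \in \mathbb{F}_q$.

1.2. Prove that every element in $\mathbb{F}_q$ is a sum of two squares.

1.3. Let $\phi$ be the Euler function and $\mu$ the Möbius function of $(\mathbb{Z}^+, \mid)$. Prove that

\[ \phi(n) = \sum_{d \mid n} d\,\mu(d, n) \quad \text{for all } n \in \mathbb{Z}^+. \]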

1.4. Prove Theorem 1.29 (iii).

CHAPTER 2

Polynomials over Finite Fields

2.1. Number of Irreducible Polynomials

Let q > 1 be a prime power and n > 0 an integer. Denote by $I_q(n)$ the set of all monic irreducible polynomials of degree n in $\mathbb{F}_q[x]$. We will derive an explicit formula for $|I_q(n)|$.

Lemma 2.1. We have

\[ x^{q^n} - x = \prod_{d \mid n} \prod_{f \in I_q(d)} f. \tag{2.1} \]

Proof. Let $F = x^{q^n} - x$. Since $(F, F') = 1$, the factorization of F does not have repeated irreducible factors. Thus, to prove (2.1), it suffices to show that $\bigcup_{d \mid n} I_q(d)$ is precisely the set of monic irreducible factors of F.

First, let $f \in I_q(d)$ for some $d \mid n$. Let a be any root of f (in some extension of $\mathbb{F}_q$). Then $\mathbb{F}_q(a) = \mathbb{F}_{q^d}$ and f is the minimal polynomial of a over $\mathbb{F}_q$. Since $a \in \mathbb{F}_{q^d} \subset \mathbb{F}_{q^n}$, a is a root of F. Therefore, $f \mid F$.

Now assume that $f \in I_q(d)$ is a monic irreducible factor of F. Since $\mathbb{F}_{q^n}$ is the splitting field of F over $\mathbb{F}_q$, f splits in $\mathbb{F}_{q^n}$. Let $a \in \mathbb{F}_{q^n}$ be any root of f. Then we have $d = [\mathbb{F}_q(a) : \mathbb{F}_q] \mid [\mathbb{F}_{q^n} : \mathbb{F}_q] = n$. □

By comparing the degrees of both sides of (2.1), we have the following corollary.

Corollary 2.2. We have

\[ q^n = \sum_{d \mid n} d\,|I_q(d)|. \]

In Example 1.23, we determined the Möbius function $\mu$ of the poset $(\mathbb{Z}^+, \mid)$. By (1.22), $\mu(x, y)$, where $x \mid y$, depends only on $\frac{y}{x}$. We denote $\mu(x, y)$ by $\mu(\frac{y}{x})$.

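Theorem 2.3. We have

\[ |I_q(n)| = \frac{1}{n} \sum_{d \mid n} \mu\Bigl(\frac{n}{d}\Bigr) q^d. \]

Proof. For each $n \in \mathbb{Z}^+$, let

\[ N_{=}(n) = n\,|I_q(n)| \]

and

\[ N_{\le}(n) = \sum_{d \mid n} N_{=}(d) = \sum_{d \mid n} d\,|I_q(d)|. \]

By Corollary 2.2,

\[ N_{\le}(n) = q^n. \]

Hence, by the Möbius inversion (Theorem 1.18, Equation (1.15)),

\[ n\,|I_q(n)| = N_{=}(n) = \sum_{d \mid n} \mu\Bigl(\frac{n}{d}\Bigr) N_{\le}(d) = \sum_{d \mid n} \mu\Bigl(\frac{n}{d}\Bigr) q^d, \]

i.e.,

\[ |I_q(n)| = \frac{1}{n} \sum_{d \mid n} \mu\Bigl(\frac{n}{d}\Bigr) q^d. \]

□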
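The counting formula of Theorem 2.3 is easy to evaluate numerically. The following is a minimal sketch, not taken from the notes: `mobius` and `num_irreducible` are illustrative names, and `mobius` is the classical one-variable Möbius function μ(n/d) computed by trial division.

```python
def mobius(n):
    """Classical Mobius function mu(n), computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divides the original n
                return 0
            result = -result
        p += 1
    if n > 1:                   # one remaining prime factor
        result = -result
    return result

def num_irreducible(q, n):
    """|I_q(n)| = (1/n) * sum over d | n of mu(n/d) * q^d."""
    total = sum(mobius(n // d) * q**d for d in range(1, n + 1) if n % d == 0)
    return total // n
```

For q = 2 and n = 1, ..., 6 this gives 2, 1, 2, 3, 6, 9. Summing d·|I_2(d)| over the divisors d of 6 gives 2 + 2 + 6 + 54 = 64 = 2^6, in agreement with Corollary 2.2.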

In the next two propositions, we collect some useful facts about irreducible polynomials in $\mathbb{F}_q[x]$.

Proposition 2.4.

(i) Every irreducible polynomial $f \in \mathbb{F}_q[x]$ is separable, i.e., f has no multiple roots (in its splitting field).

(ii) If $f \in \mathbb{F}_q[x]$ is irreducible with $\deg f = n$, then f splits in $\mathbb{F}_{q^n}$.

(iii) For each $a \in \mathbb{F}_{q^n}$, the minimal polynomial of a over $\mathbb{F}_q$ is

\[ f_a = \prod_{b \in [a]} (x - b), \tag{2.2} \]

where

\[ [a] = \{\gamma(a) : \gamma \in \mathrm{Aut}(\mathbb{F}_{q^n}/\mathbb{F}_q)\} \]

is the $\mathrm{Aut}(\mathbb{F}_{q^n}/\mathbb{F}_q)$-orbit of a. Equivalently,

\[ f_a = (x - a^{q^0})(x - a^{q^1}) \cdots (x - a^{q^{m-1}}), \tag{2.3} \]

where m is the smallest positive integer such that $a^{q^m} = a$.

Proof. (i) Since f has a root $x + (f) \in \mathbb{F}_q[x]/(f)$ and $\mathbb{F}_q[x]/(f) \cong \mathbb{F}_{q^n}$, where $n = \deg f$, f has a root, say a, in $\mathbb{F}_{q^n}$. Thus, f is the minimal polynomial of a over $\mathbb{F}_q$. Since $\mathbb{F}_{q^n}/\mathbb{F}_q$ is Galois, f is separable.

(ii) By the proof of (i), f has a root in $\mathbb{F}_{q^n}$. Since $\mathbb{F}_{q^n}/\mathbb{F}_q$ is Galois and f is irreducible over $\mathbb{F}_q$, f splits in $\mathbb{F}_{q^n}$.

(iii) Since $\mathbb{F}_{q^n}/\mathbb{F}_q$ is Galois, $f_a$ splits in $\mathbb{F}_{q^n}$ with no multiple roots and $\mathrm{Aut}(\mathbb{F}_{q^n}/\mathbb{F}_q)$ acts transitively on the set of roots of $f_a$. Thus $[a]$ consists of all the roots of $f_a$. Therefore, we have (2.2). For (2.3), note that $[a] = \{a^{q^0}, a^{q^1}, \dots, a^{q^{m-1}}\}$. □

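Proposition 2.5. Let $f \in \mathbb{F}_q[x]$ be monic irreducible with $\deg f = n$ and let m > 0 be an integer. Then in $\mathbb{F}_{q^m}[x]$,

\[ f = f_0 \cdots f_{(m,n)-1}, \]

where $f_0, \dots, f_{(m,n)-1} \in \mathbb{F}_{q^m}[x]$ are distinct monic irreducibles of degree $\frac{n}{(m,n)}$. Moreover, $f_0, \dots, f_{(m,n)-1}$ are conjugates by $\mathrm{Aut}(\mathbb{F}_{q^m}/\mathbb{F}_q)$, i.e., for each $0 \le i \le (m, n) - 1$, there is a $\gamma_i \in \mathrm{Aut}(\mathbb{F}_{q^m}/\mathbb{F}_q)$ such that $f_i = \gamma_i(f_0)$.

Proof. Let $\tau$ be the Frobenius map of $\mathbb{F}_{q^n}/\mathbb{F}_q$. By Proposition 2.4 (ii), f has a root $a \in \mathbb{F}_{q^n}$. Clearly, f is the minimal polynomial of a over $\mathbb{F}_q$. Thus, by Proposition 2.4 (iii),

\[ f = \prod_{i=0}^{n-1} \bigl(x - \tau^i(a)\bigr), \]

where $\tau^0(a), \dots, \tau^{n-1}(a)$ are all distinct. Let

\[ f_i = \prod_{j=0}^{\frac{n}{(m,n)}-1} \bigl(x - \tau^{i + j(m,n)}(a)\bigr), \qquad 0 \le i \le (m, n) - 1. \]

Then $\deg f_i = \frac{n}{(m,n)}$, $0 \le i \le (m, n) - 1$, and $f = f_0 \cdots f_{(m,n)-1}$. Clearly, $\tau^{(m,n)}(f_i) = f_i$, $0 \le i \le (m, n) - 1$. Since $\langle \tau^{(m,n)} \rangle = \mathrm{Aut}(\mathbb{F}_{q^n}/\mathbb{F}_{q^{(m,n)}})$, we have $f_i \in \mathbb{F}_{q^{(m,n)}}[x]$. We claim that $f_i$ is irreducible in $\mathbb{F}_{q^m}[x]$. Let $a_i = \tau^i(a) \in \mathbb{F}_{q^n}$. Then $\mathbb{F}_q(a_i) = \mathbb{F}_{q^n}$ since $a_i$ is a root of f. Since $\mathbb{F}_{q^m} \subset \mathbb{F}_{q^m}(a_i)$ and $\mathbb{F}_{q^n} = \mathbb{F}_q(a_i) \subset \mathbb{F}_{q^m}(a_i)$, we have $\mathbb{F}_{q^{[m,n]}} \subset \mathbb{F}_{q^m}(a_i)$. Obviously, $\mathbb{F}_{q^m}(a_i) \subset \mathbb{F}_{q^{[m,n]}}$. So $\mathbb{F}_{q^{[m,n]}} = \mathbb{F}_{q^m}(a_i)$. Then $[\mathbb{F}_{q^m}(a_i) : \mathbb{F}_{q^m}] = [\mathbb{F}_{q^{[m,n]}} : \mathbb{F}_{q^m}] = \frac{[m,n]}{m} = \frac{n}{(m,n)}$. Note that $f_i \in \mathbb{F}_{q^m}[x]$, $\deg f_i = \frac{n}{(m,n)}$ and $f_i(a_i) = 0$. Thus, $f_i$ is the minimal polynomial of $a_i$ over $\mathbb{F}_{q^m}$. In particular, $f_i$ is irreducible in $\mathbb{F}_{q^m}[x]$.

Let $\sigma$ be the Frobenius map of $\mathbb{F}_{q^{[m,n]}}/\mathbb{F}_q$. Then $\sigma|_{\mathbb{F}_{q^m}}$ is the Frobenius map of $\mathbb{F}_{q^m}/\mathbb{F}_q$. Since $\langle \sigma \rangle = \mathrm{Aut}(\mathbb{F}_{q^{[m,n]}}/\mathbb{F}_q)$ acts transitively on the roots of f, it also acts transitively on the minimal polynomials of the roots of f over $\mathbb{F}_{q^m}$, i.e., on $\{f_0, \dots, f_{(m,n)-1}\}$. Consequently, $\langle \sigma|_{\mathbb{F}_{q^m}} \rangle = \mathrm{Aut}(\mathbb{F}_{q^m}/\mathbb{F}_q)$ acts transitively on $\{f_0, \dots, f_{(m,n)-1}\}$. □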

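[Diagram: lattice of the fields $\mathbb{F}_q \subset \mathbb{F}_{q^{(m,n)}}$; $\mathbb{F}_{q^{(m,n)}} \subset \mathbb{F}_{q^m}$ and $\mathbb{F}_{q^{(m,n)}} \subset \mathbb{F}_{q^n} = \mathbb{F}_q(a_i)$; $\mathbb{F}_{q^m}, \mathbb{F}_{q^n} \subset \mathbb{F}_{q^{[m,n]}} = \mathbb{F}_{q^m}(a_i)$. The extension degrees are labeled $\frac{m}{(m,n)}$, $\frac{n}{(m,n)}$ and $n$.]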

Definition 2.6. An irreducible polynomial f ∈ Fq[x] of degree n is called a primitive polynomial over Fq if f is the minimal polynomial of a primitive element of Fqn.

The number of monic primitive polynomials of a given degree is easily counted.

Theorem 2.7. The number of monic primitive polynomials of degree n over $\mathbb{F}_q$ is $\frac{\phi(q^n - 1)}{n}$, where $\phi$ is the Euler function.

Proof. Let P be the set of all primitive elements of $\mathbb{F}_{q^n}$. Note that $\mathrm{Aut}(\mathbb{F}_{q^n}/\mathbb{F}_q)$ acts on P. Since $\mathbb{F}_{q^n}^*$ is cyclic of order $q^n - 1$, $|P| = \phi(q^n - 1)$. Since each $a \in P$ is of degree n over $\mathbb{F}_q$, by Proposition 2.4 (iii), the $\mathrm{Aut}(\mathbb{F}_{q^n}/\mathbb{F}_q)$-orbit $[a]$ of a has n elements. Therefore, P is partitioned into $\frac{\phi(q^n - 1)}{n}$ orbits by the $\mathrm{Aut}(\mathbb{F}_{q^n}/\mathbb{F}_q)$ action. By Proposition 2.4 (iii) again, each $\mathrm{Aut}(\mathbb{F}_{q^n}/\mathbb{F}_q)$-orbit in P corresponds to a monic primitive polynomial of degree n over $\mathbb{F}_q$. Therefore, there are precisely $\frac{\phi(q^n - 1)}{n}$ monic primitive polynomials of degree n over $\mathbb{F}_q$. □
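As a quick numerical illustration of Theorem 2.7, the count φ(q^n − 1)/n can be computed directly. This is a small sketch with illustrative names (`euler_phi`, `num_primitive`); the Euler function is computed naively by counting residues coprime to the modulus, which is adequate only for small arguments.

```python
from math import gcd

def euler_phi(n):
    """Euler function: the number of 1 <= k <= n with gcd(k, n) = 1."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def num_primitive(q, n):
    """Number of monic primitive polynomials of degree n over F_q."""
    return euler_phi(q**n - 1) // n
```

For example, over F_2 there are φ(15)/4 = 2 primitive polynomials of degree 4 and φ(63)/6 = 6 of degree 6.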


2.2. Berlekamp’s Factorization Algorithm

Let f ∈ Fq[x] be a polynomial with deg f > 0. How can f be factored into irreducibles? In general, this is a difficult problem. In this section, we describe an algorithm by Berlekamp [3] for factoring polynomials in Fq[x]. The algorithm works efficiently when q is small.

Berlekamp’s algorithm is an iterative method. Given f ∈ Fq[x] with deg f > 0, if f is not irreducible, we try to find a nontrivial factorization of f and go on to factor the factors in the factorization.

Lemma 2.8. Let $f \in \mathbb{F}_q[x]$ be monic and let $h \in \mathbb{F}_q[x]$. Then $h^q \equiv h \pmod f$ if and only if

\[ f = \prod_{a \in \mathbb{F}_q} (f, h - a). \tag{2.4} \]

Proof. (⇒) Since $\mathbb{F}_q$ consists of all roots of $x^q - x$, we have

\[ x^q - x = \prod_{a \in \mathbb{F}_q} (x - a). \tag{2.5} \]

Substituting h for x in (2.5), we see that

\[ h^q - h = \prod_{a \in \mathbb{F}_q} (h - a). \]

Now, since $f \mid (h^q - h)$ and $h - a$, $a \in \mathbb{F}_q$, are pairwise coprime, we have

\[ f = (f, h^q - h) = \prod_{a \in \mathbb{F}_q} (f, h - a). \]

(⇐) We have

\[ f = \prod_{a \in \mathbb{F}_q} (f, h - a) \;\Bigm|\; \prod_{a \in \mathbb{F}_q} (h - a) = h^q - h. \]

□

Remark. In Lemma 2.8, if $0 < \deg h < \deg f$, then $\deg (f, h - a) < \deg f$ for all $a \in \mathbb{F}_q$. Thus, (2.4) is a nontrivial factorization of f.

Definition 2.9. Let $f \in \mathbb{F}_q[x]$ be a polynomial with $\deg f > 0$. A polynomial $h \in \mathbb{F}_q[x]$ is called an f-reducing polynomial if $0 < \deg h < \deg f$ and $h^q \equiv h \pmod f$.

Let $n = \deg f$ and let A be the $n \times n$ matrix over $\mathbb{F}_q$ defined by

\[ \begin{bmatrix} x^{0 \cdot q} \\ x^{1 \cdot q} \\ \vdots \\ x^{(n-1)q} \end{bmatrix} \equiv A \begin{bmatrix} x^0 \\ x^1 \\ \vdots \\ x^{n-1} \end{bmatrix} \pmod f. \tag{2.6} \]

Then for $h = a_0 + \cdots + a_{n-1} x^{n-1} \in \mathbb{F}_q[x]$,

\[ h^q - h = [a_0, \dots, a_{n-1}] \begin{bmatrix} x^{0 \cdot q} \\ \vdots \\ x^{(n-1)q} \end{bmatrix} - [a_0, \dots, a_{n-1}] \begin{bmatrix} x^0 \\ \vdots \\ x^{n-1} \end{bmatrix} \equiv [a_0, \dots, a_{n-1}] (A - I_n) \begin{bmatrix} x^0 \\ \vdots \\ x^{n-1} \end{bmatrix} \pmod f, \]

where $I_n$ is the $n \times n$ identity matrix. Hence $h^q \equiv h \pmod f$ if and only if

\[ [a_0, \dots, a_{n-1}] (A - I_n) = 0. \tag{2.7} \]

By (2.6), the first row of A is $[1, 0, \dots, 0]$. Thus $[a_0, \dots, a_{n-1}] = [1, 0, \dots, 0]$ is always a solution of (2.7). The solutions $[a_0, \dots, a_{n-1}]$ of (2.7) with $[a_1, \dots, a_{n-1}] \ne [0, \dots, 0]$ are precisely the coefficient vectors of f-reducing polynomials. The existence of f-reducing polynomials, when f has at least two distinct irreducible factors, is guaranteed by the following theorem.

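Theorem 2.10. In the above notation,

nullity$(A - I_n)$ = the number of distinct irreducible factors of f.

Proof. From the above, we see that

\[ \mathrm{nullity}(A - I_n) = \dim_{\mathbb{F}_q} \{h \in \mathbb{F}_q[x]/(f) : h^q = h\}. \]

Let $f = f_1^{e_1} \cdots f_k^{e_k}$, where $f_1, \dots, f_k \in \mathbb{F}_q[x]$ are distinct irreducibles. By the Chinese remainder theorem, there is an $\mathbb{F}_q$-algebra isomorphism

\[ \mathbb{F}_q[x]/(f) \cong \bigl(\mathbb{F}_q[x]/(f_1^{e_1})\bigr) \times \cdots \times \bigl(\mathbb{F}_q[x]/(f_k^{e_k})\bigr). \tag{2.8} \]

Let $h \in \mathbb{F}_q[x]/(f_i^{e_i})$. By Lemma 2.8, $h^q = h$ if and only if $f_i^{e_i} = \prod_{a \in \mathbb{F}_q} (f_i^{e_i}, h - a)$. (Here is a harmless abuse of notation: $h \in \mathbb{F}_q[x]/(f_i^{e_i})$ is also treated as an element in $\mathbb{F}_q[x]$.) Since $h - a$, $a \in \mathbb{F}_q$, are pairwise coprime, we see that $f_i^{e_i} = \prod_{a \in \mathbb{F}_q} (f_i^{e_i}, h - a)$ if and only if $f_i^{e_i} \mid h - a$ for some $a \in \mathbb{F}_q$, which happens if and only if $h = a$ for some $a \in \mathbb{F}_q$. Therefore,

\[ \dim_{\mathbb{F}_q} \{h \in \mathbb{F}_q[x]/(f_i^{e_i}) : h^q = h\} = 1. \tag{2.9} \]

Combining (2.8) and (2.9), we have

\[ \dim_{\mathbb{F}_q} \{h \in \mathbb{F}_q[x]/(f) : h^q = h\} = \sum_{i=1}^{k} \dim_{\mathbb{F}_q} \{h \in \mathbb{F}_q[x]/(f_i^{e_i}) : h^q = h\} = k. \]

□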

To sum up, given any polynomial $f \in \mathbb{F}_q[x]$ with $\deg f = n > 0$, Berlekamp’s algorithm produces a nontrivial factorization of f or finds that f has only one distinct irreducible factor through the following steps.

Step 1. Find the matrix A defined by (2.6).

Step 2. If nullity$(A - I_n) = 1$, then by Theorem 2.10, f is a power of an irreducible polynomial; in particular, f is irreducible if it has no repeated factors. If nullity$(A - I_n) > 1$, find $[a_1, \dots, a_{n-1}] \ne [0, \dots, 0]$ such that $[0, a_1, \dots, a_{n-1}](A - I_n) = 0$. Let $h = a_1 x + \cdots + a_{n-1} x^{n-1}$.

Step 3. Compute $(f, h - a)$ with a ranging over $\mathbb{F}_q$. Then $f = \prod_{a \in \mathbb{F}_q} (f, h - a)$ is a nontrivial factorization of f.
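Steps 1 to 3 can be sketched in code for a prime field. The following is an illustrative implementation under stated assumptions, not the author's: q = p is prime, f is monic and squarefree, polynomials are lists of coefficients with the lowest degree first, and all function names (`trim`, `poly_divmod`, `poly_gcd`, `poly_mulmod`, `berlekamp_matrix`, `left_nullspace`, `berlekamp_split`) are hypothetical. It performs one round of the algorithm and does not recurse into the factors.

```python
def trim(a):
    """Drop trailing zero coefficients (coefficients are stored low degree first)."""
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_divmod(a, b, p):
    """Quotient and remainder of a by b over F_p (b nonzero, inputs trimmed)."""
    a = a[:]
    quo = [0] * max(len(a) - len(b) + 1, 1)
    inv = pow(b[-1], -1, p)
    while len(a) >= len(b):
        c = a[-1] * inv % p
        shift = len(a) - len(b)
        quo[shift] = c
        for i, bc in enumerate(b):
            a[i + shift] = (a[i + shift] - c * bc) % p
        trim(a)
    return trim(quo), a

def poly_gcd(a, b, p):
    """Monic gcd of a and b over F_p (Euclidean algorithm)."""
    a, b = trim(a[:]), trim(b[:])
    while b:
        a, b = b, poly_divmod(a, b, p)[1]
    inv = pow(a[-1], -1, p)
    return [c * inv % p for c in a]

def poly_mulmod(a, b, f, p):
    """Product a*b reduced modulo f over F_p."""
    if not a or not b:
        return []
    prod = [0] * (len(a) + len(b) - 1)
    for i, ac in enumerate(a):
        for j, bc in enumerate(b):
            prod[i + j] = (prod[i + j] + ac * bc) % p
    return poly_divmod(trim(prod), f, p)[1]

def berlekamp_matrix(f, p):
    """Step 1: row i holds the coefficients of x^(i*p) mod f, as in (2.6)."""
    n = len(f) - 1
    xp = [1]
    for _ in range(p):                      # x^p mod f by repeated multiplication by x
        xp = poly_mulmod(xp, [0, 1], f, p)
    rows, cur = [], [1]
    for _ in range(n):
        rows.append(cur + [0] * (n - len(cur)))
        cur = poly_mulmod(cur, xp, f, p)
    return rows

def left_nullspace(M, p):
    """Basis of {v : v M = 0} over F_p via Gauss-Jordan elimination on M^T."""
    n = len(M)
    T = [[M[r][c] for r in range(n)] for c in range(n)]
    pivots, row = [], 0
    for col in range(n):
        piv = next((r for r in range(row, n) if T[r][col]), None)
        if piv is None:
            continue
        T[row], T[piv] = T[piv], T[row]
        inv = pow(T[row][col], -1, p)
        T[row] = [x * inv % p for x in T[row]]
        for r in range(n):
            if r != row and T[r][col]:
                c = T[r][col]
                T[r] = [(x - c * y) % p for x, y in zip(T[r], T[row])]
        pivots.append(col)
        row += 1
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [0] * n
        v[free] = 1
        for r, pc in enumerate(pivots):
            v[pc] = -T[r][free] % p
        basis.append(v)
    return basis

def berlekamp_split(f, p):
    """One round of Steps 1-3 for a monic squarefree f over F_p, p prime."""
    n = len(f) - 1
    A = berlekamp_matrix(f, p)
    M = [[(A[i][j] - (i == j)) % p for j in range(n)] for i in range(n)]
    basis = left_nullspace(M, p)            # basis[0] is [1, 0, ..., 0]
    if len(basis) == 1:
        return [f]                          # nullity 1: no f-reducing polynomial
    h = trim([0] + basis[1][1:])            # Step 2: solution with a_0 = 0
    factors = []
    for a in range(p):                      # Step 3: gcd(f, h - a) for a in F_p
        g = poly_gcd(f, [(-a) % p] + h[1:], p)
        if len(g) > 1:
            factors.append(g)
    return factors
```

As a usage example (the polynomial is chosen here for illustration and is not the one in Example 2.11), `berlekamp_split([1, 0, 0, 0, 1], 3)` treats f = x^4 + 1 over F_3. The matrix A − I_4 has nullity 2, the sketch picks h = x^3 + x, and the gcds return the two factors x^2 + 2x + 2 and x^2 + x + 2. The dense elimination keeps the code short; it is meant for small n and p only.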


Example 2.11. We try to factor

f =∈ F3[x]

using Berlekamp’s algorithm. We have the modulo f congruences

x0·3 ≡ 1,

Therefore,A =

Theorem 2.10 can be generalized as follows.

Theorem 2.12. Let $f = f_1^{e_1} \cdots f_k^{e_k}$ where $f_1, \dots, f_k$ are distinct irreducible polynomials in $\mathbb{F}_q[x]$ with $\deg f_i = n_i$ and $e_i > 0$, $1 \le i \le k$. Let $n = \deg f = e_1 n_1 + \cdots + e_k n_k$ and A the $n \times n$ matrix defined by (2.6). Then for each integer m > 0,

\[ \mathrm{nullity}(A^m - I_n) = \sum_{i=1}^{k} (m, n_i). \tag{2.10} \]

Proof. Using (2.6) repeatedly, we have

\[ \begin{bmatrix} x^{0 \cdot q^m} \\ x^{1 \cdot q^m} \\ \vdots \\ x^{(n-1)q^m} \end{bmatrix} \equiv A^m \begin{bmatrix} x^0 \\ x^1 \\ \vdots \\ x^{n-1} \end{bmatrix} \pmod f. \]

Therefore, by Theorem 2.10,

nullity$(A^m - I_n)$ = the number of distinct irreducible factors of f in $\mathbb{F}_{q^m}[x]$.

By Proposition 2.5, in $\mathbb{F}_{q^m}[x]$, $f_i$ splits into $(m, n_i)$ distinct irreducibles. Hence, the number of distinct irreducible factors of f in $\mathbb{F}_{q^m}[x]$ is $\sum_{i=1}^{k} (m, n_i)$. So (2.10) is proved. □

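Using Theorem 2.12, we can derive a formula for the number of irreducible factors of a given degree of f in terms of the matrix A.

Lemma 2.13. Let $m, n \in \mathbb{Z}^+$. We have

\[ \sum_{d \mid m} \mu\Bigl(\frac{m}{d}\Bigr) (d, n) = \begin{cases} \phi(m) & \text{if } m \mid n, \\ 0 & \text{if } m \nmid n, \end{cases} \]

where $\mu$ is the Möbius function of $(\mathbb{Z}^+, \mid)$ and $\phi$ is the Euler function.

Proof. If $m \mid n$, then $(d, n) = d$ for every $d \mid m$, so by Exercise 1.3, we have

\[ \sum_{d \mid m} \mu\Bigl(\frac{m}{d}\Bigr) (d, n) = \sum_{d \mid m} \mu\Bigl(\frac{m}{d}\Bigr) d = \phi(m). \]

If $m \nmid n$, write $m = p_1^{a_1} \cdots p_t^{a_t}$ and $n = p_1^{b_1} \cdots p_t^{b_t}$ where $p_1, \dots, p_t$ are distinct primes, and, without loss of generality, assume that $a_1 > b_1$. Then

\[ \sum_{d \mid m} \mu\Bigl(\frac{m}{d}\Bigr) (d, n) = \Bigl(\sum_{d_1 \mid p_1^{a_1}} \mu\Bigl(\frac{p_1^{a_1}}{d_1}\Bigr) (d_1, p_1^{b_1})\Bigr) \cdots \Bigl(\sum_{d_t \mid p_t^{a_t}} \mu\Bigl(\frac{p_t^{a_t}}{d_t}\Bigr) (d_t, p_t^{b_t})\Bigr) \]

where

\[ \sum_{d_1 \mid p_1^{a_1}} \mu\Bigl(\frac{p_1^{a_1}}{d_1}\Bigr) (d_1, p_1^{b_1}) = (p_1^{a_1}, p_1^{b_1}) - (p_1^{a_1 - 1}, p_1^{b_1}) = 0. \]

□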
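The identity of Lemma 2.13 can be confirmed by brute force for small m and n. This is a sketch for checking only, with illustrative function names; `mobius` is the classical one-variable Möbius function and `euler_phi` is computed naively.

```python
from math import gcd

def mobius(n):
    """Classical Mobius function mu(n), computed by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def euler_phi(n):
    """Euler function: the number of 1 <= k <= n with gcd(k, n) = 1."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def lemma_lhs(m, n):
    """Left side of Lemma 2.13: sum over d | m of mu(m/d) * gcd(d, n)."""
    return sum(mobius(m // d) * gcd(d, n) for d in range(1, m + 1) if m % d == 0)

def lemma_holds(m, n):
    """Compare the left side with phi(m) if m | n, and with 0 otherwise."""
    expected = euler_phi(m) if n % m == 0 else 0
    return lemma_lhs(m, n) == expected
```

For instance, with m = 6 and n = 4 the sum is 1 − 2 − 1 + 2 = 0, as the lemma predicts since 6 does not divide 4.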

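Theorem 2.14. Let $f \in \mathbb{F}_q[x]$ be a polynomial with $\deg f = n > 0$ and let A be the matrix defined in (2.6). For each integer m > 0, the number of distinct irreducible factors of f of degree m is given by

\[ \sum_{\substack{s \le n \\ m \mid s}} \mu\Bigl(\frac{s}{m}\Bigr) \frac{1}{\phi(s)} \sum_{d \mid s} \mu\Bigl(\frac{s}{d}\Bigr)\, \mathrm{nullity}(A^d - I_n). \]

Proof. Let $n_1, \dots, n_k$ be the degrees of the distinct irreducible factors of f. For each $m \in \mathbb{Z}^+$, let

\[ N_{=}(m) = |\{1 \le i \le k : n_i = m\}| \]

and

\[ N_{\ge}(m) = \sum_{m \mid s} N_{=}(s) = |\{1 \le i \le k : m \mid n_i\}|. \]

From (2.10), we have

\[ \sum_{d \mid m} \mu\Bigl(\frac{m}{d}\Bigr)\, \mathrm{nullity}(A^d - I_n) = \sum_{i=1}^{k} \sum_{d \mid m} \mu\Bigl(\frac{m}{d}\Bigr) (d, n_i) = |\{1 \le i \le k : m \mid n_i\}|\, \phi(m) \ \text{(by Lemma 2.13)} = N_{\ge}(m)\, \phi(m). \]

Thus,

\[ N_{\ge}(m) = \frac{1}{\phi(m)} \sum_{d \mid m} \mu\Bigl(\frac{m}{d}\Bigr)\, \mathrm{nullity}(A^d - I_n). \]

Since $n_i \le n$ for all $1 \le i \le k$, obviously, $N_{\ge}(m) = 0$ if $m > n$. Thus, by the Möbius inversion (1.17),

\[ N_{=}(m) = \sum_{m \mid s} \mu\Bigl(\frac{s}{m}\Bigr) N_{\ge}(s) = \sum_{\substack{s \le n \\ m \mid s}} \mu\Bigl(\frac{s}{m}\Bigr) N_{\ge}(s) = \sum_{\substack{s \le n \\ m \mid s}} \mu\Bigl(\frac{s}{m}\Bigr) \frac{1}{\phi(s)} \sum_{d \mid s} \mu\Bigl(\frac{s}{d}\Bigr)\, \mathrm{nullity}(A^d - I_n). \]

□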


2.3. Functions from $\mathbb{F}_q^n$ to $\mathbb{F}_q$

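Let $n \ge 0$ be an integer and let $\mathcal{F}(\mathbb{F}_q^n, \mathbb{F}_q)$ denote the set of all functions from $\mathbb{F}_q^n$ to $\mathbb{F}_q$. Clearly, $\mathcal{F}(\mathbb{F}_q^n, \mathbb{F}_q)$ is an $\mathbb{F}_q$-algebra. A property peculiar to finite fields is that every function in $\mathcal{F}(\mathbb{F}_q^n, \mathbb{F}_q)$ is a polynomial function.

Let $\mathbb{F}_q[X_1, \dots, X_n]$ be the polynomial ring in $X_1, \dots, X_n$ over $\mathbb{F}_q$. Each element $f(X_1, \dots, X_n) \in \mathbb{F}_q[X_1, \dots, X_n]$ gives rise to a function

\[ \bar f : \mathbb{F}_q^n \longrightarrow \mathbb{F}_q, \qquad (a_1, \dots, a_n) \longmapsto f(a_1, \dots, a_n). \]

Clearly, $\overline{(\ )} : f \mapsto \bar f$ is an $\mathbb{F}_q$-algebra homomorphism from $\mathbb{F}_q[X_1, \dots, X_n]$ to $\mathcal{F}(\mathbb{F}_q^n, \mathbb{F}_q)$. The homomorphism $\overline{(\ )} : \mathbb{F}_q[X_1, \dots, X_n] \to \mathcal{F}(\mathbb{F}_q^n, \mathbb{F}_q)$ is onto. This claim follows from the Lagrange interpolation. For each $(a_1, \dots, a_n) \in \mathbb{F}_q^n$, define

\[ f_{(a_1, \dots, a_n)} = \prod_{i=1}^{n} \prod_{b \in \mathbb{F}_q \setminus \{a_i\}} \frac{X_i - b}{a_i - b} \in \mathbb{F}_q[X_1, \dots, X_n]. \]

Then

\[ \bar f_{(a_1, \dots, a_n)}(b_1, \dots, b_n) = \begin{cases} 1 & \text{if } (b_1, \dots, b_n) = (a_1, \dots, a_n), \\ 0 & \text{if } (b_1, \dots, b_n) \ne (a_1, \dots, a_n). \end{cases} \]

So, $\bar f_{(a_1, \dots, a_n)}$, $(a_1, \dots, a_n) \in \mathbb{F}_q^n$, form a basis of $\mathcal{F}(\mathbb{F}_q^n, \mathbb{F}_q)$. Consequently, $\overline{(\ )} : \mathbb{F}_q[X_1, \dots, X_n] \to \mathcal{F}(\mathbb{F}_q^n, \mathbb{F}_q)$ is onto.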

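Theorem 2.15. The homomorphism $\overline{(\ )} : \mathbb{F}_q[X_1, \dots, X_n] \to \mathcal{F}(\mathbb{F}_q^n, \mathbb{F}_q)$ induces an $\mathbb{F}_q$-algebra isomorphism

\[ \mathbb{F}_q[X_1, \dots, X_n]/(X_1^q - X_1, \dots, X_n^q - X_n) \cong \mathcal{F}(\mathbb{F}_q^n, \mathbb{F}_q), \tag{2.11} \]

where $(X_1^q - X_1, \dots, X_n^q - X_n)$ is the ideal of $\mathbb{F}_q[X_1, \dots, X_n]$ generated by $X_1^q - X_1, \dots, X_n^q - X_n$.

Proof. Since $a^q - a = 0$ for all $a \in \mathbb{F}_q$, it is clear that $(X_1^q - X_1, \dots, X_n^q - X_n) \subset \ker \overline{(\ )}$. Thus $\overline{(\ )}$ induces an onto homomorphism

\[ \varepsilon : \mathbb{F}_q[X_1, \dots, X_n]/(X_1^q - X_1, \dots, X_n^q - X_n) \longrightarrow \mathcal{F}(\mathbb{F}_q^n, \mathbb{F}_q). \]

However,

\[ \dim_{\mathbb{F}_q} \mathbb{F}_q[X_1, \dots, X_n]/(X_1^q - X_1, \dots, X_n^q - X_n) = q^n = \dim_{\mathbb{F}_q} \mathcal{F}(\mathbb{F}_q^n, \mathbb{F}_q). \]

(The first equal sign holds in the above since $X_1^{e_1} \cdots X_n^{e_n}$, $0 \le e_i \le q - 1$, $1 \le i \le n$, form a basis of $\mathbb{F}_q[X_1, \dots, X_n]/(X_1^q - X_1, \dots, X_n^q - X_n)$.) Therefore, $\varepsilon$ is an isomorphism. □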

The meaning of (2.11) is concrete. Every function from $\mathbb{F}_q^n$ to $\mathbb{F}_q$ can be uniquely represented as a polynomial in $\mathbb{F}_q[X_1, \dots, X_n]$ in which the degree of each $X_i$ is at most $q - 1$. In particular, every function from $\mathbb{F}_q$ to $\mathbb{F}_q$ is uniquely represented by a polynomial of degree $\le q - 1$ in $\mathbb{F}_q[X]$.
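For n = 1 and q = p prime, the Lagrange interpolation behind this representation can be carried out directly. The following is a minimal sketch under those assumptions; the names `interpolate` and `evaluate` are illustrative, and a function F : F_p → F_p is given as the list of its values F(0), ..., F(p − 1).

```python
def interpolate(values, p):
    """Coefficients (low degree first) of the polynomial of degree <= p-1
    that takes the value values[a] at each a in F_p."""
    coeffs = [0] * p
    for a in range(p):
        # Build the Lagrange basis polynomial: product over b != a of (X - b)/(a - b).
        basis = [1]
        for b in range(p):
            if b == a:
                continue
            inv = pow((a - b) % p, -1, p)
            nxt = [0] * (len(basis) + 1)
            for i, c in enumerate(basis):
                nxt[i] = (nxt[i] - b * c * inv) % p
                nxt[i + 1] = (nxt[i + 1] + c * inv) % p
            basis = nxt
        for i, c in enumerate(basis):
            coeffs[i] = (coeffs[i] + values[a] * c) % p
    return coeffs

def evaluate(coeffs, x, p):
    """Evaluate the polynomial at x over F_p by Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % p
    return result
```

By the uniqueness part of the statement, interpolating the identity function a ↦ a over F_5 must return the coefficients of X itself, that is, `[0, 1, 0, 0, 0]`.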

Put

\[ P_{q,n} = \mathbb{F}_q[X_1, \dots, X_n]/(X_1^q - X_1, \dots, X_n^q - X_n). \]

We identify the two $\mathbb{F}_q$-algebras $P_{q,n}$ and $\mathcal{F}(\mathbb{F}_q^n, \mathbb{F}_q)$. When it is convenient and causes no confusion, elements in $P_{q,n}$ and $\mathcal{F}(\mathbb{F}_q^n, \mathbb{F}_q)$ are simply written as polynomials in $\mathbb{F}_q[X_1, \dots, X_n]$. Every element $f \in P_{q,n}$ is uniquely of the form

\[ f = \sum_{(e_1, \dots, e_n) \in [0, q-1]^n} a_{e_1, \dots, e_n} X_1^{e_1} \cdots X_n^{e_n}, \tag{2.12} \]

where $[0, q-1] = \{0, 1, \dots, q-1\}$ and $a_{e_1, \dots, e_n} \in \mathbb{F}_q$. We define $\deg f$ to be the total degree of the polynomial on the right side of (2.12), i.e.,

\[ \deg f = \max\{e_1 + \cdots + e_n : a_{e_1, \dots, e_n} \ne 0\}. \]

(By convention, $\deg 0 = -\infty$.)

For each $-1 \le r \le n(q-1)$, let

\[ R_q(r, n) = \{f \in P_{q,n} : \deg f \le r\}. \]

$R_q(r, n)$ is an $\mathbb{F}_q$-subspace of $P_{q,n}$ and is called the q-ary Reed-Muller code of order r and length $q^n$. We will not go into the background of coding theory, but simply point out that the “codeword” arising from $f \in R_q(r, n)$ is the $q^n$-tuple $(f(a))_{a \in \mathbb{F}_q^n}$.

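The quotient space $R_q(r, n)/R_q(r-1, n)$ is the space of homogeneous polynomial functions of degree r in $P_{q,n}$. Since $X_1^{e_1} \cdots X_n^{e_n}$, $0 \le e_i \le q-1$, $e_1 + \cdots + e_n = r$, form a basis of $R_q(r, n)/R_q(r-1, n)$, we have

\[
\begin{aligned}
\dim_{\mathbb{F}_q} R_q(r, n)/R_q(r-1, n)
&= \bigl|\{(e_1, \dots, e_n) \in [0, q-1]^n : e_1 + \cdots + e_n = r\}\bigr| \\
&= \text{the coefficient of } x^r \text{ in } (1 + x + \cdots + x^{q-1})^n \\
&= \text{the coefficient of } x^r \text{ in } (1 - x^q)^n (1 - x)^{-n} \\
&= \text{the coefficient of } x^r \text{ in } \Bigl(\sum_{i=0}^{n} \binom{n}{i} (-1)^i x^{qi}\Bigr) \Bigl(\sum_{j=0}^{\infty} \binom{n+j-1}{j} x^j\Bigr) \\
&= \sum_{i \le \lfloor r/q \rfloor} (-1)^i \binom{n}{i} \binom{r - qi + n - 1}{r - qi}.
\end{aligned}
\]

Consequently,

\[ \dim_{\mathbb{F}_q} R_q(r, n) = \sum_{s=0}^{r} \dim_{\mathbb{F}_q} R_q(s, n)/R_q(s-1, n) = \sum_{s=0}^{r} \sum_{i \le \lfloor s/q \rfloor} (-1)^i \binom{n}{i} \binom{s - qi + n - 1}{s - qi}. \]

When q = 2, the above dimension formulas are much simpler. We have

\[ \dim_{\mathbb{F}_2} R_2(r, n)/R_2(r-1, n) = \bigl|\{(e_1, \dots, e_n) \in [0, 1]^n : e_1 + \cdots + e_n = r\}\bigr| = \binom{n}{r} \]

and

\[ \dim_{\mathbb{F}_2} R_2(r, n) = \sum_{s=0}^{r} \binom{n}{s}. \]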
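The closed formula for the dimension of R_q(r, n)/R_q(r − 1, n) can be compared with a direct count of exponent vectors. This is a small sketch for checking purposes, assuming n ≥ 1 and 0 ≤ r ≤ n(q − 1); the function names are illustrative.

```python
from itertools import product
from math import comb

def rm_layer_dim(q, r, n):
    """Formula: sum over i <= floor(r/q) of (-1)^i C(n, i) C(r - q*i + n - 1, r - q*i)."""
    return sum((-1)**i * comb(n, i) * comb(r - q * i + n - 1, r - q * i)
               for i in range(r // q + 1))

def rm_layer_dim_bruteforce(q, r, n):
    """Count the exponent vectors (e_1, ..., e_n) in [0, q-1]^n with sum r."""
    return sum(1 for e in product(range(q), repeat=n) if sum(e) == r)

def rm_dim(q, r, n):
    """dim R_q(r, n), obtained by summing the layer dimensions for s = 0..r."""
    return sum(rm_layer_dim(q, s, n) for s in range(r + 1))
```

For q = 3 and n = 2 the layer dimensions for r = 0, ..., 4 are 1, 2, 3, 2, 1, which add up to 9 = 3^2 = dim P_{3,2}. For q = 2 the layer dimension reduces to the binomial coefficient C(n, r).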


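The method of representing functions in $\mathcal{F}(\mathbb{F}_q^n, \mathbb{F}_q)$ as polynomials in $\mathbb{F}_q[X_1, \dots, X_n]$ is referred to as the multi variable approach. There is another method of representing functions in $\mathcal{F}(\mathbb{F}_q^n, \mathbb{F}_q)$, called the single variable approach, which we describe now.

Identify $\mathbb{F}_{q^n}$ with $\mathbb{F}_q^n$ as $\mathbb{F}_q$-vector spaces. We abbreviate $\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}$ as Tr. Since $\mathrm{Tr} : \mathbb{F}_{q^n} \to \mathbb{F}_q$ is onto (Theorem 1.9 (i)), every function $F \in \mathcal{F}(\mathbb{F}_q^n, \mathbb{F}_q)$ is a composition

\[ \mathbb{F}_{q^n} \xrightarrow{\ f\ } \mathbb{F}_{q^n} \xrightarrow{\ \mathrm{Tr}\ } \mathbb{F}_q \]

for some function $f : \mathbb{F}_{q^n} \to \mathbb{F}_{q^n}$. Namely,

\[ F(x) = \mathrm{Tr}(f(x)) \quad \text{for all } x \in \mathbb{F}_{q^n}, \tag{2.13} \]

where, by Theorem 2.15, $f \in \mathbb{F}_{q^n}[x]/(x^{q^n} - x)$. Note that in (2.13), f is not uniquely determined by F, as $\mathrm{Tr}(f(x)^q) = \mathrm{Tr}(f(x))$.

A natural question arises: how should one choose elements $f \in \mathbb{F}_{q^n}[x]/(x^{q^n} - x)$ so that the functions $\mathrm{Tr}(f(x))$ form a basis of $R_q(r, n)/R_q(r-1, n)$? We will answer this question in the rest of this section.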

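Lemma 2.16. Let $e_0, \dots, e_{n-1}$ be integers such that $0 \le e_i \le q-1$, $0 \le i \le n-1$, and let $a \in \mathbb{F}_{q^n}$. Then

\[ \mathrm{Tr}\bigl(a x^{e_0 + e_1 q + \cdots + e_{n-1} q^{n-1}}\bigr) \in R_q(r, n), \tag{2.14} \]

where $r = e_0 + \cdots + e_{n-1}$.

Proof. Let $\varepsilon_1, \dots, \varepsilon_n$ be a basis of $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$. For each $x \in \mathbb{F}_{q^n}$, write

\[ x = \sum_{i=1}^{n} x_i \varepsilon_i, \qquad (x_1, \dots, x_n) \in \mathbb{F}_q^n. \]

We want to show that $\mathrm{Tr}\bigl(a x^{e_0 + e_1 q + \cdots + e_{n-1} q^{n-1}}\bigr)$, as a polynomial function of $x_1, \dots, x_n$, has a total degree $\le r$. We have

\[ x^{e_0 + e_1 q + \cdots + e_{n-1} q^{n-1}} = \Bigl(\sum_{i=1}^{n} x_i \varepsilon_i\Bigr)^{e_0 + e_1 q + \cdots + e_{n-1} q^{n-1}} = \Bigl(\sum_{i=1}^{n} x_i \varepsilon_i\Bigr)^{e_0} \Bigl(\sum_{i=1}^{n} x_i \varepsilon_i^{q}\Bigr)^{e_1} \cdots \Bigl(\sum_{i=1}^{n} x_i \varepsilon_i^{q^{n-1}}\Bigr)^{e_{n-1}}. \]

The above is clearly a homogeneous polynomial of degree $e_0 + \cdots + e_{n-1} = r$ in $x_1, \dots, x_n$. Thus, we can write

\[ x^{e_0 + e_1 q + \cdots + e_{n-1} q^{n-1}} = \sum_{s_1 + \cdots + s_n = r} b_{s_1, \dots, s_n} x_1^{s_1} \cdots x_n^{s_n}, \]

where $b_{s_1, \dots, s_n} \in \mathbb{F}_{q^n}$ depends on $\varepsilon_1, \dots, \varepsilon_n$. Therefore, since Tr is $\mathbb{F}_q$-linear and $x_1, \dots, x_n \in \mathbb{F}_q$,

\[ \mathrm{Tr}\bigl(a x^{e_0 + e_1 q + \cdots + e_{n-1} q^{n-1}}\bigr) = \sum_{s_1 + \cdots + s_n = r} \mathrm{Tr}(a\, b_{s_1, \dots, s_n})\, x_1^{s_1} \cdots x_n^{s_n}, \]

which has a total degree $\le r$ in $x_1, \dots, x_n$. □

Remark. To avoid possible future confusion, we reiterate what we have just seen. In (2.14), although $x^{e_0 + e_1 q + \cdots + e_{n-1} q^{n-1}}$ is a monomial of degree $e_0 + e_1 q + \cdots + e_{n-1} q^{n-1}$ in x, $\mathrm{Tr}\bigl(a x^{e_0 + e_1 q + \cdots + e_{n-1} q^{n-1}}\bigr)$ is a polynomial of degree $\le e_0 + \cdots + e_{n-1}$ in $x_1, \dots, x_n$.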

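Let $\tau$ be the Frobenius map of $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$ and let

\[ E_r = \{(e_0, \dots, e_{n-1}) \in [0, q-1]^n : e_0 + \cdots + e_{n-1} = r\}. \]

Define a $\langle \tau \rangle$-action on $E_r$ by

\[ \tau^m(e_0, \dots, e_{n-1}) = (e_{-m}, e_{-m+1}, \dots, e_{-m+n-1}) \]

where the subscript i of $e_i$ is taken modulo n. In straightforward terms, the action of $\tau^m$ on $(e_0, \dots, e_{n-1})$ is the cyclic shift of the components m positions to the right. So, two elements in $E_r$ are in the same $\langle \tau \rangle$-orbit if and only if one can be obtained from the other through a cyclic shift. Observe that for all $x \in \mathbb{F}_{q^n}$ and $(e_0, \dots, e_{n-1}) \in [0, q-1]^n$,

\[
\begin{aligned}
\tau^m\bigl(x^{(e_0, \dots, e_{n-1})(q^0, \dots, q^{n-1})^T}\bigr)
&= \tau^m\bigl(x^{e_0 q^0 + \cdots + e_{n-1} q^{n-1}}\bigr) \\
&= x^{(e_0 q^0 + \cdots + e_{n-1} q^{n-1}) q^m} \\
&= x^{e_{-m} q^0 + \cdots + e_{-m+n-1} q^{n-1}} \quad (\text{since } x^{q^n} = x) \\
&= x^{\tau^m(e_0, \dots, e_{n-1})(q^0, \dots, q^{n-1})^T}.
\end{aligned}
\tag{2.15}
\]

Let $\varepsilon_1, \dots, \varepsilon_k \in E_r$ be a set of representatives of the $\langle \tau \rangle$-orbits in $E_r$ and let $[\varepsilon_i]$ be the $\langle \tau \rangle$-orbit of $\varepsilon_i$, $1 \le i \le k$. The stabilizer of $\varepsilon_i$ in $\langle \tau \rangle$ is $\langle \tau^{|[\varepsilon_i]|} \rangle = \mathrm{Aut}(\mathbb{F}_{q^n}/\mathbb{F}_{q^{|[\varepsilon_i]|}})$.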
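The cyclic-shift orbits of E_r are easy to enumerate for small parameters. The following sketch (illustrative names, brute-force enumeration) lists the orbits and computes the exponent e_0 q^0 + ... + e_{n−1} q^{n−1} attached to a vector, so that the congruence behind (2.15), namely that shifting m positions multiplies the exponent by q^m modulo q^n − 1, can be checked.

```python
from itertools import product

def shift(e, m):
    """Cyclic shift of the tuple e by m positions to the right."""
    m %= len(e)
    return e[-m:] + e[:-m] if m else e

def shift_orbits(q, n, r):
    """Orbits of E_r = {e in [0, q-1]^n : sum(e) = r} under cyclic shifts."""
    E_r = [e for e in product(range(q), repeat=n) if sum(e) == r]
    seen, orbits = set(), []
    for e in E_r:
        if e in seen:
            continue
        orbit = sorted({shift(e, m) for m in range(n)})
        seen.update(orbit)
        orbits.append(orbit)
    return orbits

def exponent(e, q):
    """The integer e_0*q^0 + e_1*q^1 + ... + e_{n-1}*q^(n-1)."""
    return sum(ei * q**i for i, ei in enumerate(e))
```

For q = 2, n = 4, r = 2 there are two orbits, of sizes 4 and 2; their sizes add up to |E_2| = 6.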

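Theorem 2.17. In the above notation, a basis of $R_q(r, n)/R_q(r-1, n)$ is given by

\[ \mathrm{Tr}\bigl(a x^{\varepsilon_i (q^0, \dots, q^{n-1})^T}\bigr), \qquad 1 \le i \le k,\ a \in A_i, \tag{2.16} \]

where $A_i$ is any subset of $\mathbb{F}_{q^n}$ such that $\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_{q^{|[\varepsilon_i]|}}}(a)$, $a \in A_i$, form a basis of $\mathbb{F}_{q^{|[\varepsilon_i]|}}$ over $\mathbb{F}_q$.

Proof. By Lemma 2.16, all functions in (2.16) belong to $R_q(r, n)$. The number of functions listed in (2.16) is

\[ \sum_{i=1}^{k} |A_i| = \sum_{i=1}^{k} |[\varepsilon_i]| = |E_r| = \dim_{\mathbb{F}_q} R_q(r, n)/R_q(r-1, n). \]

Therefore, it suffices to show that the list (2.16) is linearly independent over $\mathbb{F}_q$. Assume that

\[ \sum_{i=1}^{k} \sum_{a \in A_i} b_{i,a}\, \mathrm{Tr}\bigl(a x^{\varepsilon_i (q^0, \dots, q^{n-1})^T}\bigr) = 0 \quad \text{for all } x \in \mathbb{F}_{q^n}, \tag{2.17} \]

where $b_{i,a} \in \mathbb{F}_q$ for all $1 \le i \le k$, $a \in A_i$. By (2.15),

\[ \tau^{|[\varepsilon_i]|}\bigl(x^{\varepsilon_i (q^0, \dots, q^{n-1})^T}\bigr) = x^{\tau^{|[\varepsilon_i]|}(\varepsilon_i)(q^0, \dots, q^{n-1})^T} = x^{\varepsilon_i (q^0, \dots, q^{n-1})^T}, \]

so $x^{\varepsilon_i (q^0, \dots, q^{n-1})^T} \in \mathbb{F}_{q^{|[\varepsilon_i]|}}$. Therefore, we can rewrite (2.17) as

\[
\begin{aligned}
0 &= \sum_{i=1}^{k} \sum_{a \in A_i} b_{i,a}\, \mathrm{Tr}_{\mathbb{F}_{q^{|[\varepsilon_i]|}}/\mathbb{F}_q}\Bigl(\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_{q^{|[\varepsilon_i]|}}}(a)\, x^{\varepsilon_i (q^0, \dots, q^{n-1})^T}\Bigr) \\
&= \sum_{i=1}^{k} \mathrm{Tr}_{\mathbb{F}_{q^{|[\varepsilon_i]|}}/\mathbb{F}_q}\Bigl[\Bigl(\sum_{a \in A_i} b_{i,a}\, \mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_{q^{|[\varepsilon_i]|}}}(a)\Bigr) x^{\varepsilon_i (q^0, \dots, q^{n-1})^T}\Bigr] \\
&= \sum_{i=1}^{k} \sum_{j=0}^{|[\varepsilon_i]|-1} \tau^j\Bigl(\sum_{a \in A_i} b_{i,a}\, \mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_{q^{|[\varepsilon_i]|}}}(a)\Bigr) x^{\tau^j(\varepsilon_i)(q^0, \dots, q^{n-1})^T}
\end{aligned}
\tag{2.18}
\]

for all $x \in \mathbb{F}_{q^n}$. The right side of (2.18) is a polynomial in x of degree $\le q^n - 1$ and the exponents of x are all distinct. (In fact, $\tau^j(\varepsilon_i)$, $1 \le i \le k$, $0 \le j \le |[\varepsilon_i]| - 1$, are the base-q digit vectors of distinct integers in $[0, q^n - 1]$.) Therefore, all the coefficients in the right side of (2.18) are 0, i.e., for all $1 \le i \le k$,

\[ \sum_{a \in A_i} b_{i,a}\, \mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_{q^{|[\varepsilon_i]|}}}(a) = 0. \]

From the definition of $A_i$, we must have $b_{i,a} = 0$ for all $a \in A_i$. The proof of the theorem is complete. □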

Remark. A basis of $R_q(r, n)$ can be obtained by taking the union of the bases of $R_q(r, n)/R_q(r-1, n)$, $R_q(r-1, n)/R_q(r-2, n)$, ..., $R_q(0, n)$ using Theorem 2.17.

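Example 2.18. We determine a basis of $R_q(2, n)/R_q(1, n)$ using Theorem 2.17. For $2 \le i \le n$, let

\[ \varepsilon_i = (1, 0, \dots, 0, \underset{i}{1}, 0, \dots, 0) \in E_2, \]

where the second 1 is in position i.

Case 1. q = 2.

Case 1.1. Assume that n is odd. Then the $\langle \tau \rangle$-orbits of $E_2$ are $[\varepsilon_i]$, $2 \le i \le \frac{n+1}{2}$, where $|[\varepsilon_i]| = n$ for all $2 \le i \le \frac{n+1}{2}$. A basis of $R_2(2, n)/R_2(1, n)$ is given by

\[ \mathrm{Tr}\bigl(a x^{\varepsilon_i (2^0, \dots, 2^{n-1})^T}\bigr) = \mathrm{Tr}\bigl(a x^{1 + 2^{i-1}}\bigr), \qquad 2 \le i \le \frac{n+1}{2},\ a \in A_i, \]

where $A_i$ is any basis of $\mathbb{F}_{2^n}$ over $\mathbb{F}_2$.

Case 1.2. Assume that n is even. The $\langle \tau \rangle$-orbits of $E_2$ are $[\varepsilon_i]$, $2 \le i \le \frac{n}{2} + 1$, where

\[ |[\varepsilon_i]| = \begin{cases} n & \text{if } 2 \le i \le \frac{n}{2}, \\ \frac{n}{2} & \text{if } i = \frac{n}{2} + 1. \end{cases} \]

A basis of $R_2(2, n)/R_2(1, n)$ is given by

\[ \mathrm{Tr}\bigl(a x^{\varepsilon_i (2^0, \dots, 2^{n-1})^T}\bigr) = \mathrm{Tr}\bigl(a x^{1 + 2^{i-1}}\bigr), \qquad 2 \le i \le \frac{n}{2} + 1,\ a \in A_i, \]

where $A_i$, $2 \le i \le \frac{n}{2}$, is any basis of $\mathbb{F}_{2^n}$ over $\mathbb{F}_2$ and $A_{\frac{n}{2}+1}$ is any subset of $\mathbb{F}_{2^n}$ such that $\mathrm{Tr}_{\mathbb{F}_{2^n}/\mathbb{F}_{2^{n/2}}}(a)$, $a \in A_{\frac{n}{2}+1}$, form a basis of $\mathbb{F}_{2^{n/2}}$ over $\mathbb{F}_2$. Since $\ker(\mathrm{Tr}_{\mathbb{F}_{2^n}/\mathbb{F}_{2^{n/2}}}) = \mathbb{F}_{2^{n/2}}$, the condition on $A_{\frac{n}{2}+1}$ simply means that the images of the elements of $A_{\frac{n}{2}+1}$ in $\mathbb{F}_{2^n}/\mathbb{F}_{2^{n/2}}$ form an $\mathbb{F}_2$-basis of $\mathbb{F}_{2^n}/\mathbb{F}_{2^{n/2}}$. (A general fact: If $g : V \to W$ is an onto homomorphism of vector spaces, then $g(v_i)$, $i \in I$, form a basis of W if and only if $v_i + \ker g$, $i \in I$, form a basis of $V/\ker g$.)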

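Case 2. $q > 2$.

Put $\delta = (2,0,\dots,0)\in E_2$.

Case 2.1. Assume that $n$ is odd. The $\langle\tau\rangle$-orbits of $E_2$ are $[\delta]$ and $[\varepsilon_i]$, $2\le i\le\frac{n+1}{2}$, where $|[\delta]| = n$ and $|[\varepsilon_i]| = n$, $2\le i\le\frac{n+1}{2}$. A basis of $R_q(2,n)/R_q(1,n)$ is given by
\[
\mathrm{Tr}\bigl(a\,x^{\delta(q^0,\dots,q^{n-1})^T}\bigr) = \mathrm{Tr}(ax^2),\qquad a\in A,
\]
and
\[
\mathrm{Tr}\bigl(a\,x^{\varepsilon_i(q^0,\dots,q^{n-1})^T}\bigr) = \mathrm{Tr}\bigl(a\,x^{1+q^{i-1}}\bigr),\qquad 2\le i\le\tfrac{n+1}{2},\ a\in A_i,
\]
where each of $A, A_2,\dots,A_{\frac{n+1}{2}}$ is a basis of $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$.

Case 2.2. Assume that $n$ is even. The $\langle\tau\rangle$-orbits of $E_2$ are $[\delta]$ and $[\varepsilon_i]$, $2\le i\le\frac n2+1$, where $|[\delta]| = n$ and
\[
|[\varepsilon_i]| = \begin{cases} n & \text{if } 2\le i\le\frac n2,\\[2pt] \frac n2 & \text{if } i=\frac n2+1.\end{cases}
\]
A basis of $R_q(2,n)/R_q(1,n)$ is given by
\[
\mathrm{Tr}\bigl(a\,x^{\delta(q^0,\dots,q^{n-1})^T}\bigr) = \mathrm{Tr}(ax^2),\qquad a\in A,
\]
and
\[
\mathrm{Tr}\bigl(a\,x^{\varepsilon_i(q^0,\dots,q^{n-1})^T}\bigr) = \mathrm{Tr}\bigl(a\,x^{1+q^{i-1}}\bigr),\qquad 2\le i\le\tfrac n2+1,\ a\in A_i,
\]
where each of $A, A_2,\dots,A_{\frac n2}$ is a basis of $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$ and $A_{\frac n2+1}$ is any subset of $\mathbb{F}_{q^n}$ such that $\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_{q^{n/2}}}(a)$, $a\in A_{\frac n2+1}$, form a basis of $\mathbb{F}_{q^{n/2}}$ over $\mathbb{F}_q$. If $q$ is even, the condition on $A_{\frac n2+1}$ means that the images of the elements of $A_{\frac n2+1}$ in $\mathbb{F}_{q^n}/\mathbb{F}_{q^{n/2}}$ form an $\mathbb{F}_q$-basis of $\mathbb{F}_{q^n}/\mathbb{F}_{q^{n/2}}$. If $q$ is odd, one can choose for $A_{\frac n2+1}$ any basis of $\mathbb{F}_{q^{n/2}}$ over $\mathbb{F}_q$.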

2.4. Permutation Polynomials

Definition 2.19. A polynomial $f\in\mathbb{F}_q[x]$ is called a permutation polynomial of $\mathbb{F}_q$ if the function $a\mapsto f(a)$ is a permutation of $\mathbb{F}_q$.

By Theorem 2.15, every function from $\mathbb{F}_q$ to $\mathbb{F}_q$ is uniquely represented by a polynomial of degree $\le q-1$ in $\mathbb{F}_q[x]$. Therefore, the number of permutation polynomials of $\mathbb{F}_q$ of degree $\le q-1$ is $q!$.

The main question concerning permutation polynomials of $\mathbb{F}_q$ is how to recognize them. The following is a useful criterion for this purpose.

Theorem 2.20 (Hermite's criterion). Let $q = p^n$, where $p$ is a prime. A polynomial $f\in\mathbb{F}_q[x]$ is a permutation polynomial of $\mathbb{F}_q$ if and only if the following two conditions are both satisfied.

(i) $f$ has exactly one root in $\mathbb{F}_q$.
(ii) For each integer $1\le s\le q-2$, $f^s\equiv f_s \pmod{x^q-x}$ for some $f_s\in\mathbb{F}_q[x]$ with $\deg f_s\le q-2$.
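The criterion can be tried out numerically. The following is a minimal sketch for the special case where $q = p$ is prime, so that $\mathbb{F}_p$ is just the integers modulo $p$; the function names (`reduce_mod`, `hermite_criterion`, and so on) are illustrative choices, not notation from these notes. Polynomials are coefficient lists with the constant term first.

```python
def reduce_mod(f, p):
    """Reduce a coefficient list modulo x^p - x over F_p (p prime)."""
    out = [0] * p
    for e, c in enumerate(f):
        if e >= p:
            # x^e is congruent to x^(((e-1) mod (p-1)) + 1) modulo x^p - x
            e = (e - 1) % (p - 1) + 1
        out[e] = (out[e] + c) % p
    return out

def poly_mul_mod(f, g, p):
    """Multiply two polynomials over F_p, then reduce modulo x^p - x."""
    prod = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            prod[i + j] = (prod[i + j] + fi * gj) % p
    return reduce_mod(prod, p)

def degree(f):
    """Degree of a coefficient list (-1 for the zero polynomial)."""
    for e in range(len(f) - 1, -1, -1):
        if f[e]:
            return e
    return -1

def evaluate(f, a, p):
    return sum(c * pow(a, e, p) for e, c in enumerate(f)) % p

def is_permutation(f, p):
    """Brute-force test: does a -> f(a) hit every element of F_p?"""
    return len({evaluate(f, a, p) for a in range(p)}) == p

def hermite_criterion(f, p):
    """Conditions (i) and (ii) of Hermite's criterion for q = p prime."""
    f = reduce_mod(f, p)
    if sum(1 for a in range(p) if evaluate(f, a, p) == 0) != 1:
        return False                      # condition (i) fails
    power = [1] + [0] * (p - 1)
    for s in range(1, p - 1):             # s = 1, ..., p - 2
        power = poly_mul_mod(power, f, p)
        if degree(power) > p - 2:
            return False                  # condition (ii) fails
    return True
```

For example, `hermite_criterion([0, 0, 0, 1], 5)` tests $x^3$ over $\mathbb{F}_5$, and comparing against `is_permutation` on all low-degree polynomials gives a quick sanity check of the theorem for small $p$.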

Remark. Condition (ii) of Theorem 2.20 is equivalent to

(ii′) For each integer $1\le s\le q-2$ with $p\nmid s$, $f^s\equiv f_s\pmod{x^q-x}$ for some $f_s\in\mathbb{F}_q[x]$ with $\deg f_s\le q-2$.

To see that (ii′) implies (ii), assume that $s = s'p^m$, where $p\nmid s'$. Write
\[
f^{s'}\equiv\sum_{i=0}^{q-2}a_ix^i \pmod{x^q-x}.
\]
For each $0\le i\le q-2$, write $i = (b_0,\dots,b_{n-1})(p^0,\dots,p^{n-1})^T$, $0\le b_j\le p-1$, $(b_0,\dots,b_{n-1})\ne(p-1,\dots,p-1)$. Then
\[
x^{ip^m}\equiv x^{(b_{-m},\,b_{-m+1},\,\dots,\,b_{-m+n-1})(p^0,\dots,p^{n-1})^T}\pmod{x^q-x},
\]
where the subscripts of $b_j$ are taken modulo $n$. In other words, the remainder of $x^{ip^m}$ (mod $x^q-x$) is $x^{i'}$, where $i'$ is obtained from $i$ by cyclically shifting its base-$p$ digits $m$ positions. Since $(b_0,\dots,b_{n-1})\ne(p-1,\dots,p-1)$, we have $i' < q-1$. Thus, the remainder of $f^s = f^{s'p^m}\equiv\sum_{i=0}^{q-2}a_i^{p^m}x^{ip^m}$ (mod $x^q-x$) has degree $< q-1$.

The proof of Theorem 2.20 relies on the following two lemmas.

Lemma 2.21. Let $a_0, a_1,\dots,a_{q-1}\in\mathbb{F}_q$. Then the following two conditions are equivalent.

(i) $a_0, a_1,\dots,a_{q-1}$ are distinct, i.e., $\mathbb{F}_q = \{a_0, a_1,\dots,a_{q-1}\}$.
(ii)
\[
\sum_{j=0}^{q-1}a_j^s = \begin{cases}0 & \text{if } 0\le s\le q-2,\\ -1 & \text{if } s = q-1.\end{cases}
\]
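As an illustration of Lemma 2.21, here is a small sketch that computes the power sums in condition (ii) for a sequence over $\mathbb{F}_p$ with $p$ prime (an assumption made so that field elements are plain integers modulo $p$). The helper names are hypothetical, and the convention $0^0 = 1$ is used, which is what Python's `pow` returns.

```python
def power_sums(seq, p):
    """Return [sum_j a_j^s mod p for s = 0, ..., p-1], using 0^0 = 1."""
    return [sum(pow(a, s, p) for a in seq) % p for s in range(p)]

def distinct_by_power_sums(seq, p):
    """Condition (ii) of Lemma 2.21 for a sequence of length p over F_p."""
    # 0 for 0 <= s <= p-2, and -1 (that is, p-1 mod p) for s = p-1
    expected = [0] * (p - 1) + [p - 1]
    return power_sums(seq, p) == expected
```

By the lemma, `distinct_by_power_sums(seq, p)` should agree with `len(set(seq)) == p` for every sequence of $p$ residues.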

Proof. For each $a\in\mathbb{F}_q$, let
\[
h_a = 1-(x-a)^{q-1}\in\mathbb{F}_q[x]. \tag{2.19}
\]
Clearly, $h_a$ maps $a$ to 1 and all $x\in\mathbb{F}_q\setminus\{a\}$ to 0. Thus $a_0,\dots,a_{q-1}\in\mathbb{F}_q$ are distinct if and only if
\[
f := \sum_{j=0}^{q-1}h_{a_j}
\]
maps all $x\in\mathbb{F}_q$ to 1. The latter condition is equivalent to $f\equiv 1\pmod{x^q-x}$, i.e., $f = 1$, since $\deg f < q$. It remains to show that condition (ii) is equivalent to the equation $f = 1$.

Since $(x-a)^q = x^q-a^q = (x-a)\sum_{i=0}^{q-1}a^{q-1-i}x^i$, we have $(x-a)^{q-1} = \sum_{i=0}^{q-1}a^{q-1-i}x^i$. Thus,
\[
h_a = 1-\sum_{i=0}^{q-1}a^{q-1-i}x^i \tag{2.20}
\]
and
\[
f = -\sum_{j=0}^{q-1}\sum_{i=0}^{q-1}a_j^{q-1-i}x^i = -\sum_{i=0}^{q-1}\Bigl(\sum_{j=0}^{q-1}a_j^{q-1-i}\Bigr)x^i. \tag{2.21}
\]
From (2.21), it is clear that $f = 1$ if and only if (ii) holds. □

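Lemma 2.22. For any $f\in\mathbb{F}_q[x]$,
\[
f\equiv-\sum_{i=1}^{q-1}\Bigl(\sum_{a\in\mathbb{F}_q}a^{q-1-i}f(a)\Bigr)x^i+f(0)\pmod{x^q-x}.
\]

Proof. For each $a\in\mathbb{F}_q$, let $h_a = 1-(x-a)^{q-1}\in\mathbb{F}_q[x]$, as in (2.19). Then $f(z) = \sum_{a\in\mathbb{F}_q}f(a)h_a(z)$ for all $z\in\mathbb{F}_q$. Hence,
\[
\begin{aligned}
f &\equiv \sum_{a\in\mathbb{F}_q}f(a)h_a \pmod{x^q-x}\\
&= \sum_{a\in\mathbb{F}_q}f(a)\Bigl(1-\sum_{i=0}^{q-1}a^{q-1-i}x^i\Bigr) \qquad\text{(by (2.20))}\\
&= -\sum_{i=1}^{q-1}\Bigl(\sum_{a\in\mathbb{F}_q}a^{q-1-i}f(a)\Bigr)x^i+f(0).
\end{aligned}
\]
□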
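Lemma 2.22 is in effect an interpolation formula: it recovers the coefficients of the reduced polynomial from the values $f(a)$. The sketch below illustrates this for $q = p$ prime (an assumption, so that arithmetic is modulo $p$); `coeffs_from_values` is a hypothetical helper name, and the convention $0^0 = 1$ is used for the coefficient of $x^{p-1}$.

```python
def coeffs_from_values(values, p):
    """Given values[a] = f(a) for a = 0..p-1, return the coefficients
    (constant term first) of the remainder of f modulo x^p - x."""
    coeffs = [values[0] % p]                     # constant term is f(0)
    for i in range(1, p):
        # coefficient of x^i is -sum_a a^(p-1-i) f(a)
        total = sum(pow(a, p - 1 - i, p) * values[a] for a in range(p))
        coeffs.append(-total % p)
    return coeffs
```

Evaluating a polynomial of degree less than $p$ at all points of $\mathbb{F}_p$ and feeding the values to `coeffs_from_values` should return the original coefficients, padded with zeros.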

Proof of Theorem 2.20. (⇒) (i) is obviously true. For $1\le s\le q-2$, by Lemma 2.22,
\[
f^s\equiv-\Bigl(\sum_{a\in\mathbb{F}_q}f(a)^s\Bigr)x^{q-1}+f_s\pmod{x^q-x} \tag{2.22}
\]
for some $f_s\in\mathbb{F}_q[x]$ with $\deg f_s < q-1$. Since $f(a)$, $a\in\mathbb{F}_q$, are all distinct, by Lemma 2.21, $\sum_{a\in\mathbb{F}_q}f(a)^s = 0$. Hence $f^s\equiv f_s\pmod{x^q-x}$.

(⇐) For $1\le s\le q-2$, since the remainder of $f^s$ (mod $x^q-x$) has degree $< q-1$, by (2.22),
\[
\sum_{a\in\mathbb{F}_q}f(a)^s = 0,\qquad 1\le s\le q-2.
\]
Obviously, $\sum_{a\in\mathbb{F}_q}f(a)^0 = q = 0$. Since $f(x)$ has only one root in $\mathbb{F}_q$, we have $\sum_{a\in\mathbb{F}_q}f(a)^{q-1} = q-1 = -1$. Now by Lemma 2.21, $f(a)$, $a\in\mathbb{F}_q$, are all distinct, proving that $f$ is a permutation polynomial of $\mathbb{F}_q$. □

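Corollary 2.23. Let $f\in\mathbb{F}_q[x]$. For each $1\le s\le q-1$, let $f^s\equiv f_s\pmod{x^q-x}$, where $f_s\in\mathbb{F}_q[x]$, $\deg f_s\le q-1$. Then $f$ is a permutation polynomial of $\mathbb{F}_q$ if and only if
\[
\deg f_s\ \begin{cases}\le q-2 & \text{if } 1\le s\le q-2,\\ = q-1 & \text{if } s = q-1.\end{cases} \tag{2.23}
\]

Proof. (⇒) By Theorem 2.20, $\deg f_s\le q-2$ for $1\le s\le q-2$. By Lemma 2.22, there is a $g\in\mathbb{F}_q[x]$ with $\deg g < q-1$ such that
\[
\begin{aligned}
f^{q-1} &\equiv -\Bigl(\sum_{a\in\mathbb{F}_q}f(a)^{q-1}\Bigr)x^{q-1}+g \pmod{x^q-x}\\
&= x^{q-1}+g \qquad\text{(by Lemma 2.21 (ii))}.
\end{aligned}
\]
Thus, $f_{q-1} = x^{q-1}+g$ has degree $q-1$.

(⇐) By Lemma 2.22, we see that (2.23) is equivalent to
\[
\sum_{a\in\mathbb{F}_q}f(a)^s = \begin{cases}0 & \text{if } 1\le s\le q-2,\\ c & \text{if } s = q-1\end{cases} \tag{2.24}
\]
for some $0\ne c\in\mathbb{F}_q$. Let
\[
F = \sum_{a\in\mathbb{F}_q}h_{f(a)},
\]
where $h_{f(a)}$ is defined in (2.19). By (2.21) and (2.24),
\[
F = -\sum_{i=0}^{q-1}\Bigl(\sum_{a\in\mathbb{F}_q}f(a)^{q-1-i}\Bigr)x^i = -c.
\]
Assume to the contrary that $f$ is not a permutation polynomial of $\mathbb{F}_q$. Then there exists $z\in\mathbb{F}_q\setminus f(\mathbb{F}_q)$. It follows that
\[
0 = \sum_{a\in\mathbb{F}_q}h_{f(a)}(z) = F(z) = -c,
\]
which is a contradiction. □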

Corollary 2.24. If $f\in\mathbb{F}_q[x]$ is such that $\deg f > 1$ and $\deg f\mid q-1$, then $f$ is not a permutation polynomial of $\mathbb{F}_q$.

Proof. Let $s = \frac{q-1}{\deg f}$. Then $1\le s < q-1$ and $\deg f^s = q-1$. By Theorem 2.20 (ii), $f$ is not a permutation polynomial of $\mathbb{F}_q$. □

There are several known families of permutation polynomials, the simplest one being $f(x) = x^k\in\mathbb{F}_q[x]$ where $(k, q-1) = 1$. In the next section, we will see a criterion for a linearized polynomial to be a permutation polynomial. In general, finding new permutation polynomials is a challenging task. In the rest of this section, we will introduce a remarkable family of permutation polynomials called Dickson polynomials.

Definition 2.25. Let $R$ be a commutative ring, $a\in R$ and $n\in\mathbb{N}$. The Dickson polynomial $D_n(x,a)\in R[x]$ is defined inductively by
\[
\begin{aligned}
D_0(x,a) &= 2,\\
D_1(x,a) &= x,\\
D_n(x,a) &= xD_{n-1}(x,a)-aD_{n-2}(x,a),\qquad n\ge 2.
\end{aligned}
\tag{2.25}
\]
Clearly, $D_n(x,a)$ is monic of degree $n$ for $n\ge 1$. For the first few $n$, we have
\[
\begin{aligned}
D_2(x,a) &= x^2-2a,\\
D_3(x,a) &= x^3-3ax,\\
D_4(x,a) &= x^4-4ax^2+2a^2.
\end{aligned}
\]

In fact, there is an explicit formula for Dn(x, a) for all n.

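Theorem 2.26. For $n\ge 1$, we have
\[
D_n(x,a) = \sum_{i=0}^{\lfloor n/2\rfloor}\frac{n}{n-i}\binom{n-i}{i}(-1)^ia^ix^{n-2i}. \tag{2.26}
\]

Note. Let $R = \mathbb{Z}$ and $a = 1$. Since $D_n(x,1)\in\mathbb{Z}[x]$, (2.26) implies that for $0\le i\le\lfloor\frac n2\rfloor$,
\[
\frac{n}{n-i}\binom{n-i}{i} = \frac{(n-i-1)!\,n}{i!\,(n-2i)!}\in\mathbb{Z}.
\]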
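The recurrence (2.25) and the explicit formula (2.26) are easy to compare by machine. The following sketch works over $R = \mathbb{Z}$ with an integer parameter $a$; the function names are illustrative, and polynomials are coefficient lists with the constant term first.

```python
from math import comb

def dickson_recurrence(n, a):
    """Coefficients of D_n(x, a) over Z, built from the recurrence (2.25)."""
    prev, cur = [2], [0, 1]                 # D_0 = 2, D_1 = x
    if n == 0:
        return prev
    for _ in range(2, n + 1):
        nxt = [0] + cur                     # x * D_{k-1}
        for e, c in enumerate(prev):
            nxt[e] -= a * c                 # minus a * D_{k-2}
        prev, cur = cur, nxt
    return cur

def dickson_formula(n, a):
    """Coefficients of D_n(x, a) for n >= 1 from the explicit formula (2.26)."""
    coeffs = [0] * (n + 1)
    for i in range(n // 2 + 1):
        # n/(n-i) * binom(n-i, i) is an integer, so the division is exact
        coeffs[n - 2 * i] = n * comb(n - i, i) // (n - i) * (-1) ** i * a ** i
    return coeffs
```

For instance, `dickson_recurrence(4, 3)` returns `[18, 0, -12, 0, 1]`, matching $D_4(x,3) = x^4-12x^2+18$.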


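Proof of Theorem 2.26. We use induction on $n$. For $n = 1, 2$, equation (2.26) is obviously true. For $n\ge 3$, we have
\[
\begin{aligned}
D_n(x,a) &= xD_{n-1}(x,a)-aD_{n-2}(x,a) \qquad\text{(by (2.25))}\\
&= \sum_{0\le i\le\frac{n-1}{2}}\frac{n-1}{n-1-i}\binom{n-1-i}{i}(-1)^ia^ix^{n-2i}\\
&\quad - a\sum_{0\le i\le\frac n2-1}\frac{n-2}{n-2-i}\binom{n-2-i}{i}(-1)^ia^ix^{n-2-2i} \qquad\text{(induction hypothesis)}\\
&= \sum_{0\le i\le\frac{n-1}{2}}\frac{n-1}{n-1-i}\binom{n-1-i}{i}(-1)^ia^ix^{n-2i}\\
&\quad + \sum_{1\le i\le\frac n2}\frac{n-2}{n-1-i}\binom{n-1-i}{i-1}(-1)^ia^ix^{n-2i}\\
&= \sum_{0\le i\le\frac n2}\Bigl[\frac{n-1}{n-1-i}\binom{n-1-i}{i}+\frac{n-2}{n-1-i}\binom{n-1-i}{i-1}\Bigr](-1)^ia^ix^{n-2i}.
\end{aligned}
\]
In the last step above, both sums are replaced with $\sum_{0\le i\le\frac n2}$ since the additional terms brought in are 0. To complete the induction, it remains to verify that
\[
\frac{n-1}{n-1-i}\binom{n-1-i}{i}+\frac{n-2}{n-1-i}\binom{n-1-i}{i-1} = \frac{n}{n-i}\binom{n-i}{i} \tag{2.27}
\]
for all $0\le i\le\frac n2$. When $i = 0$ or $\frac n2$, (2.27) clearly holds. When $0 < i < \frac n2$,
\[
\begin{aligned}
&\frac{n-1}{n-1-i}\binom{n-1-i}{i}+\frac{n-2}{n-1-i}\binom{n-1-i}{i-1}\\
&\qquad= \frac{1}{n-1-i}\Bigl[\frac{(n-1-i)!\,(n-1)}{i!\,(n-1-2i)!}+\frac{(n-1-i)!\,(n-2)}{(i-1)!\,(n-2i)!}\Bigr]\\
&\qquad= \frac{(n-1-i)!}{i!\,(n-2i)!\,(n-1-i)}\bigl[(n-2i)(n-1)+i(n-2)\bigr]\\
&\qquad= \frac{(n-1-i)!\,n}{i!\,(n-2i)!}\\
&\qquad= \frac{n}{n-i}\binom{n-i}{i}.
\end{aligned}
\]
□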

Proposition 2.27. Let $R$ be a commutative ring and let $a\in R$. In the ring of Laurent polynomials $R[y, y^{-1}]$, we have
\[
D_n(y+ay^{-1}, a) = y^n+a^ny^{-n} \tag{2.28}
\]
for all $n\ge 1$.

Proof. Again, we use induction on $n$. When $n = 0, 1$, (2.28) is obviously true. For $n\ge 2$, we have
\[
\begin{aligned}
D_n(y+ay^{-1}, a) &= (y+ay^{-1})\bigl(y^{n-1}+a^{n-1}y^{-(n-1)}\bigr)-a\bigl(y^{n-2}+a^{n-2}y^{-(n-2)}\bigr) \qquad\text{(induction hypothesis)}\\
&= y^n+a^ny^{-n}.
\end{aligned}
\]
□

Now, we are ready to state the main theorem on the Dickson polynomials as permutation polynomials of $\mathbb{F}_q$.

Theorem 2.28. Let $a\in\mathbb{F}_q^*$ and $n\in\mathbb{Z}^+$. Then $D_n(x,a)\in\mathbb{F}_q[x]$ is a permutation polynomial of $\mathbb{F}_q$ if and only if $(n, q^2-1) = 1$.

Note. In Theorem 2.28, if $a = 0$, $D_n(x,0) = x^n$, which is a permutation polynomial of $\mathbb{F}_q$ if and only if $(n, q-1) = 1$.
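Theorem 2.28 can be checked by brute force for small prime fields. The sketch below assumes $q = p$ is prime, so that $\mathbb{F}_p$ is the integers modulo $p$, and evaluates $D_n(x,a)$ at a point directly from the recurrence (2.25); the helper names are hypothetical.

```python
from math import gcd

def dickson_value(n, x, a, p):
    """Evaluate D_n(x, a) modulo the prime p via the recurrence (2.25)."""
    d_prev, d_cur = 2 % p, x % p            # D_0 and D_1 at the point x
    if n == 0:
        return d_prev
    for _ in range(n - 1):
        d_prev, d_cur = d_cur, (x * d_cur - a * d_prev) % p
    return d_cur

def dickson_permutes(n, a, p):
    """Brute-force test: is x -> D_n(x, a) a permutation of F_p?"""
    return len({dickson_value(n, x, a, p) for x in range(p)}) == p

def predicted_by_theorem(n, p):
    """The condition (n, q^2 - 1) = 1 of Theorem 2.28 with q = p."""
    return gcd(n, p * p - 1) == 1
```

For example, with $p = 3$ and $a = 1$, `dickson_permutes(5, 1, 3)` and `predicted_by_theorem(5, 3)` should both be true, while for $n = 2$ both should be false.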

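Before proving Theorem 2.28, we observe a simple fact. Let $F$ be any field and let $y_1, y_2, a\in F^*$. Then
\[
y_1+\frac{a}{y_1} = y_2+\frac{a}{y_2}\quad\text{if and only if}\quad y_1 = y_2\ \text{or}\ y_1 = \frac{a}{y_2}.
\]
In fact, $y_1+\frac{a}{y_1}-\bigl(y_2+\frac{a}{y_2}\bigr) = (y_1-y_2)\bigl(1-\frac{a}{y_1y_2}\bigr)$.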

Proof of Theorem 2.28. Let $Y = \{y\in\mathbb{F}_{q^2}^* : y^{q-1} = 1 \text{ or } y^{q+1} = a\}$. We claim that
\[
\mathbb{F}_q = \Bigl\{y+\frac ay : y\in Y\Bigr\}.
\]
For any $y\in Y$, we have $y^q = y$ or $y^q = \frac ay$. Thus, $(y+\frac ay)^q = y^q+\frac{a}{y^q} = y+\frac ay$. So $y+\frac ay\in\mathbb{F}_q$. On the other hand, for any $b\in\mathbb{F}_q$, the equation $y+\frac ay = b$, i.e., $y^2-by+a = 0$, has a solution $y\in\mathbb{F}_{q^2}^*$. Since $y^q+\frac{a}{y^q} = b^q = b = y+\frac ay$, by the above fact, $y^q = y$ or $\frac ay$, i.e., $y\in Y$.

(⇐) Assume $D_n(x_1,a) = D_n(x_2,a)$ for some $x_1, x_2\in\mathbb{F}_q$. We show that $x_1 = x_2$. From the above, we can write $x_i = y_i+\frac{a}{y_i}$ for some $y_i\in\mathbb{F}_{q^2}^*$, $i = 1, 2$. By Proposition 2.27,
\[
y_1^n+\frac{a^n}{y_1^n} = D_n(x_1,a) = D_n(x_2,a) = y_2^n+\frac{a^n}{y_2^n}.
\]
Thus, $y_1^n = y_2^n$ or $\frac{a^n}{y_2^n}$. Since $(n, q^2-1) = 1$, we have $y_1 = y_2$ or $\frac{a}{y_2}$, implying that $x_1 = x_2$.

(⇒) Assume to the contrary that $(n, q^2-1) > 1$. We show that $D_n(x,a)$ is not a one-to-one function on $\mathbb{F}_q$.

If $(n, q^2-1)$ is even, then $n$ is even and $q$ is odd. By (2.26), $D_n(-1,a) = D_n(1,a)$, where $-1\ne 1$ in $\mathbb{F}_q$.

So assume that $(n, q^2-1)$ is odd and thus has an odd prime factor $r$. Of course, $r\mid q-1$ or $r\mid q+1$. It suffices to show that there exist $y_1, y_2\in Y$ such that $y_1\notin\{y_2, \frac{a}{y_2}\}$ but $y_1^n = y_2^n$. In fact, from this, we have
\[
D_n\Bigl(y_1+\frac{a}{y_1}, a\Bigr) = y_1^n+\frac{a^n}{y_1^n} = y_2^n+\frac{a^n}{y_2^n} = D_n\Bigl(y_2+\frac{a}{y_2}, a\Bigr),
\]
where $y_i+\frac{a}{y_i}\in\mathbb{F}_q$, $i = 1, 2$, are distinct.

If $r\mid q-1$, let $y_2 = 1$. Since $y^r = 1$ has $r\ge 3$ solutions in $\mathbb{F}_q^*$, there exists $y_1\in\mathbb{F}_q^*\subset Y$ such that $y_1^r = 1$ and $y_1\notin\{y_2, \frac{a}{y_2}\}$. Clearly, $y_1^n = 1 = y_2^n$.

If $r\mid q+1$, write $a = b^{q+1}$ for some $b\in\mathbb{F}_{q^2}^*$ and let $y_2 = b$. (Since $\mathbb{F}_q^*$ is the subgroup of the cyclic group $\mathbb{F}_{q^2}^*$ of index $q+1$, every element of $\mathbb{F}_q^*$ is a $(q+1)$st power of some element of $\mathbb{F}_{q^2}^*$.) Since $y_2^{q+1} = a$, $y_2\in Y$. Since $y^r = b^r$ has $r\ge 3$ solutions in $\mathbb{F}_{q^2}^*$, there exists $y_1\in\mathbb{F}_{q^2}^*$ such that $y_1^r = b^r$ and $y_1\notin\{y_2, \frac{a}{y_2}\}$. Since $y_1^r = b^r$ and $r\mid q+1$, we have $y_1^{q+1} = b^{q+1} = a$; hence $y_1\in Y$. Clearly, $y_1^n = b^n = y_2^n$. □

2.5. Linearized Polynomials

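Definition 2.29. Let $q > 1$ be a prime power and $n\in\mathbb{Z}^+$. An $\mathbb{F}_q$-linearized polynomial (or simply a $q$-polynomial) over $\mathbb{F}_{q^n}$ is a polynomial of the form
\[
L(x) = \sum_{i=0}^{k}a_ix^{q^i}\in\mathbb{F}_{q^n}[x]. \tag{2.29}
\]

The polynomial function $L : \mathbb{F}_{q^n}\to\mathbb{F}_{q^n}$ defined by the polynomial $L(x)$ in (2.29) is an $\mathbb{F}_q$-map. In fact, $L = \sum_{i=0}^{k}a_i\tau^i\in\mathrm{Hom}_{\mathbb{F}_q}(\mathbb{F}_{q^n},\mathbb{F}_{q^n})$, where $\tau$ is the Frobenius map of $\mathbb{F}_{q^n}/\mathbb{F}_q$. Denote by $\mathcal{L}(q,n)$ the set of all $q$-polynomials in $\mathbb{F}_{q^n}[x]$ and by $\mathcal{L}_k(q,n)$ the set of all $q$-polynomials of degree $\le q^k$ in $\mathbb{F}_{q^n}[x]$. The following proposition shows that every element in $\mathrm{Hom}_{\mathbb{F}_q}(\mathbb{F}_{q^n},\mathbb{F}_{q^n})$ is represented by a $q$-polynomial in $\mathcal{L}_{n-1}(q,n)$. The representation is unique since the polynomials in $\mathcal{L}_{n-1}(q,n)$ are of degree $\le q^{n-1} < q^n$.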

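Proposition 2.30. We have
\[
\mathrm{Hom}_{\mathbb{F}_q}(\mathbb{F}_{q^n},\mathbb{F}_{q^n}) = \mathcal{L}_{n-1}(q,n),
\]
where the polynomials in $\mathcal{L}_{n-1}(q,n)$ are treated as functions from $\mathbb{F}_{q^n}$ to $\mathbb{F}_{q^n}$.

Proof. We already saw that $\mathcal{L}_{n-1}(q,n)\subset\mathrm{Hom}_{\mathbb{F}_q}(\mathbb{F}_{q^n},\mathbb{F}_{q^n})$. Also,
\[
\dim_{\mathbb{F}_q}\mathcal{L}_{n-1}(q,n) = n\dim_{\mathbb{F}_{q^n}}\mathcal{L}_{n-1}(q,n) = n^2 = \dim_{\mathbb{F}_q}\mathrm{Hom}_{\mathbb{F}_q}(\mathbb{F}_{q^n},\mathbb{F}_{q^n}).
\]
(In the above, $\dim_{\mathbb{F}_{q^n}}\mathcal{L}_{n-1}(q,n) = n$ since $x^{q^0},\dots,x^{q^{n-1}}$ is an $\mathbb{F}_{q^n}$-basis of $\mathcal{L}_{n-1}(q,n)$.) Therefore, $\mathrm{Hom}_{\mathbb{F}_q}(\mathbb{F}_{q^n},\mathbb{F}_{q^n}) = \mathcal{L}_{n-1}(q,n)$. □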

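We now determine the dimension of the range of a $q$-polynomial. For
\[
f(x) = a_0x+a_1x^q+\cdots+a_{n-1}x^{q^{n-1}}\in\mathcal{L}_{n-1}(q,n),
\]
define
\[
A(f) = \begin{bmatrix}
a_0 & a_1 & \cdots & a_{n-1}\\
a_{n-1}^q & a_0^q & \cdots & a_{n-2}^q\\
\vdots & \vdots & \ddots & \vdots\\
a_1^{q^{n-1}} & a_2^{q^{n-1}} & \cdots & a_0^{q^{n-1}}
\end{bmatrix}.
\]

Theorem 2.31. In the above notation, we have
\[
\operatorname{rank}A(f) = \dim_{\mathbb{F}_q}f(\mathbb{F}_{q^n}).
\]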


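Proof. Let
\[
V = \Bigl\{\bigl(z, z^q,\dots,z^{q^{n-1}}\bigr)^T : z\in\mathbb{F}_{q^n}\Bigr\}\subset\mathbb{F}_{q^n}^n
\]
and define an $\mathbb{F}_q$-isomorphism
\[
\iota : \mathbb{F}_{q^n}\longrightarrow V,\qquad z\longmapsto\bigl(z, z^q,\dots,z^{q^{n-1}}\bigr)^T.
\]
The matrix $A(f)$ defines an $\mathbb{F}_{q^n}$-linear map from $\mathbb{F}_{q^n}^n$ to $\mathbb{F}_{q^n}^n$ by left multiplication. We treat the subscript $i$ of $a_i$ as an integer modulo $n$. For each $z\in\mathbb{F}_{q^n}$, we have
\[
\begin{aligned}
A(f)\begin{bmatrix}z\\ z^q\\ \vdots\\ z^{q^{n-1}}\end{bmatrix}
&= \begin{bmatrix}\sum_{i=0}^{n-1}a_iz^{q^i}\\ \sum_{i=0}^{n-1}a_{i-1}^{q}z^{q^i}\\ \vdots\\ \sum_{i=0}^{n-1}a_{i-(n-1)}^{q^{n-1}}z^{q^i}\end{bmatrix}
= \begin{bmatrix}\sum_{i=0}^{n-1}a_iz^{q^i}\\ \sum_{i=0}^{n-1}a_i^{q}z^{q^{i+1}}\\ \vdots\\ \sum_{i=0}^{n-1}a_i^{q^{n-1}}z^{q^{i+n-1}}\end{bmatrix}
\qquad(\text{since } z^{q^n} = z)\\
&= \begin{bmatrix}\sum_{i=0}^{n-1}a_iz^{q^i}\\ \bigl(\sum_{i=0}^{n-1}a_iz^{q^i}\bigr)^q\\ \vdots\\ \bigl(\sum_{i=0}^{n-1}a_iz^{q^i}\bigr)^{q^{n-1}}\end{bmatrix}
= \begin{bmatrix}f(z)\\ f(z)^q\\ \vdots\\ f(z)^{q^{n-1}}\end{bmatrix}.
\end{aligned}
\tag{2.30}
\]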

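In particular, $A(f)(V)\subset V$.

By Theorem 1.28, there exists an $\mathbb{F}_q$-map $\varphi : V\otimes_{\mathbb{F}_q}\mathbb{F}_{q^n}\to\mathbb{F}_{q^n}^n$ such that
\[
\varphi\Bigl(\bigl(z, z^q,\dots,z^{q^{n-1}}\bigr)^T\otimes b\Bigr) = \bigl(z, z^q,\dots,z^{q^{n-1}}\bigr)^T b\qquad\text{for all } z, b\in\mathbb{F}_{q^n}.
\]
We claim that $\varphi$ is an isomorphism. Let $z_1,\dots,z_n$ be an $\mathbb{F}_q$-basis of $\mathbb{F}_{q^n}$. By Lemma 1.37,
\[
\bigl(z_i, z_i^q,\dots,z_i^{q^{n-1}}\bigr)^T,\qquad 1\le i\le n,
\]
form a basis of $\mathbb{F}_{q^n}^n$ over $\mathbb{F}_{q^n}$. Thus $\varphi$ is onto. Since
\[
\dim_{\mathbb{F}_q}V\otimes_{\mathbb{F}_q}\mathbb{F}_{q^n} = (\dim_{\mathbb{F}_q}V)(\dim_{\mathbb{F}_q}\mathbb{F}_{q^n}) = n^2 = \dim_{\mathbb{F}_q}\mathbb{F}_{q^n}^n,
\]
$\varphi$ is an isomorphism.

We have the following commutative diagram.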

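\[
\begin{array}{ccc}
\mathbb{F}_{q^n}\otimes_{\mathbb{F}_q}\mathbb{F}_{q^n} & \xrightarrow{\ f\otimes 1\ } & \mathbb{F}_{q^n}\otimes_{\mathbb{F}_q}\mathbb{F}_{q^n}\\[2pt]
{\scriptstyle\iota\otimes 1}\big\downarrow{\scriptstyle\cong} & & {\scriptstyle\iota\otimes 1}\big\downarrow{\scriptstyle\cong}\\[2pt]
V\otimes_{\mathbb{F}_q}\mathbb{F}_{q^n} & \xrightarrow{\ [A(f)|_V]\otimes 1\ } & V\otimes_{\mathbb{F}_q}\mathbb{F}_{q^n}\\[2pt]
{\scriptstyle\varphi}\big\downarrow{\scriptstyle\cong} & & {\scriptstyle\varphi}\big\downarrow{\scriptstyle\cong}\\[2pt]
\mathbb{F}_{q^n}^n & \xrightarrow{\ A(f)\ } & \mathbb{F}_{q^n}^n
\end{array}
\tag{2.31}
\]
The commutativity of the bottom square in (2.31) is obvious. To see the commutativity of the top square in (2.31), observe that for $z, b\in\mathbb{F}_{q^n}$,
\[
\bigl[(\iota\otimes 1)\circ(f\otimes 1)\bigr](z\otimes b) = \bigl(f(z), f(z)^q,\dots,f(z)^{q^{n-1}}\bigr)^T\otimes b,
\]
\[
\bigl[\bigl([A(f)|_V]\otimes 1\bigr)\circ(\iota\otimes 1)\bigr](z\otimes b) = \Bigl(A(f)\bigl(z, z^q,\dots,z^{q^{n-1}}\bigr)^T\Bigr)\otimes b
\]
and, by (2.30),
\[
A(f)\bigl(z, z^q,\dots,z^{q^{n-1}}\bigr)^T = \bigl(f(z), f(z)^q,\dots,f(z)^{q^{n-1}}\bigr)^T.
\]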

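Now, by (2.31),
\[
\begin{aligned}
\operatorname{rank}A(f) &= \dim_{\mathbb{F}_{q^n}}\bigl(A(f)(\mathbb{F}_{q^n}^n)\bigr)\\
&= \dim_{\mathbb{F}_{q^n}}\bigl[(f\otimes 1)(\mathbb{F}_{q^n}\otimes_{\mathbb{F}_q}\mathbb{F}_{q^n})\bigr]\\
&= \dim_{\mathbb{F}_{q^n}}\bigl[f(\mathbb{F}_{q^n})\otimes_{\mathbb{F}_q}\mathbb{F}_{q^n}\bigr]\\
&= \frac 1n\dim_{\mathbb{F}_q}\bigl[f(\mathbb{F}_{q^n})\otimes_{\mathbb{F}_q}\mathbb{F}_{q^n}\bigr]\\
&= \frac 1n\bigl(\dim_{\mathbb{F}_q}f(\mathbb{F}_{q^n})\bigr)\bigl(\dim_{\mathbb{F}_q}\mathbb{F}_{q^n}\bigr)\\
&= \dim_{\mathbb{F}_q}f(\mathbb{F}_{q^n}).
\end{aligned}
\]
□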

Corollary 2.32. A $q$-polynomial
\[
f(x) = a_0x+a_1x^q+\cdots+a_{n-1}x^{q^{n-1}}\in\mathcal{L}_{n-1}(q,n)
\]
is a permutation polynomial of $\mathbb{F}_{q^n}$ if and only if $A(f)$ is nonsingular.

Proof. Immediate from Theorem 2.31. □

The set $\mathcal{L}(q,n)$ of all $q$-polynomials in $\mathbb{F}_{q^n}[x]$ is not closed under multiplication but is closed under composition. We will see that $(\mathcal{L}(q,n), +, \circ)$ is a ring which is isomorphic to the so-called skew polynomial ring over $\mathbb{F}_{q^n}$.

Definition 2.33. Let $F$ be a field and $\sigma\in\mathrm{Aut}(F)$. The skew polynomial ring $F[x;\sigma]$ is a ring such that

(i) $(F[x;\sigma], +) = (F[x], +)$,
(ii) the multiplication in $F[x;\sigma]$ is defined by the distributivity, associativity and the rule that
\[
xa = \sigma(a)x\qquad\text{for all } a\in F.
\]

Clearly, $F[x;\sigma]$ is commutative if and only if $\sigma = \mathrm{id}_F$. When $\sigma = \mathrm{id}_F$, $F[x;\sigma] = F[x]$.

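Example 2.34. Let $\sigma$ be the Frobenius map of $\mathbb{F}_{q^n}/\mathbb{F}_q$ and let $f = \sum_{i=0}^{s}a_ix^i$, $g = \sum_{j=0}^{t}b_jx^j\in\mathbb{F}_{q^n}[x;\sigma]$. Then
\[
fg = \sum_{k=0}^{s+t}\sum_{i+j=k}a_ix^ib_jx^j = \sum_{k=0}^{s+t}\sum_{i+j=k}a_ib_j^{q^i}x^{i+j} = \sum_{k=0}^{s+t}\Bigl(\sum_{i+j=k}a_ib_j^{q^i}\Bigr)x^k.
\]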

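Theorem 2.35. Let $\sigma$ be the Frobenius map of $\mathbb{F}_{q^n}/\mathbb{F}_q$. The map
\[
\varphi : \mathbb{F}_{q^n}[x;\sigma]\longrightarrow\mathcal{L}(q,n),\qquad \sum_{i=0}^{s}a_ix^i\longmapsto\sum_{i=0}^{s}a_ix^{q^i}
\]
is a ring isomorphism from $\mathbb{F}_{q^n}[x;\sigma]$ to $(\mathcal{L}(q,n), +, \circ)$.

Proof. The only thing that needs proof is that $\varphi$ preserves the multiplications of the rings. Let $f = \sum_{i=0}^{s}a_ix^i$, $g = \sum_{j=0}^{t}b_jx^j\in\mathbb{F}_{q^n}[x;\sigma]$. We have
\[
\begin{aligned}
\varphi(fg) &= \varphi\Bigl(\sum_{k=0}^{s+t}\Bigl(\sum_{i+j=k}a_ib_j^{q^i}\Bigr)x^k\Bigr) \qquad\text{(by Example 2.34)}\\
&= \sum_{k=0}^{s+t}\Bigl(\sum_{i+j=k}a_ib_j^{q^i}\Bigr)x^{q^k}\\
&= \sum_{i=0}^{s}\sum_{j=0}^{t}a_ib_j^{q^i}x^{q^{i+j}}\\
&= \sum_{i=0}^{s}a_i\Bigl(\sum_{j=0}^{t}b_jx^{q^j}\Bigr)^{q^i}\\
&= \varphi(f)\circ\varphi(g).
\end{aligned}
\]
□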
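The correspondence between skew multiplication and composition can be illustrated in a small field. The sketch below takes $q = 2$, $n = 3$ and models $\mathbb{F}_8$ as $\mathbb{F}_2[t]/(t^3+t+1)$, with elements stored as 3-bit integers; this model and all function names are assumptions made for the illustration. Skew polynomials are coefficient lists $[a_0, a_1, \dots]$.

```python
MODULUS = 0b1011  # t^3 + t + 1, irreducible over F_2

def gf8_mul(a, b):
    """Multiply two elements of F_8 (3-bit integers, addition is XOR)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:          # reduce as soon as degree 3 appears
            a ^= MODULUS
    return result

def frobenius(a, i):
    """Apply the Frobenius map i times: a -> a^(2^i)."""
    for _ in range(i):
        a = gf8_mul(a, a)
    return a

def skew_mul(f, g):
    """Product in F_8[x; sigma], using a_i x^i b_j x^j = a_i b_j^(2^i) x^(i+j)."""
    out = [0] * (len(f) + len(g) - 1)
    for i, ai in enumerate(f):
        for j, bj in enumerate(g):
            out[i + j] ^= gf8_mul(ai, frobenius(bj, i))
    return out

def eval_linearized(f, z):
    """Evaluate the 2-polynomial sum_i a_i z^(2^i) attached to f."""
    value = 0
    for i, ai in enumerate(f):
        value ^= gf8_mul(ai, frobenius(z, i))
    return value
```

With these helpers, `eval_linearized(skew_mul(f, g), z)` should equal `eval_linearized(f, eval_linearized(g, z))` for every `z` in `range(8)`, which is the statement of Theorem 2.35 read at the level of functions.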

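Corollary 2.36. The map
\[
\varphi : \mathbb{F}_q[x]\longrightarrow\mathcal{L}(q,1),\qquad \sum_{i=0}^{s}a_ix^i\longmapsto\sum_{i=0}^{s}a_ix^{q^i}
\]
is a ring isomorphism.

Proof. The Frobenius map of $\mathbb{F}_q/\mathbb{F}_q$ is $\mathrm{id}_{\mathbb{F}_q}$ and $\mathbb{F}_q[x;\mathrm{id}_{\mathbb{F}_q}] = \mathbb{F}_q[x]$. □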

2.6. Payne’s Theorem

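In this section, we use Corollary 2.32 to prove (a generalization of) a theorem by Payne [11]. Let $f(x) = \sum_{i=0}^{n-1}a_ix^{q^i}\in\mathbb{F}_{q^n}[x]$ be a $q$-polynomial. Then
\[
\frac{f(x)}{x} = \sum_{i=0}^{n-1}a_ix^{q^i-1}.
\]
For each $x\in\mathbb{F}_{q^n}^*$, let $\bar x$ denote the image of $x$ in $\mathbb{F}_{q^n}^*/\mathbb{F}_q^*$. Since $f : \mathbb{F}_{q^n}\to\mathbb{F}_{q^n}$ is $\mathbb{F}_q$-linear, the map
\[
\mathbb{F}_{q^n}^*/\mathbb{F}_q^*\longrightarrow\mathbb{F}_{q^n}^*,\qquad \bar x\longmapsto\frac{f(x)}{x},\quad x\in\mathbb{F}_{q^n}^*,
\]
is well defined. If the map
\[
\mathbb{F}_{q^n}^*/\mathbb{F}_q^*\longrightarrow\mathbb{F}_{q^n}^*/\mathbb{F}_q^*,\qquad \bar x\longmapsto\overline{\Bigl(\frac{f(x)}{x}\Bigr)},\quad x\in\mathbb{F}_{q^n}^*,
\]
is a permutation of $\mathbb{F}_{q^n}^*/\mathbb{F}_q^*$, we say that the function $f(x)/x$, $x\in\mathbb{F}_{q^n}^*$, is a permutation of $\mathbb{F}_{q^n}^*/\mathbb{F}_q^*$. Thus, $f(x)/x$ is a permutation of $\mathbb{F}_{q^n}^*/\mathbb{F}_q^*$ if and only if for each $b\in\mathbb{F}_{q^n}^*$, there exist $z\in\mathbb{F}_{q^n}^*$ and $\varepsilon\in\mathbb{F}_q^*$ such that $\frac{f(z)}{z} = \varepsilon b$.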


Theorem 2.37 (Generalized Payne's Theorem). Let $f(x) = \sum_{i=0}^{n-1}a_ix^{q^i}\in\mathbb{F}_{q^n}[x]$ be a $q$-polynomial. Then $f(x)/x$ is a permutation of $\mathbb{F}_{q^n}^*/\mathbb{F}_q^*$ if and only if $(n, q-1) = 1$ and $f(x) = ax^{q^k}$, where $a\in\mathbb{F}_{q^n}^*$ and $k$ is an integer such that $1\le k\le n-1$ and $(k,n) = 1$.

Lemma 2.38. Let $f(x) = \sum_{i=0}^{n-1}a_ix^{q^i}\in\mathbb{F}_{q^n}[x]$ be a $q$-polynomial such that $f(x)/x$ is a permutation of $\mathbb{F}_{q^n}^*/\mathbb{F}_q^*$. Then the determinants of the principal submatrices of
\[
A(f) = \begin{bmatrix}
a_0 & a_1 & \cdots & a_{n-1}\\
a_{n-1}^q & a_0^q & \cdots & a_{n-2}^q\\
\vdots & \vdots & \ddots & \vdots\\
a_1^{q^{n-1}} & a_2^{q^{n-1}} & \cdots & a_0^{q^{n-1}}
\end{bmatrix}
\]
of size $m\times m$ ($1\le m\le n-1$) are all 0.

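Proof. Let
\[
D(x) = \begin{vmatrix}
a_0+x & a_1 & \cdots & a_{n-1}\\
a_{n-1}^q & (a_0+x)^q & \cdots & a_{n-2}^q\\
\vdots & \vdots & \ddots & \vdots\\
a_1^{q^{n-1}} & a_2^{q^{n-1}} & \cdots & (a_0+x)^{q^{n-1}}
\end{vmatrix}\in\mathbb{F}_{q^n}[x].
\]
For each $b\in\mathbb{F}_{q^n}^*$, since $f(x)/x$ is a permutation of $\mathbb{F}_{q^n}^*/\mathbb{F}_q^*$, there exist $z\in\mathbb{F}_{q^n}^*$ and $\varepsilon\in\mathbb{F}_q^*$ such that $\frac{f(z)}{z} = -\varepsilon b$. Thus $z$ is a root of
\[
(a_0+\varepsilon b)x+a_1x^q+\cdots+a_{n-1}x^{q^{n-1}}. \tag{2.32}
\]
Since the polynomial in (2.32) has at least two roots 0 and $z$, it is not a permutation polynomial of $\mathbb{F}_{q^n}$. It follows from Corollary 2.32 that $D(\varepsilon b) = 0$. Therefore for every $b\in\mathbb{F}_{q^n}^*$, $\prod_{\varepsilon\in\mathbb{F}_q^*}D(\varepsilon b) = 0$, which implies that
\[
\prod_{\varepsilon\in\mathbb{F}_q^*}D(\varepsilon x) = \delta\bigl(x^{q^n-1}-1\bigr) \tag{2.33}
\]
for some $\delta\in\mathbb{F}_{q^n}^*$. (In fact, $\delta = -1$, although this fact is not needed in the proof. This is because $D(0)$ is invariant under the Frobenius map of $\mathbb{F}_{q^n}/\mathbb{F}_q$ and $-\delta = (D(0))^{q-1} = 1$.)

Let $0\le i_1 < i_2 < \cdots < i_m\le n-1$ with $1\le m\le n-1$. Write $\{0,\dots,n-1\}\setminus\{i_1,\dots,i_m\} = \{j_1,\dots,j_s\}$ with $0\le j_1 < \cdots < j_s\le n-1$. We try to determine the coefficient of $x^{(q-1)q^{j_1}+\cdots+(q-1)q^{j_s}}$ in $\prod_{\varepsilon\in\mathbb{F}_q^*}D(\varepsilon x)$ through the expansion of $D(\varepsilon x)$. In the expansion of
\[
D(\varepsilon x) = \begin{vmatrix}
a_0+\varepsilon x & a_1 & \cdots & a_{n-1}\\
a_{n-1}^q & a_0^q+\varepsilon x^q & \cdots & a_{n-2}^q\\
\vdots & \vdots & \ddots & \vdots\\
a_1^{q^{n-1}} & a_2^{q^{n-1}} & \cdots & a_0^{q^{n-1}}+\varepsilon x^{q^{n-1}}
\end{vmatrix}
\]
as a polynomial of $x$, each term is of the form
\[
a\,x^{c_0q^0+\cdots+c_{n-1}q^{n-1}},\qquad a\in\mathbb{F}_{q^n},\ c_0,\dots,c_{n-1}\in\{0,1\}.
\]
Therefore, we can write
\[
D(\varepsilon x) = \sum_{c_0,\dots,c_{n-1}\in\{0,1\}}a_{c_0q^0+\cdots+c_{n-1}q^{n-1}}(\varepsilon)\,x^{c_0q^0+\cdots+c_{n-1}q^{n-1}}, \tag{2.34}
\]
where $a_{c_0q^0+\cdots+c_{n-1}q^{n-1}}(\varepsilon)\in\mathbb{F}_{q^n}$. The coefficient of $x^{q^{j_1}+\cdots+q^{j_s}}$ in the expansion of $D(\varepsilon x)$ is $\varepsilon^s\det\bigl(A(f)(i_1,\dots,i_m)\bigr)$, where $A(f)(i_1,\dots,i_m)$ is the principal submatrix of $A(f)$ with row and column indices $i_1,\dots,i_m$, i.e., the submatrix of $A(f)$ obtained by deleting rows and columns with indices other than $i_1,\dots,i_m$. Thus in (2.34), $a_{q^{j_1}+\cdots+q^{j_s}}(\varepsilon) = \varepsilon^s\det\bigl(A(f)(i_1,\dots,i_m)\bigr)$. By the uniqueness of the $q$-adic expansion of the integer $(q-1)q^{j_1}+\cdots+(q-1)q^{j_s}$, we see the coefficient of $x^{(q-1)q^{j_1}+\cdots+(q-1)q^{j_s}}$ in
\[
\prod_{\varepsilon\in\mathbb{F}_q^*}D(\varepsilon x) = \prod_{\varepsilon\in\mathbb{F}_q^*}\Bigl(\sum_{c_0,\dots,c_{n-1}\in\{0,1\}}a_{c_0q^0+\cdots+c_{n-1}q^{n-1}}(\varepsilon)\,x^{c_0q^0+\cdots+c_{n-1}q^{n-1}}\Bigr)
\]
equals
\[
\prod_{\varepsilon\in\mathbb{F}_q^*}a_{q^{j_1}+\cdots+q^{j_s}}(\varepsilon) = \bigl[\det\bigl(A(f)(i_1,\dots,i_m)\bigr)\bigr]^{q-1}\prod_{\varepsilon\in\mathbb{F}_q^*}\varepsilon^s = \bigl[\det\bigl(A(f)(i_1,\dots,i_m)\bigr)\bigr]^{q-1}(-1)^s.
\]
Comparing the coefficients of $x^{(q-1)q^{j_1}+\cdots+(q-1)q^{j_s}}$ in the two sides of (2.33), we have $\det\bigl(A(f)(i_1,\dots,i_m)\bigr) = 0$. □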

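Proof of Theorem 2.37. (⇐) Since $\mathbb{F}_{q^n}^*/\mathbb{F}_q^*$ is a cyclic group of order $\frac{q^n-1}{q-1}$ and since
\[
\begin{aligned}
\Bigl(q^k-1, \frac{q^n-1}{q-1}\Bigr) &= \Bigl((q^k-1, q^n-1), \frac{q^n-1}{q-1}\Bigr)\\
&= \Bigl(q-1, \frac{q^n-1}{q-1}\Bigr) \qquad(\text{since } (k,n) = 1)\\
&= (q-1, 1+q+\cdots+q^{n-1})\\
&= (q-1, n)\\
&= 1,
\end{aligned}
\]
the map $(\ )\mapsto(\ )^{q^k-1}$ is a permutation of $\mathbb{F}_{q^n}^*/\mathbb{F}_q^*$. Thus, $\frac{f(x)}{x} = ax^{q^k-1}$ is a permutation of $\mathbb{F}_{q^n}^*/\mathbb{F}_q^*$.

(⇒) It suffices to show that $f(x)$ has exactly one nonzero coefficient. By Lemma 2.38, the determinants of the principal submatrices of
\[
A(f) = \begin{bmatrix}
a_0 & a_1 & \cdots & a_{n-1}\\
a_{n-1}^q & a_0^q & \cdots & a_{n-2}^q\\
\vdots & \vdots & \ddots & \vdots\\
a_1^{q^{n-1}} & a_2^{q^{n-1}} & \cdots & a_0^{q^{n-1}}
\end{bmatrix}
\]
of sizes $1\times 1, 2\times 2,\dots,(n-1)\times(n-1)$ are all 0. Observe that
\[
A(f) = [b_{ij}]_{0\le i,j\le n-1},
\]
where
\[
b_{ij} = 0\ \text{if and only if}\ a_{j-i} = 0, \tag{2.35}
\]
where the subscript $j-i$ of $a_{j-i}$ is taken modulo $n$.

We claim that if $i_1+\cdots+i_m\equiv 0\pmod n$ ($1\le m\le n-1$), then
\[
a_{i_1}\cdots a_{i_m} = 0. \tag{2.36}
\]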

To prove (2.36), we use induction on $m$. The case $m = 1$ is obvious. Assume to the contrary that $i_1+\cdots+i_m\equiv 0\pmod n$ but $a_{i_1}\cdots a_{i_m}\ne 0$. We may assume that $0, i_1, i_1+i_2,\dots,i_1+\cdots+i_{m-1}$ are all distinct modulo $n$. (Otherwise, $i_s+\cdots+i_t\equiv 0\pmod n$ for some $1\le s < t\le m-1$. By the induction hypothesis, $a_{i_s}\cdots a_{i_t} = 0$, which is a contradiction.) Consider the principal submatrix of $A(f)$ with row and column indices $j_0 = 0$, $j_1 = i_1$, $j_2 = i_1+i_2$, \dots, $j_{m-1} = i_1+\cdots+i_{m-1}$:
\[
B = \begin{bmatrix}
0 & b_{0j_1} & b_{0j_2} & \cdots & b_{0j_{m-1}}\\
b_{j_10} & 0 & b_{j_1j_2} & \cdots & b_{j_1j_{m-1}}\\
b_{j_20} & b_{j_2j_1} & 0 & \cdots & b_{j_2j_{m-1}}\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
b_{j_{m-1}0} & b_{j_{m-1}j_1} & b_{j_{m-1}j_2} & \cdots & 0
\end{bmatrix}.
\]
Since $a_{i_1},\dots,a_{i_m}$ are all nonzero, by (2.35), $b_{0j_1}, b_{j_1j_2},\dots,b_{j_{m-2}j_{m-1}}, b_{j_{m-1}0}$ are all nonzero. Since all $2\times 2$ principal submatrices of $B$ have determinant 0, $b_{j_10} = b_{j_2j_1} = \cdots = b_{j_{m-1}j_{m-2}} = 0$. Since all $3\times 3$ principal submatrices of $B$ have determinant 0, it follows that $b_{j_20} = b_{j_3j_1} = \cdots = b_{j_{m-1}j_{m-3}} = 0$. (For example,
\[
0 = \begin{vmatrix}0 & b_{0j_1} & b_{0j_2}\\ 0 & 0 & b_{j_1j_2}\\ b_{j_20} & 0 & 0\end{vmatrix} = b_{0j_1}b_{j_1j_2}b_{j_20}
\]
implies that $b_{j_20} = 0$.) In the same way, by considering principal submatrices of $B$ up to size $(m-1)\times(m-1)$, we conclude that
\[
B = \begin{bmatrix}
0 & b_{0j_1} & * & \cdots & * & *\\
0 & 0 & b_{j_1j_2} & \cdots & * & *\\
0 & 0 & 0 & \cdots & * & *\\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots\\
0 & 0 & 0 & \cdots & 0 & b_{j_{m-2}j_{m-1}}\\
b_{j_{m-1}0} & 0 & 0 & \cdots & 0 & 0
\end{bmatrix}.
\]
It follows that $b_{0j_1}b_{j_1j_2}\cdots b_{j_{m-2}j_{m-1}}b_{j_{m-1}0} = \pm\det B = 0$, which is a contradiction. Thus (2.36) is proved.

Assume that $a_k\ne 0$ for some $1\le k\le n-1$. We claim that $(k,n) = 1$. Otherwise, there is an integer $1\le l\le n-1$ such that $lk\equiv 0\pmod n$. By (2.36), we have
\[
\underbrace{a_k\cdots a_k}_{l} = 0,
\]
which is a contradiction. For any $1\le i\le n-1$ with $i\ne k$, we can write $i\equiv -jk\pmod n$ with $1\le j\le n-2$. By (2.36) again, we have
\[
a_i\,\underbrace{a_k\cdots a_k}_{j} = 0,
\]
which implies that $a_i = 0$. Thus $a_k$ is the only nonzero coefficient of $f$ and the proof of the theorem is complete. □


Corollary 2.39 (Payne's Theorem). Let $f(x) = \sum_{i=0}^{n-1}a_ix^{2^i}\in\mathbb{F}_{2^n}[x]$ be a 2-polynomial. Then $\frac{f(x)}{x} = \sum_{i=0}^{n-1}a_ix^{2^i-1}$ is a permutation polynomial of $\mathbb{F}_{2^n}$ if and only if $f(x) = ax^{2^k}$, where $a\in\mathbb{F}_{2^n}^*$ and $k$ is an integer such that $1\le k\le n-1$ and $(k,n) = 1$.

Proof. The polynomial $f(x)/x$ is a permutation polynomial of $\mathbb{F}_{2^n}$ if and only if it permutes $\mathbb{F}_{2^n}^*$. Since $\mathbb{F}_{2^n}^* = \mathbb{F}_{2^n}^*/\mathbb{F}_2^*$, the conclusion follows immediately from Theorem 2.37. □

Exercises

2.1. Prove that
\[
\lim_{n\to\infty}nq^{-n}|I_q(n)| = 1.
\]
Namely, $|I_q(n)|\sim\frac{q^n}{n}$ as $n\to\infty$.

2.2. (i) Let $f\in\mathbb{F}_q[x]$ be a monic irreducible polynomial of degree $n$ and $f\ne x$. Prove that $f$ is primitive if and only if $x^d\not\equiv 1\pmod f$ for all $d\mid q^n-1$ with $d\ne q^n-1$.
(ii) Determine all monic irreducible cubic polynomials in $\mathbb{F}_3[x]$. Among these irreducible cubics, identify the ones that are primitive.

2.3. Let $q = p^n$, where $p$ is a prime. Let $a_0, a_1,\dots,a_{q-1}\in\mathbb{F}_q$ and assume that
\[
\sum_{j=0}^{q-1}a_j^i = 0\qquad\text{for all } 1\le i\le q-2.
\]
(i) If $\sum_{j=0}^{q-1}a_j^{q-1}\ne 0$, prove that $a_0,\dots,a_{q-1}$ are all distinct.
(ii) If $\sum_{j=0}^{q-1}a_j^{q-1} = 0$, prove that each element of $\{a_0,\dots,a_{q-1}\}$ appears a multiple of $p$ times in the sequence $a_0,\dots,a_{q-1}$.

2.4. Let $D_n(x,a)$ be the Dickson polynomial over a commutative ring $R$. Prove that
\[
D_m\bigl(D_n(x,a), a^n\bigr) = D_{mn}(x,a).
\]

CHAPTER 3

Exponential Sums

3.1. Characters of a Finite Abelian Group

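The pairing between $G$ and $G^\dagger$. Let $G$ be a finite abelian group and define
\[
G^\dagger = \mathrm{Hom}_{\mathbb{Z}}(G, \mathbb{Q}/\mathbb{Z}).
\]
Of course, $G^\dagger$ is a $\mathbb{Z}$-module, i.e., an abelian group.

Theorem 3.1. We have $G\cong G^\dagger$.

Proof. By the fundamental theorem of finite abelian groups, we may assume $G = \mathbb{Z}/m_1\mathbb{Z}\times\cdots\times\mathbb{Z}/m_s\mathbb{Z}$. For each $x\in\mathbb{Z}$ and $m\in\mathbb{Z}^+$, we denote the image of $x$ in $\mathbb{Z}/m\mathbb{Z}$ by $[x]_m$. Define a map
\[
f : G = \mathbb{Z}/m_1\mathbb{Z}\times\cdots\times\mathbb{Z}/m_s\mathbb{Z}\longrightarrow G^\dagger,\qquad ([a_1]_{m_1},\dots,[a_s]_{m_s})\longmapsto f_{([a_1]_{m_1},\dots,[a_s]_{m_s})},
\]
where
\[
f_{([a_1]_{m_1},\dots,[a_s]_{m_s})} : G\longrightarrow\mathbb{Q}/\mathbb{Z},\qquad ([x_1]_{m_1},\dots,[x_s]_{m_s})\longmapsto\frac{a_1x_1}{m_1}+\cdots+\frac{a_sx_s}{m_s}.
\]
Clearly, $f$ is well defined. It is routine to check that $f$ is a group homomorphism. Assume $([a_1]_{m_1},\dots,[a_s]_{m_s})\in\ker f$, i.e., $f_{([a_1]_{m_1},\dots,[a_s]_{m_s})} = 0$. Then in $\mathbb{Q}/\mathbb{Z}$,
\[
0 = f_{([a_1]_{m_1},\dots,[a_s]_{m_s})}(0,\dots,0,[1]_{m_i},0,\dots,0) = \frac{a_i}{m_i},\qquad 1\le i\le s.
\]
Thus, $[a_i]_{m_i} = 0$, $1\le i\le s$. So $f$ is injective.

Assume $\alpha\in G^\dagger$. For each $1\le i\le s$, since in $\mathbb{Q}/\mathbb{Z}$,
\[
m_i\,\alpha(0,\dots,0,[1]_{m_i},0,\dots,0) = \alpha(0,\dots,0) = 0,
\]
we must have
\[
\alpha(0,\dots,0,[1]_{m_i},0,\dots,0) = \frac{a_i}{m_i}\qquad\text{for some } a_i\in\mathbb{Z}.
\]
Now, for any $([x_1]_{m_1},\dots,[x_s]_{m_s})\in G$, we have
\[
\begin{aligned}
\alpha([x_1]_{m_1},\dots,[x_s]_{m_s}) &= \alpha\Bigl(\sum_{i=1}^{s}x_i(0,\dots,0,[1]_{m_i},0,\dots,0)\Bigr)\\
&= \sum_{i=1}^{s}x_i\,\alpha(0,\dots,0,[1]_{m_i},0,\dots,0)\\
&= \sum_{i=1}^{s}\frac{x_ia_i}{m_i}\\
&= f_{([a_1]_{m_1},\dots,[a_s]_{m_s})}([x_1]_{m_1},\dots,[x_s]_{m_s}).
\end{aligned}
\]
Therefore, $\alpha = f_{([a_1]_{m_1},\dots,[a_s]_{m_s})}$. So $f$ is onto. □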

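Define a pairing map $\langle\cdot,\cdot\rangle : G^\dagger\times G\to\mathbb{Q}/\mathbb{Z}$ by
\[
\langle\alpha, y\rangle = \alpha(y),\qquad \alpha\in G^\dagger,\ y\in G.
\]
Clearly, $\langle\cdot,\cdot\rangle$ is $\mathbb{Z}$-bilinear. For each $H\subset G$ and $A\subset G^\dagger$, put
\[
H^\perp = \{\alpha\in G^\dagger : \langle\alpha, y\rangle = 0 \text{ for all } y\in H\},
\]
\[
A^\perp = \{y\in G : \langle\alpha, y\rangle = 0 \text{ for all } \alpha\in A\}.
\]
Clearly, $H^\perp < G^\dagger$ and $A^\perp < G$. It is also obvious that $(\ )^\perp$ is an inclusion-reversing operator, i.e., $H\subset K$ in $G$ implies $H^\perp\supset K^\perp$ in $G^\dagger$ and $A\subset B$ in $G^\dagger$ implies $A^\perp\supset B^\perp$ in $G$.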

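Lemma 3.2. For each $H < G$, we have
\[
|H|\,|H^\perp| = |G|. \tag{3.1}
\]

Proof. Let $\alpha\in H^\perp$. Since $H\subset\ker\alpha$, $\alpha : G\to\mathbb{Q}/\mathbb{Z}$ induces a homomorphism $\bar\alpha : G/H\to\mathbb{Q}/\mathbb{Z}$. Define
\[
\varphi : H^\perp\longrightarrow(G/H)^\dagger,\qquad \alpha\longmapsto\bar\alpha.
\]
Clearly, $\varphi$ is a group homomorphism. If $\alpha\in\ker\varphi$, then $\bar\alpha = 0$, which means $\alpha(y) = 0$ for all $y\in G$. Thus $\alpha = 0$. So $\varphi$ is injective. Now assume $\beta\in(G/H)^\dagger$. Let $\pi : G\to G/H$ be the natural homomorphism and let $\alpha = \beta\circ\pi$. Then $\alpha\in H^\perp$ and $\varphi(\alpha) = \bar\alpha = \beta$. Hence $\varphi$ is also onto, making $\varphi$ an isomorphism. From this and Theorem 3.1, we have $|H^\perp| = |(G/H)^\dagger| = |G/H| = |G|/|H|$. □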

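Theorem 3.3.
(i) Let $H < G$ and $A < G^\dagger$. We have
\[
H^{\perp\perp} = H,\qquad A^{\perp\perp} = A
\]
and
\[
|A|\,|A^\perp| = |G|. \tag{3.2}
\]
Let $S(G)$ and $S(G^\dagger)$ be the sets of all subgroups of $G$ and $G^\dagger$ respectively. Then $(\ )^\perp : S(G)\to S(G^\dagger)$ and $(\ )^\perp : S(G^\dagger)\to S(G)$ are both bijections and are inverses of each other.

(ii) Let $H < K < G$. There is an isomorphism
\[
\varphi : H^\perp/K^\perp\longrightarrow(K/H)^\dagger,\qquad \alpha+K^\perp\longmapsto\langle\alpha,\cdot\rangle,\quad \alpha\in H^\perp, \tag{3.3}
\]
where
\[
\langle\alpha,\cdot\rangle : K/H\longrightarrow\mathbb{Q}/\mathbb{Z},\qquad y+H\longmapsto\langle\alpha, y\rangle,\quad y\in K. \tag{3.4}
\]

(iii) Let $A < B < G^\dagger$. There is an isomorphism
\[
\psi : A^\perp/B^\perp\longrightarrow(B/A)^\dagger,\qquad y+B^\perp\longmapsto\langle\cdot, y\rangle,\quad y\in A^\perp, \tag{3.5}
\]
where
\[
\langle\cdot, y\rangle : B/A\longrightarrow\mathbb{Q}/\mathbb{Z},\qquad \alpha+A\longmapsto\langle\alpha, y\rangle,\quad \alpha\in B.
\]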


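Proof. (i) For each $H < G$ and $A < G^\dagger$, it is clear from the definition of $(\ )^\perp$ that
\[
H\subset H^{\perp\perp}, \tag{3.6}
\]
\[
A\subset A^{\perp\perp}. \tag{3.7}
\]
We claim that $H^{\perp\perp\perp} = H^\perp$. In fact, taking $(\ )^\perp$ of both sides of (3.6), we have $H^\perp\supset H^{\perp\perp\perp}$. Letting $A = H^\perp$ in (3.7), we have $H^\perp\subset H^{\perp\perp\perp}$. So the claim is proved. Now we have
\[
\begin{aligned}
|H^{\perp\perp}| &= \frac{|G|}{|H^{\perp\perp\perp}|} \qquad\text{((3.1) with } H^{\perp\perp} \text{ in place of } H)\\
&= \frac{|G|}{|H^\perp|} \qquad(\text{since } H^{\perp\perp\perp} = H^\perp)\\
&= |H| \qquad\text{((3.1) again)}.
\end{aligned}
\]
Therefore, $H = H^{\perp\perp}$.

Since $(\ )^{\perp\perp} = \mathrm{id}_{S(G)}$, $(\ )^\perp : S(G)\to S(G^\dagger)$ is one-to-one and $(\ )^\perp : S(G^\dagger)\to S(G)$ is onto. Since $G\cong G^\dagger$, $|S(G)| = |S(G^\dagger)| < \infty$. Thus, $(\ )^\perp : S(G)\to S(G^\dagger)$ and $(\ )^\perp : S(G^\dagger)\to S(G)$ are both bijections and are inverses of each other. Therefore, $(\ )^{\perp\perp} = \mathrm{id}_{S(G^\dagger)}$, proving that $A^{\perp\perp} = A$. Write $A = H^\perp$ for some $H\in S(G)$. Then
\[
|A|\,|A^\perp| = |H^\perp|\,|H^{\perp\perp}| = |H^\perp|\,|H| = |G|.
\]

(ii) It is clear that the maps in (3.4) and (3.3) are both well defined. Assume $\alpha\in H^\perp$ such that $\alpha+K^\perp\in\ker\varphi$. This means that $\langle\alpha, y\rangle = 0$ for all $y\in K$. Thus $\alpha\in K^\perp$, i.e., $\alpha+K^\perp = 0$ in $H^\perp/K^\perp$. So $\varphi$ is injective. However, by (3.1) and Theorem 3.1,
\[
|H^\perp/K^\perp| = |K/H| = |(K/H)^\dagger|.
\]
Hence $\varphi$ is an isomorphism.

(iii) Same as the proof of (ii). □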

Theorem 3.3 has some immediate consequences which are not completely trivial.

Corollary 3.4.
(i) If $0\ne y\in G$, then there exists $\alpha\in G^\dagger$ such that $\alpha(y)\ne 0$.
(ii) Let $K < G$. Then every homomorphism $\beta : K\to\mathbb{Q}/\mathbb{Z}$ can be extended to a homomorphism $\alpha : G\to\mathbb{Q}/\mathbb{Z}$.

Proof. (i) Let $H = \langle y\rangle$. Since $H\ne\{0\}$, $H^\perp\ne\{0\}^\perp = G^\dagger$. Choose $\alpha\in G^\dagger\setminus H^\perp$. Then $\alpha(y)\ne 0$.

(ii) Let $H = \{0\}$ in (3.3) and observe that $\varphi : G^\dagger/K^\perp\to K^\dagger$ is onto. □

Let us take another look at Theorem 3.1. The isomorphism $G\cong G^\dagger$ constructed in the proof of Theorem 3.1 depends on the decomposition $G\cong\mathbb{Z}/m_1\mathbb{Z}\times\cdots\times\mathbb{Z}/m_s\mathbb{Z}$. On the other hand, applying Theorem 3.1 twice, we have $G\cong G^{\dagger\dagger}$. It should be pointed out that there is an isomorphism $G\cong G^{\dagger\dagger}$ which does not depend on the decomposition of $G$. Letting $A = \{0\}$ and $B = G^\dagger$ in (3.5), we see that
\[
\psi : G\longrightarrow G^{\dagger\dagger},\qquad y\longmapsto\langle\cdot, y\rangle
\]
is an isomorphism.


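Characters. Let $U = \{z\in\mathbb{C} : z^n = 1 \text{ for some } n\in\mathbb{Z}^+\}$. $U$ is the subgroup of $\mathbb{C}^*$ consisting of roots of unity. Define
\[
\eta : \mathbb{Q}/\mathbb{Z}\longrightarrow U,\qquad x+\mathbb{Z}\longmapsto e^{2\pi ix},\quad x\in\mathbb{Q}.
\]
Then $\eta$ is a well defined group isomorphism.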

Definition 3.5. Let $G$ be a finite abelian group. A character of $G$ is a homomorphism $\chi : G\to U$. The group of all characters of $G$, i.e., $\mathrm{Hom}_{\mathbb{Z}}(G, U)$, is called the character group of $G$ and is denoted by $G^\circ$.

Characters are used to form exponential sums. Loosely speaking, an exponential sum over $G$ is a sum of the form
\[
\sum_{y\in G}f(y)\,\chi\bigl(g(y)\bigr),
\]
where $\chi$ is a character of $G$, and $f : G\to\mathbb{C}$, $g : G\to G$ are functions. As we will see later, exponential sums are a powerful tool for studying zeros of polynomials over a finite field.

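There is no fundamental difference between the groups $G^\dagger$ and $G^\circ$. The map $G^\dagger\to G^\circ$, $\alpha\mapsto\eta\circ\alpha$, is a group isomorphism. Thus $G^\circ\cong G^\dagger\cong G$. Of course, the facts of the previous subsection can be paraphrased for $G$ and $G^\circ$.

There is a $\mathbb{Z}$-bilinear map $\langle\cdot,\cdot\rangle : G^\circ\times G\to U$ defined by
\[
\langle\chi, y\rangle = \chi(y),\qquad \chi\in G^\circ,\ y\in G.
\]
For $H\subset G$ and $A\subset G^\circ$, let
\[
H^\perp = \{\chi\in G^\circ : \langle\chi, y\rangle = 1 \text{ for all } y\in H\},
\]
\[
A^\perp = \{y\in G : \langle\chi, y\rangle = 1 \text{ for all } \chi\in A\}.
\]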

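Theorem 3.6 (Restatement of Theorem 3.3).
(i) $(\ )^\perp : S(G)\to S(G^\circ)$ and $(\ )^\perp : S(G^\circ)\to S(G)$ are both bijections and are inverses of each other. For each $H < G$ and $A < G^\circ$,
\[
|H|\,|H^\perp| = |G|\quad\text{and}\quad|A|\,|A^\perp| = |G|.
\]
(ii) Let $H < K < G$. There is an isomorphism
\[
\varphi : H^\perp/K^\perp\longrightarrow(K/H)^\circ,\qquad \chi+K^\perp\longmapsto\langle\chi,\cdot\rangle,\quad \chi\in H^\perp,
\]
where
\[
\langle\chi,\cdot\rangle : K/H\longrightarrow U,\qquad y+H\longmapsto\langle\chi, y\rangle,\quad y\in K.
\]
(iii) Let $A < B < G^\circ$. There is an isomorphism
\[
\psi : A^\perp/B^\perp\longrightarrow(B/A)^\circ,\qquad y+B^\perp\longmapsto\langle\cdot, y\rangle,\quad y\in A^\perp,
\]
where
\[
\langle\cdot, y\rangle : B/A\longrightarrow U,\qquad \chi+A\longmapsto\langle\chi, y\rangle,\quad \chi\in B.
\]

Corollary 3.7 (Restatement of Corollary 3.4).
(i) If $0\ne y\in G$, there exists $\chi\in G^\circ$ such that $\chi(y)\ne 1$.
(ii) Let $K < G$. Then every character of $K$ can be extended to a character of $G$.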

If $\chi$ is a character of $G$ and $y\in G$, then $\chi(-y) = \chi(y)^{-1} = \overline{\chi(y)}$ since $|\chi(y)| = 1$. The inverse of $\chi$ in the group $G^\circ$ is the character $\bar\chi : G\to U$ defined by $\bar\chi(y) := \overline{\chi(y)} = \chi(-y)$. The identity element of $G^\circ$ is the character $1_G : G\to U$, $y\mapsto 1$ for all $y\in G$. $1_G$ is called the trivial character of $G$. In the next theorem, we collect some basic properties of characters.

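Theorem 3.8. Let $G$ be a finite abelian group.

(i) Let $H < G$, $a\in G$ and $A < G^\circ$, $\alpha\in G^\circ$. Then
\[
\sum_{y\in H}\alpha(y) = \begin{cases}|H| & \text{if } \alpha\in H^\perp,\\ 0 & \text{if } \alpha\notin H^\perp,\end{cases} \tag{3.8}
\]
\[
\sum_{\chi\in A}\chi(a) = \begin{cases}|A| & \text{if } a\in A^\perp,\\ 0 & \text{if } a\notin A^\perp.\end{cases} \tag{3.9}
\]
In particular, with $H = G$ and $A = G^\circ$,
\[
\sum_{y\in G}\alpha(y) = \begin{cases}|G| & \text{if } \alpha = 1_G,\\ 0 & \text{if } \alpha\ne 1_G,\end{cases} \tag{3.10}
\]
\[
\sum_{\chi\in G^\circ}\chi(a) = \begin{cases}|G| & \text{if } a = 0,\\ 0 & \text{if } a\ne 0.\end{cases} \tag{3.11}
\]

(ii) (Orthogonal relations) Let $a, b\in G$ and $\alpha, \beta\in G^\circ$. Then
\[
\sum_{y\in G}\alpha(y)\overline{\beta(y)} = \begin{cases}|G| & \text{if } \alpha = \beta,\\ 0 & \text{if } \alpha\ne\beta,\end{cases} \tag{3.12}
\]
\[
\sum_{\chi\in G^\circ}\chi(a)\overline{\chi(b)} = \begin{cases}|G| & \text{if } a = b,\\ 0 & \text{if } a\ne b.\end{cases} \tag{3.13}
\]

Proof. (i) If $\alpha\in H^\perp$, then $\sum_{y\in H}\alpha(y) = \sum_{y\in H}1 = |H|$. If $\alpha\notin H^\perp$, choose $a\in H$ such that $\alpha(a)\ne 1$. Then
\[
\sum_{y\in H}\alpha(y) = \sum_{y\in H}\alpha(y+a) = \sum_{y\in H}\alpha(a)\alpha(y) = \alpha(a)\sum_{y\in H}\alpha(y),
\]
which forces $\sum_{y\in H}\alpha(y) = 0$. Thus, (3.8) is proved. The proof of (3.9) is identical.

(ii) By (3.10), we have
\[
\sum_{y\in G}\alpha(y)\overline{\beta(y)} = \sum_{y\in G}(\alpha\bar\beta)(y) = \begin{cases}|G| & \text{if } \alpha\bar\beta = 1_G, \text{ i.e., } \alpha = \beta,\\ 0 & \text{if } \alpha\bar\beta\ne 1_G, \text{ i.e., } \alpha\ne\beta.\end{cases}
\]
The proof of (3.13) is the same. □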
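For a concrete instance of the orthogonality relation (3.12), take the cyclic group $G = \mathbb{Z}/m\mathbb{Z}$, whose characters are $\chi_a(y) = e^{2\pi iay/m}$ for $a = 0,\dots,m-1$. The sketch below uses floating-point complex numbers, so equalities hold only up to rounding; the function names are illustrative.

```python
import cmath

def character(a, m):
    """The character chi_a of Z/mZ with chi_a(y) = exp(2*pi*i*a*y/m)."""
    return lambda y: cmath.exp(2j * cmath.pi * a * y / m)

def orthogonality_sum(a, b, m):
    """Sum over y in Z/mZ of chi_a(y) * conj(chi_b(y)), as in (3.12)."""
    chi_a, chi_b = character(a, m), character(b, m)
    return sum(chi_a(y) * chi_b(y).conjugate() for y in range(m))
```

Up to rounding error, `orthogonality_sum(a, b, m)` is `m` when `a == b` and `0` otherwise.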


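Let $\mathcal{F}(G,\mathbb{C})$ be the $\mathbb{C}$-algebra of all functions from $G$ to $\mathbb{C}$. For each $f, g\in\mathcal{F}(G,\mathbb{C})$, define
\[
\langle\langle f, g\rangle\rangle = \frac{1}{|G|}\sum_{y\in G}f(y)\overline{g(y)}. \tag{3.14}
\]
Then $\langle\langle\cdot,\cdot\rangle\rangle$ is an inner product on $\mathcal{F}(G,\mathbb{C})$ and $\mathcal{F}(G,\mathbb{C})$ becomes a $|G|$-dimensional unitary space over $\mathbb{C}$. Equation (3.12) means that the elements of $G^\circ$ form an orthonormal basis of $\mathcal{F}(G,\mathbb{C})$.

By Theorem 3.6 (iii), with $A = \{1_G\}$ and $B = G^\circ$, we see that $G\to G^{\circ\circ}$, $y\mapsto\langle\cdot, y\rangle$, is an isomorphism. We will identify $y\in G$ with the character $\langle\cdot, y\rangle$ of $G^\circ$. From the above paragraph, with $G$ replaced by $G^\circ$, $\mathcal{F}(G^\circ,\mathbb{C})$ is also a $|G|$-dimensional unitary space over $\mathbb{C}$. Equation (3.13) means that the elements of $G$ form an orthonormal basis of $\mathcal{F}(G^\circ,\mathbb{C})$.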

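Fourier transform. Let $G$ be a finite abelian group and $f : G \to \mathbb{C}$ a function. The Fourier transform of $f$ is a function $\hat f : G^\circ \to \mathbb{C}$ defined by

\[ \hat f(\chi) = \sum_{y\in G} f(y)\overline{\chi(y)} = |G|\,\langle\langle f, \chi\rangle\rangle, \quad \chi \in G^\circ, \tag{3.15} \]

where $\langle\langle\cdot,\cdot\rangle\rangle$ is the inner product on $\mathcal{F}(G,\mathbb{C})$ defined in (3.14). If $f, g$ are two functions from $G$ to $\mathbb{C}$, the convolution of $f$ and $g$ is a function $f * g : G \to \mathbb{C}$ defined by

\[ (f * g)(y) = \sum_{\substack{z,w\in G \\ z+w=y}} f(z)g(w), \quad y \in G. \]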

The next proposition is a collection of some basic formulas about the Fouriertransform.

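Proposition 3.9. Let $G$ be a finite abelian group, $H < G$, and $f, g : G \to \mathbb{C}$ two functions. Then

\[ f(y) = \frac{1}{|G|} \sum_{\chi\in G^\circ} \hat f(\chi)\chi(y) \quad \text{for all } y \in G \quad \text{(inversion formula)}, \tag{3.16} \]

\[ \hat{\hat f}(y) = |G|\, f(-y) \quad \text{for all } y \in G, \tag{3.17} \]

\[ \widehat{f * g} = \hat f\, \hat g, \tag{3.18} \]

\[ \widehat{fg} = \frac{1}{|G|}\, \hat f * \hat g, \tag{3.19} \]

\[ \sum_{\chi\in H^\perp} \hat f(\chi)\, \hat g(\chi) = \frac{|G|}{|H|} \sum_{y\in H} (f * g)(y), \tag{3.20} \]

\[ \sum_{y\in H} f(y)g(y) = \frac{|H|}{|G|^2} \sum_{\chi\in H^\perp} (\hat f * \hat g)(\chi), \tag{3.21} \]

\[ \sum_{\chi\in H^\perp} \hat f(\chi) = \frac{|G|}{|H|} \sum_{y\in H} f(y), \tag{3.22} \]

\[ \sum_{\chi\in G^\circ} |\hat f(\chi)|^2 = |G| \sum_{y\in G} |f(y)|^2 \quad \text{(Parseval identity)}, \tag{3.23} \]

\[ \sum_{\chi\in G^\circ} |\hat f(\chi)|^4 = |G| \sum_{y\in G} |(f * f)(y)|^2. \tag{3.24} \]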

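Remark. In (3.17), the domain of $\hat{\hat f}$ is $G^{\circ\circ}$ and the domain of $f$ is $G$. However, $G^{\circ\circ}$ is identified with $G$ through the isomorphism $G \to G^{\circ\circ}$, $y \mapsto \langle\cdot, y\rangle$.

By (3.17), $\widehat{(\ )} : \mathcal{F}(G,\mathbb{C}) \to \mathcal{F}(G^\circ,\mathbb{C})$ is a bijection. Moreover, (3.18) implies that $\mathcal{F}(G,\mathbb{C})$ is also a $\mathbb{C}$-algebra with the convolution $*$ as its multiplication. We denote this $\mathbb{C}$-algebra by $\mathcal{F}(G,\mathbb{C},*)$. It is clear from (3.18) that $\widehat{(\ )} : \mathcal{F}(G,\mathbb{C},*) \to \mathcal{F}(G^\circ,\mathbb{C})$ is a $\mathbb{C}$-algebra isomorphism. Similarly, (3.19) implies that $\frac{1}{|G|}\widehat{(\ )} : \mathcal{F}(G,\mathbb{C}) \to \mathcal{F}(G^\circ,\mathbb{C},*)$ is a $\mathbb{C}$-algebra isomorphism.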

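Proof of Proposition 3.9. Proof of (3.16): Since the elements of $G^\circ$ form an orthonormal basis of $\mathcal{F}(G,\mathbb{C})$,

\[ f = \sum_{\chi\in G^\circ} \langle\langle f, \chi\rangle\rangle\, \chi = \frac{1}{|G|} \sum_{\chi\in G^\circ} \hat f(\chi)\chi. \]

Proof of (3.17): We have

\[ \hat{\hat f}(y) = \sum_{\chi\in G^\circ} \hat f(\chi)\overline{\chi(y)} = \sum_{\chi\in G^\circ} \Bigl(\sum_{z\in G} f(z)\overline{\chi(z)}\Bigr)\overline{\chi(y)} = \sum_{z\in G} f(z) \sum_{\chi\in G^\circ} \overline{\chi(z+y)}. \]

By (3.11),

\[ \sum_{\chi\in G^\circ} \overline{\chi(z+y)} = \begin{cases} |G| & \text{if } z = -y, \\ 0 & \text{if } z \ne -y. \end{cases} \]

Thus, $\hat{\hat f}(y) = |G| f(-y)$.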

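Proof of (3.18): For each $\chi \in G^\circ$,

\[ \begin{aligned}
\widehat{f * g}(\chi) &= \sum_{y\in G} (f * g)(y)\, \overline{\chi(y)} \\
&= \sum_{y\in G} \Bigl(\sum_{z\in G} f(z)g(y-z)\Bigr)\overline{\chi(y)} \\
&= \sum_{y\in G} \sum_{z\in G} f(z)g(y)\overline{\chi(y+z)} \quad \text{(replacing } y \text{ with } y+z\text{)} \\
&= \Bigl(\sum_{z\in G} f(z)\overline{\chi(z)}\Bigr)\Bigl(\sum_{y\in G} g(y)\overline{\chi(y)}\Bigr) \quad \text{(since } \chi(y+z) = \chi(y)\chi(z)\text{)} \\
&= \hat f(\chi)\, \hat g(\chi).
\end{aligned} \]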

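Proof of (3.19): For each $y \in G$, we have

\[ \widehat{\widehat{fg}}(y) = |G|\,(fg)(-y) \quad \text{(by (3.17))} \]

and

\[ \Bigl(\frac{1}{|G|}\, \hat f * \hat g\Bigr)^{\wedge}(y) = \frac{1}{|G|}\bigl(\hat{\hat f}\, \hat{\hat g}\bigr)(y) \quad \text{(by (3.18), with } \hat f, \hat g \text{ in place of } f, g\text{)} \]
\[ = |G|\,(fg)(-y) \quad \text{(by (3.17))}. \]

Hence $\widehat{\widehat{fg}} = \bigl(\frac{1}{|G|}\hat f * \hat g\bigr)^{\wedge}$. Since the Fourier transform is a bijection, $\widehat{fg} = \frac{1}{|G|}\hat f * \hat g$.

Proof of (3.20): We have

\[ \begin{aligned}
\sum_{\chi\in H^\perp} \hat f(\chi)\hat g(\chi) &= \sum_{\chi\in H^\perp} \sum_{y\in G} \sum_{z\in G} f(y)\overline{\chi(y)}\, g(z)\overline{\chi(z)} \\
&= \sum_{y,z\in G} f(y)g(z) \sum_{\chi\in H^\perp} \overline{\chi(y+z)} \\
&= \sum_{y,z\in G} f(y)g(z) \sum_{\chi\in H^\perp} \chi(-y-z) \\
&= |H^\perp| \sum_{\substack{y,z\in G \\ y+z\in H}} f(y)g(z) \quad \text{(by (3.9))} \\
&= |H^\perp| \sum_{w\in H} \sum_{\substack{y,z\in G \\ y+z=w}} f(y)g(z) \\
&= |H^\perp| \sum_{w\in H} (f * g)(w),
\end{aligned} \]

where $|H^\perp| = |G|/|H|$.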

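Proof of (3.21): By (3.20), with $f$ and $g$ replaced by $\hat f$ and $\hat g$ and $H$ replaced by $H^\perp$, we have

\[ \sum_{y\in H} \hat{\hat f}(y)\, \hat{\hat g}(y) = |H| \sum_{\chi\in H^\perp} (\hat f * \hat g)(\chi). \]

However, by (3.17),

\[ \sum_{y\in H} \hat{\hat f}(y)\, \hat{\hat g}(y) = |G|^2 \sum_{y\in H} f(-y)g(-y) = |G|^2 \sum_{y\in H} f(y)g(y). \]

Equation (3.21) follows from the above.

Proof of (3.22): We have

\[ \begin{aligned}
\sum_{\chi\in H^\perp} \hat f(\chi) &= \sum_{\chi\in H^\perp} \sum_{y\in G} f(y)\overline{\chi(y)} \\
&= \sum_{y\in G} f(y) \sum_{\chi\in H^\perp} \overline{\chi(y)} \\
&= \sum_{y\in H} f(y)\,|H^\perp| \\
&= |H^\perp| \sum_{y\in H} f(y).
\end{aligned} \]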

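Proof of (3.23): Let $g : G \to \mathbb{C}$ be defined by

\[ g(y) = \overline{f(-y)}, \quad y \in G. \]

Then for $\chi \in G^\circ$,

\[ \hat g(\chi) = \sum_{y\in G} g(y)\overline{\chi(y)} = \overline{\Bigl(\sum_{y\in G} f(-y)\chi(y)\Bigr)} = \overline{\Bigl(\sum_{y\in G} f(y)\overline{\chi(y)}\Bigr)} = \overline{\hat f(\chi)}. \]

Thus by (3.20), with $H = \{0\}$, we have

\[ \begin{aligned}
\sum_{\chi\in G^\circ} |\hat f(\chi)|^2 &= \sum_{\chi\in G^\circ} \hat f(\chi)\hat g(\chi) \\
&= |G|\,(f * g)(0) \\
&= |G| \sum_{y\in G} f(y)g(-y) \\
&= |G| \sum_{y\in G} |f(y)|^2.
\end{aligned} \]

Proof of (3.24): We have

\[ \begin{aligned}
|G| \sum_{y\in G} |(f * f)(y)|^2 &= \sum_{\chi\in G^\circ} |\widehat{f * f}(\chi)|^2 \quad \text{(by (3.23))} \\
&= \sum_{\chi\in G^\circ} |\hat f(\chi)|^4 \quad \text{(by (3.18))}.
\end{aligned} \]

□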
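Several formulas of Proposition 3.9 are easy to test numerically. The following sketch assumes $G = \mathbb{Z}/n\mathbb{Z}$ with characters $\chi_k(y) = e^{2\pi i k y/n}$ and represents a function $f : G \to \mathbb{C}$ as a Python list; the helper names are illustrative, not part of the text.

```python
import cmath

def chi(n, k, y):
    """Character chi_k of Z/n evaluated at y (assumed model of G)."""
    return cmath.exp(2j * cmath.pi * k * y / n)

def fourier(f):
    """Fourier transform (3.15): f_hat(chi_k) = sum_y f(y) * conj(chi_k(y))."""
    n = len(f)
    return [sum(f[y] * chi(n, k, y).conjugate() for y in range(n)) for k in range(n)]

def inverse(fhat):
    """Inversion formula (3.16): f(y) = (1/n) * sum_k f_hat(chi_k) * chi_k(y)."""
    n = len(fhat)
    return [sum(fhat[k] * chi(n, k, y) for k in range(n)) / n for y in range(n)]

def convolve(f, g):
    """Convolution on Z/n: (f*g)(y) = sum_z f(z) * g(y - z)."""
    n = len(f)
    return [sum(f[z] * g[(y - z) % n] for z in range(n)) for y in range(n)]
```

With these helpers, (3.18) says that `fourier(convolve(f, g))` agrees entrywise with the product of `fourier(f)` and `fourier(g)`, and (3.23) says that the sum of `abs(v)**2` over `fourier(f)` equals `len(f)` times the same sum over `f`, both up to rounding error.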

Characters of $\mathbb{F}_q$ and $\mathbb{F}_q^*$. Characters of $(\mathbb{F}_q, +)$ and $(\mathbb{F}_q^*, \cdot)$ are called additive and multiplicative characters of $\mathbb{F}_q$, respectively. These characters can be easily described.

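Proposition 3.10. Let $q = p^n$, where $p$ is a prime, and let $\omega = e^{2\pi i/p}$.

(i) The map

\[ \phi : \mathbb{F}_q \longrightarrow \mathbb{F}_q^\circ, \qquad a \longmapsto \omega^{\mathrm{Tr}_{\mathbb{F}_q/\mathbb{F}_p}(a\,\cdot\,)} \]

is a group isomorphism. Thus, every character of $(\mathbb{F}_q, +)$ is of the form $\omega^{\mathrm{Tr}_{\mathbb{F}_q/\mathbb{F}_p}(a\,\cdot\,)}$ for a unique $a \in \mathbb{F}_q$.

(ii) Let $\alpha$ be a primitive element of $\mathbb{F}_q$. The map

\[ \psi : \mathbb{F}_q^* \longrightarrow (\mathbb{F}_q^*)^\circ, \qquad \alpha^k \longmapsto \psi^{(k)}, \]

where

\[ \psi^{(k)} : \mathbb{F}_q^* \longrightarrow U, \qquad \alpha^l \longmapsto e^{\frac{2\pi i}{q-1}kl}, \]

is an isomorphism. Thus, every character of $(\mathbb{F}_q^*, \cdot)$ is $\psi^{(k)}$ for a unique $k \in \mathbb{Z}$ such that $0 \le k \le q-2$.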

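Proof. (i) Since $\mathrm{Tr}_{\mathbb{F}_q/\mathbb{F}_p}(xy)$, $(x, y) \in \mathbb{F}_q^2$, is a nondegenerate $\mathbb{F}_p$-bilinear form on $\mathbb{F}_q$, it is clear that $\phi$ is an injective homomorphism. Since $|\mathbb{F}_q| = |\mathbb{F}_q^\circ|$, $\phi$ is an isomorphism.

(ii) Since $\mathbb{F}_q^* \cong \mathbb{Z}/(q-1)\mathbb{Z}$, the conclusion is obvious. □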


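For any multiplicative character $\chi \in (\mathbb{F}_q^*)^\circ$, we always extend its domain from $\mathbb{F}_q^*$ to $\mathbb{F}_q$ according to the convention

\[ \chi(0) = \begin{cases} 1 & \text{if } \chi = 1_{\mathbb{F}_q^*}, \\ 0 & \text{if } \chi \ne 1_{\mathbb{F}_q^*}. \end{cases} \]

With this convention, it follows from (3.10) that

\[ \sum_{y\in \mathbb{F}_q} \chi(y) = \begin{cases} q & \text{if } \chi = 1_{\mathbb{F}_q^*}, \\ 0 & \text{if } \chi \ne 1_{\mathbb{F}_q^*}. \end{cases} \tag{3.25} \]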

3.2. Gauss Sums

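Let $\lambda$ be a multiplicative character of $\mathbb{F}_q$ and $\chi$ an additive character of $\mathbb{F}_q$. The Gauss sum $G(\lambda, \chi)$ is defined to be

\[ G(\lambda, \chi) = \sum_{y\in \mathbb{F}_q} \lambda(y)\chi(y). \]

If $\chi = 1_{\mathbb{F}_q}$, by (3.25),

\[ G(\lambda, \chi) = \sum_{y\in \mathbb{F}_q} \lambda(y) = \begin{cases} q & \text{if } \lambda = 1_{\mathbb{F}_q^*}, \\ 0 & \text{if } \lambda \ne 1_{\mathbb{F}_q^*}. \end{cases} \]

If $\chi \ne 1_{\mathbb{F}_q}$ but $\lambda = 1_{\mathbb{F}_q^*}$, by (3.10),

\[ G(\lambda, \chi) = \sum_{y\in \mathbb{F}_q} \chi(y) = 0. \]

To sum up, when at least one of $\lambda$ and $\chi$ is trivial, we have

\[ G(\lambda, \chi) = \begin{cases} q & \text{if both } \lambda \text{ and } \chi \text{ are trivial}, \\ 0 & \text{if exactly one of } \lambda \text{ and } \chi \text{ is trivial}. \end{cases} \]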

When λ and χ are both nontrivial, the value of the Gauss sum is not known explicitly except for a few special cases. We will discuss one of these cases, the Gauss quadratic sum, in detail in the subsequent sections. However, the absolute value |G(λ, χ)| requires little effort to determine.

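Theorem 3.11. If $\lambda \in (\mathbb{F}_q^*)^\circ$ and $\chi \in \mathbb{F}_q^\circ$ are both nontrivial, then

\[ |G(\lambda, \chi)| = q^{1/2}. \tag{3.26} \]

Proof. We have

\[ \begin{aligned}
|G(\lambda, \chi)|^2 &= G(\lambda, \chi)\overline{G(\lambda, \chi)} \\
&= \Bigl(\sum_{y\in \mathbb{F}_q^*} \lambda(y)\chi(y)\Bigr)\Bigl(\sum_{z\in \mathbb{F}_q^*} \overline{\lambda(z)}\,\overline{\chi(z)}\Bigr) \quad \text{(since } \lambda(0) = 0\text{)} \\
&= \sum_{y,z\in \mathbb{F}_q^*} \lambda(yz^{-1})\chi(y-z) \\
&= \sum_{y,z\in \mathbb{F}_q^*} \lambda(y)\chi(yz-z) \quad \text{(replacing } y \text{ by } yz\text{)} \\
&= \sum_{y\in \mathbb{F}_q^*} \lambda(y) \sum_{z\in \mathbb{F}_q^*} \chi\bigl((y-1)z\bigr) \\
&= \sum_{y\in \mathbb{F}_q^*} \lambda(y) \Bigl[\sum_{z\in \mathbb{F}_q} \chi\bigl((y-1)z\bigr) - \chi(0)\Bigr] \\
&= \sum_{y\in \mathbb{F}_q^*} \lambda(y) \sum_{z\in \mathbb{F}_q} \chi\bigl((y-1)z\bigr) \quad \Bigl(\text{since } \sum_{y\in \mathbb{F}_q^*} \lambda(y) = 0\Bigr).
\end{aligned} \tag{3.27} \]

If $y \ne 1$, $\chi\bigl((y-1)\,\cdot\,\bigr)$ is a nontrivial additive character of $\mathbb{F}_q$. Thus by (3.10),

\[ \sum_{z\in \mathbb{F}_q} \chi\bigl((y-1)z\bigr) = \begin{cases} q & \text{if } y = 1, \\ 0 & \text{if } y \ne 1. \end{cases} \]

Therefore, (3.27) becomes

\[ |G(\lambda, \chi)|^2 = \lambda(1)q = q. \]

□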
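Theorem 3.11 can be observed numerically over a prime field. The sketch below assumes $q = p$ is an odd prime, takes $\lambda = \psi^{(k)}$ as in Proposition 3.10 (ii) with a primitive root found by brute force, and takes $\chi(y) = e^{2\pi i a y/p}$; the function names are illustrative.

```python
import cmath

def primitive_root(p):
    """Smallest primitive root modulo an odd prime p, found by brute force."""
    for g in range(2, p):
        x, seen = 1, set()
        for _ in range(p - 1):
            x = x * g % p
            seen.add(x)
        if len(seen) == p - 1:
            return g
    raise ValueError("no primitive root found; p must be an odd prime")

def gauss_sum(p, k, a=1):
    """G(lambda, chi) over F_p for nontrivial lambda = psi^(k), 1 <= k <= p-2,
    and chi(y) = exp(2*pi*i*a*y/p). The term y = 0 is skipped since lambda(0) = 0."""
    g = primitive_root(p)
    log, x = {}, 1
    for l in range(p - 1):          # discrete logarithm table: g^l -> l
        log[x] = l
        x = x * g % p
    total = 0
    for y in range(1, p):
        lam = cmath.exp(2j * cmath.pi * k * log[y] / (p - 1))
        total += lam * cmath.exp(2j * cmath.pi * a * y / p)
    return total
```

For example, `abs(gauss_sum(13, k, a))` agrees with `13 ** 0.5` up to rounding for every `k` in `1..11` and every `a` in `1..12`.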

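Let $\omega = e^{2\pi i/p}$. The additive character $e \in \mathbb{F}_q^\circ$ defined by

\[ e(y) = \omega^{\mathrm{Tr}_{\mathbb{F}_q/\mathbb{F}_p}(y)}, \quad y \in \mathbb{F}_q, \]

is called the canonical additive character of $\mathbb{F}_q$. If $\chi$ is a nontrivial additive character of $\mathbb{F}_q$, by Proposition 3.10 (i), $\chi(\,\cdot\,) = e(a\,\cdot\,)$ for some $a \in \mathbb{F}_q^*$. Thus

\[ G(\lambda, \chi) = \sum_{y\in \mathbb{F}_q} \lambda(y)e(ay) = \sum_{y\in \mathbb{F}_q} \lambda(a^{-1}y)e(y) = \bar\lambda(a) \sum_{y\in \mathbb{F}_q} \lambda(y)e(y) = \bar\lambda(a)\,G(\lambda, e). \]

Put $G(\lambda) = G(\lambda, e)$. Then

\[ G(\lambda, \chi) = \bar\lambda(a)\,G(\lambda) \quad \text{where } \chi(\,\cdot\,) = e(a\,\cdot\,),\ a \in \mathbb{F}_q^*. \]

So, every Gauss sum with a nontrivial additive character can be normalized so that the additive character is canonical.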


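Let $k \mid q-1$ and let $A_k$ be the subgroup of $(\mathbb{F}_q^*)^\circ$ of order $k$. Then $A_k^\perp$ is the subgroup of $\mathbb{F}_q^*$ of order $\frac{q-1}{k}$, i.e., $A_k^\perp = \{y^k : y \in \mathbb{F}_q^*\}$. We have

\[ \begin{aligned}
\sum_{\lambda\in A_k\setminus\{1_{\mathbb{F}_q^*}\}} G(\lambda) &= \sum_{\lambda\in A_k} G(\lambda) \quad \text{(since } G(1_{\mathbb{F}_q^*}) = 0\text{)} \\
&= \sum_{\lambda\in A_k} \sum_{y\in \mathbb{F}_q} \lambda(y)e(y) \\
&= \sum_{y\in \mathbb{F}_q} e(y) \sum_{\lambda\in A_k} \lambda(y) \\
&= \sum_{\lambda\in A_k} \lambda(0) + \sum_{y\in \mathbb{F}_q^*} e(y) \sum_{\lambda\in A_k} \lambda(y) \\
&= 1 + \sum_{y\in A_k^\perp} e(y)\,k \quad \text{(by (3.9))} \\
&= 1 + \sum_{y\in \mathbb{F}_q^*} e(y^k) \\
&= \sum_{y\in \mathbb{F}_q} e(y^k).
\end{aligned} \tag{3.28} \]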

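In the above, the sum $\sum_{y\in \mathbb{F}_q} e(y^k)$ is a special case of the Weil sum, see Section ??.

When $q$ is odd and $k = 2$, $A_2 \setminus \{1_{\mathbb{F}_q^*}\}$ contains only one multiplicative character $\eta$. Since $o(\eta) = 2$, $\ker \eta$ is the subgroup of $\mathbb{F}_q^*$ of order $\frac{q-1}{2}$, i.e., the subgroup of squares in $\mathbb{F}_q^*$. Therefore,

\[ \eta(y) = \begin{cases} 1 & \text{if } y \text{ is a square in } \mathbb{F}_q^*, \\ -1 & \text{if } y \text{ is a nonsquare in } \mathbb{F}_q^*. \end{cases} \tag{3.29} \]

$\eta$ is called the quadratic character of $\mathbb{F}_q$. By (3.28), we have

\[ G(\eta) = \sum_{y\in \mathbb{F}_q} e(y^2). \tag{3.30} \]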

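The sum in (3.30), referred to as the Gauss quadratic sum over $\mathbb{F}_q$, is completely known (Theorem 3.30). The Gauss quadratic sum $G(\eta)$ is easily determined up to a $\pm$ sign. In fact, since $\bar\eta = \eta$, we have

\[ \overline{G(\eta)} = \sum_{y\in \mathbb{F}_q} \overline{\eta(y)}\,\overline{e(y)} = \sum_{y\in \mathbb{F}_q} \eta(y)e(-y) = \sum_{y\in \mathbb{F}_q} \eta(-y)e(y) = \eta(-1) \sum_{y\in \mathbb{F}_q} \eta(y)e(y) = \eta(-1)G(\eta), \]

where

\[ \eta(-1) = \begin{cases} 1 & \text{if } q \equiv 1 \pmod 4, \\ -1 & \text{if } q \equiv -1 \pmod 4. \end{cases} \]

Thus, $G(\eta)$ is real if $q \equiv 1 \pmod 4$ and is purely imaginary if $q \equiv -1 \pmod 4$. In the light of Theorem 3.11, we must have

\[ G(\eta) = \begin{cases} \pm q^{1/2} & \text{if } q \equiv 1 \pmod 4, \\ \pm i q^{1/2} & \text{if } q \equiv -1 \pmod 4. \end{cases} \tag{3.31} \]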


The determination of the sign of the Gauss quadratic sum was a highlight of classical number theory. Here, we give a little history. Gauss initially studied his quadratic sum over a prime field Fp. It took him four years, from 1801 to 1805, to determine that in (3.31) with n = 1, the correct sign was + in both cases. The Gauss quadratic sum over an arbitrary finite field Fq, where q is a prime power, was determined by Davenport and Hasse [4] in 1934.

The next three sections are devoted to the evaluation of the Gauss quadratic sum. The approach will take us on a tour through some of the most elegant theorems in classical number theory.

3.3. Evaluation of the Gauss Quadratic Sum over Fp

Theorem 3.12. Let $p$ be an odd prime and $\eta$ the quadratic character of $\mathbb{F}_p$ defined in (3.29). Then

\[ G(\eta) = \sum_{j=0}^{p-1} e^{2\pi i j^2/p} = \begin{cases} p^{1/2} & \text{if } p \equiv 1 \pmod 4, \\ i p^{1/2} & \text{if } p \equiv -1 \pmod 4. \end{cases} \tag{3.32} \]

We will see two proofs of the above theorem. The first proof is analytic, based on the method of Mordell, cf. []. The second proof is elementary, given by Estermann [5].

An analytic proof of Theorem 3.12.

Theorem 3.13. Let $m$ and $n$ be positive integers such that $mn$ is even. Then

\[ \sum_{j=0}^{m-1} e^{\pi i n j^2/m} = e^{\pi i/4} \Bigl(\frac{m}{n}\Bigr)^{1/2} \sum_{j=0}^{n-1} e^{-\pi i m j^2/n}. \tag{3.33} \]

Proof. Let

\[ f(z) = \frac{e^{\pi i n z^2/m}}{e^{2\pi i z} - 1}. \]

$f$ is a meromorphic function on $\mathbb{C}$ with poles at the integers. Let $R$ be a large positive real number and consider a contour $C_R$ shown in Figure 1 where the two semicircles are of radius $\frac12$. The poles of $f(z)$ inside the contour $C_R$ are $j = 0, 1, \dots, m-1$, and

\[ \mathrm{Res}(f; j) = \frac{e^{\pi i n j^2/m}}{2\pi i}. \]

By the residue theorem,

\[ \int_{C_R} f(z)\,dz = 2\pi i \sum_{j=0}^{m-1} \mathrm{Res}(f; j) = \sum_{j=0}^{m-1} e^{\pi i n j^2/m}. \tag{3.34} \]

Let $z = A + x$, where $x \in \mathbb{R}$ and $0 \le x \le m$. We have

\[ |e^{2\pi i z} - 1| \ge 1 - |e^{2\pi i z}| = 1 - e^{-2\pi\,\mathrm{Im}\, z} = 1 - e^{-\sqrt2\,\pi R}. \]

[Figure 1. The contour $C_R$; labelled points: $0$, $m$, $-A$, $-A+m$, $A+m$, and $A = Re^{\pi i/4}$.]

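Since $\mathrm{Im}\, z^2 = \mathrm{Im}\bigl(\tfrac{1}{\sqrt2}R + x + \tfrac{1}{\sqrt2}Ri\bigr)^2 = (R + \sqrt2\,x)R$, we also have

\[ |e^{\pi i n z^2/m}| = e^{-\pi n (R + \sqrt2 x)R/m} \le e^{-\pi n R^2/m}. \]

Therefore,

\[ \Bigl|\int_{A+m}^{A} f(z)\,dz\Bigr| \le \int_{A+m}^{A} \Bigl|\frac{e^{\pi i n z^2/m}}{e^{2\pi i z} - 1}\Bigr|\,ds \le \frac{e^{-\pi n R^2/m}}{1 - e^{-\sqrt2\,\pi R}}\, m. \]

It follows that

\[ \int_{A+m}^{A} f(z)\,dz = o(1). \tag{3.35} \]

(The symbol $o(1)$ represents any complex valued function of $R$ which tends to $0$ as $R \to +\infty$.) Similarly,

\[ \int_{-A}^{-A+m} f(z)\,dz = o(1). \tag{3.36} \]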

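Let $\Gamma_1$ be the part of $C_R$ from $A$ to $-A$ and $\Gamma_2$ the part of $C_R$ from $-A+m$ to $A+m$. Then

\[ \Bigl(\int_{\Gamma_1} + \int_{\Gamma_2}\Bigr) f(z)\,dz = \int_{\Gamma_1} f(z)\,dz - \int_{\Gamma_1} f(z+m)\,dz = -\int_{\Gamma_1} \bigl[f(z+m) - f(z)\bigr]\,dz, \tag{3.37} \]

where

\[ \begin{aligned}
f(z+m) - f(z) &= \frac{1}{e^{2\pi i z} - 1}\bigl[e^{\pi i n (z+m)^2/m} - e^{\pi i n z^2/m}\bigr] \\
&= \frac{e^{\pi i n z^2/m}}{e^{2\pi i z} - 1}\bigl[e^{\frac{1}{m}\pi i n[(z+m)^2 - z^2]} - 1\bigr] \\
&= \frac{e^{\pi i n z^2/m}}{e^{2\pi i z} - 1}\bigl(e^{2\pi i n z} - 1\bigr) \quad \text{(since } mn \text{ is even)} \\
&= e^{\pi i n z^2/m} \sum_{j=0}^{n-1} e^{2\pi i j z} \\
&= \sum_{j=0}^{n-1} e^{-\pi i m j^2/n}\, e^{\pi i n (z + \frac{jm}{n})^2/m}.
\end{aligned} \]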

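Since $f(z+m) - f(z)$ is an entire function, the integral at the right side of (3.37) depends only on the endpoints of $\Gamma_1$; hence

\[ \begin{aligned}
\Bigl(\int_{\Gamma_1} + \int_{\Gamma_2}\Bigr) f(z)\,dz &= \int_{-A}^{A} \bigl[f(z+m) - f(z)\bigr]\,dz \\
&= \sum_{j=0}^{n-1} e^{-\pi i m j^2/n} \int_{-A}^{A} e^{\pi i n (z + \frac{jm}{n})^2/m}\,dz.
\end{aligned} \tag{3.38} \]

In the above,

\[ \begin{aligned}
\int_{-A}^{A} e^{\pi i n (z + \frac{jm}{n})^2/m}\,dz &= \int_{-A+\frac{jm}{n}}^{A+\frac{jm}{n}} e^{\pi i n z^2/m}\,dz \\
&= \Bigl(\int_{-A+\frac{jm}{n}}^{-A} + \int_{-A}^{A} + \int_{A}^{A+\frac{jm}{n}}\Bigr) e^{\pi i n z^2/m}\,dz.
\end{aligned} \]

By the same argument that leads to (3.35) and (3.36), we see that

\[ \int_{-A+\frac{jm}{n}}^{-A} e^{\pi i n z^2/m}\,dz = o(1) \quad \text{and} \quad \int_{A}^{A+\frac{jm}{n}} e^{\pi i n z^2/m}\,dz = o(1). \]

Hence,

\[ \begin{aligned}
\int_{-A}^{A} e^{\pi i n (z + \frac{jm}{n})^2/m}\,dz &= \int_{-A}^{A} e^{\pi i n z^2/m}\,dz + o(1) \\
&= \int_{-R}^{R} e^{-\pi n r^2/m}\, e^{\pi i/4}\,dr + o(1) \\
&= e^{\pi i/4} \int_{-\infty}^{+\infty} e^{-\pi n r^2/m}\,dr + o(1) \\
&= e^{\pi i/4} \Bigl(\frac{m}{n}\Bigr)^{1/2} I + o(1),
\end{aligned} \tag{3.39} \]

where

\[ I = \int_{-\infty}^{+\infty} e^{-\pi r^2}\,dr. \]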


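Now we have

\[ \begin{aligned}
\sum_{j=0}^{m-1} e^{\pi i n j^2/m} &= \int_{C_R} f(z)\,dz \quad \text{(by (3.34))} \\
&= \Bigl(\int_{\Gamma_1} + \int_{-A}^{-A+m} + \int_{\Gamma_2} + \int_{A+m}^{A}\Bigr) f(z)\,dz \\
&= \Bigl(\int_{\Gamma_1} + \int_{\Gamma_2}\Bigr) f(z)\,dz + o(1) \quad \text{(by (3.35) and (3.36))} \\
&= e^{\pi i/4} \Bigl(\frac{m}{n}\Bigr)^{1/2} I \sum_{j=0}^{n-1} e^{-\pi i m j^2/n} + o(1) \quad \text{(by (3.38) and (3.39))}.
\end{aligned} \]

Letting $R \to +\infty$, we get

\[ \sum_{j=0}^{m-1} e^{\pi i n j^2/m} = e^{\pi i/4} \Bigl(\frac{m}{n}\Bigr)^{1/2} I \sum_{j=0}^{n-1} e^{-\pi i m j^2/n}. \]

Letting $m = 2$ and $n = 1$ in the above, we see that $I = 1$. Thus, the proof of the theorem is complete. □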
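The reciprocity formula (3.33) is easy to sanity-check in floating point. The sketch below evaluates both sides for given $m, n$; the names `lhs` and `rhs` are illustrative, and the exponents are reduced modulo $2m$ and $2n$ only to limit rounding error.

```python
import cmath
import math

def lhs(m, n):
    """Left side of (3.33): sum over j < m of exp(pi*i*n*j^2/m)."""
    return sum(cmath.exp(1j * cmath.pi * ((n * j * j) % (2 * m)) / m) for j in range(m))

def rhs(m, n):
    """Right side of (3.33): exp(pi*i/4) * sqrt(m/n) * sum over j < n of exp(-pi*i*m*j^2/n)."""
    s = sum(cmath.exp(-1j * cmath.pi * ((m * j * j) % (2 * n)) / n) for j in range(n))
    return cmath.exp(1j * cmath.pi / 4) * math.sqrt(m / n) * s
```

For instance, with $m = 3$, $n = 2$ both sides equal $i\sqrt3$, so `abs(lhs(3, 2) - rhs(3, 2))` is at the level of rounding error. The hypothesis that $mn$ is even matters: the two functions need not agree when $mn$ is odd.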

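Corollary 3.14. Let $m$ be a positive integer. Then

\[ \sum_{j=0}^{m-1} e^{2\pi i j^2/m} = \begin{cases} (1+i)\,m^{1/2} & \text{if } m \equiv 0 \pmod 4, \\ m^{1/2} & \text{if } m \equiv 1 \pmod 4, \\ 0 & \text{if } m \equiv 2 \pmod 4, \\ i\,m^{1/2} & \text{if } m \equiv 3 \pmod 4. \end{cases} \]

Proof. Let $n = 2$ in (3.33). □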

Theorem 3.12 follows immediately from Corollary 3.14.
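As a quick check of Corollary 3.14 (and hence of Theorem 3.12 when $m = p$ is an odd prime), the sum can be computed directly and compared with the predicted value. This is a minimal floating-point sketch; the function names are illustrative.

```python
import cmath
import math

def quadratic_gauss_sum(m):
    """g(m) = sum over j < m of exp(2*pi*i*j^2/m), with j^2 reduced mod m."""
    return sum(cmath.exp(2j * cmath.pi * (j * j % m) / m) for j in range(m))

def predicted(m):
    """Value given by Corollary 3.14, selected by m mod 4."""
    r = math.sqrt(m)
    return [(1 + 1j) * r, r, 0, 1j * r][m % 4]
```

For example, `quadratic_gauss_sum(4)` is $2 + 2i$ up to rounding, matching $(1+i)\,4^{1/2}$.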

An elementary proof of Theorem 3.12. For any $m \in \mathbb{Z}^+$, let

\[ g(m) = \sum_{j=0}^{m-1} e^{2\pi i j^2/m}. \]

Theorem 3.15. Let $m$ be a positive odd integer. Then

\[ \mathrm{Re}\Bigl[\tfrac12(1-i)(1+i^m)\,g(m)\Bigr] > -\sqrt m. \]


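Proof. We have

\[ \begin{aligned}
\tfrac12(1+i^m)\bigl(g(m) - 1\bigr) &= \tfrac12(1+i^m)\, 2 \sum_{j=1}^{\frac{m-1}{2}} e^{2\pi i j^2/m} \\
&= \sum_{j=1}^{\frac{m-1}{2}} \Bigl(e^{2\pi i j^2/m} + e^{2\pi i (j^2 + \frac14 m^2)/m}\Bigr) \\
&= \sum_{j=1}^{\frac{m-1}{2}} \Bigl(e^{2\pi i j^2/m} + e^{2\pi i (\frac12 m - j)^2/m}\Bigr) \\
&= \sum_{j=1}^{\frac{m-1}{2}} \Bigl(e^{\pi i (2j)^2/(2m)} + e^{\pi i (m-2j)^2/(2m)}\Bigr) \\
&= \sum_{j=1}^{m-1} e^{\pi i j^2/(2m)} \\
&= A + B,
\end{aligned} \tag{3.40} \]

where

\[ A = \sum_{j=1}^{\lfloor\sqrt m\rfloor} e^{\pi i j^2/(2m)}, \qquad B = \sum_{j=\lfloor\sqrt m\rfloor+1}^{m-1} e^{\pi i j^2/(2m)}. \]

Since $\cos x + \sin x \ge 1$ for $0 \le x \le \frac{\pi}{2}$, we see that

\[ \mathrm{Re}\bigl[(1-i)A\bigr] = \sum_{j=1}^{\lfloor\sqrt m\rfloor} \Bigl(\cos\frac{\pi j^2}{2m} + \sin\frac{\pi j^2}{2m}\Bigr) \ge \lfloor\sqrt m\rfloor. \tag{3.41} \]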

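For $1 \le j \le m$, write

\[ \begin{aligned}
e^{\pi i j^2/(2m)} &= e^{\pi i j^2/(2m)}\bigl(e^{\pi i j/(2m)} - e^{-\pi i j/(2m)}\bigr)\frac{1}{2i}\csc\frac{\pi j}{2m} \\
&= \bigl(e^{\pi i j(j+1)/(2m)} - e^{\pi i (j-1)j/(2m)}\bigr)\frac{1}{2i}\csc\frac{\pi j}{2m} \\
&= \frac{1}{2i}(a_j - a_{j-1})\,b_j,
\end{aligned} \tag{3.42} \]

where $a_j = e^{\pi i j(j+1)/(2m)}$ and $b_j = \csc\frac{\pi j}{2m}$. Then

\[ \begin{aligned}
B &= \sum_{j=\lfloor\sqrt m\rfloor+1}^{m-1} e^{\pi i j^2/(2m)} \\
&= \frac{1}{2i} \sum_{j=\lfloor\sqrt m\rfloor+1}^{m-1} (a_j - a_{j-1})\,b_j \quad \text{(by (3.42))} \\
&= \frac{1}{2i}\Bigl[\sum_{j=\lfloor\sqrt m\rfloor+1}^{m-1} a_j b_j - \sum_{j=\lfloor\sqrt m\rfloor}^{m-2} a_j b_{j+1}\Bigr] \\
&= \frac{1}{2i}\Bigl[\sum_{j=\lfloor\sqrt m\rfloor+1}^{m-2} a_j (b_j - b_{j+1}) + a_{m-1}b_{m-1} - a_{\lfloor\sqrt m\rfloor}\, b_{\lfloor\sqrt m\rfloor+1}\Bigr].
\end{aligned} \]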


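Since $b_1 > b_2 > \cdots > b_m > 0$ and $|a_j| = 1$, $1 \le j \le m$, we have

\[ \begin{aligned}
|B| &\le \frac12\Bigl[\sum_{j=\lfloor\sqrt m\rfloor+1}^{m-2} (b_j - b_{j+1}) + b_{m-1} + b_{\lfloor\sqrt m\rfloor+1}\Bigr] \\
&= b_{\lfloor\sqrt m\rfloor+1} \\
&= \csc\Bigl(\frac{\pi}{2}\cdot\frac{\lfloor\sqrt m\rfloor+1}{m}\Bigr) \\
&\le \frac{m}{\lfloor\sqrt m\rfloor+1} \quad \Bigl(\text{since } \sin\frac{\pi}{2}x \ge x \text{ for } 0 \le x \le 1\Bigr) \\
&\le \sqrt m.
\end{aligned} \tag{3.43} \]

Combining (3.40), (3.41) and (3.43), we get

\[ \begin{aligned}
\mathrm{Re}\Bigl[\tfrac12(1-i)(1+i^m)\bigl(g(m)-1\bigr)\Bigr] &= \mathrm{Re}\bigl[(1-i)A\bigr] + \mathrm{Re}\bigl[(1-i)B\bigr] \\
&\ge \lfloor\sqrt m\rfloor - \sqrt2\,|B| \\
&\ge \lfloor\sqrt m\rfloor - \sqrt2\,\sqrt m \\
&> -\sqrt m,
\end{aligned} \tag{3.44} \]

where the last step uses $\lfloor\sqrt m\rfloor \ge \frac12\sqrt m > (\sqrt2 - 1)\sqrt m$. Since

\[ \tfrac12(1-i)(1+i^m) = \begin{cases} 1 & \text{if } m \equiv 1 \pmod 4, \\ -i & \text{if } m \equiv -1 \pmod 4, \end{cases} \tag{3.45} \]

$\mathrm{Re}\bigl[\tfrac12(1-i)(1+i^m)\bigr] \ge 0$. Hence (3.44) implies that

\[ \mathrm{Re}\Bigl[\tfrac12(1-i)(1+i^m)\,g(m)\Bigr] > -\sqrt m. \]

□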

Proof of Theorem 3.12. By (??) and (3.45),

\[ \tfrac12(1-i)(1+i^p)\,g(p) = \pm p^{1/2}. \]

However, by Theorem 3.15, $\mathrm{Re}\bigl[\tfrac12(1-i)(1+i^p)\,g(p)\bigr] > -p^{1/2}$. Hence we must have

\[ \tfrac12(1-i)(1+i^p)\,g(p) = p^{1/2}, \]

which is (3.32). □

3.4. Formal Power Series

Definition and basic facts. Let $F$ be a field of characteristic $0$. A formal power series in $x$ over $F$ is a formal sum

\[ a_0 + a_1x + a_2x^2 + \cdots = \sum_{j=0}^{\infty} a_jx^j, \quad a_j \in F,\ j \in \mathbb{N}. \]

The set of all formal power series in $x$ over $F$ is denoted by $F[[x]]$. For $f = \sum_{j=0}^{\infty} a_jx^j$, $g = \sum_{j=0}^{\infty} b_jx^j \in F[[x]]$, define

\[ f + g = \sum_{j=0}^{\infty} (a_j + b_j)x^j, \]

\[ fg = \sum_{l=0}^{\infty} \Bigl(\sum_{j+k=l} a_jb_k\Bigr)x^l. \tag{3.46} \]

Clearly, $F[[x]]$ is an integral domain under these operations. $F[[x]]$ is called the ring of formal power series in $x$ over $F$. The polynomial ring $F[x]$ is embedded in $F[[x]]$ through the embedding

\[ a_0 + a_1x + \cdots + a_nx^n \longmapsto a_0 + a_1x + \cdots + a_nx^n + 0x^{n+1} + \cdots. \]

Theorem 3.16.

(i) The multiplicative group of $F[[x]]$ is

\[ F[[x]]^* = \Bigl\{\sum_{j=0}^{\infty} a_jx^j \in F[[x]] : a_0 \ne 0\Bigr\}. \]

(ii) $F[[x]]$ is a local ring with maximal ideal $xF[[x]]$.

(iii) Let $1 + xF[[x]] = \bigl\{1 + \sum_{j=1}^{\infty} a_jx^j \in F[[x]]\bigr\}$. Then $1 + xF[[x]] < F[[x]]^*$. Moreover,

\[ \phi : F^* \times (1 + xF[[x]]) \longrightarrow F[[x]]^*, \qquad \Bigl(a,\ 1 + \sum_{j=1}^{\infty} a_jx^j\Bigr) \longmapsto a\Bigl(1 + \sum_{j=1}^{\infty} a_jx^j\Bigr) \]

is a group isomorphism.

Proof. (i) Let $f = \sum_{j=0}^{\infty} a_jx^j$. If $f \in F[[x]]^*$, there exists $g = \sum_{k=0}^{\infty} b_kx^k$ such that $fg = 1$. It follows that $a_0b_0 = 1$; hence $a_0 \ne 0$.

On the other hand, assume $a_0 \ne 0$. Define $b_k$, $k \in \mathbb{N}$, inductively by

\[ b_0 = \frac{1}{a_0}, \qquad b_k = -\frac{1}{a_0} \sum_{j=0}^{k-1} a_{k-j}b_j, \quad k > 0, \]

and let $g = \sum_{k=0}^{\infty} b_kx^k$. Then by (3.46), $fg = 1$. Thus $f \in F[[x]]^*$.

(ii) Since $F[[x]] \setminus F[[x]]^* = xF[[x]]$, the claim is obviously true.

(iii) The proof of this part is routine. □

Example 3.17. In $F[[x]]$, we have $(1-x)^{-1} = \sum_{j=0}^{\infty} x^j$ and $(1+x)^{-1} = \sum_{j=0}^{\infty} (-1)^jx^j$.

Let $f_i = \sum_{j=0}^{\infty} a_{ij}x^j \in F[[x]]$, $i \in I$, be a family of formal power series. If, for each $j \in \mathbb{N}$, there are only finitely many $i \in I$ such that $a_{ij} \ne 0$, we define

\[ \sum_{i\in I} f_i = \sum_{j=0}^{\infty} \Bigl(\sum_{i\in I} a_{ij}\Bigr)x^j \in F[[x]] \tag{3.47} \]

and call the sum $\sum_{i\in I} f_i$ defined. If $\sum_{i\in I} f_i$ is defined, then $\{(i,j) \in I \times \mathbb{N} : a_{ij} \ne 0\}$ is countable; hence $\{i \in I : f_i \ne 0\}$ is countable. Thus, in a defined sum, there can be at most countably many nonzero terms. Every formal power series $\sum_{j=0}^{\infty} a_jx^j \in F[[x]]$ can also be viewed as the sum of $a_jx^j$, $j \in \mathbb{N}$.

Let $g_i = \sum_{j=1}^{\infty} a_{ij}x^j \in xF[[x]]$, $i \in I$, be a family of (noninvertible) formal power series such that $\sum_{i\in I} g_i$ is defined. We define

\[ \prod_{i\in I} (1 + g_i) = 1 + \sum_{j=1}^{\infty} b_jx^j \]

where

\[ b_j = \sum_{k\le j}\ \sum_{\substack{i_1,\dots,i_k\in I \\ j_1,\dots,j_k\ge 1 \\ j_1+\cdots+j_k=j}} a_{i_1j_1}\cdots a_{i_kj_k}, \quad j \ge 1, \tag{3.48} \]

and call the product $\prod_{i\in I}(1 + g_i)$ defined. Note that the sum in (3.48) is always a finite sum since $\sum_{i\in I} g_i$ is defined. It is also clear that in a defined product, there can be at most countably many factors which are not $1$.

Let $f = \sum_{j=0}^{\infty} a_jx^j \in F[[x]]$ and $g \in xF[[x]]$. Then the sum $\sum_{j=0}^{\infty} a_jg^j$ is defined. We write

\[ f(g) = \sum_{j=0}^{\infty} a_jg^j. \]

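Proposition 3.18. Let $f_1, f_2 \in F[[x]]$ and $g \in xF[[x]]$. Then

\[ (f_1 + f_2)(g) = f_1(g) + f_2(g), \tag{3.49} \]

\[ (f_1f_2)(g) = f_1(g)\,f_2(g). \tag{3.50} \]

Proof. We only prove (3.50) since the proof of (3.49) is similar. Let $f_i = \sum_{j=0}^{\infty} a_{ij}x^j$, $i = 1, 2$. Then for each $n \in \mathbb{N}$,

\[ \begin{aligned}
f_1(g)\,f_2(g) &= \Bigl(\sum_{j=0}^{\infty} a_{1j}g^j\Bigr)\Bigl(\sum_{k=0}^{\infty} a_{2k}g^k\Bigr) \\
&\equiv \Bigl(\sum_{j=0}^{n} a_{1j}g^j\Bigr)\Bigl(\sum_{k=0}^{n} a_{2k}g^k\Bigr) \pmod{x^{n+1}} \\
&\equiv \sum_{l=0}^{n} \Bigl(\sum_{j+k=l} a_{1j}a_{2k}\Bigr)g^l \pmod{x^{n+1}} \\
&\equiv \sum_{l=0}^{\infty} \Bigl(\sum_{j+k=l} a_{1j}a_{2k}\Bigr)g^l \pmod{x^{n+1}} \\
&= (f_1f_2)(g).
\end{aligned} \]

Since $n \in \mathbb{N}$ is arbitrary, we have $(f_1f_2)(g) = f_1(g)\,f_2(g)$. □

Remark. Given $g \in xF[[x]]$, Proposition 3.18 implies that the map

\[ e_g : F[[x]] \longrightarrow F[[x]], \qquad f \longmapsto f(g) \]

is an $F$-algebra homomorphism. It is called the evaluation homomorphism of $F[[x]]$ at $g$.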


The derivative. For $f = \sum_{j=0}^{\infty} a_jx^j \in F[[x]]$, we define $f(0) = a_0$. The derivative of $f$, denoted by $Df$, is defined to be

\[ Df = \sum_{j=1}^{\infty} j a_jx^{j-1} \in F[[x]]. \]

Clearly, $D : F[[x]] \to F[[x]]$ is an $F$-linear map with $\ker D = F$.

Theorem 3.19 (The Maclaurin expansion). For each $f \in F[[x]]$, we have

\[ f = \sum_{j=0}^{\infty} \frac{1}{j!}(D^jf)(0)\,x^j. \]

Proof. Let $f = \sum_{j=0}^{\infty} a_jx^j$. Then

\[ D^kf = \sum_{j=k}^{\infty} j(j-1)\cdots(j-k+1)\,a_jx^{j-k}. \]

Thus, $(D^kf)(0) = k!\,a_k$, i.e., $a_k = \frac{1}{k!}(D^kf)(0)$. □

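Theorem 3.20 (The product rule). Let $f, g \in F[[x]]$. Then

\[ D(fg) = (Df)\,g + f\,Dg. \]

Proof. Let $f = \sum_{j=0}^{\infty} a_jx^j$ and $g = \sum_{k=0}^{\infty} b_kx^k$. Then

\[ \begin{aligned}
(Df)\,g + f\,Dg &= \Bigl(\sum_{j=1}^{\infty} j a_jx^{j-1}\Bigr)\Bigl(\sum_{k=0}^{\infty} b_kx^k\Bigr) + \Bigl(\sum_{j=0}^{\infty} a_jx^j\Bigr)\Bigl(\sum_{k=1}^{\infty} k b_kx^{k-1}\Bigr) \\
&= \sum_{l=1}^{\infty} \Bigl(\sum_{j+k=l} j a_jb_k\Bigr)x^{l-1} + \sum_{l=1}^{\infty} \Bigl(\sum_{j+k=l} a_j k b_k\Bigr)x^{l-1} \\
&= \sum_{l=1}^{\infty} \Bigl(\sum_{j+k=l} (j+k)\,a_jb_k\Bigr)x^{l-1} \\
&= \sum_{l=1}^{\infty} l\Bigl(\sum_{j+k=l} a_jb_k\Bigr)x^{l-1} \\
&= D(fg).
\end{aligned} \]

□

The product rule implies that

\[ D(f^n) = n f^{n-1}Df \]

for all $f \in F[[x]]$ and $n \in \mathbb{Z}^+$. If $f \in F[[x]]^*$, the above equation holds for all $n \in \mathbb{Z}$. In fact, for $n \in \mathbb{Z}^+$, we have $0 = D(f^nf^{-n}) = f^nD(f^{-n}) + f^{-n}D(f^n) = f^nD(f^{-n}) + n f^{-1}Df$, which gives $D(f^{-n}) = -n f^{-n-1}Df$.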

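Theorem 3.21.

(i) Let $f_i \in F[[x]]$, $i \in I$, such that $\sum_{i\in I} f_i$ is defined. Then $\sum_{i\in I} Df_i$ is also defined and

\[ D\Bigl(\sum_{i\in I} f_i\Bigr) = \sum_{i\in I} Df_i. \]

(ii) Let $g_i \in xF[[x]]$, $i \in I$, such that $\prod_{i\in I}(1 + g_i)$ is defined. Then $\sum_{i\in I} (Dg_i) \prod_{t\in I\setminus\{i\}} (1 + g_t)$ is defined and

\[ D\Bigl[\prod_{i\in I} (1 + g_i)\Bigr] = \sum_{i\in I} (Dg_i) \prod_{t\in I\setminus\{i\}} (1 + g_t). \tag{3.51} \]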

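Proof. (i) Obvious.

(ii) Let $j \in \mathbb{N}$ be arbitrary. Since $\sum_{i\in I} g_i$ is defined, $g_i \equiv 0 \pmod{x^{j+2}}$ except for only finitely many $i \in I$, say, $i_1, \dots, i_s$. Then

\[ \prod_{i\in I} (1 + g_i) = \Bigl(\prod_{\alpha=1}^{s} (1 + g_{i_\alpha})\Bigr) \prod_{i\in I\setminus\{i_1,\dots,i_s\}} (1 + g_i) \equiv \prod_{\alpha=1}^{s} (1 + g_{i_\alpha}) \pmod{x^{j+2}}. \]

Hence, by the product rule,

\[ D\Bigl[\prod_{i\in I} (1 + g_i)\Bigr] \equiv \sum_{\alpha=1}^{s} (Dg_{i_\alpha}) \prod_{t\in\{i_1,\dots,i_s\}\setminus\{i_\alpha\}} (1 + g_t) \pmod{x^{j+1}}. \tag{3.52} \]

On the other hand, for $i \in I \setminus \{i_1, \dots, i_s\}$, $Dg_i \equiv 0 \pmod{x^{j+1}}$, which implies that the sum at the right side of (3.51) is defined. Moreover,

\[ \begin{aligned}
\sum_{i\in I} (Dg_i) \prod_{t\in I\setminus\{i\}} (1 + g_t) &\equiv \sum_{\alpha=1}^{s} (Dg_{i_\alpha}) \prod_{t\in I\setminus\{i_\alpha\}} (1 + g_t) \pmod{x^{j+1}} \\
&= \Bigl[\sum_{\alpha=1}^{s} (Dg_{i_\alpha}) \prod_{t\in\{i_1,\dots,i_s\}\setminus\{i_\alpha\}} (1 + g_t)\Bigr] \prod_{i\in I\setminus\{i_1,\dots,i_s\}} (1 + g_i) \\
&\equiv \sum_{\alpha=1}^{s} (Dg_{i_\alpha}) \prod_{t\in\{i_1,\dots,i_s\}\setminus\{i_\alpha\}} (1 + g_t) \pmod{x^{j+2}}.
\end{aligned} \tag{3.53} \]

By (3.52) and (3.53), the coefficients of $x^j$ at the two sides of (3.51) are the same. Since $j \in \mathbb{N}$ is arbitrary, (3.51) is proved. □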

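Theorem 3.22 (The chain rule). Let $f \in F[[x]]$ and $g \in xF[[x]]$. Then

\[ D(f(g)) = (Df)(g)\,Dg. \]

Proof. Let $f = \sum_{j=0}^{\infty} a_jx^j$. By Theorem 3.21 (i),

\[ D\bigl(f(g)\bigr) = D\Bigl(\sum_{j=0}^{\infty} a_jg^j\Bigr) = \sum_{j=0}^{\infty} D(a_jg^j) = \sum_{j=1}^{\infty} j a_jg^{j-1}Dg = (Df)(g)\,Dg. \]

□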


The exponential and logarithmic series. The formal power series

\[ \exp(x) = \sum_{j=0}^{\infty} \frac{1}{j!}x^j \]

and

\[ \log(1 + x) = \sum_{j=1}^{\infty} \frac{(-1)^{j-1}}{j}x^j \]

are called the exponential and logarithmic series, respectively.

Theorem 3.23. We have

\[ D\bigl(\exp(x)\bigr) = \exp(x), \qquad D\bigl(\log(1 + x)\bigr) = \frac{1}{1 + x}. \]

Proof. Both formulas follow from direct computation. We have

\[ D\bigl(\exp(x)\bigr) = D\Bigl(\sum_{j=0}^{\infty} \frac{1}{j!}x^j\Bigr) = \sum_{j=1}^{\infty} \frac{1}{(j-1)!}x^{j-1} = \exp(x) \]

and

\[ D\bigl(\log(1 + x)\bigr) = D\Bigl(\sum_{j=1}^{\infty} \frac{(-1)^{j-1}}{j}x^j\Bigr) = \sum_{j=1}^{\infty} (-1)^{j-1}x^{j-1} = \frac{1}{1 + x}. \]

□

The exponential and logarithmic series give rise to two maps

\[ \exp : xF[[x]] \longrightarrow 1 + xF[[x]], \qquad f \longmapsto \exp(f), \]

\[ \log : 1 + xF[[x]] \longrightarrow xF[[x]], \qquad 1 + f \longmapsto \log(1 + f). \]

The two maps are group isomorphisms between $(xF[[x]], +)$ and $(1 + xF[[x]], \cdot\,)$ and are indeed inverses of each other.

Theorem 3.24. For all $f, g \in xF[[x]]$, we have

\[ \log\bigl(\exp(f)\bigr) = f, \tag{3.54} \]

\[ \exp\bigl(\log(1 + f)\bigr) = 1 + f, \tag{3.55} \]

\[ \exp(f + g) = \exp(f)\exp(g), \tag{3.56} \]

\[ \log\bigl((1 + f)(1 + g)\bigr) = \log(1 + f) + \log(1 + g). \tag{3.57} \]

Proof. Since $D\bigl[\log\bigl(\exp(f)\bigr)\bigr] = \exp(f)^{-1}\exp(f)\,Df = Df$ and $\bigl[\log\bigl(\exp(f)\bigr)\bigr](0) = 0 = f(0)$, we have $\log\bigl(\exp(f)\bigr) = f$. Since

\[ \begin{aligned}
D\bigl[\log\bigl((1 + f)(1 + g)\bigr)\bigr] &= \frac{1}{(1 + f)(1 + g)}\bigl[(Df)(1 + g) + (1 + f)\,Dg\bigr] \\
&= \frac{Df}{1 + f} + \frac{Dg}{1 + g} \\
&= D\bigl[\log(1 + f) + \log(1 + g)\bigr]
\end{aligned} \]

and $\bigl[\log\bigl((1 + f)(1 + g)\bigr)\bigr](0) = 0 = \bigl[\log(1 + f) + \log(1 + g)\bigr](0)$, we have (3.57).

By (3.57) and (3.54), $\log : (1 + xF[[x]], \cdot\,) \to (xF[[x]], +)$ is an onto homomorphism. If $\log(1 + f) = 0$, then $0 = D\bigl(\log(1 + f)\bigr) = \frac{Df}{1 + f}$, i.e., $Df = 0$. Since $f(0) = 0$, we must have $f = 0$. Thus $\log : (1 + xF[[x]], \cdot\,) \to (xF[[x]], +)$ is an isomorphism. By (3.54), $\exp : (xF[[x]], +) \to (1 + xF[[x]], \cdot\,)$ is the inverse isomorphism of $\log$ and the proof of the theorem is complete. □
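The identities of Theorem 3.24 can be verified on truncated series. The sketch below assumes $F = \mathbb{Q}$ and works modulo $x^N$ with exact `Fraction` arithmetic; a series is a list of its first `N` coefficients, and `exp`, `log1p`, and the other helper names are illustrative choices, not notation from the text.

```python
from fractions import Fraction

N = 8  # work modulo x^N

def mul(f, g):
    """Product of two truncated series modulo x^N, following (3.46)."""
    h = [Fraction(0)] * N
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            if i + j < N:
                h[i + j] += a * b
    return h

def compose(coeffs, g):
    """sum_j coeffs[j] * g^j modulo x^N, for g with zero constant term."""
    result = [Fraction(0)] * N
    power = [Fraction(1)] + [Fraction(0)] * (N - 1)   # g^0
    for c in coeffs:
        result = [r + c * p for r, p in zip(result, power)]
        power = mul(power, g)
    return result

def exp(f):
    """exp(f) modulo x^N for f in xF[[x]]."""
    coeffs, fact = [], 1
    for j in range(N):
        if j:
            fact *= j
        coeffs.append(Fraction(1, fact))
    return compose(coeffs, f)

def log1p(f):
    """log(1 + f) modulo x^N for f in xF[[x]]."""
    coeffs = [Fraction(0)] + [Fraction((-1) ** (j - 1), j) for j in range(1, N)]
    return compose(coeffs, f)
```

Given a list `f` of length `N` with `f[0] == 0`, subtracting 1 from the constant term of `exp(f)` and applying `log1p` recovers `f` exactly, which is (3.54) modulo $x^N$; similarly `exp` of a coefficientwise sum equals `mul` of the two exponentials, which is (3.56).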

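Theorem 3.25. Let $f_i \in xF[[x]]$, $i \in I$, such that $\sum_{i\in I} f_i$ is defined.

(i) $\prod_{i\in I} \exp(f_i)$ is defined and

\[ \exp\Bigl(\sum_{i\in I} f_i\Bigr) = \prod_{i\in I} \exp(f_i). \]

(ii) $\sum_{i\in I} \log(1 + f_i)$ is defined and

\[ \log\Bigl(\prod_{i\in I} (1 + f_i)\Bigr) = \sum_{i\in I} \log(1 + f_i). \]

Proof. Exercise. □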

Let $f \in 1 + xF[[x]]$ and $a \in F$. We define

\[ f^a = \exp(a \log f). \tag{3.58} \]

Theorem 3.26. Let $f, g \in 1 + xF[[x]]$ and $a, b \in F$.

(i) $\log(f^a) = a \log f$.

(ii) $f^af^b = f^{a+b}$, $(fg)^a = f^ag^a$, $(f^a)^b = f^{ab}$.

(iii) (The power rule) $D(f^a) = a f^{a-1}Df$.

Proof. (i) Apply $\log$ to both sides of (3.58).

(ii) We only show $(f^a)^b = f^{ab}$; the proofs of the other two identities are similar. By (i), we have $\log\bigl[(f^a)^b\bigr] = b \log(f^a) = ba \log f = \log(f^{ab})$.

(iii) By (3.58), $D(f^a) = \exp(a \log f)\, a \cdot \frac{1}{f}Df = a f^{a-1}Df$. □

Theorem 3.27 (The binomial theorem). Let $f \in xF[[x]]$ and $a \in F$. Then

\[ (1 + f)^a = \sum_{j=0}^{\infty} \binom{a}{j} f^j \tag{3.59} \]

where

\[ \binom{a}{j} = \frac{a(a-1)\cdots(a-j+1)}{j!}. \]

Proof. Let $g = \sum_{j=0}^{\infty} \binom{a}{j} f^j$. Then

\[ \begin{aligned}
(1 + f)\,Dg &= (1 + f) \sum_{j=1}^{\infty} \binom{a}{j} j f^{j-1}Df \\
&= (Df)\Bigl[\sum_{j=0}^{\infty} (j+1)\binom{a}{j+1} f^j + \sum_{j=0}^{\infty} j\binom{a}{j} f^j\Bigr] \\
&= (Df) \sum_{j=0}^{\infty} \Bigl[(j+1)\binom{a}{j+1} + j\binom{a}{j}\Bigr] f^j \\
&= (Df) \sum_{j=0}^{\infty} a\binom{a}{j} f^j \\
&= a g\, Df.
\end{aligned} \]

Therefore,

\[ D\bigl[\log\bigl((1 + f)^a\bigr)\bigr] = D\bigl[a \log(1 + f)\bigr] = a\,\frac{Df}{1 + f} = \frac{Dg}{g} = D(\log g). \]

Since $\bigl[\log\bigl((1 + f)^a\bigr)\bigr](0) = 0 = (\log g)(0)$, we have $\log\bigl[(1 + f)^a\bigr] = \log g$, i.e., $(1 + f)^a = g$. □

Remark. Let $h = (1 + x)^a \in F[[x]]$ and $f \in xF[[x]]$. Theorem 3.27 implies that $h(f) = (1 + f)^a$.
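As a small illustration of (3.59), the generalized binomial coefficients can be computed exactly, and the identity $f^af^b = f^{a+b}$ of Theorem 3.26 (ii) can be tested with $f = 1 + x$, $a = b = \frac12$. This sketch assumes $F = \mathbb{Q}$; the helper names are illustrative.

```python
from fractions import Fraction

def binom(a, j):
    """Generalized binomial coefficient a(a-1)...(a-j+1)/j! for a in Q."""
    a = Fraction(a)
    result = Fraction(1)
    for i in range(j):
        result *= (a - i) / Fraction(i + 1)
    return result

def binomial_series(a, n):
    """First n coefficients of (1 + x)^a according to (3.59) with f = x."""
    return [binom(a, j) for j in range(n)]

def mul_trunc(f, g):
    """Product of two series truncated to len(f) coefficients."""
    n = len(f)
    return [sum(f[i] * g[l - i] for i in range(l + 1)) for l in range(n)]
```

Squaring `binomial_series(Fraction(1, 2), n)` with `mul_trunc` gives the coefficient list of $1 + x$ (that is, `1, 1, 0, 0, ...`), as expected from $(1+x)^{1/2}(1+x)^{1/2} = 1 + x$.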

3.5. The Davenport-Hasse Theorem and Evaluation of the Gauss Quadratic Sum over Fq

Let $\lambda$ be a multiplicative character of $\mathbb{F}_q$, $\chi$ an additive character of $\mathbb{F}_q$ and $n \in \mathbb{Z}^+$. Then $\lambda' = \lambda \circ N_{\mathbb{F}_{q^n}/\mathbb{F}_q}$ is a multiplicative character of $\mathbb{F}_{q^n}$ and $\chi' = \chi \circ \mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}$ is an additive character of $\mathbb{F}_{q^n}$. $\lambda'$ and $\chi'$ are called the lifts of $\lambda$ and $\chi$ to $\mathbb{F}_{q^n}$, respectively. The canonical additive character $e$ of $\mathbb{F}_q$ is lifted to the canonical additive character $e'$ of $\mathbb{F}_{q^n}$. The Gauss sums $G(\lambda)$ and $G(\lambda')$ are defined on $\mathbb{F}_q$ and $\mathbb{F}_{q^n}$, respectively. The main result of this section is the following theorem.

Theorem 3.28 (The Davenport-Hasse theorem). In the above notation, we have

\[ -G(\lambda') = \bigl[-G(\lambda)\bigr]^n. \]

Despite the simplicity of its statement, the Davenport-Hasse theorem is a nontrivial result. Its proof requires a little preparation.

Let $M$ be the set of all monic polynomials in $\mathbb{F}_q[x]$ and $I$ the set of all monic irreducible polynomials in $\mathbb{F}_q[x]$. Given any multiplicative character $\lambda$ of $\mathbb{F}_q$, define a map $\rho : M \to \mathbb{C}$ by

\[ 1 \longmapsto 1, \qquad x^k - a_{k-1}x^{k-1} + \cdots + (-1)^ka_0 \longmapsto \lambda(a_0)\,e(a_{k-1}), \quad k > 0. \]

We claim that

\[ \rho(fg) = \rho(f)\rho(g) \quad \text{for all } f, g \in M. \tag{3.60} \]

If one of $f$ and $g$ is $1$, (3.60) is obviously true. If $f = x^k - a_{k-1}x^{k-1} + \cdots + (-1)^ka_0$, $k > 0$, and $g = x^l - b_{l-1}x^{l-1} + \cdots + (-1)^lb_0$, $l > 0$, then

\[ fg = x^{k+l} - (a_{k-1} + b_{l-1})x^{k+l-1} + \cdots + (-1)^{k+l}a_0b_0. \]

Therefore,

\[ \rho(fg) = \lambda(a_0b_0)\,e(a_{k-1} + b_{l-1}) = \lambda(a_0)e(a_{k-1})\,\lambda(b_0)e(b_{l-1}) = \rho(f)\rho(g). \]

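Lemma 3.29.

(i) Let $f \in I$ such that $\deg f = d$ and $d \mid n$. Let $\alpha \in \mathbb{F}_{q^n}$ be a root of $f$. Then

\[ \rho(f)^{n/d} = \lambda'(\alpha)\,e'(\alpha). \]

(ii) We have

\[ G(\lambda') = \sum_{\substack{f\in I \\ \deg f \mid n}} (\deg f)\,\rho(f)^{n/\deg f}. \]

Proof. (i) Since $f$ is the minimal polynomial of $\alpha$ over $\mathbb{F}_q$, we have $\mathbb{F}_q(\alpha) = \mathbb{F}_{q^d}$. Moreover, by Proposition 2.4 (iii),

\[ \begin{aligned}
f &= \prod_{\gamma\in \mathrm{Aut}(\mathbb{F}_{q^d}/\mathbb{F}_q)} \bigl(x - \gamma(\alpha)\bigr) \\
&= x^d - \Bigl(\sum_{\gamma\in \mathrm{Aut}(\mathbb{F}_{q^d}/\mathbb{F}_q)} \gamma(\alpha)\Bigr)x^{d-1} + \cdots + (-1)^d \prod_{\gamma\in \mathrm{Aut}(\mathbb{F}_{q^d}/\mathbb{F}_q)} \gamma(\alpha) \\
&= x^d - \bigl(\mathrm{Tr}_{\mathbb{F}_{q^d}/\mathbb{F}_q}(\alpha)\bigr)x^{d-1} + \cdots + (-1)^d N_{\mathbb{F}_{q^d}/\mathbb{F}_q}(\alpha).
\end{aligned} \]

Thus,

\[ \begin{aligned}
\rho(f)^{n/d} &= \Bigl[\lambda\bigl(N_{\mathbb{F}_{q^d}/\mathbb{F}_q}(\alpha)\bigr)\, e\bigl(\mathrm{Tr}_{\mathbb{F}_{q^d}/\mathbb{F}_q}(\alpha)\bigr)\Bigr]^{n/d} \\
&= \lambda\Bigl(\bigl(N_{\mathbb{F}_{q^d}/\mathbb{F}_q}(\alpha)\bigr)^{n/d}\Bigr)\, e\Bigl(\frac{n}{d}\,\mathrm{Tr}_{\mathbb{F}_{q^d}/\mathbb{F}_q}(\alpha)\Bigr) \\
&= \lambda\bigl(N_{\mathbb{F}_{q^n}/\mathbb{F}_q}(\alpha)\bigr)\, e\bigl(\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}(\alpha)\bigr) \quad \text{(Thm 1.9 (ii) and Thm 1.10 (ii))} \\
&= \lambda'(\alpha)\,e'(\alpha).
\end{aligned} \]

(ii) Since $x^{q^n} - x = \prod_{f\in I,\ \deg f \mid n} f$, $\mathbb{F}_{q^n}$ is partitioned into the sets of roots of $f$, where $f$ runs through $I$ with $\deg f \mid n$. Thus,

\[ \begin{aligned}
G(\lambda') &= \sum_{\alpha\in \mathbb{F}_{q^n}} \lambda'(\alpha)\,e'(\alpha) \\
&= \sum_{\substack{f\in I \\ \deg f \mid n}}\ \sum_{\substack{\alpha\in \mathbb{F}_{q^n} \\ \alpha \text{ is a root of } f}} \lambda'(\alpha)\,e'(\alpha) \\
&= \sum_{\substack{f\in I \\ \deg f \mid n}} (\deg f)\,\rho(f)^{n/\deg f} \quad \text{(by (i))}.
\end{aligned} \]

□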

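Proof of Theorem 3.28. In the ring of formal power series $\mathbb{C}[[t]]$, we have

\[ \begin{aligned}
\prod_{f\in I} \frac{1}{1 - \rho(f)t^{\deg f}} &= \prod_{f\in I} \Bigl(\sum_{j=0}^{\infty} \rho(f^j)\,t^{j\deg f}\Bigr) \\
&= \sum_{j=0}^{\infty} \Bigl(\sum_{k\le j}\ \sum_{\substack{f_1,\dots,f_k\in I \\ j_1,\dots,j_k\ge 1 \\ j_1\deg f_1+\cdots+j_k\deg f_k=j}} \rho(f_1^{j_1}\cdots f_k^{j_k})\Bigr)t^j \\
&= \sum_{j=0}^{\infty} \Bigl(\sum_{\substack{f\in M \\ \deg f=j}} \rho(f)\Bigr)t^j \quad (f = f_1^{j_1}\cdots f_k^{j_k}).
\end{aligned} \tag{3.61} \]

We now determine the inner sum in the last step of (3.61). Clearly,

\[ \sum_{\substack{f\in M \\ \deg f=0}} \rho(f) = 1 \]

and

\[ \sum_{\substack{f\in M \\ \deg f=1}} \rho(f) = \sum_{a\in \mathbb{F}_q} \rho(x - a) = \sum_{a\in \mathbb{F}_q} \lambda(a)e(a) = G(\lambda). \]

For $j > 1$,

\[ \begin{aligned}
\sum_{\substack{f\in M \\ \deg f=j}} \rho(f) &= \sum_{a_0,\dots,a_{j-1}\in \mathbb{F}_q} \rho\bigl(x^j - a_{j-1}x^{j-1} + \cdots + (-1)^ja_0\bigr) \\
&= q^{j-2}\Bigl(\sum_{a_0\in \mathbb{F}_q} \lambda(a_0)\Bigr)\Bigl(\sum_{a_{j-1}\in \mathbb{F}_q} e(a_{j-1})\Bigr) \\
&= 0.
\end{aligned} \]

Therefore, (3.61) becomes

\[ \prod_{f\in I} \frac{1}{1 - \rho(f)t^{\deg f}} = 1 + G(\lambda)t. \tag{3.62} \]

Taking the logarithm of both sides of (3.62) and using Theorem 3.25 (ii), we get

\[ -\sum_{f\in I} \log\bigl(1 - \rho(f)t^{\deg f}\bigr) = \log\bigl(1 + G(\lambda)t\bigr). \tag{3.63} \]

Differentiating (3.63) and multiplying both sides of the result by $t$, we have

\[ \sum_{f\in I} \frac{(\deg f)\,\rho(f)t^{\deg f}}{1 - \rho(f)t^{\deg f}} = \frac{G(\lambda)t}{1 + G(\lambda)t}. \tag{3.64} \]

In (3.64),

\[ \sum_{f\in I} \frac{(\deg f)\,\rho(f)t^{\deg f}}{1 - \rho(f)t^{\deg f}} = \sum_{f\in I} (\deg f) \sum_{j=1}^{\infty} \rho(f)^jt^{j\deg f} \tag{3.65} \]

and

\[ \frac{G(\lambda)t}{1 + G(\lambda)t} = \sum_{j=1}^{\infty} (-1)^{j-1}G(\lambda)^jt^j. \tag{3.66} \]

A comparison of the coefficients of $t^n$ in (3.65) and (3.66) yields

\[ (-1)^{n-1}G(\lambda)^n = \sum_{\substack{f\in I \\ \deg f \mid n}} (\deg f)\,\rho(f)^{n/\deg f} = G(\lambda'), \]

where the second equal sign follows from Lemma 3.29 (ii). Therefore, Theorem 3.28 is proved. □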

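Theorem 3.30. Let $p$ be an odd prime and $q = p^n$, $n \in \mathbb{Z}^+$. Let $\eta$ be the quadratic character of $\mathbb{F}_q$. Then

\[ G(\eta) = \begin{cases} (-1)^{n-1}q^{1/2} & \text{if } p \equiv 1 \pmod 4, \\ (-1)^{n-1}i^nq^{1/2} & \text{if } p \equiv -1 \pmod 4. \end{cases} \]

Proof. Let $\lambda$ be the quadratic character of $\mathbb{F}_p$. Since $N_{\mathbb{F}_q/\mathbb{F}_p} : \mathbb{F}_q^* \to \mathbb{F}_p^*$ is onto, $\lambda' = \lambda \circ N_{\mathbb{F}_q/\mathbb{F}_p}$ is a multiplicative character of $\mathbb{F}_q$ of order $2$. Hence $\lambda' = \eta$. The conclusion of the theorem follows immediately from Theorems 3.28 and 3.12. □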
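The Davenport-Hasse relation can be tested in the smallest nontrivial setting, $n = 2$. The sketch below assumes $p \equiv 3 \pmod 4$, so that $\mathbb{F}_{p^2}$ can be modelled as $\mathbb{F}_p[i]$ with $i^2 = -1$; then $N(a + bi) = a^2 + b^2$ and $\mathrm{Tr}(a + bi) = 2a$. The quadratic character of $\mathbb{F}_p$ is computed with Euler's criterion, and all function names are illustrative.

```python
import cmath

def legendre(a, p):
    """Quadratic character of F_p (0 at 0), via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def gauss_sum_prime(p):
    """G(lambda) over F_p for the quadratic character lambda."""
    w = cmath.exp(2j * cmath.pi / p)
    return sum(legendre(y, p) * w ** y for y in range(p))

def gauss_sum_lift(p):
    """G(lambda') over F_{p^2} = F_p[i], assuming p = 3 mod 4 (so x^2 + 1 is irreducible).

    lambda'(a + bi) = lambda(a^2 + b^2), e'(a + bi) = w^(2a).
    """
    w = cmath.exp(2j * cmath.pi / p)
    total = 0
    for a in range(p):
        for b in range(p):
            norm = (a * a + b * b) % p
            trace = (2 * a) % p
            total += legendre(norm, p) * w ** trace
    return total
```

For $p = 3$, `gauss_sum_prime(3)` is $i\sqrt3$ and `gauss_sum_lift(3)` is $3$ up to rounding, consistent with $-G(\lambda') = [-G(\lambda)]^2 = -3$ and with Theorem 3.30 for $q = 9$.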

3.6. Dedekind Domains and Number Fields

In this and the next section, we gather some basic facts from algebraic number theory. The purpose is to provide an adequate background so that the Stickelberger theorem on the Gauss sum in Section 3.8 can be accurately stated and proved.

Ring extensions. Let $S$ be a commutative ring and $R$ a subring of $S$. $S$ is called an extension of $R$. An element $\alpha \in S$ is called integral over $R$ if there is a monic polynomial $f \in R[x]$ such that $f(\alpha) = 0$. If every element of $S$ is integral over $R$, $S$ is said to be integral over $R$. For each $\alpha \in S$, the subring of $S$ generated by $R$ and $\alpha$ is $R[\alpha] = \{f(\alpha) : f \in R[x]\}$.

Proposition 3.31. Let $R \subset S$ be commutative rings and $\alpha \in S$. Then the following statements are equivalent.

(i) $\alpha$ is integral over $R$;

(ii) $R[\alpha]$ is a finitely generated $R$-module;

(iii) $\alpha$ is contained in a subring $T$ of $S$ which is a finitely generated $R$-module.

Proof. (i) ⇒ (ii). Since $\alpha$ is integral over $R$, there exists a monic polynomial $f \in R[x]$ such that $f(\alpha) = 0$. Let $f = x^n + a_{n-1}x^{n-1} + \cdots + a_0$. Then

\[ \alpha^n = -a_0 - \cdots - a_{n-1}\alpha^{n-1}. \]

It follows that $R[\alpha]$ is generated by $1, \alpha, \dots, \alpha^{n-1}$ as an $R$-module.

(ii) ⇒ (iii). Let $T = R[\alpha]$.

(iii) ⇒ (i). Let $T$ be generated by $\varepsilon_1, \dots, \varepsilon_n$ as an $R$-module. We may assume $\varepsilon_1 = 1$ since we may insert $1$ into the generating set. Then

\[ \alpha \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_n \end{pmatrix} = A \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_n \end{pmatrix} \]

for some $n \times n$ matrix $A$ over $R$. Hence,

\[ (\alpha I_n - A) \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_n \end{pmatrix} = \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}. \]

Since $\varepsilon_1 = 1$, the first column of $\alpha I_n - A$ is a linear combination of the other columns. Thus, $\det(\alpha I_n - A) = 0$. Since $\det(xI_n - A)$ is a monic polynomial in $R[x]$, $\alpha$ is integral over $R$. □
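Proposition 3.32. Let $R \subset S$ be integral domains and $\alpha \in S$. If there is a finitely generated $R$-module $0 \ne M \subset S$ such that $\alpha M \subset M$, then $\alpha$ is integral over $R$.

Proof. The proof is almost identical to the proof of (iii) ⇒ (i) in Proposition 3.31. Let $M$ be generated by $\varepsilon_1, \dots, \varepsilon_n$ as an $R$-module where $0 \ne \varepsilon_i \in S$, $1 \le i \le n$. Since $\alpha M \subset M$, we have

\[ \alpha \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_n \end{pmatrix} = A \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_n \end{pmatrix} \]

for some $n \times n$ matrix $A$ over $R$. The above equation implies that $\alpha I_n - A$, as a matrix over the quotient field of $S$, is singular. Thus, $\det(\alpha I_n - A) = 0$, proving that $\alpha$ is integral over $R$. □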

Proposition 3.33. Let R ⊂ S be commutative rings.

(i) Lat A ⊂ S such that every element in A is integral over R. Then thesubring R[A] of S generated by R and A is integral over R.

(ii) Let R = {α ∈ S : α is integral over R}. Then R is a subring of S.

In Proposition 3.33 (ii), the ring R is called the integral closure of R in S. IfR = R, R is called integrally closed in S.

Proof of Proposition 3.33. (i) Let α ∈ R[A]. Then α ∈ R[α_1, . . . , α_n] for some α_1, . . . , α_n ∈ A, where R[α_1, . . . , α_n] is the subring of S generated by R and {α_1, . . . , α_n}. For each 1 ≤ i ≤ n, since α_i is integral over R, it is integral over R[α_1, . . . , α_{i−1}]. By Proposition 3.31, R[α_1, . . . , α_i] is a finitely generated R[α_1, . . . , α_{i−1}]-module. From the tower

R ⊂ R[α_1] ⊂ R[α_1, α_2] ⊂ · · · ⊂ R[α_1, . . . , α_n],

it follows immediately that R[α_1, . . . , α_n] is a finitely generated R-module. By Proposition 3.31, α is integral over R.

(ii) Write R̄ for the set of elements of S that are integral over R. It suffices to show that if α, β ∈ R̄, then α − β ∈ R̄ and αβ ∈ R̄. By (i), R[α, β] ⊂ R̄. Thus α − β, αβ ∈ R[α, β] ⊂ R̄. □

Proposition 3.34. Let R ⊂ S ⊂ T be commutative rings such that S is integral over R and T is integral over S. Then T is integral over R.

Proof. Let α ∈ T . Since α is integral over S, there exists f = x^n + a_{n−1}x^{n−1} + · · · + a_0 ∈ S[x] such that f(α) = 0. Since f ∈ R[a_0, . . . , a_{n−1}][x], α is integral over R[a_0, . . . , a_{n−1}], i.e., R[a_0, . . . , a_{n−1}, α] = R[a_0, . . . , a_{n−1}][α] is a finitely generated R[a_0, . . . , a_{n−1}]-module. Since a_0, . . . , a_{n−1} are integral over R, by the proof of Proposition 3.33 (i), R[a_0, . . . , a_{n−1}] is a finitely generated R-module. Thus, R[a_0, . . . , a_{n−1}, α] = R[a_0, . . . , a_{n−1}][α] is a finitely generated R-module. Since α ∈ R[a_0, . . . , a_{n−1}, α], by Proposition 3.31, α is integral over R. □


Fractional ideals. Let o be an integral domain with quotient field F . In this section, all ideals are meant to be nonzero.

Definition 3.35. A nonzero o-module 𝔞 ⊂ F is called a fractional ideal of o if there exists 0 ≠ a ∈ o such that a𝔞 ⊂ o.

If 𝔟 is an (ordinary) ideal of o and 0 ≠ α ∈ F , then α𝔟 is a fractional ideal of o since there exists 0 ≠ a ∈ o such that aα ∈ o and thus a(α𝔟) ⊂ o. On the other hand, if 𝔞 is a fractional ideal of o, then a𝔞 ⊂ o for some 0 ≠ a ∈ o. So, a𝔞 is an ideal of o and 𝔞 = (1/a)(a𝔞). Therefore, fractional ideals of o are precisely α𝔟, where 𝔟 is an ideal of o and 0 ≠ α ∈ F .

Let 𝔞, 𝔟 be two fractional ideals of o. The product

𝔞𝔟 := { a_1b_1 + · · · + a_nb_n : a_i ∈ 𝔞, b_i ∈ 𝔟, n ∈ N }

is a fractional ideal of o. In fact, if 𝔞 = α𝔞′ and 𝔟 = β𝔟′, where 𝔞′ and 𝔟′ are ideals of o and α, β ∈ F \ {0}, then 𝔞𝔟 = αβ𝔞′𝔟′.

For any fractional ideal 𝔞 of o, define

(3.67) 𝔞^{-1} = {α ∈ F : α𝔞 ⊂ o}.

Then 𝔞^{-1} is also a fractional ideal of o. (Choose 0 ≠ a ∈ 𝔞 ∩ o. Then a𝔞^{-1} ⊂ o.) Moreover, 𝔞^{-1}𝔞 ⊂ o. A fractional ideal 𝔞 of o is called invertible if 𝔞𝔟 = o for some fractional ideal 𝔟 of o; 𝔟 is called the inverse of 𝔞. Let 𝔞 = α𝔞′, where 0 ≠ α ∈ F and 𝔞′ is an ideal of o. Then 𝔞 is invertible if and only if there exists an ideal 𝔟′ of o such that 𝔞′𝔟′ is a principal ideal of o. In fact, if 𝔞′𝔟′ = (a), where 0 ≠ a ∈ o, then 𝔞 · (1/(αa))𝔟′ = o. On the other hand, assume 𝔞𝔟 = o for some fractional ideal 𝔟 of o. Write 𝔟 = β𝔟′, where 0 ≠ β ∈ F and 𝔟′ is an ideal of o. Choose 0 ≠ c ∈ o such that c/(αβ) ∈ o. Then 𝔞′(c𝔟′) = (c/(αβ))o is a principal ideal of o.

Proposition 3.36. Let 𝔞 be an invertible fractional ideal of o.

(i) The inverse of 𝔞 is unique and is 𝔞^{-1}.
(ii) If 𝔞𝔟 = 𝔞𝔠, where 𝔟 and 𝔠 are fractional ideals of o, then 𝔟 = 𝔠.

Proof. (i) Let 𝔡 be a fractional ideal of o such that 𝔡𝔞 = o. By (3.67), 𝔡 ⊂ 𝔞^{-1}. On the other hand, 𝔞^{-1} = o𝔞^{-1} = 𝔡𝔞𝔞^{-1} ⊂ 𝔡o = 𝔡. Thus, 𝔡 = 𝔞^{-1}.

(ii) We have 𝔟 = o𝔟 = 𝔞^{-1}𝔞𝔟 = 𝔞^{-1}𝔞𝔠 = 𝔠. □
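The invertibility criterion above (an ideal is invertible as soon as some multiple of it by an ideal is principal) can be seen in a standard example that is not taken from the text: assume o = Z[√−5] and let 𝔞 = (2, 1 + √−5). The following worked computation is a sketch under that assumption.

```latex
\[
\mathfrak{a} = \bigl(2,\ 1+\sqrt{-5}\bigr), \qquad
\mathfrak{a}^2 = \bigl(4,\ 2(1+\sqrt{-5}),\ (1+\sqrt{-5})^2\bigr)
             = \bigl(4,\ 2+2\sqrt{-5},\ -4+2\sqrt{-5}\bigr).
\]
% Each generator is a multiple of 2, so \mathfrak{a}^2 \subset (2).
% Conversely, 2 is a combination of the generators:
\[
2 = (2+2\sqrt{-5}) - (-4+2\sqrt{-5}) - 4 \in \mathfrak{a}^2 .
\]
% Hence \mathfrak{a}^2 = (2) is principal, and
\[
\mathfrak{a}\cdot\tfrac{1}{2}\mathfrak{a} = \mathfrak{o},
\qquad\text{so}\qquad
\mathfrak{a}^{-1} = \tfrac{1}{2}\,\mathfrak{a}.
\]
```

In the notation of the criterion, this is the case α = 1, 𝔞′ = 𝔟′ = 𝔞 and a = 2.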

Dedekind domains.

Definition 3.37. An integral domain o is called a Dedekind domain if every proper ideal of o is a product of finitely many prime ideals of o.

It will be shown (Lemmas 3.39 and 3.40) that in a Dedekind domain, the factorization of a proper ideal into a finite product of prime ideals is unique. So, the ideals in a Dedekind domain enjoy the same property as the elements do in a UFD. However, a Dedekind domain is not necessarily a UFD and vice versa (Exercise 3.3). It is clear that a PID is a Dedekind domain.

An integral domain o is called integrally closed if it is integrally closed in its quotient field. It is easy to see that Z is integrally closed.

The main result of this subsection is the following characterization of Dedekind domains.


Theorem 3.38. An integral domain o is a Dedekind domain if and only if the following conditions are all satisfied.

(i) o is noetherian;
(ii) o is integrally closed;
(iii) every prime ideal of o is maximal.

Lemma 3.39. Let o be an integral domain. If p_1, . . . , p_m and q_1, . . . , q_n are invertible prime ideals of o such that p_1 · · · p_m = q_1 · · · q_n, then m = n and q_1, . . . , q_n is a permutation of p_1, . . . , p_m.

Proof. Among p_1, . . . , p_m, choose one, say p_1, such that p_1 is minimal with respect to inclusion. Since q_1 · · · q_n = p_1 · · · p_m ⊂ p_1 and p_1 is prime, one of q_1, . . . , q_n, say q_1, is contained in p_1. Since p_1 · · · p_m = q_1 · · · q_n ⊂ q_1 and q_1 is prime, p_i ⊂ q_1 for some 1 ≤ i ≤ m. Since p_i ⊂ q_1 ⊂ p_1, by the minimality of p_1, we have p_i = p_1; hence q_1 = p_1. Since p_1 = q_1 is invertible, we have p_2 · · · p_m = q_2 · · · q_n. The conclusion of the lemma follows by induction. □

Lemma 3.40. Let o be a Dedekind domain. Then every prime ideal of o is invertible and maximal.

Proof. Let p be a prime ideal of o.

1° If p is invertible, then it is maximal.

Assume to the contrary that p is not maximal. Then there exists a ∈ o \ p such that p + ao ≠ o. Since o is a Dedekind domain, we can write

(3.68) p + ao = p_1 · · · p_m,
(3.69) p + a^2 o = q_1 · · · q_n,

where p_1, . . . , p_m, q_1, . . . , q_n are prime ideals of o. Let π : o → o/p be the canonical homomorphism. For each 1 ≤ i ≤ m, since (o/p)/π(p_i) = (o/p)/(p_i/p) ≅ o/p_i is an integral domain, π(p_i) is a prime ideal of o/p. In the same way, π(q_j), 1 ≤ j ≤ n, are prime ideals of o/p. Applying π to (3.68) and (3.69), we get

(3.70) (π(a)) = π(p_1) · · · π(p_m),
(3.71) (π(a^2)) = π(q_1) · · · π(q_n).

Since (π(a)) and (π(a^2)) are principal ideals of o/p and thus invertible, π(p_1), . . . , π(p_m), π(q_1), . . . , π(q_n) are all invertible ideals of o/p. Equations (3.70) and (3.71) also give

π(q_1) · · · π(q_n) = (π(a^2)) = (π(a))^2 = (π(p_1))^2 · · · (π(p_m))^2.

By Lemma 3.39, n = 2m and, without loss of generality, π(q_{2i−1}) = π(q_{2i}) = π(p_i), 1 ≤ i ≤ m. It follows that q_{2i−1} = q_{2i} = p_i, 1 ≤ i ≤ m. Therefore,

p + a^2 o = q_1 · · · q_n = (p_1 · · · p_m)^2 = (p + ao)^2.

For each b ∈ p, we have b ∈ p + a^2 o = (p + ao)^2 ⊂ p^2 + ao. Write b = c + ar with c ∈ p^2 and r ∈ o. Then ar = b − c ∈ p. Since p is prime and a ∉ p, we must have r ∈ p. Hence b = c + ar ∈ p^2 + ap = p(p + ao). Therefore, we have proved that p ⊂ p(p + ao). Since p is invertible, we have o = p^{-1}p ⊂ p^{-1}p(p + ao) = p + ao, which is a contradiction.

2° p is invertible.

Choose 0 ≠ b ∈ p and write (b) = p_1 · · · p_m where p_1, . . . , p_m are prime ideals of o. Since (b) is principal, p_1, . . . , p_m are all invertible and by 1°, they are all maximal. Since p_1 · · · p_m = (b) ⊂ p and p is prime, p_i ⊂ p for some 1 ≤ i ≤ m. The maximality of p_i implies that p = p_i. So, p is invertible. □

Corollary 3.41. Every fractional ideal of a Dedekind domain o is invertible.

Proof. Let F be the quotient field of o and let a be a fractional ideal of o. Then a = αa′ where a′ is an ideal of o and 0 ≠ α ∈ F . Since a′ is a finite product of prime ideals and, by Lemma 3.40, every prime ideal of o is invertible, it follows that a is invertible. □

Proof of Theorem 3.38. Let F be the quotient field of o.

(⇒) (iii) is already proved in Lemma 3.40.

(i) Let 𝔞 be any ideal of o. By Corollary 3.41, 𝔞 is invertible, i.e., 𝔞^{-1}𝔞 = o. Write 1 = b_1a_1 + · · · + b_na_n, where b_i ∈ 𝔞^{-1}, a_i ∈ 𝔞, 1 ≤ i ≤ n. We claim that 𝔞 is generated by a_1, . . . , a_n. In fact, for each a ∈ 𝔞, a = (ab_1)a_1 + · · · + (ab_n)a_n, where ab_i ∈ o since b_i ∈ 𝔞^{-1}. Therefore, 𝔞 is finitely generated. Hence, o is noetherian.

(ii) Let α ∈ F be integral over o. By Proposition 3.31, o[α] is a finitely generated o-module. It follows that o[α] is a fractional ideal of o. (Let o[α] = oα_1 + · · · + oα_n where α_1, . . . , α_n ∈ F . Then there exists 0 ≠ a ∈ o such that aα_i ∈ o for all 1 ≤ i ≤ n. So, ao[α] = a(oα_1 + · · · + oα_n) ⊂ o.) By Corollary 3.41, o[α] is invertible. Also note that o[α]o[α] = o[α] since o[α] is a ring. Thus,

α ∈ o[α] = o[α]^{-1}o[α]o[α] = o[α]^{-1}o[α] = o.

So, o is integrally closed.

(⇐) 1° For every ideal a of o, there exist prime ideals p_1, . . . , p_n of o such that p_1 · · · p_n ⊂ a.

Assume the contrary. Let A be the set of all ideals of o which do not have the claimed property. Since A ≠ ∅ and o is noetherian, A has a maximal element a. Obviously, a is not a prime ideal. Hence there exist b_1, b_2 ∈ o \ a such that b_1b_2 ∈ a. Put a_1 = a + (b_1) and a_2 = a + (b_2). Then a_1a_2 ⊂ a but a ⊊ a_1 and a ⊊ a_2. By the maximality of a, we have a_1 ∉ A and a_2 ∉ A, i.e., there exist prime ideals p_1, . . . , p_m, q_1, . . . , q_n of o such that p_1 · · · p_m ⊂ a_1 and q_1 · · · q_n ⊂ a_2. Thus,

p_1 · · · p_m q_1 · · · q_n ⊂ a_1a_2 ⊂ a,

which is a contradiction.

2° Every maximal ideal p of o is invertible.

Assume to the contrary that p is not invertible. Since op ⊂ o, by the definition of p^{-1}, we have o ⊂ p^{-1}. Thus, p ⊂ p^{-1}p ⊂ o. Since p is maximal and p^{-1}p ≠ o, we must have p = p^{-1}p. Since p is finitely generated, by Proposition 3.32, elements in p^{-1} are integral over o. Since o is integrally closed, we have p^{-1} ⊂ o.

On the other hand, choose 0 ≠ a ∈ p. By 1°, there are prime ideals p_1, . . . , p_n of o such that p_1 · · · p_n ⊂ (a) ⊂ p. We may assume that p_1, . . . , p_n are chosen so that n is minimal. Since p is prime, one of p_1, . . . , p_n, say p_1, is contained in p. Since p_1 is maximal, p_1 = p. By the minimality of n, p_2 · · · p_n ⊄ (a). Choose b ∈ p_2 · · · p_n \ (a). Then b/a ∉ o, but (b/a)p ⊂ (1/a)p_1p_2 · · · p_n ⊂ (1/a)(a) = o, i.e., b/a ∈ p^{-1}. This is a contradiction to the fact that p^{-1} ⊂ o established above.

3° o is a Dedekind domain.

Assume to the contrary that there is an ideal a of o which is not a finite product of prime ideals of o. Since o is noetherian, we may assume that a is maximal among all ideals of o which are not a finite product of prime ideals. Choose a maximal ideal p of o such that a ⊂ p. Clearly, a ⊂ ap^{-1} ⊂ pp^{-1} = o, where the last equality holds by 2°. If ap^{-1} = a, since a is finitely generated, by Proposition 3.32, the elements of p^{-1} are integral over o; hence p^{-1} ⊂ o because o is integrally closed. Then p^{-1}p ⊂ op = p ≠ o, contradicting 2°. So, a ⊊ ap^{-1} ⊂ o. By the maximality of a, ap^{-1} is a finite product of prime ideals of o. Therefore, a = (ap^{-1})p is also a finite product of prime ideals, which is a contradiction. □

Let o be a Dedekind domain and p a prime ideal of o. Then o = p^0 ⊃ p^1 ⊃ p^2 ⊃ · · · is a strictly descending sequence. Moreover, ⋂_{i≥0} p^i = {0}. (Otherwise, ⋂_{i≥0} p^i is an invertible ideal of o. Then from p ⋂_{i≥0} p^i = ⋂_{i≥0} p^i, we get p = o, which is a contradiction.) Thus, for every ideal a of o, there is a unique i ∈ N such that a ⊂ p^i but a ⊄ p^{i+1}. The integer i is denoted by ν_p(a) and is called the p-adic order of a. The set of all prime ideals of o is called the spectrum of o and is denoted by Spec(o).

Theorem 3.42. Let a be an ideal of a Dedekind domain o. Then ν_p(a) > 0 for only finitely many p ∈ Spec(o) and

a = ∏_{p∈Spec(o)} p^{ν_p(a)}.

Proof. Write a = ∏_{p∈Spec(o)} p^{i_p} where i_p ∈ N, p ∈ Spec(o), and i_p > 0 for only finitely many p. For each p ∈ Spec(o), we will show that i_p = ν_p(a). Since a ⊂ p^{i_p}, we have i_p ≤ ν_p(a). If, to the contrary, i_p < ν_p(a), then a ⊂ p^{i_p+1}. Since p is invertible, a cancellation of p^{i_p} implies that ∏_{q∈Spec(o)\{p}} q^{i_q} ⊂ p. Since p is prime, q ⊂ p for some q ∈ Spec(o) \ {p}. But this is impossible since q is maximal. □

It is clear from Theorem 3.42 that if a and b are ideals of o, then ν_p(ab) = ν_p(a) + ν_p(b).
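For the simplest Dedekind domain o = Z (a PID), Theorem 3.42 is the factorization of nZ into prime powers, and the additivity of ν_p is the usual rule for exponents. The following is a minimal sketch under that assumption; the function names `valuation` and `factor_ideal` are hypothetical.

```python
def valuation(p, n):
    """nu_p(nZ): the largest i such that nZ is contained in (pZ)^i, for n != 0."""
    i = 0
    while n % p == 0:
        n //= p
        i += 1
    return i

def factor_ideal(n):
    """Return {p: nu_p(nZ)} over the primes p with nu_p(nZ) > 0, for n >= 2.

    Plain trial division; adequate only for small n.
    """
    exps, p = {}, 2
    while p * p <= n:
        if n % p == 0:
            exps[p] = valuation(p, n)
            n //= p ** exps[p]
        p += 1
    if n > 1:
        exps[n] = 1
    return exps

def ideal_generator(exps):
    """Multiply the prime powers back together: the generator of prod p^{nu_p}."""
    g = 1
    for p, e in exps.items():
        g *= p ** e
    return g
```

For instance, `factor_ideal(360)` records the exponents of 2, 3 and 5, and `ideal_generator` recovers 360 from them, which is the statement a = ∏ p^{ν_p(a)} for a = 360Z.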

Number fields.

Lemma 3.43. Let o be an integral domain with quotient field F . If α (in some extension of F ) is algebraic over F , then there exists 0 ≠ a ∈ o such that aα is integral over o.

Proof. There exist a_0, . . . , a_{n−1} ∈ F such that α^n + a_{n−1}α^{n−1} + · · · + a_0 = 0. Choose 0 ≠ a ∈ o such that aa_i ∈ o for all 0 ≤ i ≤ n − 1. Since

(aα)^n + aa_{n−1}(aα)^{n−1} + · · · + a^{n−1}a_1(aα) + a^n a_0 = 0,

where a^n a_0, a^{n−1}a_1, . . . , aa_{n−1} ∈ o, aα is integral over o. □
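The scaling in the proof of Lemma 3.43 is easy to carry out when o = Z and F = Q. The sketch below assumes that setting and takes a to be the least common multiple of the denominators (any common multiple would do); the function name `clear_denominators` is hypothetical.

```python
from fractions import Fraction
from math import gcd

def clear_denominators(coeffs):
    """Scale a root of a monic rational polynomial to an algebraic integer.

    coeffs = [a_0, ..., a_{n-1}] are the rational coefficients of
    x^n + a_{n-1} x^{n-1} + ... + a_0, which has alpha as a root.
    Returns (a, ints), where a is a positive integer and
    ints = [a^n a_0, a^{n-1} a_1, ..., a a_{n-1}, 1] are the integer
    coefficients (constant term first) of a monic polynomial with root a*alpha.
    """
    n = len(coeffs)
    a = 1
    for c in coeffs:
        d = Fraction(c).denominator
        a = a * d // gcd(a, d)  # running least common multiple
    scaled = [Fraction(c) * a ** (n - i) for i, c in enumerate(coeffs)]
    assert all(s.denominator == 1 for s in scaled)
    return a, [int(s) for s in scaled] + [1]
```

As a usage example, α = 1/√2 is a root of x^2 − 1/2; the function returns a = 2 together with the coefficients of x^2 − 2, whose root is 2α = √2.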

Definition 3.44. A subfield k ⊂ C with [k : Q] < ∞ is called an algebraic number field, or simply a number field. The integral closure of Z in k is called the ring of integers of k.

Let k be a number field with ring of integers o. By Lemma 3.43 (applied to the integral domain Z), for each α ∈ k, there exists 0 ≠ a ∈ Z such that aα is integral over Z, i.e., aα ∈ o. In particular, k is the quotient field of o. We will prove that o is a Dedekind domain.

Lemma 3.45. If a is an ideal of o, then a ∩ Z ≠ {0}.

Proof. Choose 0 ≠ α ∈ a. Since α is integral over Z, there exist a_0, . . . , a_{n−1} ∈ Z such that α^n + a_{n−1}α^{n−1} + · · · + a_0 = 0. Dividing both sides by a power of α if necessary, we may assume a_0 ≠ 0. Then a_0 = −α(α^{n−1} + a_{n−1}α^{n−2} + · · · + a_1) ∈ a ∩ Z. □


Definition 3.46. Let a be an ideal of o. A sequence ε_1, . . . , ε_n ∈ a is called an integral basis of a if it is a basis of k over Q and generates a as a Z-module.

Lemma 3.47. Let F and E be fields and let φ_1, . . . , φ_n be distinct embeddings of F into E. Then φ_1, . . . , φ_n are linearly independent over E as functions from F to E.

Proof. Assume to the contrary that φ_1, . . . , φ_n are linearly dependent over E. Choose 0 ≠ (c_1, . . . , c_n) ∈ E^n with the least number of nonzero components such that

(3.72) c_1φ_1(y) + c_2φ_2(y) + · · · + c_nφ_n(y) = 0 for all y ∈ F.

We may assume c_1 ≠ 0 and c_2 ≠ 0. Choose z ∈ F such that φ_1(z) ≠ φ_2(z). In (3.72), replace y by zy and divide the result by φ_1(z). We have

(3.73) c_1φ_1(y) + c_2 (φ_2(z)/φ_1(z)) φ_2(y) + · · · + c_n (φ_n(z)/φ_1(z)) φ_n(y) = 0 for all y ∈ F.

Subtracting (3.73) from (3.72), we get

(3.74) 0·φ_1(y) + c_2 (1 − φ_2(z)/φ_1(z)) φ_2(y) + · · · + c_n (1 − φ_n(z)/φ_1(z)) φ_n(y) = 0 for all y ∈ F.

The coefficients in (3.74) are not all zero (the coefficient of φ_2 is nonzero) but have fewer nonzero terms than c_1, . . . , c_n, which is a contradiction. □

Proposition 3.48. Every ideal 𝔞 of o has an integral basis.

Proof. 1° 𝔞 contains a basis of k/Q.

Let δ_1, . . . , δ_n be a basis of k/Q. By Lemma 3.43, there exists 0 ≠ b ∈ Z such that bδ_i ∈ o for all 1 ≤ i ≤ n. Choose 0 ≠ a ∈ 𝔞. Then abδ_1, . . . , abδ_n are a basis of k/Q contained in 𝔞.

2° 𝔞 has an integral basis.

Let k̄ be the normal closure of k in C. Then Aut(k̄/k) < Aut(k̄/Q) with [Aut(k̄/Q) : Aut(k̄/k)] = [k : Q] := n. Let φ_1, . . . , φ_n ∈ Aut(k̄/Q) be a set of left coset representatives of Aut(k̄/k) in Aut(k̄/Q). Let ε_1, . . . , ε_n ∈ 𝔞 be a basis of k/Q (by 1°). Define

∆(ε_1, . . . , ε_n) = [det(φ_i(ε_j))]^2.

(∆(ε_1, . . . , ε_n) is called the discriminant of ε_1, . . . , ε_n over Q.) Here k̄ denotes the normal closure of k in C. For each φ ∈ Aut(k̄/Q), φφ_i ∈ φ_{π(i)}Aut(k̄/k), 1 ≤ i ≤ n, where π is a permutation of {1, . . . , n}. Therefore,

φ[det(φ_i(ε_j))] = det(φφ_i(ε_j)) = det(φ_{π(i)}(ε_j)) = ± det(φ_i(ε_j)).

Thus,

φ(∆(ε_1, . . . , ε_n)) = ∆(ε_1, . . . , ε_n).

Since k̄/Q is Galois, ∆(ε_1, . . . , ε_n) ∈ Q. Since ε_1, . . . , ε_n ∈ o, they are integral over Z; hence ∆(ε_1, . . . , ε_n) is integral over Z. So, we must have ∆(ε_1, . . . , ε_n) ∈ Z since Z is integrally closed. Since φ_1, . . . , φ_n are in distinct cosets of Aut(k̄/k) in Aut(k̄/Q), the restrictions φ_1|_k, . . . , φ_n|_k are all distinct. By Lemma 3.47, det(φ_i(ε_j)) ≠ 0.

Therefore, ∆(ε_1, . . . , ε_n) ∈ Z \ {0}.

Choose a basis ε_1, . . . , ε_n of k/Q in 𝔞 such that |∆(ε_1, . . . , ε_n)| is minimal. We claim that 𝔞 = Zε_1 + · · · + Zε_n. Otherwise, there exists α = a_1ε_1 + · · · + a_nε_n ∈ 𝔞 where a_i ∈ Q, 1 ≤ i ≤ n, and at least one of the a_i's, say a_1, is not in Z. Write a_1 = b + r with b ∈ Z and 0 < r < 1. Let ε′_1 = α − bε_1 = rε_1 + a_2ε_2 + · · · + a_nε_n. Then ε′_1, ε_2, . . . , ε_n is a basis of k/Q in 𝔞 and

|∆(ε′_1, ε_2, . . . , ε_n)| = r^2 |∆(ε_1, . . . , ε_n)| < |∆(ε_1, . . . , ε_n)|,

which is a contradiction. □
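The descent step at the end of the proof can be watched in a small case. The sketch below is an illustration chosen here, not from the text: it assumes k = Q(√5), whose two embeddings are the identity and the conjugation √5 ↦ −√5, and represents x + y√d as the pair (x, y). The helper names `mul`, `conj` and `discriminant` are hypothetical.

```python
from fractions import Fraction

def mul(u, v, d):
    """Product of u = a + b*sqrt(d) and v = c + e*sqrt(d), as coordinate pairs."""
    (a, b), (c, e) = u, v
    return (a * c + d * b * e, a * e + b * c)

def conj(u):
    """Image of u under the nontrivial embedding sqrt(d) -> -sqrt(d)."""
    return (u[0], -u[1])

def discriminant(e1, e2, d):
    """[det(phi_i(eps_j))]^2 for the basis (e1, e2) of Q(sqrt(d)) over Q."""
    x = mul(e1, conj(e2), d)          # e1 * conj(e2)
    y = mul(e2, conj(e1), d)          # e2 * conj(e1)
    det = (x[0] - y[0], x[1] - y[1])  # determinant of [[e1, e2], [conj e1, conj e2]]
    sq = mul(det, det, d)
    assert sq[1] == 0                 # the square is rational
    return sq[0]

ONE = (1, 0)
SQRT5 = (0, 1)
OMEGA = (Fraction(1, 2), Fraction(1, 2))  # (1 + sqrt5)/2, a root of x^2 - x - 1
```

The basis 1, √5 has discriminant 20, but ω = (1 + √5)/2 is integral with coordinate r = 1/2 on the first basis vector. Replacing 1 by ω, as in the proof, multiplies the discriminant by r^2 = 1/4 and gives 5.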

Proposition 3.49. If 𝔞 is an ideal of o, then |o/𝔞| < ∞.

Proof. By Lemma 3.45, there exists 0 ≠ a ∈ 𝔞 ∩ Z. It suffices to show that |o/(a)| < ∞. Let ε_1, . . . , ε_n be an integral basis of o. Then clearly,

|o/(a)| = |(Zε_1 + · · · + Zε_n)/(aZε_1 + · · · + aZε_n)| = |a|^n. □

Theorem 3.50. o is a Dedekind domain.

Proof. By Theorem 3.38, it suffices to show (i) o is noetherian, (ii) o is integrally closed and (iii) every prime ideal of o is maximal.

(i) Let 0 ≠ a_1 ⊂ a_2 ⊂ · · · be an ascending chain of ideals of o. By Proposition 3.49, |o/a_1| < ∞; hence there are only finitely many ideals of o containing a_1. Thus, a_m = a_{m+1} = · · · for some m ∈ Z+. So o is noetherian.

(ii) Let α ∈ k be integral over o. Since o is integral over Z, by Proposition 3.34, α is integral over Z. Thus α ∈ o. Therefore, o is integrally closed.

(iii) Let p be a prime ideal of o. Then o/p is an integral domain. Since |o/p| < ∞ (Proposition 3.49), o/p must be a field. Hence p is a maximal ideal of o. □

Let k ⊂ K be number fields with rings of integers o_k and o_K respectively. If p ∈ Spec(o_k) and P ∈ Spec(o_K) such that p ⊂ P, we say that P lies above p and p lies under P. Given P ∈ Spec(o_K), clearly, P ∩ o_k ∈ Spec(o_k). If q ∈ Spec(o_k) lies under P, then q ⊂ P ∩ o_k. Since q is maximal, we must have q = P ∩ o_k. Therefore, P ∩ o_k is the unique prime ideal of o_k that lies under P. On the other hand, given p ∈ Spec(o_k), there are only finitely many prime ideals of o_K lying above p. In fact, since po_K is an ideal of o_K, we can write

(3.75) po_K = P_1^{e_1} · · · P_m^{e_m},

where P_1, . . . , P_m ∈ Spec(o_K) are distinct and e_1, . . . , e_m ∈ Z+. By Theorem 3.42, e_i = ν_{P_i}(po_K), 1 ≤ i ≤ m, and P_1, . . . , P_m are precisely the prime ideals of o_K that contain p, i.e., lie above p. Hence, the prime ideals of o_K that lie above p are precisely those appearing in the factorization of po_K. In (3.75), the integer e_i is called the ramification index of P_i over p and is denoted by e(P_i/p). By Proposition 3.49, o_k/p and o_K/P_i, 1 ≤ i ≤ m, are finite fields. They are called the residue fields of p and P_i, 1 ≤ i ≤ m. It is easy to see that the inclusion o_k ↪→ o_K induces an embedding o_k/p ↪→ o_K/P_i. Thus, the residue field of P_i is an extension of the residue field of p. The degree of this extension, i.e., [o_K/P_i : o_k/p], is called the degree of P_i over p and is denoted by f(P_i/p). If e(P_i/p) = 1, we say that P_i is unramified over p. If e(P_i/p) = 1 for all 1 ≤ i ≤ m, we say that p is unramified in o_K. If m = 1 and f(P_1/p) = 1, we say that p is totally ramified in o_K.

Let k ⊂ K ⊂ L be number fields, p ∈ Spec(o_k), P ∈ Spec(o_K), 𝒫 ∈ Spec(o_L) such that 𝒫 lies above P and P lies above p. Obviously, we have

e(𝒫/p) = e(𝒫/P)e(P/p) and f(𝒫/p) = f(𝒫/P)f(P/p).


Theorem 3.51. Let k ⊂ K be number fields with [K : k] = n. Let p ∈ Spec(o_k) and let P_1, . . . , P_m be the prime ideals of o_K lying above p. Put e_i = e(P_i/p) and f_i = f(P_i/p), 1 ≤ i ≤ m. Then

n = f_1e_1 + · · · + f_me_m.

Proof. We have po_K = P_1^{e_1} · · · P_m^{e_m}. Let K_i = o_K/P_i, 1 ≤ i ≤ m, and recall that o_k/p is the residue field of p. By the Chinese remainder theorem, there is an o_K-algebra isomorphism

(3.76) o_K/po_K = o_K/P_1^{e_1} · · · P_m^{e_m} −→ o_K/P_1^{e_1} × · · · × o_K/P_m^{e_m},
       a + po_K 7−→ (a + P_1^{e_1}, . . . , a + P_m^{e_m}).

Clearly, all the o_K-algebras in (3.76) are vector spaces over o_k/p and the isomorphism in (3.76) is an isomorphism of vector spaces over o_k/p. Moreover, each o_K/P_i^{e_i} is a K_i-vector space.

1° dim_{K_i} o_K/P_i^{e_i} = e_i.

Consider the filtration o_K = P_i^0 ⊃ P_i^1 ⊃ · · · ⊃ P_i^{e_i}. It suffices to show that there is an o_K-module isomorphism P_i^j/P_i^{j+1} ≅ K_i for all j ∈ N. Choose τ ∈ P_i^j \ P_i^{j+1}. Then P_i^{j+1} ⊊ (τ) + P_i^{j+1} ⊂ P_i^j. So,

ν_P((τ) + P_i^{j+1}) = j if P = P_i,
ν_P((τ) + P_i^{j+1}) ≤ ν_P(P_i^{j+1}) = 0 if P ∈ Spec(o_K) \ {P_i}.

Hence, (τ) + P_i^{j+1} = P_i^j. Therefore,

φ : o_K −→ P_i^j/P_i^{j+1},
    a 7−→ aτ + P_i^{j+1}

is an onto o_K-module homomorphism with P_i ⊂ ker φ. Since P_i is maximal, we have ker φ = P_i. Thus, φ induces an isomorphism K_i = o_K/P_i ≅ P_i^j/P_i^{j+1}.

2° The dimension of o_K/po_K over the residue field o_k/p is n.

Let S_p = o_k \ p. Since p is a prime ideal, S_p is closed under multiplication. Let S_p^{-1}o_k = {a/b : a ∈ o_k, b ∈ S_p} and S_p^{-1}o_K = {a/b : a ∈ o_K, b ∈ S_p}. (In general, for any A ⊂ o_K, let S_p^{-1}A = {a/b : a ∈ A, b ∈ S_p}.) S_p^{-1}o_k and S_p^{-1}o_K are subrings of k and K respectively. S_p^{-1}o_k is a local ring with maximal ideal S_p^{-1}p and is called the localization of o_k at p. We claim that

(3.77) g : o_k/p −→ S_p^{-1}o_k / S_p^{-1}p,
       a + p 7−→ a + S_p^{-1}p

is an o_k-algebra isomorphism and

(3.78) h : o_K/po_K −→ S_p^{-1}o_K / S_p^{-1}(po_K),
       a + po_K 7−→ a + S_p^{-1}(po_K)

is an o_K-algebra isomorphism. We only prove the claim about h since the proof of the claim about g is identical. Clearly, h is a well-defined o_K-algebra homomorphism. Let a/b ∈ S_p^{-1}o_K where a ∈ o_K and b ∈ S_p. Since p is a maximal ideal of o_k and b ∉ p, we have p + (b) = o_k. Thus, there exists c ∈ o_k such that cb ≡ 1 (mod p). So, ca ≡ a/b (mod S_p^{-1}(po_K)), i.e., h(ca + po_K) = a/b + S_p^{-1}(po_K). Hence h is onto. Assume that d ∈ o_K such that h(d + po_K) = 0. Then d ∈ S_p^{-1}(po_K). Write d = a/b with a ∈ po_K and b ∈ S_p. Let c ∈ o_k such that cb ≡ 1 (mod p). Then

d ≡ cbd = ca ≡ 0 (mod po_K).

Hence h is one-to-one. Therefore, h is an isomorphism.

We claim that every ideal I of S_p^{-1}o_k is of the form S_p^{-1}p^j for some j ∈ N. It is easy to see that I = S_p^{-1}a for some ideal a of o_k. Write a = p^j q_1 · · · q_s where q_1, . . . , q_s ∈ Spec(o_k) \ {p}. Obviously, S_p^{-1}q_i = S_p^{-1}o_k for 1 ≤ i ≤ s. (Choose c ∈ q_i \ p. For each a/b ∈ S_p^{-1}o_k with a ∈ o_k and b ∈ S_p, we have a/b = ca/(cb) ∈ S_p^{-1}q_i.) Therefore,

S_p^{-1}a = (S_p^{-1}p^j)(S_p^{-1}q_1) · · · (S_p^{-1}q_s) = S_p^{-1}p^j.

Let π ∈ p \ p^2. Then ν_p(πo_k) = 1. We claim that every ideal of S_p^{-1}o_k is generated by π^j for some j ∈ N. (This is equivalent to saying that S_p^{-1}o_k is a discrete valuation ring with prime π.) In fact, we will show that S_p^{-1}p^j = S_p^{-1}(π^j o_k). Since ν_p(π^j o_k) = j ν_p(πo_k) = j, from the above paragraph, with π^j o_k in place of a, we have S_p^{-1}(π^j o_k) = S_p^{-1}p^j.

In particular, S_p^{-1}o_k is a PID. Since o_K is a finitely generated Z-module (Proposition 3.48), S_p^{-1}o_K is a finitely generated S_p^{-1}o_k-module. Since S_p^{-1}o_K is an integral domain, it is a torsion-free S_p^{-1}o_k-module. By the fundamental theorem of finitely generated modules over a PID, S_p^{-1}o_K is a free S_p^{-1}o_k-module of finite rank r.

We claim that r = n. Let ε_1, . . . , ε_r be an S_p^{-1}o_k-basis of S_p^{-1}o_K. Since o_K contains a basis of K/k (Lemma 3.43), o_K generates K as a k-vector space. So, ε_1, . . . , ε_r generate K as a k-vector space. Assume that a_1ε_1 + · · · + a_rε_r = 0 where a_i ∈ k, 1 ≤ i ≤ r. By Lemma 3.43, there exists 0 ≠ b ∈ Z such that ba_i ∈ o_k for all 1 ≤ i ≤ r. Since (ba_1)ε_1 + · · · + (ba_r)ε_r = 0, we must have ba_i = 0, i.e., a_i = 0, 1 ≤ i ≤ r. So, ε_1, . . . , ε_r is a basis of K/k. Therefore, r = n. Now we have isomorphisms of vector spaces over the residue field o_k/p:

o_K/po_K ≅ S_p^{-1}o_K / S_p^{-1}(po_K)    (by (3.78))
  = S_p^{-1}o_K / p(S_p^{-1}o_K)
  = [(S_p^{-1}o_k)ε_1 + · · · + (S_p^{-1}o_k)ε_n] / p[(S_p^{-1}o_k)ε_1 + · · · + (S_p^{-1}o_k)ε_n]
  = [(S_p^{-1}o_k)ε_1 + · · · + (S_p^{-1}o_k)ε_n] / [(S_p^{-1}p)ε_1 + · · · + (S_p^{-1}p)ε_n]
  ≅ (S_p^{-1}o_k / S_p^{-1}p) × · · · × (S_p^{-1}o_k / S_p^{-1}p)    (n factors)
  ≅ o_k/p × · · · × o_k/p    (n factors, by (3.77))
  = (o_k/p)^n.

Therefore, the dimension of o_K/po_K over o_k/p is n.


3° By (3.76), 1° and 2°, writing dim for the dimension over the residue field o_k/p, we have

n = dim o_K/po_K = ∑_{i=1}^{m} dim o_K/P_i^{e_i} = ∑_{i=1}^{m} f_i dim_{K_i} o_K/P_i^{e_i} = ∑_{i=1}^{m} f_ie_i. □

The p-adic valuation of a number field. Let k be a number field and p ∈ Spec(o_k). For each 0 ≠ a ∈ o_k, we define ν_p(a) = ν_p(ao_k). We set ν_p(0) = ∞. If α = b/a ∈ k, where a, b ∈ o_k, a ≠ 0, we define

ν_p(α) = ν_p(b) − ν_p(a).

It is easy to see that ν_p(α) depends only on α but not on the choice of a and b. Thus, we have defined a function ν_p : k → Z ∪ {∞}. This function is called the p-adic valuation of k.

Proposition 3.52. Let x, y ∈ k.

(i) ν_p(xy) = ν_p(x) + ν_p(y).
(ii) ν_p(x + y) ≥ min{ν_p(x), ν_p(y)}.
(iii) If ν_p(x) ≠ ν_p(y), then ν_p(x + y) = min{ν_p(x), ν_p(y)}.

Proof. (i) First assume x, y ∈ o_k. Then

ν_p(xy) = ν_p(xyo_k) = ν_p((xo_k)(yo_k)) = ν_p(xo_k) + ν_p(yo_k) = ν_p(x) + ν_p(y).

In general, let x = b/a, y = d/c, a, b, c, d ∈ o_k, a ≠ 0, c ≠ 0. We have

ν_p(xy) = ν_p(bd/(ac)) = ν_p(bd) − ν_p(ac) = ν_p(b) − ν_p(a) + ν_p(d) − ν_p(c) = ν_p(x) + ν_p(y).

(ii) Choose 0 ≠ a ∈ o_k such that ax ∈ o_k and ay ∈ o_k. By (i), ν_p(x + y) = ν_p(ax + ay) − ν_p(a) and min{ν_p(x), ν_p(y)} = min{ν_p(ax), ν_p(ay)} − ν_p(a). Therefore, we may assume that x, y ∈ o_k. Let i = min{ν_p(x), ν_p(y)}. Then xo_k ⊂ p^i and yo_k ⊂ p^i, so (x + y)o_k ⊂ p^i. Thus, ν_p(x + y) ≥ i.

(iii) Let ν_p(x) > ν_p(y). Assume to the contrary that ν_p(x + y) > ν_p(y). Then by (ii),

ν_p(y) = ν_p(x + y − x) ≥ min{ν_p(x + y), ν_p(x)} > ν_p(y),

which is a contradiction. □
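In the special case k = Q, o_k = Z and p = pZ, the valuation ν_p is the familiar p-adic valuation on the rationals, and the three properties can be tried out directly. The following is a minimal sketch under that assumption; the function name `nu` is hypothetical.

```python
from fractions import Fraction
import math

def nu(p, x):
    """p-adic valuation of a rational number x, with nu(p, 0) = infinity."""
    x = Fraction(x)
    if x == 0:
        return math.inf

    def order(n):
        # Exponent of p in the nonzero integer n.
        i = 0
        while n % p == 0:
            n //= p
            i += 1
        return i

    # nu(b/a) = nu(b) - nu(a), as in the definition of the valuation on k.
    return order(x.numerator) - order(x.denominator)

x, y = Fraction(18, 5), Fraction(5, 9)
```

With p = 3 these two numbers have valuations 2 and −2, so property (iii) predicts that their sum has valuation −2. When the valuations agree, as for 1 and 2, the inequality in (ii) can be strict.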

Proposition 3.53. Let x ∈ k.

(i) x ∈ o_k if and only if ν_p(x) ≥ 0 for all p ∈ Spec(o_k).
(ii) x is a unit of o_k if and only if ν_p(x) = 0 for all p ∈ Spec(o_k).

Proof. (i) The “only if” part is obvious. To see the “if” part, write x = b/a, a, b ∈ o_k, a ≠ 0. By the assumption, ao_k = p_1^{u_1} · · · p_n^{u_n} and bo_k = p_1^{v_1} · · · p_n^{v_n}, where p_1, . . . , p_n ∈ Spec(o_k) and u_i ≤ v_i, 1 ≤ i ≤ n. Thus,

(b/a)o_k = (p_1^{v_1} · · · p_n^{v_n})(p_1^{u_1} · · · p_n^{u_n})^{-1} = p_1^{v_1−u_1} · · · p_n^{v_n−u_n} ⊂ o_k,

proving that b/a ∈ o_k.

(ii) Note that we also have ν_p(x^{-1}) = −ν_p(x) = 0 for all p ∈ Spec(o_k). By (i), both x and x^{-1} belong to o_k. So, x is a unit of o_k. □


Let k ⊂ K be number fields, p ∈ Spec(o_k), and P ∈ Spec(o_K) lie above p. Then clearly,

ν_P(x) = e(P/p)ν_p(x) for all x ∈ k.

3.7. Cyclotomic Fields

Let n ∈ Z+ and ζ_n = e^{2πi/n}. The number field Q(ζ_n) is called the cyclotomic field of the nth roots of unity. The cyclotomic field Q(ζ_n) will be abbreviated as Q(n) and its ring of integers denoted by o_{Q(n)}. Given any prime integer p, the ideal po_{Q(n)} factors into a finite product of prime ideals in o_{Q(n)}. We will give a fairly explicit description of this factorization. Alongside, we will also show that [Q(n) : Q] = φ(n), where φ is the Euler function, and that o_{Q(n)} = Z[ζ_n].

We first look at the factorization of p in oQ(n) where n is a power of p.

Proposition 3.54. Let p be a prime integer and s ∈ Z+. Then pZ is totally ramified in o_{Q(p^s)}. The only prime ideal of o_{Q(p^s)} lying above pZ is (1 − ζ_{p^s})o_{Q(p^s)} and

po_{Q(p^s)} = [(1 − ζ_{p^s})o_{Q(p^s)}]^{φ(p^s)}.

Moreover, [Q(p^s) : Q] = φ(p^s).

Proof. Throughout the proof, write ζ = ζ_{p^s} and o = o_{Q(p^s)}. In C[x], we have x^{p^s} − 1 = (x^{p^{s−1}} − 1)Φ(x), where

Φ(x) = 1 + x^{p^{s−1}} + x^{p^{s−1}·2} + · · · + x^{p^{s−1}(p−1)}.

Clearly, the roots of Φ are precisely the primitive p^s-th roots of unity. Hence,

(3.79) Φ(x) = ∏ (x − ζ^i), the product taken over 1 ≤ i ≤ p^s − 1 with (i, p) = 1.

Since ζ is a root of Φ ∈ Q[x], [Q(p^s) : Q] ≤ deg Φ = φ(p^s). Letting x = 1 in (3.79), we get

(3.80) p = ∏ (1 − ζ^i), the product taken over 1 ≤ i ≤ p^s − 1 with (i, p) = 1.

For each 1 ≤ i ≤ p^s − 1 with (i, p) = 1, we have

(1 − ζ^i)/(1 − ζ) = 1 + ζ + · · · + ζ^{i−1} ∈ o.

There exists j ∈ Z+ such that ij ≡ 1 (mod p^s). Thus, we also have

(1 − ζ)/(1 − ζ^i) = (1 − ζ^{ij})/(1 − ζ^i) = 1 + ζ^i + · · · + ζ^{i(j−1)} ∈ o.

Therefore, (1 − ζ^i)o = (1 − ζ)o. So, (3.80) implies that

po = [(1 − ζ)o]^{φ(p^s)}.

Write (1 − ζ)o = P_1^{e_1} · · · P_t^{e_t}, where P_1, . . . , P_t ∈ Spec(o) are distinct and e_i ∈ Z+, 1 ≤ i ≤ t, and let f_i be the degree of P_i over pZ. Then po = (P_1^{e_1} · · · P_t^{e_t})^{φ(p^s)}. By Theorem 3.51,

(3.81) φ(p^s) ∑_{i=1}^{t} f_ie_i = [Q(p^s) : Q] ≤ φ(p^s).

Thus, t = 1 and f_1 = e_1 = 1. So, (1 − ζ)o is a prime ideal of o and is totally ramified above pZ. (3.81) also forces [Q(p^s) : Q] = φ(p^s). □
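Identity (3.80) lends itself to a quick numerical sanity check. The sketch below evaluates the product of the φ(p^s) factors 1 − ζ^i in floating-point complex arithmetic, so it only confirms the identity up to rounding error; the function name `unit_product` is hypothetical.

```python
import cmath
from math import gcd

def unit_product(p, s):
    """Return (product, count): the product of 1 - zeta^i over 1 <= i < p^s
    with gcd(i, p) = 1, where zeta = exp(2*pi*i / p^s), and the number of factors."""
    q = p ** s
    zeta = cmath.exp(2j * cmath.pi / q)
    prod, count = 1, 0
    for i in range(1, q):
        if gcd(i, p) == 1:
            prod *= 1 - zeta ** i
            count += 1
    return prod, count
```

For p = 3 and s = 2 there are φ(9) = 6 factors and the product is 3 up to rounding, matching the exponent φ(p^s) in the factorization of p.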

The factorization of p in o_{Q(n)} for an arbitrary n is given by (ii) of the next theorem.

Theorem 3.55. Let n = p^s m where p is a prime, s ∈ N, m ∈ Z+, p ∤ m. Let f be the multiplicative order of p in Z/mZ.

(i) Let P be the unique prime ideal of o_{Q(p^s)} lying above pZ. (By Proposition 3.54, P = (1 − ζ_{p^s})o_{Q(p^s)} if s > 0 and P = pZ if s = 0.) Then

Po_{Q(n)} = ℘_1 · · · ℘_t

where t = φ(m)/f , ℘_1, . . . , ℘_t ∈ Spec(o_{Q(n)}) are distinct and form an Aut(Q(n)/Q(p^s))-orbit. Moreover, f(℘_i/P) = f , 1 ≤ i ≤ t.

(ii) The factorization of p in o_{Q(n)} is

po_{Q(n)} = ℘_1^{φ(p^s)} · · · ℘_t^{φ(p^s)}.

Proof. (i) Write Γ = Aut(Q(n)/Q(p^s)). Let

(3.82) Po_{Q(n)} = ℘_1^{e_1} · · · ℘_τ^{e_τ},

where ℘_1, . . . , ℘_τ ∈ Spec(o_{Q(n)}) are distinct and e_1, . . . , e_τ ∈ Z+. Observe that if ℘ ∈ Spec(o_{Q(n)}) lies above P and γ ∈ Γ, then γ(℘) ∈ Spec(o_{Q(n)}) also lies above P. Therefore, Γ acts on {℘_1, . . . , ℘_τ}. We claim that this action is transitive. Assume to the contrary that γ(℘_1) ≠ ℘_2 for all γ ∈ Γ. By the Chinese remainder theorem, there exists a ∈ o_{Q(n)} such that

a ≡ 0 (mod ℘_2),
a ≡ 1 (mod γ(℘_1)) for all γ ∈ Γ.

Therefore, a ∈ ℘_2 but γ(a) ∉ ℘_1 for all γ ∈ Γ. Put

a′ = ∏_{γ∈Γ} γ(a).

Since ℘_1 is prime, a′ ∉ ℘_1. Since a ∈ ℘_2, a′ ∈ ℘_2. Since Q(n)/Q(p^s) is Galois, a′ ∈ Q(p^s). Thus, a′ ∈ ℘_2 ∩ Q(p^s) = P = ℘_1 ∩ Q(p^s), which is a contradiction. Hence, we have proved that Γ = Aut(Q(n)/Q(p^s)) acts transitively on {℘_1, . . . , ℘_τ}. So, {℘_1, . . . , ℘_τ} is an Aut(Q(n)/Q(p^s))-orbit. Since the right hand side of (3.82) is invariant under the action of Aut(Q(n)/Q(p^s)), we must have e_1 = · · · = e_τ := e. Also clearly, f(℘_1/P) = · · · = f(℘_τ/P) := g. By Theorem 3.51, [Q(n) : Q(p^s)] = τge, i.e.,

(3.83) g = [Q(n) : Q(p^s)]/(τe).

We claim that e = 1 and g = f .

Let G = {γ ∈ Aut(Q(n)/Q(p^s)) : γ(℘_1) = ℘_1} be the stabilizer of ℘_1 in Aut(Q(n)/Q(p^s)). (G is called the decomposition group of ℘_1.) We have

(3.84) |G| = |Aut(Q(n)/Q(p^s))|/τ = [Q(n) : Q(p^s)]/τ.


Let F_P and F_{℘_1} denote the residue fields of P and ℘_1 respectively. Then F_P = F_p and F_{℘_1} = F_{p^g}. Define a group homomorphism

G −→ Aut(F_{℘_1}/F_P),
γ 7−→ γ̄,

where

γ̄ : F_{℘_1} −→ F_{℘_1},
    x + ℘_1 7−→ γ(x) + ℘_1.

(Clearly, both γ̄ and the map γ 7→ γ̄ are well defined.) We claim that the map γ 7→ γ̄ is one-to-one. We first observe that 1, ζ_m, . . . , ζ_m^{m−1} are all distinct modulo ℘_1. In fact, letting x = 1 in the polynomial identity 1 + x + · · · + x^{m−1} = ∏_{i=1}^{m−1} (x − ζ_m^i), we have

m = ∏_{i=1}^{m−1} (1 − ζ_m^i).

Since char F_{℘_1} = p and p ∤ m, m ≠ 0 in F_{℘_1}. View the above equation in F_{℘_1}. It follows that ζ_m^i ≢ 1 (mod ℘_1) for 1 ≤ i ≤ m − 1, i.e., 1, ζ_m, . . . , ζ_m^{m−1} are all distinct modulo ℘_1. Now assume γ ∈ G such that γ̄ = id. Then γ(x) ≡ x (mod ℘_1) for all x ∈ o_{Q(n)}. Since γ(ζ_m) is a primitive mth root of unity, γ(ζ_m) = ζ_m^i for some 1 ≤ i ≤ m − 1. Since ζ_m^i = γ(ζ_m) ≡ ζ_m (mod ℘_1), we must have i = 1, i.e., γ(ζ_m) = ζ_m. Since Q(n) = (Q(p^s))(ζ_m), we have γ = id. So, we have proved that the map γ 7→ γ̄ from G to Aut(F_{℘_1}/F_P) is one-to-one. By (3.83) and (3.84),

|G| = [Q(n) : Q(p^s)]/τ ≥ [Q(n) : Q(p^s)]/(τe) = g = |Aut(F_{℘_1}/F_P)|.

It follows that the reduction map γ 7→ γ̄ from G to Aut(F_{℘_1}/F_P) is an isomorphism and e = 1.

Let σ be the Frobenius map of F_{℘_1}/F_P, i.e., σ(y) = y^p for all y ∈ F_{℘_1}. Then σ = γ̄ for some γ ∈ G. Let γ(ζ_m) = ζ_m^i. Then

ζ_m^i + ℘_1 = γ̄(ζ_m + ℘_1) = σ(ζ_m + ℘_1) = ζ_m^p + ℘_1.

In the above paragraph, we already showed that 1, ζ_m, . . . , ζ_m^{m−1} are all distinct modulo ℘_1. Thus, i ≡ p (mod m), i.e., γ(ζ_m) = ζ_m^p. Therefore, g = o(σ) = o(γ) = f . It follows from the next theorem that [Q(n) : Q(p^s)] = φ(m). Thus by (3.83), τ = φ(m)/f . This completes the proof of (i).

(ii) follows immediately from (i) and Proposition 3.54. □
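The numerical content of Theorem 3.55 is easy to tabulate: for a prime p and n = p^s m with p ∤ m, every prime of o_{Q(n)} above p has ramification index e = φ(p^s) and degree f (the order of p modulo m), and there are t = φ(m)/f of them. The sketch below computes this triple and nothing more (it does not construct the ideals); the function names `phi` and `splitting_data` are hypothetical, and p is assumed to be prime.

```python
from math import gcd

def phi(n):
    """Euler's function, by direct count (fine for small n)."""
    return sum(1 for i in range(1, n + 1) if gcd(i, n) == 1)

def splitting_data(p, n):
    """Return (e, f, t) for the prime p in Q(zeta_n), as stated in Theorem 3.55."""
    s, m = 0, n
    while m % p == 0:          # write n = p^s * m with p not dividing m
        m //= p
        s += 1
    f, power = 1, p % m
    while power != 1 % m:      # multiplicative order of p modulo m (f = 1 if m = 1)
        power = power * p % m
        f += 1
    e = phi(p ** s)
    t = phi(m) // f
    return e, f, t
```

For n = 4, i.e. Q(i), this gives the familiar picture: 2 is totally ramified, 5 splits into two primes of degree 1, and 3 stays prime with degree 2. In every case e·f·t = φ(n), in agreement with Theorems 3.51 and 3.56.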

Theorem 3.56. [Q(n) : Q] = φ(n).

Proof. Let p be any prime divisor of n and write n = p^s m where s, m ∈ Z+, p ∤ m. By induction, it suffices to show that [Q(n) : Q(m)] = φ(p^s).

Let ℘ be a prime ideal of o_{Q(m)} lying above pZ and P a prime ideal of o_{Q(n)} lying above ℘. By Proposition 3.54, P lies above (1 − ζ_{p^s})o_{Q(p^s)} ∈ Spec(o_{Q(p^s)}). By Theorem 3.55 (i), applied to Q(m) (so that the ideal called P in that theorem is pZ), we have e(℘/pZ) = 1. Thus,

[Q(n) : Q(m)] ≥ e(P/℘) = e(P/pZ) ≥ e((1 − ζ_{p^s})o_{Q(p^s)}/pZ) = φ(p^s),

where the last equal sign follows from Proposition 3.54.

On the other hand, since Q(n) = (Q(m))(ζ_{p^s}) and ζ_{p^s} is a root of Φ(x) = ∑_{i=0}^{p−1} x^{p^{s−1}i}, we have [Q(n) : Q(m)] ≤ deg Φ = φ(p^s). So, [Q(n) : Q(m)] = φ(p^s). □
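Theorem 3.56 can be compared with the degrees of the cyclotomic polynomials. The sketch below relies on the standard identity x^n − 1 = ∏_{d | n} Φ_d(x), which is not stated in the text, and computes Φ_n by exact division of integer polynomials. The function names `phi`, `exact_div` and `cyclotomic` are hypothetical; polynomials are coefficient lists with the constant term first.

```python
from math import gcd

def phi(n):
    """Euler's function, by direct count (fine for small n)."""
    return sum(1 for i in range(1, n + 1) if gcd(i, n) == 1)

def exact_div(num, den):
    """Quotient num / den of integer polynomials; den is monic and divides num."""
    num = list(num)
    quot = [0] * (len(num) - len(den) + 1)
    for i in range(len(quot) - 1, -1, -1):
        c = num[i + len(den) - 1]     # current leading coefficient
        quot[i] = c
        for j, d in enumerate(den):
            num[i + j] -= c * d
    assert not any(num)               # the remainder must vanish
    return quot

def cyclotomic(n):
    """Coefficients of the n-th cyclotomic polynomial."""
    poly = [-1] + [0] * (n - 1) + [1]   # x^n - 1
    for d in range(1, n):
        if n % d == 0:
            poly = exact_div(poly, cyclotomic(d))
    return poly
```

For example, `cyclotomic(12)` returns the coefficients of x^4 − x^2 + 1, of degree φ(12) = 4. Since ζ_n is a root of Φ_n, the degree of Φ_n bounds [Q(n) : Q] from above, and Theorem 3.56 says the bound is attained.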

Theorem 3.57. oQ(n) = Z[ζn].


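[Figure 2. Proof of Theorem 3.56. Left diagram (the fields): Q ⊂ Q(p^s) ⊂ Q(n) and Q ⊂ Q(m) ⊂ Q(n), with [Q(p^s) : Q] = φ(p^s), [Q(m) : Q] = φ(m), [Q(n) : Q(p^s)] = φ(m), [Q(n) : Q(m)] = φ(p^s). Right diagram (the ideals and the ramification indices): pZ lies under (1 − ζ_{p^s})o_{Q(p^s)} and under ℘, both of which lie under P, with e((1 − ζ_{p^s})o_{Q(p^s)}/pZ) = φ(p^s), e(℘/pZ) = 1, e(P/(1 − ζ_{p^s})o_{Q(p^s)}) = 1, e(P/℘) = φ(p^s).]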

Proof. It suffices to show that o_{Q(n)} ⊂ Z[ζ_n]. Let p be a prime divisor of n and write n = p^s m where s, m ∈ Z+, p ∤ m. Let α ∈ o_{Q(n)} be arbitrary. We want to show that α ∈ Z[ζ_n]. Since α ∈ Q(n) = (Q(m))(ζ_{p^s}), we can write

(3.85) α = a_0 + a_1ζ_{p^s} + · · · + a_{φ(p^s)−1} ζ_{p^s}^{φ(p^s)−1},

where a_i ∈ Q(m), 0 ≤ i ≤ φ(p^s) − 1. It suffices to show that a_i ∈ o_{Q(m)}, 0 ≤ i ≤ φ(p^s) − 1, since by an induction on n, we would have a_i ∈ Z[ζ_m], implying that α ∈ (Z[ζ_m])[ζ_{p^s}] = Z[ζ_n]. Therefore, it suffices to show that ν_℘(a_i) ≥ 0 for all ℘ ∈ Spec(o_{Q(m)}) and 0 ≤ i ≤ φ(p^s) − 1.

Case 1. ℘ lies above pZ.

To prove that ν_℘(a_i) ≥ 0, it suffices to show that ν_P(a_i) ≥ 0 for all P ∈ Spec(o_{Q(n)}) lying above ℘. Write

α = b_0 + b_1(1 − ζ_{p^s}) + · · · + b_{φ(p^s)−1}(1 − ζ_{p^s})^{φ(p^s)−1}

with b_i ∈ Q(m), 0 ≤ i ≤ φ(p^s) − 1. Since each a_i is a linear combination of b_0, b_1, . . . , b_{φ(p^s)−1} with coefficients in Z[ζ_{p^s}], it suffices to show that ν_P(b_i) ≥ 0, 0 ≤ i ≤ φ(p^s) − 1. We have

(3.86) ν_P( ∑_{i=0}^{φ(p^s)−1} b_i(1 − ζ_{p^s})^i ) = ν_P(α) ≥ 0.

By Figure 2, e(P/℘) = φ(p^s) and ν_P(1 − ζ_{p^s}) = 1. Thus, for each 0 ≤ i ≤ φ(p^s) − 1,

(3.87) ν_P(b_i(1 − ζ_{p^s})^i) = ν_P(b_i) + i = e(P/℘)ν_℘(b_i) + i ≡ i (mod φ(p^s)).

Using (3.87) and Proposition 3.52 (iii), we see that (3.86) forces ν_P(b_i(1 − ζ_{p^s})^i) ≥ 0 for all 0 ≤ i ≤ φ(p^s) − 1. So, ν_P(b_i) ≥ −i > −φ(p^s). Since ν_P(b_i) ≡ 0 (mod φ(p^s)), we must have ν_P(b_i) ≥ 0.

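Case 2. $\wp$ does not lie above $p\mathbb{Z}$.

Write $\mathrm{Aut}\bigl(\mathbb{Q}^{(n)}/\mathbb{Q}^{(m)}\bigr) = \{\gamma_0, \dots, \gamma_{\varphi(p^s)-1}\}$ and let $\gamma_i(\zeta_{p^s}) = \zeta_{p^s}^{t_i}$, $t_i \in (\mathbb{Z}/p^s\mathbb{Z})^*$. Applying $\gamma_i$ to (3.85), we have

$$\gamma_i(\alpha) = \bigl[1, \zeta_{p^s}^{t_i}, \dots, \zeta_{p^s}^{t_i(\varphi(p^s)-1)}\bigr] \begin{pmatrix} a_0 \\ \vdots \\ a_{\varphi(p^s)-1} \end{pmatrix}, \qquad 0 \le i \le \varphi(p^s)-1,$$

i.e.,

$$\begin{pmatrix} \gamma_0(\alpha) \\ \vdots \\ \gamma_{\varphi(p^s)-1}(\alpha) \end{pmatrix} = \bigl[\zeta_{p^s}^{t_i j}\bigr]_{0 \le i,j \le \varphi(p^s)-1} \begin{pmatrix} a_0 \\ \vdots \\ a_{\varphi(p^s)-1} \end{pmatrix}. \tag{3.88}$$

Note that

$$\det\bigl[\zeta_{p^s}^{t_i j}\bigr] = \prod_{0 \le i < k \le \varphi(p^s)-1} \bigl(\zeta_{p^s}^{t_k} - \zeta_{p^s}^{t_i}\bigr) \tag{3.89}$$

since the determinant is a Vandermonde determinant.

For each $1 \le t < p^s$, write $t = p^u m$ with $u \in \mathbb{N}$, $m \in \mathbb{Z}^+$ and $p \nmid m$. We claim that $(1-\zeta_{p^s}^{t})/(1-\zeta_{p^s})^{p^u}$ is a unit of $\mathfrak{o}_{\mathbb{Q}^{(p^s)}}$. As we saw in the proof of Proposition 3.54,

$$\frac{1-\zeta_{p^s}^{t}}{1-\zeta_{p^{s-u}}} = \frac{1-\zeta_{p^{s-u}}^{m}}{1-\zeta_{p^{s-u}}}$$

is a unit of $\mathfrak{o}_{\mathbb{Q}^{(p^{s-u})}} \subset \mathfrak{o}_{\mathbb{Q}^{(p^s)}}$. Thus, it suffices to show that $(1-\zeta_{p^{s-u}})/(1-\zeta_{p^s})^{p^u}$ is a unit of $\mathfrak{o}_{\mathbb{Q}^{(p^s)}}$. Let $\mathfrak{P} = (1-\zeta_{p^s})\mathfrak{o}_{\mathbb{Q}^{(p^s)}} \in \mathrm{Spec}(\mathfrak{o}_{\mathbb{Q}^{(p^s)}})$ and $\mathfrak{p} = (1-\zeta_{p^{s-u}})\mathfrak{o}_{\mathbb{Q}^{(p^{s-u})}} \in \mathrm{Spec}(\mathfrak{o}_{\mathbb{Q}^{(p^{s-u})}})$. Then

$$e(\mathfrak{P}/\mathfrak{p}) = \frac{e(\mathfrak{P}/p\mathbb{Z})}{e(\mathfrak{p}/p\mathbb{Z})} = \frac{\varphi(p^s)}{\varphi(p^{s-u})} = p^u.$$

Thus, $\nu_{\mathfrak{P}}\bigl((1-\zeta_{p^{s-u}})/(1-\zeta_{p^s})^{p^u}\bigr) = e(\mathfrak{P}/\mathfrak{p}) - p^u = 0$. If $\mathfrak{Q} \in \mathrm{Spec}(\mathfrak{o}_{\mathbb{Q}^{(p^s)}}) \setminus \{\mathfrak{P}\}$, then $\nu_{\mathfrak{Q}}(1-\zeta_{p^s}) = 0$. Since $\mathfrak{P}$ is the only prime ideal of $\mathfrak{o}_{\mathbb{Q}^{(p^s)}}$ lying above $\mathfrak{p}$, $\mathfrak{Q}$ does not lie above $\mathfrak{p}$, so $\nu_{\mathfrak{Q}}(1-\zeta_{p^{s-u}}) = 0$. Thus, $\nu_{\mathfrak{Q}}\bigl((1-\zeta_{p^{s-u}})/(1-\zeta_{p^s})^{p^u}\bigr) = 0$. Hence we have proved that

$$\nu_{\mathfrak{Q}}\Bigl(\frac{1-\zeta_{p^{s-u}}}{(1-\zeta_{p^s})^{p^u}}\Bigr) = 0 \quad \text{for all } \mathfrak{Q} \in \mathrm{Spec}(\mathfrak{o}_{\mathbb{Q}^{(p^s)}}).$$

Therefore, $(1-\zeta_{p^{s-u}})/(1-\zeta_{p^s})^{p^u}$ is a unit of $\mathfrak{o}_{\mathbb{Q}^{(p^s)}}$.

By the claim in the above paragraph, we can write (3.89) as

$$\det\bigl[\zeta_{p^s}^{t_i j}\bigr] = \varepsilon (1-\zeta_{p^s})^{v},$$

where $\varepsilon$ is a unit of $\mathfrak{o}_{\mathbb{Q}^{(p^s)}}$ and $v \in \mathbb{N}$. Solving (3.88) for $a_0, \dots, a_{\varphi(p^s)-1}$, we get

$$\begin{pmatrix} a_0 \\ \vdots \\ a_{\varphi(p^s)-1} \end{pmatrix} = (1-\zeta_{p^s})^{-v} \begin{pmatrix} c_0 \\ \vdots \\ c_{\varphi(p^s)-1} \end{pmatrix},$$

where $c_i \in \mathfrak{o}_{\mathbb{Q}^{(n)}}$, $0 \le i \le \varphi(p^s)-1$. Since $\wp$ does not lie above $p\mathbb{Z}$, it does not lie above $\mathfrak{P}$. Therefore, $\nu_{\wp}(1-\zeta_{p^s}) = 0$. So, $\nu_{\wp}(a_i) = \nu_{\wp}(c_i) \ge 0$ for all $0 \le i \le \varphi(p^s)-1$. □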

3.8

Exercises

3.1. Prove Theorem 3.25.


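3.2. Let $\lambda_1, \dots, \lambda_k$ be multiplicative characters of $\mathbb{F}_q$. The Jacobi sum of $\lambda_1, \dots, \lambda_k$ is defined to be

$$J(\lambda_1, \dots, \lambda_k) = \sum_{\substack{a_1, \dots, a_k \in \mathbb{F}_q \\ a_1 + \cdots + a_k = 1}} \lambda_1(a_1) \cdots \lambda_k(a_k).$$

Assume $k \ge 2$. Prove the following statements.

(i)
$$J(\lambda_1, \dots, \lambda_k) = \begin{cases} q^{k-1} & \text{if } \lambda_1, \dots, \lambda_k \text{ are all trivial,} \\ 0 & \text{if some, but not all, of } \lambda_1, \dots, \lambda_k \text{ are trivial.} \end{cases}$$

(ii) If $\lambda_1, \dots, \lambda_k$ are all nontrivial,

$$J(\lambda_1, \dots, \lambda_k) = \frac{G(\lambda_1) \cdots G(\lambda_k)}{G(\lambda_1 \cdots \lambda_k)}$$

and

$$|J(\lambda_1, \dots, \lambda_k)| = \begin{cases} q^{\frac{k-2}{2}} & \text{if } \lambda_1 \cdots \lambda_k \text{ is trivial,} \\ q^{\frac{k-1}{2}} & \text{if } \lambda_1 \cdots \lambda_k \text{ is nontrivial.} \end{cases}$$

3.3. (i) Let $\mathfrak{o}$ be a Dedekind domain. Prove that $\mathfrak{o}$ is a UFD if and only if it is a PID.

(ii) Let $k = \mathbb{Q}(\sqrt{-5})$. Prove that the ring of integers $\mathfrak{o}_k$ is $\mathbb{Z}[\sqrt{-5}]$. Hence $\mathbb{Z}[\sqrt{-5}]$ is a Dedekind domain. Prove that $\mathbb{Z}[\sqrt{-5}]$ is not a UFD.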

CHAPTER 4

Zeros of Polynomials over Finite Fields

4.1. Ax’s Theorem

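Let $p$ be a prime integer, $t \in \mathbb{Z}^+$ and $q = p^t$. For each $f \in \mathbb{F}_q[X_1, \dots, X_n]$, let

$$Z(f) = \bigl\{(x_1, \dots, x_n) \in \mathbb{F}_q^n : f(x_1, \dots, x_n) = 0\bigr\}.$$

Ax's theorem is a lower bound on the $p$-adic order of $|Z(f)|$. The proof given here is based on Ax's original paper [1].

We follow the notation of Section 3.8: $\mathfrak{p}$ is a prime ideal of $\mathfrak{o}_{\mathbb{Q}^{(q-1)}}$ lying above $p\mathbb{Z}$ and $\wp$ is the unique prime ideal of $\mathfrak{o}_{\mathbb{Q}^{(p(q-1))}}$ lying above $\mathfrak{p}$, see Figure ??; $\mathbb{F}_{\mathfrak{p}} = \mathfrak{o}_{\mathbb{Q}^{(q-1)}}/\mathfrak{p}$, $\pi = \zeta_p - 1$, $T = \langle \zeta_{q-1} \rangle \cup \{0\}$; $\lambda_{\mathfrak{p}}$ is the multiplicative character of $\mathbb{F}_q$ defined in (??). For an integer $i$, $0 \le i \le q-1$, with $p$-adic expansion $i = i^{(0)} + i^{(1)} p + \cdots + i^{(t-1)} p^{t-1}$, where $0 \le i^{(j)} \le p-1$, let

$$s(i) = i^{(0)} + i^{(1)} + \cdots + i^{(t-1)}$$

and

$$\tau(i) = i^{(t-1)} + i^{(0)} p + \cdots + i^{(t-2)} p^{t-1}.$$

Note that

$$\tau(i) \equiv pi \pmod{q-1} \tag{4.1}$$

and

$$\sum_{j=0}^{t-1} \tau^j(i) = \frac{q-1}{p-1}\, s(i). \tag{4.2}$$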
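The digit sum $s(i)$ and the cyclic digit shift $\tau(i)$ are easy to compute, so (4.1) and (4.2) can be checked exhaustively for small $q$. The following Python sketch is an illustration only; the function names `digits`, `s`, `tau` and `check_identities` are hypothetical and do not appear in the notes.

```python
def digits(i, p, t):
    """Base-p digits i^(0), ..., i^(t-1) of i, least significant first."""
    return [(i // p ** j) % p for j in range(t)]


def s(i, p, t):
    """Digit sum s(i) = i^(0) + ... + i^(t-1)."""
    return sum(digits(i, p, t))


def tau(i, p, t):
    """Cyclic shift: tau(i) = i^(t-1) + i^(0) p + ... + i^(t-2) p^(t-1)."""
    d = digits(i, p, t)
    shifted = [d[-1]] + d[:-1]
    return sum(c * p ** j for j, c in enumerate(shifted))


def check_identities(p, t):
    """Verify (4.1) and (4.2) for every 0 <= i <= q - 1, where q = p^t."""
    q = p ** t
    for i in range(q):
        # (4.1): tau(i) is congruent to p*i modulo q - 1
        if (tau(i, p, t) - p * i) % (q - 1) != 0:
            return False
        # (4.2): sum of tau^j(i), j = 0..t-1, equals (q-1)/(p-1) * s(i)
        total, x = 0, i
        for _ in range(t):
            total += x
            x = tau(x, p, t)
        if total * (p - 1) != (q - 1) * s(i, p, t):
            return False
    return True
```

For instance, with $p = 3$, $t = 2$ and $i = 5 = 2 + 1 \cdot 3$, the digits are $(2, 1)$, so $s(5) = 3$ and $\tau(5) = 1 + 2 \cdot 3 = 7 \equiv 15 \pmod 8$.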

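Lemma 4.1. We have

$$\zeta_p^{\mathrm{Tr}_{\mathbb{F}_q/\mathbb{F}_p}(y + \mathfrak{p})} = \sum_{i=0}^{q-1} c_i y^i \quad \text{for all } y \in T, \tag{4.3}$$

where

$$c_i = \begin{cases} 1 & \text{if } i = 0, \\[2pt] -\dfrac{q}{q-1} & \text{if } i = q-1, \\[6pt] \dfrac{1}{q-1}\, G(\lambda_{\mathfrak{p}}^{-i}) & \text{if } 0 < i < q-1. \end{cases}$$

In particular,

$$\nu_{\wp}(c_i) = s(i), \quad 0 \le i \le q-1. \tag{4.4}$$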


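Proof. Since $|T| = q$, by the Lagrange interpolation, there exist $c_0, \dots, c_{q-1} \in \mathbb{C}$ such that

$$\sum_{i=0}^{q-1} c_i y^i = \zeta_p^{\mathrm{Tr}_{\mathbb{F}_q/\mathbb{F}_p}(y + \mathfrak{p})} \quad \text{for all } y \in T. \tag{4.5}$$

Letting $y = 0$ in (4.5), we have $c_0 = 1$. To determine $c_{q-1}$, observe that

$$-1 = \sum_{y \in \langle \zeta_{q-1} \rangle} \zeta_p^{\mathrm{Tr}_{\mathbb{F}_q/\mathbb{F}_p}(y + \mathfrak{p})} = \sum_{y \in \langle \zeta_{q-1} \rangle} \sum_{i=0}^{q-1} c_i y^i = \sum_{i=0}^{q-1} c_i \sum_{y \in \langle \zeta_{q-1} \rangle} y^i = (q-1)(c_0 + c_{q-1}).$$

It follows that

$$c_{q-1} = -\frac{1}{q-1} - c_0 = -\frac{q}{q-1}.$$

Now assume $0 < j < q-1$. We have

$$G(\lambda_{\mathfrak{p}}^{-j}) = \sum_{y \in \langle \zeta_{q-1} \rangle} y^{-j} \zeta_p^{\mathrm{Tr}_{\mathbb{F}_q/\mathbb{F}_p}(y + \mathfrak{p})} = \sum_{y \in \langle \zeta_{q-1} \rangle} y^{-j} \sum_{i=0}^{q-1} c_i y^i = \sum_{i=0}^{q-1} c_i \sum_{y \in \langle \zeta_{q-1} \rangle} y^{i-j} = (q-1) c_j.$$

(In the last step above, note that for $0 \le i \le q-1$, $i - j \equiv 0 \pmod{q-1}$ if and only if $i = j$.) Thus, $c_j = \frac{1}{q-1} G(\lambda_{\mathfrak{p}}^{-j})$.

For $0 < i < q-1$, (4.4) follows from Stickelberger's theorem. For $i = 0$ or $q-1$, (4.4) is obviously true. □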
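The interpolation (4.3) can be checked numerically in the simplest case $q = p$ (so $t = 1$ and the trace is the identity). The Python sketch below is not from the notes and makes one explicit assumption: the identification of $\mathbb{F}_p^*$ with $\langle \zeta_{p-1} \rangle$ is modelled by sending $g^k \bmod p$ to $\zeta_{p-1}^k$ for a chosen primitive root $g$, which corresponds to one particular choice of $\mathfrak{p}$. The function names are illustrative, and the valuation statement (4.4) is not tested.

```python
import cmath


def lemma_4_1_data(p, g):
    """Coefficients c_0..c_{p-1} of Lemma 4.1 for q = p, plus the set T.

    Assumption: g is a primitive root mod p, and the residue g^k mod p
    is lifted to zeta_{p-1}^k in T.
    """
    zeta_p = cmath.exp(2j * cmath.pi / p)
    zeta = cmath.exp(2j * cmath.pi / (p - 1))
    # pairs (residue in F_p, lift in T) for the nonzero elements
    lifts = [(pow(g, k, p), zeta ** k) for k in range(p - 1)]
    c = [0] * p
    c[0] = 1
    c[p - 1] = -p / (p - 1)
    for i in range(1, p - 1):
        # c_i = G(lambda^{-i}) / (q - 1), with the Gauss sum written out
        c[i] = sum(y ** (-i) * zeta_p ** r for r, y in lifts) / (p - 1)
    return c, lifts + [(0, 0)], zeta_p


def max_interpolation_error(p, g):
    """Largest |zeta_p^(y mod p) - sum_i c_i y^i| over y in T."""
    c, T, zeta_p = lemma_4_1_data(p, g)
    return max(
        abs(zeta_p ** r - sum(c[i] * y ** i for i in range(p)))
        for r, y in T
    )
```

For example, `max_interpolation_error(5, 2)` should be zero up to floating-point rounding, since 2 is a primitive root modulo 5.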

For $u = (u_1, \dots, u_n) \in \mathbb{N}^n$, let $|u| = u_1 + \cdots + u_n$. If $x = (x_1, \dots, x_n)$ is an $n$-tuple over a commutative ring, we define $x^u = x_1^{u_1} \cdots x_n^{u_n}$.

Theorem 4.2 (Ax's theorem). Let $f \in \mathbb{F}_q[X_1, \dots, X_n]$ with $\deg f = d > 0$. Then

$$|Z(f)| \equiv 0 \pmod{q^{\lceil n/d \rceil - 1}}. \tag{4.6}$$

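Proof. Let $X = (X_1, \dots, X_n)$ and write

$$f = \sum_{j=1}^{m} a_j X^{u_j},$$

where $a_j \in \mathbb{F}_q$, $u_j \in \mathbb{N}^n$ and $|u_j| \le d$, $1 \le j \le m$. We identify $\mathbb{F}_q$ with $\mathbb{F}_{\mathfrak{p}}$. Thus, $a_j = b_j + \mathfrak{p}$ for some $b_j \in T$, $1 \le j \le m$. We have

$$\begin{aligned}
q|Z(f)| &= \sum_{x_0 \in \mathbb{F}_{\mathfrak{p}}} \sum_{x \in \mathbb{F}_{\mathfrak{p}}^n} \zeta_p^{\mathrm{Tr}_{\mathbb{F}_q/\mathbb{F}_p}(x_0 f(x))} \\
&= \sum_{(x_0, x) \in \mathbb{F}_{\mathfrak{p}}^{n+1}} \zeta_p^{\mathrm{Tr}_{\mathbb{F}_q/\mathbb{F}_p}\bigl(x_0 \sum_{j=1}^{m} a_j x^{u_j}\bigr)} \\
&= \sum_{z \in \mathbb{F}_{\mathfrak{p}}^{n+1}} \prod_{j=1}^{m} \zeta_p^{\mathrm{Tr}_{\mathbb{F}_q/\mathbb{F}_p}(a_j z^{(1, u_j)})} \qquad (z = (x_0, x)) \\
&= \sum_{y \in T^{n+1}} \prod_{j=1}^{m} \Bigl(\sum_{i=0}^{q-1} c_i b_j^{i} y^{i(1, u_j)}\Bigr) \qquad \text{(by (4.3))} \\
&= \sum_{y \in T^{n+1}} \sum_{0 \le i_1, \dots, i_m \le q-1} c_{i_1} \cdots c_{i_m} b_1^{i_1} \cdots b_m^{i_m}\, y^{i_1(1, u_1) + \cdots + i_m(1, u_m)} \\
&= \sum_{0 \le i_1, \dots, i_m \le q-1} b_1^{i_1} \cdots b_m^{i_m} c_{i_1} \cdots c_{i_m} \sum_{y \in T^{n+1}} y^{i_1(1, u_1) + \cdots + i_m(1, u_m)}.
\end{aligned} \tag{4.7}$$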

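We claim that for all $(i_1, \dots, i_m) \in \{0, \dots, q-1\}^m$,

$$\nu_{\wp}\Bigl(c_{i_1} \cdots c_{i_m} \sum_{y \in T^{n+1}} y^{i_1(1, u_1) + \cdots + i_m(1, u_m)}\Bigr) \ge t(p-1) \Bigl\lceil \frac{n}{d} \Bigr\rceil. \tag{4.8}$$

We first observe that the conclusion of the theorem follows from (4.7) and (4.8). In fact, by (4.7) and (4.8),

$$\nu_{\wp}\bigl(|Z(f)|\bigr) + t(p-1) = \nu_{\wp}\bigl(q|Z(f)|\bigr) \ge t(p-1) \Bigl\lceil \frac{n}{d} \Bigr\rceil.$$

Thus, $\nu_{\wp}\bigl(|Z(f)|\bigr) \ge t(p-1)\bigl(\lceil n/d \rceil - 1\bigr)$ and

$$\nu_p\bigl(|Z(f)|\bigr) = \frac{1}{e(\wp/p\mathbb{Z})}\, \nu_{\wp}\bigl(|Z(f)|\bigr) = \frac{1}{p-1}\, \nu_{\wp}\bigl(|Z(f)|\bigr) \ge t\Bigl(\Bigl\lceil \frac{n}{d} \Bigr\rceil - 1\Bigr),$$

which is equivalent to (4.6).

To prove (4.8), we use the fact that for $i \ge 0$,

$$\sum_{y \in T} y^i = \begin{cases} q & \text{if } i = 0, \\ q-1 & \text{if } i \ne 0 \text{ but } i \equiv 0 \pmod{q-1}, \\ 0 & \text{if } i \not\equiv 0 \pmod{q-1}. \end{cases} \tag{4.9}$$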
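The power sums in (4.9) are sums over the $(q-1)$-th roots of unity together with $0$, with the convention $0^0 = 1$. A small numerical check in Python is sketched below; it is an illustration only, uses floating-point roots of unity, and the function names are hypothetical.

```python
import cmath


def power_sum_over_T(q, i):
    """Sum of y^i over T = <zeta_{q-1}> united with {0}, using 0^0 = 1."""
    zeta = cmath.exp(2j * cmath.pi / (q - 1))
    T = [0] + [zeta ** k for k in range(q - 1)]
    return sum(y ** i for y in T)


def expected_power_sum(q, i):
    """Right-hand side of (4.9)."""
    if i == 0:
        return q
    return q - 1 if i % (q - 1) == 0 else 0


def check_4_9(q, max_i, tol=1e-8):
    """Compare both sides of (4.9) for 0 <= i <= max_i."""
    return all(
        abs(power_sum_over_T(q, i) - expected_power_sum(q, i)) < tol
        for i in range(max_i + 1)
    )
```

Note that only the size $q$ of $T$ matters here, so the check also runs for prime powers such as $q = 4$ or $q = 9$ without constructing the field.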

We consider three cases.

Case 1. $i_1(1, u_1) + \cdots + i_m(1, u_m) \not\equiv (0, \dots, 0) \pmod{q-1}$. By (4.9),

$$\sum_{y \in T^{n+1}} y^{i_1(1, u_1) + \cdots + i_m(1, u_m)} = 0. \tag{4.10}$$

Case 2. $(i_1, \dots, i_m) = (0, \dots, 0)$. Then

$$\sum_{y \in T^{n+1}} y^{i_1(1, u_1) + \cdots + i_m(1, u_m)} = q^{n+1}. \tag{4.11}$$

Thus, the left hand side of (4.8) is $\ge \nu_{\wp}(q^{n+1}) = t(p-1)(n+1) > t(p-1)\lceil n/d \rceil$.


Case 3. $i_1(1, u_1) + \cdots + i_m(1, u_m) \equiv (0, \dots, 0) \pmod{q-1}$ but $(i_1, \dots, i_m) \ne (0, \dots, 0)$. In particular, $i_1 + \cdots + i_m$ is a nonzero integer multiple of $q-1$. Let $k$ be the number of nonzero components of $i_1 u_1 + \cdots + i_m u_m$. Clearly,

$$(i_1 + \cdots + i_m) d \ge i_1 |u_1| + \cdots + i_m |u_m| \ge k(q-1).$$

Since $i_1 + \cdots + i_m \equiv 0 \pmod{q-1}$, we get

$$i_1 + \cdots + i_m \ge (q-1) \Bigl\lceil \frac{k}{d} \Bigr\rceil. \tag{4.12}$$

By (4.9), we have

$$\sum_{y \in T^{n+1}} y^{i_1(1, u_1) + \cdots + i_m(1, u_m)} = (q-1)^{k+1} q^{n-k}. \tag{4.13}$$

We claim that for any $i_1, \dots, i_m \in \{0, \dots, q-1\}$, equation (4.13) implies inequality (4.12). Assume that (4.13) is satisfied. Then (4.10) and (4.11) are not satisfied; hence $i_1, \dots, i_m$ must be in Case 3. Moreover, the number of nonzero components of $i_1 u_1 + \cdots + i_m u_m$ is $k$. So, (4.12) holds.

By (4.1),

$$\begin{aligned}
\sum_{y \in T^{n+1}} y^{\tau(i_1)(1, u_1) + \cdots + \tau(i_m)(1, u_m)} &= \sum_{y \in T^{n+1}} y^{p[i_1(1, u_1) + \cdots + i_m(1, u_m)]} \\
&= \sum_{y \in T^{n+1}} y^{i_1(1, u_1) + \cdots + i_m(1, u_m)} \qquad \text{(since } y \mapsto y^p \text{ is a permutation of } T\text{)} \\
&= (q-1)^{k+1} q^{n-k},
\end{aligned}$$

i.e., (4.13) holds with $\tau(i_1), \dots, \tau(i_m)$ in place of $i_1, \dots, i_m$. Therefore, (4.12) holds with $\tau(i_1), \dots, \tau(i_m)$ in place of $i_1, \dots, i_m$. The same holds for $\tau^l(i_1), \dots, \tau^l(i_m)$, $0 \le l \le t-1$. By (4.12) and (4.2), we have

$$t(q-1) \Bigl\lceil \frac{k}{d} \Bigr\rceil \le \sum_{l=0}^{t-1} \sum_{j=1}^{m} \tau^l(i_j) = \frac{q-1}{p-1} \sum_{j=1}^{m} s(i_j),$$

i.e.,

$$\sum_{j=1}^{m} s(i_j) \ge t(p-1) \Bigl\lceil \frac{k}{d} \Bigr\rceil.$$

Therefore,

$$\begin{aligned}
\nu_{\wp}\bigl(c_{i_1} \cdots c_{i_m} (q-1)^{k+1} q^{n-k}\bigr) &= s(i_1) + \cdots + s(i_m) + t(p-1)(n-k) \qquad \text{(by (4.4))} \\
&\ge t(p-1)\Bigl(\Bigl\lceil \frac{k}{d} \Bigr\rceil + n - k\Bigr).
\end{aligned}$$

Clearly, $\min\{\lceil k/d \rceil + n - k : 0 \le k \le n\} = \lceil n/d \rceil$. Thus (4.8) is proved. □
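Congruence (4.6) can be tested by brute force over a small prime field $\mathbb{F}_p$ (the case $t = 1$). The sketch below is an illustration in Python and not part of the notes: polynomials are represented as dictionaries from exponent tuples to coefficients, and all names are hypothetical. The helper `random_poly` only guarantees total degree at most $d$; the check against the modulus for $d$ is still valid, because a smaller positive degree gives a stronger congruence, a nonzero constant has no zeros, and the zero polynomial has $q^n$ zeros.

```python
from itertools import product
import random


def count_zeros(poly, n, p):
    """Count zeros in F_p^n of poly = {exponent tuple: coefficient mod p}."""
    count = 0
    for x in product(range(p), repeat=n):
        value = 0
        for u, a in poly.items():
            term = a
            for xi, ui in zip(x, u):
                term = term * pow(xi, ui, p) % p  # pow(0, 0, p) == 1
            value = (value + term) % p
        if value == 0:
            count += 1
    return count


def ax_modulus(n, d, q):
    """The modulus q^(ceil(n/d) - 1) appearing in (4.6)."""
    return q ** (-(-n // d) - 1)


def random_poly(n, d, p, num_terms, rng):
    """Random polynomial in n variables over F_p of total degree at most d."""
    poly = {}
    for _ in range(num_terms):
        u = [0] * n
        for _ in range(rng.randint(0, d)):
            u[rng.randrange(n)] += 1
        poly[tuple(u)] = rng.randrange(1, p)
    return poly
```

As a usage example, $f = X_1 X_2 + X_3 X_4$ over $\mathbb{F}_3$ is encoded as `{(1, 1, 0, 0): 1, (0, 0, 1, 1): 1}`; it has 33 zeros, and `ax_modulus(4, 2, 3)` is 3.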

Corollary 4.3. Let $f \in \mathbb{F}_q[X_1, \dots, X_n]$ with $\deg f = d > 0$. Then either $|Z(f)| = 0$ or $|Z(f)| \ge q^{\lceil n/d \rceil - 1}$.


Congruence (4.6) in Ax's theorem is equivalent to

$$\nu_p\bigl(|Z(f)|\bigr) \ge t\Bigl(\Bigl\lceil \frac{n}{d} \Bigr\rceil - 1\Bigr). \tag{4.14}$$

This lower bound is the best possible as shown in the next lemma.

Lemma 4.4. Let $I_1, \dots, I_k$ be a partition of $\{1, \dots, n\}$ ($I_i \ne \emptyset$, $1 \le i \le k$). Let

$$f = \sum_{i=1}^{k} \prod_{j \in I_i} X_j \in \mathbb{F}_q[X_1, \dots, X_n].$$

Then

$$|Z(f)| = q^{n-1} + q^{k-1}(q-1) \prod_{i=1}^{k} \bigl(q^{|I_i|-1} - (q-1)^{|I_i|-1}\bigr).$$

In particular,

$$\nu_p\bigl(|Z(f)|\bigr) = t(k-1).$$

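Proof. We have

$$\begin{aligned}
q|Z(f)| &= \sum_{x_0 \in \mathbb{F}_q} \sum_{(x_1, \dots, x_n) \in \mathbb{F}_q^n} \zeta_p^{\mathrm{Tr}_{\mathbb{F}_q/\mathbb{F}_p}(x_0 f(x_1, \dots, x_n))} \\
&= q^n + \sum_{x_0 \in \mathbb{F}_q^*} \sum_{(x_1, \dots, x_n) \in \mathbb{F}_q^n} \zeta_p^{\sum_{i=1}^{k} \mathrm{Tr}_{\mathbb{F}_q/\mathbb{F}_p}\bigl(x_0 \prod_{j \in I_i} x_j\bigr)} \\
&= q^n + \sum_{x_0 \in \mathbb{F}_q^*} \sum_{(x_1, \dots, x_n) \in \mathbb{F}_q^n} \prod_{i=1}^{k} \zeta_p^{\mathrm{Tr}_{\mathbb{F}_q/\mathbb{F}_p}\bigl(x_0 \prod_{j \in I_i} x_j\bigr)} \\
&= q^n + \sum_{x_0 \in \mathbb{F}_q^*} \prod_{i=1}^{k} \Bigl(\sum_{\substack{x_j \in \mathbb{F}_q \\ j \in I_i}} \zeta_p^{\mathrm{Tr}_{\mathbb{F}_q/\mathbb{F}_p}\bigl(x_0 \prod_{j \in I_i} x_j\bigr)}\Bigr) \\
&= q^n + (q-1) \prod_{i=1}^{k} \Bigl(\sum_{\substack{x_j \in \mathbb{F}_q \\ j \in I_i}} \zeta_p^{\mathrm{Tr}_{\mathbb{F}_q/\mathbb{F}_p}\bigl(\prod_{j \in I_i} x_j\bigr)}\Bigr) \\
&= q^n + (q-1) \prod_{i=1}^{k} q\bigl(q^{|I_i|-1} - (q-1)^{|I_i|-1}\bigr).
\end{aligned}$$

In the last step of the above calculation, we used the fact that

$$\sum_{(x_1, \dots, x_l) \in \mathbb{F}_q^l} \zeta_p^{\mathrm{Tr}_{\mathbb{F}_q/\mathbb{F}_p}(x_1 \cdots x_l)} = q\, \bigl|\bigl\{(x_1, \dots, x_{l-1}) \in \mathbb{F}_q^{l-1} : x_1 \cdots x_{l-1} = 0\bigr\}\bigr| = q\bigl(q^{l-1} - (q-1)^{l-1}\bigr).$$

So, the lemma is proved. □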
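The counting formula of Lemma 4.4 can be compared with a brute-force count over a prime field $\mathbb{F}_p$. The Python sketch below is an illustration only: blocks of the partition are given as tuples of 0-based variable indices, the function names are hypothetical, and only the formula for $|Z(f)|$ is checked, not the valuation statement.

```python
from itertools import product
from math import prod


def count_zeros_partition(blocks, n, p):
    """Brute-force |Z(f)| over F_p for f = sum over blocks of prod_{j in block} X_j."""
    count = 0
    for x in product(range(p), repeat=n):
        value = sum(prod(x[j] for j in block) for block in blocks) % p
        if value == 0:
            count += 1
    return count


def lemma_4_4_formula(blocks, n, q):
    """q^(n-1) + q^(k-1) (q-1) prod_i (q^(|I_i|-1) - (q-1)^(|I_i|-1))."""
    k = len(blocks)
    factor = prod(q ** (len(b) - 1) - (q - 1) ** (len(b) - 1) for b in blocks)
    return q ** (n - 1) + q ** (k - 1) * (q - 1) * factor
```

For example, the partition $\{1, 2\}, \{3, 4\}$ of $\{1, 2, 3, 4\}$ is passed as `[(0, 1), (2, 3)]`; over $\mathbb{F}_3$ both functions return 33.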

For each $d > 0$, we can choose a partition $I_1, \dots, I_{\lceil n/d \rceil}$ of $\{1, \dots, n\}$ such that $|I_i| \le d$ for all $i$ and $|I_1| = d$. Let $f = \sum_{i=1}^{\lceil n/d \rceil} \prod_{j \in I_i} X_j \in \mathbb{F}_q[X_1, \dots, X_n]$. Then $\deg f = d$ and by Lemma 4.4, $\nu_p\bigl(|Z(f)|\bigr) = t\bigl(\lceil n/d \rceil - 1\bigr)$. Hence, the lower bound in (4.14) is attained.

Hints for the Exercises

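1.1. (iii) Let $b, \tau(b), \dots, \tau^{n-1}(b)$ be a normal basis of $\mathbb{F}_{q^n}$ over $\mathbb{F}_q$. Let $a = g(b)/\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}(b)$. Compare $g$ and $a\,\mathrm{Tr}_{\mathbb{F}_{q^n}/\mathbb{F}_q}$ at $b, \tau(b), \dots, \tau^{n-1}(b)$.

1.2. It suffices to consider the case where $q$ is odd. Prove that the set of squares in $\mathbb{F}_q$ is not closed under addition.

1.3. Prove that $n = \sum_{d \mid n} \varphi(d)$ by partitioning $\mathbb{Z}/n\mathbb{Z}$ according to the order of elements.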

2.3. Consider f =∑q−1j=0 haj where haj is defined in (2.19).

2.4. Use Proposition 2.27.

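3.3. (i) Assume $\mathfrak{o}$ is a UFD and let $\mathfrak{a}$ be an ideal of $\mathfrak{o}$. Since $\mathfrak{a}$ is invertible, $1 = \sum_{i=1}^{n} b_i a_i$ where $0 \ne b_i \in \mathfrak{a}^{-1}$, $0 \ne a_i \in \mathfrak{a}$, $1 \le i \le n$. Write $b_i = c_i/d_i$, $c_i, d_i \in \mathfrak{o}$, $\gcd(c_i, d_i) = 1$. Let $d = \mathrm{lcm}(d_1, \dots, d_n)$. Show that $\mathfrak{a} = (d)$.

(ii) If $a + b\sqrt{-5} \in \mathfrak{o}_k$, then $2a \in \mathbb{Z}$ and $a^2 + 5b^2 \in \mathbb{Z}$. Show $a, b \in \mathbb{Z}$. Try to factor $6 \in \mathbb{Z}[\sqrt{-5}]$ in two ways.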


Bibliography

[1] J. Ax, Zeros of polynomials over finite fields, Amer. J. Math. 86 (1964), 255–261.

[2] E. A. Bender and J. R. Goldman, On the applications of Möbius inversion in combinatorial analysis, Amer. Math. Monthly 82 (1975), 789–803.

[3] E. R. Berlekamp, Algebraic Coding Theory, McGraw-Hill, New York, 1968.

[4] H. Davenport and H. Hasse,

[5] Estermann

[6] X. Hou, Solution to a problem of S. Payne, Proc. Amer. Math. Soc. 132 (2004), 1–8.

[7] X. Hou, A note on the proof of a theorem of Katz, Finite Fields Appl. 11 (2005), 316–319.

[8] K. Ireland and M. Rosen, A Classical Introduction to Modern Number Theory, Springer, New York, 1982.

[9] N. M. Katz, On a theorem of Ax, Amer. J. Math. 93 (1971), 485–499.

[10] I. Niven, Formal power series, Amer. Math. Monthly 76 (1969), 871–889.

[11] S. E. Payne, A complete determination of translation ovoids in finite Desarguian planes, Lincei - Rend. Sc. fis. mat. nat. LI (1971), 328–331 (1972).
