Groebner Bases with an Application to Integer Programmingand algebraic geometry. The primary purpose...

Groebner Bases with an Application to IntegerProgramming

Carolyn WendlerSenior Exercise

May 17, 2004

0.1 Introduction

The theory of Groebner bases involves a generalization of the theory of polynomi-als in one variable. They were introduced by Bruno Buchburger in 1965 (Adamsx). The significance of Groebner bases lies in the function they provide as compu-tational tools to finding algorithmic solutions to problems in commutative algebraand algebraic geometry. The primary purpose of this paper is to introduce Groebnerbases and techniques of working with Groebner bases. To achieve this purpose, thepaper will focus on an application of Groebner bases, integer programming, in orderto demonstrate how the techniques of Groebner bases work and are applied to solverelevant problems. Linear and Integer programs are a class of optimization problems.There are many real world applications to these problems in many different fields.The general linear programming problem is to either maximize or minimize an objec-tive subject to some constraints. For example, in the telecommunications industry, aprovider may need to decide how to supply adequate service to an area of customerswhile minimizing cost. Or a manufacturer may need to decide how to best allocateresources while maximizing profits. Linear and integer programming are used to helpmake these decisions. Mathematically, we can use linear programming when the con-straints are linear. These types of problems are solvable in polynomial time by usinglinear algebra (Thomas 119). For a detailed discussion of linear programming meth-ods, see Craig 2003. Integer programs differ from linear programs in that solutionsare required to be integral instead of real. It is much more difficult to find integer-onlysolutions to the problem, and currently, integer programs are not solvable in poly-nomial time (they are NP-complete) (CLO 1998 363). The techniques of Groebnerbasis have provided new insight into this problem. These techniques provide rathercomplicated algorithms for solving integer programs, but they are worth investigatingbecause of the possibility of finding a more efficient solution. This paper will providethe most simple examples that involve few variables with few constraints, yet eventhese problems involve a good deal of computation. The general integer program isthe problem:

Integer Programming Problem Let aij ∈ ZZ , bi ∈ ZZ , andcj ∈ IR , i = 1, . . . , n, j = 1, . . . ,m; we wish to find a solution (σ1, σ2, . . . , σm) in IN m

of the system

a11σ1 + a12σ2 + · · ·+ a1mσm = b1

a21σ1 + a22σ2 + · · ·+ a2mσm = b2

·· (1)

·an1σ1 + an2σ2 + . . . + anmσm = bn,

1

which minimizes the “cost function”

c(σ1, σ2, . . . , σm) =m∑

j=1

cjσj.

(Adams 105)

The main goal of this paper will be to find a method for solving this problem. Todo this, we will translate this problem into a problem about polynomials that we cansolve using Groebner bases techniques. In dealing with sets of polynomials, we willfind that certain sets can be generated by a finite number of polynomials, and in factcan be generated by a Groebner basis. Due to the structure of Groebner bases, thisgenerating set will give us polynomial solutions to our problem. By discovering howwe can find the elements of a Groebner basis in a systematic and strategic way, wewill solve our main problem.

2

Chapter 1

Algebra and Geometry:Preliminaries

We will begin with some useful definitions. Polynomials are certainly familiar, but inthis paper we will discuss polynomials in n variables x1, . . . , xn with coefficients in anarbitrary field k, which is a generalization of polynomials involving a single variable.First we will define monomials.

Definition 1 A monomial in x1, . . . , xn is a product of the form x1α1 ·x2

α2 . . . xnαn,

where all of the exponents α1, . . . , αn are nonnegative integers. The total degree ofthis monomial is the sum α1 + · · ·+ αn. We will denote this sum by | α |.

We will simplify the notation for monomials by letting α = (α1, . . . , αn) be an n-tupleof non-negative integers. Then we write

xα = x1α1 · x2

α2 . . . xnαn .

A polynomial is simply a sum of these monomials with coefficients from our field k.

Definition 2 A polynomial f in x1, . . . , xn with coefficients in a field k is a finitelinear combination (with coefficients in k) of monomials. We will write a polynomialf in the form

f =∑α

aαxα, aα ∈ k,

where the sum is over a finite number of n-tuples α = (α1, . . . , αn). The set of allpolynomials in x1, . . . , xn with coefficients in k is denoted k[x1, . . . , xn].

We will be using the following terminology in our discussion of polynomials:

Definition 3 Let f =∑

α aαxα be a polynomial in k[x1, . . . , xn].

1. We call aα the coefficient of the monomial xα.

3

2. If aα 6= 0, then we call aαxα a term of f.

3. The total degree of f, denoted deg(f), is the maximum | α | such that thecoefficient aα is nonzero.

It is clear that, under addition and multiplication, the set of polynomials k[x1, . . . , xn]is a ring. The ring k[x1, . . . , xn] is also commutative under multiplication, and so itis a commutative ring. Thus we call k[x1, . . . , xn] a polynomial ring.

Now we will introduce the notion of affine space, within which we will be working.

Definition 4 Given a field k and a positive integer n, we define the n-dimensionalaffine space over k to be the set

kn = {(a1, . . . , an) : a1, . . . , an ∈ k}.

A familiar example of an affine space is the case k = IR and we get the space IR n.To see how polynomials are related to affine space (and thus algebra to geometry), onemust regard a polynomial as a function from kn to k. We can see that the polynomialf =

∑α aαxα ∈ k[x1, . . . , xn] gives a function

f : kn → k.

We now define a geometric object, based on these ideas.

Definition 5 Let k be a field, and let f1, . . . , fs be polynomials in k[x1, . . . , xn]. Thenwe set V (f1, . . . , fs) = {(a1, . . . , an) ∈ kn : fi(a1, . . . , an) = 0 for all 1 ≤ i ≤ s}. Wecall V (f1, . . . , fs) the affine variety defined by f1, . . . , fs.

So, an affine variety V(f1, . . . , fn) ⊂ kn is the set of all solutions of the system ofequations f1(x1, . . . , xn) = · · · = fs(x1, . . . , xn) = 0. The polynomials (f1, . . . , fs) aresaid to vanish on V . We can picture familiar graphs that are affine varieties, suchas circles, ellipses, parabolas, and hyperbolas. The graph of a polynomial functiony = f(x) is also the affine variety V (y− f(x)). One example of such an affine varietyis V (z2 + x2 − y2):

We now define the main algebraic object that will be used in this paper.

4

Definition 6 A subset I ⊂ k[x1, . . . , xn] is an ideal if it satisfies:

1. 0 ∈ I.

2. If f, g ∈ I, then f + g ∈ I.

3. If f ∈ I and h ∈ k[x1, . . . , xn], then h · f ∈ I. (CLO 29)

Based on the third property, we say that an ideal absorbs elements of the ringunder multiplication. So an ideal is an additive subgroup of the ring k[x1, . . . , xn]that absorbs multiplication. The following definition will show that one exampleof an ideal is the ideal generated by a finite number of polynomials. That is, thegroup of all polynomial combinations of elements of k[x1, . . . , xn] with a generatingset polynomials forms an ideal.

Definition 7 Let f1, . . . , fs be polynomials in k[x1, . . . , xn]. Then we set

〈f1, . . . , fs〉 = {s∑

i=1

hifi : h1, . . . , hs ∈ k[x1, . . . , xn]}.

We call 〈f1, . . . , fs〉 the ideal generated by f1, . . . , fs. (CLO 29)

Theorem 1 Let f1, . . . , fs be polynomials in k[x1, . . . , xn]. Then 〈f1, . . . , fs〉 is indeedan ideal of k[x1, . . . , xn].

Proof:We can see that this is an ideal by showing that the three conditions for an ideal holdfor 〈f1, . . . , fs〉.

1. First we want to show that 0 ∈ 〈f1, . . . , fs〉. We know 0 ∈ k[x1, . . . , xn], sincek[x1, . . . , xn] is a ring. Thus

∑si=1 0 · fi ∈ 〈f1, . . . , fs〉. Since 0 =

∑si=1 0 · fi, then

0 ∈ 〈f1, . . . , fs〉.

2. Secondly we need to show that 〈f1, . . . , fs〉 is closed under addition. Let f =∑si=1 pifi ∈ 〈f1, . . . , fs〉 and g =

∑si=1 qifi ∈ 〈f1, . . . , fs〉. Then f+g =

∑si=1(pi+

qi)fi. We know that (pi + qi) ∈ k[x1, . . . , xn], since k[x1, . . . , xn] is a ring andthus closed under addition. Therefore, f + g =

∑si=1(pi + qi)fi ∈ 〈f1, . . . , fs〉.

3. Finally, we need to show that 〈f1, . . . , fs〉 absorbs elements of the ring k[x1, . . . , xn]under multiplication. Let h ∈ k[x1, . . . , xn] and let f =

∑si=1 pifi ∈ 〈f1, . . . , fs〉.

Then h · f =∑s

i=1(hpi)fi. Since k[x1, . . . , xn] is a ring and thus is closedunder multiplication, we know that hpi ∈ k[x1, . . . , xn]. Therefore h · f =∑s

i=1(hpi)fi ∈ 〈f1, . . . , fs〉 and we are done.(CLO 29)

5

The ideals generated by polynomials are computationally important, because anelement of the ideal will be a polynomial combination of the basis polynomials. Thusan element of the ideal will have important properties in common with the basiselements, such as common divisors and also will vanish on the same set. We can seethis by looking at the following type of ideal, which is constructed using varieties. Foran affine variety V = V (f1, . . . , fs) ⊂ kn defined by f1, . . . , fs ∈ k[x1, . . . , xn], we knowthat the polynomials f1, . . . , fs vanish on V . But there might be other polynomialsthat also vanish on V . It turns out that the set of all polynomials vanishing on V isan ideal. Intuitively, this makes sense because any combination of polynomials thatvanish on V will also vanish on V .

Definition 8 Let V ⊂ kn be a variety. Define

I(V ) = {f ∈ f [x1, . . . , xn]|f(a1, . . . , an) = 0 for all (a1, . . . , an) ∈ V }.This ideal is called the vanishing ideal of V.

We need to show that I(V ) is indeed an ideal.

Theorem 2 Let V ⊂ kn be a variety. Then I(V ) is an ideal in the ring k[x1, . . . , xn].

Proof:To prove that I(V ) is an ideal, we need to show that the three conditions for an idealare met.

1. First we need to show that 0 ∈ I(V ), where 0 is the zero polynomial. Since thezero polynomial vanishes on all of kn, in particular it vanishes on V. Therefore0 ∈ I(V ).

2. Next we need to show that I(V ) is closed under addition. Let f, g ∈ I(V ) andlet (a1, . . . , an) be an arbitrary point of V. Then

f(a1, . . . , an) + g(a1, . . . , an) = 0 + 0 = 0.

Therefore f + g ∈ I(V ).

3. Finally, we need to show that I(V ) absorbs elements of the ring k[x1, . . . , xn].Let f ∈ I(V ) and let h ∈ k[x1, . . . , xn] and pick an arbitrary point (a1, . . . , an)in V. Then h(a1, . . . , an)f(a1, . . . , an) = h(a1, . . . , an) · 0 = 0. So h · f ∈ I(V ),and we are done.

Therefore I(V ) is an ideal.

Thus we have had an introduction to ideals and can see why they are interestingalgebraic objects. The next chapter will focus on the bases of ideals. In particular,we will prove that every ideal of the ring k[x1, . . . , xn] has a finite basis, and we willintroduce a specific type of basis, a Groebner basis for ideals, and will find a methodfor constructing a finite Groebner basis for every ideal.

6

Chapter 2

Groebner Bases for Ideals

2.1 Monomial Ideals

As discussed in the previous section, when we are talking about ideals generated bya set of polynomials, an element of these ideals will be a sum of multiples of thebasis polynomials. So we can see that in trying to find whether a given polynomialis the element of an ideal, it would be important to discuss whether or not it isdivisible by elements of the basis. Divisibility turns out to be an important factor infinding elements of an ideal, as we will see in this chapter, and so one major themeof this chapter will be discussing how we can extend what we know about division ofpolynomials in one variable to division of polynomials with many (possibly infinitelymany) variables. We will begin with a discussion of monomials. Because we areworking with multi-variables, it is important to define a way to order monomialsaccording to their degree. This is natural in the one variable case, when we knowsimply that x2 has a greater degree than x. In the multi-variable case, however, thereare many possible ways of ordering monomials, as long as the ordering meets thefollowing criteria.

Definition 9 A monomial ordering on k[x1, . . . , xn] is any relation > on ZZ n≥0

satisfying:

1. > is a total (or linear) ordering on ZZ n≥0.

2. If α > β and γ ∈ ZZ n≥0, then α + γ > β + γ.

3. > is a well-ordering on ZZ n≥0. This means that every nonempty subset of ZZ n

≥0

has a smallest element under >. (CLO 53)

Note that we have defined a monomial ordering as an ordering on the n-tuplesα = (α1, . . . , αn) ∈ ZZ n

≥0. We construct a monomial xα = xα11 xα2

2 · · ·xαnn with the

n-tuple α as the exponent, and so there is a one-to-one relationship between the

7

monomials in k[x1, . . . , xn] and ZZ n≥0. So the ordering > on ZZ n

≥0 gives us an orderingon the monomials in k[x1, . . . , xn]. Thus if α > β, then by the ordering, xα > xβ.

Also note that in ZZ n≥0, addition is done component-wise. So if we have (a1, . . . , an),

(b1, . . . , bn) ∈ ZZ n≥0, then (a1, . . . , an) + (b1, . . . , bn) = (a1 + b1, . . . , an + bn).

The monomial ordering in the one variable case can also be thought of simply asdivisibility. That is, x2 is greater than x, since x2 is divisible by x. It is easily seenthat divisibility is not a monomial ordering in k[x1, . . . , xn] for n > 1. Once we havemore than one variable, simple divisibility cannot help us decide in general whetherone monomial is greater than another, because, for instance, x1 6≥ x2 and x2 6≥ x1,and so divisibility does not tell us which monomial is bigger. We must have someway of ordering these variables. A common ordering that is used in k[x1, . . . , xn] islexicographic ordering, in which the variables are usually ordered simply as x1 > x2 >· · · > xn or, depending on our notation, alphabetically so that a > b > · · ·x > y > z.For lex ordering alone, however, this can be done in n! different ways. So we cansee that for ZZ n, there are n! different lex orderings. The lexicographic monomialordering is defined as follows.

Definition 10 Lexicographic Order. Let α = (α1, . . . , αn) and β = (β1, . . . , βn) ∈ZZ n

≥0. We say α >lex β if, in the difference α−β ∈ ZZ n, the left-most nonzero elementis positive. We write xα >lex xβ if α >lex β.

Proposition 1 The lex ordering >lex is a monomial ordering on ZZ n≥0.

Proof:1. The lex ordering on ZZ n

≥0 is defined based on the numerical values of the componentsof it elements (a1, . . . , an). Since numerical ordering is a total ordering on ZZ≥0, itfollows directly that >lex is a total ordering.2. Let α, β, γ ∈ ZZ n

≥0 such that α >lex γ. Then, by definition of lex ordering, theleft-most nonzero entry in α− β is positive. We now need to order α + γ and β + γ.Taking the difference (α + γ) − (β + γ) = α − β, we see that the left-most nonzeroentry is positive. Thus α + γ >lex β + γ.3. To show that >lex is a well-ordering, we will proceed by contradiction. Assumethat <lex is not a well-ordering. Then there exists a nonempty subset S ⊂ ZZ n

≥ thathas no least element. Let α1 ∈ S. Then α1 is not the least elements and so theremust exist α2 ∈ S such that α1 > α2. Continuing in this manner, we get an infinitestrictly decreasing sequence

α1 >lex α2 >lex · · · .

By definition of lex ordering, the components in the first entries of these elementsform a monotonic decreasing sequence. We know that the numerical order on ZZ≥0 isa well-ordering. So there must be an element αk for which the first entries of all αi fori ≥ k are equal. Starting with this αk, the lex ordering, by definition, is determinedby the components in the entries after the first one. By the same argument as before,

8

the sequence of the second entries must have a least element as well. If we continuethis argument, we see that there must be some element αl in the infinite sequencesuch that all n entries of each vector αj for j ≥ l are equal. Then we have αl = αl+1,by definition of lex ordering. But according to our sequence, αl >lex αl+1. So bycontradiction, >lex is a well-ordering. (CLO 55)

Example:

1. x2y >lex xy, since (2, 1) >lex (1, 1) because α− β = (2, 1)− (1, 1) = (1, 0).

2. xy >lex x, since (1, 1) >lex (1, 0) because α− β = (1, 1)− (1, 0) = (0, 1).

3. x2 >lex xy2z, since (2, 0, 0) >lex (1, 2, 1) because α − β = (2, 0, 0) − (1, 2, 1) =(1,−2,−1).

Another example of a monomial ordering is Graded Lexicographic Order, whichis based upon the total degree of a monomial. In the case in which the total degreesof two monomials are equal, the tie is broken by using lex order.

Definition 11 Graded Lexicographic Order Let α, β ∈ ZZ n≥0. We say that

α >grlex β if |α| = ∑ni=1 αi > |β| = ∑n

i=1 βi, or |α| = |β| and α >lex β.

Example:

1. x2y >grlex xy, since (2, 1) >grlex (1, 1), because |α| = 3 > 2 = |β|. So theseparticular monomials are ordered the same as they were under lex ordering.

2. xy2z >grlex x2, since (1, 2, 1) >grlex (2, 0, 0), because |α| = 4 > 2 = |β|. Thisis an example of monomials that are ordered differently under graded lex orderthan lex order.

3. x3yz5 >grlex x2y3z4, since (3, 1, 5) >grlex (2, 3, 4). In this case, |α| = 9 = |β|,and so we had to revert to lex ordering, under which α >lex β, because α−β =(3, 1, 5)− (2, 3, 4) = (1,−2, 1).

Another, somewhat less intuitive, monomial ordering is Graded Reverse Lexico-graphic Order. It works the same as graded lex order, but ties are fixed by, in a sense,a reverse of the lex order. This order turns out to be computationally more efficientwhen using the algorithms that will be included in this paper.

9

Definition 12 Graded Reverse Lexicographic Order Let α, β ∈ ZZ n≥0. We say

that α >grevlex β if |α| =∑n

i=1 αi > |β| =∑n

i=1 βi, or |α| = |β|, and in α − β ∈ ZZ n,the right-most nonzero entry is negative.

To see that this ordering breaks ties differently than Graded Lex, we will revisitthe third pair of monomials in the previous example.

Example:The pair x3yz5 and x2y3z4 are ordered differently under Graded Reverse Lex thanunder Graded Lex or Lex. Under Graded Reverse Lex,

x2y3z4 >grevlex x3yz5, since (2, 3, 4) >grevlex (3, 1, 5).

In this case, |α| = 9 = |β|, and so we had to break the tie by taking the differenceα − β = (2, 3, 4) − (3, 1, 5) = (−1, 2,−1), and since the right-most nonzero entry isnegative, α >grevlex β.

Now that we have a grasp of monomial orderings, we can return to how these relateto polynomials. We will use the following terms for polynomials under a monomialorder:

Definition 13 Let f =∑

α aαxα be a nonzero polynomial in k[x1, . . . , xn] and let >be a monomial order.

1. The multidegree of f is

multideg(f) = max(α ∈ ZZ n≥0 : aα 6= 0)

(the maximum is taken with respect to >).

2. The leading coefficient of f is

LC(f) = amultideg(f) ∈ k.

3. The leading monomial of f is

LM(f) = xmultideg(f)

(with coefficient 1).

4. The leading term of f is

LT (f) = LC(f) · LM(f).

(CLO 57)

10

The following is an illustration of these terms, under Lex Order.

Example:Let f = 3x2y + 4xy ∈ k[x, y] under lex ordering with x > y. Then LC(f) = 3,LM(f) = x2y, and LT (f) = 3x2y.

Lemma 1 Let f, g ∈ k[x1, . . . , xn] be nonzero polynomials. Then:

1. multideg(f · g) = multideg(f) + multideg(g).

2. If f + g 6= 0, then multideg(f + g) ≤ max(multideg(f), multideg(g)). If, inaddition, multideg(f) 6= multideg(g), then equality occurs. (CLO 58)

Proof:Let f =

∑α aαxα and g =

∑β bβxβ for α, β ∈ ZZ n

≥0.1. We first need to show that multideg(f · g) = multideg(f) + multideg(g).

f · f =∑α

aαxα∑β

bβxβ

=∑α

∑β

aαbβxαxβ

=∑α

∑β

aαbβxα+β.

Then

multideg(f · g) = max(α + β ∈ ZZ n≥0 : aαbβ 6= 0)

= max(α ∈ ZZ n≥0 : aα 6= 0) + max(β ∈ ZZ n

≥0 : aβ 6= 0)

= multideg(f) + multideg(g).

2. Assume that f + g 6= 0 and that multideg(f) = multideg(g). So LM(f) =LM(g). We need to show that multideg(f + g) ≤ max(multideg(f), multideg(g). IfLC(f)+LC(g) = 0, then the leading terms of f and g cancel and multideg(f + g) <max(multideg(f), multideg(g). If LC(f) + LC(g) 6= 0, then LM(f + g) = LM(f) =LM(g).

Assume that f + g 6= 0 and that multideg(f) 6= multideg(g). Assume, WLOG,that multideg(f) > multideg(g). Then LM(f) > LM(g) and so LM(f + g) =LM(f). Thus multideg(f + g) = multideg(f) = max(multideg(f), multideg(g)).

We are now ready to study the properties of monomial ideals, and we will seeformally why divisibility is so important for finding an element of an ideal.

Definition 14 An ideal I ⊂ k[x1, . . . , xn] is a monomial ideal if there is a subsetA ⊂ ZZ n

≥0 (possibly infinite) such that I consists of all polynomials which are finitesums of the form

∑α∈A hαxα, where hα ∈ k[x1, . . . , xn] and α ∈ A. In this case, we

write I = 〈xα : α ∈ A〉. (CLO 67)

11

It follows directly from the definition of an ideal that a monomial ideal is indeed anideal.

Example:An example of a monomial ideal is I = 〈x6y7, x2y5, x5y3〉 ⊂ k[x, y] with correspondingset A = {(6, 7), (2, 5), (5, 3)}.

So we can see that a monomial ideal I is generated by a set of monomials andthe elements of this ideal are polynomials. The following two lemmas describe howwe can characterize an element of a monomial ideal, which will later help us todetermine whether a given element of the polynomial ring is in an ideal. The firstlemma concerns monomials in the ideal.

Lemma 2 Let I = 〈xα : α ∈ A〉 be a monomial ideal. Then a monomial xβ lies in Iif and only if xβ is divisible by xα for some α ∈ A.

Proof:

(⇐=) Assume xβ is a multiple of xα for some α ∈ A. From this it follows thatxβ ∈ I by the definition of an ideal.

(=⇒) Assume xβ ∈ I. Then by definition, xβ =∑s

i=1 hixα(i) where hi ∈ k[x1, . . . , xn]

and α(i) ∈ A. We can write hi as a linear combination of monomials, hi = apixp(i) +

ap−1ixp−1(i) + · · ·+ a0i

. Then hi · xα(i) = apixp(i)+α(i) + ap−1x

p−1(i)+α(i) + · · ·+ a0ixα(i).

So we can see that every term of∑s

i=1 hixα(i) must be divisible by some xα(i). Since

the sum of these terms is the monomial xβ, then each term must be divisible by thesame xα(i). Thus xβ must also be divisible by some xα(i). (CLO 67-69)

The next lemma describes how we can characterize a polynomial that is in a givenmonomial ideal.

Lemma 3 Let I be a monomial ideal, and let f ∈ k[x1, . . . , xn]. Then the followingare equivalent:

1. f ∈ I.

2. Every term of f lies in I.

3. f is a k-linear combination of the monomials in I.

(CLO 68)

Proof:We will show that: 3 ⇒ 2 ⇒ 1 ⇒ 3, and thus prove that the three are equivalent.

12

(3 ⇒ 2) Assume f is a k -linear combination of the monomials in I. Then everyterm of f is a multiple of an element of I. So by definition, each term of f is in I.Since I is closed under addition, it follows that the sum of these terms, f , is in I.

(2 ⇒ 1) Assume every term of f lies in I. Then f ∈ I, since I is closed underaddition.

(1 ⇒ 3) Let f ∈ I and assume I = 〈xα : α ∈ A〉. Then by definition, f =∑si=1 hix

α(i) where hi ∈ k[x1, . . . , xn] and α(i) ∈ A. Let hi = a0ixm(i) + a1i

xm−1(i) +· · ·+ am(i). Then hi · xα(i) = a0i

xm(i) · xα(i) + a1ixm−1(i) · xα(i) + · · ·+ am(i)x

α(i). So theterms of f are linear combinations of monomials xα(i) in I.

Using the previous two lemmas, we can now prove that any monomial ideal hasa finite basis, which will be the first step in showing that every ideal has a finitegenerating set.

Theorem 3 Dickson’s Lemma A monomial ideal I = 〈xα : α ∈ A〉 ⊂ k[x1, . . . , xn]can be written down in the form I = 〈xα(1), . . . , xα(s)〉, where α(1), . . . , α(s) ∈ A. Inparticular, I has a finite basis.

Proof:We will prove this Lemma by induction on the number of variables n.

Base Case: Our base case will be n = 1. Then I is generated by the monomial xα1

where α ∈ A ⊂ ZZ≥0. We know that A ⊂ ZZ≥0 has a smallest element β by definitionof monomial ordering. So β < α for all α ∈ A, and so xβ

1 divides all other generatorsxα

1 . Therefore I = 〈xβ1 〉 by Lemma 2.

Inductive Hypothesis: Assume that n > 1 and that the theorem is true for n− 1.That is, a monomial ideal I = 〈xα : α ∈ A〉 ⊂ k[x1, . . . , xn−1] can be written downin the form I = 〈xα(1), . . . , xα(s)〉, where α(1), . . . , α(s) ∈ A. Thus if we add avariable y, the monomials in k[x1, . . . , xn−1, y] can be written as xαym, where α =(α1, . . . , αn−1) ∈ ZZ n−1

≥0 and m ∈ ZZ≥0.Let I ⊂ k[x1, . . . , xn−1, y] be a monomial ideal, and let J be the monomial ideal

in k[x1 . . . , xn−1] generated by the monomials xα for which xαym ∈ I for some m ≥ 0.We know that J is in fact an ideal, because the α’s for which xαym ∈ I form asubset A ⊂ ZZ n

≥0 for which the definition of monomial ideal holds for J . Then byour inductive hypothesis, finitely many of the xα’s generate J , and so we can writeJ = 〈xα(1), . . . , xα(s)〉. Then, by the definition of J, we know that for each i between1 and s, xα(i)ymi ∈ I for some mi ≥ 0. Since there is only a finite number of xα(i)’s,there is only a finite number of ymi ’s. Let m be the largest of the mi. Then, foreach l between 0 and m− 1, consider the ideal Jl ⊂ k[x1, . . . , xn−1] generated by themonomials xβ such that xβyl ∈ I. That is, Jl is the part of I generated by monomialscontaining y exactly to the lth power. Again, by our inductive hypothesis, Jl has afinite generating set of monomials, and so we can write, Jl = 〈xαl(1), . . . , xαl(sl)〉.

Claim: I is generated by the following set of monomials:

13

from J : xα(1)ym, . . . , xα(s)ym

from J0 : xα0(1), . . . , xα0(s0)

from J1 : xα1(1)y, . . . , xα1(s1)y

from J2 : xα2(1)y2, . . . , xα2(s2)y2

···

from Jm−1 : xαm−1(1)ym−1, . . . , xαm−1(sm−1)ym−1.

We will prove this claim by showing that the above monomials generate an idealthat has the same elements as I. Let xβyp ∈ I. Either p ≥ m or p ≤ m − 1. Wewill start with the case where p ≥ m. We know that J contains all xα(i) such thatxα(i)ym ∈ I. If we divide xβyp by yp−m ∈ I, we get xβym, and so we can see thatxβ ∈ J . Thus xβyp is divisible by some xα(i)ym by the construction of J . Therefore,in the case that p ≥ m, the monomials xα(i)ym generate elements xβyp ∈ I.

Our second case is that p ≤ m − 1. By a similar argument, xαyp is divisible bysome xβp(j)yp by the construction of Jp. Then it follows from Lemma 2 that the listedmonomials generate an ideal having the same monomials as I. It follows directly frompart III of Lemma 3 that a monomial ideal is uniquely determined by the monomialsit contains. Therefore the ideal generated by the above monomials is the same as I.Thus we have a set of generators for I.

Now we need to show that a finite set of generators can be chosen from this set ofgenerators of the ideal I. To simplify things, we will change our notation by lettingy = xn so that we are again writing the variables x1, . . . , xn. So, as we have shown,our monomial ideal is I = 〈xα : α ∈ A〉 ⊂ k[x1, . . . , xn]. We need to show that Iis generated by finitely many of the xα’s, where α ∈ A. The above list gave us afinite set of generators for I, since there are si monomials on each line for some finitesi ∈ ZZ≥0, and there are m lines, for some finite m ∈ ZZ≥0. We can write this set ofgenerators as 〈xβ(1), . . . , xβ(t)〉 for some monomials xβ(i) ∈ I. Since I = 〈xα : α ∈ A〉,each xβ(i) is divisible by xα(i) by Lemma 2. Therefore, by replacing each xβ(i) by xα(i),we get that I = 〈xα(1), . . . , xα(t)〉. (CLO 69-70).

The following example will illustrate how the above proof constructs a finite gen-erating set for an ideal.

Example:We will use the steps of the above proof to find a finite generating set for the idealI = 〈x7y3, x4y6, x3y7〉 ∈ k[x, y] (Note that we already have a finite generating set forthe ideal, but it is still a useful example to see how the proof would construct one).

14

Let J be defined, as above, as a monomial ideal generated by the monomials xα

for which xαymi ∈ I for some mi > 0. We can see that the largest of these mi’s, forour example, is m = 7. Thus we let J = 〈x3〉 ⊂ k[x]; that is, J is generated by themonomials xα for which xαy7 ∈ I. We then construct other monomial ideals, the Jk’sfrom our proof, for each 0 ≤ k ≤ 6 = m − 1. And so we get that J0 = J1 = J2 ={0} ⊂ k[x], since no monomials in the generating set, and thus no monomials in I,have a y factor of degree 0, 1, or 2. Next we get, J3 = J4 = J5 = 〈x7〉 ⊂ k[x], sincex7y3 is in the generating set of I, and all monomials in I that can be written xαy3,xαy4, or xαy5, for any α ≥ 7, will be generated by the element x7y3 of the generatingset. Finally, we get J6 = 〈x4〉 ⊂ k[x], since x4y6 is in the generating set of I andgenerates monomials in I that can be written xαy6. As the proof dictates, we thenpick a finite generating set from each of these Jk ideals, which in our case will be thegenerating monomial for each one. This gives us

from J : x3y7

from J3 : x7y3

from J4 : x7y4

from J5 : x7y5

from J6 : x4y6.

This gives us the finite generating set I = 〈x3y7, x7y3, x7y4, x7y5, x4y6〉.

We can also look at an illustration of this example that shows how the theoremworks. If for each xmyn, we plot (m, n) on the first quadrant in the plane:

So we can visualize the ideal I = 〈x7y3, x4y6, x3y7〉 by seeing that the elementsof the generating set correspond to the corner points on the graph, and each plottedpoint represents a monomial in I. The ideals Jk are horizontal slices of this area.That is, J from our example includes all points in the area along the line n = 7 andup, and we can see that these elements can be generated by x3y7, because this is theleftmost corner of this area. J6 is the area including the points to the right of and onthe line m = 4 and above and on the line n = 6. The others are easily seen as well.

15

2.2 Division Algorithm for k[x1, . . . , xn]

In the previous section, we saw how important division is for determining whether apolynomial is in a given monomial ideal. We will see in the next section that divisionis used in a similar way to determine whether a polynomial is in any given ideal. Inorder to do division with monomials in k[x1, . . . , xn], we needed to define a monomialordering. In the same way, we need to now define a division algorithm for k[x1, . . . , xn]so that we can divide polynomials. To see how we can do this, we will first look atthe one variable case in which we have the following algorithm for division.

Theorem 4 Division Algorithm for k[x] Let k be a field and let g be a nonzeropolynomial in k[x]. Then every f ∈ k[x] can be written as

f = q · g + r,

where q, r ∈ k[x], and either r = 0 or deg(r) < deg(g). Furthermore, q and r areunique, and the following is an algorithm for determining q and r:

Let g, f ∈ I. Start by letting q0 = 0 and r0 = f .Then, for as long as rn 6= 0 and LT (g) divides LT (rn), where n ∈ IN , repeat thefollowing:

Let qn+1 = qn + LT (rn)/LT (gn) and let rn+1 = rn − (LT (rn)/LT (g)) · g.If rn = 0 or LT (g) does not divide LT (rn), then let r = rn and q = qn,and the algorithm is complete.

(Adams 26)

We can check that the above values assigned in the algorithm work, by observingthat : f = q · g + r = (q + LT (r)/LT (g)) · g + (R − (LT (r)/LT (g)) · g). We will usethe following example to demonstrate how this algorithm works:

Example:

Let f = 3x4 + 4x2 + 2x + 1.Let g = x2 + 3x.Then r0 = 3x4 + 4x2 + 2x + 1 and q0 = 0.

• Since LT (g) = x2 divides LT (r0) = 3x4, we let

q1 = q0 + LT (r0)/LT (g) = 3x4/x2 = 3x2

and

r1 = r0 − (LT (r0)/LT (g)) · g = 3x4 + 4x2 + 2x + 1− 3x2(x2 + 3x)

= −9x3 + 4x + 2x + 1.

• Then LT (g) = x2 divides LT (r1) = −9x3, and so we let

16

q2 = q1 + LT (r1)/LT (g) = 3x2 + (−9x3/x2) = 3x2 − 9x

and

r2 = r1 − (LT (r1)/LT (g)) · g = −9x3 + 4x2 + 2x + 1− (−9x3/x2)(x2 + 3x)

= 31x2 + 2x + 1.

• Then LT (g) = x2 divides LT (r2) = 31x2, and so we let

q3 = q2 + LT (r2)/LT (g) = 3x2 − 9x + (31x2/x2) = 3x2 − 9x + 31

and

r3 = r2 − (LT (r2)/LT (g)) · g = 31x2 + 2x + 1− (31x2/x2)(x2 + 3x)

= −91x + 1.

• Then LT (g) = x2 does not divide LT (r3) = −91x. And so we get

3x4 + 4x2 + 2x + 1 = q · (x2 + 3x) + r = (3x2 − 9x + 31)(x2 + 3x) + (−91x + 1).

It is easy to see how these steps correspond to how division is usually done:

q3︷︸︸︷q2︷︸︸︷

q1︷︸︸︷3x2 −9x +31

x2 + 3x√

3x4 + 4x2 + 2x + 1

−(3x4 + 9x3)

−9x3 + 4x2 + 2x + 1 }r1

−(−9x3 − 27x2)

31x2 + 2x + 1 }r2

−(31x2 + 93x)

−91x + 1 }r3

Thus we can see that the algorithm works, and that it will always terminate in afinite number of steps, since this is true for “normal” division.

We will now construct a division algorithm for k[x1, . . . , xn].

Theorem 5 Division Algorithm in k[x1, . . . , xn] We start by fixing a monomialorder > on ZZ n

≥0, and let F = (f1, . . . , fs) be an ordered s-tuple of polynomials ink[x1, . . . , xn]. Then every f ∈ k[x1, . . . , xn] can be written as

f = a1f1 + · · · asfs + r,

17

where ai, r ∈ k[x1, . . . , xn], and either r = 0 or r is a linear combination, with coeffi-cients in k, of monomials, none of which is divisible by any of LT (f1), . . . , LT (fs).We will call r a remainder of f upon division by F.Furthermore, if aifi 6= 0, the we have

multideg(f) ≥ multideg(aifi).

(CLO 62)

To prove the above statement, i.e. the existence of a1, . . . , as and r, we will givean algorithm for their construction that operates for any given F = (f1, . . . , fs) ⊂k[x1, . . . , xn] and f ∈ k[x1, . . . , xn]. Note that, unlike the one variable case in whichthe quotient and remainder are unique, this division algorithm does not guaranteethe uniqueness of a1, . . . , as and r.

Proof:Fix a monomial order > on ZZ n

≥0 and let F = (f1, . . . , fs) be an ordered s-tuple ofpolynomials in k[x1, . . . , xn]. Pick f ∈ k[x1, . . . , xn]. Unlike the one-variable case,when we do division in multi-variables, terms are added to the remainder throughoutthe division process rather than just at the end. So there are two steps to the process.At the start of the algorithm, we set h = f and r = 0, and we will keep addingremainder terms to r throughout the algorithm. The first step is that whenever someLT (fi) divides LT (h), then we divide the monomial h by fi in the way that occursin the algorithm for the one-variable case. That is, we begin with a quotient qi = 0.Then we let qi = qi + LT (h)

LT (fi)and h = h− LT (h)

LT (fi)· fi. Once LT (h) is no longer divisible

by any LT (fi), this step terminates, as in the one-variable case. We then move on tothe second step: Whenever LT (h) is not divisible by any LT (fi), we add LT (h) tothe remainder column. That is, we let r = r + LT (h) and h = h−LT (h). We returnto the first step once we have moved enough terms of h to the remainder column thatLT (h) is divisible by some LT (fi). We stop this process once we have p = 0.

We will now prove that this algorithm works by showing that

f = q1f1 + · · · qsfs + h + r (2.1)

at every stage of the process. When h = f and r = 0, it is clearly true that f = h+r.Assume that 2.1 holds for some step k of the algorithm. Then the next step is eithera step in which division occurs or a step in which we add a term to the remainder.If division occurs, then LT (h) is divided by some LT (fi). Then according to ouralgorithm

qifi + h = (qi + LT (h)/LT (fi)) · fi + (h− (LT (h)/LT (fi)) · fi)

which does not change the sum qifi + h. This division step keeps all other terms thesame and so 2.1 holds in this case. If, instead, we add a term to the remainder in

18

this step, then according to our algorithm,

h + r = (h− LT (h)) + (r + LT (h))

which does not change the sum h + r. Thus 2.1 holds in this case as well. Thealgorithm stops when h = 0. At this point,

f = q1f1 + · · ·+ asfs + r

in which, by the steps of the algorithm, no terms of r are divisible by any LT (fi).Thus, by induction, 2.1 holds in all cases.

Now we need to show that the algorithm terminates. Note that in each step ofthe algorithm we are always subtracting LT (h) from h. Thus, after every step, themultideg(h) decreases. If the algorithm never terminated, then we would have aninfinite decreasing sequence of multidegrees from each step of the algorithm. But thiswould be a contradiction, since our monomial order > must be a well-ordering by thedefinition of a monomial order. Thus the set of multidegrees taken for h at each stepof the algorithm must have a least element, and since each step gives us a smallermultidegree, h = 0 must happen eventually. Thus we will be able to divide, usingthis algorithm, in a finite number of steps.

Finally, we need to show that if qifi 6= 0, then multideg(f) ≥ multideg(qifi). ByLemma 1, we know that multideg(qifi) = multideg(qi) + multideg(fi). Also, since

every term of qi is equal to LT (h)LT (f)

for some value of h, we know that multideg(qi) =

multideg(h)−multideg(f) for some value of h. So we have multideg(qifi) = multideg(h)−multideg(f) + multideg(fi). Since h = f at the start of the algorithm and we knowthat multideg(h) decreases after every step, we know that multideg(h) ≤ multideg(f)for all values of h. Similarly, we know from our algorithm that since qifi 6= 0,multideg(fi) ≤ multideg(f). Thus

multideg(qifi) = multideg(h)−multideg(f) + multideg(fi)

≤ 2multideg(f)−multideg(f)

= multideg(f).

QED (Adams 28, 31, CLO 62).

The following example will demonstrate this algorithm.

Example:We will divide f = x3y2 + xy + x + 1 by f1 = x3 + 1 and f2 = y2 + 1 using lex orderwith x > y. Then according to our algorithm, we get the following (since terms areadded to the remainder throughout the algorithm’s process, we write the remaindercolumn on the right to keep track of the terms that go to the remainder):

q1 : y2

19

q2 :

x3 + 1√

x3y2 + xy + x + 1 r

y2 + 1

−(x3y2 + y2)

xy + x− y2 + 1 −→ xy + x

−y2 + 1

After dividing f by the leading term of f1, we get the polynomial xy + x− y2 + 1with no terms that are divisible by the leading term of f1. Furthermore, the first twoterms, xy and x are not divisible by the leading term of f2, and so these go to theremainder column. We are left with −y2 + 1 and we divide this by the leading termof f2.

q1 : y2

q2 : −1

x3 + 1√

x3y2 + xy + x + 1 r

y2 + 1

−(x3y2 + y2)

xy + x− y2 + 1 −→ xy + x

−y2 + 1

−(−y2 − 1)

2 −→ xy + x + 2

After dividing by the leading term of f2, we get the 2, and so this term is sent tothe remainder column and we have a total remainder of xy + x + 2. Thus we obtain

x3y2 + xy + x + 1 = y2 · (x3 + 1) + (−1) · (y2 + 1) + xy + x + 2.

This generalization of the division algorithm fails to be as efficient as the onevariable division algorithm. The remainders are not guaranteed to be unique; theydepend on the ordering of the s-tuple of polynomials F = (f1, . . . , fs). The followingis a simple example demonstrating that remainders are not uniquely determined. Inthe example, F has two elements, and the remainder of the polynomial we pick, upondivision by F , is shown to be different depending on the order of the two elements ofF .

Example:Let f1 = x2y − 2x, f2 = y3 + 4 ∈ k[x, y]. We will use lex order with x > y.

20

Let f = x2y3 − 2xy2 ∈ k[x, y].Our first case will be F = (f1, f2).Then

y2

x2y − 2x√

x2y3 − 2xy2

y3 + 4

−(x2y3 − 2xy2)

0

So x2y3 − 2xy2 = y2 · (x2y − 2x) + 0 · (y3 + 4) + 0.

In our second case, we will let F = (f2, f1).Then

x2

y3 + 4√

x2y3 − 2xy2

x2y − 2x

−(x2y3 + 4x2)

−2xy2 − 4x2

So x2y3 − 2xy2 = x2 · (y3 + 4) + 0 · (x2y − 2x)− 2xy2 − 4x2.So we can see that the two cases produced two different remainders, 0 and−2xy2−4x2,respectively, due to a switch in the order of the polynomials in F .

In the first case, the remainder of 0 told us that f was in the ideal 〈f1, f2〉 byLemma 2. But if we had only looked at the second case, we would not have knownthat f was in 〈f1, f2〉. Thus, for an ideal I = 〈f1, . . . , fn〉, we know that a polynomialf ∈ k[x1, . . . , xn] is in the ideal if the remainder of f upon division by I is zero, butthis is not a necessary condition for ideal membership. Our goal is to establish sucha necessary condition.

2.3 Groebner Bases

Now that we have proven that a monomial ideal has a finite generating set of mono-mials, we will use this fact to show that every ideal has a finite generating set. To dothis, we will introduce a monomial ideal that is generated by the leading terms of eachpolynomial in the ideal. Once we have a monomial ordering, each f ∈ k[x1, . . . , xn]has a unique leading term denoted LT (f) and these leading terms generate a mono-mial ideal.

Definition 15 Let I ⊂ k[x1, . . . , xn] be an ideal other than 0.

21

1. We denote by LT(I) the set of leading terms of elements of I. Thus,

LT (I) = {cxα : there exists f ∈ I with LT (f) = cxα}.

2. We denote by 〈LT (I)〉 the ideal generated by the elements of LT (I).

Note that if we are given a finite generating set for I, say I = 〈f1, . . . , fs〉, then〈LT (f1), . . . , LT (fs)〉 and 〈LT (I)〉 may be different ideals. The following exampleillustrates this.Example:Consider I = 〈x2 + 1, xy〉 with < as a lex ordering with x > y.Then LT (x2 + 1) = x2 and LT (xy) = xy. So, 〈LT (x2 + 1), LT (xy)〉 = 〈x2, xy〉.Since, y(x2 + 1)− x(xy) = y, we know y ∈ I and so LT (y) ∈ 〈LT (I)〉.However, LT (y) = y 6∈ 〈x2, xy〉.Therefore 〈LT (I)〉 6= 〈LT (x2 + 1), LT (xy)〉.

Even though 〈LT (f1), . . . , LT (fs)〉 and 〈LT (I)〉 may be different ideals, the nexttheorem will show that there is a set of polynomials in the ideal I for which they arethe same.

Proposition 2 Let I ⊂ k[x1, . . . , xn] be an ideal.

1. 〈LT (I)〉 is a monomial ideal.

2. There are g1, . . . , gt ∈ I such that 〈LT (I)〉 = 〈LT (g1), . . . , LT (gt)〉.

Proof:Let I ⊂ k[x1, . . . , xn] be an ideal. We will show that the above two conditions hold.

1. We know that the leading monomials LM(g) of elements g ∈ I \ {0} generatethe monomial ideal 〈LM(g) : g ∈ I \ {0}〉. Let xα(i) ∈ 〈LM(g) : g ∈ I \ {0}〉 bethe leading monomial of gi and aαi

∈ k be the leading coefficient of gi. Thenaαi

·xα(i) ∈ 〈LM(g) : g ∈ I \{0}〉. Now start with aαjgj ∈ 〈LT (g) : g ∈ I \{0}〉.

Then a−1αj∈ k, since k is a field. So a−1

αj· aαj

gj = gj ∈ 〈LM(g) : g ∈ I \ {0}〉.Thus 〈LM(g) : g ∈ I \ {0}〉 = 〈LT (g) : g ∈ I \ {0}〉 = 〈LT (I)〉. Therefore〈LT (I)〉 is a monomial ideal.

2. By Dickson’s Lemma, we know that the monomial ideal〈LT (I)〉 = 〈LT (g1), . . . , LT (gt)〉 for finitely many g1, . . . , gs ∈ I.(CLO 73-74).

We will now complete this series of theorems by proving that every polynomialideal has a finite generating set. The set of monomials g1, . . . , gs in the above theoremsuch that 〈LT (I)〉 = 〈LT (g1), . . . , LT (gt)〉 are in fact a finite generating set for theideal I. We will fix a monomial ordering and use the division algorithm defined forpolynomials in k[x1, . . . , xn].

22

Theorem 6 (Hilbert Basis Theorem) Every ideal I ⊂ k[x1, . . . , xn] has a finitegenerating set. That is, I = 〈g1, . . . , gt〉 for some g1, . . . , gt ∈ I.

Proof:If I = 0, then our generating set is 0, and we are done. If I contains some nonzeropolynomial, then by Proposition 1, there are g1, . . . , gt ∈ I such that 〈LT (I)〉 =〈LT (g1), . . . , LT (gt)〉.

Show: I = 〈g1, . . . , gt〉.(⊇) We know that 〈g1, . . . , gt〉 ⊂ I, since each gi ∈ I.(⊆) Let f ∈ I be any polynomial. If we divide f by 〈g1, . . . , gt〉 (using the division

algorithm), then we get an expression of the form

f = a1g1 + · · ·+ atgt + r

where every term in r is not divisible by any LT (gi), . . . , LT (gt). We need to showthat r = 0. Note that

r = f − a1g1 − · · · − atgt ∈ I.

So, since r ∈ I, then LT (r) ∈ 〈LT (I)〉 = 〈LT (g1), . . . , LT (gt)〉. Then by Lemma 2,LT (r) must be divisible by some LT (gi). But, unless r = 0, this contradicts what itmeans for r to be a remainder by the division algorithm. Thus,

f = a1g1 + · · ·+ atgt + 0 ∈ 〈g1, . . . , gt〉,

and so f ∈ I. Therefore I ⊂ 〈g1, . . . , gt〉. (CLO 1997 74).

As a consequence of the Hilbert Basis theorem, we have an important geometricresult. We can now describe the set of zeros of an ideal I as the set of zeros commonto finitely many polynomials. We introduced varieties as the sets of solutions to afinite set of polynomial equations, and now, even though every nonzero ideal I ⊂k[x1, . . . , xn] contains infinitely many polynomials, it makes sense to talk about theaffine variety defined by that ideal.

Definition 16 Let I ⊂ k[x1, . . . , xn] be an ideal. We will denote by V (I) the set

V (I) = {(a1, . . . , an) ∈ kn : f(a1, . . . , an) = 0 for all f ∈ I}.

The following is a proof that this set V (I) is indeed an affine variety.

Proposition 3 V (I) is an affine variety. In particular, if I = 〈f1, . . . , fs〉, thenV (I) = V (f1, . . . , fs).

Proof:By the Hilbert Basis Theorem, I = 〈f1, . . . , fs〉 for a finite set of polynomials f1, . . . , fs ∈I.Claim: V (I) = V (f1, . . . , fs).

23

(⊆) Let (a1, . . . , an) ∈ V (I). Let fi ∈ I be a generating polynomial of I. Then, sincef(a1, . . . , an) = 0 for all f ∈ I, fi(a1, . . . , an) = 0. So (a1, . . . , an) ∈ V (f1, . . . , fs).Thus V (I) ⊂ V (f1, . . . , fs).(⊇) Let (a1, . . . , an) ∈ V (f1, . . . , fs) and let f ∈ I. Since I = 〈f1, . . . , fs〉, we canwrite

f =s∑

i=1

hifi

for some hi ∈ k[x1, . . . , xn]. Thus

f(a1, . . . , an) =s∑

i=1

hi(a1, . . . , an)fi(a1, . . . , an)

=s∑

i=1

hi(a1, . . . , an) · 0 = 0.

So V (f1, . . . , fs) ⊂ V (I). Therefore V (f1, . . . , fs) = V (I). (CLO 77)

We chose the basis {g1, . . . , gt} in the Hilbert Basis theorem so that it had theproperty 〈LT (I)〉 = 〈LT (g1), . . . , LT (gt)〉. Yet we saw in our example above that thisproperty does not hold for all bases. Every ideal, however, has such a basis, and itis known as a Groebner basis. We will see in the next section that Groebner basesare very useful algebraic tools, and we will see how they can solve our question ofdetermining whether an element of k[x1, . . . , xn] is in a given ideal.

Definition 17 Fix a monomial order. A finite subset G = {g1, . . . , gt} of an ideal Iis said to be a Groebner basis if

〈LT (g1), . . . , LT (gt)〉 = 〈LT (I)〉.

Corollary 1 Fix a monomial order > on k[x1, . . . , xn], and let I ⊂ k[x1, . . . , xn] bean ideal. Then I has a Groebner basis. Furthermore, any Groebner basis of I is abasis of I.

Proof:Let I be a nonzero ideal. Then the set G = {g1, . . . , gt} constructed in the proof ofTheorem 6 is a Groebner basis by definition, and therefore every ideal has a Groebnerbasis. Then for a Groebner basis G, 〈LT (I)〉 = 〈LT (g1), . . . , LT (gt)〉 and so I =〈g1, . . . , gt〉 by the proof of Theorem 6. Therefore G is a basis for I.

2.4 Properties of Groebner Bases

Groebner bases yield some very useful algebraic results. In this paper, we have beendiscussing what is known as the Ideal Membership Problem, that is, finding a way to

24

determine whether a given polynomial is an element of an ideal. This can be donewith Groebner basis due to the important fact that the remainder of a polynomial ink[x1, . . . , xn] on division by a Groebner basis is unique. That is, when a polynomialis reduced by the elements of a Groebner basis of an ideal, the remainder is unique,no matter what the order of the elements of G when using the division algorithm. Ifwe let G = {g1, . . . , gt} be a Groebner basis, then the remainder of f on division byG will be denoted

r = fG.

Proposition 4 If G = {g1, . . . , gt} is a Groebner basis for I and f ∈ k[x1, . . . , xn],then f ∈ I if and only if the remainder of f on division by g1, . . . , gt is zero.(Cox 7)

The following is a partial proof that will be completed by the next result.

Proof:(⇐) If f ∈ I, then we know that f can be written as a combination of g1, . . . , gt. Sowe can see that there must be some way to divide f by G that gives us a remainderof zero, but on the other hand, as we have seen, if we divide any way that we pleasewe might not get a zero remainder...(⇒) If f ∈ I, then the remainder of f on division by g1, . . . , gt is zero, as we saw inthe proof of Theorem 6.

So the above proposition tells us when an element of a ring is an element of agiven ideal, by using the a Groebner basis for the ideal. But this is only useful whenwe know that the zero remainder upon division by a Groebner basis is unique. Sincewe have developed a division algorithm for k[x1, . . . , xn], we can now use this divisionalgorithm to prove the following crucial property about Groebner bases.

Proposition 5 If G = {g1, . . . , gt} is a Groebner basis for I and f ∈ k[x1, . . . , xn],then f can be written uniquely in the form

f = g + r

where g ∈ I and no term of r is divisible by any LT (gi).

Proof:Let G = {g1, . . . , gt} be a Groebner basis for an ideal I ⊂ k[x1, . . . , xn] and let f ∈k[x1, . . . , xn]. Then by the division algorithm, we get f = a1g1 + · · ·+ atgt + r, whereai ∈ k[x1, . . . , xn] and no term of r is divisible by any LT (gi). Thus a1g1+· · ·+atgt ∈ I,since G is a basis for I. This proves the existence of r satisfying the conditions of theproposition, and now it remains to show that r is unique.

Suppose f = g1 + r1 = g2 + r2, satisfying the conditions of Proposition 3. Thenr1 − r2 = g2 − g1 ∈ I. If r1 − r2 6= 0, then LT (r1 − r2) ∈ 〈LT (I)〉. Since, by

25

definition of a Groebner basis, 〈LT (I)〉 = 〈LT (g1), . . . , LT (gt)〉, then LT (r1 − r2) isdivisible by some LT (gi) by Lemma 2. But no term of r1 or r2 is divisible by anyLT (g1, . . . , gt). Thus it is not possible that LT (r1− r2) 6= 0. So r1 = r2, and we haveproved uniqueness of the remainder of a polynomial upon division by a Groebnerbasis. (CLO 79)

So what we have shown is that remainders of f upon division by G are unique,no matter what order we put the elements of G when we divide f by them. Thatis, if G = {g1, . . . , gn}, we can divide f by {g1, . . . , gn} or {gn, . . . , g1} or any otherorder you would like and still get the same remainder in the end. Therefore we havefound our necessary condition for ideal membership (to restate Proposition 2): for anideal I ⊂ k[x1, . . . , xn] and a Groebner basis G of I, a polynomial f ∈ k[x1, . . . , xn]is in I if and only if the remainder of f upon division by G is zero. This test forideal membership is useless unless we have a Groebner basis for our ideal. In thefollowing section we will establish a criteria for determining whether a given basis isa Groebner basis and thereby gain an algorithm for constructing a Groebner basisfrom any given basis for an ideal.

2.5 Computing a Groebner Basis

The definition of a Groebner basis alone does not give us a way of determining whethera given basis of an ideal I is a Groebner basis or not, unless we were to check everyleading term of the elements of I to see if LT (gi) for some gi ∈ G divides it, butwe have no real way of doing this. If a generating set {f1, . . . , fn} of an ideal Iis not a Groebner basis, then there must be a leading term in 〈LT (I)〉 that is notin 〈LT (f1), . . . , LT (fn)〉. For this to happen, there would have to be a polynomialcombination in which the leading terms of the elements of the basis cancel, leavingonly smaller terms, the greatest of which is less than any of the leading terms of thebasis. But this combination would be in the ideal, and so the smaller leading termwould be in 〈I〉, but not in 〈LT (f1), . . . , LT (fn)〉. The following definitions will helpus to address these cancellations:

Definition 18 Let f, g ∈ k[x1, . . . , xn] be nonzero polynomials.

1. If multideg(f) = α and multideg(g) = β, then let γ = (γ1, . . . , γn), whereγi = max(αi, βi) for each 1 ≤ i ≤ n. We call xγ the least common multipleof LM(f) and LM(g), written xγ = LCM(LM(f), LM(g)).

2. The S-polynomial of f and g is the combination

S(f, g) =xγ

LT (f)· f − xγ

LT (g)· g.

26

Example:Let f1 = x4y + x2y + x and f2 = x2y2 − y2 ∈ k[x, y]. So multideg(f1) = (4, 1)and multideg(f2) = (2, 2). Then using lex order with x > y, we get γ = (4, 2). Soxγ = x4y2 and xγ is the LCM of x4y and x2y2. Computing the S-polynomial of f1

and f2 then gives us

S(f1, f2) = (x4y2)/(x4y) · f1 − (x4y2)/(x2y2) · f2

= y · f1 − x2 · f2

= y(x4y − x2y + x)− x2(x2y2 − y2)

= x4y2 − x2y2 + xy − x4y2 + x2y2

= xy ∈ 〈f1, f2〉.

So we get the polynomial xy, which is a linear combination of f1 and f2 with elementsof k[x, y], and thus xy ∈ 〈f1, f2〉 by definition of an ideal. Note that since LT (xy) isdivisible neither by LT (f1) = x4y or LT (f2) = x2y2, then we know that, by definition,{f1, f2} is not a Groebner basis of 〈f1, f2〉.

So we can see that these S-polynomials are designed to produce cancellation ofleading terms. In fact, every cancellation of leading terms among polynomials of thesame multidegree results from an S-polynomial type of cancellation, as will be provenby the following lemma. Using this knowledge, we will be able to establish a criterionfor testing whether a generating set of an ideal is a Groebner basis.

Lemma 4 Suppose we have a sum f =∑s

i=1 cifi, where ci ∈ k and multideg(fi) =δ ∈ ZZ n

≥0 for all i. If multideg(f =∑s

i=1 cifi) < δ, then f =∑s

i=1 cifi is a linearcombination, with coefficients in k, of the S-polynomials S(fj, fl) for 1 ≤ j, l ≤ s.Furthermore, each S(fi, fl) has multidegree < δ.

Proof:Assume that we have a sum

∑i=1 cifi where ci ∈ k and multideg(fi) = δ ∈ ZZ n

≥0 forall i. Write fi = aix

δ+ lower terms, where ai = LC(fi). Then ciai is the leadingcoefficient of cifi.Since each cifi has multidegree δ and their sum has a strictly smaller multidegree,then it follows from the hypothesis that

∑si=1 ciai = 0, since all the leading terms

must cancel in order for the sum to have a lesser degree than δ. Thus each S(fi, fj)has multidegree < δ, because cancellation of the leading terms must occur. Also, bydefinition, S(fi, fj) = 1

aifi − 1

ajfj, since multideg(fi) = multideg(fj) = δ. Thus we

have

f = c1f1 + · · ·+ csfs

= c1a1(1

a1

f1) + · · ·+ csas(1

as

).

27

Then we can form the following telescoping sum:

f = c1a1(1

a1

f1 −1

a2

f2) + (c1a1 + c2a2)(1

a2

f2 −1

a3

f3) + · · ·

+(c1a1 + · · ·+ cs−1as−1)(1

as−1

fs−1 −1

as

fs) + (c1a1 + · · ·+ csas)1

as

fs

= c1a1S(f1, f2) + (c1a1 + c2a2)S(f2, f3) + (c1a1 + · · ·+ cs−1as−1)S(fs−1, fs) + · · ·

+(c1a1 + · · ·+ csas)1

as

fs,

since S(fi, fj) = 1ai

fi − 1aj

fj.

Then we can eliminate the last term:

f = c1a1S(f1, f2) + (c1a1 + c2a2)S(f2, f3) + · · ·+(c1a1 + · · ·+ cs−1as−1)S(fs−1, fs),

since c1a1 + · · · csas = 0.Therefore f is a linear combination, with coefficients in k, of the S-polynomialsS(fj, fl) for 1 ≤ j, l ≤ s. (Adams 41)

We can use S-polynomials in this way to tell if a set of generators is a Groebnerbasis, according to the following criterion. As we have already discussed, for G to be aGroebner basis of an ideal I, all polynomial combinations of the basis elements musthave leading terms that are generated by leading terms of the basis elements. Sincewe have proven that all cancellations of leading terms are created by S-polynomials,then it is sufficient to test these polynomials to see if they have leading terms thatare generated by elements of the basis, which is exactly what the following criteriondoes.

Theorem 7 Buchberger’s S-Pair Criterion A basis {g1, . . . , gt} ⊂ I is a Groeb-ner basis of I if and only if for all pairs i < j, we have

S(gi, gj)G

= 0.

Proof:(⇒) If G = {g1, . . . , gt} is a Groebner basis for I = 〈g1, . . . , gt〉, then S(gi, gj) ∈ I.

Then we know by Proposition 2 that S(gi, gj)G

= 0 for all i 6= j. This completes theproof of this direction.

(⇐) Assume S(gi, gj)G

= 0 for all i 6= j.

Let f ∈ I be a nonzero polynomial. By definition of Groebner basis, we need to showthe following:

28

Show: LT (f) ∈ 〈LT (g1), . . . , LT (gt)〉.

Let

f =t∑

i=1

higi. (2.2)

Define δ = max1≤i≤t{multideg(higi)}. There are possibly quite a few ways that f canbe written as 2.2, that is, there are quite a few ways that f can be written as a linearcombination of the gi’s. For each such expression, we get a possibly different δ. Sincemonomial order is a well-ordering (for every nonempty set there is a minimal elementon <), we can select an expression 2.2 for f such that δ is minimal. So we will let 2.2be the linear combination of gi’s so that δ = max1≤i≤t{multideg(higi)} is minimal.

Now either multideg(f) = δ or multideg(f) < δ. If multideg(f) = δ, then LT (f)is divisible by LT (gi). Thus LT (f) ∈ 〈LT (g1), . . . , LT (gt)〉, and we are done.

Now we will assume that multideg(f) < δ, and using the fact that δ is minimal, wewill come up with a contradiction. That is, we will find another

∑ti=1 higi expression

for f with a smaller δ, and this will be a contradiction.Let m(i) = multideg(higi).Then, by isolating the terms with multidegree δ, we get

f =∑

m(i)=δ

higi +∑

m(i)<δ

higi

=∑

m(i)=δ

LT (hi)gi (2.3)

+∑

m(i)=δ

(hi − LT (hi))gi +∑

m(i)<δ

higi︸︷︷︸m(i)<δ

(2.4)

We can see that the terms on the right all have multidegree < δ, since we splitthe leading terms of the hi’s away from the rest of the equation. Then, since we haveassumed that multideg(f) < δ, we know that

∑m(i) LT (hi)gi must have multidegree

less than δ.Let LT (hi) = cix

α(i).Then

∑m(i)=δ cix

α(i)gi has the form of the equation of Lemma 4, since m(i) = δfor each term, and the sum has multidegree less than δ. Thus, by Lemma 4, the∑

m(i)=δ cixα(i)gi is a linear combination of the S-polynomials S(xα(j)gj, x

α(l)gl).By definition of S-polynomial,

S(xα(i)gj, xα(l)gl) =

xδ

xα(j)LT (gj)xα(j)gj −

xδ

xα(l)LT (gl)xα(l)gl

=xδ

LT (gj)gj −

xδ

LT (gl)gl.

29

To simplify, let xγjl = LCM(LM(gj), LM(gl)).So

xδ

LT (gj)gj −

xδ

LT (gl)gl =

xδ−γjl · xγjl

LT (gj)gj −

xδ−γjl · xγjl

LT (gl)gl

= xδ−γjlS(gj, gl).

Therefore we have the following equation, where there are constants cjl ∈ k such that∑m(i)=δ

LT (hi)gi =∑j,l

cjlxδ−γjlS(gj, gl). (2.5)

By our assumption, S(gj, gl)G

= 0. Thus, by the division algorithm, we can writeS(gj, gl) as a linear combination of elements of G:

S(gj, gl) =t∑

i=1

aijlgi,

where aijl ∈ k[x1, . . . , xn].Also by the division algorithm,

multideg(aijlgi) ≤ multideg(S(gj, gl))

for all i, j, l.So, we can write:

xδ−γjlS(gj, gl) =t∑

i=1

bijlgi, (2.6)

where bijl = xδ−γjlaijl.Then, by our second conclusion from the division algorithm, we know that

multideg(bijl, gi) ≤ multideg(xδ−γjlS(gj, gl))

andmultideg(xδ−γjlS(gj, gl)) < δ (2.7)

by Lemma 5, since each S(gi, gl) has multidegree < δ.If we substitute 2.6 into 2.5, we get the equation:∑

m(i)=δ

LT (hi)gi =∑j,l

cjlxδ−γjlS(gj, gl) =

∑j,l

cjl(∑

i

bijlgi) =∑

i

h̃igi

where multideg(h̃igi) < δ by 2.7.Then if we substitute

∑m(i)=δ LT (hi)gi =

∑i h̃igi back into equation 2.4, we get that

all terms in that sum have multidegree < δ. Therefore we have obtained an expressionfor f that is a polynomial combination of the gi’s where all terms have multidegree

30

< δ. Thus δ is not minimal, and we we have a contradiction. So multideg(f) = δ, andas we have already shown in our first case, this gives us LT (f) ∈ 〈LT (g1), . . . , LT (gt)〉and we are done. (CLO 82-84 and Adams 41-42)

We can see that this Criterion not only detects Groebner bases but also suggestsa way to construct a Groebner basis from a basis F that does not fit the Criterion:

add the nonzero remainder S(fi, fj)F

to the set F and test it again (Cox 8). Thefollowing example will show how this method works.

Example:Let F = {f1, f2} = {x4y − x2y, x2y2 − y2}. We already know, from the previous

example, that S(f1, f2)F

= xy = f3. So set F1 = {f1, f2, f3} = {x4y − x2y, x2y2 −y2, xy}. Then compute:

•S(f1, f2) = xy

= 0 · f1 + 0 · f2 + 1 · f3 + 0.

Thus S(f1, f2)F1

= 0.

•S(f1, f3) =x4y

x4y· f1 −

x4y

xy· f3

= f1 − x3 · f3

= x4y − x2y − x4y

= −x2y.

Since −x2y is not divisible by LT (f1) or LT (f2), we divide −x2y by f3:

−x

xy√−x2y

−(−x2y)

0

Thus S(f1, f3)F1

= 0.

•S(f2, f3) =x2y2

x2y2· f2 −

x2y2

xy· f3

= 1 · (x2y2 − y2)− (xy) · (xy)

= x2y2 − y2 − x2y2

= −y2.

31

Since −y2 is not divisible by LT (f1), LT (f2), or LT (f3), S(f2, f3)F1

= −y2 = f4.Adding these two nonzero remainders to F1, gives

F2 = {f1, f2, f3, f4} = {x4y − x2y + x, x2y2 − y2, xy,−y2}.

This time S(fi, fj)F2

= 0 ∀ i 6= j.So the Groebner basis of 〈x3 − 2xy, x2y − x− 2y2〉 for lex order with x > y is

F2 = {f1, f2, f3, f4} = {x4y − x2y + x, x2y2 − y2, xy,−y2}.

The above example shows how the following algorithm for computing a Groebnerbasis works. As ideals are generally defined by their generating sets, this algorithmgives a method for constructing a Groebner basis for any given ideal. It will be proventhat the algorithm holds for all polynomials rings and bases in general.

Theorem 8 Buchberger’s Algorithm Given f1, . . . , fs ⊂ k[x1, . . . , xn], considerthe algorithm which starts with F = f1, . . . , fs and then repeats the two steps:

1. (Compute Step) Compute S(fi, fj)G

for all fi, fj ∈ F with i < j.

2. (Augment Step) Augment F by adding the nonzero S(fi, fj)F.

until the Compute Step gives only zero remainders. This algorithm always terminates(in a finite number of steps) and the final value of F is a Groebner basis of 〈f1, . . . , fs〉.(Cox 8)

If the reader is interested, a more precise version of the algorithm is given in CLO87 or Adams 43.

Theorem 9 Given F = {f1, . . . , fs} with fi 6= 0 (1 ≤ i ≤ s), Buchberger’s Algorithmwill produce a Groebner basis for the ideal I = 〈f1, . . . , fs〉.

Proof:First we need to show that the algorithm terminates in a finite number of steps. Wewill proceed by contradiction by assuming that the algorithm does not terminate.Then, as the algorithm progresses, we keep constructing sets Fi strictly larger thanFi−1 and obtain a strictly increasing infinite sequence

F1 ⊂ F2 ⊂ F3 ⊂ · · · .

Each Fi is obtained from Fi−1 by adding some h ∈ I to Fi−1, where h = S(fi, fj)Fi−1 6=

0 for some elements fi, fj ∈ Fi−1. Since h is reduced with respect to Fi−1, we have

32

that LT (h) /∈ LT (Fi−1). Yet LT (h) ∈ LT (Fi). Thus we get the following strictlyascending chain of ideals

〈LT (F1)〉 ⊂ 〈LT (F2〉 ⊂ 〈LT (F3)〉 ⊂ · · · .

This contradicts the Ascending Chain Condition (see Appendix). Therefore the al-gorithm must terminate and 〈LT (Fi−1)〉 = 〈LT (Fi)〉 for some i.

Now we need to show that the final value of F , which we will call F ′ is a Groebnerbasis of I = 〈f1, . . . , fs〉. We first need to show that F ′ ⊂ I. Since F ⊂ I, then p, q

and hence S(p, q) are in I, and thus every S(p, q)F

is in I. Thus when we enlarge F ,we enlarge it by elements of I and so F ′ ⊂ I.

Furthermore, since F = 〈f1, . . . fs〉 ⊆ F ′’, we know that F ′ is a basis for I.

Moreover, if fi, fj ∈ F ′, then S(gi, gj)F ′

= 0 by construction. Therefore, F ′ is aGroebner basis for I by Buchberger’s S-Pair Criterion. (CLO 88, Adams 42).

Buchberger’s algorithm is crucial in the computations using Groebner bases. Thereare more efficient versions of this algorithm for computing a Groebner basis, whichcut down on computation time, and these are presented in CLO 2.9, if the readeris interested. Furthermore, the Groebner bases computed by the algorithm given inTheorem 8, often have more generators than necessary. Since we are using these gen-erators in solving the integer programming problem, we can cut down on computationtime by eliminating the extra ones, according to the following fact.

Lemma 5 Let G be a Groebner basis for the polynomial ideal I. Let p ∈ G be apolynomial such that LT (p) ∈ 〈LT (G− p)〉. Then G− p is also a Groebner basis forI.

Proof:Since G is a Groebner basis of I, 〈LT (G)〉 = 〈LT (I)〉. If 〈LT (P )〉 ∈ 〈LT (G − p)〉,then LT (G− p) = LT (G). Therefore G− p is also a Groebner basis for I. (CLO 89)

By removing unneeded generators from a Groebner basis and multiplying by con-stants so that all of the leading coefficients are 1, a unique reduced Groebner basiscan be obtained for any ideal.

Definition 19 A reduced Groebner basis for a polynomial ideal I is a Groebnerbasis G for I such that:

1. LC(p) = 1 for all p ∈ G.

2. For all p ∈ G, no monomial of p lies in 〈LT (G− p)〉.

33

This result is especially useful when working with complicated integer program-ming problems or other applications using Groebner bases. For the purposes of thispaper, we will only be looking at simple examples that can be done by hand, and soa proof of the above result will not be included here. If the reader is interested, thereis a discussion of this result in CLO 2.7.

34

Chapter 3

Elimination Theory

We will now return to our main problem of solving a system of equations, and we willsee how Groebner bases will give us a method for doing this involving n variables.There are two steps that allow us to solve a system of equations. The first is thatwe can eliminate variables by manipulating the equations, allowing us to find simplerequations with fewer variables. The second is that we can extend the solutions ofthese simpler equations back into our original equations. These steps are called theElimination Step and the Extension Step, respectively. Elimination theory involvesgeneralizing these steps. The Elimination Step can be generalized as follows:

Definition 20 Given I = 〈f1, . . . , fs〉 ⊂ k[x1, . . . , xn], the lth elimination ideal Il

is the ideal of k[xl+1, . . . , xn] defined by

Il = I ∩ k[xl+1, . . . , xn].

So the elements of Il are all the equations that follow from f1 = · · · = fs = 0and eliminate the variables x1, . . . , xl. We can easily see that Il is indeed an ideal ofk[xl+1, . . . , xn]:

Proposition 6 The lth elimination ideal Il, defined above, is an ideal of k[xl+1, . . . , xn].

Proof:Let I be an ideal and let Il = I ∩ k[xl+1, . . . , xn]. Then 0 ∈ Il, because 0 ∈ I and0 ∈ k[xl+1, . . . , xn]. Il is closed under addition, since I and k[xl+1, . . . , xn] are closedunder addition, and the sum of any two polynomials with variables xl+1, . . . , xn mustbe a polynomial with these variables. It only remains to show that Il absorbs elementsof the ring k[xl+1, . . . , xn] under multiplication. Pick f ∈ Il and g ∈ k[xl+1, . . . , xn].Then f ∈ I and so f · g ∈ I. Also f · g ∈ k[xl+1, . . . , xn] and so f · g ∈ Il. ThereforeIl is an ideal.

One of the goals of elimination theory is to give a systematic procedure for findingelements (preferably generators) of each Il. It turns out that we can find bases of

35

all these ideals by using Groebner bases with lex ordering (Cox 10). The followingresult will show that given an ideal I, elements of the Groebner basis of the ideal thatinvolve only the variables xl+1, . . . , xn form the Groebner basis of the lth eliminationideal Il.

Theorem 10 If I = 〈f1, . . . , fs〉 ⊂ k[x1, . . . , xn] is an ideal and G = {g1, . . . , gt} isa Groebner basis for I for lex order with x1 > · · · > xn, then for each 0 ≤ l ≤ n , theset

Gl = G ∩ k[xl+1, xl+2, . . . , xn]

is a Groebner basis for the elimination ideal

Il = I ∩ k[xl+1, xl, . . . , xn].

Proof:Since G ⊂ I, we know Gl ⊂ Il. Let f ∈ Il. We need to show that LT (f) isdivisible by LT (g) for some g ∈ Gl. By definition of Il, we know that f ∈ I andthus LT (f) is divisible by LT (g) for some g ∈ G, since G is a Groebner basis ofI. Since f ∈ Il, its leading term LT (f) involves only the variables xl+1, . . . , xn.So the same must be true for LT (g). Note that since we are using lex orderingwith x1 > x2 > · · · > xn, any monomial involving x1, . . . , xl is greater than allmonomials in k[xl+1, . . . , xn]. It follows that LT (g) ∈ k[xl+1, . . . , xn] implies thatg ∈ k[xl+1, . . . , xn]. Thus g ∈ G ∩ k[xl+1, . . . , xn] = Gl. So we have shown that anyf ∈ Il = I∩k[xl+1, . . . , xn] is divisible by LT (g) for some g ∈ Gl = G∩k[xl+1, . . . , xn].Therefore, Gl is a Groebner basis of Il by definition of Groebner basis. (CLO 113-114and Cox 10).

More efficient monomial orders can be used other than lex ordering if we aresolving a problem for which we want to eliminate certain variables. Rather thaneliminating the first variable x1, and then eliminating x2, and so on, there are quickermethods that take less computing time. We can establish an elimination order inwhich we order the variables according to how we want to eliminate them, and wewill create such orders in the next section.

The elimination theorem gives us partial solutions of the original system that weneed to extend to a complete solution (if there is one). If we transfer this problemto geometry, as before, we can talk about affine varieties in relation to the problemof finding solutions to a system of equations. Recall that an ideal I ⊂ k[x1, . . . , xn],yields an affine variety

V (I) = {(a1, . . . , an) ∈ kn : f(a1, . . . , an) = 0 for all f ∈ I}.

The elimination theorem allows us to describe points of V (I) by building them up onecoordinate at a time. If we pick the lth coordinate, then we can find the eliminationideal Il. Then a solution (al+1, . . . , an) ∈ V (Il) is a partial solution to our original

36

system of equations. To extend this solution in V (Il) to a complete solution in V (I),we need to find al so that (al, al+1, . . . , an) lies in the variety V (Il−1) of the nextelimination ideal (CLO 114). The Extension Theorem gives a way to determinewhether a partial solution (al, . . . , an) ∈ V (Il) can be extended in such a way to asolution (al−1, . . . , an) ∈ V (I). We include the Extension Theorem below, withoutproof, for the sake of completion and because it is an interesting result. We will notbe using this result in this paper and the proof of the theorem is beyond the scope ofthis paper. If the reader is interested, a discussion of the extension theorem can befound in CLO 3.1 with the complete proof in CLO 3.6.

Theorem 11 (The Extension Theorem) Let I = 〈f1, . . . , fs〉 ⊂ IC [x1, . . . , xn] andlet I1 be the first elimination ideal of I. For each 1 ≤ i ≤ s, write fi in the form

fi = gi(x2, . . . , xn)xNi1 + terms in which x1 has degree < Ni,

where Ni ≥ 0 and gi ∈ IC [x2, . . . , xn] is nonzero. Suppose that we have a partialsolution (a2, . . . , an) ∈ V (I1). If (a2, . . . , an) /∈ V (g1, . . . , gs), then there exists a1 ∈ ICsuch that (a1, a2, . . . , an) ∈ V (I) (CLO 115).

Note the Extension Theorem is defined over the field ICand it is false in IR .

37

Chapter 4

Integer Programming

Recall that the integer programming problem can be defined as follows:

Integer Programming Problem Let aij ∈ ZZ , bi ∈ ZZ , and cj ∈ IR , i = 1, . . . , n,j = 1, . . . ,m; we wish to find a solution (σ1, σ2, . . . , σm) in IN m of the system

a11σ1 + a12σ2 + · · ·+ a1mσm = b1

a21σ1 + a22σ2 + · · ·+ a2mσm = b2

·· (4.1)

·an1σ1 + an2σ2 + . . . + anmσm = bn,

which minimizes the “cost function”

c(σ1, σ2, . . . , σm) =m∑

j=1

cjσj.

(Adams 105)

Our strategy for solving the integer programming problem will first be to transferit into a problem about polynomials. Then we can use the tools we gained fromChapter 2 to compute the Groebner basis and the methods from Elimination Theoryin Chapter 3 to find a solution to the problem. We will then transfer this solutionback into a solution of the integer programming problem.

First, we transfer this problem into a problem about polynomials by assigning avariable to each linear equation in 4.1. We will let these variables be x1, x2, . . . , xn forthe n equations of 4.1. Then, we can represent any equation of 4.1 as the following:

xai1σ1+···+aimσmi = xbi

i ,

for i = 1, . . . , n in which we form a monomial equation by letting the ith equation ofthe system be the exponent of a variable xi. But we are looking for a solution to the

38

entire system, and so we must use each of these equations to form a single equationthat represents the system. The product of all the monomials xai1σ1+···+aimσm

i setequal to the product of all the monomials xbi

i ’s will be a single monomial equationthat represents the system. So we can write 4.1 as

xa11σ1+···+a1mσm1 · · ·xan1σ1+···+amnσm

n = xb11 · · ·xbn

n .

By grouping the factors with exponents that are multiples of σi together, we get

(xa111 xa21

2 · · ·xan1n )σ1 · · · (xa1m

1 xa2m2 · · ·xanm

n )σm = xb11 · · ·xbn

n .

Then, to find the solutions, we form a polynomial map ϕ : k[y1, . . . , ym] 7→ k[x1, . . . , xn]such that ϕ(yi) = x

a1j

1 · · ·xanjn , according to the following definition:

Definition 21 A polynomial map

ϕ : k[y1, . . . , ym] 7→ k[x1, . . . , xn]

is a homomorphism between polynomial rings k[y1, . . . , ym] and k[x1, . . . , xn] (see Ap-pendix), which is a k-vector space linear transformation.

Note that a polynomial map is uniquely determined by ϕ : yi 7→ fi, where fi ∈k[x1, . . . , xn], 1 ≤ i ≤ m. That is, a polynomial map is uniquely determined by whereit sends each variable yi. So if h ∈ k[y1, . . . , ym] such that h =

∑v cvy

v11 · · · yvm

m , wherecv ∈ k, v = (v1, . . . , vm) ∈ ZZ m

≥0, and only finitely many cv’s are non-zero, then wehave

ϕ(h) =∑v

cvyv11 · · · yvm

m =∑v

cvfv11 · · · f vm

m = h(f1, . . . , fm) ∈ k[x1, . . . , xn].

That is, ϕ(h) depends entirely on the way the variables yi from k[y1, . . . , ym] areassigned to polynomials fi in k[x1, . . . , xn]. With the necessary language in place, weare now ready to explain how to find a solution to the system.

Lemma 6 Assume all aij’s and bi’s are non-negative. Then there exists a solution(σ1, σ2, . . . , σm) ∈ ZZ m

≥0 of 4.1 if and only if the monomial xb11 xb2

2 · · ·xbnn is in the image

under ϕ of a monomial in k[y1, . . . , ym].Moreover, if xb1

1 xb22 · · ·xbn

n = ϕ(yσ11 yσ2

2 · · · yσmm ), then (σ1, σ2, . . . , σm) ∈ ZZ m

≥0 is a solu-tion to 4.1.

Proof:We will do both directions of this proof at once by assuming that there exists a solution(σ1, σ2, . . . , σm) ∈ ZZ m

≥0 of 4.1 and showing this is an equivalent to the statement that

the monomial xb11 xb2

2 · · ·xbnn is in the image under ϕ of a monomial in k[y1, . . . , ym].

39

Assume there exists a solution (σ1, σ2, . . . , σm) ∈ ZZ m≥0 of 4.1. As we have shown,

this means that

(xa111 xa21

2 · · ·xan1n )σ1 · · · (xa1m

1 xa2m2 · · ·xanm

n )σm = xb11 · · ·xbn

n .

And so, according to our map, it is equivalent to say that

(ϕ(y1))σ1 · · · (ϕ(ym))σm = xb1

1 · · ·xbnn .

Then, by definition of homomorphism (and ϕ is a homomorphism) the following isequivalent to the above statement.

ϕ(yσ11 · · · yσm

m ) = xb11 · · ·xbn

n .

Therefore the monomial xb11 xb2

2 · · ·xbnn is in the image under ϕ of a monomial in

k[y1, . . . , ym], and we have shown that this is equivalent to (σ1, σ2, . . . , σm) ∈ ZZ m≥0

being a solution to 4.1.(Adams 106).

Now we know from this lemma that to find a solution to our system, we arelooking for a monomial yσ = yσ1

1 · · · yσmm st ϕ(yσ) = xb. Or, equivalently, if we have

xb ∈ Im(ϕ), we need to find yσ. That is, our next goal is to find a method fordetermining whether an element of k[x1, . . . , xn] is in the image of a polynomial mapsuch as ϕ. To do this, we will begin by studying the structure of the image of ϕ.

The image of ϕ is defined as Im(ϕ) = {f ∈ k[x1, . . . , xn]| there exists h ∈k[y1, . . . , ym] with f = ϕ(h)}. By definition of a ring homomorphism, we know thatthe image of ϕ is a sub-ring of k[x1, . . . , xn], since it is closed under addition andmultiplication, additive inverses exist in Im(ϕ), and all other properties of a ring areinherited from k[x1, . . . , xn] (see Appendix; these properties are easily seen from thedefinition of a ring homomorphism). We will denote Im(ϕ) = k[f1, . . . , fm], since bydefinition, all polynomials in Im(ϕ) are linear combinations of the polynomials fi.

Also recall from algebra the definition of the kernel of ϕ,

ker(ϕ) = {h ∈ k[y1, . . . , ym]|ϕ(h) = 0}.

We know from algebra (and it is easily seen) that ker(ϕ) is an ideal of k[y1, . . . , ym],using the definition of ring homomorphism. Then by the First Isomorphism Theorem(see Appendix), we know that

Im(ϕ) ∼= k[y1, . . . , ym]/ker(ϕ).

by a mappingφ : k[y1, . . . , ym]/ker(ϕ) 7→ k[f1, ..., fm]

defined byg + ker(ϕ) 7→ ϕ(g).

40

Theorem 12 φ : k[y1, . . . , ym]/ker(ϕ) → k[f1, ..., fm] defined by g + ker(ϕ) 7→ ϕ(g)is a polynomial map. (Adams 80).

Proof:We need to show that φ is a homomorphism between rings. We already know thatk[f1, . . . , fm] is a ring, and we know from algebra that k[y1, . . . , ym]/ker(ϕ) is a ring.So we need to show that φ is a ring homomorphism to complete the proof.

Let g1 + ker(ϕ), g2 + ker(ϕ) ∈ k[y1, . . . , ym]/ker(ϕ). First we need to show thataddition is preserved. So we calculate

φ(g1 + ker(ϕ) + g2 + ker(ϕ)) = φ(g1 + g2 + ker(ϕ))

= ϕ(g1 + g2)

= ϕ(g1) + ϕ(g2)

= φ(g1 + ker(ϕ)) + φ(g2 + ker(ϕ)),

since ϕ is a ring homomorphism. Next we need to show that the zero element ismapped to zero.

φ(0 + ker(ϕ)) = φ(ker(ϕ)) = 0.

Finally, we need to show that multiplication is preserved.

φ((g1 + ker(ϕ))(g2 + ker(ϕ))) = φ(g1g2 + ker(ϕ))

= ϕ(g1g2)

= ϕ(g1)ϕ(g2)

= (φ(g1 + ker(ϕ))(φ(g2) + ker(ϕ)),

since ϕ is a ring homomorphism.

So we can see how crucial the ker(ϕ) is to the structure of the image of ϕ. We willnow find a way to construct, through elimination theory, a Groebner basis of ker(ϕ)and this will give us an algorithm for determining whether a polynomial f is in theimage of ϕ, which is our goal.

First, we need the following technical lemma.

Lemma 7 Let a1, a2, . . . , an, b1, b2, . . . , bn be elements of a commutative ring R. Thenthe element aα1

1 aα22 · · · aαn

n −bα11 bα2

2 · · · bαnn , where αi ∈ ZZ ≥0, is in the ideal 〈a1−b1, a2−

b2, . . . , an − bn〉. (Adams 80)

Proof:This proof will be done by induction on the number of variables n.Base Case: Let n = 1. Then for a1, b1 ∈ R, aα1

1 − bα11 = (a1− b1)(a

α−11 + aα−2

1 b1 + · · ·+a1b

α−21 + bα−1

1 ) ∈ 〈a1− b1〉. (This is easily checked by dividing aα1 − bα

1 by a1− b1.) Forpurposes of clarity, we will show the next case n = 2, so that it is easier to see how

41

our inductive step works later.For a1, a2, b1, b2 ∈ R,

aα11 aα2

2 − bα11 bα2

2 = aα22 (aα1

1 − bα11 ) + bα1

1 (aα22 − bα2

2 ).

Then, by our first case, (aα11 − bα1

1 ) is a multiple of (a1 − b1) and (aα22 − bα2

2 ) is amultiple of (a2 − b2). Thus aα2

2 (aα11 − bα1

1 ) + bα11 (aα2

2 − bα22 ) ∈ 〈a1 − b1, a2 − b2〉 by the

definition of an ideal.Inductive Hypothesis: Assume for n ≤ k and a1, a2, . . . , an, b1, b2, . . . , bn ∈ R that

aα11 aα2

2 · · · aαnn − bα1

1 bα22 · · · bαn

n in in the ideal 〈a1 − b1, a2 − b2, . . . , an − bn〉.Now we need to show that aαk

1 aα22 · · · aαk+1

k+1 −bα11 bα2

2 · · · bαk+1

k+1 is in the ideal 〈a1−b1, a2−b2, . . . , ak+1 − bk+1〉.Note that

aα11 aα2

2 · · · aαk+1

k+1 −bα11 bα1

1 · · · bαk+1

k+1 = aαk+1

k+1 (aα11 · · · aαk

k −bα11 · · · bαk

k )+bα11 · · · bαk

k (aαk+1

k+1 −bαk+1

k+1 ).

Then we know from our hypothesis that

aαk+1

k+1 (aα11 · · · aαk

k − bα11 · · · bαk

k ) ∈ 〈a1 − b1, a2 − b2, . . . , ak − bk〉.

Alsobα11 · · · bαk

k (aαk+1

k+1 − bαk+1

k+1 ) ∈ 〈ak+1 − bk+1〉.

Therefore

aαk+1

k+1 (aα11 · · · aαk

k −bα11 · · · bαk

k )+bα11 · · · bαk

k (aαk+1

k+1 −bαk+1

k+1 ) ∈ 〈a1−b1, a2−b2, . . . , ak+1−bk+1〉.

Therefore, by induction, the lemma is true for any n. (Hint from Adams 81).

Now we are able to find elements of ker(ϕ) in the following way.

Theorem 13 Let K = 〈y1 − f1, . . . , ym − fm〉 ⊆ k[y1, . . . , ym, x1, . . . , xn], for fi ∈k[x1, . . . , xn].Then ker(ϕ) = K ∩ k[y1, . . . , ym].

Proof:(⊇) Let g ∈ K ∩ k[y1, . . . , ym]. Then we can write

g(y1, . . . , ym) =m∑

i=1

(yi − fi(x1, . . . , xn))hi(y1, . . . , ym, x1, . . . , xn),

where hi ∈ k[y1, . . . , ym, x1, . . . , xn]. We know that ϕ(g) maps yi 7→ fi. By construc-tion, we can see that when we evaluate g at (f1, . . . , fm), we get

ϕ(g) =m∑

i=1

(fi(x1, . . . , xn)− fi(x1, . . . , xn))(hi(y1, . . . , ym, x1, . . . , xn)) = 0.

42

Thus g ∈ ker(ϕ).(⊆) Let g ∈ ker(ϕ). Then since, g ∈ k[y1, . . . , ym], we let

g =∑v

cvyv,

where cv ∈ k, v = (v1, . . . , vm), and finitely many cv’s are nonzero. Then, sinceg(f1, . . . , fm) = 0,

g = g − g(f1, . . . , fm)

=∑v

cvyv11 · · · yvm

m −∑v

cvfv11 · · · f vm

m

=∑v

cv(yv11 · · · yvm

m − f v11 · · · f vm

m ).

By Lemma 7, each term in the above sum is in the ideal K = 〈y1 − f1, . . . , ym − fm〉,and therefore g ∈ K ∩ k[y1, . . . , ym]. QED.(Adams 80-81)

We can now have a way to find elements in the kernel of ϕ, by computing aGroebner basis for ker(ϕ) in the following way. First, by Buchberger’s algorithm,we can compute a Groebner basis G for the ideal K = 〈y1 − f1, . . . ym − fm〉 ink[y1, . . . , ym, x1, . . . , xn] with respect to a lex ordering with x1 > x2 > · · · > y1 > · · · >ym. Then a Groebner basis for K ∩ k[y1, . . . , ym] will be precisely the polynomialsof the Groebner basis G of K that do not have any x variables, by the EliminationTheorem. The following example illustrates this process.

Example:Let φ : Q[a, b, c] → Q[x, y] be the map defined by

a 7−→ x2y

b 7−→ xy

c 7−→ y3

We first compute a Groebner basis for the ideal K = 〈a − x2y, b − xy, c − y3〉 ⊆Q[a, b, c, x, y] with respect to lex term ordering on the x, y variables with x > y andrevlex ordering on the a, b, c variables with a > b > c and with an elimination orderin which the x, y variables are larger than the a, b, c variables.

•S(a− x2y, b− xy) =x2y

−x2y(−x2y + a)− x2y

−xy(−xy + b)

= xb− a.

•S(a− x2y, c− y3) =x2y3

−x2y(−x2y + a)− x2y3

−y3(−y3 + c)

= x2c− ay2.

43

•S(b− xy, c− y3) =xy3

−xy(−xy + b)− xy3

−y3(−y3 + c)

= xc− by2.

Thus F1 = {a− x2y, b− xy, c− y3, xb− a, x2c− ay2, xc− by2}.Then we compute the S-polynomials:

•S(a− x2y, xc− by2) =x2y2b

−x2y(−x2y + a)− x2y2b

−by2(xc− by2)

= x3c− yab.

Then x3c− yabF1

= 0.

•S(b− xy, x2c− ay2) =x2yc

−xy(−xy + b)− x2yc

x2c(x2c− ay2)

= −xbc + ay3.

Then −xbc + ay3F1= −b2y2 − ac.

•S(b− xy, c− y3) =xy3

−xy(−xy + b)− xy3

−y3(−y3 + c)

= xc− by2.

Then −xbc + y3aF1

= −ac + a + c.Therefore we have a Groebner basis for K = 〈a−x2y, b−xy, c−y3〉, G = {a−x2y, b−xy, c− y3, xb− a, x2c− ay, xc− by2,−b2y2 − ac,−ac + a + c}.Therefore a Groebner basis for ker(φ) is

G ∩ k[a, b, c] = {−ac + a + c}.

Now that we can construct a Groebner basis for ker(ϕ), we will return to ourmain goal of finding an algorithm to determine whether an element f ∈ k[x1, . . . , xn]is in the image of the map ϕ. This is accomplished by the following theorem.

Theorem 14 Let K = 〈y1 − f1, . . . , ym − fm〉 ⊆ k[y1, . . . , ym, x1, . . . , xn] be the idealconsidered in the above Theorem, and let G be a Groebner basis for K with respectto an elimination order with the x variables larger than the y variables. Then f ∈k[x1, . . . , xn] is in the image of ϕ if and only if there exists h ∈ k[y1, . . . , ym] such that

fG

= h. In this case, f = ϕ(h) = h(f1, . . . , fm). (Adams 82)

44

Before looking at the formal proof, we will discuss the intuition behind this theorem.We know that f ∈ Im(ϕ) if and only if f = ϕ(h), for some h ∈ k[y1, . . . , ym].From the first isomorphism theorem, we know that there is a direct relationshipϕ(h) ↔ h+ ker(ϕ). Thus f ∈ Im(ϕ) if and only if f corresponds to some h+ ker(ϕ)for some h ∈ k[y1, . . . , ym]. It follows that f ∈ Im(ϕ) if and only if f =

∑i=1 gipi +h,

where G = {g1, . . . , gs} is a Groebner basis for ker(ϕ), pi ∈ k[y1, . . . , ym]. This is true

if and only if fG

= h. Therefore, we can see that f ∈ Im(ϕ) if and only if fG

= h, forsome h ∈ k[y1, . . . , ym] and in this case, f = ϕ(h), which is what the theorem states.The following is a formal proof that will pin down the details of this argument.

Proof:(⇒) Let q ∈ k[x1, . . . , xn] be in Im(ϕ). Then q = h(f1, . . . , fm) for some h ∈k[y1 . . . , ym]. Consider the polynomial

q(x1, . . . , xn)− h(y1, . . . , ym) ∈ k[y1, . . . , ym, x1, . . . , xn].

Then by construction of h,

q(x1, . . . , xn)− h(y1, . . . , ym) = h(f1, . . . , fm)− h(y1, . . . , ym).

We can see that the polynomial h(f1, . . . , fm) − h(y1, . . . , ym) has monomial pairsfα1

1 . . . fαmm − yα1

1 . . . yαmm . Thus h(f1, . . . , fm)−h(y1, . . . , ym) ∈ K by Lemma 7 and so

h(x1, . . . , xm)−h(y1, . . . , ym) ∈ K by substitution. Therefore q(x1, . . . , xn)− h(y1, . . . , yn)G

=

0 by Proposition 2. This implies that q(x1, . . . , xn)G

= h(y1, . . . , ym)G, because

we have unique remainders. Let q(x1, . . . , xn)G

= g(y1, . . . , ym)G

= m. Since h ∈k[y1, . . . , ym], h can only be reduced by polynomials in G which have leading termsin the y variables alone. If the leading term of a polynomial has no x variables,then the polynomial must have no x variables, since x variables are greater than yvariables according to our elimination order. Thus the polynomials of G used toreduce h must be polynomials in k[y1, . . . , ym]. Therefore m ∈ k[y1, . . . , ym]. Thus

q(x1, . . . , xn)G

= m ∈ k[y1, . . . , ym], and this completes this direction of the proof.

(⇐) Pick q ∈ k[x1, . . . , xn]. Let qG = h, where h ∈ k[y1, . . . , ym]. Then q − h ∈ K,and so by definition of K we can write

q(x1, . . . , xn)− h(y1, . . . , ym) =m∑

i=1

gi(y1, . . . , ym, x1, . . . , xn)(yi − fi(x1, . . . , xn)).

Since ϕ(yi) = fi, then if we can map each yi to fi,

ϕ(q(x1, . . . , xn)− h(y1, . . . , ym)) =m∑

i=1

ϕ(gi(y1, . . . , ym, x1, . . . , xn)(yi − fi(x1, . . . , xn))

=m∑

i=1

(gi(f1, . . . , fm, x1, . . . , xn)(fi(x1, . . . , xn)− fi(x1, . . . , xn))

= 0.

45

Thus, by definition of homomorphism,

ϕ(q(x1, . . . , xn)− h(y1, . . . , ym)) = ϕ(q(x1, . . . , xn))− ϕ(h(y1, . . . , ym))

= q(x1, . . . , xn)− h(fi, . . . , fm)

= 0.

Therefore we have q = h(f1, . . . , fm) = ϕ(h), and so q is in the image of ϕ. (Adams82)

We now have an algorithm to determine whether a polynomial is in the imageof a polynomial mapping. In our case, in the application of these tools to integerprogramming, we are just working with monomials, but this does not change how theabove theorem works. Our polynomial map ϕ will map monomials to monomials, sothat the ideal considered in Theorem 14 will be K = 〈yj−x

α1j

1 xα2j

2 · · ·xαnjn |1 ≤ j ≤ m〉.

That is, the fj’s, which represent the image of the variables yi’s under ϕ will bemonomials xαj ’s (where αj = (α1j, α2j, . . . , αnj)). Therefore we now have a methodfor determining whether the following system has a solution, and for finding a solution.

a11σ1 + a12σ2 + · · ·+ a1mσm = b1

a21σ1 + a22σ2 + · · ·+ a2mσm = b2

···

an1σ1 + an2σ2 + . . . + anmσm = bn.

Our method, following from Lemma 6 and Theorem 14, is the following:

1. Compute a Groebner basis G for K = 〈yj − xα1j

1 xα2j

2 · · ·xαnjn |1 ≤ j ≤ m〉 with

respect to an elimination order with the x variables larger than the y variables;

2. Find the remainder h of the division of the monomial xb11 xb2

2 · · ·xbnn by G;

3. If h /∈ k[y1, . . . , ym], then the System 4.1 does not have non-negative integersolutions. If h = yσ1

1 yσ22 · · · yσm

m , then (σ1, σ2, . . . , σm) is a solution of System 4.1(Adams 107).

We will consider a simple example that illustrates the above method.

Example:We will let our system be:

2σ1 + σ2 = 3

σ1 + σ2 + 3σ3 = 5

46

Letting the x1 represent the first equation and x2 represent the second, we then havethat

(x21x2)

σ1(x1x2)σ2(x3

2)σ3 = x3

1x52

Then our ideal is K = 〈y1 − x21x2, y2 − x1x2, y3 − x3

2〉. Conveniently, this is equivalentto the ideal in the previous example (by converting the variables a = y1, b = y2, c =y3, x = x1, and y = x2). Thus we already have the following Groebner basis for K,G = {y1−x2

1x2, y2−x1x2, y3−x32, x1y2−y1, x

21y3−x2y1, x1y3−x2

2y2, x22y

22−y1y3,−y1y3+

y1 + y3}.Dividing x3

1x52 by terms in G until we get a remainder that is no longer divisible by

G gives us:

q3 : y1y2

q2 : x32y1

q1 : x1x42

x21x2 − y1

√x3

1x52

x1x2 − y2 −(x21x

52 − x1x

42y1)

x32 − y3 x1x

42y1

−(x1x42y1 − x3

2y1y2)

x32y1y2

−(x32y1y3 − y1y2y3)

y1y2y3

Thus we get a solution y1y2y3. Translating this back into the integer programmingproblem, using the exponents of the yi’s, gives us: σ1 = 1, σ2 = 1, σ3 = 1, which isindeed a solution to our system.

So far we have only considered the case where the coefficients of the equationsin our integer programming problem are positive. Handling the case where somecoefficients are negative cannot work in the same way, because when we translate theproblem into polynomials, we will have negative exponents and this cannot be donein the polynomial ring k[x1, . . . , xn]. This case is not conceptually different, however,and analogous theorems to Lemma 6 and Theorems 13 and 14 that handle this casecan be found in Adams 2.8. For the purposes of this paper, we will handle only thenon-negative case.

Now that we have found solutions (σ1, σ2, . . . , σm) of system 4.1, to complete ourgoal we want to find the solutions that minimize the cost function c(σ1, σ2, . . . , σm) =∑m

j=1 cjσj. In our method for obtaining solutions, we have an elimination order withthe x variables larger than the y variables. In order to minimize the cost function,we will use the cj’s to define a term order on y variables in the following way.

47

Definition 22 A term order <c on the y variables is said to be compatible withthe cost function c and the map ϕ if

{ϕ(yσ11 yσ2

2 · · · yσmm ) = ϕ(y

σ′11 y

σ′22 · · · yσ′m

m )

and c(σ1, . . . , σm) <c (σ′1, . . . , σ′m)}

=⇒ yσ11 yσ2

2 · · · yσmm <c y

σ′11 y

σ′22 · · · yσ′m

m .

That is, if we have two monomials in k[y1, . . . , ym] mapping to xb11 · · ·xbn

n , and thecost function when evaluated at one solution is less than when it is evaluated at theother, then a term order that is compatible with this would order the monomial ofthe lesser solution as less that the monomial of the solution that is greater. Such aterm order on the y variables gives us solutions of system 4.1 which minimize the costfunction as our final result shows.

Proposition 7 Let G be a Groebner basis for K with respect to an elimination orderwith the x variables larger than the y variables, and an order <c on the y variables

which is compatible with the cost function c and the map ϕ. If xb′11 x

b′22 · · ·x

b′nn

G

=yσ1

1 · · · yσmm , where yσ1

1 · · · yσmm is reduced with respect to G, then (σ1, . . . , σm) is a solu-

tion of system 4.1. (Adams 110)

Proof:Let G be a Groebner basis for K = 〈yj − x

a1j

1 xa2j

2 · · ·xanjn |1 ≤ j ≤ m〉 with respect

to an elimination order with the x variables larger than the y variables, and an order<c on the y variables which is compatible with the cost function c and the map ϕ.

Let xb′11 x

b′22 · · ·x

b′nn wβ

G

= yσ11 · · · yσm

m , with yσ11 · · · yσm

m reduced with respect to G. Then(σ1, . . . , σm) is a solution of system 4.1 by Lemma 6. Now we want to show that(σ1, . . . , σm) is a minimal solution.Proceed by contradiction. Assume there exists a solution (σ′1, . . . , σ

′m) of system 4.1

such that∑m

j=1 cjσ′j <

∑mj=1 cjσj. Then by definition of a ring homomorphism, since

ϕ(yσ11 · · · yσm

m ) = ϕ(yσ′11 · · · yσ′m

m ),

ϕ(yσ11 · · · yσm

m )− ϕ(yσ′11 · · · yσ′m

m ) = ϕ(yσ11 · · · yσm

m − yσ′11 · · · yσ′m

m ) = 0.

Therefore yσ11 · · · yσm

m − yσ′11 · · · yσ′m

m ∈ ker(ϕ). By Theorem 13, ker(ϕ) ⊆ K, and so

yσ11 · · · yσm

m − yσ′11 · · · yσ′m

m ∈ K. Thus yσ11 · · · yσm

m − yσ′11 · · · yσ′m

m

G

= 0 by Proposition2. We will arrive at a contradiction to this. Since, by assumption, yσ1

1 · · · yσmm >c

yσ′11 · · · yσ′m

m , then LT (yσ11 · · · yσm

m − yσ′11 · · · yσ′m

m ) = yσ11 · · · yσm

m . But yσ11 · · · yσm

m is alreadyreduced with respect to G, and therefore, by the division algorithm, yσ1

1 · · · yσmm −

yσ′11 · · · yσ′m

m cannot reduce to 0 by G. Thus, by contradiction, we have that yσ11 · · · yσm

m

is a solution that minimizes the cost function c. (Adams 111)

48

In the case where the cost function has only positive coefficients, which is the onlycase we have discussed, a term order that is compatible with the cost function andthe map ϕ is one that orders monomials using the cost function and breaks ties usingany other monomial order (Adams 111). For example, if our integer programmingproblem had m unknowns, then we could order the monomials with y variables usingthe cost function and break any ties by lex order with y1 > y2 > · · · > ym. That

is, yσ11 yσ2

2 · · · yσmm < y

σ′11 y

σ′22 · · · yσ′m

m if and only if either the cost function evaluated at(σ1, . . . , σm) is less than when it is evaluated at (σ′1, . . . , σ

′m) or the cost is equal for

both and yσ11 yσ2

2 · · · yσmm <lex y

σ′11 y

σ′22 · · · yσ′m

m . In the above example, we have only onesolution to our system. Having a solution that minimizes cost is only meaningful ifthere is more than one possible solution. Without an ordering that is compatible withthe cost function, multiple solutions to the system can arise when we are reducingthe monomial xb1

1 xb22 · · ·xbm

m by the Groebner basis G of our ideal K (step 2 of ourmethod for finding a solution to our system). During this reduction, xb1

1 xb22 · · ·xbm

m

can be reduced to multiple monomials in y variables before the reduction is complete.The exponents of each of these monomials is a solution to our system. In an involvedproblem, there may be many solutions and it may be complicated to check these tofind a minimal one. As we have just proven, an ordering that is compatible with ourcost function gives us only the minimal solution (although multiple minimal solutionsmay exist).

4.1 Conclusion

This paper has provided an insight into the theory of Groebner bases as well as a briefintroduction to the techniques of working with Groebner bases through a discussionof their application to integer programming. One can see that the application ofthese techniques can become very complicated computationally, and all but the mostsimple cases require the use of a Computer Algebra System. Yet, through the integerprogramming example, one can see how Groebner bases are an important computa-tional tool. Groebner bases have a wide range of applications, including ComputerAided Geometric Design and Coding Theory (Cox). Improvements on the algorithmsdiscussed in this paper are currently being researched.

4.2 Acknowledgements

I would like to thank Professor Ivelisse Rubio, University of Puerto Rico, for motivat-ing my interest in the study of Groebner Bases and for suggesting its applications tointeger programming as an interesting topic to explore. I would also like to thank Pro-fessor Murli Gupta, Professor E. Arthur Robinson, and Professor Katherine Gurski,hosts of The George Washington University Summer Program For Women In Mathe-matics, for providing an excellent summer experience that introduced me to the topics

49

of this paper.

50

Appendix A

Group Theory

A.1 Ascending Chain Condition

This result shows that there cannot be an infinite strictly ascending chain of ideals,but that at some point, one will reach an ideal that is the biggest possible in thechain.

Theorem 15 Let I1 ⊂ I2 ⊂ I3 ⊂ · · · be an ascending chain of ideals in k[x1, . . . , xn].Then there exists an N ≥ 1 such that IN = IN+1 = IN+2 = · · ·.

Proof:Consider I = ∪∞i=1Ii. We will first show that I is an ideal in k[x1, . . . , xn]. Let f, g ∈ I.Then there are ideals Ii and Ij contained in I such that f ∈ Ii and g ∈ Ij. Assume,WLOG, that Ii ⊂ Ij. Then f, g ∈ Ij. Thus f ±g and f ·g ∈ Ij, so f ±g and f ·g ∈ I.Also, since −f, 0 ∈ Ij, −f, 0 ∈ I. So I is a subring of k[x1, . . . , xn]. Now we need toshow that I absorbs elements of k[x1, . . . , xn] under multiplication to finish the proofthat I is an ideal. Let f ∈ I and h ∈ k[x1 . . . , xn]. Then f ∈ Ii for some Ii. Since Ii

is an ideal, f · h ∈ Ii. Thus f · h ∈ I. Therefore I is an ideal.So by the Hilbert Basis Theorem, I must have a finite generating set. So we can

say, I = 〈f1, . . . , fs〉. Each generator fi must be in one of the I ′js such that fi ∈ Iji

for some ji. We have a finite set of these Iji, and so we can let N be the maximum

of the ji. Then, by the construction of the ascending chain, fi ∈ IN for all i. Thus

I = 〈f1, . . . , fs〉 ⊂ IN ⊂ IN+1 ⊂ · · · ⊂ I.

Therefore all subsequent ideals following IN in the chain are equal. (CLO 76)

A.2 Group Homomorphisms

Some results from algebra about the mappings of groups are given here.

51

Definition 23 A map φ of a group G into a group G′ is a group homomorphismif φ(ab) = φ(a)φ(b) for all a, b ∈ G.

This idea can be expanded to the following definition for rings.

Definition 24 A map φ of a ring R into a ring S is a ring homomorphism if thefollowing conditions hold for all a, b ∈ R:

1. Addition is preserved: φ(a + b) = φ(a) + φ(b), and

2. Multiplication is preserved: φ(ab) = φ(a)φ(b).

Note that the zero element is mapped to zero, since by part 1, φ(0R) = φ(0R + 0R) =φ(0R) + φ(0R) = 2φ(0R). Thus φ(0R) = 0S. Also, a homomorphism must preserve theadditive inverse map, since for a ∈ R, φ(a) + φ(−a) = φ(a− a) = φ(0R) = 0S and so−φ(a) = φ(−a).

Theorem 16 First Isomorphism Theorem Let φ : G → G′ be a homomorphismwith kernel K, and let γk : G → G/K be the canonical homomorphism. There isunique isomorphism µ : G/K → φ[G] such that φ(x) = µ(γk(x)) for each x ∈ G.(Fraleigh 210)

This theorem states that there is an isomorphism between the cosets of G (modthe elements of the kernel of φ) and the image of G under φ. The following is anillustration of this theorem.

G φ[G]

G/K

-

AAAAAAAAU �

��

φ

γK µ(isomorphism)

52

Bibliography

[1] Adams, Loustaunau. An Introduction to Groebner Bases, Graduate Studies inMathematics, AMS, Rhode Island. 1994.

[2] Craig, Joe. “L2 : Leopntief Models and Linear Programming.” Kenyon CollegeSenior Comps. 2003.

[3] Cox, David A. “Introduction to Groebner bases.” Applications of ComputationalAlgebraic Geometry.Ed. Cox and Sturmfels, AMS, Rhode Island. 1998. 1-22.

[4] Cox, Little, O’Shea. Ideals, Varieties, and Algorithms, second edition, Springer,New York. 1997.

[5] Cox, Little, O’Shea. Using Algebraic Geometry, Springer, New York. 1998.

[6] Fraleigh, John. Abstract Algebra, Addison-Wesley, New York. 1998.

[7] Greuel, Pfister. “Grobner bases and algebraic geometry.” Grobner Bases andApplications. Ed. Buchberger and Winkler, Cambridge University Press. 1998.109-143.

[8] Hosten, R. Thomas. “Grobner bases and integer programming.” Grobner Basesand Applications. Ed. Buchberger and Winkler, Cambridge University Press.1998. 144-157.

[9] Thomas, John B. “Applications to Integer Programming.” Applications of Com-putational Algebraic Geometry. Ed. Cox and Sturmfels, AMS, Rhode Island.1998. 119-140.

53

Groebner Bases with an Application to Integer Programmingand algebraic geometry. The primary purpose...

Documents

Transcript of Groebner Bases with an Application to Integer Programmingand algebraic geometry. The primary purpose...