Transcript of Applications of Trees - NTNU

May 6, 2002 Applied Discrete Mathematics, Week 14: Trees and Boolean Algebra

Applications of Trees

There are numerous important applications of trees, only three of which we will discuss today:

• Network optimization with minimum spanning trees

• Problem solving with backtracking in decision trees

• Data compression with prefix codes in Huffman coding trees


Spanning Trees

Definition: Let G be a simple graph. A spanning tree of G is a subgraph of G that is a tree containing every vertex of G.

Note: A spanning tree of G = (V, E) is a connected graph on V with the minimum number of edges (|V| - 1).

Example: Since winters in Boston can be very cold, six universities in the Boston area decide to build a tunnel system that connects their libraries.


Spanning Trees

The complete graph including all possible tunnels:

[Figure: the complete graph on the six vertices Brandeis, Harvard, MIT, Tufts, BU, and UMass]

The spanning trees of this graph connect all libraries with a minimum number of tunnels.


Spanning Trees

Example of a spanning tree:

[Figure: one spanning tree on the vertices Brandeis, Harvard, MIT, Tufts, BU, and UMass]

Since there are 6 libraries, 5 tunnels are sufficient to connect all of them.


Spanning Trees

Now imagine that you are in charge of the tunnel project. How can you determine a tunnel system of minimal cost that connects all libraries?

Definition: A minimum spanning tree in a connected weighted graph is a spanning tree that has the smallest possible sum of weights of its edges.

How can we find a minimum spanning tree?


Spanning Trees

The complete graph with cost labels (in billion $):

[Figure: the complete graph on Brandeis, Harvard, MIT, Tufts, BU, and UMass, with edge costs ranging from 2 to 9]

The least expensive tunnel system costs $20 billion.


Spanning Trees

Prim’s Algorithm:

• Begin by choosing any edge with the smallest weight and putting it into the spanning tree,

• successively add to the tree edges of minimum weight that are incident to a vertex already in the tree and do not form a simple circuit with the edges already in the tree,

• stop when (n – 1) edges have been added.
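The steps above can be sketched in Python. This is a minimal sketch, assuming the graph is given as an adjacency dict mapping each vertex to a list of (neighbor, weight) pairs; it uses the common variant that grows the tree from a chosen start vertex rather than from the globally smallest edge.

```python
import heapq

def prim(graph, start):
    """Return the edges of a minimum spanning tree of a connected weighted graph."""
    in_tree = {start}
    tree_edges = []
    # Priority queue of candidate edges (weight, u, v) leaving the tree.
    frontier = [(w, start, v) for v, w in graph[start]]
    heapq.heapify(frontier)
    while frontier and len(tree_edges) < len(graph) - 1:
        w, u, v = heapq.heappop(frontier)
        if v in in_tree:
            continue  # this edge would form a simple circuit; skip it
        in_tree.add(v)
        tree_edges.append((u, v, w))
        for nxt, nw in graph[v]:
            if nxt not in in_tree:
                heapq.heappush(frontier, (nw, v, nxt))
    return tree_edges
```

The heap always yields a minimum-weight edge incident to the tree, and the `in_tree` test is exactly the "no simple circuit" condition from the slide.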


Spanning Trees

Kruskal’s Algorithm:

Kruskal’s algorithm is identical to Prim’s algorithm, except that it does not require new edges to be incident to a vertex already in the tree.

Both algorithms are guaranteed to produce a minimum spanning tree of a connected weighted graph.

Please look at the proof with regard to Prim’s algorithm on page 585 in the textbook.
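For comparison, Kruskal’s algorithm can be sketched as follows, assuming the graph is given as a list of (weight, u, v) edges; a union-find structure plays the role of the circuit test.

```python
def kruskal(num_vertices, edges):
    """Return the edges of a minimum spanning tree, given (weight, u, v) edges."""
    parent = {}

    def find(x):
        # Union-find with path halving: follow parents to the set representative.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree_edges = []
    for w, u, v in sorted(edges):          # consider edges by increasing weight
        ru, rv = find(u), find(v)
        if ru != rv:                       # adding (u, v) forms no circuit
            parent[ru] = rv
            tree_edges.append((u, v, w))
            if len(tree_edges) == num_vertices - 1:
                break
    return tree_edges
```

Because edges are taken in order of weight anywhere in the graph, the two endpoints may lie in different partial trees, which is exactly how Kruskal’s algorithm differs from Prim’s.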


Backtracking in Decision Trees

A decision tree is a rooted tree in which each internal vertex corresponds to a decision, with a subtree at these vertices for each possible outcome of the decision.

Decision trees can be used to model problems in which a series of decisions leads to a solution (compare with the “counterfeit coin” and “binary search tree” examples).

The possible solutions of the problem correspond to the paths from the root to the leaves of the decision tree.


Backtracking in Decision Trees

There are problems that require us to perform an exhaustive search of all possible sequences of decisions in order to find the solution. We can solve such problems by constructing the complete decision tree and then finding a path from its root to a leaf that corresponds to a solution of the problem.

In many cases, the efficiency of this procedure can be dramatically increased by a technique called backtracking.


Backtracking in Decision Trees

Idea: Start at the root of the decision tree and move downwards, that is, make a sequence of decisions, until you either reach a solution or enter a situation from which no solution can be reached by any further sequence of decisions.

In the latter case, backtrack to the parent of the current vertex and take a different path downwards from there. If all paths from this vertex have already been explored, backtrack to its parent.

Continue this procedure until you find a solution or establish that no solution exists (there are no more paths to try).
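This idea can be sketched as a generic depth-first search of the decision tree. The helper functions `is_solution` and `candidates` are hypothetical problem-specific callbacks supplied by the caller, not something from the slides.

```python
def backtrack(partial, is_solution, candidates):
    """Depth-first search of the decision tree; return a solution or None."""
    if is_solution(partial):
        return partial
    for choice in candidates(partial):    # each child of the current vertex
        partial.append(choice)            # move one level downwards
        result = backtrack(partial, is_solution, candidates)
        if result is not None:
            return result
        partial.pop()                     # backtrack to the parent
    return None                           # no path below this vertex works
```

The `pop()` call is the backtracking step: it undoes the last decision before a different path downwards is tried.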


Backtracking in Decision Trees

Example: The n-queens problem

How can we place n queens on an n×n chessboard so that no two queens can capture each other?

[Figure: a chessboard with a single queen Q; the squares it can reach are marked with an x]

A queen can move any number of squares horizontally, vertically, or diagonally. Here, the possible target squares of the queen Q are marked with an x.


Backtracking in Decision Trees

Obviously, in any solution of the n-queens problem, there must be exactly one queen in each column of the board. Therefore, we can describe the solution of this problem as a sequence of n decisions:

Decision 1: Place a queen in the first column.
Decision 2: Place a queen in the second column.
...
Decision n: Place a queen in the n-th column.

We are now going to solve the 4-queens problem using the backtracking method.
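This column-by-column search can be sketched in Python. The representation is an assumption for illustration: `rows[c]` records the row chosen for the queen in column c.

```python
def solve_queens(n):
    """Return one n-queens solution as a list rows[c] = row of the queen in column c."""
    rows = []

    def safe(row):
        # A new queen in (row, col) is safe if no earlier queen shares its
        # row or a diagonal (columns are distinct by construction).
        col = len(rows)
        return all(r != row and abs(r - row) != abs(c - col)
                   for c, r in enumerate(rows))

    def place(col):
        if col == n:
            return True                   # all n queens placed: a solution
        for row in range(n):
            if safe(row):
                rows.append(row)          # decision: queen at (row, col)
                if place(col + 1):
                    return True
                rows.pop()                # backtrack and try the next row
        return False

    return rows if place(0) else None
```

For the 4-queens problem this search backtracks out of the dead ends under the first-column choices before finding a solution, exactly as the decision tree on the next slide shows.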


Backtracking in Decision Trees

[Figure: the decision tree for the 4-queens problem; starting from the empty board, successive levels place the 1st, 2nd, 3rd, and 4th queen]


Backtracking in Decision Trees

We can also use backtracking to write “intelligent” programs that play games against a human opponent.

Just consider this extremely simple (and not very exciting) game:

At the beginning of the game, there are seven coins on a table. Player 1 makes the first move, then player 2, then player 1 again, and so on. One move consists of removing 1, 2, or 3 coins. The player who removes all remaining coins wins.


Backtracking in Decision Trees

Let us assume that the computer has the first move. Then the game can be described as a series of decisions, where the first decision is made by the computer, the second one by the human, the third one by the computer, and so on, until all coins are gone.

The computer wants to make decisions that guarantee its victory (in this simple game).

The underlying assumption is that the human always finds the optimal move.
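The exhaustive search of this game tree can be sketched as a small recursive win/lose analysis, assuming optimal play by both sides: a position is winning for the player to move if some move leaves the opponent in a losing position, and a position with zero coins is lost (the opponent just took the last coin).

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def wins(coins):
    """True if the player to move can force a win with this many coins left."""
    return any(not wins(coins - take)
               for take in (1, 2, 3) if take <= coins)

def best_move(coins):
    """A move (1, 2, or 3 coins) that leaves the opponent in a losing position."""
    for take in (1, 2, 3):
        if take <= coins and not wins(coins - take):
            return take
    return 1  # every move loses against optimal play; take one coin anyway
```

From seven coins, `best_move` removes three, leaving the opponent with four coins, a losing position, which matches the conclusion on the next slide.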


Backtracking

[Figure: the complete game tree for the seven-coin game; vertices are labeled with the number of coins remaining (7 down to 1), edges with the number of coins removed (1, 2, or 3), and positions are marked C or H according to which player wins from there with optimal play]


Backtracking in Decision Trees

So the computer will start the game by taking three coins and is guaranteed to win the game.

For more interesting games, such as chess, it is impossible to check every possible sequence of moves. The computer player then only looks ahead a certain number of moves and estimates the chance of winning after each possible sequence.

We will study backtracking algorithms in CS 410 in the fall and especially CS 470/670 next spring (board game tournament).


Huffman Coding Trees

We usually encode strings by assigning fixed-length codes to all characters in the alphabet (for example, 8-bit coding in ASCII). However, if different characters occur with different frequencies, we can save memory and reduce transmission time by using variable-length encoding.

The idea is to assign shorter codes to characters that occur more often.


Huffman Coding Trees

We must be careful when assigning variable-length codes. For example, let us encode e with 0, a with 1, and t with 01. How can we then encode the word tea?

The encoding is 0101.

Unfortunately, this encoding is ambiguous. It could also stand for eat, eaea, or tt.

Of course this coding is unacceptable, because it results in loss of information.


Huffman Coding Trees

To avoid such ambiguities, we can use prefix codes. In a prefix code, the bit string for a character never occurs as the prefix (first part) of the bit string for another character.

For example, the encoding of e with 0, a with 10, and t with 11 is a prefix code. How can we now encode the word tea?

The encoding is 11010.

This bit string is unambiguous: it can only encode the word tea.


Huffman Coding Trees

We can represent prefix codes using binary trees, where the characters are the labels of the leaves in the tree.

The edges of the tree are labeled so that an edge leading to a left child is assigned a 0 and an edge leading to a right child is assigned a 1.

The bit string used to encode a character is the sequence of labels of the edges in the unique path from the root to the leaf labeled with this character.
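Decoding with such a tree can be sketched as follows, assuming a nested-dict representation in which internal vertices map '0' and '1' to their children and leaves are single-character strings.

```python
# The prefix code tree for e -> 0, a -> 10, t -> 11 from the example.
tree = {'0': 'e', '1': {'0': 'a', '1': 't'}}

def decode(bits, tree):
    """Follow edge labels from the root; emit a character at each leaf."""
    out, node = [], tree
    for bit in bits:
        node = node[bit]
        if isinstance(node, str):   # reached a leaf: one character decoded
            out.append(node)
            node = tree             # restart at the root for the next one
    return ''.join(out)
```

Because no code word is a prefix of another, reaching a leaf is unambiguous, and the bit string 11010 decodes only to "tea".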


Huffman Coding Trees

The tree corresponding to our example:

[Figure: a binary tree with edges labeled 0 (left) and 1 (right); e is the left child of the root, and a and t are the left and right leaves of the right subtree]

In a tree, no leaf can be the ancestor of another leaf. Therefore, no encoding of a character can be a prefix of an encoding of another character (prefix code).


Huffman Coding Trees

To determine the optimal (shortest) encoding for a given string, we first have to find the frequencies of characters in that string. Let us consider the following string:

eeadfeejjeggebeeggddehhhececddeciedee

It contains 1×a, 1×b, 3×c, 6×d, 15×e, 1×f, 4×g, 3×h, 1×i, and 2×j.

We can now use Huffman’s algorithm to build the optimal coding tree.


Huffman Coding Trees

For an alphabet containing n letters, Huffman’s algorithm starts with n vertices, one for each letter, labeled with that letter and its frequency.

We then determine the two vertices with the lowest frequencies and replace them with a tree whose root is labeled with the sum of these two frequencies and whose two children are the two vertices that we replaced.

In the following steps, we determine the two lowest frequencies among the single vertices and the roots of the trees that we already created.

This is repeated until we obtain a single tree.
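The algorithm can be sketched with a priority queue. This assumes the input is a dict of character frequencies; an integer counter breaks ties so that the heap never compares subtrees. Tie-breaking may produce individual bit strings that differ from the ones on the slides, but the total encoded length is the same, since every Huffman tree is optimal.

```python
import heapq
from itertools import count

def huffman_codes(freqs):
    """Return a dict mapping each character to its prefix code."""
    tiebreak = count()
    heap = [(f, next(tiebreak), ch) for ch, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # the two lowest frequencies
        f2, _, right = heapq.heappop(heap)
        # Replace them with a tree whose root carries the summed frequency.
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))

    codes = {}

    def walk(node, prefix):
        if isinstance(node, str):
            codes[node] = prefix or '0'      # single-letter alphabet edge case
        else:
            walk(node[0], prefix + '0')      # left edge labeled 0
            walk(node[1], prefix + '1')      # right edge labeled 1

    walk(heap[0][2], '')
    return codes
```

Running this on the frequencies from the example string yields a prefix code whose total encoded length is 101 bits, matching the count computed later in the slides.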


Huffman Coding Trees

[Figures: step-by-step construction of the Huffman tree for the frequencies a:1, b:1, f:1, i:1, j:2, c:3, h:3, g:4, d:6, e:15. The merge steps shown on the slides are:]

Merge a (1) and b (1) into a tree of weight 2.
Merge f (1) and i (1) into a tree of weight 2.
Merge the trees {a, b} and {f, i} into a tree of weight 4.
Merge j (2) and c (3) into a tree of weight 5.
Merge h (3) and g (4) into a tree of weight 7.
Merge the trees {a, b, f, i} (4) and {j, c} (5) into a tree of weight 9.
Merge the tree {h, g} (7) and d (6) into a tree of weight 13.
Merge the trees of weights 9 and 13 into a tree of weight 22.
Merge the tree of weight 22 and e (15) into the final tree of weight 37.

Huffman Coding Trees

Finally, we convert the tree into a prefix code tree:

[Figure: the final Huffman tree with edges labeled 0 (left) and 1 (right); the leaves are the ten characters a through j]

The variable-length codes are:

a (freq. 1): 00000
b (freq. 1): 00001
c (freq. 3): 0001
d (freq. 6): 011
e (freq. 15): 1
f (freq. 1): 00100
g (freq. 4): 0101
h (freq. 3): 0100
i (freq. 1): 00101
j (freq. 2): 0011


Huffman Coding Trees

If we encode the original string

eeadfeejjeggebeeggddehhhececddeciedee

using a fixed-length code, we need four bits per character (for ten different characters). Therefore, the encoding of the entire string is 4⋅37 = 148 bits long.

With our variable-length code, we only need 1⋅5 + 1⋅5 + 3⋅4 + 6⋅3 + 15⋅1 + 1⋅5 + 4⋅4 + 3⋅4 + 1⋅5 + 2⋅4 = 101 bits.
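This arithmetic can be checked mechanically, using the code table from the earlier slide:

```python
from collections import Counter

text = "eeadfeejjeggebeeggddehhhececddeciedee"
codes = {'a': '00000', 'b': '00001', 'c': '0001', 'd': '011', 'e': '1',
         'f': '00100', 'g': '0101', 'h': '0100', 'i': '00101', 'j': '0011'}

freq = Counter(text)                 # e.g. 15 e's, 6 d's, 4 g's, ...
fixed_bits = 4 * len(text)           # four bits per character: 4 * 37
variable_bits = sum(n * len(codes[ch]) for ch, n in freq.items())
print(fixed_bits, variable_bits)     # prints 148 101
```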


Huffman Coding Trees

It can be shown that, for any given string, Huffman coding trees always produce a variable-length code with minimum description length for that string.

Since Huffman’s algorithm is not covered in the textbook, please take a look at:

http://www.cs.duke.edu/csed/poop/huff/info/