Trees (Ch. 9.2) Longin Jan Latecki Temple University based on slides by Simon Langley and Shang-Hua Teng

Page 1:

Trees (Ch. 9.2)

Longin Jan Latecki, Temple University

based on slides by

Simon Langley and Shang-Hua Teng

Page 2:

Basic Data Structures - Trees

Informal: a tree is a structure that looks like a real tree turned upside down.

Formal: a tree is a connected undirected graph with no cycles.

Page 3:

Trees - Terminology

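[Figure: a tree with the root x at the top, nodes b, e, m on the next level, and nodes c, d, a on the lowest level. Labels mark the root, a leaf, a node value, a subtree, and the nodes. height = 2, size = 7]

• Every node must have its value(s)
• A non-leaf node has subtree(s)
• A non-root node has a single parent node
• A parent may have 1 or more children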

Page 4:

Types of Tree

Binary Tree: each node has at most 2 sub-trees

m-ary Tree: each node has at most m sub-trees

Page 5:

Binary Search Trees

A binary search tree:
• is a binary tree;
• if a node has value N, all values in its left sub-tree are less than N, and all values in its right sub-tree are greater than N.

Page 6:

This is a binary search tree

Page 7:

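This is NOT a binary search tree

[Figure: binary tree with root 5; the second level holds 4 and 7; the third level holds 3, 2, 8, 9]

Reading the levels as a complete binary tree, 2 lies in the right sub-tree of 4 and 8 lies in the left sub-tree of 7, so the ordering rule is violated.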

Page 8:

Searching a binary search tree

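search(t, s) {
    if (t is empty) return null;        // fell off the tree: s is not present
    if (s == label(t)) return t;
    if (s < label(t))
        return search(t's left tree, s);
    else
        return search(t's right tree, s);
}

Time per level: O(1). Number of levels visited: at most h + 1, where h is the height of the tree. Total: O(h).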

Page 9:

Searching a binary search tree

search(t, s) {
    while (t != null) {
        if (s == label(t)) return t;
        if (s < label(t))
            t = leftSubTree(t);
        else
            t = rightSubTree(t);
    }
    return null;
}

Time per level: O(1). Number of levels visited: at most h + 1, where h is the height of the tree. Total: O(h).

Page 10:

Here's another function that does the same (we search for label s):

TreeSearch(t, s)
    while (t != NULL and s != label[t])
        if (s < label[t])
            t = left[t];
        else
            t = right[t];
    return t;
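The iterative search above can be sketched in Python as follows. This is a minimal illustration rather than the slides' own code: the `Node` class, its field names, and the example tree are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical node type: a label plus optional left/right children."""
    label: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def tree_search(t: Optional[Node], s: int) -> Optional[Node]:
    """Walk down one level per iteration; returns the node or None."""
    while t is not None and s != t.label:
        t = t.left if s < t.label else t.right
    return t


# A small example tree: 5 at the root, then 3 and 8, then 2, 4, 7, 9.
root = Node(5, Node(3, Node(2), Node(4)), Node(8, Node(7), Node(9)))

found = tree_search(root, 7)
missing = tree_search(root, 6)
print(found.label if found else None)  # 7
print(missing)                         # None
```

Each loop iteration does constant work and moves one level down, which is where the O(h) bound comes from.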

Page 11:

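Insertion in a binary search tree: we need to search before we insert

[Figure: binary search tree with root 5; second level 3 and 8; third level 2, 4, 7, 9. The animation inserts 6 and then 11.]

Insert 6: 6 > 5, go right; 6 < 8, go left; 6 < 7, so 6 becomes the left child of 7.

Insert 11: 11 > 5, go right; 11 > 8, go right; 11 > 9, so 11 becomes the right child of 9.

Time complexity? O(height_of_tree), which is O(log n) if the tree is balanced, where n = size of the tree.

Always insert as a new leaf.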

Page 12:

Insertion

insertInOrder(t, s) {
    if (t is an empty tree)             // insert here
        return a new tree node with value s
    else if (s < label(t))
        t.left = insertInOrder(t.left, s)
    else
        t.right = insertInOrder(t.right, s)
    return t
}
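A runnable Python sketch of the same recursive insertion is given below. The `Node` class and the helpers `build` and `height` are illustrative names introduced here, not part of the slides; `height` counts edges, matching the convention on the terminology slide (a three-level tree has height 2).

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Node:
    """Hypothetical node type: a label plus optional left/right children."""
    label: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert_in_order(t: Optional[Node], s: int) -> Node:
    """Search for the insertion point, then attach s as a new leaf."""
    if t is None:                       # empty tree: insert here
        return Node(s)
    if s < t.label:
        t.left = insert_in_order(t.left, s)
    else:                               # equal labels go to the right
        t.right = insert_in_order(t.right, s)
    return t


def build(seq: Iterable[int]) -> Optional[Node]:
    """Insert the items of seq one by one into an initially empty tree."""
    root = None
    for s in seq:
        root = insert_in_order(root, s)
    return root


def height(t: Optional[Node]) -> int:
    """Number of edges on the longest root-to-leaf path (-1 if empty)."""
    return -1 if t is None else 1 + max(height(t.left), height(t.right))


bushy = build([5, 3, 8, 2, 4, 7, 9])
chain = build([1, 2, 3, 4, 5, 6, 7, 8])
print(height(bushy))  # 2
print(height(chain))  # 7
```

The second sequence is already sorted, so every item becomes the right child of the previous one and the tree degenerates into a chain; insertion cost then grows linearly with n instead of logarithmically.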

Page 13:

Try it!!

Build binary search trees for the following input sequences:

• 7, 4, 2, 6, 1, 3, 5, 7

• 7, 1, 2, 3, 4, 5, 6, 7

• 7, 4, 2, 1, 7, 3, 6, 5

• 1, 2, 3, 4, 5, 6, 7, 8

• 8, 7, 6, 5, 4, 3, 2, 1

Page 14:

Comparison: insertion in an ordered list

[Figure: the ordered list 2 3 4 5 7 8 9. Inserting 6 shifts 7, 8, 9 one position to the right, giving 2 3 4 5 6 7 8 9.]

Insert 6

Time complexity? O(n), where n = size of the list

insertInOrder(list, s) {
    loop1: search from the beginning of the list for the first item >= s
    loop2: shift the remaining items one position to the right, starting from the end of the list
    insert s
}
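The two loops of the list version can be written out in Python as below. This is only a sketch under the assumption that the list is stored in a contiguous array; the function name and the example data are chosen here for illustration.

```python
def insert_in_order(items: list, s: int) -> list:
    """Insert s into the sorted list items in place, using two explicit loops."""
    # Loop 1: scan from the beginning for the first item >= s.
    i = 0
    while i < len(items) and items[i] < s:
        i += 1

    # Grow the list by one slot; this placeholder is overwritten below
    # whenever any items need to be shifted.
    items.append(s)

    # Loop 2: shift the tail one position to the right, starting from the end.
    j = len(items) - 1
    while j > i:
        items[j] = items[j - 1]
        j -= 1

    items[i] = s
    return items


data = [2, 3, 4, 5, 7, 8, 9]
insert_in_order(data, 6)
print(data)  # [2, 3, 4, 5, 6, 7, 8, 9]
```

Together the two loops touch every position once in the worst case, which gives the O(n) bound, in contrast to O(height) for the tree.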

Page 15:

Data Compression

Suppose we have a 3 GB character data file that we wish to include in an email.

Suppose the file only contains the 26 letters {a,…,z}.

Suppose each letter $\alpha$ in {a,…,z} occurs with frequency $f_\alpha$.

Suppose we encode each letter by a binary code.

If we use a fixed-length code, we need 5 bits for each character.

The resulting message length is $5\,(f_a + f_b + \cdots + f_z)$.

Can we do better?

Page 16:

Data Compression: A Smaller Example

Suppose the file only has 6 letters {a,b,c,d,e,f} with frequencies

                   a      b      c      d      e      f
Frequency         .45    .13    .12    .16    .09    .05
Fixed length      000    001    010    011    100    101
Variable length   0      101    100    111    1101   1100

Fixed length: 3G = 3000000000 bits

Variable length: $(.45 \cdot 1 + .13 \cdot 3 + .12 \cdot 3 + .16 \cdot 3 + .09 \cdot 4 + .05 \cdot 4)\,G = 2.24G$ bits
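The arithmetic behind the two totals can be checked with a few lines of Python. The frequencies and codewords below are the ones recovered from this slide (a = .45 with codes 000 and 0, through f = .05 with codes 101 and 1100); the function name `bits_per_char` is introduced here for illustration.

```python
freq = {"a": 0.45, "b": 0.13, "c": 0.12, "d": 0.16, "e": 0.09, "f": 0.05}

fixed = {"a": "000", "b": "001", "c": "010",
         "d": "011", "e": "100", "f": "101"}
variable = {"a": "0", "b": "101", "c": "100",
            "d": "111", "e": "1101", "f": "1100"}


def bits_per_char(freq: dict, code: dict) -> float:
    """Expected codeword length: sum of frequency times code length."""
    return sum(freq[c] * len(code[c]) for c in freq)


print(round(bits_per_char(freq, fixed), 2))     # 3.0
print(round(bits_per_char(freq, variable), 2))  # 2.24
```

Multiplying these per-character averages by the number of characters gives the 3G and 2.24G totals, a saving of roughly 25%.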

Page 17:

How to decode?

At first it is not obvious how a variable-length encoding can be decoded, but decoding is possible if we use prefix codes.

Page 18:

Prefix Codes

No encoding of a character can be the prefix of the longer encoding of another character: we could not encode t as 01 and x as 01101, since 01 is a prefix of 01101.

By using a binary tree representation we generate prefix codes with letters as leaves.

[Figure: binary tree with edges labeled 0 and 1 and leaves e, a, t, n, s, giving the codes e = 0, a = 10, t = 110, n = 1110, s = 1111]

Page 19:

Prefix codes allow easy decoding

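[Figure: binary tree with edges labeled 0 and 1 and leaves e, a, t, n, s, giving the codes e = 0, a = 10, t = 110, n = 1110, s = 1111]

Decode: 11111011100

1111 → s, remaining 1011100

10 → a, remaining 11100

1110 → n, remaining 0

0 → e

Result: sane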

Page 20:

Prefix codes

A message can be decoded uniquely.

Follow the tree from the root, one bit per edge, until you reach a leaf; output its letter, return to the root, and repeat!

Draw a few more trees and produce the codes!!!
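The tree-walking decoder can be sketched in Python as follows. The code table is the one implied by the decoding example (e = 0, a = 10, n = 1110, s = 1111, with t = 110 as the only remaining leaf); representing the tree as nested dictionaries and the names `build_tree` and `decode` are choices made here, not part of the slides.

```python
CODE = {"e": "0", "a": "10", "t": "110", "n": "1110", "s": "1111"}


def build_tree(code: dict) -> dict:
    """Build a binary tree as nested dicts: keys are '0'/'1', leaves are letters."""
    root: dict = {}
    for letter, bits in code.items():
        node = root
        for b in bits[:-1]:
            node = node.setdefault(b, {})   # create internal nodes as needed
        node[bits[-1]] = letter             # last bit leads to the leaf
    return root


def decode(bits: str, root: dict) -> str:
    """Follow one edge per bit; on reaching a leaf, emit it and restart at the root."""
    out = []
    node = root
    for b in bits:
        node = node[b]
        if isinstance(node, str):           # reached a leaf
            out.append(node)
            node = root
    return "".join(out)


tree = build_tree(CODE)
print(decode("11111011100", tree))  # sane
```

No lookahead or separator is needed: because no codeword is a prefix of another, the first leaf reached is always the right one.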

Page 21:

Some Properties

• Prefix codes allow easy decoding
• An optimal code must correspond to a full binary tree (a tree where every internal node has two children)
• For |C| leaves there are |C| - 1 internal nodes
• The number of bits to encode a file is

$$B(T) = \sum_{c \in C} f(c)\,\mathrm{length}_T(c)$$

where $f(c)$ is the frequency of $c$ and $\mathrm{length}_T(c)$ is the depth of $c$ in the tree $T$, which corresponds to the code length of $c$.

Page 22:

Optimal Prefix Coding Problem

Input: a set of n letters $(c_1, \ldots, c_n)$ with frequencies $(f_1, \ldots, f_n)$.

Construct a full binary tree T to define a prefix code that minimizes the average code length

$$\mathrm{Average}(T) = \sum_{i=1}^{n} f_i\,\mathrm{length}_T(c_i)$$

Page 23:

Greedy Algorithms

Many optimization problems can be solved using a greedy approach
• The basic principle is that locally optimal decisions may be used to build an optimal solution
• But the greedy approach does not lead to an optimal solution for every problem
• The key is knowing which problems will work with this approach and which will not

We study
• The problem of generating Huffman codes

Page 24:

Greedy algorithms

A greedy algorithm always makes the choice that looks best at the moment
• My everyday examples:
    • Driving in Los Angeles, NY, or Boston for that matter
    • Playing cards
    • Investing in stocks
    • Choosing a university
• The hope: a locally optimal choice will lead to a globally optimal solution
• For some problems, it works

Greedy algorithms tend to be easier to code

Page 25:

David Huffman’s idea

A term paper at MIT

Build the tree (code) bottom-up in a greedy fashion.

Each tree has a weight in its root and symbols as its leaves.

We start with a forest of one-vertex trees representing the input symbols.

We repeatedly merge the two trees whose sum of weights is minimal (that is, the two trees of smallest weight), giving the merged tree that sum as its weight, until only one tree remains.
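The merging procedure described above can be sketched in Python with a priority queue. This is one possible implementation under stated assumptions, not the slides' own code: it uses the standard `heapq` module, represents an internal node as a pair `(left, right)` and a leaf as its symbol (so symbols must not themselves be tuples), assumes a non-empty input, and labels left edges 0 and right edges 1. The frequencies are those of the six-letter example, scaled to integers.

```python
import heapq
from itertools import count


def huffman_code(freq: dict) -> dict:
    """Greedy bottom-up construction: repeatedly merge the two lightest trees."""
    tiebreak = count()  # keeps heap entries comparable when weights are equal
    heap = [(w, next(tiebreak), sym) for sym, w in freq.items()]
    heapq.heapify(heap)

    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)     # lightest tree
        w2, _, t2 = heapq.heappop(heap)     # second lightest tree
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (t1, t2)))

    _, _, tree = heap[0]
    code = {}

    def walk(node, prefix):
        if isinstance(node, tuple):         # internal node: recurse on both children
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                               # leaf: record its codeword
            code[node] = prefix or "0"      # a single-symbol alphabet gets "0"

    walk(tree, "")
    return code


freq = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}
code = huffman_code(freq)
total = sum(freq[c] * len(code[c]) for c in freq)
print(total)  # 224
```

A total of 224 bits per 100 characters is the 2.24 bits per character of the variable-length code in the six-letter example. The exact codewords may differ from that table, because the 0/1 labeling of each merge is arbitrary, but the code lengths agree.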

Page 26:

Building the Encoding Tree

Page 27:

Building the Encoding Tree

Page 28:

Building the Encoding Tree

Page 29:

Building the Encoding Tree

Page 30:

Building the Encoding Tree