Tree Data Structures

Heaps for Searching
Search in a heap?
  Would have to look at the root
  If the search item is smaller than the root, look at both the left and right child
  If the search item is smaller than any node, look at both of its children
Search in a heap is upper bounded by O(n): may have to look at every node (if your value is smaller than every node in the tree).
Total time:
  n nodes inserted in heap, O(log_2 n) per insertion
  n searches, O(n) each
  O(n log_2 n + n^2)
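
To make the O(n) bound concrete, here is a minimal sketch of searching a max-heap. It assumes the heap is stored in an array with the children of index i at 2i+1 and 2i+2; the function name heapSearch and the array layout are illustrative choices, not something the slides specify.

```cpp
#include <cstddef>
#include <vector>

// Search a max-heap stored in an array (assumed layout: children of i at 2i+1, 2i+2).
// Returns the index of target, or -1 if it is not present.
int heapSearch(const std::vector<int>& heap, int target, std::size_t i = 0)
{
    if (i >= heap.size()) return -1;                 // ran off the bottom of the tree
    if (heap[i] == target) return static_cast<int>(i);
    if (target > heap[i]) return -1;                 // everything below is <= heap[i]
    // target is smaller than this node: it could be on either side,
    // so both children have to be searched.
    int found = heapSearch(heap, target, 2 * i + 1);
    if (found != -1) return found;
    return heapSearch(heap, target, 2 * i + 2);
}
```

Only a target larger than the current node lets a subtree be skipped. A target smaller than every key forces a visit to every node, which is the O(n) case described above.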

Heaps for Searching
Heap is well suited for problems where you have to remove specific elements (largest, smallest)
Need to exploit binary tree properties (max height log_2 n) better than the naive heap search does, to allow O(log_2 n) search for arbitrary elements
But if you pull out the largest element every time, the result is a reverse sorted list: Heapsort
  Cost = ~O(n log_2 n) to build the heap, O(n log_2 n) to extract the items back out
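
The "pull out the largest element every time" idea can be sketched with the standard library's heap algorithms. The function name extractLargestFirst is made up for illustration. Note one difference from the slide's estimate: std::make_heap builds the heap in linear time rather than by n separate insertions, but the n extractions still cost O(log n) each, so the total remains O(n log n).

```cpp
#include <algorithm>
#include <vector>

// Build a max-heap, then repeatedly remove the largest element.
// The returned sequence is ordered largest-first (reverse sorted).
std::vector<int> extractLargestFirst(std::vector<int> values)
{
    std::make_heap(values.begin(), values.end());        // max-heap by default
    std::vector<int> out;
    out.reserve(values.size());
    while (!values.empty()) {
        std::pop_heap(values.begin(), values.end());     // moves the max to the back
        out.push_back(values.back());
        values.pop_back();
    }
    return out;
}
```

Reversing the output, or filling the result array from the back, gives the usual ascending Heapsort order.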

A Better Way to Search: Binary Search Trees
Binary Search Tree: a binary tree (2 children, left and right)
  Either zero nodes (empty), or
  If > 0 nodes:
    Every element has a key and no two elements have the same key (unique keys)
    All keys (if any) in the left sub-tree are smaller than the key in the root
    All keys (if any) in the right sub-tree are larger than the key in the root
    The left and right subtrees are also binary search trees

Binary Search Trees
[Figure: two example binary search trees, one with keys 30, 5, 2, 40 and one with keys 60, 70, 65, 80. Both satisfy: unique keys; left nodes < root; right nodes > root; left and right subtrees are also binary search trees.]

Binary Search Trees
[Figure: tree with keys 20, 15, 12, 25, 18, 22. Unique keys, left nodes < root, right nodes > root, but the left and right subtrees are not both binary search trees.]
Not a binary search tree: the right child of 25 is not larger than 25.
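
The recursive definition translates directly into a recursive validity check. The sketch below is an illustration, not part of the original slides: Node and isBST are hypothetical names, keys are plain ints, and each subtree is checked against the open interval of keys it is allowed to hold.

```cpp
#include <climits>

struct Node {
    int key;
    Node* left;
    Node* right;
};

// True when every key in the subtree lies strictly between lo and hi,
// and both subtrees satisfy the same rule with tightened bounds.
bool isBST(const Node* n, long long lo = LLONG_MIN, long long hi = LLONG_MAX)
{
    if (n == nullptr) return true;                        // an empty tree is a BST
    if (n->key <= lo || n->key >= hi) return false;       // violates an ancestor's bound
    return isBST(n->left, lo, n->key) && isBST(n->right, n->key, hi);
}
```

For the counterexample above, assuming 22 hangs off 25 as its right child, the check fails at 22 because it is not greater than 25.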

Binary Search Trees
Note that there is no constraint to be a complete binary tree, just an arbitrary binary tree
  Suggests that a linked node implementation may be more useful
  May affect properties of searching
Recursive definition of binary search tree = recursive algorithms

Binary Search Trees: Search
Search: take advantage of binary tree properties
  Begin at root
  If root == 0, return, as the tree is empty
  Otherwise, compare x to the root key:
    If x == root key, return the root node
    Else if x < root key, then x can't be in the right subtree due to binary tree properties -> recursively search on the left child
    Else recursively search on the right child

Binary Search Trees: BSTNode Definition

    template <class Type> class BST;   // forward declaration for the friend below

    template <class Type>
    class Element
    {
    public:
        Type key;                      // public so the tree code can compare keys
        // ... other data ...
    };

    template <class Type>
    class BSTNode
    {
        friend class BST<Type>;        // BST needs access to the links and data
    private:
        BSTNode* leftChild;
        BSTNode* rightChild;
        Element<Type> data;
    };

Binary Search Tree: Search Implementation

    // Public entry point: start the search at the root.
    template <class Type>
    BSTNode<Type>* BST<Type>::Search(const Element<Type>& x)
    {
        return Search(root, x);
    }

    // Recursive helper: search the subtree rooted at b.
    template <class Type>
    BSTNode<Type>* BST<Type>::Search(BSTNode<Type>* b, const Element<Type>& x)
    {
        if (b == 0) return 0;                                      // empty subtree: not found
        if (x.key == b->data.key) return b;                        // found
        if (x.key < b->data.key) return Search(b->leftChild, x);   // can only be on the left
        return Search(b->rightChild, x);                           // can only be on the right
    }

Binary Search Trees: Search Example
[Figure: tree with root 30; 30's children are 5 and 40; 5's children are 2 and 15.]
Find 15
  Is root == 0? No
  Compare 15 to root (30)
  15 < 30, so recurse on left child
  Compare 15 to 5
  15 > 5, so recurse on right child
  Compare 15 to 15
  15 == 15, so return node with 15
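
The walk above can be reproduced with a small standalone sketch. It uses an iterative loop and plain int keys instead of the templated classes, and records each key it compares against; Node and searchTrace are illustrative names, not part of the slides' code.

```cpp
#include <vector>

struct Node {
    int key;
    Node* left;
    Node* right;
};

// Iterative form of the search. `visited` collects every key that x is compared to,
// so the path taken through the tree can be inspected afterwards.
const Node* searchTrace(const Node* root, int x, std::vector<int>& visited)
{
    const Node* current = root;
    while (current != nullptr) {
        visited.push_back(current->key);
        if (x == current->key) return current;                       // found
        current = (x < current->key) ? current->left : current->right;
    }
    return nullptr;                                                   // fell off the tree
}
```

On the example tree, searching for 15 should compare against 30, 5 and 15 in that order, one comparison per level.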

Binary Search Trees: Big Oh Analysis
At the root, we do one comparison (> root or < root)
Depending on the result:
  Move to one child of the root [moving down a level]
  Do one comparison
Max number of times we could do this is the height of the tree (maximum number of levels): O(h)
Thus ease of search is dependent on the shape of the tree:
  Skewed, expensive: O(n)
  Balanced, cheap: O(log_2 n)

Binary Search Trees: Insertion
Rules: insertion must preserve
  Unique keys
  Right child > parent
  Left child < parent
  Self-similar (internal nodes are also binary trees)
How do we check for uniqueness?
  Look at all the nodes?

Binary Search Trees: Insertion
Don't need to look at all the nodes
  Take advantage of the fact that before adding, it was already a binary search tree
  To see if a value is in the tree, search for it
[Figure: tree with keys 30, 5, 2, 40, 15.]
Add 15: search for 15
  15 ? 30, 15 < 30 => Left
  15 ? 5, 15 > 5 => Right
  15 ? 15, 15 == 15 => Not unique

Binary Search Trees: Insertion
Search not only performs the test for uniqueness, but also puts us in the right place to insert at
  Where the input value should be in the tree
[Figure: tree with keys 30, 5, 2, 40.]
Add 15: search for 15
  15 ? 30, 15 < 30 => Left
  15 ? 5, 15 > 5 => Right
  No right child, so not present
  Add 15 as right child of 5

Binary Search Trees: Insertion Implementation

    template <class Type>
    bool BST<Type>::Insert(const Element<Type>& x)
    {
        // search for x, remembering the last node visited
        BSTNode<Type>* current = root;
        BSTNode<Type>* parent = 0;
        while (current) {
            parent = current;
            if (x.key == current->data.key) return false;   // already present: not unique
            if (x.key < current->data.key) current = current->leftChild;
            else current = current->rightChild;
        }

        // not found: create the node and hang it off the last node visited
        current = new BSTNode<Type>;
        current->leftChild = 0;
        current->rightChild = 0;
        current->data = x;
        if (!root) root = current;
        else if (x.key < parent->data.key) parent->leftChild = current;
        else parent->rightChild = current;
        return true;
    }
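
As a self-contained illustration of the same insert logic, the sketch below strips out the templates and uses int keys so it can be compiled on its own. Node and insertKey are stand-in names for this example only.

```cpp
struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Returns false if x is already in the tree (uniqueness preserved);
// otherwise links a new leaf at the point where the search fell off the tree.
bool insertKey(Node*& root, int x)
{
    Node* current = root;
    Node* parent = nullptr;
    while (current != nullptr) {
        parent = current;
        if (x == current->key) return false;                          // duplicate
        current = (x < current->key) ? current->left : current->right;
    }
    Node* fresh = new Node(x);
    if (parent == nullptr) root = fresh;                              // tree was empty
    else if (x < parent->key) parent->left = fresh;
    else parent->right = fresh;
    return true;
}
```

Inserting 30, 5, 2, 40 and then 15 places 15 as the right child of 5; a second attempt to insert 15 returns false, matching the two traces on the earlier slides. The sketch never frees its nodes, so a real program would also need a cleanup routine.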

Binary Search Trees: Insertion Big Oh Analysis
Core of the insertion function is in the search implementation
  Dependent on shape and size of tree
Actual insertion is constant time
Cost is bounded by search cost, which we have said is:
  O(n) worst case
  ~O(log_2 n) average case with a well balanced tree

Binary Search Trees: Deletion
Rules: deletion must preserve
  Unique keys
    No work to do here. If unique before the delete, unique afterwards, as deletes can't change values in the tree
Do need to ensure:
  Right child > parent, left child < parent
  Self-similar (internal nodes are also binary trees)

Binary Search Trees: Deletion
Three cases:
1) Leaf node (15)
  Remove the leaf node
  Set the parent's pointer where the leaf node was to zero
[Figure: tree with keys 30, 5, 2, 40, 15 before the deletion; tree with keys 30, 5, 2, 40 after deleting 15.]

Binary Search Trees: Deletion
2) Non-leaf, one child (5)
  From the current node, set the parent's link to the current node's child link
  Remove the current node
[Figure: tree with keys 30, 5, 2, 40, shown before and after the deletion of 5.]

Binary Search Trees: Deletion
3) Non-leaf, multiple children (30)
  Replace the value with the largest element of the left subtree or the smallest element of the right subtree
  Delete the node from which you swapped
  This then becomes case 1 or case 2
[Figure: tree with keys 30, 5, 2, 40. The root's value 30 is replaced with 5; the original 5 node is marked toDelete and then removed, leaving a tree with keys 5, 2, 40.]
Binary Search Trees: Binary Search Trees: DeletionDeletion
The rule was:The rule was:““Replace value with largest element of left subtree or smallest element of right subtree””
Is this guaranteed to work?Is this guaranteed to work? Yes, because of binary tree properties, largest Yes, because of binary tree properties, largest
element of left side is:element of left side is: Bigger than anything in left sideBigger than anything in left side Smaller than anything in right sideSmaller than anything in right side
Smallest of right side is:Smallest of right side is: Bigger than anything in left sideBigger than anything in left side Smaller than anything in right side Smaller than anything in right side
These are exactly the roles that must be fulfilled when These are exactly the roles that must be fulfilled when moving to become the root of that subtreemoving to become the root of that subtree
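
The three deletion cases can be sketched as one recursive function. This is an illustrative version with int keys, not the slides' own code: Node, insertKey and removeKey are hypothetical names, and for case 3 it arbitrarily picks the largest element of the left subtree (the smallest of the right subtree would work equally well).

```cpp
struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Helper to build test trees: returns the (possibly new) subtree root; duplicates are ignored.
Node* insertKey(Node* n, int x)
{
    if (n == nullptr) return new Node(x);
    if (x < n->key) n->left = insertKey(n->left, x);
    else if (x > n->key) n->right = insertKey(n->right, x);
    return n;
}

// Removes x from the subtree rooted at n and returns the new subtree root.
Node* removeKey(Node* n, int x)
{
    if (n == nullptr) return nullptr;                                 // x not in the tree
    if (x < n->key) { n->left = removeKey(n->left, x); return n; }
    if (x > n->key) { n->right = removeKey(n->right, x); return n; }

    // Cases 1 and 2: zero or one child. The parent's link is set to the
    // only child, or to null when the node is a leaf.
    if (n->left == nullptr || n->right == nullptr) {
        Node* child = (n->left != nullptr) ? n->left : n->right;
        delete n;
        return child;
    }

    // Case 3: two children. Copy in the largest key of the left subtree,
    // then delete that key from the left subtree (which is case 1 or 2).
    Node* largest = n->left;
    while (largest->right != nullptr) largest = largest->right;
    n->key = largest->key;
    n->left = removeKey(n->left, largest->key);
    return n;
}
```

Callers write `root = removeKey(root, x);` so that a change of root (for example deleting the only node) is picked up.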

Binary Search Trees: Height
The worst case height for a binary tree is the number of elements in the tree
  Skewed tree
[Figure: skewed tree with keys 30, 5, 2, 40.]
Binary tree operation costs are bounded by the height of the tree, so in these cases they become O(n).
How easy is it to get a skewed tree? Sorted or nearly sorted data.

Binary Search Trees: Height
(This slide repeats the Insert implementation shown earlier.)
Insert: 3, 4, 6, 5, 8
[Figure: resulting tree. 3 is the root, 4 is the right child of 3, 6 is the right child of 4, and 5 and 8 are the children of 6.]

Binary Search Trees: Height
If insertions are made at random, height is O(log n) on average
Random insertions are the general case, so most of the time will achieve O(log n) height
There are ways to guarantee O(log n) height: requires modifications to the insert and delete functions to maintain balance
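
To see how insertion order drives height, the sketch below builds a tree from a sequence of keys and measures it. The names (Node, insertKey, height, destroyTree, heightAfterInserting) are assumptions made for this example, and height is counted in levels, so a single node has height 1.

```cpp
#include <algorithm>
#include <vector>

struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Standard unbalanced insert; duplicates are ignored.
Node* insertKey(Node* n, int x)
{
    if (n == nullptr) return new Node(x);
    if (x < n->key) n->left = insertKey(n->left, x);
    else if (x > n->key) n->right = insertKey(n->right, x);
    return n;
}

// Number of levels in the tree (0 for an empty tree).
int height(const Node* n)
{
    return n == nullptr ? 0 : 1 + std::max(height(n->left), height(n->right));
}

void destroyTree(Node* n)
{
    if (n == nullptr) return;
    destroyTree(n->left);
    destroyTree(n->right);
    delete n;
}

// Inserts the keys in the given order and reports the height of the result.
int heightAfterInserting(const std::vector<int>& keys)
{
    Node* root = nullptr;
    for (int k : keys) root = insertKey(root, k);
    int h = height(root);
    destroyTree(root);
    return h;
}
```

Sorted input such as 1, 2, ..., 7 produces a chain of height 7, while the order 4, 2, 6, 1, 3, 5, 7 produces a tree of height 3 from the same keys. The sequence 3, 4, 6, 5, 8 from the earlier slide gives height 4.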

TreeSort
Insertion into a binary tree places a specific ordering on the elements.
[Figure: tree with root 30; 30's children are 5 and 40; 5's children are 2 and 15; 40's children are 35 and 50.]
For the root:
  Everything in the left subtree is < root
  Everything in the right subtree is > root
For each subtree:
  Everything on the left is < subtree root
  Everything on the right is > subtree root

TreeSort
Theoretically, should be able to construct an ordering of all elements from the tree:
  Generate an array of size equal to the number of elements in the tree
  Root goes in the middle of the array
  Left subtree fills in the left half of the array
  Right subtree fills in the right half of the array
  And recurse
[Figure: array layout. First step: (< 30) 30 (> 30). Second step: (< 5) 5 (> 5) 30 (< 40) 40 (> 40).]

TreeSort
Extracting an ordered array from a binary tree:
  Perform an in-order traversal (LVR)
  Ensures we will visit all smaller items first and larger items last
[Figure: tree with root 30; 30's children are 5 and 40; 5's children are 2 and 15; 40's children are 35 and 50.]
LVR ordering: 2, 5, 15, 30, 35, 40, 50

TreeSort
Analysis of TreeSort:
  Given an array of size n, have to build a binary tree with n elements
    Requires n insertions
  Given a binary tree with n elements, have to traverse the tree in LVR order to extract the sorted order
  Construction: O(n log_2 n) if balanced, O(n * n) if not balanced
  Traversal: O(n) anytime
  Average case: O(n log_2 n), worst case: O(n^2)
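
Putting the two phases together, a minimal TreeSort sketch looks like the following. It assumes int keys and an unbalanced tree, and because the tree enforces unique keys, duplicate input values are dropped; the names Node, insertKey, inorder, destroyTree and treeSort are illustrative.

```cpp
#include <vector>

struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Phase 1 building block: unbalanced BST insert (duplicates ignored).
Node* insertKey(Node* n, int x)
{
    if (n == nullptr) return new Node(x);
    if (x < n->key) n->left = insertKey(n->left, x);
    else if (x > n->key) n->right = insertKey(n->right, x);
    return n;
}

// Phase 2: LVR traversal appends the keys in ascending order.
void inorder(const Node* n, std::vector<int>& out)
{
    if (n == nullptr) return;
    inorder(n->left, out);       // L: everything smaller
    out.push_back(n->key);       // V: this node
    inorder(n->right, out);      // R: everything larger
}

void destroyTree(Node* n)
{
    if (n == nullptr) return;
    destroyTree(n->left);
    destroyTree(n->right);
    delete n;
}

std::vector<int> treeSort(const std::vector<int>& input)
{
    Node* root = nullptr;
    for (int x : input) root = insertKey(root, x);   // n insertions
    std::vector<int> sorted;
    sorted.reserve(input.size());
    inorder(root, sorted);                           // O(n) traversal
    destroyTree(root);
    return sorted;
}
```

Feeding it 30, 5, 40, 2, 15, 35, 50 builds the tree from the earlier figure and returns 2, 5, 15, 30, 35, 40, 50.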

TreeSort
Very similar to quicksort!
  Same average case [O(n log n)] and worst case [O(n^2)] times
  The roots of the binary search tree's subtrees are the pivots
  Place data smaller than the pivot on the left of the pivot (leftChild), place larger data on the right of the pivot (rightChild)
  The better the pivot is, the more balanced the tree is (same for quicksort recursion)
  Nearly sorted/already sorted data leads both to trouble: bad partitioning for quicksort, bad construction for treesort

Rank Information
Oftentimes when working with lists of data, we are interested in rank information:
  What is the largest item?
  What is the smallest?
  What is the median?
  What is the fifth smallest item?
Largest and smallest are trivial [O(n)]
What if we want to ask a lot of questions about rank, or want to know about something other than largest or smallest?

Rank Information
Sorting approach to rank information:
  Sort the list
  Return list[rankOfInterest]
  O(n log n) [sort] + O(1) [value retrieval]
If using dynamic data, may not have the array to work with; instead a linked list would be more likely

Rank Information
Linked list approach:
  Sort the list
    Assuming mergesort for linked lists
  Traverse the list to find the rankOfInterest element
  O(n log n) [sort] + O(rankOfInterest) [traversal]
Can handle dynamic data, but slower!

Rank Information
Binary tree approach:
  Insert into a binary tree
  In-order traversal up until the rankOfInterest node (goes through in sorted order)
  O(n log n) [building tree] + O(rankOfInterest) [traversal]
Same cost as the linked list approach (probably easier, since we don't have to write a sort for linked lists).
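
The "traverse in order until the rankOfInterest node" step can be sketched as an in-order traversal that stops early. Node and kthInorder are hypothetical names; ranks start at 1, and the counter is passed by reference so it is shared across the recursive calls.

```cpp
struct Node {
    int key;
    Node* left;
    Node* right;
};

// Visits nodes in LVR order, counting `remaining` down by one per node visited.
// Returns the node at which the count reaches zero, or nullptr if the tree
// holds fewer nodes than the requested rank.
const Node* kthInorder(const Node* n, int& remaining)
{
    if (n == nullptr) return nullptr;
    if (const Node* hit = kthInorder(n->left, remaining)) return hit;   // L
    if (--remaining == 0) return n;                                     // V
    return kthInorder(n->right, remaining);                             // R
}
```

A caller sets `int remaining = rankOfInterest;` and passes it in. The traversal first has to descend to the smallest element, so the work is proportional to the rank plus the depth of that descent.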

Rank Information
Binary tree approach II:
  Add a new variable to each node in the tree
    leftSize = number of elements in the node's left subtree + self
  Initially set all leftSize values to 1 (for self)
  Insert elements into the binary tree
    As we pass by parent nodes in searching for the appropriate place, store references to each parent node from which the search moves left
    If we do the insertion, update the leftSize value of each stored parent (the new node lands in its left subtree)
    If we don't insert (non-unique), no updates for leftSize
  Search by rank using a traditional binary tree search on the leftSize value
  Function on next slide

Rank Information

    template <class Type>
    BinaryTreeNode<Type>* BinarySearchTree<Type>::search(int rank)
    {
        BinaryTreeNode<Type>* current = root;
        while (current)
        {
            if (rank == current->leftSize) return current;        // this node has the requested rank
            else if (rank < current->leftSize) current = current->leftChild;
            else
            {
                rank = rank - current->leftSize;                   // skip the left subtree and this node
                current = current->rightChild;
            }
        }
        return 0;   // rank is out of range
    }

Rank Information: Example
[Figure: tree with root Mike; Mike's children are John and Thomas; John's children are Georgia and Kylie; Thomas's children are Shelley and Tyler.]
leftSize values: Mike 4, John 2, Thomas 2, Georgia 1, Kylie 1, Shelley 1, Tyler 1
Real ranks for data [first is rank 1, last is 7]: Georgia, John, Kylie, Mike, Shelley, Thomas, Tyler
What is the 2nd element?
  Rank 2 < leftSize(Mike) [4]
  Move to root->leftChild
  Rank 2 == leftSize(John) [2]
  Return John node
What is the 5th element?
  Rank 5 > leftSize(Mike) [4]
  Move to root->rightChild
  Rank = 5 - 4 = 1 < leftSize(Thomas) [2]
  Move to leftChild of Thomas
  Rank 1 == leftSize(Shelley) [1]
  Return Shelley node

Rank Information: Analysis
Searching (traversal) is now bounded by the height of the tree
  On average O(log n)
Building the tree was O(n log n), but we added more work
  Original n log n comes from n insertions, log n cost each
  Now have to update parents' leftSize values
  However, maximum number of parents = height of tree = on average log n
  So the cost for a single insertion is now just 2 log n, and all insertion costs are still bounded by O(n log n)
So for dynamic data, can do rank information in:
  O(n log n) [building] + O(log n) [searching]
  Better than approaches that sort and traverse to the rank position
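
The leftSize bookkeeping and the rank search can be sketched together as follows. This is an illustration under stated assumptions rather than the slides' implementation: keys are std::string, Node, insertKey and findByRank are made-up names, and only the ancestors at which the search turned left are remembered, since those are the nodes whose left subtree gains the new element.

```cpp
#include <string>
#include <vector>

struct Node {
    std::string key;
    int leftSize = 1;          // nodes in the left subtree + this node
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(const std::string& k) : key(k) {}
};

// Inserts key and keeps leftSize correct. Returns false, changing nothing,
// when the key is already present.
bool insertKey(Node*& root, const std::string& key)
{
    std::vector<Node*> wentLeft;       // ancestors whose left subtree will grow
    Node** link = &root;               // the pointer that will receive the new node
    while (*link != nullptr) {
        Node* current = *link;
        if (key == current->key) return false;           // non-unique: no updates
        if (key < current->key) {
            wentLeft.push_back(current);
            link = &current->left;
        } else {
            link = &current->right;
        }
    }
    *link = new Node(key);
    for (Node* n : wentLeft) ++n->leftSize;               // at most height-many updates
    return true;
}

// Rank search: rank 1 is the smallest key. Returns nullptr if rank is out of range.
const Node* findByRank(const Node* root, int rank)
{
    const Node* current = root;
    while (current != nullptr) {
        if (rank == current->leftSize) return current;
        if (rank < current->leftSize) {
            current = current->left;
        } else {
            rank -= current->leftSize;                    // skip left subtree and this node
            current = current->right;
        }
    }
    return nullptr;
}
```

Inserting Mike, John, Thomas, Georgia, Kylie, Shelley, Tyler in that order reproduces the example tree, with leftSize 4 at Mike and 2 at John and Thomas. Both the insert and the rank search follow a single root-to-leaf path, so each is bounded by the height of the tree.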

Threaded Trees: General Trees
[Figure: tree with nodes Mike, John, Georgia, Thomas, Kylie, Shelley, Tyler, Fred, Hall.]
Wasting a lot of links in this tree -> all terminals waste 2 links! Can we make use of those for our good? Yes.

Threaded Trees
[Figure: the same tree (Mike, John, Georgia, Thomas, Kylie, Shelley, Tyler, Fred, Hall) drawn as a threaded tree. Each link carries a t/f flag, and one link is NULL.]

Threaded Trees: Insertion
[Figure: threaded tree with nodes Mike, John, Kylie, Hall, Bill, shown twice.]

Threaded Trees: Insertion
[Figure: threaded tree with nodes Mike, John, Kylie, Hall, Bill, Fred, Jane, shown twice, with a node Kate labelled in the figure.]