Interaction LBSC 796/INFM 718R Douglas W. Oard Week 4, February 23, 2011.
Algorithms Session 7 LBSC 790 / INFM 718B Building the Human-Computer Interface.
Algorithms
Session 7
LBSC 790 / INFM 718B
Building the Human-Computer Interface
Agenda: Weeks 7 and 8
• Questions
• Some useful algorithms
• Project
• Some useful data structures
– Including Java implementations
Why Study Algorithms?
• Some generic problems come up repeatedly
– Sorting
– Searching
– Graph traversal
• Need a way to compare alternative solutions
• Reusing algorithms is easy and productive
– Focusing on the algorithm reveals the key ideas
– Language and interface make reusing code hard
Sorting
• Given an array, put the elements in order
– Numerical or lexicographic
• Desirable characteristics
– Fast
– In place (don’t need a second array)
– Able to handle any values for the elements
– Easy to understand
Insertion Sort
• Simple, able to handle any data
• Grow a sorted array from the beginning
– Create an empty array of the proper size
– Pick the elements one at a time in any order
– Put them in the new array in sorted order (if the element is not last, make room for it)
– Repeat until done
• Can be done in place if well designed
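The in-place variant mentioned above can be sketched roughly as follows. This is a minimal illustration, not the course's own code: the class name InsertionSortSketch and the sample values are assumptions made for the example.

```java
import java.util.Arrays;

// Hypothetical class name, chosen only for this sketch.
class InsertionSortSketch {
    // Sorts the array in place by growing a sorted prefix from the left.
    static void insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int current = a[i];          // next element to insert
            int j = i - 1;
            // Shift larger elements one slot right to make room.
            while (j >= 0 && a[j] > current) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = current;          // drop the element into the gap
        }
    }

    public static void main(String[] args) {
        int[] data = {90, 11, 27, 31, 4, 16, 11, 37}; // illustrative values
        insertionSort(data);
        System.out.println(Arrays.toString(data));
    }
}
```

The sorted prefix and the unsorted remainder share one array, which is why no second array is needed.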
Insertion Sort
Input: 90 11 27 31 4 16 11 37
Sorted array after each insertion:
90
11 90
11 27 90
11 27 31 90
4 11 27 31 90
4 11 16 27 31 90
4 11 11 16 27 31 90
4 11 11 16 27 31 37 90
Insertion Sort
• Sorting can actually be done in place
– Never need the same element in both arrays
• Every insertion can cause lots of copying
– If there are N elements, need to do N insertions
– Worst case is about N/2 copies per insertion
– N elements can take nearly N^2 operations to sort
• But each operation is very fast
– So this is fine if N is small (20 or so)
Merge Sort
• Fast, able to handle any data
– But can’t be done in place
• View the array as a set of small sorted arrays
– Initially only the 1-element “arrays” are sorted
• Merge pairs of sorted arrays
– Repeatedly take the smaller of the two arrays’ first remaining elements
– This produces sorted arrays that are twice as long
• Repeat until only one array remains
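The pairwise merging described above can be sketched as a bottom-up merge sort. This is an illustrative sketch under assumptions: the class and method names are hypothetical, and it sorts int values only.

```java
import java.util.Arrays;

// Hypothetical class name, chosen only for this sketch.
class BottomUpMergeSortSketch {
    // Returns a sorted copy; the second array is why this is not "in place".
    static int[] mergeSort(int[] input) {
        int n = input.length;
        int[] src = Arrays.copyOf(input, n);
        int[] dst = new int[n];
        // width is the length of the already-sorted runs: 1, 2, 4, ...
        for (int width = 1; width < n; width *= 2) {
            for (int lo = 0; lo < n; lo += 2 * width) {
                int mid = Math.min(lo + width, n);
                int hi = Math.min(lo + 2 * width, n);
                merge(src, dst, lo, mid, hi);
            }
            int[] tmp = src;   // the merged runs become the next input
            src = dst;
            dst = tmp;
        }
        return src;
    }

    // Merges sorted runs src[lo..mid) and src[mid..hi) into dst[lo..hi).
    static void merge(int[] src, int[] dst, int lo, int mid, int hi) {
        int i = lo, j = mid;
        for (int k = lo; k < hi; k++) {
            if (i < mid && (j >= hi || src[i] <= src[j])) {
                dst[k] = src[i++];
            } else {
                dst[k] = src[j++];
            }
        }
    }

    public static void main(String[] args) {
        int[] data = {90, 11, 27, 31, 4, 16, 11, 37}; // illustrative values
        System.out.println(Arrays.toString(mergeSort(data)));
    }
}
```

Each pass of the outer loop doubles the run length, so N elements need about log2(N) passes of N steps each.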
Merge Sort
Input: 90 11 27 31 4 16 11 37
Sorted pairs: [11 90] [27 31] [4 16] [11 37]
Sorted fours: [11 27 31 90] [4 11 16 37]
Sorted result: 4 11 11 16 27 31 37 90
Merge Sort
• Each array size requires N steps
– But 8 elements requires only 3 array sizes
• In general, 2^k elements require k array sizes
– So the complexity is N*log(N)
• No faster sort (based on comparisons) exists
– Faster sorts require assumptions about the data
– There are other N*log(N) sorts, though
• Merge sort is most often used for large disk files
Computational Complexity
• Run time typically depends on:
– How long things take to set up
– How many operations there are in each step
– How many steps there are
• Insertion sort can be faster than merge sort
– One array, one operation per step
– But N*log(N) eventually beats N^2 for large N (and once it does, the advantage increases rapidly)
Divide and Conquer
• Split a problem into simpler subproblems
– Keep doing that until trivial subproblems result
• Solve the trivial subproblems
• Combine the results to solve a larger problem
– Keep doing that until the full problem is solved
• Merge sort illustrates divide and conquer
– But it is a general strategy that is often helpful
Recursion
• Divide and conquer problems are recursive
– Solve the same problem at increasing granularity
• Construct a Java method to solve the problem
– Divide the problem into subproblems
– Call the same method to solve each subproblem (unless the subproblems are trivial)
– Use the parameters to control the granularity
• See this week’s notes page for merge sort example
A Recipe for Recursion
• First, craft a divide and conquer strategy
• Create a non-recursive top-level method
– Calls recursive method with initial parameters
• In the recursive method:
– First solve the problem if it is trivial and return (be sure you eventually get here!)
– Otherwise, split the problem and call itself
– Combine the results and return them
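One way to follow this recipe is a recursive merge sort. The sketch below is an assumption-laden illustration (the class name and the half-open index convention are choices made here), not the example from the course notes page.

```java
import java.util.Arrays;

// Hypothetical class name, chosen only for this sketch.
class RecursiveMergeSortSketch {
    // Non-recursive top-level method: supplies the initial parameters.
    static void sort(int[] a) {
        sort(a, 0, a.length);
    }

    // Recursive method: sorts the half-open range a[lo..hi).
    private static void sort(int[] a, int lo, int hi) {
        if (hi - lo <= 1) {
            return;                      // trivial case: 0 or 1 element
        }
        int mid = lo + (hi - lo) / 2;    // split the problem
        sort(a, lo, mid);                // solve each subproblem
        sort(a, mid, hi);
        merge(a, lo, mid, hi);           // combine the results
    }

    // Merges the sorted ranges a[lo..mid) and a[mid..hi).
    private static void merge(int[] a, int lo, int mid, int hi) {
        int[] left = Arrays.copyOfRange(a, lo, mid);
        int[] right = Arrays.copyOfRange(a, mid, hi);
        int i = 0, j = 0, k = lo;
        while (i < left.length && j < right.length) {
            a[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        }
        while (i < left.length) a[k++] = left[i++];
        while (j < right.length) a[k++] = right[j++];
    }

    public static void main(String[] args) {
        int[] data = {90, 11, 27, 31, 4, 16, 11, 37}; // illustrative values
        sort(data);
        System.out.println(Arrays.toString(data));
    }
}
```

The lo and hi parameters control the granularity: each call handles half the range of its caller, so the trivial case is always reached.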
Search
• Find something by following links
– Web pages
– Connections in the flight finder
– Winning moves in chess
• This may seem like an easy problem
– But computational complexity can get really bad
– Simple tricks can help in some cases
Web Crawlers
• Goal is to find everything on the web
– Build a balanced tree, sorted by search terms
• Start anywhere, follow every link
• If every page has 1 kB of text and 10 links
– Then 10 levels would find a terabyte of data!
• Avoid links that are likely to be uninteresting
• Detect duplicates quickly with hashing
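The "follow every link, detect duplicates with hashing" idea can be sketched on a small in-memory link graph. Everything here is an assumption for illustration: a real crawler fetches pages over the network, while this sketch takes the links as a Map, and the names CrawlSketch and crawl are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical class name, chosen only for this sketch.
class CrawlSketch {
    // Visits every page reachable from start, each exactly once.
    // links maps a page name to the pages it links to (assumed input format).
    static List<String> crawl(String start, Map<String, List<String>> links) {
        Set<String> seen = new HashSet<>();        // hashing detects duplicates quickly
        Deque<String> frontier = new ArrayDeque<>();
        List<String> visited = new ArrayList<>();
        seen.add(start);
        frontier.add(start);
        while (!frontier.isEmpty()) {
            String page = frontier.remove();
            visited.add(page);
            for (String next : links.getOrDefault(page, Collections.emptyList())) {
                // add() returns false if the page was already seen
                if (seen.add(next)) {
                    frontier.add(next);
                }
            }
        }
        return visited;
    }
}
```

Without the seen set, any cycle of links would make the crawl loop forever.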
Outside-In Search
• Explore multiple paths between two points
– Usually trying to find the best by some measure
• Flight finder searches like a web crawler
– Every possible continuation of every route
• Also search backward from the destination
• Assuming 10 departures per airfield:
– Searching forward through 3 connections takes the flight finder 10,000 steps
– Searching 1 connection from each end takes only 200 steps
How to Win at Chess
• The paths are the legal moves
– And the “places” are possible board positions
• You are seeking to make things better
– Your opponent seeks to make things worse
• Such “zero sum games” are common
– Although many lack chess’ shared information
• Any problem structure makes search easier
– The trick is to exploit constraints effectively
Minimax Searching
• Decide how many half-moves to look ahead
• Develop a scoring strategy for final positions
– Based on piece count and positional factors
• Follow a promising path
– Helpful to guess the best moves for each side
• With several moves available, pick the best
– But stop any search that can’t improve things
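A minimal sketch of minimax with the "stop any search that can't improve things" cutoff (commonly called alpha-beta pruning) might look like this. The Node class and its fixed leaf scores are assumptions for illustration; a real chess program would generate moves and score board positions instead.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical class name, chosen only for this sketch.
class MinimaxSketch {
    // A position: either a leaf with a score, or an inner node with children.
    static class Node {
        final int score;               // meaningful only at leaves in this sketch
        final List<Node> children;
        Node(int score) { this.score = score; this.children = Collections.emptyList(); }
        Node(Node... kids) { this.score = 0; this.children = Arrays.asList(kids); }
    }

    // depth counts the half-moves still to look ahead.
    static int minimax(Node node, int depth, boolean maximizing, int alpha, int beta) {
        if (depth == 0 || node.children.isEmpty()) {
            return node.score;         // final position: apply the scoring strategy
        }
        if (maximizing) {
            int best = Integer.MIN_VALUE;
            for (Node child : node.children) {
                best = Math.max(best, minimax(child, depth - 1, false, alpha, beta));
                alpha = Math.max(alpha, best);
                if (alpha >= beta) break;   // opponent would never allow this line
            }
            return best;
        } else {
            int best = Integer.MAX_VALUE;
            for (Node child : node.children) {
                best = Math.min(best, minimax(child, depth - 1, true, alpha, beta));
                beta = Math.min(beta, best);
                if (alpha >= beta) break;   // we already have a better alternative
            }
            return best;
        }
    }
}
```

A typical top-level call passes Integer.MIN_VALUE and Integer.MAX_VALUE as the initial alpha and beta.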
Minimax Search Example
[Figure: game tree with alternating Max and Min levels; node scores shown include -3, 0, 3, and 5]
Traveling Salesperson Problem
• Find the cheapest way of visiting 10 cities
– Given the airfare between every city pair
– Only visit each city once, finish where you start
• There are only 90 city pairs
– But there are a LOT of possible tours
• The best known algorithm is VERY slow
– Because the problem is “NP complete”
NP Complete Problems
• No “polynomial time” algorithm is known
– Not N^2, N^3, N^4, …
• Haven’t proved that none exists
– But if one does, many hard problems would be easy
• Approximate solutions with heuristic methods
– Greedy methods
– Genetic algorithms
– Simulated annealing
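As an illustration of a greedy method, the sketch below builds a traveling salesperson tour by always flying to the cheapest unvisited city (the nearest-neighbor heuristic). The class name, the cost-matrix input format, and the choice of city 0 as the start are assumptions; the result is a quick approximation, not necessarily the cheapest tour.

```java
// Hypothetical class name, chosen only for this sketch.
class GreedyTourSketch {
    // cost[i][j] is the fare from city i to city j (assumed input format).
    // Returns the order in which cities are visited, starting at city 0.
    static int[] nearestNeighborTour(double[][] cost) {
        int n = cost.length;
        int[] tour = new int[n];
        if (n == 0) return tour;
        boolean[] visited = new boolean[n];
        int current = 0;
        visited[0] = true;
        for (int step = 1; step < n; step++) {
            int best = -1;
            for (int city = 0; city < n; city++) {
                if (!visited[city]
                        && (best == -1 || cost[current][city] < cost[current][best])) {
                    best = city;       // cheapest unvisited city so far
                }
            }
            visited[best] = true;
            tour[step] = best;
            current = best;
        }
        return tour;
    }

    // Total fare of the tour, including the flight back to the start.
    static double tourCost(double[][] cost, int[] tour) {
        double total = 0;
        for (int i = 0; i < tour.length; i++) {
            total += cost[tour[i]][tour[(i + 1) % tour.length]];
        }
        return total;
    }
}
```

This runs in about N^2 steps, which is the trade the slide describes: a fast answer that may not be optimal instead of an exact answer that takes far too long.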
What’s Wrong With Arrays?
• Must specify maximum size when declared
– And the maximum possible size is always used
• Can only index with integers
– For efficiency they must be densely packed
• Adding new elements is costly
– If the elements are stored in order
• Every element must be the same type
What’s Good About Arrays?
• Can get any element quickly
– If you know what position it is in
• Natural data structure to use with a loop
– Do the same thing to different data
• Efficiently uses memory
– If the array is densely packed
• Naturally encodes an order among elements
Linked Lists
• A way of making variable length arrays
– In which insertions and deletions are easy
• Very easy to do in Java
• But nothing comes for free
– Finding an element can be slow
– Extra space is needed for the links
– There is more complexity to keep track of
Making a Linked List
• In Java, all objects are accessed by reference
– Object variables store the location of the object
• New instances must be explicitly constructed
• Add reference to next element in each object
– Handy to also have a reference to the prior one
• Keep a reference to the first object
– And then walk down the list using a loop
Linked List Example
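[Figure: first → Jill → Joe → Tom]
public static void main(String[] argv) {
    Student first;   // reference to the first element of the list
    …
}
public class Student {
    String name;
    public Student next;   // reference to the next element, or null at the end
}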
Linked List Operations
• Add an element
– Easy to put it in sorted order
• Examine every element
– Just as fast as using an array
• Find just one element
– May be as slow as examining every element
• Delete an element after you find it
– Fast if you keep both next and prior links
Linked List Insertion
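[Figure: first → Jill → Joe → Tom]
// Assumes a Student(String name, Student next) constructor and a field named first.
public void insert(String newName) {
    // Empty list, or the new name belongs before the first element
    if (first == null || first.name.compareTo(newName) > 0) {
        first = new Student(newName, first);
        return;
    }
    Student temp = first;
    // Advance while the next name still sorts before the new one
    while (temp.next != null && temp.next.name.compareTo(newName) < 0) {
        temp = temp.next;
    }
    temp.next = new Student(newName, temp.next);
}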
Trees
• Linked list with multiple next elements
– Just as easy to create as linked lists
– Binary trees are useful for relationships like “<”
• Insertions and deletions are easy
• Useful for fast searching of large collections
– But only if the tree is balanced
• Efficiently balancing trees is complex, but possible
Binary Tree Example
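[Figure: binary tree of Student objects (Jill, Joe, Tom) reached from root]
public class Student {
    String name;
    public Student left;    // subtree of names that sort before this one
    public Student right;   // subtree of names that sort after this one
}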
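A binary search tree over names, in the spirit of the Student class above, can be sketched as follows. The class name NameTreeSketch and its Node type are assumptions for illustration, and the sketch does no balancing, so a sorted insertion order would degrade it to a linked list.

```java
// Hypothetical class name, chosen only for this sketch.
class NameTreeSketch {
    static class Node {
        String name;
        Node left, right;
        Node(String name) { this.name = name; }
    }

    Node root;

    // Inserts a name, keeping smaller names to the left and larger to the right.
    void insert(String name) {
        root = insert(root, name);
    }

    private Node insert(Node node, String name) {
        if (node == null) return new Node(name);      // found the empty spot
        int cmp = name.compareTo(node.name);
        if (cmp < 0) node.left = insert(node.left, name);
        else if (cmp > 0) node.right = insert(node.right, name);
        return node;                                   // duplicates are ignored
    }

    // Each comparison discards one whole subtree from the search.
    boolean contains(String name) {
        Node node = root;
        while (node != null) {
            int cmp = name.compareTo(node.name);
            if (cmp == 0) return true;
            node = (cmp < 0) ? node.left : node.right;
        }
        return false;
    }
}
```

Inserting Joe, then Jill, then Tom yields Joe at the root with Jill on the left and Tom on the right.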
Data Structures in Java
• Resizable array [O(n) insertion, O(1) access]:
– ArrayList
• Linked list [O(1) insertion, O(n) access, ordered]:
– LinkedList
• Hash table [object index, unsorted, O(1)]:
– HashSet (key only)
– HashMap (key+value)
• Balanced tree [object index, sorted, O(log n)]:
– TreeSet (key only)
– TreeMap (key+value)
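The sketch below exercises the java.util classes listed above on the same three names, to show how the choice of structure changes ordering and lookup. The class name CollectionsSketch and the helper sortedCopy are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical class name, chosen only for this sketch.
class CollectionsSketch {
    // A TreeSet keeps its elements sorted, so copying it out gives sorted order.
    static List<String> sortedCopy(Collection<String> names) {
        return new ArrayList<>(new TreeSet<>(names));
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Tom", "Jill", "Joe");

        List<String> array = new ArrayList<>(names);
        System.out.println(array.get(1));          // fast access by position

        LinkedList<String> linked = new LinkedList<>(names);
        linked.addFirst("Ann");                    // cheap insertion at the front
        System.out.println(linked);

        Set<String> hashed = new HashSet<>(names);
        System.out.println(hashed.contains("Joe")); // fast lookup, no order promised

        System.out.println(sortedCopy(names));     // sorted iteration order
    }
}
```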
Hashtables
• Find an element nearly as fast as in an array
– With easy insertion and deletion
– But without the ability to keep things in order
• Fairly complex to implement
– But Java defines a class to make it simple
• Helpful to understand how it works
– “One size fits all” approaches are inefficient
How Hashtables Work
• Create an array with enough room
– It helps a lot to guess the right size first
• Choose a variable in each object as the key
– But it doesn’t need to be an integer
• Choose a spot in the array for each key value
– Using a fast mathematical function
– Best if things are scattered around well
• Choose how to handle multiple keys at one spot
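These steps can be sketched as a tiny hash table that handles multiple keys at one spot by chaining them in a list. This is a teaching sketch under assumptions (String keys, int values, fixed capacity, hypothetical class name), not how java.util.HashMap is actually implemented.

```java
import java.util.LinkedList;

// Hypothetical class name, chosen only for this sketch.
class ChainedHashSketch {
    private static class Entry {
        final String key;
        int value;
        Entry(String key, int value) { this.key = key; this.value = value; }
    }

    private final LinkedList<Entry>[] buckets;   // the array "with enough room"

    @SuppressWarnings("unchecked")
    ChainedHashSketch(int capacity) {
        buckets = new LinkedList[capacity];
        for (int i = 0; i < capacity; i++) {
            buckets[i] = new LinkedList<>();
        }
    }

    // The fast mathematical function: maps a key to a spot in the array.
    private int spot(String key) {
        return Math.floorMod(key.hashCode(), buckets.length);
    }

    void put(String key, int value) {
        for (Entry e : buckets[spot(key)]) {
            if (e.key.equals(key)) {   // key already present: replace its value
                e.value = value;
                return;
            }
        }
        buckets[spot(key)].add(new Entry(key, value));
    }

    // Returns the stored value, or null if the key is absent.
    Integer get(String key) {
        for (Entry e : buckets[spot(key)]) {
            if (e.key.equals(key)) return e.value;
        }
        return null;
    }
}
```

With a capacity that is too small, the chains grow long and lookups slow toward a linked-list search, which is why guessing the right size helps.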
Java HashMap Class
• Hashtables are objects like any other
– import java.util.*
– Must be declared and instantiated
• The constructor optionally takes a size parameter
• put(key, value) adds an element
• containsKey(key) checks for an element
• get(key) returns the “value” object for that key
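A short usage sketch of the three methods named above. The class name HashMapSketch, the helper buildAges, and the sample names and ages are all hypothetical data made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class name, chosen only for this sketch.
class HashMapSketch {
    // Builds a small map from name (key) to age (value); the data is invented.
    static Map<String, Integer> buildAges() {
        Map<String, Integer> ages = new HashMap<>(16); // optional initial capacity
        ages.put("Jill", 20);
        ages.put("Joe", 22);
        ages.put("Tom", 19);
        return ages;
    }

    public static void main(String[] args) {
        Map<String, Integer> ages = buildAges();
        if (ages.containsKey("Joe")) {
            System.out.println("Joe is " + ages.get("Joe"));
        }
        // get returns null when the key is absent
        System.out.println(ages.get("Ann"));
    }
}
```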
Stacks
• Maintain an implicit order
– Last In-First Out (LIFO)
• Easy additions and deletions
– push and pop operations
• Maps naturally to certain problems
– Interrupt a task to compute an intermediate value
• Implemented as a Java class
– import java.util.*
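As one illustration of push and pop, the sketch below uses java.util.Stack to check that brackets in a string are properly nested. The bracket-matching task and the class name StackSketch are assumptions chosen for the example; each opening bracket "interrupts" the current level until its partner closes it.

```java
import java.util.Stack;

// Hypothetical class name, chosen only for this sketch.
class StackSketch {
    // Returns true if every ( and [ is closed by the matching bracket, in LIFO order.
    static boolean balanced(String text) {
        Stack<Character> open = new Stack<>();
        for (char c : text.toCharArray()) {
            if (c == '(' || c == '[') {
                open.push(c);                      // remember the unfinished bracket
            } else if (c == ')' || c == ']') {
                if (open.isEmpty()) return false;  // nothing left to close
                char last = open.pop();            // most recent opener comes back first
                if ((c == ')' && last != '(') || (c == ']' && last != '[')) {
                    return false;
                }
            }
        }
        return open.isEmpty();                     // leftovers mean an unclosed bracket
    }
}
```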
Choosing a Data Structure
• What operations do you need to perform?
– Reading every element is typically easy
– Other things depend on the representation
• Hashing finds single elements quickly
– But cannot preserve order
• Stacks and linked lists preserve order easily
– But they can only read one element at any time
• Balanced trees are best when you need both
• Which operations dominate the processing?