Algorithm Design & Data Structures

Prepared by Engistan Team for various Competitive Exams ENGISTAN.COM ALGORITHM DESIGN & DATASTRUCTURES



  • Engistan.com [Algorithm Design & Datastructures]

    Algorithm: An algorithm is a finite set of instructions that, if followed, accomplishes a particular task. Every algorithm must satisfy the following criteria:

    o Input: zero or more quantities are externally supplied.
    o Output: at least one quantity is produced.
    o Definiteness: each instruction is clear and unambiguous.
    o Finiteness: if we trace out the instructions of the algorithm, then in all cases it terminates after a finite number of steps.
    o Effectiveness: every instruction is sufficiently basic that it can, in principle, be carried out by a person using only pencil and paper.

    Performance analysis: The performance of an algorithm is analysed in terms of two major measures:

    o Space complexity: the amount of memory the algorithm needs to run to completion.
    o Time complexity: the amount of computer time the algorithm needs to run to completion.

    The evaluation can be divided into two phases: (i) a priori analysis (estimates made before the algorithm is run) and (ii) a posteriori testing (measurements taken from actual runs).

    Time complexity can be measured for three cases:

    o Worst case: the behaviour of the algorithm on the worst possible input instance.
    o Average case: the behaviour of the algorithm when the input is drawn at random from a given distribution.
    o Best case: the behaviour of the algorithm on the best possible input instance.
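
The difference between the best and worst case can be seen by counting comparisons in a linear search. This is only an illustrative sketch; the function name `linear_search` and the sample data are assumptions, not part of the original notes.

```python
def linear_search(items, target):
    """Return (index, comparisons); index is -1 when target is absent."""
    comparisons = 0
    for index, value in enumerate(items):
        comparisons += 1
        if value == target:
            return index, comparisons
    return -1, comparisons


data = [30, 32, 54, 26, 29]
print(linear_search(data, 30))  # best case: the target is the first element
print(linear_search(data, 99))  # worst case: the target is absent, all n items are compared
```

For n items the best case needs 1 comparison and the worst case needs n.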

    Asymptotic analysis: Asymptotic analysis evaluates the performance of an algorithm in terms of its input size rather than by measuring the actual running time. It describes how the time (or space) taken by the algorithm grows as the input size increases.

    Asymptotic notations: The following three asymptotic notations are most commonly used to represent the time complexity of algorithms: Θ (theta), O (big O) and Ω (omega).

    Θ notation: For a given function g(n), Θ(g(n)) denotes the following set of functions:

    Θ(g(n)) = { f(n) : there exist positive constants c1, c2 and n0 such that 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0 }

    Big O notation: For a given function g(n), O(g(n)) denotes the following set of functions:

    O(g(n)) = { f(n) : there exist positive constants c and n0 such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n0 }
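
As a worked instance of the Big O definition, take the function f(n) = 3n^2 + 2n (chosen here purely for illustration; it does not appear in the original notes). One valid choice of constants is c = 5 and n0 = 1:

```latex
f(n) = 3n^2 + 2n \;\le\; 3n^2 + 2n^2 \;=\; 5n^2 \quad \text{for all } n \ge 1,
\qquad \text{so } f(n) \in O(n^2) \text{ with } c = 5,\; n_0 = 1.
```

The constants are not unique; any larger c or n0 works as well.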

    Divide and Conquer: This technique solves a problem in three steps:

    o Divide: break the problem into smaller subproblems of the same type.
    o Conquer: recursively solve these subproblems.
    o Combine: appropriately combine the answers.

    The following standard algorithms are divide and conquer algorithms:

    o Binary search, a searching algorithm.
    o Quicksort, a sorting algorithm.
    o Merge sort, also a sorting algorithm.
    o Strassen's algorithm, an efficient algorithm for multiplying two matrices.
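
Merge sort shows the divide, conquer and combine steps clearly. The following is a minimal sketch rather than a tuned implementation; the function name `merge_sort` is an assumption made for this example.

```python
def merge_sort(items):
    """Return a new sorted list using divide and conquer."""
    if len(items) <= 1:                 # base case: nothing left to divide
        return list(items)
    mid = len(items) // 2               # divide: split the list in half
    left = merge_sort(items[:mid])      # conquer: sort the left half
    right = merge_sort(items[mid:])     # conquer: sort the right half

    merged = []                         # combine: merge the two sorted halves
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:         # "<=" keeps equal keys in order (stable)
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


print(merge_sort([30, 32, 54, 32, 26, 29, 23, 43, 34, 5]))
```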

    Dynamic programming: This technique also solves a problem by dividing it into subproblems. It is used when the subproblems are not independent, that is, when they share sub-subproblems: each subproblem is solved once and its answer is stored, whereas divide and conquer (DAC) would solve the same subproblem many times. A dynamic programming algorithm is developed in the following four steps:

    1. Characterize the structure of an optimal solution.
    2. Recursively define the value of an optimal solution.
    3. Compute the value of an optimal solution from the bottom up.
    4. Construct an optimal solution for the entire problem from the computed values of smaller subproblems.

    The following standard problems are solved by dynamic programming:

    o 0-1 knapsack problem
    o Shortest path problems
    o Matrix chain multiplication
    o Longest common subsequence
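
The bottom-up approach can be sketched for the longest common subsequence problem. The code below computes only the length of the subsequence; the name `lcs_length` and the sample strings are assumptions made for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b."""
    # table[i][j] holds the LCS length of a[:i] and b[:j].
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Matching symbols extend the LCS of the two shorter prefixes.
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                # Otherwise reuse the better of the two stored subproblems.
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[len(a)][len(b)]


print(lcs_length("AGGTAB", "GXTXAYB"))  # 4, for example "GTAB"
```

Each table entry is computed once from entries already stored, which is what distinguishes this from a plain recursive solution.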

    Greedy algorithm: A greedy algorithm solves a problem by making the choice that seems best at that particular moment. It yields an optimal solution if the problem exhibits the following two properties:

    1. Greedy choice property: a globally optimal solution can be arrived at by making locally optimal (greedy) choices.

    2. Optimal substructure: an optimal solution to the problem contains optimal solutions to its subproblems.

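    The following standard problems are commonly attacked with greedy algorithms:

    o Activity selection problem
    o Fractional knapsack problem
    o Huffman codes
    o Task scheduling problem
    o Travelling salesman problem (here the greedy method is only a heuristic and does not guarantee an optimal tour)
    o Minimum spanning tree problem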
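
The activity selection problem illustrates the greedy choice: always pick the compatible activity that finishes earliest. This is a sketch under the assumption that activities are given as hypothetical `(start, finish)` pairs and that an activity may start exactly when the previous one finishes.

```python
def select_activities(activities):
    """Return a maximum-size list of mutually compatible (start, finish) activities."""
    chosen = []
    last_finish = float("-inf")
    # Greedy choice: consider activities in order of increasing finish time.
    for start, finish in sorted(activities, key=lambda activity: activity[1]):
        if start >= last_finish:        # compatible with everything chosen so far
            chosen.append((start, finish))
            last_finish = finish
    return chosen


print(select_activities([(1, 2), (3, 4), (0, 6), (5, 7), (8, 9), (5, 9)]))
# [(1, 2), (3, 4), (5, 7), (8, 9)]
```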

    Sorting Algorithms with their stability and time complexity:

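    Name            Best         Average                  Worst                                               Stable
    Quick sort      O(n log n)   O(n log n)               O(n^2)                                              Typical in-place sort is not stable; stable versions exist
    Merge sort      O(n log n)   O(n log n)               O(n log n)                                          Yes
    Heap sort       O(n log n)   O(n log n)               O(n log n)                                          No
    Insertion sort  O(n)         O(n^2)                   O(n^2)                                              Yes
    Selection sort  O(n^2)       O(n^2)                   O(n^2)                                              No
    Shell sort      O(n log n)   Depends on gap sequence  Depends on gap sequence; best known is O(n log^2 n)  No
    Bubble sort     O(n)         O(n^2)                   O(n^2)                                              Yes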
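
Stability means that records with equal keys keep their original relative order. The sketch below uses insertion sort on hypothetical `(name, age)` records (the names and the `insertion_sort` signature are assumptions) to show that the two records with age 32 stay in their input order.

```python
def insertion_sort(records, key):
    """Return a new list sorted by key(record); equal keys keep their input order."""
    result = list(records)
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        # Shift only strictly larger keys, so equal keys are never reordered.
        while j >= 0 and key(result[j]) > key(current):
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = current
    return result


people = [("Asha", 32), ("Ravi", 30), ("Meena", 32), ("John", 26)]
print(insertion_sort(people, key=lambda person: person[1]))
# [('John', 26), ('Ravi', 30), ('Asha', 32), ('Meena', 32)]
```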

    Elementary data structures: A data structure is a particular way of storing and organizing data in a computer so that it can be used efficiently. Some elementary data structures are described below.

    Array: An array is a collection of elements, each identified by at least one array index or key. In the example below, Age 0, Age 1, etc. are the index positions and 30, 32, etc. are the array elements.

    Index:    Age 0  Age 1  Age 2  Age 3  Age 4  Age 5  Age 6  Age 7  Age 8  Age 9
    Element:  30     32     54     32     26     29     23     43     34     5

    Types of array:

    o One-dimensional arrays
    o Multidimensional arrays (2-D, 3-D, etc.)

    Stack: A stack is a data structure based on the last in, first out (LIFO) principle, i.e. addition (push) and deletion (pop) are both performed at the same end, called the top.

    Applications of stacks:

    o Polish notation
    o Evaluation of postfix/prefix expressions
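
Postfix evaluation is a direct use of push and pop. The following sketch assumes a space-separated expression containing only numbers and the four binary operators shown; the function name `evaluate_postfix` is hypothetical.

```python
def evaluate_postfix(expression):
    """Evaluate a space-separated postfix expression and return a float."""
    operators = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a / b,
    }
    stack = []
    for token in expression.split():
        if token in operators:
            right = stack.pop()          # the operand pushed last is the right one
            left = stack.pop()
            stack.append(operators[token](left, right))
        else:
            stack.append(float(token))   # operands are pushed as they are read
    return stack.pop()


print(evaluate_postfix("2 3 1 * + 9 -"))  # -4.0
```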

    Queue: A queue is a data structure in which items are added (enqueue) at one end and removed (dequeue) from the other end. It works on the first in, first out (FIFO) principle.

    Types of queue:

    Deque: Short for double-ended queue; elements can be added or removed at either end, but not in the middle.

    Priority queue: A queue in which each element additionally has a "priority" associated with it. An element with high priority is served before an element with low priority. If two elements have the same priority, they are served according to their order in the queue.

    Circular queue: In a circular queue the last node is connected back to the first node to form a circle.
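
A circular queue is often implemented on a fixed-size array whose indices wrap around. The sketch below is one such array-based variant (an assumption; the notes describe the idea in terms of connected nodes), and the class name `CircularQueue` is hypothetical.

```python
class CircularQueue:
    """Fixed-capacity FIFO queue stored in a list used as a ring."""

    def __init__(self, capacity):
        self._items = [None] * capacity
        self._head = 0      # index of the oldest element
        self._size = 0      # number of stored elements

    def enqueue(self, item):
        if self._size == len(self._items):
            raise OverflowError("queue is full")
        tail = (self._head + self._size) % len(self._items)  # wrap around
        self._items[tail] = item
        self._size += 1

    def dequeue(self):
        if self._size == 0:
            raise IndexError("queue is empty")
        item = self._items[self._head]
        self._items[self._head] = None
        self._head = (self._head + 1) % len(self._items)     # wrap around
        self._size -= 1
        return item


queue = CircularQueue(3)
for value in (1, 2, 3):
    queue.enqueue(value)
print(queue.dequeue())  # 1
queue.enqueue(4)        # reuses the slot that was just freed
```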

    Linked list: A linked list is a data structure whose length can be increased or decreased at run time. Each node stores a data item (for example 12 or 99) together with the address of the next node.
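
A singly linked list can be sketched as follows. The class and method names (`Node`, `LinkedList`, `push_front`, `remove`, `to_list`) are assumptions made for this example.

```python
class Node:
    """One list element: a data item plus a reference to the next node."""

    def __init__(self, data, next_node=None):
        self.data = data
        self.next = next_node


class LinkedList:
    def __init__(self):
        self.head = None

    def push_front(self, data):
        """Grow the list at run time by linking a new node before the head."""
        self.head = Node(data, self.head)

    def remove(self, data):
        """Unlink the first node holding data; return True if one was found."""
        previous, current = None, self.head
        while current is not None and current.data != data:
            previous, current = current, current.next
        if current is None:
            return False
        if previous is None:
            self.head = current.next
        else:
            previous.next = current.next
        return True

    def to_list(self):
        values, current = [], self.head
        while current is not None:
            values.append(current.data)
            current = current.next
        return values


numbers = LinkedList()
for value in (37, 99, 12):
    numbers.push_front(value)
print(numbers.to_list())  # [12, 99, 37]
```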

    Tree: A tree is a data structure consisting of nodes organised as a hierarchy. Trees do not contain cycles.

    Binary tree: A binary tree is a tree in which each node has at most two children; a node can have just one child, but it cannot have more than two.

    [Figure: a perfect binary tree and an extreme binary tree]

    Binary search tree: A binary tree is a binary search tree if it has the following properties:

    o The left subtree of a node contains only nodes with keys less than the node's key.
    o The right subtree of a node contains only nodes with keys greater than the node's key.
    o The left and right subtrees must each also be a binary search tree.
    o There must be no duplicate nodes.

    For example: a binary search tree of size 9 and depth 3, with root 8 and leaves 1, 4, 7 and 13.
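
Insertion and lookup in a binary search tree can be sketched as below. The key set 8, 3, 10, 1, 6, 14, 4, 7, 13 and its insertion order are assumptions chosen so that the tree has 9 nodes and root 8 as in the example; the function names are hypothetical.

```python
class BSTNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def bst_insert(root, key):
    """Insert key and return the (possibly new) root; duplicates are ignored."""
    if root is None:
        return BSTNode(key)
    if key < root.key:
        root.left = bst_insert(root.left, key)
    elif key > root.key:
        root.right = bst_insert(root.right, key)
    return root


def bst_contains(root, key):
    """Follow one branch per level: left for smaller keys, right for larger."""
    while root is not None:
        if key == root.key:
            return True
        root = root.left if key < root.key else root.right
    return False


def inorder(root):
    """In-order traversal of a binary search tree yields the keys in sorted order."""
    if root is None:
        return []
    return inorder(root.left) + [root.key] + inorder(root.right)


root = None
for key in (8, 3, 10, 1, 6, 14, 4, 7, 13):
    root = bst_insert(root, key)
print(inorder(root))  # [1, 3, 4, 6, 7, 8, 10, 13, 14]
```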

    AVL tree: An AVL tree is a self-balancing binary search tree. In an AVL tree the heights of the two child subtrees of any node differ by at most one; if at any time they differ by more than one, rebalancing is performed to restore this property.

    [Figure: example of an AVL tree]

    B-tree: A B-tree of order m is an m-way search tree (m-way means that a node can have up to m children) in which:

    o The root has at least one key.
    o Non-root nodes have at least ⌈m/2⌉ subtrees (i.e., at least ⌊(m - 1)/2⌋ keys).
    o All the empty subtrees (i.e., external nodes) are at the same level.

    [Figure: a B-tree of order 3, and a tree that is not a B-tree]

    Graph Algorithms:

    Graph: A graph is a collection of vertices and edges; unlike a tree, it can contain cycles.

    Simple path: a path in which no vertex is repeated.

    Simple cycle: a path whose first and last vertices are the same and in which no other vertex is repeated.

    Complete graph: a graph in which every pair of distinct vertices is joined by an edge.

    BFS (Breadth first search): BFS is a strategy for searching in a graph or tree. The search is performed using two operations:

    o Visit and inspect a node of the graph.
    o Visit the nodes that neighbor the currently visited node.

    Because a graph may contain cycles, every visited node is marked so that it is never visited a second time.

    For example, the breadth-first traversal of the following graph, starting from vertex 2, is 2, 0, 3, 1.
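
Breadth-first search can be sketched with a queue and a visited set. The figure is not reproduced here, so the adjacency lists below are an assumption: a small directed graph chosen to be consistent with the traversal 2, 0, 3, 1 stated above.

```python
from collections import deque


def bfs(graph, start):
    """Return the vertices reachable from start in breadth-first order."""
    visited = {start}           # marking prevents revisiting nodes on a cycle
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbour in graph[node]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return order


# Assumed example graph (adjacency lists of a directed graph).
graph = {0: [1, 2], 1: [2], 2: [0, 3], 3: [3]}
print(bfs(graph, 2))  # [2, 0, 3, 1]
```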

    DFS (Depth first search): DFS is an algorithm for traversing or searching tree or graph data structures. One starts at the root (selecting some arbitrary node as the root in the case of a graph) and explores as far as possible along each branch before backtracking.

    For example, the depth-first traversal of the following graph, starting from vertex 2, is 2, 0, 1, 3.
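
Depth-first search can be sketched recursively. As the figure is not reproduced, the adjacency lists below are an assumed directed graph that is consistent with the traversal 2, 0, 1, 3 given above.

```python
def dfs(graph, start):
    """Return the vertices reachable from start in depth-first order."""
    visited = set()
    order = []

    def visit(node):
        visited.add(node)
        order.append(node)
        for neighbour in graph[node]:
            if neighbour not in visited:
                visit(neighbour)    # go as deep as possible before backtracking

    visit(start)
    return order


# Assumed example graph (adjacency lists of a directed graph).
graph = {0: [1, 2], 1: [2], 2: [0, 3], 3: [3]}
print(dfs(graph, 2))  # [2, 0, 1, 3]
```

The recursion depth grows with the length of the longest explored branch, so very deep graphs would call for an explicit stack instead.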

    Topological sort: A topological sorting of a Directed Acyclic Graph (DAG) is a linear ordering of its vertices such that for every directed edge uv, vertex u comes before vertex v in the ordering. Topological sorting is not possible if the graph is not a DAG. For example, a topological sorting of the following graph is 5 4 2 3 1 0.
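
One common way to compute a topological ordering is a depth-first search that records each vertex after all of its successors. The sketch below assumes the input really is a DAG (it does no cycle detection), and the edge set is an assumption chosen to be consistent with the ordering 5 4 2 3 1 0 quoted above.

```python
def topological_sort(graph):
    """Return one topological ordering of a DAG given as adjacency lists."""
    visited = set()
    finished = []               # vertices in order of completion

    def visit(node):
        visited.add(node)
        for successor in graph[node]:
            if successor not in visited:
                visit(successor)
        finished.append(node)   # recorded only after all successors

    for node in graph:
        if node not in visited:
            visit(node)
    return finished[::-1]       # reverse completion order


# Assumed example DAG: edges 5->2, 5->0, 4->0, 4->1, 2->3, 3->1.
dag = {0: [], 1: [], 2: [3], 3: [1], 4: [0, 1], 5: [2, 0]}
print(topological_sort(dag))  # [5, 4, 2, 3, 1, 0]
```

A DAG usually has several valid orderings; this function returns just one of them.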

    Minimum spanning Tree (MST):

    Given a connected, undirected graph, a spanning tree of that graph is a subgraph that is a tree and connects all the vertices together. A single graph can have many different spanning trees.

    A minimum spanning tree (MST) for a weighted, connected, undirected graph is a spanning tree whose weight is less than or equal to the weight of every other spanning tree. The weight of a spanning tree is the sum of the weights of its edges.

    Prim's algorithm: It is a greedy algorithm that finds a minimum spanning tree for a connected, weighted, undirected graph. This means it finds a subset of the edges that forms a tree that includes every vertex, where the total weight of all the edges in the tree is minimized.

    Kruskal's algorithm: It is a greedy algorithm in graph theory that finds a minimum spanning tree for a connected weighted graph. This means it finds a subset of the edges that forms a tree that includes every vertex, where the total weight of all the edges in the tree is minimized. If the graph is not connected, then it finds a minimum spanning forest (a minimum spanning tree for each connected component).
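
Kruskal's algorithm can be sketched with a sorted edge list and a union-find structure to detect cycles. The function name, the `(weight, u, v)` edge format and the sample graph are assumptions made for this example.

```python
def kruskal(vertex_count, edges):
    """Return the edges (u, v, weight) of a minimum spanning forest.

    vertex_count: vertices are numbered 0 .. vertex_count - 1.
    edges: iterable of (weight, u, v) tuples of an undirected graph.
    """
    parent = list(range(vertex_count))

    def find(v):
        # Find the representative of v's component, halving the path as we go.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    for weight, u, v in sorted(edges):      # greedy: lightest edge first
        root_u, root_v = find(u), find(v)
        if root_u != root_v:                # the edge joins two components
            parent[root_u] = root_v
            tree.append((u, v, weight))
    return tree


edges = [(10, 0, 1), (6, 0, 2), (5, 0, 3), (15, 1, 3), (4, 2, 3)]
print(kruskal(4, edges))  # [(2, 3, 4), (0, 3, 5), (0, 1, 10)]
```

On a disconnected graph the same loop simply produces one tree per connected component.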

    Single source shortest path: In graph theory, the shortest path problem is the problem of finding a path between two vertices (or nodes) in a graph such that the sum of the weights of its constituent edges is minimized. In the single-source variant, shortest paths are found from one source vertex to every other vertex.

    Dijkstra's algorithm: It is a graph search algorithm that solves the single-source shortest path problem for a graph with non-negative edge path costs, producing a shortest path tree. This algorithm is often used in routing and as a subroutine in other graph algorithms.
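
Dijkstra's algorithm is commonly implemented with a priority queue. The sketch below uses Python's `heapq` module and returns only the distances; the graph format (a dict of `(neighbour, weight)` lists) and the sample graph are assumptions.

```python
import heapq


def dijkstra(graph, source):
    """Return {vertex: shortest distance from source} for reachable vertices.

    graph maps each vertex to a list of (neighbour, weight) pairs;
    all weights must be non-negative.
    """
    distances = {source: 0}
    heap = [(0, source)]
    while heap:
        distance, node = heapq.heappop(heap)
        if distance > distances.get(node, float("inf")):
            continue                        # stale heap entry, already improved
        for neighbour, weight in graph[node]:
            candidate = distance + weight
            if candidate < distances.get(neighbour, float("inf")):
                distances[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return distances


graph = {
    "A": [("B", 4), ("C", 1)],
    "B": [("D", 1)],
    "C": [("B", 2), ("D", 5)],
    "D": [],
}
print(dijkstra(graph, "A"))
```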

    Bellman-Ford algorithm: It is an algorithm that computes shortest paths from a single source vertex to all of the other vertices in a weighted digraph. It is slower than Dijkstra's algorithm for the same problem, but more versatile, as it is capable of handling graphs in which some of the edge weights are negative numbers.
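
The Bellman-Ford relaxation loop can be sketched as below. The final pass that reports a negative-weight cycle is a standard addition not described in the notes; the function name and the `(u, v, weight)` edge format are assumptions.

```python
def bellman_ford(vertex_count, edges, source):
    """Return the list of shortest distances from source.

    edges: list of directed (u, v, weight) tuples; weights may be negative.
    Raises ValueError if a negative-weight cycle is reachable from source.
    """
    distances = [float("inf")] * vertex_count
    distances[source] = 0
    for _ in range(vertex_count - 1):       # a shortest path has at most V - 1 edges
        changed = False
        for u, v, weight in edges:
            if distances[u] + weight < distances[v]:
                distances[v] = distances[u] + weight
                changed = True
        if not changed:                     # early exit once nothing improves
            break
    for u, v, weight in edges:              # any further improvement means a cycle
        if distances[u] + weight < distances[v]:
            raise ValueError("negative-weight cycle reachable from the source")
    return distances


edges = [(0, 1, 4), (0, 2, 5), (2, 1, -3), (1, 3, 2)]
print(bellman_ford(4, edges, 0))  # [0, 2, 5, 4]
```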

    Floyd-Warshall algorithm: It is a graph analysis algorithm for finding shortest paths in a weighted graph with positive or negative edge weights (but with no negative cycles) and also for finding the transitive closure of a relation R. A single execution of the algorithm finds the lengths (summed weights) of the shortest paths between all pairs of vertices, though it does not return details of the paths themselves.
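
The Floyd-Warshall algorithm can be sketched on an adjacency matrix. The sample matrix is an assumption; `INF` marks a missing edge, and the input is assumed to contain no negative cycles.

```python
INF = float("inf")


def floyd_warshall(weights):
    """Return the matrix of shortest path lengths between all pairs of vertices.

    weights[i][j] is the weight of edge i -> j, INF if absent, 0 on the diagonal.
    """
    n = len(weights)
    dist = [row[:] for row in weights]      # work on a copy of the input
    for k in range(n):                      # allow vertex k as an intermediate stop
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


weights = [
    [0, 5, INF, 10],
    [INF, 0, 3, INF],
    [INF, INF, 0, 1],
    [INF, INF, INF, 0],
]
for row in floyd_warshall(weights):
    print(row)
```

For this matrix the shortest distance from vertex 0 to vertex 3 drops from 10 to 9 by going through vertices 1 and 2.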

    Articulation Points (or Cut Vertices) in a Graph: A vertex in an undirected connected graph is an articulation point (or cut vertex) if removing it (and edges through it) disconnects the graph. Following are some example graphs with articulation points encircled with red color.

    Biconnected graph: An undirected graph is called biconnected if there are two vertex-disjoint paths between any two vertices. In a biconnected graph, there is a simple cycle through any two vertices. By convention, two nodes connected by an edge form a biconnected graph, although this case does not satisfy the above properties. Following are some examples.

    Bridges in a graph: An edge in an undirected connected graph is a bridge if removing it disconnects the graph. Following are some example graphs with bridges highlighted with red color.

    Eulerian path and circuit: An Eulerian path is a path in a graph that visits every edge exactly once.

    An Eulerian circuit is an Eulerian path that starts and ends on the same vertex.

    Strongly Connected Components: A directed graph is strongly connected if there is a directed path from every vertex to every other vertex. A strongly connected component (SCC) of a directed graph is a maximal strongly connected subgraph. For example, there are 3 SCCs in the following graph.
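
Strongly connected components can be found with two depth-first passes, a method usually attributed to Kosaraju (the notes do not name an algorithm, so this choice is an assumption). The sample graph is also assumed; it was chosen so that it has 3 SCCs.

```python
def strongly_connected_components(graph):
    """Return the SCCs of a directed graph; every vertex must be a key of graph."""
    # Pass 1: record vertices in order of DFS completion.
    visited, finish_order = set(), []

    def visit(node):
        visited.add(node)
        for successor in graph[node]:
            if successor not in visited:
                visit(successor)
        finish_order.append(node)

    for node in graph:
        if node not in visited:
            visit(node)

    # Build the graph with every edge reversed.
    reversed_graph = {node: [] for node in graph}
    for node, successors in graph.items():
        for successor in successors:
            reversed_graph[successor].append(node)

    # Pass 2: DFS on the reversed graph in decreasing completion order.
    assigned, components = set(), []

    def collect(node, component):
        assigned.add(node)
        component.append(node)
        for predecessor in reversed_graph[node]:
            if predecessor not in assigned:
                collect(predecessor, component)

    for node in reversed(finish_order):
        if node not in assigned:
            component = []
            collect(node, component)
            components.append(sorted(component))
    return components


# Assumed example graph: edges 1->0, 0->2, 2->1, 0->3, 3->4.
graph = {0: [2, 3], 1: [0], 2: [1], 3: [4], 4: []}
print(strongly_connected_components(graph))  # [[0, 1, 2], [3], [4]]
```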

    Graph coloring: The graph coloring problem is to assign colors to certain elements of a graph subject to certain constraints.

    Vertex coloring: It is the most common graph coloring problem. The problem is, given m colors, to find a way of coloring the vertices of a graph such that no two adjacent vertices are colored with the same color.

    The other graph coloring problems, such as edge coloring (no vertex is incident to two edges of the same color) and face coloring (geographical map coloring), can be transformed into vertex coloring.

    Chromatic number: The smallest number of colors needed to color a graph G is called its chromatic number. For example, the following graph can be colored with a minimum of 3 colors.

    Finding the chromatic number of a given graph is NP-hard; the corresponding decision problem (can the graph be colored with at most k colors?) is NP-complete.
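
Since computing the chromatic number exactly is hard, a simple greedy heuristic is often used to obtain a valid vertex coloring. The sketch below is such a heuristic, not an exact algorithm: the number of colors it uses is only an upper bound on the chromatic number and depends on the vertex order. The function name and sample graph are assumptions.

```python
def greedy_coloring(graph):
    """Return {vertex: color index} such that adjacent vertices differ.

    graph maps each vertex to a list of its neighbours (undirected graph).
    """
    colors = {}
    for vertex in graph:
        # Colors already taken by neighbours that have been colored.
        used = {colors[n] for n in graph[vertex] if n in colors}
        color = 0
        while color in used:    # pick the smallest color not used by a neighbour
            color += 1
        colors[vertex] = color
    return colors


# Assumed example: a triangle 0-1-2 with an extra vertex 3 attached to 2.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(greedy_coloring(graph))  # {0: 0, 1: 1, 2: 2, 3: 0}
```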

    NP, P, NP-complete and NP-hard problems:

    P: The set of decision problems that can be solved by a deterministic Turing machine in polynomial time.

    NP: The set of decision problems that can be solved by a non-deterministic Turing machine in polynomial time.

    P is subset of NP (any problem that can be solved by deterministic machine in polynomial time can also be solved by non-deterministic machine in polynomial time).

    NP-complete problems are the hardest problems in NP. A decision problem L is NP-complete if:

    1. L is in NP (any given solution to an NP-complete problem can be verified quickly, but no efficient algorithm for finding a solution is known).

    2. Every problem in NP is reducible to L in polynomial time.
