DSC - Handout


  • 7/31/2019 DSC - Handout

    1/85

    Handout: Data Structures with C

    Version: DSC/Handout/0307/2.1

    Date: 05-03-07

    Cognizant

    500 Glen Pointe Center West

    Teaneck, NJ 07666

    Ph: 201-801-0233

    www.cognizant.com


    Data Structures with C

    Page 2 Copyright 2007, Cognizant Technology Solutions, All Rights Reserved

    C3: Protected

    TABLE OF CONTENTS

    Introduction
        About this Document
        Target Audience
        Objectives
        Pre-requisite

    Session 1: Introduction to Data Structure
        Learning Objectives
        Overview
        Summary
        Test your Understanding

    Session 2: Arrays
        Learning Objectives
        Overview
        Summary
        Test your Understanding

    Session 4: Linked Lists
        Learning Objectives
        Linked lists
        Summary
        Test your Understanding

    Session 6: Sorting and Searching
        Learning Objectives
        Sorting
        Summary
        Test your Understanding

    Session 8: Trees
        Learning Objectives
        Overview
        Summary
        Test your Understanding


    Introduction

    About this Document

    This module provides the participants with the basic knowledge to understand data structures and to measure the performance of various algorithms used in different problems.

    Target Audience

    In-Campus Trainees

    Objectives

    Acquire the basic knowledge on data structures

    Select the appropriate data structures for the application

    Analyze the complexity of the algorithm

    Apply data structures using C

    Pre-requisite

    The participants must have basic knowledge in writing programs using C.


    Session 1: Introduction to Data Structure

    Learning Objectives

    After completing this chapter, you will be able to:

    Define a data structure

    List the types of data structures

    Identify how to analyze and select data structure for a particular application

    Overview

    The study of computer science involves the study of the organization, manipulation and utilization of data in a computer in order to improve the efficiency of the processor and memory.

    Data type and data structure

    Data can be represented in the form of binary digits in memory. A binary digit can be stored using the basic unit of data called a bit. A bit can represent either a zero or a one.

    Data type

    A data type defines the specification of a set of data and the characteristics for that data. A data type is derived from the basic nature of the data that are stored for processing rather than from their implementation.

    Data Structure

    Data structure refers to the actual implementation of the data type and offers a way of storing data in an efficient manner. Any data structure is designed to organize data to suit a specific purpose so that it can be accessed and worked on in appropriate ways, both effectively and efficiently. In computer programming, a data structure may be selected or designed to store data for the purpose of working on it by various algorithms.

    The choice of a data structure begins from the choice of an abstract data type. Data structures are implemented using the data types, references and operations on them that are provided by a programming language.

    Example data structures include:

    Arrays

    Stacks

    Queues

    Linked Lists


    Abstract Data Types (ADT)

    An Abstract Data Type (ADT) defines data together with the operations. An ADT is specified independently of any particular implementation. An ADT depicts the basic nature or concept of the data structure rather than the implementation details of the data. A stack or a queue is an example of an ADT. Both stacks and queues can be implemented using an array or using a linked list.

    Types of Data Structures

    The different types of data structures include linear data structures, hash tables and non linear data structures. The structure of a data file defines how records, or rows of data, are related to fields, or columns of data.

    Linear structures

    A data structure is said to be linear if its elements form a sequence or a linear list.

    Some of the linear structures are:

    Array: Fixed-size

    Linked-list: Variable-size

    Stack: Add to top and remove from top

    Queue: Add to back and remove from front

    Priority queue: Add anywhere, remove the highest priority

    Possible operations on these linear structures include:

    Traversal: Travel through the data structure

    Search: Traversal through the data structure for a given element

    Insertion: Adding new elements to the data structure

    Deletion: Removing an element from the data structure

    Sorting: Arranging the elements in some type of order

    Merging: Combining two similar data structures into one

    Hash table

    A hash table, or a hash map, is a data structure that associates keys with values. A function termed the hash function is applied on the key to find the address of the record.

    Non linear structures

    A data structure is said to be non linear if its elements are not in a sequence. The elements in the data structure are not arranged in a linear manner; rather, it has a branched structure.

    Some of the non linear structures are:

    Tree: Collection of nodes represented in hierarchical fashion

    Graph: Collection of nodes connected together through edges


    Selecting a Data Structure

    Data structures that suit certain applications may not suit certain other applications. The choice of the data structure often begins from the choice of an abstract data structure, that is, an abstract storage for data defined in terms of the set of operations to be performed on the data and the computational complexity of performing these operations, regardless of the implementation in a concrete data structure.

    Selection of an abstract data structure is crucial in the design of efficient algorithms and in estimating their computational complexity, while selection of concrete data structures is important for efficient implementation of algorithms. The names of many abstract data structures and abstract data types match the names of concrete data structures.

    In the design of many types of programs, the choice of data structures is a primary design consideration, as experience in building large systems has shown that the difficulty of implementation and the quality and performance of the final result depend heavily on choosing the best data structure.

    Performance Analysis and Measurements

    Performance analysis is often made in terms of the best, worst and average cases of a given algorithm. These express the resource usage as minimum, maximum, and average respectively. The resource includes the running time, memory and any other resource. In real-time computing, the worst case execution time is often of particular concern since it is important to know how much time might be needed in the worst case to guarantee that the algorithm would always finish on time.

    Average performance and worst case performance are the most used in algorithm analysis. Less widely found is best case performance. The best case performance is usually measured to improve the accuracy of an overall worst case analysis. Computer scientists use probabilistic analysis techniques, especially expected value, to determine expected average running times.

    Worst case performance analysis and average case performance analysis have similarities, but usually require different tools and approaches in practice.

    Determining what average input means is difficult. The complexity is analyzed based on the input in general. Depending on the nature of the input, it is difficult to analyze equations in the average case, and hence it is difficult to characterize the complexity mathematically.

    Worst case analysis has similar problems. Typically it is difficult to determine the exact worst case scenario. Instead, a scenario is considered which is at least as bad as the worst case. For example, when analyzing an algorithm, it may be possible to find the longest possible path through the algorithm.

    It is always important to find the efficiency of an algorithm with respect to the following:

    CPU (time) usage

    memory usage

    disk usage

    network usage


    Here, either the sequence of statements 1 will be executed or the sequence of statements 2 will be executed. So, the worst case complexity for the entire selection statement depends on the complexity of sequence 1 and sequence 2. If sequence 1 has the complexity O(1) and sequence 2 has the complexity O(N), the worst case complexity is taken as O(N).

    Looping statement (for)

    for (condition)
        Sequence of simple statements;

    Here, considering that the loop executes N times, the complexity can be given by N * O(1), which is equivalent to O(N).

    Nested loops

    for (condition 1)
        for (condition 2)
            Sequence of simple statements;

    Here, considering that the outer loop executes N times and the inner loop executes M times, the complexity can be given by N * M * O(1), i.e. the complexity can be given as O(N*M).
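The counting rules for loops can be checked with a short C sketch. The function names count_single_loop and count_nested_loops are illustrative and not part of the handout; each simply returns how many times its one simple statement is executed.

```c
#include <assert.h>

/* A single loop that runs n times: the simple statement executes n times, O(N). */
long count_single_loop(int n)
{
    long steps = 0;
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        steps++;                /* one O(1) statement per iteration */
    return steps;
}

/* Nested loops: the outer loop runs n times and the inner loop m times,
   so the simple statement executes n * m times, O(N*M). */
long count_nested_loops(int n, int m)
{
    long steps = 0;
    assert(n >= 0 && m >= 0);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < m; j++)
            steps++;
    return steps;
}
```

Calling count_single_loop(10) gives 10 and count_nested_loops(4, 5) gives 20, matching N and N * M respectively.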

    Summary

    The study of data structures deals with the actual implementation of the data type and offers a way of storing data in an efficient manner.

    An Abstract Data Type (ADT) is a data type together with the operations, whose properties are specified independently of any particular implementation.

    The different types of data structure available are:
        o Linear
        o Hash table
        o Trees
        o Graphs

    A well-designed data structure allows a variety of critical operations to be performed using as few resources, both execution time and memory space, as possible.

    Big O notation can be used to analyze the complexity of algorithms.


    Test your Understanding

    1. The complexity of an algorithm which finds the sum of n numbers will be
        a. O(n log n)
        b. O(n^2)
        c. O(n)
        d. O(2^n)

    2. A parent-child relationship can be considered as a linear data structure
        a. True
        b. False

    Answers
        1. c
        2. b


    Session 2: Arrays

    Learning Objectives

    After completing this chapter, you will be able to:

    Define arrays

    Use arrays as data structures

    Overview

    An array is a collection of individual values of the same data type stored in consecutive memory locations.

    An array index (position in the array) usually starts from 0. Some languages allow the starting index to be specified; in C it is always 0.

    Here is an array of integers, myArray:

    Array values:            13   5  12   3   6
    Array positions/Index:    0   1   2   3   4

    Declaring an array in C

    int CArray[10];

    Referring to elements of the array

    The position of an element in an array is given by the index. The name of the array, followed by the index in square brackets, is used to refer to a particular element:

    myArray[1] = 5;

    The above statement assigns the value 5 to the element at position 1 (the second element) of the array myArray.

    Using elements of an array

    Elements of the array can be used in the same way as variables of the same data type, i.e. an element of an array of integers can be used anywhere an integer variable can be used.

    printf("The fifth element of the array is %d", myArray[4]);


    The above statement prints the fifth element of myArray, i.e. it will print as follows:

    The fifth element of the array is 6

    Example: Assigning values to each element of the array

    for ( count = 0 ; count < 5 ; count++)

    {

    evens[count] = 2 * count;

    }

    The above piece of code will construct an array evens as given below:

    Array values:   0   2   4   6   8
    Array index:    0   1   2   3   4

    Multi Dimensional Arrays

    These are arrays which have more than one dimension. For example, the following declaration in C creates a two-dimensional array of two rows and two columns:

    int myArray1[2][2];

    The following declaration creates an array of three dimensions, 2, 2, and 3:

    int myArray2[2][2][3];

    Initialization

    The following piece of code initializes the arrays myArray1 and myArray2. For this illustration, myArray2 is taken to be a two-dimensional array of three rows and two columns.

    int myArray1[2][2] = {{1, 2}, {3, 4}};
    int myArray2[3][2] = {{1, 2}, {3, 4}, {5, 6}};

    In matrix form the above arrays can be represented as below:

    myArray1
    1 2
    3 4

    myArray2
    1 2
    3 4
    5 6


    Memory Organization in an array

    Array elements occupy contiguous locations in memory. The array elements are accessed using their index. A function is needed to translate an array index to the address of the indexed element.

    For a single dimensional array the address can be calculated as below:

    Address = Base Address + (Index - Base Index) * Size

    Where,

    Base Index represents the value of the first index in the array

    Size represents the size of a single element in bytes
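The address formula can be written directly in C. This is a sketch only: the function name element_address is hypothetical, and treating addresses as plain integers of type uintptr_t assumes a conventional flat memory model, which holds on mainstream platforms but is not guaranteed by the C standard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Address = Base Address + (Index - Base Index) * Size */
uintptr_t element_address(uintptr_t base_address, size_t index,
                          size_t base_index, size_t size)
{
    assert(index >= base_index);    /* the index cannot lie before the first element */
    return base_address + (index - base_index) * size;
}
```

For example, with a base address of 1000, a base index of 0 and 4-byte elements, the element at index 3 is at address 1000 + (3 - 0) * 4 = 1012. For a C array a, element_address((uintptr_t)a, 3, 0, sizeof a[0]) is expected to equal (uintptr_t)&a[3] on such platforms.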

    Advantages and disadvantages of an array

    Advantages

    Array data structure is simple to use.

    Elements in an array are stored in contiguous memory locations and hence each element can be accessed directly using its index.

    Allocation and de-allocation of memory is done automatically by the computer.

    Disadvantages

    Elements in an array are stored in contiguous memory locations and hence an array cannot be stored if the available memory is non contiguous, i.e. if the size of the array is n bytes, then there should be n contiguous bytes available in memory.

    The array size is fixed and hence the size of the array cannot be reduced or increased at run time based on the requirement.

    Stacks

    A stack is a homogeneous collection of items of any one type, arranged linearly with access at one end only, known as the top. This means that data can be added or removed from only the top. Formally this type of stack is called a Last In First Out (LIFO) stack. Data is added to the stack using the Push operation, and removed using the Pop operation.

    In order to clarify the idea of a stack, here is an example. Think of a number of plates kept in a cafeteria. When the plates are being stacked, they are added one on top of another. It doesn't make much sense to put each plate on the bottom of the pile, as that would be far more work. Similarly, when a plate is taken, it is usually taken from the top of the stack.

    A stack consists of two parts:

    Storage space within the stack that contains the elements of the stack.

    Top of stack that refers to the element pushed most recently.


    A stack can be implemented either using an array or a linked list.

    Stack implementation using an array

    Top is an integer value, which contains the array index for the top of the stack. Each time data is pushed or popped, top is incremented or decremented accordingly, to keep track of the current top of the stack. By convention, an empty stack is indicated by setting top to be equal to -1. The code in this section uses a variant of this convention: top holds the index of the next free slot, so an empty stack has top equal to 0.

    Stacks implemented as arrays are useful if a fixed amount of data is to be used. However, if the amount of data is not a fixed size or the amount of the data fluctuates widely during the stack's lifetime, then an array is a poor choice for implementing a stack.

    Any recursive call is implemented with the help of a stack by the computer. The size of the stack cannot be predicted in recursion, and implementing the stack using an array is a poor choice in this case.

    Algorithm to implement the operations using array

    Push:

    if (top >= total_no_elements)
        return(1); // Error code
    else
    {
        printf("\n Enter the element \n");
        scanf("%d", &stack[top]);
        top++;
    }


    Pop:

    if(top==0)

    {

    printf("\n STACK EMPTY \n");

    }

    else

    {

    top--;

    printf("\n\nPopped element = %d\n",stack[top]);

    }

    Display:

    if(top==0)

    {

    printf("\n STACK IS EMPTY \n");

    }

    else

    {

    printf("\n The elements inside the stack are :\n");

    for(j=top-1;j>=0;j--)

    {

    printf("\n%d",stack[j]);

    }

    }

    Stack operations:

    Operation | Description | Return type | Requirement
    Push | This operation adds or pushes another item onto the stack. | Data type | The number of items on the stack is less than n.
    Pop | This operation removes an item from the stack. | Data type | The number of items on the stack must be greater than 0.
    Top | This operation returns the value of the item at the top of the stack. | Data type | Note: It does not remove that item.
    Is Empty | This operation returns true if the stack is empty and false if it is not. | Boolean | -
    Is Full | This operation returns true if the stack is full and false if it is not. | Boolean | -


    Queues

    A queue is a data structure in which elements are accessed from two different ends called Front and Rear. The elements are inserted into a queue through the Rear end and are removed from the Front end. The principle used in a queue is "First In First Out" or FIFO.

    There are two basic operations associated with a queue: enqueue and dequeue.

    Enqueue means adding a new item to the rear end of the queue. The rear end always points to the recently added element.

    Dequeue refers to removing the item from the front end of the queue. The front end always points to the recently removed element.

    Theoretically, a queue does not have a specific capacity. Regardless of how many elements are already contained, a new element can always be added. It can also be empty, at which point removing an element will be impossible until a new element has been added again.

    A practical implementation of a queue using arrays does have some capacity limit. For a data structure the executing computer will eventually run out of memory, thus limiting the queue size. Queue overflow results from trying to add an element into a full queue and queue underflow happens when trying to remove an element from an empty queue.

    A queue consists of two major variables, Front and Rear. Front refers to the first position of the queue and Rear refers to the last position of the queue.

    Types of queues

    Circular queue

    A circular queue is one in which the insertion of a new element is done at the very first location of the queue if the last location of the queue is full, i.e. a circular queue is one in which the first element comes just after the last element.

    A circular queue overcomes the problem of unutilized space in linear queues implemented as arrays. A circular queue also has a Front and a Rear to keep track of the elements to be deleted and inserted, and therefore to maintain the unique characteristic of the queue. The assumptions made are:

    1. Front will always be pointing to the first element.
    2. If Front = Rear, the queue is empty.
    3. Each time a new element is inserted into the queue, the Rear is incremented by one.
    4. Each time an element is deleted from the queue, the value of Front is incremented by one.


    Example: Circular Queue

    Inserting and deleting elements

    Insertion and deletion of elements in a circular queue is the same as that in a linear queue, except that whenever an element is deleted from the front of the queue, the rear pointer can be made to point to the vacant position and the element can be inserted there once the queue is full.

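    Before insertion

    Q[0]   Q[1]   Q[2]   Q[3]     Q[4]
    5      10     20     (empty)  (empty)

    (Figure: Front and Rear markers on the queue.)

    Q[0]   Q[1]   Q[2]   Q[3]     Q[4]
    5      10     20     30       40

    (Figure: Front and Rear markers on the queue.)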


    After inserting two elements, 30 and 40, the queue is full.

    Deletion in a circular queue

    Now Q[0] will be available in the queue for another insertion.

    Double Ended Queues

    A double ended queue is a homogeneous list of elements in which insertion and deletion operations are performed from both the ends. It is also called a deque.

    There are two types of deques: input-restricted deques and output-restricted deques.

    The major operations involved are:

    Insertion of an element at the Rear end of the queue.

    Deletion of an element from the Front end of the queue

    Insertion of an element at the Front end of the queue

    Deletion of an element from the Rear end of the queue

    For an input-restricted deque, insertion is allowed at only one end, so all the operations mentioned above except the third are valid. For an output-restricted deque, deletion is allowed at only one end, so all the operations except the fourth are valid.

    Priority Queue

    In priority queues, the items added to the queue have a priority associated with them which determines the order in which they exit the queue. Items with the highest priority are removed first.

    A priority queue is an abstract data type supporting the following three operations:

    add an element to the queue with an associated priority

    remove the element from the queue that has the highest priority, and return it

    (optionally) peek at the element with highest priority without removing it

    The simplest way to implement a priority queue data type is to keep an associative array mapping each priority to a list of elements with that priority.

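    (Figure, deletion in a circular queue: Q[0] is vacant; the remaining positions hold 10, 20, 30 and 40; Front and Rear markers.)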


    Applications of queues

    Round robin technique for processor scheduling uses the concept of queues

    Railway ticket reservation center is designed using queues to store customer information

    Printer server routines are designed using queues

    Scheduling and buffering queues

    A queue is a natural data structure for a system to serve incoming requests. Most of the process scheduling or disk scheduling algorithms in operating systems use queues. Computer hardware like a processor or a network card also maintains buffers in the form of queues for incoming resource requests. A stack-like data structure causes starvation of the first requests, and is not applicable in such cases. A mailbox or port to save messages to communicate between two users or processes in a system is essentially a queue-like structure.

    Search space exploration

    Like stacks, queues can be used to remember the search space that needs to be explored at one point of time in traversing algorithms. Breadth first search of a graph uses a queue to remember the nodes yet to be visited.
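The use of a queue in breadth first search can be sketched as below. The graph representation (an adjacency matrix of at most MAX_NODES nodes), the function name bfs and the order output array are assumptions for this example; the handout does not prescribe them.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 6             /* assumed limit, chosen only for illustration */

/* Visits every node reachable from start in breadth first order.
   adj[a][b] is non-zero when an edge joins a and b.
   The visit order is written to order[]; the number of visited nodes is returned. */
int bfs(int adj[MAX_NODES][MAX_NODES], int node_count, int start, int order[])
{
    int queue[MAX_NODES];       /* each node is enqueued at most once */
    int front = 0, rear = 0;    /* dequeue at front, enqueue at rear */
    bool visited[MAX_NODES] = { false };
    int visited_count = 0;

    assert(node_count <= MAX_NODES && start >= 0 && start < node_count);

    visited[start] = true;
    queue[rear++] = start;

    while (front < rear) {                      /* nodes remain to be explored */
        int node = queue[front++];
        order[visited_count++] = node;
        for (int next = 0; next < node_count; next++) {
            if (adj[node][next] && !visited[next]) {
                visited[next] = true;           /* remember it for later */
                queue[rear++] = next;
            }
        }
    }
    return visited_count;
}
```

Because the queue hands nodes back in the order they were discovered, all neighbours of the start node are visited before any node two edges away.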

    Implementation of queue using array

    Inserting an element into a queue

    if (rear == max_no_of_elements - 1)  // last index reached: wrap around
        rear = 0;
    else
        rear = rear + 1;

    if (rear == front)
    {
        printf("QUEUE OVERFLOW \n");
        // undo the advance of rear
        if (rear == 0)
            rear = max_no_of_elements - 1;
        else
            rear = rear - 1;
        break;  // leaves the enclosing switch or loop (not shown)
    }
    else
    {
        printf("\n Enter the elements which you want to insert:\n");
        scanf("%d", &x);
        queue[rear] = x;
    }


    Deletion of an element from a queue

    if(front==rear)

    printf(" QUEUE UNDERFLOW \n ");

    else

    {

    if( front == (max_no_of_elements -1) )

    front=0;

    else

    front=front+1;

    x=queue[front];

    }

    In a stack, each new data item is stored at the top of the stack. Top points to the top of the stack in the figure. When new data is added, the data is stored in the Top position and the Top pointer is increased.

    Summary

    An array is a collection of individual values of the same data type stored in adjacent memory locations.

    A stack is a homogeneous collection of items of any one type, arranged linearly with access at one end only, known as the top. The two major operations available for a stack are push (adding an element) and pop (deleting an element).

    A queue is a collection of items in which only the earliest added item may be accessed. Its basic operations are add (to the tail), or enqueue, and delete (from the head), or dequeue.

    The major variations of queues are the double ended queue, the circular queue and the priority queue.

    Test your Understanding

    1. The elements inserted in order A, B, C, D are traversed in a stack as
        a. ABCD
        b. DCBA
        c. ADCB
        d. None of the above

    2. The size of an array can be ---
        a. Extended
        b. Reduced
        c. Either a or b
        d. Neither a nor b

    Answers
        1. b
        2. d


    Session 4: Linked Lists

    Learning Objectives

    After completing this chapter, you will be able to:

    Define linked list

    Implement linked list operations in your program

    Linked lists

    A linked list can be viewed as a group of items, each of which points to the item in its neighbourhood. An item in a linked list is known as a node. A node contains a data part and one or two pointer parts, which contain the addresses of the neighbouring nodes in the list. A linked list is a data structure that supports dynamic memory allocation and hence it solves the problems of using an array.

    Types of linked lists

    The different types of linked lists include:

    Singly linked lists

    Circular linked lists

    Doubly linked lists

    Simple/Singly Linked Lists

    In singly linked lists, each node contains a data part and an address part. The address part of the node points to the next node in the list.

    Node Structure of a linked list

    Data part Link part

    An example of a singly linked list can be pictured as shown below. Note that each node is pictured as a box, while each pointer is drawn as an arrow. A NULL pointer is used to mark the end of the list.


    The head pointer points to the first node in a linked list.

    If head is NULL, the linked list is empty.

    A head pointer to a list

    Possible Operations on a singly linked list

    Insertion: Elements are added at any position in a linked list by linking nodes.

    Deletion: Elements are deleted at any position in a linked list by altering the links of theadjacent nodes.

    Searching or Iterating through the list to display items.

    To insert or delete items from any position of the list, we need to traverse the list starting from its root till we get the item that we are looking for.
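    The traversal described above can be sketched as a small search routine. This is only an illustration: the function name list_find and the convention of returning a 1-based position (0 when the key is absent) are assumptions made for this example, not part of the handout.

```c
#include <stddef.h>

struct node {
    int info;
    struct node *next;
};

/* Walk the list from root and return the 1-based position of the first
   node holding key, or 0 if the key is not present (hypothetical helper). */
int list_find(const struct node *root, int key)
{
    int position = 1;
    const struct node *p;

    for (p = root; p != NULL; p = p->next, position++) {
        if (p->info == key)
            return position;
    }
    return 0;
}
```

    The loop visits each node once, so the cost grows linearly with the length of the list.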

    Implementation of a singly linked list

    Creating a linked list

    A node in a linked list is usually a structure in C and can be declared as

    struct node
    {
        int info;
        struct node *next;
    }; /* end struct */

    A node is dynamically allocated as follows:

    struct node *p;
    p = malloc(sizeof(struct node));

    For creating the list, the following code can be used:

    root_node = NULL;
    scanf("%d", &input_value);
    while (input_value != -999)
    {
        Current_node = malloc(sizeof(struct node));
        Current_node->info = input_value;
        Current_node->next = NULL;
        if (root_node == NULL) /* the first node in the list */
            root_node = Current_node;
        else
            previous_node->next = Current_node;
        previous_node = Current_node;
        scanf("%d", &input_value);
    }


    The above given code will create the list by taking values until the user inputs -999.

    Inserting an element

    After getting the position and the element to be inserted, the following code can be used to insert the element into the list:

    if (position == 1 || root_node == NULL)
    {
        Current_node->next = root_node;
        root_node = Current_node;
    }
    else
    {
        counter = 2;
        temp_node = root_node;
        /* stop at the node after which the new node is linked in */
        while ((counter < position) && (temp_node->next != NULL))
        {
            temp_node = temp_node->next;
            ++counter;
        }
        Current_node->next = temp_node->next;
        temp_node->next = Current_node;
    }
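    As a self-contained variant of the insertion logic, the sketch below walks a pointer to the link that has to change, so the first-position case needs no special branch. The function name list_insert_at and its return convention are assumptions for this example; positions are counted from 1 and a position past the end appends.

```c
#include <stdlib.h>

struct node {
    int info;
    struct node *next;
};

/* Insert value at the given 1-based position (hypothetical helper).
   Returns 0 on success and -1 if memory cannot be allocated. */
int list_insert_at(struct node **root, int position, int value)
{
    struct node *new_node = malloc(sizeof *new_node);
    struct node **link = root;   /* the link that will point to new_node */
    int counter = 1;

    if (new_node == NULL)
        return -1;
    new_node->info = value;

    /* Advance until the link in front of the requested position. */
    while (*link != NULL && counter < position) {
        link = &(*link)->next;
        counter++;
    }
    new_node->next = *link;
    *link = new_node;
    return 0;
}
```

    Passing the address of the root pointer lets the function update the root itself when the new node becomes the first node.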

    The following figure illustrates how a node is inserted at an intermediate position in the list.

    The following figure illustrates how a node is inserted at the beginning of the list.

    To insert a node between two nodes


    Deleting an element

    After getting the element to be removed, the following code can be used to remove that element.

    temp_node = root_node;
    if (root_node == NULL)
        return; /* empty list: nothing to delete */
    if (temp_node->info == input_element)
    {
        root_node = root_node->next; /* delete the first node */
        free(temp_node);
        return;
    }
    /* stop at the node just before the one to be deleted */
    while (temp_node->next != NULL && temp_node->next->info != input_element)
        temp_node = temp_node->next;
    if (temp_node->next != NULL)
    {
        delete_node = temp_node->next;
        temp_node->next = delete_node->next;
        free(delete_node);
    }
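    The same deletion can be written as one self-contained function. The sketch below is an illustration under assumed names (list_delete_value, list_push_front); it returns 1 when a node was removed and 0 when the value was not found.

```c
#include <stdlib.h>

struct node {
    int info;
    struct node *next;
};

/* Hypothetical helper used to build a list: add value at the front. */
int list_push_front(struct node **root, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->info = value;
    n->next = *root;
    *root = n;
    return 0;
}

/* Remove the first node holding value. Returns 1 if a node was removed. */
int list_delete_value(struct node **root, int value)
{
    struct node **link = root;
    struct node *victim;

    while (*link != NULL && (*link)->info != value)
        link = &(*link)->next;
    if (*link == NULL)
        return 0;                /* value not in the list */
    victim = *link;
    *link = victim->next;        /* bypass the node, then release it */
    free(victim);
    return 1;
}
```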

    The following figures illustrate the deletion of an intermediate node and the deletion of the first node from the list.

    To insert a node at the beginning of a linked list


    To display the elements of the list

    temp_node = root_node;

    while(temp_node != NULL)

    {

    printf("%d\t", temp_node->info);

    temp_node = temp_node->next;

    }

    The following figure illustrates the above piece of code.

    Deleting an intermediate node from a linked list

    Deleting the first node

    The effect of the assignment temp_node = temp_node->next


    Efficiency and advantages of Linked Lists

    Although locating an item takes the same number of comparisons as in an array, the advantage of a linked list lies in the fact that no items need to be moved after an insertion or deletion.

    As opposed to the fixed size of arrays, linked lists use only as much memory as is needed for the nodes currently in the list.

    Individual nodes need not be contiguous in memory.

    Doubly Linked List

    A more sophisticated kind of linked list is a doubly-linked list or a two-way linked list. In a doubly linked list, each node has two links: one pointing to the previous node and one pointing to the next node.

    Node structure

    Previous Link Data Next Link

    An example of a doubly linked list

    Implementation of a doubly linked list

    Adding an element to the list

    To add the first node

    first_node->next = NULL;
    first_node->data = input_element;
    first_node->prev = NULL;

    To add a node at the position specified

    temp_node = first_node;
    /* move to the node after which the new node is linked in;
       this assumes positions are counted from 1 and position >= 2 */
    for (counter = 0; counter < position - 2; counter++)
    {
        temp_node = temp_node->next;
    }
    new_node->next = temp_node->next;
    new_node->prev = temp_node;
    if (temp_node->next != NULL)
        temp_node->next->prev = new_node;
    temp_node->next = new_node;
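    The four link updates are easy to get in the wrong order, so a minimal sketch of the step as a function may help. The name dlist_insert_after is an assumption for this example; passing NULL as the existing node creates the first node of a new list.

```c
#include <stdlib.h>

struct dnode {
    int data;
    struct dnode *prev;
    struct dnode *next;
};

/* Link a new node holding value directly after temp_node (hypothetical
   helper). Returns the new node, or NULL if allocation fails. */
struct dnode *dlist_insert_after(struct dnode *temp_node, int value)
{
    struct dnode *new_node = malloc(sizeof *new_node);

    if (new_node == NULL)
        return NULL;
    new_node->data = value;
    new_node->prev = temp_node;
    new_node->next = (temp_node != NULL) ? temp_node->next : NULL;
    if (temp_node != NULL) {
        if (temp_node->next != NULL)
            temp_node->next->prev = new_node;  /* old successor looks back at new node */
        temp_node->next = new_node;            /* done last, after next was copied */
    }
    return new_node;
}
```

    The new node's own links are set first, and temp_node->next is overwritten only after its old value has been used.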


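    Deleting a particular element from the list

    temp_node = first_node;
    if (temp_node->data == input_element)
    {
        first_node = first_node->next; /* delete the first node */
        if (first_node != NULL)
            first_node->prev = NULL;
        free(temp_node);
    }
    else
    {
        while (temp_node->next != NULL && temp_node->next->data != input_element)
            temp_node = temp_node->next;
        if (temp_node->next != NULL) /* element found */
        {
            delete_node = temp_node->next;
            temp_node->next = delete_node->next;
            if (delete_node->next != NULL)
                delete_node->next->prev = temp_node;
            free(delete_node);
        }
    }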
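    Because every node of a doubly linked list knows its predecessor, a node can also be unlinked directly once a pointer to it is available, without walking from the start. The sketch below illustrates this; the name dlist_remove is an assumption for this example, and the node is assumed to have been allocated with malloc.

```c
#include <stdlib.h>

struct dnode {
    int data;
    struct dnode *prev;
    struct dnode *next;
};

/* Unlink victim from the list whose first node is *first_node and free it
   (hypothetical helper; victim must belong to that list). */
void dlist_remove(struct dnode **first_node, struct dnode *victim)
{
    if (victim->prev != NULL)
        victim->prev->next = victim->next;
    else
        *first_node = victim->next;        /* victim was the first node */
    if (victim->next != NULL)
        victim->next->prev = victim->prev;
    free(victim);
}
```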

    Circular Linked Lists

    In a circularly-linked list, the first and final nodes are linked together. In other words, circularly-linked lists can be seen as having no beginning or end. To traverse a circular linked list, begin at any node and follow the list in either direction until you return to the original node. This type of list is most useful in cases where you have one object in a list and wish to see all other objects in the list.

    The pointer pointing to the whole list is usually called the end pointer.

    Singly-circularly-linked list

    In a singly-circularly-linked list, each node has one link, similar to an ordinary singly-linked list, except that the link of the last node points back to the first node. As in a singly-linked list, new nodes can only be efficiently inserted after a node we already have a reference to. For this reason, it's usual to retain a reference to only the last element in a singly-circularly-linked list, as this allows quick insertion at the beginning, and also allows access to the first node through the last node's next pointer. The following figure shows a singly circularly linked list.

    Doubly-circularly-linked list

    In a doubly-circularly-linked list, each node has two links, similar to a doubly-linked list, except that the previous link of the first node points to the last node and the next link of the last node points to the first node. As in doubly-linked lists, insertions and removals can be done at any point with access to any nearby node.

    [Figure: a singly circularly linked list with nodes 10, 20, 30, 40]


    The following figure illustrates a doubly circularly linked list

    Circularly-linked list vs. linearly-linked list

    Circularly linked lists are useful to traverse an entire list starting at any point. In a linear linked list, it is required to know the head pointer to traverse the entire list. The linear linked list cannot be traversed completely with the help of an intermediate pointer.

    Access to any element in a doubly circularly linked list is much easier than in a linearly linked list, since the particular element can be approached in two directions. For example, to access an element present in the fourth node of a doubly circularly linked list having five elements, it is enough to start from the last node and traverse the list in the reverse direction to get the value in the fourth node.
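    Traversal from an arbitrary node can be sketched as follows. The loop stops when it arrives back at the node it started from, since a circular list has no NULL link to stop on. The function name clist_count is an assumption for this example.

```c
#include <stddef.h>

struct node {
    int info;
    struct node *next;
};

/* Count the nodes of a circular list, starting from any of its nodes
   (hypothetical helper). An empty list is represented by NULL. */
size_t clist_count(const struct node *start)
{
    size_t count = 0;
    const struct node *p = start;

    if (start == NULL)
        return 0;
    do {
        count++;
        p = p->next;
    } while (p != start);        /* back at the starting node: done */
    return count;
}
```

    A do-while loop is used because the starting node must be visited once before the "back at start" test makes sense.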

    Implementation of a circular linked list:

    Creating the list

    while (input_element != -999)
    {
        new_node = (struct node *) malloc(sizeof(struct node));
        new_node->info = input_element;
        if (root_node == NULL)
            root_node = new_node;
        else
            (*last_node)->next = new_node;
        (*last_node) = new_node;
        scanf("%d", &input_element);
    }
    if (root_node != NULL)
        (*last_node)->next = root_node; /* close the circle */
    return root_node;

    [Figure: a doubly circularly linked list with nodes 10, 20, 30, 40]


    Inserting elements into the list

    After getting the position and value to be inserted, the following code can be used:

    new_node = (struct node *) malloc(sizeof(struct node));
    new_node->info = input_element;
    if ((position == 1) || ((*root_node) == NULL))
    {
        new_node->next = *root_node;
        *root_node = new_node;
        if ((*last_node) == NULL)
            *last_node = *root_node; /* first node of an empty list */
        (*last_node)->next = *root_node; /* keep the list circular */
    }
    else
    {
        temp_node = *root_node;
        counter = 2;
        while ((counter < position) && (temp_node->next != (*root_node)))
        {
            temp_node = temp_node->next;
            ++counter;
        }
        if (temp_node->next == (*root_node))
            *last_node = new_node;
        new_node->next = temp_node->next;
        temp_node->next = new_node;
    }
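    When only a pointer to the last node is kept, as suggested for singly-circularly-linked lists, insertion at either end takes a constant number of steps. The following is a minimal sketch of that idea; the names clist_push_front and clist_push_back are assumptions for this example.

```c
#include <stdlib.h>

struct node {
    int info;
    struct node *next;
};

/* Insert value as the new first node. *last points to the last node of the
   circular list (NULL when empty). Returns 0 on success, -1 on failure. */
int clist_push_front(struct node **last, int value)
{
    struct node *n = malloc(sizeof *n);

    if (n == NULL)
        return -1;
    n->info = value;
    if (*last == NULL) {
        n->next = n;             /* a single node points to itself */
        *last = n;
    } else {
        n->next = (*last)->next; /* old first node follows the new one */
        (*last)->next = n;
    }
    return 0;
}

/* Insert value as the new last node. */
int clist_push_back(struct node **last, int value)
{
    if (clist_push_front(last, value) != 0)
        return -1;
    *last = (*last)->next;       /* the node just added becomes the last node */
    return 0;
}
```

    The first node is always reachable as (*last)->next, so no separate root pointer is needed in this variant.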

    Deleting an element from the list

    The following code deletes the first node of the list; front_node and rear_node point to the first and last nodes:

    if (*front_node != NULL)
    {
        temp_node = *front_node;
        printf("The item deleted is %d", temp_node->info);
        if (*front_node == *rear_node)
        {
            *front_node = *rear_node = NULL;
        }
        else
        {
            *front_node = (*front_node)->next;
            (*rear_node)->next = *front_node;
        }
        free(temp_node);
    }

    Stacks and queues using pointers

    One disadvantage of using an array to implement a stack or queue is the wasted space: most of the time, most of the space in the array is unused. A more elegant and economical implementation of a stack or queue uses a linked list.

    Here is a sketch of a linked-list-based stack that holds 1, then 5, and then 20 at the bottom:

    The list consists of three cells, each of which holds a data object and a link to another cell. A variable, top, holds the address of the first cell in the list.

    An empty stack looks like this:

    Top -> NULL

    Implementing stacks as linked lists removes the fixed limit on the number of elements, as a linked list is a dynamic data structure. The stack can grow or shrink as the program demands.

    Algorithm to implement stack operations using pointers:

    Push

    node = (struct stack *) malloc(sizeof(struct stack));
    printf("\n\n Enter the data ");
    scanf("%d", &node->data);
    node->link = top;
    top = node;

    Pop

    if (top == NULL)
        return(1); /* Error code */
    else
    {
        printf("\n \n Item deleted is %d ", top->data);
        node = top;
        top = top->link;
        free(node); /* release the popped cell */
    }

    [Figure: Top -> 1 -> 5 -> 20 -> NULL]


    Display

    i = top;
    if (top == NULL)
        return(1); /* Error code */
    else
    {
        printf(" \n\n ELEMENTS ARE : \n");
        while (i != NULL)
        {
            printf("%d\n\n", i->data);
            i = i->link;
        }
    }
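    The push and pop fragments can be collected into functions that do no input or output, which makes them easier to reuse. This is a sketch under assumptions: the function signatures are chosen for this example, and the return value 1 signals an error in the same spirit as the error code used above.

```c
#include <stdlib.h>

struct stack {
    int data;
    struct stack *link;
};

/* Push value on the stack whose top cell is *top. Returns 0 on success. */
int push(struct stack **top, int value)
{
    struct stack *node = malloc(sizeof *node);

    if (node == NULL)
        return 1;                /* out of memory */
    node->data = value;
    node->link = *top;
    *top = node;
    return 0;
}

/* Pop the top value into *value. Returns 1 if the stack is empty. */
int pop(struct stack **top, int *value)
{
    struct stack *node = *top;

    if (node == NULL)
        return 1;                /* error code: stack is empty */
    *value = node->data;
    *top = node->link;
    free(node);
    return 0;
}
```

    Pushing 20, then 5, then 1 produces the stack pictured earlier, and three pops return 1, 5 and 20 in that order.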

    Implementation of queues using lists is very similar to the implementation of stacks, except that in this case items join the queue at the back and leave at the front. If the queue is represented by the list [5, 2], adding a new item 3 will give the list [5, 2, 3]. In other words, new items are added to the end of the list. Removing an item from the queue will be done from the front.

    A pictorial representation of a queue being implemented as a linked list is given below. The variable rear points to the last item in the queue.

    [Figure: Front -> 5 -> 2 -> 3 -> NULL, with Rear pointing to the node holding 3]

    Algorithm to represent queue operations using pointers

    Inserting an element

    new_element->link = NULL;
    if (front == NULL)
        front = new_element;
    else
        rear->link = new_element;
    rear = new_element;

    Deleting an element

    if (front != NULL)
    {
        temp = front;
        front = front->link;
        if (front == NULL)
            rear = NULL; /* the queue is now empty */
        free(temp);
    }
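    A minimal sketch of the same two operations as functions is given below. Grouping front and rear in a struct queue, and the names enqueue and dequeue, are assumptions made for this example.

```c
#include <stdlib.h>

struct qnode {
    int data;
    struct qnode *link;
};

struct queue {
    struct qnode *front;         /* items leave here */
    struct qnode *rear;          /* items join here */
};

/* Add value at the rear. Returns 0 on success, 1 if out of memory. */
int enqueue(struct queue *q, int value)
{
    struct qnode *new_element = malloc(sizeof *new_element);

    if (new_element == NULL)
        return 1;
    new_element->data = value;
    new_element->link = NULL;
    if (q->front == NULL)
        q->front = new_element;
    else
        q->rear->link = new_element;
    q->rear = new_element;
    return 0;
}

/* Remove the front item into *value. Returns 1 if the queue is empty. */
int dequeue(struct queue *q, int *value)
{
    struct qnode *temp = q->front;

    if (temp == NULL)
        return 1;
    *value = temp->data;
    q->front = temp->link;
    if (q->front == NULL)
        q->rear = NULL;          /* last item removed */
    free(temp);
    return 0;
}
```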


    Summary

    A linked list is a collection of elements called nodes, each of which contains a data portion and a pointer to the node following it in the linear ordering of the list.

    A singly linked list is a dynamic data structure which can grow and shrink depending upon the operations made. Each node has a single pointer which points to the successive node in the list.

    A doubly linked list is one in which each node has two links, which give access to both the successor node and the predecessor node from a given node position. It provides bi-directional traversal.

    A circular linked list is one which has no end, i.e. the link field of the last node does not point to NULL; rather, it points back to the beginning of the linked list.

    Stacks and queues can be more efficiently implemented using pointers rather than by using arrays.

    Test your Understanding

    1. The last node of a linear linked list ______.
       a. Has the value null
       b. Has a next reference whose value is null
       c. Has a next reference which references the first node of the list
       d. Cannot store any data

    2. To delete a node N from a linear linked list, you will need to ______.
       a. Set the link in the node that precedes N to link in the node that follows N
       b. Set the link in the node that precedes N to link N
       c. Set the link in the node that follows N to link in the node that precedes N
       d. Set the link in N to link in the node that follows N

    3. Write a function that removes all duplicate elements from a linear linked list.

    4. Write a function to print the elements in reverse order of a singly linked list.

    5. Write a function to find the largest element in a circular linked list.

    Answers
    1. b
    2. a


    Session 6: Sorting and Searching

    Learning Objectives

    After completing this chapter, you will be able to:

    Explain the concepts of sorting and searching

    List the advantages of each technique

    List the limitations of each technique

    Sorting

    Sorting refers to ordering data in an increasing or decreasing fashion according to some linear relationship among the data items.

    Sorting can be done on names, numbers and records. For example, it is relatively easy to look up the phone number of a friend in a telephone directory because the names in the phone book have been sorted into alphabetical order. This example clearly illustrates one of the main reasons that sorting large quantities of information is desirable: sorting greatly improves the efficiency of searching. If we were to open a phone book and find that the names were not presented in any logical order, it would take an incredibly long time to look up someone's phone number.

    Sorting can be performed using several methods:

    Selection Sort.
    In this method, the successive elements are selected in order and are placed in their proper sorted positions.

    Insertion sort.
    In this method, sorting is done by inserting elements into an existing sorted list. Initially, the sorted list has only one element. Other elements are gradually added into the list in the proper position.

    Bubble Sort.
    In this method, the entire file is passed through several times. Each pass compares each element with its successor and puts the element in the proper position.

    Merge Sort.
    In this method, the elements are divided into partitions until each partition has sorted elements. Then, these partitions are merged and the elements are properly positioned to get a fully sorted list.


    Quick Sort.
    In this method, an element called the pivot is identified and that element is fixed in its place by moving all the elements less than it to its left and all the elements greater than it to its right.

    Radix Sort.
    In this method, sorting is done based on the place values of the number. In this scheme, sorting is done on the less-significant digits first. When all the numbers are sorted on a more significant digit, numbers that have the same digit in that position but different digits in a less-significant position are already sorted on the less-significant position.

    Heap Sort.
    In this method, the file to be sorted is interpreted as a binary tree. An array, which is a sequential representation of a binary tree, is used to implement the heap sort.

    In this chapter, focus is given to bubble sort, quick sort and heap sort.

    The basic premise behind sorting an array is that its elements start out in some random order and need to be arranged from lowest to highest.

    It is easy to see that the list

    1, 5, 6, 19, 23, 45, 67, 98, 124, 401

    is sorted, whereas the list

    4, 1, 90, 34, 100, 45, 23, 82, 11, 0, 600, 345

    is not. The property that makes the second one "not sorted" is that there are adjacent elements that are out of order. The first item is greater than the second instead of less, and likewise the third is greater than the fourth and so on. Once this observation is made, it is not very hard to devise a sort that proceeds by examining adjacent elements to see if they are in order, and swapping them if they are not.
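    The observation about adjacent elements translates directly into a check. The sketch below reports whether an array is in ascending order; the function name is_sorted is an assumption for this example.

```c
/* Return 1 if x[0..n-1] is in ascending order, 0 otherwise
   (hypothetical helper). Equal neighbours count as in order. */
int is_sorted(const int x[], int n)
{
    int j;

    for (j = 0; j + 1 < n; j++) {
        if (x[j] > x[j + 1])
            return 0;            /* found an adjacent pair out of order */
    }
    return 1;
}
```

    Applied to the two lists above, the check succeeds for the first and fails at the very first pair (4, 1) of the second.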

    Bubble Sort

    This sorting technique is so named because its logic is similar to that of bubbles in water. When a bubble is formed it is small at the bottom, and as it moves up it becomes bigger and bigger, i.e. bubbles are in ascending order of their size from the bottom to the top. This sorting method proceeds by scanning through the elements one pair at a time, and swapping any adjacent pairs it finds to be out of order.


    Example 6.1

    Input sequence: 34 8 64 51 32 21

    Iteration #   Altered sequence after the iteration   # of swaps
    -----------------------------------------------------------------
    1             8 34 51 32 21 64                       4
    2             8 34 32 21 51 64                       2
    3             8 32 21 34 51 64                       2
    4             8 21 32 34 51 64                       1
    5             8 21 32 34 51 64                       0
    6             8 21 32 34 51 64                       0

    Each pass consists of comparing each element in the file with its successor (x[i] > x[i+1]) and swapping the two elements if they are not in proper order. After each pass i, the largest element x[n-(i-1)] is in its proper position within the sorted array.

    Bubble Sort - Algorithm

    #define TRUE 1
    #define FALSE 0

    void bubble(int x[], int n)
    {
        int hold, j, pass;
        int switched = TRUE;
        for (pass = 0; pass < n - 1 && switched == TRUE; pass++)
        {
            switched = FALSE;
            for (j = 0; j < n-pass-1; j++)
                if (x[j] > x[j+1])
                {
                    switched = TRUE; /* swap x[j], x[j+1] */
                    hold = x[j];
                    x[j] = x[j+1];
                    x[j+1] = hold;
                }
        } /* it stops if there is no swap in the pass */
    }

    In the first pass, n-1 items have to be scanned. On the second pass, the second largest item will move to its correct position, and on the third pass (stopping at item n-3) the third largest will be in place. It is this gradual filtration, or bubbling of the larger items to the top end, that gives this sorting technique its name.


    There are two ways in which the sort can terminate with everything in the right order. It could complete by reaching the n-1st pass and placing the second smallest item in its correct position.

    Alternatively, it could find on some earlier pass that nothing needs to be swapped. That is, all adjacent pairs are already in the correct order. In this case, there is no need to go on to subsequent passes, for the sort is complete already. If the list started in sorted order, this would happen on the very first pass. If it started in reverse order, it would not happen until the last one.
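    To see the early-termination rule at work, it can help to run one pass at a time and count the swaps. The sketch below is an illustration; the function name bubble_pass and its limit parameter (the index of the last element still taking part) are assumptions for this example.

```c
/* Perform one bubble pass over x[0..limit], comparing each element with
   its successor. Returns the number of swaps made (hypothetical helper). */
int bubble_pass(int x[], int limit)
{
    int j, hold, swaps = 0;

    for (j = 0; j < limit; j++) {
        if (x[j] > x[j + 1]) {
            hold = x[j];
            x[j] = x[j + 1];
            x[j + 1] = hold;
            swaps++;
        }
    }
    return swaps;
}
```

    Calling it on the sequence of Example 6.1 with limit 5, 4, 3, 2 and 1 yields 4, 2, 2, 1 and 0 swaps; a pass that returns 0 means the sort can stop.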

    Quick Sort

    In this sort an element called the pivot is identified and fixed in its place by moving all the elements less than it to its left and all the elements greater than it to its right. Since it partitions the element sequence into left, pivot and right, it is referred to as sorting by partitioning. Instead of moving a single element towards its place, a pair of elements is moved in a single swap. This makes the sorting quick. After the partitioning, each of the sub-lists is sorted, which causes the entire array to be sorted.

    void quickSort(int first, int last)
    {
        int mid;
        if (first < last) /* if the part being sorted isn't empty */
        {
            mid = quickParition(first, last);
            if (mid-1 > first)
                quickSort(first, mid-1);
            if (mid+1 < last)
                quickSort(mid+1, last);
        }
        return;
    }

    The hardest part of quick sort is the partitioning of elements. The algorithm looks at the first element of the array (called the "pivot"). It will put all of the elements which are less than the pivot in the lower portion of the array and the elements higher than the pivot in the upper portion of the array. When that is complete, it can put the pivot between those two sections and quick sort will be able to sort the two sections separately.

    The details of the partitioning algorithm depend on counters which are moving from the ends of the array toward the center. Each will move until it finds a value which is in the wrong section of the array (larger than the pivot and in the lower portion, or less than the pivot and in the upper portion). Those entries will be swapped to put them into their appropriate sections and the counters will continue searching for out of place values. When the two counters cross, partitioning is complete and the pivot can be swapped to its proper place between the two sections.


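    int quickParition(int first, int last)
    {
        int i, j, mid_val;
        mid_val = data[first]; /* This is the pivot value */
        i = first+1;
        j = last;
        while (i <= j)
        {
            /* move i right past values that belong in the lower portion */
            while ((i <= j) && (data[i] <= mid_val))
                i++;
            /* move j left past values that belong in the upper portion */
            while ((j >= i) && (data[j] > mid_val))
                j--;
            if (i < j)
                swap(i,j);
            else
                i++;
        }
        if (j != first)
            swap(j,first);
        return j;
    }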

    Example 6.2

    Input sequence: 34, 8, 64, 51, 32, 21

    Square brackets are used to demarcate sub files yet to be sorted.

    R1   R2   R3   R4   R5   R6        m   n
    [34   8   64   51   32   21]       1   6
    [32   8   21]  34  [51   64]       1   3
    [21   8]  32   34  [51   64]       1   2
    [8]  21   32   34  [51   64]       1   1
    8    21   32   34  [51   64]       5   6
    8    21   32   34   51  [64]       6   6
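    A self-contained sketch of the same scheme is given below so that the example can be traced in code. It is an illustration under assumptions: the array is passed as a parameter instead of being a global, indices are 0-based, and the names quick_partition, quick_sort and swap_items are chosen for this example.

```c
/* Exchange data[i] and data[j]. */
static void swap_items(int data[], int i, int j)
{
    int hold = data[i];
    data[i] = data[j];
    data[j] = hold;
}

/* Partition data[first..last] around the pivot data[first] and return the
   pivot's final index. */
int quick_partition(int data[], int first, int last)
{
    int mid_val = data[first];   /* the pivot value */
    int i = first + 1;
    int j = last;

    while (i <= j) {
        while (i <= j && data[i] <= mid_val)
            i++;
        while (j >= i && data[j] > mid_val)
            j--;
        if (i < j)
            swap_items(data, i, j);
    }
    if (j != first)
        swap_items(data, j, first);
    return j;
}

/* Sort data[first..last] in ascending order. */
void quick_sort(int data[], int first, int last)
{
    int mid;

    if (first < last) {
        mid = quick_partition(data, first, last);
        quick_sort(data, first, mid - 1);
        quick_sort(data, mid + 1, last);
    }
}
```

    On the input of Example 6.2 the first partition leaves 32 8 21 34 51 64 with the pivot 34 at index 3, which corresponds to the second row of the table.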

    Heap Sort

    In heap sort the file to be sorted is interpreted as a binary tree. The sorting technique is implemented using an array, which is a sequential representation of a binary tree. The positioning of a node is given as follows:

    For a node at position i, the parent is at position i/2, the left child is at position 2i and the right child is at position 2i+1 (provided that 2i and 2i+1 do not exceed n, the number of elements).


    Example 6.3

    The list of numbers 34, 8, 64, 51, 32, 21 is arranged in an array initially as in the Input File of the example given below. Here the value of n is 6, hence the last parent is at index 6/2 = 3. Parent 64 (index 3) is compared with its largest child 21; since 64 > 21, it is retained in its position. Parent 8 (index 2) is compared with its largest child 51, and the two are interchanged since 8 < 51. Now the root 34 (index 1) is compared with its largest child 64, and the two are interchanged since 34 < 64. The result is shown in the Initial Heap.

    [Figures: Input File and Initial Heap]

    In fig 6.3(a) given below, the first largest number 64, which was brought into the root, is interchanged with the last element 21 (index 6) in the tree. For easy identification of arranged elements, the edge is removed from its parent. In fig 6.3(b) given below, the same procedure is followed to bring 51 to the root, and it is interchanged with the element in index 5. The same step is followed in fig 6.3(c) and fig 6.3(d) to get a sorted file as given in fig 6.3(e).

    [Figures 6.3 (a) and 6.3 (b)]


    [Figures 6.3 (c), 6.3 (d) and 6.3 (e) Sorted File]

    Algorithm 6.3.1: Heap Sort implementation

    Heap is an algorithm which sorts the given set of numbers using the heap sort technique, where n is the number of elements and a is the array representation of the elements in the input binary tree. The heap algorithm 6.3.1 calls the adjust algorithm 6.3.2 each time heaping is needed.

    void heap(int a[], int n)
    {
        int i, t;
        for (i = n/2; i >= 1; i--)
        {
            adjust(a, i, n);
        }
        for (i = n; i >= 2; i--)
        {
            t = a[i];
            a[i] = a[1];
            a[1] = t; /* the current largest element moves to position i */
            adjust(a, 1, i-1);
        }
    }


    Algorithm 6.3.2

    adjust(int x[10], int i, int n)
    {
        int item, j;
        j = 2 * i;
        item = x[i];
        while (j
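    The listing above is incomplete in this copy. As a minimal sketch of what an adjust step typically does, the code below uses the common sift-down technique on a 1-based array (index 0 unused), matching the parent and child positions described earlier. It is an assumption about the intended routine, not the handout's own code, and heap_sort is a name chosen for this example.

```c
/* Sift x[i] down until the subtree rooted at i, within x[1..n], satisfies
   the max-heap property (assumed reconstruction of the adjust step). */
void adjust(int x[], int i, int n)
{
    int item = x[i];
    int j = 2 * i;               /* left child of i */

    while (j <= n) {
        if (j < n && x[j] < x[j + 1])
            j++;                 /* pick the larger of the two children */
        if (item >= x[j])
            break;               /* item is in its proper place */
        x[j / 2] = x[j];         /* move the larger child up */
        j = 2 * j;
    }
    x[j / 2] = item;
}

/* Sort a[1..n] in ascending order using the adjust step above. */
void heap_sort(int a[], int n)
{
    int i, t;

    for (i = n / 2; i >= 1; i--) /* build the initial heap */
        adjust(a, i, n);
    for (i = n; i >= 2; i--) {
        t = a[i];                /* move the largest element to the end */
        a[i] = a[1];
        a[1] = t;
        adjust(a, 1, i - 1);     /* restore the heap on the rest */
    }
}
```

    For the input of Example 6.3, the heap-building loop alone produces 64 51 34 8 32 21 in positions 1 to 6, which agrees with the initial heap described in the example.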


    Algorithm : Linear search implementation

    #include <stdbool.h>

    bool linear_search ( int *list, int size, int key, int **rec )
    {
        /* Basic linear search */
        bool found = false;
        int i;
        for ( i = 0; i < size; i++ )
        {
            if ( key == list[i] )
                break;
        }
        if ( i < size )
        {
            found = true;
            *rec = &list[i]; /* hand the matching element back to the caller */
        }
        return found;
    }

    The code searches for the element through a loop running from index 0 to size - 1. The loop can terminate in one of two ways. If the index variable i reaches the end of the list, the loop condition fails. If the current item in the list matches the key, the loop is terminated early with a break statement. Then the algorithm tests the index variable to see if it is less than size (thus the loop was terminated early and the item was found), or not (and the item was not found).

    Example 6.4

    Assume the element 45 is searched for in a sequence of sorted elements 12, 18, 25, 36, 45, 48, 50. The linear search starts from the first element 12. Since this is not the value being searched for (45), the next element 18 is compared, and it is also not 45. In this way all the elements before 45 are compared, and when the index is 5, the element 45 is compared with the search value and is equal. Hence the element is found and the element position is 5.

    List                    i   Result of comparison
    12 18 25 36 45 48 50    1   12 != 45 : false
    12 18 25 36 45 48 50    2   18 != 45 : false
    12 18 25 36 45 48 50    3   25 != 45 : false
    12 18 25 36 45 48 50    4   36 != 45 : false
    12 18 25 36 45 48 50    5   45 = 45 : true


    Binary Search

    In a linear search the search is done over the entire list even if the element to be searched is not available. Some of our improvements work to minimize the cost of traversing the whole data set, but those improvements only cover up what is really a problem with the algorithm. By thinking of the data in a different way, we can make speed improvements that are much better than anything linear search can guarantee. Consider a list in sorted order. It would work to search from the beginning until an item is found or the end is reached, but it makes more sense to remove as much of the working data set as possible so that the item is found more quickly. If we started at the middle of the list we could determine which half the item is in (because the list is sorted). This effectively divides the working range in half with a single test. This in turn reduces the time complexity.

    Algorithm:

    #include <stdbool.h>

    bool Binary_Search ( int *list, int size, int key, int **rec )
    {
        bool found = false;
        int low = 0, high = size - 1;
        while ( high >= low )
        {
            int mid = ( low + high ) / 2;
            if ( key < list[mid] )
                high = mid - 1;
            else
                if ( key > list[mid] )
                    low = mid + 1;
                else
                {
                    found = true;
                    *rec = &list[mid]; /* hand the matching element back to the caller */
                    break;
                }
        }
        return found;
    }


    Example 6.5

    Binary search is applied to the data in Example 6.4. The active part of the search lies between positions i and j.

    List                    i   j   mid   Result of comparison
    12 18 25 36 45 48 50    1   7   4     45 > 36 : Right part
    12 18 25 36 45 48 50    5   7   6     45 < 48 : Left part
    12 18 25 36 45 48 50    5   6   5     45 = 45 : Found
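    The halving can be made visible by counting how many middle elements are examined. The sketch below is an illustration with assumed names: binary_search_index returns the 0-based index of the key (or -1) and stores the number of probes through the optional probes pointer.

```c
#include <stddef.h>

/* Search the sorted array list[0..size-1] for key. Returns the 0-based
   index of key or -1 if it is absent. If probes is not NULL, the number
   of middle elements examined is stored there (hypothetical helper). */
int binary_search_index(const int list[], int size, int key, int *probes)
{
    int low = 0, high = size - 1;
    int count = 0, result = -1;

    while (high >= low) {
        int mid = low + (high - low) / 2;
        count++;
        if (key < list[mid]) {
            high = mid - 1;
        } else if (key > list[mid]) {
            low = mid + 1;
        } else {
            result = mid;
            break;
        }
    }
    if (probes != NULL)
        *probes = count;
    return result;
}
```

    For the seven elements of Example 6.4, the key 45 is found at 0-based index 4 (position 5) after three probes, in line with the three rows of Example 6.5, whereas the linear search needed five comparisons.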

    Method of search | Advantages | Disadvantages
    Linear | Simple; elements need not be in order | Less efficient, since its time complexity, O(n), is higher than that of binary search
    Binary | More efficient, since its time complexity, O(log n), is lower than that of linear search | Not as simple as linear search; elements must be in order

    Summary

    Sorting is the process of arranging elements in either ascending or descending order. This makes searching faster.

    Bubble sorting is a sorting in which each element is compared with its adjacent element and the largest value is moved to the last position.

    Quick sorting is a sorting by partitioning. Instead of a single element, a pair of elements is arranged in one swap.

    Heap sorting is a sorting by heaping the elements in a tree. It works with the same complexity in its worst, best and average cases.

    In Linear search all the elements preceding the search element must be examined.

    In Binary search the middle element is compared and only the left or the right part is checked, instead of all the elements.


    Test your Understanding

    1. Which of the following sorts works with the same complexity in all cases?
       a. Heap sort
       b. Quick sort
       c. Merge sort
       d. Bubble sort

    2. Quick sort works better if the input elements are in
       a. Sorted order
       b. Jumbled order
       c. Reverse order
       d. All the above

    Answers
    1. a
    2. b


    Session 8: Trees

    Learning Objectives

    After completing this chapter, you will be able to

    Describe a tree

    Explain how a tree can be represented internally

    Describe how a tree can be traversed

    Overview:

    The data structures discussed in the previous sessions, such as lists, stacks and queues, are all linear data structures. A tree is one of several types of non-linear data structure.

    A tree is a collection of nodes arranged in a hierarchical fashion, with a specially designated node called the root. Except for the root, every node has a parent in the level above it.

    The parent of a particular node is the node directly above it in the hierarchy. A node can have exactly one parent, i.e. a node can be attached to exactly one node in the level above it.

    Example 8.1

    [Figure: a general tree with root A, whose children are B, C and D; the leaves are E, F, G and H]


    The following table lists some of the important terms related to a general tree structure.

    | Term | Description | Example |
    |---|---|---|
    | Node | An item or single element represented in a tree | A, B, C, ..., H |
    | Root | Node that does not have any ancestors (parent or grandparent) | A |
    | Sub tree | Internal nodes in a tree which have both an ancestor (parent) and a descendant (child) | B, C, D |
    | Leaf | External nodes that do not have any descendant (child) | E, F, G, H |
    | Edge | The line that depicts the connectivity between two nodes | (A-B), (A-C) |
    | Path | Sequence of connected nodes | A-B-E for E from the root |
    | Length | Number of nodes involved in the path | 2 for E from B |
    | Height | Length of the longest path from the root | 3 |
    | Depth | Length of the path to that node from the root | 2 for D |
    | Degree of a node | Number of children connected from that node | 3 for A, 1 for B and D, 2 for C, and 0 for leaves |
    | Degree of a tree | Degree of the node which has the maximum degree | 3 (since A has the maximum degree) |

    Some applications of trees are:

    representing family genealogy

    as the underlying structure in decision-making algorithms

    to represent priority queues (a special kind of tree called a heap)

    to provide fast access to information in a database (a special kind of tree called a b-tree)

    Binary Tree

    A binary tree is a finite set of nodes which is either empty, or consists of a root and two disjoint binary trees, called the left and right sub-trees. In other words, it can be defined as a tree in which every node has a maximum degree of 2, i.e. a node can have at most two children.

    A binary tree differs from a general tree in the following aspects:

    A tree must have at least one node but a binary tree may be empty.

    A tree may have any number of sub-trees but a binary tree can have at most two.


    Example 8.2

    Full Binary tree: A binary tree in which all the leaf nodes are in the same level and every internal node has exactly two children is called a full binary tree.

    Example 8.3

    Complete Binary tree: A binary tree whose array representation is contiguous, without any null entries in between, is a complete binary tree.

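    [Figures: Example 8.2 shows a binary tree with root A, children B and C, and third-level nodes D, F and G; Example 8.3 shows a full binary tree with root A, children B and C, and leaves D, E, F and G]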


    Example 8.4

    Array representation of the above tree is:

    Index:   0  1  2  3  4
    Element: A  B  C  D  E

    In a binary tree, the maximum number of nodes at level i (the level of the root node is 1) is $2^{i-1}$, and the maximum number of nodes up to level i is $2^i - 1$.

    Example 8.5

    For the tree in Example 8.2:
    Maximum number of nodes at level 2 is $2^{2-1} = 2$
    Maximum number of nodes at level 3 is $2^{3-1} = 4$
    Maximum number of nodes up to level 2 is $2^2 - 1 = 3$
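The two level formulas can be checked with a short C sketch. The function names below are illustrative and do not come from the handout; the sketch assumes the root is at level 1 and that the level is small enough for the result to fit in an unsigned long.

```c
/* Maximum number of nodes at a given level of a binary tree.
   Assumes level >= 1 (the root is level 1). */
unsigned long max_nodes_at_level(unsigned int level)
{
    return 1UL << (level - 1);      /* 2^(level-1) */
}

/* Maximum number of nodes in levels 1..level taken together. */
unsigned long max_nodes_upto_level(unsigned int level)
{
    return (1UL << level) - 1;      /* 2^level - 1 */
}
```

For example, max_nodes_at_level(3) gives 4 and max_nodes_upto_level(2) gives 3, matching Example 8.5.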

    Skewed binary tree: A binary tree is a skewed binary tree if all its internal nodes have only a left child (skewed left) or only a right child (skewed right).

    [Figure for Example 8.4: a complete binary tree with root A, children B and C, and leaves D and E]


    Example 8.6

    Skewed left Skewed right

    Tree Representation

    A binary tree can be represented in two ways:
    1. Array representation
    2. Linked list representation

    Array representation

    The binary tree can be represented in an array, as discussed in the heap sort.
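As a minimal sketch of the array representation, the helpers below compute where the children and the parent of a node are stored. They assume the root is stored at index 0, as in Example 8.4, and that the tree is stored level by level; the function names are illustrative and not part of the handout.

```c
/* Index arithmetic for a binary tree stored level by level in an array,
   with the root at index 0 (assumption taken from Example 8.4). */
int left_child(int i)  { return 2 * i + 1; }
int right_child(int i) { return 2 * i + 2; }

/* Index of the parent; meaningful only for i > 0 (the root has no parent). */
int parent(int i)      { return (i - 1) / 2; }
```

With the array A B C D E of Example 8.4, the children of B (index 1) are found at indexes 3 and 4, i.e. D and E.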

    Linked list representation

    Since a binary-tree node never has more than two children, a node can be represented with three fields: one field for the data in the node and the remaining two fields for pointers to its two children.

    Left child | Data | Right child

    The programming representation of a node is as follows.

    struct BinaryTreenode
    {
        struct BinaryTreenode *leftChild;    /* pointer to the left child, or NULL */
        char data;                           /* data stored in the node */
        struct BinaryTreenode *rightChild;   /* pointer to the right child, or NULL */
    };

    Many algorithms pertaining to tree structures involve a process in which each node of the tree is visited, or processed, exactly once. Such a process is called a traversal.

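    [Figure for Example 8.6: two skewed binary trees, one skewed left and one skewed right, each built from the nodes A, B and D]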


    Tree Traversals

    A tree can be traversed in three different ways:

    Inorder traversal

    Preorder traversal

    Postorder traversal.

    In all the traversal types the order of the left and right sub-trees does not change, i.e. the left sub-tree is always traversed before the right sub-tree. The type of traversal is decided by the position at which the data of the node is visited.

    In preorder traversal the data is visited before its sub-trees are traversed.

    In postorder traversal the data is visited after its sub-trees are traversed.

    In inorder traversal the data is visited between its sub-trees.

    Simple steps in traversals

    Preorder traversal
    o Visit the root
    o Traverse the left sub-tree in preorder
    o Traverse the right sub-tree in preorder

    Inorder traversal
    o Traverse the left sub-tree in inorder
    o Visit the root
    o Traverse the right sub-tree in inorder

    Postorder traversal
    o Traverse the left sub-tree in postorder
    o Traverse the right sub-tree in postorder
    o Visit the root


    Example 8.7

    Inorder traversal   : D B E A I H J F C G
    Preorder traversal  : A B D E C F H I J G
    Postorder traversal : D E B I J H F G C A

    Algorithms for the tree traversals

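    Inorder traversal

    /* Needs <stdio.h> for printf; struct btreenode is assumed to have
       the members left, data (an int) and right. */
    void inorder(struct btreenode *sr)
    {
        if (sr != NULL)
        {
            inorder(sr->left);           /* traverse the left sub-tree */
            printf("%d\n", sr->data);    /* visit the node */
            inorder(sr->right);          /* traverse the right sub-tree */
        }
    }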

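    [Figure for Example 8.7: a binary tree with root A; B has children D and E; C has children F and G; H is the left child of F; I and J are the children of H]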


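    Preorder traversal

    /* Needs <stdio.h> for printf. */
    void preorder(struct btreenode *sr)
    {
        if (sr != NULL)
        {
            printf("%d\n", sr->data);    /* visit the node */
            preorder(sr->left);          /* traverse the left sub-tree */
            preorder(sr->right);         /* traverse the right sub-tree */
        }
    }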

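    Postorder traversal

    /* Needs <stdio.h> for printf. */
    void postorder(struct btreenode *sr)
    {
        if (sr != NULL)
        {
            postorder(sr->left);         /* traverse the left sub-tree */
            postorder(sr->right);        /* traverse the right sub-tree */
            printf("%d\n", sr->data);    /* visit the node */
        }
    }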
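To try the three traversals on the tree of Example 8.7, the following sketch builds that tree and collects the visited labels in a character buffer instead of printing them. The names tnode, new_node, build_example_8_7 and the *_str functions are assumptions made for this illustration, and the tree shape is the one implied by the three traversal orders listed in Example 8.7.

```c
#include <stdlib.h>

struct tnode {
    struct tnode *left;
    char data;
    struct tnode *right;
};

/* Allocates a node with the given children; returns NULL if malloc fails. */
struct tnode *new_node(char data, struct tnode *left, struct tnode *right)
{
    struct tnode *n = malloc(sizeof *n);
    if (n != NULL) {
        n->data = data;
        n->left = left;
        n->right = right;
    }
    return n;
}

/* Each traversal writes the visited labels to out and returns a pointer
   just past the last label written. The caller supplies a large enough
   buffer and adds the terminating '\0' if a string is needed. */
char *inorder_str(const struct tnode *t, char *out)
{
    if (t != NULL) {
        out = inorder_str(t->left, out);
        *out++ = t->data;
        out = inorder_str(t->right, out);
    }
    return out;
}

char *preorder_str(const struct tnode *t, char *out)
{
    if (t != NULL) {
        *out++ = t->data;
        out = preorder_str(t->left, out);
        out = preorder_str(t->right, out);
    }
    return out;
}

char *postorder_str(const struct tnode *t, char *out)
{
    if (t != NULL) {
        out = postorder_str(t->left, out);
        out = postorder_str(t->right, out);
        *out++ = t->data;
    }
    return out;
}

/* Builds the tree of Example 8.7 (allocation failures are not handled
   in this short sketch). */
struct tnode *build_example_8_7(void)
{
    struct tnode *h = new_node('H', new_node('I', NULL, NULL),
                                    new_node('J', NULL, NULL));
    struct tnode *f = new_node('F', h, NULL);
    struct tnode *c = new_node('C', f, new_node('G', NULL, NULL));
    struct tnode *b = new_node('B', new_node('D', NULL, NULL),
                                    new_node('E', NULL, NULL));
    return new_node('A', b, c);
}
```

Calling inorder_str on the tree returned by build_example_8_7 fills the buffer with DBEAIHJFCG, which is the inorder sequence given in Example 8.7.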

    Binary Search Tree (BST)

    A BST is a binary tree which has the following properties.

    All elements stored in the left subtree of a node whose value is K have values less than K.

    All elements stored in the right subtree of a node whose value is K have values greater than or equal to K.

    That is, a node's left child must have a key less than its parent, and a node's right child must have a key greater than or equal to its parent.

    The left and right sub-trees of a node are also binary search trees.


    Example 8.8

    Operations that can be performed on a BST are:

    Creation

    Insertion

    Deletion

    Searching

    Creation

    The first element in the list is made the root of the tree. Each element that follows is placed in the left sub-tree if it is less than the root and in the right sub-tree if it is greater than or equal to the root. In other words, creation is a combination of search and insertion after the creation of the root node.

    Searching

    The search always starts from the root node. If the value being searched for is less than the root value, the left sub-tree is searched. If the search value is greater than the node value, the right sub-tree is searched. The search continues until the value is found or until it ends with no branch left to follow.

    Insertion

    The steps involved in inserting a node are:

    Search the tree for the value that has to be inserted (even though it is not present).

    If the search ended at a node X, insert the new node as the left child of X if the new value is less than X; otherwise insert it as the right child of X.
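The searching and insertion steps can be sketched in C as follows. The names bst_node, bst_search and bst_insert are illustrative, not taken from the handout, and keys equal to an existing key are sent to the right sub-tree, following the BST definition given earlier.

```c
#include <stdlib.h>

struct bst_node {
    struct bst_node *left;
    int key;
    struct bst_node *right;
};

/* Returns the node holding key, or NULL when the search runs out of
   branches to follow. */
struct bst_node *bst_search(struct bst_node *root, int key)
{
    while (root != NULL && root->key != key)
        root = (key < root->key) ? root->left : root->right;
    return root;
}

/* Inserts key as a new leaf and returns the root of the subtree.
   If malloc fails, the tree is returned unchanged. */
struct bst_node *bst_insert(struct bst_node *root, int key)
{
    if (root == NULL) {                     /* the search ended here */
        struct bst_node *n = malloc(sizeof *n);
        if (n != NULL) {
            n->key = key;
            n->left = n->right = NULL;
        }
        return n;
    }
    if (key < root->key)
        root->left = bst_insert(root->left, key);
    else
        root->right = bst_insert(root->right, key);
    return root;
}
```

Creation is then a loop of insertions starting from an empty tree, for example root = bst_insert(root, 63) followed by the remaining keys.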

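    [Figure for Example 8.8: a BST with root 63; 47 has children 6 and 54; 71 has children 67 and 84; 79 and 91 are the children of 84]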


    Example 8.9: Inserting 15 in BST

    The dotted line represents the search and the dotted circle represents the newly added node.

    15 is greater than 6 hence it is joined as its right child.

    Deletion

    The node which has to be deleted is first searched for from the root to find its position. The deletion operation is easiest if the node to be deleted is a leaf node: the link from its parent is disconnected in order to delete that node.

    If the node is a non-leaf node, the deletion is carried out as below.

    If the non-leaf node has a single sub-tree, the child node takes its place.

    If the non-leaf node has both left and right sub-trees, either its in-order predecessor or its in-order successor takes its place (i.e. the greatest left descendant or the smallest right descendant).
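The three deletion cases can be sketched in C as below. This is one possible implementation under stated assumptions: the names are illustrative, a node with two sub-trees is replaced by its in-order successor (the smallest right descendant) rather than the predecessor, and bst_insert is included only so that the sketch is self-contained.

```c
#include <stdlib.h>

struct bst_node {
    struct bst_node *left;
    int key;
    struct bst_node *right;
};

/* Plain BST insertion; equal keys go to the right sub-tree. */
struct bst_node *bst_insert(struct bst_node *root, int key)
{
    if (root == NULL) {
        struct bst_node *n = malloc(sizeof *n);
        if (n != NULL) {
            n->key = key;
            n->left = n->right = NULL;
        }
        return n;
    }
    if (key < root->key)
        root->left = bst_insert(root->left, key);
    else
        root->right = bst_insert(root->right, key);
    return root;
}

/* Deletes one node holding key (if present) and returns the new root
   of the subtree. */
struct bst_node *bst_delete(struct bst_node *root, int key)
{
    if (root == NULL)
        return NULL;                            /* key not found */
    if (key < root->key) {
        root->left = bst_delete(root->left, key);
    } else if (key > root->key) {
        root->right = bst_delete(root->right, key);
    } else if (root->left != NULL && root->right != NULL) {
        /* Two sub-trees: copy the smallest right descendant into this
           node, then delete that descendant from the right sub-tree. */
        struct bst_node *succ = root->right;
        while (succ->left != NULL)
            succ = succ->left;
        root->key = succ->key;
        root->right = bst_delete(root->right, succ->key);
    } else {
        /* Leaf or single sub-tree: the child (or NULL) takes its place. */
        struct bst_node *child = (root->left != NULL) ? root->left
                                                      : root->right;
        free(root);
        return child;
    }
    return root;
}
```

Applied to the tree of Example 8.9, bst_delete(root, 71) puts 79 in the place of 71, which corresponds to the "replaced by its right descendant" case.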

    Example 8.10: Deleting 71 from Example 8.9

    The dotted line represents the search and the dotted circle represents the node to be deleted.

    [Figure for Example 8.9: the BST of Example 8.8 with 15 inserted as the right child of 6]


    The node 71 is replaced by either its left or its right descendant.

    Replaced by its left descendant (67)    Replaced by its right descendant (79)

    Advantage of a BST

    Searching for a node in a BST is faster, since only the left or the right sub-tree is searched at each step from the root until the node is found, instead of comparing all the nodes preceding it.

    Disadvantage of a BST

    The tree may become a skewed binary tree if the elements are inserted in ascending order (skewed right) or in descending order (skewed left), which leads to more levels.

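    [Figures for Example 8.10: the BST of Example 8.9 before deletion; the BST with 71 replaced by 67, its greatest left descendant; and the BST with 71 replaced by 79, its smallest right descendant]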


    Summary

    A tree is a collection of nodes arranged in a hierarchical fashion

    A binary tree is a tree with 2 as its maximum degree

    A tree can be represented using either an array or a linked list

    A tree can be traversed in 3 ways

    A binary search tree is a binary tree in which all the left descendants of a node are less than the node and all its right descendants are greater than or equal to it.

    Test your Understanding

    1. A complete binary tree is a tree in which ----
    a. All the leaf nodes are in the same level
    b. All the parent nodes have exactly two children
    c. The representation is contiguous without any null branch in between
    d. None of the above

    2. Binary search tree must be a ----
    a. Complete binary tree
    b. Full binary tree
    c. Either a or b
    d. Need not be a or b

    Answers
    1. c
    2. d


    Session 10: Balanced trees and hashing

    Learning Objectives

    After completing this chapter you will be able to

    Define a balanced tree

    Identify how a balanced tree can be constructed from a Binary tree

    Define hashing

    List the advantages and disadvantages of Hashing

    Overview:

    Balanced trees are classified into two categories:

    Height balanced tree

    Weight balanced tree

    AVL Tree

    An AVL tree is a height-balanced binary search tree. A normal BST has more null branches if the elements are inserted almost in order; this leads to more levels and in turn needs more space. This problem is solved by balancing the height whenever a node is inserted into an AVL tree. Whether re-balancing is needed is decided from the balancing factor.

    Balancing factor

    The balancing factor of each node is calculated by finding the difference in levels between its left and right sub-trees.

    Balancing factor of X = height of left sub-tree of X - height of right sub-tree of X

    If the balancing factor of every node in the tree is within the range -1 to 1, the tree is already in balanced form; otherwise balancing is needed.
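The balancing factor can be computed directly from its definition. The sketch below is illustrative only: the names tree_node, tree_height, balancing_factor and is_height_balanced are assumptions, and height is counted in levels (an empty sub-tree has height 0, a single node has height 1).

```c
#include <stddef.h>

struct tree_node {
    struct tree_node *left;
    int key;
    struct tree_node *right;
};

/* Height counted in levels: 0 for an empty tree, 1 for a single node. */
int tree_height(const struct tree_node *n)
{
    if (n == NULL)
        return 0;
    int hl = tree_height(n->left);
    int hr = tree_height(n->right);
    return 1 + (hl > hr ? hl : hr);
}

/* Balancing factor of X = height of left sub-tree - height of right
   sub-tree. X must not be NULL. */
int balancing_factor(const struct tree_node *x)
{
    return tree_height(x->left) - tree_height(x->right);
}

/* Returns 1 if every node has a balancing factor in the range -1..1,
   and 0 otherwise. */
int is_height_balanced(const struct tree_node *n)
{
    if (n == NULL)
        return 1;
    int bf = balancing_factor(n);
    return bf >= -1 && bf <= 1
        && is_height_balanced(n->left)
        && is_height_balanced(n->right);
}
```

Recomputing the height at every node is simple but slow; an AVL implementation would normally store the height or the balancing factor in each node.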

    AVL Tree Rotations

    As mentioned previously, an AVL tree and the nodes it contains must meet strict balance requirements to maintain its O(log n) search capabilities. These balance restrictions are maintained using various rotation functions. Below is a diagrammatic overview of the four possible rotations that can be performed on an unbalanced AVL tree, illustrating the before and after states of an AVL tree requiring the rotation.


    Example 10.1: LL Rotation

    Example 10.2: RR Rotation


    Example 10.3: LR Rotation

    Example 10.4: RL Rotations


    Inserting in an AVL Tree

    Nodes are initially inserted into AVL trees in the same manner as in an ordinary binary search tree (that is, they are always inserted as leaf nodes). After insertion, however, the insertion algorithm for an AVL tree travels back along the path it took to find the point of insertion, and checks the balance at each node on the path. If a node is found that is unbalanced (that is, it has a balance factor of either -2 or +2), then a rotation is performed based on the inserted node's position relative to the node being examined (the unbalanced node).

    NB. There will only ever be at most one rotation required after an insert operation.
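The insertion procedure, together with the four rotation cases, can be sketched in C as follows. This is a minimal illustration under stated assumptions rather than the handout's own code: each node stores the height of its subtree (a leaf has height 1), all names are hypothetical, and the LR and RL cases are handled as two single rotations applied one after the other.

```c
#include <stdlib.h>

struct avl_node {
    struct avl_node *left;
    int key;
    int height;                 /* height of the subtree rooted here */
    struct avl_node *right;
};

int avl_height(const struct avl_node *n) { return n ? n->height : 0; }

void avl_update(struct avl_node *n)
{
    int hl = avl_height(n->left), hr = avl_height(n->right);
    n->height = 1 + (hl > hr ? hl : hr);
}

/* Balancing factor = height of left subtree - height of right subtree. */
int balance_factor(const struct avl_node *n)
{
    return n ? avl_height(n->left) - avl_height(n->right) : 0;
}

/* Single right rotation: repairs a left-left (LL) imbalance at y. */
struct avl_node *avl_rotate_right(struct avl_node *y)
{
    struct avl_node *x = y->left;
    y->left = x->right;
    x->right = y;
    avl_update(y);
    avl_update(x);
    return x;
}

/* Single left rotation: repairs a right-right (RR) imbalance at x. */
struct avl_node *avl_rotate_left(struct avl_node *x)
{
    struct avl_node *y = x->right;
    x->right = y->left;
    y->left = x;
    avl_update(x);
    avl_update(y);
    return y;
}

/* Inserts key as in a BST, then rebalances on the way back up. */
struct avl_node *avl_insert(struct avl_node *root, int key)
{
    if (root == NULL) {
        struct avl_node *n = malloc(sizeof *n);
        if (n != NULL) {
            n->key = key;
            n->height = 1;
            n->left = n->right = NULL;
        }
        return n;
    }
    if (key < root->key)
        root->left = avl_insert(root->left, key);
    else
        root->right = avl_insert(root->right, key);
    avl_update(root);

    int bf = balance_factor(root);
    if (bf > 1) {                                    /* left side too tall */
        if (balance_factor(root->left) < 0)          /* LR case */
            root->left = avl_rotate_left(root->left);
        return avl_rotate_right(root);               /* LL case */
    }
    if (bf < -1) {                                   /* right side too tall */
        if (balance_factor(root->right) > 0)         /* RL case */
            root->right = avl_rotate_right(root->right);
        return avl_rotate_left(root);                /* RR case */
    }
    return root;
}
```

Inserting 50, 45, 30, 55, 63 and 53 in that order with avl_insert ends with 50 at the root, 45 and 55 as its children, 30 under 45, and 53 and 63 under 55.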

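    Example 10.5: Constructing an AVL tree for the list of elements 50, 45, 30, 55, 63, 53

    In the figures, the upper part of each node represents the balancing factor and the lower part represents the data. The trees are listed here as data (balancing factor).

    Insert 50, 45, 30: 50 (2), 45 (1), 30 (0). Node 50 is unbalanced, so an LL rotation is performed.

    After the LL rotation, insert 55: 45 (-1), 30 (0), 50 (-1), 55 (0).

    Insert 63: 45 (-2), 30 (0), 50 (-2), 55 (-1), 63 (0).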


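    The remaining steps are listed as data (balancing factor) for each node.

    After the RR rotation: 45 (-1), 30 (0), 55 (0), 50 (0), 63 (0).

    Insert 53: 45 (-2), 30 (0), 55 (1), 50 (-1), 53 (0), 63 (0).

    After the RL rotation: 50 (0), 45 (1), 30 (0), 55 (0), 53 (0), 63 (0).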

    Deletion in AVL tree

    The deletion algorithm for AVL trees is a little more complex, as there are several extra steps involved in the deletion of a node. If the node is not a leaf node (that is, it has at least one child), then the node must be swapped with either its in-order successor or its in-order predecessor (based on availability). Once the node has been swapped we can delete it (and have its parent pick up any children it may have; bear in mind that it will only ever have at most one child). If the node to be deleted was originally a leaf node, then it can simply be removed.

    Now, as with the insertion algorithm, we traverse back up the path to the root node, checking the balance of all nodes along the path. If we encounter an unbalanced node we perform an appropriate rotation to balance the node.

    NB. Unlike the insertion algorithm, more than one rotation may be required after a delete operation, so in some cases we will have to continue back up the tree after a rotation.

    Weight Balanced Trees

    Tree structures support various basic dynamic set operations including Search, Predecessor, Successor, Minimum, Maximum, Insert, and Delete in time proportional to the height of the tree. Ideally, a tree will be balanced and the height will be log n, where n is the number of nodes in the tree. To ensure that the height of the tree is as small as possible and therefore provide the best running time, a balanced tree structure like a red-black tree, AVL tree, or b-tree must be used.

    When working with large sets of data, it is often not possible or desirable to maintain the entire structure in primary storage (RAM). Instead, a relatively small portion of the data structure is maintained in primary storage, and additional data is read from secondary storage as needed. Unfortunately, a magnetic disk, the most common form of secondary storage, is significantly slower than random access memory (RAM). In fact, the system often spends more time retrieving data than actually processing data.

    B-trees are weight balanced trees that are optimized for situations when part or all of the tree must be maintained in secondary storage such as a magnetic disk. Since disk accesses are expensive (time consuming) operations, a b-tree tries to minimize the number of disk accesses. For example, a b-tree with a height of 2 and a branching factor of 1001 can store over one billion keys but requires at most two disk accesses to search for any node.

    B-Trees

    The Structure of B-Trees

    Unlike a binary tree, each node of a b-tree may have a variable number of keys and children. The keys are stored in non-decreasing order. Each key has an associated child that is the root of a subtree containing all nodes with keys less than or equal to the key but greater than the preceding key. A node also has an additional rightmost child that is the root of a subtree containing all keys greater than any key in the node.

    A b-tree has a minimum number of allowable children for each node, known as the minimization factor. If t is this minimization factor, every node must have at least t - 1 keys. Under certain circumstances, the root node is allowed to violate this property by having fewer than t - 1 keys. Every node may have at most 2t - 1 keys or, equivalently, 2t children.

    Since each node tends to have a large branching factor (a large number of children), it is typically necessary to traverse relatively few nodes before locating the desired key. If access to each node requires a disk access, then a b-tree will minimize the number of disk accesses required. The minimization factor is usually chosen so that the total size of each node corresponds to a multiple of the block size of the underlying storage device. This choice simplifies and optimizes disk access. Consequently, a b-tree is an ideal data structure for situations where all data cannot reside in primary storage and accesses to secondary storage are comparatively expensive (or time consuming).

    Height of B-Trees

    For n greater than or equal to one, the height h of an n-key b-tree T with a minimum degree t greater than or equal to 2 satisfies

    $h \le \log_t \frac{n+1}{2}$

    The worst case height is O(log n). Since the "branchiness" of a b-tree can be large compared to many other balanced tree structures, the base of the logarithm tends to be large; therefore, the number of nodes visited during a search tends to be smaller than required by other tree structures.

    Although this does not affect the asymptotic worst case height, b-trees tend to have smaller heights than other trees with the same asymptotic height.

    Operations on B-Trees

    The algorithms for the search, create, and insert operations are shown below. Note that these algorithms are single pass; in other words, they do not traverse back up the tree. Since b-trees strive to minimize disk accesses and the nodes are usually stored on disk, this single-pass approach will reduce the number of node visits and thus the number of disk accesses. Simpler double-pass approaches that move back up the tree to fix violations are possible.

    Since all nodes are assumed to be stored in secondary storage (disk) rather than primary storage (memory), all references to a given node must be preceded by a read operation denoted by Disk-Read.

    Similarly, once a node is modified and it is no longer needed, it must be written out to secondary storage with a write operation denoted by Disk-Write. The algorithms below assume that all nodes referenced in parameters have already had a corresponding Disk-Read operation. New nodes are created and assigned storage with the Allocate-Node call. The implementation details of the Disk-Read, Disk-Write, and Allocate-Node functions are operating system and implementation dependent.

    B-Tree-Search(x, k)

    [Pseudocode listings for the search, create, and insert operations: not recoverable from this transcript]
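As an illustration of the search operation B-Tree-Search(x, k), the following C sketch searches a b-tree held entirely in memory. It is not the handout's listing: the node layout, the names btree_node and btree_search, and the value chosen for MIN_DEGREE are assumptions, and the Disk-Read step is only marked by a comment.

```c
#include <stddef.h>

#define MIN_DEGREE 2                    /* t, the minimization factor (assumed) */
#define MAX_KEYS (2 * MIN_DEGREE - 1)   /* at most 2t - 1 keys per node */

struct btree_node {
    int n;                                    /* number of keys stored */
    int keys[MAX_KEYS];                       /* keys in non-decreasing order */
    struct btree_node *child[MAX_KEYS + 1];   /* up to 2t children */
    int leaf;                                 /* non-zero if there are no children */
};

/* Searches the subtree rooted at x for key k. On success, returns the
   node that contains k and stores the position of k in *pos; returns
   NULL if k is not present. */
const struct btree_node *btree_search(const struct btree_node *x, int k,
                                      int *pos)
{
    while (x != NULL) {
        int i = 0;
        while (i < x->n && k > x->keys[i])    /* find the first key >= k */
            i++;
        if (i < x->n && k == x->keys[i]) {
            *pos = i;
            return x;
        }
        if (x->leaf)
            return NULL;                      /* nowhere left to descend */
        x = x->child[i];                      /* a disk-based tree would
                                                 perform Disk-Read here */
    }
    return NULL;
}
```

The search descends one node per level and never moves back up the tree, which is the single-pass behaviour described for the operations on b-trees.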

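    B-Tree Insertion

    The keys 10, 17, 25, 9, 13, 16, 8, 5, 15, 22 are inserted in that order. In the original figures the newly added elements are underlined. The tree after each insertion is listed below, level by level, with the keys of each node in brackets.

    Insert 10: [10]
    Insert 17: [10 17]
    Insert 25: root [17]; leaves [10] [25]
    Insert 9: root [17]; leaves [9 10] [25]
    Insert 13: root [10 17]; leaves [9] [13] [25]
    Insert 16: root [10 17]; leaves [9] [13 16] [25]
    Insert 8: root [10 17]; leaves [8 9] [13 16] [25]
    Insert 5: root [10]; second level [8] [17]; leaves [5] [9] [13 16] [25]
    Insert 15: root [10]; second level [8] [15 17]; leaves [5] [9] [13] [16] [25]
    Insert 22: root [10]; second level [8] [15 17]; leaves [5] [9] [13] [16] [22 25]

    After deleting 16 from the above B-Tree

    root [10]; second level [8] [15 22]; leaves [5] [9] [13] [17] [25]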


    Hashing

    Hashing is a technique which improves the speed of search by calculating the address of the search element directly, using a mathematical formula, instead of searching for it.
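As a small illustration of calculating an address with a formula, the sketch below uses the division (remainder) method. The table size of 11 and the function name hash are assumptions chosen for this example, not values taken from the handout.

```c
#define TABLE_SIZE 11   /* assumed table size; a prime is a common choice */

/* Division-method hash: maps a key to a slot in the range
   0 .. TABLE_SIZE - 1. */
unsigned int hash(unsigned int key)
{
    return key % TABLE_SIZE;
}
```

With this formula the key 63 is stored at, and later looked up from, slot 8 without comparing it against the other keys.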

    Symbol Table

    Symbol table is a dictionary of ADT used in a program. It is a set of names and