Searching Algorithm in data structure using C

  • 8/13/2019 Searching Algorithm in data structure using C

    Searching

    Searching is the process of finding the location of a given element in a linear array. The search is said to be successful if the given element is found, i.e. the element exists in the array; otherwise it is unsuccessful.

    There are two approaches to the search operation:

    Linear search

    Binary search

    Linear Search

    This method, which traverses the array sequentially to locate an item, is called linear search or sequential search.

    The algorithm one chooses generally depends on the organization of the array elements. If the elements are in random order, one has to use the linear search technique.

    Algorithm

    Linearsearch(a, n, item, loc)

    Here a is a linear array of size n. This algorithm finds the location of the element item in the linear array a. If the search ends in success, it sets loc to the index of the element; otherwise it sets loc to -1.

    Begin
      for i = 0 to (n-1) by 1 do
        if (a[i] = item) then
          set loc = i
          exit
        endif
      endfor
      set loc = -1
    end

    C implementation of algorithm

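    int linearsearch(int *a, int n, int item)
    {
        int k;
        for (k = 0; k < n; k++)     /* scan from the first element */
            if (a[k] == item)
                return k;           /* index of the first match */
        return -1;                  /* item is not in the array */
    }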

    Analysis of Linear Search

    In the best case, the item occurs at the first position, and the search terminates in success after just one comparison.

    The worst case occurs when the item is either at the last position or missing from the array. In the former case, the search terminates in success after n comparisons.

    In the latter case, the search terminates in failure after n comparisons. Thus, in the worst case linear search takes O(n) operations.
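
    The best and worst cases above can be checked with a small sketch. The function name linearsearch_count and its comparisons parameter are hypothetical additions for illustration, not part of the slides.

```c
/* Hypothetical helper, not from the slides: linear search that also
   counts how many element comparisons were performed. */
int linearsearch_count(const int *a, int n, int item, int *comparisons)
{
    int k;
    *comparisons = 0;
    for (k = 0; k < n; k++) {
        (*comparisons)++;          /* one comparison of a[k] with item */
        if (a[k] == item)
            return k;              /* success: index of the match */
    }
    return -1;                     /* failure after n comparisons */
}
```

    For an array of 7 elements, searching for the first element should report 1 comparison, while searching for the last element or for a missing item should report 7.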

    Binary Search

    Suppose the elements of the array are sorted in ascending order. The best searching algorithm for this case, called binary search, is used to find the location of the given element.

    Example

    Given array

    A[0]  A[1]  A[2]  A[3]  A[4]  A[5]  A[6]
    3     10    15    20    35    40    60

    We want to search for the element 15.

    1. We take beg=0, end=6 and compute the location of the middle element as

    mid = (beg+end)/2 = (0+6)/2 = 3

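    2. Compare the item with a[mid]: a[3]=20 is not equal to 15, and beg < end. Since a[3] > 15, we take end = mid-1 = 3-1 = 2, whereas beg remains the same. Thus

    mid = (beg+end)/2 = (0+2)/2 = 1

    3. Since a[mid], i.e. a[1]=10, is less than 15, we take beg = mid+1 = 2, whereas end remains the same. Thus mid = (2+2)/2 = 2, and a[2]=15 is the item, so the search ends in success with loc=2.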

    Algorithm

    Binarysearch(a, n, item, loc)

    Begin
      set beg = 0
      set end = n-1
      set mid = (beg+end)/2
      while((beg

    C Implementation

    int binarysearch(int *a, int n, int item)
    {
        int beg, end, mid;
        beg = 0; end = n-1;
        mid = (beg+end)/2;
        while((beg
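
    The listing above breaks off at the loop. A minimal complete sketch of an iterative binary search with the same signature might look as follows; the loop body is an assumption based on the usual formulation (narrow to the left half with end=mid-1 or to the right half with beg=mid+1), not a recovery of the slide's exact code.

```c
/* Sketch of iterative binary search on an array sorted in ascending order.
   Returns the index of item, or -1 if item is absent. */
int binarysearch(const int *a, int n, int item)
{
    int beg = 0, end = n - 1;
    while (beg <= end) {
        /* same value as (beg+end)/2, written to avoid int overflow */
        int mid = beg + (end - beg) / 2;
        if (a[mid] == item)
            return mid;            /* found */
        if (item < a[mid])
            end = mid - 1;         /* continue in the left half */
        else
            beg = mid + 1;         /* continue in the right half */
    }
    return -1;                     /* search interval became empty */
}
```

    For a sorted array of seven elements with 15 stored at index 2, this sketch visits indices 3, 1 and 2 in that order.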

    Analysis of binary search

    In each iteration or recursive call, the search is reduced to one half of the array. Therefore, for n elements in the array, there will be about log2 n iterations or recursive calls.

    Thus the complexity of binary search is O(log2 n).

    This worst-case bound holds irrespective of the position of the element, even if it is not present in the array.
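
    The logarithmic bound can be observed by counting loop iterations. The sketch below is illustrative only; binarysearch_steps and its steps parameter are hypothetical names. For n = 7 it should never need more than floor(log2 7) + 1 = 3 iterations.

```c
/* Hypothetical helper: binary search that also reports the number of
   loop iterations (one probe of a[mid] per iteration). */
int binarysearch_steps(const int *a, int n, int item, int *steps)
{
    int beg = 0, end = n - 1;
    *steps = 0;
    while (beg <= end) {
        int mid = beg + (end - beg) / 2;
        (*steps)++;
        if (a[mid] == item)
            return mid;
        if (item < a[mid])
            end = mid - 1;
        else
            beg = mid + 1;
    }
    return -1;
}
```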

    Hash table and Hashing

    Objectives:

    Understand the problem with direct address tables.

    Understand the concept of hash tables.

    Understand different hash functions.

    Understand the different collision resolution schemes.

    Introduction

    In all the search algorithms considered so far, the location of an item is determined by a sequence of comparisons.

    In each case, the data item sought is repeatedly compared with items in certain locations of the data structure.

    However, the number of comparisons depends on the data structure and the search algorithm used. For example:

    In an array or a linked list, linear search requires O(n) comparisons.

    In a sorted array, binary search requires O(log n) comparisons. In a balanced binary search tree, search requires O(log n) comparisons.

    However, there are some applications that require the search to be performed in constant time, i.e. O(1).

    Ideally this may not be possible, but we can still achieve a performance very close to it. This is possible using a data structure known as a hash table.

    A hash table, in a basic sense, is a generalization of the simpler notion of an ordinary array.

    Directly addressing into an array makes it possible to access any data element of the array in O(1) time.

    For example, if a[1..100] is an ordinary array, then the nth data element, 1 ≤ n ≤ 100, can be directly accessed as a[n].

    Direct address Tables

    Direct addressing is a simple technique that works quite well when the universe U of keys is reasonably small.

    As an example, consider an application that needs a dynamic set in which each element has a key drawn from the universe U = {0, 1, 2, ..., m-1}, where m is not very large.

    We also assume that all elements are unique, i.e. no two elements have the same key.

    The figure that follows shows the implementation of a dynamic set by a direct address table T, where the elements are stored in the table itself.

    Here each key in the universe U = {0, 1, 2, ..., 9} corresponds to an index in the table.

    The set K = {1, 4, 7, 8} of actual keys determines the slots in the table that contain elements.

    The empty (vacant) slots are marked with a slash character (/).

    The previous figure shows the implementation of a dynamic set where a pointer to an element is stored in the direct address table T.

    To represent the dynamic set, we can use an array T[0..m-1] in which each position or slot corresponds to a key in the universe U.

    Direct Addressing Tables

    [Figure: Implementing a dynamic set by a direct address table T, where the elements are stored in the table itself. The universe of keys is U = {0, 1, ..., 9} and the actual keys are K = {1, 4, 7, 8}; slots 1, 4, 7 and 8 of T hold elements, and the remaining slots are empty (/).]

    Operations on Direct Address Table

    Initializing a direct address table

    In order to initialize a direct address table T[0..m-1], the sentinel value -1 is assigned to each slot.

    void initializeDAT(int t[], int m)
    {
        int i;
        for (i = 0; i < m; i++)
            t[i] = -1;    /* -1 marks an empty slot */
    }

    Searching an element in direct address table

    To search for an element x in a direct address table T[0..m-1], the element at index key[x] is returned.

    int search(int t[], int x)
    {
        return t[key[x]];    /* key[x] denotes the key of element x */
    }

    Inserting a new element in direct address table

    To insert a new element x in a direct address table T[0..m-1], the element is stored at index key[x].

    void insertDAT(int t[], int x)
    {
        t[key[x]] = x;
    }

    Deleting an element from direct address table

    To delete an element x from a direct address table T[0..m-1], the sentinel value -1 is stored at index key[x].

    void deletefromDAT(int t[], int x)
    {
        t[key[x]] = -1;
    }
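
    The four operations can be put together in one compilable sketch. It assumes, for illustration only, that each element is a non-negative int that serves as its own key (so key[x] is simply x) and that the caller only passes values in the range 0..m-1. The dat_ names are hypothetical.

```c
#define DAT_EMPTY (-1)   /* sentinel for an empty slot, as on the slides */

/* Mark every slot of t[0..m-1] as empty. */
void dat_init(int t[], int m)
{
    int i;
    for (i = 0; i < m; i++)
        t[i] = DAT_EMPTY;
}

/* Assumption: the element x is its own key, with 0 <= x < m. */
void dat_insert(int t[], int x)  { t[x] = x; }

/* Returns the stored element, or DAT_EMPTY if the slot is vacant. */
int dat_search(const int t[], int x) { return t[x]; }

void dat_delete(int t[], int x)  { t[x] = DAT_EMPTY; }
```

    Each operation touches exactly one slot, which is why all of them run in constant time.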

    DAT

    Each of these operations is fast: only O(1) time is required.

    However, the difficulties with the direct address table are obvious, as stated below.

    1. If the universe U is large, storing a table T of size |U| may be impractical or even impossible, given the memory available on a typical computer.

    2. If the set K of actual keys is very small relative to U, most of the space allocated for T will be wasted.

    Hash Table

    A hash table is a data structure in which the location of a data item is determined directly as a function of the data item itself rather than by a sequence of comparisons.

    Under ideal conditions, the time required to locate a data item in a hash table is O(1), i.e. it is constant and does not depend on the number of data items stored.

    Hashing

    Hashing is a technique where we can compute the location of the desired record in order to retrieve it in a single access.

    Here, the hash function h maps the universe U of keys into the slots of a hash table T[0..m-1]. This process of mapping keys to appropriate slots in a hash table is known as hashing.

    [Figure: Implementing a dynamic set by a hash table T[0..m-1], where the elements are stored in the table itself. The actual keys k1, ..., k7 in K, a subset of the universe U, are mapped to the slots h(k1), h(k2)=h(k4)=h(k7), h(k6) and h(k3)=h(k5).]

    Hash table

    The figure on the previous page shows the implementation of a dynamic set by a hash table T[0..m-1], where the elements are stored in the table itself. Here each key in the dynamic set K of actual keys is mapped to a hash table slot using the hash function h.

    Note that the keys k2, k4 and k7 map to the same slot. The mapping of more than one key to the same slot is known as a collision.

    We can also say that the keys k2, k4 and k7 collide. We usually say that an element with key k hashes to slot h(k).

    We can say that h(k) is the hash value of key k.

    The purpose of the hash function is to reduce the range of array indices that need to be handled. Therefore, instead of |U| values, we need to handle only m values, which leads to a reduction in the storage requirements.

    What is a hash function?

    A hash function h is simply a mathematical formula that manipulates the key in some way to compute the index for this key in the hash table.

    For example, a hash function can divide the key by some number, usually the size of the hash table, and return the remainder as the index of the key.

    In general, we say that a hash function h maps the universe U of keys into the slots of a hash table T[0..m-1]. This process of mapping keys to appropriate slots in a hash table is known as hashing.

    Different hash functions

    There is a variety of hash functions. The main considerations while choosing a particular hash function h are:

    1. It should be possible to compute it efficiently.

    2. It should distribute the keys uniformly across the hash table, i.e. it should keep the number of collisions as low as possible.

    Hash Functions

    1. Division method:

    In the division method, a key k to be mapped into one of the m slots in the hash table is divided by m, and the remainder of this division is taken as the index into the hash table.

    That is, the hash function is

    h(k) = k mod m

    Division method

    Consider a hash table with 9 slots, i.e. m=9. Then the hash function

    h(k) = k mod m

    will map the key 132 to slot 6, since

    h(132) = 132 mod 9 = 6

    Since it requires only a single division operation, this method of hashing is quite fast.
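
    In C the division method is a one-line function. The name hash_division is a hypothetical choice, and the sketch assumes non-negative keys and m > 0.

```c
/* Division method: h(k) = k mod m.
   Assumes m > 0; unsigned types keep the remainder non-negative. */
unsigned hash_division(unsigned k, unsigned m)
{
    return k % m;
}
```

    With this sketch, hash_division(132, 9) evaluates to 6, matching the worked value above.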

    Example

    Let a company have 90 employees, and let 00, 01, 02, ..., 89 be the 90 two-digit memory addresses (or indices, or hash addresses) used to store the records. We use the employee code as the key.

    Choose m in such a way that it is greater than 90. Suppose m=93; then for the following employee codes (keys k):

    h(2103) = 2103 (mod 93) = 57

    h(6147) = 6147 (mod 93) = 9

    h(3750) = 3750 (mod 93) = 30

    The resulting hash table will look like the one on the next page.

    So, given an employee code, we can apply the hash function and retrieve the details directly from table[h(k)].

    Hash address   Employee code (key)   Employee name & other details
    0
    1
    ..
    9              6147                  Anish
    ..
    30             3750                  Saju
    ..
    57             2103                  Rarish
    ..
    89

    Midsquare method

    The midsquare method operates in two steps. In the first step, the square of the key value k is taken. In the second step, the hash value is obtained by deleting digits from both ends of the squared value k^2. It is important to note that the same positions of k^2 must be used for all keys. Thus the hash function is

    h(k) = s

    where s is obtained by deleting digits from both sides of k^2.

    Midsquare method

    Consider a hash table with 100 slots, i.e. m=100, and key values k = 4147, 3750, 2103.

    Solution:

    k      4147       3750       2103
    k^2    17197609   14062500   4422609
    h(k)   97         62         22

    The hash values are obtained by taking the fourth and fifth digits of k^2, counting from the right.
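
    A minimal sketch of this particular midsquare rule (keep the fourth and fifth digits of k^2 from the right) is given below. The name hash_midsquare is hypothetical, and the digit positions are fixed for a 100-slot table as in the example; other table sizes would need different positions.

```c
/* Midsquare method for a 100-slot table.
   Step 1: square the key.
   Step 2: drop the three rightmost digits, then keep the next two,
           i.e. the 4th and 5th digits of k*k counting from the right. */
unsigned hash_midsquare(unsigned long long k)
{
    unsigned long long sq = k * k;
    return (unsigned)((sq / 1000ULL) % 100ULL);
}
```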

    Hash address   Employee code (key)   Employee name & other details
    0
    1
    ..
    22             2103                  Giri
    ..
    62             3750                  Suni
    ..
    97             4147                  Rohit
    ..
    99

    Folding method

    The folding method also operates in two steps. In the first step, the key value k is divided into a number of parts k1, k2, ..., kr, where each part has the same number of digits except the last part, which can have fewer digits.

    In the second step, these parts are added together, and the hash value is obtained by ignoring the last carry, if any:

    H(k) = k1 + k2 + ... + kr

    For example, if the hash table has 1000 slots, each part will have three digits, and the sum of these parts after ignoring the last carry will also be a three-digit number in the range 0 to 999.

    Folding method

    Here we are dealing with a hash table with indices from 00 to 99, i.e. a two-digit hash table. So we divide each key k into parts of two digits.

    k                   2103             7148             12345
    k1, k2, k3          21, 03           71, 48           12, 34, 5
    H(k) = k1+k2+k3     H(2103)          H(7148)          H(12345)
                        = 21+03 = 24     = 71+48 = 19     = 12+34+5 = 51

    For 7148 the sum is 119; ignoring the carry gives 19.

    Folding method

    Extra milling can also be applied: the even-numbered parts k2, k4, ... are each reversed before the addition.

    k                   2103             7148             12345
    k1, k2, k3          21, 03           71, 48           12, 34, 5
    Reversing k2, k4    21, 30           71, 84           12, 43, 5
    H(k) = k1+k2+k3     H(2103)          H(7148)          H(12345)
                        = 21+30 = 51     = 71+84 = 55     = 12+43+5 = 60

    For 7148 the sum is 155; ignoring the carry gives 55.
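
    Both folding variants for a two-digit table can be sketched in one function. The name hash_fold2 and the reverse_even flag are hypothetical; the sketch splits the decimal digits of the key into two-digit parts from the left, optionally reverses the even-numbered parts, adds the parts, and drops the carry by reducing modulo 100.

```c
#include <stdio.h>
#include <string.h>

/* Folding method for a table with indices 00..99.
   reverse_even != 0 reverses the digits of parts k2, k4, ... */
unsigned hash_fold2(unsigned long k, int reverse_even)
{
    char digits[32];
    size_t len, i;
    unsigned sum = 0, part_no = 0;

    snprintf(digits, sizeof digits, "%lu", k);   /* decimal digits of k */
    len = strlen(digits);
    for (i = 0; i < len; i += 2) {
        unsigned hi = (unsigned)(digits[i] - '0');
        unsigned part;
        part_no++;
        if (i + 1 < len) {
            unsigned lo = (unsigned)(digits[i + 1] - '0');
            if (reverse_even && part_no % 2 == 0)
                part = lo * 10 + hi;             /* reversed part */
            else
                part = hi * 10 + lo;
        } else {
            part = hi;                           /* short last part */
        }
        sum += part;
    }
    return sum % 100;                            /* ignore the carry */
}
```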

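    Multiplication method

    The multiplication method operates in two steps.

    In the first step, the key value k is multiplied by a constant A in the range 0 < A < 1, and the fractional part of kA is extracted. In the second step, this fraction is multiplied by m and the result is rounded down to give the hash value.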

    Multiplication method

    Consider a hash table with 10000 slots, i.e. m=10000. Then the hash function

    h(k) = floor(m (kA mod 1))

    with A = 0.61803... will map the key 123456 to slot 41, since

    h(123456) = floor(10000 * (123456 * 0.61803... mod 1))
              = floor(10000 * (76300.0041151... mod 1))
              = floor(10000 * 0.0041151...)
              = floor(41.151...)
              = 41
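
    The calculation can be reproduced with floating-point arithmetic. The sketch assumes A = (sqrt(5) - 1) / 2, which is consistent with the digits 0.61803 in the example; the function name hash_multiplication is hypothetical, and keys are assumed small enough that k*A is represented accurately in a double.

```c
/* Multiplication method: h(k) = floor(m * (k*A mod 1)). */
unsigned hash_multiplication(unsigned long k, unsigned m)
{
    const double A = 0.6180339887498949;   /* assumed: (sqrt(5) - 1) / 2 */
    double product = (double)k * A;
    /* k*A mod 1: subtract the integer part to keep the fraction */
    double frac = product - (double)(unsigned long long)product;
    /* the cast truncates, which equals floor for non-negative values */
    return (unsigned)((double)m * frac);
}
```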

    Hash Collision

    It is possible that two non-identical keys k1, k2 are hashed into the same hash address. This situation is called a hash collision.

    Hash Collision

    Location   Keys   Records
    0          210
    1          111
    2
    3          883
    4          344
    5
    6
    7
    8          488
    9

    Hash collision

    Let us consider a hash table having 10 locations, as shown in the previous figure. The division method is used to hash the key:

    H(k) = k (mod m)

    Here m is chosen as 10. The hash function produces an integer between 0 and 9, depending on the value of the key. If we want to insert a new record with key 500, then

    H(500) = 500 (mod 10) = 0

    Location 0 in the table is already filled, so a collision has occurred. Collisions are almost impossible to avoid, but they can be reduced considerably by introducing a few techniques.

    Resolving Collision

    A collision is a phenomenon that occurs when more than one key maps to the same slot in the hash table.

    Though we can keep collisions to a certain minimum level, we cannot eliminate them altogether. Therefore we need some mechanism to handle them.

    Collision Resolution by Synonyms Chaining

    In this scheme, all the elements whose keys hash to the same hash table slot are put in a linked list.

    Thus slot i in the hash table contains a pointer to the head of the linked list of all the elements that hash to the value i.

    If there is no element that hashes to the value i, slot i contains the NULL value.

    [Figure: Collision resolution by separate chaining. Each hash table slot T[i] contains a linked list of all the keys whose hash value is i. The chains shown are: k1; k2 -> k4 -> k7; k6; k3 -> k5. All other slots are empty (/).]

    Collision Resolution by Synonyms Chaining

    The structure of a node of the linked list will look like

    typedef struct nodetype
    {
        int info;
        struct nodetype *next;
    } node;

    1. Initializing a chained hash table

    void iniHT(node *t[], int m)
    {
        int i;
        for (i = 0; i < m; i++)
            t[i] = NULL;    /* every chain starts out empty */
    }

    Searching an element in chained hash table

    node *searchHT(node *t[], int x)
    {
        node *ptr;
        ptr = t[h(x)];
        /* walk the chain for slot h(x) until x is found or the chain ends */
        while ((ptr != NULL) && (ptr->info != x))
            ptr = ptr->next;
        return ptr;    /* NULL if x is not in the table */
    }

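    Inserting a new element in chained hash table

    void insertHT(node *t[], int x)
    {
        node *ptr;
        ptr = (node *)malloc(sizeof(node));
        if (ptr == NULL)
            return;              /* allocation failed; table is unchanged */
        ptr->info = x;
        ptr->next = t[h(x)];     /* link the new node in front of the chain */
        t[h(x)] = ptr;
    }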
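
    The chained hash table routines leave the hash function h undefined. The sketch below gathers them into one compilable unit under the assumption that h is the division method with a table of M = 10 slots and non-negative keys; M, h as defined here, and the freeHT helper are illustrative additions.

```c
#include <stdlib.h>

#define M 10                              /* assumed table size */

typedef struct nodetype {
    int info;
    struct nodetype *next;
} node;

/* Assumed hash function: division method, for x >= 0. */
static int h(int x) { return x % M; }

void iniHT(node *t[], int m)
{
    int i;
    for (i = 0; i < m; i++)
        t[i] = NULL;
}

node *searchHT(node *t[], int x)
{
    node *ptr = t[h(x)];
    while (ptr != NULL && ptr->info != x)
        ptr = ptr->next;
    return ptr;                           /* NULL when x is absent */
}

void insertHT(node *t[], int x)
{
    node *ptr = malloc(sizeof *ptr);
    if (ptr == NULL)
        return;                           /* out of memory: no change */
    ptr->info = x;
    ptr->next = t[h(x)];                  /* insert at the head of the chain */
    t[h(x)] = ptr;
}

/* Hypothetical helper: release every chain. */
void freeHT(node *t[], int m)
{
    int i;
    for (i = 0; i < m; i++) {
        while (t[i] != NULL) {
            node *next = t[i]->next;
            free(t[i]);
            t[i] = next;
        }
    }
}
```

    Because insertion happens at the head of the chain, the most recently inserted key in a slot is the first one found when the chain is walked.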

    Open addressing

    In the open addressing method, when a key collides with another key, the collision is resolved by finding the nearest empty space by probing the cells.

    Suppose a record R with key K has the hash address H(K)=h. Then we linearly search the locations h+i (where i = 0, 1, 2, ..., m) for free space, i.e. the hash addresses h, h+1, h+2, ...

    Linear probing

    The main disadvantage of linear probing is that a substantial amount of time may be taken to find a free cell by sequentially (linearly) searching the table.
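
    A minimal sketch of linear probing follows. It assumes non-negative int keys, the division method for the initial address, -1 as the marker for a free cell, and wrap-around to the start of the table (the modulo m on the probe index); none of these details are fixed by the slides, and the lp_ names are hypothetical. Deletion is not handled.

```c
#define LP_EMPTY (-1)   /* assumed marker for a free cell */

/* Insert k by probing h, h+1, h+2, ... (mod m).
   Returns the slot used, or -1 if the table is full. */
int lp_insert(int t[], int m, int k)
{
    int h = k % m, i;
    for (i = 0; i < m; i++) {
        int slot = (h + i) % m;
        if (t[slot] == LP_EMPTY) {
            t[slot] = k;
            return slot;
        }
    }
    return -1;
}

/* Search k along the same probe sequence.
   Returns its slot, or -1 if a free cell is reached first. */
int lp_search(const int t[], int m, int k)
{
    int h = k % m, i;
    for (i = 0; i < m; i++) {
        int slot = (h + i) % m;
        if (t[slot] == LP_EMPTY)
            return -1;
        if (t[slot] == k)
            return slot;
    }
    return -1;
}
```

    When many keys share nearby addresses, the probe loop runs through long runs of occupied cells, which is the cost described above.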

    Quadratic Probing

    Suppose a record R with key K has the hash address H(K)=h. Then, instead of searching the locations with addresses h, h+1, h+2, ..., h+i, we search for a free hash address among h, h+1, h+4, h+9, h+16, ..., h+i^2.
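
    The quadratic probe sequence can be sketched as below, under the same illustrative assumptions as for a simple open-addressing table: non-negative int keys, division method for h, -1 for a free cell, addresses taken modulo m, and a small m so that i*i does not overflow. The name qp_insert is hypothetical.

```c
#define QP_EMPTY (-1)   /* assumed marker for a free cell */

/* Insert k by probing h, h+1, h+4, h+9, ... (mod m).
   Returns the slot used, or -1 if no free cell was found in m probes.
   Note that quadratic probing need not visit every slot. */
int qp_insert(int t[], int m, int k)
{
    int h = k % m, i;
    for (i = 0; i < m; i++) {
        int slot = (h + i * i) % m;
        if (t[slot] == QP_EMPTY) {
            t[slot] = k;
            return slot;
        }
    }
    return -1;
}
```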

    Double Hashing

    A second hash function H1 is used to resolve the collision. Suppose a record R with key K has the hash addresses H(K)=h and H1(K)=h1, where h1 is not equal to m. Then we linearly search the locations with addresses h, h+h1, h+2h1, h+3h1, ..., h+i(h1) (where i = 0, 1, 2, 3, ...).
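
    A minimal sketch of double hashing follows. The second hash function is not specified in the text, so H1(k) = 1 + (k mod (m-1)) is an assumption chosen so that h1 is never 0 and never equal to m; the other conventions (non-negative int keys, -1 for a free cell, addresses modulo m, m at least 2) and the name dh_insert are also illustrative.

```c
#define DH_EMPTY (-1)   /* assumed marker for a free cell */

/* Insert k by probing h, h+h1, h+2*h1, ... (mod m).
   Returns the slot used, or -1 if no free cell was found in m probes. */
int dh_insert(int t[], int m, int k)
{
    int h  = k % m;              /* primary hash address H(k) */
    int h1 = 1 + k % (m - 1);    /* assumed second hash H1(k), in 1..m-1 */
    int i;
    for (i = 0; i < m; i++) {
        int slot = (h + i * h1) % m;
        if (t[slot] == DH_EMPTY) {
            t[slot] = k;
            return slot;
        }
    }
    return -1;
}
```

    Keys that collide at the same address h usually get different step sizes h1, so they tend to follow different probe sequences.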

    Chaining

    In the chaining technique, the entries in the hash table are dynamically allocated and entered into a linked list associated with each hash key. The hash table in the next table can be represented using linked lists.

    Hash Collision

    Location   Keys   Records
    0          210    30
    1          111    12
    2
    3          883    14
    4          344    18
    5
    6          546    32
    7
    8          488    31
    9

    Chaining

    0 -> [210 | 30]
    1 -> [111 | 12]
    2
    3 -> [883 | 14]
    4 -> [344 | 18]
    5
    6 -> [546 | 32]
    7
    8 -> [488 | 31]
    9

    If we try to insert a new record with key 500, then

    H(500) = 500 (mod 10) = 0

    so a collision occurs in the normal way, because a record already exists in the 0th position. With chaining, however, the corresponding linked list can be extended to accommodate the new record, as shown in the figure.

    0 -> [210 | 30] -> [500 | 53]
    1 -> [111 | 12]
    2
    3 -> [883 | 14]
    4 -> [344 | 18]
    5
    6 -> [546 | 32]
    7
    8 -> [488 | 31]
    9

    Bucket Addressing

    Another solution to the hash collision problem is to store colliding elements in the same position in the table by introducing a bucket with each hash address. A bucket is a block of memory space that is large enough to store multiple items.

    The next figure shows how hash collisions can be handled using buckets. If a bucket is full, the colliding item can be stored in a new bucket, which is linked to the previous bucket.
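
    One possible way to model buckets in C is sketched below. The bucket capacity of 3, the struct layout, the division method for the hash address, and all names are assumptions made for illustration; only keys are stored, without the associated data.

```c
#include <stdlib.h>

#define BUCKET_CAP 3                 /* assumed items per bucket */

typedef struct bucket {
    int keys[BUCKET_CAP];
    int count;                       /* number of used entries */
    struct bucket *next;             /* overflow bucket, or NULL */
} bucket;

/* Insert non-negative key k; returns 0 on success, -1 if out of memory. */
int bucket_insert(bucket *t[], int m, int k)
{
    bucket **pp = &t[k % m];
    /* skip buckets that are already full */
    while (*pp != NULL && (*pp)->count == BUCKET_CAP)
        pp = &(*pp)->next;
    if (*pp == NULL) {               /* need a new bucket, linked to the previous one */
        *pp = calloc(1, sizeof **pp);
        if (*pp == NULL)
            return -1;
    }
    (*pp)->keys[(*pp)->count++] = k;
    return 0;
}

/* Returns 1 if k is stored in the table, 0 otherwise. */
int bucket_contains(bucket *t[], int m, int k)
{
    const bucket *b;
    int i;
    for (b = t[k % m]; b != NULL; b = b->next)
        for (i = 0; i < b->count; i++)
            if (b->keys[i] == k)
                return 1;
    return 0;
}

/* Release all buckets. */
void bucket_free(bucket *t[], int m)
{
    int i;
    for (i = 0; i < m; i++) {
        while (t[i] != NULL) {
            bucket *next = t[i]->next;
            free(t[i]);
            t[i] = next;
        }
    }
}
```

    The table must start with every slot set to NULL; a bucket is allocated only when the first key hashes to its address.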

    Bucket Addressing

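    [Figure: Bucket addressing. Each hash address from 0 to 8 has a bucket that can hold several (key, data) pairs, for example (K2, D2) at address 2 and groups such as (K21, D21), (K25, D25), (K28, D28) or (K29, D29), (K30, D30), (K33, D33); a full bucket is linked to a further bucket.]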