MCOO68 SMU MCA SEM2 2011

MC0068 – Data Structures using(Book ID: B0701 & B0702)

Set-1

1. Write note on:

a) Elementary Data Structures

b) Ordered list

c) Linked list

d) Queue

e) Slack

f) Binary tree

g) Hash tables

Ans:1a) Elementary Data Structures:

Organizing the data for processing is an essential step in the development of a computer program. For many applications, the choice of the proper data structure is the only major decision involved in the implementation: once the choice has been made, the necessary algorithms are simple. For the same data, some data structures require more or less space than others; for the same operations on the data, some data structures lead to more or less efficient algorithms than others. The choices of algorithm and of data structure are closely intertwined, and we continually seek ways to save time or space by making the choice properly. Elementary data structures such as stacks, queues, lists, and heaps will be the “of-the-shelf” components we build our algorithm from. There are two aspects to any data structure-The abstract operations which it supports.-The implementation of these operations.

b) Ordered list:A list in which the order matters to the application. Therefore, for example, the implementer cannot scramble the order to improve efficiency. An ordered list is a list in which the order of the items is significant. However, the items in an ordered list are not necessarily sorted. Consequently, it is possible to change the order o items and still have a valid ordered list.Consider a list of the titles of the chapters in this book. The order of the items in the list corresponds to the order in which they appear in the book. However, since the chapter titles are not sorted alphabetically, we cannot consider the list to be sorted. Since it is possible to change the order of the chapters in book, we must be able to do the same with the items of the list. As a result, we may insert an item into an ordered list at any position. A searchable container is a container that supports the following additional operations:insert

-used to put objects into a the container; withdraw

-used to remove objects from the container; find

-used to locate objects in the container; isMember

-used to test whether a given object instance is in the container.


c) Linked list:A linked list is an ordered collection of items from which items may be deleted and inserted in any places. A data structure in which each element contains a pointer to the next element, thus forming a linear list.

Figure1. Singly linked listApart from singles linked we still have t the following types of linked lists: doubly linked list, ordered linked list, circular list. Linked lists are commonly used when the length of the list is not known in advance and/or when it is frequently necessary to insert and/or delete in the middle of the list. Doubly-linked vs. singly-linked lists-In a doubly-linked list, each node points to the next node and also to the previous node. In a singly-linked list, each node points to the next node but not back to the previous node. Circular list -A linked list in which the last node points to the first node. If the list is doubly-linked, the first node must also point back to the last node.

d) Queue:An ordered list in which insertion always occurs at one end and deletion always occurs at the other end.

Figure 2. FIFO

The following operations can be applied to a queue:

InitQueue(Queue): creates an empty queue;

Join(Item): inserts an item to the rear of the queue;

Leave(Queue): removes an item from the front of the queue;

isEmpty(Queue): returns true is the queue is empty.


e) Stack:A stack is a list in which all insertions and deletions are made at one end, called the top. The last element to be inserted into the stack will be the first to be removed. Thus stacks are sometimes referred to as Last In First Out (LIFO) lists.

Figure 3. LIFO

The following operations can be applied to a stack:

InitStack(Stack): creates an empty stack;

Push(Item): pushes an item on the stack;

Pop(Stack): removes the first item from the stack;

Top(Stack): returns the first item from the stack w/o removing it;

isEmpty(Stack): returns true is the stack is empty;

f) Binary tree:Each element in a binary tree is stored in a "node" class (or struct). Each node contains pointers to a left child node and a right child node. In some implementations, it may also contain a pointer to the parent node. A tree may also have an object of a second "tree" class (or struct) which as a header for the tree. The "tree" object contains a pointer to the root of the tree (the node with no parent) and whatever other information the programmer wants to squirrel away in it (e.g. number of nodes currently in the tree). In a binary tree, elements are kept sorted in left to right order across the tree. That is, if N is a node, then the value stored in N must be larger than the value stored in left-child(N) and less than the value stored in right-child(N). Variant trees may have the opposite order (smaller values to the right rather than to the left) or may allow two different nodes to contain equal values.


Figure 4. Binary Tree

g) Hash tables:A hash table or hash map is a data structure that uses a hash function to map identifying values, known as keys (e.g., a person's name), to their associated values (e.g., their telephone number). Thus, a hash table implements an associate array. The hash function is used to transform the key into the index (the hash) of an array element (the slot or bucket) where the corresponding value is to be sought.Ideally, the hash function should map each possible key to a unique slot index, but this ideal is rarely achievable in practice (unless the hash keys are fixed; i.e. new entries are never added to the table after it is created). Instead, most hash table designs assume that hash collisions—different keys that map to the same hash value—will occur and must be accommodated in some way.In a well-dimensioned hash table, the average cost (number of instructions) for each lookup is independent of the number of elements stored in the table. Many hash table designs also allow arbitrary insertions and deletions of key-value pairs, at constant average (indeed) cost per operation. In many situations, hash tables turn out to be more efficient than search trees or any other table lookup structure. For this reason, they are widely used in many kinds of computer software, particularly for associate arrays, database indexing caches and sets.

http://en.wikipedia.org/wiki/File:Binary_tree.svg


2. Illustrate the C program to represents the Stack Implementation on POP and PUSH

operation.

Ans:2

#include<studio.h>#include<conio.h>#define Max 5Int Staff [Max], top=-1;Void display(){If ((top==-1 || (top==0)){Printf(“\n stack is full\n”);}Else{Printf(“\n stack elements are\n”);For(int i=top-1;i>=0;i--)}}void push(){int ele;char ch;it(top-=-1)top=0;do{If (top>=5){Printf(“\n stack is full”);Break;}Else{Clrscr();Printf(“\n enter the element to be insearted\n”);Scanf(“%d”, &ele);Staff(top++)=eledisplay();}Printf(“\n Do u want to add more elements:?\n”);Scanf(“\n%c”, &ch);}While ((ch==’y’ ||(ch==’Y’));}void pop()


{If ((top==-1)||(top==0){Printf(“\nstack is under flow\n”);}Else{Printf(“%d is deleted fro stack\n”,Staff(--top]”);display();}}

Void main(){clrscr();char c;int choice;do{cirscr();printf(“\n enter the choice\n”);printf(“1->push\n”);printf(“2-pop\n”);scant(“%d”, &choice);if(choice==1)

push();elshif(choice==2)pop();elseprintf(“\in valid choice”);printf(“\n do u want to continue:?”);scanf(“\n%c”, &c);}While{(c==’y’)||(c==’Y’));}

3. Show the result of inserting 3, 1, 4, 5, 2, 9, 6, 8 into a


a) bottom-up splay tree

b) top-down splay tree.

Ans:3

Splay Trees

We shall describe the algorithm by giving three rewrite rules in the form of pictures. In these pictures, x is the node that was accessed (that will eventually be at the root of the tree). By looking at the local structure of the tree defined by x, x’s parent, and x’s grandparent we decide which of the following three rules to follow. We continue to apply the rules until x is at the root of the tree:

Notes

1) Each rule has a mirror image variant, which covers all the cases.

2) The zig-zig rule is the one that distinguishes splaying from just rotating x to the root of the tree.

3) Top-down splaying is much more efficient in practice.

The Basic Bottom-Up Splay Tree• A technique called splaying can be used so a logarithmic amortized bound can beachieved.• We use rotations such as we’ve seen before.• The zig case.o Let X be a non-root node on the access path on which we are rotating.o If the parent of X is the root of the tree, we merely rotate X and the root as shownin Figure 2.

http://train-srv.manipalu.com/wpress/wp-content/uploads/2009/06/clip-image00288.jpg


Figure 2 Zig Caseo This is the same as a normal single rotation.• The zig-zag case.o In this case, X and both a parent P and a grandparent G. X is a right child and P isa left child (or vice versa).

o This is the same as a double rotation.o This is shown in Figure 3.The zig-zig case.o This is a different rotation from those we have previously seen.o Here X and P are either both left children or both right children.o The transformation is shown in Figure 4.o This is different from the rotate-to-root. Rotate-to-root rotates between X and Pand then between X and G. The zig-zig splay rotates between P and G and X andP.Figure 4 Zig-zig Case


• Given the rotations, consider the example in Figure 5, where we are splaying c.

Top-Down Splay Trees• Bottom-up splay trees are hard to implement.• We look at top-down splay trees that maintain the logarithmic amortized bound.o This is the method recommended by the inventors of splay trees.• Basic idea – as we descend the tree in our search for some node X, we must take thenodes that are on the access path, and move them and their subtrees out of the way. Wemust also perform some tree rotations to guarantee the amortized time bound.• At any point in the middle of a splay, we have:o The current node X that is the root of its subtree.o Tree L that stores nodes less than X.o Tree R that stores nodes larger than X.• Initially, X is the root of T, and L and R are empty.• As we descend the tree two levels at a time, we encounter a pair of nodes.o Depending on whether these nodes are smaller than X or larger than X, they areplaced in L or R along with subtrees that are not on the access path to X.When we finally reach X, we can then attach L and R to the bottom of the middle tree,and as a result X will have been moved to the root.• We now just have to show how the nodes are placed in the different tree. This is shownbelow:o Zig rotation – Figure 7


Zig-Zig – Figure 8

Zig-Zag – Figure


4. Discuss the techniques for allowing a hash file to expand and shrink dynamically.

What are the advantages and disadvantages of each?

Ans:4Dynamic Hashing:As the database grows over time, we have three options:

1. Choose hash function based on current file size. Get performance degradation as file grows.

2. Choose has function based on anticipated file size. Space is wasted initially.3. Periodically re-organize hash structure as file grows. Requires selecting new hash

function, recomputing all addresses and generating new bucket assignments.Some hashing techniques allow the hash function to be modified dynamically to accommodate the growth or shrinking of the database. These are called dynamic hash functions. Extendable hashing is one form of dynamic hashing. Extendable hashing splits and coalesces buckets as database size changes.


Figure 11.9: General extendable hash structure.

We choose a hash function that is uniform and random that generates values over a relatively large range.

Range is b-bit binary integers (typically b=32). is over 4 billion, so we don't generate that many buckets! Instead we create buckets on demand, and do not use all b bits of the hash initially.

At any point we use i bits where . The i bits are used as an offset into a table of bucket addresses. Value of i grows and shrinks with the database. Figure 11.19 shows an extendable hash structure. Note that the i appearing over the bucket address table tells how many bits are required

to determine the correct bucket. It may be the case that several entries point to the same bucket. All such entries will have a common hash prefix, but the length of this prefix may be less

than i. So we give each bucket an integer giving the length of the common hash prefix.

This is shown in Figure 11.9 (textbook 11.19) as . Number of bucket entries pointing to bucket j is then .

To find the bucket containing search key value :

Compute .

Take the first i high order bits of . Look at the corresponding table entry for this i-bit string. Follow the bucket pointer in the table entry.

We now look at insertions in an extendable hashing scheme.


Follow the same procedure for lookup, ending up in some bucket j. If there is room in the bucket, insert information and insert record in the file. If the bucket is full, we must split the bucket, and redistribute the records. If bucket is split we may need to increase the number of bits we use in the hash.

Two cases exist:

1. If , then only one entry in the bucket address table points to bucket j.

Then we need to increase the size of the bucket address table so that we can include pointers to the two buckets that result from splitting bucket j.

We increment i by one, thus considering more of the hash, and doubling the size of the bucket address table.

Each entry is replaced by two entries, each containing original value. Now two entries in bucket address table point to bucket j. We allocate a new bucket z, and set the second pointer to point to z.

Set and to i. Rehash all records in bucket j which are put in either j or z. Now insert new record. It is remotely possible, but unlikely, that the new hash will still put all of the records in

one bucket. If so, split again and increment i again.

2. If , then more than one entry in the bucket address table points to bucket j.

Then we can split bucket j without increasing the size of the bucket address table (why?).

Note that all entries that point to bucket j correspond to hash prefixes that have the same

value on the leftmost bits.

We allocate a new bucket z, and set and to the original value plus 1. Now adjust entries in the bucket address table that previously pointed to bucket j. Leave the first half pointing to bucket j, and make the rest point to bucket z. Rehash each record in bucket j as before. Reattempt new insert.

Note that in both cases we only need to rehash records in bucket j.Deletion of records is similar. Buckets may have to be coalesced, and bucket address table may have to be halved.Insertion is illustrated for the example deposit file of Figure 11.20.

32-bit hash values on bname are shown in Figure 11.21. An initial empty hash structure is shown in Figure 11.22. We insert records one by one. We (unrealistically) assume that a bucket can only hold 2 records, in order to illustrate

both situations described. As we insert the Perryridge and Round Hill records, this first bucket becomes full.


When we insert the next record (Downtown), we must split the bucket.

Since , we need to increase the number of bits we use from the hash. We now use 1 bit, allowing us buckets. This makes us double the size of the bucket address table to two entries. We split the bucket, placing the records whose search key hash begins with 1 in the new

bucket, and those with a 0 in the old bucket (Figure 11.23). Next we attempt to insert the Redwood record, and find it hashes to 1.

That bucket is full, and . So we must split that bucket, increasing the number of bits we must use to 2. This necessitates doubling the bucket address table again to four entries (Figure 11.24). We rehash the entries in the old bucket. We continue on for the deposit records of Figure 11.20, obtaining the extendable hash

structure of Figure 11.25.

Advantages:

Extendable hashing provides performance that does not degrade as the file grows. Minimal space overhead - no buckets need be reserved for future use. Bucket address

table only contains one pointer for each hash value of current prefix length.

Disadvantages:

Extra level of indirection in the bucket address table Added complexity


Set-2

1. Explain the double ended queue with the help of suitable example.

Ans:1A double-ended queue (dequeue, often abbreviated to deque, pronounced deck) is an abstract data structure that implements a queue for which elements can only be added to or removed from the front (head) or back (tail). It is also often called a head-tail linked list.

Deque is a special type of data structure in which insertions and deletions will be done either at the front end or at the rear end of the queue. The operations that can be performed on deques are

· Insert an item from front end

· Insert an item from rear end

· Delete an item from front end

· Delete an item from rear end

· Display the contents of queue

The three operations insert rear, delete front and display and the associated operations to check for an underflow and overflow of queue have already been discussed in ‘ordinary queue’. In this section, other two operations i.e., insert an item at the front end and delete an item from the rear end are discussed.

a) Insert at the front end

Consider queue shown in above fig (a). Observe that, the front end identified by f is 0 and rear end identified by r is -1. Here, an item can be inserted first by incrementing r by 1 and then insert an item. If the front pointer f is not equal to 0 as shown in above fig. (b), an item can be inserted by decrementing the front pointer .f by 1 and then inserting an item at that position. Except for these conditions, it is not possible to insert an item at the front end. For example, consider the

http://train-srv.manipalu.com/wpress/wp-content/uploads/2009/06/clip-image01022.gif


queue shown in above figure (c). Here, an item 10 is already present in the first position identified by f and so, it is not possible to insert an item. The complete C function to insert an item is shown in below example.

Example 1: Function to insert an item at the front end

void insert_front(int item, int q[ ], int *f, int *r)

{

if( *f= = 0 && *r = = -1)

q[++(*r)] = item;

else if ( *f ! = 0)

q[--(*f)]=item;

else

printf("Front insertion not possible");

}

Delete from the rear end

To delete an element from the rear end, first access the rear element and then decrement rear end identified by r. As an element is deleted, queue may become empty. If the queue is empty, reset the front pointer f to 0 and rear pointer r to -1 as has been done in an ordinary queue. We delete an element only if queue is not empty. The complete C function to delete an item from the rear end is shown in below example.

Example 2: Function to delete an item from the rear end of queue

void delete_rear(int q[],int *f, int *r)

{

if ( qempty(*f,*r) )

{

printf("Queue underflown");

return;


}

printf("The element deleted is %dn".q[(*r)--]);

if (*f > *r)

{

*f = 0, *r = -1 ;

}

}

C program to Implement double-ended queue

#include <stdio.h>

#include <process.h>

#define QUEUE_SIZE 5

/* Include function to check for overflow 4.2.1 Eg.-1*/

/* Include function to check for underflow 4.2.1 Eg -3*/

/* Include function to insert an item at the front end 4.2. 3 Eg.-1*/

/* Include function to insert an item at the rear end 4.2.1 Eg -2*/

/* Include function to delete an item at the front end 4.2.1 Eg -4*/

/* Include function to delete an item at the rear end 4.2. 3 Eg.-2*/

/* Include function to display the contents of queue 4.2.1 Eg -5*/

void main()

{

int choice,item,f,r,q [10];

f=0; r = -1;


for (;;)

{

printf(" 1:Insert_front 2:lnsert_rearn");

printf("3: Delete_front 4: Delete_rearn" );

printf("5: Display 6:Exitn");

printf("Enter the choicen");

scanf("%d" ,&choice );

switch ( choice )

{

case 1:

printf("Enter the item to be insertedn");

scanf("%d",& item);

insert_ front(item, q, &f, &r);

break;

case 2:

printf("Enter the item to be insertedn");

scanf("%d",& item);

insert_rear(item, q, &r);

break; case 3:

delete _front(q, &f, &r);

break;

case 4:

delete_rear(q, &f, &r);


break;

cases 5:

display(q, f, r);

break;

default: .

exit(0);

}

}

}

2. Explain Insert/Delete a node at the rear, front end from circular singly linked list.

Ans:2

A singly linked circular list is a linked list where the last node in the list points to the first node in the list. A circular list does not contain NULL pointers.

With a circular list, a pointer to the last node gives easy access also to the first node, by following one link. Thus, in applications that require access to both ends of the list (e.g., in the implementation of a queue), a circular structure allows one to handle the structure by a single pointer, instead of two.

A circular list can be split into two circular lists, in constant time, by giving the addresses of the last node of each piece. The operation consists in swapping the contents of the link fields of those two nodes. Applying the same operation to any two nodes in two distinct lists joins the two list into one. This property greatly simplifies some algorithms and data structures, such as the quad-edge and face-edge.

The simplest representation for an empty circular list (when such a thing makes sense) is a null pointer, indicating that the list has no nodes. With this choice, many algorithms have to test for


this special case, and handle it separately. By contrast, the use of null to denote an empty linear list is more natural and often creates fewer special cases.

Insert a node at the rear end:

Consider the list contains 4 nodes and last is a pointer variable that contains the address of the last node.

Let us insert the item 80 at the end of this list.

Step 1:Obtain a free node temp from the availability list and store the item in info field. This can be accomplished using the following statementsTemp=getnode();Temp->info=item;

Step 2: Copy the address of the first node(i.e. last->link) into link field of newly obtained node temp and the statement to accomplish this task isTemp-> link=last->linkStep 3: Establish a link between the newly created node temp and the last node. This is achieved by copying the address of the node temp into link field of last node. The corresponding code for this isLast->link=temp;

Step 4: The new node is made as the last node using the statement:Return temp;

These steps have designed by assuming the list already exists.

Function to insert an item at the rear end of the list.NODE insert_rear(int item, NODE last){NODE temp;Temp=getnode();Temp->info=item;If(last==NULL)

50 20 45 10


Last=temp;ElseEmp->link=last->link;Last->link=temp;Retrun temp;}

Insert a node at the front end:

Consider the list contains 4 nodes and a pointer variable last contains address of the last node.

Step 1: To insert an item 50 at the front of the list, obtain a free node temp from the availability list and store the item in info field. This can be accomplished using the following statements

Temp=getnode();Temp->info=item;

Step 2: Copy the address of the first node(i.e. last->link) into link field of newly obtained node temp and the statement to accomplish this task isTemp->link=last->link.

Step 3: Establish a link between the newly created node temp and the last node. This is achieved by copying the address of the node temp into link field of last node. The corresponding code for this isLast->link=temp;

Now, an item is successfully inserted at the front of the list. All these steps have been designed by assuming the list is already existing. If the list is empty, make temp itself as the last node and establish a link between the first node and the last node. Repeatedly insert the items using the above procedure to create a list. Function to insert an item at the front end of the list. NODE insert_front(int item, NODE last){NODE temp;Temp=getnode();Temp->info=item;If(last==NULL)Last=temp;Elsetemp->link=last->link;Last->link=temp;

20 45 10 80


Return last;}

3. a. Compare and contrast DFS and BFS and DFS+ID approaches

b. Discuss how Splay Tree differs from a Binary Tree? Justify your answer with

appropriate example.

Ans:3 a)

Once you’ve identified a problem as a search problem, it’s important to choose the right type of search. Here are some information which may be useful while taking the decision.

Search Time Space When to use

DFS O(c^k) O(k)Must search tree anyway, know the level the answers are on, or you aren’t looking for the shallowest number.

BFS O(c^d) O(c^d)Know answers are very near top of tree, or want shallowest answer.

DFS+ID O(c^d) O(d)Want to do BFS, don’t have enough space, and can spare the time.

d is the depth of the answer k is the depth searched d <= k

Remember the ordering properties of each search. If the program needs to produce a list sorted shortest solution first (in terms of distance from the root node), use breadth first search or iterative deepening. For other orders, depth first search is the right strategy.

If there isn’t enough time to search the entire tree, use the algorithm that is more likely to find the answer. If the answer is expected to be in one of the rows of nodes closest to the root, use breadth first search or iterative deepening. Conversely, if the answer is expected to be in the leaves, use the simpler depth first search. Be sure to keep space constraints in mind. If memory is insufficient to maintain the queue for breadth first search but time is available, use iterative deepening.


B)

Binary Tree:

A binary tree is simply a tree in which each node can have at most two children.

A binary search tree is a binary tree in which the nodes are assigned values, with the following restrictions ;

-No duplicate values. -The left subtree of a node can only have values less than the node -The right subtree of a node can only have values greater than the node

and recursively defined; -The left subtree of a node is a binary search tree. -The right subtree of a node is a binary search tree.

Splay Trees

We shall describe the algorithm by giving three rewrite rules in the form of pictures. In these pictures, x is the node that was accessed (that will eventually be at the root of the tree). By looking at the local structure of the tree defined by x, x’s parent, and x’s grandparent we decide which of the following three rules to follow. We continue to apply the rules until x is at the root of the tree:

Notes

1) Each rule has a mirror image variant, which covers all the cases.

2) The zig-zig rule is the one that distinguishes splaying from just rotating x to the root of the tree.

3) Top-down splaying is much more efficient in practice.



Consider the following class of binary-tree based algorithms for processing a sequence of requests:

on each access we pay the distance from the root to the accessed node.

Arbitrary rotations are also allowed, at a cost of 1 per rotation.

For a sequence Sigma, let C(T,Sigma) be the cost of the optimal algorithm in this class for processing a sequence Sigma starting from tree T. Then the cost of splaying the sequence (starting from any tree) is O(C(T,Sigma) + n).

What about dynamic operations that change the set of elements in the tree? Here’s how we can do Insertion, Join, and Deletion.

Insertion

Say we want to insert a new item e. Proceed like ordinary binary tree insertion, but stop short of linking the new item e into the tree. Let the node you were about to link e to be called p. We splay p to the root. Now construct a tree according to one of the following pictures:

depending on if the item in e is before or after p.

Let’s analyze this when all the weights are 1. The amortized cost of the splay is O(log n) according to the access lemma. Then we have to pay the additional amortized cost which equals the potential of the new root. This is log(n+1). So the amortized cost of the whole thing is O(log(n)).

An alternative insertion algorithm is to do ordinary binary tree insertion, and then simply splay the item to the root. This also has efficient amortized cost, but the analysis is a tiny bit more complicated.

Join

Here we have two trees, say, A and B. We want to put them together with all the items in A to the left of all the items in B. This is called the "Join" operation. This is done as follows. Splay the rightmost node of A. Now make B the right child of this node. The amortized cost of the operation is easily seen to be O(log n) where n is the number of nodes in the tree after joining. (Just assign uniform weights to all the nodes, as we did in the Insertion analysis.)



Deletion

To delete a node, x, simply splay it to the root and then Join its left and right sub-trees together as described above. The amortized cost is easily seen to be O(log n), where n is the size before the deletion.

4. Given the B-tree (figure 1) shown below. Show

a. Show the step-by-step process of inserting key E into this tree.

b. Show the step-by-step process of removing key P from this tree.

c. Show the step-by-step process of removing key R from this tree.

Ans:4 Figure 1 not available.

MCOO68 SMU MCA SEM2 2011

Documents

Transcript of MCOO68 SMU MCA SEM2 2011