DATA STRUCTURE A data structure is a particular way of storing and organizing data in a computer so...

Post on 16-Jan-2016

221 views 0 download

Tags:

Transcript of DATA STRUCTURE A data structure is a particular way of storing and organizing data in a computer so...

DATA STRUCTURE

A data structure is a particular way of storing and organizing data in a computer so that it can be used 

efficiently.

Kinds of Data Structures

• For example, B-trees are particularly well-suited for implementation of databases, while compiler implementations usually use hash tables to look up identifiers.

Uses of Data Structures

Data structures are used in almost every program or software system.

Data types

1.Primitive Types

2.Composite Types

3.Abstract Data Types(ADT)

Primitive Types

1.  the Boolean or logical data type is the most primitive data type, having one of two values (true or false), intended to represent the truth values of logic andBoolean algebra.

2. a character is a unit of information that roughly corresponds to a grapheme, grapheme-like unit, or symbol, such as in an alphabet or syllabary in the written form of a natural language.

3. integer is used to refer to a data type which represents some finite subset of the mathematical integers. These are also known as integral data types

4. a string is a sequence of symbols that are chosen from a set or alphabet.

5. double precision is a computer numbering format that occupies two adjacent storage locations in computer memory. A double precision number, sometimes simply called a double, may be defined to be an integer, fixed point, or floating point.

• 6. Floating point data types.

The keyword float usually represents a single precision floating point data type, and double represents a double precision floating point data type. In TIGCC, both float and double (and even long double) are the same.

Composite Types

1. a record (also called tuple or struct) is one of the simplest data structures, consisting of two or more values or variables stored in consecutive memory positions; so that each component (called a field or member of the record) can be accessed by applying different offsets to the starting address.

2. a union is a value that may have any of several representations or formats; or a data structure that consists of a variable which may hold such a value. Someprogramming languages support special data types, called (somewhat confusingly) union types, to describe such values and variables.

3. a tagged union, also called a variant, variant record, discriminated union, or disjoint union, is a data structure used to hold a value that could take on several different, but fixed types. Only one of the types can be in use at any one time, and a tag field explicitly indicates which one is in use. 

Bubble SortBubble sort is a simple and well-known sorting algorithm. It is used in practice

once in a blue moon and its main application is to make an introduction to the sorting algorithms. Bubble sort belongs to O(n2) sorting algorithms, which makes it quite inefficient for sorting large data volumes. Bubble sort is stable and adaptive.

Algorithm1.Compare each pair of adjacent elements from the

beginning of an array and, if they are in reversed order, swap them.

2.If at least one swap has been done, repeat step 1

Imagine that on every step big bubbles float to the surface and stay there. At the step, when no bubble moves, sorting stops. Let us see an example of sorting an array to make the idea of bubble sort clearer.

Complexity analysis Average and worst case complexity of bubble sort is

O(n2). Also, it makes O(n2) swaps in the worst case. Bubble sort is adaptive. It means that for almost sorted array it gives O(n) estimation. Avoid implementations, which don't check if the array is already sorted on every step (any swaps made). This check is necessary, in order to preserve adaptive property.

See Sample Figure 1.0

TURTLE AND RABBITS• One more problem of bubble sort is that its running time badly

depends on the initial order of the elements. Big elements (rabbits) go up fast, while small ones (turtles) go down very slow. This problem is solved in the Cocktail sort.

• Turtle example. Thought, array {2, 3, 4, 5, 1} is almost sorted, it takes O(n2) iterations to sort an array. Element {1} is a turtle.

• Rabbit example. Array {6, 1, 2, 3, 4, 5} is almost sorted too, but it takes O(n) iterations to sort it. Element {6} is a rabbit. This example demonstrates adaptive property of the bubble sort.

See Sample Figure 1.1

See Sample Figure 1.2

Selection Sort

Selection sort is one of the O(n2) sorting algorithms, which makes it quite inefficient for sorting large data volumes. Selection sort is notable for its

programming simplicity and it can over perform other sorts in certain situations (see complexity analysis for more details).

Algorithm• The idea of algorithm is quite simple. Array is imaginary

divided into two parts - sorted one and unsorted one. At the beginning, sorted part is empty, while unsorted one contains whole array. At every step, algorithm finds minimal element in the unsorted part and adds it to the end of the sorted one. When unsorted part becomes empty, algorithm stops.

When algorithm sorts an array, it swaps first element of unsorted part with minimal element and then it is included to the sorted part. This implementation of selection sort in not stable. In case of linked list is sorted, and, instead of swaps, minimal element is linked to the unsorted part, selection sort is stable

Insertion Sort

Insertion sort belongs to the O(n2) sorting algorithms. Unlike many sorting algorithms with quadratic complexity, it is actually applied in practice for sorting small arrays of data. For instance, it is used to improve quicksort routine. Some

sources notice, that people use same algorithm ordering items, for example, hand of cards.

• Algorithm

Insertion sort algorithm somewhat resembles selection sort. Array is imaginary divided into two parts - sorted one andunsorted one. At the beginning, sorted part contains first element of the array and unsorted one contains the rest. At every step, algorithm takes first element in the unsorted part and inserts it to the right place of the sorted one. Whenunsorted part becomes empty, algorithm stops. Sketchy, insertion sort algorithm step looks like this:

Insertion sort properties

• adaptive (performance adapts to the initial order of elements);• stable (insertion sort retains relative order of the same elements);• in-place (requires constant amount of additional space);• Online (new elements can be added during the sort).

Illustration

Quick Sort

is a fast sorting algorithm, which is used not only for educational purposes, but widely applied in practice. On the average, it has O(n log n) complexity, making quicksort suitable for sorting big data volumes. The idea of the

algorithm is quite simple and once you realize it, you can write quicksort as fast as bubble sort.

STACKS

Stack is one of the fundamental data structures in computer science and it is used in many algorithms and applications. As an example, stack is used:implicitly in recursion;

-for expression evaluation;-to check the correctness of parentheses sequence;etc.-First of all, we will describe Stack ADT and then show -two different implementations.

• a stack is a last in, first out (LIFO) abstract data type and data structure. A stack can have any abstract data type as an element, but is characterized by only two fundamental operations: push and pop. The push operation adds to the top of the list, hiding any items already on the stack, or initializing the stack if it is empty. The pop operation removes an item from the top of the list, and returns this value to the caller. A pop either reveals previously concealed items, or results in an empty list.

Stack ADT

Well illustration from the real world is a stack of books, where you can operate with a "top" of the stack only. We will dedicate six operations on the stack. You can put a book on top (push); look, which one lays on top of a stack (peek)and remove a topmost book (pop) and check, if stack is empty (isEmpty). Also, there are two operations to create and to destroy a stack. Let us structure them up in the following list.

Operations• Stack create()

creates empty stack• boolean isEmpty(Stack s)

tells whether the stack s is empty• push(Stack s, Item e)

put e on top of the stack s• Item peek(Stack s)

returns topmost element in stack sPrecondition: s is not empty

• pop(Stack s)removes topmost element from the stack sPrecondition: s is not empty

• destroy(Stack s)destroys stack s

• Note. In different sources peek operation may be called top.

• Axioms• newly created stack is empty;• after pushing an item to a newly created stack, it

becomes nonempty;• peek returns the most recently pushed item;• stack remains untouched, after a pair of push

and pop commands, executed one after another.• It is a formal definition. Let us step forward to

more intuitive examples, before we turn to implementation issues.

Linked list

a linked list is a data structure that consists of a sequence of data records such that in each record there is a field that contains a reference (i.e., a link) to the next record in the sequence.

• Singly-linked list• Linked list is a very important dynamic data

structure. Basically, there are two types of linked list, singly-linked list and doubly-linked list. In a singly-linked list every element contains some data and a link to the next element, which allows to keep the structure. On the other hand, every node in a doubly-linked list also contains a link to the previous node. Linked list can be an underlying data structure to implement stack, queue or sorted list.

• Each cell is called a node of a singly-linked list. First node is called head and it's a dedicated node. By knowing it, we can access every other node in the list. Sometimes, last node, called tail, is also stored in order to speed up add operation.

• Singly-linked list. Traversal.Assume, that we have a list with some nodes. Traversal is the very basic operation, which presents as a part in almost every operation on a singly-linked list. For instance, algorithm may traverse a singly-linked list to find a value, find a position for insertion, etc. For a singly-linked list, only forward direction traversal is possible.

Traversal algorithmBeginning from the head• check, if the end of a list hasn't been reached yet;• do some actions with the current node, which is specific

for particular algorithm;• current node becomes previous and next node

becomes current. Go to the step 1

Binary Heap

• Binary heap. Array-based internal representation. Heap is a binary tree and therefore can be stored inside

the computer using links. It gives various benefits; one of them isthe ability to vary number of elements in a heap quite easily. On the other hand, each node stores two auxiliary links, which implies an additional memory costs. As mentioned above, a heap is a complete binary tree, which leads to the idea of storing it using an array. By utilizing array-based representation, we can reduce memory costs while tree navigation remains quite simple.

Complete binary tree

It is said, that binary tree is complete, if all its levels, except possibly the deepest, are complete. Thought, incomplete bottom level can't have "holes", which means that it has to be fulfilled from the very left node and up to some node in the middle. See illustrations below.

• Heap property

There are two possible types of binary heaps: max heap and min heap. The difference is that root of a min heap contains minimal element and vice versa. Priority queue is often deal with min heaps, whereas heapsort algorithm, when sorting in ascending order, uses max heap.

• Inserting an element into a heap

In this article we examine the idea laying in the foundation of the heap data structure. We call it sifting, but you also may meet another terms, like "trickle", "heapify", "bubble" or "percolate".

Insertion algorithm

1. Now, let us phrase general algorithm to insert a new element into a heap.

2. Add a new element to the end of an array;

3. Sift up the new element, while heap property is broken. Sifting is done as following: compare node's value with parent's value. If they are in wrong order, swap them.

Now heap property is broken at the root

node:

• Removing the minimum from a heap

Removal operation uses the same idea as was used for insertion. Root's value, which is minimal by the heap property, is replaced by the last array's value. Then new value is sifted down, until it takes right position.

Removal algorithm1. Copy the last value in the array to the root;2. Decrease heap's size by 1;3. Sift down root's value. Sifting is done as following:

– if current node has no children, sifting is over;– if current node has one child: check, if heap property is broken,

then swap current node's value and child value; sift down the child;

– if current node has two children: find the smallest of them. If heap property is broken, then swap current node's value and selected child value; sift down the child.

Introduction to Binary Tree

A binary tree is made of nodes, where each node contains a "left" pointer, a "right" pointer, and a data element. The "root" pointer points to the topmost node in the tree. The left and right pointers recursively point to smaller "subtrees" on either side. A null pointer represents a binary tree with no elements -- the empty tree. The formal recursive definition is: a binary tree is either empty (represented by a null pointer), or is made of a single node, where the left and right pointers (recursive definition ahead) each point to a binary tree. 

A "binary search tree" (BST) or "ordered binary tree" is a type of binary tree where the nodes are arranged in order: for each node, all elements in its left subtree are less-or-equal to the node (<=), and all the elements in its right subtree are greater than the node (>). The tree shown above is a binary search tree -- the "root" node is a 5, and its left subtree nodes (1, 3, 4) are <= 5, and its right subtree nodes (6, 9) are > 5. Recursively, each of the subtrees must also obey the binary search tree constraint: in the (1, 3, 4) subtree, the 3 is the root, the 1 <= 3 and 4 > 3. Watch out for the exact wording in the problems -- a "binary search tree" is different from a "binary tree".

Insert()

Insert() -- given a binary search tree and a number, insert a new node with the given number into the tree in the correct place. The insert() code is similar to lookup(), but with the complication that it modifies the tree structure. As described above, insert() returns the new tree pointer to use to its caller. Calling insert() with the number 5 on this

tree

hasPathSum()

We'll define a "root-to-leaf path" to be a sequence of nodes in a tree starting with the root node and proceeding downward to a leaf (a node with no children). We'll say that an empty tree contains no root-to-leaf paths. So for example, the following tree has exactly four root-to-

leaf paths: For this problem, we will be concerned with the sum of the values of such a path -- for example, the sum of the values on the 5-4-11-7 path is 5 + 4 + 11 + 7 = 27.

• A binary tree is either empty or consists of a node called the root together with two binary trees called the left subtree and the right subtree.

Examples of Binary Trees

Binary Tree Traversal

Deletion in BST Let x be a value to be deleted from the BST and let X

denote the node containing the value x. Deletion of an element in a BST again uses the BST property in a critical way. When we delete the node X containing x, it would create a "void" that should be filled by a suitable existing node of the BST. There are two possible candidate nodes that can fill this void, in a way that the BST property is not violated: (1). Node containing highest valued element among all descendants of left child of X. (2). Node containing the lowest valued element among all the descendants of the right child of X. In case (1), the selected node will necessarily have a null right link which can be conveniently used in patching up the tree. In case (2), the selected node will necessarily have a null left link which can be used in patching up the tree. Figure illustrates several scenarios for deletion in BSTs.