More on Data Structures in C


Page 1: More on Data Structures in C

More on Data Structures in C

CS-2301 D-term 2009 1

More on Data Structures in C

CS-2301 System Programming D-term 2009

(Slides include materials from The C Programming Language, 2nd edition, by Kernighan and Ritchie and from C: How to Program, 5th and 6th editions, by Deitel and Deitel)

Page 2: More on Data Structures in C

Linked List Review

• Linear data structure

• Easy to grow and shrink

• Easy to add and delete items

• Time to search for an item – O(n)

Page 3: More on Data Structures in C

Linked List (continued)

[Diagram: head points to a chain of four items, each with a payload and a next pointer]

struct listItem *head;
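
The slides name `struct listItem` but do not show its definition. The following is a minimal sketch, assuming an `int` payload; the payload type and the helper names `listPush` and `listFind` are assumptions for illustration and do not come from the slides.

```c
#include <assert.h>
#include <stdlib.h>

/* One item of a singly-linked list; the int payload is an assumption. */
struct listItem {
    int payload;
    struct listItem *next;
};

/* Allocate a new item and link it in front of *head.
   Returns 0 on success, -1 if malloc fails. */
int listPush(struct listItem **head, int payload) {
    assert(head != NULL);
    struct listItem *item = malloc(sizeof *item);
    if (item == NULL)
        return -1;
    item->payload = payload;
    item->next = *head;
    *head = item;
    return 0;
}

/* Linear search: O(n), because every item may have to be visited. */
struct listItem *listFind(struct listItem *head, int payload) {
    for (struct listItem *p = head; p != NULL; p = p->next)
        if (p->payload == payload)
            return p;
    return NULL;
}
```

Growing the list is a constant-time pointer update, while finding an item means walking the chain from `head`, which is where the O(n) search time comes from.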

Page 4: More on Data Structures in C

Doubly-Linked List (review)

[Diagram: four items, each with prev and next pointers and a payload, linked in both directions between head and tail]

struct listItem *head, *tail;

Page 5: More on Data Structures in C

AddAfter(item *p, item *new)

Simple linked list
{
    new->next = p->next;
    p->next = new;
}

Doubly-linked list
{
    new->next = p->next;
    if (p->next)
        p->next->prev = new;
    new->prev = p;
    p->next = new;
}
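
As a compilable sketch of the doubly-linked case: the `item` layout below (an `int` payload plus `prev` and `next`) is assumed, since the slides only name the type, and the parameter is called `newItem` here (a hypothetical rename) so the code also compiles as C++.

```c
#include <assert.h>
#include <stddef.h>

/* Item type assumed for illustration; the slides only name `item`. */
typedef struct item {
    int payload;
    struct item *prev, *next;
} item;

/* Insert newItem immediately after p in a doubly-linked list. */
void addAfter(item *p, item *newItem) {
    assert(p != NULL && newItem != NULL);
    newItem->next = p->next;          /* new item points at p's old successor */
    if (p->next != NULL)
        p->next->prev = newItem;      /* old successor points back at new item */
    newItem->prev = p;
    p->next = newItem;                /* finally splice into the forward chain */
}
```

The order matters: `p->next` is overwritten last because the earlier statements still need the old successor.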

Page 10: More on Data Structures in C

deleteNext(item *p)

Simple linked list
{
    /* unlink the item after p (it is not freed here) */
    if (p->next != NULL)
        p->next = p->next->next;
}

Doubly-linked list
• Complicated
• Easier to deleteItem

Page 11: More on Data Structures in C

deleteItem(item *p)

Simple linked list
• Not possible without having a pointer to previous item!

Doubly-linked list
{
    if (p->next != NULL)
        p->next->prev = p->prev;
    if (p->prev != NULL)
        p->prev->next = p->next;
}

[Diagram: three doubly-linked items, each with prev and next pointers and a payload]
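
The slide's deleteItem relinks the neighbours but leaves any head and tail pointers untouched. A minimal sketch that also maintains them, assuming an `int` payload and a hypothetical three-argument signature that is not in the slides:

```c
#include <assert.h>
#include <stddef.h>

/* Item type assumed for illustration; the slides only name `item`. */
typedef struct item {
    int payload;
    struct item *prev, *next;
} item;

/* Unlink p from the list and keep *head and *tail valid.
   The caller still owns p and is responsible for freeing it. */
void deleteItem(item **head, item **tail, item *p) {
    assert(head != NULL && tail != NULL && p != NULL);
    if (p->next != NULL)
        p->next->prev = p->prev;
    else
        *tail = p->prev;          /* p was the last item */
    if (p->prev != NULL)
        p->prev->next = p->next;
    else
        *head = p->next;          /* p was the first item */
    p->prev = p->next = NULL;     /* leave no stale links in p */
}
```

Deleting the only item in a list sets both `*head` and `*tail` to NULL, which is the case the two `else` branches exist for.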

Page 14: More on Data Structures in C

Special Cases of Linked Lists

• Queue:–
  – Items always added to tail
  – Items always removed from head
  – Singly-linked list works okay
  – Need pointers to head and tail

• Stack:–
  – Items always added to head
  – Items always removed from head
  – Singly-linked list works okay
  – Only need pointer to head
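
To illustrate why a queue needs both pointers, here is a minimal sketch of a queue on a singly-linked list. The names `struct node`, `struct queue`, `enqueue`, and `dequeue`, and the `int` payload, are assumptions for illustration.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int payload;
    struct node *next;
};

/* A queue needs both ends: items enter at tail and leave at head. */
struct queue {
    struct node *head, *tail;
};

/* Add to the tail. Returns 0 on success, -1 if malloc fails. */
int enqueue(struct queue *q, int payload) {
    assert(q != NULL);
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->payload = payload;
    n->next = NULL;
    if (q->tail != NULL)
        q->tail->next = n;
    else
        q->head = n;              /* queue was empty */
    q->tail = n;
    return 0;
}

/* Remove from the head. Returns 0 and stores the payload in *out,
   or returns -1 if the queue is empty. */
int dequeue(struct queue *q, int *out) {
    assert(q != NULL && out != NULL);
    struct node *n = q->head;
    if (n == NULL)
        return -1;
    *out = n->payload;
    q->head = n->next;
    if (q->head == NULL)
        q->tail = NULL;           /* queue became empty */
    free(n);
    return 0;
}
```

A stack would use the same node type but add and remove at `head` only, so the `tail` pointer and its bookkeeping disappear.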

Page 15: More on Data Structures in C

Bubble Sort a Linked List

item *BubbleSort(item *p) {
    if (p->next != NULL) {
        item *q = p->next, *qq = p;
        for (; q != NULL; qq = q, q = q->next)
            if (p->payload > q->payload) {
                /* swap p and q */
            }
        p->next = BubbleSort(p->next);
    }
    return p;
}

Page 16: More on Data Structures in C

Bubble Sort a Linked List

item *BubbleSort(item *p) {
    if (p->next != NULL) {
        item *q = p->next, *qq = p;
        for (; q != NULL; qq = q, q = q->next)
            if (p->payload > q->payload) {
                item *temp = p->next;
                p->next = q->next;
                q->next = temp;
                qq->next = p;
                p = q;
            }
        p->next = BubbleSort(p->next);
    }
    return p;
}

Page 17: More on Data Structures in C

Bubble Sort a Linked List (continued)

Variables in BubbleSort:

• p: head of (sub)list being sorted

• q: pointer to step through (sub)list

• qq: pointer to item previous to q in (sub)list

Page 18: More on Data Structures in C

Potential Exam Questions

• Analyze BubbleSort to determine if it is correct, and fix it if incorrect.

• Hint: you need to define “correct”

• Hint2: you need to define a loop invariant to convince yourself

• Draw a diagram showing the nodes, pointers, and actions of the algorithm
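
One way to make the first hint concrete is to write "correct" down as a check. The sketch below is not the slides' routine: it assumes an `int` payload, checks only ordering (not that the output is a permutation of the input), and uses a deliberately different technique, swapping payloads instead of relinking items. The names `isSorted` and `bubbleSortPayloads` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Item type assumed for illustration; the slides only name `item`. */
typedef struct item {
    int payload;
    struct item *next;
} item;

/* Part of "correct": payloads are in non-decreasing order. */
int isSorted(const item *p) {
    for (; p != NULL && p->next != NULL; p = p->next)
        if (p->payload > p->next->payload)
            return 0;
    return 1;
}

/* Bubble sort that swaps payloads instead of relinking items.
   Each pass moves the largest payload it meets toward the end;
   the loop stops after a pass with no swaps. Handles an empty list. */
void bubbleSortPayloads(item *head) {
    int swapped = 1;
    while (swapped) {
        swapped = 0;
        for (item *p = head; p != NULL && p->next != NULL; p = p->next)
            if (p->payload > p->next->payload) {
                int tmp = p->payload;
                p->payload = p->next->payload;
                p->next->payload = tmp;
                swapped = 1;
            }
    }
    assert(isSorted(head));       /* postcondition */
}
```

The same `isSorted` check can be pointed at the output of the slides' BubbleSort when working through the exam question.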

Page 19: More on Data Structures in C

Observations:–

• What is the order (Big-O notation) of the Bubble Sort algorithm?
  • Answer: O(n²)

• Note that Quicksort is faster – O(n log n) on average
  • Pages 87 & 110 in Kernighan and Ritchie
  • Potential exam question:– why?

Page 20: More on Data Structures in C

Questions?

Page 21: More on Data Structures in C

Binary Tree (review)

• A linked list but with two links per item

struct treeItem {
    type payload;
    struct treeItem *left;
    struct treeItem *right;
};

[Diagram: a binary tree of seven items, each with a payload and left and right pointers]

Page 22: More on Data Structures in C

Binary Trees (continued)

• Two-dimensional data structure

• Easy to grow and shrink

• Easy to add and delete items at leaves
  • More work needed to insert or delete branch nodes

• Search time is O(log n)
  • If tree is reasonably balanced
  • Degenerates to O(n) in worst case if unbalanced
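
The O(log n) search time presumes the tree is kept ordered, i.e. a binary search tree; the slides do not say so explicitly, so treat that as an assumption. A minimal sketch with an `int` payload and hypothetical helpers `treeInsert` and `treeFind`:

```c
#include <assert.h>
#include <stdlib.h>

struct treeItem {
    int payload;                      /* int is an assumption */
    struct treeItem *left, *right;
};

/* Insert payload below *root, keeping smaller values to the left.
   New items always become leaves; duplicates are ignored.
   Returns 0 on success, -1 if malloc fails. */
int treeInsert(struct treeItem **root, int payload) {
    assert(root != NULL);
    while (*root != NULL) {
        if (payload == (*root)->payload)
            return 0;
        root = payload < (*root)->payload ? &(*root)->left : &(*root)->right;
    }
    *root = malloc(sizeof **root);
    if (*root == NULL)
        return -1;
    (*root)->payload = payload;
    (*root)->left = (*root)->right = NULL;
    return 0;
}

/* Each comparison discards one sub-tree, so a balanced tree needs
   about log2(n) steps, while a degenerate (list-like) tree needs n. */
struct treeItem *treeFind(struct treeItem *root, int payload) {
    while (root != NULL && root->payload != payload)
        root = payload < root->payload ? root->left : root->right;
    return root;
}
```

Inserting keys in already-sorted order with this routine produces exactly the degenerate, list-shaped tree that the O(n) worst case refers to.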

Page 23: More on Data Structures in C

Order of Traversing Binary Trees

• In-order
  • Traverse left sub-tree (in-order)
  • Visit node itself
  • Traverse right sub-tree (in-order)

• Pre-order
  • Visit node first
  • Traverse left sub-tree
  • Traverse right sub-tree

• Post-order
  • Traverse left sub-tree
  • Traverse right sub-tree
  • Visit node last
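
The three orders differ only in where the "visit" step sits relative to the two recursive calls. A minimal sketch, assuming a one-character payload and a caller-supplied output buffer; the function names and the buffer convention are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct treeItem {
    char payload;                     /* one character per node, for illustration */
    struct treeItem *left, *right;
};

/* Each function appends the visited payloads to out and returns the
   new end of the output; the caller provides a large enough buffer
   and adds the terminating '\0' if a string is wanted. */
char *inOrder(const struct treeItem *t, char *out) {
    assert(out != NULL);
    if (t == NULL)
        return out;
    out = inOrder(t->left, out);
    *out++ = t->payload;              /* visit between the sub-trees */
    return inOrder(t->right, out);
}

char *preOrder(const struct treeItem *t, char *out) {
    assert(out != NULL);
    if (t == NULL)
        return out;
    *out++ = t->payload;              /* visit first */
    out = preOrder(t->left, out);
    return preOrder(t->right, out);
}

char *postOrder(const struct treeItem *t, char *out) {
    assert(out != NULL);
    if (t == NULL)
        return out;
    out = postOrder(t->left, out);
    out = postOrder(t->right, out);
    *out++ = t->payload;              /* visit last */
    return out;
}
```

For a tree with `*` at the root, a `+` node over `a` and `b` on the left, and `c` on the right, in-order yields `a+b*c`, pre-order yields `*+abc`, and post-order yields `ab+c*`.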

Page 24: More on Data Structures in C

Order of Traversing Binary Trees (continued)

• See Programming Assignment #6

Page 25: More on Data Structures in C

Example of Binary Tree

x = (a.real*b.imag - b.real*a.imag) / sqrt(a.real*b.real - a.imag*b.imag)

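[Diagram: expression tree for the assignment above. The root is =, with x and / as its children. The left operand of / is a subtraction of two products of member accesses (a.real*b.imag and b.real*a.imag); the right operand is sqrt applied to a subtraction.]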

Page 26: More on Data Structures in C

Question

• What kind of traversal order is required for this expression?

• In-order?

• Pre-order?

• Post-order?

Page 27: More on Data Structures in C

Binary Trees in Compilers

• Used to represent the structure of the compiled program

• Optimizations
  • Common sub-expression detection
  • Code simplification
  • Loop unrolling
  • Parallelization
  • Reductions in strength – e.g., substituting additions for multiplications, etc.
  • Many others

Page 28: More on Data Structures in C

Questions about Trees?

or about Programming Assignment 6?

Page 29: More on Data Structures in C

New Challenge

• What if we require a data structure that has to be accessed by value in constant time?

• I.e., O(log n) is not good enough!

• Need to be able to add or delete items

• Total number of items unknown
  • But an approximate maximum might be known

Page 30: More on Data Structures in C

Examples

• Anti-virus scanner

• Symbol table of compiler

• Virtual memory tables in operating system

• Bank account for an individual

Page 31: More on Data Structures in C

Observation

• Arrays provide constant time access …

• … but you have to know which element you want!
  • We only know the contents of the item we want!

• Also
  • Not easy to grow or shrink
  • Not open-ended

• Can we do better?

Page 32: More on Data Structures in C

Answer – Hash Table

• Definition:– Hash Table
  • A data structure comprising an array (for constant time access)
  • A set of linked lists (one list for each array element)
  • A hashing function to convert search key to array index

Page 33: More on Data Structures in C

Definition

• Search key:– a value stored as (part of) the payload of the item you are looking for

• Need to find the item containing that value (i.e., key)

Page 34: More on Data Structures in C

Answer – Hash Table (continued)

• Definition:– Hashing function (or simply hash function)
  • A function that takes the search key in question and “randomizes” it to produce an index
  • So that non-randomness in the keys does not concentrate too many elements around a few indices in the array
  • See §6.6 in Kernighan & Ritchie

Page 35: More on Data Structures in C

Hash Table Structure

[Diagram: an array of items; each array element heads a linked list of nodes, each with data and next fields]

Page 36: More on Data Structures in C

Guidelines for Hash Tables

• Lists from each item should be short
  • I.e., with short search time (approximately constant)

• Size of array should be based on expected # of entries
  • Err on large side if possible

• Hashing function
  • Should “spread out” the values relatively uniformly
  • Multiplication and division by prime numbers usually work well

Page 37: More on Data Structures in C

Example Hashing Function

• P. 144 of K & R

#define HASHSIZE 101

unsigned int hash(char *s) {
    unsigned int hashval;

    for (hashval = 0; *s != '\0'; s++)
        hashval = *s + 31 * hashval;
    return hashval % HASHSIZE;
}

Page 38: More on Data Structures in C

Example Hashing Function (continued)

• Note choice of prime numbers (31 and 101) to “mix it up”

Page 39: More on Data Structures in C

Using a Hash Table

struct item *lookup(char *s) {
    struct item *np;

    for (np = hashtab[hash(s)]; np != NULL; np = np->next)
        if (strcmp(s, np->data) == 0)
            return np;      /* found */
    return NULL;            /* not found */
}

Page 40: More on Data Structures in C

Using a Hash Table (continued)

• In lookup, the hash table is indexed by the hash value of s

Page 41: More on Data Structures in C

Using a Hash Table (continued)

• lookup then traverses the linked list to find item s

Page 42: More on Data Structures in C

Using a Hash Table (continued)

struct item *addItem(char *s, …) {
    struct item *np;
    unsigned int hv;

    if ((np = lookup(s)) == NULL) {
        np = malloc(sizeof(struct item));
        if (np == NULL)
            return NULL;    /* out of memory */
        /* fill in s and data */
        np->next = hashtab[hv = hash(s)];
        hashtab[hv] = np;
    }
    return np;
}
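
Putting the pieces together, the following is a minimal compilable sketch of the table, hash function, lookup, and insertion. Several details are assumptions, not taken from the slides: the item holds only a copied key string in `data`, `addItem` takes just the key (the slides elide the remaining parameters), and each character is cast to `unsigned char` before hashing so the result does not depend on whether `char` is signed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define HASHSIZE 101

struct item {
    char *data;                 /* search key, owned by the item (assumed layout) */
    struct item *next;          /* next item in the same bucket */
};

static struct item *hashtab[HASHSIZE];   /* all buckets start out NULL */

unsigned int hash(const char *s) {
    unsigned int hashval;
    for (hashval = 0; *s != '\0'; s++)
        hashval = (unsigned char)*s + 31 * hashval;
    return hashval % HASHSIZE;
}

/* Index the array by hash value, then walk that bucket's list. */
struct item *lookup(const char *s) {
    for (struct item *np = hashtab[hash(s)]; np != NULL; np = np->next)
        if (strcmp(s, np->data) == 0)
            return np;          /* found */
    return NULL;                /* not found */
}

/* Return the existing item for s, or insert a new one at the head
   of its bucket. Returns NULL if memory runs out. */
struct item *addItem(const char *s) {
    assert(s != NULL);
    struct item *np = lookup(s);
    if (np == NULL) {
        np = malloc(sizeof *np);
        if (np == NULL)
            return NULL;
        np->data = malloc(strlen(s) + 1);
        if (np->data == NULL) {
            free(np);
            return NULL;
        }
        strcpy(np->data, s);
        unsigned int hv = hash(s);
        np->next = hashtab[hv];
        hashtab[hv] = np;
    }
    return np;
}
```

Calling `addItem` twice with the same key returns the same item, because the second call finds it through `lookup` before anything is allocated.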

Page 43: More on Data Structures in C

Using a Hash Table (continued)

• addItem inserts the new item at the head of the list indexed by the hash value

Page 44: More on Data Structures in C

Hash Table Summary

• Widely used for constant time access

• Easy to build and maintain

• There is an art and a science to the choice of hashing functions
  • Consult textbooks, web, etc.

Page 45: More on Data Structures in C

Questions?