
Symbol Tables

Chapter 12.1-12.3 Sedgewick


Searching

Searching is a fundamental element of many computational tasks:
- looking up a name in a phone book
- selecting records in databases
- searching for pages on the web

Characteristics of searching:
- typically, a very large amount of data (very many items)
- the "information need" is specified by keys (search terms)
- effective keys identify a small proportion of the data


…Searching

In our context (DS&A), we abstract the problem to:
- we have a large collection of items
- each item contains key values and other data

The search problem:
- input: a key value
- output: item(s) containing that key


…Searching

We assume:
- keys are unique
- each item has one key

How do we represent a key in an item?
- extend Item.h to handle complex items
- items now have a key and data


Complex Items – Item.h

typedef int Key;            // can change key type

struct record {
    Key keyVal;
    char value[10];         // can change value type
};

typedef struct record *Item;

#define NULLitem NULL       // indicates no item
#define key(A) ((A)->keyVal)
#define compare(A,B) ((A) - (B))

void itemShow(Item);
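A minimal sketch of how a client might build and display one of these items. The helper name `itemNew` is hypothetical (it is not part of Item.h as given), and the body of `itemShow` is only one plausible implementation, since the slides declare it without defining it.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef int Key;

struct record {
    Key keyVal;
    char value[10];
};

typedef struct record *Item;

#define NULLitem NULL
#define key(A) ((A)->keyVal)
#define compare(A,B) ((A) - (B))

// Hypothetical helper: allocate an item holding key k and a copy of v.
// v is truncated to fit the 10-byte value field (9 chars plus '\0').
Item itemNew(Key k, const char *v) {
    Item it = malloc(sizeof(struct record));
    assert(it != NULL);
    it->keyVal = k;
    strncpy(it->value, v, sizeof(it->value) - 1);
    it->value[sizeof(it->value) - 1] = '\0';
    return it;
}

// One possible itemShow: print "key: value" on its own line.
void itemShow(Item it) {
    if (it == NULLitem) {
        printf("(no item)\n");
    } else {
        printf("%d: %s\n", key(it), it->value);
    }
}
```

Because `Item` is a pointer type, the caller owns the allocation and should `free` each item once it is no longer stored in any table.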


Symbol Tables

A symbol table (dictionary) is a collection of items with unique keys that has operations to:
- insert a new item
- return an item with a particular key

Applications of symbol tables:
- programming language processors (e.g. compilers, interpreters)
- text processing systems (spell-checkers, document retrieval)


Symbol Table Interface
SymbolTable.h

typedef struct sTabRep *STab;

// Create a new symbol table
STab stInit();

// Insert an item into the symbol table
void stInsert(STab s, Item i);

// Return the item with the given key,
// or NULLitem if the key is not in the table
Item stSearch(STab s, Key k);


…First Class Symbol Table Interface
SymbolTable.h

// Return the number of items in the table
int stCount(STab s);

// Delete the given item from the table
void stDelete(STab s, Item i);

// Find the nth item in the table
Item stSelect(STab s, int n);

// Traverse items in key order
void stSort(STab s);


Insertion into the Symbol Table

What does insert do if the key already exists in the table?
- approach A: do nothing (insertion fails silently)
- approach B: return an error indication
- approach C: replace the existing item associated with the key

We use approach A and provide a replace function if necessary.


Example: Symbol Table Client

Using a symbol table:
- Generate an ordered list of random numbers with no duplicates
  - use stSearch to check whether a number is already in the table
  - if not, use stInsert to insert it
- Use stSort to print out the numbers in order
- Only really care about the key of the item

See stClient1.c for full implementation
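The following is not stClient1.c, which is not reproduced in these slides. It is a rough sketch of the same idea under stated assumptions: a tiny key-indexed table stands in for the real implementation, `MAX_KEY` is an assumed key range, and `uniqueRandoms` is a hypothetical client function name.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef int Key;
struct record { Key keyVal; };
typedef struct record *Item;
#define NULLitem NULL
#define key(A) ((A)->keyVal)

#define MAX_KEY 100            // assumed key range: 0..MAX_KEY-1

// Tiny key-indexed table standing in for the real implementation.
typedef struct sTabRep { Item items[MAX_KEY]; int count; } *STab;

STab stInit(void) {
    STab s = malloc(sizeof(struct sTabRep));
    assert(s != NULL);
    for (int i = 0; i < MAX_KEY; i++) s->items[i] = NULLitem;
    s->count = 0;
    return s;
}

Item stSearch(STab s, Key k) {
    return (k >= 0 && k < MAX_KEY) ? s->items[k] : NULLitem;
}

void stInsert(STab s, Item it) {
    Key k = key(it);
    if (k >= 0 && k < MAX_KEY && s->items[k] == NULLitem) {
        s->items[k] = it;
        s->count++;
    }
}

int stCount(STab s) { return s->count; }

// Print the keys in increasing order (the array index is the key).
void stSort(STab s) {
    for (int i = 0; i < MAX_KEY; i++) {
        if (s->items[i] != NULLitem) printf("%d\n", key(s->items[i]));
    }
}

// Client: insert n distinct random keys into a fresh table.
// Requires n <= MAX_KEY, otherwise the loop could never finish.
STab uniqueRandoms(int n, unsigned seed) {
    assert(n >= 0 && n <= MAX_KEY);
    STab s = stInit();
    srand(seed);
    while (stCount(s) < n) {
        Key k = rand() % MAX_KEY;
        if (stSearch(s, k) == NULLitem) {      // not seen before
            Item it = malloc(sizeof(struct record));
            assert(it != NULL);
            it->keyVal = k;
            stInsert(s, it);
        }
    }
    return s;
}
```

Calling `stSort(uniqueRandoms(10, 1))` would print ten distinct keys in increasing order, since the table is traversed by index rather than by insertion order.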


Exercise: Random Number Checker

Write a program that uses a symbol table to:
- generate many random numbers in the range 1..N
- count the frequency of occurrence of each
- expectation: all frequencies roughly equal

What are the items?
- What does the key represent?
- What does the value represent?


Symbol Table Implementation 1: Key Indexed Array

- Use the key to determine the position of the item in the array
- Requires dense keys (i.e. few gaps)
- Keys must be integral (or easy to map to integral values)

[Figure: the items array with slots [0]..[7]; slots 1, 3, 4, 5 and 7 hold the items (1,data), (3,data), (4,data), (5,data) and (7,data); slots 0, 2 and 6 hold NULLitem]


ST_keyIndexed.c

struct sTabRep {
    Item *items;
    int count;
    int size;
};

// Assume keys are from 0 to (max-1) and are unique;
// max is assumed to be defined elsewhere (e.g. as a constant)
STab stInit() {
    int i;
    STab st = malloc(sizeof(struct sTabRep));
    assert(st != NULL);
    st->items = malloc(max * sizeof(Item));
    assert(st->items != NULL);
    for (i = 0; i < max; i++) {
        st->items[i] = NULLitem;
    }
    st->count = 0;
    st->size = max;
    return st;
}


…ST_keyIndexed.c

int stCount(STab st) {
    assert(st != NULL);
    return st->count;
}

void stInsert(STab st, Item i) {
    assert(st != NULL);
    // insert only if the key is in range and its slot is empty
    if (key(i) >= 0 && compare(key(i), st->size) < 0 &&
        st->items[key(i)] == NULLitem) {
        st->items[key(i)] = i;
        st->count++;
    }
}

// Exercise:
Item stSearch(STab st, Key k);


…ST_keyIndexed.c

// Exercise
void stDelete(STab st, Item i) {
}

// Exercise
Item stSelect(STab st, int n) {
}


…ST_keyIndexed.c

// Traverse all items in sorted order
void stSort(STab st) {
    int i;
    assert(st != NULL);
    for (i = 0; i < st->size; i++) {
        if (st->items[i] != NULLitem) {
            itemShow(st->items[i]);
        }
    }
}


Properties: Key Indexed Array Implementation

- Insert, search, delete and count are O(1)
- Init, select and sort are O(n), where n is the size of the array

Problem: may have large gaps in the array due to sparse keys.

Not suitable for all types of data:
- large range of keys
- key cannot easily be mapped to a unique index


Symbol Table Implementation 2: Ordered Array

- Enter items into the array without leaving gaps
- Put items in key order
- Can use linear or binary search to find items


ST_orderedArray.c

// Data structure representation is the same
struct sTabRep {
    Item *items;
    int count;
    int size;
};

// max is assumed to be defined elsewhere (e.g. as a constant)
STab stInit() {
    STab st = malloc(sizeof(struct sTabRep));
    assert(st != NULL);
    st->items = malloc(max * sizeof(Item));
    assert(st->items != NULL);
    st->count = 0;
    st->size = max;
    return st;
}


…ST_orderedArray.c

Item stSearch(STab st, Key k) {
    int i;
    Item returnVal;
    assert(st != NULL && st->items != NULL);
    // i is the position where k is, or where it would be inserted
    i = findInArray(k, st->items, 0, st->count-1);
    if (i < st->count &&
        compare(key(st->items[i]), k) == 0) {
        returnVal = st->items[i];
    } else {
        returnVal = NULLitem;
    }
    return returnVal;
}


…ST_orderedArray.c

void stInsert(STab st, Item it) {
    assert(st != NULL && st->items != NULL);
    assert(st->count < st->size);
    int i = findInArray(key(it), st->items, 0, st->count-1);
    // insert only if the key is not already present (approach A)
    if (i == st->count ||
        compare(key(st->items[i]), key(it)) != 0) {
        int j;
        // shift larger items up one slot to make room at position i
        for (j = st->count; j > i; j--) {
            st->items[j] = st->items[j-1];
        }
        st->items[i] = it;
        st->count++;
    }
}


Linear Search for findInArray()

// does not indicate if k is there or not
// return value 0 indicates k ≤ all keys in array
// return value N for array of size N indicates that
// k is larger than all keys in array
int findInArray(Key k, Item a[], int lo, int hi) {
    int i;
    for (i = lo; i <= hi; i++) {
        // stop at the first key that is not smaller than k
        if (compare(k, key(a[i])) <= 0) {
            break;
        }
    }
    return i;
}


Binary Search for findInArray()

int findInArray(Key k, Item a[], int lo, int hi) {
    int returnVal;
    if (hi < lo) {
        // empty range: lo is where k would be inserted
        returnVal = lo;
    } else {
        int mid = (hi+lo)/2;
        int diff = compare(k, key(a[mid]));
        if (diff < 0) {
            returnVal = findInArray(k, a, lo, mid-1);
        } else if (diff > 0) {
            returnVal = findInArray(k, a, mid+1, hi);
        } else {
            returnVal = mid;
        }
    }
    return returnVal;
}


Cost Analysis of Searching

Linear search:
- best case: key is min key (1 comparison)
- worst case: key is not in array (N comparisons)
- average case: key is in middle (N/2 comparisons)

Binary search:
- best case: key is mid key (1 comparison)
- worst case: key is not in array (about log2 N comparisons)
- average case: find key part-way through partitioning
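To make these counts concrete, here is a small sketch that instruments both searches with a comparison counter. It works on a plain sorted `int` array rather than on `Item`s, and the names `linearFind` and `binaryFind` are hypothetical. Both return the index where the key is, or where it would be inserted.

```c
#include <assert.h>
#include <stddef.h>

// Linear search over a sorted int array; stops at the first a[i] >= k.
// *cmps receives the number of key comparisons performed.
int linearFind(int k, const int a[], int n, int *cmps) {
    assert(a != NULL && cmps != NULL);
    int i;
    *cmps = 0;
    for (i = 0; i < n; i++) {
        (*cmps)++;
        if (k <= a[i]) break;
    }
    return i;
}

// Iterative binary search with the same return convention.
// Each probe of a[mid] is counted as one (three-way) comparison.
int binaryFind(int k, const int a[], int n, int *cmps) {
    assert(a != NULL && cmps != NULL);
    int lo = 0, hi = n - 1;
    *cmps = 0;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        (*cmps)++;
        if (k < a[mid]) {
            hi = mid - 1;
        } else if (k > a[mid]) {
            lo = mid + 1;
        } else {
            return mid;
        }
    }
    return lo;
}
```

On a sorted array of 1024 keys, searching for a key larger than every element costs 1024 comparisons with `linearFind` but only 11 probes with `binaryFind`, which is the N versus log2 N gap described above.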


Properties: Ordered Array Implementation

- Init, count: O(1)
- Search: O(log n), assuming binary search
- Insertion, deletion: O(n)
  - need to shuffle items along to fill gaps


Symbol Table Implementation 3: Linked Lists

- Linked list of items, maintained in key order
- No real need for max size
- Must use linear search


ST_LinkedList.c

typedef struct sTabRep *STab;
typedef struct node Node;

struct node {
    Item data;
    Node *next;
};

struct sTabRep {
    Node *items;
    int count;
    int max;
};


Properties: Ordered Linked List Implementation

- Init, count: O(1)
- Search, insert, delete: O(n)
  - best case: key is min key (1 comparison)
  - worst case: key is max key (n comparisons)
  - average case: key is in middle (n/2 comparisons)
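The slides give only the data structure for the linked-list version, so the following is a sketch of how search and insert might look, not the course's implementation. The helper `findLink` is hypothetical, the `max` field is omitted because the list has no fixed capacity, and duplicates are ignored in line with approach A.

```c
#include <assert.h>
#include <stdlib.h>

typedef int Key;
struct record { Key keyVal; char value[10]; };
typedef struct record *Item;
#define NULLitem NULL
#define key(A) ((A)->keyVal)

typedef struct node Node;
struct node { Item data; Node *next; };
typedef struct sTabRep { Node *items; int count; } *STab;

STab stInit(void) {
    STab st = malloc(sizeof(struct sTabRep));
    assert(st != NULL);
    st->items = NULL;
    st->count = 0;
    return st;
}

// Walk the list until reaching a key >= k. Returns the address of the
// link that points at that node (or at NULL if k exceeds every key).
static Node **findLink(STab st, Key k) {
    Node **link = &st->items;
    while (*link != NULL && key((*link)->data) < k) {
        link = &(*link)->next;
    }
    return link;
}

Item stSearch(STab st, Key k) {
    assert(st != NULL);
    Node *n = *findLink(st, k);
    return (n != NULL && key(n->data) == k) ? n->data : NULLitem;
}

// Insert in key order; a duplicate key is silently ignored.
void stInsert(STab st, Item it) {
    assert(st != NULL && it != NULLitem);
    Node **link = findLink(st, key(it));
    if (*link != NULL && key((*link)->data) == key(it)) {
        return;
    }
    Node *n = malloc(sizeof(Node));
    assert(n != NULL);
    n->data = it;
    n->next = *link;
    *link = n;          // splice the new node in before the larger key
    st->count++;
}
```

Using a pointer to the link rather than a pointer to the node avoids a special case for inserting at the head of the list.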


Symbol Table Implementation 4: Binary Search Tree


ST_bst.c

typedef struct sTabRep *STab;
typedef struct node Node;
typedef struct node *Tree;

struct node {
    Item data;
    Node *left;
    Node *right;
};

struct sTabRep {
    Node *items;
    int count;
    int size;
};


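Properties: Binary Search Tree Implementation

- Init, count: O(1)
  - we use a counter to keep track of how many items are in the tree
- Insert, search, delete: cost is proportional to the height of the tree
  - average case (keys inserted in random order): height is O(log n), so O(log n)
  - worst case (e.g. keys inserted in sorted order): height is n, so O(n)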
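To illustrate how the cost depends on the tree's height, here is a sketch of insert and search on a BST, assuming the node layout from ST_bst.c. The function bodies and the `height` helper are illustrative assumptions and are not taken from the slides. The `size` field is left out, and duplicates are ignored.

```c
#include <assert.h>
#include <stdlib.h>

typedef int Key;
struct record { Key keyVal; char value[10]; };
typedef struct record *Item;
#define NULLitem NULL
#define key(A) ((A)->keyVal)

typedef struct node Node;
struct node { Item data; Node *left; Node *right; };
typedef struct sTabRep { Node *items; int count; } *STab;

STab stInit(void) {
    STab st = malloc(sizeof(struct sTabRep));
    assert(st != NULL);
    st->items = NULL;
    st->count = 0;
    return st;
}

// Follow one root-to-leaf path: cost is at most the tree height.
Item stSearch(STab st, Key k) {
    assert(st != NULL);
    Node *n = st->items;
    while (n != NULL) {
        if (k < key(n->data)) n = n->left;
        else if (k > key(n->data)) n = n->right;
        else return n->data;
    }
    return NULLitem;
}

// Insert as a new leaf; a duplicate key is silently ignored.
void stInsert(STab st, Item it) {
    assert(st != NULL && it != NULLitem);
    Node **link = &st->items;
    while (*link != NULL) {
        Key k = key((*link)->data);
        if (key(it) < k) link = &(*link)->left;
        else if (key(it) > k) link = &(*link)->right;
        else return;
    }
    Node *n = malloc(sizeof(Node));
    assert(n != NULL);
    n->data = it;
    n->left = n->right = NULL;
    *link = n;
    st->count++;
}

// Hypothetical helper: height in nodes (an empty tree has height 0).
int height(Node *n) {
    if (n == NULL) return 0;
    int l = height(n->left), r = height(n->right);
    return 1 + (l > r ? l : r);
}
```

Inserting the keys 1..5 in sorted order produces a degenerate tree of height 5, which behaves like a linked list. Inserting them in the order 3, 1, 4, 2, 5 gives height 3.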