CS215 - Lec 9 indexing and reclaiming space in files

Page 1: CS215 - Lec 9  indexing and reclaiming space in files
Page 2: CS215 - Lec 9  indexing and reclaiming space in files

• Maintaining indexes.

• Adding a data record with indexing.

• Deleting a data record with indexing.

• Reclaiming space.

• Multilevel index.

Dr. Hussien M. Sharaf

Page 3: CS215 - Lec 9  indexing and reclaiming space in files


Structure of Indexes

• Indexes must be sorted in ascending or descending order on one or more fields.

Index:

CompanyName    Offset
Google         211
IBM            0
ITE            643
Microsoft      462
Apple Mac      985

Data file: Record1\n  Record2\n  Record3\n  Record4\n  New record\n
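A lookup through such an index can be sketched as follows. This is a minimal sketch, not the lecture's own code: it assumes the index is held in memory as a sorted list of (key, offset) pairs and that each data record ends with a newline. The names find_offset and read_record are hypothetical.

```python
import bisect
import io


def find_offset(index, key):
    """Binary-search a sorted list of (key, offset) pairs.

    Returns the offset stored for `key`, or None if the key is absent.
    """
    # (key,) sorts just before any (key, offset) pair with the same key.
    i = bisect.bisect_left(index, (key,))
    if i < len(index) and index[i][0] == key:
        return index[i][1]
    return None


def read_record(data_file, offset):
    """Seek to `offset` and read one newline-terminated record."""
    data_file.seek(offset)
    return data_file.readline().rstrip(b"\n")


# Hypothetical two-record data file, kept in memory for illustration.
data = io.BytesIO(b"IBM,1\nGoogle,2\n")
index = [("Google", 6), ("IBM", 0)]  # sorted on the key field
record = read_record(data, find_offset(index, "Google"))
```

The index stays small because it holds only keys and offsets, so the binary search runs in memory and the data file is touched by a single seek.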

Page 4: CS215 - Lec 9  indexing and reclaiming space in files


Operations needed for an index:

1. Create an index in memory by looping over all records in the original data file.

2. If there is an index file, load it into memory before using it.

3. Write the index to a file when the program closes.
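The three operations above might look like this in Python. It is a sketch under stated assumptions: records are newline-terminated, the key is the text before the first comma, and the index file stores one tab-separated key and offset per line. None of these format details come from the slides, and the function names are hypothetical.

```python
import io


def build_index(data_file):
    """Operation 1: scan every record once and return sorted (key, offset) pairs."""
    index = []
    data_file.seek(0)
    while True:
        offset = data_file.tell()      # offset of the record about to be read
        line = data_file.readline()
        if not line:                   # empty read means end of file
            break
        key = line.split(b",", 1)[0].decode()
        index.append((key, offset))
    index.sort()
    return index


def load_index(index_file):
    """Operation 2: read an index saved earlier back into memory."""
    index = []
    for line in index_file:
        key, offset = line.rstrip("\n").split("\t")
        index.append((key, int(offset)))
    return index


def save_index(index, index_file):
    """Operation 3: write the in-memory index out, one entry per line."""
    for key, offset in index:
        index_file.write(f"{key}\t{offset}\n")
```

A program would call build_index only when no index file exists, and otherwise call load_index at start-up and save_index on exit.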

Page 5: CS215 - Lec 9  indexing and reclaiming space in files


Once the index is loaded into memory, the following operations are needed:

1. Add: append the data record to the data file and insert an index record at the correct position.

2. Delete: mark the record in the data file as deleted and delete the related record from the index.

3. Deleting and updating data records requires updating the offsets of all index records. Is the same true when adding a data record?

Page 6: CS215 - Lec 9  indexing and reclaiming space in files


Data records: R1, R2, R3, R4, R5

Index on Name: R4, R3, R2, R5, R1

Index on Phone: R2, R3, R1, R4, R5

Page 7: CS215 - Lec 9  indexing and reclaiming space in files


Data records on disk: R1, R2, R3, R4, R5

Name index in RAM: R4, R3, R2, R5, R1

Phone index in RAM: R2, R3, R1, R4, R5

New record R6 appears three times in the figure.

Page 8: CS215 - Lec 9  indexing and reclaiming space in files


1. Go to the end of the data file and get the current offset.

2. Append the data record to the end of the data file.

3. Build an index entry from the offset and key of the new data record: (offset, key).

4. Insert the new index entry at its correct position in the sorted index list.

5. At the end of the program, save the index list to disk.

Page 9: CS215 - Lec 9  indexing and reclaiming space in files


1. Search for the index entry by comparing the target value with the key field value.

2. Mark the index entry as deleted.

3. Get the offset of the target data record.

4. Seek to the target offset and mark the data record as deleted.

NOTE: The data record is not physically removed at this point. A space-reclaiming function must run to free its space.
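A possible sketch of this deletion procedure follows. Two choices here are assumptions, not taken from the slides: a record is marked deleted by overwriting its first byte with "*", and the index entry is dropped from the in-memory list instead of being flagged. The name delete_record is hypothetical.

```python
import bisect
import io

DELETED = b"*"  # hypothetical deletion marker written over the record's first byte


def delete_record(data_file, index, key):
    """Mark the record for `key` as deleted; return False if the key is absent."""
    # Step 1: binary-search the sorted (key, offset) list for the key.
    i = bisect.bisect_left(index, (key,))
    if i == len(index) or index[i][0] != key:
        return False
    # Steps 2-3: remove the index entry and keep its offset.
    _, offset = index.pop(i)
    # Step 4: seek to the record and overwrite its first byte with the marker.
    data_file.seek(offset)
    data_file.write(DELETED)
    return True
```

The file keeps its length, so every other offset in the index remains valid. The marked record still occupies space until the file is compacted.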

Page 10: CS215 - Lec 9  indexing and reclaiming space in files


Data records on disk: R1, R2, R3, R4, R5

Name index in RAM: R4, R6, R2, R5, R1

Phone index in RAM: R2, R6, R1, R4, R5

Other labels in the figure: R6 (once), R3 (twice).

Page 11: CS215 - Lec 9  indexing and reclaiming space in files


A. Create a new file stream.

B. While not at the end of the records:

   1. Read a collection of records into a buffer.

   2. For each record in the buffer:

      • If the record is marked deleted, go to the next record.

      • Else copy the record to the new file stream.

C. End While

D. Rebuild all indexes based on the new data file.

NOTE: Buffering is used while copying data to the new stream.
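The procedure above can be sketched in Python as follows. The sketch assumes newline-terminated records, a "*" in the first byte as the deletion marker, and the text before the first comma as the key, none of which is specified in the slides. For brevity it rebuilds a single index during the copy instead of in a separate pass over the new file. The names reclaim_space and batch_bytes are hypothetical.

```python
import io

DELETED = b"*"  # hypothetical marker in the first byte of a deleted record


def reclaim_space(old_file, new_file, batch_bytes=4096):
    """Copy live records to `new_file` and return a rebuilt (key, offset) index."""
    index = []
    old_file.seek(0)
    while True:
        # Step B.1: read a batch of whole records into a buffer.
        batch = old_file.readlines(batch_bytes)
        if not batch:
            break
        # Step B.2: skip deleted records, copy the rest.
        for record in batch:
            if record.startswith(DELETED):
                continue
            offset = new_file.tell()   # the record's offset in the new file
            new_file.write(record)
            key = record.split(b",", 1)[0].decode()
            index.append((key, offset))
    index.sort()                       # step D: index rebuilt for the new file
    return index
```

All offsets change once deleted records are dropped, which is why the indexes have to be rebuilt instead of patched.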

Page 12: CS215 - Lec 9  indexing and reclaiming space in files


• When an index gets very big, it cannot be stored in RAM.

• It should be stored in a file, hence another level of index that can be loaded into memory is required.

• Hence we need multiple levels of indexing.
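A two-level version of this idea can be sketched as follows. The sketch rests on assumptions: the full index is split into fixed-size blocks, with Python lists standing in for blocks stored in a file, and the top level keeps only the first key of each block. The function names are hypothetical.

```python
import bisect


def split_into_blocks(index, block_size):
    """Cut a sorted (key, offset) list into blocks of at most `block_size` entries."""
    return [index[i:i + block_size] for i in range(0, len(index), block_size)]


def build_top_level(blocks):
    """The top level keeps one key per block: the block's smallest key."""
    return [block[0][0] for block in blocks]


def lookup(top_level, blocks, key):
    """Search the small top level first, then only the one block it points to."""
    b = bisect.bisect_right(top_level, key) - 1
    if b < 0:                      # key sorts before every indexed key
        return None
    block = blocks[b]              # with a file-based index, one block read
    i = bisect.bisect_left(block, (key,))
    if i < len(block) and block[i][0] == key:
        return block[i][1]
    return None


index = [("Apple Mac", 985), ("Google", 211), ("IBM", 0),
         ("ITE", 643), ("Microsoft", 462)]
blocks = split_into_blocks(index, 2)
top_level = build_top_level(blocks)
```

Only top_level must fit in memory. If it also grows too large, the same split can be applied again to add another level.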

Page 13: CS215 - Lec 9  indexing and reclaiming space in files


• The level 4 index can be loaded into memory.

Page 14: CS215 - Lec 9  indexing and reclaiming space in files