Data and its manifestations. Storage and Retrieval techniques.
Uploaded by gary-patrick
Data and its manifestations.
Storage and Retrieval techniques.
What is Data? Numbers, text, sentences, files, images, audio files.
One way to store data: columns and rows of data can easily be entered.
Disadvantages:
• Difficult to look for data
• Security
• Multiple files are not related to each other
Excel File
Data Redundancy
Data Inconsistency
Excel File
Bit
Byte
Field
Record
File
Database
Hierarchy of Data
Primary Keys
Secondary Keys (Alternate Keys)
Foreign Keys (will understand better with reference to a database)
What are Keys
Master Files: the permanent source; data of a permanent nature, which will not change every day.
Transaction Files: used to update a Master file, typically in batch processing.
Serial and Sequential Files
Serial
Sequential
Indexed Sequential
Direct Access (random)
Types of File Organization
SERIAL
Just add records as they come in.
Used for transaction files. Discuss why?
Types of File Organization
SEQUENTIAL
Add records one after another but in key sequence
Used for master files. Discuss why?
Types of File Organization
Direct Access Files
Store the record at an address which is calculated using a reference to the Primary Key
Types of File Organization
Add a record to a Serial File
1. Open file
2. Append record to end of file
Algorithms
Add a record to a Sequential File
1. Open old file for reading
2. Open new file for writing
3. Start from beginning of old file
4. Repeat
   1. Read next record
   2. If new record is not yet inserted and current record key > new record key
   3.   Write new record to new file
   4. End if
   5. Write current record to new file
   Until EOF
5. If new record is not yet inserted then write new record to new file.
Algorithms
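The insertion steps above can be sketched in Python. This is a minimal illustration, not a definitive implementation: the "files" are assumed to be in-memory lists of (key, data) tuples, and the function name insert_sequential is hypothetical.

```python
def insert_sequential(old_records, new_record):
    """Copy old_records into a new list, placing new_record in key order.

    Records are assumed to be (key, data) tuples already sorted by key.
    """
    new_file = []
    inserted = False
    for record in old_records:
        # Write the new record just before the first record with a larger key.
        if not inserted and record[0] > new_record[0]:
            new_file.append(new_record)
            inserted = True
        new_file.append(record)
    # The new key was larger than every existing key: it goes at the end.
    if not inserted:
        new_file.append(new_record)
    return new_file


master = [(10, "Ann"), (20, "Bob"), (40, "Dee")]
updated = insert_sequential(master, (30, "Cy"))
```

The inserted flag ensures the new record is written only once, even though every later record also has a larger key.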
Delete a record from a Serial or Sequential file
1. Open old file for reading
2. Open new file for writing
3. Repeat (read from old file)
   • Read next record
   • If current record key <> key of record to be deleted
   •   then write record to the new file
   • End if
   Until End Of File
Algorithms
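A minimal Python sketch of the deletion steps, assuming records are (key, data) tuples held in a list rather than in a real file; delete_record is a hypothetical name.

```python
def delete_record(old_records, key_to_delete):
    """Copy every record except the one with key_to_delete into a new list."""
    new_file = []
    for record in old_records:
        # Only records whose key differs from the deleted key are copied.
        if record[0] != key_to_delete:
            new_file.append(record)
    return new_file


records = [(10, "Ann"), (20, "Bob"), (30, "Cy")]
remaining = delete_record(records, 20)
```

The same copy-and-skip approach works for serial and sequential files, because the order of the surviving records is preserved.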
Search for a record with a particular key
Serial File
Open file
Repeat (start reading)
  Test for match
Until EOF or match is made
Algorithms
Search for a record with a particular key
Sequential File
Open file
Repeat (start reading)
  Test for match
Until match is found or key of this record > key of wanted record
Note: Because the records are sorted in key sequence, once the current key passes the key of the wanted record, the record can be deemed not found.
Algorithms
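The two searches can be compared side by side. This is a sketch assuming records are (key, data) tuples in a list; the function names are hypothetical.

```python
def search_serial(records, wanted_key):
    """Serial file: records are in arrival order, so read until a match or EOF."""
    for record in records:
        if record[0] == wanted_key:
            return record
    return None


def search_sequential(records, wanted_key):
    """Sequential file: records are in key order, so stop early once the key is passed."""
    for record in records:
        if record[0] == wanted_key:
            return record
        if record[0] > wanted_key:
            break  # every later key is larger still, so the record is not in the file
    return None


serial_file = [(30, "Cy"), (10, "Ann"), (20, "Bob")]
sequential_file = [(10, "Ann"), (20, "Bob"), (30, "Cy")]
```

For a key that is missing, the serial search must read the whole file, while the sequential search can stop at the first larger key.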
Update Sequential Master file with Transaction records
Open a new file and copy records from the sequential master file into it until the key of the first transaction record is reached. Then write that transaction record into the new file. Continue in the same way, writing the remaining records from the master file and the transaction file in key sequence.
Logic
Update a Sequential Master File
Open master file for reading
Open transaction file for reading
Open new master file for writing
Repeat (transaction file records)
  While master record key < transaction record key
    Write master record to new master file
    Read next master record
  End While
  Write transaction record to new master file
Until EOF (transaction)
Repeat (master file records)
  Write master record to new master file
Until EOF (master)
Algorithms
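A minimal Python sketch of the update logic follows. It assumes both files are lists of (key, data) tuples sorted by key, and that every transaction is a new record whose key is not already in the master file; handling amendments or deletions would need extra rules that the slides do not give. The name update_master is hypothetical.

```python
def update_master(master, transactions):
    """Merge sorted transaction records into a sorted master file."""
    new_master = []
    m = 0  # position of the next unread master record
    for trans in transactions:
        # Copy master records that come before this transaction in key order.
        while m < len(master) and master[m][0] < trans[0]:
            new_master.append(master[m])
            m += 1
        new_master.append(trans)
    # Transactions are exhausted: copy the rest of the master file.
    new_master.extend(master[m:])
    return new_master


old_master = [(10, "a"), (30, "c"), (50, "e")]
batch = [(20, "b"), (40, "d"), (60, "f")]
new_master = update_master(old_master, batch)
```

Each file is read once from start to finish, which is why the transaction file must be sorted into the same key sequence before the run.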
Also called Hash, Random or Relative files.
One hash algorithm could be: every record has a key. Take the key and divide it by the total number of records the file can hold. The remainder is the address where the record is stored.
Direct Access File: how records are stored
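The remainder method can be sketched in one line of Python. The name hash_address and the parameter table_size (the number of record positions in the file) are assumptions made for illustration.

```python
def hash_address(key, table_size):
    """Return the storage address for a key: the remainder after division."""
    return key % table_size


# With 7 positions, keys 15 and 22 both leave remainder 1,
# so they hash to the same address.
address_a = hash_address(15, 7)
address_b = hash_address(22, 7)
```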
This can cause synonyms or collisions.
One way to resolve a collision is to store the record at the next available address. When the highest address is reached, wrap around and continue from address 0.
Direct Access File: managing a collision
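A sketch of the "next available address" approach, assuming the file is modelled as a Python list in which None marks an empty position; the function name store is hypothetical.

```python
def store(table, key, value):
    """Store a record at its hashed address, or at the next free one, wrapping to 0."""
    size = len(table)
    home = key % size
    for step in range(size):
        address = (home + step) % size  # wraps around past the highest address
        if table[address] is None:
            table[address] = (key, value)
            return address
    raise RuntimeError("no free address left in the file")


table = [None] * 5
first = store(table, 4, "a")   # key 4 hashes to address 4
second = store(table, 9, "b")  # key 9 also hashes to 4, so it wraps to address 0
```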
Another method is to have a separate area to store these "collision affected" records.
The address of the record in that separate area is then marked at the original address location.
Direct Access File: managing a collision
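One possible reading of the separate-area method is sketched below. The layout is an assumption, not something the slides specify: each stored entry carries a "next" field that holds the overflow position of the next record sharing the same home address. All names are hypothetical.

```python
def store_with_overflow(main, overflow, key, value):
    """Store in the main area if the home address is free, else in the overflow area."""
    home = key % len(main)
    entry = {"key": key, "value": value, "next": None}
    if main[home] is None:
        main[home] = entry
        return ("main", home)
    overflow.append(entry)
    new_index = len(overflow) - 1
    # Follow the chain from the home address and mark the new address at its end.
    current = main[home]
    while current["next"] is not None:
        current = overflow[current["next"]]
    current["next"] = new_index
    return ("overflow", new_index)


main_area = [None] * 5
overflow_area = []
r1 = store_with_overflow(main_area, overflow_area, 3, "a")
r2 = store_with_overflow(main_area, overflow_area, 8, "b")   # collides with key 3
r3 = store_with_overflow(main_area, overflow_area, 13, "c")  # collides again
```

A lookup would start at the home address and follow the "next" markers until the wanted key is found or the chain ends.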
Should retrievals be fast?
Should information be up to date, or is that not necessary?
Can information be batched?
Do reports need to be in order?
What happens when information is lost or destroyed?
What kind of Files to use and When?
Hit rate is the proportion of records accessed in any one run.
It is calculated by dividing the number of records accessed by the total number of records on file, expressed as a percentage.
If the hit rate is low, direct access is better. If it is high, sequential access is acceptable.
Payroll processing has a high hit rate; updating addresses has a low hit rate.
Hit Rate
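As a worked example of the calculation, here is a short Python sketch. The record counts are made-up figures for illustration only.

```python
def hit_rate(records_accessed, total_records):
    """Return the hit rate as a percentage of the records on file."""
    return 100 * records_accessed / total_records


# Hypothetical runs against a 1000-record file.
payroll_run = hit_rate(950, 1000)         # most records touched: high hit rate
address_update_run = hit_rate(20, 1000)   # few records touched: low hit rate
```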
Data Security is keeping data safe from the various hazards to which it may be subjected.
Protection against loss, corruption, or unauthorized access to data.
Data Security
1. Use of passwords.
2. Immediate removal of employees who have been handed the pink slip/sacked.
3. Educating staff on ways data can be breached.
4. Separation of duties and having different access levels.
5. Appointing a security manager.
How to keep data secure
Keep passwords and user ids in a safe place – database tables.
Keep passwords encrypted.
Passwords should not be displayed on screens or on printouts. They should be suppressed.
User Ids and Passwords
Data is encrypted so that data transmitted to remote locations is secure from hackers and wire tappers.
If a line is tapped and the security of the data is compromised in any way, there is no limit to the damage that can occur.
There are many encryption algorithms available, including ones that use encryption keys.
Encrypting data
What do you mean by Access Rights? The right to see some or all information.
Access rights are implemented by having a levelled security structure, in which people at a certain level can see certain data, or even only certain fields.
Access Rights
Needed to prevent loss of data due to a disaster
Protects against power failures, theft, viruses
Backup recovery should be properly tested before implementation
Sometimes replication is implemented in an organization to keep backups up to date
Backups taken on disks are transferred to remote locations to prevent major disasters
Backups
The difference between archiving and backing up should be clear.
What is Archiving ?
Archiving
A binary digit (1 or 0) is known as a bit.
8 bits make up a byte.
One character can be represented as one byte.
Data Representation
How do I represent decimal 102 as a binary number?
Write out the place values:
64 32 16  8  4  2  1
Starting from the left, put in a 1 wherever the place value fits into what remains, and a 0 elsewhere:
64 32 16  8  4  2  1
 1  1  0  0  1  1  0
Denary to Binary number conversion
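The place-value method above can be sketched in Python. The function name denary_to_binary and the default width of 7 bits (matching the 64 to 1 place values shown) are assumptions.

```python
def denary_to_binary(number, bits=7):
    """Convert a denary number to a binary string using place values."""
    digits = []
    for position in range(bits - 1, -1, -1):
        place_value = 2 ** position  # 64, 32, 16, 8, 4, 2, 1 for 7 bits
        if number >= place_value:
            digits.append("1")
            number -= place_value  # what remains to be represented
        else:
            digits.append("0")
    return "".join(digits)


result = denary_to_binary(102)
```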
Consider 1 1 0 0 1 1 0
Start from the right and give each digit a place value: 1, 2, 4, 8 and so on.
Multiply each place value by its digit (1 or 0, as the case may be) and add the results together: 64 + 32 + 4 + 2 = 102.
Binary to Denary
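The reverse conversion can be sketched the same way; binary_to_denary is a hypothetical name, and the input is assumed to be a string of 0s and 1s.

```python
def binary_to_denary(bits):
    """Convert a binary string to denary by summing place values."""
    total = 0
    place_value = 1  # the rightmost digit has place value 1
    for digit in reversed(bits):
        total += int(digit) * place_value
        place_value *= 2  # 1, 2, 4, 8 and so on, moving left
    return total


value = binary_to_denary("1100110")
```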
Raw data is a collection of numbers and characters stored in a particular way so as to be able to read it later.
Information is what can be derived from the stored data: a communication that provides understandable and useful knowledge to the recipient.
Data and Information
4 bit representation of a decimal digit
E.g. 20 in BCD would be 0010 0000
Advantages:
• Easier to convert. Just split into groups of 4 bits and convert each group to a decimal digit.
• In BCD arithmetic, rounding of fractions does not occur. In normal binary arithmetic some kind of rounding off occurs.
What is BCD Binary Coded Decimal
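A short Python sketch of BCD encoding and decoding for non-negative whole numbers; the function names are hypothetical.

```python
def to_bcd(number):
    """Encode each decimal digit of a non-negative integer as a 4-bit group."""
    return " ".join(format(int(digit), "04b") for digit in str(number))


def from_bcd(bcd):
    """Decode space-separated 4-bit groups back into an integer."""
    return int("".join(str(int(group, 2)) for group in bcd.split()))


encoded = to_bcd(20)
decoded = from_bcd("0001 1001")
```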
1. More bits are required to store a number.
2. Calculations with BCD are more complex than with ordinary binary.
3. Consider adding 1 and 19:
     0000 0001
   + 0001 1001
   = 0001 1010
   This is not correct: 1010 is not a valid BCD digit.
Disadvantages of BCD
This problem occurs because 9 is represented as 1001, after which the next 6 binary patterns (1010 to 1111) are unused. So we need to add 6 (0110) to this result.
  0001 1010
+ 0000 0110
= 0010 0000
which is 20, the correct result.
Disadvantage of BCD
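The add-6 correction can be sketched digit by digit in Python. This assumes each number is given as a list of decimal digits of equal length, most significant first; bcd_add is a hypothetical name.

```python
def bcd_add(a_digits, b_digits):
    """Add two BCD numbers, applying the add-6 correction to each 4-bit group."""
    result = []
    carry = 0
    # Work from the rightmost (least significant) digit.
    for a, b in zip(reversed(a_digits), reversed(b_digits)):
        group = a + b + carry
        if group > 9:
            group += 6  # skip the six unused patterns 1010 to 1111
        carry = group >> 4              # anything above 4 bits carries over
        result.append(group & 0b1111)   # keep the low 4 bits as the digit
    if carry:
        result.append(carry)
    result.reverse()
    return " ".join(format(digit, "04b") for digit in result)


total = bcd_add([0, 1], [1, 9])  # 01 + 19
```

The extra test and correction on every digit is part of why BCD calculations are more complex than ordinary binary.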
End of DATA and its
REPRESENTATIONS