Post on 03-Jun-2018
8/12/2019 Lecture 7 and 8 Databases and Information Management
1/33
8/12/2019 Lecture 7 and 8 Databases and Information Management
2/33
The need for a database
solution
Organisations have to collate, process and
output vast amounts of data and information
Manual or multiple system options not viablewhen dealing with possibly millions of
customers in a global market
Need for accessible informationtouch of a
button/key Need for up-to-date information
Need for information security
8/12/2019 Lecture 7 and 8 Databases and Information Management
3/33
The need for structure Our ability to use data improves when we
introduce more structure in how data is stored. The need to go through all the data to reach a
specific item or customer is now not arequirement.
The more the structure, the easier to sift or minethrough large amounts of data.
Think of a structure as an address. The moreprecise the address we have, the easier it is to find
the associated data. Similarly the structure of the data tells the system
where to look for data and avoid unnecessaryeffort.
8/12/2019 Lecture 7 and 8 Databases and Information Management
4/33
Structure example Ever wondered why an Internet search can
return the addresses of home pages thatcontain a word so quickly.
The answer is because the addresses arestored under a list of words arrangedalphabetically.
When you enter a word, the computer jumpsto the appropriate row of the Table without
any processing of information in other rows. Thus, it is possible to peruse very large
databases in a fraction of a second.
8/12/2019 Lecture 7 and 8 Databases and Information Management
5/33
Database models and
structures
There are four main types of databases:
flat
relational
hierarchical
distributed
8/12/2019 Lecture 7 and 8 Databases and Information Management
6/33
Flat structure databases Flat data models have the least amount of
structure. They typically take the form of onelarge table, where the first row is the list of thevariables and subsequent rows are data
Each case has a row of data
Many statisticians use flat models
Advantages
Most software include free access to flat datafiles. For a small number of cases, flatdatabases do a reasonably fast job
8/12/2019 Lecture 7 and 8 Databases and Information Management
7/33
Flat structure databases
Disadvantages
Flat databases waste computer storageas it is required to keep information onitems that logically cannot be available.
For example, if information onpostcodes/zip codes is required, flat files
require us to enter missing information forzip codes in foreign countries.
8/12/2019 Lecture 7 and 8 Databases and Information Management
8/33
Flat structure databases
Hierarchical and structural databasesavoid this problem by defining different
tables for classes of countries in whichpost/zip code is not available.
Thus flat files keep large sparse data full ofmissing information.
When the size of database is large, searchthrough the data takes a long time
8/12/2019 Lecture 7 and 8 Databases and Information Management
9/33
Flat structure databases Flat databases are not conducive to complicated
search queries that divide the database further.
For example, in a flat file about students it wouldbe difficult to find all students that live in a certainpost/zip code that have received a grade C.
Such a query will require repeated pass throughthe data.
First pass may identify all students who live in a
post/zip code, second pass may identify allstudents who have a grade C and the third passmay find students in both groups
8/12/2019 Lecture 7 and 8 Databases and Information Management
10/33
Flat structure databases
Such multiple passes through data areinefficient and take a long time:
Since every simple ifstatement (ifpost/zipcode is equal to SE1 then ...) takes afraction of a secondeach subsequentsearch accumulates more time
Efficient search methods are important forlarge databases
8/12/2019 Lecture 7 and 8 Databases and Information Management
11/33
Flat structure database
example
Student ID Name Mid termgrade
Finalgrade
Address Post/zipcode
... .....
4561 Alison Safaie B A 13 Manor Park SE1 ... ... ..
7878 Mike Smith C B 24 Curzon Street NW14 ... ... ..
8954 SherifMohamed
A C 21a High Street NW14 ... ... ..
Table 1: Example of a flat file
8/12/2019 Lecture 7 and 8 Databases and Information Management
12/33
8/12/2019 Lecture 7 and 8 Databases and Information Management
13/33
Relational databases This information in essence says that the patient ID
number, name and medical record numberbelong to the same person.
To effectively store both information items and therelationships among these items, relational data iskept in table formats.
The first row of the table shows the name of thevariables and subsequent rows are data. All itemsin the same row usually belong to the same case
and are related. One column of the table istreated as the key to the table. Numbers orcharacters in this column corresponds to items inother tables. Thus it is possible to move from onetable to another.
8/12/2019 Lecture 7 and 8 Databases and Information Management
14/33
Relational databases
StudentID
Keycolumn
Name Mid-term
Final
4561 AlisonGhadiri
B A
7878 Mike Smith C B
8954 SherifMohamed
A C
Table 2: Table for "Students grades"
StudentID
Keycolumn
Address Post/zip code
4561 13 Manor Park SE1
7878 24 CurzonStreet
NW14
8954 21a HighStreet NW14
Table 3: "Students' contact information
ExampleHere is an example of a Table ofgrades for the example introducedunder flat files
This is an example of an additional tablefor contact information:
8/12/2019 Lecture 7 and 8 Databases and Information Management
15/33
Relational database queries
When a query is made, the relationaldatabase searches through its tables to find
the answer. Often the answer involves information pooled
together from different tables.
For example, a query for names of peoplewith grade of A, can be answered from the
table of grades. The query for grades ofpeople in post/zip code area NW14 must beanswered from both tables.
8/12/2019 Lecture 7 and 8 Databases and Information Management
16/33
8/12/2019 Lecture 7 and 8 Databases and Information Management
17/33
Hierarchical databases Hierarchical database models are one type
of relational data models based on a parent-
child relationship Hierarchical database models resemble al
tree structure
Example
The file directory on your desktop is an
example of a hierarchical database system. A folder may contains other folders which may
contain other files, which contain data.
8/12/2019 Lecture 7 and 8 Databases and Information Management
18/33
Hierarchical databases
advantages
In hierarchical models, children inherit therelationships and characteristics of theirparents.
An operation on the parent affects thechildren; if you tell the computer to do anoperation on a folder, the operation affectsall the folders/children and files that itcontains.
For example, if you delete the top folder, youwould delete all of the folders and files itcontains. This feature saves time for theperson maintaining the files.
8/12/2019 Lecture 7 and 8 Databases and Information Management
19/33
Distributed databases Most databases physically reside in one
place. They may be backed up to another placebut the elements of the database are notmaintained in different places.
In a distributed database, data is kept in differentsettings and on different computers. One orseveral central computers maintain indexes towhere the data is. Using the address of the data
computers can then communicate and find theinformation needed.
Distributed databases need not only addresses forwhere the data is but they also need an audit trailof who has updated data or retrieved it.
8/12/2019 Lecture 7 and 8 Databases and Information Management
20/33
Distributed databases Audit data is needed in order to pinpoint errors in
the system and in order to understand whereconfidentiality of the system breaks down.
When a computer requests data from another, anaudit trail is created by storing who sent the datawhere and when.
When this computer passes the data to another,the information needs to be updated in theoriginal computer.
As the number of computers receiving the dataincreases the task of auditing becomes moredifficult.
8/12/2019 Lecture 7 and 8 Databases and Information Management
21/33
Distributed database example
A good example of a distributed database isWorld Wide Web pages. These pages of data arekept on a different computers, often referred to asWeb servers.
The address of each file is the location addressyou enter when you want to see the page. Thislocation address is an index to the Web pages.
Centralised computers keep the beginning of
these addresses, called domains. Subsequent detailed addresses are kept at the
Web servers.
8/12/2019 Lecture 7 and 8 Databases and Information Management
22/33
Distributed database searches
When searching a distributed database, two stepsmust occur.
First a program, sometimes called crawler, mustindex the content of the databases, then anotherprogram, often called a search engine, wouldsearch the index for your request.
When a match is found, the index is used to findthe address of the information. This address is
provided to the engine that assembles a list foryou.
When you click on the items identified by thesearch engine, you use the address to retrieve thedata items it has found.
8/12/2019 Lecture 7 and 8 Databases and Information Management
23/33
Database considerations for
an organisation
In terms of database management anorganisation has to consider the following:
How to store their datawhichmodel/approach
Data security
Cost of a DBMS (database management
system) initial and maintenanceCentralised or de-centralised approach
8/12/2019 Lecture 7 and 8 Databases and Information Management
24/33
Data considerations for an
organisation What suits there needs, size and works with their
business processes and functions How is the data going to be stored on a single
DBMS or distributed across a number of databases(centralised v decentralised approach)
How to protect the data from internal andexternal threatshackers, viruses, natural disasters,incompetence of end users
Cost of design, implementation, management of
the systemneed for a database manager oradministrator(s)
Decentralised or centralised approach (see nextslides)
8/12/2019 Lecture 7 and 8 Databases and Information Management
25/33
Decentralised or centralised?
Decentralised databases exchange files andtherefore may exchange corrupted files or
viruses that may affect the entiresystem. Security of these databases aredifficult to maintain.
In decentralised databases the type of datato be exchanged, the process of addressingthe data, and the protocol for updating thedata must be agreed upon ahead of timeand plans must be in place for updating theprocess.
8/12/2019 Lecture 7 and 8 Databases and Information Management
26/33
8/12/2019 Lecture 7 and 8 Databases and Information Management
27/33
8/12/2019 Lecture 7 and 8 Databases and Information Management
28/33
Data-less information systems A data-less information system contains no
centralised data and relies on distributeddatabases.
The Data-less Information System relies on rapidcommunication to construct the data at the pointof need for the data.
A data-less information system also does notrequire any additional hardware; instead it relieson the hardware of existing organisations and
institutions. e.g. Wavenet relying on weather and climate
systems around the world
8/12/2019 Lecture 7 and 8 Databases and Information Management
29/33
Components of a data-less IS
There are three components to a data-less information system:
Decoder
Communicator
Analysis
8/12/2019 Lecture 7 and 8 Databases and Information Management
30/33
Decoder This software resides in the database of
participating organisations. It reads andanalyses the structure of the other systems
records. It uses known characteristics of the data,
national and international standard of data,and the supplied format of the data todecipher the format and structure of thedata.
When it is finished understanding the structureof the data, it sends a message to other unitsacknowledging participation
8/12/2019 Lecture 7 and 8 Databases and Information Management
31/33
Communicator This software resides in the database of participating
organisations systems. I
It verifies any consent that is required (if required e.g.
patient medical records), the nature of data to becollected, and the systems suggestion about whereto look for the data.
This software contacts all sites that have a reportingdecoder and collects the necessary information,including the possibility of different possibleinterpretation of the same data at different sites. The
data collected by the software is for one time useand would be erased after the use.
8/12/2019 Lecture 7 and 8 Databases and Information Management
32/33
8/12/2019 Lecture 7 and 8 Databases and Information Management
33/33