Lesson I Database Management Systems - XTEC

Lesson I

Database Management Systems

IN THIS LESSON YOU WILL LEARN…

• The concept and importance of information.

• What an information system is and its components.

• The structure of data files and the drawbacks of file-based systems.

• What a database is and the advantages of using a database management system.

• The meaning of data model and a list of the main data models in use.

• The three-layer architecture for databases.

• The different users of a database.

1. Information

2. Information Systems

3. File-based IS

4. Database Management Systems (DBMS)

5. Data Modelling

6. ANSI/SPARC Architecture

Information

1. INFORMATION

1.1 Introduction

1.2 Importance

1.3 Information vs Computer Science

1.4 Quality of Information

1.5 Properties of Information

INFORMATION Introduction

• Nowadays, the amount of information we are exposed to is huge.

e.g. weather forecasts, horoscopes, train timetables...

• All areas of human development need to manipulate information.

• This information is changing constantly.

Why is information so important?

INFORMATION Importance

• In our society, information has become a very important resource.

• In addition, all organisations need to save and access data.

Relevant and timely information is essential to make decisions.

INFORMATION Information vs Computer Science

• In Catalan and Spanish, the term for computer science comes from the French word:

informatique = information + automatique

That is what we are interested in!!

INFORMATION Information vs Data

• Data are the record of facts. They may be only a series of numbers or characters with no meaning.

• Information implies that the data have been processed so that they mean something to the recipient.
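The difference between data and information can be illustrated with a small sketch. The readings, the label and the `summarise` helper are all made up for this example; the point is only that the same numbers become meaningful once they are processed and given context.

```python
from statistics import mean

# Raw data: a bare series of numbers with no stated meaning (hypothetical values).
readings = [18.5, 21.0, 19.5, 23.0]

def summarise(values, label, unit):
    """Turn raw values into a statement a recipient can act on."""
    return f"{label}: average {mean(values):.1f} {unit}, maximum {max(values):.1f} {unit}"

# Information: the same data, processed and placed in context.
report = summarise(readings, "Temperature this week", "°C")
print(report)
```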

Activity 1.1 Information vs Data

INFORMATION Quality of Information

• Information is considered high-quality if it allows the recipient to make the best decision.

• The quality of the information is more important than its quantity.

• There is a set of properties that allow you to assess the quality of the information.

INFORMATION Properties of Information

- reported in time

- understandable

- up to date

- reliable to the user

- safe

- relevant

- accurate

- complete

- transmitted to the right person

It’s not possible to get all of them!!

Activity 1.2 Quality of Information

Information Systems

2. INFORMATION SYSTEMS

2.1 Concept

2.2 Components

2.3 Connection between Components and Objectives

2.4 Analysis and Design

2.5 Automated Information Systems

INFORMATION SYSTEMS Concept

• Companies and organisations have an infrastructure that allows information to be manipulated quickly and easily.

The organisation’s INFORMATION SYSTEM (IS)

INFORMATION SYSTEMS Components

• An information system is made up of:

– information

– users

– physical support

– working procedures

Can you guess what each of these components is?

Who decides on the information system?

INFORMATION SYSTEMS Information

• It’s the most important component of the system.

Which information must be stored?

INFORMATION SYSTEMS Users

• People who insert, manipulate or use the information to carry out their tasks.

Who is manipulating the information?

INFORMATION SYSTEMS Physical Support

• All devices used to communicate, process and store information.

Which physical elements will be used to manipulate the information?

Are computers part of the physical support in all IS?

INFORMATION SYSTEMS Physical Support

• It’s possible to find information systems that only use basic tools to manipulate information.

• Nowadays computers have replaced those old tools.

Although they are not essential!!

(Figure label: data file)

• We are interested in IS that do use computers to manipulate information.

Database systems are the heart of all current IS because they make it possible to manage large amounts of data.

INFORMATION SYSTEMS Working Procedures

• The IS must help to achieve the general objectives of the company.

• So, the managers decide on a set of working guidelines that they consider the most efficient and useful.

How should things be done?

INFORMATION SYSTEMS Connection between Components and Objectives

• The working procedures determine:

– necessary information

– involved users

– necessary physical support

• The working procedures also have to adapt to the available elements:

– information

– characteristics of the users

– existing technology

What is the role of a database expert in a non-automated IS?

Activity 1.3

INFORMATION SYSTEMS Analysis and Design

(Diagram labels: databases, application programs, information system)

Activity 1.4 Identifying the components of an IS

INFORMATION SYSTEMS Automated Information Systems

• An automated IS is made up of the following elements:

– information

– working procedures

– users

– software

– hardware

– administrator

Is it the only physical support used?

File-based Information Systems

3. FILE-BASED IS’s

3.1 Files

3.2 Concept

3.3 Separate Application Programs

3.4 Drawbacks

3.5 Database

FILE-BASED INFORMATION SYSTEMS Files

• The computer’s RAM is volatile.

• A file is a high-level structure provided by the operating system that keeps data in mass storage.

• Different types of files store different types of information.

• We are interested in data files and database files.

FILE-BASED INFORMATION SYSTEMS Concept

• Data files are used to store the information.

• In the same company or organisation, there are several programs that perform services for the end users.

• Each program defines and manages its own data.

FILE-BASED INFORMATION SYSTEMS Separate Application Programs

SEPARATE PROGRAMS Example

• Consider two departments in our college:

- admissions office

- personnel

• Both of them need to keep information about the teachers, but the information required is slightly different.

This approach has several limitations!!

FILE-BASED INFORMATION SYSTEMS Drawbacks

- Data dependence

- Incompatible file formats

- Security problems

- Data redundancy …

What are these disadvantages about?

FILE-BASED IS DRAWBACKS Data Dependence

• If the structure of the data has to be modified, then the application program has to be rewritten.
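A minimal sketch of data dependence, assuming a fixed-width record layout invented for this example (id in 3 characters, name in 10, city in 3). The parsing program has the layout hard-coded, so widening one field silently breaks it.

```python
def parse_teacher(record: str) -> dict:
    """Read one record using offsets that are hard-coded in the program."""
    return {
        "id": int(record[0:3]),
        "name": record[3:13].strip(),
        "city": record[13:16],
    }

# Record written with the original layout: name field is 10 characters wide.
old_record = f"{12:>3}{'Fox':<10}{'BCN'}"
print(parse_teacher(old_record))

# The file layout changes: the name field grows to 20 characters.
new_record = f"{12:>3}{'Fox':<20}{'BCN'}"
# The unchanged program now reads the city from the wrong offset.
print(parse_teacher(new_record))
```

The program has to be rewritten with the new offsets before it can read the file correctly again.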

FILE-BASED IS DRAWBACKS Incompatible File Formats

• The structure of the file depends on the application programming language.

e.g. the structure of the files generated by C and Visual Basic may be different.

• The incompatibility of such files makes it difficult to process them together.

FILE-BASED IS DRAWBACKS Security Problems

• There are no mechanisms to control the permissions of different users.

• In case of a system crash, it also becomes hard to recover the data to a consistent state.

FILE-BASED INFORMATION SYSTEMS Data Redundancy

• Data redundancy means duplication of data.

• Data redundancy leads to:

- wasted storage space

- a more laborious updating process

- loss of data integrity!!

LOSS OF DATA INTEGRITY Example

• Let’s imagine that one of the teachers moves to another address:

BUT...

Files with contradictory information!!
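The situation can be sketched with two dictionaries standing in for the files kept by the admissions office and by personnel. The records, the `salary_band` field and the `inconsistent_ids` helper are hypothetical and only serve the illustration.

```python
# Two departments keep their own copy of the teacher data (hypothetical records).
admissions_file = {25: {"name": "McKewan", "address": "BCN"}}
personnel_file = {25: {"name": "McKewan", "address": "BCN", "salary_band": "B"}}

# The teacher moves, but only the personnel program is told about it.
personnel_file[25]["address"] = "GRN"

def inconsistent_ids(file_a, file_b, field):
    """Return the ids whose value for `field` differs between the two files."""
    return sorted(
        key for key in file_a.keys() & file_b.keys()
        if file_a[key][field] != file_b[key][field]
    )

conflicts = inconsistent_ids(admissions_file, personnel_file, "address")
print(conflicts)
```

Nothing in either program detects the contradiction by itself; it only shows up when the two files are compared.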

Which of the previous disadvantages is the worst?

To solve the problems mentioned before...

database systems

A database is a well-organised collection of data that is related in a meaningful way.

How do database systems solve the problems mentioned?

DATABASE Definition

• All the data needed by the organisation are in the database.

• All users access a single place to get the information they need.

DATABASE Integrated Information

Database Management Systems

4. DBMS

4.1 Concept

4.2 Data Access

4.3 Advantages

4.4 DBMS vs File-based Systems

4.5 Drawbacks

4.6 Elements

4.7 Languages

4.8 Users

DATABASE MANAGEMENT SYSTEM Concept

• A DBMS is a collection of programs that permits the user to create and manipulate databases.

• The DBMS provides the interface between the database and the programs that access the data.

Do you know any DBMS? e.g. Access, MySQL, Oracle, PostgreSQL, ...

DBMS Data Access

What are the advantages of having all the data in a single place?

DBMS Advantages

- Data dependence → Data independence

- Incompatible file formats → Data standardization

- Security problems → Security tools

- Data redundancy → Controlled redundancy

ADVANTAGES DBMS Data Independence

• Data independence means independence between application programs and data.

• When the data representation changes, it is not necessary to change the application program.

ADVANTAGES DBMS Data Standardisation

• There’s a greater degree of data standardisation within the organisation.

• The users are obliged to use the same data definitions.

ADVANTAGES DBMS Controlled Redundancy

• Since the data are recorded only once:

– data need less space

– the data updating process is easier

– there’s no data inconsistency!!

ADVANTAGES DBMS Security Utilities

• Access to data can be restricted so that only authorised users may see or manipulate it.

• Backup copies of data need to be made regularly to recover in case of system failure.
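As one possible illustration of a backup utility, the sketch below uses the `backup` method of Python's standard `sqlite3` connections. SQLite is chosen only because it ships with Python; the table and its contents are made up, and in-memory databases are used to keep the example self-contained (a real backup would be written to a separate file or device).

```python
import sqlite3

# Working database (in memory for the sake of the sketch).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE teachers (id INTEGER PRIMARY KEY, name TEXT)")
source.execute("INSERT INTO teachers VALUES (12, 'Fox')")
source.commit()

# Copy the whole database into a second connection.
backup_copy = sqlite3.connect(":memory:")
source.backup(backup_copy)

# Simulate losing data in the working database, then read it from the backup.
source.execute("DELETE FROM teachers")
source.commit()
recovered = backup_copy.execute("SELECT id, name FROM teachers").fetchall()
print(recovered)
```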

DBMS DBMS vs File-based Systems

FILE-BASED SYSTEM | DBMS SYSTEM
Data dependence | Data independence
Incompatible file formats | Data standardization
Data redundancy | Controlled redundancy
Security problems | Backup and recovery utilities; restricted authorized access

Activity 1.5 Advantages of a DBMS

Is there any disadvantage of using a DBMS?

DBMS Disadvantages

• The enterprise may be assuming additional risks in the following areas:

– the cost of using a DBMS

– data integrity

– data quality

– confidentiality, privacy and security

– enterprise vulnerability

Activity 1.6 Disadvantages of a DBMS

What can you do with a DBMS?

DBMS Elements

• A practical database package typically provides utilities for:

- Design and maintenance of database structures

- Formulation of queries

- Design of forms

- Design of reports

- Construction of macros and programs

DBMS Languages

• The DBMS has languages and procedures to communicate with the database.

– Data Definition Language (DDL)

– Data Manipulation Language (DML)

What is the purpose of these languages?

DBMS LANGUAGES Data Definition Language

• It operates on the data structures.

• It is used to:

- define a database

- modify its structure

- destroy it when you no longer need it
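The three uses of a DDL can be sketched with SQL statements run through Python's standard `sqlite3` module. SQLite and the `teachers` table are assumptions made for the example; in SQLite the database itself is created when the connection is opened, so the DDL statements here define, alter and drop a table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def table_names(connection):
    """List the tables currently defined in the database."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]

# Define a structure.
conn.execute("CREATE TABLE teachers (id INTEGER PRIMARY KEY, name TEXT)")

# Modify the structure: add a column.
conn.execute("ALTER TABLE teachers ADD COLUMN address TEXT")
columns = [row[1] for row in conn.execute("PRAGMA table_info(teachers)")]
print(columns)

# Destroy the structure when it is no longer needed.
conn.execute("DROP TABLE teachers")
remaining = table_names(conn)
print(remaining)
```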

DBMS LANGUAGES Data Manipulation Language

• It operates on the data.

• There are four things that you can do with data:

– store the data

– change the stored data

– remove data from the database

– retrieve data from a database
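The four operations correspond to the SQL statements INSERT, UPDATE, DELETE and SELECT. The sketch below runs them with Python's standard `sqlite3` module; the choice of SQLite and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teachers (id INTEGER PRIMARY KEY, name TEXT, address TEXT)")

# Store data.
conn.executemany(
    "INSERT INTO teachers VALUES (?, ?, ?)",
    [(12, "Fox", "BCN"), (25, "McKewan", "BCN"), (80, "Fox", "GRN")],
)

# Change stored data.
conn.execute("UPDATE teachers SET address = ? WHERE id = ?", ("GRN", 25))

# Remove data.
conn.execute("DELETE FROM teachers WHERE id = ?", (80,))

# Retrieve data.
rows = conn.execute("SELECT id, name, address FROM teachers ORDER BY id").fetchall()
print(rows)
```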

Activity 1.7 DBMS Languages

DBMS Users

• A database involves a group of people:

– database designer

– database users

– database administrator

What are these people in charge of?

USERS OF A DBMS Database Designer

• The database designer is responsible for:

– Identifying the data to be stored in the database.

– Choosing an appropriate structure to represent and store the data.

USERS OF A DBMS Database Administrator

• The objectives of the database administrator (DBA) are:

- To control access to the database

- To restore the database to a consistent state after a system failure

- To standardize the use of databases

- To support the development and maintenance of database application programs

- To ensure all the documentation is up to date

USERS OF A DBMS Database Users

• They are people who need information from the database to carry out their tasks.

Application programmers:

Write application programs and interact with the database through a host language like Pascal or C.

End users:

- Specialized end users

- Non-experienced end users

Data Modelling

5. DATA MODELLING

5.1 Concept of Modelling

5.2 Why Modelling?

5.3 Meaning of Data Modelling

5.4 Advantages

5.5 Difficulties

MODELLING Concept

• A model of something is a representation that shares certain relevant features with the original.

e.g. An actual physical scale model

A musical score

A plan of a house

A set of equations

Can you give examples of models?

Why modelling?

MODELLING Why Modelling?

• Models are useful because the characteristics of the real system can be analysed by studying the nature and behaviour of the model.

• Models can give an accurate description.

There’s no ambiguity!!

DATA MODELLING Meaning

• We need techniques that allow us to represent a conceptual view of the information involved in information systems.

• Data models are also concerned with the processing of data and the definition of operations on them.

The real system to be modelled usually refers to a company or organisation.

DATA MODELLING Advantages

• The use of data models provides a better understanding of the nature of information.

• They enable a better design of database systems.

DATA MODELLING Difficulties

• One difficulty in data modelling is the difference between the human view of an IS and the way it has to be implemented within the computer.

• To solve this, we can view the architecture of a database as a series of levels that provide different degrees of abstraction.

ANSI/SPARC Architecture

6. ANSI/SPARC ARCHITECTURE

6.1 Concept

6.2 Levels of Abstraction

6.3 Logical Level

6.4 Physical Level

6.5 Schemas

6.6 Types of Data Models

6.7 Data Independence

ANSI/SPARC DATA MODEL Concept

• Proposed by ANSI/SPARC for database systems.

• The data in a DBMS are represented at three levels of abstraction.

• There’s a schema at each of these levels.

• A schema is the structure of the database, described in a formal language supported by the DBMS.

ANSI/SPARC DATA MODEL Levels of Abstraction

ANSI/SPARC DATA MODEL

To do this we need...

DATA MODELS

DATA MODELS Types

• There are many data models:

– Hierarchical Model

– Network Model

– Relational Model

– Entity-Relationship Model

– Object-Oriented model

– ...

ANSI/SPARC DATA MODEL Logical Level

• The “reality” of the information system is represented to obtain a conceptual schema.

• From this schema, the logical schema describes what data are stored.

• The logical schema contains the organisation of the data into tables and columns.

ANSI/SPARC DATA MODEL

(Diagram labels: ER model, Relational model)

ANSI/SPARC DATA MODEL Physical Level

• It describes how the data relations described in the conceptual schema will be physically stored using a particular DBMS.

ANSI/SPARC DATA MODEL External Level

• It is the highest level of abstraction.

• An external schema is the view that an individual user has of the database.

Users are not allowed to access all the information in the database.
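In SQL-based systems an external schema is often realised as a view. The sketch below assumes SQLite (through Python's `sqlite3`) and a made-up view called `teacher_directory` for a user who should see names but not addresses. SQLite has no user accounts, so the view only illustrates the restricted picture of the data, not the enforcement of permissions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teachers (id INTEGER PRIMARY KEY, name TEXT, address TEXT)")
conn.executemany(
    "INSERT INTO teachers VALUES (?, ?, ?)",
    [(12, "Fox", "BCN"), (25, "McKewan", "BCN")],
)

# External schema for a hypothetical user: names only, no addresses.
conn.execute("CREATE VIEW teacher_directory AS SELECT id, name FROM teachers")

cursor = conn.execute("SELECT * FROM teacher_directory ORDER BY id")
visible_columns = [column[0] for column in cursor.description]
rows = cursor.fetchall()
print(visible_columns, rows)
```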

LOGICAL LEVEL Conceptual Schema

• The real-world information system is represented to obtain a conceptual schema.

CONCEPTUAL SCHEMA Example

Entity-relationship diagram

LOGICAL LEVEL Logical Schema

• From the conceptual schema, the logical schema describes what data are stored.

• The logical schema contains the organisation of the data into rows and columns.

LOGICAL SCHEMA Example

TEACHERS (id, name, address)

SUBJECTS (code, name, hours, id_teacher)

foreign key: id_teacher → TEACHERS

Relational Model

LOGICAL SCHEMA Example (tables)

TEACHERS

id | name | address
12 | Fox | BCN
25 | McKewan | BCN
80 | Fox | GRN

SUBJECTS

code | name | hours | id_teacher
C1 | OS | 240 | 25
C2 | NET | 180 | 25
C3 | DB | 60 | 12
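The logical schema and sample rows above can be tried out with a short sketch. It assumes SQLite through Python's `sqlite3` module; the column types are guesses, and the extra subject `C4` with teacher id 99 is invented only to show the foreign key rejecting a dangling reference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE teachers (id INTEGER PRIMARY KEY, name TEXT, address TEXT)")
conn.execute("""
    CREATE TABLE subjects (
        code TEXT PRIMARY KEY,
        name TEXT,
        hours INTEGER,
        id_teacher INTEGER REFERENCES teachers(id)
    )
""")
conn.executemany(
    "INSERT INTO teachers VALUES (?, ?, ?)",
    [(12, "Fox", "BCN"), (25, "McKewan", "BCN"), (80, "Fox", "GRN")],
)
conn.executemany(
    "INSERT INTO subjects VALUES (?, ?, ?, ?)",
    [("C1", "OS", 240, 25), ("C2", "NET", 180, 25), ("C3", "DB", 60, 12)],
)

# Follow the foreign key to list each subject with its teacher.
taught_by = conn.execute("""
    SELECT s.code, t.name
    FROM subjects AS s
    JOIN teachers AS t ON t.id = s.id_teacher
    ORDER BY s.code
""").fetchall()
print(taught_by)

# A subject that points at a non-existent teacher is rejected.
try:
    conn.execute("INSERT INTO subjects VALUES ('C4', 'WEB', 90, 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```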

DATA MODELS Types

SCHEMA | MODEL
Conceptual Schema (real-world) | Entity-Relationship model (ERM)
Logical Schema (data description) | Relational model, Hierarchical model, Network model
Internal Schema (implementation of tables) |

DATA MODELS Hierarchical Model

• Formed the basis of the earliest databases.

• Data were organised in a top-down structure.

DATA MODELS Network Model

• It was a more general representation than the hierarchical model, with no distinction between parent and child.
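A rough sketch of the structural difference between the two models, using plain Python dictionaries rather than a real hierarchical or network DBMS. The department and subject names are hypothetical. In the tree every record hangs from exactly one parent; in the network a record can be reached from several others.

```python
# Hierarchical model: a top-down tree, one parent per record.
hierarchy = {
    "College": {
        "Computing dept": {"OS": {}, "DB": {}},
        "Languages dept": {"English": {}},
    }
}

def paths(tree, prefix=()):
    """List every root-to-leaf path in the tree."""
    result = []
    for name, children in tree.items():
        here = prefix + (name,)
        result.extend(paths(children, here) if children else [here])
    return result

# Network model: records are linked freely, so "DB" has more than one owner.
links = {
    "Computing dept": ["OS", "DB"],
    "Languages dept": ["English"],
    "Fox": ["DB", "English"],
}
owners_of_db = sorted(owner for owner, members in links.items() if "DB" in members)

print(paths(hierarchy))
print(owners_of_db)
```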

DATA INDEPENDENCE

• Data independence means that changes in the way the data are structured and stored don’t affect the programs.

– Physical data independence

– Logical data independence

DATA INDEPENDENCE Physical Data Independence

• It is the ability to modify the physical schema without causing the conceptual schema or application programs to be rewritten.
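One way to see physical data independence is to change how the data are stored and check that the program is unaffected. The sketch assumes SQLite through Python's `sqlite3`; the function `subjects_of` plays the role of the application program, and the index name `idx_subjects_teacher` is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE subjects (code TEXT, name TEXT, hours INTEGER, id_teacher INTEGER)")
conn.executemany(
    "INSERT INTO subjects VALUES (?, ?, ?, ?)",
    [("C1", "OS", 240, 25), ("C2", "NET", 180, 25), ("C3", "DB", 60, 12)],
)

def subjects_of(teacher_id):
    """The 'application program': it only names tables and columns."""
    query = "SELECT code FROM subjects WHERE id_teacher = ? ORDER BY code"
    return [row[0] for row in conn.execute(query, (teacher_id,))]

before = subjects_of(25)

# Physical change: add an index on the column used for lookups.
conn.execute("CREATE INDEX idx_subjects_teacher ON subjects (id_teacher)")

after = subjects_of(25)
print(before, after)
```

The query text and its result are the same before and after; only the way the DBMS finds the rows may have changed.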

DATA INDEPENDENCE Physical Independence

PHYSICAL INDEPENDENCE

DATA INDEPENDENCE Logical Data Independence

• It’s the ability to modify the conceptual schema without having to change the external schemas or application programs.
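A small sketch of logical data independence, again assuming SQLite through Python's `sqlite3`. The view `teacher_directory` stands for an external schema, and `email` is a hypothetical attribute added to the table later. The program that reads the view is not touched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teachers (id INTEGER PRIMARY KEY, name TEXT, address TEXT)")
conn.execute("INSERT INTO teachers VALUES (12, 'Fox', 'BCN')")
conn.execute("CREATE VIEW teacher_directory AS SELECT id, name FROM teachers")

def directory():
    """Application code written against the external schema (the view)."""
    return conn.execute("SELECT id, name FROM teacher_directory ORDER BY id").fetchall()

before = directory()

# Logical change: the table gains a new column.
conn.execute("ALTER TABLE teachers ADD COLUMN email TEXT")

after = directory()
print(before, after)
```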

DATA INDEPENDENCE Physical vs Logical Data Independence

(Diagram labels: PHYSICAL INDEPENDENCE, LOGICAL INDEPENDENCE)