rdbms concepts

RDBMS Concepts J.SRINIVASA REDDY

Database Concepts

Introduction:

In today’s changing technological environment, information is power. Owing to the rapid growth of IT in the last few decades, the quest for better data management has led to the development of database management technology. The term database is made up of two separate words, data and base: a database is the base for data, that is, an assembled group of data. A database allows easy and efficient storage, retrieval, and modification of data, regardless of the amount of data being manipulated. Essentially, a database is a computerized record-keeping system.

Data versus Information:

Before defining database, we should know about the two terms, data and information, which are used frequently with databases.

Data: Data can be anything, such as a number, a person’s name, images, sounds, and so on. Hence, data can be defined as a set of isolated and unrelated raw facts, represented by values, which have little or no meaning because they lack a context for evaluation. Usually, the values are represented in the form of characters, numbers, or other symbols, such as ‘Rani’, ‘35’, and ‘chef’. Although these words and numbers have a certain meaning, it is difficult to figure out exactly what they signify.

Information: When data is processed and converted into a meaningful and useful form, it is known as information. Hence, information can be defined as an organized and validated collection of data. For example, ‘Rani is 35 years old and she is a chef’.

Database: A database is a logically organized collection of related data. Related data refers to data that is stored based on the same context. For example, an employee database contains similar data for all employees, and every employee’s entry contains the same type of information. The organized information serves as a base from which the desired information can be retrieved, conclusions can be drawn, and decisions can be made by further reorganizing or processing this data. In essence, a database not only stores information; the information is integrated so that when the database is interrogated, users can get something useful from it.

Examples: dictionary, address book, telephone directory, etc.

Databases are used to manage information. Within the database, the data is organized into storage containers called tables. Tables are made up of columns and rows. In a table, columns represent individual fields and rows represent records of data.

Field / Column: A column represents one related part of a table and is the smallest logical structure of storage in a database. It holds one piece of information about an item or subject, generally as a group of alphanumeric characters.

Record / Row: A record is a collection of multiple related fields that can be treated as a unit. Conceptually, if you collected business cards from 50 persons, all 50 cards would represent a table and the information on any one business card would represent one record.

Table: A table is a named collection of logically related records. For example, a collection of all the employee records of a company would be an Employee table.

CODE  DEPT  NAME     ADDRESS          CITY       PHONE
0101  RD01  Prince   Park Way         London     74134543
0102  RD01  Harry    Pebble street    Lester     54847156
0103  RD02  Tom      Rose Garden      Liverpool  21313603
0104  RD02  Susan    Model Town       Bristol    27565155
0105  ED01  Mark     Victor Crescent  Everton    39723624
0106  AD01  Francis  Chelmsford Park  Paris      28245374
0107  GR01  Robert   Downtown Cross   Berlin     26062700
0108  RD03  Philip   Park Avenue      Calgary    41816700

Figure 1.1 Table

Database Management System (DBMS)

As discussed earlier, a database is a collection of well-organized data. In order to carry out operations like insertion, deletion, and retrieval, the database needs to be managed by a software package. This software is called a Database Management System (DBMS). Hence, a DBMS can be defined as a collection of interrelated data and a set of programs to manage that data. The primary goal of a DBMS is to provide an environment that is convenient and efficient for storing and retrieving information. It allows a user to store, update, and retrieve data.
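The operations named above (insertion, update, deletion, and retrieval) can be tried out with a small sketch. It assumes Python's built-in sqlite3 module as the DBMS, which the notes do not prescribe; the table layout follows Figure 1.1, and the replacement phone number is made up for illustration.

```python
import sqlite3

# In-memory SQLite database; the table layout mirrors Figure 1.1.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " code TEXT PRIMARY KEY, dept TEXT, name TEXT,"
    " address TEXT, city TEXT, phone TEXT)"
)

# Insertion: store two records (rows) taken from Figure 1.1.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("0101", "RD01", "Prince", "Park Way", "London", "74134543"),
        ("0102", "RD01", "Harry", "Pebble street", "Lester", "54847156"),
    ],
)

# Update: change one field of one record (the new number is invented).
conn.execute("UPDATE employee SET phone = ? WHERE code = ?", ("00000000", "0102"))

# Deletion: remove one record.
conn.execute("DELETE FROM employee WHERE code = ?", ("0101",))

# Retrieval: read back what is left in the table.
remaining = conn.execute("SELECT code, name, phone FROM employee").fetchall()
print(remaining)
```

The application never touches the stored bytes directly; every request goes through the DBMS as a statement.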

Relational Database Management System (RDBMS)

RDBMS is the acronym for Relational Database Management System. Dr. E. F. Codd first introduced the relational database model in 1970. The relational model allows data to be represented in a simple row-column format. Each data field is considered a column and each record is considered a row. An RDBMS is a DBMS that is based on the relational model. In the relational model, there are relations between the data elements: data is stored in tables, and tables have names, columns, and rows. Tables can be related to each other if each has a column with a common type of information.

Relationship

A relationship is an association, dependency, or link between two or more entities. Even though a relationship may involve more than two entities, the most commonly encountered relationships are binary, involving exactly two entities. Generally, such binary relationships are of three types:

1. One-to-One relationship
2. One-to-Many relationship
3. Many-to-Many relationship

One-to-One relationship:

In a one-to-one relationship, one record in a table is related to only one record in another table (one entity is related to only one other entity). For example, a department cannot be headed by more than one department head.

Department 1 ---- Department Head 1
Department 2 ---- Department Head 2
Department 3 ---- Department Head 3

Figure 1.2 One-to-One relationship

One-to-Many relationship:

In a one-to-many relationship, one record in a table (the parent table) can be related to many records in another table (the child table). For example, a father may have more than one child, but a child has only one father.

Father ---- Child 1
Father ---- Child 2
Father ---- Child 3

Figure 1.3 One-to-Many relationship

Many-to-Many relationship:

In a many-to-many relationship, one record in the first table can be related to one or more records in the second table, and one record in the second table can be related to one or more records in the first table. For example, a customer can buy many goods, and many customers can buy the same goods.
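One common way to express the three relationship types in a relational schema is sketched below, assuming SQLite through Python's sqlite3 module. All table and column names are hypothetical, and which customer bought which goods is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    -- One-to-one: UNIQUE on dept_id allows at most one head per department.
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE department_head (
        head_id INTEGER PRIMARY KEY,
        name    TEXT,
        dept_id INTEGER UNIQUE REFERENCES department(dept_id)
    );

    -- One-to-many: many child rows may point at the same father row.
    CREATE TABLE father (father_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE child (
        child_id  INTEGER PRIMARY KEY,
        name      TEXT,
        father_id INTEGER NOT NULL REFERENCES father(father_id)
    );

    -- Many-to-many: a junction table pairs customers with goods.
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE goods (goods_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (
        customer_id INTEGER REFERENCES customer(customer_id),
        goods_id    INTEGER REFERENCES goods(goods_id),
        PRIMARY KEY (customer_id, goods_id)
    );
    """
)

# One-to-one in action: a second head for the same department is rejected.
conn.execute("INSERT INTO department VALUES (1, 'Department 1')")
conn.execute("INSERT INTO department_head VALUES (1, 'Department Head 1', 1)")
try:
    conn.execute("INSERT INTO department_head VALUES (2, 'Another Head', 1)")
    second_head_allowed = True
except sqlite3.IntegrityError:
    second_head_allowed = False

# Many-to-many in action: two customers buy the same television.
conn.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Customer 1"), (2, "Customer 2")])
conn.executemany("INSERT INTO goods VALUES (?, ?)", [(1, "Television"), (2, "VCD Player")])
conn.executemany("INSERT INTO purchase VALUES (?, ?)", [(1, 1), (1, 2), (2, 1)])
tv_buyers = [
    row[0]
    for row in conn.execute(
        "SELECT c.name FROM customer c"
        " JOIN purchase p ON p.customer_id = c.customer_id"
        " WHERE p.goods_id = 1 ORDER BY c.customer_id"
    )
]
print(second_head_allowed, tv_buyers)
```

The junction table is the usual design choice here because a single foreign key column can only point at one row, which is not enough for a many-to-many link.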

[Figure: Customer 1 to Customer 5 on one side, linked to the goods Television, VCD Player, and Handy Cam on the other.]

Figure 1.4 Many-to-Many relationship

File based approach of database management

Traditionally, data was stored and processed in multiple files with the help of a program or group of programs for each application. This is known as the file based approach of database management. A file may be defined as a systematized, self-contained collection of records. It may consist of data (a data file) or it may contain a sequence of program statements (a program). Each user works with a different program that handles its own independent data, and each program maintains its own set of data. Users of one program may not be aware of potentially useful data held by other programs.

[Figure: the Employee application program reads and writes its own Employee file, and the Allowance application program reads and writes its own Allowance file.]

Figure 1.5 File Oriented Approach

In the above figure, assume that the employee file has fields like CODE, NAME, DEPT, and SALARY, and the allowance file contains fields like CODE, HRA, DA, and CCA. Note that the same data for the CODE field is held by different programs. As a result, space is wasted, and potentially different values and/or different formats for the same item may be stored in different files. Since the file structure is defined in the program code, the data used for that file is dependent on it. Usually, programs are written in different languages, which leads to incompatible file formats, so one program may not easily access the files of another. Since programs are written to perform particular functions, only a fixed number of queries can be performed, and any new requirement needs a new program.
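The format problem described above can be made concrete with a minimal sketch. The two in-memory "files", their records, and the two different CODE formats are all invented for illustration; only the field names come from the text.

```python
import csv
import io

# Two independent "files", each owned by a different program.
employee_file = io.StringIO("CODE,NAME,DEPT,SALARY\n0101,Prince,RD01,5000\n")
allowance_file = io.StringIO("CODE,HRA,DA,CCA\n101,500,300,100\n")

employees = list(csv.DictReader(employee_file))
allowances = list(csv.DictReader(allowance_file))

# A naive match on CODE fails: one program stored '0101', the other '101'.
naive_matches = [
    (e["NAME"], a["HRA"])
    for e in employees
    for a in allowances
    if e["CODE"] == a["CODE"]
]

# Matching works only once a new program knows both formats.
normalized_matches = [
    (e["NAME"], a["HRA"])
    for e in employees
    for a in allowances
    if int(e["CODE"]) == int(a["CODE"])
]
print(naive_matches, normalized_matches)
```

Every such reconciliation has to be written by hand, which is the "new requirement needs a new program" cost of the file based approach.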

DBMS Approach:

DBMS is a software system that enables users to define, create, and maintain the database and provides controlled access to this database. This approach has a database engine placed between the applications and the data. The engine is the central component of a DBMS and it provides access to the repository and the database. It also coordinates all the other functional elements (manipulate, add, delete, search, select, and store data) of the DBMS. Only the engine knows how the data is physically stored; applications pass requests to the engine to read and write data. The definition of data is kept separately from the application programs rather than embedded in them. The data is logically related and comprises the entities, attributes, and relationships of the information. The data’s definition and structure (metadata) is defined and stored in the data dictionary. Since the dictionary provides the description of the data and the programs are based on it, the data becomes independent of the programs.

[Figure: the Employee application program and the Allowance application program both access the Employee and Allowance data in the database through the DBMS.]

Figure 1.6 DBMS Approach

In general, the data in a DBMS is integrated as well as shared.

Integrated: A database can be considered as a unification of several data files (tables), with any redundancy among those files eliminated (either wholly or partly). For example, a database might contain both the EMPLOYEE and the ALLOWANCE files. To process the ALLOWANCE file, one need not store the redundant fields in it, as they can easily be obtained by referring to the EMPLOYEE table.

Shared: Shared data means that individual pieces of data in the database can be shared among numerous users. Each user can have access to the same piece of information, and every user can use it for a different purpose. In the above example, department information in the EMPLOYEE file can be shared by users in the personnel department and the accounts department. These two groups of users would typically be using the information for different purposes.
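The two properties can be sketched with the EMPLOYEE and ALLOWANCE example, assuming SQLite through Python's sqlite3 module. The field names follow the earlier file based example; the record values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (code TEXT PRIMARY KEY, name TEXT, dept TEXT, salary INTEGER);
    -- ALLOWANCE stores only CODE plus its own fields; NAME and DEPT are not repeated.
    CREATE TABLE allowance (
        code TEXT PRIMARY KEY REFERENCES employee(code),
        hra INTEGER, da INTEGER, cca INTEGER
    );
    INSERT INTO employee VALUES ('0101', 'Prince', 'RD01', 5000);
    INSERT INTO allowance VALUES ('0101', 500, 300, 100);
    """
)

# Integrated: an accounts user combines both tables when the name is needed.
pay_rows = conn.execute(
    "SELECT e.name, e.salary + a.hra + a.da + a.cca"
    " FROM employee e JOIN allowance a ON a.code = e.code"
).fetchall()

# Shared: a personnel user reads the same EMPLOYEE data for another purpose.
dept_rows = conn.execute("SELECT name, dept FROM employee").fetchall()
print(pay_rows, dept_rows)
```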

Benefits in Database Management System

Reduction in Data Redundancy

Data redundancy refers to duplication of data. In non-database systems, each application has its own separate files. This can often lead to redundancy in stored data, which results in wastage of space. A database management system does not need to maintain separate copies of the same data. All the data is kept in one place and various applications refer to the data from this centrally controlled system.

Sharing of data

Sharing of data allows the existing applications to use the data in the database. It also helps in developing new applications that use the same stored data. Due to shared data, it is possible to satisfy the data requirements of new applications without having to create any additional stored data, or with only marginal modifications.

Improvement in Data Security

Usually, different systems of an organization access different components of the operational data. In such an environment, enforcing security can be quite difficult. Setting up a database management system makes it easier to enforce security restrictions, since the data is stored centrally. The DBMS can ensure that the only means of access to the database is through authorized channels. Hence, data security checks can be carried out whenever access to sensitive data is attempted. To ensure security, a DBMS provides security tools such as user codes and passwords. Different checks can be established for each type of access to each piece of information in the database.

Maintenance of Data Integrity

Data integrity refers to ensuring that the data in the database is accurate. Since, in a DBMS, the data is centralized and used by a number of users at a time, it is essential to enforce integrity controls. If many users are allowed to update the same data item at the same time, there is a possibility of incorrect and inconsistent updates. Since all data is stored only once, it is often easier to maintain integrity than in conventional systems.

Better Interaction with Users

Compared to traditional file-based systems, a DBMS often provides better service to the users. In conventional systems, the information is usually poorly arranged and, as a result, the availability of up-to-date information is poor. A DBMS, however, makes it easy to respond to unforeseen information requests. Centralizing the data in a database also means that users can obtain new and combined information that would have been impossible to obtain otherwise. In addition, use of a DBMS allows novice users (who do not know programming) to interact with the data more easily.

Efficient System

It is very common to change the contents of stored data. These changes can be made more easily in a database management system than in a conventional system, as they do not need to have any impact on application programs. The cost of developing and maintaining systems is also lower. It is much easier to respond to unforeseen requests when the data is centralized in a database than when it is stored in a file-based system.

Components of DBMS

Usually a DBMS is a large software package. It is an intermediate link between the physical database, the computer, and the operating system on one hand, and the users on the other.

Users: In a DBMS, generally three broad classes of users are considered.

1. Application Programmers: The application programmers develop the applications. These programs can manipulate the database in all possible ways.

2. End users: The end users access the database from a terminal, using a query language provided by the database system or through application programs developed by application programmers.

3. DBA (Database Administrator): The database administrator is the person responsible for the design, construction, and maintenance of a database.

Software:

Software of a database management system includes the DBMS, operating system, network software (if necessary), and the application programs.

Hardware:

Hardware of the system can range from a PC to a network of computers. It also includes various storage devices (like hard disks) and input and output devices (like keyboard, monitor, printer, etc.).

Data:

Data stored in a database includes numerical data including whole numbers and floating point numbers, and non-numerical data such as characters, date or logical (true or false). More advanced systems may include more complicated data entities such as pictures and images as data types.

DBMS Architecture

We know that a database management system is a collection of programs that enables users to create and maintain a database. A major purpose of a DBMS is to provide users with an abstract view of the database. This means that the system does not show users all the details of the data; rather, it hides the details of how the data is stored and maintained. However, in order for the system to be usable, data must be retrieved efficiently. This concern leads to the design of complex data structures for the representation of data in the database.

View: Normally, a table contains many columns and rows. Sometimes all the data interests the user, and sometimes it does not. There may be cases when only some columns or some rows of a table interest the user. To eliminate data that is not relevant to the current needs, a view is created. A view is a subset of a database that an application can process. It may contain parts of one or more tables. Views are sometimes called virtual tables. To the application or the user, views behave in a similar fashion to tables.
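A view that keeps only some columns and some rows can be sketched as follows, assuming SQLite through Python's sqlite3 module. The rows are taken from Figure 1.1; the view name rd02_staff is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (code TEXT, dept TEXT, name TEXT, city TEXT, phone TEXT);
    INSERT INTO employee VALUES ('0101', 'RD01', 'Prince', 'London', '74134543');
    INSERT INTO employee VALUES ('0103', 'RD02', 'Tom', 'Liverpool', '21313603');
    INSERT INTO employee VALUES ('0104', 'RD02', 'Susan', 'Bristol', '27565155');

    -- The view keeps two columns and only the rows of one department.
    CREATE VIEW rd02_staff AS
        SELECT code, name FROM employee WHERE dept = 'RD02';
    """
)

# The view is queried exactly like a table.
view_rows = conn.execute("SELECT * FROM rd02_staff ORDER BY code").fetchall()
print(view_rows)
```

The view stores no rows of its own; it is evaluated against the employee table each time it is queried, which is why it is called a virtual table.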

Schema: The database schema refers to the overall structure of the database tables that store information such as user profile data, metadata, or structured information. Hence, the overall logical design of the database is called the database schema. Note that once the schema of the database is created, it is usually not changed. If it needs to be modified, only the owner of the schema, that is, the DBA, who has access to manipulate the structure of any object in the schema, can modify it.

A DBMS can be envisioned as a three-layered system: internal, conceptual, and external.

Internal (Physical) Level: This is the lowest level of abstraction. It describes how the data is physically stored and organized on the storage medium, as well as how the data is accessed, such as through the storage of data in tables and the use of indexes to expedite data access. At this level, the complex low-level data structures that deal with actual storage are described in detail. The internal model separates the physical requirements of the hardware and the operating system from the data model.

Conceptual (Logical) Level: The conceptual level, also known as the logical or community-user level, describes what type of data is actually stored in the database and the relationships among the data. At this level, the entire database is described in terms of a small number of relatively simple structures, such as tables and constraints, although implementation of these simple structures may involve complex physical-level structures. The user of the conceptual level of abstraction is the DBA, who must decide what information is to be kept in the database.

External Level:

The external level (or application interface) is the view that the individual user of the database has. This view is often a restricted view of the database, and the same database may provide a number of different views for different classes of users. This level deals with the methods through which users may access the schema, such as using an input form. In general, the end users and even the application programmers are only interested in a subset of a database. To simplify their interaction with the system, the view level of abstraction is defined. Concisely, this level is concerned with the way in which the data is viewed by individual users.

[Figure: User 1, User 2, and User 3 work with View 1, View 2, and View 3 at the external level. The views rest on the conceptual schema at the conceptual level, which rests on the internal (physical) schema at the internal level, which describes the physical data organization of the database.]

DBMS Architecture

External Level
    View 1 (User 1): Code, Name
    View 2 (User 2): Branch_Code, Name, Salary

Conceptual Level
    Code         Character(6)
    Branch_Code  Character(6)
    Name         Character(20)
    Age          Numeric(3)
    Salary       Numeric(6)

Internal (Physical) Level
    Stored-Employee  Length=43
    Code         Type=Byte(6),  Offset=0, Index=Ex
    Branch_Code  Type=Byte(6),  Offset=6
    Name         Type=Byte(20), Offset=12
    Age          Type=Byte(3),  Offset=32
    Salary       Type=Byte(6),  Offset=35

Three Levels of Data Abstraction

Data Independence

The ability to modify a schema definition at one level without affecting the schema definition at the next higher level is called data independence. A database is not a static system: information is constantly added to, modified in, or deleted from the database. However, this should not lead to redesigning and re-implementing the database. At such moments, the concept of data independence proves beneficial. There are two levels of data independence.

1. Logical Data Independence: The separation of external views from the conceptual view, which enables the users to change the conceptual view without affecting the external views or application programs, is called logical data independence. Simply, it refers to the immunity of the external model to changes in the conceptual model.

2. Physical Data Independence: The separation of the conceptual view from the internal view enables us to provide a logical description of the database without the need to specify physical structures. This is often called physical data independence. Modification at the physical level is occasionally necessary in order to improve performance. Simply, it refers to the immunity of an application to changes in the internal model and access strategy.
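Both kinds of independence can be observed in a small sketch, assuming SQLite through Python's sqlite3 module. The column names follow the conceptual schema of the three-level figure; the record, the added phone column, and the index name are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (
        code TEXT, branch_code TEXT, name TEXT, age INTEGER, salary INTEGER
    );
    INSERT INTO employee VALUES ('E00001', 'B00001', 'Rani', 35, 4000);
    -- External view used by an application.
    CREATE VIEW view1 AS SELECT code, name FROM employee;
    """
)
query = "SELECT * FROM view1"
before = conn.execute(query).fetchall()

# Conceptual-level change: a new column is added to the table.
conn.execute("ALTER TABLE employee ADD COLUMN phone TEXT")
after_logical_change = conn.execute(query).fetchall()

# Internal-level change: an index alters the access path, not the data model.
conn.execute("CREATE INDEX idx_employee_code ON employee(code)")
after_physical_change = conn.execute(query).fetchall()

print(before, after_logical_change, after_physical_change)
```

The application's query text never changes, and it sees the same result after each modification.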

DATABASE MODELS

1. Hierarchical Database Model

This data model organizes the data in a tree structure: each child node can have only one parent node, and at the top of the tree there is a single parentless node. Each parent node can have many child nodes. The hierarchical model uses one-to-many relationships. In this model, a database record is a tree. The main advantage of hierarchical databases is that data access is quite predictable in structure, and therefore both retrieval and updates can be highly optimized by the DBMS. The major drawbacks of hierarchical databases are data redundancy and the fact that the links are ‘hard coded’ into the database structure, that is, each link is permanently established and cannot be modified. The physical links make it very difficult to expand or modify the database.

[Figure: a tree with LIBRARY at the root, Subject 1 and Subject 2 below it, authors (Author 1 to Author 4, with Author 2 appearing twice) below the subjects, and books (Book 1, Book 2) below each author.]

Hierarchical Database Model
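The single-parent rule and the redundancy it causes can be sketched in a few lines of Python. The Node class is hypothetical, and the placement of Author 2 under both subjects is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    name: str
    # Exactly one parent link per node (None only for the root).
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list, repr=False)

    def add_child(self, name: str) -> "Node":
        child = Node(name, parent=self)
        self.children.append(child)
        return child

    def path(self) -> str:
        """Walk the single chain of parent links up to the root."""
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return " > ".join(reversed(parts))


library = Node("LIBRARY")
subject1 = library.add_child("Subject 1")
subject2 = library.add_child("Subject 2")
author2_under_s1 = subject1.add_child("Author 2")
# The same author under another subject needs a second, duplicated node.
author2_under_s2 = subject2.add_child("Author 2")
book = author2_under_s2.add_child("Book 1")
print(book.path())
```

Access is predictable because every record has exactly one path from the root, but the duplicated author node shows where the redundancy comes from.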

2. Network Database Model Attempts to solve the problems associated with hierarchical databases produced the network database model.

The network model is very similar to the hierarchical one; it is also based on the concept of parent/child relationship but removes the restriction of one child having one and only one parent. In the network database model a parent can have multiple children, and a child can have multiple parents. This structure could be visualized as several trees that share some branches. In network database jargon these relationships came to be known as sets.

In addition to the ability to handle a one-to-many relationship, the network database can handle many-to-many relationships. Also, data access did not have to begin with the root; instead, one could traverse the database structure starting from any table and navigating to a related table in any direction.

[Figure: Salesman 1, Salesman 2, and Salesman 3 linked to Product 1, Product 2, Product 3, and Product 4.]

Network Database Model
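A minimal sketch of the network idea, in which a record may have several parents and navigation may start from either side, is given below. The two dictionaries and the link function are hypothetical, and which salesman sells which product is invented.

```python
from collections import defaultdict

# Each link ties an owner record to a member record; a member may
# belong to several owners, unlike in the hierarchical model.
sells = defaultdict(set)      # salesman -> products
sold_by = defaultdict(set)    # product  -> salesmen


def link(salesman: str, product: str) -> None:
    """Record the relationship in both directions."""
    sells[salesman].add(product)
    sold_by[product].add(salesman)


link("Salesman 1", "Product 1")
link("Salesman 1", "Product 2")
link("Salesman 2", "Product 2")
link("Salesman 3", "Product 2")

# Navigation can start from either side, not only from a root.
products_of_s1 = sorted(sells["Salesman 1"])
owners_of_p2 = sorted(sold_by["Product 2"])
print(products_of_s1, owners_of_p2)
```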

While providing several advantages, network databases share several problems with hierarchical databases. Both are very inflexible, and changes in the structure (for example, a new table to reflect changed business logic) require that the entire database be rebuilt; also, set relationships and record structures must be predefined.

The major disadvantage of both network and hierarchical databases was that they were programmers' domains. To answer even the simplest query, one had to create a program that navigated the database structure and produced an output. Unlike SQL, this program was written in a procedural, often proprietary, language and required a great deal of knowledge of both the database structure and the underlying operating system. As a result, such programs were not portable and took an enormous (by today's standards) amount of time to write.

3. Relational Database Model

The development of relational databases was driven by the need of medium to big businesses to gather, preserve, and analyze data. The frustration with the inadequate capabilities of network and hierarchical databases resulted in the invention of the relational data model. The relational data model took the idea of the network database several steps further. Relational models, just like hierarchical and network models, are based upon tables and use parent/child relationships.

The relational model was formally introduced by Dr. E. F. Codd in 1970 and has evolved since then through a series of writings. The model provides a simple, yet rigorously defined, concept of how users perceive data. The relational model represents data in the form of two-dimensional tables. Each table represents some real-world person, place, thing, or event about which information is collected. A relational database is a collection of two-dimensional tables. The organization of data into relational tables is known as the logical view of the database, that is, the form in which a relational database presents data to the user and the programmer. The way the database software physically stores the data on a computer disk system is called the internal view.

AUTHOR Table

Author ID  Author Name   Date of Birth
567-2643   J.S.Reddy     15-Jun-1977
482-7654   Joan Casteel  28-Mar-1973
573-2342   Deshpande     15-Nov-1969
567-5643   Steve         20-Aug-1978

PUBLISHER Table

Pub ID     Pub Name  Address
07-05-112  Thomson   Hyderabad
16-99-876  Wiley     Pune
11-54-123  Oreilly   Kolkata
12-76-876  Wrox      New Delhi

BOOKS Table

ISBN     Author ID  Pub ID     Date        Title
923-121  567-2643   16-99-876  12-12-1997  Oracle
895-323  482-7654   07-05-112  5-4-1989    Java
887-122  573-2342   12-76-876  10-11-1999  Dot net
788-211  567-5643   11-54-123  6-6-2000    SAP-ABAP
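The three tables are related through the shared Author ID and Pub ID columns. A minimal sketch of how a query follows those column values is shown below, assuming SQLite through Python's sqlite3 module; only two rows per table are loaded and the lower-case column names are a choice made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE author (author_id TEXT PRIMARY KEY, author_name TEXT);
    CREATE TABLE publisher (pub_id TEXT PRIMARY KEY, pub_name TEXT);
    CREATE TABLE books (
        isbn      TEXT PRIMARY KEY,
        author_id TEXT REFERENCES author(author_id),
        pub_id    TEXT REFERENCES publisher(pub_id),
        title     TEXT
    );
    INSERT INTO author VALUES ('567-2643', 'J.S.Reddy');
    INSERT INTO author VALUES ('482-7654', 'Joan Casteel');
    INSERT INTO publisher VALUES ('16-99-876', 'Wiley');
    INSERT INTO publisher VALUES ('07-05-112', 'Thomson');
    INSERT INTO books VALUES ('923-121', '567-2643', '16-99-876', 'Oracle');
    INSERT INTO books VALUES ('895-323', '482-7654', '07-05-112', 'Java');
    """
)

# The join matches rows by column value; no physical links are involved.
book_rows = conn.execute(
    "SELECT b.title, a.author_name, p.pub_name"
    " FROM books b"
    " JOIN author a ON a.author_id = b.author_id"
    " JOIN publisher p ON p.pub_id = b.pub_id"
    " ORDER BY b.isbn"
).fetchall()
print(book_rows)
```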

4. Object-Oriented Database Model

Object-oriented databases were developed in the 1980s. The main objective of Object-Oriented Database Management Systems, commonly known as OODBMS, is to provide consistent, data-independent, secure, controlled, and extensible data management services to support the object-oriented model. They were created to handle big and complex data that relational databases could not. Their most important characteristic is the joining of object-oriented programming with database technology, which provides an integrated application development system. Object-oriented programming contributes four main characteristics: inheritance, data encapsulation, object identity, and polymorphism.
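The four characteristics can be seen in a few lines of plain Python. This is only a language-level sketch with hypothetical class names; it does not show how any particular OODBMS stores objects.

```python
class Media:
    def __init__(self, title: str) -> None:
        # Data encapsulation: the state is reached through methods.
        self._title = title

    def title(self) -> str:
        return self._title

    def describe(self) -> str:
        return f"media item '{self._title}'"


class Video(Media):
    # Inheritance: Video reuses the constructor and title() of Media.
    def describe(self) -> str:
        # Polymorphism: the same call gives class-specific behaviour.
        return f"video '{self._title}'"


first = Video("Launch")
second = Video("Launch")

descriptions = [item.describe() for item in (Media("Logo"), first)]

# Object identity: equal state does not make two objects the same object.
same_state_different_identity = (
    first.title() == second.title() and first is not second
)
print(descriptions, same_state_different_identity)
```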

Object-Relational database (ORDBMS) is the third type of database common today. ORDBMS are systems that “attempt to extend relational database systems with the functionality necessary to support a broader class of applications and, in many ways, provide a bridge between the relational and object-oriented paradigms.”

ORDBMS was created to handle new types of data such as audio, video, and image files that relational databases were not equipped to handle. In addition, its development was the result of increased usage of object-oriented programming languages, and a large mismatch between these and the DBMS software.

OODBMS: Abandon SQL (use an OO language instead)
ORDBMS: Extend SQL (with OO features)