Unit 6 Data Storage Design

Page 1: Unit 6 Data Storage Design. Key Concepts 1. Database overview 2. SQL review 3. Designing fields 4. Denormalization 5. File organization 6. Object-relational.

Unit 6

Data Storage Design

Key Concepts

1. Database overview
2. SQL review
3. Designing fields
4. Denormalization
5. File organization
6. Object-relational database features

What Is Physical Database Design?

The part of a database design that deals with efficiency considerations for access of data

Key issues include:
- Processing speed
- Storage space
- Data manipulation and data access patterns

Sometimes, the analyst and the designer are the same person,

Deliverables

What Is SQL?

- Structured Query Language
- Often pronounced “sequel”
- The standard language for creating and using relational databases
- ANSI Standards
  - SQL-92 – most commonly available
  - SQL-99 – included object-relational features

Common SQL Commands

- CREATE: used to create databases and database objects. Examples: CREATE TABLE, CREATE DATABASE
- SELECT: used to retrieve data using specified formats and selection criteria
- INSERT: used to add new rows to a table
- UPDATE: used to modify data in existing table rows
- DELETE: used to remove rows from tables

Example CREATE TABLE Statement

Here, a table called DEPT is created, with one numeric and two text fields.

The numeric field is the primary key.
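
The statement shown on this slide did not survive in the transcript. The following is a minimal sketch of what such a statement could look like, run through Python's built-in sqlite3 module so it can be executed as-is. The column names DEPTNO, DNAME, and LOC are taken from the INSERT example in this unit; the column sizes are assumptions.

```python
import sqlite3

# In-memory database, so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")

# One numeric field (the primary key) and two text fields.
# The sizes (2, 14, 13) are assumed; SQLite accepts but does not enforce them.
conn.execute(
    """
    CREATE TABLE DEPT (
        DEPTNO NUMERIC(2) PRIMARY KEY,
        DNAME  VARCHAR(14),
        LOC    VARCHAR(13)
    )
    """
)

# List each column as (name, primary-key flag).
columns = [(row[1], row[5]) for row in conn.execute("PRAGMA table_info(DEPT)")]
print(columns)
```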

Example INSERT Statement

This statement inserts a new row into the DEPT table:

- DEPTNO’s value is 50
- DNAME’s value is “DESIGN”
- LOC’s value is “MIAMI”
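
The INSERT statement itself is missing from the transcript. A sketch of a statement that would produce the row described above, assuming a DEPT table with exactly these three columns (the column types are assumptions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed table definition; only the column names come from the slide text.
conn.execute("CREATE TABLE DEPT (DEPTNO INTEGER PRIMARY KEY, DNAME TEXT, LOC TEXT)")

# Insert the row described above: DEPTNO 50, DNAME 'DESIGN', LOC 'MIAMI'.
conn.execute("INSERT INTO DEPT (DEPTNO, DNAME, LOC) VALUES (50, 'DESIGN', 'MIAMI')")

rows = conn.execute("SELECT * FROM DEPT").fetchall()
print(rows)
```

Naming the columns in the INSERT keeps the statement valid even if the column order of the table changes later.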

SELECT

The SELECT and FROM clauses are required.

All others are optional.

WHERE is used very commonly.

SELECT Statement: Example 1

Result: all fields of all rows in the DEPT table

Select * from DEPT;

SELECT Statement: Example 2

Result: all fields for employee “Smith”

Select * from EMP where ENAME = 'SMITH';

SELECT Statement: Example 3

Result: employee number, name and job for only salesmen from the EMP table, sorted by name

Select EMPNO, ENAME, JOB from EMP where JOB = 'SALESMAN' order by ENAME;

What Is a Join Query?

A query in which the WHERE clause includes a match of primary key and foreign key values between tables that share a relationship

SELECT Statement: Example 4

Result: all employees’ numbers and names (from the EMP table) and their associated department names, obtained by joining the tables based on DEPT_NO.

Only employees housed in departments located in Chicago will be included.

Select EMPNO, ENAME, DNAME from EMP, DEPT where EMP.DEPT_NO = DEPT.DEPT_NO and DEPT.LOC = 'CHICAGO';

SELECT Statement: Example 4 (cont.)

Join queries almost always involve matching the primary key of the dominant table with the foreign key of the dependent table.
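
The result table for this example is not in the transcript. As a rough illustration of the primary key to foreign key match, the sketch below runs the same kind of join against a few hypothetical rows. The table layouts and all data values are assumptions; only the column names used in the query come from Example 4.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE DEPT (DEPT_NO INTEGER PRIMARY KEY, DNAME TEXT, LOC TEXT);
    CREATE TABLE EMP (
        EMPNO   INTEGER PRIMARY KEY,
        ENAME   TEXT,
        DEPT_NO INTEGER REFERENCES DEPT (DEPT_NO)  -- foreign key to DEPT
    );
    -- Hypothetical sample rows, for illustration only.
    INSERT INTO DEPT VALUES (10, 'ACCOUNTING', 'NEW YORK'), (30, 'SALES', 'CHICAGO');
    INSERT INTO EMP VALUES (1, 'ADAMS', 10), (2, 'BAKER', 30), (3, 'CLARK', 30);
    """
)

rows = conn.execute(
    """
    SELECT EMPNO, ENAME, DNAME
    FROM EMP, DEPT
    WHERE EMP.DEPT_NO = DEPT.DEPT_NO   -- foreign key matched to primary key
      AND DEPT.LOC = 'CHICAGO'
    ORDER BY EMPNO
    """
).fetchall()
print(rows)
```

Only the two employees whose DEPT_NO points at the Chicago department are returned; the employee in department 10 is filtered out by the LOC condition.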

What Is an Aggregation Query?

A query that results in summary information about a group of records, such as sums, counts, or averages

These involve aggregate functions in the SELECT clause (SUM, AVG, COUNT)

Aggregations can be filtered using the HAVING clause and/or grouped using the GROUP BY clause

SELECT Statement: Example 5

The job name and average salary for each job of employees in the EMP table.

Only jobs with average salaries of at least $3000 will be included

Select JOB, Avg(SALARY) from EMP Group by JOB Having Avg(SALARY) >= 3000;

SELECT Statement: Example 5 (cont.)

Note that clerks and salesmen are not included, because the average salaries for these jobs are below $3000.
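
The result table is missing from the transcript. The sketch below shows how GROUP BY and HAVING interact, using hypothetical salaries chosen so that clerks and salesmen fall below the threshold. The data values and the reduced table layout are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP (EMPNO INTEGER PRIMARY KEY, JOB TEXT, SALARY INTEGER)")
# Hypothetical salaries, chosen only to exercise the HAVING filter.
conn.executemany(
    "INSERT INTO EMP VALUES (?, ?, ?)",
    [
        (1, "CLERK", 1000), (2, "CLERK", 1200),
        (3, "SALESMAN", 1500), (4, "SALESMAN", 1600),
        (5, "ANALYST", 3000), (6, "ANALYST", 3000),
        (7, "PRESIDENT", 5000),
    ],
)

rows = conn.execute(
    """
    SELECT JOB, AVG(SALARY)
    FROM EMP
    GROUP BY JOB                   -- one result row per job
    HAVING AVG(SALARY) >= 3000     -- filter applied to the group averages
    ORDER BY JOB
    """
).fetchall()
print(rows)
```

Because the comparison is >=, a job whose average is exactly 3000 is kept. WHERE would filter individual rows before grouping; HAVING filters the groups afterwards.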

Example Data Manipulation

Modifies the salary of existing employee 7698:

Update EMP set SAL = 3000 where EMPNO = 7698;

Removes employee 7844 from the EMP table:

Delete from EMP where EMPNO = 7844;
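
To see the effect of both statements, the sketch below applies them to a small hypothetical EMP table. Only the employee numbers 7698 and 7844 and the new salary come from the slide; the names, the third row, and the starting salaries are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP (EMPNO INTEGER PRIMARY KEY, ENAME TEXT, SAL INTEGER)")
# Hypothetical rows; only the employee numbers come from the statements above.
conn.executemany(
    "INSERT INTO EMP VALUES (?, ?, ?)",
    [(7698, "EMP_A", 2850), (7844, "EMP_B", 1500), (7900, "EMP_C", 950)],
)

# rowcount reports how many rows each statement touched.
updated = conn.execute("UPDATE EMP SET SAL = 3000 WHERE EMPNO = 7698").rowcount
deleted = conn.execute("DELETE FROM EMP WHERE EMPNO = 7844").rowcount

remaining = conn.execute("SELECT EMPNO, SAL FROM EMP ORDER BY EMPNO").fetchall()
print(updated, deleted, remaining)
```

Without the WHERE clause, either statement would apply to every row in the table, so checking the affected row count is a cheap safeguard.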

Designing Fields

Field – the smallest unit of named application data recognized by system software such as a DBMS

Fields map roughly onto attributes in conceptual data models

When designing fields, consider:
- identity
- data types
- sizes
- constraints

Data type – a coding scheme recognized by system software for representing organizational data

SQL Server Data Types

Storage type | Data types
Date and time values | smalldatetime, datetime
Integral | bit, tinyint, smallint, int, bigint
Non-whole numbers | decimal, numeric, money, smallmoney, float, real
Characters and strings | char, varchar, text
Unicode characters and strings | nchar, nvarchar, ntext
Binary strings | binary, varbinary, image
Other | cursor, sql_variant, table, timestamp, uniqueidentifier, xml

Considerations for Choosing Data Types

Balance these four objectives:

1. Minimize storage space
2. Represent all possible values of the field
3. Improve data integrity for the field
4. Support all data manipulations desired for the field

Mapping a composite attribute onto multiple fields with various data types

Creating and Using Composite Attribute Types

Data Integrity Controls

- Default Values: used if no explicit value is entered
- Format Controls: restrict data entry values in specific character positions
- Range Controls: force values to be among an acceptable set of values
- Referential Integrity: forces foreign keys to align with primary keys
- Null Value Controls: determine whether fields can be empty of value
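
The controls listed above can each be expressed as a column constraint. The following is a sketch in SQLite via Python's sqlite3 module, not the syntax of any particular product used in this unit. The table layout, the PHONE column, its pattern, and the salary range are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript(
    """
    CREATE TABLE DEPT (DEPTNO INTEGER PRIMARY KEY, DNAME TEXT NOT NULL);
    CREATE TABLE EMP (
        EMPNO  INTEGER PRIMARY KEY,
        ENAME  TEXT NOT NULL,                              -- null value control
        JOB    TEXT DEFAULT 'CLERK',                       -- default value
        PHONE  TEXT CHECK (PHONE GLOB
               '[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]'),    -- format control
        SAL    INTEGER CHECK (SAL BETWEEN 0 AND 100000),   -- range control
        DEPTNO INTEGER REFERENCES DEPT (DEPTNO)            -- referential integrity
    );
    INSERT INTO DEPT VALUES (10, 'ACCOUNTING');
    """
)

# A valid row: JOB is omitted, so the default applies.
conn.execute(
    "INSERT INTO EMP (EMPNO, ENAME, PHONE, SAL, DEPTNO) "
    "VALUES (1, 'ADAMS', '555-1234', 1200, 10)"
)
default_job = conn.execute("SELECT JOB FROM EMP WHERE EMPNO = 1").fetchone()[0]


def rejected(sql):
    """Return True if the statement violates one of the constraints."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


violations = {
    "null": rejected("INSERT INTO EMP (EMPNO, ENAME) VALUES (2, NULL)"),
    "format": rejected("INSERT INTO EMP (EMPNO, ENAME, PHONE) VALUES (3, 'BAKER', '5551234')"),
    "range": rejected("INSERT INTO EMP (EMPNO, ENAME, SAL) VALUES (4, 'CLARK', -5)"),
    "reference": rejected("INSERT INTO EMP (EMPNO, ENAME, DEPTNO) VALUES (5, 'DAVIS', 99)"),
}
print(default_job, violations)
```

Declaring the rules in the table definition means every application that writes to the table is held to them, rather than relying on each program to validate its own input.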

Referential integrity is important for ensuring that data relationships are accurate and consistent

What Is Denormalization?

The process of combining normalized relations into physical tables based on affinity of use of rows and fields, and on retrieval and update frequencies on the tables

Results in better speed of access, but reduces data integrity and increases data redundancy

This will result in null values in several rows’ application data.

This will result in duplications of item descriptions in several rows of the CanSupplyDR table.
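
The figure for this slide is not in the transcript. As a rough sketch of the trade-off, the code below builds two normalized relations and then a denormalized table that copies the item description into every supply row. All table names, column names, and rows here are hypothetical stand-ins, not the slide's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Normalized relations (hypothetical columns and rows).
    CREATE TABLE ITEM (ITEM_NO INTEGER PRIMARY KEY, DESCRIPTION TEXT);
    CREATE TABLE CAN_SUPPLY (
        VENDOR_NO  INTEGER,
        ITEM_NO    INTEGER,
        UNIT_PRICE REAL,
        PRIMARY KEY (VENDOR_NO, ITEM_NO)
    );
    INSERT INTO ITEM VALUES (1, 'BOLT'), (2, 'NUT');
    INSERT INTO CAN_SUPPLY VALUES
        (100, 1, 0.10), (200, 1, 0.12), (300, 1, 0.11), (100, 2, 0.05);

    -- Denormalized table: the description is copied into every supply row.
    CREATE TABLE CAN_SUPPLY_DR AS
        SELECT c.VENDOR_NO, c.ITEM_NO, i.DESCRIPTION, c.UNIT_PRICE
        FROM CAN_SUPPLY c JOIN ITEM i ON i.ITEM_NO = c.ITEM_NO;
    """
)

# Reads no longer need a join ...
bolt_rows = conn.execute(
    "SELECT VENDOR_NO, DESCRIPTION FROM CAN_SUPPLY_DR "
    "WHERE ITEM_NO = 1 ORDER BY VENDOR_NO"
).fetchall()

# ... but the description is now stored once per supplier instead of once per item.
copies = conn.execute(
    "SELECT COUNT(*) FROM CAN_SUPPLY_DR WHERE DESCRIPTION = 'BOLT'"
).fetchone()[0]
print(bolt_rows, copies)
```

Renaming the item now requires updating every copy, which is the integrity cost the definition of denormalization warns about.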

Duplicate regionManager data

What Is a File Organization?

A technique for physically arranging the row objects of a file

Main purpose of file organization is to optimize speed of data access and modification

Secondary Storage Concepts

- Block: a unit of data retrieval from secondary storage
- Extent: a set of contiguous blocks
- Scan: a complete read of a file block by block
- Blocking factor: the number of row objects that fit in one block

Determining Table Scan Time

Block read time is determined by seek, rotation and transfer.

Average_table_scan_time = (#rows / blocking_factor) * block_read_time
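
A worked instance of the formula, as a small sketch. The row count, blocking factor, and block read time are made-up figures for illustration. Rounding the block count up is an added assumption: a partly filled last block still costs one read.

```python
import math


def table_scan_time(num_rows, blocking_factor, block_read_time_ms):
    """Estimated time to read every block of a table, in milliseconds."""
    # Assumption: a partially filled final block still requires a full block read.
    num_blocks = math.ceil(num_rows / blocking_factor)
    return num_blocks * block_read_time_ms


# Hypothetical figures: 100,000 rows, 50 rows per block, 10 ms per block read.
scan_ms = table_scan_time(num_rows=100_000, blocking_factor=50, block_read_time_ms=10)
print(scan_ms)  # 2,000 blocks * 10 ms = 20000
```

Doubling the blocking factor halves the number of block reads, which is why row size and block size matter for scan-heavy workloads.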

What Is a Heap?

A file with no organization

Requires full table scan for data retrieval

Only use this for small, cacheable tables

What Is Hashing?

A technique that uses an algorithm to convert a key value to a row address

Useful for random access, but not for sequential access
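
A minimal sketch of the idea, using division-remainder hashing with chaining for collisions. This is one common scheme picked for illustration, not necessarily the one any given DBMS uses; the bucket count, function names, and sample keys are hypothetical.

```python
NUM_BUCKETS = 8  # assumed number of row addresses (buckets)


def row_address(key):
    """Convert a key value to a bucket number (division-remainder hashing)."""
    return key % NUM_BUCKETS


# Each bucket holds a list of (key, row) pairs so colliding keys can share it.
buckets = [[] for _ in range(NUM_BUCKETS)]


def insert(key, row):
    buckets[row_address(key)].append((key, row))


def lookup(key):
    """Random access: go straight to one bucket and search only that bucket."""
    for stored_key, row in buckets[row_address(key)]:
        if stored_key == key:
            return row
    return None


for empno, name in [(7698, "EMP_A"), (7844, "EMP_B"), (7900, "EMP_C")]:
    insert(empno, name)

found = lookup(7844)
missing = lookup(1234)
print(found, missing)
```

Rows are placed by hash value rather than by key order, so reading them in key sequence or fetching a key range would still mean visiting every bucket.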

What Is an Indexed File Organization?

A storage structure involving indexes, which are key values and pointers to row addresses

Indexed file organizations are structured to enable fast random and sequential access

Index files are fast for queries, but require additional overhead for inserts, deletes, and updates
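
One way to observe the effect of an index is to ask the database how it plans to run a query before and after the index exists. The sketch below does this in SQLite through Python's sqlite3 module; the table contents and the index name IDX_EMP_ENAME are hypothetical, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP (EMPNO INTEGER PRIMARY KEY, ENAME TEXT, JOB TEXT)")
conn.executemany(
    "INSERT INTO EMP VALUES (?, ?, ?)",
    [(i, f"NAME{i:04d}", "CLERK") for i in range(1, 1001)],  # hypothetical rows
)


def plan(sql):
    """Return SQLite's description of how it would run the query."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


query = "SELECT * FROM EMP WHERE ENAME = 'NAME0500'"

before = plan(query)  # no index on ENAME yet, so the whole table is scanned
conn.execute("CREATE INDEX IDX_EMP_ENAME ON EMP (ENAME)")
after = plan(query)   # the index lets SQLite search for the key directly

print(before)
print(after)
```

The cost is on the write side: every insert, delete, or update of ENAME now has to maintain the index as well as the table.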

Random Access Processing Using B+ Tree Indexes

Indexes are usually implemented as B+ trees

These are balanced trees, which preserve a sequential ascending order of items as they are added.
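
The figure for this slide is missing. The sketch below is a deliberate simplification: a sorted list of keys stands in for the leaf level of a B+ tree, which is enough to show why keeping keys in ascending order supports both random and sequential access. It does not implement node splitting or the upper index levels of a real B+ tree, and the keys and row addresses are hypothetical.

```python
import bisect

# Parallel sorted lists stand in for the leaf level of a B+ tree.
# Simplification: a real B+ tree splits nodes and keeps upper index levels.
keys = []
addresses = []


def add(key, address):
    """Insert a key while preserving ascending key order."""
    pos = bisect.bisect_left(keys, key)
    keys.insert(pos, key)
    addresses.insert(pos, address)


def find(key):
    """Random access: binary search for a single key."""
    pos = bisect.bisect_left(keys, key)
    if pos < len(keys) and keys[pos] == key:
        return addresses[pos]
    return None


def key_range(low, high):
    """Sequential access: all keys in [low, high], in ascending order."""
    start = bisect.bisect_left(keys, low)
    end = bisect.bisect_right(keys, high)
    return keys[start:end]


# Keys arrive out of order but are stored in order.
for k, addr in [(7844, 3), (7698, 1), (7900, 4), (7782, 2)]:
    add(k, addr)

print(keys)
print(find(7782))
print(key_range(7700, 7850))
```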

Issues to Consider When Selecting a File Organization

- File size
- Frequency of data retrievals
- Frequency of updates
- Factors related to primary and foreign keys
- Factors related to non-key attributes

Which Fields Should Be Indexed?

Design of Object-Relational Features

Object-relational databases support:
- Generalization and inheritance
- Aggregation
- Multivalued attributes
- Object identifiers
- Relationships by reference (pointers)

Generalization in Oracle 9i/10g

Aggregation in Oracle 9i/10g

Multivalued Attributes in Oracle 9i/10g

Object Identifiers in Oracle 9i/10g

SQL Server Object-Relational Features

SQL Server 2005:
- Common Language Runtime (CLR) integration

SQL Server 2008:
- Common Language Runtime (CLR) integration
- Spatial and geographic data types
- .NET Language Integrated Query (LINQ)
- Object-Relational Designer