PCLec08 / 1
8. Integrity
This session looks at the many and various aspects of ‘Integrity’.
Integrity processes are included to ensure that the data in the database, and the information derived from that data, is clear, complete and accurate.
We will also have another quick look at ‘Business Rules’.
PCLec08 / 2
But, before we do ...
Here is an answer to the puzzle which has been keeping you awake for the past 3 nights.
Move                   Source     Target     Time  Progressive
C and D move across    (A,B)      C,D          2       2
D returns              (A,B,D)    C            1       3
A and B move across    (D)        A,B,C       10      13
C returns              (C,D)      A,B          2      15
C and D move across               A,B,C,D      2      17
PCLec08 / 3
Integrity
Components of a Data Model
Structure
Operators
Integrity
PCLec08 / 4
Dangers to Data
A DBMS must protect data from several dangers.
- Accidents (programming errors and miskeying): these are integrity issues.
- Malicious Use: a security issue.
- Hardware and Software Failures: these are concurrency and restart issues.
PCLec08 / 5
Definitions of Integrity
Data integrity requires the database to be an accurate reflection of the real world.
Data should be valid and complete.
Integrity issues may have been handled outside the database, in the application code, and possibly in multiple programs.
Codd (1985) states that integrity constraints specific to a particular RDBMS must be definable in SQL and stored in the database dictionary (catalog).
PCLec08 / 6
A DBMS Enforced Integrity
Employee Table
Empno, Emp_name, Age, Salary
EMPNO NUMBER(6,0) NOT NULL
Attempt to add an employee without an EMPNO value:
INSERT INTO EMPLOYEE (EMP_NAME, AGE, SALARY) VALUES ('Smith', 22, 35000)
This insert is rejected by the DBMS, but what would happen if the user entered (35000, 'Smith', 22) ?
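The rejection described on this slide can be tried out in a small sketch. It uses Python's built-in sqlite3 module as a stand-in for Oracle (an assumption made only so the example runs anywhere), so the column types are SQLite's rather than NUMBER(6,0), and the helper name try_insert is made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the Oracle EMPLOYEE table.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        empno    INTEGER NOT NULL,   -- the DBMS-enforced rule from the slide
        emp_name TEXT,
        age      INTEGER,
        salary   INTEGER
    )
""")

def try_insert(sql, params):
    """Run one INSERT and report whether the DBMS accepted it."""
    try:
        conn.execute(sql, params)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

# No EMPNO value is supplied, so the NOT NULL constraint is violated.
result = try_insert(
    "INSERT INTO employee (emp_name, age, salary) VALUES (?, ?, ?)",
    ("Smith", 22, 35000),
)
print(result)
```

The NOT NULL rule says nothing about values entered in the wrong order; whether those are caught depends on data type and domain checks, which is the point of the question on the slide.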
PCLec08 / 7
An Applications Forced Integrity
A further constraint:
IF AGE > 16 AND AGE < 99
THEN OK
ELSE REJECT 'Age Invalid'
This represents a segment of program code.
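As a sketch of what such a program segment might look like, here is the age rule as a Python function, assuming the intended rule is that the age must be above 16 and below 99. The function name validate_age is made up for illustration. The last line shows why the two comparisons have to be joined with AND: joined with OR, the test is true for every age and nothing is ever rejected.

```python
def validate_age(age: int) -> str:
    """Return 'OK' for an acceptable age, otherwise the rejection message."""
    if 16 < age < 99:          # both conditions must hold (AND)
        return "OK"
    return "Age Invalid"

# Boundary values on either side of the permitted range.
results = [validate_age(a) for a in (16, 17, 98, 99)]
print(results)

# With OR instead of AND, every age passes the test.
or_version_always_true = all((a > 16 or a < 99) for a in range(-5, 200))
print(or_version_always_true)
```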
PCLec08 / 8
Integrity Enforcement
Integrity enforcement is usually split between the DBMS and the application programs.
Using application programs for integrity assertions has disadvantages.
Programming is more complex
Integrity constraints may be repeated
Change management is difficult
Constraints may contradict
Ad hoc operations may avoid the constraints
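For contrast, the same kind of rule can be declared once in the table definition, so that every route into the table (program, bulk load or ad hoc statement) meets it. This is a minimal sketch using SQLite through Python's sqlite3 module as a stand-in for the DBMS; the table layout and the 16 to 99 age range are assumptions carried over from the earlier age example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        empno    INTEGER NOT NULL,
        emp_name TEXT,
        age      INTEGER CHECK (age > 16 AND age < 99)  -- declared once, in the DBMS
    )
""")

def attempt(age):
    """Simulate an ad hoc insert that bypasses any application code."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", (1, "Smith", age))
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

outcomes = {age: attempt(age) for age in (12, 22, 120)}
print(outcomes)
```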
PCLec08 / 9
Integrity as a Role of the DBMS
Integrity rules must be considered at design time
Transactions must be monitored for violations
and appropriate actions taken
Rules should be few, without overlap and
should not impact performance too much
(this is not an invitation to exclude rules)
PCLec08 / 10
Classifications of Integrities
There are various possible classifications. Date (Vol. 2):
- Domain Integrity (attribute based)
- Relational Integrity (table based)
PCLec08 / 11
Codd’s CURED or CRUDE
Type E - Entity Integrity
Type R - Referential Integrity
Type D - Domain Integrity: a user defined datatype
Type C - Column Integrity: linked to Domain Integrity
Type U - User Defined Integrity
PCLec08 / 12
Data Integrity
Some terms you will encounter:
Entity Integrity
Referential Integrity
Functional Dependency (a constraint between a determinant and the attributes it determines: for each value of the determinant there is only one value for each of those attributes)
Multivalued Dependency
Join Dependency
Domain Constraints
Cardinality Constraint
User Defined Constraints
PCLec08 / 13
Data Integrity
General Principle: Data compliance with a set of rules
Rules Location: Best embodied in the DBMS
If they are contained in an application, there is the danger of saturating a network and causing degraded performance.
This is particularly so in client / server computing - but are ALL the rules applicable to ALL users ?
CONSTRAINTS: A declarative approach, where integrity constraints are ‘declared’ as part of a table specification. The ANSI SQL-89 and SQL-92 standards include specifications for integrity constraint syntax and behaviour.
PCLec08 / 14
Domain Integrity
A domain is a conceptual pool of values from which one or more attributes draw their actual values.
Domain age range 0-127
Attribute employee_age 16-65
Two values can only be compared if they come from the same domain.
PCLec08 / 15
Primary Key Integrity
(based on Oracle)
A primary key has these properties:
– unique value (no duplicates permitted in table)
– not null
– multiple keys if required
– referenced qualification - foreign key(s)
– may be limited to a small range of values (the check option)
– may be limited to a large range of values (the ‘exists’ option)
PCLec08 / 16
Foreign Key Properties
May be unique (1 : 1 relationship)
May be multiple keys
May be limited to a range of values (Domain - as for Primary Keys)
May be null (as required)
May be not null (as required)
Will reference a Primary Key (or keys)
May be subject to cascade update, delete, insert
PCLec08 / 17
A Domain Definition
DOMAIN GENDER
– Data Type: Character
– Length: 6 bytes
– Allowable Values: Male, Female, Null
– Storage Format: Uppercase
– Operations Allowed:
– Inherited Operators: String, Unstring, =
– Input Editing: Nil
– Extra Functions: Is_Male, Is_Female, What_Gender
PCLec08 / 18
Timing Constraints
When should an integrity constraint be checked ?
TC - Test the constraint no later than the end of the current relational request.
TT - Test at the end of the transaction (terminal test).
START TRANSACTION
  UPDATE EMPLOYEE SET SALARY = SALARY * 1.1     <- TC
  DELETE FROM EMPLOYEE WHERE SALARY > 1000
END TRANSACTION                                 <- TT
In Oracle, the timing of the check is determined by declaring a constraint ‘initially immediate’ or ‘initially deferred’. There is also a ‘set constraints ... immediate | deferred’ statement.
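The difference between checking at the end of each request (TC) and at the end of the transaction (TT) can be seen in a small sketch. It uses SQLite through Python's sqlite3 module rather than Oracle, which is an assumption made so the example is runnable. In SQLite a foreign key declared DEFERRABLE INITIALLY DEFERRED is checked at COMMIT, otherwise at the end of each statement. The tables dept and emp and the function name run are made up for illustration.

```python
import sqlite3

def run(deferred: bool) -> str:
    """Insert a child row before its parent, inside one transaction."""
    # isolation_level=None lets us issue BEGIN / COMMIT ourselves.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("PRAGMA foreign_keys = ON")
    timing = "DEFERRABLE INITIALLY DEFERRED" if deferred else ""
    conn.execute("CREATE TABLE dept (deptno TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE emp (empno TEXT PRIMARY KEY, "
        f"deptno TEXT REFERENCES dept (deptno) {timing})"
    )
    try:
        conn.execute("BEGIN")
        # The employee arrives before the department it refers to.
        conn.execute("INSERT INTO emp VALUES ('e1', 'd1')")   # TC check happens here
        conn.execute("INSERT INTO dept VALUES ('d1')")
        conn.execute("COMMIT")                                # TT check happens here
        return "committed"
    except sqlite3.IntegrityError:
        conn.execute("ROLLBACK")
        return "rejected"
    finally:
        conn.close()

print(run(deferred=False))
print(run(deferred=True))
```

With the immediate check the first insert fails even though the transaction as a whole would have left the data consistent; with the deferred check the same sequence is accepted.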
PCLec08 / 19
A Few Examples
A transaction is a unit of work
1. Single Table - The transaction affects 1 row only and does not alter any domain setting.
2. Single Table - The transaction affects multiple rows and will affect domain settings.
When should the domain integrity breach be reported ?
At the first breach, the second - or at the end of the process ?
When should the transaction be aborted ?
Should there be a log held of these occurrences/rows ?
PCLec08 / 20
A Few Examples
3. Single Table - bulk loading.
Should the load process be stopped at the
detection of the first breach ?
Or should the load row be ‘diverted’ to a log file ?
Should there be a number count of failures ?
Should there be a limit over which the process
should be stopped ?
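One possible set of answers to these questions is sketched below: rejected rows are diverted to a log, failures are counted, and the load stops once a limit is exceeded. This is only an illustration, not a recommendation from the slides. It uses SQLite through Python's sqlite3 module, and the table layout, the sample rows and the names bulk_load and max_failures are all made up.

```python
import sqlite3

def bulk_load(conn, rows, max_failures):
    """Insert rows one at a time, diverting rejected rows to a log.

    Stops early once more than max_failures rows have been rejected.
    Returns (number of rows loaded, list of (row, reason) rejections).
    """
    loaded, rejects = 0, []
    for row in rows:
        try:
            conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
            loaded += 1
        except sqlite3.IntegrityError as exc:
            rejects.append((row, str(exc)))      # the 'diverted' row and why
            if len(rejects) > max_failures:      # failure limit reached
                break
    return loaded, rejects

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        empno    INTEGER PRIMARY KEY,
        emp_name TEXT NOT NULL,
        age      INTEGER CHECK (age BETWEEN 16 AND 65)
    )
""")

rows = [
    (1, "Smith", 22),
    (2, "Jones", 12),    # breaks the age check
    (1, "Brown", 40),    # duplicate primary key
    (3, "Green", 30),
]
loaded, rejects = bulk_load(conn, rows, max_failures=5)
print(loaded, [row for row, reason in rejects])
```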
PCLec08 / 21
Slightly More Complex
[Diagram: a single transaction spanning tables A, B and C, which are linked by 1:M relationships (A to B, B to C)]
It is possible that multiple rows in table A, table B and table C will be affected by the transaction.
PCLec08 / 22
An Algorithm for Integrity Checks
Determine constraints that apply to request.
Inspect timing types.
Before the end of the relational request run types TC.
Append types TT to the end of the transaction.
Before the end of transaction run types TT.
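The steps above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any particular DBMS implements it: the database is a plain dictionary, each relational request is a function that changes it, every constraint is assumed to apply to every request, and the names Constraint, run_transaction, debit and credit are made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, int]

@dataclass(frozen=True)
class Constraint:
    name: str
    timing: str                       # "TC" or "TT"
    holds: Callable[[State], bool]    # True when the state satisfies the rule

def run_transaction(state: State, requests, constraints) -> str:
    working = dict(state)             # work on a copy so an abort changes nothing
    deferred: List[Constraint] = []
    for request in requests:
        request(working)
        for c in constraints:         # constraints that apply to this request
            if c.timing == "TC":      # run TC types before the request ends
                if not c.holds(working):
                    return f"aborted after request: {c.name}"
            elif c not in deferred:   # append TT types to the end of the transaction
                deferred.append(c)
    for c in deferred:                # run TT types before the transaction ends
        if not c.holds(working):
            return f"aborted at end of transaction: {c.name}"
    state.update(working)
    return "committed"

def debit(s): s["a"] -= 100
def credit(s): s["b"] += 100
def total_is_200(s): return s["a"] + s["b"] == 200

accounts = {"a": 150, "b": 50}
as_tt = run_transaction(accounts, [debit, credit],
                        [Constraint("total", "TT", total_is_200)])
as_tc = run_transaction({"a": 150, "b": 50}, [debit, credit],
                        [Constraint("total", "TC", total_is_200)])
print(as_tt, "|", as_tc)
```

The same transfer commits when the rule is tested at the end of the transaction, and aborts when it is tested after the first request, because the total is temporarily wrong between the two updates.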
PCLec08 / 23
Foreign Key Rules
For each foreign key three questions need to be answered:
• Can the foreign key accept nulls ?
• What should happen on an attempt to delete the target of a foreign key reference?
• What should happen on an attempt to update the target of a foreign key reference ? (primary key)
Employee (Empno, Ename, Worksfordept)
Empno: e1, e2, e3
Ename: red, blue, brown
Worksfordept: d1, d2

Dept (Dept, Dname)
d1  Pay
d2  Tax
d3  Art
PCLec08 / 24
Foreign Key Rules
When should foreign key rules be checked ?
Dept (Deptno, Dname, Budget)
Emp (Empno, Ename, Salary, WorksforDeptno)
WorksforDeptno References Dept delete cascades, update cascades
Depend (Empno, Dependname, Date-of-birth)
Empno references Emp delete cascades, update cascades
PCLec08 / 25
Foreign Keys and Referencing Action
CREATE TABLE SUPPLIER etc.
    Primary Key (Sno)
CREATE TABLE PART etc.
    Primary Key (Pno)
CREATE TABLE SUPPLIER_PART ( etc.
    Primary Key (Sno, Pno)
    Foreign Key (Sno) REFERENCES SUPPLIER (Sno)
        ON DELETE RESTRICT
        ON UPDATE CASCADE
    Foreign Key (Pno) REFERENCES PART (Pno)
        ON DELETE RESTRICT
        ON UPDATE CASCADE )
e.g. SUPPLIER (Sno, Sname)
     PART (Pno, Pname)
     SUPPLIER-PART (Sno, Pno, Qty)
PCLec08 / 26
Foreign Keys and Referencing Action
The relation each Foreign Key identifies is defined. The foreign key clause also contains other information.
DELETE: when the target record of a foreign key reference is deleted, one of these operations is performed -
CASCADE - all matching SUPPLIER-PART records are also deleted.
RESTRICT - the delete is permitted only when there are no matching SUPPLIER-PART records.
SET NULL - the foreign key values are all set to NULL (only if nulls are allowed)
PCLec08 / 27
Foreign Keys and Referencing Action
UPDATE when the Primary Key of the target record of a foreign key is updated.
CASCADE
RESTRICT
SET NULL
These options are similar to delete.
Note that the design decisions embodied in pseudo SQL represent constraint information which reflects the nature or business rules of the organisation being modelled.
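The three delete options and the cascading update can be tried out in a short sketch. It uses SQLite through Python's sqlite3 module rather than the pseudo SQL of the slides, which is an assumption made so it can be run. The SUPPLIER_PART table is simplified (no composite primary key, so that SET NULL is possible), and the sample rows and the function names make_db and delete_supplier are made up.

```python
import sqlite3

def make_db(on_delete: str) -> sqlite3.Connection:
    """Build SUPPLIER and SUPPLIER_PART with the chosen ON DELETE action."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only when asked
    conn.execute("CREATE TABLE supplier (sno TEXT PRIMARY KEY, sname TEXT)")
    conn.execute(f"""
        CREATE TABLE supplier_part (
            sno TEXT REFERENCES supplier (sno)
                ON DELETE {on_delete} ON UPDATE CASCADE,
            pno TEXT NOT NULL,
            qty INTEGER
        )
    """)
    conn.execute("INSERT INTO supplier VALUES ('s1', 'Acme')")
    conn.execute("INSERT INTO supplier_part VALUES ('s1', 'p1', 10)")
    conn.commit()
    return conn

def delete_supplier(on_delete: str):
    """Delete the referenced supplier; report the outcome and what is left."""
    conn = make_db(on_delete)
    try:
        conn.execute("DELETE FROM supplier WHERE sno = 's1'")
        outcome = "deleted"
    except sqlite3.IntegrityError:
        outcome = "refused"
    rows = conn.execute("SELECT sno, pno, qty FROM supplier_part").fetchall()
    conn.close()
    return outcome, rows

for action in ("CASCADE", "RESTRICT", "SET NULL"):
    print(action, delete_supplier(action))

# ON UPDATE CASCADE: changing the supplier number carries through.
conn = make_db("RESTRICT")
conn.execute("UPDATE supplier SET sno = 's9' WHERE sno = 's1'")
updated = conn.execute("SELECT sno FROM supplier_part").fetchall()
print(updated)
```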
PCLec08 / 28
Possible Referential Integrity Processes
1. Limited Insert : If an incoming Foreign Key DOES NOT EXIST as a referenced table Primary Key:
ABORT TRANSACTION - REPORT
2. Limited Update : If an incoming Foreign Key DOES NOT EXIST as a referenced table Primary Key
TERMINATE PROCESS
3. Restricted Delete : If there are referencing FOREIGN KEYS in a referencing table
TERMINATE DELETE PROCESS ON REFERENCED TABLE
PCLec08 / 29
Possible Referential Integrity Processes
4. Restricted Update : If there are referencing Foreign Keys in a referencing table
INHIBIT UPDATE OPERATION ON THE REFERENCED KEY
5. Cascade Delete : If there are Referenced Keys
INITIATE DELETION OPERATION ON REFERENCED
TABLE BY DELETING ALL REFERENCING ROWS
6. Cascade Update : Commence an UPDATE on the REFERENCED TABLE by UPDATING the Foreign Keys on all Referencing Rows in the Referencing Table(s)
PCLec08 / 30
Possible Referential Integrity Processes
7. Nullify Delete : Commence a DELETE operation on the REFERENCED table by setting ALL the FOREIGN KEYS on the Referencing Table(s) to NULL (watch Data Types)
8. Nullify Update : Set all of the Foreign Keys of the Referencing Table to NULL. This will invalidate any referencing of the Referenced Key (which must not be
NULL)
9. Default Update : Invalidate references to Updated Referenced Keys by setting all Referencing Table
Foreign Keys to a DEFAULT value
PCLec08 / 31
Possible Referential Integrity Processes
10. Default Delete : Invalidate references to the deleted Referenced Key Value(s) by setting all Referencing Foreign Key values to a DEFAULT value
11. Warning Delete : Permit the deletion BUT Warn the User of the Unattached Foreign Keys which are now present in the Referencing Table(s)
12. Warning Update : Permit the Update BUT Warn the User of Unattached Foreign Keys which are now present in the Referencing Table(s)
PCLec08 / 32
Some Integrity Schema Examples
Create table monash1(
city varchar2(13) not null,
studydate date not null,
noonreading number(4,1),
midnightreading number(4,1),
rainfall number,
unique (city,studydate) );
Creates a table with the candidate key of city,studydate
There may be a number of Unique constraints
PCLec08 / 33
Some Integrity Schema Examples
Create table monash1(
city varchar2(13) not null,
studydate date not null,
noonreading number(4,1),
midnightreading number(4,1),
rainfall number,
primary key (city,studydate) );
Creates a table with the Primary Key of city,studydate
and there is only 1 such set of values in the table.
There may be a number of Unique constraints.
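A quick way to see the primary key at work is to try a duplicate. The sketch below uses SQLite through Python's sqlite3 module in place of Oracle, so the column types differ from varchar2, date and number, and the sample readings are invented purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE monash1 (
        city            TEXT NOT NULL,
        studydate       TEXT NOT NULL,
        noonreading     REAL,
        midnightreading REAL,
        rainfall        REAL,
        PRIMARY KEY (city, studydate)
    )
""")

# Two different days for the same city are fine.
conn.execute("INSERT INTO monash1 VALUES ('Melbourne', '1999-03-01', 24.5, 15.0, 0.0)")
conn.execute("INSERT INTO monash1 VALUES ('Melbourne', '1999-03-02', 22.0, 14.5, 3.2)")

# A second row for the same city and date breaks the primary key.
try:
    conn.execute("INSERT INTO monash1 VALUES ('Melbourne', '1999-03-01', 30.0, 18.0, 0.0)")
    duplicate_outcome = "accepted"
except sqlite3.IntegrityError:
    duplicate_outcome = "rejected"

row_count = conn.execute("SELECT COUNT(*) FROM monash1").fetchone()[0]
print(duplicate_outcome, row_count)
```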
PCLec08 / 34
Some Integrity Schema Examples
Create table monash1(
city varchar2(13) not null,
studydate date not null,
noonreading number(4,1),
midnightreading number(4,1),
rainfall number,
constraint pk_citystudy primary key (city,studydate) );
Creates a table with the Primary Key of city,studydate
and names the constraint pk_citystudy in the Constraints table.
PCLec08 / 35
Enable, Disable
There is a feature in Oracle which permits the
Disabling and Enabling of constraints.
e.g. alter table shipping add primary key (ship_no, container_no)
This identifies the composite primary key as ship_no + container_no, and ensures that no two rows have the same values.
The Disable option defines the constraint but does not enforce it.
The Enable function resets the enforcement of the constraint.
PCLec08 / 36
Enable, Disable
The formats of the ‘disable’ and ‘enable’ commands are:
disable { { unique (column [, column] …) |
            primary key |
            constraint constraint } [cascade] } |
        all triggers
enable  { { unique (column [, column] …) |
            primary key }
          [using index [initrans integer]
                       [maxtrans integer]
                       [tablespace tablespace]
                       [storage storage] ] } |
        all triggers
PCLec08 / 37
Triggers
• Oracle triggers are used to add more processing power to the DBMS function for events which affect a database.
• In the following example a Trigger will be set which ensures that changes to employee records will only take place during business hours on working days ( security ?)
• See if you agree ...
PCLec08 / 38
Triggers
create trigger emp_permit_change
before
delete or insert or update
on emp
declare dummy integer;
begin
/* if today is a Saturday or Sunday, then return an error */
if (to_char(sysdate, 'dy') = 'sat' or
    to_char(sysdate, 'dy') = 'sun')
then raise_application_error (-20501,
    'May not change employee table during the weekend');
end if;
PCLec08 / 39
Triggers
• Perhaps we need this as well:
if (to_char(sysdate, 'hh24') < 8
    or to_char(sysdate, 'hh24') >= 18)
then raise_application_error (-20502,
    'May only change employee table during working hours');
end if;
end;
which raises an interesting point - what happens with flexible time and enterprise bargaining ?
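The decision logic of the trigger (weekend test first, then the working hours test) can be checked in isolation with a small Python function. This is only a sketch of the same two tests, not Oracle code; the function name change_refused is made up, and the working day is taken as 08:00 up to but not including 18:00, as in the trigger text.

```python
from datetime import datetime
from typing import Optional

def change_refused(when: datetime) -> Optional[str]:
    """Return the refusal message, or None when the change is permitted."""
    if when.weekday() >= 5:                    # 5 = Saturday, 6 = Sunday
        return "May not change employee table during the weekend"
    if when.hour < 8 or when.hour >= 18:       # outside 08:00 to 17:59
        return "May only change employee table during working hours"
    return None

# 6 January 2024 was a Saturday; 8 January 2024 was a Monday.
print(change_refused(datetime(2024, 1, 6, 10, 0)))
print(change_refused(datetime(2024, 1, 8, 7, 59)))
print(change_refused(datetime(2024, 1, 8, 9, 30)))
```

Passing the time in as a parameter, rather than reading the clock inside the function, also makes it easier to experiment with the flexible-time question raised above.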
PCLec08 / 41
Business Rules and Data Modelling
Business Rules are necessary to ensure that data in a database reflects accurately those conditions which apply to data in the real world environment
The following overheads introduce some additional material on this subject
PCLec08 / 42
Business Rules and Data Modelling
Business Rules are at the core of commercial applications
If systems ‘obey’ the Business Rules, then
– data will be correct
– applications will function
– users and management will be happy
Which leads us to
– what is a business rule ?
– where are they declared ?
– where are they enforced ?
PCLec08 / 43
Business Rules and Data Modelling
4 Proposed Levels of Business Rules
1. Single attribute (column) format definitions enforced by the database
The ‘payment’ column is an amount interpreted as dollars and cents
The Surname column is a text field expressed in the ASCII character set
The Amount_on_Hand column must never be less than 0
PCLec08 / 44
Business Rules and Data Modelling
2. Multiple key column relationships
The ‘Brand Name’ column in the Brand table has a many to one relationship with the Manufacturer Name in the Manufacturer table
The Product foreign key in the Sales table has a many to one relationship to the Product primary key in the Product table
PCLec08 / 45
Business Rules and Data Modelling
3. Relationships between Entities
This is declared on the entity-relationship diagram, but is not directly enforced by the database because the relationship is many-to-many
The employee is a sub-type of Person
Supplier supplies the Customer
PCLec08 / 46
Business Rules and Data Modelling
4. Complex Business Logic
This relates to Business processes
It may be enforced at data entry time by a complex application such as this :-
“When an insurance policy has been committed but has not yet been approved by the underwriter, the administration date can be NULL, but when the policy has been underwritten, the administration date must be present (NOT NULL) and must be more recent than the agreement date”
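As an illustration of how an application might enforce this rule at data entry time, here is a minimal Python sketch. The function name policy_errors, the parameter names and the sample dates are all made up, and "underwritten" is assumed to be available as a simple yes/no flag.

```python
from datetime import date
from typing import List, Optional

def policy_errors(underwritten: bool,
                  agreement_date: date,
                  administration_date: Optional[date]) -> List[str]:
    """Return the rule violations for one policy (empty list means valid)."""
    errors = []
    if underwritten:
        if administration_date is None:
            errors.append("administration date must be present once underwritten")
        elif administration_date <= agreement_date:
            errors.append("administration date must be more recent than the agreement date")
    # Before underwriting, a NULL (None) administration date is acceptable.
    return errors

agreed = date(2024, 3, 1)
print(policy_errors(False, agreed, None))
print(policy_errors(True, agreed, None))
print(policy_errors(True, agreed, date(2024, 2, 1)))
print(policy_errors(True, agreed, date(2024, 3, 15)))
```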
PCLec08 / 47
Business Rules and Data Modelling
From this it can be stated that :
The core database software manages the first 2 levels only - the single column format definitions and multiple column key relationships
Level 3 (relationships between entities) and Level 4 (complex business logic) should also be enforced as there is much valuable business content at this level (or should that be essential ?)
PCLec08 / 48
Business Rules and Data Modelling
Entity relationship modelling (E/R modelling) seems to be a comprehensive language for mapping and describing relationships between entities.
E/R modelling is a diagrammatic technique which specifies
one-to-one
many-to-one and
many-to-many relationships among data elements
It is a logical model
PCLec08 / 49
Business Rules and Data Modelling
Computer Associates’ Erwin converts an E/R diagram into data definition language declarations
These declarations define key definitions and join constraints
You can follow this up, and use an Erwin example at
www.cai.com
Gershwin, which you have probably met, is a simpler E/R modelling tool
PCLec08 / 50
Business Rules and Data Modelling
E/R modelling is a useful technique for beginning the process of understanding and enforcing business rules
It does not provide a guarantee of completeness
E/R Modelling is incomplete in that the diagrams represent only what the designer decided to stress, or was aware of.
There is no test of an E/R diagram to determine if the designer has specified all possible one-to-one, one-to-many or many-to-many relationships.
PCLec08 / 51
Business Rules and Data Modelling
E/R modelling is not unique
A given set of data relationships can be represented by a number of E/R diagrams
Many real data relationships are many-to-many.
The E/R diagram model does not enforce M:N situations, which may involve various conditions and degrees of correlation that would be useful (and perhaps essential) to include as business rules. E/R modelling provides no extensions to the basic many-to-many declaration.
PCLec08 / 52
Business Rules and Data Modelling
Many E/R models are ideal, not real
Many corporate models are based on ‘how things should be’
This is very useful in understanding the business
BUT if the model must be populated with real data
E/R models are rarely models of real data
There aren’t any tools for trawling over real data sets and developing E/R models
The E/R model is invariably constructed and the data is ‘fitted’ into the model - and that means we need to clean data before it becomes resident in the model.
PCLec08 / 53
Business Rules and Data Modelling
E/R models lead to complex schemas which militate against the objectives of Information Delivery
As an example, the E/R diagram which underpins Oracle Financials (a current Applications Package) requires approximately 2000 tables
SAP’s model can require 10,000 tables.
All of which tends to work against the objectives of easy to understand models, and high performance.
PCLec08 / 54
Business Rules and Data Modelling
Chris Date (An Introduction to Database Systems, 7th edition) has this to say :
‘the E/R model is incapable of dealing with integrity constraints or ‘business rules’ except for a few special cases.
Declarative rules are too complex to be captured as part of the business model and must be defined separately by the analyst/developer’.