CS 405G: Introduction to Database Systems Instructor: Jinze Liu Fall 2009.
Review
A data model is a group of concepts for describing data.
What are the two terms used by the ER model to describe a miniworld? Entity and relationship.
What makes a good database design?
Student (sid: string, name: string, login: string, age: integer, gpa: real)
Topics Next
More case study
Conversion of ER models to schemas
ER Case study
Design a database representing cities, counties, and states
  For states, record name and capital (city)
  For counties, record name, area, and location (state)
  For cities, record name, population, and location (county and state)
Assume the following:
  Names of states are unique
  Names of counties are only unique within a state
  Names of cities are only unique within a county
  A city is always located in a single county
  A county is always located in a single state
Case study: first design
[ER diagram: Cities (name, population, county_name, county_area) In States (name, capital)]
County area information is repeated for every city in the county
  Redundancy is bad. What else?
State capital should really be a city
  Should “reference” entities through explicit relationships
Case study: second design
[ER diagram: Cities (name, population) In Counties (name, area) In States (name); Cities IsCapitalOf States]
Technically, nothing in this design could prevent a city in state X from being the capital of another state Y, but oh well…
Database Design
A Relation is a Table
Beers
name         manf
Winterbrew   Pete’s
Bud Lite     Anheuser-Busch
Attributes are the column headers (name, manf).
Tuples are the rows.
Schemas
Relation schema = relation name + attributes, in order (+ types of attributes).
  Example: Beers(name, manf) or Beers(name: string, manf: string)
Database = collection of relations.
Database schema = set of all relation schemas in the database.
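A relation schema can be made concrete with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in DBMS (the slides do not name a system) and uses SQLite's TEXT type for "string". It declares Beers and reads the schema back from the catalog.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Relation schema: relation name + attributes, in order (+ types).
conn.execute("CREATE TABLE Beers (name TEXT, manf TEXT)")

# PRAGMA table_info yields one row per attribute:
# (cid, name, type, notnull, dflt_value, pk)
beers_schema = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(Beers)")]

# Database schema: the set of all relation schemas in the database.
relation_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

print(beers_schema)
print(relation_names)
```

The attribute order in beers_schema follows the order of the declaration, which is why the slide says "in order".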
Why Relations?
Very simple model.
Often matches how we think about data.
Abstract model that underlies SQL, the most important database language today.
  But SQL uses bags, while the relational model is a set-based model.
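The bag-versus-set difference is easy to observe. The sketch below assumes Python's built-in sqlite3 module; the 'Bud' row is added only for illustration so that one manufacturer appears twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Beers (name TEXT, manf TEXT)")
conn.executemany(
    "INSERT INTO Beers VALUES (?, ?)",
    [
        ("Bud", "Anheuser-Busch"),       # illustrative extra row
        ("Bud Lite", "Anheuser-Busch"),
        ("Winterbrew", "Pete's"),
    ],
)

# SQL projection keeps duplicates: the result is a bag (multiset).
bag = [r[0] for r in conn.execute("SELECT manf FROM Beers ORDER BY manf")]

# DISTINCT removes duplicates, giving the set the relational model expects.
as_set = [r[0] for r in conn.execute("SELECT DISTINCT manf FROM Beers ORDER BY manf")]

print(bag)
print(as_set)
```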
From E/R Diagrams to Relations
Entity sets become relations with the same set of attributes.
Relationships become relations whose attributes are only:
  The keys of the connected entity sets.
  Attributes of the relationship itself.
Entity Set -> Relation
[E/R diagram: entity set Beers with attributes name and manf]
Relation: Beers(name, manf)
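Relationship -> Relation
[E/R diagram: Drinkers (name, addr) and Beers (name, manf) connected by Likes and Favorite; Married connects Drinkers to itself with roles husband and wife; Buddies connects Drinkers to itself with roles 1 and 2]
Likes(drinker, beer)
Favorite(drinker, beer)
Married(husband, wife)
Buddies(name1, name2)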
Combining Relations
It is OK to combine the relation for an entity set E with the relation R for a many-one relationship from E to another entity set.
Example: Drinkers(name, addr) and Favorite(drinker, beer) combine to make Drinker1(name, addr, favBeer).
Risk with Many-Many Relationships
Combining Drinkers with Likes would be a mistake. It leads to redundancy, as:
name    addr        beer
Sally   123 Maple   Bud
Sally   123 Maple   Miller
Redundancy: the addr value is repeated in every row for Sally.
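The redundancy can be counted directly. This is a minimal sketch, assuming Python's built-in sqlite3 module and the Drinkers and Likes relations from the earlier slides; the join stands in for the combined relation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Drinkers (name TEXT PRIMARY KEY, addr TEXT);
CREATE TABLE Likes (drinker TEXT, beer TEXT, PRIMARY KEY (drinker, beer));
INSERT INTO Drinkers VALUES ('Sally', '123 Maple');
INSERT INTO Likes VALUES ('Sally', 'Bud'), ('Sally', 'Miller');
""")

# What a combined Drinkers+Likes relation would have to store.
combined = conn.execute("""
    SELECT d.name, d.addr, l.beer
    FROM Drinkers AS d JOIN Likes AS l ON l.drinker = d.name
    ORDER BY l.beer
""").fetchall()

# Kept separate, the address is stored once; combined, once per liked beer.
addr_copies_separate = conn.execute(
    "SELECT COUNT(*) FROM Drinkers WHERE name = 'Sally'"
).fetchone()[0]
addr_copies_combined = sum(1 for row in combined if row[1] == "123 Maple")

print(combined)
print(addr_copies_separate, addr_copies_combined)
```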
Handling Weak Entity Sets
Relation for a weak entity set must include attributes for its complete key (including those belonging to other entity sets), as well as its own, nonkey attributes.
A supporting (double-diamond) relationship is redundant and yields no relation.
Example
[E/R diagram: weak entity set Logins (name, time) connected by the supporting relationship At to Hosts (name)]
Hosts(hostName)
Logins(loginName, hostName, time)
At(loginName, hostName, hostName2)
  hostName and hostName2 must be the same, so At becomes part of Logins.
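One possible rendering of the weak entity set as tables is sketched below. It assumes Python's built-in sqlite3 module; the host names, login name, and times are made-up sample values. The key of Logins is the pair (loginName, hostName), so the same login name may exist on two hosts but not twice on one host.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Hosts (hostName TEXT PRIMARY KEY);
CREATE TABLE Logins (
    loginName TEXT NOT NULL,
    hostName  TEXT NOT NULL REFERENCES Hosts(hostName),  -- borrowed key attribute
    time      TEXT,
    PRIMARY KEY (loginName, hostName)                    -- complete key
);
INSERT INTO Hosts VALUES ('alpha'), ('beta');
INSERT INTO Logins VALUES ('jdoe', 'alpha', '09:00');
INSERT INTO Logins VALUES ('jdoe', 'beta', '09:05');
""")

# Same login name on the same host violates the composite key.
try:
    conn.execute("INSERT INTO Logins VALUES ('jdoe', 'alpha', '10:00')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

login_count = conn.execute("SELECT COUNT(*) FROM Logins").fetchone()[0]
print(duplicate_rejected, login_count)
```

No separate At table is created: the hostName column of Logins already records which host each login is at.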
A (Slightly) Formal Definition
A database is a collection of relations (or tables)
Each relation is identified by a name and a list of attributes (or columns)
Each attribute has a name and a domain (or type)
  Set-valued attributes not allowed
Schema versus instance
Schema (metadata)
  Specification of how data is to be structured logically
  Defined at set-up
  Rarely changes
Instance
  Content
  Changes rapidly, but always conforms to the schema
Compare to type and objects of type in a programming language
Example
Schema
  Student (SID integer, name string, age integer, GPA float)
  Course (CID string, title string)
  Enroll (SID integer, CID string)
Instance
  { <142, Bart, 10, 2.3>, <123, Milhouse, 10, 3.1>, ... }
  { <CPS116, Intro. to Database Systems>, ... }
  { <142, CPS116>, <142, CPS114>, ... }
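The comparison with types and objects in a programming language can be sketched in Python. This is only an analogy, not a database: the class plays the role of the Student schema and a set of its objects plays the role of an instance.

```python
from typing import NamedTuple


class Student(NamedTuple):
    """Schema: fixed at set-up, rarely changes."""
    sid: int
    name: str
    age: int
    gpa: float


# Instance: the current content, which changes over time.
student_instance = {Student(142, "Bart", 10, 2.3)}
student_instance.add(Student(123, "Milhouse", 10, 3.1))

# Every tuple in the instance conforms to the same schema.
schema_fields = Student._fields
print(schema_fields)
print(sorted(student_instance))
```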
Relational Integrity Constraints
Constraints are conditions that must hold on all valid relation instances. There are four main types of constraints:
  1. Domain constraints
     The value of an attribute must come from its domain
  2. Key constraints
  3. Entity integrity constraints
  4. Referential integrity constraints
Primary Key Constraints
A set of fields is a candidate key for a relation if:
  1. No two distinct tuples can have the same values in all key fields, and
  2. This is not true for any proper subset of the key.
If part 2 is false, the set of fields is a superkey.
If there is more than one key for a relation, one of the keys is chosen (by the DBA) to be the primary key.
E.g., given a schema Student(sid: string, name: string, gpa: float) we have:
  sid is a key for Student. (What about name?)
  The set {sid, gpa} is a superkey.
Jinze Liu @ University of Kentucky
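The two conditions can be checked mechanically against one instance. The helper names below are hypothetical, and the rows come from the Students table shown later in the slides. Note the limit of the check: a key is a property of the schema (all valid instances), so a single instance can refute a key but never prove one.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two tuples in rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values in seen:
            return False
        seen.add(values)
    return True


def is_candidate_key(rows, attrs):
    """Superkey none of whose non-empty proper subsets is a superkey (on this instance)."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )


students = [
    {"sid": "53666", "name": "Jones", "gpa": 3.4},
    {"sid": "53688", "name": "Smith", "gpa": 3.2},
    {"sid": "53650", "name": "Smith", "gpa": 3.8},
]

print(is_candidate_key(students, ("sid",)))        # sid alone is minimal
print(is_superkey(students, ("name",)))            # two students are named Smith
print(is_superkey(students, ("sid", "gpa")))       # unique, but not minimal
print(is_candidate_key(students, ("sid", "gpa")))
```

On this particular instance gpa alone also happens to be unique, which illustrates why keys must be declared from the meaning of the data rather than inferred from sample rows.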
Key Example
CAR (licence_num: string, Engine_serial_num: string, make: string, model: string, year: integer)
  What are the candidate keys?
  Which one might you use as the primary key?
  What are the superkeys?
Entity Integrity
Entity Integrity: The primary key attributes PK of each relation schema R in S cannot have null values in any tuple of r(R).
Other attributes of R may be similarly constrained to disallow null values, even though they are not members of the primary key.
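Entity integrity can be tried out in a short sketch, assuming Python's built-in sqlite3 module. One SQLite-specific caveat: an ordinary SQLite table accepts NULL in a PRIMARY KEY column unless NOT NULL is also declared, so the sketch declares it explicitly. The choice to forbid NULL in name but allow it in gpa is an assumption made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Student (
        sid  TEXT NOT NULL PRIMARY KEY,  -- entity integrity: key may not be NULL
        name TEXT NOT NULL,              -- non-key attribute, also constrained
        gpa  REAL                        -- NULL allowed
    )
""")


def try_insert(row):
    """Return 'ok' if the row is accepted, 'rejected' on a constraint violation."""
    try:
        conn.execute("INSERT INTO Student VALUES (?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


null_key = try_insert((None, "Mary Carter", 3.8))
null_name = try_insert(("1123", None, 3.8))
null_gpa = try_insert(("1123", "Mary Carter", None))
print(null_key, null_name, null_gpa)
```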
Foreign Keys, Referential Integrity
Foreign key: Set of fields in one relation that is used to `refer’ to a tuple in another relation. (Must correspond to primary key of the second relation.) Like a `logical pointer’.
E.g. sid is a foreign key referring to Students:
  Student(sid: string, name: string, gpa: float)
  Enrolled(sid: string, cid: string, grade: string)
If all foreign key constraints are enforced, referential integrity is achieved, i.e., no dangling references.
Can you name a data model w/o referential integrity? Links in HTML!
Foreign Keys
Only students listed in the Students relation should be allowed to enroll for courses.
Students
sid     name    login        age   gpa
53666   Jones   jones@cs     18    3.4
53688   Smith   smith@eecs   18    3.2
53650   Smith   smith@math   19    3.8
Enrolled
sid     cid           grade
53666   Carnatic101   C
53666   Reggae203     B
53650   Topology112   A
53666   History105    B
Or, use NULL as the value for the foreign key in the referencing tuple when the referenced tuple does not exist
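A foreign key check can be sketched as follows, assuming Python's built-in sqlite3 module (SQLite enforces foreign keys only after PRAGMA foreign_keys = ON). The sid 99999 is a made-up value that does not appear in Students, and Enrolled is declared without a primary key so that a NULL sid can be shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Students (sid TEXT PRIMARY KEY, name TEXT, login TEXT, age INTEGER, gpa REAL);
CREATE TABLE Enrolled (sid TEXT REFERENCES Students(sid), cid TEXT, grade TEXT);
INSERT INTO Students VALUES ('53666', 'Jones', 'jones@cs', 18, 3.4);
""")


def enroll(sid, cid, grade):
    """Return 'ok' if the Enrolled tuple is accepted, 'rejected' otherwise."""
    try:
        conn.execute("INSERT INTO Enrolled VALUES (?, ?, ?)", (sid, cid, grade))
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


listed_student = enroll("53666", "Carnatic101", "C")    # sid exists in Students
dangling_student = enroll("99999", "Topology112", "A")  # would be a dangling reference
null_reference = enroll(None, "History105", "B")        # NULL refers to nothing
print(listed_student, dangling_student, null_reference)
```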
In-Class Exercise
(Taken from Exercise 5.16)
Consider the following relations for a database that keeps track of student enrollment in courses and the books adopted for each course:
STUDENT(SSN, Name, Major, Bdate)
COURSE(Course#, Cname, Dept)
ENROLL(SSN, Course#, Quarter, Grade)
BOOK_ADOPTION(Course#, Quarter, Book_ISBN)
TEXT(Book_ISBN, Book_Title, Publisher, Author)
Draw a relational schema diagram specifying the foreign keys for this schema.
Other Types of Constraints
Semantic Integrity Constraints:
  based on application semantics and cannot be expressed by the model per se
  e.g., “the max. no. of hours per employee for all projects he or she works on is 56 hrs per week”
  A constraint specification language may have to be used to express these
  SQL-99 allows triggers and ASSERTIONS to express some of these
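The 56-hour rule could be approximated with a trigger. The sketch below assumes Python's built-in sqlite3 module and a hypothetical WorksOn(emp, project, hours) table that is not part of the slides. It guards inserts only; a complete solution would need a similar trigger for updates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE WorksOn (emp TEXT, project TEXT, hours REAL, PRIMARY KEY (emp, project));

-- Reject an insert that would push an employee's weekly total above 56 hours.
CREATE TRIGGER max_weekly_hours
BEFORE INSERT ON WorksOn
WHEN (SELECT COALESCE(SUM(hours), 0) FROM WorksOn WHERE emp = NEW.emp) + NEW.hours > 56
BEGIN
    SELECT RAISE(ABORT, 'employee would exceed 56 hours per week');
END;
""")


def assign(emp, project, hours):
    """Return 'ok' if the assignment is accepted, 'rejected' if the trigger aborts it."""
    try:
        conn.execute("INSERT INTO WorksOn VALUES (?, ?, ?)", (emp, project, hours))
        return "ok"
    except sqlite3.DatabaseError:
        return "rejected"


results = [assign("e1", "p1", 40), assign("e1", "p2", 16), assign("e1", "p3", 1)]
total_hours = conn.execute("SELECT SUM(hours) FROM WorksOn WHERE emp = 'e1'").fetchone()[0]
print(results, total_hours)
```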
Update Operations on Relations
Update operations
  INSERT a tuple.
  DELETE a tuple.
  MODIFY a tuple.
Constraints should not be violated in updates
Example
We have the following relational schemas
  Student(sid: string, name: string, gpa: float)
  Course(cid: string, department: string)
  Enrolled(sid: string, cid: string, grade: character)
We have the following sequence of database update operations. (Assume all tables are empty before we apply any operations.)
INSERT <‘1234’, ‘John Smith’, 3.5> into Student
  Student
  sid    name         gpa
  1234   John Smith   3.5
Example (Cont.)
INSERT <‘647’, ‘EECS’> into Course
  Course
  cid   department
  647   EECS
INSERT <‘1234’, ‘647’, ‘B’> into Enrolled
  Enrolled
  sid    cid   grade
  1234   647   B
UPDATE the grade in the Enrolled tuple with sid = 1234 and cid = 647 to ‘A’.
  Enrolled
  sid    cid   grade
  1234   647   A
DELETE the Enrolled tuple with sid 1234 and cid 647
  Enrolled
  sid    cid   grade
  (empty)
Student is unchanged by these operations:
  Student
  sid    name         gpa
  1234   John Smith   3.5
Exercise
Starting state
  Student
  sid    name         gpa
  1234   John Smith   3.5
  Course
  cid   department
  647   EECS
  Enrolled
  sid   cid   grade
  (empty)
INSERT <‘108’, ‘MATH’> into Course
  Course
  cid   department
  647   EECS
  108   MATH
INSERT <‘1234’, ‘108’, ‘B’> into Enrolled
  Enrolled
  sid    cid   grade
  1234   108   B
INSERT <‘1123’, ‘Mary Carter’, 3.8> into Student
  Student
  sid    name          gpa
  1234   John Smith    3.5
  1123   Mary Carter   3.8
Exercise (cont.)
A little bit tricky
INSERT <‘1125’, ‘Bob Lee’, ‘good’> into Student
  Fails due to the domain constraint (gpa must be a float)
INSERT <‘1123’, NULL, ‘B’> into Enrolled
  Fails due to entity integrity (assuming cid is part of the primary key of Enrolled)
INSERT <‘1233’, ‘647’, ‘A’> into Enrolled
  Fails due to referential integrity (no Student tuple has sid 1233)
All three operations fail, so the tables are unchanged:
  Student
  sid    name          gpa
  1234   John Smith    3.5
  1123   Mary Carter   3.8
  Course
  cid   department
  647   EECS
  108   MATH
  Enrolled
  sid    cid   grade
  1234   108   B
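The three failing inserts can be reproduced in a sketch, assuming Python's built-in sqlite3 module and assuming (sid, cid) is the primary key of Enrolled. Because SQLite is dynamically typed, the gpa domain is enforced here with a CHECK on typeof(); other systems would reject the value from the column type alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Student (
    sid  TEXT NOT NULL PRIMARY KEY,
    name TEXT,
    gpa  REAL CHECK (gpa IS NULL OR typeof(gpa) IN ('real', 'integer'))  -- domain
);
CREATE TABLE Course (cid TEXT NOT NULL PRIMARY KEY, department TEXT);
CREATE TABLE Enrolled (
    sid   TEXT NOT NULL REFERENCES Student(sid),   -- referential integrity
    cid   TEXT NOT NULL REFERENCES Course(cid),
    grade TEXT,
    PRIMARY KEY (sid, cid)                         -- entity integrity with NOT NULL
);
INSERT INTO Student VALUES ('1234', 'John Smith', 3.5), ('1123', 'Mary Carter', 3.8);
INSERT INTO Course VALUES ('647', 'EECS'), ('108', 'MATH');
INSERT INTO Enrolled VALUES ('1234', '108', 'B');
""")


def rejected(sql, params):
    """Run one insert; return True if a constraint rejects it."""
    try:
        conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError as exc:
        print("rejected:", exc)
        return True


domain_failure = rejected("INSERT INTO Student VALUES (?, ?, ?)", ("1125", "Bob Lee", "good"))
entity_failure = rejected("INSERT INTO Enrolled VALUES (?, ?, ?)", ("1123", None, "B"))
referential_failure = rejected("INSERT INTO Enrolled VALUES (?, ?, ?)", ("1233", "647", "A"))

student_count = conn.execute("SELECT COUNT(*) FROM Student").fetchone()[0]
enrolled_count = conn.execute("SELECT COUNT(*) FROM Enrolled").fetchone()[0]
```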
Exercise (cont.)
A more tricky one
UPDATE the cid in the tuple from Course where cid = 108 to 109
Before
  Student
  sid    name          gpa
  1234   John Smith    3.5
  1123   Mary Carter   3.8
  Course
  cid   department
  647   EECS
  108   MATH
  Enrolled
  sid    cid   grade
  1234   108   B
After (the new cid is propagated to the referencing Enrolled tuple)
  Course
  cid   department
  647   EECS
  109   MATH
  Enrolled
  sid    cid   grade
  1234   109   B
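The propagation from Course to Enrolled corresponds to a cascading update. The sketch below assumes Python's built-in sqlite3 module and declares the foreign key with ON UPDATE CASCADE; the Student table is left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Course (cid TEXT PRIMARY KEY, department TEXT);
CREATE TABLE Enrolled (
    sid   TEXT,
    cid   TEXT REFERENCES Course(cid) ON UPDATE CASCADE,  -- follow key changes
    grade TEXT
);
INSERT INTO Course VALUES ('647', 'EECS'), ('108', 'MATH');
INSERT INTO Enrolled VALUES ('1234', '108', 'B');
""")

# Changing the referenced key triggers the matching update in Enrolled.
conn.execute("UPDATE Course SET cid = '109' WHERE cid = '108'")

courses = conn.execute("SELECT cid, department FROM Course ORDER BY cid").fetchall()
enrolled = conn.execute("SELECT sid, cid, grade FROM Enrolled").fetchall()
print(courses)
print(enrolled)
```

Without the ON UPDATE CASCADE clause the same UPDATE would be rejected, because the Enrolled tuple would be left pointing at a course that no longer exists.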
Update Operations on Relations
In case of integrity violation, several actions can be taken:
  Cancel the operation that causes the violation (REJECT option)
  Perform the operation but inform the user of the violation
  Trigger additional updates so the violation is corrected (CASCADE option, SET NULL option)
  Execute a user-specified error-correction routine
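Three of these options can be compared side by side when a referenced Course tuple is deleted. This is a sketch assuming Python's built-in sqlite3 module, where the REJECT option is spelled ON DELETE RESTRICT; the helper name delete_course is hypothetical.

```python
import sqlite3


def delete_course(on_delete_clause):
    """Delete course 108 under the given foreign key action; report what happened."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(f"""
    CREATE TABLE Course (cid TEXT PRIMARY KEY, department TEXT);
    CREATE TABLE Enrolled (
        sid   TEXT,
        cid   TEXT REFERENCES Course(cid) {on_delete_clause},
        grade TEXT
    );
    INSERT INTO Course VALUES ('108', 'MATH');
    INSERT INTO Enrolled VALUES ('1234', '108', 'B');
    """)
    try:
        conn.execute("DELETE FROM Course WHERE cid = '108'")
        outcome = "deleted"
    except sqlite3.IntegrityError:
        outcome = "rejected"
    remaining = conn.execute("SELECT sid, cid, grade FROM Enrolled").fetchall()
    conn.close()
    return outcome, remaining


reject_result = delete_course("ON DELETE RESTRICT")    # REJECT option
set_null_result = delete_course("ON DELETE SET NULL")  # SET NULL option
cascade_result = delete_course("ON DELETE CASCADE")    # CASCADE option
print(reject_result, set_null_result, cascade_result)
```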