IMS 6217: Data Modeling—Super-Type/Sub-Type Entities
Dr. Lawrence West, Management Dept., University of Central Florida
Super-Type & Sub-Type Entities—Topics
• Problems needing subtype entities
• Nature of the solution
• Variations—Specialization and Completeness
• Subtype Identifiers
• Implementing
• Special Topics
• Using Super- and Sub-Types
• SQL w/ ST-ST
• Subtypes of an Unimplemented Supertype
• Performance Considerations
Supertype & Subtype Entities
• Some entities have records that come in various ‘flavors’.
– Students: Doctoral, Masters, Undergraduate
– Products: Serial-numbered, perishable, animals, etc.
– Employees: Salaried, hourly, managerial, part time
– Pet Store Products: Food, animals, accessories
• These entity sets have two types of attributes
– Attributes common to every occurrence
– Attributes required by one or more subtypes but not used by all occurrences of the entity
Exercise #1
Create all possible attributes and all immediate relationships for a Product entity in a
______________________________
Why is This a Problem?
• Variations on an entity create a space problem
– If we put all possible attributes for all possible variations (subtypes) in one entity, most records will carry unused fields and waste space
– E.g., a Sport attribute for students who are not athletes
Supertype & Subtype Entities (cont.)
• Subtypes also create relationship problems
– Some relationships will only be with a subtype of the entity, not with all types
– Important when the subtype is the Child in the relationship (has the foreign key)
Student (PID, LastName, FirstName, StreetAddress, City, ..., DissertationAdvisorID)
Faculty (EmployeeID, LastName, FirstName, ..., FacultyType, Rank, DoctorallyQualified)
Relationship between Student and Faculty: "Has Doctoral Advisor"
Nature of the Problem
• Many records will have empty fields
– GraduationDate
• We care about fields that will always be empty for certain categories of records…
• …and we can easily determine which records those are
• We will remove those often-empty fields to separate storage structures (entities/tables)
Supertype & Subtype Entities
• We can split up entities with variations into a supertype and one or more subtypes
– Supertype contains attributes common to all occurrences
– Subtypes contain attributes needed by the subtype
[Figure: Student supertype with MastersStudent, DoctoralStudent, and UndergraduateStudent subtypes and a "d" connecting circle, drawn in ERD notation and as the Visio equivalent]
An Example
• Cash is a PaymentType but needs no special attributes
– Partial Specialization (coming up)
• Payment ID is PK of all entities
• Payment ID is also FK in subtype entities
– In SQL Server be sure to set the parent (the supertype) this way when implementing relationships
PAYMENT (PaymentID, AccountID, PaymentDate, PaymentAmount, PaymentType)
CHECK_PAYMENT (PaymentID, CheckNum)
CC_PAYMENT (PaymentID, CC_Type, CC_Number, SecurityCode, ApprovalCode)
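A minimal sketch of these three tables, using Python's built-in sqlite3 module as a stand-in for SQL Server. The column types and sample values are assumptions (the slide lists only attribute names). It shows PaymentID acting as both PK and FK in each subtype table.

```python
import sqlite3

# SQLite stands in for SQL Server here; column types are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE Payment (
    PaymentID     INTEGER PRIMARY KEY,
    AccountID     INTEGER NOT NULL,
    PaymentDate   TEXT    NOT NULL,
    PaymentAmount REAL    NOT NULL,
    PaymentType   TEXT    NOT NULL   -- 'Cash', 'Check', or 'CC'
);
CREATE TABLE Check_Payment (
    -- The subtype PK is also the FK back to the supertype.
    PaymentID INTEGER PRIMARY KEY REFERENCES Payment (PaymentID),
    CheckNum  TEXT NOT NULL
);
CREATE TABLE CC_Payment (
    PaymentID    INTEGER PRIMARY KEY REFERENCES Payment (PaymentID),
    CC_Type      TEXT NOT NULL,
    CC_Number    TEXT NOT NULL,
    SecurityCode TEXT,
    ApprovalCode TEXT
);
""")

# Invented sample data: one check payment.
conn.execute("INSERT INTO Payment VALUES (1, 100, '2007-06-05', 25.00, 'Check')")
conn.execute("INSERT INTO Check_Payment VALUES (1, '1042')")

# A subtype row with no matching supertype row is rejected by the FK.
try:
    conn.execute("INSERT INTO Check_Payment VALUES (99, '2001')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Because the subtype key is a primary key, each Payment row can match at most one row per subtype table, which gives the 1:1 cardinality.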
Implementing in SQL Server—Table Design
Implementing in SQL Server—Relationships
PK is also FK
Implementing in SQL Server—Diagrams
• Arrange in org-chart hierarchy
– Gives visual cue that this is a ST/ST relationship
– You will need to wrestle with the relationship lines a little
• Note Key symbols at both ends of the lines
– Indicates 1:1 Cardinality
Need for Subtypes
• Subtypes are used when an identifiable subset of occurrences needs fields not needed by all occurrences
– Many occurrences will have empty attribute values
– An occurrence’s membership in the identifiable subset must be observable
• It is known whether a student is registered as an athlete
• But there is no observable attribute that distinguishes ‘local’ students from ‘transient’ students
Exercise #2 & #3
#2: Write the SQL to retrieve all of the information for credit card payments in the month of June 2007
#3: Write the SQL to recreate the table on Slide #4
Exercise #4
Split the Product entity into Super-/Subtypes by placing attributes appropriately
First Variation on Super-/Subtypes
• Completeness Constraint
– Must every supertype occurrence have at least one occurrence in one of the subtypes?
• Total specialization means that a subtype occurrence must exist
– Indicated with a double line to the connecting circle
• Partial specialization means that a subtype need not exist
– Indicated with a single line to the connecting circle
[Figure: Student supertype drawn with a double line (total) and with a single line (partial) to the connecting circle]
Total Specialization Completeness Constraint
• Total specialization means that every record in the supertype must have a matching record in one or more subtypes
• Relatively rare (in my experience) but possible
• Model in Visio using a thicker descending line (use Format Line)
– (Visio doesn’t do double lines)
– Increase thickness by two levels
[Figure: Student supertype in ERD notation and the Visio equivalent (STUDENT) with a thicker descending line]
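Total specialization is a rule about the data, so it can be checked with a query. The sketch below uses Python's sqlite3 module as a stand-in for SQL Server and invented sample rows. It lists supertype records that have no matching record in any subtype and would therefore break the rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (PID INTEGER PRIMARY KEY, LastName TEXT);
CREATE TABLE MastersStudent  (PID INTEGER PRIMARY KEY REFERENCES Student (PID));
CREATE TABLE DoctoralStudent (PID INTEGER PRIMARY KEY REFERENCES Student (PID));
-- Invented sample data: student 3 has no subtype row.
INSERT INTO Student VALUES (1, 'Adams'), (2, 'Baker'), (3, 'Chen');
INSERT INTO MastersStudent  VALUES (1);
INSERT INTO DoctoralStudent VALUES (2);
""")

# Supertype rows with no subtype row violate total specialization.
missing = conn.execute("""
    SELECT s.PID, s.LastName
    FROM Student AS s
    WHERE NOT EXISTS (SELECT 1 FROM MastersStudent  AS m WHERE m.PID = s.PID)
      AND NOT EXISTS (SELECT 1 FROM DoctoralStudent AS d WHERE d.PID = s.PID)
""").fetchall()
print(missing)
```

Under partial specialization the same query simply lists the records that belong to no subtype, which is allowed.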
Partial Specialization Completeness Constraint
• Some records in supertypes may have no matching subtype records
• Their subtype groups do not need special attributes
– But membership in a group may still be important and tracked
• It is possible for a supertype to have only one subtype group
[Figure: STUDENT supertype with a single ATHLETE subtype]
Second Variation on Super-/Subtypes
• You must also determine whether a supertype occurrence can be found in more than one subtype
– A disjoint relationship means that a supertype occurrence can only be found in one subtype
– An overlap relationship means that a supertype occurrence can be found in multiple subtypes (e.g., some universities have a joint J.D./MBA program)
[Figure: Student supertype with a "d" (disjoint) connecting circle and with an "o" (overlap) connecting circle]
Disjoint Relationships
• A registered vehicle can only be of one type
VEHICLE (VIN, Manufacturer, Year, Weight, Type), with disjoint ("d") subtypes:
CAR (VIN, Doors, Seats)
TRUCK (VIN, BedLength, TowingCapacity, TailgateType)
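The table structure alone does not stop the same VIN from appearing in both CAR and TRUCK. A hedged sketch of two audit queries follows, with Python's sqlite3 module standing in for SQL Server, a trimmed set of columns, and invented rows (one of them deliberately wrong).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Vehicle (VIN TEXT PRIMARY KEY, Manufacturer TEXT, Type TEXT);
CREATE TABLE Car   (VIN TEXT PRIMARY KEY REFERENCES Vehicle (VIN), Doors INTEGER);
CREATE TABLE Truck (VIN TEXT PRIMARY KEY REFERENCES Vehicle (VIN), BedLength REAL);
INSERT INTO Vehicle VALUES ('V1', 'Ford', 'Car'), ('V2', 'Ford', 'Truck');
INSERT INTO Car   VALUES ('V1', 4);
INSERT INTO Truck VALUES ('V2', 6.5), ('V1', 8.0);  -- 'V1' is deliberately wrong
""")

# A VIN that appears in both subtype tables breaks the disjoint rule.
overlapping = conn.execute("""
    SELECT c.VIN
    FROM Car AS c
    INNER JOIN Truck AS t ON t.VIN = c.VIN
""").fetchall()

# A Truck row whose supertype Type is not 'Truck' is also inconsistent.
mismatched = conn.execute("""
    SELECT t.VIN
    FROM Truck AS t
    INNER JOIN Vehicle AS v ON v.VIN = t.VIN
    WHERE v.Type <> 'Truck'
""").fetchall()
print(overlapping, mismatched)
```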
Overlap Relationships
[Figure: STUDENT supertype with an "o" (overlap) connecting circle to subtypes ATHLETE, PATIENT, INTERN, EMPLOYEE, and VETERAN]
Subtype Identifiers
• The supertype entity must indicate which (if any) subtypes are used
• Disjoint subtypes can use one attribute with a code to indicate the type of subtype
– Value of the PaymentType attribute (‘Cash’, ‘Check’, ‘CC’) identifies the subtype
– Remember that some subtype identifiers (‘Cash’ here) may have no subtype entities
– Sometimes this value may be blank (not part of any group)
PAYMENT (PaymentID, AccountID, PaymentDate, PaymentAmount, PaymentType)
Subtype Identifiers (cont.)
• Overlapping subtypes must use a collection of yes/no attributes, one for each possible subtype
– Setting attribute to true/yes in a record indicates that a matching subtype record exists
– Leaving all to false/no indicates no matching subtype (partial specialization)
STUDENT (PID, LastName, FirstName, ..., Intern, Patient, Athlete, Employee, Veteran)
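A small sketch of yes/no subtype identifiers, using Python's sqlite3 module as a stand-in for SQL Server. Only two of the flags are modeled, 1/0 stands for yes/no, and the Sport and Branch columns and all rows are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (
    PID      INTEGER PRIMARY KEY,
    LastName TEXT,
    Athlete  INTEGER NOT NULL DEFAULT 0,  -- 1 = yes, 0 = no
    Veteran  INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE Athlete (PID INTEGER PRIMARY KEY REFERENCES Student (PID), Sport TEXT);
CREATE TABLE Veteran (PID INTEGER PRIMARY KEY REFERENCES Student (PID), Branch TEXT);
INSERT INTO Student VALUES (1, 'Adams', 1, 1), (2, 'Baker', 1, 0), (3, 'Chen', 0, 0);
INSERT INTO Athlete VALUES (1, 'Soccer'), (2, 'Tennis');
INSERT INTO Veteran VALUES (1, 'Navy');
""")

# Overlap: students flagged for more than one subtype.
in_several = conn.execute(
    "SELECT PID FROM Student WHERE Athlete + Veteran > 1").fetchall()

# Partial specialization: students with every flag left at 0.
in_none = conn.execute(
    "SELECT PID FROM Student WHERE Athlete = 0 AND Veteran = 0").fetchall()

# Consistency check: flag says yes but no matching subtype row exists.
flag_without_row = conn.execute("""
    SELECT s.PID
    FROM Student AS s
    WHERE s.Athlete = 1
      AND NOT EXISTS (SELECT 1 FROM Athlete AS a WHERE a.PID = s.PID)
""").fetchall()
print(in_several, in_none, flag_without_row)
```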
Subtype Identifiers (cont.)
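• A subset of subtypes may be disjoint while others overlap
[Figure: STUDENT supertype with an "o" connecting circle; subtypes shown include MD, …, INTERN, MASTERS, and PhD]
STUDENT (PID, LastName, FirstName, ..., Intern, Patient, Athlete, Employee, Veteran, DegreeSought)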
Subtypes of Subtypes
• It is possible to have subtypes of subtypes
• Model products in a pet store where some are inanimate, some are food, some are live and of the live animals some are tracked individually…
– Cute puppies with wet noses
– Cats
• … and others are not
– Goldfish
– Mice
• … and some are sold as food
– Cute little mice as food for slithering scaly snakes
Some Caveats
• ST/ST determined at the group level
• Membership in the group must be determinable for every record
– Every record in the group must have the same pattern of value/no value for the attributes
– Subtype attributes may be null if they could receive values later
• Individual records may not have values for all fields
– Do not consider for ST/ST unless group membership can be determined
Some Caveats (cont.)
• More than one subtype may have the same field in it
– Field goes in subtype entities if not every subtype group needs it
– E.g.—UndergraduateDegree for Doctoral/Masters students
• Consider eliminating subtypes if they have only one or two attributes
– Roll their attributes back into the supertype and accept wasted space
– Consider this if the subtype makes up a large proportion of the population
– Consider this if the subtype is frequently accessed
Implementing Super-/Subtypes
• There is a Mandatory-1:Optional-1 relationship between entities in a super and subtype relationship
– Mandatory at supertype end
– Optional at each subtype end
• Each subtype occurrence (record) has identifier attribute values that exactly match a record in the supertype (but not vice-versa)
• All entities have the same primary key/identifier attributes
• PK in the subtype is also the FK to the supertype
– Special case of a weak entity
Subtypes of an Unimplemented Supertype
• Many, many data models will have records that could be subtypes of a supertype that is not implemented
• For UCF a “Person” entity could have subtypes
– Student
– Donor
– Faculty
– Contractor
• Tend to not implement this Person supertype unless the entities are regularly queried together
• Occasional queries can be supported with a UNION query
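A sketch of such a UNION query, run through Python's sqlite3 module as a stand-in for SQL Server. The column lists are trimmed from the earlier Student and Faculty entities; the PersonID and PersonType output names and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (PID INTEGER PRIMARY KEY, LastName TEXT, FirstName TEXT);
CREATE TABLE Faculty (EmployeeID INTEGER PRIMARY KEY, LastName TEXT, FirstName TEXT);
INSERT INTO Student VALUES (1, 'Baker', 'Ann');
INSERT INTO Faculty VALUES (7, 'Adams', 'Raj');
""")

# Each SELECT supplies the same number of columns in the same order;
# a literal column records which table each row came from.
people = conn.execute("""
    SELECT PID AS PersonID, LastName, FirstName, 'Student' AS PersonType
    FROM Student
    UNION
    SELECT EmployeeID, LastName, FirstName, 'Faculty'
    FROM Faculty
    ORDER BY LastName
""").fetchall()
print(people)
```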
Subtypes and Object Oriented Design
• Super- and Sub-type design exactly corresponds to the philosophy of inheritance in object oriented design
• If programming using an OO approach you will almost always implement objects with inheritance to match super- and sub-type design
• You can also implement inheritance for the unimplemented supertype discussed in the previous slide, even if not implemented in the DB design
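As one illustration of the correspondence, the PAYMENT supertype and its subtypes could be mirrored by a small class hierarchy. This is a sketch in Python with dataclasses; the class names, field names, and describe method are hypothetical, not part of the course material.

```python
from dataclasses import dataclass


@dataclass
class Payment:
    """Supertype: attributes common to every payment."""
    payment_id: int
    account_id: int
    payment_date: str
    payment_amount: float

    def describe(self) -> str:
        return f"Payment {self.payment_id}: {self.payment_amount:.2f}"


@dataclass
class CheckPayment(Payment):
    """Subtype: inherits the common attributes and adds its own."""
    check_num: str

    def describe(self) -> str:
        return f"{super().describe()} by check {self.check_num}"


@dataclass
class CCPayment(Payment):
    cc_type: str
    approval_code: str

    def describe(self) -> str:
        return f"{super().describe()} by {self.cc_type} card"


# A cash payment needs no subclass, just as it needs no subtype table.
payments = [
    Payment(1, 100, "2007-06-01", 20.0),
    CheckPayment(2, 100, "2007-06-05", 25.0, "1042"),
    CCPayment(3, 101, "2007-06-09", 40.0, "Visa", "A1B2"),
]
descriptions = [p.describe() for p in payments]
print(descriptions)
```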
Using Super-/Sub-type Tables
• Application logic and SQL for super- and sub-type tables become more complex
• Inserts must test the subtype identifier to determine where to add records
– Always to the supertype
– Decide which (if any) subtype(s)
• Similar for Updates
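The insert logic can be sketched as follows, with Python's sqlite3 module standing in for SQL Server. The add_payment helper, its parameters, and the trimmed column lists are hypothetical. The supertype row is always written first, and the subtype identifier decides whether a subtype row follows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Payment (
    PaymentID INTEGER PRIMARY KEY, AccountID INTEGER,
    PaymentDate TEXT, PaymentAmount REAL, PaymentType TEXT
);
CREATE TABLE Check_Payment (
    PaymentID INTEGER PRIMARY KEY REFERENCES Payment (PaymentID), CheckNum TEXT
);
CREATE TABLE CC_Payment (
    PaymentID INTEGER PRIMARY KEY REFERENCES Payment (PaymentID),
    CC_Type TEXT, CC_Number TEXT
);
""")


def add_payment(conn, account_id, date, amount, payment_type, **subtype_fields):
    """Hypothetical helper: insert the supertype row, then any subtype row."""
    with conn:  # commit both inserts together, or roll both back on error
        cur = conn.execute(
            "INSERT INTO Payment (AccountID, PaymentDate, PaymentAmount, PaymentType) "
            "VALUES (?, ?, ?, ?)",
            (account_id, date, amount, payment_type),
        )
        payment_id = cur.lastrowid
        if payment_type == "Check":
            conn.execute(
                "INSERT INTO Check_Payment VALUES (?, ?)",
                (payment_id, subtype_fields["CheckNum"]),
            )
        elif payment_type == "CC":
            conn.execute(
                "INSERT INTO CC_Payment VALUES (?, ?, ?)",
                (payment_id, subtype_fields["CC_Type"], subtype_fields["CC_Number"]),
            )
        # 'Cash' has no subtype table, so there is nothing more to insert.
    return payment_id


cash_id = add_payment(conn, 100, "2007-06-01", 20.0, "Cash")
check_id = add_payment(conn, 100, "2007-06-05", 25.0, "Check", CheckNum="1042")
```

Wrapping both inserts in one transaction keeps a failed subtype insert from leaving a supertype row behind.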
Using Super-/Sub-type Tables (cont.)
• Retrieval also complex
• You cannot simply join the supertype with all subtypes since no records will be returned if a subtype has no match
– Why won’t the following work?
SELECT Payment.*, Check_Payment.*, CC_Payment.*
FROM Payment, Check_Payment, CC_Payment
WHERE Payment.PaymentID = Check_Payment.PaymentID
  AND Payment.PaymentID = CC_Payment.PaymentID
  AND Payment.PaymentID = 1472
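To see the behavior, the query can be run against a tiny data set. The sketch uses Python's sqlite3 module as a stand-in for SQL Server, with trimmed columns and the assumption that payment 1472 is a check payment. Since it has no CC_Payment row, the join condition on CC_Payment can never be satisfied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Payment (PaymentID INTEGER PRIMARY KEY, PaymentType TEXT);
CREATE TABLE Check_Payment (PaymentID INTEGER PRIMARY KEY, CheckNum TEXT);
CREATE TABLE CC_Payment (PaymentID INTEGER PRIMARY KEY, CC_Type TEXT);
-- Assumed sample data: payment 1472 was made by check.
INSERT INTO Payment VALUES (1472, 'Check');
INSERT INTO Check_Payment VALUES (1472, '1042');
""")

# The query from the slide: it requires a match in BOTH subtype tables.
rows = conn.execute("""
    SELECT Payment.*, Check_Payment.*, CC_Payment.*
    FROM Payment, Check_Payment, CC_Payment
    WHERE Payment.PaymentID = Check_Payment.PaymentID
      AND Payment.PaymentID = CC_Payment.PaymentID
      AND Payment.PaymentID = 1472
""").fetchall()
print(rows)  # empty: no CC_Payment row matches payment 1472
```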
Using Super-/Sub-type Tables (cont.)
• Two query approaches
– Use conditional logic
– Use Left/Right Outer Joins
SELECT Customers.CompanyName, Orders.OrderDate
FROM Customers LEFT OUTER JOIN Orders
  ON Customers.CustomerID = Orders.CustomerID
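Both approaches can be sketched for the Payment tables, using Python's sqlite3 module as a stand-in for SQL Server. The columns are trimmed, the rows are invented, and the subtype_row helper and SUBTYPE_TABLE mapping are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Payment (PaymentID INTEGER PRIMARY KEY, PaymentType TEXT);
CREATE TABLE Check_Payment (PaymentID INTEGER PRIMARY KEY, CheckNum TEXT);
CREATE TABLE CC_Payment (PaymentID INTEGER PRIMARY KEY, CC_Type TEXT);
INSERT INTO Payment VALUES (1471, 'Cash'), (1472, 'Check'), (1473, 'CC');
INSERT INTO Check_Payment VALUES (1472, '1042');
INSERT INTO CC_Payment VALUES (1473, 'Visa');
""")

# Approach 1: outer joins keep every Payment row; columns from a subtype
# with no match come back as NULL (None in Python).
rows = conn.execute("""
    SELECT p.PaymentID, p.PaymentType, ck.CheckNum, cc.CC_Type
    FROM Payment AS p
    LEFT OUTER JOIN Check_Payment AS ck ON ck.PaymentID = p.PaymentID
    LEFT OUTER JOIN CC_Payment    AS cc ON cc.PaymentID = p.PaymentID
    ORDER BY p.PaymentID
""").fetchall()

# Approach 2: conditional logic reads the subtype identifier first,
# then queries only the subtype table it names.
SUBTYPE_TABLE = {"Check": "Check_Payment", "CC": "CC_Payment"}


def subtype_row(conn, payment_id):
    (payment_type,) = conn.execute(
        "SELECT PaymentType FROM Payment WHERE PaymentID = ?", (payment_id,)
    ).fetchone()
    table = SUBTYPE_TABLE.get(payment_type)  # None for 'Cash'
    if table is None:
        return None
    # The table name comes from the fixed mapping above, never from user input.
    return conn.execute(
        f"SELECT * FROM {table} WHERE PaymentID = ?", (payment_id,)
    ).fetchone()


print(rows)
print(subtype_row(conn, 1472))
```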
Performance Considerations
• Because of the performance considerations and complexity of Super- and Sub-types you will regularly consider eliminating subtypes
• Roll up their attributes into the super-type and accept the wasted columns
• Arguments for retaining subtypes
– Several unique attributes, especially large (text) ones
– Relatively few records in the subtype (compared to overall number of records)
– Relatively few transactions use the subtype
• Look at vertical partitioning later in the course