QAch07

CHAPTER 7: QUESTIONS AND ANSWERS

1. What is an insertion anomaly?

Ans: An insertion anomaly occurs when extra data beyond the desired data must be added to a table.

2. What is an update anomaly?

Ans: An update anomaly occurs when it is necessary to change multiple rows to modify only a single fact.

3. What is a deletion anomaly?

Ans: A deletion anomaly occurs whenever deleting a row inadvertently causes other data to be deleted.

4. What is the cause of modification anomalies?

Ans: Poor database design causes modification anomalies. A good database design avoids modification anomalies by eliminating excessive redundancies.

5. What is a functional dependency?

Ans: A functional dependency is a constraint about two or more columns of a table. X determines Y (X → Y) if there exists at most one value of Y for every value of X.

6. How is a functional dependency like a candidate key?

Ans: You can think about a functional dependency as identifying a potential primary key. Given X → Y, if X and Y are placed together in a table without other columns, X is a candidate key. Every determinant (LHS) is a potential primary key if placed in a table with the other columns that it determines.

7. Can a software design tool identify functional dependencies? Briefly explain your answer.

Ans: No, a software design tool cannot identify functional dependencies. Functional dependencies must be asserted during the database development process. Typically, the database designer interacts with users to understand the functional dependencies that exist for a table. Software design tools can aid a designer by eliminating FDs that do not exist and by suggesting FDs that are not contradicted by the data. Design tools examine sample rows in a table to see what functional dependencies do not hold. There are several commercial database design tools that automate the process of eliminating dependencies through examination of sample rows. Ultimately, the database designer must make the final decision about what functional dependencies exist in a table.
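The row-examination idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name fd_holds and the three sample rows are hypothetical and do not come from any particular design tool. A result of True means only that the sample rows do not contradict the FD; it does not establish that the FD exists.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on the lhs columns but differ on the rhs columns."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        # The first row with this lhs value fixes the expected rhs value.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows (not taken from the chapter's tables).
rows = [
    {"StdNo": "S1", "StdCity": "Seattle", "OfferNo": "O1"},
    {"StdNo": "S1", "StdCity": "Seattle", "OfferNo": "O2"},
    {"StdNo": "S2", "StdCity": "Bothell", "OfferNo": "O3"},
]

print(fd_holds(rows, ["StdNo"], ["StdCity"]))  # True: not contradicted by the sample
print(fd_holds(rows, ["StdNo"], ["OfferNo"]))  # False: the first two rows contradict it
```

The designer still has to decide whether an uncontradicted FD such as StdNo → StdCity is a real rule or an accident of the sample.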

8. What is the meaning of an FD with multiple columns on the right-hand side?

Ans: Multiple columns on the RHS abbreviate separate FDs with the LHS determining each of the RHS columns.

9. Why should you be careful when writing FDs with multiple columns on the left-hand side?

Ans: Multiple columns on the LHS indicate that the combination of columns determines the RHS column(s). You should be careful that you have written the LHS with the minimal columns for the dependency. Otherwise, the normalization rules and software procedures may not work as intended.
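A common way to test whether an LHS is minimal is to compute attribute closures: a column is extraneous if the remaining LHS columns still determine the RHS. The sketch below assumes that approach; the function names and the small FD list are hypothetical and are not taken from the chapter.

```python
def closure(attrs, fds):
    """Return every column determined by attrs, given FDs as (lhs, rhs) tuples of column names."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def extraneous_lhs_columns(lhs, rhs, fds):
    """List the LHS columns that can be dropped while the rest still determines rhs."""
    extra = []
    for col in lhs:
        reduced = [c for c in lhs if c != col]
        if reduced and set(rhs) <= closure(reduced, fds):
            extra.append(col)
    return extra


# Hypothetical FD list: the second FD has a non-minimal LHS.
fds = [
    (("OrderNo",), ("CustNo",)),
    (("OrderNo", "ItemNo"), ("CustNo",)),
    (("OrderNo", "ItemNo"), ("QtyOrdered",)),
]

print(extraneous_lhs_columns(("OrderNo", "ItemNo"), ("CustNo",), fds))      # ['ItemNo']
print(extraneous_lhs_columns(("OrderNo", "ItemNo"), ("QtyOrdered",), fds))  # []
```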

10. What is a normal form?

Ans: A normal form is a rule about allowable dependencies. Each normal form removes certain kinds of redundancies.

11. What does 1NF prohibit?

Ans: 1NF prohibits nesting or repeating groups in tables. A table not in 1NF is unnormalized or non-normalized.

12. What is a key column?

Ans: A column is a key column if it is part of a candidate key or a candidate key by itself.

13. What is a non-key column?

Ans: A column is a non-key column if it is not a key column.

14. What kinds of FDs are not allowed in 2NF?

Ans: FDs in which part of a key determines a nonkey column.

15. What kinds of FDs are not allowed in 3NF?

Ans: FDs in which a nonkey column determines another nonkey column.

16. What is the combined definition of 2NF and 3NF?

Ans: A table is in 3NF if every nonkey column is dependent on a candidate key, the whole candidate key, and nothing but candidate keys.

17. What kinds of FDs are not allowed in BCNF?

Ans: FDs in which the determinant is not a candidate key.

18. What is the relationship between BCNF and 3NF? Is BCNF a stricter normal form than 3NF? Briefly explain your answer.

Ans: BCNF is the revised 3NF definition. Yes, BCNF is stricter than 3NF in that every table in BCNF is in 3NF but not every table in 3NF is in BCNF.

19. Why is the BCNF definition preferred to the 3NF definition?

Ans: BCNF is the preferred definition because it is a simpler definition and provides the basis for the simple synthesis algorithm.

20. What are the special cases covered by BCNF but not by 3NF?

Ans: BCNF covers two special cases not covered by 3NF: (1) part of a key determines part of a key and (2) a nonkey column determines part of a key.

21. Are the special cases covered by BCNF but not by 3NF significant?

Ans: The special cases are not significant because they rarely occur.

22. What is the goal of the simple synthesis procedure?

Ans: The simple synthesis procedure is used to generate tables satisfying BCNF starting with a list of functional dependencies.

23. What is a limitation of the simple synthesis procedure?

Ans: The simple synthesis procedure does not work well on complex dependency structures. To make the synthesis procedure easy to use, some of the details have been omitted. In particular, step 2 can be rather involved because there are more ways to derive dependencies than transitivity. Even checking for transitivity can be difficult with many columns. Even if you understand the complex details, step 2 cannot be done by hand for complex dependency structures. For complex dependency structures, you need to use a CASE tool even if you are an experienced database designer.

24. What is a transitive dependency?

Ans: A transitive dependency is an FD derived by the law of transitivity. The law of transitivity says that if an object A is related to B, and B is related to C, then A is related to C.

25. Are transitive dependencies permitted in 3NF tables? Explain why or why not.

Ans: No, 3NF prohibits transitive dependencies. Because transitive dependencies are easy to overlook, the preferred definition of 3NF does not use transitive dependencies.

26. Why eliminate transitive dependencies in the simple synthesis procedure?

Ans: The simple synthesis procedure requires a complete list of FDs without redundancies. Because transitive dependencies are redundant, they should be eliminated.

27. When is it necessary to perform the fifth step of the simple synthesis procedure?

Ans: When there are multiple candidate keys for a table, the fifth step is necessary.

28. How is relationship independence similar to statistical independence?

Ans: In statistical independence, two variables are independent if knowing something about one variable tells you nothing about the other variable. If two variables are independent, it is redundant to store data about how they are related. The concept of relationship independence is similar. If two relationships are independent (that is, not related), it is redundant to store data about a third relationship. You can derive the third relationship by combining the two essential relationships through a join operation.
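The join argument can be checked on a small example. The sketch below is an illustration only: the three-column rows (student, offering, textbook) and the function names are hypothetical. It projects a three-way table onto two binary relationships and tests whether joining them on the shared column reproduces the original rows.

```python
def project(rows, positions):
    """Project a set of tuples onto the given column positions."""
    return {tuple(row[i] for i in positions) for row in rows}


def join_on_offering(std_offer, offer_text):
    """Join (student, offering) pairs with (offering, textbook) pairs on the offering."""
    return {
        (std, offer, text)
        for std, offer in std_offer
        for offer2, text in offer_text
        if offer == offer2
    }


def derivable_by_join(three_way):
    """True if the three-way rows equal the join of their two binary projections."""
    rebuilt = join_on_offering(project(three_way, (0, 1)), project(three_way, (1, 2)))
    return rebuilt == set(three_way)


# Hypothetical rows: every student in an offering uses every textbook of that offering.
independent = {
    ("S1", "O1", "T1"), ("S1", "O1", "T2"),
    ("S2", "O1", "T1"), ("S2", "O1", "T2"),
    ("S1", "O2", "T3"),
}
# Dropping one row makes the student-textbook pairing depend on more than the offering.
not_independent = independent - {("S2", "O1", "T2")}

print(derivable_by_join(independent))      # True: the three-way table is redundant
print(derivable_by_join(not_independent))  # False: the three-way table carries extra facts
```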

29. What kind of redundancy is caused by relationship independence?

Ans: Modification anomaly is caused by relationship independence.

30. How many columns does an MVD involve?

Ans: An MVD involves three columns.

31. What is a multivalued dependency (MVD)?

Ans: A multivalued dependency is a relationship that can be derived from other relationships.

32. What is the relationship between MVDs and FDs?

Ans: MVDs are generalizations of FDs. Every FD is an MVD, but not every MVD is an FD.

33. What is a non-trivial MVD?

Ans: An MVD that is not also an FD.

34. What is the goal of 4NF?

Ans: To prohibit redundancies caused by non-trivial MVDs.

35. What are the advantages of using normalization as a refinement tool rather than as an initial design tool?

Ans: In the refinement approach, you perform conceptual data modeling using the ERD. If the design is large, you can split the conceptual data model into view design and view integration. Through development of an ERD, you intuitively group related fields. As a result, there is less normalization to perform, and its purpose is to ensure that you have not overlooked any redundancies.

36. Why is 5NF not considered a practical normal form?

Ans: Because situations when a three-way relationship should be replaced with three binary relationships (not two binary relationships as for 4NF) are rare, 5NF is generally not considered a practical normal form.

37. Why is DKNF not considered a practical normal form?

Ans: Because there is no known algorithm for converting a table into DKNF. In addition, it is not even known what tables can be converted to DKNF.

38. When is denormalization useful? Provide an example to depict when it may be beneficial for a table to violate 3NF.

Ans: Students' responses may vary. However, the following is a sample response: When a database is used predominantly for queries, denormalization may be appropriate. Another time to consider denormalization is when an FD is not important. For example, consider the FDs Zip → City, State in a customer table. These dependencies may not be important to maintain if there is no need to manipulate zip codes independently of customers.

39. What are the two ways to use normalization in the database development process?

Ans: There are two opposite ways to use normalization in the database development process: (i) as a refinement tool or (ii) as an initial design tool. In the refinement approach, you perform conceptual data modeling using the Entity Relationship Model. In the initial design approach, you use normalization techniques in conceptual data modeling. Instead of drawing an ERD, you identify functional dependencies and apply a normalization procedure like the simple synthesis procedure.

40. Why does this book recommend using normalization as a refinement tool, not as an initial design tool?

Ans: Through development of an ERD, you intuitively group related fields. Much normalization is accomplished in an informal manner without the tedious process of recording functional dependencies. As a refinement tool, there is usually less normalization to perform. The purpose is to ensure that you have not overlooked any redundancies. Normalization provides a rigorous way to reason about the quality of the design.

PROBLEM SOLUTIONS

Besides the problems presented here, the case study in Chapter 13 provides additional practice. To supplement the examples in this chapter, Chapter 13 provides a complete database design case including conceptual data modeling, schema conversion, and normalization.

1. For the big university database table, list FDs with the column StdCity as the determinant that are not true due to the sample data. With each FD that does not hold, show the sample data that violate it. Remember that it takes two rows to demonstrate a violation of an FD. The sample data are repeated in Table 7P–1 for your reference.

Ans: The stdcity FDs below are violated by the sample data. The row references are to the sample data that follows.

stdcity → offerno is violated by the first two rows and the last two rows
stdcity → offterm is violated by the last two rows
stdcity → enrgrade is violated by the first two rows and the last two rows
stdcity → courseno is violated by the first two rows and the last two rows
stdcity → crsdesc is violated by the first two rows and the last two rows

stdssn  stdcity  stdclass  offerno  offterm  offyear  enrgrade  courseno  crsdesc
s1      seattle  jun       o1       fall     2003     3.5       c1        db
s1      seattle  jun       o2       fall     2003     3.3       c2        vb
s2      bothell  jun       o3       winter   2003     3.1       c3        oo
s2      bothell  jun       o2       fall     2003     3.4       c2        vb
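The violations listed in this answer can be reproduced mechanically. The sketch below is one possible check, not part of the original solution: it loads the four sample rows and reports, for each column, the pairs of rows that agree on stdcity but differ on that column. The function name violating_pairs is an assumption made for illustration.

```python
from itertools import combinations

COLUMNS = ["stdssn", "stdcity", "stdclass", "offerno", "offterm",
           "offyear", "enrgrade", "courseno", "crsdesc"]
DATA = [
    ("s1", "seattle", "jun", "o1", "fall", 2003, 3.5, "c1", "db"),
    ("s1", "seattle", "jun", "o2", "fall", 2003, 3.3, "c2", "vb"),
    ("s2", "bothell", "jun", "o3", "winter", 2003, 3.1, "c3", "oo"),
    ("s2", "bothell", "jun", "o2", "fall", 2003, 3.4, "c2", "vb"),
]
rows = [dict(zip(COLUMNS, values)) for values in DATA]


def violating_pairs(rows, lhs, rhs):
    """Return 1-based row-number pairs that agree on lhs but differ on rhs."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(rows, start=1), 2):
        if a[lhs] == b[lhs] and a[rhs] != b[rhs]:
            pairs.append((i, j))
    return pairs


violations = {col: violating_pairs(rows, "stdcity", col)
              for col in COLUMNS if col != "stdcity"}
for col, pairs in violations.items():
    status = f"violated by rows {pairs}" if pairs else "not violated by the sample"
    print(f"stdcity -> {col}: {status}")
```

The columns reported as not violated (stdssn, stdclass, offyear) are the ones taken up in problem 2.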

2. Following on problem 1, list FDs with the column StdCity as the determinant that the sample data do not violate. For each FD, add one or more sample rows and then identify the sample data that violate the FD. Remember that it takes two rows to demonstrate a violation of an FD.

Ans: The stdcity FDs not violated by the original sample data are listed below along with a reference to the new row (the fifth row) that violates them.

stdcity → stdssn is violated by the fifth row together with either the first or the second row
stdcity → stdclass is violated by the fifth row together with either the first or the second row
stdcity → offyear is violated by the fifth row together with either the first or the second row

stdssn  stdcity  stdclass  offerno  offterm  offyear  enrgrade  courseno  crsdesc
s1      seattle  jun       o1       fall     2003     3.5       c1        db
s1      seattle  jun       o2       fall     2003     3.3       c2        vb
s2      bothell  jun       o3       winter   2003     3.1       c3        oo
s2      bothell  jun       o2       fall     2003     3.4       c2        vb
s3      seattle  sr        o1       fall     2004     3.3       c1        db

3. For the big patient table, list FDs with the column PatZip as the determinant that are not true due to the sample data. Exclude the FD PatZip → PatCity because it is a valid FD. With each FD that does not hold, show the sample data that violate it. Remember that it takes two rows to demonstrate a violation of an FD. The sample data are repeated in Table 7P–2 for your reference.

Ans: PatZip FDs and the sample rows that violate them. The row references are to the sample data below.

PatZip → PatNo is not violated by the sample data
PatZip → PatAge is not violated by the sample data
PatZip → VisitDate is not violated by the sample data
PatZip → VisitNo is not violated by the sample data
PatZip → ProvNo is violated by the first two rows
PatZip → ProvSpecialty is violated by the first two rows
PatZip → Diagnosis is violated by the first two rows

VisitNo  VisitDate  PatNo  PatAge  PatCity    PatZip  ProvNo  ProvSpecialty       Diagnosis
V10020   1/13/2000  P1     35      DENVER     80217   D1      INTERNIST           EAR INFECTION
V10020   1/13/2000  P1     35      DENVER     80217   D2      NURSE PRACTITIONER  INFLUENZA
V93030   1/20/2000  P3     17      ENGLEWOOD  80113   D2      OBGYN               PREGNANCY
V82110   1/18/2000  P2     60      BOULDER    85932   D3      CARDIOLOGIST        MURMUR

4. Following on problem 3, list FDs with the column PatZip as the determinant that sample data do not violate. Exclude the FD PatZip → PatCity because it is a valid FD. For each FD, add one or more sample rows and then identify the sample data that violate the FD. Remember that it takes two rows to demonstrate a violation of an FD.

Ans: PatZip FDs not violated by the original sample data are listed below along with a reference to the new row (the fifth row) that violates them.

PatZip → PatNo is violated by the third and fifth rows
PatZip → PatAge is violated by the third and fifth rows
PatZip → VisitDate is violated by the third and fifth rows
PatZip → VisitNo is violated by the third and fifth rows

VisitNo  VisitDate  PatNo  PatAge  PatCity    PatZip  ProvNo  ProvSpecialty       Diagnosis
V10020   1/13/2000  P1     35      DENVER     80217   D1      INTERNIST           EAR INFECTION
V10020   1/13/2000  P1     35      DENVER     80217   D2      NURSE PRACTITIONER  INFLUENZA
V93030   1/20/2000  P3     17      ENGLEWOOD  80113   D2      OBGYN               PREGNANCY
V82110   1/18/2000  P2     60      BOULDER    85932   D3      CARDIOLOGIST        MURMUR
V34210   1/18/2000  P4     65      ENGLEWOOD  80113   D3      CARDIOLOGIST        IRREGULAR BEAT

5. Apply the simple synthesis procedure to the FDs of the big patient table. The FDs are repeated in Table 7P–3 for your reference. Show the result of each step in the procedure. Include the primary keys and the foreign keys in the final list of tables.

Ans: The simple synthesis procedure is applied to the following list of FDs:

PatNo → PatAge, PatCity, PatZip
PatZip → PatCity
ProvNo → ProvSpecialty
VisitNo → PatNo, VisitDate, PatAge, PatCity, PatZip
VisitNo, ProvNo → Diagnosis

Step 1: There are no extraneous columns to remove.

Step 2: Remove the following FDs because they can be derived through transitivity:

VisitNo → PatCity
VisitNo → PatZip
VisitNo → PatAge
PatNo → PatCity

Step 3: Arrange the remaining FDs into groups by determinant

PatNo → PatAge, PatZip
PatZip → PatCity
ProvNo → ProvSpecialty
VisitNo → PatNo, VisitDate
VisitNo, ProvNo → Diagnosis

Step 4: For each FD group, make a table with the determinant as the primary key. In the table list, the primary keys are underlined.

Patient(PatNo, PatAge, PatZip)
  FOREIGN KEY (PatZip) REFERENCES ZipCode

ZipCode(PatZip, PatCity)

Provider(ProvNo, ProvSpecialty)

Visit(VisitNo, VisitDate, PatNo)
  FOREIGN KEY (PatNo) REFERENCES Patient

DiagnosisTbl(VisitNo, ProvNo, Diagnosis)
  FOREIGN KEY (VisitNo) REFERENCES Visit
  FOREIGN KEY (ProvNo) REFERENCES Provider

Step 5: Merge tables with the same columns. There is no work because no duplicate tables are present.
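Steps 2 and 3 of this solution can be sketched in Python as a cross-check. The code is an illustration under stated assumptions: it removes any FD that is derivable from the remaining FDs by attribute closure, which is more general than the transitivity check described in the procedure, and it names the provider column ProvNo as in the table list. The function names are hypothetical.

```python
def closure(attrs, fds):
    """Columns determined by attrs; each FD is (lhs_tuple, single_rhs_column)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if rhs not in result and set(lhs) <= result:
                result.add(rhs)
                changed = True
    return result


def remove_derived(fds):
    """Step 2 (sketch): drop each FD that the remaining FDs already imply."""
    kept = list(fds)
    for fd in fds:
        others = [f for f in kept if f != fd]
        lhs, rhs = fd
        if rhs in closure(lhs, others):
            kept = others
    return kept


def group_by_determinant(fds):
    """Step 3: collect the RHS columns under each determinant."""
    groups = {}
    for lhs, rhs in fds:
        groups.setdefault(lhs, []).append(rhs)
    return groups


def expand(lhs, rhs_columns):
    """Split an FD with several RHS columns into single-column FDs."""
    return [(lhs, col) for col in rhs_columns]


fds = (
    expand(("PatNo",), ["PatAge", "PatCity", "PatZip"])
    + expand(("PatZip",), ["PatCity"])
    + expand(("ProvNo",), ["ProvSpecialty"])
    + expand(("VisitNo",), ["PatNo", "VisitDate", "PatAge", "PatCity", "PatZip"])
    + expand(("VisitNo", "ProvNo"), ["Diagnosis"])
)

groups = group_by_determinant(remove_derived(fds))
for lhs, rhs in groups.items():
    print(", ".join(lhs), "->", ", ".join(rhs))
```

Each printed group corresponds to one table in step 4, with the determinant as the primary key.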

6. The FD diagram in Figure 7P.1 depicts relationships among columns in an order entry database. Figure 7P.1 shows FDs with determinants CustNo, OrderNo, ItemNo, the combination of OrderNo and ItemNo, the combination of ItemNo and PlantNo, and the combination of OrderNo and LineNo. In the bottom FDs, the combination of LineNo and OrderNo determines ItemNo and the combination of OrderNo and ItemNo determines LineNo. To test your understanding of dependency diagrams, convert the dependency diagram into a list of dependencies organized by LHSs.

Ans:

CustNo → CustBal, CustDiscount
OrderNo → CustNo, ShipAddr, OrderDate
ItemNo → ItemDesc
ItemNo, PlantNo → ReorderPoint, QtyOnHand
OrderNo, ItemNo → LineNo, QtyOrdered, QtyOutstanding
OrderNo, LineNo → ItemNo, QtyOrdered, QtyOutstanding

7. Using the FD diagram (Figure 7P.1) and the FD list (solution to problem 6) as guidelines, make a table with sample data. There are two candidate keys for the underlying table: the combination of OrderNo, ItemNo, and PlantNo and the combination of OrderNo, LineNo, and PlantNo. Using sample data, identify insertion, update, and deletion anomalies in the table.

Ans:

custno  custbal  orderno  shipaddr  orderdate  itemno  itemdesc  qtyord  plantno  reordpoint  qtyonhand
C1      100      O1       S1        2/1/2004   I1      Bolt      10      P1       10          15
C1      100      O1       S1        2/1/2004   I2      Nut       5       P1       20          25
C2      50       O2       S2        2/3/2004   I1      Bolt      1       P1       10          15
C1      100      O3       S3        2/4/2004   I3      Screw     10      P1       10          14
C1      100      O3       S3        2/4/2004   I3      Screw     10      P2       15          20

The above table is missing several fields due to space limitations. There are many modification anomalies in the table:

Insertion anomaly: a new customer cannot be inserted without an order, an item, and a plant to stock the item.
Update anomaly: the customer balance must be changed in multiple rows.
Deletion anomaly: deleting an order (for example, order O2) causes deletion of the customer if the customer has only one order.
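The three anomalies can be made concrete with the sample rows from this answer. The sketch below is only an illustration; the new customer C3 and the variable names are hypothetical.

```python
COLUMNS = ("custno", "custbal", "orderno", "shipaddr", "orderdate", "itemno",
           "itemdesc", "qtyord", "plantno", "reordpoint", "qtyonhand")
DATA = [
    ("C1", 100, "O1", "S1", "2/1/2004", "I1", "Bolt", 10, "P1", 10, 15),
    ("C1", 100, "O1", "S1", "2/1/2004", "I2", "Nut", 5, "P1", 20, 25),
    ("C2", 50, "O2", "S2", "2/3/2004", "I1", "Bolt", 1, "P1", 10, 15),
    ("C1", 100, "O3", "S3", "2/4/2004", "I3", "Screw", 10, "P1", 10, 14),
    ("C1", 100, "O3", "S3", "2/4/2004", "I3", "Screw", 10, "P2", 15, 20),
]
rows = [dict(zip(COLUMNS, values)) for values in DATA]

# Update anomaly: the balance of C1 is stored once per row, so one fact needs many changes.
rows_storing_c1_balance = [r for r in rows if r["custno"] == "C1"]
print(len(rows_storing_c1_balance))  # 4

# Deletion anomaly: removing order O2 removes the only row that mentions customer C2.
remaining = [r for r in rows if r["orderno"] != "O2"]
lost_customers = {r["custno"] for r in rows} - {r["custno"] for r in remaining}
print(lost_customers)  # {'C2'}

# Insertion anomaly: a hypothetical new customer C3 with no order leaves the
# candidate key columns (orderno, itemno, plantno) without values.
new_row = dict.fromkeys(COLUMNS)
new_row.update(custno="C3", custbal=0)
missing_key_parts = [c for c in ("orderno", "itemno", "plantno") if new_row[c] is None]
print(missing_key_parts)  # ['orderno', 'itemno', 'plantno']
```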

8. Derive 2NF tables starting with the FD list from problem 6 and the table from problem 7.

Ans: 2NF tables are shown below:

O1(orderno, shipaddr, orderdate, custno, custbal, custdiscount)
O2(itemno, itemdesc)
O3(orderno, itemno, lineno, qtyordered, qtyoutstanding)
O4(plantno, itemno, qtyonhand, reorderpoint)

9. Derive 3NF tables starting with the FD list from problem 6 and the 2NF tables from problem 8.

Ans: All tables except O1 are in 3NF. For 3NF, O1 should be split as shown below:

O1.1(orderno, shipaddr, orderdate, custno)
O1.2(custno, custbal, custdiscount)

10. Following on problems 6 and 7, apply the simple synthesis procedure to produce BCNF tables.

Ans: The steps of the BCNF process are listed below.

Step 1: There are no extraneous columns to remove.

Step 2: There are no transitively derived FDs. If the following FDs were in the dependency diagram, they should be removed:

orderno → custbal
orderno → custdiscount

Step 3: Arrange the remaining FDs into groups by determinant

custno → custbal, custdiscount
orderno → orderdate, shipaddr, custno
itemno → itemdesc
orderno, itemno → qtyord, qtyoutstanding, lineno
orderno, lineno → qtyord, qtyoutstanding, itemno
itemno, plantno → qtyonhand, reorderpoint

Step 4: For each FD group, make a table with the determinant as the primary key. In the table list, the primary keys are underlined.

Order(OrderNo, ShipAddr, OrderDate, CustNo)
  FOREIGN KEY (CustNo) REFERENCES Customer

Customer(CustNo, CustBal, CustDiscount)

Item(ItemNo, ItemDesc)

OrderLine1(OrderNo, ItemNo, LineNo, QtyOrdered, QtyOutstanding)
  FOREIGN KEY (OrderNo) REFERENCES Order
  FOREIGN KEY (ItemNo) REFERENCES Item

OrderLine2(OrderNo, LineNo, ItemNo, QtyOrdered, QtyOutstanding)
  FOREIGN KEY (OrderNo) REFERENCES Order
  FOREIGN KEY (ItemNo) REFERENCES Item

PlantStocking(PlantNo, ItemNo, QtyOnHand, ReorderPoint)
  FOREIGN KEY (ItemNo) REFERENCES Item

Step 5: Merge tables with the same columns. The tables orderline1 and orderline2 should be merged into one table. Either (orderno, itemno) or (orderno, lineno) can be chosen as the primary key.
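Step 5 can be pictured as grouping the step 4 tables by their column sets. The following sketch is one way to do that and is not part of the original solution; the function name merge_same_columns and the tuple layout (name, primary key, columns) are assumptions.

```python
from collections import defaultdict

# (table name, primary key chosen in step 4, columns)
tables = [
    ("Order", ("OrderNo",), ("OrderNo", "ShipAddr", "OrderDate", "CustNo")),
    ("Customer", ("CustNo",), ("CustNo", "CustBal", "CustDiscount")),
    ("Item", ("ItemNo",), ("ItemNo", "ItemDesc")),
    ("OrderLine1", ("OrderNo", "ItemNo"),
     ("OrderNo", "ItemNo", "LineNo", "QtyOrdered", "QtyOutstanding")),
    ("OrderLine2", ("OrderNo", "LineNo"),
     ("OrderNo", "LineNo", "ItemNo", "QtyOrdered", "QtyOutstanding")),
    ("PlantStocking", ("PlantNo", "ItemNo"),
     ("PlantNo", "ItemNo", "QtyOnHand", "ReorderPoint")),
]


def merge_same_columns(tables):
    """Group tables with identical column sets; each group keeps all its keys as candidates."""
    groups = defaultdict(list)
    for name, key, columns in tables:
        groups[frozenset(columns)].append((name, key, columns))
    merged = []
    for members in groups.values():
        names = [name for name, _, _ in members]
        candidate_keys = [key for _, key, _ in members]
        merged.append((names, candidate_keys, members[0][2]))
    return merged


merged = merge_same_columns(tables)
for names, candidate_keys, columns in merged:
    print(names, "candidate keys:", candidate_keys)
```

For the merged order line table, either candidate key can be designated as the primary key; the other can be declared unique.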

11. Modify your table design in problem 10 if the shipping address (ShipAddr) column determines customer number (CustNo). Do you think that this additional FD is reasonable? Briefly explain your answer.

Ans: If shipaddr determines custno, the order table is not in BCNF because shipaddr is a determinant but not a candidate key. The order table should be split into two tables as shown below:

order(orderno, shipaddr, orderdate)
shipping(shipaddr, custno)
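The BCNF argument in this answer can be checked with attribute closures: a determinant violates BCNF in a table when its closure does not cover all of the table's columns. The sketch below is a simplified check under assumptions. It considers only the listed FDs whose LHS lies in the table and whose RHS touches the table, rather than computing a full projection of the FDs, and the function names are hypothetical.

```python
def closure(attrs, fds):
    """Return every column determined by attrs, given FDs as (lhs, rhs) tuples of names."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(columns, fds):
    """Return determinants that apply to the table but are not candidate keys of it."""
    cols = set(columns)
    bad = []
    for lhs, rhs in fds:
        applies = set(lhs) <= cols and bool((set(rhs) & cols) - set(lhs))
        if applies and not cols <= closure(lhs, fds):
            bad.append(lhs)
    return bad


fds = [
    (("orderno",), ("shipaddr", "orderdate", "custno")),
    (("shipaddr",), ("custno",)),  # the additional FD assumed in this problem
]

print(bcnf_violations(["orderno", "shipaddr", "orderdate", "custno"], fds))  # [('shipaddr',)]
print(bcnf_violations(["orderno", "shipaddr", "orderdate"], fds))            # []
print(bcnf_violations(["shipaddr", "custno"], fds))                          # []
```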

This FD may not be reasonable for retail organizations. Two customers, such as roommates, can share the same shipping address. Even small businesses can share the same office space.

12. Go back to the original FD diagram in which ShipAddr does not determine CustNo. How does your table design change if you want to keep track of a master list of shipping addresses for each customer? Assume that you do not want to lose a shipping address when an order is deleted.

Ans: To keep track of shipping addresses independently of orders, a separate table should be added. The new table contains two columns as shown below. All shipping addresses for a customer are stored in this new table. The order table contains a shipping address field also. The foreign key in the order table should reference shipaddress, not customer.

shipaddress(custno, shipaddr)
  FOREIGN KEY (CustNo) REFERENCES Customer

13. Convert the ERD in Figure 7P.2 into tables and perform further normalization as needed. After converting to tables, write down FDs for each table. If a table is not in BCNF, explain why and split it into two or more tables that are in BCNF.

Ans: Result after conversion

Student(StdId, Name, Email, Phone, Web, Major, Minor, GPA, AdviserNo, AdviserName)

Attends(InterviewId, StdId)
  FOREIGN KEY(InterviewId) REFERENCES Interview
  FOREIGN KEY(StdId) REFERENCES Student

Interview(InterviewId, BldgName, RoomNo, RoomType, Date, Time)

Conducts(InterviewId, InterviewerId)
  FOREIGN KEY(InterviewId) REFERENCES Interview
  FOREIGN KEY(InterviewerId) REFERENCES Interviewer

Interviewer(InterviewerId, Name, Phone, Email, CompId)
  FOREIGN KEY(CompId) REFERENCES Company

Company(CompId, CompName)

Position(PosId, Name)

CompPos(CompId, PosId, City, State)
  FOREIGN KEY(CompId) REFERENCES Company
  FOREIGN KEY(PosId) REFERENCES Position

Conversion without the optional 1-M relationship Rule

The above conversion uses the optional 1-M relationship rule (Rule 5). If using Rule 2 instead, the conversion changes as follows:

Replace the Attends table with StdId added as a foreign key in the Interview table. Replace the Conducts table with InterviewerId as a foreign key in the Interview table.

The revised interview table is shown below:

Interview(InterviewId, BldgName, RoomNo, RoomType, Date, Time, StdId, InterviewerId)
  FOREIGN KEY(InterviewerId) REFERENCES Interviewer
  FOREIGN KEY(StdId) REFERENCES Student

Further normalization

The Student table is not in BCNF because AdviserNo → AdviserName. If this FD is significant, split Student into 2 tables with AdviserNo and AdviserName in a new table. AdviserNo is the primary key of the new table.

The Interview table is not in BCNF because BldgName, RoomNo → RoomType. If this FD is significant, split Interview into 2 tables with BldgName, RoomNo, and RoomType in a new table. The combination of BldgName and RoomNo is the primary key of the new table.

Another possible interpretation of the RoomNo attribute is that it contains both a building abbreviation and a room number. For example, PL212 means room 212 in the Plaza building. If RoomNo contains both a room number and a building abbreviation, then RoomNo → BldgName, RoomType. If this FD is significant, split the Interview table into 2 tables with BldgName, RoomNo, and RoomType in a new table. The primary key of the new table is RoomNo.

14. Convert the ERD in Figure 7P.3 into tables and perform further normalization as needed. After the conversion, write down FDs for each table. If a table is not in BCNF, explain why and split it into two or more tables that are in BCNF. Note that in the Owner and Buyer entity types, the primary key (SSN) is included although it is inherited from the Person entity type.

Ans: Result after conversion

Home(HomeId, AgentId, SSN, Street, City, State, Zip, NoBedRms, NoBath, SqFt, Price, …)
  FOREIGN KEY(AgentId) REFERENCES Agent
  FOREIGN KEY(SSN) REFERENCES Owner

Agent(AgentId, OfficeId, Name, Phone)
  FOREIGN KEY(OfficeId) REFERENCES Office

Person(SSN, Name, Phone)

Owner(SSN, SpouseName, Profess, SpouseProfess)
  FOREIGN KEY(SSN) REFERENCES Person ON DELETE CASCADE

Buyer(SSN, Address, BdRms, BathRms, MinPrice, MaxPrice, AgentId)
  FOREIGN KEY(AgentId) REFERENCES Agent
  FOREIGN KEY(SSN) REFERENCES Person ON DELETE CASCADE

Office(OfficeId, MgrName, Phone, Address)

MakesOffer(SSN, HomeId, ExpDate, Price, CounterOffer)
  FOREIGN KEY(HomeId) REFERENCES Home
  FOREIGN KEY(SSN) REFERENCES Buyer

Further normalization: The Home table is not in BCNF because Zip → State. If the city means the city where the post office is located, then Zip → City also holds. If these FDs are significant, split the Home table into 2 tables with Zip, City, and State in a new table. Zip is the primary key of the new table.

15. Convert the ERD in Figure 7P.4 into tables and perform further normalization as needed. After the conversion, write down FDs for each table. If a table is not in BCNF, explain why and split it into two or more tables that are in BCNF. In the User entity type, UserEmail is unique. In the ExpenseCategory entity type, CatDesc is unique. In the StatusType entity type, StatusDesc is unique. For the ExpenseItem entity type, the combination of the Categorizes and Contains relationships are unique.

Ans: The UNIQUE constraints enforce the candidate key constraints in the problem definition.

Result after conversion

User(UserNo, UserFirstName, UserLastName, UserPhone, UserEmail, UserLimit, MgrNo)
  UNIQUE (UserEmail)
  FOREIGN KEY(MgrNo) REFERENCES User

ExpenseCategory(CatNo, CatDesc, CatLimitAmount)
  UNIQUE (CatDesc)

StatusType(StatusNo, StatusDesc)
  UNIQUE (StatusDesc)

ExpenseReport(ERNo, ERDesc, ERSubmitDate, ERStatusDate, StatusNo, UserNo)
  FOREIGN KEY(StatusNo) REFERENCES StatusType
  FOREIGN KEY(UserNo) REFERENCES User

ExpenseItem(ExpItemNo, ExpItemDesc, ExpItemDate, ExpItemAmount, ERNo, CatNo)
  FOREIGN KEY(ERNo) REFERENCES ExpenseReport
  FOREIGN KEY(CatNo) REFERENCES ExpenseCategory
  UNIQUE (ERNo, CatNo)

Limits(UserNo, CatNo, Amount)
  FOREIGN KEY(UserNo) REFERENCES User
  FOREIGN KEY(CatNo) REFERENCES ExpenseCategory

Further normalization: For each table, the PK determines the other columns. In addition to the FDs associated with the PKs, there are FDs associated with each candidate key. In the conversion, the UNIQUE constraints designate candidate keys that are not primary keys. For example, the FDs for the User table include the FDs with the primary key (UserNo) as the LHS and the FDs with the candidate key (UserEmail) as the LHS.

All tables are in BCNF. The FDs with the candidate keys as the LHS do not violate BCNF.

16. Convert the ERD in Figure 7P.5 into tables and perform further normalization as needed. After the conversion, write down FDs for each table. If a table is not in BCNF, explain why and split it into two or more tables that are in BCNF. In the Employee entity type, each department has one manager. All employees in a department are supervised by the same manager. For the other entity types, FacName is unique in Facility, ResName is unique in Resource, and CustName and CustEmail are unique in Customer.

Ans: The UNIQUE constraints enforce the candidate key constraints in the problem definition.

Result after conversion

Facility(FacNo, FacName)
  UNIQUE (FacName)

Location(FacNo, LocNo, LocName)
  FOREIGN KEY(FacNo) REFERENCES Facility

Resource(ResNo, ResName, ResRate)
  UNIQUE (ResName)

Employee(EmpNo, EmpName, EmpPhone, EmpEmail, EmpDeptNo, EmpMgrNo)
  UNIQUE (EmpEmail)

Customer(CustNo, CustName, CustPhone, CustEmail, CustContactName)
  UNIQUE (CustEmail)
  UNIQUE (CustName)

EventRequest(ERNo, ERDateHeld, ERRequestDate, ERAuthDate, ERStatus, EREstCost, EREstAudience, CustNo, FacNo)
  FOREIGN KEY(CustNo) REFERENCES Customer
  FOREIGN KEY(FacNo) REFERENCES Facility

EventPlan(EPNo, EPDate, EPNotes, EPActivity, ERNo, EmpNo)
  FOREIGN KEY(ERNo) REFERENCES EventRequest
  FOREIGN KEY(EmpNo) REFERENCES Employee

EventPlanLine(EPNo, LineNo, EPLTimeStart, EPLTimeEnd, EPLQty, ResNo, FacNo, LocNo)
  FOREIGN KEY(EPNo) REFERENCES EventPlan
  FOREIGN KEY(ResNo) REFERENCES Resource
  FOREIGN KEY(FacNo, LocNo) REFERENCES Location

Further normalization: For each table, the PK determines the other columns. In addition to the FDs associated with the PKs, there are FDs associated with each candidate key. In the conversion, the UNIQUE constraints designate candidate keys that are not primary keys. For example, the FDs for the Resource table include the FDs with the primary key (ResNo) as the LHS and the FDs with the candidate key (ResName) as the LHS.

All tables except Employee are in BCNF. The FDs with the candidate keys as the LHS do not violate BCNF.

The Employee table is not in BCNF because EmpDeptNo determines EmpMgrNo. EmpDeptNo is a determinant but not a candidate key. The Employee table should be decomposed to achieve BCNF.

Employee(EmpNo, EmpName, EmpPhone, EmpEmail, DeptNo)
  UNIQUE (EmpEmail)
  FOREIGN KEY(DeptNo) REFERENCES Department

Department(DeptNo, MgrNo)

17. Extend the solution to the problem described in Section 8.2.4 about a database to track submitted conference papers. In the description, underlined parts are new. Write down the new FDs. Using the simple synthesis algorithm, design a collection of tables in BCNF. Note dependencies that are not important to the problem and relax your design from BCNF as appropriate. Justify your reasoning.

Author information includes a unique author number, a name, a mailing address, and a unique but optional electronic address.

Paper information includes the list of authors, the primary author, the paper number, the title, the abstract, review status (pending, accepted, rejected), and a list of subject categories.

Reviewer information includes the reviewer number, the name, the mailing address, a unique but optional electronic address, and a list of expertise categories.

A completed review includes the reviewer number, the date, the paper number, comments to the authors, comments to the program chairperson, and ratings (overall, originality, correctness, style, and relevance).

Accepted papers are assigned to sessions. Each session has a unique session identifier, a list of papers, a presentation order for each paper, a session title, a session chairperson, a room, a date, a start time, and a duration. Note that each accepted paper can be assigned to only one session.

Ans:
New FDs
paperno → sessno, order
sessno, order → paperno
sessno → starttime, duration, date, room, chair, title

Complete list of FDs
AuthNo → AuthName, AuthEmail, AuthAddress
AuthEmail → AuthNo
PaperNo → primary-authno, title, abstract, status, sessno, order
SessNo, Order → PaperNo
RevNo → RevName, RevEmail, RevAddress
RevEmail → RevNo
SessNo → starttime, duration, date, room, chair, title
RevNo, PaperNo → auth-comm, prog-comm, date, 5 rating columns

FD Notes:
A paper can appear in at most 1 session.
A paper can have at most 1 order in a session.
For the list of authors, subject categories, and expertise categories, FDs with null right-hand sides can be written:
paperno, authno →
paperno, catname →
revno, catname →

BCNF Tables
author(authno, authname, authemail, authaddress)
paper(paperno, primary-authno, title, abstract, status, sessno, order)
 FOREIGN KEY (primary-authno) REFERENCES author
 FOREIGN KEY (sessno) REFERENCES session
 Because the combination of sessno and order is also a candidate key, there is no violation of BCNF.
reviewer(revno, revname, revemail, revaddress)
review(paperno, revno, auth-comm, prog-comm, date, 5 rating attributes)
 FOREIGN KEY (paperno) REFERENCES paper
 FOREIGN KEY (revno) REFERENCES reviewer
session(sessno, title, chair, starttime, date, room, duration)
paper-author(paperno, authno)
 FOREIGN KEY (paperno) REFERENCES paper
 FOREIGN KEY (authno) REFERENCES author
paper-category(paperno, catname)
 FOREIGN KEY (paperno) REFERENCES paper
reviewer-category(revno, catname)
 FOREIGN KEY (revno) REFERENCES reviewer
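The grouping and merging that lead from the FD list to the first five tables can be sketched as follows. This is a simplified illustration, not the textbook's exact simple synthesis algorithm: it only groups FDs by determinant and then merges groups whose determinants determine each other, and it skips the steps that remove extraneous columns and derived FDs. The names sesstitle and revdate are assumptions introduced so that session and review columns do not collide with the paper's title and the session's date; the FDs with null right-hand sides are left out.

```python
def closure(attrs, fds):
    """Return the set of columns determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def synthesize(fds):
    """Group FDs by determinant, then merge groups with equivalent determinants."""
    groups = {}
    for lhs, rhs in fds:
        groups.setdefault(tuple(lhs), set(lhs)).update(rhs)
    tables = []  # each entry: (list of candidate keys, set of columns)
    for lhs, cols in groups.items():
        for keys, merged in tables:
            first = keys[0]
            if set(first) <= closure(lhs, fds) and set(lhs) <= closure(first, fds):
                keys.append(lhs)   # lhs is another candidate key of this table
                merged |= cols
                break
        else:
            tables.append(([lhs], set(cols)))
    return tables


fds = [
    (("authno",), ("authname", "authemail", "authaddress")),
    (("authemail",), ("authno",)),
    (("paperno",), ("primary_authno", "title", "abstract", "status", "sessno", "order")),
    (("sessno", "order"), ("paperno",)),
    (("revno",), ("revname", "revemail", "revaddress")),
    (("revemail",), ("revno",)),
    (("sessno",), ("starttime", "duration", "date", "room", "chair", "sesstitle")),
    (("revno", "paperno"), ("auth_comm", "prog_comm", "revdate", "ratings")),
]

tables = synthesize(fds)
for keys, cols in tables:
    print(keys, sorted(cols))
```

Under these assumptions the sketch yields five tables (author, paper, reviewer, session, review), with paperno and the combination of sessno and order both listed as candidate keys of the paper table.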

18. For the following description of an airline reservation database, identify functional dependencies and construct normalized tables. Using the simple synthesis algorithm, design a collection of tables in BCNF. Note dependencies that are not important to the problem and relax your design from BCNF as appropriate. Justify your reasoning.  The Fly by Night Operation is a newly formed airline aimed at the burgeoning market of clandestine travelers (fugitives, spies, con artists, scoundrels, deadbeats, cheating spouses, politicians, etc.). The Fly by Night Operation needs a database to track flights, customers, fares, airplane performance, and personnel assignment. Since the Fly by Night Operation is touted as a “fast way out of town,” individual seats are not assigned, and flights of other carriers are not tracked. More specific notes about different parts of the database are listed below:

Information about a flight includes its unique flight number, its origin, its (supposed) destination, and (roughly) estimated departure and arrival times. To reduce costs, the Fly by Night Operation only has nonstop flights with a single origin and destination.

Flights are scheduled for one or more dates with an airplane and a crew assigned to each scheduled flight, and the remaining capacity (seats remaining) noted. In a crew assignment, the employee number and the role (e.g., captain, flight attendant) are noted.

Airplanes have a unique serial number, a model, a capacity, and a next-scheduled-maintenance date.


The maintenance record of an airplane includes a unique maintenance number, a date, a description, the serial number of the plane, and the employee responsible for the repairs.

Employees have a unique employee number, a name, a phone, and a job title.

Customers have a unique customer number, a phone number, and a name (typically an alias).

Records are maintained for reservations of scheduled flights including a unique reservation number, a flight number, a flight date, a customer number, a reservation date, a fare, and the payment method (usually cash but occasionally someone else’s check or credit card). If the payment is by credit card, a credit card number and an expiration date are part of the reservation record.

Ans:
FDs
flno → origin, destination, deptime, arrtime
flno, date → remseats, serialno
flno, date, empno → role
serialno → modelid, capacity, nextmaindate
modelid → capacity -- This dependency may not hold. For some airlines, the same model can be configured with different seating capacities. If this airline only has one class of service, this FD should hold.
mainno → serialno, date, empno, desc
custno → alias, phone
resno → flno, custno, date, fare, paymeth, crcardno, expdate
flno, custno, date → resno, fare, paymeth, crcardno, expdate
empno → name, title, phone

FD Notes:
It is possible that role is the same value as title. The problem does not provide details about whether role and title are the same. If they are the same, the FD flno, date, empno → role should be deleted. However, there should still be a table containing flno, date, and empno. The FD with the null RHS still holds: flno, date, empno → .

It is possible that crews should be assigned a unique number and other properties such as a range of dates for the crew to work together. This extension would involve FDs with crewno as the LHS and another table to represent an M-N relationship between crew and employee.

The FD crcardno → expdate does not hold because the expdate changes as a credit card is renewed. Thus, two reservations with the same credit card number can have different expiration dates.

The FD custno → crcardno does not hold because a customer can have multiple credit cards. Likewise, the reverse FD does not hold because multiple people can share the same credit card.
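The reasoning in these notes can be illustrated with a small check of an FD against sample rows. This is only a sketch: the reservation rows and the helper fd_holds are invented for illustration, and sample data can refute an FD but cannot establish one, since FDs are asserted by the designer rather than discovered from data.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if each lhs value maps to a single rhs value in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        if seen.setdefault(key, value) != value:
            return False  # same determinant value, different dependent value
    return True


# Hypothetical reservation rows; every value here is made up.
reservations = [
    {"resno": 1, "custno": 10, "crcardno": "4000-1", "expdate": "2024-06"},
    {"resno": 2, "custno": 10, "crcardno": "4000-1", "expdate": "2027-06"},  # card renewed
    {"resno": 3, "custno": 11, "crcardno": "4000-1", "expdate": "2027-06"},  # shared card
    {"resno": 4, "custno": 10, "crcardno": "5000-2", "expdate": "2026-01"},  # second card
]

print(fd_holds(reservations, ["crcardno"], ["expdate"]))           # renewal breaks it
print(fd_holds(reservations, ["custno"], ["crcardno"]))            # multiple cards break it
print(fd_holds(reservations, ["crcardno"], ["custno"]))            # sharing breaks it
print(fd_holds(reservations, ["resno"], ["crcardno", "expdate"]))  # consistent with the rows
```

Only the FD with resno as the determinant is consistent with these rows, which is why crcardno and expdate stay in the reservation table instead of being split into a separate credit card table.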

BCNF Tables
flight(flno, origin, destination, deptime, arrtime)
schedule(flno, date, remseats, serialno)
 FOREIGN KEY (serialno) REFERENCES airplane
assignment(flno, date, empno, role)
 FOREIGN KEY (flno, date) REFERENCES schedule
 FOREIGN KEY (empno) REFERENCES employee
airplane(serialno, modelid, nextmaindate)
 FOREIGN KEY (modelid) REFERENCES model
model(modelid, capacity) /* if the modelid → capacity FD does not hold, ignore this table. */
maintenance(mainno, serialno, date, empno, desc)
 FOREIGN KEY (serialno) REFERENCES airplane
 FOREIGN KEY (empno) REFERENCES employee
customer(custno, alias, phone)
reservation(resno, flno, custno, date, fare, paymeth, crcardno, expdate)
 FOREIGN KEY (flno, date) REFERENCES schedule
 FOREIGN KEY (custno) REFERENCES customer
employee(empno, name, title, phone)

19. For the following description of an accounting database, identify functional dependencies and construct normalized tables. Using the simple synthesis algorithm, design a collection of tables in BCNF. Note dependencies that are not important to the problem and relax your design from BCNF as appropriate. Justify your reasoning.

The primary function of the database is recording of entries into a register. A user can have multiple accounts and there is a register for each account.

Information about users includes a unique user number, a name, a street address, a city, a state, a zip, and an e-mail address (optional).

Accounts have attributes including a unique number, a unique name, a start date, a last check number, a type (Checking, Investment, etc.), a user number, and a current balance (computed). For checking accounts, the bank number (unique), the bank name, and the bank address also are recorded.

An entry contains a unique number, a type, an optional check number, a payee, a date, an amount, a description, an account number, and a list of entry lines. The type can have various values including ATM, Next Check Number, Deposit, and Debit Card.

In the list of entry lines, the user allocates the total amount of the entry to categories. An entry line includes a category name, a description of the entry line, and an amount.

Categories have other attributes not shown in an entry line: a unique category number (the name is also unique), a description, a type (asset, expense, revenue, or liability), and tax-related status (yes or no).

Categories are organized in hierarchies. For example, there is a category Auto with subcategories Auto:fuel and Auto:repair. Categories can have multiple levels of subcategories.

Ans:
FDs
userno → username, address, zip, email
email → username, address, zip, userno
zip → city, state
acctno → acctname, address, balance, lastcheckno, startdate, userno, bankno
acctname → acctno, address, balance, lastcheckno, startdate, userno, bankno
bankno → bankname, bankaddr
entryno → entdate, entamount, payee, entdesc, acctno, checkno
acctno, checkno → entryno, entdate, entamount, payee, entdesc, acctno
catno → catname, desc, taxstatus, supcatno
catname → catno, catdesc, taxstatus, supcatno
catno, entryno → EntLineAmount, EntLineDesc
catname, entryno → EntLineAmount, EntLineDesc
catno, bankno, checkno → EntLineAmount, EntLineDesc
catname, bankno, checkno → EntLineAmount, EntLineDesc

FD Notes:
FDs with checkno as a determinant depend on whether an account can have identical check numbers. If an account can have duplicate check numbers, this set of FDs should be removed.

The FDs with email as the determinant assume that users do not share the same email address. For example, a husband and wife might share the same email address if one or both does not use email much. If email addresses can be shared, then the FDs with email as a determinant should be removed.

Zip determines city unless city is the residence city, not the post office city. If the city represents the post office location, each zip code determines city.

BCNF Tables
User(UserNo, UserName, Address, City, State, Zip, Email)

This table is not in BCNF because zip is a determinant but not a candidate key. The zip dependencies are not important to the database, so the table is not normalized further.

Account(AcctNo, AcctName, Balance, LastCheckNo, StartDate, BankNo, UserNo)
 FOREIGN KEY (BankNo) REFERENCES Bank
 FOREIGN KEY (UserNo) REFERENCES User
Bank(BankNo, BankName, Address)
Entry(EntryNo, AcctNo, EntDate, EntAmount, EntDesc, CheckNo, EntPayee)
 FOREIGN KEY (AcctNo) REFERENCES Account
Category(CatNo, CatName, CatDesc, SupCatNo)
 FOREIGN KEY (SupCatNo) REFERENCES Category
EntLine(EntryNo, CatNo, EntLineAmount, EntLineDesc)
 FOREIGN KEY (CatNo) REFERENCES Category
 FOREIGN KEY (EntryNo) REFERENCES Entry


20. For the ERDs in Figure 8P.4, describe assumptions under which the ERDs correctly depict the relationships among operators, machines, and tasks. In each case, choose an appropriate name for the relationship(s) and describe the meaning of the relationship(s). In part (b) you should also choose the name for the new entity type.

Ans:
a) There is independence among operators and tasks. For a given machine, if an operator is trained on the machine and the machine can perform the task, then the operator can use the machine for that task. We are not interested in tracking actual usage of machines by operator and task. Rename R1 as trainedfor and R2 as performs.

b) There is dependence among operator, machine, and task. The goal of the database is to record the combinations of operators, machines, and tasks used, not potential combinations. The new entity type may be called performson or taskusage. There would probably be one or more attributes of the new entity type. The three relationships can be named: conducts or operatorof (R1), usedfor or machineof (R2), and performedwith or taskof (R3).
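The difference between the two cases can be sketched with sets of tuples. The data below is hypothetical, and the sketch assumes that in part (a) R1 links operators to machines and R2 links machines to tasks. Under independence, the three-way combinations are derivable by joining the two binary relationships on machine; under dependence, the recorded combinations cannot be rebuilt from their two-column projections, so a separate entity type is needed.

```python
# Hypothetical sample data for part (a).
trainedfor = {("Ann", "Lathe"), ("Bob", "Lathe"), ("Bob", "Drill")}          # (operator, machine)
performs = {("Lathe", "Turning"), ("Lathe", "Facing"), ("Drill", "Boring")}  # (machine, task)


def possible_usage(trainedfor, performs):
    """Join the two binary relationships on machine."""
    return {(op, m, task)
            for op, m in trainedfor
            for m2, task in performs
            if m == m2}


def project(usage):
    """Split (operator, machine, task) tuples into two binary relationships."""
    r1 = {(op, m) for op, m, _ in usage}
    r2 = {(m, task) for _, m, task in usage}
    return r1, r2


# Part (a): every combination is implied, so nothing is lost by storing only R1 and R2.
usage = possible_usage(trainedfor, performs)
print(possible_usage(*project(usage)) == usage)

# Part (b): hypothetical recorded usage; joining its projections adds spurious rows.
actual = {("Ann", "Lathe", "Turning"), ("Bob", "Lathe", "Facing")}
rejoined = possible_usage(*project(actual))
print(sorted(rejoined - actual))
```

In the second case the join reports that Ann did Facing and Bob did Turning on the lathe, which was never recorded; this is the situation in which the associative entity type of part (b) is required.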

21. For the following description of a database to support physical plant operations, identify functional dependencies and construct normalized tables. Using the simple synthesis procedure, design a collection of tables in BCNF. Note dependencies that are not important to the problem and relax your design from BCNF as appropriate. Justify your reasoning.

Design a database to assist physical plant personnel in managing key cards for access to buildings and rooms. The primary purpose of the database is to ensure proper accounting for all key cards.

A building has a unique building number, a unique name, and a location within the campus.

A room has a unique room number, a size (physical dimensions), a capacity, a number of entrances, and a description of equipment in the room. Each room is located in exactly one building. The room number consists of a building identification followed by an integer number. For example, room number KC100 identifies room 100 in the King Center (KC) building.

An employee has a unique employee number, a name, a position, a unique email address, a phone, and an optional room number in which the employee works.

Magnetically encoded key cards are designed to open one or more rooms. A key card has a unique card number, a date encoded, a list of room numbers that the key card opens, and the number of the employee authorizing the key card. A room may have one or more key cards that open it. A key type must be authorized before it is created.

Ans:
FDs
bldgno → bldgname, bldgloc
bldgname → bldgno, bldgloc
roomno → roomsize, roomcap, roomnumentrances, roomequipdesc, bldgno
empno → empname, empposition, empemail, empphone, roomno, kcno
empemail → empname, empposition, empno, empphone, roomno, kcno
keycardno → kcdateencoded, kcauthempno, empno
keycardno, roomno →

FD Notes:
RoomNo can be represented as one column or two columns (building abbreviation and number). If represented by two columns, the combination of building abbreviation and number determines the other columns. In addition, the combination of BldgNo and number determines the other columns.

To represent the M-N relationship between key cards and rooms, you can write an FD with a null RHS (keycardno, roomno → ).

To represent the 1-1 relationship between employee and key card, use multiple FDs with empno determining key card number, empemail determining key card number, key card number determining empno.

The FD keycardno → empemail should not be recorded because it can be transitively derived (keycardno → empno and empno → empemail imply keycardno → empemail).
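The transitive derivation can be checked mechanically. The sketch below is an illustration with a hypothetical helper, is_derived, that tests whether one FD follows from the others by growing the set of columns known from its determinant. It uses only the three FDs named in the note, not the full FD list for the problem.

```python
def is_derived(fd, fds):
    """Return True if fd follows from the other FDs in fds."""
    lhs, rhs = fd
    others = [f for f in fds if f != fd]
    known = set(lhs)
    changed = True
    while changed:
        changed = False
        for other_lhs, other_rhs in others:
            if set(other_lhs) <= known and not set(other_rhs) <= known:
                known |= set(other_rhs)
                changed = True
    return set(rhs) <= known


fds = [
    (("keycardno",), ("empno",)),
    (("empno",), ("empemail",)),
    (("keycardno",), ("empemail",)),  # candidate for removal
]

for fd in fds:
    print(fd, is_derived(fd, fds))
```

With these three FDs, only keycardno → empemail is reported as derived, so it is the one left out of the list.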

BCNF Tables
Building(BldgNo, BldgName, BldgLocation)
 UNIQUE(BldgName)
Room(RoomNo, roomsize, roomcap, roomnumentrances, roomequipdesc, BldgNo)
 FOREIGN KEY (BldgNo) REFERENCES Building
Employee(EmpNo, empname, empposition, empemail, empphone, roomno, kcno)
 FOREIGN KEY (RoomNo) REFERENCES Room
 FOREIGN KEY (KCNo) REFERENCES KeyCard
 UNIQUE(EmpEmail)
KeyCard(KCNo, kcdateencoded, kcauthempno, empno)
 FOREIGN KEY (EmpNo) REFERENCES Employee
 FOREIGN KEY (KCAuthEmpNo) REFERENCES Employee
KCAssign(KCNo, RoomNo)
 FOREIGN KEY (KCNo) REFERENCES KeyCard
 FOREIGN KEY (RoomNo) REFERENCES Room

22. For the ERDs in Figure 7.P7, describe assumptions under which the ERDs correctly depict the relationships among work assignments, tasks, and materials. A work assignment contains the scheduled work for a construction job at a specific location. Scheduled work includes the tasks and materials needed for the construction job. In each case, choose appropriate names for the relationships and describe the meaning of the relationships. In part (b) you should also choose the name for the new entity type.

Ans:
a) There is independence among materials and tasks. For a given work assignment, it is necessary to track the material required and the tasks required, not the material usage for each task in the work assignment. Rename R1 as MatUsage and R2 as TasksRequired.

b) There is dependence among materials and tasks. For a given work assignment, it is necessary to track the material usage for each task in the work assignment. It is not enough to track the material required and the tasks required. The new entity type may be called mattaskusage. There would probably be one or more attributes of the new entity type, such as the estimated and actual quantity of material used for a task. The three relationships can be named: requires (R1), matusage (R2), and taskusage (R3).