Fundarnentals of · Data Storage, Indexing, Query Processing, and Physical Design I chapter 13 Disk...

15
Fundarnentals of n I 5th Edition Ramez Elmasri Department of Computer Science and Engineering The University of Texas at Arlington Sharnkant B. Navathe College of Computing Georgia Institute of Technology Boston San Francisco NewYork London Toronto Sydney Tokyo Singapore Madrid Mexico City Munich Paris CapeTown Hang Kong Montreal

Transcript of Fundarnentals of · Data Storage, Indexing, Query Processing, and Physical Design I chapter 13 Disk...

Page 1: Fundarnentals of · Data Storage, Indexing, Query Processing, and Physical Design I chapter 13 Disk Storage, Basic File Strudures, and Hashing 453 13.1 lntroduction 454 13.2 Secondary

Fundarnentals of

n I

5th Edition

Ramez Elmasri Department of Computer Science and Engineering The University of Texas at Arlington

Sharnkant B. Navathe College of Computing Georgia Institute of Technology

Boston San Francisco NewYork London Toronto Sydney Tokyo Singapore Madrid

Mexico City Munich Paris CapeTown Hang Kong Montreal

Page 2: Fundarnentals of · Data Storage, Indexing, Query Processing, and Physical Design I chapter 13 Disk Storage, Basic File Strudures, and Hashing 453 13.1 lntroduction 454 13.2 Secondary

>duction and Conceptual Modeling M

iter 1 Databases and Database Users 3

I lntroduction 4

! An Example 6

3 Characteristics of the Database Approach 9

Actors on the Scene 14

Workers behind the Scene 16

Advantages of Using the DBMS Approach 17

A Brief History of Database Applications 23

When Not to Use a DBMS 26

Summary 27

iview Questions 27

ercises 28

!lected Bibliography 28

iter 2 Database System Concepts and Architecture 29

I Data Models, Schemas, and lnstances 30

! Three-Schema Architecture and Data lndependence

Page 3: Fundarnentals of · Data Storage, Indexing, Query Processing, and Physical Design I chapter 13 Disk Storage, Basic File Strudures, and Hashing 453 13.1 lntroduction 454 13.2 Secondary

stabase Languages and Interfaces 36

\e Database System Environment 40

entralized and ClienWServer Architectures for DBMSs 44

lassification of Database Management Systems 49

Jmrnary 52

Review Questions 54

Exercises 54

Selected Bibliography 55

chapter 3 Data Modeling Ucing the Entity-Relationship (ER) Model 57

3.1 Using High-Level Conceptual Data Models for Database Design 59

3.2 An Example Database Application 60

3.3 Entity Types, Entity Sets, Attributes, and Keys 61

3.4 Relationship Types, Relationship Sets, Roles, and Structural Constraints 70

3.5 Weak Entity Types 76

3.6 Refining the ER Design for the COMPANY Database 78

3.7 ER Diagrams, Narning Conventions, and Design lssues 79

3.8 Example of Other Notation: UML Class Diagrams 84

3.9 Relationship Types of Degree Higher Than Two 86

3.10 Summary 90

Review Questions 91

Exercises 92

Laboratory Exercises 99

Selected Bibliography 101

chapter 4 The Enhanced Entity-Relationship (EER) Model 101

4.1 Subclasses, Superclasses, and lnheritance 101

4.2 Specialization and Generalization 104

4.3 Constraints and Characteristics of Specialization and Generalization Hierarchies 107

4.4 Modeling of UNION Types Using Categories 114

4.5 An Example UNlVERSlTY EER Schema, Design Choices, and Formal Definitions 117

Page 4: Fundarnentals of · Data Storage, Indexing, Query Processing, and Physical Design I chapter 13 Disk Storage, Basic File Strudures, and Hashing 453 13.1 lntroduction 454 13.2 Secondary

..-. Contents a . kv~

4.6 Example of Other Notation: Representing Specialization and Generalization in UML Class Diagrams 121

4.7 Data Abstraction, Knowledge Representation, and Ontology Concepts 123

4.8 Summary 129

Review Questions 129

Exercises 130

Laboratory Exercises 137

Selected Bibliography 138

npart 2 Relational Model: Concepts, Constraints, Languages, Design, and Programming H

chapter 5 The Relational Data Model and Relational Database Constraints 141

5.1 Relational Model Concepts 141

5.2 Relational Model Constraints and Relational Database Schemas 149

5.3 Update Operations, Tranactions, and Dealing with Constraint Violations 157

5.4 Summary 161

Review Questions 162

Exercises 162

Selected Bibliography 166

chapter 6 The Relational Algebra and Relational Calculus 167

6.1 Unary Relational Operations: SELECT and PROJECT 169

6.2 Relational Algebra Operations from Set Theory 174

6.3 Binary Relational Operations: JOlN and DIVISION 177

6.4 Additional Relational Operations 186

6.5 Examples of Queries in Relational Algebra 193

6.6 The Tuple Relational Calculus 195

Page 5: Fundarnentals of · Data Storage, Indexing, Query Processing, and Physical Design I chapter 13 Disk Storage, Basic File Strudures, and Hashing 453 13.1 lntroduction 454 13.2 Secondary

i .,.<,.I/

Contents

6.7 The Domain Relational Calculus 204

6.8 Summary 206

Review Questions 207

Exercises 208

Laboratory Exercises 21 3

Selected Bibliography 215

chapter 7 Relational Database Design by ER-and EER-to-Relational Mapping 217

7.1 Relational Database Design Using ER-to-Relational Mapping 21 8

7.2 Mapping EER Model Constructs to Relations 226

7.3 Summary 230

Review Questions 230

Exercises 231

Laboratory Exercises 232

Selected Bibliography 233

chapter 8 SQL-99: Schema Definition, Constraints, Queries, and Views 233

8.1 SQL Data Definition and Data Types 235

8.2 Specifying Constraints in SQL 240

8.3 Schema Change Statements in SQL 243

8.4 Basic Queries in SOL 245

8.5 More Complex SQL Queries 255

8.6 INSERT, DELETE, and UPDATE Statements in SQL 271

8.7 Specifying Constraints as Assertions and Triggers 274

8.8 Views (Virtual Tables) in SOL 275

8.9 Additional Features of SQL 279

8.10 Sumrnaly 280

Review Questions 280

Exercises 282

Laboratory Exercises 285

Selected Bibliography 289

Page 6: Fundarnentals of · Data Storage, Indexing, Query Processing, and Physical Design I chapter 13 Disk Storage, Basic File Strudures, and Hashing 453 13.1 lntroduction 454 13.2 Secondary

-.8 ,Contents %I>

chapter 9 lntroduction to SQL Prograrnrning Techniques 289

9.1 Database Programming: lssues and Techniques 290

9.2 Embedded SQL, Dynarnic SQL, and SQLJ 293

9.3 Database Programming with Function Calls: SQLlCLl and JDBC 305

9.4 Database Stored Procedures and SQLIPSM 315

9.5 Summary 318

Review Questions 318

Exercises 319

Laboratory Exercises 319

Selected Bibliography 321

npart 3 Database Design Theory and Methodology H

chapter 10 Functional Dependencies and Norrnalization for Relational Databases 325

10.1 Informal Design Guidelines for Relation Schemas 327

10.2 Functional Dependencies 337

10.3 Normal Forms Based on Primary Keys 345

10.4 General Definitions of Second and Third Normal Forms 355

10.5 Boyce-Codd Normal Form 358

10.6 Summary 360

Review Questions 361

Exercises 362

Laboratory Exercises 367

Selected Bibliography 367

chapter 1 1 Relational Database Design Algorithrns and Further Dependencies 369

11.1 Properties of Relational Decompositions 370

1 1.2 Algorithms for Relational Database Schema Design 376

Page 7: Fundarnentals of · Data Storage, Indexing, Query Processing, and Physical Design I chapter 13 Disk Storage, Basic File Strudures, and Hashing 453 13.1 lntroduction 454 13.2 Secondary

Contents

11.3 Multivalued Dependencies and Fourth Normal Form 386

11.4 Join Dependencies and Fifth Normal Form 392

11.5 lnclusion Dependencies 393

11.6 Other Dependencies and Normal Forms 394

1 1.7 Summary 397

Review Questions 397

Exercises 398

Laboratory Exercises 400

Selected Bibliography 401

chapter 12 Practical Database Design Methodology and Use of UML Diagrams 403

12.1 The Role of Information Systems in Organizations 404

12.2 The Database Design and lmplementation Process 409

12.3 Use of UML Diagrams As an Aid to Database Design Specification 428

12.4 Rational Rose, a UML-Based Design Tool 437

12.5 Automated Database Design Tools 443

12.6 Summary 446

Review Questions 447

Selected Bibliography 449

Ipart 4 Data Storage, Indexing, Query Processing, and Physical Design I

chapter 13 Disk Storage, Basic File Strudures, and Hashing 453

13.1 lntroduction 454

13.2 Secondary Storage Devices 457

13.3 Buffering of Blocks 463

13.4 Placing File Records on Disk 465

Page 8: Fundarnentals of · Data Storage, Indexing, Query Processing, and Physical Design I chapter 13 Disk Storage, Basic File Strudures, and Hashing 453 13.1 lntroduction 454 13.2 Secondary

11'11Contents ;: xX.

13.5 Operations on Files 469

13.6 Files of Unordered Records (Heap Files) 472

13.7 Files of Ordered Records (Sorted Files) 473

13.8 Hashing Techniques 476

13.9 Other Primary File Organizations 486

13.10 Parallelizing Disk Access Using RAID Technology 487

13.1 1 New Storage Systems 490

13.12 Summary 493

Review Questions 494

Exercises 495

Selected Bibliography 498

chapter 14 lndexing Structures for Files 499

14.1 Types of Single-Level Ordered lndexes 500

14.2 Multilevel lndexes 51 0

14.3 Dynamic Multilevel lndexes Using B-Trees and B*-Trees 513

14.4 lndexes on Multiple Keys 526

14.5 Other Types of lndexes 530

14.6 Summary 531

Review Questions 532

Exercises 533

Selected Bibliography 535

chapter 15 Algorithms for Query Processing and Optimization 537

15.1 Translating SQL Queries into Relational Algebra 539

15.2 Algorithms for External Sorting 540

15.3 Algorithms for SELECT and JOlN Operations 542

15.4 Algorithms for PROJECT and SET Operations 553

15.5 lmplementing Aggregate Operations and OUTER JOlNS 554

15.6 Combining Operations Using Pipelining 556

15.7 Using Heuristics in Query Optimization 556

15.8 Using Selectivity and Cost Estimates in Query Optimization 566

15.9 Overview of Query Optimization in Oracle 576

Page 9: Fundarnentals of · Data Storage, Indexing, Query Processing, and Physical Design I chapter 13 Disk Storage, Basic File Strudures, and Hashing 453 13.1 lntroduction 454 13.2 Secondary

15.1 0 Semantic Query Optimization 577

15.1 1 Summary 577

Review Questions 578

Exercises 578

Selected Bibliography 579

chapter 16 Physical Database Design and Tuning 581

16.1 Physical Database Design in Relational Databases 581

16.2 An Ovewiew of Database Tuning in Relational Systems 586

16.3 Summary 592

Review Questions 592

Selected Bibliography 592

m part 5 Transaction Processing Concepts 1111

chapter 17 lntroduction to Transaction Processing Concepts and Theory 595

17.1 lntroduction to Transaction Processing 596

17.2 Transaction and System Concepts 602

17.3 Desirable Properties of Transactions 605

17.4 Characterizing Schedules Based on Recoverability 607

17.5 Characterizing Schedules Based on Serializability 61 0

17.6 Transaction Support in SQL 620

17.7 Summary 622

Review Questions 623

Exercises 624

chapter 18 Concurrency Controi Techniques 627

18.1 Two-Phase Locking Techniques for Concurrency Control 628

18.2 Concurrency Control Based on Timestamp Ordering 638

18.3 Multiversion Concurrency Control Techniques 641

18.4 Validation (Optimistic) Concurrency Control Techniques 643

Page 10: Fundarnentals of · Data Storage, Indexing, Query Processing, and Physical Design I chapter 13 Disk Storage, Basic File Strudures, and Hashing 453 13.1 lntroduction 454 13.2 Secondary

Contents kki

18.5 Granularity of Data ltems and Multiple Granularity Locking 645

18.6 Using Locks for Concurrency Control in Indexes 649

18.7 Other Concurrency Control Issues 650

18.8 Summary 651

Review Ouestions 652

Exercises 653

Selected Bibliography 654

chapter 19 Database Recovery Techniques 655

19.1 Recovery Concepts 656

19.2 Recovery Techniques Based on Deferred Update 662

19.3 Recovery Techniques Based on Immediate Update 667

19.4 Shadow Paging 668

19.5 The ARlES Recovery Algorithm 670

19.6 Recovery in Multidatabase Systems 673

19.7 Database Backup and Recovery from Catastrophic Failures 675 19.8 Summary 675

Review Ouestions 676

Exercises 677

Selected Bibliography 680

Object and Object-Relational Databases M

chapter 20 Concepts for Object Databases 683

20.1 Ovewiew of Object-Oriented Concepts 685

20.2 Object Identity, Object Structure, and Type Constructors 687

20.3 Encapsulation of Operations, Methods, and Persistence 693

20.4 Type and Class Hierarchies and lnheritance 698

20.5 Complex Objects 702

20.6 Other Objected-Oriented Concepts 704

20.7 Summary 707

Review Ouestions 708

Exercises 708

Page 11: Fundarnentals of · Data Storage, Indexing, Query Processing, and Physical Design I chapter 13 Disk Storage, Basic File Strudures, and Hashing 453 13.1 lntroduction 454 13.2 Secondary

, !""i;;:'Contents

chapter 21 Object Database Standards, Languages, and Design 711

21.1 Overview of the Object Model of ODMG 712

21.2 The Object Definition Language ODL 725

21.3 The Object Query Language OQL 730

21.4 Overview of the C++ Language Binding 739

21.5 Object Database Conceptual Design 741

21.6 Summary 744

Review Questions 745

Exercises 745

Selected Bibliography 746

chapter 22 Object-Relational and Extended-Relational Systems 747

22.1 Overview of SQL and Its Object-Relational Features 748

22.2 Evolution of Data Models and Current Trends of Database Technology 755

22.3 The lnformix Universal Server 757

22.4 Object-Relational Features of Oracle 8 768

22.5 lmplementation and Related Issues for Extended Type Systems 770

22.6 The Nested Relational Model 772

22.7 Summary 774

Selected Bibliography 775

Ipart 7 Further Topics: Security, Advanced Modeling, and Distribution lFhfr

chapter 23 Database Security 779

23.1 lntroduction to Database Security lssues 779

23.2 Discretionary Access Control Based on Granting and Revoking Privileges 784

23.3 Mandatory A G C ~ S S Control and Role-Based Access Control for Multilevel Security 788

Page 12: Fundarnentals of · Data Storage, Indexing, Query Processing, and Physical Design I chapter 13 Disk Storage, Basic File Strudures, and Hashing 453 13.1 lntroduction 454 13.2 Secondary

* I ' $

Contents : Wc'

23.4 lntroduction to Statistical Database Security 794 23.5 lntroduction to Flow Control 796

23.6 Encryption and Public Key lnfrastructures 798 23.7 Privacy lssues and Preservation 800

23.8 Challenges of Database Security 801

23.9 Summary 802

Review Questions 803

Exercises 804

chapter 24 Enhanced Data Models for Advanced Applications 807

24.1 Active Database Concepts and Triggers 809

24.2 Temporal Database Concepts 819

24.3 Spatial and Multimedia Databases 833

24.4 lntroduction to Deductive Databases 836

24.5 Summary 850

Review Questions 851

Exercises 852

Selected Bibliography 855

chapter 25 Distributed Databases and Client-Server Architectures 857

25.1 Distributed Database Concepts 858

25.2 Data Fragmentation, Replication, and Allocation Techniques for Distributed Database Design 864

25.3 Types of Distributed Database Systems 871 25.4 Query Processing in Distributed Databases 874 25.5 Overview of Concurrency Control and Recovery

in Distributed Databases 879

25.6 An Overview of 3-Tier Client-Server Architecture 883

25.7 Distributed Databases in Oracle 885

25.8 Summary 888

Review Questions 888

Exercises 889

Selected Bibliography 891

Page 13: Fundarnentals of · Data Storage, Indexing, Query Processing, and Physical Design I chapter 13 Disk Storage, Basic File Strudures, and Hashing 453 13.1 lntroduction 454 13.2 Secondary

Emerging Technologies H

chapter 26 Web Database Programming Using PHP 897

26.1 Structured, Semistructured, and Unstructured Data 898

26.2 A Simple PHP Example 903

26.3 Overview of Basic Features of PHP 905

26.4 Overview of PHP Database Programming 91 2

26.5 Summary 916

Review Questions 91 7

Exercises 91 7

Laboratory Exercises 91 8

Selected Bibliography 91 9

chapter 27 XML: Extensible Markup Language 921

27.1 XML Hierarchical (Tree) Data Model 922

27.2 XML Docurnents, DTD, and XML Schema 923

27.3 XML Documents and Databases 932

27.4 XML Querying 939

27.5 Summary 941

Review Questions 942

Exercises 942

Selected Bibliography 943

chapter 28 Data Mining Concepts 945

28.1 Overview of Data Mining Technology 946

28.2 Association Rules 949

28.3 Classification 961

28.4 Clustering 964

28.5 Approaches to Other Data Mining Problems 967

Page 14: Fundarnentals of · Data Storage, Indexing, Query Processing, and Physical Design I chapter 13 Disk Storage, Basic File Strudures, and Hashing 453 13.1 lntroduction 454 13.2 Secondary

- V . "

Conents :M i

28.6 Applications of Data Mining 970

28.7 Commercial Data Mining Tools 970

28.8 Summaly 973

Review Questions 973

Exercises 974

Selected Bibliography 975

chapter 29 Overview of Data Warehousing and OLAP 977 29.1 Introduction, Definitions, and Terminology 977

29.2 Characteristics of Data Warehouses 979

29.3 Data Modeling for Data Warehouses 980

29.4 Building a Data Warehouse 985

29.5 Typical Functionality of a Data Warehouse 988 29.6 Data Warehouse versus Views 989

29.7 Problems and Open lssues in Data Warehouses 990

29.8 Summary 992

Review Questions 992

Selected Bibliography 992

chapter 30 Emerging Database Technologies and Applications 993

30.1 Mobile Databases 994

30.2 Multimedia Databases 1001

30.3 Geographic Information Systems (GIS) 1008 30.4 Genome Data Management 1022

Selected Bibliography 1031

Credits 1034

appendix A Alternative Diagramrnatic Notations for ER Models 1035

appendix B Parameters of Disks 1039

Page 15: Fundarnentals of · Data Storage, Indexing, Query Processing, and Physical Design I chapter 13 Disk Storage, Basic File Strudures, and Hashing 453 13.1 lntroduction 454 13.2 Secondary

Contents

appendix C Overview of the QBE Language 1043

C.1 Basic Retrievals in QBE 1043

C.2 Grouping, Aggregation, and Database Modification in QBE 1048

appendix D Overview of the Hierarchical Data Model (located on the Companion Website at http://www.aw.cornlelmasri)

appendix E Overview of the Network Data Model (located on the Companion Website at http://www.aw.com/elrnasri)

Selected Bibliography 1051

Index 1081