CSC443 Database Management
![Page 1: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/1.jpg)
CSC443 Database Management
Course Introduction
Professor Pepper
adapted from presentations given by
Professor Juliana Freire &
Karl Aberer
& Yan Chen
& Silberschatz, Korth and Sudarshan
![Page 2: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/2.jpg)
Today’s Goals
Course Overview
Why study databases?
Why use databases?
Intro to Databases
![Page 3: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/3.jpg)
Major Course Objectives
Design and diagram relational databases
Create Access and Oracle databases
Use SQL commands
Be able to design a good relational database
Know how to get information out of a database to answer any question
![Page 4: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/4.jpg)
Diagramming
Use Case
Class Diagram
Entity Relationship Diagram
Algebraic Relation Model
![Page 5: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/5.jpg)
Tools
Panther Unix Oracle 9.2.0.1.0
FTP Explorer – register for trial
MS Access
![Page 6: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/6.jpg)
Books
Database System Concepts, 5th Ed.
• Theory
• Cross reference for fourth ed.
Oracle 9i Programming - A Primer
• Practical examples
See course syllabus
Available in Library
![Page 7: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/7.jpg)
Learning Resources
Blackboard: my.adelphi.edu
Web site for Database System Concepts: www.db-book.com/
My office hours: Tuesday & Thursday 12:15-1:30; Wed 12-12:30, Alumni 114 or Science Lab
My email: [email protected]
My phone: 516-747-2362
My Web: www.adelphi.edu/~pepperk
![Page 8: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/8.jpg)
Adelphi Account Setup
Panther
Oracle
Blackboard
E-mail
Sign-in sheet
![Page 9: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/9.jpg)
Projects / Grading
Projects: 40% (Access 15%, Oracle 25%)
Homework assignments: 20%
Midterm: 20%
Final: 20%
![Page 10: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/10.jpg)
Assignments
2% dropped for anything 1 day late.
10% dropped for anything 2 weeks late.
![Page 11: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/11.jpg)
Delivering assignments
Email
FTP
Drop box
Discussion board
Mailbox in math department
E-mail me if you change the delivery place.
Forward your email from Adelphi.
![Page 12: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/12.jpg)
What is a Database Management System?
Database Management System = DBMS
• A collection of files that store the data
• A big program written by someone else that accesses and updates those files for you
Relational DBMS = RDBMS
• Data files are structured as relations (tables)
![Page 13: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/13.jpg)
Why Study Databases?
![Page 14: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/14.jpg)
What is behind this Web Site?
http://www.ticketmaster.com/
Search on a large database
Specify search conditions
Many users
Updates
Access through a web interface
Central to Modern Computer Science
![Page 15: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/15.jpg)
Database Systems: Then
![Page 16: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/16.jpg)
Database Systems: Today
From Friendster.com on-line tour
Field is developing quickly
![Page 17: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/17.jpg)
Other databases you may use
Databases are EVERYWHERE
![Page 18: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/18.jpg)
Current Commercial Outlook
A major part of the software industry:
• Oracle, IBM, Microsoft, Sybase
• also Informix (now IBM), Teradata
• smaller players: Java-based DBMS, devices, OO, …
Well-known benchmarks (esp. TPC)
Lots of related industries
• data warehouse, document management, storage, backup, reporting, business intelligence, app integration
Relational products dominant and evolving
• adapting for extensibility (user-defined types), adding native XML support
Open Source coming on strong
• MySQL, PostgreSQL, BerkeleyDB
![Page 19: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/19.jpg)
Why Study Databases??
The need for databases has exploded
• Corporate: retail swipe/clickstreams, “customer relationship mgmt”, “supply chain mgmt”, “data warehouses”, etc.
• Scientific: digital libraries, Human Genome project, NASA Mission to Planet Earth, physical sensors, grid physics network
?
![Page 20: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/20.jpg)
Why study databases?
Data is valuable: bank account records, tax records, student records…
Protect it! - no matter what
• Hurricane
• Flood
• Human error
![Page 21: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/21.jpg)
Why study databases?
Data is often structured:
• Example: Bank account records all follow the same structure
We can exploit this regular structure
• To retrieve data in useful ways (that is, we can use a query language)
• To store data efficiently
![Page 22: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/22.jpg)
Why Study Databases Summary
Central to modern computer science
Databases are everywhere
Commercially successful
Fast moving technology
Plethora of structured data that businesses and people need
![Page 23: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/23.jpg)
What is a database?
Whiteboard Exercise
![Page 24: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/24.jpg)
Database Definition
Database – a very large, integrated collection of data (the stuff)
• Models a real-world enterprise
• Entities (e.g., teams, games)
• Relationships (e.g., The Forty-Niners are playing in The Superbowl)
Database Management System – software that stores and manages databases (the tools)
![Page 25: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/25.jpg)
A database is better than a simple file system because a file system suffers from:
Data redundancy, inconsistency and isolation
Difficulty accessing data
Integrity problems
Non-atomic updates (change one file and die before the other completes)
Multiple-user issues
![Page 26: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/26.jpg)
So a Database Has:
Representing information
• data modeling
Languages and systems for querying data
• complex queries with real semantics* over massive data sets
Concurrency control for data manipulation
• controlling concurrent access
• ensuring transactional semantics
Reliable data storage
• maintain data semantics even if you pull the plug
* semantics: the meaning or relationship of meanings of a sign or set of signs
![Page 27: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/27.jpg)
Why Use a Database
Why use a database presentation
![Page 28: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/28.jpg)
What is in a database?
![Page 29: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/29.jpg)
Describing Data: Data Models
A data model is a collection of concepts for describing data.
A schema is a description of a particular collection of data, using a given data model.
A relation is the data stored under a given schema.
The relational model of data is the most widely used model today.
• Entities and relations among them
• Integrity constraints and business rules
• Perspective dependent (warehouse & sales view an item differently)
![Page 30: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/30.jpg)
Database Design
The process of designing the general structure of the database:
Logical Design – Deciding on the database schema.
• Business decision – What attributes
• Computer Science decision – What relation schemas
Physical Design – Deciding on the physical layout of the database
![Page 31: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/31.jpg)
Data Models
A collection of tools for describing
• Data
• Data relationships
• Data semantics
• Data constraints
Relational model
Entity-Relationship data model (mainly for database design)
Object-based data models (Object-oriented and Object-relational)
Semistructured data model (XML)
Other older models:
• Network model
• Hierarchical model
![Page 32: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/32.jpg)
![Page 33: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/33.jpg)
The Entity-Relationship Model
Models an enterprise as a collection of entities and relationships
• Entity: a “thing” or “object” in the enterprise that is distinguishable from other objects; described by a set of attributes
• Relationship: an association among several entities
Represented diagrammatically by an entity-relationship diagram:
![Page 34: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/34.jpg)
Relational Model
Use ER for the concept, then map it to the Algebraic Relational Model
Relations (tables of possible data)
Instance (actual data at a given time)
Schema (description of those tables and their relations)
![Page 35: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/35.jpg)
Relational Model Terminology
![Page 36: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/36.jpg)
Relational Model Look
Notation: $\sigma_p(r)$
$p$ is called the selection predicate
Defined as:
$\sigma_p(r) = \{t \mid t \in r \text{ and } p(t)\}$
where $p$ is a formula in propositional calculus consisting of terms connected by $\land$ (and), $\lor$ (or), $\lnot$ (not)
Each term is one of:
<attribute> op <attribute> or <constant>
where op is one of: $=, \neq, >, \geq, <, \leq$
Example of selection:
$\sigma_{branch\_name = \text{"Perryridge"}}(account)$
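The selection operation above can be sketched in a few lines of Python. This is only an illustration under assumptions: a relation is modeled as a list of dictionaries, and the `select` helper and the sample `account` rows are made up for the example rather than taken from the slides.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def select(predicate: Callable[[Row], bool], relation: List[Row]) -> List[Row]:
    """Return every tuple t in `relation` for which predicate(t) is true."""
    return [t for t in relation if predicate(t)]


# Hypothetical instance of the account relation (illustrative data only).
account = [
    {"account_number": "A-101", "branch_name": "Downtown", "balance": 500},
    {"account_number": "A-102", "branch_name": "Perryridge", "balance": 400},
    {"account_number": "A-201", "branch_name": "Perryridge", "balance": 900},
]

# Selection with a single term: branch_name = "Perryridge"
perryridge = select(lambda t: t["branch_name"] == "Perryridge", account)

# Selection with two terms connected by "and"
large_perryridge = select(
    lambda t: t["branch_name"] == "Perryridge" and t["balance"] > 500,
    account,
)
```

The predicate is an ordinary function over one tuple, which mirrors the definition: the result keeps exactly the tuples of the relation that satisfy it.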
![Page 37: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/37.jpg)
Object-Relational Data Models
Extend the relational data model by including object orientation and constructs to deal with added data types.
Allow attributes of tuples to have complex types, including non-atomic values such as nested relations.
Preserve relational foundations, in particular the declarative access to data, while extending modeling power.
Provide upward compatibility with existing relational languages.
![Page 38: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/38.jpg)
Design Goals
Design Goals:
Avoid redundant data
Ensure that relationships among attributes are represented
Ensure constraints are properly modeled: updates check for violation of database integrity constraints.
![Page 39: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/39.jpg)
Bad Design
![Page 40: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/40.jpg)
Queries
What the programmer sees
![Page 41: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/41.jpg)
Some Basic SQL Commands
Select – Get rows of data
* – everything
From – the name of the table (relation) will follow
Where – Only get the stuff that matches
Example: Select * from movies where theater = 'Loews'
Exercise – Write down the query to select all of your friends that live in NY State
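The example query can be tried without Access or Oracle by using Python's built-in `sqlite3` module. The sketch below is an assumption-laden illustration: the `movies` table layout and its rows are invented, and SQLite stands in for the course's database systems.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table layout; the slide only names the table and one column.
conn.execute("CREATE TABLE movies (title TEXT, theater TEXT)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?)",
    [
        ("Casablanca", "Loews"),
        ("Vertigo", "Regal"),
        ("Metropolis", "Loews"),
    ],
)

# SELECT * returns every column, FROM names the table,
# WHERE keeps only the rows that match the condition.
rows = conn.execute(
    "SELECT * FROM movies WHERE theater = 'Loews' ORDER BY title"
).fetchall()
```

Note that the string constant is quoted; without the quotes the database would treat Loews as a column name.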
![Page 42: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/42.jpg)
Example: University Database
Conceptual schema:
• Students(sid: string, name: string, login: string, age: integer, gpa: real)
• Courses(cid: string, cname: string, credits: integer)
• Enrolled(sid: string, cid: string, grade: string)
External Schema (View):
• Course_info(cid: string, enrollment: integer)
Physical schema:
• Relations stored as unordered files.
• Index on first column of Students.
Key to good performance
[Figure: levels of abstraction. View 1, View 2, and View 3 sit above the Conceptual Schema, which sits above the Physical Schema and the DB.]
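As a rough sketch, the university schema could be declared as follows using Python's `sqlite3` module. Several things here are assumptions: SQLite type names stand in for string/integer/real, the primary keys and the sample rows are invented, and `Course_info` is defined as a count of enrollments per course, which is one plausible reading of the view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.executescript(
    """
    -- Conceptual schema: the three relations.
    -- Declaring sid as PRIMARY KEY also makes SQLite build an index on it,
    -- which loosely corresponds to "index on first column of Students".
    CREATE TABLE Students (sid TEXT PRIMARY KEY, name TEXT, login TEXT,
                           age INTEGER, gpa REAL);
    CREATE TABLE Courses  (cid TEXT PRIMARY KEY, cname TEXT, credits INTEGER);
    CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);

    -- External schema: a view computed from the conceptual schema.
    CREATE VIEW Course_info AS
        SELECT cid, COUNT(*) AS enrollment
        FROM Enrolled
        GROUP BY cid;
    """
)

# Illustrative rows only.
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
    [("s1", "Ann", "ann", 20, 3.9), ("s2", "Bob", "bob", 21, 3.1)],
)
conn.executemany(
    "INSERT INTO Courses VALUES (?, ?, ?)",
    [("CSC171", "Intro to CS", 4), ("CSC443", "Database Management", 3)],
)
conn.executemany(
    "INSERT INTO Enrolled VALUES (?, ?, ?)",
    [("s1", "CSC443", "A"), ("s2", "CSC443", "B"), ("s1", "CSC171", "A")],
)

course_info = conn.execute(
    "SELECT cid, enrollment FROM Course_info ORDER BY cid"
).fetchall()
```

Users of the view see only course ids and enrollment counts; they never touch the `Enrolled` table directly.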
![Page 43: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/43.jpg)
Data Independence (levels of abstraction)
Applications insulated from how data is structured and stored.
Logical data independence: Protection from changes in the logical structure of data; views stay stable.
Physical data independence: Protection from changes in physical structure of data.
Q: Why are these particularly important for DBMS?
[Figure: levels of abstraction. View 1, View 2, and View 3 sit above the Conceptual Schema, which sits above the Physical Schema and the DB.]
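Physical data independence can be illustrated with a small experiment: change only the physical layout (here, by adding an index) and check that the same query still returns the same answer. The table, rows, and index name below are assumptions made for the sketch, with SQLite standing in for a full DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid TEXT, name TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [("s1", "Ann", 3.9), ("s2", "Bob", 3.1), ("s3", "Cy", 3.6)],
)

# The application's query is written once, against the logical schema.
query = "SELECT name FROM Students WHERE gpa > 3.5 ORDER BY name"
before = conn.execute(query).fetchall()

# Change the physical schema only: add an index on gpa.
conn.execute("CREATE INDEX idx_students_gpa ON Students (gpa)")

# The query text is unchanged and so is its result; only the way the
# DBMS may choose to execute it can differ.
after = conn.execute(query).fetchall()
```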
![Page 44: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/44.jpg)
Queries
Change and get data from a database
Run over data model
Easy & efficient
Not good for complex calculations
DML and DDL
![Page 45: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/45.jpg)
Data Manipulation Language (DML)
Language for accessing and manipulating the data organized by the appropriate data model
DML also known as query language
Two classes of languages
• Procedural – user specifies what data is required and how to get those data
• Declarative (nonprocedural) – user specifies what data is required without specifying how to get those data
SQL is the most widely used query language
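The difference between the two classes can be seen by answering one question both ways. In this sketch the account data and the threshold of 450 are invented; the loop plays the role of a procedural approach, and the SQL statement (run through Python's `sqlite3` module) is the declarative one.

```python
import sqlite3

# Illustrative data: (account_number, balance)
accounts = [("A-101", 500), ("A-102", 400), ("A-201", 900)]

# Procedural style: we spell out HOW to find the answer,
# visiting each record and testing it ourselves.
procedural = []
for number, balance in accounts:
    if balance > 450:
        procedural.append(number)

# Declarative style: we state WHAT we want and let the DBMS decide how.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", accounts)
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT account_number FROM account "
        "WHERE balance > 450 ORDER BY account_number"
    )
]
```

Both approaches give the same accounts, but only the first one commits to a particular access strategy.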
![Page 46: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/46.jpg)
Data Definition Language (DDL)
Specification notation for defining the database schema
Example:
    create table account (
        account_number char(10),
        balance integer)
DDL compiler generates a set of tables stored in a data dictionary
Data dictionary contains metadata (i.e., data about data)
• Database schema
• Data storage and definition language: specifies the storage structure and access methods used
• Integrity constraints: domain constraints, referential integrity (references constraint in SQL), assertions
• Authorization
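A small sketch of DDL in action, using Python's `sqlite3` module. The `branch` table, the `CHECK` rule on balance, and the sample rows are assumptions added to show a domain constraint and a referential-integrity constraint; `sqlite_master` is SQLite's own catalog table and serves here as a stand-in for the data dictionary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces REFERENCES constraints when this is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript(
    """
    CREATE TABLE branch (branch_name TEXT PRIMARY KEY);
    CREATE TABLE account (
        account_number CHAR(10) PRIMARY KEY,
        branch_name    TEXT REFERENCES branch(branch_name),  -- referential integrity
        balance        INTEGER CHECK (balance >= 0)          -- domain constraint
    );
    """
)

# The catalog holds metadata: here, the names of the tables just defined.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

conn.execute("INSERT INTO branch VALUES ('Perryridge')")


def try_insert(row):
    """Attempt an insert and report whether the DBMS accepted it."""
    try:
        conn.execute("INSERT INTO account VALUES (?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


valid = try_insert(("A-102", "Perryridge", 400))
negative_balance = try_insert(("A-103", "Perryridge", -5))
unknown_branch = try_insert(("A-104", "Nowhere", 100))
```

The two bad rows are refused by the database itself, so no application code has to repeat those checks.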
![Page 47: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/47.jpg)
Queries - What does it look like?
System handles query plan generation & optimization; ensures correct execution.
SELECT eid, ename, title
FROM Emp E
WHERE E.sal > 50000

SELECT E.loc, AVG(E.sal)
FROM Emp E
GROUP BY E.loc
HAVING COUNT(*) > 5

SELECT COUNT(DISTINCT E.eid)
FROM Emp E, Proj P, Asgn A
WHERE E.eid = A.eid
AND P.pid = A.pid
AND E.loc <> P.loc
Issues: view reconciliation, operator ordering, physical operator choice, memory management, access path (index) use, …
[Figure: the Employees, Projects, and Assignments tables, with query plans built from Select, Group (agg), Having, Join, and Count distinct operators over Emp, Asgn, and Proj.]
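The three queries can be run against a toy database to see what each one returns. Everything about the data below is an assumption: the column lists for `Emp`, `Proj`, and `Asgn` are inferred from the queries, the rows are invented, and the salary threshold is written as the plain number 50000. SQLite (via Python's `sqlite3` module) stands in for the DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Emp  (eid INTEGER, ename TEXT, title TEXT, sal INTEGER, loc TEXT);
    CREATE TABLE Proj (pid INTEGER, loc TEXT);
    CREATE TABLE Asgn (eid INTEGER, pid INTEGER);
    """
)
conn.executemany(
    "INSERT INTO Emp VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Ann", "Eng", 60000, "NY"),
        (2, "Bob", "Eng", 40000, "NY"),
        (3, "Cy", "Mgr", 80000, "NY"),
        (4, "Di", "Eng", 50000, "NY"),
        (5, "Ed", "Ops", 30000, "NY"),
        (6, "Flo", "Ops", 40000, "NY"),
        (7, "Gus", "Eng", 70000, "LA"),
    ],
)
conn.executemany("INSERT INTO Proj VALUES (?, ?)", [(10, "NY"), (20, "LA"), (30, "LA")])
conn.executemany(
    "INSERT INTO Asgn VALUES (?, ?)",
    [(1, 10), (1, 20), (1, 30), (2, 20), (7, 20), (7, 10)],
)

# Query 1: employees earning more than 50000.
high_paid = conn.execute(
    "SELECT eid, ename, title FROM Emp E WHERE E.sal > 50000 ORDER BY eid"
).fetchall()

# Query 2: average salary per location, only for locations with more than 5 employees.
loc_avg = conn.execute(
    "SELECT E.loc, AVG(E.sal) FROM Emp E GROUP BY E.loc HAVING COUNT(*) > 5"
).fetchall()

# Query 3: how many distinct employees work on a project in another location.
away_count = conn.execute(
    """
    SELECT COUNT(DISTINCT E.eid)
    FROM Emp E, Proj P, Asgn A
    WHERE E.eid = A.eid AND P.pid = A.pid AND E.loc <> P.loc
    """
).fetchone()[0]
```

In the third query Ann matches twice (projects 20 and 30), which is why `DISTINCT` matters: she is still counted once.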
![Page 48: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/48.jpg)
SQL
SQL: widely used non-procedural language
Example: Find the name of the customer with customer-id 192-83-7465
select customer.customer_name
from customer
where customer.customer_id = '192-83-7465'
Example: Find the balances of all accounts held by the customer with customer-id 192-83-7465
select account.balance
from depositor, account
where depositor.customer_id = '192-83-7465' and
      depositor.account_number = account.account_number
Application programs generally access databases through one of
• Language extensions to allow embedded SQL
• Application program interface (e.g., ODBC/JDBC) which allows SQL queries to be sent to a database
For us: Oracle and Access SQL languages
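Sending SQL through an application program interface looks roughly like the sketch below. Python's `sqlite3` module is used as a stand-in for ODBC/JDBC, and the table contents are invented sample rows; only the two queries come from the slide. The customer id is passed as a parameter (the `?` placeholder) rather than pasted into the SQL text, which is the usual practice with such interfaces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer  (customer_id TEXT, customer_name TEXT);
    CREATE TABLE account   (account_number TEXT, balance INTEGER);
    CREATE TABLE depositor (customer_id TEXT, account_number TEXT);
    """
)
# Illustrative rows only.
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [("192-83-7465", "Johnson"), ("019-28-3746", "Smith")],
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [("A-101", 500), ("A-201", 900), ("A-215", 700)],
)
conn.executemany(
    "INSERT INTO depositor VALUES (?, ?)",
    [("192-83-7465", "A-101"), ("192-83-7465", "A-201"), ("019-28-3746", "A-215")],
)

customer_id = "192-83-7465"

# First example: the customer's name.
name = conn.execute(
    "SELECT customer.customer_name FROM customer WHERE customer.customer_id = ?",
    (customer_id,),
).fetchone()[0]

# Second example: balances of all accounts held by that customer.
balances = [
    row[0]
    for row in conn.execute(
        """
        SELECT account.balance
        FROM depositor, account
        WHERE depositor.customer_id = ?
          AND depositor.account_number = account.account_number
        ORDER BY account.balance
        """,
        (customer_id,),
    )
]
```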
![Page 49: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/49.jpg)
A Look underneath
![Page 50: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/50.jpg)
Concurrency Control
Concurrent execution of user programs: key to good DBMS performance.
• Disk accesses are frequent and pretty slow
• Keep the CPU working on several programs concurrently.
Interleaving actions of different programs: trouble!
• e.g., account-transfer & print statement at same time
DBMS ensures such problems don’t arise.
• Users/programmers can pretend they are using a single-user system (called “Isolation”).
• Thank goodness! Don’t have to program “very, very carefully”.
![Page 51: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/51.jpg)
Transactions: ACID Properties
Key concept is a transaction: a sequence of database actions (reads/writes).
DBMS ensures atomicity (all-or-nothing property) even if system crashes in the middle.
Each transaction, executed completely, must take the DB between consistent states or must not run at all.
DBMS ensures that concurrent transactions appear to run in isolation.
DBMS ensures durability of committed transactions even if system crashes.
DBMS can enforce simple integrity constraints on the data.
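Atomicity is easiest to appreciate with a transfer that fails halfway. The sketch below is an illustration under assumptions: the `account` table, the non-negative balance rule, and the `transfer` helper are invented, and Python's `sqlite3` module stands in for the DBMS. The `with conn:` block commits if every statement succeeds and rolls back if any statement raises an error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " account_number TEXT PRIMARY KEY,"
    " balance INTEGER CHECK (balance >= 0))"  # simple integrity constraint
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A-101", 500), ("A-102", 400)])
conn.commit()


def transfer(src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commit on success, roll back on any exception
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE account_number = ?",
                (amount, dst),
            )
            # If this debit would make the balance negative, the CHECK
            # constraint raises an error and the credit above is undone too.
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE account_number = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    return dict(conn.execute("SELECT account_number, balance FROM account"))


first = transfer("A-101", "A-102", 200)     # succeeds
after_first = balances()
second = transfer("A-101", "A-102", 1000)   # debit fails, whole transfer is rolled back
after_second = balances()
```

After the failed transfer the balances are exactly what they were before it started, even though the credit to the destination account had already been applied inside the transaction.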
![Page 52: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/52.jpg)
Structure of a DBMS
A typical DBMS has a layered architecture.
The figure does not show the concurrency control and recovery components.
Each database system has its own variations.
[Figure: DBMS layers, top to bottom: Query Optimization and Execution; Relational Operators; Files and Access Methods; Buffer Management; Disk Space Management; DB. Side note: these layers must consider concurrency control and recovery.]
![Page 53: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/53.jpg)
Overall System Structure
![Page 54: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/54.jpg)
…must understand how a DBMS works
Databases make these folks happy ...
DBMS vendors, programmers
• $20 million industry
• Oracle, IBM, MS, Sybase, …
End users
• Business, education, science, …
DB application programmers
• E.g., smart webmasters
• Build web services that run off DBMSs
Database administrators (DBAs)
• Design logical/physical schemas
• Handle security and authorization
• Data availability, crash recovery
• Database tuning as needs evolve
![Page 55: CSC443 Database Management](https://reader035.fdocuments.in/reader035/viewer/2022062409/56815133550346895dbf4d4a/html5/thumbnails/55.jpg)
Summary
What is a database? – lots of data organized into entities and schemas, with a manager
Why study databases? – common use, needed for programming apps
Why use databases? – all the advantages over flat file systems
Intro to Databases
Logical layer:
Query language, data models, transactions
Physical layer
Actual files with indexes, query processing, concurrency, recovery & logs