Relational Algebra & Client-Server Systems, CS263 Lectures 11 and 12.
Relational Algebra
Relational algebra operations work on one or more relations to define another relation, leaving the original intact.
Both operands and results are relations, so output from one operation can become input to another operation.
Allows expressions to be nested, just as in arithmetic. This property is called closure.
5 basic operations in relational algebra: Selection, Projection, Cartesian product, Union, and Set Difference.
These perform most of the data retrieval operations needed.
Also have Join, Intersection, and Division operations, which can be expressed in terms of 5 basic operations.
Relational Algebra Operations
Selection (Restriction): σ_{predicate}(R)
Works on a single relation R and defines a relation that contains only those tuples of R that satisfy the specified condition (predicate).
Example: List all staff with a salary greater than £10,000.
σ_{salary > 10000}(Staff)
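As a rough illustration of selection, the sketch below models a relation as a Python list of dicts and keeps only the tuples that satisfy a predicate. The salary figures and the helper name `select` are invented for the example and do not come from the lecture.

```python
# Hypothetical sample rows; the salaries are made up for illustration.
staff = [
    {"staffNo": "SL10", "salary": 9000},
    {"staffNo": "SA51", "salary": 24000},
    {"staffNo": "DS40", "salary": 12000},
]


def select(relation, predicate):
    """Return the tuples of `relation` that satisfy `predicate`.

    A new list is built, so the original relation is left intact.
    """
    return [row for row in relation if predicate(row)]


# Selection with the predicate salary > 10000
well_paid = select(staff, lambda row: row["salary"] > 10000)
```

Because the result is itself a list of dicts, it could be passed straight into another operation, which is the closure property in practice.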
Projection: Π_{col1, ..., coln}(R)
Works on a single relation R and defines a relation that contains a vertical subset of R, extracting the values of specified attributes and eliminating duplicates.
Example: Produce a list of salaries for all staff, showing only their staffNo, fName, lName, and salary details.
Π_{staffNo, fName, lName, salary}(Staff)
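A minimal sketch of projection, assuming a relation is held as a list of dicts. It uses the Staff rows (staffNo, job, dept) that appear in the later examples of this lecture, and projects onto dept to show duplicates being eliminated. The helper name `project` is an assumption for illustration.

```python
# Staff rows as used in the later examples of this lecture.
staff = [
    {"staffNo": "SL10", "job": "Salesman", "dept": 10},
    {"staffNo": "SA51", "job": "Manager", "dept": 20},
    {"staffNo": "DS40", "job": "Clerk", "dept": 20},
]


def project(relation, attributes):
    """Keep only `attributes` from each tuple and drop duplicate tuples."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:  # duplicate elimination
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


depts = project(staff, ["dept"])              # two tuples, not three
staff_jobs = project(staff, ["staffNo", "job"])
```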
Union: R ∪ S
Union of two relations R and S defines a relation that contains all the tuples of R, or S, or both R and S, duplicate tuples being eliminated. R and S must be union-compatible (i.e. they have the same number of attributes, with corresponding attributes drawn from the same domains).
Example: Produce a list of all staff that work in either of two departments (each department has a separate database), showing only their staffNo, and date of birth.
Π_{staffNo, dob}(Staff_DepA) ∪ Π_{staffNo, dob}(Staff_DepB)

Staff_DepA
staffNo   dob
SL10      14-02-64
SA51      21-11-82
DS40      01-01-40

Staff_DepB
staffNo   dob
CC15      11-03-66
SA51      21-11-82

Result
staffNo   dob
SL10      14-02-64
SA51      21-11-82
DS40      01-01-40
CC15      11-03-66
Intersection: R ∩ S
Defines a relation consisting of the set of all tuples that are in both R and S. R and S must be union-compatible.
Example: Produce a list of staff that work in both department A and department B, showing only their staffNo, and date of birth.
(Π_{staffNo, dob}(Staff_DepA)) ∩ (Π_{staffNo, dob}(Staff_DepB))

Staff_DepA
staffNo   dob
SL10      14-02-64
SA51      21-11-82
DS40      01-01-40

Staff_DepB
staffNo   dob
CC15      11-03-66
SA51      21-11-82

Result
staffNo   dob
SA51      21-11-82
Set Difference: R – S
Defines a relation consisting of the tuples that are in relation R, but not in S. R and S must be union-compatible.
Example: Produce a list of all staff that only work in department A (each department has a separate database), showing only their staffNo, and date of birth.
Π_{staffNo, dob}(Staff_DepA) – Π_{staffNo, dob}(Staff_DepB)

Staff_DepA
staffNo   dob
SL10      14-02-64
SA51      21-11-82
DS40      01-01-40

Staff_DepB
staffNo   dob
CC15      11-03-66
SA51      21-11-82

Result
staffNo   dob
SL10      14-02-64
DS40      01-01-40
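The union, intersection and set difference examples above map directly onto Python's built-in set type, which eliminates duplicates in the same way. The sketch below uses the Staff_DepA and Staff_DepB tuples from the tables. It also checks that intersection can be derived from set difference alone (R ∩ S = R – (R – S)), as one way of expressing it through the basic operations. The variable names are assumptions for illustration.

```python
# Tuples are (staffNo, dob), copied from the Staff_DepA and Staff_DepB tables.
staff_dep_a = {("SL10", "14-02-64"), ("SA51", "21-11-82"), ("DS40", "01-01-40")}
staff_dep_b = {("CC15", "11-03-66"), ("SA51", "21-11-82")}

union = staff_dep_a | staff_dep_b          # tuples in A, or B, or both
intersection = staff_dep_a & staff_dep_b   # tuples in both A and B
difference = staff_dep_a - staff_dep_b     # tuples in A but not in B

# Intersection expressed with set difference only: R ∩ S = R – (R – S)
derived_intersection = staff_dep_a - (staff_dep_a - staff_dep_b)
```

Both operands hold 2-tuples over the same attributes, which is the union-compatibility requirement in miniature.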
Cartesian product: R × S
Defines a relation that is the concatenation of every tuple of relation R with every tuple of relation S.
Example: Combine details of staff and the departments they work in.
Π_{staffNo, job, dept}(Staff) × Π_{dept, name}(Dept)

Staff
staffNo   job        dept
SL10      Salesman   10
SA51      Manager    20
DS40      Clerk      20

Dept
dept   name
10     Stratford
20     Barking

Result
staffNo   job        dept   dept   name
SL10      Salesman   10     10     Stratford
SA51      Manager    20     10     Stratford
DS40      Clerk      20     10     Stratford
SL10      Salesman   10     20     Barking
SA51      Manager    20     20     Barking
DS40      Clerk      20     20     Barking
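A short sketch of the Cartesian product using `itertools.product` from the Python standard library, with the Staff and Dept tuples from the example. Representing each tuple as a plain Python tuple (so that concatenation is just `+`) is a simplification chosen for this illustration.

```python
from itertools import product

# (staffNo, job, dept) and (dept, name), as in the Staff and Dept tables.
staff = [("SL10", "Salesman", 10), ("SA51", "Manager", 20), ("DS40", "Clerk", 20)]
dept = [(10, "Stratford"), (20, "Barking")]

# Concatenate every Staff tuple with every Dept tuple.
cartesian = [s + d for s, d in product(staff, dept)]

row_count = len(cartesian)  # |Staff| * |Dept|
```

The result has 3 × 2 = 6 tuples, most of which pair a member of staff with a department they do not work in, which is why a join condition is normally applied on top.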
Join: R ⋈_{<join condition>} S
Defines a relation that results from a selection operation (with a join predicate) over the Cartesian product of relation R and relation S.
Example: Produce a list of staff and the departments they work in.
(Π_{staffNo, job, dept}(Staff)) ⋈_{Staff.dept = Dept.dept} (Π_{dept, name}(Dept))

Staff
staffNo   job        dept
SL10      Salesman   10
SA51      Manager    20
DS40      Clerk      20

Dept
dept   name
10     Stratford
20     Barking

Result
staffNo   job        dept   dept   name
SL10      Salesman   10     10     Stratford
SA51      Manager    20     20     Barking
DS40      Clerk      20     20     Barking

Because the predicate operator is '=', this is known as an Equijoin.
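The definition of join as a selection over a Cartesian product can be sketched directly. The helper `theta_join` below is a hypothetical name; it takes an arbitrary join condition, and the equijoin from the example is obtained by passing an equality test on the two dept columns.

```python
from itertools import product

# (staffNo, job, dept) and (dept, name), as in the Staff and Dept tables.
staff = [("SL10", "Salesman", 10), ("SA51", "Manager", 20), ("DS40", "Clerk", 20)]
dept = [(10, "Stratford"), (20, "Barking")]


def theta_join(r, s, condition):
    """Selection with `condition` applied over the Cartesian product of r and s."""
    return [x + y for x, y in product(r, s) if condition(x, y)]


# Equijoin on Staff.dept = Dept.dept (index 2 of Staff, index 0 of Dept).
equijoin = theta_join(staff, dept, lambda st, de: st[2] == de[0])
```

Note that both dept columns survive in the result, exactly as in the table above.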
Natural Join: R ⋈ S
This performs an Equijoin of the two relations R and S over all common attributes. One occurrence of each common attribute is eliminated from the result.
Example: Produce a list of staff and the departments they work in.
(Π_{staffNo, job, dept}(Staff)) ⋈ (Π_{dept, name}(Dept))

Staff
staffNo   job        dept
SL10      Salesman   10
SA51      Manager    20
DS40      Clerk      20

Dept
dept   name
10     Stratford
20     Barking

Result
staffNo   job        dept   name
SL10      Salesman   10     Stratford
SA51      Manager    20     Barking
DS40      Clerk      20     Barking
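A minimal sketch of a natural join, assuming relations are lists of dicts keyed by attribute name. The common attributes are discovered from the keys, and merging the two dicts keeps a single occurrence of each common attribute. The function name `natural_join` is an assumption for illustration.

```python
staff = [
    {"staffNo": "SL10", "job": "Salesman", "dept": 10},
    {"staffNo": "SA51", "job": "Manager", "dept": 20},
    {"staffNo": "DS40", "job": "Clerk", "dept": 20},
]
dept = [
    {"dept": 10, "name": "Stratford"},
    {"dept": 20, "name": "Barking"},
]


def natural_join(r, s):
    """Equijoin over all attributes that r and s share, keeping one copy of each."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]
    result = []
    for x in r:
        for y in s:
            if all(x[a] == y[a] for a in common):
                result.append({**x, **y})  # shared keys collapse into one
    return result


staff_with_dept = natural_join(staff, dept)
```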
Left Outer Join: R ⟕ S
Left outer join is a join in which tuples from R that do not have matching values in common columns of S are also included in the resulting relation.
Example: Produce a list of all departments and associated staff that work in them.
(Π_{dept, name}(Dept)) ⟕ (Π_{staffNo, job, dept}(Staff))

Staff
staffNo   job        dept
SL10      Salesman   10
SA51      Manager    20
DS40      Clerk      20

Dept
dept   name
10     Stratford
20     Barking
30     Watford

Result
dept   name        staffNo   job
10     Stratford   SL10      Salesman
20     Barking     SA51      Manager
20     Barking     DS40      Clerk
30     Watford     (null)    (null)
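The left outer join can be sketched along the same lines as a natural join, with one extra branch: a tuple from the left relation that finds no match is still emitted, padded with nulls (`None` here). The helper name and its parameters are assumptions made for this illustration.

```python
dept = [
    {"dept": 10, "name": "Stratford"},
    {"dept": 20, "name": "Barking"},
    {"dept": 30, "name": "Watford"},
]
staff = [
    {"staffNo": "SL10", "job": "Salesman", "dept": 10},
    {"staffNo": "SA51", "job": "Manager", "dept": 20},
    {"staffNo": "DS40", "job": "Clerk", "dept": 20},
]


def left_outer_join(r, s, common, s_only):
    """Join r with s on `common`; unmatched r tuples are kept with nulls for `s_only`."""
    result = []
    for x in r:
        matches = [y for y in s if all(x[a] == y[a] for a in common)]
        if matches:
            result.extend({**x, **y} for y in matches)
        else:
            result.append({**x, **{a: None for a in s_only}})
    return result


depts_and_staff = left_outer_join(dept, staff, ["dept"], ["staffNo", "job"])
```

The Watford department has no staff, so it appears once with `None` in the staffNo and job columns.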
Division: R ÷ S
Defines a relation over the attributes C (the attributes of R that are not attributes of S) that consists of the set of tuples from R that match the combination of every tuple in S.
Example: Show all staff that use all the company’s programming languages.
Staff_Prog ÷ Prog

Staff_Prog
staffNo   language
SL10      COBOL
SA51      BASIC
SA51      COBOL
SE14      BASIC
SE18      BASIC

Prog
language
COBOL
BASIC

Result
staffNo
SA51
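Division can be built from the basic operations using the standard identity R ÷ S = Π_C(R) – Π_C((Π_C(R) × S) – R), where C is the set of attributes of R that are not in S. The sketch below follows that identity step by step on the Staff_Prog and Prog data, using Python sets; the intermediate variable names are assumptions for illustration.

```python
# (staffNo, language) pairs and the set of languages, from the tables above.
staff_prog = {
    ("SL10", "COBOL"),
    ("SA51", "BASIC"),
    ("SA51", "COBOL"),
    ("SE14", "BASIC"),
    ("SE18", "BASIC"),
}
prog = {"COBOL", "BASIC"}

# Step 1: projection of Staff_Prog onto staffNo gives every candidate.
candidates = {staff_no for staff_no, _ in staff_prog}

# Step 2: candidates x Prog is every pairing that would have to exist.
required = {(staff_no, language) for staff_no in candidates for language in prog}

# Step 3: required pairings that are absent from Staff_Prog.
missing = required - staff_prog

# Step 4: anyone with a missing pairing is disqualified.
disqualified = {staff_no for staff_no, _ in missing}

# Step 5: what remains uses every language.
uses_all_languages = candidates - disqualified
```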
CS263 Lec. 12: Client/Server systems
• Operate in a networked environment
• Processing of an application distributed between front-end clients and back-end servers
• Generally the client process requires some resource, which the server provides to the client
• Clients and servers can reside in the same computer, or they can be on different computers that are networked together, usually:
  • Client – Workstation (usually a PC) that requests and uses a service
  • Server – Computer (PC/mini/mainframe) that provides a service. For DBMS, server is a database server
Three components of application logic
1. Input/output (presentation) logic component – responsible for formatting and presenting data on the user’s screen (or other output device) and managing user input from keyboard (or other input device)
2. Processing component logic – handles data processing logic (validation and identification of processing errors), business rules logic, and data management logic (identifies the data necessary for processing the transaction or query)
3. Storage component logic – responsible for data storage and retrieval from the physical storage devices – DBMS activities occur here
Client/Server architectures
Figure: the three architectures, ranging from "client does extensive processing" to "client does little processing":
File Server Architecture
Database Server Architecture
Three-tier Architecture
File server architecture
The first client/server architectures developed
All processing is done at the PC that requested the data, i.e. the client handles the presentation logic, the processing logic and much of the storage logic
A file server is a device that manages file operations and is shared by each of the client PCs attached to the LAN
Each file server acts as an additional hard disk for each of the client PCs
Each PC may be called a FAT CLIENT (most processing occurs on the client)
Entire files are transferred from the server to the client for processing.
Three problems with file server architecture
1. Huge amount of data transfer on network - when client wants to access data whole table(s) transferred to PC – so server is doing very little work
2. Each client authorised to use DBMS when DB application program runs on that PC - one database but many concurrently running copies of DBMS (one on each PC) – heavy resource demand on clients
3. DBMS copy in each client must manage shared database integrity - must recognize shared locks, integrity checks, etc - programmers must recognise various conditions that can arise in this environment and understand concurrency, recovery and security controls
Figure: File Server Architecture (fat clients)
Database server (2-tier) architectures
Client responsible for managing user interface, I/O processing logic, data processing logic and some business rules logic (front-end programs)
Database server performs data storage and access processing (back-end functions) – DBMS only on server
Clients do not have to be as powerful, and server can be tuned to optimise data processing performance
Greatly reduces data traffic on the network, as only records (rather than tables) that match request transmitted to client
Improved data integrity, as all data is processed centrally
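The difference in network traffic between the two architectures can be sketched with Python's standard `sqlite3` module, using an in-memory database as a stand-in for the server. In the file server style the whole table is fetched and the client filters it; in the database server style the server evaluates the condition and returns only the matching records. The table contents and salary figures are invented for the example.

```python
import sqlite3

# An in-memory database stands in for the server's data (hypothetical rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staff (staffNo TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO Staff VALUES (?, ?)",
    [("SL10", 9000), ("SA51", 24000), ("DS40", 12000)],
)

# File server style: the whole table crosses the network, the client filters.
whole_table = conn.execute("SELECT staffNo, salary FROM Staff").fetchall()
client_filtered = [row for row in whole_table if row[1] > 10000]

# Database server style: the server filters, only matching records are sent.
server_filtered = conn.execute(
    "SELECT staffNo, salary FROM Staff WHERE salary > 10000"
).fetchall()

rows_sent_file_server = len(whole_table)
rows_sent_db_server = len(server_filtered)
conn.close()
```

Both approaches give the client the same answer; what changes is how many rows had to be shipped to produce it, and the gap grows with table size.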
Stored procedures
Modules of code implementing application logic – included on the database server.
Advantages:
• Performance improves for compiled SQL statements
• Reduced network traffic as processing moves from the client to the server
• Improved security if stored procedure is accessed rather than data and code being moved to server
• Improved data integrity - multiple applications access same stored procedure
• Thinner clients (and a fatter database server)
Disadvantages:
• Writing stored procedures takes more time than using e.g. VB
• Proprietary nature reduces portability
• Performance degrades as number of on-line users increases
Figure: Database server architecture (thinner clients; DBMS only on server)
3-tier architectures
In general, these include another server layer in addition to the client and database server
This additional server may be used for different purposes
Often application programs reside on the additional server (the application server)
Or additional server may hold a local database whilst another server holds the enterprise database
Often a thin client - PC just for user interface and a little application processing. Limited or no data storage (sometimes no hard drive)
Figure: Three-tier architecture (thinnest clients; business rules on separate server; DBMS only on DB server)
Advantages
Scalability – middle tier can be used to reduce load on database server by using a transaction processing monitor to reduce number of connections to server, and additional application servers can be added to distribute processing
Technological flexibility – easier to change DBMS engines – middle tier can be moved to different platform. Easier to implement new interfaces
Cost reduction – use of off-the-shelf components/services in the middle tier - also substitution of modules within application rather than whole application
Improved customer service – multiple interfaces on different clients can access the same business process
Competitive advantage – ability to react to business changes quickly by changing small modules of code rather than entire applications
Challenges
High short-term costs – presentation component must be split from process component – this requires more programming
Tools, training and experience – currently a lack of development tools and training programmes, and of people experienced in the technology
Incompatible standards – few standards yet proposed
Lack of compatible end-user tools – many end-user tools such as spreadsheets and report generators do not yet work through middle-tier services (see later discussion on middleware)
Middleware
Software which allows an application to interoperate with other software, without requiring the user to understand and code the low-level operations required to achieve interoperability
With Synchronous systems, the requesting system waits for a response to the request in real time
Asynchronous systems send a request but do not wait for a response in real time – the response is accepted whenever it is received.
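The contrast between synchronous and asynchronous requests can be sketched with `concurrent.futures` from the Python standard library. The function `remote_service` is a hypothetical stand-in for a remote server, with a short sleep to mimic network latency; no real middleware is involved.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def remote_service(request):
    """Hypothetical remote call; the sleep stands in for network latency."""
    time.sleep(0.05)
    return f"response to {request}"


# Synchronous: the caller blocks here until the response arrives.
sync_response = remote_service("query 1")

# Asynchronous: submit the request, carry on, and pick up the response later.
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(remote_service, "query 2")
    other_work = sum(range(1000))      # the client does something else meanwhile
    async_response = future.result()   # accept the response once it is available
```

In the asynchronous case the call to `submit` returns immediately, so the client is free to continue until it actually needs the result.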
Six types of middleware:
1. Asynchronous Remote Procedure Calls (RPC) - client makes calls to procedures running on remote computers but does not wait for a response. If connection lost, must re-establish the connection and send again. High scalability but low recovery
2. Synchronous RPC – distributed program using this calls services available on different computers – possible to achieve this without undertaking detailed coding (e.g. RMI in Java)
3. Publish/Subscribe (push technology) - server monitors activity and sends information to client when available - asynchronous, clients (subscribers) perform other activities between notifications from server.
4. Message-Oriented Middleware – asynchronous, sends messages that are collected and stored until acted upon - client continues with other processing.
5. Object Request Broker (ORB) – tracks location of each object and routes requests
6. SQL-oriented Data Access - translate generic SQL into the SQL specific to the database
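To make the publish/subscribe and message-oriented styles in the list above more concrete, here is a toy in-memory broker. It is only a sketch: the class `Broker`, its method names and the topic names are all invented for illustration, and real middleware would add networking, persistence and delivery guarantees.

```python
from collections import defaultdict, deque


class Broker:
    """Toy in-memory broker (hypothetical); stores messages until collected."""

    def __init__(self):
        self._subscribers = defaultdict(list)   # topic -> subscriber names
        self._inboxes = defaultdict(deque)      # subscriber -> pending messages

    def subscribe(self, topic, subscriber):
        self._subscribers[topic].append(subscriber)

    def publish(self, topic, message):
        # Push the message to every subscriber of the topic; the publisher
        # does not wait for anyone to read it.
        for subscriber in self._subscribers[topic]:
            self._inboxes[subscriber].append(message)

    def collect(self, subscriber):
        # Hand over everything stored for this subscriber and empty the inbox.
        inbox = self._inboxes[subscriber]
        messages = list(inbox)
        inbox.clear()
        return messages


broker = Broker()
broker.subscribe("prices", "client_a")
broker.publish("prices", "SKU-1 now 9.99")
broker.publish("stock", "SKU-2 out of stock")   # no subscribers, nothing queued
broker.publish("prices", "SKU-1 now 8.99")

pending = broker.collect("client_a")
after_collect = broker.collect("client_a")
```

The subscriber is free to do other work between notifications, and messages wait in the inbox until it chooses to act on them.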
Database middleware
ODBC – Open Database Connectivity - most DB vendors support this
OLE-DB - Microsoft enhancement of ODBC
JDBC – Java Database Connectivity - Special Java classes that allow Java applications/applets to connect to databases
CORBA – Common Object Request Broker Architecture – specification of object-oriented middleware
DCOM – Microsoft’s version of CORBA – not as robust as CORBA over multiple platforms
Client/Server security
Network environment has complex security issues. Networks susceptible to breaches of security through eavesdropping, unauthorised connections or unauthorised retrieval of packets of information flowing round the network. Specific security issues include:
System-level password security – user names and passwords for allowing access to the system. Password management utilities
Database-level password security - for determining access privileges to tables; read/update/insert/delete privileges
Secure client/server communication - via encryption – but encryption can negatively affect performance
DB access from clients
Partitioning to create 2, 3 or n-tier architecture - decisions must be made about the placement of the processing logic
Storage logic (the database engine) handled by server, and presentation logic handled by client
Part a) of Fig. depicts possible 2-tier systems, placing processing logic on client (fat client), on server (thin client) or partitioned across both (distributed environment)
Part b) depicts typical 3 and n-tier architectures
Some processing logic placed on the client if desired
Typical client in a Web enabled environment will be a thin client, using a browser for its presentation logic
Middle tiers are typically coded in a portable language such as Java
Figure: Processing logic distributions
a) 2-tier: processing logic could be at client, server, or both
b) 3 and n-tier: processing logic will be at application server or Web server
Open Database Connectivity (ODBC)
An API providing common language for application programs to access/process SQL databases - independent of particular RDBMS
Required parameters: ODBC driver needed, Back-end server name, Database name, User id and password
Fig. shows generic ODBC architecture
Client application requests connection established with data source
Driver manager identifies appropriate ODBC driver
Driver selected processes requests from the client and submits queries to RDBMS in required version of SQL
Java Database Connectivity (JDBC) similar to ODBC – built specifically for Java applications
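The required parameters listed above are typically bundled into a single connection string that is handed to the driver manager. The sketch below only assembles such a string and does not open a connection. The keywords `DRIVER`, `SERVER`, `DATABASE`, `UID` and `PWD` are commonly used, but the exact set a given driver accepts is an assumption here, and the driver name, host and credentials are placeholders.

```python
def odbc_connection_string(driver, server, database, uid, pwd):
    """Assemble an ODBC-style connection string from the required parameters.

    The keyword names are commonly used ones; individual drivers may differ.
    """
    parts = {
        "DRIVER": "{" + driver + "}",   # which ODBC driver to load
        "SERVER": server,               # back-end server name
        "DATABASE": database,           # database name
        "UID": uid,                     # user id
        "PWD": pwd,                     # password
    }
    return ";".join(f"{key}={value}" for key, value in parts.items())


# All values below are placeholders, not real systems or credentials.
conn_str = odbc_connection_string(
    driver="Example ODBC Driver",
    server="db.example.com",
    database="staffdb",
    uid="app_user",
    pwd="secret",
)
```

A Python ODBC binding such as pyodbc could then be given a string of this shape, leaving the driver manager to locate the named driver; the application code itself stays independent of the particular RDBMS.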
ODBC Architecture
Each DBMS has its own ODBC-compliant driver
Client does not need to know anything about the DBMS
Application Program Interface (API) provides common interface to all DBMSs