Object-relational database systems
description
Transcript of Object-relational database systems
Object-relational database systems
Yong YaoCS632
April 17, 2001
Content
Introduction Michael Stonebraker: Inclusion of New Types in
Relational Data Base Systems. Michael Stonebraker, Greg Kemnitz: The
Postgres Next Generation Database Management System.
PREDATOR System Design Document: A detailed description of internal design decisions and implementation details.
Introduction
The relational model Dominant mainstream approach Work well on business data
processing But things have changed drastically,
nontraditional problems.
Taxonomy of DBMS Application
Yes
No
Use SQL
data Simple Complexcomplexity
Simple data
Yes
No
Use SQL
data Simple Complexcomplexity
Relational DBMS
File System
OODBMS
Yes
No
Use SQL
data Simple Complexcomplexity
Relational DBMS
File System
OODBMS
•A programming language with a type system
•Persistent object
ORDBMS
Yes
No
Use SQL
data Simple Complexcomplexity
Relational DBMS
ORDBMS
File System
OODBMS
Features
Base type extension Inheritance Complex object
Inclusion of New Types in Relational Database Systems
Michael StonebrakerEECS Dept.
University of California, Berkeley
Example Represent a set of boxes in the DBMS and a simple
query- find all the boxes that overlap the unit square(0,1,0,1).
RDBMS:Create box(id=i4, x1=f8, x2=f8,y1=f8,y2=f8)
select * from boxwhere ! (box.x2<=0 or box.x1>=1 or box,y2<=0 or box.y1>=1)
Too hard to understand Too complex to optimize Too many clauses to check
Example (cont’d)
Create box(id=i4,desc=box)
select *from boxwhere box.desc!!”0,1,0,1”
Box: new data type “!!”: overlap operator with two operands of data type
box, return a boolean
Operators for Boxes
Motivation
Needs of business data processing applicationsGeographic Information System New Types: Point, Line, Polygon; New operators: distance, intersection ; New access methods: R-trees, KDB trees;
Conditions
1. The definition of user-defined data types
2. The definition of new operators for these data types
3. The implementation of new access methods for data types
4. Optimized query processing for commands containing new data types and operators
Definition of New Types
Follow a registration processdefine type-name length = value,
input = file-name, output = file-name
Occupy a fixed amount of space Conversion routines
Definition of New Operators
Safety loophole
Problem: an ADT routine which has an error can
overwrite DBMS data structures Unclear whether such errors are due to bugs
in the user routines or in the DBMS Solutions:
Run in separate address space Build a language processor Provide two environments
New access methods
Users can add new access methods to support user-defined data types
Goal: extensibility Interface:
Conditions for the operators Information on the data types of operators
Define set of operators
Example: B-tree
Conditions for operators:
Information on the data types of operators (B-tree)
•“<=“ is required
•Type: specific type, fixed, variable, fix-var, type1, type2
New set of operators (B-tree)
F1=(value - low-key) / (high-key – low-key)
F2=(high-key – value) / (high-key – low-key)
Implementing New Access Methods
A collection of procedure calls open (relation-name) close (descriptor) get-next(descriptor, OPR, value, tuple-id) insert (descriptor,tuple) delete(descriptor, tuple-id) replace(descriptor, tuple-id, new-tuple) build(descriptor,keyname,OPR)
Hard problem- work with transaction management
log pages – simple, but may suffer from performance penalty
log events Event-oriented interface
REDO(T) UNDO(T) LOG(event-type, event-data)
Access path selection Four pieces of information for optimization
Stups: estimate the expected number of records…where rel-name.field-name OPR value
A second selectivity factor S: …where relname-1.field-1 OPR relname-2.field-2
Feasibility of merge-sort Feasibility of hash-join
Define operator token= AE, …… Stups=1 S= min(N1.N2). merge-sort with AL, hash-join
Generate query processing plan
relname-1.field-1 OPR relname-2.field-2 Merge sort Iterative substitution Hash Join
Relname.field-name OPR value any access method with field-name as a key Secondary index Sequential search
Summary
How to extend an abstract data type How to define new access method Integrate new access method with
transaction management Generation of optimized query
processing plan
The POSTGRES Next-GenerationDatabase Management System
Michael StonebrakerGreg Kemnitz
Three kinds of services for DBMS
Traditional data management Simple data type
Object management Complex data types, e.g. bitmaps, icons,
text… Knowledge management
Store and enforce a collection of rules
Example- newspaper layout
store and manipulate text and graphics, Bill customers for advertisement.
Data: customer information Object: text, pictures and icons Rules: control newspaper layout
POSTGRES Data Model-Design criteria
Traditional relational DBMSs data model: a collection of named relations
Orientation toward database access from a query language Interact with database by query language-
POSTQUEL User defined functions
Design criteria
Orientation toward multilingual access Two selections
One language tightly coupled to the system
Multilingual POSTGRES is programming language
neutral
Design criteria
Small number of concepts As few as possible Four constructs:
Class Inheritance Type Function
Data Model - Class
Class(constructed type or relation) : a named collection of instances of objects.
Instances(record or tuple) : has the same collection of attributes
create EMP(name=string, salary= float, age=int)
Inherit data elements from other classescreate EMP(name=string, salary= float, age=int)
create SALESMAN(quota=float) inherits EMP
Multiple inheritance – only create objects without ambiguity
Data model - Inheritance
SALESMAN
EMP
Three kinds of classes
Real class: instances are stored in the database
Derived class :view Version: store differential relative to its base
class Initially has all instances of the base class Freely updated to diverge from the base class Updates to the version do not affect the base class Updates to the base class are reflected in the version
Supported by the rule system
Version Example
create version my-EMP from EMP
Two classes generated:EMP-MINUS(deleted-OID)EMP-PLUS(all-fields-in EMP, replaced-OID)
Retrieve: retrieve EMP-PLUS instead Insert: All new instances for EMP or my-EMP will
be added into EMP-PLUS; Delete: move records from EMP-PLUS to EMP-
MINUS
Data model - Types
Base types: Int, float, string construct new base types.
Arrays of base types: Composite types:
Construct complex objects: attributes contain other instances as part or all of their value
Composite types
Contains zero or more instances of the same classcreate EMP (… , manager=EMP, coworkers=EMP)
Set: value is a collection of instances from all classescreate EMP(… , hobbies=set){softball, skiing, skating…}
Data Model - Functions
C functions Arbitrary C procedures Can not be optimized by the POSTGRES Argument: base types or composite types Inherited down the class hierarchy
overpaid(EMP) overpaid(SALESMAN)
Functions
Operators Utilize index Functions with one or two operands
{ALT, ALE, AE, AGT, AGE} Allow new access methods
POSTQUEL functions Any collection of commands in the POSTQUEL
Define function high-pay returns EMP asRetrieve (EMP.all)Where EMP.salary>50000
POSTQUEL Set-oriented query language Support nested queries Transitive closure Support for inheritance Support for time travel
Allow a user to run historical queryretrieve (EMP.salary)from EMP [T]where EMP.name=“Sam”
Maintain two different physical collections of records Vacuum cleaner: a daemon moves records
The Rules System
Requirements: referential integrity View management Triggers Integrity constraints Protection Version control
Design a general purpose rules system
Syntax of rules
ON event (TO) object WHERE on new EMP.salary where
POSTQUEL-qualification EMP.name=“Fred”THEN DO [instead] then do replacePOSTQUEL-command(s) E (salary=new.salary)
from E in EMPwhere E.name=“Joe”
Event: retrieve, replace, delete, new … Object: name of a class or class.column POSTQUEL-qualification: normal qualification POSTQUEL-commands: a set of POSTQUEL commands
Forward and Backward chaining
Rules specify additional actions, and these actions may activate other rules
on new EMP.salary where on retrieve to EMP.salary whereEMP.name=“Fred” EMP.name=“Joe”then do replace then do instead retrieveE(salary=new.salary) (EMP.salary)from E in EMP where EMP.name=“Fred”where E.name=“Joe”
Implementation of Rules
Record level processing Rules system is called when individual
records are accessed Place a marker on the record
Query rewrite module Perform poorly for a large number of small-
scope rules Desirable when there are a small number of
larger-scope rules
When rules are activated
Immediate-same transaction Immediate-different transaction Deferred-same transaction Deferred-different transaction
Currently only implements the first option
Storage System
‘ no-overwrite’ storage manager Old record remains in the database
whenever an update occurs Instantaneous crash recovery Support time travel stable main memory required
Performance
Better than UCB-INGRES
But
Compare to the Cattell system, loses by about
a factor of two.
Comments
POSTGRES allows an application designer to trade off performance for data independence
Imports only specific user functions into its address space
PREDATOR DESIGN AND INPLEMENTATION
PARVEEN SESHADRI
System overview
A client-server database system Goal: Extensibility - adding the ability to
process new kinds of data Basic components
Extensible table of Enhanced Abstract Data Type (E-ADTs)
Extensible table of query processing engines(QPEs)
Server Architecture
RECOVERY
CONCURRENCY
BUFFERMGMT
XACTS
RECORDS SCHEMAS
EXPRESSIONSVALUES
OTHER E-ADTs
IMA
GE
E-A
DT
BA
SIC A
DTs
RELATION E-ADT
RELATION ENGINE
OTH
ER E
NG
INESSO
CK
ET
SU
PPO
RT
TH
RE
AD
SU
PPO
RT
WEB CLIENT
TEXT CLIENT
SHORE LIBRARY
DISK STORAGE
UTILITY LAYER
Enhanced Data Types
Standard ADT Specify the storage format for values Specify methods that can be invoked on values Specify how some methods can be matched
using indexes Motivation: methods can be very
expensiveData.image.sharpen().clip()
Improvement
Data.image.sharpen().clip() It is unnecessary for Sharpen to compress and
write its result to disk-> Pass result directly in memory to Clip
Rewrite the expression-> Data.image.clip().sharpen()
No need to retrieve the entire image
E-ADT
Expose the semantics of its methods Query optimizations are performed
using these semantics
Find more at :The Case for Enhanced Abstract Data Types.
VLDB 1997
Thank You