An Object-Oriented Relational Database


A relational DBMS and an object-oriented programming language can be combined to yield a surprisingly effective OO-DBMS for many applications.

William J. Premerlani, Michael R. Blaha, James E. Rumbaugh, and Thomas A. Varwig

Many people recognize the shortcomings of current database management systems (DBMS) [‘L, 9, 10] and conventional programming languages [4, 8]. DBMS and programming languages each provide a distinct viewpoint on data and applications. The two viewpoints are not well integrated, although each has its own strengths and weaknesses.

Relational DBMSs (RDBMSs) have a firm theoretical base and satisfy many applications. RDBMSs, however, lack important features needed for advanced applications, such as abstract data types, complex integrity constraints, and versioning. Programming languages, such as Pascal and C++, provide abstract data types, structured control constructs, and the ability to write complex algorithms, but lack data persistence across executions and concurrent access to data.

Object-oriented (OO) design provides a uniform paradigm for both database design and program code design. OO data models produce relational databases that match real-world applications and avoid the normalization problems often associated with relational database design [3]. OO programming languages (OOPLs) improve code reuse, code maintenance, and modularity [8].

We describe a technique for constructing an object-oriented DBMS (OO-DBMS) from existing technology and a small amount of human-written code. The existing technology is an RDBMS, an OOPL, and an object-oriented modeling technique. The basic idea is to buffer the database with an object-oriented layer that keeps relevant data in memory. Locking and update protocols are built into this layer. One achieves quick access to data, while storing it in a database between executions. The OO layer hides the database from applications; applications are unaware that they are receiving DBMS services.

The programmer sees an object-oriented language with certain predefined operations that allow objects to be retrieved from and stored in a database between program executions.

The contribution of this article is the realization that a checkout mechanism can be used to combine an RDBMS and OOPL into a robust and efficient OO-DBMS. Our OO-DBMS is not a complete system. It lacks commonly expected features such as extensible data types, management of behavior as well as data, and inheritance. Nevertheless, our approach to an OO-DBMS can satisfy many applications.

Related Work

Two different areas of research relate to this article: OO-DBMS and database checkout/checkin.

OO-DBMS

The term "OO-DBMS" is not well defined; it means different things to different people. We will define OO-DBMS as the intersection of database and OOPL technology. The ambiguity in the term OO-DBMS is largely a result of whether one emphasizes the database or OOPL side.

OO-DBMSs are just starting to emerge in the commercial and research worlds. The technology is immature, however, and suffers from lack of standards, poor performance, and unresolved design issues, much the same as RDBMSs did a decade ago.

One of the most important features of an OO-DBMS is the pair of implicit assumptions that (1) the system is oriented toward operations on individual objects and (2) the programmer can expect these to perform well. This is notable mainly because RDBMSs typically perform badly for single-object operations and navigation between objects [6]. There are two basic approaches to OO-DBMS: extending a relational DBMS and extending an OO programming language.

One can implement an OO-DBMS by extending an RDBMS. One extends the relational model with new data types, operators, and access methods. An explicit goal is to minimize changes to the relational model. This approach definitely adds to the functionality provided by an RDBMS. This type of OO-DBMS integrates well with existing relational databases and provides for smooth flow of data between engineering and business applications. Potential disadvantages are performance and robustness limitations. Even an augmented RDBMS may not be capable of efficient operations on individual objects. POSTGRES [23] and EXODUS [5] illustrate this approach.

Another approach to an OO-DBMS is extending an OOPL. Database functionality (persistence, authorization, concurrency) is provided as needed for individual objects. An extended OOPL efficiently navigates individual objects and has no inherent limits on functionality. An extended OOPL, however, must demonstrate reliable management of large quantities of data. One must also develop a theoretical base as Codd provided for RDBMSs. GemStone [17], Vbase [1], and ORION [14] adopt this approach.

At this point in time, it is not clear which approach is best. The choice would seem to depend on the application. The OO-DBMS described in this article adopts the viewpoint of extending an OOPL.

Our approach combines the maturity and robustness of the RDBMS with the performance of the OO language in an open architecture that makes it convenient to interface with other languages. Also, you can have both an OO and a relational interface to the same database. This allows you to write OO programs that access data that already exists in an RDBMS as well as continue to use conventional RDBMS access to the same data.

Database Checkout

The notion of database checkout has been discussed [11, 15]. A portion of the database is locked for exclusive use and copied into RAM. All subsequent read and write traffic is then executed against the RAM copy. Once work is finished, the data is checked back into the database and made available for general use. The checkout/checkin mechanism provides fast interactive access and eliminates most locking overhead. We will refer to this checkout/checkin mechanism as database shadowing.

An ordinary database transaction locks its data for a few seconds. In contrast, a checkout scheme may lock data for hours, days, and even weeks. Thus the checkout mechanism provides support for long transactions. Long transactions are often needed in a design environment.

Software Subsystems

Our OO-DBMS is constructed from an object-oriented data modeling protocol, an RDBMS, and an OOPL. The exact subsystems chosen are not important for the success of the shadowing technique, which we use to interface the programming language to the database.

Object-oriented Modeling

We have been using the Object Modeling Technique (OMT) [3, 16] for our work. Any type of OO data model would suffice for our shadowing technique, but our discussions will focus on the OMT for the sake of clarity. The OMT specifies logical data structure, i.e., the classes of objects in the database and their relationships to one another. OMT diagrams are straightforward to map into RDBMS data definition (DDL) commands and OO code declarations. Figure 1 shows a sample OMT diagram. Some aspects of OMT diagrams are discussed next.

A class describes a set of object instances with similar structure and behavior. Each class has a name, a set of attributes (also called instance variables, fields, or properties) that hold state values of the object, and a set of operations that an object is subject to. Object classes are denoted by boxes. The name of the class, its attributes, and its methods are drawn in the box. The listing of attributes and methods may be suppressed, as in Figure 1, depending on the level of detail desired.

A relationship connects two (or more) classes. An object class may inherit some of its structure and behavior from a superclass; the subclass is a refinement of the superclass. This relationship among classes is called generalization. Generalization (inheritance) is indicated by lines and a triangle fanning out from the superclass to the subclasses.

An association relationship connects two or more object instances and is indicated by ordinary lines. Special symbols at the ends of an association line indicate the multiplicity of the association (how many objects are related to a given object). A solid circle indicates zero or more; a hollow circle means zero or one; a straight line without a terminator denotes exactly one.

Relational DBMS (RDBMS)

The RDBMS must provide data persistence, concurrency control, transactions, and a programming language interface. A query language or query-by-forms capability is unimportant for many engineering analysis problems because access to complex, structured data is controlled by the OO-DBMS front end. The OO-DBMS described here hides the details of the RDBMS and removes some of the
arbitrary restrictions. The RDBMS may be regarded as a lower-level resource invisible to the end user.

Object-Oriented Programming Language (OOPL)

The object-oriented programming language must provide objects of different classes and the ability to send messages to objects to invoke a class-dependent method. Inheritance of methods is not required by the algorithm presented here, although any reasonable OO language will have it. Each class describes a set of objects with identical structure. A class maps into an RDBMS table; each object instance maps into one row of a table. An ID generated by the OO-DBMS serves as the primary key of each object. Some relationships map into RDBMS tables. The other relationships are stored in object tables as buried pointers.

OO-DBMS System Architecture

The programmer views the OO-DBMS as an object-oriented language with persistence. The language includes specific operations about instances of classes and relationships. The OO-DBMS has both compile-time and run-time components.

OO-DBMS Compile-time Architecture

Figure 2 summarizes OO-DBMS compile-time architecture. OMT diagrams and supplemental files are provided as input. The supplemental files contain minor details that OMT diagrams lack, such as attribute data types and permissibility of nulls. This input is operated on by a conversion process. The conversion output is a run-time programming interface (OO language subroutines) and DDL commands to generate RDBMS tables.

Initially the conversion process was partially automated. We used AWK¹ to create the programming interface from a few hand-coded templates and manually prepared tables that described object classes and relationships. Then we generated RDBMS tables. We created one table for each object class. For relationships we had the choice of creating explicit tables or embedding object IDs as foreign keys. We indicated our decision on a case-by-case basis by the presence or absence of a relationship name on the OMT diagrams. Some decision factors were performance, proliferation of tables, and the likelihood of future changes. Reference [3] contains more details on the mapping process.

Later, we drew OMT diagrams with a general purpose graphics editor. The graphics editor produces graphics output in an ASCII markup language for which we have a BNF description. We used LEX and YACC² to compile the BNF definitions. Then we wrote software to recognize connectivity on OMT diagrams and generate programming interface subroutines and DDL statements.

Our long-term solution to this issue is an OMT diagram editor. A custom editor provides active support for the semantics of OMT diagrams. The OMT editor checks for duplicate names and simplifies the drawing of OMT diagrams.

¹ AWK is a Unix tool.
² LEX and YACC are Unix tools.

[Figure 1. Part of the OMT Data Model for Electric Circuit Application. Legible class and relationship labels: Connects-to, Pin, Bus, Circuit, Segment.]

[Figure 2. OO-DBMS Compile-time Architecture]

COMMUNICATIONS OF THE ACM/November 1990/Vol. 33, No. 11 101

[Figure 3. OO-DBMS Run-time Architecture]

OO-DBMS Run-time Architecture

Figure 3 summarizes OO-DBMS run-time architecture. There are two modes of operation: buffered and concurrent. To access buffered objects, the programmer first loads one or more sections into memory in a checkout protocol. Concurrent objects are accessed without using the protocol. In either case, the programmer deals with objects and relationships by calling the appropriate interface functions. The mechanics of database interaction are hidden. In many cases, the operations are recorded directly into memory, and updating of the database is deferred. We define database shadowing as this mode of transparent, buffered database access through operations on objects and relationships.

The OO-DBMS programming interface is supported by internal shadowing routines (see section on OO-DBMS Programming Interface, later). These routines access buffered objects through an OO language. The shadowing routines access the relational database through the RDBMS programming interface (usually cursors on tables).
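The compile-time conversion described in this section (one table per object class, a generated ID as primary key, and an explicit table for some relationships) can be sketched as a small generator. This is only an illustration in Python, not the authors' AWK templates or LEX/YACC tooling. The `ClassSpec` structure, the `object_id` column name, the SQL types, and the function names are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ClassSpec:
    """Hypothetical description of one object class taken from an OMT diagram."""
    name: str
    # Each entry is (attribute name, SQL type, nulls allowed): the kind of
    # detail the supplemental files are said to carry.
    attributes: list = field(default_factory=list)


def class_to_ddl(spec: ClassSpec) -> str:
    """Emit a CREATE TABLE statement: one table per class, ID as primary key."""
    columns = ["object_id INTEGER PRIMARY KEY"]
    for attr_name, sql_type, nullable in spec.attributes:
        null_clause = "" if nullable else " NOT NULL"
        columns.append(f"{attr_name} {sql_type}{null_clause}")
    return f"CREATE TABLE {spec.name} (\n  " + ",\n  ".join(columns) + "\n);"


def relationship_to_ddl(name: str, first_class: str, second_class: str) -> str:
    """Emit an explicit relationship table keyed by the two participating IDs."""
    return (
        f"CREATE TABLE {name} (\n"
        f"  {first_class}_id INTEGER NOT NULL,\n"
        f"  {second_class}_id INTEGER NOT NULL,\n"
        f"  PRIMARY KEY ({first_class}_id, {second_class}_id)\n"
        ");"
    )


pin = ClassSpec("pin", [("name", "VARCHAR(32)", False), ("signal", "VARCHAR(32)", True)])
print(class_to_ddl(pin))
print(relationship_to_ddl("connects_to", "segment", "pin"))
```

A relationship that is embedded as a foreign key instead would simply add one more ID column to the object table, which is the case-by-case choice the text describes.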

Key Design Criteria

Performance was our most important design criterion for the OO-DBMS. The shadowing mechanism improves interactive performance by using the OO language to search memory and by deferring database writes. Most read operations can be performed without accessing the database. The OO-DBMS eliminates redundant database writes. For example, an object that is inserted and then deleted requires no database activity. The shadowing mechanism intercepts intermediate activity and only applies the net result to the database. The initial delay upon loading sections into memory was tolerable for our purposes.

Considerable elision is expected under normal application usage. In a typical session, a user will concentrate on a few areas of a design, making repeated revisions of the same object before committing the results to the database.

Another design goal was to increase programmer productivity. The OO-DBMS eases the burden of using a database. To a large extent the programmer can think in terms of the OO language and forget about database interface details. OO languages also provide robust libraries of tested code and facilitate code reuse.

The OO-DBMS reduces programming errors since the programmer does not become confused by the mismatch between programming languages and database languages. Instead of using linked lists, trees, and hash tables, one operates on objects and relationships. Objects and relationships provide a simple, uniform programming paradigm.

Extensibility was another design criterion. Additional functions were added as the need arose, and future additions are expected. For example, we plan to incorporate propagation of operations among objects [21].

Key Design Decisions

Concurrent versus buffered access

The OO-DBMS supports two types of database access: concurrent and buffered. Concurrent data has global scope and is visible to all users. Concurrent data is accessed directly from the database and locked for the shortest possible amount of time. System data for sections (discussed later) is an example of concurrent data. In contrast, buffered data is private to a single user and is locked by a user for the duration of an interactive session. Buffered data is loaded into memory, operated upon, and then saved back to the database. Buffering reduces database traffic and improves application performance.

Each database table is either concurrent or buffered. Both categories look alike to the programmer; both categories support the same operations and have the same syntax. The OO-DBMS keeps track of which objects are buffered and which are concurrent. The only difference between the two categories is the time at which changes are committed to the database and the degree of concurrency. Updates on concurrent tables are immediately applied to the database. Updates on buffered tables are accumulated in memory for later writing upon an explicit save command or at the end of a session. In practice, only a few tables contain data that must be shared among concurrent users and must be assigned to the concurrent category. The other tables can be buffered.
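The difference in commit timing between the two categories can be illustrated with a toy model. The sketch below is an assumption-laden illustration, not the system's code: the `Table` class, its method names, and the use of a plain dictionary in place of the RDBMS are all hypothetical.

```python
class Table:
    """Toy table whose updates are either written through or deferred."""

    def __init__(self, name, buffered, database):
        self.name = name
        self.buffered = buffered
        self.database = database      # plain dict standing in for the RDBMS
        self.pending = {}             # object ID -> row awaiting a save

    def update(self, object_id, row):
        if self.buffered:
            # Buffered: accumulate in memory; a later update overwrites an earlier one.
            self.pending[object_id] = row
        else:
            # Concurrent: apply to the database immediately.
            self.database[(self.name, object_id)] = row

    def save(self):
        """Write accumulated changes and return the number of database writes."""
        writes = len(self.pending)
        for object_id, row in self.pending.items():
            self.database[(self.name, object_id)] = row
        self.pending.clear()
        return writes


db = {}
section_info = Table("section_info", buffered=False, database=db)
pins = Table("pin", buffered=True, database=db)

section_info.update(1, {"owner": "alice"})   # visible in db right away
pins.update(7, {"name": "A0"})
pins.update(7, {"name": "A1"})               # same call syntax, nothing written yet
assert ("pin", 7) not in db
print(pins.save())                           # 1 (two updates, one database write)
```

Both tables expose the same `update` call, which mirrors the point that the two categories look alike to the programmer.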

Sections

A section is a subset of the database that can be independently manipulated. Each instance of an object or relationship belongs to a single section. Each section contains zero or more data items from each buffered database table. In other words, sections partition the data instances and cut across all the buffered classes and relationships in the data model.

An application must lock a section before accessing its data. This is called checking out a section. An application may check out one or more sections. Other users cannot read or write to checked-out sections. The notion of a section only applies to buffered data. Concurrent data cannot be checked out and can always be accessed by any application.
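The checkout rule for sections can be sketched as a small lock registry. In the system described here the lock information would live in concurrent database tables; the in-memory dictionary, the `SectionRegistry` class, and the exception name below are assumptions made purely for illustration.

```python
class SectionLockError(Exception):
    """Raised when a section is held by another user or not held at all."""


class SectionRegistry:
    """Tracks which user holds each checked-out section (hypothetical helper)."""

    def __init__(self):
        self._owners = {}   # section ID -> user name

    def check_out(self, section_id, user):
        owner = self._owners.get(section_id)
        if owner is not None and owner != user:
            raise SectionLockError(f"section {section_id} is held by {owner}")
        self._owners[section_id] = user

    def check_in(self, section_id, user):
        if self._owners.get(section_id) != user:
            raise SectionLockError(f"section {section_id} is not held by {user}")
        del self._owners[section_id]

    def can_access(self, section_id, user):
        """Buffered data is reachable only through a section the user holds."""
        return self._owners.get(section_id) == user


registry = SectionRegistry()
registry.check_out(3, "alice")
registry.check_out(4, "alice")            # one application may hold several sections
print(registry.can_access(3, "bob"))      # False
```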

Database versus nondatabase definitions

There are two kinds of class and relationship definitions in the OO-DBMS: database or nondatabase. Database classes and relationships are handled both by the OO language and the RDBMS and may be saved in the database. Nondatabase definitions are handled only by the OO language and may not be saved. Both kinds are useful. There is no point in creating database tables for nondatabase objects. Modifying nondatabase definitions only requires recompilation of the application, while the database must be rebuilt if any database definitions are changed. Programmers use database definitions sparingly while nondatabase definitions are used freely.

Persistence

Object instances are either persistent or transient. Persistent objects remain in the database beyond the life of a program execution. Transient objects are newly created objects stored only in memory and disappear when an application program terminates. Database objects may be transient or persistent and may be converted from transient to persistent. Nondatabase objects are transient only.

Transient objects may temporarily violate database integrity rules. For example, the copy operation may create a transient object with a primary key that matches the key of a persistent object. This may also occur when the transient copy is going to be further updated before ultimate database insertion. A transient object that satisfies database integrity may be converted to a persistent object with an INSERT command (see section on Operations on Objects).

IDs for object references

An ID [13] is an arbitrary handle for referring to an object. Every object has a unique ID. IDs are automatically generated by the OO-DBMS and are not subject to user update. ID allocation involves concurrency issues in a multiuser environment.

Because of their stability, IDs are particularly useful for object references. Contrast this with the RDBMS scenario where the value of a primary key changes and all foreign keys that refer to it must be updated. All object database tables have an ID as the primary key. All relationship database tables use one or more IDs from participating objects as the primary key.

All IDs are 32 bits long because the system is being run on a 32-bit machine. Section IDs use only 16 bits and pad the remainder. Object IDs have 16 bits to identify the section and another 16 bits to resolve identity within a section. One benefit of this ID allocation scheme is that once a section is locked, object IDs can be assigned within that section without consulting the database. Another benefit is that if database storage of records within tables by IDs can be ordered, objects and relationships for a given section will cluster within the tables. This improves the efficiency of section loading and section saving.
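The 16/16 split of an object ID can be written out as bit operations. The sketch below assumes the section number occupies the high 16 bits; the text does not say which half is which, but that layout is what would make IDs sort by section and cluster in the tables. All function and class names are hypothetical.

```python
SECTION_BITS = 16
LOCAL_BITS = 16
LOCAL_MASK = (1 << LOCAL_BITS) - 1


def make_object_id(section_id: int, local_id: int) -> int:
    """Pack a section number and a within-section number into one 32-bit ID."""
    if not 0 <= section_id < (1 << SECTION_BITS):
        raise ValueError("section ID must fit in 16 bits")
    if not 0 <= local_id <= LOCAL_MASK:
        raise ValueError("local ID must fit in 16 bits")
    # Assumption: section in the high half, so sorting by ID groups by section.
    return (section_id << LOCAL_BITS) | local_id


def split_object_id(object_id: int):
    """Recover (section_id, local_id) from a packed object ID."""
    return object_id >> LOCAL_BITS, object_id & LOCAL_MASK


class SectionIdAllocator:
    """Hands out IDs inside one locked section without consulting the database."""

    def __init__(self, section_id: int, next_local: int = 0):
        self.section_id = section_id
        self.next_local = next_local

    def allocate(self) -> int:
        object_id = make_object_id(self.section_id, self.next_local)
        self.next_local += 1
        return object_id


allocator = SectionIdAllocator(section_id=2)
first, second = allocator.allocate(), allocator.allocate()
print(hex(first), hex(second))   # 0x20000 0x20001
```

Because the holder of the section lock is the only writer, a simple counter per section is enough to keep IDs unique in this model.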

OO-DBMS Programming Interface

The OO-DBMS provides operations on objects, relationships, and sections. The programmer executes an operation by invoking the function for the operation on the appropriate class. The function in turn calls internal routines for performing buffered insert, delete, update, load, or save database operations (see section on the Internal Buffering Mechanism).

The following functions cover insertion, deletion, updating, and retrieval of objects and relationships from memory and the database. The functions provide the same functionality that would ordinarily be provided by a cursor interface to a database. These functions have been sufficient for our applications.

Operations on Objects

The OO-DBMS includes functions to perform the following operations for each object class.

NEW (classname)
Creates and returns a new empty transient object of the given class. Issues a create request to the buffering system. After an object is created, data may be entered with a PUT operation. An object is transient until an INSERT is performed.

PUT (attributename, object, value)
Fills in an attribute by overwriting previous contents. Issues an update request to the buffering system.

GET (attributename, object)
Returns an attribute value from an object. No action is required by the buffering system.

RETRIEVE (classname, keyname, value1, ...)
Returns a set of persistent objects whose keys match the sequence of values. Transient objects are ignored. A key is a predefined list of attribute names used to select objects. Any number of keys may be defined per object class.

RETRIEVE-ALL (classname)
Retrieves all persistent objects in a class. All concurrent objects are found; buffered objects are found only for sections that have been loaded.

INSERT (object)
Converts a transient object into a persistent one. Issues an insert request to the buffering system. An error is raised if the object violates database integrity.

DELETE (object)
Destroys an object. Issues a delete request to the buffering system. An error is raised if the object belongs to any relationship.

COPY (object)
Returns a transient copy of the object, including its data. Issues a create request to the buffering system. COPY is equivalent to NEW followed by many GETs and PUTs.

Operations on Relationships

The OO-DBMS includes functions to perform the following operations for each relationship with an explicit table.

INSERT (relationshipname, obj1, obj2)
Adds a pair of persistent objects to a relationship. Issues an insert request to the buffering system.

DELETE (relationshipname, obj1, obj2)
Deletes an object pair from a relationship. Issues a delete request to the buffering system.

DELETE-1 (relationshipname, obj1)
Deletes all object pairs from a relationship where obj1 is the first object in the pair. Issues a delete request to the buffering system for each pair.

DELETE-2 (relationshipname, obj2)
Deletes all object pairs from a relationship where obj2 is the second object in the pair. Issues a delete request to the buffering system for each pair.

RETRIEVE-1 (relationshipname, obj1)
Searches relationship to find object pairs in which obj1 is the first object. Returns a set of objects.

RETRIEVE-2 (relationshipname, obj2)
Searches relationship to find object pairs in which obj2 is the second object. Returns a set of objects.

TEST (relationshipname, obj1, obj2)
Tests whether an object pair is a member of the relationship. Returns a boolean value.
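The relationship operations listed above behave like operations on a set of ordered pairs. The following sketch models only the in-memory view and omits the requests issued to the buffering system; the `Relationship` class and its method names are hypothetical, and the reading that RETRIEVE-1 returns the second objects of the matching pairs is an assumption.

```python
class Relationship:
    """In-memory binary relationship stored as a set of (obj1, obj2) ID pairs."""

    def __init__(self, name):
        self.name = name
        self.pairs = set()

    def insert(self, obj1, obj2):
        self.pairs.add((obj1, obj2))

    def delete(self, obj1, obj2):
        self.pairs.discard((obj1, obj2))

    def delete_1(self, obj1):
        # Drop every pair whose first member is obj1.
        self.pairs = {pair for pair in self.pairs if pair[0] != obj1}

    def delete_2(self, obj2):
        # Drop every pair whose second member is obj2.
        self.pairs = {pair for pair in self.pairs if pair[1] != obj2}

    def retrieve_1(self, obj1):
        # Assumed reading: return the objects paired with obj1 as first member.
        return {second for first, second in self.pairs if first == obj1}

    def retrieve_2(self, obj2):
        return {first for first, second in self.pairs if second == obj2}

    def test(self, obj1, obj2):
        return (obj1, obj2) in self.pairs


connects_to = Relationship("connects_to")
connects_to.insert("segment_a", "pin_1")
connects_to.insert("segment_a", "pin_2")
connects_to.insert("segment_b", "pin_2")
print(sorted(connects_to.retrieve_1("segment_a")))   # ['pin_1', 'pin_2']
print(connects_to.test("segment_b", "pin_1"))        # False
```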

Operations on Sections

The OO-DBMS includes functions for controlling the buffering of sections.

SECTION-LOAD (section)
Locks a given database section and loads it into memory. An error occurs if the section is already loaded by another process. Reloading a section already in memory aborts previous changes for the section.

SECTION-SAVE (section)
Commits changes for a given section to the database. Save requests are issued to the buffering system for objects and relationships requiring insertion, deletion, or updating.

Internal Buffering Mechanism

The programmer accesses data by first locking and loading one or more sections into memory. As the data is loaded, the OO-DBMS builds data structures used to search memory for objects and relationships. Subsequent operations are performed in memory. Ultimately, changes made to a section are discarded or saved to the database.

Each buffered object has a state. When a section is loaded, all of its objects are in the PERSISTENT state. Each time an internal operation is performed on an object, its state is checked to determine what action to take and what the new state should be. Figure 4 shows a state transition diagram. States are shown as uppercase names; actions are the lowercase names next to arcs.

Buffering States

TRANSIENT
Refers to transient objects. A newly created or copied object has this state. An insert operation changes the state to INSERT. Objects in this state are ignored during a save operation. Update operations on objects in this state leave the state unchanged. Objects deleted from this state are discarded.

PERSISTENT
Refers to unmodified persistent objects. Objects loaded from the database are put in this state. An update operation from this state sets the state to UPDATE. A delete operation sets the state to DELETE. Objects in this state are ignored during a save operation because they are already up-to-date in the database. An insert from this state is an error.

DELETE
Refers to persistent objects that must be deleted from the database. A save operation deletes objects in this state from the database and discards the in-memory copy. Any other operations are errors.

INSERT
Refers to persistent objects that have been created during the current session but not yet written to the database. A save inserts objects that are in this state into the database and sets their state to PERSISTENT. Updates to objects in this state do not change their state. Delete simply discards the object. An insert from this state is an error.

UPDATE
Refers to persistent objects that have been modified since being loaded into memory. A save operation updates these objects in the database and sets their state to PERSISTENT. Updates to objects in this state do not change their state. Delete sets the state to DELETE. An insert from this state is an error.

[Figure 4. Buffering Algorithm States]
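The state descriptions above amount to a transition table, which can be written down directly. This is a sketch of the rules as stated in the text, not the authors' implementation. The `DISCARDED` marker (standing for "dropped from memory"), the function names, and the string labels for database actions are assumptions added for the example.

```python
from enum import Enum


class State(Enum):
    TRANSIENT = "transient"
    PERSISTENT = "persistent"
    INSERT = "insert"
    UPDATE = "update"
    DELETE = "delete"
    DISCARDED = "discarded"   # assumed marker: object dropped from memory


# (current state, operation) -> (next state, database action or None)
TRANSITIONS = {
    (State.TRANSIENT, "insert"): (State.INSERT, None),
    (State.TRANSIENT, "update"): (State.TRANSIENT, None),
    (State.TRANSIENT, "delete"): (State.DISCARDED, None),
    (State.TRANSIENT, "save"): (State.TRANSIENT, None),      # ignored by save
    (State.PERSISTENT, "update"): (State.UPDATE, None),
    (State.PERSISTENT, "delete"): (State.DELETE, None),
    (State.PERSISTENT, "save"): (State.PERSISTENT, None),    # already up to date
    (State.INSERT, "update"): (State.INSERT, None),
    (State.INSERT, "delete"): (State.DISCARDED, None),
    (State.INSERT, "save"): (State.PERSISTENT, "INSERT"),
    (State.UPDATE, "update"): (State.UPDATE, None),
    (State.UPDATE, "delete"): (State.DELETE, None),
    (State.UPDATE, "save"): (State.PERSISTENT, "UPDATE"),
    (State.DELETE, "save"): (State.DISCARDED, "DELETE"),
}


def apply_operation(state, operation):
    """Return (next state, database action); undefined pairs are errors."""
    try:
        return TRANSITIONS[(state, operation)]
    except KeyError:
        raise ValueError(f"{operation} is an error in state {state.name}") from None


def run(state, operations):
    """Apply operations in order and collect the database actions they cause."""
    actions = []
    for operation in operations:
        state, action = apply_operation(state, operation)
        if action:
            actions.append(action)
    return state, actions


# An object that is inserted and then deleted causes no database activity.
print(run(State.TRANSIENT, ["insert", "update", "delete"]))
# Repeated updates collapse into a single database UPDATE at save time.
print(run(State.PERSISTENT, ["update", "update", "save"]))
```

The two printed cases correspond to the elision the article describes: only the net result of a session reaches the database.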

Internal Buffering Operations

The database buff‘ering system is driven by internal II&, savr, imrrt, rlrlcte, o-e&, and ufhte operations. These internal operations and the state information recorded with each buffered object are used to buffer database writes. Each inter- nal operation on aii object may change the state of the object. Each operation is immediately applied to memory, and flags are set on the object so that the changes can even- tually be written to the database.

load (section) Loads a section of the database. Searches all database tables for rows belonging to the section and creates objects in memory containing the data.

save (section) Brings the database up-to-date with a section in memory. Updates the database with information from objects in the insert, update, or delete state. Objects in the persistent or transient state are ignored. Objects in the delete state are discarded and their IDs recovered for reuse.

insert (object) Makes an object persistent. This operation is applied to objects in the transient state. Performs bookkeeping chores such as ID generation.

delete (object) If the object is in the insert state, it is discarded. If the object is persistent, it is placed in the delete state for eventual removal from the database.

create (class) Creates a new object of the given class and places the object in the transient state.

update (object) Updates object attributes and places the object in the update state for eventual writing to the database.

Correctness of the Buffering Algorithm

We now show that applying database updates via our buffering scheme yields the same results as immediate database update. Our demonstration only applies to persistent objects and relationships, since transient and local data are not written to the database. We begin by defining consistency conditions.

Condition 1: Each database record corresponds to an object or relationship record in memory. Each record in memory has a state that correctly describes the difference between what is in the database and what should be in the database.

Condition 2: Each database record is uniquely identified by a primary key. With one exception, no two persistent records in memory may have the same primary key. The one exception is that a record that has been deleted and then inserted may appear twice, once in the insert state and once in the delete state.

Assumptions

We make the following assumptions.

(1) Sections are locked for a single user.

(2) If there are two records with the same primary key, it is an indication that there was a deferred deletion followed by a deferred insertion and not vice versa. (A deferred insertion followed by a deferred deletion cancels out, leaving no trace in the buffering system.)

(3) Deletions must be processed before insertions, in case there is a duplicate record with the same primary key. For example, primary keys for relationships are formed from the primary keys of the objects being related. If a member of a relationship is deleted and then inserted, there will be two records in memory with the same primary key.

(4) Insertions must check persistent records for uniqueness of the primary key.

(5) Modification of the primary key is not allowed for persistent records. This is not a problem, since all primary keys are IDs.

Assertion 1

Under consistency conditions, a save operation correctly updates the database.

All objects and relationship database tables use IDs as the primary keys. IDs are assigned by the OO-DBMS and are never changed. The primary key for a record provides a one-to-one mapping between memory and the database.

The state of each memory record indicates what actions must be performed to synchronize the database. Records in the PERSISTENT state require no action because they are already up-to-date. Records in the UPDATE state have their non-primary key attributes updated in the database. Records in the INSERT state are added. Records in the DELETE state are removed.

Assertion 2

Consistency conditions hold immediately after a load operation.

Data is copied from the database into memory, so the two copies are the same. Each memory record has the same unique primary key that it had in the database. Each record is in the PERSISTENT state, indicating that it agrees with the database.

Assertion 3

Each internal buffering operation preserves consistency.

In this section phrases such as “deferred insert” are used to refer to an internal buffering operation. This will distinguish an internal buffering operation from the eventual database action.

A deferred insert is only allowed on records in the TRANSIENT state. Part of the deferred insertion process is a check for uniqueness of the primary key. Since the object must belong to a section and since the section ID is part of the key, this check can be made without consulting the database. Records in the INSERT, PERSISTENT, and UPDATE states are checked, and duplication of their primary keys is blocked. On the other hand, since records in the DELETE state are invisible to the programmer, their primary keys may be reused.

A deferred delete is allowed for records in the PERSISTENT, UPDATE, or INSERT states. Records in the INSERT state are not written in the database and can be simply discarded. Records in the PERSISTENT or UPDATE state have corresponding information in the database and are placed in the DELETE state for later removal.

A deferred update is allowed for records in all states except the DELETE state. With the exception of the TRANSIENT state, modification of the primary key is not allowed. Modification of the primary key of a record in the TRANSIENT state is allowed because such records are temporary.

Induction

By induction, since each individual buffering operation preserves consistency, a series of buffering operations also preserves consistency.

Bidirectional Linkage

The OO-DBMS must quickly map from the database primary key to the OO language pointer and vice versa. Some possible implementation techniques are:

(1) pair of tables + hashing
(2) pair of trees.

We used the container class relationship of our OO language, which was implemented with a pair of tables plus hashing. Hashing algorithms and table lookup are provided by OO language libraries.
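The "pair of tables plus hashing" technique can be illustrated with two hash tables that are always updated together. This is a minimal Python sketch under stated assumptions, not the DSM relationship class: the class name `BidirectionalMap` and its methods are hypothetical, and Python dictionaries stand in for the hashed tables.

```python
class BidirectionalMap:
    """Two hash tables kept in step: primary key -> object, object -> key."""

    def __init__(self):
        self._by_key = {}     # primary key -> in-memory object
        self._by_object = {}  # id(object) -> primary key

    def link(self, key, obj):
        """Register a one-to-one pairing; reject reuse of either side."""
        if key in self._by_key or id(obj) in self._by_object:
            raise KeyError("key or object is already linked")
        self._by_key[key] = obj
        self._by_object[id(obj)] = key

    def unlink(self, key):
        """Remove the pairing for `key` from both tables."""
        obj = self._by_key.pop(key)
        del self._by_object[id(obj)]

    def object_for(self, key):
        return self._by_key[key]

    def key_for(self, obj):
        return self._by_object[id(obj)]
```

Both lookups take expected constant time. The second table is keyed by `id(obj)` so that objects need not be hashable; this is safe here because the first table holds a reference to every linked object, which keeps its identity stable while the link exists.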

Integrity Checking

The OO-DBMS incorporates rudimentary integrity checking. The OO-DBMS enforces the uniqueness of primary keys and candidate keys and nonnull specifications. We also provide some support for enumeration types and range checking. Since the OO-DBMS is essentially a layer that wraps around an RDBMS, we could include more thorough integrity support in the future.
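The four kinds of checks named above (key uniqueness, nonnull, enumeration, and range) can be sketched as a single validation function. This is only an illustration: the function `check_record`, its parameters, and the circuit-style field names in the usage example are assumptions and do not come from the OO-DBMS itself.

```python
def check_record(record, existing, *, not_null=(), unique=(), enums=None, ranges=None):
    """Return a list of integrity violations for `record` (a dict).

    existing : records already held in memory for the same table
    not_null : field names that may not be None
    unique   : tuples of field names forming primary or candidate keys
    enums    : field name -> set of allowed values
    ranges   : field name -> (low, high), both bounds inclusive
    """
    errors = []
    for field in not_null:
        if record.get(field) is None:
            errors.append(f"{field}: null not allowed")
    for key_fields in unique:
        key = tuple(record.get(f) for f in key_fields)
        if any(tuple(r.get(f) for f in key_fields) == key for r in existing):
            errors.append(f"{'+'.join(key_fields)}: duplicate key {key}")
    for field, allowed in (enums or {}).items():
        value = record.get(field)
        if value is not None and value not in allowed:
            errors.append(f"{field}: {value!r} is not an allowed value")
    for field, (low, high) in (ranges or {}).items():
        value = record.get(field)
        if value is not None and not low <= value <= high:
            errors.append(f"{field}: {value!r} is outside [{low}, {high}]")
    return errors


# Usage with hypothetical circuit data.
existing = [{"id": 1, "name": "R1", "kind": "resistor", "ohms": 100}]
candidate = {"id": 1, "name": None, "kind": "diode", "ohms": -5}
problems = check_record(
    candidate,
    existing,
    not_null=("name",),
    unique=(("id",), ("name",)),
    enums={"kind": {"resistor", "capacitor"}},
    ranges={"ohms": (0, 1e9)},
)
```

Because such checks run against records that are already in memory, they do not require a round trip to the database.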

Application of the OO-DBMS

Choice of Software Subsystems

Our implementation of the OO-DBMS was built on top of DSM, MIMER, and the Object Modeling Technique. The OO-DBMS could be ported to another RDBMS by rewriting the database interface. The OO-DBMS could be ported to another OO language by emulating the DSM relation feature [19].
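The porting claim suggests that RDBMS-specific calls sit behind one narrow interface. The sketch below shows what such a boundary might look like; it is an assumption made for illustration, not the authors' interface. The class and method names (`DatabaseInterface`, `load_rows`, `insert_row`, `update_row`, `delete_row`) and the single `objects` table are hypothetical, and Python's standard `sqlite3` module stands in for MIMER.

```python
import sqlite3
from abc import ABC, abstractmethod


class DatabaseInterface(ABC):
    """Hypothetical boundary between the object layer and one RDBMS."""

    @abstractmethod
    def load_rows(self, section_id):
        """Return (object_id, payload) pairs for one section."""

    @abstractmethod
    def insert_row(self, section_id, object_id, payload):
        """Add one row."""

    @abstractmethod
    def update_row(self, section_id, object_id, payload):
        """Overwrite the non-key data of one row."""

    @abstractmethod
    def delete_row(self, section_id, object_id):
        """Remove one row."""


class SqliteInterface(DatabaseInterface):
    """One concrete back end; porting means writing another subclass."""

    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)
        # The section ID is part of the primary key.
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS objects ("
            " section_id INTEGER, object_id INTEGER, payload TEXT,"
            " PRIMARY KEY (section_id, object_id))"
        )

    def load_rows(self, section_id):
        cursor = self._conn.execute(
            "SELECT object_id, payload FROM objects"
            " WHERE section_id = ? ORDER BY object_id",
            (section_id,),
        )
        return cursor.fetchall()

    def insert_row(self, section_id, object_id, payload):
        self._conn.execute(
            "INSERT INTO objects VALUES (?, ?, ?)",
            (section_id, object_id, payload),
        )
        self._conn.commit()

    def update_row(self, section_id, object_id, payload):
        self._conn.execute(
            "UPDATE objects SET payload = ?"
            " WHERE section_id = ? AND object_id = ?",
            (payload, section_id, object_id),
        )
        self._conn.commit()

    def delete_row(self, section_id, object_id):
        self._conn.execute(
            "DELETE FROM objects WHERE section_id = ? AND object_id = ?",
            (section_id, object_id),
        )
        self._conn.commit()
```

With this shape, the buffering layer would depend only on `DatabaseInterface`, so replacing the RDBMS touches a single class.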

We chose the MIMER DBMS for reasons unrelated to its technical merits. MIMER is more or less a conventional SQL-like relational DBMS [7, 18]. As with its competitors, MIMER implements certain aspects of the relational model well, yet contains arbitrary implementation restrictions on others. For example, a secondary index is restricted to a single attribute; it cannot be composite.

We chose the Data Structure Manager (DSM) [19, 22] as our OO language because of its technical features and in-house availability. DSM is a full-functionality OO language developed at GE. DSM runs on top of the C language. The most noteworthy DSM feature is its support for relationships among objects. One can navigate DSM objects in a manner similar to navigating RDBMS tables. DSM can automatically enforce certain constraints, such as existence dependencies between objects. DSM has a rich library of predefined classes.

Application Experience

We have used the OO-DBMS described in this article to support an editor for electric circuit design. The circuit editor stores its data in a database for access by other programs such as mathematical simulators. The object-oriented layer allows the circuit editor to receive database services such as data persistence and concurrent access, yet it still responds in real time. Our electric circuit application supports interactive graphical editing and requires a fast response to keep pace with the user.

We designed the OO-DBMS sporadically over the course of a year. Once we had completed the design, it took three months to implement the OO-DBMS. The extensive DSM library was largely responsible for the short implementation time.

The full electric circuit application had fourteen full pages of OMT diagrams. There were 82 object classes and 45 relationships, yielding 108 database tables. Programmers wrote several thousand lines of code for programming interface function templates.

The performance of the resulting OO-DBMS was sufficient to support an interactive electrical circuit editor running on a MicroVAX computer. The OO-DBMS keeps pace with interactive mouse movement. An RDBMS by itself cannot support real-time performance because of I/O delay and command-processing overhead. It takes the OO-DBMS one second to perform a mixed sequence of several thousand object and relationship operations in memory, running on a MicroVAX II. The same sequence of operations would take more than a minute if they were applied directly to the database.

This methodology is currently being used for another interactive graphical application. Most of the effort required to adapt it to this new application is the preparation of a new data model.

Conclusions

We have described an approach to implementing an object-oriented DBMS (OO-DBMS). One can take an existing relational DBMS (RDBMS) and hide it beneath an object-oriented programming language. The buffering algorithm yields fast interactive performance while storing objects in a database. Our work demonstrates that it is possible to build an object-oriented DBMS on top of a relational DBMS and still get good performance.

This approach combines the best features of both RDBMS and OO programming languages. RDBMSs have a sound theoretical base and work well for business applications. Commercial products have robust concurrency, journaling, and rollback facilities. At the same time we obtain the benefits of using objects to abstract an application. The programming language allows complex algorithms to be written that would be hard to express in an RDBMS, without frequent access of the RDBMS within the algorithm. This approach minimizes the amount of new code that must be written, since it uses existing software subsystems. An OO design methodology is the “glue” that binds together the RDBMS and OO language. Our “OO-DBMS” lacks the functionality of a full system. Nevertheless, our approach to an OO-DBMS is quite effective for many applications.

In our applications, we have found that OO programming improves programmer productivity, relative to conventional procedural languages like Pascal and C. Our approach to an OO-DBMS enables one to reap these productivity gains while using an RDBMS.

COMMUNICATIONS OF THE ACM/November 1990/Vol. 33, No. 11

References

1. Andrews, T., and Harris, C. Combining language and database advances in an object-oriented development environment. In Proceedings of OOPSLA ’87 Conference (Oct. 4-8, Orlando, Fla.). ACM/SIGPLAN, New York, 1988, pp. 142-152.

2. Atkinson, M.P., and Buneman, O.P. Types and persistence in database programming languages. ACM Comput. Surv. 19, 2 (June 1987), 103-190.

3. Blaha, M.R., Premerlani, W.J., and Rumbaugh, J.E. Relational database design using an object-oriented methodology. Commun. ACM 31, 4 (Apr. 1988), 414-427.

4. Booch, G. Object-oriented development. IEEE Trans. Softw. Eng. SE-12, 2 (Feb. 1986), 211-221.

5. Carey, M., et al. The architecture of the EXODUS extensible DBMS. In Proceedings of the 1986 International Workshop on Object-Oriented Database Systems (Sept. 23-26, Pacific Grove, Calif.). ACM/SIGMOD, New York, 1986, pp. 52-65.

6. Cattell, R.G.G., and Rogers, T.R. Combining object-oriented and relational models of data. In Proceedings of the 1986 International Workshop on Object-Oriented Database Systems (Sept. 23-26, Pacific Grove, Calif.). ACM/SIGMOD, New York, 1986, pp. 212-213.

7. Codd, E.F. A relational model of data for large shared data banks. Commun. ACM 13, 6 (June 1970), 377-387.

8. Cox, B.J. Object-Oriented Programming: An Evolutionary Approach. Addison-Wesley, Reading, Mass., 1986.

9. Dayal, U., et al. PROBE - A research project in knowledge-oriented database systems: Preliminary analysis. Tech. Rep. CCA-85-03. Computer Corporation of America, Cambridge, Mass., 1985.

10. Gadient, A.J. Functional requirements for an electronic design automation environment integration framework. In Compint 85: First International Conference on Computer Aided Technologies (Sept. 10-12, Montreal, Canada). IEEE-CS, Washington, D.C., 1985, pp. 34% 354.

11. Haskin, R.L., and Lorie, R.A. On extending the functions of a relational database system. In Proceedings of SIGMOD ’82 International Conference on Management of Data (June 2-4, Orlando, Fla.). ACM/SIGMOD, New York, 1982, pp. 207-212.

12. Kernighan, B.W., and Pike, R. The UNIX Programming Environment. Prentice-Hall, Englewood Cliffs, N.J., 1984.

13. Khoshafian, S.N., and Copeland, G.P. Object identity. In Proceedings of OOPSLA ’86 Conference (Sept. 29-Oct. 2, Portland, Oreg.). ACM/SIGPLAN, New York, 1986, pp. 406-416.

14. Kim, W., et al. Integrating an object-oriented programming system with a database system. In Proceedings of OOPSLA ’88 Conference (Sept. 25-30, San Diego, Calif.). ACM/SIGPLAN, New York, 1988, pp. 142-152.

15. Klahold, P., et al. A transaction model supporting complex applications. In Proceedings of SIGMOD ’85 International Conference on Management of Data (May 28-31, Austin, Tex.). ACM/SIGPLAN, New York, 1985, pp. 388-401.

16. Loomis, M.E.S., Shah, A.V.S., and Rumbaugh, J.E. An object modeling technique for conceptual design. In Proceedings of the European Conference on Object-Oriented Programming (June 15-17, Paris, France). Lecture Notes in Computer Science, 276. Springer-Verlag, New York, 1987, pp. 192-202.

17. Maier, D., Stein, J., Otis, A., and Purdy, A. Development of an object-oriented DBMS. In Proceedings of OOPSLA ’86 Conference (Sept. 29-Oct. 2, Portland, Oreg.). ACM/SIGPLAN, New York, 1986, pp. 472-482.

18. MIMER Information Systems AB, Uppsala, Sweden.

19. Rumbaugh, J.E. Relations as semantic constructs in an object-oriented language. In Proceedings of OOPSLA ’87 Conference (Oct. 4-8, Orlando, Fla.). ACM/SIGPLAN, New York, 1987, pp. 466-481.

20. Rumbaugh, J.E. Controlling propagation of operations using attributes on relations. In Proceedings of OOPSLA ’88 Conference (Sept. 25-30, San Diego, Calif.). ACM/SIGPLAN, New York, 1988, pp. 285-296.

21. Rumbaugh, J., et al. Object-Oriented Modeling and Design. Prentice-Hall, Englewood Cliffs, N.J., 1991.

22. Shah, A., et al. DSM: An object-relationship modeling language. In Proceedings of OOPSLA ’89 Conference (Oct. 1-6, New Orleans, La.). ACM/SIGPLAN, New York, 1989, pp. 191-202.

23. Stonebraker, M., and Rowe, L. The design of POSTGRES. In Proceedings of SIGMOD ’86 International Conference on Management of Data (May 28-30, Washington, D.C.). ACM/SIGMOD, New York, pp. 340-355.

CR Categories and Subject Descriptors: D.2.2 [Software Engineering]: Tools and Techniques; D.2.10 [Software Engineering]: Design-methodologies; H.2.1 [Database Management]: Logical Design; H.2.4 [Database Management]: Systems

General Terms: Design, Performance

Additional Key Words and Phrases: Database checkout, database performance, database shadowing, engineering database application, entity-relationship modeling, long transaction, object-oriented database, relational database

About the Authors:

MICHAEL R. BLAHA is a computer scientist at GE’s Corporate Research and Development Center in Schenectady, New York. His research interests include engineering database management and complex data modeling. Email: [email protected]


WILLIAM J. PREMERLANI is a computer scientist at GE’s Corporate Research and Development Center. His research interests include object-oriented methodologies and applications of databases to engineering applications. Email: [email protected]

JAMES E. RUMBAUGH is a computer scientist at GE’s Corporate Research and Development Center and is working on object-oriented methodologies for software design and their implementation as practical systems for applications. Email: rumbaugh@crd.ge.com

THOMAS A. VARWIG is a senior software design engineer at Cadence in San Diego. His research interests include tools for printed circuit board design including automatic placement and routing. Email: [email protected]

Authors’ Present Address: M. R. Blaha, W. J. Premerlani, and J. E. Rumbaugh, General Electric Company, Corporate Research and Development,
