The Objectstore Database System - University of...

14
4$ 4 THE - OBJECTSTORE DATABASE SVSTEM Charles ~mb Gordon Landis Jack Orenstefin Dan Wetinreb 50

Transcript of The Objectstore Database System - University of...

Page 1: The Objectstore Database System - University of Michiganweb.eecs.umich.edu/~jag/eecs584/papers/objectstore.pdf · 2006-09-04 · allohbrarles that were developdw for manipulating

4$4

THE -OBJECTSTORE

DATABASESVSTEM

Charles ~mbGordon LandisJack Orenstefin

Dan Wetinreb

50

Page 2: The Objectstore Database System - University of Michiganweb.eecs.umich.edu/~jag/eecs584/papers/objectstore.pdf · 2006-09-04 · allohbrarles that were developdw for manipulating

o bjectStoreisanobject-orienteddatabasemanagementsystem(OODBMS) thatprovidesa tightly integratedlanguageinterfaceto the tradition DBMS featuresof persistent

storage,transactionmanagement(concurrencycontrol and recove~),distributeddataaccess,andassociativequeries.ObjectStorewasdesignedto provide a unified programmatic interfaceto both persistentlyallocateddata (i.e.,data thatlivesbeyond the executionof an applica-tion program) and transientlyallocateddata (i.e.,data thatdoesnotsurvivebeyond an application’sexecution),with object-accessspeedforpersistentdatausuallyequal to that of an in-memory dereferenceof apointer to transientdata.—

These goals were derived fromthe requirements of ObjectS1ore’s@rget applications, wh!ch are typi-cally data-lntensxveprowams thatperform complex manipulationsonlarge dambasesofob]ects with !ntrl-cate structure, e.g., CAD, CAE:,CAP, CASE, and geoWapblc infor-mation systems ((;1S) This struc-tural complexity ISgenerally real-ized by inter-obJectreferences, e.g ,pointers from one obJect to an-other. Objects are lmated, ~ssiblywith the intent to update them, bytraversing these references and byass~latlve queries,

We selected C++ as the primarylanguage through wh,ch O&JectStOre 1s accessed &cause It iskcomlng a very popular languageamong the developrs of Ob-jectStore’s target appbcations. Ob-JectStore can also & used from Cprograms—provldlng access fromC ISeasy because lhe data mtiel ofC )s a subse( of that of C++. use ofObJectStore from otber program-ming languages IS discussed later.

The key 10ObJectStore’sintegra-tion with C++ ISthat persistence isnot part of the type of an obJectObjects of any C++ dab typewhatsoever can k allocated tran.siendy (on the ordinary heap) orpersistently (In a dambase), frombudt-in types such as Integers andcharacter strings, to arbltraq user-defined structures (which may con-

@in pointers, use C++ virtualfunctions and multlple inherimnce,etc), In particular, there ISno needto ,nherlt from a special “persistentobJect” base class. Different oblectsof the same ty~ may be persistentor transient wtthln the same pro.Warn.

There are several motlvattonsforour goal of making ObjectStorecloselv Integrated with [he Dro-

,“

grammlng language Tbese ln-cblde.

M* of learning: It was intention-ally designed so that a C++ userwould only have to learn a btde bitmore In order to try out Ob-JectStore and smrt to use it effec-tively. After that, a user can learnmore, and tike advanmge of moreof the capabilities the database of-fers. In particular, there is no needto learn a new type system or a newway to define objects. The declara-tive and prmedural parts of thelan~age are used for bth kinds ofobJects. By provldlng a graduallearning path and making It easyfor users to get started, we hc)pe tomake ObJectStore accessible to aw,der ra”gc of developers, a“dhelp case tbe transition Into the useof object-or! ented dambase tech-nology.

No tinslation cde: We wanted tosave tbe programmer from bavlng

to wr!te code that translates &tween the disk-resident representa-tion of data, and the representationused during execution. For exam-ple, t<)~torea C++ object Into a re-Iat!<)naldatahase, the programmermust construct a mapping betweentbe two, and write code that picksfields out oftuples and copies themInto data memkrs ofob)ects (’fhisISpart of the problem tbat has heencalled the “Impedance mlsmatcb”ktween a programming ktn~ageand a da~base access language [2,13].) Wltb O~ectStore, no translat-ing and no copying is needed Per-sistent data IS just hke ordinaryheap-allmated (transient) dataonce a pointer is obtained to It, theuser -n Iust use It In the ordtnaryway. ObJectStore automaticallytakes care of locking, and keepatrack of what ba$ ken mdlfied

Expressive pwer: We wanted theInterface to persistently allmatedobJects to supprt all of the powerof the host programming languageThis contrasts with the traditionaldata manipulation capabihtles oflanguages such as SQL, whlcb aremuch less pwerful than a general-purpose programming language

Reumbili~: We wanted to promotereusabdlty of code, by allowlng thesame code to operate on eltber per-sistent or transient data, and 10

51

Page 3: The Objectstore Database System - University of Michiganweb.eecs.umich.edu/~jag/eecs584/papers/objectstore.pdf · 2006-09-04 · allohbrarles that were developdw for manipulating

a l l o whbrarles that were developdfor manipulating trans]enl data towork on persistent data withoutchange For example, Ifa pro~am-mer has a hbrary routine that tikesan array of floatlng-point numbersand computes the fast Fouriertransform, he or she can pass it anObjectStore persistentarray, and itwdl work. Usually, if a library doesnot need to do any persistent allo-cation of is own, the library can &applled to persistent dam w!tho”leven king recompiled.

&nversiOn: Many programmerswho are Interested In using object-orlented DBMSSwould hke to addpersistence to exlsttng apphcattonsthat deal with transient objects,rather tban budd new applicationsfrom scratcb. We wanted to make itas easy as possible to convert an ex-isting apphcatlon to use persistentobjects throughout. In particular,this means that basic data opera-tions such as derefcrencing point-ers and getting and setting dammembers should be synmcticallytbesame for persistent and transientobJects, and that variables sbouldnot have to have their type declara-tions changed when perslslent ob-jects are used.

~ ch=bg: We wanted thecompile-time type-checktng ofC++ to apply to persistent dam aswell as transient dam, with the en-tire application using a single type

system. The cOmpller’s type check-ing apphes to ObJectsin the dati-base. For example, a variable refer-ring to an object of class employeewould have type ‘employee *’, Sucha variablecould refer toa persistentemployee or a transient employee,at different times during pro~amexecution. A function that takes areference to an employee as an ar-pment can therefore operate on apersistent or a transient employee.

The second goal of ObjectStoreIS to provide a very high perfo-rmancefor the kinds ofapphcationsto which ObjectStore is mrgetted.From the potnt of view of perfor.

mance, the target applications arevery different from traditlo”aldatibase applications s“ch as pay-roll programs and o“.line transac-tion prmessing systems, in severalways, as we fo”nd from l“temiew-Ing developers of such appbcatl”n~

Tempml Imality: Whe” ma”yusers access a shared database, veryoften the next user of a data Itemwdl be the same as the previo”zuser. In other words, whaleconcur.re”t access musl be allo”ed andmust work correctly, many dataitems will be used ‘mostly’ by oneuser over a short span of time.

Spatial lwality: Often an appbca-tlon wdl use only a portion of adaubase, and that portion will be(or can be arranged to be) ina smallsection of the dambase that IScon.ti~ous, or mostly so.

Fine interleaving: Apphcatlonsoften interleave small database op-erations (I.e., go from one obJect toa reference object) with smallamounts of computation. That is,there are many very small databaseOperatiOns rather than relatively

few large ones. If every dambaseoperation required a significantper-operation overhead cost (suchas the cost of sending a networkmessage), overhead costs wouldbecome prohlbtttve,

Developers told us that it is im-perative that ordinary data manip-ulation be as fast as possible. Forexample, an ECAD circuit simula-tion ISCPU-lntenslve, traversing anetwork of objects representing acircuit, carrying out compumtionson the way. These simulations arequite expensive. Any approach todam management that penahzesthe running time of such an appli-cation is impractical. This meansthat one critical opration must &as fast as possible: the o~ratlon ofobmining dam from an object,given a pointer or reference to tbeobject. This o~ration might becalled ‘fetching an object’; moreprecisely, it is dereferencing apinter. ObjectStore is designed to

make the speed ofdereferenclng ofplnters to persistent obJects be asclose as possible to that of transientobjects, namely the speed of a sin-gle load InstructIon.

ObJectStore also has some of tbesame performance goals as ordi-nary relational DBMSS, and it gen-erally accomplishes these usingfamdlar techniques s“ch as l“dexes,query optimlzatlon, log-based re-covery, and so on. The lmplemen-tat!on section expla!ns how we ap-proached all of these performancegoals, focusing on the aspects ofObJectStore that differ from con-ve”tlonal techniques.

Another goal Of ObJectStore IStoprovide several feat”res that aremissing from C++ and from mostDBMSS: a collection facdlty (sets,lists, and so on), a way to expressbldlrectlonal relatlonshlps, andsupport for groupware based onversioned data,

Application IntetiaceIn addltlon to the data defin]tl””and manlpulatton facdltles pro-vided by the host languages, C andC++, O~ectStore provides sup-prt for accessing persistent dataInside transactions, a library of col-lection types, bidirectional relatlo”-ships, an optimizing query facdity,and a version facihty to support col-Iabrative work. Tools supportingdatibase schema design, dambasebrowsing, database adm]nlstratlon,and apphcatlon compdatlon anddebugging, are also provided.

There are three programmingInterfaces s“pported, a C hbraryInterface, a C++ libra~ interface,and an extended C++ which pro-vides a tighter language integration”to the query and relationship facih.ties. This interface isaccessibleonlythrough ObjectStore’s C++ com-piler, while the two library inter-faces are accessible through otherthird-party C or C++ compilers,thus providing maximum pormbil-ity, All of the feat”res a“d perfor.mance knefim of the ObjectStorearchitecture are realized In all ofthe interfaces.

Page 4: The Objectstore Database System - University of Michiganweb.eecs.umich.edu/~jag/eecs584/papers/objectstore.pdf · 2006-09-04 · allohbrarles that were developdw for manipulating

Accessing PeBiSteflt DamA simple C+ + program which usesthe extended C++ interface to the

system is presented in Figure 1This program opens an exlst]ngdatabase, creates a new persistentobJect of class employee, adds thenew employee to an existtng de-partment, and sets the salaryof theemployee to 1,000. The keywordpersiswnt specifies a storage class,saying that this variable resides inthe specified database. Persistentvariables associate names with per-sistent obJects, provldlng the start-ing point from which navlptlons orqueries begin The db argument tothe nm operator spec]fies that theemployee obJect being createdshould & allocated In dambase db

It sbould be noted that the ma-nipulation of data looks Just like anordinary C++ pro~am, eventhough the obJecU are persistentThey also compile Into the samemachine InstructIons.the update ofthe salary field Just uses a Simplestore Instruction ObjectStore auto-matically sets read and write locks,and automatically keeps track ofwhat has been modified, helping toprotect the integrityof the dambase

against the pOssLbilityOf Program-mer error. Access to persistentdataIS ~aranteed to be transaction-conslstent (i.e., all-or-none updatesemantics), and recoverable in theevent of system failure.

It should & noted that in Fig-ure 1 the variable eweer%dep-ent is not expbcidy initial-ized. This is &cause It ISa persis-tent variable, whicb refers to anobJect stored in the ‘c/compmlrecor~” database. The object ISlooked up by name, ‘en-gneerlng-department’, In the dam-base, and the program variable isinltiahzed to refer to the namedobject in the dambase. (It wouldhave ken an error if there hadken no sucb object in the dam-base.) Tbe persistent keyword inthe ObjectStore extended C++ in-ferface simply provides a short-hand for Iwking up an object in thedaabase by name, and binding a

#ticlude (objmWdobjetiWm.H)#ticlude (wo-.H)

- (){

// me nefi -e sW-en@ maw ad mtip@aW a// pers@@nt object mpresenu a person n-cd Wdemployee *emp = nm (db) empl~ee (<’~ed”);e~meeti~dep~ent-> ti~emplqee (emp);emp–>s~~ = 1000;

-

I* merecor&.H .1

clwe employee{wb~c.

chin* -e;ht Stim;

};01=s dep-ent{pubJto:

O~t(emplOyee*) employees;

void ti~empl~e (empl~ee *e){ .mplqees–>@eti (e); }

tit wor~aem (employee *e){ mwn empl~ees–>con= (e);}

);

Using colletilons

53

Page 5: The Objectstore Database System - University of Michiganweb.eecs.umich.edu/~jag/eecs584/papers/objectstore.pdf · 2006-09-04 · allohbrarles that were developdw for manipulating

Iwal program variable to the ~r-s!stent database object.

ColletilonsObjectStore provides a collectionfacdity In the form of an objeticlassIibraq. CoO~tions are abstractstructures which rewmble amays intraditional prOWamming lan-Wages, or mbles in relationalDBMSS. Unlike amys or tibles,bowever, ObjmtStore collectionsprovide a variety of khavlors, in-cluding ordered collections (hsts),and collections witb or witboutduplicates (bags or sets).

Performance t“ning often in-volves replaclng simple data struc-

tures, sucb as lis~, with more effi.cient but more complex stmcturessuch as b-trees or hash ~bles. Thisas~ct of application developmentis also handled by the collection li-brary, Users may optionally de-Krik intendd usage by estlmati”gfrquenties of various o~ratio”s,(e.g., iteration, insenion, and re.moval), and tbe collection libra~will transparently select an appro-priate represenmtion. Funher-more, apoltq can be assmiated witbtbe collection, dlcmtlng how thereprese”mtion sbo”ld change inrespnse to changes in the collec-tion’s cardinality, These ~rfor.mance-tuning facilities reduce tbe

dep~enta d;

fomti (empl~e. e, d–>emplqeaa)e—>std~ *= 1.1;

Iteration over a collection

01=s employee{pub~o:

s- -e;tit etiq;dep-enw dept

timmeaember d~ent::ern@~ea;};

voti ti&empl~e (emplWee *e){ employees–>~efi (e); )

void wor~~em (employee *e){ empl~es–>conw (e) ; }

};

Using relationships

developer’s involvement from cod-ing dam structures m dewrlbingaccess patterns.

FiWre 2 shows the user-writte”include file remti.H, used in thisexample. Note that tbe class de.partment declares a dam memkrof type O~~empl~*) .otietis a (parametrized) collection class,found in the ObjectStore coll~tio”class Iibraq. If d is a department,then d–>tiempl~e(e) simplyadds e into d’s set of employees.d–>worhaem(e) returns t~ ife ISconmined in d’s set of employ-ees, fake othewise.

ObjectStore includes a Iwpingconstruct to iterate over seu. Forexample, the cde In FiWre 3 ~vesa 10% raise to each employee i“department d. In the loop, e ishund to eacb element of d–>em-plqees in turn.

me Relationship FaciliNComplex objects sucb as parts hier.archies, designs, dmumenm, a“dmultimedia l“formation can &modeled using ObjectStore’s rela-tionship facility. Relationships can& thougbt of as a pair of Inversepinters, so tbat if one object pintsto another, tbe swond object has aninverse pinter back to the first.Relationships mainmln the inte@tyof these ~!nters. For example, ifone participant in a relationship isdeleted, then tbe pinter to thatobject, from the other pamicipant,is set to n“ll. One-to-one, one-to.many, and many-to-many relation.ships are supprted.

To continue the example in Fig-ure 3, we could create a relation-ship &tween employees and de-patiments, as in FiWre 4. The deptdah memkr of emplqm and theemplqees dan memkr of de-p-ent are declared to k in-verses ofo”e another. Because onedam memkr is a single Winter andtie other is a ~t, the relationship isone-to-many, Whenever an em.ployee is insertd into a depart-ment’s set of employees, the em-ployee is automatically updated torefer to the department (and vice-

54

Page 6: The Objectstore Database System - University of Michiganweb.eecs.umich.edu/~jag/eecs584/papers/objectstore.pdf · 2006-09-04 · allohbrarles that were developdw for manipulating

versa) Similarly,when an employee15deleted from a department’? setof employees, the pointer from theemployee to the department 13setto null, guaranteeing referentialIntegrity.

Syntactically, relatlonshjps areaccessed just hke data memhrs InC++, hut updating the value of arelatlon~hlpcauses the Inverse rela-tionship to he updated as well, sothat the two sides are alwaysconsis-tent with one another. This meansthat after d–>ti~employee(e) Inthe code example given In Figure 1,e’s dept would & enaeertidep~ment, even though thjs fieldwas not exphclt]y set by tbe apph-catton. Th\s ufiate of e would K-cur as a result of Inserting e Intod–>employees, kcause of the ti-versenember declaratlon~. Slml-Iarly, lf e–>dept is set to anotherdepartment, ti, then e ISremovedfrom d–>employees, and Insertedto ~–>empl~ees. In general,maintenance actions can Involvesimply unsetttng the Inverse, or ac-tuallydeletlng the ohJecton the in-verse, at the schema-definer’s dis-cretion. The latter behavior ISuseful for deleting hierarchies ofohjects, so that, for example, delet-lng an assembly would cause all ofIts subassemblies to he deleted,along with their suhassemhhes,re-cursively

Associative QUerieSIn relational DBMSS, queries areexpressed in a special language,usually SQL. SQL has Itsown vari-ahles and expressions, which differIn syntax and semantics from thevarlahles and expressions in thehost language. Bindings ktweenvarlahlesIn the two lan~ages must& estahhshed exphcitly. Oh-jectStOre queries are more closelyIntegrated with the ho~t language.A que~ IS simply an expressionthat operates on one or more col-lections and produces a collectionor a reference to an ohJect.

Selection predicates, which ap-pear within que~ expressions, arealso expressions, either C++ ex-

pressions or quer,es. Continuingthe previou~ example, ~tippo~? that&employees is a set of employeeob]ects:

OsSet(emplOyee*) Uemployees;

The following statement u~es aquery against ~Lemployeea tofind employees earntng over$100,000, and as~lgn the re%tl!ttomeqti&emplOyees:

OsSet(emplOyee*)@meqal~employees =&employees[. s~a~ >= 100,000 .],

[ :1 15 OhJectStOre ,y”t.Y forqueries The contained expre~%lonis a selectton predicate, that ,s (con-ceptually) apphed to each elementof &employees in turn (In fact,the query wdl k optjm17ed If anIndex on salary ,s pre%ent This ISdiscussed later )

Any collection, even one r??&]lt-Ing from an expres510n,can k qtle-ried. For example, this query findsoverpaid employees of departmentd:

d–>employees[: sd~ >= 100000:]

Query expressions can also henested, to form more complex que-ries The following query Imatesemployees who work In the samedepartment .s Fred:

tiemployeea[: dept–>employees[: n~e ‘= ‘wed’:1 :1,Each memher of&employee8 hasa department, d8pt, which has anembedded set of employees. Thenested query IT true for depart-ments having at least one employeewhose name ISFred.

All of these examples make u~eof the language extenslon~avadahleonly through the ObJectStore C++compiler; the [ :] syntax, for exam-ple, IS a language exten~lon Thesame queries can & exprr~~ed vIathe lihrary tnterface. The prev,ousque~ would be restated ,n theC++ hbrarv Interface as

osSet(emplOyee*)>@ workti-ed =fiemployees->quew(cemplOyee*’,“dept->employees[: nme == \’Fmd’\ ]“);

The first argument tc) query,employee*, Indicates the type ofthe collection elements. The secondargument ISsimply the string rep-resenting the query expression. It 1?al~oposs!ble to use the hhrary inter-face to store precompded and optl-m17ed queries tn the database forlater executton.

In )t~ current form, the Ob-JectStOre query language can ex-pre?s ‘semqolns’ hut not full jo]ns,Ie., the result of a query IYa subsetof the collection king queried.

vemlons

OblectStore provides facdltle~ formultlple users to share data in acooperative fashion (~ometlmes re-ferred to as ~oupware) With thesefacdltles, a user can check out a ver-sion of an obJect or group of ob-Ject5, make changes (perhaps en-tadlng a long series of Individualupdate transactions), and thencheck changes back in to the maindevelopment pro]ect so that theyare vlslble to other members of thecwperating team. In the interim,other users can continue to use theprevious versions, and thereforeare not impeded by concurrencyconflicts on their shared data, re-gardless of the duration of the edit-ing sessions Involved. These ex-tended edltlng sessions on private,checked-out versions are “fte” re.ferred to as long tran~actlons.Thedesign was influenced by [3, 6, 9,lo].

If other users want to make con-current parallel changes, they cancheckout alternativeversions of the~ame object or groups of objects,and work on their versions In pri-vate Again, the result ISthat thereare no concurrency con fllcts, eventhough the u~ers are operating on(d,fferent versions of) the sameobJects Alternative ver~lons can

Page 7: The Objectstore Database System - University of Michiganweb.eecs.umich.edu/~jag/eecs584/papers/objectstore.pdf · 2006-09-04 · allohbrarles that were developdw for manipulating

l be merged kck together tor=oncile differences resultingfrom this parallel development.This merging operation is a diffi-cult problem and is left to the userto implement on an applicatiOn-

s~ciflc basis [81.In supportOfthls,ObjectStore allows simultaneousaccess to kth versions of an obJectduring the merge.

Users can control exactly whichversions to use, for eacb ohject or~oup of objects of interest, by set-ting up pr!vate workspaces that

s~clfy the desired verslOn. Thismight & tbe most r=ent version,or a particular previous version(such as tbe previous release), oreven a version on an alternativebranch. Users can also useworkspaces to selectively share theirwork in pro~ess. Workspaces caninherit from other workspaces, sothat one designer could specify tiathis or her workspace should by de-fault inhetit “whatever is in theteam’s shared workspace”; he orshe could then add Indlvldual newversions as changes are made, over-riding this default.

For example, a team of designersworking on a CPU design might set

UP a wOrkspace in which all Of theirnew versions are created. Onlywhen their CPU design is com-pleted would tbe finished version(s)& checked in to tbe corporateworkspace, making them availableto, say, the manufacturing ~oup.Within tbe design team’s work-

space, there might ~ multiPlesubworkspaces, wh]ch are used bysubWoups of the design team orIndividual team memkrs, Just astbe entire uoup makes its workavailable to manufacturing bychecking In a completed verston tothe cor~rate workspace, individ-ual designers or teams of designerscan make their work-in-pro~essavailable to one another hy check-ing their intermediate versions In totheir shared workspaces. This is il-Iustratd in Fi~re 5.

Just as the ~rsistence of an object is inde~ndentoftyp, the ver.sioning of an object is independent

of ty~ This means that instincesof any type may be versioned, andthat versioned and nonversionedinstinces can & operated on by thesame user cde. Tbis makes it easyto mke an exlst]ng piece of code,which has nonotion ofversioning—for example, a circuit-design simu-lator—and use it on versioned data.The simulator does not have to &rewritten, because opcratlng on aparticular version of a circuit de-sign is Identical to o~rating on anonversloned design.

Pro~ams using versioned dataneed not distln~isb amongversioned, persistent, and transientdam in accordance with ObjectStore’s design principles.

Anhitetira andImplementation

[email protected] fundamenml operation of adambase programming language isdereferencing: finding and using atarget object that is referred to by asource obJect. ObjectStore’s inter-face goals state that th]s must workJust as in ordina~ C++, to providetransparent integration wltb thelanguage and to make dereferenc-

ing as fast as possible. Th]s meansthat ordinary pinters from thehost lan~age must& able to serveas references from one persistentobject to another.

ObjectStore’s performance goalsdemand that once the mrget obJecthas ken retrieved from the dam-base, subsequent references shouldbe just as fast as dereferenclng anordinary pointer in tbe language.This means that dereferenclng apointer to a persistent mrget mustcompde exactly the same asdereferenclng a pointer to a tran-s]ent target, (i.e., as a single ‘load’instruction), without any extra in-structions to check whether the tar-get obJect has ken retrieved fromthe database yet. This creates a dk-Iemma, since It is ~ssible that thetarget object really has not yet kenretrieved from the databaw.

Fortunately, these design goalsare analogous to those of virtualmemory systems, which supprtuniform memo~ references todam, wbether that dati is l~ated inprimary or secondary memory.ObjectStore mkes advanmge of theCPU’s virtual memory hardware,and the operating system’s inter-faces that allow ordina~ softwareto utdlze that hardware. The virtual

L J

[CPU

) [MANU

)

(ALU

)CACHE

(Team 1

) [Team 2

NestIn9 of WOMSPaCeS

56

Page 8: The Objectstore Database System - University of Michiganweb.eecs.umich.edu/~jag/eecs584/papers/objectstore.pdf · 2006-09-04 · allohbrarles that were developdw for manipulating

memory system allows Ob]ectStoreto set tbe protection for any page ofvirtual memory to no access, readonly, or readlwr]te Wben an Ob-JectStoreapphcatlon dereferences apointer whose target has not kenretrieved Into the chent (!.e , a pageset to no access), the hardware de-tects an access vlolatlon, and tbeOperatlng system reflects th,s backto ObJectStore, as a memory faultObjectStore retr]evesthe page fromthe server and places It ]n tbe ch-ent’s cacbe. It tben calls the operat-!ng system to set the protect!c)n ofthe page to allow accesses to suc-ceed (Ie , read only) Finally, It re-turns from the memory fault,which causes the dereference tores~rt This ttme,It succeeds.Sub=sequent reads to tbe same targetobJect,or to <)theraddresses <)ntbesame page, WI]]run In a s]ngle ln-structlc>n,w]thout cau?,ng a faultWrites tc]tbe target page wdl resuhIn fautts that cause the page accessmode and tock to be upgraded toread-write Alt virtual memorymapping and address space manlp-ulatl(]nIn the apphcatlc)nISbandiedby the opcratlng system under thedlrectlon of Ob]ectStore, using nor-mal system calls

The ObJectStore server providesthe long-term repository for persis-tent data Databases can be storedeltber of two ways wllhln fites pr<]-vlded by the operating system’s fitesystem,or w]tbln partitionsof disks,using ObJectStOre’%own file system.The latter provides bigher perfor-mance, by keeping datahases ascontiguous as possible even as theygraduatty grow, and by avoldlng

varlOus Operatlng system Over-heads. Tbe server and tbe clientcommunicate vla Iocatarea networkwhen tbey are running on differenthosts, and by faster facdltles such assbared memory, and Imal sxketswhen they are running on the samehost

Tbe server stores and retrievespages of data In respon$e to re-quests from chents. The server hasno knowtedge of tbe contents of apage It simply passes pages to and

from tbe cbent, and stores tbem ondisk The server ISalso respon~lbiefor concurrency control and recov-ery, using techniques simdar totbose used In conventional DBMSS.It provides two-phase linking witha readlwrlte tmk fur each page.Recove~ IS based on a log, usingtbe write-ahead log protocotTransactions Involv]ng more thanone server are coordinated usingthe two-phase commit protocolThe server also provides backup totong-term storage media such astapes, allowlng full dumps as well asc<]nt,nu<>usarchive Iogglng

S,nce the server has no knowl-edge (>f the content~ of the page,mulchof the query and DBMS pro-cessntgISdone on the chent sideofthe network This contrasts withtraditional relatlonat DBMS sys-tem~ In wblch the server IStargelyresponsible for handhng all queryprocessing, opt,mlzatlon, and for-mdttlng Although such offloadingof work from [he server ISnot Ideatfor atl apphcatlons, this arcbltec-ture doe~ nc]t prectude bav,ng theserver handle more of the work.

ObjectStore malntalns a cbentcacbe, a pool of databasepages thathave recently hen u$ed, In tbe vir-tual memory of lhe chent bestWhen tbe apphcat]on signals amemory fault, ObJectStore deter-mines whetber tbe page being ac-ces~ed ISIn the cbent cache If not,It asks the ObJectStore sewer totransm,t the page to tbe ctient, andputs the page Into tbe cbent cacheThen, the page oftbe chent cache ISmapped Into vlrtuat address space,so tbat the apphcatlon can access It.E’lnally,the faultlng instruction ISrestarted, and tbe apphcat]on con-tinues.

Many appbcat]on~ tend to refer-ence large numhrs of smatt ob-Jects, but networks are, In general,more efficient for bulk data. Tocompensate for this, whole pages ofdata are brought from the server tothe client and ptaced In the cacheand mapped Into virtual memory.Ob]ects are stored on tbe server Intbe same format In wh]cb tbey are

seen by the language In virtualmemory. This avoids potent]at per-obJect overhead such as calhng adynamic memory atlocator, creat-ing entries In ohJecttahles,or refor-matting the nonpolnter elements ofthe ob]ect.

When a transaction finishes, allpages are removed from the ad-dress space and modified pages arewritten back to the server (the chentwaits for an acknowledgment fromtbe server tbat the pages have beensafety written to disk). However,the pages remain In the cbentcache, so that If the next transactionu~esthose pages, It wdt not bave tocommunicate wltb tbe server to re-trieve them; they wdl atready bepresent In the cacbe This improvesperformance when several succes-sive transactions use many of thesame pages. Typlcat ObJeLtStOreappllcatlOnsInterleavecomp”tatlonvery tlghdy w,th database access,dc)]ng some computation, thendereferenclng a po!nter and read-ing or chan~ng a few values, thendoing some m<>recomputat,c]n, etc.Iflt were necessary to communicatewith a remote server for each oftbcse $Lmpledatabase o~rat]ons,the cost of the network and scbed-uter overhead would & enormousBy making the data directly avall-abte to the apphcat,on and attowlngordinary ]nstructlons to manlputatethe data, such apphcatlons performfaster

Since a page can reside ,n the ct%-ent cacbe without blng Iwked,some other chent might modify thepage, lnvabdat,ng the cached copy.The mechanism for making suretbat transactions always see vabdcopies of pages IScalted ‘cache co-herence’. A copy of a page In a cb-ent cache is marked e,ther as shredor exclartue mode The server keepstrack of which pages are In thecaches of wh!ch chents, and withwhich modes. When a chent re-quests a page from tbe server andthe server notices that the page ISIntbe cache of some other cbent (tbehotitng cbent), the sewer wdl cbeckto see If tbe modes conflict. If they

Page 9: The Objectstore Database System - University of Michiganweb.eecs.umich.edu/~jag/eecs584/papers/objectstore.pdf · 2006-09-04 · allohbrarles that were developdw for manipulating

Applications can fimproveperformance by exercEsfing control

over the p~acement of objectswfithin a database.

do, the server sends a message tothe holding chent, asking It to re-move the page from 11scache. Thisis called a callback message, since Itgoes In the opposite dlrectlon fromthe usual request the server ISmaking a request of tbe cbent.

When tbe holding chent receivesthe callback, It checks to see If thepage ,s Imked, and If not, agrees toImmediately rebnq”isb the page,and removes the copy of the pagefrom Itscache. If the page ISlocked,the chent repbes negatively to theserver, and the server forces therequesting cbent to wait untd [beholder ISfinlsbed with the transac-tion. When the holdlng chent corn-mlts or aborts, rt then removes tbecopy oftbe page from Itscache, andthe server can allow tbe original cli-ent to proceed. Tbe use of callbackmessages was inspired by the An-drew Fde System [11]. Relatedcacbe coherency algorithms are dis-cussed in [4].

In an Ideal computer architec-ture wltb unhmlted virtual address

space, every ObJectin every data-base could have a unique address,and virtualaddresses could serve asuncbang!ng object Identlflers.Modern computers have virtualaddress spaces that are veq large,but not unhmlted. Single damhasescan exceed the size of the virtualaddress space Also, two indepen-dent databases might eacb use thesame address for their own obJects.This !s the fundamental problemtbat must & solved by any virtualmemory-mapping approach to aDBMS

ObJectStore solves this problemby dynamically asslgnlng portionsof address space 10 correspond to~rtkons of the databases used by

tbe appbcatlo”, It matntal”s a v,r-tual address map that shows wb,chdatabase arid which o~ect w,tbl”the databdse 15 represented by anyaddres5. As the apphcatlon refer-ences nlore databases a“d moreoblects, additional address space isassigned, and tbe new obJects are[napped Into these new addresses,At tbe end of eacb transaction thevirtual address map IS reset, a“dwben the next transaction smrts,new assignments are made,

.1hls solutl”n does “ot place anybmlts on the s,ze ofa database. Nat-urally, each transaction”is limited toacccsslng 110more data tha” ca” fitInto the v,rt”al address space. 1“practice, tbls hmit ISrarely reached,since nlodern computers have verylarge virtual address spaces, andtransact)ol>s are generally sbortenuugh thal tbey do not accessnearly ASnlucb data as can fit A“operatlo!l large enougb to ap-proach th,s bm,t would be dlvldedInto 5everal transactions, andchecked out Into a workspace toprovide Isolatlon from other “sers.

Wben a page ISmapped i“to v,r-tual memory, tbe corresp”de”ceOfObJectsand virtualaddresses mayhave changed, The value of eachpointer stored In the page m“st &updated. to follow the new vlrt”aladdress of the object, This IScalledrelocationof tbe pointers, Whe”possible, O~ectStore arranges toassign the address space so tbatpointers as stored on the serverhappen to be the same as tbe valuesthey o“ght to have ,“ v,rt”al mem.ory. In this case, relmatlo” IS“otneeded, whlcb Improves ~rfor.mance. But sometimes relmatlo”cannot be avoided. For example,wben the database size exceeds tbe

size of the available address space,relocation is req”lred.

ObjectStore ma!”tal”s a“ a“xd-lary data struct”re called the tagtable that keeps track of the Ioca.tion and type of every obJect1“ thedaubase. Whe” a page ISmappedInto virtual address space andplnter relocation IS needed, Ob-JectStore consults the tag table tofind o“t what obJects reside on thepage, and then uses the datibaseschema to learn wblcb Iocatlonswttbln each ob)cct contain pointers,It tben adj”sts the value of thepointer to account for the new as-signments of data to the vlrt”aladdress space To mlnlmlze spaceoverhead whalekeeping access fast,the tag table ISheavily compressed,and ISIndexed, Each tag table e“trycontains a 16-blt type code, wblchlndexe~ i“to a type table stored tnthe database’s schema, The typemble entry Indicateswhich words ofthe type contain pointers, Tag tablepages are brought Into the chentcache as needed, and managed ,nthe cacbe like ordinary databasepages.

Apphcatio”s can Improve per-formance by exerc,slng controlover the placement of obJectswlthln a database, By clusteringtogether objects tbat are frequentlyreferenced together, Iocahty is In.creased, the client cache IS usedmore efficiently, and fewer pagesneed to ~ transferred In order toaccess the objects, ObjectStore d,.vldes a database Into areas calledsegme”ts, and whenever a“ appli.cation creates a new persistent ob.Ject, it can specify tbe segment )“wblch tbat object should be created,Apphcatio”s can create as ma”ysegme”ts as are n~ded, Segments

58

Page 10: The Objectstore Database System - University of Michiganweb.eecs.umich.edu/~jag/eecs584/papers/objectstore.pdf · 2006-09-04 · allohbrarles that were developdw for manipulating

Sflnce IockEng granularity is on aper-page basis, the advantages

O* c~uster~ng are rea~izedin decreased iockfing overhead.

may & transferred from sewer toclient either en masw, or one pageat a time, depending on the settingof an application-controlled per-se~ent flag.

Objects can cross page bunda-ries, and an & much larger than apage. Image dati, for example, cank stored ,n ve~ large arrays that

span many pages. If an applicationneeds to accessonly a small portionof such a huge object, It can usepage-~anularity transfer, to trans-fer only the pages of the object tbatare actually uwd. Conversely, manysmall ohjects can reside on a s]nglepage, Since Imking granularity ISon a ~r-page basis, the advantagesof clustering are also realized indecreased Imking overhead.

ObjmtStore depnds on the op-erating system to control the map-ping and protection of pages, andto allow access violations to & han-dled by software. The most stan-dard versions of Unix, such asSVR4, OSF/1, Berkeley bsd 4,3,and SunOS all provide these facili.ties. For other versions of Unix,ObjecStore includes a devicedriver that must be linked with thekernel when ObjectStore IS in-stilled. ObJectstore never modifiestbe Unix kernel imelf. Future ver-sions of these operating systems areex~cted to provide thew memo~manipulation facihties. ObjectStorecurrently runs on Sun 3 andSPARC, under SunOS, IBM RS/6000, under AIX, DEC DS31OO,under Ultrix, HP series 300, 400,and 700, under HP/UX. By the e“dof 1991, ObjwtStore should also &running on DEC under VMS, andSGI. Most other ~pular kernel-based operating systems, IncludingVMS and 0s/2, provide the facdi.

ties that ObjectStore needs. Ob-jectStore is also available on MicrO-soft Windows 3.0. Windows doesnot have a protected kernel likeUnix, so ObJectStore controls vir-tual memory directfy.

colletiionsIn designing tbe collmtion facihty,an importint design goal was thatperformance must & comparableto that of hand<oded data struc-tures, across a wide range of appli-cations and cardinality. Often, Okjects have embedded collections.For example, a Person obJect mightconmln a set of chddren. In thesecases, cardinalities are usuallysmall, often Oor 1, and only mca-sionally abve 5-10. Collections arealso usedto storeallObJmtsof somety~, e.g., all employees, and sucbcollations can & arbitrarily large.Furthermore, access patterns differgreatly among applications, andeven over time within a single ap-phmtion. Clearly, a single repre-senmtion type will & inadequatewhen ~rformance is a concern, somultiple represenmtions of collec-tions must & supprted. However,it is not desirable for the user tohave to deal with these represenm-tions directly. The u=r should beable to work through an interfacethat reflects khavior, not repre-senmtion.

The ObjectStore collection facili-ties are arrangd into two class hi-erarchies: one for collections, andanotber for curwrs. The base ofthe collection hierarchy is oscollec-tian, which is actually the base fortwo hierarchies. One of these con-wins o=et, osbag, and odlst.These provide familiar combina-tions of bhavior. Other combina-

tion: .~n & obtiined by specifyingcombinations of bhavior for anos-collection, (e.g., a list without

duplicates, or a set that raises anexception upon InsertIon of a du-plicate, instead of sdendy ignoring1[).

The other hierarchy under oscollection provides for various rep-resenmtions of collections. Eachrepresenmtion sup~fis the entireo~collectton Interface, but with dif-ferent performance characteristics.The= classes are available for di-rmt use, but it should never b nec-essary to work with representitlonsdirmtly. Instead, a representation isnormally selected automatically,based on user-supplied estimatesofaccesspatterns (1.e.,how frequentlyvarious operations wdl & carriedout).

O~ratiOns on collections apparas wthd, (or memkr functions, touse the C++ terminology). As istypical of object-oriented lan.Wages, there is a run-time functiondispatch, to Imate the appropriateimplementation of each function,based on the collection’s behaviorand represenatlon. When a coOec-tion mdifies i~elf to employ a dif-ferent representation, It actuallymodifies its own (represenmtion)

type description, so function dis.patches will continue to work cor-rectly.

auedesSyntactically,queries are treated asordinary expressions in an ex-tended C++. However, que~

expressions are handled quite dif-ferently from other kinds of ex-pressions. The obvious implemen-mtlon strategy-iterate and checkthe prediate—wottld provide ve~

59

Page 11: The Objectstore Database System - University of Michiganweb.eecs.umich.edu/~jag/eecs584/papers/objectstore.pdf · 2006-09-04 · allohbrarles that were developdw for manipulating

poor performance for large cOllec- variable or call.by.reierence pa-tlons. In relational DBMSS, Indexes rameter), or result from the eval”a.can be suppbed to permit more ef- tlo” of a“ expression. This meansficlent implementations. A query that m“ltiple strategies m“st &optimizer examines a variety of generated, with the final selectionstrategies and chooses the least ex- Ieft untd the moment the collectlo”pensive. ObjmtStore also uses in- king queried is known, and thedexes and a query optimizer. The query is to k r“n.indexes are more complex than Relational database whereas areindexes in a relational DBMS, since heavdy normalized—there are “othey may index paths thro”gh ok such things as embdded sets orjects and collections, not Just fields pinters. As a result, queries in-directly conta,ned in objects. The volve multiple tables whose con-query optimization and index tents are related to one another bymaintenance Ideas presented here )oin terms’, i.e., expressions involv-were insp,red by [14]. Similar Ideas Ing rows from a pair of tables (e. g.,on lndexlng and paths ap~ar in the department Identifier column[12, 15, 16]. In the Employee table and the iden-

Optlmlzation techniques devel- tdier column in the Department

Oped fOr relational DBMSS do not table). Consequently, optimizersseem well-suited for ObjectStore. spend mOst Of their time figuring[n a relational DBMS, relations are out the &st way to evaluate queriesalways identified by name. As a re- with multiple join terms, In Obsuit, snformatlon akutthe relation, jectstore, queries tend to ~ over ae.g., the available indexes, is avail- small num~r of top-level (1.e.,able when the query is optimized, nonembedded) collections, “s”allyand a single strategy can b gener- one. SelectIon predicates Involveated. In ObjectStore, collections are paths through obj~ts and embd.often not known by name. They ded collections. These paths ex-may be pointed at (e.g., by a prnter press the same sort of connections

oodbl ObjectStore oodb3 oodb4 rdbmsl lnOeXSystem

Warm andcoldcachetravemalwsults

60

that join terms expressed in rela-tional queries. Stnce the path ISmateriahzed In the dawbase, withinter-object references and embed-ded collections, join optlmlzatlon ISless of a problem.

In ObjectStore, a parse tree rep-resenting the q“ery is constrictedat compde-time. Information con-cerning paths that appear in thequery IS propagated up the tree tothe nodes representing queries.During cde generation, a pair offunctions IS generated for eachnde in the que~’s parse tree OneIS used to implement a scan-basedstrategy (visit each element andcheck the predicate), and the otherImplements an index-based strat-e~. Functions correspo”di”g toquery nodes also conmin code toexamine the collection bing que-ried, (e.g., what indexes are pres-ent? what is the cardlnality?) a“dmake final choices abut strategy.Th}s approach allows for flexibdltyat run time, yet still carries outmuch expensive work (analysis ofthe que~) at compile time.

If ObjectStore’s C++ compiler isused, then query parsrng and opti-mization occurs during compiletime. Queries expressed using thelibrary Interface are actually parsedand optimized at run time, Thesame run-time library supportingquery exsution is used in kthcases.

As noted earlier, paths can &viewed as precomputed Joins. 1“ObjectStore, indexes can b createdon paths. As a result, the join opti-mization problem faced by rela-tional DBMS optimizers ISreplacedby a much simpler index selectionproblem. Analysis of the que~ in-dicates which indexes could be rele-vant, For example, this query findsemployees who earn over $100,000and work in the same departmentas Fred.

empl~ees[: sd~ >100000 &&dept–>employees[:me == ‘Red’ :] :]

There are two paths here~”eon salary, and another start]ng at

Page 12: The Objectstore Database System - University of Michiganweb.eecs.umich.edu/~jag/eecs584/papers/objectstore.pdf · 2006-09-04 · allohbrarles that were developdw for manipulating

a n e m p h ) v e e -through thedepartment of the cmpl”yee, theset of emplovees c>f that depart-ment, and the name of each suchemplovee An Index for each patheither exists or it does not—thechoice can k made q“,ckly at r“”time There IS “o need to reaso”about strategies hased on the pres.ence or abse”ce of an Index foreach step of each patb, as In a rela-tional optlm)zer

Tbls IS not to say that queriesover paths avo,d all q“cry process-ing problems d“e to tbe presence ofJoIns. In general, a comparison of apath to a constant (e g., dept–>nme == ‘Mse=ch,), InVOIVCS

Index selectlon only Jo,” optlmlza-[ion problems recur whe” [WOpaths are compared, as In thisquery (not based on any objectclasses descr~~d prev,o”slj).

projecw[eweers[.projAd == worti-on @@nme == ‘~ed’ :] ,]

Tbls query find project~ ,“VOIV.ing Fred There IS“o stored co”.nectlon betwee” pr”jccts a“d engi-neers They are matched “p bycomparing the projdd of a W0j8ctand theworks_on field of a” [email protected] ObJectStore would eval”atcthisJoin using Iteration over proj-ecti, and an Index lookup o“ e~i-neers, (assuml”g the l“dex ISava,l-ahle). An l“dex on engi”eers’names could also be used

Whale this query IS a vahd Ob-JectStOre q“e~, It 1s an un”s”alone, and It reflects an unus”al Ob-jectStore scbema. Normally, theconnection between proJects andenflneers wo”ld be represented byinter-obJectreferences, i.e., theJo,”wo”ld be precomputed and storedIn the parttclpatlng ObJects.This ISJustified by a“alysjs of programs ,“our apphcat,on doma,”s, Truejoins, as In the earlier query, arequite rare, For this reason we havenot yet tmplementedjoln optlmiza-tlon. It ISun”s”al to have queriesInvolvlng multiple ‘top-level’ ctdlec-lions, (e.g., classextents) whose ele-

ment~are related by comparjng at-tributes. It ISmore c(>mmonto havequeries over a 5,ngle top-fe,,el col-Iectlon, with nested querle$ o“embedded collections (Ie., queriesover paths that may go thro”gh COI.Iectlons). The ObJectStore querv

Optimizer reflects this.WhaleJoin optlmlzallon ISle?s {>f

a prOblem.cOmpared tOa relatlOnalDBMS, l“dex malnte”ance is m“chmore difficult. In a relationalDBMS, “pdatcs affecting Indexesare expres~cd in SQ1,. 1“ Ob-jectStore, where tbe lnte~att”nktweer] the DBMS and the hostIangtlage ]$ much t)ghter, updatesare ord,nary expres~lons that havecertain s,dc cflects For example

Person* p,

p–>age = p–>age + 1,

The a551gnment statement “p-alates the age of perso” p IfP–>@e happens to & tbe key tosome Index, Lben that Index m“st& updated It 13 not practical tocheck If Index maintenance 15Fe-q“lred f“r every statement thatmodifies an ob]ect 1be perfc,r-mance consequences wo”ld be d,s.astrous. Instead, ObjectSt<>re re-quires the declarat,o” of datamembers that could polentnl~ kused as Index keys. Index mainte-nance checks are perf”rmed fortbese data memhers o“ly Example.

CIMSPersont

ht agetidemble;titheight;

}; “’”

Tbe declaratlo” of ~e as tidex-able tndlcates tha[ updates of meneed to k checked f“r l“dex main-tenance. Updates of height do “othave to be cbecked The tidexabledeclaration does not affect type, A5a re$ult, most charlges in i“-dexabibty (addtng or remov,”g adeclaration of tidexable to an ex.Istlng data member do not affect

the schema of the database (Butrecompdatlon %<)uldalwavs be re-qu,red )

Irldex nlalntenance 13 furtbercomphcaced by the pre$el]cc of Rn-dexes on paths. For example, con-sider an Index on ch]tdren’s r>amesfor a set of peopte Sucb arl Index ]suseful for quer1e5 stxch a5 “Findpeople who have a chdd namedFred.” Index “palates arc req”,rcdwhen a per50n ,5 added t“ the c“l-Iect!on, a person in (he ctdlect]or]has a chdd, or when one ofthls per-son’s chddren changes hls or hern.me

Indexes on pathscould k s!ngle-step, with an access method (e g ,hash tahle) used to represent eachstep of the patb, or there coutd bone structure recording the associa-tion f(>r the enore path These al-ternat,,c~ have been d,scussed In[14] ObjectSt”rc USC5a ser,es of?Ingle-$tep Indexes. When an ln-dexable data member IS “pdated,all affected dccess mctb<>ds areupdated. Tben, .11access mcth,,dsdownstream Inaffected Index pathsare updated t<)<)Slmdarlv, an up-date to a cotlectjon triggers upda[esthat mav affect all ac’ess methodsof all Indexes of the collection

ApplicationsI he perlormartce and productlvjtv&nefits of ObJectStore have beendemc)n~tratedIn a numkr of Ob-jrctSt<>reapphcatlons

PeflormanceaentitsThe {;attell Benchmark [5] was de.s,gned to reflect the access patternsof engineering (e.g., CASE andCAD) apphcatlons Tbe benchmarkconsistsof several te?ts, but only thetraverse test results are show” heresince It &st illustrates Lhe perfor-mance benefits of ObJectSl(>re’?architecture. The test traverse~ agraph of oh]ects slmdar t<,C>”Cthatmlgbl b fc)und In a typicat engi-neering apphcat]an (e g , aschema). The graph #n Figure 6sbows that the warm and cold cachetraversalresultswben tbe cbent andserver are on dtfferent machines

61

Page 13: The Objectstore Database System - University of Michiganweb.eecs.umich.edu/~jag/eecs584/papers/objectstore.pdf · 2006-09-04 · allohbrarles that were developdw for manipulating

(i.e., the remote rose).A cold ache is an empty cache,

as would exist when a client stirtsaccessing part of the damhase forthe first time in recent histo~. Awarm cache ISthe same cache aftera numkr of iterations have ke”run. If the next iteration accessesthe same part of the damhase, thecache is said to k warm. The dif-ference htween cold and warmcache times demonstrates that bththe client cache and the virtualmemory-mapping architecturehave a significant ~rformancehenefit.

Cold cache times are dominatedhy the time required to get datafrom the disk of the server into theclient’s address space. Warm timesreflect prmessing speed of datathat is already present at the clientand mappd Into memory. We b-Iieve this to & the most imptintperformance concern for our mr-get appbcatlon areas.

P~dudlvIN BenM&The productivity henefiti are dem-onstrated by the experiences ofLuctd, Inc., wblch is developing anextensible C++ programming en-vironment named Cadillac [7]. Theenvironment has hen under devel-

opment since Ig8g and will ~ re-Ieaxd as a product. The system isking implemented in C++.

Before Obj&tStore was available,the developrs of Caddlac used aC++ ohject class which, when in-herited, prov,ded persistence.Classes that might have ~rsistentinsmnces had to inherit from tbisclass. For each such class, methods(i.e., functions) for storing a“d rc.trieving the object from the dam-base had to& defind. A referenceto an object resulted in a retrievalfrom the database, ,fthe ohject hadnot already hen retrieved. Whilereads were transparent in tbat no

special functiOns had tObe called bytbe class user, writes had to & ex-plicitly specified as function calls—

a pr~ess that was prOne tO errOr.This mechanism was supprted hya conventional Index Squentiai

Access Methd (ISAM)-hsed file

system.Porting Cadillac to ObjectStore

took one developr one week. Themdificatlons were limited to threesource files out of several dozenand involved, for the most part, dis-abling the persistence mechanismthat had ken in use. The simplicityof the prt was due in large part tothe architecture of OhJectStore,which treau persistence as a storageclass rather than as an aspect of

tYPe. The conversion wo”ld haveken much more difficult If func-tions that manipulated objmts hadto & modified to distinguish &-tween ~rsistent and transient ob-jects.

In order to sped the prtingprmess, the developers chose to al-locate all o~ects in the dambase,even those that did not “eed to b~rsistent. Once fine-Waind tun-)ng commencd, however, objectsand values that could be allwatedtransiently were allmated on thetransient heap. Tranmction kund-aries were alm added to shortentransactions, minimizing committime and reducing concurrent con-flies.

The performance of Cadillacimproved considerably followlngthe insmllation of ObjectStore.Compilation from witbin the Cadil-lac environment ran three to fivetimes faster with ObjectStore thanwith the original ISAM-basd per-sistence mmhanism. Compilation isa write-intensive o~ration, splitinto two transactions, one for eachpass of the compiler. Read-inten-sive oprations showed even moreimprovement, running 10 timesfaster using ObjectStore,

Wrk in ProgrsssObject Design, Inc. was founded inAugust 1988, and vers!on 1.0 ofOb]ectStore was released in Octo-&r 1990. Version 1.1, descri~dhere, was released in March 1991and was the result of approximately30 prwn-years of effort.

We are extending this work in anumkr of ways. New features

under development Include:

. Whe- evolution:When a typedefinition changes, insmnces ofthe type, stored In the dambase,need to b mdified to reflect thechange.

● Suppti for hekm~ous mhi-ktims: Some applications re-quire access to a datibase frommult]ple architectures with vary-ing memory layouts (e.g., differ-ent byte orderings and floating-point represenmt,ons).

. Gmmutication with existigdambses: Many applicationsrequtre the abdity to access exist.ing, nonobject-orie”ted databases(e.g., SQL and IMS databases),To retiln the productivity bene-fits of ObjectStore, it ]s necessaryto provide transparent access tothese databases, I.e., througb theexisting ObJectStore interface.

ConclusionsObJectStOre was designed for use Inapplications that perform complexmanipulations on large databases ofobJects with Intricate structure.Developers of these apphcatlonsrequire bigh productivity throughease of use, expressive ~wer, areumble cde base, and tight lnte-~ation with tbe host environment.However, even more imprmnt isthe need for high ~rformance.Sped cannotk sacrificed to obtainthese knefits.

The key to meeting these re-quirement is the virtual memo~-mapping architecture. Becauw ofthis architecture, ObjectStore uwrsdeal witb a single type system. This~rmlts tight Integration with thehost environment, eaw of use, andthe reuse of existing Iibranes.Other approaches to ~rsistencemken by other object-orientedDBMSS require transient and pr-sistent objects to & ty~d differ-ently. As a result, conversion k-tween transient and persistentrepresenmtlons are required, orsoftware that had ken develo~dto deal with transient objwtt must& modified or duplicated to ac-

Page 14: The Objectstore Database System - University of Michiganweb.eecs.umich.edu/~jag/eecs584/papers/objectstore.pdf · 2006-09-04 · allohbrarles that were developdw for manipulating

c persistent objects. In arelational DBMS, all persistent datais accessed wlthln the scope of theSQL Iang”age with Its own Inde-pendent type system,

The virtual memory -mappl”garchitecture also leads to high per.formance. References to transle”tand persistent obJects are ha”dledby the same machine cde se.quences, Other architectures re-quire references to potentially per-sistent objects 10 b handled lnsoftware, and this IS necessaryslower.

ObjectStore’s collecoon, relatio”.sbip, and query facditles prov!desupport for conceptual nlodehngconstructs such as multlvalued at-tributes, and many-to-many rela-tionships can k translated dire’tlyinto declarative ObJectStOre co”.structs ❑

References1. Agrawal, R., Gchan,, N H, ODE

(Object database a“d env,ronme”t).The Ia”g”age a“d tbe data modelACM-SICMOD 1989 Int-t,”wlConfmeme on Maw~-nl of Data(May-Ju”e 1989)

2. Bancdbo”, F., Ma,er, D M“hda”-guage ObJectOrl~nted system. NCWanswers to old database pr”blemsFuture Garalton Com@&s 11,K Fuchl a“d L Kottl, Eds , North-Holland, 1988

3. Bd,r,s, E Config”rat,o” ma”age-me”t and .crsto”,”g ,n a CADICAM data ma”ageme”t e“v,ron-me”t (A” example) Pr,me/Com.p“terv,s,o,, ,nternal memo, J“ne1989

4. Carey, MJ., Fra”kb”, MJ , L,v”y,M., Sheklm, E,J, Dam cacb,”gtrade-offs ,n ‘bent-,,,,., DBMSarcbltect.res 1“ ProceedingsACMSIGMOD I“t_t,oml Confcence .“the Mamgmeni of Data (1Y91).

5. Cattell, R G G a“d Skee”, J Objecto~ratlons ~nchmark ACM TramDahbmc Syst To & pubhshed

6. Cho”, H . K,m, W Vers,on, a“dcha”ge “otnficat,o” ,“ a“ object.or,e”ted database system 1“ Pro.ceed’”gsof 25tA ACM/IEEE Des,pAutomtton Confertwe (1988).

7. Gabr,el, R P , Bo”rbak,, N., De,,”,M , D“ss”d, P, Gray, D , Sexto”, HFo””dat,on for a C++ prOg,am.

m,nge“v, r””me”t In C“?,fe,t”c.P,”cecdzngsof C++ Al W“rk

8. Glew, A Boxes. bnks a“d parallell,ecS 1“ Pr”cecdtng,“f the April ’89Use.= Soflware M.wgement Work.h”f

9. Goldste!n, I P and Bob, ”w, D Alayered approach to software de-s,gn Xerox PARC CSL-SO-5,De.1980

10. G“ldsletn, 1 P, Bobr”w, D An ex-perimental descrlptlOn-based prO-gr~zr)rllntlgenvironment ~Our r~-;;;; X.,”. PARC CSL 81-S, Mar

11. Kazar, M L Sy.cbron,zatlon andcachtng ,ss..s ,“ the A“drew hlcSysrem.In u~tn~ConferenGeproceed-trigs, (Dallas, Wti”ter 1988), pp 27–36

12. Kemper, A., Mmrkotte, G. Accesss“p~rt ,“ “bJect hses. 1“ Proceed-ings ACM SIGMOD Inlemtt”mlCo”fme.c. on Mawgeme”t of D.t.(1990)

13. M.,er, D. Mak,ng dambse syslc,nsfasceno”gh f“r LADappl)catt”t,,,,,obJect-ortienled c“”cepu, dambaseand appbcattons W Ktm and ELmhovsky, Eds , Addison-We.ley,R.ad,ng, Mass, 1989, p~S 573–581

14. Ma,,., D., Ste,n, J. Developn>enta“d lmplementat,o” of an “hJccl-orle”led DBMS 1. ResearchDtrcc-t,”~ z“ ObJetl-OmnledProflammtng,B Shr,ver a“d P Wegr,er, Ed,MIT Press 1987. Als” ,“ Redtngb t,tObjecl-OwntedDal.he Sy>tem,s BZd””,k a“d D Maler, Morg.rlKa”fman”, Ed,.> 1990

15. Shek)ta, E H~gh.~rformance fin,-plemenmt,o” techntq”cs f“r nexc-generatjOn database systenls LOm-puler %Lences Tech Rep #l fJ26,Unlverslty of Wtsconstn.Mdd150n,1991

16. Shek,ta, E , Cdrey, M Performar,cee“ha”ceme”t thro”gh repbcattor)t“ a“ obJect.or,entcd DBMS 1“P,meedtngs A6M SIGMOD l,tlerw-t,owl C“.fere?ue “. M.mgeme?,t 0/Dar. (1990).

CR Categories -d S“bjfft Wscrip-tors: C 24 (Computer Systems Orgmi-mtio”] C“mp.ler-C”m”,”n,’.ltonNetworks—Dutnb”ted sjstem, D32[%fN_] Pr”gran,n)tng h,>g.dges—h,,~gt chstficaft”m, D 42 [%ft-Ware] Opratlng Systems—Sto,agemmgewnl, H.2 I [Information Sys-

CORDON LANDIS ISa memkr of thect,g,”eer,,)g stiff “f Object De.,gn Hewas pre,,””,ly A c“-fo””der of O“-t“l”g)c, *here he led the des,gn and,rr,ple,r,et, tal,”t) “t the Vbase ob~ecl-“r,erlted dalabdse s)stem

JACK A. ORkNSTEIN Z,. ,“etr,&r oflhe et,g,l,c’x,ng sl.fl a,ld .“-f”under offJb~ect Des,gt, H. ma, p,CVIO”Sl~ a

c“”lP”L,, $’,,,11,.1 .1 ~“”,p”t., Corp”-rdt,o,, “f A,ne,,ca, *here he co”d”cledresearch on $p~t,a[ d.ta tn”deb”g a“dsp.t,al quel) pr~esslng On the pR~BEpr”JCC1

DANIEL L. WEINREB ,s a DatibaseAr.h,Lcct a“d ‘“- f””,,dcr “f ObJect De.stgn He -as prc, t””dy Aco-fo”,,der ofS)mbhcs, ~here he led the des,gn and,“lplcmentat,o” “f Lhc ~tall’. obJect.“ntc,>lcd dalabase ,y,Lt”l

A“fbers, Prex”t Address: tJbJectDes,g,,, lnc One hew E,,gla,]d Exe<”-l,ve Park, Burl, ”gton, MA tJ1803, cmad‘a’mCmodt‘“m

Perm,ss,ontoc”py w,tho”t feedl or part of(h,, r“atcr>al is gra”td pr”v!ded that thecop,.sa, e”ot made ordtst~!b”led fordlrectc“mme,c,alad. a”tasc, the ACM .“pyr,ghtn“t,cca”d lbet,tleot tbep”bl,cat~ona”d ,ISdate ap~ar, a“d “et,’e !ss,..” tbat cop),ng,> b> pe]m,.st”n of the As.oct.t,”” f“rComp”t,ngMach,”eq 7“c”pyothew,sc, orto repubhsh, requ!res a fee .nd/or sPectficp.rll,,ss,.n

@ACL,0”020782,9,,10”00,0$, 50

63