
1

Object Oriented Databases

Ioan Despi

1. Advanced Database Applications

2. Object-Oriented Concepts

3. OODBMS

4. Common Issues

5. ODMG 2.0

2

1. Advanced Database Applications

RDBMS: widespread acceptance for traditional business applications: order processing, inventory control, banking, airline reservations

proven inadequate for new technologies: computer-aided design (CAD), computer-aided manufacturing (CAM), computer-aided software engineering (CASE), office information systems and multimedia systems, digital publishing, geographic information systems

3

Disadvantages of Relational DBMS:

Poor representation of “real world” entities.

Semantic overloading

Poor support for integrity and enterprise constraints.

Homogeneous data structure.

Limited operations.

Difficulty handling recursive queries

Impedance mismatch.

Other problems concerning: concurrency, schema access, navigational access, and so on

4

2. Object-Oriented Concepts

Abstraction: the process of identifying the essential aspects of an entity and ignoring the unimportant properties.

1. Encapsulation: an object contains both the data structure and the set of operations that can be used to manipulate it.

2. Information hiding: we separate the external aspects of an object from its internal details, which are hidden from the outside world.

The internal details of an object can be changed without affecting the applications that use it.

Loosely speaking, an object corresponds to an entity in the ER model.

The object-oriented paradigm is based on encapsulating data and code related to an object into a single unit.

5

The current state of an object is described by one or more attributes, or instance variables. The value of each variable is itself an object.

1. A simple attribute: can be a primitive type (integer, string, real, …)

2. A complex attribute: can contain collections and/or references

3. A reference attribute: represents a relationship between objects

contains a value or collection of values, which are themselves objects (like a Foreign Key or a pointer)

Complex object: an object that contains one or more complex attributes

Notation: “dot” notation: branch.street, branch.manager, branch.city

Object = a uniquely identifiable entity that contains both the attributes that describe the state of the object and the actions that are associated with it, that is, its behaviour. (Simula)

6

The behaviour of an object is given by:

a set of messages to which the object responds

each message may have 0, 1 or more parameters

a set of methods, each of which is a body of code to implement a message

a method returns a value as the response to the message

The physical representation of data is visible only to the implementor of the object.

Messages and responses provide the only external interface to an object.

The term message does not necessarily imply physical message passing. Messages can be implemented as procedure calls.

7

Figure: object showing attributes and methods (Attributes; Method 1, Method 2, Method 3, Method 4)

8

Methods are programs written in a general-purpose language, subject to the following restrictions:

1. Only variables in the object itself may be referenced directly

2. Data in other objects are referenced only by sending messages

They can be used to change the object’s state by modifying its attribute values, or to query the values of selected attributes.

A method consists of a name and a body that performs the behavior associated with the method name:

method void update_salary (float increment)

{salary = salary + increment;

}

9

Messages are the means by which objects communicate.

A message is simply a request from an object (the sender) to another object (the receiver) asking the second object to execute one of its methods.

The sender and receiver may be the same object.

The “dot” notation is generally used to access a method.

staff_object.update_salary(1000)

In a traditional programming language, a message would be written as a function call:

update_salary(staff_object, 1000)
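The two call styles above can be sketched in Python. This is a minimal illustration, not part of the slides: the Staff class, the starting salary and the free function wrapper are assumptions; only update_salary and the increment of 1000 come from the text.

```python
class Staff:
    """Hypothetical class; only update_salary comes from the slides."""

    def __init__(self, salary: int) -> None:
        self.salary = salary  # state, referenced directly only inside methods

    def update_salary(self, increment: int) -> None:
        # The method body that implements the update_salary message.
        self.salary = self.salary + increment


def update_salary(staff: Staff, increment: int) -> None:
    # Traditional function-call style: the receiver is an ordinary argument.
    staff.update_salary(increment)


staff_object = Staff(salary=20000)   # starting salary is an assumed value
staff_object.update_salary(1000)     # message sent with dot notation
update_salary(staff_object, 1000)    # the same request as a function call
print(staff_object.salary)
```

Both calls reach the same method body; the difference is only in how the receiver is named.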

10

Object classes

Similar objects (have the same attributes, respond to the same messages) are grouped into a class.

The attributes and associated methods are defined once for the class.

All objects in a class have the same:

variable types

message interface

methods

They may differ in the values assigned to variables

Classes are analogous to entity sets in the ER model.

Example: all branch objects would be described by a single Branch class.

11

BRANCH

Attributes: bno, street, city, area

Methods: print, update_tel_no, ….

Instances of the class:

bno = B5, street = 12 Deer St, city = Sidcup, area = London, ...

bno = B7, street = 16 Dever St, city = Dyce, area = Aberdeen, ...

bno = B3, street = 154 Main St, city = Partick, area = Glasgow, ...

12

A class is an object ===> has its own class attributes and class methods

The class is an instance of a higher-level class called metaclass

Class attributes describe the general characteristics of the class, such as totals or averages (e.g. total number of branches)

Class methods are used to change or query the state of class attributes

There are special class methods to create new instances of the class:

new --constructor

destructor

In the following example, employment-length is a derived attribute.

For strict encapsulation, methods to read and set other variables are also needed

13

class employee {

/* Variables */

string name;

string address;

date start-date;

int salary;

/* Messages */

int annual-salary();

string get-name();

string get-address();

int set-address(string new-address);

int employment-length();

};
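The employee class above, together with the class attributes and class methods described earlier, could look roughly like this in Python. It is a sketch under assumptions: the total_employees counter, the count method and the reading of salary as a monthly figure are not in the slides.

```python
from datetime import date


class Employee:
    total_employees = 0  # class attribute: describes the class as a whole

    def __init__(self, name, address, start_date, salary):
        self._name = name
        self._address = address
        self._start_date = start_date
        self._salary = salary  # assumed to be a monthly figure
        Employee.total_employees += 1

    @classmethod
    def count(cls):
        # Class method: queries the state of a class attribute.
        return cls.total_employees

    def annual_salary(self):
        return self._salary * 12

    def get_name(self):
        return self._name

    def get_address(self):
        return self._address

    def set_address(self, new_address):
        self._address = new_address

    def employment_length(self, today=None):
        # Derived attribute: computed from the start date, never stored.
        today = today or date.today()
        years = today.year - self._start_date.year
        if (today.month, today.day) < (self._start_date.month, self._start_date.day):
            years -= 1  # this year's anniversary has not been reached yet
        return years


e = Employee("Susan Deer", "12 Deer St", date(2015, 6, 1), 3000)
print(e.employment_length(today=date(2020, 5, 31)))
```

Because employment_length is recomputed on every call, it can never drift out of step with the stored start date.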

14

Inheritance

Inheritance allows one class (subclass) to be defined as a special case of a more general class (superclass).

The process of forming a superclass is referred to as generalization.

The process of forming a subclass is referred to as specialization.

By default, a subclass inherits all the properties of its superclass(es) and, additionally, defines its own unique properties.

A subclass can redefine inherited methods.

All instances of the subclass are also instances of the superclass.

Principle of substitutability: we can use an instance of the subclass whenever a method or a construct expects an instance of the superclass.
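A short Python sketch of method redefinition and substitutability. The bonus method and its percentages are invented for illustration; Staff and Manager are the class names used in the slides.

```python
class Staff:
    def __init__(self, name, salary):
        self.name = name
        self.salary = salary

    def bonus(self):
        # Default behaviour inherited by every subclass (5% is an assumed rate).
        return self.salary * 5 // 100


class Manager(Staff):
    def bonus(self):
        # Redefines (overrides) the inherited method (10% is an assumed rate).
        return self.salary * 10 // 100


def total_bonus(staff_members):
    # Written against Staff; any subclass instance can be substituted.
    return sum(member.bonus() for member in staff_members)


people = [Staff("Ann", 1000), Manager("Susan Deer", 2000)]
print(isinstance(people[1], Staff))
print(total_bonus(people))
```

total_bonus never checks which class it receives, which is what the principle of substitutability permits.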

15

The relation between the subclass and superclass: A KIND OF (AKO)

The relation between an instance and its class: IS-A.

Examples:

Manager is AKO Staff.

Susan Deer IS-A Manager.

Inheritance:

1. Single inheritance: the subclass inherits from no more than one superclass

2. Multiple inheritance: the subclass inherits from more than one superclass ===> conflicts!

16

Figure: single inheritance (classes Person, Staff, Manager, Sales_Staff) and multiple inheritance (Sales_Manager below Manager and Sales_Staff)

17

3. Repeated inheritance: a special case of multiple inheritance

superclasses inherit from a common superclass

The inheritance mechanism must ensure that the subclass does not inherit properties twice.

Figure: repeated inheritance (Manager and Sales_Staff both inherit from Staff; Sales_Manager inherits from both Manager and Sales_Staff)

4. Selective inheritance: allows a subclass to inherit a limited number of properties from the superclass.
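For the repeated inheritance case (item 3), Python is one language where the common superclass is inherited only once. The sketch below uses Python's own mechanism, the C3 method resolution order, as an example; the class names are adapted from the slides and the staff_no attribute is an assumption.

```python
class Staff:
    def __init__(self):
        self.staff_no = None  # property defined once, in the common superclass


class Manager(Staff):
    pass


class SalesStaff(Staff):
    pass


class SalesManager(Manager, SalesStaff):
    pass


# Python linearizes the hierarchy (C3 method resolution order), so Staff
# appears exactly once even though it is reachable along two paths.
mro_names = [cls.__name__ for cls in SalesManager.__mro__]
print(mro_names)
```

Other languages resolve the same conflict differently, for example by requiring explicit renaming or qualification.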

18

Object Identity

Each object is assigned an Object Identifier (OID) when it is created that is:

system generated

unique to that object

invariant

independent of the values of its attributes

invisible to the user
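The difference between identity and attribute values can be illustrated in Python, using the built-in id() as a rough in-memory stand-in for an OID. This is only an analogy: unlike an OID, id() is visible to the programmer and is unique only while the object is alive. The Branch class here is a hypothetical example.

```python
class Branch:
    def __init__(self, bno, city):
        self.bno = bno
        self.city = city

    def __eq__(self, other):
        # Value equality: compares attribute values only.
        return isinstance(other, Branch) and (self.bno, self.city) == (other.bno, other.city)


a = Branch("B5", "Sidcup")
b = Branch("B5", "Sidcup")   # same attribute values, different object
alias = a                    # second reference to the same object

equal_but_distinct = (a == b) and (a is not b)
print(equal_but_distinct, a is alias)

identity_before = id(a)
a.city = "London"            # changing state does not change identity
print(id(a) == identity_before)
```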

Other concepts:

overriding (+ overloading)

polymorphism & dynamic binding

complex objects

persistence

19

3. OODBMS

Figure: history of data models.

Models shown: Hierarchical Data Model (IMS), Network Data Model, Relational Data Model (E. Codd, 1970), ER Data Model (Chen, 1976), Semantic Data Model (Hammer, McLeod, 1981), Object-Relational Data Model, Object Oriented Data Model

Eras shown: first generation DBMS (1960 - 1970), second generation DBMS (1970 - 1980), third generation DBMS (1980 - 2000)

20

OODM: a logical data model that captures the semantics of objects supported in oo programming

OODB: a persistent and sharable collection of objects defined by an OODM

OODBMS: the manager of an OODB (W. Kim 1991)

Zdonik & Maier (1991) --- An OODBMS must (at minimum) satisfy:

must provide database functionality

must support object identity

must provide encapsulation

must support objects with complex state

or (Khoshafian & Abnous 1990)

object oriented= ADT + Inheritance + OID

OODBMS= OO + database capabilities

21

Traditional DBMS

•persistence

•sharing

•transactions

•concurrency control

•recovery control

•security

•integrity

•querying

Semantic data models

•generalization

•aggregation

OO programming

•object identity

•encapsulation

•inheritance

•types & classes

•methods

•complex objects

•polymorphism

•extensibility

Special requirements

•versioning

•schema evolution

Object Oriented Data Model

22

Strategies for Developing an OODBMS:

1. Extend an existing object-oriented programming language with database capabilities. Smalltalk, C++, Java --> GemStone

2. Provide extensible object-oriented DBMS libraries. Ontos, Versant, ObjectStore

3. Embed object-oriented database language constructs in a convenient host language. O2 embeds OODL in C.

4. Extend an existing database language with object-oriented capabilities. Extend SQL--> SQL3, OQL.

5. Develop a novel database data model / data language. SIM (semantic information manager, 1988).

23

OODBMS Perspectives:

Modern database systems are characterized by their support of the following features:

1. Data model: a particular way of describing data, relationships between data, and constraints on the data

2. Data persistence: the ability of data to outlive the execution of a program, and possibly the lifetime of the program itself.

3. Data sharing: the ability of multiple applications to access common data, possibly at the same time

4. Reliability: the assurance that the data in the database is protected from hardware and software failures

5. Security : the protection of the data against unauthorized access

24

6. Integrity: the assurance that the data conforms to specified correctness and consistency rules

7. Distribution: the ability to physically distribute a logically interrelated collection of shared data over a network

Traditional programming languages:

1. Provide constructs for procedural control and for data and functional abstraction

2. Lack built-in support for many of the above database features

Novel applications require functionality from both perspectives.

25

Issues:

1. Persistence: objects must survive after the user session or application program that created them has terminated

transient objects: only last for the invocation of the program

To implement persistence in OODB: 3 schemes

A. Checkpointing: copy all or part of a program’s address space to secondary storage

• a checkpoint can only be used by the program that created it

• a checkpoint may contain a large amount of data that is of no use in subsequent executions

26

B. Serialization: copy the closure of a data structure to disk.

A write operation on a data value involves the traversal of the graph of objects reachable from the value and, then, the writing of a flattened version of the structure to disk.

Writing this flattened structure is known as serialization, pickling or marshaling; reading it back reverses the process.

• Does not preserve object identity: if two data structures that share a common substructure are separately serialized, then on retrieval the substructure will no longer be shared in the new copies.

• It is not incremental, and so saving small changes to a large data structure is not efficient.
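The loss of sharing described above can be reproduced with Python's standard pickle module, used here only as a convenient serializer. The data structures and names are invented for the illustration.

```python
import pickle

shared = ["common", "substructure"]
first = {"name": "first", "part": shared}
second = {"name": "second", "part": shared}
assert first["part"] is second["part"]  # shared before serialization

# Serialized separately: each closure is flattened on its own.
first_copy = pickle.loads(pickle.dumps(first))
second_copy = pickle.loads(pickle.dumps(second))
sharing_lost = first_copy["part"] is not second_copy["part"]

# Serialized together as one closure: the shared part is written only once.
both_first, both_second = pickle.loads(pickle.dumps((first, second)))
sharing_kept = both_first["part"] is both_second["part"]

print(sharing_lost, sharing_kept)
```

Sharing survives only when both structures belong to the same serialized closure, which is also why small changes force the whole closure to be rewritten.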

27

C. Explicit paging: paging objects between the application heap and the persistent store.

Requires the conversion of object pointers from a disk-based scheme to a memory-based scheme.

There are two common methods for creating/updating persistent objects:

a. Reachability-based: an object will persist if it is reachable from a persistent root object

at any time after creation, an object can become persistent by adding it to the reachability tree.

Garbage collection: deletes objects when they are no longer accessible from any other object

Smalltalk, Java
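Reachability-based persistence amounts to a graph traversal from the persistent root. A minimal Python sketch, with a hypothetical PObject class and object names chosen for illustration:

```python
class PObject:
    """Hypothetical object with references to other objects."""

    def __init__(self, name):
        self.name = name
        self.refs = []


def reachable_from(root):
    # Depth-first traversal of the object graph starting at the root.
    seen, stack = set(), [root]
    while stack:
        obj = stack.pop()
        if id(obj) in seen:
            continue
        seen.add(id(obj))
        stack.extend(obj.refs)
    return seen


root = PObject("persistent_root")
branch = PObject("branch")
staff = PObject("staff")
scratch = PObject("scratch")      # never linked to the root

root.refs.append(branch)
branch.refs.append(staff)
staff.refs.append(branch)         # cycles are handled by the seen set

persistent = reachable_from(root)
print(id(staff) in persistent, id(scratch) in persistent)
```

Objects outside the returned set, such as scratch, are the ones a garbage collector would be free to delete.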

28

b. Allocation-based: an object is explicitly declared as being persistent within the application program

i) By class: a class is statically declared to be persistent --> all instances of the class are made persistent when they are created

a class may be a subclass of a system-supplied persistent class

Ontos, Objectivity/DB

ii) By explicit call: an object may be specified as persistent when it is created or, in some cases, dynamically at runtime (added to a persistent collection)

ObjectStore

29

Alternatively, to provide persistence in a programming language: orthogonal persistence, based on the following principles:

1. Persistence independence: the persistence of a data object is independent of how the program manipulates the data object and conversely, a fragment of the program is expressed independently of the persistence of data it manipulates.

2. Data type orthogonality: all data objects should be allowed the full range of persistence irrespective of their type: PS-algol, Napier88, Galileo, GemStone

In contrast, in some languages persistence is a quality attributable only to a subset of the language data types: Pascal/R, Amber, E, Avalon/C++

3. Transitive persistence: the choice of how to identify and provide persistent objects at the language level is independent of the choice of data types in the language. Most used technique: reachability-based.

30

Orthogonal persistence:

Advantages:

1. There is no need to define long-term data in a separate schema language

2. No special application code is required to access or update persistent data

3. There is no limit to the complexity of the data structures that can be made persistent

4. Improved programmer productivity from simpler semantics

5. Improved maintenance

6. Consistent protection mechanisms over the whole environment

7. Support for incremental evolution

8. Automatic referential integrity

31

Issues:

2. Pointer Swizzling Techniques:

the action of converting object identifiers (OIDs) to main memory pointers, and back again

Aim: to optimize access to objects.

Obvious approach: to hold a lookup table that maps OIDs to main memory pointers

Pointer swizzling: stores the main memory pointers in the place of the referenced OIDs, and vice versa when the object has to be written back to disk

32

A. No swizzling: the OID is used every time the object is accessed

the system maintains a lookup table, so that the object’s virtual memory pointer can be located and then used to access the object.

Could be inefficient if the same objects are accessed repeatedly

Could be acceptable if applications access an object once

B. Object referencing: to be able to swizzle a persistent object’s OID to a virtual memory pointer, a mechanism is required to distinguish between resident and non-resident objects.

Most techniques are variations of edge marking or node marking (Hosking & Moss, 1993):

33

Virtual memory is considered to be a directed graph, with objects as nodes and references as directed edges:

1. Edge marking marks every object pointer with a tag bit.

If the bit is set, then the reference is to a virtual memory pointer

Otherwise, it is still pointing to an OID and needs to be swizzled when the object it refers to is faulted into the application’s memory space.

2. Node marking requires that all object references are immediately converted to virtual pointers when the object is faulted into memory.

1 is a software-based technique;

2 can be implemented using software or hardware-based techniques.
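Edge marking can be sketched in Python as a reference slot that carries a tag and is swizzled on first use. The ObjectStore and Ref classes, the OID format and the fault counter are all assumptions made for the illustration; real systems work on raw pointers rather than Python objects.

```python
class ObjectStore:
    """Hypothetical persistent store: maps OIDs to stored attribute values."""

    def __init__(self, disk):
        self.disk = disk
        self.resident = {}   # OID -> in-memory object (the lookup table)
        self.faults = 0

    def fault_in(self, oid):
        if oid not in self.resident:
            self.faults += 1
            self.resident[oid] = dict(self.disk[oid])
        return self.resident[oid]


class Ref:
    """A reference slot with a tag: swizzled (direct pointer) or still an OID."""

    def __init__(self, oid):
        self.oid = oid
        self.target = None
        self.swizzled = False   # the tag bit of edge marking

    def deref(self, store):
        if not self.swizzled:
            # First access: go through the OID, then replace it with a pointer.
            self.target = store.fault_in(self.oid)
            self.swizzled = True
        return self.target      # later accesses skip the lookup entirely


store = ObjectStore({"oid:1": {"bno": "B5", "city": "Sidcup"}})
ref = Ref("oid:1")
first = ref.deref(store)
second = ref.deref(store)
print(first is second, store.faults, ref.swizzled)
```

With no swizzling, every deref would go through the lookup table; here only the first one does.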

34

C. Hardware-based schemes: use virtual memory access protection violations to detect accesses of non-resident objects (Lamb91)

Use the standard virtual memory hardware to trigger the transfer of persistent data from disk to main memory.

Once a page has been faulted in, objects are accessed on that page via normal virtual memory pointers.

The hardware approach avoids the overhead of residency checks incurred by software approaches, but limits the amount of data that can be accessed during a transaction to the size of virtual memory and complicates other issues, like recovery, fine-grained locking, and so on.

ObjectStore, Texas

35

Issues:

3. Transactions

in classical DBMSs: short duration transactions

in CAD, CASE,…: long duration transactions (hours, days)

a need for new protocols:

nested transactions, sagas, multi-level transactions.

4. Versions: Ontos, Versant, ObjectStore, Objectivity/DB, Itasca

object version = an identifiable state of an object

version history = the evolution of an object

version management = object references always point to the correct version of an object

36

Types of versions:

1. Transient version: unstable, can be updated and deleted

it can be created from new, by checking out a released version from a public database, or by deriving it from a working or transient version in a private database, in which case the base transient version is promoted to a working version. It is always stored in the creator’s private workspace.

2. Working version: stable and cannot be updated but it can be deleted by its creator. It is stored in the creator’s private workspace.

3. Released version: stable, cannot be updated or deleted.

it is stored in a public database by checking in a working version from a private database
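The three version types behave like a small state machine. The sketch below is one possible reading of the rules above; the Version class and its method names (update, promote, check_in, derive) are hypothetical and do not come from any of the listed products.

```python
from enum import Enum


class State(Enum):
    TRANSIENT = "transient"
    WORKING = "working"
    RELEASED = "released"


class Version:
    def __init__(self, data):
        self.data = data
        self.state = State.TRANSIENT   # new versions start out unstable

    def update(self, data):
        if self.state is not State.TRANSIENT:
            raise PermissionError(f"{self.state.value} version cannot be updated")
        self.data = data

    def can_delete(self):
        return self.state is not State.RELEASED

    def promote(self):
        # transient -> working; no effect on a version that is already stable
        if self.state is State.TRANSIENT:
            self.state = State.WORKING

    def check_in(self):
        # working -> released (moved to the public database)
        if self.state is not State.WORKING:
            raise PermissionError("only a working version can be checked in")
        self.state = State.RELEASED

    def derive(self):
        # Deriving from a transient version promotes the base to working.
        self.promote()
        return Version(self.data)


v1 = Version("design A")
v1.update("design A, rev 2")
v2 = v1.derive()          # v1 becomes a working version, v2 is transient
v1.check_in()             # v1 is now released
print(v1.state.value, v2.state.value)
```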

37

Issues:

5. Schema evolution: design is an incremental process.

To support this process, applications require flexibility in dynamically defining and modifying the database schema.

Typical changes to the schema include:

changes to the class definition:

modifying attributes, modifying methods

changes to the inheritance hierarchy:

making a class S the superclass of a class C,

removing a class S from the list of superclasses of C,

modifying the order of superclasses of C

changes to the set of classes:

creating and deleting classes, modifying class names

38

Client - Server Architectures:

1. Object server: distributes the processing between the two components

Server process: responsible for managing storage, locks, commits to secondary storage, logging, recovery, security, query optimization and executing stored procedures

Client process: responsible for transaction management and interfacing to the programming languages

2. Page server: most of the database processing is performed by the client.

Server process: responsible for secondary storage and for providing pages at the client’s request

39

3. Database server: most of the database processing is performed by the server.

Client process: passes requests to the server, receives results and passes them on to the application.

Used by relational DBMS

In each case, the server resides on the same machine as the physical database.

The client may reside on the same or different machine.

If the client needs access to databases distributed across multiple machines, then the client communicates with a server on each machine.

There may be a number of clients communicating with one server: for example, one client for each user or application.

40

Advantages of OODBMSs:

enriched modeling capabilities

extensibility

removal of impedance mismatch

more expressive query language

support for schema evolution

support for long duration transactions

applicability to advanced database applications

improved performance

41

Disadvantages of OODBMSs:

lack of universal data model

lack of experience

lack of standards

query optimization compromises encapsulation

locking at object level may impact performance

complexity

lack of support for views

lack of support for security

42

Object Database Standard ODMG 2.0 1997

Object Database Management Group proposed an OODM consisting of:

1. An object model

2. An object definition language (ODL) (like traditional DDL)

3. An object query language, with a SQL-like syntax

ODMG object model is a superset of the Object Management Group (OMG) object model.

1990: OMG published its Object Management Architecture (OMA) Guide document.

It specified a single terminology for oo languages, systems, databases and applications.

43

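Figure: the Object Management Architecture (OMA). The Object Request Broker links application objects, common facilities and object services; labels shown include WP, spreadsheet, CAD, help, email and browser, and the object services storage, transaction management, queries, versioning and security.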

44

1. The Object Model-- OM

is a design-portable abstract model for communicating with OMG-compliant object-oriented systems

a requester sends a request for object services to the ORB, which keeps track of all the objects in the system and the types of services they can provide

the ORB then forwards the message to a provider, who acts on the message and passes a response back to the requester via the ORB

Figure: requester <--> ORB <--> provider
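The request flow above can be mimicked with a toy in-process broker. This is only a sketch of the idea: the ORB class, its register and request methods and the service name are invented here and are not the CORBA API.

```python
class ORB:
    """Toy in-process broker; a real ORB also handles distribution."""

    def __init__(self):
        self._providers = {}   # service name -> provider callable

    def register(self, service, provider):
        # The broker keeps track of which provider offers which service.
        self._providers[service] = provider

    def request(self, service, *args):
        # The requester never talks to the provider directly.
        if service not in self._providers:
            raise LookupError(f"no provider for {service!r}")
        return self._providers[service](*args)


orb = ORB()
orb.register("branch.city", lambda bno: {"B5": "Sidcup"}.get(bno))
reply = orb.request("branch.city", "B5")
print(reply)
```

The requester only knows the service name, so a provider can be replaced or relocated without changing the requester.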

45

2. The Object Request Broker -- ORB

handles distribution of messages between application objects

is a distributed ‘software bus’ that enables objects (requesters) to make and receive requests and responses from a provider

on receipt of a response from the provider, the ORB translates the response into a form the original requester can understand

--> provides a mechanism by which objects make and receive requests and responses transparently

--> interoperability between applications in a heterogeneous distributed environment

46

3. The Object Services --OS

provide the main functions for realizing basic object functionality

collection: a uniform way to create and manipulate most common collections generically:

sets, queues, stacks, lists, binary trees

concurrency control: a lock manager that enables multiple clients to coordinate their access to shared resources

event management: allows components to dynamically register or unregister their interest in specific events

externalization: provides protocols and conventions for externalizing and internalizing objects.

47

externalization: records the state of an object as a stream of data (in memory, on disk, across network)

internalization: creates a new object from it in a different process

licensing: operations for metering the use of components to ensure fair compensation for their use, and protect intellectual property

lifecycle: operations for creating, copying, moving, and deleting groups of related objects

naming: facilities to bind a name to an object relative to a naming context

persistence: interfaces to mechanisms for storing and managing objects persistently

property: operations to associate named values (properties) with any (external) component

48

query: declarative query statements with predicates, the ability to invoke operations and other object services

relationship: a way to create dynamic associations between components that know nothing of each other

security: services such as identification and authentication, authorization and access control, auditing, security of communication, non-repudiation, administration

time: maintains a single notion of time across different machines

trader: a matchmaking service for objects. It allows objects to dynamically advertise their services, and other objects to register for a service.

transactions: a two-phase commit coordination among recoverable components using flat or nested transactions

49

4. The Common Facilities --CF

comprise a set of tasks that many applications must perform but are traditionally duplicated within each one.

they are made available through OMA-compliant class interfaces

in the latest version: CF are split in

horizontal common facilities (printing, electronic mail, and so on) and

vertical domain facilities (finance, healthcare, manufacturing, e-commerce, transportation, telecommunications)

50

The Common Object Request Broker Architecture -- CORBA

defines the architecture of ORB-based environments

is the basis of any OMG component, defining the parts that form the ORB and its associated structure

1991: CORBA 1.1 defined:

Interface Definition Language (IDL)

Application Programming Interfaces (API) - enable client-server interaction with a specific implementation of an ORB

December 1994: CORBA 2.0 improved interoperability

specified how ORBs from different vendors can interoperate

1997: CORBA 2.1

51

Main elements:

IDL: permits the description of class interfaces independent of any particular DBMS or programming language

a type model that defines the values that can be passed over the network.

an Interface Repository, which provides information on interfaces and types, and is used to construct dynamic runtime requests, by the Dynamic Invocation Interface

Methods for getting the interfaces and specifications of objects

Methods for transforming OIDs to and from strings

From the IDL definitions, CORBA objects can be mapped into particular programming languages, such as C, C++, Smalltalk and Java. This produces interface stubs within the application programming language (client) that are used to invoke the requests. The same stubs are used on the object implementation side (server) to create skeletons, which are completed to provide the requested behavior.

52

The ODMG Object Model

Vendors: GemStone Systems, Object Design, O2 Technology, Versant Object Technology, UniSQL, POET Software, Objectivity, IBEX Computing SA, Lockheed Martin

formed Object Database Management Group (ODMG)

It produced an object model that specifies a standard model for the semantics of database objects.

The model is important because it determines the built-in semantics that the OODBMS understands and can enforce

The design of class libraries and applications that use these semantics should be portable across the various OODBMSs that support the object model.

53

The major components of the ODMG for an OODBMS are:

1. Object model--OM

2. Object definition language --ODL

3. Object query language -- OQL

4. C++ language bindings

5. Smalltalk language bindings

6. Java language bindings

Initial ODMG standard: 1993

Major version: ODMG 2.0 september 1997

54

1. The Object Model --OM

ODMG object model is a superset of the OMG object model

enables both designs and implementations to be ported between compliant systems

Basic modeling primitives: the object and the literal.

Objects and literals can be categorized in types: all objects of a given type exhibit common behavior and state. A type is an object.

Behavior is defined by a set of operations that can be performed on or by an object.

State is defined by the values an object carries for a set of properties

A property may be either an attribute or a relationship between the object and one or more other objects.

55

Literal_type

Atomic_type: long, short, unsigned long, unsigned short, float, double, boolean, octet, char, string, enum < >

Collection_literal: set < >, bag < >, list < >, array < >, dictionary < >

Structured_literal: date, time, timestamp, interval, structure < >

56

Object_type

Atomic_object

Collection_object: Set < >, Bag < >, List < >, Array < >, Dictionary < >

Structured_object: Date, Time, Timestamp, Interval

57

A database stores objects, enabling them to be shared by multiple users and applications.

A database is based on a schema that is defined in ODL. The database contains instances of the types defined by its schema.

Object types are: atomic, collection or structured types.

Types shown in italics are abstract types. Types shown in normal are directly instantiable. They are the only base types.

Types with < > indicate type generators.

Objects are created using the new() method of the corresponding factory interface provided by the language binding interface.

All objects have an ODL interface which is implicitly inherited by the definition of all user-defined objects:

58

interface Object {

enum Lock_Type {read, write, upgrade};

exception LockNotGranted {};

void lock(in Lock_Type mode) raises (LockNotGranted);

boolean try_lock(in Lock_Type mode);

boolean same_as(in Object anObject);

Object copy();

void delete(); };

Each object has a unique identity, OID, which does not change and is not reused when the object is deleted.

In addition, an object may be given one or more meaningful user names

Objects can be transient or persistent.

59

Literals : atomic, collections, structured, null

The values of a literal’s properties may not change.

Literals do not have their own OID and cannot stand alone as objects: they are embedded in objects

Structured literals contain a fixed number of named heterogeneous elements of the form: < name , value >, where value may be any literal type.

struct Address {

string street;

string area;

string city;

string post_code; };

attribute Address branch_address;
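A rough Python analogue of a structured literal is a frozen dataclass: it has no identity of its own that matters, its values cannot change, and it is compared by value. The Branch wrapper and the placeholder post code are assumptions; the field names follow the Address struct above.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Address:
    street: str
    area: str
    city: str
    post_code: str


class Branch:
    def __init__(self, bno, branch_address):
        self.bno = bno
        self.branch_address = branch_address   # literal embedded in an object


# "N/A" is a placeholder; the slides give no post code.
addr = Address(street="12 Deer St", area="London", city="Sidcup", post_code="N/A")
branch = Branch("B5", addr)

try:
    addr.city = "Dyce"       # the values of a literal may not change
    changed = True
except FrozenInstanceError:
    changed = False

same_value = addr == Address(street="12 Deer St", area="London", city="Sidcup", post_code="N/A")
print(changed, same_value)
```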

60

Collections: contain an arbitrary number of unnamed homogeneous elements, each of which can be an instance of an atomic type, a collection or a literal type

There are ordered and unordered collections. Ordered collections must be traversed first to last or vice versa; unordered collections have no fixed order of iteration.

Set: unordered collections that do not allow duplicates

Bag: unordered collections that do allow duplicates

List: ordered collections that allow duplicates

Array: one-dimensional array of dynamically varying length

Dictionary: unordered sequence of key-value pairs with no duplicate keys

Each subtype has operations to create an instance of the type and insert an element into the collection. Sets and Bags have the usual set operations: union, intersection and difference.
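The collection types above have close counterparts in Python's standard library, which makes their differences easy to try out. The mapping is an analogy, not part of ODMG, and the staff numbers and job titles are made up.

```python
from collections import Counter

staff_set = {"SG5", "SG14", "SG5"}            # Set: duplicates collapse
staff_bag = Counter(["SG5", "SG14", "SG5"])   # Bag: duplicates are counted
staff_list = ["SG5", "SG14", "SG5"]           # List: ordered, duplicates kept
staff_dict = {"SG5": "Manager", "SG14": "Assistant"}  # Dictionary: unique keys

print(len(staff_set), sum(staff_bag.values()), staff_list[0], staff_dict["SG5"])

# Usual set operations on sets: union, intersection, difference.
a, b = {1, 2, 3}, {3, 4}
print(a | b, a & b, a - b)

# Their multiset counterparts on bags work on element counts.
x, y = Counter("aab"), Counter("abc")
print(x | y, x & y, x - y)
```

A Python list also covers the Array case, since its length can vary dynamically.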

61

interface Collection : Object {

exception InvalidCollectionType{};

exception ElementNotFound{any element;};

unsigned long cardinality();

boolean is_empty();

boolean is_ordered();

boolean allows_duplicates();

boolean contains_element(in any element);

void insert_element(in any element);

void remove_element(in any element)

raises (ElementNotFound);

Iterator create_iterator(in boolean stable);

BidirectionalIterator create_bidirectional_iterator(in boolean stable)

raises (InvalidCollectionType); };

ODL interface for collections

62

A type has a specification and one or more implementations.

The (external)specification defines the properties and operations that can be invoked on instances of the type.

An implementation defines data structures, exceptions and methods that operate on the data structures to support the required state and behavior.

Class: The combination of a type specification and an implementation.

An interface definition is a specification that defines only the abstract behavior of an object type: supertypes, extent and keys.

A literal definition defines only the abstract state of a literal type.

63

Properties: in ODMG object model: attributes and relationships

Attributes: an attribute is defined on a single object type

is not a “first class” object (is not an object)--> no OID

its value is a literal or an OID

Relationships: only binary and are defined between types

cardinality: 1:1, 1:M, M:N

is not a “first class” object, does not have a name

traversal paths are defined in the interface for each direction of traversal

on the many side: objects can be unordered (set, bag) or ordered (list). OODBMS maintains referential integrity.

64

Example: a Branch Has a set of Staff and a member of Staff WorksAt a Branch:

interface Branch {

relationship set<Staff> Has inverse Staff::WorksAt; };

interface Staff {

relationship Branch WorksAt inverse Branch::Has; };

The model has built-in operations to form and to drop members from relationships and to manage the required referential integrity constraints

attribute Branch WorksAt;

void form_WorksAt(in Branch aBranch);

void drop_WorksAt(in Branch aBranch);
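What the form and drop operations have to do to keep both traversal paths consistent can be sketched in Python. In an ODMG system this bookkeeping is done by the OODBMS; the classes and method bodies below are an illustrative assumption, with names adapted from the Has / WorksAt example.

```python
class Branch:
    def __init__(self, bno):
        self.bno = bno
        self.has = set()          # traversal path Has: set of Staff


class Staff:
    def __init__(self, name):
        self.name = name
        self.works_at = None      # traversal path WorksAt: one Branch

    def form_works_at(self, branch):
        # Forming one side also forms the inverse side.
        if self.works_at is not None:
            self.drop_works_at(self.works_at)
        self.works_at = branch
        branch.has.add(self)

    def drop_works_at(self, branch):
        # Dropping one side also drops the inverse side.
        if self.works_at is branch:
            branch.has.discard(self)
            self.works_at = None


b5, b7 = Branch("B5"), Branch("B7")
susan = Staff("Susan Deer")
susan.form_works_at(b5)
susan.form_works_at(b7)           # moving branches keeps both sides in step
print(susan in b5.has, susan in b7.has, susan.works_at is b7)
```

Because every update goes through form_works_at or drop_works_at, neither side can end up pointing at a relationship the other side does not know about.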

65

2. The Object Definition Language --ODL

is a specification language for defining the specifications of object types for ODMG-compliant systems.

facilitates portability of schemas between compliant systems

defines the attributes and relationships of types

specifies the signature of the operations, but does not address their implementation

the syntax of ODL extends the IDL (Interface Definition Language) of the CORBA

will be the basis for integrating schemas from multiple sources and applications

66

3. The Object Query Language (OQL)

provides declarative access to the object database using an SQL-like syntax.

does not provide explicit update operators, but leaves this to the operations defined on object types.

can be used as a standalone language or embedded in another language (currently C++, Smalltalk, Java).

can invoke operations programmed in these languages

An OQL query is a function that delivers an object whose type may be inferred from the operators contributing to the query expression.

67

Query definition expression:

DEFINE Q AS e /* defines a query with name Q, given a query expression e */

1. Elementary expressions:

• an atomic literal: 10, 17.5, 'c', "qwerty", false, nil

• a named object:

• an iterator variable from the FROM clause of the SELECT-FROM-WHERE:

e as x or e x or x in e

where, if e is of type collection(T), then x is of type T

• a query definition expression (Q above)

68

2. Construction expression:

• If T is a type name with properties p1, p2, …, pn and e1, e2, …, en are expressions, then T(p1 : e1, p2 : e2, …, pn : en) is an expression of type T.

Example: Branch(bno : "B22", manager : "Susan Brand")

• Similarly, we can construct expressions using struct, set, list, bag and array:

struct (bno : "B22", street : "166 Main ST")

is an expression which dynamically creates an instance of this type

69

3. Atomic Type Expressions

• Expressions can be formed using the standard unary and binary operations on expressions.

• If S is a string, expressions can be formed using:

the string concatenation operation ( || or + )

a string offset S[i], meaning the (i+1)th character of the string

S[low : up], meaning the substring of S from the (low+1)th to the (up+1)th character

c in S (where c is a char), returning a boolean expression

S like pattern. The pattern may contain the characters ? or _ , meaning any character, or the wildcard characters * or %, meaning any substring. Returns a boolean expression
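The string operations above map closely onto ordinary string handling. The following Python sketch is only an illustration of the semantics described on this slide, not part of OQL; the function names oql_like and oql_substring are hypothetical. Note that the OQL substring includes the character at position up, whereas a Python slice excludes its upper bound.

```python
import re


def oql_like(s, pattern):
    """Emulate 'S like pattern': ? and _ match one character, * and % any substring."""
    parts = []
    for ch in pattern:
        if ch in "?_":
            parts.append(".")            # exactly one character
        elif ch in "*%":
            parts.append(".*")           # any substring, possibly empty
        else:
            parts.append(re.escape(ch))  # every other character matches itself
    return re.fullmatch("".join(parts), s, flags=re.DOTALL) is not None


def oql_substring(s, low, up):
    """Emulate S[low : up]: zero-based offsets, both ends included."""
    return s[low:up + 1]


city = "London"
third_char = city[2]                     # S[2] is the third character
starts_with_lon = oql_like(city, "Lon%")
```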

70

4. Object Expressions

• Expressions can be formed using the equality and inequality operations ( = and != ), returning a boolean.

• If e is an expression of a type having an attribute or a relationship P of the type T, then e.P and e->P are expressions of type T.

• In the same way, methods can be invoked to return an expression.

• If a method has no parameters, the brackets in the method call can be omitted.

71

5. Collection Expressions

Expressions can be formed using:

universal quantification: for all

existential quantification: exists

membership testing: in

select clause: select from where

sort-by operator: sort

unary set operators: min, max, count, sum, avg

group-by operator: group

The format of the SELECT clause is similar to the standard SQL SELECT clause:

72

SELECT [DISTINCT] <expression>

FROM <from_list>

[WHERE <expression>]

[GROUP BY <attributes> [HAVING <predicate>]]

[ORDER BY <expression>]

Where:

<from_list>::= <variable_name> IN <expression> |

<variable_name> IN <expression>, <from_list> |

<expression> AS <variable_name> |

<expression> AS <variable_name>, <from_list>

The result of a SELECT DISTINCT query is a set.

The result of a SELECT query is a bag.
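The difference between the two result types can be seen on a small in-memory collection. This is a Python sketch under stated assumptions: the sample staff data and the helper name select are invented for illustration, a bag is modelled with collections.Counter, and a set with the built-in set.

```python
from collections import Counter

# Hypothetical sample data standing in for the extent "staff".
staff = [
    {"lname": "White", "city": "London"},
    {"lname": "Brand", "city": "London"},
    {"lname": "White", "city": "London"},
    {"lname": "Beech", "city": "Glasgow"},
]


def select(expr, from_list, where=lambda x: True, distinct=False):
    """Evaluate expr for every x in from_list that satisfies where."""
    values = [expr(x) for x in from_list if where(x)]
    # DISTINCT removes duplicates (a set); otherwise duplicates are kept (a bag).
    return set(values) if distinct else Counter(values)


def in_london(x):
    return x["city"] == "London"


as_bag = select(lambda x: x["lname"], staff, where=in_london)
as_set = select(lambda x: x["lname"], staff, where=in_london, distinct=True)
```

Here as_bag records that "White" occurs twice, while as_set holds each name once.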

73

6. Indexed Collections Expressions

• If e1 and e2 are lists or arrays and e3 and e4 are integers, then e1[e3], e1[e3:e4], first(e1), last(e1) and (e1 + e2) are expressions

7. Binary Set Expressions

• If e1 and e2 are sets or bags, then the set operators union, except and intersect of e1 and e2 are expressions.

8. Structure Expressions

• If e is an expression and p is a property name, then e.p and e->p are expressions, which extract the property p of an object e.

74

9. Conversion Expressions

If e is an expression, then element(e) is an expression that checks e is a singleton, raising an exception if it is not.

If e is a list expression, then listtoset(e) is an expression that converts the list into a set.

If e is a collection-valued expression, then flatten(e) is an expression that converts a collection of collections into a collection, that is, it flattens the structure.

If e is an expression and c is a type name, then c(e) is an expression that asserts e is an object of type c, raising an exception if it is not.

10. Object Expressions

If e is an expression and f is an operation, then e.f and e->f are expressions that apply an operation to an object. The operation can optionally take a number of expressions as parameters.
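The conversion expressions element, listtoset and flatten described on this slide behave roughly like the following Python helpers. This is only a sketch of the semantics: the choice of ValueError as the exception and of a list as the result of flatten are assumptions, since in OQL the result type depends on the collection types involved.

```python
def element(collection):
    """Return the single member of a singleton collection, else raise."""
    items = list(collection)
    if len(items) != 1:
        raise ValueError("element() expects a collection with exactly one member")
    return items[0]


def listtoset(lst):
    """Convert a list into a set, dropping order and duplicates."""
    return set(lst)


def flatten(collection_of_collections):
    """Turn a collection of collections into a single collection."""
    return [x for inner in collection_of_collections for x in inner]


only_branch = element(["B22"])
unique_cities = listtoset(["London", "Glasgow", "London"])
all_staff = flatten([["White", "Brand"], ["Beech"]])
```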

75

A query consists of a (possibly empty) set of query definition expressions followed by an expression.

The result of a query is an object with or without identity.

Examples:

A. get the set of all staff (with identity)

staff

B. get the set of all branch managers (with identity):

branch_offices.ManagedBy

76

C. get the set of all staff who live in London (without identity):

define Londoners as

select x

from x in staff

where x.address.city = "London";

select x.name.lname from x in Londoners

returns a literal of type bag<string> (set<string> if SELECT DISTINCT is used)

D. get the structured set (without identity) containing name, sex, and age for all staff who live in London:

select struct (lname:x.name.lname, sex:x.sex, age:x.age)

from x in staff

where x.address.city = "London"

returns a literal of type bag<struct> (set<struct> if SELECT DISTINCT is used)

77

E. get the structured set (with identity) containing name, sex, and age for all deputy managers over 60:

type deputies {attribute

lname : string; sex: string; age : integer;}

deputies (select

struct ( lname:x.name.lname,

sex:x.sex,

age:x.age)

from x in (select y from y in staff

where y.position = "Deputy")

where x.age > 60)

78

F. get a structured set (without identity) containing branch number and the set of all Assistants at the branches in London:

select struct (bno:x.bno,

assistants: (select y from y in x.Has

where y.position = "Assistant"))

from x in

(select z from z in branch_offices

where z.address.city = "London")

Objects without identity are created using struct (see D and F).
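For readers more used to programming languages than to query languages, the nested query in example F can be read as a nested comprehension. The Python sketch below is an illustration only: the sample data is invented, the address is flattened to a single city attribute, and the has field stands for the Has traversal path of the Branch interface. Each struct is modelled as a plain dictionary, which has no identity of its own.

```python
from dataclasses import dataclass, field


@dataclass
class Staff:
    lname: str
    position: str


@dataclass
class Branch:
    bno: str
    city: str                                # simplification of address.city
    has: list = field(default_factory=list)  # traversal path Has


# Hypothetical extent of branches.
branch_offices = [
    Branch("B5", "London", [Staff("White", "Manager"), Staff("Beech", "Assistant")]),
    Branch("B7", "Aberdeen", [Staff("Ford", "Assistant")]),
]

# Inner generator: branches in London. Outer comprehension: one struct per branch.
result = [
    {"bno": x.bno,
     "assistants": [y for y in x.has if y.position == "Assistant"]}
    for x in (z for z in branch_offices if z.city == "London")
]
```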