Unit 1


Page 1: Unit 1

Advanced RDBMS

UNIT - I

Topics:

– Concepts for Object-Oriented Databases

– Object Identity, Object Structure and Type Constructors

– ODMG (Object Data Management Group)

– Object Definition Language (ODL)

– Object Query Language (OQL)

– Overview of C++ Language Binding

– Object Database Conceptual Design

– Overview of the CORBA Standard for Distributed Objects

– Object Relational and Extended Relational Database Systems: The Informix Universal Server, Object Relational Features of Oracle 8, An Overview of SQL

1.0 Introduction

In an object-oriented database, information is represented in the form of objects, as in object-oriented programming. When database capabilities are combined with object programming language capabilities, the result is an object database management system (ODBMS). An ODBMS makes database objects appear as programming language objects in one or more object programming languages. It extends the programming language with transparently persistent data, concurrency control, data recovery, associative queries, and other capabilities.

Object database management systems grew out of research during the early to mid-1980s into having intrinsic database management support for graph-structured objects. The term "object-oriented database system" first appeared around 1985.

Object database management systems added the concept of persistence to object programming languages. The early commercial products were integrated with various languages: GemStone (Smalltalk), Gbase (Lisp), and Vbase (COP). COP was the C Object Processor, a proprietary language based on C that pre-dated C++. For much of the 1990s, C++ dominated the commercial object database management market. Vendors added Java in the late 1990s and, more recently, C#.

Starting in 2004, object databases saw a second growth period when open source object databases such as db4objects and Perst (McObject) emerged. These were widely affordable and easy to use because they are written entirely in OOP languages like Java or C#.

Benchmarks between ODBMSs and relational DBMSs have shown that an ODBMS can be clearly superior for certain kinds of tasks. The main reason for this is that many operations are performed using navigational rather than declarative interfaces, and navigational access to data is usually implemented very efficiently by following pointers.

Critics of navigational database technologies such as ODBMSs suggest that pointer-based techniques are optimized for very specific "search routes" or viewpoints. For general-purpose queries on the same information, however, pointer-based techniques tend to be slower and more difficult to formulate than relational ones. Thus, navigational access appears to simplify specific known uses at the expense of general, unforeseen, and varied future uses.

Other factors that work against ODBMSs include the lack of interoperability with many tools and features that are taken for granted in the SQL world, including but not limited to industry-standard connectivity, reporting tools, OLAP tools, and backup and recovery standards. Additionally, object databases lack a formal mathematical foundation, unlike the relational model, and this in turn leads to weaknesses in their query support. However, this objection is offset by the fact that some ODBMSs fully support SQL in addition to navigational access, e.g. Objectivity/SQL++ and Matisse. Effective use may require compromises to keep both paradigms in sync.

In fact there is an intrinsic tension between the notion of encapsulation, which hides data and makes it available only through a published set of interface methods, and the assumption underlying much database technology, which is that data should be accessible to queries based on data content rather than predefined access paths. Database-centric thinking tends to view the world through a declarative and attribute-driven viewpoint, while OOP tends to view the world through a behavioral viewpoint. This is one of the many impedance mismatch issues surrounding OOP and databases.

Although some commentators have written off object database technology as a failure, the essential arguments in its favor remain valid, and attempts to integrate database functionality more closely into object programming languages continue in both the research and the industrial communities.

1.1 Objectives

The objective of this lesson is to learn the Object-Oriented database concepts with respect to Object Identity, Object Structure, Object Databases Standards, Language and Design and Overview of CORBA.

1.2 Content

1.2.1 Concepts for Object-Oriented Databases

A database is a logical term used to refer to a collection of organized and related information. In a business, for example, the information held about customers, products, prices and so on makes up a database. Data is just data until it is organized in a meaningful way, at which point it becomes information.


Through a Database Management System (DBMS), one can insert, update, delete, and view the records in an existing file.

Traditional Data Models : Hierarchical, Network (since mid-60’s), Relational (since 1970 and commercially since 1982).

Object Oriented (OO) Data Models since mid-90’s.

Reasons for creation of Object Oriented Databases

– Need for more complex applications

– Need for additional data modeling features

– Increased use of object-oriented programming languages

Commercial OO Database products – several in the 1990’s, but did not make much impact on mainstream data management

Languages: Simula (1960’s), Smalltalk (1970’s), C++ (late 1980’s), Java (1990’s)

Experimental Systems: Orion at MCC, IRIS at H-P labs, Open-OODB at T.I., ODE at ATT Bell labs, Postgres - Montage - Illustra at UC/B, Encore/Observer at Brown.

Commercial OO Database products: Ontos, Gemstone, O2 ( -> Ardent), Objectivity, Objectstore ( -> Excelon), Versant, Poet, Jasmine (Fujitsu – GM).

1.2.2 Overview of Object Oriented Concepts.

MAIN CLAIM: OO databases try to maintain a direct correspondence between real-world and database objects so that objects do not lose their integrity and identity and can easily be identified and operated upon.

Object: An object has two components: state (value) and behavior (operations). It is similar to a program variable in a programming language, except that it will typically have a complex data structure as well as specific operations defined by the programmer.

In OO databases, objects may have an object structure of arbitrary complexity in order to contain all of the necessary information that describes the object.

In contrast, in traditional database systems, information about a complex object is often scattered over many relations or records, leading to loss of direct correspondence between a real-world object and its database representation.

The internal structure of an object in OOPLs includes the specification of instance variables, which hold the values that define the internal state of the object.


An instance variable is similar to the concept of an attribute, except that instance variables may be encapsulated within the object and thus are not necessarily visible to external users

Some OO models insist that all operations a user can apply to an object must be predefined. This forces a complete encapsulation of objects.

To encourage encapsulation, an operation is defined in two parts:

– The signature or interface of the operation specifies the operation name and arguments (or parameters).

– The method or body specifies the implementation of the operation.

Operations can be invoked by passing a message to an object, which includes the operation name and the parameters. The object then executes the method for that operation.

This encapsulation permits modification of the internal structure of an object, as well as the implementation of its operations, without the need to disturb the external programs that invoke these operations.
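The split between signature and method can be sketched in Python. This is only an illustration: the class names, the operations add_emp and no_of_emps, and the send helper are hypothetical and do not come from any particular OO database. The external program sends messages named in the interface and keeps working when the hidden internal structure changes.

```python
from abc import ABC, abstractmethod


class DepartmentInterface(ABC):
    """Signatures only: operation names and parameters, no implementation."""

    @abstractmethod
    def add_emp(self, name):
        """Add an employee to the department."""

    @abstractmethod
    def no_of_emps(self):
        """Return the number of employees."""


class ListDepartment(DepartmentInterface):
    """Methods (bodies) written over one hidden structure: a list."""

    def __init__(self):
        self._employees = []  # hidden instance variable

    def add_emp(self, name):
        self._employees.append(name)

    def no_of_emps(self):
        return len(self._employees)


class SetDepartment(DepartmentInterface):
    """Same signatures, different hidden structure: a set."""

    def __init__(self):
        self._members = set()  # hidden instance variable

    def add_emp(self, name):
        self._members.add(name)

    def no_of_emps(self):
        return len(self._members)


def send(obj, operation, *params):
    """Pass a message: the operation name plus its parameters."""
    return getattr(obj, operation)(*params)


def external_program(dept):
    # The caller knows only the interface, never the hidden attributes.
    send(dept, "add_emp", "Smith")
    send(dept, "add_emp", "Wong")
    return send(dept, "no_of_emps")
```

Calling external_program with either implementation gives the same answer, which is the point of hiding the internal structure behind the signatures.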

Some OO systems provide capabilities for dealing with multiple versions of the same object (a feature that is essential in design and engineering applications).

For example, an old version of an object that represents a tested and verified design should be retained until the new version is tested and verified. This is crucial for designs in manufacturing process control, architecture, and software systems.

Operator polymorphism: This refers to an operation’s ability to be applied to different types of objects. In such a situation, an operation name may refer to several distinct implementations, depending on the type of objects it is applied to. This feature is also called operator overloading.
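As a small illustration of operator overloading, the Python sketch below binds the single operator name + to three implementations: the built-in ones for integers and strings, and one for a hypothetical Vector type introduced here only for the example.

```python
class Vector:
    """Hypothetical 2-D vector type with its own implementation of +."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        # Implementation of + used when the left operand is a Vector.
        return Vector(self.x + other.x, self.y + other.y)

    def __eq__(self, other):
        return isinstance(other, Vector) and (self.x, self.y) == (other.x, other.y)

    def __repr__(self):
        return f"Vector({self.x}, {self.y})"


def combine(a, b):
    # One operator name; the implementation is chosen by the operand type.
    return a + b
```

Here combine(2, 3) uses integer addition, combine("ab", "cd") uses string concatenation, and combine(Vector(1, 2), Vector(3, 4)) uses the Vector method.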

1.2.3 Object identity, Object Structure and Type constructors

Unique Identity: An OO database system provides a unique identity to each independent object stored in the database. This unique identity is typically implemented via a unique, system-generated object identifier, or OID.

The main property required of an OID is that it be immutable; that is, the OID value of a particular object should not change. This preserves the identity of the real-world object being represented.
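A minimal sketch of system-generated, immutable OIDs follows. The ObjectStore class and its method names are assumptions made for the illustration, not the API of a real ODBMS. Two objects with the same state still receive different OIDs, and updating an object changes its state but never its OID.

```python
import itertools


class ObjectStore:
    """Toy in-memory store that hands out system-generated OIDs."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._states = {}

    def create(self, state):
        oid = f"i{next(self._counter)}"  # chosen by the system, not the user
        self._states[oid] = state
        return oid

    def update(self, oid, new_state):
        if oid not in self._states:
            raise KeyError(oid)
        self._states[oid] = new_state  # the state changes, the OID does not

    def state(self, oid):
        return self._states[oid]


store = ObjectStore()
a = store.create({"name": "Smith", "city": "Houston"})
b = store.create({"name": "Smith", "city": "Houston"})  # same state, new OID
store.update(a, {"name": "Smith", "city": "Bellaire"})
```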


Type Constructors: In OO databases, the state (current value) of a complex object may be constructed from other objects (or other values) by using certain type constructors. The three most basic constructors are atom, tuple, and set. Other commonly used constructors include list, bag, and array. The atom constructor is used to represent all basic atomic values, such as integers, real numbers, character strings, Booleans, and any other basic data types that the system supports directly.

Example 1, one possible relational database state corresponding to COMPANY schema


We use i1, i2, i3, . . . to stand for unique system-generated object identifiers. Consider the following objects:

o1 = (i1, atom, ‘Houston’)
o2 = (i2, atom, ‘Bellaire’)
o3 = (i3, atom, ‘Sugarland’)
o4 = (i4, atom, 5)
o5 = (i5, atom, ‘Research’)
o6 = (i6, atom, ‘1988-05-22’)
o7 = (i7, set, {i1, i2, i3})
o8 = (i8, tuple, <dname:i5, dnumber:i4, mgr:i9, locations:i7, employees:i10, projects:i11>)
o9 = (i9, tuple, <manager:i12, manager_start_date:i6>)
o10 = (i10, set, {i12, i13, i14})
o11 = (i11, set, {i15, i16, i17})
o12 = (i12, tuple, <fname:i18, minit:i19, lname:i20, ssn:i21, . . ., salary:i26, supervisor:i27, dept:i8>)

The first six objects listed in this example represent atomic values. Object seven is a set-valued object that represents the set of locations for department 5; the set refers to the atomic objects with values {‘Houston’, ‘Bellaire’, ‘Sugarland’}. Object 8 is a tuple-valued object that represents department 5 itself, and has the attributes DNAME, DNUMBER, MGR, LOCATIONS, and so on.
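The (OID, constructor, value) triples can be mimicked with a Python dictionary. The sketch below is an assumption-laden illustration: it keeps only the atom, set, and tuple constructors, stores a subset of the objects above (the department tuple is cut down to three attributes), and uses a hypothetical resolve function to expand an OID into a nested value.

```python
# OID -> (constructor, value); references to other objects are OIDs.
db = {
    "i1": ("atom", "Houston"),
    "i2": ("atom", "Bellaire"),
    "i3": ("atom", "Sugarland"),
    "i4": ("atom", 5),
    "i5": ("atom", "Research"),
    "i7": ("set", {"i1", "i2", "i3"}),
    # Reduced version of o8: only attributes whose targets are defined here.
    "i8": ("tuple", {"dname": "i5", "dnumber": "i4", "locations": "i7"}),
}


def resolve(oid):
    """Expand an object into plain values by following OID references."""
    constructor, value = db[oid]
    if constructor == "atom":
        return value
    if constructor == "set":
        return frozenset(resolve(member) for member in value)
    if constructor == "tuple":
        return {attr: resolve(ref) for attr, ref in value.items()}
    raise ValueError(f"unknown constructor: {constructor}")


department = resolve("i8")
```

Resolving i8 walks from the tuple to the atoms and the set it refers to, which mirrors how the state of department 5 is built from other objects.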

The next example illustrates the difference between the two definitions for comparing object states for equality. Two objects have identical states if their states are the same in every respect, including the OIDs they refer to. Two objects have equal states if their states have the same structure and the same atomic values, even though those values may be reached through objects with different OIDs.

o1 = (i1, tuple, <a1:i4, a2:i6>)
o2 = (i2, tuple, <a1:i5, a2:i6>)
o3 = (i3, tuple, <a1:i4, a2:i6>)
o4 = (i4, atom, 10)
o5 = (i5, atom, 10)
o6 = (i6, atom, 20)

In this example, the objects o1 and o2 have equal states, since their states at the atomic level are the same, but the values are reached through distinct objects o4 and o5. However, the states of objects o1 and o3 are identical, even though the objects themselves are not, because they have distinct OIDs. Similarly, although the states of o4 and o5 are identical, the actual objects o4 and o5 are equal but not identical, because they have distinct OIDs.
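The two comparisons can be sketched over the six objects of this example. The function names identical_states and equal_states are chosen for the illustration, and the sketch handles only the atom and tuple constructors and assumes no cyclic references.

```python
# OID -> (constructor, value), following the example objects o1 to o6.
db = {
    "i1": ("tuple", {"a1": "i4", "a2": "i6"}),
    "i2": ("tuple", {"a1": "i5", "a2": "i6"}),
    "i3": ("tuple", {"a1": "i4", "a2": "i6"}),
    "i4": ("atom", 10),
    "i5": ("atom", 10),
    "i6": ("atom", 20),
}


def identical_states(a, b):
    """Same constructor and same value, including the OIDs referred to."""
    return db[a] == db[b]


def equal_states(a, b):
    """Same structure and same atomic values; internal OIDs may differ."""
    kind_a, value_a = db[a]
    kind_b, value_b = db[b]
    if kind_a != kind_b:
        return False
    if kind_a == "atom":
        return value_a == value_b
    # Tuple: same attribute names, referenced objects compared recursively.
    return value_a.keys() == value_b.keys() and all(
        equal_states(value_a[attr], value_b[attr]) for attr in value_a
    )
```

With these definitions, i1 and i3 have identical states, i1 and i2 have equal but not identical states, and i4 and i5 have identical states while remaining distinct objects because their OIDs differ.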

1.2.4 Encapsulation of Operations, Methods and Persistence

Encapsulation: One of the main characteristics of OO languages and systems. It is related to the concepts of abstract data types and information hiding in programming languages.

Specifying Object Behavior via Class Operations: The main idea is to define the behavior of a type of object based on the operations that can be externally applied to objects of that type. In general, the implementation of an operation can be specified in a general-purpose programming language that provides flexibility and power in defining the operations.

For database applications, the requirement that all objects be completely encapsulated is too stringent. One way of relaxing this requirement is to divide the structure of an object into visible and hidden attributes (instance variables).

Adding operations to definitions of Employee and Department

Specifying Object Persistence via Naming and Reachability:

– Naming Mechanism: Assign an object a unique persistent name through which it can be retrieved by this and other programs.

– Reachability Mechanism: Make the object reachable from some persistent object. An object B is said to be reachable from an object A if a sequence of references in the object graph leads from object A to object B.

In traditional database models such as the relational model or the EER model, all objects are assumed to be persistent. In the OO approach, a class declaration specifies only the type and operations for a class of objects. The user must separately define a persistent object of type set (DepartmentSet) or list (DepartmentList) whose value is the collection of references to all persistent DEPARTMENT objects.

Creating persistent objects by naming and reachability:

define class DepartmentSet:
    type set(Department);
    operations
        add_dept(d: Department): Boolean;
            (* adds a department to the DepartmentSet object *)
        remove_dept(d: Department): Boolean;
            (* removes a department from the DepartmentSet object *)
        create_dept_set: DepartmentSet;
        destroy_dept_set: Boolean;
end DepartmentSet;
…
persistent name AllDepartments: DepartmentSet;
(* AllDepartments is a persistent named object of type DepartmentSet *)
…
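The reachability rule can be sketched as a graph traversal: an object counts as persistent if it can be reached from a named persistent object by following references. In the Python sketch below, the object names, the references dictionary, and the persistent_oids function are hypothetical and stand in for whatever bookkeeping a real system does.

```python
def persistent_oids(named_roots, references):
    """Return every object reachable from a named persistent object."""
    seen = set()
    stack = list(named_roots.values())
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue  # already visited; also guards against cycles
        seen.add(oid)
        stack.extend(references.get(oid, ()))
    return seen


# Object graph: each object lists the objects it refers to.
references = {
    "all_departments": ["d_research", "d_admin"],
    "d_research": ["e_smith"],
    "d_admin": [],
    "e_smith": ["d_research"],  # reference back to the department
    "d_scratch": ["e_temp"],    # never added to AllDepartments
    "e_temp": [],
}

# Naming mechanism: one persistent name bound to one object.
named_roots = {"AllDepartments": "all_departments"}

persistent = persistent_oids(named_roots, references)
```

In this sketch d_scratch and e_temp stay transient because no chain of references leads to them from AllDepartments, while e_smith becomes persistent through d_research.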

1.2.5 Type Hierarchies and Inheritance

Type (class) Hierarchy

A type in its simplest form can be defined by giving it a type name and then listing the names of its visible (public) functions. When specifying a type in this section, we use the following format, which does not specify arguments of functions, to simplify the discussion:

TYPE_NAME: function, function, . . . , function

Example: PERSON: Name, Address, Birthdate, Age, SSN

Subtype: A subtype is useful when the designer or user must create a new type that is similar but not identical to an already defined type. The already defined type is the supertype, and the subtype inherits all the functions of its supertype.

Example (1):

EMPLOYEE: Name, Address, Birthdate, Age, SSN, Salary, HireDate, Seniority
STUDENT: Name, Address, Birthdate, Age, SSN, Major, GPA

OR:

EMPLOYEE subtype-of PERSON: Salary, HireDate, Seniority
STUDENT subtype-of PERSON: Major, GPA

Example (2): Consider a type that describes objects in plane geometry, which may be defined as follows:

GEOMETRY_OBJECT: Shape, Area, ReferencePoint

Now suppose that we want to define a number of subtypes for the GEOMETRY_OBJECT type, as follows:

RECTANGLE subtype-of GEOMETRY_OBJECT: Width, Height
TRIANGLE subtype-of GEOMETRY_OBJECT: Side1, Side2, Angle
CIRCLE subtype-of GEOMETRY_OBJECT: Radius

An alternative way of declaring these three subtypes is to specify the value of the Shape attribute as a condition that must be satisfied for objects of each subtype:

RECTANGLE subtype-of GEOMETRY_OBJECT (Shape=‘rectangle’): Width, Height
TRIANGLE subtype-of GEOMETRY_OBJECT (Shape=‘triangle’): Side1, Side2, Angle
CIRCLE subtype-of GEOMETRY_OBJECT (Shape=‘circle’): Radius
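The subtype-of notation can be read as a small lookup rule: the functions of a type are the functions of its supertype followed by its own. The Python sketch below encodes some of the declarations from the two examples; the TYPES table and the functions_of helper are names invented for the illustration.

```python
# type name -> (supertype or None, functions declared on the type itself)
TYPES = {
    "PERSON": (None, ["Name", "Address", "Birthdate", "Age", "SSN"]),
    "EMPLOYEE": ("PERSON", ["Salary", "HireDate", "Seniority"]),
    "STUDENT": ("PERSON", ["Major", "GPA"]),
    "GEOMETRY_OBJECT": (None, ["Shape", "Area", "ReferencePoint"]),
    "RECTANGLE": ("GEOMETRY_OBJECT", ["Width", "Height"]),
}


def functions_of(type_name):
    """All visible functions of a type: inherited ones first, then its own."""
    supertype, own = TYPES[type_name]
    inherited = functions_of(supertype) if supertype is not None else []
    return inherited + own
```

functions_of("EMPLOYEE") yields the same list as the flat EMPLOYEE definition in Example (1), which shows that the two ways of writing the type are equivalent.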

Extents: In most OO databases, the collection of objects in an extent has the same type or class. However, since the majority of OO databases support types, we assume that extents are collections of objects of the same type for the remainder of this section.

Persistent Collection: It holds a collection of objects that is stored permanently in the database and hence can be accessed and shared by multiple programs

Transient Collection: It exists temporarily during the execution of a program but is not kept when the program terminates

1.2.6 Complex Objects


Unstructured complex object: It is provided by a DBMS and permits the storage and retrieval of large objects that are needed by the database application.

Typical examples of such objects are bitmap images and long text strings (documents); they are also known as binary large objects, or BLOBs for short.

This has been the standard way by which Relational DBMSs have dealt with supporting complex objects, leaving the operations on those objects outside the RDBMS

Structured complex object: It differs from an unstructured complex object in that the object’s structure is defined by repeated application of the type constructors provided by the OODBMS. Hence, the object structure is defined and known to the OODBMS. The OODBMS also defines methods or operations on it.

1.2.7 Other Object-Oriented Concepts

Object Databases Standards

Why is a standard needed? A standard for an object model addresses the following aspects:

Portability: execute an application program on different systems with minimal modifications to the program.

Interoperability

The ODMG standard covers the object model, the object definition language (ODL), the object query language (OQL), and the bindings to object-oriented programming languages.

The Object Model explains the data model upon which ODL and OQL are based. It also provides data types and type constructors. In this it plays a role similar to that of the SQL report, which describes a standard data model for relational databases. The difference between an object and a literal is that a literal has only a value but no object identifier. An object has four characteristics:

– identifier

– name

– lifetime (persistent or not)

– structure (how to construct)

Object Database Language

a. Object Definition Language (ODL)

An Object Definition Language is designed to support the semantic constructs of the ODMG data model. It is independent of any programming language and is used to create object specifications such as classes and interfaces, and to specify a database schema.

b. Object Query Language (OQL)

The Object Query Language:

– can be embedded into one of the programming languages for which a binding is defined

– returns objects that match the type system of that language

– is similar to SQL, with additional features (object identity, complex objects, operations, inheritance, polymorphism, relationships)

c. OQL Entry Points and Iterator Variables

An entry point is a named persistent object (for many queries, it is the name of the extent of a class). An iterator variable is used when a collection is referenced in an OQL query.

d. OQL Query Results and Path Expressions

Any named persistent object is itself a query, whose result is a reference to that persistent object. A path expression is used to specify a path to related attributes and objects once an entry point is specified.

e. OQL Collection Operators

OQL collection operators include aggregate operators such as min, max, count, sum, and avg.
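The OQL notions of entry point, iterator variable, path expression, and aggregate operator have close analogues in ordinary collection code. The Python sketch below is only an analogy and is not OQL itself; the Employee and Department classes, their attributes, and the sample data are assumptions made for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Employee:
    name: str
    salary: int


@dataclass
class Department:
    dname: str
    college: str
    chair: Employee
    staff: list = field(default_factory=list)


smith = Employee("Smith", 90)
wong = Employee("Wong", 70)
zelaya = Employee("Zelaya", 60)

# Entry point: a named collection standing in for the extent of Department.
departments = [
    Department("Computer Science", "Engineering", smith, [smith, wong]),
    Department("History", "Arts", zelaya, [zelaya]),
]

# Roughly: select d.chair.name from d in departments
#          where d.college = 'Engineering'
# d is the iterator variable; d.chair.name is a path expression.
engineering_chairs = [d.chair.name for d in departments if d.college == "Engineering"]

# Aggregate operators applied to a collection-valued result.
salaries = [e.salary for d in departments for e in d.staff]
summary = {
    "count": len(salaries),
    "min": min(salaries),
    "max": max(salaries),
    "sum": sum(salaries),
    "avg": sum(salaries) / len(salaries),
}
```

The comprehension plays the role of the select ... from ... where form, and the attribute chain d.chair.name follows references from the entry point to a related object and then to one of its attributes.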

Object Database Conceptual Design

Object database conceptual design has to take into account how each kind of system represents relationships:

– ODB: relationships are handled by OID references to the related objects.

– RDB: relationships among tuples are specified by attributes with matching values (value references).

– ORDBMS: enhances the capabilities of an RDBMS with some of the features of an ODBMS.
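The contrast between OID references and value references can be sketched as follows. Python object references stand in for OIDs here, which is an assumption of the illustration, and the class, attribute, and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Department:
    dname: str


@dataclass
class Employee:
    name: str
    dept: Department  # ODB style: a direct reference to the related object


research = Department("Research")
smith = Employee("Smith", research)

# RDB style: tuples related only by matching attribute values.
department_rows = [
    {"dnumber": 5, "dname": "Research"},
    {"dnumber": 4, "dname": "Administration"},
]
employee_rows = [{"name": "Smith", "dno": 5}]


def dept_name_by_reference(employee):
    # Navigate: follow the reference held by the employee object.
    return employee.dept.dname


def dept_name_by_value(employee_row):
    # Join: search for the department tuple whose key matches dno.
    for row in department_rows:
        if row["dnumber"] == employee_row["dno"]:
            return row["dname"]
    return None
```

Both functions return the same department name, but the first follows a stored reference while the second has to match values across two collections.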

Other Concepts of Object Database

Polymorphism (Operator Overloading): This concept allows the same operator name or symbol to be bound to two or more different implementations of the operator, depending on the type of objects to which the operator is applied.

Multiple Inheritance and Selective Inheritance: Multiple inheritance in a type hierarchy occurs when a certain subtype T is a subtype of two (or more) types and hence inherits the functions (attributes and methods) of both supertypes. For example, we may create a subtype ENGINEERING_MANAGER that is a subtype of both MANAGER and ENGINEER. This leads to the creation of a type lattice rather than a type hierarchy.
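The ENGINEERING_MANAGER example can be sketched with Python classes. The common supertype Employee and the methods manages and designs are assumptions added for the illustration. The method resolution order that Python computes is one linearization of the resulting type lattice.

```python
class Employee:
    def __init__(self, name):
        self.name = name


class Manager(Employee):
    def manages(self):
        return f"{self.name} manages a department"


class Engineer(Employee):
    def designs(self):
        return f"{self.name} designs systems"


class EngineeringManager(Manager, Engineer):
    """Subtype of both MANAGER and ENGINEER; inherits from each."""


em = EngineeringManager("Wong")

# Both supertypes share Employee, so the types form a lattice (a diamond).
lattice_order = [cls.__name__ for cls in EngineeringManager.__mro__]
```

An EngineeringManager object answers both manages() and designs(), and Employee appears only once in lattice_order even though it is reachable through both supertypes.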

Versions and Configurations: Many database applications that use OO systems require the existence of several versions of the same object. There may be more than two versions of an object.

Configuration: A configuration of the complex object is a collection consisting of one version of each module arranged in such a way that the module versions in the configuration are compatible and together form a valid version of the complex object.

ODMG (Object Data Management Group)

Release 2.0 of the ODMG Standard differs from Release 1.2 in a number of ways. With the wide acceptance of Java, we added a Java persistence standard in addition to the existing Smalltalk and C++ ones. We made the ODMG object model much more comprehensive, added a meta object interface, defined an object interchange format, and worked to make the programming language bindings consistent with the common model. We made changes throughout the specification based on several years of experience implementing the standard in object database products. As with Release 1.2, we expect future work to be backward compatible with Release 2.0. Although we expect a few changes to come, for example to the Java binding, the Standard should now be reasonably stable.

The major components of ODMG 2.0 are:

Object Model. We have used the OMG Object Model as the basis for our model. The OMG core model was designed to be a common denominator for object request brokers, object database systems, object programming languages, and other applications. In keeping with the OMG Architecture, we have designed an ODBMS profile for the model, adding components (relationships) to the OMG core object model to support our needs. Release 2.0 introduces a meta model.

The Object Data Management Group (ODMG) was a consortium of object database and object-relational mapping vendors, members of the academic community, and interested parties. Its goal was to create a set of specifications that would allow for portable applications that store objects in database management systems. It published several versions of its specification. The last release was ODMG 3.0. By 2001, most of the major object database and object-relational mapping vendors claimed conformance to the ODMG Java Language Binding. Compliance to the other components of the specification was mixed. In 2001, the ODMG Java Language Binding was submitted to the Java Community Process as a basis for the Java Data Objects specification. The ODMG member companies then decided to concentrate their efforts on the Java Data Objects specification. As a result, the ODMG disbanded in 2001.

Many object database ideas were also absorbed into SQL:1999 and have been implemented in varying degrees in object-relational database products.

In 2005 Cook, Rai, and Rosenberger proposed to drop all standardization efforts to introduce additional object-oriented query APIs but rather use the OO programming language itself, i.e., Java and .NET, to express queries. As a result, Native Queries emerged. Similarly, Microsoft announced Language Integrated Query (LINQ) and DLINQ, an implementation of LINQ, in September 2005, to provide close, language-integrated database query capabilities with its programming languages C# and VB.NET 9.

In February 2006, the Object Management Group (OMG) announced that they had been granted the right to develop new specifications based on the ODMG 3.0 specification and the formation of the Object Database Technology Working Group (ODBT WG). The ODBT WG plans to create a set of standards that incorporates advances in object database technology (e.g., replication), data management (e.g., spatial indexing), and data formats (e.g., XML) and to include new features into these standards that support domains in real-time systems where object databases are being adopted

Object Definition Language (ODL)

Let’s take a look at something that comes closer to bearing a relationship to our everyday programming. Whether you generate your applications or code them, you need some way to describe your object model. The goal of this Object Definition Language (ODL) is to capture enough information to be able to generate the majority of most SMB web apps directly from a set of statements in the language . . .

Here is a rough cut of ODL along with comments. This is very much a work in progress. Now that I have a meta-grammar and a concrete syntax for describing languages, I can start to write the languages I have been playing with. I will then build up to those languages in the framework so that the framework can consume metadata that can be transformed automatically from ODL, allowing for the automatic generation of most of my code. Expect to see BIG changes in this grammar as I combine “top down” and “bottom up” programming, write some real world applications and see where everything meets in the middle!

Most importantly, we have objects that are comprised of 1..n attributes and that may or may not have relationships. This is the high level UML model kind of stuff. Note that ODL is describing functional metadata, so an object would be “Article” – not “ArticleService” or “ArticleDAO” which are implementation decisions and would be generated from the Article metadata automatically.

Object Query Language (OQL)

Before turning to example queries, we digress into the built-in functions supported in OQL. The functions described here belong to an OQL dialect for querying Java heap dumps, in which expressions are written in JavaScript. The built-in functions fall into the following categories:

Functions that operate on individual Java objects:

1. sizeof(o) -- returns the size of the Java object in bytes

2. objectid(o) -- returns the unique id of the Java object

3. classof(o) -- returns the Class object for the given Java object

4. identical(o1, o2) -- returns (boolean) whether the two given objects are identical or not (essentially objectid(o1) == objectid(o2)). Do not use simple JavaScript reference comparison for Java objects!

5. referrers(o) -- returns an array of the objects referring to the given Java object

6. referees(o) -- returns an array of the objects referred to by the given Java object

7. reachables(o) -- returns an array of the objects directly or indirectly referred to from the given Java object (the transitive closure of the referees of the given object)

Functions that operate on arrays:

1. contains(array, expr) -- returns whether the array contains an element that satisfies the given expression. The expression can refer to the built-in variable 'it', which is the current object being iterated.

2. count(array, [expr]) -- returns the number of elements satisfying the given expression

3. filter(array, expr) -- returns a new array containing the elements that satisfy the given expression

4. map(array, expr) -- returns a new array that contains the results of applying the given expression to each element of the input array

5. sort(array, [expr]) -- sorts the given array. Optionally accepts a comparison expression to use; if none is given, sort uses numerical comparison.

6. sum(array) -- sums all elements of the array

As you can see, most of the array functions accept an expression, which can refer to the current object through the variable it. This allows operating on arrays without writing loops: the built-in functions loop through the array and apply the expression to each element.
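The looping behavior of these array built-ins can be imitated in Python. The sketch below is an analogy only: in the query language the expression is given as text that refers to it, whereas here it is a Python function whose argument plays the role of it, and the names filter_array and map_array are chosen to avoid clashing with Python built-ins.

```python
def contains(array, expr):
    """True if some element satisfies expr."""
    return any(expr(it) for it in array)


def count(array, expr=None):
    """Number of elements satisfying expr (all elements if expr is omitted)."""
    if expr is None:
        return len(array)
    return sum(1 for it in array if expr(it))


def filter_array(array, expr):
    """New list with the elements that satisfy expr."""
    return [it for it in array if expr(it)]


def map_array(array, expr):
    """New list with expr applied to each element."""
    return [expr(it) for it in array]


sizes = [16, 24, 8, 40]
large = filter_array(sizes, lambda it: it > 16)
doubled = map_array(sizes, lambda it: it * 2)
```

Each helper hides the loop, so the caller supplies only the per-element expression.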

There is also a built-in object called heap, which provides various useful methods.

Now, let us see some interesting queries.

Select all objects referred to by a SoftReference:

select f.referent from java.lang.ref.SoftReference f where f.referent != null

referent is a private field of the java.lang.ref.SoftReference class (actually a field inherited from java.lang.ref.Reference; you may use javap -p to find these). We filter out the SoftReferences that have been cleared (i.e., those whose referent is null).

Show referents that are not referred to by another object, i.e., the referent is reachable only through that soft reference:

select f.referent from java.lang.ref.SoftReference f where f.referent != null && referrers(f.referent).length == 1

Note the use of the referrers built-in function to find the referrers of a given object. Because referrers returns an array, the result supports the length property.

Let us refine the above query. We want to find all objects that are referred to only by soft references, but we do not care how many soft references refer to them, i.e., we allow more than one soft reference to refer to the same object.

select f.referent from java.lang.ref.SoftReference f where f.referent != null && filter(referrers(f.referent), "classof(it).name != 'java.lang.ref.SoftReference'").length == 0

Note that the filter function filters the referrers array using a boolean expression. In the filter condition we check that the class name of the referrer is not java.lang.ref.SoftReference. If the filtered array contained at least one element, we would know that f.referent is referred to from some object that is not of type java.lang.ref.SoftReference, so the query keeps only referents for which the filtered array is empty.

Find all finalizable objects (i.e., objects of some class that overrides the java.lang.Object.finalize() method):

select f.referent from java.lang.ref.Finalizer f where f.referent != null

How does this work? When an instance of a class that overrides the finalize() method is created (a potentially finalizable object), the JVM registers the object by creating an instance of java.lang.ref.Finalizer. The referent field of that Finalizer object refers to the newly created "to be finalized" object. (This depends on an implementation detail!)

Find all finalizable objects and approximate size of the heap retained because of those.

select { obj: f.referent, size: sum(map(reachables(f.referent), "sizeof(it)")) } from java.lang.ref.Finalizer f where f.referent != null

This certainly looks complex, but it is actually simple. A JavaScript object literal is used to select multiple values in the select expression (the obj and size properties). reachables finds the objects reachable from a given object. map creates a new array from the input array by applying the given expression to each element; the map call in this query creates an array of the sizes of each reachable object. The sum built-in adds all elements of the array. So we get the total size of the objects reachable from the given object (f.referent in this case). Why do I say approximate size? The HPROF binary heap dump format does not account for the actual bytes used in a live JVM. Instead, it uses sizes just large enough to hold the data, for example 2 bytes for a char, 1 byte for a boolean, and so on. A JVM, however, may align smaller data types such as 'char' and use 4 bytes instead of 2. JVMs also tend to use one or two header words with each object. None of this is accounted for in the HPROF dump file.
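The computation sum(map(reachables(...), "sizeof(it)")) can be sketched on a toy object graph. Everything in the Python sketch below is an assumption made for the illustration: the graph, the sizes, and the choice to exclude the starting object from its own reachable set.

```python
def reachables(referees, start):
    """Objects directly or indirectly referred to from start.

    The start object itself is excluded, which is an assumption of this sketch.
    """
    seen = set()
    stack = list(referees.get(start, ()))
    while stack:
        obj = stack.pop()
        if obj in seen or obj == start:
            continue
        seen.add(obj)
        stack.extend(referees.get(obj, ()))
    return seen


def approx_retained_size(referees, sizes, start):
    """Sum of the recorded sizes of everything reachable from start."""
    return sum(sizes[obj] for obj in reachables(referees, start))


# Hypothetical heap: f refers to a and b, which both refer to c.
referees = {"f": ["a", "b"], "a": ["c"], "b": ["c"], "c": []}
sizes = {"f": 32, "a": 16, "b": 24, "c": 8}

total = approx_retained_size(referees, sizes, "f")
```

The shared object c is counted once because the traversal keeps a set of visited objects, and the result is only as accurate as the recorded sizes.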


1.2.8 Overview of C++ Language Binding

The C++ binding to ODBMSs includes a version of the ODL that uses C++ syntax, a mechanism to invoke OQL, and procedures for operations on databases and transactions.

The Object Definition Language (ODL) is the declarative portion of C++ ODL/OML. The C++ binding of ODL is expressed as a library that provides classes and functions to implement the concepts defined in the ODMG object model. OML is a language used for retrieving objects from the database and modifying them. The C++ OML syntax and semantics are those of standard C++ in the context of the standard class library.

ODL/OML specifies only the logical characteristics of objects and the operations used to manipulate them. It does not discuss the physical storage of objects. It does not address the clustering or memory management issues associated with the stored physical representation of objects or access structures. In an ideal world, these would be transparent to the programmer. In the real world, they are not. An additional set of constructs called "physical pragmas" is defined to give the programmer some direct control over these issues, or at least to enable a programmer to provide "hints" to the storage management subsystem provided as part of the ODBMS run time. Physical pragmas exist within the ODL and OML. They are added to object type definitions specified in ODL, expressed as OML operations, or shown as optional arguments to operations defined within OML.

These pragmas are not in any sense stand-alone languages, but rather a set of constructs added to ODL/OML to address implementation issues.

The programming-language-specific bindings for ODL/OML are based on one basic principle -- that the programmer feels that there is one language, not two separate languages with arbitrary boundaries between them.

The ODMG Smalltalk binding is based upon two principles -- it should bind to Smalltalk in a natural way that is consistent with the principles of the language, and it should support language interoperability consistent with ODL specification and semantics. We believe that organizations specifying their objects in ODL will insist that the Smalltalk binding honor those specifications. These principles have several implications that are evident in the design of the binding:

* There is a unified type system that is shared by Smalltalk and the ODBMS. This type system is ODL as mapped into Smalltalk by the Smalltalk binding.

* The binding respects the Smalltalk syntax, meaning the Smalltalk language will not have to be modified to accommodate this binding. ODL concepts will be represented using normal Smalltalk coding conventions.

* The binding respects the fact that Smalltalk is dynamically typed. Arbitrary Smalltalk objects may be stored persistently, including ODL-specified objects, which will obey the ODL typing semantics.

* The binding respects the dynamic memory-management semantics of Smalltalk. Objects will become persistent when they are referenced by other persistent objects in the database, and will be removed when they are no longer reachable in this manner.

As with other language bindings, ODMG Java binding is based on one fundamental principle -- the programmer should perceive the binding as a single language for expressing both database and programming operations, not two separate languages with arbitrary boundaries between them. This principle has several corollaries:

* There is a single, unified type system shared by the Java language and the object database; individual instances of these common types can be persistent or transient.

* The binding respects the Java language syntax, meaning that the Java language will not have to be modified to accommodate this binding.

* The binding respects the automatic storage management semantics of Java. Objects will become persistent when they are referenced by other persistent objects in the database, and will be removed when they are no longer reachable in this manner.
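The rule that objects become persistent when they are referenced by other persistent objects amounts to a graph traversal from the database roots. The following is a minimal sketch of that idea, not the ODMG API: the Node class and the reachableFrom helper are hypothetical names introduced for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

final class ReachabilitySketch {

    // Hypothetical stand-in for an application object that references others.
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    // Returns every object reachable from the given roots. Under
    // persistence by reachability, these are the objects to store.
    static Set<Node> reachableFrom(List<Node> roots) {
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node current = pending.pop();
            if (seen.add(current)) {
                pending.addAll(current.refs);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Node root = new Node("root");
        Node child = new Node("child");
        Node orphan = new Node("orphan");
        root.refs.add(child);
        child.refs.add(root); // cycles are handled by the seen set

        Set<Node> persistent = reachableFrom(List.of(root));
        System.out.println(persistent.contains(child));  // true
        System.out.println(persistent.contains(orphan)); // false
    }
}
```

The orphan object is never referenced from a root, so it stays transient; an object that loses its last reference from the persistent graph would drop out of the set in the same way.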

The Java binding provides persistence by reachability, like the ODMG Smalltalk binding (this has also been called "transitive persistence"). On database commit, all objects reachable from database root objects are stored in the database.

The Java binding provides two ways to declare persistence-capable Java classes:

* Existing Java classes can be made persistence capable.

* Java class declarations (as well as a database schema) may automatically be generated by a preprocessor for ODMG ODL.

One possible ODMG implementation that supports these capabilities would be a postprocessor that takes as input the Java .class file (bytecodes) produced by the Java compiler, then produces new modified bytecodes that support persistence. Another implementation would be a preprocessor that modifies Java source before it goes to the Java compiler. Another implementation would be a modified Java interpreter.

We want a binding that allows all of these possible implementations. Because Java does not have all the hooks we might desire, and the Java binding must use standard Java syntax, it is necessary to distinguish special classes understood by the database system. These classes are called persistence-capable classes. They can have both persistent and transient instances. Only instances of these classes can be made persistent. The current version of the standard does not define how a Java class becomes a persistence-capable class.

Object Database Conceptual Design

Traditional Data Models : Hierarchical, Network (since mid-60’s), Relational (since 1970 and commercially since 1982)

Object Oriented (OO) Data Models since mid-90’s

Reasons for creation of Object Oriented Databases:

– Need for more complex applications

– Need for additional data modeling features

– Increased use of object-oriented programming languages

Commercial OO Database products – several in the 1990’s, but did not make much impact on mainstream data management

MAIN CLAIM: OO databases try to maintain a direct correspondence between real-world and database objects so that objects do not lose their integrity and identity and can easily be identified and operated upon

Object: Two components: state (value) and behavior (operations). Similar to program variable in programming language, except that it will typically have a complex data structure as well as specific operations defined by the programmer

In OO databases, objects may have an object structure of arbitrary complexity in order to contain all of the necessary information that describes the object.

In contrast, in traditional database systems, information about a complex object is often scattered over many relations or records, leading to loss of direct correspondence between a real-world object and its database representation

The internal structure of an object in OOPLs includes the specification of instance variables, which hold the values that define the internal state of the object.

An instance variable is similar to the concept of an attribute, except that instance variables may be encapsulated within the object and thus are not necessarily visible to external users

Some OO models insist that all operations a user can apply to an object must be predefined. This forces a complete encapsulation of objects.

To encourage encapsulation, an operation is defined in two parts:

1. signature or interface of the operation, specifies the operation name and arguments (or parameters).

2. method or body, specifies the implementation of the operation.

Operations can be invoked by passing a message to an object, which includes the operation name and the parameters. The object then executes the method for that operation.

This encapsulation permits modification of the internal structure of an object, as well as the implementation of its operations, without the need to disturb the external programs that invoke these operations
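The split between signature and method maps naturally onto a Java interface and a class that implements it. The sketch below is an assumed example (the Account and LedgerAccount names are invented here) showing that callers depend only on the signature, so the hidden instance variable could be replaced without touching them.

```java
import java.util.ArrayList;
import java.util.List;

final class EncapsulationSketch {

    // Signature (interface): operation names and parameters only.
    interface Account {
        void deposit(long cents);

        long balance();
    }

    // Method (body): one possible implementation. The instance variable
    // is private, so the representation can change without affecting callers.
    static final class LedgerAccount implements Account {
        private final List<Long> entries = new ArrayList<>();

        @Override
        public void deposit(long cents) {
            entries.add(cents);
        }

        @Override
        public long balance() {
            return entries.stream().mapToLong(Long::longValue).sum();
        }
    }

    public static void main(String[] args) {
        Account account = new LedgerAccount();
        account.deposit(250); // the "message": operation name plus parameter
        account.deposit(100);
        System.out.println(account.balance()); // 350
    }
}
```

Replacing the list of entries with a single running total would change only LedgerAccount; the main method, which plays the role of an external program, would compile and behave the same.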

Some OO systems provide capabilities for dealing with multiple versions of the same object (a feature that is essential in design and engineering applications).


1. For example, an old version of an object that represents a tested and verified design should be retained until the new version is tested and verified.

2. This is crucial for designs in manufacturing process control, architecture, software systems, and similar domains.

Operator polymorphism: It refers to an operation’s ability to be applied to different types of objects; in such a situation, an operation name may refer to several distinct implementations, depending on the type of objects it is applied to. This feature is also called operator overloading.

Unique Identity: An OO database system provides a unique identity to each independent object stored in the database. This unique identity is typically implemented via a unique, system-generated object identifier, or OID.

The main property required of an OID is that it be immutable; that is, the OID value of a particular object should not change. This preserves the identity of the real-world object being represented
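A small Java sketch can make the immutability requirement concrete. It assumes a simple counter as the OID generator and an invented Employee class; a real OO database generates and manages OIDs internally.

```java
import java.util.concurrent.atomic.AtomicLong;

final class OidSketch {

    // Hypothetical OID generator: a counter that never reuses a value.
    private static final AtomicLong NEXT_OID = new AtomicLong(1);

    static final class Employee {
        final long oid; // immutable, assigned once by the "system"
        String name;    // state may change freely

        Employee(String name) {
            this.oid = NEXT_OID.getAndIncrement();
            this.name = name;
        }

        // Identity is decided by the OID, not by the current state.
        boolean sameObjectAs(Employee other) {
            return this.oid == other.oid;
        }
    }

    public static void main(String[] args) {
        Employee a = new Employee("Smith");
        Employee b = new Employee("Smith"); // same state, different identity
        a.name = "Jones";                   // a state change keeps the OID
        System.out.println(a.sameObjectAs(a)); // true
        System.out.println(a.sameObjectAs(b)); // false
    }
}
```

Because the oid field is final, no update to the object's state can alter its identity, which is the property the text asks of an OID.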

Type Constructors: In OO databases, the state (current value) of a complex object may be constructed from other objects (or other values) by using certain type constructors.

The three most basic constructors are atom, tuple, and set. Other commonly used constructors include list, bag, and array. The atom constructor is used to represent all basic atomic values, such as integers, real numbers, character strings, Booleans, and any other basic data types that the system supports directly.
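As a rough analogy, the constructors correspond to familiar Java building blocks: primitive and string values for atom, a set of named components for tuple, and the collection types for set and list. The sketch below is an assumption-laden illustration; the tuple helper and the department example are invented here and are not part of any OO database API.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class TypeConstructorSketch {

    // tuple: named components, modeled here as an ordered map built
    // from alternating name/value arguments.
    static Map<String, Object> tuple(Object... nameValuePairs) {
        Map<String, Object> t = new LinkedHashMap<>();
        for (int i = 0; i + 1 < nameValuePairs.length; i += 2) {
            t.put((String) nameValuePairs[i], nameValuePairs[i + 1]);
        }
        return t;
    }

    public static void main(String[] args) {
        // atom: basic values the system supports directly.
        String name = "Research";
        int number = 5;

        // set: an unordered collection of distinct values.
        Set<String> locations = Set.of("Houston", "Bellaire");

        // list: an ordered collection.
        List<Integer> projectNumbers = List.of(1, 2, 3);

        // A complex object state built by nesting the constructors.
        Map<String, Object> department = tuple(
            "name", name,
            "number", number,
            "locations", locations,
            "projects", projectNumbers);

        System.out.println(department.keySet());
    }
}
```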

Overview of the CORBA Standard for Distributed Objects

The Common Object Request Broker Architecture (or CORBA) is an industry standard developed by the Object Management Group (OMG) to aid in distributed objects programming. It is important to note that CORBA is simply a specification. A CORBA implementation is known as an ORB (or Object Request Broker). There are several CORBA implementations available on the market such as VisiBroker, ORBIX, and others. JavaIDL is another implementation that comes as a core package with the JDK1.3 or above.

CORBA was designed to be platform and language independent. Therefore, CORBA objects can run on any platform, located anywhere on the network, and can be written in any language that has Interface Definition Language (IDL) mappings.

Similar to RMI, CORBA objects are specified with interfaces. Interfaces in CORBA, however, are specified in IDL. While IDL is similar to C++, it is important to note that IDL is not a programming language.

The Genesis of a CORBA Application

There are a number of steps involved in developing CORBA applications. These are:

1. Define an interface in IDL

2. Map the IDL interface to Java (done automatically)

3. Implement the interface

4. Develop the server

5. Develop a client

6. Run the naming service, the server, and the client.

We now explain each step by walking you through the development of a CORBA-based file transfer application, which is similar to the RMI application we developed earlier in this article. Here we will be using the JavaIDL, which is a core package of JDK1.3+.

a. Define the Interface

When defining a CORBA interface, think about the type of operations that the server will support. In the file transfer application, the client will invoke a method to download a file. The FileInterface.idl code sample shows the interface for FileInterface. Data is a new type introduced using the typedef keyword. A sequence in IDL is similar to an array except that a sequence does not have a fixed size. An octet is an 8-bit quantity that is equivalent to the Java type byte.

Note that the downloadFile method takes one parameter of type string that is declared with the in mode. IDL defines three parameter-passing modes: in (for input from client to server), out (for output from server to client), and inout (used for both input and output).

Code Sample: FileInterface.idl

interface FileInterface {
    typedef sequence<octet> Data;
    Data downloadFile(in string fileName);
};
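The in, out, and inout modes have no direct counterpart in Java, which passes arguments by value. The IDL-to-Java mapping therefore represents out and inout parameters with holder objects (for example, org.omg.CORBA.IntHolder in Java IDL). The sketch below imitates that idea with a hypothetical IntBox class and an invented scale operation, so that it runs without any CORBA classes.

```java
final class ParameterModeSketch {

    // Hypothetical holder, playing the role that generated holder
    // classes play for out and inout parameters.
    static final class IntBox {
        int value;

        IntBox(int value) {
            this.value = value;
        }
    }

    // factor is "in": the callee only reads it.
    // result is "out": the callee only writes the holder.
    // runningTotal is "inout": the callee reads it, then overwrites it.
    static void scale(int factor, IntBox result, IntBox runningTotal) {
        result.value = factor * 10;
        runningTotal.value += result.value;
    }

    public static void main(String[] args) {
        IntBox result = new IntBox(0);
        IntBox total = new IntBox(5);
        scale(3, result, total);
        System.out.println(result.value); // 30
        System.out.println(total.value);  // 35
    }
}
```

The downloadFile operation in FileInterface needs no holder, since its only parameter is in and its result travels back as the return value.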

Once you finish defining the IDL interface, you are ready to compile it. The JDK1.3+ comes with the idlj compiler, which is used to map IDL definitions into Java declarations and statements.

The idlj compiler accepts options that allow you to specify if you wish to generate client stubs, server skeletons, or both. The -f<side> option is used to specify what to generate. The side can be client, server, or all for client stubs and server skeletons. In this example, since the application will be running on two separate machines, the -fserver option is used on the server side, and the -fclient option is used on the client side.

Now, let's compile the FileInterface.idl and generate server-side skeletons, using the command:

prompt> idlj -fserver FileInterface.idl

This command generates several files such as skeletons, holder and helper classes, and others. An important file that gets generated is the _FileInterfaceImplBase, which will be subclassed by the class that implements the interface.


b. Implement the interface

Now, we provide an implementation of the downloadFile method. This implementation is known as a servant, and as you can see from Code Sample 1, the class FileServant extends the _FileInterfaceImplBase class to specify that this servant is a CORBA object.

Code Sample 1: FileServant.java

import java.io.*;

public class FileServant extends _FileInterfaceImplBase {
    public byte[] downloadFile(String fileName) {
        File file = new File(fileName);
        // Size the buffer to the length of the requested file
        byte buffer[] = new byte[(int) file.length()];
        try {
            BufferedInputStream input =
                new BufferedInputStream(new FileInputStream(fileName));
            input.read(buffer, 0, buffer.length);
            input.close();
        } catch (Exception e) {
            System.out.println("FileServant Error: " + e.getMessage());
            e.printStackTrace();
        }
        return (buffer);
    }
}
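The servant fills its buffer with a single read call. In general, an InputStream read is allowed to return before the whole buffer is filled, so code that must return the complete file often uses a call that reads to the end. The following is a sketch of one such alternative using java.nio.file.Files.readAllBytes; the class and method names are hypothetical and this is not part of the original sample.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class ReadWholeFileSketch {

    // Reads the complete file or throws; it never returns a partly
    // filled array.
    static byte[] readAll(Path file) throws IOException {
        return Files.readAllBytes(file);
    }

    public static void main(String[] args) throws IOException {
        Path temp = Files.createTempFile("download", ".txt");
        try {
            Files.write(temp, "hello".getBytes(StandardCharsets.UTF_8));
            System.out.println(readAll(temp).length); // 5
        } finally {
            Files.deleteIfExists(temp);
        }
    }
}
```

Unlike the servant, this version lets the IOException propagate instead of returning an empty or partial buffer, which a caller may or may not prefer.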

c. Develop the server

The next step is developing the CORBA server. The FileServer class, shown in Code Sample 2, implements a CORBA server that does the following:

* Initializes the ORB

* Creates a FileServant object

* Registers the object in the CORBA Naming Service (COS Naming)

* Prints a status message

* Waits for incoming client requests

Code Sample 2: FileServer.java

import java.io.*;
import org.omg.CosNaming.*;
import org.omg.CosNaming.NamingContextPackage.*;
import org.omg.CORBA.*;

public class FileServer {
    public static void main(String args[]) {
        try {
            // create and initialize the ORB
            ORB orb = ORB.init(args, null);
            // create the servant and register it with the ORB
            FileServant fileRef = new FileServant();
            orb.connect(fileRef);
            // get the root naming context
            org.omg.CORBA.Object objRef =
                orb.resolve_initial_references("NameService");
            NamingContext ncRef = NamingContextHelper.narrow(objRef);
            // Bind the object reference in naming
            NameComponent nc = new NameComponent("FileTransfer", " ");
            NameComponent path[] = {nc};
            ncRef.rebind(path, fileRef);
            System.out.println("Server started....");
            // Wait for invocations from clients
            java.lang.Object sync = new java.lang.Object();
            synchronized (sync) {
                sync.wait();
            }
        } catch (Exception e) {
            System.err.println("ERROR: " + e.getMessage());
            e.printStackTrace(System.out);
        }
    }
}

Once the FileServer has an ORB, it can register the CORBA service. It uses the COS Naming Service specified by OMG and implemented by Java IDL to do the registration. It starts by getting a reference to the root of the naming service. This returns a generic CORBA object. To use it as a NamingContext object, it must be narrowed down (in other words, cast) to its proper type, and this is done using the statement:

NamingContext ncRef = NamingContextHelper.narrow(objRef);

The ncRef object is now an org.omg.CosNaming.NamingContext. You can use it to register a CORBA service with the naming service using the rebind method.

d. Develop a client

The next step is to develop a client. An implementation is shown in Code Sample 3. Once a reference to the naming service has been obtained, it can be used to access the naming service and find other services (for example the FileTransfer service). When the FileTransfer service is found, the downloadFile method is invoked.

Code Sample 3: FileClient.java

import java.io.*;
import java.util.*;
import org.omg.CosNaming.*;
import org.omg.CORBA.*;

public class FileClient {
    public static void main(String argv[]) {
        // A file name is required; check before contacting the server
        if (argv.length < 1) {
            System.out.println("Usage: java FileClient filename");
            return;
        }
        try {
            // create and initialize the ORB
            ORB orb = ORB.init(argv, null);
            // get the root naming context
            org.omg.CORBA.Object objRef =
                orb.resolve_initial_references("NameService");
            NamingContext ncRef = NamingContextHelper.narrow(objRef);
            NameComponent nc = new NameComponent("FileTransfer", " ");
            // Resolve the object reference in naming
            NameComponent path[] = {nc};
            FileInterfaceOperations fileRef =
                FileInterfaceHelper.narrow(ncRef.resolve(path));

            // download the file and save it locally under the same name
            byte data[] = fileRef.downloadFile(argv[0]);
            BufferedOutputStream output =
                new BufferedOutputStream(new FileOutputStream(argv[0]));
            output.write(data, 0, data.length);
            output.flush();
            output.close();
        } catch (Exception e) {
            System.out.println("FileClient Error: " + e.getMessage());
            e.printStackTrace();
        }
    }
}

e. Running the application

The final step is to run the application. There are several sub-steps involved:

1. Run the CORBA naming service. This can be done using the command tnameserv. By default, it runs on port 900. If you cannot run the naming service on this port, then you can start it on another port. To start it on port 2500, for example, use the following command:

prompt> tnameserv -ORBInitialPort 2500

2. Start the server. This can be done as follows, assuming that the naming service is running on the default port number:

prompt> java FileServer

If the naming service is running on a different port number, say 2500, then you need to specify the port using the ORBInitialPort option as follows:

prompt> java FileServer -ORBInitialPort 2500

3. Generate stubs for the client. Before we can run the client, we need to generate stubs for the client. To do that, get a copy of the FileInterface.idl file and compile it using the idlj compiler, specifying that you wish to generate client-side stubs, as follows:

prompt> idlj -fclient FileInterface.idl

4. Run the client. Now you can run the client using the following command, assuming that the naming service is running on port 2500:

prompt> java FileClient hello.txt -ORBInitialPort 2500

Here hello.txt is the file we wish to download from the server.

Note: if the naming service is running on a different host, then use the -ORBInitialHost option to specify where it is running. For example, if the naming service is running on port number 4500 on a host with the name gosling, then you start the client as follows:

prompt> java FileClient hello.txt -ORBInitialHost gosling -ORBInitialPort 4500

Alternatively, these options can be specified at the code level using properties. So instead of initializing the ORB as:

ORB orb = ORB.init(argv, null);

it can be initialized by specifying the CORBA server machine (called gosling) and the naming service's port number (2500) as follows:

Properties props = new Properties();
props.put("org.omg.CORBA.ORBInitialHost", "gosling");
props.put("org.omg.CORBA.ORBInitialPort", "2500");
ORB orb = ORB.init(argv, props);

Exercise

In the file transfer application, the client (in both cases, RMI and CORBA) needs to know the name of the file to be downloaded in advance. No methods are provided to list the files available on the server. As an exercise, you may want to enhance the application by adding another method that lists the files available on the server. Also, instead of using a command-line client you may want to develop a GUI-based client. When the client starts up, it invokes a method on the server to get a list of files, then pops up a menu displaying the files available, where the user would be able to select one or more files to be downloaded.

Developing distributed object-based applications can be done in Java using RMI or JavaIDL (an implementation of CORBA). The use of both technologies is similar since the first step is to define an interface for the object. Unlike RMI, however, where interfaces are defined in Java, CORBA interfaces are defined in the Interface Definition Language (IDL). This, however, adds another layer of complexity where the developer needs to be familiar with IDL, and equally important, its mapping to Java.

Making a selection between these two distribution mechanisms really depends on the project at hand and its requirements. I hope this article has provided you with enough information to get started developing distributed object-based applications and enough guidance to help you select a distribution mechanism

CORBA/IIOP support. Extends application services to Web clients, for integration with your existing applications architecture.

Flexible, pervasive security. Personalize access to data and applications based on individual and group roles. Extend security to HTML files and other data, for pervasive security no matter how or where Web content is stored.

Enhanced HTTP stack. The HTTP engine delivers outstanding performance and Java servlet support.

Integration with Microsoft IIS. Use IIS as the HTTP engine for ValidSolutions, to dramatically enhance IIS security and bring 21 CFR part 11 compliant  Web application services to your NT-based Web environment.

With support for CORBA and IIOP, the ValidSolution allows you to create client/server Web applications that take advantage of the web objects and application services. In addition, you can now access back-end relational databases for enhanced data integration using the Enterprise Connection Services.

Valid Components can leverage the Enterprise Connection Services (ECS) for building live links between pages and forms, to data from relational databases. To set up the links, you simply use the ECS template application to identify your forms and fields that will contain external source data, and to define the real-time connection settings. You can set up connections for DB2, Oracle, Sybase, EDA/SQL, and ODBC.

The Domino Application Server also allows you to design applications with CORBA-standard distributed objects

1.2.9 Object Relational and Extended Relational Database Systems

Evolution & Current Trends of Database Technology

Security concerns must be addressed when developing a distributed database. When choosing between the object-oriented model and the relational model, many factors should be considered. The most important of these factors are single level and multilevel access controls, protection against inference, and maintenance of integrity. When determining which distributed database model will be more secure for a particular application, the decision should not be made purely on the basis of available security features. One should also question the efficacy and efficiency of the delivery of these features. Do the features provided by the database model provide adequate security for the intended application? Does the implementation of the security controls add an unacceptable amount of computational overhead? In this paper, the security strengths and weaknesses of both database models and the special problems found in the distributed environment are discussed.

As distributed networks become more popular, the need for improvement in distributed database management systems becomes even more important. A distributed system varies from a centralized system in one key respect: the data and often the control of the data are spread out over two or more geographically separate sites. Distributed database management systems are subject to many security threats additional to those present in a centralized database management system (DBMS). Furthermore, the development of adequate distributed database security has been complicated by the relatively recent introduction of the object-oriented database model. This new model cannot be ignored. It has been created to address the growing complexity of the data stored in present database systems.

For the past several years the most prevalent database model has been relational. While the relational model has been particularly useful, its utility is reduced if the data does not fit into a relational table. Many organizations have data requirements that are more complex than can be handled with these data types. Multimedia data, graphics, and photographs are examples of these complex data types.

Relational databases typically treat complex data types as BLOBs (binary large objects). For many users, this is inadequate since BLOBs cannot be queried. In addition, database developers have had to contend with the impedance mismatch between the third generation language (3GL) and structured query language (SQL). The impedance mismatch occurs when the 3GL command set conflicts with SQL. There are two types of impedance mismatches:

(1) Data type inconsistency: A data type recognized by the relational database is not recognized by the 3GL. For example, most 3GLs don’t have a data type for dates. In order to process date fields, the 3GL must convert the date into a string or a Julian date. This conversion adds extra processing overhead.

(2) Data manipulation inconsistency: Most procedural languages read only one record at a time, while SQL reads records a set at a time. This problem is typically overcome by embedding SQL commands in the 3GL code.

Solutions to both impedance problems add complexity and overhead. Object-oriented databases have been developed in response to the problems listed above: They can fully integrate complex data types, and their use eliminates the impedance mismatch [Mull94].
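The data type inconsistency can be illustrated with the date example from the text. Java does have a date type, so the sketch below only simulates the situation of a host language that must carry dates as Julian day numbers or strings; the class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.temporal.JulianFields;

final class DateMismatchSketch {

    // Host-language side: encode a date as a Julian day number.
    static long toJulianDay(LocalDate date) {
        return date.getLong(JulianFields.JULIAN_DAY);
    }

    // Reverse conversion, needed each time a value crosses the boundary
    // between the database and the host language.
    static LocalDate fromJulianDay(long julianDay) {
        return LocalDate.ofEpochDay(0).with(JulianFields.JULIAN_DAY, julianDay);
    }

    public static void main(String[] args) {
        LocalDate stored = LocalDate.of(1996, 2, 12);

        long asNumber = toJulianDay(stored);   // number form
        String asText = stored.toString();     // string form, ISO-8601

        System.out.println(asNumber);
        System.out.println(asText);
        System.out.println(fromJulianDay(asNumber).equals(stored)); // true
    }
}
```

Every read and write of a date field pays for one of these conversions, which is the extra processing overhead the text refers to.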

The development of relational database security procedures and standards is a more mature field than for the object-oriented model. This is principally due to the fact that object-oriented databases are relatively new. The relative immaturity of the object-oriented model is particularly evident in distributed applications. An inconsistent standard is an example: Developers have not embraced a single set of standards for distributed object-oriented databases, while standards for relational databases are well established [Sud95]. One implication of this disparity is the inadequacy of controls in multilevel heterogeneous distributed object-oriented systems.


In this paper, we will review the security concerns of databases in general and distributed databases in particular. We will examine the security problems found in both models, and we will examine the security problems unique to each system. Finally, we will compare the relative merits of each model with respect to security.

1.2.10 The Informix Universal Server

While Oracle and Sybase come to mind first when thinking of relational database technology for the Unix platform, Informix Corp. claims the largest installed base of relational database engines running on Unix. (See "Informix on the Move," DBMS, November 1995, page 46.) Furthermore, Informix appears to be focused more specifically on a mission statement to deliver "... the best technology and services for developing enterprisewide data management applications for open systems." Something must be working right. Informix's 1995 revenue ($709 million) and net income ($105.3 million) are up by more than 50 percent and 59 percent, respectively, compared to 1994. This puts Informix on track to join the ranks of other billion dollar software businesses within the next year or two.

Founded in 1980 by Roger Sippl, Informix went public in 1986 and released its current top-of-the-line product, the OnLine Dynamic Server RDBMS, in 1988. While the current Informix product line reflects a focus on database servers and tools, Informix has always encouraged a healthy applications market founded on the use of its tools and server engines. Whereas Oracle developed its own line of accounting and distribution applications, Informix left this to third parties. Both FourGen Software (Seattle, Wash.) and Concepts Dynamic (Schaumburg, Ill.), among others, have developed full accounting application suites based on the Informix RDBMS and built with Informix development tools.

The only time Informix diverted from its database-centric strategy was in 1988, when it merged with Innovative Software, adding the SmartWare desktop applications suite to its database-centric product line. This product acquisition, together with that of the Wingz graphical spreadsheet, followed a pattern similar to Novell's later acquisition of WordPerfect's desktop business. Both companies, Informix and Novell, moved into businesses that they did not understand and eventually divested the products they acquired. Also, just as the WordPerfect acquisition triggered the departure of Novell founder Ray Noorda, the SmartWare acquisition triggered the departure of Roger Sippl from Informix.

Both Informix and Novell subsequently refocused on their core businesses as a result of these forays into desktop applications. The current chairman, president, and CEO of Informix, Phillip E. White, joined the company in 1989. He took over in 1992 from Roger Sippl, who left to found Visigenic, a database access company focused on ODBC technology. White is credited with increasing shareholder value from 56 cents per share at the end of 1990 to $30 per share at the end of 1995. This performance placed Informix at the top of the Wall Street Journal's Shareholder Scoreboard for best five-year performer.

Without the opportunity to grow revenues through diversifying into applications or other non-database areas, Informix could face difficulties in sustaining its growth. Consequently, Informix is pursuing a number of strategies to strengthen and differentiate its core database products in order to reach new markets. These strategies include:

* increasing the range of data types that Informix RDBMS engines can handle

* establishing Informix engines as data warehousing platforms

* making Informix servers attractive for use in mobile computing

* taking advantage of the Internet to reach new database markets

* exploiting other emerging technologies, such as SmartCards

Dynamic Scalable Architecture (DSA)

DSA is the marketing term for a database architecture designed to position Informix as a leading provider in the area of parallel processing and scalable database server technology. DSA provides a foundation for a range of high-end Informix database servers based on variants of the same core engine technology:

* The OnLine Extended Parallel Server is designed for very high-volume OLTP environments that need to utilize loosely coupled or shared-nothing computing architectures composed of clusters of symmetrical multiprocessing (SMP) or massively parallel processing (MPP) systems.

* The OnLine Dynamic Server is designed for high-volume OLTP environments that require replication, mainframe-level database administration tools, and the performance delivered by Informix's parallel data query technology (PDQ). PDQ enables parallel table scans, sorts, and joins; parallel query aggregation for decision support; and parallel data loads, index builds, backups, and restores. Although this server supports SMP, it does not support MPP, which is the essential differentiating feature between the OnLine Dynamic Server and the OnLine Extended Parallel Server.

* The OnLine Workgroup Server is designed for smaller numbers of user connections (up to 32 concurrent) and lower transaction volumes. It is also easier to administer because it offers less complex functionality compared to the higher-end servers.

These three server products position Informix to compete effectively against similar stratified server families from Oracle, IBM, and Sybase, as well as niche players such as Microsoft with its SQL Server product and Computer Associates with CA-OpenIngres. However, while IBM may lead with the exceptional database administration breadth and depth of its DB2 engine or Microsoft with the ease of use of its graphical administration tools, Informix is setting the pace in support for parallel processing that addresses an issue dear to every database user's heart, namely performance.

Informix-Universal Server

Informix has supported binary large object (BLOB) data for many years but the company recognizes that the need to store, and more important, to manipulate complex data other than text and numeric data, will be critical to its ability to address future customer needs. For this reason, Informix recently completed its acquisition of Illustra Information Technologies, founded by Ingres RDBMS designer Dr. Michael Stonebraker. Illustra specializes in handling image, 2D and 3D spatial data, time series, video, audio, and document data using snap-in modules called DataBlades that add object handling capabilities to an RDBMS via extensions to SQL. Informix has announced its intention to fully integrate Illustra technology into a new Informix-Universal Server product within the next year.

If Informix manages this task, and analysts such as Richard Finkelstein of Performance Computing doubt that it will (see Computerworld, February 12, 1996), Informix-Universal Server could put Informix in a unique position to service specialized and highly profitable markets such as:

* multimedia asset management for the entertainment industry

* electronic publishing and content management across the Internet

* risk management systems for financial services companies

* government and commercial geographic information systems (GISs)

Establishing an early leadership position in any one of these markets could easily account for another billion dollars in revenue for Informix. This would surely justify the time and cost required to rearchitect its core engine around the Illustra technology and position Informix as a player in the object/relational database market.

Delivery of the Informix-Universal Server is slated to take place in three phases:

1. delivery of a gateway to allow customers to access complex data stored in an Illustra Server and integrate it with traditional relational data in an Informix server (the second quarter of 1996)

2. delivery of a DataBlades Developer Tool Kit for creating new user-defined data types that work in both the Illustra Server and the new Informix-Universal Server (the second quarter of 1996)

3. delivery of the fully merged Informix-Universal Server technology including "snap in" DataBlades (the fourth quarter of 1996)

a. Riding Waves

To some extent, you could argue that Informix (like competitors Oracle and Sybase) has surfed the technology wave of relational databases and Unix-based open systems that has swept across corporations over the last decade. Another more recent wave, data warehousing, is far from peaking, and Informix hedged its bets in this area with its acquisition of the San Francisco-based Stanford Technology Group (STG). STG is known for its MetaCube product, which presents a multidimensional view of underlying relational data through the use of an intermediary metadata layer. This lets users of Informix RDBMS servers carry out online analytical processing (OLAP) by using the MetaCube technology. Informix already has a major data warehouse implementation underway at the Consumer Market Division of communications giant MCI. This data warehouse is expected to grow from a 600GB data mart up to three terabytes.

Oracle and Sybase have also taken initiatives in this area and are integrating OLAP technology into their product lines to ensure that they lose as few sales as possible to multidimensional server vendors such as Arbor Software (Sunnyvale, Calif.), which sells the Essbase Analysis Server, or to specialized data warehouse server vendors such as Red Brick Systems (Los Gatos, Calif.). The data warehousing wave provides database vendors the chance to offer an application that is no more than their current database engine and some combination of front-end query and reporting tools. The data warehouse solution from Informix also benefits from its built-in parallel processing functionality and log-based "continuous" data replication services for populating the data warehouse from other Informix servers. Leading U.K. database analysts Bloor Research Group cited Informix's DSA as "the best all-round parallel DBMS on the market" and claimed it "has significant benefits over almost all its competitors on data warehouse applications" ("Parallel Database Technology: An Evaluation and Comparison of Scalable Systems," Bloor Research Group, October 1995).

b. Going Mobile

International Data Corp. forecasts suggest that shipments of laptop computers will grow from four million in 1995 to some eight million in 1999 in the U.S. alone. In other words, the road warrior population is set to at least double, and as more workers telecommute and the influence of the Internet makes itself felt in the business world, the term "office" will simply come to mean "where you are at this point in time." To support this scenario, Informix is working on its "anytime, anywhere" strategy, which sounds suspiciously similar to the concepts espoused by Sybase for its SQL Anywhere server product based on the recently acquired Watcom SQL engine.

However, the key to Informix's strategy for the mobile computing market is asynchronous messaging based on new middleware products being built by Informix that provide store-and-forward message delivery and the use of software agents to manage the process. Asynchronous messaging lets mobile clients send and receive messages without maintaining a constant connection with the server. Store-and-forward message delivery ensures that messages get sent or completed as soon as a connection is established or reestablished. The middleware and software agents are used to establish and maintain connections, to automate repetitive tasks, and to intelligently sort and save information. The applications that deliver this functionality can be created using the Informix class libraries built in the Informix NewEra tool, which allows for application partitioning to deploy components on mobile clients or servers.

c. New Era of RAD

NewEra is Informix's rapid application development tool that competes with Powersoft's (a Sybase company) PowerBuilder and Oracle's Developer 2000. Compared to its competitors, NewEra benefits from a strong object-oriented design that delivers a repository-based, class library-driven application development paradigm using class browsers for navigating application objects. NewEra can also generate cross-platform applications. Specifically, NewEra includes:

* a graphical window and form painter with a code generator

* a graphical front end for managing NewEra application components

* a graphical language editor for managing NewEra code

* an interactive, graphical debugger for analyzing NewEra programs

* repositories, class browsers, and configuration tools supporting team-based development

* reusable class libraries that can be Informix or third party provided or developer defined

The impending release of the latest version of NewEra, expected in the second quarter of 1996, is slated to deliver user-defined application partitioning for three-tier client/server deployment; OLE remote automation server support to allow OLE clients to make requests against NewEra built application servers; and class libraries to support transaction-processing monitors for load balancing of high volume OLTP applications. If this functionality is delivered as promised, then client/server application vendors such as Concepts Dynamic (Schaumburg, Ill.), whose Control suite of accounting applications is written in NewEra, will benefit from their use of Informix technology.

d. The Web Word

Informix, like everyone these days, is hot on the Web word. World Wide Web Interface Kits are available for use by Informix customers building Web applications using Informix-4GL or Informix-ESQL/C tools that need to use the common gateway interface (CGI) as a means to access Informix databases across the Internet. Informix has established a Web partner program to build links with other Web software developers such as Bluestone Inc. (Mountain View, Calif.) and Spider Technologies (Palo Alto, Calif.). Informix customers such as MCI, Choice Hotels, and the Internet Shopping Network are already forging ahead with Informix-based Web solutions. Illustra (now owned by Informix) also recently collaborated with other partners to deliver "24 Hours in Cyberspace." This event, claimed to be the largest online publishing event ever staged, allowed the organizers to create a new web page every 30 minutes comprising multimedia content delivered from hundreds of sites worldwide and stored in an Illustra DBMS.

Informix also partnered with Internet darling Netscape Communications Corp. to include the Informix-OnLine Workgroup Server RDBMS as the development and deployment database for Netscape's LiveWire Pro. The LiveWire Pro product is part of Netscape's SuiteSpot Web application development system for building online applications. This deal involves cross-licensing and selling of Informix and Netscape products and is likely to be among the first of many such collaborations between database and Internet vendors during 1996.

e. SmartCards and Internet Personal Communicators (IPC)

While the IPC vs. PC debate rages on in the press, let me put a spin on this scenario for you. You are a road warrior and before leaving on a trip you slip your personal profile SmartCard (PPS) into your jacket pocket and leave the laptop at home. Your PPS contains your personal login information and access numbers for Internet and Intranet connectivity. Eventually this PPS may also be software agent-trained to search for news on specific subjects, and may contain a couple of Java applets for corporate Intranet application front ends to submit your T&E (travel and entertainment) and review your departmental schedule. When you check into your room, there is an IPC designed specifically for OLIP (online Internet processing).

This IPC, which costs your hotel the same amount as the TV in your room, is a combined monitor, PPS reader, and keyboard/mouse already plumbed into the Internet. You switch on the IPC and with one swipe of your PPS in the reader you upload all your profile data into the IPC's local memory. While this is taking place, the hotel uses the opportunity to display its home page, welcoming you to the hotel, advertising goods and services, and, if you are a regular guest, showing you your current bill and your frequent guest program status. You then fire up your favorite browser to process some email, set your software agent off to collect the news, submit your trip expenses to the home office Intranet, and review your current schedule to book a few calls and juggle some appointments. All of this was done without a laptop or personal computer in sight and depends only on a simple device connected to the Internet and a SmartCard.

SmartCards are another technology on which Informix is working together with its partners, Hewlett-Packard (Palo Alto, Calif.) and GemPlus Card International Corp. (Gaithersburg, Md.). SmartCards will be used for all sorts of applications including buying, identifying, and securing things. It is not hard to see SmartCards being carried by everyone and combining your credit card, phone card, driver's license, and medical alert data onto one slim "plastic" database.

f. Putting the Right Foot Forward

It's hard to see Informix taking a wrong step at the moment. The positioning of the Informix-Universal Server, the complementary strategies of mobile computing, Web-enabling, and SmartCards show some good, focused vision. Phillip White's record, as well as that of on-staff database gurus such as Dr. Michael Stonebraker of Ingres/Illustra fame and Mike Saranga of DB2 fame, all shows the proven ability to execute these strategies successfully. Sounds like a recipe for success to me.

1.2.11 Object-Relational Features of Oracle 8i

The Oracle 8i server software has many optional components to choose from:

* the Oracle 8i server software

* Net8 Listener

* the Oracle8i utilities

* SQL*Plus

* a starter database

Object spatial support helps with data mapping and handling. An instance can be started and a database opened in restricted mode so that the database is available only to administration personnel. This mode helps to accomplish the following tasks:

* perform structure maintenance, such as rebuilding indexes

* perform an export or import of database data

* perform a data load with SQL*Loader

* temporarily prevent typical users from using data

1.2.12 An Overview of SQL

The SQL Standard and Its Components

Structured Query Language (SQL) is a high-level language that was developed to provide access to the data contained in relational databases. SQL has been widely adopted, and almost all contemporary databases can now be accessed using SQL. The American National Standards Institute (ANSI) has standardized the SQL language. SQL Server uses a dialect of SQL called Transact-SQL. Transact-SQL contains several flow-control keywords that facilitate its use for developing stored procedures.

SQL is used for database management tasks such as creating and dropping tables and columns, and for writing triggers and stored procedures. It is also used to change SQL Server's configuration, and it can be used interactively with SQL Server's graphical Query Analyzer utility to perform unplanned (ad hoc) queries.

Object-Relational Support in SQL-99

SQL Server database objects consist of tables, columns, indexes, views, constraints, rules, defaults, triggers, stored procedures, and extended stored procedures. As you can see in the table, each SQL Server table contains a set of related information, where each table represents a different object in the publishing business.

Some New Operations and Features in SQL

Transact-SQL provides three categories of SQL support: DDL (Data Definition Language), DML (Data Manipulation Language), and DCL (Data Control Language).

SQL Server Enterprise Manager is a graphical client/server administration and management tool that allows you to perform database management, backup, and restore operations, and to set up security and database replication.

1.2.13 Implementation & related issues for extended type systems

Managing Large Objects and Other Storage Features

Like C++, Oracle 8 provides built in constructors for values of a declared type and these constructors bear the name of the type. Thus, the word point type and a parenthesized list of appropriate values form a value of type point type.

One of the most important parts of an Oracle database is its data dictionary. The data dictionary is a read-only set of tables that provide information about its associated database. Dynamic performance tables are not true tables, and most users should not access them. However, database administrators can query and create views on the tables and grant access to those views to other users. These views are sometimes called fixed views because they cannot be altered or removed by the database administrator.

The Nested Relational Data Model

The nested relational data model is a natural generalisation of the relational data model, but it often leads to designs which hide the data structures needed to specify queries and updates in the information system. The relational data model on the other hand exposes the specifications of the data structures and permits the minimal specification of queries and updates using SQL. However, there are deficiencies in relational systems, which lead to a demand for object-oriented nested relational solutions. This paper argues that these deficiencies are not inherent in the relational data model, but are deficiencies in the implementations of relational database systems.

The paper first sketches how the nested-relational model is a natural extension of the object-relational data model, then shows how the nested relational model, while sound, is expensive to use. It then examines the object-oriented paradigm for software engineering, and shows that it gives very little benefit in database applications. Rather, the relational model as represented in conceptual modeling languages is argued to provide an ideal view of the data. The ultimate thesis is that a better strategy is to employ a main-memory relational database optimised for queries on complex objects, with a query interface based on a conceptual model query language.

Object-relational data model leads to nested relations

The object-relational data model (Stonebraker, Brown and Moore 1999) arises out of the realisation that the relational data model abstracts away from the value sets of attribute functions. If we think in terms of tuple identifiers in relations (keys), then a relation is simply a collection of attribute functions mapping the key into value sets.

The pure relational data model is based on set theory, and operates in terms of projections, cartesian products and selection predicates. Cartesian product simply creates new sets from existing sets, while projection requires the notion of identity, since the projection operation can produce duplicates, which must be identified. Selection requires the concept of a predicate, but the relational model abstracts away from the content of the predicate, requiring only a function from a tuple of value sets into {true, false}. The relational system requires only the ability to combine predicates using the propositional calculus.
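The three operations can be sketched directly over Python sets. This is only an illustrative sketch: the function names, the positional encoding of attributes, and the sample vehicle tuples are assumptions made here, not part of the model's definition. Note that projection relies on set identity to collapse duplicates, and that selection accepts any function from a tuple into {True, False}, with predicates combined propositionally.

```python
from itertools import product

def select(relation, predicate):
    """Selection: keep the tuples for which the predicate is true."""
    return {row for row in relation if predicate(row)}

def project(relation, positions):
    """Projection: keep the chosen columns; the set collapses duplicates."""
    return {tuple(row[i] for i in positions) for row in relation}

def cartesian(left, right):
    """Cartesian product: concatenate every left tuple with every right tuple."""
    return {l + r for l, r in product(left, right)}

def p_and(p, q):
    """Propositional conjunction of two predicates."""
    return lambda row: p(row) and q(row)

def p_not(p):
    """Propositional negation of a predicate."""
    return lambda row: not p(row)

# Hypothetical sample relation with attributes (vehID, make, year).
vehicles = {("V1", "Laser", 1991), ("V2", "Falcon", 1999), ("V3", "Laser", 1995)}

def is_laser(row):
    return row[1] == "Laser"

def is_1991(row):
    return row[2] == 1991

makes = project(vehicles, [1])  # two tuples, although "Laser" occurs twice
later_lasers = select(vehicles, p_and(is_laser, p_not(is_1991)))
pairs = cartesian(makes, makes)
```

The relational layer never inspects what `is_laser` does internally, which is the abstraction from predicate content described above.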

Particular value sets have properties which are used in predicates and in other operations. The only operator used in the pure relational model is identity. The presence of this operator is guaranteed by the requirement that the value sets be sets, although in practice some value sets do not for practical purposes support identity (e.g. real numbers represented as floating point).

This realisation that the relational data model abstracts away from the types of value sets and from the operators which are available to types has allowed the design of database systems where the value sets can be of any type. Besides integers, strings, reals, and booleans, object-relational databases can support text, images, video, animation, programs and many other types. Each type supports a set of operations and predicates which can be integrated with the relational operations into practical solutions (each type is an abstract data type).

If a value set can be of any type, why not a set of elements of some type? Why not a tuple? If we allow sets and tuples, then why not sets of tuples? Sets of tuples are relations and the corresponding abstract data type is the relational algebra. Thus the object-relational data model leads to the possibility of relation-valued attributes in relations. Having relation-valued attributes in relations looks as if it might violate first normal form. However, the outer relational operations can only result in tuples whose attribute values are either copies of attribute values from the original relations or are functions of those values, in the same way as if the value sets were integers, the results are either the integers present in the original tables or functions like square root of those integers. In other words, the outer relational system can only see inside a relation-valued attribute to the extent that a function is supplied to do so. These functions are particular to the schema of the relation-valued attribute, and have no knowledge of the outer schema. Since the outer relational model and the abstract data type of a relation-valued attribute are the same abstract data type, it makes sense to introduce a relationship between the two.

The standard relationships are unnest and nest. Unnest is an operator which modifies the scheme of the outer data model, replacing the relation-valued attribute function by a collection of attribute functions corresponding to the scheme of the inner relation. Nest is the reverse operation, which modifies the outer scheme by packaging a collection of attributes into a single relation-valued attribute.

Having relation-valued attributes together with nest and unnest operations between the outer and inner relational systems is called the nested relational data model. We see that the nested relational data model is a natural extension of the object-relational data model.
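A minimal sketch of the two operators, assuming relations are represented as lists of Python dictionaries and a relation-valued attribute as a list of inner dictionaries. The function names, this representation, and the sample department data are assumptions made for illustration only.

```python
def unnest(rows, attr):
    """Replace the relation-valued attribute `attr` by the attributes of its inner tuples."""
    flat = []
    for row in rows:
        outer = {k: v for k, v in row.items() if k != attr}
        for inner in row[attr]:
            flat.append({**outer, **inner})
    return flat

def nest(rows, inner_attrs, attr):
    """Package the attributes `inner_attrs` into one relation-valued attribute `attr`."""
    groups = {}
    for row in rows:
        # The remaining (outer) attributes identify the group a row belongs to.
        outer = tuple(sorted((k, v) for k, v in row.items() if k not in inner_attrs))
        inner = {k: row[k] for k in inner_attrs}
        groups.setdefault(outer, []).append(inner)
    return [{**dict(outer), attr: inners} for outer, inners in groups.items()]

# Hypothetical nested relation: each department holds a set of cars.
dept = [
    {"ID": 1, "car": [{"vehID": "V1", "make": "Laser", "year": 1991},
                      {"vehID": "V2", "make": "Falcon", "year": 1999}]},
    {"ID": 2, "car": [{"vehID": "V3", "make": "Corolla", "year": 1995}]},
]

flat = unnest(dept, "car")
renested = nest(flat, ["vehID", "make", "year"], "car")
```

In this sketch a department with an empty set of cars would vanish under `unnest`, so `nest` is the reverse of `unnest` only when every inner relation is non-empty.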

Use of the nested relational data model for object-oriented development

In recent years the object-oriented model has become the dominant programming model and is becoming more common in systems design, including information systems. The data in an object-oriented system consists typically of complex data structures built from tuple and collector types. The tuple type is the same as the tuple type in the object-relational model. A collector type is either a set, list or multiset. The latter two can be seen as sets with an additional attribute: a list is a set with a sequence attribute, while a multiset is a set with an additional identifying attribute. So a nested-relational data model can represent data from an object-oriented design. Accordingly, object-relational databases with object-relational nested SQL can be used to implement object-oriented databases. How this is done is described for example by Stonebraker, Brown and Moore (1999) (henceforth SBM). We should note that both the relational and object-oriented data models are implementations of more abstract conceptual data models expressed in conceptual data modelling languages such as the Entity-Relationship-Attribute (ERA) method. Well-established information systems design methods begin the analysis of data with a conceptual model, moving to a particular database implementation at a later stage. An example adapted from SBM will clarify some issues.

Consider the E-R data model in Figure 1.

Figure 1: A Conceptual Model

Since the relationship between department and vehicle is one-to-many, associated with each department is a set of vehicles. A nested-relational implementation of this conceptual data model is

Dept(ID: int, other: various, car: set of (vehID: string, make: string, year: int)) (1)
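For readers who think in programming-language types, scheme (1) can be pictured as a tuple type whose car attribute is itself a set of tuples. The following is only a sketch: the class names, the use of frozen dataclasses, and the sample values are assumptions made here, and the "other: various" attributes are left out.

```python
from dataclasses import dataclass
from typing import FrozenSet

@dataclass(frozen=True)
class Vehicle:
    """Inner tuple type: one element of the relation-valued attribute car."""
    vehID: str
    make: str
    year: int

@dataclass(frozen=True)
class Dept:
    """Outer tuple type: car is a set of Vehicle tuples."""
    ID: int
    car: FrozenSet[Vehicle]

# Hypothetical instance: one department with two vehicles.
dept1 = Dept(ID=1, car=frozenset({
    Vehicle("V1", "Laser", 1991),
    Vehicle("V2", "Falcon", 1999),
}))
```

Frozen dataclasses are used so that tuples are hashable and can be members of a set, which mirrors the requirement that value sets support identity.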

OR SQL on nested relations

SQL has been extended by SBM among others to handle object-relational databases, mainly by permitting in SQL statements the predicates and operators particular to the abstract data types supporting the value sets. In particular, nested relational systems are supported by extending and overloading the dot notation used for disambiguating attribute names.

For example, in

Select ID from dept where car.year = 1999 (2)

dot year identifies the year attribute of the car tuple, and also designates the membership of a tuple where year = 1999 in the set of tuples which is the value set of dept.car. The result of this query on the table of Figure 2 is ID = 1.

As a consequence of this overloading, the boolean operator and in the WHERE clause becomes, if not ambiguous, at least counterintuitive to someone used to standard SQL.

The query

Select ID from dept where car.make = Laser (3)

has the same result, ID = 1. Since the outermost interpretation of dot is set membership, in the query

Select ID from dept where car.year = 1999 and car.make = Laser (4)

the and operator is interpreted as set intersection, and the result is also ID = 1. This result, although correct, is probably not what the maker of the query intended. They would more likely have been looking for a department which has a 1999 Laser, and the response they would be looking for would be none.

There are two ways to fix this problem. One is to import a new and operator from the relational ADT, so that (4) becomes

Select ID from dept where car.year = 1999 and2 car.make = Laser (5)

In this solution, both arguments of and2 must be the same relation-valued attribute of the outer system.

The other solution is to unnest the table so that the standard relational operator works in the way it does in standard SQL:

Select ID from dept, dept.car where car.year = 1999 and car.make = Laser (6)

where the addition of dept.car to the FROM clause signifies unnesting.

The former method is problematic since nesting can occur to any level, and the second is problematic since it requires the user to introduce navigation information into the query.
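The three readings can be compared in a small sketch. The table the queries are run against is not reproduced here, so the data below is hypothetical, chosen only to agree with the results quoted for queries (2) to (4); the helper name is likewise an assumption.

```python
# Hypothetical data: department 1 owns a 1991 Laser and a 1999 Falcon.
dept = [
    {"ID": 1, "car": [{"vehID": "V1", "make": "Laser", "year": 1991},
                      {"vehID": "V2", "make": "Falcon", "year": 1999}]},
    {"ID": 2, "car": [{"vehID": "V3", "make": "Corolla", "year": 1995}]},
]

def ids_where_any(rows, pred):
    """Dot read as set membership: departments with at least one car satisfying pred."""
    return [d["ID"] for d in rows if any(pred(c) for c in d["car"])]

# Query (4): each condition is tested against the set separately, then intersected.
q4 = sorted(set(ids_where_any(dept, lambda c: c["year"] == 1999))
            & set(ids_where_any(dept, lambda c: c["make"] == "Laser")))

# Query (5): both conditions must hold for the same inner car tuple.
q5 = ids_where_any(dept, lambda c: c["year"] == 1999 and c["make"] == "Laser")

# Query (6): unnest first, then filter the flat tuples.
flat = [(d["ID"], c["make"], c["year"]) for d in dept for c in d["car"]]
q6 = sorted({dept_id for dept_id, make, year in flat
             if year == 1999 and make == "Laser"})
```

With this data `q4` contains department 1, while `q5` and `q6` are both empty, which is the answer "none" that the maker of the query was probably looking for.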

The same sort of problem occurs when we try to correlate the SELECT clause with the WHERE clause. The query

Select ID, car.year from dept where car.make = Laser (7)

returns the table

ID = 1, Year = {1991, 1999} (8)

when applied to the table of Figure 2, as a consequence of first normal form. We need again to use unnest to convert the nested structure to a flat relational structure in order to make the query mean what we want to say. Although OR SQL is a sound and complete query language, the simple-looking queries tend to be not very useful, and in order to make useful queries additional syntax and a good understanding of the possibly complex and possibly multiple nesting structure are essential. The author’s experience is that it is very hard to teach, even to very advanced students.
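The lack of correlation between the two clauses can be seen in a short sketch. The data is hypothetical, chosen only so that the uncorrelated reading reproduces result (8); the variable names are assumptions.

```python
# Hypothetical data: department 1 owns a 1991 Laser and a 1999 Falcon.
dept = [
    {"ID": 1, "car": [{"vehID": "V1", "make": "Laser", "year": 1991},
                      {"vehID": "V2", "make": "Falcon", "year": 1999}]},
    {"ID": 2, "car": [{"vehID": "V3", "make": "Corolla", "year": 1995}]},
]

# Query (7) as written: WHERE chooses departments that own some Laser,
# while SELECT projects the year of every car of each chosen department.
q7 = [(d["ID"], {c["year"] for c in d["car"]})
      for d in dept
      if any(c["make"] == "Laser" for c in d["car"])]

# After unnesting, SELECT and WHERE refer to the same car tuple.
q7_unnested = [(d["ID"], c["year"])
               for d in dept
               for c in d["car"]
               if c["make"] == "Laser"]
```

Here `q7` pairs department 1 with both years, as in (8), whereas `q7_unnested` reports only the year of the Laser itself.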

Representation of many to many relationships

If we are going to use the nested relational model to represent complex data structures, then we must take account of many to many relationships.

Figure 2: A Many to Many Relationship

There are several different ways to implement this application in the nested relational model, taking each of the entities as the outermost relation. If implemented as a single table, two of the entities would be stored redundantly because of the many-to-many relationships. So the normalised way is to store the relationships as sets of reference types (attributes whose value sets are object identifiers).

If the query follows the nesting structure used in the implementation, then we have only the problems of correlation of various clauses in the SQL query described in the last section.

However, if the query does not follow the nesting structure, it can get very complex. For example, if the table has a set of courses associated with each student and a set of lecturers associated with each course, then in order to find the students associated with a given lecturer, the whole structure needs to be unnested, and done so across reference types. The query is hard to specify, and would be very complex to implement.
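The contrast can be sketched with the student, course and lecturer example. Everything below is an assumption made for illustration: the table and column names, the sample rows, and the use of SQLite (through Python's sqlite3 module) as a stand-in for a standard relational system. The first function states the query against flat tables; the second has to unnest a student-rooted nested structure across its reference types before it can filter.

```python
import sqlite3

# Flat relational design: the many-to-many relationships are their own tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student  (sid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course   (cid INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE lecturer (lid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE enrolled (sid INTEGER, cid INTEGER);
    CREATE TABLE teaches  (lid INTEGER, cid INTEGER);
    INSERT INTO student  VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO course   VALUES (10, 'Databases'), (11, 'Logic');
    INSERT INTO lecturer VALUES (100, 'Lee'), (101, 'Kim');
    INSERT INTO enrolled VALUES (1, 10), (2, 10), (2, 11), (3, 11);
    INSERT INTO teaches  VALUES (100, 10), (101, 10), (101, 11);
""")

def students_of(lecturer_name):
    """Flat version: the joins need no knowledge of any nesting order."""
    rows = conn.execute(
        """SELECT DISTINCT s.name
           FROM student s
           JOIN enrolled e ON e.sid = s.sid
           JOIN teaches  t ON t.cid = e.cid
           JOIN lecturer l ON l.lid = t.lid
           WHERE l.name = ?
           ORDER BY s.name""",
        (lecturer_name,),
    )
    return [name for (name,) in rows]

# Nested design rooted at student: courses and lecturers are reached by reference.
nested_students = [
    {"name": "Ann", "courses": [10]},
    {"name": "Bob", "courses": [10, 11]},
    {"name": "Cy", "courses": [11]},
]
nested_courses = {
    10: {"title": "Databases", "lecturers": [100, 101]},
    11: {"title": "Logic", "lecturers": [101]},
}
nested_lecturers = {100: "Lee", 101: "Kim"}

def students_of_nested(lecturer_name):
    """Nested version: unnest student -> course -> lecturer, then filter."""
    flat = [(s["name"], nested_lecturers[lid])
            for s in nested_students
            for cid in s["courses"]
            for lid in nested_courses[cid]["lecturers"]]
    return sorted({student for student, lecturer in flat if lecturer == lecturer_name})
```

Both functions return the same students for a given lecturer, but the nested version encodes the navigation path from student through course to lecturer, so it would have to be rewritten if the structure were rooted at a different entity.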

One might argue that one should not use the nested relational model for many to many relationships. But nested systems can interact, as in Figure 3.

Figure 3: A many-to-many with nesting

In this case, an event has a set of races, and a team has a set of competitors, and we have to decide whether a race has a set of references to competitor or vice versa. What if we want to find what events a team participates in? The whole structure must be unnested.

The point is that representing these commonly occurring complex data structures using a nested relational model is very much more complex than representing them in the standard relational model.

Reconsideration of using NR model for OO concepts

We have seen that the nested relational model arises naturally from the object-relational model, and that it has a sound and complete query language based on first normal form.

However, we have seen several practical problems:

* Using the NR model forces the designer to make more choices at the database schema level than if the standard relational model is used.

* A query on a NR model must include navigation paths.

* A query must often unnest complex structures, often very deeply, for even semantically simple queries.

So even though the nested relational model is sound, it is very much more difficult to use than the standard relational model, so may be thought of as much more expensive to use.

In order for a more expensive tool to be a sound engineering choice, there must be a corresponding benefit. Let us therefore look at the benefits of the object-oriented programming model.

OO programming and design originated in the software engineering domain. In this domain, it is considered beneficial to hide the details of the implementation of a program specification. This information hiding makes use of objects more transparent, and ensures that modifications made to objects, which do not affect functionality, may be made without side effects. The principles of information hiding were a major advance in software engineering.

The benefits of using an OO approach in a database therefore would come from information hiding, that is hiding implementation details not required for understanding the specification of an object.

Let us see how this applies to the specification of data in an information system. As we have seen, it is common to use a conceptual modelling technique to specify such data. The implementation of this data is ultimately in terms of disk addresses, file organisations and access methods, but is generally done in several stages.

The first stage of implementation is normally the specification of schemas in a database data description language, very often in a relational database system. This stage of implementation is almost a transliteration, frequently introducing no additional design decisions. Algorithms for the purpose are given for example by Elmasri and Navathe (2000).

Further stages of implementation are performed almost entirely within the database manager software (DBMS), sometimes with the guidance of a database administrator who will identify attributes of tables which need rapid access, or give the DBMS some parameters which it will use to choose among pre-programmed design options. In effect, the implementation of the data model is almost entirely automated, and generally not the concern of the applications programmer.

So the conceptual data model is a specification, the almost equivalent DBMS table schemas are in effect also specifications, and the programmer does not generally proceed further with refinement.

On the programming side, an information system generally has a large number of modules which update or query the tables. In a relational system, these programs are generally written using the SQL data manipulation language.

The SQL statement is at a very high level, and is generally also refined in several stages:

* The order of execution of the various relational operators must be chosen.

* Various secondary and primary indexes can be created or employed.

* Decisions need to be made as to the size of blocks retrieved from disk, what is to be cached in main memory, whether intermediate results need to be sorted, and what sort algorithms to use.


But, again, these refinement decisions are made by the DBMS using pre-programmed design decisions depending on statistics of the tables held in the system catalog and to a degree on parameters supplied by the database administrator. The programmer is generally not concerned with them.
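A small experiment with Python's built-in `sqlite3` module illustrates the point. The `competitor` table, its rows and the index name are made up for the example, and the wording of the plan text differs between SQLite versions, so the sketch only compares the plans rather than relying on their exact form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE competitor (id INTEGER PRIMARY KEY, name TEXT, team TEXT)")
conn.executemany(
    "INSERT INTO competitor (name, team) VALUES (?, ?)",
    [("Ana", "Red"), ("Ben", "Blue"), ("Cy", "Red"), ("Di", "Green")],
)

# The programmer writes only this statement; it never changes below.
QUERY = "SELECT name FROM competitor WHERE team = ? ORDER BY name"


def run(connection):
    """Return the result the statement specifies."""
    return [row[0] for row in connection.execute(QUERY, ("Red",))]


def plan(connection):
    """Return the access plan the DBMS chose (the last column is the detail text)."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, ("Red",))
    return " | ".join(row[3] for row in rows)


rows_before, plan_before = run(conn), plan(conn)

# A database administrator's decision: give the DBMS an index to choose from.
conn.execute("CREATE INDEX idx_competitor_team ON competitor (team)")

rows_after, plan_after = run(conn), plan(conn)

if __name__ == "__main__":
    print("same result:", rows_before == rows_after)
    print("plan before:", plan_before)
    print("plan after: ", plan_after)
```

The statement and its result are identical in both runs; only the plan chosen by the DBMS differs.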

So it makes sense to think of an SQL statement not as a program but as a specification for a program. It is hard to see what might be removed from an SQL statement while retaining the same specified result. The SELECT clause determines which columns are to appear in the result, the FROM clause determines which tables to retrieve data from (in effect which entities and relationships the data is to come from), and the WHERE clause determines which rows to retrieve data from.
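To make the contrast concrete, the sketch below places an SQL statement next to a hand-written Python loop that computes the same result. The table and its rows are invented for the illustration. The loop fixes an access method (scan every row in storage order), while the SQL statement states only which columns, which table and which rows.

```python
import sqlite3

# Hypothetical sample data: (name, team, age).
ROWS = [("Ana", "Red", 31), ("Ben", "Blue", 27), ("Cy", "Red", 24)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE competitor (name TEXT, team TEXT, age INTEGER)")
conn.executemany("INSERT INTO competitor VALUES (?, ?, ?)", ROWS)

# Specification: SELECT names the columns, FROM the table, WHERE the rows.
spec_result = conn.execute(
    "SELECT name, age FROM competitor WHERE team = 'Red'"
).fetchall()

# Program: the same result, but with the scan and the test spelled out by hand.
program_result = []
for name, team, age in ROWS:
    if team == "Red":
        program_result.append((name, age))

if __name__ == "__main__":
    # No ORDER BY was specified, so compare the results as unordered collections.
    print(sorted(spec_result) == sorted(program_result))
```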

We have seen that the benefit of information hiding in object-oriented design is that the programmer can work with the specifications of the data and methods of a system without having to worry about how the specifications are implemented. However, in information systems, the programmer works only with specifications of data structures and access/update methods. The implementation is already hidden in the DBMS. So in a DBMS environment the programmer never has to worry about how the specifications are implemented. Information hiding is already employed no matter what design method the programmer uses.

What the nested relational data model does is hide aspects of the structure of the specified data, whereas the standard relational model exposes the specified structure of the data.

Using the NR data model, the data designer must make what amount to packaging design decisions in the implementation of a conceptual model. In this sense, an NR model is more refined than a standard relational model, and is therefore more expensive to build. On the other hand, when a query is planned in the NR model, the programmer, besides specifying the data that is to appear in the query, must also specify how to unpackage the data to expose sufficient structure to specify the result. So, as we have seen, the query is also more expensive. Both the data representation and the query are unnecessarily more expensive than in the standard relational representation, since the information being hidden is part of the specification, not how the specifications are implemented.

So why don’t people use RDBs for OO applications?

One might ask why people don’t already use relational databases for problems calling for object-oriented approaches. The usual reason given is that RDBs are too slow. The paradigmatic object-oriented application is system design, say a VLSI design or the design of a large software system. There is often only one (very complex) object in the system. This object has many parts, which are themselves complex. A relational implementation therefore calls for many subordinate tables with limited context; and processing data in the application generally requires large numbers of joins.
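The join-heavy access pattern can be sketched with `sqlite3`. As a simplifying assumption, a single self-referencing `part` table stands in for the many subordinate tables of a real design database, and the part names are invented. Each level of the design hierarchy costs one more join, which the recursive query below performs repeatedly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE part ("
    " part_id INTEGER PRIMARY KEY,"
    " name TEXT,"
    " parent_id INTEGER REFERENCES part(part_id))"
)
# One complex object (the chip) made of parts that are themselves complex.
conn.executemany(
    "INSERT INTO part VALUES (?, ?, ?)",
    [(1, "chip", None), (2, "alu", 1), (3, "cache", 1), (4, "adder", 2), (5, "gate", 4)],
)


def subparts(connection, root_id):
    """Collect a part and everything beneath it, one join per hierarchy level."""
    sql = """
        WITH RECURSIVE tree(part_id, name, depth) AS (
            SELECT part_id, name, 0 FROM part WHERE part_id = ?
            UNION ALL
            SELECT p.part_id, p.name, t.depth + 1
            FROM part AS p JOIN tree AS t ON p.parent_id = t.part_id
        )
        SELECT name, depth FROM tree ORDER BY depth, name
    """
    return connection.execute(sql, (root_id,)).fetchall()


if __name__ == "__main__":
    for name, depth in subparts(conn, 1):
        print("  " * depth + name)
```

With a handful of rows this is instantaneous; the performance concern in the text arises when one design object has very many parts and deep nesting.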

Database managers tend to be designed to support transactional applications, where there are a large number of objects of limited complexity. The space of pre-programmed design options for the implementation of data structures and queries does not generally extend to the situation where there are a small number of very complex objects.

Rejection of the standard relational data model for these applications is therefore not a rejection of the model per se, but a recognition that current implementations of the standard relational data model do not perform well enough for these problems.

What can be done?

Two problems have been identified which make the standard relational model difficult to use for OO applications: the slowness of the implementation and the necessity for the definition of a large number of tables with limited context. The former problem is technical. A large amount of investment has been made in the design of implementations for transaction-oriented applications. Given sufficient effective demand, there is no reason why a sufficient investment cannot be made for applications of the OO type. In particular, there are already relational database systems optimised around storage of data primarily in main memory rather than on disk. For example, Monet, a research project of the National Research Institute for Mathematics and Computer Science in the Netherlands together with the Free University of Amsterdam, has published a number of papers on the various design issues in this area. A search on the Web identifies many such products. The problem of slowness of standard relational implementations for OO applications can be taken to be on the way to solution.

The latter problem, that the data definition for an OO application requires a large number of tables with limited context, is a problem with the expressiveness of the standard relational data model. In an OO application one frequently wants to navigate the complex data structures specified. One might want the set of teams participating in a particular race in a particular event, or the set of events in which a particular competitor from a particular team is competing, or the association between teams and events defined by the many-to-many relationship between Race and Competitor. From the point of view of each of those queries, there is a nested-relational packaging of the conceptual model which makes the query simple, simpler than the standard relational representation. The NR model is unsuitable because these NR packagings are all different, and a query that does not follow the chosen packaging structure is very complex.
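The effect of the packaging choice can be imitated with nested Python dictionaries. The sample entries and the function names below are hypothetical, and dictionaries are only a stand-in for nested relations. Each query is a simple lookup in the packaging that suits it, while the second query asked against the first packaging has to unpack every level.

```python
# Hypothetical flat relation: (event, race, team, competitor).
ENTRIES = [
    ("Sprint", "Heat 1", "Red", "Ana"),
    ("Sprint", "Heat 1", "Blue", "Ben"),
    ("Sprint", "Heat 2", "Red", "Cy"),
    ("Relay", "Final", "Red", "Ana"),
]


def nest_by_event(entries):
    """Packaging A: event -> race -> team -> set of competitors."""
    nested = {}
    for event, race, team, competitor in entries:
        nested.setdefault(event, {}).setdefault(race, {}).setdefault(team, set()).add(competitor)
    return nested


def nest_by_team(entries):
    """Packaging B: team -> competitor -> set of events."""
    nested = {}
    for event, race, team, competitor in entries:
        nested.setdefault(team, {}).setdefault(competitor, set()).add(event)
    return nested


by_event = nest_by_event(ENTRIES)
by_team = nest_by_team(ENTRIES)

# Teams in a particular race of a particular event: trivial in packaging A.
teams_in_heat_1 = set(by_event["Sprint"]["Heat 1"])

# Events for a particular competitor of a particular team: trivial in packaging B.
events_for_ana = by_team["Red"]["Ana"]


def events_for_competitor_in_packaging_a(nested, team, competitor):
    """The same question against packaging A must walk every event and race."""
    return {
        event
        for event, races in nested.items()
        for teams in races.values()
        if competitor in teams.get(team, set())
    }


if __name__ == "__main__":
    print(teams_in_heat_1)
    print(events_for_ana)
    print(events_for_competitor_in_packaging_a(by_event, "Red", "Ana"))
```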

However, we have already seen that the primary representation of the data can be in a conceptual model. The relational representation can be, and generally is, constructed algorithmically. If the DBMS creates the relational representation of the conceptual model, then the conceptual model should be the basis for the query language. A query expressed on the conceptual model can be translated into SQL DML in the same sort of way that the model itself is translated into SQL DDL. In fact, there are a number of conceptual query languages which permit the programmer to construct a query by specifying a navigation through the conceptual model, for example ConQuer (Bloesch and Halpin, 1996, 1997).


Using a language like ConQuer, the programmer can specify a navigation path through the conceptual model, which when it traverses a one-to-many relationship opens the set of instances on the target side. When it traverses a many-to-many relationship, the view from the source of the path looks like a one-to-many. Such a traversal of the conceptual model provides a sort of virtual nested-relational data packaging, which can be translated into standard SQL without the programmer being aware of exactly how the data is packaged. This approach therefore is more true to the spirit of object-oriented software development since the implementation of the specification is completely hidden.
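The idea of translating a navigation path into SQL can be sketched as follows. This is not ConQuer syntax and not its translation algorithm: the `RELATIONSHIPS` table, the `path_to_sql` function and the sample schema are hypothetical. The programmer names only conceptual steps such as Competitor to Race, and the translator supplies the `Entry` table that implements that many-to-many relationship.

```python
import sqlite3

# Hypothetical mapping from a conceptual step to the joins that implement it.
RELATIONSHIPS = {
    ("Team", "Competitor"): [
        ("Competitor", "Team.team_id = Competitor.team_id"),
    ],
    # Many-to-many: implemented through the Entry table, hidden from the programmer.
    ("Competitor", "Race"): [
        ("Entry", "Competitor.competitor_id = Entry.competitor_id"),
        ("Race", "Entry.race_id = Race.race_id"),
    ],
}


def path_to_sql(path, select, where):
    """Translate a navigation path through the conceptual model into one SQL query."""
    joins = []
    for source, target in zip(path, path[1:]):
        for table, condition in RELATIONSHIPS[(source, target)]:
            joins.append(f"JOIN {table} ON {condition}")
    return f"SELECT DISTINCT {select} FROM {path[0]} {' '.join(joins)} WHERE {where}"


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Team (team_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Competitor (competitor_id INTEGER PRIMARY KEY, name TEXT, team_id INTEGER);
    CREATE TABLE Race (race_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE Entry (competitor_id INTEGER, race_id INTEGER);
    INSERT INTO Team VALUES (1, 'Red'), (2, 'Blue');
    INSERT INTO Competitor VALUES (1, 'Ana', 1), (2, 'Ben', 2);
    INSERT INTO Race VALUES (1, 'Heat 1'), (2, 'Final');
    INSERT INTO Entry VALUES (1, 1), (1, 2), (2, 1);
    """
)


def races_for_team(connection, team_name):
    """Navigate Team -> Competitor -> Race without mentioning the Entry table."""
    sql = path_to_sql(["Team", "Competitor", "Race"], "Race.title", "Team.name = ?")
    return sorted(row[0] for row in connection.execute(sql, (team_name,)))


if __name__ == "__main__":
    print(path_to_sql(["Team", "Competitor", "Race"], "Race.title", "Team.name = ?"))
    print(races_for_team(conn, "Red"))
```

The caller sees what looks like a set of races nested under a team, while the stored data stays in flat relational tables.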

1.2.14 Conclusion

The standard relational data model, where the DDL and DML are both hidden beneath a conceptual data modelling language and the DBMS is a main-memory implementation optimised for OO-style applications, presents a much superior approach to the problem of OO applications than does the nested relational data model.

1.3. Revision Points

Information is represented in an object-oriented database in the form of objects, as used in Object-Oriented Programming.

A database is a logical term used to refer to a collection of organized and related information.

Operator polymorphism: It refers to an operation’s ability to be applied to different types of objects; in such a situation, an operation name may refer to several distinct implementations, depending on the type of objects it is applied to. This feature is also called operator overloading.

The ODMG standard comprises an object model, an object definition language (ODL), an object query language (OQL), and bindings to object-oriented programming languages.

OQL collection operators include aggregate operators such as min, max, count, sum, and avg.

Structured Query Language is a high-level language that was developed to provide access to the data contained in relational databases. SQL has been widely adopted.

1.4. Intext Questions

1. Illustrate ODMG.
2. What is C++ Language Binding?
3. Explain the concept of Object Oriented Databases.
4. Define Object Definition Language.
5. Write a note on Object Query Language.
6. The usage of CORBA in Database management – Discuss.
7. Explain Entity Relationship Diagram.

1.5. Summary

o The term "object-oriented database system" first appeared around 1985.

o OO databases try to maintain a direct correspondence between real-world and database objects so that objects do not lose their integrity and identity and can easily be identified and operated upon.

o The three most basic constructors are atom, tuple, and set. Other commonly used constructors include list, bag, and array. The atom constructor is used to represent all basic atomic values, such as integers, real numbers, character strings, Booleans, and any other basic data types that the system supports directly.

o Extents: In most OO databases, the collection of objects in an extent has the same type or class. Since the majority of OO databases support types, we assume that extents are collections of objects of the same type.

o Persistent Collection: It holds a collection of objects that is stored permanently in the database and hence can be accessed and shared by multiple programs

o Transient Collection: It exists temporarily during the execution of a program but is not kept when the program terminates

o We made the ODMG object model much more comprehensive, added a metaobject interface, defined an object interchange format, and worked to make the programming language bindings consistent with the common model. We made changes throughout the specification based on several years of experience implementing the standard in object database products.

o The goal of this Object Definition Language (ODL) is to capture enough information to be able to generate the majority of most SMB web apps directly from a set of statements in the language . . .

o The C++ binding to ODBMSs includes a version of the ODL that uses C++ syntax, a mechanism to invoke OQL, and procedures for operations on databases and transactions

1.6. Terminal Exercise

1. What is Database?
2. Define ODL, OQL?
3. What is Polymorphism?
4. What do you mean by OOAD?
5. What is the main use of CORBA?

1.7. Suggested Reading


1. Bloesch, A. and Halpin, T. (1996) “ConQuer: a Conceptual Query Language”, Proc. ER’96: 15th International Conference on Conceptual Modeling, Springer LNCS, no. 1157, pp. 121-33.

2. Bloesch, A. and Halpin, T. (1997) “Conceptual Queries Using ConQuer-II”, in David W. Embley, Robert C. Goldstein (Eds.): Conceptual Modeling - ER '97, 16th International Conference on Conceptual Modeling, Los Angeles, California, USA, November 3-5, 1997, Proceedings. Lecture Notes in Computer Science 1331, Springer, 1997.

3. Elmasri, R. and Navathe, S. B. (2000) Fundamentals of Database Systems (3rd ed.). Addison Wesley, Reading, Mass.

4. Stonebraker, M., Brown, P. and Moore, D. (1999) Object-relational DBMSs: tracking the next great wave. Morgan Kaufmann Publishers, San Francisco, Calif.

1.8 Assignments

1. By using C++ write the ODL statements to fetch the data from the Inventory database.

1.9 Reference Books

Ramez Elmasri, Shamkant B. Navathe, “Fundamentals of Database Systems”, Addison-Wesley, 2000.

1.10 Learning Activities

1. Discuss in detail about SQL.
2. The usage of CORBA in Database management – Discuss.

1.11 Keywords

1. Object-Oriented Database
2. ORDBMS – Object Relational Database Management System
3. ODMG – Object Data Management Group
4. ODL – Object Definition Language
5. OQL – Object Query Language
