Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

Presented by: Vincent Cheung. Supervised by: Prof Michael Lyu, Prof K. W. Ng. Dec 18, 2000.


Page 1: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

Presented by: Vincent Cheung
Supervised by: Prof Michael Lyu, Prof K. W. Ng

Dec 18, 2000

Page 2: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

Research objectives

- Address the use of XML in enhancing data representation and system communication
- Address the use of a mediator architecture for searching in a distributed environment
- Address the use of XML and HTTP to simulate CORBA IIOP calls
- We have developed a CORBA-based mediator system that overcomes the firewall limitation to achieve a worldwide system

Page 3: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

Presentation outline

- What is a Mediator?
- Using XML
- Implementation with CORBA
- CORBA vs. Firewalls
- Overcome the Firewall
- Evaluation
- Future Work
- Conclusion
- Demonstration

Page 4: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

Searching in distributed sources

To integrate the different components of a system, there are three approaches:
- CORBA
- Mediators
- Agents

They are not orthogonal, e.g.:
- Mediators can be implemented with CORBA
- Agents may use mediators

Page 5: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

What is a mediator?

A middle layer that forwards clients' queries to the appropriate sources and integrates the data before returning it to the users.

[Figure: client UIs send queries to the mediator and receive results from it; the mediator sits between the client UIs and several database engines.]

Page 6: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

Using XML

We need a COMMON information format so that distributed, heterogeneous components can exchange data.

XML provides a good solution because:
- XML is semi-structured and highly flexible in data representation.
- Highly structured traditional relational and object-oriented data schemas can be mapped to an XML schema without losing information.
- A common XML schema for heterogeneous systems can be achieved.

Hence, we use an XML data schema in our system.

Page 7: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

A piece of XML data

<news>
  <source>South China Morning Post</source>
  <date>April 15, 2000</date>
  <title>Press warning appropriate, says Beijing </title>
  <reporter>Kong Lai-fan</reporter>
  <reporter>Greg Torode</reporter>
  <content>Beijing yesterday defended remarks made by senior SAR-based official Wang Fengchao that local media should avoid reporting separatist views. </content>
</news>
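The record above has two reporter elements, which is the kind of repetition a semi-structured format allows. As a minimal sketch (not part of the presented system), the following reads every value of a repeated element with the JDK's DOM parser. The class name NewsRecordReader and the helper textOf are hypothetical names chosen for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; not taken from the presented system.
class NewsRecordReader {

    // Returns the trimmed text of every element with the given tag name,
    // in document order. An element may occur zero, one or many times.
    static List<String> textOf(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName(tag);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent().trim());
            }
            return values;
        } catch (Exception e) {
            // Parser configuration, I/O and SAX errors are all reported the same way here.
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<news><source>South China Morning Post</source>"
                + "<date>April 15, 2000</date>"
                + "<reporter>Kong Lai-fan</reporter>"
                + "<reporter>Greg Torode</reporter></news>";
        System.out.println(textOf(xml, "reporter"));
    }
}
```

A fixed relational row would need a separate table or a fixed number of reporter columns; here the list simply grows with the number of elements present.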

Page 8: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

XML-QL

- To query semi-structured data, traditional query languages may not be adequate
- We use XML-QL, which is designed for semi-structured XML data
- An example:

where <news> $B </news> in "news_db.xml",
      <date><year>2000</year><month> 4 </month><day> 15 </day></date> in $B
construct <result> $B </result>
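XML-QL engines are not part of the JDK, so the sketch below swaps in XPath (javax.xml.xpath) to show the same kind of selection: news items dated 15 April 2000. It is an analogy under stated assumptions, not the system's query path. The class name NewsDateFilter is hypothetical, the date is assumed to be split into year, month and day elements as in the query above, and the sketch returns only the titles of the matching items rather than whole news elements.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical XPath analogue of the XML-QL example; not the system's own code.
class NewsDateFilter {

    // Selects the title of every news element whose date is 2000-4-15.
    // normalize-space() tolerates the padding seen in "<month> 4 </month>".
    static final String TITLES_ON_2000_04_15 =
            "//news[date[normalize-space(year)='2000'"
            + " and normalize-space(month)='4'"
            + " and normalize-space(day)='15']]/title";

    static List<String> titles(String xml) {
        try {
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                    TITLES_ON_2000_04_15,
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                result.add(hits.item(i).getTextContent().trim());
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("query failed", e);
        }
    }
}
```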

Page 9: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

Why use CORBA?

- Common Object Request Broker Architecture
- Designed for application development within a distributed heterogeneous environment
- We use CORBA to build our mediator system.

Page 10: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

Architecture of our system

[Figure: N-tier architecture. 1st tier: web-based UIs. 2nd tier: servlet interface and mediator (forwarding queries and integrating results). 3rd tier: data sources and a further mediator, continuing down to the Nth tier. UI queries and results pass between the 1st and 2nd tiers; queries and results pass between the lower tiers.]

Page 11: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

Handling some special cases

Some special cases: infinite loops, broken connections and too many layers of traversal.

Infinite loops:

[Figure: three mediators forwarding the same query (qid = 123) to one another in a cycle.]

Solution: give each query a unique ID; each mediator keeps track of the IDs of all queries that have not yet been answered. A duplicated ID indicates that an infinite loop has occurred.
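The loop check described on this slide can be sketched as a set of pending query IDs. This is a minimal illustration, assuming the ID fits in a Java long; the class name PendingQueryTracker and its method names are hypothetical and do not come from the presented system.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of per-mediator loop detection.
class PendingQueryTracker {

    // IDs of queries that this mediator has accepted but not yet answered.
    private final Set<Long> pending = ConcurrentHashMap.newKeySet();

    // Returns true if the query may proceed. Returns false if the same ID is
    // already awaiting an answer, which means the query has looped back.
    boolean begin(long qid) {
        return pending.add(qid);
    }

    // Called once the answer for the query has been returned to the caller.
    void finish(long qid) {
        pending.remove(qid);
    }
}
```

A mediator would call begin() on entry to query() and return an empty result when it yields false, then call finish() in a finally block so that a failed query does not block later ones with the same ID.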

Page 12: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

Special cases

Broken connection
Solution: use a timeout parameter to specify the maximum amount of time that we are willing to wait.

Too many layers of traversal
Solution: use a maximum-layer parameter to specify the maximum number of layers that we want to traverse.

[Figure: a chain of three mediators with parameters Timeout = 15000, Max_layer = 3; Timeout = 10000, Max_layer = 2; Timeout = 5000, Max_layer = 1.]
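The figure shows both parameters shrinking at each hop (15000/3, then 10000/2, then 5000/1). The slides do not state the rule used, so the sketch below assumes one that reproduces those numbers: each layer keeps an equal share of the remaining timeout and passes the rest on. The names HopLimits and forNextLayer are hypothetical.

```java
import java.util.Optional;

// Hypothetical sketch; the decrement rule is an assumption, not the deck's.
class HopLimits {

    static final class SysPara {
        final long qid;
        final long timeout;
        final short maxlayer;

        SysPara(long qid, long timeout, short maxlayer) {
            this.qid = qid;
            this.timeout = timeout;
            this.maxlayer = maxlayer;
        }

        // Parameters to hand to the next layer, or empty if this layer is the
        // last one allowed to forward. Assumed rule: keep timeout / maxlayer
        // for this layer and pass the remainder down with one layer fewer.
        Optional<SysPara> forNextLayer() {
            if (maxlayer <= 1) {
                return Optional.empty();
            }
            long share = timeout / maxlayer;
            return Optional.of(new SysPara(qid, timeout - share, (short) (maxlayer - 1)));
        }
    }
}
```

Giving inner layers a shorter timeout than outer ones lets a mediator collect whatever its children returned before its own caller gives up.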

Page 13: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

IDL design of our system

- IDL (Interface Definition Language) defines the exported interface of CORBA objects
- Our IDL design: the parameter type for special-case handling

struct SysPara {
    long qid;
    long timeout;
    short maxlayer;
};

Page 14: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

IDL design of our system

- A mediator may make queries to databases or to other mediators
- Hence, we want databases and mediators to be very similar objects
- Both of them therefore implement the QueryEngine interface

interface QueryEngine {
    string query(in SysPara para, in string xmlquery);
};

[Figure: QueryEngine is implemented by QueryDB and QueryMed.]

Page 15: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

QueryDB object

- Directly connects to the data source
- The caller calls query()
- It takes the query statement parameter and queries the related data source
- Returns the answer as an XML string

Page 16: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

QueryMed object

- Same invoking method, query()
- Besides QueryEngine, it implements another interface, QueryMediator:

public interface QueryMediator {
    public QueryEngine[] qelist();
    public void qelist(QueryEngine[] arg);
    public void append_result(String res);
}

- qelist holds the list of QueryEngine objects, i.e. QueryMed or QueryDB objects, that the mediator will call.
- It starts a thread for each target QueryEngine object, and each thread calls append_result() to integrate the results from the various sources.
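The fan-out described here can be sketched in plain Java without an ORB. This is an illustration under assumptions, not the presented implementation: the outer class MediatorSketch is hypothetical, results are collected in a per-call list instead of through append_result(), and the integrated answer is assumed to be the partial answers wrapped in a results element.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical plain-Java sketch of a mediator's fan-out; no CORBA involved.
class MediatorSketch {

    static final class SysPara {
        final long qid;
        final long timeout;
        final short maxlayer;

        SysPara(long qid, long timeout, short maxlayer) {
            this.qid = qid;
            this.timeout = timeout;
            this.maxlayer = maxlayer;
        }
    }

    interface QueryEngine {
        String query(SysPara para, String xmlquery);
    }

    static final class QueryMed implements QueryEngine {
        private final QueryEngine[] qelist;

        QueryMed(QueryEngine[] qelist) {
            this.qelist = qelist.clone();
        }

        @Override
        public String query(SysPara para, String xmlquery) {
            // Thread-safe list standing in for append_result().
            List<String> results = Collections.synchronizedList(new ArrayList<>());
            List<Thread> workers = new ArrayList<>();
            for (QueryEngine target : qelist) {
                Thread worker = new Thread(() -> results.add(target.query(para, xmlquery)));
                worker.setDaemon(true); // a stuck source must not keep the JVM alive
                worker.start();
                workers.add(worker);
            }
            // Wait for the workers, but never longer than the timeout in total.
            // The timeout is assumed to be in milliseconds.
            long deadline = System.currentTimeMillis() + para.timeout;
            for (Thread worker : workers) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    break;
                }
                try {
                    worker.join(remaining);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
            synchronized (results) {
                return "<results>" + String.join("", results) + "</results>";
            }
        }
    }
}
```

Because QueryMed is itself a QueryEngine, a mediator's list may contain other mediators. Answers that arrive after the deadline are simply left out of the integrated result.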

Page 17: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

Problems with CORBA and firewalls

- We cannot achieve a worldwide query system because of firewalls.
- CORBA uses the Internet Inter-ORB Protocol (IIOP) for communication.
- The message body of IIOP is encoded in Common Data Representation (CDR), which translates IDL data types into a byte-ordering-independent octet string.
- Firewalls cannot decode the IIOP message body at the application level.
- But application-level gateways exist for telnet, FTP and HTTP.

Page 18: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

CORBA firewalls

- There are some firewalls dedicated to CORBA communication, e.g. IONA Orbix WonderWall and VisiBroker Gatekeeper
- They have limitations:
  - Vendor dependent: Orbix firewalls require client and server objects developed with Orbix
  - Not all CORBA features can be used: callbacks may not be available
  - Not commonly used
  - Extra purchase, etc.

Page 19: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

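CORBA and firewalls

IIOP cannot pass through the firewall, but HTTP can.

[Figure: two CORBA enclaves separated by a firewall. IIOP is used inside the enclaves; HTTP, a servlet and XML carry the calls across the firewall.]

A real worldwide CORBA system can be achieved.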

Page 20: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

Solution to firewall problem

- HTTPGateway also implements QueryEngine.
- HTTPGateway is a virtual query engine that forwards the query to the target system.

[Figure: QueryEngine is implemented by QueryDB, QueryMed and HTTPGateway.]
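The forwarding step of such a gateway can be sketched with the JDK's own HTTP client (java.net.http, Java 11 or later). Everything here is an assumption made for illustration: the class name HttpGatewaySketch, the /mediator path, and the local com.sun.net.httpserver endpoint that stands in for the remote servlet mediator and merely echoes the request inside a response element.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of a gateway's forwarding step; not the deck's code.
class HttpGatewaySketch {

    // POSTs the XML request to the target URL and returns the XML reply body.
    static String forward(URI target, String xmlRequest) {
        try {
            HttpRequest request = HttpRequest.newBuilder(target)
                    .header("Content-Type", "text/xml; charset=UTF-8")
                    .POST(HttpRequest.BodyPublishers.ofString(xmlRequest, StandardCharsets.UTF_8))
                    .build();
            HttpClient client = HttpClient.newBuilder()
                    .version(HttpClient.Version.HTTP_1_1)
                    .build();
            return client.send(request, HttpResponse.BodyHandlers.ofString(StandardCharsets.UTF_8)).body();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while forwarding", e);
        }
    }

    // Starts a throwaway local endpoint in place of the servlet mediator,
    // forwards one request to it and returns the reply.
    static String demoRoundTrip(String xmlRequest) {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/mediator", exchange -> {
                String received = new String(exchange.getRequestBody().readAllBytes(), StandardCharsets.UTF_8);
                byte[] reply = ("<response>" + received + "</response>").getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, reply.length);
                try (OutputStream body = exchange.getResponseBody()) {
                    body.write(reply);
                }
            });
            server.start();
            try {
                int port = server.getAddress().getPort();
                return forward(URI.create("http://127.0.0.1:" + port + "/mediator"), xmlRequest);
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Wrapped in a class that implements QueryEngine, forward() would let a mediator treat the remote enclave like any other entry in its list.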

Page 21: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

Simulated call

[Figure: two CORBA enclaves separated by a firewall. Mediator M and HTTP Gateway H are in one enclave; Servlet Mediator SM and a DB object are in the other.]

Page 22: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

Sample XML message in parameter passing

<request object="QueryMed" method="query" return="string">
  <parameter order="1" name="para">
    <SysPara>
      <qid>398498241824033984092</qid>
      <maxlayer>4</maxlayer>
      <timeout>2000</timeout>
    </SysPara>
  </parameter>
  <parameter order="2" name="QueryStatement">
    <string>where &lt;news&gt; $B &lt;/news&gt; in "database.xml", &lt;keyword&gt;satellite&lt;/keyword&gt; in $B construct &lt;result&gt; $B &lt;/result&gt;</string>
  </parameter>
</request>
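A message of this shape can be assembled with plain string building, provided the query statement is escaped so that its angle brackets do not break the enclosing document. The sketch below is an assumption-laden illustration: RequestMessage, escape and build are hypothetical names, and qid is passed as a string because the sample value above is longer than a 64-bit integer can hold.

```java
// Hypothetical sketch of marshalling query() parameters into the XML request.
class RequestMessage {

    // Escapes the characters that may not appear literally in XML character data.
    // The ampersand must be handled first so that later entities are not re-escaped.
    static String escape(String text) {
        return text.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;");
    }

    // Builds the request for QueryMed.query(para, xmlquery).
    static String build(String qid, short maxlayer, long timeout, String xmlquery) {
        return "<request object=\"QueryMed\" method=\"query\" return=\"string\">"
                + "<parameter order=\"1\" name=\"para\"><SysPara>"
                + "<qid>" + escape(qid) + "</qid>"
                + "<maxlayer>" + maxlayer + "</maxlayer>"
                + "<timeout>" + timeout + "</timeout>"
                + "</SysPara></parameter>"
                + "<parameter order=\"2\" name=\"QueryStatement\">"
                + "<string>" + escape(xmlquery) + "</string>"
                + "</parameter></request>";
    }
}
```

The receiving side gets the original query text back when its XML parser resolves the entities, so no custom unescaping is needed there.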

Page 23: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

Its DTD

<!DOCTYPE request [
  <!ELEMENT request (parameter*)>
  <!ATTLIST request object CDATA #REQUIRED>
  <!ATTLIST request method CDATA #REQUIRED>
  <!ATTLIST request return CDATA #REQUIRED>
  <!ELEMENT parameter (SysPara | string)>
  <!ATTLIST parameter order CDATA #REQUIRED>
  <!ATTLIST parameter name CDATA #IMPLIED>
  <!ELEMENT SysPara (qid,maxlayer,timeout)>
  <!ELEMENT qid (#PCDATA)>
  <!ELEMENT maxlayer (#PCDATA)>
  <!ELEMENT timeout (#PCDATA)>
  <!ELEMENT string (#PCDATA)>
]>

Page 24: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

Evaluation

ADVANTAGES:
- It overcomes the CORBA IIOP vs. firewall problem by using HTTP calls, XML and servlets.
- Security can still be maintained: external CORBA objects can call only the objects combined with the servlet.
- Both primitive data types, like String, and complex classes, like SysPara, can be well represented by XML in parameter passing.

Page 25: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

Evaluation

- Clients and servers are not required to be CORBA implementations.
- Internal CORBA objects do not notice any difference between calling an external object and calling an internal object.
- By sharing the same XML and IDL standards, we can easily achieve a worldwide query system.

Page 26: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

Evaluation

WEAKNESS:
- Using HTTP calls to simulate IIOP is relatively slower than using firewalls dedicated to CORBA, because extra work is needed to initialize the servlet, convert the parameters to XML format, invoke the servlet, etc.

Page 27: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

Future Work

- Generalize the IIOP simulation
  - For all kinds of objects, parameter types, return types, etc.
  - Direct code generation from IDL
- Design a mechanism that supports CALLBACKS
  - When callbacks can be simulated, we can enhance the features of our mediator system

Page 28: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

Conclusion

- Using mediators in querying distributed sources
- How to use CORBA and XML to integrate the system
- Cooperation between XML and CORBA in the simulation of IIOP calls

Page 29: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

Demonstration

Aim to show that the mediator system can:
- make queries to various sources in parallel
- go beyond firewalls

[Figure: demonstration setup. User Interface (SHB 1027); Servlet Mediator, Data Source 1 and HTTP Gateway (H111, Kuomao Hall); FIREWALL; Servlet Mediator and Data Source 2 (SHB 913, pc90003).]

Page 30: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

Q & A Session

Questions and comments are welcome

Page 31: Information Searching and Retrieval from Distributed Databases using Mediators, CORBA and XML

<appreciation>Thank You</appreciation>