HEPTOX¹: Marrying XML and Heterogeneity in Your P2P Databases

Angela Bonifati (Icar CNR, Italy), Elaine Q. Chang, Laks V.S. Lakshmanan, Terence Ho, Rachel Pottinger, Shuan Wang and Ting Wang (UBC, Canada)

¹ HEPTOX stands for “Heterogeneous Peers Talk!”

See our demo! This afternoon 14:00-15:30 & Thursday 14:00-15:30

Motivations for Marrying XML and Heterogeneity in P2P Databases

Peers contain similar and related XML data

Each peer wants to keep its own schema and yet needs to be mapped to others’ schemas [cf. LenzeriniTutorial@PODS02]

– autonomy and flexibility are important in P2P

– a global mediated schema is unfeasible

Queries are still formulated against one (e.g. local) schema

– they need to cross the different schemas transparently

Previous work [Clio, Hyperion, Piazza] could only handle limited heterogeneity

A P2P Network of Heterogeneous Hospitals

[Figure: a network of peers Peer1, Peer2, ..., Peern, each with its own DTD (DTD1, DTD2, ..., DTDn). Schema fragments shown include Event (Date, Problem), Admission (Coronary, Pulmonary), ID and InsName.]

Heterogeneity and XML data: an example

Consider a P2P network of hospitals and an unfortunate patient moving among them:

– Option#1: the patient carries his/her own files and query translation is done manually

It is error-prone, and unfeasible with several moves and with frequent joining/leaving of peers

– Option#2: the hospital db admin manually writes the mappings

It is not that easy to find a person who knows the rule machinery that well!

Heterogeneity and XML data: an example

– Option#3: the hospital db admin provides informal arrows/boxes correspondences w.r.t. a set of acquaintances:

Users/applications do not need to know the underlying mapping machinery and can keep things simple

A peer entering the network is a lightweight operation

What do input mappings look like?

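[Figure: two schema trees, Montreal Hospital and Boston Hospital, labeled Source Schema and Target Schema.

Montreal Hospital elements: Patient, Admission, ID, MedCr#, Name, Hist, Event, Problem, Date, Treat, Desc, Doc, AdmDate, DisDate, PatRef.

Boston Hospital elements: Pulmonary, Admission, Coronary, ID, InsName, Policy#, Enter, Leave, Patient, Progress, PatRef, Symptom, Treatment, Date, Desc.

Edge annotations: *, + and ? (DTD occurrence indicators), @ (attribute markers) and ID/IDREF links.]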

Mappings in hospital schemas

XML mappings in HePToX are specified informally by arrows and boxes which encode:

– Data/Metadata correspondences

– Structure correspondences

The informal mappings are translated to Datalog-like mapping rules.

Data/Metadata conflicts are not dealt with in previous works:

– Addressed in HePToX

[Figure: Event with Date and Problem on one side; Admission with Pulmonary and Coronary on the other.]
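To make the data/metadata conflict on this slide concrete, here is a minimal sketch in Python. It is not HePToX's rule language or its rule inference algorithm; it only hand-codes one transformation of the kind such a mapping would describe. It assumes a source fragment where the problem is stored as a data value (Event/Problem) and a target where the same information appears as a tag (Coronary or Pulmonary under Admission), following the element names in the figure. The function name, the sample values, and the placement of Date under the problem element are hypothetical simplifications.

```python
import xml.etree.ElementTree as ET

# Hypothetical source fragment: the problem is a *data value* inside <Problem>.
# The dates are made-up sample values.
SOURCE = """
<Patient>
  <Event><Problem>Coronary</Problem><Date>2005-01-10</Date></Event>
  <Event><Problem>Pulmonary</Problem><Date>2005-03-02</Date></Event>
</Patient>
"""


def events_to_admission(patient: ET.Element) -> ET.Element:
    """Turn each Event's Problem value into an element tag under Admission."""
    admission = ET.Element("Admission")
    for event in patient.findall("Event"):
        problem = event.findtext("Problem", default="").strip()
        if not problem:
            continue  # no value to promote to a tag
        # Data -> metadata: the text value becomes the element name.
        node = ET.SubElement(admission, problem)
        # Simplification: carry the date along directly under the new element.
        ET.SubElement(node, "Date").text = event.findtext("Date", default="")
    return admission


target = events_to_admission(ET.fromstring(SOURCE))
print(ET.tostring(target, encoding="unicode"))
```

The point of the sketch is the single line that uses a text value as a tag name: a purely structural, tag-to-tag mapping cannot express that step, which is why data/metadata correspondences are listed separately from structure correspondences.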

Demo scenarios

Doctors: track patients

Patients: access their data

Insurance Companies: define the policy for a set of patients

Etc.

HePToX Contributions

GUI for specifying correspondences (arrows/boxes)

Datalog-like mapping language for working with complex XML trees

Rule Inference algorithm for producing the Datalog-like mapping rules

Query Translation algorithm based on those mappings that works for a significant subset of XQuery (tree patterns, TPs, with joins)

Our “Data Exchange” semantics, which differs from GAV and LAV mappings

Demo Screenshots 1/2

Schema Mappings By Boxes/Arrows and Corresponding Datalog-like Rules

Demo Screenshots 2/2

Details of Query Translation Algorithm (for each pair <TP, MR>)

On HEPTOX implementation

The demo shows the following features:

– Draw mappings and show the generation of rules

– Show the query translation algorithm at work

– Show a real network emulation with Emulab

HEPTOX is implemented in Java:

– Uses QizX [QizX] as the underlying XQuery engine

– FreePastry as the underlying P2P protocol

– Emulab as the real network emulation environment

– It consists of ~10,000 lines of code

Come visit our demo booth: This afternoon 14:00-15:30 & Thursday 14:00-15:30