From Enterprise Information Integration to Community-Based Mediation
Alin Deutsch,
Yannis Katsis,
Michalis Petropoulos
Yannis Papakonstantinou
A presentation by
on joint works with
CSE DepartmentCSE Department
Data Integration Requirements & Desiderata (high level)
• Provide application with integrated database– single point of (query/update) access to the
data• Provide distribution and heterogeneity
transparency– heterogenous formats, heterogenous
interfaces, different rates of change (static versus dynamic), autonomous sources
• Decouple application logic from integration• Easily add/change sources• Customize the delivery of content
Integration Software
Most-Generic Integration System Architecture
. . .Information
SourceInformation
SourceInformation
Source
ClientApplication
ClientApplication
ClientApplication
Information+ Service
Source
SIGMOD Community’s Architecture forUnified Access to Data & Services
Local CommonLocal CommonModel (XML) View Model (XML) View
+ Services+ Services
Mediator
Integrated (XML) Global View / Ontology + ServicesIntegrated (XML) Global View / Ontology + Services
Wrapper
Cache &Replication
(Web) ClientApplication
Information+ Service
Source
Wrapper
(Web) ClientApplication
(Web) ClientApplication
Local CommonLocal CommonModel (XML) View Model (XML) View
+ Services+ Services
Approaches towards View-BasedData Integration
Local As View(LAV)
Global As View(GAV)
GLAV=GAV+LAV
IntegrationSpecification
Method
Info Model &Query Language
Relational (SQL)
XML (XQuery)
Object-Oriented
Warehousing(materialized
views)
On-Demand(virtual views)
Storage Method
Enterprise Information IntegrationReaches Maturity
• Materialized View (Warehousing) approach well-adopted since mid/late 90s– GAV function role played by Extract-Transform-Load tools– Human Intervention Occasionally Needed in Cleaning Up
• Concordance tables for Object Identification
• Virtual View (Mediation) approach at early adoption – many years of research
• Distributed db’s, federated db’s, mediators– moving well into mainstream
• BEA AquaLogic (XML, Virtual, GAV view)• IBM DB2
Enterprise
Current Enterprise InformationIntegration Deployments
IntegrationIntegrationAdminAdmin
Marketing
Local View MLocal View M
Integrated Global View V(M, S, E)Integrated Global View V(M, S, E)
Sales
Local View SLocal View S
Service
Local View ELocal View E
View Builder(design time)
Mediator QueryProcessor (run time)
GAV View V
Schemas Data
• Small Domain• Mostly Vertical Partition of Sources • Primarily Application-Driven View or Identity View• Integration Administrator/Developer in charge
Enterprise
Opportunities and Needs Presented by “Motivated” Communities
• Emerging Myriads of Internet Communities of– Myriads of sources and clients– Source owners motivated to participate
• EII does not address needs– Expensive– Bottleneck of Single Integration Admin
• Make building corresponding portals similar to starting and participating in newsgroups
• Appropriate tools needed to enable source owner and client participation
Communities
InformationSource 1
Local XML View S1
Client Application 1
Integrated XML View G
InformationSource n
Local XML View Sn
GLAV: V1 GLAV: Vn
GAV: V1a GAV: Vm
a
Application View V1a (G) Application View Vm
a (G)
Client Application m
Integrated View Owner’s Domain
Source Owner’sDomain
SourceOwner’sDomain
DataServices 1
DataServices n
A Community-Based Information Modeling Architecture
Data Services
Visual Tools Matter! (example from the Enosys Query Builder)
C:\Enosys\projects\allPONS.qpr* - Enosys Query Builder
OPEN & VIEW OPEN & VIEW SOURCE SCHEMAS IN SOURCE SCHEMAS IN
XMLXML DRAG & DROP TO CREATE TARGET XML VIEW
TARGET SCHEMA
(XML VIEW)
AUTOMATICALLY GENERATED
MAPS
1
2
C:\Enosys\projects\allPONS.qpr* - Enosys Query Builder
XML RESULTXQUERY BASED
ON DESIGN SPECS
RUN & TEST XQUERYRUN & TEST XQUERY3
Architecture for Large-Scale Data Integration System and Design Tools
How can the user query and Browse the integrated data?
QURSED
What queries can my app issue?What integrated view services can I build?
CLIDE
How do I export my databaseservices functionality?
RIDE-Services
SourceDomain
WebDomain
ApplicationDomain
IntegrationDomain
Application
DataSource
DataSource
MediatorGlobal View
Schema
Developer
IntegrationEngineer
SourceOwner
Application
Web Forms& Reports
SourceSchema
…
WebService
WebService
WebService
SourceSchema …
How do I export my data?
RIDE
WebServices
Cache(Metadata)
Dual Interactive Registration Problems
New App
?New Query
?
Source Services
Register Source Given GlobalRegister Source Given Global
Schema, Constraints &QueriesSchema, Constraints &QueriesGuide the client in
query/form writing
Apps
?Queries
?
Register Client Given SourcesRegister Client Given Sources
Guide the source owner in registering a new source and services
New Source and Services
Global ViewGlobal View
Source Data Registration
• How do my source attributes map to global attributes– mappers & automatic
matchers• How do my data relate to
queries & other sources– Inconsistencies?– What takes to
contribute to queries?– How much should I
clean up?• Multiple ways of dealing
with redundancy
Apps
?Queries
?
Server SideServer Side
New Source
Global View
How to achieve this Goal
Apps
?Queries
?
BeforeBefore
New Source
Apps
?Queries
?
NowNow
New Source
Look at all sources & queries
Decide how to register
your source
Follow the suggestions
of the interface
Global View Global View
Source Registration Tool
Our Goal in Source Registration
Guide the source owner visually through the registration of the source
so as to avoid/warn about (potential) inconsistencies and contribute information to the answer of the queries
while exposing the minimum information possible and/or minimizing effort
The Problem
?
Client Queries
? ?
Mediator(Global DB)
Sources (Actual Local DBs)
17
The Contribution Problem
?
Client Queries
? ?
Sources (Actual Local DBs)
• What is the contribution of source S to the result of the query Q?
S
Q
18
Mediator(Global DB)
The Problem
Client Queries
?
Sources (Actual Local DBs)
• What is the contribution of source S to the result of the query Q?
S
Q
19
Mediator(Global DB)
Q:Q: carscars
carsreviews
Q: cars JOIN reviewsQ: cars JOIN reviews
S is Self Sufficient w.r.t. Q
S is Now Complementary w.r.t. Q
Relational Schemas: Local and Global
attributes
relations
?
• Relational Schemas
• Visual Representation
makeS1
CarmakeOriginSales
autoS2
IdModel
detailIdEngineBaseprice
Source 1Business Magazine
Source 2Car Magazine
GlobalCar Portal
carG
ModelCarmake
Carmakebrand
Origin
DoorsBaseprice
Carmake
20
Source Registration using GLAV Mappings
• Source Registration:Source Registration: Correspondence between a source schema and the global schema = Set of Mapping Constraints of the form
(U V)
• Open WorldOpen World
• Global and Local As View (GLAV)Global and Local As View (GLAV)
?
CQ=
over source schema
CQ= over global schema
21
Target Constraints
• Constraints on the global schema = Set of Constraints of the form
(U V)
• Also Expresses Dependencies (PKs, Ref Integrity, …)Also Expresses Dependencies (PKs, Ref Integrity, …)
?
CQ=
over global schema
CQ= over global schema
22
Visual Representation of Mappings (1)
?
• Visual Representation (IBM Clio)
brandCO
Business Magazine: Provides Carmake and Origin
U1(C, O) :- make(C, O, S)V1(C, O) :- brand(C, O)
(U1 V1)
carG
ModelCarmake
Carmakebrand
Origin
DoorsBaseprice
makeS1
CarmakeOriginSales
OC
makeCOSOC
23
Visual Representation of Mappings (2)
?
• Visual Representation (IBM Clio)
Car Magazine: Provides Model, Carmake and Baseprice
autoIMC
detailIEB
carMC?B
autoS2
IdModel
detailIdEngineBaseprice
Carmake
carG
ModelCarmake
Carmakebrand
Origin
DoorsBaseprice
24
Query Semantics
• Queries in UCQ=
• Set of Possible Global InstancesSet of global instances that satisfy all constraints
• Query Answers = Set of Certain AnswersThe tuples appearing in the answer to Q for any possible global instance
Possible globalinstances
Answer to Qfor any of the
possible globalinstances
CertainCertainAnswersAnswers
to Qto Q
Answer to Qfor any of the
possible globalinstances
CertainCertainAnswersAnswers
to Qto Q
Possible globalinstances
?
26
Source Instance’s Contribution
Answer to Q - Answer to
Q
• For given instances of the sources
Contribution to Q of Source Instance
=
The tuples in answer of Q not provided by the other sources
27
Source Registration’s Contribution
• Source Registration: Source Mappings
• Degrees of Source Registration’s Contribution
Self Sufficient
Now Complementary
Later Complementary
Unusable
More contribution
Less contribution
28
Self Sufficient Registration: Example
?Baseprices of ModelsBaseprices of Models
carM3
BMW?
45K
carG
ModelCarmake
Carmakebrand
Origin
DoorsBaseprice
Green Registration isSelf Sufficient
carModel
CarmakeDoors
Baseprice
Example
29
Self Sufficient Registration: Definition
Answer to Q
Source instance
s.t.
The source has a non empty contribution in the absence
of the other sources
Answer to Q-
Self SufficientSelf Sufficient
X X X X
30
Now Complementary Registration: Example
?
Baseprices of ModelsBaseprices of Modelsby German manufacturersby German manufacturers
carM3
BMW?
45K
carG
ModelCarmake
Carmakebrand
Origin
DoorsBaseprice
brandBMW
Germany
Green Registration isNow Complementary
brandCarmakeOrigin =
‘Germany’
carModel
CarmakeDoors
Baseprice
Example
31
Now Complementary Registration: Definition
Answer to Q
Not Self Sufficient
&
Source instances
s.t.
The source has a non empty contribution in combination
with the other existing sources
Answer to Q-
Now ComplementaryNow Complementary
32
Later Complementary Registration: Example
?
Baseprices of ModelsBaseprices of Modelsby German manufacturersby German manufacturers
carM3
BMW?
45K
carG
ModelCarmake
Carmakebrand
Origin
DoorsBaseprice
brandBMW
Germany
Green Registration isLater Complementary
brandCarmakeOrigin =
‘Germany’
carModel
CarmakeDoors
Baseprice
Example
33
Later Complementary Registration: Definition
Answer to Q
Not Self Sufficient &Not Now Complementary
&
Potential future sources
& Source instances
s.t.
The source has a non empty contribution in combination
with the future sources
Answer to Q-
Later ComplementaryLater Complementary
34
Unusable Registration: Example
?car
G
ModelCarmake
Carmakebrand
Origin
DoorsBaseprice
brandCarmake
OriginOrigin of CarmakesOrigin of Carmakes
Green Registration isUnusable
Example
35
Unusable Registration: Definition
Answer to Q
Not Self Sufficient &Not Now Complementary &Not Later Complementary
The source has a empty contribution regardless of
what sources enter the system
Answer to Q- =
UnusableUnusable
36
Subtleties for Unusable Registrations
?
Baseprices and DoorsBaseprices and Doorsof Modelsof Models
carG
ModelCarmake
Carmakebrand
Origin
DoorsBaseprice
carM3
BMW?
45K
carM3
BMW2?
Green Registration isUnusable
carModel
CarmakeDoors
Baseprice
Example
37
In presence of PK Unusable Example becomes Later Complementary
?
Baseprices and DoorsBaseprices and Doorsof Modelsof Models
carG
ModelCarmake
Carmakebrand
Origin
DoorsBaseprice
Green Registration isLater Complementary
carM3
BMW?
45K
carM3
BMW2?
M3BMW
M3BMW
carM3
BMW2
45K
carModel
CarmakeDoors
Baseprice
Example
38
Decidability Results
None Primary keys Primary keys + Referential Integrity Constraints
Self SufficientYes Yes No
Now complementary Yes Yes No
Later complementary Yes Yes ?
UnusableYes Yes ?
Target constraintsTarget constraints
DDeeggrreeeess
Overview: What is decidableOverview: What is decidable
39
Issues
UniqueUnique client query client query MultipleMultiple client queries client queriesVsVs
Contribute to: - all queries? - one query? - specific queries? - some queries based on some ranking?
DataData independence independence DataData dependence dependenceVsVs
M1: cars, refPricesM2: reviewsQ: cars JOIN reviews JOIN refPrices
e.g.
DB1: cars, refPrices (Audis)DB2: reviews (Hondas)
(M2, Q) now-complementarybut Certain Answers for Instances DB1, DB2 =
Putting it all together
ArchitectureArchitecture
?Query
Global Schema
LocalSchemas
Q
S1 Sn Sn+1
…M1 Mn Mn+1
S’
Guide the source owner visually through the registration of the source
so as to raise contribution to the answer of the queries
while exposing the minimum info possible and/or minimizing effort
4 categories:4 categories:
Self Sufficient / Now Complementary / Later Complementary / Unusable
Query Answering / Mappings / SchemasQuery Answering / Mappings / Schemas
Architecture
Goal
Registeredsources
Newsource
Mappings
…
Contribution
Example 1
carCommunity
price
modeldrive
usedAd*
modelreview*
vin
*
price
refPrice*model
model
condition
quality
Local SchemasLocal Schemas Global SchemaGlobal Schema
carAppQuery
price
modeldrive
usedAd*
modelreview*
vin
*
price
refPrice*model
model
condition
quality
QueryQuery
Without primary keys in the target
Unusable BLUE: Map at least one of the groups
carAutoTrader
vin
cmodel
price
ad*
carId
id*
Example 1
carCommunity
price
modeldrive
usedAd*
modelreview*
vin
*
price
refPrice*modelcondition
quality
Local SchemasLocal Schemas Global SchemaGlobal Schema QueryQuery
Unusable
= cmo
carAutoTrader
vin
cmodel
price
ad*
carId
id* car
AppQuery
price
modeldrive
usedAd*
modelreview*
vin
*
price
refPrice*model
model
condition
quality
model
Without primary keys in the target
Example 1
carCommunity
price
modeldrive
usedAd*
modelreview*
vin
*
price
refPrice*modelcondition
quality
Local SchemasLocal Schemas Global SchemaGlobal Schema QueryQuery
= cmo
carAutoTrader
vin
cmodel
price
ad*
carId
id* car
AppQuery
price
modeldrive
usedAd*
modelreview*
vin
*
price
refPrice*model
model
condition
quality
model= price
LaterComplementary
Without primary keys in the target
Example 2
carCommunity
price
modeldrive
usedAd*
modelreview*
vin
*
price
refPrice*model
model
condition
quality
Local SchemasLocal Schemas Global SchemaGlobal Schema
carAppQuery
price
modeldrive
usedAd*
modelreview*
vin
*
price
refPrice*model
model
condition
quality
QueryQuery
Unusable
carAutoTrader
vin
cmodel
price
ad*
carId
id*
With primary keys in the target
Example 2
carCommunity
price
modeldrive
usedAd*
modelreview*
vin
*
price
refPrice*model
model
condition
quality
Local SchemasLocal Schemas Global SchemaGlobal Schema
carAppQuery
price
modeldrive
usedAd*
modelreview*
vin
*
price
refPrice*model
model
condition
quality
QueryQuery
= price
= vin
carAutoTrader
vin
cmodel
price
ad*
carId
id*
LaterComplementary
With primary keys in the target
Lessons learned
To merge data with that of other sources (become complementary):
Pick a relation and provide…
…all its attributes asked by the query
…its primary key and one of its attributes asked by the query
In absence of primary keys
In presence of primary keys
The number of choices increases in presence of primary keys
Foreign keys on the target affect the suggestions
Target constraints make a differenceTarget constraints make a difference
Large-Scale Data Integration Systems
How can the user query and Browse the integrated data?
QURSED
What queries can the mediator answer for me?
CLIDE
How do I export my databaseservices functionality?
RIDE-Services
SourceDomain
WebDomain
ApplicationDomain
IntegrationDomain
Application
DataSource
DataSource
MediatorGlobal View
Schema
Developer
IntegrationEngineer
SourceOwner
Application
Web Forms& Reports
SourceSchema
…
WebService
WebService
WebService
SourceSchema …
How do I export my data?
RIDE
Running Example
SchemaComputers(cid, cpu, ram, price)NetCards(cid, rate, standard, interface)
Views
V1 ComByCpu(cpu) (Computer)*SELECT DISTINCT Com1.*FROM Computers Com1WHERE Com1.cpu=cpu
V2 ComNetByCpuRate(cpu, rate) (Computer,
NetCard)*SELECT DISTINCT Com1.*, Net1.*FROM Computers Com1, Network Net1WHERE Com1.cid=Net1.cidAND Com1.cpu=cpuAND Net1.rate=rate
Parameterized Views
DellDell CiscoCiscoSchema
Routers(rate, standard, price, type)
Views
V3 RouByTypeW() (Router)*
SELECT DISTINCT Rou1.*FROM Routers Rou1WHERE Rou1.type='Wired'
V4 RouByTypeWL() (Router)*
SELECT DISTINCT Rou1.*FROM Routers Rou1WHERE Rou1.type='Wireless'
Computersfor a given cpu
Computers & NetCardsfor a given cpu & rate
Wired Routers
Wireless Routers
Running Example
• Global schema puts togetherthe Dell and Cisco schemas
• Resembles the schema of CNET.com portal
Column Associations• (Computers.cid, NetCards.cid)• (NetCards.rate, Routers.rate)• (NetCards.standard, Routers.standard)
Global Schema
V1
Application
V3V2
Dell Cisco
MediatorGlobal Schema
Developer
V4
Sophisticated Mediators Make Feasibility Hard to Predict
Feasible Queries FQ• Equivalent CQ query rewritings using the views• Might involve more than one views• Order might matter
V4
Mediator
RouByTypeWL()
Routers.*
10.11b
50Wirele
ss
54.11g
120Wirele
ssA
B
V2
ComNetByCpuRate(‘P4’, ‘10’)
C
DComputers.* NetCards.*A123
P4512 400A123
10.11b
USB
B123
P41024
550B123
54.11g
USB
Feasible
ComNetByCpuRate(‘P4’, ‘54’)
Computers.* NetCards.* Routers.*A123
P4 512 400A123
10.11b
USB
10.11b
50Wireless
B123
P41024
550B123
54.11g
USB
54.11g
120Wireless
E
Query:Get all ‘P4’ Computers, together with their NetCards and their compatible ‘Wireless’ Routers
Query:Get all Computers
Infeasible
Problem
1. Large number of sources2. Large number of views3. Mediator capabilities
Developer formulates an application query Is an application query feasible? If not, how do I know which ones are feasible?
Previous options:– The developer had to browse the view definitions
and somehow formulate a feasible query– Or formulate queries until a feasible one is found
(trial-and-error)
No system-provided guidance
The CLIDE Solution
A query formulation interface, which interactively guides the user toward feasible queries by employing a coloring scheme
CLIDE
V1
Application
V3V2
Dell Cisco
MediatorGlobal Schema
Developer
V4
QBE-Like Interfaces
Microsoft SQL-Server
CLIDE Interface
• Table, selection, projection and join actions• Color-based suggestions• Feasibility Flag
Projection BoxesTable Boxes
Selection Boxes
Feasibility FlagTable Alias
CLIDE Interface
Yellow required action– All feasible queries require this action
White optional action– Feasible queries can be formulated
w/ or w/o these actions
Snapshot 1
CLIDE Interface
Snapshot 2
Blue required choice of action– At least one feasible (next) query cannot
be formulated unless this action is performed
V1
Mediator
ComByCpu(‘P4’)
cidcpu
rampric
eA123 P4 512 400B123 P4 1024 550
rampric
e512 400
1024 550A B
C
CLIDE Interface
Join Lines:• Only yellow and blue are displayed• Must appear in Column Associations
Snapshot 3
CLIDE Interface
Snapshot 4
CLIDE Interface
Snapshot 5
• * any other constant• Red prohibited action
– Does not appear in any feasible query
– Lead to “Dead End” state
CLIDE Interface
Snapshot 6
V4
Mediator
RouByTypeWL()Routers.*
10.11b
512Wirele
ss
54.11g
1024
Wireless
A
B V2
ComNetByCpuRate(‘P4’, rate)
D
E
Computers.* NetCards.*A123
P4512 400A123
10.11b
50
B123
P41024
550B123
54.11g
120
ramprice
rateinterface
price
512 400 10 USB 501024
550 54 USB 120
F
CLIDE Facts
• Rapid Convergence
– At every step, yellow and blue actions lead to a feasible query in a minimum number of steps
• Completeness of Suggestions
– Every feasible query can be formulated by performing yellow and blue actions at every step
• Minimality of Suggestions
– At every step, only a minimal number of actions are suggested, i.e., the ones that are needed to preserve completeness
Join ActionTableAction
SelectionAction
Interaction Graph
• Nodes are queries– One for each qCQ
• Edges are actions– Table, selection, projection and join actions
• Green nodes are feasible queries• Infinitely big structure
– All CQ queries– All possible combinations of actions formulating them
Com1.cid=Net1.cidCom1.cpu=‘P4’Com1 Com1.ram Rou1…… Com1.price ……
s
… … ………Net1 …
Interaction Graph: Colorable Actions
• Colorable actions AC labeloutgoing edges of the current node
Net1
Com1.cpu=*
Com1.price=*
Rou1
Com1.ram=*
Com1.cid=*
Com2
Com1.cid
……
……
…
Com1.cpu
……
……
Current Node
Interaction Graph: Colors
Com1.cpu=*
…
……
……
……
…
……
…
…
CurrentNode
Net1 Com1.cid=Net1.cid
Com2.cid=Net1.cid
Com2
Com2.cpu=‘P4’ Net1.rate=‘54Mbps’
Net1.rate=’54Mbps’…… … … …
… …
Com1.cpu=*Com1.cpu=* Rou1 Net1.rate=Rou1.rate ……… …
Net1.rate=’54Mbps’ …
Com1.cid=Net1.cid
Com1.cid=Net1.cid …Net1
Com1.cpu=*
Com1.price=*
Rou1
Com1.ram=*
Com1.cid=*
Com2
Com1.cid
Com1.cpu
• Yellow action – Every path from current node n to a
feasible node contains • Blue action
– At least one feasible query cannot be formulated unless this action is performed (minimality)
• Red action – No path to a feasible node contains
Color Determined By a Finite Set of Feasible Queries
• Start by considering the closest feasible queries FQC
• FQC is sufficient to color actions in AC • Theorem: Set of Closest Feasible Queries is Finite
• How far can closest feasible queries FQC be?
Based on Maximally Contained Queries FQMC?
n
…
…
…
…
…
…
ClosestFeasibleQueries FQC
Challenge: Infinitely Many Feasible Queries
Radius
Infinitely manyfeasible queries
?…
…
Color Algorithm
• Assuming fixed SELECT clause (projection list)• Covered extensively in literature
– MiniCon, Bucket, InverseRules
• FQMC is finite
Maximally Contained Query
Maximally Contained Queries FQMC
Query: Q1Get all Computers
Query: Q2Get all Computers with a given cpu
Query: Q3Get all Computerswith a given cpu & ram
Not Maximally ContainedMaximally Contained Query
Query: Q4Get all Computerswith a given ram
Color Algorithm
• Compute maximally contained queries FQMC
• The radius pL is the longest path to a node n’ such that q(n’) in FQMC
All FQC queries are reachable via a path of length p pL
ClosestFeasibleQueries FQC
MaximallyContainedQueries FQMCn
…
…
…
…
…
…
Maximally Contained Queries FQMC
pL Radius
…
Color Algorithm
• Theorem: All queries in FQMC are in FQC
• But not all queries in FQC are in FQMC
More on Closest Feasible Queries
ClosestFeasibleQueries FQC
MaximallyContainedFeasibleQueries FQMC
…
…
…
…
…
…
More feasible nodes
n
Color Algorithm
• Naïve Approach
– Start from n and explore paths up to length pL
More on Closest Feasible Queries
ClosestFeasibleQueries FQC
MaximallyContainedFeasibleQueries FQMC
…
…
…
…
…
…
n
Color Algorithm
• Collapse Aliases to compute FQC \ FQMC
• Check satisfiability
Collapse Aliases
ClosestFeasibleQueries FQC
MaximallyContainedFeasibleQueries FQMCn
…
…
…
…
…
…
Color Algorithm
Coloring Non-Projection Actions• No interaction graph materialization• Use of containment mapping from current query to the closest feasible
ones• An action is colored
– Yellow, if is mapped into all queries in FQC
– Red, if is not mapped into any query in FQC
– Blue, if is mapped into at least one query qF in FQC, no other action in AP is mapped into qF, and is neither yellow nor red
Coloring Projection Actions• Never colored yellow• Can be colored blue only if
– the current query is feasible– it is not colored red
• Which ones are red?– Bring all projection atoms from views such that feasibility is preserved
– If action is not mapped into any query in FQC, then is red
Other Back-End
Parameterized Views Back-End
CLIDE Implementation
ActionCurrentQuery
ClosestFeasibleQueries
SchemasViews
MiniConContainment
TestCollapseAliases
ColorActions
Front-End
Developer
MaximallyContainedQueries
OptimalMaximallyContainedQueries
ColoredActions
ColumnAssociations
MiniCon• Outputs redundant and non-minimal queries• Affects CLIDE’s rapid convergence and minimality propertiesContainment Test• Well-known NP-complete problem• Polynomial when query is acyclicCollapse Aliases / Color Actions• Reuse containment mappings created by MiniCon
CLIDE Performance
0
2
4
6
8
10
12
14
(# of Joins, # of Selections) in Current Query
Tim
e (
sec) 112 Views
168 Views
224 Views
280 Views
• Queries
A-span = 7B-span = 4Selections = 4,6,8,10
A
B1…
C1
B2 C1A
BK
B1…
C1
CL
…• Schema
…
Bi
… Ci
• Views
A
BK
B1…
C1
CL
…
… …
BiM
Bi1…
CiM
Ci1…
Chains of Stars
Top Related