Post on 12-Feb-2016
Aggregate Queries in Peer-to-Peer OLAP
• Mauricio Minuto Espil, Faculty of Engineering, Universidad Católica Argentina
• Alejandro A. Vaisman, Computer Science Department, Universidad de Buenos Aires
7th International Workshop on Data Warehousing & OLAP
OUTLINE:
• CHARACTERIZATION
• PROBLEM AND PROPOSAL
• FACT INTEGRATION
• DIMENSION INTEGRATION
• AGGREGATE QUERIES
• CONCLUSIONS
Peer-to-Peer Systems
MAIN CHARACTERISTICS:
• Involves a network of interconnected peer systems;
• The network topology is not relevant;
• Each peer maintains full autonomy over its own data resources;
• Each peer may assume the role of local. The rest become acquaintances of the local peer;
• The roles of local and acquaintance among peers are not static; they are functional and are determined with respect to an operation.
Peer-to-Peer Data Management
MAIN CHARACTERISTICS:
• No global schema is assumed to exist for data;
• Each peer must manage its data according to its own perspective;
• A query may be posed on any peer; the responsive peer becomes local with respect to the query;
• Answers to queries must reflect the best attempt to gather data from all peers;
• Answers to queries posed by local peer users must conform to the view those users have of their data;
• Peers must cooperate in maintaining the local views of data.
OLAP Data in a Peer-to-Peer System
THE PROBLEM:
• OLAP data is essentially multidimensional;
• Multidimensional data consists of a collection of views of base and derived aggregated data, describing fact indicators by dimensions of analysis;
• Concepts for aggregation within dimensions are obtained from finer-grained concepts through hierarchies;
• Different peers may have related fact indicators described by different dimension hierarchies;
• Integration is needed: any summary concept that appears in a hierarchy of a peer acquaintance must be transformed into a summary concept meaningful to the local peer.
OLAP Data in a Peer-to-Peer System
THE PROBLEM:
• The expected integration is not always possible;
• Users may pose OLAP queries in a local peer expecting results involving all relevant data stored in all peers;
• Local queries must be propagated among the acquaintances;
• A rewriting of the propagated queries is needed to conform to the view of the local user;
• The rewriting technique must accomplish the data integration on the fly;
• Incomplete and uncertain results must be admitted.
Peer-to-Peer OLAP
MODEL (DEFINES):
• FACT PEERS
• DIMENSION PEERS
• AGGREGATE P2P OLAP QUERIES
• COMPLETE AND CERTAIN QUERY ANSWERS
ARCHITECTURE (INVOLVES):
• AUTONOMOUS PEER DATA MANAGEMENT
• THREE PHASE PEER TO PEER COORDINATION
• COOPERATIVE QUERY ANSWERING
Fact Integration
TYPES OF FACT:
• GENERIC FACT
• FACT PEERS
IS-A RELATIONSHIP
FACT CONCILIATION PHASE:
• SOURCE PEER: PUBLISHES GENERIC FACT DEFINITION AND DIMENSIONAL STRUCTURE
• LISTENING PEER: GENERIC FACT AGREEMENT AND DIMENSION PEERS DEFINITION
Dimension Integration
CONSISTS OF:
• LEVEL HIERARCHY INTEGRATION
• MEMBER HIERARCHY INTEGRATION
COMPRISES:
• CORRESPONDENCE DEFINITION AMONG DIMENSION LEVELS
• REVISION/MAPPING DEFINITION AMONG DIMENSION INSTANCES
INVOLVES:
• A PAIR OF DIMENSION PEERS
Level Hierarchy Integration
LEVEL CORRESPONDENCE:
• APPLIES TO SCHEMAS
• ESTABLISHES HOW A PAIR OF LEVELS ON DIFFERENT PEER DIMENSIONS ARE RELATED
• IS PRODUCED/UPDATED DURING A SCHEMA CONCILIATION PHASE
• IS MATERIALIZED AS METADATA IN CORRESPONDENCE TABLES
ORDER PRESERVING LEVEL CORRESPONDENCE
[Figure: two peer dimension hierarchies related by an order-preserving level correspondence. Levels shown: Benefit Type, Funding Class, Tax Discharge Category, Loan Type, Charity Modality, All]
Level Hierarchy Integration
A LEVEL CORRESPONDENCE THAT DOES NOT PRESERVE ORDER IS NOT ADMISSIBLE
[Figure: the same two peer dimension hierarchies with a level correspondence that does not preserve order, marked WRONG. Levels shown: Benefit Type, Funding Class, Tax Discharge Category, Loan Types, Charity Modality, All]
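The admissibility rule above can be illustrated with a minimal sketch. It assumes that each peer's level hierarchy is a simple chain listed from finest to coarsest and that a correspondence is a dictionary from local levels to acquaintance levels; the function name `preserves_order` and the level names `l1`, `l2`, `k1`, `k2` are hypothetical and do not come from the slides.

```python
def preserves_order(local_levels, acq_levels, correspondence):
    """Return True if the correspondence keeps the roll-up order of levels.

    local_levels / acq_levels: level names listed from finest to coarsest
    (assumption: each hierarchy is a simple chain).
    correspondence: dict mapping a local level to an acquaintance level.
    """
    local_rank = {level: i for i, level in enumerate(local_levels)}
    acq_rank = {level: i for i, level in enumerate(acq_levels)}
    # Visit the corresponding pairs from the finest local level upwards.
    pairs = sorted(correspondence.items(), key=lambda p: local_rank[p[0]])
    acq_positions = [acq_rank[acq] for _, acq in pairs]
    # Strictly increasing positions: no two levels are swapped or collapsed.
    return all(a < b for a, b in zip(acq_positions, acq_positions[1:]))


local = ["l1", "l2", "All"]   # hypothetical local chain
acq = ["k1", "k2", "All"]     # hypothetical acquaintance chain

admissible = preserves_order(local, acq, {"l1": "k1", "l2": "k2", "All": "All"})
crossed = preserves_order(local, acq, {"l1": "k2", "l2": "k1", "All": "All"})
```

Under these assumptions the first correspondence is admissible, while the second, which crosses `l1` and `l2`, is rejected.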
Member Hierarchy Integration
INTEGRATION BY MAPPING
• APPLIES TO INSTANCES
• ESTABLISHES HOW A PAIR OF MEMBERS OF CORRESPONDING LEVELS ARE RELATED
• IS PRODUCED/UPDATED DURING A MAPPING ACQUISITION PHASE
• MUST BE PRECEDED BY AT LEAST ONE SCHEMA CONCILIATION PHASE
• IS MATERIALIZED AS METADATA IN MAPPING TABLES
[Figure: member pairs related by mapping. l1:m1 (Local) with l'1:m'1 (Peer); l2:m2 (Local) with l'2:m'2 (Acquaintance)]
For each member m of a level l such that map(l:m) is defined,
if there exists some member m’ of level l’ satisfying roll-up(l:m) = l’:m’,
and level l’ is in dom(Correspondence),
then roll-up(map(l:m)) = map(l’:m’).
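The property above can be checked mechanically. The following is a minimal sketch, assuming roll-ups and mappings are stored as dictionaries keyed by (level, member) pairs; the function name `homomorphism_violations`, the table layout, and the sample members are hypothetical. It also assumes that a parent with no mapping counts as a violation, since the equality cannot then hold.

```python
def homomorphism_violations(rollup, mapping, corresponding_levels):
    """List mapped members for which map and roll-up do not commute.

    rollup: dict (level, member) -> (level, member), covering both hierarchies.
    mapping: dict (level, member) -> (level, member) in the other peer.
    corresponding_levels: levels in the domain of the level correspondence.
    """
    violations = []
    for source, image in mapping.items():
        parent = rollup.get(source)
        if parent is None or parent[0] not in corresponding_levels:
            continue  # the property only constrains roll-ups into corresponding levels
        # Assumption: an unmapped parent is reported as a violation.
        if parent not in mapping or rollup.get(image) != mapping[parent]:
            violations.append(source)
    return violations


# Hypothetical data: m1 and m2 roll up to p in one peer,
# but their images n1 and n2 roll up to different members in the other.
rollup = {
    ("l", "m1"): ("l'", "p"),
    ("l", "m2"): ("l'", "p"),
    ("k", "n1"): ("k'", "q1"),
    ("k", "n2"): ("k'", "q2"),
}
mapping = {
    ("l", "m1"): ("k", "n1"),
    ("l", "m2"): ("k", "n2"),
    ("l'", "p"): ("k'", "q1"),
}
conflicts = homomorphism_violations(rollup, mapping, {"l", "l'"})
```

Here `conflicts` contains only `("l", "m2")`: its image rolls up to `q2`, whereas its parent `p` is mapped to `q1`.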
Member Hierarchy Integration
MAPPINGS: HOMOMORPHISM PROPERTY
[Figure: diagram over members l:m and l':m’, with map and roll-up edges, illustrating that mapping and roll-up commute]
Member m’ in level l’ is conflicting: it cannot be mapped.
An approach based exclusively on mapping is not always effective.
Member Hierarchy Integration
HOMOMORPHISM MAY NOT ALWAYS BE GUARANTEED
[Figure: members l:m1 and l:m2 and member l':m’, connected by map and roll-up edges]
MAPPINGS DO NOT SUFFICE: REVISIONS MAY BE NECESSARY
Member Hierarchy Integration
REVISIONS AFFECT THE VIEW A PEER HAS OF THE HIERARCHY OF ITS ACQUAINTANCE ONLY
[Figure: LOCAL members l:m1 and l:m2; ACQUAINTANCE member l':m’, labeled as the conflicting member]
A REVISION BY SPLITTING MAY BE USED TO REPAIR CONFLICTS, GIVING WAY TO MAPPABLE MEMBERS
Member Hierarchy Integration
EXAMPLE OF A REVISION: CONFLICTING MEMBER SPLIT
[Figure: LOCAL members l:m1 and l:m2; ACQUAINTANCE members l:m1’ and l':m2’, labeled as non-conflicting members]
A REVISION BY RECLASSIFYING MAY BE AN ALTERNATIVE TO RESTORE HOMOMORPHISM
Member Hierarchy Integration
EXAMPLE OF A REVISION: CONFLICTING MEMBER RECLASSIFICATION
[Figure: LOCAL and ACQUAINTANCE hierarchies with members l:m1, l:m2, l:m3, l:m’ and l':m”, labeled as non-conflicting members]
Member Hierarchy Integration
REVISE AND MAP APPROACH:
LOCAL PEER:
• PRODUCES AND BROADCASTS REVISION AND MAPPING DEFINITIONS TO POTENTIAL ACQUAINTANCES
ACQUAINTANCE:
• REVISES ITS OWN HIERARCHIES, PRODUCING A REVISED INSTANCE (REVISED ROLL-UPS) WITH RESPECT TO THE LOCAL PEER
• STORES INFORMATION ON MAPPINGS IN METADATA MAPPING TABLES
Whenever some member m2’ of a level l’ is not mapped, a bottom-up completion approach for query answering is employed: information on non-mapped members and their roll-ups is stored in metadata completion tables.
Member Hierarchy Integration
BOTTOM-UP COMPLETION APPROACH
[Figure: members l:m1, l:m2, l':m2’ and the non-mapped member l':m1’, connected by map and roll-up edges; one roll-up is marked as incomplete]
P2P OLAP Queries
Syntactical Structure (Datalog Style):
query(Z1, ..., Zn, aggr(M), Set of Peers) ←
    Generic Fact(X1, ..., Xn, M),
    rollup dimension d1 from bottom level to desired level l1 (X1, Z1),
    ...,
    rollup dimension dn from bottom level to desired level ln (Xn, Zn);
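To make the query structure concrete, here is a minimal sketch of how such a query could be evaluated on a single peer. It assumes facts are tuples of bottom-level coordinates followed by a measure, and that each roll-up from the bottom level to the desired level is given as a dictionary; the function name `evaluate` and the sample data are hypothetical.

```python
from collections import defaultdict


def evaluate(facts, rollups, aggr=sum):
    """Aggregate a measure after rolling every coordinate up to its target level.

    facts: iterable of (x1, ..., xn, measure) tuples at the bottom levels.
    rollups: one dict per dimension, bottom-level member -> target-level member.
    aggr: aggregate function applied to the list of measures of each group.
    """
    groups = defaultdict(list)
    for *coords, measure in facts:
        try:
            key = tuple(r[x] for r, x in zip(rollups, coords))
        except KeyError:
            continue  # a coordinate without a roll-up joins nothing, as in Datalog
        groups[key].append(measure)
    return {key: aggr(values) for key, values in groups.items()}


# Hypothetical fact table with two dimensions and one measure.
facts = [("a1", "b1", 10), ("a2", "b1", 5), ("a3", "b2", 7)]
rollups = [{"a1": "A", "a2": "A", "a3": "B"}, {"b1": "X", "b2": "X"}]
result = evaluate(facts, rollups)
```

With this data, `result` is `{("A", "X"): 15, ("B", "X"): 7}`: the tuple `(Z1, ..., Zn)` is the group key and `aggr(M)` is the value.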
Query Evaluation Process
• GENERATES A QUERY FOR EACH RELEVANT PEER (INCLUDING THE LOCAL PEER);
• GENERATED QUERIES ARE PROPAGATED TO RELEVANT PEERS;
• QUERIES FOR RELEVANT PEERS STEM FROM THE REWRITING OF THE SUBMITTED P2P OLAP QUERY;
• THE REWRITING PROCESS INTRODUCES REFERENCES TO FACT PEERS, REVISED ROLL-UPS, AND MAPPING AND COMPLETION TABLES;
• RESULTS OF PROPAGATED QUERIES ARE COLLECTED AND AGGREGATED LOCALLY TO PRODUCE THE FINAL QUERY ANSWER;
• QUERY ANSWERS MAY BE UNCERTAIN AND INCOMPLETE DUE TO BOTTOM-UP COMPLETION.
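The final collection step can be sketched as follows. This is only an illustration under strong assumptions: the aggregate is a sum (so partial results can be added), each peer returns its partial result already expressed in the local peer's members, and each group carries a boolean flag saying whether bottom-up completion left it incomplete. The function name `integrate`, the flag, and the peer names are hypothetical.

```python
def integrate(partial_results):
    """Merge per-peer partial results into one answer.

    partial_results: dict peer name -> {group key: (partial sum, complete flag)}.
    Returns a dict group key -> (total, complete flag). A group is complete
    only if every contributing peer reported it as complete.
    """
    answer = {}
    for groups in partial_results.values():
        for key, (value, complete) in groups.items():
            total, all_complete = answer.get(key, (0, True))
            answer[key] = (total + value, all_complete and complete)
    return answer


# Hypothetical partial results from the local peer and one acquaintance.
partial_results = {
    "local": {("A",): (15, True), ("B",): (7, True)},
    "acquaintance": {("A",): (4, False)},
}
answer = integrate(partial_results)
```

Here group `("A",)` totals 19 but is flagged as incomplete because the acquaintance could not fully roll up its members, while `("B",)` stays complete with a total of 7.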
Query Processing
[Figure: query processing between the Local Peer and a Relevant Peer. Steps shown: Query, Rewriting, Evaluation, Partial Result, Integration, Answer. Data shown: Fact tables, Revised Rollups, Metadata Mapping tables, Completion tables]
CONCLUSIONS: MAIN POINTS DISCUSSED
• GENERIC FACTS
• FACT CONCILIATION PHASE
• HIERARCHY LEVEL CORRESPONDENCE
• SCHEMA CONCILIATION PHASE
• REVISE AND MAP APPROACH
• BOTTOM-UP COMPLETION
• MAPPING ACQUISITION PHASE
• P2P OLAP QUERIES
• QUERY REWRITING AND EVALUATION