Aggregate Queries in Peer-to-Peer OLAP

Aggregate Queries in Peer-to-Peer OLAP. Mauricio Minuto Espil, Faculty of Engineering, Universidad Católica Argentina. Alejandro A. Vaisman, Computer Science Department, Universidad de Buenos Aires. 7th International Workshop on Data Warehousing & OLAP.


Transcript of Aggregate Queries in Peer-to-Peer OLAP

Page 1: Aggregate Queries  in Peer-to-Peer OLAP

Aggregate Queries in Peer-to-Peer OLAP

• Mauricio Minuto Espil, Faculty of Engineering, Universidad Católica Argentina
• Alejandro A. Vaisman, Computer Science Department, Universidad de Buenos Aires

7th International Workshop on Data Warehousing & OLAP

Page 2: Aggregate Queries  in Peer-to-Peer OLAP

OUTLINE:

• CHARACTERIZATION
• PROBLEM AND PROPOSAL
• FACT INTEGRATION
• DIMENSION INTEGRATION
• AGGREGATE QUERIES
• CONCLUSIONS

Aggregate Queries in Peer-to-Peer OLAP

Page 3: Aggregate Queries  in Peer-to-Peer OLAP

Peer-to-Peer Systems

MAIN CHARACTERISTICS:

• Involves a network of interconnected peer systems;
• The network topology is not relevant;
• Each peer maintains full autonomy over its own data resources;
• Any peer may assume the role of local peer; the rest become acquaintances of the local peer;
• The roles of local and acquaintance are not static; they are functional and are determined with respect to an operation.

Page 4: Aggregate Queries  in Peer-to-Peer OLAP

Peer-to-Peer Data Management

MAIN CHARACTERISTICS:

• No global schema is assumed to exist for data;
• Each peer must manage its data according to its own perspective;
• A query may be posed on any peer; the peer that responds becomes local with respect to the query;
• Answers to queries must reflect the best attempt to gather data from all peers;
• Answers to queries posed by local peer users must conform to the view those users have of their data;
• Peers must cooperate in maintaining the local views of data.

Page 5: Aggregate Queries  in Peer-to-Peer OLAP

OUTLINE:

• CHARACTERIZATION
• PROBLEM AND PROPOSAL
• FACT INTEGRATION
• DIMENSION INTEGRATION
• AGGREGATE QUERIES
• CONCLUSIONS

Aggregate Queries in Peer-to-Peer OLAP

Page 6: Aggregate Queries  in Peer-to-Peer OLAP

OLAP Data in a Peer-to-Peer System

THE PROBLEM:

• OLAP data is essentially multidimensional;
• Multidimensional data consists of a collection of views of base and derived aggregated data, describing fact indicators by dimensions of analysis;
• Concepts for aggregation within dimensions are obtained from finer-grained concepts through hierarchies;
• Different peers may have related fact indicators described by different dimension hierarchies;
• Integration is needed: any summary concept that appears in a hierarchy of an acquaintance peer must be transformed into a summary concept meaningful to the local peer.

Page 7: Aggregate Queries  in Peer-to-Peer OLAP

OLAP Data in a Peer-to-Peer System

THE PROBLEM:

• The expected integration is not always possible;
• Users may pose OLAP queries in a local peer expecting results involving all relevant data stored in all peers;
• Local queries must be propagated among the acquaintances;
• A rewriting of the propagated queries is needed to conform to the view of the local user;
• The rewriting technique must accomplish the data integration on the fly;
• Incomplete and uncertain results must be admitted.

Page 8: Aggregate Queries  in Peer-to-Peer OLAP

Peer-to-Peer OLAP

MODEL (DEFINES):

• FACT PEERS
• DIMENSION PEERS
• AGGREGATE P2P OLAP QUERIES
• COMPLETE AND CERTAIN QUERY ANSWERS

ARCHITECTURE (INVOLVES):

• AUTONOMOUS PEER DATA MANAGEMENT
• THREE PHASE PEER TO PEER COORDINATION
• COOPERATIVE QUERY ANSWERING

Page 9: Aggregate Queries  in Peer-to-Peer OLAP

OUTLINE:

• CHARACTERIZATION
• PROBLEM AND PROPOSAL
• FACT INTEGRATION
• DIMENSION INTEGRATION
• AGGREGATE QUERIES
• CONCLUSIONS

Aggregate Queries in Peer-to-Peer OLAP

Page 10: Aggregate Queries  in Peer-to-Peer OLAP

Fact Integration

TYPES OF FACT:

• GENERIC FACT
• FACT PEERS

IS-A RELATIONSHIP

FACT CONCILIATION PHASE:

• SOURCE PEER: PUBLISHES GENERIC FACT DEFINITION AND DIMENSIONAL STRUCTURE
• LISTENING PEER: GENERIC FACT AGREEMENT AND DIMENSION PEERS DEFINITION

Page 11: Aggregate Queries  in Peer-to-Peer OLAP

OUTLINE:

• CHARACTERIZATION
• PROBLEM AND PROPOSAL
• FACT INTEGRATION
• DIMENSION INTEGRATION
• AGGREGATE QUERIES
• CONCLUSIONS

Aggregate Queries in Peer-to-Peer OLAP

Page 12: Aggregate Queries  in Peer-to-Peer OLAP

Dimension Integration

CONSISTS OF:

• LEVEL HIERARCHY INTEGRATION
• MEMBER HIERARCHY INTEGRATION

COMPRISES:

• CORRESPONDENCE DEFINITION AMONG DIMENSION LEVELS
• REVISION/MAPPING DEFINITION AMONG DIMENSION INSTANCES

INVOLVES:

• A PAIR OF DIMENSION PEERS

Page 13: Aggregate Queries  in Peer-to-Peer OLAP

Level Hierarchy Integration

LEVEL CORRESPONDENCE

• APPLIES ON SCHEMAS
• ESTABLISHES HOW A PAIR OF LEVELS ON DIFFERENT PEER DIMENSIONS ARE RELATED
• IS PRODUCED/UPDATED DURING A SCHEMA CONCILIATION PHASE
• IS MATERIALIZED AS METADATA IN CORRESPONDENCE TABLES

Page 14: Aggregate Queries  in Peer-to-Peer OLAP

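Level Hierarchy Integration

ORDER PRESERVING LEVEL CORRESPONDENCE

[Diagram: two peer dimension hierarchies with correspondences drawn between their levels. Levels shown in one hierarchy: Benefit Type, Funding Class, All. Levels shown in the other: Tax Discharge Category, Loan Type, Charity Modality, Benefit Type, All.]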

Page 15: Aggregate Queries  in Peer-to-Peer OLAP

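Level Hierarchy Integration

A LEVEL CORRESPONDENCE THAT DOES NOT PRESERVE ORDER IS NOT ADMISSIBLE

[Diagram, marked WRONG: the same two peer dimension hierarchies with a correspondence that does not preserve order. Levels shown in one hierarchy: Benefit Type, Funding Class, All. Levels shown in the other: Tax Discharge Category, Loan Types, Charity Modality, Benefit Type, All.]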
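The admissibility condition above can be sketched as a small check. This is only an illustration, assuming that "order preserving" means: whenever one local level rolls up (directly or transitively) to another, their corresponding levels must be related the same way in the acquaintance. The function names, the dictionary encoding of hierarchies, and the arrangement of levels in the example are assumptions made for illustration; only the level names come from the slides.

```python
def reaches(parents, lower, upper):
    """True if `upper` equals `lower` or is reachable from it via roll-up edges."""
    seen, stack = set(), [lower]
    while stack:
        level = stack.pop()
        if level == upper:
            return True
        if level not in seen:
            seen.add(level)
            stack.extend(parents.get(level, ()))
    return False


def preserves_order(local_parents, acq_parents, correspondence):
    """Check that a level correspondence (local level -> acquaintance level)
    is monotone with respect to the roll-up order of both hierarchies."""
    for a, a_img in correspondence.items():
        for b, b_img in correspondence.items():
            if a != b and reaches(local_parents, a, b):
                if not reaches(acq_parents, a_img, b_img):
                    return False
    return True


# Hypothetical arrangement of levels (child level -> parent levels).
local_parents = {"Benefit Type": ["Funding Class"], "Funding Class": ["All"]}
acq_parents = {"Loan Type": ["Benefit Type"], "Benefit Type": ["All"]}

admissible = {"Benefit Type": "Benefit Type", "All": "All"}
not_admissible = {"Benefit Type": "Benefit Type", "Funding Class": "Loan Type"}

print(preserves_order(local_parents, acq_parents, admissible))
print(preserves_order(local_parents, acq_parents, not_admissible))
```

In the second correspondence, Benefit Type lies below Funding Class locally, but its image does not lie below Loan Type in the acquaintance, so the check rejects it.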

Page 16: Aggregate Queries  in Peer-to-Peer OLAP

Member Hierarchy Integration

INTEGRATION BY MAPPING

• APPLIES ON INSTANCES
• ESTABLISHES HOW A PAIR OF MEMBERS OF CORRESPONDING LEVELS ARE RELATED
• IS PRODUCED/UPDATED DURING A MAPPING ACQUISITION PHASE
• MUST BE PRECEDED BY AT LEAST ONE SCHEMA CONCILIATION PHASE
• IS MATERIALIZED AS METADATA IN MAPPING TABLES

l1: m1 (Local) l'1: m'1 (Peer)

l2: m2 (Local) l'2: m'2 (Acq)

Page 17: Aggregate Queries  in Peer-to-Peer OLAP

Member Hierarchy Integration

MAPPINGS: HOMOMORPHISM PROPERTY

For each member m of a level l such that map(l:m) is defined, if there exists some member m' of level l' satisfying roll-up(l:m) = l':m', and level l' is in dom(Correspondence), then roll-up(map(l:m)) = map(l':m').

[Diagram: members l:m and l':m' connected by a roll-up edge, each with a map edge to its image, and a roll-up edge between the images.]
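The homomorphism property can be checked mechanically over a mapping table. The sketch below is a minimal illustration, not the authors' procedure: members are encoded as (level, name) pairs, roll-ups and mappings as dictionaries, and a parent member with no mapping is treated as a violation. The function name and the sample members are hypothetical.

```python
def homomorphism_violations(rollup_src, rollup_dst, mapping, corresponding_levels):
    """Return the mapped members for which
    roll-up(map(l:m)) != map(roll-up(l:m)).

    rollup_src / rollup_dst: member -> parent member in each hierarchy.
    mapping: member of the source hierarchy -> member of the target hierarchy.
    corresponding_levels: source levels that are in dom(Correspondence).
    """
    violations = []
    for member, image in mapping.items():
        parent = rollup_src.get(member)
        # The property only constrains members whose parent level has a correspondence.
        if parent is None or parent[0] not in corresponding_levels:
            continue
        # Assumption: an unmapped parent counts as a violation.
        if parent not in mapping or rollup_dst.get(image) != mapping[parent]:
            violations.append(member)
    return violations


# Hypothetical members, for illustration only.
rollup_src = {("City", "Rosario"): ("Province", "Santa Fe")}
mapping = {
    ("City", "Rosario"): ("Town", "Rosario"),
    ("Province", "Santa Fe"): ("Region", "Litoral"),
}
levels = {"City", "Province"}

consistent_dst = {("Town", "Rosario"): ("Region", "Litoral")}
conflicting_dst = {("Town", "Rosario"): ("Region", "Centro")}

print(homomorphism_violations(rollup_src, consistent_dst, mapping, levels))
print(homomorphism_violations(rollup_src, conflicting_dst, mapping, levels))
```

A non-empty result identifies the members whose roll-ups disagree across the two hierarchies.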

Page 18: Aggregate Queries  in Peer-to-Peer OLAP

Member Hierarchy Integration

HOMOMORPHISM MAY NOT ALWAYS BE GRANTED

Member m' in level l' is conflicting; it cannot be mapped.

An approach based exclusively on mapping is not always effective.

[Diagram: members l:m1 and l:m2 and member l':m', connected by map and roll-up edges.]

Page 19: Aggregate Queries  in Peer-to-Peer OLAP

Member Hierarchy Integration

MAPPINGS DO NOT SUFFICE: REVISIONS MAY BE NECESSARY

REVISIONS AFFECT THE VIEW A PEER HAS OF THE HIERARCHY OF ITS ACQUAINTANCE ONLY

[Diagram: LOCAL and ACQUAINTANCE hierarchies with members l:m1, l:m2, and l':m'; one member is labeled Conflicting Member.]

Page 20: Aggregate Queries  in Peer-to-Peer OLAP

Member Hierarchy Integration

A REVISION BY SPLITTING MAY BE USED TO REPAIR CONFLICTS, GIVING WAY TO MAPPABLE MEMBERS

EXAMPLE OF A REVISION: CONFLICTING MEMBER SPLIT

[Diagram: LOCAL and ACQUAINTANCE hierarchies with members l:m1, l:m2, l:m1', and l':m2'; the members resulting from the split are labeled Non-Conflicting Members.]

Page 21: Aggregate Queries  in Peer-to-Peer OLAP

Member Hierarchy Integration

A REVISION BY RECLASSIFYING MAY BE AN ALTERNATIVE TO RESTORE HOMOMORPHISM

EXAMPLE OF A REVISION: CONFLICTING MEMBER RECLASSIFICATION

[Diagram: LOCAL and ACQUAINTANCE hierarchies with members l:m1, l:m2, l:m3, l:m', and l':m''; the members after reclassification are labeled Non-Conflicting Members.]

Page 22: Aggregate Queries  in Peer-to-Peer OLAP

Member Hierarchy Integration

REVISE AND MAP APPROACH:

LOCAL PEER:

• PRODUCES AND BROADCASTS REVISION AND MAPPING DEFINITIONS TO POTENTIAL ACQUAINTANCES

ACQUAINTANCE:

• REVISES ITS OWN HIERARCHIES, PRODUCING A REVISED INSTANCE (REVISED ROLL-UPS) WITH RESPECT TO THE LOCAL PEER
• STORES INFORMATION ON MAPPINGS IN METADATA MAPPING TABLES

Page 23: Aggregate Queries  in Peer-to-Peer OLAP

Member Hierarchy Integration

BOTTOM-UP COMPLETION APPROACH

Whenever some member m2' of a level l' is not mapped, a bottom-up completion approach for query answering is employed: information on non-mapped members and their roll-ups is stored in metadata completion tables.

[Diagram: members l:m1, l:m2, l':m1', and l':m2' connected by map and roll-up edges; one roll-up is labeled Incomplete roll-up and one member is labeled Non-Mapped Member.]

Page 24: Aggregate Queries  in Peer-to-Peer OLAP

OUTLINE:

• CHARACTERIZATION
• PROBLEM AND PROPOSAL
• FACT INTEGRATION
• DIMENSION INTEGRATION
• AGGREGATE QUERIES
• CONCLUSIONS

Aggregate Queries in Peer-to-Peer OLAP

Page 25: Aggregate Queries  in Peer-to-Peer OLAP

P2P OLAP Queries

Syntactical Structure (Datalog Style):

query(Z1, ..., Zn, aggr(M), Set of Peers) ←
    Generic Fact(X1, ..., Xn, M),
    rollup dimension d1 from bottom level to desired level l1 (X1, Z1),
    ...,
    rollup dimension dn from bottom level to desired level ln (Xn, Zn);
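To make the query structure concrete, the sketch below evaluates one such query against the facts of a single peer: each bottom-level coordinate Xi is rolled up to its desired-level member Zi, and the measure M is aggregated per group. It is an illustration only; the function name, the tuple and dictionary encodings, and the sample data are assumptions, and the Set of Peers argument is left out because only one peer is involved.

```python
from collections import defaultdict


def evaluate_query(facts, rollups, aggr=sum):
    """Evaluate an aggregate query on one peer.

    facts: iterable of tuples (x1, ..., xn, measure) at the bottom levels.
    rollups: list of n dicts, each mapping a bottom-level member Xi
             to its member Zi at the desired level of dimension di.
    aggr: aggregate function applied to the measures of each group.
    """
    groups = defaultdict(list)
    for *coords, measure in facts:
        key = tuple(rollup[x] for rollup, x in zip(rollups, coords))
        groups[key].append(measure)
    return {key: aggr(values) for key, values in groups.items()}


# Hypothetical data: two dimensions (product, month) and one measure.
facts = [("p1", "jan", 10), ("p2", "jan", 5), ("p3", "feb", 7)]
rollups = [
    {"p1": "c1", "p2": "c1", "p3": "c2"},  # product -> category
    {"jan": "q1", "feb": "q1"},            # month -> quarter
]

print(evaluate_query(facts, rollups))
```

Passing `aggr=max` or `aggr=min` instead of `sum` changes the aggregate function without changing the grouping.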

Page 26: Aggregate Queries  in Peer-to-Peer OLAP

Query Evaluation Process

• GENERATES A QUERY FOR EACH RELEVANT PEER (INCLUDING THE LOCAL PEER);
• GENERATED QUERIES ARE PROPAGATED TO RELEVANT PEERS;
• QUERIES FOR RELEVANT PEERS STEM FROM THE REWRITING OF THE SUBMITTED P2P OLAP QUERY;
• THE REWRITING PROCESS INTRODUCES REFERENCES TO FACT PEERS, REVISED ROLL-UPS, AND MAPPING AND COMPLETION TABLES;
• RESULTS OF PROPAGATED QUERIES ARE COLLECTED AND AGGREGATED LOCALLY TO PRODUCE THE FINAL QUERY ANSWER;
• QUERY ANSWERS MAY BE UNCERTAIN AND INCOMPLETE DUE TO BOTTOM-UP COMPLETION.
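The collection step of this process can be sketched as follows. This is a simplified illustration under several assumptions: each peer has already evaluated its rewritten query and returned partial SUMs per member, the mapping tables are applied in one place during integration rather than at each acquaintance, and members without a mapping are only reported rather than completed bottom-up. All names and data are hypothetical.

```python
from collections import defaultdict


def integrate_partial_results(partial_results, mapping_tables):
    """Combine per-peer partial SUMs into one answer in local terms.

    partial_results: peer -> {peer member: partial sum}.
    mapping_tables: peer -> {peer member: local member}.
    Returns (answer, unmapped), where `unmapped` lists, per peer, the
    members that could not be expressed in local terms; a non-empty
    `unmapped` means the answer is incomplete.
    """
    answer = defaultdict(float)
    unmapped = defaultdict(list)
    for peer, result in partial_results.items():
        table = mapping_tables.get(peer, {})
        for member, value in result.items():
            local_member = table.get(member)
            if local_member is None:
                unmapped[peer].append(member)
            else:
                # SUM is distributive, so partial sums can simply be added.
                answer[local_member] += value
    return dict(answer), dict(unmapped)


# Hypothetical partial results and mapping tables.
partial_results = {
    "local": {"Loans": 100.0},
    "peerB": {"Credits": 40.0, "Grants": 5.0},
}
mapping_tables = {
    "local": {"Loans": "Loans"},   # identity mapping for the local peer
    "peerB": {"Credits": "Loans"},
}

answer, unmapped = integrate_partial_results(partial_results, mapping_tables)
print(answer)
print(unmapped)
```

Adding partial results directly works for distributive aggregates such as SUM, MIN, and MAX; an average would need each peer to return a sum and a count instead.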

Page 27: Aggregate Queries  in Peer-to-Peer OLAP

Query Processing

[Diagram: query processing between the Local Peer and a Relevant Peer. Steps shown: QUERY, Rewriting, Evaluation, Partial Result, Integration, Answer. Data shown: Fact tables, Revised Rollups, and Metadata (Mapping tables, Completion tables).]

Page 28: Aggregate Queries  in Peer-to-Peer OLAP

Aggregate Queries in Peer-to-Peer OLAP

CONCLUSIONS: MAIN POINTS DISCUSSED

• GENERIC FACTS
• FACT CONCILIATION PHASE
• HIERARCHY LEVEL CORRESPONDENCE
• SCHEMA CONCILIATION PHASE
• REVISE AND MAP APPROACH
• BOTTOM-UP COMPLETION
• MAPPING ACQUISITION PHASE
• P2P OLAP QUERIES
• QUERY REWRITING AND EVALUATION