SynopSys: Foundations for Multidimensional Graph Analytics
Michael Rudolf¹, Hannes Voigt¹, Christof Bornhoevd², and Wolfgang Lehner¹
¹ Database Technology Group, Technische Universität Dresden
² SAP Labs, LLC, Palo Alto
Business Intelligence for the Real-Time Enterprise (BIRTE 2014)
September 1, 2014
Motivation: Big (Graph) Data
Peak Performance
Nov. 26, 2012: 26.5M items (306/sec)
Nov. 23, 2013: 36.8M items (426/sec)
645M users
135K new every day
58M tweets & 2.1G searches / day
© Michael Rudolf | SynopSys: Foundations for Multidimensional Graph Analytics | 2
Intensional vs. Extensional
Intensional:
• Schema & integrity constraints
• Created at design time by domain experts
• ETL, "once & forever"
Extensional:
• Collect lots of data first
• Try to deduce the intension
• "The Fourth Paradigm" [Mic09]
The Property Graph Model
[Figure: example property graph]
Vertices:
1: "Apple iPad MC707LL/A", black, 64 GB
2: "Apple iPhone 5", black, 32 GB
3: "Apple iPhone 4", white, 16 GB
4: "Consumer Electronics"
5: "Phones"
7: "Tablets"
8: "Freddy", FR
9: "Karl", DE
10: "Mike", US
11: "Steve", US
12: 5/5 stars
13: 5/5 stars
14: 4/5 stars
15: delivered 24/02/14
16: ordered 24/02/14
Edge labels: part of, in, authors, rates, likes, records, contains (with quantities 1, 2, 1). The letters a to h also appear as labels in the figure.
• Provides directed, attributed multi-relational graphs
• Attributes on vertices and edges as key-value pairs (instance-level instead of class-level)
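The property graph model described on this slide can be sketched in a few lines of Python. This is only a minimal illustration, not SynopSys code: the class names `Vertex`, `Edge`, and `PropertyGraph`, the attribute keys (`name`, `color`, `storage`, `country`, `type`, `quantity`), and the choice of which vertices the edges connect are all assumptions. Only the attribute values are taken from the example figure.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    vid: int
    attrs: dict = field(default_factory=dict)  # instance-level key-value pairs


@dataclass
class Edge:
    src: int
    dst: int
    attrs: dict = field(default_factory=dict)  # edges carry attributes too


@dataclass
class PropertyGraph:
    vertices: dict = field(default_factory=dict)  # vid -> Vertex
    edges: list = field(default_factory=list)     # directed; parallel edges allowed

    def add_vertex(self, vid, **attrs):
        self.vertices[vid] = Vertex(vid, attrs)

    def add_edge(self, src, dst, **attrs):
        self.edges.append(Edge(src, dst, attrs))

    def out_edges(self, vid, edge_type=None):
        """Outgoing edges of a vertex, optionally restricted to one edge type."""
        return [e for e in self.edges
                if e.src == vid
                and (edge_type is None or e.attrs.get("type") == edge_type)]


g = PropertyGraph()
g.add_vertex(2, name="Apple iPhone 5", color="black", storage="32 GB")
g.add_vertex(5, name="Phones")
g.add_vertex(9, name="Karl", country="DE")
g.add_vertex(16, ordered="24/02/14")
# Assumed connectivity: the exact edge endpoints of the figure are not known here.
g.add_edge(2, 5, type="in")
g.add_edge(9, 2, type="likes")
g.add_edge(16, 2, type="contains", quantity=1)
```

Because attributes live on each instance rather than on a class, vertex 16 can carry an `ordered` date while vertex 2 carries a `color`, without any shared schema.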
Agenda
Analytical Scenario: From Graphs to Cubes
Operations: Roll-up, Drill-down, Slice & Dice
Challenges: Unbalanced Hierarchies & OLAP Anomalies
Graph Cube
[Figure: the example property graph shown under "The Property Graph Model"]
1. Identify facts
2. Specify dimensions
3. Define measures
Facts
Depending on the use case, a (base) fact can be
• a vertex attribute, an edge attribute, or
• the presence of an edge.
In general: a subgraph.
→ Use pattern matching → graphical specification instead of DSL
Example
Match reviews of products and their authors (vertex types indicated via color).
[Figure: fact pattern with edges labeled authors and rates]
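The example pattern (reviews of products and their authors) can be matched with a simple nested loop over typed edges. The sketch below is an illustration under stated assumptions, not the matching algorithm used by SynopSys: the `kind` attribute stands in for the vertex types that the slide indicates via color, the edge endpoints are assumed, and `match_review_facts` is a hypothetical helper.

```python
# Toy graph: vertex id -> attributes. The "kind" attribute is an assumption
# standing in for the vertex types that the slide indicates via color.
vertices = {
    2: {"kind": "product", "name": "Apple iPhone 5"},
    3: {"kind": "product", "name": "Apple iPhone 4"},
    9: {"kind": "customer", "name": "Karl"},
    10: {"kind": "customer", "name": "Mike"},
    12: {"kind": "review", "stars": 5},
    14: {"kind": "review", "stars": 4},
}
# Edges as (source, type, target); the endpoints are assumed for illustration.
edges = [
    (9, "authors", 12), (12, "rates", 2),
    (10, "authors", 14), (14, "rates", 3),
    (9, "likes", 3),
]


def match_review_facts(vertices, edges):
    """Match customer -authors-> review -rates-> product; one dict per fact."""
    facts = []
    for c, t1, r in edges:
        if t1 != "authors" or vertices[c]["kind"] != "customer":
            continue
        for r2, t2, p in edges:
            if r2 == r and t2 == "rates" and vertices[p]["kind"] == "product":
                facts.append({"c": c, "r": r, "p": p})
    return facts


facts = match_review_facts(vertices, edges)
```

Each match is a small subgraph, which is the general form of a fact named on the slide. The `likes` edge is ignored because it is not part of the pattern.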
Dimensions
Dimensions can be
1. vertex or edge attributes
2. connectivity
[Figure: the example property graph shown under "The Property Graph Model"]
Structure in Dimensions
• extrinsic: not contained in graph data, needs to be provided externally (e.g., GeoNames)
• intrinsic: embodied in graph data
  ◦ explicit: captured as topological information
  ◦ implicit: has to be derived from attribute values
Intrinsic Dimensions
Explicit Dimensions
• Can be specified using path expressions
• In general requires one path expression per level, e.g.
  -[@type='belongsTo']->[@type='state']-[@type='partOf']->[@type='country']
Implicit Dimensions
• Might require bucketization
• In general requires one expression per level, e.g.
  GetWeekOfYear(@ordered) and GetYear(@ordered)
Notation
• alias@attribute: access of a vertex or edge attribute
• -[edge predicate]->[vertex predicate](length): paths (with optional recursion depth), optionally satisfying the predicates
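For an implicit dimension, each level is an expression over an attribute value. The following sketch shows what bucketizing the `ordered` attribute by week and year could look like in Python. The function names mirror `GetWeekOfYear` and `GetYear` from the slide, but their implementation here is an assumption, as are the DD/MM/YY date format (taken from the example graph) and the use of ISO 8601 week numbering.

```python
from datetime import datetime


def get_year(ordered: str) -> int:
    """Year bucket for a date given as DD/MM/YY (format as in the example graph)."""
    return datetime.strptime(ordered, "%d/%m/%y").year


def get_week_of_year(ordered: str) -> int:
    """ISO 8601 week number; the slides do not say which week numbering is meant."""
    return datetime.strptime(ordered, "%d/%m/%y").isocalendar()[1]


# One expression per level: week of year (lower level), year (higher level).
levels = (get_week_of_year("24/02/14"), get_year("24/02/14"))
```

For the order date 24/02/14 from the example graph, this yields ISO week 9 of 2014.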
Dimension Specification
Example

Name         Seed Pattern  Levels
Nationality  $c            $c@nationality
Category     $p            Product category: $p-[@type='in']->
                           Product group:    $p-[@type='in']->-[@type='part-of']->
                           Product area:     $p-[@type='in']->-[@type='part-of']->(2)

Seed Pattern
• Connects facts to dimensions
• Is matched against facts
→ Has to be a super pattern of the fact pattern (i.e., more general)
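The three level expressions of the Category dimension can be read as edge traversals that start at the seed vertex `$p`. The sketch below evaluates them on a toy edge list. It is an illustration only: the edge endpoints are assumed, `follow` and `category_levels` are hypothetical helpers, and `(2)` is interpreted as "follow the preceding step twice", which is one reading of the length notation.

```python
# Edges as (source, type, target). Assumed toy data: product 2 is in
# category 5 ("Phones"), which is part of 4 ("Consumer Electronics").
edges = [(2, "in", 5), (1, "in", 7), (5, "part-of", 4), (7, "part-of", 4)]


def follow(starts, edge_type, length=1):
    """Follow edges of one type `length` times from a set of start vertices."""
    current = set(starts)
    for _ in range(length):
        current = {dst for src, t, dst in edges
                   if src in current and t == edge_type}
    return current


def category_levels(p):
    """One path expression per level, each relative to the seed vertex $p."""
    category = follow({p}, "in")                  # $p-[@type='in']->
    group = follow(category, "part-of")           # ...->-[@type='part-of']->
    area = follow(category, "part-of", length=2)  # ...->-[@type='part-of']->(2)
    return {"category": category, "group": group, "area": area}


levels = category_levels(2)
```

In this toy graph the hierarchy above product 2 has only two levels, so the "area" expression returns an empty set. Relative path expressions therefore depend on how deep the hierarchy is at each fact.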
Properties of Dimensions
Monotony
Levels should be ordered such that the number of items decreases.

Level  Name       # Elements
1      Region     125
2      Country    30
3      Continent  3

Hierarchy
Levels should form hierarchies. If two facts map to the same element in l_i, they should map to the same element in l_{i+1} as well.
→ Functional dependency

Fact  Level 1  Level 2  Level 3
A     Saxony   Germany  Europe
B     Saxony   Germany  Europe
C     Bavaria  Germany  Europe
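Both properties can be checked mechanically on a table of facts and their level elements. The sketch below uses the three facts from the table above. The function names `is_monotone` and `is_hierarchy` are hypothetical, and monotony is checked as "does not increase" rather than "strictly decreases", an assumption made so that the small example table (one country, one continent) passes.

```python
facts = {
    "A": ("Saxony", "Germany", "Europe"),
    "B": ("Saxony", "Germany", "Europe"),
    "C": ("Bavaria", "Germany", "Europe"),
}


def is_monotone(rows):
    """The number of distinct elements must not increase from level to level."""
    counts = [len(set(col)) for col in zip(*rows)]
    return all(a >= b for a, b in zip(counts, counts[1:]))


def is_hierarchy(rows):
    """Each element of level i must determine a single element of level i+1."""
    cols = list(zip(*rows))
    for lower, upper in zip(cols, cols[1:]):
        parent = {}
        for lo, up in zip(lower, upper):
            # setdefault records the first parent seen; any other parent
            # for the same element breaks the functional dependency.
            if parent.setdefault(lo, up) != up:
                return False
    return True


rows = list(facts.values())
```

A row set in which "Saxony" maps to two different countries would fail `is_hierarchy`, because level 1 would no longer functionally determine level 2.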
Measures
A measure is a derived fact
• combining several facts
• computed by a specified function (e.g., scalar, aggregation).
→ Annotate the fact pattern
→ Introduce representative vertex
Example
• Average product rating by product category
• Minimum age of customers by nationality
[Figure: annotated fact pattern with vertices $c, $r, $p, edges $a (authors) and $e (rates), and the measure annotations (Min. Age, $c@age, MIN) and (Avg. Rtg., $r@stars, AVG)]
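The two example measures amount to grouping matched facts by a dimension level and aggregating one attribute per group. The following sketch shows this with plain Python. It is an illustration, not the SynopSys implementation: the `measure` helper and the flattened fact records are assumptions, and the ages are made-up values.

```python
from collections import defaultdict
from statistics import mean

# Matched facts flattened to the attributes the measures need. The ages are
# made up; nationalities and star ratings echo the example graph.
facts = [
    {"nationality": "DE", "age": 31, "stars": 5, "category": "Phones"},
    {"nationality": "US", "age": 45, "stars": 5, "category": "Tablets"},
    {"nationality": "US", "age": 27, "stars": 4, "category": "Phones"},
]


def measure(facts, group_key, value_key, aggregate):
    """Group facts by a dimension level and aggregate one attribute per group."""
    groups = defaultdict(list)
    for fact in facts:
        groups[fact[group_key]].append(fact[value_key])
    return {key: aggregate(values) for key, values in groups.items()}


avg_rating = measure(facts, "category", "stars", mean)  # (Avg. Rtg., $r@stars, AVG)
min_age = measure(facts, "nationality", "age", min)     # (Min. Age, $c@age, MIN)
```

Each key of the resulting dictionaries plays the role of a representative vertex for its group, with the aggregated value as its attribute.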
Operations: Roll-up, Drill-down, Slice & Dice
Roll-up/Drill-down
Granularity of the Cube
• Represents the "grouping": the current levels of interest
• Initially: the lowest level of each dimension
Roll-up
• Reduces the granularity
• For dimension d, move up one level from l_i to l_{i+1}
Drill-down
• Increases the granularity
• For dimension d, move down one level from l_i to l_{i−1}
→ Introduce representative vertex for each group
→ Expose computed values for measures as attributes
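Roll-up and drill-down can be viewed as moving an index along the ordered levels of a dimension and regrouping the facts at the new level. The sketch below illustrates this for a single dimension with a COUNT measure. The names `LEVELS`, `roll_up`, `drill_down`, and `group_counts` are hypothetical, and clamping at the top and bottom level is an assumption about the boundary behavior.

```python
from collections import Counter

# Per fact, the element it maps to at each level of one dimension
# (index 0 is the lowest level); values follow the region example.
LEVELS = ["region", "country", "continent"]
facts = [
    ("Saxony", "Germany", "Europe"),
    ("Saxony", "Germany", "Europe"),
    ("Bavaria", "Germany", "Europe"),
]


def roll_up(level):
    """Move up one level, staying at the top if already there."""
    return min(level + 1, len(LEVELS) - 1)


def drill_down(level):
    """Move down one level, staying at the bottom if already there."""
    return max(level - 1, 0)


def group_counts(level):
    """One entry per group, exposing a COUNT measure as its value."""
    return dict(Counter(fact[level] for fact in facts))


level = 0                         # initially: the lowest level
by_region = group_counts(level)
level = roll_up(level)            # region -> country
by_country = group_counts(level)
```

Rolling up from region to country merges the Saxony and Bavaria groups into a single Germany group, which reduces the granularity.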
Slice & Dice
Function filter transforms fact base of cube
• evaluates level-predicate pairs
• removes facts not matching the predicates
For a single predicate applied to one dimension → slice
Example
Slice product reviews by German customers from the cube c:
filter(c, {(Nationality, λ = "DE")}).
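The filter function can be sketched as a list comprehension over the fact base. This is a simplified illustration: the cube is reduced to a list of fact records, the set of level-predicate pairs is represented as a dictionary from level name to predicate, and `filter_cube` is a hypothetical name chosen to avoid shadowing Python's built-in `filter`.

```python
def filter_cube(facts, level_predicates):
    """Keep only facts whose value at every named level satisfies its predicate."""
    return [
        fact for fact in facts
        if all(pred(fact[level]) for level, pred in level_predicates.items())
    ]


facts = [
    {"Nationality": "DE", "Category": "Phones", "stars": 5},
    {"Nationality": "US", "Category": "Phones", "stars": 4},
    {"Nationality": "US", "Category": "Tablets", "stars": 5},
]

# Slice: a single predicate on one dimension, as in the slide's example.
german = filter_cube(facts, {"Nationality": lambda v: v == "DE"})
# Predicates on several dimensions at once (commonly called a dice).
us_phones = filter_cube(facts, {"Nationality": lambda v: v == "US",
                                "Category": lambda v: v == "Phones"})
```

The first call corresponds to `filter(c, {(Nationality, λ = "DE")})` from the example.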
Challenges: Unbalanced Hierarchies & OLAP Anomalies
Unbalanced Hierarchies
Facts with different granularities
Relative dimension specification:
Product category:
$p-[@type='in']->
Product group:
$p-[@type='in']->-[@type='part-of']->
Product area:
$p-[@type='in']->-[@type='part-of']->(2)
→ Absolute instead of relative dimension specification required
Example
Products in categories and groups
[Figure: product hierarchy]
Vertices:
15: "Google Nexus 5", red, 16 GB
16: "Samsung E1200", black
4: "Cell Phones & Accessories"
5: "Phones"
6: "Computers & Accessories"
7: "Tablets"
12: "Smartphones"
Edge labels: in (twice), part of (three times)
Unbalanced Hierarchies
Solution: Pre-process the graph
Data Cleansing
• Balance hierarchies
• Add missing root nodes
Tagging
• Add attributes for absolute referencing
[Figure: the unbalanced product hierarchy with vertices 4, 5, 6, 7, 12, 15, and 16]
![Page 41: SynopSys: Foundationsfor MultidimensionalGraphAnalyticsdb.csail.mit.edu/birte2014/slides/slides10.pdf · 2014-09-10 · Michael Rudolf 1, Hannes Voigt , Christof Bornhoevd2, and Wolfgang](https://reader030.fdocuments.in/reader030/viewer/2022040514/5e6c25407d9cb37451241dce/html5/thumbnails/41.jpg)
[Figure: the example graph after data cleansing. Nodes 13 "Dumbphones" and 14 "Consumer Electronics" are added, together with one new "in" edge and three new "part of" edges; the existing nodes 4, 5, 6, 7, 12, 15 and 16 are unchanged]
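The data cleansing step can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' procedure: the edge endpoints are guessed from the figure, `balance` is a hypothetical helper, and the padding nodes receive a generic "(other)" name where the slide uses the hand-picked label "Dumbphones".

```python
def balance(part_of, product_in, root="Consumer Electronics"):
    """Return copies of both mappings with a common root and equal-depth products.

    part_of:    category -> parent category
    product_in: product  -> category it is attached to
    """
    part_of, product_in = dict(part_of), dict(product_in)

    # Add a missing root node if the hierarchy has several top-level categories.
    categories = set(part_of) | set(part_of.values()) | set(product_in.values())
    roots = sorted(c for c in categories if c not in part_of)
    if len(roots) > 1:
        for r in roots:
            part_of[r] = root

    def depth(category):
        hops = 0
        while category in part_of:
            category = part_of[category]
            hops += 1
        return hops

    # Pad shallow attachment points until every product sits at the same depth.
    target = max(depth(c) for c in product_in.values())
    for product, category in sorted(product_in.items()):
        while depth(category) < target:
            filler = f"{category} (other)"  # hypothetical placeholder name
            part_of[filler] = category
            category = filler
        product_in[product] = category
    return part_of, product_in


# ASSUMED endpoints for the example graph.
PART_OF = {
    "Smartphones": "Phones",
    "Phones": "Cell Phones & Accessories",
    "Tablets": "Computers & Accessories",
}
PRODUCT_IN = {"Google Nexus 5": "Smartphones", "Samsung E1200": "Phones"}

balanced_part_of, balanced_in = balance(PART_OF, PRODUCT_IN)
print(balanced_in)
```

After balancing, every product is the same number of hops below the common root, so a relative path of fixed length reaches the same level for all facts.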
![Page 42: SynopSys: Foundationsfor MultidimensionalGraphAnalyticsdb.csail.mit.edu/birte2014/slides/slides10.pdf · 2014-09-10 · Michael Rudolf 1, Hannes Voigt , Christof Bornhoevd2, and Wolfgang](https://reader030.fdocuments.in/reader030/viewer/2022040514/5e6c25407d9cb37451241dce/html5/thumbnails/42.jpg)
[Figure: the example graph after tagging. Each category node carries a level attribute: 4 "Cell Phones & Accessories" = 3, 5 "Phones" = 2, 6 "Computers & Accessories" = 2, 7 "Tablets" = 1, 12 "Smartphones" = 1; product nodes 15 and 16 and all edges are unchanged]
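Tagging can be sketched as computing a level attribute per category node and then selecting ancestors by that attribute instead of by path length. The sketch assumes the edge endpoints below and assumes that a level is the node's height above the leaf categories, which appears to match the tags in the figure; `tag_levels` and `ancestor_at_level` are hypothetical names.

```python
# ASSUMED endpoints: child category -> parent category, product -> category.
PART_OF = {12: 5, 5: 4, 7: 6}
IN = {15: 12, 16: 5}


def tag_levels(part_of):
    """Assign level 1 to leaf categories and 1 + max(child levels) to the rest."""
    children = {}
    for child, parent in part_of.items():
        children.setdefault(parent, []).append(child)

    levels = {}

    def level(node):
        if node not in levels:
            child_levels = (level(c) for c in children.get(node, []))
            levels[node] = 1 + max(child_levels, default=0)
        return levels[node]

    for node in set(part_of) | set(part_of.values()):
        level(node)
    return levels


LEVELS = tag_levels(PART_OF)


def ancestor_at_level(product, wanted):
    """Absolute reference: the category on the product's path tagged with `wanted`."""
    node = IN.get(product)
    while node is not None and LEVELS[node] != wanted:
        node = PART_OF.get(node)
    return node


print(LEVELS)
print(ancestor_at_level(15, 2), ancestor_at_level(16, 2))
```

With these tags, both products resolve to node 5 ("Phones") at level 2, while product 16 has no level-1 category at all, which makes its coarser granularity explicit instead of silently shifting the result by one level.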
![Page 43: SynopSys: Foundationsfor MultidimensionalGraphAnalyticsdb.csail.mit.edu/birte2014/slides/slides10.pdf · 2014-09-10 · Michael Rudolf 1, Hannes Voigt , Christof Bornhoevd2, and Wolfgang](https://reader030.fdocuments.in/reader030/viewer/2022040514/5e6c25407d9cb37451241dce/html5/thumbnails/43.jpg)
OLAP Anomalies
It depends on the model:
• Double counting can occur if a cardinality assumption is violated (1:1 vs. 1:n relationship)
[Figure: product nodes 15 (128 GB, "Apple iPad Air") and 16 (black, "Samsung E1200"), category nodes 7 "Tablets" and 5 "Phones", three "in" edges]
• Incompleteness can occur if a connectivity assumption is violated
[Figure: the same product and category nodes, two "in" edges]
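Both anomalies can be illustrated with a small roll-up in Python. The edge lists below are hypothetical layouts chosen to trigger each anomaly, not reconstructions of the figures, and `rollup` and `check_assumptions` are assumed helper names.

```python
from collections import Counter


def rollup(products, in_edges):
    """Count products per category by following 'in' edges."""
    return Counter(cat for product, cat in in_edges if product in products)


def check_assumptions(products, in_edges):
    """Return (products with several categories, products with none)."""
    per_product = Counter(product for product, _ in in_edges)
    double_counted = sorted(p for p in products if per_product[p] > 1)
    unreachable = sorted(p for p in products if per_product[p] == 0)
    return double_counted, unreachable


PRODUCTS = {"Apple iPad Air", "Samsung E1200"}

# Cardinality violated: one product is 'in' two categories (1:n instead of 1:1).
multi = [
    ("Apple iPad Air", "Tablets"),
    ("Apple iPad Air", "Phones"),
    ("Samsung E1200", "Phones"),
]
total_multi = sum(rollup(PRODUCTS, multi).values())

# Connectivity violated: one product has no 'in' edge at all.
sparse = [("Apple iPad Air", "Tablets")]
total_sparse = sum(rollup(PRODUCTS, sparse).values())

print(total_multi, check_assumptions(PRODUCTS, multi))
print(total_sparse, check_assumptions(PRODUCTS, sparse))
```

In the first layout the category totals add up to three for two products; in the second they add up to one. Running a check of this kind before aggregating is one way to detect when a model assumption does not hold for the data at hand.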
![Page 44: SynopSys: Foundationsfor MultidimensionalGraphAnalyticsdb.csail.mit.edu/birte2014/slides/slides10.pdf · 2014-09-10 · Michael Rudolf 1, Hannes Voigt , Christof Bornhoevd2, and Wolfgang](https://reader030.fdocuments.in/reader030/viewer/2022040514/5e6c25407d9cb37451241dce/html5/thumbnails/44.jpg)
![Page 45: SynopSys: Foundationsfor MultidimensionalGraphAnalyticsdb.csail.mit.edu/birte2014/slides/slides10.pdf · 2014-09-10 · Michael Rudolf 1, Hannes Voigt , Christof Bornhoevd2, and Wolfgang](https://reader030.fdocuments.in/reader030/viewer/2022040514/5e6c25407d9cb37451241dce/html5/thumbnails/45.jpg)
Conclusion
Powerful Mapping of Multidimensional Analytics
• Expose well-known concepts and operations
• Emphasize challenges posed by graph data
→ Open up the graph world to Business Intelligence
Flexible Workflow for the Big Graph Data Era
• No up-front schema design
• Adapt to changing data and requirements
→ What is a fact today can be a dimension tomorrow
![Page 46: SynopSys: Foundationsfor MultidimensionalGraphAnalyticsdb.csail.mit.edu/birte2014/slides/slides10.pdf · 2014-09-10 · Michael Rudolf 1, Hannes Voigt , Christof Bornhoevd2, and Wolfgang](https://reader030.fdocuments.in/reader030/viewer/2022040514/5e6c25407d9cb37451241dce/html5/thumbnails/46.jpg)
1 Additional Material & References
![Page 47: SynopSys: Foundationsfor MultidimensionalGraphAnalyticsdb.csail.mit.edu/birte2014/slides/slides10.pdf · 2014-09-10 · Michael Rudolf 1, Hannes Voigt , Christof Bornhoevd2, and Wolfgang](https://reader030.fdocuments.in/reader030/viewer/2022040514/5e6c25407d9cb37451241dce/html5/thumbnails/47.jpg)
References I
Chen Chen, Xifeng Yan, Feida Zhu, Jiawei Han, and Philip S. Yu. Graph OLAP: Towards Online Analytical Processing on Graphs. In Proceedings of the Eighth International Conference on Data Mining, pages 103-112, Pisa, Italy, December 2008. IEEE.
Microsoft Research. The Fourth Paradigm: Data-Intensive Scientific Discovery. Microsoft Press, 2009.
Marko A. Rodriguez and Peter Neubauer. Constructions from Dots and Lines. Bulletin of the American Society for Information Science and Technology, 36(6):35-41, 2010.
Yuanyuan Tian and Jignesh M. Patel. TALE: A Tool for Approximate Large Graph Matching. In 2008 IEEE 24th International Conference on Data Engineering, pages 963-972. IEEE, April 2008.
Peixiang Zhao, Xiaolei Li, Dong Xin, and Jiawei Han. Graph Cube: On Warehousing and OLAP Multidimensional Networks. In Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 853-864, Athens, Greece, 2011. ACM.
![Page 48: SynopSys: Foundationsfor MultidimensionalGraphAnalyticsdb.csail.mit.edu/birte2014/slides/slides10.pdf · 2014-09-10 · Michael Rudolf 1, Hannes Voigt , Christof Bornhoevd2, and Wolfgang](https://reader030.fdocuments.in/reader030/viewer/2022040514/5e6c25407d9cb37451241dce/html5/thumbnails/48.jpg)
References II
Ning Zhang, Yuanyuan Tian, and Jignesh M. Patel. Discovery-Driven Graph Summarization. In Proceedings of the 26th International Conference on Data Engineering, pages 880-891, Long Beach, CA, USA, 2010. IEEE.