1 Exercise Sheet 3 Exercise 7.: ROLAP Algebra Assume that a fact table SalesCube has 3 hierarchies...

Exercise Sheet 3

Exercise 7.: ROLAP Algebra

Assume that a fact table SalesCube has 3 hierarchies with attributes Year Y, Month M, Productgroup P and City C and the measure sales. Assume that the attributes have the following cardinalities: Y = 3, M = 12, P = 500, C = 80

Exercise 7.1: Draw the (hierarchical) aggregation network

Exercise 7.2: Construct the ROLAP expression to compute the average and maximal sales for the groups {Y,P}, {Y,C} and {P}

Exercise 7.3: Translate the ROLAP expression of Exercise 7.2 into a single SQL statement and estimate its

cost = total number of tuples read + total number of tuples written

if you assume that there is no optimization of this SQL statement.

Exercise 7.4: Translate the SQL statement of Exercise 7.3 into several SQL statements employing auxiliary tables for intermediate results. Try to minimize the cost.

Solution 7.1:

(M, P, C) 1.440.000 = n
(Y, P, C) 120.000
(ALL, P, C) 40.000
(M, P, ALL) 18.000
(M, ALL, C) 2.880
(Y, P, ALL) 1.500
(ALL, P, ALL) 500
(Y, ALL, C) 240
(ALL, ALL, C) 80
(M, ALL, ALL) 36
(Y, ALL, ALL) 3
(ALL, ALL, ALL) 1
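The node sizes of the aggregation network are products of the cardinalities of the chosen levels. The following minimal sketch recomputes them, assuming that M = 12 means 12 months per year (36 distinct months over 3 years), which is what the value 36 for (M, ALL, ALL) suggests. The names `time_levels`, `product_levels`, `city_levels` and `lattice_sizes` are illustrative and do not come from the exercise sheet.

```python
from itertools import product

# Distinct values per hierarchy level; "ALL" collapses a dimension to one value.
# Assumption: M = 12 months per year over Y = 3 years gives 36 distinct months.
time_levels = {"M": 3 * 12, "Y": 3, "ALL": 1}
product_levels = {"P": 500, "ALL": 1}
city_levels = {"C": 80, "ALL": 1}


def lattice_sizes():
    """Return {(time, product, city): maximal number of groups} for all nodes."""
    sizes = {}
    for (t, n_t), (p, n_p), (c, n_c) in product(
        time_levels.items(), product_levels.items(), city_levels.items()
    ):
        sizes[(t, p, c)] = n_t * n_p * n_c
    return sizes


sizes = lattice_sizes()
# Print the nodes from the finest to the coarsest aggregation.
for node, size in sorted(sizes.items(), key=lambda kv: -kv[1]):
    print(node, size)
```

These sizes are upper bounds: a node only reaches its maximum if every combination of values actually occurs in SalesCube.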

Solution 7.2:

POT(SalesCube,{{Y,P}, {Y,C},{P}}, {sum(sales), avg(sales)})

cost (Y,P) = n + 2*18.000 + 1.500
cost (Y,C) = n + 2*2.880 + 240
cost (P) = n + 2*18.000 + 2*1.500 + 500

2* = reading and writing of intermediate results with insufficient cache
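The cost terms follow one pattern: the fact table is read once, every intermediate result is written and read again, and the final result is written once. A small sketch of that pattern, assuming n = 1.440.000 (the upper bound from Solution 7.1) and reading the third cost line as the path to {P}; the helper `path_cost` is a hypothetical name.

```python
def path_cost(n, intermediates, result):
    """Tuples read plus tuples written along one aggregation path.

    n             -- fact tuples read from the base table
    intermediates -- sizes of intermediate results (each written, then read again)
    result        -- size of the final group-by (written once)
    """
    return n + 2 * sum(intermediates) + result


n = 1_440_000  # assumption: the fact table reaches the (M, P, C) upper bound

cost_yp = path_cost(n, [18_000], 1_500)       # via (M, P, ALL)
cost_yc = path_cost(n, [2_880], 240)          # via (M, ALL, C)
cost_p = path_cost(n, [18_000, 1_500], 500)   # via (M, P, ALL) and (Y, P, ALL)

print(cost_yp, cost_yc, cost_p)
```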

Solution 7.3:

Select Y, 'ALL', P, 'ALL', sum(sales), avg(sales)
From SalesCube
Group By Y, P
Union
Select Y, 'ALL', 'ALL', C, sum(sales), avg(sales)
From SalesCube
Group By Y, C
Union
Select 'ALL', 'ALL', P, 'ALL', sum(sales), avg(sales)
From SalesCube
Group By P

• Cost assuming n fact tuples and sufficient cache:

3 * n // read ops

+ 3*500 // {Y,P}

+ 3*80 // {Y,C}

+ 500 // {P}

= 3*n + 2240

Solution 7.4:

Select Y, P, C, sum(sales), avg(sales) into YPC
From SalesCube
Group By Y, P, C;

Select Y, P, sum(sales), avg(sales) into YP
From YPC
Group By Y, P;

Select P, sum(sales), avg(sales)
From YP
Group By P
Union
Select * From YP
Union
Select Y, C, sum(sales), avg(sales)
From YPC
Group By Y, C;
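One caveat when aggregating from auxiliary tables: sums can be summed again, but an average of averages is in general not the overall average. A safe variant keeps sum and count in the auxiliary table and derives the average at the end. The sketch below illustrates this with made-up fact rows; the `aggregate` helper and the data are hypothetical and not part of the original solution.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, productgroup, city, sales)
facts = [
    (2001, "p1", "c1", 10.0),
    (2001, "p1", "c1", 20.0),
    (2001, "p1", "c2", 60.0),
]


def aggregate(rows, key_len):
    """Group rows of the form (key..., sum, count) on the first key_len key columns."""
    acc = defaultdict(lambda: [0.0, 0])
    for *key, s, cnt in rows:
        group = tuple(key[:key_len])
        acc[group][0] += s
        acc[group][1] += cnt
    return [(*group, s, cnt) for group, (s, cnt) in acc.items()]


base = [(y, p, c, sales, 1) for y, p, c, sales in facts]
ypc = aggregate(base, 3)  # plays the role of the auxiliary table YPC
yp = aggregate(ypc, 2)    # derived from YPC only, without rereading the facts

(_, _, total, count), = yp
true_avg = total / count                                  # sum / count
avg_of_avgs = sum(s / c for *_, s, c in ypc) / len(ypc)   # naive avg(avg)

print(true_avg, avg_of_avgs)
```

The two values differ whenever the groups in YPC have different sizes, so storing count alongside sum is the safer design for the intermediate tables.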

Cost assuming n fact tuples:

size YPC is 120.000 tuples

n // read SalesCube

+ 3*500*80 // gen YPC

+ 3*500*80 // read YPC

+ 3*500 // gen YP

+ 3*500*2 // read YP

+ 3*500*80 // read YPC

+ 500 + 500*3 + 3*80 // write result

= n + 4500 + 360000 + 2240 ≈ n + 370000 << 3*n !!

(for realistic size of n)
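To see what "realistic size of n" means, the two cost formulas can be compared directly. This is a sketch under the cost model of the exercise (tuples read plus tuples written); the function names are illustrative.

```python
def cost_single_statement(n):
    """Solution 7.3: three scans of the fact table plus the three results."""
    return 3 * n + 3 * 500 + 3 * 80 + 500


def cost_with_auxiliary_tables(n):
    """Solution 7.4: one scan of the fact table, then work on YPC and YP."""
    ypc = 3 * 500 * 80               # size of YPC
    yp = 3 * 500                     # size of YP
    result = 500 + 500 * 3 + 3 * 80  # tuples of the final result
    # read SalesCube, write YPC, read YPC, write YP,
    # read YP twice, read YPC again, write the result
    return n + ypc + ypc + yp + 2 * yp + ypc + result


n = 1_440_000  # assumption: upper bound from Solution 7.1
print(cost_single_statement(n), cost_with_auxiliary_tables(n))
```

Under these formulas the auxiliary tables pay off as soon as n exceeds 182.250 tuples, where both costs are equal.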

Exercise 8: Clustering

Exercise 8.1: Compute the NN distances for the following set of points and label the corresponding edges.

[Figure: four points A, B, C and D]

Solution 8.1:

NN(A,B) = 1 NN(A,C) = 2 NN(A,D) = 3
NN(B,A) = 2 NN(B,C) = 1 NN(B,D) = 3
NN(C,A) = 3 NN(C,B) = 1 NN(C,D) = 2
NN(D,A) = 3 NN(D,B) = 2 NN(D,C) = 1

Exercise 8.2: Compute the mutual nearest neighbor distances for the points of Exercise 8.1

Solution 8.2:

MND(A,B) = 3, MND(A,C) = 5, MND(A,D) = 6
MND(B,C) = 2, MND(B,D) = 5, MND(C,D) = 3
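The MND values are consistent with the definition MND(x,y) = NN(x,y) + NN(y,x), where NN(x,y) = k means that y is the k-th nearest neighbor of x. A short sketch that recomputes them from the NN table; the dictionary layout and the function name `mnd` are illustrative choices.

```python
from itertools import combinations

# nn[x][y] = k means: y is the k-th nearest neighbor of x (values from Solution 8.1).
nn = {
    "A": {"B": 1, "C": 2, "D": 3},
    "B": {"A": 2, "C": 1, "D": 3},
    "C": {"A": 3, "B": 1, "D": 2},
    "D": {"A": 3, "B": 2, "C": 1},
}


def mnd(x, y):
    """Mutual nearest neighbor distance: sum of the two neighbor ranks."""
    return nn[x][y] + nn[y][x]


mnd_table = {(x, y): mnd(x, y) for x, y in combinations(sorted(nn), 2)}
for pair, value in mnd_table.items():
    print(pair, value)
```

Unlike NN, the resulting measure is symmetric, which makes it usable as a distance between pairs of points.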

Solution 8.3: Minimal spanning tree

[Figure: minimal spanning tree over the points A, B, C, D, E, F, G, H, I and J]

Solution 8.4:

2 clusters: {J}, {A,B,C,D,E,F,G,H,I}
4 clusters: {J}, {A,B,C,I}, {H}, {D,E,F,G} or {J}, {A,B,C,H}, {I}, {D,E,F,G}
5 clusters: {J}, {A,B,C}, {I}, {H}, {D,E,F,G}
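A common way to obtain k clusters from a minimal spanning tree is to remove its k-1 longest edges, which is presumably how the clusters above were derived. The sketch below does this with Kruskal's algorithm, stopping once k components remain. The coordinates are made up for illustration, since the positions in the figure are not available, and `mst_clusters` is a hypothetical helper name.

```python
from itertools import combinations
from math import dist


def mst_clusters(points, k):
    """Split points into k clusters by dropping the k-1 longest MST edges.

    Implemented as Kruskal's algorithm that stops when k components remain.
    """
    parent = {name: name for name in points}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    edges = sorted(
        combinations(points, 2),
        key=lambda e: dist(points[e[0]], points[e[1]]),
    )
    components = len(points)
    for a, b in edges:
        if components == k:
            break
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # edge joins two components, so it is an MST edge
            parent[root_a] = root_b
            components -= 1

    clusters = {}
    for name in points:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(clusters.values(), key=sorted)


# Hypothetical coordinates, not the ones from the exercise figure.
points = {
    "p1": (0, 0), "p2": (1, 0), "p3": (0, 1),
    "p4": (10, 10), "p5": (11, 10),
    "p6": (30, 30),
}
print(mst_clusters(points, 2))
print(mst_clusters(points, 3))
```

When two candidate edges have the same length, the choice of which one to cut is arbitrary, which is why a solution can list alternative clusterings for the same number of clusters.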

Exercise 8.5: Which clusters result from the k-means algorithm if we use the small circles as starting centroids for the clusters?

[Figure: the points A to J, with small circles (o) marking the starting centroids]
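For reference, the k-means procedure alternates two steps: assign every point to its nearest centroid, then move each centroid to the mean of its assigned points, until the assignment no longer changes. The following is a minimal sketch of that loop with made-up coordinates; it does not reproduce the figure, and the function name `k_means` and its parameters are illustrative.

```python
from math import dist


def k_means(points, start_centroids, max_iter=100):
    """Alternate assignment and centroid update until the assignment is stable."""
    centroids = list(start_centroids)  # copy so the caller's list is not modified
    assignment = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        new_assignment = [
            min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return assignment, centroids


# Hypothetical data: two obvious groups and two starting centroids.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
assignment, centroids = k_means(points, [(0, 0), (10, 10)])
print(assignment, centroids)
```

The result depends on the starting centroids, which is why the exercise fixes them in the figure.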