Data Warehousing & Data Mining
Wolf-Tilo Balke, Silviu Homoceanu
Institut für Informationssysteme
Technische Universität Braunschweig
http://www.ifis.cs.tu-bs.de

5. Queries

5.1 OLAP query languages
  – OLAP operators in SQL
  – MDX (MultiDimensional eXpressions)

5.2 Data modeling
  – Logical modeling - implementation

Data Warehousing & OLAP – Wolf-Tilo Balke – Institut für Informationssysteme – TU Braunschweig 2

5.1 How does OLAP work?

[Figure: three architecture stacks, each with a Presentation layer on top of an OLAP interface: one over an MDDB, one over a HOLAP server (RDBMS and MDDB), and one over a ROLAP server (RDBMS).]

5.1 How does OLAP work?

• OLAP systems
  – Client/server architecture
    • The client displays reports and allows interaction with the end user to perform the OLAP operations and other custom queries
    • The server is responsible for providing the requested data. How? It depends on whether it is MOLAP, ROLAP, HOLAP, etc.

5.1 How does OLAP work?

• OLAP server
  – High-capacity, multi-user data manipulation engine specifically designed to support and operate on multidimensional data structures
  – It is optimized for
    • Fast, flexible calculation and transformation of raw data based on formulaic relationships

5.1 How does OLAP work?

• OLAP server may either
  – Physically stage the processed multidimensional information to deliver consistent and rapid response times to end users
    • MOLAP
  – Populate its data structures in real-time from relational or other databases
    • ROLAP
  – Or offer a choice of both
    • HOLAP

5.1 How does OLAP work?

• We have seen that
  – The best way to represent data at the presentation level is multidimensional
    • Regardless of whether the storage is multidimensional (MOLAP) or relational (ROLAP)
    • Optimal for analysis purposes: easy for decision makers to understand, a natural representation of the data in businesses, etc.

[Figure: data cube with sample cell values]

5.1 OLAP query languages

• Getting from OLAP operations to the data
  – As in the relational model, through queries
    • In OLTP we have SQL as the standard query language
  – However, OLAP operations are hard to express in SQL
  – There is no standard query language for OLAP
  – Choices are:
    • SQL-99
    • MDX (Multidimensional Expressions)

5.1 OLAP query languages

• SQL-99
  – Prepares SQL for OLAP queries
  – New SQL commands
    • GROUPING SETS
    • ROLLUP
    • CUBE
  – New aggregate functions
  – Queries of type “top k”

5.1 SQL-99

• Shortcomings of SQL/92 with regard to OLAP queries
  – Hard or impossible to express in SQL
    • Multiple aggregations
    • Comparisons (with aggregation)
    • Reporting features
  – Performance penalty
    • Poor execution of queries with many AND and OR conditions
  – Lack of support for statistical functions

5.1 SQL-99

• Multiple aggregations in SQL/92
  – Create a 2D spreadsheet that shows sum of sales by maker as well as car model
  – Each subtotal requires a separate aggregate query

[Figure: cross-tab of sales with columns BMW and Mercedes, rows SUV, Sedan and Sport, subtotals by model and by make, and the overall SUM]

    SELECT model, make, SUM(amt) FROM sales GROUP BY model, make
    UNION ALL
    SELECT model, NULL, SUM(amt) FROM sales GROUP BY model
    UNION ALL
    SELECT NULL, make, SUM(amt) FROM sales GROUP BY make
    UNION ALL
    SELECT NULL, NULL, SUM(amt) FROM sales;

  – The NULL columns pad each branch to the same three columns, which a union requires
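The four-query pattern can be tried out with a minimal sketch like the one below. It uses Python's built-in sqlite3 module as a stand-in for a warehouse RDBMS; the sample rows are hypothetical, and only the table and column names (sales, model, make, amt) come from the slide.

```python
import sqlite3

# Hypothetical sample data; only the table and column names follow the slide.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (model TEXT, make TEXT, amt INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("SUV", "BMW", 10), ("SUV", "Mercedes", 20),
     ("Sedan", "BMW", 30), ("Sedan", "Mercedes", 40)],
)

# One SELECT per aggregation level. NULL pads the columns that are
# aggregated away, so every branch returns the same three columns.
query = """
SELECT model, make, SUM(amt) FROM sales GROUP BY model, make
UNION ALL
SELECT model, NULL, SUM(amt) FROM sales GROUP BY model
UNION ALL
SELECT NULL, make, SUM(amt) FROM sales GROUP BY make
UNION ALL
SELECT NULL, NULL, SUM(amt) FROM sales
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With two dimensions this already takes four scans of the fact table; every additional dimension doubles the number of branches.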

5.1 SQL-99

• Comparisons in SQL/92
  – This year’s sales vs. last year’s sales for each product
    • Requires a self-join

    CREATE VIEW v_sales AS
      SELECT prod_id, year, SUM(qty) AS sale_sum
      FROM sales GROUP BY prod_id, year;

    SELECT cur.prod_id, cur.year, cur.sale_sum, last.year, last.sale_sum
      FROM v_sales cur, v_sales last
      WHERE cur.year = (last.year + 1) AND cur.prod_id = last.prod_id;
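A small runnable sketch of the self-join comparison, using Python's built-in sqlite3 module as a stand-in database. The sample rows are hypothetical, and the alias prev replaces last purely for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (prod_id INTEGER, year INTEGER, qty INTEGER)")
# Hypothetical rows: product 1 sells in both years, product 2 only in 2008.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, 2007, 5), (1, 2007, 5), (1, 2008, 15), (2, 2008, 7)],
)
conn.execute(
    "CREATE VIEW v_sales AS "
    "SELECT prod_id, year, SUM(qty) AS sale_sum FROM sales GROUP BY prod_id, year"
)

# Self-join of the view: pair every (product, year) with the same
# product's row from the previous year.
query = """
SELECT cur.prod_id, cur.year, cur.sale_sum, prev.year, prev.sale_sum
FROM v_sales cur, v_sales prev
WHERE cur.year = prev.year + 1 AND cur.prod_id = prev.prod_id
"""
comparison = conn.execute(query).fetchall()
print(comparison)
```

Product 2 does not appear in the result because it has no row for the previous year; the inner self-join silently drops such products.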

5.1 SQL-99

• Reporting features in SQL/92
  – Too complex to express
    • RANK (top k) and NTILE (“top X%” of all products)
    • Median
    • Running total, moving average, cumulative totals
      – E.g., moving average over a 3 day window of total sales for each product (assuming time_id is a consecutive day number)

    CREATE OR REPLACE VIEW v_sales AS
      SELECT prod_id, time_id, SUM(qty) AS sale_sum
      FROM sales GROUP BY prod_id, time_id;

    SELECT e.prod_id, e.time_id, AVG(s.sale_sum)
      FROM v_sales s, v_sales e
      WHERE s.prod_id = e.prod_id
        AND e.time_id >= s.time_id AND e.time_id <= s.time_id + 2
      GROUP BY e.prod_id, e.time_id;

5.1 SQL-99

• Grouping operators
  – Extensions to the GROUP BY operator
    • GROUPING SETS
    • CUBE
    • ROLLUP

5.1 Grouping operators

• GROUPING SETS
  – Used for reporting purposes
  – Replaces the series of UNIONed queries

    SELECT dept_name, CAST(NULL AS CHAR(10)) AS job_title, COUNT(*)
      FROM personnel
      GROUP BY dept_name
    UNION ALL
    SELECT CAST(NULL AS CHAR(8)) AS dept_name, job_title, COUNT(*)
      FROM personnel
      GROUP BY job_title;

    • Can be re-written as:

    SELECT dept_name, job_title, COUNT(*)
      FROM personnel
      GROUP BY GROUPING SETS (dept_name, job_title);

5.1 Grouping sets

• The issue of NULL values
  – The new grouping functions generate NULL values at the subtotal levels
  – So we have generated NULLs and real NULLs from the data itself
  – How do we tell the difference?
    • Through the return value of the GROUPING function: GROUPING(job_title) returns 0 for a NULL in the data and 1 for a generated NULL

5.1 Grouping operators

• ROLLUP
  – Produces a result set that contains subtotal rows in addition to regular grouped rows
  – GROUP BY ROLLUP (a, b, c) is equivalent to
    GROUP BY GROUPING SETS ((a, b, c), (a, b), (a), ())
  – N elements of the ROLLUP translate to (N+1) grouping sets
  – Order is significant to ROLLUP!
    • GROUP BY ROLLUP (c, b, a) is equivalent to the grouping sets (c, b, a), (c, b), (c), ()
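The expansion rule can be written down in a few lines. The helper below is a hypothetical illustration in Python (it is not part of any SQL engine): it lists the grouping sets that a ROLLUP over the given columns stands for.

```python
def rollup_grouping_sets(columns):
    """Return the N+1 grouping sets that ROLLUP(columns) stands for."""
    cols = tuple(columns)
    # Drop one trailing column at a time: (a, b, c), (a, b), (a), ()
    return [cols[:n] for n in range(len(cols), -1, -1)]


print(rollup_grouping_sets(["a", "b", "c"]))
print(rollup_grouping_sets(["c", "b", "a"]))
```

Because only trailing columns are dropped, reversing the column list produces a different family of subtotals, which is why the order matters.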

5.1 ROLLUP

• ROLLUP operation, e.g.:
  – SELECT year, brand, SUM(qty) FROM sales GROUP BY ROLLUP(year, brand);

    Year  Brand     SUM(qty)  Grouping set
    2008  Mercedes  250       (year, brand)
    2008  BMW       300       (year, brand)
    2008  VW        450       (year, brand)
    2008            1000      (year)
    2009  Mercedes  50        (year, brand)
    …     …         …
    2009            400       (year)
                    1400      (ALL)

5.1 Grouping operators

• CUBE operator
  – Contains all the subtotal rows of a ROLLUP and in addition cross-tabulation rows
  – Can also be thought of as a series of GROUPING SETS
  – All combinations of the cubed grouping expressions are computed along with the grand total
    • N elements of a CUBE translate to 2^N grouping sets:
      – GROUP BY CUBE (a, b, c) is equivalent to
        GROUP BY GROUPING SETS ((a, b, c), (a, b), (a, c), (b, c), (a), (b), (c), ())
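As a sketch of why CUBE yields 2^N grouping sets: every subset of the N grouping columns is one grouping set. The helper below is a hypothetical Python illustration, not an engine API.

```python
from itertools import combinations


def cube_grouping_sets(columns):
    """Return all 2^N grouping sets that CUBE(columns) stands for."""
    cols = tuple(columns)
    # Every subset of the columns, from the full set down to the empty set.
    return [
        subset
        for size in range(len(cols), -1, -1)
        for subset in combinations(cols, size)
    ]


sets_abc = cube_grouping_sets(["a", "b", "c"])
print(sets_abc)
print(len(sets_abc))
```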

5.1.1 CUBE operator

[Figure: The Data Cube and The Sub-Space Aggregates. Panels: Aggregate (Sum); Group By (with total) by model; Cross Tab by make (BMW, MERC) and by model (SUV, SEDAN, SPORT); and the data cube with aggregates by make, by model, by year, by make & model, by make & year, by model & year, and the overall Sum.]

5.1 CUBE

• E.g., CUBE operator
  – SELECT year, brand, SUM(qty) FROM sales GROUP BY CUBE (year, brand);

    Year  Brand     SUM(qty)  Grouping set
    2008  Mercedes  250       (year, brand)
    2008  BMW       300       (year, brand)
    2008  VW        450       (year, brand)
    2008            1000      (year)
    2009  Mercedes  50        (year, brand)
    …     …         …
    2009            400       (year)
          Mercedes  300       (brand)
          BMW       350       (brand)
          VW        650       (brand)
                    1400      (ALL)

5.1 OLAP functions

• Diagram of OLAP function evaluation

[Figure: evaluation pipeline Partitioning → Sorting → Dynamic Window → Aggregation, corresponding to OVER (PARTITION BY … ORDER BY … ROWS BETWEEN …) and to functions such as RANK(), SUM()…]

5.1 OLAP functions

• The window clause
  – Specifies that we want to perform an action over a set of rows
  – 3 sub-clauses: partitioning, ordering and aggregation grouping
  – General format:
    <aggregate function> OVER ([PARTITION BY <column list>] ORDER BY <sort column list> [<aggregation grouping>])

5.1 Window clause

• Moving averages are hard to compute with SQL-92
  – It involves multiple self-joins of the fact table
• With the window clause we can create dynamic windows, expressed in the <aggregation grouping>
  • SELECT … AVG(sales) OVER (PARTITION BY region ORDER BY month ASC ROWS 2 PRECEDING) AS SMA3 …
    – Moving average of 3 rows
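A minimal runnable sketch of the moving average, assuming Python's built-in sqlite3 module with an SQLite build that supports window functions (3.25 or later). The table monthly_sales and its rows are hypothetical, and the frame is spelled in the long form ROWS BETWEEN 2 PRECEDING AND CURRENT ROW, which denotes the same window as ROWS 2 PRECEDING.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE monthly_sales (region TEXT, month INTEGER, sales REAL)")
# Hypothetical sample data.
conn.executemany(
    "INSERT INTO monthly_sales VALUES (?, ?, ?)",
    [("North", 1, 10.0), ("North", 2, 20.0), ("North", 3, 30.0), ("North", 4, 40.0),
     ("South", 1, 100.0), ("South", 2, 200.0)],
)

# Partition by region, sort by month, then average the current row
# together with up to two preceding rows of the same partition.
query = """
SELECT region, month,
       AVG(sales) OVER (PARTITION BY region
                        ORDER BY month ASC
                        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS sma3
FROM monthly_sales
ORDER BY region, month
"""
sma = conn.execute(query).fetchall()
for row in sma:
    print(row)
```

The first rows of each partition are averaged over fewer than three rows, because the window never reaches across a partition boundary.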

5.1 Ranking in SQL

• Ranking operators in SQL
  – Row numbering is the most basic ranking function
    • ROW_NUMBER() returns, as a computed column, each row’s sequential number within the result set
    • E.g.,

    SELECT SalesOrderID, CustomerID,
           ROW_NUMBER() OVER (ORDER BY SalesOrderID) AS RunningCount
      FROM Sales
      WHERE SalesOrderID > 10000
      ORDER BY SalesOrderID;

    SalesOrderID  CustomerID  RunningCount
    43659         543         1
    43660         234         2
    43661         143         3
    43662         213         4
    43663         312         5

5.1 Ranking in SQL

• ROW_NUMBER doesn’t consider tied values
  – Two rows with equal values get two different row numbers

    SalesOrderID  RunningCount
    43659         1
    43659         2
    43660         3
    43661         4

  – The behavior is non-deterministic
    • Tied rows may swap their numbers from one execution to the next

5.1 Ranking in SQL

• RANK and DENSE_RANK functions
  – Allow ranking items in a group
  – The difference between RANK and DENSE_RANK is that DENSE_RANK leaves no gaps in the ranking sequence when there are ties
  – Syntax:
    • RANK ( ) OVER ( [query_partition_clause] order_by_clause )
    • DENSE_RANK ( ) OVER ( [query_partition_clause] order_by_clause )

5.1 Ranking in SQL

• SQL99 ranking, e.g.,

    SELECT channel, calendar,
           TO_CHAR(TRUNC(SUM(amount_sold), -6), '9,999,999') SALES,
           RANK() OVER (ORDER BY TRUNC(SUM(amount_sold), -6) DESC) AS RANK,
           DENSE_RANK() OVER (ORDER BY TRUNC(SUM(amount_sold), -6) DESC) AS DENSE_RANK
      FROM sales, products …

    CHANNEL       CALENDAR  SALES   RANK  DENSE_RANK
    Direct sales  02.2009   10,000  1     1
    Direct sales  03.2009   9,000   2     2
    Internet      02.2009   6,000   3     3
    Internet      03.2009   6,000   3     3
    Partners      03.2009   4,000   5     4
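The gap behaviour of RANK versus DENSE_RANK can be reproduced with a small sketch. It assumes Python's built-in sqlite3 module with window-function support (SQLite 3.25 or later) and a hypothetical, already aggregated table channel_sales holding the five figures from the table above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical pre-aggregated table with the figures from the slide.
conn.execute("CREATE TABLE channel_sales (channel TEXT, calendar TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO channel_sales VALUES (?, ?, ?)",
    [("Direct sales", "02.2009", 10000), ("Direct sales", "03.2009", 9000),
     ("Internet", "02.2009", 6000), ("Internet", "03.2009", 6000),
     ("Partners", "03.2009", 4000)],
)

# RANK leaves a gap after the tie at 6000, DENSE_RANK does not.
query = """
SELECT channel, calendar, sales,
       RANK()       OVER (ORDER BY sales DESC) AS rnk,
       DENSE_RANK() OVER (ORDER BY sales DESC) AS dense_rnk
FROM channel_sales
ORDER BY sales DESC, calendar
"""
ranking = conn.execute(query).fetchall()
for row in ranking:
    print(row)
```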

5.1 Ranking in SQL

• Other flavors of ranking
  – Group ranking
    • The RANK function can operate within groups: the rank gets reset whenever the group changes
    • A single query can contain more than one ranking function, each partitioning the data into different groups

5.1 Group Ranking

• This is accomplished with the PARTITION BY clause
  – E.g., SELECT … RANK() OVER (PARTITION BY channel ORDER BY SUM(amount_sold) DESC) AS RANK_BY_CHANNEL

    CHANNEL       CALENDAR  SALES   RANK_BY_CHANNEL
    Direct sales  02.2009   10,000  1
    Direct sales  03.2009   9,000   2
    Internet      02.2009   6,000   1
    Internet      03.2009   6,000   1
    Partners      03.2009   4,000   1

5.1 Ranking in SQL

• The treatment of NULL values: NULLs are treated as normal values
  – For ranking purposes, a NULL value is equal to another NULL value
  – They are given ranks according to
    • The ASC | DESC options provided for measures
    • The NULLS FIRST | NULLS LAST clause

    MONTH  SOLD   NULLS FIRST  NULLS LAST  NULLS FIRST  NULLS LAST
                  ASC          ASC         DESC         DESC
    01     34535  5            3           3            1
    02     32123  4            2           4            2
    03     27500  3            1           5            3
    04     NULL   1            4           1            4
    05     NULL   1            4           1            4

5.1 Ranking in SQL

• Top k ranking
  – By enclosing the RANK function in a sub-query and then applying a filter condition outside the sub-query

    SELECT * FROM
      (SELECT country_id,
              SUM(amount_sold) SALES,
              RANK() OVER (ORDER BY SUM(amount_sold) DESC) AS COUNTRY_RANK
         FROM sales, products, customers, times, channels
         WHERE ... GROUP BY country_id)
    WHERE COUNTRY_RANK <= 5;
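A runnable sketch of the top-k pattern, assuming Python's built-in sqlite3 module (SQLite 3.25 or later). The data is hypothetical and the fact table is reduced to two columns. As a slightly more conservative layout than the query above, the aggregate is computed in its own sub-query before RANK is applied; the filter on the rank still sits outside, since a window function cannot be referenced in the WHERE clause of the query that computes it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (country_id TEXT, amount_sold INTEGER)")
# Hypothetical rows; totals are US 200, DE 120, FR 90, ES 60, IT 30.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("DE", 50), ("DE", 70), ("US", 200), ("FR", 90),
     ("IT", 30), ("ES", 40), ("ES", 20)],
)

K = 3  # number of top countries to keep (illustrative choice)
query = """
SELECT country_id, total, country_rank
FROM (SELECT country_id, total,
             RANK() OVER (ORDER BY total DESC) AS country_rank
      FROM (SELECT country_id, SUM(amount_sold) AS total
            FROM sales
            GROUP BY country_id) AS totals) AS ranked
WHERE country_rank <= ?
ORDER BY country_rank
"""
top_k = conn.execute(query, (K,)).fetchall()
print(top_k)
```

Because RANK assigns equal ranks to ties, the result can contain more than k rows when several countries share the k-th place.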

5.1 NTILE

• NTILE
  – Not a part of SQL99 standards but adopted by major vendors
  – Splits a set into equal groups
    • It divides an ordered partition into buckets and assigns a bucket number to each row in the partition
    • Buckets are calculated so that each bucket has exactly the same number of rows assigned to it or at most 1 row more than the others

5.1 NTILE

• SELECT … NTILE(3) OVER (ORDER BY sales DESC) NT_3 FROM …

    CHANNEL       CALENDAR  SALES   NT_3
    Direct sales  02.2009   10,000  1
    Direct sales  03.2009   9,000   1
    Internet      02.2009   6,000   2
    Internet      03.2009   6,000   2
    Partners      03.2009   4,000   3

  – NTILE(4) – quartile
  – NTILE(100) – percentile
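The bucket assignment can be checked with a short sketch, again assuming Python's built-in sqlite3 module (SQLite 3.25 or later) and a hypothetical table channel_sales that holds the five sales figures from the table above. The rows are ordered by descending sales so that the largest values land in bucket 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE channel_sales (channel TEXT, calendar TEXT, sales INTEGER)")
# Hypothetical table holding the figures from the slide.
conn.executemany(
    "INSERT INTO channel_sales VALUES (?, ?, ?)",
    [("Direct sales", "02.2009", 10000), ("Direct sales", "03.2009", 9000),
     ("Internet", "02.2009", 6000), ("Internet", "03.2009", 6000),
     ("Partners", "03.2009", 4000)],
)

# Five rows in three buckets: the bucket sizes are 2, 2 and 1.
query = """
SELECT sales, NTILE(3) OVER (ORDER BY sales DESC) AS nt_3
FROM channel_sales
ORDER BY sales DESC
"""
buckets = conn.execute(query).fetchall()
print(buckets)
```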

5.1 MDX

• MDX (MultiDimensional eXpressions)
  – Developed by Microsoft
    • Not really brilliant
    • But adopted by major OLAP providers due to Microsoft's market leader position
  – Used in
    • OLE DB for OLAP (ODBO) with API support
    • XML for Analysis (XMLA): specification of web services for OLAP

5.1 MDX

• Similar to SQL syntax

    SELECT {Deutschland, Niedersachsen, Bayern, Frankfurt} ON COLUMNS,
           {Qtr1.CHILDREN, Qtr2, Qtr3} ON ROWS
    FROM SalesCube
    WHERE (Measures.Sales, Time.[2008], Products.[All Products]);

  – SELECT
    • Axis dimensions, on columns and rows
  – FROM
    • Data source cube specification
    • If joined, data cubes must share dimensions
  – WHERE
    • Slicer: restricts the data area

5.1 MDX

• Lists
  – Enumeration of elementary nodes from different classification levels
    • E.g. {Deutschland, Niedersachsen, [Frankfurt am Main], USA}
• Generated elements
  – Methods which lead to new sets of the classification levels
    • Deutschland.CHILDREN generates: {Niedersachsen, Bayern,…}
    • Niedersachsen.PARENT generates Deutschland
    • Time.Quarter.MEMBERS generates all the elements of the classification level
• Functional generation of sets
  – DESCENDANTS(USA, Cities): descendants of USA at the Cities classification level
  – GENERATE({USA, France}, DESCENDANTS(Geography.CURRENT, Cities)): enumerates all the cities in USA and France

5.1 MDX

• Set nesting combines individual coordinates to reduce dimensionality

    SELECT CROSSJOIN({Deutschland, Sachsen, Hannover, BS}, {Ikeea, [H&M-Möbel]}) ON COLUMNS,
           {Qtr1.CHILDREN, Qtr2} ON ROWS
    FROM salesCube
    WHERE (Measures.Sales, Time.[2008], Products.[All Products]);

[Result layout: the columns are Deutschland, Sachsen, Hannover and BS, each subdivided into Ikeea and H&M-Möbel; the rows are Jan 08, Feb 08, Mar 08 and Qtr2.]

5.1 MDX

• Relative selection
  – Uses the order in the dimensional structures
    • Time.[2008].LastChild : last quarter of 2008
    • [2008].NextMember : {[2009]}
    • [2008].[Qtr4].Nov.Lead(2) : Jan 2009
    • [2006]:[2009] represents [2006], .., [2009]
• Methods for hierarchy information extraction
    • Deutschland.LEVEL : country
    • Time.LEVELS(1) : Year
• Brackets
    • {}: sets, e.g. {Hannover, BS, John}
    • []: text interpretation of numbers, empty spaces between words or other symbols
      – E.g. [2008], [Frankfurt am Main], [H&M]
    • (): tuple, e.g. WHERE (Measures.Sales, Time.[2008], Products.[All Products])

5.1 MDX

• Special functions and filters
  – Special functions TOPCOUNT(), TOPPERCENT(), TOPSUM()
    • E.g.

    SELECT {Time.CHILDREN} ON COLUMNS,
           {TOPCOUNT(Deutschland.CHILDREN, 5, Sales.turnover)} ON ROWS
    FROM salesCube
    WHERE (Measures.Sales, Time.[2008]);

  – Filter function
    • E.g.

    SELECT FILTER(Deutschland.CHILDREN, ([2008], Turnover) > ([2007], Turnover)) ON COLUMNS,
           Quarters.MEMBERS ON ROWS
    FROM salesCube
    WHERE (Measures.Sales, Time.[2008], Products.Electronics);

5.1 MDX

• Time series
  – Set value expressions
    • Choosing time intervals
      – PERIODSTODATE(Quarter, [15-Nov-2008]): returns 1.10.-15.11.2008
      – LASTPERIODS(3, [Sept-2008]): returns [July-2008], [Aug-2008], [Sept-2008] (the given member is included)
  – Member value expressions
    • Pre-periods
      – PARALLELPERIOD(Year, 3, [Sep-2008]): returns [Sep-2005]
  – Numerical functions
    • COVARIANCE, CORRELATION
    • LINEAR REGRESSION

5.1 mdXML

• XMLA (XML for Analysis)
  – Most recent attempt at a standardized API for OLAP
  – Allows client applications to talk to multi-dimensional data sources
  – In XMLA, mdXML is an XML wrapper for MDX
  – Underlying technologies
    • XML, SOAP, HTTP
  – Service primitives
    • DISCOVER
      – Retrieve information about available data sources, data schemas, server infos…
    • EXECUTE
      – Transmission of a query and return of the corresponding result

5.2 Data Modeling

• Now we know
  – What OLAP looks like
    • From the outside
    • From the inside
      – SQL 99, MDX
  – How data is modeled
    • Conceptual level
      – mUML, ME/R
    • Logical level
      – Cubes, dimensions

[Figure: data cube with a Store dimension and a Product dimension]

5.2 Data Modeling

• But how do we implement it…
  – On a logical level
  – On a physical level

[Figure: database design process. Requirement Analysis produces the data requirements for Conceptual Design (conceptual schema), followed by Logical Design (logical schema) and Physical Design; Functional Analysis leads to Application Program Design and Transaction Implementation of the application. The phases are marked as DBMS independent or DBMS dependent.]

5.2 Implementation

• Implementation of the multidimensional data model can be:
  – Relational
    • Snowflake-schema
    • Star-schema
  – Multidimensional
    • Array technique

5.2 Implementation

• Relational implementation
  – Main goals:
    • Lose as little semantic knowledge as possible, e.g., classification hierarchies
    • The translation from multidimensional queries must be efficient
    • The RDBMS should be able to run the translated queries efficiently
    • The maintenance of the present tables should be easy and fast, e.g., when loading new data

5.2 Implementation

• Going from multidimensional to relational
  – Representations for cubes, dimensions, classification hierarchies and attributes
  – Implementation of cubes without the classification hierarchies is easy
    • A table can be seen as a cube
    • A column of a table can be considered as a dimension mapping
    • A tuple in the table represents a cell in the cube
    • If we interpret only a part of the columns as dimensions we can use the rest as measures
    • The resulting table is called a fact table

5.2 Implementation

[Figure: data cube with the dimensions Product (Laptops, Mobile p.), Time (13.11.2008, 18.12.2008) and Geography, mapped to the fact table below]

    Article        Store                 Day         Sales
    Laptops        Hannover, Saturn      13.11.2008  6
    Mobile Phones  Hannover, Saturn      18.12.2008  24
    Laptops        Braunschweig, Saturn  18.12.2008  3

Page 49: Data Warehousing & Data Mining - ifis.cs.tu-bs.de · HOLAP Server RDBMS MDDB ROLAP Server RDBMS • OLAP systems – Client/server architecture • The client displays reports and

5.2 Snowflake Schema

• Snowflake schema

– Simple idea: use a table for each classification level

• This table includes the ID of the classification level and other attributes

• Two neighboring classification levels are connected by 1:n relationships, e.g., n Days belong to 1 Month

• The measures of a cube are maintained in a fact table

• Besides the measures, the fact table holds the foreign key IDs of the finest classification level of each dimension
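
A minimal sketch of these rules for a single dimension with two classification levels (Day and Month), again with sqlite3. Table names, column names and sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One table per classification level; Day references Month (n:1).
CREATE TABLE month (month_id INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE day (
    day_id INTEGER PRIMARY KEY,
    description TEXT,
    month_id INTEGER REFERENCES month(month_id)
);
-- The fact table holds the measure and the key of the finest level only.
CREATE TABLE sales (
    day_id INTEGER REFERENCES day(day_id),
    sales INTEGER
);
-- Hypothetical sample data.
INSERT INTO month VALUES (1, 'Nov 2008'), (2, 'Dec 2008');
INSERT INTO day VALUES (10, '13.11.2008', 1), (20, '17.12.2008', 2),
                       (21, '18.12.2008', 2);
INSERT INTO sales VALUES (10, 6), (20, 5), (21, 24), (21, 3);
""")

# Rolling up from Day to Month needs one join per classification level.
monthly = conn.execute("""
    SELECT m.description, SUM(s.sales)
    FROM sales s
    JOIN day d ON d.day_id = s.day_id
    JOIN month m ON m.month_id = d.month_id
    GROUP BY m.month_id, m.description
    ORDER BY m.month_id
""").fetchall()
```

Every additional classification level in the hierarchy adds one more join to such a roll-up query.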


5.2 Snowflake Schema

• Snowflake?

– The facts/measures are in the center

– The dimensions spread out in each direction and branch out with their granularity


5.2 Snowflake Example

[Figure: snowflake schema with the fact table in the center and the dimension tables around it. Neighboring tables are connected by n:1 relationships, from the finer to the coarser level.]

Fact table

Sales (Product_ID, Day_ID, Store_ID, Sales, Revenue)

Dimension tables, product

Product (Product_ID, Description, Brand, Product_group_ID)
Product group (Product_group_ID, Description, Product_categ_ID)
Product category (Product_category_ID, Description)

Dimension tables, time

Day (Day_ID, Description, Month_ID, Week_ID)
Week (Week_ID, Description, Year_ID)
Month (Month_ID, Description, Quarter_ID)
Quarter (Quarter_ID, Description, Year_ID)
Year (Year_ID, Description)

Dimension tables, location

Store (Store_ID, Description, State_ID)
State (State_ID, Description, Region_ID)
Region (Region_ID, Description, Country_ID)
Country (Country_ID, Description)


5.2 Snowflake Schema

• Snowflake schema – Advantages

– With a snowflake schema the size of the dimension tables is reduced and queries run faster

• If a dimension is very sparse (most measures corresponding to the dimension have no data)

• And/or if a dimension has a long list of attributes which may be queried


5.2 Snowflake Schema

• Snowflake schema – Disadvantages

– Fact tables are responsible for 90% of the storage requirements

• Thus, normalizing the dimensions usually leads to insignificant improvements

– Normalization of the dimension tables can reduce the performance of the DW because it leads to a large number of tables

• E.g., when connecting dimensions at a coarse granularity, all tables along each hierarchy are joined with each other during queries

• A query which connects Product category with Year and Country clearly performs poorly (10 tables need to be connected)
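
One way to read the figure of 10 tables is sketched below. It rests on two assumptions that the slide does not state: the time hierarchy is walked along Day, Week, Year (the Month/Quarter path would add one more table), and only dimension tables are counted, so the Sales fact table comes on top. The dictionaries are hand-copied from the snowflake example.

```python
# Parent links between classification levels, read off the snowflake example.
# Assumption: Day rolls up via Week; the alternative path is Day, Month, Quarter, Year.
PARENT = {
    "Product": "Product group",
    "Product group": "Product category",
    "Day": "Week",
    "Week": "Year",
    "Store": "State",
    "State": "Region",
    "Region": "Country",
}

# Finest level (the one referenced by the fact table) for each target level.
BASE_LEVEL = {"Product category": "Product", "Year": "Day", "Country": "Store"}


def tables_on_path(base, target):
    """Return the dimension tables visited when walking from base up to target."""
    path = [base]
    while path[-1] != target:
        path.append(PARENT[path[-1]])
    return path


def dimension_tables_needed(targets):
    """Collect all dimension tables that a query over the given levels must join."""
    tables = []
    for target in targets:
        tables.extend(tables_on_path(BASE_LEVEL[target], target))
    return tables


needed = dimension_tables_needed(["Product category", "Year", "Country"])
```

Under these assumptions needed contains 3 product tables, 3 time tables and 4 location tables.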


5.2 Star Schema

• Star schema

– Basic idea: use a denormalized schema for all the dimensions

• A star schema can be obtained from the snowflake schema through the denormalization of the tables belonging to a dimension


5.2 Star Schema - Example

[Figure: star schema with the fact table in the center. Each dimension table is connected to the fact table by a 1:n relationship.]

Fact table

Sales (Product_ID, Time_ID, Geo_ID, Sales, Revenue)

Dimension tables

Product (Product_ID, Product group, Product category, Description)
Time (Time_ID, Day, Week, Month, Quarter, Year)
Geography (Geo_ID, Store, State, Region, Country)
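
A query against such a star schema needs exactly one join per dimension, no matter which classification level it groups by. The sketch below assumes simplified tables (product_dim, time_dim, geo_dim, sales_fact) with fewer columns than in the example and made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, shortened versions of the three dimension tables.
CREATE TABLE product_dim (product_id INTEGER PRIMARY KEY, description TEXT,
                          product_group TEXT, product_category TEXT);
CREATE TABLE time_dim (time_id INTEGER PRIMARY KEY, day TEXT, month TEXT,
                       year INTEGER);
CREATE TABLE geo_dim (geo_id INTEGER PRIMARY KEY, store TEXT, country TEXT);
CREATE TABLE sales_fact (product_id INTEGER, time_id INTEGER, geo_id INTEGER,
                         sales INTEGER);

-- Sample rows are made up for illustration.
INSERT INTO product_dim VALUES (10, 'E71', 'Mobile Phones', 'Electronics'),
                               (11, 'PS-42A', 'TV', 'Electronics');
INSERT INTO time_dim VALUES (1, '13.11.2008', 'November', 2008),
                            (2, '18.12.2008', 'December', 2008);
INSERT INTO geo_dim VALUES (1, 'Hannover, Saturn', 'Germany'),
                           (2, 'Braunschweig, Saturn', 'Germany');
INSERT INTO sales_fact VALUES (10, 1, 1, 6), (11, 2, 1, 2), (10, 2, 2, 3);
""")

# Product category by Year by Country: three joins in total.
result = conn.execute("""
    SELECT p.product_category, t.year, g.country, SUM(f.sales)
    FROM sales_fact f
    JOIN product_dim p ON p.product_id = f.product_id
    JOIN time_dim t ON t.time_id = f.time_id
    JOIN geo_dim g ON g.geo_id = f.geo_id
    GROUP BY p.product_category, t.year, g.country
""").fetchall()
```

The price is that values such as 'Electronics' or 'Germany' are repeated in every row of their dimension table.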


5.2 Star Schema

• Advantages

– Improves query performance for often-used data

– Fewer tables and a simple structure

– Efficient query processing with regard to dimensions

• Disadvantages

– In some cases, high overhead of redundant data


5.2 Snowflake vs. Star

• Snowflake

– The structure of the classifications is expressed in table schemas

– The fact table and the dimension tables are normalized

• Star

– The entire classification is expressed in just one table

– The fact table is normalized while in the dimension table the normalization is broken

• This leads to redundancy of information in the dimension tables


5.2 Examples

• Snowflake

Product_ID | Description | Brand   | Prod_group_ID
10         | E71         | Nokia   | 4
11         | PS-42A      | Samsung | 2
12         | 5800        | Nokia   | 4
13         | Bold        | Berry   | 4

Prod_group_ID | Description  | Prod_categ_ID
2             | TV           | 11
4             | Mobile Pho.. | 11

Prod_categ_ID | Description
11            | Electronics

• Star

Product_ID | Description | … | Prod. group | Prod. categ
10         | E71         | … | Mobile Ph.. | Electronics
11         | PS-42A      | … | TV          | Electronics
12         | 5800        | … | Mobile Ph.. | Electronics
13         | Bold        | … | Mobile Ph.. | Electronics
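
The step from the snowflake tables to the star table can be sketched as a join that is materialized into a new table. The rows are taken from the example; the table and column names, the spelled-out group name 'Mobile Phones' and the name product_star are assumptions for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized (snowflake) product dimension.
CREATE TABLE prod_categ (prod_categ_id INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE prod_group (
    prod_group_id INTEGER PRIMARY KEY,
    description TEXT,
    prod_categ_id INTEGER REFERENCES prod_categ(prod_categ_id)
);
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    description TEXT,
    brand TEXT,
    prod_group_id INTEGER REFERENCES prod_group(prod_group_id)
);
INSERT INTO prod_categ VALUES (11, 'Electronics');
INSERT INTO prod_group VALUES (2, 'TV', 11), (4, 'Mobile Phones', 11);
INSERT INTO product VALUES (10, 'E71', 'Nokia', 4), (11, 'PS-42A', 'Samsung', 2),
                           (12, '5800', 'Nokia', 4), (13, 'Bold', 'Berry', 4);

-- Denormalize: resolve the hierarchy once and store the result (star).
CREATE TABLE product_star AS
SELECT p.product_id, p.description, p.brand,
       g.description AS prod_group, c.description AS prod_categ
FROM product p
JOIN prod_group g ON g.prod_group_id = p.prod_group_id
JOIN prod_categ c ON c.prod_categ_id = g.prod_categ_id;
""")

star_rows = conn.execute("""
    SELECT product_id, description, prod_group, prod_categ
    FROM product_star ORDER BY product_id
""").fetchall()

# Redundancy: how often is the category name stored in each variant?
categ_copies_star = conn.execute(
    "SELECT COUNT(*) FROM product_star WHERE prod_categ = 'Electronics'"
).fetchone()[0]
categ_copies_snowflake = conn.execute(
    "SELECT COUNT(*) FROM prod_categ WHERE description = 'Electronics'"
).fetchone()[0]
```

The category name is stored once in the snowflake variant and once per product in the star variant, which is the redundancy the comparison refers to.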


5.2 Snowflake to Star

• When should we go from snowflake to star?

– Heuristics-based decision

• When typical queries relate to coarser granularity (like product category)

• When the volume of data in the dimension tables is relatively low compared to the fact table

– In this case a star schema leads to negligible overhead through redundancy, but performance is improved

• When modifications on the classifications are rare compared to insertion of fact data

– In this case these modifications are controlled through the data load process of the ETL, reducing the risk of data anomalies


5.2 Do we have a winner?

• Snowflake or Star?

– It depends on the necessity

• Fast query processing or efficient space usage

– However, most of the time a mixed form is used

• The Starflake schema: some dimensions stay normalized corresponding to the snowflake schema, while others are denormalized according to the star schema


5.2 Our forces combined

• The Starflake schema

– The decision on how to deal with the dimensions is influenced by

• Frequency of the modifications: if the dimensions change often, normalization leads to better results

• Amount of dimension elements: the bigger the dimension tables, the more space normalization saves

• Number of classification levels in a dimension: more classification levels introduce more redundancy in the star schema

• Materialization of aggregates for the dimension levels: if the aggregates are materialized, a normalization of the dimension can bring better response time


5.2 More Schemas

• Galaxies

– In practice we usually have more measures described by different dimensions

• Thus, more fact tables

[Figure: galaxy schema in which two fact tables share dimension tables]

Fact tables

Sales (Product_ID, Store_ID, Sales, Revenue)
Receipts (Product_ID, Date_ID)

Dimension tables

Store (Store_ID)
Date (Date_ID)
Product (Product_ID, …)
Vendor (Vendor_ID)
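
A galaxy lets queries combine measures from several fact tables through a shared dimension. The sketch below assumes a hypothetical measure column quantity for Receipts (the figure does not show its measures) and made-up rows; each fact table is aggregated separately before the results are joined, so that rows are not multiplied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (product_id INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE sales (product_id INTEGER, store_id INTEGER, sales INTEGER);
-- 'quantity' is a hypothetical measure, not taken from the slide.
CREATE TABLE receipts (product_id INTEGER, date_id INTEGER, quantity INTEGER);

-- Sample rows are made up for illustration.
INSERT INTO product VALUES (10, 'E71'), (11, 'PS-42A');
INSERT INTO sales VALUES (10, 1, 6), (10, 2, 3), (11, 1, 2);
INSERT INTO receipts VALUES (10, 1, 20), (11, 1, 5), (11, 2, 5);
""")

# Aggregate each fact table on the shared Product dimension, then combine.
rows = conn.execute("""
    SELECT p.description, s.sold, r.received
    FROM product p
    JOIN (SELECT product_id, SUM(sales) AS sold
          FROM sales GROUP BY product_id) s ON s.product_id = p.product_id
    JOIN (SELECT product_id, SUM(quantity) AS received
          FROM receipts GROUP BY product_id) r ON r.product_id = p.product_id
    ORDER BY p.product_id
""").fetchall()
```

Joining the two fact tables directly on Product_ID would pair every sales row with every receipt row of the same product and inflate both sums.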


5.2 More Schemas

• Other schemas

– Fact constellations

• Pre-calculated aggregates

– Factless fact tables

• Fact tables do not have non-key data

– Can be used for event tracking or to inventory the set of possible occurrences

– …
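
A factless fact table for event tracking might look as follows. The table promotion_event and its rows are hypothetical; the point is that the table consists of dimension keys only, and the "measure" is obtained by counting rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Only foreign keys, no measure column: one row per event
-- (here: a product was on promotion in a store on a given day).
CREATE TABLE promotion_event (
    day_id INTEGER,
    store_id INTEGER,
    product_id INTEGER,
    PRIMARY KEY (day_id, store_id, product_id)
);
-- Hypothetical sample events.
INSERT INTO promotion_event VALUES (1, 1, 10), (1, 2, 10), (2, 1, 11);
""")

# Counting rows answers questions such as "how often was each product promoted?".
events_per_product = dict(
    conn.execute(
        "SELECT product_id, COUNT(*) FROM promotion_event GROUP BY product_id"
    )
)
```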


5.2 Multidimensional?

• Multidimensional implementation

– The representation of the multidimensional data can be implemented relationally with a finite set of transformation steps, however:

• Multidimensional queries have to be first translated to the relational representation

• A direct interaction with the relational data model is not fit for the end user


5.2 Multidimensional?

• Data structures

– The basic data structure for multidimensional data storage is the array

– The elementary data structures are the cubes and the dimensions

• C=((D1, …, Dn), (M1, …, Mm))

– The storage is intuitive as arrays of arrays, physically linearized

• More about linearization and related issues will be discussed in the lecture on optimization
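
As a preview, one common linearization is row-major order, sketched below. The lecture may use a different order; the function names linearize and delinearize and the example dimension sizes are assumptions for this illustration.

```python
def linearize(coords, sizes):
    """Map per-dimension indexes to a single row-major offset."""
    offset = 0
    for index, size in zip(coords, sizes):
        if not 0 <= index < size:
            raise IndexError("coordinate out of range")
        # Shift the offset by the extent of this dimension, then add the index.
        offset = offset * size + index
    return offset


def delinearize(offset, sizes):
    """Inverse mapping: recover the per-dimension indexes from an offset."""
    coords = []
    for size in reversed(sizes):
        offset, index = divmod(offset, size)
        coords.append(index)
    return tuple(reversed(coords))


# Hypothetical cube with |D1| = 2, |D2| = 3, |D3| = 4, i.e. 24 cells.
sizes = (2, 3, 4)
last_cell = linearize((1, 2, 3), sizes)
```

With several measures M1, …, Mm, each cell of the linearized array would hold m values, or one such array is kept per measure.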


5.2 Physical Model?

• Defining the physical structures

– Setting up the database environment

– Setting up the appropriate security

– Preliminary performance tuning strategies

• Indexing

• Partitioning

• Materialization

• Goal:

– Define the actual storage architecture

– Decide on how the data is to be accessed and how it is arranged

– Again, this will be discussed in more detail in the lecture on optimization


Next lecture

• Optimization

– Indexes
