Lecture OLAP


  • 8/2/2019 Lecture OLAP

    ON-LINE ANALYTICAL PROCESSING


    04/23/12 2

    Lecture Objectives

    What is OLAP

    Need for OLAP

    Features & functions of OLAP

    Different OLAP models

    OLAP implementations


    OLAP

    Term coined in 1993 (by E. F. Codd)

    Main Goal: support ad-hoc but complex querying by business analysts

    Extends worksheet-like analysis to work with huge amounts of data in a DW


    Demand for OLAP

    2 approaches to developing EDWs

    In both approaches, Data Marts rest on the Dimensional Model

    Data Marts are sufficient for basic data analysis

    Users need to go beyond such basic analysis


    Demand for OLAP

    Need for Multidimensional Analysis

    Fast Access & Powerful Calculations

    Limitations of other analysis methods like:
        SQL
        Spreadsheets
        Report Writers


    Demand for OLAP

    Traditional tools such as report writers, query products, spreadsheets, and language interfaces do not meet user expectations for multidimensional analysis with complex calculations.

    Tools used with OLTP and basic DW environments do not match up to the task


    OLAP is the Answer!

    OLAP is a category of software technology that enables analysts, managers, and executives to gain insight into the data through fast, consistent, interactive access in a wide variety of possible views of information that has been transformed from raw data to reflect the real dimensionality of the enterprise as understood by the user.


    What is OLAP?

    OLAP software provides the ability to analyze large volumes of information to improve decision making at all levels of an organization.


    What is OLAP?

    A wide spectrum of multidimensional analysis involving intricate calculations and requiring fast response times.


    What is OLAP?

    The name OLAP has two immediate consequences: the online part requires that answers to queries be fast, and the analytical part hints that the queries themselves are complex

    i.e., Complex questions with Fast Answers!


    Why a separate OLAP tool?

    o Empowers end users to do their own analysis
    o Frees up IS backlog of report requests
    o Ease of use
    o No knowledge of tables or SQL required


    OLAP Characteristics

    o Multi-user environment
    o Client-server architecture
    o Rapid response to queries, regardless of DB size and complexity


    Data Warehouse & OLAP

    o OLAP is a software system that works on top of a DW
    o A front-end tool for a DW
    o Information delivery system for the DW
    o Complements the information delivery capacities of a DW


    Why is OLAP useful?

    Facilitates multidimensional data analysis by pre-computing aggregates across many sets of dimensions

    Provides for:
        Greater speed and responsiveness
        Improved user interactivity


    The OLAP Market



    Warehouse Models & Operators

    Data Models
        relations
        stars & snowflakes
        cubes

    Operators
        slice & dice
        roll-up, drill down
        pivoting
        other


    Data Warehouses

    A data warehouse is based on a multidimensional data model which views data in the form of a data cube

    A data cube allows data to be modeled and viewed in multiple dimensions

    In data warehousing literature, an n-D base cube is called a base cuboid. The topmost 0-D cuboid, which holds the highest level of summarization, is called the apex cuboid. The lattice of cuboids forms a data cube.


    Lattice of Cuboids

    0-D (apex) cuboid:
        all

    1-D cuboids:
        time    item    location    supplier

    2-D cuboids:
        time,item    time,location    time,supplier
        item,location    item,supplier    location,supplier

    3-D cuboids:
        time,item,location    time,item,supplier
        time,location,supplier    item,location,supplier

    4-D (base) cuboid:
        time, item, location, supplier
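    The lattice above can be generated mechanically: each cuboid is one subset of the dimension list. The following is a minimal sketch, not part of the original slides; the function name cuboid_lattice is an illustrative assumption.

```python
from itertools import combinations


def cuboid_lattice(dimensions):
    """Map k -> list of k-D cuboids (each cuboid is a tuple of dimension names)."""
    return {
        k: [tuple(combo) for combo in combinations(dimensions, k)]
        for k in range(len(dimensions) + 1)
    }


lattice = cuboid_lattice(["time", "item", "location", "supplier"])
for k, cuboids in lattice.items():
    print(f"{k}-D cuboids:", cuboids)

# With n dimensions (and no hierarchies) the lattice has 2**n cuboids.
total_cuboids = sum(len(cuboids) for cuboids in lattice.values())
```

    With four dimensions this yields 1 + 4 + 6 + 4 + 1 = 16 cuboids, from the empty tuple (the apex, "all") down to the 4-D base cuboid.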


    CUBE

    Fact table view:

    sale    prodId  storeId  date  amt
            p1      c1       1     12
            p2      c1       1     11
            p1      c3       1     50
            p2      c2       1      8
            p1      c1       2     44
            p1      c2       2      4

    Multi-dimensional cube (dimensions = 3):

    day 1      c1   c2   c3
        p1     12        50
        p2     11    8

    day 2      c1   c2   c3
        p1     44    4
        p2


    Aggregates

    sale prodId storeId date amt

    p1 c1 1 12

    p2 c1 1 11

    p1 c3 1 50

    p2 c2 1 8

    p1 c1 2 44

    p1 c2 2 4

    Add up amounts for day 1

    In SQL:
        SELECT sum(amt) FROM SALE
        WHERE date = 1

    Result: 81


    Aggregates

    sale    prodId  storeId  date  amt
            p1      c1       1     12
            p2      c1       1     11
            p1      c3       1     50
            p2      c2       1      8
            p1      c1       2     44
            p1      c2       2      4

    Add up amounts by day

    In SQL:
        SELECT date, sum(amt) FROM SALE
        GROUP BY date

    ans     date  sum
            1     81
            2     48
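    The two aggregate queries can be checked against the six-row fact table. This is a minimal sketch using Python's built-in sqlite3 module with an in-memory database; the choice of SQLite and the column types are assumptions, not something the slides specify.

```python
import sqlite3

# The six fact rows from the slides: (prodId, storeId, date, amt)
ROWS = [
    ("p1", "c1", 1, 12),
    ("p2", "c1", 1, 11),
    ("p1", "c3", 1, 50),
    ("p2", "c2", 1, 8),
    ("p1", "c1", 2, 44),
    ("p1", "c2", 2, 4),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sale (prodId TEXT, storeId TEXT, date INTEGER, amt INTEGER)"
)
conn.executemany("INSERT INTO sale VALUES (?, ?, ?, ?)", ROWS)

# Add up amounts for day 1
day1_total = conn.execute("SELECT sum(amt) FROM sale WHERE date = 1").fetchone()[0]

# Add up amounts by day (ORDER BY added only to make the output order stable)
by_day = conn.execute(
    "SELECT date, sum(amt) FROM sale GROUP BY date ORDER BY date"
).fetchall()

print(day1_total)  # 81
print(by_day)      # [(1, 81), (2, 48)]
```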


    Aggregates

    Operators: sum, count, max, min, median, avg

    "Having" clause

    Using dimension hierarchy
        average by region (within store)
        maximum by month (within date)


    Cube Aggregation

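    Example: computing sums

    day 1      c1   c2   c3
        p1     12        50
        p2     11    8

    day 2      c1   c2   c3
        p1     44    4
        p2

    rollup over date:
               c1   c2   c3
        p1     56    4   50
        p2     11    8

    rollup over product as well:
               c1   c2   c3
        sum    67   12   50

    rollup over store instead:
        p1    110
        p2     19

    rollup over everything: 129

    (rollup moves toward the totals; drill-down moves back toward the detailed cube)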


    Cube Operators

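    Cells of the cube and of its aggregates are addressed as sale(store, product, date), where * means "summed over all values":

    day 1      c1   c2   c3
        p1     12        50
        p2     11    8

    day 2      c1   c2   c3
        p1     44    4
        p2

    summed over date:
               c1   c2   c3
        p1     56    4   50
        p2     11    8
        sum    67   12   50

    sale(c1,*,*)  = 67
    sale(c2,p2,*) = 8
    sale(*,p1,*)  = 110
    sale(*,*,*)   = 129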
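    The sale(store, product, date) notation can be read as a lookup with wildcards. The sketch below is an illustration only: it recomputes each cell from the fact rows on demand, whereas an OLAP server would typically read a precomputed aggregate. The Python function name sale and its keyword arguments are assumptions chosen to mirror the slide's notation.

```python
# Fact rows from the slides: (prodId, storeId, date, amt)
FACTS = [
    ("p1", "c1", 1, 12),
    ("p2", "c1", 1, 11),
    ("p1", "c3", 1, 50),
    ("p2", "c2", 1, 8),
    ("p1", "c1", 2, 44),
    ("p1", "c2", 2, 4),
]


def sale(store="*", product="*", date="*"):
    """Sum amt over the fact rows matching the pattern; "*" matches any value."""
    total = 0
    for prod, st, day, amt in FACTS:
        if store in ("*", st) and product in ("*", prod) and date in ("*", day):
            total += amt
    return total


print(sale("c1", "*", "*"))   # 67
print(sale("c2", "p2", "*"))  # 8
print(sale("*", "p1", "*"))   # 110
print(sale("*", "*", "*"))    # 129
```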


    Extended Cube

    day 1      c1   c2   c3    *
        p1     12        50   62
        p2     11    8        19
        *      23    8   50   81

    day 2      c1   c2   c3    *
        p1     44    4        48
        p2
        *      44    4        48

    * (all days)
               c1   c2   c3    *
        p1     56    4   50  110
        p2     11    8        19
        *      67   12   50  129

    sale(*,p2,*) = 19


    Aggregation Using Hierarchies

    day 1      c1   c2   c3
        p1     12        50
        p2     11    8

    day 2      c1   c2   c3
        p1     44    4
        p2

    Hierarchy: customer -> region -> country

    (customer c1 in Region A; customers c2, c3 in Region B)

    Aggregated to region level, summed over date:

               region A   region B
        p1     56         54
        p2     11          8
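    Rolling up along a hierarchy amounts to replacing each member by its parent before summing. A minimal sketch, assuming the region assignment stated on the slide; the names REGION_OF and rollup_to_region are illustrative.

```python
from collections import defaultdict

# Fact rows from the slides: (prodId, storeId, date, amt)
FACTS = [
    ("p1", "c1", 1, 12),
    ("p2", "c1", 1, 11),
    ("p1", "c3", 1, 50),
    ("p2", "c2", 1, 8),
    ("p1", "c1", 2, 44),
    ("p1", "c2", 2, 4),
]

# One level of the hierarchy: c1 is in Region A; c2 and c3 are in Region B.
REGION_OF = {"c1": "A", "c2": "B", "c3": "B"}


def rollup_to_region(facts, region_of):
    """Sum amt per (product, region), aggregating over date."""
    totals = defaultdict(int)
    for prod, store, _day, amt in facts:
        totals[(prod, region_of[store])] += amt
    return dict(totals)


by_region = rollup_to_region(FACTS, REGION_OF)
print(by_region)
```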


    Pivoting

    Fact table view:

    sale    prodId  storeId  date  amt
            p1      c1       1     12
            p2      c1       1     11
            p1      c3       1     50
            p2      c2       1      8
            p1      c1       2     44
            p1      c2       2      4

    Multi-dimensional cube:

    day 1      c1   c2   c3
        p1     12        50
        p2     11    8

    day 2      c1   c2   c3
        p1     44    4
        p2

    Pivoted on product and store (aggregated over date):

               c1   c2   c3
        p1     56    4   50
        p2     11    8


    Cube Aggregates Lattice

    Lattice of aggregates:

        city, product, date
        city, product    city, date    product, date
        city    product    date
        all

    (city, product, date):

    day 1      c1   c2   c3
        p1     12        50
        p2     11    8

    day 2      c1   c2   c3
        p1     44    4
        p2

    (city, product):
               c1   c2   c3
        p1     56    4   50
        p2     11    8

    (city):
               c1   c2   c3
        sum    67   12   50

    all: 129

    Use a greedy algorithm to decide what to materialize
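    The slide does not say which greedy algorithm is meant. The sketch below shows one common formulation, stated here as an assumption: the base view is always materialized, the cost of answering a view is the row count of its smallest materialized ancestor, and each step picks the view whose materialization saves the most rows summed over all views it can answer. View sizes are taken from the six-row fact table, treating storeId as the city dimension; all names are illustrative.

```python
from itertools import combinations

# Fact rows from the slides: (product, city, date, amt)
FACTS = [
    ("p1", "c1", 1, 12),
    ("p2", "c1", 1, 11),
    ("p1", "c3", 1, 50),
    ("p2", "c2", 1, 8),
    ("p1", "c1", 2, 44),
    ("p1", "c2", 2, 4),
]
COLUMN = {"product": 0, "city": 1, "date": 2}


def view_size(dims):
    """Number of distinct groups (rows) in the aggregate that groups by dims."""
    return len({tuple(row[COLUMN[d]] for d in dims) for row in FACTS})


def greedy_select(k):
    """Pick k views to materialize in addition to the base view."""
    dims = sorted(COLUMN)
    views = [frozenset(c) for n in range(len(dims) + 1) for c in combinations(dims, n)]
    size = {v: view_size(sorted(v)) for v in views}
    base = frozenset(dims)
    # cost[w]: rows read to answer w from its cheapest materialized ancestor
    cost = {w: size[base] for w in views}
    chosen = []
    for _ in range(k):
        def benefit(v):
            # Rows saved across every view w that can be computed from v
            return sum(max(0, cost[w] - size[v]) for w in views if w <= v)

        candidates = [v for v in views if v != base and v not in chosen]
        best = max(candidates, key=benefit)
        chosen.append(best)
        for w in views:
            if w <= best:
                cost[w] = min(cost[w], size[best])
    return [tuple(sorted(v)) for v in chosen]


picked = greedy_select(2)
print(picked)
```

    On this toy data the first pick is (date, product), which has only 3 rows and serves four views; the second is (city). Real systems would estimate view sizes rather than compute them from the data.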


    Dimension Hierarchies

    Hierarchy: city -> state -> all

    cities  city  state
            c1    CA
            c2    NY


    Dimension Hierarchies

    Lattice extended with the state level:

        city, product, date
        city, product    city, date    product, date    state, product, date
        city    product    date    state, product    state, date
        state
        all

    not all arcs shown...


    Interesting Hierarchy

    Levels: all, years, quarters, months, weeks, days
    (weeks form a separate branch above days; they do not nest inside months)

    conceptual dimension table:

    time    day  week  month  quarter  year
            1    1     1      1        2000
            2    1     1      1        2000
            3    1     1      1        2000
            4    1     1      1        2000
            5    1     1      1        2000
            6    1     1      1        2000
            7    1     1      1        2000
            8    2     1      1        2000
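    The rows of the conceptual dimension table can be derived from the day number. A minimal sketch, assuming that weeks are consecutive 7-day blocks counted from January 1 (which matches the eight rows shown); the function name time_dimension_row is illustrative.

```python
from datetime import date, timedelta


def time_dimension_row(day_number, year=2000):
    """Return (day, week, month, quarter, year) for the n-th day of the year."""
    d = date(year, 1, 1) + timedelta(days=day_number - 1)
    week = (day_number - 1) // 7 + 1       # assumed: 7-day blocks from Jan 1
    quarter = (d.month - 1) // 3 + 1
    return (day_number, week, d.month, quarter, year)


rows = [time_dimension_row(n) for n in range(1, 9)]
for row in rows:
    print(row)

# Why the hierarchy is "interesting": week 5 contains both Jan 31 and Feb 1,
# so weeks cannot be rolled up into months.
jan31 = time_dimension_row(31)
feb01 = time_dimension_row(32)
```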


    SAMPLE CUBE

    [Figure: 3-D cube with dimensions Date (1Qtr, 2Qtr, 3Qtr, 4Qtr, sum), Product (TV, VCR, PC, sum), and Country (U.S.A, Canada, Mexico, sum). Annotated cells: total annual sales of TV, of PC, and of VCR in U.S.A.; total Q1 sales in U.S.A, in Canada, and in Mexico; total Q1 sales and total Q2 sales in all countries; total sales in U.S.A, in Canada, and in Mexico; TOTAL SALES.]


    OLAP Operations

    Roll-Up
    Drill-Down
    Slice & Dice
    Pivot
    Drill-Across
    Drill-Through


    OLAP Operations

    Roll up (drill-up): summarize data
        by climbing up a hierarchy or by dimension reduction

    Drill down (roll down): reverse of roll-up
        from higher-level summary to lower-level summary or detailed data, or introducing new dimensions

    Slice and dice:
        project and select

    Pivot (rotate):
        reorient the cube; visualization; 3D to a series of 2D planes

    Other operations
        drill across: involving (across) more than one fact table
        drill through: through the bottom level of the cube to its back-end relational tables (using SQL)


    Example Schema

    Fact Table
        Sales(Store_id, Product_id, Time_id, Sales_amt)

    Dimension Tables
        Store(Store_id, city, state, region, country)
        Product(Product_id, name, category)
        Day(Time_id, month, quarter, year)

    Hierarchies
        Store -> City -> State -> Region -> Country
        Product -> Category
        Day -> Month -> Quarter -> Year


    Drill-Down

    Sales per product and store:

        SELECT S.product_id, S.store_id, SUM(S.sales_amt)
        FROM Sales S
        GROUP BY S.product_id, S.store_id

    Sales per product and state:

        SELECT S.product_id, St.state, SUM(S.sales_amt)
        FROM Sales S, Store St
        WHERE St.store_id = S.store_id
        GROUP BY S.product_id, St.state

    Drill down from State to City:

        SELECT S.product_id, St.city, SUM(S.sales_amt)
        FROM Sales S, Store St
        WHERE St.store_id = S.store_id
        GROUP BY S.product_id, St.city


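    Rolling Up

    Store the city-level result as a summary table:

        SELECT S.product_id, St.city, SUM(S.sales_amt) AS sales_amt
        INTO City_sales
        FROM Sales S, Store St
        WHERE St.store_id = S.store_id
        GROUP BY S.product_id, St.city

    Roll up from city to state using the summary table. The join goes through the distinct (city, state) pairs so that a city with several stores is not counted more than once:

        SELECT T.product_id, C.state, SUM(T.sales_amt)
        FROM City_sales T, (SELECT DISTINCT city, state FROM Store) C
        WHERE C.city = T.city
        GROUP BY T.product_id, C.state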
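    The roll-up from a city-level summary table can be checked against a direct state-level aggregation. This is a sketch under stated assumptions: it runs on SQLite through Python's sqlite3 module, the sample rows are invented for illustration, city names are assumed unique across states, and CREATE TABLE ... AS SELECT stands in for SELECT ... INTO, which SQLite does not support. The state lookup uses distinct (city, state) pairs so that a city with two stores is not double counted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Store (store_id INTEGER PRIMARY KEY, city TEXT, state TEXT);
CREATE TABLE Sales (store_id INTEGER, product_id TEXT, sales_amt INTEGER);

-- Hypothetical sample data: two stores share the city San Jose.
INSERT INTO Store VALUES
  (1, 'San Jose', 'CA'), (2, 'San Jose', 'CA'), (3, 'Fresno', 'CA'), (4, 'Albany', 'NY');
INSERT INTO Sales VALUES
  (1, 'p1', 10), (2, 'p1', 20), (3, 'p1', 5), (4, 'p1', 7), (1, 'p2', 3);

-- City-level summary table
CREATE TABLE City_sales AS
  SELECT S.product_id, St.city, SUM(S.sales_amt) AS sales_amt
  FROM Sales S, Store St
  WHERE St.store_id = S.store_id
  GROUP BY S.product_id, St.city;
""")

# Roll up from city to state using only the summary table
rolled_up = conn.execute("""
  SELECT T.product_id, C.state, SUM(T.sales_amt)
  FROM City_sales T, (SELECT DISTINCT city, state FROM Store) C
  WHERE C.city = T.city
  GROUP BY T.product_id, C.state
  ORDER BY T.product_id, C.state
""").fetchall()

# Same aggregate computed directly from the fact table
direct = conn.execute("""
  SELECT S.product_id, St.state, SUM(S.sales_amt)
  FROM Sales S, Store St
  WHERE St.store_id = S.store_id
  GROUP BY S.product_id, St.state
  ORDER BY S.product_id, St.state
""").fetchall()

print(rolled_up)
print(rolled_up == direct)
```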


    Pivoting

    When we view the data as a multi-dimensional cube and group on a subset of the axes, we are said to be performing a pivot on those axes

    - Pivoting on dimensions D1, ..., Dk in a cube with dimensions D1, ..., Dn means that we GROUP BY A1, ..., Ak and aggregate over Ak+1, ..., An, where Ai is an attribute of dimension Di

    - Pivoting on product & time corresponds to grouping on product_id & quarter and aggregating over store_id


    Pivoting

        SELECT S.product_id, T.quarter, SUM(S.sales_amt)
        FROM Sales S, Time T
        WHERE T.time_id = S.time_id
        GROUP BY S.product_id, T.quarter


    Dicing

    When we use GROUP BY to specify part of a hierarchy, we are performing a range selection called a DICE

    Dicing Sales in the time dimension: total sales for each product in each quarter

        SELECT S.product_id, T.quarter, SUM(S.sales_amt)
        FROM Sales S, Time T
        WHERE T.time_id = S.time_id
        GROUP BY T.quarter, S.product_id


    Slicing

    When we use WHERE to specify a particular value for an axis, we are performing a SLICE

    Slicing in the time dimension: choosing sales only in week 12, then pivoting to product_id (aggregating over store_id)

        SELECT S.product_id, SUM(S.sales_amt)
        FROM Sales S, Time T
        WHERE T.time_id = S.time_id AND T.week = 12
        GROUP BY S.product_id
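    The slice and the per-quarter grouping can be tried on a small data set. A minimal sketch using Python's sqlite3 module; the sample rows are invented, and the Time table is assumed to carry week and quarter columns as the queries on the slides imply.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Time (time_id INTEGER PRIMARY KEY, week INTEGER, quarter INTEGER);
CREATE TABLE Sales (store_id INTEGER, product_id TEXT, time_id INTEGER, sales_amt INTEGER);

-- Hypothetical sample data
INSERT INTO Time VALUES (1, 12, 1), (2, 12, 1), (3, 13, 1), (4, 14, 2);
INSERT INTO Sales VALUES
  (1, 'p1', 1, 10), (2, 'p1', 2, 20), (1, 'p2', 2, 5), (1, 'p1', 3, 40), (2, 'p2', 4, 8);
""")

# Slice: fix week = 12, then aggregate over store_id per product
slice_week_12 = conn.execute("""
  SELECT S.product_id, SUM(S.sales_amt)
  FROM Sales S, Time T
  WHERE T.time_id = S.time_id AND T.week = 12
  GROUP BY S.product_id
  ORDER BY S.product_id
""").fetchall()

# Group on product and quarter, aggregating over store_id
by_product_quarter = conn.execute("""
  SELECT S.product_id, T.quarter, SUM(S.sales_amt)
  FROM Sales S, Time T
  WHERE T.time_id = S.time_id
  GROUP BY S.product_id, T.quarter
  ORDER BY S.product_id, T.quarter
""").fetchall()

print(slice_week_12)
print(by_product_quarter)
```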

    OLAP Operations

    [Figures: Slicing; Dicing (Sub-cube); Roll-Up; Drill-Down]

    Other OLAP Operations

    o Drill-Across: Queries involving more than one fact table
    o Drill-Through: Makes use of SQL to drill through the bottom level of a data cube down to its back-end relational tables
    o Pivot (rotate): Pivot (also called "rotate") is a visualization operation which rotates the data axes in view in order to provide an alternative presentation of the data. Other examples include rotating the axes in a 3-D cube, or transforming a 3-D cube into a series of 2-D planes.


    Other OLAP Operations

    o Top N or Bottom N queries
    o Moving Averages
    o Growth Rates
    o Depreciation
    o Currency Conversion
    o Statistical Functions


    Conceptual vs. Actual

    The cube is a logical way of visualizing the data in an OLAP setting

    Not how the data is actually represented on disk

    Two ways of storing data:
        ROLAP: Relational OLAP
        MOLAP: Multidimensional OLAP


    Approaches to OLAP Servers

    It is all about which DBMS you choose to store your data warehouse data

        RDBMS -> ROLAP
        MDDB -> MOLAP
        BOTH -> HOLAP


    OLAP Flavours

    OLAP

    ROLAP MOLAP DOLAP

    HOLAP


    Approaches to OLAP Servers

    Three possibilities for OLAP servers

    (1) Relational OLAP (ROLAP)
        Relational and specialized relational DBMS to store and manage warehouse data
        OLAP middleware to support missing pieces

    (2) Multidimensional OLAP (MOLAP)
        Array-based storage structures
        Direct access to array data structures

    (3) Hybrid OLAP (HOLAP)
        Storing detailed data in RDBMS
        Storing aggregated data in MDBMS
        User access via MOLAP tools


    ROLAP

    Special schema design: star, snowflake

    Special indexes: bitmap, multi-table join

    Proven technology (relational model, DBMS); tends to outperform specialized MDDB, especially on large data sets

    Products
        IBM DB2, Oracle, Sybase IQ, RedBrick, Informix


    ROLAP

    Defines complex, multi-dimensional data with a simple model
    Reduces the number of joins a query has to process
    Allows the data warehouse to evolve with relatively low maintenance
    Can contain both detailed and summarized data
    ROLAP is based on familiar, proven, and already selected technologies

    BUT!!! It relies on SQL for multi-dimensional manipulation and calculations.


    MOLAP

    MDDB: a special-purpose data model
    Facts stored in multi-dimensional arrays
    Dimensions used to index array
    Sometimes on top of relational DB

    Products
        Pilot, Arbor Essbase, Gentia


    MOLAP

    Pre-calculating or pre-consolidating transactional data improves speed.

    BUT when incoming data is fully pre-consolidated, MDDs require an enormous amount of overhead both in processing time and in storage. An input file of 200MB can easily expand to 5GB

    MDDBs are great candidates for the < 100GB department data marts.

    With MDDs, application design is essentially the definition of dimensions and calculation rules, while the RDBMS requires that the database schema be a star or snowflake.


    Quick Recap of OLAP Needs

    User Needs
        Multidimensional view
        Excellent Performance
        Analytical Flexibility
        Real-Time Data Access
        High Data Capacity

    MIS Needs
        Leverages Data Warehouse
        Easy Development
        Low Structure Maintenance
        Low Aggregate Maintenance


    Quick Recap of OLAP Needs: User Needs

    Multidimensional View
        All true OLAP tools, whether they work with a MDDB or an RDBMS, provide a multidimensional view of data.
        For example, decision makers may view sales by office, quarter, representative, product, etc.
        This perspective on data, which mirrors the way business professionals think, allows for more intuitive and more powerful analysis.


    Quick Recap of OLAP Needs: User Needs

    Excellent Performance
        The performance of your decision support tool directly depends on the way it manages aggregates.

    RDBMS
        Calculate aggregates on the fly (response time suffers)
        DBA creates summary tables to store aggregates (enormous amount of disk space)


    Quick Recap of OLAP Needs: User Needs

    Excellent Performance
        For example, suppose you have a Sales indicator with six dimensions: Representatives, Products, Customers, Regions, Months, and Years.

        MOLAP tools will store a given aggregate, such as the November 1997 government sales of product A504 by representative 1040 in New York, in 1 cell of the MDDB.

        In contrast, ROLAP tools consume 600% more space, because they require a record of seven values (six foreign keys and the actual aggregate) in a relational summary table.


    Quick Recap of OLAP Needs: User Needs

    Excellent Performance
        RDBMSs must use several summary tables to store the aggregates that a MOLAP could store in just one cube. For example, consider a Sales indicator with three dimensions: Months, Regions, and Products. The indicator cube will contain seven sets of aggregates:

            Sales by month
            Sales by product
            Sales by region
            Sales by month and product
            Sales by month and region
            Sales by product and region
            Sales by product, month, and region

        To store these aggregates in an RDBMS, you'd have to create seven summary tables, one for each aggregate set.

        HOW MANY SUMMARY TABLES FOR 6 DIMENSIONS?

        (Separate fact table and shrunken dimension table approach for storing aggregates)
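    The question on the slide can be answered by counting: with one level per dimension, there is one aggregate set per non-empty subset of the dimensions, so n dimensions give 2^n - 1 summary tables. A minimal sketch; the function name aggregate_sets is illustrative, and dimension hierarchies (which would add further sets) are ignored.

```python
from itertools import combinations


def aggregate_sets(dimensions):
    """Every non-empty combination of dimensions, i.e. one summary table each."""
    return [
        combo
        for k in range(1, len(dimensions) + 1)
        for combo in combinations(dimensions, k)
    ]


three = aggregate_sets(["Months", "Regions", "Products"])
six = aggregate_sets(
    ["Representatives", "Products", "Customers", "Regions", "Months", "Years"]
)

print(len(three))  # 7, matching the seven sets listed on the slide
print(len(six))    # 63 = 2**6 - 1
```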


    Quick Recap of OLAP Needs: User Needs

    Excellent Performance
        Huge amounts of extra storage space are required (even if there is no sparsity failure)
        Maintenance costs are high
        A lot of statistical analysis needs to be done to decide which aggregates are to be precomputed
        DBA must keep the cost/performance ratio in check


    Quick Recap of OLAP Needs: User Needs

    Excellent Performance
        In contrast, we've seen that multidimensional databases store aggregates in a very compact structure that consumes very little disk space and requires very little maintenance
        All levels of consolidation can therefore be precomputed and stored in the MDDB
        As a result, fast response time is not limited to the most frequently accessed queries; all aggregates can be accessed with lightning speed.


    Quick Recap of OLAP Needs: User Needs

    Analytical Flexibility
        Both ROLAP & MOLAP tools offer comparable performance for:
            Comparative Analysis
            Roll-up and Drill-down
            Slicing & Dicing
        Only MOLAP tools offer what-if analysis


    Quick Recap of OLAP Needs: User Needs

    Real-Time Data Access
        MOLAP tools load data into the multidimensional cubes. Consequently, the data being accessed is only as recent as the last load.
        Some applications require real-time data access
        The process of continually refreshing the data attaches higher costs to operating a MOLAP system
        Some MOLAP tools offer reach-through functionality to access volatile data stored outside the MDDB
        Unfortunately, users must be aware of the underlying database structure
        Relational data access is too complex for the typical user


    Quick Recap of OLAP Needs: User Needs

    Real-Time Data Access
        ROLAP tools maintain a constant link to the operational RDBMS, which provides users with up-to-the-minute, accurate data (Real-Time Data Warehousing)
        Industries & organizations with highly volatile data particularly benefit from this access to live, operational data.


    Quick Recap of OLAP Needs: User Needs

    High Capacity Data
        MOLAP products are limited by the size of the cube defined by the multidimensional view.
        When dimension elements are predefined, the scope of available data is limited at the outset.
        ROLAP tools circumvent this barrier. Dynamic dimensions are not stored in the predefined multidimensional model, but fetched at run time from the RDBMS.


    Quick Recap of OLAP Needs: User Needs

    High Capacity Data
        o In MOLAP, only aggregates are stored in the cube. Atomic, operational data are forced out of the user's analytical realm.
        o ROLAP systems can access extremely detailed operational data, as well as aggregated data stored in summary tables.


    Quick Recap of OLAP Needs

    MIS Needs
        Administrators should be able to leverage their existing relational databases without devoting large amounts of time and effort to intricate development, fine tuning, or intensive maintenance.


    Quick Recap of OLAP Needs: MIS Needs

    Leveraging Data Warehouse
        Both the finance and the MIS departments of your organization will appreciate a decision support tool that leverages existing investments in data warehousing.
        MIS staff that opt for a MOLAP tool must duplicate data in its own proprietary MDDB.
        MIS staff that choose a ROLAP tool will be able to access the data warehouse directly.


    Quick Recap of OLAP Needs: MIS Needs

    Easy Development
        MOLAP development is straightforward: it requires no fine tuning and creates its own aggregates.
        ROLAP tools, on the other hand, require a specific schema for the relational database.
        Skilled DBAs must provide the appropriate schema (star or snowflake schema), tune the database, and create the appropriate summary tables.
        However, many ROLAP tools are metadata-driven, which means the multidimensional view is generated and maintained more easily.


    Quick Recap of OLAP Needs: MIS Needs

    Low Structure Maintenance
        The structure of a MOLAP tool's underlying MDDB greatly depends on each of its dimensions. When one dimension changes, the entire MDDB must be restructured.
        Multi-matrix MDDBs reduce the maintenance burden
        ROLAP systems do not store data in a proprietary structure.
        They build and maintain a constant link between the multidimensional view and the underlying RDBMS using the metadata.
        No database restructuring is required.


    Quick Recap of OLAP Needs: MIS Needs

    Low Aggregate Maintenance
        MOLAP tools automatically create high-level aggregates based on your lower-level MDDB data and aggregate definitions.
        When data is updated, the aggregates are automatically updated and stored in the MDDB.
        With ROLAP tools, MIS staff must continually monitor the use of summary tables to keep their cost/performance ratio in check.
        DBAs inevitably use sophisticated statistics to isolate only the most frequently accessed aggregates, and store them in summary tables.
        These tables leave ROLAP administrators with a heavy maintenance burden.


    ROLAP vs. MOLAP

    1) Performance:
        How fast will the system appear to the end-user?
        MDD server vendors believe this is a key point in their favor.

    2) Data volume and scalability:
        While MDD servers can handle up to 100GB of storage, RDBMS servers can handle hundreds of gigabytes and terabytes.


    Hybrid OLAP - HOLAP

    o Best of both worlds

    o Storing detailed data in RDBMS

    o Storing aggregated data in MDBMS

    o User access via MOLAP tools


    HOLAP

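    [Figure: HOLAP architecture. The client has a Multidimensional Viewer and a Relational Viewer. The MDBMS Server holds multi-dimensional data and is reached through multi-dimensional access. The RDBMS Server holds user data, metadata, and derived data and is reached through SQL-Read and SQL-Reach-Through.]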


    ROLAP, MOLAP, or HOLAP

    IF
        A. You require write access
        B. Your data is under 50 GB
        C. Your timetable to implement is 60-90 days
        D. Lowest level already aggregated
        E. Data access on aggregated level
        F. You're developing a general-purpose application for inventory movement or assets management
    THEN
        Consider an MDD/MOLAP solution for your data mart

    IF
        A. Your data is over 100 GB
        B. You have a "read-only" requirement
        C. Historical data at the lowest level of granularity
        D. Detailed access, long-running queries
        E. Data assigned to lowest level elements
    THEN
        Consider an RDBMS/ROLAP solution for your data mart.

    IF
        A. OLAP on aggregated and detailed data
        B. Different user groups
        C. Ease of use and detailed data
    THEN
        Consider an HOLAP for your data mart


    Conclusions

    ROLAP: RDBMS -> star/snowflake schema
    MOLAP: MDDB -> Cube structures
    ROLAP or MOLAP: Data models used play a major role in performance differences
    MOLAP: for summarized and relatively smaller volumes of data (up to about 100GB)
    ROLAP: for detailed and larger volumes of data
    Both storage methods have strengths and weaknesses
    The choice is requirement specific, though currently data warehouses are predominantly built using RDBMSs/ROLAP.
    HOLAP is emerging as the OLAP server of choice