Carnegie Mellon University ©2006 - 2008 Robert T. Monroe 45-875 BI Tools and Techniques Business Intelligence Tools and Techniques Robert Monroe March 18, 2008

Business Intelligence Tools and Techniques

Robert Monroe

March 18, 2008

Agenda

• Quick survey
• Overview of Business Intelligence Tools and Techniques
• Course structure, grading, and expectations
• Data management fundamentals

Survey

• Please complete and hand back the survey

• Survey helps me to:

– Understand your goals and expectations for the course

– Evaluate your previous IT knowledge and experience

– … adjust the class accordingly

Introducing Business Intelligence Tools and Techniques

Corporations Are Drowning in Data

• … but thirsty for actionable knowledge
• Our ability to collect and store data seems to have surpassed our ability to make sense of it!
• Important trends:
– Storage capacity continues to rise rapidly
– Cost of storage continues to drop

Business Intelligence

• Core question: How can an organization manage and leverage large data sets to make better business decisions?

• Business Intelligence (BI)
– A broad category of applications and technologies for gathering, storing, analyzing, and providing access to data to help enterprise users make better business decisions. (Wikipedia)

• Two common uses for BI tools
– Measuring where you are / how your business is performing
– Identifying problems and opportunities

Business Intelligence Systems Improve Decision Making

Source: O’Brien, Management Information Systems, 6th ed.

In-Class Exercise

• Take out a piece of paper and pencil
• Select a company that you are familiar with and a managerial role in that company
• Write down five pieces of quantitative information that you would most want to have to manage your business (or your part of the business) effectively

A Business Intelligence Roadmap

Module 1: Course Intro, Data Management Fundamentals

• What is Business Intelligence?
– How can it help me make better business decisions?
– What kinds of questions can BI tools help me answer?
• What is the relationship between data, information, & knowledge?
• What does it mean to ‘Compete on Analytics’?
– Why would I want to do so?
– How might I do so effectively?

[Diagram labels: Data, Info, Knowledge]

Module 2: Data Warehousing

• What is a Data Warehouse?
– How about a Data Mart?
– How is a Data Warehouse different from a ‘regular’ database?

• Why do we need another database that just duplicates data that we already have?

• How can we fill a data warehouse with comprehensive, timely, and high-quality data?

Module 3: Reporting and OLAP

• How do I convert the data in my data warehouse into actionable information or knowledge?

• What tools are available to help non-programmers analyze warehouse data?

• What is dimensional modeling? Why is it powerful?

• What kinds of questions are OLAP tools designed to answer?

Module 4: Info Viz and Data Mining

Module 5: Dashboards

• What is an executive dashboard?

– Are they only for executives?

– Why are they useful?

– What are their drawbacks?

• How can I implement dashboards effectively in my organization?

Module 6: ‘Real-Time’ Business Intelligence

• How can we move from historical analysis to ‘real-time’ analysis?

• Why is this hard to do in practice?

• What tools and techniques are available to support real-time analysis?

Module 7: Implementing BI, Ethical use of BI

• What does my organization need to do to implement a successful BI program?

• What ethical issues arise with BI capabilities?

• How can we ensure that our BI capabilities are used ethically?
– What does it mean to do so?

Dashboard: Expected Effort

• First two weeks focus on BI foundation
– Eat your vegetables, exercise more
• Middle classes focus on using various BI tools effectively
– Use the tools, Luke
• Final classes combine fundamentals, tools, people, processes, and ethics
– Pull it all together

[Charts: Reading Load by Week #; Work with BI Tools by Week #]

Course Structure, Grading, and Expectations

Course Goals

• Understand how to apply various Business Intelligence (BI) tools and techniques to analyze and evaluate large data sets to make better business decisions

• Understand the benefits, drawbacks, and applicability of various approaches to BI

• Improve awareness of a variety of challenges and ‘gotchas’ that arise when implementing BI systems
– … and how to avoid them

Course Philosophy

• Focus on applying BI technology to solve business problems, not building BI systems

• You will develop new skills by doing and participating
– You will need to use the BI tools
– When in doubt try something, experiment
– Most work done in teams – learn from/with your peers
– Casual interactive class – your participation is important

• Many of the technologies we will look at are relatively new
– Not everything will work perfectly the first time…
– Flexibility, patience, and a willingness to explore will help a lot

• Let’s have some fun – life’s too short to do otherwise

Expectations, Etiquette, and Academic Integrity

• Waitlist
• Office hours, 3:30 – 4:30 MWF
• Expectations and etiquette
• Academic integrity

• Teaching Assistant
– Bao-Jun Jiang, [email protected]

Pass/Fail

I allow students to take the course pass/fail provided that they agree to:

– Attend class regularly

– Prepare for class as if they were taking it for a grade

– Complete all of the assignments

– Take the final exam at its regular time and place

– Complete all of the necessary administrative paperwork

Blackboard And The Course Wiki

Grading

• Grades will be computed as follows:
– Homework exercises (3): 45%
– Final exam: 30%
– Class attendance, preparation, and participation: 25%

Late assignments policy: 25% deduction each day late
I curve final grades, not individual assignments
Please see regrade request policy in syllabus document

Assignments

• Three homework assignments
– Groups of 2-4 people

• Assignment #1: Data warehousing
– Analyze data warehousing scenario and make business, technology, and process recommendations based on your analysis (management option)
– Create a simple data warehouse and ETL process to load it (tech option)

• Assignment #2: Reporting and OLAP tools
– Use Microsoft’s Reporting and/or OLAP tools to retrieve, analyze, and present useful information from a data warehouse and OLAP cubes

• Assignment #3: Case analysis, dashboards or visualizations
– Case analysis – Continental or SYSCO cases (management option)
– Analyze scenario/case and design dashboard(s) and/or data visualizations to meet business needs (tech option)

Computing Resources

• There are many good BI platforms

• We will primarily use Microsoft’s SQL Server 2005
– Client tools
– Reporting Services
– Analysis Services
– Integration Services (ETL tool – optional)

• We will also experiment with a variety of other BI tools

• You must provide a laptop that can run SQL Server 2005 client
– At least client tools, servers are optional
– 600 MHz processor, 512 MB of RAM, 0.5–2.0 GB of disk space
– Install instructions are available on Blackboard
– Please try to install SQL Server 2005 client tools before Monday’s class

Data Management Fundamentals

Definitions

• What is the difference between data, information, and knowledge?

• Data is a collection of raw value elements or facts used for calculating, reasoning, or measuring. Data may be collected, stored, or processed but not put into a context from which any meaning can be inferred. [Los03]

• Information is the result of collecting and organizing data in a way that establishes relationships between data items, which thereby provides context and meaning. [Los03]

• Knowledge is information to which experience, interpretation, and reflection are added by individuals so that it becomes a high value form of information

– The OR Society http://www.orsoc.org.uk/about/topic/projects/kmwebfiles/knowledge.htm

Exercise

3/21/05 3/22/05 3/22/05 3/21/05 3/22/05 3/22/05 3/21/05 3/21/05 $27.74 $19.78 $21.41 $83.81 $27.01 $19.72 $21.50 $84.24 CSCO MSFT INTC IBM

Closing Stock Prices

      3/21/05  3/22/05
MSFT  $27.74   $27.01
INTC  $19.78   $19.72
CSCO  $21.41   $21.50
IBM   $83.81   $84.24

Goal: Convert Data to (Actionable) Knowledge

[Diagram: Data → Info → Knowledge, with increasing value]

Why is this so hard to do in practice?

Challenge: What To Capture and Store?

• The amount of data that can be captured is enormous
– Storing data is relatively cheap (free @ the margin)
– Structuring and retrieving data is relatively expensive
– Converting large data sets to actionable knowledge tends to be relatively challenging and expensive

• Rules of thumb for deciding what to capture and store
– Start with what you want to get out and work backwards
– Evaluate what is already available
– Ensure that you capture high-quality data
– Analyze fundamental data requirements for the enterprise, independent of the specific project at hand

Exercise: What To Capture And Store

• Scenario 1: You are a marketing VP for a large chain food retailer. You need to figure out how to properly price and promote a specific brand of snack chips over the next year

• What questions do you need to ask?
• What analyses would you like to do to answer them?
• What data will you need to do these analyses?
• Where will you get that data?

– Is your organization likely to already have all the data that you need?

– Are there other data sources that you should try to take advantage of and incorporate into your analyses?

Exercise: What To Capture And Store

• Scenario 2: You are an executive at Ferrari who needs to decide how to allocate the latest and greatest sports car your company is introducing in six months to maximize your company’s profits long-term

• What questions do you need to ask?
• What analyses would you like to do to answer them?
• What data will you need to do these analyses?
• Where will you get that data?

– Is your organization likely to already have all the data that you need?

– Are there other data sources that you should try to take advantage of and incorporate into your analyses?

Exercise: What To Capture And Store

• Scenario 3: You are an HR executive responsible for recruiting salespeople. Your bonus each year is directly tied to how well the salespeople you bring in do in their first three years at your company. You’ve read Moneyball and Competing on Analytics, and you want to take a more analytic approach to your job

• What questions do you need to ask?
• What analyses would you like to do to answer them?
• What data will you need to do these analyses?
• Where will you get that data?

– Is your organization likely to already have all the data that you need?

– Are there other data sources that you should try to take advantage of and incorporate into your analyses?

The Relational Data Model

• The Relational Model has become the de-facto standard for managing operational business data

• Core concepts in a relational model:
– Tables (relations)
– Records (rows)
– Data fields (columns)
– Primary keys
– Foreign keys

Products

Product ID  Description   Color  Size   Qty Available
52          Shoes (pair)  Blue   10     25
64          Socks (pair)  White  Large  200
145         Blouse        Green  7      14
158         Pants         Blue   32/34  0

Data, Information, Database Example

Data in Database Tables:

Purchases

Order ID  Customer Name  Product ID  Quantity  Date
5623      Jimmy Hwang    52          3         12/15/2004
5624      Sue Smith      64          5         12/16/2004
5625      Jane Chen      145         1         12/16/2004

Products

Product ID  Description   Color  Size   Qty Available
52          Shoes (pair)  Blue   10     25
64          Socks (pair)  White  Large  200
145         Blouse        Green  7      14
158         Pants         Blue   32/34  0

Information:

Jimmy Hwang purchased 3 pairs of size 10 shoes on 12/15/2004

What other information can we derive from these data tables?
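One way to derive facts like the one above is to join Purchases to Products on Product ID. The following is a minimal sketch, assuming Python's built-in sqlite3 module and a throwaway in-memory database rather than the course's SQL Server setup. Table and column names follow the slide; the function name derive_purchase_facts is hypothetical.

```python
import sqlite3


def derive_purchase_facts():
    """Join Purchases to Products to turn raw rows into readable facts."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.executescript("""
        CREATE TABLE Products (
            ProductID INTEGER PRIMARY KEY,
            Description TEXT, Color TEXT, Size TEXT, QtyAvailable INTEGER
        );
        CREATE TABLE Purchases (
            OrderID INTEGER PRIMARY KEY,
            CustomerName TEXT,
            ProductID INTEGER REFERENCES Products(ProductID),
            Quantity INTEGER,
            Date TEXT
        );
    """)
    # Rows copied from the two tables on the slide.
    conn.executemany("INSERT INTO Products VALUES (?, ?, ?, ?, ?)", [
        (52, "Shoes (pair)", "Blue", "10", 25),
        (64, "Socks (pair)", "White", "Large", 200),
        (145, "Blouse", "Green", "7", 14),
        (158, "Pants", "Blue", "32/34", 0),
    ])
    conn.executemany("INSERT INTO Purchases VALUES (?, ?, ?, ?, ?)", [
        (5623, "Jimmy Hwang", 52, 3, "12/15/2004"),
        (5624, "Sue Smith", 64, 5, "12/16/2004"),
        (5625, "Jane Chen", 145, 1, "12/16/2004"),
    ])
    # The foreign key ProductID links each purchase to its product details.
    rows = conn.execute("""
        SELECT pu.CustomerName, pu.Quantity, pr.Color, pr.Description,
               pr.Size, pu.Date
        FROM Purchases AS pu
        JOIN Products AS pr ON pr.ProductID = pu.ProductID
        ORDER BY pu.OrderID
    """).fetchall()
    conn.close()
    return [
        f"{name} purchased {qty} x {color} {desc}, size {size}, on {date}"
        for name, qty, color, desc, size, date in rows
    ]


facts = derive_purchase_facts()
for fact in facts:
    print(fact)
```

Neither table alone says what Jimmy Hwang bought; the information only appears once the rows are related through the shared Product ID.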

Relational Data, Tables, Records, and Metadata Example

Data (Records) in Database Tables:

Purchases

Order ID  Customer Name  Product ID  Quantity  Date
5623      Jimmy Hwang    52          3         12/15/2004
5624      Sue Smith      64          5         12/16/2004
5625      Jane Chen      145         1         12/16/2004

Products

Product ID  Description   Color  Size   Qty Available
52          Shoes (pair)  Blue   10     25
64          Socks (pair)  White  Large  200
145         Blouse        Green  7      14
158         Pants         Blue   32/34  0

Metadata:

Table Name: Products
ProductID     Int (pkey)
Description   Text(50)
Color         Text(50)
Size          Text(20)
QtyAvailable  Int

Table Name: Purchases
OrderID       Int (pkey)
CustomerName  Text(75)
ProductID     Int (fkey)
Quantity      Decimal
Date          DateTime
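The metadata above maps fairly directly onto table definitions. The sketch below assumes Python's built-in sqlite3 module, not SQL Server; SQLite accepts the slide's type names as written but treats them loosely. The helper describe and the customer 'Pat Doe' are hypothetical. It shows two things: the database can report its own metadata, and a declared foreign key lets the database reject a purchase of a product that does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves fkey checks off by default
conn.executescript("""
    CREATE TABLE Products (
        ProductID    Int PRIMARY KEY,
        Description  Text(50),
        Color        Text(50),
        Size         Text(20),
        QtyAvailable Int
    );
    CREATE TABLE Purchases (
        OrderID      Int PRIMARY KEY,
        CustomerName Text(75),
        ProductID    Int REFERENCES Products(ProductID),
        Quantity     Decimal,
        Date         DateTime
    );
""")


def describe(table):
    """Return (column name, declared type, is primary key) for each column."""
    info = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [(name, col_type, bool(pk))
            for _cid, name, col_type, _notnull, _default, pk in info]


purchases_meta = describe("Purchases")
for column in purchases_meta:
    print(column)

conn.execute("INSERT INTO Products VALUES (52, 'Shoes (pair)', 'Blue', '10', 25)")
try:
    # Product 999 is not in Products, so the foreign key should block this row.
    conn.execute(
        "INSERT INTO Purchases VALUES (5626, 'Pat Doe', 999, 1, '12/17/2004')"
    )
    fkey_enforced = False
except sqlite3.IntegrityError:
    fkey_enforced = True
print("Foreign key enforced:", fkey_enforced)
```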

Normalization And Denormalization

• Data normalization is the process of decomposing relations with anomalies to produce smaller, well-structured relations
– Basic idea: each table only holds data about one ‘thing’

• Goals of normalization include:
– Minimize data redundancy
– Simplify the enforcement of referential integrity constraints
– Simplify data maintenance (inserts, updates, deletes)
– Improve representation model to match “the real world”

• Normalization sometimes hurts query performance

Example: Denormalized Table

• Insertion anomaly: when an employee takes a new class we need to add duplicate data (Name, Dept_Name, and Salary)

• Deletion anomaly: If we remove employee 140, we lose information about the existence of a Tax Acc class

• Modification anomaly: Employee 100 salary increase forces update of multiple records

• These anomalies exist because two themes (entity types), course and employee, are combined into one relation, resulting in duplication and an unnecessary dependency between the entities

Employee

Emp_ID  Name              Dept_Name     Salary  Course_Title  Date_Completed
100     Margaret Simpson  Marketing     48000   SPSS          6/19/2005
100     Margaret Simpson  Marketing     48000   Surveys       10/7/2004
140     Alan Beeton       Accounting    52000   Tax Acc       12/8/2004
110     Chris Lucero      Info Systems  43000   SPSS          1/12/2004
110     Chris Lucero      Info Systems  43000   C++           4/22/2003
190     Lorenzo Davis     Finance       55000
150     Susan Martin      Marketing     42000   Java          8/12/2002
150     Susan Martin      Marketing     42000   SPSS          6/19/2005

Example Derived from Hoffer, Prescott, McFadden, Modern Database Management, 7th ed.

Normalizing Previous Employee/Class Table

Course_Completion

Emp_ID  Course_ID  Date_Completed
100     1          6/19/2005
100     2          10/7/2004
140     3          12/8/2004
110     1          1/12/2004
110     4          4/22/2003
150     1          6/19/2005
150     5          8/12/2002

Employee

Emp_ID  Name              Dept_Name     Salary
100     Margaret Simpson  Marketing     48000
140     Alan Beeton       Accounting    52000
110     Chris Lucero      Info Systems  43000
190     Lorenzo Davis     Finance       55000
150     Susan Martin      Marketing     42000

Course

Course_ID  Course_Title
1          SPSS
2          Surveys
3          Tax Acc
4          C++
5          Java

This seems more complicated

Why might this approach be superior to the previous one?
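One way to explore that question is to replay the earlier anomalies against the three normalized tables. This is a minimal sketch, assuming Python's built-in sqlite3 module and an in-memory database; the new salary of 50000 is a made-up value, and the composite primary key on Course_Completion is an assumption, not something the slide specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (
        Emp_ID INTEGER PRIMARY KEY, Name TEXT, Dept_Name TEXT, Salary INTEGER
    );
    CREATE TABLE Course (Course_ID INTEGER PRIMARY KEY, Course_Title TEXT);
    CREATE TABLE Course_Completion (
        Emp_ID INTEGER REFERENCES Employee(Emp_ID),
        Course_ID INTEGER REFERENCES Course(Course_ID),
        Date_Completed TEXT,
        PRIMARY KEY (Emp_ID, Course_ID)  -- assumed key, not from the slide
    );
""")
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?, ?)", [
    (100, "Margaret Simpson", "Marketing", 48000),
    (140, "Alan Beeton", "Accounting", 52000),
    (110, "Chris Lucero", "Info Systems", 43000),
    (190, "Lorenzo Davis", "Finance", 55000),
    (150, "Susan Martin", "Marketing", 42000),
])
conn.executemany("INSERT INTO Course VALUES (?, ?)", [
    (1, "SPSS"), (2, "Surveys"), (3, "Tax Acc"), (4, "C++"), (5, "Java"),
])
conn.executemany("INSERT INTO Course_Completion VALUES (?, ?, ?)", [
    (100, 1, "6/19/2005"), (100, 2, "10/7/2004"), (140, 3, "12/8/2004"),
    (110, 1, "1/12/2004"), (110, 4, "4/22/2003"),
    (150, 1, "6/19/2005"), (150, 5, "8/12/2002"),
])

# Modification: a raise for employee 100 now touches exactly one row.
rows_updated = conn.execute(
    "UPDATE Employee SET Salary = 50000 WHERE Emp_ID = 100"
).rowcount

# Deletion: removing employee 140 no longer erases the Tax Acc course.
conn.execute("DELETE FROM Course_Completion WHERE Emp_ID = 140")
conn.execute("DELETE FROM Employee WHERE Emp_ID = 140")
tax_acc_still_listed = conn.execute(
    "SELECT COUNT(*) FROM Course WHERE Course_Title = 'Tax Acc'"
).fetchone()[0] == 1

# The wide, denormalized view can still be rebuilt on demand with joins.
wide_rows = conn.execute("""
    SELECT e.Emp_ID, e.Name, e.Dept_Name, e.Salary,
           c.Course_Title, cc.Date_Completed
    FROM Employee AS e
    LEFT JOIN Course_Completion AS cc ON cc.Emp_ID = e.Emp_ID
    LEFT JOIN Course AS c ON c.Course_ID = cc.Course_ID
    ORDER BY e.Emp_ID, c.Course_ID
""").fetchall()
for row in wide_rows:
    print(row)
```

The extra joins in the last query are the price of normalization, which is one reason it sometimes hurts query performance.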

Indexing

• An index is a table or other data structure used to determine the location of rows in a file that satisfy some condition

• Indices reduce the time needed to retrieve records
• … but increase the time and cost to insert, update, or delete
• Indexing is critical for high performance in large, complex databases
– Especially data warehouses and data marts

Products

Product ID  Description   Color  Size
52          Shoes (pair)  Blue   10
145         Socks (pair)  White  Large
62          Blouse        Green  7
12          Pants         Blue   32/34
532         Skirt         Green  7
…           …             …      …

Product_Index

Product ID  Row
12          4
52          1
62          3
145         2
532         5
…           …
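The Product_Index table can be imitated with a sorted list and a binary search. This is only an illustrative sketch in plain Python: the function names build_index and lookup are hypothetical, and real database engines typically use more elaborate structures such as B-trees.

```python
from bisect import bisect_left

# Rows in storage order, as in the Products table above (row numbers start at 1).
products = [
    (52, "Shoes (pair)", "Blue", "10"),
    (145, "Socks (pair)", "White", "Large"),
    (62, "Blouse", "Green", "7"),
    (12, "Pants", "Blue", "32/34"),
    (532, "Skirt", "Green", "7"),
]


def build_index(rows):
    """Return (sorted Product IDs, matching row numbers), like Product_Index."""
    pairs = sorted((row[0], pos) for pos, row in enumerate(rows, start=1))
    return [key for key, _ in pairs], [pos for _, pos in pairs]


def lookup(keys, row_numbers, rows, product_id):
    """Binary-search the index instead of scanning every row."""
    i = bisect_left(keys, product_id)
    if i < len(keys) and keys[i] == product_id:
        return rows[row_numbers[i] - 1]
    return None  # no such Product ID


keys, row_numbers = build_index(products)
print(list(zip(keys, row_numbers)))
print(lookup(keys, row_numbers, products, 62))
```

Lookups get faster because the sorted keys can be searched in logarithmic time, but every insert, update, or delete on Products would also have to keep the index in step, which is the cost the bullets mention.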

Alternative Data Models

• The relational data model is the current de-facto standard for storing and managing corporate data

• There are other data storage models, usually associated with legacy systems
– The data you need for your analysis may be stored in them!

• Four common alternative data models
– Flat file
– Hierarchical
– Network
– Object

Structured Query Language (SQL)

• SQL provides a standard language for describing, manipulating, and querying data from relational databases

• SQL allows applications to interact with databases without requiring a tight binding between the application and the underlying DBMS

• All of the major relational database vendors implement some form of SQL in their database products

• Example Query:
SELECT ProductName, ProductPrice
FROM Products
WHERE SupplierName = 'Acme'
ORDER BY ProductPrice DESC;
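To see a query of this shape run end to end, here is a minimal sketch that assumes Python's built-in sqlite3 module in place of SQL Server. The Products rows are invented for illustration; only the column names and the 'Acme' supplier come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Products (ProductName TEXT, ProductPrice REAL, SupplierName TEXT)"
)
# Hypothetical sample rows, not data from the course.
conn.executemany("INSERT INTO Products VALUES (?, ?, ?)", [
    ("Anvil", 120.0, "Acme"),
    ("Rope", 15.5, "Acme"),
    ("Widget", 42.0, "Globex"),
    ("Rocket skates", 310.0, "Acme"),
])

supplier = "Acme"
# The ? placeholder lets the application pass the supplier name as data
# instead of pasting it into the SQL text.
acme_products = conn.execute(
    """
    SELECT ProductName, ProductPrice
    FROM Products
    WHERE SupplierName = ?
    ORDER BY ProductPrice DESC
    """,
    (supplier,),
).fetchall()

for name, price in acme_products:
    print(name, price)
```

The application only hands SQL text and parameters to the database, which illustrates the loose binding between application and DBMS described above.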

Query Example

English: Find the 10 most expensive products that we stock

SQL:

SELECT TOP 10 Products.ProductName AS TenMostExpensiveProducts, Products.UnitPrice
FROM Products
ORDER BY Products.UnitPrice DESC;

Query Results:
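TOP is Microsoft's syntax (SQL Server and Access); several other databases, including SQLite, express the same idea with LIMIT. To experiment with a top-N query without SQL Server, a minimal sketch follows. It assumes Python's built-in sqlite3 module, and the twelve products and their prices are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductName TEXT, UnitPrice REAL)")
# Twelve hypothetical products priced 5.0, 10.0, ..., 60.0.
conn.executemany(
    "INSERT INTO Products VALUES (?, ?)",
    [(f"Product {n}", 5.0 * n) for n in range(1, 13)],
)

# Same intent as SELECT TOP 10 ... ORDER BY UnitPrice DESC in SQL Server.
top_ten = conn.execute("""
    SELECT ProductName AS TenMostExpensiveProducts, UnitPrice
    FROM Products
    ORDER BY UnitPrice DESC
    LIMIT 10
""").fetchall()

for name, price in top_ten:
    print(name, price)
```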

Transactional and Analytical Systems

• Transactional systems: Systems that are used to run a business in real time, based on current data. Also called “systems of record”

• Analytical systems: Systems designed to support decision making based on historical point-in-time and prediction data for complex queries or data mining applications

• BI systems are generally analytical systems
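The difference in workload can be sketched with a single table. This example assumes Python's built-in sqlite3 module; the Sales table, its columns, and the sample figures are hypothetical. The first function does the small, immediate write typical of a transactional system, while the second scans accumulated history to produce a summary, as an analytical system would.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Sales (
        SaleID INTEGER PRIMARY KEY, Region TEXT, SaleMonth TEXT, Amount REAL
    )
""")


def record_sale(region, month, amount):
    """Transactional work: one small write, committed immediately."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO Sales (Region, SaleMonth, Amount) VALUES (?, ?, ?)",
            (region, month, amount),
        )


def sales_by_region():
    """Analytical work: read the whole history and summarize it."""
    return conn.execute("""
        SELECT Region, COUNT(*), SUM(Amount)
        FROM Sales
        GROUP BY Region
        ORDER BY Region
    """).fetchall()


# Made-up sales, recorded one at a time as they would arrive.
for sale in [("East", "2008-01", 100.0),
             ("West", "2008-01", 250.0),
             ("East", "2008-02", 50.0)]:
    record_sale(*sale)

summary = sales_by_region()
print(summary)
```

Here both workloads share one database for brevity; in practice the analytical queries would usually run against a separate copy of the data.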

Examples of Transactional and Analytical Systems

Transactional System Examples
• Supermarket checkout system
• ATM machines
• Purchase order processing
• Student course registration
• Warehouse/inventory tracker
• Airline ticketing system
• E-Z Pass

Analytical System Examples
• Data warehouses
• Data marts
• Enterprise spend analysis
– Where do we spend our $$$
• Sales force productivity analysis
– By sales person, region, or product line
• Product-line profitability analysis
– Which products are most profitable?
– Which do we lose money on?

Why Not Use Operational Data Stores For BI?

• It is good practice to separate operational and analytical systems and data

• Why?– To improve system performance

– To improve database manageability and maintainability

– To optimize each type of system for its primary purpose

Wrap Up

For Thursday

• We will be discussing part 1 of Competing on Analytics
– Reading assignment is available on the wiki

• Come prepared to apply the concepts in part 1 of the book in class discussions to analyze how some well-known organizations might be able to improve their business by aggressively pursuing the principles of analytic excellence described in the book
– Feel free to suggest organizations to discuss prior to class: I’ll be taking requests as I spin your favorite on-the-fly cases
– Post suggestions for organizations to discuss in class, along with a brief description of why they would be interesting to discuss, to the course wiki by Wednesday evening.