Post on 30-May-2018
8/14/2019 Business Intelligence Technologies
Business Intelligence Technologies
Donato Malerba
Dipartimento di Informatica
Università degli Studi, Bari, Italy
malerba@di.uniba.it
http://www.di.uniba.it/malerba
Business Intelligence
- Business Intelligence is a global term for all the processes, techniques and tools that support business decision-making based on information technology.
- The approaches can range from a simple spreadsheet to a major competitive undertaking.
- Data mining is an important new component of business undertaking.
Business Intelligence Technologies
[Figure: layers of business intelligence technologies, listed in order of increasing potential to support business decisions]
1. Data Sources: paper, files, information providers, database systems, OLTP
2. Data Warehouses / Data Marts
3. Data Exploration: OLAP, DSS, EIS, querying and reporting
4. Data Mining: information discovery
5. Data Presentation: visualization techniques
6. Decision Making
Roles shown alongside the layers: End User, Business Analyst, Data Analyst, DB Admin
Business Processes
Data to support decision making (example: banking)
- Decisional processes (agreement with a credit card): DSS or EIS
- Management processes (grant a loan): MIS
- Operational processes (transaction on a bank account): TPS
- Different information systems support the different processes.
DSS vs. EIS
- Decision Support Systems (DSS) and Executive Information Systems (EIS): information systems designed to help managers in making choices.
- Different, yet interrelated applications.
- A DSS focuses on a particular decision, whereas an EIS provides a much wider range of information (e.g., information on financials, on production history, and on external events).
- DSSs appeared in the 1970s.
- EISs appeared in the 1980s.
DSS vs. EIS
- The original EISs did not have the analytical capabilities of a DSS.
- An EIS is used by senior managers to find problems; the DSS is used by the staff people to study them and to offer alternatives (Rockart and Delong, 1988).
[Figure: EIS and DSS]
Where Do Data Come From?
- The EISs and DSSs often lacked a strong database component.
- Most organizational information gathering was (and is) directed to maintaining current (preferably on-line) information about individual transactions and customers.
- Managerial decision making requires consideration of the past and the future, not just the present.
- New databases, called data warehouses, were created specifically for analytic use.
A Data Warehouse is ...
A data warehouse is a
  - subject-oriented,
  - integrated,
  - time-variant, and
  - nonvolatile
collection of data in support of management's decisions.
Inmon, W.H. Building the Data Warehouse. Wellesley, MA: QED Tech. Pub. Group, 1992.
subject-oriented ...
- The data in the warehouse is defined and organized in business terms, and is grouped under business-oriented subject headings, such as
  - customers
  - products
  - sales
  rather than individual transactions.
- Normalization is not relevant.
integrated ...
- The data warehouse contents are defined such that they are valid across the enterprise and its operational and external data sources.
[Figure: operational systems and the data warehouse]
- The data in the warehouse should be
  - clean
  - validated
  - properly integrated
time-variant ...
- All data in the data warehouse is time-stamped at time of entry into the warehouse or when it is summarized within the warehouse.
- This chronological recording of data provides historical and trend analysis possibilities.
- In contrast, operational data is overwritten, since past values are not of interest.
nonvolatile ...
- Once loaded into the data warehouse, the data is not updated.
- Data acts as a stable resource for consistent reporting and comparative analysis.
- In contrast, operational data is updated (inserted, deleted, modified).
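The contrast between time-variant, nonvolatile warehouse data and overwritten operational data can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a warehouse design: the class names `OperationalAccount` and `WarehouseHistory`, the account id, the dates, and the balances are all hypothetical.

```python
from datetime import date


class OperationalAccount:
    """Operational record: each update overwrites the previous value."""

    def __init__(self, account_id, balance):
        self.account_id = account_id
        self.balance = balance

    def update(self, new_balance):
        self.balance = new_balance  # the past value is lost


class WarehouseHistory:
    """Warehouse-style store: rows are time-stamped on entry and never updated."""

    def __init__(self):
        self._rows = []  # append-only list of (load_date, account_id, balance)

    def load(self, load_date, account_id, balance):
        self._rows.append((load_date, account_id, balance))

    def history(self, account_id):
        """Return all snapshots for one account, in chronological order."""
        return sorted(row for row in self._rows if row[1] == account_id)


account = OperationalAccount("ACC-1", 100.0)
warehouse = WarehouseHistory()
warehouse.load(date(1995, 1, 1), account.account_id, account.balance)

account.update(250.0)  # the operational system now only knows 250.0
warehouse.load(date(1995, 2, 1), account.account_id, account.balance)

# The warehouse still holds both snapshots, so a trend can be read off.
trend = [balance for _, _, balance in warehouse.history("ACC-1")]
```

The operational object keeps only the latest balance, while the append-only history keeps every dated snapshot, which is what makes historical and trend analysis possible.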
Which Data in the Warehouse?
- A data warehouse contains five types of data:
  - Current detail data
  - Old detail data
  - Lightly summarized data
  - Highly summarized data
  - Metadata
- Granularity of the data: a key design issue.
Flow of Data
[Figure: flow of data] Operational environment -> clean the data -> reside in warehouse -> purge, summarize, or archive
An Example of Data Integration

Operational data

Checking Account System
- Jane Doe (name)
- Female (gender)
- Bounced check #145 on 1/5/95
- Opened account 1994

Savings Account System
- Jane Doe
- F (gender)
- Opened account 1992

Investment Account System
- Jane Doe
- Owns 25 Shares Exxon
- Opened account 1995

Data warehouse

Customer
- Jane Doe
- Female
- Bounced check #145
- Married
- Owns 25 Shares Exxon
- Customer since 1992
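The integration step in this example can be sketched in Python. This is a minimal sketch under stated assumptions: the dictionary layout, the `GENDER_CODES` mapping, and the `integrate` function are hypothetical, and matching records by name alone is a simplification (a real system would need a reliable customer key). The "Married" attribute is not present in any of the three source systems shown, so the sketch leaves it out.

```python
# Records as they appear in the three operational systems of the example.
checking = {"name": "Jane Doe", "gender": "Female",
            "events": ["Bounced check #145"], "opened": 1994}
savings = {"name": "Jane Doe", "gender": "F", "events": [], "opened": 1992}
investment = {"name": "Jane Doe", "events": ["Owns 25 Shares Exxon"], "opened": 1995}

# Assumed mapping from source-specific gender codes to one warehouse encoding.
GENDER_CODES = {"F": "Female", "Female": "Female", "M": "Male", "Male": "Male"}


def integrate(records):
    """Merge per-system records for one customer into a single warehouse row."""
    names = {r["name"] for r in records}
    if len(names) != 1:
        raise ValueError("records do not refer to the same customer")
    # Normalize every gender value to the warehouse encoding before comparing.
    genders = {GENDER_CODES[r["gender"]] for r in records if "gender" in r}
    if len(genders) > 1:
        raise ValueError("conflicting gender values")
    return {
        "name": names.pop(),
        "gender": genders.pop() if genders else None,
        "facts": [e for r in records for e in r["events"]],
        # The customer relationship starts with the earliest account opened.
        "customer_since": min(r["opened"] for r in records),
    }


customer = integrate([checking, savings, investment])
```

The two points the example makes are visible here: inconsistent encodings ("F" and "Female") are reconciled into one value, and "Customer since 1992" is derived from the earliest account rather than copied from any single system.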
Cost and Size of a Data Warehouse
- Data warehouses are expensive undertakings (mean cost: $2.2 million).
- Since a data warehouse is designed for the enterprise, it has a typical storage size running from 50 GB to over a terabyte.
- Parallel computing to speed up data retrieval.

WAREHOUSE SIZE    SERVER REQUIREMENTS
5-50 GB           Pentium PC > 100 MHz
50-500 GB         SMP machine
> 500 GB          SMP or MPP machine
The Data Mart
- A lower-cost, scaled-down version of the data warehouse designed for the strategic business unit (SBU) or department level.
- An excellent first step for many organizations.
- Main problem: data marts often differ from department to department.
- Two approaches:
  - data marts -> enterprise-wide system
  - data warehouse -> data marts
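An Architecture for Data Warehousing
[Figure: data warehousing architecture; labels recovered from the diagram]
- Sources: operational databases, external sources
- Loading steps: extraction, cleaning, validation, summarization
- Stores: data warehouse, data mart, metadata
- Used by: EIS, DSS, OLAP, data mining, query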
On-Line Analytical Processing (OLAP)
- Term introduced by E.F. Codd (1993) in contrast to On-Line Transaction Processing (OLTP).
- The OLAP Council's definition: "A category of software technology that enables analysts, managers and executives to gain insight into data through fast, consistent, interactive access to a wide variety of possible views of information that have been transformed from raw data to reflect the real dimensionality of the enterprise as understood by the user."
On-Line Analytical Processing (OLAP)
- Basic idea: users should be able to manipulate enterprise data models across many dimensions to understand changes that are occurring.
- Data used in OLAP should be in the form of a multi-dimensional cube.
[Figure: cube with dimensions Time, Market, Product]
Dimensional Hierarchies
- Each dimension can be hierarchically structured:
  - Item -> Product -> Type of product
  - Day -> Week -> Month -> Year
  - Store -> City -> State -> Country
OLAP Operations
- Rollup: decreasing the level of detail
- Drill-down: increasing the level of detail
- Slice-and-dice: selection and projection
- Pivot: re-orienting the multidimensional view of data
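The four operations can be illustrated on a tiny in-memory cube. The sketch below uses plain Python rather than an OLAP tool; the sales facts, the column names, and the helper functions `aggregate` and `slice_` are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical sales facts: (year, month, country, city, product, units)
FACTS = [
    (1995, 1, "Italy", "Bari", "pasta", 10),
    (1995, 1, "Italy", "Rome", "pasta", 20),
    (1995, 2, "Italy", "Bari", "wine", 5),
    (1995, 2, "France", "Paris", "wine", 8),
    (1996, 1, "France", "Paris", "pasta", 4),
]
COLUMNS = ("year", "month", "country", "city", "product", "units")


def aggregate(facts, dims):
    """Sum units grouped by the given dimension columns (projection)."""
    idx = [COLUMNS.index(d) for d in dims]
    totals = defaultdict(int)
    for row in facts:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)


def slice_(facts, **fixed):
    """Selection: keep only the facts that match the fixed dimension values."""
    return [r for r in facts
            if all(r[COLUMNS.index(k)] == v for k, v in fixed.items())]


# Drill-down level: totals per year, month and country.
detailed = aggregate(FACTS, ["year", "month", "country"])
# Rollup: drop the month level, so totals are per year and country.
rolled_up = aggregate(FACTS, ["year", "country"])
# Slice-and-dice: select Italy only, then project onto the product dimension.
italy_only = aggregate(slice_(FACTS, country="Italy"), ["product"])
# Pivot: the same totals as the rollup, viewed with the axes swapped.
pivoted = aggregate(FACTS, ["country", "year"])
```

Going from `rolled_up` to `detailed` is a drill-down (month is added back), and `pivoted` holds the same numbers as `rolled_up` with country as the leading axis.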
Implementing Multi-dimensionality
- Multi-dimensional databases (MDDB)
- To make relational databases handle multidimensionality, two kinds of tables are introduced:
  - Fact table: contains numerical facts. It is long and thin.
  - Dimension tables: contain pointers to the fact table. They show where the information can be found. A separate table is provided for each dimension. Dimension tables are small, short, and wide.
Star Schema
Fact Table: STORE KEY, PRODUCT KEY, PERIOD KEY, Dollars, Units, Price
Market Dimension: STORE KEY, Store Desc., City, State, District ID, District Desc., Region ID, Region Desc., Regional Mgr., Level
Time Dimension: PERIOD KEY, Period Desc., Year, Quarter, Month, Day
Product Dimension: PRODUCT KEY, Product Desc., Brand, Color, Size, Manufacturer
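A cut-down version of this star schema can be tried out with Python's built-in `sqlite3` module. The sketch below is an illustration under stated assumptions: the table names (`market`, `period`, `product`, `sales_fact`), the reduced column sets, and all rows are hypothetical, and the fact and dimension tables are linked through the shared keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE market  (store_key INTEGER PRIMARY KEY, store_desc TEXT, city TEXT, state TEXT);
CREATE TABLE period  (period_key INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER, month INTEGER);
CREATE TABLE product (product_key INTEGER PRIMARY KEY, product_desc TEXT, brand TEXT);
CREATE TABLE sales_fact (
    store_key   INTEGER REFERENCES market(store_key),
    product_key INTEGER REFERENCES product(product_key),
    period_key  INTEGER REFERENCES period(period_key),
    dollars REAL, units INTEGER
);
INSERT INTO market  VALUES (1, 'Store A', 'Bari', 'BA'), (2, 'Store B', 'Rome', 'RM');
INSERT INTO period  VALUES (1, 1995, 1, 1), (2, 1995, 1, 2);
INSERT INTO product VALUES (1, 'Pasta 500g', 'BrandX'), (2, 'Red wine', 'BrandY');
INSERT INTO sales_fact VALUES
    (1, 1, 1, 20.0, 10), (1, 2, 2, 45.0, 5), (2, 1, 1, 40.0, 20), (2, 1, 2, 10.0, 5);
""")

# Join the fact table to two dimensions and summarize sales by city and brand.
rows = conn.execute("""
    SELECT m.city, p.brand, SUM(f.dollars) AS dollars, SUM(f.units) AS units
    FROM sales_fact AS f
    JOIN market  AS m ON m.store_key   = f.store_key
    JOIN product AS p ON p.product_key = f.product_key
    GROUP BY m.city, p.brand
    ORDER BY m.city, p.brand
""").fetchall()
conn.close()
```

The fact table stays narrow (keys plus numeric measures), while the descriptive attributes used for grouping live in the dimension tables, which is why a summary along any dimension is a join followed by a `GROUP BY`.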
MOLAP, ROLAP, DSS
- The OLAP technology is considered an extension of the original DSS technology.
- DSS applications are tools that access and analyze data in relational database (RDB) tables.
- OLAP tools access and analyze multidimensional data (typically three, up to ten-dimensional data).
- OLAP technology is called MOLAP/ROLAP (multidimensional/relational OLAP) if it uses an MDDB/RDB.
OLAP/DSS
- OLAP tools focus on providing multi-dimensional data analysis, which is superior to SQL in computing summaries and breakdowns along many dimensions.
- OLAP tools require strong interaction from the users to identify interesting patterns in data.
- An OLAP tool evaluates a precise query that the user formulates.
- OLAP users are farmers.
Data Warehouse -> Data Mining
The rationale for moving from the data warehouse to data mining arises from the need to increase the leverage that an organization can get from its existing warehouse approach.
After implementing a data mining solution, an organization could decide to integrate the solution in a broader data-driven approach to business decision making. The data warehouse will provide an excellent vehicle for such an integration.
Critical Success Factors for Business Applications
- People
  - Find a sponsor for the application
  - Select the right user group
  - Involve a business analyst with domain knowledge
  - Collaborate with experienced data analysts
- Data
  - Select relatively clean sources of data
  - Select a limited set of data sources (e.g., the data warehouse)
Critical Success Factors for Business Applications (cont.)
- Application
  - Understand business objectives.
  - Analyze cost-benefits and significance of the impact on business problem.
  - Consider legal or social issues in collecting input data.