Business DataWarehouse_Big Data

Post on 06-Aug-2015

Big Data

MIS 6309 Business Data Warehousing, Fall 2014, Group 6

What is Big Data?

‘Data’: The New Oil of the Information Revolution

What makes it ‘Big’ Data?

Volume

Velocity

Variety

Hadoop

HDFS

Map Reduce

Key = index.php, Value = 1

Processing Logs
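The key/value idea on the Map Reduce slide can be sketched in plain Python. This is only an illustration of the map, shuffle, and reduce steps for counting page hits in a web log; it does not use Hadoop itself, and the log format, `LOG_LINES`, and the function names are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical access-log lines; the field layout is an assumption for illustration.
LOG_LINES = [
    "10.0.0.1 GET /index.php 200",
    "10.0.0.2 GET /about.php 200",
    "10.0.0.3 GET /index.php 200",
]


def map_phase(line):
    """Emit one (key, value) pair per log line: key = requested page, value = 1."""
    fields = line.split()
    page = fields[2].lstrip("/")
    yield page, 1


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the values for one key to get the hit count for that page."""
    return key, sum(values)


pairs = [pair for line in LOG_LINES for pair in map_phase(line)]
hit_counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(hit_counts)  # {'index.php': 2, 'about.php': 1}
```

In a real cluster the map and reduce functions run in parallel on many nodes; here they run in one process so the data flow is easy to follow.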

Hadoop Ecosystem

Big Data Landscape

What’s in store for us?

• More jobs

• More opportunities

• More money!

Sectors Using Big Data

Enhancing the Multichannel Consumer Experience:

• Use big data to integrate promotions and pricing for shoppers seamlessly, whether those consumers are online, in-store, or perusing a catalog.

• Integrate customer databases with household information such as income, housing values, and number of children, and use it to create different versions of catalogs and similar materials attuned to the behavior and preferences of different customer groups.

Big Data Revenue

Increased Efficiency

Current Limitations for Big Data Analytics

• Meeting the need for speed

• Understanding the data

• Addressing data quality

• Displaying meaningful results

• Big data skills are in short supply

Problems & Threats – Big Data

• Privacy breaches and embarrassments

• Anonymization could become impossible

• Data masking could be defeated to reveal personal information

• Unethical actions based on interpretations

• Big data analytics are not 100% accurate

• Discrimination

• Few (if any) legal protections exist for the involved individuals

• Big data will probably exist forever

• Concerns for e-discovery

• Making patents and copyrights irrelevant

Case Studies – Recent Data Breaches

• Target breach: 40 million credit and debit card accounts were compromised over a three-week period, costing the company $148 million.

• JP Morgan reported that 76 million households and 7 million small businesses were exposed in a data breach.

• Customer names, addresses, phone numbers and e-mail addresses were taken.

• Hackers also obtained internal data identifying customers by category, such as whether they are clients of the private-bank, mortgage, auto or credit-card divisions, according to a person briefed on the matter.

• Third-party external data in the news: banks turn to Facebook and Twitter to keep track of education loan borrowers.

Thinking Dimensionally

Sentiment_Analysis Table

Sentiment_ID (e.g. 1, 2, 3)

Sentiment_description (e.g. Wow, Awesome, Crap)

Customer_ID

Product_ID

Dim_Customer

Customer_ID

Customer_Name

Gender

Age

Dim_Product

Product_ID

Product_Name

Category

Product_Description
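The three tables above can be sketched as a small star schema, with Sentiment_Analysis acting as the fact table that references the two dimensions. The example below uses Python's built-in `sqlite3` module; the column types, the sample rows, and the final query are assumptions added for illustration and are not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Dim_Customer (
    Customer_ID INTEGER PRIMARY KEY,
    Customer_Name TEXT, Gender TEXT, Age INTEGER
);
CREATE TABLE Dim_Product (
    Product_ID INTEGER PRIMARY KEY,
    Product_Name TEXT, Category TEXT, Product_Description TEXT
);
CREATE TABLE Sentiment_Analysis (
    Sentiment_ID INTEGER PRIMARY KEY,
    Sentiment_description TEXT,
    Customer_ID INTEGER REFERENCES Dim_Customer(Customer_ID),
    Product_ID INTEGER REFERENCES Dim_Product(Product_ID)
);
""")

# Hypothetical sample rows, invented for this sketch.
conn.executemany("INSERT INTO Dim_Customer VALUES (?, ?, ?, ?)",
                 [(1, "Customer A", "F", 30), (2, "Customer B", "M", 41)])
conn.executemany("INSERT INTO Dim_Product VALUES (?, ?, ?, ?)",
                 [(10, "Phone X", "Electronics", "Smartphone"),
                  (11, "Desk Y", "Furniture", "Office desk")])
conn.executemany("INSERT INTO Sentiment_Analysis VALUES (?, ?, ?, ?)",
                 [(1, "Wow", 1, 10), (2, "Awesome", 2, 10),
                  (3, "Crap", 1, 11), (4, "Wow", 2, 10)])

# Slice the sentiment facts by a product dimension attribute.
rows = conn.execute("""
    SELECT p.Category, s.Sentiment_description, COUNT(*) AS mentions
    FROM Sentiment_Analysis AS s
    JOIN Dim_Product AS p ON p.Product_ID = s.Product_ID
    GROUP BY p.Category, s.Sentiment_description
    ORDER BY p.Category, s.Sentiment_description
""").fetchall()
print(rows)
```

The same join pattern works against Dim_Customer, for example to count sentiments by Gender or by age band.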

Data: Big or Small

Customer Name     Location
Avadhoot Patil    Dallas

Customer Name     Location
Ankur Kaushik     Dallas

Customer Name     Location
Avadhoot Patil    Dallas
Ankur Kaushik     Dallas

Sort and Merge
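A minimal sketch of the sort-and-merge step, assuming each source is a list of (customer name, location) rows: every source is sorted, the sorted streams are merged, and adjacent duplicates are dropped. The function name `sort_and_merge` and the variable names are hypothetical.

```python
import heapq

# The two single-row sources from the slide.
source_a = [("Avadhoot Patil", "Dallas")]
source_b = [("Ankur Kaushik", "Dallas")]


def sort_and_merge(*sources):
    """Sort each source, merge the sorted streams, and drop duplicate rows."""
    sorted_sources = [sorted(source) for source in sources]
    merged = []
    for row in heapq.merge(*sorted_sources):
        # After a sorted merge, duplicates are always adjacent.
        if not merged or merged[-1] != row:
            merged.append(row)
    return merged


merged_customers = sort_and_merge(source_a, source_b)
print(merged_customers)
# [('Ankur Kaushik', 'Dallas'), ('Avadhoot Patil', 'Dallas')]
```

The approach is the same whether the inputs hold two rows or two billion; at scale the sort and merge are simply distributed across machines.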

Conformed Dimensions

Online_Customer Table Store_Customer Table

Airport Name    City      Country
ABC             Dallas    USA

Airport_ID    Airport Name    City      Country
1001          ABC             Dallas    USA
1002          XYZ             Dallas    USA

Selecting Keys

• Anchor Dimensions with Durable Surrogate Keys

(Diagram labels: natural keys, durable surrogate keys, slowly changing dimension, data warehouse system, Airport data source)
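One way to picture a durable surrogate key is a lookup that hands each natural key a warehouse-owned ID once and then always returns the same ID. The sketch below assumes the natural key is the (airport name, city, country) tuple from the airport example and that IDs start at 1001; the class name `SurrogateKeyRegistry` is hypothetical.

```python
import itertools


class SurrogateKeyRegistry:
    """Assign each natural key a surrogate key once and reuse it afterwards."""

    def __init__(self, start=1001):
        self._next_id = itertools.count(start)
        self._by_natural_key = {}

    def key_for(self, natural_key):
        # A key that was already issued is never changed or reissued.
        if natural_key not in self._by_natural_key:
            self._by_natural_key[natural_key] = next(self._next_id)
        return self._by_natural_key[natural_key]


registry = SurrogateKeyRegistry()
abc_id = registry.key_for(("ABC", "Dallas", "USA"))
xyz_id = registry.key_for(("XYZ", "Dallas", "USA"))
abc_again = registry.key_for(("ABC", "Dallas", "USA"))
print(abc_id, xyz_id, abc_again)  # 1001 1002 1001
```

Because the surrogate key does not depend on the source system's own identifiers, descriptive attributes can change over time (a slowly changing dimension) while facts stay anchored to the same key.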

Dimensionalize data before applying governance

Dimensionalize data as early as possible in the data pipeline

Governance

Parse, Match, Identify

Resolution on the fly

• Privacy is the Most Important Governance Perspective

For most forms of analysis, personal details should be masked.

Data should be aggregated enough not to allow identification of individuals.

Data should be masked or encrypted on write, or masked on read.
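The two masking options can be contrasted in a short sketch. Masking on write is shown here as replacing the customer name with a keyed hash before the record is stored; masking on read is shown as returning only non-identifying fields to the analyst. The field names follow Dim_Customer, while `SECRET_KEY`, the function names, and the choice of HMAC-SHA256 are assumptions for illustration, not a recommendation from the slides.

```python
import hashlib
import hmac

# Hypothetical demo key; a real system would load this from a secrets store.
SECRET_KEY = b"demo-only-key"


def mask_on_write(record):
    """Replace the customer name with a keyed hash before the record is stored."""
    masked = dict(record)
    digest = hmac.new(SECRET_KEY, record["Customer_Name"].encode("utf-8"),
                      hashlib.sha256).hexdigest()
    masked["Customer_Name"] = digest[:16]
    return masked


def mask_on_read(record, allowed_fields=("Gender", "Age")):
    """Store the full record, but expose only non-identifying fields on read."""
    return {field: record[field] for field in allowed_fields if field in record}


raw = {"Customer_ID": 1, "Customer_Name": "Customer A", "Gender": "F", "Age": 30}
stored = mask_on_write(raw)
analyst_view = mask_on_read(raw)
print(stored["Customer_Name"] != raw["Customer_Name"])  # True
print(analyst_view)  # {'Gender': 'F', 'Age': 30}
```

A keyed hash is pseudonymization rather than full anonymization: the same name always maps to the same token, so records can still be joined, and combinations of the remaining fields may still identify a person.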

Privacy

THANK YOU!
