Business DataWarehouse_Big Data
Uploaded by pragativbora
Transcript of Business DataWarehouse_Big Data
Big Data
MIS 6309 Business Data Warehousing Fall 2014 -GROUP 6
What is Big Data?
‘Data’: The New Oil of the Information Revolution
What makes it ‘Big’ Data?
Volume
Velocity
Variety
Hadoop
HDFS
Map Reduce
Key = index.php, Value = 1
Processing Logs
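The key/value pair on the Map Reduce slide (Key = index.php, Value = 1) suggests counting page hits in web logs. The sketch below imitates the map, shuffle and reduce phases in plain Python rather than on a Hadoop cluster. The log format, the sample lines and all function names are assumptions made for illustration; they do not come from the slides.

```python
from collections import defaultdict

# Hypothetical access-log lines; this format is an assumption for illustration.
LOG_LINES = [
    "10.0.0.1 GET /index.php 200",
    "10.0.0.2 GET /about.php 200",
    "10.0.0.3 GET /index.php 200",
]


def map_phase(line):
    """Emit one (page, 1) pair per log line, e.g. ("index.php", 1)."""
    page = line.split()[2].lstrip("/")
    return (page, 1)


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the 1s emitted for a single page."""
    return (key, sum(values))


def count_hits(lines):
    """Run map, shuffle and reduce over all lines and return hits per page."""
    pairs = [map_phase(line) for line in lines]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(count_hits(LOG_LINES))
```

On a real cluster the map and reduce functions would run in parallel on different nodes; here they run in one process only to show the data flow.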
Hadoop Ecosystem
Big Data Landscape
What’s in store for us?
• More jobs
• More opportunities
• More money!
Sectors Using Big Data
Enhancing the multichannel consumer experience:
• Use big data to integrate promotions and pricing seamlessly for shoppers, whether they are online, in-store, or perusing a catalog.
• Integrate customer databases with household information such as income, housing values, and number of children, and use it to create different versions of catalogs and similar materials attuned to the behavior and preferences of different customer groups.
Big Data Revenue
Increased Efficiency
Current Limitations for Big Data Analytics
• Meeting the need for speed
• Understanding the data
• Addressing data quality
• Displaying meaningful results
• Big data skills are in short supply.
Problems & Threats – Big Data
• Privacy breaches and embarrassments
• Anonymization could become impossible
• Data masking could be defeated to reveal personal information
• Unethical actions based on interpretations
• Big data analytics are not 100% accurate
• Discrimination
• Few (if any) legal protections exist for the individuals involved
• Big data will probably exist forever
• Concerns for e-discovery
• Making patents and copyrights irrelevant
Case Studies – Recent Data Breaches
• Target breach: 40 million credit and debit accounts were compromised over a three-week period, costing the company $148 million.
• JP Morgan reported that 76 million households and 7 million small businesses were exposed in a data breach.
• Customer names, addresses, phone numbers and e-mail addresses were taken.
• Hackers also obtained internal data identifying customers by category, such as whether they are clients of the private-bank, mortgage, auto or credit-card divisions, said a person briefed on the matter.
• Third-party external data, in the news: banks turn to Facebook and Twitter to keep track of education loan takers.
Thinking Dimensionally
Sentiment_Analysis Table
Sentiment_ID (e.g. 1, 2, 3)
Sentiment_description (e.g. Wow, Awesome, Crap)
Customer_ID
Product_ID
Dim_Customer
Customer_ID
Customer_Name
Gender
Age
Dim_Product
Product_ID
Product_Name
Category
Product_Description
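The three tables above form a small star schema: Sentiment_Analysis acts as the fact table, and Dim_Customer and Dim_Product are its dimensions. As a minimal sketch, assuming an in-memory SQLite database, the schema could be created and queried as follows. The column types, the sample rows and the function names are assumptions for illustration; only the table and column names come from the slide.

```python
import sqlite3


def build_schema(conn):
    """Create the fact table and the two dimensions named on the slide."""
    conn.executescript("""
        CREATE TABLE Dim_Customer (
            Customer_ID   INTEGER PRIMARY KEY,
            Customer_Name TEXT,
            Gender        TEXT,
            Age           INTEGER
        );
        CREATE TABLE Dim_Product (
            Product_ID          INTEGER PRIMARY KEY,
            Product_Name        TEXT,
            Category            TEXT,
            Product_Description TEXT
        );
        CREATE TABLE Sentiment_Analysis (
            Sentiment_ID          INTEGER PRIMARY KEY,
            Sentiment_description TEXT,
            Customer_ID INTEGER REFERENCES Dim_Customer(Customer_ID),
            Product_ID  INTEGER REFERENCES Dim_Product(Product_ID)
        );
    """)


def load_sample(conn):
    """Insert made-up rows; none of these values come from the slides."""
    conn.execute("INSERT INTO Dim_Customer VALUES (1, 'Customer A', 'F', 30)")
    conn.execute("INSERT INTO Dim_Customer VALUES (2, 'Customer B', 'M', 41)")
    conn.execute(
        "INSERT INTO Dim_Product VALUES (10, 'Phone X', 'Electronics', 'Sample product')"
    )
    conn.executemany(
        "INSERT INTO Sentiment_Analysis VALUES (?, ?, ?, ?)",
        [(1, "Wow", 1, 10), (2, "Awesome", 2, 10), (3, "Crap", 1, 10)],
    )


def sentiment_by_gender(conn):
    """Count sentiment rows per customer gender by joining fact to dimension."""
    return conn.execute("""
        SELECT c.Gender, COUNT(*)
        FROM Sentiment_Analysis AS s
        JOIN Dim_Customer AS c ON c.Customer_ID = s.Customer_ID
        GROUP BY c.Gender
        ORDER BY c.Gender
    """).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_schema(conn)
    load_sample(conn)
    print(sentiment_by_gender(conn))
```

Thinking dimensionally here means the unstructured sentiment text is tied to ordinary customer and product keys, so it can be sliced like any other fact.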
Data: Big or Small
Source 1
Customer Name     Location
Avadhoot Patil    Dallas
Source 2
Customer Name     Location
Ankur Kaushik     Dallas
After Sort and Merge
Customer Name     Location
Avadhoot Patil    Dallas
Ankur Kaushik     Dallas
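The sort-and-merge step on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each source is a list of (customer name, location) tuples, rows are ordered alphabetically by name, and exact duplicates are dropped. The function name sort_and_merge is hypothetical.

```python
import heapq

# Rows from the two sources shown on the slide.
source_1 = [("Avadhoot Patil", "Dallas")]
source_2 = [("Ankur Kaushik", "Dallas")]


def sort_and_merge(*sources):
    """Sort each source, merge the sorted runs, and drop exact duplicates."""
    sorted_runs = [sorted(rows) for rows in sources]
    merged = []
    for row in heapq.merge(*sorted_runs):
        # Equal rows are adjacent after a merge of sorted runs.
        if not merged or merged[-1] != row:
            merged.append(row)
    return merged


if __name__ == "__main__":
    for name, location in sort_and_merge(source_1, source_2):
        print(name, location)
```

The same pattern works whether the data is big or small: sorting happens per source, and the merge only ever compares the current head of each sorted run.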
Conformed Dimensions
Online_Customer Table Store_Customer Table
Airport Name    City      Country
ABC             Dallas    USA

Airport_ID    Airport Name    City      Country
1001          ABC             Dallas    USA
1002          XYZ             Dallas    USA
Selecting Keys
• Anchor dimensions with durable surrogate keys
• Natural keys vs. durable surrogate keys
• Slowly changing dimensions
Diagram labels: Data Warehouse System, Airport, Data_source
• Dimensionalize data before applying governance
• Dimensionalize data as early as possible in the data pipeline
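One way to picture a durable surrogate key is a lookup that hands out a warehouse-owned integer the first time a natural key is seen and returns the same integer on every later load. The sketch below assumes an in-memory mapping; the class name SurrogateKeyMap, its method and the starting value 1001 (chosen to echo the Airport_ID values on the Conformed Dimensions slide) are hypothetical.

```python
import itertools


class SurrogateKeyMap:
    """Assign a stable integer key to each (source, natural key) pair."""

    def __init__(self, start=1001):
        self._next_key = itertools.count(start)
        self._keys = {}

    def key_for(self, source, natural_key):
        """Return the existing surrogate key, or allocate the next one."""
        pair = (source, natural_key)
        if pair not in self._keys:
            self._keys[pair] = next(self._next_key)
        return self._keys[pair]


if __name__ == "__main__":
    keys = SurrogateKeyMap()
    print(keys.key_for("Data_source", "ABC"))
    print(keys.key_for("Data_source", "XYZ"))
    # A repeat lookup returns the key allocated earlier, not a new one.
    print(keys.key_for("Data_source", "ABC"))
```

Because the surrogate key never changes, attribute changes in a slowly changing dimension can be tracked against it even if the source system later reuses or alters its natural key.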
Governance
Parse, Match, Identify
Resolution on the fly
• Privacy is the most important governance perspective
• For most forms of analysis, personal details should be masked.
• Data should be aggregated enough that individuals cannot be identified.
• Data should be masked or encrypted on write, or masked on read.
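Masking on read can be sketched as a function that replaces sensitive fields with a keyed hash before a record reaches the analyst. This is a minimal illustration, not a complete privacy control: the secret, the field list and the function names are assumptions, and a real deployment would manage the key outside the code.

```python
import hashlib
import hmac

# Demo-only secret; an assumption for illustration, never hard-code real keys.
SECRET = b"demo-only-secret"


def mask(value, secret=SECRET):
    """Return a short keyed hash that hides the value but stays consistent."""
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]


def mask_on_read(record, sensitive=("Customer_Name",)):
    """Return a copy of the record with the sensitive fields masked."""
    return {
        field: mask(value) if field in sensitive else value
        for field, value in record.items()
    }


if __name__ == "__main__":
    row = {"Customer_Name": "Customer A", "Gender": "F", "Age": 30}
    print(mask_on_read(row))
```

Because the same input always yields the same token, masked records can still be joined and counted per customer without exposing the name itself.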
Privacy
THANK YOU !