April 10-12 | Chicago, IL
Using Power View and Hive to Gain Business Insights
Finding Hidden Answers in Data
Joey D’Antoni, Comcast Cable
Stacia Misner, Data Inspirations
Please silence cell phones
About Us
Joey D’Antoni
• Principal Architect for SQL Server at Comcast Cable
• @jdanton on Twitter
• Joedantoni.wordpress.com
Stacia Misner
• Principal Consultant at Data Inspirations
• @StaciaMisner on Twitter
• blog.datainspirations.com
Agenda
• Introducing Big Data
• Overview and Summary of Data Set
• Insights into the Data
• Conclusions
Classic Data Analysis
Loading Analyzing Visualization
Classic Data Analysis
Data Warehouse & BI Solutions
ETL
…Uses Just a Subset
Classic Data Analysis
Data Warehouse & BI Solutions
ETL
…Requires Structure
Why Leave the RDBMS
Key Differences
Scale Out As Needed With Commodity Hardware
Impose Schema On Read
Basically Available, Soft-state, Eventually consistent (BASE)
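To make "schema on read" concrete, here is a minimal Python sketch. The raw lines, the field names, and the tab delimiter are hypothetical and are not taken from the presenters' data set. The only point is that the data is stored untyped and a schema is applied when it is read, so rows that do not fit the schema surface at query time rather than at load time.

```python
from dataclasses import dataclass

# Raw records as they might land in storage: untyped, tab-delimited text.
# The layout (channel, interval start, viewing seconds) is hypothetical.
RAW_LINES = [
    "CH001\t2012-07-11 20:00\t5400",
    "CH002\t2012-07-11 20:00\t3100",
    "CH001\t2012-07-11 20:05\tnot-a-number",  # only fails when read
]


@dataclass
class ViewingRecord:
    channel: str
    interval_start: str
    viewing_seconds: int


def read_with_schema(lines):
    """Apply the schema while reading; skip rows that do not fit it."""
    records = []
    for line in lines:
        fields = line.split("\t")
        if len(fields) != 3:
            continue
        channel, interval_start, seconds = fields
        try:
            records.append(ViewingRecord(channel, interval_start, int(seconds)))
        except ValueError:
            continue  # value does not match the declared type
    return records


records = read_with_schema(RAW_LINES)
```

A different reader could apply a different schema to the same raw lines without reloading them, which is the contrast with the ETL-first approach on the earlier slides.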
Hadoop Ecosystem
HDFS
MapReduce
Note: This is only a subset of the ecosystem!
Hadoop and Hive Demo
Extract, Transform, Load (ETL) Process
Some Database Business Doesn’t
Care About
Process
Your
Some
Credit: Buck Woody, Microsoft
Our ETL Process
Collection Server
HDFS
Hive is a data warehouse system that runs on top of Hadoop and lets you write SQL-like (HiveQL) queries against data sets stored in Hadoop.
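As a rough illustration of what querying Hadoop data through Hive can look like, the Python sketch below only builds HiveQL text; it does not connect to a cluster. The table name, column names, and HDFS path are hypothetical, not the ones used in the demo. The statements could then be submitted through the Hive command line or the Hive ODBC driver listed under Resources.

```python
# Hypothetical table name, columns and HDFS path, chosen for illustration only.
TABLE = "stb_engagement"
HDFS_LOCATION = "/data/stb_engagement"

# An external table lays a schema over files that already sit in HDFS.
CREATE_TABLE = f"""
CREATE EXTERNAL TABLE IF NOT EXISTS {TABLE} (
    channel STRING,
    interval_start STRING,
    max_boxes INT,
    viewing_seconds BIGINT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t'
STORED AS TEXTFILE
LOCATION '{HDFS_LOCATION}'
""".strip()


def top_channels_query(limit):
    """Build a HiveQL query ranking channels by total viewing seconds."""
    return (
        "SELECT channel, SUM(viewing_seconds) AS total_seconds "
        f"FROM {TABLE} "
        "GROUP BY channel "
        "ORDER BY total_seconds DESC "
        f"LIMIT {int(limit)}"
    )


print(CREATE_TABLE)
print(top_channels_query(10))
```

Declaring the table as EXTERNAL means dropping it in Hive leaves the underlying files in place, which suits data that a collection process keeps writing to HDFS.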
The Data Set
Set Top Box Engagement Times
• Max Set Top Boxes Viewing Channels
• Aggregate Viewing Seconds
• Potential Total Seconds Watched
• Recorded in 5, 15 and 60 minute aggregates
This data is from the week of July 11-17, 2012
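The slides do not say how the 15 and 60 minute aggregates were produced. Purely as an assumption, the sketch below shows one way wider buckets could be derived from 5-minute rows: viewing seconds are summed, while the box count keeps the largest value seen in the bucket. All rows and field names are hypothetical.

```python
from collections import defaultdict

# Hypothetical 5-minute rows: (channel, minute offset within the hour,
# max boxes tuned in, aggregate viewing seconds).
FIVE_MINUTE_ROWS = [
    ("CH001", 0, 120, 30000),
    ("CH001", 5, 150, 36000),
    ("CH001", 10, 140, 33000),
    ("CH001", 15, 90, 21000),
    ("CH002", 0, 40, 9000),
]


def roll_up(rows, bucket_minutes):
    """Combine 5-minute rows into wider buckets.

    Viewing seconds are additive, so they are summed; the box count is a
    peak, so the bucket keeps the maximum rather than a sum.
    """
    buckets = defaultdict(lambda: {"max_boxes": 0, "viewing_seconds": 0})
    for channel, minute, boxes, seconds in rows:
        start = (minute // bucket_minutes) * bucket_minutes
        bucket = buckets[(channel, start)]
        bucket["max_boxes"] = max(bucket["max_boxes"], boxes)
        bucket["viewing_seconds"] += seconds
    return dict(buckets)


fifteen = roll_up(FIVE_MINUTE_ROWS, 15)
sixty = roll_up(FIVE_MINUTE_ROWS, 60)
```

Choosing the right combining function per measure matters: summing a "max boxes" column across intervals would overstate the audience.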
Preparation for Data Analysis
• Define question to answer
• Define ideal data set
• Find data
Remember Legal and Privacy Issues
Diving into Data Analysis
• Cleanse
  • Reformat as needed
  • Decide what is usable
• Explore
  • Create summaries
  • Perform statistical analysis
  • Use visualizations
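The cleanse and explore steps above can be sketched in a few lines of Python using only the standard library. The readings are hypothetical, and the rule for what counts as usable (present and non-negative) is an assumption made for the example, not the presenters' rule.

```python
import statistics

# Hypothetical per-interval viewing seconds for one channel; None and
# negative values stand in for readings that fail cleansing.
RAW_SECONDS = [30000, 36000, None, 33000, -1, 21000, 27000]


def cleanse(values):
    """Keep only usable readings: present and non-negative."""
    return [v for v in values if v is not None and v >= 0]


def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }


clean = cleanse(RAW_SECONDS)
summary = summarize(clean)
```

Summaries like these are a quick check on whether the cleansed data looks plausible before it is handed to a visualization tool.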
Aggregate Statistics on Data
Resources
Connecting Excel to Hive (Hive ODBC Driver, Excel Hive Add-in)
• http://social.technet.microsoft.com/wiki/contents/articles/6226.how-to-connect-excel-to-hadoop-on-azure-via-hiveodbc.aspx
Connecting PowerPivot to Hadoop on Azure
• http://dennyglee.com/2012/01/21/connecting-powerpivot-to-hadoop-on-azure-self-service-bi-to-big-data-in-the-cloud/
Connecting Power View to Hadoop on Azure
• http://dennyglee.com/2012/02/10/connecting-power-view-to-hadoop-on-azurean-awesomesauce-way-to-view-big-data-in-the-cloud/
Thank you!