Hadoop ACM presentation
Hadoop and Microsoft.
Brad Sarsfield | Senior Software Engineer @bradoop
BIG DATA
HADOOP
MICROSOFT & HADOOP
Agenda
How Big is Big Data?
It’s all about your Big Data Problems
Big Problems
Hadoop is for Big Data.
Data is the Platform.
Hadoop Data Science.
Hadoop Capabilities.
Machine Learning
Graph Processing
Distributed Compute
Extract Load Transform
Predictive Analysis
Hadoop architecture.
Distributed Storage (HDFS)
Query (Hive)
Distributed Processing (MapReduce)
Scripting (Pig)
NoSQL Database (HBase)
Metadata (HCatalog)
Data Integration (ODBC / SQOOP / REST)
Business Intelligence (Excel, PowerView…)
Machine Learning (Mahout)
Graph (Pegasus)
Stats processing (RHadoop)
Pipeline / workflow (Oozie)
Log file aggregation (Flume)
Hadoop and Microsoft.
We are delivering
• Apache Hadoop on Windows Server
• Apache Hadoop on Windows Azure
Big engineering investment
• Big Data Business Intelligence tooling
• Big Data Apache Hadoop
• Big Data Parallel Data Warehouse
Open source commitment
• Apache Software Foundation
• Hortonworks Partnership
Microsoft Hadoop Vision.
Microsoft Business Intelligence (BI)
• ODBC Connectivity
Better on Windows and Azure
• Active Directory
• System Center
Microsoft Data Connectivity
• SQL Server / SQL Parallel Data Warehouse
• Azure Storage / Azure Data Market
ACM Hackathon.
Hadoop on Azure demo
Free Hadoop on Azure
• Code: acmhackathon
Free 30 day Azure account
• No credit card
• 750h small compute / 35GB storage
• Email [email protected] for code