SQL Server 2014 Faster Insights from Any Data
Transcript of SQL Server 2014 Faster Insights from Any Data
![Page 1: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/1.jpg)
SQL Server 2014: Faster Insights from Any Data
Stéphane Fréchette
Friday May 9, 2014
![Page 2: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/2.jpg)
Email: [email protected]
Twitter: @sfrechette
Blog: stephanefrechette.com
Stéphane Fréchette
Founder, CEO | Strategic consultant
Microsoft SQL Server MVP
![Page 3: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/3.jpg)
Session Overview
![Page 4: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/4.jpg)
![Page 5: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/5.jpg)
![Page 6: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/6.jpg)
Excel BI | Capabilities
![Page 7: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/7.jpg)
Microsoft Power BI for Office 365
1 in 4 enterprise customers on Office 365
1 Billion Office Users
Analyze Visualize Share Find
Q&A
Mobile
Discover
Scalable | Manageable | Trusted
![Page 8: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/8.jpg)
Extend with Hybrid Cloud Solutions
![Page 9: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/9.jpg)
Extend with Hybrid Cloud Solutions
![Page 10: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/10.jpg)
Extend with Hybrid Cloud Solutions
![Page 11: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/11.jpg)
Power Query, PowerPivot, Power View, and Power Map
![Page 12: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/12.jpg)
Powerful Self-Service BI with Excel 2013
![Page 13: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/13.jpg)
Power Query
Enable self-service data discovery, query, transformation and mashup experiences for Information Workers, via Excel and PowerPivot
Discovery and connectivity to a wide range of data sources, spanning volume as well as variety of data.
Highly interactive and intuitive experience for rapidly and iteratively building queries over any data source, any size.
Consistency of experience, and parity of query capabilities over all data sources.
Joins across different data sources; ability to create custom views over data that can then be shared with team/department.
![Page 14: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/14.jpg)
Power Query
Discover, combine, and refine Big Data, small data, and any data with Data Explorer for Excel.
![Page 15: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/15.jpg)
Data Sources
• Windows Azure Marketplace
• Windows Active Directory
• Azure SQL Database
• Azure HDInsight
![Page 16: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/16.jpg)
Powerful Self-Service BI with Excel 2013
![Page 17: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/17.jpg)
Introducing PowerPivot
![Page 18: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/18.jpg)
PowerPivot for SharePoint
![Page 19: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/19.jpg)
Powerful Self-Service BI with Excel 2013
![Page 20: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/20.jpg)
Introducing Power View
![Page 21: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/21.jpg)
Power View for Multidimensional Models
• Power View on Analysis Services via BISM
• Native support for DAX in Analysis Services
• Better flexibility: Choice of DAX on Tabular or Multidimensional (cubes)
![Page 22: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/22.jpg)
Architecture
[Diagram components: Internet Explorer; SharePoint (2010 or 2013); Reporting Services Power View; Analysis Services BI Semantic Model (Tabular); Analysis Services BI Semantic Model (Multidimensional); SQL Server Data Tools (one per model); callouts numbered 1 to 6]
![Page 23: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/23.jpg)
BI Semantic Model: Architecture
[Diagram components. Client tools: third-party applications, Reporting Services (Power View), Excel, PowerPivot, SharePoint Insights. Data sources: Databases, LOB Applications, Files, OData Feeds, Cloud Services]
![Page 24: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/24.jpg)
Multidimensional-Tabular Mapping

| BISM-MD Object | Tabular Object |
| --- | --- |
| Cube | Model |
| Cube Dimension | Table |
| Attributes (Key(s), Name) | Columns |
| Measure Group | Table |
| Measure | Measure |
| Measure without Measure Group | Within Table called “Measures” |
| Measure Group to Cube Dimension relationship | Relationship |
| Perspective | Perspective |
| KPI | KPI |
| User/Parent-Child Hierarchies | Hierarchies |
![Page 25: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/25.jpg)
Powerful Self-Service BI with Excel 2013
![Page 26: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/26.jpg)
What Is Power Map?
Power Map for Microsoft Excel enables information workers to discover and share new insights from geographical and temporal data through three-dimensional storytelling.
![Page 27: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/27.jpg)
Power Map: Steps to 3D insights
Map Data
• Data in Excel
• Geo-Code
• 3D and 3 Visuals
Discover Insights
• Play over Time
• Annotate points
• Capture scenes
Share Stories
• Cinematic Effects
• Interactive Tours
• Share Workbook
![Page 28: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/28.jpg)
Map Data
![Page 29: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/29.jpg)
Discover Insights
![Page 30: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/30.jpg)
Share Stories
• Export to Video for Viral!
![Page 31: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/31.jpg)
Power Map
Excel Add-in to Enhance Data Visualization
![Page 32: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/32.jpg)
Power BI Site
![Page 33: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/33.jpg)
Power BI for Office 365 | Capabilities
![Page 34: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/34.jpg)
Power BI for Office 365 | Capabilities
![Page 35: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/35.jpg)
Power BI for Office 365 | Capabilities
![Page 36: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/36.jpg)
Power BI for Office 365 | Capabilities
Corporate
Data Sources
![Page 37: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/37.jpg)
Data Management Gateway
• Enabling Corporate OData Feeds
• Enabling Excel Workbook Data Refresh using SharePoint Online
• Enabling Discovery in Power Query capabilities
[Diagram labels: Power BI Admin Center, Data Management Gateway]
![Page 38: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/38.jpg)
Data Management Gateway - Conceptual
• Power BI Admin Center: Allows IT to configure, manage and monitor access to corporate data sources.
• Data Management Gateway: Connects to corporate data sources and sends data to Microsoft cloud services through a secure channel (Service Bus).
• Corporate Data Sources: The Gateway can connect to a variety of data sources.
• Secure Credential Store: All credentials used by the gateway are stored on-premises.
![Page 39: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/39.jpg)
Data Management Gateway Network Topology
[Diagram zones: Microsoft Data Center | Internet | Perimeter Network | Intranet]
[Diagram components: Data Management Gateway Cloud Services; Data Management Gateway; Customer network; Power Query; Credential Management (saves credentials); Data]
• Outgoing connection to cloud services (Registration, Regular Heartbeat, Data Source definition requests)
• Connect to Corporate OData feed
• Per Machine: Single gateway installed
![Page 40: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/40.jpg)
Corporate OData Feeds and Data Management Gateway
[Diagram components: Power Query, Data Management Gateway]
(1) Using Power Query, Anna connects to the OData feed (URL: http://feedgwMyDB )
Example: Contoso\Anna
(2) The Data Management Gateway connects to SQL Server using either a Windows account or a database account set up by Patrick when creating the feed
Example: DB1_Reader
(3) Returns Result
(4) Returns OData feed
![Page 41: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/41.jpg)
Scheduled Refresh
Scenario: workbook is refreshed on schedule as configured by the author in BI Sites
• Scheduler runs in BI Azure and triggers refresh as configured in the BI Sites application
• The flow assumes the workbook has been added to Power BI, thus save back is done directly to SPO
• When refresh is called by BI Azure, SPO rehydrates the user identity and calls WAC in a back channel (i.e. redirect equivalent)
[Diagram components: BI Azure; Azure Active Directory (AAD): OrgID, MSODS, ACS; SPO; Office Web Apps Service (WAC); Excel Services; Excel Services SOAP API; Data Model; On-Prem Data Sources; Cloud Data Sources]
1. Verify user existence and license in MSODS and get access token to target URL in SPO from ACS
2. Construct the user part of the access token, and trigger refresh for a workbook on behalf of the scheduled refresh user
3. Refresh workbook
4. Power BI workbook?
5. Get shadow workbook refresh
6. Get data from cloud/on-prem sources and re-process the data model
7. Save updated workbook to SPO
![Page 42: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/42.jpg)
On-premises Data Access from BI Azure
Scenario: Interactive refresh from Excel Web Access where the data source is on-premises
• For interactive refresh, shared data sources are configured in advance in the Power BI Admin Center
• For scheduled refresh, personal data sources can be configured by the workbook owner
[Diagram components: Azure Active Directory (AAD): OrgID, MSODS, ACS; BI Azure; Hybrid Proxy; ADO.NET Provider; Discovery API; Tenant Configuration; SQL Azure; Hybrid Data Integration Service; Hybrid Delivery; Windows Azure Service Bus; Azure Storage (temporary); Data Management Gateway; Hybrid Delivery Client API; Credential Manager; branch labels On-Prem and Cloud]
1. Determine whether data source is cloud or on-prem, and retrieve registered ID
2. Authenticate & retrieve tenant information
3. Get registered data source info
4. Issue refresh query
5. Send request to Gateway (via Service Bus)
6. Read query request from Service Bus queue
7. Retrieve data source credentials
8. Run query and retrieve the data
9. Coordinate transfer job
10. Compress & stream data in multiple chunks
11. Receive & decompress data
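Steps 10 and 11 of this flow (compress and stream in chunks, then receive and decompress) can be illustrated with a small Python sketch. It is only an illustration of the general technique: the slide does not say which compression format or chunk size the gateway uses, so `zlib`, the `CHUNK_SIZE` value, and the function names below are all assumptions.

```python
import zlib
from typing import Iterable, Iterator

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size; the slide does not give one


def compress_chunks(rows: Iterable[bytes], chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Sender side (step 10): compress a row stream and emit chunks of at most chunk_size bytes."""
    compressor = zlib.compressobj()
    buffer = bytearray()
    for row in rows:
        buffer += compressor.compress(row)
        while len(buffer) >= chunk_size:
            yield bytes(buffer[:chunk_size])
            del buffer[:chunk_size]
    # Flush whatever the compressor is still holding, then drain the buffer.
    buffer += compressor.flush()
    while buffer:
        yield bytes(buffer[:chunk_size])
        del buffer[:chunk_size]


def decompress_chunks(chunks: Iterable[bytes]) -> bytes:
    """Receiver side (step 11): decompress the chunks in arrival order."""
    decompressor = zlib.decompressobj()
    parts = [decompressor.decompress(chunk) for chunk in chunks]
    parts.append(decompressor.flush())
    return b"".join(parts)


# Made-up query result: 1,000 pipe-delimited rows.
rows = [f"{i}|customer-{i}|{i * 10}\n".encode() for i in range(1000)]
chunks = list(compress_chunks(rows, chunk_size=512))
restored = decompress_chunks(chunks)
```

Streaming the compressor output, rather than compressing the whole result set first, keeps memory use bounded on the sender, which is one plausible reason to transfer in multiple chunks.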
![Page 43: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/43.jpg)
Data Refresh in SPO: How does it work?
[Diagram components: Excel Workbook in SharePoint Online; Gateway Cloud Service; Data Management Gateway]
(1) Excel workbook uploaded to SharePoint Online
(2) Click Data Refresh for Excel workbook
(3) Connects to Gateway Cloud Service
(4) Checks whether user is authorized to perform a refresh
(5) Sends command (SQL statement, connection string) to on-premises Data Management Gateway
(6) Sends SQL to SQL Server
(7) Returns results
(8) Efficiently transfers the results to the cloud service
(9) Returns data to Excel Workbook
![Page 44: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/44.jpg)
Data Management Gateway - OData
![Page 45: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/45.jpg)
Power BI for Office 365 | Capabilities
![Page 46: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/46.jpg)
Enable Deep Business and Customer Connections
Virtually Anytime, Anywhere
• Engage customers with smart, contextual mobile experiences
• Boost agility with real-time access to apps and data from anywhere
![Page 47: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/47.jpg)
Stay Productive on the Go
Deliver Familiar, Connected Experiences to a Mobile Workforce
…while ensuring enterprise security, manageability, and compliance
![Page 48: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/48.jpg)
Mobile BI Capabilities Available Today
Browser-based corporate BI solutions on iOS, Android and Windows:
• SharePoint Mobile enhancements
• PerformancePoint Services
• Excel Services
• SQL Server Reporting Services
“Ultimately, the new Microsoft mobile BI solution leads to more revenue for Recall
and gives us deeper customer insight, helping us stay ahead of our competitors.”
Source: “Recall Records Management Company Gets Real-Time BI, Boosts Sales with Mobile Solution” (case study).
![Page 49: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/49.jpg)
Excel Web App
![Page 50: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/50.jpg)
Excel Web App
Quick Explore
![Page 51: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/51.jpg)
Mobile-Friendly Apps for Office
![Page 52: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/52.jpg)
Power BI for Office 365 | Capabilities
![Page 53: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/53.jpg)
Tabular models for Power BI
![Page 54: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/54.jpg)
Datasources
![Page 55: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/55.jpg)
Creating & managing models in Power BI
![Page 56: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/56.jpg)
Power BI Tabular Model Architecture
[Diagram components: SSDT; Excel; Power BI Portal in O365; XMLA; REST; AS Instance (several); Reliable Persistent Storage (RPS); Gateway; On Prem SQL; External Data Sources (SQL Azure, HDInsight, Azure Tables)]
![Page 57: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/57.jpg)
Service Health Monitoring
At a glance view of the health of IT managed gateways
![Page 58: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/58.jpg)
Role of the IT Admin in Power BI
• Enabler of Self Service BI
• Varying levels of control across data sources, departments
• Oversight and monitoring of cloud data access
• Ability to make corporate data sources easier to discover, and easier to access
![Page 60: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/60.jpg)
Power BI Admin Portal & Data Management Gateway
Power BI Admin Center
![Page 61: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/61.jpg)
HDInsight, Polybase, and StreamInsight
![Page 62: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/62.jpg)
Key Trends
![Page 63: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/63.jpg)
Big Data Analytics
![Page 64: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/64.jpg)
What Is Big Data?
[Chart: Volume on one axis; Velocity, Variety, Variability on the other]
Volume scale: Gigabytes (10^9), Terabytes (10^12), Petabytes (10^15), Exabytes (10^18)
• ERP / CRM: Sales Pipeline, Payables, Payroll, Inventory, Contacts, Deal Tracking
• WEB 2.0: Mobile, Advertising, Collaboration, eCommerce, Digital Marketing, Search Marketing, Web Logs, Recommendations
• Internet of things: Audio / Video, Log Files, Text/Image, Social Sentiment, Data Market Feeds, eGov Feeds, Weather, Wikis / Blogs, Click Stream, Sensors / RFID / Devices, Spatial & GPS Coordinates
Storage cost per GB: 1980: $190,000; 1990: $9,000; 2000: $15; 2010: $0.07
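The storage prices on this slide can be turned into a quick worked comparison. The sketch below uses only the four price points shown and assumes decimal units (1 TB = 1,000 GB, in line with the powers of ten on the slide). The helper name `cost_of` is illustrative.

```python
# Price per gigabyte by year, as quoted on the slide (US dollars).
COST_PER_GB = {1980: 190_000.00, 1990: 9_000.00, 2000: 15.00, 2010: 0.07}

GB_PER_TB = 1_000  # assumption: decimal units


def cost_of(gigabytes: float, year: int) -> float:
    """Cost of storing the given number of gigabytes at that year's price."""
    return gigabytes * COST_PER_GB[year]


# Cost of one terabyte at each year's price.
cost_1tb = {year: cost_of(GB_PER_TB, year) for year in COST_PER_GB}

# How many times cheaper a gigabyte was in 2010 than in 1980.
price_drop = COST_PER_GB[1980] / COST_PER_GB[2010]

for year, cost in cost_1tb.items():
    print(f"{year}: ${cost:,.2f} per TB")
```

At the 1980 price a terabyte works out to $190 million; at the 2010 price it is about $70, a drop of more than a million-fold per gigabyte.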
![Page 65: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/65.jpg)
Modern Data Warehousing
![Page 66: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/66.jpg)
Hadoop Distributed Architecture
![Page 67: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/67.jpg)
MapReduce: Move Code to the Data
![Page 68: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/68.jpg)
So How Does It Work?
![Page 69: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/69.jpg)
HDInsight and Hadoop Ecosystem
[Diagram components: Distributed Storage (HDFS); Distributed Processing (MapReduce); Query (Hive); ODBC]
Legend: Red = Core Hadoop; Blue = Data processing; Gray = Microsoft integration points and value adds; Orange = Data Movement; Green = Packages
![Page 70: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/70.jpg)
[Diagram: MapReduce pipeline stages]
Record reader → Map → Combiner → Partitioner → Shuffle and sort → Reduce → Output format
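The stages named on this slide can be walked through with a single-process word count in Python. This is a teaching sketch, not Hadoop code: the function names, the two-reducer setting, and the CRC-based partitioner are assumptions chosen to keep the example deterministic and self-contained.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

NUM_REDUCERS = 2  # hypothetical; a real job configures this per cluster


def record_reader(split: str) -> Iterator[Tuple[int, str]]:
    """Record reader: turn an input split into (key, value) records."""
    for line_no, line in enumerate(split.splitlines()):
        yield line_no, line


def map_fn(_key: int, line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit (word, 1) for every word in the record."""
    for word in line.lower().split():
        yield word, 1


def combine(pairs: Iterable[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Combiner: pre-aggregate one mapper's output before it is shuffled."""
    local: Dict[str, int] = defaultdict(int)
    for word, count in pairs:
        local[word] += count
    return list(local.items())


def partition(key: str, num_reducers: int) -> int:
    """Partitioner: choose the reducer that receives this key."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def reduce_fn(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: sum all counts that arrived for one key."""
    return key, sum(values)


def run_job(splits: List[str], num_reducers: int = NUM_REDUCERS) -> Dict[str, int]:
    # Shuffle: group every mapper's combined output by reducer and by key.
    buckets: List[Dict[str, List[int]]] = [defaultdict(list) for _ in range(num_reducers)]
    for split in splits:  # one simulated mapper per split
        mapped = [pair for key, line in record_reader(split) for pair in map_fn(key, line)]
        for word, count in combine(mapped):
            buckets[partition(word, num_reducers)][word].append(count)
    # Sort and reduce: each reducer sees its keys in sorted order.
    output: Dict[str, int] = {}
    for bucket in buckets:
        for word in sorted(bucket):
            key, total = reduce_fn(word, bucket[word])
            output[key] = total
    return output


def output_format(result: Dict[str, int]) -> str:
    """Output format: render the results as tab-separated lines."""
    return "\n".join(f"{word}\t{count}" for word, count in sorted(result.items()))


splits = ["big data big insights", "data moves less when code moves to data"]
counts = run_job(splits)
print(output_format(counts))
```

The combiner is the stage most easily overlooked: it runs the reduce logic locally on each mapper's output, so less data crosses the network during the shuffle.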
![Page 71: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/71.jpg)
![Page 72: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/72.jpg)
MapReduce Summary
![Page 73: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/73.jpg)
Programming HDInsight
• Hive, Pig, Mahout, Cascading, Scalding, Scoobi, Pegasus…
• C#, F# Map/Reduce, LINQ to Hive, Microsoft .NET management clients
• JavaScript Map/Reduce, browser hosted console, Node.js management clients
• PowerShell, cross-platform CLI tools
![Page 74: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/74.jpg)
RDBMS vs. Hadoop
![Page 75: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/75.jpg)
Microsoft Hadoop Vision
Insights to all users by activating new types of data
![Page 76: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/76.jpg)
Polybase
SQL Server PDW querying HDFS data, in-situ
[Diagram labels: HDFS, DB]
![Page 77: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/77.jpg)
Polybase in PDW V2
(a) PDW query in, results out
(b) PDW query in, results stored in HDFS
[Diagram labels for each case: Hadoop, HDFS, DB]
![Page 78: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/78.jpg)
Native Query Across Hadoop and PDW
How to overcome the “impedance mismatch”
[Diagram labels: Unstructured data (Sensor & RFID, Web Apps, Social Apps, Mobile Apps), Hadoop; Structured data (traditional schema-based DW applications), RDBMS]
• Increasingly massive amounts of unstructured data are driven by new sources
• At the same time, organizations hold vast amounts of corporate data and data sources, which carry the bulk of their data analysis
• Polybase addresses this challenge for advanced data analytics by allowing native query across PDW and Hadoop, integrating structured and unstructured data
![Page 79: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/79.jpg)
Native Query Across Hadoop and PDW
Polybase Features in SQL Server PDW
• Querying data in Hadoop from PDW using regular SQL queries, including:
  • Full SQL query access to data stored in HDFS, represented as ‘external tables’ in PDW
  • Basic statistics support for data coming from HDFS
  • Querying across PDW and Hadoop tables (joining ‘on the fly’)
• Fully parallelized, high performance import of data from HDFS files into PDW tables
• Fully parallelized, high performance export of data in PDW tables into HDFS files
• Integration with various Hadoop distributions: Hadoop on Windows Server, Hortonworks and Cloudera
• Supporting Hadoop 1.0 and 2.0
![Page 80: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/80.jpg)
Native Query Across Hadoop and PDW
Creating “External Tables”
• Internal representation of data residing in Hadoop/HDFS (delimited text files only)
• High-level permissions required for creating external tables
  • ADMINISTER BULK OPERATIONS & ALTER SCHEMA
• Different from ‘regular SQL tables’: essentially read only (no DML support)
CREATE EXTERNAL TABLE table_name ({<column_definition>} [,...n ])
{WITH (LOCATION = '<URI>', [FORMAT_OPTIONS = (<VALUES>)])}
[;]
(1) CREATE EXTERNAL TABLE: indicates an “External” table
(2) LOCATION: required location of Hadoop cluster and file
(3) FORMAT_OPTIONS: optional format options associated with data import from HDFS
![Page 81: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/81.jpg)
Native Query Across Hadoop and PDW
Querying Unstructured Data
1. Querying data in HDFS and displaying results in table form (using external tables)
2. Joining data from HDFS with relational PDW data
Example: creating external table ‘ClickStream’ over a text file in HDFS with | as field delimiter:
CREATE EXTERNAL TABLE ClickStream (url varchar(50), event_date date, user_IP varchar(50))
WITH (LOCATION = 'hdfs://MyHadoop:5000/tpch1GB/employee.tbl',
FORMAT_OPTIONS (FIELD_TERMINATOR = '|'));
Query Examples
(1) Filter query against data in HDFS:
SELECT TOP 10 (url) FROM ClickStream WHERE user_IP = '192.168.0.1';
(2) Join data coming from files in HDFS (Url_Description is a second text file in HDFS):
SELECT url.description FROM ClickStream cs, Url_Description url
WHERE cs.url = url.name AND cs.url = 'www.cars.com';
(3) Join data from HDFS with a relational PDW table (Users is a distributed PDW table):
SELECT user_name FROM ClickStream cs, Users u
WHERE cs.user_IP = u.user_IP AND cs.url = 'www.microsoft.com';
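For readers without a PDW appliance, the shape of query examples 1 and 3 can be tried locally with Python's built-in `sqlite3` and `csv` modules. This is a rough stand-in under stated assumptions: the sample rows and user names are invented, and SQLite copies the delimited text into a temporary table, whereas Polybase reads the HDFS file in place. Only the query pattern (filter a delimited file, then join it to a relational table) carries over.

```python
import csv
import io
import sqlite3

# Hypothetical stand-in for a '|'-delimited text file stored in HDFS.
CLICKSTREAM_FILE = io.StringIO(
    "www.microsoft.com|2014-05-01|192.168.0.1\n"
    "www.cars.com|2014-05-02|192.168.0.2\n"
    "www.microsoft.com|2014-05-03|192.168.0.2\n"
)

conn = sqlite3.connect(":memory:")

# Relational side: plays the role of the distributed PDW table Users.
conn.execute("CREATE TABLE Users (user_name TEXT, user_IP TEXT)")
conn.executemany(
    "INSERT INTO Users VALUES (?, ?)",
    [("anna", "192.168.0.1"), ("patrick", "192.168.0.2")],
)

# "External table" side: parse the delimited text with the same column layout.
conn.execute("CREATE TEMP TABLE ClickStream (url TEXT, event_date TEXT, user_IP TEXT)")
conn.executemany(
    "INSERT INTO ClickStream VALUES (?, ?, ?)",
    csv.reader(CLICKSTREAM_FILE, delimiter="|"),
)

# Example 1: filter query against the file-backed data (LIMIT replaces TOP in SQLite).
filtered = conn.execute(
    "SELECT url FROM ClickStream WHERE user_IP = ? LIMIT 10",
    ("192.168.0.1",),
).fetchall()

# Example 3: join the file-backed data with the relational table.
joined = conn.execute(
    "SELECT u.user_name FROM ClickStream cs JOIN Users u ON cs.user_IP = u.user_IP "
    "WHERE cs.url = ? ORDER BY u.user_name",
    ("www.microsoft.com",),
).fetchall()

print(filtered)
print(joined)
```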
![Page 82: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/82.jpg)
Native Query Across Hadoop and PDW
Parallel Data Import from HDFS into PDW
• Persistently storing data from HDFS in PDW tables
• Fully parallelized via CREATE TABLE AS SELECT (CTAS) with external tables as source table and PDW tables (either distributed or replicated) as destination
• Retrieval of data in HDFS “on-the-fly”
CREATE TABLE ClickStream_PDW WITH (DISTRIBUTION = HASH(url))
AS SELECT url, event_date, user_IP FROM ClickStream;
[Diagram labels: Unstructured data (Sensor & RFID, Web Apps, Social Apps, Mobile Apps) in Hadoop; External Table; HDFS bridge; DMS Reader 1…N; Parallel HDFS Reads; Parallel Importing; Enhanced PDW query engine; CTAS Results; Structured data (traditional DW applications) in PDW]
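The `DISTRIBUTION = HASH(url)` clause in the CTAS statement assigns each row to a distribution based on a hash of its `url` value. The Python sketch below shows that idea. PDW's actual hash function and distribution count are not given in the slides, so `zlib.crc32` and `NUM_DISTRIBUTIONS = 8` are stand-in assumptions, and the sample rows are invented.

```python
import zlib
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

NUM_DISTRIBUTIONS = 8  # hypothetical; the real count depends on the appliance

Row = Tuple[str, str, str]  # (url, event_date, user_IP)


def distribution_for(key: str, num_distributions: int = NUM_DISTRIBUTIONS) -> int:
    """Map a distribution-key value to a distribution number (stand-in hash)."""
    return zlib.crc32(key.encode("utf-8")) % num_distributions


def distribute(rows: Sequence[Row], key_index: int = 0) -> Dict[int, List[Row]]:
    """Place every row in the distribution chosen by hashing its key column."""
    placement: Dict[int, List[Row]] = defaultdict(list)
    for row in rows:
        placement[distribution_for(row[key_index])].append(row)
    return dict(placement)


rows: List[Row] = [
    ("www.microsoft.com", "2014-05-01", "192.168.0.1"),
    ("www.cars.com", "2014-05-02", "192.168.0.2"),
    ("www.microsoft.com", "2014-05-03", "192.168.0.2"),
]
placement = distribute(rows)
```

Because equal key values always hash to the same distribution, rows that share a `url` are co-located, which lets joins and aggregations on that column run without moving data between nodes. A skewed key (one very popular `url`) would overload a single distribution, so the choice of hash column matters.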
![Page 83: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/83.jpg)
Native Query Across Hadoop and PDW
Parallel Data Export from PDW into HDFS
• Fully parallelized via CREATE EXTERNAL TABLE AS SELECT (CETAS) with external tables as destination table and PDW tables as source
• ‘Round-trip of data’ possible with first importing data from HDFS, joining it with relational data, and then exporting results back to HDFS
CREATE EXTERNAL TABLE ClickStream (url, event_date, user_IP)
WITH (LOCATION = 'hdfs://MyHadoop:5000/users/outputDir',
FORMAT_OPTIONS (FIELD_TERMINATOR = '|'))
AS SELECT url, event_date, user_IP FROM ClickStream_PDW;
[Diagram labels: Structured data (traditional DW applications) in PDW; Enhanced PDW query engine; CETAS Results; Parallel Reading; DMS Writer 1…N; HDFS bridge; Parallel HDFS Writes; External Table; HDFS data nodes; Unstructured data (Sensor & RFID, Web Apps, Social Apps, Mobile Apps)]
![Page 84: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/84.jpg)
In-Memory for big data analytics
Interactive Analytics over “Big Data”
• SQL Server Analysis Services scaled out to very large data volumes
• Sourced from “Big Data” sources, e.g.
  • Hadoop, Isotope, etc.
  • Enterprise data sources (SQL Server, Oracle, SAP, etc.)
• Built upon the In-Memory Analytics engine
  • In-memory, column-store, 10x compression
• Deployment vehicles: Box, Appliance, Cloud
• Customers:
  • Skype, Klout, Halo 4, UBS, AdCenter, Windows Update
[Diagram components: Excel, PV, 3rd party apps, tools, etc.; XMLA; Web services; GW; Mgmt (Deploy, Monitor); AS Instance (several); Reliable Persistent Storage; External Data Sources]
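The slide does not explain how the column store reaches its compression ratio. As a general illustration of why columnar layouts compress well, the sketch below applies dictionary encoding followed by run-length encoding to one column. Both are common column-store techniques, but their use here is an assumption for illustration, not a description of the Analysis Services engine; the sample column is made up.

```python
from itertools import groupby
from typing import List, Tuple


def dictionary_encode(column: List[str]) -> Tuple[List[str], List[int]]:
    """Replace each value with a small integer id into a sorted dictionary."""
    dictionary = sorted(set(column))
    index = {value: i for i, value in enumerate(dictionary)}
    return dictionary, [index[value] for value in column]


def run_length_encode(ids: List[int]) -> List[Tuple[int, int]]:
    """Collapse consecutive repeats into (id, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(ids)]


def decode(dictionary: List[str], runs: List[Tuple[int, int]]) -> List[str]:
    """Rebuild the original column from the dictionary and the runs."""
    return [dictionary[value] for value, length in runs for _ in range(length)]


# A low-cardinality column with long runs of repeated values.
column = ["Canada"] * 4 + ["France"] * 3 + ["Canada"] * 2
dictionary, ids = dictionary_encode(column)
runs = run_length_encode(ids)
```

Nine string values shrink to a two-entry dictionary plus three (id, length) pairs. Real ratios depend heavily on cardinality and sort order, so a figure like 10x should be read as typical rather than guaranteed.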
![Page 85: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/85.jpg)
StreamInsight
Managing Streaming Data In-Memory
Customer benefits
[Diagram labels: Input stream, Event, Output stream]
![Page 86: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/86.jpg)
Complete and Consistent Data Platform
![Page 87: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/87.jpg)
What Questions Do You Have?
![Page 88: SQL Server 2014 Faster Insights from Any Data](https://reader033.fdocuments.in/reader033/viewer/2022052823/55505fc4b4c90574428b5239/html5/thumbnails/88.jpg)
Thank You
for attending this session