UTAD - Jornadas de Informática - Potential of Big Data
Transcript of UTAD - Jornadas de Informática - Potential of Big Data
Potential of Big Data
Marco António Silva
Solution [email protected]
What is Big Data?
Some issues you already had to take care of
• Analysis
• Transportation
• Access Control
• Replication
• Storage
• Data Quality
New Generation of Data
The seventh edition of the EMC Digital Universe Study estimates that, by 2020, the amount of data in our digital universe will grow from 4.4 trillion GB to 44 trillion GB.
According to IBM, “2.5 exabytes - that's 2.5 billion gigabytes (GB) - of data was generated every day in 2012. That's big by anyone's standards. About 75% of data is unstructured, coming from sources such as text, voice and video.”
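As a quick sanity check on the figures quoted above, the unit conversions can be worked through in a few lines of Python. This is a minimal sketch that assumes decimal (SI) units, where 1 GB = 10^9 bytes; the variable names are illustrative only.

```python
# Decimal (SI) storage units, in bytes.
# Assumption: the quoted figures use SI units rather than binary (GiB) units.
GB = 10**9
EB = 10**18
ZB = 10**21

# IBM figure: 2.5 exabytes generated per day, expressed in gigabytes.
daily_gb = 2.5 * EB / GB

# EMC figure: 4.4 trillion GB growing to 44 trillion GB.
start_bytes = 4.4e12 * GB
end_bytes = 44e12 * GB
growth_factor = end_bytes / start_bytes

print(f"{daily_gb:,.0f} GB per day")
print(f"{start_bytes / ZB:.1f} ZB -> {end_bytes / ZB:.1f} ZB ({growth_factor:.0f}x)")
```

In these units the EMC figures correspond to 4.4 and 44 zettabytes, a tenfold increase.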
How big is BIG?
Connected “Things” by 2020: 26 billion (Gartner)
Market for IoT by 2020: $1.9 trillion (IDC)
Define Big Data
“Big Data as high volume, velocity and variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.” (Gartner)
The Five “V”s of Big Data
• Volume (Data at Rest): gigabytes, terabytes, exabytes of existing data to be stored and processed
• Velocity (Data in Motion): streaming data that requires fast analysis and response
• Variety (Data in Many Forms): relational, structured, unstructured, text, audio, video…
• Veracity (Data in Doubt): data inconsistency, incompleteness, ambiguity, latency, noise, errors…
• Value (Data into Money): new business models, insights and products can be created from the data
Turning Big Data into Value
Volume
Velocity
Variety
Veracity
Data Sources
• ERP
• CRM
• Inventory
• Finance
• Social Media
• Logs
• Video + Audio
• Sensors
• …
Analyse the Data
• Predictive Analysis
• Text Analysis
• Sentiment Analysis
• Image Processing
• Computer Vision
• Voice Analysis
• …
The Tools
HDInsight on Azure
• Based on Hortonworks Data Platform
• Available in Windows and Linux flavors
• Scale elastically
Reliable, scalable, distributed computing
HDInsight <academic_mode = “on” />
MapReduce is a framework for processing parallelizable problems across huge datasets using a large number of computers (nodes), collectively referred to as a cluster.
map (in_key, in_value) -> list(out_key, intermediate_value)
reduce (out_key, list(intermediate_value)) -> list(out_value)
MapReduce explained
Map
• Read lines from file
• Convert line to Key-Value Pair(s)
• Filter (by key/value)
• Combine Values with similar Keys
Shuffle and sort
• Shuffle data across nodes to the reducers, by Key
• Sort by Key
Reduce
• Aggregate (reduce)
• Filter (based on aggregated value)
• Write results to file
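The steps above can be sketched on a single machine in plain Python. This is only an illustration of the flow, not Hadoop code: the input lines, the function names and the filter rules are all hypothetical, and the final "write to file" step is replaced by a print.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter


def map_phase(lines, keep_key):
    """Map side: parse lines into (key, value) pairs, filter, then combine locally."""
    combined = defaultdict(int)
    for line in lines:
        for word in line.split():          # convert line to key-value pairs
            key, value = word.lower(), 1
            if keep_key(key):              # filter by key
                combined[key] += value     # combine values with the same key
    return list(combined.items())


def shuffle_and_sort(mapped_partitions):
    """Gather the pairs emitted by every mapper and order them by key."""
    pairs = [pair for partition in mapped_partitions for pair in partition]
    return sorted(pairs, key=itemgetter(0))


def reduce_phase(sorted_pairs, min_total):
    """Reduce side: aggregate per key, then filter on the aggregated value."""
    results = []
    for key, group in groupby(sorted_pairs, key=itemgetter(0)):
        total = sum(value for _, value in group)
        if total >= min_total:
            results.append((key, total))
    return results


# Two hypothetical input splits, each handled by its own mapper.
splits = [
    ["error disk full", "info job started"],
    ["error network down", "error disk full"],
]
mapped = [map_phase(split, keep_key=lambda k: k != "info") for split in splits]
result = reduce_phase(shuffle_and_sort(mapped), min_total=2)
print(result)
```

Combining on the map side reduces how many pairs have to be moved during the shuffle, which is why that step sits before the data crosses nodes.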
MapReduce Hello World
Input:
Deer Bear River
Car Car River
Deer Car Bear
Splitting:
• Split 1: Deer Bear River
• Split 2: Car Car River
• Split 3: Deer Car Bear
Mapping:
• Split 1: (Deer, 1) (Bear, 1) (River, 1)
• Split 2: (Car, 1) (Car, 1) (River, 1)
• Split 3: (Deer, 1) (Car, 1) (Bear, 1)
Shuffling:
• (Bear, 1) (Bear, 1)
• (Car, 1) (Car, 1) (Car, 1)
• (Deer, 1) (Deer, 1)
• (River, 1) (River, 1)
Reducing:
• (Bear, 2)
• (Car, 3)
• (Deer, 2)
• (River, 2)
Final result:
Bear, 2
Car, 3
Deer, 2
River, 2
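The same word count can be reproduced stage by stage in a few lines of Python. This is a single-process sketch of the splitting, mapping, shuffling and reducing stages, not a distributed job; the variable names are illustrative.

```python
from collections import defaultdict

text = "Deer Bear River\nCar Car River\nDeer Car Bear"

# Splitting: one input split per line.
splits = text.split("\n")

# Mapping: emit (word, 1) for every word in each split.
mapped = [[(word, 1) for word in split.split()] for split in splits]

# Shuffling: group all emitted values by key.
shuffled = defaultdict(list)
for pairs in mapped:
    for word, count in pairs:
        shuffled[word].append(count)

# Reducing: sum the values for each key, visiting keys in sorted order.
final_result = {word: sum(counts) for word, counts in sorted(shuffled.items())}
print(final_result)
```

On a real cluster each split would be mapped on a different node and the shuffle would move data over the network; the logic per stage stays the same.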
• Pig is a high level scripting language that is used with Apache Hadoop
• Excels at describing data analysis problems as data flows
• Is complete in that you can do all the required data manipulations with Pig
Pig knows Latin
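To give a feel for what "describing a problem as a data flow" means, here is a small sketch written in Python rather than Pig Latin. The records, field names and step functions are hypothetical; the point is only that the analysis reads as a chain of load, filter, group and aggregate steps.

```python
from collections import defaultdict

# Hypothetical records: (user, page, seconds_on_page).
records = [
    ("ana", "/home", 12),
    ("rui", "/home", 4),
    ("ana", "/docs", 40),
    ("rui", "/docs", 35),
    ("eva", "/home", 2),
]


def load(rows):
    """Step 1: load raw rows as dictionaries."""
    for user, page, seconds in rows:
        yield {"user": user, "page": page, "seconds": seconds}


def keep_long_visits(rows, min_seconds):
    """Step 2: filter out visits shorter than `min_seconds`."""
    return (row for row in rows if row["seconds"] >= min_seconds)


def group_by_page(rows):
    """Step 3: group rows by page."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["page"]].append(row)
    return groups


def total_seconds(groups):
    """Step 4: aggregate each group to a single value."""
    return {page: sum(r["seconds"] for r in rows) for page, rows in groups.items()}


# The flow is one transformation per step, applied in order.
totals = total_seconds(group_by_page(keep_long_visits(load(records), min_seconds=5)))
print(totals)
```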
Azure HDInsight
Applications (by cluster type)
• Hadoop: HDFS APIs, MapReduce, Sqoop, Pig, Hive (Tez), Mahout, Oozie
• Spark: Spark, Spark Streaming, Spark MLlib
• Storm: Storm, Kafka
• HBase: HBase, Zookeeper
• ….
Yet Another Resource Negotiator (YARN)
Windows Azure Blob Storage (WABS) Distributed File System
• Acquisition: Azure Data Factory
• Stream Processing: Stream Analytics, Event Hub
• Machine Learning: Azure Machine Learning
• NoSQL: Table Storage, DocumentDB
Cortana Intelligence Suite
Transform data into intelligent action
• Personal Digital Assistant: Cortana
• Perceptual Intelligence: Face, vision, speech, text
• Preconfigured Solutions
• Business Scenarios: Recommendations, customer churn, forecasting, etc.
• Dashboards and Visualizations: Power BI
• Machine Learning and Analytics
• Big Data Store
• Information Management
Cortana Intelligence Suite
Transform data into intelligent action
DATA
• Business apps
• Custom apps
• Sensors and devices
INTELLIGENCE
• Information Management: Azure Data Factory, Azure Data Catalog, Azure Event Hub
• Big Data Stores: Azure Data Lake store, Azure SQL Data Warehouse
• Machine Learning and Analytics: Azure Machine Learning, Azure HDInsight (Hadoop and Spark), Azure Stream Analytics, Azure Data Lake analytics service
ACTION
• People
• Automated Systems
Cortana Intelligence scenarios
EXAMPLE SOLUTIONS
• Sales and marketing: Customer Acquisition, Cross-sell and upsell, Loyalty programs, Marketing mix optimization
• Finance and risk: Fraud detection, Credit risk management
• Customer and channel: Lifetime customer value, Personalized offers, Product recommendation
• Operations and workforce: Pay for performance, Operational efficiency, Smart buildings, Predictive maintenance, Supply chain management
Azure Stream Analytics
Process real-time data in Azure using a simple SQL language
• Consumes millions of real-time events from Event Hub collected from devices, sensors, infrastructure, and applications
• Performs time-sensitive analysis using SQL-like language against multiple real-time streams and reference data
• Outputs to persistent stores, dashboards or back to devices
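To make "time-sensitive analysis" concrete, here is a minimal Python sketch of a tumbling-window average, a common kind of aggregation over event streams. It is not Stream Analytics query syntax and calls no Azure API; the event fields (`device`, `ts`, `temp`) and the function name are hypothetical.

```python
from collections import defaultdict


def tumbling_window_avg(events, window_seconds):
    """Average `temp` per device over fixed, non-overlapping time windows."""
    buckets = defaultdict(list)
    for event in events:
        # Align each timestamp to the start of the window it falls into.
        window_start = event["ts"] - event["ts"] % window_seconds
        buckets[(event["device"], window_start)].append(event["temp"])
    return {key: sum(vals) / len(vals) for key, vals in sorted(buckets.items())}


# Hypothetical sensor readings; `ts` is seconds since the start of the stream.
events = [
    {"device": "sensor-1", "ts": 3, "temp": 20.0},
    {"device": "sensor-1", "ts": 8, "temp": 22.0},
    {"device": "sensor-2", "ts": 9, "temp": 30.0},
    {"device": "sensor-1", "ts": 12, "temp": 26.0},
]
averages = tumbling_window_avg(events, window_seconds=10)
for (device, start), avg in averages.items():
    print(device, start, avg)
```

This version processes a finished list in one pass; a streaming engine would instead emit each window's result as soon as the window closes.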
Event sources: Point of Service Devices, Self Checkout Stations, Kiosks, Smart Phones, Slates/Tablets, PCs/Laptops, Servers, Digital Signs, Diagnostic Equipment, Remote Medical Monitors, Logic Controllers, Specialized Devices, Thin Clients, Handhelds, Security, POS Terminals, Automation Devices, Vending Machines, Kinect, ATM
Stream Analytics
Azure Data Factory
Fully managed service to support orchestration of data movement and processing
• Connect to relational or non-relational data that is on-premises or in the cloud
• Single pane of glass to monitor and manage data processing pipelines
• Publish to Power BI
Compose and orchestrate data services at scale
[Diagram labels: NoSQL, DB, Blob, C#, MapReduce, Hive, Pig, Stored Procedures, VM, Trusted data, BI & analytics]
Azure Machine Learning
Powerful predictive analytics in Azure
ML Algorithms are best of breed and embrace OSS
• MS + R + Python + BYOA
ML Studio for productive development
• Faster experiments result in faster improvements
• Visual Workflows & ML Experiments
ML Operationalization to remove deployment friction
• Build entire ML Apps & Deploy as Cloud APIs
ML Gallery
• Provide ML applications like apps in an ‘app store’
• Publish/consume APIs in a 2 sided market
Help organizations eliminate undifferentiated heavy lifting
Azure Data Catalog
Enable enterprise-wide self-service data source registration and discovery
A metadata repository that allows users to register, enrich, understand, discover, and consume data sources
Delivers differentiated value through:
• Data source discovery, rather than data discovery
• Support for data from any source: structured and unstructured, on premises and in the cloud
• Publishing, discovery and consumption through any tool
• Annotation crowdsourcing: empowering any user to capture and share their knowledge
This, while allowing IT to maintain control and oversight
Power BI
Excel BI Investments
Power Map with custom maps allows deeper geospatial explorations and storytelling
Power Query brings modern data discovery, connectivity, shaping and publishing to Excel
Analysis Services connectivity for Power View allows users to leverage existing IT investments
Support for more sophisticated data models in Power Pivot – date and calc tables, many-to-many relationships, etc
[Figures: Power Map w/ Custom Maps; Power Query]
Power BI investments
Power BI dashboards and KPIs for monitoring the health of your business
New data visualizations and touch-optimized exploration in HTML5
Power BI mobile apps across devices including iPad and iPhone
Support for new data sources including SalesForce.com, Dynamics CRM online and SQL Server Analysis Services
[Figures: Dashboard; Tree Map]
Introducing Azure Data Lake Store
A hyper scale repository for big data analytic workloads
• Hadoop File System compatible with HDFS™
• Integrated with HDInsight, Revolution R, Hortonworks, Cloudera
• Based on YARN
• Petabyte-sized files
• No size limits to data in single account
• Massive throughput to increase performance
• AAD based access control
• Data management
Azure Data Lake Analytics Service
A new distributed analytics service
• Built on Apache YARN
• Scales dynamically with the turn of a dial
• Pay by the query
• Supports Azure AD for access control, roles, and integration with on-prem identity systems
• Built with U-SQL to unify the benefits of SQL with the power of C#
• Processes data across Azure
Example overall data flow and Architecture
Stages: Event & data producers, Ingest, Transform, DW / Long-term storage, Predictive analytics, Present & decide
• Event & data producers: Web logs; IoT, Mobile Devices etc.; Social Data
• Services: Event Hubs, Stream Analytics, HDInsight, Azure Data Factory, Azure SQL DB, Azure SQL DW, Azure Data Lake, Azure Machine Learning (Fraud detection etc.)
• Present & decide: Power BI, Web dashboards, Mobile devices
How can I develop Faster?
Cortana Intelligence Preconfigured Solutions
Customer Churn
Product Recommendation
Sentiment Analysis
From zero to finished, analytical apps and scenarios
• Pre-Configured Solutions designed to help customers jumpstart the creation of analytics solutions
• Allows customers to accelerate the process of building analytical apps
• Go from zero to sample app in minutes, from sample app to finished solution in a week
Cognitive Services
Cortana Intelligence Gallery
What type of Problems can I Solve with these?
The Internet of Things – Manufacturing
GLOBAL OPERATIONS
• Management: I can see my production line status and recommend adjustments to better manage operational cost.
• Field Service: I know when to deploy the right resources for predictive maintenance to minimize equipment failures and reduce service cost.
• R&D: I gain insight into usage patterns from multiple customers and track equipment deterioration, enabling me to reengineer products for better performance.
MANUFACTURING PLANT
• Aggregate product data, customer sentiment, and other third-party syndicated data to identify and correct quality issues.
• Manage equipment remotely, using temperature limits and other settings to conserve energy and reduce costs.
• Monitor production flow in near-real time to eliminate waste and unnecessary work in process inventory.
GLOBAL FACILITY INSIGHT
• Implement condition-based maintenance alerts to eliminate machine down-time and increase throughput.
THIRD-PARTY LOGISTICS
• Provide cross-channel visibility into inventories to optimize supply and reduce shared costs in the value chain.
CUSTOMER SITE
• Transmits operational information to the partner (e.g. OEM) and to field service engineers for remote process automation and optimization.
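The condition-based maintenance alerts and temperature limits mentioned for the manufacturing scenario can be illustrated with a simple threshold rule. This is a minimal sketch, not part of any Azure service: the machine names, metrics and limits are hypothetical, and a real deployment would more likely use learned models than fixed thresholds.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    machine_id: str
    temperature_c: float
    vibration_mm_s: float


# Hypothetical limits; real values would come from the equipment vendor.
LIMITS = {"temperature_c": 80.0, "vibration_mm_s": 7.0}


def maintenance_alerts(readings):
    """Return (machine_id, metric, value) for every reading above its limit."""
    alerts = []
    for reading in readings:
        for metric, limit in LIMITS.items():
            value = getattr(reading, metric)
            if value > limit:
                alerts.append((reading.machine_id, metric, value))
    return alerts


readings = [
    Reading("press-01", 72.5, 3.1),
    Reading("press-02", 85.0, 2.9),
    Reading("lathe-07", 65.0, 9.4),
]
alerts = maintenance_alerts(readings)
for machine_id, metric, value in alerts:
    print(f"ALERT {machine_id}: {metric}={value}")
```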
The Internet of Things – Retail
IoT DATA FUELS CUSTOMER AND PRODUCT INSIGHTS
RIGHT OFFER, RIGHT TIME, RIGHT PLACE
• Customer journey: INSPIRATION, DISCOVERY, PRE-SHOPPING; MOBILE EXPERIENCE; IN-STORE SHOPPING; BEST DEAL; REFLECTION
• Data: Online Behavior, Purchase History (store purchase history: dog food), Weather Data, Shopping Route
• Roles: Marketing, Merchandizing, Retail
[Figure callouts: “200ft”; “Have you seen these!”; “We’re ready for the rain! #ShoppingSuccess”]
The Internet of Things – Hospitality & Travel
Save money with more accurate arrival time predictions
Provide a seamless traveler experience from the curb to the gate, and enable context-sensitive notifications
Provide guests with a connected tablet to control room settings, request services, and provide feedback—and save their preferences
Centrally manage critical station assets—everything from communication and security networks to escalators and HVAC control systems
Send reports and sensor data to maintenance crews for faster turnaround
Configure notifications on employee devices of restaurant equipment maintenance needs
Manage inventory in near real time, and monitor food storage temperatures and expirations
[Figure callouts: “NEW GATE: B7”; “25% off”; “ON TIME”]
How to Start?
Links and References
• Azure Portal
• https://azure.microsoft.com
• Learn
• http://channel9.msdn.com
• http://build.microsoft.com
• Try
• IoT Suite: https://www.azureiotsuite.com/
• Cognitive Services: https://www.microsoft.com/cognitive-services
• Cortana Analytics Suite: https://www.microsoft.com/en-us/server-cloud/cortana-intelligence-suite/
• …
© Microsoft Corporation. All rights reserved.