Microsoft Data Science Technologies 201505

Microsoft Technologies for Data Science. Mark Tabladillo, Ph.D., Senior Data Scientist, LogicBlox/Predictix

Transcript of Microsoft Data Science Technologies 201505


Networking

Interactive

Vision Analytics

Recommendation engines

Advertising analysis

Weather forecasting for business planning

Social network analysis

Legal discovery and document archiving

Pricing analysis

Fraud detection

Churn analysis

Equipment monitoring

Location-based tracking and services

Personalized Insurance

Machine learning & predictive analytics are core capabilities that are needed throughout your business

http://www.bizjournals.com/stlouis/subscriber-only/2014/07/25/largest-employers.html

Phrase | Goal
“Data Mining”, “Text Mining” | Inform actionable decisions
“Machine Learning” | Determine best performing algorithm
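The "determine best performing algorithm" goal can be illustrated with a minimal sketch: fit two candidate models on training data, score each on held-out data, and keep the one with the lower error. Everything here is an assumption made for illustration (the synthetic data, the two candidate models, the function names, and RMSE as the metric); none of it comes from the slides.

```python
import random

def fit_mean_baseline(xs, ys):
    """Candidate A: always predict the training mean of y."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_least_squares_line(xs, ys):
    """Candidate B: ordinary least-squares line y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def rmse(model, xs, ys):
    """Root mean squared error of a fitted model on (xs, ys)."""
    return (sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)) ** 0.5

def best_performing(candidates, train, test):
    """Fit every candidate on train, score on test, return (winner, scores)."""
    scores = {name: rmse(fit(*train), *test) for name, fit in candidates.items()}
    return min(scores, key=scores.get), scores

# Hypothetical data: a noisy linear trend, split into alternating train/test points.
rng = random.Random(0)
xs = [float(i) for i in range(40)]
ys = [3.0 * x + 5.0 + rng.gauss(0, 2.0) for x in xs]
train = (xs[::2], ys[::2])
test = (xs[1::2], ys[1::2])

candidates = {
    "mean baseline": fit_mean_baseline,
    "least-squares line": fit_least_squares_line,
}
winner, scores = best_performing(candidates, train, test)
print(winner, scores)
```

Scoring on held-out data rather than the training data is the design choice that matters: a model that merely memorizes its training set would otherwise look like the best performer.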

Analysis (science)

Synthesis (art)

GO

Science needs science fiction -- MarkTab


Magic Quadrant for Business Intelligence and Analytics Platforms

Retrieved from http://www.microstrategy.com/us/about-us/analyst-reviews/gartner-magic-quadrant

Magic Quadrant for Data Warehouse Database Management Systems

Retrieved from http://www.gartner.com/technology/reprints.do?id=1-1DU2VD4&ct=130131&st=sb, January 31, 2013

http://www.kdnuggets.com/polls/2014/analytics-data-mining-data-science-software-used.html

http://products.office.com/en-us/excel

http://www.microsoft.com/en-us/server-cloud/products/sql-server/

http://pytools.codeplex.com/

http://azure.microsoft.com/en-us/services/hdinsight/

http://www.revolutionanalytics.com/

SQL Server Data Mining: Analysis Services

http://sqlserverdatamining.com

SS

SQL

AS

NoSQL

Database Services: SQL Server*, SQL Azure*, Replication, SQL Azure Data Sync*, Full Text & Semantic Search*

Data Integration Services: Integration Services*, Master Data Services*, Data Quality Services*, StreamInsight*, Project “Austin”*

Analytical Services: Analysis Services*, Data Mining, PowerPivot*

Reporting Services: Reporting Services*, SQL Azure Reporting*, Report Builder, Power View*

Data Mining

SSMS, SSIS, PowerShell

http://sqlserverdatamining.com

Data mining add-in for business analysts

• Ease of use

• Rich data mining

• Scalable

What is Inside Semantic Search

SQL Server 2012 and higher

Logical Model

How Semantic Search Works

Rowset Output with Scores

Varchar, NVarchar, Office, PDF Documents

Full-Text Keyword Index “FTI”

iFilters

Semantic Document Similarity Index “DSI”

Semantic Database

Semantic Key Phrase Index (Tag Index “TI”)

Simplified Chinese

British English

Portuguese

Chinese (Hong Kong SAR, PRC)

Spanish

Chinese (Singapore)

Chinese (Macau SAR)

Full Text Keyword Index “FTI”

Semantic Key Phrase Index (Tag Index “TI”)

Semantic Document Similarity Index “DSI”

http://msdn.microsoft.com/en-us/library/gg492085.aspx#SemanticIndexing
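To make the roles of the Tag Index and the Document Similarity Index concrete, here is a conceptual sketch in Python. It is not SQL Server's algorithm: it assumes a plain TF-IDF weighting and cosine similarity as stand-ins, and the documents, function names, and weighting formula are hypothetical. In SQL Server itself, the semantic indexes are queried through rowset functions such as SEMANTICKEYPHRASETABLE and SEMANTICSIMILARITYTABLE.

```python
import math
import re
from collections import Counter

# Hypothetical documents standing in for indexed column values.
DOCS = {
    "doc1": "SQL Server full-text search indexes keywords in documents",
    "doc2": "Semantic search finds key phrases and similar documents in SQL Server",
    "doc3": "Weather forecasting supports business planning",
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def tfidf_vectors(docs):
    """Weight each term by term frequency times a smoothed inverse document frequency."""
    counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    doc_freq = Counter(term for c in counts.values() for term in c)
    n_docs = len(docs)
    return {
        name: {t: tf * (math.log(n_docs / doc_freq[t]) + 1.0) for t, tf in c.items()}
        for name, c in counts.items()
    }

def key_phrases(vector, k=3):
    """Analog of a tag index lookup: the k highest-weighted terms of one document."""
    return sorted(vector, key=vector.get, reverse=True)[:k]

def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b)

def similar_documents(name, vectors):
    """Analog of a similarity lookup: other documents ranked by score, best first."""
    scored = [(other, cosine(vectors[name], v)) for other, v in vectors.items() if other != name]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

vectors = tfidf_vectors(DOCS)
print(key_phrases(vectors["doc1"]))
print(similar_documents("doc1", vectors))
```

The ranked list of (document, score) pairs plays the role of the "rowset output with scores" named on the slide.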

Interactive Demo

SQL Server Management Studio

Semantic Search

Microsoft Data Mining

SQL Server Data Tools: data mining plus text mining

Microsoft Azure Machine Learning

Performance

The Million-Dollar Edge

Time in Seconds vs. Number of Documents

(2011 – K. Mukerjee, T. Porter, S. Gherman – Microsoft)

Video Intro

http://blogs.technet.com/b/machinelearning/archive/2014/09/17/extensibility-and-r-support-in-the-azure-ml-platform.aspx

Difference in Proportions Test

Lexicon Based Sentiment Analysis

Forecasting - Exponential Smoothing

Forecasting - ETS+STL

Forecasting - AutoRegressive Integrated Moving Average (ARIMA)

Normal Distribution Quantile Calculator

Normal Distribution Probability Calculator

Normal Distribution Generator

Binomial Distribution Probability Calculator

Binomial Distribution Quantile Calculator

Binomial Distribution Generator

Multivariate Linear Regression

Survival Analysis

Binary Classifier

Cluster Model

datamarket.azure.com
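For readers unfamiliar with the calculations behind some of the services listed above, the following sketch shows textbook versions of four of them using only the Python standard library. These are not the published services' implementations; the function names, parameters, and the choice of a pooled two-sided z-test and simple (single) exponential smoothing are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def difference_in_proportions_test(successes_a, n_a, successes_b, n_b):
    """Two-sided pooled z-test of H0: p_a == p_b. Returns (z, p_value)."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

def normal_quantile(p, mean=0.0, sd=1.0):
    """Value x such that P(X <= x) = p for X ~ Normal(mean, sd)."""
    return NormalDist(mean, sd).inv_cdf(p)

def normal_probability(x, mean=0.0, sd=1.0):
    """P(X <= x) for X ~ Normal(mean, sd)."""
    return NormalDist(mean, sd).cdf(x)

def binomial_probability(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: s[t] = alpha * y[t] + (1 - alpha) * s[t-1]."""
    smoothed = [series[0]]
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed

# Hypothetical inputs: 60 of 100 conversions versus 40 of 100.
z, p_value = difference_in_proportions_test(60, 100, 40, 100)
print(z, p_value)
print(normal_quantile(0.975))
print(binomial_probability(3, 10, 0.5))
print(exponential_smoothing([10.0, 12.0, 11.0, 15.0], alpha=0.5))
```

`NormalDist` and `math.comb` require Python 3.8 or later.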

 | Mutable | Immutable
Open Source | Java | Scala
.NET | C#, C++, VB.NET | F#, PowerShell
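The mutable versus immutable distinction can be shown in a few lines. This sketch uses Python rather than any of the languages on the slide, and the `ModelRun` class and its fields are hypothetical. It contrasts a list that changes in place through an alias with a frozen dataclass that must be copied to be "changed".

```python
from dataclasses import FrozenInstanceError, dataclass, replace

# Mutable: two names share one list, so a change through either is seen by both.
scores = [0.71, 0.64]
alias = scores
alias.append(0.90)

# Immutable: a frozen dataclass holding a tuple cannot be modified after creation.
@dataclass(frozen=True)
class ModelRun:
    name: str
    scores: tuple

run = ModelRun("baseline", (0.71, 0.64))

# "Updating" produces a new value and leaves the original untouched.
updated = replace(run, scores=run.scores + (0.90,))

mutation_blocked = False
try:
    run.name = "changed"  # assignment to a frozen instance raises
except FrozenInstanceError:
    mutation_blocked = True

print(scores, run, updated, mutation_blocked)
```

Immutable values are safe to share across threads or pipeline stages because no holder can change them underneath another, which is one reason functional languages favor them.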

http://channel9.msdn.com/posts/Erik-Meijer-Functional-Programming-From-First-Principles


HDInsight

http://www.kdd.org/

http://blogs.technet.com/b/machinelearning/

http://social.msdn.microsoft.com/forums/azure/en-US/home?forum=MachineLearning

http://sqlserverdatamining.com

http://marktab.net

http://curah.microsoft.com/342704/azure-machine-learning-videos-february-2015

Professional Association for SQL Server

http://www.sqlpass.org

PASS Data Science Virtual Chapter

http://datascience.sqlpass.org

http://www.inside-r.org/
