Leveraging Azure From R
Azure Spark and MPI Clusters from R
Doug Service
Stephen Weller
Daniel Hanson
July 3, 2016
Microsoft Machine Learning
Revolution Analytics
©Microsoft 2015 R/Finance 2016 1
Outline
1. Introduction
2. Azure
3. MPI Cluster
4. Portfolio Optimization Demo
Introduction
Goals
Leverage Azure compute clusters from R to solve compute- or data-parallel finance problems faster
1. Log in to Azure accounts with a $200 spending limit that you can use during and after the presentation
2. Run and review R demos on pre-configured R Server Spark and MPI compute clusters
Azure
Advantages
Eliminates the expense of buying, maintaining, and continually upgrading a data center. Only pay for the resources you use.
Microsoft facility in Quincy, Washington
Azure
Advantages
Build Spark, Hadoop, and MPI compute clusters in the Azure Portal, or with languages such as Bash, PowerShell, Node.js, or C#, then access them from R
Azure
Azure is a collection of integrated cloud services
• Compute - virtual machines (VMs)
  • Linux: Ubuntu, Red Hat, CentOS...
  • Windows: Windows Server, Windows Enterprise...
• Networking - connect VMs
  • Internal virtual network
  • Public IP address and domain name
• Database - deploy to VMs
  • Oracle, OrientDB, Redis, SQL Server, MySQL
• Data Analytics - pre-configured
  • HDInsight, Stream Analytics, Cloudera
• Storage
MPI Cluster
Four virtual machines
All nodes: desktop + worker
• Ubuntu Server 16.04
• Open message passing interface (OpenMPI)
• Open secure shell (OpenSSH)
• Network file system (NFS)
• R plus packages
Desktop node
• Ubuntu Mate Cloudtop desktop
• X remote desktop protocol (XRDP)
• Visual Studio Code editor
• Sublime Text 3 editor
MPI Cluster
R Packages
• foreach
• doMPI
• Rmpi
Gotchas
• rsh (ssh) must work reciprocally between all nodes; this requires both the public and private SSH key files on every node
• Development R scripts must be in the same location on all nodes; the best solution exports the working directory on the desktop node to the compute nodes via Network File System (NFS)
• The high-performance configuration uses a desktop in the cloud because of its high-speed network connections to the worker nodes
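A minimal sketch of how the three packages fit together, assuming foreach, doMPI, and Rmpi are installed on every node and that the script is started under an MPI launcher (for example `mpirun -n 4 Rscript script.R`; the script name and process count are illustrative). The function name `run_squares_on_mpi` and the squared-index workload are hypothetical stand-ins, not part of the workshop demos.

```r
# Sketch only: assumes foreach, doMPI, and Rmpi are installed on all nodes
# and that this script is launched under MPI (e.g. via mpirun).
run_squares_on_mpi <- function(n_tasks = 8) {
  # Bind the foreach operator locally so the function is self-contained.
  `%dopar%` <- foreach::`%dopar%`

  cl <- doMPI::startMPIcluster()        # attach to the workers started by MPI
  doMPI::registerDoMPI(cl)              # make %dopar% dispatch to those workers
  on.exit(doMPI::closeCluster(cl), add = TRUE)

  # Hypothetical workload: each task squares its index.
  foreach::foreach(i = seq_len(n_tasks), .combine = c) %dopar% {
    i^2
  }
}

# Not run here, because it needs a working MPI installation:
# print(run_squares_on_mpi(8))
# Rmpi::mpi.quit()   # shut MPI down cleanly at the end of the script
```

Keeping the script under the NFS-exported working directory means every worker can find it at the same path, which is the point of the second gotcha.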
Portfolio Optimization Demo
S&P 500 Portfolio Optimization
Algorithm
• Select top 30% of stocks in each S&P index sector (Industrials, Health Care, Information Technology, etc.)
• Form uniformly drawn random portfolios of 30 stocks
• Perform a minimum CVaR analysis on every portfolio
• Select the portfolio with the highest return
• Generate the efficient frontier for the highest-return portfolio
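A rough base-R sketch of the random-portfolio and selection steps, under loud assumptions: the returns are synthetic, the sector screen is omitted, and each portfolio is equal-weighted with a historical CVaR estimate instead of the minimum-CVaR optimization the demo performs. The names `returns`, `hist_cvar`, and `stats` are hypothetical.

```r
set.seed(42)                      # reproducible synthetic example
n_stocks  <- 150                  # assumed size of the screened universe
n_days    <- 250
n_port    <- 20                   # number of random portfolios
port_size <- 30                   # stocks per portfolio, as in the algorithm

# Synthetic daily returns standing in for the screened S&P 500 stocks.
returns <- matrix(rnorm(n_days * n_stocks, mean = 0.0005, sd = 0.01),
                  nrow = n_days,
                  dimnames = list(NULL, paste0("S", seq_len(n_stocks))))

# One column per portfolio: 30 stock indices drawn uniformly without replacement.
cmbs <- replicate(n_port, sort(sample(n_stocks, port_size)))

# Historical CVaR at level alpha: mean loss over the worst alpha share of days.
hist_cvar <- function(r, alpha = 0.05) {
  losses <- sort(-r, decreasing = TRUE)
  k <- max(1, floor(alpha * length(losses)))
  mean(losses[seq_len(k)])
}

# Evaluate every equal-weighted portfolio (stand-in for the optimization).
stats <- apply(cmbs, 2, function(idx) {
  port_ret <- rowMeans(returns[, idx])
  c(MeanReturn = mean(port_ret), CVaR = hist_cvar(port_ret))
})

# Pick the portfolio with the highest mean return.
best <- which.max(stats["MeanReturn", ])
```

Each column of `cmbs` is independent of the others, which is what makes the per-portfolio analysis easy to split across workers.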
S&P 500 Portfolio Optimization
Optimization Run Time

| Transport | Machines | Threads | Time (mins) | Script |
|---|---|---|---|---|
| None | 1 | 1 | 4.3162 | RunPortST.sh |
| MPI | 1 | 4 | 1.6641 | RunPortMT.sh |
| MPI | 4 | 1 | 1.4296 | RunPortMPI.sh |

RunAnalysis.sh - generates the analysis report
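The reported run times can be turned into speedup and parallel efficiency with a few lines of arithmetic. This sketch uses only the three times from the run-time table; treating "workers" as machines times threads is an assumption made here for illustration.

```r
# Run times in minutes, taken from the run-time table.
run_time <- c(RunPortST.sh = 4.3162, RunPortMT.sh = 1.6641, RunPortMPI.sh = 1.4296)

# Assumed worker count per run: machines x threads.
workers <- c(RunPortST.sh = 1, RunPortMT.sh = 4, RunPortMPI.sh = 4)

speedup    <- run_time[["RunPortST.sh"]] / run_time   # relative to single thread
efficiency <- speedup / workers                        # speedup per worker

print(round(data.frame(run_time, workers, speedup, efficiency), 3))
```

On these numbers the speedup is about 2.6x for four threads on one machine and about 3.0x for four machines, a parallel efficiency of roughly 65% and 75%.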
S&P 500 Portfolio Optimization
Demo Directory
/nfs/mpidemos/rfinance/RAzureCluster/demo/portfolioOptimization
Demo Files
• PortfolioMPI.R - portfolio optimization
• PortfolioMPIResults.R - generates optimization report
S&P 500 Portfolio Optimization
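Using foreach

    # Each foreach task processes one node's share of the stock combinations.
    eres <- foreach(cdx = 1:nnode, .packages = 'fPortfolio') %dopar% {
      # Get the combinations (one per column) for the current node.
      ncmbs <- cmbs[, rngs[cdx, 1]:rngs[cdx, 2], drop = FALSE]
      ret <- list()
      for (idx in seq_len(ncol(ncmbs))) {
        # Record the node-local combination with its minimum CVaR statistics.
        ret <- c(ret, list(list(Cmb = ncmbs[, idx],
                                Stats = calcMinCVaRPort(spxret.ts[, ncmbs[, idx]]))))
      }
      return(ret)
    }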
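The loop relies on `rngs`, a matrix whose row `cdx` holds the first and last column of `cmbs` assigned to node `cdx`; the slides do not show how it is built. One possible construction is sketched below, with `make_ranges` as a hypothetical helper and a sequential `lapply` standing in for the `foreach` call so the result shape can be inspected without a cluster.

```r
# Hypothetical helper: split n_items columns into n_chunks contiguous,
# near-equal ranges. Returns a matrix with one (start, end) row per chunk.
make_ranges <- function(n_items, n_chunks) {
  stopifnot(n_chunks >= 1, n_items >= n_chunks)
  ends   <- floor(seq_len(n_chunks) * n_items / n_chunks)  # last column per chunk
  starts <- c(1, head(ends, -1) + 1)                       # first column per chunk
  cbind(start = starts, end = ends)
}

rngs <- make_ranges(n_items = 10, n_chunks = 4)

# Sequential stand-in for the foreach result: one list of records per node.
eres <- lapply(seq_len(nrow(rngs)), function(cdx) {
  lapply(rngs[cdx, 1]:rngs[cdx, 2], function(j) list(Cmb = j))
})

# Flatten one level so every portfolio record sits in a single list.
flat <- unlist(eres, recursive = FALSE)
```

Because the ranges are contiguous and cover every column exactly once, flattening the per-node lists restores the original column order.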
S&P 500 Portfolio Optimization
Review portfolio optimization output