Hosting and Using “R” with Azure ML
November 18, 2014
Meetup Azure Machine Learning
http://www.meetup.com/New-York-Azure-Machine-Learning-Meetup/
Why host “R” in Azure ML
• R has great depth and breadth in many areas
• Very high value but not always easy to transition to a pluggable solution callable by other processes or services
• Can be combined with existing Azure ML modules (mix and match)
• OR simply host a working R solution as a web service
Common challenge
Focus on right hand side…
One Azure ML module to learn and use
More input ports than usual
Output ports for “data” and R device output
Adding additional “R” packages/scripts…
From script to running web service
Using the web service…
References
• https://marketbasket.cloudapp.net/
• https://datamarket.azure.com/dataset/amla/mba
• https://azureinfo.microsoft.com/CO-Azure-WBNR-FY15-11Nov-OperationalizingRasaWebService-ThankYou.html?aliId=8371905
• http://azure.microsoft.com/en-us/documentation/articles/machine-learning-r-csharp-web-service-examples/
Next meetup – will be mid-January 2015
• Will post slides and script from tonight to
• http://www.meetup.com/New-York-Azure-Machine-Learning-Meetup/files/
• Remaining slides have R script and illustrations.
R scripts used last night
• Loading and referencing an external R package
• Note: make sure to follow the steps on slide 8
# this is trivial and just used to show package load and test
install.packages("src/slam_0.1-32.zip", lib = ".", repos = NULL, verbose = TRUE)
install.packages("src/clue_0.3-48.zip", lib = ".", repos = NULL, verbose = TRUE)
install.packages("src/skmeans_0.2-6.zip", lib = ".", repos = NULL, verbose = TRUE)
library(skmeans, lib.loc = ".", verbose = TRUE)
# our package and libraries should be loaded up
# stuff <- packages.installed(skmeans)
# dataset1 <- maml.mapInputPort(1) # class: data.frame
# dataset2 <- maml.mapInputPort(2) # class: data.frame
# note: only 20*50 = 1000 values are drawn, so R recycles them to fill the 20 x 500 matrix
samp <- matrix(sample.int(1000, size = 20*50, replace = TRUE), nrow = 20, ncol = 500, dimnames = list(1:20, 1:500))
fit <- skmeans(samp, 5)
result <- data.frame(list(rownames(samp), fit$cluster), row.names = NULL)
colnames(result) <- c("sample row", "cluster")
print(result) # R console output
Simple kmeans cluster
mydata <- maml.mapInputPort(1) # get our data from the R script input module instead of inline – this is the web service input signature
# parse and structure the input data to become a dataframe for the clustering
data.split <- strsplit(mydata[1,1], ",")[[1]]
data.split <- sapply(data.split, strsplit, ";", simplify = TRUE)
data.split <- sapply(data.split, strsplit, ";", simplify = TRUE)
data.split <- as.data.frame(t(data.split))
data.split <- data.matrix(data.split)
data.split <- data.frame(data.split)
# K-Means Cluster Analysis
fit <- kmeans(data.split, mydata$k) # k-cluster solution
# get cluster means
aggregate(data.split,by=list(fit$cluster),FUN=mean)
# append cluster assignment
mydatafinal <- data.frame(t(fit$cluster))
n_col=ncol(mydatafinal)
colnames(mydatafinal) <- paste("V",1:n_col,sep="")
mydatafinal
maml.mapOutputPort("mydatafinal") # this will become the web service publishing port – i.e. what is returned – output must be a dataframe, passed by name as a string
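To try the clustering logic in a plain R session before hosting it, one option is to define local stand-ins for the two port functions. The sketch below is only that: the two stubs are hypothetical (they are not the Azure ML implementations), the input is the sample value from the slides, and the parsing is a simplified variant of the script above rather than the exact code.

```r
# Sample input in the same shape the web service receives (from the slides).
sample_input <- data.frame(
  value = "1; 3; 5; 6; 7; 7, 5; 5; 6; 7; 2; 1, 3; 7; 2; 9; 56; 6, 1; 4; 5; 26; 4; 23, 15; 35; 6; 7; 12; 1, 32; 51; 62; 7; 21; 1",
  k = 5,
  stringsAsFactors = FALSE
)

# Hypothetical local stubs (assumption): only for running outside Azure ML.
maml.mapInputPort <- function(port) sample_input
maml.mapOutputPort <- function(name) {
  out <- get(name, envir = parent.frame())  # look the variable up by name
  stopifnot(is.data.frame(out))             # the output must be a data frame
  print(out)
  invisible(out)
}

mydata <- maml.mapInputPort(1)

# Rows are separated by "," and values within a row by ";".
rows <- strsplit(mydata[1, 1], ",")[[1]]
values <- lapply(strsplit(rows, ";"), function(x) as.numeric(trimws(x)))
data.split <- as.data.frame(do.call(rbind, values))

set.seed(42)                         # k-means starts from random centers
fit <- kmeans(data.split, mydata$k)  # k-cluster solution

# One row, one column per input row, holding the cluster assignment.
mydatafinal <- data.frame(t(fit$cluster))
colnames(mydatafinal) <- paste("V", 1:ncol(mydatafinal), sep = "")
maml.mapOutputPort("mydatafinal")
```

Remove the two stubs and the sample_input definition before pasting the script into the Azure ML module, where the real port functions are provided.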
Input schema and sample data for kmeans
this is hosted in the “R” script module
mydata <- data.frame(value = "1; 3; 5; 6; 7; 7, 5; 5; 6; 7; 2; 1, 3; 7; 2; 9; 56; 6, 1; 4; 5; 26; 4; 23, 15; 35; 6; 7; 12; 1, 32; 51; 62; 7; 21; 1", k=5, stringsAsFactors=FALSE)
maml.mapOutputPort("mydata"); # this is the key as it wires the sample schema above to the downstream receiver (see the next illustration); the argument is the name of the data frame variable
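A caller has to pack its numbers into the same single string used in the sample schema: values in a row separated by ";" and rows separated by ",". A minimal sketch of that encoding follows; the helper name encode_rows and the small matrix are illustrative assumptions, not part of the talk.

```r
# Hypothetical helper (assumption): turn a numeric matrix into the
# "v; v; v, v; v; v" string format used by the sample schema.
encode_rows <- function(m) {
  row_strings <- apply(m, 1, function(r) paste(r, collapse = "; "))
  paste(row_strings, collapse = ", ")
}

m <- matrix(c(1, 3, 5,
              5, 5, 6), nrow = 2, byrow = TRUE)
value <- encode_rows(m)

# Same two-column shape as the sample schema: the packed string and k.
mydata <- data.frame(value = value, k = 2, stringsAsFactors = FALSE)

# Round trip: split on "," then ";" to recover the original numbers.
rows <- strsplit(value, ",")[[1]]
decoded <- do.call(rbind, lapply(strsplit(rows, ";"), as.numeric))
stopifnot(all(decoded == m))
```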
A model that is ready to be published as a web service. Note the publishing icons on the lower module's input and output ports.
Input Schema
Simple kmeans cluster script