Real time machine learning
1
Real-time Machine Learning
Vinoth Kannan
Intelligent software architecture using Modified Lambda architecture & Apache Mahout
SkillFactory 71
2
Agenda
• What is machine learning?
• The need for real-time machine learning
• What is the Lambda architecture?
• What is Mahout?
• How does a basic recommender engine work?
• Some use cases
3
What is machine learning?
4
Introduction: Machine Learning from Streaming Data
Machine learning from streaming data can mean either of:
• A model that considers recent history. Example: if it has been sunny and 30 degrees for the last two days, it is unlikely to be -10 degrees and snowing the next day.
• A model that is updatable. Example: a retail sales model that remains accurate as the business gets larger.
Don't they both mean the same thing?
5
Introduction: Machine Learning from Streaming Data
• Model that considers recent history: time-series prediction (example: weather)
• Model that is updatable: non-stationary data distributions (example: retail sales)
6
Introduction: Machine Learning from Non-stationary Data Distributions
Two ways to handle non-stationary data distributions:
• Incremental algorithms: machine learning algorithms that learn incrementally over the data.
• Batch algorithms: machine learning algorithms that are retrained periodically on a batch of data.
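The difference between the two approaches can be sketched with a deliberately tiny "model", a running mean. This is an illustration only, not something from the slides: the class name `IncrementalMean`, the function `batch_mean`, and the `daily_sales` figures are all made up for the example.

```python
from statistics import fmean


class IncrementalMean:
    """Incremental learner: folds in one observation at a time."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0

    def update(self, x: float) -> float:
        # Move the current estimate towards the new observation.
        self.count += 1
        self.mean += (x - self.mean) / self.count
        return self.mean


def batch_mean(history: list[float]) -> float:
    """Batch learner: recomputes the estimate from the full history."""
    return fmean(history)


daily_sales = [120.0, 135.0, 150.0, 160.0]  # hypothetical data

model = IncrementalMean()
for value in daily_sales:
    model.update(value)  # cheap update, no need to keep the history

incremental_estimate = model.mean
batch_estimate = batch_mean(daily_sales)  # periodic retraining over everything
```

Both estimates agree here; the trade-off is that the incremental version never has to store or rescan the history, while the batch version can be rerun with a different algorithm at any time.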
7
Introduction: The Challenge for the Best Big Data Technology
• Hadoop: batch processing system that can churn through huge volumes of data
• Storm: real-time complex event processing system that can process data streams
Wrong fight!
9
Hadoop + Storm = Real-time Big Data
It's a chance, not a challenge
Lambda Architecture!
10
Lambda Architecture: Overview
Speed layer
Serving layer
Batch layer
Lambda Architecture: Overview with description
Speed layer
• Only new data
• Compensates for the high latency of serving layer updates
• Batch layer overrides speed layer
Serving layer
• Loads and exposes the batch views for querying
• Random access to batch views
Batch layer
• Immutable, constantly growing datasets
• Batch views are computed from this raw dataset
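How the three layers interact can be sketched in a few lines of plain Python. This is a toy, in-memory illustration under assumed names (`master_dataset`, `compute_batch_view`, `SpeedLayer`, `query`); in a real deployment the batch view would come from Hadoop and the real-time view from Storm.

```python
from collections import Counter

# Batch layer: append-only master dataset of (user, item) click events.
master_dataset = [("alice", "tent"), ("bob", "tent"), ("alice", "stove")]


def compute_batch_view(events):
    """Recompute the batch view from scratch over the raw dataset."""
    return Counter(item for _user, item in events)


class SpeedLayer:
    """Holds only the events that arrived since the last batch run."""

    def __init__(self):
        self.realtime_view = Counter()

    def on_event(self, user, item):
        self.realtime_view[item] += 1

    def reset(self):
        # Called when a fresh batch view overrides the speed layer.
        self.realtime_view.clear()


def query(batch_view, realtime_view, item):
    """Serving-side query: merge the batch view with the real-time view."""
    return batch_view[item] + realtime_view[item]


batch_view = compute_batch_view(master_dataset)
speed = SpeedLayer()

# A new event is appended to the master dataset and also sent to the speed layer.
new_event = ("carol", "tent")
master_dataset.append(new_event)
speed.on_event(*new_event)
clicks_before_batch_run = query(batch_view, speed.realtime_view, "tent")

# Next batch run: the new batch view covers the event, so the speed layer is reset.
batch_view = compute_batch_view(master_dataset)
speed.reset()
clicks_after_batch_run = query(batch_view, speed.realtime_view, "tent")
```

The query returns the same count before and after the batch run; only the layer that supplies the most recent click changes.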
12
Basic Idea behind Lambda architecture
query = function(all data)
Nathan Marz, "Big Data: Principles and best practices of scalable realtime data systems"
13
Basic Idea behind Lambda
f(a_0 … a_n): perform some function over all the data, from the real-time data "0" to the history data "n"
Lambda Architecture: Real-time Big Data = Hadoop process + Storm process
Letting Hadoop process the history data makes the overall process faster
14
The Problem: Unanswered questions of Lambda architecture
Real-time Big Data = Batch process + Real-time process
Questions to be answered:
• How to define the boundary between the real-time and batch processes?
• How to synchronize the computation between the two systems?
• How to avoid gaps and overlaps?
• What algorithm to use?
• How to avoid failure and provide a fault tolerance mechanism?
Modified Lambda Architecture: Presentation Layer
• The presentation layer must aggregate the outputs of Storm and Hadoop
• The user will see the result of their events in less than 2 seconds
• Seamless merge between short-term and long-term data
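One possible way to get a seamless merge is to split the data at a single cutoff timestamp, so that every event belongs to exactly one side. The sketch below is an assumption made for illustration, not the design described in the slides; `Event`, `batch_count`, `realtime_count`, `presentation_count`, and the `cutoff` value are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp: int  # seconds since some epoch (hypothetical unit)
    item: str


def batch_count(events, item, cutoff):
    """Long-term part: events strictly before the cutoff (the Hadoop side)."""
    return sum(1 for e in events if e.item == item and e.timestamp < cutoff)


def realtime_count(events, item, cutoff):
    """Short-term part: events at or after the cutoff (the Storm side)."""
    return sum(1 for e in events if e.item == item and e.timestamp >= cutoff)


def presentation_count(events, item, cutoff):
    """Presentation layer: aggregate both outputs into one answer."""
    return batch_count(events, item, cutoff) + realtime_count(events, item, cutoff)


events = [
    Event(100, "tent"),
    Event(200, "tent"),
    Event(200, "stove"),
    Event(300, "tent"),
]
cutoff = 200  # assumed timestamp of the last completed batch run

total_tent = presentation_count(events, "tent", cutoff)
```

Because one side uses `<` and the other `>=` on the same cutoff, an event is never counted twice or dropped, wherever the cutoff falls.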
16
Machine Learning with Mahout
17
What is Mahout? Introduction
• Apache Software Foundation Java library
• Scalable "machine learning" library that runs mostly on Hadoop
• Currently Mahout mainly supports four use cases: recommendation, clustering, classification, and frequent itemset mining
• Core algorithms for clustering, classification, and batch-based collaborative filtering are implemented on top of Apache Hadoop using the map/reduce paradigm
18
Basic Recommender Algorithm: How it works
Today's focus: suggesting items to a user based on the current search
19
Basic Recommender Algorithm: Defining recommendation
Mahout implements a collaborative filtering framework
Two broad categories of recommender engine algorithms:
User-based
Recommends items by finding similar users. Harder to scale because of the dynamic nature of users.
Item-based
Calculates similarity between items and makes recommendations. Items usually don't change much, so the similarities can be calculated offline.
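The item-based variant can be sketched in plain Python as an offline similarity step followed by an online scoring step. This is a minimal illustration, not Mahout's implementation: the `ratings` data, the choice of cosine similarity, and all function names are assumptions made for the example.

```python
from math import sqrt

# Hypothetical preference data: user -> {item: preference value}
ratings = {
    "alice": {"tent": 5.0, "stove": 4.0},
    "bob": {"tent": 4.0, "stove": 5.0, "lantern": 3.0},
    "carol": {"stove": 2.0, "lantern": 5.0, "kayak": 4.0},
}


def item_vectors(ratings):
    """Invert user -> item ratings into item -> {user: rating}."""
    vectors = {}
    for user, prefs in ratings.items():
        for item, value in prefs.items():
            vectors.setdefault(item, {})[user] = value
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def item_similarities(ratings):
    """Offline step: similarity for every ordered pair of distinct items."""
    vectors = item_vectors(ratings)
    return {
        (i, j): cosine(vectors[i], vectors[j])
        for i in vectors
        for j in vectors
        if i != j
    }


def recommend(user, ratings, similarities, top_n=1):
    """Online step: score unseen items by similarity to the user's items."""
    seen = ratings[user]
    all_items = {item for prefs in ratings.values() for item in prefs}
    scores = {}
    for candidate in all_items - set(seen):
        scores[candidate] = sum(
            similarities[(candidate, item)] * value for item, value in seen.items()
        )
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:top_n]


similarities = item_similarities(ratings)  # could be precomputed in a batch job
alice_recommendations = recommend("alice", ratings, similarities)
```

Only `recommend` has to run at request time; the item-to-item similarities change slowly and can be refreshed offline.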
20
Basic Recommender Algorithm: Defining recommendation
User preference for an item:
• Likes something
• Doesn't like something
• Doesn't care
Safe to assume: 1 click = 1 like = uniform preference
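Under the "1 click = 1 like" assumption, preferences reduce to sets of liked items, and users can be compared by set overlap. The sketch below uses the Tanimoto (Jaccard) coefficient, a common choice for such boolean data, in a user-based recommender. The `clicks` log and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical click log: (user, item). Every click counts as one "like".
clicks = [
    ("alice", "tent"), ("alice", "stove"), ("alice", "tent"),
    ("bob", "tent"), ("bob", "stove"), ("bob", "lantern"),
    ("carol", "kayak"),
]


def boolean_preferences(clicks):
    """Collapse clicks into user -> set of liked items (no rating values)."""
    liked = defaultdict(set)
    for user, item in clicks:
        liked[user].add(item)
    return dict(liked)


def tanimoto(a, b):
    """Tanimoto coefficient: size of the intersection over size of the union."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def recommend_from_similar_users(user, liked):
    """User-based step: score unseen items by the similarity of users who liked them."""
    scores = defaultdict(float)
    for other, items in liked.items():
        if other == user:
            continue
        weight = tanimoto(liked[user], items)
        if weight == 0:
            continue  # users with nothing in common contribute no evidence
        for item in items - liked[user]:
            scores[item] += weight
    return sorted(scores, key=lambda item: (-scores[item], item))


liked = boolean_preferences(clicks)
alice_bob_similarity = tanimoto(liked["alice"], liked["bob"])
alice_suggestions = recommend_from_similar_users("alice", liked)
```

Repeated clicks on the same item collapse into a single like, which is exactly the uniform-preference assumption.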
Mahout Library of Algorithms: Lots of algorithms to choose from
Use Cases: Real-time Machine Learning
eCommerce
Objective: Increase sales revenue
• Match potential customers to the right product
• Personalise user experience on web and email
• Customer lifecycle management
Use Cases: Real-time Machine Learning
Financial Services
Objective: Real-time fraud detection
• Compute patterns/predictors for individual customers
• Classify and cluster customers and recalculate patterns and predictors
• Set thresholds across all data
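A per-customer pattern combined with a global threshold could look like the sketch below. It is a simplified assumption for illustration, not a production fraud model: it keeps a running mean and standard deviation per customer (Welford's online algorithm) and flags amounts that deviate by more than a fixed number of standard deviations. `Z_THRESHOLD`, `MIN_HISTORY`, and all names are hypothetical.

```python
from collections import defaultdict
from math import sqrt


class RunningStats:
    """Welford's online algorithm for the mean and variance of a stream."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def stddev(self):
        return sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0


Z_THRESHOLD = 3.0  # hypothetical global threshold, set across all data
MIN_HISTORY = 5    # hypothetical warm-up before a customer is scored

profiles = defaultdict(RunningStats)  # one pattern per customer


def score_transaction(customer, amount):
    """Return True if the amount deviates strongly from this customer's pattern."""
    stats = profiles[customer]
    suspicious = False
    if stats.n >= MIN_HISTORY and stats.stddev() > 0:
        z = abs(amount - stats.mean) / stats.stddev()
        suspicious = z > Z_THRESHOLD
    if not suspicious:
        # Design choice of this sketch: flagged amounts stay out of the profile.
        stats.update(amount)
    return suspicious


history = [20.0, 25.0, 22.0, 19.0, 24.0, 21.0]
flags = [score_transaction("alice", amount) for amount in history]
large_purchase_flagged = score_transaction("alice", 500.0)
```

Each transaction is scored and learned from in constant time, which is what makes this kind of check feasible on a live stream.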
Use Cases: Real-time Machine Learning
Media
Objective: Generating metadata
• Video/audio/text analysis
• Find patterns/clusters for people, places, products, things
Use Cases: Real-time Machine Learning
Carbookplus
Objective: Generating metadata
• Match potential trips to the right destination
• Recommend the best gas station
• Recommend contacts whom the user might know
• Match the right advertisers to customers based on vehicle needs
26
Summary
• Ability to create real-time systems based on the Lambda architecture
• Usefulness of predictive algorithms
• Reason to concentrate on real-time predictions
Further reading:
http://storm-project.net/
http://mahout.apache.org/
http://hadoop.apache.org/
27
Thank You