Introduction to Machine Learning
BITS C464/BITS F464
Navneet Goyal
Department of Computer Science, BITS-Pilani, Pilani Campus, India
Machine Learning Humour
Source – http://www.kdnuggets.com/2012/12/machine-learning-data-mining-humor.html
Source - http://diegoferrin.wordpress.com
Introduction
Related Fields
• Artificial Intelligence
• Statistics
• Data Mining
Machine Learning Humour
What is the difference between statistics, machine learning, AI and data mining?
• If there are up to 3 variables, it is statistics.
• If the problem is NP-complete, it is machine learning.
• If the problem is PSPACE-complete, it is AI.
• If you don't know what is PSPACE-complete, it is data mining.
Source – http://www.kdnuggets.com/2012/12/machine-learning-data-mining-humor.html
What is Machine Learning?
• From machines that DO to machines that LEARN: a shift in paradigm!
• Machines can be made to learn!
• How, and for what purpose?
• How? By writing algorithms!
• Purpose: mainly to predict and to take decisions!
Types of Learning
• Supervised
• Unsupervised
• Semi-supervised
• Reinforcement
• Active
• Deep
Introduction
• Zoologists study learning in animals
• Psychologists study learning in humans
• In this course, we focus on “Learning in Machines”
Course Objective
• Study of approaches and algorithms that can make a machine learn
Introduction
Machine Learning
• Subarea of AI that is concerned with algorithms/programs that can make a machine learn
• Improve automatically with experience
• For example: doctors learning from experience
• Imagine computers learning from medical records and suggesting treatment (automated diagnosis & prescription)
Machine Learning
A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
Interesting Problems
• Speech and Handwriting Recognition
• Robotics (training moving robots)
• Search Engine (context aware)
• Learning to drive an autonomous vehicle
• Medical Diagnosis
• Detecting credit card fraud
• Computational Bioinformatics
• Game Playing
What is Machine Learning?
• To solve a problem, we need an algorithm!
• For example: sorting a list of numbers
  Input: list of numbers
  Output: sorted list of numbers
• For some tasks, like filtering spam mails
  Input: an email
  Output: Y/N
• We do not know how to transform Input to Output
• Definition of spam changes with time and from one individual to another
• What to DO?
Reference: E Alpaydin’s Machine Learning Book, 2010 (MIT Press)
What is Machine Learning?
• Collect lots of emails (both genuine and spam)
• “Learn” what constitutes a spam mail (or, for that matter, a genuine mail)
• Learn from DATA!!
• For many similar problems, we may not have algorithm(s), but we do have example data (called Training Data)
• Ability to process training data has been made possible by advances in computer technology
Reference: E Alpaydin’s Machine Learning Book, 2010 (MIT Press)
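The idea of learning a spam filter from example emails instead of hand-writing rules can be sketched as below. This is only an illustration: the slides do not name a learning method, so a naive Bayes word-count model with add-one smoothing is assumed here, and the four training emails, the labels "spam"/"genuine", and the function names train and classify are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy training data: (email text, label) pairs.
TRAINING_DATA = [
    ("win money now", "spam"),
    ("cheap money offer", "spam"),
    ("meeting agenda for monday", "genuine"),
    ("project report attached", "genuine"),
]

def train(examples):
    """Count word occurrences per label and the number of emails per label."""
    word_counts = {"spam": Counter(), "genuine": Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Return the label with the higher naive Bayes log-score."""
    vocabulary = set(word_counts["spam"]) | set(word_counts["genuine"])
    total_emails = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in ("spam", "genuine"):
        total_words = sum(word_counts[label].values())
        # Start from the log prior: the fraction of training emails with this label.
        score = math.log(label_counts[label] / total_emails)
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

word_counts, label_counts = train(TRAINING_DATA)
print(classify("cheap money", word_counts, label_counts))      # spam
print(classify("project meeting", word_counts, label_counts))  # genuine
```

Nothing in classify encodes what spam "is"; changing the training data changes the behaviour, which is how such a filter can follow a definition of spam that shifts over time or between individuals.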
What is Machine Learning?
• Face Recognition!!!
• We humans are so good at it!!!
• Ever thought how we do it, despite different light conditions, pose, hair style, make-up, glasses, ageing, etc.?
• Since we do not know how we do it, we cannot write a program to do it
• ML is about making inference from a sample
Reference: E Alpaydin’s Machine Learning Book, 2010 (MIT Press)
Machine Learning Applications
What kind of data would I require for learning?
• Credit card transactions
• Face Recognition
• Spam filter
• Handwriting/Character Recognition
Handwriting Recognition
• Task T: recognizing and classifying handwritten words within images
• Performance measure P: percent of words correctly classified
• Training experience E: a database of handwritten words with given classifications
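The performance measure P for this task is simple enough to write down directly. A minimal sketch, assuming predictions and true classifications are given as two equal-length lists of words; the function name percent_correct and the four example words are hypothetical.

```python
def percent_correct(predicted, actual):
    """Performance measure P: percentage of words classified correctly."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one example")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)

# Hypothetical classifications for four handwritten word images.
actual = ["cat", "dog", "sun", "map"]
predicted = ["cat", "dog", "sum", "map"]
print(percent_correct(predicted, actual))  # prints 75.0
```

In the terms of the definition, the program "learns" if this number goes up as the database of labelled words (experience E) grows.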
Handwriting Recognition
Pattern Recognition Example: Handwriting Digit Recognition
Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, 2006 Springer
Pattern Recognition Example: Handwriting Digit Recognition
• Non-trivial problem due to variability in handwriting
• What about using handcrafted rules or heuristics for distinguishing the digits based on shapes of strokes?
• Not such a good idea!!
  Proliferation of rules
  Exceptions to rules, and so on…
• Adopt an ML approach!!
Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, 2006 Springer
Pattern Recognition Example: Handwriting Digit Recognition
• Each digit is represented by a 28x28 pixel image
• Can be represented by a vector of 784 real numbers
• Objective: to have an algorithm that will take such a vector as input and identify the digit it represents
• Take images of a large number of digits (N): the training set
• Use the training set to tune the parameters of an adaptive model
• Each digit in the training set has been identified by a target vector t, which represents the identity of the corresponding digit
• Result of running an ML algorithm can be expressed as a function y(x), which takes a new digit image x as input and outputs a vector y
• Vector y is encoded in the same way as t
• The form of y(x) is determined through the learning (training) phase
Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, 2006 Springer
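The data representation described above can be sketched as follows. The slides say only that t "represents the identity" of the digit; a one-hot encoding over the ten digits is assumed here for illustration, and the names flatten, one_hot and decode are hypothetical. The learned function y(x) itself is not shown.

```python
IMAGE_SIDE = 28    # each digit image is 28x28 pixels
NUM_CLASSES = 10   # digits 0..9

def flatten(image):
    """Turn a 28x28 grid of pixel values into a 784-element vector x."""
    if len(image) != IMAGE_SIDE or any(len(row) != IMAGE_SIDE for row in image):
        raise ValueError("expected a 28x28 image")
    return [float(pixel) for row in image for pixel in row]

def one_hot(digit):
    """Target vector t (assumed one-hot): 1.0 at the digit's index, 0.0 elsewhere."""
    if not 0 <= digit < NUM_CLASSES:
        raise ValueError("digit must be between 0 and 9")
    t = [0.0] * NUM_CLASSES
    t[digit] = 1.0
    return t

def decode(y):
    """Read a digit from an output vector y encoded like t: index of the largest entry."""
    return max(range(len(y)), key=lambda k: y[k])

blank = [[0] * IMAGE_SIDE for _ in range(IMAGE_SIDE)]
print(len(flatten(blank)))  # prints 784
print(one_hot(3))
print(decode(one_hot(7)))   # prints 7
```

Because y is encoded in the same way as t, the same decode step applies to a model's output and to the training targets.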
Pattern Recognition Example
Generalization
• The ability to categorize correctly new examples that differ from those in training
• Generalization is a central goal in pattern recognition
Preprocessing
• Input variables are preprocessed to transform them into some new space of variables where it is hoped that the problem will be easier to solve (see fig.)
• Images of digits are translated and scaled so that each digit is contained within a box of fixed size. This reduces variability.
• Preprocessing stage is also referred to as feature extraction
• New test data must be preprocessed using the same steps as training data
Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, 2006 Springer
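Translating and scaling a digit into a box of fixed size can be sketched as below. This is one simple way to do it, not necessarily the one used for the figures in the reference: the digit is cropped to the bounding box of its non-zero pixels and resampled with nearest-neighbour lookup, without preserving aspect ratio. The names bounding_box and normalize and the default box size of 8 are hypothetical.

```python
def bounding_box(image):
    """Return (top, left, bottom, right) of the non-zero pixels, inclusive."""
    rows = [r for r, row in enumerate(image) if any(row)]
    cols = [c for c in range(len(image[0])) if any(row[c] for row in image)]
    if not rows:
        raise ValueError("image has no ink")
    return rows[0], cols[0], rows[-1], cols[-1]

def normalize(image, size=8):
    """Crop to the ink's bounding box, then rescale to size x size."""
    top, left, bottom, right = bounding_box(image)
    height = bottom - top + 1
    width = right - left + 1
    # Nearest-neighbour resampling: each output pixel copies one source pixel.
    return [
        [image[top + (r * height) // size][left + (c * width) // size]
         for c in range(size)]
        for r in range(size)
    ]

def square_at(row, col, side, canvas=10):
    """Hypothetical helper: a filled square drawn on an empty canvas."""
    return [[1 if row <= r < row + side and col <= c < col + side else 0
             for c in range(canvas)] for r in range(canvas)]

# The same shape at different positions and sizes maps to the same result.
print(normalize(square_at(1, 1, 2), 4) == normalize(square_at(5, 4, 3), 4))  # prints True
```

The same normalize function, with the same size, has to be applied to new test images as well, since the model only ever sees preprocessed inputs.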
Linear Classifiers in High-Dimensional Spaces
[Figure: data plotted on axes Var1 and Var2; find a function of x to map it to a different space with axes Constructed Feature 1 and Constructed Feature 2]
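A small sketch of the idea in the figure, under assumed specifics: points inside and outside a unit circle cannot be separated by a straight line in the (Var1, Var2) plane, but after squaring each variable a single linear rule separates them. The mapping, weights, bias and class names are hypothetical choices for illustration; the figure does not specify them.

```python
def construct_features(var1, var2):
    """Hypothetical mapping to the new space: square each input variable."""
    return (var1 ** 2, var2 ** 2)

def linear_classifier(features, weights=(1.0, 1.0), bias=-1.0):
    """Linear decision rule in the constructed space: sign of w.f + b."""
    score = sum(w * f for w, f in zip(weights, features)) + bias
    return "outside" if score > 0 else "inside"

# Points relative to the unit circle var1^2 + var2^2 = 1.
for point in [(0.5, 0.5), (0.0, -0.3), (1.0, 1.0), (-1.2, 0.0)]:
    print(point, linear_classifier(construct_features(*point)))
```

The classifier stays linear; all of the non-linearity lives in the constructed features.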
A word about Preprocessing!!
Preprocessing
• Can also speed up computations
• For example: face detection in a high-resolution video stream
• Find useful features that are fast to compute and yet also preserve useful discriminatory information, enabling faces to be distinguished from non-faces
• Average value of image intensity in a rectangular sub-region can be evaluated extremely efficiently, and a set of such features is very effective in fast face detection
• Since such features are smaller in number than the number of pixels, this is referred to as a form of Dimensionality Reduction
• Care must be taken so that important information is not discarded during preprocessing
Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, 2006 Springer
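One common way to make the average intensity of a rectangular sub-region cheap to evaluate is a summed-area table (integral image). The slides do not name the technique, so this is an assumption, and the function names integral_image and region_mean are hypothetical. After one pass over the image, the sum over any rectangle needs only four table lookups.

```python
def integral_image(image):
    """Summed-area table with an extra zero row and column.

    table[r][c] holds the sum of image[0..r-1][0..c-1].
    """
    height, width = len(image), len(image[0])
    table = [[0] * (width + 1) for _ in range(height + 1)]
    for r in range(height):
        row_sum = 0
        for c in range(width):
            row_sum += image[r][c]
            table[r + 1][c + 1] = table[r][c + 1] + row_sum
    return table

def region_mean(table, top, left, bottom, right):
    """Average intensity over rows top..bottom and columns left..right (inclusive)."""
    total = (table[bottom + 1][right + 1] - table[top][right + 1]
             - table[bottom + 1][left] + table[top][left])
    area = (bottom - top + 1) * (right - left + 1)
    return total / area

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
table = integral_image(image)
print(region_mean(table, 0, 0, 2, 2))  # prints 5.0
print(region_mean(table, 1, 1, 2, 2))  # prints 7.0
```

A handful of such region averages is a far shorter description of an image than its raw pixels, which is the sense in which these features act as dimensionality reduction.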