Machine learning
MachineLearning
Andrea Iacono https://github.com/andreaiacono/MachineLearning
Machine Learning: Intro
What is Machine Learning?
[Wikipedia]: a branch of artificial intelligence that allows the construction and the study of systems that can learn from data
Some approaches:
- Regression analysis
- Similarity and metric learning
- Decision tree learning
- Association rule learning
- Artificial neural networks
- Genetic programming
- Support vector machines (classification and regression analysis)
- Clustering
- Bayesian networks
Supervised learning vs Unsupervised learning
Machine learning vs Data mining
Machine Learning: Regression analysis
Regression Analysis
A statistical technique for estimating the relationship between a dependent variable and one or more independent variables
Prediction of house prices
Size (x)   Price (y)
0.80       70
0.90       83
1.00       74
1.10       93
1.40       89
1.40       58
1.50       85
1.60       114
1.80       95
2.00       100
2.40       138
2.50       111
2.70       124
3.20       172
3.50       172
Hypothesis:
\[ h_\theta(x) = \theta_0 + \theta_1 x \]
Cost function for linear regression:
\[ J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 \]
Gradient Descent
repeat until convergence:
\[ \theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \]
\[ \theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left[ \left( h_\theta(x^{(i)}) - y^{(i)} \right) x^{(i)} \right] \]
Iterative minimization of cost function with gradient descent
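A minimal Python sketch of the gradient descent loop for the house price data. The learning rate, the iteration count and the starting point (0, 0) are assumptions chosen for illustration, not values from the talk; both parameters are updated simultaneously.

```python
# House price data from the table: size (x) and price (y).
sizes = [0.80, 0.90, 1.00, 1.10, 1.40, 1.40, 1.50, 1.60,
         1.80, 2.00, 2.40, 2.50, 2.70, 3.20, 3.50]
prices = [70, 83, 74, 93, 89, 58, 85, 114,
          95, 100, 138, 111, 124, 172, 172]


def hypothesis(theta0, theta1, x):
    """h(x) = theta0 + theta1 * x"""
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    """J(theta0, theta1) = 1/(2m) * sum of squared errors."""
    m = len(xs)
    return sum((hypothesis(theta0, theta1, x) - y) ** 2
               for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(xs, ys, alpha=0.1, iterations=5000):
    """alpha and iterations are illustrative assumptions."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        errors = [hypothesis(theta0, theta1, x) - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update of both parameters.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


theta0, theta1 = gradient_descent(sizes, prices)
print("theta0 =", theta0, "theta1 =", theta1)
print("predicted price for size 2.0:", hypothesis(theta0, theta1, 2.0))
```

A fixed number of iterations stands in for "repeat until convergence"; a stricter version could stop when the cost changes by less than a small tolerance.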
Hands on
Regression analysis
- one / multiple variables
- linear / higher order curves
- several optimization algorithms
  - linear regression
  - logistic regression
  - simulated annealing
  - ...
Overfitting vs underfitting
Machine Learning: Similarity and metric learning
Similarity and metric learning
- concept of distance
Euclidean distance
\[ \text{euclidean distance}(p, q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2} \]
Manhattan distance
\[ \text{manhattan distance}(p, q) = \sum_{i=1}^{n} |p_i - q_i| \]
Pearson's correlation
\[ \text{Pearson's correlation}(p, q) = \frac{\sum_{i=1}^{n} p_i q_i - \frac{\sum_{i=1}^{n} p_i \sum_{i=1}^{n} q_i}{n}}{\sqrt{\left( \sum_{i=1}^{n} p_i^2 - \frac{\left( \sum_{i=1}^{n} p_i \right)^2}{n} \right) \left( \sum_{i=1}^{n} q_i^2 - \frac{\left( \sum_{i=1}^{n} q_i \right)^2}{n} \right)}} \]
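The three measures can be written directly from their definitions. This is a small Python sketch; the function names are illustrative, and returning 0.0 when one vector has no variance is an assumption (the formula is undefined in that case).

```python
import math


def euclidean_distance(p, q):
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


def manhattan_distance(p, q):
    return sum(abs(pi - qi) for pi, qi in zip(p, q))


def pearson_correlation(p, q):
    n = len(p)
    sum_p, sum_q = sum(p), sum(q)
    sum_pq = sum(pi * qi for pi, qi in zip(p, q))
    sum_p2 = sum(pi ** 2 for pi in p)
    sum_q2 = sum(qi ** 2 for qi in q)
    numerator = sum_pq - sum_p * sum_q / n
    spread = (sum_p2 - sum_p ** 2 / n) * (sum_q2 - sum_q ** 2 / n)
    if spread <= 0:
        # Assumption: a constant vector is treated as uncorrelated.
        return 0.0
    return numerator / math.sqrt(spread)


print(euclidean_distance([0, 0], [3, 4]))        # 5.0
print(manhattan_distance([0, 0], [3, 4]))        # 7
print(pearson_correlation([1, 2, 3], [2, 4, 6])) # 1.0
```

Note that the two distances get smaller as items get more similar, while Pearson's correlation gets larger (up to 1).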
Collaborative filtering
Searches a large group of users to find a small subset whose tastes are like yours. Based on what this subset likes or dislikes, the system can recommend other items to you.
Two main approaches:
- User based filtering
- Item based filtering
User based filtering
- based on ratings given to the items, we can measure the distance among users
- we can recommend to the user the items that have the highest ratings among the closest users
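The two points above can be sketched in Python as follows. The ratings, the user names and the choice of k closest users are made up for illustration; the distance is the Euclidean distance computed on the items both users have rated.

```python
import math

# Hypothetical ratings (user -> {item: rating}), not data from the talk.
ratings = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob":   {"A": 5, "B": 3, "C": 4, "D": 5, "E": 1},
    "carol": {"A": 1, "B": 5, "C": 2, "D": 1, "E": 5},
}


def distance(r1, r2):
    """Euclidean distance over the items rated by both users."""
    shared = set(r1) & set(r2)
    if not shared:
        return math.inf
    return math.sqrt(sum((r1[i] - r2[i]) ** 2 for i in shared))


def recommend_user_based(ratings, user, k=1):
    """Items the user has not rated, best rated by the k closest users first."""
    others = sorted((u for u in ratings if u != user),
                    key=lambda u: distance(ratings[user], ratings[u]))
    scores = {}
    for neighbour in others[:k]:
        for item, rating in ratings[neighbour].items():
            if item not in ratings[user]:
                scores.setdefault(item, []).append(rating)
    averaged = {item: sum(rs) / len(rs) for item, rs in scores.items()}
    return sorted(averaged, key=averaged.get, reverse=True)


print(recommend_user_based(ratings, "alice"))  # ['D', 'E']
```

Here bob is the closest user to alice (distance 0), so his unseen items are proposed in order of his ratings.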
Hands on
Is user based filtering good for
- scalability?
- sparse data?
- quickly changing data?
No, it's better to use item based filtering
Euclidean distance for item based filtering: nothing has changed!
- based on the ratings received from the users, we can measure the distance among items
- we can recommend items to a user by picking the items that are closest to the ones the user rated highest
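A Python sketch of the item based variant: the ratings are regrouped by item and the same Euclidean distance is applied between items. The data and function names are hypothetical, and using only the user's single highest rated item as the reference is a simplifying assumption.

```python
import math

# Hypothetical ratings (user -> {item: rating}), not data from the talk.
ratings = {
    "alice": {"A": 5, "B": 1},
    "bob":   {"A": 5, "B": 1, "C": 5, "D": 1},
    "carol": {"A": 4, "B": 2, "C": 4, "D": 2},
    "dave":  {"A": 1, "B": 5, "C": 1, "D": 5},
}


def by_item(ratings):
    """Regroup the ratings as item -> {user: rating}."""
    items = {}
    for user, user_ratings in ratings.items():
        for item, rating in user_ratings.items():
            items.setdefault(item, {})[user] = rating
    return items


def distance(r1, r2):
    """Euclidean distance over the users who rated both items."""
    shared = set(r1) & set(r2)
    if not shared:
        return math.inf
    return math.sqrt(sum((r1[u] - r2[u]) ** 2 for u in shared))


def recommend_item_based(ratings, user):
    """Unrated items, closest to the user's highest rated item first."""
    items = by_item(ratings)
    rated = ratings[user]
    favourite = max(rated, key=rated.get)
    candidates = [i for i in items if i not in rated]
    return sorted(candidates, key=lambda i: distance(items[favourite], items[i]))


print(recommend_item_based(ratings, "alice"))  # ['C', 'D']
```

Because item-to-item distances change more slowly than user tastes, they can be computed ahead of time, which is one reason this approach copes better with scale.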
Hands on
Machine Learning: Bayes' classifier
Bayes' theorem
\[ P(A \mid B) = \frac{P(B \mid A) \, P(A)}{P(B)} \]
Example: given a company where 70% of developers use Java and 30% use C++, and knowing that half of the Java developers always use the enhanced for loop, if you look at the snippet:
for (int j=0; j<100; j++) {
    t = tests[j];
}
what is the probability that the developer who wrote it uses Java?
Hint:
A = developer uses Java
B = developer writes old for loops
Solution:
P(A) = prob. that a developer uses Java = 0.7
P(B) = prob. that any developer uses old for loop = 0.3 + 0.7*0.5 = 0.65
P(B|A) = prob. that a Java developer uses old for loop = 0.5
\[ P(A \mid B) = \frac{P(B \mid A) \, P(A)}{P(B)} = \frac{0.5 \cdot 0.7}{0.65} = 0.54 \]
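The arithmetic of the solution can be checked with a few lines of Python. The variable names are illustrative; the value 1.0 for C++ developers is an assumption implied by the 0.3 term in P(B), since C++ has no enhanced for loop in this example.

```python
def bayes(p_b_given_a, p_a, p_b):
    """P(A|B) = P(B|A) * P(A) / P(B)"""
    return p_b_given_a * p_a / p_b


p_java = 0.7
p_cpp = 0.3
p_old_given_java = 0.5  # half of the Java developers write old for loops
p_old_given_cpp = 1.0   # assumption: every C++ developer writes old for loops

# Total probability of seeing an old for loop.
p_old = p_old_given_java * p_java + p_old_given_cpp * p_cpp

p_java_given_old = bayes(p_old_given_java, p_java, p_old)
print(round(p_java_given_old, 2))  # 0.54
```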
Naive Bayes' classifier
- supervised learning
- trained on a set of known classes
- computes probabilities of elements to be in a class
- smoothing required
\[ P_c(w_1, \ldots, w_n) = \frac{\prod_{i=1}^{n} P(c \mid w_i)}{\prod_{i=1}^{n} P(c \mid w_i) + \prod_{i=1}^{n} \left( 1 - P(c \mid w_i) \right)} \]
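A Python sketch of the combination formula. Clamping each probability away from 0 and 1 is one simple way to provide the smoothing the slide asks for; the clamp value 0.01 and the function name are assumptions.

```python
import math


def combine(probabilities, epsilon=0.01):
    """Combine per-word probabilities P(c|w_i) into P_c(w_1, ..., w_n).

    Each probability is clamped to [epsilon, 1 - epsilon] (an assumed,
    very simple form of smoothing) so that a single 0 or 1 cannot force
    the result or cause a division by zero.
    """
    clamped = [min(max(p, epsilon), 1 - epsilon) for p in probabilities]
    in_class = math.prod(clamped)
    not_in_class = math.prod(1 - p for p in clamped)
    return in_class / (in_class + not_in_class)


print(combine([0.9, 0.8]))  # two words that both point to class c
print(combine([0.5, 0.5]))  # 0.5: uninformative words leave the result neutral
```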
Naive Bayes' classifier
Example
- we want a classifier for Twitter messages
- define a set of classes: {art, tech, home, events, ...}
- train the classifier with a set of already classified tweets
- when a new tweet arrives, the classifier will (hopefully) tell us which class it belongs to
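A compact Python sketch of such a classifier. It uses the common multinomial formulation (class prior times per-word likelihoods, with add-one smoothing), which is a swapped-in textbook variant and not necessarily what the talk's code does; the class name and the training tweets are invented for illustration.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesClassifier:
    def __init__(self):
        self.word_counts = defaultdict(Counter)  # class -> word -> count
        self.doc_counts = Counter()              # class -> training messages
        self.vocabulary = set()

    def train(self, text, label):
        words = text.lower().split()
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocabulary.update(words)

    def classify(self, text):
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log probabilities avoid underflow on long messages.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in words:
                count = self.word_counts[label][word]
                # Add-one (Laplace) smoothing for unseen words.
                score += math.log((count + 1) / (total_words + len(self.vocabulary)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training tweets.
classifier = NaiveBayesClassifier()
classifier.train("new gallery exhibition of modern painting", "art")
classifier.train("sculpture and painting at the museum", "art")
classifier.train("new java release improves compiler performance", "tech")
classifier.train("open source machine learning library released", "tech")

print(classifier.classify("painting exhibition at the gallery"))  # art
print(classifier.classify("java machine learning library"))       # tech
```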
Hands on
Sentiment analysis
- define two classes: { +, - }
- define a set of words: { like, enjoy, hate, bore, fun, … }
- train a NBC with a set of known +/- comments
- let the NBC classify any new comment to know if it is + or -
- performance is related to the quality of the training set
Machine Learning: Clustering
Clustering
- Unsupervised learning
- Different algorithms:
  - Hierarchical clustering
  - K-Means clustering
  - ...
Common use cases:
- navigation habits
- online commerce
- social/political attitudes
- ...
K-Means clustering
K-Means aims at identifying cluster centroids, such that an item belonging to a cluster X is closer to the centroid of cluster X than to the centroid of any other cluster.
The algorithm requires a number of clusters to start, in this case 3. The centroids are placed in the item space, typically in random locations.
The algorithm will then assign to each centroid all items that are closer to it than to any other centroid.
The centroids are then moved to the center of mass of the items in the clusters.
A new iteration occurs, taking into account the new centroid positions.
The centroids are again moved to the center of mass of the items in the clusters.
Another iteration occurs, taking into account the new centroid positions.
The centroids are again moved to the center of mass of the items in the clusters.
Another iteration occurs, taking into account the new centroid positions. Note that this time the cluster membership did not change. The cluster centers will not move anymore.
The solution is found.
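The walkthrough of K-Means can be sketched in Python as an assign/move loop that stops when cluster membership no longer changes. The sample points, the starting centroids and the iteration cap are illustrative assumptions; when no starting centroids are given, k random items are used.

```python
import math
import random


def closest(point, centroids):
    """Index of the centroid nearest to the point."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(point, centroids[i]))


def center_of_mass(points):
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means(points, k, centroids=None, max_iterations=100):
    if centroids is None:
        centroids = random.sample(points, k)  # assumption: start on k random items
    centroids = list(centroids)
    assignment = None
    for _ in range(max_iterations):
        # Assign each item to its closest centroid.
        new_assignment = [closest(p, centroids) for p in points]
        if new_assignment == assignment:
            break  # membership did not change: the solution is found
        assignment = new_assignment
        # Move each centroid to the center of mass of its items.
        for i in range(k):
            members = [p for p, a in zip(points, assignment) if a == i]
            if members:
                centroids[i] = center_of_mass(members)
    return centroids, assignment


# Hypothetical 2D items forming three visible groups.
points = [(0, 0), (0, 1), (1, 0),
          (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]
centroids, assignment = k_means(points, 3, centroids=[(0, 0), (10, 10), (20, 0)])
print(assignment)  # [0, 0, 0, 1, 1, 1, 2, 2, 2]
```

With random starting centroids the result can differ between runs, which is why the algorithm is often restarted several times.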
Hands on
Machine Learning: Neural networks
Neural networks
"A logical calculus of the ideas immanent in nervous activity" by McCulloch and Pitts in 1943
Feedforward Perceptron
Logic operators with neural networks:
OR operator
Threshold = 0
X0    X1   X2   Σ     Result
-10   0    0    -10   0
-10   0    20   10    1
-10   20   0    10    1
-10   20   20   30    1
Which operator?
Threshold = 0
X0    X1   X2   Σ     Result
-30   0    0    -30   0
-30   0    20   -10   0
-30   20   0    -10   0
-30   20   20   10    1
AND operator
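The two tables can be reproduced with a single threshold neuron in Python. Reading the columns as weighted inputs (a bias contribution of -10 or -30 in X0, and a weight of 20 on each binary input) is an interpretation of the slide; the function names are illustrative.

```python
def neuron(bias, weights, inputs, threshold=0):
    """Fire (return 1) when the weighted sum exceeds the threshold."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > threshold else 0


def or_gate(x1, x2):
    return neuron(-10, [20, 20], [x1, x2])


def and_gate(x1, x2):
    return neuron(-30, [20, 20], [x1, x2])


for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "OR:", or_gate(x1, x2), "AND:", and_gate(x1, x2))
```

Only the bias differs between the two operators: it sets how many inputs must be active before the sum becomes positive.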
Hands on
Backpropagation
Phase 1: Propagation
- Forward propagation of a training pattern's input through the neural network in order to generate the propagation's output activations
- Backward propagation of the propagation's output activations through the neural network using the training pattern target in order to generate the deltas of all output and hidden neurons
Phase 2: Weight update (for each weight)
- Multiply its output delta and input activation to get the gradient of the weight
- Move the weight in the opposite direction of the gradient by subtracting a ratio of it (the learning rate) from the weight
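The two phases can be sketched in Python on a very small network (2 inputs, 2 hidden neurons, 1 output, all with sigmoid activation). The network size, the starting weights, the squared error measure and the learning rate of 0.5 are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyNetwork:
    """Hypothetical 2-2-1 network; the last weight of each neuron is its bias."""

    def __init__(self):
        # Arbitrary fixed starting weights (assumption).
        self.hidden = [[0.15, 0.20, 0.35], [0.25, 0.30, 0.35]]
        self.output = [0.40, 0.45, 0.60]

    def forward(self, inputs):
        x = list(inputs) + [1.0]
        hidden_act = [sigmoid(sum(w * v for w, v in zip(row, x)))
                      for row in self.hidden]
        out = sigmoid(sum(w * v for w, v in zip(self.output, hidden_act + [1.0])))
        return hidden_act, out

    def train_step(self, inputs, target, rate=0.5):
        # Phase 1: forward propagation, then backward propagation of deltas.
        hidden_act, out = self.forward(inputs)
        delta_out = (out - target) * out * (1 - out)
        delta_hidden = [delta_out * self.output[j] * h * (1 - h)
                        for j, h in enumerate(hidden_act)]
        # Phase 2: gradient = delta * input activation; subtract a ratio of it.
        self.output = [w - rate * delta_out * a
                       for w, a in zip(self.output, hidden_act + [1.0])]
        x = list(inputs) + [1.0]
        self.hidden = [[w - rate * delta_hidden[j] * a for w, a in zip(row, x)]
                       for j, row in enumerate(self.hidden)]
        return 0.5 * (out - target) ** 2  # error before this update


net = TinyNetwork()
first_error = net.train_step([1.0, 0.0], 1.0)
for _ in range(200):
    last_error = net.train_step([1.0, 0.0], 1.0)
print(first_error, "->", last_error)
```

Training on one pattern only shows the mechanics; a real run would loop over all training patterns many times.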
Multilayer perceptrons
Hands on
Machine Learning: Genetic algorithms
Genetic algorithms
GA is a programming technique that mimics biological evolution as a problem-solving strategy
Steps
- map the variables of the problem into a sequence of bits, a chromosome
- create a random population of chromosomes
- let the population evolve using evolution laws:
  - the higher the fitness, the higher the chance of breeding
  - crossover of chromosomes
  - mutation in chromosomes
- stop the process when an optimal solution is found or after n steps
[Figure: Chromosome]
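These steps can be sketched in Python on a toy problem where the fitness of a chromosome is simply its number of 1 bits. The problem, the population size, the mutation rate and keeping the best chromosome in every generation (elitism) are assumptions made for illustration.

```python
import random


def fitness(chromosome):
    """Toy fitness (assumption): count the 1 bits."""
    return sum(chromosome)


def select(population, rng):
    """Pick two parents; higher fitness means a higher chance of breeding."""
    weights = [fitness(c) + 1 for c in population]  # +1 keeps zero fitness eligible
    return rng.choices(population, weights=weights, k=2)


def crossover(a, b, rng):
    """Single point crossover: head of one parent, tail of the other."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(chromosome, rng, rate=0.01):
    """Flip each bit with a small probability."""
    return [1 - bit if rng.random() < rate else bit for bit in chromosome]


def evolve(length=20, population_size=30, max_steps=200, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(population_size)]
    best = max(population, key=fitness)
    for _ in range(max_steps):
        if fitness(best) == length:
            break  # optimal solution found
        children = [mutate(crossover(*select(population, rng), rng), rng)
                    for _ in range(population_size - 1)]
        population = children + [best]  # elitism (assumption)
        best = max(population, key=fitness)
    return best


best = evolve()
print(fitness(best), best)
```

The fixed seed only makes the run repeatable; different seeds explore different populations.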
Mutation
Crossover
Hands on
Thanks!
The code is available on: https://github.com/andreaiacono/MachineLearning