COMP 150: Developmental Robotics
Instructor: Jivko Sinapov · www.cs.tufts.edu/~jsinapov
Northeast Robotics Colloquium
● Held at Northeastern University on Saturday October 21st
● https://nerc2017.ccis.northeastern.edu/
● Deadline for registration: October 15
● $50 for graduate students, $10 for undergrads
Project Related Deadlines
● Team-up by the end of class, Thursday Oct 5
● “Preliminary” project ideas presentations: Tuesday Oct 10 and Thursday Oct 12
● Project Proposal is due October 26 (midnight)
Written Project Proposal
● 5 pages + 1 for references
● Template: default Google Doc or default LaTeX (e.g., using overleaf.com)
– Do not change margins, font sizes, spacing, etc.
Proposal Sections
● Abstract (1 paragraph)
● Introduction
● Related Work
● Problem Formulation and Technical Approach
● Expected Results / Experimental Validation
– What are your criteria for success?
● Timeline / Schedule
Overview of Machine Learning
Topics
● Unsupervised Learning
● Classification / Recognition / Prediction
● Reinforcement Learning
Unsupervised Learning
● “Learns” from the features of data alone – no direct feedback or label for each data point
● Main goal: transform the data into some other form that is useful for tasks further on or easily interpretable by a human
Clustering
[ https://home.deib.polimi.it/matteucc/Clustering/tutorial_html/images/clustering.gif ]
K-Means Clustering
[http://cs.nyu.edu/~dsontag/courses/ml12/slides/lecture14.pdf]
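As a concrete sketch (not from the slides), k-means on toy 2-D data using scikit-learn; the data and cluster count are made up for illustration:

```python
import numpy as np
from sklearn.cluster import KMeans

# Two well-separated groups of 2-D points (toy data).
X = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
              [5.0, 5.1], [5.2, 5.0], [5.1, 4.9]])

# k-means alternates: assign each point to its nearest centroid,
# then move each centroid to the mean of its assigned points.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)           # each point's cluster index
print(km.cluster_centers_)  # one centroid per cluster
```

Note that k must be chosen in advance, which is one of the limitations discussed below.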
What are some of the limitations of k-means clustering?
Graph-based Clustering Algorithms
Spectral Clustering
How can we split the graph into two subgraphs so as to minimize the total weight of the edges being cut and maximize the weight of the edges that remain?
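As an illustrative sketch (not from the slides), scikit-learn's SpectralClustering builds the similarity graph and performs the cut internally; the toy data below is an assumption for demonstration:

```python
import numpy as np
from sklearn.cluster import SpectralClustering

# Two tight groups: the RBF similarity graph has strong edges within
# each group and near-zero edges between them, so the min-cut
# separates the groups.
X = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
              [7.0, 7.1], [7.2, 7.0], [7.1, 6.9]])

sc = SpectralClustering(n_clusters=2, affinity='rbf', random_state=0).fit(X)
print(sc.labels_)  # cluster index per point
```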
Hierarchical Clustering
[http://blog.hackerearth.com/wp-content/uploads/2017/01/flow-01.jpg]
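A minimal sketch of bottom-up (agglomerative) hierarchical clustering with SciPy; the data and the choice of average linkage are illustrative assumptions, not from the slides:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [4.0, 4.0], [4.1, 4.2], [4.2, 4.1]])

# Repeatedly merge the two closest clusters (average linkage),
# producing the merge tree shown in dendrogram figures.
Z = linkage(X, method='average')

# Cut the tree so that exactly 2 clusters remain.
labels = fcluster(Z, t=2, criterion='maxclust')
print(labels)
```

Unlike k-means, the tree is built once and can be cut at any level afterwards.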
What are some of the practical issues related to using clustering?
● Choice of distance / similarity function
● Choice of number of clusters
● ...
Dimensionality Reduction
[ http://jntsai.blogspot.com/2015/04/ammai-nonlinear-dimensionality.html ]
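A quick sketch of linear dimensionality reduction with PCA in scikit-learn (example data is an assumption, not from the slides): 2-D points that lie almost on a line are compressed to one coordinate with essentially no information loss.

```python
import numpy as np
from sklearn.decomposition import PCA

# Points near the line x2 = 2*x1: effectively one-dimensional data.
t = np.linspace(0.0, 1.0, 20)
X = np.column_stack([t, 2.0 * t + 0.001 * np.sin(20.0 * t)])

pca = PCA(n_components=1)
Z = pca.fit_transform(X)  # 1-D coordinate along the main direction
print(pca.explained_variance_ratio_)  # fraction of variance kept
```

Nonlinear structure (as in the manifold-learning figure) needs methods such as Isomap or t-SNE instead.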
What are some situations where a robot would need to use unsupervised learning (e.g., clustering)?
(discussion)
Machine Learning Frameworks
                 supervised                          unsupervised
discrete         classification or categorization    clustering
continuous       regression                          dimensionality reduction
                                                     and manifold learning
Classification
Non-linear Classification
Notation
● Inputs: data points x1, …, xN
● Outputs: labels y1, …, yN
● Set of classes: each yi belongs to {c1, …, cK}, where K is the number of classes
Problem Formulation
y = f(x)
• Training: given a training set of labeled examples {(x1,y1), …, (xN,yN)}, estimate the function f by minimizing the error on the training set
• Testing: apply f to a never before seen test example x and output the predicted value y = f(x)
Here x is a data point, f is the classification function, and y is the predicted output.
Slide credit: L. Lazebnik
Example Problem
Input data:
[1 0 0 1 1 1 0 1 0]
[0 1 0 1 1 1 0 0 1]
…
[1 0 1 1 0 0 1 0 1]
[0 1 0 1 0 0 0 1 1]
Class Labels:
+1
-1
…
+1
-1
1-Nearest Neighbor
[Figure: labeled points “x” and “o” in the (x1, x2) plane; each query point “+” takes the label of its single nearest neighbor.]
3-Nearest Neighbor
[Figure: the same data; each query point “+” takes the majority label among its three nearest neighbors.]
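A minimal k-nearest-neighbor sketch with scikit-learn (toy data assumed for illustration): each query point gets the majority label among its k closest training points.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Training set: two labeled groups in the (x1, x2) plane.
X = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
              [3.0, 3.0], [3.2, 3.1], [3.1, 2.9]])
y = np.array([-1, -1, -1, 1, 1, 1])

knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
pred = knn.predict([[0.1, 0.1], [3.1, 3.0]])  # two query points
print(pred)
```

Setting n_neighbors=1 gives the 1-NN rule from the previous slide.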
Linear Classifier
• Finds a linear function to separate the classes:
f(x) = sgn(w x + b)
Slide credit: L. Lazebnik
[Figure: “x” and “o” classes in the (x1, x2) plane separated by a straight line.]
Training a Linear Classifier
● Random initialization
● First data point…
● Second data point...
● 3rd data point: Error!
● Update: change w and b in the direction that minimizes the error
● And so on...
[Figure sequence: “x” and “o” points in the (x1, x2) plane, with the decision line adjusting after each update until the classes are separated.]
The Algorithm
[https://i.stack.imgur.com/SCfew.png]
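The update loop above (adjust w and b whenever a point is misclassified) is the perceptron algorithm; a minimal NumPy version, with toy data assumed for illustration:

```python
import numpy as np

def perceptron(X, y, epochs=20):
    """Train f(x) = sgn(w.x + b) by correcting each misclassified point."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (w @ xi + b) <= 0:  # wrong side (or on the line): update
                w += yi * xi            # nudge w toward the correct side
                b += yi
    return w, b

# Linearly separable toy data with labels in {+1, -1}.
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -1.0], [-3.0, -2.0]])
y = np.array([1, 1, -1, -1])

w, b = perceptron(X, y)
preds = np.sign(X @ w + b)
print(preds)
```

For separable data this loop is guaranteed to converge; for non-separable data it never settles, which motivates the margin-based view that follows.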
Linear Classifier
• How do we decide which line is the best?
Slide credit: L. Lazebnik
[Figure: the same “x” and “o” data with several different lines that all separate the classes.]
Answer: maximize the margin
[Figure: “x” and “o” points with the separating line; the gap between the line and the closest points of each class is called the margin.]
What happens when the data is not linearly separable?
Answer: map the data into a higher dimensional space
[http://www.imtech.res.in/raghava/rbpred/svm.jpg]
Nonlinear Support Vector Machine
Linearly separable: [1-D number line with one class entirely on each side of a threshold]
Not linearly separable: [1-D number line with one class flanked by the other class on both sides]
Can we construct a mapping function from 1D to 2D such that the data in the 2D space is linearly
separable?
Input space → Feature space
φ(x) → <x1, x2>
In other words, both x1 and x2 need to be functions of x
Can we construct a mapping function from 1D to 2D such that the data in the 2D space is linearly
separable?
Input space → Feature space
φ(x) → <x, x>
Can we construct a mapping function from 1D to 2D such that the data in the 2D space is linearly
separable?
Input space → Feature space
φ(x) → <x, |x|>
Can we construct a mapping function from 1D to 2D such that the data in the 2D space is linearly
separable?
Input space → Feature space
φ(x) → <x, x²>
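A small NumPy check of the idea (toy data assumed): 1-D points with one class surrounded by the other are not linearly separable, but after lifting with φ(x) = (x, x²) a single line in the feature space separates them.

```python
import numpy as np

# Outer points (+1) flank the inner points (-1): no 1-D threshold works.
x = np.array([-2.0, -1.5, 1.5, 2.0, -0.5, 0.0, 0.5])
y = np.array([1, 1, 1, 1, -1, -1, -1])

phi = np.column_stack([x, x**2])  # lift each 1-D input to 2-D

# In feature space the horizontal line x2 = 1, i.e. w = (0, 1), b = -1,
# separates the classes.
preds = np.sign(phi @ np.array([0.0, 1.0]) - 1.0)
print(preds)
```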
Nonlinear Support Vector Machine
• The kernel trick: instead of explicitly computing the lifting transformation φ(x), define a kernel function K such that
K(xi , xj) = φ(xi ) · φ(xj)
(to be valid, the kernel function must satisfy Mercer’s condition)
• Intuitively, the kernel function should encode a measure of similarity between xi and xj
Nonlinear Support Vector Machine
Consider the mapping φ(x) = (x, x²)

φ(x) · φ(y) = (x, x²) · (y, y²) = xy + x²y²

K(x, y) = xy + x²y²
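Scikit-learn's SVC accepts a callable kernel, so the slide's kernel K(x, y) = xy + x²y² can be plugged in directly without ever computing φ; the data below reuses the 1-D toy example (an illustrative assumption):

```python
import numpy as np
from sklearn.svm import SVC

def K(A, B):
    """Gram matrix for K(x, y) = x*y + x^2*y^2, which equals
    <phi(x), phi(y)> for phi(x) = (x, x^2) in the 1-D case."""
    G = A @ B.T
    return G + G**2

# Outer class (+1) flanks the inner class (-1): not separable in 1-D.
X = np.array([[-2.0], [-1.5], [1.5], [2.0], [-0.5], [0.0], [0.5]])
y = np.array([1, 1, 1, 1, -1, -1, -1])

clf = SVC(kernel=K, C=10.0).fit(X, y)
pred = clf.predict(X)
print(pred)
```

The classifier never forms the 2-D features explicitly; all computation goes through the Gram matrix.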
There are many other classifiers out there...
Decision Trees
[http://cdn2.hubspot.net/hub/64283/file-15380323-png/images/rapidminer-decision-tree-personal-loan-accept.png]
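A minimal decision-tree sketch with scikit-learn, echoing the loan-acceptance style of the figure; the features, thresholds, and data are invented for illustration:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Toy loan-style data: columns are [income, debt]; accept (1) when
# income is high and debt is low.
X = np.array([[60, 10], [80, 5], [75, 8], [30, 40], [25, 35], [40, 50]])
y = np.array([1, 1, 1, 0, 0, 0])

# The tree greedily picks feature/threshold splits that best separate
# the labels, producing the kind of if/else diagram in the figure.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
pred = tree.predict([[70, 6], [20, 45]])
print(pred)
```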
Feed-Forward Neural Networks
[http://cs231n.github.io/assets/nn1/neural_net2.jpeg]
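A sketch of the forward pass through a small feed-forward network like the one in the figure (layer sizes and random weights are assumptions for illustration; no training is shown):

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

# Tiny 2-layer network: 3 inputs -> 4 hidden units -> 2 output scores.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)

def forward(x):
    h = relu(W1 @ x + b1)  # hidden layer: linear map + nonlinearity
    return W2 @ h + b2     # output layer: raw class scores

out = forward(np.array([1.0, 0.5, -0.5]))
print(out.shape)
```

Training adjusts W1, b1, W2, b2 by gradient descent on a loss; deep learning stacks many such layers.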
Deep Learning Methods
Deep Learning Methods
http://www.kdnuggets.com/wp-content/uploads/deep-learning.png
There are many ways to combine classifiers...
Classifier Ensembles
Boosting
A sequence of classifiers that grows in complexity from stage to stage, with each stage refining the decisions of the previous ones
http://www.svcl.ucsd.edu/~ehsan/web/Img/cascadeflow.jpg
http://www.chioka.in/wp-content/uploads/2013/09/stacking.png
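As one concrete ensemble sketch (not from the slides), AdaBoost in scikit-learn trains a sequence of weak classifiers, reweighting the data so later stages focus on earlier mistakes; the 1-D toy data is an assumption:

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

X = np.array([[0.0], [0.5], [1.0], [3.0], [3.5], [4.0]])
y = np.array([-1, -1, -1, 1, 1, 1])

# Each boosting round fits a shallow tree to a reweighted version of
# the data; the final prediction is a weighted vote over all rounds.
boost = AdaBoostClassifier(n_estimators=10, random_state=0).fit(X, y)
pred = boost.predict([[0.2], [3.8]])
print(pred)
```

Stacking (the second figure) instead trains a meta-classifier on the outputs of the base classifiers.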
Project Breakout (10 – 15 min)
What are some ways you can use unsupervised and/or supervised machine learning methods in your project?
What are the “data points” going to be? What will they encode?
Where are the labels going to come from?
Take-home message
“The decision to use machine learning is more important than the choice of a particular learning method.”
- James Hays, Brown University
Resources
● Introduction to Machine Learning textbook: http://alex.smola.org/drafts/thebook.pdf
● WEKA Machine Learning Library (in Java): http://www.cs.waikato.ac.nz/ml/weka/
● Support Vector Machine example using OpenCV: http://docs.opencv.org/2.4/doc/tutorials/ml/introduction_to_svm/introduction_to_svm.html
● ML library in Python: http://scikit-learn.org/stable/