Introduction to Machine Learning Lecture

Post on 12-Jan-2017

Biomedical Informatics 260

Machine Learning for Images: Introduction

Lecture 11

Spring 2014

Review: Feature Extraction

Today: Machine learning to classify, predict

1. We have a rich set of semantic and computational features

2. We want to harness these features to build tools for medical decision support

Machine Learning

• What is machine learning?

• How do I prepare my data?

• Types of Learning Algorithms

• How do I evaluate performance?

http://www.vbmis.com/learn/some-link-here

More details on methods

(formulas, written out explanations, etc.)

What is machine learning?

Real world phenomenon

Model

Data:

1.2 ‘green’ 234.5 10

1.4 ‘red’ 160.8 11

0.05 ‘red’ 150.3 10

Where do I start?

• Identify a question

– Can I predict candy type from image features?

• Ideal data?

– High resolution images for all candy in the world

• Available data

– 100 pictures of peanut and regular M&M’s

• Explore and visualize

• Create model

• Evaluate results

Good ideas:

• Look at your data

• Start with simple methods first

• Always evaluate

How do I prepare my data?

[Diagram: Data, Features, Learning Algorithm, Classifier, Evaluation]

GOAL: Prepare data for feature extraction

How do I prepare my data?

Filtering

Quality Analysis

Registration

Segmentation

Filtering

Do we need to smooth edges?

Smooth pixels for noise reduction?

Filter out a threshold of interest?

http://www.vbmis.com/learn/category/medical-imaging/image-processing/filtering/

Quality Analysis

Are any images still not good enough?

Is the data biased in any way?

Do we have representation of all classes?

Registration

Functional data aligned with structural?

Alignment to a standard space?

Alignment to a group template?

http://www.vbmis.com/learn/category/medical-imaging/image-processing/registration-image-processing/

Segmentation

What is background vs. region of interest?

http://www.vbmis.com/learn/category/medical-imaging/image-processing/segmentation/

How do I prepare my data?

Features

Feature Space Representations

Most algorithms work with general vectors

          X1    X2    Y
Image 1   2.9   0.2   “green peanut”
Image 2   0.1   1.9   “red regular”
Image 3   7.2   2.1   “red peanut”

Feature Normalization

Here is our vector

v = <x,y>

Calculate the norm

$|v| = \sqrt{x^2 + y^2}$

For example,

v = <4,-3>

$|v| = \sqrt{4^2 + (-3)^2} = 5$

Then divide the vector by the norm to make the unit vector

v / |v|
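The normalization steps above can be sketched in a few lines of Python. This is only an illustration: the function name `unit_vector` is made up for this example, and it uses the Euclidean norm as on the slide.

```python
import math

def unit_vector(v):
    """Divide a vector by its Euclidean norm so the result has length 1."""
    norm = math.sqrt(sum(component ** 2 for component in v))
    if norm == 0:
        raise ValueError("the zero vector has no direction to normalize")
    return [component / norm for component in v]

# The example from the slide: v = <4, -3> has norm 5.
print(unit_vector([4, -3]))
```

Dividing by the norm keeps the direction of the feature vector and discards its magnitude, which may or may not be what a given classifier needs.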

How do I build a classifier?

Choosing a learning algorithm

Three Types of Learning Algorithms

1. Supervised Classification

2. Regression

3. Unsupervised Clustering

1. Supervised Classification

• K-nearest neighbor (KNN)

• Naïve Bayes

• Support vector machines (SVM)

1. Supervised Classification

K-Nearest Neighbor

1. For each unlabeled point, compute the distance to every labeled point (order N)

2. Sort the distances so we have a sorted list of the neighbors, including labels

3. Determine the closest K neighbors (take the top K off the list)

4. Combine the neighbors' labels to make a decision (for example, by majority vote). We MUST have a way to resolve K neighbors that don't agree!

http://www.vbmis.com/learn/k-nearest-neighbor-clustering-knn/
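The K-nearest-neighbor steps can be sketched as follows. This is a minimal illustration, not a reference implementation: the name `knn_predict`, the Euclidean distance, and the tie-breaking rule (a tied vote goes to the tied label with the nearest member) are all assumptions made for the example.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k):
    """Label `query` by majority vote among its k nearest training points."""
    # Distance from the query to every labeled point, sorted nearest first.
    neighbors = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Keep the K closest neighbors.
    top_k = neighbors[:k]
    # Majority vote over the neighbors' labels.
    votes = Counter(label for _, label in top_k)
    best = max(votes.values())
    tied = {label for label, count in votes.items() if count == best}
    # Assumed tie-break: among tied labels, the nearest neighbor wins.
    for _, label in top_k:
        if label in tied:
            return label

# Hypothetical two-feature data for regular and peanut candies.
points = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.9)]
labels = ["regular", "regular", "peanut", "peanut"]
print(knn_predict(points, labels, (1.1, 1.0), k=3))
```

Small K follows individual points closely, while large K smooths the decision toward the overall majority class.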

1. Supervised Classification

K-Nearest Neighbor

[Figures: KNN results with K=1, K=15, K=100, and K=400]

1. Supervised Classification

Naïve Bayes

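Estimate the probability of belonging to each class, and assign to the most probable class

1. Assume features are conditionally independent given the class: observing one feature says nothing about the others

2. We typically sum the logs of the probabilities instead of multiplying the probabilities, which avoids numerical underflow

$$P(\text{class} \mid \text{features}) = \frac{P(\text{features} \mid \text{class}) \, P(\text{class})}{P(\text{features})}$$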

http://www.vbmis.com/learn/naive-bayes/
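A small sketch of a naive Bayes classifier that sums log probabilities. The slides do not say how the per-feature likelihoods are modeled, so the Gaussian likelihood here is an assumption, as are the function names and the toy data. The denominator, the probability of the features, is the same for every class and is therefore left out when comparing classes.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(rows, labels):
    """Estimate a prior plus per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    model = {}
    for label, members in by_class.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # Assumed safeguard: a small floor keeps the variance above zero.
        variances = [
            max(sum((x - m) ** 2 for x in col) / n, 1e-9)
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(rows), means, variances)
    return model

def log_gaussian(x, mean, var):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)

def predict_nb(model, row):
    """Return the class with the largest log prior plus summed log likelihoods."""
    scores = {
        label: math.log(prior)
        + sum(log_gaussian(x, m, v) for x, m, v in zip(row, means, variances))
        for label, (prior, means, variances) in model.items()
    }
    return max(scores, key=scores.get)

# Hypothetical features for two candy types.
rows = [(1.0, 2.0), (1.2, 1.8), (0.8, 2.2), (3.0, 0.5), (3.2, 0.4), (2.8, 0.6)]
labels = ["peanut"] * 3 + ["regular"] * 3
model = fit_gaussian_nb(rows, labels)
print(predict_nb(model, (1.1, 2.0)))
```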

1. Supervised Classification

Naïve Bayes

$$P(\text{candy} = c \mid \text{features}) = \frac{P(\text{features} \mid \text{candy} = c) \, P(c)}{P(\text{features})}$$

$P(\text{features} \mid \text{candy} = c)$: If the hypothesis were true, what would the features look like?

$P(\text{features}) = P(\text{features} \mid \text{candy} = c_1) \, P(c_1) + P(\text{features} \mid \text{candy} = c_2) \, P(c_2)$: The overall probability of observing the features in the data, regardless of candy type

$P(c)$: The probability of the hypothesis before looking at the data

1. Supervised Classification

Support Vector Machines

1. A way to transform the data points into a higher-dimensional space (kernel mapping)

2. Look for the boundary that best separates the classes

3. The points closest to the boundary, which determine where it lies, are called the support vectors

http://www.vbmis.com/learn/support-vector-machines-svms/

2. Regression

1. Supervised learning method

2. Assume that the relationship between the features X and the outcome Y is linear

3. We can solve with the least squares approach (minimize the sum of squared errors)

Linear Regression

$$\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p$$

http://www.vbmis.com/learn/linear-regression/
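For a single feature, the least squares solution has a simple closed form. The sketch below fits an intercept and a slope to made-up data. The name `fit_line` is hypothetical, and the multi-feature case would need a matrix solver that is not shown here.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = b0 + b1 * x for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Made-up data lying exactly on y = 1 + 2x.
b0, b1 = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(b0, b1)
```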

3. Unsupervised Clustering

• K-means clustering

• Hierarchical clustering

Unsupervised Clustering

K-Means

1. Generate K random points (cluster centers) in the space of the objects to be clustered

2. Compute the distance of each data point (object) to all the cluster centers

3. Assign each object to the closest cluster center (CC)

4. Compute a new position for each cluster center as the average of its assigned objects (there are lots of ways to do the average)

5. Loop to step 2, until bored or until the cluster centers don't change significantly (they move by less than or equal to some epsilon that we set)

http://www.vbmis.com/learn/k-means-clustering/
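The K-means loop can be sketched as below. Two choices are assumptions made for this example rather than something the slides specify: the starting centers are K randomly chosen data points instead of arbitrary random points in the space, and the average is the plain arithmetic mean. The function and parameter names are hypothetical.

```python
import math
import random

def kmeans(points, k, epsilon=1e-6, max_iter=100, seed=0):
    """Cluster `points` (a list of tuples) into k groups."""
    rng = random.Random(seed)
    # Assumed initialization: K distinct data points as starting centers.
    centers = rng.sample(points, k)
    clusters = []
    for _ in range(max_iter):
        # Assign each object to its closest cluster center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its assigned objects.
        new_centers = []
        for i, members in enumerate(clusters):
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(centers[i])  # keep an empty cluster's center
        # Stop once no center moved by more than epsilon.
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= epsilon:
            break
    return centers, clusters

data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, clusters = kmeans(data, k=2)
print(centers)
```

K-means can settle on different groupings for different starting centers, so in practice it is often run several times.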

Unsupervised Clustering

Hierarchical

1. Compute a matrix of all distances between objects (not order N: we calculate N squared distances)

2. Find the two closest nodes

3. Merge them by "averaging" their positions (there are multiple strategies for averaging; usually it is a weighted average)

4. Compute the distance of the new merged node to all others. This leaves N-1 nodes

5. Repeat until all nodes are merged (there is one node)

6. Draw cluster boundaries as you see fit

http://www.vbmis.com/learn/hierarchical-clustering/
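A naive sketch of the merging procedure. It makes simplifying assumptions: merged nodes are positioned at the size-weighted average (centroid) of their members, distances are recomputed on each pass rather than kept in a matrix, and merging stops at a requested number of clusters as a stand-in for drawing the boundaries by eye. The name `agglomerate` is hypothetical.

```python
import math

def agglomerate(points, n_clusters):
    """Merge the two closest nodes until `n_clusters` nodes remain."""
    # Each node is (position, list of member points).
    nodes = [(tuple(p), [tuple(p)]) for p in points]
    while len(nodes) > n_clusters:
        # Scan all pairs of nodes for the two closest ones.
        i, j = min(
            ((a, b) for a in range(len(nodes)) for b in range(a + 1, len(nodes))),
            key=lambda pair: math.dist(nodes[pair[0]][0], nodes[pair[1]][0]),
        )
        # Merge: the mean of all members is the size-weighted average
        # of the two node positions.
        members = nodes[i][1] + nodes[j][1]
        position = tuple(sum(col) / len(members) for col in zip(*members))
        # Replace the pair with the merged node, leaving one node fewer.
        nodes = [node for idx, node in enumerate(nodes) if idx not in (i, j)]
        nodes.append((position, members))
    return [members for _, members in nodes]

data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(agglomerate(data, n_clusters=2))
```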

How do I build a classifier?

Ok, now build it!

What data do I use to build?

Training vs. Testing Data

[Diagram: Data, Features, Learning Algorithm, Classifier; example class labels: Orange M&M, Yellow M&M, Blue Peanut M&M]

Features:

1.2 ‘green’ 234.5 10

1.4 ‘red’ 160.8 11

0.05 ‘red’ 150.3 10

TRAINING: building model

The algorithm “learns” the optimal parameters for the model

TESTING: evaluating model

We give the classifier new data to predict class labels

How do I use the data to obtain reliable estimates?

• Holdout: if you have enough data

• Cross Validation: if you don’t

• Ideal: We have new data to test on

• Reality: We don’t

How do I obtain reliable estimates?

• Holdout

– 2/3 for training

– 1/3 for testing

How do I obtain reliable estimates?

• Cross Validation

– Partition into N sets

– Train on N-1

– Test on “held out” set
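Both strategies amount to different ways of splitting the same data. The sketch below shows a 2/3 to 1/3 holdout split and an N-fold cross-validation generator. The function names, the shuffling, and the fixed seed are assumptions made for this example.

```python
import random

def holdout_split(items, train_fraction=2 / 3, seed=0):
    """Shuffle, then keep the first part for training and the rest for testing."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def cross_validation_folds(items, n_folds, seed=0):
    """Yield (train, test) pairs; each fold is held out exactly once."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Deal the shuffled items into n_folds roughly equal sets.
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    for held_out in range(n_folds):
        train = [x for i, fold in enumerate(folds) if i != held_out for x in fold]
        yield train, folds[held_out]

samples = list(range(9))
train, test = holdout_split(samples)
print(len(train), len(test))
for train, test in cross_validation_folds(samples, n_folds=3):
    print(sorted(test))
```

With cross validation, the performance estimate is usually reported as the average over the held-out sets.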

How do I choose the right model complexity?

Overfitting and Underfitting

Bishop. Pattern Recognition and Machine Learning

What Model Complexity to Choose?

[Figure: generated data, the data model, and fitted models of different orders]

What Model Complexity to Choose?

James, et al. An Introduction to Statistical Learning with Applications in R

How to evaluate performance?

How do I evaluate performance of a classification model?

• Focus on predictive ability of the model

Confusion Matrix

                    Predicted Class
                    Yes    No
Actual Class  Yes   TP     FN
              No    FP     TN

How do I

assess

performance?

How do I evaluate performance of a classification model?

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

How do I evaluate performance of a classification model?

$$\text{True Positive Rate} = \frac{TP}{TP + FN}$$

(Sensitivity)

$$\text{True Negative Rate} = \frac{TN}{TN + FP}$$

(Specificity)
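These three measures follow directly from the four confusion matrix counts. A minimal sketch, with a hypothetical function name and made-up labels:

```python
def classification_metrics(actual, predicted, positive="yes"):
    """Count TP, TN, FP, FN and derive accuracy, sensitivity, specificity."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    # Note: this sketch assumes both classes occur in `actual`,
    # otherwise one of the divisions below is by zero.
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

actual = ["yes", "yes", "yes", "no", "no", "no", "no", "no"]
predicted = ["yes", "yes", "no", "no", "no", "no", "yes", "no"]
print(classification_metrics(actual, predicted))
```

In this made-up example there are 2 true positives, 1 false negative, 4 true negatives, and 1 false positive.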

How do I evaluate performance of a clustering model?

• Ideal Clustering: finding the "optimal" (best that we can do) grouping of objects such that the within-group distance is low (minimized) and the between-group distance is high (maximized).

http://www.vbmis.com/learn/cluster-validation/

How do I evaluate performance of a clustering model?

• Internal Measures

– stability validation

– connectivity

– compactness

– separation

– the Dunn Index

– silhouette width

How do I evaluate performance of a clustering model?

• External Measures

– Biological Homogeneity Index

– Biological Stability Index

Evaluation:

Comparing to Competitor Methods

ROC Curve Analysis

$$\text{Sensitivity} = TPR = \frac{TP}{TP + FN}$$

We care a lot about missing something: FN are expensive

$$\text{Specificity} = TNR = \frac{TN}{TN + FP}$$

We care more about avoiding false alarms than about missing something that might be true: FP are expensive
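An ROC curve traces sensitivity against 1 minus specificity as the decision threshold on a classifier's score is varied. The sketch below is one simple way to compute the points and the area under the curve. The function names, the "score at or above threshold means positive" rule, and the scores themselves are assumptions made for illustration.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per score threshold.

    `labels` holds True for the positive class. A case is predicted
    positive when its score is at or above the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(s >= threshold and y for s, y in zip(scores, labels))
        fp = sum(s >= threshold and not y for s, y in zip(scores, labels))
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezoidal area under consecutive ROC points."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )

# Made-up classifier scores and true labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [True, True, False, True, False, False]
curve = roc_points(scores, labels)
print(curve)
print(area_under_curve(curve))
```

Computing the area for each competing method on the same test data gives a single number to compare them by.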

What if our classifier sucks?

Feature Selection: Find some subset of features (predictor variables) that are most informative about a class label

Feature Selection: Methods for dimensionality reduction

• Criterion

– mean squared error (regression)

– misclassification rate (classification)

Feature Selection: Methods for dimensionality reduction

1. Best Subset Selection

2. Sequential Forward Search

3. Sequential Backward Search

4. Shrinkage Methods

5. Dimensionality Reduction

Best Subset Selection

• Only feasible when the number of features (p) is small, because every one of the 2^p possible subsets must be evaluated

[Figure: features P1, P2, P3]

Sequential Forward Search

• Sequentially add features to an empty candidate set until the addition of further features doesn’t decrease our criterion

[Figure: features P1, P2, P3]
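Sequential forward search can be sketched as a greedy loop. The criterion is passed in as a function where lower is better (for example a cross-validated misclassification rate). The `toy_error` function below is a made-up stand-in, not a real model evaluation, and the names are hypothetical.

```python
def forward_search(features, criterion):
    """Greedily add features while the criterion (lower is better) decreases."""
    selected = []
    best_score = criterion(selected)
    remaining = list(features)
    while remaining:
        # Score every one-feature extension of the current candidate set.
        score, candidate = min((criterion(selected + [f]), f) for f in remaining)
        if score >= best_score:
            break  # no further feature decreases the criterion
        selected.append(candidate)
        remaining.remove(candidate)
        best_score = score
    return selected, best_score

def toy_error(subset):
    # Hypothetical criterion: P1 and P3 help, P2 adds a small penalty.
    error = 1.0
    if "P1" in subset:
        error -= 0.5
    if "P3" in subset:
        error -= 0.3
    if "P2" in subset:
        error += 0.05
    return error

print(forward_search(["P1", "P2", "P3"], toy_error))
```

Sequential backward search is the mirror image: start from the full set and remove one feature at a time while the criterion does not get worse.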

Sequential Backward Search

• Features are sequentially removed from a full candidate set until the removal of further features increases the criterion

[Figure: features P1, P2, P3]

Summary

What is machine learning?

Machine learning uses methods from statistics and computer science to predict an outcome

How do I prepare my data?

We should visualize, normalize, and clean our data before turning it into a vector to train a learning algorithm

How do I build a classifier?

We should choose a method based on our data, perform intelligent feature selection, and take advantage of Matlab’s built-in functions

How do I assess performance?

Generally, we should evaluate with a separate test data set, and look at ROC curve metrics, and internal and external validation for clustering

What does it mean for me?

• First identify your question

• Come up with “the ideal” data

• Find “actual” data

• Explore it, visualize it!

• Experiment with different classifiers

• Evaluate and write up your results

Courses to Take

• Statistics:

– 116, 200, 202

• Computer Science:

– CS229

Next Time:

Advanced Machine Learning for Imaging

Thank you!