Olga Senyukova - Machine Learning Applications in Medicine

MACHINE LEARNING APPLICATIONS

IN MEDICINE

Olga Senyukova

Graphics & Media Lab

Faculty of Computational Mathematics and Cybernetics

Lomonosov Moscow State University

Medical data

Medical images

Physiologic signals

Other: narrative, textual, numerical, etc.

Medical images

X-ray, MRI, CT, ultrasound

Computed tomography (CT)

1972, Sir Godfrey Hounsfield

X-rays are computer-processed to produce

tomographic images

https://en.wikipedia.org/wiki/CT_scan

Computed tomography (CT)

insightci.com.au

http://insightci.com.au/services/ct-cat-scan/

Magnetic resonance imaging (MRI)

1973, Paul C. Lauterbur and Peter Mansfield

Allows localizing the image by slices

Source: K. Toennies

https://commons.wikimedia.org/wiki/File:MRI.png

Magnetic resonance imaging (MRI)

www.raleighrad.com

https://www.raleighrad.com/expertise/mri/open-bore-mri/

Electrocardiography (ECG)

1901, Einthoven

Recording of the electrical activity of the heart by

electrodes placed on the body

intensivecarehotline.com

RR time series

RR time series (lengths of interbeat intervals) are widely

used for ECG analysis
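As a minimal sketch of how an RR time series can be derived, assume the R-peak times have already been detected in the ECG. The function name `rr_intervals` and the peak times below are hypothetical, chosen only for illustration.

```python
def rr_intervals(r_peak_times):
    """Return interbeat (RR) intervals in seconds from R-peak timestamps."""
    return [t1 - t0 for t0, t1 in zip(r_peak_times, r_peak_times[1:])]


# Hypothetical R-peak times in seconds (illustrative values, not real data).
r_peaks = [0.0, 0.8, 1.6, 2.5, 3.25]
rr = rr_intervals(r_peaks)

# Mean heart rate in beats per minute, derived from the mean RR interval.
mean_hr_bpm = 60.0 / (sum(rr) / len(rr))
```

The resulting list `rr` is the time series that later analysis steps would take as input.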

www.elsevier.es

Human gait time series

reylab.bidmc.harvard.edu

Analysis: what for?

Normal or diseased?

Where is the diseased area?

What changes occur over time

(especially after treatment)?

Is a specific condition present

(e.g. overtraining in an athlete)?

…

www.fresher.ru

Main tasks: images

Detection

aneurysm

Segmentation

Matching (Registration)

Main tasks: physiologic signals

Diagnostics

Healthy

Disease XXX

Disease YYY

Template Matching

Condition ZZZ

The same or

not???

Machine learning in medical imaging:

challenges

Slide by D. Rueckert

Images are often 3D or 4D:

the number of voxels and the number of extracted features are very large

Number of images for training is often limited:

a “large” dataset typically means 100 to 1000 images

“small sample size problem”

Machine learning in medical imaging:

challenges

Training data is expensive

annotation of images is resource intensive (manpower,

cost, time)

it is sometimes possible to augment the training set using

unlabelled images

Training data is sometimes imperfect

training data may be wrongly labelled

e.g. diseases such as Alzheimer’s require confirmation

through pathology (difficult and costly to obtain)


The InnerEye project

Measuring brain tumors

Localizing and identifying vertebrae

Kinect for surgery

Source: A. Criminisi & the InnerEye team @ MSRC

Anatomy localization via regression

forests

A. Criminisi, et al.

Med Image Analysis

2013

Decision forests

Leo Breiman, 2001

A. Criminisi, J. Shotton (eds.). Decision Forests in

Computer Vision and Medical Image Analysis //

Advances in Computer Vision and Pattern

Recognition. 2013

Decision forest consists

of decision trees…

Decision tree

Each internal node: a split (test) function

Each leaf: class label (predictor)

Source: A. Konushin

Regression tree

Plot axes: input value (horizontal), continuous label (vertical)

• Green – high uncertainty

• Red – low uncertainty

• Thickness – the number of samples from the training set

Several following slides are adapted from A. Criminisi and J. Shotton

Regression tree: training

• S0 – whole training set

• Sj – part of training set at the jth node

$p(y \mid x) = \mathcal{N}\left(y;\ \bar{y}(x),\ \sigma_y^2(x)\right)$

Regression tree: training

Split function parameters at the jth node maximize the information gain

At each part (L,R):

fit a line to the points

(e.g. least squares)

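for each x we have $p(y \mid x) = \mathcal{N}\left(y;\ \bar{y}(x),\ \sigma_y^2(x)\right)$

$\theta_j = \arg\max_{\theta \in \mathcal{T}} I(S_j, \theta)$

$I(S_j, \theta) = \sum_{(x,y) \in S_j} \log(\sigma_y(x)) - \sum_{i \in \{L,R\}} \sum_{(x,y) \in S_j^i} \log(\sigma_y(x))$

$\bar{y}(x)$ – green line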
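The split selection on this slide can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes a 1D input, an axis-aligned threshold split, and a constant residual standard deviation per node, so each sum of log(sigma) terms reduces to (number of samples) times log(sigma). All function names are hypothetical.

```python
import math


def fit_line_sigma(points):
    """Least-squares line fit; return the residual standard deviation."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = my - slope * mx
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in points)
    # Small floor keeps log() finite for perfect fits (assumption of this sketch).
    return max(math.sqrt(sse / n), 1e-6)


def information_gain(points, threshold):
    """Gain of splitting `points` into x < threshold (L) and x >= threshold (R)."""
    left = [p for p in points if p[0] < threshold]
    right = [p for p in points if p[0] >= threshold]
    if len(left) < 2 or len(right) < 2:
        return float("-inf")  # a line cannot be fitted meaningfully
    parent = len(points) * math.log(fit_line_sigma(points))
    children = sum(len(s) * math.log(fit_line_sigma(s)) for s in (left, right))
    return parent - children


def best_split(points, candidates):
    """Pick the candidate threshold that maximizes the information gain."""
    return max(candidates, key=lambda t: information_gain(points, t))


# Toy data: two linear pieces that meet at a break near x = 5.
data = [(float(x), 2.0 * x) for x in range(5)]
data += [(float(x), 20.0 - x) for x in range(5, 10)]
best_threshold = best_split(data, [2.0, 5.0, 8.0])
```

On this toy data the threshold at the break between the two linear pieces gives the largest gain, because each side is then fitted almost perfectly by its own line.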

Example

Different models

Predictor models

Constant; polynomial and linear; probabilistic linear

Weak learners (split functions)

Axis-aligned; generic oriented hyperplane; conic section

Regression forest

$\mathbf{v} = (x_1, \ldots, x_d) \in \mathbb{R}^d$

Randomness

Bagging: each tree is trained on a random subset

of the whole training set
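A minimal sketch of bagging, assuming the random subsets are drawn with replacement (bootstrap sampling) and that the forest output is the average of the tree outputs. The helper names and the stand-in "trees" below are hypothetical and serve only to make the example runnable.

```python
import random


def bootstrap_subsets(training_set, n_trees, subset_size, seed=0):
    """Draw one random subset (with replacement) of the training set per tree."""
    rng = random.Random(seed)
    return [
        [rng.choice(training_set) for _ in range(subset_size)]
        for _ in range(n_trees)
    ]


def forest_predict(trees, x):
    """Average the predictions of all trees (each tree is a callable x -> y)."""
    return sum(tree(x) for tree in trees) / len(trees)


# Toy usage with hypothetical (x, y) training pairs.
training_set = [(float(i), 2.0 * i) for i in range(10)]
subsets = bootstrap_subsets(training_set, n_trees=3, subset_size=6)

# Stand-in "trees": each one predicts the mean label of its own subset.
trees = [
    (lambda x, m=sum(y for _, y in s) / len(s): m)
    for s in subsets
]
prediction = forest_predict(trees, 4.0)
```

In a real forest each subset would be used to grow a full regression tree; the constant predictors here only demonstrate the sampling and averaging steps.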

Randomness

Randomized node optimization: optimize a split

function at the jth node w.r.t. a small random subset

of parameter values

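Full optimization over all parameter values $\mathcal{T}$: $\theta_j = \arg\max_{\theta \in \mathcal{T}} I(S_j, \theta)$

Randomized node optimization over a small random subset $\mathcal{T}_j \subset \mathcal{T}$: $\theta_j = \arg\max_{\theta \in \mathcal{T}_j} I(S_j, \theta)$

$\theta_j = (\phi_j, \psi_j, \tau_j)$

$\phi_j$ selects features from the whole feature set

$\psi_j$ is a weak learner type (axis-aligned, linear, etc.)

$\tau_j$ is a set of splitting thresholds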

Forest vs tree

The labeled database

Anatomy localization

Key idea: all voxels in the image vote for the

position of the organ

Each organ is defined by its 3D axis-aligned

bounding box

$\mathbf{b}_c = (b_c^L, b_c^R, b_c^A, b_c^P, b_c^H, b_c^F),\quad c \in C$

C = {liver, spleen, kidneyL, kidneyR, …}


For each input voxel the distribution of

relative displacements to the organ bounding box

is obtained

$\mathbf{v} = (v_x, v_y, v_z)$

$\mathbf{d}_c(\mathbf{v}) = (d_c^L, d_c^R, d_c^A, d_c^P, d_c^H, d_c^F)$

$f(\mathbf{v}; \theta)$ – feature response


Voxel clusters with the highest confidence of

prediction are considered to be salient regions for

localization of an organ

salient regions are shown in green
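The voting idea can be sketched as below. This is an illustration under stated assumptions, not the InnerEye implementation: each voxel is assumed to predict a displacement d = v_hat - b, where v_hat = (vx, vx, vy, vy, vz, vz), so its vote for the box is v_hat - d, and votes are combined as a confidence-weighted average. The function name and the numbers are hypothetical.

```python
def vote_bounding_box(votes):
    """Combine voxel votes into one bounding box estimate.

    votes: list of (voxel, displacement, weight), where
      voxel = (vx, vy, vz),
      displacement = (dL, dR, dA, dP, dH, dF),
      weight = confidence of that voxel's prediction.
    """
    total = sum(w for _, _, w in votes)
    box = [0.0] * 6
    for (vx, vy, vz), d, w in votes:
        v_hat = (vx, vx, vy, vy, vz, vz)
        for k in range(6):
            # Assumed sign convention: absolute face position = v_hat - d.
            box[k] += w * (v_hat[k] - d[k]) / total
    return tuple(box)


# Two hypothetical voxels voting for the same box (10, 20, 30, 40, 50, 60).
votes = [
    ((12.0, 35.0, 55.0), (2.0, -8.0, 5.0, -5.0, 5.0, -5.0), 1.0),
    ((15.0, 32.0, 58.0), (5.0, -5.0, 2.0, -8.0, 8.0, -2.0), 3.0),
]
estimated_box = vote_bounding_box(votes)
```

Giving larger weights to confident voxels mirrors the role of the salient regions: voxels with low prediction uncertainty dominate the final estimate.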

Context-rich features

Features: mean intensity in randomly displaced boxes

Features for CT and MRI

CT: we can rely

on absolute

intensity values

MRI: only intensity

difference makes

sense
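A small sketch of such box features, assuming the volume is a nested list indexed as volume[z][y][x] and that the box offsets (drawn at random during training) are passed in explicitly. The function names are hypothetical; the CT variant uses the absolute mean intensity, the MRI variant the difference between two boxes.

```python
def box_mean(volume, center, offset, half_size):
    """Mean intensity in a cubic box displaced by `offset` from `center`.

    The box is clipped to the volume bounds; coordinates are (z, y, x).
    """
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    cz, cy, cx = (c + o for c, o in zip(center, offset))
    values = [
        volume[z][y][x]
        for z in range(max(cz - half_size, 0), min(cz + half_size + 1, nz))
        for y in range(max(cy - half_size, 0), min(cy + half_size + 1, ny))
        for x in range(max(cx - half_size, 0), min(cx + half_size + 1, nx))
    ]
    return sum(values) / len(values) if values else 0.0


def ct_feature(volume, voxel, offset, half_size):
    """CT: absolute mean intensity of one displaced box."""
    return box_mean(volume, voxel, offset, half_size)


def mri_feature(volume, voxel, offset_a, offset_b, half_size):
    """MRI: difference of mean intensities of two displaced boxes."""
    return (box_mean(volume, voxel, offset_a, half_size)
            - box_mean(volume, voxel, offset_b, half_size))


# Toy 4x4x4 volume whose intensity equals the x index.
volume = [[[float(x) for x in range(4)] for _ in range(4)] for _ in range(4)]
ct_value = ct_feature(volume, (1, 1, 1), (0, 0, 1), 1)
mri_value = mri_feature(volume, (1, 1, 1), (0, 0, 1), (0, 0, -1), 1)
```

The difference-based MRI feature is insensitive to a constant intensity shift of the whole volume, which is the reason relative rather than absolute values are used there.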

Learning clinically useful information

from medical images

Biomedical Image Analysis Group

Department of Computing

Daniel Rueckert

Segmentation using registration


Multi-atlas segmentation using classifier

fusion

Multi-atlas segmentation using classifier

fusion and selection
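One simple fusion rule is a per-voxel majority vote over the label maps propagated from each atlas. The slides do not state which fusion rule is used, so the sketch below is only an assumed example; `fuse_labels` and the label maps are hypothetical, and the atlases are assumed to be already registered to the target image.

```python
from collections import Counter


def fuse_labels(propagated):
    """Per-voxel majority vote.

    propagated: list of label maps (flat lists of equal length), one per atlas,
    each already transformed into the space of the target image.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*propagated)]


# Three hypothetical atlases, four voxels, binary labels.
atlas_labels = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 1, 0],
]
fused = fuse_labels(atlas_labels)
```

Atlas selection would restrict `propagated` to the atlases most similar to the target image before the vote is taken.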

Selection of atlases

How do we select the atlases most similar to our image?

Atlases should be clustered by disease/population

Manifold learning is used to efficiently discover

such clusters

Manifold learning

Several following slides are adapted from D. Rueckert

Find a manifold

Embed the data into the manifold

(project to a lower-dimensional space)

Manifold learning: Laplacian eigenmaps

Given a graph G = (V, E)

Each vertex vi corresponds to an image

Each edge weight $w_{ij}$ defines the similarity between

images i and j

Define a diagonal matrix T which contains the degree

sums for each vertex: $t_{ii} = \sum_j w_{ij}$

Manifold learning: Laplacian eigenmaps

Normalized graph Laplacian: $L = T^{-1/2} (T - W) T^{-1/2}$

$\min_{y} \sum_{i,j} w_{ij} \| y_i - y_j \|^2$

The eigendecomposition of L

provides manifold coordinates

$y_i$ for each vertex i (or image)
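A minimal NumPy sketch of Laplacian eigenmaps, assuming a symmetric similarity matrix W in which every vertex has positive degree. The function name and the toy matrix are hypothetical. The eigenvector with eigenvalue near zero is trivial and is skipped; the following eigenvectors, rescaled by T^{-1/2}, serve as the manifold coordinates.

```python
import numpy as np


def laplacian_eigenmaps(W, n_components=2):
    """Return (eigenvalues of L, coordinates Y) for similarity matrix W."""
    W = np.asarray(W, dtype=float)
    t = W.sum(axis=1)                      # degrees t_ii
    t_inv_sqrt = 1.0 / np.sqrt(t)          # assumes all degrees are > 0
    # L = T^{-1/2} (T - W) T^{-1/2} = I - T^{-1/2} W T^{-1/2}
    L = np.eye(len(W)) - t_inv_sqrt[:, None] * W * t_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L)   # eigenvalues in ascending order
    # Skip the first (trivial) eigenvector and map back with T^{-1/2}.
    Y = eigvecs[:, 1:n_components + 1] * t_inv_sqrt[:, None]
    return eigvals, Y


# Four hypothetical images forming two similar pairs: (0, 1) and (2, 3).
W = [
    [0.00, 1.00, 0.01, 0.01],
    [1.00, 0.00, 0.01, 0.01],
    [0.01, 0.01, 0.00, 1.00],
    [0.01, 0.01, 1.00, 0.00],
]
eigvals, Y = laplacian_eigenmaps(W, n_components=1)
```

In this toy case the single coordinate in `Y` takes one sign for images 0 and 1 and the opposite sign for images 2 and 3, so the two groups separate along the first manifold dimension.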

Manifold learning for multi-atlas

segmentation

We have two sets of images:

labeled (atlases)

unlabeled

We want to label all the unlabeled images

We can do it iteratively:

label a subset of the unlabeled images using the most similar

already-labeled images

the newly labeled images can be used as atlases for the next

iteration

Manifold learning for multi-atlas

segmentation

Wolz et al., Neuroimage, 2010

Example

Wolz et al., Neuroimage, 2010

Segmentation of brain lesions in MRI

Olga V. Senyukova, “Segmentation of blurred objects by

classification of isolabel contours”. Pattern Recognition,

2014

Data was provided by Children's Clinical and Research

Institute Emergency Surgery and Trauma

The proposed algorithm

Each MRI slice is processed separately

In order to improve speed and robustness the

regions containing lesions can be specified manually

Lesions inside these regions are segmented

automatically

Algorithm overview

Input region → isolabel contours I(x,y)=const → closed isolabel contours → nonlinear SVM classification

Isolabel contours

In geography

each isolabel contour (one color):

constant height f(x,y)=h

In image processing

each isolabel contour (one color):

constant intensity f(x,y)=I

How to distinguish lesion contours?

Visually we can do it easily!

Let’s use the same set of features for automatic

classification of isolabel contours

Features of isolabel contours

In order to distinguish isolabel contours delineating

lesions, 4 features were proposed:

Imean (mean intensity)

Imean inside the contour / Imean inside BBox

Imax-Imin (intensity range)

Ivariance (intensity variance)
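The four features could be computed along the following lines. This is a sketch under assumptions: the pixel intensities inside the contour and inside its bounding box are taken as already extracted, the range and variance are assumed to be measured inside the contour, and the order of the features in the vector is chosen here for illustration only.

```python
from statistics import mean, pvariance


def contour_features(inside, bbox):
    """Feature vector of one isolabel contour.

    inside: intensities of pixels inside the contour
    bbox:   intensities of pixels inside the contour's bounding box
    """
    i_mean = mean(inside)
    return [
        i_mean,                     # mean intensity inside the contour
        i_mean / mean(bbox),        # ratio to the mean inside the bounding box
        max(inside) - min(inside),  # intensity range (assumed inside contour)
        pvariance(inside),          # intensity variance (assumed inside contour)
    ]


# Hypothetical bright region on a darker background.
inside_pixels = [200, 210, 190, 200]
bbox_pixels = inside_pixels + [100, 100, 100, 100]
features = contour_features(inside_pixels, bbox_pixels)
```

A ratio above 1 in the second feature indicates that the region enclosed by the contour is brighter than its surroundings.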

Labeled training base

Various regions on many images:

a user can click on lesion contours: they will get “lesion”

other isolabel contours will automatically get “non-lesion”

Each contour is described by a feature vector [ɸ1, ɸ2, ɸ3, ɸ4] and a label:

[ɸ1, ɸ2, ɸ3, ɸ4] -> non-lesion

[ɸ1, ɸ2, ɸ3, ɸ4] -> lesion

[ɸ1, ɸ2, ɸ3, ɸ4] -> lesion

…

Binary classification via SVM

We have a binary classification task: each isolabel contour belongs to one of two classes, lesions or non-lesions

A widely used and effective classifier is the SVM (Support Vector Machine)

original linear SVM: Vladimir Vapnik, Alexey Chervonenkis, 1963

applying the kernel trick results in a nonlinear SVM: Bernhard Boser, Isabelle Guyon, Vladimir Vapnik, 1992

Linear SVM

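Figure labels: support vectors, margin

positive samples: $y_i = 1:\ \mathbf{w} \cdot \mathbf{x}_i + b \ge 1$

negative samples: $y_i = -1:\ \mathbf{w} \cdot \mathbf{x}_i + b \le -1$

Maximizing the margin $2 / \|\mathbf{w}\|$,

we solve a quadratic

optimization problem:

minimizing $\frac{1}{2} \mathbf{w}^T \mathbf{w}$

subject to $y_i (\mathbf{w} \cdot \mathbf{x}_i + b) \ge 1$

Solution is a hyperplane: $\mathbf{w} \cdot \mathbf{x} + b = \sum_i \alpha_i y_i\, \mathbf{x}_i \cdot \mathbf{x} + b$

$\mathbf{x}_i$ – support vectors

$\alpha_i$ – learned weights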

Nonlinear SVM

For linearly separable data, linear SVM works well

What about data that is not linearly separable?

We can make it linearly separable by mapping it to

a higher-dimensional space

Nonlinear SVM: kernel trick

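Instead of $\sum_i \alpha_i y_i\, \mathbf{x}_i \cdot \mathbf{x} + b$ we have $\sum_i \alpha_i y_i K(\mathbf{x}_i, \mathbf{x}) + b$

where $K(\mathbf{x}_i, \mathbf{x}_j) = \varphi(\mathbf{x}_i) \cdot \varphi(\mathbf{x}_j)$

RBF kernel: $K(\mathbf{x}_i, \mathbf{x}_j) = \exp\left(-\gamma \|\mathbf{x}_i - \mathbf{x}_j\|^2\right)$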

For classification of isolabel contours nonlinear SVM

with RBF (radial basis function) kernel is used
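To make the kernelized decision function concrete, here is a small sketch that evaluates it with an RBF kernel. The support vectors, weights, bias and the kernel width `gamma` are hand-picked hypothetical values, not the result of training; the point is only that an XOR-like layout, which no single hyperplane separates, is handled by the kernel form.

```python
import math


def rbf_kernel(a, b, gamma):
    """K(a, b) = exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))


def decision(x, support_vectors, labels, alphas, b, gamma):
    """Sign of sum_i alpha_i * y_i * K(x_i, x) + b."""
    s = sum(
        a * y * rbf_kernel(sv, x, gamma)
        for sv, y, a in zip(support_vectors, labels, alphas)
    )
    return 1 if s + b >= 0 else -1


# Hypothetical XOR-like configuration (values chosen by hand, not learned).
support_vectors = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
labels = [1, 1, -1, -1]
alphas = [1.0, 1.0, 1.0, 1.0]

near_origin = decision((0.1, 0.1), support_vectors, labels, alphas, 0.0, 2.0)
near_corner = decision((0.9, 0.1), support_vectors, labels, alphas, 0.0, 2.0)
```

In practice the weights and bias come from solving the SVM optimization problem, typically with an existing library rather than by hand.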

Ensemble-based analysis of RR and gait

Olga Senyukova

Valeriy Gavrishchaka, Department of Physics, West

Virginia University

Springer, 2013, 2015

RR and gait time series

Normal?

Huntington’s disease?

Parkinson’s disease?

…

Normal?

Arrhythmia?

Congestive heart failure?

…

Ensemble learning techniques

Ensemble can work better than a single classifier

Weak learner 1 (accuracy: 0.61), Weak learner 2 (accuracy: 0.73), …, Weak learner N (accuracy: 0.65)

Ensemble of classifiers (accuracy: 0.9)

AdaBoost

Freund and Schapire, 1997

At each iteration, it focuses on the samples that are

hardest to classify

AdaBoost

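$\mathbf{x}_i,\ i = 1, \ldots, N$ – training data, $y_i \in \{-1; +1\}$ – labels

Initial weights of all N samples: $w_1(i) = 1/N$

M iterations, from m = 1 to M:

find $T_m = \arg\min_{T_j} \sum_{i=1}^{N} w_m(i)\,[y_i \ne T_j(\mathbf{x}_i)]$; the attained minimum is the weighted error $\varepsilon_m$ of $T_m$

set $\alpha_m = \frac{1}{2} \log \frac{1 - \varepsilon_m}{\varepsilon_m}$

update $w_{m+1}(i) = \frac{w_m(i) \exp\left(-\alpha_m y_i T_m(\mathbf{x}_i)\right)}{Z_m}$, where $Z_m$ normalizes the weights so that they sum to 1

Classifier output: $H(\mathbf{x}) = \operatorname{sign}\left(\sum_{m=1}^{M} \alpha_m T_m(\mathbf{x})\right)$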
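The iteration scheme can be sketched in plain Python with decision stumps as the weak learners. The stump family, the toy data and all function names are assumptions made for illustration; the error is clipped away from 0 and 1 so that the logarithm stays finite.

```python
import math


def stump(feature, threshold, polarity):
    """Weak learner: +polarity if x[feature] > threshold, else -polarity."""
    return lambda x: polarity if x[feature] > threshold else -polarity


def adaboost(X, y, candidates, M):
    """Return a list of (alpha_m, T_m) pairs after M boosting iterations."""
    n = len(X)
    w = [1.0 / n] * n                       # initial weights
    model = []
    for _ in range(M):
        def weighted_error(T):
            return sum(wi for wi, xi, yi in zip(w, X, y) if T(xi) != yi)

        T_m = min(candidates, key=weighted_error)          # find
        eps = min(max(weighted_error(T_m), 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - eps) / eps)            # set
        w = [wi * math.exp(-alpha * yi * T_m(xi))          # update
             for wi, xi, yi in zip(w, X, y)]
        z = sum(w)                                         # normalizer Z_m
        w = [wi / z for wi in w]
        model.append((alpha, T_m))
    return model


def predict(model, x):
    """Sign of the weighted vote of all weak learners."""
    return 1 if sum(a * T(x) for a, T in model) >= 0 else -1


# Toy 1D data that no single stump classifies correctly.
X = [(1.0,), (2.0,), (3.0,), (4.0,), (5.0,), (6.0,)]
y = [1, 1, -1, -1, 1, 1]
candidates = [stump(0, k + 0.5, p) for k in range(7) for p in (1, -1)]
model = adaboost(X, y, candidates, M=3)
predictions = [predict(model, x) for x in X]
```

Each stump alone misclassifies at least two of the six samples, while the weighted combination of three stumps labels all of them correctly.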

Good classifier example

Iteration 1 of 3

T1

Iteration 2 of 3

T2

Iteration 3 of 3

STOP T3

Final model

$H(\mathbf{x}) = \operatorname{sign}\left[0.42\, T_1(\mathbf{x}) + 0.70\, T_2(\mathbf{x}) + 0.72\, T_3(\mathbf{x})\right]$

Ensemble decomposition learning

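We apply an ensemble-based classifier to a point x:

$H(\mathbf{x}) = \sum_{m=1}^{M} \alpha_m T_m(\mathbf{x})$

Each x can be described by its ensemble

decomposition vector (EDL vector):

$D(\mathbf{x}) = [\alpha_1 T_1(\mathbf{x}), \alpha_2 T_2(\mathbf{x}), \ldots, \alpha_M T_M(\mathbf{x})]$

We can classify data points by comparing their EDL

vectors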

EDL: learning

All available data

«normal/abnormal»

MSE (multiscale entropy), DFA (detrended fluctuation analysis)

AdaBoost

Indicators from nonlinear

dynamics

Building a general classifier

«normal/abnormal»

α1·MSE1 + α2·DFA2 + … + αN·MSEN

Ensemble classifier

Training sample x


Applying the ensemble

MSE

+1 (normal) -1 (abnormal) +1 (normal) -1 (abnormal)

DFA

$D(\mathbf{x}) = [\alpha_1 \cdot 1,\ \alpha_2 \cdot (-1),\ \ldots,\ \alpha_M \cdot (-1)]$

EDL vector

Disease XXX

EDL: testing

Input y


Applying the ensemble

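EDL vector: $D(\mathbf{y}) = [\alpha_1 \cdot (-1),\ \alpha_2 \cdot 1,\ \ldots,\ \alpha_M \cdot 1]$

Is $D(\mathbf{y})$ close to $D(\mathbf{x})$, the EDL vector of a training sample x with disease XXX?

yes: y has disease XXX

no: y does not have disease XXX

In a multi-class classification problem, the class of y is the class of the training example

with the closest EDL vector:

$C(\mathbf{y}) = C_k:\ \| D(\mathbf{x}_{C_k}) - D(\mathbf{y}) \| = \min_{i} \| D(\mathbf{x}_{C_i}) - D(\mathbf{y}) \|$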
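A minimal sketch of classification by EDL vectors, assuming Euclidean distance between vectors and a tiny hand-made ensemble. The weak learners, their weights and the training samples are hypothetical stand-ins for the MSE- and DFA-based base classifiers.

```python
import math


def edl_vector(model, x):
    """Ensemble decomposition vector: one weighted weak-learner output per entry."""
    return [alpha * T(x) for alpha, T in model]


def classify_by_edl(model, training, y):
    """Return the class of the training sample whose EDL vector is closest to D(y).

    training: list of (sample, class_name) pairs.
    """
    d_y = edl_vector(model, y)

    def distance(item):
        return math.dist(edl_vector(model, item[0]), d_y)

    return min(training, key=distance)[1]


# Hypothetical ensemble of three weak learners with hand-picked weights.
model = [
    (0.5, lambda x: 1 if x[0] > 0 else -1),
    (0.3, lambda x: 1 if x[1] > 0 else -1),
    (0.2, lambda x: 1 if x[0] + x[1] > 0 else -1),
]
training = [
    ((1.0, 1.0), "normal"),
    ((-1.0, 1.0), "disease XXX"),
    ((-1.0, -1.0), "disease YYY"),
]
predicted_class = classify_by_edl(model, training, (-2.0, 0.5))
```

The ensemble itself only separates normal from abnormal, yet comparing the per-learner outputs lets the two abnormal classes be told apart.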

Results

CHF/Arrhythmia classification

Real data from

http://www.physionet.org/physiobank

Thank you for your attention!

knizhnayaraduga.ru

http://knizhnayaraduga.ru/EKSMO-Kniga-pyshka-Dobryy-doktor_1260t.html
