Machine Learning Applications in Medicine (Olga Senyukova)
MACHINE LEARNING APPLICATIONS IN MEDICINE
Olga Senyukova
Graphics & Media Lab
Faculty of Computational Mathematics and Cybernetics
Lomonosov Moscow State University
Medical data
Medical images
Physiologic signals
Other: narrative, textual, numerical, etc.
Medical images
X-ray
MRI
CT
Ultrasound
Computed tomography (CT)
1972, Sir Godfrey Hounsfield. X-rays are computer-processed to produce tomographic images.
https://en.wikipedia.org/wiki/CT_scan
Computed tomography (CT)
insightci.com.au
Magnetic resonance imaging (MRI)
1973, Paul C. Lauterbur and Peter Mansfield. Allows imaging the body slice by slice.
Source: K. Toennies
Magnetic resonance imaging (MRI)
www.raleighrad.com
Electrocardiography (ECG)
1901, Willem Einthoven. Recording of the electrical activity of the heart with electrodes placed on the body.
intensivecarehotline.com
RR time series
RR time series (interbeat interval lengths) are widely used for ECG analysis
www.elsevier.es
Human gait time series
reylab.bidmc.harvard.edu
Analysis: what for?
Normal or diseased?
Where is the diseased area?
What changes occur over time (especially after treatment)?
Does a specific condition take place (e.g. overtraining of an athlete)?
…
www.fresher.ru
Main tasks: images
Detection (e.g. of an aneurysm)
Segmentation
Matching (registration)
Main tasks: physiologic signals
Diagnostics: healthy, disease XXX, or disease YYY?
Template matching: does condition ZZZ take place (the same or not)?
Machine learning in medical imaging: challenges
Slide by D. Rueckert
Images are often 3D or 4D: the number of voxels and the number of extracted features is very large.
The number of images for training is often limited: "large" datasets typically mean 100 to 1000 images (the "small sample size problem").
Machine learning in medical imaging: challenges
Training data is expensive: annotation of images is resource-intensive (manpower, cost, time); it is sometimes possible to augment training using unlabelled images.
Training data is sometimes imperfect: it may be wrongly labelled, e.g. diseases such as Alzheimer's require confirmation through pathology (difficult and costly to obtain).
Slide by D. Rueckert
The InnerEye project
Measuring brain tumors
Localizing and identifying vertebrae
Kinect for surgery
Source: A. Criminisi & the InnerEye team @ MSRC
Anatomy localization via regression forests
A. Criminisi, et al. Med Image Analysis 2013
Decision forests
Leo Breiman, 2001
A. Criminisi, J. Shotton (eds.). Decision Forests in Computer Vision and Medical Image Analysis // Advances in Computer Vision and Pattern Recognition, 2013
Decision forest consists of decision trees…
Decision tree
Each internal node: a split (test) function
Each leaf: a class label (predictor)
Source: A. Konushin
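As a minimal illustration of this structure, routing an input through such a tree can be sketched in a few lines of Python; the split tests and class labels here are made up for illustration, not taken from the lecture:

```python
# Hypothetical hand-built tree: each internal node holds a split (test)
# function, each leaf holds a class label, as described above.
tree = {"test": lambda x: x[0] < 0.5,
        "left": {"leaf": "class A"},
        "right": {"test": lambda x: x[1] < 0.5,
                  "left": {"leaf": "class B"},
                  "right": {"leaf": "class C"}}}

def predict(node, x):
    """Route x down the tree: apply the split function at each internal
    node, return the label stored at the leaf that is reached."""
    while "leaf" not in node:
        node = node["left"] if node["test"](x) else node["right"]
    return node["leaf"]

a = predict(tree, (0.2, 0.9))   # first test passes: left leaf
b = predict(tree, (0.9, 0.1))   # first test fails, second passes
c = predict(tree, (0.9, 0.9))   # both tests fail
```

Training chooses the split functions; prediction is just this traversal.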
Regression tree
An input value is mapped to a continuous label.
• Green – high uncertainty
• Red – low uncertainty
• Thickness – the number of samples from the training set
Source: A. Criminisi, J. Shotton
Regression tree: training
• S0 – the whole training set
• Sj – the part of the training set reaching the jth node
p(y|x) = N(y; ȳ(x), σ_y²(x))
Source: A. Criminisi, J. Shotton
Regression tree: training
Split function parameters at the jth node maximize the information gain:
θ_j* = argmax_θ I(S_j, θ)
I(S_j, θ) = Σ_{(x,y)∈S_j} log(σ_y(x)) − Σ_{i∈{L,R}} Σ_{(x,y)∈S_j^i} log(σ_y(x))
At each part (L, R): fit a line to the points (e.g. by least squares); then for each x we have p(y|x) = N(y; ȳ(x), σ_y²(x)), with ȳ the fitted (green) line.
Source: A. Criminisi, J. Shotton
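A minimal sketch of this training criterion, assuming the simplest constant predictor per node (so σ_y reduces to the standard deviation of the labels in the node) and a 1-D axis-aligned split; the toy data and the minimum-node-size guard are my own additions:

```python
import numpy as np

def node_cost(y):
    # Sum over the node's samples of log(sigma_y): with a constant
    # predictor, sigma_y is just the std of the labels in the node
    return len(y) * np.log(np.std(y) + 1e-12)

def best_split(x, y, min_samples=2):
    """Exhaustively search 1-D thresholds for the split that maximizes
    the information gain I(S_j, theta)."""
    best_gain, best_t = -np.inf, None
    for t in np.unique(x)[:-1]:
        left, right = y[x <= t], y[x > t]
        if len(left) < min_samples or len(right) < min_samples:
            continue  # skip degenerate splits with near-empty children
        gain = node_cost(y) - node_cost(left) - node_cost(right)
        if gain > best_gain:
            best_gain, best_t = gain, t
    return best_t, best_gain

# Piecewise-constant toy data: the best split separates the two plateaus
x = np.array([0.0, 1.0, 2.0, 3.0, 10.0, 11.0, 12.0, 13.0])
y = np.array([1.0, 1.1, 0.9, 1.0, 5.0, 5.1, 4.9, 5.0])
t, gain = best_split(x, y)
```

The split lands between the two plateaus, where both children become low-variance.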
Example
Source: A. Criminisi, J. Shotton
Example
Source: A. Criminisi, J. Shotton
Different models
Predictor models:
• Constant
• Polynomial / linear
• Probabilistic linear
Weak learners (split functions):
• Axis-aligned hyperplane
• Generic oriented hyperplane
• Conic section
Source: A. Criminisi, J. Shotton
Regression forest
v = (x_1, …, x_d) ∈ R^d
Source: A. Criminisi, J. Shotton
Randomness
Bagging: each tree is learned on a random subset of the whole training set
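The bagging step can be sketched as follows; the dataset size and number of trees are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
indices = np.arange(100)                # indices of the full training set

# Bagging: each tree is trained on a bootstrap sample of the training
# set, drawn with replacement, so every tree sees a different dataset
n_trees = 5
bags = [rng.choice(indices, size=len(indices), replace=True)
        for _ in range(n_trees)]

# A bootstrap sample of size n contains about 63% distinct items
unique_fractions = [len(np.unique(b)) / len(indices) for b in bags]
```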
Randomness
Randomized node optimization: optimize the split function at the jth node w.r.t. a small random subset T_j of parameter values:
θ_j* = argmax_{θ∈T_j} I(S_j, θ)   instead of   θ_j* = argmax_{θ∈T} I(S_j, θ)
A split parameter is a triplet θ = (φ, ψ, τ), where
• φ selects features from the whole feature set
• ψ is a weak learner type (axis-aligned, linear, etc.)
• τ is a set of splitting thresholds
Source: A. Criminisi, J. Shotton
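The idea can be sketched as follows; the objective below is a stand-in for the real information gain I(S_j, θ), and the parameter grid and subset size are made up:

```python
import numpy as np

rng = np.random.default_rng(1)

# T: the whole set of candidate split parameters (here: 1000 thresholds)
all_params = np.linspace(0.0, 1.0, 1000)

# Randomized node optimization: this node only searches a small random
# subset T_j of T, which decorrelates the trees of the forest
T_j = rng.choice(all_params, size=10, replace=False)

def objective(theta):
    # stand-in for the information gain I(S_j, theta)
    return -(theta - 0.37) ** 2

theta_j = max(T_j, key=objective)   # argmax over T_j, not over all of T
```

Each node (and each tree) draws its own T_j, so two trees rarely pick identical splits.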
Forest vs tree
Source: A. Criminisi, J. Shotton
The labeled database
Source: A. Criminisi, J. Shotton
Anatomy localization
Key idea: all voxels in the image vote for the position of the organ
Each organ c ∈ C is defined by its 3D axis-aligned bounding box
b_c = (b_c^L, b_c^R, b_c^A, b_c^P, b_c^H, b_c^F) ∈ R^6
C = {liver, spleen, kidneyL, kidneyR, …}
Source: A. Criminisi, J. Shotton
Anatomy localization
For each input voxel v = (v_x, v_y, v_z) we obtain a distribution of relative displacements to the organ bounding box:
d_c(v) = (d_c^L, d_c^R, d_c^A, d_c^P, d_c^H, d_c^F)
f(v; θ) – feature response
Source: A. Criminisi, J. Shotton
Context-rich features
Features: mean intensity in randomly displaced boxes
Source: A. Criminisi, J. Shotton
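Such box-mean features are cheap to evaluate with an integral (summed-area) image; a small 2-D sketch, where the reference position, displacement, and box size are arbitrary:

```python
import numpy as np

def box_mean(integral, y0, x0, y1, x1):
    """Mean intensity of image[y0:y1, x0:x1] via a summed-area image."""
    total = (integral[y1, x1] - integral[y0, x1]
             - integral[y1, x0] + integral[y0, x0])
    return total / ((y1 - y0) * (x1 - x0))

img = np.arange(25, dtype=float).reshape(5, 5)
# integral image padded with a zero row/column for easy indexing
integral = np.zeros((6, 6))
integral[1:, 1:] = img.cumsum(0).cumsum(1)

# feature response: mean intensity in a 2x2 box displaced from a voxel
ref_y, ref_x, dy, dx = 1, 1, 1, 2      # hypothetical displacement
m = box_mean(integral, ref_y + dy, ref_x + dx, ref_y + dy + 2, ref_x + dx + 2)
```

After one O(n) pass to build the integral image, any box mean costs four lookups, which is what makes thousands of randomly displaced boxes per voxel affordable.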
Features for CT and MRI
CT: we can rely on absolute intensity values
MRI: only intensity differences make sense
Source: A. Criminisi, J. Shotton
Learning clinically useful information from medical images
Biomedical Image Analysis Group Department of Computing Daniel Rueckert
Segmentation using registration
Slide by D. Rueckert
Multi-atlas segmentation using classifier fusion
Multi-atlas segmentation using classifier fusion and selection
Selection of atlases
How to select the atlases most similar to our image?
Atlases should be clustered by disease/population
Manifold learning is used to efficiently discover such clusters
Manifold learning
Source: D. Rueckert
Find a manifold
Embed the data into the manifold (project to a lower-dimensional space)
Manifold learning: Laplacian eigenmaps
Given a graph G = (V, E):
• each vertex v_i corresponds to an image
• each edge weight w_ij defines the similarity between images i and j
Define a diagonal matrix T which contains the degree sums for each vertex: t_ii = Σ_j w_ij
Slide by D. Rueckert
Manifold learning: Laplacian eigenmaps
Normalized graph Laplacian:
L = T^(−1/2) (T − W) T^(−1/2)
The embedding minimizes Σ_ij w_ij ||y_i − y_j||²
The eigendecomposition of L provides the manifold coordinates y_i for each vertex i (or image)
Source: D. Rueckert
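A minimal numerical sketch with four "images" forming two similarity clusters; the weights are made up, and only a 1-D embedding is extracted:

```python
import numpy as np

# Similarity graph over four "images": two tight clusters {0,1} and {2,3}
W = np.array([[0.0, 0.9, 0.1, 0.1],
              [0.9, 0.0, 0.1, 0.1],
              [0.1, 0.1, 0.0, 0.9],
              [0.1, 0.1, 0.9, 0.0]])

degrees = W.sum(axis=1)                    # t_ii = sum_j w_ij
T = np.diag(degrees)
T_inv_sqrt = np.diag(1.0 / np.sqrt(degrees))
L = T_inv_sqrt @ (T - W) @ T_inv_sqrt      # normalized graph Laplacian

# eigh returns eigenvalues in ascending order; the first eigenvalue is 0,
# and the eigenvector of the second-smallest one gives 1-D coordinates
eigvals, eigvecs = np.linalg.eigh(L)
coords = eigvecs[:, 1]
```

In the 1-D coordinates, the two clusters land on opposite sides of zero, so clustering by disease/population reduces to thresholding or k-means in the embedded space.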
Manifold learning for multi-atlas segmentation
We have two sets of images: labeled (atlases) and unlabeled
We want to label all the unlabeled images
We can do it iteratively:
• label a part of the unlabeled images using the most similar already-labeled images
• these images can be used as atlases for the next iteration
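The iterative scheme can be sketched with scalar "manifold coordinates" standing in for images; the coordinates and atlas labels below are invented for illustration:

```python
import numpy as np

# 1-D manifold coordinates of five images; the first two are labeled atlases
coords = np.array([0.0, 1.0, 0.2, 0.5, 0.8])
labels = {0: "A", 1: "B"}              # hypothetical atlas labels

# Iteratively label the unlabeled image closest (on the manifold) to any
# labeled one; newly labeled images act as atlases in later iterations
while len(labels) < len(coords):
    unlabeled = [i for i in range(len(coords)) if i not in labels]
    i, j, _ = min(((i, j, abs(coords[i] - coords[j]))
                   for i in unlabeled for j in labels),
                  key=lambda t: t[2])
    labels[i] = labels[j]
```

A real system would propagate segmentations by registration rather than copy a class label, but the hop-by-hop ordering over the manifold is the same.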
Manifold learning for multi-atlas segmentation
Wolz et al., Neuroimage, 2010
Example
Wolz et al., Neuroimage, 2010
Segmentation of brain lesions in MRI
Olga V. Senyukova, “Segmentation of blurred objects by classification of isolabel contours”. Pattern Recognition, 2014
Data was provided by the Children's Clinical and Research Institute of Emergency Surgery and Trauma
The proposed algorithm
Each MRI slice is processed separately.
In order to improve speed and robustness, the regions containing lesions can be specified manually.
Lesions inside these regions are segmented automatically.
Algorithm overview
Input region → isolabel contours I(x,y) = const → closed isolabel contours → nonlinear SVM classification
Isolabel contours
In geography, each isolabel contour (one color) corresponds to a constant height: f(x,y) = h.
In image processing, each isolabel contour (one color) corresponds to a constant intensity: f(x,y) = I.
How to distinguish lesion contours? Visually we can do it easily! Let's use the same set of features for automatic classification of isolabel contours.
Features of isolabel contours
In order to distinguish isolabel contours delineating lesions we propose 4 features:
• I_mean
• I_mean inside the contour / I_mean inside the bounding box
• I_max − I_min
• I_variance
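With NumPy, features of this kind can be computed from an image and a mask of the region enclosed by a contour; the toy image is made up, and the exact definitions in the paper may differ (e.g. in how the bounding box is taken):

```python
import numpy as np

def contour_features(image, mask):
    """Four features of the region enclosed by an isolabel contour.
    mask is a boolean array marking the pixels inside the contour."""
    inside = image[mask]
    ys, xs = np.where(mask)
    bbox = image[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    return np.array([
        inside.mean(),                   # I_mean inside the contour
        inside.mean() / bbox.mean(),     # I_mean inside / I_mean in bbox
        inside.max() - inside.min(),     # I_max - I_min
        inside.var(),                    # I_variance
    ])

img = np.array([[1.0, 1.0, 1.0, 1.0],
                [1.0, 5.0, 7.0, 1.0],
                [1.0, 5.0, 7.0, 1.0],
                [1.0, 1.0, 1.0, 1.0]])
mask = img > 2.0                         # a toy "bright lesion" region
phi = contour_features(img, mask)        # the feature vector for one contour
```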
Labeled training base
Various regions on many images: a user can click on lesion contours, which get the label "lesion"; all other isolabel contours automatically get "non-lesion".
[ɸ1, ɸ2, ɸ3, ɸ4] → lesion
[ɸ1, ɸ2, ɸ3, ɸ4] → non-lesion
…
[ɸ1, ɸ2, ɸ3, ɸ4] is a feature vector
Binary classification via SVM
We have a binary classification task: each isolabel contour belongs to one of two classes, lesion or non-lesion.
One of the best classifiers is SVM – the Support Vector Machine:
• original linear SVM: Vladimir Vapnik, Alexey Chervonenkis, 1963
• applying a kernel trick results in nonlinear SVM: Bernhard Boser, Isabelle Guyon, Vladimir Vapnik, 1992
Linear SVM
Positive samples: w·x_i + b ≥ +1 (y_i = +1)
Negative samples: w·x_i + b ≤ −1 (y_i = −1)
The margin between the support vectors is 2/||w||.
Maximizing 2/||w||, we solve a quadratic optimization problem:
minimize (1/2) wᵀw
subject to y_i (w·x_i + b) ≥ 1
Solution is a hyperplane: f(x) = Σ_i α_i y_i (x_i·x) + b
• x_i – support vectors
• α_i – learned weights
Nonlinear SVM
For linearly separable data, linear SVM is excellent.
What about data that is not linearly separable?
We can make it linearly separable by mapping it to a higher-dimensional space.
Nonlinear SVM: kernel trick
Instead of f(x) = Σ_i α_i y_i (x_i·x) + b we have f(x) = Σ_i α_i y_i K(x_i, x) + b,
where K(x_i, x_j) = ϕ(x_i)·ϕ(x_j)
RBF kernel: K(x_i, x_j) = exp(−γ ||x_i − x_j||²)
For classification of isolabel contours I use nonlinear SVM with the RBF (radial basis function) kernel.
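A sketch of the resulting decision function; the support vectors, weights α_i, and γ below are made-up stand-ins for a trained model, not values from the lecture:

```python
import numpy as np

def rbf_kernel(a, b, gamma=0.5):
    # K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2)
    return np.exp(-gamma * np.sum((a - b) ** 2))

def decision(x, support_vectors, alphas, labels, b=0.0, gamma=0.5):
    """Nonlinear SVM decision: sign(sum_i alpha_i y_i K(x_i, x) + b)."""
    s = sum(a * y * rbf_kernel(sv, x, gamma)
            for sv, a, y in zip(support_vectors, alphas, labels))
    return np.sign(s + b)

# Hypothetical learned model: one support vector per class
svs = [np.array([0.0, 0.0]), np.array([2.0, 2.0])]
alphas = [1.0, 1.0]
labels = [+1, -1]

pred_near_pos = decision(np.array([0.1, 0.1]), svs, alphas, labels)
pred_near_neg = decision(np.array([1.9, 2.1]), svs, alphas, labels)
```

The kernel trick never computes ϕ(x) explicitly: only kernel values between x and the support vectors are needed at test time.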
Ensemble-based analysis of RR and gait
Olga Senyukova
Valeriy Gavrishchaka, Department of Physics, West Virginia University
Springer, 2013, 2015
RR and gait time series
Normal?
Huntington’s disease?
Parkinson’s disease?
…
Normal?
Arrhythmia?
Congestive heart failure?
…
Ensemble learning techniques
An ensemble can work better than a single classifier:
…
accuracy: 0.61 accuracy: 0.73 accuracy: 0.65
base classifier 1
base classifier 2
base classifier N
Ensemble of classifiers
accuracy: 0.9
AdaBoost
Freund and Schapire, 1997. On each iteration, focuses on the hardest-to-classify samples.
AdaBoost
x_i, i = 1, …, N – training data; y_i ∈ {−1; +1} – labels
Initial weights of all N items: w^(0)(i) = 1/N
M iterations, for m = 1 to M:
• find T_m = argmin_{T_j} ε_m(T_j), where ε_m(T_j) = Σ_i w^(m)(i) [y_i ≠ T_j(x_i)]
• if ε_m ≥ 1/2 then stop
• set α_m = (1/2) log((1 − ε_m)/ε_m)
• update w^(m+1)(i) = w^(m)(i) exp(−α_m y_i T_m(x_i)) / Z_m
Classifier output: H(x) = sign(Σ_{m=1}^M α_m T_m(x))
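The loop above can be sketched with 1-D threshold "stumps" as the base classifiers T_j; the toy data is invented, and a small clamp on ε avoids log(0) for perfectly separable data:

```python
import numpy as np

def train_adaboost(x, y, M=3):
    """AdaBoost with 1-D stumps T(x) = s * sign(x - t).
    Returns the chosen stumps (t, s) and their weights alpha_m."""
    N = len(x)
    w = np.full(N, 1.0 / N)                 # initial weights 1/N
    xs = np.sort(x)
    thresholds = (xs[:-1] + xs[1:]) / 2     # candidate split points
    stumps, alphas = [], []
    for _ in range(M):
        best = None
        for t in thresholds:                # find the stump with minimal
            for s in (+1, -1):              # weighted error eps_m
                pred = s * np.sign(x - t)
                err = w[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, t, s)
        err, t, s = best
        if err >= 0.5:                      # no better than chance: stop
            break
        err = max(err, 1e-10)               # clamp so alpha stays finite
        alpha = 0.5 * np.log((1 - err) / err)
        pred = s * np.sign(x - t)
        w = w * np.exp(-alpha * y * pred)   # reweight: mistakes grow
        w /= w.sum()                        # normalization constant Z_m
        stumps.append((t, s))
        alphas.append(alpha)
    return stumps, alphas

def predict(x, stumps, alphas):
    # H(x) = sign(sum_m alpha_m T_m(x))
    votes = sum(a * s * np.sign(x - t) for (t, s), a in zip(stumps, alphas))
    return np.sign(votes)

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([-1, -1, -1, 1, 1, 1])
stumps, alphas = train_adaboost(x, y)
acc = (predict(x, stumps, alphas) == y).mean()
```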
Good classifier example
Iteration 1 of 3: T1
Iteration 2 of 3: T2
Iteration 3 of 3: T3 – STOP
Final model
sign[0.42 T1(x) + 0.70 T2(x) + 0.72 T3(x)]
Ensemble decomposition learning
We apply an ensemble-based classifier H(x) = sign(Σ_{m=1}^M α_m T_m(x)) to a vector x.
Each x can be described by its ensemble decomposition vector (EDL vector):
D(x) = [α_1 T_1(x), α_2 T_2(x), …, α_M T_M(x)]
We can classify data points by comparing their EDL vectors.
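A toy sketch of the decomposition; the three base classifiers and their weights are invented (the weights echo the earlier 0.42/0.70/0.72 example). It shows that two inputs can get the same ensemble output H(x) while having clearly different EDL vectors D(x):

```python
import numpy as np

# Hypothetical trained ensemble: three +/-1 base classifiers with weights
alphas = np.array([0.42, 0.70, 0.72])
classifiers = [lambda x: np.sign(x - 1.0),
               lambda x: np.sign(x - 2.0),
               lambda x: -np.sign(x - 4.0)]

def edl_vector(x):
    # D(x) = [alpha_1*T_1(x), alpha_2*T_2(x), ..., alpha_M*T_M(x)]
    return np.array([a * T(x) for a, T in zip(alphas, classifiers)])

def ensemble_output(x):
    # H(x) = sign(sum_m alpha_m T_m(x)): the sum of the EDL components
    return np.sign(edl_vector(x).sum())

d1, d2 = edl_vector(3.0), edl_vector(5.0)   # same H, different D
```

The binary label discards exactly the per-classifier pattern that D(x) keeps, which is what makes EDL vectors usable for finer-grained (multi-class) discrimination.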
EDL: learning
All available data, labeled "normal/abnormal"
MSE, DFA, … – indicators from nonlinear dynamics, used as base classifiers
AdaBoost builds a general "normal/abnormal" classifier:
α_1·MSE_1 + α_2·DFA_2 + … + α_N·MSE_N
Ensemble of classifiers
Training example x
Applying the ensemble α_1·MSE_1 + α_2·DFA_2 + … + α_N·MSE_N
Each base classifier (MSE, DFA, …) outputs +1 (normal) or −1 (abnormal)
EDL vector: D(x) = [α_1·(+1), α_2·(−1), …, α_M·(−1)]
EDL: testing
Testing example y
Applying the ensemble α_1·MSE_1 + α_2·DFA_2 + … + α_N·MSE_N
EDL vector: D(y) = [α_1·(−1), α_2·(+1), …, α_M·(+1)]
D(x) ≈ D(y)? If yes, x and y belong to the same class; if no, they do not.
In a multi-class classification problem, the class of y is the class of the training example with the closest EDL vector:
y belongs to class C_i if ||D(x_i) − D(y)|| = min_{k∈C} ||D(x_k) − D(y)||
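This multi-class rule is a nearest-neighbor search in EDL space; the EDL vectors and class names below are made up for illustration:

```python
import numpy as np

# Hypothetical EDL vectors of labeled training examples (one row each)
train_edl = np.array([[0.4, 0.7, 0.7],      # class "CHF"
                      [0.4, 0.7, -0.7],     # class "arrhythmia"
                      [-0.4, -0.7, 0.7]])   # class "normal"
train_labels = ["CHF", "arrhythmia", "normal"]

def classify_by_edl(d_y):
    """Assign y the class of the training example whose EDL vector
    is closest (in Euclidean norm) to D(y)."""
    dists = np.linalg.norm(train_edl - d_y, axis=1)
    return train_labels[int(np.argmin(dists))]

label = classify_by_edl(np.array([0.38, 0.72, -0.65]))
```

Note that the general ensemble itself stays binary ("normal/abnormal"); the multi-class decision lives entirely in the comparison of decomposition vectors.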
Results
CHF/Arrhythmia classification Real data from
http://www.physionet.org/physiobank
Thank you for your attention!