An empirical comparison of CNNs and other methods for classification of protein subcellular localization with microscopy images

Mengli Xiao, Wei Pan
Division of Biostatistics, University of Minnesota
June 5, 2018



Outline

1 Background
    Protein subcellular localization
    Data description
    Implementation
2 A Convolutional Neural Network: DeepYeast
    CNN model structure
    Result
3 Residual Neural Network
    ResNet model structures
    Result
4 Feature extraction and transfer learning
    Definition
    Result
5 Summary
    Comparison of different methods
    Discussion


Background


Protein subcellular localization

- A protein’s subcellular localization <==> function
- Spatial-temporal variation of a protein’s location results from genetic and environmental perturbations
- High-throughput imaging: image classification
  - proteins are fluorescently labeled to track their locations within a cell
- Why automate?
  - Manual: labor-intensive and error-prone.
  - Large, but not so large, amounts of data: deep learning? others?


Data description

- Data: Parnamaa and Parts (2017)
- Each image contains a single cell.

Figure 1: DeepYeast dataset overview with 4 images per category (Parnamaa and Parts, 2017)


Table 1: Data: sample sizes

Subcellular categories   training   validation    test
Cell periphery               6924          961    1569
Cytoplasm                    6935         1223    1276
Endosome                     2692          697     689
ER                           6195         1393    1755
Golgi                        2770          208     382
Mitochondria                 6547         1560    1243
Nuclear Periphery            6661         1252    1164
Nucleolus                    7014         1147    1263
Nuclei                       6440         1312    1627
Peroxisome                   1683          297     164
Spindle                      4713         1517     781
Vacuole                      6426          936     587
Total                       65000        12500   12500
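The class sizes in Table 1 are uneven. As a rough illustration, the sketch below computes each category's share of the training set and the ratio between the largest and smallest class. The counts are copied from the training column of Table 1; the variable names are ours, and this calculation is not part of the original analysis.

```python
# Training-set counts copied from the training column of Table 1.
train_counts = {
    "Cell periphery": 6924,
    "Cytoplasm": 6935,
    "Endosome": 2692,
    "ER": 6195,
    "Golgi": 2770,
    "Mitochondria": 6547,
    "Nuclear Periphery": 6661,
    "Nucleolus": 7014,
    "Nuclei": 6440,
    "Peroxisome": 1683,
    "Spindle": 4713,
    "Vacuole": 6426,
}

total = sum(train_counts.values())

# Share of the training set held by each category, largest first.
shares = {name: n / total for name, n in train_counts.items()}
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<18} {share:6.1%}")

# Ratio between the most and least frequent categories.
largest = max(train_counts, key=train_counts.get)
smallest = min(train_counts, key=train_counts.get)
imbalance_ratio = train_counts[largest] / train_counts[smallest]
print(f"{largest} / {smallest} = {imbalance_ratio:.2f}")
```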


Implementation

- Keras in TensorFlow - CNNs
- Python sklearn - RF, XGBoost
- R - CATCH (Pan et al., 2018a,b)


A Convolutional Neural Network: DeepYeast


DeepYeast (11-layer CNN) model structure

- An 11-layer CNN, similar to the first few layers of VGG-19.
- VGG-19 was trained on the (ImageNet) ILSVRC dataset consisting of natural objects, aircraft, etc.
- Several papers: similar CNNs for the current problem.


VGG-19 and DeepYeast (11-layered) model structure

Table 2: VGG-19 and DeepYeast model structure

VGG-19                                  DeepYeast
19 weight layers                        11 weight layers
Input: 224×224×3                        Input: 64×64×3
conv3-64                                conv3-64
conv3-64                                conv3-64
maxpool 2×2                             maxpool 2×2
conv3-128                               conv3-128
conv3-128                               conv3-128
maxpool 2×2                             maxpool 2×2
conv3-256                               conv3-256
conv3-256                               conv3-256
conv3-256                               conv3-256
conv3-256                               conv3-256
maxpool 2×2                             maxpool 2×2
conv3-512                               Fully-connected layer-512
conv3-512                               Dropout-0.5
conv3-512                               Fully-connected layer-512
conv3-512                               Dropout-0.5
maxpool 2×2                             Fully-connected layer-12 (softmax)
conv3-512                               (BN added except for the last FC layer)
conv3-512
conv3-512
conv3-512
maxpool 2×2
Fully-connected layer-4096
Dropout-0.5
Fully-connected layer-4096
Dropout-0.5
Fully-connected layer-1000 (softmax)
# of parameters is 144,000,000          # of parameters is 3,128,908


CNN model structure

Input image, 64 × 64 × 3
Conv layer with 64 3 × 3 filters, padding=1, stride=1. Output dimension: 64 × 64 × 64
Conv layer with 64 3 × 3 filters, padding=1, stride=1. Output dimension: 64 × 64 × 64
2 × 2 Maxpooling. Output dimension: 32 × 32 × 64
Conv layer with 128 3 × 3 filters, padding=1, stride=1. Output dimension: 32 × 32 × 128
Conv layer with 128 3 × 3 filters, padding=1, stride=1. Output dimension: 32 × 32 × 128
2 × 2 Maxpooling. Output dimension: 16 × 16 × 128
Conv layer with 256 3 × 3 filters, padding=1, stride=1. Output dimension: 16 × 16 × 256
Conv layer with 256 3 × 3 filters, padding=1, stride=1. Output dimension: 16 × 16 × 256
Conv layer with 256 3 × 3 filters, padding=1, stride=1. Output dimension: 16 × 16 × 256
Conv layer with 256 3 × 3 filters, padding=1, stride=1. Output dimension: 16 × 16 × 256
2 × 2 Maxpooling. Output dimension: 8 × 8 × 256
Fully connected with 512 neurons. Output dimension: 512 × 1
Fully connected with 512 neurons. Output dimension: 512 × 1
Fully connected with 12 neurons. Output dimension: 12 × 1
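The output dimensions listed above follow from the usual size formula for convolutions, floor((n + 2p - k) / s) + 1, together with halving at each 2 × 2 pooling step. The sketch below traces them layer by layer. It assumes the pooling windows do not overlap (stride 2), which the slide does not state but which matches the listed sizes; the helper names are ours.

```python
def conv_out(size, kernel=3, padding=1, stride=1):
    """Spatial size after a convolution: floor((size + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, window=2):
    """Spatial size after max pooling, assuming non-overlapping windows."""
    return size // window


# (layer type, number of filters) in the order listed on the slide.
layers = [
    ("conv", 64), ("conv", 64), ("pool", None),
    ("conv", 128), ("conv", 128), ("pool", None),
    ("conv", 256), ("conv", 256), ("conv", 256), ("conv", 256), ("pool", None),
]

size, channels = 64, 3  # input image: 64 x 64 x 3
shapes = []
for kind, filters in layers:
    if kind == "conv":
        size, channels = conv_out(size), filters
    else:
        size = pool_out(size)
    shapes.append((size, size, channels))
    print(f"{kind:<4} -> {size} x {size} x {channels}")

# Length of the vector passed to the first fully connected layer.
flattened = size * size * channels
print("flattened length:", flattened)
```

With 3 × 3 filters, padding 1 and stride 1, each convolution preserves the spatial size, so only the three pooling steps shrink the image (64 to 32 to 16 to 8).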


Result

Base CNN (DeepYeast) performance on the subcellular localization dataset

The test accuracy is 0.8512 (vs 0.8671 in the paper).


Residual Neural Network


Motivation

Figure 2: Poorer performance with deeper layers (He et al., 2016)

Figure 3: Convolution layer learns the residual features left by the identity skip connection/shortcut


Residual neural networks

- Convolutional layer blocks; no fully-connected layers;
- Identity shortcut was shown to perform well.
- We tried 18- and 50-layered ResNet, Res18 and Res50.
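To make the identity shortcut concrete, here is a toy sketch of a residual block, y = relu(F(x) + x), next to a plain block without the shortcut. It is an illustration under simplifying assumptions, not the networks used in the talk: small matrix-vector products stand in for the convolutional layers, and all function names are ours.

```python
def relu(v):
    return [max(0.0, x) for x in v]


def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def residual_block(x, W1, W2):
    """y = relu(F(x) + x), where F(x) = W2 @ relu(W1 @ x) is the residual branch."""
    fx = matvec(W2, relu(matvec(W1, x)))
    return relu([f + xi for f, xi in zip(fx, x)])


def plain_block(x, W1, W2):
    """The same two layers without the shortcut: y = relu(W2 @ relu(W1 @ x))."""
    return relu(matvec(W2, relu(matvec(W1, x))))


x = [1.0, 2.0, 0.5]
zeros = [[0.0] * 3 for _ in range(3)]

# With all weights at zero the residual branch contributes nothing, so the
# residual block passes x through unchanged while the plain block loses it.
print(residual_block(x, zeros, zeros))  # [1.0, 2.0, 0.5]
print(plain_block(x, zeros, zeros))     # [0.0, 0.0, 0.0]
```

The point of the example is that the layers in a residual block only need to learn a correction to the identity mapping, which is the motivation given by He et al. (2016).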


ResNetwork model structures

Table 3: Model structure

conv1_x
    DeepYeast       [3 × 3, 64] × 2
    Res18 (ours)    [7 × 7, 64]
    ResNet 50       [7 × 7, 64]
    Res50 (ours)    [7 × 7, 64]
    W40-4           [3 × 3, 16]
    W40-2           [3 × 3, 16]

conv2_x
    DeepYeast       [3 × 3, 128] × 2
    Res18 (ours)    [3 × 3, 64; 3 × 3, 64] × 2
    ResNet 50       [1 × 1, 64; 3 × 3, 64; 1 × 1, 256] × 3
    Res50 (ours)    [1 × 1, 64; 3 × 3, 64; 1 × 1, 64] × 3
    W40-4           [3 × 3, 16 × 4; 3 × 3, 16 × 4] × 6
    W40-2           [3 × 3, 16 × 2; 3 × 3, 16 × 2] × 6

conv3_x
    DeepYeast       [3 × 3, 256] × 4
    Res18 (ours)    [3 × 3, 64; 3 × 3, 64] × 2
    ResNet 50       [1 × 1, 128; 3 × 3, 128; 1 × 1, 512] × 4
    Res50 (ours)    [1 × 1, 64; 3 × 3, 64; 1 × 1, 64] × 2
    W40-4           [3 × 3, 32 × 4; 3 × 3, 32 × 4] × 6
    W40-2           [3 × 3, 32 × 2; 3 × 3, 32 × 2] × 6

conv4_x
    DeepYeast       -
    Res18 (ours)    [3 × 3, 64; 3 × 3, 64] × 2
    ResNet 50       [1 × 1, 256; 3 × 3, 256; 1 × 1, 1024] × 6
    Res50 (ours)    [1 × 1, 64; 3 × 3, 64; 1 × 1, 64] × 2
    W40-4           [3 × 3, 64 × 4; 3 × 3, 64 × 4] × 6
    W40-2           [3 × 3, 64 × 2; 3 × 3, 64 × 2] × 6

conv5_x
    DeepYeast       -
    Res18 (ours)    [3 × 3, 64; 3 × 3, 64] × 2
    ResNet 50       [1 × 1, 512; 3 × 3, 512; 1 × 1, 2048] × 3
    Res50 (ours)    [1 × 1, 64; 3 × 3, 64; 1 × 1, 64] × 3
    W40-4           -
    W40-2           -

Output layers
    DeepYeast       max pooling, [512-d fc] × 2, 12-d fc (softmax)
    All others      average pooling, 12-d fc (softmax)


Test accuracy of residual neural networks

- Res18 and Res50 performed better than their plain versions;
- Plain 50 worse than plain 18; but Res50 better than Res18;
- More benefits with 50 layers.

Table 4: Comparison of accuracy among different methods

Network     Training time   Test accuracy
plain 18    1.75 h          0.8432
Res 18      1.75 h          0.8708
plain 50    13 h            0.8190
Res 50      12.75 h         0.8856
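The claim that the shortcut helps more at 50 layers can be checked directly from Table 4. The short sketch below computes the accuracy gain of each residual network over its plain counterpart; the numbers are copied from the table and the helper name is ours.

```python
# (training time in hours, test accuracy) copied from Table 4.
results = {
    "plain 18": (1.75, 0.8432),
    "Res 18": (1.75, 0.8708),
    "plain 50": (13.0, 0.8190),
    "Res 50": (12.75, 0.8856),
}


def shortcut_gain(depth):
    """Test-accuracy gain of the residual network over its plain version."""
    return results[f"Res {depth}"][1] - results[f"plain {depth}"][1]


gain_18 = shortcut_gain(18)
gain_50 = shortcut_gain(50)
print(f"gain at 18 layers: {gain_18:.4f}")
print(f"gain at 50 layers: {gain_50:.4f}")
```

The gain is about 0.028 at 18 layers and about 0.067 at 50 layers, at essentially the same training time as the plain network of equal depth.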


Feature extraction and transfer learning


Definition and advantages

- The last one or few layers of a pretrained neural network are replaced by new classifiers.
- More accurate and faster (vs. without feature extraction).


Use trained network as a feature extractor

- Replace the last fully-connected layer of the base CNN model (DeepYeast) with a random forest and an XGBoost:
  - Compared to using vectorized-image input, the test accuracy is improved (0.85 vs 0.6)
  - Faster: with 512 extracted features vs the original 12288 (= 64 × 64 × 3) features
- Replace all the fully-connected layers of the VGG-19 model with new fully-connected layers, a random forest and an XGBoost, respectively:
  - Very quick compared to training a neural network from scratch; decent test accuracy (0.73, 0.66, 0.72)
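The feature-extraction recipe above can be sketched in a few lines: a frozen network maps each image to a feature vector, and only a new classifier is trained on those vectors. The toy below is an illustration, not the pipeline from the talk. The frozen weights are hypothetical stand-ins for a pretrained network, the 4-pixel "images" are made up, and a nearest-centroid rule is swapped in for the random forest and XGBoost classifiers to keep the example dependency-free.

```python
def extract_features(x, frozen_weights):
    """Frozen feature extractor: relu(W @ x). The weights are never updated."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in frozen_weights]


def fit_centroids(features, labels):
    """New classifier trained on the extracted features: one mean vector per class."""
    sums, counts = {}, {}
    for f, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(f))
        for i, v in enumerate(f):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(f, centroids):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    return min(centroids, key=lambda y: sum((a - b) ** 2 for a, b in zip(f, centroids[y])))


# Hypothetical frozen weights (4 inputs -> 2 features), standing in for a pretrained network.
frozen_weights = [[1.0, 1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 1.0]]

# Made-up 4-pixel "images": class 0 is bright on the left, class 1 on the right.
train_x = [[1.0, 0.9, 0.1, 0.0], [0.8, 1.0, 0.0, 0.2],
           [0.1, 0.0, 0.9, 1.0], [0.0, 0.2, 1.0, 0.8]]
train_y = [0, 0, 1, 1]

# Only the classifier is fitted; the extractor stays fixed.
train_f = [extract_features(x, frozen_weights) for x in train_x]
centroids = fit_centroids(train_f, train_y)

test_x = [0.9, 0.7, 0.1, 0.1]
pred = predict(extract_features(test_x, frozen_weights), centroids)
print("predicted class:", pred)

# Dimension reduction quoted on the slide: raw pixels vs extracted features.
raw_dim, feat_dim = 64 * 64 * 3, 512
print(raw_dim, "raw inputs vs", feat_dim, "features, a factor of", raw_dim // feat_dim)
```

Because the classifier sees 512 inputs instead of 12288, a 24-fold reduction, fitting it is much cheaper than working on vectorized images, which is consistent with the training times reported in the talk.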


Summary


Table 5: Comparison of accuracy between different methods

Network                                   Training time   Test accuracy
DeepYeast (11-layer CNN)                  6 h             0.851
CNN (18-layer)                            1.75 h          0.843
Res18                                     1.75 h          0.871 – 0.891
ResNet 18 (He et al., 2016)               2.45 h          0.853
CNN (50-layer)                            13 h            0.819
ResNet 50 (He et al., 2016)               12.75 h         0.886
Wide ResNet (widening factor 2)           46 h            0.853
Random Forest (v-images; 1000 trees)      1.68 h          0.600
XGBoost (v-images; 800 trees)             10 h            0.6

Feature extraction by DeepYeast:
Random Forest                             10 min          0.850
XGBoost                                   1 h             0.840

Feature extraction by VGG-19 (transfer learning):
FC layers                                 3 min           0.730
Random Forest (800 trees)                 12 min          0.660
XGBoost (1000 trees)                      14 h            0.722


Discussion

- CNNs performed best, though not a thorough evaluation!
- Why? images, a large dataset, ...
- Other statistical methods: RF, Boosting, SVM good, but not tailored to images...
- BUT, some new stat methods: could not even run ...


Acknowledgment

Funded by NIH, NSF.

Thank you!


References

He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778.

Pan, Y., Mai, Q., and Zhang, X. (2018a). catch: Covariate-Adjusted Tensor Classification in High-Dimensions. R package version 1.0.

Pan, Y., Mai, Q., and Zhang, X. (2018b). Covariate-adjusted tensor classification in high-dimensions. arXiv preprint arXiv:1805.04421.

Parnamaa, T. and Parts, L. (2017). Accurate classification of protein subcellular localization from high-throughput microscopy images using deep learning. G3: Genes, Genomes, Genetics, 7(5):1385–1392.
