Using CellProfiler for Biological Image Analysis (CYTO U Webinar)

Mark-Anthony Bray, Ph.D

Imaging Platform, Broad Institute

Cambridge, Massachusetts, USA

[email protected]

Quantitative Analysis of Large-Scale Biological Image Data

Summary

• Background on image-based screening

• Introduction to CellProfiler and considerations in image analysis

• Construction and use of a pipeline for analyzing typical image data

• Measurement export and preparation for additional analysis

Images Contain A Wealth Of Information

http://www.microscopyu.com Image: Javier Irazoqui

Visual Appearance Indicates Biological State

• Images contain a wealth of biological information

• That information can be quantified: localization, mRNA or protein levels, morphology, plus hundreds of other features

• Automatic image analysis is

– Objective

– Quantitative, with statistics

– Able to measure multiple properties at once for every cell

– Able to distinguish subtle changes, even those undetectable by eye

– Faster, less tedious

High-Content Screening

• Cells or organisms in multiwell plates, each well treated with a gene or chemical perturbant

• Automated microscopy (any manufacturer)

• Cell measurements (size, shape, intensity, texture, etc.)

• Data exploration & machine learning

Anne Carpenter, Ray Jones

Software Overview

• Available from www.cellprofiler.org

• Free, open source (Python)

• Software available for Windows, Mac and Linux

[Diagram labels: Image Analysis & Quantification; Image-centric Data Analysis]

CellProfiler: Overview

• Processes large sets of images

• Identifies and measures objects

• Exports data for further analysis

• Goal: Provide powerful image analysis methods with a user-friendly interface

• Philosophy: Measure everything, ask questions later...

• Support data analysis based on individual cells

Typical CellProfiler Pipeline Workflow

• For image-based assays, the basic objective is always to

– Identify cells/organisms

– Measure feature(s) of interest

• The uniqueness of each assay comes in

– Deciding what compartments to identify and how to identify them

– Determining which measure(s) are most useful to identify interesting samples

Typical CellProfiler Pipeline Workflow

The CellProfiler Interface

• Pipeline panel: Displays modules in pipeline

– Modules executed in order from top to bottom

Change module position

Add or remove modules

Module help

The CellProfiler Interface

• File panel: Displays files in default image folder

– Load pipeline by double-clicking on it

– View images by double-clicking on the filename

The CellProfiler Interface

• The figure window has additional menu options

• Toolbar menu: Pan, zoom in/out

• CellProfiler Image Tools

– Image Tool (also displayed by clicking on image)

– Interactive zoom

– Show pixel data (location, intensity)

The CellProfiler Interface

• Folder panel: Change default input and output directories

– Usually these should be separate folders

Input folder: Contains images to be analyzed

Output folder: Contains the output file plus exported data and images

The CellProfiler Interface

• Settings panel: View and change settings for each module

– Clicking on a different module updates the settings view

Module Categories

• File processing: Image input, file output

• Image processing: Often used for pre-processing prior to object identification

• Object processing: Identification, modification of objects of interest

• Measurement: Collection of measurements from objects of interest

• Data Tools: Measurement exploration, measurement output

The First Module: LoadImages

• Loads an image set: a group of related images to be processed (for example, DNA and GFP images)

• Related how? Depending on the imaging device, one file may represent

– One channel at one imaging location

– Multiple channels at one imaging location

– Multiple channels at multiple locations

– Etc.

The First Module: LoadImages

• Can use text matching to define the difference between images in a set

– All images stained for GFP have the text Channel1- in the name

– Same for DNA images (Channel2-)

– Assign each image a meaningful name for downstream reference
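
The text-matching idea can be illustrated outside CellProfiler with a short Python sketch. It is not how LoadImages is implemented; the file names and the `build_image_sets` helper are hypothetical, and only the `Channel1-` / `Channel2-` text and the GFP/DNA names come from the slide.

```python
from collections import defaultdict

# Hypothetical file names; only the "Channel1-"/"Channel2-" text is from the slide.
filenames = [
    "Channel1-01-A-01.tif",
    "Channel2-01-A-01.tif",
    "Channel1-02-A-02.tif",
    "Channel2-02-A-02.tif",
]

# Text to match -> meaningful image name for downstream reference.
channel_names = {"Channel1-": "GFP", "Channel2-": "DNA"}


def build_image_sets(names):
    """Group files whose names differ only by the channel text."""
    image_sets = defaultdict(dict)
    for name in sorted(names):
        for text, channel in channel_names.items():
            if text in name:
                # Whatever remains after removing the channel text
                # identifies the imaging location.
                key = name.replace(text, "")
                image_sets[key][channel] = name
    return dict(image_sets)


image_sets = build_image_sets(filenames)
for key, channels in image_sets.items():
    print(key, channels)
```

Each resulting image set holds one GFP file and one DNA file for the same location, which is the grouping the module needs before any processing starts.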

Object Identification

• Once the images are loaded, how do you find objects of interest?

• Step 1: Distinguish the foreground from the background by picking a good threshold

• Step 2: Identify objects as regions brighter than the threshold

• Step 3: Cut and join objects to “improve” their shape
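
Steps 1 and 2 can be sketched in plain Python on a tiny made-up image. This is only an illustration of thresholding followed by connected-region labeling, not CellProfiler's code; the `label_objects` helper, the pixel values, and the fixed threshold of 0.5 are assumptions. Step 3 (cutting and joining) is not covered here.

```python
def label_objects(image, threshold):
    """Label 4-connected regions of pixels brighter than the threshold."""
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] > threshold and labels[r][c] == 0:
                # Found an unlabeled foreground pixel: start a new object
                # and flood-fill its connected neighbors.
                count += 1
                labels[r][c] = count
                stack = [(r, c)]
                while stack:
                    y, x = stack.pop()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] > threshold
                                and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            stack.append((ny, nx))
    return labels, count


# Made-up intensities in [0, 1]: two bright blobs on a dim background.
image = [
    [0.1, 0.1, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1, 0.1, 0.7],
    [0.1, 0.1, 0.1, 0.1, 0.9],
]
labels, count = label_objects(image, threshold=0.5)
print(count)
for row in labels:
    print(row)
```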

Primary Object Identification

• Many options for thresholding, cut and join methods, etc.

Thresholding

• Definition: Division of the image into background and foreground

• What is the best threshold value for dividing the intensity into foreground and background pixels?

[Figure: histogram of pixel values (x-axis) versus frequency (y-axis)]

• Method: Pick the method that provides the best results

– Otsu: Default. Good for readily identifiable foreground / background

– Background, RobustBackground: Good for images that consist mostly of background
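
As a rough illustration of how an automatic method can pick a value from the pixel histogram, here is a textbook version of Otsu's method in plain Python. It is a sketch under stated assumptions (intensities scaled to [0, 1], 256 bins, made-up pixel values) and is not CellProfiler's own implementation, which may differ in detail.

```python
def otsu_threshold(pixels, bins=256):
    """Return the threshold that maximizes between-class variance.

    pixels: flat list of intensities scaled to [0, 1].
    """
    hist = [0] * bins
    for p in pixels:
        hist[min(int(p * bins), bins - 1)] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_bin, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0.0
    for i, h in enumerate(hist):
        w_bg += h                      # pixels at or below bin i
        if w_bg == 0:
            continue
        w_fg = total - w_bg            # pixels above bin i
        if w_fg == 0:
            break
        sum_bg += i * h
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_var, best_bin = between, i
    # Use the upper edge of the last background bin as the threshold.
    return (best_bin + 1) / bins


# Made-up image: mostly dim background plus a few bright object pixels.
pixels = [0.10] * 80 + [0.12] * 10 + [0.80] * 8 + [0.85] * 2
threshold = otsu_threshold(pixels)
foreground = [p for p in pixels if p > threshold]
print(threshold, len(foreground))
```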

Thresholding

• Correction factor

– Multiplication factor applied to threshold

– Adjusts threshold stringency/leniency

– Setting this factor is empirical

• Upper/lower bounds

– Set safety limits on the automatic threshold to guard against false positives

– Helpful for unexpected images: Empty wells, images with dramatic artifacts, etc.
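
The correction factor and the bounds amount to a multiply-then-clamp step, sketched below. The `adjust_threshold` helper and all numeric values are illustrative assumptions, not CellProfiler defaults.

```python
def adjust_threshold(auto_threshold, correction=1.0, lower=0.0, upper=1.0):
    """Scale the automatic threshold, then clamp it to the safety bounds."""
    return min(max(auto_threshold * correction, lower), upper)


# Typical image: the correction factor makes the threshold more stringent.
typical = adjust_threshold(0.2, correction=1.5, lower=0.05, upper=1.0)

# Empty well: the automatic threshold collapses toward the noise level,
# so the lower bound takes over and prevents false-positive objects.
empty_well = adjust_threshold(0.002, correction=1.5, lower=0.05, upper=1.0)

print(typical, empty_well)
```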

Object Separation

• Once the foreground objects have been identified, what next?

• We need to distinguish multiple objects contained in the same “clump”

Images from Carolina Wählby

Object Separation

• Two-step process in “de-clumping”

1. Identification of the objects in a clump

2. Drawing boundaries between the clumped objects

Adjust settings to “de-clump” objects

Object Separation

• Clump identification: Two options

– Intensity: Works best if objects are brighter at center, dimmer at edges

– Shape: Works best if objects have indentations where clumps touch (esp. if objects are round)

[Figure: example clumps annotated with intensity peaks and shape indentations]

Object Separation

• Drawing boundaries: Two options

– Distance: Draws boundary lines midway between object centers

– Intensity: Draws boundary lines at dimmest line between objects

• Test Mode allows users to view results of all setting combinations
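
The Distance option can be pictured as assigning every pixel of a clump to its nearest object center, which places the boundary midway between centers. The sketch below illustrates that idea only; the `split_by_distance` helper, the one-row clump, and the center positions are made up, and this is not CellProfiler's algorithm.

```python
def split_by_distance(clump_pixels, centers):
    """Assign each clump pixel to the nearest center (labels start at 1)."""
    assignment = {}
    for (y, x) in clump_pixels:
        distances = [(y - cy) ** 2 + (x - cx) ** 2 for cy, cx in centers]
        assignment[(y, x)] = distances.index(min(distances)) + 1
    return assignment


# A one-row clump of 10 pixels containing two objects centered at x=2 and x=7.
clump = [(0, x) for x in range(10)]
centers = [(0, 2), (0, 7)]
result = split_by_distance(clump, centers)
print([result[(0, x)] for x in range(10)])
```

The Intensity option would instead place the cut along the dimmest pixels between the two centers, which requires the image values and is not shown here.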

Object Separation

• Additional separation settings: Adjust these settings if objects are being incorrectly split into pieces or merged together

• Smoothing: Increase to reduce intensity irregularities which produce over-segmentation of objects

[Figure panels: Original image; Smoothing filter size = 4; Smoothing filter size = 8]

Object Separation

• Suppress Local Maxima

– Smallest distance allowed between object intensity peaks; peaks closer than this are considered one object rather than a clump

– Decrease to reduce improper merging of objects in clumps

[Figure panels: Original image; Maxima distance = 4; Maxima distance = 8]
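
The effect of the maxima suppression distance can be shown on a one-dimensional intensity profile. This is a simplified sketch with made-up values and a hypothetical `find_peaks` helper; CellProfiler works on 2D images and its exact procedure may differ.

```python
def find_peaks(profile, min_distance):
    """Return indices of local maxima, keeping only the brightest peak
    among any peaks closer together than min_distance."""
    candidates = [
        i for i in range(1, len(profile) - 1)
        if profile[i] > profile[i - 1] and profile[i] >= profile[i + 1]
    ]
    kept = []
    # Visit the brightest candidates first so dimmer neighbors are suppressed.
    for i in sorted(candidates, key=lambda i: profile[i], reverse=True):
        if all(abs(i - j) >= min_distance for j in kept):
            kept.append(i)
    return sorted(kept)


# Made-up intensity profile across a clump with three local maxima.
profile = [0, 2, 5, 3, 4, 2, 1, 3, 6, 2, 0]
small = find_peaks(profile, min_distance=2)
large = find_peaks(profile, min_distance=4)
print(small, large)
```

With the smaller distance all three peaks survive, so three objects would be seeded; with the larger distance two nearby peaks are treated as one object.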

Object Separation

• The proper settings are usually a matter of trial and error

– The automatic settings are a good starting point, though

• However, adjusting can produce more improper segmentation than it solves

[Figure panels: Original image; Smoothing filter size = 4; Smoothing filter size = 8]

Filtering Invalid Objects

• See FilterObjects module for more advanced filtering options

Discard objects that fail size criterion or touch the image border
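
A minimal sketch of this filtering step follows. It assumes the size criterion is a pixel count, which is a simplification chosen for the example; the `filter_objects` helper and the object data are hypothetical.

```python
def filter_objects(objects, image_shape, min_size, max_size):
    """Keep objects that satisfy the size criterion and avoid the border.

    objects: dict mapping label -> list of (row, col) pixels.
    Size is taken here as pixel count (an assumption for this sketch).
    """
    rows, cols = image_shape
    kept = {}
    for label, pixels in objects.items():
        touches_border = any(
            r in (0, rows - 1) or c in (0, cols - 1) for r, c in pixels
        )
        size_ok = min_size <= len(pixels) <= max_size
        if size_ok and not touches_border:
            kept[label] = pixels
    return kept


objects = {
    1: [(2, 2), (2, 3), (3, 2), (3, 3)],  # valid
    2: [(0, 5), (1, 5)],                  # touches the top border
    3: [(5, 5)],                          # too small
}
kept = filter_objects(objects, image_shape=(8, 8), min_size=2, max_size=10)
print(sorted(kept))
```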

Primary Object Identification

• Segmented objects are colored

– Shows if each object has been identified and separated properly

• Outlines indicate whether each object is valid

– Green: Valid

– Yellow: Invalid (touching border)

– Red: Invalid (size criterion)

• Also outputs object count

Secondary Object Identification

• Goal: Identify cell boundaries by “growing” primary objects

– Nuclei are typically more uniform in shape and more easily separated than cells

• Approach: Segment nuclei, then use them as seeds for cell segmentation in a cell stain channel

Secondary Object Identification

• Methods

– Distance-N: Ignores image information

• Useful in cases where no cell stain is present

– Watershed, Propagation, Distance-B: Use image information

• Find dividing lines between objects and background / neighbors

• Test mode allows user to view results of all methods

[Figure panels: Propagation; Distance-N]

Tertiary Object Identification

• Goal: Identify tertiary objects by removing the primary objects from secondary objects

– “Subtract” the nuclei objects from cell objects to obtain cytoplasm

Cells − Nuclei = Cytoplasm
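
The subtraction can be expressed directly with pixel sets. The sketch below uses made-up pixel coordinates and assumes each cell shares a label with its nucleus; it illustrates the idea and is not CellProfiler code.

```python
# Each object is a set of (row, col) pixels, keyed by a shared label
# (assumption: cell N was grown from nucleus N).
cells = {
    1: {(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)},
    2: {(4, 4), (4, 5), (5, 4), (5, 5)},
}
nuclei = {
    1: {(0, 1), (1, 1)},
    2: {(4, 4)},
}

# Cytoplasm = cell pixels that are not nucleus pixels.
cytoplasm = {
    label: cell_pixels - nuclei.get(label, set())
    for label, cell_pixels in cells.items()
}

for label in sorted(cytoplasm):
    print(label, sorted(cytoplasm[label]))
```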

Pixel-Based Image Classification

• For images where a threshold cannot be found…

• CellProfiler is packaged with ilastik, a pixel-based classification tool

– User manually labels regions of image

– ilastik uses features to distinguish regions and create a classifier

– Classifier used as input into ClassifyPixels module

– Currently, Windows only

[Figure panels: DIC; ilastik; Foreground/background mask]

Measurement Modules: Object Morphology

Select the objects to measure

Module: MeasureObjectAreaShape

• Goal: Measure morphological features such as

– Area

– Perimeter

– Eccentricity

– MajorAxisLength

– MinorAxisLength

– Orientation

– FormFactor: Compactness measure, circle = 1, line = 0
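
FormFactor is commonly defined as 4π·Area/Perimeter², which gives 1 for a circle and approaches 0 for a line-like shape. The sketch below evaluates that formula on idealized geometric shapes rather than segmented objects; the shapes and the `form_factor` helper are illustrative.

```python
import math


def form_factor(area, perimeter):
    """Compactness measure: 4 * pi * area / perimeter**2."""
    return 4 * math.pi * area / perimeter ** 2


# Ideal circle of radius 10.
radius = 10
circle = form_factor(math.pi * radius ** 2, 2 * math.pi * radius)

# Thin 100 x 1 rectangle, which behaves almost like a line.
thin_rectangle = form_factor(100 * 1, 2 * (100 + 1))

print(round(circle, 3), round(thin_rectangle, 3))
```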

Measurement Modules: Object Intensity

Select the image to measure from

Select the objects to measure

Module: MeasureObjectIntensity

• Goal: Measure object intensity features such as

– Integrated intensity: Sum of the pixel intensities within an object

– Mean, median, standard deviation intensities

– Maximal and minimal pixel intensities

– Lower/Upper quartile

• The object intensity may be obtained from any image, not just the image used to identify the object

– Example: Ph3 intensity may be measured using the nuclei objects
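
The listed intensity features can be sketched with Python's standard `statistics` module. The image values, the object pixels, and the `measure_object_intensity` helper are made up; the use of the population standard deviation and the default quartile method are choices made for this example, so values may not match CellProfiler's output exactly.

```python
import statistics


def measure_object_intensity(image, object_pixels):
    """Measure intensity features of one object in any image."""
    values = [image[r][c] for r, c in object_pixels]
    lower, _, upper = statistics.quantiles(values, n=4)
    return {
        "integrated": sum(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std": statistics.pstdev(values),
        "min": min(values),
        "max": max(values),
        "lower_quartile": lower,
        "upper_quartile": upper,
    }


# A nucleus object found in the DNA image is measured in a second,
# hypothetical channel image.
other_channel = [
    [0.0, 0.2, 0.4],
    [0.0, 0.4, 0.6],
    [0.0, 0.9, 0.0],
]
nucleus_pixels = [(0, 1), (0, 2), (1, 1), (1, 2), (2, 1)]
features = measure_object_intensity(other_channel, nucleus_pixels)
print(features)
```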

Measurement Modules: Object Texture

Select the image to measure from

Select the objects to measure

Select the spatial scale

MeasureObjectTexture

• Goal: Determine whether the staining pattern is smooth on a particular scale

• Selection of the appropriate texture scale is essentially empirical

– A higher number measures larger patterns of texture

– Smaller numbers measure more localized (finer) patterns of texture

• Can also add several texture modules to the pipeline, each measuring a different texture scale

Other Measurement Modules

• CalculateMath: Arithmetic operations for measurements

• CalculateStatistics: Assay quality (V and Z' factors) and dose response data (EC50) for all measurements

• Image-based measures

– MeasureImageAreaOccupied

– MeasureImageGranularity

– MeasureImageIntensity

• Object-based measures

– MeasureCorrelation

– MeasureObjectNeighbors

– MeasureRadialDistribution
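
For the assay-quality measures mentioned under CalculateStatistics, the Z' factor is usually written as 1 − 3(σp + σn)/|μp − μn| for positive and negative controls. The sketch below computes it with made-up control values; whether CalculateStatistics uses exactly this form (for example, sample versus population standard deviation) is not confirmed here.

```python
import statistics


def z_prime(positive, negative):
    """Z' factor: 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    mu_p, mu_n = statistics.mean(positive), statistics.mean(negative)
    sd_p, sd_n = statistics.stdev(positive), statistics.stdev(negative)
    return 1 - 3 * (sd_p + sd_n) / abs(mu_p - mu_n)


# Made-up per-well measurements for the two control groups.
positive_controls = [9.8, 10.1, 10.0, 10.2, 9.9]
negative_controls = [2.1, 1.9, 2.0, 2.2, 1.8]
quality = z_prime(positive_controls, negative_controls)
print(round(quality, 3))
```

Values close to 1 indicate well-separated controls; values near or below 0 indicate that the measurement does not separate them.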

Data Export Modules

• User may output images or image measurements

Select the objects to export

Measurement Display

• The average measurements for all objects in the image are displayed in the figure window

• However, the individual measurements for each object are stored in the output file

Data Export Modules

• Goal: Retain images of intermediate image processing steps for quality control or save measurements for later analysis and exploration

• SaveImages: Writes an image to a file

– Intermediate images in the pipeline are not saved unless requested

– Choice of many image formats to write → module can be used as an image format converter

• ExportToSpreadsheet: Export measurements as a comma-separated file readable by spreadsheet programs

• ExportToDatabase: Export measurements as per-object and per-image tables plus a configuration file for a MySQL or SQLite database
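
Once per-object measurements are in a comma-separated file, they can be summarized with a few lines of Python. The sketch below reads an in-memory stand-in for such a file; the column names `ImageNumber`, `ObjectNumber`, and `AreaShape_Area` and all values are assumptions made for the example, so check them against the header of an actual export.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Stand-in for an exported per-object file (hypothetical contents).
csv_text = """ImageNumber,ObjectNumber,AreaShape_Area
1,1,120
1,2,80
2,1,200
"""

# Collect the per-object values for each image.
per_image = defaultdict(list)
for row in csv.DictReader(io.StringIO(csv_text)):
    per_image[int(row["ImageNumber"])].append(float(row["AreaShape_Area"]))

# Reduce to one mean value per image.
mean_area = {image: mean(values) for image, values in per_image.items()}
print(mean_area)
```

For a real file, replace `io.StringIO(csv_text)` with an opened file handle.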

Cluster Computing

• For really big screens, it is necessary to process images in batches on a computing cluster.

• If processing time is too great on a single computer, then run the pipeline on a cluster

– Install CellProfiler on a computing cluster

– Add the ExportToDatabase module

– Add the CreateBatchFiles module to the end of the pipeline and configure it

– Run the pipeline to create a batch file

– Submit the batches to your cluster for processing

– Check the progress of processing
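
Processing in batches means dividing the image sets into consecutive ranges, one per cluster job. The sketch below only computes those ranges; the `batch_ranges` helper and the numbers are hypothetical, and how each range is handed to CellProfiler depends on your cluster and CellProfiler version.

```python
def batch_ranges(n_image_sets, batch_size):
    """Split image sets 1..n into consecutive (first, last) ranges."""
    return [
        (start, min(start + batch_size - 1, n_image_sets))
        for start in range(1, n_image_sets + 1, batch_size)
    ]


# Hypothetical screen: 10 image sets, 4 per cluster job.
batches = batch_ranges(10, 4)
for first, last in batches:
    print(f"job for image sets {first}-{last}")
```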

Megakaryocyte Polyploidization: Leukemia

Project in progress

[Figure: DNA stain, with outlines identifying the nuclei. Panels: SU6656 (positive control); DMSO (negative control)]

[Figure: proportion of cells versus per-cell DNA content (log2), comparing SU6656 and DMSO]

Martha Vokes; Mark Bray; John Crispino, Northwestern University; Jeremy Wen, postdoc

Status: Identified 206 polyploidization regulators from 10k compound screen

Images from BioImage SBS image analysis comparison. Thanks to Ilya Ravkin

Carpenter, et al., Genome Biology, 2006

Measuring Morphology

Upcoming: CellProfiler 2.1

• Major changes

– Streamlined loading of images and associated data

– Takes advantage of multiple CPU cores, so very large sets of images can now be processed on a regular desktop computer

• Release scheduled for early 2014

Final Notes

• Where to get help

– Access help from the CellProfiler main window

– Ask for help on the CellProfiler.org forum

Annual Support Training Plan

• Contact [email protected] for more details

Carpenter Lab / Broad Institute Imaging Platform

• Image assay development: Apply image analysis methods to biological questions

• Algorithm development & software engineering: Develop & test new image analysis and data mining methods and create open-source software tools

• IT/Administration

• Director

Mark Bray, Anne Carpenter, David Logan, Matthew Veneskey, Vebjørn Ljoså, Carolina Wählby, Lee Kamentsky, Shantanu Singh, Holger Hennig

Acknowledgments

S.D.G.

Free, at www.cellprofiler.org:

Recent funding for this work provided by:

NIH NIGMS (Carpenter: R01 GM089652 and Wahlby: R01 GM095672)

The Broad Institute of Harvard and MIT

Many thanks to our many biology collaborators who provide images

Contact: [email protected]