Screen Mining with KNIME A user-friendly framework for high throughput/content data analysis Martin Stöter HT - Technology Development Studio (TDS), the HC-Screening Unit at the MPI-CBG [email protected] KNIME workshop March 2 nd – 4 th 2011, Zürich

Page 1: Screen Mining with KNIME · 2017-05-23 · Screen Mining with KNIME A user-friendly framework for high throughput/content data analysis Martin Stöter HT - Technology Development

Screen Mining with KNIME

A user-friendly framework

for high throughput/content data analysis

Martin Stöter, HT - Technology Development Studio (TDS), the HC-Screening Unit at the MPI-CBG

KNIME workshop March 2nd – 4th 2011, Zürich

Page 2

Outline

Martin Stöter, MPI-CBG, Dresden, Germany

- Our challenges with High-Content Screening (HCS) data

- HCS Tools (Community nodes)

- Example Workflows (pseudo demo)

  - with HCS Tools

  - R templates

  - Dose Response with variables

Page 3

Technology Development Studio (TDS)

MPI-CBG, Dresden, Germany

Screening facility for academic laboratories

Provide full service for automation and cell-based screens, RNAi and chemical screens

Equipment: liquid handling robots, drop dispensers, plate washers, plate readers, High Content Screening platforms

Page 4

Is Data Analysis a Bottleneck in HCS?

- Complex experiments

- Lots of data (too much for Excel)

- Fancy data analysis / mining

- Many scientists, but few data analysts

- Sometimes different languages

Data analysis is often a bottleneck!

Konstanz Information Miner (KNIME)

- Open source data analysis platform

- Modular data pipelining concept

Figure labels: Scientists, Data analyst

Courtesy of Holger Brandl, MPI-CBG

Page 5

Template Scripts for KNIME

R: local or R server, easy to update and to edit; visualizations (boxplots, histograms, profiles, …)

Matlab: Open in Matlab, Matlab snippet, Matlab plot

Python: Open in Python, Python snippet, Python plot

Data analyst: R, Python, Matlab, Java, programming

Scientist: Excel, tables, graphical user interfaces

Solution: Data analysts write template scripts AND users access these using a graphical interface!

Other scripting languages: Groovy, Java snippets, JPython, Perl

Other KNIME functionalities: Chemoinformatics, Bioinformatics, Data mining

Courtesy of Holger Brandl, MPI-CBG

Page 6

High-Content Screening (HCS) data

Data generation

- Cells (RNAi, compounds)

- Microscopy -> images

- Image analysis

- Cell features/parameters -> well data

Tasks/problems

- Read data from various sources (SQL database, XML, Excel, various .csv, …)

- Screening specific statistics

- Screening specific utilities

- Data mining, visualization

Example plate layout (384-well plate, columns 1-24, rows A-P); well labels per row, left to right:

A: (empty)
B: DMSO, DMSO, DMSO
C: 0.001, DMSO, DMSO, 0.001
D: 10, DMSO, DMSO, 10
E: 10, DMSO, DMSO, 10
F: 3, DMSO, DMSO, 3
G: 3, DMSO, DMSO, 3
H: 1, DMSO, DMSO, 1
I: 1, DMSO, DMSO, 1
J: 0.3, DMSO, DMSO, 0.3
K: 0.3, DMSO, DMSO, 0.3
L: 0.1, DMSO, DMSO, 0.1
M: no AB, no AB, 0.1, DMSO, DMSO, 0.1
N: no AB, no AB, 0.1, DMSO, DMSO, 0.1
O: DMSO, DMSO
P: (empty)

Page 7

HCS Tools for KNIME

Data Import

- Image Analysis (Opera, Operetta, Cell Profiler, MotionTracking)

- Plate Readers (Envision, GeniusPro, MSD SectorImager)

Normalization

- Percent-of-control (POC), Normalized percent inhibition (NPI)

- Z-score, B-score

- Optional: robust statistics (Median + MAD)

- Select wells to normalize (controls, samples)

Quality Control

- CV, Z', Multivariate Z', SSMD

Screen Mining

- Annotation of screen data from database

- Dose Response (IC50)

Utilities

- Handle barcodes & wells, join layouts

Visualization

- Plate Viewer -> heatmaps, browse wells, ...

- Mondrian, R templates
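The normalization methods named on this slide can be sketched in a few lines. This is a minimal illustration using the common textbook definitions, not the HCS Tools implementation; the function names, the control values, and the 1.4826 scaling of the MAD are assumptions. B-score (which relies on a median polish) is left out.

```python
from statistics import mean, median, stdev


def mad(values):
    """Median absolute deviation, scaled by 1.4826 (assumed convention)
    so that it is comparable to a standard deviation for normal data."""
    center = median(values)
    return 1.4826 * median([abs(v - center) for v in values])


def poc(value, neg_controls):
    """Percent-of-control: the well value as a percentage of the control mean."""
    return 100.0 * value / mean(neg_controls)


def npi(value, neg_controls, pos_controls):
    """Normalized percent inhibition: 0 at the negative control mean,
    100 at the positive control mean."""
    neg, pos = mean(neg_controls), mean(pos_controls)
    return 100.0 * (neg - value) / (neg - pos)


def z_score(value, reference, robust=False):
    """Z-score against the selected reference wells (controls or samples).
    With robust=True, median and MAD replace mean and standard deviation."""
    if robust:
        return (value - median(reference)) / mad(reference)
    return (value - mean(reference)) / stdev(reference)


# Hypothetical control readouts from one plate
neg_controls = [100, 110, 90, 100]
pos_controls = [10, 20, 15, 15]

poc_value = poc(50, neg_controls)
npi_value = npi(15, neg_controls, pos_controls)
z_value = z_score(110, neg_controls)
robust_z_value = z_score(110, neg_controls, robust=True)
```

The `reference` argument mirrors the "select wells to normalize" option: passing control wells or sample wells changes what the score is measured against.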

HCS Tools
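For the quality control metrics listed above (CV, Z', SSMD), a minimal sketch with the usual textbook formulas follows. The function names and control values are hypothetical, and the multivariate Z' is not covered.

```python
from math import sqrt
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation in percent."""
    return 100.0 * stdev(values) / mean(values)


def z_prime(pos_controls, neg_controls):
    """Z' factor: 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    spread = stdev(pos_controls) + stdev(neg_controls)
    window = abs(mean(pos_controls) - mean(neg_controls))
    return 1.0 - 3.0 * spread / window


def ssmd(pos_controls, neg_controls):
    """Strictly standardized mean difference between two control groups,
    assuming the groups are independent."""
    diff = mean(pos_controls) - mean(neg_controls)
    return diff / sqrt(stdev(pos_controls) ** 2 + stdev(neg_controls) ** 2)


# Hypothetical control readouts from one plate
neg_controls = [100, 110, 90, 100]
pos_controls = [10, 20, 15, 15]

plate_cv = cv_percent(neg_controls)
plate_z_prime = z_prime(pos_controls, neg_controls)
plate_ssmd = ssmd(pos_controls, neg_controls)
```

A Z' above roughly 0.5 is commonly read as a well-separated assay window; the sign of SSMD only reflects which control group has the higher mean.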

Page 8

HCS Tools: Standardized Data Format

- Different reader nodes shape a common data structure

- Enforce standardization of data format

- Lower the knowledge entry barrier for new users

Page 9

HCS Tools: Barcode Standard

Regular expression for interpretation of barcode:

(?<libplatenumber>[0-9]{3})(?<projectcode>[A-z]{2})(?<date>[0-9]{6})(?<replicate>[A-z]{1})

Configurable in Preferences -> KNIME -> HCA Tools

Standardized table structure -> connection to our TDS compound database
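To show how such a pattern splits a barcode into fields, here is a small sketch that ports the expression to Python. Two things differ from the slide and are assumptions: Python writes named groups as `(?P<name>...)`, and the character class `[A-z]` is written as `[A-Za-z]` on the assumption that only letters were intended (in ASCII, `[A-z]` also matches a few punctuation characters). The example barcode is made up.

```python
import re

# Same four fields as the barcode pattern on the slide, in Python syntax.
BARCODE_PATTERN = re.compile(
    r"(?P<libplatenumber>[0-9]{3})"
    r"(?P<projectcode>[A-Za-z]{2})"
    r"(?P<date>[0-9]{6})"
    r"(?P<replicate>[A-Za-z]{1})"
)


def parse_barcode(barcode):
    """Return the barcode fields as a dict, or None if the barcode
    does not match the expected layout."""
    match = BARCODE_PATTERN.fullmatch(barcode)
    return match.groupdict() if match else None


# Hypothetical barcode: library plate 001, project AB, date 110302, replicate A
fields = parse_barcode("001AB110302A")
```

Each named group becomes one column of the standardized table, which is what makes a later join against a compound database straightforward.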

Page 10

HCS Tools: Annotate Experiment

Excel is (still) the tool of choice for assay development

The Join Layout node is an Excel reader for a defined spreadsheet

Plate format with multiple well attributes (1 plate layout -> 1 column in KNIME)
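The idea of turning plate-shaped layouts into table columns can be sketched as follows. This is not the Join Layout node itself: the grids are given as plain Python lists rather than read from Excel, and the function names, attribute names, and values are hypothetical.

```python
import string


def layout_to_wells(layouts):
    """Flatten plate-shaped grids into per-well annotations.

    layouts maps an attribute name to a grid (list of plate rows, each a
    list of cell values). Returns {(row_letter, column_number): {attribute: value}}.
    """
    wells = {}
    for attribute, grid in layouts.items():
        for row_index, row in enumerate(grid):
            for col_index, cell in enumerate(row):
                key = (string.ascii_uppercase[row_index], col_index + 1)
                wells.setdefault(key, {})[attribute] = cell
    return wells


def annotate(measurements, wells):
    """Add the layout attributes to each measurement row, matched by well
    position. Rows without a layout entry are returned unchanged."""
    return [
        {**row, **wells.get((row["row"], row["column"]), {})}
        for row in measurements
    ]


# Two layouts for a tiny 2 x 2 plate: each layout becomes one column.
layouts = {
    "treatment": [["DMSO", "cmpd"], ["DMSO", "cmpd"]],
    "concentration": [[0, 10], [0, 3]],
}
wells = layout_to_wells(layouts)
annotated = annotate([{"row": "B", "column": 2, "value": 1.5}], wells)
```

Keeping one grid per attribute matches the way layouts are usually drawn during assay development, while the flattened form is what downstream table operations need.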

Page 11

KNIME Workflow Example 1

Page 12

Plate Viewer: Heatmaps of entire Screen

179 plates x 384 wells = 68,736 (~70,000) data points, times x parameters

Page 13

Workflow Example 2

Page 14

Workflow Example 2: Clustering

Page 15

Workflow Example 2

Page 16

Workflow Example 2: Dose Response node
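The slide does not state which model the Dose Response node fits. A common choice for IC50 estimation is the four-parameter logistic (4PL) curve, sketched below under several assumptions: top and bottom are fixed (for example from controls), the fit is a coarse grid search rather than a proper optimizer, and the data are synthetic.

```python
def four_pl(conc, bottom, top, ic50, hill):
    """Four-parameter logistic: the response falls from `top` towards
    `bottom` as the concentration rises; it is halfway at conc == ic50."""
    return bottom + (top - bottom) / (1.0 + (conc / ic50) ** hill)


def fit_ic50(concs, responses, bottom, top):
    """Coarse grid search for the (ic50, hill) pair that minimizes the sum
    of squared errors, with top and bottom held fixed."""
    best = None
    for i in range(-300, 301):          # log10(ic50) from -3.00 to 3.00
        ic50 = 10.0 ** (i / 100.0)
        for h in range(2, 41):          # Hill slope from 0.2 to 4.0
            hill = h / 10.0
            sse = sum(
                (four_pl(c, bottom, top, ic50, hill) - r) ** 2
                for c, r in zip(concs, responses)
            )
            if best is None or sse < best[0]:
                best = (sse, ic50, hill)
    return best[1], best[2]


# Synthetic example data (not from the screen): true ic50 = 1.0, hill = 1.0
concs = [0.1, 0.3, 1.0, 3.0, 10.0]
responses = [four_pl(c, 0.0, 100.0, 1.0, 1.0) for c in concs]
ic50, hill = fit_ic50(concs, responses, bottom=0.0, top=100.0)
```

A real analysis would fit all four parameters with a nonlinear least-squares routine and report confidence intervals; the grid search only keeps the sketch dependency-free.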

Page 17

Using flow variables in R scripts

The Dose Response node looks like a KNIME node, but it actually runs R!
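The general idea of a script template whose placeholders are filled from GUI settings or workflow variables can be sketched as below. This does not reproduce KNIME's flow-variable syntax or the actual R templates: the placeholder style, the R line, and all names (`input_table`, `value_column`, and so on) are hypothetical.

```python
from string import Template

# Hypothetical R template text; ${...} marks values supplied at configuration time.
R_TEMPLATE = Template(
    "boxplot(${value_column} ~ ${group_column}, data = input_table,\n"
    '        main = "${title}")\n'
)


def render_script(variables):
    """Fill the template with the given variables.
    Raises KeyError if a placeholder has no value."""
    return R_TEMPLATE.substitute(variables)


script = render_script(
    {"value_column": "nuclei_count", "group_column": "treatment", "title": "Plate 001"}
)
```

The analyst maintains the template once, and each user only supplies the variable values, which is the division of labour the talk describes.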

Page 18

Summary

• HCS Tools provides useful functionality for HCS-specific applications in KNIME.

• The scripting template environment lets users configure scripts through a GUI.

• R scripts from templates can be modified and customized in any node, and they support flow variables.

Page 20

Acknowledgements

Marc Bickle

Cordula Andre

Rico Barsacchi

Sara Christ

Milan Esner

Annett Lohmann

Felix Meyerhofer

Claudia Möbius

Antje Niederlein

Nadine Tomschke

Jan Wagner

Software Development / Bioinformatics Facility (MPI-CBG): Holger Brandl

HCS Tools

TDS team (MPI-CBG)

KNIME: Michael Berthold and the KNIME team

Page 21

Thank you for your attention!
