Screen Mining with KNIME · 2017-05-23 · Screen Mining with KNIME A user-friendly framework for...
Screen Mining with KNIME
A user-friendly framework
for high throughput/content data analysis
Martin Stöter, HT - Technology Development Studio (TDS), the HC-Screening Unit at the MPI-CBG
KNIME workshop March 2nd – 4th 2011, Zürich
Outline
Martin Stöter, MPI-CBG, Dresden, Germany 2
- Our challenges with High-Content Screening (HCS) data
- HCS Tools (Community nodes)
- Example Workflows (pseudo demo)
  - with HCS Tools
  - R templates
  - Dose Response with variables
Technology Development Studio (TDS)
MPI-CBG, Dresden, Germany
Screening facility for academic laboratories
Provide full service for automation and cell-based screens, RNAi and chemical screens
Equipment: liquid handling robots, drop dispensers, plate washers, plate readers, High Content Screening platforms
Is Data Analysis a Bottleneck in HCS?
Scientists <-> Data analyst
- Complex experiments
- Lots of data (too much for Excel)
- Fancy data analysis / mining
- Many scientists, but few data analysts
- Sometimes different languages
Data analysis is often a bottleneck!
Konstanz Information Miner (KNIME)
- Open source data analysis platform
- Modular data pipelining concept
Courtesy of Holger Brandl, MPI-CBG
Template Scripts for KNIME
- R: local R or R server, easy to update and to edit; visualizations (boxplots, histograms, profiles, …)
- Matlab: Open in Matlab, Matlab snippet, Matlab plot
- Python: Open in Python, Python snippet, Python plot
Data analyst: R, Python, Matlab, Java, programming
Scientist: Excel, tables, graphical user interfaces
Solution: data analysts write template scripts AND users access these using a graphical interface!
Other scripting languages: Groovy, Java snippets, JPython, Perl
Other KNIME functionalities: chemoinformatics, bioinformatics, data mining
Courtesy of Holger Brandl, MPI-CBG
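The template idea can be illustrated with a minimal sketch, written in Python because it is one of the scripting languages named on this slide. The placeholder syntax, the template text, and the parameter names (`column`, `bins`) are hypothetical; the actual scripting nodes use their own template format, which this transcript does not show.

```python
from string import Template

# Hypothetical template, written once by a data analyst.
# $column and $bins are placeholders that a GUI would expose as form fields.
HISTOGRAM_TEMPLATE = Template(
    "values = table['$column']\n"
    "plot_histogram(values, bins=$bins)\n"
)


def render_script(template, gui_choices):
    """Fill the placeholders with the values a user picked in the GUI."""
    return template.substitute(gui_choices)


# Values as they might come from a configuration dialog (made up for illustration).
script = render_script(HISTOGRAM_TEMPLATE, {"column": "Nuclei count", "bins": 20})
print(script)
```

The point of the split is that the analyst maintains the script text, while the user only ever sees the two form fields.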
High-Content Screening (HCS) data
Data generation
- Cells (RNAi, compounds)
- Microscopy -> images
- Image analysis
- Cell features/parameters -> well data
Tasks/problems
- Read data from various sources (SQL database, XML, Excel, various .csv, …)
- Screening-specific statistics
- Screening-specific utilities
- Data mining, visualization
Example plate layout (384-well plate: rows A-P, columns 1-24)
- Rows A, P: empty
- Row B: DMSO
- Row C: 0.001, DMSO
- Rows D-E: 10, DMSO
- Rows F-G: 3, DMSO
- Rows H-I: 1, DMSO
- Rows J-K: 0.3, DMSO
- Row L: 0.1, DMSO
- Rows M-N: no AB, 0.1, DMSO
- Row O: DMSO
HCS Tools for KNIME
Data Import
- Image analysis (Opera, Operetta, CellProfiler, MotionTracking)
- Plate readers (Envision, GeniusPro, MSD SectorImager)
Normalization
- Percent-of-control (POC), normalized percent inhibition (NPI)
- Z-score, B-score
- Optional: robust statistics (median + MAD)
- Select wells to normalize (controls, samples)
Quality Control
- CV, Z', multivariate Z', SSMD
Screen Mining
- Annotation of screen data from database
- Dose response (IC50)
Utilities
- Handle barcodes & wells, join layouts
Visualization
- Plate Viewer -> heatmaps, browse wells, ...
- Mondrian, R templates
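The normalization methods named above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the HCS Tools implementation: the sign convention for NPI varies between tools (here 0 % corresponds to the negative control and 100 % to the positive control), the MAD is scaled by 1.4826, and the control values are made up. B-score (which needs a median polish over the plate) is left out.

```python
from statistics import mean, median, stdev


def center_and_spread(values, robust=False):
    """Return (mean, sd), or (median, scaled MAD) when robust=True."""
    if robust:
        center = median(values)
        # 1.4826 scales the MAD to match the sd for normally distributed data.
        mad = 1.4826 * median([abs(v - center) for v in values])
        return center, mad
    return mean(values), stdev(values)


def poc(x, controls, robust=False):
    """Percent of control: the well value relative to the control center."""
    center, _ = center_and_spread(controls, robust)
    return 100.0 * x / center


def npi(x, neg, pos, robust=False):
    """Normalized percent inhibition (assumed convention: neg = 0 %, pos = 100 %)."""
    neg_center, _ = center_and_spread(neg, robust)
    pos_center, _ = center_and_spread(pos, robust)
    return 100.0 * (neg_center - x) / (neg_center - pos_center)


def z_score(x, reference, robust=False):
    """Distance of a well from the reference wells, in units of their spread."""
    center, spread = center_and_spread(reference, robust)
    return (x - center) / spread


# Made-up control wells for illustration.
neg_controls = [100, 102, 98, 100]
pos_controls = [10, 12, 8, 10]

print(poc(50, neg_controls))
print(npi(55, neg_controls, pos_controls))
print(z_score(104, neg_controls, robust=True))
```

The `robust` flag mirrors the "median + MAD" option on the slide: the same formulas are used, only the estimates of center and spread change.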
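For the quality-control metrics listed on this slide, a minimal Python sketch using the commonly cited textbook definitions may help. It assumes independent positive and negative control groups and uses made-up control values; it is not taken from the HCS Tools nodes, and the multivariate Z' is not covered.

```python
from math import sqrt
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation in percent: sd relative to the mean."""
    return 100.0 * stdev(values) / mean(values)


def z_prime(pos, neg):
    """Z' factor: 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    return 1.0 - 3.0 * (stdev(pos) + stdev(neg)) / abs(mean(pos) - mean(neg))


def ssmd(pos, neg):
    """Strictly standardized mean difference for two independent groups."""
    return (mean(pos) - mean(neg)) / sqrt(stdev(pos) ** 2 + stdev(neg) ** 2)


# Made-up control wells for illustration.
neg_controls = [100, 102, 98, 100]
pos_controls = [10, 12, 8, 10]

print(cv_percent(neg_controls))
print(z_prime(pos_controls, neg_controls))
print(ssmd(pos_controls, neg_controls))
```

A Z' close to 1 indicates well-separated controls; a value above 0.5 is a widely used rule of thumb for a usable assay window.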
HCS Tools: Standardized Data Format
- Different reader nodes shape the data into a common structure
- Enforce standardization of data format
- Lower the knowledge entry barrier for new users
HCS Tools: Barcode Standard
Regular expression for interpretation of barcode:
(?<libplatenumber>[0-9]{3})(?<projectcode>[A-z]{2})(?<date>[0-9]{6})(?<replicate>[A-z]{1})
Configurable in Preferences -> KNIME -> HCA Tools
Standardized table structure -> connection to our TDS compound database
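To show what the barcode pattern extracts, here is a small Python sketch. Two things differ from the slide and are assumptions of this example: Python spells named groups `(?P<name>...)` rather than `(?<name>...)`, and `[A-Za-z]` is used in place of `[A-z]`, since `[A-z]` also matches the punctuation characters between `Z` and `a` in ASCII. The sample barcode is made up, and the date field is returned as a plain string because its format is not stated.

```python
import re

# Same four fields as the pattern on the slide, in Python syntax.
BARCODE_PATTERN = re.compile(
    r"(?P<libplatenumber>[0-9]{3})"
    r"(?P<projectcode>[A-Za-z]{2})"
    r"(?P<date>[0-9]{6})"
    r"(?P<replicate>[A-Za-z]{1})"
)


def parse_barcode(barcode):
    """Return the barcode fields as a dict, or None if the barcode does not match."""
    match = BARCODE_PATTERN.fullmatch(barcode)
    return match.groupdict() if match else None


# Made-up barcode: library plate 001, project "AB", date 110302, replicate "A".
fields = parse_barcode("001AB110302A")
print(fields)
```

Each named group becomes one column of plate metadata, which is what makes the join against a compound database possible.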
HCS Tools: Annotate Experiment
Excel is (still) the tool of choice for assay development
The Join Layout node is an Excel reader for a defined spreadsheet format
Plate format with multiple well attributes (1 plate layout -> 1 column in KNIME)
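The "1 plate layout -> 1 column" idea can be sketched as follows. This is an illustration only: the grid representation, the attribute names (`treatment`, `concentration`), and the tiny 2 x 2 plate are assumptions, and the real Join Layout node reads its layouts from an Excel sheet rather than from Python lists.

```python
def flatten_layouts(layouts):
    """Turn {attribute: grid} into one record per well, one column per attribute.

    Each grid is a list of plate rows (row A first); each row is a list of
    well values (column 1 first).
    """
    records = {}
    for attribute, grid in layouts.items():
        for row_index, row in enumerate(grid):
            for col_index, value in enumerate(row):
                well = (chr(ord("A") + row_index), col_index + 1)
                record = records.setdefault(
                    well, {"row": well[0], "column": well[1]}
                )
                record[attribute] = value
    # Sort by (row letter, column number) for a stable table order.
    return [records[well] for well in sorted(records)]


# Two made-up layouts for a 2 x 2 corner of a plate.
layouts = {
    "treatment": [["DMSO", "compound"], ["DMSO", "compound"]],
    "concentration": [[None, 10], [None, 3]],
}

table = flatten_layouts(layouts)
for record in table:
    print(record)
```

Every layout grid contributes exactly one column, so adding a new well attribute means adding one more grid rather than restructuring the table.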
KNIME Workflow Example 1
Plate Viewer: Heatmaps of entire Screen
179 plates × 384 wells ≈ 70,000 data points (68,736), multiplied by the number of parameters
Workflow Example 2
Workflow Example 2: Clustering
Workflow Example 2
Workflow Example 2: Dose Response node
Using flow variables in R scripts
The Dose Response node looks like KNIME, but it is actually R!
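The R script behind the Dose Response node is not shown in this transcript. As a rough illustration of what an IC50 means, the Python sketch below evaluates a four-parameter logistic curve and then estimates the IC50 by log-linear interpolation between the two doses that bracket the half-maximal response. Interpolation is a deliberately simple stand-in for a proper curve fit; the doses reuse the concentration steps from the plate layout slide, and the responses are synthetic, generated with an assumed IC50 of 0.5.

```python
import math


def four_param_logistic(dose, top, bottom, ic50, hill):
    """Response that falls from `top` to `bottom` as the dose increases."""
    return bottom + (top - bottom) / (1.0 + (dose / ic50) ** hill)


def interpolate_ic50(doses, responses):
    """Estimate the dose at the half-maximal observed response.

    Assumes doses are sorted ascending and responses decrease with dose.
    Returns None if no pair of neighboring doses brackets the half-way point.
    """
    half = (max(responses) + min(responses)) / 2.0
    points = list(zip(doses, responses))
    for (d_lo, r_lo), (d_hi, r_hi) in zip(points, points[1:]):
        if r_lo >= half >= r_hi and r_lo != r_hi:
            fraction = (r_lo - half) / (r_lo - r_hi)
            log_lo, log_hi = math.log10(d_lo), math.log10(d_hi)
            return 10 ** (log_lo + fraction * (log_hi - log_lo))
    return None


# Concentration steps as on the plate layout slide; responses are synthetic.
doses = [0.001, 0.1, 0.3, 1, 3, 10]
responses = [
    four_param_logistic(d, top=100.0, bottom=0.0, ic50=0.5, hill=1.0)
    for d in doses
]

estimate = interpolate_ic50(doses, responses)
print(estimate)
```

The estimate lands near, but not exactly on, the true value because the half-way point is taken from the observed responses and the curve is not linear between doses; a fitted model avoids both limitations.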
Summary
• HCS Tools provides useful functionality for HCS-specific applications in KNIME.
• The scripting template environment allows scripts to be configured through a GUI.
• R scripts from templates can be modified and customized in any node, and they support flow variables.
How to get the HCS Tools & Scripting nodes?
MPI-CBG: http://www.mpi-cbg.de/facilities/profiles/software-engineering/hcs-tools.html
HCS Tools on the KNIME community website: http://tech.knime.org/hcs-tools
KNIME community update site: http://tech.knime.org/update/community-contributions/nightly
- KNIME -> Help -> Install Software -> Add…
- Paste the update link and select the node packages to be installed
…and the R templates? -> The links come automatically with the Scripting nodes, but can be configured in the KNIME preferences.
Acknowledgements
Marc Bickle
Cordula Andre
Rico Barsacchi
Sara Christ
Milan Esner
Annett Lohmann
Felix Meyerhofer
Claudia Möbius
Antje Niederlein
Nadine Tomschke
Jan Wagner
Software Development / Bioinformatics Facility (MPI-CBG) Holger Brandl
HCS Tools
TDS team (MPI-CBG)
KNIME Michael Berthold and the KNIME team
Thank you for your attention!