Visual Programming for Metabolomics
KNIME
Stephan Beisken
Visual Programming
• “Visual programming languages enable physicians and other computer users with little knowledge of programming to develop computer software. The physician uses a visual paradigm to "draw" the computer interface and then attaches short segments of computer code to buttons, menus, and list boxes.”
Ebell, M. H. (1993). Visual programming languages. M.D. Computing : Computers in Medical Practice, 10(5), 305–11.
Motivation
• Simplify your (working) life
• Data processing and analysis requires various different tools to work together in sequence
• Data input and output
• Spreadsheets
• Data transformation
• Transposition, aggregation, string manipulation
• IsaCreator
• Formatting of tables
Agenda
• Introduction
• Tutorial
• Installation and Extensions
• Overview of the Workbench
• Nodes and Table Models
• Exercises
• Introductory Examples
• MassCascade
• OpenMS
• XCMS
• Slides, software, workflows, and data for takeaway
Disclaimer
• Workflows are great
• It does not have to be KNIME, there are many other solutions
• Every method that captures information in a consistent manner and enables reproducibility is great
• Transparency
• Ability to share data and ‘everything’ that was done to the data
Who is already a KNIME user?
Introduction
• KNIME: Konstanz Information Miner
• http://www.knime.org/
• Developed at University of Konstanz in Germany
• Desktop version available free of charge (open source)
• Modular platform for building and executing workflows using predefined components: nodes
• Core functionality available for tasks such as data mining, analysis, and manipulation
• Extra features and functionality available in KNIME through extensions from various groups (community) and vendors
• Written in Java based on the Eclipse SDK platform
Workflow Concepts
• Workflow execution
• Can execute complex, multi-step operations on input data
• Can be run by “non-experts” using predefined parameter templates ensuring optimal results
• Can be set up for specific measurement systems
• Can be shared across researchers
Functionality
• Data manipulation and analysis
• File & database I/O, sorting, filtering, grouping, joining, pivoting
• Data mining and machine learning
• R, WEKA, KNIME, interactive plotting
• Cheminformatics
• Conversions, similarity, clustering, (Q)SAR analysis, etc.
• Scripting integration
• R, Perl, Python, Matlab, Octave, Groovy
• Reporting and much more
• Bioinformatics, HTS & image analysis, network & text mining
• Marketing, big data and business analytics
Modules (Community Extensions)
• http://tech.knime.org/community
• Chemoinformatics
• CDK (EMBL-EBI), RDKit (Novartis), Indigo (GGA), Erl Wood (Eli Lilly), Enalos (NovaMechanics)
• ChEMBL and ChEBI (EMBL-EBI)
• Bioinformatics
• OpenMS (Tübingen, ETH Zurich)
• MassCascade (EMBL-EBI)
• HCS (MPI), NGS (Konstanz), Image analysis
• Integration
• Python, Perl, R, Groovy, Matlab (MPI), PDB web services client (Vernalis), REST and SOAP web service support
Workflow Platforms
Applications
[Figure: calibration regression]
Advantages
• Intuitive to use
• No or little programming experience required
• Good for prototyping
• Lots of functionality
• Very modular and flexible
• Active community
• Extensible
• Visual feedback
Disadvantages
• Steep learning curve
• Resource greedy
• No (free) server edition
• Slower execution than standalone scripts
Installation
• Download and unzip KNIME
• No further setup required
• ./knime.ini contains arguments for launch
• Install new modules (nodes) from update sites
• Explorer and installation wizard provided
• Workflows and data are stored in a workspace
• ~/<user>/knime/workspace
• C:\Users\<user>\knime\workspace
• Preferences in: File → Preferences → KNIME
Workbench
• Workflow editor
• Console
• Outline
• Tabs
• Node description
• Node repository
• Workflow projects
• Favorite nodes
• Public server
• Toolbar: Auto-layout, Execute, Execute all nodes
Nodes
• Title
• Icon
• Input port(s) – on the left of icon
• Output port(s) – on the right of icon
• Status display (‘traffic lights’)
• Red (not ready)
• Amber (ready)
• Green (executed)
• Blue bar during execution (with percentage or flashing)
• Sequence number
• Right-click menu: to configure and execute the node, display the output views, edit the node, and display data for the ports
• Node: Basic processing unit of a workflow
• Performs a particular task
Dialogs
• Double-click opens configuration dialogs
• Explicit column types
Tables
• Column specifications
• Table rows
• Various renderers
• Column types
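The node concepts above (one task per node, input and output ports, traffic-light status) can be sketched in a few lines of Python. This is only an illustration of the idea, not KNIME's API: the `Node`, `Status`, and `run_workflow` names are hypothetical, a table is modelled as a plain list of rows, and the peak values are made up.

```python
from enum import Enum


class Status(Enum):
    RED = "not ready"
    AMBER = "ready"
    GREEN = "executed"


class Node:
    """Hypothetical stand-in for a workflow node: one task, one input, one output."""

    def __init__(self, title, task):
        self.title = title
        self.task = task          # function taking a table and returning a table
        self.status = Status.RED  # not ready until configured
        self.output = None

    def configure(self):
        self.status = Status.AMBER

    def execute(self, table):
        if self.status is Status.RED:
            raise RuntimeError(f"{self.title} is not configured")
        self.output = self.task(table)
        self.status = Status.GREEN
        return self.output


def run_workflow(nodes, table):
    """Execute nodes in sequence, passing each output table to the next node."""
    for node in nodes:
        table = node.execute(table)
    return table


# Made-up example table: a list of rows with illustrative column names.
peaks = [
    {"sample": "ko15", "mz": 300.2, "intensity": 1200.0},
    {"sample": "ko15", "mz": 301.1, "intensity": 80.0},
    {"sample": "wt15", "mz": 300.2, "intensity": 950.0},
]

row_filter = Node("Row Filter", lambda t: [r for r in t if r["intensity"] >= 100.0])
sorter = Node("Sorter", lambda t: sorted(t, key=lambda r: r["intensity"]))

for n in (row_filter, sorter):
    n.configure()

result = run_workflow([row_filter, sorter], peaks)
```

Each node keeps its own output, which mirrors how intermediate tables can be inspected at any port after execution.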
Exercises: Preliminaries
• Pre-installed KNIME Desktop 2.9.1
• Workflows
• starters, xcms, openms, masscascade
• Data
• FAAH knockout LC/MS data
• ESB tomato LC/MS QC data
• ChEBI SDFile, KEGG SDFile
• Plug-Ins (more in About KNIME → Installation Details)
• R (interactive)
• Erl Wood, CDK
• OpenMS, MassCascade
Exercises: Installation
• Open your KNIME directory
• ~/Desktop/knime_2.9.1
• ./knime.exe
• Memory allocation
• ./knime.ini
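As a rough sketch of the memory allocation step: KNIME is built on Eclipse, and in Eclipse-style launcher files the JVM options follow a `-vmargs` line, with the maximum heap size given by `-Xmx`. The helper below edits that entry in the text of a `knime.ini`. The function name `set_max_heap` and the sample file content are illustrative assumptions; check your own `knime.ini` before changing it.

```python
import re


def set_max_heap(ini_text, size):
    """Return ini_text with the JVM -Xmx line set to `size` (for example "4g")."""
    lines = ini_text.splitlines()
    new_line = f"-Xmx{size}"
    for i, line in enumerate(lines):
        if re.fullmatch(r"-Xmx\d+[kKmMgG]?", line.strip()):
            lines[i] = new_line  # replace the existing heap setting
            break
    else:
        # No -Xmx entry yet: JVM options belong after the -vmargs line.
        if not any(line.strip() == "-vmargs" for line in lines):
            lines.append("-vmargs")
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Illustrative content only, not copied from a real installation.
example = "-vmargs\n-Xmx1024m\n"
updated = set_max_heap(example, "4g")
```

Editing the file by hand in a text editor works just as well; the sketch only makes explicit which line matters.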
Exercises: Starters
• More examples available from the Examples repository
Exercises: MassCascade
https://bitbucket.org/sbeisken/masscascadeknime/wiki/ExampleWorkflows
Exercises: XCMS
http://www.bioconductor.org/packages/devel/data/experiment/manuals/faahKO/man/faahKO.pdf
Exercises: OpenMS
http://ftp.mi.fu-berlin.de/OpenMS/release-documentation/OpenMS_tutorial.pdf
Final Remarks
• Workflows can make exploratory or repetitive data tasks easier and save time
• Extensive data pre-processing functionality
• Extensions for statistics, machine learning, bio-, and cheminformatics
• Integration of R (XCMS) and spectrometry extensions can help you to build elaborate pipelines and share work
• Can help to organize one’s thoughts.
• It’s actually quite a bit of fun.
Resources
• KNIME Forum
• http://www.knime.org/
• KNIME Learning Hub
• http://www.knime.org/learning-hub
• Quickstart Guide
• http://tech.knime.org/files/KNIME_quickstart.pdf
• Happy to Help
Q&A