Aleksi Kallio CSC – IT Center for Science chipster@csc.fi Chipster and collaboration with other bioinformatics platforms

Page 1: Aleksi Kallio CSC – IT Center for Science chipster@csc.fi Chipster and collaboration with other bioinformatics platforms.

Aleksi Kallio
CSC – IT Center for Science

chipster@csc.fi

Chipster and collaboration with other bioinformatics platforms

Chipster introduction

Chipster in a nutshell

Free, open source software for analyzing high-throughput data such as NGS

Available as a ready-to-run VM with a large collection of analysis tools and reference data
• Use directly or via Chipster GUI

Chipster GUI enables users to
• Visualize data efficiently
• Share analysis sessions
• Document what they have done
• Save and share automatic workflows

Analysis tools for different kinds of data

140 NGS tools for
• RNA-seq
• miRNA-seq
• exome/genome-seq
• ChIP-seq
• FAIRE/DNase-seq
• MeDIP-seq
• CNA-seq
• Metagenomics (16S rRNA)

140 microarray tools for
• gene expression
• miRNA expression
• protein expression
• aCGH
• SNP
• integration of different data

60 tools for sequence analysis
• BLAST, EMBOSS, MAFFT
• Phylip

Technical features

Client-server user interface, loosely coupled distributed backend
• Can be spread over different clouds, elasticity

Data on client or server side
• No duplication of data

Workflows – reusing and sharing your analysis pipeline
• You can save your analysis steps as a reusable automatic "macro"

Web based interface for system administration and tool development
• Tool scripts can be R, Python or Java

Integrated user support functionality
• Easy to see what the user has done
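The reusable "macro" idea above can be illustrated with a minimal Python sketch. This is not Chipster's workflow format or API: the `Step` and `Workflow` classes and the two stand-in tools (`filter_low`, `normalize`) are hypothetical names invented for this example. It only shows the general pattern of recording analysis steps once and replaying them on new data.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Step:
    """One recorded analysis step: a tool function plus its parameters."""
    tool: Callable[..., Any]
    params: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Workflow:
    """A reusable 'macro': an ordered list of recorded steps (hypothetical)."""
    steps: List[Step] = field(default_factory=list)

    def record(self, tool: Callable[..., Any], **params: Any) -> "Workflow":
        # Append a step and return self so that calls can be chained.
        self.steps.append(Step(tool, params))
        return self

    def run(self, data: Any) -> Any:
        # Feed the output of each step into the next one, in recorded order.
        for step in self.steps:
            data = step.tool(data, **step.params)
        return data


# Hypothetical stand-in tools; real analysis tools would go here.
def filter_low(values: List[float], threshold: float) -> List[float]:
    return [v for v in values if v >= threshold]


def normalize(values: List[float]) -> List[float]:
    total = sum(values)
    return [v / total for v in values]


# Record the steps once, then replay the same macro on two datasets.
workflow = Workflow().record(filter_low, threshold=10).record(normalize)
result_a = workflow.run([5, 10, 30])
result_b = workflow.run([2, 20, 20, 60])
```

Because the workflow object holds only the steps and their parameters, the same macro can be applied to any number of datasets or shared with another user.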

Chipster admin GUI

Chipster compared to Galaxy

There is no obvious way to compare two complex systems…
• Windows vs. Linux, vi vs. Emacs, Python vs. Java…

Many technical differences, but maybe the core difference is in how tools, workflows etc. are presented to the user
• Chipster's approach is more integrated: focus on usability, consistent biologist-friendly terminology, single complete virtual machine distribution, automated updates for the whole system
• Galaxy's approach is more modular: focus on tool distribution, tool developer community, workflow-driven work, several customised versions available
• YMMV…

Typical feedback we hear: in Chipster people like the GUI and in particular being able to visualise the session, in Galaxy people like the breadth of available tools and integrations

Collaboration opportunities

Tool evaluation and selection

Selecting best tools takes effort

Finding and testing example datasets takes effort

Wiki for shared best practices?
• Should include basic justification for selection e.g. benchmarks, references to review articles…
• Lightweight alternative to full comparison or review article

Cloud integration

Combined efforts to bring different bioinformatics platforms to major generic and scientific clouds

Not only about software infrastructure parts (easy), but tools and databases (hard) and keeping them up-to-date (really hard)

Tools to achieve elasticity not only within platforms, but across platforms
• Running several platforms and scaling resources across the platforms according to changing workloads
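As a rough illustration of scaling resources across platforms according to changing workloads, the sketch below splits a shared pool of worker nodes in proportion to each platform's queued jobs. It is an assumption-laden toy, not an existing scheduler: the function `allocate_nodes`, the platform labels and the job counts are all hypothetical.

```python
from typing import Dict


def allocate_nodes(total_nodes: int, queued_jobs: Dict[str, int]) -> Dict[str, int]:
    """Split a shared node pool across platforms in proportion to queued jobs.

    Uses the largest-remainder method so that the allocations are whole
    numbers and always sum to total_nodes (when there is any work at all).
    """
    total_jobs = sum(queued_jobs.values())
    if total_jobs == 0:
        # No work anywhere: release every node.
        return {name: 0 for name in queued_jobs}

    # Ideal (fractional) share of nodes for each platform.
    exact = {name: total_nodes * jobs / total_jobs
             for name, jobs in queued_jobs.items()}
    # Start from the rounded-down shares.
    allocation = {name: int(share) for name, share in exact.items()}

    # Hand out the remaining nodes to the largest fractional remainders.
    leftover = total_nodes - sum(allocation.values())
    by_remainder = sorted(exact, key=lambda name: exact[name] - allocation[name],
                          reverse=True)
    for name in by_remainder[:leftover]:
        allocation[name] += 1
    return allocation


# Hypothetical workloads at two points in time, sharing one pool of nodes.
morning = allocate_nodes(8, {"chipster": 30, "galaxy": 10})
evening = allocate_nodes(10, {"chipster": 10, "galaxy": 20})
```

Re-running the allocation whenever the queues change moves nodes from the quieter platform to the busier one, which is the cross-platform elasticity the slide describes.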

Chipster in EGI FedCloud

Chipster VM available in FedCloud Applications Database

Chipster Virtual Organization

Tool platform

Why does every platform need to integrate the same tools and databases, but in a different way?

There are many highly sophisticated solutions out there, but typically with low coverage of the day-to-day tools

Virtual machine images and containers (Docker) are practical tools for software packaging

Supporting technologies
• Bio-Linux, Debian Med
• BioImg.org
• CernVM-FS

Tool platform idea

Factory that generates VMIs or containers 24/7
• Always tested
• Latest software versions

High-quality and widely used VMIs/containers
• One good software bundle is better than dozens of poorly baked ones

Vision: you can just assume that software and databases are there. "Spotify" of bioinformatics software.

For cloud: automatically updated VM with hooks for update events

Could there be a NeIC project around this?
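One way to picture the "always tested" factory is a build cycle that publishes an image only when every smoke test passes. The Python sketch below is purely illustrative: `Image`, `build_image`, `factory_cycle`, the test functions, the tags and the version strings are hypothetical placeholders, and a real factory would call an actual image or container build system instead of constructing a dataclass.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Image:
    """A built VM image or container, described by the tool versions it bundles."""
    tag: str
    tool_versions: Dict[str, str]


def build_image(tag: str, latest_versions: Dict[str, str]) -> Image:
    # Placeholder for a real build step (assumption: no actual build happens here).
    return Image(tag=tag, tool_versions=dict(latest_versions))


def factory_cycle(tag: str,
                  latest_versions: Dict[str, str],
                  smoke_tests: List[Callable[[Image], bool]],
                  published: List[Image]) -> bool:
    """Build one candidate image and publish it only if all tests pass."""
    candidate = build_image(tag, latest_versions)
    if all(test(candidate) for test in smoke_tests):
        published.append(candidate)
        return True
    return False


# Hypothetical smoke tests: check that the expected tools are bundled.
def has_blast(image: Image) -> bool:
    return "blast" in image.tool_versions


def has_mafft(image: Image) -> bool:
    return "mafft" in image.tool_versions


published: List[Image] = []
tests = [has_blast, has_mafft]

# Version strings are illustrative placeholders, not real release numbers.
ok = factory_cycle("nightly-001", {"blast": "1.0", "mafft": "2.0"}, tests, published)
rejected = factory_cycle("nightly-002", {"blast": "1.1"}, tests, published)
```

Running such a cycle on a schedule would keep only tested images in the published list, so users could rely on the most recent published image rather than baking their own.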

Finally

Thanks to users and contributors!

More info

chipster@csc.fi
http://chipster.csc.fi
http://chipster.github.io/chipster/