Scipion: Toward software integration, reproducibility and validation in EM image processing...

Scipion: Toward software integration, reproducibility and validation in EM image processing Biocomputing Unit, Instruct Image Processing Center, CNB-CSIC J.M. de la Rosa Trevín


Page 1:

Scipion: Toward software integration, reproducibility and validation in EM image processing

Biocomputing Unit, Instruct Image Processing Center, CNB-CSIC

J.M. de la Rosa Trevín

Page 2:

There are different EM modalities

• Single Particles

• Helical

• 2D Crystallography

• Tomography

Page 3:

There is a long path to produce a 3D model

Sample Preparation

Image Acquisition

2D Analysis

3D reconstruction

3D model

Page 4:

Image acquisition and preprocessing

Data collection

Movie alignment (with DDD)

Micrograph CTF estimation

Page 5:

2D image processing for Single Particles

Particle Picking

Screening - Preprocessing

Alignment and Classification

Page 6:

3D reconstruction

Initial model

3D classification

3D refinement

VALIDATION

Page 7:

The EM field needs software integration

Using different EM software packages is now like the Tower of Babel

Page 8:

Appion is certainly a pioneering work in software integration (and our main inspiration)

The number of external tools added to current EM packages (such as Eman2, Xmipp3 or Relion) is increasing

But it is still complicated to use tools from different packages in one project…

Page 9:

Scipion goals

1. Integrate EM software packages to be used in the same project.

2. Full project traceability, improving reproducibility.

3. Execute complete workflows in an automated manner.

4. Easy to install and use.

5. Easy to extend with new protocols.

Page 10:

Goal 1: Integrate EM software packages to be used in the same project.

Page 11:

All vs All is hard to maintain and extend

All conversions: N*N

New package: 2*N

Page 12:

It is better to have a common format

All conversions: N+N

New package: 2
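The counts on these two slides can be checked with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it counts one-directional converters. Under that assumption the exact all-vs-all count is N*(N-1), which the slide rounds to N*N, and adding a package to N existing ones costs 2*N new converters, against 2 with a common format.

```python
from typing import Tuple


def pairwise_converters(n: int) -> int:
    """Directed converters when every package converts to every other one."""
    return n * (n - 1)


def hub_converters(n: int) -> int:
    """Converters when each package only reads and writes a common format."""
    return 2 * n


def cost_of_new_package(n_existing: int) -> Tuple[int, int]:
    """Extra converters needed to add one package: (all-vs-all, common format)."""
    pairwise = pairwise_converters(n_existing + 1) - pairwise_converters(n_existing)
    hub = hub_converters(n_existing + 1) - hub_converters(n_existing)
    return pairwise, hub


if __name__ == "__main__":
    for n in (3, 5, 8):
        print(n, pairwise_converters(n), hub_converters(n), cost_of_new_package(n))
```

The gap widens with every package added, which is the maintenance argument the slides make.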

Page 13:

We bridge across package differences by modeling our domain

3D Reconstruction

Set of Images

Initial Model

3D Volume

Data

Protocols
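One way to picture this split between data and protocols is a handful of small classes. This is a minimal sketch, not Scipion's actual class hierarchy: every class, attribute and file name below is an assumption made for illustration, and the protocol body is a placeholder rather than a real reconstruction.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Image:
    """Data object: a single image on disk (hypothetical)."""
    filename: str


@dataclass
class SetOfImages:
    """Data object: an ordered collection of images (hypothetical)."""
    images: List[Image] = field(default_factory=list)


@dataclass
class Volume:
    """Data object: a 3D volume, used for initial models and results."""
    filename: str


@dataclass
class Reconstruction3D:
    """Protocol object: consumes data objects and produces a new one."""
    input_images: SetOfImages
    initial_model: Volume
    output_name: str = "reconstruction.vol"

    def run(self) -> Volume:
        # Placeholder: a real protocol would call an EM package here.
        if not self.input_images.images:
            raise ValueError("need at least one input image")
        return Volume(self.output_name)
```

Because inputs and outputs are typed data objects rather than package-specific files, any protocol that emits a `Volume` could in principle feed any protocol that accepts one.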

Page 14:

We need conversion functions for each package
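A conversion function in this scheme only has to translate between one package's files and the common model. The sketch below assumes two invented text formats ("A" and "B") for particle coordinates; they do not correspond to any real package, and the registry names are hypothetical.

```python
from typing import Callable, Dict, List

# Common model (an assumption): {"micrograph": str, "x": int, "y": int}
Particle = Dict[str, object]


def read_format_a(text: str) -> List[Particle]:
    """Hypothetical format A: one 'micrograph x y' record per line."""
    particles: List[Particle] = []
    for line in text.splitlines():
        if line.strip():
            mic, x, y = line.split()
            particles.append({"micrograph": mic, "x": int(x), "y": int(y)})
    return particles


def write_format_b(particles: List[Particle]) -> str:
    """Hypothetical format B: comma-separated 'x,y,micrograph' with a header."""
    rows = ["x,y,micrograph"]
    rows += [f"{p['x']},{p['y']},{p['micrograph']}" for p in particles]
    return "\n".join(rows)


# One reader and one writer per package are enough to reach every other package.
READERS: Dict[str, Callable[[str], List[Particle]]] = {"A": read_format_a}
WRITERS: Dict[str, Callable[[List[Particle]], str]] = {"B": write_format_b}


def convert(text: str, source: str, target: str) -> str:
    """Convert by going through the common model, never directly A -> B."""
    return WRITERS[target](READERS[source](text))
```

Supporting a new package then means registering one reader and one writer, without touching the existing ones.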

Page 15:

Goal 2: Full project traceability, improving reproducibility.

Page 16:

Results should be reproducible: no more “black boxes”

Page 17:

We implemented a simple storage mechanism

Mapper Layer

Data Objects

Protocol Objects
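The slide does not say how the mapper layer is implemented, so the following is only a sketch of the idea, assuming a SQLite backend and JSON-encoded attributes. The class name `SqliteMapper` and its table layout are invented for this example. What matters is that data objects and protocol objects go through the same layer, so each result can be traced back to the parameters that produced it.

```python
import json
import sqlite3
from typing import Any, Dict


class SqliteMapper:
    """Hypothetical mapper: stores each object's attributes as a JSON row."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS objects ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "kind TEXT NOT NULL, attributes TEXT NOT NULL)"
        )

    def store(self, kind: str, attributes: Dict[str, Any]) -> int:
        """Persist one object and return its database id."""
        cur = self.conn.execute(
            "INSERT INTO objects (kind, attributes) VALUES (?, ?)",
            (kind, json.dumps(attributes, sort_keys=True)),
        )
        self.conn.commit()
        return cur.lastrowid

    def load(self, object_id: int) -> Dict[str, Any]:
        """Read one object back as a plain dictionary."""
        row = self.conn.execute(
            "SELECT kind, attributes FROM objects WHERE id = ?", (object_id,)
        ).fetchone()
        if row is None:
            raise KeyError(object_id)
        return {"kind": row[0], "attributes": json.loads(row[1])}
```

Storing a protocol's parameters together with the ids of its inputs, as in `mapper.store("Protocol", {"name": "refine", "input": volume_id})`, is the kind of record that makes a run traceable later.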

Page 18:

Goal 3: Execute complete workflows in an automated manner.

Page 19:

Scipion client

Worker Host 1 Worker Host 2

Scipion Server Bookkeeping

Designed to perform distributed execution

Distributed data storage

Big data transfers
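Running a complete workflow automatically comes down to executing each step only after the steps it depends on. The sketch below shows that ordering with the standard-library `graphlib` module (Python 3.9 or later). It runs everything in one process and does not model the client, server or worker hosts from the slide; the function and step names are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set


def run_workflow(
    steps: Dict[str, Callable[[], None]],
    depends_on: Dict[str, Set[str]],
) -> List[str]:
    """Run every step after its prerequisites and return the execution order."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    order = list(TopologicalSorter(depends_on).static_order())
    executed: List[str] = []
    for name in order:
        steps[name]()  # a real system would dispatch this to a worker host
        executed.append(name)
    return executed


if __name__ == "__main__":
    log: List[str] = []
    names = ["import", "ctf", "picking", "classify"]
    steps = {n: (lambda n=n: log.append(n)) for n in names}
    deps = {"ctf": {"import"}, "picking": {"ctf"}, "classify": {"picking"}}
    print(run_workflow(steps, deps))
```

A cycle in the dependencies makes `static_order()` raise `graphlib.CycleError`, which is a reasonable point at which to reject a malformed workflow.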

Page 20:

Goal 4: Easy to install and use.

Goal 5: Easy to extend with new protocols.

(Let's see Scipion in action)

Page 21:

Example 1: Integration of Spider-MDA (in collaboration with Tanvir Shaikh)

Page 22:

Example 2: Integration of Normal Modes analysis and flexible fitting

(in collaboration with Slavica Jonic)

Page 23:

Example 3: Integration of ResMap

(in collaboration with Alp Kucukelbir and Hemant Tagare)

Page 24:

List of currently integrated protocols

Software package and tools | Protocols integrated into Scipion

Xmipp 3.1 | All

Niko Grigorieff | ctffind3.5/4, frealign9.07 refinement and classification

Eman 2.1 | Initial model, particle picking, 3D refinement

Spider | Filters, align APSR, CAPCA, classify Ward, 3D refinement

Relion | Most programs

Bsoft | Particle picking

ResMap | Local resolution estimation

Dosefgpu | DD movie averaging

Page 25:

Roadmap 2015

May: Alpha release. Pilot installations outside Madrid (a few have been made already).

June: Beta release, announced on the 3DEM list

End of summer: Scipion 1.0 release

Page 26:

There is a team behind it

Page 27:

We need to do it all together! All are welcome.

Page 28:

www.structuralbiology.eu

Follow us on Twitter @instructhub