UNICORE as a Tool for Processing the Data from GS FLX Instrument

M. Borcz (1,2), R. Kluszczyński (1), K. Skonieczna (3,4), T. Grzybowski (3), Piotr Bała (1,2)

(1) ICM, University of Warsaw

(2) Faculty of Mathematics and Computer Science, UMK, Toruń

(3) Collegium Medicum, UMK, Bydgoszcz

(4) Postgraduate School, Medical University of Warsaw

MOTIVATION

PROCESSING TIME

STORAGE

TECHNICAL SUPPORT

AUTOMATION

FLEXIBILITY

SECURITY

Informatics for Automation M. Borcz

PL-GRID

„The goal of the PL-Grid project (Polish Infrastructure for Supporting Computational Science in the European Research Space) is to provide the Polish scientific community with an IT platform based on Grid computer clusters, enabling e-science research in various fields.

PL-Grid aims at significantly extending the amount of computing resources provided to the Polish scientific community (by approximately 215 TFlops of computing power and 2500 TB of storage capacity) and constructing a Grid system that will facilitate effective and innovative use of the available resources.”

www.plgrid.pl

UNICORE

UNICORE (Uniform Interface to Computing Resources) is middleware that provides seamless and secure access to Grid resources. UNICORE is part of the Unified Middleware Distribution developed by the EMI project.

www.unicore.eu

www.eu-emi.eu

UNICORE Rich Client (URC)

UNICORE Commandline Client (UCC)

High-Level API (HiLA)

UNICORE

www.unicore.eu

UNICORE WORKFLOW

www.unicore.eu

EXPERIMENT

Determination of 18 complete mitochondrial genome sequences from tumor and matched non-tumor tissues obtained from 9 patients diagnosed with colorectal cancer

mtDNA sequences comparison with the reference sequence

mtDNA mutation identification

Ultra-high-speed processing of mtDNA sequence data

High-throughput GS FLX Instrument (Roche Diagnostics)

Up to 1 million reads, each approximately 500 bp long, in a single experiment

WORKFLOW

GSRunProcessor: data from GS FLX Instrument (Roche Diagnostics), SFF and CWF files

GSReferenceMapper: SFF files

GSReporter: CWF files

GSAssembler: SFF files, FASTA file

BLAST: FASTA file
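
The stages listed above can be read as a small dependency graph. The sketch below is one possible reading, not the authors' implementation: it assumes that GSRunProcessor produces the SFF and CWF files, that GSReferenceMapper, GSReporter and GSAssembler consume them, and that BLAST runs on the FASTA file from GSAssembler. The slide lists only file types per stage, so every edge in DEPENDS_ON is an assumption. It uses Python's standard graphlib module to derive an execution order and the groups of stages that could run side by side.

```python
from graphlib import TopologicalSorter

# Assumed reading of the slide: each stage maps to the stages it depends on.
# These edges are an interpretation; the slide itself only lists file types.
DEPENDS_ON = {
    "GSRunProcessor": set(),                  # instrument data -> SFF and CWF files
    "GSReferenceMapper": {"GSRunProcessor"},  # consumes SFF files
    "GSReporter": {"GSRunProcessor"},         # consumes CWF files
    "GSAssembler": {"GSRunProcessor"},        # consumes SFF files, yields a FASTA file
    "BLAST": {"GSAssembler"},                 # consumes the FASTA file
}


def execution_order(depends_on):
    """Return one valid order in which the stages could be run."""
    return list(TopologicalSorter(depends_on).static_order())


def ready_batches(depends_on):
    """Group stages into batches whose members do not depend on each other."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        batches.append(ready)
        sorter.done(*ready)
    return batches


if __name__ == "__main__":
    print(execution_order(DEPENDS_ON))
    for number, batch in enumerate(ready_batches(DEPENDS_ON), start=1):
        print(number, batch)
```

Under these assumptions the three middle stages form a single batch, which is the part of the pipeline a workflow engine could dispatch concurrently.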

DATA PROCESSING

High-throughput GS FLX Instrument (Roche Diagnostics)

UNICORE Commandline Client (UFTP)

Target System Storage (PL-Grid)

UNICORE Rich Client

Batch System (PL-Grid): GS Run Processor, GS Reporter, GS Reference Mapper, GS Assembler, BLAST

STORAGE

UNICORE RICH CLIENT

Gridbeans are plug-ins that make it possible to run an application on the grid. They generate the job description and give the user a graphical interface for entering input data and viewing results.
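
To make "generate the job description" more concrete, the sketch below assembles a JSON-style description from the values a user might enter in a form. It is illustrative only: the key names (Executable, Arguments, Imports, Exports) are assumptions loosely modelled on JSON job files, the function build_job_description is hypothetical, and the executable and file names are placeholders that do not come from the slides. A real Gridbean is a plug-in for the Rich Client and is not written this way.

```python
import json


def build_job_description(executable, arguments, imports, exports):
    """Assemble a JSON-serialisable job description.

    The key names are assumptions chosen for illustration; they are not
    taken from the slides.
    """
    return {
        "Executable": executable,
        "Arguments": list(arguments),
        # Files staged in before the job starts, as (source, target) pairs.
        "Imports": [{"From": src, "To": dst} for src, dst in imports],
        # Files staged out after the job finishes, as (source, target) pairs.
        "Exports": [{"From": src, "To": dst} for src, dst in exports],
    }


# Hypothetical values standing in for what a user would type into the form.
job = build_job_description(
    executable="gs_reference_mapper",          # placeholder name
    arguments=["reads.sff", "reference.fasta"],
    imports=[("local/reads.sff", "reads.sff")],
    exports=[("mapping_report.txt", "local/mapping_report.txt")],
)

if __name__ == "__main__":
    print(json.dumps(job, indent=2))
```

Keeping the description as plain data means the same structure could be produced by a graphical form or by a script.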

WORKFLOW EDITOR

Gridbeans can be used to build simple jobs, or they can serve as building blocks for workflows that consist of various tasks and operations.

DETAILS

Data: 17 GB

Images: 834 files

File size: 33 MB

Transfer: 3 s / file

GSRunAnalysisPipe:

Interlagos: AMD Opteron(TM) Processor 6272 @ 2.10GHz

AMD: AMD Opteron(tm) Processor 6174 @ 2.20GHz

Intel: Intel(R) Xeon(R) CPU X5660 @ 2.80GHz (InfiniBand)

1 CPU: 70.0 h

8x8 CPUs (Intel, MPI): 2.5 h
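
The figures on this slide support a quick back-of-the-envelope check. The sketch below derives the total transfer time, the parallel speedup and the parallel efficiency. Two assumptions are not stated on the slide: that "8x8" means 64 cores in total, and that the image files are transferred one after another.

```python
# Values taken from the slide.
IMAGE_FILES = 834
SECONDS_PER_FILE = 3
SERIAL_HOURS = 70.0
PARALLEL_HOURS = 2.5

# Assumption: "8x8" is read as 8 nodes with 8 cores each.
CORES = 8 * 8


def transfer_minutes(files, seconds_per_file):
    """Total transfer time in minutes, assuming strictly sequential transfers."""
    return files * seconds_per_file / 60


def speedup(serial_hours, parallel_hours):
    """Ratio of serial run time to parallel run time."""
    return serial_hours / parallel_hours


def efficiency(serial_hours, parallel_hours, cores):
    """Speedup divided by the number of cores used."""
    return speedup(serial_hours, parallel_hours) / cores


if __name__ == "__main__":
    print(f"transfer: {transfer_minutes(IMAGE_FILES, SECONDS_PER_FILE):.1f} min")
    print(f"speedup: {speedup(SERIAL_HOURS, PARALLEL_HOURS):.1f}x")
    print(f"efficiency: {efficiency(SERIAL_HOURS, PARALLEL_HOURS, CORES):.1%}")
```

Under these assumptions the transfer takes about 42 minutes, and the MPI run is 28 times faster than the single-CPU run, which corresponds to a parallel efficiency of roughly 44% on 64 cores.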

SHORT DEMONSTRATION (1)

SHORT DEMONSTRATION (2)

REFERENCES

www.unicore.eu

www.plgrid.pl

www.eu-emi.eu

www.roche.com

„Building a National Distributed e-Infrastructure - PL-Grid”, Lecture Notes in Computer Science, vol. 7136 (subseries: Information Systems and Applications, incl. Internet/Web, and HCI).