Use of Data Provenance and the Grid in Medical Image Analysis and Drug
Discovery – an IXI exemplar
Kelvin K. Leung1, Mark Holden1, Rolf A. Heckemann2, Nadeem Saeed3,
Keith J. Brooks3, Jacky B. Buckton4, Kumar Changani3, David G. Reid3,
Daniel Rueckert5, Joseph V. Hajnal2, Derek L.G. Hill1
1 Division of Imaging Sciences, King's College London, UK
2 Imaging Sciences Department, Imperial College (Hammersmith Hospital Campus), UK
3 Imaging Centre, 4 RA Disease Biology, ri-CEDD, GlaxoSmithKline, UK
5 Department of Computing, Imperial College, UK
Overview
• Background
– Motivations
– Virtual data system
– Automatic delineation of multiple bones in serial MR images of joints in a disease model of Rheumatoid Arthritis (RA)
– Image registration and segmentation propagation
• Methods
– Prototype
• Results
• Conclusions
Motivations
• Medical imaging is set to play an important part in drug discovery
– Recent £76m investment by GlaxoSmithKline (GSK) and Imperial College in a new clinical imaging centre
• Automatic analysis of medical image data requires:
– A lot of storage space (each image is about 32 MB in this work)
– Computational power (processing one image takes about 20-24 hours on a single desktop computer in this work)
• Motivated by the need for computational resources
Motivations
• The Grid has the potential to enable better collaboration between industry and universities through the idea of a virtual organisation
– Universities can provide image analysis algorithms as services to industry partners, such as GSK, over the Grid
• Motivated by the need for better and more effective collaboration with industry
Motivations
• Detailed and reliable documentation of the data provenance of all analyses is very important for obtaining regulatory approval for a new drug.
– Part 11 guidance for industry issued by the US Food and Drug Administration (FDA)
– Good Laboratory Practice (GLP) and Good Clinical Practice (GCP)
• Motivated by the need for data provenance
Virtual data system (VDS or Chimera)
• A system to “enable documentation of data provenance, discovery of available methods and on-demand data generation (so-called ‘virtual data’)”
– Developed by I. Foster, J. Vöckler, M. Wilde and Y. Zhao of the University of Chicago
• It consists of:
– A virtual data catalogue: a virtual data schema that provides a representation of computational procedures and their invocations.
– A virtual data language interpreter, which handles all requests for constructing and querying the database entries.
• Data objects, such as input and output files, are described by logical file names (LFNs), which are mapped to physical files via the Globus replica catalog (RC) or the Globus replica location service (RLS)
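The mapping from logical to physical file names can be pictured with a small sketch. This is only an illustration in Python and not the Globus RC or RLS interface: the class ReplicaCatalog, its methods and the example URLs are hypothetical names invented for this example.

```python
from collections import defaultdict


class ReplicaCatalog:
    """Toy stand-in for a replica catalogue: maps each logical file name (LFN)
    to one or more physical file names (PFNs).
    Hypothetical class; this is not the Globus RLS API."""

    def __init__(self):
        self._replicas = defaultdict(list)

    def register(self, lfn, pfn):
        # Record one more physical copy of the logical file (ignore duplicates).
        if pfn not in self._replicas[lfn]:
            self._replicas[lfn].append(pfn)

    def lookup(self, lfn):
        # Return all known physical locations; an empty list means the
        # file has not been materialised anywhere yet.
        return list(self._replicas.get(lfn, []))


catalog = ReplicaCatalog()
# Example URLs are made up for illustration.
catalog.register("file_a", "gsiftp://grid.example.org/data/file_a.img")
catalog.register("file_a", "file:///scratch/file_a.img")
```

An empty lookup result is what makes the "virtual data" idea work in this sketch: a logical file with no physical copy is a candidate for on-demand generation.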
Virtual data system
• Virtual data language (VDL) is used to describe computational procedures and their invocations
• Computational procedures are defined by transformation (TR) statements. Example:
– TR foo(input file1, output file2) { … }
• Invocations are defined by derivation (DV) statements. Example:
– To invoke foo with logical file names file_a (input) and file_b (output):
– DV call_foo->foo(file1=@{input:"file_a"}, file2=@{output:"file_b"});
• The virtual data schema allows TRs and DVs to be stored
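How stored TRs and DVs support provenance queries can be sketched as follows. This is a minimal Python model under stated assumptions, not the actual virtual data schema: the classes Transformation, Derivation and Catalogue and the method producers_of are hypothetical, and only the names foo, call_foo, file_a and file_b come from the slides.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Transformation:
    """A computational procedure (TR): a name plus formal arguments with directions."""
    name: str
    args: tuple  # e.g. (("file1", "input"), ("file2", "output"))


@dataclass(frozen=True)
class Derivation:
    """An invocation (DV): binds each formal argument of a TR to an LFN."""
    name: str
    transformation: str
    bindings: tuple  # e.g. (("file1", "file_a"), ("file2", "file_b"))


@dataclass
class Catalogue:
    """Hypothetical in-memory store; the real schema lives in a database."""
    transformations: dict = field(default_factory=dict)
    derivations: dict = field(default_factory=dict)

    def add_tr(self, tr):
        self.transformations[tr.name] = tr

    def add_dv(self, dv):
        tr = self.transformations[dv.transformation]  # KeyError if the TR is unknown
        formal = {arg for arg, _ in tr.args}
        bound = {arg for arg, _ in dv.bindings}
        if formal != bound:
            raise ValueError("derivation must bind exactly the TR's arguments")
        self.derivations[dv.name] = dv

    def producers_of(self, lfn):
        """Provenance query: which derivations write the given logical file?"""
        found = []
        for dv in self.derivations.values():
            directions = dict(self.transformations[dv.transformation].args)
            for arg, bound_lfn in dv.bindings:
                if bound_lfn == lfn and directions[arg] == "output":
                    found.append(dv.name)
        return found


catalogue = Catalogue()
catalogue.add_tr(Transformation("foo", (("file1", "input"), ("file2", "output"))))
catalogue.add_dv(Derivation("call_foo", "foo", (("file1", "file_a"), ("file2", "file_b"))))
```

Asking catalogue.producers_of("file_b") returns the derivation that wrote file_b, which is the kind of question an audit trail needs to answer.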
Virtual data system
• Compound TRs can be built so that workflows can be defined. Example:
– To call foo twice and pass the output of the first call to the input of the second call:
– TR compound_foo(input file_in, output file_out, io file_io) {
    call foo(file1=@{input:"file_in"}, file2=@{output:"file_io"});
    call foo(file1=@{input:"file_io"}, file2=@{output:"file_out"});
  };
• When an output file is requested from the system, an abstract directed acyclic graph (DAG), which contains only LFNs, is generated.
• A planner called Pegasus (“Planning for Execution in Grids”) converts the abstract DAG into a Condor DAGMan script and submits it to the Globus universe of Condor.
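The step from a requested output file to an abstract DAG can be sketched in Python. This is an illustration of the idea only, not how VDS or Pegasus is implemented: the function abstract_dag, the dictionary layout and the job names first_foo and second_foo are assumptions made for this example.

```python
from graphlib import TopologicalSorter

# Each hypothetical derivation lists the LFNs it reads and writes.
derivations = {
    "first_foo": {"inputs": ["file_in"], "outputs": ["file_io"]},
    "second_foo": {"inputs": ["file_io"], "outputs": ["file_out"]},
}


def abstract_dag(derivations, requested_lfn):
    """Return {job: set of jobs it depends on} for everything needed to
    produce requested_lfn. Only logical names are used; no physical
    locations or execution sites are resolved at this stage."""
    producer = {lfn: name for name, dv in derivations.items() for lfn in dv["outputs"]}
    dag, pending = {}, [requested_lfn]
    while pending:
        lfn = pending.pop()
        job = producer.get(lfn)
        if job is None or job in dag:
            continue  # raw input file, or job already visited
        dag[job] = {producer[i] for i in derivations[job]["inputs"] if i in producer}
        pending.extend(derivations[job]["inputs"])
    return dag


dag = abstract_dag(derivations, "file_out")
# A valid execution order: every job appears after the jobs it depends on.
order = list(TopologicalSorter(dag).static_order())
```

A planner would then take such a graph and attach concrete sites and physical file locations to each job before submission.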
Automatic delineation of multiple bones
• Rheumatoid Arthritis (RA)
– Is a chronic, systemic, autoimmune inflammatory disease.
– Targets synovial joints, in which there is a massive accumulation of blood-borne cells such as T cells and macrophages.
– Blood vessels are formed to support this new tissue, and the whole mass is called pannus.
– Progressive erosion of cartilage and bone leads to disability in patients.
• MR images were acquired in a disease model of RA
• We are interested in the talus and calcaneus bones in the ankle
• We delineate them from the MR images and study them, e.g. calculate their volume to measure any erosion
Image registration
• Refers to the spatial alignment of two images so that corresponding features in the two images are matched
• The result is a spatial mapping or transformation that transforms positions from one image to positions in another image.
• Example: movie showing the rigid registration of two 3D MR images of a knee (panels: sagittal plane of image 1, sagittal plane of image 2; views: sagittal, transaxial, coronal)
Image registration
• Rigid registration: translation + rotation = 6 degrees of freedom (dof)
• Affine registration: rigid + skewing + scaling = 12 dof
• Nonrigid registration: warps one image into another
– Very computationally demanding because of the large number of dof
– Example: free-form deformation (FFD) models local deformation as the translation of a regularly spaced grid of points (control points)
Movie showing the green MR image of a knee overlaid on top of the grey MR image of a knee before and after warping.
White arrows show the amount of translation of the control points.
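The 6 dof of a rigid registration can be made concrete with a short sketch that builds the spatial mapping as a 4x4 homogeneous matrix. This is a plain Python illustration, not the registration software used in this work; the function names and the x, then y, then z rotation order are assumptions.

```python
import math


def matmul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def rigid_matrix(tx, ty, tz, rx, ry, rz):
    """4x4 homogeneous matrix from 3 translations and 3 rotation angles
    (radians). Rotations are applied in x, then y, then z order, which is
    an assumption; conventions differ between software packages."""
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    rot_x = [[1, 0, 0, 0], [0, cx, -sx, 0], [0, sx, cx, 0], [0, 0, 0, 1]]
    rot_y = [[cy, 0, sy, 0], [0, 1, 0, 0], [-sy, 0, cy, 0], [0, 0, 0, 1]]
    rot_z = [[cz, -sz, 0, 0], [sz, cz, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
    trans = [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]
    return matmul(trans, matmul(rot_z, matmul(rot_y, rot_x)))


def transform_point(matrix, point):
    """Map a 3D position from one image into the other."""
    x, y, z = point
    vec = (x, y, z, 1.0)
    return tuple(sum(matrix[i][k] * vec[k] for k in range(4)) for i in range(3))


# Rotate 90 degrees about z, then shift 10 units along x.
m = rigid_matrix(10.0, 0.0, 0.0, 0.0, 0.0, math.pi / 2)
mapped = transform_point(m, (1.0, 0.0, 0.0))
```

An affine mapping fits in the same 4x4 matrix with three scaling and three skew parameters added, which gives the 12 dof. A nonrigid FFD cannot be written as a single matrix, since every control point carries its own translation.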
Segmentation propagation
• Makes use of the spatial mapping calculated from the registration of two images to perform segmentation
• Requires an atlas
– An atlas is a reference image with labelled structures
[Figure: reference image with a manual segmentation of the calcaneus; applying the spatial mapping gives the computed boundary of the calcaneus in the target image]
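Segmentation propagation and the follow-on size measurement can be sketched on a toy 2D example. This is a simplified Python illustration, assuming a nearest-neighbour label lookup and a made-up one-pixel shift as the spatial mapping; the function names and the pixel size are hypothetical, and the real work uses 3D MR images.

```python
def propagate_labels(atlas_labels, target_shape, mapping):
    """Label every target pixel with the atlas label found at the mapped
    position (nearest-neighbour lookup). `mapping` takes target (row, col)
    and returns atlas (row, col); it stands in for the registration result."""
    rows, cols = target_shape
    target_labels = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            ar, ac = mapping(r, c)
            ar, ac = round(ar), round(ac)
            inside = 0 <= ar < len(atlas_labels) and 0 <= ac < len(atlas_labels[0])
            if inside:
                target_labels[r][c] = atlas_labels[ar][ac]
    return target_labels


def structure_size(labels, label, pixel_size):
    """Count pixels carrying `label` and scale by the size of one pixel
    (the 2D analogue of a voxel-count volume)."""
    return sum(row.count(label) for row in labels) * pixel_size


# Toy 2D atlas: label 1 marks a hypothetical bone region.
atlas = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]


def shift_right(r, c):
    # Assumed mapping: the target is the atlas shifted one pixel to the right.
    return (r, c - 1)


target = propagate_labels(atlas, (4, 4), shift_right)
area = structure_size(target, 1, pixel_size=0.25)
```

Comparing such a size measurement across serial images is one way erosion could be tracked over time.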
Segmentation propagation
[Figure labels: calcaneus, atlas, rigid + non-rigid registration, spatial mapping]
• All image analysis workflows were entered into VDS
Prototype
• Simple web interface to replace some command-line tools of VDS, Globus Toolkit 2.4 and Condor
– Researchers or clinicians working on medical image analysis may not be comfortable with command-line tools and the virtual data language
– Developed using Java servlets on Apache Tomcat
– Web pages for:
  • Querying VDS for transformations and derivations
  • Invoking transformations in VDS
  • Querying, uploading and downloading files to and from Globus RLS
  • Displaying job status in Condor
Prototype
• Web portal machine running Apache Tomcat, Globus client and personal Condor (job submission site)
• Grid machine running Globus Gatekeeper, GridFTP server, Globus RLS and Condor, with an experimental Condor pool of 4 machines (storage and execution site)
Results
• Service to delineate the calcaneus and talus from the target image
[Workflow diagram. Steps and the files attached to them:
– Rigid registration: target, reference_image, aregdof
– Segmentation propagation (talus): talus, tal_dof, talus_seg
– Segmentation propagation (calcaneus): calcaneus, cal_dof, cal_seg]
Conclusions
• We integrated Grid middleware and a data provenance tool with medical image processing software in a prototype system, in collaboration with GSK
• The data provenance of the results was kept in VDS, where it can be queried and retrieved easily.
– Aims to satisfy guidelines issued by the US FDA, GLP and GCP on the maintenance of an “audit trail” of electronic records.
• The total processing time for delineating 12 bones from 6 subjects was cut from about 132 hours to about 33 hours (a factor of 4) by running the computing tasks on a Condor pool instead of a single desktop computer
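The timing figures can be checked with a few lines of arithmetic. The sketch below uses only the approximate numbers quoted in these slides; the efficiency figure assumes that all 4 machines of the experimental Condor pool were busy for the whole run, which the slides do not state.

```python
single_desktop_hours = 132.0  # approximate total from the slides
condor_pool_hours = 33.0      # approximate total from the slides
pool_machines = 4             # size of the experimental Condor pool
subjects = 6

# Speed-up: how many times faster the pool run was.
speedup = single_desktop_hours / condor_pool_hours

# Efficiency assumes all 4 pool machines were used for the whole run.
parallel_efficiency = speedup / pool_machines

# Per-subject time on one desktop, to compare with the quoted 20-24 hours per image.
hours_per_subject = single_desktop_hours / subjects
```

Under these assumptions the speed-up equals the pool size, which suggests the per-subject jobs ran essentially independently, and 22 hours per subject sits inside the 20-24 hour range quoted for a single desktop.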
Further work
• More user feedback is required to evaluate and improve the system
• Further validation and application to a larger number of subjects are required to determine the sensitivity of the delineation technique to disease progression