EGEE-III INFSO-RI-222667
Enabling Grids for E-sciencE
www.eu-egee.org
The Medical Data Manager: the components
Johan Montagnat, Romain Texier, Tristan Glatard
CNRS, I3S laboratory
Medical Data Manager, R. Texier, July 16, 2008 2
EGEE Medical Data Manager
• Objectives
– Expose a standard grid interface (SRM) for medical image servers (DICOM)
– Use native DICOM storage format
– Fulfill medical applications security requirements
– Do not interfere with clinical practice
[Figure: User Interfaces, Worker Nodes and DICOM clients; DICOM interface and SRM interface; DICOM server]
Medical data protection
• Content
– Medical images (data, confidential)
– Patient folder (attached metadata, very sensitive)
• Requirements
– Patient privacy
  Needs fine-grained access control (ACLs on all data and metadata)
  Needs metadata containment (metadata databases administered by accredited staff)
– Data protection
  Needs data encryption (even grid site administrators are not accredited to access the data)
• How important is it?
– The medical community will simply not use a system it does not trust (both a technical and a human problem)
MDM main components
• Usability
– LFC API provides transparent access
• Privacy
– LFC and DPM provide file-level ACLs
– AMGA provides secured metadata communication and ACLs
• Data protection
– SRM-DICOM provides on-the-fly data anonymization; it is DPM-based (SRM v2 interface)
– Hydra key store provides encryption / decryption transparently
– Data is anonymized prior to transmission
[Figure: LFC, AMGA Metadata, SRM-DICOM interface]
Exploiting DPM extensibility
• DPM can access different storage back-ends through plugins
– The DPM-DICOM plugin prepares the file
• DPM exposes a standard Storage Element interface (SRM)
• DPM provides standard file exchange protocols and file access control
[Figure: SFN request, standard interface, DPM head, DPM-DICOM plugin, DPM-DICOM library, DICOM GET, temporary copy, DPM disk pool, file retrieval]
Medical Data Registration
1. Image is acquired
2. Image is stored in DICOM server
3. gLite client
3a. Image is registered (a GUID is associated)
3b. Image key is produced and registered
4. Image metadata are registered
[Figure: gLite API, AMGA Metadata, LFC File Catalog, DICOM server, DPM, Hydra keystore]
Medical Data Registration
– All these steps can be done by a single CLI
– A DICOM transaction can initiate the registration
[Figure: Triggers: DICOM PUSH to the DICOM server, MDM registration]
Registration in Hydra
• Each DICOM image is uniquely identified by its Study/Series/SOP identifiers
• The Hydra servers generate a key for the selected cipher
• The cipher and the key are associated with the unique DICOM identifiers
[Figure: DICOM image; analyze Study ID, Series ID, SOP ID; select a cipher and generate a key; Hydra servers]
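The association between DICOM identifiers, cipher and key can be sketched as follows. This is a minimal in-memory illustration, not the Hydra API: the `register_key` function, the dictionary used as a key store, and the cipher names in `KEY_SIZES` are all assumptions made for the example.

```python
import secrets

# Hypothetical cipher names and key lengths in bytes (assumed for illustration).
KEY_SIZES = {"aes-128": 16, "aes-256": 32}


def dicom_identifier(study_uid: str, series_uid: str, sop_uid: str) -> str:
    """Build the unique image identifier from the three DICOM identifiers."""
    return "/".join((study_uid, series_uid, sop_uid))


def register_key(store: dict, study_uid: str, series_uid: str, sop_uid: str,
                 cipher: str = "aes-256") -> str:
    """Generate a random key for the chosen cipher and bind both to the image."""
    image_id = dicom_identifier(study_uid, series_uid, sop_uid)
    if image_id in store:
        raise KeyError(f"a key is already registered for {image_id}")
    store[image_id] = {
        "cipher": cipher,
        "key": secrets.token_bytes(KEY_SIZES[cipher]),
    }
    return image_id
```

In this sketch the dictionary stands in for one key store; a real deployment would delegate storage and access control to the Hydra servers.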
File identifiers registration
• A reference to a file is recorded in the DPM, but no copy of the file in the DPM disk pool is needed
• Directories with the Study, Series and SOP identifiers are created in the LFC
• The anonymized data fields are registered in the AMGA server
Information recorded per component:
– DPM: SURL and PFN, the size of the file, host of the disk pool, ...
– LFC: LFN and SURL, size of the file
– AMGA: DICOM image metadata
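A rough sketch of what each component records at registration time is given below. The `Registration` container, the `register_image` helper, the LFN layout under `root`, and the choice to key the AMGA entry by LFN are assumptions for illustration only; they are not taken from the MDM code.

```python
from dataclasses import dataclass, field


@dataclass
class Registration:
    """In-memory stand-ins for the three catalogs (hypothetical)."""
    dpm: dict = field(default_factory=dict)   # SURL -> PFN, size, disk pool host
    lfc: dict = field(default_factory=dict)   # LFN -> SURL, size
    amga: dict = field(default_factory=dict)  # LFN -> anonymized metadata


def register_image(reg, root, study, series, sop,
                   surl, pfn, size, host, metadata):
    """Record one image in all three catalogs and return its LFN."""
    # Directories named after the Study, Series and SOP identifiers.
    lfn = f"{root}/{study}/{series}/{sop}"
    # The DPM keeps only a reference; no copy of the file is made here.
    reg.dpm[surl] = {"pfn": pfn, "size": size, "host": host}
    reg.lfc[lfn] = {"surl": surl, "size": size}
    reg.amga[lfn] = dict(metadata)
    return lfn
```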
Access control rights management
• To allow a user to access a medical file and its metadata, the owner of the file must set the access rights in all the components:
• Example:
LFC
DPM
Hydra
AMGA
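The rule that rights must be set in every component can be illustrated with a small model. The nested dictionary of ACLs and the `grant_read` / `can_read` helpers are hypothetical; the real commands differ for each service.

```python
COMPONENTS = ("LFC", "DPM", "Hydra", "AMGA")


def grant_read(acls: dict, resource: str, user: str) -> None:
    """Give `user` read access to `resource` in every component."""
    for component in COMPONENTS:
        acls.setdefault(component, {}).setdefault(resource, set()).add(user)


def can_read(acls: dict, resource: str, user: str) -> bool:
    """Access succeeds only if all four components allow it."""
    return all(
        user in acls.get(component, {}).get(resource, set())
        for component in COMPONENTS
    )
```

If the right is missing in a single component (for example the key ACL in Hydra), `can_read` returns `False` even though the file and metadata ACLs are in place.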
Medical Data Retrieval
1. get GUID from metadata
2. lcg client
3. get SFN from GUID
4. request file
5. get file key
6. on-the-fly encryption and anonymization, return encrypted file
7. get file key and decrypt file locally
[Figure: User Interface and Worker Node; gLite API; AMGA Metadata (metadata ACL control); LFC File Catalog (file ACL control); Hydra keystore (key ACL control); SRM-DICOM interface (anonymization & encryption)]
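The retrieval steps can be traced with a toy model in which dictionaries stand in for AMGA, the LFC, the storage element and the Hydra key store. Everything here is an assumption for illustration: the function names are hypothetical, ACL checks and anonymization are left out, and `keystream_xor` is a toy cipher that is NOT secure and is not what Hydra uses.

```python
import hashlib


def keystream_xor(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher (SHA-256 counter keystream). Illustration only."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def serve_file(sfn, guid, storage, hydra):
    """Server side (steps 4 to 6): fetch the key and return the encrypted file."""
    return keystream_xor(storage[sfn], hydra[guid])


def retrieve(query, amga, lfc, storage, hydra):
    """Client side: steps 1, 3 and 7 around the server-side request."""
    guid = amga[query]                                  # 1. GUID from metadata
    sfn = lfc[guid]                                     # 3. SFN from GUID
    encrypted = serve_file(sfn, guid, storage, hydra)   # 4-6. encrypted transfer
    return keystream_xor(encrypted, hydra[guid])        # 7. decrypt locally
```

The point of the sketch is the ordering: the file only travels in encrypted form, and the client needs its own access to the key store to decrypt it.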
DICOM retrieval: get the DICOM file
• The PFN associated with a DICOM file is resolved by the DPM-DICOM plugin
• The plugin performs a DICOM transaction with the DICOM server to retrieve the medical image
• By default, MDM is packaged with the Conquest DICOM server, but it is intended to interface with production servers
[Figure: SURL request, DPM, DPM-DICOM plugin, DPM-DICOM library, DICOM GET; the database associates each SURL with a PFN]
DICOM retrieval: anonymization and encryption
• Step 1A: DPM-DICOM uses the DCMTK library to anonymize the DICOM file
• Or Step 1B: The DICOM file is converted to a 3D format (inrimage) without nominative information
• Step 2: DPM-DICOM calls Hydra to encrypt the final file
• DPM-DICOM uses the RFIO library to copy the file to a spool disk. The spool disk is only a buffer for the file.
[Figure: SURL request, DPM-DICOM plugin, DPM-DICOM library, DICOM server, DICOM file, image anonymization (1A, 1B), encryption (2), standard interface, DPM disk pool, file transfer]
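The anonymize-then-encrypt-then-spool order can be sketched without any DICOM library. Here a plain dictionary stands in for the DICOM header, the `NOMINATIVE` set is an assumed example of identifying attributes (the set actually removed by MDM is not given in these slides), and `encrypt` is any caller-supplied function standing in for the Hydra-backed encryption.

```python
import json

# Assumed example set of nominative attributes; not the list used by MDM.
NOMINATIVE = {"PatientName", "PatientID", "PatientBirthDate", "PatientAddress"}


def anonymize(attributes: dict) -> dict:
    """Step 1A (sketch): drop the nominative attributes from the header."""
    return {k: v for k, v in attributes.items() if k not in NOMINATIVE}


def prepare(attributes, pixel_data, encrypt, spool, surl):
    """Anonymize, encrypt, then place the result in the spool buffer."""
    clean = anonymize(attributes)
    plain = json.dumps(clean, sort_keys=True).encode() + b"\n" + pixel_data
    payload = encrypt(plain)   # Step 2: encryption of the final file
    spool[surl] = payload      # the spool holds only the encrypted copy
    return payload
```

Because encryption happens before the copy to the spool, nothing nominative and nothing in clear text reaches the buffer in this model.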
Service Distribution
• Hospital sites have to remain autonomous
– With strong (in-site) control over the sensitive metadata
• The EGEE Data Management System federates distributed data files
• AMGA supports database replication but not distribution
– Asynchronous, master-slave model, with partial replication of the directory hierarchy
– The MDM includes a library and a query client that provide multi-site metadata server federation. The client is based on the AMGA client and is syntactically compatible (transparency).
  Users can send commands to only one or to all the servers
  Users can dynamically add or remove servers
[Figure: several independent AMGA servers]
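The behaviour of the multi-server client (one command sent to one or to all servers, servers added or removed at run time) can be modelled as below. The `FederatedClient` class and its methods are hypothetical and do not reproduce the AMGA client interface; each server is represented by a plain callable.

```python
class FederatedClient:
    """Fan a command out to one or all registered metadata servers."""

    def __init__(self):
        self.servers = {}  # name -> callable(command) returning result rows

    def add_server(self, name, execute):
        self.servers[name] = execute

    def remove_server(self, name):
        self.servers.pop(name, None)

    def send(self, command, server=None):
        """Send to `server` only, or to every server when it is None."""
        targets = [server] if server is not None else list(self.servers)
        return {name: self.servers[name](command) for name in targets}
```

Results stay grouped per site in this sketch, which keeps it visible which hospital answered; merging them into one list is a design choice left open here.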
Use cases
• File administration
– A system administrator has access to the file for replication / backup procedures
– No access to the file content, nor to the metadata
• Image processing
– A neuroscientist has access to the file content for image analysis
– No access to the nominative metadata
• Medical analysis
– A physician involved in the patient's healthcare has access to all data and metadata
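The three use cases amount to a small permission table. The resource labels (`file`, `content`, `nominative_metadata`) and the `allowed` helper are assumptions chosen for this sketch; in MDM the same effect comes from the separate ACLs on files, keys and metadata.

```python
# Which resources each role may access, following the three use cases.
PERMISSIONS = {
    "system administrator": {"file"},
    "neuroscientist": {"file", "content"},
    "physician": {"file", "content", "nominative_metadata"},
}


def allowed(role: str, resource: str) -> bool:
    """Return True if the role may access the resource; unknown roles get nothing."""
    return resource in PERMISSIONS.get(role, set())
```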
The User Interface
[Figure: grid architecture overview. Blocks: User Interface, Data Management, Workload Management, Information Service, Sites Resources (Site X: Computing Resources, Storage Resources), logging and real-time monitoring, authorization and authentication. Arrows: user requests, queries, resources allocation, publication of resources info, indexing, datasets info, dynamic evolution]
The User Interface
• Installation
– Very few components
– Easy to install
– The Hydra client will be part of the standard UI
• Configuration
– Standard user configuration for the LFC and BDII
– No configuration for the DPM
– Only one file for:
  Hydra (services.xml)
  AMGA (.mdclient)
[Figure: LFC, AMGA servers, Hydra servers, multi-server AMGA client, Hydra client]
The MDM components
[Figure: grid architecture overview showing where the MDM components fit. Blocks: Data Management, Workload Management, Information Service, Sites Resources (Site X: Computing Resources, Storage Resources), logging and real-time monitoring, authorization and authentication. Arrows: user requests, queries, resources allocation, publication of resources info, indexing, datasets info, dynamic evolution]
MDM is built on top of
• SL4 (and CentOS) for the DPM version of the MDM
• SL3 for the gLite-IO version of the MDM
• Libraries (gLite, DCMTK, etc.)
The MDM components
– AMGA Metadata: AMGA database front-end, access control
– Hydra key store: access control
– Instrumented DPM Storage Element: SRM v2 interface, access control, DPM-DICOM plugin
– LFC File Catalog: access control
– DICOM Server
– BDII Server
One server
• All the components could be on the same server
[Figure: one server hosting the DPM head node, DPM disk servers, the DPM-DICOM plugin and the BDII server; file namespace labels /dpm, /domain, /home, /vo, …]
Behind the components
• The LFC of the BIOMED VO is used
• The BDII must be registered in a top-level BDII
• By default, AMGA uses a PostgreSQL database to store the metadata
– Can use other databases (MySQL, SQLite, Oracle)
• The DPM is only a buffer:
– The storage area should be small.
– The files are already encrypted.
– The files in the DPM can be replicated by other servers
[Figure: LFC, AMGA Metadata, SRM-DICOM interface]
Behind the components
• The Hydra server uses MySQL to store the keys
– Each Hydra server uses well-separated tables/databases
• The Hydra server runs on top of Tomcat and an Apache server
• All the DICOM images are stored in the DICOM server
– If there is no DICOM server, the MDM provides the Conquest server
Installation procedure
• Add some yum repositories
• Install with
– yum install MDM
Configuration procedure
• The server must be registered in EGEE
– The server receives a certificate
• Today, there is no automatic configuration procedure
• The configuration procedure is described
• Some parts of the configuration (firewall, DPM buffer, etc.) are already automatic