DICOM Objects: Unstructured Data in Oracle 11g
Naomi Rafael, BioGrid Australia
16 August 2010
The most comprehensive Oracle applications & technology content under one roof
Agenda
• Purpose and Description of BioGrid Australia
• The MRI Images Collection for Melbourne Health
• Oracle 11g Advantages
• Design of the Images Database
• Examples of Utilities and Techniques
• Current Challenges
• References
BioGrid Australia Vision
• Facilitate multi-disciplinary medical research
• Leverage research collaboration
• Link heterogeneous data from multiple institutions
• Confer value, retain and re-use health data
• Enforce system security
• Respect patient privacy
• Select pragmatic technology
BioGrid Architecture
The Federated Data Integrator (FDI) is the hub of the BioGrid architecture. Data from heterogeneous data sources are integrated into one virtual repository on this server.
[Architecture diagram. Labels: Oracle 11g; VPN; Federated Data Integrator; FDIPRD; USI Server; USIDB; Data Linkage; Local Research Repository (LRR); LRRs; Research Repositories at Other Institutions; Internet; DBIMGS; Oracle 11g Server (Size: 2.5 TB, Images: 15 million); Demographics, De-identified Data; Users; DMZ; Terminal Server, Reverse Proxy]
IF YOU WANTED TO COMPARE the volume and shape of the brain of individual EPILEPSY PATIENTS, where would you start?
The Images Local Research Repository (1)
• First take 7 million proprietary Magnetic Resonance Images (MRIs) on over 1000 DAT format tapes
• Convert to Digital Imaging and Communications in Medicine (DICOM) format
• Store and index images on-line
• Extract DICOM header information
• Link into BioGrid Australia and issue record linking ID
The Images Local Research Repository (2)
• Retrieve identified and de-identified images on demand
• Be economical and sustainable
• Add 8 million more MRI images (stored on Optical Disk storage technology)
Oracle 11g Advantages (1)
• Oracle Database 11g stores the images on-line
• The images can be retrieved on demand
• Oracle Database 11g indexes and partitions for fast query
• Oracle Multimedia 11g has a dedicated DICOM data type with rich feature set
• SQL*Loader can be tuned for fast image load
Oracle 11g Advantages (2)
• Security features are available
• Compression is available at the LOB level, on backup, and on Data Pump export
• Application Express is available for rapid application development
• And for Melbourne Health: installation licensed under the Victoria Department of Health statewide Oracle license
ORDDICOM object type
• Digital Imaging and Communications in Medicine
• http://medical.nema.org/
• The Digital Imaging and Communications in Medicine (DICOM) feature was first introduced to Oracle interMedia in Oracle Database 10g Release 2 as a feature of the ORDImage object type
• Metadata tags associated with DICOM data were extracted into an XML document
• Oracle Database 11g Release 1 provides more complete DICOM support in a new ORDDicom object type
• This object type holds the DICOM binary data and extracted metadata, and contains the methods to manipulate the DICOM binary data
Oracle 11g: Using DICOM Images - Features
• Built-in function to extract the DICOM metadata (tags)
• Ability to select and view DICOM attributes
• Ability to convert images from DICOM to other image formats, e.g. JPEG, GIF, PNG and TIFF
• Built-in function to remove identifying tag information, i.e. de-identify images
• Ability to import and export images on other servers using mapped drives
Database Architecture (1)
• Windows Server 2003 (64bit)
• Oracle Database 11g Release 1 Version 11.1.0.6
  – Single Instance Database
• 2.8 TB Data Stored on 20 spindles
• Separate physical drive contains flashback recovery area
• Partitioned by range
Example: Using DICOM Object Type in Create Table

create table medical_image_table (
  id      varchar(50),
  TAPE_ID number,
  dicom   orddicom,
  USI     varchar(50)
)
LOB (dicom.source.localdata) STORE AS SECUREFILE (COMPRESS HIGH)
PARTITION BY range (TAPE_ID) (
  PARTITION PART1 VALUES less than (50) TABLESPACE TBLS_PART1_FROM_TAPE1
);
Example: Using setProperties to Extract Metadata into ORDDICOM Object

-- Set Data Model Repository. This procedure must be called at the
-- beginning of each database session.
execute ordsys.ord_dicom.setDataModel();

declare
  obj orddicom;
begin
  select dicom into obj from medical_image_table
   where id = 'E11200S001I001.dcm' for update;
  obj.setProperties;
  -- write the object, with its extracted metadata, back to the row
  update medical_image_table set dicom = obj
   where id = 'E11200S001I001.dcm';
  commit;
end;
/
Example: Select and View DICOM Attributes

select t.dicom.getAttributeByTag('00200010') as STUDY_ID,
       t.dicom.getAttributeByTag('00100010') as PATIENT_NAME,
       t.dicom.getAttributeByTag('00100020') as PATIENT_ID,
       TO_DATE(t.dicom.getAttributeByTag('00100030'), 'YYYY-MM-DD') as PATIENT_DOB
  from medical_image_table t
 where t.id = 'E11200S001I001.dcm';
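The eight-character strings passed to getAttributeByTag in the query above are DICOM tags: four hexadecimal digits for the group number followed by four for the element number, so '00100010' is (0010,0010). The following is a minimal Python sketch for validating and displaying such tags. The helper names split_tag and format_tag are hypothetical and are not part of Oracle Multimedia.

```python
import string
from typing import Tuple


def split_tag(tag: str) -> Tuple[int, int]:
    """Split an 8-hex-digit DICOM tag such as '00100010' into (group, element)."""
    if len(tag) != 8 or any(c not in string.hexdigits for c in tag):
        raise ValueError(f"expected 8 hexadecimal digits, got {tag!r}")
    return int(tag[:4], 16), int(tag[4:], 16)


def format_tag(tag: str) -> str:
    """Render the tag in the conventional (gggg,eeee) notation."""
    group, element = split_tag(tag)
    return f"({group:04X},{element:04X})"


# Tags used in the query above
for t in ("00200010", "00100010", "00100020", "00100030"):
    print(t, "->", format_tag(t))
```

Checking tags this way before building a query can catch a mistyped tag early, rather than waiting for an empty result from the database.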
Example: Create View for Patient Details

Create or replace view patient_details as
select t.id, t.tape_id, t.usi, ………,
       (t.dicom.getAttributeByTag('00080030')) as STUDY_TIME
  from medical_image_table t

(Note: A prerequisite is to execute ordsys.ord_dicom.setDataModel() to load the data model repository, so that attributes can be fetched by tag.)
Example: Convert Image from DICOM to JPEG and Make Anonymous

declare
  dcm ordsys.orddicom;
begin
  ord_dicom.setDatamodel;
  for rec in (select * from medical_image_table for update) loop
    rec.dicom.setProperties();
    -- create a JPEG thumbnail
    rec.dicom.processCopy('fileFormat=jpeg fixedScale=75,100', rec.imageThumb);
    -- make a new anonymous version of the ORDDicom object
    rec.dicom.makeAnonymous(genUID(rec.id), rec.anonDicom);
    -- write the objects back to the row
    ……..
  end loop;
  commit;
end;
/
Example: Import and Export Images
CONNECT / AS SYSDBA
--Directory IMAGEDIR for export/import DICOM
create or replace directory imagedir as 'O:\ORACLE_DICOM_IMAGES';
grant read,WRITE on directory IMAGEDIR to Administrator;
-- import() method can be used to import (where ORDDICOM source
-- attributes contain 'FILE', 'IMAGEDIR', and filename)
dcm.import();
-- export() method can be used to export
dcmSrc.export('FILE', 'IMAGEDIR', filename);
Example: Use of Compression for DICOM Objects
On Tables and LOBS using SECUREFILE (COMPRESS HIGH):
create table medical_image_table (
  id      varchar(50),
  TAPE_ID number,
  dicom   orddicom,
  USI     varchar(50)
)
LOB (dicom.source.localdata) STORE AS SECUREFILE (COMPRESS HIGH)
PARTITION BY range (TAPE_ID) (
  PARTITION PART1 VALUES less than (50) TABLESPACE TBLS_PART1_FROM_TAPE1
);
Impact of Compression
DICOM images are stored as SECUREFILE (COMPRESS HIGH)
Compression level example from load of first cohort of images:
• Space utilization on file system: 1515.52 GB
• Space utilization in Oracle 11g: 816 GB
Compression achieved:
(1515 - 816) / 1515 = approx. 46%
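The arithmetic on this slide can be reproduced with a few lines of Python. This is only a sketch using the two figures quoted above; the function name space_saving is hypothetical.

```python
def space_saving(original_gb: float, stored_gb: float) -> float:
    """Fraction of space saved: (original - stored) / original."""
    if original_gb <= 0:
        raise ValueError("original size must be positive")
    return (original_gb - stored_gb) / original_gb


file_system_gb = 1515.52  # space used on the file system (from the slide)
oracle_gb = 816.0         # space used in Oracle 11g (from the slide)

saving = space_saving(file_system_gb, oracle_gb)
ratio = file_system_gb / oracle_gb

print(f"space saved: {saving:.1%}")
print(f"compression ratio: {ratio:.2f}:1")
```

The same saving can be read as a ratio: the stored images take a little over half of their original space.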
Example: Compression in Backup Using RMAN

RMAN> configure device type disk backup type to compressed backupset;
RMAN> configure channel device type disk maxpiecesize 50g;
RMAN> show compression algorithm;
RMAN configuration parameters for database with db_unique_name RMHIMG are:
CONFIGURE COMPRESSION ALGORITHM 'BZIP2';
-- The ZLIB compression algorithm offers speed but not a good compression ratio. The alternate compression algorithm, BZIP2, is slower but provides a better compression ratio.
RMAN> backup database;
..
Maximize SQL*Loader Performance
• Use Direct Path Loads (direct=true): the conventional path uses standard insert statements, whereas the direct path loader loads directly into the Oracle data files and creates blocks in Oracle database block format.
• Disable/Drop Indexes and Constraints
• Disable Archiving During Load
• Use unrecoverable: this disables the writing of the data to the redo logs.
• The parallel load option is not allowed when loading LOB columns
• Do remember to create indices or enable them after direct load. Otherwise performance will be affected.
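The direct path option above is passed on the SQL*Loader command line as direct=true, while UNRECOVERABLE is a keyword placed in the control file. The following Python sketch assembles such a command line. The user name, file names and the helper name build_sqlldr_args are hypothetical placeholders, and the sketch only builds the argument list without running sqlldr.

```python
from typing import List


def build_sqlldr_args(userid: str, control_file: str, log_file: str,
                      direct: bool = True) -> List[str]:
    """Assemble a sqlldr argument list (nothing is executed here)."""
    args = [
        "sqlldr",
        f"userid={userid}",
        f"control={control_file}",
        f"log={log_file}",
    ]
    if direct:
        # request a direct path load instead of conventional inserts
        args.append("direct=true")
    return args


# Hypothetical user and file names, for illustration only
args = build_sqlldr_args("images_admin", "load_tape001.ctl", "load_tape001.log")
print(" ".join(args))
```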
SQL*Loader Performance Results
• Using these options we were able to reduce time for loading 50 tapes from 13 hours to approximately 5 hours.
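As a rough illustration of what this result means per tape, the figures quoted above can be turned into a speedup and a per-tape load time. This is a sketch only; the 5-hour figure is approximate in the source, so the derived values are approximate as well.

```python
tapes = 50
before_hours = 13.0
after_hours = 5.0  # "approximately 5 hours" on the slide

speedup = before_hours / after_hours
time_saved = (before_hours - after_hours) / before_hours
minutes_per_tape_before = before_hours * 60 / tapes
minutes_per_tape_after = after_hours * 60 / tapes

print(f"speedup: {speedup:.1f}x")
print(f"time saved: {time_saved:.0%}")
print(f"minutes per tape: {minutes_per_tape_before:.1f} -> {minutes_per_tape_after:.1f}")
```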
Compression and Parallelisation with Data Pump Export
• expdp Images_admin/WELCOME DIRECTORY=BACKUP_64BIT JOB_NAME=IMAGES_ADMIN_EXP_JOB dumpfile=IMAGES_ADMIN%U.dmp PARALLEL=3 COMPRESSION=all
• With PARALLEL=3, three dump files named from the template IMAGES_ADMIN%U.dmp are created, making the export process much faster.
• After export, each partition is further compressed to 21-30 GB (originally 40-50 GB after SecureFiles COMPRESS HIGH).
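Data Pump replaces %U in the dump file name with a two-digit number starting at 01. The sketch below lists the file names one would expect from the command above, assuming one dump file per parallel worker. The helper name expand_dumpfile_template is hypothetical.

```python
from typing import List


def expand_dumpfile_template(template: str, count: int) -> List[str]:
    """Expand a Data Pump style %U template into `count` file names (01, 02, ...)."""
    if not 1 <= count <= 99:
        raise ValueError("%U covers the two-digit range 01 to 99")
    return [template.replace("%U", f"{n:02d}") for n in range(1, count + 1)]


print(expand_dumpfile_template("IMAGES_ADMIN%U.dmp", 3))
```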
Example: Create Table for Best ORDDicom with SecureFiles Performance (1)
create table medical_image_table (
  id      varchar(50),
  TAPE_ID number,
  dicom   orddicom,
  USI     varchar(50)
)
pctfree 60
lob(dicom.source.localdata) store as SecureFile (nocache filesystem_like_logging)
Example: Create Table for Best ORDDicom with SecureFiles Performance (2)
lob(dicom.extension) store as SecureFile( nocache disable storage in row )
xmltype dicom.metadata store as SecureFile clob( nocache disable storage in row )
Example: Options for Indexing Metadata (1)
1. Build indices on the ORDDicom metadata column
• If few attributes, index each:
  create INDEX dcm_patientfamilyname_idx ON dicom
    (extractValue(src.metadata,
      '/DICOM_OBJECT/PERSON_NAME[@tag="00100010"]/NAME/FAMILY',
      'xmlns="http://xmlns.oracle.com/ord/dicom/metadata_1_0"'));
• If many attributes, create full text index:
  create index dcm_md_idx on dicom (src.metadata)
    indextype is ctxsys.context
    parameters('STOPLIST dicom_stoplist storage dcm_text_idx_pref')
    parallel 4;
Example: Options for Indexing Metadata (2)
2. Build indices on separate extracted metadata
  a) Create a mapping document
  b) Call extractMetadata
  c) Store and index results
Load Options (1)
• Disable logging on lob column rather than SQL*LOADER recoverable
• Use SQL*Loader with Conventional Path Loads and submit parallel jobs
  – Get all the advantages of SQL*Loader
  – Plus the advantages of parallelism
  – Parallelism makes up for lack of direct path load
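Submitting parallel conventional path jobs requires dividing the tapes between the jobs first. The Python sketch below shows one simple way to do that, dealing tape IDs round-robin into batches. The function name, the tape numbering and the choice of four jobs are assumptions for illustration; the source does not say how the work was divided.

```python
from typing import List, Sequence


def split_into_jobs(tape_ids: Sequence[int], jobs: int) -> List[List[int]]:
    """Deal tape IDs round-robin into `jobs` batches of near-equal size."""
    if jobs < 1:
        raise ValueError("jobs must be at least 1")
    return [list(tape_ids[i::jobs]) for i in range(jobs)]


# Hypothetical example: 50 tapes spread over 4 SQL*Loader jobs
batches = split_into_jobs(range(1, 51), 4)
for n, batch in enumerate(batches, start=1):
    print(f"job {n}: {len(batch)} tapes, first tape {batch[0]}, last tape {batch[-1]}")
```

Each batch would then be loaded by its own SQL*Loader session, so the sessions run side by side rather than one after another.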
Load Options (2)
If adding load function to a Java application:
• Use JDBC thin driver for best performance
• Use getBytes() and setBytes() from oracle.sql.BLOB class to read/write from SecureFile BLOB
• Read and write large buffers to the database, for example, 10MB
• Balance application traffic over available network links for parallel load
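The large-buffer advice above is independent of the client language. The following Python sketch shows the reading side only: it yields fixed-size chunks from a binary stream so that each database write can carry a large buffer. The database write itself (setBytes on oracle.sql.BLOB in the slide) is not shown, and the names read_chunks and CHUNK_SIZE are hypothetical.

```python
import io
from typing import BinaryIO, Iterator

CHUNK_SIZE = 10 * 1024 * 1024  # 10 MB, the buffer size suggested above


def read_chunks(stream: BinaryIO, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield successive chunks of at most `chunk_size` bytes until the stream ends."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


# Demonstration with a small in-memory stream standing in for an image file
demo = io.BytesIO(b"x" * 25)
sizes = [len(c) for c in read_chunks(demo, chunk_size=10)]
print(sizes)
```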
IF YOU WANTED TO COMPARE the volume and shape of the brain of individual EPILEPSY PATIENTS, where would you start?
Example: Query Linking Patient Clinical Information with Images

CREATE TABLE SASUSER.QUERY_FOR_PARTY_0000 AS
SELECT PARTY1.USI,
       PARTY.USI AS USI1,
       VISITDETAILS.SYNDROMEDIAGNOSIS
  FROM EPIL_RMH.PARTY AS PARTY,
       IMGRMH.PARTY AS PARTY1,
       EPIL_RMH.VISITDETAILS AS VISITDETAILS
 WHERE (PARTY.USI = PARTY1.USI AND PARTY.USI = VISITDETAILS.USI)
 ORDER BY VISITDETAILS.SYNDROMEDIAGNOSIS;
Example: Results
Example: Query Linking Syndrome Diagnosis with Images

PROC SQL;
CREATE TABLE SASUSER.Query_for_QUERY_FOR_PARTY_0000 AS
SELECT DISTINCT QUERY_FOR_PARTY_0000.USI,
       IMAGE_DETAILS.ID,
       IMAGE_DETAILS.IMAGE_DATE,
       IMAGE_DETAILS.STUDY_ID,
       IMAGE_DETAILS.STUDY_DESC,
       IMAGE_DETAILS.INSTITUTION_NAME
  FROM SASUSER.QUERY_FOR_PARTY_0000 AS QUERY_FOR_PARTY_0000
       INNER JOIN IMGRMH.IMAGE_DETAILS AS IMAGE_DETAILS
       ON (QUERY_FOR_PARTY_0000.USI = IMAGE_DETAILS.USI)
 WHERE QUERY_FOR_PARTY_0000.SYNDROMEDIAGNOSIS = "Symptomatic Generalised";
Current Challenges
• Find a sponsor after the capitalisation phase
• Improve the deployment of the Oracle 11g R1 database to bring it up to best practice
• Upgrade to Oracle 11g Release 2
• Promote the use of the MRI images for research
References (1)
• Jain, Pranabh and Melliyal Annamalai, Oracle Open World 2008, "Images and Oracle Database 11g" presentation.
• http://www.oracle.com/technology/products/database/application_express/howtos/howtos.html
• http://www.oracle.com/technology/obe/11gr1_db/index.htm
• http://download.oracle.com/docs/cd/B28359_01/appdev.111/b28416/ch_dev_apps.htm#CIHEIGBC
References (2)
• http://www.remote-dba.net/teas_rem_util18.htm
• Oracle Documentation
Tell us what you think…
• http://feedback.insync10.com.au