DICOM Objects: Unstructured Data in Oracle 11g

DICOM Objects: Unstructured Data in Oracle 11g Naomi Rafael BioGrid Australia 16 August 2010 The most comprehensive Oracle applications & technology content under one roof

Page 1: DICOM Objects: Unstructured Data in Oracle 11g

DICOM Objects: Unstructured Data in Oracle 11g

Naomi Rafael, BioGrid Australia

16 August 2010

The most comprehensive Oracle applications & technology content under one roof

Page 2: DICOM Objects: Unstructured Data in Oracle 11g

Agenda
• Purpose and Description of BioGrid Australia
• The MRI Images Collection for Melbourne Health
• Oracle 11g Advantages
• Design of the Images Database
• Examples of Utilities and Techniques
• Current Challenges
• References

Page 3: DICOM Objects: Unstructured Data in Oracle 11g

Agenda
• Purpose and Description of BioGrid Australia (current section)
• The MRI Images Collection for Melbourne Health
• Oracle 11g Advantages
• Design of the Images Database
• Examples of Utilities and Techniques
• Current Challenges
• References

Page 4: DICOM Objects: Unstructured Data in Oracle 11g

BioGrid Australia Vision
• Facilitate multi-disciplinary medical research
• Leverage research collaboration
• Link heterogeneous data from multiple institutions
• Confer value, retain and re-use health data
• Enforce system security
• Respect patient privacy
• Select pragmatic technology

Page 5: DICOM Objects: Unstructured Data in Oracle 11g

BioGrid Architecture

[Architecture diagram. Labels recoverable from the diagram: Oracle 11g; VPN; Local Research Repository (LRR); USIDB; FDIPRD; Internet; LRRs; Research Repositories at Other Institutions; Demographics; De-identified Data; DBIMGS; USI Server; Data Linkage; User; DMZ; Terminal Server, Reverse Proxy.]

Federated Data Integrator: The Federated Data Integrator (FDI) is the hub of the BioGrid architecture. Data from heterogeneous data sources are integrated into one virtual repository on this server.

Oracle 11g Server: size 2.5 TB; images: 15 million.

Page 6: DICOM Objects: Unstructured Data in Oracle 11g

Agenda
• Purpose and Description of BioGrid Australia
• The MRI Images Collection for Melbourne Health (current section)
• Oracle 11g Advantages
• Design of the Images Database
• Examples of Utilities and Techniques
• Current Challenges
• References

Page 7: DICOM Objects: Unstructured Data in Oracle 11g

IF YOU WANTED TO COMPARE the volume and shape of the brain of individual EPILEPSY PATIENTS, where would you start?

Page 8: DICOM Objects: Unstructured Data in Oracle 11g

The Images Local Research Repository (1)
• First take 7 million proprietary Magnetic Resonance Images (MRIs) on over 1000 DAT format tapes
• Convert to Digital Imaging and Communications in Medicine (DICOM) format
• Store and index images on-line
• Extract DICOM header information
• Link into BioGrid Australia and issue record linking ID

Page 9: DICOM Objects: Unstructured Data in Oracle 11g

The Images Local Research Repository (2)
• Retrieve identified and de-identified images on demand
• Be economical and sustainable
• Add 8 million more MRI images (stored on Optical Disk storage technology)

Page 10: DICOM Objects: Unstructured Data in Oracle 11g

Agenda
• Purpose and Description of BioGrid Australia
• The MRI Images Collection for Melbourne Health
• Oracle 11g Advantages (current section)
• Design of the Images Database
• Examples of Utilities and Techniques
• Current Challenges
• References

Page 11: DICOM Objects: Unstructured Data in Oracle 11g

Oracle 11g Advantages (1)
• Oracle Database 11g stores the images online
• The images can be retrieved on demand
• Oracle Database 11g indexes and partitions for fast query
• Oracle Multimedia 11g has a dedicated DICOM data type with a rich feature set
• SQL*Loader can be tuned for fast image load

Page 12: DICOM Objects: Unstructured Data in Oracle 11g

Oracle 11g Advantages (2)
• Security features are available
• Compression is available at the LOB level, on backup, and on Data Pump export
• Application Express is available for rapid application development
• And for Melbourne Health: the installation is licensed under the Victoria Department of Health statewide Oracle license

Page 13: DICOM Objects: Unstructured Data in Oracle 11g

ORDDICOM object type
• Digital Imaging and Communications in Medicine
• http://medical.nema.org/
• The Digital Imaging and Communications in Medicine (DICOM) feature was first introduced to Oracle interMedia in Oracle Database 10g Release 2 as a feature of the ORDImage object type

• Metadata tags associated with DICOM data were extracted into an XML document

• Oracle Database 11g Release 1 provides more complete DICOM support in a new ORDDicom object type.

• This object type holds the DICOM binary data and extracted metadata, and contains the methods to manipulate the DICOM binary data.

Page 14: DICOM Objects: Unstructured Data in Oracle 11g

Oracle 11g: Using DICOM Images - Features

• Built-in function to extract the DICOM metadata (tags)
• Ability to select and view DICOM attributes
• Ability to convert images from DICOM to other image formats, e.g., JPEG, GIF, PNG and TIFF
• Built-in function to remove identifying tag information, i.e., de-identify images
• Ability to import and export images on other servers using mapped drives

Page 15: DICOM Objects: Unstructured Data in Oracle 11g

Agenda
• Purpose and Description of BioGrid Australia
• The MRI Images Collection for Melbourne Health
• Oracle 11g Advantages
• Design of the Images Database (current section)
• Examples of Utilities and Techniques
• Current Challenges
• References

Page 16: DICOM Objects: Unstructured Data in Oracle 11g

Database Architecture (1)
• Windows Server 2003 (64-bit)
• Oracle Database 11g Release 1 Version 11.1.0.6
– Single Instance Database
• 2.8 TB Data Stored on 20 spindles

• Separate physical drive contains flashback recovery area

• Partitioned by range

Page 17: DICOM Objects: Unstructured Data in Oracle 11g

Agenda
• Purpose and Description of BioGrid Australia
• The MRI Images Collection for Melbourne Health
• Oracle 11g Advantages
• Design of the Images Database
• Examples of Utilities and Techniques (current section)
• Current Challenges
• References

Page 18: DICOM Objects: Unstructured Data in Oracle 11g

Example: Using DICOM Object Type in Create Table

create table medical_image_table (
  id      varchar(50),
  TAPE_ID number,
  dicom   orddicom,
  USI     varchar(50)
)
LOB (dicom.source.localdata) STORE AS SECUREFILE (COMPRESS HIGH)
PARTITION BY range (TAPE_ID) (
  PARTITION PART1 VALUES less than (50) TABLESPACE TBLS_PART1_FROM_TAPE1
);

Page 19: DICOM Objects: Unstructured Data in Oracle 11g

Example: Using setProperties to Extract Metadata into ORDDICOM Object

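-- Set Data Model Repository. This procedure must be called at the
-- beginning of each database session.
execute ordsys.ord_dicom.setDataModel();

declare
  obj orddicom;
  res varchar2(1000);
begin
  select dicom into obj from medical_image_table
   where id = 'E11200S001I001.dcm' for update;
  obj.setProperties;
  -- write the populated object back to the row so the metadata persists
  update medical_image_table set dicom = obj
   where id = 'E11200S001I001.dcm';
  commit;
end;
/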

Page 20: DICOM Objects: Unstructured Data in Oracle 11g

Example: Select and View DICOM Attributes

select t.dicom.getAttributebyTag('00200010') as STUDY_ID,
       t.dicom.getAttributebyTag('00100010') as PATIENT_NAME,
       t.dicom.getAttributebyTag('00100020') as PATIENT_ID,
       TO_DATE(t.dicom.getAttributebyTag('00100030'), 'YYYY-MM-DD') as PATIENT_DOB
from medical_image_table t
where t.id = 'E11200S001I001.dcm';
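The eight-hex-digit strings passed to getAttributebyTag are DICOM tags: the first four digits are the group number and the last four the element number. As a minimal illustrative sketch (Python, not part of the original deck), the tags used in these examples can be decoded as shown below. The function names `split_tag` and `describe` and the small lookup table are assumptions introduced here; the attribute names are taken from the DICOM data dictionary.

```python
# Illustrative helper (not an Oracle API): decode the tag strings used in the slides.
# Attribute names follow the DICOM data dictionary.
TAG_NAMES = {
    "00080030": "Study Time",
    "00100010": "Patient's Name",
    "00100020": "Patient ID",
    "00100030": "Patient's Birth Date",
    "00200010": "Study ID",
}


def split_tag(tag):
    """Split an 8-hex-digit DICOM tag into (group, element) integers."""
    if len(tag) != 8:
        raise ValueError("expected 8 hex digits, got %r" % (tag,))
    # int(..., 16) raises ValueError if the text is not hexadecimal
    return int(tag[:4], 16), int(tag[4:], 16)


def describe(tag):
    """Return the conventional (gggg,eeee) notation plus a name if known."""
    group, element = split_tag(tag)
    name = TAG_NAMES.get(tag.upper(), "unknown attribute")
    return "({:04X},{:04X}) {}".format(group, element, name)


if __name__ == "__main__":
    for tag in ("00200010", "00100010", "00100020", "00100030"):
        print(describe(tag))
```

A lookup like this is only a reading aid for the queries; inside the database the tag-to-attribute mapping comes from the data model loaded by setDataModel().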

Page 21: DICOM Objects: Unstructured Data in Oracle 11g

Example: Create View for Patient Details

Create or replace view patient_details as
select t.id, t.tape_id, t.usi, ………,
       (t.dicom.getAttributebyTag('00080030')) as STUDY_TIME
from medical_image_table t;

(Note: A prerequisite is to execute ordsys.ord_dicom.setDataModel() to load the data model repository, so that attributes can be fetched by tag.)

Page 22: DICOM Objects: Unstructured Data in Oracle 11g

Example: Convert Image from DICOM to JPEG and Make Anonymous

declare
  dcm ordsys.orddicom;
begin
  ord_dicom.setDatamodel;
  for rec in (select * from medical_image_table for update) loop
    rec.dicom.setProperties();
    -- create a JPEG thumbnail
    rec.dicom.processCopy('fileFormat=jpeg fixedScale=75,100', rec.imageThumb);
    -- make a new anonymous version of the ORDDicom object
    rec.dicom.makeAnonymous(genUID(rec.id), rec.anonDicom);
    -- write the objects back to the row
    ……..
  end loop;
  commit;
end;
/

Page 23: DICOM Objects: Unstructured Data in Oracle 11g

Example: Import and Export Images

CONNECT / AS SYSDBA

--Directory IMAGEDIR for export/import DICOM

create or replace directory imagedir as 'O:\ORACLE_DICOM_IMAGES';

grant read,WRITE on directory IMAGEDIR to Administrator;

-- import() method can be used to import (where ORDDICOM source
-- attributes contain 'FILE', 'IMAGEDIR', and filename)
dcm.import();
-- export() method can be used to export
dcmSrc.export('FILE', 'IMAGEDIR', filename);

Page 24: DICOM Objects: Unstructured Data in Oracle 11g

Example: Use of Compression for DICOM Objects

On Tables and LOBS using SECUREFILE (COMPRESS HIGH):

create table medical_image_table (
  id      varchar(50),
  TAPE_ID number,
  dicom   orddicom,
  USI     varchar(50)
)
LOB (dicom.source.localdata) STORE AS SECUREFILE (COMPRESS HIGH)
PARTITION BY range (TAPE_ID) (
  PARTITION PART1 VALUES less than (50) TABLESPACE TBLS_PART1_FROM_TAPE1
);

Page 25: DICOM Objects: Unstructured Data in Oracle 11g

Impact of Compression

DICOM images are stored as SECUREFILE (COMPRESS HIGH).

Compression level example from load of first cohort of images:
• Space utilization on file system: 1515.52 GB
• Space utilization in Oracle 11g: 816 GB

Compression achieved: (1515 - 816) / 1515 = approx. 46%
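The arithmetic on this slide can be reproduced with a short calculation. The sketch below (Python, added for illustration; the function name `space_saved` is an assumption) uses the two space figures quoted above and also expresses the same result as a compression ratio.

```python
def space_saved(original, stored):
    """Fraction of space saved when `original` units shrink to `stored` units."""
    return (original - stored) / original


FILE_SYSTEM = 1515.52  # space used on the file system, as quoted on the slide
IN_ORACLE = 816.0      # space used inside Oracle 11g, as quoted on the slide

saving = space_saved(FILE_SYSTEM, IN_ORACLE)
ratio = FILE_SYSTEM / IN_ORACLE

if __name__ == "__main__":
    print("space saved: {:.1%}".format(saving))      # 46.2%
    print("compression ratio: {:.2f}:1".format(ratio))  # 1.86:1
```

Using the unrounded 1515.52 figure gives about 46.2%, consistent with the "approx. 46%" stated on the slide.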

Page 26: DICOM Objects: Unstructured Data in Oracle 11g

Example: Compression in Backup Using RMAN

RMAN> configure device type disk backup type to compressed backupset;
RMAN> configure channel device type disk maxpiecesize 50g;
RMAN> show compression algorithm;
RMAN configuration parameters for database with db_unique_name RMHIMG are:
CONFIGURE COMPRESSION ALGORITHM 'BZIP2';

-- ZLIB compression algorithm offers speed but not a good compression ratio. The alternate compression algorithm, BZIP2, is slower but provides a better compression ratio.

RMAN> backup database;
..

Page 27: DICOM Objects: Unstructured Data in Oracle 11g

Maximize SQL*Loader Performance
• Use Direct Path Loads (direct=true): the conventional path uses standard insert statements, whereas the direct path loader loads directly into the Oracle data files and creates blocks in Oracle database block format.
• Disable/Drop Indexes and Constraints
• Disable Archiving During Load
• Use unrecoverable: this disables the writing of the data to the redo logs.
• The parallel load option is not allowed when loading LOB columns
• Do remember to create indices or enable them after direct load. Otherwise performance will be affected.

Page 28: DICOM Objects: Unstructured Data in Oracle 11g

SQL*Loader Performance Results

• Using these options we were able to reduce time for loading 50 tapes from 13 hours to approximately 5 hours.
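To put the quoted load times in perspective, the sketch below (Python, added for illustration; the function name `load_summary` is an assumption) derives the speed-up, the relative time saved, and the average time per tape from the figures above. The "after" figure is stated on the slide as approximate, so the derived values are approximate too.

```python
def load_summary(tapes, hours_before, hours_after):
    """Derive simple throughput figures from two total load times."""
    return {
        "speedup": hours_before / hours_after,
        "reduction": (hours_before - hours_after) / hours_before,
        "min_per_tape_before": hours_before * 60 / tapes,
        "min_per_tape_after": hours_after * 60 / tapes,
    }


# Figures from the slide: 50 tapes, 13 hours before tuning, about 5 hours after.
summary = load_summary(tapes=50, hours_before=13, hours_after=5)

if __name__ == "__main__":
    for key, value in summary.items():
        print("{}: {:.2f}".format(key, value))
```

On these numbers the tuned load is roughly 2.6 times faster, or about 6 minutes per tape instead of about 15.6.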

Page 29: DICOM Objects: Unstructured Data in Oracle 11g

Compression and Parallelisation with Data Pump Export

• expdp Images_admin/WELCOME DIRECTORY=BACKUP_64BIT JOB_NAME=IMAGES_ADMIN_EXP_JOB dumpfile=IMAGES_ADMIN%U.dmp PARALLEL=3 COMPRESSION=all

• With PARALLEL=3 three Dump files IMAGES_ADMIN%u.DMP are created making the export process much faster.

• After export, each partition is further compressed to 21-30 GB (originally 40-50GB after SecureFiles compress high).

Page 30: DICOM Objects: Unstructured Data in Oracle 11g

Example: Create Table for Best ORDDicom with SecureFiles Performance (1)

create table medical_image_table (
  id      varchar(50),
  TAPE_ID number,
  dicom   orddicom,
  USI     varchar(50)
)
pctfree 60
lob(dicom.source.localdata) store as SecureFile (nocache filesystem_like_logging),

Page 31: DICOM Objects: Unstructured Data in Oracle 11g

Example: Create Table for Best ORDDicom with SecureFiles Performance (2)

lob(dicom.extension) store as SecureFile( nocache disable storage in row )

xmltype dicom.metadata store as SecureFile clob( nocache disable storage in row )

Page 32: DICOM Objects: Unstructured Data in Oracle 11g

Example: Options for Indexing Metadata (1)

1. Build indices on the ORDDicom metadata column

• If few attributes, index each:
  create INDEX dcm_patientfamilyname_idx ON dicom
    (extractValue(src.metadata,
      '/DICOM_OBJECT/PERSON_NAME[@tag="00100010"]/NAME/FAMILY',
      'xmlns="http://xmlns.oracle.com/ord/dicom/metadata_1_0"'));

• If many attributes, create full text index:
  create index dcm_md_idx on dicom (src.metadata)
    indextype is ctxsys.context
    parameters('STOPLIST dicom_stoplist storage dcm_text_idx_pref')
    parallel 4;

Page 33: DICOM Objects: Unstructured Data in Oracle 11g

Example: Options for Indexing Metadata (2)

2. Build indices on separate extracted metadata
a) Create a mapping document
b) Call extractMetadata
c) Store and index results

Page 34: DICOM Objects: Unstructured Data in Oracle 11g

Load Options (1)

• Disable logging on lob column rather than SQL*LOADER recoverable

• Use SQL*Loader with Conventional Path Loads and submit parallel jobs

– Get all the advantages of SQL*Loader
– Plus the advantages of parallelism
– Parallelism makes up for lack of direct path load
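One way to picture "submit parallel jobs" is to divide the tapes among a fixed number of concurrent conventional path load sessions. The sketch below (Python, added for illustration) only computes the split. The helper name `assign_tapes`, the job count of 4, and the tape numbering 1 to 50 are assumptions, and launching the actual SQL*Loader sessions is deliberately left out.

```python
def assign_tapes(tape_ids, n_jobs):
    """Distribute tape IDs round-robin over n_jobs load batches."""
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    batches = [[] for _ in range(n_jobs)]
    for index, tape in enumerate(tape_ids):
        batches[index % n_jobs].append(tape)
    return batches


# Hypothetical example: 50 tapes shared among 4 concurrent load jobs.
batches = assign_tapes(range(1, 51), 4)

if __name__ == "__main__":
    for job, batch in enumerate(batches, start=1):
        print("job {}: {} tapes, first {} last {}".format(
            job, len(batch), batch[0], batch[-1]))
```

Round-robin assignment keeps the batches within one tape of each other in size; if tapes differ widely in image count, balancing by image count would be a reasonable alternative.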

Page 35: DICOM Objects: Unstructured Data in Oracle 11g

Load Options (2)

If adding a load function to a Java application:
• Use JDBC thin driver for best performance
• Use getBytes() and setBytes() from oracle.sql.BLOB class to read/write from SecureFile BLOB
• Read and write large buffers to the database, for example, 10MB
• Balance application traffic over available network links for parallel load

Page 36: DICOM Objects: Unstructured Data in Oracle 11g

IF YOU WANTED TO COMPARE the volume and shape of the brain of individual EPILEPSY PATIENTS, where would you start?

Page 37: DICOM Objects: Unstructured Data in Oracle 11g

Example: Query Linking Patient Clinical Information with Images

CREATE TABLE SASUSER.QUERY_FOR_PARTY_0000 AS
SELECT PARTY1.USI,
       PARTY.USI AS USI1,
       VISITDETAILS.SYNDROMEDIAGNOSIS
FROM EPIL_RMH.PARTY AS PARTY,
     IMGRMH.PARTY AS PARTY1,
     EPIL_RMH.VISITDETAILS AS VISITDETAILS
WHERE (PARTY.USI = PARTY1.USI AND PARTY.USI = VISITDETAILS.USI)
ORDER BY VISITDETAILS.SYNDROMEDIAGNOSIS;

Page 38: DICOM Objects: Unstructured Data in Oracle 11g

Example: Results

Page 39: DICOM Objects: Unstructured Data in Oracle 11g

Example: Query Linking Syndrome Diagnosis with Images

PROC SQL;
CREATE TABLE SASUSER.Query_for_QUERY_FOR_PARTY_0000 AS
SELECT DISTINCT QUERY_FOR_PARTY_0000.USI,
       IMAGE_DETAILS.ID,
       IMAGE_DETAILS.IMAGE_DATE,
       IMAGE_DETAILS.STUDY_ID,
       IMAGE_DETAILS.STUDY_DESC,
       IMAGE_DETAILS.INSTITUTION_NAME
FROM SASUSER.QUERY_FOR_PARTY_0000 AS QUERY_FOR_PARTY_0000
     INNER JOIN IMGRMH.IMAGE_DETAILS AS IMAGE_DETAILS
     ON (QUERY_FOR_PARTY_0000.USI = IMAGE_DETAILS.USI)
WHERE QUERY_FOR_PARTY_0000.SYNDROMEDIAGNOSIS = "Symptomatic Generalised";

Page 40: DICOM Objects: Unstructured Data in Oracle 11g
Page 41: DICOM Objects: Unstructured Data in Oracle 11g
Page 42: DICOM Objects: Unstructured Data in Oracle 11g

Agenda
• Purpose and Description of BioGrid Australia
• The MRI Images Collection for Melbourne Health
• Oracle 11g Advantages
• Design of the Images Database
• Examples of Utilities and Techniques
• Current Challenges (current section)
• References

Page 43: DICOM Objects: Unstructured Data in Oracle 11g

Current Challenges
• Find a sponsor after the capitalisation phase
• Improve the deployment of the Oracle 11g R1 database to bring it up to best practice
• Upgrade to Oracle 11g Release 2
• Promote the use of the MRI images for research

Page 44: DICOM Objects: Unstructured Data in Oracle 11g

Agenda
• Purpose and Description of BioGrid Australia
• The MRI Images Collection for Melbourne Health
• Oracle 11g Advantages
• Design of the Images Database
• Examples of Utilities and Techniques
• Current Challenges
• References (current section)

Page 45: DICOM Objects: Unstructured Data in Oracle 11g

References (1)
• Jain, Pranabh and Melliyal Annamalai, "Images and Oracle Database 11g", presentation, Oracle OpenWorld 2008.
• http://www.oracle.com/technology/products/database/application_express/howtos/howtos.html
• http://www.oracle.com/technology/obe/11gr1_db/index.htm
• http://download.oracle.com/docs/cd/B28359_01/appdev.111/b28416/ch_dev_apps.htm#CIHEIGBC

Page 46: DICOM Objects: Unstructured Data in Oracle 11g

References (2)

• http://www.remote-dba.net/teas_rem_util18.htm

• Oracle Documentation

Page 47: DICOM Objects: Unstructured Data in Oracle 11g

Thank you!

BIOGRID AUSTRALIA

Naomi Rafael

Page 48: DICOM Objects: Unstructured Data in Oracle 11g

Tell us what you think…

• http://feedback.insync10.com.au