3Q_Factors_to_Enhance_Big_Data_Medical_Image_for_Better_Diagnosis.pdf

"3Q Factors to Enhance Big Data Medical Image for Better Diagnosis"

Raflaa Hilmi Hamid, Ahmed Nabeel Ahmed, Hasanain Mohammed Manji
Under the supervision of Dr. Azizah Bt Haji Ahmad
College of Arts and Sciences
Universiti Utara Malaysia

Abstract — Nowadays, images are employed in several areas of medicine for early diagnosis. The industry provides accurate, high-resolution equipment for X-ray, Computerized Tomography (CT), Magnetic Resonance Imaging (MRI) and other modalities. However, other images, such as those related to pathological anatomy, often present poor quality, large quantities of information to process, and slow enhancement, which complicates the diagnostic process. This work focuses on enhancing the quality, quantity and quickness of this type of image through a system that combines an informatics image system with traditional image processing techniques, treating medical images as a big data issue. The results show that the proposed methodology can help medical specialists in the diagnosis of several pathologies.

Keywords — super-informatics, medical imaging, diagnosis, image processing, big data.

I. Introduction

The aim of this work is to analyze the application value of a super-informatics image technique system (SIITS), based on the picture archiving and communication system (PACS), in improving medical imaging (MI) diagnosis together with image processing techniques [23]. In the usual case, a medical image must be compared against the multimillion images held in an informatics system such as a Vendor Neutral Archive (VNA) until the system finds the diagnosis that fits the image. This process takes a great deal of time and effort, and its high cost makes the technique slow, unproductive and ineffective [17]. For many years big data informatics systems have developed, but MI data has grown bigger and bigger as well, especially given the urgent need for early diagnosis of diseases that threaten human life.
Healthcare organizations spend more and more every year in order to stay a step ahead in this field, since early diagnosis of diseases such as cancer and blood disease is the key to recovery and survival [6].

An archive is a location containing a collection of records, documents or other materials of historical importance. Archiving is an integral part of the picture archiving and communication system (PACS). When a hospital needs to change PACS vendors, the previously completed data must be migrated into the format of the newly procured PACS, which consumes both time and money [23].

A PACS consists of medical imaging and data acquisition components together with storage and display subsystems. The different imaging modalities of a modern imaging system (e.g., X-ray, Ultrasonography (US), Digital Subtraction Angiography (DSA), Computerized Tomography (CT), Magnetic Resonance Imaging (MRI), Positron Emission Tomography (PET)) can be processed in PACS in the Digital Imaging and Communications in Medicine (DICOM) imaging format.

PACS is an integrated system, allowing for efficient electronic distribution and storage of medical images and access to medical record data. PACS installations of different sizes are widely used in clinical research and diagnostic imaging. With the rapid development of imaging technology, X-ray, US, DSA, CT, MRI, PET and other modern imaging devices have formed a huge medical imaging system (big data). One of the most important characteristics of a modern imaging system is the mass of information and natively digital images; as a result, the pattern of medical imaging education should change correspondingly. The foundation of a medical imaging system is high quality, a large quantity and quick processing of imaging data. It is also well known that diagnosing from medical imaging requires the systematic study of a large number of medical images [1]. At present, the potential of PACS for diagnostic applications is not fully understood by

most organizations concerned with medical images. A new and effective diagnostic system is therefore needed in medical imaging. Combining with PACS, we constructed an imaging informatics system based on PACS to improve the diagnostic effectiveness of the medical imaging system.

The new concept of the Vendor Neutral Archive (VNA) has emerged. A VNA simply decouples the PACS and workstations at the archival layer. This is achieved by developing an application engine that receives, integrates, and transmits the data using the different syntaxes of the DICOM format. Transferring the data belonging to the old PACS to a new one is performed by a process called data migration. In a VNA, a number of different data migration techniques are available to facilitate transfer from the old PACS to the new one, the choice depending on the speed of migration and the importance of the data. The techniques include simple DICOM migration, prefetch-based DICOM migration, medium migration, and the expensive non-DICOM migration. "Vendor neutral" may not be a suitable term, and "architecture neutral," "PACS neutral," "content neutral," or "third-party neutral" are probably better and preferred terms. Notwithstanding this, the VNA acronym has come to stay in both medical IT user terminology and vendor nomenclature, and radiologists need to be aware of its impact on PACS across the globe [17].

On the other hand, other healthcare organizations have moved towards developing image processing techniques (such as gray-level transformation, image filtering, binarization and segmentation).

Image processing is a set of techniques to enhance raw images received from CT, MRI, US and other devices, from instruments placed on satellites, space probes and aircraft, or from images taken in normal day-to-day life, for various applications. Various techniques have been developed in image processing during the last four to five decades. Most of them were developed for enhancing images obtained from unmanned spacecraft, space probes and military reconnaissance flights. Image processing systems are becoming popular due to the easy availability of powerful personal computers, large memory devices, graphics software, etc.

Image processing is used in various applications such as:

Remote Sensing
Medical Imaging
Non-destructive Evaluation
Forensic Studies
Textiles
Material Science
Military
Film industry
Document processing
Graphic arts
Printing Industry

The common steps in image processing are image scanning, storing, enhancing and interpretation.

Image segmentation is the process that subdivides an image into its constituent parts or objects. The level to which this subdivision is carried out depends on the problem being solved, i.e., the segmentation should stop when the objects of interest in an application have been isolated. For example, in autonomous air-to-ground target acquisition, suppose our interest lies in identifying vehicles on a road: the first step is to segment the road from the image, and then to segment the contents of the road down to potential vehicles. Image thresholding techniques are used for image segmentation [10, 11].
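As a concrete illustration of threshold-based segmentation, the sketch below implements Otsu's classic thresholding method in plain Python over a toy list of 8-bit gray levels. It is an illustrative stand-in, not the specific technique used in the works cited above.

```python
# Otsu's method: pick the threshold that maximizes between-class variance,
# then binarize the image into background (0) and object (1) pixels.

def otsu_threshold(pixels):
    """Return the gray level that maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    total_sum = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0.0
    for t in range(256):
        w_bg += hist[t]              # background pixel count so far
        if w_bg == 0:
            continue
        w_fg = total - w_bg          # foreground pixel count
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (total_sum - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def binarize(pixels, t):
    """Segment: 1 = object (bright), 0 = background."""
    return [1 if p > t else 0 for p in pixels]

# Toy example in the spirit of the road/vehicle scenario above: two
# clearly separated groups of gray levels.
image = [20, 22, 25, 30, 200, 210, 205, 198]
t = otsu_threshold(image)
mask = binarize(image, t)
```

On real two-dimensional medical images the same logic applies per pixel; libraries typically also smooth the histogram first to suppress noise.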

Even though the two approaches (MI processing and medical informatics systems) seem promising, they still have many weak points. Quality, Quantity, and Quickness (3Q) are the main factors that we aim to achieve.

We suggest a new system for medical image processing to facilitate the diagnosis of disease in a better manner, using the characteristics of both approaches. Step 1: use one or more image processing techniques to enhance the medical image. Step 2: pass the enhanced image to the informatics system, where it will be easy and quick to match it against the vendor storage system, which can return the diagnosis; the confidence of that diagnosis will certainly be higher.
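The two-step flow just described can be sketched as follows. Everything here is hypothetical scaffolding: a simple contrast stretch stands in for "one or more image processing techniques", and a toy list of pre-diagnosed images with a nearest-neighbor match stands in for the vendor storage system.

```python
# Step 1: enhance the captured image; Step 2: match it against an archive
# of stored, already-diagnosed images and return the closest diagnosis.

def contrast_stretch(pixels, lo=0, hi=255):
    """Step 1: gray-level transformation to improve visibility."""
    p_min, p_max = min(pixels), max(pixels)
    if p_max == p_min:
        return [lo] * len(pixels)
    scale = (hi - lo) / (p_max - p_min)
    return [round(lo + (p - p_min) * scale) for p in pixels]

def match_archive(image, archive):
    """Step 2: return the diagnosis of the closest stored image."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    best = min(archive, key=lambda entry: dist(image, entry["pixels"]))
    return best["diagnosis"]

# Toy archive standing in for a VNA store of pre-diagnosed images; the
# entries and diagnosis labels are invented for the example.
archive = [
    {"pixels": [0, 64, 128, 255], "diagnosis": "finding A"},
    {"pixels": [255, 128, 64, 0], "diagnosis": "finding B"},
]
raw = [10, 30, 50, 90]             # a low-contrast capture
enhanced = contrast_stretch(raw)   # now spans the full 0..255 range
diagnosis = match_archive(enhanced, archive)
```

The point of the enhancement step is visible even in this toy: the raw capture occupies a narrow gray range, while the stretched version lines up with the archive's dynamic range, making the match more reliable.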

The purpose of this study is to raise quality, quantity and quickness (3Q) in order to improve the diagnostic imaging systems of healthcare and medical organizations. We also try to understand and briefly discuss both image processing techniques and informatics imaging systems, considering MI as a big data source, and then combine them in order to produce a new virtual system called the "super-informatics image technique system" (SIITS).

II. Case Study

With the ever-increasing amount of annotated medical data, large-scale, data-driven methods promise to bridge the semantic gap between images and diagnosis. The goal of this paper is to suggest a new technique to enhance the storage techniques of informatics systems (such as PACS and VNA), attempting to avoid the high cost of medical diagnosis, to compare against a smaller quantity of images, and to increase the quality of the whole system [17]. We try to combine the informatics system and image processing in order to increase the quality, quantity and quickness of imaging data. This is achieved by enhancing the medical image with any image processing technique, then sending the image to the informatics system to be analyzed and matched with one of the images stored in that system, yielding a diagnosis that is more effective and better guaranteed; see Figure 3, Super-Informatics Image Techniques System (SIITS).

III. Overview of image processing and informatics image techniques

One of the main purposes of image processing is to manipulate pixel values for better visibility. For example, gray-level transformation and image filtering are typical image processing techniques for converting an input image into a new image with better visibility. Another purpose of image processing is to extract target objects or regions from an input image. For example, if we extract all organelles of a specific kind, we can count them and also understand their distribution and behavior in a cell [2, 3].
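Both purposes can be illustrated with a minimal sketch, assuming the "image" is a single row of gray levels: a mean filter (a typical smoothing filter) improves visibility, and a simple run counter extracts and counts bright objects. These helpers are illustrative and not drawn from [2, 3].

```python
# Purpose 1: manipulate pixel values for visibility (mean filtering).
# Purpose 2: extract and count target objects (runs of bright pixels).

def mean_filter(pixels, radius=1):
    """Smooth each pixel toward the mean of its local window."""
    out = []
    for i in range(len(pixels)):
        window = pixels[max(0, i - radius): i + radius + 1]
        out.append(sum(window) / len(window))
    return out

def count_bright_objects(pixels, threshold):
    """Count maximal runs of pixels above the threshold."""
    count, inside = 0, False
    for p in pixels:
        if p > threshold and not inside:
            count, inside = count + 1, True
        elif p <= threshold:
            inside = False
    return count

row = [0, 0, 200, 210, 0, 0, 190, 0]   # two bright "organelles"
smoothed = mean_filter(row)
objects = count_bright_objects(row, threshold=100)
```

The same two ideas generalize to 2-D images, where the window becomes a square neighborhood and the runs become connected components.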

On the other hand, a general purpose of an informatics image technique (such as PACS or VNA) is to classify an image, a target object or a region into one of several types, i.e. classes. Although it is difficult to achieve the recognition accuracy of human beings, informatics image techniques have already been used in various applications. Optical character informatics (OCI) is one of the most classic applications, where an image of a single character is classified into one of 52 classes.

Table 1 indicates how to select image processing and informatics image techniques according to our purpose. All of those techniques are applicable to biological image analysis. Note that there is no strict boundary between image processing and informatics image techniques; many intelligent image processing techniques rely on informatics image techniques [4, 15].

It is rather rare to use a single image processing technique or a single informatics image technique. In fact, they are often used in combination in order to realize a complete system for a specific task (for example, diagnosis). To extract target organelles from an image, an image segmentation technique is first applied to the image, and then each segment (i.e. region) is fed into an informatics image technique (such as PACS or VNA) to decide whether the segment is a target or not. Figure 2 shows an example of a segmentation technique applied to a medical image to find a specific target; we need to understand the functions of the individual techniques and the useful combinations of them [25].
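A minimal sketch of this segmentation-then-recognition flow, assuming a binary mask as input: segments are first extracted by 4-connected labeling, then each segment is fed to a deliberately naive recognizer (an area test) that stands in for the informatics image technique deciding whether the segment is a target.

```python
# Segmentation step: group foreground pixels into connected segments.
# Recognition step: accept a segment as "target" if it is big enough.

def label_segments(mask):
    """Return a list of segments; each is a list of (row, col) pixels."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    segments = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                stack, pixels = [(r, c)], []
                seen[r][c] = True
                while stack:                      # iterative flood fill
                    y, x = stack.pop()
                    pixels.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                segments.append(pixels)
    return segments

def is_target(segment, min_area=3):
    """Toy per-segment decision standing in for the recognition step."""
    return len(segment) >= min_area

mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],   # one 3-pixel segment, one isolated 1-pixel segment
    [0, 0, 0, 0],
]
segments = label_segments(mask)
targets = [s for s in segments if is_target(s)]
```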

It is very important to understand that medical images are often far more difficult to process and recognize than popular (i.e. daily-life) images, such as character, face, and person images. In particular, microscopic bio-images present the following difficulties for image processing and informatics [1].

Figure 2: Combination of multiple image processing techniques for realizing a complete system: image acquisition (the digital form of the image) → noise removal (to more easily identify the target) → image segmentation → recognition of each segment to find the target segment.


Table 1: Image processing and recognition methods which fit a specific purpose

Figure 3: Super-informatics image techniques system (SIITS): capture medical image → image processing techniques → VNA system (medical image storage) → the image diagnosed.


IV. The concept of Big Data in image processing and healthcare

Big data is the future of image processing and is part of the healthcare scope. With big data poised to change the healthcare ecosystem, organizations need to devote time and resources to understanding this phenomenon and realizing the envisioned benefits. All healthcare constituents (members, payers, providers, groups, researchers, governments, etc.) will be impacted by big data, which can predict how these players are likely to behave, encourage desirable behavior and minimize less desirable behavior. These applications of big data can be tested, refined and optimized quickly and inexpensively, and will radically change healthcare delivery and research. Leveraging big data will certainly be part of the solution to controlling spiraling healthcare costs. We will try to define big data and explore the opportunities and challenges it poses for healthcare organizations, in order to understand how the huge quantity of medical images presents itself as a big data issue [26, 27].

A large amount of data becomes "big data" when it meets five criteria: volume, variety, velocity, veracity and value (Figure 4).

Figure 4: The five Vs

Here is a look at the three most important Vs:

Healthcare Big Data: Volume

Big data in healthcare means there is a lot of data: terabytes or even petabytes (1,000 terabytes). This is perhaps the most immediate challenge of big data in medical settings, as it requires scalable storage and support for complex, distributed queries across multiple data sources. The challenge is being able to identify, locate, analyze and aggregate specific pieces of data in a vast, partially structured data set [32].
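For a sense of scale, here is a back-of-the-envelope calculation of one hypothetical hospital's yearly imaging volume. The per-study sizes and study counts below are invented for illustration; only the arithmetic is the point.

```python
# Rough yearly imaging-storage estimate (all figures hypothetical).

TB = 10 ** 12                         # one terabyte, in bytes (decimal)
study_size_bytes = {                  # assumed average size per study
    "CT": 0.5 * 10**9,
    "MRI": 0.3 * 10**9,
    "X-ray": 0.03 * 10**9,
}
studies_per_year = {"CT": 40_000, "MRI": 25_000, "X-ray": 120_000}

yearly_bytes = sum(study_size_bytes[m] * studies_per_year[m]
                   for m in study_size_bytes)
yearly_tb = yearly_bytes / TB         # tens of terabytes per year
```

Even this modest single-site estimate lands in the tens of terabytes per year, so a multi-site archive retained over many years reaches the petabyte range quoted above.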

While standard techniques and technologies exist to deal with volumes of structured data, it becomes a significant challenge to analyze and process a large amount of highly variable data and turn it into actionable information. But this is also where the potential of big data lies, as effective analytics allow you to make better decisions and realize opportunities that would not otherwise exist.

Such examples of large data, their promise and their challenges, have not gone unnoticed. In the US, the National Science Foundation, the National Institutes of Health, the Defense Department, the Energy Department and the Homeland Security Department, as well as the U.S. Geological Survey, have all made commitments toward "big data" programs. The Obama Administration itself has even gotten in on the act: in response to recommendations from the President's Council of Advisors on Science and Technology, the White House sponsored a meeting bringing together a cross-agency committee to lay out specific actions agencies should take to coordinate and expand the government's investment in "big data", totaling $200 million in support. We mention all this as an example of big data's feasibility [30, 32].

Healthcare Big Data: Variety

There are three different forms of data in most large healthcare institutions. Discretely codified billing and clinical transactions are well suited to relational data models. Digital capture and management of diagnostic imaging studies required the development of specialized data formats, communication protocols, and storage systems; while these PACS systems are not typically recognized as big data, they clearly meet the criteria we have outlined here.

The third form of data in healthcare consists of blobs of text, typically generated to document an encounter or procedure. While stored electronically, there is very little analysis done on this data today, because database servers are not able to effectively query or process these large strings. Natural language processing has been around since the 1950s, but progress in the field has been much slower than initially expected. The accuracy and reliability of the results produced by this technology do not yet meet the requirements of most clinical analytic use cases. There is much opportunity for progress in this area, particularly for clinical research [29].

Healthcare Big Data: Velocity

The speed at which some applications generate new data can overwhelm a system's ability to store that data. Data can be generated from two sources: humans or sensors. We have both sources in healthcare. With a few exceptions, such as diagnostic imaging and intensive care monitoring, most of the data we use in healthcare is entered by people, which effectively limits the rate at which healthcare organizations can generate data. Like a hospital's, Facebook's data is generated by people [32].

V. Healthcare Analytics and Deeper Insight

Data analytics, wisely used, can create business value and competitive advantage. Compared with many other industries, healthcare has been a late adopter of analytics. Most health systems have many opportunities to improve clinical quality and financial performance, and analytics are required to identify and take advantage of those opportunities.

It is a long journey for most organizations to develop a culture of continuous, data-driven improvement. Can big data help your healthcare organization along this journey? Hopefully you now have a framework to help guide your thinking. Learn why, when it comes to predictive analytics, sometimes big data is a big miss, and why advanced analytics cannot solve all of healthcare's problems [28, 31].

VI. New Computer Science Designed with Big Data in Mind

It might be tempting to think that once all the medical imaging (MI) data has been archived and indexed, all one would need to do is start analyzing it, and the answers to all our questions about the MI data would be revealed. In reality, even examining the contents of an archive to know what data is available to be analyzed requires new, cleverly designed, user-friendly software tools and novel approaches for exploratory inspection [28]. Such tools are only now beginning to appear, and their further development will be essential for dealing with the existing as well as the expected size of MI data sets.

Once a selection of data worthy of further analysis has been identified, a new concern arises: many software packages for MI data analysis are ill-suited to very large data sets involving potentially thousands of subjects. Algorithm optimization is often not considered when data sets are small or modest in size, but as data sets grow, memory management becomes an important factor. New mathematics and informatics approaches will be needed to more completely model multi-modal MI data in the context of disease diagnosis, white matter connectivity, and functional activity. These will need to work fast, be accurate, and be interoperable with other tools so that data processing can be automated as much as possible. Interactive workflow environments for automated data analysis will also be critical for ongoing or retrospective research studies involving complex computations on large multi-dimensional datasets. Yet few tools, if any, now exist that enable the joint analysis of MI data and are capable of efficiently obtaining results while also achieving the requisite degree of statistical power. Moving forward, software engineers will need to create brilliant and innovative ways to tackle the massive amounts of MI data [31].
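The memory-management point above can be made concrete with a small sketch: a streaming mean computed chunk by chunk, so that memory use depends on the chunk size rather than on the total data set size. The data source here is a synthetic number range standing in for a huge series of pixel values.

```python
# Chunked (streaming) aggregation: only one chunk plus two scalars are
# ever held in memory, regardless of how large the source is.

def chunked(source, chunk_size):
    """Yield successive fixed-size chunks of an iterable."""
    chunk = []
    for item in source:
        chunk.append(item)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk

def streaming_mean(chunks):
    """Aggregate per chunk; memory stays O(chunk_size), not O(n)."""
    total, count = 0, 0
    for chunk in chunks:
        total += sum(chunk)
        count += len(chunk)
    return total / count

values = range(1, 1_000_001)           # stands in for a huge image series
mean = streaming_mean(chunked(values, chunk_size=4096))
```

The same pattern (map each chunk to a small summary, then combine summaries) underlies most big-data processing frameworks, which additionally distribute the chunks across machines.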

VII. Image segmentation

Image segmentation is one of the most important image processing techniques for medical images. Its purpose is to partition an input image into regions. Image segmentation is necessary for multiple purposes: for example, counting objects, measuring the two-dimensional (or three-dimensional) distribution of the objects, measuring the shape or appearance of individual objects, recognizing the individual objects, localizing objects for tracking, removing unnecessary regions, etc. [33].

It is important to note that image segmentation is the most difficult task among all image processing tasks. Even though human beings perform image segmentation without any difficulty, computers often struggle with it. In fact, we do not yet have a perfect segmentation method, even for separating a human face from a picture. Biological images are often far more difficult than face images, because target objects in biological images have ambiguous boundaries and are thus difficult to separate from the background and from other objects. Furthermore, all the difficulties listed in the Introduction (such as low resolution) make segmentation a difficult task.

Table 2 lists typical image segmentation methods, which have been developed for general (i.e. non-biological) images. These methods are overviewed below, except for binarization. Again, there is no perfect segmentation method (see Figure 5), especially for biological images. Developing new methods specialized for biological images will be an important direction for future work [8, 34].

Figure 5: Various types of segmentation


Table 2: List of image segmentation methods

VIII. Medical Image Processing in the Healthcare Industry

Medical image processing needs continuous enhancement, in terms of both techniques and applications, to help improve the quality of services in the healthcare industry. The techniques used for interpolation, image registration, compression and medical diagnosis must be improved to keep abreast of growing demands in the industry and of emerging technologies such as mobile computing and cloud computing. From the analysis of the literature, it is understood that the healthcare domain has much scope for further research in the areas of diagnosing life-threatening diseases and the use of remote health monitoring applications that function in real time to alert healthcare employees. The integration of medical equipment and applications with wearable devices is also a promising area for further research [5, 28].

Growing interest in the healthcare domain has paved the way for innovative approaches to medical diagnosis and clinical practice. Since health is considered to be wealth, the healthcare industry has been striving to use innovative medical procedures and treatment practices coupled with computational technologies, harnessing advances in hardware resources [7, 14].

Precision in disease diagnosis, accuracy in clinical practice and improvement in state-of-the-art equipment are never-ending necessities in the healthcare industry. This has led to various best practices

Image binarization. Methodology: see Table 1. Merit: appropriate when the target object comprises only bright (or only dark) pixels. Demerit: limited applicability (note, however, that several binarization methods can be extended to multi-level thresholding; for example, by using two thresholds, an image is partitioned into bright, mid and dark regions).

Background subtraction. Methodology: detect target objects by removing the background part. Merit: appropriate when target objects are distributed over the background. Demerit: a background image is necessary; especially when the background is not constant, some dynamic background estimation is needed.

Watershed method. Methodology: represent an image as a three-dimensional surface and detect its ridge lines, i.e. watersheds. Merit: even if the gray-level change is not abrupt, its peak can be detected as an edge. Demerit: appropriate preprocessing is necessary to suppress noise.

Region growing. Methodology: iterative; if neighboring regions have similar properties, combine them. Merit: simple. Demerit: inaccurate due to its local optimization policy.

Clustering. Methodology: group pixels with similar properties. Merit: simple; popular clustering algorithms, such as k-means, can be used. Demerit: difficulty in balancing locational proximity and pixel-value similarity.

Active contour model. Methodology: optimally locate a deformable closed contour around a single target object. Merit: robust thanks to its optimization framework; even if the contour of a target object is invisible, it still provides a closed contour. Demerit: only for a single object; difficulties in dealing with unsmooth contours; usually, characteristics of the region enclosed by the contour are not considered.

Template matching and recognition-based methods. Methodology: find pixels or blocks whose appearance or other characteristics are similar to reference patterns of the target object. Merit: capable of stable segmentation by using various pattern recognition theories to evaluate similarity. Demerit: computationally expensive; often a sufficient number of reference patterns is necessary for stability.

Markov random field (MRF). Methodology: an integrated method to optimize the segmentation result considering the similarity of neighboring pixels. Merit: accurate and robust; highly flexible and capable of using various criteria. Demerit: computationally expensive; difficult to implement.
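The clustering entry above can be illustrated with a tiny one-dimensional k-means over pixel gray levels (k = 2). Clustering by value alone also demonstrates the listed demerit: pixel location is ignored entirely.

```python
# 1-D k-means: group pixels into k clusters by gray-level similarity.

def kmeans_1d(values, k=2, iters=20):
    """Plain k-means on scalar values; returns the final cluster centers."""
    # Spread the initial centers across the sorted value range.
    centers = sorted(values)[:: max(1, len(values) // k)][:k]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers

def assign(values, centers):
    """Label each pixel with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            for v in values]

pixels = [12, 15, 10, 240, 250, 245, 14, 242]   # dark and bright pixels
centers = kmeans_1d(pixels, k=2)
labels = assign(pixels, centers)
```

Note that the seventh pixel (value 14) joins the dark cluster even though it sits between bright neighbors in the list: value similarity wins, and locational proximity plays no role.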


which are clinically proven. However, more needs to be done with the ever-growing medical data, nowadays called big data, in order to discover hidden knowledge in it [14, 31].

The healthcare industry generates a huge amount of data. Intelligent processing of such data can reveal hidden relationships among the data items, which will help in clinical diagnosis. Growth in the use of medical image processing can improve the quality of services, reduce the death toll and improve the health standards of a country's citizens [9, 28].

Lehmann et al. explored B-spline interpolation techniques for medical imaging in order to improve the quality of images; this has important utility, as healthcare users need good visual perception of images. Banos Jr., Sehn and Krechel proposed a service model known as the "Integrated Image Access and Distributed Processing Service", a distributed environment which facilitates radiological medical personnel's access to image processing features. Matthew J. et al. proposed an application for medical image processing and visualization which enabled professionals to study and diagnose clinical disorders. Later, 3D imaging came into existence to leverage medical image processing: Li, Papachristou and Shekhar provided a reconfigurable architecture for 3D medical image processing, with four operational stages, namely parameter generation, input brick fetching, medical data stream processing and output brick storing. Tian and Ha reviewed applications for medical image processing that make use of wavelet and inverse transforms; these applications are used for clinical diagnosis. Chen, Yi and Ni proposed a platform known as the Medical Image Processing Platform (MIPP) for web-based processing of medical images, which was used for the design and manufacture of stents for heart patients [7, 28, 31].

IX. Example of Using Data Mining Techniques for Diagnosis of Cancer and Heart Ailments

Data mining techniques are being used for processing medical databases. Kharya proposed a methodology for diagnosis and prognosis of breast cancer, using a decision tree model to represent actionable knowledge pertaining to breast cancer. Artificial Neural Networks (ANNs) were also used to diagnose breast cancer. Krishnaiah et al. used classification techniques for lung cancer prediction. Their system is used for early detection and accurate diagnosis of lung cancer, which saves doctors' time besides helping them in clinical practice. Srinivas et al. used data mining techniques for prediction of heart attacks [12].
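To illustrate the decision tree idea mentioned above, the following is a minimal hand-built sketch. The feature names and thresholds are hypothetical, chosen only for illustration; they are not taken from Kharya's actual model.

```python
# Illustrative decision tree for a two-class diagnosis, in the spirit of the
# breast-cancer models described above. Features and cut-offs are hypothetical.

def diagnose(tumor_size_mm: float, texture_score: float) -> str:
    """Classify a sample as 'benign' or 'malignant' with two nested splits."""
    if tumor_size_mm <= 14.0:      # first split: small tumors classed benign
        return "benign"
    if texture_score <= 0.35:      # second split: smooth texture classed benign
        return "benign"
    return "malignant"

print(diagnose(10.0, 0.9))   # small tumor -> benign
print(diagnose(20.0, 0.8))   # large tumor, rough texture -> malignant
```

In practice such trees are learned from labeled patient records rather than written by hand, but the learned model reduces to exactly this kind of nested threshold test, which is what makes it readable as actionable clinical knowledge.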

X. Archiving and Its Challenges

An archive is a location containing a collection of records, documents, or other materials of historical importance. In the context of computers, it is generally long-term storage, often on disks and tapes. Archiving is typically done in a compressed format so that data are saved efficiently, using fewer storage resources and allowing the whole process of archiving to be executed rapidly. PACS can archive images for several years: 3-5 years is very common. PACS storage has inbuilt mechanisms to take care of disk failures through RAID (Redundant Array of Independent Disks, also called inexpensive disks). Depending on the patient load, the types of modalities, and the duration for which the images are to be stored, the storage size varies from terabytes to petabytes or even exabytes and zettabytes [23, 24].
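A back-of-the-envelope calculation shows how quickly terabyte-scale archives arise, and why compressed storage matters. The patient-load and study-size figures below are assumptions for illustration, not numbers from the text.

```python
import zlib

# Hypothetical sizing: 200 studies/day, ~500 MB per study, retained 5 years.
studies_per_day = 200
mb_per_study = 500
years = 5
total_tb = studies_per_day * mb_per_study * 365 * years / 1_000_000
print(f"raw archive size: {total_tb:.1f} TB")   # 182.5 TB before compression

# Archiving in compressed (lossless) form, as the text describes, shrinks
# this; zlib stands in here for the lossless codecs a PACS would use.
pixels = bytes(range(256)) * 4096               # 1 MiB of synthetic pixel data
packed = zlib.compress(pixels, level=9)
print(f"compressed to {len(packed) / len(pixels):.1%} of original size")
```

Real medical images compress far less predictably than this synthetic buffer, but the principle is the same: lossless compression trades a little CPU time for substantial savings in long-term storage.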

A few challenges present themselves in archiving. A common misconception in archiving is "my PACS is DICOM conformant and hence there will be no interoperability problems." The reality is that every PACS has its own internal formats to store data and its inherent proprietary methods to store image presentation states and key image notes. When a hospital needs to migrate to a new PACS vendor, the complete earlier data need to be migrated into the format of the new PACS. This consumes both time and money. Part of the problem occurs because DICOM is in reality a cooperative standard and not an enforced one, and hence has limitations. Vendors make their own conformance statements, which may or may not conform to all that is expected of them, and there may be a few gaps and inconsistencies [16, 22].

Inability of vendors to comply fully with their conformance statements occurs occasionally. Such situations arise when a vendor providing a detailed conformance statement declares that interoperability between their machine and another vendor's machine is the responsibility of the user and not the vendor. Similarly, equipment may have conformed to DICOM standards at testing and installation, but subsequent non-conformance when the standards change is not the vendor's responsibility. Data that are not in conformance with DICOM standards, i.e., data in a format understood only by a specific vendor, are called "dirty data" [18, 19].

XI. Features of VNA

VNA is an application engine that handles the data of any vendor, and at a fast speed. It is stationed between the modality and the PACS [Figure 6]. The imaging data are pushed to the VNA directly from the modality. Thereafter, the VNA forwards them to the PACS, along with the priors. The VNA stores the image presentation states and key image notes in DICOM format.


Figure 6: The Vendor Neutral Archive (VNA) is stationed between the modality and the PACS. The imaging data are pushed to the VNA directly from the modality. Thereafter, the VNA forwards them to the PACS, along with the priors. The VNA stores the image presentation states and key image notes in DICOM format [17].

So what does the VNA do? It simply decouples the PACS and workstations at the archival layer. Let us take a situation where (a) the modality did not have a field in the Graphical User Interface (GUI) to permit data entry by the technologist, and (b) the modality worklist was not supported. In this situation, it is possible that the accession number is entered into the study description field, while the compulsory field in the DICOM header meant for the accession number is left blank. This would cause problems not only in efficient workflow but also in retrieval of images in the future [21].

To handle this problem, an application engine was developed that would check the DICOM header for any of these non-conformances, automatically normalize the DICOM tags, and additionally send the received DICOM file in its original form. Another issue that one comes across is that of non-conformance of transfer syntax, some common ones being JPEG Lossless, JPEG 2000 Lossless, JPEG 2000, and Implicit VR Little Endian [17].
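The tag-normalization step described above can be sketched as follows. The header is modeled as a plain dictionary and the rule is hypothetical; a real engine would operate on actual DICOM data sets (for example via a DICOM toolkit) under its own conformance profile.

```python
# Minimal sketch of tag morphing: check a simulated DICOM header for a
# non-conformance and normalize it, while keeping the received original intact.

def normalize_header(header: dict) -> dict:
    fixed = dict(header)   # copy, so the original file can also be forwarded as-is
    # The case from the text: the accession number was typed into the
    # StudyDescription field and the mandatory AccessionNumber was left blank.
    if not fixed.get("AccessionNumber") and fixed.get("StudyDescription", "").isdigit():
        fixed["AccessionNumber"] = fixed["StudyDescription"]
        fixed["StudyDescription"] = ""
    return fixed

received = {"AccessionNumber": "", "StudyDescription": "20230917"}
print(normalize_header(received))   # normalized copy for the PACS
print(received)                     # original preserved, forwarded unchanged
```

The key design point is that normalization is non-destructive: the engine emits a corrected copy for the downstream PACS while the file as received is kept, so nothing the modality sent is lost.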

As the DICOM standard grows, more and more transfer syntaxes are being added, with different vendors using different ones. It is quite possible that two vendors whose systems are expected to be used together use different syntaxes. This problem is handled by developing an application engine that receives the data using one kind of syntax and transmits the data using the syntax of the target system [20].
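The syntax-bridging engine just described can be sketched as a lookup-and-re-encode step. The UIDs below are the real DICOM transfer syntax UIDs for the syntaxes named in the text, but the transcoding itself is simulated with a placeholder; a real engine would invoke the corresponding codecs.

```python
# Sketch of a transfer-syntax bridge: receive in the sender's syntax,
# re-encode for the target system's syntax, then forward.

SYNTAX_UID = {
    "Implicit VR Little Endian": "1.2.840.10008.1.2",
    "JPEG Lossless":             "1.2.840.10008.1.2.4.70",
    "JPEG 2000 Lossless":        "1.2.840.10008.1.2.4.90",
}

def bridge(payload: bytes, source: str, target: str) -> tuple[str, bytes]:
    """Decode from `source` syntax and re-encode for `target` (simulated)."""
    if source not in SYNTAX_UID or target not in SYNTAX_UID:
        raise ValueError("unsupported transfer syntax")
    recoded = payload   # placeholder: a real engine would transcode the pixels
    return SYNTAX_UID[target], recoded

uid, data = bridge(b"\x00\x01", "JPEG Lossless", "Implicit VR Little Endian")
print(uid)
```

Keeping the supported syntaxes in an explicit table mirrors how such an engine is configured in practice: adding support for a new vendor pairing means registering one more syntax, not rewriting the routing logic.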

Finally, the application engine needs to perform the above two functions at a fast speed. Thus, it is stationed between the modalities and the PACS and performs tag morphing and routing. Table 3 outlines the ideal characteristics and advantages of a VNA (Figure 7).

Figure 7: A Vendor Neutral Archive can work as an enterprise archive for all departments, such as radiology and cardiology, seamlessly integrating their data.

Table 3: Ideal characteristics and advantages of a Vendor Neutral Archive (VNA)

Ideal characteristics of VNA:
- As per the US FDA, it is a class one medical device.
- Includes lifecycle image management.
- Manages images as well as other related information, e.g., SR, PR, RT objects, non-DICOM data, waveforms, PDFs, etc.
- Supports open standards.
- Supports multiple departments, enterprise, and regional architecture.

Advantages of VNA:
- Increased workflow efficiency (quality), saving time and labor (quickness).
- Ability to store a huge number of images (quantity).
- Allows switching PACS without requiring a complex image/data migration.
- Ability to use the latest hardware technology.
- Effectively controls data across the enterprise.
- Allows PACS to be interchangeable.

VNA: Vendor Neutral Archive; FDA: Food and Drug Administration; PACS: Picture Archiving and Communications System; SR: Structured Reports; PR: Presentation States; RT: Radiation Therapy; DICOM: Digital Imaging and Communications in Medicine; IHE: Integrating the Healthcare Enterprise.


XII. Future of Archiving with VNA

In the end, a VNA is an archive that has been developed on an open architecture. It can be easily migrated, or ported to interface with another vendor's viewing, acquisition, and workflow engine, to manage medical images and related information.

Besides images sourced from radiology, the latest PACS will allow storage of images from other sources such as endoscopes, ophthalmoscopes, and bronchoscopes, and from the departments of dermatology, pathology, etc. An emerging term for such images is "Visible Lights" [17].

XIII. Conclusion

Image processing and informatics image techniques are helpful for analyzing medical images and making the diagnosis of diseases easier. Since a huge number of techniques, as well as many informatics systems, have been proposed, choosing the appropriate technique for a specific task is important. For example, there are many binarization or segmentation techniques with different properties, and therefore we need to understand which binarization or segmentation technique is best for the task, and also which informatics system is better. This paper can be used as a brief guide for helping make the diagnosis of diseases faster and more effective by obtaining the 3Q factors: quality of processing data, quantity of data (since MI is a big data source), and quickness of processing these data.

As emphasized, medical images are a very difficult target even for state-of-the-art image processing and informatics image techniques. Thus, for a specific task, we may need to develop a new technique. This will be possible through a collaboration, with enough discussion, between biologists and specialists in image processing and informatics techniques. On the other hand, a task may be solved easily by an existing technique or a combination of existing techniques. We suggest in this paper a new system to find the diagnosis by implementing two steps (possibly more) in order to make disease diagnosis faster and more effective, since it is concerned with human life.

Even in this case, it is worth discussing with an

image processing specialist because she/he will help

to choose appropriate techniques. Like biology,

research on image processing and informatics

techniques continues steadily and will make further

progress in accuracy, robustness, versatility,

usability, computational efficiency, etc.

Many biological tasks will be able to use SIITS in the future for fully automatic image analysis. They can also use future (or even present) informatics image techniques to prove empirically known biological facts and to discover new biological facts. Again, for continued progress, mutual collaboration between biologists and image processing specialists is very important.

Bibliography:

1. Uchida, Seiichi. "Image processing and

recognition for biological images."

Development, growth & differentiation 55.4

(2013): 523-549.

2. Langs, Georg, et al. "VISCERAL: Towards large

data in medical imaging—Challenges and

directions." Medical Content-Based Retrieval for

Clinical Decision Support. Springer Berlin

Heidelberg, 2013. 92-98.

3. Ravudu, M., V. Jain, and M. M. R. Kunda.

"Review of image processing techniques for

automatic detection of eye diseases." Sensing

Technology (ICST), 2012 Sixth International

Conference on. IEEE, 2012.

4. Richter, Detlef. "Current state of image

processing for medical irradiation therapy."

Radioelektronika (RADIOELEKTRONIKA),

2012 22nd International Conference. IEEE,

2012.

5. Bulsara, Viralkumar, et al. "Low cost medical

image processing system for rural/semi urban

healthcare." Recent Advances in Intelligent

Computational Systems (RAICS), 2011 IEEE.

IEEE, 2011.

6. Villanueva, Lara G., et al. "Medical Diagnosis

Improvement through Image Quality

Enhancement Based on Super-Resolution."

Digital System Design: Architectures, Methods

and Tools (DSD), 2010 13th Euromicro

Conference on. IEEE, 2010.

7. Liu, Danzhou, Kien A. Hua, and Kiminobu

Sugaya. "A generic framework for Internet-based

interactive applications of high-resolution 3-D

medical image data." Information Technology in

Biomedicine, IEEE Transactions on 12.5 (2008):

618-626.

8. Castro, F. Javier Sanchez, et al. "A cross

validation study of deep brain stimulation

targeting: from experts to atlas-based,

segmentation-based and automatic registration

algorithms." Medical Imaging, IEEE

Transactions on 25.11 (2006): 1440-1450.

9. Kassim, Ashraf A., et al. "Motion compensated

lossy-to-lossless compression of 4-D medical

images using integer wavelet transforms."

Information Technology in Biomedicine, IEEE

Transactions on 9.1 (2005): 132-138.

10. Chen, Gong, Hong Yi, and Zhonghua Ni.

"MIPP: a Web-based medical image processing

system for stent design and manufacturing."

Services Systems and Services Management,


2005. Proceedings of ICSSSM'05. 2005

International Conference on. Vol. 2. IEEE, 2005.

11. Abràmoff, Michael D., Paulo J. Magalhães, and

Sunanda J. Ram. "Image processing with

ImageJ." Biophotonics international 11.7 (2004):

36-43.

12. Jannin, Pierre, et al. "Validation of medical

image processing in image-guided therapy."

IEEE Transactions on Medical Imaging 21.12

(2002): 1445-9.

13. de Moraes Barros Jr, Euclides, et al. "A model

for distributed medical image processing using

CORBA." Computer-Based Medical Systems,

2001. CBMS 2001. Proceedings. 14th IEEE

Symposium on. IEEE, 2001.

14. Lehmann, Thomas Martin, Claudia Gonner, and

Klaus Spitzer. "Survey: Interpolation methods in

medical image processing." Medical Imaging,

IEEE Transactions on 18.11 (1999): 1049-1075.

15. Drukker, Karen. "Applied Medical Image Processing: A Basic Course." Med. Phys. 37 (2010): 6500.

16. van Ooijen, Peter MA, et al. "DICOM data

migration for PACS transition: procedure and

pitfalls." International journal of computer

assisted radiology and surgery (2014): 1-10.

17. Sanjeev, Tapesh Kumar Agarwal. "Vendor

neutral archive in PACS." Indian Journal of

Radiology and Imaging 22.4 (2012): 242-245.

18. Pianykh, Oleg S. Digital imaging and

communications in medicine (DICOM).

Springer, 2012.

19. Suapang, Piyamas, Kobchai Dejhan, and

Surapun Yimmun. "A web-based DICOM-

format image archive, medical image

compression and DICOM viewer system for

teleradiology application." SICE Annual

Conference 2010, Proceedings of. IEEE, 2010.

20. Kaur, K., V. Chopra, and H. Kaur. "Image Compression of medical images using VQ-Huffman Coding Technique." International Journal of Research in Business and Technology 1.1 (2012): 36-43.

21. Suapang, Piyamas, Kobchai Dejhan, and

Surapun Yimmun. "Medical image compression

and DICOM-format image archive." ICCAS-

SICE, 2009. IEEE, 2009.

22. Robertson, Ian D., and Travis Saveraid.

"Hospital, radiology, and picture archiving and

communication systems." Veterinary radiology

& ultrasound 49.s1 (2008): S19-S28.

23. Doi, Kunio. "Computer-aided diagnosis in

medical imaging: historical review, current status

and future potential." Computerized medical

imaging and graphics 31.4-5 (2007): 198-211.

24. Liu, Boqiang, et al. "Medical image conversion

with DICOM." Electrical and Computer

Engineering, 2007. CCECE 2007. Canadian

Conference on. IEEE, 2007.

25. Cho, Kyucheol, et al. "Development of medical

imaging viewer: role in DICOM standard."

Enterprise networking and Computing in

Healthcare Industry, 2005. HEALTHCOM 2005.

Proceedings of 7th International Workshop on.

IEEE, 2005.

26. Ryu, Seewon, and Tae-Min Song. "Big Data

Analysis in Healthcare." Healthcare informatics

research 20.4 (2014): 247-248.

27. Raghupathi, Wullianallur, and Viju Raghupathi.

"Big data analytics in healthcare: promise and

potential." Health Information Science and

Systems 2.1 (2014): 3.

28. Bindu, A., and C. N. Kumar. "Inpainting for Big

Data." Signal and Image Processing (ICSIP),

2014 Fifth International Conference on. IEEE,

2014.

29. Koumpouros, Yiannis. "Big Data in Healthcare."

Healthcare Administration: Concepts,

Methodologies, Tools, and Applications:

Concepts, Methodologies, Tools, and

Applications (2014): 23.

30. Groves, Peter, et al. "The 'big data' revolution in healthcare." McKinsey Quarterly (2013).

31. Suresh, K., and M. Rajasekhara Babu. "Towards

on high performance computing of medical

imaging based on graphical processing units."

Advanced Computing Technologies (ICACT),

2013 15th International Conference on. IEEE,

2013.

32. Constantinescu, Liviu, Jinman Kim, and Dagan

Feng. "Integration of interactive biomedical

image data visualisation to Internet-based

Personal Health Records for handheld devices."

e-Health Networking, Applications and Services,

2009. Healthcom 2009. 11th International

Conference on. IEEE, 2009.

33. Chauhan, Nikita K., and Swati J. Patel. "Survey on Medical image processing areas."

34. Inallou, Mohammad Madadpour, Majid

Pouladian, and Bahman Mehri. "The Application

of Partial Differential Equations in Medical

Image Processing."