
Performance Analysis of Hybrid Lossy/Lossless Compression Techniques for EEG Data

Madyan Alsenwi 1, Tawfik Ismail 2 and Hassan Mostafa 1,3

1 Faculty of Engineering, Cairo University; 2 National Institute of Laser Enhanced Science, Cairo University, Egypt

Abstract— Long recording time, a large number of electrodes, and a high sampling rate together produce a great data size for Electroencephalography (EEG). Therefore, more bandwidth and space are required for efficient data transmission and storage, so EEG data compression is a very important problem: transmitting EEG data efficiently with less bandwidth and storing it in less space. In this paper, we use the Discrete Cosine Transform (DCT) and the Discrete Wavelet Transform (DWT), which are lossy compression methods, to convert the random EEG data into highly redundant data. Adding a lossless compression algorithm after the lossy compression then yields a high compression ratio without additional distortion in the signal. Here, we use Run Length Encoding (RLE) and Arithmetic Encoding, which are lossless compression methods. The total time for compression and reconstruction (T), Compression Ratio (CR), Root Mean Square Error (RMSE), and Structural Similarity Index (SSIM) are evaluated to check the effectiveness of the proposed system.

I. INTRODUCTION

Electroencephalography (EEG) is an electrophysiological monitoring technique that records electrical activity within the brain. It is typically noninvasive, with the electrodes placed on the scalp. In medical applications, a group of sensors is placed over or inside the human body. These sensors collect several EEG signals and then transmit the collected data to an external station for analysis and diagnosis. The main problems during the transmission process are minimizing the transmission time under a limited channel capacity and saving power.

Compression techniques are one of the best solutions for overcoming the limitations of channel capacity and power consumption. Data compression is the process of converting an input stream of data into another data stream that is smaller in size. There are two types of compression techniques: lossless and lossy. In lossless compression, the original data can be reconstructed from the compressed data without any distortion; however, this technique limits the compression ratio (CR). In lossy compression, a high CR can be achieved, but some of the original data can be lost, which may lead to imperfect reconstruction.

The randomness in the EEG signal makes the compression of EEG data difficult [1]. Therefore, a high CR cannot be obtained with lossless techniques alone. Thus, lossy compression techniques are used with an accepted level of distortion.

Several works have focused on EEG data compression [2]. Paper [1] considered the use of the DCT algorithm for lossy EEG data compression; using the DCT alone, a high CR cannot be achieved. Paper [3] considered a compression algorithm for ECG data composed of DCT, RLE, and Huffman encoding. A high CR can be achieved with this algorithm, but it consumes a long time for the compression and reconstruction processes. Paper [4] presented a comparative analysis of three transform methods: DCT, Discrete Wavelet Transform (DWT), and a hybrid (DCT+DWT) transform. High distortion can occur in the reconstructed signal, since DCT and DWT are both lossy algorithms.

The rest of this paper is organized as follows. Section II discusses EEG compression techniques; DCT, DWT, RLE, and Arithmetic Encoding are introduced in this section. Section III introduces the implementation of the proposed system and the performance measures. Section IV discusses the simulation results. Finally, Section V concludes the paper.

II. DATA COMPRESSION TECHNIQUES

In this section, an overview of data compression techniques is introduced. Data compression techniques can be classified into two approaches: lossless and lossy compression [5], [6]. The lossless and lossy compression techniques used here are given below:

A. Discrete Cosine Transform (DCT)

DCT is a transformation method that converts a time series signal into frequency components. DCT concentrates the energy of the input signal in the first few coefficients, which is its main feature; therefore, DCT is widely used in the field of data compression.

Let f(x) be the input of the DCT, a set of N data values (EEG samples), and let Y(u) be the output, a set of N DCT coefficients. For N real numbers, the one dimensional DCT is expressed as follows [1], [7], [8]:

$$Y(u) = \sqrt{\tfrac{2}{N}}\,\alpha(u)\sum_{x=0}^{N-1} f(x)\cos\!\left(\frac{\pi(2x+1)u}{2N}\right) \qquad (1)$$

where

$$\alpha(u) = \begin{cases} \tfrac{1}{\sqrt{2}}, & u = 0 \\ 1, & u > 0 \end{cases}$$

where Y(0) is the DC coefficient and the rest of the coefficients are referred to as AC coefficients. The Y(0) coefficient contains the mean value of the original signal.

3 Center for Nanoelectronics and Devices, AUC and Zewail City of Science and Technology, Egypt

978-1-5090-5721-4/16/$31.00 ©2016 IEEE — ICM 2016

Fig. 1: DWT Tree

Fig. 2: RLE

The inverse DCT takes the transform coefficients Y(u) as input and converts them back into the time series f(x). For a list of N DCT coefficients, the inverse transform is expressed as follows [1]:

$$f(x) = \sqrt{\tfrac{2}{N}}\sum_{u=0}^{N-1} \alpha(u)\,Y(u)\cos\!\left(\frac{\pi(2x+1)u}{2N}\right) \qquad (2)$$

Most of the N coefficients produced by the DCT are small numbers or zeros; these small numbers are usually rounded down to zero.
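As a concrete illustration, the transform pair of Eqs. (1)-(2) can be written directly in NumPy. This is a sketch for checking the formulas, not the paper's Matlab implementation, and the test signal is illustrative:

```python
import numpy as np

def dct_1d(f):
    """Forward DCT of Eq. (1): Y(u) = sqrt(2/N) * alpha(u) * sum_x f(x) cos(pi(2x+1)u / 2N)."""
    N = len(f)
    x = np.arange(N)
    Y = np.empty(N)
    for u in range(N):
        alpha = 1.0 / np.sqrt(2.0) if u == 0 else 1.0
        Y[u] = np.sqrt(2.0 / N) * alpha * np.sum(f * np.cos(np.pi * (2 * x + 1) * u / (2 * N)))
    return Y

def idct_1d(Y):
    """Inverse DCT of Eq. (2)."""
    N = len(Y)
    u = np.arange(N)
    alpha = np.where(u == 0, 1.0 / np.sqrt(2.0), 1.0)
    f = np.empty(N)
    for x in range(N):
        f[x] = np.sqrt(2.0 / N) * np.sum(alpha * Y * np.cos(np.pi * (2 * x + 1) * u / (2 * N)))
    return f

# A smooth test signal: the DCT packs most of its energy into a few coefficients.
f = np.cos(2 * np.pi * np.arange(64) / 64.0)
Y = dct_1d(f)
print(np.allclose(idct_1d(Y), f))                  # perfect reconstruction
print(np.sum(Y[:8] ** 2) / np.sum(Y ** 2) > 0.95)  # energy compaction
```

With this normalization the transform is orthonormal, so the inverse reconstructs the signal exactly; the energy-compaction check mirrors the property the text describes.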

B. Discrete Wavelet Transform (DWT)

In DWT, the input signal is decomposed into a low frequency part (approximation) and a high frequency part (details), as shown in Fig. 1. Decomposing a signal into different frequency bands is suitable for data compression because each component can be studied with a resolution matched to its scale [9], [10].

In this paper, we use the DWT with the Haar basis function, a simple DWT basis, because it has low complexity and gives acceptable performance. For every two consecutive samples S(2m) and S(2m+1), the Haar DWT produces two coefficients, defined as [4], [11]:

$$CA(m) = \tfrac{1}{\sqrt{2}}\left[S(2m) + S(2m+1)\right] \qquad (3)$$

$$CD(m) = \tfrac{1}{\sqrt{2}}\left[S(2m) - S(2m+1)\right] \qquad (4)$$

where CA(m) is the approximation coefficient and CD(m) is the details coefficient. We can notice from equations (3) and (4) that calculating CA(m) and CD(m) is equivalent to passing the signal through first order low-pass and high-pass filters with a sub-sampling factor of 2, normalized by 1/√2.
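One level of the Haar transform in Eqs. (3)-(4) can be sketched in a few lines of NumPy (an even-length input is assumed; the sample values are illustrative):

```python
import numpy as np

def haar_dwt(S):
    """One level of the Haar DWT, Eqs. (3)-(4)."""
    S = np.asarray(S, dtype=float)
    even, odd = S[0::2], S[1::2]          # S(2m) and S(2m+1)
    CA = (even + odd) / np.sqrt(2.0)      # approximation (low-pass)
    CD = (even - odd) / np.sqrt(2.0)      # details (high-pass)
    return CA, CD

def haar_idwt(CA, CD):
    """Invert one Haar level by solving Eqs. (3)-(4) for the samples."""
    even = (CA + CD) / np.sqrt(2.0)
    odd = (CA - CD) / np.sqrt(2.0)
    S = np.empty(2 * len(CA))
    S[0::2], S[1::2] = even, odd
    return S

S = np.array([4.0, 6.0, 10.0, 12.0])
CA, CD = haar_dwt(S)
print(np.allclose(haar_idwt(CA, CD), S))   # True: perfect reconstruction
```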

C. Run Length Encoding (RLE)

RLE is a type of lossless compression. The idea of RLE is to take consecutive repeated occurrences of a data value and replace the run with a single occurrence of the value followed by its count (Fig. 2). This is most useful on data that contains many such runs [3], [7], [12].
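A minimal run-length encoder/decoder in the spirit of this stage, with each run stored as a (value, count) pair; the input list is illustrative:

```python
def rle_encode(data):
    """Collapse each run of a repeated value into a (value, count) pair."""
    runs = []
    for v in data:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [(v, n) for v, n in runs]

def rle_decode(runs):
    """Expand (value, count) pairs back into the original sequence."""
    out = []
    for v, n in runs:
        out.extend([v] * n)
    return out

# Thresholded transform coefficients contain long runs of zeros,
# which is exactly where RLE pays off.
coeffs = [9, 0, 0, 0, 0, 0, 3, 0, 0]
print(rle_encode(coeffs))                         # [(9, 1), (0, 5), (3, 1), (0, 2)]
print(rle_decode(rle_encode(coeffs)) == coeffs)   # True
```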

D. Arithmetic Coding (AC)

Arithmetic coding is a type of entropy encoding. Unlike other entropy encoders, such as Huffman coding, which replace each symbol in the message with a code, arithmetic coding replaces the entire message with a single code.

First, the input file is read symbol by symbol, and the algorithm starts with a certain interval. The probability of each symbol is used to narrow the interval. Specifying a narrower interval requires more bits, so the algorithm is designed such that a low-probability symbol narrows the interval more than a high-probability symbol in order to achieve compression. Therefore, the high-probability symbols contribute fewer bits to the output [13], [14], [15].
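The interval-narrowing idea can be illustrated with a short sketch. This is not a full arithmetic codec (no bit-stream output or adaptive model), and the symbol probabilities are illustrative:

```python
from math import log2, ceil

def narrow_interval(message, probs):
    """Shrink [0, 1) once per symbol; each symbol keeps its probability-sized slice."""
    # Cumulative distribution: symbol -> (low, high) slice of [0, 1).
    cum, c = {}, 0.0
    for sym, p in probs.items():
        cum[sym] = (c, c + p)
        c += p
    low, high = 0.0, 1.0
    for sym in message:
        width = high - low
        s_lo, s_hi = cum[sym]
        low, high = low + width * s_lo, low + width * s_hi
    return low, high

probs = {"a": 0.8, "b": 0.2}
low, high = narrow_interval("aaab", probs)
# The final width equals the product of the symbol probabilities (0.8^3 * 0.2),
# so high-probability symbols shrink the interval less and cost fewer bits.
bits = ceil(-log2(high - low)) + 1
print(high - low, bits)
```

Any number inside the final interval identifies the message; the bit estimate above is the classic bound of roughly −log2(interval width) plus one bit.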

III. IMPLEMENTATION AND PERFORMANCE MEASURES

This section introduces the implementation of the proposed algorithm and the performance measures.

A. Implementation

The proposed system consists of two main units: a compression unit and a decompression unit.

1) Compression Unit: The first step in this unit is reading the EEG data file and transforming it with DCT or DWT (according to the user's selection). After that, a thresholding step is applied to obtain high redundancy in the transformed data: values below the threshold value are set to zero. The number of zero coefficients can be increased or decreased by varying the threshold value, so the accuracy of the reconstructed data can be controlled. The transformation and thresholding steps together increase the probability of redundancies in the transformed data. Finally, applying RLE or Arithmetic Encoding (according to the user's selection) yields a high compression ratio due to the high redundancy in the transformed data (Fig. 3, Table I).

2) Reconstruction Unit: First, the compressed data is decoded using inverse RLE or inverse Arithmetic Encoding, according to the selection made in the compression unit. Then the inverse DCT or inverse DWT is applied, also according to that selection, in order to reconstruct the EEG data (Fig. 3, Table I).
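The thresholding step in the compression unit can be sketched as follows. The threshold fraction and the coefficient values are illustrative, not taken from the paper:

```python
import numpy as np

def threshold_coeffs(trans_data, thr):
    """Zero every coefficient whose magnitude is below thr times the largest one."""
    trans_data = np.asarray(trans_data, dtype=float)
    peak = np.max(np.abs(trans_data))
    out = trans_data.copy()
    out[np.abs(out) < thr * peak] = 0.0
    return out

# Transform coefficients with a few dominant values, as DCT typically produces.
coeffs = np.array([50.0, -12.0, 4.0, 0.3, -0.2, 1.0, 0.05])
kept = threshold_coeffs(coeffs, 0.1)   # thr = 0.1 is an illustrative choice
print(kept)                            # small coefficients are now zero
print(np.count_nonzero(kept))          # 2
```

Raising `thr` zeroes more coefficients (higher CR, more distortion); lowering it keeps more (lower CR, better reconstruction), which is the trade-off the text describes.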

B. Performance Measures

1) Root Mean Square Error (RMSE): The RMSE is used to test the quality of the reconstructed signal. It measures the error between two signals as follows:

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n}(y_i - y'_i)^2}{n}}$$

where y′ and y are the reconstructed and original signals, respectively.

The Normalized RMSE (NRMSE) is defined as follows:

$$\mathrm{NRMSE} = \frac{\mathrm{RMSE}}{y_{\max} - y_{\min}}$$

where y_max and y_min are the maximum and minimum values in signal y, respectively.

Fig. 3: Compression and Reconstruction Process

TABLE I: Compression and Reconstruction Algorithm

Data ← EEGData
▷ Lossy Compression
if DCT is required then
    TransData ← DCT(Data)
else
    TransData ← DWT(Data)
end if
▷ Thresholding
Thr ← ThresholdValue
[SortedData, index] ← sort(abs(TransData))
i ← 1
for the length of Data do
    if abs(SortedData(i)/SortedData(1)) > Thr then
        i ← i + 1
        continue
    else
        break
    end if
end for
TransData(index(i+1 : end)) ← 0
▷ Lossless Compression
if RLE is required then
    CompressedData ← RLE(TransData)
else
    CompressedData ← Arithm(TransData)
end if
▷ Reconstruction
if RLE is used then
    DecData ← IRLE(CompressedData)
else
    DecData ← IArithm(CompressedData)
end if
if DCT is used then
    ReconstructedData ← IDCT(DecData)
else
    ReconstructedData ← IDWT(DecData)
end if
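The RMSE and NRMSE measures above can be written as a short NumPy sketch; the example signals are illustrative:

```python
import numpy as np

def rmse(y, y_rec):
    """Root mean square error between the original y and reconstruction y_rec."""
    y, y_rec = np.asarray(y, dtype=float), np.asarray(y_rec, dtype=float)
    return np.sqrt(np.mean((y - y_rec) ** 2))

def nrmse(y, y_rec):
    """RMSE normalized by the range of the original signal."""
    y = np.asarray(y, dtype=float)
    return rmse(y, y_rec) / (y.max() - y.min())

y = np.array([0.0, 2.0, 4.0, 2.0])
y_rec = np.array([0.0, 2.0, 4.0, 4.0])
print(rmse(y, y_rec))    # 1.0
print(nrmse(y, y_rec))   # 0.25
```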

2) Compression Ratio (CR): The second performance measure used in this paper is the CR, defined as:

$$\mathrm{CR} = \frac{OriginalData - CompData}{OriginalData} \times 100$$
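Applied to byte counts, the CR definition above is a one-liner; the sizes here are illustrative:

```python
def compression_ratio(original_bytes, compressed_bytes):
    """CR as the percentage of size saved, per the definition above."""
    return (original_bytes - compressed_bytes) / original_bytes * 100.0

print(compression_ratio(1_000_000, 150_000))   # 85.0
```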

3) Compression and Reconstruction Time (T): The third performance measure is the time (T), which is the total time of the compression and reconstruction processes.

4) Structural Similarity Index (SSIM): The last performance measure is the Structural Similarity Index (SSIM) [16], defined as follows:

$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x\mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}$$

where μx, μy, σx, σy, and σxy are the means, standard deviations, and cross-covariance of the signals x and y, and C1 and C2 are the regularization constants [16].
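A single-window sketch of the SSIM formula above. Note that standard implementations compute SSIM over local windows rather than whole signals, and the constants C1 and C2 here are illustrative defaults:

```python
import numpy as np

def ssim_global(x, y, c1=1e-4, c2=9e-4):
    """Evaluate the SSIM formula once over the whole pair of signals."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()   # cross-covariance sigma_xy
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx**2 + my**2 + c1) * (vx + vy + c2))

x = np.linspace(0.0, 1.0, 100)
print(ssim_global(x, x))   # 1.0 for identical signals
```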

IV. SIMULATION AND RESULTS

The performance of the proposed system is studied using Matlab, run on an Intel(R) Core(TM) i3 CPU at 2.27 GHz with 4 GB RAM. The size of the EEG data used in the simulation is 1 MB.

Fig. (4) shows the CR of each case for different values of the Normalized RMSE. The value of the RMSE can be changed by varying the threshold value. Fig. (4) shows that the case of DCT with RLE has the highest CR. This is due to the compacting property of DCT, which causes high redundancy in the transformed data and facilitates the use of RLE. The case of DCT with Arithmetic Encoding also has a good CR compared to the other cases. In addition to the good CR, Arithmetic coding exports the data in binary form, so no analog-to-digital converter is needed when arithmetic coding is used.

Fig. (5) shows the total time of the compression and reconstruction processes versus the normalized RMSE. We notice from this figure that the case of DWT only has the smallest time, since the implementation of DWT is simple (just a low-pass filter and a high-pass filter). Conversely, Arithmetic coding consumes more time than the other techniques, due to its complexity. Fig. (5) also shows that the case of DWT with Arithmetic coding consumes more time than DCT with Arithmetic coding, even though DWT alone is faster than DCT. This is because DCT produces higher redundancy in the data than DWT, and this redundancy facilitates the use of Arithmetic coding and makes it faster.

Fig. (6) shows the CR versus the SSIM. The results in this figure are exactly the inverse of the results in Fig. (4), since the SSIM is inversely proportional to the RMSE. We can control the value of SSIM by varying the threshold value.

Fig. 4: CR versus Normalized RMSE

Fig. 5: Time versus Normalized RMSE

Finally, a comparison between all techniques according to CR and time is shown in Fig. (7). Figure (7) shows that DCT with RLE is the best, since it has the highest CR and consumes little time. Conversely, DWT with Arithmetic coding is the worst, since it has a low CR and consumes a long time.

V. CONCLUSION

In this paper, a compression algorithm for EEG data is proposed. The lossy compression techniques, DCT and DWT, are applied to transform the highly random time series EEG signal into basic frequency components. After that, a threshold is applied and all coefficients below the threshold value are replaced by zero. The resulting signal has a high probability of redundancy. Finally, a lossless compression technique, RLE or Arithmetic Encoding, is applied, which is highly efficient on high-redundancy data. The Compression Ratio, Normalized RMSE, Compression and Reconstruction Time, and SSIM are evaluated to check the performance of the proposed system. We conclude that the case of DCT with RLE has the highest CR and a lower time compared with the other cases (DCT with Arithmetic, DWT with RLE, DWT with Arithmetic).

Fig. 6: CR versus Structural Similarity

Fig. 7: CR versus Time in case of NRMSE = 0.56% and SSIM = 0.99

REFERENCES

[1] D. Birvinskas, I. Jusas, and Damasevicius, "Fast DCT algorithms for EEG data compression in embedded systems," Computer Science and Systems, vol. 12, no. 1, pp. 49–62, 2015.
[2] L. J. Hadjileontiadis, "Biosignals and compression standards," in M-Health. Springer, 2006, pp. 277–292.
[3] S. Akhter and M. Haque, "ECG compression using run length encoding," in Signal Processing Conference, 2010 18th European. IEEE, 2010, pp. 1645–1649.
[4] A. Deshlahra, G. Shirnewar, and A. Sahoo, "A comparative study of DCT, DWT & hybrid (DCT-DWT) transform," 2013.
[5] G. Antoniol and P. Tonella, "EEG data compression techniques," IEEE Transactions on Biomedical Engineering, vol. 44, no. 2, pp. 105–114, 1997.
[6] L. Koyrakh, "Data compression for implantable medical devices," in Computers in Cardiology, 2008. IEEE, 2008, pp. 417–420.
[7] D. Salomon, Data Compression: The Complete Reference. Springer Science and Business Media, 2004.
[8] S. Fauvel and Ward, "An energy efficient compressed sensing framework for the compression of electroencephalogram signals," Sensors, vol. 14, no. 1, pp. 1474–1496, 2014.
[9] B. A. Rajoub, "An efficient coding algorithm for the compression of ECG signals using the wavelet transform," IEEE Transactions on Biomedical Engineering, vol. 49, no. 4, pp. 355–362, 2002.
[10] M. A. Shaeri and A. M. Sodagar, "A method for compression of intra-cortically-recorded neural signals dedicated to implantable brain–machine interfaces," IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 23, no. 3, pp. 485–497, 2015.
[11] O. O. Khalifa, S. H. Harding, and A.-H. Abdalla Hashim, "Compression using wavelet transform," International Journal of Signal Processing, vol. 2, no. 5, pp. 17–26, 2008.
[12] Z. T. Drweesh and L. E. George, "Audio compression based on discrete cosine transform, run length and high order shift encoding," International Journal of Engineering and Technology (IJEIT), vol. 4, no. 1, pp. 45–51, 2014.
[13] P. G. Howard and J. S. Vitter, "Analysis of arithmetic coding for data compression," in Data Compression Conference, 1991. DCC '91. IEEE, 1991, pp. 3–12.
[14] A. Moffat, R. M. Neal, and Witten, "Arithmetic coding revisited," ACM Transactions on Information Systems (TOIS), vol. 16, no. 3, pp. 256–294, 1998.
[15] I. H. Witten, R. M. Neal, and Cleary, "Arithmetic coding for data compression," Communications of the ACM, vol. 30, no. 6, pp. 520–540, 1987.
[16] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: from error visibility to structural similarity," IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, 2004.
