CHAPTER 4

Image based Steganography

4.1 Introduction

Digital images often contain a large amount of redundant data, which makes it possible to
hide a secret message inside an image file. Images are the most common and widespread
carrier medium for steganography (Westfeld and Pfitzmann, 2000). To a computer, an image is
a collection of numbers that represent different light intensities in different areas of
the image (Johnson and Jajodia, 1998). This numeric representation forms a grid, and the
individual points are referred to as pixels (Morkel et al., 2005). These pixels make up the
image’s raster data. Data hiding in images takes advantage of the limited power of the
human visual system (HVS), which has a low sensitivity to changes in pattern and luminance
(Carvajal-Gamez et al., 2009). Most digital steganography methods exploit the margin
between the numerical value and the visual perception of the multimedia carrier. In other
words, secret messages are embedded in an image by introducing slight distortions in its
non-significant parts, which are invisible to the human perception system.

The number of digital images on the Internet has increased rapidly because of their
importance in many applications. Digital images are the most popular cover-objects for
steganography because their size is suitable in comparison with other digital media and
because they are massively present on the Internet, so they can carry a large amount of
embedded secret data. In image steganography, it is necessary to ensure that the changes
introduced in the stego-image by embedding are visually and statistically negligible, so
that the steganographic method is difficult to detect. The most effective way of hiding
data in an image is to change the image content, i.e. the colors of the pixels. Such a
technique, although crude, hides a large volume of information inside the image. The idea
is to embed the data into a significantly larger object so that the changes are
undetectable (Carvajal-Gamez et al., 2009). In steganography, a secret message is embedded
into another, innocent-looking digital medium in order to conceal not only the content of
the secret message but its very existence. The security of stego-images depends entirely on
their ability to go unnoticed. There are many methods for embedding secret information into
an image. The information can be embedded in any order or in specific areas of the image
file, chosen so that it is invisible to and undetectable by a third party. It is also
important to note that a steganographic technique involves not only embedding information
inside digital media but also enabling the receiver to retrieve that information
successfully. When dealing with digital images for steganography, 8-bit and 24-bit per
pixel image files are typical (Johnson and Jajodia, 1998). Both have advantages and
disadvantages: 8-bit images are useful because of their relatively small size, whereas
24-bit images offer much more flexibility. Greater benefits can therefore be obtained from
24-bit RGB images, which can represent more than 16 million colors (Gonzalez and Woods,
2002; Sheikh et al., 2006).

Reasons for using digital image as cover-media for steganography:

a. It is the most widely used medium.

b. It takes advantage of the limited human visual perception of colors.

c. This field is continually growing with the growth of computer graphics.

d. Digital images are made up of pixels.

e. The arrangement of pixels makes up the image’s ‘raster data’.

f. 8-bit and 24-bit images are common.

g. The larger the image size, the more information can be hidden.

In a steganographic technique, the choice of the cover image is equally important. Digital
images may be too large to be transmitted quickly over the Internet, so techniques are used
to reduce an image to a size that can be displayed in a reasonable time. These techniques
use mathematical formulas to reduce the image data, resulting in smaller file sizes; this
process is called compression (Morkel et al., 2005). Image compression minimizes the amount
of memory needed to represent an image. Current image compression methods can be divided
into two broad categories, lossy and lossless. Both save storage space, but they affect the
hidden information differently when the image is decompressed (Johnson and Jajodia, 1998).
Lossy compression (e.g. the JPEG format) attains a high level of compression and thus saves
more space, but in doing so it may alter many bits, so the decompressed image is no longer
identical to the original. The advantage of lossy formats, in particular JPEG, is that they
achieve extremely high compression while maintaining fairly good quality (Bender et al.,
1996). Lossless compression reconstructs the original data exactly and is preferred when
the original information must remain intact, as with steganographic images (Johnson and
Jajodia, 1998). However, lossless formats do not reach the high compression ratios of lossy
formats. Lossless compression is typical of images saved as GIF (Graphics Interchange
Format) and BMP (bitmap).

Image steganography methods are broadly classified into two categories according to the
working domain: spatial domain based and frequency domain based steganography.

4.1.1 Spatial Domain Based Steganography

The term spatial domain refers to the image plane itself, and approaches in this domain are
based on direct manipulation of the pixels of an image. Spatial domain methods are
procedures that operate directly on the aggregate of pixels composing an image.

Figure 4.1: One byte representation of a pixel with integer to binary conversion

In the spatial domain approach, the secret message is embedded directly into the pixels of
a cover image; that is, the pixel values of the cover-image are modified to carry the
secret data. Least significant bit (LSB) based hiding strategies are the most commonly used
in this approach. Spatial domain steganographic techniques, also known as substitution
techniques, are simple techniques that create a covert channel in those parts of the cover
image in which changes are likely to be imperceptible to the human visual system (HVS)
(Hamid et al., 2012). Spatial domain based steganography includes LSB based embedding and
palette based embedding.

4.1.1.1 Least Significant Bit based Steganography

Least significant bit (LSB) embedding is the most popular and common embedding scheme, in
which information is hidden in the least significant bits of an image (Juneja et al.,
2009). LSB steganography is one of the most classic and simplest steganographic techniques;
it embeds secret messages in a subset of the LSB plane of the image (Abraham and Paprzycki,
2004). This method is probably the easiest way of hiding information in an image, and yet
it is surprisingly effective. In one common form, it uses the least significant bits of
each pixel in one image to hide the most significant bits of another. The method is based
on the fact that the least significant bits in an image can be thought of as random noise,
so changing them has no perceptible effect on the image (Bailey and Curran, 2006; Kharrazi
et al., 2006; Hamid et al., 2012). A large number of popular steganographic tools, such as
S-Tools 4, Steganos and StegoDos, are based on LSB replacement in the spatial domain
(Johnson and Jajodia, 1998).

This technique substitutes redundant parts of a signal with the secret message. The
embedding process consists of choosing a subset of cover elements and performing the
substitution operation on them (Chan and Cheng, 2004). The basic concept of LSB based
embedding is to place the secret data in the bits that carry the minimum weight, so that
the value of the original pixel is changed as little as possible (Sharda and Budhiraja,
2013). This method usually works with raster images stored in a lossless format (e.g.
*.gif, *.bmp). These file formats are preferred because no pixel data is lost when the
image is saved, although other image formats are used as cover images as well
(Bandyopadhyay and Maitra, 2010). Because the image formats typically used in LSB
substitution are lossless, the data can be directly manipulated and recovered (Celik et
al., 2005). An important benefit of lossless storage is that every embedded bit is
preserved, which maximizes the usable embedding capacity. Employing the LSB technique for
data hiding achieves both invisibility and a reasonably high storage payload (Amin et al.,
2003). 8-bit images are not as forgiving of LSB manipulation because of their color
limitations (Johnson and Jajodia, 1998).

The advantage of LSB based data hiding is that it is simple to embed the bits of the
message directly into the LSBs of the image pixels, and many techniques use this method
(Amin et al., 2003). LSB modification does not result in perceptible image distortion, and
thus the resulting stego-image looks identical to the cover-image (Bailey and Curran,
2006). LSB based techniques enable a high embedding rate and also allow the secret data to
be recovered without error. For this reason, they are widely preferred in image
steganography. The amount of data to be embedded may be fixed or variable in size,
depending on the number of pixels selected. The main advantage of such a technique is that
modifying the LSB plane has little effect on the statistics of the overall image, as the
amplitude variation of the pixel values is bounded by ±1 (Chandramouli and Memon, 2001).

Overwriting the LSB changes the numeric value of a byte very little and is least likely to
be detected by human perception. Since there are 256 possible intensities of each primary
color, changing the LSB of a pixel results in only a small change in the intensity of the
color. With a well-chosen image, one can even hide a large volume of secret data in the
LSBs without a noticeable difference (Neeta et al., 2006). If the cover-image has M×N
pixels and one bit is embedded per pixel, the maximum data hiding capacity of LSB
steganography is M×N bits, and the embedding ratio p is defined as the ratio of the length
of the embedded message to the maximum capacity, where 0 ≤ p ≤ 1 (Zhang et al., 2006). In
the case of a 24-bit image, 3 bits can be stored in each pixel by changing the LSBs of the
red, green and blue color components, as each component is represented by one byte.
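The capacity relations above can be illustrated with a small calculation. This is a minimal sketch and not part of the cited works; the function names and the 512 × 512 image size are assumptions chosen only for illustration.

```python
def lsb_capacity_bits(width, height, channels=1, bits_per_channel=1):
    """Maximum number of message bits that plain LSB embedding can store."""
    return width * height * channels * bits_per_channel


def embedding_ratio(message_bits, capacity_bits):
    """Ratio p of message length to maximum capacity (0 <= p <= 1)."""
    if not 0 <= message_bits <= capacity_bits:
        raise ValueError("message does not fit in the cover image")
    return message_bits / capacity_bits


# Hypothetical 512 x 512 covers: 8-bit grayscale and 24-bit RGB.
gray_capacity = lsb_capacity_bits(512, 512)               # 1 bit per pixel
rgb_capacity = lsb_capacity_bits(512, 512, channels=3)    # 3 bits per pixel

# Embedding ratio for a 1 KiB (8192-bit) message in the grayscale cover.
p = embedding_ratio(8 * 1024, gray_capacity)
print(gray_capacity, rgb_capacity, p)
```

Under these assumptions the RGB cover offers three times the capacity of the grayscale cover of the same dimensions, which is the point made in the last sentence of the paragraph.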

The following example demonstrates how the letter ‘S’ can be hidden in the first eight
bytes of three pixels in a 24-bit image. In this image representation each pixel is made up
of three bytes, and each byte consists of eight bits. The original raster data for 3 pixels
may be



R G B

(00100110 11101010 11001010)

(00100101 11001010 11101011)

(11001010 00100101 11101011)

And the character, S=01010011

Embedding the character ‘S’ into the LSBs of these pixels gives the following resulting
pixels:

R G B

(00100110 11101011 11001010)

(00100101 11001010 11101010)

(11001011 00100101 11101011)

Only three bits are actually altered: the LSBs of the second, sixth and seventh bytes. The
ninth byte is not used and remains unchanged. On average, only one half of the LSBs are
changed (Johnson and Jajodia, 1998). Changing the MSBs would have a noticeable impact on
the color, but changing the LSBs is not noticeable and preserves the image quality. Thus,
01101010 could be changed to 01101011, or remain the same, and the difference would go
unnoticed by the casual observer. The last bit plane of the pixels can therefore be used to
embed data. This makes sense when one considers that one set of zeroes and ones is simply
substituted with another set of zeroes and ones.
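The substitution described in this example can be sketched in a few lines. This is an illustrative sketch only, using the nine cover bytes given as the original raster data; the helper names `char_to_bits` and `embed_lsb` are assumptions and do not come from the cited works.

```python
def char_to_bits(ch):
    """Return the 8 bits of a character, most significant bit first."""
    return [int(b) for b in format(ord(ch), "08b")]


def embed_lsb(cover_bytes, bits):
    """Replace the LSB of the first len(bits) cover bytes with the message bits."""
    if len(bits) > len(cover_bytes):
        raise ValueError("message is longer than the cover")
    stego = list(cover_bytes)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit   # clear the LSB, then set it to the bit
    return stego


# The three RGB pixels (nine bytes) of the original raster data.
cover = [0b00100110, 0b11101010, 0b11001010,
         0b00100101, 0b11001010, 0b11101011,
         0b11001010, 0b00100101, 0b11101011]

stego = embed_lsb(cover, char_to_bits("S"))   # 'S' = 01010011
changed = [i for i, (c, s) in enumerate(zip(cover, stego)) if c != s]

print([format(b, "08b") for b in stego])
print("changed byte positions:", changed)
```

Only the bytes whose LSB differs from the corresponding message bit change, and each of them changes by exactly one intensity level.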

The embedding process illustrated above shows that the secret message bits can be extracted
directly from the LSBs of the pixels selected during embedding. In the extraction process,
given the stego-image, the embedded secret message is extracted using the same sequence as
in the embedding process. The set of pixels storing the secret message bits is selected
from the stego-image, and the LSBs of the selected pixels are extracted and lined up to
reconstruct the secret message bits (Chan and Cheng, 2004). Thus, for extraction, the
receiver must have access to the sequence of element indices used in the embedding process.
The extraction algorithm is usually regarded as the inverse of the embedding algorithm,
although the two algorithms may be designed such that the extraction algorithm is not
actually the mathematical inverse of the embedding algorithm (Jain et al., 2012b).
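A matching extraction step might look as follows. This is a minimal sketch under the assumption that the message occupies the LSBs of consecutive bytes, and that the receiver knows the message length; the nine stego bytes are illustrative values whose first eight LSBs spell the letter ‘S’, and the function names are hypothetical.

```python
def extract_lsb(stego_bytes, n_bits):
    """Collect the LSBs of the first n_bits stego bytes, in embedding order."""
    return [b & 1 for b in stego_bytes[:n_bits]]


def bits_to_text(bits):
    """Group bits into 8-bit characters (MSB first); leftover bits are ignored."""
    chars = []
    for i in range(0, len(bits) - len(bits) % 8, 8):
        value = int("".join(str(b) for b in bits[i:i + 8]), 2)
        chars.append(chr(value))
    return "".join(chars)


# Illustrative stego bytes (assumed values, three RGB pixels).
stego = [0b00100110, 0b11101011, 0b11001010,
         0b00100101, 0b11001010, 0b11101010,
         0b11001011, 0b00100101, 0b11101011]

recovered = bits_to_text(extract_lsb(stego, 8))
print(recovered)
```

The receiver needs no copy of the cover image, only the positions that were used and the number of bits to read.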

A slight variation of this technique embeds the message in two or more of the least
significant bits per byte. This increases the hidden information capacity of the
cover-image, but the cover-image is degraded more and the embedding is therefore more
detectable (Juneja et al., 2009). Other variations aim to ensure that statistical changes
in the image do not occur. When hiding the message bits in the LSBs of an image, there are
two schemes, namely sequential and random. In the sequential scheme, the message is
embedded into successive pixels of the image. In the random scheme, the message bits are
scattered throughout the image using a random sequence that controls the embedding process.
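The difference between the sequential and random schemes can be sketched as a choice of embedding positions. The code below is an illustration only: it assumes that sender and receiver share a numeric key that seeds a pseudo-random generator, which is one possible way to obtain the "random sequence"; the function names, the key value and the synthetic cover bytes are all hypothetical. Python's `random` module is used for clarity and is not cryptographically secure.

```python
import random


def embedding_positions(n_cover, n_bits, key=None):
    """Sequential positions if key is None, otherwise key-dependent scattered ones."""
    if n_bits > n_cover:
        raise ValueError("message is longer than the cover")
    if key is None:
        return list(range(n_bits))
    rng = random.Random(key)                 # same key -> same positions
    return rng.sample(range(n_cover), n_bits)


def embed(cover, bits, key=None):
    stego = list(cover)
    for pos, bit in zip(embedding_positions(len(cover), len(bits), key), bits):
        stego[pos] = (stego[pos] & 0xFE) | bit
    return stego


def extract(stego, n_bits, key=None):
    return [stego[pos] & 1 for pos in embedding_positions(len(stego), n_bits, key)]


cover = [(37 * i + 11) % 256 for i in range(64)]   # synthetic cover bytes
bits = [0, 1, 0, 1, 0, 0, 1, 1]

stego_seq = embed(cover, bits)              # sequential scheme
stego_rnd = embed(cover, bits, key=2024)    # random scheme with a shared key

print(extract(stego_seq, len(bits)))
print(extract(stego_rnd, len(bits), key=2024))
```

In the random scheme a receiver without the key does not know which bytes carry message bits, whereas the sequential scheme concentrates all changes at the start of the image.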

The main drawback of LSB embedding is that it is vulnerable to even small manipulations of
the stego-image. Converting an image from a format like GIF or BMP, which reconstructs the
original data exactly (lossless compression), to JPEG, which does not (lossy compression),
and then back could destroy the information hidden in the LSBs (Johnson and Jajodia, 1998).
On the other hand, the technique maintains the size and properties of the source image and
allows high perceptual transparency. Most digital formats are designed with the outer
limits of human perception in mind, which makes LSB the method of choice for packaging
messages in these channels. LSB embedding has been shown to be quite versatile, and its
implementation is straightforward. All of these factors contribute to the continued use of
LSB in steganographic applications. The primary objective when using this method is to
trade a marginal amount of image quality for usable space within the carrier. Among all
message embedding techniques, LSB insertion/modification is difficult to detect and is
imperceptible to humans (Chandramouli and Memon, 2001).

4.1.1.2 Palette based Steganography

A palette based image allows 8 bits per pixel or less to look almost as good as 24 bits per
pixel (Agaian and Perez, 2004). Rather than each pixel holding all three RGB components
(one 8-bit red, one 8-bit green and one 8-bit blue), each pixel contains one 8-bit number
that indexes into a 256-color lookup table, which contains the RGB values (Bandyopadhyay
and Maitra, 2010). Palette based algorithms consist of color quantization and dithering.
Color quantization selects the palette of the image by truncating all colors of the
original raw 24-bit image to a finite number of colors (Fridrich, 1999a; Wang et al.,
2005). A palette image can thus be obtained from a three-layer color image by using color
quantization to reduce the number of unique colors used in the image (Johnson and Jajodia,
1998; Bandyopadhyay and Maitra, 2010). Data embedding in the palette takes advantage of the
color quantization process during this transformation. The basic idea is to insert the
secret message into the ordering of the colors in the color-map (Agaian and Perez, 2004).
Because color quantization already introduces some alteration into the image, the secret
message is able to pass as noise (Wang et al., 2005). Embedding in the palette order avoids
changes to the image content and leaves its visual appearance unscathed. This is possible
because two identical images may have completely different color-maps (Westfeld and
Pfitzmann, 2000). The information can also be embedded by arranging the palette in a
structure in which neighboring colors are close under a given distance measure, including
chroma difference (Fridrich, 1999a). Palette based steganography is also useful for fast
transmission of secret messages over a communication system. Dithering is used to increase
the apparent color depth: it relies on the integrating properties of the human visual
system and creates the illusion of additional colors by trading spatial resolution for
color depth (Fridrich, 1999a). The use of palette based image representation rests on the
observation that natural images usually use only a small percentage of the available RGB
color space, so colors can be quantized without severely degrading the image quality (Wang
et al., 2005).

Extracting the embedded message is relatively simple, as long as the color grouping
configuration is known. The receiver recovers the message by selecting the same pixels and
collecting the LSBs of their indices into the ordered palette (Wang et al., 2005; Wu et
al., 2004). Using the same sequence as in the embedding process, the secret message is read
by extracting the parity bits of the colors of the selected pixels from the stego-image.
Alternatively, the extraction is done by first sorting the palette (stego-palette) and then
retrieving the stego-bits from the palette.



Figure 4.2: Sorting of color in palette as used in EzStego method (Westfeld and

Pfitzmann, 2000).

One of the most popular message hiding schemes for palette-based images (GIF files) is the
EzStego method proposed by Machado, which is similar to the commonly used LSB method for
24-bit color images (or 8-bit grayscale images) (Westfeld and Pfitzmann, 2000; Fridrich,
1999a; Wang et al., 2005). EzStego is a non-adaptive method that embeds in the LSB of the
palette index after the palette has been sorted by luminance, which is a linear combination
of the three colors R, G and B (Agaian and Perez, 2004; Wu et al., 2004). In the reordered
palette, neighboring entries are typically close to each other in the color space, as shown
in figure 4.2 (Westfeld and Pfitzmann, 2000). The index of the pixel’s RGB color in the
reordered palette is then determined, and its LSB is replaced with a bit of the message. In
this way EzStego embeds the message in binary form into the LSBs of the indices (pixels)
pointing to the palette colors (Fridrich, 1999a; Wang et al., 2005; Wu et al., 2004).
However, colors with similar luminance values may occasionally be relatively far from each
other in the color space, which generates very noticeable artifacts (Wang et al., 2005).
The advantage of EzStego is that it gives importance to color models, and after embedding
the image is reconstructed by rearranging the palette (Agaian and Perez, 2004). According
to Fridrich (1999a), the problem with EzStego is that it does not reliably produce good
stego-images, because colors with similar luminance values may be relatively far from each
other. To avoid this problem, the author presented a steganographic method that hides
message bits in the parity bit of close colors by changing the pixel’s index. The algorithm
uses a distance measure to select colors that are close to each other. According to Agaian
and Perez (2004), the algorithm of Fridrich (1999a) has the disadvantages that the
embedding capacity is limited to the size of the index and that it uses the entire image to
find the desired parity bit, which leaves more room for errors.
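The EzStego idea of embedding in the LSB of the luminance-sorted palette index can be sketched as follows. This is a simplified illustration and not Machado's implementation: the luminance weights 0.299, 0.587 and 0.114 are an assumed choice of linear combination, the four-color palette and pixel indices are invented, and the sketch requires an even palette size.

```python
def luminance(rgb):
    """Luminance as a linear combination of R, G, B (weights are an assumption)."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def sorted_order(palette):
    """order[pos] is the palette index at sorted position pos; rank is the inverse."""
    order = sorted(range(len(palette)), key=lambda i: luminance(palette[i]))
    rank = {pal_index: pos for pos, pal_index in enumerate(order)}
    return order, rank


def ezstego_embed(palette, pixels, bits):
    if len(palette) % 2:
        raise ValueError("this sketch assumes an even palette size")
    if len(bits) > len(pixels):
        raise ValueError("message is longer than the cover")
    order, rank = sorted_order(palette)
    stego = list(pixels)
    for k, bit in enumerate(bits):
        pos = (rank[stego[k]] & ~1) | bit    # set LSB of the sorted index
        stego[k] = order[pos]                # map back to an original palette index
    return stego


def ezstego_extract(palette, pixels, n_bits):
    _, rank = sorted_order(palette)
    return [rank[p] & 1 for p in pixels[:n_bits]]


# Hypothetical palette: white, black, red-ish, blue-ish.
palette = [(255, 255, 255), (0, 0, 0), (200, 30, 30), (30, 30, 200)]
pixels = [0, 1, 2, 3, 0, 1, 2, 3]            # palette indices of eight pixels
bits = [1, 0, 1, 1, 0, 0, 1, 0]

stego_pixels = ezstego_embed(palette, pixels, bits)
print(stego_pixels, ezstego_extract(palette, stego_pixels, len(bits)))
```

In this tiny palette the sorted neighbors are (black, blue) and (red, white), so a single embedded bit can turn a red pixel white. This is an exaggerated instance of the artifact described above, where colors adjacent in luminance are far apart in color space.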

The problem with the palette approach used with BMP images is that changing the LSB of a
pixel can result in a completely different color, since the index into the color palette is
changed (Johnson and Jajodia, 1998). Agaian and Perez (2004) further stated that a problem
with methods that embed within the palette is that they do not take into account other
important color models. In addition, the amount of information that can be embedded is
limited, and the hidden message can be destroyed by switching the order of the palette
entries.

4.1.2 Frequency Domain based Steganography

In the frequency domain, cover images are transformed using a frequency-oriented mechanism,
and the secret message is then combined with the coefficients of the transformed image to
achieve embedding. The frequency domain is also known as the transform domain, as the image
is transformed before embedding. Unlike spatial domain techniques, frequency domain based
techniques hide secret data in significant parts of the cover file. Many transforms can be
used to map a signal into the frequency domain. The discrete cosine transform (DCT),
discrete wavelet transform (DWT) and discrete Fourier transform (DFT) are the transforms
commonly used to embed secret data in digital images.

4.1.2.1 Discrete Cosine Transform based Steganography

The DCT algorithm is one of the main components of the JPEG compression technique, and it
can be exploited for information hiding (Morkel et al., 2005). The technique applies lossy
compression to images and thus produces an image with some loss of bits (Fridrich, 2009).
An example of an image format that uses this compression technique is JPEG (Joint
Photographic Experts Group) (Johnson and Jajodia, 1998). JPEG is the most popular and
common image file format on the Internet, and JPEG files are small because of the
compression, which makes it the least suspicious format to use. In frequency domain based
steganography, knowledge of the JPEG compression algorithm, which uses the discrete cosine
transform to transform the image content, is used to embed the secret message (Chang et
al., 2002).

Table 4.1: A block of 8 X 8 pixel values of a cover-image as stated in Jpeg–Jsteg

139 144 149 153 155 155 155 155

144 151 153 156 159 156 156 156

150 155 160 163 158 156 156 156

159 161 162 160 160 159 159 159

159 160 161 162 162 155 155 155

161 161 161 161 160 157 157 157

162 162 161 163 162 157 157 157

162 162 161 161 163 158 158 158

To compress an image into JPEG format, the RGB color representation is first converted to a
YUV representation, and each color plane is partitioned into non-overlapping 8 × 8 blocks
of pixels (Chang et al., 2002; Liu and Liao, 2008; Currie and Irvine, 1996). In this
representation the Y component corresponds to the luminance (or brightness) and the U and V
components correspond to the chrominance (or color) (Li and Wang, 2007). The human eye is
more sensitive to changes in the brightness (luminance) of a pixel than to changes in its
color (Fridrich, 2009). Thus, it is possible to remove a lot of color information from an
image without losing a great deal of quality (Watson, 1994a). JPEG compression therefore
downsamples (or subsamples) the chrominance data, which reduces the overall file size (Cox
et al., 2008). The color components (U and V) are halved in the horizontal and vertical
directions.
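The color conversion and chrominance downsampling steps can be sketched as below. The conversion coefficients are those of the JFIF YCbCr convention, used here as an assumed concrete form of the YUV representation mentioned in the text; the 2 × 2 averaging is likewise one common way of halving a plane in both directions, and the function names and sample plane are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cb, Cr) using JFIF-style coefficients."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def subsample_2x2(plane):
    """Halve a plane horizontally and vertically by averaging 2 x 2 neighborhoods.

    Assumes the plane has even width and height.
    """
    height, width = len(plane), len(plane[0])
    return [[(plane[i][j] + plane[i][j + 1] +
              plane[i + 1][j] + plane[i + 1][j + 1]) / 4
             for j in range(0, width, 2)]
            for i in range(0, height, 2)]


# A neutral gray pixel has no color information: Cb and Cr sit at the midpoint 128.
print(rgb_to_ycbcr(128, 128, 128))

# A hypothetical 4 x 4 chrominance plane reduced to 2 x 2.
chroma = [[1, 3, 5, 7],
          [1, 3, 5, 7],
          [2, 2, 6, 6],
          [4, 4, 8, 8]]
print(subsample_2x2(chroma))
```

After this step each chrominance plane holds a quarter of its original samples, while the luminance plane is kept at full resolution.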

Next, the DCT transforms each block from a spatial representation into a frequency
representation: the pixels are grouped into 8 × 8 blocks, and each block is transformed
into 64 DCT coefficients (Kharrazi et al., 2006; Almohammad et al., 2008). A modification
of a single DCT coefficient affects all 64 image pixels in that block (Morkel et al.,
2005). Each DCT coefficient F(u, v) of an 8 × 8 block of image pixels f(x, y) is given by
(Sheisi et al., 2012):



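$$F(u,v) = \frac{1}{4}\, C(u)\, C(v) \left[ \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16} \right] \qquad (4.1)$$

where $C(u) = 1/\sqrt{2}$ when u = 0 and C(u) = 1 otherwise,

and $C(v) = 1/\sqrt{2}$ when v = 0 and C(v) = 1 otherwise.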

In this case x, y, u, v ϵ {0, 1, …, 7} and f (x, y) is the particular pixel color space

component.
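Equation (4.1) can be checked numerically against the block in table 4.1. The sketch below assumes the standard JPEG forward DCT, with the factor 1/4 and C(0) = 1/√2, and applies it directly to the pixel values without any level shift; the function and variable names are illustrative.

```python
import math

# The 8 x 8 block of pixel values from table 4.1.
BLOCK = [
    [139, 144, 149, 153, 155, 155, 155, 155],
    [144, 151, 153, 156, 159, 156, 156, 156],
    [150, 155, 160, 163, 158, 156, 156, 156],
    [159, 161, 162, 160, 160, 159, 159, 159],
    [159, 160, 161, 162, 162, 155, 155, 155],
    [161, 161, 161, 161, 160, 157, 157, 157],
    [162, 162, 161, 163, 162, 157, 157, 157],
    [162, 162, 161, 161, 163, 158, 158, 158],
]


def dct_8x8(f):
    """Forward 2-D DCT of an 8 x 8 block, with f[x][y] and result F[u][v]."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    F = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            total = 0.0
            for x in range(8):
                for y in range(8):
                    total += (f[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            F[u][v] = 0.25 * c(u) * c(v) * total
    return F


coeffs = dct_8x8(BLOCK)
print(round(coeffs[0][0]), round(coeffs[1][0]))
```

The top-left (DC) coefficient equals one eighth of the sum of all 64 pixel values, which is why it dominates the block.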

Table 4.2: The DCT coefficients formed after transformation of the image block

1260 -1 -12 -5 2 -2 -3 1

-23 -17 -6 -3 -3 0 0 -1

-11 -9 -2 2 0 -1 -1 0

-7 -2 0 1 1 0 0 0

-1 -1 1 2 0 -1 1 1

2 0 2 2 -1 1 1 -1

-1 0 0 -1 0 2 1 -1

-3 2 -4 -2 2 1 -1 0

After the DCT has been applied to each 8 × 8 block, the low frequency coefficients, located
at the top left of the table, have the largest values because they encode the most
important information, while the high frequency coefficients have smaller values (Chang et
al., 2002), as shown in table 4.2. The quantization phase, which is the main lossy
compression step, is performed next. The transformed coefficients are quantized (scaled) in
accordance with the default quantization table of JPEG (Tseng and Chang, 2004).



Table 4.3: The Standard quantization table of JPEG

16 11 10 16 24 40 51 61

12 12 14 19 26 58 60 55

14 13 16 24 40 57 69 56

14 17 22 29 51 87 80 62

18 22 37 56 68 109 103 77

24 35 55 64 81 104 113 92

49 64 78 87 103 121 120 101

72 92 95 98 112 100 103 99

The standard quantization table is listed in table 4.3. It is a matrix of 64 coefficients,
which the user can adjust (Chang et al., 2002). It is the default table for luminance
included in the JPEG specification; the higher the values in the quantization table, the
more detail is eliminated (Almohammad et al., 2008). The aim is to quantize the values that
represent the image after they have been transformed to frequencies (Watson, 1994b). The
quantization process takes the 64 DCT coefficients, divides each of them by the
corresponding value in the quantization table and rounds the result to the nearest integer,
thereby eliminating the redundant frequency coefficients (Liu and Liao, 2008; Li et al.,
2011).
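The quantization step can be reproduced from the values in tables 4.2 and 4.3. The following is a minimal sketch; the names are illustrative, and the only tie in this block (-7/14 = -0.5) is resolved by Python's built-in `round`, which rounds halves to the nearest even integer, an assumption about the rounding rule.

```python
# DCT coefficients from table 4.2.
DCT = [
    [1260,  -1, -12, -5,  2, -2, -3,  1],
    [ -23, -17,  -6, -3, -3,  0,  0, -1],
    [ -11,  -9,  -2,  2,  0, -1, -1,  0],
    [  -7,  -2,   0,  1,  1,  0,  0,  0],
    [  -1,  -1,   1,  2,  0, -1,  1,  1],
    [   2,   0,   2,  2, -1,  1,  1, -1],
    [  -1,   0,   0, -1,  0,  2,  1, -1],
    [  -3,   2,  -4, -2,  2,  1, -1,  0],
]

# Standard JPEG luminance quantization table from table 4.3.
Q = [
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
]


def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round to an integer."""
    return [[round(coeffs[u][v] / table[u][v]) for v in range(8)]
            for u in range(8)]


quantized = quantize(DCT, Q)
nonzero = sum(1 for row in quantized for value in row if value != 0)

for row in quantized:
    print(row)
print("non-zero coefficients:", nonzero)
```

Only six of the 64 quantized coefficients are non-zero, all of them in the low frequency corner, which shows how much information the quantization step discards.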

Table 4.4: The quantized DCT coefficients formed by quantization

79 0 -1 0 0 0 0 0

-2 -1 0 0 0 0 0 0

-1 -1 0 0 0 0 0 0

0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0



The DCT coefficients of each block are scaled using the quantization table, and the secret
message is then embedded in the quantized DCT coefficients of each block (Chang et al.,
2002).

Figure 4.3: Frequency distribution in a DCT block where embedding takes place

After quantization, most of the quantized DCT coefficients (i.e. the DCT coefficients
formed after quantization) are equal to zero, as seen in table 4.4, and it is in these
quantized coefficients that data hiding takes place. Bits of the quantized DCT coefficients
can be replaced with the bits of the secret message. Thus the secret data can be embedded
into the high, middle and low frequency components, as shown in figure 4.3. Such embedding
alters the magnitude of the quantized coefficients in these frequency components.
Manipulations in the high, middle and low frequencies are generally not noticeable to the
human eye and do not cause any major degradation, so data hiding takes place in these
components. This is because the human eye is fairly good at spotting small differences in
brightness over a relatively large area, but not so good at distinguishing between
different strengths of high frequency brightness variation (Katzenbeisser and Petitcolas,
2000). DCT based techniques exploit this by dividing all the values in a block by a
quantization coefficient. The quantization step is lossy because of the rounding error
(Watson, 1994b).
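Embedding in the quantized coefficients can be sketched on the block of table 4.4. The rule used here, replacing the LSB of every coefficient except those equal to 0 or 1, follows the JSteg convention and is an assumption that the text does not spell out; the function names and the six message bits are illustrative.

```python
# Quantized DCT coefficients from table 4.4 (all remaining rows are zero).
QUANTIZED = [
    [79,  0, -1, 0, 0, 0, 0, 0],
    [-2, -1,  0, 0, 0, 0, 0, 0],
    [-1, -1,  0, 0, 0, 0, 0, 0],
] + [[0] * 8 for _ in range(5)]


def usable_positions(block):
    """Positions whose value is neither 0 nor 1 (JSteg-style rule, assumed)."""
    return [(i, j) for i in range(8) for j in range(8)
            if block[i][j] not in (0, 1)]


def embed_dct(block, bits):
    positions = usable_positions(block)
    if len(bits) > len(positions):
        raise ValueError("block cannot hold that many bits")
    out = [row[:] for row in block]
    for (i, j), bit in zip(positions, bits):
        out[i][j] = (out[i][j] & ~1) | bit   # works for negative values as well
    return out


def extract_dct(block, n_bits):
    return [block[i][j] & 1 for (i, j) in usable_positions(block)[:n_bits]]


bits = [1, 0, 1, 1, 0, 0]
stego_block = embed_dct(QUANTIZED, bits)

print(len(usable_positions(QUANTIZED)))   # capacity of this block in bits
print(stego_block[0][:3], stego_block[1][:2], stego_block[2][:2])
print(extract_dct(stego_block, len(bits)))
```

Because LSB replacement never turns a usable value into 0 or 1 (the pairs are 2/3, -2/-1, -4/-3 and so on), the receiver finds the same positions in the stego block. The block carries only six bits, which illustrates why the capacity of this approach is lower than that of spatial LSB embedding.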

After embedding in the quantized DCT coefficients, each 8 × 8 block is left with a few
non-zero coefficients and a large number of zeroes. The modified DCT coefficients are then
arranged in frequency order by the zigzag ordering method (Sheisi et al., 2012). As a
result of the preceding quantization, small unimportant coefficients have been rounded to 0
while larger ones have lost some of their precision (Liu and Liao, 2008). Zigzag ordering
groups similar frequencies together and places as many zeroes as possible next to each
other, so that the block compresses better.
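The zigzag ordering can be sketched as a traversal of the anti-diagonals of the block in alternating directions. The code below is an illustration applied to the quantized block of table 4.4; the function names are assumptions.

```python
def zigzag_indices(n=8):
    """(row, column) pairs of an n x n block in JPEG zigzag order."""
    def key(cell):
        i, j = cell
        diagonal = i + j
        # Odd diagonals run downwards (row increasing), even ones upwards.
        return (diagonal, i if diagonal % 2 else -i)

    return sorted(((i, j) for i in range(n) for j in range(n)), key=key)


def zigzag_scan(block):
    return [block[i][j] for i, j in zigzag_indices(len(block))]


# Quantized DCT coefficients from table 4.4 (all remaining rows are zero).
QUANTIZED = [
    [79,  0, -1, 0, 0, 0, 0, 0],
    [-2, -1,  0, 0, 0, 0, 0, 0],
    [-1, -1,  0, 0, 0, 0, 0, 0],
] + [[0] * 8 for _ in range(5)]

scan = zigzag_scan(QUANTIZED)

# Count the run of zeroes at the end of the scan.
trailing_zeros = 0
for value in reversed(scan):
    if value != 0:
        break
    trailing_zeros += 1

print(scan[:10])
print("trailing zeroes:", trailing_zeros)
```

All non-zero values end up in the first nine positions, followed by a single long run of zeroes that run-length and Huffman coding can represent very compactly.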

After the zigzag process, JPEG entropy coding (which comprises Huffman coding, run-length
coding and DPCM) is applied to each block for further compression (Chang et al., 2002). The
coefficients, already rounded to integer values, are encoded using Huffman coding to
further reduce the size (Currie and Irvine, 1996). The size field for discrete cosine
values is included in the Huffman coding for the other size values, so that JPEG can
achieve compression of the data. Entropy coding is a lossless compression process. After
entropy coding of every block, a JPEG file is obtained that contains a quantization table
and the compressed data (Chang et al., 2002). Finally, the JPEG stego-image is generated.

The secret data can be extracted from a DCT based stego-image in two ways. In the first
method, the JPEG file (stego-image) is entropy decoded using the coding tables (Huffman
tables) located in the image header (Almohammad et al., 2008). The entropy decoding
(inverse JPEG entropy coding) comprises Huffman decoding, run-length decoding and DPCM
decoding. Each block is reconstructed after all the compressed data are decoded (Chang et
al., 2002). After entropy decoding, the 8 × 8 non-overlapping blocks of quantized stego DCT
coefficients are recovered. As long as the decoder knows that the embedding took place in
the DCT domain, it can extract the message successfully. In the second method, the
stego-image is first divided into non-overlapping 8 × 8 blocks of pixels. Next, the
two-dimensional DCT is applied to each block of the stego-image, and the DCT coefficients
are quantized using the quantization table to form the quantized DCT coefficients. The
secret bits are then extracted from the low, middle or high frequency coefficients and
grouped into 8-bit characters to form the secret message.



Figure 4.4: DCT based Steganography during JPEG compression process

The drawback of this technique is that the amount of secret data that can be embedded is
smaller than with spatial domain based techniques, and there is a risk that bits of the
secret data are lost because of the high compression. One of the major characteristics of
steganography is that information is hidden in the redundant bits of an object, and since
redundant bits are discarded by the harsh compression applied, it was originally feared
that the hidden message would be destroyed (Morkel et al., 2005). However, properties of
the compression algorithm have been exploited to develop steganographic algorithms for JPEG
images. It is important to recognize that the JPEG compression algorithm is divided into
lossy and lossless stages: the DCT and the quantization phase form the lossy stage, while
the Huffman encoding used to further compress the data is lossless. Steganography can take
place between these two stages (Johnson and Jajodia, 1998; Morkel et al., 2005). Using this
principle, the secret message can be embedded into the DCT coefficients before the Huffman
encoding is applied (Watson, 1994a). The steganographic embedding can take place before or
after the quantization phase, as shown in figure 4.4. Because the information is embedded
at this stage, in the transform domain, it is extremely difficult to detect, since it is
not in the visual domain (Liu and Liao, 2008). Transform embedding methods are in general
found to be more robust than other embedding methods, which are susceptible to
image-processing types of attack (Li and Wang, 2007).



4.1.2.2 Discrete Wavelet Transform based Steganography

Wavelet transform is used to convert a signal from spatial domain to frequency

domain. Wavelet transform represents an image as a sum of wavelet functions

(wavelets) with different locations and scales and any decomposition of an image into

wavelets involves a pair of waveforms: one to represent the high frequencies

corresponding to the detailed parts of an image (wavelet function) and one for the low

frequencies or smooth parts of an image (scaling function) (Grgic et al., 2001). The

use of wavelets in image steganographic models rests on the fact that the wavelet transform

clearly separates the high frequency and low frequency information on a pixel by

pixel basis (Latef, 2011). High frequencies are transformed with short functions (low

scale) and low frequencies are transformed with long functions (high scale) (Khalifa

et al., 2008). The result of wavelet transform is a set of wavelet coefficients, which

measure the contribution of the wavelets at these locations and scales (Grgic et al.,

2001). The coefficients in this wavelet expansion are called the discrete wavelet

transform (DWT) of the signal. The DWT is based on

sub-band coding, which results in fast computation of the wavelet transform. The discrete

wavelet transform is a very useful tool for signal analysis and image processing,

especially in multi-resolution representation, where it can decompose a signal into different

components in the frequency domain (Khalifa et al., 2008; Audithan and

Chandrasekaran, 2009). DWT for an image as a 2-D signal can be derived from 1-D

DWT and the easiest way for obtaining scaling and wavelet function for two

dimensions is by multiplying two 1-D functions (Grgic et al., 2001).

The simplest form of discrete wavelet transform (DWT) is the Haar-DWT, in which the

low frequency wavelet coefficients are generated by averaging two pixel values

and the high frequency coefficients are generated by taking half of the difference of the

same two pixels (Chen and Lin, 2006; Nag et al., 2011; Latef, 2011). The Haar wavelet is

not continuous, and therefore not differentiable; it is used to convert a spatial domain

image to the wavelet domain (Dey et al., 2012). The operation for Haar DWT has been

applied to image processing especially in multi-resolution representation (Audithan

and Chandrasekaran, 2009).


For a function f, the HWT (Haar wavelet transform) is defined as (Chen and Lin,

2006):

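\[ f \mapsto \left( a^{L} \mid d^{L} \right) \tag{4.2} \]

\[ a^{L} = \left( a_{1}, a_{2}, \ldots, a_{N/2} \right) \]

\[ d^{L} = \left( d_{1}, d_{2}, \ldots, d_{N/2} \right) \]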

where L is the decomposition level, a is the approximation subband and d is the detail

subband.
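As a small worked illustration of the averaging and half-difference form described above, the following Python sketch computes one level of the 1-D Haar transform and its inverse. The function names are hypothetical, and the normalization (division by 2 rather than by the square root of 2, which some texts use) is an assumption that follows the verbal description in this section.

```python
def haar_1d(signal):
    """One level of the 1-D Haar transform: pairwise averages and half-differences."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    approximation = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approximation, detail


def inverse_haar_1d(approximation, detail):
    """Rebuild the signal: each pair is (a + d, a - d)."""
    signal = []
    for a, d in zip(approximation, detail):
        signal.extend([a + d, a - d])
    return signal


a, d = haar_1d([8, 4, 6, 10])        # a == [6.0, 8.0], d == [2.0, -2.0]
restored = inverse_haar_1d(a, d)     # [8.0, 4.0, 6.0, 10.0]
```

The approximation list holds the smooth part of the signal and the detail list holds what must be added back to recover it exactly.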

Figure 4.5: Two dimensional wavelet transformation of an image

To apply HWT on images, a one level Haar wavelet is first applied to each row and

secondly to each column of the resulting image of the first operation (Dey et al.,

2012). The DWT is computed by successive low-pass and high-pass filtering of the

discrete time-domain signal, which decomposes it into four classes of band coefficients

(Khalifa et al., 2008). Its significance is in the manner it connects the continuous-time

multiresolution to discrete-time filters. For 2-D images, applying DWT separates the

image into a lower resolution approximation band (LL) and three higher

frequency detail bands: the horizontal band (HL), the vertical band (LH) and the

diagonal band (HH), as shown in figure 4.5 (Audithan and Chandrasekaran, 2009; Nag

et al., 2011; Latef, 2011). The approximation band (LL) consists of low frequency

wavelet coefficients, which contain the significant (smooth) parts of the spatial

domain image. Thus embedding in the low frequency sub-band may degrade the

image significantly. The other bands (HH, HL, and LH), also called detail

bands, consist of high frequency coefficients, which contain the edge and texture

details of the spatial domain image (Audithan and Chandrasekaran, 2009; Nag et al.,

2011). The human eye is generally not sensitive to changes in the edges and

textures carried by the high frequency sub-bands. Thus data hiding takes

place in the high frequency sub-bands (HH, HL, and LH) by modifying high

frequency wavelet coefficients (Latef, 2011; Dey et al., 2012). The overall process is

called the one-level 2-D Haar-DWT. With this approach, the time resolution becomes

arbitrarily good at high frequencies, while the frequency resolution becomes

arbitrarily good at low frequencies (Grgic et al., 2001).
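The row-then-column procedure described above can be sketched in Python as follows. This is a minimal illustration for images with even width and height, stored as lists of rows. The function names are hypothetical, and the sub-band labels assume that the first letter refers to the filter applied along the rows and the second to the filter applied along the columns; other texts label HL and LH the other way round.

```python
def haar_pairs(values):
    """Pairwise averages (low band) and half-differences (high band)."""
    half = len(values) // 2
    low = [(values[2 * i] + values[2 * i + 1]) / 2 for i in range(half)]
    high = [(values[2 * i] - values[2 * i + 1]) / 2 for i in range(half)]
    return low, high


def transform_columns(block):
    """Apply haar_pairs to every column of a block and return two blocks."""
    pairs = [haar_pairs(list(column)) for column in zip(*block)]
    low = [list(row) for row in zip(*(p[0] for p in pairs))]
    high = [list(row) for row in zip(*(p[1] for p in pairs))]
    return low, high


def haar_2d(image):
    """One-level 2-D Haar-DWT: rows first, then columns of each half."""
    if not image or len(image) % 2 or len(image[0]) % 2:
        raise ValueError("image must have even, non-zero width and height")
    row_low, row_high = [], []
    for row in image:
        low, high = haar_pairs(row)
        row_low.append(low)
        row_high.append(high)
    ll, lh = transform_columns(row_low)    # low-pass along the rows
    hl, hh = transform_columns(row_high)   # high-pass along the rows
    return ll, hl, lh, hh


ll, hl, lh, hh = haar_2d([[1, 2], [3, 4]])
```

For a perfectly flat image all three detail bands are zero, which is why they are the natural place to hide data: in natural images they are small and carry only edge and texture information.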

In the process of extraction, firstly, the modified coefficients matrix is obtained by

applying 2D-Haar DWT to the stego-image that separates all the four sub-bands that

are LL, HL, LH and HH (i.e. the high and low frequency information). Then the

coefficients of the three high frequency sub-bands (HH, HL, and LH) are extracted to

get the secret bits.

The main limitation of such methods is that, because the embedding takes place in the

frequency domain, the hiding capacity is reduced, since the transform achieves high compression.

Secondly, the decomposition into sub-bands that is part of the wavelet transform

increases the time complexity of the process, which is higher

than that of spatial domain embedding.

4.2 Image Steganalysis

Steganalysis is the art and science of detecting whether a given digital image contains

hidden data. Steganalysis involves selecting the features or properties of

the image to test for hidden data and designing techniques to detect or extract

the hidden data. A steganalysis method is considered successful if it can detect

and extract the embedded hidden data (Katzenbeisser and Petitcolas, 2000).

Steganalysis can be termed as a method of attacking the digital media for estimating

whether the media contains secret data embedded in it. Thus it can serve as an

effective way to judge the security performance of steganographic techniques. The

steganalyst (one who performs steganalysis) is assumed to control the

transmission channel and to watch it for suspicious data. In practice, the steganalyst is

frequently more interested in verifying whether or not a secret message is present in a

medium (Fridrich et al., 2003b).


The objectives of steganalysis are:

To detect the existence of a secret message in a binary image. The suspect

image may or may not have hidden data encoded into it.

To evaluate techniques that can be used to distinguish images carrying

secret messages from those without. Some of the suspect images may have

noise or irrelevant data encoded into them.

To identify the type of steganographic method used to create the

stego-image by trying to understand the internal mechanism used during the

embedding operation.

To recover the hidden data, and not merely to detect the stego-image.

To estimate the length of the hidden message and the location of the

pixels bearing it.

To estimate the relative number of embedding

changes in the digital image.

4.2.1 Types of Attacks

While the purpose of steganography is to hide messages, there exist several attacks

that one may execute to test for steganographic data. The strength of a steganographic

algorithm depends on its ability to successfully withstand attacks. Attacks and

analysis of hidden data may take several forms: detecting, extracting, disabling or

destroying hidden data. An attack is dependent on what information is available to the

steganalyst. Attacking steganographic algorithms is very similar to attacking

cryptographic algorithms, and similar techniques apply (Wayner, 2009). There are six

general protocols used to attack the use of steganography, as pointed out by

Katzenbeisser and Petitcolas (2000). These are as follows:

Stego-only attack: Only the steganography medium/object is available for analysis.


Known-carrier attack: The carrier, that is, the original cover, and steganography

media/object are both available for analysis or are known.

Known-message attack: In this case, the hidden message is known and can be

compared with the stego-object/medium.

Chosen-stego attack: The steganography medium/object and tool (algorithm) are

both available for analysis.

Chosen-message attack: Here a chosen message and a steganography tool (or

algorithm) are used to create steganography media for future analysis and comparison.

Known-steganography attack: The secret message, steganography medium/object

and the steganography tool (algorithm) are known and available for analysis.

Steganography elimination techniques are a form of steganalysis that tries to

eliminate or destroy the hidden information, since the purpose is to break the covert

communication. The most common attacks of this kind are (Katzenbeisser

and Petitcolas, 2000):

Destroy everything attack – this type of attack aims at destroying the message

completely, and the attacker might not even try to retrieve the message.

Random tweaking attacks – here small changes are made to the file so that the

message becomes unreadable.

Add new information – in some cases the attackers might use the same technique of

data hiding to embed a new message into the stego-file. The original message might

be overwritten.

Reformat attack – a common way to destroy the information hidden in a file is by

changing the file format. This type of attack can do considerable damage to the

hidden message.

Compression attack – the attacker might compress the file which might result in the

total loss of the secret message embedded in the file.


The attacks presented above are ways to destroy the hidden message. In all

such cases, the attack should be directed at a suspected image; otherwise an

attack may be performed on an innocent image that does not contain any secret data.

For this reason, certain attacks are first used in steganalysis to evaluate whether the

image contains hidden data.

4.2.2 Image based Steganalysis Techniques

Figure 4.6: Classification of image steganalysis techniques

Steganalysis can be classified into targeted methods and blind methods, as shown in

figure 4.6 (Patil et al., 2012). Targeted steganalysis uses knowledge about the

steganographic technique to detect stego-images created with that specific technique,

while blind steganalysis aims to decide whether an image contains hidden

information without any prior knowledge of the steganographic technique used.

Blind and targeted steganalysis techniques have been studied extensively on digital images

(Fridrich et al., 2001).

4.2.2.1 Targeted Steganalysis

Targeted steganalysis techniques are designed around the mechanism of a particular embedding

operation and fully utilize that knowledge to detect steganography. A

targeted steganalysis technique works on a specific type of known stego-system and

is sometimes limited to a particular image format (Chandramouli et al., 2004). By studying and

analyzing the embedding algorithm, one can find image statistics that change after

embedding. The results from targeted steganalysis techniques can be accurate,

but the techniques are inflexible, since most of the time there is no way to extend

them to other embedding algorithms. Targeted steganalysis can be of three types:

visual, statistical and structural attacks (Patil et al., 2012).

(a) Visual attacks

Visual attacks are the simplest form of steganalysis and involve examining the stego-image

with the naked eye to identify any kind of degradation (Patil et al., 2012). Ideally, a

steganographic method does not leave any visible distortion on the image file

due to modification of bits. The visual attack therefore exploits the ability of humans to

distinguish between noise and visual patterns, and can be implemented by picking on

different properties of the image. For example, a visual attack could be set up to

display the LSB plane of the spatial domain image on its own. A steganalyst

searches for inconsistencies in order to classify an image either as a stego-image

or a normal image, although such inconsistencies depend on the way the data is embedded

in the cover-image. Similarly, the steganalyst could also attack in the transform

domain to evaluate whether or not the image contains signs of transform embedding

(Westfeld and Pfitzmann, 2000).
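A visual attack on the least significant bits can be prepared with very little code. The sketch below is an illustration only: it assumes a greyscale image held as a list of rows of integer pixel values, and the function name and the scaling to 0 and 255 are choices made here for display purposes.

```python
def lsb_plane(image, scale=255):
    """Return an image that shows only the least significant bit of every pixel.

    Each output pixel is 0 where the LSB is 0 and `scale` where it is 1, so the
    bit plane can be viewed as a black-and-white picture.
    """
    return [[(pixel & 1) * scale for pixel in row] for row in image]


plane = lsb_plane([[10, 11], [255, 0]])   # [[0, 255], [255, 0]]
```

In an unmodified image the resulting picture often still shows traces of the original content, whereas a region overwritten with message bits tends to look like uniform noise.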

On the other hand, it is much harder to perform a visual attack on randomized

embedding, as the data are embedded in random pixels of the image, so it becomes

much more difficult to identify the regions that have been altered.

Visual attacks can be a useful tool for known cover attacks. When the

cover image is not available to the steganalyst, a visual attack depends on three

conditions holding true in order to succeed: the message must be embedded in

sequential order, its length must be less than the maximum size of the bit plane, and it

must not be encrypted. If these conditions fail, it is no longer possible to see a change in form in the bit

plane, so the steganalyst finds it harder to classify the image as a stego-image. In particular,

encrypting the message considerably reduces the chance of success of a visual attack

when the cover-image is not available.

For a visual attack it is essential to determine which features of the image

can be ignored and which must be taken into consideration in order to

mount a valid attack that tests whether the suspected image

contains a secret message. The success of a visual attack varies significantly depending

on the steganographic method applied and the format of the image. As different

embedding techniques may have been used, examining the properties of several image

formats is not sufficient. The cover-image or the steganographic technique is required

in order to detect the distorted regions for a successful attack. Testing images against

various methods of embedding thus proves time-consuming. This is an

inefficient methodology, and the main drawback of the attack is the fact that it

cannot be automated.

(b) Statistical Attacks

In this type of attack, a statistical analysis of the image is performed using a mathematical

formula to detect the presence of hidden data. The statistical attack is

partially similar to the visual attack. Generally the hidden message is more random than

the original data of the image, so a formula that measures this randomness

reveals the existence of hidden data (Wayner, 2009). A theory is constructed that seemingly

explains why the phenomenon occurs, and statistical methods are used to test whether this

theory is true or false. Statistical tests try to reveal whether an image has

been modified by determining whether the image’s statistical properties deviate from a norm.

Some tests are independent of the data format and just measure the entropy of the

redundant data (Provos and Honeyman, 2003). There are methods that try to detect

the existence of a hidden message via statistical approaches by identifying signs of

embedding for specific stego-systems. Chi-square analysis is one such

statistical attack.

Chi-square Analysis

Westfeld and Pfitzmann (2000) outlined a statistical attack based on the observation that,

for a given image, the embedding of data changes the histogram of color frequencies

in a particular way. In their case, the embedding process changes the least significant

bits of the colors in an image, where the colors are addressed by their indices in the

color table. The difference between the frequencies of adjacent color indices (indices

that differ only in the least significant bit) is therefore larger before embedding than

after embedding, because the embedding process reduces the frequency difference between

adjacent colors. Westfeld and Pfitzmann (2000)

used a Chi-square (χ2) test to determine whether the color frequency distribution in an

image matches a distribution that shows distortion from embedding data, by computing the

probability of the test statistic under the condition that the observed frequency

distribution and the distribution expected after embedding are equal. They increased the

sample size and applied the test at a constant position.

According to Provos and Honeyman (2003), it is possible to extend Westfeld and

Pfitzmann’s Chi-square test to be more sensitive to partial distortions in an image, i.e.

in the DCT coefficients of a JPEG image. According to them, two identical distributions

produce about the same chi-square values in any part of the distribution. Instead of

increasing the sample size and applying the test at a constant position, they used a

constant sample size but slid the position where the samples are taken over the entire

range of the image. They stated that the expected distribution for the chi-square test

has to be computed from the image itself, by taking the arithmetic mean of the frequencies

of each pair of adjacent values, and then compared

against the observed distribution.
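The statistic behind both variants of the test can be sketched in a few lines of Python. The sketch below is a simplified illustration, not the implementation of either cited work: it assumes the samples (color indices or DCT coefficient values) are non-negative integers in a flat list, the function names are hypothetical, and the conversion of the statistic into a probability is omitted because it requires the chi-square distribution function, which the Python standard library does not provide.

```python
from collections import Counter


def chi_square_pairs(values):
    """Chi-square statistic over pairs of values (2k, 2k+1).

    The expected frequency of each pair is the arithmetic mean of the two
    observed frequencies; the observed frequency is that of the even value.
    Returns the statistic and the degrees of freedom (number of pairs - 1).
    """
    histogram = Counter(values)
    statistic = 0.0
    pairs = sorted({v & ~1 for v in histogram})
    for even in pairs:
        observed = histogram[even]
        expected = (histogram[even] + histogram[even + 1]) / 2
        statistic += (observed - expected) ** 2 / expected
    return statistic, max(len(pairs) - 1, 0)


def sliding_chi_square(values, window, step):
    """Constant sample size, moved over the data (sliding-window variant)."""
    return [chi_square_pairs(values[start:start + window])[0]
            for start in range(0, len(values) - window + 1, step)]


# Illustrative data: unequal pairs in the cover, equalized pairs after embedding.
cover = [10] * 90 + [11] * 10 + [20] * 70 + [21] * 30
stego = [10] * 50 + [11] * 50 + [20] * 50 + [21] * 50
cover_stat, _ = chi_square_pairs(cover)   # 40.0
stego_stat, _ = chi_square_pairs(stego)   # 0.0
```

A statistic close to zero means the pair frequencies are nearly equal, which is the pattern that LSB embedding tends to produce; practical implementations usually also merge or skip pairs whose expected frequency is very small.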

(c) Structural Attacks

Structural attacks are designed to take advantage of the high-level properties that are

known to exist for a particular steganographic algorithm (Patil et al., 2012). Structural

attacks rarely analyze each image on its own merits. Instead, the images are scanned

to see if they contain any of the known side-effects for various steganographic

algorithms. Images that contain these properties are often subjected to further

investigation. There are sometimes cases where the image may possess signs of

steganography while it may be perfectly innocent. This is why a more detailed

investigation is done in a structural attack. A common element of structural detectors is

to estimate features such that a macroscopic cover property can be approximated from the

stego object: the effects of embedding are inverted as a function of these features until the result

matches the cover assumptions best (Fridrich et al., 2003b). A successful structural attack

relies on being able to identify a distinct difference between the cover-image and a

stego-image, which means that there is a heavy reliance on either knowing the cover-

image or knowing the embedding details of the steganographic algorithm and

evaluating the consequences of the embedding strategy. It is rarely the case that a

steganalyst will have access to one of these, and even rarer for them to have access to

both, which hampers the success of the attack. Structural attacks are not used as

a means of proving that an image contains steganography; rather, they highlight

images that contain signs of embedding. RS (Regular and Singular groups) analysis

and Pairs analysis are examples of structural attacks.

Regular and Singular groups (RS) Analysis

RS steganalysis is used to estimate the length of the embedded message on a digital

image for LSB steganographic methods. RS steganalysis was introduced by Jessica

Fridrich and others (Fridrich et al., 2001; Fridrich et al., 2003b) for exploiting the

correlation of images in the spatial domain. They stated that lossless capacity reflects

the fact that the LSB plane – even though it looks random – is related to the other bit

planes and the method is based on the fact that the content of each bit plane of an

image is correlated with the remaining bit planes. In RS analysis the image is

partitioned into groups of pixels of a fixed shape.

Each group is classified as ‘regular’ or ‘singular’

depending on whether the pixel noise within the group (as measured by the mean

absolute value of the differences between adjacent pixels) is increased or decreased

after flipping the LSBs of a fixed set of pixels within each group (the pattern of pixels

to flip is called the ‘mask’). The classification is repeated for a dual type of flipping.

They stated that theoretical analysis and experimentation show that the

proportions of regular and singular groups form curves that are quadratic in the amount of

message embedded by the LSB method. Under this assumption, the proportions of

regular and singular groups with respect to the standard and dual flipping

can be used to estimate the proportion of the image in which data is

hidden. The estimate can be accurate (often within 1%), but fails when this

assumption does not hold.
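The classification step of RS analysis can be sketched as follows. This is a minimal Python illustration under stated assumptions: pixels are taken from a flat list and grouped in consecutive runs of four, the mask (0, 1, 1, 0) is one common choice and not prescribed by the text, and the function names are hypothetical. The final step, which fits the quadratic curves to estimate the message length, is omitted.

```python
def flip_positive(x):
    """Standard LSB flipping: 0<->1, 2<->3, 4<->5, ..."""
    return x ^ 1


def flip_negative(x):
    """Dual (shifted) flipping: -1<->0, 1<->2, 3<->4, ..."""
    return ((x + 1) ^ 1) - 1


def noise(group):
    """Sum of absolute differences between adjacent pixels in a group."""
    return sum(abs(b - a) for a, b in zip(group, group[1:]))


def classify(group, mask, flip):
    """'regular' if flipping the masked pixels raises the noise, 'singular' if it lowers it."""
    flipped = [flip(p) if m else p for p, m in zip(group, mask)]
    before, after = noise(group), noise(flipped)
    if after > before:
        return "regular"
    if after < before:
        return "singular"
    return "unusable"


def rs_proportions(pixels, mask=(0, 1, 1, 0)):
    """Proportions of regular/singular groups for the standard (+) and dual (-) flipping."""
    size = len(mask)
    groups = [pixels[i:i + size] for i in range(0, len(pixels) - size + 1, size)]
    counts = {"R+": 0, "S+": 0, "R-": 0, "S-": 0}
    for group in groups:
        for sign, flip in (("+", flip_positive), ("-", flip_negative)):
            kind = classify(group, mask, flip)
            if kind == "regular":
                counts["R" + sign] += 1
            elif kind == "singular":
                counts["S" + sign] += 1
    total = len(groups) or 1
    return {name: count / total for name, count in counts.items()}
```

In a typical cover image the proportions for the standard and dual flipping are expected to be close to each other, and LSB embedding pulls the standard pair together while the dual pair drifts apart; this gap is what the estimator exploits.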

Pairs analysis

Pairs Analysis is a steganalysis technique that detects the data hidden in palette

images by analyzing the LSBs of indices (Fridrich et al., 2003b; Fridrich et al.,

2003a). The principle of Pairs Analysis is based on color pairs. Pairs Analysis first

splits an image into color cuts, scanning through and selecting only pixels which fall

into each pair of values (0, 1), (2, 3), and so on. The color cuts are concatenated into a

single stream and the homogeneity of the LSBs is measured. Repeating with the

alternative pairs of values (255, 0), (1, 2), (3, 4), etc., one can show that the function

defined by the difference between the two homogeneity measures is quadratic in the

amount of embedded data. Under the assumption that natural images have no

difference in homogeneity, one can deduce the amount of

embedded data in an image, and this estimate forms the statistic which is used to

distinguish the cases of hidden data present and absent. However, the method is not

reliable for images for which the assumption of equal homogeneity does not hold.
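The two homogeneity measures can be illustrated with a simplified Python sketch. Several details are assumptions made here for illustration: the pixels are a flat list of values in the range 0 to 255, the first value of each pair is written as bit 0 and the second as bit 1, the cuts are concatenated in increasing order of pair, and homogeneity is taken as the fraction of adjacent bits that are equal. The function names are hypothetical, and the quadratic estimation of the message length is omitted.

```python
def color_cut_stream(pixels, shifted=False, levels=256):
    """Concatenate the color cuts for pairs (0,1), (2,3), ... or, if shifted,
    for pairs (255,0), (1,2), (3,4), ..."""
    offset = 1 if shifted else 0
    stream = []
    for k in range(levels // 2):
        first = (2 * k - offset) % levels
        second = (first + 1) % levels
        for pixel in pixels:              # one scan of the image per pair
            if pixel == first:
                stream.append(0)
            elif pixel == second:
                stream.append(1)
    return stream


def homogeneity(stream):
    """Fraction of adjacent positions in the stream that hold equal bits."""
    if len(stream) < 2:
        return 0.0
    equal = sum(1 for a, b in zip(stream, stream[1:]) if a == b)
    return equal / (len(stream) - 1)


pixels = [0, 0, 1, 1, 2, 3, 2, 3]                  # illustrative values only
standard = homogeneity(color_cut_stream(pixels))
shifted = homogeneity(color_cut_stream(pixels, shifted=True))
difference = standard - shifted
```

The quantity of interest is the difference between the two measures, which the method assumes to be zero for natural images.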

4.2.2.2 Blind Steganalysis

Blind steganalysis is an approach to detecting a secret message embedded in a file

even when it is not known how the information might have been embedded. Blind

steganalysis does not require prior knowledge about details of the embedding operations

(Luo et al., 2008; Chandramouli et al., 2004). Blind steganalysis therefore works

differently from targeted steganalysis, because it assumes that nothing is known about

either the algorithm or the cover image that was used to produce a suspect image. It

tries to detect any steganographic tool, known or unknown in advance, using

statistical moments as features for steganalysis (Chandramouli, 2003). The

attacks attempt to evaluate the probability of embedding based solely on the data of

the suspect image. Such approaches are more likely to be common in real-world

steganalysis.

Blind identification methods pose the steganalysis problem as a system identification

problem: the embedding algorithm is represented as a channel, and the goal is to

invert this channel to identify the hidden message (Chandramouli and Subbalakshmi,

2004). In such steganalysis methods, each image is analyzed individually based on the

computed statistics. Digital images are known to be statistically non-stationary, which

causes practical issues in implementing algorithms based on the blind

identification model, which assumes stationarity of data (Chandramouli, 2003). When

the stationarity condition is violated additional effort is needed to make steganalysis

work. If the message embedding algorithm is nonlinear then the blind identification

problem becomes more difficult (Chandramouli et al., 2004). Perhaps the most

important aspect of blind steganalysis is ensuring that one can derive an estimate of

the cover image which is as accurate as possible. The attacks that follow this


procedure often compare the data in the estimated cover image to that of the suspect

image.

Some of the methods that belong to the blind steganalysis schemes are discussed

below:

(a) Self-calibration mechanism: The calibration process, proposed by Fridrich et al. (2002), is used by blind steganalysis

schemes to estimate the statistics of the cover image from the stego image in the case of

JPEG. It depends on the fact that JPEG

based stego-systems encode the message data in the transform domain during the

compression procedure that produces the JPEG stego-image: the

image is transformed in 8x8 blocks, and it is within these blocks that the secret data are encoded. The

idea of calibration is to estimate marginal statistics of the cover-image’s transformed

domain coefficients from the stego-image by desynchronising the block transform

structure in the spatial domain. A stego-image in transformed domain representation

is first converted to spatial domain then it is cropped by a small number of pixels at

two orthogonal margins (e.g. cropped by 4 rows and 4 columns) and then re-encoded

in the JPEG format. The calibration is done by taking the feature differences of the

cropped image with the original image. Visually and technically, the calibrated image

is compared with the stego image based on statistical measures such as PSNR,

Histogram etc.

(b) Features capturing cover memory (Sullivan et al., 2006; Fu et al., 2006): Most

steganographic schemes hide data on a per-symbol basis, and typically do not

explicitly compensate or preserve statistical dependencies. Hence, features that

capture higher dimensional dependencies in the cover symbols are crucial in detecting

the embedding changes. Cover memory has been shown to be very important to

steganalysis and is incorporated into the feature vector in several ways.

(c) Supervised learning based steganalysis (Avcibas et al., 2003; Lyu and Farid,

2004): Supervised learning based steganalysis techniques employ a two phase

strategy: (a) a training phase and (b) a testing phase. In the training phase, a

classifier is constructed to differentiate between stego and non-stego images using training

examples. The learning classifier iteratively updates its classification rule based on its

prediction and the ground truth. In the testing phase unknown images are given as


input to the trained classifier to decide whether a secret message is present or not.

However the choice of proper features to train the classifier upon is a critical step. If

the selected features are not appropriate for the specific embedding algorithm then the

detector may completely fail. There is no systematic rule for feature and parameter

selection. It is extremely difficult or even impossible to identify portions of the image

where a message is hidden.
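The two phases can be illustrated with a toy Python sketch. A perceptron is used here only because its update rule is short and matches the description of a classifier that corrects itself from its prediction and the ground truth; it is not the classifier used in the cited works. The two features per image and all numeric values are invented for illustration and are not measurements.

```python
def score(weights, bias, features):
    return sum(w * x for w, x in zip(weights, features)) + bias


def train_perceptron(samples, labels, epochs=20, rate=0.1):
    """Training phase: adjust the rule whenever the prediction disagrees with the label."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(samples, labels):
            predicted = 1 if score(weights, bias, features) > 0 else 0
            error = label - predicted          # 0 when the prediction is right
            weights = [w + rate * error * x for w, x in zip(weights, features)]
            bias += rate * error
    return weights, bias


def predict(model, features):
    """Testing phase: 1 means 'stego', 0 means 'cover'."""
    weights, bias = model
    return 1 if score(weights, bias, features) > 0 else 0


# Hypothetical feature vectors (two made-up statistics per image).
covers = [(0.20, 0.90), (0.30, 0.80), (0.25, 0.85)]
stegos = [(0.90, 0.10), (0.80, 0.20), (0.85, 0.15)]
model = train_perceptron(covers + stegos, [0, 0, 0, 1, 1, 1])
```

If the chosen features do not separate the two classes, no amount of training will help, which is the point made above about feature selection being the critical step.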

Steganalysis mechanisms can be used to analyze the embedding performance of

steganographic techniques. Each of the steganalysis techniques presented works only on a

specific format, and thus there is no universal steganalysis technique. It is up to the

user (steganalyst) to choose an appropriate methodology based on the information that

is available, and each of these techniques has its pros and cons. A successful

attack relies on being able to identify a distinct difference between the cover-image

and a stego-image, which means that there is a heavy reliance on either knowing the

cover-image or knowing the embedding details of the steganographic algorithm. It is

rarely the case that a steganalyst will have access to one of these, and even rarer for

them to have access to both. As different embedding techniques may have

been used, examining the properties of several image formats is not sufficient, and

it is also time-consuming or completely infeasible. Also, the analysis may give

a significant number of false positives: there are cases where an image may

possess signs of steganography when it is perfectly innocent. Given the

enormous number of images present on the internet, it is very difficult to judge

whether an image is a stego-image or not.

4.3 Performance Metrics of Image Steganography

In order to examine the performance of a steganographic system or technique, an

evaluation scheme for steganographic systems is needed. Currently, no standard test

or measure is available in order to evaluate the performance or the effectiveness of

steganographic systems. However, there are some guidelines and general procedures

that can be considered when evaluating or designing steganographic systems (Cox et

al., 2008). Basically, steganographic systems have two fundamental characteristics

which must be investigated in order to evaluate the system. The security or


undetectability and the hiding capacity are the most important requirements that must

be addressed in every steganographic system (Wang and Wang, 2004). Thus, the

effectiveness of a steganography technique can be measured using two key principles:

the amount of data that can be embedded and the difficulty of detection of this data

(Cole and Krutz, 2003). Therefore, measuring these two characteristics determines the

superiority of a steganography technique over another. Thus, designing information

hiding algorithms that are statistically undetectable and can hide a large amount of

data is the main goal of steganography (Cox et al., 2008).

Generally, a steganographic system fails if an attacker is able to prove the existence of

a secret message or if the embedding technique arouses the suspicion of attackers. If a

steganographic algorithm leaves a trace during embedding, then it can be detected

through statistical analysis. For an algorithm to be statistically undetectable, it should

be impossible for a warden to statistically prove the existence of hidden information.

4.3.1 Measure of Steganographic Capacity

Fundamentally, the capacity of a steganographic system is used as one of the evaluation

criteria and is defined as the amount of information that can be hidden within the

cover image. Capacity is the most important parameter, since the amount of

secret information has a direct impact on a steganographic system. The

capacity of a steganography technique is therefore evaluated as the maximum number of bits

that can be embedded in a given cover image with a negligible probability of

detection by an adversary. Moreover, the size of the hidden information relative to the

size of the cover image is known as embedding rate or capacity (Venkatraman et al.,

2004). The steganography embedding operation needs to preserve the statistical

properties of the cover image in addition to its perceptual quality. Steganographic

systems used mainly for secret communication aim to maximize the steganographic

capacity and minimize the perceptibility of hidden messages in stego images (Wang and

Wang, 2004). Cole and Krutz, (2003) stated that ‘the more data you can hide, the

better the technique’. However, the steganographic capacity tends to be restricted by

the size of cover files (Artz, 2001; Rabah, 2004). Therefore, designing a

steganography technique should take into consideration how to increase the amount of

secret data that can be embedded without affecting the properties of stego-image.


Additionally, improving the stego image quality while maintaining the steganographic

capacity is also considered a significant contribution (Wu and Hwang, 2007).
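The embedding rate defined above reduces to simple arithmetic, shown here as a short Python sketch. The function names are hypothetical, and expressing the rate in bits per sample (one sample per pixel and color channel) is a convention assumed here and not stated in the text.

```python
def lsb_capacity_bits(width, height, channels=1, bits_per_sample=1):
    """Upper bound on payload size when a fixed number of LSBs per sample is used."""
    return width * height * channels * bits_per_sample


def embedding_rate(payload_bits, width, height, channels=1):
    """Size of the hidden information relative to the size of the cover image."""
    samples = width * height * channels
    if samples <= 0:
        raise ValueError("cover image must contain at least one sample")
    return payload_bits / samples


capacity = lsb_capacity_bits(512, 512)        # 262144 bits for a 512x512 greyscale cover
rate = embedding_rate(131072, 512, 512)       # 0.5 bits per pixel
```

The upper bound says nothing about detectability; the usable capacity in the sense of this section is the largest payload that still keeps the probability of detection negligible.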

4.3.2 Measure of Robustness

When determining the robustness of algorithms against image manipulation attacks, a

distinction can again be made between the cover-image and the stego-image, as

the embedding of data can result in changes to the bits of the image data and

distortion can take place due to embedding. During communication of a stego-image

between authorized parties, the image may undergo changes by an attacker in an

attempt to remove hidden information. It is thus important for steganographic

algorithms to be robust against malicious as well as unintentional changes to the

image. However, the design of most steganographic systems does not consider

robustness as a fundamental requirement, since the majority of these systems assume

the passive warden scenario (Cox et al., 2008). Hence, steganographic systems are

either not robust against modifications or have limited robustness against technical

modifications.

4.3.3 Measure of Imperceptibility

The invisibility of the embedded information is the first and foremost evaluation

criterion, since the strength of image steganography lies in its ability to go unnoticed by the

human eye. If any changes to the bits of the image lead to visual distortion which

becomes noticeable, then the overall objective of the steganographic method fails. On

the other hand, if the level of invisibility is high in image steganography algorithms

then the overall objective of the approach is fulfilled. Thus, for better evaluation and

comparison it is necessary to consider the perceptibility of the resultant

image in the evaluation process. Methods or techniques that can be used to evaluate

the undetectability or imperceptibility of steganographic systems are different from

one system to another depending on the type of cover file used for data hiding.

Two types of perceptibility can be distinguished and evaluated in image processing

systems, namely fidelity and quality (Stoica et al., 2003). Fidelity is the perceptual

similarity between images before and after processing. For image based

steganography, the fidelity is defined as the perceptual similarity between the original



cover image and the stego image. Therefore, the fidelity evaluation requires both

versions of the image before and after embedding. On the other hand, attackers or

recipients do not have access to the original cover image. Additionally,

steganographic systems must avoid attracting the attention of anyone not involved in

the secret communication process and therefore stego images must have very good

quality. Quality is therefore the major perceptual concern for most steganography

techniques in order to avoid any suspicion and therefore detection (Cox et al., 2008).

Even though the PSNR (peak signal-to-noise ratio) and the mean square error (MSE)

are by definition fidelity metrics, they also act as quality measures, since they

represent perceptual distance metrics used to measure the distortion of an image

(Stoica et al., 2003; Wang et al., 2002b). Accordingly, a high quality image entails a

large PSNR value, meaning that the cover image and the stego image are very similar

and nearly indistinguishable (Yu et al., 2007; Cheddad et al., 2010). Significantly,

‘Fidelity’ is defined as the perceptual quality of stego files and therefore PSNR and

MSE describe how imperceptible the secret message is (Cox et al., 2008).

The level of statistical undetectability of an image steganography algorithm is

determined by the amount of noticeable difference between the cover-image and the

stego-image; it is therefore very important that no visual difference appears

between the two images. In most cases of steganography,

the term of security is usually equivalent to undetectability. Therefore, secure

steganographic systems refer to imperceptible steganographic systems (Cox et al.,

2008). Chang et al. (2002) stated that ‘The better quality the stego image has, the

more secure the steganography system will be’. Thus, an imperceptible steganographic

system means that the hidden information cannot be perceived by the human visual

system or detected by statistical means. Nevertheless, hiding secret information in a cover

image may introduce some noise or modulate this cover image (Venkatraman et al.,

2004). So, the embedding process must not degrade the perceived quality of stego

image in order to get a secure steganographic system.

4.3.3.1 Evaluating the Quality of the Images

A steganographic method is considered secure if it is difficult for attackers to detect

the presence of hidden data in the stego files by using any accessible means.



Additionally, the hidden message must be invisible both perceptually and statistically

in order to avoid any suspicions of attackers. Visual quality refers to any visual-

quality metric appropriate to evaluate the visual distortion due to the embedding

process. Evaluating and analyzing the quality of images still represent a significant

issue in many image processing applications. Thus, image quality represents a key

factor in most applications and assessing the perceived quality of digital images is

very important (Tan et al., 1998). Generally, there are two primary ways to measure

image quality: objective quality methods (automated) and subjective quality methods

(human based) (Stoica et al., 2003). The objective methods measure the physical

aspects of images and psychological issues while the subjective methods are

psychologically based methods. Additionally, subjective methods use human

observers in order to evaluate the quality of images. For example, subjects can be

asked to compare a modified image with its original version in order to know how

much this modified image is degraded (Wu and Rao, 2005). A steganographic system

is perfectly secure if the statistics of the cover file and that of the stego file are

identical. Therefore, the characteristics and attributes of cover files should not be

changed and no distortions should be produced during the embedding process

(Venkatraman et al., 2004). However, the presence of statistical anomalies (e.g. in

histograms and a variety of higher-order statistics) may be used by an adversary to

prove that a secret communication is taking place (Cox et al., 2008). Accordingly, the

higher the quality of stego images, the larger the imperceptibility of the

steganographic system. Therefore, evaluating the quality of stego images is a

significant measure to be used for evaluating the performance of image steganography

techniques (Wu and Hwang, 2007).
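
To make the notion of a statistical anomaly concrete, the following minimal sketch compares the first-order statistics (intensity histograms) of a cover image and a stego image. It is only an illustration under stated assumptions: the function names, the nested-list image representation, and the choice of the L1 distance are hypothetical and do not correspond to any particular steganalysis method cited above. Matching histograms also do not imply security, since higher-order statistics may still differ.

```python
from collections import Counter

def normalised_histogram(image, levels=256):
    """Relative frequency of each intensity 0..levels-1 in a grayscale image."""
    pixels = [p for row in image for p in row]
    counts = Counter(pixels)
    return [counts[v] / len(pixels) for v in range(levels)]

def histogram_distance(cover, stego, levels=256):
    """L1 distance between the normalised histograms (0.0 means identical)."""
    h_cover = normalised_histogram(cover, levels)
    h_stego = normalised_histogram(stego, levels)
    return sum(abs(a - b) for a, b in zip(h_cover, h_stego))

# Toy 1 x 4 images: embedding moved one pixel from intensity 100 to 101.
cover = [[100, 100, 101, 102]]
stego = [[100, 101, 101, 102]]
print(histogram_distance(cover, stego))  # 0.5
```

A distance of zero only shows that the two histograms agree; a nonzero distance shows that embedding has shifted the intensity distribution and may therefore be detectable.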

4.3.3.2 Objective Quality Assessment

Designing image quality evaluation metrics that can automatically predict the

perceived image quality is the main goal of objective image quality assessment

research (Wang et al., 2002a). Thus, the assessment algorithms designed for objective

image quality evaluation should be in close agreement with subjective human

evaluation regardless of the image content, the distortion amount, or the distortion

type (Sheikh et al., 2006). Objective image quality evaluation metrics are classified

into three generic categories according to the availability of the unmodified or original



image (reference). These categories are: full-reference (FR), no-reference (NR), and

reduced-reference (RR) image quality assessment (Wang et al., 2002b). The full

reference means that the original image and the test (impaired) image are available.

However, the no-reference means that only the test image is available. On the other

hand, the reduced-reference means that the test image and some information about the

original image are available.

Nowadays, the most popular and common distortion measures used to evaluate the

quality of images in the field of image processing are the peak signal-to-noise ratio

(PSNR) and the mean squared error (MSE). In the literature, the peak signal-to-noise

ratio metric (PSNR) has shown the best advantage over almost all other objective image

quality metrics under different image distortion environments and strict testing

conditions (Wang et al., 2002a).

PSNR and MSE are the most common and widely-used full-reference (FR) metrics

for objective image quality evaluation (Sheikh et al., 2006). Furthermore, PSNR is

used in many image processing applications and considered as a reference model to

evaluate the efficiency of other objective image quality evaluation methods (Wang et

al., 2002b). The PSNR measures the similarity between two images (how close the two

images are to each other) and is usually expressed in decibels (dB), while the MSE

measures the difference between the two images in squared pixel-intensity units. The

computing of these two metrics is very easy and fast, so they are widely-used and

very popular (Wang et al., 2002a). The MSE is the statistical difference in the pixel

values between the original and the reconstructed image. Moreover, PSNR and MSE

are defined as follows (Stoica et al., 2003; Wang et al., 2002b):

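$$\mathrm{PSNR} = 10 \log_{10}\left(\frac{255^{2}}{\mathrm{MSE}}\right) \qquad (4.3)$$

where 255 is the maximum value of an 8-bit pixel and the mean square error (MSE) is a

measure used to quantify the difference between the cover image I and the stego

(modified) image I’. If the image has a size of M * N then

$$\mathrm{MSE} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( I(i,j) - I'(i,j) \right)^{2} \qquad (4.4)$$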
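
The two definitions can be sketched in a few lines of code. The example below is a minimal illustration, assuming 8-bit grayscale images (peak value 255) stored as nested lists of integers; the function names `mse` and `psnr` and the toy pixel values are hypothetical and are not taken from the cited works.

```python
import math

def mse(cover, stego):
    """Mean squared error between two equally sized grayscale images.

    Each image is a list of rows; each row is a list of integer pixel values.
    """
    rows, cols = len(cover), len(cover[0])
    if len(stego) != rows or any(len(r) != cols for r in cover + stego):
        raise ValueError("images must have the same dimensions")
    total = sum((c - s) ** 2
                for cover_row, stego_row in zip(cover, stego)
                for c, s in zip(cover_row, stego_row))
    return total / (rows * cols)

def psnr(cover, stego, peak=255):
    """PSNR in dB; infinite when the two images are identical (MSE = 0)."""
    error = mse(cover, stego)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)

# Toy 2 x 3 images: three of the six pixels differ by one intensity level.
cover = [[52, 55, 61], [59, 79, 61]]
stego = [[53, 55, 60], [59, 78, 61]]
print(mse(cover, stego))             # 0.5
print(round(psnr(cover, stego), 2))  # 51.14
```

Changing half of the pixels by a single intensity level thus gives an MSE of 0.5 and a PSNR of about 51 dB, which illustrates why small per-pixel modifications produce large PSNR values.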



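According to Yu et al. (2007), PSNR for color images is similarly defined as follows:

$$\mathrm{PSNR} = 10 \log_{10}\left(\frac{255^{2}}{\mathrm{MSE}}\right) \qquad (4.5)$$

where the MSE for color images is defined as follows

$$\mathrm{MSE} = \frac{\mathrm{MSE}_R + \mathrm{MSE}_G + \mathrm{MSE}_B}{3} \qquad (4.6)$$

where MSE_R, MSE_G, and MSE_B are the MSE of the red, green, and blue components

respectively.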
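
For color images, the per-channel averaging can be sketched as follows. This is a minimal illustration, assuming 24-bit RGB images with 8-bit channels (peak value 255) stored as rows of (R, G, B) tuples; the function names and sample pixels are hypothetical.

```python
import math

def channel_mse(cover, stego, channel):
    """MSE of one color channel (0 = red, 1 = green, 2 = blue) over all pixels."""
    diffs = [(c[channel] - s[channel]) ** 2
             for cover_row, stego_row in zip(cover, stego)
             for c, s in zip(cover_row, stego_row)]
    return sum(diffs) / len(diffs)

def color_psnr(cover, stego, peak=255):
    """PSNR in dB using the mean of the red, green, and blue channel MSEs."""
    mse_r, mse_g, mse_b = (channel_mse(cover, stego, ch) for ch in range(3))
    mse = (mse_r + mse_g + mse_b) / 3
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)

# Toy 1 x 2 image: red changes by 1 in the first pixel, blue by 2 in the second.
cover = [[(10, 20, 30), (40, 50, 60)]]
stego = [[(11, 20, 30), (40, 50, 62)]]
print(channel_mse(cover, stego, 0))  # 0.5
print(channel_mse(cover, stego, 2))  # 2.0
print(round(color_psnr(cover, stego), 2))
```

Here the red, green, and blue channel errors are 0.5, 0, and 2 respectively, so the combined MSE is 2.5 / 3 before the logarithm is taken.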

Thus the best image quality is obtained when the MSE value is very small or approaches

zero, since the difference between the original and reconstructed image is then

negligible. PSNR values between 20 and 40 dB can be considered typical

(Cole and Krutz, 2003). Moreover, the higher the PSNR value of a stego

image, the better the degree of hidden message imperceptibility. For example, it is

difficult for the human visual system to recognize any difference between a grayscale

cover image and its stego image if the PSNR value exceeds 36 dB (Wu and Hwang,

2007). According to Cheddad et al. (2010), PSNR values falling below 30 dB indicate

fairly low quality, i.e., the distortion caused by embedding can be obvious. Thus a high

quality stego-image should strive for a PSNR value of 40 dB and above. These

metrics quantify the distortion caused by an embedding process. By employing such

quality metrics, digital steganography systems can be benchmarked against one another,

which supports the design of more efficient steganographic systems.
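
The thresholds quoted above can be related back to the underlying error by inverting the PSNR definition. The sketch below assumes 8-bit images (peak value 255); the function names and the label "intermediate" for the range between 30 dB and 40 dB are illustrative assumptions, not terminology from the cited works.

```python
def mse_for_psnr(psnr_db, peak=255):
    """MSE that yields the given PSNR: MSE = peak^2 / 10^(PSNR / 10)."""
    return peak ** 2 / 10 ** (psnr_db / 10)

def quality_band(psnr_db):
    """Coarse label based on the 30 dB and 40 dB thresholds quoted in the text."""
    if psnr_db < 30:
        return "low"
    if psnr_db < 40:
        return "intermediate"  # label chosen here for illustration only
    return "high"

for level in (30, 36, 40):
    print(level, round(mse_for_psnr(level), 4), quality_band(level))
```

Under this assumption, 40 dB corresponds to an MSE of about 6.5 and 30 dB to an MSE of about 65, so the two thresholds differ by a factor of ten in mean squared error.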

4.3.3.3 Subjective Quality Assessment

This kind of evaluation is based on the observation of images by humans in order to

assess their visual quality. However, visual sensitivity varies from one

person to another and changes over time, so different viewers can respond

differently. The objective image quality measures may not perfectly reflect the

impression of humans. Thus, the subjective quality measure represents a true

performance benchmark for image processing tools (Stoica et al., 2003). Unlike

objective quality measures, subjective measures represent the most reliable method to



determine the actual image quality since human beings are the ultimate intended

receivers. Furthermore, it can be stated that the subjective test is one of the best and

most reliable methods to evaluate the quality of images. Accordingly, subjective measures

use structured experimental designs and real end users or human subjects to assess the

quality of images (Tan et al., 1998; Wu and Rao, 2005). Furthermore, they are the

most widely recognized methods for image quality evaluation since they quantify the

actual perceived quality. However, subjective experiments of image quality

evaluation are complex, difficult to repeat, very expensive, and time consuming (Wu

and Rao, 2005). Generally, observers are asked to rate the quality of images,

sometimes with reference to other images.
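
One common way to summarize such ratings is to average them into a mean opinion score. The sketch below is an assumption for illustration only: the five-point scale (1 = bad, 5 = excellent), the function name, and the sample ratings are hypothetical and are not prescribed by the sources cited above.

```python
from statistics import mean, stdev

def mean_opinion_score(ratings, low=1, high=5):
    """Average of observer ratings on an assumed low..high integer scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(not low <= r <= high for r in ratings):
        raise ValueError("rating outside the assumed scale")
    return mean(ratings)

# Hypothetical ratings of one stego image by six observers.
ratings = [5, 4, 4, 5, 3, 4]
print(round(mean_opinion_score(ratings), 2))  # 4.17
print(round(stdev(ratings), 2))               # spread between observers
```

The standard deviation is reported alongside the mean because, as noted above, visual sensitivity differs between observers.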

Performance evaluation on individual data sets is desirable for direct performance

comparison of two methods on one data set. The quality metrics show the

relationship between the bit- or detection-error and the visual image quality for a

fixed attack and are very useful in comparing different steganographic methods, since they

facilitate immediate robustness comparisons for a given visual image quality.
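
The bit error mentioned above can be computed by comparing the embedded message with the message extracted after an attack. The following is a minimal sketch; the function names and the sample messages are hypothetical, and the attack itself is not modelled.

```python
def to_bits(data):
    """Expand a bytes object into a list of bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]

def bit_error_rate(embedded, extracted):
    """Fraction of bit positions at which the two bit sequences differ."""
    if not embedded or len(embedded) != len(extracted):
        raise ValueError("bit sequences must be non-empty and of equal length")
    errors = sum(a != b for a, b in zip(embedded, extracted))
    return errors / len(embedded)

# Hypothetical case: the attack corrupted the last two bits of the message.
sent = to_bits(b"hi")
received = to_bits(b"hj")
print(bit_error_rate(sent, received))  # 0.125
```

Plotting this error rate against the PSNR of the stego image for a fixed attack gives the kind of relationship described above.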
