CHAPTER 4
Image based Steganography
4.1 Introduction
Digital images often contain a large amount of redundant data, which makes it
possible to hide a secret message inside an image file. Images are the most common and
widespread carrier medium for steganography (Westfeld and Pfitzmann, 2000). To a
computer, an image is a collection of numbers that represent different light intensities
in different areas of the image (Johnson and Jajodia, 1998). This numeric
representation forms a grid, and the individual points are referred to as pixels (Morkel
et al., 2005). These pixels make up the image’s raster data. Data hiding in images takes
advantage of the limited power of the human visual system (HVS), which has a low
sensitivity to changes in pattern and luminance (Carvajal-Gamez et al., 2009). Most
digital steganography methods exploit the margin between the
numerical value and the visual perception of multimedia carriers. In other words,
secret messages are embedded in images by introducing slight distortions in
the non-significant parts, which are invisible to the human perception system.
The number of digital images on the Internet has increased rapidly because of their
importance in many applications. Digital images are the most popular cover-object
for steganography because of their suitable size in comparison with other digital
media and their massive presence on the Internet; they can afford to carry a large amount of
embedded secret data. In image steganography, it is necessary to ensure that the
changes in the stego-image caused by embedding are visually and statistically
negligible, so that the steganographic method is difficult to detect. The most
effective way of hiding data in an image is to change the image content, i.e. the colors
of the pixels. Such a technique, although crude, hides a large volume of information
inside the image. The idea is to embed the data into a significantly larger object so
that the changes are undetectable (Carvajal-Gamez et al., 2009). In steganography, a
secret message is embedded into another, innocent-looking digital medium in order
not only to conceal the secret message but also to conceal the very existence of the secret
message. The security of stego-images depends entirely on their ability to go
unnoticed. There are many methods for embedding secret information into an
image. The information can be embedded inside an image file in any order or in
specific areas, so that it is invisible to and undetectable by a third party.
It is also important to note that a steganographic technique involves not only
embedding information inside digital media; the receiver must also be able to
retrieve the information from the media successfully. When dealing with digital
images for use with steganography, 8-bit and 24-bit per pixel image files are typical
(Johnson and Jajodia, 1998). Both have advantages and disadvantages: 8-bit images
are useful because of their relatively small size, while 24-bit images offer much more
flexibility when used for steganography. Greater benefits can therefore
be gained from 24-bit images, which can use about 16 million colors for RGB images
(Gonzalez and Woods, 2002; Sheikh et al., 2006).
Reasons for using digital images as cover-media for steganography:
a. It is the most widely used medium.
b. It takes advantage of the limited visual perception of colors.
c. This field is continually growing with the growth of computer graphics.
d. Digital images are made up of pixels.
e. The arrangement of pixels makes up the image’s ‘raster data’.
f. 8-bit and 24-bit images are common.
g. The larger the image size, the more information can be hidden.
In a steganographic technique the choice of the cover image is equally important.
Digital images may be too large to be transmitted quickly over the Internet, so
techniques are used to reduce an image to a size suitable for displaying it in a reasonable
time across the Internet. These techniques use mathematical formulas to
reduce the image data, resulting in smaller file sizes; this process is called
compression (Morkel et al., 2005). Image compression is used to minimize the
amount of memory needed to represent an image. Current image formats can be
divided into two broad categories, lossy and lossless. Both methods save storage
space, but they differ in how they affect the hidden information when the
image is decompressed (Johnson and Jajodia, 1998). Lossy compression (e.g.
the JPEG format) attains a high level of compression and thus saves more space, but in
doing so it may alter many bits, so that the decompressed image is no longer identical
to the original. The advantage of lossy formats, in particular JPEG, is that they achieve
extremely high compression while maintaining fairly good quality (Bender et al.,
1996). Lossless compression reconstructs the original data exactly and is
preferred when the original information must remain intact (as with steganographic
images) (Johnson and Jajodia, 1998). However, lossless formats do not reach the high
compression ratios of lossy formats. Lossless compression is typical of images
saved as GIF (Graphics Interchange Format) and BMP (bitmap).
Image steganography methods are broadly classified into two categories based on the
working domain: spatial domain and frequency domain based steganography.
4.1.1 Spatial Domain Based Steganography
The term spatial domain refers to the image plane itself, i.e. the aggregate of pixels
composing an image. Approaches in this domain are procedures that operate directly
on those pixels.
Figure 4.1: One byte representation of a pixel with integer to binary conversion
In the spatial domain approach, the secret message is embedded directly into the
pixels of a cover image. A spatial domain based steganographic method embeds
the secret data by modifying pixel values in the spatial domain of the cover-image. Least
significant bit (LSB)-based hiding strategies are the most commonly used in this
approach. Spatial domain steganographic techniques, also known as substitution
techniques, consist of simple techniques that create a covert channel in the parts of
the cover image in which changes are likely to be imperceptible to the human visual
system (HVS) (Hamid et al., 2012). Spatial domain based steganography includes LSB
based embedding and palette based embedding.
4.1.1.1 Least Significant Bit based Steganography
Least significant bit (LSB) embedding is the most popular and common embedding
scheme, in which information is hidden in the least significant bits of an image (Juneja et al., 2009).
LSB steganography is the most classic and simplest steganographic technique; it
embeds secret messages in a subset of the LSB plane of the image (Abraham and
Paprzycki, 2004). This method is probably the easiest way of hiding information in
an image, and yet it is surprisingly effective. It works by using the least significant bits
of each pixel in one image to hide the most significant bits of another. The
method rests on the fact that the least significant bits in an
image can be thought of as random noise, so that changing them has little
effect on the image (Bailey and Curran, 2006; Kharrazi et al.,
2006; Hamid et al., 2012). A large number of popular steganographic tools, such as
S-Tools 4, Steganos and StegoDos, are based on LSB replacement in the spatial domain
(Johnson and Jajodia, 1998).
This technique substitutes redundant parts of a signal with the secret message. The
embedding process consists of choosing a subset of cover elements and performing
the substitution operation on them (Chan and Cheng, 2004). The basic concept of
LSB based embedding is to place the secret data in the bits with
the lowest weight, so that the value of the original pixel is barely affected (Sharda
and Budhiraja, 2013). The method usually works with raster images stored in a
lossless format (e.g. *.gif, *.bmp). These file formats are preferred
because they offer "lossless" compression, although other image formats are used as cover
images as well (Bandyopadhyay and Maitra, 2010). The image formats typically used
in LSB substitution are lossless, so the data can be directly manipulated and
recovered (Celik et al., 2005). One of the most important features of lossless
compression is to maximize the embedding capacity. Employing the LSB technique
for data hiding achieves both invisibility and a reasonably high storage payload (Amin
et al., 2003). 8-bit images are not as forgiving to LSB manipulation because of their color
limitations (Johnson and Jajodia, 1998).
The advantage of the LSB based data hiding method is that it is simple to embed the bits
of the message directly into the LSBs of the image pixels, and many techniques use this
approach (Amin et al., 2003). LSB modification does not result in visible image
distortion, and thus the resulting stego-image looks identical to the cover-image
(Bailey and Curran, 2006). LSB based techniques enable a high embedding rate and
also recover the secret data fully, without any error. For this reason, they are widely
preferred in image steganography. The amount of data to be embedded may be fixed
or variable in size, depending on the number of pixels selected. The main advantage of
such a technique is that the modification of the LSB plane hardly affects the statistics
of the overall image, as the amplitude variation of the pixel values is bounded by ±1
(Chandramouli and Memon, 2001).
By overwriting the LSB, the numeric value of the byte changes very little, and the change is
unlikely to be detected by human perception. Since there are 256 possible intensities
of each primary color, changing the LSB of a pixel results in only a small change in the
intensity of the colors. With a well-chosen image, one can even hide a large volume
of secret data in the LSBs without any noticeable difference (Neeta et al., 2006). If a
cover-image has M×N pixels and one bit is hidden per pixel, the maximum data hiding capacity of
LSB steganography is M×N bits. The embedding ratio p is defined as the ratio of the length of
the embedded message to the maximum capacity, where 0 ≤ p ≤ 1 (Zhang et al., 2006).
In the case of a 24-bit image, 3 bits can be stored in each pixel by changing the LSBs
of the red, green and blue color components, as each component is
represented by a byte.
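The capacity relation can be checked with a few lines of arithmetic. The following is only a minimal sketch: the function names `lsb_capacity_bits` and `embedding_ratio`, the 512 × 512 image size and the 1 KiB message are assumptions chosen for illustration and do not come from the cited works.

```python
def lsb_capacity_bits(width, height, channels=1, bits_per_channel=1):
    """Maximum payload in bits when bits_per_channel LSBs of every channel are used."""
    return width * height * channels * bits_per_channel


def embedding_ratio(message_bits, capacity_bits):
    """Embedding ratio p = message length / maximum capacity, with 0 <= p <= 1."""
    if not 0 <= message_bits <= capacity_bits:
        raise ValueError("message does not fit into the cover image")
    return message_bits / capacity_bits


# Hypothetical 512 x 512 cover images.
gray_capacity = lsb_capacity_bits(512, 512)              # one LSB per pixel
rgb_capacity = lsb_capacity_bits(512, 512, channels=3)   # three LSBs per 24-bit pixel
p = embedding_ratio(8 * 1024, gray_capacity)             # a 1 KiB message in the grayscale cover
```

Under these assumptions the 24-bit cover offers three times the capacity of the 8-bit grayscale cover of the same dimensions.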
The following example demonstrates how the letter ‘S’ can be hidden in the first
eight bytes of three pixels in a 24-bit image. In this image representation each pixel is
made up of three bytes, and each byte consists of eight bits that are either 1 or 0. The
original raster data for 3 pixels may be
R G B
(00100110 11101010 11001010)
(00100101 11001010 11101011)
(11001010 00100101 11101011)
And the character, S=01010011
Embedding the character ‘S’ into the LSBs of the first eight bytes of these pixels, the
resulting pixels become:
R G B
(00100110 11101011 11001010)
(00100101 11001010 11101010)
(11001011 00100101 11101011)
Only three bits are actually altered: the LSBs of the second, sixth and seventh bytes.
The LSBs of the other five bytes already match the message bits, and the ninth byte is
not used. On average, only one half of the LSBs are
changed (Johnson and Jajodia, 1998). Changing the MSBs causes a
noticeable impact on the color, whereas changing the LSBs is not noticeable and preserves
the image quality. Thus, 01101010 could be changed to 01101011, or remain the same,
and would go unnoticed by the casual observer. The last bit plane of the pixels can
therefore be used to embed data. This makes sense when one considers that one set of
zeroes and ones is simply substituted with another set of zeroes and ones.
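The substitution described in this example can be sketched in a few lines of Python. This is an illustrative sketch rather than the implementation of any cited tool; the helper names `embed_lsb` and `extract_lsb` are assumptions, and the cover bytes are the nine bytes of the original raster data given above.

```python
def embed_lsb(cover_bytes, message_bits):
    """Return a copy of cover_bytes whose first len(message_bits) LSBs carry the message."""
    if len(message_bits) > len(cover_bytes):
        raise ValueError("message is longer than the cover")
    stego = list(cover_bytes)
    for i, bit in enumerate(message_bits):
        stego[i] = (stego[i] & 0b11111110) | bit  # clear the LSB, then set it to the message bit
    return stego


def extract_lsb(stego_bytes, n_bits):
    """Read back the LSBs of the first n_bits bytes."""
    return [b & 1 for b in stego_bytes[:n_bits]]


# The nine bytes (three RGB pixels) of the original raster data in the example.
cover = [0b00100110, 0b11101010, 0b11001010,
         0b00100101, 0b11001010, 0b11101011,
         0b11001010, 0b00100101, 0b11101011]

message_bits = [int(b) for b in format(ord("S"), "08b")]  # 'S' = 01010011
stego = embed_lsb(cover, message_bits)

# Positions (0-based) of the bytes whose LSB had to be flipped.
changed = [i for i, (c, s) in enumerate(zip(cover, stego)) if c != s]
# The character read back from the stego bytes.
recovered = chr(int("".join(map(str, extract_lsb(stego, 8))), 2))
```

Here `changed` holds three positions, since the other LSBs already agree with the message bits, and no byte value moves by more than 1.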
The embedding process illustrated above shows that it is possible to extract the
secret message bits directly from the LSBs of the pixels selected during
embedding. In the extraction process, given the stego-image, the embedded secret
message can be extracted using the same sequence as in the embedding process. The
set of pixels storing the secret message bits is selected from the stego-image. The
LSBs of the selected pixels are extracted and lined up to reconstruct the secret
message bits (Chan and Cheng, 2004). Thus, for extraction, the receiver must have
access to the sequence of element indices used in the embedding process. This
extraction algorithm is considered the inverse of the embedding algorithm, although
the embedding and extraction algorithms may be created such that the extraction
algorithm is not actually the mathematical inverse of the embedding algorithm (Jain et
al., 2012b).
A slight variation of this technique embeds the message in two or more
of the least significant bits per byte. This increases the hidden information capacity of
the cover-image, but the cover-image is degraded more, and the embedding is therefore more
detectable (Juneja et al., 2009). Other variations on this technique aim to ensure
that statistical changes in the image do not occur. When hiding the message bits in the
LSBs of an image, there are two schemes, namely sequential and random. In the
sequential case, the message is embedded into the image successively, pixel after pixel. In
random embedding, the message bits are scattered throughout the image
using a random sequence to control the embedding process.
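The difference between the two schemes lies only in how the embedding positions are chosen. The sketch below illustrates this under stated assumptions: the shared integer `key`, the helper names and the 64 cover bytes are hypothetical, and Python's `random` module stands in for the random sequence (it is not cryptographically secure and is used here purely for illustration).

```python
import random


def select_positions(n_cover, n_bits, key=None):
    """Sequential positions when key is None, otherwise a key-seeded pseudo-random subset."""
    if n_bits > n_cover:
        raise ValueError("message is longer than the cover")
    if key is None:
        return list(range(n_bits))
    return random.Random(key).sample(range(n_cover), n_bits)


def embed(cover, bits, key=None):
    """Write each message bit into the LSB of the selected cover byte."""
    stego = list(cover)
    for pos, bit in zip(select_positions(len(cover), len(bits), key), bits):
        stego[pos] = (stego[pos] & ~1) | bit
    return stego


def extract(stego, n_bits, key=None):
    """Regenerate the same positions and read the LSBs in the same order."""
    return [stego[pos] & 1 for pos in select_positions(len(stego), n_bits, key)]


cover = list(range(100, 164))        # 64 hypothetical pixel bytes
bits = [0, 1, 0, 1, 0, 0, 1, 1]      # the letter 'S'
stego_seq = embed(cover, bits)               # sequential scheme
stego_rnd = embed(cover, bits, key=2024)     # random scheme, sender and receiver share the key
```

In both cases the receiver recovers the bits only by regenerating the same sequence of positions, which is why the key (or the embedding order) has to be known at extraction time.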
The main drawback of LSB embedding is that it is vulnerable to even small manipulations of the stego-image.
Converting an image from a format like GIF or BMP, which reconstructs the original
data exactly (lossless compression), to JPEG, which does not (lossy
compression), and then back could destroy the information hidden in the LSBs
(Johnson and Jajodia, 1998). However, such a technique maintains the size and
properties of the source image, which adds to the robustness of the secret message, and allows
high perceptual transparency. Most digital formats are designed with the outer limits
of human perception in mind, which makes LSB the method of choice for packaging
messages in these channels. LSB embedding has been shown to be quite versatile, and its
implementation is straightforward. All of these factors contribute to the continued use
of LSB in steganographic applications. The primary objective when using this method
is to trade a marginal amount of image quality for usable space
within the carrier. Among all message embedding techniques, least significant
bit (LSB) insertion/modification is difficult to detect and imperceptible to
humans (Chandramouli and Memon, 2001).
4.1.1.2 Palette based Steganography
A palette based image allows 8 bits per pixel or fewer to look almost as good as 24 bits
per pixel (Agaian and Perez, 2004). Rather than each pixel in the image having all
three RGB colors (one 8-bit red, one 8-bit green and one 8-bit blue), each pixel
contains one 8-bit number that indexes into a 256-color lookup table, which
contains the RGB values (Bandyopadhyay and Maitra, 2010). Palette based
algorithms consist of color quantization and dithering. Color quantization selects the
palette of the image by truncating all colors of the original raw 24-bit image to a
finite number of colors (Fridrich, 1999a; Wang et al., 2005). Palette images can be
obtained from a three-color-layer image by reducing the number of unique colors
used within the image through color quantization (Johnson and Jajodia, 1998;
Bandyopadhyay and Maitra, 2010). Data embedding in the palette takes advantage of the
color quantization process during this transformation. The basic idea of embedding in
the palette is to insert the secret message into the ordering of the colors in
the color-map (Agaian and Perez, 2004). Color quantization already introduces some
alteration into the image, so the secret message is able to pass as noise
(Wang et al., 2005). This avoids changes to the pixels, leaving the visual perception of
the image unscathed. It is possible because two identical images may have
completely different color-maps (Westfeld and Pfitzmann, 2000). The information
can also be embedded by arranging the palette in a structure where neighboring colors
are close under a given distance measure, including chroma difference (Fridrich, 1999a). Palette
based steganography is also useful for fast transmission of a secret message over a
communication system. Dithering is used to increase the apparent color depth: it
uses the integrating properties of the human visual system and creates the illusion of
additional colors by trading spatial resolution for color depth (Fridrich, 1999a). The
use of palette-based image representation is based on the observation that natural
images usually use only a small percentage of the available RGB color space, so
quantization of colors can be done without severely degrading the image quality
(Wang et al., 2005).
Extracting the embedded message is relatively simple, as long as the color
grouping configuration is known. The receiver can recover the message by
selecting the same pixels and collecting the LSBs of their indices into the ordered palette
(Wang et al., 2005; Wu et al., 2004). Using the same sequence as in the embedding
process, the secret message is read by extracting the parity bits of the colors of the
selected pixels from the stego-image. Extraction can also be done by first sorting the
palette (stego-palette) and then retrieving the stego-bits from the palette.
Figure 4.2: Sorting of color in palette as used in EzStego method (Westfeld and
Pfitzmann, 2000).
One of the most popular message hiding schemes for palette-based images (GIF files)
is the EzStego method proposed by Machado, which is similar to the commonly used LSB
method for 24 bit color images (or 8 bit grayscale images) (Westfeld and
Pfitzmann, 2000; Fridrich, 1999a; Wang et al., 2005). EzStego is a non-adaptive
method that embeds in the LSB of the index after the palette has been sorted by
luminance, which is a linear combination of the three colors R, G and B in the palette (Agaian
and Perez, 2004; Wu et al., 2004). In the reordered palette, neighboring palette entries
are typically near to each other in the color space, as shown in figure 4.2 (Westfeld
and Pfitzmann, 2000). The index of the pixel’s RGB color in the reordered palette is then
determined, and its LSB is replaced with a bit of the message. EzStego thus embeds the message
in binary form into the LSBs of the indices (pixels) pointing to the palette colors
(Fridrich, 1999a; Wang et al., 2005; Wu et al., 2004). However, occasionally colors
with similar luminance values may be relatively far from each other, generating very
noticeable artifacts (Wang et al., 2005). The advantage of EzStego is that it gives
importance to color models, and after embedding the image is reconstructed by
rearranging the palette (Agaian and Perez, 2004). According to Fridrich (1999a),
EzStego does not easily generate good stego-images, because colors with similar
luminance values may be relatively far from each other. To avoid this problem,
the author presented a steganographic method that hides message bits in the parity
bits of close colors by changing the image’s index. The algorithm uses a color
distance to select colors that are close to each other. According to Agaian and Perez
(2004), the algorithm of Fridrich (1999a) has the disadvantages that its embedding
capacity is limited to the size of the index and that it uses the entire image to find the
desired parity bit, which leaves more room for errors.
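The EzStego idea of embedding in the LSB of the index into a luminance-sorted palette can be sketched as follows. This is a simplified illustration, not the original tool: the luminance weights 0.299, 0.587 and 0.114 are an assumed choice of the "linear combination" mentioned above, the four-color palette and the function names are hypothetical, and an even palette size is assumed so that every sorted position has a partner.

```python
def luminance(rgb):
    """Assumed luminance weights; the text only states 'a linear combination of R, G, B'."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def sorted_palette_maps(palette):
    """Return (order, rank): sorted position -> palette index, and palette index -> sorted position."""
    order = sorted(range(len(palette)), key=lambda i: luminance(palette[i]))
    rank = {idx: pos for pos, idx in enumerate(order)}
    return order, rank


def embed_ezstego(palette, indices, bits):
    """Replace the LSB of each pixel's sorted-palette position with a message bit."""
    order, rank = sorted_palette_maps(palette)
    stego = list(indices)
    for k, bit in enumerate(bits):
        pos = (rank[stego[k]] & ~1) | bit
        stego[k] = order[pos]          # pixel now points to a neighboring color in the sorted palette
    return stego


def extract_ezstego(palette, indices, n_bits):
    """Sort the same palette and read the LSB of each pixel's sorted position."""
    _, rank = sorted_palette_maps(palette)
    return [rank[i] & 1 for i in indices[:n_bits]]


# Hypothetical 4-color palette and 6 pixel indices.
palette = [(200, 200, 200), (10, 10, 10), (120, 120, 120), (60, 60, 60)]
pixels = [0, 1, 2, 3, 0, 1]
bits = [1, 0, 1, 1, 0, 0]
stego_pixels = embed_ezstego(palette, pixels, bits)
```

A pixel either keeps its color or is switched to the adjacent color in the sorted order, which is visually harmless only when luminance neighbors are also close in color; that is exactly the weakness discussed in the paragraph above.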
The problem with the palette approach used with BMP images is that if the LSB of a
pixel is changed, the result can be a completely different color, since the index into the
color palette is changed (Johnson and Jajodia, 1998). Agaian and Perez (2004)
further stated that a problem with methods that embed within the palette
is that they do not take into account other important color models. Also, the
amount of information that can be embedded is limited, and the hidden message can be
destroyed by switching the order of the palette entries.
4.1.2 Frequency Domain based Steganography
In the frequency domain, cover images are transformed using a frequency-oriented
mechanism, and the secret message is then combined with the coefficients of the
transformed image to achieve embedding. The frequency domain is also known as the
transform domain, as the image is transformed. Unlike spatial domain techniques,
frequency domain based techniques hide secret data in significant parts of the cover
file. There are many transforms that map a signal into the frequency domain.
The discrete cosine transform (DCT), discrete wavelet transform (DWT) and discrete
Fourier transform (DFT) are used to embed secret data in digital
images.
4.1.2.1 Discrete Cosine Transform based Steganography
The DCT algorithm is one of the main components of the JPEG compression
technique, and it can be exploited for information hiding (Morkel et al., 2005). The
technique applies lossy compression to images, so the resulting image
has lost some of the original bits (Fridrich, 2009). An example of an image format that uses this
compression technique is JPEG (Joint Photographic Experts Group) (Johnson and
Jajodia, 1998). JPEG is the most popular and common image file format on the
Internet, and its files are small because of the compression, which makes it the
least suspicious format to use. In frequency domain based steganography the
knowledge of the JPEG compression algorithm, which uses the discrete cosine transform to
transform the image content, is used to embed the secret message (Chang et al., 2002).
Table 4.1: A block of 8 X 8 pixel values of a cover-image as stated in Jpeg–Jsteg
139 144 149 153 155 155 155 155
144 151 153 156 159 156 156 156
150 155 160 163 158 156 156 156
159 161 162 160 160 159 159 159
159 160 161 162 162 155 155 155
161 161 161 161 160 157 157 157
162 162 161 163 162 157 157 157
162 162 161 161 163 158 158 158
To compress an image into JPEG format, the RGB color representation is first
converted to a YUV representation, and each color plane is partitioned into
non-overlapping 8 x 8 blocks of pixels (Chang et al., 2002; Liu and Liao, 2008; Currie and
Irvine, 1996). In this representation the Y component corresponds to the luminance
(or brightness) and the U and V components correspond to chrominance (or color) (Li
and Wang, 2007). The human eye is more sensitive to changes in the brightness
(luminance) of a pixel than to changes in its color (Fridrich, 2009). Thus, it is possible
to remove a lot of color information from an image without losing a great deal of
quality (Watson, 1994a). JPEG compression downsamples (or
subsamples) the image, and much of the compression is achieved by downsampling
the chrominance data to reduce the overall file size (Cox et al., 2008). The color
components (U and V) are halved in the horizontal and vertical directions.
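The color conversion and chrominance downsampling can be sketched as below. The text speaks of YUV; the sketch assumes the full-range YCbCr weights commonly used in JPEG/JFIF files, and it assumes that halving is done by averaging each 2 × 2 block. Both are assumptions for illustration, as are the function names, and real encoders may filter differently.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cb, Cr) with the assumed JFIF full-range weights."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def subsample_2x2(plane):
    """Halve a chrominance plane in both directions by averaging each 2 x 2 block."""
    h, w = len(plane), len(plane[0])
    return [[(plane[i][j] + plane[i][j + 1] + plane[i + 1][j] + plane[i + 1][j + 1]) / 4
             for j in range(0, w - 1, 2)]
            for i in range(0, h - 1, 2)]


# A mid-gray pixel carries no color information: both chrominance values sit at 128.
y, cb, cr = rgb_to_ycbcr(128, 128, 128)

# A hypothetical 4 x 4 chrominance plane shrinks to 2 x 2.
chroma = [[1, 3, 5, 7],
          [1, 3, 5, 7],
          [2, 2, 2, 2],
          [4, 4, 4, 4]]
small = subsample_2x2(chroma)
```

After this step each chrominance plane holds one quarter of its original samples, while the luminance plane is left at full resolution.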
Next, the DCT transforms the signal from an image representation into a frequency
representation by grouping the pixels into 8 × 8 pixel blocks and transforming each
block into 64 DCT coefficients (Kharrazi et al., 2006; Almohammad et
al., 2008). A modification of a single DCT coefficient affects all 64 image pixels
in that block (Morkel et al., 2005). Each DCT coefficient F(u, v) of an 8 x 8 block of
image pixels f(x, y) is given by (Sheisi et al., 2012):
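$$F(u,v) = \frac{1}{4}\, C(u)\, C(v) \left[ \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16} \right] \qquad (4.1)$$
where $C(u) = 1/\sqrt{2}$ when u = 0 and C(u) = 1 otherwise, and
$C(v) = 1/\sqrt{2}$ when v = 0 and C(v) = 1 otherwise.
In this case x, y, u, v ∈ {0, 1, …, 7} and f(x, y) is the particular pixel color space
component.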
Table 4.2: The DCT coefficients formed after transformation of the image block
1260 -1 -12 -5 2 -2 -3 1
-23 -17 -6 -3 -3 0 0 -1
-11 -9 -2 2 0 -1 -1 0
-7 -2 0 1 1 0 0 0
-1 -1 1 2 0 -1 1 1
2 0 2 2 -1 1 1 -1
-1 0 0 -1 0 2 1 -1
-3 2 -4 -2 2 1 -1 0
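The transformation from Table 4.1 to Table 4.2 can be reproduced with a direct, unoptimized evaluation of the standard 8 × 8 forward DCT that Equation 4.1 denotes. This is a sketch for checking the numbers, not an efficient implementation; the function name `dct_8x8` is an assumption. JPEG encoders normally subtract 128 from every sample before the transform; no such shift is applied here, because the top-left value 1260 in Table 4.2 corresponds to the unshifted samples.

```python
import math

# The 8 x 8 block of pixel values from Table 4.1.
BLOCK = [
    [139, 144, 149, 153, 155, 155, 155, 155],
    [144, 151, 153, 156, 159, 156, 156, 156],
    [150, 155, 160, 163, 158, 156, 156, 156],
    [159, 161, 162, 160, 160, 159, 159, 159],
    [159, 160, 161, 162, 162, 155, 155, 155],
    [161, 161, 161, 161, 160, 157, 157, 157],
    [162, 162, 161, 163, 162, 157, 157, 157],
    [162, 162, 161, 161, 163, 158, 158, 158],
]


def dct_8x8(f):
    """Forward 2-D DCT of an 8 x 8 block; f[x][y] with x as row index, F[u][v] likewise."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    F = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            s = sum(f[x][y]
                    * math.cos((2 * x + 1) * u * math.pi / 16)
                    * math.cos((2 * y + 1) * v * math.pi / 16)
                    for x in range(8) for y in range(8))
            F[u][v] = 0.25 * c(u) * c(v) * s
    return F


coeffs = dct_8x8(BLOCK)
dc = round(coeffs[0][0])   # top-left (lowest frequency) coefficient
```

The top-left coefficient equals one eighth of the sum of all 64 samples, which is why it dominates the block, while the remaining coefficients are small for such a smooth block.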
After the DCT has been applied to each 8x8 block, the low frequency coefficient at
the top left of the table has the highest value, as it encodes the data with the highest
importance, and the high frequency coefficients have lower values (Chang et al., 2002), as shown in
table 4.2. Then the quantization phase is performed; this is the main lossy
compression step. The transformed
coefficients are quantized (scaled) in accordance with the default quantization table of
JPEG (Tseng and Chang, 2004).
Table 4.3: The Standard quantization table of JPEG
16 11 10 16 24 40 51 61
12 12 14 19 26 58 60 55
14 13 16 24 40 57 69 56
14 17 22 29 51 87 80 62
18 22 37 56 68 109 103 77
24 35 55 64 81 104 113 92
49 64 78 87 103 121 120 101
72 92 95 98 112 100 103 99
The standard quantization table is listed in table 4.3. It is a matrix that contains
64 coefficients, and the user can adjust those 64 coefficients (Chang et al., 2002). It is
the default table for luminance included in the JPEG specification; the
higher the values in the quantization table, the more detail is eliminated
(Almohammad et al., 2008). The aim is to quantize the values that represent the
image after the values have been transformed to frequencies (Watson, 1994b). The quantization
process takes the 64 DCT coefficients, divides each of them by the corresponding entry of a
predetermined set of values and rounds the results to the nearest integer,
thereby eliminating the redundant frequency coefficients (Liu and Liao, 2008; Li et
al., 2011).
Table 4.4: The quantized DCT coefficients formed by quantization
79 0 -1 0 0 0 0 0
-2 -1 0 0 0 0 0 0
-1 -1 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
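The quantization step that leads from Table 4.2 to Table 4.4 is an element-wise division by Table 4.3 followed by rounding. The sketch below reproduces it; the function names are assumptions. One entry (-7 divided by 14) is an exact tie, and Python's built-in `round` resolves ties to the nearest even integer, which gives the 0 shown in Table 4.4; other rounding conventions would give -1 there.

```python
# DCT coefficients from Table 4.2.
DCT = [
    [1260, -1, -12, -5, 2, -2, -3, 1],
    [-23, -17, -6, -3, -3, 0, 0, -1],
    [-11, -9, -2, 2, 0, -1, -1, 0],
    [-7, -2, 0, 1, 1, 0, 0, 0],
    [-1, -1, 1, 2, 0, -1, 1, 1],
    [2, 0, 2, 2, -1, 1, 1, -1],
    [-1, 0, 0, -1, 0, 2, 1, -1],
    [-3, 2, -4, -2, 2, 1, -1, 0],
]

# Standard JPEG luminance quantization table from Table 4.3.
Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round to the nearest integer."""
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, table)]


def dequantize(quantized, table):
    """Approximate inverse used by a decoder; the rounding error is not recoverable."""
    return [[c * q for c, q in zip(crow, qrow)]
            for crow, qrow in zip(quantized, table)]


quantized = quantize(DCT, Q)
nonzero = sum(1 for row in quantized for v in row if v != 0)
restored_dc = dequantize(quantized, Q)[0][0]
```

Only a handful of coefficients in the top-left corner survive, and the restored top-left value differs slightly from the original 1260, which illustrates why this step is the lossy one.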
Each block is compressed by using the quantization table to scale the DCT coefficients,
and the secret message is then embedded in the quantized DCT coefficients of each block
(Chang et al., 2002).
Figure 4.3: Frequency distribution in a DCT block where embedding takes place
After quantization most of the quantized DCT coefficients (i.e. the DCT coefficients
formed by quantization) are equal to zero, as seen in table 4.4, and it is in the quantized
coefficients that data hiding takes place. Bits of the quantized DCT coefficients can be
replaced with the bits of the secret message. Thus the secret data can be embedded into high, middle
and low frequency components, as shown in figure 4.3. Such embedding alters the
magnitude of the coefficients in the frequency components that are formed by
quantization. Manipulations in the high, low and middle frequencies are not
generally perceptible to the human eye and do not cause any major degradation; thus
data hiding takes place in these components. This is because the human
eye is fairly good at spotting small differences in brightness over a relatively large
area, but not so good at distinguishing between different strengths of high frequency
brightness variation (Katzenbeisser and Petitcolas, 2000). DCT based compression exploits this by
dividing all the values in a block by a quantization coefficient. The quantization step
is lossy because of the rounding error (Watson, 1994b).
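One way to replace coefficient bits with message bits is sketched below. The rule of skipping coefficients equal to 0 or 1, so that the long runs of zeroes stay intact, follows the convention usually attributed to Jsteg-style tools; it is an assumption here, since the text does not prescribe which coefficients are used. The function names and the six message bits are likewise hypothetical, and the coefficients are those of Table 4.4 read row by row.

```python
def usable_positions(coeffs):
    """Positions of coefficients that may carry a bit (assumed rule: skip 0 and 1)."""
    return [i for i, c in enumerate(coeffs) if c not in (0, 1)]


def embed_bits(coeffs, bits):
    """Overwrite the LSB of each usable quantized coefficient with a message bit."""
    out = list(coeffs)
    positions = usable_positions(out)
    if len(bits) > len(positions):
        raise ValueError("not enough usable coefficients in this block")
    for i, bit in zip(positions, bits):
        out[i] = (out[i] & ~1) | bit   # Python integers behave like two's complement here
    return out


def extract_bits(coeffs, n_bits):
    """Read the LSBs of the usable coefficients in the same order."""
    return [coeffs[i] & 1 for i in usable_positions(coeffs)][:n_bits]


# Quantized coefficients of Table 4.4, row by row (the last five rows are all zero).
quantized = ([79, 0, -1, 0, 0, 0, 0, 0,
              -2, -1, 0, 0, 0, 0, 0, 0,
              -1, -1, 0, 0, 0, 0, 0, 0] + [0] * 40)

bits = [1, 0, 1, 1, 0, 0]
stego = embed_bits(quantized, bits)
```

With this rule a usable coefficient never turns into 0 or 1, so the receiver selects exactly the same positions; the price is the small capacity visible here, six bits for the whole block.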
After embedding in the quantized DCT coefficients, each 8x8 block is left
with a few nonzero coefficients and a large number of zeroes. The modified DCT
coefficients are then sorted in frequency order by the zigzag ordering method (Sheisi et al.,
2012). As a result of quantization, small unimportant coefficients have been rounded to 0 while larger ones
have lost some of their precision (Liu and Liao, 2008). Zigzag ordering
groups similar frequencies together and places as many zeroes as possible next to each other, so that
the block compresses better.
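The zigzag scan can be sketched by sorting the 64 positions by anti-diagonal and alternating the direction on each diagonal. The function names are assumptions, and the block is the quantized block of Table 4.4.

```python
def zigzag_order(n=8):
    """Positions (row, col) of an n x n block in zigzag order, starting at the top left."""
    return sorted(((i, j) for i in range(n) for j in range(n)),
                  key=lambda p: (p[0] + p[1],                                  # anti-diagonal
                                 p[0] if (p[0] + p[1]) % 2 else -p[0]))        # alternate direction


def zigzag_scan(block):
    """Flatten a square block into a list following the zigzag order."""
    return [block[i][j] for i, j in zigzag_order(len(block))]


# Quantized DCT coefficients from Table 4.4.
QUANTIZED = [[79, 0, -1, 0, 0, 0, 0, 0],
             [-2, -1, 0, 0, 0, 0, 0, 0],
             [-1, -1, 0, 0, 0, 0, 0, 0]] + [[0] * 8 for _ in range(5)]

scan = zigzag_scan(QUANTIZED)
last_nonzero = max(i for i, v in enumerate(scan) if v != 0)
trailing_zeros = len(scan) - last_nonzero - 1
```

For this block all nonzero values end up at the front of the scan and the rest is a single run of zeroes, which is the property the subsequent run-length and Huffman coding rely on.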
After the zigzag process, JPEG entropy coding (which comprises Huffman coding,
Run-Length coding and DPCM) is applied to each block for further compression (Chang
et al., 2002). The results are thus rounded to integer values, and the coefficients are
encoded using Huffman coding to further reduce the size (Currie and Irvine, 1996).
The size field for discrete cosine values is included in the Huffman coding for the
other size values, so that JPEG can achieve compression of the data. Entropy coding
is a lossless compression process. After entropy coding of every block, a JPEG file
is obtained that contains a quantization table and the compressed data (Chang et al.,
2002). Finally, the JPEG stego-image is generated.
Extraction of secret data embedded in the DCT domain can be performed in two
ways. In the first method, the JPEG file (stego-image) is entropy decoded using the
coding tables (Huffman tables) located in the image header (Almohammad et al.,
2008). Entropy decoding (inverse JPEG entropy coding) comprises Huffman
decoding, Run-Length decoding and DPCM decoding. Each block is reconstructed
after all the compressed data have been decoded (Chang et al., 2002). After entropy decoding,
the 8 × 8 non-overlapping blocks of quantized stego DCT coefficients are recovered.
So long as the decoder knows that the embedding took place in the DCT domain, it
will be capable of extracting the message successfully. In the second method, the
stego-image is first divided into non-overlapping 8 × 8 blocks of pixels. Next, the two
dimensional DCT is applied to each block of the stego-image, and the DCT
coefficients are quantized with the quantization table to form quantized DCT
coefficients. The secret bits are then extracted from the low, middle or high frequency
coefficients and grouped into 8-bit characters to form the secret message.
Figure 4.4: DCT based Steganography during JPEG compression process
The drawback of such a technique is that the amount of secret data that can be
embedded is smaller than with spatial domain based techniques, and there is a risk
that bits of the secret data are lost because of the high compression. One of the
major characteristics of steganography is that information is hidden in the
redundant bits of an object, and since redundant bits are left out by the harsh
compression applied, it was feared that the hidden message would be destroyed
(Morkel et al., 2005). However, properties of the compression algorithm have been
exploited to develop steganographic algorithms for JPEGs. It is important to
recognize that the JPEG compression algorithm is divided into lossy and
lossless stages: the DCT and the quantization phase form part of the lossy stage, while
the Huffman encoding used to further compress the data is lossless. Steganography
can take place between these two stages (Johnson and Jajodia, 1998; Morkel et al.,
2005). Using this principle of insertion, the secret message can be embedded into the DCT
coefficients before the Huffman encoding is applied (Watson, 1994a). The
steganographic embedding can take place before or after the
quantization phase, as shown in figure 4.4. Information embedded at this stage,
in the transform domain, is extremely difficult to detect, since it is not in the visual
domain (Liu and Liao, 2008). Transform embedding methods are found to be in
general more robust than other embedding methods, which are susceptible to
image-processing types of attack (Li and Wang, 2007).
4.1.2.2 Discrete Wavelet Transform based Steganography
The wavelet transform is used to convert a signal from the spatial domain to the frequency
domain. It represents an image as a sum of wavelet functions
(wavelets) with different locations and scales. Any decomposition of an image into
wavelets involves a pair of waveforms: one to represent the high frequencies
corresponding to the detailed parts of an image (wavelet function) and one for the low
frequencies or smooth parts of an image (scaling function) (Grgic et al., 2001). The
use of wavelets in an image steganographic model rests on the fact that the wavelet transform
clearly separates the high frequency and low frequency information on a pixel by
pixel basis (Latef, 2011). High frequencies are transformed with short functions (low
scale) and low frequencies are transformed with long functions (high scale) (Khalifa
et al., 2008). The result of the wavelet transform is a set of wavelet coefficients, which
measure the contribution of the wavelets at these locations and scales (Grgic et al.,
2001). The coefficients in this wavelet expansion are called the discrete wavelet
transform (DWT) of the signal. The DWT is based on
sub-band coding, which results in fast computation of the wavelet transform. The discrete
wavelet transform is a very useful tool for signal analysis and image processing,
especially in multi-resolution representation, where it can decompose a signal into different
components in the frequency domain (Khalifa et al., 2008; Audithan and
Chandrasekaran, 2009). The DWT for an image as a 2-D signal can be derived from the 1-D
DWT, and the easiest way of obtaining the scaling and wavelet functions for two
dimensions is by multiplying two 1-D functions (Grgic et al., 2001).
The simplest form of discrete wavelet transform (DWT) is the Haar-DWT, in which the
low frequency wavelet coefficients are generated by averaging two pixel values and
the high frequency coefficients are generated by taking half of the difference of the
same two pixels (Chen and Lin, 2006; Nag et al., 2011; Latef, 2011). The Haar wavelet
is not continuous, and therefore not differentiable, and is used to convert a spatial
domain image to the wavelet domain (Dey et al., 2012). The Haar DWT has been
applied to image processing, especially in multi-resolution representation (Audithan
and Chandrasekaran, 2009).
For a function f, the HWT (Haar wavelet transform) is defined as (Chen and Lin,
2006):
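$$ f \mapsto (a^{L} \mid d^{L}) \qquad (4.2) $$
$$ a^{L} = (a_1, a_2, \ldots, a_{N/2}) $$
$$ d^{L} = (d_1, d_2, \ldots, d_{N/2}) $$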
where L is the decomposition level, a is the approximation subband and d is the detail
subband.
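The averaging and half-difference rule of the Haar-DWT described above can be sketched
for a 1-D signal as follows. This is a minimal illustration, not code from the cited
works: the names haar_step and inverse_haar_step are hypothetical, and the sketch
assumes an even-length signal and floating-point coefficients.

```python
def haar_step(signal):
    """One Haar level: pairwise averages (a) and half-differences (d)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approximation = [(x + y) / 2 for x, y in pairs]   # low frequency part
    detail = [(x - y) / 2 for x, y in pairs]          # high frequency part
    return approximation, detail


def inverse_haar_step(approximation, detail):
    """Rebuild the signal: x = a + d and y = a - d for every pair."""
    signal = []
    for a, d in zip(approximation, detail):
        signal.extend([a + d, a - d])
    return signal


a, d = haar_step([10, 6, 4, 8])
restored = inverse_haar_step(a, d)
```

Each subband holds half as many samples as the input, and the transform is exactly
invertible as long as the coefficients are not rounded.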
Figure 4.5: Two dimensional wavelet transformation of an image
To apply the HWT to images, a one level Haar wavelet is first applied to each row and
then to each column of the image resulting from the first operation (Dey et al.,
2012). The DWT is computed by successive low-pass and high-pass filtering of the
discrete time-domain signal, which decomposes it into four classes or bands of
coefficients (Khalifa et al., 2008). Its significance lies in the manner it connects the
continuous-time multiresolution to discrete-time filters. For 2-D images, applying the
DWT separates the image into a lower resolution approximation band (LL) and three
higher frequency detail bands: the horizontal band (HL), the vertical band (LH) and
the diagonal band (HH), as shown in figure 4.5 (Audithan and Chandrasekaran, 2009;
Nag et al., 2011; Latef, 2011). The approximation band (LL) consists of low frequency
wavelet coefficients, which contain the significant (smooth) part of the spatial
domain image. Thus embedding in the low frequency sub-band may degrade the
image significantly. The other bands (HH, HL and LH), also called detail bands,
consist of high frequency coefficients, which contain the edge and texture
details of the spatial domain image (Audithan and Chandrasekaran, 2009; Nag et al.,
2011). However, the human eye is generally not sensitive to changes in the edges and
textures of an image, so changes in the high frequency sub-bands are less noticeable.
Thus data hiding takes place in the high frequency sub-bands (HH, HL and LH) by
modifying high frequency wavelet coefficients (Latef, 2011; Dey et al., 2012). The
overall process is called the one-level 2-D Haar-DWT. With this approach, the time
resolution becomes arbitrarily good at high frequencies, while the frequency
resolution becomes arbitrarily good at low frequencies (Grgic et al., 2001).
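The row-then-column procedure can be sketched in Python as shown below. This is an
illustrative sketch only: the function names are hypothetical, the image is assumed to
be a list of rows with even width and height, and the band labels follow the assumption
that the first letter names the filter applied along the rows and the second the filter
applied along the columns (labelling conventions differ between authors).

```python
def haar_step(values):
    """Pairwise averages and half-differences of an even-length sequence."""
    if len(values) % 2:
        raise ValueError("length must be even")
    pairs = [(values[i], values[i + 1]) for i in range(0, len(values), 2)]
    return [(x + y) / 2 for x, y in pairs], [(x - y) / 2 for x, y in pairs]


def transform_columns(block):
    """Apply haar_step to every column; return (low, high) blocks of rows."""
    low_cols, high_cols = [], []
    for column in zip(*block):
        low, high = haar_step(list(column))
        low_cols.append(low)
        high_cols.append(high)
    # Transpose back so that each result is again a list of rows.
    return [list(r) for r in zip(*low_cols)], [list(r) for r in zip(*high_cols)]


def haar2d(image):
    """One-level 2-D Haar-DWT: rows first, then columns of the row results."""
    low_rows, high_rows = [], []
    for row in image:
        low, high = haar_step(row)
        low_rows.append(low)
        high_rows.append(high)
    ll, lh = transform_columns(low_rows)    # row low-pass, then column low/high
    hl, hh = transform_columns(high_rows)   # row high-pass, then column low/high
    return ll, hl, lh, hh


ll, hl, lh, hh = haar2d([[1, 3], [5, 9]])
```

For this 2x2 example the LL coefficient equals the mean of the four pixels, which shows
why the approximation band carries the smooth content of the image while the three
detail bands carry only differences.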
In the extraction process, the modified coefficient matrix is first obtained by
applying the 2-D Haar DWT to the stego-image, which separates the four sub-bands
LL, HL, LH and HH (i.e. the high and low frequency information). The coefficients of
the three high frequency sub-bands (HH, HL and LH) are then read to recover the
secret bits.
The main limitation of such methods is that, because the embedding takes place in
the frequency domain, the hiding capacity is lower, since the transform achieves high
compression. Secondly, the decomposition into sub-bands that is part of the wavelet
transform increases the time complexity of the process beyond that of spatial domain
embedding.
4.2 Image Steganalysis
Steganalysis is the art and science of detecting whether a given digital image contains
hidden data. It involves selecting the features or properties of the image to test for
hidden data, and designing techniques to detect or extract the hidden data. A
steganalysis method is considered successful if it can detect and extract the
embedded hidden data (Katzenbeisser and Petitcolas, 2000).
Steganalysis can be described as a method of attacking digital media to estimate
whether the media contain embedded secret data. Thus it can serve as an
effective way to judge the security performance of steganographic techniques. The
steganalyst (one who performs steganalysis) is assumed to control the transmission
channel and to watch it for suspicious data. In practice, the steganalyst is
frequently more interested in verifying whether or not a secret message is present in a
medium (Fridrich et al., 2003b).
The objectives of steganalysis are:
To detect the existence of a secret message in a binary image. The suspect
image may or may not have hidden data encoded into it.
To evaluate techniques that can be used to distinguish images carrying
secret messages from those without. Some of the suspect images may have
noise or irrelevant data encoded into them.
To identify the type of steganographic method used to create the
stego-image by trying to understand the internal mechanism used during the
embedding operation.
To recover the hidden data, and not only to detect the stego-image.
To estimate the length of the hidden message and the location of the
pixels bearing it.
To estimate the relative number of embedding changes in the digital image.
4.2.1 Types of Attacks
While the purpose of steganography is to hide messages, several attacks can be
executed to test for steganographic data. The strength of a steganographic
algorithm depends on its ability to withstand attacks successfully. Attacks on and
analysis of hidden data may take several forms: detecting, extracting, disabling or
destroying hidden data. An attack depends on what information is available to the
steganalyst. Attacking steganographic algorithms is very similar to attacking
cryptographic algorithms, and similar techniques apply (Wayner, 2009). There are six
general protocols used to attack steganography, as pointed out by
Katzenbeisser and Petitcolas (2000). These are as follows:
Stego-only attack: Only the steganography medium/object is available for analysis.
Known-carrier attack: The carrier, that is, the original cover, and steganography
media/object are both available for analysis or are known.
Known-message attack: In this case, the hidden message is known and can be
compared with the stego-object/medium.
Chosen-stego attack: The steganography medium/object and tool (algorithm) are
both available for analysis.
Chosen-message attack: Here a chosen message and a steganography tool (or
algorithm) are used to create steganography media for future analysis and comparison.
Known-steganography attack: The secret message, steganography medium/object
and the steganography tool (algorithm) are known and available for analysis.
Steganography elimination techniques are a part of steganalysis that tries to
eliminate or destroy the hidden information, since the purpose is to break the covert
communication. The most common attacks of this kind are (Katzenbeisser
and Petitcolas, 2000):
Destroy everything attack – this type of attack aims at destroying the message
completely, and the attacker might not even try to retrieve the message.
Random tweaking attacks – here small changes are made to the file so that the
message becomes unreadable.
Add new information – in some cases the attackers might use the same technique of
data hiding to embed a new message into the stego-file. The original message might
be overwritten.
Reformat attack – a common way to destroy the information hidden in a file is by
changing the file format. This type of attack can cause considerable damage to the
hidden message.
Compression attack – the attacker might compress the file which might result in the
total loss of the secret message embedded in the file.
The attacks presented above describe ways to destroy the hidden message. In all
such cases, however, the attack should be carried out on a suspected image; otherwise
an attack may be performed on an innocent image that does not contain any secret data.
For this reason, certain attacks are used in steganalysis to evaluate whether the
image contains hidden data.
4.2.2 Image based Steganalysis Techniques
Fig 4.6: Classification of image steganalysis techniques
Steganalysis can be classified into targeted and blind methods, as shown in
figure 4.6 (Patil et al., 2012). Targeted steganalysis uses knowledge about the
steganographic technique to detect stego-images created with that specific technique,
while blind steganalysis aims to determine whether an image contains hidden
information without any prior knowledge about the steganographic technique used.
Blind and targeted steganalysis techniques have been extensively studied on digital
images (Fridrich et al., 2001).
4.2.2.1 Targeted Steganalysis
Targeted steganalysis techniques are designed around the mechanisms of particular
embedding operations and fully utilize that knowledge to detect steganography. A
targeted steganalysis technique works on a specific type of known stego-system and
is sometimes limited to a particular image format (Chandramouli et al., 2004). By
studying and analyzing the embedding algorithm, one can find image statistics that
change after embedding. The results from targeted steganalysis techniques can be
accurate, but the techniques are also inflexible, since most of the time there is no
way to extend them to other embedding algorithms. Targeted steganalysis can be of
three types: visual, statistical and structural attacks (Patil et al., 2012).
(a) Visual attacks
Visual attacks are the simplest form of steganalysis and involve examining the
stego-image with the naked eye to identify any kind of degradation (Patil et al., 2012).
A well-designed steganographic method should not leave any visible distortion on the
image file when bits are modified. The visual attack therefore exploits the ability of
humans to distinguish between noise and visual patterns, and can be implemented by
isolating different properties of the image. For example, a visual attack could
display the least significant bit (LSB) plane of the spatial domain image on its own.
The steganalyst searches it for inconsistencies in order to classify an image as either
a stego-image or a normal image, although such inconsistencies depend on the way the
data is embedded in the cover-image. Similarly, the steganalyst could also attack in
the transform domain to evaluate whether or not the image contains signs of transform
embedding (Westfeld and Pfitzmann, 2000).
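The preparation step of such a visual attack can be sketched as follows: every pixel is
replaced by black or white according to its least significant bit, so that the LSB
plane can be inspected by eye. The function name lsb_plane is hypothetical, and the
sketch assumes an 8-bit greyscale image stored as a list of rows; displaying the
result is left to whatever image viewer is available.

```python
def lsb_plane(pixels):
    """Map each 8-bit grey value to 0 (LSB = 0) or 255 (LSB = 1)."""
    return [[255 * (value & 1) for value in row] for row in pixels]


plane = lsb_plane([[12, 13], [255, 0]])
```

In a natural image the LSB plane often still shows traces of the image content, whereas
a region that has been overwritten sequentially with message bits tends to look like
uniform noise.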
On the other hand, it is much harder to perform a visual attack on randomized
embedding, as the data are embedded in randomly selected pixels of the image. It
therefore becomes much more difficult to identify the regions that have been altered.
Visual attacks can be a useful tool for known-cover attacks. When the cover image is
not available to the steganalyst, a visual attack depends on three conditions holding
true to be successful: the message must be embedded in sequential order, its length
must be less than the maximum size of the bit plane, and it should not be encrypted.
If the message fills the bit plane, it is no longer possible to see a change in form in
the bit plane, so the steganalyst finds it harder to classify the image as a
stego-image. Also, when the message is encrypted, the chance of success of a visual
attack is reduced considerably when the cover-image is not available.
It is essential for a visual attack to determine which features of the image can be
ignored and which should be taken into consideration in order to mount a valid attack
that tests the possibility that the suspected image contains a secret message. The
success of a visual attack varies significantly depending on the steganographic
method applied and the format of the image. As the attack can be applied to different
embedding techniques, examining the properties of several image formats is not
sufficient. The cover-image or the steganographic technique is required in order to
detect the distorted regions for a successful attack. Testing images against various
methods of embedding thus proves time-consuming. This is obviously an inefficient
methodology, and the main drawback of the attack is that it cannot be automated.
(b) Statistical Attacks
In this type of attack, a statistical analysis of the image is performed with some
mathematical formula to detect the presence of hidden data. Statistical attacks are
partially similar to visual attacks. Generally the hidden message is more random than
the original data of the image, so a formula that measures this randomness can
reveal the existence of hidden data (Wayner, 2009). A theory is constructed that
seemingly explains why the phenomenon occurs, and statistical methods are used to
prove this theory to be either true or false. Statistical tests try to reveal whether an
image has been modified by determining whether the image’s statistical properties
deviate from a norm. Some tests are independent of the data format and just measure
the entropy of the redundant data (Provos and Honeyman, 2003). There are methods
that try to detect the existence of a hidden message via statistical approaches by
identifying signs of embedding for specific stego-systems. Chi-square analysis is one
such statistical attack.
Chi-square Analysis
Westfeld and Pfitzmann (2000) outlined a statistical attack based on the observation
that, for a given image, the embedding of data changes the histogram of color
frequencies in a particular way. In their case, the embedding process changes the least
significant bits of the colors in an image, where the colors are addressed by their
indices in the color table. After embedding, the frequencies of the two colors in each
pair of adjacent color indices (indices that differ only in the LSB) become nearly
equal, because the frequency difference between adjacent colors is reduced by the
embedding process. Westfeld and Pfitzmann (2000) used a Chi-square (χ2) test to
determine whether the color frequency distribution in an image matches a distribution
that shows distortion from embedding data. The test gives the probability that the
observed frequency distribution and the distribution expected after embedding are
equal. They increased the sample size and applied the test at a constant position.
According to Provos and Honeyman (2003), it is possible to extend Westfeld and
Pfitzmann’s Chi-square test to be more sensitive to partial distortions in an image,
for example in the DCT coefficients of a JPEG image. According to them, two identical
distributions produce about the same chi-square values in any part of the distribution.
Instead of increasing the sample size and applying the test at a constant position,
they used a constant sample size but slid the position where the samples are taken
over the entire range of the image. They stated that the expected distribution for the
chi-square test has to be computed from the image itself, by taking the arithmetic
mean of the frequencies of each pair of adjacent color indices, and is then compared
against the observed distribution.
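The statistic behind this test can be sketched in a few lines of Python. This is a
simplified illustration under stated assumptions, not the implementation of the cited
authors: the name pov_chi_square is hypothetical, the input is assumed to be a flat
list of 8-bit values (color indices or pixel values), and pairs whose expected
frequency is zero are simply skipped rather than merged as a full implementation
would do.

```python
from collections import Counter


def pov_chi_square(values, n_levels=256):
    """Chi-square statistic over pairs of values (2k, 2k + 1).

    Returns (statistic, degrees_of_freedom). The expected frequency of each
    pair is the arithmetic mean of the two observed frequencies.
    """
    counts = Counter(values)
    statistic = 0.0
    categories = 0
    for even in range(0, n_levels, 2):
        observed = counts[even]
        expected = (counts[even] + counts[even + 1]) / 2
        if expected > 0:                      # skip empty pairs (simplification)
            statistic += (observed - expected) ** 2 / expected
            categories += 1
    return statistic, max(categories - 1, 0)


equalized = pov_chi_square([0, 1, 2, 3] * 5)   # both members of each pair equally frequent
skewed = pov_chi_square([0] * 9 + [1])         # very unequal pair
```

A statistic close to zero means that the pair frequencies are nearly equal, which is
the signature of LSB embedding; turning the statistic into a probability requires the
chi-square distribution function with the returned degrees of freedom, which is not
included in this sketch.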
(c) Structural Attacks
Structural attacks are designed to take advantage of the high-level properties that are
known to exist for a particular steganographic algorithm (Patil et al., 2012). Structural
attacks rarely analyze each image on its own merits. Instead, the images are scanned
to see if they contain any of the known side-effects of various steganographic
algorithms. Images that exhibit these properties are often subjected to further
investigation. There are sometimes cases where an image shows signs of
steganography although it is perfectly innocent, which is why a more detailed
investigation is carried out in a structural attack. A common element of structural
detectors is to estimate features so that a macroscopic cover property can be
approximated from the stego object: the effects of embedding are inverted as a function
of the features so that the result best matches the cover assumptions (Fridrich et al.,
2003b). A successful structural attack relies on being able to identify a distinct
difference between the cover-image and the stego-image, which means that there is a
heavy reliance on either knowing the cover-image or knowing the embedding details
of the steganographic algorithm and evaluating the consequences of the embedding
strategy. It is rarely the case that a steganalyst will have access to one of these, and
even rarer to have access to both, which hampers the success of the attack. Structural
attacks are not used as a means of proving that an image contains steganography;
rather, they highlight images that contain signs of embedding. RS (Regular and
Singular groups) analysis and Pairs analysis are examples of structural attacks.
Regular and Singular groups (RS) Analysis
RS steganalysis is used to estimate the length of the message embedded in a digital
image by LSB steganographic methods. It was introduced by Jessica Fridrich and
others (Fridrich et al., 2001; Fridrich et al., 2003b) to exploit the correlation of
images in the spatial domain. They stated that lossless capacity reflects the fact that
the LSB plane, even though it looks random, is related to the other bit planes, and
the method is based on the fact that the content of each bit plane of an image is
correlated with the remaining bit planes. In RS analysis the image is partitioned into
groups of pixels of a fixed shape. Each group is classified as ‘regular’ or ‘singular’
depending on whether the pixel noise within the group (as measured by the mean
absolute value of the differences between adjacent pixels) is increased or decreased
after flipping the LSBs of a fixed set of pixels within the group (the pattern of pixels
to flip is called the ‘mask’). The classification is repeated for a dual type of flipping.
They stated that theoretical analysis and experimentation show that the proportions of
regular and singular groups form curves that are quadratic in the amount of message
embedded by the LSB method. Under this assumption, the proportions of regular and
singular groups with respect to the standard and dual flipping can be used to estimate
the proportion of the image in which data is hidden. The estimate can be accurate
(often within 1%), but fails when this assumption does not hold.
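The classification step of RS analysis can be sketched as follows. The sketch only
counts regular and singular groups for the standard and the dual flipping; it does not
fit the quadratic curves or estimate the message length. The function names, the mask
(0, 1, 1, 0), the use of non-overlapping groups taken from a flat list of pixel values,
and the use of the sum (rather than the mean) of absolute differences are assumptions
made for this example; the sum and the mean give the same classification.

```python
def flip_lsb(value):
    """Standard flipping: 0 <-> 1, 2 <-> 3, 4 <-> 5, ..."""
    return value ^ 1


def flip_dual(value):
    """Dual (shifted) flipping: -1 <-> 0, 1 <-> 2, 3 <-> 4, ..."""
    return ((value + 1) ^ 1) - 1


def noise(group):
    """Sum of absolute differences between adjacent pixels in the group."""
    return sum(abs(b - a) for a, b in zip(group, group[1:]))


def rs_counts(pixels, mask=(0, 1, 1, 0)):
    """Count regular (R) and singular (S) groups for mask M and its dual -M."""
    size = len(mask)
    counts = {"R_M": 0, "S_M": 0, "R_-M": 0, "S_-M": 0}
    for start in range(0, len(pixels) - size + 1, size):
        group = pixels[start:start + size]
        before = noise(group)
        for label, flip in (("M", flip_lsb), ("-M", flip_dual)):
            flipped = [flip(p) if m else p for p, m in zip(group, mask)]
            after = noise(flipped)
            if after > before:
                counts["R_" + label] += 1     # flipping increased the noise
            elif after < before:
                counts["S_" + label] += 1     # flipping decreased the noise
    return counts


smooth = rs_counts([10] * 8)
bumpy = rs_counts([10, 11, 11, 10])
```

Groups whose noise is unchanged by the flipping are counted in neither class. In the
second example the same group is singular under the standard flipping but regular
under the dual flipping, which is the kind of asymmetry the method measures.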
Pairs analysis
Pairs Analysis is a steganalysis technique that detects data hidden in palette
images by analyzing the LSBs of the indices (Fridrich et al., 2003b; Fridrich et al.,
2003a). The principle of Pairs Analysis is based on color pairs. Pairs Analysis first
splits an image into color cuts, scanning through and selecting only the pixels which
fall into each pair of values (0, 1), (2, 3), and so on. The color cuts are concatenated
into a single stream and the homogeneity of the LSBs is measured. Repeating this with
the alternative pairs of values (255, 0), (1, 2), (3, 4), etc., one can show that the
function defined by the difference between the two homogeneity measures is quadratic
in the amount of embedded data. Under the assumption that natural images have no
difference in homogeneity, one can deduce the amount of embedded data in an image,
and this estimate forms the statistic used to distinguish between the presence and
absence of hidden data. However, the method is not reliable for images for which the
assumption of equal homogeneity does not hold.
4.2.2.2 Blind Steganalysis
Blind steganalysis is an approach to detecting a secret message embedded in a file
even when it is not known how the information might have been embedded. Blind
steganalysis does not require prior knowledge about the details of the embedding
operation (Luo et al., 2008; Chandramouli et al., 2004). It therefore works
differently from targeted steganalysis, because it assumes that nothing is known about
either the algorithm or the cover image that was used to produce a suspect image. It
tries to detect any steganographic tool, whether or not it is known in advance, and
both sets of statistical moments are used as features for steganalysis (Chandramouli,
2003). The attacks attempt to evaluate the probability of embedding based solely on
the data of the suspect image. Such approaches are more likely to be common in
real-world steganalysis.
Blind identification methods pose the steganalysis problem as a system identification
problem: the embedding algorithm is represented as a channel, and the goal is to
invert this channel to identify the hidden message (Chandramouli and Subbalakshmi,
2004). In such a steganalysis method, each image is analyzed individually based on
the computed statistics. Digital images are known to be statistically non-stationary,
which causes practical issues in implementing algorithms based on the blind
identification model, which assumes stationarity of the data (Chandramouli, 2003).
When the stationarity condition is violated, additional effort is needed to make
steganalysis work. If the message embedding algorithm is nonlinear, the blind
identification problem becomes more difficult (Chandramouli et al., 2004). Perhaps the
most important aspect of blind steganalysis is ensuring that one can derive an
estimate of the cover image which is as accurate as possible. The attacks that follow
this procedure often compare the data in the estimated cover image to that of the
suspect image.
Some of the methods that belong to the blind steganalysis schemes are discussed
below:
(a) Self-calibration mechanism: The calibration process, proposed by Fridrich et al.
(2002), is used by blind steganalysis schemes to estimate the statistics of the cover
image from the stego image in the case of JPEG. It relies on the fact that JPEG based
stego-systems encode the message data in the transform domain during the
compression procedure: the image is divided into 8x8 blocks that are transformed, and
the secret data are encoded within these blocks to produce a stego-image in JPEG
format. The idea of calibration is to estimate the marginal statistics of the
cover-image’s transform domain coefficients from the stego-image by desynchronising
the block transform structure in the spatial domain. A stego-image in its transform
domain representation is first converted to the spatial domain, then cropped by a
small number of pixels at two orthogonal margins (e.g. cropped by 4 rows and 4
columns), and then re-encoded in the JPEG format. The calibration is done by taking
the differences between the features of the cropped image and those of the original
(uncropped) stego-image. The calibrated image is then compared with the stego image
using statistical measures such as PSNR and histograms.
(b) Features capturing cover memory (Sullivan et al., 2006; Fu et al., 2006): Most
steganographic schemes hide data on a per-symbol basis, and typically do not
explicitly compensate for or preserve statistical dependencies. Hence, features that
capture higher dimensional dependencies in the cover symbols are crucial in detecting
the embedding changes. Cover memory has been shown to be very important to
steganalysis and is incorporated into the feature vector in several ways.
(c) Supervised learning based steganalysis (Avcibas et al., 2003; Lyu and Farid,
2004): Supervised learning based steganalysis techniques employ a two-phase
strategy: (i) a training phase and (ii) a testing phase. In the training phase a
classifier is constructed to differentiate between stego and non-stego images using
training examples. The learning classifier iteratively updates its classification rule
based on its predictions and the ground truth. In the testing phase, unknown images
are given as input to the trained classifier to decide whether a secret message is
present or not. However, the choice of proper features on which to train the classifier
is a critical step. If the selected features are not appropriate for the specific
embedding algorithm, the detector may fail completely. There is no systematic rule for
feature and parameter selection. It is also extremely difficult or even impossible to
identify the portions of the image where a message is hidden.
Steganalysis mechanisms can be used to analyze the embedding performance of
steganographic techniques. Each of the steganalysis techniques presented works only
on a specific format, and thus there is no universal steganalysis technique. It is up to
the user (steganalyst) to choose an appropriate methodology based on the information
that is available, and each of these steganalysis techniques has its pros and cons. A
successful attack relies on being able to identify a distinct difference between the
cover-image and the stego-image, which means that there is a heavy reliance on either
knowing the cover-image or knowing the embedding details of the steganographic
algorithm. It is rarely the case that a steganalyst will have access to one of these,
and even rarer to have access to both. As an attack may have to cover different
embedding techniques, examining the properties of several image formats is also not
sufficient, and is time-consuming or completely infeasible. Also, the analysis may give
a significant number of false positives: there are cases where an image shows signs of
steganography although it is perfectly innocent. Among the enormous number of
images present on the internet, it is very difficult to judge whether an image is a
stego-image or not.
4.3 Performance Metrics of Image Steganography
In order to examine the performance of a steganographic system or technique, an
evaluation scheme is needed. Currently, no standard test or measure is available to
evaluate the performance or effectiveness of steganographic systems. However, there
are some guidelines and general procedures that can be considered when evaluating or
designing steganographic systems (Cox et al., 2008). Basically, steganographic
systems have two fundamental characteristics which must be investigated in order to
evaluate the system: security (or undetectability) and hiding capacity are the most
important requirements that must be addressed in every steganographic system (Wang
and Wang, 2004). Thus, the effectiveness of a steganography technique can be
measured using two key principles: the amount of data that can be embedded and the
difficulty of detecting this data (Cole and Krutz, 2003). Measuring these two
characteristics therefore determines the superiority of one steganography technique
over another. Designing information hiding algorithms that are statistically
undetectable and can hide a large amount of data is thus the main goal of
steganography (Cox et al., 2008).
Generally, a steganographic system fails if an attacker is able to prove the existence
of a secret message or if the embedding technique arouses the suspicion of attackers.
If a steganographic algorithm leaves a trace during embedding, then it can be detected
through statistical analysis. For an algorithm to be statistically undetectable, it
should be impossible for a warden to prove statistically the existence of hidden
information.
4.3.1 Measure of Steganographic Capacity
Fundamentally, the capacity of a steganographic system is used as one of the
evaluation criteria and is defined as the amount of information that can be hidden
within the cover image. Capacity is the most important parameter, since the size of
the secret information has a direct impact on a steganographic system. The capacity
of a steganography technique is therefore evaluated as the maximum number of bits
that can be embedded in a given cover image with a negligible probability of
detection by an adversary. Moreover, the size of the hidden information relative to the
size of the cover image is known as the embedding rate or capacity (Venkatraman et al.,
2004). The steganographic embedding operation needs to preserve the statistical
properties of the cover image in addition to its perceptual quality. Steganographic
systems mainly used for secret communication aim to maximize the steganographic
capacity and minimize the perceptibility of hidden messages in stego images (Wang
and Wang, 2004). Cole and Krutz (2003) stated that ‘the more data you can hide, the
better the technique’. However, the steganographic capacity tends to be restricted by
the size of cover files (Artz, 2001; Rabah, 2004). Therefore, the design of a
steganography technique should take into consideration how to increase the amount of
secret data that can be embedded without affecting the properties of the stego-image.
Additionally, improving the stego image quality while maintaining the steganographic
capacity is also considered a significant contribution (Wu and Hwang, 2007).
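The two quantities discussed above reduce to simple arithmetic, as the following sketch
shows. It assumes plain LSB replacement as the embedding technique, which is only one
possible choice; the function names and the default of one bit per color channel are
assumptions made for this example. The raw figure is an upper bound that ignores the
requirement of a negligible probability of detection.

```python
def lsb_capacity_bits(width, height, channels=3, bits_per_channel=1):
    """Raw number of bits that LSB replacement can store in a cover image."""
    return width * height * channels * bits_per_channel


def embedding_rate(message_bytes, width, height):
    """Embedding rate in bits of message per pixel of the cover image."""
    return 8 * message_bytes / (width * height)


capacity = lsb_capacity_bits(512, 512)      # 24-bit color image, 1 LSB per channel
rate = embedding_rate(8192, 512, 512)       # an 8 KiB message in the same image
```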
4.3.2 Measure of Robustness
When determining the robustness of algorithms against image manipulation attacks, a
distinction can again be made between the cover-image and the stego-image, as the
embedding of data changes the bits of the image data and can introduce distortion.
During communication of a stego-image between authorized parties, the image may be
altered by an attacker in an attempt to remove hidden information. It is thus
important for steganographic algorithms to be robust against malicious as well as
unintentional changes to the image. However, the design of most steganographic
systems does not consider robustness a fundamental requirement, since the majority
of these systems assume the passive warden scenario (Cox et al., 2008). Hence,
steganographic systems are either not robust against modifications or have only
limited robustness against technical modifications.
4.3.3 Measure of Imperceptibility
The invisibility of the embedded information is the first and foremost evaluation
criterion, since the strength of image steganography lies in its ability to go
unnoticed by the human eye. If changes to the bits of the image lead to noticeable
visual distortion, the overall objective of the steganographic method fails. On the
other hand, if the level of invisibility of an image steganography algorithm is high,
the overall objective of the approach is fulfilled. For better evaluation and
comparison it is therefore necessary to consider the perceptibility of the resultant
image in the evaluation process. The methods that can be used to evaluate the
undetectability or imperceptibility of steganographic systems differ from one system
to another depending on the type of cover file used for data hiding.
Two types of perceptibility can be distinguished and evaluated in image processing
systems, namely fidelity and quality (Stoica et al., 2003). Fidelity is the perceptual
similarity between images before and after processing. For image based
steganography, the fidelity is defined as the perceptual similarity between the original
cover image and the stego image. Fidelity evaluation therefore requires both
versions of the image, before and after embedding. Attackers and recipients,
however, do not have access to the original cover image. Additionally,
steganographic systems must avoid attracting the attention of anyone not involved in
the secret communication, so stego images must have very good
quality. Quality is therefore the major perceptual concern for most steganography
techniques in order to avoid any suspicion and hence detection (Cox et al., 2008).
Even though the peak signal-to-noise ratio (PSNR) and the mean square error (MSE)
are by definition fidelity metrics, they also act as quality measures, since they are
perceptual distance metrics used to measure the distortion of an image
(Stoica et al., 2003; Wang et al., 2002b). Accordingly, a high quality image entails a
large PSNR value, meaning that the cover image and the stego image are very similar
and almost indistinguishable (Yu et al., 2007; Cheddad et al., 2010). Notably,
'fidelity' is here defined as the perceptual quality of stego files, so PSNR and
MSE describe how imperceptible the secret message is (Cox et al., 2008).
The level of statistical undetectability of an image steganography algorithm is
determined by the amount of noticeable difference between the cover-image and the
stego-image; it is therefore very important that no visual difference appears
between the two images. In most cases of steganography,
the term security is treated as equivalent to undetectability, so secure
steganographic systems are taken to be imperceptible steganographic systems (Cox et al.,
2008). Chang et al. (2002) stated that 'The better quality the stego image has, the
more secure the steganography system will be'. An imperceptible steganographic
system thus means that the hidden information cannot be perceived by the human visual
system or detected by statistical means. Nevertheless, hiding secret information in a cover
image may introduce some noise or modulate the cover image (Venkatraman et al.,
2004). The embedding process must therefore not degrade the perceived quality of the
stego image if the steganographic system is to be secure.
4.3.3.1 Evaluating the Quality of the Images
A steganographic method is considered secure if it is difficult for attackers to detect
the presence of hidden data in the stego files by using any accessible means.
Additionally, the hidden message must be invisible both perceptually and statistically
in order to avoid arousing the suspicion of attackers. Visual quality here refers to any
visual-quality metric appropriate for evaluating the visual distortion caused by the
embedding process. Evaluating and analyzing the quality of images remains a significant
issue in many image processing applications, and assessing the perceived quality of
digital images is very important (Tan et al., 1998). Generally, there are two primary
ways to measure image quality: objective quality methods (automated) and subjective
quality methods (human based) (Stoica et al., 2003). The objective methods measure the
physical aspects of images and psychological issues, while the subjective methods are
psychologically based and use human observers to evaluate the quality of images. For
example, subjects can be asked to compare a modified image with its original version in
order to judge how much the modified image is degraded (Wu and Rao, 2005).
A steganographic system is perfectly secure if the statistics of the cover file and those
of the stego file are identical. Therefore, the characteristics and attributes of cover
files should not be changed and no distortions should be produced during the embedding
process (Venkatraman et al., 2004). However, the presence of statistical anomalies (for
example, in histograms and a variety of higher-order statistics) may be used by an
adversary to prove that a secret communication is taking place (Cox et al., 2008).
Accordingly, the higher the quality of stego images, the greater the imperceptibility of
the steganographic system. Evaluating the quality of stego images is therefore a
significant measure of the performance of image steganography
techniques (Wu and Hwang, 2007).
4.3.3.2 Objective Quality Assessment
Designing image quality evaluation metrics that can automatically predict the
perceived image quality is the main goal of objective image quality assessment
research (Wang et al., 2002a). Thus, the assessment algorithms designed for objective
image quality evaluation should be in close agreement with subjective human
evaluation regardless of the image content, the distortion amount, or the distortion
type (Sheikh et al., 2006). Objective image quality evaluation metrics are classified
into three generic categories according to the availability of the unmodified or original
image (reference). These categories are full-reference (FR), no-reference (NR), and
reduced-reference (RR) image quality assessment (Wang et al., 2002b). Full-reference
means that both the original image and the test (impaired) image are available;
no-reference means that only the test image is available; and
reduced-reference means that the test image and some information about the
original image are available.
Nowadays, the most popular distortion measures used to evaluate the
quality of images in the field of image processing are the peak signal-to-noise ratio
(PSNR) and the mean square error (MSE). In the literature, PSNR has been reported
to perform best among almost all objective image
quality metrics under different image distortion environments and strict testing
conditions (Wang et al., 2002a).
PSNR and MSE are the most common and widely used full-reference (FR) metrics
for objective image quality evaluation (Sheikh et al., 2006). Furthermore, PSNR is
used in many image processing applications and is considered a reference model for
evaluating the efficiency of other objective image quality evaluation methods (Wang et
al., 2002b). The PSNR measures the similarity between two images (how close the two
images are to each other) and is usually expressed in decibels (dB), while the MSE
measures the difference between the two images and is expressed in squared
pixel-intensity units. Both metrics are very easy and fast to compute, which is why they
are so widely used (Wang et al., 2002a). The MSE is the mean of the squared differences
between corresponding pixel values of the original and the reconstructed image. PSNR
and MSE are defined as follows (Stoica et al., 2003; Wang et al., 2002b):
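\[ \mathrm{PSNR} = 10 \log_{10}\left( \frac{255^2}{\mathrm{MSE}} \right) \qquad (4.3) \]
where 255 is the maximum value of an 8-bit pixel and the mean square error (MSE) is a
measure used to quantify the difference between
the cover image I and the stego (modified) image I'. If the image has a size of M × N
then
\[ \mathrm{MSE} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( I(i,j) - I'(i,j) \right)^2 \qquad (4.4) \]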
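The two grayscale metrics can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in this work: it assumes the standard definitions MSE = (1/MN) ΣΣ (I(i,j) − I'(i,j))² and PSNR = 10 log10(MAX²/MSE) with MAX = 255 for 8-bit images, and the function names `mse` and `psnr` and the toy pixel values are hypothetical.

```python
import math


def mse(cover, stego):
    """Mean square error between two equally sized grayscale images.

    Images are lists of rows; each row is a list of integer pixel values.
    """
    if len(cover) != len(stego) or any(
        len(c) != len(s) for c, s in zip(cover, stego)
    ):
        raise ValueError("cover and stego must have the same dimensions")
    pixel_count = sum(len(row) for row in cover)
    if pixel_count == 0:
        raise ValueError("images must contain at least one pixel")
    squared_error = sum(
        (c - s) ** 2
        for cover_row, stego_row in zip(cover, stego)
        for c, s in zip(cover_row, stego_row)
    )
    return squared_error / pixel_count


def psnr(cover, stego, max_value=255):
    """PSNR in dB (assumed 8-bit peak); infinite for identical images."""
    error = mse(cover, stego)
    if error == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / error)


# Toy 2 x 3 cover image and a stego version with three pixels changed by 1.
cover = [[52, 55, 61], [59, 79, 61]]
stego = [[53, 55, 60], [59, 78, 61]]

print(mse(cover, stego))             # 0.5
print(round(psnr(cover, stego), 2))  # about 51.14 dB
```

Identical images give an MSE of zero, for which the PSNR is undefined; the sketch returns infinity in that case, which is one common convention rather than the only one.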
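According to Yu et al. (2007), for color images PSNR is similarly defined as follows:
\[ \mathrm{PSNR} = 10 \log_{10}\left( \frac{255^2}{\mathrm{MSE}} \right) \qquad (4.5) \]
where the MSE for color images is defined as follows
\[ \mathrm{MSE} = \frac{\mathrm{MSE}_R + \mathrm{MSE}_G + \mathrm{MSE}_B}{3} \qquad (4.6) \]
where MSE_R, MSE_G, and MSE_B are the MSE of the red, green, and blue components
respectively.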
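The color case can be sketched in the same way. The example below is illustrative only and rests on two assumptions: that the overall MSE is the plain average of the three per-channel MSE values, and that the peak value is 255 (8 bits per channel). The names `channel_mse` and `color_psnr` and the pixel values are hypothetical.

```python
import math


def channel_mse(cover, stego, channel):
    """MSE of one color channel (0 = red, 1 = green, 2 = blue).

    Images are lists of rows of (r, g, b) tuples and are assumed to have
    identical, non-empty dimensions.
    """
    squared_error = 0
    pixel_count = 0
    for cover_row, stego_row in zip(cover, stego):
        for cover_px, stego_px in zip(cover_row, stego_row):
            squared_error += (cover_px[channel] - stego_px[channel]) ** 2
            pixel_count += 1
    return squared_error / pixel_count


def color_psnr(cover, stego, max_value=255):
    """PSNR in dB from the average of the red, green, and blue MSE values."""
    mse_r, mse_g, mse_b = (channel_mse(cover, stego, ch) for ch in range(3))
    overall = (mse_r + mse_g + mse_b) / 3
    if overall == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / overall)


# Toy 1 x 2 color image: red changed by 1 in one pixel, green by 2 in the other.
cover = [[(10, 20, 30), (40, 50, 60)]]
stego = [[(11, 20, 30), (40, 52, 60)]]

print(channel_mse(cover, stego, 0))        # 0.5 (red)
print(channel_mse(cover, stego, 1))        # 2.0 (green)
print(round(color_psnr(cover, stego), 2))  # about 48.92 dB
```

Averaging the channels treats red, green, and blue as equally important, which is a simplification given that the human eye is not equally sensitive to the three.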
Thus the best image quality is obtained when the MSE value is very small or approaches
zero, since the difference between the original and reconstructed image is then
negligible. PSNR values between 20 and 40 dB can be considered typical
(Cole and Krutz, 2003). Moreover, the higher the PSNR value of a stego
image, the better the degree of hidden message imperceptibility. For example, it is
difficult for the human visual system to recognize any difference between a grayscale
cover image and its stego image if the PSNR value exceeds 36 dB (Wu and Hwang,
2007). According to Cheddad et al. (2010), PSNR values falling below 30 dB indicate
a fairly low quality, i.e., distortion caused by embedding can be obvious, and a high
quality stego-image should strive for a PSNR value of 40 dB and above. These
metrics are used to quantify the distortion caused by an embedding process. By
employing such quality metrics, it is likely that future digital steganography systems
can be benchmarked in support of designing efficient steganographic systems.
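As a rough worked illustration of these thresholds (an assumption added here for illustration, not a result from the cited sources), suppose an embedding scheme changes each 8-bit pixel by at most one intensity level, as replacing only the least significant bit would. Every squared pixel difference is then at most 1, so with the standard definition PSNR = 10 log10(255²/MSE):

\[ \mathrm{MSE} \le 1 \quad \Longrightarrow \quad \mathrm{PSNR} \ge 10 \log_{10}\left( \frac{255^2}{1} \right) = 10 \log_{10}(65025) \approx 48.13 \ \mathrm{dB} \]

Under this assumption such a stego-image lies above the 40 dB level regarded as high quality. Schemes that alter more bits per pixel have a larger worst-case MSE and therefore a lower PSNR bound.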
4.3.3.3 Subjective Quality Assessment
This kind of evaluation is based on human observation of images in order to
analyze or assess their visual quality. However, visual sensitivity varies from one
person to another and changes over time, so different viewers can respond
differently. Objective image quality measures may not perfectly reflect the
impression of humans. Thus, the subjective quality measure represents a true
performance benchmark for image processing tools (Stoica et al., 2003). Unlike
objective quality measures, subjective measures represent the most reliable method to
determine the actual image quality since human beings are the ultimate intended
receivers. The subjective test is thus one of the best and most
reliable methods for evaluating the quality of images. Subjective measures
use structured experimental designs and real end users or human subjects to assess the
quality of images (Tan et al., 1998; Wu and Rao, 2005). Furthermore, they are the
most widely recognized methods for image quality evaluation since they quantify the
actual perceived quality. However, subjective experiments for image quality
evaluation are complex, difficult to repeat, very expensive, and time consuming (Wu
and Rao, 2005). Generally, observers are asked to rate the quality of images,
sometimes with reference to other images.
Evaluating performance on individual data sets is desirable when directly
comparing two methods on one data set. The quality metrics show the
relationship between the bit or detection error and the visual image quality for a
fixed attack, and are very useful in comparing different steganographic methods since
they facilitate immediate robustness comparisons at a given visual image quality.