Perception of Steganography on .wav Files


A senior project, done at Columbia College Chicago, on the ability of listeners to detect information hidden in .wav files using Least Significant Bit steganography.


Perception of Least Significant Bit Steganography in .wav Audio Files by John Garretson

Audio Arts and Acoustics, Columbia College Chicago, 33 East Congress Parkway, Chicago, IL 60605

Introduction

Steganography is the process of hiding information inside a medium while maintaining the integrity of that medium. Least Significant Bit (LSB) steganography is the process of taking the lower-valued bits of a digital structure and changing them to represent another digital structure, e.g. a picture, audio, or text. Using the digital structure of .wav files and the LSB technique, steganography was performed on various .wav files to find the threshold at which listeners can perceive an LSB steganographic mark in 16-bit .wav files. The project began in December 2010, concluded in May 2011, and had 30 test participants.

Methods

First, a program was created that adds an LSB steganography mark to .wav files. The program makes a copy of the .wav file to be marked and takes in the information to be hidden in it. Next, the program iterates through all of the wave data samples in the .wav file, turning each one into a bit array. The bit to be marked is replaced by a bit from the message being hidden, and the sample is then returned to the wave data byte array. Test files for the experiment were all pure-tone wave files, five seconds in length, 16-bit, stereo, with a 44.1 kHz sample rate. Test files were at 500 Hz, 1000 Hz, and 3150 Hz, each at a 100% signal level and a 25% signal level, for a total of six test waves. LSB steganography was performed on the first half of each file, one bit position at a time, for all 16 bits. The frequency-over-time displays of the 500 Hz full-amplitude test wave for all 16 bit-depth marks can be seen below and are representative of the effect of steganography on the other test .wav files.

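[Figure: frequency-over-time displays of the 500 Hz full-amplitude test wave with the mark applied at each bit from 16 down to 1, followed by the unmarked file.]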
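The marking step described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the program used in the project: the function names (make_tone, embed_bit, mark_first_half) are hypothetical, the sketch works on a single channel of signed 16-bit samples rather than a stereo .wav file, and it assumes the message bits are simply repeated until the first half of the file is filled, which the poster does not specify.

```python
import math


def make_tone(freq_hz, level=1.0, seconds=5.0, rate=44100):
    """Return one channel of signed 16-bit samples for a pure sine tone.

    `level` is the fraction of full scale (1.0 = 100%, 0.25 = 25%).
    """
    peak = int(32767 * level)
    n = int(seconds * rate)
    return [round(peak * math.sin(2 * math.pi * freq_hz * i / rate))
            for i in range(n)]


def embed_bit(sample, bit_pos, message_bit):
    """Replace one bit of a signed 16-bit sample with a message bit.

    bit_pos counts from 1 (least significant) to 16 (most significant).
    """
    raw = sample & 0xFFFF                 # view the sample as 16 raw bits
    mask = 1 << (bit_pos - 1)
    raw = (raw & ~mask) | (mask if message_bit else 0)
    # Convert the 16 raw bits back to a signed value.
    return raw - 0x10000 if raw & 0x8000 else raw


def mark_first_half(samples, bit_pos, message_bits):
    """Mark the first half of the samples at bit_pos.

    Assumption: the message bits are cycled to cover the marked region.
    """
    half = len(samples) // 2
    marked = list(samples)                # work on a copy, as the poster describes
    for i in range(half):
        bit = message_bits[i % len(message_bits)]
        marked[i] = embed_bit(samples[i], bit_pos, bit)
    return marked
```

For example, mark_first_half(make_tone(500), 4, [1, 0, 1, 1]) would produce a 500 Hz full-level tone whose first 2.5 seconds carry the message in bit 4 of each sample, leaving the second half untouched.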

Methods Continued

An ABX test program was then created using Microsoft Visual Studio 2008 to test participants' ability to discriminate between marked test files and the original unmarked .wav file. ABX is a well-established audio test that plays a series of three sounds for the participant: A, B, and then X. The A Sound is the original unmarked .wav file, the B Sound is a file marked at a given bit depth, and the X Sound is a random selection of either the A Sound or the B Sound. The participant is then asked whether the X Sound they heard was the A Sound or the B Sound.

The program starts by asking participants to take a simple survey to establish demographics for later analysis of the results. Once the survey is taken, the ABX test begins by comparing the .wav test files with the highest-order bit marks to their original unmarked counterparts. The participant is not allowed to listen to the X Sound until both the A Sound and the B Sound have been listened to at least once. Participants can listen to all three sounds as many times as they wish before making their selection. The number of times the X Sound is played in each round is recorded and can be used to calculate a confidence factor for each response from a given participant.

When the participant selects the correct response for the X Sound, the B Sound is lowered in bit depth, the X Sound is randomized again, and the process starts over. When the participant's selection is incorrect, the program records the current bit depth for that test .wav and moves on to the next .wav test file group. After all six test groups have been tested, the program saves the survey information, the lowest bit depth achieved for each .wav test group, and the number of plays of the X Sound for each trial to a text file on the testing computer's hard drive. Screenshots from the test program can be seen below.

[Figure: screenshots of the test program's Survey Screen and Testing Screen.]
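The descending ABX procedure for one test file group can be sketched as below. This is a rough Python illustration, not the Visual Studio 2008 program from the project. The names run_group and simulated_listener are hypothetical, the listener is modeled as a callback, the recording of X Sound play counts is omitted for brevity, and returning None when every bit is answered correctly is an assumption, since the poster does not say how that case was recorded.

```python
import random


def run_group(respond, start_bit=16, rng=random):
    """Run one test file group, starting at the highest-order bit mark.

    `respond(bit, x)` stands in for the participant: it receives the bit
    depth of the B Sound and the true identity of X ("A" or "B"), and
    returns the participant's answer ("A" or "B").
    Returns the bit depth at which the first incorrect answer occurred,
    or None if every bit depth was answered correctly (assumption).
    """
    for bit in range(start_bit, 0, -1):
        x = rng.choice("AB")      # X is randomly the unmarked A or the marked B
        if respond(bit, x) != x:
            return bit            # record the current bit depth and stop
        # Correct answer: lower the bit depth and randomize X again.
    return None


def simulated_listener(threshold, rng=random):
    """Hypothetical listener: hears marks at or above `threshold`, else guesses."""
    def respond(bit, x):
        if bit >= threshold:
            return x
        return rng.choice("AB")
    return respond
```

Calling run_group(simulated_listener(5)) simulates a participant who can reliably hear marks down to the 5th bit and guesses below that, so any recorded miss falls at bit 4 or lower.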

Results

The average bit depth across all testing showed that as the level of the .wav file was decreased from 100% to 25%, performance in perceiving the steganographic mark improved by approximately 1 bit. The 1000 Hz 100% test wave had the worst bit depth average, but this is most likely because it was the very first test participants took and the testing has a slight learning curve. Performance across the three frequencies was very similar, so frequency sensitivity (as described by the Fletcher-Munson curves) did not appear to affect perception of the mark. The data also showed that younger participants and participants with experience playing musical instruments did better, by approximately 1-2 bits, than older participants and participants with less experience playing a musical instrument. The data suggested that participants who download their music, as opposed to buying it the traditional way, performed better, but this can most likely be attributed to almost all participants in the older demographic also being in the buy demographic.

Conclusions

The average limit for the most sensitive files was slightly higher than the 5th bit. This implies that an LSB steganography mark using least significant bits 1-4 would not be detectable. A mark with a bit depth of 4 bits would allow 25% of the .wav file size to be devoted to the mark, so a 40 MB .wav file could store 10 MB of undetectable information. Future testing should be done with bit depths of 1-5 bits on music and speech files to see whether the results are similar, after which applications can be developed to make use of the findings.
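The capacity figure follows from the ratio of marked bits to sample bits: 4 of every 16 bits is 25%. A small Python check of that arithmetic is given below. The function name lsb_capacity_bytes is hypothetical, and the sketch assumes the whole file is 16-bit sample data, ignoring the small .wav header.

```python
def lsb_capacity_bytes(audio_bytes, bits_used, sample_bits=16):
    """Bytes of hidden data that fit when `bits_used` bits of every sample carry the mark."""
    return audio_bytes * bits_used // sample_bits


# A 40 MB file marked in its 4 least significant bits (taking 1 MB as 10**6 bytes).
print(lsb_capacity_bytes(40_000_000, 4))
```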