I pledge on my honor that I have not given or received any unauthorized assistance on this assignment/examination. I further pledge that I have not copied any material from a book, article, the Internet, or any other source, except where I have expressly cited the source. Signature _______________________________ Date: ___________________
Acoustic Detection of Foreign Sounds in an Urban Environment
Submitted to MSC Summer Research Institute
1 Castle Point Terrace Hoboken, NJ 07030
By: Alvaro Murillo University of Alaska Fairbanks Anthony Bianco Stevens Institute of Technology Laurie Prinz Stevens Institute of Technology
Raúl Huertas University of Puerto Rico Mayaguez Yegor Sinelnikov Stevens Institute of Technology
“Written and presented with the support of the Maritime Security Center, A Department of
Homeland Security Science and Technology Center of Excellence.”
July 28th, 2016
This material is based upon work supported by the U.S. Department of Homeland Security under Grant Award Number 2014-ST-061-ML0001. The views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the U.S. Department of Homeland Security.
TABLE OF CONTENTS
Abstract
Executive Summary
Introduction
Materials & Methods
    List of Materials
    Microphone and F8 Multitrack Sensitivity
    Calibration
    Audio and Spectrogram Synchronization
    Bill of Material
    USMMA Procedure
    Penn Plaza Pavilion Procedures
    Anechoic Chamber Procedure
    Hoboken Pier Procedure
    Test Locations
Definitions and Equations
Results & Discussion
    USMMA
    Penn Plaza Pavilion
    Anechoic Chamber
    Hoboken Pier
Potential Filtering Process
Conclusion
Recommendations
Appendices
    Calibration Factor Verifications Penn Plaza Pavilion
    Calibration Factor Verifications USMMA
    Calibration Factor Penn Plaza Pavilion
    Calibration Factor Verifications Anechoic Chamber
    USMMA Data Organization
    Penn Plaza Data Organization
    Anechoic Chamber Data Organization
    Additional Figures
References
Acknowledgement
LIST OF FIGURES
Fig 1: Process of Sound Event Recognition
Fig 2: Microphone Setup at Penn Plaza (left) and Zoom F8 (right)
Fig 3: United States Merchant Marine Academy Experimental Setup
Fig 4: Penn Plaza Pavilion Experimental Setup
Fig 5: Stevens Institute of Technology Anechoic Chamber Experimental Setup
Fig 6: Joint Research with Buoy Team at Hoboken Pier
Fig 7: United States Merchant Marine Academy Experimental Research
Fig 8: Recording 5 Spectrogram; Boat Traveling Downwind
Fig 9: Recording 5 Spectrogram; Boats Acoustic Signature Fades Away; 0-2 kHz
Fig 10: Recording 6 Spectrogram; Boat Traveling Downwind
Fig 11: Recording 6 Sound Pressure Level vs Distance (left), Sound Pressure Level vs Time (right)
Fig 12: Recording 6 Sound Pressure Level vs Distance (left), Sound Pressure Level vs Time (right)
Fig 13: Recording 10 Spectrogram; Boat Traveling Upwind
Fig 14: Recording 10 GPS Track (Left), Distance vs Time (Top), Azimuth vs Time (Bottom)
Fig 15: Recording 10 Sound Pressure Level vs Distance (left), Sound Pressure Level vs Time (right)
Fig 16: Recording 10 Lloyd Mirror Effect
Fig 17: Experimental Research at Penn Plaza Pavilion
Fig 18: Temperature Records throughout Penn Plaza Experiment
Fig 19: Humidity Records throughout Penn Plaza Experiment
Fig 20: Wind Speed Records throughout Penn Plaza Experiment
Fig 21: Sound Pressure Records throughout Penn Plaza Experiment
Fig 22: Error Bar Chart of Sound Pressure Level as a function of Time at Penn Plaza Recording with Standard Deviation Range
Fig 23: Penn Plaza Pavilion Occurrence of Typical Events
Fig 24: Recording 3 Whistling in Penn Plaza Pavilion at 8:31 AM
Fig 25: Recording 3 Car Horn in Penn Plaza Pavilion
Fig 26: Recording 3 Police Sirens in Penn Plaza Pavilion
Fig 27: Experimental Research at Stevens Institute of Technology Anechoic Chamber
Fig 28: Recording 8 Spectrogram; Gunshot 0-24 kHz
Fig 29: Recording 9 Spectrogram; Sport Whistle 0-24 kHz
Fig 30: Recording 10 Spectrogram; Wooden Whistle 0-24 kHz
Fig 31: Recording 11 Spectrogram; Metal Whistle 0-24 kHz
Fig 32: Recording 1 Spectrogram; Megaphone Siren 0-24 kHz
Fig 33: Recording 12 Spectrogram; Male Yelling, 94 dB
Fig 34: Recording 15 Spectrogram; Female Yelling, 85 dB
Fig 35: Recording 2 Spectrogram; Anechoic Chamber, No event
Fig 36: Experimental Research at Hoboken Pier
Fig 37: Hoboken Pier Helicopter 9:27:46 AM
Fig 38: Environmental Acoustic Data Recording 3, helicopter occurs at 9:27:46 AM
Fig 39: Buoy Hydrophone Data recording 32, helicopter occurs at 9:27:46 AM
Fig 40: Recording 4 Spectrogram; Loud Horn from Vessel
Fig 41: Prevalence of 4.5 kHz boat harmonic over neighboring frequency band
Fig 42: Frequency Bands for Potential Filter
LIST OF TABLES
Table 1. Environmental Acoustics Bill of Material
Table 2. List of sound events, distances, and maximum sound pressure levels
ABSTRACT
New York City has one of the liveliest soundscapes in the world; sounds of heavy traffic,
car horns, sirens, loud neighborhoods, construction equipment, and dogs barking are just a few
of the things that create an intense and dynamic noise environment. Identification of the sound
source characteristics in a boisterous environment may have substantial benefits from a security
perspective. Having the capabilities to filter out known sound events enhances the likelihood of
acoustic detection and identification of a potentially unlawful target. This capability will
strengthen the maritime security domain by eventually giving authorities the proper instruments
to identify potential threats. The Maritime Security Center’s Summer Research Institute team,
focusing on an Environmental Acoustics Project, conducted environmental air acoustic
measurements in Penn Plaza Pavilion, United States Merchant Marine Academy, Stevens
Institute of Technology Anechoic Chamber, and Hoboken Pier by using a linear alignment of
calibrated microphones. The spectro-temporal signatures of typical city sounds were
identified in the recordings. The corresponding A-weighted sound pressure levels were calculated for
individual sound events and throughout the records. This research report documents
experimental observations and provides examples of city sounds in a noisy environment and in an
anechoic chamber. A real-time algorithm to isolate foreign sounds from typical city sounds
within a record of noise, such as identifying a bird chirp during rush hour at Penn Plaza, has
substantial potential to become a useful tool with various applications in the maritime and
urban security domains.
EXECUTIVE SUMMARY
This report provides a thorough analysis of identifying a sound source and detecting it
over a long distance, determining the sound pressure level, and filtering unwanted sounds
in a given environment. The analysis entails multiple recordings at the United States Merchant
Marine Academy, Penn Plaza Pavilion, Stevens Institute of Technology Anechoic Chamber, and
Hoboken Pier. Calculations and analysis were performed using a robust spectrogram
function in Matlab with a fully functional graphical user interface that allowed the researchers to
manipulate the Fourier transform and adjust the intensity levels and frequency range in order to
isolate a clear spectrogram of a particular sound. The results of the data analysis show that a
single sound source can be isolated and identified within a collection of noise. After determining
the calibration factor and extensive calculations, the sound pressure level of Penn Plaza
Pavilion at different time intervals was determined. In summary, the researchers were able to
identify the acoustic signature of multiple sound events over a long distance that otherwise
would have been unknown to the human ear. The sound pressure level of Penn Plaza Pavilion
was determined with great accuracy which helped to characterize the intensity level throughout
the day. In addition, patterns from the acoustic signature of particular events were used to filter
out undesirable sound characteristics in a noisy environment. The researchers also discovered
that the microphones were a limiting factor for the experiment because the microphones were
not best suited for the outdoor environment.
INTRODUCTION
Environmental Acoustics is the study of sound and vibration of noise sources in the
environment. Agencies are concerned with the control of these noises; unwanted noises can
have significant impacts on human safety. Being able to detect specific events in urban
environments is one task security agencies need to address in order to identify a possible
intruder. Since Maritime Security Agencies are looking for improvements in the security domain,
having technologies that could help identify sound sources will significantly strengthen port
security. This year, the Maritime Security Center is conducting research related to
Environmental Acoustics, Maritime Cybersecurity, and Underwater Buoy Noise with a group of
future engineers from around the nation and faculty members from Stevens Institute of
Technology. In order to help security agencies, the Environmental Acoustics Team conducted
a series of tests in several urban locations to determine the sound pressure level and examine
detection and classification of a sound source. The information collected was organized in a
database and accompanied by thorough documentation of environmental conditions, landscape
surveillance and auxiliary measurements. Sound pressure levels were determined and sound
sources were identified using signal processing algorithms in the temporal and frequency domains.
Sound event recognition [1] comprises three steps: detection, feature
extraction, and classification, as shown in Figure 1 below. First, sound events are detected from
the continuous audio signal. Second, sound events are segmented for feature extraction.
Third, extracted features are classified based on a training set. The training set is continuously
updated in this process. Each step has specific aims and challenges.
Figure 1. Process of Sound Event Recognition
Detection aims to identify segments that differ from the underlying background noise.
The challenge in detection is setting a suitable threshold. Example detection measures are the
zero-crossing rate, higher-order statistics, pitch estimation, and spectral divergence [2].
Feature extraction aims to extract attributes that discriminate between different classes of
sound while minimizing the variation within classes of sound. The challenge is the selection of
the most suitable feature set. Examples are the Mel-frequency cepstral coefficients, temporal
evolution of the signal, harmonic or perceptual information, and sound information across time
and frequency [3,4,5]. Classification aims to assign a label to a sound event based on the
extracted features. The challenge is computational cost. Examples are similarity distance
measures, k-nearest neighbors, dynamic time warping, Gaussian mixture models, hidden Markov
models, artificial neural networks, and support vector machines [6,7].
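The detection step described above can be illustrated with a minimal Python sketch. It is not the team's Matlab implementation: the function names, the median-based background estimate, and the default threshold ratio are assumptions chosen only to show how a zero-crossing rate and an energy threshold might be computed per frame.

```python
import math

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)

def rms(frame):
    """Root-mean-square amplitude of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def detect_events(signal, frame_len, threshold_ratio=3.0):
    """Indices of frames whose RMS exceeds threshold_ratio times the
    median frame RMS (the median stands in for the background level;
    both the median and the ratio are illustrative assumptions)."""
    frames = [
        signal[i:i + frame_len]
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]
    levels = [rms(f) for f in frames]
    background = sorted(levels)[len(levels) // 2]
    return [i for i, lvl in enumerate(levels) if lvl > threshold_ratio * background]
```

Choosing the threshold ratio is exactly the challenge noted in the text: a low value flags ordinary fluctuations of the background, while a high value misses quiet events.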
Sound event recognition is of interest in acoustic surveillance and environmental monitoring
applications. Recently, speech and sound event extraction and classification techniques have
been developed. Although both speech and sound event extraction methods are based on
similar signal processing concepts, there are a number of differences. The sound events have
wider variety in frequency content, duration and profile, and cannot be split in words or
phonemes. Furthermore, the environmental noise, distortion, reverberation and overlapping
sources complicate sound event recognition. Together, this makes sound events more suitable
for classification based on their visual time-frequency representation or spectrogram image
processing.
The power of visual analysis of spectrograms has been explored in speech processing [8].
While spectrograms have been useful in human speech analysis, they demonstrated limited
success as a voiceprint of human vocalization [9]. Nevertheless, spectrograms became a
major tool in studies of how people pronounce different words and syllables.
The large amount of information contained in the spectrogram makes it attractive for sound event
recognition. Sound events are typically shorter in time and are lexically less connected
compared to human speech, leading to certain advantages of operating in the spectrogram
image domain [10], [11]. Processing a spectrogram as an image opens up the wide range of
techniques developed in conventional image processing [e.g., 12,13,14,15]. The representation
of a sound event as an image in the time-frequency domain inspired the development of novel image
processing methods [16,17]. Moreover, processing a spectrogram as an image may create a
methodological and algorithmic base for a fusion of acoustics and video processing in
surveillance applications [18].
A typical spectrogram of a sound event consists of an overlap of harmonic lines, curves,
diffuse patterns, and a time-dependent background, bearing similarity to conventional images.
Some sound events are easily differentiated by the look of their spectrograms. Examples of a traffic
whistle, a starter gun shot, and a police siren are shown in subsequent sections of this report.
Numerous spectrogram image processing techniques exist in the literature. The concept of
spectrogram image based processing demonstrated good results in classification of
environmental sounds [19]. For improved detection performance, the noise can be removed by
means of image processing operations [20]. A comprehensive review of sound event recognition
methods can be found elsewhere [21].
Image processing of sound events is an area of active ongoing research. It has pros and cons.
The cons include transient temporal and spectral variability in environmental noise and sound
events not otherwise present in conventional images’ background and the lack of solid
geometrical constraints employed in image pattern recognition. The pros include the large variety of
image processing and machine learning algorithms applicable to spectrogram processing to
enable the feature extraction and classification of sound events. The pros also include
well-developed image processing methodologies to effectively reduce noise and substantial
interdisciplinary efforts supported by steady advances in microprocessor and system
communication technologies.
MATERIALS AND METHODS
List of Materials:
● Zoom F8 Multitrack Recorder
● Behringer B-5
● Digital Sound Level Meter
● ND9 Sound Calibrator
● Matlab 2015
● Modified Tripod
● Wind Protection
Figure 2. Microphone Setup at Penn Plaza (left) and Zoom F8 (right)
Behringer B-5 Microphone and F8 Multitrack Recorder Sensitivity:
Since the Behringer B-5 microphones are meant for indoor recording, there were a few
obstacles that the group had to overcome while using the microphones outside. The
researchers discovered that the microphones were sensitive to humidity and would produce
static if exposed to humidity for too long. After this discovery, the microphones were used in
humid areas only for a limited amount of time. The frequency response for the Behringer B-5
microphone is 20 Hz – 20 kHz. The max SPL is 140 dB and the equivalent SPL is 16 dB.
The F8 Multitrack Recorder was designed for professional filmmakers and sound designers, so
the recorder has many useful qualities. It has 8 channels with a low noise floor of
-127 dB and a gain of up to 75 dB. The F8 Recorder records at 24-bit/192 kHz resolution and
offers 10 dB of headroom. Although the F8 Recorder provides a time stamp, the group
discovered that it was sometimes inaccurate.
Calibration
Calibration Option 1
1. Use sound level calibrator with microphone of interest
2. Record signal
3. Integrate signal in 1 second time window to calculate sound pressure level in dBA
Calibration Option 2
1. Use digital sound level meter
2. Record signal
3. Integrate signal in 1 second time window to calculate sound pressure level in dBA
An ND9 Sound Calibrator was used to calibrate the microphones throughout the summer
research. The calibrator can produce two different calibration tones: one at 94 dBA and the
other at 114 dBA. Depending on the application, one of
them is selected to conduct the calibration process. Since the environment of interest is very
noisy, the 94 dBA tone was selected for the calibration process. The Environmental Acoustic
team conducted different recordings in order to verify that the microphones were calibrated. The
calibration factor for those recordings was calculated using a Matlab algorithm. The algorithm
calculates the intensity of the sound in decibels for a selected recording. A
sound pressure level meter (SPL meter) was used to determine the real-time sound pressure level of
the calibration recordings, environmental surroundings, or a single sound source. The device
determined the sound pressure level by taking measurements every 125 milliseconds.
The initial calibration factor calculated for a particular recording served as a baseline that the
researchers adjusted depending on the sound source, environment, or
equipment. Signal processing was accomplished using a Matlab Graphical User Interface
(GUI) that calculated the time-frequency representation of a selected recording. The researcher
chose a specific time window and retrieved the corresponding calibration factor for that
window by analyzing the sound pressure level calculated by the Matlab algorithm. A
comparison was drawn between the sound pressure level from the SPL meter measurements
and the calculated SPL. The objective was to minimize the discrepancy between the Matlab
algorithm and the SPL meter, with small to nonexistent deviations between the microphones. This
process was repeated extensively until a correct calibration factor was determined.
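The calibration idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the team's Matlab algorithm: the function names are hypothetical, the samples are assumed to be the recorder's digital values, and no A-weighting filter is applied, so the result is an unweighted level in dB rather than dBA.

```python
import math

REFERENCE_LEVEL_DB = 94.0  # level of the calibrator tone selected in the text

def rms(samples):
    """Root-mean-square of the digital samples, floored to avoid log10(0)."""
    value = math.sqrt(sum(x * x for x in samples) / len(samples))
    return max(value, 1e-12)

def calibration_factor(calibrator_samples, reference_db=REFERENCE_LEVEL_DB):
    """Offset in dB that maps the digital RMS of the recorded calibrator
    tone onto its known sound pressure level."""
    return reference_db - 20.0 * math.log10(rms(calibrator_samples))

def spl_per_window(samples, sample_rate, cal_db, window_s=1.0):
    """Level in dB for each consecutive, non-overlapping window
    (1 second by default, matching the integration window in the
    calibration options)."""
    n = int(sample_rate * window_s)
    levels = []
    for start in range(0, len(samples) - n + 1, n):
        window = samples[start:start + n]
        levels.append(20.0 * math.log10(rms(window)) + cal_db)
    return levels
```

With a factor obtained this way, the per-window levels can be compared against the SPL meter readings, and the factor adjusted until the discrepancy is acceptably small.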
Audio and Spectrogram Synchronization
The Matlab spectrogram script that was developed in the earlier stages of the internship
generated separate audio and spectrogram recordings of the data that was collected during the
various experiments. The program did this by dividing a 10 second clip into 100 frames with a
90% overlap. The researchers then had to manipulate both the audio and spectrogram
recordings in order to synchronize the two files. Shortly after, the two compatible recordings
were merged together using Windows Movie Maker. This allowed the audio and spectrogram
recordings to play in sync. This process was done for the four channels of all 11 recordings at
Penn Plaza Pavilion, generating 85 gigabytes of data, and 19 gigabytes for the United States
Merchant Marine Academy.
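The framing arithmetic can be worked through with a short Python sketch. It assumes one possible reading of the description above, namely that 100 equal-length frames, each overlapping its neighbor by 90%, exactly span the 10 second clip; the function names are hypothetical and the Matlab script may have defined its frames differently.

```python
def frame_layout(clip_s, n_frames, overlap):
    """Frame length and hop (both in seconds) such that n_frames
    overlapping frames exactly span a clip of clip_s seconds."""
    hop_fraction = 1.0 - overlap
    # Total span = one full frame plus (n_frames - 1) hops.
    frame_s = clip_s / (1.0 + (n_frames - 1) * hop_fraction)
    hop_s = frame_s * hop_fraction
    return frame_s, hop_s

def frame_starts(clip_s, n_frames, overlap):
    """Start time in seconds of every frame."""
    _, hop_s = frame_layout(clip_s, n_frames, overlap)
    return [i * hop_s for i in range(n_frames)]
```

Under this reading, `frame_layout(10.0, 100, 0.9)` gives a frame length of 10/10.9 seconds (about 0.92 s) and a hop of about 0.092 s between successive frames.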
Bill of Materials:
Table 1. Environmental Acoustics Bill of Material
Equipment                                Dimensions (L x W x H)               Weight     Quantity   Cost
Zoom F8 Multitrack Field Recorder        Height 2.1", Width 7", Depth 5.5"    2.1 lbs    1          $999
Behringer B-5                            0.8 x 0.8 x 4.7 inches               8.5 oz     4          $69.99
goSTAND Portable Mic and Tablet Stand    18 x 4 x 4 inches                    3.1 lbs    2          $49.99
Audix DCLIP Microphone Clip              11 x 12.5 x 2 inches                 2.9 oz     4          $14.95
GLS Audio 6ft Patch Cable Cords          8.2 x 8 x 2.5 inches                 2.0 lbs    1          $39.99
K&M 23510 Adjustable Bar                 10 x 1 x 1 inches                    9.1 oz     2          $19.99
Digital Sound Level Meter                10.7 x 8.8 x 2.6 inches              1.8 lbs    1          $47.99
ND9 Sound Calibrator                     4 x 7 x 2 inches                     13.6 oz    1          $169.99
The equipment values reflect the pricing on Amazon.com as of July 28th, 2016.
Methodology: United States Merchant Marine Academy
Figure 3. United States Merchant Marine Academy Experimental Setup
1. Find location where boats frequently sail and where the microphones are best protected
from wind and environment noise.
2. Assemble both tripods and position umbrella in the direct path of the wind.
3. Connect cables to microphones and the F8 Multitrack Field Recorder. Make sure to take
note of the arrangement of the microphones to the channel of the recorder.
4. Attach the directional cap and diffused metal head on each microphone and then place
on the tripods.
5. Place the tripods and microphones facing the direction that will record the sounds of the
boat.
6. Run a test recording in order to determine if the equipment is fully functional and
establish a quiet area where no one can walk or speak.
7. Establish communication with the vessel and have a person on board the vessel record
the GPS movement.
8. Wait for an opportunity where there is the least amount of environmental activity and
signal the vessel to accelerate and move in a zigzag like pattern. Have a researcher
establish a quiet boundary and press record on the Multitrack Field Recorder.
9. Write down the start time, end time, noise level, humidity, wind speed, boat speed, and
distance the boat traveled.
10. Repeat steps 8-9 until the desired number of recordings is reached.
Penn Plaza Pavilion
Figure 4. Penn Plaza Pavilion Experimental Setup
1. Find location near Penn Plaza Pavilion that is protected from the environment (rain &
wind).
2. Assemble the tripod and attach directional cap and diffused metal head to each
microphone. Attach each microphone to the F8 Zoom Recorder and then attach the
microphones to the tripods. Make sure to take note of which microphone is attached to
which channel on the Multitrack Recorder.
3. Take note of the surrounding environment.
a. Time traffic signals to know how often a steady stream of cars will be going by
b. Time how often the train comes and how long it takes to go by
c. Distances to nearby buildings, street corners etc.
4. Record in 10- or 15-minute segments. Make sure to take notes throughout the
recording.
a. Take short video segments
b. Take notes of unusual noises (fire trucks, screaming, etc)
c. Take pictures of things that make unusual noises
d. Document time start/time end
e. Document noise level (dBA)
f. Document temperature
g. Document humidity
h. Document wind speed/direction
i. Document how much it is raining
j. Document the direction microphones are facing
5. After recording, transfer data to computer and run verification analysis in order to
determine if adjustments to the microphones or Multitrack Recorder need to be made.
6. Walk around the perimeter, recording observations of the environment.
7. Repeat steps 4-6 every 30 minutes.
Stevens Institute of Technology Anechoic Chamber
Figure 5. Stevens Institute of Technology Anechoic Chamber Experimental Setup
1. Assemble the modified tripod at the farthest corner of the room. (The farthest corner of
the room is chosen in order to prevent the signal from being saturated from a large
sound)
2. Establish two points in the room; one in the center of the room and one in the opposite
corner to the microphones. Record the distance of those two points to the microphone.
3. Attach the directional cap to each microphone, attach each microphone to the F8 Zoom
Recorder, and then attach the microphones to the tripods. Make sure to take note of the
arrangement of the microphones to the channel of the recorder.
4. Establish safety guidelines since a firearm will be present
a. Safety Glasses and ear protection must be worn by the person who fires the
firearm
b. Always be mindful of which direction the firearm is facing
c. Always assume the firearm is loaded
d. Keep finger off of the trigger until ready to fire
5. When the recording occurs, the professor will count to 5 seconds before firing the first
shot. (This will give the other researchers enough time to cover their ears)
6. While the recording is taking place, the other researchers will take a video recording of
the Digital Sound Level Meter.
7. Repeat steps 5-6 for the different types of calibers.
8. Once the firearm recording has concluded, have the professor return the firearm to the
campus police office.
9. Record other significant sounds, such as a whistle, yelling, or a blender, along with the
Digital Sound Level Meter reading.
Hoboken Pier Joint Research
Figure 6. Joint Research with Buoy Team at Hoboken Pier
1) Assemble the tripods and attach microphones. (Remember to record which microphones
were being used and the connection configuration)
2) Establish communication with the Buoy Team in order for everyone to be on the same
page.
3) Point the microphones in the same direction as the GoPro video recording.
4) Record long segments, roughly 30 minutes to 1 hour, while slightly adjusting the
direction the microphones are facing. (As the boat maneuvers along the Hudson, both
the audio and video recording must capture its movements.)
5) Take diligent notes throughout the audio recording.
a. Record unusual noises
b. Document time start/time end
c. Document noise level (dBA)
d. Document temperature
e. Document humidity
f. Document wind speed/direction
6) Repeat steps 3-5 until the researchers have collected a predetermined amount of data.
Test Locations
United States Merchant Marine Academy: All data collection was completed on June 20, 2016.
Penn Plaza Pavilion: All data collection was completed on June 28, 2016.
Stevens Institute of Technology Anechoic Chamber: All data collection was completed on July 6, 2016.
Joint Research with Buoy Team along Hoboken Pier: All data collection was completed on July 14, 2016.
Different Elevation Recordings of Babbio building: All data collection was completed on July 14, 2016.
DEFINITIONS AND FORMULAS
Definitions:
● Lp1 is the sound pressure level at microphone one
● Lp2 is the sound pressure level at microphone two
● P0 is the reference pressure, equal to 20×10^-6 Pascal
● P is the sound pressure relative to atmospheric pressure
● R1 is the distance from microphone one to the sound source
● R2 is the distance from microphone two to the sound source
Equivalent sound pressure equations:
Equations (1), (2), and (3): Equation 1 gives dB as a function of pressure, Equation 2 is used to
calculate the dB level caused by the addition of multiple sound sources, and Equation 3 yields
the dB at point 2 based upon the two distances from the sound source and the dB level at point
1.
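The three relations described above can be written out as a short Python sketch. The forms used here are the standard acoustics expressions that match the descriptions (level from pressure, incoherent summation of sources, and spherical spreading from a point source); they are assumed forms, not a transcription of the report's own equations, and the function names are hypothetical. In plain notation: Lp = 20 log10(P / P0); Ltotal = 10 log10(sum of 10^(Li / 10)); Lp2 = Lp1 - 20 log10(R2 / R1).

```python
import math

P0 = 20e-6  # reference pressure in pascals, as given in the definitions

def spl_from_pressure(p):
    """Equation 1 (assumed form): Lp = 20 log10(P / P0)."""
    return 20.0 * math.log10(p / P0)

def combine_levels(levels_db):
    """Equation 2 (assumed form): level of several incoherent sources,
    L = 10 log10(sum(10^(Li / 10)))."""
    return 10.0 * math.log10(sum(10.0 ** (l / 10.0) for l in levels_db))

def level_at_distance(lp1, r1, r2):
    """Equation 3 (assumed form, spherical spreading):
    Lp2 = Lp1 - 20 log10(R2 / R1)."""
    return lp1 - 20.0 * math.log10(r2 / r1)
```

Under these forms, two equal sources combine to a level about 3 dB above either one, and doubling the distance from the source lowers the level by about 6 dB.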
RESULTS & DISCUSSION
United States Merchant Marine Academy
The objective of the experimental research was to measure the acoustic signature of a moving boat
and estimate its detection distance in an environment with significant acoustic interference from
helicopters and planes.
Figure 7: United States Merchant Marine Academy Experimental Research
During the experiment, two B-5 Behringer microphones were used to detect the boat's acoustic
noise from several distances while the boat made different maneuvering patterns.
Environmental noise interference from helicopters, airplanes, birds, and people added a degree
of difficulty when attempting to distinguish the boat’s acoustic signature. However, the
Environmental Acoustic team was able to overcome these challenges and successfully identified
the boat’s engine. Figure 8 displays the spectrogram of the boat’s acoustic signature. At 19
seconds into the audio recording, the boat begins to accelerate, increasing the engine revolutions
and establishing the boat’s acoustic signature. The boat was traveling in the
direction of the wind. Wind velocity was recorded to be 9.66 km/h.
As the boat moves away from the microphones, the higher frequencies dissipate faster than the
lower ones. Forty-four seconds into the audio recording, there is a vertical line due to a
Large Rusted Metal Object (LRMO) creating a dinging noise (see appendices for image).
Throughout recording 5, the microphones were able to detect the boat’s frequency up to 159
meters away from the recording station. The boat’s signal begins to fade away around 80 seconds
into the recording as an airplane begins to fly over the equipment (see Fig. 9).
Figure 8. Recording 5 Spectrogram; Boat Traveling Downwind
There is a strong frequency between 0 and 0.2 kHz. It was established that the content between
0 and 0.1 kHz was being created by the equipment itself. This was determined via testing in the
Anechoic Chamber (see Figure 35). This still leaves the possibility of using frequencies
between 0.1 and 0.2 kHz for detection of the boat. At the end of the audio recording the boat
was 198 meters away.
Figure 9. Recording 5 Spectrogram; Boats Acoustic Signature Fades Away; 0-2 kHz
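One way to examine a candidate detection band such as 0.1 to 0.2 kHz is to sum the spectral energy inside it. The Python sketch below is illustrative only: the function name is hypothetical, and it uses a direct discrete Fourier transform for clarity, whereas a practical implementation would use a fast Fourier transform routine.

```python
import cmath
import math

def band_energy(samples, sample_rate, f_lo, f_hi):
    """Sum of squared DFT magnitudes over the bins whose center
    frequency lies in [f_lo, f_hi] (positive frequencies only)."""
    n = len(samples)
    k_lo = math.ceil(f_lo * n / sample_rate)
    k_hi = math.floor(f_hi * n / sample_rate)
    energy = 0.0
    for k in range(k_lo, k_hi + 1):
        # Direct DFT coefficient for bin k.
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(samples)
        )
        energy += abs(coeff) ** 2
    return energy
```

Comparing this value for a frame containing the boat against the same band in a background-only frame indicates whether the band carries usable signal above the equipment noise.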
Figures 10 and 13 display the boat travelling the same distance away from the microphones
downwind and upwind. In Figure 10, the boat traveled to a distance of 221 meters just as an
airplane begins a flight overhead. The recording is stopped just as the airplane begins to fill
the spectrogram. At about 40 seconds into the recording, the higher frequencies between 1 and
1.5 kHz begin to dissipate as the boat increases in distance from the microphones.
Figure 10. Recording 6 Spectrogram; Boat Traveling Downwind
Figure 11 displays three separate images. The left side is the GPS track with a bright green
highlight of the duration of the recording. The top right shows the distance from the
microphones as a function of time, once again highlighted in green for the duration of the
recording. Finally, the bottom right shows the Azimuth from the boat to the microphones.
Figure 11 indicates, on the GPS track, that the path the boat took was downwind. The distance
graph shows that the boat moves from 53 meters to 221 meters over the duration of the recording.
Since the boat was almost directly South of the microphones, the Azimuth is approximately 0
with variation as the boat moves East and West.
Figure 11. Recording 6 Sound Pressure Level vs Distance (left), Sound Pressure Level vs Time (right)
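Distance and azimuth curves of this kind can be derived from GPS fixes with the haversine and initial-bearing formulas. The Python sketch below is a generic illustration, not the processing used for the figures: the function name is hypothetical, and point 1 and point 2 stand for any two latitude and longitude pairs in degrees (for example, the boat and the microphone station).

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters

def distance_and_azimuth(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters and initial bearing in degrees
    (clockwise from north) from point 1 to point 2."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    # Haversine formula for the distance.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    distance = 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))
    # Initial bearing from point 1 toward point 2.
    y = math.sin(dlam) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlam))
    azimuth = math.degrees(math.atan2(y, x)) % 360.0
    return distance, azimuth
```

A point directly south of the station yields an azimuth of 0 degrees toward the station, which matches the near-zero azimuth described for recording 6.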
Figure 12. Recording 6 Sound Pressure Level vs Distance (left), Sound Pressure Level vs Time (right)
The left graph of Figure 12 displays the sound pressure level versus distance for recording 6.
The right graph of Figure 12 displays the sound pressure level versus time for recording 6.
Since the boat was increasing in distance with time, both graphs look similar. However, the
velocity of the boat was not constant, resulting in stretching and compression of the graph from
time to distance. Recording 6 clearly indicated the decrease in sound pressure level as the boat
increased in distance.
In Figure 13 the lines also begin to
dissipate as the boat travels away.
However, the remainders of the lines are
much stronger when the boat is travelling
upwind.
Figure 13. Recording 10 Spectrogram; Boat Traveling Upwind
The GPS track in Figure 14 indicates that the path the boat took was upwind. The distance graph shows that the boat moved from 99 meters to 662 meters over the duration of the recording. The boat started out moving northwest of the microphones but soon adjusted its heading, so the azimuth changes suddenly at the start of the recording and then flattens out as the boat maintained a constant direction.
Figure 14. Recording 10 GPS Track (Left), Distance vs Time (Top), Azimuth vs Time (Bottom)
Figure 15. Recording 10 Sound Pressure Level vs Distance (left), Sound Pressure Level vs Time (right)
Large peaks are clearly visible throughout the recording in both graphs in Figure 15. Although the boat moved out to 662 meters, the peaks interfered with the sound pressure level measurement. The number of peaks, how often they occurred, and their duration severely limited the ability to draw any relation between boat distance and sound pressure level.
Figure 16 shows how the frequencies that were established by the boat were distorted when an airplane flew directly above the recording station. The time-frequency representation of the airplane displayed a natural phenomenon called the Lloyd mirror effect. Sound from the airplane reaches the microphones both directly and after being reflected from the ground, and the two arrivals interfere with each other. This effect causes the frequency content to appear in a wave-like pattern.
Figure 16. Recording 10 Lloyd Mirror Effect
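The wave-like pattern can be illustrated with a simple two-path model. The sketch below is not the report's analysis; it assumes a flat, perfectly rigid ground, a sound speed of 343 m/s, and hypothetical values for the aircraft altitude and microphone height. It computes the path difference between the direct and ground-reflected rays and the frequencies at which the two arrivals cancel. As the aircraft approaches and recedes, the path difference changes and the cancellation frequencies sweep up and down.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed (air at roughly 20 C)

def path_difference(source_height, mic_height, horizontal_range):
    """Extra distance travelled by the ground-reflected ray (image-source method)."""
    direct = math.hypot(horizontal_range, source_height - mic_height)
    reflected = math.hypot(horizontal_range, source_height + mic_height)
    return reflected - direct

def null_frequencies(source_height, mic_height, horizontal_range, f_max):
    """Frequencies (Hz) up to f_max where direct and reflected waves cancel.

    Assumes an ideal rigid ground (reflection without phase inversion), so
    cancellation occurs when the path difference equals an odd number of
    half wavelengths.
    """
    delta = path_difference(source_height, mic_height, horizontal_range)
    nulls = []
    n = 0
    while True:
        f = (n + 0.5) * SPEED_OF_SOUND / delta
        if f > f_max:
            return nulls
        nulls.append(f)
        n += 1

# Hypothetical geometry: aircraft at 300 m altitude, microphone 1.5 m above ground.
for horizontal_range in (2000.0, 1000.0, 500.0, 0.0):
    first_null = null_frequencies(300.0, 1.5, horizontal_range, 500.0)[0]
    print(horizontal_range, round(first_null, 1))
```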
Penn Plaza Pavilion
The objective of the experimental research was to measure urban city noise and create a database for the sound event recognition image processing evaluation.
Figure 17. Experimental Research at Penn Plaza Pavilion
During the experiment, four microphones recorded the area surrounding Penn Plaza in 10-minute increments from 7:30am to 1:00pm. The microphones picked up all environmental noise, such as cars, sirens, horns, whistles, construction, and people. The sound pressure level was determined at different times throughout the day. The following table displays the absolute sound pressure level of one of the recordings collected throughout the experiment at Penn Plaza Pavilion. A total of 11 recordings were taken throughout the experiment, each of which has a unique calibration factor for each of the four channels.
Appendix E displays the data collected during the first recording at Penn Plaza Pavilion. The start time of the recording, temperature, humidity, wind speed, and noise level were all documented at the beginning of the recording. During each recording, events that were out of the ordinary and would be easily distinguishable in a spectrogram were documented. Once the experiment was finished, the observational data was organized into an Excel sheet in a specific format that allows the MATLAB script to synchronize with the data it contains. In the calibration and distance rows, each cell to the right of the label corresponds to one microphone: the first cell after the calibration label is microphone 1, the second cell is microphone 2, and so on. The row labeled “distance” is zero for all of the recordings at Penn Plaza because there was no specific object being recorded. This template was followed for each experiment performed. The calibration factors for each microphone were later determined and included in the data collection Excel sheet.
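The record layout described above can be pictured as a small data structure. The following is an illustrative sketch only: the class and field names are hypothetical and do not come from the report's MATLAB script or Excel sheet. The example values are the ones listed for the first Penn Plaza recording in the Penn Plaza Data Organization appendix.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Event:
    start_s: float       # event time from start of recording, seconds
    duration_s: float    # event duration, seconds
    description: str

@dataclass
class Recording:
    filename: str
    time_start: str
    temperature_f: float
    humidity_pct: float
    wind_mph: float
    noise_dba: float
    calibration: List[float]   # one entry per microphone, microphone 1 first
    distance_m: List[float]    # all zeros when no specific object is tracked
    events: List[Event] = field(default_factory=list)

    def events_at(self, t):
        """Return descriptions of logged events active at time t (seconds)."""
        return [e.description for e in self.events
                if e.start_s <= t < e.start_s + e.duration_s]

rec = Recording(
    filename="Penn Plaza 01 0733.WAV", time_start="7:33 AM",
    temperature_f=70, humidity_pct=75, wind_mph=10, noise_dba=71.1,
    calibration=[2.50e-05, 9.00e-06, 2.00e-05, 1.25e-05],
    distance_m=[0, 0, 0, 0],
    events=[Event(60, 60, "car horns"), Event(300, 60, "talking"),
            Event(420, 60, "sirens"), Event(540, 60, "walking"),
            Event(540, 60, "car horns")],
)
print(rec.events_at(550))  # ['walking', 'car horns']
```

A lookup such as `events_at` is one way the logged observations could be lined up with a time position in a spectrogram.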
The following graph (Figure 18) displays the temperature throughout the day while recording at Penn Plaza. At the beginning of the day, the temperature started at 70 °F and slowly increased. The temperature remained constant at 72 °F for about an hour and a half and then dropped back to 70 °F at 10:30am. The temperature then increased again for the remainder of the day.
Figure 18. Temperature Records throughout Penn Plaza Experiment
The following graph (Figure 19) shows the humidity records throughout the day while recording
at Penn Plaza. At the beginning of the day, the humidity was around 75%. Around 10:00am the
humidity increased dramatically for the remainder of the day. The increase in humidity caused
problems with the microphones. The more humid it was, the more static the microphones
produced on the recording.
Figure 19. Humidity Records throughout Penn Plaza Experiment
The following graph (Figure 20) shows the wind speed, in miles per hour, throughout the day at Penn Plaza. The wind speed varied throughout the day: it was highest at the beginning of the day and then decreased dramatically between 10:00am and 11:00am. The wind speed did not have a great effect on the recordings because the microphones were protected from the wind during the experiment.
Figure 20. Wind Speed Records throughout Penn Plaza Experiment
The following graph (Figure 21) displays the noise level in dBA throughout the day at Penn Plaza. The noise level was highest in the early morning at 8:00am due to rush hour traffic. The sound level decreased at 9:30am and then increased again at 10:00am. Contrary to initial predictions, there was no spike in noise level during lunch time.
Figure 21. Sound Pressure Records throughout Penn Plaza Experiment
Figure 22 displays the average sound pressure level for each microphone from 7:30am to 1:00pm at Penn Plaza Pavilion. There was a sudden increase in the sound pressure level starting at 10:00am, which can be attributed to the opening of the dining services adjacent to the recording station. Multiple pedestrians were walking to and from the dining services around 10:00am. Furthermore, the standard deviation range for each of the four channels, shown as a vertical bar, is provided at the corresponding recording start time.
Figure 22. Error Bar Chart of Sound Pressure Level vs Time at Penn Plaza Recording with Standard Deviation Range.
The following figure (Figure 23)
displays the occurrences of events
recorded throughout the Penn Plaza
Pavilion research experiment. The
most common events throughout
the day were car horns, people
talking nearby, and traffic police
blowing a whistle.
Figure 23. Penn Plaza Pavilion Occurrence of Typical Events
The spectrograms of the three most prolific events at the Penn Plaza Pavilion are provided in the following three figures.
Figure 24 displays the difference in acoustic signatures between a police officer using a whistle to direct traffic at 8:31 AM and a pedestrian whistling with their fingers. The whistles from the police officer were present throughout the recordings and originated 45 meters from the microphone station, whereas the person whistling was 32 meters away.
Figure 24. Recording 3 Whistling in Penn Plaza Pavilion at 8:31 AM
Figure 25 displays a vehicle honking its horn in traffic at 8:39 AM. The event was estimated to be roughly 35 meters away from the recording station. To isolate the event, the frequency scale, intensity range, and Fourier transform settings were adjusted until the sound event was clearly represented. This particular sound event was difficult to isolate because of the large number of frequencies and the disparity in intensities present during the honking of the vehicle.
Figure 25. Recording 3 Car Horn in Penn Plaza Pavilion
The police sirens that were present throughout the experiment at Penn Plaza Pavilion generated unique harmonic patterns at 8:30 AM that went from high to low pitch (see Figure 26). Even though the police sirens were perceived as a continuous sound, their patterns were not connected throughout the spectrogram. Between 6.5 and 8.0 seconds, there appears to be a gap in the harmonic signature. However, the sirens could still be heard.
Figure 26. Recording 3 Police Sirens in Penn Plaza Pavilion
The following spectrograms show notable events that occurred throughout the Penn Plaza Pavilion recordings.
Penn Plaza Pavilion Additional Spectrograms, a-f:
a. Coughing, 71.1 dBA
b. NYPD Truck Siren, 81.3 dBA
c. Pedestrians Talking, 73.5 dBA
d. Janitorial Rolling Bucket, 79.9 dBA
e. Construction Equipment, 75.3 dBA
f. Baby Crying, 79.9 dBA
Stevens Institute of Technology Anechoic Chamber
The objective of the experimental research was to record characteristic sounds from a set of sound events: gunshots, screams, whistles, and sirens. The secondary goal was to estimate their absolute A-weighted sound pressure level.
Figure 27. Experimental Research at Stevens Institute of Technology Anechoic Chamber
Table 2 shows the data that was collected in the anechoic chamber for a variety of events. Listed are the events and the maximum recorded sound pressure levels from microphones 1 and 2, as well as the distances to those microphones. The distances and sound pressure levels were used to calculate the sound pressure level at 1 meter if the microphone was not located there. Equation 3 was used for this calculation.
Table 2. List of sound events, distances, and maximum sound pressure levels.
Event type          Distance to mic 1 (m)   Mic 1 max SPL (dB)   Distance to mic 2 (m)   Mic 2 max SPL (dB)   SPL at 1 meter (dB)
Gunshot             4.27                    108.8                5.19                    98.5                 121.4
Gunshot (misfire)   4.27                    74.5                 5.19                    57.6                 87.1
Man scream          1                       106.7                1.91                    88.5                 106.7
Girl scream         1                       99.9                 1.91                    80.3                 99.9
Whistle (wood)      1                       93                   1.91                    74.4                 93
Whistle (sport)     1                       113.3                1.91                    93.5                 113.3
Whistle (steel)     1                       107.2                1.91                    91                   107.2
Siren               1                       117.5                1.91                    93.6                 117.5
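Equation 3 is not reproduced in this part of the report. The tabulated gunshot values are, however, consistent with the spherical spreading relation SPL(1 m) = SPL(d) + 20 log10(d / 1 m) applied to the microphone 1 reading. The short sketch below assumes that relation; the function name is hypothetical.

```python
import math

def spl_at_1m(spl_db, distance_m):
    """Refer a measured SPL to 1 meter, assuming spherical spreading (assumption)."""
    return spl_db + 20.0 * math.log10(distance_m)

print(round(spl_at_1m(108.8, 4.27), 1))  # gunshot, mic 1 -> 121.4
print(round(spl_at_1m(74.5, 4.27), 1))   # gunshot (misfire), mic 1 -> 87.1
print(round(spl_at_1m(106.7, 1.0), 1))   # man scream, already at 1 m -> 106.7
```

For the events recorded with microphone 1 at exactly 1 meter, the correction is zero, which matches the rows where the "SPL at 1 meter" column equals the microphone 1 reading.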
Figure 28 displays the acoustic signature of a .22 caliber blank pistol being shot in the anechoic chamber at a distance of 5.19 meters. The experiment was conducted in order to identify the acoustic signature of the firearm without any interference from an outside source. The intensity level of the firearm was far greater than the Multitrack Recorder threshold, which caused the audio recording to become saturated.
Figure 28. Recording 8 Spectrogram; Gunshot 0-24 kHz
However, Figure 28 clearly indicates the acoustic characteristics of the firearm: a great intensity at a high frequency followed by a low intensity below 10 kHz for 0.2 seconds. The low-intensity component was likely produced by the gas discharging from the firearm.
Additional experiments involving three different whistles were conducted at the anechoic
chamber and are provided in the following three spectrograms.
Figure 29 demonstrates the acoustic signature of the sport whistle, which was recorded to be the loudest of the three whistles. The intensity of the sport whistle was greatest at 3 kHz and had a deafening effect. Six distinctive frequency lines were established each time the whistle was blown.
Figure 29. Recording 9 Spectrogram; Sport Whistle 0-24 kHz
In comparison, the wooden whistle, which sounded similar to a train conductor's whistle, was not as intense and could be heard without ear protection. The wooden whistle produced two distinct lines under 5 kHz and less distinctive lines above 5 kHz (see Fig. 30).
Figure 30. Recording 10 Spectrogram; Wooden Whistle 0-24 kHz
In contrast, Figure 31 displays the spectrogram of a metal whistle being blown. It is characterized by an intense low-frequency pitch. Five distinctive lines are generated by the metal whistle. As demonstrated above, all three whistles generated their own unique acoustic signatures, which are characterized by their low-frequency intensity, distinctive frequency patterns, and horizontal frequency configurations.
Figure 31. Recording 11 Spectrogram; Metal Whistle 0-24 kHz
Additional recordings were taken of a megaphone siren in the anechoic chamber (see Fig. 32). The siren generated a unique acoustic signature with a wave-like pattern. This is not to be misinterpreted as the Lloyd mirror effect; rather, it is the megaphone sweeping from a high to a low pitch.
Figure 32. Recording 1 Spectrogram; Megaphone Siren 0-24 kHz
The following two spectrograms in Figure 33 and Figure 34 demonstrate the difference in yelling patterns between a male and a female. Figure 33 displays a high-intensity sound being generated under 5 kHz, whereas in Figure 34 a harmonic pattern is generated with a higher intensity evenly distributed throughout a greater frequency range.
Figure 33. Recording 12 Spectrogram; Male Yelling
The noticeable difference between the two spectrograms is in the beginning and end of the recorded yelling. The male's scream has an abrupt beginning, where his vocal register sat at a low frequency, whereas the beginning of the female's scream gradually rose to a high frequency, resembling a ladder. The male's scream ended at lower frequencies than the female's scream.
Figure 34. Recording 15 Spectrogram; Female Yelling
As stated earlier, the origins of the high-intensity low frequencies that were present under 0.2 kHz in the United States Merchant Marine Academy experimental research could not be determined. It was hypothesized that the low frequencies could have been caused by equipment interference, or that they could have originated from something in the surroundings that was missed.
Figure 35. Recording 2 Spectrogram; Anechoic Chamber, No event
Figure 35 displays the spectrogram of an audio recording made within the anechoic chamber in complete silence. By conducting this experiment, the possibility of an external sound source influencing the data could be eliminated. This supported the idea that the equipment was in fact producing low-frequency interference below 0.1 kHz.
Joint Research with Buoy Team at Hoboken Pier
The goal was to conduct simultaneous acoustic recording in water and in air, together with video, to enable fusion signal processing of different sound events.
Figure 36. Experimental Research at Hoboken Pier
Throughout the recording at the Hoboken Pier, 39 helicopters and multiple ships were recorded. The Environmental Acoustics audio recording and the Buoy video recording allowed a spectrogram to be synchronized with a video record of the sound events that occurred over the deployed buoy. This would allow a correlation to be determined between sounds generated above and under water. The Hoboken Pier data can support further study of how sound transfers between two media and the effects the sound wave experiences while doing so.
The acoustic signature of the helicopter shown in
Figure 37 was recorded by both the Buoy Team and
Environmental Acoustic Team as it flew over the
Hoboken Pier at 9:27:46 AM.
Figure 37. Hoboken Pier Helicopter 9:27:46 AM
Figure 38* displays the hydrophone recording of the acoustic signature of the helicopter that flew overhead at 9:27:46 AM. The acoustic signature of the helicopter was registered at a relatively low frequency, which made its detection quite difficult. The frequency range, intensity domain, and Fourier transform were manipulated in order to distinguish the acoustic signature of the helicopter. The Environmental Acoustic Team recorded the same helicopter's acoustic signature. However, due to the lack of a well-established synchronization procedure between the Buoy Team and the Environmental Acoustic Team, it was not possible to conclude with absolute certainty which of the two acoustic signatures displayed in Figure 39 belongs to the helicopter. If the researchers were to assume that the audio recording began exactly at 9:17:00, then the acoustic signature of the helicopter located at the far right of Figure 39 (see Appendix K) would correspond to the acoustic signature displayed in Figure 38. However, if the audio recording began at 9:17:59, then it is possible for the acoustic signature of the helicopter to correspond to the left side of Figure 39 (see Appendix L).
Figure 38*. Buoy Hydrophone Data recording 32, helicopter occurs at 9:27:46 AM
Figure 39. Environmental Acoustic Recording 3, helicopter present far left and far right
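The effect of the uncertain start time can be made concrete by computing where the 9:27:46 event would fall in the audio recording under each assumption. This is a small arithmetic sketch; the function name is hypothetical, and only the three clock times come from the text.

```python
from datetime import datetime

def offset_seconds(recording_start, event_time, fmt="%H:%M:%S"):
    """Seconds from the assumed recording start to the event time."""
    start = datetime.strptime(recording_start, fmt)
    event = datetime.strptime(event_time, fmt)
    return (event - start).total_seconds()

for assumed_start in ("9:17:00", "9:17:59"):
    print(assumed_start, offset_seconds(assumed_start, "9:27:46"))
# 9:17:00 646.0
# 9:17:59 587.0
```

The two assumptions place the helicopter 59 seconds apart in the recording, which illustrates why a shared time reference between the two teams would be needed to match the signatures unambiguously.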
Figure 38*: It was later determined by the Acoustic Engineers at the Pond House that the acoustic signature displayed on Figure 38 may not belong to the helicopter but rather an interference from the hydrophone.
Figure 40 displays the frequency characteristics that were generated when a large vessel blew its horn.
Figure 40. Recording 4 Spectrogram; Loud Horn from Vessel
POTENTIAL FILTERING PROCESS
While the boat's harmonics are visible on the spectrogram and their presence is visually detectable, an algorithmic approach is required for real detection systems. Sound pressure levels were calculated in a narrow band around one of the boat's harmonics and in a band between the harmonics. The window containing the harmonic was between 4 and 4.8 kHz, and the window without any harmonics was between 4.8 and 5.6 kHz. The consistent prevalence of signal in the band containing the boat harmonic is shown in Figure 41.
Figure 41. Prevalence of 4.5 kHz boat harmonic over neighboring frequency band
Figure 42. Frequency Bands for Potential Filter
In Figure 42, R1 and R2 show potential frequency ranges selected for the filter. R1 spans 0.49 kHz to 0.51 kHz and contains the frequency produced by the boat engine. R2 spans 0.45 kHz to 0.49 kHz and captures the gap between the boat engine frequencies; it does not capture the target frequency. By separating these bands, the sound pressure level of each individual frequency band can be calculated and the two levels compared. If the sound pressure level of frequency band R1 is greater than that of R2, this may indicate the boat's presence in the recording. If the sound pressure levels are approximately equal, this may indicate that no boat is present during the recording. By expanding this basic example, an entire set of frequency ranges that encompasses all of the boat's frequencies and the gaps between them could be created. This set could be used as a full filter by continuously calculating the sound pressure level of each band and comparing it to those around it. An algorithm capable of detecting a boat based on the relative sound pressure levels of the known boat frequency ranges could then be written.
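The R1 versus R2 comparison can be sketched in a few lines. The code below is an illustration only, not the team's filter: it assumes the samples have already been converted to pressure in pascals, evaluates a plain discrete Fourier transform at the bins inside each band, and uses a hypothetical 6 dB margin to decide that R1 exceeds R2. The function names and the margin are assumptions. Because R1 and R2 have different widths, a practical detector might also normalize each level by bandwidth.

```python
import cmath
import math

P_REF = 20e-6  # reference pressure, Pa

def band_level_db(samples, fs, f_lo, f_hi):
    """SPL (dB re 20 uPa) of the DFT bins with frequency in [f_lo, f_hi).

    `samples` are calibrated pressures in Pa and `fs` is the sample rate in Hz.
    """
    n = len(samples)
    k_lo = math.ceil(f_lo * n / fs)
    k_hi = math.ceil(f_hi * n / fs)
    power = 0.0
    for k in range(k_lo, k_hi):
        w = -2j * math.pi * k / n
        x_k = sum(s * cmath.exp(w * i) for i, s in enumerate(samples))
        power += 2.0 * abs(x_k) ** 2 / n ** 2  # one-sided mean-square pressure
    return 10.0 * math.log10(max(power, 1e-30) / P_REF ** 2)

def boat_present(samples, fs, r1=(490.0, 510.0), r2=(450.0, 490.0), margin_db=6.0):
    """True when band R1 exceeds band R2 by more than margin_db (assumed threshold)."""
    return band_level_db(samples, fs, *r1) - band_level_db(samples, fs, *r2) > margin_db

# Synthetic check: a 500 Hz tone of 0.02 Pa amplitude sampled at 2 kHz.
fs = 2000
tone = [0.02 * math.sin(2 * math.pi * 500 * i / fs) for i in range(400)]
print(boat_present(tone, fs))          # True
print(boat_present([0.0] * 400, fs))   # False
```

Extending this to the full filter described above would mean repeating the comparison for each harmonic band and its neighboring gap.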
Conclusions
The Environmental Acoustic Team conducted several successful research experiments throughout the 2016 Maritime Security Center's Summer Research Institute. The overall objective was to determine the absolute sound pressure level and identify a single sound source within a given environment. Both were accomplished and thoroughly verified. Additional analysis of the collected data was conducted, such as identifying a boat's acoustic signature over distance in a noisy environment. In addition, a boat travelling downwind was compared with a boat travelling upwind. The acoustic signatures of multiple sound events, such as gunshots and police sirens, were registered. Specific characteristics, such as the gas expelled by a gunshot, were distinguished, something which could not have been detected without the proper computer software. Furthermore, the researchers were able to distinguish characteristics of the acoustic signatures of a helicopter and a boat above water in contrast to their acoustic signatures below water. In addition to the above-water recognition of the sound events, underwater analogues were also observed and analyzed.
Recommendations
Despite the large amount of analysis completed throughout the program's duration, a vast amount of potential has been left unutilized. This is especially true of the data processing from Penn Plaza. Despite such a rich environment of sound sources and the capability of characterizing the urban noise environment, applications for this data have unfortunately been left unrealized. With more time, an algorithm could be developed to parse through recordings, identify sound sources along with their location of origin and sound pressure level, and then determine whether an event is typical of the environment, like a car horn or a whistle, or potentially a source of interest, such as a gunshot or a scream. Using MATLAB's capabilities for searching through spectrograms, the potential for such an algorithm exists. These computational capabilities would then be paired with selected events for analysis and recognition. This would be achieved using a technique already in place in video processing called Binary Large Object (BLOB) processing [1]. The image is analyzed and groups of connected pixels are recorded. Using this technique, it is possible for the algorithm to identify objects and, in the case of the Penn Plaza recordings, to identify specific BLOBs created by distinct frequencies. Anechoic chamber recordings offer the purest acoustic signature for events and can likewise be used as a reference against which the algorithm compares events. In addition to feeding the algorithm already known acoustic signatures, the algorithm could be extended further with machine learning, in which the algorithm gathers data from its own recordings and then uses past recordings as a whole as a reference when checking for typical events. This is akin to using the entire urban noise environment as an “event” for the algorithm to detect.
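The grouping of connected pixels can be sketched as follows. This is a minimal illustration of connected-component labeling on a thresholded image, written in Python rather than MATLAB; the function name, the threshold, and the toy spectrogram values are hypothetical. It treats pixels as connected when they share an edge (4-connectivity), which is one of several possible choices.

```python
from collections import deque

def find_blobs(image, threshold):
    """Group 4-connected pixels whose value exceeds threshold.

    `image` is a list of rows (for a spectrogram: frequency bins by time frames).
    Returns a list of blobs, each a list of (row, col) pixels.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # Breadth-first search collects every pixel connected to (r, c).
            queue, blob = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and image[ny][nx] > threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blobs.append(blob)
    return blobs

# Toy spectrogram in dB: one horizontal line (a steady tone) and one isolated pixel.
spec = [
    [40, 41, 40, 42, 40],
    [40, 80, 82, 81, 40],
    [40, 40, 40, 40, 40],
    [40, 40, 40, 40, 75],
]
blobs = find_blobs(spec, threshold=60)
print([len(b) for b in blobs])  # [3, 1]
```

Properties of each blob, such as its extent in time and frequency, could then serve as features for comparing an event against reference signatures.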
[1] Moeslund, Thomas B. Introduction to Video and Image Processing: Building Real Systems and Applications. London: Springer, 2012. Web. Undergraduate Topics in Computer Science.
APPENDICES
Calibration Factor Verifications
As stated in the subsection titled ‘Calibration Factor’, the calibration factor for each channel of every recording had to be determined. This was a significant procedure that adjusts the data to counteract any potential interference from the microphones, multitrack recorder, and cables. After the calibration factor for each channel of every recording had been determined, it underwent a verification process that either confirmed or refuted the calibration factor. If all four microphones displayed a relatively similar sound pressure level, in agreement with the Digital Sound Level Meter, throughout a predetermined time interval, then the researchers kept the calibration factors for that particular recording. If a single microphone deviated noticeably for a prolonged period, then the researchers had to make specific adjustments and analyze the data further, and occasionally discard the calibration factors for that particular recording altogether and start from the beginning.
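The verification step can be sketched as a simple tolerance check. The code below is a hedged illustration, not the procedure actually used: it assumes the calibration factor is a multiplier that converts raw recorder samples to pressure in pascals, ignores A-weighting, and uses a hypothetical 1 dB tolerance. The function names and the example levels are invented for illustration.

```python
import math

P_REF = 20e-6  # reference pressure, Pa

def spl_db(raw_samples, calibration):
    """SPL of a block of raw samples after scaling by the calibration factor.

    Assumes `calibration` converts raw sample values to pascals.
    """
    mean_square = sum((calibration * s) ** 2 for s in raw_samples) / len(raw_samples)
    return 10.0 * math.log10(mean_square / P_REF ** 2)

def deviating_channels(channel_spl_db, meter_db, tolerance_db=1.0):
    """Indices of channels whose level differs from the meter by more than tolerance_db."""
    return [i for i, level in enumerate(channel_spl_db)
            if abs(level - meter_db) > tolerance_db]

# Hypothetical levels for four channels against a meter reading of 71.1 dBA.
print(deviating_channels([71.0, 71.3, 70.8, 74.2], 71.1))  # [3]
```

An empty result would correspond to keeping the calibration factors; a non-empty result flags the channels that would need adjustment or a repeated calibration.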
Penn Plaza Pavilion Calibration Verification
Appendix A. Calibration Verification for Penn Plaza
United States Merchant Marine Academy Calibration Verification
Appendix B. Calibration Verification for United States Merchant Marine Academy
Stevens Institute of Technology Anechoic Chamber Calibration Verification
The calibration verification for the experiment conducted at the Stevens Institute of Technology Anechoic Chamber is provided below. The calibration verification from the anechoic chamber differs from those of Penn Plaza Pavilion and the United States Merchant Marine Academy. Due to the compact confines of the anechoic chamber, the distance between the microphones has a substantial impact on the measured sound pressure level, whereas in an open environment the distance between the sound events and the microphones is so great that the distance between the microphones does not have a substantial impact. This is the reason there is a slight deviation in sound pressure level between the two microphones in the anechoic chamber but not at Penn Plaza Pavilion or the United States Merchant Marine Academy.
Appendix C. Calibration Verification for SIT Anechoic Chamber
Calibration Factors
Appendix D. Calibration Factor USMMA
SEQUENCE 160620_006 boat moving away down wind.WAV
Time Start 9:42 AM Calibration 3.85E-07 4.80E-07
SEQUENCE 160620_010 boat moving away up wind.WAV
Time Start 10:19 AM Calibration 5.05E-07 5.90E-07
SEQUENCE 160620_009 boat moving closer down wind.WAV
Time Start 10:07 AM Calibration 5.15E-07 4.95E-07
Appendix E. Calibration Factor Penn Plaza
Record Time   Meter Reading (dBA)   Pressure Calibration Factor
7:33          71.1                  2.05E-05
7:33          71.1                  7.75E-06
7:33          71.1                  2.85E-05
7:33          71.1                  1.09E-05
8:00          72.6                  3.17E-05
8:00          72.6                  9.40E-06
8:00          72.6                  3.74E-05
8:00          72.6                  2.30E-05
8:30          71.1                  8.20E-06
8:30          71.1                  7.70E-06
8:30          71.1                  1.04E-05
8:30          71.1                  4.60E-06
9:00          72                    5.90E-06
9:00          72                    6.45E-06
9:00          72                    8.80E-06
9:00          72                    2.80E-06
9:30          70.2                  7.20E-06
9:30          70.2                  1.01E-05
9:30          70.2                  4.20E-06
9:30          70.2                  0.00E+00
10:00         72.3                  1.90E-05
10:00         72.3                  5.55E-06
10:00         72.3                  7.50E-06
10:00         72.3                  3.02E-06
10:30         71.8                  2.85E-05
10:30         71.8                  8.90E-06
10:30         71.8                  1.20E-06
10:30         71.8                  6.50E-06
11:00         71.2                  5.10E-05
11:00         71.2                  1.04E-05
11:00         71.2                  1.05E-05
11:00         71.2                  6.00E-06
11:30         69.1                  4.60E-05
11:30         69.1                  7.40E-06
11:30         69.1                  7.70E-06
11:30         69.1                  3.70E-06
12:30         70                    3.90E-05
12:30         70                    7.90E-06
12:30         70                    9.60E-06
12:30         70                    4.50E-06
13:00         69.9                  4.10E-05
13:00         69.9                  7.50E-06
13:00         69.9                  9.20E-06
13:00         69.9                  4.10E-06
Appendix F. Calibration Factor Anechoic Chamber
SEQUENCE 160706_001.WAV
Time Start 10:12 AM Calibration 4.00E-05 9.80E-06
SEQUENCE 160706_005.WAV
Time Start 10:52 AM Calibration 1.56E-05 9.40E-06
SEQUENCE 160706_006.WAV
Time Start 10:54 AM Calibration 7.80E-05 2.20E-05
SEQUENCE 160706_007.WAV
Time Start 10:58 AM Calibration 5.20E-05 1.64E-05
SEQUENCE 160706_008.WAV
Time Start 11:02 AM Calibration 3.50E-05 1.08E-05
SEQUENCE 160706_009.WAV
Time Start 11:15 AM Calibration 4.70E-05 1.65E-05
SEQUENCE 160706_010.WAV
Time Start 11:16 AM Calibration 4.00E-05 1.32E-05
SEQUENCE 160706_011.WAV
Time Start 11:17 AM Calibration 3.65E-05 1.14E-05
SEQUENCE 160706_012.WAV
Time Start 11:24 AM Calibration 3.70E-05 1.35E-05
SEQUENCE 160706_013.WAV
Time Start 11:28 AM Calibration 4.05E-05 1.55E-05
SEQUENCE 160706_014.WAV
Time Start 11:29 AM Calibration 3.80E-05 1.33E-05
SEQUENCE 160706_015.WAV
Time Start 11:33 AM Calibration 3.95E-05 1.13E-05
Data Organization
Appendix H. USMMA Data Organization
SEQUENCE 160620_001 Noise.WAV
Time Start Temperature (F)
Humidity (%)
Wind speed (mph)
Noise Level (dBA)
9:02 AM 72 69 5 64
Calibration 1 1 0 0 0 0 0 0
Distance (m) 0 0 0 0 0 0 0 0
Events time (sec)
Duration (sec) Description
1 14 talking
16 1 knocking
17 6 birds chirping
18 48 airplane
28 4 talking
58 8 birds chirping
SEQUENCE 160620_002 Noise.WAV
Time Start Temperature (F)
Humidity (%)
Wind speed (mph)
Noise Level (dBA)
9:04 AM 72 69 5 60.3
Calibration 1 1 0 0 0 0 0 0
Distance (m) 0 0 0 0 0 0 0 0
Events time (sec)
Duration (sec) Description
0 20 boat
19 47 airplane
47 3 birds chirping
SEQUENCE 160620_003 Green Barge.WAV
Time Start Temperature (F)
Humidity (%)
Wind speed (mph)
Noise Level (dBA)
9:07 AM 72 69 4 65.7
Calibration 1 1 0 0 0 0 0 0
Distance (m) 0 0 0 0 0 0 0 0
Events time (sec)
Duration (sec) Description
0 52 helicopter
27 63 airplane
77 3 birds chirping
SEQUENCE 160620_004 Noise.WAV
Time Start Temperature (F)
Humidity (%)
Wind speed (mph)
Noise Level (dBA)
9:34 AM 74 66 5 61
Calibration 1 1 0 0 0 0 0 0
Distance (m) 0 0 0 0 0 0 0 0
Events time (sec)
Duration (sec) Description
0 1 talking
14 1 chair tag
18 3 birds chirping
27 7 birds chirping
36 7 chair tag
SEQUENCE 160620_005 boat moving away down wind.WAV
Time Start Temperature (F)
Humidity (%)
Wind speed (mph)
Noise Level (dBA)
9:37 AM 75 62 6 61.6
Calibration 3.85E-07 4.80E-07 0 0 0 0 0 0
Distance (m) 36.39 45.63 65.72 92.23 121.37 147.69 169.33 197.84
Events time (sec)
Duration (sec) Description
0 1 talking
7 1 talking
16 41 airplane
26 1 knocking
31 22 birds chirping
44 1 knocking
60 71 birds chirping
62 1 knocking
72 1 knocking
73 14 birds chirping
77 20 airplane
78 1 knocking
1 1 plane
1 0.2 LRMO
SEQUENCE 160620_006 boat moving away down wind.WAV
Time Start Temperature (F)
Humidity (%)
Wind speed (mph)
Noise Level (dBA)
9:42 AM 75 62 6 74.7
Calibration 3.85E-07 4.80E-07 0 0 0 0 0 0
Distance (m) 53 93 123 141 159 173 187 221
Events time (sec)
Duration (sec) Description
0 1 talking
0 26 airplane
31 1 knocking
38 1 knocking
42 1 knocking
45 1 knocking
65 13 airplane
SEQUENCE 160620_007 boat moving away down wind.WAV
Time Start Temperature (F)
Humidity (%)
Wind speed (mph)
Noise Level (dBA)
9:44 AM 75 62 6 65.6
Calibration 3.85E-07 4.80E-07 0 0 0 0 0 0
Distance (m) 0 0 0 0 0 0 0 0
Events time (sec)
Duration (sec) Description
0 1 talking
5 1 knocking
8 19 helicopter
11 1 knocking
26 1 talking
SEQUENCE 160620_008 boat standing still out there.WAV
Time Start Temperature (F)
Humidity (%)
Wind speed (mph)
Noise Level (dBA)
9:56 AM 76 61 6 51.7
Calibration 1 1 0 0 0 0 0 0
Distance (m) 925.8 925.8 925.8 925.8 925.8 925.8 925.8 925.8
Events time (sec)
Duration (sec) Description
0 34 airplane
2 1 birds chirping
14 1 birds chirping
18 1 knocking
22 1 knocking
24 2 birds chirping
30 1 birds chirping
31 1 knocking
36 11 birds chirping
46 2 LRMO
54 6 airplane
SEQUENCE 160620_009 boat moving closer down wind.WAV
Time Start Temperature (F)
Humidity (%)
Wind speed (mph)
Noise Level (dBA)
10:07 AM 77 60 7 67.5
Calibration 1 1 0 0 0 0 0 0
Distance (m) 1107.8 1000.43 870.96 725.38 556.63 392.76 230.16 80
Events time (sec)
Duration (sec) Description
9 1 birds chirping
15 1 knocking
21 2 birds chirping
28 4 birds chirping
30 62 airplane
38 2 birds chirping
57 3 birds chirping
63 3 birds chirping
88 1 birds chirping
118 3 birds chirping
126 49 airplane
130 6 birds chirping
145 11 birds chirping
149 1 knocking
182 56 airplane
184 13 birds chirping
210 6 birds chirping
221 27 birds chirping
258 4 knocking
270 40 birds chirping
279 46 airplane
337 55 airplane
411 1 knocking
418 48 airplane
SEQUENCE 160620_010 boat moving away up wind.WAV
Time Start Temperature (F)
Humidity (%)
Wind speed (mph)
Noise Level (dBA)
10:19 AM 77 59 7 68.6
Calibration 5.05E-07 5.90E-07 0 0 0 0 0 0
Distance (m) 99 142 198 256 266 363 533 662
Events time (sec)
Duration (sec) Description
18 1 birds chirping
25 35 airplane
89 38 airplane
104 1 birds chirping
140 1 LRMO
149 1 knocking
160 1 knocking
177 3 birds chirping
208 24 airplane
212 1 knocking
216 1 knocking
232 1 knocking
264 55 airplane
296 3 birds chirping
326 12 birds chirping
345 1 birds chirping
353 1 knocking
364 1 birds chirping
370 68 airplane
434 1 knocking
SEQUENCE 160620_011 boat moving closer up wind.WAV
Time Start Temperature (F)
Humidity (%)
Wind speed (mph)
Noise Level (dBA)
10:28 AM 77 59 7 63.7
Calibration 1 1 0 0 0 0 0 0
Distance (m) 871.21 935.14 820.39 668.03 560.5 433.12 311.16 189.69
Events time (sec)
Duration (sec) Description
11 1 birds chirping
40 51 airplane
44 1 knocking
47 1 knocking
51 6 knocking
70 1 birds chirping
74 1 knocking
80 35 knocking
102 49 airplane
145 1 birds chirping
158 1 knocking
165 60 helicopter
175 1 knocking
180 1 LRMO
Appendix I. Penn Plaza Data Organization
SEQUENCE Penn Plaza 01 0733.WAV
Time Start Temperature (F) Humidity (%)
Wind speed (mph)
Noise Level (dBA)
7:33 AM 70 75 10 71.1
Calibration 2.50E-05 9.00E-06 2.00E-05 1.25E-05
Distance (m) 0 0 0 0
Events time (sec) Duration (sec) Description
60 60 car horns
300 60 talking
420 60 sirens
540 60 walking
540 60 car horns
SEQUENCE Penn Plaza 02 0800.WAV
Time Start Temperature (F) Humidity (%) Wind speed (mph) Noise Level (dBA)
8:00 AM 72 75 11 72.6
Calibration 2.75E-05 9.00E-06 3.75E-05 3.50E-05
Distance (m) 0 0 0 0
Events time (sec) Duration (sec) Description
60 600 construction
60 60 walking
60 60 whistle
120 60 Bus
120 60 walking
120 60 car horns
180 18 whistle
180 60 Car brake screeching
180 60 car horns
180 60 coughing
240 60 Bus
240 12 whistle
300 60 walking
300 60 whistle
300 60 walking
360 60 sirens
420 60 whistle
420 60 Bus
480 60 car horns
480 60 bus
480 120 walking
SEQUENCE Penn Plaza 03 0830.WAV
Time Start Temperature (F) Humidity (%) Wind speed (mph) Noise Level (dBA)
8:30 AM 72 76 11 71.7
Calibration 9.00E-06 8.00E-06 1.10E-05 4.75E-06
Distance (m) 0 0 0 0
Events time (sec) Duration (sec) Description
60 60 Sirens
60 60 cement washer
60 60 whistle
60 60 talking
120 60 cement washer
120 60 whistle
180 7.2 coughing
180 60 walking
240 60 walking
300 60 whistle
300 18 coughing
360 60 walking
360 12 coughing
420 60 walking
420 12 radio
420 60 coughing
480 60 cement washer
540 60 whistle
SEQUENCE Penn Plaza 04 0900.WAV
Time Start Temperature (F) Humidity (%) Wind speed (mph) Noise Level (dBA)
9:00 AM 72 76 11 72
Calibration 8.00E-06 8.00E-06 1.10E-05 3.75E-06
Distance (m) 0 0 0 0
Events time (sec) Duration (sec) Description
60 60 construction
60 60 cement washer
60 60 talking
60 60 cement washer
120 60 whistle
120 60 car horns
120 60 whistle
180 60 car horns
180 60 cement washer
240 60 car horns
300 60 whistle
300 60 car horns
360 60 walking
360 60 cement washer
420 60 car horns
540 30 cement washer
540 60 car horns
SEQUENCE Penn Plaza 05 0930.WAV
Time Start Temperature (F) Humidity (%) Wind speed (mph) Noise Level (dBA)
9:30 AM 72 76 10 70.2
Calibration 1 6.40E-06 9.00E-06 3.30E-06
Distance (m) 0 0 0 0
Events time (sec) Duration (sec) Description
60 60 car horns
60 60 walking
60 18 construction
120 60 birds chirping
180 60 whistle
180 60 talking
180 60 walking
240 120 construction
240 60 whistle
300 60 radio
300 60 truck backing up
300 60 car horns
360 60 whistle
420 120 talking
480 60 loud truck drove past
480 60 car horns
540 60 whistle
SEQUENCE Penn Plaza 06 1000.WAV
Time Start Temperature (F) Humidity (%) Wind speed (mph) Noise Level (dBA)
10:00 AM 71 75 10 72.3
Calibration 2.80E-05 4.75E-06 8.25E-06 1
Distance (m) 0 0 0 0
Events time (sec) Duration (sec) Description
60 60 car horns
60 60 talking
60 120 construction
120 60 talking
120 60 car horns
180 60 talking
180 60 bus
240 60 car horns
240 60 talking
300 60 car horns
300 60 bus horn
360 60 whistle
420 60 walking
420 60 talking
420 60 walking
480 60 car horns
540 60 talking
SEQUENCE Penn Plaza 07 1030.WAV
Time Start Temperature (F) Humidity (%) Wind speed (mph) Noise Level (dBA)
10:30 AM 70 77 7 71.8
Calibration 2.85E-05 8.50E-06 9.25E-06 9.00E-06
Distance (m) 0 0 0 0
Events time (sec) Duration (sec) Description
60 60 construction
60 60 whistle
60 60 walking
60 30 sirens
120 60 walking
120 60 talking
180 60 whistle
180 60 car horns
180 60 whistle
180 60 car horns
240 60 whistle
240 30 construction
300 60 car horns
300 60 bus
300 30 construction
300 60 sirens
360 60 whistle
360 60 truck backing up
360 60 car horns
420 60 truck backing up
480 12 talking
540 60 car horns
540 60 whistle
540 60 car horns
SEQUENCE Penn Plaza 08 1100.WAV
Time Start Temperature (F) Humidity (%) Wind speed (mph) Noise Level (dBA)
11:00 AM 72 80 4 71.2
Calibration 5.00E-05 1.00E-05 8.00E-06 1.10E-05
Distance (m) 0 0 0 0
Events time (sec) Duration (sec) Description
60 60 construction
60 60 talking
60 60 car horns
60 60 walking
60 60 whistle
120 60 construction
120 60 radio
120 60 car horns
180 60 talking
180 60 car horns
240 60 whistle
240 60 construction
300 60 car horns
360 60 bus
360 60 sirens
420 60 fire truck
480 60 construction
540 60 talking
540 60 child screams
540 60 car horns
SEQUENCE Penn Plaza 09 1130.WAV
Time Start Temperature (F) Humidity (%) Wind speed (mph) Noise Level (dBA)
11:30 AM 72 87 6 69.1
Calibration 4.75E-05 8.00E-06 7.50E-06 4.00E-06
Distance (m) 0 0 0 0
Events time (sec) Duration (sec) Description
60 60 car horns
60 60 talking
60 60 car horns
60 60 walking
60 60 talking
120 60 car horns
180 60 talking
300 60 car horns
300 60 walking
360 120 car horns
480 60 sirens
480 60 construction
540 60 bus horn
540 60 talking
SEQUENCE Penn Plaza 10 1230.WAV
Time Start Temperature (F) Humidity (%) Wind speed (mph) Noise Level (dBA)
12:30 PM 72 90 5 70
Calibration 4.00E-05 8.00E-06 1.00E-05 4.50E-06
Distance (m) 0 0 0 0
Events time (sec) Duration (sec) Description
86 60 talking
159 60 whistle
168 60 car horns
183 30 talking
206 60 3 Rolling luggage, 4m
221 90 talking
292 60 whistle
401 60 talking
505 60 walking
539 60 talking
SEQUENCE Penn Plaza 11 1300.WAV
Time Start Temperature (F) Humidity (%) Wind speed (mph) Noise Level (dBA)
1:00 PM 73 90 4 69.9
Calibration 4.00E-05 7.50E-06 9.50E-06 6.00E-06
Distance (m) 0 0 0 0
Events time (sec) Duration (sec) Description
60 60 talking
60 18 car horns
60 60 talking
180 30 whistle
180 120 talking
300 60 construction
360 60 coughing
420 120 talking
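The event tables in these appendices list each event as a start time in seconds, a duration in seconds, and a free-text description. The following is a minimal sketch of how such rows could be read and queried, not the procedure the researchers used; the names `Event`, `parse_event_row`, and `events_active_at` are hypothetical and introduced only for illustration. The sample rows are taken from the Penn Plaza 11 1300.WAV table above.

```python
from typing import List, NamedTuple


class Event(NamedTuple):
    """One row of an event table (hypothetical helper type)."""
    start_s: float      # "Events time (sec)" column
    duration_s: float   # "Duration (sec)" column
    description: str    # "Description" column

    @property
    def end_s(self) -> float:
        return self.start_s + self.duration_s


def parse_event_row(row: str) -> Event:
    # The first two whitespace-separated fields are numeric; the rest of
    # the line is the description, which may itself contain spaces.
    start, duration, description = row.split(maxsplit=2)
    return Event(float(start), float(duration), description)


def events_active_at(events: List[Event], t_s: float) -> List[Event]:
    # An event is treated as active on the half-open interval [start, end).
    return [e for e in events if e.start_s <= t_s < e.end_s]


rows = [
    "60 60 talking",
    "60 18 car horns",
    "180 30 whistle",
    "180 120 talking",
]
events = [parse_event_row(r) for r in rows]
active = [e.description for e in events_active_at(events, 70.0)]
print(active)
```

Because `float` accepts scientific notation, values logged as `6.00E+01` parse to the same number as `60`.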
Appendix J. Anechoic Chamber Data Organization
SEQUENCE 160706_001.WAV
Time Start Temperature (F) Humidity (%) Wind speed (mph) Noise Level (dBA)
10:12 AM 86 83.2 0 36.1
Calibration 2.87E-05 1.35E-05 0 0 0 0 0 0
Distance (m) 1 1.91 1 1 1 1 1 1
Events time (sec) Duration (sec) Description
5 56 siren
SEQUENCE 160706_002.WAV
Time Start Temperature (F) Humidity (%) Wind speed (mph) Noise Level (dBA)
10:23 AM 86 83.2 0 35
Calibration 3.45E-05 2.90E-05 0 0 0 0 0 0
Distance (m) 1 1.91 1 1 1 1 1 1
Events time (sec) Duration (sec) Description
SEQUENCE 160706_003.WAV
Time Start Temperature (F) Humidity (%) Wind speed (mph) Noise Level (dBA)
10:29 AM 86 83.2 0 41.2
Calibration 1.72E-05 1.21E-05 1 1 0 0 0 0
Distance (m) 1 1.91 1 1 1 1 1 1
Events time (sec) Duration (sec) Description
4 1 gun shot
8 1 misfire
12 1 gun shot
16 1 gun shot
21 1 gun shot
25 1 misfire
29 1 misfire
33 1 misfire
37 1 misfire
41 1 misfire
44 1 misfire
48 1 misfire
SEQUENCE 160706_004.WAV
Time Start Temperature (F) Humidity (%) Wind speed (mph) Noise Level (dBA)
10:34 AM 86 83.2 0 39.8
Calibration 3.05E-05 1.28E-05 1 1 0 0 0 0
Distance (m) 1 1.91 1 1 1 1 1 1
Events time (sec) Duration (sec) Description
4 1 gun shot
9 1 misfire
15 1 gun shot
20 1 gun shot
26 1 gun shot
32 1 misfire
38 1 gun shot
43 1 misfire
SEQUENCE 160706_005.WAV
Time Start Temperature (F) Humidity (%) Wind speed (mph) Noise Level (dBA)
10:52 AM 86 83.2 0 35
Calibration 3.45E-05 9.80E-06 1 1 0 0 0 0
Distance (m) 4.27 5.19 1 1 1 1 1 1
Events time (sec) Duration (sec) Description
7 1 gun shot
21 1 gun shot
26 1 misfire
31 1 misfire
36 1 gun shot
41 1 misfire
46 1 gun shot
51 1 misfire
SEQUENCE 160706_006.WAV
Time Start Temperature (F) Humidity (%) Wind speed (mph) Noise Level (dBA)
10:54 AM 86 83.2 0 42.2
Calibration 2.43E-05 2.20E-05 1 1 0 0 0 0
Distance (m) 4.27 5.19 1 1 1 1 1 1
Events time (sec) Duration (sec) Description
4 1 gun shot
9 1 gun shot
15 1 gun shot
20 1 misfire
25 1 misfire
30 1 gun shot
35 1 misfire
40 1 gun shot
SEQUENCE 160706_007.WAV
Time Start Temperature (F) Humidity (%) Wind speed (mph) Noise Level (dBA)
10:58 AM 86 83.2 0 39
Calibration 4.51E-05 1.64E-05 1 1 0 0 0 0
Distance (m) 4.27 5.19 1 1 1 1 1 1
Events time (sec) Duration (sec) Description
5 1 gun shot
10 1 gun shot
15 1 gun shot
21 1 gun shot
26 1 gun shot
32 1 gun shot
37 1 gun shot
42 1 gun shot
SEQUENCE 160706_008.WAV
Time Start Temperature (F) Humidity (%) Wind speed (mph) Noise Level (dBA)
11:02 AM 86 83.2 0 35.9
Calibration 2.98E-05 1.08E-05 1 1 0 0 0 0
Distance (m) 4.27 5.19 1 1 1 1 1 1
Events time (sec) Duration (sec) Description
5 1 gun shot
13 1 gun shot
20 1 gun shot
26 1 gun shot
33 1 gun shot
39 1 gun shot
44 1 gun shot
50 1 gun shot
SEQUENCE 160706_009.WAV
Time Start Temperature (F) Humidity (%) Wind speed (mph) Noise Level (dBA)
11:15 AM 86 83.2 0 37.2
Calibration 3.85E-05 1.65E-05 1 1 0 0 0 0
Distance (m) 1 1.91 1 1 1 1 1 1
Events time (sec) Duration (sec) Description
4 1 whistle
8 2 whistle
14 2 whistle
19 1 whistle
SEQUENCE 160706_010.WAV
Time Start Temperature (F) Humidity (%) Wind speed (mph) Noise Level (dBA)
11:16 AM 86 83.2 0 36.8
Calibration 4.00E-05 1.32E-05 1 1 0 0 0 0
Distance (m) 1 1.91 1 1 1 1 1 1
Events time (sec) Duration (sec) Description
4 1 whistle
8 1 whistle
12 1 whistle
SEQUENCE 160706_011.WAV
Time Start Temperature (F) Humidity (%) Wind speed (mph) Noise Level (dBA)
11:17 AM 86 83.2 0 36.5
Calibration 3.65E-05 1.14E-05 1 1 0 0 0 0
Distance (m) 1 1.91 1 1 1 1 1 1
Events time (sec) Duration (sec) Description
3 1 whistle
7 1.5 whistle
11 2 whistle
16 2 whistle
20 2 whistle
SEQUENCE 160706_012.WAV
Time Start Temperature (F) Humidity (%) Wind speed (mph) Noise Level (dBA)
11:24 AM 86 83.2 0 36
Calibration 3.70E-05 1.35E-05 1 1 0 0 0 0
Distance (m) 1 1.91 1 1 1 1 1 1
Events time (sec) Duration (sec) Description
5 1 screaming
SEQUENCE 160706_013.WAV
Time Start Temperature (F) Humidity (%) Wind speed (mph) Noise Level (dBA)
11:28 AM 86 83.2 0 37.2
Calibration 4.05E-05 1.55E-05 1 1 0 0 0 0
Distance (m) 1 1.91 1 1 1 1 1 1
Events time (sec) Duration (sec) Description
5 1 screaming
SEQUENCE 160706_014.WAV
Time Start Temperature (F) Humidity (%) Wind speed (mph) Noise Level (dBA)
11:29 AM 86 83.2 0 36.3
Calibration 3.80E-05 1.33E-05 1 1 0 0 0 0
Distance (m) 1 1.91 1 1 1 1 1 1
Events time (sec) Duration (sec) Description
4 1 screaming
SEQUENCE 160706_015.WAV
Time Start Temperature (F) Humidity (%) Wind speed (mph) Noise Level (dBA)
11:33 AM 86 83.2 0 36
Calibration 3.95E-05 1.13E-05 1 1 0 0 0 0
Distance (m) 1 1.91 1 1 1 1 1 1
Events time (sec) Duration (sec) Description
2 1 screaming
9 1 screaming
15 1 screaming
SEQUENCE 160706_016.WAV
Time Start Temperature (F) Humidity (%) Wind speed (mph) Noise Level (dBA)
11:40 AM 86 83.2 0 36.1
Calibration 2.27E-05 1.16E-05 1 1 0 0 0 0
Distance (m) 1 1.91 1 1 1 1 1 1
Events time (sec) Duration (sec) Description
3 18 blender liquefy
26 18 blender milkshake
47 18 blender smoothie
69 22 blender pulsing
Additional Images:
Appendix K. Large Rusted Metal Object (LRMO) at United States Merchant Marine Academy
Appendix L. Environmental Acoustic Recording 3: helicopter acoustic signature, assuming recording 3 at Hoboken Pier began at 9:17:00 AM.
Appendix M. Environmental Acoustic Recording 3: helicopter acoustic signature, assuming recording 3 at Hoboken Pier began at 9:17:59 AM.
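The two assumed start times in Appendices L and M differ by 59 seconds, so any event in the recording maps to a 59-second window of possible clock times. The sketch below illustrates that arithmetic only; the function name `event_clock_time` and the 165-second offset are hypothetical values chosen for illustration, not measurements from recording 3.

```python
from datetime import datetime, timedelta


def event_clock_time(start_clock: str, offset_s: float) -> str:
    """Clock time (24-hour HH:MM:SS) of an event that occurs offset_s
    seconds after a recording that began at start_clock."""
    start = datetime.strptime(start_clock, "%H:%M:%S")
    return (start + timedelta(seconds=offset_s)).strftime("%H:%M:%S")


# Hypothetical offset of an event into the recording, in seconds.
offset_s = 165

earliest = event_clock_time("09:17:00", offset_s)  # Appendix L assumption
latest = event_clock_time("09:17:59", offset_s)    # Appendix M assumption
print(earliest, latest)
```

Under these assumptions the event falls between the two printed clock times, which is the window an external time reference would need to match.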
REFERENCE
[1] Potamitis, I., & Ganchev, T. (2008). Generalized recognition of sound events: Approaches and applications. In Multimedia Services in Intelligent Environments (pp. 41-79). Springer Berlin Heidelberg.
[2] Robust speech recognition and understanding. I-Tech Education and Publishing, 2007.
[3] Davis, S., & Mermelstein, P. (1980). Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Transactions on Acoustics, Speech, and Signal Processing, 28(4), 357-366.
[4] Picone, J. W. (1993). Signal modeling techniques in speech recognition. Proceedings of the IEEE, 81(9), 1215-1247.
[5] Butko, T. (2011). Feature Selection for Multimodal: Acoustic Event Detection. Universitat Politècnica de Catalunya.
[6] O’Shaughnessy, D. (2008). Invited paper: Automatic speech recognition: History, methods and challenges. Pattern Recognition, 41(10), 2965-2979.
[7] Cowling, M., & Sitte, R. (2002). Analysis of speech recognition techniques for use in a non-speech sound recognition system.
[8] Zue, V. (1985). Notes on spectrogram reading. Mass. Inst. Tech. Course, 6.
[9] Bolt, R. H., Cooper, F. S., David Jr, E. E., Denes, P. B., Pickett, J. M., & Stevens, K. N. (1970). Speaker identification by speech spectrograms: A scientists' view of its reliability for legal purposes. The Journal of the Acoustical Society of America, 47(2B), 597-612.
[10] Chu, S., Narayanan, S., & Kuo, C. C. J. (2009). Environmental sound recognition with time–frequency audio features. IEEE Transactions on Audio, Speech, and Language Processing, 17(6), 1142-1158.
[11] Ghoraani, B., & Krishnan, S. (2011). Time–frequency matrix feature extraction and classification of environmental audio signals. IEEE Transactions on Audio, Speech, and Language Processing, 19(7), 2197-2209.
[12] Uchida, S., & Sakoe, H. (2005). A survey of elastic matching techniques for handwritten character recognition. IEICE Transactions on Information and Systems, 88(8), 1781-1790.
[13] Ashbrook, A., & Thacker, N. A. (1998). Tutorial: Algorithms For 2-Dimensional Object Recognition. Imaging Science and Biomedical Engineering Division, Medical School, University of Manchester, Manchester.
[14] Mundy, J. L. (2006). Object recognition in the geometric era: A retrospective. In Toward category-level object recognition (pp. 3-28). Springer Berlin Heidelberg.
[15] http://www.visionbib.com/bibliography/contents.html
[16] Sharma, N. S., Yakubovskiy, A. M., & Zimmerman, M. J. (2013, November). SCUBA diver detection and classification in active and passive sonars—A unified approach. In Technologies for Homeland Security (HST), 2013 IEEE International Conference on (pp. 189-194). IEEE.
[17] Yakubovskiy, A., Salloum, H., Sutin, A., Sedunov, A., Sedunov, N., & Masters, D. (2015, October). Feature extraction for acoustic classification of small aircraft. In Applications of Signal Processing to Audio and Acoustics (WASPAA), 2015 IEEE Workshop on (pp. 1-5). IEEE.
[18] Bunin, B., Sutin, A., Kamberov, G., Roh, H. S., Luczynski, B., & Burlick, M. (2008, April). Fusion of acoustic measurements with video surveillance for estuarine threat detection. In SPIE Defense and Security Symposium (pp. 694514-694514). International Society for Optics and Photonics.
[19] Dennis, J., Tran, H. D., & Chng, E. S. (2013). Image feature representation of the subband power distribution for robust sound event classification. IEEE Transactions on Audio, Speech, and Language Processing, 21(2), 367-377.
[20] Gonzalez, R. C., & Woods, R. E. Digital Image Processing. 2002. New Jersey: Prentice Hall, 6, 681.
[21] Chachada, S., & Kuo, C. C. J. (2014). Environmental sound recognition: A survey. APSIPA Transactions on Signal and Information Processing, 3, e14.
Acknowledgement
"This material is based upon work supported by the U.S. Department of Homeland Security under Grant Award Number 2014-ST-061-ML0001." "The views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the U.S. Department of Homeland Security."