8/18/2019 spherical microphone array processing with wave field synthesis and auralization
ACKNOWLEDGEMENTS
This master’s thesis would not have been possible without the support of many people. Firstly, I wish to express my gratitude to Univ.-Prof. Dr.-Ing. Karlheinz Brandenburg
for giving me the chance to work on such an interesting topic in his group.
I owe my deepest gratitude to my supervisor Dipl.-Ing. Johannes Nowak for giving me
the opportunity to do this master thesis under his supervision. His constant guidance,
assistance and support were really important for this thesis.
Further, I wish to express my love and gratitude to my beloved family, especially my
parents, for their love, understanding and supporting me through the duration of my
study.
Finally, I thank all my friends for their support during my studies.
ABSTRACT
Microphone arrays are structures in which two or more microphones are placed at
different positions in space, generally in a geometrical arrangement. In many applications
we need not only the temporal characteristics but also the spatial characterization of
sound fields, and microphone arrays are employed to achieve this goal.
Microphone arrays play a particularly important role in spatial sound reproduction.
Researchers have used different microphone array configurations for sound recording,
for the characterization of room acoustics and for auralization.
As research in spatial sound reproduction progressed, it was found that rendering
sound using an array of loudspeaker elements alone is not sufficient to fully auralize an
acoustic scene. It was proposed that microphone arrays be employed on the recording
side in order to reproduce the complete three-dimensional acoustic behaviour. Researchers
have used different array configurations, such as planar or circular arrays, to map the
listening room acoustics for the purpose of auralization with a rendering system,
e.g. wave field synthesis (WFS).
A drawback was noticed with two-dimensional arrays: they are not able to sufficiently
characterize an acoustic scene in three dimensions, and hence the spherical microphone
array came into the picture. Spherical microphone arrays and their processing have been
described by many authors, but a perceptual analysis of the various factors which plague
the performance of spherical microphone arrays is still not fully established.
In the present work we carry out a detailed analysis of the processing chain, which
starts from the simulation of room characteristics with a spherical microphone array,
through wave field analysis of the sound fields and classification of errors, to
auralization of the free field impulse responses. We bring together the existing state
of the art in spherical microphone array processing and examine the perceptual impact
of different factors. We use a rigid sphere configuration and analyze three different
error categories, namely positioning error, spatial aliasing and microphone noise. We
attempt to establish a qualitative and quantitative relation between the errors and
limitations encountered in spherical microphone array processing, and examine the
psychoacoustic effects by auralizing the free field data through WFS.
A spherical microphone array gives a complete three-dimensional image of the acoustic
environment; the spherical microphone array data is decomposed into plane waves using
plane wave decomposition. In the process of plane wave decomposition the spherical
aperture of the array is discretized, and because of this, limitations are imposed
on the performance of the array.
We simulate an ideal full audio spectrum wave field impinging on the continuous aperture
of a spherical microphone array and compare this with a sampled array aperture.
In the listening test we auralize sound fields based on the ideal wave field decomposition
of a continuous aperture and compare them with different degrees of errors in different
categories. By this comparison we attempt to establish the extent to which a given
error would perceptually corrupt a reproduced sound field. We also try to determine the
extent to which some degree of error remains perceptually insignificant, in other
words, the extent of error which can be tolerated.
We examine the spatial aliasing limit imposed by the rendering system and on that
basis establish a rationale for the transform order (l = 3) used in spherical array
processing. The perceptual analysis is done in two ways: we first obtain an error
level which, when incorporated in the auralization process (simulated for l = 3), would be
perceptually insignificant, and we then look for the perceptual effects of this error
when the transform order l is changed stepwise.
We also try to establish a correspondence between wave field synthesis on the rendering
side and the spherical microphone array on the measurement side. We investigate to
what extent wave field synthesis can retain the perceptual quality by analyzing the
psychoacoustic effects of changing various parameters on the spherical microphone
array side. The independence of the rendering side with regard to the measurement side is
also analysed.
Contents
1 INTRODUCTION 1
1.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Auralization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.4 Organization of Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2 MATHEMATICAL ANALYSIS AND STATE OF THE ART 11
2.1 Acoustic wave equation . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.1.1 Homogeneous acoustic wave equation . . . . . . . . . . . . . . . 11
2.1.2 Solution of wave equation in Cartesian coordinates . . . . . . . 15
2.1.3 Solution of wave equation in spherical coordinates . . . . . . . 15
2.1.4 Spherical Bessel and Hankel functions . . . . . . . . . . . . . . . 18
2.1.5 Legendre functions . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.1.6 Spherical harmonics . . . . . . . . . . . . . . . . . . . . . . . . 20
2.1.7 Radial Velocity . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.2 Spherical harmonic decomposition . . . . . . . . . . . . . . . . . . . . . 23
2.2.1 Interior and Exterior problem . . . . . . . . . . . . . . . . . . . 23
2.2.2 Spherical wave spectrum . . . . . . . . . . . . . . . . . . . . . . 25
2.3 Spherical wave sound fields . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.4 Spherical harmonic expansion of plane wave . . . . . . . . . . . . . . . 28
2.5 Mode strength . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.6 Discretization of spherical aperture and spatial aliasing . . . . . . . . . 32
2.7 Plane wave decomposition . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.8 Spatial resolution in plane wave decomposition . . . . . . . . . . . . . . 35
3 ERROR ANALYSIS 38
3.1 Measurement errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Master Thesis Gyan Vardhan Singh
3.2 Description of measurement error function . . . . . . . . . . . . . . . . 39
3.3 Microphone noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.4 Spatial aliasing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.5 Positioning error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4 WAVE FIELD SYNTHESIS 48
4.1 Physics behind WFS . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.2 Mathematical description of WFS . . . . . . . . . . . . . . . . . . . . . 51
4.3 Synthesis operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.4 Focusing operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.5 Practical Consequences . . . . . . . . . . . . . . . . . . . . . . . . . 60
5 LISTENING TEST 62
5.1 Listening Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.2 Reproduction set up . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.3 Auralization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
5.3.1 Aspects to be perceptually evaluated . . . . . . . . . . . . . . 65
5.3.2 Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.4 Structure of listening test . . . . . . . . . . . . . . . . . . . . . . . . . 69
5.4.1 Audio Tracks . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
5.4.2 Listening test condition . . . . . . . . . . . . . . . . . . . . . . 70
5.5 Test subjects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.6 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
5.6.1 Test subject screening . . . . . . . . . . . . . . . . . . . . . . . 74
5.6.2 Statistic for the evaluation of listening test . . . . . . . . . . . . 74
5.6.3 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.7 Spatial aliasing vs transform order . . . . . . . . . . . . . . . . . . . . 76
5.8 Evaluation of positioning error . . . . . . . . . . . . . . . . . . . . . . . 78
5.9 Microphone noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
6 Conclusions 84
Bibliography 86
List of Figures 91
List of Tables 93
APPENDIX 95
A Derivations 95
A.1 Orthonormality of Spherical harmonics and Spherical Fourier transform 95
A.2 Position vector and Wave vector . . . . . . . . . . . . . . . . . . . . . . 97
A.3 Plane wave pressure field for different levels . . . . . . . . . . . . . . . 99
A.4 Rigid sphere and open sphere configuration . . . . . . . . . . . . . . . . 100
Theses 102
1 INTRODUCTION
Digital processing of sounds so that they appear to come from particular locations in
three-dimensional space is very important and is an integral part of virtual acoustics.
In virtual acoustics the goal is to simulate complex acoustic fields so that a listener
experiences a natural environment, and this is done by spatial sound reproduction
systems.
The concept of sound field synthesis is used in realizing spatial sound reproduction
systems; it brings together various methodologies and analytical approaches.
In sound field synthesis we decompose the sound or audio into various components or
wave fields. In simple terms, we pull apart the basic components of sound characterizing
various spatial and temporal properties, and then, after applying complex signal
processing techniques, we reproduce the sound in such a way that these components
merge together in the propagation medium to auralize the complete three-dimensional
character of the sound.
Hence, sound field synthesis is a principle by which an acoustic environment is processed,
synthesized and reproduced or re-created such that the real acoustic scenario can
be perceived by the listener. Spatial sound, immersive audio, 3-D sound and surround
sound systems are some of the terms often used to describe such audio systems.
Different aspects come into play in realizing a sound field reproduction system, and
broad research efforts attempt to understand the various factors. Examples of sound
field reproduction systems which deal with various conceptual aspects of signal
processing are wave field synthesis, which is also our choice of reproduction system
in this thesis, higher order Ambisonics [1], sound field
reproduction with MIMO acoustic channel inversion [2] and vector base amplitude
panning methods [3]. These are a few examples of spatial sound systems developed by
the respective researchers; in [4] the author has presented a very detailed
mathematical treatment of various spatial sound reproduction techniques and has
attempted to bring these related spatial sound systems onto a single mathematical plane
on the basis of functional analysis.
In the present work we put forward our analysis, in which we answer various questions
that come up when an acoustic environment is recreated. Sound reproduction
techniques for virtual sound systems have been studied, developed and implemented
in various different ways and configurations. Acoustic auralization of sound fields in
this work focuses on wave field analysis (WFA) [5][6] with spherical microphone
arrays, and their auralization on a two-dimensional loudspeaker array geometry
following the principle of wave field synthesis (WFS).
In order to obtain the acoustic scene characteristics, researchers have proposed the
usage of microphone arrays. Apart from temporal properties, for spatial sound
reproduction we need the spatial properties of the sound field as well, and therefore
microphone arrays are required, as they can characterize the sound in space [7][5][6].
Auralization using microphone arrays has been attempted with various kinds of array
geometries; in [5] the author focused on a circular microphone array and used it for
the auralization of sound fields with wave field synthesis.
In [8], spatial sound design principles have been explained for the auralization of
room acoustics.
In spatial sound design the spatial properties of an audio stream, such as position,
direction and orientation in a virtual room, and the room itself, are modified. Two
things are attempted: the first is the simulation of an acoustic environment, and the
other is the direction dependent visualization and modification of the sound field by
the user. In this work we focus on the part where the simulation and auralization of
an acoustic scene is done. More importantly, we investigate the factors influencing
the microphone array used for room impulse response (RIR) recording and analyse the
perceptual effects which would be observed during the auralization process when
various parameters of the microphone array are changed.
Any sound wave can be represented as a superposition of plane waves in the far field
of its sources [9][8], and consequently it can also be said that a room can be characterized
by its impulse responses, as it can be assumed to be linear time invariant (LTI). Hence,
if we are able to capture the impulse responses of a room, then we can fully
characterize the acoustic nature of that room, and in turn any acoustic event in that
room can be reproduced simply with the help of the plane wave decomposed components
of its room impulse responses.
1.1 Preliminaries
To understand how sound radiates in a medium, we would like to introduce the reader
to the soap bubble analogy as explained by Zotter in [10, pages 6-10]. The sound
radiation is considered as a soap bubble, as shown in Figure 1.1. We assume a free
sound field and an ideal soap bubble which is large enough to enclose a musician and
an instrument. When sound is produced by the instrument, the bubble surface will
vibrate according to the motion of the air, because as sound propagates through the
medium it will hit the bubble, and consequently the soap bubble will vibrate with
the air molecules. At the respective observation points on the sphere, the waveform
of the vibrating sphere can be said to represent the radiated sound.
In [9], Williams has explained that the acoustic sound radiation from the instrument
could be completely defined if we are able to acoustically map the motion of this
continuous surface enclosing the sources. This kind of analysis of sound radiation is
called the exterior problem.
In a similar way, suppose there are no sources inside the soap bubble (rather, it
encloses the measurement set-up or listening area), but instead the sound radiation
propagates from outside (i.e. the sources are outside) and the waves hit the bubble
from the exterior. Again, as the bubble is in contact with the medium it will vibrate,
and identifying the motion of the surface of the bubble would be sufficient to describe
the acoustic radiation; this is called the interior problem.
In [11], the exterior and interior problems have again been elaborated. As the interior
problem is more important for our application, we put forward the interior problem
with respect to the spherical microphone array. A more mathematical treatment
is explained in chapter 2; for a further detailed analysis the reader is referred to [9,
page 124].

Figure 1.1: Soap bubble model of acoustic radiation [10].
In auralization applications we follow the same analytical line and characterize a
listening room environment by measuring the impulse responses coming from different
directions [7]; this in turn gives us the directional behaviour of the sound, i.e. how
the direct sound reaches and affects the spherical array and how the reflected sound
behaves.
1.2 Auralization
In order to auralize the sound field while keeping the spatial characteristics of the
sound intact, a method based on WFS is applied in the present work. WFS is a
consequence of Huygens' principle, expressed mathematically by the Kirchhoff-Helmholtz
integral [12]. In [13] wave field synthesis is discussed in explicit detail; Verheijen has explained
the reproduction techniques with emphasis on the loudspeaker arrays. A mathematical
description of WFS follows in chapter 4. Here it is important to point out that
although WFS does reproduce the room effect, it is not sufficient for the recreation
of an acoustic virtual reality [14][5], as it lacks knowledge of the acoustic room
impression, which is obtained by wave field analysis of the room. Hence, WFS with
wave field analysis was further proposed in [6]. The proposed techniques suggest that
we obtain the room characteristics by measuring the room impulse responses; the
analysis of these impulse responses can then be used to calculate the driving
functions for wave field synthesis. It is suggested that we can ideally reproduce the
reverberant part of the sound with 8-11 uncorrelated plane waves [6].
In our application we attempt to use different configurations of plane waves in order
to synthesize the sound waves. As mentioned by Sonke in [6], we examine plane wave
decomposition for different numbers of directions and finally settle on 12 directions.
We also compare the psychoacoustic effects of using different numbers of plane wave
sources and evaluate the optimal configuration.
In [5] the techniques and suggestions proposed in [6] are further explored and
implemented using a circular microphone array. Two methodologies were explained in
[5]:
• Impulse response based auralization
• Natural recording based auralization
Impulse response based auralization: In this approach the room acoustics are
measured and analyzed, i.e. the impulse responses of the room are measured. For
reproduction, the room characteristics obtained from the impulse response
measurement are combined with a dry audio channel and then reproduced. In simpler
words, suppose an audio file is being played in a particular room and this acoustic
scene now has to be recreated in another room. Then knowledge of the room impulse
responses (RIR) would be sufficient to recreate the same acoustic scene by convolving
the dry audio file with the directional responses obtained by plane wave
decomposition of the RIR. Refer to Figure 1.2.
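The convolution step just described can be sketched in a few lines. This is a schematic example using a synthetic one-channel RIR and a unit impulse as the dry signal, not the actual processing chain of this work, in which one such convolution is performed per plane-wave direction:

```python
import numpy as np
from scipy.signal import fftconvolve

def auralize(dry_signal, rir):
    """Convolve a dry (anechoic) signal with a room impulse response.

    In impulse response based auralization, each directional response from
    the plane wave decomposition is convolved with the dry channel and fed
    to the corresponding WFS virtual source.
    """
    return fftconvolve(dry_signal, rir)

# Toy check: a unit impulse convolved with an RIR returns the RIR itself.
rir = 0.5 ** np.arange(8)              # synthetic exponentially decaying RIR
dry = np.zeros(16)
dry[0] = 1.0                           # unit impulse as the dry signal
wet = auralize(dry, rir)
```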
Natural recording based auralization: In this approach a real time recording of the
sound field is made along with the audio signals. Here we do not record impulse
responses separately; instead a live recording of the acoustic event is captured. In
natural recording
Figure 1.2: Impulse response based auralization [5].
based auralization no separate impulse responses or perceptual parameters are used,
refer to Figure 1.3.
In the present work we follow the impulse response based auralization technique. One
important factor for choosing this line of approach is that the measurement and
reproduction sites are independent of each other [8][11][5].
For the auralization of an acoustic environment, for example a hall, over an extended
area, impulse responses have to be measured along an array of microphone positions [15].
The impulse responses can be processed by three different techniques:
• Holophony
Figure 1.3: Natural recording based auralization [5].
• Wave Field Extrapolation (WFE)
• Wave Field Decomposition (WFD)
Holophony: The impulse responses are measured at microphone positions which in
turn correspond to the loudspeaker positions in wave field synthesis. The impulse
responses in this approach can be used directly as convolution filters to drive the
corresponding loudspeakers, and no further processing is required. Although this
technique looks very straightforward, it is very inflexible, since the output cannot be
used for any WFS layout other than the one specifically designed for the
corresponding microphone array set-up used in the measurement. Moreover, holophony
requires microphones with very sharp directivity patterns, which is quite
unrealistic in practice [16][5].
Wave Field Extrapolation: In WFE the measurement of impulse responses is not
necessarily done at positions that correspond to the WFS loudspeaker configuration.
The impulse responses measured with any particular kind of microphone array
configuration are extrapolated to the required WFS loudspeaker array positions (in
principle these positions are different from the microphone positions, hence
extrapolation). The extrapolation can be done using the Kirchhoff-Helmholtz integrals
[5]. Although auralization with wave field extrapolation has given satisfactory
results, there are some drawbacks: it requires a very large measurement array for a
medium sized extrapolation area [7]. In [5] the author has shown that a microphone
array size at least equivalent to that of the listening area is required in order to
achieve satisfactory results.
Wave Field Decomposition, [17][15][18]: The wave field decomposition approach
decomposes the sound field into plane waves which arrive from different directions.
The plane wave decomposition can be considered as an acoustic photograph of the
sound sources, including secondary sources, which can be regarded as those generating
reflections [7].
The impulse responses are decomposed into plane waves which give a directional
image of the sound field. These plane waves are then reproduced as point sources in
the WFS set-up. The measurement array and the reproduction site are independent of
each other in this approach, and we can reproduce the sound field for a larger area
compared to the other two approaches. The size of the measurement array and that of
the loudspeaker array have no dependence as long as the microphone array
characterizes the room sufficiently [5][8]. The plane waves obtained through plane
wave decomposition can optimally represent the sources and reflections, and hence in
principle we can reproduce the sound field satisfactorily. Due to the considerable
advantages of wave field decomposition over the other methods, we focus our work on
plane wave decomposition of the acoustic wave fields. In the next chapter we present
the analysis for plane wave decomposition with a spherical microphone array. In [5],
a circular microphone array was implemented for the purpose of auralization with
WFS, but in order to obtain a three-dimensional plane wave decomposition the use of a
spherical microphone array became necessary [11][8].
In our work we simulate the acoustic characteristics of a free field full spectrum
wave impact on a spherical microphone array and analyze it to obtain plane waves
representing the direct sources, reflections and reverberant part; these plane wave
responses are implemented in the driving filters of WFS, and we attempt to auralize
the sound. As the spherical microphone array is important for the three-dimensional
sampling of acoustic radiation, we study different aspects of the spherical microphone
array in this work and investigate their influence on spatial sound reproduction.
Finally, we auralize the sound field for the different cases, and perceptual listening
tests are conducted. Different test subjects are invited to listen to our simulated
wave fields, which are auralized using a WFS spatial sound renderer consisting of 88
loudspeaker elements in a two-dimensional, nearly circular geometry.
1.4 Organization of Thesis
The thesis is divided into 6 chapters; the presentation of the work in this thesis has
been put forward as per the practical scheme of the work.
We start with the fundamentals and state of the art of spherical microphone arrays in
Chapter 2. We explain the basic mathematical fundamentals in this chapter and
continue with spherical harmonic decomposition, types of arrays and the behaviour of
their radial filters, spatial sampling in spherical microphone arrays, and then plane
wave decomposition. We also discuss the spatial resolution of plane wave decomposition
and its limitations.
In Chapter 3 we bring up the issues related to the errors which arise in the course
of processing. Positioning error, microphone noise and spatial aliasing are discussed
in this chapter, and the state of the art on how these errors are incorporated in the
theoretical analysis is also presented.
Chapter 4 explains wave field synthesis, its basic theoretical background and its
limitations.
In Chapter 5 we first provide a description of the auralization process, then describe
the listening test, and present the related analysis of the perceptual effects of the
errors and artifacts.
Chapter 6 is the conclusion; it draws out the results more prominently, and we
discuss the final suggestions of this work and the future work.
2 MATHEMATICAL ANALYSIS AND
STATE OF THE ART
In this chapter we discuss the fundamentals of wave propagation and sound fields and
summarize the existing state of the art in spherical microphone array processing and
its auralization using wave field synthesis.

The work presented in this thesis is based on simulating free field room impulse
responses with a spherical microphone array; these impulse responses are then
utilized for rendering spatial sound with the WFS set-up.
2.1 Acoustic wave equation
The acoustic wave equation is the mathematical formulation of sound propagation
through a medium. This section provides the introductory basics of the wave equation;
for more discussion please refer to [9][20][21].
2.1.1 Homogeneous acoustic wave equation
For the derivation of the acoustic wave equation some basic assumptions are made [22]
[23] [24]:
1. The medium of propagation of the sound waves is homogeneous, that is, the material
characteristics of the medium are uniform throughout.
2. The medium is quiescent, that is, it remains in a state of inactivity or dormancy.
3. The propagation medium is characterized as an ideal gas.
4. The state changes in the gas are modeled as an adiabatic process, i.e. a process
that takes place without the transfer of heat or matter between a system and its
surroundings.

5. The static pressure p0 and static density ρ0 are large in comparison to the
pressure and density perturbations of the wave propagation.
Independence of the relevant parameters of the medium is assured by the first condition.
The second condition gives the assurance that the parameters are independent of time
and that there is no gross movement of the medium. The laws of ideal gases can be
applied as a result of the third assumption; the fourth assumption postulates that
there is no energy exchange in the form of heat conduction in the medium, i.e. there
are no propagation losses. The fifth assumption tells us that we can linearize the
field variables and medium characteristics around an operating point.
Two fundamental principles are used to derive the wave equation:
1. conservation of mass
2. the equation of momentum
Figure 2.1: Infinitesimal volume element used for the derivation of Euler’s equation.
The momentum equation describes the relation between the force applied to a volume
element and the acceleration of the element due to this applied force. In Figure 2.1 an
infinitesimal volume element is considered; we use this construction for the derivation
of Euler's equation [20][9]. Consider an infinitesimal volume element of fluid ∆x ∆y ∆z.
All six faces experience forces due to the pressure p(x, y, z) in the fluid.
Assume that the pressure on one side is greater than on the other side; a net force is
then exerted on the volume element, and it tends to move along the direction of the
force. From Newton's laws of motion we relate this force with the acceleration.
Carrying out the same analysis for all three directions, we end up with Euler's
equation, which relates the pressure applied to the fluid to changes in the particle
velocity of the fluid.
\[ \rho_0 \frac{\partial \upsilon}{\partial t} = -\nabla p \qquad (2.1) \]
Here ρ0 is the fluid density and υ is the velocity vector at any position (x, y, z) in the
medium,

\[ \upsilon = u\,e_x + v\,e_y + w\,e_z \qquad (2.2) \]
p is the pressure. ∇ is called the gradient or nabla operator and is defined as

\[ \nabla \equiv \frac{\partial}{\partial x}\,e_x + \frac{\partial}{\partial y}\,e_y + \frac{\partial}{\partial z}\,e_z \qquad (2.3) \]

where e_x, e_y and e_z are unit vectors in the x, y, z directions respectively; in the
literature they are sometimes also written as î, ĵ, k̂. −∇p is the pressure gradient
and ∂υ/∂t is the rate of change of the particle velocity.
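As a quick consistency check of Euler's equation 2.1, a one-dimensional plane wave with particle velocity υ = p/(ρ0 c) satisfies it identically. The impedance relation υ = p/(ρ0 c) is the standard plane-wave result; it is introduced here only for the check and is not taken from the thesis. A symbolic sketch:

```python
import sympy as sp

x, t, rho0, c, k = sp.symbols('x t rho0 c k', positive=True)
omega = c * k                      # plane-wave dispersion relation omega = c*k
p = sp.sin(k * x - omega * t)      # 1-D pressure plane wave
v = p / (rho0 * c)                 # plane-wave particle velocity, v = p/(rho0*c)

# Euler's equation (2.1) reduced to one dimension: rho0 * dv/dt = -dp/dx
lhs = rho0 * sp.diff(v, t)
rhs = -sp.diff(p, x)
print(sp.simplify(lhs - rhs))      # 0, so the plane wave satisfies (2.1)
```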
The second equation, which follows from the conservation of mass, is given as [22][20]:

\[ \frac{\partial \rho}{\partial t} + \rho\,\nabla \cdot \upsilon = 0 \qquad (2.4) \]
where ρ is the density of the propagation medium and υ is the acoustic particle
velocity. Under the above five assumptions, equation 2.4 signifies that the time rate
of change of the density of the medium is proportional to the divergence of the
particle velocity of the medium times the density of the medium.
Further taking above assumptions into consideration the time derivative in equation
2.5 expresses the proportionality between the time derivatives of the acoustic pressure and the density ρ; refer to [20][25][22] for a more detailed description.
∂p/∂t = c² ∂ρ/∂t (2.5)
where p is the pressure, a variable in position and time t, and c is the speed of sound. Equation 2.5 gives the temporal derivative of the density of the propagation medium in terms of changes in pressure; combining equations 2.5 and 2.4 in view of the last assumption, we get
−∂p/∂t = ρ0c² ∇·υ (2.6)
Equations 2.1 and 2.6, together with initial and boundary conditions, form a complete set of first order partial differential equations with a unique solution. These equations can be combined into a single second order equation [25][20]. Taking the time derivative of equation 2.6 gives
−∂²p/∂t² = ρ0c² ∇·(∂υ/∂t) (2.7)
and replacing the particle velocity term in equation 2.7 with the pressure gradient from Euler's equation 2.1, we obtain the homogeneous wave equation
∇²p − (1/c²) ∂²p/∂t² = 0 (2.8)
where p is the pressure, a function of position and time t. Equation 2.8 can also be represented in the frequency domain by applying the Fourier transform with respect to time t to the acoustic pressure p [26][9]:
∇²P(r, ω) + (ω/c)² P(r, ω) = 0 (2.9)
Equation 2.9 is known as the Helmholtz equation; r = (x, y, z) is the position, ω/c is the wave number k, and ω = 2πf. Analytically, k = 2π/λ with λ the wavelength, so k is the phase advance in radians per unit length: if we want to know the phase of a wave after it has travelled, say, (7/9)λ, then (7/9)λ · k = 14π/9 gives the phase.
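The wave-number arithmetic above can be checked directly; the following Python/NumPy sketch is not part of the thesis, and the speed of sound and frequency are illustrative values only.

```python
import numpy as np

c = 343.0            # assumed speed of sound in air [m/s]
f = 1000.0           # illustrative frequency [Hz]
lam = c / f          # wavelength λ = c/f
k = 2 * np.pi / lam  # wave number k = 2π/λ = ω/c

# phase accumulated over a travelled distance d = (7/9)λ, as in the text
d = (7.0 / 9.0) * lam
phase = k * d        # = (2π/λ)·(7/9)λ = 14π/9, independent of f and c
print(phase)         # ≈ 4.887 rad (= 14π/9)
```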
2.1.2 Solution of the wave equation in Cartesian coordinates
The general solution of the wave equation in Cartesian coordinates is derived through the Helmholtz equation in three dimensions [9]:

P(r, ω) = A(ω) e^{i(kx x + ky y + kz z)} (2.10)
where A(ω) is an arbitrary constant. Here we define k through

k² = kx² + ky² + kz² (2.11)
Another notation for the plane wave solution is

P(r, ω) = A(ω) e^{i k·r} (2.12)
In the time domain the solution of the wave equation is

p(t) = A e^{i(kx x + ky y + kz z − ω0 t)} (2.13)

p(t) = A e^{i(k·r − ω0 t)} (2.14)
A is a constant. This is the plane wave solution of the wave equation at a given frequency ω0. We have put forward the solution of the wave equation in Cartesian coordinates only in an introductory form; for a detailed derivation please see [9].
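The plane wave of equation 2.12 satisfies the Helmholtz equation exactly when k² = kx² + ky² + kz² (equation 2.11), since its Laplacian is −(kx² + ky² + kz²)P. A small Python sketch (not part of the thesis; wave vector and evaluation point are arbitrary illustrative values) verifies this with a finite-difference Laplacian:

```python
import numpy as np

kvec = np.array([30.0, 40.0, 0.0])   # illustrative wave vector [rad/m]
k = np.linalg.norm(kvec)             # |k| = 50
h = 1e-4                             # finite-difference step
r0 = np.array([0.1, 0.2, 0.3])       # arbitrary evaluation point

def P(r):
    """plane wave P(r) = exp(i k·r), equation 2.12 with A = 1"""
    return np.exp(1j * kvec.dot(r))

# central-difference Laplacian: Σ over axes of (P(r+he) − 2P(r) + P(r−he))/h²
lap = sum(
    (P(r0 + h * e) - 2 * P(r0) + P(r0 - h * e)) / h**2
    for e in np.eye(3)
)
residual = lap + k**2 * P(r0)        # Helmholtz residual, zero for the exact solution
print(abs(residual) / k**2)          # ≈ 1e-6, limited by the O(h²) scheme
```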
2.1.3 Solution of the wave equation in spherical coordinates
We now discuss in detail the solution of the wave equation in the spherical coordinate system, as this is directly relevant to the processing of spherical microphone arrays. We recall wave equation 2.8 and restate it here:
∇²p(x, y, z, t) − (1/c²) ∂²p(x, y, z, t)/∂t² = 0 (2.15)
∇² is also called the Laplace operator, ∆ = ∇², and is defined in Cartesian coordinates as

∇² ≡ ∂²/∂x² + ∂²/∂y² + ∂²/∂z² (2.16)
The spherical coordinate system shown in figure 2.2 is used throughout this thesis. From figure 2.2 we can express the Cartesian coordinates in terms of r, ϑ, ϕ:
Figure 2.2: Spherical coordinate system and its relation to Cartesian coordinate system
x = r sin ϑ cos ϕ,  y = r sin ϑ sin ϕ,  z = r cos ϑ (2.17)
Here r denotes the length of the vector r and the direction Ω ≡ (ϕ, ϑ) represents the azimuth–elevation pair. Hence r = √(x² + y² + z²), ϑ = tan⁻¹(√(x² + y²)/z) and
ϕ = tan⁻¹(y/x). Considering equations 2.15 and 2.17 we can express the wave equation in spherical coordinates as
(1/r²) ∂/∂r (r² ∂p/∂r) + (1/(r² sin ϑ)) ∂/∂ϑ (sin ϑ ∂p/∂ϑ) + (1/(r² sin² ϑ)) ∂²p/∂ϕ² − (1/c²) ∂²p/∂t² = 0 (2.18)
In this equation p is a function of (r, ϑ, ϕ, t). The zero right hand side reflects the assumption that there are no sources in the volume for which the equation is defined. The solutions of this wave equation in the frequency domain are explained in [9] and are given in two forms:
p(r, Ω, k) = Σ_{l=0}^{∞} Σ_{m=−l}^{l} (A_lm(k) · j_l(kr) + B_lm(k) · y_l(kr)) Y_l^m(Ω) (2.19)

p(r, Ω, k) = Σ_{l=0}^{∞} Σ_{m=−l}^{l} (C_lm(k) · h_l^{(1)}(kr) + D_lm(k) · h_l^{(2)}(kr)) Y_l^m(Ω) (2.20)
The two solutions represent the interior and the exterior problem: equation 2.20 refers to the exterior problem and equation 2.19 to the interior problem. We will elaborate on these two solutions and the coefficients A_lm(k), B_lm(k), C_lm(k) and D_lm(k) in later sections.
The level l and mode m are integers with l ≥ 0 and −l ≤ m ≤ l. The acoustic wave number, as defined earlier, is k = ω/c = 2πf/c, where f is the frequency of the sound wave and c is the speed of sound in the medium. The functions j_l(kr) and y_l(kr) are the spherical Bessel functions of the first and second kind respectively. Similarly, h_l^{(1)}(kr) and h_l^{(2)}(kr) are the spherical Hankel functions of the first and second kind. Y_l^m(Ω) is the spherical harmonic of level (order) l and mode m, defined as

Y_l^m(ϑ, ϕ) = √[ (2l + 1)/(4π) · (l − m)!/(l + m)! ] P_l^m(cos ϑ) e^{imϕ} (2.21)
These expressions, which are the outcome of the derivation of the solution of wave equation 2.15, are obtained by separation of variables in equation 2.18. The derivation and solutions are explained in detail in [9, page 186], [25, page 380] and [20, page 337]; for a more detailed analysis of the separation-of-variables approach used in solving the wave equation, please refer to [27]. In equation 2.21, P_l^m(cos ϑ) is the Legendre function of the first kind and i = √−1.
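Equation 2.21 can be evaluated directly with SciPy; this sketch is not part of the thesis, the direction is an arbitrary test value, and we assume the Condon–Shortley sign convention for P_l^m, which SciPy's `lpmv` uses. The result is compared against the well-known closed forms for l = 1.

```python
import numpy as np
from scipy.special import lpmv, factorial

def Y(l, m, theta, phi):
    """spherical harmonic Y_l^m(ϑ, ϕ) of equation 2.21 for m ≥ 0;
    lpmv supplies P_l^m(cos ϑ) including the Condon–Shortley phase"""
    norm = np.sqrt((2 * l + 1) / (4 * np.pi)
                   * factorial(l - m) / factorial(l + m))
    return norm * lpmv(m, l, np.cos(theta)) * np.exp(1j * m * phi)

theta, phi = 0.6, 2.1   # arbitrary direction (ϑ, ϕ)

# compare with the closed forms Y_1^0 = √(3/4π)·cosϑ and
# Y_1^1 = −√(3/8π)·sinϑ·e^{iϕ}
print(np.isclose(Y(1, 0, theta, phi),
                 np.sqrt(3 / (4 * np.pi)) * np.cos(theta)))          # True
print(np.isclose(Y(1, 1, theta, phi),
                 -np.sqrt(3 / (8 * np.pi)) * np.sin(theta) * np.exp(1j * phi)))  # True
```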
2.1.4 Spherical Bessel and Hankel functions
In [9] the solution of the spherical wave equation is given. We use the separation-of-variables approach [27] to solve the wave equation in the spherical coordinate system; in this process the spherical wave equation separates into four different differential equations. The solutions of these four constituent differential equations together give the solution of the wave equation in spherical coordinates, and they lead us to the spherical Bessel functions, the Hankel functions and the Legendre polynomials that appear in the spherical harmonics.
j_l(kr) and y_l(kr) are related to the corresponding Bessel functions as [28][9]:

j_l(x) ≡ √(π/(2x)) J_{l+1/2}(x),  y_l(x) ≡ √(π/(2x)) Y_{l+1/2}(x) (2.22)
The equations in 2.22 are valid for l ∈ R. The spherical Hankel functions of the first and second kind, h_l^{(1)}(x) and h_l^{(2)}(x), are defined as

h_l^{(1)}(x) ≡ j_l(x) + i · y_l(x),  h_l^{(2)}(x) ≡ j_l(x) − i · y_l(x) (2.23)
Here x is the argument, in our case kr. When x is real, h_l^{(1)}(x) is the conjugate of h_l^{(2)}(x); in our case kr is always real, as it is the product of the wave number and the radius (the distance from the origin). Since h_l^{(1)}(x) ∝ e^{ikr} and h_l^{(2)}(x) ∝ e^{−ikr} [9], the Hankel function of the first kind represents an outgoing wave whereas the second kind represents an incoming wave. These solutions are used depending
upon the location of the sources; in our case the sources lie outside the measurement sphere (refer to the explanation of the soap-bubble analogy in chapter 1), hence we are interested in the incoming wave for the analysis of our spherical microphone array.
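The half-integer-order relation of equation 2.22 and the conjugacy of the two Hankel functions for real arguments can be verified numerically; this Python/SciPy sketch is not part of the thesis, and the order and argument are illustrative.

```python
import numpy as np
from scipy.special import spherical_jn, spherical_yn, jv, yv

l, x = 2, 3.7   # illustrative order and (real) argument

jl = spherical_jn(l, x)
yl = spherical_yn(l, x)

# relation to the half-integer-order Bessel functions (equation 2.22)
print(np.isclose(jl, np.sqrt(np.pi / (2 * x)) * jv(l + 0.5, x)))   # True
print(np.isclose(yl, np.sqrt(np.pi / (2 * x)) * yv(l + 0.5, x)))   # True

# spherical Hankel functions (equation 2.23): conjugates for real x
h1 = jl + 1j * yl
h2 = jl - 1j * yl
print(np.isclose(h1, np.conj(h2)))                                 # True
```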
Figure 2.3: Spherical Bessel functions of the first kind j_l(x) (left) and the second kind y_l(x) (right) for orders l ∈ {0, 3, 6} [11]
Figure 2.3 shows the behaviour of these functions for different levels (orders) l as a function of the argument x. A few conclusions can be drawn. As seen from the plots, the spherical Bessel functions of the first kind are finite at the origin, but for higher orders l > 0 there is an initial region where the function remains close to zero (except for j0(x)); the functions of the second kind diverge towards −∞ near the origin. Consequently, and obviously from equation 2.23, the spherical Hankel functions are singular at x = 0. The other consequence, important in the later part of our analysis, is the small magnitude of j_l(x) for l > x, where for us x is kr, the product of wave number and radius, which we may regard as a measure of the frequency of the acoustic wave. The spherical wave solution therefore gives a kind of damped response in the low frequency region: when we use a high value of the level l, also referred to as the transform order, we lose low frequency information of the acoustic wave, and retrieving it requires amplifying the signal extensively. These conclusions will be recalled when we discuss the interior–exterior problem and the radial filter components (mode strength) for the rigid sphere in plane wave decomposition.
2.1.5 Legendre functions
Referring to expression 2.21, the term P_l^m(x) appeared in the solution of the wave equation in the spherical coordinate system. This term is called the Legendre function. The Legendre functions for the case m = 0 are known as Legendre polynomials, denoted by P_l(x), and are expressed by Rodrigues' formula as [9]:

P_l(x) = 1/(2^l l!) · d^l/dx^l (x² − 1)^l (2.24)
The functions P_l^m(x), which carry two indices, are known as associated Legendre functions, for m ≠ 0. For positive m

P_l^m(x) = (−1)^m (1 − x²)^{m/2} d^m/dx^m P_l(x) (2.25)

and for negative m

P_l^{−m}(x) = (−1)^m (l − m)!/(l + m)! P_l^m(x),  m > 0 (2.26)
The property of the Legendre functions that makes them attractive for us is that they form a set of orthogonal functions for each mode m. Hence the spherical harmonics are also a set of orthogonal functions. For further details the reader is referred to [9][25].
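Rodrigues' formula (equation 2.24) can be checked against a library implementation by differentiating the polynomial (x² − 1)^l symbolically; a small Python/SciPy sketch, not part of the thesis, with an arbitrary test point:

```python
import numpy as np
from scipy.special import eval_legendre, factorial

# Rodrigues' formula: P_l(x) = 1/(2^l l!) · d^l/dx^l (x² − 1)^l
l = 4
poly = np.polynomial.Polynomial([-1, 0, 1]) ** l   # (x² − 1)^l as a polynomial
for _ in range(l):
    poly = poly.deriv()                            # differentiate l times

x = 0.37                                           # arbitrary test point
P_rodrigues = poly(x) / (2 ** l * factorial(l))
print(np.isclose(P_rodrigues, eval_legendre(l, x)))   # True
```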
2.1.6 Spherical harmonics
Any function on a sphere can be represented by a combination of spherical harmonics [9]; in our case the solution of the acoustic wave equation is obtained in terms of the spherical harmonics Y_l^m(ϑ, ϕ), or Y_l^m(Ω), given in equation 2.21. The spherical harmonics define the angular components of the wave solution. Considering equation 2.26, the spherical harmonic for negative m can be obtained from the solution for positive m as

Y_l^{−m}(Ω) = (−1)^m [Y_l^m(Ω)]*,  m > 0 (2.27)
where [Y_l^m(Ω)]* is the complex conjugate of Y_l^m(Ω). There are 2l + 1 different spherical harmonics for each level l, as −l ≤ m ≤ l. A further property of the spherical harmonics is that they are not only orthogonal but orthonormal [9, page 191]:

∫_{S²} Y_l^m(Ω) [Y_{l′}^{m′}(Ω)]* dΩ = δ_{ll′} δ_{mm′} (2.28)

Here δ_{ll′} is the Kronecker delta, which is 1 for l = l′ and 0 otherwise. The surface integral is defined as

∫_{S²} dΩ = ∫_0^{2π} dϕ ∫_0^{π} sin ϑ dϑ (2.29)
As said above, any function on a sphere can be decomposed into a sum of spherical harmonics [9, page 192], [29, page 202]:

f(Ω) = Σ_{l=0}^{∞} Σ_{m=−l}^{l} f_lm(k) Y_l^m(Ω) (2.30)
This expression is also termed the inverse spherical Fourier transform (ISFT) [29]. As the spherical harmonic functions are orthonormal, we can obtain the spherical Fourier transform coefficients as

f_lm(k) = ∫_{S²} f(Ω) [Y_l^m(Ω)]* dΩ (2.31)

The derivation of this expression can be found in [29, page 202] and in [11], appendix (A.1). The importance of the expressions presented above is that with their help we obtain the spherical wave decomposition and, in turn, the plane wave decomposition.
The spherical harmonic functions are further depicted in figure 2.4 for levels l ∈ {0, 1, 2, 3}.
In the expression for the spherical harmonics in equation 2.21, the Legendre function P_l^m represents standing spherical waves in ϑ and the factor e^{imϕ} represents travelling spherical waves in ϕ [17].
Figure 2.4: Spherical harmonics Y_l^m(Ω) for orders l ∈ {0, 1, 2, 3} [11]
2.1.7 Radial Velocity
So far we have talked about the pressure field; now we shed some light on the radial velocity of the sound wave. As the radial velocity in the plane wave decomposition of spherical waves will represent our directivity function, an introduction to this term is important before we go further with the spherical and plane wave decomposition concepts.
Equation 2.3 is written in spherical coordinates as

∇ ≡ (∂/∂r) er + (1/r)(∂/∂ϑ) eϑ + (1/(r sin ϑ))(∂/∂ϕ) eϕ (2.32)
where e_(·) represents the unit vectors in spherical coordinates. Applying the Fourier transform to Euler's equation 2.1 gives

iρ0ck υ = ∇p(x, y, z, k) (2.33)

Equation 2.33 is in Cartesian coordinates. In spherical coordinates the velocity vector is

υ = u(r, Ω, k) eϑ + υ(r, Ω, k) eϕ + w(r, Ω, k) er (2.34)
Solving these equations we obtain the expression for the radial velocity component:

w(r, Ω, k) = (1/(iρ0ck)) · ∂p(r, Ω, k)/∂r (2.35)
2.2 Spherical harmonic decomposition
With the background described in the previous part, we now turn to the specific solutions. There were two solutions of wave equation 2.18, given by equations 2.19 and 2.20. In [30] the author explains the spherical harmonic decomposition for spherical microphone arrays and its limitations. Further, [31] and [17] present the theoretical analysis for plane wave decomposition (PWD) using spherical convolution and then explain the technique of spherical Fourier transforms used for PWD. In this section we derive expressions for the spherical harmonic decomposition and discuss various consequences encountered during this part of wave field analysis.
2.2.1 Interior and Exterior problem
The solution given by equation 2.20 describes the pressure field for the exterior problem [9]. Referring to figure 2.5, all sources are inside the spherical volume defined by radius a. As the solution is valid only in the source-free region, the pressure field is described for r ≥ a. Since all sources lie inside the sphere, and following our discussion of the Hankel functions, we only take into consideration the first term of equation 2.20: the sound waves are outgoing, and as there are no sources outside the region r = a there are no incoming waves, so the second part of the solution is not considered. Our solution is therefore

p(r, Ω, k) = Σ_{l=0}^{∞} Σ_{m=−l}^{l} C_lm(k) · h_l^{(1)}(kr) · Y_l^m(Ω) (2.36)
We now focus more rigorously on the interior problem, as this is more relevant for our work, and therefore all further explanations refer to the interior problem. In the interior problem analysis, the sound sources are located outside the spherical
Figure 2.5: Exterior problem [11]
volume, and estimating the acoustic effect on the surface of this volume is sufficient to characterize the sound in space. Going a bit further, we may say that in order to map this surface we use the spherical microphone array: the array is enclosed by an imaginary spherical volume, and at each observation point of the array we attempt to measure the acoustic effect invoked by the external sources. Figure 2.6 shows the case of the interior problem: all sources are outside the measurement sphere r = b, and the regions 1 and 2 represent sources outside the valid measurement region. The solution for the interior problem comes from equation 2.19. As the solution must be finite at all points within the measurement region r ≤ b, and as from our discussion of the Hankel and spherical Bessel functions in section 2.1.4 both the spherical Hankel functions and the spherical Bessel function of the second kind are not finite at the origin r = 0, our solution contains only the first term of equation 2.19 and is given as [9]:
p(r, Ω, k) = Σ_{l=0}^{∞} Σ_{m=−l}^{l} A_lm(k) · j_l(kr) · Y_l^m(Ω) (2.37)
where p(r, Ω, k) is the sound pressure at the point (r, Ω), k is the wave number, A_lm(k) is the coefficient of the spherical harmonic Y_l^m(Ω) of order l and mode m, and j_l(kr) is the spherical Bessel function of the first kind.
Figure 2.6: Interior problem [11]
Having defined the expression for the pressure field, we can also define an expression for the radial velocity w(r, Ω, k); using equation 2.37 in equation 2.35 we obtain

w(r, Ω, k) = (1/(icρ0)) Σ_{l=0}^{∞} Σ_{m=−l}^{l} A_lm(k) · j′_l(kr) · Y_l^m(Ω) (2.38)
where j′_l(kr) is the derivative of j_l(kr) with respect to kr, obtained via the chain rule:

∂j_l(kr)/∂r = ∂j_l(kr)/∂(kr) · ∂(kr)/∂r = j′_l(kr) · k (2.39)
As we are using a spherical microphone array, we can describe the pressure at any point on the surface of the array in the same fashion as presented in the interior problem. This will become clearer in later sections.
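The chain-rule step in equation 2.39 can be verified numerically by comparing a central difference in r with k · j′_l(kr); a Python/SciPy sketch, not part of the thesis, with illustrative values:

```python
import numpy as np
from scipy.special import spherical_jn

# check ∂/∂r j_l(kr) = k · j_l'(kr) (equation 2.39) by a central difference
l, k, r = 2, 40.0, 0.05    # illustrative order, wave number [rad/m], radius [m]
h = 1e-6                   # step for the finite difference in r

fd = (spherical_jn(l, k * (r + h)) - spherical_jn(l, k * (r - h))) / (2 * h)
exact = k * spherical_jn(l, k * r, derivative=True)
print(np.isclose(fd, exact, rtol=1e-5))   # True
```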
2.2.2 Spherical wave spectrum
Now, as defined in equation 2.37, if we can obtain the coefficients A_lm(k) then we can easily define the pressure field p(r, Ω, k). Exploiting the orthonormality of the spherical harmonics and the fact that any arbitrary function on a sphere can be
expanded in terms of its spherical harmonics [29], we follow the procedure described in appendix A.1; applying the same treatment to equation 2.37 we obtain

A_lm(k) = (1/j_l(kr)) ∫_{S²} p(r, Ω, k) [Y_l^m(Ω)]* dΩ (2.40)

The expression for A_lm(k) is also called the spherical wave spectrum, as it can be regarded as the spherical Fourier transform of p(r, Ω, k) [9], also written as

P_lm(r, k) = (1/j_l(kr)) ∫_{S²} p(r, Ω, k) [Y_l^m(Ω)]* dΩ (2.41)
P_lm(r, k) describes the sound wave in frequency in terms of the wave number, i.e. in k-space.
2.3 Spherical wave sound fields
Before we go to next sections lets bring out some analysis as how to express a spherical
wave at a point due to some given source. Refer to figure 2.7.
Figure 2.7: Geometrical description for the calculation of pressure p(r,ϑ,ϕ,k) at pointP for source at Q
We consider a point source, also termed a monopole, at the origin O. The pressure p(r, k) at point P is given by the expression [9, page 198]

p(r, k) = −i p0(k) c k Qs · e^{ikr}/(4πr) (2.42)

Here r is the length of the position vector r of point P, c is the speed of sound, and k is the wave number. Qs represents the source strength: the amount of fluid volume injected into the medium per unit time [9, page 198]. The sound radiation from a monopole is omnidirectional, hence independent of the angles ϑ and ϕ; p0(k) is the magnitude of the source at the origin.
If we now want to calculate the pressure field at point P due to a source located at point Q, this can be done by some geometrical manipulation of equation 2.42. Assume the same monopole to be located at Q with distance rs = |rs| from the origin; replacing the distance r in equation 2.42 by the distance |r − rs| between P and Q gives the pressure at P due to the source at Q. Therefore the pressure p(r, Ω, k) at point P for a source at Q is

p(r, Ω, k) = −i p0(k) c k Qs · e^{ik|r − rs|}/(4π|r − rs|) (2.43)

Here Ω ≡ (ϕ, ϑ). The significance of this equation is that we have derived an expression for the pressure field at a point on a sphere due to a source located at a position other than the origin. Drawing an analogy with the spherical microphone array, we may consider the array as a spherical surface and describe, at any point on that surface, the pressure field due to a source located at any position Q. Note also that since |r − rs| depends on ϕ and ϑ, the sound pressure in equation 2.43 also depends on ϕ and ϑ. Further, as derived in [9, page 198], the factor e^{ik|r − rs|}/(4π|r − rs|) is equivalent to the Green function G(r|rs).
2.4 Spherical harmonic expansion of plane wave
In section 2.1.2 an expression was given to calculate the pressure of an ideal plane wave in Cartesian coordinates; we now present a similar calculation in spherical coordinates.

p(r, Ω, k) = p0(k) · e^{i k·r} (2.44)

where p0(k) is the magnitude of the plane wave, r is the position vector (r, Ω), and k is the wave vector. Assuming p0(k) = 1 for the purpose of the derivation and using equation 2.44 in 2.37, we get
e^{i k·r} = Σ_{l=0}^{∞} Σ_{m=−l}^{l} A_lm(k) · j_l(kr) · Y_l^m(Ω) (2.45)
Here k and r are the wave vector and position vector respectively. We point out that the plane wave, described in the vector domain by the wave vector and position vector in equation 2.44, is here expressed in terms of the wave number k and the scalar distance r. A more detailed description is given in appendix A.2.
Equation 2.45 can be further transformed as explained in [9, page 227]:

e^{i k·r} = 4π Σ_{l=0}^{∞} i^l j_l(kr) Σ_{m=−l}^{l} Y_l^m(Ω) · [Y_l^m(Ω0)]* (2.46)
Here Ω0 ≡ (ϕ0, ϑ0) is the incidence direction of the plane wave, whereas Ω is the point where we observe the pressure field. From equations 2.45 and 2.46 we conclude that

A_lm = 4π i^l · [Y_l^m(Ω0)]* (2.47)

and we observe that the spherical wave coefficients A_lm of a plane wave sound field do not depend on k, i.e. on the frequency f of the wave.
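The expansion of equation 2.46 can be checked numerically. Summing over m with the Legendre addition theorem, Σ_m Y_l^m(Ω)·[Y_l^m(Ω0)]* = (2l + 1)/(4π) · P_l(cos γ), the expansion collapses to e^{i kr cos γ} = Σ_l i^l (2l + 1) j_l(kr) P_l(cos γ), with γ the angle between r and k. A Python/SciPy sketch (not from the thesis; kr, γ and the truncation level are illustrative):

```python
import numpy as np
from scipy.special import spherical_jn, eval_legendre

kr = 5.0        # illustrative product of wave number and radius
gamma = 0.9     # illustrative angle between k and r [rad]
exact = np.exp(1j * kr * np.cos(gamma))

L = 25          # truncation level, chosen well above kr
approx = sum(
    1j**l * (2 * l + 1) * spherical_jn(l, kr) * eval_legendre(l, np.cos(gamma))
    for l in range(L + 1)
)
print(abs(approx - exact))   # ≈ 0: the truncated series converges for L >> kr
```

Note how quickly the series converges once l exceeds kr, in line with the decay of j_l discussed in section 2.1.4.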
In [11, page 18] equation 2.46 has been simulated for a plane wave sound field of 1 kHz. The simulation was run for different maximum values of the level l, and it was deduced that the plane wave field can be approximated accurately only within a bounded region around the origin, and that this region grows with l. If in equation 2.46 we replace the ∞ in the first summation by a maximum level l = L, we can establish the approximate rule

d/λ = L/(2π) (2.48)

where d is the radius of the region, L is the maximum level and λ is the wavelength of the plane wave. This proportionality states that the region for which we can effectively define the pressure field is proportional to the level L. Reference plots are provided in appendix A.3.
2.5 Mode strength
We now define an expression for the combination of Bessel and Hankel functions that appeared in earlier sections during the derivation of the coefficients A_lm of the spherical harmonics. In the measurement of sound fields with spherical microphone arrays, the interaction of the sound field with the array structure has to be taken into consideration [9][17][31]. We recall equations 2.37 and 2.40 and express them in a generalized form so as to associate them with different kinds of spherical microphone array structures. The equations are written as
s(r, Ω, k) = Σ_{l=0}^{∞} Σ_{m=−l}^{l} A_lm(k) · b_l(kr) · Y_l^m(Ω) (2.49)

A_lm(k) = (1/b_l(kr)) ∫_{S²} s(r, Ω, k) [Y_l^m(Ω)]* dΩ (2.50)
Here s(r, Ω, k) is the spherical microphone array response. The term b_l(kr) is called the mode strength; for different microphone array structures the interaction of the sound field with the array is approximated using this term [9][32]. In general we define two types of spherical array structures:
• Open sphere configuration
• Rigid sphere configuration
In the open sphere configuration a single microphone is mounted on a robotic arm and measurements are made at predefined microphone positions on the sphere. In the rigid sphere configuration the sensors are arranged on a solid sphere. Images of the open sphere and rigid sphere configurations are given in appendix A.4.
b_l(kr) = 4π i^l j_l(kr)  (open sphere arrays)

b_l(kr) = 4π i^l [ j_l(kr) − (j′_l(ka) / h^{(2)}′_l(ka)) · h^{(2)}_l(kr) ]  (rigid sphere arrays) (2.51)

Here j_l(kr) is the spherical Bessel function of the first kind, h^{(2)}_l(kr) and h^{(2)}′_l(ka) are the spherical Hankel function of the second kind and its derivative, (·)′ denotes the derivative with respect to the argument, and a is the radius of the sphere, with r ≥ a.
The rigid sphere configuration is preferable to the open sphere configuration [31][17][32]. Its major disadvantage is that it interferes with the surrounding sound field. The mode strength does account for the scattering caused by the rigid sphere when calculating the incident waves. The scattering effect is negligible for small spheres but becomes more prominent when a larger sphere is used. Hence, with a larger sphere, measurements should be made more carefully, as the scattered waves can act as additional incident waves when they are reflected by other objects in the measurement environment and impinge on the sphere again [31].
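Both cases of equation 2.51 are easy to evaluate; the following Python/SciPy sketch (not part of the thesis) computes the order-0 mode strength at kr = π, where the open-sphere value vanishes because j0(π) = 0, whereas the rigid-sphere value stays clearly non-zero. This is the numerical-conditioning point made below.

```python
import numpy as np
from scipy.special import spherical_jn, spherical_yn

def b_open(l, kr):
    """open-sphere mode strength (equation 2.51)"""
    return 4 * np.pi * 1j**l * spherical_jn(l, kr)

def b_rigid(l, kr, ka):
    """rigid-sphere mode strength (equation 2.51), sphere radius a"""
    h2 = lambda n, x: spherical_jn(n, x) - 1j * spherical_yn(n, x)
    dh2 = lambda n, x: (spherical_jn(n, x, derivative=True)
                        - 1j * spherical_yn(n, x, derivative=True))
    return 4 * np.pi * 1j**l * (spherical_jn(l, kr)
                                - spherical_jn(l, ka, derivative=True)
                                / dh2(l, ka) * h2(l, kr))

kr = np.pi   # j0(π) = sin(π)/π = 0, a zero of the open-sphere mode strength
print(abs(b_open(0, kr)))        # ≈ 0, so 1/b_l is ill-conditioned here
print(abs(b_rigid(0, kr, kr)))   # ≈ 3.8, clearly non-zero
```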
In figure 2.8 the mode strength b_l(kr) is plotted as a function of kr for different orders l; in the figure the order l is denoted by the letter n.
The major advantage of the rigid sphere configuration is improved numerical conditioning: in equation 2.50 the spherical coefficient A_lm contains the factor 1/b_l,
(a) Rigid sphere array, (b) Open sphere array
Figure 2.8: Mode strength for the rigid sphere array and the open sphere array [31]
and b_l is zero for some values of kr in the open sphere configuration but not in the case of rigid spheres [17][31][33].
2.6 Discretization of the spherical aperture and spatial aliasing
The analysis presented so far described the sound field on a continuous spherical aperture, but in practice we can sample a sphere only at a finite number of microphone positions. Hence we need to translate the expression for the spherical coefficients A_lm(k), defined by an integral over the unit sphere in equation 2.50, into a finite summation. The approximation of integrals by finite sums is known as quadrature, and the expression for A_lm(k) as a finite summation is given as [8, page 43]:

A_lm(k) ≈ Â_lm(k) = (1/b_l(kr)) Σ_{q=1}^{Q} w_q · s(r, Ω_q, k) · [Y_l^m(Ω_q)]* (2.52)

where Â_lm(k) is the approximated spherical coefficient, Q is the number of microphone positions Ω_q, and w_q are the quadrature weights. The weights w_q compensate for the different quadrature schemes so that the sound field is approximated as closely as possible to the continuous aperture.
Spherical microphone arrays perform spatial sampling of the sound pressure on a sphere, and, similarly to time-domain sampling, spatial sampling also requires band limitation, i.e. a limited harmonic order l, to avoid aliasing [31][34]. Hence, in order to avoid spatial aliasing, the following must hold [8, page 44]:

A_lm(k) = 0 for l > Lmax (2.53)

Here Lmax is the highest-order spherical coefficient of the sound field. The condition in 2.53 must be ensured when sampling the sphere, otherwise spatial aliasing will
corrupt the coefficients at lower orders. A more detailed analysis of spatial aliasing in spherical microphone arrays is presented in [34]. The sampling of level-limited (the words level and order are used interchangeably and refer to l) sound fields can be done in many different ways, as explained in [35][31][8]. These quadrature schemes allow us to sample the sphere with negligible or no aliasing as long as equation 2.53 holds. There are commonly three sampling schemes; a more detailed mathematical description can be found in the references provided above.
1. Chebyshev quadrature: the sampling is uniform in elevation ϑ and azimuth ϕ. The total number of microphones in this scheme is Qch = 2Lmax · (2Lmax + 1).

2. Gauss–Legendre quadrature: the sphere is sampled uniformly in azimuth ϕ, but in elevation it is sampled at the zeros of the Legendre polynomial of level Lmax + 1. The number of microphone positions required in this scheme is QGL = Lmax · (2Lmax + 1).

3. Lebedev grid: the microphone positions are spread uniformly over the surface of the sphere such that each point has the same distance to its nearest neighbours, with

QLb = (4/3)(Lmax + 1)² (2.54)
In this work we use the Lebedev grid, as it has an advantage over the other two schemes: it uses a smaller number of microphone positions for the approximation. A more detailed description of the Lebedev grid is given in [36][37][38][39]; reference [39] gives Fortran code for calculating the grid points and weights for levels up to l = 131.
Using the quadrature approach for the discretization of the sphere, we require a level-limited sound field in order to obtain aliasing-free sampling. For plane wave sound fields, however, the restriction to a maximum level Lmax does not hold, as we can see from equations 2.45 and 2.46, which involve an infinite number of non-zero spherical coefficients A_lm(k); hence some degree of spatial aliasing does occur. But referring to section 2.1.4, we know that the spherical Bessel functions j_l(kr) decay rapidly for l > kr,
Figure 2.9: Different quadrature schemes [8]
therefore the strength of the coefficients in equation 2.45 can be assumed to show a similar behaviour for l > kr. Hence, in theory, the aliasing error is negligible if the operating frequency of the microphone array satisfies kr ≤ Lmax.
2.7 Plane wave decomposition
w(Ω0, k) and are arriving from all the directions Ω0. Integrating equation 2.56 over all incident directions, we obtain the expression for the spherical Fourier coefficients flm(k):

flm(k) = 4π i^l bl(kr) ∫_S² w(Ω0, k) · Y*lm(Ω0) dΩ0    (2.57)
The integral in equation 2.57 is the spherical Fourier transform of the amplitudes w(Ω0, k), which we denote wlm(k); hence

wlm(k) = flm(k) · 1/(4π i^l bl(kr))    (2.58)
To obtain the amplitude ws(Ωs, k) of a plane wave arriving from an arbitrary direction Ωs, we perform an inverse SFT of equation 2.58:

ws(Ωs, k) = Σ_{l=0}^{∞} Σ_{m=−l}^{l} flm(k) · 1/(4π i^l bl(kr)) · Ylm(Ωs)    (2.59)
ws(Ωs, k) is also called the directivity function and describes the decomposed plane wave for a particular direction Ωs. Ωs is known as the steering direction of the microphone array and indicates the direction for which the plane wave decomposition is computed.
Further, if we use equation 2.55 in equation 2.59, we obtain the expression for the plane wave decomposition in terms of the spherical harmonic coefficients Alm(k):

ws(Ωs, k) = Σ_{l=0}^{∞} Σ_{m=−l}^{l} 1/(4π i^l) · Alm(k) · Ylm(Ωs)    (2.60)
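The decomposition in equation 2.60 can be sketched numerically. The following snippet is an illustration, not code from this thesis: it assumes SciPy's `sph_harm` convention (arguments `m, l, azimuth, polar angle`) and the plane wave coefficients Alm(k) = 4π i^l Y*lm(Ω0) for a unit-amplitude plane wave from Ω0, and evaluates the sum truncated at Lmax.

```python
import numpy as np
from scipy.special import sph_harm  # sph_harm(m, l, azimuth, polar_angle)

def pwd_directivity(a_lm, l_max, theta_s, phi_s):
    """Evaluate equation 2.60 truncated at level l_max.
    a_lm maps (l, m) to the spherical harmonic coefficient Alm(k)."""
    w = 0.0 + 0.0j
    for l in range(l_max + 1):
        for m in range(-l, l + 1):
            w += a_lm[(l, m)] / (4 * np.pi * 1j ** l) * sph_harm(m, l, phi_s, theta_s)
    return w

# Coefficients of a unit plane wave from Omega0 = (theta0, phi0):
# Alm(k) = 4*pi * i**l * conj(Ylm(Omega0)).
theta0, phi0, l_max = np.pi / 3, np.pi / 4, 8
a_lm = {(l, m): 4 * np.pi * 1j ** l * np.conj(sph_harm(m, l, phi0, theta0))
        for l in range(l_max + 1) for m in range(-l, l + 1)}

# The decomposed amplitude peaks when the steering direction hits Omega0;
# on-axis the truncated sum equals (l_max + 1)**2 / (4*pi).
on_axis = abs(pwd_directivity(a_lm, l_max, theta0, phi0))
off_axis = abs(pwd_directivity(a_lm, l_max, theta0 + 1.0, phi0))
print(on_axis > off_axis)  # True: steering at the source direction dominates
```

The finite main-lobe width seen when steering away from Ω0 is exactly the level-limited resolution discussed in the next section.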
2.8 Spatial resolution in plane wave decomposition
In [17] [32] the spatial resolution of the plane wave decomposition with respect to the level l has been analysed. We cannot use levels higher than kr, because this results in negligible amplitudes of the spherical harmonic coefficients in the lower frequency regions. Hence our plane wave decomposition remains level-limited to a finite extent.
It has been shown in [17] that the directivity decreases for lower values of the level l. This directivity pattern has been quantified in [17] and [11, page 39] by the expression

ws(Θ) = (L + 1) / (4π (cos Θ − 1)) · (P_{L+1}(cos Θ) − P_L(cos Θ))    (2.61)

Here Θ is the angle between the arrival direction of the plane wave Ω0 and the steering direction of the microphone array Ωs, and P_L(·) is the Legendre polynomial of level L. ws(Θ) is the directional weight and defines the spatial resolution of a plane wave decomposition calculated with a maximum level L. Refer to figure 2.10.
Figure 2.10: Directivity weights for PWD verses l [17]
In this figure the directivity weights ws(Θ) are plotted for different levels l. It can be noticed that for Θ = 0, i.e., when the array looks towards the arrival direction of the plane wave, the directivity coefficient (main lobe) shows a very sharp peak for higher levels, and it broadens as the level is decreased.
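The directional weight of equation 2.61 can be evaluated directly, e.g. with the Legendre polynomials from SciPy. This is an illustrative sketch, not part of the thesis; near Θ = 0 the weight approaches the on-axis value (L + 1)²/(4π), reproducing the sharpening of the main lobe with increasing level.

```python
import numpy as np
from scipy.special import eval_legendre

def directivity_weight(theta, l_max):
    """Directional weight ws(Theta) of equation 2.61 for maximum level L = l_max."""
    x = np.cos(theta)
    num = eval_legendre(l_max + 1, x) - eval_legendre(l_max, x)
    return (l_max + 1) * num / (4 * np.pi * (x - 1.0))

# Evaluated slightly off-axis to avoid the 0/0 at exactly Theta = 0; the
# value approaches (L + 1)**2 / (4*pi), so the main lobe grows with L.
for L in (2, 4, 8):
    print(L, directivity_weight(1e-3, L), (L + 1) ** 2 / (4 * np.pi))
```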
The spatial resolution is further defined by a relation between the level l = L and the first (smallest) zero Θ0 of ws(Θ) for Θ > 0, where Θ0 is defined as half of the resolution of the PWD. The relation Θ0 = 180°/L, derived in [17], tells us the extent to which a plane wave decomposition with a particular level L can spatially decompose a wave field into different plane waves. Figure 2.11 is approximated by this relation Θ0 = 180°/L.
Figure 2.11: Half resolution of the PWD [17]
3 ERROR ANALYSIS 39
Figure 3.1: Errors in Spherical Microphone array measurement
3.2 Description of measurement error function
In this section we follow the framework given in [31] and describe the measurement errors mathematically, together with their contribution to the spherical harmonic coefficients.
For the analytical description we assume that an arbitrary sound field is captured by a rigid-sphere microphone array. The frequency-domain output of a single microphone element, which is considered to be subject to all the errors depicted in figure 3.1, is

s(r, Ω̂q, k) + eq    (3.1)

where k is the wave number, r is the radius of the sphere, eq is the noise introduced by the microphones and Ω̂q is the microphone position including the positioning errors. The spherical harmonic coefficients Alm(k) can be calculated using equation 2.52, which is explained in section 2.6. Keeping these equations in mind we obtain
Âlm(k) = 1/bl(kr) · ( Σ_{q=1}^{Q} wq · s(r, Ω̂q, k) · Y*lm(Ωq) + Σ_{q=1}^{Q} wq · eq · Y*lm(Ωq) )    (3.2)
In this equation Q is the number of microphones, wq are the quadrature weights and bl(kr) is the mode strength for the rigid-sphere configuration (refer to section 2.5). The correct microphone positions as defined by the sampling scheme are denoted by Ωq. We now express the sound field s(r, Ω̂q, k) in terms of the correct spherical harmonic coefficients Al'm'(k), using equation 2.49 in section 2.5, and substitute it into equation 3.2:
Âlm(k) = 1/bl(kr) Σ_{l'=0}^{∞} Σ_{m'=−l'}^{l'} Al'm'(k) · bl'(kr) × [ Σ_{q=1}^{Q} wq · Yl'm'(Ω̂q) · Y*lm(Ωq) ]_X + 1/bl(kr) Σ_{q=1}^{Q} wq · eq · Y*lm(Ωq)    (3.3)
The term X is equivalent to the orthonormality condition of the spherical harmonics given in section 2.1.6. In [31] this term has been extended to identify the contributions of the aliasing error εa and the positioning error εΩ, and it is expressed as

Σ_{q=1}^{Q} wq · Yl'm'(Ω̂q) · Y*lm(Ωq) =
  δll' · δmm' + εΩ(l, m, l', m'),        for l, l' ≤ Lmax
  εa(l, m, l', m') + εΩ(l, m, l', m'),   for l ≤ Lmax < l'    (3.4)
Here δll' and δmm' are Kronecker deltas. The maximum level Lmax is the highest level of the spherical harmonic coefficients Alm(k) inside the sound field that is sampled using Q microphone positions; the relation between Lmax and Q for the Lebedev grid is given in section 2.6, equation 2.54. In the first part of equation 3.4 the levels satisfy l, l' ≤ Lmax, hence no aliasing error appears in that expression. Also, from the Kronecker deltas we see that εΩ = 0 requires Ω̂q and Ωq to be equal; hence εΩ represents the positioning error. In the lower part of equation 3.4 we consider l' > Lmax, hence spatial aliasing is present. Since l and l' differ, the term δll' · δmm' does not appear in this part.
The aliasing error εa is given as [31]

εa(l, m, l', m') = Σ_{q=1}^{Q} wq · Yl'm'(Ωq) · Y*lm(Ωq),   where l ≤ Lmax < l'    (3.5)
The positioning error is obtained by subtracting the corresponding error-free quadrature term from equation 3.4 [31]:

εΩ(l, m, l', m') = Σ_{q=1}^{Q} wq · (Yl'm'(Ω̂q) − Yl'm'(Ωq)) · Y*lm(Ωq),   where l ≤ Lmax, l' ≥ 0    (3.6)
Finally, if we use equation 3.4 in equation 3.3 and separate the summation over l', we get the expression for the spherical harmonic coefficients with all the error contributions [31]:

Âlm(k) = 1/bl(kr) Σ_{l'=0}^{∞} Σ_{m'=−l'}^{l'} Al'm'(k) · bl'(kr) · δll' · δmm'    [A(s)lm(k): signal contribution]

  + 1/bl(kr) Σ_{l'=0}^{∞} Σ_{m'=−l'}^{l'} Al'm'(k) · bl'(kr) · εΩ(l, m, l', m')    [A(Ω)lm(k): positioning error]

  + 1/bl(kr) Σ_{l'=0}^{∞} Σ_{m'=−l'}^{l'} Al'm'(k) · bl'(kr) · εa(l, m, l', m')    [A(a)lm(k): aliasing error]

  + 1/bl(kr) Σ_{q=1}^{Q} wq · eq · Y*lm(Ωq)    [A(e)lm(k): microphone noise]    (3.7)
In equation 3.7 the first term refers to the error-free contribution to the spherical harmonic coefficients Âlm(k); since the Kronecker deltas reduce the double sum to the single term l' = l, m' = m, this first term simplifies to Alm(k). All the other terms represent the errors. From the equation itself we see that the errors depend on the level l, on kr and on the quadrature. Although we are using the rigid
sphere configuration, the mode strength bl(kr) has a different expression for each microphone configuration, and hence the errors also depend on the array configuration.
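The quadrature-based coefficient estimate of equation 3.2 can be simulated as an illustrative sketch (not the measurement chain of this thesis): for a noiseless, level-limited field the recovery is exact. The open-sphere mode strength bl(kr) = jl(kr) is assumed here for simplicity, whereas the rigid-sphere bl(kr) used in this work carries an additional Hankel-function term; the Gauss-Legendre grid stands in for the Lebedev grid because its nodes and weights are available in NumPy.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss
from scipy.special import sph_harm, spherical_jn

def b_open(l, kr):
    # Open-sphere mode strength bl(kr) = jl(kr) (assumption for this sketch).
    return spherical_jn(l, kr)

def estimate_alm(p, theta, phi, w, l_max, kr):
    """Quadrature-based coefficient estimate (signal term of equation 3.2)."""
    return {(l, m): np.sum(w * p * np.conj(sph_harm(m, l, phi, theta))) / b_open(l, kr)
            for l in range(l_max + 1) for m in range(-l, l + 1)}

# Gauss-Legendre sampling: Lmax + 1 elevations, 2(Lmax + 1) uniform azimuths.
l_max, kr = 4, 2.0
nodes, wx = leggauss(l_max + 1)
n_phi = 2 * (l_max + 1)
theta, phi = np.meshgrid(np.arccos(nodes),
                         2 * np.pi * np.arange(n_phi) / n_phi, indexing="ij")
w = (np.broadcast_to(wx[:, None], theta.shape) * 2 * np.pi / n_phi).ravel()
theta, phi = theta.ravel(), phi.ravel()

# Level-limited field of a unit plane wave from Omega0 = (pi/2, 0):
# Alm(k) = 4*pi * i**l * conj(Ylm(Omega0)), truncated at Lmax so that the
# quadrature is exact and no aliasing from l' > Lmax contaminates the result.
theta0, phi0 = np.pi / 2, 0.0
a_ref = {(l, m): 4 * np.pi * 1j ** l * np.conj(sph_harm(m, l, phi0, theta0))
         for l in range(l_max + 1) for m in range(-l, l + 1)}
p = sum(a_ref[lm] * b_open(lm[0], kr) * sph_harm(lm[1], lm[0], phi, theta)
        for lm in a_ref)

a_est = estimate_alm(p, theta, phi, w, l_max, kr)
err = max(abs(a_est[lm] - a_ref[lm]) for lm in a_ref)
print(err)  # effectively zero: exact recovery for a level-limited field
```

Adding a noise vector to `p`, perturbing `theta`/`phi`, or synthesizing levels above Lmax would reproduce the noise, positioning and aliasing terms of equation 3.7, respectively.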
Finally, by substituting equation 3.7 into equation 2.60 we obtain the expression for the directivity function of the plane wave decomposition. Each term A(·)lm(k) in equation 3.7 yields the contribution of that particular error to the directional weights ws:
w(·)s(Ωs, k) = Σ_{l=0}^{∞} Σ_{m=−l}^{l} 1/(4π i^l) · A(·)lm(k) · Ylm(Ωs)    (3.8)
where Ωs is the steering direction of the spherical microphone array and A(·)lm(k) can be any of the four components in equation 3.7: A(s)lm(k), A(Ω)lm(k), A(a)lm(k) or A(e)lm(k). In order to assess the effective influence of the measurement errors on the results of the plane wave decomposition, we relate each error contribution in equation 3.7 to the corresponding signal contribution, i.e., we consider the relative error obtained as the ratio of the squared absolute values of the error and signal contributions [31]:
Ea(kr) = |w(a)s(Ωs, k)|² / |w(s)s(Ωs, k)|²

EΩ(kr) = |w(Ω)s(Ωs, k)|² / |w(s)s(Ωs, k)|²

Ee(kr) = |w(e)s(Ωs, k)|² / |w(s)s(Ωs, k)|²    (3.9)
In equation 3.9 noise-to-signal ratios are calculated. Figure 3.2 shows the behaviour of the different errors (noise, positioning and aliasing) for different levels l.
On comparing the various quadratures with respect to spatial aliasing, microphone noise and positioning error, the Lebedev quadrature is found to have the best overall robustness against these errors. Due to these characteristics we use the Lebedev grid together with the rigid-sphere configuration [31].
Figure 3.2: Errors in Spherical Microphone array measurement [31]
3.3 Microphone noise
Microphone noise is an important source of introducing corruptive artifact in auraliza-
tion process. Although the contemporary microphone technology provides a very high
signal to noise (SNR) ratio but in general we can not disregard the noise induced by
the microphones.
It is important to note that the mode strength refer 2.5, have quite low values for
smaller values of kr at higher levels l and hence, this amplifies the spherical harmonic
coefficients (refer equation 2.52) considerably therefore in situations where noise is
present, it also gets amplified 3.2. The increase in noise is more vigorous in the low krrange than in the higher kr
Microphone noise also depends on the number of microphones used: simulations in [31] show that the higher the number of microphones, the better the robustness against noise. It is also seen that the influence of noise is lowest when the maximum level satisfies l ≈ kr. For higher kr the mode strengths somewhat converge towards 0 dB, and hence, theoretically, the increase in error for higher kr should not be too significant. The quadratures used for discretization of the sphere do not have any
significant effect with regard to microphone noise; they all behave in a similar way. But since the noise effect is greater at low kr, it limits the array performance at lower frequencies.
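The low-kr noise amplification described above can be illustrated with the inverse mode strength 1/|bl(kr)|. This sketch assumes the open-sphere bl(kr) = jl(kr), which shows the same qualitative behaviour as the rigid-sphere mode strength used in this work.

```python
import numpy as np
from scipy.special import spherical_jn

def noise_gain_db(l, kr):
    # Amplification 1/|bl(kr)| in dB, with the open-sphere bl(kr) = jl(kr)
    # assumed here for simplicity.
    return -20.0 * np.log10(np.abs(spherical_jn(l, kr)))

# At low kr the high levels are amplified enormously, so microphone noise
# dominates there and limits the array at low frequencies.
for kr in (0.5, 2.0, 8.0):
    row = ", ".join(f"l={l}: {noise_gain_db(l, kr):.0f} dB" for l in range(0, 9, 2))
    print(f"kr={kr}: {row}")
```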
3.4 Spatial aliasing
The problem of spatial aliasing is quite complex in spherical microphone arrays. As a continuous aperture is not practically feasible, we discretize the sphere using quadratures, which gives us a relation between the number of microphones and the maximum level l. This discretization of the sphere, however, leads to the spatial aliasing problem. In [31] [34] [40] the sampling techniques for spherical microphone arrays and their effect on the plane wave decomposition are analysed. Aliasing-free techniques for level-limited functions and solutions such as spatial anti-aliasing filters for aliasing reduction are proposed in [34].
Referring to figure 2.8(a), because of the nature of bl(kr) the magnitude of the spherical harmonic coefficients of the sound pressure becomes increasingly insignificant for l > kr, where r is the radius of the sphere. The aliasing error is therefore expected to be almost negligible if the operating frequency range of the array satisfies the condition kr ≤ Lmax.
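As a rough numerical illustration, with k = 2πf/c the condition kr ≤ Lmax translates into an upper aliasing-free operating frequency fmax = Lmax · c/(2πr); the radius and level below are illustrative values, not the array parameters of this thesis.

```python
import math

def f_max(l_max, radius, c=343.0):
    # kr <= Lmax with k = 2*pi*f/c gives f <= Lmax * c / (2*pi*r).
    return l_max * c / (2 * math.pi * radius)

# Illustrative: Lmax = 4 and a 5 cm sphere radius.
print(round(f_max(4, 0.05)))  # about 4.4 kHz
```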
various sampling techniques, our work is based on the Lebedev grid, which is given in