Study Non Linear Distortion Optical Sound Phdthesis
8/19/2019 Study Non Linear Distortion Optical Sound Phdthesis
1/148
RESTORATION OF NONLINEARLY DISTORTED OPTICAL
SOUNDTRACKS USING REGULARIZED INVERSE
CHARACTERISTICS
PhD thesis
Tamás B. Bakó
Supervisor: dr. Tamás Dabóczi
BUDAPEST UNIVERSITY OF TECHNOLOGY AND ECONOMICS
DEPARTMENT OF MEASUREMENT AND INFORMATION SYSTEMS
3rd June 2004.
I, the undersigned Tamás Béla Bakó, declare that I prepared this doctoral dissertation myself
and that I used only the sources indicated in it. Every part that I have taken from another
source, either verbatim or with the same content but rephrased, is clearly marked with a
reference to the source.
The reviews of the dissertation and the minutes of the defence will later be available at the
Dean's Office of the Budapest University of Technology and Economics.
Budapest, 3 June 2004.
. . . . . . . . . . . . . . . . . . . . .
Summary in Hungarian (English translation)
The sound of old film recordings is often of poor quality: the reproduced sound is extremely
noisy and distorted. Distorted sound tires the audience, who can concentrate less on the film
itself, so the film becomes less enjoyable. This is why many old films are not worth presenting
to the public on television or in cinemas. Distorted sound can, however, be improved by
digital signal processing methods.
Since sound restoration has nothing at its disposal but the distorted and noisy film recording,
and we have access neither to the original signal nor to the equipment with which the
recording was made, our only option for improving the sound quality is post-compensation
of the sound. This dissertation proposes new methods for the efficient and fast
post-compensation of the nonlinearly distorted sound of old, optically recorded films.
The first part of the dissertation discusses nonlinear models and nonlinear compensation
techniques, then explains nonlinear post-compensation in detail and shows why it is a
so-called ill-conditioned problem. The second part presents methods that can handle the
ill-conditioning of the problem (the sensitivity of the sound restoration to noise added to
the distorted signal). The effectiveness of the method is supported by simulations and by
the restoration of the sound of film excerpts.
To the muse
Dóra Szász
Acknowledgement
I am very grateful to László Fűszfás and Zoltán Sebán for helpful discussions and for finding
me the basic literature on film processing. I am also grateful to the Hungarian Radio for the
technical support of my research work. The Hungarian National Film Archive, and especially
Éva Beke, is also acknowledged for giving me film materials to finish my research. Many
thanks also to László Balogh, who carefully checked the mathematics in this dissertation
and asked me for better explanations.
I would also like to thank the many people who have made the Department of Measurement
and Instrumentation Technology such a stimulating environment, including those
whose heroic efforts have kept the absurdly nonstandard network running most of the time.
Keywords
The following keywords may be useful for indexing purposes:
Audio restoration, nonlinear compensation, regularization methods, Tikhonov regular-
ization, optical soundtrack, density characteristic.
Summary
This dissertation is concerned with the possibilities of restoring degraded film sound.
The sound quality of old films is often not acceptable, which means that the sound is so
noisy and distorted that the listener has to make a strong effort to understand the
conversations in the film. In this case the film cannot give artistic enjoyment to the listener.
This is the reason that several old films cannot be presented in cinemas or on television.
The quality of these films can be improved by digital restoration techniques. Since we
have access only to the distorted signal and not to the original one, we cannot adjust
recording parameters or recording techniques. The only possibility is to post-compensate
the signal to produce a better estimate of the undistorted, noiseless signal. In this
dissertation new methods are proposed for fast and efficient restoration of nonlinear distortions
in optically recorded film soundtracks.
First the nonlinear models and nonlinear restoration techniques are surveyed and the
ill-posedness of nonlinear post-compensation (the extreme sensitivity to noise) is explained.
The effects and sources of linear and nonlinear distortions in optical soundtracks are also
described. A new method is proposed to overcome the ill-posedness of the restoration
problem and to get an optimal result. The effectiveness of the algorithm is demonstrated by
simulations and by the restoration of real film-sound signals.
3.4.3 Restoration using nonlinear autoregressive models
4 The nonlinear characteristic of movie film
4.1 Image formation
4.2 Relationship between silver mass and transparency
4.3 Relationship between transparency and exposure
5 Imperfections in the optical sound-recording techniques
5.1 Introduction
5.2 Optical sound-recording techniques
5.2.1 Variable density method
5.2.2 Variable area method
5.3 Distortions at variable density method
5.4 Distortions at variable area method
5.5 Appearance of noise
6 Compensation of memoryless nonlinearities
6.1 Representation of nonlinearity
6.1.1 Representation using a piecewise linear model
6.1.2 Representation of the inverse nonlinearity
6.2 Identification of the nonlinear distortion
6.3 Effect of noise
6.4 Compensation of the signal by Tikhonov regularization
6.4.1 Comparison of the solution to the optimal least squares solution
6.4.2 Finding the appropriate value of the regularization parameter
6.4.3 Comparison of the novel method to Morozov's and Hansen's method
6.5 Results on synthetically distorted real audio signals
6.6 Results on real distorted audio signals
6.7 Compensation of the signal to make an unbiased estimate
6.7.1 Finding a proper compensation characteristic using an iterative method
6.7.2 Proof that the method is convergent under the given constraint
6.8 Simulation results
7 Conclusions and future possibilities
7.1 Conclusions
7.2 Suggestions for future research
7.2.1 Improved blind identification
7.2.2 Adaptivity
7.2.3 Elimination of nonlinearities with memory
A Brief history of film-sound
A.1 Sound-on-disc sound
A.2 Sound-on-film sound
B Optimal signal restoration in linear systems
B.1 Simple linear system
B.2 Piecewise linear model with two and more intervals
C MATLAB simulation of a realistic photosensitive layer
D MATLAB realization of computation of regularized nonlinear characteristics
E MATLAB realization of finding the optimal regularization
F MATLAB realization of calculation of compensation characteristic for unbiased signal reconstruction
List of Tables
5.1 Velocity of different film formats.
6.1 Comparison results of the Morozov, Hansen and the new method.
6.2 Comparison results of the exact inverse, Tikhonov and the unbiased characteristics.
List of Figures
2.1 Block diagram of an LNL system.
3.1 Block-scheme of pre-distortion.
3.2 Block-scheme of post-distortion.
3.3 Original, input signal (x in Fig. 3.2).
3.4 Distorted and noisy, observed signal (o in Fig. 3.2).
3.5 Reconstructed signal by the exact inverse of the nonlinear distortion (x̂ in Fig. 3.2).
4.1 Characteristic of exposure vs. developable crystals in a monosized silver-halide layer for different photon quanta sensitivity (r).
4.2 Exposure vs. 1-transmission characteristic of a typical emulsion.
4.3 Logarithmic exposure vs. density characteristic of a typical emulsion.
5.1 Schematic diagram of variable density method.
5.2 Sound-on-film, variable density.
5.3 Sound-on-film, variable area.
5.4 Schematic diagram of variable area method with electrodynamic mirror oscillograph.
5.5 Amplitude response of light intensity controlled variable density sound-recording. Solid line: standard (35 mm) film at 24 fps, dashed: substandard (16 mm) film at 16 fps.
5.6 Creation of nonlinear distortions due to light diffusion.
6.1 Model of the nonlinearity compensation.
6.2 One block from the piecewise linear compensation model.
6.3 The supplemented piecewise linear compensation model.
6.4 R(p_n(n), N(x)) at Gaussian error function and uniformly distributed noise (noise interval at left 0.1, noise interval at right 0.01). Solid line: nonlinear function, dashed line: R(p_n(n), N(x)).
6.5 R(p_n(n), N(x)) at Gaussian error function and Gaussian noise (noise deviation at left 0.1, noise deviation at right 0.01). Solid line: nonlinear function, dashed line: R(p_n(n), N(x)).
6.6 R(p_n(n), N(x)) at exponential function and uniformly distributed noise (noise interval at left 0.1, noise interval at right 0.01). Solid line: nonlinear function, dashed line: R(p_n(n), N(x)).
6.7 R(p_n(n), N(x)) at exponential function and Gaussian noise (noise deviation at left 0.1, noise deviation at right 0.01). Solid line: nonlinear function, dashed line: R(p_n(n), N(x)).
6.8 R(p_n(n), N(x)) at square-root function and uniformly distributed noise (noise interval at left 0.1, noise interval at right 0.01). Solid line: nonlinear function, dashed line: R(p_n(n), N(x)).
6.9 R(p_n(n), N(x)) at square-root function and Gaussian noise (noise deviation at left 0.1, noise deviation at right 0.01). Solid line: nonlinear function, dashed line: R(p_n(n), N(x)).
6.10 R(p_n(n), N(x)) at x^0.2 function and uniformly distributed noise (noise interval at left 0.1, noise interval at right 0.01). Solid line: nonlinear function, dashed line: R(p_n(n), N(x)).
6.11 R(p_n(n), N(x)) at x^0.2 function and Gaussian noise (noise deviation at left 0.1, noise deviation at right 0.01). Solid line: nonlinear function, dashed line: R(p_n(n), N(x)).
6.12 Multisine signal, x, used for the simulations.
6.13 Gaussian error function used for the first simulation.
6.14 x^5 function used for the second simulation.
6.15 Noisy output signal of the first simulation (distortion is made by the Gaussian error function).
6.16 Noisy output signal of the second simulation (distortion is made by the x^5 function).
6.17 Error of the compensation of nonlinearity by Morozov's method (left) and Hansen's method (right) as a function of λ. The nonlinear distortion is the Gaussian error function.
6.18 Error of the compensation of nonlinearity by the novel method (left) and the true result (right) as a function of λ. The nonlinear distortion is the Gaussian error function.
6.19 Error of the compensation of nonlinearity by Morozov's method (left) and Hansen's method (right) as a function of λ. The nonlinear distortion is a part of x^5.
6.20 Error of the compensation of nonlinearity by the novel method (left) and the true result (right) as a function of λ. The nonlinear distortion is a part of x^5.
6.21 Reconstruction of x̂ by Morozov's method (left) and Hansen's method (right) for the Gaussian error function.
6.22 Reconstruction of x̂ by the novel method (left) and the optimal result in least squares sense (right) for the Gaussian error function.
6.23 Reconstruction of x̂ by Morozov's method (left) and Hansen's method (right) for the x^5 nonlinear distortion.
6.24 Reconstruction of x̂ by the novel method (left) and the optimal result in least squares sense (right) for the x^5 nonlinear distortion.
6.25 Original, not distorted audio signal.
6.26 Audio signal synthetically distorted by a γ-function.
6.27 Distorted, noisy signal part chosen for parameter determination of the nonlinear function.
6.28 Result of parameter search of the nonlinear function.
6.29 Estimate error of the iterative algorithm at different regularization values.
6.30 True error at different regularization parameters.
6.31 Reconstructed signal by the best characteristic estimate.
6.32 Reconstructed signal by overregularized characteristic.
6.33 Reconstructed signal by underregularized characteristic (note scale change).
6.34 Real, nonlinearly distorted and noise contaminated audio signal.
6.35 Signal part chosen for parameter determination of the nonlinear function.
6.36 Results of parameter estimation of the nonlinearity.
6.37 Result of the iterative algorithm.
6.38 Reconstructed signal by optimally regularized characteristic.
6.39 Reconstructed signal by underregularized characteristic.
6.40 Sinusoid excitation signal used for the simulations.
6.41 The nonlinear distortion.
6.42 Distorted signal.
6.43 Reconstruction of x by the exact inverse (left) and Tikhonov-regularized inverse (right).
6.44 Unbiased reconstruction.
B.1 Estimation of x in the knowledge of o.
B.2 Original and inverse piecewise linear system.
Chapter 1
Introduction
1.1 Overview
Optical film sound recording technology is more than 100 years old. Since its introduction,
millions of sound films have been made and stored in national film archives, and they have
inestimable artistic value. The task of the archives is not just to preserve these films but also
to prepare them for broadcasting and to show them to a wide audience. However, most of
these films cannot be broadcast because they suffer from several kinds of degradation.
There are several distinct types of film degradation. These can be broadly classified
into two groups: localised degradations and global degradations. Localised degradations are
discontinuities in the waveform which affect only certain samples. Global degradations affect
all samples of the waveform. We can distinguish the following sub-classes of degradations
[1]:
– clicks and cracklings,
– low-frequency noise transients,
– broad band noise,
– wow and flutter,
– non-linear defects.
Clicks and cracklings are short bursts of interference, random in time and amplitude.
These impulsive disturbances are caused by damage to the sound-carrier material (e.g.
scratches or dirt spots on the surface).
Low-frequency noise transients are mainly larger-scale defects than clicks. They are caused
by large discontinuities due to glued joins in film rolls or other severe damage to the optical
sound recording. These changes in the film material cause unusual excitations in the light
intensity during sound reproduction and hence strong transients in the reproduced
sound. These large discontinuities can be heard as low-frequency pulses.
Broad band noise is common to all analogue measurement, storage and recording systems
and in the case of audio signals it is generally perceived as “hiss” by the listener. It can be
composed of electrical circuit noise, irregularities in the storage medium and ambient noise
from the recording environment.
Wow and flutter are pitch variation defects which may be caused by eccentricities in the
playback system, motor speed fluctuations or by special distortions of the sound carrier (e.g.
shrinkage of film).
Non-linear defects form a very general class that covers a wide range of distortions. In the
audio field, the principal causes are [2]:
– saturation in magnetic recording,
– tracing distortion (before compensation was introduced) and groove deformation in
records,
– the inherent nonlinearity of optical soundtracks.
There are already many solutions and applications in the scientific literature and on the
market that deal with the restoration of local degradations and wide band noise. Several
results have also been published on the elimination of pitch defects. However, relatively
little emphasis has been placed on the elimination of non-linear defects, which is a topic of
current research interest in DSP for audio [1].
In the last decade, methods for restoring damaged audio recordings have progressed from ad
hoc methods, motivated primarily by ease of implementation, towards more sophisticated
approaches based on mathematical modeling of the signal and degradation processes.
This thesis addresses the elimination of distortion of optical soundtracks, a problem that
has not been extensively investigated before. Restoration of nonlinear distortions is a special
kind of inverse filtering problem. This problem can be ill-posed, which means that during
reconstruction of the nonlinearly distorted signal, small uncertainties in this signal can cause
strong deviations in the restored one. In this case, our aim is to find a restoration method
in which both the signal distortion and the level of deviation (more simply, the level of the
amplified noise) can be kept low. The aim of this dissertation is to clarify the reasons
of nonlinear distortions in the case of optical soundtracks and propose methods based on
digital signal processing to reduce the distortion and avoid the appearance of artefacts in
the restored sound.
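The sensitivity described above can be illustrated with a small numerical sketch. It is only an illustration under stated assumptions: the saturating characteristic `tanh(3x)` and the noise level are hypothetical choices made here, not the film characteristic studied in the thesis. The sketch applies the exact inverse to a slightly perturbed observation and reports how strongly the perturbation is amplified.

```python
import math


def distort(x):
    """Hypothetical saturating memoryless nonlinearity (an assumption for illustration)."""
    return math.tanh(3.0 * x)


def exact_inverse(o):
    """Exact inverse of distort(); defined only for |o| < 1."""
    return math.atanh(o) / 3.0


def reconstruction_error(x, noise):
    """Absolute error of the exact-inverse estimate when `noise` is added to the observation."""
    observed = distort(x) + noise
    return abs(exact_inverse(observed) - x)


noise = 1e-4  # small additive disturbance on the observed signal (assumed value)
for x in (0.1, 0.5, 0.9, 1.2):
    err = reconstruction_error(x, noise)
    print(f"x = {x:3.1f}  error = {err:.2e}  amplification = {err / noise:8.1f}")
```

Near the origin the disturbance is attenuated, while in the saturated region the same disturbance is amplified by roughly two orders of magnitude, which is the behaviour a regularized inverse is meant to limit.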
1.2 Structure of thesis
Chapter 2 introduces the description and representation forms of memoryless nonlinearities
and nonlinearities with memory. Chapter 3 examines the possible methods for eliminating
the effects of nonlinear distortions and explains in detail the problems and possible solutions
of nonlinear post-compensation techniques. The main problem during post-compensation
is the amplification of the noise that is present in the original material. Without proper
compensation, the noise amplification can be so strong that the resulting sound is
worse than the distorted one. In this chapter the origin of the noise amplification is
discussed and the methods that could be applied to overcome this problem are
summarized.
Chapter 4 reviews the nonlinear characteristic of photosensitive materials and shows the
analytical equations, which describe the nonlinear behaviour. Chapter 5 discusses the film-
sound recording techniques and the appearance of nonlinear distortions of the photosensitive
materials in the sound.
Chapter 6 shows two novel methods for composing compensation characteristics for
post-compensation of distorted signals. One of them is based on Tikhonov regularization
operators. The aim of this compensation technique is to minimize the estimated value of the
combined energy of the noise and distortion terms. The method is fast compared to other
compensation methods, because it has no iterative steps during the compensation
process. Simulations in this chapter also show that the accuracy of the method is as high as
that of other compensation methods.
A common problem in the regularization of an ill-posed problem is that we have very little
knowledge about the original signal, hence we do not know how much regularization is needed
to achieve the optimal result. In this chapter a new method is shown that can automatically
find a good estimate of the amount of regularization without user interaction.
This is quite important in the film industry and in film archives, where a huge number of
degraded films are waiting for restoration and there is no time to make several experiments
on each film.
The aim of the second compensation method is to produce an unbiased estimate of
the original, undistorted signal from the noisy, distorted one.
We also have little knowledge about the nonlinear distortion function, which is another
problem in signal compensation. In Chapter 6 a possible method is shown for the
identification of the nonlinear function, given an analytical, parametrizable formula
for the distortion.
Finally, Chapter 7 presents conclusions and suggests possible directions for future re-
search.
Chapter 2
Classification of nonlinearities and
nonlinear models
2.1 Classification of nonlinearities
A system whose input-output relation is described by the function $H(\cdot)$ is a linear
system if, for any inputs $x_1(t)$ and $x_2(t)$ and for any constant $c$, the additive
property (eq. (2.1)) and the homogeneity property (eq. (2.2)) are satisfied:
$$H(x_1(t) + x_2(t)) = H(x_1(t)) + H(x_2(t)), \tag{2.1}$$
$$H(c \cdot x(t)) = c \cdot H(x(t)). \tag{2.2}$$
In a nonlinear system, the additive property, the homogeneity property, or both are not satisfied.
Nonlinear systems can be divided into two main categories:
– memoryless nonlinear systems,
– nonlinear systems with memory.
In a memoryless nonlinear system the output at time t depends only on the
input at time t and does not depend on previous or future input values. A nonlinear system
has memory if the output at time t depends on the input at time t as well as on the inputs
over a previous time interval.
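As a minimal sketch of eqs. (2.1) and (2.2), the two properties can be checked numerically for a memoryless system evaluated on sample values. The two example systems (`gain` and `soft_clip`) and the test values are hypothetical choices made for illustration; passing the check at a few points does not prove linearity, but a single violation is enough to show that a system is nonlinear.

```python
import math


def additivity_gap(system, x1, x2):
    """|H(x1 + x2) - (H(x1) + H(x2))|, the violation of eq. (2.1) at one point."""
    return abs(system(x1 + x2) - (system(x1) + system(x2)))


def homogeneity_gap(system, x, c):
    """|H(c * x) - c * H(x)|, the violation of eq. (2.2) at one point."""
    return abs(system(c * x) - c * system(x))


def gain(x):
    """Linear memoryless system: a constant gain (illustrative)."""
    return 2.5 * x


def soft_clip(x):
    """Nonlinear memoryless system: a saturating characteristic (illustrative)."""
    return math.tanh(x)


for name, system in (("gain", gain), ("soft_clip", soft_clip)):
    print(name,
          "additivity gap:", additivity_gap(system, 0.8, 0.6),
          "homogeneity gap:", homogeneity_gap(system, 0.8, 3.0))
```

For the constant gain both gaps are zero up to rounding, while the saturating characteristic violates both properties.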
2.2 Representation of memoryless nonlinearities
Memoryless nonlinear models are often adequate for representing nonlinearities in systems
that have a very wide bandwidth with respect to the signal bandwidth. The main advantage
of such models is their simplicity, ease of application and low computational
burden [3]. Examples of applications that can be represented with memoryless
nonlinearities are microwave amplifiers [4], A/D and D/A converters [5], photosensitive
materials [6, 7], tube amplifiers [8, 9], several types of transducers [10] and many other
applications that cannot be enumerated here for lack of space.
2.2.1 Taylor series and piecewise linear representation
The most elementary model for dealing with nonlinear systems is the Taylor series. The
Taylor series provides a polynomial representation of a memoryless nonlinear system. Ac-
cording to [11], James Gregory was the first to discover the Taylor series in 1668, more than
forty years before Brook Taylor published it in 1717.
If a real function $f(x)$ has continuous derivatives up to order $(n+1)$, then this function
can be expanded in the following fashion:
$$f(x) = f(a) + \frac{(x-a)}{1!}\left.\frac{df(x)}{dx}\right|_{x=a} + \frac{(x-a)^2}{2!}\left.\frac{d^2 f(x)}{dx^2}\right|_{x=a} + \dots + \frac{(x-a)^n}{n!}\left.\frac{d^n f(x)}{dx^n}\right|_{x=a} + R_n, \tag{2.3}$$
where $R_n$, called the remainder after $n+1$ terms, is given by
$$R_n = \int_a^x \frac{f^{(n+1)}(u)\,(x-u)^n}{n!}\,du = \frac{f^{(n+1)}(\xi)\,(x-a)^{n+1}}{(n+1)!}, \qquad a < \xi < x. \tag{2.4}$$
When this expansion converges over a certain range of $x$, that is, $\lim_{n\to\infty} R_n = 0$, the
expansion is called the Taylor series of $f(x)$ expanded about $a$.
If the value of $n$ in eq. (2.3) equals 1, we get a simple linear model, which has an
appropriately small error in a given small domain. Linearity has been one of the fundamental
principles upon which the theory of signal processing has been structured. Most real-world
problems, however, are intrinsically nonlinear and can be modeled as linear ones only within
a limited range of values. Piecewise linear models constitute a compromise between the inherent
complexity of the nonlinear domain and the theoretical abundance of linear methods.
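The behaviour of the truncated expansion and of its remainder can be sketched numerically. The example below is an illustration only: it uses the exponential function, chosen here because all of its derivatives are known in closed form, and compares the truncation error with the bound that follows from the Lagrange form of the remainder in eq. (2.4).

```python
import math


def taylor_exp(x, a, n):
    """Taylor polynomial of exp about a, truncated after the (x - a)^n term."""
    return sum(math.exp(a) * (x - a) ** k / math.factorial(k) for k in range(n + 1))


def lagrange_bound(x, a, n):
    """Upper bound on |R_n| for exp: the (n+1)-th derivative is at most exp(max(a, x))."""
    return math.exp(max(a, x)) * abs(x - a) ** (n + 1) / math.factorial(n + 1)


a, x = 0.0, 0.5
for n in range(1, 7):
    error = abs(math.exp(x) - taylor_exp(x, a, n))
    print(f"n = {n}  error = {error:.3e}  bound = {lagrange_bound(x, a, n):.3e}")

# With n = 1 the expansion is a linear model that is accurate only close to a.
print("linear model error at x = 0.01:", abs(math.exp(0.01) - taylor_exp(0.01, a, 1)))
```

The error shrinks as more terms are kept and stays below the remainder bound, and the linear model (n = 1) is accurate only in a small neighbourhood of the expansion point.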
2.2.2 Polynomial interpolation
In 1903, Weierstrass published a theorem stating that memoryless nonlinear systems that
are non-polynomial in nature can be approximated with arbitrary accuracy
by polynomial models over a given range of inputs [12]. This is now known as the Weierstrass
approximation theorem. In the 1950s, Davenport and Root showed how the direct method
and the transform method can be used to determine the statistical properties of the output
of memoryless nonlinear devices [11].
In the late 1960s, Blachman showed that a memoryless nonlinearity can be represented
as a generalised Fourier decomposition into a sum of orthogonal polynomials ([13, 14]). The
orthogonality of the polynomials for particular input signal properties allowed the polynomial
coefficients to be calculated or measured using a cross-correlation method. Appropriate sets
of orthogonal polynomials for a number of stationary input signals, were discovered well
before Blachman’s application. In 1939 Szegő attempted to produce a complete bibliography
of every paper published on the subject of orthogonal polynomials before that date [15].
The most commonly used orthogonal polynomials are Chebyshev and Hermite polynomials.
Chebyshev polynomials, $T_n(x)$, $n \in \{0, 1, 2, \dots\}$, are real functions which form a complete
orthogonal set on the interval $-1 \le x \le 1$ with respect to the weighting function
$1/\sqrt{1-x^2}$. It can be shown that
$$\int_{-1}^{1} \frac{1}{\sqrt{1-x^2}}\, T_m(x)\, T_n(x)\, dx = \begin{cases} 0 & \text{if } m \ne n, \\ \pi & \text{if } m = n = 0, \\ \frac{\pi}{2} & \text{if } m = n = 1, 2, 3, \dots \end{cases} \tag{2.5}$$
Since sine wave signals have a $1/\sqrt{1-x^2}$ amplitude distribution, this kind of nonlinearity
representation is applicable to generate or eliminate certain harmonic distortions in sinusoid
excitations [3].
Hermite polynomials, $H_n(x)$, $n \in \{0, 1, 2, \dots\}$, form a complete orthogonal set on the interval
$-\infty < x < \infty$ with respect to the weighting function $\exp(-x^2)$. It can be shown that
$$\int_{-\infty}^{\infty} e^{-x^2} H_m(x)\, H_n(x)\, dx = \begin{cases} 0 & \text{if } m \ne n, \\ 2^n n! \sqrt{\pi} & \text{if } m = n. \end{cases} \tag{2.6}$$
Since Gauss-like signals have an $\exp(-x^2)$ amplitude distribution, this kind of nonlinearity
representation is applicable to simulate or eliminate distortions in the case of a Gaussian
distribution, which is a frequently used signal modeling assumption.
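The orthogonality relation (2.5) can be checked numerically with a short sketch. It relies on the identity $T_n(\cos\theta) = \cos(n\theta)$, so that the substitution $x = \cos\theta$ removes the weighting function and turns the integral into $\int_0^{\pi} \cos(m\theta)\cos(n\theta)\,d\theta$. The function name and the midpoint rule used for the integration are choices made here for illustration; the Hermite case (2.6) is not covered.

```python
import math


def chebyshev_inner_product(m, n, steps=2000):
    """Approximate the weighted integral in eq. (2.5) for T_m and T_n.

    With x = cos(theta) the integral becomes the integral of
    cos(m * theta) * cos(n * theta) over [0, pi]; it is evaluated
    with the midpoint rule on `steps` subintervals.
    """
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h
        total += math.cos(m * theta) * math.cos(n * theta)
    return total * h


for m, n in ((0, 0), (1, 1), (3, 3), (1, 2), (2, 5)):
    print(f"m = {m}, n = {n}: {chebyshev_inner_product(m, n):.6f}")
```

The results should agree with the three cases of eq. (2.5): approximately $\pi$ for $m = n = 0$, $\pi/2$ for $m = n \ge 1$, and zero for $m \ne n$.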
The advantage of orthogonal polynomials over Taylor polynomials is that in the case of
cascaded systems they do not produce cross product terms. For example, when the second
and third order harmonic distortion of a system is eliminated by a cascaded polynomial
compensation system, the result will not contain new, higher order terms. Their disadvantage
is that this behaviour holds only for a small range of signal types that have a given
amplitude distribution.
2.2.3 Analytical models
Several nonlinear physical systems, such as traveling-wave tubes used in radio-frequency
communication channels or photosensitive materials, can be described by analytical models,
which are special (usually non-polynomial) mathematical functions. The advantage of these
functions is that they usually have a physical basis and can be parametrized, hence the
correct identification of a given nonlinearity reduces to the optimization of a few parameters.
An example is the case of narrow frequency excitations such as radio-frequency commu-
nication signals, where the relationship between the input and output can be expressed as
separate amplitude and phase distortions. If an input radio-frequency signal is expressed as
$$x(t) = r(t) \cos(\omega t + \Phi(t)) \tag{2.7}$$
then the output, $y(t)$, of a traveling-wave tube can be described as
$$y(t) = A(r(t)) \cos(\omega t + \Phi(t) + \varphi(r(t))), \tag{2.8}$$
where $A(r)$ and $\varphi(r)$ are the amplitude and phase nonlinear distortions and $t$ denotes time.
There are quite a few mathematical approximation formulae for these distortions ([16, 17,
18, 19]).
In the case of optical sound-recording the possible analytical formulae could be very important for identification and restoration. Analytical formulae with three or more constants were proposed for photosensitive materials by several authors. They have reasonable agreement with experimental curves, but the theory behind these equations is quite inadequate. Several empirical formulae were proposed in the 1940s, but these formulae were not accurate enough [20]. A more accurate analytical formula for the density vs. log exposure characteristic of photosensitive emulsions was given by Solman and Farnel [21]. It has good agreement with real emulsions, although the photographic fog is not modeled.
A formula commonly used nowadays in optical sound recording is the γ curve [22], which can accurately describe a large range of the characteristic. The equation of the γ curve is
$$T(E) = 1 - (1 - T_{sat} - T_{fog})\, E^{\gamma} - T_{fog}, \tag{2.9}$$
where $T$ denotes the light-transmission ability of the film after development and $E$ stands for the light exposure on the film before development. $T_{sat}$ means the lowest light-transmission ability of the film and $T_{fog}$ means the highest transmission ability that can be achieved. γ is a parameter that is different for different film types. The normal range of this parameter is between about 0.2 and 5.
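Because the γ curve has only three parameters, both the characteristic and its exact inverse are easy to write down. The sketch below assumes that the exposure is normalised to $0 \le E \le 1$ (an assumption made here for illustration); with eq. (2.9) this gives $T(0) = 1 - T_{fog}$ and $T(1) = T_{sat}$. The function names and the parameter values are hypothetical.

```python
def gamma_curve(E, gamma, T_sat, T_fog):
    """Transmission T(E) of eq. (2.9); E is assumed to be normalised to [0, 1]."""
    return 1.0 - (1.0 - T_sat - T_fog) * E ** gamma - T_fog

def inverse_gamma_curve(T, gamma, T_sat, T_fog):
    """Exact algebraic inverse of eq. (2.9): exposure E that produces transmission T."""
    return ((1.0 - T_fog - T) / (1.0 - T_sat - T_fog)) ** (1.0 / gamma)

# Illustrative (hypothetical) film parameters.
gamma, T_sat, T_fog = 0.7, 0.05, 0.10
E = 0.3
T = gamma_curve(E, gamma, T_sat, T_fog)
E_back = inverse_gamma_curve(T, gamma, T_sat, T_fog)   # should reproduce E
```

The inverse is exact only for noiseless data; the effect of noise on such an inversion is the subject of Chapter 3.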
2.3 Representation of nonlinearities with memory
The approaches of nonlinear modelling based on Taylor series and orthogonal series, and the direct and transform methods of nonlinear system analysis, are suitable only for memoryless nonlinearities. However, the development of more complex models to deal with nonlinear systems with memory dates back to the late 19th century.
2.3.1 Volterra series
In 1887 Volterra published a functional series expansion now known as the Volterra series [23]. This generalised form of the Taylor series expansion can be used to represent a nonlinear system with memory. In 1910 Fréchet published a more rigorous representation of the Volterra series and contributed to the generalisation of Weierstrass' approximation theorem to functionals, in which the polynomials are replaced by so-called "polynomic functionals". Specifically, the generalisation of Weierstrass' approximation theorem states that nonlinear systems with memory that are non-polynomial in nature can be approximated with arbitrary accuracy by polynomial based nonlinear functional models, over a given range of inputs.
The Volterra series is a very general means of describing a continuous-time output, $y(t)$, in terms of an input, $x(t)$. The Volterra series expansion for a causal, time-invariant system can be expressed as
$$y(t) = H_1[x(t)] + H_2[x(t)] + \ldots + H_n[x(t)] \tag{2.10}$$
in which the $n$-th degree Volterra operator, $H_n[\cdot]$, is defined by the convolution
$$H_n[x(t)] = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} h_n(\tau_1, \ldots, \tau_n)\, x(t - \tau_1) \cdot \ldots \cdot x(t - \tau_n)\, d\tau_1 \ldots d\tau_n \tag{2.11}$$
and the Volterra kernels, $h_n(\cdot)$, have unspecified form, but $h_n(\tau_1, \ldots, \tau_n) = 0$ for any $\tau_i < 0$, $i = 1, 2, \ldots, n$.
In discrete time, eq. (2.11) becomes [24]
$$H_n[x_t] = \sum_{j_1=0}^{\infty} \cdots \sum_{j_n=0}^{\infty} h_n(j_1, \ldots, j_n)\, x_{t-j_1} \ldots x_{t-j_n} \tag{2.12}$$
This is a generalisation from linear systems theory: for a linear system, y(t) = H 1[x(t)], the
first degree kernel h1(t) is the impulse response, which completely describes the system. For
higher-degree systems, hn(t1, . . . , tn) can be thought of as an n-dimensional impulse response.
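To make the discrete form concrete, the following sketch evaluates a Volterra model truncated to degree two and to a finite memory, which is an assumption made here for brevity (eq. (2.12) allows infinite sums and arbitrary degree). The kernel values are arbitrary illustrative numbers.

```python
def volterra2(x, h1, h2):
    """Degree-2, finite-memory Volterra model:
    y_t = sum_j h1[j] x_{t-j} + sum_{j1,j2} h2[j1][j2] x_{t-j1} x_{t-j2},
    with x_t taken as 0 for t < 0."""
    M = len(h1)
    y = []
    for t in range(len(x)):
        past = [x[t - j] if t - j >= 0 else 0.0 for j in range(M)]
        lin = sum(h1[j] * past[j] for j in range(M))
        quad = sum(h2[j1][j2] * past[j1] * past[j2]
                   for j1 in range(M) for j2 in range(M))
        y.append(lin + quad)
    return y

h1 = [1.0, 0.5]                    # first-degree kernel (impulse response)
h2 = [[0.1, 0.0], [0.0, 0.0]]      # second-degree kernel
y1 = volterra2([1.0, 0.0, 0.0], h1, h2)   # response to a unit impulse
y2 = volterra2([2.0, 0.0, 0.0], h1, h2)   # response to a doubled impulse
```

Doubling the input does not double the output (`y2` is not `2 * y1`), which is the signature of the second-degree term.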
Discrete Volterra models are widely used in the control literature, classification problems
and artificial neural networks. Present applications in audio include input/output modeling
of audio systems and nonlinear filtering to precompensate for known loudspeaker nonlinear-
ities [25].
2.3.2 Parametric models
There are two basic situations in nonlinear system modeling:
– Input/output modeling in which we have access to both the input and output of the
system, and seek to describe the function mapping from present and past (for a causal
system) values of the input to the output.
– Time series modeling in which we have access only to the output of the system. In
this case we want to describe the output in terms of an input/output model acting on
a random, independent and identically distributed excitation process.
Volterra modeling is a typical example for input-output modeling. An alternative method-
ology for nonlinear modelling is to use time series nonlinear modeling. There is a plethora
of such models, but there is no universally recognised method to categorise them [25]. For
example, Tong [26], Tjøstheim [27], and Chen and Billings [28] take radically different ap-
proaches. They can all, however, be treated as generalisations or specialisations of the
nonlinear ARMA (autoregressive moving average) model.
In an autoregressive moving average model, an observed output signal, $o$, can be represented as
$$o_t = \sum_{i=1}^{k} a_i o_{t-i} + \sum_{j=1}^{l} b_j e_{t-j} + e_t, \tag{2.13}$$
where $a_i$ and $b_j$ are weighting factors and $e_t$ is an excitation signal (it can be thought of as an additive noise whose current value is unknown). This equation can be generalized to give a nonlinear ARMA (NARMA) model. This takes the form
$$o_t = f(o_{t-1}, \ldots, o_{t-k}, e_{t-1}, \ldots, e_{t-l}) + e_t, \tag{2.14}$$
where $f$ is now some arbitrary nonlinear function rather than being a simple weighted sum. This function could be a polynomial model, which is very similar to a finite length and finite maximum degree Volterra model. If the degree of the polynomial is two, this is the
so-called bilinear nonlinear model [25]:
$$o_t = a_0 + \sum_{i=1}^{A} a_i o_{t-i} + \sum_{j=1}^{B} b_j e_{t-j} + \sum_{k=1}^{C} \sum_{l=1}^{D} c_k d_l\, o_{t-k}\, e_{t-l}. \tag{2.15}$$
2.3.3 Threshold models
In a threshold model [26], different functions f () are used depending on the value of the
output at some fixed lag d. This introduces nonlinearities even when the functions themselves
are linear. It can be written as
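$$f(\cdot) = \begin{cases} g_1(\cdot) & \text{if } r_0 \le x_t < r_1 \\ g_2(\cdot) & \text{if } r_1 \le x_t < r_2 \\ \quad\vdots \\ g_m(\cdot) & \text{if } r_{m-1} \le x_t < r_m \end{cases} \tag{2.16}$$
where the thresholds, $r_i$, satisfy
$$-\infty \le r_0 < r_1 < r_2 < \ldots < r_{m-1} < r_m \le \infty, \tag{2.17}$$
and $g_i$ can be defined as a linear or nonlinear model.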
2.3.4 Cascade models
Rather than using large, general nonlinear models, an alternative approach is to cascade
smaller models together, connecting the output of one to the input of the next. This can
correspond to the real physical structure of the system itself.
A common cascaded structure is the Linear-Nonlinear-Linear (LNL) or sandwich model
illustrated in Fig 2.1. This model consists of a linear element, h(τ ), whose output, u(t), is
transformed by a memoryless nonlinearity, N (). The output of the nonlinearity is processed
by a second linear system, g(τ ). This system is also called Wiener-Hammerstein system.
The LNL cascade has two special cases, the Hammerstein system (NL) and the Wiener
system (LN). Both the Wiener and Hammerstein models can be linear in the parameters if
the component models themselves are linear. Block-oriented models are a generalisation of
cascade models to allow arbitrary connections, including feedback and feedforward, between
subsystems. They are widely used in the control literature.
Cascaded systems can be connected in parallel. Palm [29] showed that any finite dimension, finite order, finite memory Volterra system can be represented exactly by a finite sum of LNL models. More recently, Korenberg [30] showed that this was true for Wiener cascade elements as well. This is a significant advancement, since the identification algorithms for Wiener models are much simpler than those for LNL cascades [31].
Figure 2.1: Block diagram of an LNL system: x(t) → h(τ) → u(t) → v = N(u) → v(t) → g(τ) → y(t).
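The LNL structure of Fig. 2.1 can be sketched in a few lines. The example below assumes FIR linear elements and uses `math.tanh` as a stand-in memoryless nonlinearity; both choices, and the names `fir` and `lnl`, are illustrative assumptions rather than part of the cited models.

```python
import math

def fir(x, h):
    """Causal FIR filter: y_t = sum_j h[j] x_{t-j}, with x_t = 0 for t < 0."""
    return [sum(h[j] * x[t - j] for j in range(len(h)) if t - j >= 0)
            for t in range(len(x))]

def lnl(x, h, N, g):
    """Linear element h, memoryless nonlinearity N, linear element g in cascade."""
    u = fir(x, h)              # first linear stage
    v = [N(s) for s in u]      # memoryless nonlinearity
    return fir(v, g)           # second linear stage

x = [1.0, 0.0]
y_wiener = lnl(x, [0.5, 0.5], math.tanh, [1.0])        # LN: g is the identity
y_hammerstein = lnl(x, [1.0], math.tanh, [0.5, 0.5])   # NL: h is the identity
```

Setting one of the linear elements to the identity gives the Wiener and Hammerstein special cases mentioned above.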
Chapter 3
Techniques for nonlinear compensation
3.1 Possible methods for compensation
When a signal passes a system having a nonlinear transfer function, the output signal will
be distorted. If the distortion is not acceptable, we have to somehow reduce it.
Methods for compensation or elimination of nonlinear distortions can be divided into three main groups:
– If we can modify the structure of the system, we can re-design it in order to reduce the nonlinear distortion. This is a widely used method in industry. Examples for the reduction of nonlinear distortions of A/D converters can be seen in [32, 33, 5, 34, 35, 36, 37], examples for current transformers in [38] and [39], and examples for reducing nonlinear distortions in movie cameras e.g. in [40]. Unfortunately, this approach is too broad to be dealt with in detail here.
– If we can’t modify the structure, but we have access to the input, we can pre-distort
the original input signal to compensate the distortion.
– If we have neither access to the structure, nor to the input, we can post-process the
output signal to compensate the distortion.
Figure 3.1: Block-scheme of pre-distortion: i → P(·) → x → N(·) → y.
3.2 Pre-distortion
As discussed in Section 2.3.2, nonlinear modeling, and also nonlinear compensation, has two basic situations: input/output modeling, where we have access to the input and output, and time series modeling, where we have access only to the output. In several applications we have access both to the input, x, and the output, o, of the nonlinear system. In these cases pre-distortion techniques are preferred. Their block scheme is depicted in Fig. 3.1. In the other case, when we have access only to the output of the system, post-distortion techniques can be used.
In the case of pre-distortion, the nonlinear system is excited through another nonlinear system, so that the distortion of the input excitation signal, i, is eliminated at the output of the two cascaded systems.
The limitation of this method is that the noise level before the original distortion has to be negligibly low, but this can usually be fulfilled. Hence there is no need to care about the extra effects of noise, and the pre-distortion stage can simply be the inverse of the original system.
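A minimal sketch of this idea follows, assuming a hypothetical memoryless system $N(x) = \tanh(x)$, whose exact inverse is known in closed form. The names `N` and `P` follow Fig. 3.1; the choice of `tanh` is an assumption made only for illustration.

```python
import math

def N(x):
    """Hypothetical memoryless nonlinearity of the system to be compensated."""
    return math.tanh(x)

def P(i):
    """Pre-distorter: exact inverse of N, valid for |i| < 1."""
    return math.atanh(i)

inputs = [-0.9, -0.5, 0.0, 0.3, 0.8]
outputs = [N(P(i)) for i in inputs]    # output of the cascade P followed by N
```

In the noiseless case the cascade reproduces the input up to rounding errors, which is why the exact inverse is sufficient for pre-distortion.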
Pre-distortion is a typical solution at the transmitter side of microwave communication channels, where the transmit amplifier has strong nonlinear distortion. Pre-distorter characteristics were proposed as early as 1972 by Kaye [41], who described an analog, memoryless pre-distorter to solve the problem of microwave tubes. A p-th order Volterra inversion for microwave transmit amplifiers was proposed by Biglieri [42]. Other memoryless compensation techniques were proposed by Karam [43] and Pupolin [44]. Neural network approaches can be seen in [45] and [19]. Good surveys of this research field can be found in the article of Lazzarin [46] and in the PhD thesis of Wohlbier [4].
Pre-distortion is used in other fields as well, e.g. predistortion of power amplifiers [47], laser diodes [48] or cathode ray tubes [49].
Audio related articles typically deal with reducing the nonlinearities of loud-speakers or complete audio systems. Closed-loop system structures were proposed as early as 1977 by Black [50] and 1983 by Adams [51], who introduced a kind of system re-design. The first pioneer in the pre-distortion field was A. J. M. Kaizer, who made the first loud-speaker models based on truncated Volterra-series in 1987 [52]. Solutions for loud-speakers based on Volterra-filters were proposed by Klippel [53, 54, 55, 56, 57, 58] and Schurer [38, 59]. Adaptive nonlinear compensators were proposed by Klippel [57] and Sternad [60]. Bellini proposed a solution based on inverting the analytical sound pressure level characteristic of the loud-speaker [61].
Other algorithms were proposed for eliminating acoustic echo by Stenger and Rabenstein [62, 63, 64, 65], which were based on scalable nonlinearity functions for cancelling nonlinear distortions in hands-free phone systems. The nonlinear function is described by a polynomial series, where the coefficients of the series are the parameters of the nonlinear function. The method can adapt to changes in the parameters of the distortion and can be extended to handle nonlinearities with memory.
In all cases the main problem is to identify the characteristic of the nonlinear system. In some studies the nonlinear characteristic is assumed to be given, while others propose identification techniques.
3.3 Post-distortion
While system re-design and pre-distortion are relatively simple tasks, post-distortion is a more difficult one. The difficulty arises because most post-distortion processes are ill-posed. This is also the case for optical soundtracks.
A problem characterized by the equation f (x) = y is well-posed, if the following condi-
tions – introduced by Hadamard in the early 1900’s – are satisfied [66]:
– the solution exists for each element y in the range of Y ;
– the solution x is unique;
– small perturbations in y result in small perturbations in the solution x without the
need to impose additional constraints.
If any of the above conditions are violated, the problem is said to be ill-posed.
Ill-posed problems exist in countless different fields, such as measurement technology [67], spectroscopy [68], optical measurements [69], image restoration [70, 71], high voltage measurements [72, 73, 74], RC network identification [75] and many others. Several solutions were proposed for linear problems, based on filtering techniques, using regularization operators or singular value decomposition, etc. (a good overview of these methods can be found e.g. in [76] or [77]). However, relatively few works deal with the ill-posed problems of nonlinear signal reconstruction. In the following, these problems will be examined in detail.
Figure 3.2: Block-scheme of post-distortion: x → N(·) → y; the noise n is added to y to give the observation o; o → P(·) → x̂.
In the case of nonlinear post-distortion it is usually the third condition that is violated: small perturbations in the measurement result in large deviations in the solution. The block-scheme of post-distortion can be seen in Fig. 3.2. In this case the noise source is before the inverse stage and in a lot of cases the noise level is not negligible. If the inverse system amplifies the signal, the noise will also be amplified. The amplification could be so strong that the amplified noise signal covers the original one.
A simulation example for noise amplification can be seen in Figs. 3.3–3.5. In this simulation the original sinusoid signal was distorted by a Gaussian error function. The signal-to-noise ratio was 50 dB. After restoration, the noise was amplified at the top part of the sinusoid, where the nonlinear curve was nearly flat.
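A comparable experiment can be sketched as follows. The exact signal parameters of the simulation are not given in the text, so the offset sinusoid, the sample count, the noise seed and the definition of the SNR relative to the distorted signal are all assumptions; the inverse error function is computed by bisection because the Python standard library does not provide one.

```python
import math
import random

def erf_inv(o, lo=-6.0, hi=6.0):
    """Invert math.erf by bisection; observations outside (-1, 1) are clipped."""
    o = min(max(o, -1.0 + 1e-12), 1.0 - 1e-12)
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if math.erf(mid) < o:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = random.Random(0)
n = 1000
x = [1.25 + 1.25 * math.sin(2.0 * math.pi * 3 * k / n) for k in range(n)]
y = [math.erf(v) for v in x]                          # distorted signal
rms = math.sqrt(sum(v * v for v in y) / n)
sigma = rms * 10.0 ** (-50.0 / 20.0)                  # 50 dB signal-to-noise ratio
o = [v + rng.gauss(0.0, sigma) for v in y]            # noisy observation
x_hat = [erf_inv(v) for v in o]                       # exact inverse, no regularization

# Largest reconstruction error near the peaks and near the troughs of the sinusoid.
err_top = max(abs(a - b) for a, b in zip(x_hat, x) if b > 2.0)
err_bottom = max(abs(a - b) for a, b in zip(x_hat, x) if b < 0.5)
```

The error near the peaks is much larger than near the troughs, because the slope of the error function is small there and its inverse multiplies the noise by the reciprocal of that slope.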
Given an ill-posed problem various schemes are available for defining an associated prob-
lem which is well-posed [66]. This approach is referred to as regularization of the ill-posed
problem. In particular, an ill-posed problem may be regularized by
1. changing the definition of what is meant by an acceptable solution,
2. changing the space to which the acceptable solution belongs,
3. revising the problem statement,
4. introducing regularization operators and
5. introducing probabilistic concepts to obtain a stochastic extension of the original de-
terministic problem.
Figure 3.3: Original, input signal (x in Fig. 3.2). [Plot of x versus time; time axis from 0 to 1, vertical axis from −0.5 to 3.]
Figure 3.4: Distorted and noisy, observed signal (o in Fig. 3.2). [Plot of o versus time, labelled o = erf(x); time axis from 0 to 1, vertical axis from −0.5 to 3.]
Figure 3.5: Reconstructed signal by the exact inverse of the nonlinear distortion (x̂ in Fig. 3.2). [Plot of x̂ versus time; time axis from 0 to 1, vertical axis from −0.5 to 3.]
Inversion problems have been extensively studied since 1960. In the early 1960s Tikhonov began to produce an important series of papers on ill-posed problems. He defined a class of regularisable ill-posed problems and introduced the concept of a regularising operator, which was used in the solution of these problems [78].
While for linear ill-posed problems a very comprehensive regularization theory is available, the development of regularization methods for non-linear ill-posed problems and the corresponding theory is a quite young and very active field of research with many open questions [79]. The rigorous analysis of Tikhonov regularization in the nonlinear context was initiated only in 1989 by Engl, Kunisch and Neubauer [80].
Since nonlinear equations generally do not have an analytical solution, these algorithms are mostly iterative ones [81]. In this case there are two points in the algorithms where regularization operators can be used:
– regularization may be required to make the solution well-posed,
– regularization may be required to avoid divergence of the iterative algorithm.
These techniques will be introduced in the next three sections.
Another class of algorithms to handle nonlinear ill-posed problems are based on proba-
bilistic concepts such as Bayesian algorithms and Markov-chain Monte-Carlo methods [25].
The aim of these techniques is to create a parametric model of the original, undistorted and
noiseless signal, then to find the possible parameters of this model, based on the noisy and
distorted observation, hence recreate the original signal. These techniques will be introduced
in section 3.3.4.
3.3.1 Regularization of the solution
Let us consider the following nonlinear problem:
y = N (x) (3.1)
Our goal is to best approximate the solution of eq. (3.1) in the situation when the exact data, $y$, are not precisely known and only perturbed data, $o$, with
$$\|y - o\| \le \delta \tag{3.2}$$
are available. Here, δ is called the noise level. This problem is usually ill-posed, because the third condition of Hadamard is not satisfied: small perturbations in $o$ will produce big perturbations in the estimate of $x$ (denoted in the following by $\hat{x}$), just like in the example of section 3.3.
A commonly used method for solving this problem is Tikhonov regularization. In Tikhonov regularization, eq. (3.1) is replaced by a minimization problem, where not only the prediction error, $\|N(\hat{x}) - o\|$, is minimized, but other terms as well, which are connected with the estimated input signal. A practical realization of this minimization problem is
$$\|N(\hat{x}) - o\| + \lambda \|\hat{x} - x_c\| \to \min, \tag{3.3}$$
where $\lambda > 0$ is the regularization parameter and $x_c$ is some center value ideally chosen as the critical point of interest, but often just set to zero [82]. In this case, when we try to find the value of $\hat{x}$ that minimizes eq. (3.3), deviations between our initial guess, $x_c$, and our estimate, $\hat{x}$, will be "punished", hence big deviations caused by noise will not be allowed.
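The effect of the penalty term can be illustrated on a single sample. The sketch below is not the method of the cited works: it assumes the commonly used squared variant of eq. (3.3), $(N(\hat{x}) - o)^2 + \lambda (\hat{x} - x_c)^2$, applied sample by sample, takes the error function as the nonlinearity and minimizes by a plain grid search. The function name and all numerical values are hypothetical.

```python
import math

def tikhonov_estimate(o, lam, N=math.erf, x_c=0.0, lo=-6.0, hi=6.0, steps=12000):
    """Grid search for the minimizer of (N(x) - o)^2 + lam * (x - x_c)^2."""
    best_x, best_j = lo, float("inf")
    for k in range(steps + 1):
        x = lo + (hi - lo) * k / steps
        j = (N(x) - o) ** 2 + lam * (x - x_c) ** 2
        if j < best_j:
            best_x, best_j = x, j
    return best_x

# True input 2.5; the small noise pushes the observation above the range of erf.
o = math.erf(2.5) + 5e-4
x_unreg = tikhonov_estimate(o, 0.0)     # no penalty: runs far away into the flat part
x_reg = tikhonov_estimate(o, 1e-4)      # penalized: bounded, but biased towards x_c
```

The regularized estimate stays bounded at the price of a bias towards $x_c$, which is the trade-off governed by λ.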
In eq. (3.3), it is not obligatory to use the norm of $\hat{x} - x_c$. Using other norms leads to the generalized Tikhonov regularization that can be expressed as
$$\|N(\hat{x}) - o\| + \lambda \|R\{\hat{x}\} - R\{x_c\}\| \to \min, \tag{3.4}$$
where $R\{\cdot\}$ is the generalized regularization operator [79, 83].
One possibility could be maximum entropy regularization
$$\|N(\hat{x}) - o\| + \lambda \int_{\Omega} \hat{x}(t) \log\frac{\hat{x}(t)}{x_c(t)}\, dt \to \min, \quad \hat{x} \in \Omega, \tag{3.5}$$
where $x_c(t)$ is some initial guess about $x(t)$ such as in eq. (3.3). In this case $x_c$ is often just 1. For further explanation and examples for nonlinear maximum entropy regularization, see for example [84, 85, 86, 87].
Another commonly used possibility is bounded variation regularization
$$\|N(\hat{x}) - o\| + \lambda \int_{\Omega} \left| \frac{d\hat{x}(t)}{dt} \right| dt \to \min, \quad \hat{x} \in \Omega, \tag{3.6}$$
which enhances sharp features in $\hat{x}$ as needed in, e.g., image reconstruction, see [88, 89, 90, 71, 91].
In the case of monotone nonlinear functions, where
$$N(x_2) - N(x_1) \ge 0 \quad \text{if } x_2 - x_1 \ge 0, \tag{3.7}$$
the least squares minimization can be avoided and one can use the simpler regularized equation
$$N(\hat{x}) + \lambda(\hat{x} - x_c) = o, \tag{3.8}$$
which is called Lavrentiev regularization or method of singular perturbation [92]. This method preserves the original structure of the problem and sometimes can lead to easily-implemented localized approximation strategies [93].
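For a monotone $N$ and $\lambda > 0$ the left-hand side of eq. (3.8) is strictly increasing in $\hat{x}$, so a scalar equation of this type has a unique root that can be bracketed and found by bisection. The sketch below assumes the error function as the nonlinearity and a bracket wide enough to contain the root; the function name and the numerical values are hypothetical.

```python
import math

def lavrentiev_estimate(o, lam, N=math.erf, x_c=0.0, lo=-50.0, hi=50.0):
    """Solve N(x) + lam * (x - x_c) = o by bisection.
    Assumes N is monotone increasing, lam > 0 and the root lies in [lo, hi]."""
    def f(x):
        return N(x) + lam * (x - x_c) - o
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

o = math.erf(2.5) + 5e-4          # observation slightly outside the range of erf
x_lav = lavrentiev_estimate(o, 0.01)
```

Even though the observation lies outside the range of the nonlinearity, the regularized equation still has a finite solution.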
Since eq. (3.3) – (3.6) and (3.8) are nonlinear equations, analytical solution of them
is generally not possible. The commonly used method is to solve the problem by iterative
methods. In the next section the iterative methods will be discussed.
3.3.2 Regularization of the iteration
The first candidate for solving eq. (3.1) in an iterative way could be Newton's method [81], that is, the iterative solution of the output least squares problem
$$\|o - N(\hat{x})\| \to \min, \tag{3.9}$$
where $\|\cdot\|$ corresponds to the $L_2$ norm. (Of course, regularization methods can also be used on all the other equations discussed in the previous section, but for simplicity and for easy understanding, the iterative methods will be shown on eq. (3.9).) In this case eq. (3.9) simplifies to
$$\left. \frac{dN(\xi)}{d\xi} \right|_{\xi=\hat{x}} (o - N(\hat{x})) = 0. \tag{3.10}$$
From eq. (3.10), Newton's method can be described as
$$\hat{x}_{k+1} = \hat{x}_k + \left( \left. \frac{dN(\xi)}{d\xi} \right|_{\xi=\hat{x}_k} \right)^{-1} (o - N(\hat{x}_k)), \tag{3.11}$$
starting from an initial guess, $x_0$. Even if the iteration is well defined and $dN(\xi)/d\xi$ is invertible for every $\hat{x}$, the inverse is usually unbounded for ill-posed problems. Hence eq. (3.11) is inappropriate in this case, since each iteration means solving a linear ill-posed problem, and some regularization technique has to be used instead. Applying Tikhonov regularization yields the Levenberg-Marquardt method [94]
$$\hat{x}_{k+1} = \hat{x}_k + \frac{1}{\left( \left. \frac{dN(\xi)}{d\xi} \right|_{\xi=\hat{x}_k} \right)^2 + \lambda_k} \left. \frac{dN(\xi)}{d\xi} \right|_{\xi=\hat{x}_k} (o - N(\hat{x}_k)), \tag{3.12}$$
where $\lambda_k$ is a sequence of positive numbers. Augmenting eq. (3.12) by the term
$$- \frac{1}{\left( \left. \frac{dN(\xi)}{d\xi} \right|_{\xi=\hat{x}_k} \right)^2 + \lambda_k}\, \lambda_k (\hat{x}_k - x_c) \tag{3.13}$$
for additional stabilization gives the iteratively regularized Gauss-Newton method [95]
$$\hat{x}_{k+1} = \hat{x}_k + \frac{1}{\left( \left. \frac{dN(\xi)}{d\xi} \right|_{\xi=\hat{x}_k} \right)^2 + \lambda_k} \left[ \left. \frac{dN(\xi)}{d\xi} \right|_{\xi=\hat{x}_k} (o - N(\hat{x}_k)) - \lambda_k (\hat{x}_k - x_c) \right]. \tag{3.14}$$
The other widely used iterative method is the steepest descent method [96]
$$\hat{x}_{k+1} = \hat{x}_k - \delta \left. \frac{dN(\xi)}{d\xi} \right|_{\xi=\hat{x}_k}, \tag{3.15}$$
where δ is an appropriately chosen positive value or sequence of positive values. If
$$\delta_k = N(\hat{x}_k) - o \tag{3.16}$$
this leads to the so-called Landweber iteration [97]
$$\hat{x}_{k+1} = \hat{x}_k + \left. \frac{dN(\xi)}{d\xi} \right|_{\xi=\hat{x}_k} (o - N(\hat{x}_k)). \tag{3.17}$$
Another nonlinear iterative method that is based on the steepest descent algorithm is [98]
$$\hat{x}_{k+1} = \hat{x}_k + \lambda (o - N(\hat{x}_k)). \tag{3.18}$$
For a more detailed explanation about techniques based on Newton's method see e.g. [99].
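Two of the iterations above can be sketched for a scalar sample. The example assumes a hypothetical nonlinearity $N(x) = \tanh(x)$ with a known derivative, a constant $\lambda_k$ in the Levenberg-Marquardt step of eq. (3.12), and a noiseless observation; these are simplifications made here for illustration.

```python
import math

def levenberg_marquardt(o, N, dN, x0, lam, iters):
    """Scalar form of eq. (3.12) with a constant lambda_k = lam."""
    x = x0
    for _ in range(iters):
        d = dN(x)
        x = x + d * (o - N(x)) / (d * d + lam)
    return x

def simple_iteration(o, N, x0, lam, iters):
    """Iteration of eq. (3.18): x_{k+1} = x_k + lam * (o - N(x_k))."""
    x = x0
    for _ in range(iters):
        x = x + lam * (o - N(x))
    return x

def dtanh(x):
    """Derivative of tanh."""
    return 1.0 - math.tanh(x) ** 2

o = math.tanh(0.8)                                   # noiseless observation of x = 0.8
x_lm = levenberg_marquardt(o, math.tanh, dtanh, 0.0, 0.1, 50)
x_si = simple_iteration(o, math.tanh, 0.0, 1.0, 100)
```

In the Levenberg-Marquardt step, $\lambda_k$ keeps the denominator away from zero where the derivative is small, which is exactly where the plain Newton step of eq. (3.11) becomes unbounded.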
3.3.3 Choosing the value of the regularization parameter
One important question in the application of regularization methods is the proper choice of the regularization parameter, λ. Let us see the equation of the Tikhonov regularization problem again:
$$\|N(\hat{x}) - o\| + \lambda \|\hat{x} - x_c\| \to \min. \tag{3.19}$$
If we choose λ near to zero, the regularization will be too weak. The solution, $\hat{x}$, tends to the original, ill-posed result, that is, the solution of the output least squares problem, eq. (3.9). If λ approaches infinity, the result will be overregularized. The output norm becomes negligible compared to $\lambda \|\hat{x} - x_c\|$. In this case the solution will be well posed, however, it tends to $x_c$. The result will be our initial guess, an estimate that could be strongly distorted (for example simply zero). The optimal solution can be found at an optimum value, $\lambda^*$, that lies somewhere between 0 and ∞.
Several methods were proposed for finding an optimum $\lambda^*$ in the case of linear problems. A commonly used method is the Generalized Cross Validation (GCV).
The underlying principle in cross validation is that if an arbitrary observation is left out from $o$, then its input can be well predicted using the optimally regularized solution calculated from the remaining observations. GCV is based on the same principle and, in addition, ensures that the regularization parameter found has some desirable invariance properties, such as being invariant to an orthogonal transformation (which includes permutations) of the data. For the linear problem, $A x = b$, this leads to choosing the regularization parameter as the minimizer of the following function
$$G(\lambda) = \frac{\|A \hat{x} - b\|^2}{\left[ \mathrm{trace}\left( I - A (A^T A + \lambda^2 I)^{-1} A^T \right) \right]^2} \tag{3.20}$$
For more explanation see e.g. [100] or [101].
Glans [102] proposed a method based on minimizing the imaginary part of $\hat{x}$ that is produced by the numerical errors of the computation method. The technique seems quite unreliable and has neither a heuristic nor a formal proof. Instead of this method, Daboczi [103, 104] proposed a systematic iterative method for finding λ in the case of impulse signals, based on a rough signal model. Chen [105] proposed a solution for deconvolution of noisy images even if the point-spread-function (the linear, two-dimensional filter function that distorted the original image) is not exactly known. Roy proposed a method based on the difference norm calculated from the linearly distorted observation and its further distorted version with the same linear distortion [106], however, this method also has neither a formal nor a heuristic proof. Solutions based on probabilistic approaches were proposed in [71] and [107].
Among the iterative techniques, Bertocco [108] published a method for the iterative deconvolution of step-response signals, estimating the noise spectrum from the flat part and the signal spectrum from the changing one. Parruck's method [109] is based on similar assumptions.
In the case of nonlinear iterative problems, Engl first gave an analysis of the dependence of the convergence rate on the regularization term in the case of iteratively solved maximum entropy and Tikhonov regularization [80, 85]. Haber rigorously examined these problems and collected the possible methods in [101] and [99]. These methods are based on simple continuation, or cooling [99]. They start with a relatively large value of λ, then they gradually reduce it. If the result is deemed to be unacceptable, λ is increased by a certain factor.
A combination of Tikhonov regularization and gradient method was proposed by Ramlau [110].
For nonlinear Tikhonov regularization Morozov proposed a so-called discrepancy rule [111] in which the regularization parameter is chosen as the solution of
$$\|N(\hat{x}, \lambda) - o\| = C\delta, \quad C \ge 1 \tag{3.21}$$
where δ is the estimate of the norm of the noise [112].
Another heuristic method is the L-curve technique developed by Hansen [113]. This method does not have a formal proof, however, it is often used because of its simplicity [101]. The L-curve is made by plotting the log of the misfit, $\|N(\hat{x}) - o\|$, as a function of $\log \|\hat{x}\|$, which are obtained for different regularization parameters. This plot has a typical L-shape. Hansen claimed that the best model norm for a small misfit is obtained at the corner of the L-curve.
For Lavrentiev regularization, constraints were given for λ in [114], but there are no special methods to determine its exact value.
3.3.4 Bayesian techniques
Bayesian nonlinear restoration techniques are based on nonlinear time series. Many models are possible for nonlinear time series (see e.g. [28, 26]). In the audio field nonlinear autoregressive (NAR) models are widely used [115]. A commonly used representation of NAR models is the cascaded NAR model:
$$y_t = x_t + \sum_{i=1}^{\eta_b} \sum_{j=1}^{i} \beta_{(i,j)} b_{(i,j)}\, y_{t-i} y_{t-j} + \sum_{i=1}^{\eta_b} \sum_{j=1}^{i} \sum_{k=1}^{j} \beta_{(i,j,k)} b_{(i,j,k)}\, y_{t-i} y_{t-j} y_{t-k} + \text{higher degree terms}, \tag{3.22}$$
where $y_t$ is the $t$-th sample of the distorted signal, of which we can make a (noisy) observation, $b_{(i,j)}$, $b_{(i,j,k)}$ are the weighting parameters of the NAR process, $\beta_{(i,j)}$, $\beta_{(i,j,k)}$ are $\{0, 1\}$ binary indicators, which decide whether a weighting parameter is used, $\eta_b$ is the maximum lag of the model and $x_t$ is the undistorted signal modeled as an autoregressive process
$$x_t = e_t + \sum_{i=1}^{k} a_i x_{t-i}. \tag{3.23}$$
A major advantage of this model formulation is that the inverse of the nonlinear stage is
a straightforward nonlinear moving average (NMA) filter, which is guaranteed to be stable.
Hence it is simple to reconstruct the signal xt from yt for a given set of NAR parameters
[25].
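The inverse relation can be sketched for a model truncated to the quadratic terms of eq. (3.22), which is an assumption made here for brevity. The weights are stored in a dictionary keyed by the lag pair $(i, j)$; a missing key plays the role of a zero indicator $\beta_{(i,j)}$. The weights and the test signal are illustrative values only.

```python
import math

def nar_distort(x, b):
    """Quadratic part of eq. (3.22): y_t = x_t + sum b[(i, j)] * y_{t-i} * y_{t-j}."""
    y = []
    for t in range(len(x)):
        acc = x[t]
        for (i, j), w in b.items():
            if t - i >= 0 and t - j >= 0:
                acc += w * y[t - i] * y[t - j]    # feedback from past outputs
        y.append(acc)
    return y

def nma_restore(y, b):
    """Nonlinear moving average inverse: x_t = y_t - sum b[(i, j)] * y_{t-i} * y_{t-j}."""
    x = []
    for t in range(len(y)):
        acc = y[t]
        for (i, j), w in b.items():
            if t - i >= 0 and t - j >= 0:
                acc -= w * y[t - i] * y[t - j]    # uses only observed samples
        x.append(acc)
    return x

b = {(1, 1): 0.2, (2, 1): -0.1}                   # hypothetical NAR weights
x = [0.5 * math.sin(2.0 * math.pi * 5 * t / 200) for t in range(200)]
y = nar_distort(x, b)
x_back = nma_restore(y, b)
max_err = max(abs(u - v) for u, v in zip(x_back, x))
```

The restoring filter contains no feedback, which is why it is stable for any parameter set; the difficulty lies in estimating `b`, not in applying the inverse.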
The signals and the parameters are modeled as random variables, usually with Gaussian or multivariate Gaussian distribution. The correct parameter values of the NAR model can be estimated by finding the values that have the maximum probability. The parameter search can be carried out by Monte-Carlo methods [2, 116, 117], by simulated annealing [25] or by any other optimum-searching algorithm.
The advantage of this method is that it can work when there is no a priori information about either the input signal or the shape of the nonlinear distortion function. A disadvantage is that a priori information can hardly be incorporated in the process. Another problem is that the optimum searching algorithm itself can get stuck in local minima and requires high computational power. A further serious problem is that the convergence speed of the optimum searching algorithm for a given task is unknown, therefore applications realized with this method cannot be used in real-time environments.
3.4 Audio related post-distortion techniques for reducing nonlinear distortions
Several solutions have been made for nonlinear pre-compensation of audio devices such as
pre-compensation of hi-fi sets, nonlinear echo cancellation of mobile sets, compensation of
loudspeakers, etc., however, a relatively small amount of work has been done in the field of nonlinear post-compensation. In the following these works will be discussed.
3.4.1 Histogram equalization
Histogram equalisation is a simple technique to estimate a memoryless nonlinear transfer
function through which a speech signal has been passed [118]. A smooth function is fitted to
the histogram of sample values from an extract of the signal. This is compared to a reference
histogram shape, based on analysis of a range of speakers, and a 1:1 mapping is derived which
will make the smoothed histogram conform with the reference one. This mapping is then
applied to the distorted signal.
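The mapping step can be sketched with a rank-based simplification. This is not the algorithm of [118]: instead of fitting a smooth function to the histogram, each sample is mapped to the reference quantile of its rank, the distortion is assumed to be monotone increasing, and the reference amplitude distribution is assumed to be uniform on $[-1, 1]$. All names and values are hypothetical.

```python
import math
import random

def histogram_equalize(distorted, ref_quantile):
    """Map each sample to the reference quantile of its rank.
    Assumes a memoryless, monotone increasing distortion."""
    n = len(distorted)
    order = sorted(range(n), key=lambda k: distorted[k])
    restored = [0.0] * n
    for rank, k in enumerate(order):
        restored[k] = ref_quantile((rank + 0.5) / n)
    return restored

rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]      # original signal (uniform amplitudes)
y = [math.tanh(2.0 * v) for v in x]                    # unknown distortion, hypothetical here
x_hat = histogram_equalize(y, lambda p: 2.0 * p - 1.0) # uniform reference on [-1, 1]
max_err = max(abs(a - b) for a, b in zip(x_hat, x))
```

The restoration is only as good as the match between the true amplitude distribution and the reference one, which is the limitation discussed in the following paragraph.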
Because it is assumed that the original signal closely conforms to a standard reference histogram, this method cannot readily be applied to complex music signals, where histograms differ greatly between recordings and vary significantly over the duration of a recording. Another problem, noted by the author, is that the algorithm is very sensitive to noise.
The algorithm was originally proposed for use in speech communication channels, and has
led to a patented device [119]. A related method has been used to restore recordings made
using early analogue-to-digital converters with non-uniform quantisation step heights and
some missed codes [120]. Since these are all small-scale, local defects, they can be reduced
by smoothing the histogram, without the need for a reference.
3.4.2 Signal reconstruction with known nonlinearity
For situations in which distortion is caused by a known memoryless nonlinearity, an iterative
algorithm has been proposed by Polchlopek [98] to reconstruct the original signal where
only a bandlimited version of the distorted signal is available. The reconstruction uses
the iterative method described in eq. (3.18). The algorithm also seems to be applicable
to certain nonlinearities with memory. The behaviour of the algorithm in the presence of
noise was not analysed.
Tsimbinos composed the inverse of the memoryless nonlinearity from orthogonal polynomials
to compensate distortions in digital radio receivers [121]. The advantage of this
method is that, in the case of sinusoidal excitations, the unwanted harmonics can be filtered
out without the appearance of new harmonic components. However, the method works only
in the case of pure sinusoidal excitations, which is not the case in general audio problems.
3.4.3 Restoration using nonlinear autoregressive models
If there is absolutely no information available about the nonlinear distortion, we have to
perform a blind compensation. Audio signals can be well represented by autoregressive models,
therefore a possible method in the case of audio signals is to use autoregressive models for
identification and compensation, such as eq. (3.22) and (3.23) in Chapter 3.3.4. This method
was used by Troughton for eliminating tape saturation [2, 116]. The method can also
handle nonlinearities with memory.
The disadvantage of this method is that the correct model order of the autoregressive
models is not known. The correct parameters are also not known. These data can be
found only by optimum searching algorithms; however, these algorithms may not find the
true parameters, as they may get stuck in local minima. In this case the resulting signal could be
even more distorted than the original one.
Chapter 4
The nonlinear characteristic of movie film
4.1 Image formation
Optical recording of sound and motion pictures is made on photosensitive materials. These
materials are carried on thin film rolls. Formerly the carrier was made of cellulose nitrate, later of
cellulose acetate. Nowadays it is made of polyester-based plastic. This carrier material is
coated with a photosensitive layer. A normal photosensitive layer consists of a very large number
of tiny crystals (grains) of silver halide embedded in a layer of gelatin. The combination of
grains and gelatin is often referred to as the photographic emulsion [122].
When a picture is taken, the optical image is projected onto the photosensitive layer for
a fraction of a second. In ordinary practice this photographic effect is not revealed by any
visible change in the appearance of the emulsion. The exposed emulsion, however, contains
an invisible latent image of the light pattern that can be translated readily into a visible
silver image by the action of a developing agent. This latent image is formed by ionization of
silver in the silver-halide crystals, which produces very small silver specks (a few atoms in size) on
the crystal and defects in the crystal structure. During development, if a crystal is adequately
exposed, these defects accelerate the chemical reactions between the affected
crystal and the developing agent, causing the fast decay of these crystals to metallic silver
grains. This is the so-called print-out. The reaction of the unexposed (or inadequately exposed)
crystals is two or more decades slower [123].
Developing of a silver-halide crystal can be treated as a binary process. If the crystal
contains enough defects, it will completely transform to silver during the development. An
inadequately exposed crystal will be practically untouched. Since the crystals are completely
isolated from each other by the gelatin carrier, the status of a crystal is independent
of the status of the neighbouring crystals.
With this process, the light amplitude distribution can be reconstructed as the amount
of silver grains on the developed film. Since these silver grains are black, we will get a black
and white negative copy from the original optical image. However, the relationship between
the amount of silver and exposure is not linear.
4.2 Relationship between silver mass and transparency
In practice, the reduction of transparency of the layer is of more interest than the quantity
of silver. Transparency ($T$) is defined as the ratio of the flux transmitted ($P_t$) to that incident
($P_o$) on a uniformly exposed and processed area that is large compared to the area of a grain
[20]:
\[ T = \frac{P_t}{P_o}. \tag{4.1} \]
In their classical paper [124], Hurter and Driffield proposed a new measure, the opacity:
\[ O = \frac{1}{T}. \tag{4.2} \]
They also proposed to represent the relationship between light exposure and opacity of the
developed film in logarithmically scaled graphs, because this is more descriptive in visual
and photographic reproduction than absolute values or transparency vs. exposure. The
logarithm of the opacity is termed density ($D$):
\[ D = \log(O) = -\log(T). \tag{4.3} \]
The value of $D$ depends on the emulsion and on the magnitude, duration and spectral
behaviour of the exposing light. Usually the quantity of light received per unit area is of
greatest interest. This is called exposure and denoted by $E$. $E$ can be expressed as
\[ E = \int_0^T I(t)\,dt, \tag{4.4} \]
where $I$ is the intensity of light and $T$ is the duration of exposure.
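The quantities of eq. (4.1) to (4.4) can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the logarithm is assumed to be base 10 (the usual convention in photographic sensitometry), and the exposure integral is approximated with the trapezoidal rule.

```python
import math


def transparency(flux_transmitted, flux_incident):
    """T = P_t / P_o, eq. (4.1)."""
    return flux_transmitted / flux_incident


def opacity(T):
    """O = 1 / T, eq. (4.2)."""
    return 1.0 / T


def density(T):
    """D = log10(O) = -log10(T), eq. (4.3); base 10 is an assumption."""
    return -math.log10(T)


def exposure(intensity, duration, steps=1000):
    """E = integral of I(t) dt over [0, duration], eq. (4.4),
    approximated by the trapezoidal rule with `steps` intervals.
    `intensity` is a function of time."""
    dt = duration / steps
    total = 0.5 * (intensity(0.0) + intensity(duration))
    for k in range(1, steps):
        total += intensity(k * dt)
    return total * dt
```

For example, a layer that transmits one tenth of the incident flux has a density of about 1, and a constant intensity simply gives the product of intensity and duration as exposure.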
The ratio between silver mass and density is the photometric equivalent. Its reciprocal
is the covering power and is a measure of the efficiency with which the silver mass produces
optical density. This number depends on the number of silver-halide grains per unit area and
on the average area of a grain surface, but not on the exposure [125]. Hence the transmission
and the amount of silver are proportional.
4.3 Relationship between transparency and exposure
As noted earlier in this chapter, the relationship between exposure and the
amount of silver (and hence the transmission) is not linear.
If a single layer of silver-halide grains of equal size and sensitivity is exposed, the probability
that a given grain will form a latent image depends entirely on the random arrival of
photons and the chance of absorption of a photon by a grain. Assuming that a grain must
absorb $r$ quanta to become developable, the probability, $p$, that a grain will absorb exactly $r$ quanta
from an exposure such that the mean number of absorbed quanta per grain is $q$, is given by
the Poisson distribution:
\[ p(q, r) = \exp(-q)\,\frac{q^r}{r!}. \tag{4.5} \]
Grains absorbing more than $r$ quanta will also be developable. The probability that a grain
absorbs $r$ or more quanta will be
\[ P(q, r) = 1 - \exp(-q) \sum_{k=0}^{r-1} \frac{q^k}{k!}. \tag{4.6} \]
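Eq. (4.5) and (4.6) can be evaluated numerically as in the following Python sketch. The function names are hypothetical, and the terms are computed in the log domain (an implementation choice, not part of the model) so that large $q$ and $r$ do not overflow. The last lines sample curves of the kind shown in Fig. 4.1, assuming that the horizontal axis of the figure is the mean number of absorbed quanta per grain.

```python
import math


def prob_exactly(q, r):
    """Eq. (4.5): probability that a grain absorbs exactly r quanta when the
    mean number of absorbed quanta per grain is q."""
    if q <= 0.0:
        return 1.0 if r == 0 else 0.0
    # Log-domain evaluation avoids overflow of q**r and r!.
    return math.exp(-q + r * math.log(q) - math.lgamma(r + 1))


def prob_developable(q, r):
    """Eq. (4.6): probability that a grain absorbs r or more quanta."""
    below = math.fsum(prob_exactly(q, k) for k in range(r))
    # Clamp to [0, 1] to hide rounding noise of the order of 1e-16.
    return min(1.0, max(0.0, 1.0 - below))


# Sample points of the characteristic for a few thresholds r:
# relative amount of silver versus mean number of absorbed quanta.
curves = {
    r: [prob_developable(q, r) for q in range(0, 161, 20)]
    for r in (1, 16, 32, 64, 128)
}
```

For $r = 1$ the expression reduces to $1 - \exp(-q)$, which is a convenient check of the implementation.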
The most sensitive crystals in a typical photosensitive layer require 10 or more
quanta. Special emulsions used for recording X-rays or nuclear particles require 1 quantum
per grain. A typical emulsion requires about 1000 photons per grain to make half of the
crystals developable [126]. Characteristics calculated from this model for some $r$
values can be seen in Fig. 4.1.
The linear parts of the characteristics of monosized and uniformly sensitive emulsions are
very narrow. Usually this is not suitable for image recording. Therefore, instead of monosized
photosensitive emulsions, usually lognormally distributed emulsions are used, where crystals of
several different sizes are present in the emulsion, having different photon sensitivities. In
this case the distribution of the grain size can be described by the equation:
\[ p(x) = \frac{1}{(x - \Theta)\,\sigma\sqrt{2\pi}} \exp\left(-\frac{(\ln(x - \Theta))^2}{2\sigma^2}\right), \quad x > \Theta;\ \sigma > 0, \tag{4.7} \]
where $\sigma$ is the shape parameter (the standard deviation of $\ln(x - \Theta)$) and $\Theta$ is the location parameter.
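As a small numerical check of eq. (4.7), the density can be coded directly and integrated. The sketch below is illustrative only: the function names, the midpoint-rule integrator and the parameter values used in the check are assumptions, not values taken from a real emulsion.

```python
import math


def lognormal_pdf(x, sigma, theta=0.0):
    """Eq. (4.7): lognormal density with shape parameter sigma > 0 and
    location parameter theta; the density is zero for x <= theta."""
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    if x <= theta:
        return 0.0
    z = math.log(x - theta)
    norm = (x - theta) * sigma * math.sqrt(2.0 * math.pi)
    return math.exp(-z * z / (2.0 * sigma ** 2)) / norm


def integrate(f, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))


# Hypothetical parameters: sigma = 0.5, theta = 1.0.
total = integrate(lambda x: lognormal_pdf(x, 0.5, theta=1.0), 1.0, 41.0)
```

With these assumed parameters `total` is close to 1, as expected of a probability density, and the maximum of the density lies at $\Theta + \exp(-\sigma^2)$.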
Figure 4.1: Characteristic of exposure vs. developable crystals in a monosized silver-halide
layer for different photon quanta sensitivities (r). Horizontal axis: number of photons (0 to 160);
vertical axis: relative amount of silver (0 to 1); curves for r = 1, 16, 32, 64 and 128.
The photon sensitivity of grains of the same size (one size class) is also not uniform. A
single size class has a sensitivity distribution that is also close to lognormal. In the case of commonly
used photosensitive emulsions, the variance of the sensitivity of a size class is about the same
as the variance of grain size (70% to 170% of the variance of sensitivity) [20].
The simulated exposure vs. (1-transmission) characteristic of a typical emulsion can be
seen in Fig. 4.2 (the MATLAB simulation file can be seen in Appendix C). The logarithmic
exposure vs. density characteristic can be seen in Fig. 4.3. The usable part of the characteristic,
where the change of the output is appropriately high, is now much wider. It spans about
two decades. However, the whole characteristic is far from linear. There is no truly linear part;
only a small part near the beginning can be approximated as linear.
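The widening of the usable range can be reproduced qualitatively with a small Python sketch. It is not the MATLAB simulation of Appendix C: it assumes a single lognormal distribution of the quantum threshold $r$ (median 64 and $\sigma = 1$, both hypothetical), approximated by 50 equal-probability classes, and it measures the usable range as the ratio of the exposures that develop 90% and 10% of the grains. All function names are illustrative.

```python
import math
from statistics import NormalDist


def prob_developable(q, r):
    """Probability that a grain needing r quanta becomes developable when
    the mean number of absorbed quanta per grain is q (Poisson tail)."""
    if q <= 0.0:
        return 0.0
    log_q = math.log(q)
    below = math.fsum(
        math.exp(-q + k * log_q - math.lgamma(k + 1)) for k in range(r)
    )
    return min(1.0, max(0.0, 1.0 - below))


def lognormal_thresholds(median_r, sigma, count=50):
    """Equal-probability quantiles of a lognormal threshold distribution,
    rounded to whole quanta (at least 1)."""
    normal = NormalDist()
    return [
        max(1, round(median_r * math.exp(sigma * normal.inv_cdf((i + 0.5) / count))))
        for i in range(count)
    ]


def developed_fraction(q, thresholds):
    """Average developable fraction over all threshold classes."""
    return sum(prob_developable(q, r) for r in thresholds) / len(thresholds)


def exposure_for_level(level, thresholds):
    """Mean quanta per grain at which the developed fraction reaches
    `level`, found by bisection (the fraction is monotone in q)."""
    lo, hi = 0.0, 10.0 * max(thresholds)
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        if developed_fraction(mid, thresholds) < level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def usable_range(thresholds):
    """Ratio of the exposures giving 90% and 10% developed grains."""
    return exposure_for_level(0.9, thresholds) / exposure_for_level(0.1, thresholds)


monosized = [64]                                   # one threshold class
mixed = lognormal_thresholds(median_r=64, sigma=1.0)
```

Under these assumptions `usable_range(mixed)` comes out several times larger than `usable_range(monosized)`; the numbers are illustrative and are not fitted to any real emulsion.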
The characteristic begins with a constant part, where the particles are still insensitive
to the light intensity. The transmission here, however, is not one, but a bit smaller. This
is caused by crystal imperfections created during the production of the photosensitive layer,
which cause a basic blackness in the image. This basic blackness is called photographic fog
or veil.
Figure 4.2: Exposure vs. 1-transmission characteristic of a typical emulsion. Horizontal axis:
exposure (0 to 1); vertical axis: 1-transmission (0 to 1).
Figure 4.3: Logarithmic exposure vs. density characteristic of a typical emulsion. Horizontal axis:
log exposure (10^-5 to 10^0); vertical axis: density (0 to 1.4).
The constant part is followed by a toe, then an interval, which can be represented by the
following equation in the linear graph [40]:
\[ 1 - T = (1 - T_{sat} - T_{fog})\,(E - E_0)^{\gamma} + T_{fog}, \tag{4.8} \]
where $T_{sat}$ is the transmission at saturation, $T_{fog}$ is the basic transmission and $\gamma$ is a constant
that depends on the photosensitive material. This is the so-called gamma-curve. Usually
photo- and film-negative materials have a $\gamma$ smaller than one, while positive photosensitive
materials have a $\gamma$ higher than one.
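Eq. (4.8) and its algebraic inverse can be written down directly, as in the Python sketch below. The function names and the parameter values are hypothetical and serve only as an illustration; the inverse is the plain noise-free inversion of eq. (4.8) on the gamma-curve interval and nothing more.

```python
def gamma_curve(E, E0, gamma, T_sat, T_fog):
    """Eq. (4.8): value of 1 - T on the gamma-curve interval, for E >= E0."""
    if E < E0:
        raise ValueError("eq. (4.8) is only used for E >= E0")
    return (1.0 - T_sat - T_fog) * (E - E0) ** gamma + T_fog


def inverse_gamma_curve(y, E0, gamma, T_sat, T_fog):
    """Exposure E that produces y = 1 - T according to eq. (4.8)."""
    scale = 1.0 - T_sat - T_fog
    if scale <= 0.0 or y < T_fog:
        raise ValueError("y lies outside the range described by eq. (4.8)")
    return E0 + ((y - T_fog) / scale) ** (1.0 / gamma)


# Hypothetical parameter values, chosen only for illustration
# (gamma < 1, as stated in the text for negative materials).
params = dict(E0=0.05, gamma=0.6, T_sat=0.1, T_fog=0.05)
y = gamma_curve(0.4, **params)
E_back = inverse_gamma_curve(y, **params)
```

With these assumed values `E_back` recovers the exposure 0.4 up to rounding, and at $E = E_0$ the curve starts at the fog level $T_{fog}$.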
After the gamma-curve part of the characteristic, the emulsion becomes more and
more saturated. At extremely high light intensities, in a given interval the transmission
begins to increase and the density to decrease (this part is not shown in Fig. 4.3). This
part is called solarisation. The effect is caused by special secondary chemical effects. Such
light intensities cannot be reached in the sound stripe of the film, therefore we do not have to
deal with it.
Chapter 5
Imperfections in the optical sound-recording techniques
5.1 Introduction
In professional sound films, many methods have been used for sound recording. Since the 1990s,
digital sound-recording technologies have been used almost exclusively, because they offer high
sound quality and the recordings can be easily copied. However, before the digital age, only
analogue methods existed. In the film industry, these techniques were usually based on optical
sound projection. Magnetic recording has also been used in the film industry since the 1950s.
Although magnetic recording had lower distortion than optical methods, it was not
so widespread, since copying this kind of film is much more difficult and magnetic sound
degrades much more quickly with every playback. Therefore, before 1990, optical
sound-recording methods were typically used. Before the 1950s, only optical sound-recording
techniques were known in the film industry.
The advantage of optical sound-recording methods in film-making is that the sound can be easily
copied together with the film without using any additional technologies. Another advantage is
that during sound recording and reproduction nothing has to touch the surface of the film,
therefore the sound on the film will not be degraded by the reproduction. However, optical
sound-recording techniques have disadvantages as well. One disadvantage is the quite high
distortion level, which comes from the nonlinear behaviour of the photosensitive materials;
the other is the quite high noise level. In the following sections the possible optical
sound-recording techniques are explained and the distortions arising in these
techniques are discussed.
Figure 5.1: Schematic diagram of variable density method.
5.2 Optical sound-recording techniques
Optical sound recording has two different methods. One of them was developed by Western
Electric and Fox Movietone and is called the variable density method. The other method was
developed by RCA and is called the variable area method. The variable area method is still being
used for sound recording. The variable density method was used only until the 1970s. However,
from