Study Non Linear Distortion Optical Sound Phdthesis


RESTORATION OF NONLINEARLY DISTORTED OPTICAL SOUNDTRACKS USING REGULARIZED INVERSE CHARACTERISTICS

PhD thesis

Tamás B. Bakó

Supervisor: dr. Tamás Dabóczi

BUDAPEST UNIVERSITY OF TECHNOLOGY AND ECONOMICS

DEPARTMENT OF MEASUREMENT AND INFORMATION SYSTEMS

3rd June 2004.



I, the undersigned Tamás Béla Bakó, declare that I prepared this doctoral dissertation myself and used only the sources indicated in it. Every part that I took from another source, either verbatim or with the same content but rephrased, is clearly marked with a reference to the source.

The reviews of the dissertation and the minutes of the defence will later be available at the dean's office of the Budapest University of Technology and Economics.

Budapest, 3 June 2004.

. . . . . . . . . . . . . . . . . . . . .



Hungarian-language summary

The sound of old film recordings is often of poor quality: the reproduced sound is extremely noisy and distorted. Distorted sound tires the audience, who are less able to concentrate on the film itself, so the film becomes less enjoyable. This is why many old films are not worth showing to audiences on television or in cinemas. The distorted sound, however, can be improved by digital signal processing methods.

Since nothing is available for sound restoration except the distorted and noisy film recording, and we have access neither to the original signal nor to the equipment with which the recording was made, our only option for improving the sound quality is post-compensation of the sound. This dissertation proposes new methods for efficient and fast post-compensation of the nonlinearly distorted sound of old, optically recorded films.

The first part of the dissertation discusses nonlinear models and nonlinear compensation techniques, then explains nonlinear post-compensation in detail and why this problem is a so-called ill-conditioned problem. The second part presents methods that can handle the ill-conditioning of the problem (the sensitivity of the sound restoration to noise added to the distorted signal). The effectiveness of the method is supported by simulations and by restoring the sound of film excerpts.



To the muse

Dóra Szász



Acknowledgement

I am very grateful to László Fűszfás and Zoltán Sebán for helpful discussions and for finding me the basic literature on film processing. I am also grateful to the Hungarian Radio for the technical support of my research work. The Hungarian National Film Archive, and especially Éva Beke, is also acknowledged for providing the film materials needed to finish my research. Many thanks also to László Balogh, who carefully checked the mathematics in this dissertation and asked me for better explanations.

I would also like to thank the many people who have made the Department of Measurement and Instrumentation Technology such a stimulating environment, including those whose heroic efforts have kept the absurdly nonstandard network running most of the time.

Keywords

The following keywords may be useful for indexing purposes:

Audio restoration, nonlinear compensation, regularization methods, Tikhonov regularization, optical soundtrack, density characteristic.


Summary

This dissertation is concerned with the restoration of degraded film sound. The sound quality of old films is often unacceptable: the sound is so noisy and distorted that the listener has to make a strong effort to understand the conversations in the film. In this case the film cannot give artistic enjoyment to the listener. This is the reason that several old films cannot be presented in cinemas or on television.

The quality of these films can be improved by digital restoration techniques. Since we have access only to the distorted signal and not to the original one, we cannot adjust the recording parameters or recording techniques. The only possibility is to post-compensate the signal to produce a better estimate of the undistorted, noiseless signal. In this dissertation new methods are proposed for fast and efficient restoration of nonlinear distortions in optically recorded film soundtracks.

First the nonlinear models and nonlinear restoration techniques are surveyed, and the ill-posedness of nonlinear post-compensation (its extreme sensitivity to noise) is explained. The effects and sources of linear and nonlinear distortions in optical soundtracks are also described. A new method is proposed to overcome the ill-posedness of the restoration problem and to get an optimal result. The effectiveness of the algorithm is demonstrated by simulations and by the restoration of real film-sound signals.


3.4.3 Restoration using nonlinear autoregressive models

4 The nonlinear characteristic of movie film
4.1 Image formation
4.2 Relationship between silver mass and transparency
4.3 Relationship between transparency and exposure

5 Imperfections in the optical sound-recording techniques
5.1 Introduction
5.2 Optical sound-recording techniques
5.2.1 Variable density method
5.2.2 Variable area method
5.3 Distortions at variable density method
5.4 Distortions at variable area method
5.5 Appearance of noise

6 Compensation of memoryless nonlinearities
6.1 Representation of nonlinearity
6.1.1 Representation using a piecewise linear model
6.1.2 Representation of the inverse nonlinearity
6.2 Identification of the nonlinear distortion
6.3 Effect of noise
6.4 Compensation of the signal by Tikhonov regularization
6.4.1 Comparison of the solution to the optimal least squares solution
6.4.2 Finding the appropriate value of the regularization parameter
6.4.3 Comparison of the novel method to Morozov’s and Hansen’s method
6.5 Results on synthetically distorted real audio signals
6.6 Results on real distorted audio signals
6.7 Compensation of the signal to make an unbiased estimate
6.7.1 Finding a proper compensation characteristic using an iterative method
6.7.2 Proof that the method is convergent under the given constraint
6.8 Simulation results

7 Conclusions and future possibilities
7.1 Conclusions
7.2 Suggestions for future research
7.2.1 Improved blind identification
7.2.2 Adaptivity
7.2.3 Elimination of nonlinearities with memory

A Brief history of film-sound
A.1 Sound-on-disc sound
A.2 Sound-on-film sound

B Optimal signal restoration in linear systems
B.1 Simple linear system
B.2 Piecewise linear model with two and more intervals

C MATLAB simulation of a realistic photosensitive layer

D MATLAB realization of computation of regularized nonlinear characteristics

E MATLAB realization of finding the optimal regularization

F MATLAB realization of calculation of compensation characteristic for unbiased signal reconstruction



List of Tables

5.1 Velocity of different film formats.
6.1 Comparison results of the Morozov, Hansen and the new method.
6.2 Comparison results of the exact inverse, Tikhonov and the unbiased characteristics.



List of Figures

2.1 Block diagram of an LNL system.
3.1 Block-scheme of pre-distortion.
3.2 Block-scheme of post-distortion.
3.3 Original, input signal (x in Fig. 3.2).
3.4 Distorted and noisy, observed signal (o in Fig. 3.2).
3.5 Reconstructed signal by the exact inverse of the nonlinear distortion (x̂ in Fig. 3.2).
4.1 Characteristic of exposure vs. developable crystals in a monosized silver-halide layer for different photon quanta sensitivity (r).
4.2 Exposure vs. 1-transmission characteristic of a typical emulsion.
4.3 Logarithmic exposure vs. density characteristic of a typical emulsion.
5.1 Schematic diagram of variable density method.
5.2 Sound-on-film, variable density.
5.3 Sound-on-film, variable area.
5.4 Schematic diagram of variable area method with electrodynamic mirror oscillograph.
5.5 Amplitude response of light intensity controlled variable density sound-recording. Solid line: standard (35 mm) film at 24 fps, dashed: substandard (16 mm) film at 16 fps.
5.6 Creation of nonlinear distortions due to light diffusion.
6.1 Model of the nonlinearity compensation.
6.2 One block from the piecewise linear compensation model.
6.3 The supplemented piecewise linear compensation model.
6.4 R(p_n(n), N(x)) at Gaussian error function and uniformly distributed noise (noise interval at left 0.1, noise interval at right 0.01). Solid line: nonlinear function, dashed line: R(p_n(n), N(x)).
6.5 R(p_n(n), N(x)) at Gaussian error function and Gaussian noise (noise deviation at left 0.1, noise deviation at right 0.01). Solid line: nonlinear function, dashed line: R(p_n(n), N(x)).
6.6 R(p_n(n), N(x)) at exponential function and uniformly distributed noise (noise interval at left 0.1, noise interval at right 0.01). Solid line: nonlinear function, dashed line: R(p_n(n), N(x)).
6.7 R(p_n(n), N(x)) at exponential function and Gaussian noise (noise deviation at left 0.1, noise deviation at right 0.01). Solid line: nonlinear function, dashed line: R(p_n(n), N(x)).
6.8 R(p_n(n), N(x)) at square-root function and uniformly distributed noise (noise interval at left 0.1, noise interval at right 0.01). Solid line: nonlinear function, dashed line: R(p_n(n), N(x)).
6.9 R(p_n(n), N(x)) at square-root function and Gaussian noise (noise deviation at left 0.1, noise deviation at right 0.01). Solid line: nonlinear function, dashed line: R(p_n(n), N(x)).
6.10 R(p_n(n), N(x)) at x^0.2 function and uniformly distributed noise (noise interval at left 0.1, noise interval at right 0.01). Solid line: nonlinear function, dashed line: R(p_n(n), N(x)).
6.11 R(p_n(n), N(x)) at x^0.2 function and Gaussian noise (noise deviation at left 0.1, noise deviation at right 0.01). Solid line: nonlinear function, dashed line: R(p_n(n), N(x)).
6.12 Multisine signal, x, used for the simulations.
6.13 Gaussian error function used for the first simulation.
6.14 x^5 function used for the second simulation.
6.15 Noisy output signal of the first simulation (distortion is made by the Gaussian error function).
6.16 Noisy output signal of the second simulation (distortion is made by the x^5 function).
6.17 Error of the compensation of nonlinearity by Morozov’s method (left) and Hansen’s method (right) as a function of λ. The nonlinear distortion is the Gaussian error function.
6.18 Error of the compensation of nonlinearity by the novel method (left) and the true result (right) as a function of λ. The nonlinear distortion is the Gaussian error function.
6.19 Error of the compensation of nonlinearity by Morozov’s method (left) and Hansen’s method (right) as a function of λ. The nonlinear distortion is the part of x^5.
6.20 Error of the compensation of nonlinearity by the novel method (left) and the true result (right) as a function of λ. The nonlinear distortion is the part of x^5.
6.21 Reconstruction of x̂ by Morozov’s method (left) and Hansen’s method (right) for the Gaussian error function.
6.22 Reconstruction of x̂ by the novel method (left) and the optimal result in least squares sense (right) for the Gaussian error function.
6.23 Reconstruction of x̂ by Morozov’s method (left) and Hansen’s method (right) for the x^5 nonlinear distortion.
6.24 Reconstruction of x̂ by the novel method (left) and the optimal result in least squares sense (right) for the x^5 nonlinear distortion.
6.25 Original, not distorted audio signal.
6.26 Audio signal synthetically distorted by a γ-function.
6.27 Distorted, noisy signal part chosen for parameter determination of the nonlinear function.
6.28 Result of parameter search of the nonlinear function.
6.29 Estimate error of the iterative algorithm at different regularization values.
6.30 True error at different regularization parameters.
6.31 Reconstructed signal by the best characteristic estimate.
6.32 Reconstructed signal by overregularized characteristic.
6.33 Reconstructed signal by underregularized characteristic (note scale change).
6.34 Real, nonlinearly distorted and noise contaminated audio signal.
6.35 Signal part chosen for parameter determination of the nonlinear function.
6.36 Results of parameter estimation of the nonlinearity.
6.37 Result of the iterative algorithm.
6.38 Reconstructed signal by optimally regularized characteristic.
6.39 Reconstructed signal by underregularized characteristic.
6.40 Sinusoid excitation signal used for the simulations.
6.41 The nonlinear distortion.
6.42 Distorted signal.
6.43 Reconstruction of x by the exact inverse (left) and Tikhonov-regularized inverse (right).
6.44 Unbiased reconstruction.
B.1 Estimation of x in the knowledge of o.
B.2 Original and inverse piecewise linear system.



Chapter 1

Introduction

1.1 Overview

Optical film sound-recording technology is more than 100 years old. Since its introduction, millions of sound films have been made and stored in national film archives, and these films have inestimable artistic value. The task of the archives is not just to preserve these films but also to prepare them for broadcasting and to show them to a wide audience. However, most of these films cannot be broadcast because they suffer from several degradations.

There are several distinct types of film degradation. These can be broadly classified into two groups: localised degradations and global degradations. Localised degradations are discontinuities in the waveform which affect only certain samples. Global degradations affect all samples of the waveform. We can distinguish the following sub-classes of degradations [1]:

– clicks and cracklings,

– low-frequency noise transients,

– broad band noise,

– wow and flutter,

– non-linear defects.

Clicks and cracklings are short bursts of interference, random in time and amplitude. These impulsive disturbances are caused by defects in the sound-carrier material (e.g. scratches or dirt spots on the surface).


Low-frequency noise transients are generally larger-scale defects than clicks. They are caused by large discontinuities, such as glued joins between film rolls or other severe damage to the optical sound recording. These changes in the film material cause abrupt excitations in the light intensity during sound reproduction and hence strong transients in the reproduced sound. These large discontinuities can be heard as low-frequency pulses.

Broad band noise is common to all analogue measurement, storage and recording systems

and in the case of audio signals it is generally perceived as “hiss” by the listener. It can be

composed of electrical circuit noise, irregularities in the storage medium and ambient noise

from the recording environment.

Wow and flutter are pitch variation defects which may be caused by eccentricities in the

playback system, motor speed fluctuations or by special distortions of the sound carrier (e.g.

shrinkage of film).

Non-linear defects form a very general class that covers a wide range of distortions. In the audio field, the principal causes are [2]:

– saturation in magnetic recording,

– tracing distortion (before compensation was introduced) and groove deformation in

records,

– the inherent nonlinearity of optical soundtracks.

There are already many solutions and applications in the scientific literature and on the market that deal with the restoration of local degradations and wide band noise. Several results on eliminating pitch defects have also been published. However, relatively little emphasis has been placed on the elimination of non-linear defects, which is a topic of current research interest in DSP for audio [1].

In the last decade, methods restoring damaged audio recordings have progressed from ad

hoc methods, motivated primarily by ease of implementation, towards more sophisticated

approaches based on mathematical modeling of the signal and degradation processes.

This thesis addresses the elimination of distortion of optical soundtracks, a problem that has not been extensively investigated so far. Restoration of nonlinear distortions is a special kind of inverse filtering problem. This problem can be ill-posed, which means that during reconstruction of the nonlinearly distorted signal, small uncertainties in this signal can cause strong deviations in the restored one. In this case, our aim is to find a restoration method in which both the signal distortion and the level of deviation (more simply, the level of the amplified noise) can be kept low. The aim of this dissertation is to clarify the causes of nonlinear distortions in optical soundtracks and to propose methods based on digital signal processing that reduce the distortion and avoid the appearance of artefacts in the restored sound.
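The noise sensitivity described above can be illustrated with a small numerical sketch. The saturating curve tanh(3x), the noise level and all function names are illustrative assumptions chosen for this example; they are not the film characteristic or the restoration method studied in this thesis.

```python
import math
import random

def distort(x, k=3.0):
    """Hypothetical memoryless saturating distortion (stand-in for a real characteristic)."""
    return math.tanh(k * x)

def exact_inverse(o, k=3.0, limit=0.999999):
    """Exact inverse of distort(); the observation is clipped so that atanh stays finite."""
    o = max(-limit, min(limit, o))
    return math.atanh(o) / k

def rms(values):
    """Root mean square of a list of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))

random.seed(0)
n = 2000
x = [0.9 * math.sin(2 * math.pi * 5 * i / n) for i in range(n)]  # original signal
noise = [random.gauss(0.0, 0.01) for _ in range(n)]              # small observation noise
o = [distort(xi) + ni for xi, ni in zip(x, noise)]               # distorted, noisy observation
x_hat = [exact_inverse(oi) for oi in o]                          # naive restoration

noise_rms = rms(noise)
error_rms = rms([a - b for a, b in zip(x_hat, x)])
print(f"noise RMS at the observation:    {noise_rms:.4f}")
print(f"error RMS after exact inversion: {error_rms:.4f}")
```

Near the saturated peaks the inverse curve is very steep, so the small observation noise is strongly amplified there; this is the deviation that a regularized inverse characteristic is meant to limit.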

1.2 Structure of thesis

Chapter 2 introduces the description and representation forms of memoryless nonlinearities and nonlinearities with memory. Chapter 3 examines the possible methods for eliminating the effects of nonlinear distortions and explains in detail the problems and possible solutions of nonlinear post-compensation techniques. The main problem during post-compensation is the amplification of the noise that is present in the original material. Without proper compensation, the noise amplification could be so strong that the resulting sound is worse than the distorted one. In this chapter the origin of the noise amplification is discussed and the methods that could be applied to overcome this problem are summarized.

Chapter 4 reviews the nonlinear characteristic of photosensitive materials and presents the analytical equations that describe the nonlinear behaviour. Chapter 5 discusses the film-sound recording techniques and how the nonlinear distortions of the photosensitive materials appear in the sound.

Chapter 6 presents two novel methods for composing compensation characteristics for post-compensation of distorted signals. One of them is based on Tikhonov regularization operators. The aim of this compensation technique is to minimize the estimated total energy of the noise and distortion terms. The method is fast compared to other compensation methods because it has no iterative steps during the compensation process. Simulations in this chapter also show that the accuracy of the method is as high as that of other compensation methods.

A common problem in the regularization of an ill-posed problem is that we have very little knowledge about the original signal, hence we do not know how much regularization is needed to achieve the optimal result. In this chapter a new method is shown that can automatically find a good estimate of the amount of regularization without user interaction. This is quite important in the film industry and in film archives, where a huge number of degraded films are waiting for restoration and there is no time to make several experiments on each film.

The aim of the second compensation method is to produce an unbiased estimate of the original, undistorted signal from the noisy, distorted one.

We also have little knowledge about the nonlinear distortion function, which is another problem in signal compensation. In Chapter 6 a possible method is shown for identifying the nonlinear function when an analytical, parametrizable formula for the distortion is known.

Finally, Chapter 7 presents conclusions and suggests possible directions for future research.


Chapter 2

Classification of nonlinearities and nonlinear models

2.1 Classification of nonlinearities

A system whose input-output relation is described by the function $H(\cdot)$ is a linear system if, for any inputs $x_1(t)$ and $x_2(t)$ and for any constant $c$, the additive property (eq. (2.1)) and the homogeneity property (eq. (2.2)) are satisfied:

\[ H(x_1(t) + x_2(t)) = H(x_1(t)) + H(x_2(t)), \tag{2.1} \]

\[ H(c \cdot x(t)) = c \cdot H(x(t)). \tag{2.2} \]

In the case of a nonlinear system the additive and/or homogeneity properties are not satisfied.

Nonlinear systems can be divided into two main categories:

– memoryless nonlinear systems,

– nonlinear systems with memory.

In a memoryless nonlinear system the current output at time t depends only on the current input at time t and does not depend on past or future input values. A nonlinear system has memory if the output at time t depends on the input at time t as well as on the inputs over a previous time interval.
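The definitions of this section can be tried out on a few toy discrete-time systems. The sketch below is only an illustration: the four example systems and the helper names are assumptions made for this example, and passing the check for one pair of inputs does not prove linearity, whereas failing it does disprove it.

```python
def is_linear(system, x1, x2, c, tol=1e-9):
    """Check additivity (2.1) and homogeneity (2.2) for one pair of test inputs."""
    add_lhs = system([a + b for a, b in zip(x1, x2)])
    add_rhs = [a + b for a, b in zip(system(x1), system(x2))]
    hom_lhs = system([c * a for a in x1])
    hom_rhs = [c * a for a in system(x1)]
    additive = all(abs(p - q) <= tol for p, q in zip(add_lhs, add_rhs))
    homogeneous = all(abs(p - q) <= tol for p, q in zip(hom_lhs, hom_rhs))
    return additive and homogeneous

def output_depends_on_past(system, x, t):
    """Return True if changing the input at time t-1 changes the output at time t."""
    perturbed = list(x)
    perturbed[t - 1] += 1.0
    return system(x)[t] != system(perturbed)[t]

def gain(x):
    """Linear, memoryless: y(t) = 2 x(t)."""
    return [2.0 * v for v in x]

def moving_average(x):
    """Linear, with memory: y(t) = (x(t) + x(t-1)) / 2, taking x(-1) = 0."""
    return [(v + (x[i - 1] if i > 0 else 0.0)) / 2.0 for i, v in enumerate(x)]

def cubic(x):
    """Nonlinear, memoryless: y(t) = x(t)^3."""
    return [v ** 3 for v in x]

def cubic_with_memory(x):
    """Nonlinear, with memory: y(t) = (x(t) + 0.5 x(t-1))^3, taking x(-1) = 0."""
    return [(v + 0.5 * (x[i - 1] if i > 0 else 0.0)) ** 3 for i, v in enumerate(x)]

x1 = [0.5, -1.0, 2.0, 0.25]
x2 = [1.0, 0.5, -0.5, 4.0]
for system in (gain, moving_average, cubic, cubic_with_memory):
    print(system.__name__,
          "linear:", is_linear(system, x1, x2, c=3.0),
          "memory:", output_depends_on_past(system, x1, t=2))
```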

2.2 Representation of memoryless nonlinearities

Memoryless nonlinear models are often adequate for representing nonlinearities in systems whose bandwidth is very wide compared with the signal bandwidth. The main advantages of such models are their simplicity, ease of application and low computational burden [3]. Applications that can be represented with memoryless nonlinearities include microwave amplifiers [4], A/D and D/A converters [5], photosensitive materials [6, 7], tube amplifiers [8, 9], several types of transducers [10] and many others that cannot be enumerated here for lack of space.

2.2.1 Taylor series and piecewise linear representation

The most elementary model for dealing with nonlinear systems is the Taylor series. The Taylor series provides a polynomial representation of a memoryless nonlinear system. According to [11], James Gregory was the first to discover the Taylor series in 1668, more than forty years before Brook Taylor published it in 1717.

If a real function $f(x)$ has continuous derivatives up to $(n+1)$th order, then this function can be expanded in the following fashion:

\[ f(x) = f(a) + \frac{1}{1!} \left. \frac{df(x)}{dx} \right|_{x=a} (x-a) + \frac{1}{2!} \left. \frac{d^2 f(x)}{dx^2} \right|_{x=a} (x-a)^2 + \ldots + \frac{1}{n!} \left. \frac{d^n f(x)}{dx^n} \right|_{x=a} (x-a)^n + R_n, \tag{2.3} \]

where $R_n$, called the remainder after $n+1$ terms, is given by

\[ R_n = \int_a^x \frac{f^{(n+1)}(u)\,(x-u)^n}{n!}\, du = \frac{f^{(n+1)}(\xi)\,(x-a)^{n+1}}{(n+1)!}, \qquad a < \xi < x. \tag{2.4} \]

When this expansion converges over a certain range of $x$, that is, $\lim_{n \to \infty} R_n = 0$, the expansion is called the Taylor series of $f(x)$ expanded about $a$.
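As a quick numerical illustration of the expansion and its remainder, the sketch below evaluates the Taylor polynomial of the exponential function and compares the actual error with the bound that follows from the Lagrange form of the remainder in eq. (2.4). The choice of exp as test function and the function names are assumptions made for this example.

```python
import math

def taylor_exp(x, n, a=0.0):
    """n-th order Taylor polynomial of exp about a: sum of exp(a) (x-a)^k / k!."""
    return sum(math.exp(a) * (x - a) ** k / math.factorial(k) for k in range(n + 1))

def lagrange_bound_exp(x, n, a=0.0):
    """Bound on |R_n| from eq. (2.4): largest exp between a and x times |x-a|^(n+1)/(n+1)!."""
    return math.exp(max(a, x)) * abs(x - a) ** (n + 1) / math.factorial(n + 1)

x = 0.5
for n in (1, 2, 4, 8):
    error = abs(math.exp(x) - taylor_exp(x, n))
    print(f"n={n}: error={error:.3e}, bound={lagrange_bound_exp(x, n):.3e}")
```

For exp every derivative is again exp, which is why the bound only needs the largest value of exp between a and x.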

If n = 1 in eq. (2.3), we get a simple linear model, which has an acceptably small error in a given small domain. Linearity has been one of the fundamental principles upon which the theory of signal processing has been built. Most real-world problems, however, are intrinsically nonlinear and can be modeled as linear ones only within a limited range of values. Piecewise linear models constitute a compromise between the inherent complexity of the nonlinear domain and the theoretical abundance of linear methods.
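A piecewise linear model can be sketched in a few lines. The example below assumes equally spaced breakpoints and uses tanh as a stand-in nonlinearity; both choices, and the function names, are illustrative assumptions rather than the model developed later in the thesis.

```python
import bisect
import math

def make_piecewise_linear(f, lo, hi, segments):
    """Sample f at equally spaced breakpoints and return a linear interpolator."""
    xs = [lo + (hi - lo) * i / segments for i in range(segments + 1)]
    ys = [f(x) for x in xs]

    def approx(x):
        # Clamp the input to [lo, hi], then locate the segment that contains it.
        x = min(max(x, lo), hi)
        i = min(max(bisect.bisect_right(xs, x) - 1, 0), segments - 1)
        slope = (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
        return ys[i] + slope * (x - xs[i])

    return approx

def max_error(f, g, lo, hi, points=1001):
    """Largest absolute difference between f and g on an equally spaced grid."""
    grid = [lo + (hi - lo) * k / (points - 1) for k in range(points)]
    return max(abs(f(x) - g(x)) for x in grid)

errors = {}
for segments in (2, 8, 32):
    model = make_piecewise_linear(math.tanh, -2.0, 2.0, segments)
    errors[segments] = max_error(math.tanh, model, -2.0, 2.0)
    print(f"{segments:2d} segments: max error {errors[segments]:.4f}")
```

Each segment is an ordinary linear model, so linear tools apply locally, while the number of segments controls how closely the overall curve is followed.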

2.2.2 Polynomial interpolation

Weierstrass's theorem, proved in 1885, states that memoryless nonlinear systems that are non-polynomial in nature can be approximated with arbitrary accuracy by polynomial models over a given range of inputs [12]. This is now known as the Weierstrass approximation theorem. In the 1950s, Davenport and Root showed how the direct method and the transform method can be used to determine the statistical properties of the output of memoryless nonlinear devices [11].

In the late 1960s, Blachman showed that a memoryless nonlinearity can be represented as a generalised Fourier decomposition into a sum of orthogonal polynomials [13, 14]. The orthogonality of the polynomials for particular input signal properties allows the polynomial coefficients to be calculated or measured using a cross-correlation method. Appropriate sets of orthogonal polynomials for a number of stationary input signals were discovered well before Blachman’s application. In 1939 Szegő attempted to produce a complete bibliography of every paper published on the subject of orthogonal polynomials before that date [15].

The most commonly used orthogonal polynomials are Chebyshev and Hermite polynomials. Chebyshev polynomials, $T_n(x)$, $n \in \{0, 1, 2, \ldots\}$, are real functions which form a complete orthogonal set on the interval $-1 \le x \le 1$ with respect to the weighting function $\frac{1}{\sqrt{1-x^2}}$. It can be shown that

\[ \int_{-1}^{1} \frac{1}{\sqrt{1-x^2}}\, T_m(x)\, T_n(x)\, dx = \begin{cases} 0 & \text{if } m \ne n, \\ \pi & \text{if } m = n = 0, \\ \frac{\pi}{2} & \text{if } m = n = 1, 2, 3, \ldots \end{cases} \tag{2.5} \]

Since sine wave signals have a $\frac{1}{\sqrt{1-x^2}}$ amplitude distribution, this kind of nonlinearity interpretation is applicable to generate or eliminate certain harmonic distortions in sinusoid excitations [3].

Hermite polynomials, $H_n(x)$, $n \in \{0, 1, 2, \ldots\}$, form a complete orthogonal set on the interval $-\infty < x < \infty$ with respect to the weighting function $\exp(-x^2)$. It can be shown that

\[ \int_{-\infty}^{\infty} e^{-x^2}\, H_m(x)\, H_n(x)\, dx = \begin{cases} 0 & \text{if } m \ne n, \\ 2^n n! \sqrt{\pi} & \text{if } m = n. \end{cases} \tag{2.6} \]

Since Gauss-like signals have an $\exp(-x^2)$ amplitude distribution, this kind of nonlinearity interpretation is applicable to simulate or eliminate distortions in the case of a Gaussian distribution, which is a quite often used signal modeling assumption.
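The two orthogonality relations can be checked numerically. The sketch below generates both polynomial families by their standard three-term recurrences, integrates the Chebyshev products with Gauss-Chebyshev quadrature and the Hermite products with a plain trapezoidal rule on a truncated interval. The quadrature choices, the truncation to [-10, 10] and the function names are assumptions made for this example.

```python
import math

def chebyshev_T(n, x):
    """T_n(x) by the recurrence T_{k+1} = 2x T_k - T_{k-1}, with T_0 = 1, T_1 = x."""
    t_prev, t = 1.0, x
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t = t, 2.0 * x * t - t_prev
    return t

def hermite_H(n, x):
    """H_n(x) by the recurrence H_{k+1} = 2x H_k - 2k H_{k-1}, with H_0 = 1, H_1 = 2x."""
    h_prev, h = 1.0, 2.0 * x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    return h

def chebyshev_inner(m, n, nodes=64):
    """Gauss-Chebyshev quadrature of T_m T_n with weight 1/sqrt(1 - x^2) on [-1, 1]."""
    total = 0.0
    for k in range(1, nodes + 1):
        x = math.cos((2 * k - 1) * math.pi / (2 * nodes))
        total += chebyshev_T(m, x) * chebyshev_T(n, x)
    return math.pi / nodes * total

def hermite_inner(m, n, half_width=10.0, step=0.01):
    """Trapezoidal approximation of the integral of exp(-x^2) H_m H_n over [-10, 10]."""
    count = int(round(2 * half_width / step))
    total = 0.0
    for k in range(count + 1):
        x = -half_width + k * step
        weight = 0.5 if k in (0, count) else 1.0
        total += weight * math.exp(-x * x) * hermite_H(m, x) * hermite_H(n, x)
    return total * step

for m, n in ((0, 0), (3, 3), (2, 3)):
    print(f"Chebyshev <T{m}, T{n}> = {chebyshev_inner(m, n):.6f}")
for m, n in ((2, 2), (1, 3)):
    print(f"Hermite   <H{m}, H{n}> = {hermite_inner(m, n):.6f}")
```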

The advantage of orthogonal polynomials over Taylor polynomials is that in cascaded systems they do not produce cross-product terms. For example, when the second and third order harmonic distortion of a system is eliminated by a cascaded polynomial compensation system, the result will not contain new, higher order terms. Their disadvantage is that this behaviour holds only for a narrow range of signal types having a given amplitude distribution.


2.2.3 Analytical models

Several nonlinear physical systems, such as traveling-wave tubes used in radio-frequency communication channels or photosensitive materials, can be described by analytical models, which are special (usually non-polynomial) mathematical functions. The advantage of these functions is that they usually have a physical basis and can be parametrized, hence the identification of a given nonlinearity reduces to the optimization of a few parameters.

An example is the case of narrow-band excitations such as radio-frequency communication signals, where the relationship between the input and output can be expressed as separate amplitude and phase distortions. If an input radio-frequency signal is expressed as

\[ x(t) = r(t) \cos(\omega t + \Phi(t)), \tag{2.7} \]

then the output $y(t)$ of a traveling-wave tube can be described as

\[ y(t) = A(r(t)) \cos(\omega t + \Phi(t) + \phi(r(t))), \tag{2.8} \]

where $A(r)$ and $\phi(r)$ are the amplitude and phase nonlinear distortions and $t$ denotes time. There are quite a few mathematical approximation formulae for these distortions [16, 17, 18, 19].
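To make eq. (2.8) concrete, the sketch below uses one well-known two-parameter form for the amplitude and phase distortions, usually attributed to Saleh: A(r) = αr / (1 + βr²) and φ(r) = αr² / (1 + βr²). Whether this form is among the formulae cited above is not assumed here, and the parameter values are round illustrative numbers, not measured tube data.

```python
import math

def am_am(r, alpha=2.0, beta=1.0):
    """Amplitude distortion A(r) = alpha r / (1 + beta r^2); parameters are illustrative."""
    return alpha * r / (1.0 + beta * r * r)

def am_pm(r, alpha=4.0, beta=9.0):
    """Phase distortion phi(r) = alpha r^2 / (1 + beta r^2); parameters are illustrative."""
    return alpha * r * r / (1.0 + beta * r * r)

def twt_output(t, envelope, phase, omega):
    """Eq. (2.8) for an envelope r(t) and a phase Phi(t) passed in as callables."""
    r = envelope(t)
    return am_am(r) * math.cos(omega * t + phase(t) + am_pm(r))

for r in (0.1, 0.5, 1.0, 2.0, 3.0):
    print(f"r={r:.1f}: A(r)={am_am(r):.3f}, phi(r)={am_pm(r):.3f} rad")
```

With these values the amplitude response is almost linear for small envelopes, peaks at r = 1 and then falls off, which is the saturating behaviour that such a model is meant to capture with only two parameters per curve.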

In the case of optical sound-recording, analytical formulae could be very important for identification and restoration. Analytical formulae with three or more constants have been proposed for photosensitive materials by several authors. They agree reasonably well with experimental curves, but the theory behind these equations is quite inadequate. Several empirical formulae were proposed in the 1940s, but these formulae were not accurate enough [20]. A more accurate analytical formula for the density vs. log exposure characteristic of photosensitive emulsions was given by Solman and Farnel [21]. It agrees well with real emulsions, although the photographic fog is not modeled.

A formula commonly used nowadays in optical sound recording is the γ curve [22],
which can accurately describe a large range of the characteristic. The equation of the γ
curve is

$$T(E) = 1 - (1 - T_{sat} - T_{fog})\, E^{\gamma} - T_{fog}, \tag{2.9}$$

where $T$ denotes the light-transmission ability of the film after development and $E$ stands
for the light exposure on the film before development. $T_{sat}$ is the lowest light-transmission
ability of the film, and $T_{fog}$ sets the highest transmission ability that can be achieved
($T = 1 - T_{fog}$ at $E = 0$). $\gamma$ is a parameter that is different for different film types.
The normal range of this parameter is between about 0.2 and 5.
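A minimal sketch of eq. (2.9) and its algebraic inverse follows. It assumes that the exposure E is normalized to the interval [0, 1]; the function names and the parameter values used in the usage note are hypothetical.

```python
def gamma_curve(E, gamma, T_sat, T_fog):
    """Transmission T(E) of eq. (2.9); E is assumed to be normalized to [0, 1]."""
    return 1.0 - (1.0 - T_sat - T_fog) * E ** gamma - T_fog

def inverse_gamma_curve(T, gamma, T_sat, T_fog):
    """Exposure E that produces transmission T, obtained by solving eq. (2.9) for E."""
    return ((1.0 - T_fog - T) / (1.0 - T_sat - T_fog)) ** (1.0 / gamma)
```

For example, with gamma = 2, T_sat = 0.05 and T_fog = 0.1, the curve runs from T = 0.9 at E = 0 down to T = 0.05 at E = 1, and `inverse_gamma_curve` maps any transmission in that range back to the exposure.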


2.3 Representation of nonlinearities with memory

The approaches to nonlinear modelling based on Taylor series and orthogonal series, and the
direct and transform methods of nonlinear system analysis, are suitable only for memoryless
nonlinearities. However, the development of more complex models to deal with nonlinear
systems with memory dates back to the late 19th century.

2.3.1 Volterra series

In 1887 Volterra published a functional series expansion now known as the Volterra series
[23]. This generalised form of the Taylor series expansion can be used to represent a
nonlinear system with memory. In 1910 Fréchet published a more rigorous representation of
the Volterra series and contributed to the generalisation of Weierstrass' approximation
theorem to functionals, in which the polynomials are replaced by so-called "polynomic
functionals". Specifically, the generalisation of Weierstrass' approximation theorem states
that nonlinear systems with memory that are non-polynomial in nature can be represented with
arbitrary accuracy by polynomial based nonlinear functional models, over a given range of
inputs.

The Volterra series is a very general means of describing a continuous-time output, y(t),
in terms of an input, x(t). The Volterra series expansion for a causal, time-invariant system
can be expressed as

$$y(t) = H_1[x(t)] + H_2[x(t)] + \ldots + H_n[x(t)] \tag{2.10}$$

in which the n-th degree Volterra operator, $H_n[\cdot]$, is defined by the convolution

$$H_n[x(t)] = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} h_n(\tau_1, \ldots, \tau_n)\, x(t - \tau_1) \cdots x(t - \tau_n)\, d\tau_1 \ldots d\tau_n \tag{2.11}$$

and the Volterra kernels, $h_n(\cdot)$, have unspecified form, but $h_n(\tau_1, \ldots, \tau_n) = 0$
for any $\tau_i \le 0$, $i = 1, 2, \ldots, n$.

In discrete time, eq. (2.11) becomes [24]

$$H_n[x_t] = \sum_{j_1=0}^{\infty} \cdots \sum_{j_n=0}^{\infty} h_n(j_1, \ldots, j_n)\, x_{t-j_1} \cdots x_{t-j_n} \tag{2.12}$$

This is a generalisation from linear systems theory: for a linear system, y(t) = H 1[x(t)], the

first degree kernel h1(t) is the impulse response, which completely describes the system. For

higher-degree systems, hn(t1, . . . , tn) can be thought of as an n-dimensional impulse response.
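The discrete-time operators can be sketched for a series truncated at degree two and at a finite memory length. The truncation, the zero initial conditions and the function name are assumptions made for illustration only.

```python
def volterra2(x, h1, h2):
    """Second-degree, finite-memory discrete Volterra model.

    y[t] = sum_j h1[j] x[t-j] + sum_{j1,j2} h2[j1][j2] x[t-j1] x[t-j2],
    where h1 has M taps, h2 is an M x M kernel, and samples before t = 0 are zero.
    """
    M = len(h1)
    y = []
    for t in range(len(x)):
        past = [x[t - j] if t - j >= 0 else 0.0 for j in range(M)]
        linear = sum(h1[j] * past[j] for j in range(M))
        quadratic = sum(h2[j1][j2] * past[j1] * past[j2]
                        for j1 in range(M) for j2 in range(M))
        y.append(linear + quadratic)
    return y
```

Doubling the input doubles the contribution of the first-degree kernel and quadruples that of the second-degree kernel, which is what distinguishes the model from a linear filter.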


Discrete Volterra models are widely used in the control literature, classification problems

and artificial neural networks. Present applications in audio include input/output modeling

of audio systems and nonlinear filtering to precompensate for known loudspeaker nonlinear-

ities [25].

2.3.2 Parametric models

There are two basic situations in nonlinear system modeling:

– Input/output modeling in which we have access to both the input and output of the

system, and seek to describe the function mapping from present and past (for a causal

system) values of the input to the output.

– Time series modeling in which we have access only to the output of the system. In

this case we want to describe the output in terms of an input/output model acting on

a random, independent and identically distributed excitation process.

Volterra modeling is a typical example of input/output modeling. An alternative
methodology for nonlinear modelling is nonlinear time series modeling. There is a plethora
of such models, but there is no universally recognised method to categorise them [25]. For
example, Tong [26], Tjøstheim [27], and Chen and Billings [28] take radically different
approaches. They can all, however, be treated as generalisations or specialisations of the
nonlinear ARMA (autoregressive moving average) model.

In an autoregressive moving average model, an observed output signal, $o$, can be
represented as

$$o_t = \sum_{i=1}^{k} a_i o_{t-i} + \sum_{j=1}^{l} b_j e_{t-j} + e_t, \tag{2.13}$$

where $a_i$ and $b_j$ are weighting factors and $e_t$ is an excitation signal (it can be thought
of as an additive noise whose current value is unknown). This equation can be generalized to
give a nonlinear ARMA (NARMA) model. This takes the form

$$o_t = f(o_{t-1}, \ldots, o_{t-k}, e_{t-1}, \ldots, e_{t-l}) + e_t, \tag{2.14}$$

where $f$ is now some arbitrary nonlinear function rather than a simple weighted sum.

This function could be a polynomial model, which is very similar to a finite length and

finite maximum degree Volterra model. If the degree of the polynomial is two, this is the


so-called bilinear nonlinear model [25]:

$$o_t = a_0 + \sum_{i=1}^{A} a_i o_{t-i} + \sum_{j=1}^{B} b_j e_{t-j} + \sum_{k=1}^{C} \sum_{l=1}^{D} c_k d_l\, o_{t-k} e_{t-l}. \tag{2.15}$$
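A short simulation sketch of the bilinear model of eq. (2.15) is given below. It takes the product term to involve past outputs and past excitations, draws the excitation from a unit-variance Gaussian generator, and uses zero initial conditions; all three choices, as well as the function names, are assumptions made for illustration.

```python
import random

def lagged(seq, t, lag):
    """Value of seq at time t - lag, or 0.0 before the start of the record."""
    return seq[t - lag] if t - lag >= 0 else 0.0

def simulate_bilinear(n, a0, a, b, c, d, seed=0):
    """Generate n samples of eq. (2.15) driven by white Gaussian excitation e."""
    rng = random.Random(seed)
    e = [rng.gauss(0.0, 1.0) for _ in range(n)]
    o = []
    for t in range(n):
        ar = sum(a[i - 1] * lagged(o, t, i) for i in range(1, len(a) + 1))
        ma = sum(b[j - 1] * lagged(e, t, j) for j in range(1, len(b) + 1))
        bilinear = sum(c[k - 1] * d[l - 1] * lagged(o, t, k) * lagged(e, t, l)
                       for k in range(1, len(c) + 1)
                       for l in range(1, len(d) + 1))
        o.append(a0 + ar + ma + bilinear)
    return o, e
```

Setting all c or d coefficients to zero removes the product term and leaves a linear recursion of the form of eq. (2.13) without the current excitation sample.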

2.3.3 Threshold models

In a threshold model [26], different functions $f(\cdot)$ are used depending on the value of
the output at some fixed lag $d$. This introduces nonlinearities even when the functions
themselves are linear. It can be written as

$$f(\cdot) = \begin{cases} g_1(\cdot) & \text{if } r_0 \le o_{t-d} < r_1 \\ g_2(\cdot) & \text{if } r_1 \le o_{t-d} < r_2 \\ \quad\vdots \\ g_m(\cdot) & \text{if } r_{m-1} \le o_{t-d} < r_m \end{cases} \tag{2.16}$$

where the thresholds, $r_i$, satisfy

$$-\infty \le r_0 < r_1 < r_2 < \ldots < r_{m-1} < r_m \le \infty, \tag{2.17}$$

and $g_i$ can be defined as a linear or nonlinear model.
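The switching rule of eqs. (2.16) and (2.17) can be sketched with first-order autoregressive sub-models. Restricting each g_i to a single AR(1) coefficient and using zero initial conditions are simplifying assumptions; the names are hypothetical.

```python
def select_regime(value, thresholds, models):
    """Return models[i-1] for which thresholds[i-1] <= value < thresholds[i]."""
    for i in range(len(models)):
        if thresholds[i] <= value < thresholds[i + 1]:
            return models[i]
    raise ValueError("value lies outside the threshold range")

def simulate_threshold_ar(e, thresholds, coeffs, d=1):
    """Threshold AR(1): o[t] = g * o[t-1] + e[t], with g chosen from o[t-d]."""
    o = []
    for t in range(len(e)):
        previous = o[t - 1] if t >= 1 else 0.0
        switching_value = o[t - d] if t >= d else 0.0
        g = select_regime(switching_value, thresholds, coeffs)
        o.append(g * previous + e[t])
    return o
```

With thresholds `[-inf, 0, inf]` and coefficients `[-0.9, 0.5]`, each sub-model is linear, yet the overall response to an input and to its negated copy is not symmetric, so the model as a whole is nonlinear.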

2.3.4 Cascade models

Rather than using large, general nonlinear models, an alternative approach is to cascade

smaller models together, connecting the output of one to the input of the next. This can

correspond to the real physical structure of the system itself.

A common cascaded structure is the Linear-Nonlinear-Linear (LNL) or sandwich model
illustrated in Fig. 2.1. This model consists of a linear element, h(τ), whose output, u(t), is
transformed by a memoryless nonlinearity, N(·). The output of the nonlinearity is processed
by a second linear system, g(τ). This system is also called a Wiener-Hammerstein system.

The LNL cascade has two special cases, the Hammerstein system (NL) and the Wiener

system (LN). Both the Wiener and Hammerstein models can be linear in the parameters if

the component models themselves are linear. Block-oriented models are a generalisation of

cascade models to allow arbitrary connections, including feedback and feedforward, between

subsystems. They are widely used in the control literature.

Figure 2.1: Block diagram of an LNL system: x(t) → h(τ) → u(t) → v = N(u) → v(t) → g(τ) → y(t).

Cascaded systems can also be connected in parallel. Palm [29] showed that any finite
dimension, finite order, finite memory Volterra system can be represented exactly by a
finite sum of LNL models. More recently, Korenberg [30] showed that this is true for Wiener
cascade elements as well. This is a significant advancement, since the identification
algorithms for Wiener models are much simpler than those for LNL cascades [31].
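A minimal sketch of the LNL cascade follows, with the two linear elements realized as FIR filters. The FIR realization and the names are assumptions made for illustration; the Wiener and Hammerstein systems are obtained by replacing the second or the first filter with the identity `[1.0]`.

```python
def fir(x, h):
    """Causal FIR filter: y[t] = sum_j h[j] * x[t-j], zero initial conditions."""
    return [sum(h[j] * x[t - j] for j in range(len(h)) if t - j >= 0)
            for t in range(len(x))]

def lnl(x, h, nonlinearity, g):
    """Linear-Nonlinear-Linear cascade: x -> h -> u -> N(u) -> v -> g -> y."""
    u = fir(x, h)
    v = [nonlinearity(sample) for sample in u]
    return fir(v, g)

def wiener(x, h, nonlinearity):
    """LN cascade: linear filter followed by a memoryless nonlinearity."""
    return lnl(x, h, nonlinearity, [1.0])

def hammerstein(x, nonlinearity, g):
    """NL cascade: memoryless nonlinearity followed by a linear filter."""
    return lnl(x, [1.0], nonlinearity, g)
```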


Chapter 3

Techniques for nonlinear compensation

3.1 Possible methods for compensation

When a signal passes through a system having a nonlinear transfer function, the output
signal will be distorted. If the distortion is not acceptable, we have to reduce it somehow.

Methods for compensation or elimination of nonlinear distortions can be divided into
three main groups:

– If we can modify the structure of the system, we can re-design it in order to reduce
the nonlinear distortion. This is a widely used method in industry. Examples of the
reduction of nonlinear distortions of A/D converters can be seen in [32, 33, 5, 34, 35,
36, 37]; examples for current transformers can be seen in [38] and [39]; examples of
reducing nonlinear distortions in movie cameras can be seen e.g. in [40]. This approach
is too broad to be dealt with in detail here.

– If we cannot modify the structure, but we have access to the input, we can pre-distort
the original input signal to compensate for the distortion.

– If we have access neither to the structure nor to the input, we can post-process the
output signal to compensate for the distortion.

Figure 3.1: Block-scheme of pre-distortion: i → P(·) → x → N(·) → y.

3.2 Pre-distortion

As discussed in Section 2.3.2, nonlinear modeling, and also nonlinear compensation, has
two basic situations: input/output modeling, where we have access to the input and output,
and time series modeling, where we have access only to the output. In several applications
we have access to both the input, x, and the output, o, of the nonlinear system. In these
cases pre-distortion techniques are preferred. Their block scheme is depicted in Fig. 3.1.
In the other case, when we have access only to the output of the system, post-distortion
techniques can be used.

In the case of pre-distortion, the excitation of the nonlinear system is produced by another
nonlinear system, so that the distortion of the input excitation signal, i, is eliminated at
the output of the two cascaded systems.

The limitation of this method is that the noise level before the original distortion has to
be negligibly low, but this can usually be fulfilled. Hence there is no need to care about the
extra effects of noise, and the pre-distortion stage can simply be the inverse of the original
system.
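The noise-free case can be sketched with a hypothetical memoryless system, N(x) = tanh(x), whose exact inverse is available in closed form. The choice of tanh is an assumption made purely for illustration, and the input must stay inside the open interval (-1, 1), the range of the system.

```python
import math

def system(x):
    """Assumed memoryless nonlinear system N(x) = tanh(x)."""
    return math.tanh(x)

def predistorter(i):
    """Exact inverse P = N^-1; defined only for |i| < 1."""
    return math.atanh(i)

def compensated_chain(i):
    """Cascade i -> P -> x -> N -> y of Fig. 3.1; ideally y equals i."""
    return system(predistorter(i))
```

Since no noise enters between the two stages, the cascade reproduces the input up to rounding error, even for inputs close to the saturation limit.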

Pre-distortion is a typical solution at the transmitter side of microwave communication
channels, where the transmit amplifier has strong nonlinear distortion. Pre-distorter
characteristics were proposed as early as 1972 by Kaye [41], who described an analog,
memoryless pre-distorter to solve the problem of microwave tubes. A p-th order Volterra
inversion for microwave transmit amplifiers was proposed by Biglieri [42]. Other memoryless
compensation techniques were proposed by Karam [43] and Pupolin [44]. Neural network
approaches can be seen in [45] and [19]. Good surveys of this research field can be found in
the article of Lazzarin [46] and in the PhD thesis of Wohlbier [4].

Pre-distortion is used in other fields as well, e.g. for power amplifiers [47],
laser diodes [48] or cathode ray tubes [49].

Audio related articles typically deal with reducing the nonlinearities of loud-speakers or
complete audio systems. Closed-loop system structures were proposed as early as 1977 by
Black [50] and in 1983 by Adams [51], who introduced a kind of system re-design. The pioneer
of the pre-distortion field was A. J. M. Kaizer, who made the first loud-speaker models based
on truncated Volterra-series in 1987 [52]. Solutions for loud-speakers based on
Volterra-filters were proposed by Klippel [53, 54, 55, 56, 57, 58] and Schurer [38, 59].
Adaptive nonlinear compensators were proposed by Klippel [57] and Sternad [60]. Bellini
proposed a solution based on inverting the analytical sound pressure level characteristic of
the loud-speaker [61].

Other algorithms were proposed for eliminating acoustic echo by Stenger and Rabenstein
[62, 63, 64, 65]. These were based on scalable nonlinearity functions for cancelling
nonlinear distortions in hands-free phone systems. The nonlinear function is described by a
polynomial series, where the coefficients of the series are the parameters of the nonlinear
function. The method can adapt to changes in the parameters of the distortion and can be
extended to handle nonlinearities with memory.

In all cases the main problem is to identify the characteristic of the nonlinear system.
In some studies the nonlinear characteristic is assumed to be given, while others propose
identification techniques.

3.3 Post-distortion

While system re-design and pre-distortion are relatively simple tasks, post-distortion is a
more difficult one. The difficulty arises because most post-distortion processes are
ill-posed. This is also the case for optical soundtracks.

A problem characterized by the equation f (x) = y is well-posed, if the following condi-

tions – introduced by Hadamard in the early 1900’s – are satisfied [66]:

– the solution exists for each element y in the range of Y ;

– the solution x is unique;

– small perturbations in y result in small perturbations in the solution x without the

need to impose additional constraints.

If any of the above conditions is violated, the problem is said to be ill-posed.

Ill-posed problems arise in many different fields, such as measurement technology
[67], spectroscopy [68], optical measurements [69], image restoration [70, 71], high voltage
measurements [72, 73, 74] and RC network identification [75]. Several solutions were proposed
for linear problems, based on filtering techniques, regularization operators, singular value
decomposition, etc. (a good overview of these methods can be found e.g. in [76] or [77]).
However, relatively few works deal with the ill-posed problems of nonlinear signal
reconstruction. In the following, these problems will be examined in detail.

Figure 3.2: Block-scheme of post-distortion: x → N(·) → y; the noise n is added to y to give o; o → P(·) → x̂.

In the case of nonlinear post-distortion it is usually the third condition that is violated:
small perturbations in the measurement result in large deviations in the solution. The
block-scheme of post-distortion can be seen in Fig. 3.2. In this case the noise source is
before the inverse stage, and in many cases the noise level is not negligible. If the inverse
system amplifies the signal, the noise will also be amplified. The amplification can be so
strong that the amplified noise covers the original signal.

A simulation example of noise amplification can be seen in Figs. 3.3–3.5. In this
simulation the original sinusoidal signal was distorted by a Gaussian error function. The
signal-to-noise ratio was 50 dB. After restoration, the noise was amplified at the top part of
the sinusoid, where the nonlinear curve was nearly flat.
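The effect can be reproduced with a short sketch. The offset and amplitude of the sinusoid, the record length, the random seed and the bisection-based inverse are assumptions; only the error-function distortion and the 50 dB signal-to-noise ratio are taken from the description above.

```python
import math
import random

def erf_inverse(o, lo=-6.0, hi=6.0, iters=80):
    """Invert the monotone function erf by bisection.

    Observations outside [-1, 1] are clipped, which yields a very large
    but finite estimate instead of an undefined one.
    """
    o = min(max(o, -1.0), 1.0)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if math.erf(mid) < o:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def noise_amplification_demo(n=2000, snr_db=50.0, seed=1):
    """Return the RMS restoration error at the bottom and at the top of the sinusoid."""
    rng = random.Random(seed)
    x = [1.0 + math.sin(2.0 * math.pi * k / n) for k in range(n)]
    y = [math.erf(v) for v in x]
    y_rms = math.sqrt(sum(v * v for v in y) / n)
    sigma = y_rms * 10.0 ** (-snr_db / 20.0)   # noise level for the requested SNR
    o = [v + rng.gauss(0.0, sigma) for v in y]
    x_hat = [erf_inverse(v) for v in o]

    def rms_error(select):
        errs = [(xh - xv) ** 2 for xh, xv in zip(x_hat, x) if select(xv)]
        return math.sqrt(sum(errs) / len(errs))

    return rms_error(lambda v: v < 0.5), rms_error(lambda v: v > 1.5)
```

The second returned value is much larger than the first, because the slope of erf is close to one near zero but falls to a few percent near x = 2, so the exact inverse multiplies the noise there by the reciprocal of that slope.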

Given an ill-posed problem various schemes are available for defining an associated prob-

lem which is well-posed [66]. This approach is referred to as regularization of the ill-posed

problem. In particular, an ill-posed problem may be regularized by

1. changing the definition of what is meant by an acceptable solution,

2. changing the space to which the acceptable solution belongs,

3. revising the problem statement,

4. introducing regularization operators and

5. introducing probabilistic concepts to obtain a stochastic extension of the original de-

terministic problem.

Figure 3.3: Original, input signal (x in Fig. 3.2), plotted against time.

Figure 3.4: Distorted and noisy, observed signal (o in Fig. 3.2), o = erf(x), plotted against time.

Figure 3.5: Reconstructed signal by the exact inverse of the nonlinear distortion (x̂ in Fig. 3.2), plotted against time.

Inversion problems have been extensively studied since 1960. In the early 1960s Tikhonov
began to produce an important series of papers on ill-posed problems. He defined a class of
regularisable ill-posed problems and introduced the concept of a regularising operator, which
was used in the solution of these problems [78].

While for linear ill-posed problems a very comprehensive regularization theory is
available, the development of regularization methods for nonlinear ill-posed problems and the
corresponding theory is a quite young and very active field of research with many open
questions [79]. The rigorous analysis of Tikhonov regularization in the nonlinear context was
initiated only in 1989 by Engl, Kunisch and Neubauer [80].

Since nonlinear equations generally do not have an analytical solution, these algorithms
are mostly iterative ones [81]. In this case there are two places in the algorithms where
regularization operators can be used:

– regularization may be required to make the solution well-posed,

– regularization may be required to avoid divergence of the iterative algorithm.

These techniques will be introduced in the next three sections.

Another class of algorithms for handling nonlinear ill-posed problems is based on
probabilistic concepts, such as Bayesian algorithms and Markov-chain Monte-Carlo methods [25].
The aim of these techniques is to create a parametric model of the original, undistorted and
noiseless signal, then to find the probable parameters of this model based on the noisy and
distorted observation, and hence to recreate the original signal. These techniques will be
introduced in section 3.3.4.

3.3.1 Regularization of the solution

Let us consider the following nonlinear problem:

y = N (x) (3.1)

Our goal is to best approximate the solution of eq. (3.1) in the situation when the exact
data, $y$, are not precisely known and only perturbed data, $o$, with

$$\|y - o\| \le \delta \tag{3.2}$$

are available. Here, $\delta$ is called the noise level. This problem is usually ill-posed,
because the third condition of Hadamard is not satisfied: small perturbations in $o$ produce
large perturbations in the estimate of $x$ (denoted by $\hat{x}$ in the following), just like
in the example of section 3.3.

A commonly used method for solving this problem is Tikhonov regularization. In Tikhonov
regularization, eq. (3.1) is replaced by a minimization problem, where not only the
prediction error, $N(\hat{x}) - o$, is minimized, but also other terms, which are connected
with the estimated input signal. A practical realization of this minimization problem is

$$\|N(\hat{x}) - o\| + \lambda \|\hat{x} - x_c\| \to \min, \tag{3.3}$$

where $\lambda > 0$ is the regularization parameter and $x_c$ is some center value, ideally
chosen as the critical point of interest, but often just set to zero [82]. In this case, when
we look for the value of $\hat{x}$ that minimizes eq. (3.3), deviations between our initial
guess, $x_c$, and our estimate, $\hat{x}$, are penalized, hence large deviations caused by
noise are not allowed.

In eq. (3.3), it is not obligatory to use the norm of $\hat{x} - x_c$. Using other norms
leads to the generalized Tikhonov regularization, which can be expressed as

$$\|N(\hat{x}) - o\| + \lambda \|R\{\hat{x}\} - R\{x_c\}\| \to \min, \tag{3.4}$$

where $R\{\cdot\}$ is the generalized regularization operator [79, 83].
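For a single sample, the minimization of eq. (3.3) can be sketched with a brute-force grid search. Squared norms are used here, which is the common form of the Tikhonov functional but is an assumption with respect to the equation as written; the error-function distortion, the search interval and the names are also illustrative assumptions.

```python
import math

def erf_distortion(x):
    """Assumed memoryless nonlinearity N(x) = erf(x)."""
    return math.erf(x)

def tikhonov_estimate(o, N, lam, x_c=0.0, lo=-5.0, hi=5.0, steps=20001):
    """Minimize (N(x) - o)^2 + lam * (x - x_c)^2 over a uniform grid on [lo, hi]."""
    best_x, best_cost = lo, float("inf")
    for k in range(steps):
        x = lo + (hi - lo) * k / (steps - 1)
        cost = (N(x) - o) ** 2 + lam * (x - x_c) ** 2
        if cost < best_cost:
            best_x, best_cost = x, cost
    return best_x
```

A grid search is slow but makes no assumption about the shape of the cost function. For an observation that noise has pushed just beyond the range of the nonlinearity, such as o = 1.01, the unregularized estimate (lam = 0) runs to the edge of the search interval, while a small positive lam keeps the estimate at a moderate value near x_c.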


One possibility could be maximum entropy regularization

$$\|N(\hat{x}) - o\| + \lambda \int_{\Omega} \hat{x}(t) \log \frac{\hat{x}(t)}{x_c(t)}\, dt \to \min, \quad \hat{x} \in \Omega, \tag{3.5}$$

where $x_c(t)$ is some initial guess about $x(t)$, such as in eq. (3.3). In this case $x_c$ is
often just 1. For further explanation and examples of nonlinear maximum entropy
regularization, see for example [84, 85, 86, 87].

Another commonly used possibility is bounded variation regularization

$$\|N(\hat{x}) - o\| + \lambda \int_{\Omega} \left| \frac{d\hat{x}(t)}{dt} \right| dt \to \min, \quad \hat{x} \in \Omega, \tag{3.6}$$

which enhances sharp features in $\hat{x}$ as needed in, e.g., image reconstruction, see
[88, 89, 90, 71, 91].

In the case of monotone nonlinear functions, where

$$N(x_2) - N(x_1) \ge 0 \quad \text{if} \quad x_2 - x_1 \ge 0, \tag{3.7}$$

the least squares minimization can be avoided and one can use the simpler regularized
equation

$$N(\hat{x}) + \lambda (\hat{x} - x_c) = o, \tag{3.8}$$

which is called Lavrentiev regularization or the method of singular perturbation [92]. This
method preserves the original structure of the problem and can sometimes lead to
easily implemented localized approximation strategies [93].
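For a monotone non-decreasing N and λ > 0, the left-hand side of eq. (3.8) is strictly increasing in x̂, so a single sample can be solved by bisection. The sketch below assumes an error-function distortion and a fixed bracketing interval; the names are hypothetical.

```python
import math

def erf_distortion(x):
    """Assumed monotone nonlinearity N(x) = erf(x)."""
    return math.erf(x)

def lavrentiev_estimate(o, N, lam, x_c=0.0, lo=-10.0, hi=10.0, iters=100):
    """Solve N(x) + lam * (x - x_c) = o for x by bisection on [lo, hi]."""
    def residual(x):
        return N(x) + lam * (x - x_c) - o

    if residual(lo) > 0.0 or residual(hi) < 0.0:
        raise ValueError("the root is not bracketed by [lo, hi]")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Unlike the exact inverse, the regularized equation has a finite solution even when the observation lies outside the range of N, for example o = 1.01 with the error function.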

Since eqs. (3.3)–(3.6) and (3.8) are nonlinear equations, their analytical solution
is generally not possible. The problem is commonly solved by iterative methods, which
will be discussed in the next section.

3.3.2 Regularization of the iteration

The first candidate for solving eq. (3.1) in an iterative way could be Newton's method [81],
that is, the iterative solution of the output least squares problem

$$\|o - N(\hat{x})\| \to \min, \tag{3.9}$$

where $\|\cdot\|$ corresponds to the L2 norm. (Of course, regularization methods can also be
used on all the other equations discussed in the previous section, but for simplicity and for
easy understanding, the iterative methods will be shown on eq. (3.9).) In this case eq. (3.9)
simplifies to

$$\left. \frac{dN(\xi)}{d\xi} \right|_{\xi=\hat{x}} (o - N(\hat{x})) = 0. \tag{3.10}$$

From eq. (3.10), Newton's method can be described as

$$\hat{x}_{k+1} = \hat{x}_k + \left( \left. \frac{dN(\xi)}{d\xi} \right|_{\xi=\hat{x}_k} \right)^{-1} (o - N(\hat{x}_k)), \tag{3.11}$$

starting from an initial guess, $\hat{x}_0$. Even if the iteration is well defined and
$dN(\xi)/d\xi$ is invertible for every $\hat{x}$, the inverse is usually unbounded for
ill-posed problems. Hence eq. (3.11) is inappropriate in this case, since each iteration
means solving a linear ill-posed problem, and some regularization technique has to be used
instead. Applying Tikhonov regularization yields the Levenberg-Marquardt method [94]

$$\hat{x}_{k+1} = \hat{x}_k + \frac{1}{\left( \left. \frac{dN(\xi)}{d\xi} \right|_{\xi=\hat{x}_k} \right)^2 + \lambda_k} \left. \frac{dN(\xi)}{d\xi} \right|_{\xi=\hat{x}_k} (o - N(\hat{x}_k)), \tag{3.12}$$

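where $\lambda_k$ is a sequence of positive numbers. Augmenting eq. (3.12) by the term

$$- \frac{1}{\left( \left. \frac{dN(\xi)}{d\xi} \right|_{\xi=\hat{x}_k} \right)^2 + \lambda_k} \lambda_k (\hat{x}_k - x_c) \tag{3.13}$$

for additional stabilization gives the iteratively regularized Gauss-Newton method [95]

$$\hat{x}_{k+1} = \hat{x}_k + \frac{1}{\left( \left. \frac{dN(\xi)}{d\xi} \right|_{\xi=\hat{x}_k} \right)^2 + \lambda_k} \left[ \left. \frac{dN(\xi)}{d\xi} \right|_{\xi=\hat{x}_k} (o - N(\hat{x}_k)) - \lambda_k (\hat{x}_k - x_c) \right]. \tag{3.14}$$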

The other widely used iterative method is the steepest descent method [96]

$$\hat{x}_{k+1} = \hat{x}_k - \delta_k \left. \frac{dN(\xi)}{d\xi} \right|_{\xi=\hat{x}_k}, \tag{3.15}$$

where $\delta_k$ is an appropriately chosen value or sequence of values. If

$$\delta_k = N(\hat{x}_k) - o, \tag{3.16}$$

this leads to the so-called Landweber iteration [97]

$$\hat{x}_{k+1} = \hat{x}_k + \left. \frac{dN(\xi)}{d\xi} \right|_{\xi=\hat{x}_k} (o - N(\hat{x}_k)). \tag{3.17}$$

Another nonlinear iterative method that is based on the steepest descent algorithm is [98]

$$\hat{x}_{k+1} = \hat{x}_k + \lambda\, (o - N(\hat{x}_k)). \tag{3.18}$$

For a more detailed explanation of techniques based on Newton's method see e.g. [99].
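The scalar iterations of eqs. (3.12), (3.17) and (3.18) can be sketched as follows. The error-function distortion, the constant regularization parameter, the fixed iteration counts and the names are assumptions made for illustration; convergence of these fixed-step iterations is not guaranteed for an arbitrary nonlinearity.

```python
import math

def erf_distortion(x):
    """Assumed nonlinearity N(x) = erf(x)."""
    return math.erf(x)

def erf_derivative(x):
    """dN/dx for N = erf."""
    return 2.0 / math.sqrt(math.pi) * math.exp(-x * x)

def levenberg_marquardt(o, N, dN, x0, lam=1e-2, iters=50):
    """Eq. (3.12) with a constant regularization parameter lam."""
    x = x0
    for _ in range(iters):
        slope = dN(x)
        x = x + slope * (o - N(x)) / (slope * slope + lam)
    return x

def landweber(o, N, dN, x0, iters=200):
    """Eq. (3.17): step along the derivative, scaled by the output error."""
    x = x0
    for _ in range(iters):
        x = x + dN(x) * (o - N(x))
    return x

def fixed_point(o, N, x0, lam=0.5, iters=200):
    """Eq. (3.18): derivative-free correction by the scaled output error."""
    x = x0
    for _ in range(iters):
        x = x + lam * (o - N(x))
    return x
```

For a noise-free observation o = erf(0.8) and the starting value 0, all three iterations converge to 0.8. The Levenberg-Marquardt step stays bounded by 1/(2·sqrt(lam)) times the output error even where the slope of N approaches zero, which is where the plain Newton step of eq. (3.11) grows without limit.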


3.3.3 Choosing the value of the regularization parameter

One important question in the application of regularization methods is the proper choice
of the regularization parameter, $\lambda$. Let us see the equation of the Tikhonov
regularization problem again:

$$\|N(\hat{x}) - o\| + \lambda \|\hat{x} - x_c\| \to \min. \tag{3.19}$$

If we choose $\lambda$ near zero, the regularization will be too weak. The solution,
$\hat{x}$, tends to the original, ill-posed result, that is, the solution of the output least
squares problem, eq. (3.9). If $\lambda$ approaches infinity, the result will be
overregularized. The output norm becomes negligible compared to $\lambda \|\hat{x} - x_c\|$.
In this case the solution will be well-posed; however, it tends to $x_c$. The result will be
our initial guess, which could be a strongly distorted estimate (for example simply zero).
The optimal solution can be found at an optimum value $\lambda^*$ that lies somewhere between
0 and $\infty$.

Several methods were proposed for finding an optimum $\lambda^*$ in the case of linear
problems. A commonly used method is the Generalized Cross Validation (GCV).

The underlying principle in cross validation is that if an arbitrary observation is left out
from $o$, then its input can be well predicted using the solution calculated from the
optimally regularized remaining observations. GCV is based on the same principle and, in
addition, ensures that the regularization parameter found has some desirable invariance
properties, such as being invariant to an orthogonal transformation (which includes
permutations) of the data. For the linear problem, $A x = b$, this leads to choosing the
regularization parameter as the minimizer of the following function

$$G(\lambda) = \frac{\|A \hat{x} - b\|^2}{\left[ \operatorname{trace}\left( I - A (A^T A + \lambda^2 I)^{-1} A^T \right) \right]^2} \tag{3.20}$$

For more explanation see e.g. [100] or [101].

Glans [102] proposed a method based on minimizing the imaginary part of $\hat{x}$ that is
produced by the numerical errors of the computation method. This technique seems quite
unreliable, and it has neither a heuristic nor a formal proof. Instead of
this method, Daboczi [103, 104] proposed a systematic iterative method for finding
$\lambda$ in the case of impulse signals, based on a rough signal model. Chen [105] proposed
a solution for the deconvolution of noisy images even if the point-spread-function (the
linear, two-dimensional filter function that distorted the original image) is not exactly
known. Roy proposed a method based on the difference norm calculated from the linearly
distorted observation and its further distorted version with the same linear distortion
[106]; however, this method also has neither a formal nor a heuristic proof. Solutions based
on probabilistic approaches were proposed in [71] and [107].

Among the iterative techniques, Bertocco [108] published a method for the iterative
deconvolution of step-response signals that estimates the noise spectrum from the flat part
of the signal and the signal spectrum from the changing part. Parruck's method [109] is based
on similar assumptions.

In the case of nonlinear iterative problems, Engl was the first to analyse how the
convergence rate depends on the regularization term for iteratively solved maximum
entropy and Tikhonov regularization [80, 85]. Haber examined these problems rigorously and
collected the possible methods in [101] and [99]. These methods are based on simple
continuation, or cooling [99]: they start with a relatively large value of $\lambda$ and then
gradually reduce it. If the result is deemed to be unacceptable, $\lambda$ is increased by a
certain factor.

A combination of Tikhonov regularization and the gradient method was proposed by Ramlau
[110].

For nonlinear Tikhonov regularization Morozov proposed a so-called discrepancy rule
[111], in which the regularization parameter is chosen as the solution of

$$\|N(\hat{x}, \lambda) - o\| = C\delta, \quad C \ge 1, \tag{3.21}$$

where $\delta$ is the estimate of the norm of the noise [112].

Another heuristic method is the L-curve technique developed by Hansen [113]. This
method does not have a formal proof; however, it is often used because of its simplicity
[101]. The L-curve is made by plotting the log of the misfit, $\|N(\hat{x}) - o\|$, as a
function of $\log \|\hat{x}\|$, both obtained for different regularization parameters. This
plot has a typical L-shape. Hansen claimed that the best model norm for a small misfit is
obtained at the corner of the L-curve.
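The discrepancy rule of eq. (3.21) can be illustrated on a toy problem in which the regularized solution is available in closed form. The sketch assumes a linear, diagonal operator (each observation is o_i = a_i x_i plus noise), squared norms in the Tikhonov functional and x_c = 0; none of these simplifications comes from the cited works, and the names are hypothetical.

```python
import math

def tikhonov_diag(a, o, lam):
    """Minimizer of sum (a_i x_i - o_i)^2 + lam * sum x_i^2, component by component."""
    return [ai * oi / (ai * ai + lam) for ai, oi in zip(a, o)]

def residual_norm(a, o, lam):
    """Misfit ||A x_hat - o|| of the regularized solution for a given lam."""
    x_hat = tikhonov_diag(a, o, lam)
    return math.sqrt(sum((ai * xi - oi) ** 2 for ai, xi, oi in zip(a, x_hat, o)))

def discrepancy_lambda(a, o, delta, C=1.0, lo=1e-12, hi=1e6, iters=200):
    """Find lam with residual_norm = C * delta by bisection on a logarithmic scale.

    The misfit grows monotonically with lam, from 0 towards ||o||.
    """
    target = C * delta
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if residual_norm(a, o, mid) < target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)
```

The rule accepts the largest amount of regularization whose misfit is still explained by the assumed noise level; a solution exists only if C·delta is smaller than the norm of the observation.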

For Lavrentiev regularization, constraints were given for $\lambda$ in [114], but there are
no special methods to determine its exact value.

3.3.4 Bayesian techniques

Bayesian nonlinear restoration techniques are based on nonlinear time series. Many models
are possible for nonlinear time series (see e.g. [28, 26]). In the audio field nonlinear
autoregressive (NAR) models are widely used [115]. A commonly used representation of NAR
models is the cascaded NAR model:

$$y_t = x_t + \sum_{i=1}^{\eta_b} \sum_{j=1}^{i} \beta_{(i,j)} b_{(i,j)}\, y_{t-i} y_{t-j} + \sum_{i=1}^{\eta_b} \sum_{j=1}^{i} \sum_{k=1}^{j} \beta_{(i,j,k)} b_{(i,j,k)}\, y_{t-i} y_{t-j} y_{t-k} + \text{higher degree terms}, \tag{3.22}$$

where $y_t$ is the t-th sample of the distorted signal, of which we can make a (noisy)
observation, $b_{(i,j)}$, $b_{(i,j,k)}$ are the weighting parameters of the NAR process,
$\beta_{(i,j)}$, $\beta_{(i,j,k)}$ are binary indicators with values 0 or 1, which decide
whether a weighting parameter is used, $\eta_b$ is the maximum lag of the model and $x_t$ is
the undistorted signal, modeled as an autoregressive process

$$x_t = e_t + \sum_{i=1}^{k} a_i x_{t-i}. \tag{3.23}$$

A major advantage of this model formulation is that the inverse of the nonlinear stage is

a straightforward nonlinear moving average (NMA) filter, which is guaranteed to be stable.

Hence it is simple to reconstruct the signal xt from yt for a given set of NAR parameters

[25].
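The relation between the NAR distortion and its moving average inverse can be sketched for the second-degree terms of eq. (3.22). Restricting the model to degree two, representing the active terms (those with indicator value 1) as a dictionary, and using zero initial conditions are assumptions made for illustration.

```python
def nar_distort(x, b2):
    """Second-degree NAR stage: y[t] = x[t] + sum b2[(i, j)] * y[t-i] * y[t-j].

    b2 maps lag pairs (i, j) with i >= j >= 1 to weights; a term whose
    indicator is 0 is simply left out of the dictionary.
    """
    y = []
    for t in range(len(x)):
        value = x[t]
        for (i, j), weight in b2.items():
            if t - i >= 0 and t - j >= 0:
                value += weight * y[t - i] * y[t - j]
        y.append(value)
    return y

def nma_restore(y, b2):
    """Nonlinear moving average inverse: x[t] = y[t] - sum b2[(i, j)] * y[t-i] * y[t-j]."""
    x = []
    for t in range(len(y)):
        value = y[t]
        for (i, j), weight in b2.items():
            if t - i >= 0 and t - j >= 0:
                value -= weight * y[t - i] * y[t - j]
        x.append(value)
    return x
```

The inverse uses only past samples of the observed signal y and contains no feedback, which is why it cannot become unstable, whereas the forward recursion feeds its own output back.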

The signals and the parameters are modeled as random variables, usually with a Gaussian
or multivariate Gaussian distribution. The correct parameter values of the NAR model can
be estimated by finding the values that have the maximum probability. The parameter
search can be carried out by Monte-Carlo methods [2, 116, 117], simulated annealing [25] or
any other optimum-searching algorithm.

The advantage of this method is that it can work when there is no a priori
information about either the input signal or the shape of the nonlinear distortion function.
A disadvantage is that a priori information can hardly be incorporated into the process.
Another problem is that the optimum searching algorithm can get stuck in local minima and
requires high computational power. A further serious problem is that the running time of the
optimum searching algorithm for a given task is unknown, therefore applications realized with
this method cannot be used in real-time environments.

3.4 Audio related post-distortion techniques for reducing nonlinear distortions

Several solutions have been developed for nonlinear pre-compensation of audio devices, such
as pre-compensation of hi-fi sets, nonlinear echo cancellation of mobile sets, compensation
of loudspeakers, etc.; however, relatively little work has been done in the field of
nonlinear post-compensation. In the following, these works will be discussed.

3.4.1 Histogram equalization

Histogram equalisation is a simple technique to estimate a memoryless nonlinear transfer

function through which a speech signal has been passed [118]. A smooth function is fitted to

the histogram of sample values from an extract of the signal. This is compared to a reference

histogram shape, based on analysis of a range of speakers, and a 1:1 mapping is derived which

will make the smoothed histogram conform with the reference one. This mapping is then

applied to the distorted signal.
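The mapping step can be sketched by rank-based matching of the distorted samples to reference samples. This sketch omits the histogram smoothing described above and uses a set of reference samples in place of a fitted reference histogram shape; both are simplifying assumptions, and the function name is hypothetical.

```python
def histogram_match(distorted, reference):
    """Map each distorted sample to the reference value at the same quantile.

    The result has (approximately) the amplitude distribution of the reference
    while preserving the ordering of the distorted samples, so any monotone
    memoryless distortion is undone if the reference distribution is right.
    """
    ref_sorted = sorted(reference)
    n, m = len(distorted), len(ref_sorted)
    order = sorted(range(n), key=lambda k: distorted[k])
    restored = [0.0] * n
    for rank, k in enumerate(order):
        quantile = (rank + 0.5) / n
        restored[k] = ref_sorted[min(int(quantile * m), m - 1)]
    return restored
```

The quality of the restoration depends entirely on how well the reference distribution describes the undistorted signal, which is the limitation discussed for this technique.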

Because it is assumed that the original signal closely conforms to a standard reference
histogram, this method cannot readily be applied to complex music signals, where histograms
differ greatly between recordings and vary significantly over the duration of a recording.
Another problem, noted by the author, is that the algorithm is very sensitive to noise.

The algorithm was originally proposed for use in speech communication channels, and has

led to a patented device [119]. A related method has been used to restore recordings made

using early analogue-to-digital converters with non-uniform quantisation step heights and

some missed codes [120]. Since these are all small-scale, local defects, they can be reduced

by smoothing the histogram, without the need for a reference.

3.4.2 Signal reconstruction with known nonlinearity

For situations in which distortion is caused by a known memoryless nonlinearity, an iterative
algorithm has been proposed by Polchlopek [98] to reconstruct the original signal where
only a bandlimited version of the distorted signal is available. The reconstruction uses
the iterative method described in eq. (3.18). The algorithm also seems to be applicable to
certain nonlinearities with memory. The behaviour of the algorithm in the presence of noise
was not analysed.

Tsimbinos composed the inverse of the memoryless nonlinearity from orthogonal polynomials
to compensate distortions in digital radio receivers [121]. The advantage of this
method is that in the case of sinusoidal excitations, the unwanted harmonics can be filtered
out without the appearance of new harmonic components. However, the method works only
for pure sinusoidal excitations, which is not the case in general audio problems.


3.4.3 Restoration using nonlinear autoregressive models

If there is absolutely no information available about the nonlinear distortion, we have to
perform a blind compensation. Audio signals can be well represented by autoregressive models,
therefore a possible method in the case of audio signals is to use autoregressive models for
identification and compensation, such as eqs. (3.22) and (3.23) in Section 3.3.4. This method
was used by Troughton for eliminating tape saturation [2, 116]. The method can also
handle nonlinearities with memory.

The disadvantage of this method is that the correct model order of the autoregressive
models is not known. The correct parameters are also not known. These data can be
found only by optimum searching algorithms; however, these algorithms may not find the
true parameters, as they may get stuck in local minima. In this case the resulting signal
could be even more distorted than the original one.


Chapter 4

The nonlinear characteristic of movie film

4.1 Image formation

Optical recording of sound and motion pictures is made on photosensitive materials. These
materials are carried on thin film rolls. Formerly the carrier was made of cellulose nitrate,
later of cellulose acetate; nowadays it is made of polyester-based plastic. The carrier is
coated with a photosensitive layer. A normal photosensitive layer consists of a very large
number of tiny crystals (grains) of silver halide embedded in a layer of gelatin. The
combination of grains and gelatin is often referred to as the photographic emulsion [122].

When a picture is taken, the optical image is projected onto the photosensitive layer for
a fraction of a second. In ordinary practice this exposure produces no visible change in the
appearance of the emulsion. The exposed emulsion, however, contains an invisible latent image
of the light pattern that can readily be converted into a visible silver image by the action
of a developing agent. The latent image is formed by ionization of silver in the silver-halide
crystals, which produces very small silver specks (a few atoms in size) on the crystal and
defects in the crystal structure. During development, if a crystal has been adequately
exposed, these mutations accelerate the chemical reaction between the mutated crystal and the
developing agent, so that the crystal is rapidly reduced to a metallic silver grain. This is
the so-called print-out. The reaction of crystals that were not (or inadequately) exposed is
two or more orders of magnitude slower [123].

The development of a silver-halide crystal can be treated as a binary process. If the crystal
contains enough mutations, it will be completely transformed into silver during development. An


inadequately exposed crystal will be practically untouched. Since the crystals are absolutely

isolated from each other due to the gelatin carrier, the status of a crystal will be independent

of the status of the neighbouring crystals.

Through this process, the light amplitude distribution is recorded as the amount of silver
on the developed film. Since the silver grains are black, we get a black-and-white negative
copy of the original optical image. However, the relationship between the amount of silver
and the exposure is not linear.

4.2 Relationship between silver mass and transparency

In practice, the reduction of the transparency of the layer is of more interest than the
quantity of silver. Transparency (T) is defined as the ratio of the flux transmitted ($P_t$)
to that incident ($P_o$) on a uniformly exposed and processed area that is large compared to
the area of a grain [20]:

$$T = \frac{P_t}{P_o}. \tag{4.1}$$

In their classical paper [124], Hurter and Driffield proposed a new measure, the opacity:

$$O = \frac{1}{T}. \tag{4.2}$$

They also proposed representing the relationship between light exposure and the opacity of
the developed film in logarithmically scaled graphs, because this is more descriptive for
visual and photographic reproduction than absolute values or transparency vs. exposure. The
logarithm of the opacity is termed density (D):

$$D = \log(O) = -\log(T). \tag{4.3}$$
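The definitions in eqs. (4.1) to (4.3) can be illustrated with a minimal Python sketch. The function names are hypothetical, and the base-10 logarithm is an assumption: the text writes only "log", although base 10 is the usual convention for photographic density.

```python
import math


def opacity(transparency: float) -> float:
    """Opacity O = 1 / T, eq. (4.2)."""
    if not 0.0 < transparency <= 1.0:
        raise ValueError("transparency must lie in (0, 1]")
    return 1.0 / transparency


def density(transparency: float) -> float:
    """Density D = log(O) = -log(T), eq. (4.3); base 10 is assumed here."""
    return math.log10(opacity(transparency))


def transparency_from_density(d: float) -> float:
    """Inverse of eq. (4.3) under the same base-10 assumption."""
    return 10.0 ** (-d)


if __name__ == "__main__":
    for t in (1.0, 0.5, 0.1, 0.01):
        print(f"T = {t:5.2f}  O = {opacity(t):7.2f}  D = {density(t):5.3f}")
```

Under this assumption, each tenfold reduction of the transparency adds one unit of density.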

The value of D depends on the emulsion and on the magnitude, duration and spectral behaviour
of the exposing light. Usually the quantity of light received per unit area is of greatest
interest. This is called exposure and is denoted by E. E can be expressed as

$$E = \int_0^T I(t)\,dt, \tag{4.4}$$

where I is the intensity of light and T is the duration of exposure (here T denotes time,
not transparency).
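As a small numerical illustration of eq. (4.4), the sketch below approximates the exposure integral with the trapezoidal rule. The function name, the step count and the example intensity functions are illustrative assumptions, not part of the original work.

```python
from typing import Callable


def exposure(intensity: Callable[[float], float], duration: float,
             steps: int = 1000) -> float:
    """Approximate E = integral of I(t) dt over [0, duration], eq. (4.4)."""
    if duration < 0.0 or steps < 1:
        raise ValueError("duration must be >= 0 and steps >= 1")
    dt = duration / steps
    # Trapezoidal rule: half weight on the two end points.
    total = 0.5 * (intensity(0.0) + intensity(duration))
    for i in range(1, steps):
        total += intensity(i * dt)
    return total * dt


if __name__ == "__main__":
    # Constant intensity: the exposure reduces to intensity times duration.
    print(exposure(lambda t: 2.0, 0.02))
    # Linearly increasing intensity over the same duration.
    print(exposure(lambda t: 100.0 * t, 0.02))
```

For a constant intensity the integral reduces to the product of intensity and exposure time.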

The ratio between silver mass and density is the photometric equivalent. Its reciprocal

is the covering power and is a measure of the efficiency with which the silver mass produces


optical density. This number depends on the number of silver-halide grains per unit area and
on the average area of a grain surface, but not on the exposure [125]. Hence the transmission
and the amount of silver are proportional.

4.3 Relationship between transparency and exposure

As noted at the beginning of this chapter, the relationship between exposure and the amount
of silver (and hence the transmission) is not linear.

If a single layer of silver-halide grains of equal size and sensitivity is exposed, the
probability that a given grain will form a latent image depends entirely on the random arrival
of photons and the chance of absorption of a photon by a grain. Assuming that a grain must
absorb r quanta to become developable, the probability p that a grain absorbs exactly r quanta
from an exposure in which the mean number of absorbed quanta per grain is q is given by the
Poisson distribution:

$$p(q, r) = \exp(-q)\,\frac{q^r}{r!}. \tag{4.5}$$

Grains absorbing more than r quanta will also be developable. The probability that a grain

absorbs r or more quanta will be

$$P(q, r) = 1 - \exp(-q) \sum_{k=0}^{r-1} \frac{q^k}{k!}. \tag{4.6}$$

The most sensitive crystals in a typical photosensitive layer require 10 or more quanta.
Special emulsions used for recording X-rays or nuclear particles require 1 quantum per grain.
A typical emulsion requires about 1000 photons per grain to make half of the crystals
developable [126]. Characteristics calculated from these basic parameters for some values
of r can be seen in Fig. 4.1.
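The curves of eqs. (4.5) and (4.6) can be evaluated with a short Python sketch. The function names are hypothetical; the sketch reproduces only the formulas, not the author's own simulation code.

```python
import math


def p_exact(q: float, r: int) -> float:
    """Eq. (4.5): probability that a grain absorbs exactly r quanta."""
    if q == 0.0:
        return 1.0 if r == 0 else 0.0
    # Evaluated in the log domain to avoid overflow for large q and r.
    return math.exp(-q + r * math.log(q) - math.lgamma(r + 1))


def p_developable(q: float, r: int) -> float:
    """Eq. (4.6): probability that a grain absorbs r or more quanta."""
    term = 1.0   # q^k / k! for k = 0
    total = 0.0
    for k in range(r):
        total += term
        term *= q / (k + 1)  # advance to q^(k+1) / (k+1)!
    return 1.0 - math.exp(-q) * total


if __name__ == "__main__":
    # Fraction of developable grains vs. mean absorbed quanta per grain.
    for r in (1, 16, 32, 64, 128):
        row = [p_developable(q, r) for q in range(0, 161, 40)]
        print(f"r = {r:3d}:", " ".join(f"{v:.3f}" for v in row))
```

For r = 1 the expression reduces to 1 − exp(−q); for larger r the transition becomes steeper relative to its position, which is why the usable range of a monosized emulsion is narrow.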

The linear parts of the characteristics of monosized and uniformly sensitive emulsions are
very narrow. This is usually unsuitable for image recording. Therefore, instead of monosized
emulsions, emulsions with a lognormal grain-size distribution are usually used, in which
crystals of several different sizes, and hence different photon sensitivities, are present.
In this case the distribution of the grain size can be described by the equation:

$$p(x) = \frac{1}{(x - \Theta)\,\sigma\sqrt{2\pi}} \exp\left(-\frac{(\ln(x - \Theta))^2}{2\sigma^2}\right), \qquad x > \Theta;\ \sigma > 0 \tag{4.7}$$

where σ is the shape parameter (the standard deviation of ln(x − Θ), which sets the spread)
and Θ is the location parameter (the lower bound of x).
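A minimal Python sketch of the density in eq. (4.7) is given below. The function name and the parameter values in the example are illustrative assumptions.

```python
import math


def grain_size_pdf(x: float, sigma: float, theta: float = 0.0) -> float:
    """Lognormal grain-size density of eq. (4.7); zero for x <= theta."""
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    if x <= theta:
        return 0.0
    z = math.log(x - theta)
    norm = (x - theta) * sigma * math.sqrt(2.0 * math.pi)
    return math.exp(-z * z / (2.0 * sigma * sigma)) / norm


if __name__ == "__main__":
    # Tabulate the density for an assumed shape parameter sigma = 0.5.
    for x in (0.25, 0.5, 1.0, 2.0, 4.0):
        print(f"x = {x:4.2f}  p(x) = {grain_size_pdf(x, 0.5):.4f}")
```

The density is skewed, with a long tail towards large grains, so a small number of large grains coexist with many small ones.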


Figure 4.1: Characteristic of exposure vs. developable crystals in a monosized silver-halide
layer for different photon quanta sensitivity (r). [Plot: relative amount of silver (0 to 1)
vs. number of photons (0 to 160); curves for r = 1, 16, 32, 64 and 128.]

The photon sensitivity of grains of the same size (one size class) is also not uniform. A
single size class has a sensitivity distribution that is also close to lognormal. In commonly
used photosensitive emulsions, the variance of the sensitivity of a size class is about the
same as the variance of grain size (70% to 170% of the variance of sensitivity) [20].

The simulated exposure vs. (1 − transmission) characteristic of a typical emulsion can be
seen in Fig. 4.2 (the MATLAB simulation file can be found in Appendix C). The logarithmic
exposure vs. density characteristic can be seen in Fig. 4.3. The usable part of the
characteristic, where the change of the output is sufficiently large, is now much wider:
about two decades. However, the whole characteristic is far from linear. There is no truly
linear part; only a small part near the beginning can be approximated as linear.

The characteristic begins with a constant part, where the particles are still insensitive
to the light intensity. The transmission here, however, is not one but slightly smaller. This
is caused by crystal imperfections created during the manufacture of the photosensitive layer,
which cause a basic blackness in the image. This basic blackness is called photographic fog
or veil.

The constant part is followed by a toe, then by an interval that can be represented in the
linear graph by the following equation [40]:

$$1 - T = (1 - T_{sat} - T_{fog})\,(E - E_0)^{\gamma} + T_{fog}, \tag{4.8}$$

where $T_{sat}$ is the transmission at saturation, $T_{fog}$ is the basic transmission and γ
is a constant that depends on the photosensitive material. This is the so-called gamma-curve.
Negative photographic and film materials usually have γ smaller than one, while positive
photosensitive materials have γ greater than one.

Figure 4.2: Exposure vs. 1-transmission characteristic of a typical emulsion. [Plot:
1 − transmission (0 to 1) vs. exposure (0 to 1).]

Figure 4.3: Logarithmic exposure vs. density characteristic of a typical emulsion. [Plot:
density (0 to 1.4) vs. log exposure (10^-5 to 10^0).]
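The gamma-curve of eq. (4.8) can be sketched in Python as follows. The function name and all numerical values are hypothetical, and the sketch assumes a normalised exposure such that E − E_0 lies between 0 and 1 on this part of the characteristic.

```python
def one_minus_transmission(e: float, e0: float, gamma: float,
                           t_sat: float, t_fog: float) -> float:
    """Eq. (4.8): 1 - T on the gamma-curve part of the characteristic.

    Assumes a normalised exposure with 0 <= e - e0 <= 1 (an assumption of
    this sketch, not stated in the text).
    """
    if e < e0:
        raise ValueError("eq. (4.8) is only used for e >= e0")
    return (1.0 - t_sat - t_fog) * (e - e0) ** gamma + t_fog


if __name__ == "__main__":
    # Hypothetical parameters: a negative-type (gamma < 1) and a
    # positive-type (gamma > 1) material with the same fog and saturation.
    for gamma in (0.6, 1.8):
        values = [one_minus_transmission(e / 4.0, 0.0, gamma, 0.05, 0.02)
                  for e in range(5)]
        print(f"gamma = {gamma}:", " ".join(f"{v:.3f}" for v in values))
```

With γ below one the curve rises quickly and then flattens, while with γ above one it starts slowly and steepens, which matches the distinction between negative and positive materials.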

After the gamma-curve part of the characteristic, the emulsion becomes more and more
saturated. At extremely high light intensities there is an interval in which the transmission
begins to increase and the density to decrease (this part is not shown in Fig. 4.3). This
part is called solarisation. The effect is caused by special secondary chemical effects. Such
light intensities cannot be reached in the sound stripe of the film, so we do not have to
deal with it.


Chapter 5

Imperfections in the optical sound-recording techniques

5.1 Introduction

In professional sound films, many methods have been used for sound recording. Since the
1990s, digital sound-recording technologies have been preferred almost exclusively, because
they offer high sound quality and can be copied easily. Before the digital age, however, only
analogue methods existed. In the film industry, these techniques were usually based on optical
sound projection. Magnetic recording has also been used in the film industry since the 1950s.
Although magnetic recording had lower distortion than the optical methods, it was not as
widespread, since copying this kind of film is much more difficult and magnetic sound degrades
much more quickly with every playback. Therefore, before 1990 the optical sound-recording
methods were typically used. Before the 1950s, optical sound recording was the only technique
known in the film industry.

The advantage of optical sound-recording methods in film-making is that the sound can be
easily copied together with the film without any additional technology. Another advantage is
that nothing has to touch the surface of the film during sound recording and reproduction,
so the sound on the film is not degraded by reproduction. However, optical sound-recording
techniques have disadvantages as well. One is the quite high distortion level, which comes
from the nonlinear behaviour of the photosensitive materials; the other is the quite high
noise level. The following sections explain the possible optical sound-recording techniques
and describe the distortions associated with them.


Figure 5.1: Schematic diagram of variable density method.

5.2 Optical sound-recording techniques

There are two different methods of optical sound recording. One was developed by Western
Electric and Fox Movietone and is called the variable density method. The other was developed
by RCA and is called the variable area method. The variable area method is still in use for
sound recording. The variable density method was used only until the 1970s. However,

from
