intro auditory system - Argentina.gob.ar

Post on 27-Apr-2022


intro auditory system

marcelo magnasco

woods hole, august 2009; c.a.b., october 2009

what exactly is the color “brown”?

our senses parse entire scenes at a time, separating attributes of the objects being looked at (albedo of surfaces) from the attributes of the scene (direction of incident light) from the attributes of our own relationship to the scene (“shiny” reflectivity) into separate percepts.

therefore concepts such as “color” end up being extremely complex because they are anchored in the relationship of an object to an entire scene to the viewer

we mix audio and video

•  localization of a simultaneous click and flash (which are displaced from one another) shows we trust hearing ~1/3 and vision ~2/3

•  manipulation of the video stream of a human vocalization shows we “hear” different syllables when the video and audio are mismatched

a rich auditory world

I hear the rain pattering on the roof above me, dripping down the walls to my left and right, splashing from the drainpipe at ground level on my left, while further over to the left there is a lighter patch as the rain falls almost inaudibly upon a large leafy shrub. On the right, it is drumming, with a deeper, steadier sound upon the lawn. I can even make out the contours of the lawn, which rises to the right in a little hill. The sound of the rain is different and shapes out the curvature for me. Still further to the right, I hear the rain sounding upon the fence which divides our property from that next door. In front, the contours of the path and the steps are marked out, right down to the garden gate. Here the rain is striking the concrete, here it is splashing into the shallow pools which have already formed. Here and there is a light cascade as it drips from step to step. The sound on the path is quite different from the sound of the rain drumming into the lawn on the right, and this is different again from the blanketed, heavy, sodden feel of the large bush on the left. Further out, the sounds are less detailed. I can hear the rain falling on the road, and the swish of the cars that pass up and down. I can hear the rushing of the water in the flooded gutter on the edge of the road.

John Hull, Touching the Rock

numbers give clues

•  200’000’000 photoreceptors in two retinas
•  100’000’000 olfactory receptors
•  20’000’000 tactile-, noci- and proprio-receptors
•  7’000 inner hair cells

very different density of information in bits/sec/cell

what’s in a voice?

•  the text
•  the volume of the utterance
•  the emotional stance
•  the identity of the speaker
•  the speaker’s accent
•  the distance to the speaker
•  the position of the speaker
•  the orientation of the speaker
•  an impression of the room

a multitude of percepts!

I can tell when other things are moving by the sounds they make. Cars swish past, feet patter along, leaves rustle, but a silent nature is immobile. So it is that, for me, the clouds do not move.

John Hull, Touching the Rock

we hear things that move

many sounds have survival and evolutionary importance: we perceive things that make noise

that includes living beings, especially predators, prey, offspring and mates,

and many inanimate things on which our survival hinges, such as fire, water, wind and earth.

why sounds wake us up

we can even hear things that do not make sounds

a current project: sound textures

sounds such as rain, fire crackling, brook babbling, windblown leaves

they are important because:
•  they have great survival (ecological) value,
•  we inattentively recognize them extremely fast,
•  they are powerful modulators of emotional state.

time invariance

/2 /4 normal speed *2 *4

intro auditory pathways

the pathway of sound

early coding and transduction

the cochlea is like a camera

the cochlear partition: the organ of Corti sits atop the basilar membrane

[Figure: schematic of the cochlea. Labels: oval window (sound waves enter); round window (sound waves exit and dissipate); impenetrable boundary condition (bone); basal end; apical end]

for acoustical purposes, the cochlea is a fluid-filled cavity encased in bone, separated into two distinct partitions by a membrane of exponentially varying stiffness

sound is focused by an acoustical lens onto a sound-sensitive film

the cochlea is like a camera

high frequency

low frequency

the basilar membrane vibrates with different amplitudes in different places according to the frequency of sound

fluid mechanics can be formulated as a free-boundary problem with the b.m. as boundary—long-range kernel and other issues.

traveling wave—not standing wave!—from basal to apical

the ear is active

live cochlea

dead cochlea

both the cochlear high gain and the sharpness of its frequency tuning depend on a biological power supply. a power interruption (by pressing the carotid artery, for instance) reversibly abolishes both.

in addition, the cochlea displays nonlinear responses which are similarly abolished by interrupting the power supply. the discovery by Rhode of the active nonlinearity (1971) ushered in a new era in auditory biophysics.

as far as we can gather, the active elements are the hair cells.

the cochlea is like a camera

the acoustical lens is in the organ of Corti

the sound-sensitive film is also in the organ of Corti

in this camera, the lens and the film are one and the same: the hair cells of the organ of Corti are responsible for both sensitivity and frequency selectivity!

the organ of corti

kandel

stereocilia are joined by tip links

bechara kachar et al

tuning curves

pick any observable—spike rates in fibers of the eighth nerve, velocity, suppression of a SOA line. Now, apply a sinusoidal input and vary the amplitude until the response reaches a prescribed level; record that amplitude as a function of the frequency and sweep the frequency.
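The procedure just described can be sketched numerically. This is only an illustration, not the protocol of any particular experiment: `response` is a hypothetical stand-in observable (the steady-state amplitude of a compressive resonator with an assumed best frequency `F0`), and `threshold_amplitude` bisects on the input amplitude until the response reaches the prescribed level.

```python
import numpy as np

F0 = 1000.0  # assumed best frequency of the stand-in model, in Hz


def response(amplitude, freq):
    """Hypothetical observable: the response R defined implicitly by
    amplitude^2 = R^6 + d^2 R^2, with d = (freq - F0) / F0 the relative detuning."""
    d = (freq - F0) / F0
    lo, hi = 0.0, max(1.0, amplitude)       # the root always lies in [lo, hi]
    for _ in range(80):                     # bisection; left side is monotonic in R
        mid = 0.5 * (lo + hi)
        if mid**6 + d**2 * mid**2 < amplitude**2:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def threshold_amplitude(freq, criterion, lo=1e-12, hi=1e3):
    """Vary the input amplitude (on a log scale) until the response hits criterion."""
    for _ in range(80):
        mid = np.sqrt(lo * hi)              # geometric midpoint
        if response(mid, freq) < criterion:
            lo = mid
        else:
            hi = mid
    return np.sqrt(lo * hi)


freqs = np.geomspace(F0 / 4, F0 * 4, 41)    # sweep the frequency
criterion = 1e-2                            # prescribed response level
curve = np.array([threshold_amplitude(f, criterion) for f in freqs])
best = freqs[np.argmin(curve)]              # tip of the tuning curve
```

Plotting `curve` against `freqs` on log-log axes gives a V-shaped curve whose tip sits at the best frequency of the stand-in model; any other observable can be swapped in for `response`.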

[Figure: tuning curve, axes I versus f]

many tuning curves (across species and observables) have a stereotyped shape:

cochlear tuning curve

[Figure: cochlear tuning curve, ruggero ‘00; annotated slope: 240 dB/octave = freq^40 = 20 poles]

four characteristics of the active process

•  gain
•  frequency selectivity
•  compressive nonlinearity
•  spontaneous emissions

all four characteristics have been well documented in basilar-membrane motion from lower vertebrates to mammals.

no experimental manipulation has been able to abolish one without abolishing all.

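geometry of hopf

near the resonance

(R: response amplitude, F: forcing amplitude, $\omega_0$: resonance frequency)

when $|\omega - \omega_0|$ is big then $R \approx F / |\omega - \omega_0|$

on the contrary, exactly at the resonance, $R = F^{1/3}$

which is an “essential nonlinearity”. V. M. Eguiluz, M. Ospeck, Y. Choe, A. J. Hudspeth and M. O. Magnasco, PRL 84, 5232-5236 (2000)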
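For reference, the two regimes can be derived in a few lines. This is a sketch in assumed notation (the symbols $z$, $\mu$, $\omega_0$, $F$, $R$ and $\phi$ are chosen here for illustration), based on the standard normal form of a periodically forced Hopf bifurcation, as studied in the paper cited on this slide.

```latex
\begin{align}
\dot z &= (\mu + i\omega_0)\,z - |z|^2 z + F e^{i\omega t}
  && \text{forced normal form} \\
z = R\,e^{i(\omega t + \phi)} \;&\Rightarrow\;
  F e^{-i\phi} = R^3 - \mu R + i(\omega - \omega_0)\,R
  && \text{response locked to the forcing} \\
F^2 &= R^6 + (\omega - \omega_0)^2 R^2
  && \text{at the bifurcation, } \mu = 0 \\
R &\approx \frac{F}{|\omega - \omega_0|}
  && \text{for } |\omega - \omega_0| \gg R^2 \text{ (linear response)} \\
R &= F^{1/3}
  && \text{for } \omega = \omega_0 \text{ (compressive response)}
\end{align}
```

At resonance the gain $R/F = F^{-2/3}$ grows without bound as the forcing becomes weak, so no linear regime exists there, however small the input.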

the hopf resonance

cochlear velocimetry

hopf behaviour

nonlinearly compressive for all frequencies > center frequency!

ruggero et al ‘00

propagation + hopf =

parallel active elements version; the series version is more complex but also has more interesting behaviour

geometry of the efferents

gain control

[Figure: amplitude of BM motion (log) along basilar membrane. Labels: gain; spot having f as BF; best place to control]

three times a cochlea?

The tympanic ear evolved three times independently from the ancestral fish ear. E.g., ossicles and columella are not homologous. A number of convergent features evolved, which indicates that certain features are evolutionarily stable. Mammals, birds and lizards all evolved elongated papillae/cochleae and two types of hair cells.

Recent evidence suggests the two types of hair cells are actually homologous, and represent a specialization into two classes, one for sensing and the other for amplifying.

from Manley, PNAS 2002

intro auditory coding

auditory nerve fibers

the fibers of the eighth cranial nerve fire in response to sounds. each fiber has a frequency to which it is most sensitive. near the threshold of the fiber, the number of action potentials evoked by the stimulus increases with stimulus intensity, but it rapidly saturates.

at every stimulus intensity that evokes action potentials, the fiber always fires at specific phases of the stimulus. it may not fire on every cycle, but rather in subharmonic lock, particularly at high frequencies. this phase lock has been shown in mammals up to 4 kHz, and up to 9 kHz in owls.

therefore any given fiber carefully encodes the driving frequency, and encodes very little information about intensity.
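A toy simulation can make phase locking with skipped cycles concrete. It is a sketch under loud assumptions, not a model of a real fiber: the firing probability per cycle, the jitter and the preferred phase are invented values, and phase locking is quantified with the vector strength, a standard measure that is not named on the slide.

```python
import numpy as np

rng = np.random.default_rng(0)

freq = 2000.0            # stimulus frequency in Hz (illustrative)
duration = 0.5           # seconds of stimulation
p_fire = 0.2             # assumed probability of firing on any given cycle
jitter = 0.05            # assumed timing jitter, as a fraction of one cycle
preferred_phase = 0.25   # assumed preferred phase, as a fraction of one cycle

n_cycles = int(freq * duration)
fires = rng.random(n_cycles) < p_fire                  # most cycles are skipped
cycles = np.flatnonzero(fires)
phases = preferred_phase + jitter * rng.standard_normal(cycles.size)
spike_times = (cycles + phases) / freq

# vector strength: length of the mean unit vector at the spike phases
# (1 = perfect phase lock, 0 = no phase preference)
theta = 2 * np.pi * freq * spike_times
vector_strength = np.abs(np.mean(np.exp(1j * theta)))

rate = spike_times.size / duration                     # spikes per second
intervals = np.diff(spike_times) * freq                # in units of the period
```

The firing rate is far below the stimulus frequency, yet the interspike intervals cluster around integer multiples of the period, so the spike train still pins down the driving frequency.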

encoding in the auditory nerve

sopranos

•  cheryl studer
•  edita gruberova
•  florence foster-jenkins

michale fee

phase and intensity

the auditory system preserves phase information much more carefully than intensity information:

•  discrimination in intensity: 1 dB (26% change in energy)

•  discrimination in frequency: 1/12th semitone (0.5% change in f).

•  discrimination in heading: 15 degrees = 20 µs. (needs headphones: echoes interfere)
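The percentages quoted for intensity and frequency follow from the definitions of the decibel and of the equal-tempered semitone. A short check (variable names are illustrative):

```python
delta_db = 1.0
energy_ratio = 10 ** (delta_db / 10)        # the decibel is defined on energy (power)
amplitude_ratio = 10 ** (delta_db / 20)     # amplitude is the square root of energy
energy_change = energy_ratio - 1            # about 0.26
amplitude_change = amplitude_ratio - 1      # about 0.12

semitone = 2 ** (1 / 12)                    # equal-tempered semitone
freq_change = semitone ** (1 / 12) - 1      # about 0.005
```

So a 1 dB step is a 26% change in energy but only a 12% change in amplitude, while the just-noticeable frequency step is roughly fifty times smaller in relative terms.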

[audio examples, labeled: 445.7; 441.9; 12% v, 26% E; 26% v, 58% E]

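the short-time Fourier transform (a.k.a. Gabor transform, a.k.a. single wavelet)

$$\chi(\omega,t) = \int e^{i\omega(t-t')}\; e^{-(t-t')^2/2\sigma^2}\; x(t')\, dt'$$

here $e^{i\omega(t-t')}$ is the Fourier kernel, $e^{-(t-t')^2/2\sigma^2}$ provides the time localization, and $x(t')$ is the signal; the result has an amplitude and a phase, $\chi = |\chi|\,e^{i\phi}$.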

the short-time Fourier transform

[Figure: the (t, ω) plane, with a grain of size σ in time and 1/σ in frequency]

the stft depends only on nearby data: a neighbourhood of size σ in time, and of size 1/σ in frequency. therefore all nearby points are strongly correlated, and so the amplitude of the stft appears to be “out of focus”. making σ smaller only succeeds, of course, in making 1/σ bigger, so the area of the “grain” is constant.
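The constant area of the grain can be checked numerically. The sketch below assumes the gaussian-window definition χ(ω,t) = ∫ e^{iω(t−t')} e^{−(t−t')²/2σ²} x(t') dt', evaluates it by direct quadrature for a pure tone, and measures the width of |χ| along ω; the tone frequency and the values of σ are arbitrary choices.

```python
import numpy as np


def tone_spectrum(sigma, omega0=10.0):
    """|chi(omega, t=0)| for the tone x(t) = exp(i*omega0*t), by direct quadrature."""
    s = np.arange(-8 * sigma, 8 * sigma, sigma / 500)    # s = t - t'
    ds = s[1] - s[0]
    window = np.exp(-s**2 / (2 * sigma**2))              # time localization
    x = np.exp(-1j * omega0 * s)                         # x(t - s) evaluated at t = 0
    omegas = np.linspace(omega0 - 6 / sigma, omega0 + 6 / sigma, 601)
    chi = np.array([np.sum(np.exp(1j * w * s) * window * x) * ds for w in omegas])
    return omegas, np.abs(chi)


def width(axis, profile):
    """Standard deviation of `axis` when `profile` is used as a weight."""
    mean = np.sum(axis * profile) / np.sum(profile)
    return np.sqrt(np.sum((axis - mean) ** 2 * profile) / np.sum(profile))


grain = {}
for sigma in (0.5, 1.0, 2.0):
    omegas, amp = tone_spectrum(sigma)
    grain[sigma] = sigma * width(omegas, amp)            # time width * frequency width
```

Each entry of `grain` comes out close to 1: shrinking σ sharpens the transform in time and blurs it in frequency by the same factor.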

instantaneous time-frequency

simplest thing one can do with the phase is take derivatives:

$$\omega_{ins}(\omega,t) = \frac{\partial\phi}{\partial t}, \qquad t_{ins}(\omega,t) = t - \frac{\partial\phi}{\partial\omega}$$

instantaneous time picture

now that we have these two estimates, the easiest thing to do is to plot one against the other and dispense with (ω,t) entirely!

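simple elements

with $\omega_{ins}(\omega,t) = \partial\phi/\partial t$ and $t_{ins}(\omega,t) = t - \partial\phi/\partial\omega$:

•  tone, $x = e^{i\omega_0 t}$: $\omega_{ins} = \omega_0$, $t_{ins} = t$
•  click, $x = \delta(t - t_0)$: $\omega_{ins} = \omega$, $t_{ins} = t_0$
•  linear frequency sweep, $x = e^{i\alpha t^2/2}$: $\omega_{ins} = \alpha\, t_{ins}$, $t_{ins} = \ldots$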

any linear element in the time-frequency plane, like a tone, a click, or a linear frequency sweep (calculation mercifully spared and available upon request) is mapped by the instantaneous reassignment to a one-dimensional, perfectly thin line. meanwhile, the original stft is still out of focus, with this line blurred by (σ,1/σ). no linear transform has this property. only the (bilinear) wigner-ville distribution localizes these signals exactly.
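These three cases can be checked numerically without doing the calculation by hand. The sketch below assumes the gaussian-window transform χ(ω,t) = ∫ e^{iωs} e^{−s²/2σ²} x(t−s) ds, with s = t − t', and takes the phase derivatives by central finite differences. The parameter values are arbitrary, and the click is approximated by a narrow gaussian pulse rather than a true delta function.

```python
import numpy as np

SIGMA = 1.0                                   # window width (assumed value)
S = np.arange(-8.0, 8.0, 1e-3) * SIGMA        # integration variable s = t - t'
DS = S[1] - S[0]
WINDOW = np.exp(-S**2 / (2 * SIGMA**2))

OMEGA0, T0, ALPHA, EPS = 5.0, 0.5, 2.0, 0.01  # illustrative parameters


def tone(t):
    return np.exp(1j * OMEGA0 * t)


def click(t):
    return np.exp(-(t - T0) ** 2 / (2 * EPS**2))   # narrow pulse standing in for delta


def chirp(t):
    return np.exp(1j * ALPHA * t**2 / 2)


def stft(x, omega, t):
    """chi(omega, t) by direct quadrature over s."""
    return np.sum(np.exp(1j * omega * S) * WINDOW * x(t - S)) * DS


def reassign(x, omega, t, h=1e-4):
    """Return (omega_ins, t_ins) = (dphi/dt, t - dphi/domega) at the point (omega, t)."""
    chi = stft(x, omega, t)
    dchi_dt = (stft(x, omega, t + h) - stft(x, omega, t - h)) / (2 * h)
    dchi_dw = (stft(x, omega + h, t) - stft(x, omega - h, t)) / (2 * h)
    # d(log chi) = d(log|chi|) + i dphi, so the phase derivative is the imaginary part
    return np.imag(dchi_dt / chi), t - np.imag(dchi_dw / chi)


tone_w, tone_t = reassign(tone, 4.5, 0.3)      # probe away from the tone frequency
click_w, click_t = reassign(click, 3.0, 1.0)   # probe away from the click time
chirp_w, chirp_t = reassign(chirp, 3.0, 1.0)   # probe away from the sweep line
```

Although each probe point lies off the object, the tone is reassigned to its frequency `OMEGA0`, the click to its time `T0`, and the sweep to a point satisfying `chirp_w == ALPHA * chirp_t` up to numerical error.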

the implications for the fourier uncertainty theorem are clear. uncertainty in the time-frequency plane refers to resolution, i.e., the ability to distinguish two objects as distinct. any single object can be tracked in both frequency and time to arbitrary accuracy.

K. Kodera, R. Gendrin, and C. de Villedary, “Analysis of time-varying signals with small BT values,” IEEE Trans. ASSP 26(1), 64-76 (1978).

F. Auger and P. Flandrin, “Improving the readability of time-frequency and time-scale representations by the reassignment method,” IEEE Trans. on Signal Proc. 43, 1068-1089 (1995).

E. Chassande-Mottin, I. Daubechies, F. Auger, and P. Flandrin, “Differential reassignment,” IEEE Signal Proc. Lett. 4(10), 293-294 (1997).

the map

[Figure: the (ω, t) plane mapped to the (ωins, tins) plane]

the instantaneous estimates define a transformation taking a grid of points from (ω, t) to a distorted grid of points in (ωins, tins), which can then be put on a two-dimensional histogram and counted

remapping

if one takes a fine grid of points, maps it through the instantaneous time map, and then “histograms” the imaged points, one obtains a measure µ (or better stated, the derivative of µ wrt lebesgue (λ), which is of course singular):

if we denote by γ the measure weighted by the STFT

then we can also define

which is equivalent to making a histogram of the points in the grid, mapped by the instantaneous time map, and weighted by the stft at the original point.
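The weighted histogram can be sketched in a few lines of numpy. Several choices here are assumptions rather than the method of the slides: the weight is taken to be |χ|², the test signal is a linear frequency sweep, and the phase derivatives are obtained from a second transform computed with the window s·g(s), an identity that holds for gaussian windows, instead of from finite differences.

```python
import numpy as np


def reassigned_histogram(x, dt, omegas, sigma, bins):
    """Histogram of (t_ins, omega_ins) over a grid of (t, omega), weighted by |chi|^2."""
    half = int(round(6 * sigma / dt))
    offsets = np.arange(-half, half + 1)
    s = offsets * dt                                   # s = t - t'
    g = np.exp(-s**2 / (2 * sigma**2))                 # gaussian window
    centers = np.arange(half, len(x) - half)           # skip the edges of the signal
    seg = x[centers[:, None] - offsets[None, :]]       # seg[k, j] = x(t_k - s_j)
    kernel = np.exp(1j * np.outer(s, omegas))          # e^{i omega s}
    chi = (seg * g) @ kernel * dt                      # stft on the (t, omega) grid
    chi_s = (seg * (s * g)) @ kernel * dt              # same, with window s * g(s)
    weight = np.abs(chi) ** 2
    keep = weight > 1e-6 * weight.max()                # drop numerically empty cells
    ratio = chi_s[keep] / chi[keep]
    t_grid = np.broadcast_to((centers * dt)[:, None], chi.shape)
    w_grid = np.broadcast_to(omegas[None, :], chi.shape)
    t_ins = t_grid[keep] - ratio.real                  # t - dphi/domega
    w_ins = w_grid[keep] - ratio.imag / sigma**2       # dphi/dt
    hist, _, _ = np.histogram2d(t_ins, w_ins, bins=bins, weights=weight[keep])
    return hist, t_ins, w_ins, weight[keep]


dt, alpha, sigma = 0.01, 1.0, 0.5                      # illustrative values
t = np.arange(2000) * dt
x = np.exp(1j * alpha * t**2 / 2)                      # linear frequency sweep
omegas = np.linspace(0.0, 20.0, 101)
hist, t_ins, w_ins, weight = reassigned_histogram(x, dt, omegas, sigma, bins=(100, 100))

# weighted mean distance of the reassigned points from the line omega = alpha * t
spread = np.average(np.abs(w_ins - alpha * t_ins), weights=weight)
```

For the sweep, `spread` is tiny: essentially all of the weight in `hist` lands on the line ω = αt, whereas |χ|² on the original grid is smeared by (σ, 1/σ) around it.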

white noise

•  at the opposite end of “single linear elements” we have white noise: a stream of uncorrelated gaussian random numbers.

•  white noise is the “densest” signal: i.e., there’s stuff happening at all times and all frequencies.

•  while the expectation value of the spectrum of white noise is a constant, the actual spectrum for any given realization has stochastic values and fluctuates as much in frequency domain as it does in the time domain.
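The last point is easy to verify for a single realization. A minimal sketch (the sample size and seed are arbitrary): the periodogram of gaussian white noise averages to a constant, but its bin-to-bin fluctuations are as large as its mean.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2 ** 14
x = rng.standard_normal(n)                   # uncorrelated gaussian samples
spectrum = np.abs(np.fft.rfft(x)) ** 2 / n   # periodogram; its expectation is 1
spectrum = spectrum[1:-1]                    # drop the DC and Nyquist bins

mean_level = spectrum.mean()                 # close to the flat expectation
relative_fluctuation = spectrum.std() / mean_level
```

`relative_fluctuation` comes out close to 1, so a single spectrum of white noise is nowhere near flat; only the average over many realizations is.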

zebra finch, regular sonogram

zebra finch, reassigned multiband

?