Dolphin whistle classification for determining group identities


Signal Processing 82 (2002) 251–258
www.elsevier.com/locate/sigpro

S. Datta ∗, C. Sturtivant

Underwater Acoustics Group, Department of Electronic and Electrical Engineering, Loughborough University, Leicestershire LE11 3TU, UK

Received 28 August 2000; received in revised form 7 May 2001

Abstract

Traditionally, dolphin recognition techniques in the field have relied upon photographic identification, but this has several practical disadvantages. Some whistled vocalisations may be used for group identification, and these are viable at longer ranges than visual means. Novel automated algorithms have been developed to detect, encode and classify these whistles, with the aim of allowing a rapid, quantitative assessment of group identity. Hidden Markov models were constructed for each whistle class together with statistical representations of the whistles' detailed shapes, in an unsupervised manner and from little a priori information. The encoding and classification routines were applied to whistles from a 15 min recording made during a field trial, which contained three periods of whistle activity. Cross-group comparison of the whistle classes suggested that the whistles from the first period were vocally distinct from the second and third. Further analysis revealed that the latter two periods contained whistles that had been recorded simultaneously from two separate groups, but which could indeed be separated with the classification routines. This paper will detail the problems involved with detecting the whistles, characterising and classifying them, and finally will show the analysis of the results to calculate group similarity probabilities. © 2002 Elsevier Science B.V. All rights reserved.

Keywords: Whistle classification; Dolphin whistles; Unsupervised learning; Underwater acoustics

1. Introduction

The Underwater Acoustics Group at Loughborough University has developed software for quantitative comparison of frequency-modulated tonal sounds, specifically those produced by dolphins [15–17]. The software is capable of identifying the parts of recordings that contain whistle-like sounds, can extract their frequency–time–intensity contours, and can then apply automatic pattern recognition techniques to classify them against previous whistles. These techniques have the benefits of being both objective and quantitative, and provide a probability that any candidate whistle belongs to each existing class, or alternatively to some new class. Novel algorithms have been developed both for digitally filtering the incoming signals and for solving the problem of unsupervised learning with the whistle classes. It is believed that the application of hidden Markov models to represent each class of encoded whistles has not been attempted prior to this study, but the results show that this is a practical method of whistle classification.

This identification technique relies on the fact that dolphins produce distinctive whistles that can be used to separate them from other individuals. The 'signature' whistle theory was proposed in 1965 [3], and suggested that dolphin whistles carried the identity of the vocalising animal. This theory has recently been questioned [12], although the more current proposal still maintains that whistles carry identity information, but in the wider context of groups of animals. Such identifying whistles have been found for several dolphin species as well as for the killer whale [4,5,7,9,18], and thus techniques which make use of 'signature' whistles may potentially be applied to a wide range of toothed cetaceans. Apart from 'signature' whistles, time–frequency analysis of the sonar signals produced by bats has also been carried out [8,14].

Identification of groups of dolphins in the wild can be most beneficial during studies of behavioural changes. The traditional use of photographic identification (photo ID) in the field [19,20] suffers from two main problems when studying behaviour. Firstly, photo ID is reliant on good visibility, and so must be made above the surface due to the lack of clarity in British coastal water, and also can only be used during the day. Secondly, dolphins rarely spend longer than a few seconds at the surface to breathe, and thus an observer must be close enough to a dolphin's surfacing to take clear pictures. This second problem presents a difficult obstacle when attempting to record changes in behaviour, since any observer close enough to the dolphins for photo ID work might affect the behaviour he or she is trying to observe. A benefit of acoustic identification is that it can be made at much longer ranges and does not require an observer in the vicinity.

∗ Corresponding author. E-mail address: [email protected] (S. Datta).

0165-1684/02/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved. PII: S0165-1684(01)00184-0

2. Equipment and data

Recordings were made on the Dutch research vessel Tridens as part of a parallel project [6,10,11] named CETASEL (Commission of European Communities Contract No. AIR3-CT94-2423). Whistles were noted from three groups of common dolphins and were analysed to attempt to distinguish between the three groups. Vocalisations were recorded on an R-DAT recorder (Sony TCD-D7) with a 22 kHz bandwidth. Signals from a trawl-mounted hydrophone were preamplified (with a Benthos AQ4/AD743) before being sent via a coaxial cable some 450 m back to the ship, where acoustic and visual observations were logged.

3. Whistle detection and extraction

Dolphin whistles are distinguished from other marine noises by their narrow bandwidths and their relatively stable frequency component from one millisecond to the next. A discrete fast Fourier transform (FFT) was carried out on the data to convert the signal into the time–frequency domain. A transform partition size of 256 samples (corresponding to 5.8 ms) produced a good time-to-frequency resolution for viewing a contour, but for the detection routine a smaller partition size of 32 samples (corresponding to 0.73 ms) was used. This briefer partition time resulted in frequency bins with a width of 689 Hz, so that whistles (which changed only slowly in frequency) moved infrequently between adjacent bins.

In order to reduce the contribution to the signal made by impulsive noises (such as echolocation clicks), a filtering technique was used to enhance signals with narrow frequency bandwidths. Echolocation clicks generally have a spectrum that changes only slowly with frequency below 22 kHz [1], so an initial method was devised for the removal of these clicks whereby the average energy between time partitions was normalised using m adjacent frequency bins. In order to remove any background or remaining transient components of the signal for whistle detection, two averages were taken of the resulting filtered signal using an exponential decay.

For whistle extraction the FFT is taken of signals in which whistles have been detected, and the edge detection filter is applied to the resulting spectrogram to reduce broadband noises such as echolocation clicks (Fig. 1(a) and (b)). A contour following routine is used that attempts to follow a smooth path through the spectrogram, whereby a higher intensity is required to suddenly alter the direction of the traced contour.

The contour is subdivided into a number of segments in order to reduce the data requirements for comparisons. Each segment indicates periods when the contour is generally rising, falling, or flat in frequency, or a temporary silence. The time–frequency information within each of these segments is approximated by a quadratic equation using a least-squares fitting routine [16]. Thus the characteristic 'shape' of the contour is kept with a marked reduction in data, although currently intensity information is not preserved (Fig. 1(c) and (d)).
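The detection stage described at the start of this section can be illustrated with a short sketch. It assumes a 44.1 kHz sample rate (not stated in the text, but consistent with the quoted partition durations and the 22 kHz recorder bandwidth). The authors' filter is not specified in enough detail to reproduce, so the sketch substitutes a simple ratio of each bin to the mean of its m neighbouring bins. The names `partition_ms`, `power_spectrogram` and `enhance_narrowband`, and the value of `m`, are illustrative choices rather than anything taken from the paper.

```python
import numpy as np

FS = 44_100  # assumed sample rate in Hz (not given in the text)

def partition_ms(n_samples: int, fs: int = FS) -> float:
    """Duration of one FFT partition in milliseconds."""
    return 1000.0 * n_samples / fs

def power_spectrogram(signal: np.ndarray, n_fft: int) -> np.ndarray:
    """Non-overlapping FFT partitions -> power, shape (frames, n_fft // 2 + 1)."""
    n_frames = len(signal) // n_fft
    frames = signal[: n_frames * n_fft].reshape(n_frames, n_fft)
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def enhance_narrowband(power: np.ndarray, m: int = 2) -> np.ndarray:
    """Divide each bin by the mean power of up to m neighbours on either side.

    A click spreads energy across neighbouring bins, so its ratio stays near 1;
    a whistle concentrated in one bin gives a ratio well above 1.
    """
    n_bins = power.shape[1]
    out = np.empty_like(power)
    for k in range(n_bins):
        lo, hi = max(0, k - m), min(n_bins, k + m + 1)
        neighbours = [j for j in range(lo, hi) if j != k]
        out[:, k] = power[:, k] / (power[:, neighbours].mean(axis=1) + 1e-12)
    return out

# Synthetic check: a 10 kHz tone with a single-sample click in the 11th partition.
t = np.arange(32 * 20) / FS
x = np.sin(2 * np.pi * 10_000 * t)
x[32 * 10 + 5] += 20.0  # impulsive "click"
enhanced = enhance_narrowband(power_spectrogram(x, 32))
tone_bin = int(round(10_000 * 32 / FS))  # bin nearest the tone
```

Under the assumed rate, `partition_ms(256)` and `partition_ms(32)` agree with the 5.8 ms and 0.73 ms quoted above. The quoted 689 Hz bin width corresponds to 22 050 Hz / 32 (equivalently 44 100 Hz / 64); the sketch does not attempt to reproduce that spacing.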


Fig. 1. Example signal spectrogram before (a) and after (b) whistle enhancement; (c) an extracted whistle contour showing segments; and (d) fitted curves.
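Segment labelling of the kind shown in Fig. 1(c) could be sketched as below. The paper does not give the rule used to decide when a contour counts as rising, falling or flat, so the slope threshold `flat_tol`, the use of `None` for missing contour points, and the function name `segment_contour` are all assumptions made for illustration.

```python
from itertools import groupby
from typing import Optional, Sequence

def segment_contour(
    freqs: Sequence[Optional[float]], flat_tol: float = 50.0
) -> list[tuple[str, int, int]]:
    """Split a contour into ('rising' | 'flat' | 'falling' | 'blank', start, end) runs.

    freqs holds one contour frequency (Hz) per time partition, or None where the
    contour is interrupted. flat_tol (Hz per partition) is a hypothetical threshold.
    The end index of each run is exclusive.
    """
    labels: list[str] = []
    for i, f in enumerate(freqs):
        nxt = freqs[i + 1] if i + 1 < len(freqs) else None
        if f is None:
            labels.append("blank")
        elif nxt is None:
            # Last point before a gap or the end: continue the current run.
            labels.append(labels[-1] if labels and labels[-1] != "blank" else "flat")
        elif nxt - f > flat_tol:
            labels.append("rising")
        elif f - nxt > flat_tol:
            labels.append("falling")
        else:
            labels.append("flat")

    segments = []
    start = 0
    for label, run in groupby(labels):
        length = len(list(run))
        segments.append((label, start, start + length))
        start += length
    return segments

contour = [8000, 8200, 8400, 8410, 8420, None, None, 9000, 8700, 8400]
print(segment_contour(contour))
```

Each returned run could then be passed to a curve-fitting step, so that a whistle is summarised by a short sequence of labelled segments rather than by every time–frequency point.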

4. Whistle contour encoding

Attempting to compare one whistle with another by using the simple sequence of time–frequency pairs extracted by the previous algorithm would clearly involve comparison of a large amount of data, since whistle durations are typically between 0.2 and 3 s. Observation of the 'shape' of the whistle suggested that a more compact representation could equally well describe the salient characteristics. An algorithm was adopted whereby the whistle was split up into segments, indicating whether the whistle was 'rising', 'flat', or 'falling' in frequency with time, or 'blank', indicating a break in the contour. The data points contained in each of these segments were represented as a quadratic equation of the familiar form as follows:

$$y(x) = a_0 + a_1 x + a_2 x^2. \qquad (1)$$

The origin for y and x is placed at the first frequency bin in the first time partition of the segment, allowing the curves of similar segments to be compared. In general, the quadratic equation would not exactly match all points on the extracted whistle contour, so a least-squares fitting routine was used. The equation was solved by producing a 'design matrix' A, which had the general form as follows:

$$A_{ij} = \frac{(x_i)^j}{\sigma_i}, \qquad (2)$$

where i ranges over the number of time partitions in the segment, and j from 0 to 2 (from the three powers of x in Eq. (1)). Since the standard deviations $\sigma_i$ for each point i were not known, these were all set to 1.0. A column matrix b was constructed from each of the values $y_i$ for the points in the whistle segment as follows:

$$b_i = \frac{y_i}{\sigma_i}, \qquad (3)$$

where, again, each $\sigma_i$ was set to 1.0. A precise solution to this problem might then be to find the values of a (which is the column matrix of the parameters $a_i$) as follows:

$$A \times a = b. \qquad (4)$$

However, in this case the equation would have no precise solution, so the least-squares solution, which minimises the following quantity, was calculated:

$$\chi^2 = |A \times a - b|^2. \qquad (5)$$

The technique of singular value decomposition (SVD) was used to solve this equation. SVD takes A, an M × N matrix (where M ≥ N), and returns three matrices U (M × N, column-orthogonal), W (N × N, diagonal), and V (N × N, orthogonal), related to A by the following equation:

$$A = U \cdot W \cdot V^{T}. \qquad (6)$$

The (pseudo-)inverse of A is then trivial to calculate, except where the diagonal values in W, $w_i$, are equal to zero. For zero (or small) values of $w_i$, the corresponding columns of V form part of the nullspace of the equation, and so their value in the inverse ($1/w_i$) can be set to zero. With this proviso, a is calculated using the following equation:

$$a = V \, [\mathrm{diag}(1/w_i)] \, U^{T} b. \qquad (7)$$

The squared errors at the ends of the segment could then be used to indicate where the segment needed to be subdivided further to produce an accurate curve fit.
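The fitting procedure of Eqs. (1)–(7) can be sketched with NumPy as follows. This is a minimal illustration rather than the authors' routine: the function name `fit_quadratic_svd` and the relative cut-off `rel_tol` used to decide which singular values count as "small" are assumptions, since the paper does not state a threshold.

```python
import numpy as np

def fit_quadratic_svd(x, y, sigma=None, rel_tol=1e-10):
    """Least-squares fit of y = a0 + a1*x + a2*x**2 via SVD.

    Returns the parameter vector a and the residual chi-squared of Eq. (5).
    sigma defaults to 1.0 for every point, as in the text.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sigma = np.ones_like(x) if sigma is None else np.asarray(sigma, dtype=float)

    A = np.vander(x, 3, increasing=True) / sigma[:, None]  # Eq. (2): A_ij = x_i**j / sigma_i
    b = y / sigma                                           # Eq. (3): b_i = y_i / sigma_i

    U, w, Vt = np.linalg.svd(A, full_matrices=False)        # Eq. (6): A = U diag(w) V^T
    inv_w = np.zeros_like(w)
    keep = w > rel_tol * w.max()                            # assumed cut-off for "small" w_i
    inv_w[keep] = 1.0 / w[keep]                             # 1/w_i set to zero otherwise

    a = Vt.T @ (inv_w * (U.T @ b))                          # Eq. (7)
    chi2 = float(np.sum((A @ a - b) ** 2))                  # Eq. (5)
    return a, chi2

# A noise-free segment generated from known parameters is recovered by the fit.
xs = np.arange(6)
ys = 2.0 + 0.5 * xs - 0.1 * xs**2
params, chi2 = fit_quadratic_svd(xs, ys)
```

With the origin placed at the start of each segment, as described above, `params[0]` is the frequency offset, `params[1]` the initial slope and `params[2]` half the rate of change of slope.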

5. Classification procedure

Whistle classification was based around two features: overall contour shape, and detailed differences in contour structure. Hidden Markov models, or HMMs, were employed to represent segment sequences for each contour class. Each optimally trained model was combined with an untrained model with a flat response, which reduced the probabilities for known members so that undiscovered class members could be accounted for.

The HMM consists of a set of states S with N members, where one of these states will be the current state q of the system. Transitions between states are made with varying probabilities, represented by the state transition probability distribution $A = \{a_{ij}\}$, where $a_{ij}$ is the probability of a transition from state i to state j. In each state an output is chosen from a set V of M symbols weighted by the observation symbol probability distribution $B = \{b_j(k)\}$, where j is the current state and k indicates the kth member $v_k$ of V. Thus, from a state $q_t = S_i$ a transition is made to $q_{t+1} = S_j$ with probability $a_{ij}$, and symbol $v_k$ is the output with probability $b_j(k)$. The initial state of the model is defined by the initial-state probability distribution $\pi = \{\pi_i\}$. By modification of $\pi$, A, and B, the output symbols from an HMM can be tailored to reflect the probabilities of obtaining them in a class if a suitable mapping can be made for V.

Having in some way trained an HMM to represent the members of a whistle class, we can calculate the probability that the HMM would produce a given observation sequence. This value is simply the sum, over all possible state sequences, of the probability of each state sequence multiplied by the probability of that sequence producing the given observations. Since a brute force approach to this calculation is time consuming, the forward and backward variables ($\alpha$ and $\beta$) used by Baum et al. [2] were used [13].

Training of the HMM is accomplished similarly by using the forward and backward variables to calculate the expected number of times a transition from one state to another is made, given a set of observation sequences, and also the expected number of times each symbol is output in each state.

Fig. 2. Automatic classification of whistles from the three groups.

A separate comparison was made in addition to the HMM method, based on three measures between corresponding segments for pairs of contours. The quadratic equation parameters allow calculation of average differences in frequency, frequency slope, and rate of change of slope for pairs of segments, and standard deviations of these values could be calculated over all contours within a class. Since these measures form a Gaussian distribution within a class, the degree to which a candidate whistle is representative of a class can be found as a percentage based on comparisons with those whistles already contained in that class.

Differences in average frequency can be calculated from the quadratic parameters from Eq. (1). Similarly, the average difference in the rate of change of frequency (frequency slope) and rate of change of frequency slope can also be calculated from the quadratic parameters.

The product of this similarity percentage and the probability from the hidden Markov model was used to calculate the probability that a whistle from the specified class would match that whistle. The probability for membership of each class for a specific whistle could then be calculated using Bayes's theorem for conditional probabilities, using an empirical probability that the whistle might belong to an as yet undiscovered class. Any candidate whistle could then be placed either in the class with the highest membership probability or in a new class if so indicated.
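The two probabilistic steps of this section, the forward-variable calculation of an HMM's output probability and the Bayes step that includes an undiscovered class, can be sketched as below. The toy model values, the symbol coding (0 = rising, 1 = flat, 2 = falling), the function names and the way the new-class likelihood and prior are supplied are all assumptions for illustration; the paper does not give its parameter values or the exact form of its empirical new-class probability.

```python
def forward_probability(pi, A, B, obs):
    """P(obs | model) computed with the forward variable alpha.

    pi[i]   : initial-state probability of state i
    A[i][j] : probability of a transition from state i to state j
    B[j][k] : probability of emitting symbol k while in state j
    obs     : sequence of symbol indices
    """
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n)) * B[j][symbol]
            for j in range(n)
        ]
    return sum(alpha)

def class_posteriors(likelihoods, priors, new_class_likelihood, new_class_prior):
    """Bayes' theorem over the known classes plus one 'undiscovered' class.

    likelihoods[c] stands for P(whistle | class c), e.g. the HMM probability
    multiplied by the shape-similarity measure. Returns the posterior for each
    known class and the posterior for a new class.
    """
    joint = [lik * prior for lik, prior in zip(likelihoods, priors)]
    joint_new = new_class_likelihood * new_class_prior
    total = sum(joint) + joint_new
    return [j / total for j in joint], joint_new / total

# Hypothetical two-state model over the symbols rising / flat / falling.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
p_obs = forward_probability(pi, A, B, [0, 1, 2])

# Hypothetical likelihoods for two known classes and a new-class alternative.
posteriors, p_new = class_posteriors([0.02, 0.001], [0.45, 0.45], 0.005, 0.10)
```

A whistle would then be assigned to the known class with the largest posterior, or used to start a new class when `p_new` is the largest value. For long observation sequences the forward variable is normally rescaled at each step to avoid numerical underflow; segment sequences of the length described here are short enough for the unscaled form.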

6. Whistle group classes

The whistles from each period in the recordings were termed whistle 'groups', such that group A contained all whistles from period A, etc. The classification procedure described above was applied to each group of whistles. Thus, within each group, the whistles were assigned to one of a number of different classes. These classes and their membership are shown graphically in Fig. 2.

The classes within each group are labelled with the group letter followed by the class number. Several classes within each group contained just one whistle, and it is possible that some were due to 'aberrant' whistles from dolphins, i.e. a whistle type other than that individual's 'signature' whistle. Alternatively, this species' whistles may be variable and carry no information unique to individuals or groups. Comparing whistles between groups can test this last statement.


Table 1

(a) Cross-classification of whistles across groups

Classes   A0  A1  A2  A3  B0  B1  B2  B3  B4  B5  B6  B7  B8
Group A    1   2   3   1   0   0   0   0 (2)   0   0   0   0
Group B    0   0   0   0   1   1   3   1   1   2   1   1   1

Classes   A0  A1  A2  A3  C0  C1  C2  C3  C4  C5  C6  C7  C8  C9 C10 C11 C12
Group A    1   2   3   1   0   0   0   0   0   0   0   0   0   0   0   0   0
Group C    0   2   1   0   9   1   1   7   1   1   2   1   3   1   1   1   1

Classes   B0  B1  B2  B3  B4  B5  B6  B7  B8  C0  C1  C2  C3  C4  C5  C6  C7  C8  C9 C10 C11 C12
Group B    1   1   3   1   1   2   1   1   1   3   0   1   0   0   0   0   0   2   0   0   0   0
Group C    0   0   9   1   0   3   1   0   0   9   1   1   7   1   1   2   1   3   1   1   1   1

(b) Expected number of whistles per class per group based on membership probabilities

             A classes   B classes
A classes         5.72        1.90
B classes         2.46        9.56
χ² = 3.784, 1 d.f., p = 1.75%

             A classes   C classes
A classes         5.72        2.35
C classes         6.25       25.09
χ² = 7.874, 1 d.f., p = 1.39%

             B classes   C classes
B classes         9.56        7.02
C classes        15.39       25.09
χ² = 1.844, 1 d.f., p = 17.45%

(c) Whistle comparison for 'major' classes in groups B and C

Classes   B2  B3  B5  B6  C0  C2  C3  C6  C8
Group B    3   1   2   1   3   1   0   0   2
Group C    9   1   3   1   9   1   7   2   3

7. Cross-group comparison

Classification was attempted on contours from one group with classes characterised for another (Table 1(a)). The probabilities for class membership were calculated for each whistle–class pair, and then summed by group and class to give the expected class membership for each group based on the classification parameters. Chi-squared analysis was made on class–group pairs with the null hypothesis of no difference in whistle classes between groups (Table 1(b)). The resulting significance probability could then be used to determine whether two groups of dolphins produced the same types of whistle.
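The chi-squared step can be sketched for a single 2 × 2 table of expected memberships. The sketch below uses the ordinary Pearson statistic without a continuity correction, which is an assumption about the authors' method; the function name `chi_squared_2x2` is likewise illustrative. The input values are those listed for groups B and C in Table 1(b).

```python
import math

def chi_squared_2x2(table):
    """Pearson chi-squared statistic and p-value (1 d.f.) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = rows[i] * cols[j] / n  # count expected under the null hypothesis
            chi2 += (observed - expected) ** 2 / expected
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Expected memberships for groups B and C from Table 1(b).
chi2, p = chi_squared_2x2([[9.56, 7.02], [15.39, 25.09]])
print(f"chi2 = {chi2:.3f}, p = {100 * p:.2f}%")
```

Applied to these values, the calculation gives the χ² = 1.844 and p = 17.45% quoted for the B–C comparison, so the null hypothesis of no difference between those two groups is not rejected at the usual 5% level.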

Chi-squared analysis revealed that group A had a significantly different whistle-type distribution from either of the other two groups. None of the whistles from group B matched any of the classes from group A, although the software indicated that two whistles from group A fell into class B4 (bracketed in Table 1(a)); on further investigation these had been misclassified.

Although three whistles out of 30 from group C were matched by classes formed from group A whistles, none of those from group A fell into group C classes. If this fact is taken in combination with the chi-squared probability of 1.39%, this table indicates that group A whistles are significantly different from those in group C.


Similar analysis of classes and whistles from groups B and C indicated some difference between groups, but not a significant one. Further analysis was conducted on just those classes that contained more than one whistle, termed the 'major' classes for a group, thus eliminating classes consisting of aberrant whistles (Table 1(c)).

Classes B2 and C0, B3 and C2, and B5 and C8 all contained the same whistles, suggesting that these class pairs were identical, although classes C3 and C6 contained no whistles from the other group. One possible explanation is that the two groups of dolphins were recorded simultaneously for a time, and whistles from C3 and C6 belonged solely to the second of the two groups.

8. Conclusions

An analysis software tool has been developed that can, to a large extent, automate the task of pattern recognition of dolphin whistles. Since several species of dolphin are known to use identifying whistles, this tool has a wide application for the study of the reoccurrence of dolphins within a study area. When applied to a species that has not previously been known for its identifying whistles, the software was able to encode and classify the whistles from common dolphins by contour shape. Using these classes, it was shown that one of the three groups contained whistles significantly different from the other two, and that the other two groups had been recorded simultaneously for part of the time. When whistles from the earlier of the groups were removed from the data, there were significantly different whistles from the latter group. Thus, the whistles from all three groups could be used to determine group identities and to separate them in time.

This result shows that a combination of hidden Markov modelling and statistical modelling of dolphin whistle time–frequency contours can be used to characterise different whistle types. A data reduction technique that preserved the frequency–time 'shape' of the whistle was employed, and since this did not obstruct the group identification it appears to be appropriate to the situation. Although limited in the number of whistles they contain, these results strongly suggest that the common dolphin employs whistles which can be used to identify individual groups.

Acknowledgements

The authors gratefully acknowledge the help of the CETASEL project members for providing the data for this research, and the aid of Kristin Kaschner, David Goodson, and Professor Bryan Woodward. Funding for this project was provided by the UK Department of the Environment under Contract No. CR 0129 and the Ministry of Agriculture, Fisheries, and Foods under Contract CSA 2270.

References

[1] R.A. Altes, Computer generation of some dolphin echolocation signals, Science 173 (4000) (1971) 912–914.

[2] L.E. Baum, T. Petrie, G. Soules, N. Weiss, A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains, Ann. Math. Stat. 41 (1970) 164–171.

[3] M.C. Caldwell, D.K. Caldwell, Individualized whistle contours in bottlenosed dolphins (Tursiops truncatus), Nature (London) 207 (1965) 434–435.

[4] M.C. Caldwell, D.K. Caldwell, Statistical evidence for individual signature whistles in Pacific whitesided dolphins, Lagenorhynchus obliquidens, Cetology 3 (1971) 9.

[5] M.C. Caldwell, D.K. Caldwell, J.F. Miller, Statistical evidence for individual signature whistles in the spotted dolphin, Stenella plagiodon, Cetology 16 (1973).

[6] P.R. Connelly, A.D. Goodson, K. Kaschner, P.A. Lepper, C.R. Sturtivant, B. Woodward, Acoustic techniques to study cetacean behaviour around pelagic trawls, Proceedings of ICES Conference, Baltimore, USA, September 25–October 1, 1997.

[7] M.E. Dahlheim, Signature information in killer whale calls, Whalewatcher 15 (1) (1981) 12–13.

[8] B. Escudie, B. Torresani, Wavelet representation, time-scaled matched receiver for asymptotic sonar signals emitted by bats, Proceedings of Fifth European Signal Processing Conference, Vol. 1, Elsevier, Amsterdam, 1990, pp. 305–308.

[9] J.K. Ford, H.D. Fisher, Group specific dialects of killer whales (Orcinus orca) in British Columbia, in: R. Payne (Ed.), Communication and Behavior of Whales, AAAS Selected Symposia Series, Westview, Boulder, CO, 1983, pp. 129–161.

[10] A.D. Goodson, M. Amundin, R.H. Mayo, D. Newborough, P.A. Lepper, C. Lockyer, F. Larsen, C. Blomqvist, Aversive sounds and sound pressure levels for the harbour porpoise (Phocoena phocoena): an initial field study, Proceedings of ICES Conference, Baltimore, USA, September 25–October 1, 1997.


[11] A.D. Goodson, D. Newborough, B. Woodward, Set gillnet acoustic deterrents for harbour porpoises, Phocoena phocoena: improving the technology, Proceedings of ICES Conference, Baltimore, USA, September 25–October 1, 1997.

[12] B. McCowan, D. Reiss, Quantitative comparison of whistle repertoires from captive adult bottlenose dolphins (Delphinidae, Tursiops truncatus): a re-evaluation of the signature whistle hypothesis, Ethology 100 (1995) 194–209.

[13] L.R. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition, Proc. IEEE 77 (2) (1989) 257–285.

[14] P.A. Saillant, J.A. Simmons, S.P. Dear, T.A. McMullen, A computational model for echo processing in frequency modulated echolocating bats: the spectrogram correlation and transformation receiver, J. Acoust. Soc. Amer. 9 (2) (1992) 1150–1163.

[15] C.R. Sturtivant, S. Datta, Techniques to isolate dolphin whistles and other tonal sounds from background noise, Acoust. Lett. 18 (10) (1995) 189–193.

[16] C.R. Sturtivant, S. Datta, The isolation from background noise and characterisation of bottlenose dolphin (Tursiops truncatus) whistles, J. Acoust. Soc. India 23 (4) (1995) 199–205.

[17] C.R. Sturtivant, S. Datta, An acoustic aid for population estimates, Eur. Res. Cetaceans 11 (1997).

[18] F. Thomsen, D. Franck, J.K.B. Ford, Characteristics of whistles from the acoustic repertoire of resident killer whales (Orcinus orca) off Vancouver Island, British Columbia, J. Acoust. Soc. Amer. 109 (3) (2001) 1240–1246.

[19] R.S. Wells, A.B. Irvine, M.D. Scott, The social ecology of inshore odontocetes, in: L.M. Herman (Ed.), Cetacean Behavior: Mechanisms and Functions, Robert E. Krieger, FL, 1980, pp. 263–317.

[20] B. Würsig, M. Würsig, The photographic determination of group size, composition, and stability of coastal porpoises (Tursiops truncatus), Science 198 (1977) 755–756.