A Full Frequency Masking Vocoder for Legal Eavesdropping Conversation Recording

R. F. B. Sotero Filho, H. M. de Oliveira (qPGOM), R. Campello de Souza

Signal Processing Group, Federal University of Pernambuco – UFPE

E-mail: [email protected], {hmo,ricardo}@ufpe.br


Abstract:

New approach for a vocoder

Based on: full frequency masking by octaves

Useful for saving bandwidth in applications requiring intelligibility

Recommended for: legal eavesdropping of long conversations.

Introduction

Vocoder = contraction of "voice encoder":

The synthesized waveform does not recreate the original waveform in appearance (but it should be perceptually similar to it)

First described by Homer Dudley at Bell Telephone Laboratories in 1939

Parameters are extracted from the spectrum and updated every 10-25 ms

Properties of voice: limitations of the human auditory system; physiology of the voice generation process

Psycho-Acoustics of the Human Auditory System

Frequency Masking:

Masking in frequency or "reduced audibility of a sound due to the presence of another"

Insensitivity to the phase:

The human ear has little sensitivity to the phase of signals

Simplification of the spectrum via frequency masking

For each voice segment: FFT of blocklength 160 (frame of 20 ms)

The spectrum is segmented into regions of influence (octaves).

The range 32-64 Hz is removed.

The remaining octaves are 64-128 Hz, 128-256 Hz, 256-512 Hz, and so on.

Each spectral sample corresponds to a multiple of 50 Hz

Table 1. Number of spectral lines per octave (DFT of length N=160, sample rate 8 kHz)

Octave (Hz)    # spectral samples/octave
32-64          1
64-128         1
128-256        3
256-512        5
512-1024       10
1024-2048      20
2048-4096      39

A total of 79 frequencies (DFT with N=160) is reduced to 4 survivors (roughly 5% of the spectral components).
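The counts in Table 1 follow directly from the 50 Hz line spacing (8000 Hz / 160). A minimal sketch that reproduces them is given here; the names `lines_per_octave`, `OCTAVES`, `FS`, `N` and `DF` are illustrative and do not come from the slides.

```python
FS = 8000          # sample rate in Hz (from Table 1)
N = 160            # DFT length (from Table 1)
DF = FS / N        # spacing between spectral lines: 50 Hz

# Octave edges in Hz, as listed in Table 1.
OCTAVES = [(32, 64), (64, 128), (128, 256), (256, 512),
           (512, 1024), (1024, 2048), (2048, 4096)]


def lines_per_octave(octaves=OCTAVES, n=N, df=DF):
    """Count the DFT lines k*df, k = 1 .. n/2 - 1, that fall in each octave."""
    freqs = [k * df for k in range(1, n // 2)]  # 50 Hz .. 3950 Hz, 79 lines
    return [sum(1 for f in freqs if lo <= f < hi) for lo, hi in octaves]


counts = lines_per_octave()
for (lo, hi), c in zip(OCTAVES, counts):
    print(f"{lo}-{hi} Hz: {c}")
print("total:", sum(counts))
```

No multiple of 50 Hz coincides with a power of two, so it makes no difference whether the octave edges are treated as inclusive or exclusive.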

Figure 1. The spectrum of a voice frame computed by the FFT:

Original spectrum

Simplified full-masking spectrum

This technique is called full frequency masking.
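The masking step can be sketched as follows, under two assumptions that the slides imply but do not state: the survivor in each octave is the spectral line of greatest magnitude, and the four surviving octaves are the four highest ones in Table 1. The function names `dft` and `full_mask`, the octave list `ASSUMED_OCTAVES` and the synthetic test frame are all hypothetical; a plain O(N^2) DFT is used only to keep the example dependency-free.

```python
import cmath
import math

FS = 8000   # sample rate in Hz
N = 160     # frame length (20 ms at 8 kHz)

# Assumption: these four octaves are the ones that yield the 4 survivors.
ASSUMED_OCTAVES = [(256, 512), (512, 1024), (1024, 2048), (2048, 4096)]


def dft(frame):
    """Plain O(N^2) DFT; an FFT routine would normally be used instead."""
    n = len(frame)
    return [sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame))
            for k in range(n)]


def full_mask(spectrum, octaves=ASSUMED_OCTAVES, fs=FS):
    """Keep one line per octave (assumed: the strongest) and zero the rest."""
    n = len(spectrum)
    df = fs / n
    masked = [0j] * n
    survivors = []
    for lo, hi in octaves:
        bins = [k for k in range(1, n // 2) if lo < k * df < hi]
        if not bins:
            continue
        k_max = max(bins, key=lambda k: abs(spectrum[k]))
        masked[k_max] = spectrum[k_max]
        # Keep the mirrored line too, so the inverse transform stays real.
        masked[n - k_max] = spectrum[n - k_max]
        survivors.append(k_max)
    return masked, survivors


# Synthetic frame: a few tones placed exactly on 50 Hz lines.
tones = [(300, 1.0), (450, 0.3), (700, 0.8), (1500, 0.5), (3000, 0.2)]
frame = [sum(a * math.sin(2 * math.pi * f * t / FS) for f, a in tones)
         for t in range(N)]

spectrum = dft(frame)
masked, survivors = full_mask(spectrum)
print("surviving bins:", survivors)
print("surviving frequencies (Hz):", [k * FS / N for k in survivors])
```

In this toy frame the 450 Hz tone shares an octave with the stronger 300 Hz tone and is therefore discarded, which is the behaviour the simplified spectrum in Figure 1 illustrates.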

Signal synthesis via spectral filling

The beta distribution is a probability distribution defined over $0 \le x \le 1$, characterized by a pair of parameters $\alpha$ and $\beta$:

$P(x) = \frac{1}{B(\alpha,\beta)}\, x^{\alpha-1} (1-x)^{\beta-1}$, 1