16th ENFSI FSAAWG MEETING
Wiesbaden 25-26 September 2014
Andrea Paoloni
Fondazione Ugo Bordoni
Preliminary results of the
collaborative exercise on
reliability of transcripts
Wiretaps
In crime-fighting activities, wiretaps are used extensively, and the recordings are then transcribed for use by the prosecutor. Many of these are environmental recordings and, among these, many have very poor audio quality and are therefore unintelligible.
Reliability
The open problem is to assess the reliability of transcripts of these degraded audio signals.
Objectives of this collaborative exercise
• to check the reliability of a transcription as a
function of quantitative parameters such as
signal-to-noise ratio and reverberation;
• to assess whether intelligibility varies markedly
across different languages under the same noise
conditions;
• to verify the effectiveness of signal enhancement
in improving the intelligibility of the degraded
audio signal.
First subjective test composition
In this collaborative exercise, we selected
16 of the 20 sentences you sent, to make
the test less burdensome.
One meaningless sentence has been
replaced by an English sentence. The
degradations considered include both additive
and convolutive noise. In particular, the
corpus was built to simulate an
office environment with babble noise. All
speech files were filtered to simulate a typical
telephone line.
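The two degradations named above can be sketched in a few lines. This is a minimal illustration, not the procedure used to build the corpus: the function names, the toy impulse response, the synthetic "speech" tone, and the order of operations (reverberation first, then additive noise) are all assumptions, and the telephone-line filter is left out.

```python
import math
import random


def mean_power(samples):
    """Average power: mean of the squared samples."""
    return sum(s * s for s in samples) / len(samples)


def convolve(signal, impulse_response):
    """Direct-form linear convolution (the convolutive degradation)."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def mix_at_snr(speech, noise, target_snr_db):
    """Scale noise so the speech-to-noise power ratio equals target_snr_db,
    then add it to the speech (the additive degradation).

    Returns (mixture, scaled_noise).
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[: len(speech)]
    ratio = 10 ** (target_snr_db / 10)
    gain = math.sqrt(mean_power(speech) / (mean_power(noise) * ratio))
    scaled = [gain * n for n in noise]
    return [s + n for s, n in zip(speech, scaled)], scaled


def measure_snr_db(speech, noise):
    """Signal-to-noise ratio in dB from the two separate components."""
    return 10 * math.log10(mean_power(speech) / mean_power(noise))


# Hypothetical stand-ins: a 200 Hz tone for speech, Gaussian noise for babble.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 200 * t / 8000) for t in range(800)]
toy_room = [1.0, 0.0, 0.4]  # assumed 3-tap impulse response, not a real room
reverberant = convolve(speech, toy_room)
babble = [rng.gauss(0.0, 1.0) for _ in range(len(reverberant))]
mixture, scaled_babble = mix_at_snr(reverberant, babble, -4.0)
print(round(measure_snr_db(reverberant, scaled_babble), 2))
```

Measuring the ratio on the separate components, as `measure_snr_db` does, is only possible when the clean speech and the noise are both available, which is the case for a simulated corpus but not for a real wiretap.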
Presentation
Each audio file begins with 1 second of noise before the sentence starts.
The test set consists of 16 different test signals presented in random order. The listener types the sentence heard into the space provided. The following figure shows the application interface.
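The slides do not say how a typed sentence is scored against the original. One plausible choice, shown here only as an assumption, is the fraction of reference words that the listener reproduced in the right order. The function names and the normalization (lower-casing, dropping punctuation) are hypothetical.

```python
import difflib
import re


def words(text):
    """Lower-case the text and split it into words, ignoring punctuation."""
    return re.findall(r"\w+", text.lower())


def word_accuracy(reference, transcript):
    """Fraction of reference words matched, in order, by the transcript."""
    ref = words(reference)
    hyp = words(transcript)
    if not ref:
        raise ValueError("reference sentence is empty")
    matcher = difflib.SequenceMatcher(a=ref, b=hyp, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref)


# Example with an invented sentence pair (not taken from the test material).
print(word_accuracy("The cat sat on the mat.", "the cat sat on a mat"))
```

A word-level score like this penalizes a near miss ("a" for "the") as much as a completely wrong word, so a different metric may be preferable when meaning matters more than wording.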
Second subjective test
The second test is designed to investigate the role of noise suppression in speech intelligibility. We ask you to apply signal enhancement to improve intelligibility, using the methods you usually adopt. Please explain which algorithms you use and all the steps you take. All the audio files can be found in the application directory --> data --> 1. You can restore these files and then replace the original files with the restored ones.
Differences among listeners (Italian
listeners only)
[Figure: chart with legend entries IT -4 SS, IT -4 CS, IT 0 SS, IT 0 CS, Mean and STD; both axes run from 0 to 25.]
Notes
• Because of the scarcity of data, it is impossible to answer questions about the robustness of different languages to audio degradation.
• For the same reason, the results of the second experiment, concerning speech enhancement, do not allow us to draw conclusions.
• The results of the Italian listeners (all experts) showed great variability between different transcribers.
Preliminary results
• The intelligibility of the -4 dB N/S speech signal is less than 50%;
• The meaningless sentences are less intelligible than the meaningful ones;
• There is a large variation among different listeners.
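Variation among listeners can be summarized per condition with a mean and a standard deviation. The sketch below assumes one score per listener per condition; the condition labels and every number in it are invented placeholders, not results from this exercise.

```python
import statistics


def summarize(scores_by_condition):
    """Return {condition: (mean, sample standard deviation)} across listeners."""
    summary = {}
    for condition, scores in scores_by_condition.items():
        if len(scores) < 2:
            raise ValueError("need at least two listeners per condition")
        summary[condition] = (statistics.mean(scores), statistics.stdev(scores))
    return summary


# Placeholder scores for three hypothetical listeners (NOT measured data).
placeholder_scores = {
    "condition A": [4, 8, 12],
    "condition B": [10, 10, 13],
}

for condition, (mean, std) in summarize(placeholder_scores).items():
    print(f"{condition}: mean={mean:.2f} std={std:.2f}")
```

The sample standard deviation (`statistics.stdev`) is used on the assumption that the listeners are a sample from a larger population of transcribers; with only a handful of subjects per language the estimate is itself quite uncertain.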
Future work
• The first experiment was performed by 31 transcribers, 19 of whom are Italian.
• It would be desirable to have at least 5 subjects for each language.
• It would also be interesting to proceed with the evaluation of the effectiveness of speech enhancement (at least two different “filtrations” for each language).