8th and 9th June 2004 Mainz, Germany

Workshop on Wideband Speech Quality in Terminals and Networks: Assessment and Prediction

Vincent Barriac, Jean-Yves Le Saout, Catherine Lockwood

France Telecom R&D, Lannion, France

Teamlog, Lannion, France

Discussion on unified objective methodologies for the comparison of voice quality of narrowband and wideband scenarios

Context

Emergence of new services using wideband speech communications

Need to track the performance of communication channels mixing wideband and narrowband conditions (for example, scalable…)

How to evaluate the speech quality?

Subjective tests

Subjective tests for Narrowband conditions

Subjective tests for Wideband conditions

Subjective tests for mixed Narrowband and Wideband conditions?

Perceptual Evaluation of Speech Quality (PESQ)

PESQ for Narrowband Conditions

PESQ for Wideband Conditions?

PESQ for mixed Narrowband and Wideband Conditions?

Open Issues

Questions about subjective tests

Is it possible to merge narrowband and wideband subjective scales?

To adapt existing MOS scores for narrowband systems to such a common scale, should wideband references be introduced in all subjective tests?

Or, can we find a mapping function to adapt narrowband subjective MOS values to wideband equivalent values?

Questions about PESQ

Would wideband PESQ be adequate for measuring both wideband and narrowband codecs?

Is the mapping function of P.862.1 also applicable to wideband scenarios?

Finally, how should wideband PESQ values be compared with narrowband values?

First results on Mixed Narrowband & Wideband Subjective tests

Description of the subjective tests (1/4)

The assessment uses the ACR (Absolute Category Rating) method as given in Recommendation P.800. Each judgement has been collected on a 5-point quality scale, and scores have been assigned according to the classic ACR methodology:

5 Excellent
4 Good
3 Fair
2 Poor
1 Bad

Description of the subjective tests (2/4)

All conditions are level-adjusted to -26 dB with the P.56 algorithm.

Headphones are set at a constant nominal level of 79 dB SPL.

PESQ values are evaluated on the two sets of conditions in order to calibrate the judgement scale.
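The level adjustment step can be illustrated with a simplified sketch. Note that it is not the P.56 method: P.56 measures the active speech level (excluding pauses), whereas the code below uses a plain RMS level over all samples. It also assumes that -26 dB is expressed relative to full scale (sample amplitude 1.0); the function names and the test signal are hypothetical.

```python
import math

def rms_level_db(samples):
    """RMS level in dB relative to full scale (1.0), over ALL samples.

    Assumption: plain RMS, not the P.56 active speech level.
    """
    power = sum(s * s for s in samples) / len(samples)
    if power == 0.0:
        raise ValueError("signal is silent; level is undefined")
    return 10.0 * math.log10(power)

def normalize_level(samples, target_db=-26.0):
    """Scale samples so that their RMS level equals target_db."""
    gain = 10.0 ** ((target_db - rms_level_db(samples)) / 20.0)
    return [s * gain for s in samples]

# Hypothetical test signal: one second of a 440 Hz sine at 16 kHz, amplitude 0.5.
fs = 16000
signal = [0.5 * math.sin(2 * math.pi * 440 * n / fs) for n in range(fs)]
adjusted = normalize_level(signal)
print(f"before: {rms_level_db(signal):.2f} dB, after: {rms_level_db(adjusted):.2f} dB")
```

For real speech material the RMS level is biased by silent intervals, which is why an active-level measurement is used in practice.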

Description of the subjective tests (3/4)

Two tests:
– one narrowband test,
– a second, mixed narrowband and wideband test containing the same narrowband conditions as the first one.

For each ACR test, 3 different groups of 8 listeners. Each listening session divided into 2 sub-sessions.
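Each ACR test therefore collects votes on the 5-point scale from 3 groups of 8 listeners. A minimal sketch of how such votes might be reduced to a MOS with an approximate 95% confidence interval is given here. The function name, the normal-approximation interval and the example votes are assumptions for illustration, not data from the tests.

```python
import math
from statistics import mean, stdev

# Category labels of the 5-point ACR scale.
ACR_LABELS = {5: "Excellent", 4: "Good", 3: "Fair", 2: "Poor", 1: "Bad"}

def mos_with_ci(votes, z=1.96):
    """Return (MOS, half-width of an approximate 95% confidence interval)."""
    if any(v not in ACR_LABELS for v in votes):
        raise ValueError("ACR votes must be integers from 1 to 5")
    n = len(votes)
    if n < 2:
        raise ValueError("need at least two votes")
    # Normal approximation: z * sample standard deviation / sqrt(n).
    return mean(votes), z * stdev(votes) / math.sqrt(n)

# Hypothetical votes from 24 listeners (3 groups of 8) for one condition.
example_votes = [4, 4, 3, 5, 4, 3, 4, 4,
                 3, 4, 5, 4, 4, 3, 4, 4,
                 3, 4, 4, 5, 4, 3, 4, 4]
mos, ci = mos_with_ci(example_votes)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```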

Description of the subjective tests (4/4)

First test:

22 narrowband test conditions including:
– 4 standard codecs (alone or in tandeming conditions at different bit rates)
– 3 wideband codecs at different bit rates with output signals down-sampled to 8 kHz
– a clear channel reference down-sampled to 8 kHz.

Second test:

30 test conditions including:
– the 22 previous conditions
– 3 wideband codecs at 16 kHz at different bit rates
– a 16 kHz clear channel reference.

Speech quality evaluation on narrowband conditions only

[Figure: Results of the narrowband subjective test. Subjective MOS (scale 1 to 5) for conditions C1 to C21 and the Reference.]

Speech quality evaluation for narrowband and wideband conditions

[Figure: Results of the mixed narrowband/wideband subjective test. Subjective MOS (scale 1 to 5) for conditions C1 to C21 and the Reference. Annotation: "Improvement due to the increase of frequency range".]

Impact of mixing narrowband and wideband conditions

Decrease of the MOS values of narrowband conditions obtained in a "mixed narrowband/wideband" test in comparison with those obtained in a "narrowband only" test

[Figure: Relationship between MOS for narrowband conditions and MOS for wideband conditions. X axis: Subjective MOS (narrowband-only test), 1 to 5. Y axis: Subjective MOS (mixed NB/WB test), 1 to 5.]

Impact of mixing narrowband and wideband conditions

No change of MOS values on wideband conditions for a mixed narrowband/wideband test compared to MOS values on wideband conditions for a "wideband only" test

[Figure: X axis: Subjective MOS (wideband-only test). Y axis: Subjective MOS (mixed NB/WB test).]

Conclusion on Subjective tests

No need to systematically introduce wideband references in narrowband subjective tests

Better definition of the scale with a complete use of the MOS scale.

Transfer function to adapt narrowband MOS scores to mixed narrowband/wideband MOS scale.
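The slides do not give the form of this transfer function. As a minimal sketch, assuming a first-order (linear) relationship, it could be estimated by least squares from the MOS values of the narrowband conditions common to both tests. The function names and the data points below are hypothetical placeholders, not results from the tests.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def to_mixed_scale(mos_nb, slope, intercept):
    """Map a narrowband-only MOS onto the mixed NB/WB scale, clipped to [1, 5]."""
    return min(5.0, max(1.0, slope * mos_nb + intercept))

# Placeholder MOS pairs for the same narrowband conditions (NOT measured data).
mos_nb_only = [1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
mos_mixed = [1.3, 1.7, 2.1, 2.5, 2.9, 3.3]

slope, intercept = fit_line(mos_nb_only, mos_mixed)
print(f"mixed = {slope:.2f} * narrowband + {intercept:.2f}")
```

A slope below 1 would reflect the decrease of narrowband MOS values observed when wideband conditions are present in the same test.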

Extension of this result to PESQ

Adaptation & validation of PESQ for wideband conditions

Modification of the input filter

Use of a mapping function

Equation of the mapping function:

$$y = 1 + \frac{4}{1 + e^{-2x + 6}}$$

[Figure: plot of the mapping function, x axis from 0 to 5, y axis from 1 to 5.]
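Assuming the mapping function reads y = 1 + 4 / (1 + e^(-2x + 6)), with x taken as the raw wideband PESQ score and y as the mapped score on the MOS scale (this interpretation of x and y is an assumption), it can be sketched as follows. The function name is hypothetical.

```python
import math

def map_wideband_pesq(x):
    """Logistic mapping y = 1 + 4 / (1 + exp(-2x + 6)).

    Assumption: x is a raw wideband PESQ score, y the mapped score.
    The output is bounded between 1 and 5 and increases with x.
    """
    return 1.0 + 4.0 / (1.0 + math.exp(-2.0 * x + 6.0))

for raw in (1.0, 2.0, 3.0, 4.0, 4.5):
    print(f"raw {raw:.1f} -> mapped {map_wideband_pesq(raw):.2f}")
```

The curve is centred on x = 3, where the mapped value is exactly 3, and it compresses both ends of the scale.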

PESQ Results for wideband conditions

Good matching between MOS scores and PESQ values

Mapping function well adapted on test set

Results to be confirmed on more test material

[Figure: Relationship between PESQ mapped with wideband function and MOS for wideband conditions. X axis: Subjective MOS, 1 to 5. Y axis: Wideband PESQ, 1 to 5. Series: wideband mapping function used, and P.862.1, each with a linear fit.]
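The quality of the match between mapped PESQ values and subjective MOS is commonly summarised by a correlation coefficient and a root-mean-square error. The sketch below shows one way to compute both. The slides do not state which figures of merit were used, and the example values are hypothetical placeholders, not results from the tests.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rmse(x, y):
    """Root-mean-square error between two equal-length sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("need two non-empty sequences of equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

# Placeholder values per wideband condition (NOT measured data).
subjective_mos = [1.8, 2.4, 3.1, 3.7, 4.3]
mapped_pesq = [2.0, 2.5, 3.0, 3.6, 4.2]

print(f"correlation = {pearson(subjective_mos, mapped_pesq):.3f}")
print(f"RMSE        = {rmse(subjective_mos, mapped_pesq):.3f}")
```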

Conclusion on objective measures

Two objective PESQ measures:
– PESQ with the P.862.1 mapping function for narrowband studies.
– PESQ (including input filter modification) with a new mapping function (?) for wideband studies.

Transfer function:
– Merge of narrowband PESQ and wideband PESQ into a unified scale by the same transfer function as for subjective tests.

Conclusion

Perspectives

Possible Applications

Tool to evaluate the best compromise between bit rate and frequency range for scalable codecs.

Tool to calibrate the MOS scale coverage for subjective tests.

Extension of the E-model to wideband applications, with the determination of new equipment impairment factors Ie according to the usual procedure using listening test results.
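As a rough illustration of how an equipment impairment factor could be derived from listening test results, the sketch below inverts the narrowband R-to-MOS conversion of the E-model (ITU-T G.107) and takes Ie as the loss in R relative to a reference condition. This is a simplified assumption: the standardised derivation procedure includes normalisation steps omitted here, a wideband extension would need its own R scale, and the MOS values in the example are hypothetical.

```python
def r_to_mos(r):
    """Narrowband E-model conversion from rating factor R to MOS."""
    if r <= 0.0:
        return 1.0
    if r >= 100.0:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6

def mos_to_r(mos, lo=6.5, hi=100.0, tol=1e-6):
    """Invert r_to_mos by bisection on [6.5, 100], where it is increasing."""
    if not r_to_mos(lo) <= mos <= r_to_mos(hi):
        raise ValueError("MOS outside the invertible range of the conversion")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if r_to_mos(mid) < mos:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def equipment_impairment(mos_codec, mos_reference):
    """Simplified Ie estimate: loss in R relative to the reference condition."""
    return mos_to_r(mos_reference) - mos_to_r(mos_codec)

# Hypothetical MOS values for a codec and for the clear channel reference.
ie = equipment_impairment(mos_codec=3.5, mos_reference=4.2)
print(f"estimated Ie = {ie:.1f}")
```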

Questions?