3GPP TR 26.975 Performance Characterization AMR Speech Codec
ETSI/SMG11 "Speech Aspects"
Presentation of SMG11 Activities to Tiphon
Outline
• SMG11
• GSM Speech Codecs
• GSM Enhanced Full Rate Codec
• Tandem Free Operation
• Adaptive Multi-Rate (AMR) Codec
• Narrowband AMR
• Wideband AMR
• UMTS Matters
• Next Meetings
SMG11
• ETSI STC SMG11 is the competent body responsible for speech aspects of the GSM and UMTS standards (since 1996)
• SMG11 Chairman: Mr. Kari Järvinen, Nokia
• SMG11 plenary meets four times a year
• Additional extraordinary meetings as needed
• SMG11 currently consists of three sub-groups
• TFO sub-group (Tandem Free Operation issues)
• AMR sub-group (Adaptive Multi-Rate codec issues)
• SQ sub-group (Speech Quality issues)
• Sub-groups may have ad-hoc meetings between SMG11 plenary meetings
• Typical attendance in SMG11 plenary meetings is between 30 and 50
• SMG11 e-mail reflector as well as sub-group reflectors are extensively used between the meetings
GSM Speech Codecs
• GSM has so far standardised three codecs
• 13 kbps GSM-FR (1987); good cellular quality and robust operation in the presence of background noise
• 5.6 kbps GSM-HR (1994); possibility for higher system capacity at the expense of slightly lower speech quality in some conditions (particularly in background noise)
• 12.2 kbps GSM-EFR (1996); high quality even exceeding the G.726 "wireline reference" under clear channel conditions and in background noise
• SMG11 is currently in the process of defining an Adaptive Multi-Rate (AMR) codec which will be the fourth GSM speech codec
Subjective speech quality   GSM FR   GSM HR   GSM EFR   No coding
Clean conditions (MOS)      3.71     3.85     4.43      4.61
Vehicle noise (DMOS)        3.83     3.45     4.25      4.42
Street noise (DMOS)         3.92     3.56     4.18      4.35
Source: TR 06.85 v5.0.0 (1998-07), "Subjective tests on the interoperability of the HR/FR/EFR speech codecs; single, tandem and tandem free operation"
GSM EFR Codec
• Selected as a basis for a new high quality speech service for PCS 1900 in the US in 1995 (formal standardization procedure in TIA and T1 completed in 1996)
• ETSI standardized the same codec for GSM in 1996
• Provides high quality speech service for GSM, GSM 1800 (DCS 1800), and GSM 1900 (PCS 1900) systems in all continents
• Technical summary
• Source coding rate 12.2 kbps (channel coding 10.6 kbps)
• Based on the Algebraic CELP (ACELP) algorithm
• Speech frame size and algorithmic delay 20 ms
• Optional VAD/DTX function with comfort noise generation
• Example implementation for error concealment
• Complexity (encoder/decoder) approximately 18 MIPS (processor dependent)
• Memory requirement (incl. RAM and ROM) approximately 16-19k 16-bit words
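The rate figures in the technical summary can be cross-checked with a little arithmetic. The sketch below assumes only the numbers quoted above (12.2 kbps source coding, 10.6 kbps channel coding, 20 ms frames) and converts them to bits per frame; the constant and function names are illustrative and not taken from any specification.

```python
# Figures quoted in the EFR technical summary (bit/s and ms).
SOURCE_RATE_BPS = 12_200      # speech (source) coding rate
CHANNEL_CODING_BPS = 10_600   # redundancy added by channel coding
FRAME_MS = 20                 # speech frame size


def bits_per_frame(rate_bps: int, frame_ms: int = FRAME_MS) -> int:
    """Number of bits carried in one frame at the given bit rate."""
    return rate_bps * frame_ms // 1000


gross_rate_bps = SOURCE_RATE_BPS + CHANNEL_CODING_BPS
speech_bits = bits_per_frame(SOURCE_RATE_BPS)
gross_bits = bits_per_frame(gross_rate_bps)

print(f"gross rate: {gross_rate_bps / 1000} kbit/s")
print(f"speech bits per frame: {speech_bits}")
print(f"gross bits per frame: {gross_bits}")
```

The sum comes to 22.8 kbit/s, which matches the GSM full rate channel rate quoted in the AMR part of this presentation, and corresponds to 244 speech bits out of 456 gross bits in each 20 ms frame.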
GSM EFR Speech Quality
• GSM EFR speech quality is characterized in the ETSI Technical Report “Performance Characterisation of the GSM EFR speech codec”, GSM 06.55.
• Additional performance data can be found in the ETSI Technical Report "Subjective tests on the interoperability of the HR/FR/EFR speech codecs; single, tandem and tandem free operation", GSM 06.85
• The GSM EFR codec has been included in numerous other formal and informal subjective listening tests and extensive test data is available
• The examples in the following slides are an extract of test results from COMSAT Laboratories obtained during the PCS 1900 EFR codec standardization, comparing:
12.2 kbps EFR codec
32 kbps G.726 codec
8 kbps G.729
(13 kbps GSM FR codec)
GSM EFR Performance
• Basic speech quality at different input levels and tandeming
Test condition                               G.726 at 32 kbit/s   GSM EFR   G.729
Clean speech, high level, -16 dBOL (MOS)     3.7                  3.8       3.4
Clean speech, medium level, -26 dBOL (MOS)   3.6                  3.6       3.3
Clean speech, low level, -36 dBOL (MOS)      3.0                  2.9       2.7
Self-tandem = codec-codec tandem (MOS)       3.1                  3.4       2.9
Tandem with G.726 at 32 kbit/s (MOS)         3.2                  3.6       3.3
GSM EFR Performance
• Performance in background noise
Test condition                                   G.726 at 32 kbit/s   GSM EFR   G.729
Background noise, Home noise 20 dB (DMOS)        4.5                  4.6       4.3
Background noise, Car noise 10 dB (DMOS)         4.4                  4.5       3.9
Background noise, Car noise 20 dB (DMOS)         4.6                  4.6       4.1
Background noise, Street noise 10 dB (DMOS)      3.7                  4.1       3.7
Background noise, Office noise 20 dB (DMOS)      4.3                  4.5       3.7
GSM EFR Performance
• Performance in error conditions
Test condition                          Frame error rate   BER class 2   GSM FR   GSM EFR
Clean speech, No errors (MOS)           0.0%               ≈0%           3.4      4.1
Clean speech, 13 dB C/I, 30 mph (MOS)   ≈0.0%              ≈2%           3.3      4.0
Clean speech, 10 dB C/I, 30 mph (MOS)   ≈0.5%              ≈4%           3.0      3.8
Clean speech, 7 dB C/I, 30 mph (MOS)    ≈3.0%              ≈8%           2.3      3.2
• In GSM, part of the coded bits are protected by a convolutional code, and residual errors are detected via CRC. The frame error rate for this part is indicated above. The rest of the data is unprotected and is subject to the class 2 BER indicated above.
• The frame error rates are not directly comparable to quality figures with no residual errors
Tandem Free Operation (TFO)
• Motivation: "Unnecessary" dual speech encoding and decoding in mobile-to-mobile calls can significantly decrease speech quality
• TFO avoids the extra decoding and re-encoding otherwise performed in the network
• Applicable to all the three GSM codecs (FR, HR, and EFR)
• The same speech codec must be used in both mobile stations for TFO to work
• TFO Standardization ongoing in ETSI SMG11 TFO Sub-group
• Work started (TFO sub-group established) in early 1996
• Target: specifications ready by 4Q/1998 (ETSI GSM release 98)
• Current work concentrating on completing four Annexes to Stage 3 description: in-band signalling, operation with In-Path Equipments (IPEs), SDL definition, test vectors
• The TFO Stage 3 GSM 04.53 will be forwarded to SMG#27 plenary in October-98
• Formal subjective tests to evaluate the audible effects of TFO signalling are being carried out by Coherent
MS-to-MS Call, no TFO
[Figure: call path MSa -> BSS -> TRAU -> MSC (A-side PLMN) -> MSC -> TRAU -> BSS -> MSb (B-side PLMN), with two encoding and two decoding stages along the path. Link labels: 8 or 16 kbit/s voice coded speech; 64 kbit/s PCM coded speech.]
Effect of Tandeming
MOS value
Speech codec         One encoding and decoding (normal)   Two encodings and decodings (tandem)
Enhanced Full Rate   4.43                                 4.29
Full Rate            3.71                                 3.13
Half Rate            3.85                                 3.15
Source: TR 06.85 v2.0.0 (1998-06), "Subjective tests on the interoperability of the HR/FR/EFR speech codecs; single, tandem and tandem free operation"
Note: The above results are from clean conditions (no background noise, no channel errors)
Effect of Tandeming in Error Conditions
MOS value
Speech codec         One encoding and decoding (normal)   Two encodings and decodings (tandem)
Enhanced Full Rate   4.12                                 3.45
Full Rate            3.41                                 2.64
Half Rate            3.68                                 2.77
Source: TR 06.85 v2.0.0 (1998-06), "Subjective tests on the interoperability of the HR/FR/EFR speech codecs; single, tandem and tandem free operation"
Note: EP1 error condition was used (moderate errors).
Effect of Tandeming in Background Noise
MOS value
Speech codec         One encoding and decoding (normal)   Two encodings and decodings (tandem)
Enhanced Full Rate   4.25                                 3.87
Full Rate            3.83                                 3.34
Half Rate            3.45                                 2.38
Source: TR 06.85 v2.0.0 (1998-06), "Subjective tests on the interoperability of the HR/FR/EFR speech codecs; single, tandem and tandem free operation"
Note: Vehicle noise of 10 dB was used.
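To make the cost of tandeming easier to compare across the three conditions, the short sketch below computes the MOS drop caused by the second encoding and decoding. The values are copied from the three tandeming tables above; the dictionary layout and the function name are illustrative choices, not part of TR 06.85.

```python
# MOS values copied from the three tandeming tables above:
# condition -> codec -> (one encoding/decoding, tandem).
TANDEM_RESULTS = {
    "clean": {"EFR": (4.43, 4.29), "FR": (3.71, 3.13), "HR": (3.85, 3.15)},
    "errors (EP1)": {"EFR": (4.12, 3.45), "FR": (3.41, 2.64), "HR": (3.68, 2.77)},
    "vehicle noise": {"EFR": (4.25, 3.87), "FR": (3.83, 3.34), "HR": (3.45, 2.38)},
}


def tandem_loss(results):
    """Return condition -> codec -> MOS drop caused by the second encoding."""
    return {
        condition: {
            codec: round(single - tandem, 2)
            for codec, (single, tandem) in codecs.items()
        }
        for condition, codecs in results.items()
    }


for condition, losses in tandem_loss(TANDEM_RESULTS).items():
    print(condition, losses)
```

On these figures the loss ranges from 0.14 MOS (EFR, clean conditions) to 1.07 MOS (HR, vehicle noise), which is the degradation TFO is intended to avoid.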
TFO Modes
• Two modes in TFO
• Establishment mode: the necessary conditions for TFO are verified with inaudible bit stealing
• Verify whether both transcoders support TFO
• Possible change of speech codecs to enable TFO
• Duration typically 0.5-1.0 seconds
• TFO mode: speech is transmitted compressed through the whole network with bit stealing that guarantees smooth transitions in all situations
• TFO includes the proper means to ensure TFO also when In Path Equipment such as Echo Cancellers and DCMEs are used in the fixed network
MS-to-MS Call, with TFO
[Figure: call path MSa -> BSS -> TRAU -> MSC (A-side PLMN) -> MSC -> TRAU -> BSS -> MSb (B-side PLMN). Labels: Encoding, Decoding; 8 or 16 kbit/s; 56 or 48 kbit/s.]
TFO Mode
[Figure: bit layout of a PCM sample in TFO mode; X = PCM coded speech bit, Y = voice coded speech bit]
X X X X X X X Y   56 kbit/s PCM coded speech + 8 kbit/s voice coded speech
X X X X X X Y Y   48 kbit/s PCM coded speech + 16 kbit/s voice coded speech
• Coded speech is transmitted in the LSBs of the PCM samples in the A interface with the decoded PCM samples
• Both types of speech presentations (PCM and coded) are available at the receiving end
• Minor speech degradation in TFO - non TFO transition due to bit-stealing (increased noise) when the 48/56 kbit/s speech samples are used for a very short period
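The bit-stealing idea described above can be illustrated with a few lines of code. This is a minimal sketch, assuming 8-bit PCM samples at 8000 samples/s (the usual 64 kbit/s PCM format); it simply overwrites the one or two least significant bits of each sample with coded-speech bits. It does not model the actual TFO frame structure, synchronisation or in-band signalling, and the function names are hypothetical.

```python
SAMPLE_RATE_HZ = 8000   # assumption: 8-bit PCM at 8 kHz, i.e. 64 kbit/s
BITS_PER_SAMPLE = 8


def split_rates(stolen_bits):
    """Return (PCM rate, coded-speech rate) in bit/s for the given LSB count."""
    coded = stolen_bits * SAMPLE_RATE_HZ
    pcm = (BITS_PER_SAMPLE - stolen_bits) * SAMPLE_RATE_HZ
    return pcm, coded


def embed_coded_bits(pcm_samples, coded_bits, stolen_bits=1):
    """Overwrite the `stolen_bits` LSBs of each sample with coded-speech bits."""
    mask = (1 << stolen_bits) - 1
    bit_source = iter(coded_bits)
    out = []
    for sample in pcm_samples:
        payload = 0
        for _ in range(stolen_bits):
            # Pad with zeros if the coded bit stream runs out.
            payload = (payload << 1) | next(bit_source, 0)
        out.append((sample & ~mask & 0xFF) | payload)
    return out


def extract_coded_bits(samples, stolen_bits=1):
    """Read the coded-speech bits back from the LSBs of each sample."""
    bits = []
    for sample in samples:
        for shift in range(stolen_bits - 1, -1, -1):
            bits.append((sample >> shift) & 1)
    return bits


if __name__ == "__main__":
    print(split_rates(1))   # one stolen bit per sample
    print(split_rates(2))   # two stolen bits per sample
    carrier = embed_coded_bits([0xFF, 0x00, 0xAA, 0x55], [0, 1, 1, 0])
    print([hex(s) for s in carrier], extract_coded_bits(carrier))
```

With one stolen bit the split is 56 kbit/s PCM plus 8 kbit/s coded speech, and with two stolen bits it is 48 kbit/s plus 16 kbit/s, matching the figures in the slide. The upper bits still carry a usable (slightly noisier) PCM signal, which is why both representations are available at the receiving end.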
Adaptive Multi-Rate Codec
• Source codec rates probably between 4 kbit/s and 14.4 kbit/s (no fixed source rate requirements)
• Operation in both GSM full rate (22.8 kbps) and half rate (11.4 kbps) channels
• Main advantages in GSM
• Increased robustness against channel errors
• Enhanced quality in the half-rate channel in good channel conditions
• Codec rate selected dynamically depending on radio conditions and local capacity requirements
• Codec bit rate selected by an adaptation algorithm specific to the system application, e.g. GSM or UMTS
• Generic speech codec applicable to many mobile systems
• High AMR performance targets and the flexibility obtained by the switchable codec bit-rates (modes) have made it an interesting candidate for UMTS and IMT2000.
• Ability to adapt the bit-rate in a wide range may also be of interest for VoIP applications
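The trade-off behind the multi-rate idea can be sketched as follows: within a fixed gross channel rate, a lower source rate leaves more room for channel coding, so a lower-rate mode is preferred when the radio link is poor. The gross rates (22.8 and 11.4 kbps) come from the list above; the mode table, its C/I thresholds and its source rates are purely hypothetical values chosen for illustration, and are not taken from any AMR candidate or specification.

```python
# Gross channel rates stated in the text (kbit/s).
GROSS_RATE_KBPS = {"FR": 22.8, "HR": 11.4}

# HYPOTHETICAL mode table: (minimum C/I in dB, source rate in kbit/s),
# ordered from best to worst radio conditions. Illustrative values only.
EXAMPLE_MODES = [(13.0, 12.0), (7.0, 8.0), (float("-inf"), 5.0)]


def channel_coding_kbps(channel, source_kbps):
    """Rate left for channel coding once the source rate is chosen."""
    gross = GROSS_RATE_KBPS[channel]
    if source_kbps >= gross:
        raise ValueError("source rate does not fit in the channel")
    return round(gross - source_kbps, 1)


def select_mode(c_over_i_db, modes=EXAMPLE_MODES):
    """Pick the first (highest-rate) mode whose C/I threshold is met."""
    for min_c_over_i_db, source_kbps in modes:
        if c_over_i_db >= min_c_over_i_db:
            return source_kbps
    raise ValueError("mode table has no fallback entry")


for c_over_i in (16.0, 10.0, 3.0):
    rate = select_mode(c_over_i)
    print(c_over_i, "dB ->", rate, "kbit/s source,",
          channel_coding_kbps("FR", rate), "kbit/s channel coding")
```

A real adaptation algorithm would be system specific, as noted above, and would typically add hysteresis and signalling so that the encoder and decoder agree on the mode in use.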
Adaptive Multi-Rate Codec Schedule
• Qualification testing has been completed on schedule
• Substantial improvements demonstrated, justifying the AMR technique
• 5 codecs advanced to the selection phase
• Good expectation that all, or nearly all, requirements will be met
• Selection phase to end by September 1998
• The AMR speech codec specifications are planned to be completed by December 1998
• The AMR codec will be selected from among five different proposals passing the qualification phase
Alcatel/BT/Cellnet/France Telecom/Nortel/Rockwell
Ericsson/Nokia 1
Ericsson/Nokia 2
Lucent
NEC
Delivery Dates of AMR Specifications
Target date: December 1998 (required)
• source codec
• channel codec
• bad frame handling
• in-band signalling of codec mode - transmission aspects and definition of parameters
• in-band signalling of channel metric and side information - transmission aspects (bit allocation and channel protection)
Target date: December 1998 (objective)
• VAD/DTX/comfort noise generation
• definition of channel metric and side information parameters
• example of codec mode adaptation
• layer 3 signalling
Target date: June 1999
• AMR TRAU frames
• channel performance tables (GSM 05.05)
• TFO
• test sequences
Target date: December 1999
• performance characterisation
• [minimum performance of adaptation algorithms]
AMR Speech Quality Requirements
            Full-Rate Channel                     Half-Rate Channel
C/I         Ideal case         Worst case         Ideal case         Worst case
            performance        performance        performance        performance
            (requirement)      (objective)        (requirement)      (objective)
no errors   EFR no errors      G.728 no errors    G.728 no errors    FR no errors
19 dB       EFR no errors      G.728 no errors    G.728 no errors    FR no errors
16 dB       EFR no errors      G.728 no errors    G.728 no errors    FR at 10 dB
13 dB       EFR no errors      G.728 no errors    FR at 13 dB        FR at 7 dB
10 dB       G.728 no errors    EFR at 10 dB       FR at 10 dB        FR at 4 dB
7 dB        G.728 no errors    EFR at 7 dB        FR at 7 dB
4 dB        EFR at 10 dB       EFR at 4 dB        FR at 4 dB
Table 1a: Clean speech requirements and objectives under static test conditions.
• Static error conditions: without background noise
AMR Speech Quality Requirements
            Full-Rate Channel                                 Half-Rate Channel
C/I         Ideal case               Worst case               Ideal case                           Worst case
            performance              performance              performance                          performance
            (requirement)            (objective)              (requirement)                        (objective)
no errors   EFR no errors            G.729 and FR no errors   better than G.729 and FR no errors   G.729 and FR no errors
19 dB       EFR no errors            G.729 and FR no errors   better than G.729 and FR no errors   G.729 and FR no errors
16 dB       EFR no errors            G.729 and FR no errors   better than G.729 and FR no errors   FR at 10 dB
13 dB       EFR no errors            G.729 and FR no errors   FR at 13 dB                          FR at 7 dB
10 dB       G.729 and FR no errors   FR at 10 dB              FR at 10 dB                          FR at 4 dB
7 dB        G.729 and FR no errors   FR at 7 dB               FR at 7 dB
4 dB        FR at 10 dB              FR at 4 dB               FR at 4 dB
Table 1b: Background noise requirements and objectives under static test conditions.
• Static error conditions: in the presence of background noise
AMR Speech Quality Requirements
Full-Rate Channel
Requirement: Same or better than the EFR under the same conditions, and also the same or better than all the AMR full rate tested modes under the same conditions
Objective 1: Same or better than the EFR using the error pattern + 3 dB
Objective 2: Same or better than the EFR using the error pattern + 6 dB
Table 2a: Requirements and objectives under dynamic test conditions for the full-rate channel
Half-Rate Channel
Requirement: Same or better than the FR under the same conditions, and also the same or better than all the AMR half rate tested modes under the same conditions
Objective 1: Same or better than the FR on a full rate channel using the error pattern + 3 dB
Objective 2: Same or better than the FR on a full rate channel using the error pattern + 6 dB
Table 2b: Requirements and objectives under dynamic test conditions for the half-rate channel
• Dynamic conditions (no background noise)
AMR Design Constraints
• Some AMR design constraints (simplified to a general form)
• Only very moderate complexity increase compared to existing GSM codecs
• Maximum source coding rate for FR channel modes is 14.4 kbit/s (due to 16 kbit/s sub multiplexing)
• In-band signalling for codec modes. Independent adaptation on the up- and down-links.
• The AMR codec shall support Tandem Free Operation
• The AMR codec shall support DTX operation
• The AMR codec and its control will operate without any changes to the air-interface channel multiplexing, with the possible exception of the interleave depth.
• It shall be possible to operate power control independently of the AMR adaptation. Not included in qualification and selection tests.
AMR Design Constraints
• Some AMR design constraints (continued)
• Codec mode control relating to capacity or radio link quality should be located in the network (BSS).
• Transmission delay: The total algorithmic round trip delay is limited to EFR + 10 ms in the AMR FR channel, and HR + 10 ms in the AMR HR channel.
• Frame size: 5 ms, 10 ms or 20 ms
• The AMR in-band signalling shall be expandable to signal the use of future AMR modes including signalling the use of the existing GSM FR, GSM HR and GSM EFR speech coders, one or two wideband modes and all AMR speech codec modes in FR channel mode (to guarantee proper TFO operation).
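Some of the numeric constraints listed above lend themselves to a simple mechanical check. The sketch below is an illustration only: it encodes the frame size options, the 14.4 kbit/s limit for FR channel modes and the "+10 ms" delay margin, and takes the reference codec's round trip delay as a parameter because that figure is not given in this presentation. The function name and its arguments are hypothetical.

```python
ALLOWED_FRAME_MS = (5, 10, 20)   # permitted frame sizes
MAX_FR_SOURCE_KBPS = 14.4        # limit due to 16 kbit/s sub multiplexing
DELAY_MARGIN_MS = 10             # allowed excess over the reference codec


def check_candidate(frame_ms, fr_source_kbps, round_trip_delay_ms,
                    reference_delay_ms):
    """Return a list of violated constraints (empty if none are violated).

    reference_delay_ms is the round trip delay of the reference codec
    (EFR for the FR channel, HR for the HR channel); it must be supplied
    by the caller because it is not stated in the text.
    """
    problems = []
    if frame_ms not in ALLOWED_FRAME_MS:
        problems.append(f"frame size {frame_ms} ms is not one of {ALLOWED_FRAME_MS}")
    if fr_source_kbps > MAX_FR_SOURCE_KBPS:
        problems.append(f"source rate {fr_source_kbps} kbit/s exceeds {MAX_FR_SOURCE_KBPS}")
    if round_trip_delay_ms > reference_delay_ms + DELAY_MARGIN_MS:
        problems.append("round trip delay exceeds reference + 10 ms")
    return problems


# Made-up candidate figures, for illustration only.
print(check_candidate(20, 12.2, 25, reference_delay_ms=20))
print(check_candidate(15, 16.0, 40, reference_delay_ms=20))
```

The other constraints (complexity, TFO and DTX support, location of mode control) are qualitative and would need to be assessed separately.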
Qualification tests
• The expected performance of the AMR candidates was evaluated in qualification tests
• Tests conducted in FR and HR channels, including
• Clean speech
no errors and C/I 19 dB to 1 dB
• Speech in background noise with channel errors
street noise (@15 dB SNR)
car noise (@15 dB SNR)
• Tandeming
• Speech level dependency
• Switching between codec modes
• Dynamic C/I: 5 error profiles
3 profiles for downlink test
2 profiles for uplink test
Overall performance aims
[Figure: performance versus C/I (dB), ideal frequency hopping, C/I axis from 0 to 30 dB; curves: AMR-FR envelope, AMR-HR envelope, EFR, HR]
Introduce improvements where they are needed
• low C/I in FR mode
• high C/I in HR mode.
Qualification results overview
• Major benefits of AMR technique demonstrated especially
• low C/I in FR mode (1 - 2 delta MOS)
• high C/I in HR mode (same as G.728 - wireline)
• dynamic conditions in FR mode (up to 1.6 delta MOS)
• Several codecs close to meeting all the requirements
• Most challenging condition - background noise in HR mode
Static C/I - examples
[Figure: four charts labelled "FR Channel" and "HR channel", each plotting MOS (1.00 to 5.00) against test condition for Rate A, Rate B, Rate C and the specification ("Spec.")]
• Experiment 1a - Family of Curves; conditions: No Errors @ -26 dBovl, C/I = 19, 16, 13, 10, 7, 4 and 1 dB
• Experiment 1b - Family of Curves; conditions: No Errors @ -26 dBovl, C/I = 19, 16, 13, 10, 7, 4 and 1 dB
• Experiment 2a - Family of Curves in FR; conditions: FR No Errors, FR EC16, FR EC10, FR EC4
• Experiment 2a - Family of Curves in HR; conditions: HR No Errors, HR EC19, HR EC13, HR EC7
Background noise; static C/I - examples
[Figure: two charts labelled "FR channel" and "HR channel", with the failed conditions marked]
Dynamic C/I - examples
• Dynamic test designed to evaluate AMR performance in a “realistic” radio environment with codec adaptation turned on
• Consistent results demonstrated by all candidates
• adaptation mechanism finds best codec mode
• in FR mode, significant improvement compared to fixed rate codec reference, EFR (up to 1.6 delta MOS)
• in HR mode, quality equivalent to GSM FR or better (improvement sensitive to dynamic profile)
[Figure: two charts of MOS against dynamic error condition (DEC1 to DEC5)]
• Typical Result in FR: "Experiment 4a Test Results", MOS axis 1.00 to 4.00; legend: Ytest, EFR, Rate C, Rate B, Rate A, Rate D
• Typical Result in HR: "Experiment 4b Test Results", MOS axis 1.00 to 3.50; legend: Ytest, FR, Rate A, Rate B, Rate C, Rate D
Examples of dynamic conditions
• Dynamic error profiles from Radio Simulator (SMG2)
• One minute long
• Up and down links
• Correlation of C/I between up and down links controlled
[Figure: four plots against time (0 to 60 s) for profiles etsiq3 and etsiq11. "C and I profile": C and I in dBm for DL C, UL C, DL I and UL I. "C/(I+N) profile": C/(I+N) in dB for DL and UL.]
Wideband AMR
• The narrowband AMR work will continue with the specification of a wideband mode
• No target date for finalized specification yet
• Feasibility phase on-going
• Discussion on Design Constraints and Recommended audio bandwidth
• Preliminary working assumption for optimum audio bandwidth (to be confirmed)
• 100 Hz to 7 kHz (possibly also 100 Hz to 5 kHz)
• In some types of background noise, advantages to reducing low frequencies
• So far, there has been little activity on wideband AMR due to work load on the narrowband AMR
• Several organisations indicated they are studying wideband AMR.
• Results probably not available until end 1998.
UMTS Matters
• Liaisons with ARIB (Japan)
• Set-up collaboration on UMTS/IMT-2000 speech coding matters
• ARIB representatives attending SMG11 meetings
• AMR in UMTS and IMT-2000
• Working assumption for UMTS (decision from SMG#26, subject to re-evaluation after the AMR selection)
• A possible candidate for IMT-2000 in ARIB, if standardized on schedule
• WCDMA simulations
• Initial simulation results with the GSM EFR codec and the AMR concept in a WCDMA channel have been presented to SMG11
New Work Item: Noise Suppression
• A new Work Item on Noise Suppression with AMR was approved by SMG in June
• Optional DSP feature to reduce audio background noise
• Can improve ease of conversation
• Located ahead of the speech codec
• Effective in many but not all background noise environments
• Optimised for the AMR speech codec
• Standardisation to guarantee minimum performance level
• The work has not started yet, and the scope of the work and possible standardization have not been fully defined and agreed
Next SMG11 Plenary Meetings
• SMG11#7: 28 September - 2 October 1998; Sophia Antipolis; host Texas Instruments
• SMG11#8: 11 - 15 January 1999
• SMG11#9: 3 - 5 June 1999