UNIVERSITÉ DE SHERBROOKE - Philippe GOURNAY Senior Research Engineer VoiceAge Corporation University of Sherbrooke François ROUSSEAU, Roch LEFEBVRE - ICASSP 2003, Hong Kong, 6-10 April 2003 « Improved Packet Loss Recovery for Prediction- based Speech Coders »

Page 1: « Improved Packet Loss Recovery for Prediction-based Speech Coders »

Philippe GOURNAY, Senior Research Engineer, VoiceAge Corporation

University of Sherbrooke

François ROUSSEAU, Roch LEFEBVRE

ICASSP 2003, Hong Kong, 6-10 April 2003

Page 2: Error propagation (CELP decoder)

[Figure: One Lost Frame! Waveform panels: Original, Coded, Concealed, Error.]

Page 3: The ACELP coder

• Prediction (short- and long-term + quantization)

• Analysis-by-synthesis

[Block diagram. Blocks: LPC Analysis & Quantization, Innovative Codebook, Long-term Pred. Filter, Short-term (LPC) Filter, Perceptual Weighting, Error Minimization. Labels: local decoder, pulse amplitudes and locations, speech s_n, synthesized speech, error e_n, weighted error.]
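The analysis-by-synthesis principle named on this slide can be sketched in a few lines. This is a toy illustration, not the ACELP search itself: the first-order filter, the single-pulse codebook, the gain set and all function names are assumptions made for the example, and perceptual weighting and long-term prediction are left out.

```python
from itertools import product

def synthesize(excitation, a=0.9, state=0.0):
    """First-order all-pole synthesis filter: y[k] = x[k] + a * y[k-1]."""
    out = []
    for x in excitation:
        state = x + a * state
        out.append(state)
    return out

def analysis_by_synthesis(target, codebook, gains, a=0.9):
    """Run every (codevector, gain) pair through the local decoder and
    return (error, index, gain) for the pair closest to the target."""
    best = None
    for (index, vector), gain in product(enumerate(codebook), gains):
        synth = synthesize([gain * v for v in vector], a)
        err = sum((t - s) ** 2 for t, s in zip(target, synth))
        if best is None or err < best[0]:
            best = (err, index, gain)
    return best

# Hypothetical codebook: one signed unit pulse per 4-sample frame.
CODEBOOK = [[sign if k == pos else 0.0 for k in range(4)]
            for pos in range(4) for sign in (1.0, -1.0)]
GAINS = [0.5, 1.0, 2.0]

# The target is built from a pulse of amplitude -2 at position 2,
# so the search should recover exactly that pulse and gain.
target = synthesize([0.0, 0.0, -2.0, 0.0])
best_error, best_index, best_gain = analysis_by_synthesis(target, CODEBOOK, GAINS)
```

The point of the sketch is that the encoder contains a copy of the decoder (the "local decoder") and selects parameters by comparing synthesized output with the input speech, which is also why the decoder's internal state matters so much when a frame goes missing.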

Page 4: Improving the Robustness (1/4)

• Concealment:

Q: What to do when a (Binary) Frame is Missing?

A: Compute a Replacement (Speech) Frame

• Recovery:

Q: What to do when the Frames are Received Again?

A: Control Error Propagation

Page 5: Improving the Robustness (2/4)

• Sender-based Methods

– Forward Error Correction (FEC): costs Bandwidth

– Multiple Descriptions: cost Bandwidth, require Independent Transmission Paths

– Retransmission: adds Delay

Page 6: Improving the Robustness (3/4)

• Receiver-based Methods

– Frame Loss Concealment: Limited Effectiveness

– Interpolative Update of the Decoder Internal State: Very Limited Effectiveness

– Playout Buffering: Additional Delay, Speech Rate Adaptation
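The playout-buffering trade-off in the last bullet can be made concrete with a small classifier. This is only a sketch: the function name, the 20 ms frame duration and the arrival times are assumptions chosen for illustration, not values from the slides.

```python
def classify_frames(arrival_ms, frame_ms=20, playout_delay_ms=0):
    """Label each frame 'on_time', 'late', or 'lost'.

    arrival_ms[i] is the arrival time of frame i, or None if the packet
    never arrives. Frame i is due at i * frame_ms + playout_delay_ms.
    """
    labels = []
    for i, t in enumerate(arrival_ms):
        deadline = i * frame_ms + playout_delay_ms
        if t is None:
            labels.append("lost")       # never received
        elif t <= deadline:
            labels.append("on_time")    # available when the decoder needs it
        else:
            labels.append("late")       # received, but after its playout time
    return labels

# Hypothetical arrival times: frame 2 is delayed by jitter, frame 4 is dropped.
arrivals = [0, 20, 55, 60, None, 100]
no_buffer = classify_frames(arrivals)
with_buffer = classify_frames(arrivals, playout_delay_ms=20)
```

With no extra buffering, frame 2 is late. Adding 20 ms of playout delay makes it arrive on time, at the cost of 20 ms more end-to-end delay on every frame. A decoder that simply discards late frames treats "late" exactly like "lost".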

Page 7: Improving the Robustness (4/4)

• Summary

– Higher Bit Rate or Delay

– Late Packets are Considered as Lost

– Recovery Problem Largely Overlooked

Page 8: Update (Basic Idea)

• The concealment method does not correctly update the Internal State (I.S.) of the decoder

• We keep a copy of the past (“Good”) I.S. of the decoder before the concealment

• We use the late frame to update the I.S., starting from the past (“Good”) I.S.

• Smooth transition between Concealment and Recovery

– Best when done in the excitation domain
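The slides do not say how the transition is smoothed. One plausible reading, shown here purely as an assumption, is a linear cross-fade between the excitation that concealment would have produced and the excitation decoded after the update, applied before the synthesis filter. The function names and the first-order filter are hypothetical.

```python
def crossfade(concealed, recovered):
    """Linearly fade from the concealed excitation to the recovered one."""
    n = len(recovered)
    if len(concealed) != n:
        raise ValueError("both excitation frames must have the same length")
    out = []
    for k in range(n):
        w = (k + 1) / n   # weight of the recovered signal, reaches 1.0 at frame end
        out.append((1.0 - w) * concealed[k] + w * recovered[k])
    return out

def synthesize(excitation, a=0.9, prev_out=0.0):
    """Toy first-order synthesis filter: y[k] = x[k] + a * y[k-1]."""
    out = []
    for x in excitation:
        prev_out = x + a * prev_out
        out.append(prev_out)
    return out

# Smoothing happens on the excitation; the filter then shapes the blended signal.
blended = crossfade([1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0])
speech = synthesize(blended, a=0.5)
```

Blending before the synthesis filter means the filter memory evolves from a single, continuous excitation, rather than two already-filtered waveforms being mixed at the output.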

Page 9: Decoder Block Diagram

[Block diagram. Blocks: Decoder, Concealment, Recovery, Update. Signals: Bits_n, BFI, UPD, IS_Good, IS_Bad, Decoded Speech. Switch positions labeled 1 and 0.]

Page 10: “Chronogram” (1 Late Frame)

[Timing diagram. Rows A), B), C) over frames n-1, n, n+1, n+2, n+3, with Concealment and Recovery phases marked and step markers (i), (ii), (iii).]

Page 11: Call Sequence (1 Late Frame)

Decode(Bits[n-1], Audio[n-1], BFI=0, UPD=0);  // Normal decoding

Decode(    -    , Audio[n],   BFI=1, UPD=0);  // Conceal: frame n is missing at playout time

Decode(Bits[n],      -    ,   BFI=0, UPD=1);  // Update: late frame n, no audio output

Decode(Bits[n+1], Audio[n+1], BFI=0, UPD=0);  // Recover

Decode(Bits[n+2], Audio[n+2], BFI=0, UPD=0);  // ...

...
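The call sequence can be exercised with a toy decoder. The sketch below is not the AMR-WB decoder: `ToyDecoder`, its one-sample frames, its two-field internal state and the attenuated-repeat concealment are all assumptions made so that the BFI/UPD logic can run end to end. It handles the single-late-frame case only.

```python
import copy

class ToyDecoder:
    """Hypothetical predictive decoder: a frame is one excitation value and
    the internal state (I.S.) is the previous output plus the last excitation."""

    def __init__(self, a=0.5):
        self.a = a
        self.state = {"prev_out": 0.0, "last_exc": 0.0}
        self.good_state = None   # copy of the I.S. taken before concealment

    def _synth(self, exc):
        out = exc + self.a * self.state["prev_out"]
        self.state = {"prev_out": out, "last_exc": exc}
        return out

    def decode(self, bits=None, bfi=0, upd=0):
        if bfi:   # Conceal: save the good I.S., then output a replacement frame
            if self.good_state is None:
                self.good_state = copy.deepcopy(self.state)
            return self._synth(0.5 * self.state["last_exc"])
        if upd:   # Update: restart from the good I.S. with the late frame
            if self.good_state is None:
                raise RuntimeError("no saved state to update from")
            self.state, self.good_state = self.good_state, None
            self._synth(bits)    # state advances, audio is discarded
            return None
        self.good_state = None   # normal decoding or recovery
        return self._synth(bits)

def run(frames, late_index, use_update):
    """Decode `frames`; the frame at `late_index` misses its playout time."""
    dec, out = ToyDecoder(), []
    for i, bits in enumerate(frames):
        if i == late_index:
            out.append(dec.decode(bfi=1))       # Conceal
            if use_update:
                dec.decode(bits, upd=1)         # Update with the late frame
        else:
            out.append(dec.decode(bits))        # Normal / Recover
    return out

frames = [1.0, 2.0, -1.0, 0.5]
reference = run(frames, late_index=-1, use_update=False)   # no loss at all
discarded = run(frames, late_index=1, use_update=False)    # late frame thrown away
updated = run(frames, late_index=1, use_update=True)       # late frame used for update
```

In this toy setup the concealed frame is identical in both lossy runs, but only the updated run returns to the reference output from the next frame on. When the late frame is discarded, the state mismatch carries into the following frames.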

Page 12: Complexity (1 Late Frame)

• Memory

– One Copy of the Internal State (AMR-WB: roughly 1.5 kilobytes)

• Processing Power

– Two Additional Decodings of the Excitation (AMR-WB: equivalent to decoding one frame)

Page 13: Sample Waveforms (1/2)

[Figure: waveform panels Original, Coded, Concealed, Updated.]

Page 14: Sample Waveforms (2/2)

[Figure: waveform panels Original, Coded, Concealed, Updated.]

Page 15: Evaluation Results (AMR-WB)

• Cond. 1: One Late Frame / 10 Frames

• Cond. 2: One Lost Frame + One Late Frame / 15 Frames

• Cond. 3: Three Consecutive Late Frames / 20 Frames

        Total   Cond. 1   Cond. 2   Cond. 3
UPD     70%     67%       48%       94%
=       25%     29%       41%       4%
STD     6%      4%        11%       2%
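A quick consistency check on the table: if the three rows partition the responses (read here, as an assumption not stated on the slide, as preference for the updated decoder, no preference, and preference for the standard decoder), each column should sum to about 100%.

```python
# Percentages transcribed from the results table; the row interpretation
# (UPD preferred / no preference / STD preferred) is an assumption.
RESULTS = {
    "Total":   {"UPD": 70, "=": 25, "STD": 6},
    "Cond. 1": {"UPD": 67, "=": 29, "STD": 4},
    "Cond. 2": {"UPD": 48, "=": 41, "STD": 11},
    "Cond. 3": {"UPD": 94, "=": 4,  "STD": 2},
}

def column_sums(results):
    """Sum the three response shares for each test condition."""
    return {condition: sum(shares.values()) for condition, shares in results.items()}

sums = column_sums(RESULTS)
```

The per-condition columns sum to exactly 100 and the Total column to 101, which is consistent with rounded shares of a single response set.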

Page 16: Demonstration Files (AMR-WB)

Original File

AMR-WB (12.65 kbits/s)

Std. AMR-WB (Cond 1) Upd. AMR-WB (Cond 1)

Std. AMR-WB (Cond 2) Upd. AMR-WB (Cond 2)

Std. AMR-WB (Cond 3) Upd. AMR-WB (Cond 3)

Page 17: Conclusion

• Using Late Frames Substantially Improves the Recovery of the Decoder

• Update the Internal State of the Decoder

– Some smoothing is required between the Concealment and the Recovery

• In a VoIP environment:

– More Robust against Jitter, with no Increase in Delay

– Less Delay, with no Quality Degradation