
© 2009, Mitsubishi Electric Corporation

Mitsubishi Electric Corporation, Information Technology R&D Center

Symposium: Real-time Digital Signal Processing for Optical Transceivers

FPGA based Prototyping of Next Generation Forward Error Correction

16:45-17:10, September 22nd, 2009, ECOC 2009, Vienna

T. Mizuochi, Y. Konishi, Y. Miyata, T. Inoue, K. Onohara, S. Kametani, T. Sugihara, K. Kubo, T. Kobayashi, H. Yoshida and T. Ichikawa


Outline

Expectations of stronger FECs for 100Gb/s transmission

Soft decision based LDPC + RS

FPGA prototyping

Error correction experiment

LSI for 100G digital coherent


Expectations of Stronger FECs for 100Gb/s Transmission


100G Needs Higher OSNR

[Figure labels: PSK, QPSK, 8-PSK, 16-QAM, 64-QAM]

We should not lose sight of the fact that multi-level modulation needs a higher SNR than binary formats.

As the level of an M-ary modulation scheme increases, the Euclidean distance decreases, and it becomes more difficult to distinguish between states.

The rate of decrease of the Euclidean distance is faster than the rate of noise bandwidth reduction.
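The trade-off between shrinking Euclidean distance and shrinking noise bandwidth can be sketched numerically. The following is a minimal illustration, assuming unit average symbol energy, a high-SNR approximation in which the penalty depends only on the minimum distance, and binary PSK as the reference; the function names are hypothetical and the slide's own curve may have been computed differently.

```python
import math

def min_distance_sq(bits_per_symbol, family):
    """Squared minimum Euclidean distance for unit average symbol energy."""
    m = 2 ** bits_per_symbol
    if family == "psk":
        # Neighbouring points on the unit circle are 2*sin(pi/M) apart.
        return (2.0 * math.sin(math.pi / m)) ** 2
    if family == "qam":
        # Square M-QAM only, so the number of bits per symbol must be even.
        if bits_per_symbol % 2:
            raise ValueError("square QAM needs an even number of bits per symbol")
        return 6.0 / (m - 1)
    raise ValueError("family must be 'psk' or 'qam'")

def osnr_penalty_db(bits_per_symbol, family):
    """Approximate OSNR penalty relative to binary PSK at constant bit rate."""
    reference_d2 = 4.0                      # binary PSK, unit energy
    distance_loss = reference_d2 / min_distance_sq(bits_per_symbol, family)
    bandwidth_gain = bits_per_symbol        # symbol rate falls by this factor
    return 10.0 * math.log10(distance_loss / bandwidth_gain)

for bits, family in [(1, "psk"), (2, "psk"), (3, "psk"), (4, "qam"), (6, "qam")]:
    print(bits, family, round(osnr_penalty_db(bits, family), 2))
```

Under these assumptions QPSK costs nothing relative to binary PSK, because the halved noise bandwidth exactly offsets the reduced distance, while every higher-order format pays a growing penalty.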

[Figure: OSNR penalty (dB, 0 to 25) versus bit/symbol (1 to 8) for M-PSK and M-QAM, with bit rate held constant]


Toward 100Gb/s

To deploy 100G over existing 40G systems, a 1.3 to 2.7 dB higher OSNR becomes mandatory.

Stronger FEC can be a great help here

[Figure: required OSNR (dB in 0.1 nm, 2 to 18) versus bit rate (10 to 100 Gb/s) for OOK, DPSK, DQPSK, DP-QPSK and DP-16QAM; gaps of 1.3 dB and 2.7 dB are annotated]


FEC Deployment in Optical Communications

[Figure: net coding gain x bit rate product (Gb/s, log scale from 10^0 to 10^4, with NCG defined in terms of a post-FEC BER of 10^-15) versus year ('86 to '18). Points are plotted for 2.5 Gb/s, 10 Gb/s, 40 Gb/s and 100 Gb/s systems, starting from RS(255,239). Legend: 1st gen. RS(255,239); 2nd gen. concatenated codes, iterative decoding; 3rd gen. soft decision, iterative decoding. Also marked: Shannon limit (soft decision, 25% redundancy), 100 Gb/s (target), and a trend line of x1.4 every year.]

The product of (linear) NCG and bit rate (in Gb/s) shows a clear trend in that an improvement of 1.4 times has been achieved every year.

This improvement has been achieved not by FEC algorithm improvements, but by LSI technology evolution.

Very strong FECs can be a key enabler for DSP-based 100G transmission.

T. Mizuochi, et al., IEEE Photonics Society Summer Topicals, WC1.1
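As a worked example of the figure of merit on this slide, the sketch below converts an NCG in dB to a linear factor, multiplies it by the bit rate, and asks how long a given improvement takes at x1.4 per year. It uses the 9.9 dB NCG that this talk expects for 100G as the input; the function names are hypothetical.

```python
import math

def ncg_bitrate_product(ncg_db, bit_rate_gbps):
    """Linear net coding gain multiplied by the bit rate in Gb/s."""
    return 10.0 ** (ncg_db / 10.0) * bit_rate_gbps

def years_to_grow(factor, yearly_growth=1.4):
    """Years needed to improve by `factor` at a fixed yearly growth rate."""
    return math.log(factor) / math.log(yearly_growth)

target = ncg_bitrate_product(9.9, 100.0)
print(f"100G target product: {target:.0f} Gb/s")
print(f"doubling time: {years_to_grow(2.0):.2f} years")
print(f"tenfold improvement: {years_to_grow(10.0):.2f} years")
```

At x1.4 per year the product doubles roughly every two years and grows tenfold in a little under seven.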


Soft Decision based LDPC + RS


Low-Density Parity-Check Codes – LDPC –

A linear code, defined by a very sparse parity check matrix

Invented by Robert Gallager in his 1960 MIT Ph.D. dissertation. Long ignored.
- R. G. Gallager, IRE Trans. Inform. Theory, Jan. 1962.

Re-discovered by D. MacKay in 1996.

Can achieve very strong error correction capability

First calculation for optical communications
- B. Vasic and I. B. Djordjevic, IEEE Photon. Technol. Lett., Aug. 2002.
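To make "defined by a parity check matrix" concrete, here is a minimal sketch of a syndrome check. The matrix is a hypothetical toy (a 3 x 3 array of bits with even-parity rows and columns), stored as one list of bit positions per check; it is far too small to look sparse and is not the code used in this work.

```python
# Each entry lists the bit positions that take part in one parity check,
# which is a compact way to store the non-zero entries of a sparse H.
# Toy example: 9 bits, 6 checks, every bit appears in exactly 2 checks.
CHECKS = [
    [0, 1, 2], [3, 4, 5], [6, 7, 8],   # row parities of a 3 x 3 array
    [0, 3, 6], [1, 4, 7], [2, 5, 8],   # column parities
]

def syndrome(bits, checks=CHECKS):
    """Return one parity value per check; all zeros means every check passes."""
    return [sum(bits[i] for i in check) % 2 for check in checks]

def is_codeword(bits, checks=CHECKS):
    return not any(syndrome(bits, checks))

codeword = [1, 1, 0,
            1, 0, 1,
            0, 1, 1]
corrupted = list(codeword)
corrupted[4] ^= 1                      # flip the centre bit

print(is_codeword(codeword))           # True
print(syndrome(corrupted))             # [0, 1, 0, 0, 1, 0]
```

The two failing checks are exactly the ones that contain the flipped bit, which is the information an iterative decoder exploits.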


Decoding Algorithms and Circuit Complexity

[Figure: circuit complexity (small to large) versus performance (poor to good) for min-sum (easy calculation), shuffled BP and the proposed algorithm; a performance gap of about 1.5 dB is annotated]

Conventional algorithms
- Shuffled belief propagation (BP): high performance, but quite complex
- Min-sum algorithm: easy calculation, but poor performance

Cyclically approximated δ-min algorithm (proposed)
- Simple LLR calculation: the mathematical function is approximated by δ and minimum functions
- Nearly identical performance to shuffled BP
- Nearly 1/5 the circuit size of shuffled BP

[Figure: circuit configuration for one codeword bit (a weight of 3)]

Y. Miyata, et al., OFC/NFOEC2007, OWE5
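The gap between the two conventional algorithms comes from how a check node combines incoming log-likelihood ratios (LLRs). The sketch below compares the exact sum-product (tanh) rule with the min-sum shortcut, and shows the textbook identity that writes the exact two-input result as the minimum plus a correction term. It does not reproduce the proposed algorithm, whose approximation is not spelled out on the slide; all names are hypothetical.

```python
import math

def check_node_exact(llrs):
    """Sum-product (tanh rule) check-node output for the given input LLRs.

    No clipping is applied, so very large LLRs would need saturating first.
    """
    product = 1.0
    for x in llrs:
        product *= math.tanh(x / 2.0)
    return 2.0 * math.atanh(product)

def check_node_min_sum(llrs):
    """Min-sum shortcut: product of the signs times the smallest magnitude."""
    sign = 1.0
    for x in llrs:
        if x < 0:
            sign = -sign
    return sign * min(abs(x) for x in llrs)

def box_plus(a, b):
    """Exact two-input update written as min-sum plus a correction term."""
    sign = (1.0 if a >= 0 else -1.0) * (1.0 if b >= 0 else -1.0)
    correction = (math.log1p(math.exp(-abs(a + b)))
                  - math.log1p(math.exp(-abs(a - b))))
    return sign * min(abs(a), abs(b)) + correction

inputs = [1.5, -0.8, 2.2]
print(check_node_exact(inputs))
print(check_node_min_sum(inputs))
print(box_plus(box_plus(inputs[0], inputs[1]), inputs[2]))
```

Min-sum never underestimates the magnitude, which is the source of its performance loss; reduced-complexity decoders generally try to recover that loss with a cheap stand-in for the correction term.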


Concatenated LDPC + RS

LDPC(9216,7936) + RS(992,956), 20.5% redundancy
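The 20.5% figure follows directly from the two code rates. A short check, using exact fractions and assuming redundancy is defined as transmitted length over information length minus one:

```python
from fractions import Fraction

def expansion(n, k):
    """Ratio of coded length n to information length k for an (n, k) code."""
    return Fraction(n, k)

ldpc = expansion(9216, 7936)   # inner LDPC code
rs = expansion(992, 956)       # outer RS code
total = ldpc * rs              # overall expansion of the concatenated code

redundancy_percent = float(total - 1) * 100
print(ldpc, rs, total)                  # 36/31 248/239 288/239
print(round(redundancy_percent, 1))     # 20.5
```

The overall expansion reduces to 288/239; the same numbers 239, 248 and 288 appear as dimension labels in the OTU4V frame figure.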

Combating Error Floor

[Figure: block diagram of the concatenated scheme. Encoder side: input information, RS code encoder (outer code), interleave, LDPC code encoder (inner code), E/O. Decoder side: O/E, soft-decision, LDPC decoder with iteration, de-interleave, RS decoder, output information.]

How to eliminate the error floor

(1) Increase the codeword length: 20,000 bits or longer are needed

(2) Increase the redundancy: 35% or more is needed

These cannot be allowed in high-speed optical communication systems.

Concatenating another weak code can effectively eliminate the unwanted error floor, without increasing circuit complexity.

Y. Miyata, et al., OFC/NFOEC2008, OTuE4


OTU4V Frame

OTU4V frame for LDPC + RS
- The length of the payload is the same as the OTUk frame
- Enables transparent transmission of 100GbE client signals
- Enables asynchronous multiplexing of multiple 10 Gb/s signals
- Efficient parallel processing of FEC enc./dec. and interleaver as a multiple of 128-parallelism

[Figure: OTU4V frame layout. OTU rows 1 to 4 each carry overhead and payload followed by RS parity (R1 to R4) and LDPC parity (L1 to L4). The rows are drawn twice, linked by an "Interleave" label. Dimension labels: 9, 239, 248, 288, 128 (four times) and 40.]

Y. Miyata et al., OFC/NFOEC2009, NThB2


FPGA Prototyping


FPGA Prototyping

Real-time emulation using high-speed FPGAs

[Photo labels: LDPC+RS FEC in FPGAs; De-skew; 125G MUX; Soft Decision]



Set-up for FPGA Prototyping

[Figure: experimental set-up built on high-speed FPGA prototyping boards. Recoverable labels, transmit side: BER test, 10.3 Gb/s PRBS31, RS ENC, IL, LDPC ENC, pre-skew, gear box, LN modulator, 31.3 Gb/s OOK, ASE. Receive side: PD/TIA, soft-decision LSI (2-bit), gear box, Fsync, de-skew, LDPC DEC with dIL and iteration, RS DEC, FIFOs. The decoder is drawn as a pipelined architecture of 1st, 2nd and 3rd stages, each with terminate and input/output interfaces. Other labels: 12.5 Gb/s, 15.6 Gb/s, MX, dummy, copy, SFI-4 (700 Mb/s x 16 ch), CML (2 Gb/s x 16 ch).]


31 Gsample/s Soft Decision LSI

LSI chip
- 0.13 µm SiGe BiCMOS (fT = 200 GHz)
- 9.7 mm x 6.9 mm
- 14 W (+3.3 V)

Package
- Low temperature co-fired ceramic
- 30 mm x 29 mm x 2.15 mm
- 570 I/O pads

T. Kobayashi, et al., OFC/NFOEC2009, OWeE2
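For readers unfamiliar with soft decision, the sketch below shows what a 2-bit quantiser does: three thresholds split the received amplitude into four regions, giving a hard bit plus one bit of confidence. The threshold values and the amplitude scale are hypothetical and are not taken from the LSI described here.

```python
import bisect

def soft_decision_2bit(sample, thresholds=(-0.5, 0.0, 0.5)):
    """Map an analogue sample to one of four levels, 0 (strong 0) to 3 (strong 1).

    The three thresholds are illustrative assumptions only.
    """
    return bisect.bisect_right(thresholds, sample)

def hard_bit_and_confidence(level):
    """Split a 2-bit level into the hard decision and a reliability flag."""
    hard_bit = level >> 1           # which side of the centre threshold
    confident = level in (0, 3)     # the two outer regions are the reliable ones
    return hard_bit, confident

for sample in (-0.7, -0.2, 0.1, 0.9):
    level = soft_decision_2bit(sample)
    print(sample, level, hard_bit_and_confidence(level))
```

A hard-decision receiver keeps only the first element of each pair; the extra reliability bit is what lets the decoder weight doubtful samples less.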


Pipelined Architecture

[Figure: FPGA boards #1 to #n connected in cascade. Each board holds FPGAs 1, 2, ..., m, and each FPGA contains input RAM, a decoder and output RAM. Other labels: "Through", "block p", "Insert p after n-frame".]

In order to emulate the operation of a massive circuit, e.g. iterative decoding, a pipelined architecture was constructed from concatenated FPGAs.
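The idea of spreading iterative decoding over a chain of devices can be sketched as a simple software pipeline. This is a minimal model, assuming each stage performs one decoding iteration and passes the frame on in the next time slot; the stage function and frame representation are hypothetical placeholders, not the FPGA design itself.

```python
def run_pipeline(frames, stages):
    """Push frames through a chain of stages, one stage per time slot.

    Returns a list of (time_slot, frame) pairs in the order frames leave.
    """
    slots = [None] * len(stages)     # frame currently held by each stage
    pending = list(frames)
    finished = []
    t = 0
    while pending or any(frame is not None for frame in slots):
        # The last stage hands its frame out, everything else shifts by one.
        if slots[-1] is not None:
            finished.append((t, slots[-1]))
        for i in range(len(slots) - 1, 0, -1):
            slots[i] = slots[i - 1]
        slots[0] = pending.pop(0) if pending else None
        # Every occupied stage works on its own frame during this slot.
        for i, frame in enumerate(slots):
            if frame is not None:
                slots[i] = stages[i](frame)
        t += 1
    return finished

def one_iteration(frame):
    """Stand-in for one decoding iteration: just count how many were applied."""
    name, iterations = frame
    return (name, iterations + 1)

output = run_pipeline([("f0", 0), ("f1", 0), ("f2", 0)], [one_iteration] * 4)
print(output)
```

Each frame receives four iterations, and after the initial latency of four slots one frame leaves the pipeline per slot, so throughput does not drop as more iterations are added.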

[Figure: timing diagram of the data sequence (frames 1 to 8, repeating) against decoder cores core1 to core8 over time]


FPGA Boards

[Photo labels: RS ENC, LDPC ENC, LDPC DEC, RS DEC]

Pipeline: 8 x FPGAs on a board; n-concatenation = n x 8 x 2 (Mgates)

Altera Stratix II, 2 Mgates; 100 MHz x 128 = 10 Gb/s throughput


Error Correction Experiment


Experimental Results

[Figure: output BER (10^-1 to 10^-13) versus pre-FEC BER (10^-3 to 10^0). Curves: experimental, LDPC(9216,7936) only; experimental, LDPC(9216,7936) + RS(992,956); calculated. Annotated point: input Q = 7.5 dB, where a pre-FEC BER of 8.9x10^-3 is corrected to 2.5x10^-13.]

31.3 Gb/s AWGN OOK, 2-bit soft dec., 4 iterations
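The relation between the quoted input Q and the pre-FEC BER can be checked with the usual Gaussian-noise formula, BER = 0.5 erfc(Q / sqrt(2)), with Q in dB defined as 20 log10(Q). The sketch assumes that definition; the function name is hypothetical.

```python
import math

def q_db_to_ber(q_db):
    """BER for a given Q factor in dB, assuming Gaussian noise statistics."""
    q_linear = 10.0 ** (q_db / 20.0)
    return 0.5 * math.erfc(q_linear / math.sqrt(2.0))

print(f"{q_db_to_ber(7.5):.2e}")
```

An input Q of 7.5 dB gives a BER close to the 8.9x10^-3 quoted on the slide.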


Hard/Soft dec. and Number of Iterations

[Figure: output BER (10^-1 to 10^-13) versus pre-FEC Q (6 to 11 dB). Curves: hard dec., measured, 4 iterations; 2-bit soft dec., measured, 4 iterations; 2-bit soft dec., calculated, 4, 8 and 16 iterations.]

Soft dec.: 2.4 dB better than hard dec.

Number of iterations: 4, 8, 16

Expected NCG @ 10^-15: 9.9 dB (1.2x10^-2 -> 1x10^-15), 2-bit soft dec., 16 iterations


Comparison with Shannon Limit

[Figure: net coding gain at output BER = 10^-15 (dB, 5 to 14) versus FEC redundancy (%, 0 to 30), with 7%, 20% and 25% marked. Shannon limit curves are drawn for hard decision and soft decision, together with an "economical Shannon limit". Plotted points (H = hard decision, S = soft decision): 40 Gb/s RS(255,239), 10 Gb/s EFEC, 40 Gb/s EFEC, 10 Gb/s Turbo, LDPC+RS with 4 iterations (this work), and the 9.9 dB expected for 100 Gb/s DP-QPSK LDPC+RS with 2-bit soft dec. and 16 iterations.]

NCG of >9.9 dB can be expected for 100G DP-QPSK (2-bit soft dec., 16 iterations)

Unseen limit = Economical Shannon limit: 1 dB more gain needs >10M$


LSI for 100G Digital Coherent


Implementation in 100 Gb/s LSI

[Figure: block diagram of a 100 Gb/s implementation, drawn for both directions. 100GbE (25G x 4) connects to an OTU4V framer LSI containing RS ENC and RS DEC (HD FEC), which connects to a coherent transceiver LSI containing SD FEC (LDPC), Tx DSP, Rx DSP and ADCs, followed by 100G optics.]

FEC redundancy, latency, balance between hard dec. and soft dec.
- 20% is preferable, 4~5x OTU4
- xx% hard dec. in OTU4 framer + yy% soft dec. in coherent transceiver LSI

LSI technology, expected performance
- 45 nm CMOS, < 30~50 Mgates
- hopefully NCG of 10~11 dB at BER = 10^-15


Conclusions

Expectations of stronger FECs for 100Gb/s transmission discussed

1.3~2.7 dB stronger NCG than 40G EFEC is expected

Soft decision FEC for coherent systems proposed

Concatenated LDPC + RS, cyclically approximated δ-min algorithm

FPGA prototyping developed

Pipelined architecture, 31.3 Gb/s throughput

Error correction experiment carried out

7.5 dB input Q can be corrected to 10^-13 (2-bit soft dec., 4 iterations)

9.9 dB NCG @ 10^-15 is expected for 100G DP-QPSK

100G LSI issues discussed

Hard decision FEC in OTU4 framer LSI + soft decision FEC in coherent LSI

Further improvement of NCG: 10~11 dB expected

Real ASIC for 100G may emerge in 2010~2011

This work was in part supported by the “Lambda Utility Project” of the National Institute of Information and Communications Technology (NICT) of Japan
