FPGA based Prototyping of Next Generation Forward Error Correction

© 2009, Mitsubishi Electric Corporation. Mitsubishi Electric Corporation, Information Technology R&D Center. Symposium: Real-time Digital Signal Processing for Optical Transceivers. FPGA based Prototyping of Next Generation Forward Error Correction. 16:45-17:10, September 22nd, 2009, ECOC2009, Vienna. T. Mizuochi, Y. Konishi, Y. Miyata, T. Inoue, K. Onohara, S. Kametani, T. Sugihara, K. Kubo, T. Kobayashi, H. Yoshida and T. Ichikawa

  • © 2009, Mitsubishi Electric Corporation 1/25

    Mitsubishi Electric Corporation, Information Technology R&D Center

    Symposium: Real-time Digital Signal Processing for Optical Transceivers

    FPGA based Prototyping of Next Generation Forward Error Correction

    16:45-17:10, September 22nd, 2009, ECOC2009, Vienna

    T. Mizuochi, Y. Konishi, Y. Miyata, T. Inoue, K. Onohara, S. Kametani, T. Sugihara, K. Kubo, T. Kobayashi, H. Yoshida and T. Ichikawa

  • © 2009, Mitsubishi Electric Corporation 2/25

    Outline

    Expectations of stronger FECs for 100Gb/s transmission

    Soft decision based LDPC + RS

    FPGA prototyping

    Error correction experiment

    LSI for 100G digital coherent

  • © 2009, Mitsubishi Electric Corporation 3/25

    Expectations of Stronger FECs for 100Gb/s Transmission

  • © 2009, Mitsubishi Electric Corporation 4/25

    100G Needs Higher OSNR

    Constellations shown: 64-QAM, 16-QAM, 8-PSK, QPSK, PSK

    We should not lose sight of the fact that multi-level modulation needs a higher SNR than binary formats.

    As the level of an M-ary modulation scheme increases, the Euclidean distance decreases, and it becomes more difficult to distinguish between states.

    The rate of decrease of the Euclidean distance is faster than the rate of noise bandwidth reduction.
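    The shrinking Euclidean distance can be illustrated with a minimal sketch. It assumes unit average symbol energy, ideal M-PSK and square M-QAM constellations, and covers only the distance term, not the noise bandwidth reduction that the slide also mentions. The function names are illustrative.

```python
import math

def min_distance_psk(m: int) -> float:
    """Minimum distance between adjacent M-PSK points at unit symbol energy."""
    return 2.0 * math.sin(math.pi / m)

def min_distance_square_qam(m: int) -> float:
    """Minimum distance of square M-QAM (M = 4, 16, 64, ...) at unit average symbol energy."""
    return math.sqrt(6.0 / (m - 1))

# Formats named on the slide, from binary PSK up to 64-QAM.
formats = [
    ("PSK", min_distance_psk(2)),
    ("QPSK", min_distance_psk(4)),
    ("8-PSK", min_distance_psk(8)),
    ("16-QAM", min_distance_square_qam(16)),
    ("64-QAM", min_distance_square_qam(64)),
]

for name, distance in formats:
    print(f"{name:7s} d_min = {distance:.3f}")
```

    Under these assumptions the minimum distance falls monotonically from PSK to 64-QAM, which is the effect the slide attributes the OSNR penalty to.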

    [Figure: OSNR penalty (dB, 0 to 25) versus bits per symbol (1 to 8) for M-PSK and M-QAM, with the bit rate held constant.]

  • © 2009, Mitsubishi Electric Corporation 5/25

    Toward 100Gb/s

    In order to deploy 100G over existing 40G systems, a 1.3 dB to 2.7 dB higher OSNR becomes mandatory.

    Stronger FEC can be a great help here

    [Figure: required OSNR (dB in 0.1 nm, 2 to 18) versus bit rate (10, 20, 40, 100 Gb/s) for OOK, DPSK, DQPSK, DP-QPSK and DP-16QAM. Annotated OSNR differences: 1.3 dB and 2.7 dB.]

  • © 2009, Mitsubishi Electric Corporation 6/25

    FEC Deployment in Optical Communications

    [Figure: net coding gain and bit rate product (Gb/s, log scale from 10^0 to 10^4, with NCG defined in terms of a post-FEC BER of 10^-15) versus year ('86 to '18). Points for 2.5 Gb/s, 10 Gb/s, 40 Gb/s and 100 Gb/s (target) systems, with a trend line of x1.4 every year. Legend: 1st gen., RS(255,239); 2nd gen., concatenated codes, iterative decoding; 3rd gen., soft decision, iterative decoding. Also marked: Shannon limit (soft decision, 25% redundancy).]

    The product of (linear) NCG and bit rate (in Gb/s) shows a clear trend: it has improved by a factor of 1.4 every year.

    This improvement has been achieved not by FEC algorithm improvements, but by LSI technology evolution.

    Very strong FECs can be a key enabler for DSP-based 100G transmission.

    T. Mizuochi, et al., IEEE Photonics Society Summer Topicals, WC1.1
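    The arithmetic behind the x1.4 per year trend can be sketched as follows. Only the growth factor comes from the slide; the helper names and the 10 dB, 100 Gb/s input are illustrative values, not data read from the figure.

```python
import math

GROWTH_PER_YEAR = 1.4  # trend quoted on the slide

def ncg_bitrate_product(ncg_db: float, bit_rate_gbps: float) -> float:
    """Product of the linear net coding gain and the bit rate in Gb/s."""
    return 10.0 ** (ncg_db / 10.0) * bit_rate_gbps

def years_to_grow(factor: float, growth: float = GROWTH_PER_YEAR) -> float:
    """Years needed for the product to grow by `factor` at a fixed yearly growth."""
    return math.log(factor) / math.log(growth)

# Illustrative input only: a 10 dB NCG at 100 Gb/s.
print(f"product = {ncg_bitrate_product(10.0, 100.0):.0f}")
print(f"doubling takes about {years_to_grow(2.0):.1f} years")
print(f"a tenfold gain takes about {years_to_grow(10.0):.1f} years")
```

    At this rate the product doubles roughly every two years, which is why each decade on the log axis spans several years.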

  • © 2009, Mitsubishi Electric Corporation 7/25

    Soft Decision based LDPC + RS

  • © 2009, Mitsubishi Electric Corporation 8/25

    Low-Density Parity-Check Codes – LDPC –

    A linear code, defined by a very sparse parity check matrix

    Invented by Robert Gallager in his 1960 MIT Ph.D. dissertation. Long ignored.
    - R. G. Gallager, IRE Trans. Inform. Theory, Jan. 1962.

    Re-discovered by D. MacKay in 1996.

    Can achieve very strong error correction capability

    First calculation for optical communications
    - B. Vasic and I. B. Djordjevic, IEEE Photon. Technol. Lett., Aug. 2002.
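    What "defined by a parity check matrix" means in practice can be shown with a toy example. The matrix below is a hypothetical 4 x 8 one chosen for readability, far smaller and denser than a real LDPC matrix such as one for a length-9216 code. Each row is stored as the list of bit positions it checks, which is how a sparse matrix is usually held.

```python
# Toy parity-check matrix: each row lists the bit positions taking part in
# one parity check. Every bit appears in exactly two checks.
H_ROWS = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [0, 2, 4, 6],
    [1, 3, 5, 7],
]

def syndrome(word):
    """Return one parity bit per check; all zeros means every check is satisfied."""
    return [sum(word[j] for j in row) % 2 for row in H_ROWS]

codeword = [1, 1, 1, 1, 0, 0, 0, 0]   # satisfies all four checks
received = codeword.copy()
received[5] ^= 1                       # single bit error on bit 5

print(syndrome(codeword))   # [0, 0, 0, 0]
print(syndrome(received))   # [0, 1, 0, 1]
```

    The two failing checks are exactly the two that involve bit 5, which is the information an iterative decoder exploits.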

  • © 2009, Mitsubishi Electric Corporation 9/25

    Decoding Algorithms and Circuit Complexity

    [Figure: decoding algorithms placed by performance (good to poor) and complexity (large to small): shuffled BP, min-sum, and the proposed algorithm. Annotations: "about 1.5 dB", "easy calculation".]

    Conventional algorithms
    - Shuffled belief propagation (BP): high performance, but quite complex
    - Min-sum algorithm: easy calculation, but poor performance

    Cyclically approx. δ-min algorithm (Proposed)
    - Simple LLR calculation: mathematical function approximated by δ and minimum functions
    - Nearly identical performance to shuffled BP
    - Nearly 1/5 the circuit size of shuffled BP

    [Figure: circuit configuration for one codeword's bit (a weight of 3)]

    Y. Miyata, et al., OFC/NFOEC2007, OWE5
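    The trade-off between the two conventional algorithms shows up in a single check-node update. The sketch below uses the textbook sum-product (tanh rule) and min-sum updates. It is not the proposed cyclically approximated algorithm, whose details the slide does not give, and the input LLR values are arbitrary.

```python
import math

def check_update_bp(llrs):
    """Exact sum-product (tanh rule) check-node output for the given incoming LLRs."""
    product = 1.0
    for llr in llrs:
        product *= math.tanh(llr / 2.0)
    # atanh is finite here because |product| < 1 for finite LLRs.
    return 2.0 * math.atanh(product)

def check_update_min_sum(llrs):
    """Min-sum approximation: product of the signs times the smallest magnitude."""
    sign = 1.0
    for llr in llrs:
        if llr < 0:
            sign = -sign
    return sign * min(abs(llr) for llr in llrs)

incoming = [1.2, -0.8, 2.5]   # arbitrary example LLRs from the other edges
bp = check_update_bp(incoming)
ms = check_update_min_sum(incoming)
print(f"sum-product: {bp:.3f}, min-sum: {ms:.3f}")
```

    Min-sum needs only comparisons and sign logic, but it overstates the reliability of the result, which is one common explanation for its performance loss relative to BP.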

  • © 2009, Mitsubishi Electric Corporation 10/25

    Concatenated LDPC + RS

    LDPC(9216,7936) + RS(992,956), 20.5% redundancy

    Combating Error Floor

    [Figure: block diagram of the concatenated code. Encoder: input information, RS code (outer code), interleave, LDPC code (inner code), E/O. Decoder: O/E, soft decision, LDPC code (inner code) with iteration, de-interleave, RS code (outer code), output information.]

    How to eliminate the error floor
    (1) Increase the codeword length: 20,000 bits or longer are needed
    (2) Increase the redundancy: 35% or more is needed

    These can't be allowed in high speed optical communication systems

    Concatenating another weak code can effectively eliminate the unwanted error floor, without increasing circuit complexity.

    Y. Miyata, et al., OFC/NFOEC2008, OTuE4
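    The 20.5% redundancy quoted for LDPC(9216,7936) + RS(992,956) follows directly from the two code parameters. A short check, using exact fractions to avoid rounding:

```python
from fractions import Fraction

def expansion(n: int, k: int) -> Fraction:
    """Ratio of coded bits (or symbols) to information bits for an (n, k) code."""
    return Fraction(n, k)

ldpc = expansion(9216, 7936)   # inner LDPC code
rs = expansion(992, 956)       # outer RS code

total_redundancy = ldpc * rs - 1
overall_rate = 1 / (ldpc * rs)

print(f"LDPC alone: {float(ldpc - 1):.1%}")
print(f"RS alone:   {float(rs - 1):.1%}")
print(f"combined:   {float(total_redundancy):.1%} (code rate {overall_rate})")
```

    The combined code rate reduces to 239/288, so most of the redundancy is spent on the inner LDPC code and only a few percent on the outer RS code.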

  • © 2009, Mitsubishi Electric Corporation 11/25

    OTU4V Frame

    OTU4V frame for LDPC + RS
    - The length of the payload is the same as the OTUk frame
    - Enables transparent transmission of 100GbE client signals
    - Enables asynchronous multiplexing of multiple 10 Gb/s signals
    - Efficient parallel-processing of FEC enc./dec. and interleaver as a multiple of 128-parallelism

    [Figure: OTU4V frame layout, shown before and after interleaving. Four OTU rows (OTU row 1 to 4) with overhead, each followed by RS parity (R1 to R4) and LDPC parity (L1 to L4). Recoverable labels: 239, 248, 9, 288, 40, and 128 for each of the four rows.]

    Y. Miyata et al., OFC/NFOEC2009, NThB2

  • © 2009, Mitsubishi Electric Corporation 12/25

    FPGA Prototyping

  • © 2009, Mitsubishi Electric Corporation 13/25

    FPGA Prototyping

    Real-time emulation using high-speed FPGAs

    [Figure labels: LDPC+RS FEC in FPGAs, de-skew, 125G MUX, soft decision]

  • © 2009, Mitsubishi Electric Corporation 14/25

    Set-up for FPGA Prototyping

    [Figure: block diagram of the prototyping set-up. Recoverable labels: BER test (10.3 Gb/s, PRBS31); high speed FPGA prototyping boards with I/O interface, terminate & input I/F, terminate & output I/F, RS ENC, IL, LDPC ENC, pre-skew, Fsync, de-skew, LDPC DEC with iteration (pipelined architecture: 1st stage, 2nd stage, 3rd stage), dIL, FIFO, dummy, copy, RS DEC; gear boxes; MX; LN Mod; 31.3 Gb/s OOK with ASE; PD/TIA; soft-dec. LSI (2 bit). Interfaces: SFI-4 (700 Mb/s x 16 ch) and CML (2 Gb/s x 16 chs). Rates marked: 12.5 Gb/s, 15.6 Gb/s, 31.3 Gb/s.]

  • © 2009, Mitsubishi Electric Corporation 15/25

    31 Gsample/s Soft Decision LSI

    LSI chip
    - 0.13 µm SiGe BiCMOS (fT = 200 GHz)
    - 9.7 mm x 6.9 mm
    - 14 W (+3.3 V)

    Package
    - Low temperature co-fired ceramic
    - 30 mm x 29 mm x 2.15 mm
    - 570 I/O pads

    T. Kobayashi, et al., OFC/NFOEC2009, OWeE2

  • © 2009, Mitsubishi Electric Corporation 16/25

    Pipelined Architecture

    [Figure: FPGAs 1, 2, ..., m chained on each of FPGA board #1 to FPGA board #n. Each FPGA contains In RAM, a decoder and Out RAM. Labels: "through block p", "insert p after n-frame".]

    In order to emulate the operation of a massive circuit, e.g. iterative decoding, a pipelined architecture was constructed from concatenated FPGAs.

    [Figure: timing chart. The data sequence 1, 2, ..., 8, 1, ... is spread over time across core1, core2, ..., core8, each running a decode (dec) operation.]
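    The idea of spreading iterative decoding over a chain of devices can be modelled in a few lines. This is a sketch under a stated assumption: each stage performs exactly one decoding iteration on the frame it holds during one frame slot and then hands the frame on. The slide does not specify how the real boards partition the work, and the function name is illustrative.

```python
from collections import deque

def run_pipeline(num_frames: int, num_stages: int):
    """Return (frame_id, iterations_done, exit_slot) for each frame, in output order."""
    stages = [None] * num_stages            # frame held by each stage, or None
    pending = deque(range(num_frames))      # frames waiting at the input
    done = []
    slot = 0
    while pending or any(s is not None for s in stages):
        # The last stage releases its frame at the start of the slot.
        if stages[-1] is not None:
            frame_id, iterations = stages[-1]
            done.append((frame_id, iterations, slot))
        # Every frame moves one stage forward; a new frame enters stage 0.
        for i in range(num_stages - 1, 0, -1):
            stages[i] = stages[i - 1]
        stages[0] = (pending.popleft(), 0) if pending else None
        # Each occupied stage performs one decoding iteration.
        stages = [(s[0], s[1] + 1) if s is not None else None for s in stages]
        slot += 1
    return done

for frame_id, iterations, exit_slot in run_pipeline(num_frames=5, num_stages=4):
    print(f"frame {frame_id}: {iterations} iterations, leaves at slot {exit_slot}")
```

    In this model every frame receives as many iterations as there are stages, while one frame still leaves per slot, so adding stages raises latency and gate count but not the frame period.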

  • © 2009, Mitsubishi Electric Corporation 17/25

    FPGA Boards

    Boards: RS ENC, LDPC ENC, LDPC DEC, RS DEC

    Pipeline
    - 8 x FPGAs on a board
    - n-concatenation = n x 8 x 2 (Mgates)

    Altera Stratix II
    - 2 Mgates
    - 100 MHz x 128 = 10 Gb/s throughput

  • © 2009, Mitsubishi Electric Corporation 18/25

    Error Correction Experiment

  • © 2009, Mitsubishi Electric Corporation 19/25

    Experimental Results

    [Figure: output BER (10^0 down to 10^-13) versus pre-FEC BER (10^-1 to 10^-3). Curves: experimental LDPC(9216,7936) only; experimental LDPC(9216,7936) + RS(992,956); calculated. Annotation: a pre-FEC BER of 8.9 x 10^-3 (input Q = 7.5 dB) is corrected to 2.5 x 10^-13.]

    31.3 Gb/s AWGN OOK, 2-bit soft dec., 4 iterations
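    The pairing of an input Q of 7.5 dB with a pre-FEC BER of 8.9 x 10^-3 can be checked with the usual Gaussian-noise relation between Q and BER. The sketch assumes Q in dB is 20 log10(Q) and BER = 0.5 erfc(Q / sqrt(2)). The slide does not state the conversion it used.

```python
import math

def q_db_to_ber(q_db: float) -> float:
    """BER of a binary decision in Gaussian noise for a Q factor given in dB."""
    q_linear = 10.0 ** (q_db / 20.0)
    return 0.5 * math.erfc(q_linear / math.sqrt(2.0))

print(f"Q = 7.5 dB -> BER = {q_db_to_ber(7.5):.2e}")
```

    Under these assumptions the result lands close to the 8.9 x 10^-3 marked on the figure.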

  • © 2009, Mitsubishi Electric Corporation 20/25

    Hard/Soft dec. and Number of Iterations

    [Figure: output BER (10^-1 down to 10^-13) versus pre-FEC Q (6 to 11 dB). Curves: hard dec., measured, 4 iterations; 2-bit soft dec., measured, 4 iterations; 2-bit soft dec., calculated, for 4, 8 and 16 iterations.]

    Soft dec. is 2.4 dB better than hard dec.

    Number of iterations: 4, 8, 16

    Expected NCG @10^-15: 9.9 dB (1.2 x 10^-2 corrected to 1 x 10^-15), 2-bit soft dec., 16 iterations

  • © 2009, Mitsubishi Electric Corporation 21/25

    Comparison with Shannon Limit

    [Figure: net coding gain at an output BER of 10^-15 (dB, 5 to 14) versus FEC redundancy (%, 0 to 30), with 7%, 20% and 25% marked. Curves: Shannon limit for hard decision and for soft decision, and an "economical Shannon limit". Points, marked H (hard decision) or S (soft decision): 40 Gb/s RS(255,239), 10 Gb/s EFEC, 40 Gb/s EFEC, 10 Gb/s Turbo, LDPC+RS with 4 iterations (this work), and the 9.9 dB expected for 100 Gb/s DP-QPSK LDPC+RS with 2-bit soft dec. and 16 iterations.]

    NCG of >9.9 dB can be expected for 100G DP-QPSK (2-bit soft dec., 16 iterations)

    Unseen limit = Economical Shannon limit: 1 dB more gain needs >10M$
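    One common way to obtain a hard-decision Shannon limit curve of this kind is sketched below. It assumes a binary symmetric channel, whose capacity is 1 - H2(p), the Gaussian Q-to-BER mapping, and NCG defined as the Q improvement in dB minus the rate penalty 10 log10(1/R). The slide does not say how its curves were computed, and the soft-decision limit is not covered here.

```python
import math
from statistics import NormalDist

def binary_entropy(p: float) -> float:
    """Binary entropy H2(p) in bits, for 0 < p < 1."""
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def q_db(ber: float) -> float:
    """Q factor in dB that gives `ber` for a binary decision in Gaussian noise."""
    q_linear = -NormalDist().inv_cdf(ber)
    return 20.0 * math.log10(q_linear)

def hard_decision_limit_ncg(redundancy: float, ref_ber: float = 1e-15) -> float:
    """NCG (dB) at the capacity of a binary symmetric channel for a given redundancy."""
    rate = 1.0 / (1.0 + redundancy)
    lo, hi = 1e-12, 0.5
    for _ in range(200):                        # bisection on the crossover probability
        mid = (lo + hi) / 2.0
        if 1.0 - binary_entropy(mid) > rate:    # capacity still above the code rate
            lo = mid
        else:
            hi = mid
    max_input_ber = (lo + hi) / 2.0
    return q_db(ref_ber) - q_db(max_input_ber) + 10.0 * math.log10(rate)

for redundancy in (0.07, 0.20, 0.25):
    print(f"{redundancy:.0%} redundancy: {hard_decision_limit_ncg(redundancy):.1f} dB")
```

    Under these assumptions the limit rises with redundancy, but with diminishing returns, which matches the shape of the hard-decision curve described in the figure.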

  • © 2009, Mitsubishi Electric Corporation 22/25

    LSI for 100G Digital Coherent

  • © 2009, Mitsubishi Electric Corporation 23/25

    Implementation in 100 Gb/s LSI

    [Figure: block diagram of a 100G transceiver, transmit and receive paths. 100GbE (25G x 4) connects to the OTU4V framer LSI (RS ENC, RS DEC, HD FEC), which connects over OTU4 to the coherent transceiver LSI (SD FEC (LDPC), Tx DSP, Rx DSP, ADCs) and the 100G optics.]

    FEC redundancy, latency, balance between hard dec. and soft dec.
    - 20% is preferable, 4~5x OTU4
    - xx% hard dec. in OTU4 framer + yy% soft dec. in coherent transceiver LSI

    LSI technology, expected performance
    - 45nm CMOS, < 30~50 Mgates
    - hopefully NCG of 10~11 dB at BER = 10^-15

  • © 2009, Mitsubishi Electric Corporation 24/25

    Conclusions

    Expectations of stronger FECs for 100Gb/s transmission discussed

    1.3~2.7 dB stronger NCG than 40G EFEC is expected

    Soft decision FEC for coherent systems proposed

    Concatenated LDPC + RS, Cyclically approximated δ-min algorithm

    FPGA prototyping developed

    Pipelined architecture, 31.3 Gb/s throughput

    Error correction experiment carried out

    7.5 dB input Q can be corrected to 10^-13 (2-bit soft dec., 4 iterations)

    9.9 dB NCG @10^-15 is expected for 100G DP-QPSK

    100G LSI issues discussed

    Hard decision FEC in OTU4 framer LSI + soft decision FEC in coherent LSI

    Further improvement of NCG: 10~11 dB expected

    Real ASIC for 100G may emerge in 2010~2011

    This work was in part supported by the “Lambda Utility Project” of the National Institute of Information and Communications Technology (NICT) of Japan
