JFEX Uli Schäfer 1 Mainz Logos ? jFEX (inc. specific software/firmware & Tilecal input signal,...

1

jFEX

Uli Schäfer

Mainz

Logos ?

jFEX (inc. specific software/firmware & Tilecal input signal, options) 30'

http://www.staff.uni-mainz.de/uschaefe/browsable/Meeting/2013/neu/

2

jFEX (inc. specific software/firmware & Tilecal input signal, options) 30'

Assume we have to cover all that’s in the document, but in more detail where we feel it helps. Did I forget to cover any important section?

• No physics justification
• Jet processing general
• L1Calo phase 1 scheme
• Algorithms
• Data replication
• jFEX description #4
• Density, fibre count etc.
• Input options #4
• Firmware (also software to be mentioned?)
• Demonstrators/current designs (GOLD, Topo, MiniPOD/V7) #4
• Schedule, manpower req. jFEX input HD+MZ

3

Jet processing

Phase-0 jet system consisting of
• Pre-Processor
  • Analogue signal conditioning
  • Digitization
  • Digital signal processing
  • Jet element pre-summation to 0.2 × 0.2 (η×φ)
• Jet processor
  • Sliding window processor for jet finding
  • Jet multiplicity determination
  • Jet feature extraction into L1Topo (pre-phase1)

At phase-1: complement with jet feature extractor jFEX
• LAr signals optically from digital processor system
• Tilecal signals from analogue Pre-Processor / JEP …
• … eventually Tilecal optical data off detector, and possible retirement of current L1Calo system

4

L1Calo Phase-1 System

[Figure: block diagram of the L1Calo Phase-1 system. Blocks: PPR, CPM, JEM, JMM, CMX, Hub, ROD, L1Topo, RTM, eFEX, jFEX and the optical plant, with input “From Digital Processing System”. Legend: “New at Phase 1”.]

5

• 3 options here

6

Algorithms, now and then

Sliding window algorithm:
• Find and disambiguate ROIs
• Calculate jet energy in differently sized windows (programmable)

• Improve granularity by a factor of four, to 0.1×0.1 (η×φ)
• Slightly increase environment
• Allow for flexibility in jet definition (non-square jet shape, Gaussian filter, …)
• Fat jets to be calculated from high granularity small jets
• Optionally increase jet environment (baseline 0.9 × 0.9)

               Phase 0                   Phase 1
ROI            0.4 × 0.4
tower          0.2 × 0.2                 0.1 × 0.1
jet windows    three, up to 0.8 × 0.8    0.9 × 0.9, limited by data duplication
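
The two steps of the sliding window algorithm (find and disambiguate ROIs, then sum the jet energy in a window) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the jFEX firmware: the 3×3-tower core used for the local-maximum search, the tie-breaking rule, and the names `window_sum` and `find_jets` are hypothetical. The 9×9-tower window corresponds to the 0.9 × 0.9 baseline environment at 0.1 × 0.1 granularity.

```python
def window_sum(towers, eta, phi, half):
    """Sum the (2*half+1) x (2*half+1) towers centred on (eta, phi).

    phi wraps around; eta bins outside the grid count as zero.
    """
    n_eta, n_phi = len(towers), len(towers[0])
    total = 0
    for d_eta in range(-half, half + 1):
        e = eta + d_eta
        if 0 <= e < n_eta:
            for d_phi in range(-half, half + 1):
                total += towers[e][(phi + d_phi) % n_phi]
    return total


def _beats(core, e, p, d_eta, d_phi, c):
    """True if the neighbour at offset (d_eta, d_phi) wins against core sum c."""
    ne = e + d_eta
    if not 0 <= ne < len(core):
        return False
    other = core[ne][(p + d_phi) % len(core[0])]
    # Asymmetric comparison (assumed rule): of two equal neighbouring sums,
    # exactly one survives, so a plateau yields a single ROI.
    return other >= c if (d_eta, d_phi) > (0, 0) else other > c


def find_jets(towers, core_half=1, env_half=4, threshold=0):
    """Return (eta, phi, energy) for every disambiguated local maximum."""
    n_eta, n_phi = len(towers), len(towers[0])
    core = [[window_sum(towers, e, p, core_half) for p in range(n_phi)]
            for e in range(n_eta)]
    offsets = [(a, b) for a in (-1, 0, 1) for b in (-1, 0, 1) if (a, b) != (0, 0)]
    jets = []
    for e in range(n_eta):
        for p in range(n_phi):
            c = core[e][p]
            if c <= threshold:
                continue
            if not any(_beats(core, e, p, a, b, c) for a, b in offsets):
                # Jet energy is summed in the larger environment window.
                jets.append((e, p, window_sum(towers, e, p, env_half)))
    return jets
```

Changing `env_half` gives the differently sized (programmable) windows; a non-square or weighted jet shape would replace the plain sum in `window_sum`.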

7

Data replication

Sliding window algorithm requiring large scale replication of data
• Forward duplication only (fan-out), no re-transmission
• Baseline: no replication of any source into more than two sinks
• Fan-out in eta (or phi) handled at source only (DPS)
  • Duplication at the parallel end (on-FPGA), using additional Multi-Gigabit Transceivers
  • Allowing for differently composed streams
  • Minimizing latency
• Fan-out in phi (or eta) handled at destination only
  • Baseline “far end PMA loopback”
  • Looking into details and alternatives
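
The “no more than two sinks” baseline can be checked with a small model of the eta fan-out. The sketch below assumes the module geometry quoted for the initial baseline (core of 0.8 in eta, i.e. 8 bins of 0.1, plus 4 neighbour bins on each side) and, as a simplification, 8 modules tiling 64 eta bins with no FCAL. The function names are hypothetical.

```python
def module_eta_ranges(n_modules=8, core_bins=8, neighbours=4):
    """Eta bins each module must receive: its core plus the environment margin."""
    n_bins = n_modules * core_bins  # assumed total eta coverage
    ranges = []
    for m in range(n_modules):
        lo = max(0, m * core_bins - neighbours)
        hi = min(n_bins, (m + 1) * core_bins + neighbours)
        ranges.append(range(lo, hi))
    return ranges


def max_sinks(ranges):
    """Largest number of modules that any single eta bin is sent to."""
    n_bins = max(r.stop for r in ranges)
    return max(sum(1 for r in ranges if b in r) for b in range(n_bins))


ranges = module_eta_ranges()
print(len(ranges[3]))                              # eta bins into an inner module
print(max_sinks(ranges))                           # fan-out with 4 neighbour bins
print(max_sinks(module_eta_ranges(neighbours=5)))  # fan-out with 5 neighbour bins
```

In this model an inner module receives 16 eta bins and no bin goes to more than two modules; widening the margin to 5 bins already pushes some bins to three sinks.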

8

Initial baseline: 8+ modules, each covering full phi, limited eta range
• Environment of 0.9 in eta (core bin +/- 4 neighbours)
• Each module receives fully duplicated data in eta:
  1.6 eta worth of data required for a core of 0.8
• 16 eta bins including environment

8 FPGAs per module, each:
• Environment 0.9×0.9
• Each FPGA receives fully duplicated data in eta and phi:
  1.6×1.6 worth of data required for a core of 0.8×0.8
• 256 bins @ 0.1×0.1 in η×φ, e/m + had: 512 numbers
• With baseline 6.4 Gb/s, 64 Multi-Gb/s receivers
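
The per-FPGA numbers follow from simple arithmetic, sketched here. The 8 towers per fibre is the working assumption stated later in these slides (16 bit per tower at 6.4 Gb/s); the function name and parameter names are hypothetical.

```python
def fpga_input_budget(core_bins=8, neighbours=4, layers=2, towers_per_fibre=8):
    """Return (bins per side, bins, numbers, receivers) for one FPGA."""
    side = core_bins + 2 * neighbours            # 0.8 core + 0.4 each side = 1.6
    bins = side * side                           # eta x phi bins at 0.1 x 0.1
    numbers = bins * layers                      # e/m + hadronic layer
    receivers = -(-numbers // towers_per_fibre)  # ceiling division
    return side, bins, numbers, receivers


print(fpga_input_budget())  # (16, 256, 512, 64)
```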

• Fibre count here

10

fibre count / density
• First divide by 2, then multiply by 8; move up to slide 7
• 64 channels per FPGA × 8 FPGAs = 512 signals per module
• Due to full duplication in phi direction, exactly half of all 512 signals are routed into the modules optically on fibres
  • 256 fibres
  • 22 × 12-channel opto receivers
  • 4 × 72-way fibre bundles / MTP connectors
• Option
  • For a larger jet window we require larger FPGAs, some more fibres and replication factor > ×2
  • Aim at higher line rates (currently FPGAs support 13 Gb/s, microPOD 10 Gb/s)
  • Allow for even finer granularity / larger jets / smaller FPGA devices
• If digital processor baseline allows for full duplication of 6.4 Gb/s signals, the spare capacity, when run at higher rate, can be used to achieve a replication of more than 2-fold, so as to support a larger jet environment.
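
The module-level fibre count can be reproduced the same way. This is a minimal sketch using only the figures on this slide; the function and parameter names are hypothetical.

```python
import math


def module_fibre_budget(fpgas=8, receivers_per_fpga=64,
                        receiver_width=12, bundle_width=72):
    """Return (signals, fibres, 12-channel receivers, 72-way bundles) per module."""
    signals = fpgas * receivers_per_fpga   # all FPGA input channels
    fibres = signals // 2                  # other half comes from on-module phi fan-out
    opto_receivers = math.ceil(fibres / receiver_width)
    bundles = math.ceil(fibres / bundle_width)
    return signals, fibres, opto_receivers, bundles


print(module_fibre_budget())  # (512, 256, 22, 4)
```

With 22 receivers of 12 channels there are 264 optical inputs, so 8 channels stay spare; 4 bundles of 72 provide 288 fibre positions.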


11

How to fit on a module?
• ATCA
• 8 processors (~XC7VX690T)
• 4 microPODs each
• fan-out passive or “far end PMA loopback”
• Small amount of control logic / non-realtime (ROD)
• No! Might add 9th processor for consolidation of results
• Opto connectors in Zone 3
• Module control !!!
• Maximise module payload with help of small-footprint ATCA power brick and tiny IPMC mini-DIMM

[Figure: module drawing, label “Z3”]

12

…and 3-d


13

jFEX system
• Need to handle both fine granularity and large jet environment (minimum 0.9×0.9)
  • Require high density / high bandwidth per module
  • Need that density to keep input replication factor at acceptable level
• ~ 8 modules (+FCAL ?) (picture of the new part: e, j; e greyed out)
• Single crate: go for ATCA shelf / blades
• Sharing infrastructure with eFEX
  • Handling / splitting of fibre bundles
  • ROD design
  • Hub design
  • RTM

Input signals:
• Granularity 0.1×0.1 (η×φ)
• One electromagnetic, one hadronic tower (move up)
• --- (move down)
• Unlike eFEX, no “BCMUX” scheme due to consecutive non-zero data
• 6.4 Gb/s line rate, 8b/10b encoding, 128 bit per BC
• For now, assume 16 bit per tower, 8 towers per fibre
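
The 128 bit per BC and 8 towers per fibre follow from the line rate and the encoding overhead. The sketch below assumes a nominal 40 MHz bunch-crossing rate (the slides do not state the exact figure); the function name is hypothetical.

```python
def fibre_payload(line_rate_mbps=6400, bc_rate_mhz=40, bits_per_tower=16):
    """Return (raw bits, payload bits, towers) per bunch crossing on one fibre."""
    raw_bits = line_rate_mbps // bc_rate_mhz   # bits on the wire per BC
    payload_bits = raw_bits * 8 // 10          # 8b/10b: 8 data bits per 10 line bits
    towers = payload_bits // bits_per_tower
    return raw_bits, payload_bits, towers


print(fibre_payload())  # (160, 128, 8)
```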


15

Tilecal input options
• List 3 options and mention DPS approach
• For following slides/drawings: Merge Sam and HCSC slides
• Do we attach preferences, do we talk about advantages/disadvantages?
• My preference: NO!!!

Background

• Phase-1 eFEX and jFEX receive digital EM layer data from LAr DPS
• But equivalent Tile data path not available before Phase 2
• So: need to extract digital hadronic tower sums produced from the current analog sums sent to L1Calo
• Three points where this can be done
  • See next slide

16

17

Alternatives

[Figure: Level-1 trigger block diagram with the three candidate extraction points marked 1, 2, 3. Blocks: EM calorimeter digital readout, DPS, analog sums from Tile/LAr, receiver stations, PreProcessor (nMCM), Tile tower “DPS”, L1Calo Trigger (CP, JEP with JEM, CMX), eFEX, jFEX, muon detector, barrel and endcap sector logic, MuCTPi (Muon Trigger), L1Topo, CTP CORE (Central Trigger). Paths: EM data to FEX, hadronic data to FEX, topological info, CTP output. Legend: new/upgraded hardware.]

Can extract Tile tower sums from:
1. Tile receiver stations
2. PreProcessor modules
3. JEM modules in JEP

Considerations
• Latency
• Dynamic range
  • Current L1Calo towers have 8 bit dynamic range with 1 GeV/LSB
  • Would like 9 or 10 bits, if possible
• Cost to implement
• Risk of disruption to existing system
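
What the extra bits buy can be made concrete with a short calculation. The sketch assumes the 1 GeV/LSB scale is kept when bits are added and that values above full scale saturate; both are illustrative assumptions, and the function names are hypothetical.

```python
def full_scale_gev(n_bits, gev_per_lsb=1.0):
    """Largest tower energy representable with n_bits at the given LSB."""
    return (2 ** n_bits - 1) * gev_per_lsb


def encode_tower(energy_gev, n_bits, gev_per_lsb=1.0):
    """Quantise a tower energy, saturating at full scale (assumed behaviour)."""
    counts = int(energy_gev / gev_per_lsb)
    return min(max(counts, 0), 2 ** n_bits - 1)


for bits in (8, 9, 10):
    print(bits, full_scale_gev(bits), encode_tower(300.0, bits))
```

An 8-bit tower saturates at 255 GeV, so a 300 GeV deposit is clipped; with 9 or 10 bits the full scale rises to 511 or 1023 GeV and the same deposit is kept intact.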

18

Option 1: Tile Rx stations

• Signals extracted at arrival point in USA15, so latency cost is minimal
• Must build a new system to digitize and process analog signals
• No constraints on dynamic range
• Cost is high – essentially need to build new receiver and PreProcessor systems
• High risk of disruption to current L1Calo:
  • Analog data path ahead of L1Calo rearranged
  • Where do we fit the new systems that do this?

19

Option 2: PreProcessor

New MCM (Phase 0)
• FPGA based tower processing
• Can drive higher-speed data to the LVDS link driver card (blue arrows)

Replacement link card (Phase 1)
• Send tower data electrically to CP and JEP (same as now)
• An FPGA and parallel-optic transmitter (e.g. minipod) produce hadronic output to FEX
• Fiber ribbon takes data from link card to an MTP/MPO output port (probably on front panel)

20

[Figure: nMCM prototype]

Option 2: PreProcessor

• Minimal latency:
  • Essentially equal to option 1
• Can extend dynamic range:
  • nMCM can drive outputs at higher rates, so more bits per tower possible
  • ‘Easy’ to get 9 bits, 10 bits probably possible
• Relatively low cost
  • nMCM will already exist
  • A few (small) LVDS link boards
  • Possibly need to replace some PreProcessor mother boards (8 layers, low component count)
• Low disruption: Only upgrading existing boards

21

22

Option 3: JEM Upgrade

[Figure: upgraded JEM. Annotations: double-rate tower data from upgraded PPM (960 Mbit/s); upgraded input cards; high-speed links to FEX from input cards to front panel (lowest latency), carrying hadronic tower sums.]

Option 3: JEM upgrade

• Higher latency:
  • Serial transmission from PPr to JEP adds multiple BCs to latency
• Limited dynamic range:
  • BCMUX protocol consumes some bandwidth
  • 9 bits possible (by removing parity), 10 bits probably not possible
• Similar cost to Option 2
  • PreProcessor nMCM and link cards still get replaced (but not PPr mother boards?)
  • Plans to upgrade JEM daughter boards anyway
• Low disruption:
  • Again, similar to Option 2
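
The raw bit budget behind the dynamic-range limits of Options 2 and 3 can be estimated from the LVDS link rates quoted in these slides (480 Mb/s now, 960 Mbit/s double rate). The sketch assumes a nominal 40 MHz bunch-crossing rate; how the bits are divided between tower data, BCMUX flags, parity and framing is not given in the slides and is not modelled. The function name is hypothetical.

```python
def raw_bits_per_bc(link_rate_mbps, bc_rate_mhz=40):
    """Raw bits one serial link carries per bunch crossing."""
    return link_rate_mbps // bc_rate_mhz


print(raw_bits_per_bc(480))  # 12
print(raw_bits_per_bc(960))  # 24
```

Doubling the link rate doubles the raw budget per bunch crossing; the usable gain per tower depends on the protocol overhead, which is why 9 bits is quoted as reachable and 10 bits as doubtful for the JEM path.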

23

[Figure: Current System. PPM with MCM #1 to MCM #16 (ch1 to ch4, BCMUX, SUM), LCD (f/o & routing) with Virtex-II FPGA and LVDS-Tx, 480 Mb/s LVDS cables to CP (32x) and to JEP (16x), r/o data to DAQ via RGTM, J2 power.]

[Figure: Phase-I: first solution. PPM with nLCD (f/o & routing), Spartan-6/Artix-7 FPGA, MUX + LVDS-Tx; 480 Mb/s to CP, 960 Mb/s to JEP over LVDS cables; r/o data to DAQ via RGTM.]

[Figure: Phase-I: second solution (A). As the first solution, plus a rear extension with Xilinx 7 Series FPGA and SNAP12 transmitter sending data to jFEX over optic fibres.]

[Figure: Phase-I: second solution (B). Existing LCD (Virtex-II, 480 Mb/s to CP and JEP) plus a rear extension with Xilinx 7 Series FPGA and SNAP12 transmitter sending data to jFEX over optic fibres; one FPGA marked “?”.]

Diagrams from V. Andrei, KIP, L1Calo Weekly Meeting, 10/01/2013

28

Firmware
• jFEX
  • Sliding window algorithm
  • Infrastructure for high speed links
  • Module control
  • DAQ (buffers and embedded ROD functionality, common effort)
• Example: JEM based Tilecal inputs
  • Serialization nMCM 960Mb/s
  • Serialization JMM 6.4Gb/s
  • Re-target existing JEM input firmware to new FPGA

29

• Development line


30

Demonstrator projects : “GOLD”

Try out technologies and schemes for L1Topo
• Fibre input from the backplane (MTP-CPI connectors)
• Up to 10Gb/s o/e data paths via industry standard converters on mezzanine
• Using mid range FPGAs (XC6VLX240T) up to 6.4Gb/s, 24 channels per device
• Typical sort/dφ algorithm (successfully implemented) takes ~ 13% logic resources
• Real-time output via opto links on the front panel (currently used as data source for latency measurements etc.)
• Will continue to be used as source/sink for L1Topo tests

31

Recent GOLD results (Virtex-6)
• Jitter analysis on cleaned TTC clock (σ = 2.9 ps)
• Signal integrity: sampled in several positions along the chain
• MGT and o/e converters settings optimization
• Bit Error Rate (BER) < 10^-16 at 6.4 Gbps / 12 channels
• Eye widths above 60 ps (out of 156 ps)
• Crosstalk among channels measured in some cases but with negligible effect

[Figure: sampled after fan-out chip]
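
Two of these numbers can be put in context with a short calculation. The 156 ps is the unit interval at 6.4 Gb/s. For the BER limit, the slides do not say how it was obtained; the sketch assumes an error-free run and a 95% confidence level, for which the usual estimate of the number of bits needed is -ln(1 - CL) / BER. Function names are hypothetical.

```python
import math


def unit_interval_ps(line_rate_gbps):
    """Duration of one bit on the line, in picoseconds."""
    return 1000.0 / line_rate_gbps


def seconds_for_ber_limit(ber_limit, line_rate_gbps, channels, confidence=0.95):
    """Error-free running time needed to claim BER < ber_limit (assumed method)."""
    bits_needed = -math.log(1.0 - confidence) / ber_limit
    return bits_needed / (line_rate_gbps * 1e9 * channels)


ui = unit_interval_ps(6.4)
print(ui, 60.0 / ui)                                   # unit interval, open eye fraction
print(seconds_for_ber_limit(1e-16, 6.4, 12) / 86400)   # days of error-free running
```

Under these assumptions an eye width of 60 ps is a little under 40% of the unit interval, and the BER limit corresponds to about four and a half days of error-free running summed over 12 channels.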

GOLD latency (Virtex-6)

Latency measured along the real-time path at various points, at 6.4 Gb/s, 16 bit data width, and 8b/10b encoding

• Far End PMA loopback: 34 ns latency
• Electrical (LVDS) output: 63 ns latency
• Far End PCS loopback: 78 ns latency
• Parallel loopback in fabric: 86 ns latency

Firmware mod for latency measurement including algorithm, and electrical out towards CTP under way (Volker almost got there…)
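
For comparison with a trigger latency budget, the measured values can be expressed in bunch crossings. The sketch assumes a nominal 25 ns bunch-crossing period; the dictionary keys are shorthand labels for the four measurement points listed above.

```python
BC_NS = 25.0  # nominal bunch-crossing period (assumption)

LATENCY_NS = {
    "far-end PMA loopback": 34,
    "electrical (LVDS) output": 63,
    "far-end PCS loopback": 78,
    "parallel loopback in fabric": 86,
}


def in_bunch_crossings(latency_ns, bc_ns=BC_NS):
    """Convert a latency in nanoseconds to bunch crossings."""
    return latency_ns / bc_ns


for point, ns in LATENCY_NS.items():
    print(f"{point}: {ns} ns = {in_bunch_crossings(ns):.2f} BC")
```

The PMA loopback costs about 1.4 BC, while going through the PCS adds another 44 ns, which is consistent with the choice of far-end PMA loopback as the baseline for the fan-out.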

33

Topo processor details – post PDR
• Real-time path:
  • 14 fibre-optical 12-way inputs (miniPOD)
  • Via four 48-way backplane connectors
  • 4-fold segmented reference clock tree, 3 Xtal clocks each, plus jitter-cleaned LHC bunch clock multiple
  • Two processors XC7VX690T (prototype XC7VX485T)
  • Interlinked by 238-way LVDS path
  • 12-way (+) optical output to CTP
  • 32-way electrical (LVDS) output to CTP via mezzanine
• Full ATCA compliance / respective circuitry on mezzanine
• Module control via Kintex and Zynq processors
  • Initially via VMEbus extension
  • Eventually via Kintex or Zynq processor (Ethernet / base interface)

34

floor plan, so far…

• Processor FPGA configuration via SystemACE and through module controller
• Module controller configuration via SPI and SD-card
• DAQ and ROI interface
  • Two SFPs, L1Calo style
  • up to 12 opto fibres (miniPOD)
  • Hardware to support both L1Calo style ROD interface, and embedded ROD / S-Link interface on these fibres

35

… and in 3-d


36

Tests miniPOD on VC707
• From Eduard
• Note: microPOD same o/e engine as miniPOD
• (no pics on microPOD, probably confidential…)

37

Schedule
• Extract from Ian's Gantt chart, and that’s it?
• Check with Ian whether anything to be updated before the session
• Also manpower estimates?
• If so, anything to be said in addition to what’s in the document?

38

Conclusion
• The 8-module jFEX seems possible with ~2013’s technology
• Key technologies explored already (GOLD, L1Topo, …)
• Use of microPODs challenging for thermal and mechanical reasons, but o/e engine is the same as in popular miniPODs
• Scheme allows for both fine granularity and large environment at 6.4Gb/s line rate and a limit of 100% duplication of input channels
• Rather dense circuitry, but comparable to GOLD demonstrator
• For even finer granularity and / or larger jets things get more complicated
  • Need to explore higher transmission rates
• DPS needs to handle the required duplication (in eta)
  • Details of fibre organization and content cannot be presented now
  • Started work on detailed specifications, in parallel exploring higher data rates…
• Tilecal signals required in FEXes in fibre-optical format
• Three options for generating them
• All seem viable, but probably at different cost
• Need to agree on a baseline before TDR
