The RCU2 ALICE TPC readout electronics consolidation for Run 2
description
Transcript of The RCU2 ALICE TPC readout electronics consolidation for Run 2
The RCU2 ALICE TPC readout electronics consolidation for Run 2
Johan AlmeBergen University College, Norway
on behalf of the ALICE-TPC collaboration
TWEPP 2013, Perugia, Italy 23rd – 27th September 2013
2
ALICE detector
TPC detector
Johan Alme - TWEPP 2013, Perugia, Italy 23rd – 27th September 2013
3
Present ALICE TPC Readout Electronics
• TPC is divided into 2 x 18 Sectors:– 216 Readout Control Units (RCUs)– 4356 Front End Cards (FECs)
• The RCU is a complex system:– RCU Motherboard– Detector Control System (DCS) Board
• Embedded Linux platform
– Source Interface Unit Card (SIU)– In total 3 PCBs with 4 FPGAs
• 2 branches per RCU of a multidrop parallel bus– 18 – 25 FECs per RCU
• depending on position
– Peak bandwidth 200 MB/sJohan Alme - TWEPP 2013, Perugia, Italy 23rd – 27th September 2013
SIU
DCS
RCU
TTC (optics)Monitor/Control (Ethernet)
DDL link (optics - 160 MB/s)
FECs
– B
ranc
h A
FECs
– B
ranc
h B
GTL
bus b
ranc
h A
GTL
bus b
ranc
h B
4
ALICE Run 2 Scenario• Run 2 is the period between Long Shutdown 1 (LS1) and Long Shutdown 2 (LS2)
– Planned start of Run 2: January 2015 • Run 2 will have higher interaction rates and higher track densities• Expected values for Pb-Pb collisions:
– Peak luminocity: 1 – 4 x 1027 cm-2s-1
– 8 – 30 kHz interaction rate• Current ALICE design value: 8 kHz
– 40% more data for central events• Event sizes increase from 65 MB to 90MB
• Higher readout speed needed• Higher radiation load => more Single Event Upsets
Johan Alme - TWEPP 2013, Perugia, Italy 23rd – 27th September 2013
Low multiplicity event from Run 1 High multiplicity event is completely crowded Run 2 will have even higher multiplicity
5
Motivation for Upgrade (I)• Data Rate Limitations:
– Current bottleneck: Bandwidth of the data bus• ≤ 200 MB/s per branch • Large (fixed) overhead - addressing and header
– Bandwidth of optical detector data link (DDL)• 160 MB/s
• Stability Limitations (Run 1 experiences):– End-Of-Run situations caused by radiation
related errors in the TPC electronics have been seen
– DCS board failures frequent – few per fill• Not “mission critical”
Johan Alme - TWEPP 2013, Perugia, Italy 23rd – 27th September 2013
Two read-out modes: sparse and full read-out.Read-out time for full TPC is defined by the slowest read-out partition
6
Motivation for Upgrade (II)
• Main message:– The readout performance is limited by the parallel bus architecture
• This reduces the expected performance for Run 2
– Conclusion: We need a faster data readout than what the current solution provides!
– Both main FPGAs on the RCU has a relatively high SEU susceptibility and almost no design-level protection • Not enough resources in the FPGAs to implement it
– Conclusion: With even the higher luminosity in Run 2 we need an improved radiation tolerance for the readout chain!
Johan Alme - TWEPP 2013, Perugia, Italy 23rd – 27th September 2013
7
RCU2The ALICE TPC Consolidation Effort
• Constraints:– Time-budget– Reuse all exisiting interfaces
• TTC fiber• DAQ fiber• Ethernet cable• Power cable• GTL bus
– No change to form-factor and cooling
– Improve radiation tolerance– Increase data rate to meet Run2
conditions
• Solution – RCU2:– One single radiation tolerant FLASH based SmartFusion2 FPGA– FPGA design composed of a few building blocks largely based on existing modules– Backplanes - Double the number of readout branches to 4 branches– Nice to have functionality: Radiation Monitor
Johan Alme - TWEPP 2013, Perugia, Italy 23rd – 27th September 2013
RCU2
TTC (optics)Monitor/Control (Ethernet)
DDL2 link (optics – 5 Gb/s)
FECs
Bran
ch B
O
A_ou
ter
A_in
ner
B_in
ner
B_ou
ter
FECs
Bran
ch B
IFE
CsBr
anch
AO
FECs
Bran
ch A
I
8
Branch Partitioning• There are 6 readout partitions per TPC sector:
– 6 different sizes of backplanes– Two wings per partition
• Each wing is electrically split into two branches each with new branch naming convention– A A_inner (AI) | A_outer (AO)– B B_inner (BI) | B_outer (BO)
• Depending on partition – various number of FECs per branch– RP0: 5 + 4 + 4 + 5– RP1: 6 + 6 + 6 + 7– RP2: 5 + 4 + 4 + 5– RP3-RP5: 5 + 5 + 5 + 5
Johan Alme - TWEPP 2013, Perugia, Italy 23rd – 27th September 2013
RP0
RP1
RP2
RP3
RP4
RP5
9
RCU2 Backplanes• Two options:
1. Adapter card solution2. All in One solution
• Pros and Cons are considered for both options
• Prototypes will be produced for both options and decision taken afterwards.
• Location of connectors between RCU2 and branches are fixed:– Backward compatible with current
backplanes
Adapter card solution
All in One solution
Present backplanes
Electrical split of backplanes
Johan Alme - TWEPP 2013, Perugia, Italy 23rd – 27th September 2013
10
RCU2 Hardware
Johan Alme - TWEPP 2013, Perugia, Italy 23rd – 27th September 2013
11
Microsemi Smartfusion2• The Smartfusion2 M2S050-
FG896 provides:– Radiation Tolerant Flash Cells– SECDED encoded DDR RAM
interface– Microcontroller Subsystem
with ARM Cortex M3 and useful peripherals• Platform for Embedded Linux
– 5 Gb/s operation in custom working mode on one lane of the SERDESIF for DDL2*
– Enough resources to have TMR on vital parts of the logic
Johan Alme - TWEPP 2013, Perugia, Italy 23rd – 27th September 2013
* Not available before end of 2013
12
RCU2 Hardware• Pictures shows first draft of RCU2 layout
• Important points:– Backward compatible regarding
placement of connectors– Reuse of cooling plates
• Constraints placement regarding heat dissipation and connector locations
• Power estimates– Typical values (current consumption):
• Ext 4V3: ~1.2A• Ext 3V3: ~4.3A
– The GTL bus drivers are by far the most power-hungry components on the board
Johan Alme - TWEPP 2013, Perugia, Italy 23rd – 27th September 2013
RCU2 back
RCU2 front
13
RadMon – Radiation Monitor• On the present RCU there is an additional FPGA
that counts and corrects SEUs in the configuration memory in the main FPGA– This acts as a radiation monitor!
• This is an interesting feature to keep for the RCU2:– Additional SRAM memory and Microsemi proASIC3
250 added to the RCU2• Not enough user-IOs on the smartFusion2 for this feature
– Low risk – design already done and proven*– Cypress SRAM – same as used for the latest LHC
RadMon devices• Extensively characterized in various beams (n, p, mixed) and
compared/benchmarked to FLUKA MC simulations
Johan Alme - TWEPP 2013, Perugia, Italy 23rd – 27th September 2013
* Arild Velure ”Anvendelse av FPGA som preprosessor i en SRAM-basert nøytrondetektor”, Master Thesis 2011
14
TTC interface• TTC = Timing, Trigger and Control
– System responsible for distribution of system clock and triggers• One of the challenges of the upgrade is that the TTCrx ASIC is out of stock
– This is the Trigger/Clock receiver chip for the Front end Electronics– No proven radiation tolerant TTCrx replacement exist!
• Solution:– HFBR 2316T Optical Receiver (suggested by TTC group)– MAX3748 Post limiting amplifier– Clock Data Recovery internally in FPGA
• Challenge: The clock signal must be recovered with high accuracy!
• This has been tested and verified in the lab with real-life setup – Radiation tolerance not yet proven
• Involves simple components with no configuration registers – Success very likely!
• Makes reuse of existing FPGA modules very easy
HFBR2316T
Post Limiting
Amplifier
Digital LVDSAnalog single ended CDR
Optical TTC signalClk 40MHz
Channel AChannel B
Smartfusion2
Johan Alme - TWEPP 2013, Perugia, Italy 23rd – 27th September 2013
15
RCU2 FPGA design• The ARM Cortex M3 hosts an
Embedded Linux platform– New drivers are needed– Most software can be reused
• The FPGA PLD design is heavily based on the present RCU design with a few new features
• New readout scheme:– Ordering of channels by pads &
rows– Higher clock speed – Pause and recover
implementation– Discard of Junk data
• Important: Improve radiation tolerance
Johan Alme - TWEPP 2013, Perugia, Italy 23rd – 27th September 2013
Clock/Data Recovery Module
TTCClk 40Mhz
TTC Ch A
TTC Ch B
Trigger InterfaceModule
Altro Interface Module
FCBus
If
Branch A_inner
Altro Interface Module
FCBus If
Branch A_inner
Altro Interface Module
FCBus If
Branch B_inner
Altro Interface Module
Branch B_outer
DDL2 Interface (SERDES)
APB bus decoder
Arbiter
Result Unit
Instruction Sequencer
Data Assembler
Event Manager
CDH
DDL2
dat
a
ALTRO Raw data
ARM Cortex M3 CPU(Linux)
SoCPLD
PLD programminginterface
Hardcore interfacesSPI FLASH, DDR2/3 Mem
RadMON
Microsemi Flashpro
FPGA
TTC
DAQ
DCS
4 branches – 18-25 FECs
Smartfusion2
Triggers
Triggers CDH
TTC
Ch A
TTC
Ch B
RCU Bus
ChunkFIFO
4 x databusFC
Bus If BO
SPIcore
ADCsTemperature,
voltage, current monitoring
I2Ccore
ALTRO busFrontend Control Bus
HardwareID EEPROM
JTAG programming IF
Ethernet
FICIF core
FICIF core
System Controller
APB
CLK 160 MHz
JTAG IF
CLKManager125MHz
125MHz
Ported from RCU#1
New Module
Partly New Module
APB
16
Readout Scheme and Data Ordering (I)• One row of sensor pads are spread over
all 4 branches• Data pre-prossesing demands that data is
shiped ordered by padrow– Needed to find charge clusters along tracks
in the HLT/DAQ system• A chunk of data is defined:
– Number of channels belonging to same padrow and branch ordered by pad location
– The order of the channels within a chunk is configurable to match pad location
Johan Alme - TWEPP 2013, Perugia, Italy 23rd – 27th September 2013
Padrow N
Pad 0
Pad N
Row N without branches
Row N with 4 branches
Pad 0
Branch A_outer Branch A_inner Branch B_inner Branch B_outer
Pad N
Chunk A_outer Chunk A_inner Chunk B_inner Chunk B_outer
Padrow N
17
Readout Scheme and Data Ordering (II)
• Given average event size estimations for 0-10% central events (Run 2):– Max average size of one chunk is ~35 kBit
• Readout partition 1– Highest channel density – Largest data volume
– Bandwidth per branch: ~1.0 Gbit/s– Min. Readout speed per FIFO: 32 bit @ 125
MHz– Bandwidth RCU2 DDL2 interface: 4.0 Gbit/s
• Given internal memory resources:– Branch FIFO capacity: 4.7 average max size
chunks
• Simulations are planned to get exact figures
Johan Alme - TWEPP 2013, Perugia, Italy 23rd – 27th September 2013
Chunk 0_AO Chunk 0_AI Chunk 0_BI Chunk 0_BO
Chunk 1_AO Chunk 1_AI Chunk 1_BI Chunk 1_BO
FIFOAO
FIFOAI
FIFOBI
FIFOBO
Data Assembler+ Formatter
Branch AO Module Branch AI Module Branch BI Module Branch BO Module
Chunk N_AO Chunk N_AI Chunk N_BI Chunk N_BO
Readout List
Memory
Hit List Memory
ALTR
O IF
Mod
uleReadout
List Memory
Hit List Memory
ALTR
O IF
Mod
uleReadout
List Memory
Hit List Memory
ALTR
O IF
Mod
uleReadout
List Memory
Hit List Memory
ALTR
O IF
Mod
ule
DDL2interface
Chunk 0_AO
Chunk 0_AI
Chunk 0_BI
Chunk 0_BO
Chunk 1_AO
Chunk 1_AI
Round Robin Readout
18
Irradiation Campaigns• Component testing will be done at Oslo Cyclotron (in-house)
– ~30 MeV protons– 1 slot in October & 3 slots in November– We can not test all components individually (No time!)– A list have been made with components we find it critical to test
as early as possible• This is essentially ”new” components that we don’t have experience with,
or• Components in critical parts of the design (i.e. TTC chain)
– Testing of the smartFusion2• Effects of Single Event Transients can be a problem at higher clock
speeds*
• RCU2 system test in Uppsala or PSI early next year– 180 MeV Protons & Neutrons possible– Beamtime will be requested when we are certain to reach the
milestone
Johan Alme - TWEPP 2013, Perugia, Italy 23rd – 27th September 2013
* https://indico.cern.ch/getFile.py/access?resId=0&materialId=slides&confId=152527
19
Summary & Outlook• The proposed upgrade would enable ALICE to collect a
significant larger amount of events in the central barrel at a moderate cost– Estimated cost: ~455 kCHF
• The read-out time for TPC events is estimated to be improved by a factor of up to 2.6– TPC will conform to the running scenario envisaged for Run 2
of ALICE– SystemC simulations are planned to confirm this figure
• Radiation tolerance will be improved
• Biggest challenge for the project is the tight time-budget constraint– At time of writing we are approximately on schedule
Timeline RCU2
Jan. 2013 LS1 starts
Apr. 2013 ALICE TPC LS1 consolidation effort decided
Oct. 2013
Apr. 2014
RCU2 HW prototype final review: start of production
Mass production start
Sep. 2014 Commisioning start
Jan. 2015 RUN2 start
Nov. 2013 Irradition campaigns - componentsBackplane prototype finished
Jan. 2014 RCU2 prototype functional testFPGA design first version ready
Feb. 2014 Irradition campaigns – system test
May 2014 Embedded software prototype finshed
Johan Alme - TWEPP 2013, Perugia, Italy 23rd – 27th September 2013
20
Thanks for Listening• RCU2 people (in no particular order):
Johan Alme ([email protected]) – Bergen University College, NorwayLars Bratrud, Jørgen Lien, Rune Langøy – Vestfold University College, NorwayKetil Røed, Chengxin Zhao – University of Oslo, NorwayKjetil Ullaland, Dieter Röhrich, Shiming Yang, Arild Velure, Inge Nikolai Torsvik,
Christian Torgersen – University of Bergen, NorwayAttiq Ur Rehman – COMSATS, Islamabad, PakistanTivadar Kiss, Ernö David – Cerntech, Budapest, HungaryChristian Lippman– GSI Darmstadt, GermanyAnders Oskarsson, Peter Christiansen, Lennart Osterman – University of Lund,
SwededHarald Appelshäuser, Torsten Alt , Attilio Tarantola – Goethe University
Frankfurt, GermanyFillipo Costa – CERN, Switzerland Taku Gunji – University of Tokyo, Japan
Johan Alme - TWEPP 2013, Perugia, Italy 23rd – 27th September 2013