STAR High-Level Trigger & STAR Tracking Focus Group
HLT and CA Tracker
Sept 23rd, 2017 CBM-STAR Joint Workshop, CCNU Wuhan 1
Aihong Tang
Outline
• STAR High Level Trigger
  o History
  o CA tracker (online tracking & offline as seeding)
  o Infrastructure
• HLT Spring/Summer 2017 activities
• Plan for BES-II
STAR HLT History
• STAR Level-3 trigger system (Developed by Frankfurt group, conformal mapping based tracker, phased out ~2002)
• HLT 1.0 (2010 – 2012) – Borrowed CPU share from TPX machines. Fragmented event reconstruction. Not scalable.
• HLT 2.0 (2013 – 2016) – Independent farm sponsored by NSFC. Integrate all tasks in one process. Use CA tracker to replace the L3 tracker, including experimental usage of KFParticle. Developed new DAQ infrastructure. Scale up effortlessly.
• HLT 3.0 (2017 – ) – Balance the total throughput of ~1k CPU threads and ~10k Phi threads. Decompose the HLT task into independently schedulable sub-tasks and hide the details of Phi from the DAQ side.
Year    # of Nodes    # of CPU
2012    4             96
2014    9             296
2016    27            1192 + 45 Xeon Phi
Integration with DAQ
HLT
• STAR HLT uses high-performance computers to do real-time event reconstruction and analysis
• Provide additional event selection capability based on physics analysis on top of hardware trigger layers
Cellular Automaton Tracker
• local data access
• intrinsically parallel
• extremely simple algorithms
• suitable for SIMD
S. Gorbunov et al., "ALICE HLT High Speed Tracking on GPU", Real Time Conference (RT), 2010 (excerpt from p. 1847)
Fig. 6. Reconstruction time on CPU for events with different track multiplicity.
Fig. 7. a) Neighbors finder. b) Evolution step of the Cellular Automaton.
The tracking algorithm starts with a combinatorial search for track candidates (tracklets), which is based on the Cellular Automaton method [3]. Local parts of trajectories are created from geometrically nearby hits, thus eliminating unphysical hit combinations at the local level. The combinatorial processing composes the following two steps:
• 1. Neighbor finder: For each hit at a row k the best pair of neighboring hits from rows k+1 and k-1 is found, as it is shown in Fig. 7(a). The neighbor selection criteria requires the hit and its two best neighbors to form a straight line. The links to the best two neighbors are stored. Once the best pair of neighbors is found for each hit, the step is completed.
• 2. Evolution step: Reciprocal links are determined and saved, all the other links are removed (see Fig. 7(b)).
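The two combinatorial steps can be illustrated with a minimal Python sketch. It is not the ALICE or STAR implementation: it assumes a toy 2D geometry with equally spaced pad rows and one coordinate per hit, and the names `find_neighbors`, `evolve` and the cut `max_dev` are hypothetical.

```python
from itertools import product

def find_neighbors(rows, max_dev=0.5):
    """Step 1 (toy version). rows[k] is the list of hit coordinates on pad row k.

    Assumes equally spaced rows, so three hits lie on a straight line when
    x_down + x_up - 2 * x is zero. Returns two dicts mapping a hit
    (row, index) to its best neighbor on the row above and on the row below.
    """
    up, down = {}, {}
    for k in range(1, len(rows) - 1):
        for i, x in enumerate(rows[k]):
            best = None
            pairs = product(enumerate(rows[k - 1]), enumerate(rows[k + 1]))
            for (a, x_down), (b, x_up) in pairs:
                dev = abs(x_down + x_up - 2 * x)
                if dev <= max_dev and (best is None or dev < best[0]):
                    best = (dev, a, b)
            if best is not None:
                down[(k, i)] = (k - 1, best[1])
                up[(k, i)] = (k + 1, best[2])
    return up, down

def evolve(up, down):
    """Step 2 (toy version). Keep a link h -> u only if u points back to h."""
    return {h: u for h, u in up.items() if down.get(u) == h}

# Two straight toy tracks crossing five pad rows.
rows = [[0.0, 10.0], [1.0, 10.0], [2.0, 10.0], [3.0, 10.0], [4.0, 10.0]]
up, down = find_neighbors(rows)
links = evolve(up, down)
```

In this sketch each hit is processed independently of the others, which is the property that makes the neighbor finder easy to parallelize. Hits on the first and last rows have no pair of neighbors, so they never receive a reciprocal link here.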
Every saved one-to-one link defines a part of the trajectory between the two neighboring hits. Chains of consecutive one-to-one links define the tracklets. One can see from Fig. 7(b) that each hit can belong to only one tracklet because of the strong evolution criteria. This uncommon approach is possible due to the abundance of hits on every TPC track. Such a strong selection of tracklets results in a linear dependence of the processing time on the number of track candidates. When the tracklets are created, the sequential part of the reconstruction starts, implementing the following two steps:
Fig. 8. Reconstruction performance for proton-proton collisions at 14 TeV.
Fig. 9. Reconstruction performance for central heavy ion collisions at 5 TeV.
• 3. Tracklet construction: The tracklets are created by following the hit-to-hit links as it is described above. The geometrical trajectories are fit using a Kalman Filter, with a quality check. Each tracklet is extended in order to collect hits being close to its trajectory.
• 4. Tracklet selection: Some of the track candidates can have intersected parts. In this case the longest track is saved, the shortest removed. A final quality check is applied to the reconstructed tracks, including a cut on the minimal number of hits and a cut for low momentum.
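Steps 3 and 4 can be sketched in the same spirit. The Python below is a simplified illustration, not the published algorithm: the Kalman fit, the tracklet extension and the low-momentum cut are omitted, hits are plain integer ids, and `build_tracklets`, `select_tracks` and `min_hits` are hypothetical names.

```python
def build_tracklets(links):
    """Follow one-to-one links (hit -> next hit) to form chains of hits."""
    targets = set(links.values())
    tracklets = []
    for start in links:
        if start in targets:
            continue  # not the first hit of a chain
        chain = [start]
        while chain[-1] in links:
            chain.append(links[chain[-1]])
        tracklets.append(chain)
    return tracklets

def select_tracks(candidates, min_hits=3):
    """Longest candidate wins when candidates share hits; short ones are cut."""
    used = set()
    selected = []
    for cand in sorted(candidates, key=len, reverse=True):
        if len(cand) < min_hits:
            continue  # quality cut on the minimal number of hits
        if used.isdisjoint(cand):
            selected.append(cand)
            used.update(cand)
    return selected

# Chains from links: 1 -> 2 -> 3 and 5 -> 6.
tracklets = build_tracklets({1: 2, 2: 3, 5: 6})

# Candidates after a (not shown) extension step may overlap.
tracks = select_tracks([[1, 2, 3, 4], [3, 4, 5], [6, 7, 8], [9]])
```

Sorting by length first is one simple way to realize "the longest track is saved, the shortest removed"; a production tracker would presumably also compare fit quality.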
IV. TRACKER EFFICIENCY
The performance of the HLT track finder of 99.9% for proton-proton events and 98.5% for central Pb-Pb collisions has been verified on simulated events. Corresponding efficiency plots are shown on Figs. 8 and 9. In addition to the high efficiency, the real-time reconstruction is an order of magnitude faster than the off-line algorithm used as reference.
The described algorithm has the advantage of a high degree of locality and parallelism. Step one only searches for local neighbors to each hit. It can be done in parallel for all the hits as the result does not depend on the order of processing. Step three
[Figure: NHitsInTrack distribution; Entries 33666, Mean 12.2, RMS 8.63. Igor Kulakov, BNL, Aug 2013]
HLT Based CA (offline calibration)
eval_Sti-CA/daq_sl53.ittf/year_2011/AuAu200_production
STAR AuAu 200GeV Run11
Ivan Kisel, GSI STAR Tracking Review Meeting, Berkeley, November 14, 2011
Cellular Automaton versus Track Following
The CA track finder is more stable w.r.t. track multiplicity and is ~10 times faster than the TF based Sti track finder.
The Track Following method is well suitable for simple event topologies, but suffers from large combinatorics in case of high track densities. In addition, the final efficiency of the TF method is limited by the seeding efficiency.
The Cellular Automaton method is based on local analysis of data. The CA algorithm has staged structure, therefore it accumulates continuously the tracking information while working with hits, neighbors, track segments, track candidates and tracks. In addition, CA can apply global competition at each stage of data analysis. Locality of the algorithm makes CA intrinsically parallel.
Improve the STAR tracking by integrating the CA track finder as a seed finder for Sti.
[Figure: CPU time per event and tracking efficiency, Sti tracker vs. CA+Sti tracker (global CA tracker, sector CA tracker), real Au-Au 200 GeV/n data]
CA+Sti is ~50% faster than Sti alone. Efficiency for global tracks has been increased by 7-9%.
Y. Fisyak, I. Kisel, I. Kulakov, J. Lauret, M. Zyzak, "Track Reconstruction in the STAR TPC with a Cellular Automaton Based Approach", CHEP-2010, Taipei, October 18-22, 2010
• StiCA gives 6-12% higher tracking efficiency than Sti in Au+Au 200 GeV collisions, depending on luminosity
• StiCA gives ~8% higher tracking efficiency than Sti in p+p 510GeV collisions
• The pT difference between Sti and StiCA is less than 3% for global tracks and no obvious difference for primary tracks
• StiCA is more stable when bad TPC sectors exist
• StiCA is now the default tracker for Run16/17 offline data production
M.Mustafa & X. Chen
Particle Identification
[Figure: online QA histograms for primary tracks and vertices: Prim_Pt (Pt in GeV/c; Entries 4679274, Mean 0.6214, RMS 0.4307), Prim_Phi (φ), Prim_Eta (η), Prim_dEdx (dE/dx in GeV/cm vs. primary momentum in GeV/c), VertexX (cm; Entries 28742, Mean -0.2138, RMS 0.1614), VertexY (cm; Entries 28742, Mean -0.1627, RMS 0.1637)]
• PID using TPC, BEMC and TOF
• Online self-calibration of TPC dE/dx gain
[Figure: online QA histograms: globalMult and primaryMult (multiplicity); Emc_matchPhiDiff, Emc_towerEnergy (GeV), Emc_towerDaqId, Emc_towerSoftId, Emc_zEdge, Emc_towerEtaPhi (η vs. φ); Tof_LocalZ, Tof_LocalY, Tof_InverseBeta (1/β vs. momentum in GeV/c), Tof_matchId_fireId]
Selected HLT Trigger Algorithms
[Figure: HLT MTD matching efficiency for J/psi vs. pT,mc (GeV/c); curves: Default, Closest match]
• Heavy fragments, e.g. anti-4He
• e+ + e-
• µ+ + µ-
• Primary Vertex Section for HFT
Beam Line Monitoring
• Online 3D primary vertex reconstruction
• Real time beam position monitoring in RHIC low energy runs
• Reject background
• Live feedback to CA for accelerator monitoring and performance tuning
• Benefit all BES-I physics analyses
• Likely to be used again in BES-II
HLT Offline
KFParticle and Xeon Phi Coprocessor
Reconstruction of decay chains
• The state vector does not depend on the number of daughter particles
• All particles are described with the same state vector
• All particles are equivalent
• All functionality is the same for mother and daughter particles
07.07.2011 Maksym Zyzak, 2nd Tracking Workshop 4/22
r = { x, y, z, px, py, pz, E }
Position, direction and momentum State vector
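The idea that mother and daughter particles share one state vector can be illustrated with a small Python sketch. It is deliberately simplified and is not the KFParticle API: there is no covariance matrix and no Kalman vertex fit, the mother is built by summing daughter four-momenta at a given vertex, and the names `Particle`, `from_track` and `combine` are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class Particle:
    """State vector r = {x, y, z, px, py, pz, E}, same for every particle."""
    x: float
    y: float
    z: float
    px: float
    py: float
    pz: float
    e: float

    @property
    def mass(self):
        m2 = self.e ** 2 - (self.px ** 2 + self.py ** 2 + self.pz ** 2)
        return math.sqrt(max(m2, 0.0))

def from_track(pos, mom, mass):
    """Build a daughter from a track position, momentum and mass hypothesis."""
    px, py, pz = mom
    e = math.sqrt(mass ** 2 + px ** 2 + py ** 2 + pz ** 2)
    return Particle(*pos, px, py, pz, e)

def combine(daughters, vertex):
    """Build a mother; the result has the same type as its daughters."""
    return Particle(*vertex,
                    sum(d.px for d in daughters),
                    sum(d.py for d in daughters),
                    sum(d.pz for d in daughters),
                    sum(d.e for d in daughters))

PION_MASS = 0.13957  # GeV/c^2
vtx = (0.0, 0.0, 0.0)
pi_plus = from_track(vtx, (0.206, 0.0, 0.0), PION_MASS)
pi_minus = from_track(vtx, (-0.206, 0.0, 0.0), PION_MASS)
k_short = combine([pi_plus, pi_minus], vtx)

# A mother can itself be used as a daughter of the next decay in the chain.
parent = combine([k_short, pi_plus], vtx)
```

Because `combine` returns the same type it consumes, a decay chain of any depth can be reconstructed with the same function, which is the point made by the bullets above.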
• KFParticle on Intel Xeon Phi (pioneered by FIAS group)
• Intel Xeon Phi 7120P
  o 61 cores x 4 hardware threads per core
  o 512-bit vector registers
  o 16 GB RAM
  o on-board Linux
• Experience with Xeon Phi further benefits KFParticle
KF Particle Finder on Xeon Phi
15 January 2014 Maksym Zyzak, BNL
• Standalone KF Particle Finder is run on the Xeon Phi card.
• Currently search of Ks0, Λ, Ξ- and Ω- is turned on. The parallelism between cores is implemented on the event level.
• The algorithm is significantly modified for higher level of vectorization, including data structure modification.
• The program scales up to 240 logical cores on the Xeon Phi.
[Figure: events/s and speed-up factor vs. number of logical cores, for Intel Xeon E5-2680 2.7 GHz and Intel Xeon Phi 1 GHz with 1, 2, 3 and 4 threads/core]
See M. Zyzak’s talk
Infrastructure Reshape in 2017
[Diagram: one HLT node. Stages: Event Builder, Tracking, Vertex Finding (a thin wrapper), Trigger Decision Making; events #0 to #4 in flight in shared memory; resources: CPU × 40, Xeon Phi × 480 (offloaded task)]
• Fully synchronized system. All working threads are independently scheduled and the workload is balanced between different stages/devices. DAQ infrastructure provided by J. Landgraf from BNL.
• Expose parallelism at different levels: event level and within an event
J. Landgraf and H. Ke
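The decomposition into independently scheduled stages can be sketched with queues and worker threads. This Python example is only an illustration of the pattern; the actual HLT uses shared memory and Xeon Phi offloading, and the stage functions, thresholds and event fields below are hypothetical toys.

```python
import queue
import threading

STOP = object()  # sentinel marking the end of the event stream

def run_stage(func, inbox, outbox, n_workers):
    """Start n_workers threads that apply func to every item from inbox."""
    def worker():
        while True:
            item = inbox.get()
            if item is STOP:
                inbox.put(STOP)  # let sibling workers see the sentinel too
                break
            outbox.put(func(item))
    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    return threads

def run_pipeline(events, stages):
    """stages: list of (function, n_workers). Returns events sorted by id."""
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    groups = [run_stage(f, queues[i], queues[i + 1], n)
              for i, (f, n) in enumerate(stages)]
    for ev in events:
        queues[0].put(ev)
    queues[0].put(STOP)
    for i, group in enumerate(groups):
        for t in group:
            t.join()
        queues[i + 1].put(STOP)  # stage i is drained: close the next queue
    results = []
    while True:
        item = queues[-1].get()
        if item is STOP:
            break
        results.append(item)
    return sorted(results, key=lambda ev: ev["id"])

# Toy stage functions standing in for tracking, vertexing and the decision.
def tracking(ev):
    return dict(ev, n_tracks=ev["hits"] // 10)

def vertex_finding(ev):
    return dict(ev, has_vertex=ev["n_tracks"] >= 2)

def trigger_decision(ev):
    return dict(ev, accept=ev["has_vertex"] and ev["n_tracks"] >= 5)

events = [{"id": i, "hits": 20 * i} for i in range(6)]
results = run_pipeline(events, [(tracking, 4), (vertex_finding, 2), (trigger_decision, 1)])
```

Giving each stage its own worker count is the knob that lets a slow stage (here, tracking) get more threads than a fast one, so several events are in flight at different stages at once.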
HLT Data Taking in Year 2016
• HLT processes almost all of the data taken by DAQ at 1700+ Hz, 1.7+ GB/s input data
• ~1000 Hz Mini-Bias
• ~700 Hz MTD (tracking 6-8 sectors)
• < 40% CPU load
• MTD di-muon quarkonium selection
• Partial TPC tracking, < 300 Hz
• TPC-MTD matching
• Selection rate ~1%
[Figure: MTD quarkonium invariant mass M_inv(µµ) in GeV/c², unlike-sign (US) and like-sign (LS) pairs; panels: full range, Upsilon region, J/psi region, J/psi with ptcut2, J/psi with ptcut4]
HLT Run 17 p+p 510GeV
HLT Data Taking in Year 2017
Spring/Summer activities: TPC Space Charge Calibration (AuAu)
[Figure: <sDCA> [cm] vs. Space Charge Used]
• TPC SC calibration needs to be done once at the very beginning of each run for HLT and fast off-line production
• It needs to be done multiple times during the run if things get bad
• <sDCA> as a function of SC offset has an almost universal relationship, which can be used for self-calibration.
Making the SCC automatic
$\langle \mathrm{DCA}_{xy} \rangle = a \cdot SC + b$
Assumption:
• True SC stays the same at $t_k$ and $t_{k+1}$
$SC_{k+1} = SC_k - \langle \mathrm{DCA}_{xy} \rangle_k / a$
$a \approx -100$ for Au+Au 200 GeV
Reality:
• At Au+Au 54 GeV, ZDCx changes in a fraction of a second
• Need to fit multiple times in a second
• HLT runs at ~1500 Hz
• Fit once every 2000, 500, and 200 events
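The update rule can be tried out on a toy model. The Python sketch below is not the HLT code: it assumes the linear response from the slide with b = -a_true · SC_true, uses arbitrary units for SC, leaves out statistical noise and luminosity changes, and the names `next_space_charge` and `simulate` are hypothetical.

```python
def next_space_charge(sc_used, mean_dca, slope=-100.0):
    """One step of SC_{k+1} = SC_k - <DCA_xy>_k / a."""
    return sc_used - mean_dca / slope

def simulate(sc_true, sc_start, true_slope, assumed_slope, n_fits):
    """Iterate the correction against a toy linear detector response."""
    sc = sc_start
    history = []
    for _ in range(n_fits):
        # Toy response: <DCA_xy> = a_true * (SC_used - SC_true)
        mean_dca = true_slope * (sc - sc_true)
        history.append(mean_dca)
        sc = next_space_charge(sc, mean_dca, assumed_slope)
    return sc, history

# Exact slope: the correction lands on the true value after one fit.
sc_exact, dca_exact = simulate(0.002, 0.0, -100.0, -100.0, 2)

# Inaccurate slope: the correction still converges, but over several fits.
sc_off, dca_off = simulate(0.002, 0.0, -100.0, -80.0, 20)
```

In this noise-free model the residual shrinks by a factor |1 - a_true/a_assumed| per fit, so the iteration converges as long as the assumed slope has the right sign and is not too small in magnitude.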
[Figure: DCA XY distribution (hDcaXY, Tue Jun 27 00:15:48 2017), x axis DCAxy [cm]; Entries 12116, Mean -0.06804, RMS 0.8948; fit: χ²/ndf = 202 / 86, Constant 159.7 ± 2.4, Mean -0.0901 ± 0.0058, Sigma 0.4693 ± 0.0067]
Y. Ye, H. Ke and G. Van Buren
Test for Auto-SCC in 2017
[Figure: <DCAxy> vs. time for a typical run (a.u.); two panels (Tue Jun 27 00:27:16 2017 and Tue Jun 27 00:44:21 2017), one spanning ~10 min]
• Auto calibration has an effect. It converges in no more than a few seconds.
• The long-time (~10 min) trend is dominated by the luminosity
• The remaining deviation from zero and the luminosity dependence are due to the inaccuracy of the slope a on the previous page
Test for Auto-SCC in 2017
[Figure: three panels vs. run number (runs up to 18172010): beam position (BeamX, BeamY); gain parameters (innerGainSector, outerGainSector); dcaXy mean value, with "Auto Correction ON" marked]
Spring/Summer activities: HLT Job Management/Sharing
• HLT Cluster Sharing Resources for Offline Production
• ~500 job slots available for use during experiment downtime
• Intend to share the resources with anyone
• Wayne Betts is experimenting with migrating production jobs submitted to the STAR online pool to the HLT cluster
• Both clusters are in the DAQ room
• The HLT cluster has no AFS access and, in general, no network access (easy to solve, needs an additional cable and switch)
• Two ways of doing this, and both have been successful
  o Merge into the online Linux pool
  o Flock jobs, i.e. grid production
• CephFS (massive storage) and Xeon Phi cards both require a specific version of the Linux kernel. Potential conflict is under investigation.
W. Betts & H. Ke
HLT Cluster Summer Activities
[Figure: HLT cluster usage, with the max. # of slots marked]
• Users mainly from BNL, GSI and Czech for TFG
• ~67000 CPU hours in two weeks (as of 09/17/2017, only limited historical data available)
Plans for BES-II
• Produce picoDST online (express online production) for fast diagnostics with physics quantities, which is critical for decision making and fast turnaround.
  o Take advantage of otherwise mostly idle resources.
  o Speed up BFC with the new tracker and Xeon Phi cards.
  o Gain experience for future online parallel computing.
• Prototype next year
BES-II
• Will run express online production for HLT-good events only.
• Total ~1200 slots. For HLT PV filtering: ~400 slots. For express production: 500 – 800 with all Xeon Phi cards.
• Projected production CPU time per event is ~3 s at 39 GeV without additional optimization, i.e. 0.3 Hz per slot¹
  o Total processing rate is 150-240 Hz
  o Targeted event rate is up to 450 Hz for HLT-good events²
  o Looking for a 3x speed-up
• A 25-30 TB buffer is needed to hold one day's data, assuming DAQ runs 16 hr per day at 450 Hz of "HLT good events"
• Requires a copy of all "HLT good events", which puts additional pressure on the DAQ network and CPU power; the latter in particular is already stretched
¹ Energies for BES-II are below 20 GeV; 39 GeV is used as the baseline here to take into account the ~50% event size increase due to iTPC.
² From D. Cebra's estimate.
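The numbers on this slide can be cross-checked with a few lines of arithmetic. The Python sketch below only recombines figures quoted above (slot counts, 0.3 Hz per slot, 450 Hz target, 16 hr per day, 25-30 TB); the function name `express_budget` is hypothetical, and the event size it returns is merely what those figures imply, not a number taken from the talk.

```python
def express_budget(slots, per_slot_hz=0.3, target_hz=450.0,
                   daq_hours_per_day=16.0, buffer_tb=25.0):
    """Recombine the slide's figures into rates and an implied event size."""
    total_hz = slots * per_slot_hz
    events_per_day = target_hz * daq_hours_per_day * 3600.0
    return {
        "total_hz": total_hz,                      # aggregate processing rate
        "speedup_needed": target_hz / total_hz,    # to keep up with the target
        "events_per_day": events_per_day,          # HLT-good events to buffer
        # 1 TB = 1e6 MB (decimal units assumed)
        "implied_mb_per_event": buffer_tb * 1e6 / events_per_day,
    }

low = express_budget(500, buffer_tb=25.0)
high = express_budget(800, buffer_tb=30.0)
```

With 500 slots the required speed-up comes out at the quoted 3x; with 800 slots the gap to the 450 Hz target is smaller, so the 3x figure corresponds to the lower end of the slot range.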
Potential Improvements aka. Action Items
• Full CA tracking including speeding up fitting (Yuri and Grigory)
• Use AVX instructions in CA (Grigory)
• Try to run dN/dx on Phi cards (Hongwei/Biao)
• Additional possible gain from trying out a new compiler and 64-bit.
• Job management system, needed for resource sharing anyway. In place. (Wayne/Levente/Hongwei)
• Disk space requirement and data transfer IO bandwidth – need to be tested and sorted out (Hongwei/Jeff)
BFC timing profile
[Figure: BFC timing profiles for AuAu 200 GeV and AuAu 39 GeV]
Sti and dN/dx take most of the time. Inside Sti, track (re)fitting takes most of the time.
Solution:
1. Speed up fitting.
2. Run one or both on Xeon Phi cards.
than 1% of the total run time of the job. The total input for this production is 397.3 TB, and the total output is 213.7 TB. For the movement of data in both directions the methods employed with KISTI worked very well with little measurable loss.
3. Stages of production Figure 1 shows the stages of STAR’s grid production framework. Many of the components are used for other functions and have been reused here, because they are well debugged, actively maintained, proven to work over a decade, and cut down on development time. These components include the STAR file catalog [4], STAR data carousel [5] [6], STAR Unified Meta Scheduler [7], HTCondor [8] and, Globus [9].
Figure 1. Stages of STAR's production system – each box and stage is explained in detail in the sub-sections of section 3.
3.1. Input File Staging The first task illustrated in box one, is staging input files. The input files are initially stored on tape. They have the extension *.daq which stands for data acquisition, meaning it is data read out of STAR’s detectors. The STAR file catalog holds the list of files belonging to the dataset that needs to be reconstructed. It holds LFNs (logical file names), PFNs (physical file names) as well as a breadth of MetaData (MD) users may key on. The file list from the STAR file catalog is passed via a script which is part of the grid production framework to the STAR data carousel and an entry stub for each file is made in the production DB. The STAR data carousel is a queueing tool which optimizes requests to restore files from tape (HPSS) by reordering the requests in the queue to minimize the number of mount and dismount operations needed to restore the requests. Once the files are restored from tape to central disk the entry in the production database is marked as restored and input staging is completed.
3.2. Job Submission Once the input is staged we move on to submission. With the stub in the production database and the input file restored, a script will check the production database for restored files and compose a list of
L Hajdu et al, Automated Finite State Workflow for Distributed Data Production, Journal of Physics: Conference Series 762 (2016) 012006
HLT Job Management/Sharing for BES II express production
Considered PANDA and Levente’s scheme. Decided to use the latter. In place already.
Summary
• Collaboration between STAR and CBM has been very fruitful within the STAR High-Level Trigger and Tracking Focus Group. We expect it to be even more fruitful in the BES-II era.
• It is demonstrated that we can deliver important physics fast with the STAR HLT. We continue to make improvements to the system, and expand its functionalities for multi-purpose usages.
Backup Slides
RHIC Beam Use Request for Run17
Figure 4–3: The projected uncertainties for transverse single spin asymmetries of W-bosons as a function of rapidity for a delivered integrated luminosity of 400 pb-1 and an average beam polarisation of 55%. The solid light gray and pink bands represent the uncertainty on the KQ [18] and EIKV [52] results, respectively, due to the unknown sea-quark Sivers function. The crosshatched dark grey region indicates the current uncertainty in the theoretical predictions due to TMD evolution.
W± boson production is also the ideal tool to study the spin-flavor structure of sea quarks inside the proton. Such a measurement of the transverse single-spin asymmetry will provide the very first constraint on the sea quark Sivers function in an x-range where the measured asymmetry in the ū and d̄ unpolarized sea quark distribution functions, as measured by E866 [55], can only be explained by strong non-pQCD contributions. At the same time, this measurement is also able to access the sign change of the Sivers function, if the effect due to TMD evolution on the asymmetries is in the order of a factor of 5 reduction. Figure 4–3 shows the projected uncertainties for transverse single-spin asymmetries of W± and Z0 bosons as a function of rapidity for a delivered integrated luminosity of 400 pb-1 and an average beam polarization of 55%. The solid light gray and pink bands represent the uncertainty on the KQ [18] and EIKV [52] results, respectively, due to the unknown sea-quark Sivers function. The crosshatched dark grey region indicates the current uncertainty in the theoretical predictions due to TMD evolution.
Figure 4–4: The FoM to reconstruct W-bosons in STAR as function of the ZDC raw rate. The W-boson reconstruction efficiency was obtained from the data measured in 2011 to 2013. The highest FoM is reached at a ZDC rate of 330 kHz corresponding to a luminosity of 1.5 x 1032 cm-2s-1.
[Figure: STAR projections of A_N vs. rapidity y for W+, W- and Z0 at L(del.) = 400 pb-1; curves: arXiv:0903.3629 (no TMD evolution) and arXiv:1401.5078 (TMD evolved); bands: uncertainty due to TMD evolution, uncertainty on sea quarks]
• Transversely polarized p+p @ 510 GeV
• Requested luminosity increase 10% in terms of ZDC rate
StiCA Evaluation 2015
• StiCA vs. Sti: CA is used as the track finder and Sti does the track fitting as usual.
[Figure: Run14 Au+Au 200 GeV primary tracks, StiCA vs. Sti; panels: Production Mid (w/ HFT), Production High (w/o HFT)]
[Figure: Run 9 and Run 13 pp 500 GeV primary tracks, StiCA vs. Sti]
[Summary panel labels: AuAu 200 GeV (High L w/ HFT, Mid L w/o HFT); pp 510 GeV (High L, Low L)]
D0 Production in Run14 Au+Au 200GeV
[Figure: Kπ invariant mass m_Kπ (GeV/c²), counts per 10 MeV/c², Au+Au √s_NN = 200 GeV, RHIC Run 2014, 1.50 < pT < 12.00. Sti: 26M MinBias events, S/√(S+B) = 6, Counts = 204. StiCA: 25M MinBias events, S/√(S+B) = 7, Counts = 256.]
• Run14 Au+Au @ 200 GeV, 25M events production sample
• StiCA gives ~25% more D0 counts with a better S/B ratio
M. Mustafa
Single-spin Asymmetries AL for W±
• Run13 longitudinally polarized p+p @ 510GeV
• StiCA provides 20+% more W
• With StiCA the "eta-dip" at high luminosity is not observed
Devika Gunarathne
Di-jet observables
Daniel Olvitt
More Than HLT
• Standalone event reconstruction @FIAS: Development
• HLT event reconstruction @BNL: Application
• StRoot event reconstruction @BNL: Implementation
Ivan Kisel, STAR Collaboration Meeting, Berkeley, 18.10.2013
RHIC BES-II
2. Quantum Chromodynamics: The Fundamental Description of the Heart of Visible Matter
The trends and features in BES-I data provide compelling motivation for a strong and concerted theoretical response, as well as for the experimental measurements with higher statistical precision from BES-II. The goal of BES-II is to turn trends and features into definitive conclusions and new understanding. This theoretical research program will require a quantitative framework for modeling the salient features of these lower energy heavy-ion collisions and will require knitting together components from different groups with experience in varied techniques, including LQCD, hydrodynamic modeling of doped QGP, incorporating critical fluctuations in a dynamically evolving medium, and more.

Experimental discovery of a critical point on the QCD phase diagram would be a landmark achievement. The goals of the BES program also focus on obtaining a quantitative understanding of the properties of matter in the crossover region of the phase diagram, where it is neither QGP nor hadrons nor a mixture of the two, as these properties change with doping.

Additional questions that will be addressed in this regime include the quantitative study of the onset of various signatures of the presence of QGP. For example, the chiral symmetry that defines distinct left- and right-handed quarks is broken in hadronic matter but restored in QGP. One way to access the onset of chiral symmetry restoration comes via BES-II measurements of electron-positron pair production in collisions at and below 20 GeV. Another way to access this, while simultaneously seeing quantum properties of QGP that are activated by magnetic fields present early in heavy collisions, may be provided by the slight observed preference for like-sign particles to emerge in the same direction with respect to the magnetic field. Such an effect was predicted to arise in matter where chiral symmetry is restored. Understanding the origin of this effect, for example by confirming indications that it goes away at the lowest BES-I energies, requires the substantially increased statistics of BES-II.

NEW MICROSCOPES ON THE INNER WORKINGS OF QGP

To understand the workings of QGP, there is no substitute for microscopy. We know that if we had a sufficiently powerful microscope that could resolve the structure of QGP on length scales, say a thousand times smaller than the size of a proton, what we would see are quarks and gluons interacting only weakly with each other. The grand challenge for this field in the decade to come is to understand how these quarks and gluons conspire to form a nearly perfect liquid.

Figure 2.10: The top panel shows the increased statistics anticipated at BES-II; all three lower panels show the anticipated reduction in the uncertainty of key measurements. RHIC BES-I results indicate nonmonotonic behavior of a number of observables; two are shown in the middle panels. The second panel shows a directed flow observable that can encode information about a reduction in pressure, as occurs near a transition. The third panel shows the fluctuation observable understood to be the most sensitive among those measured to date to the fluctuations near a critical point. The fourth panel shows, as expected, the measured fluctuations growing in magnitude as more particles in each event are added into the analysis.

Microscopy requires suitable messengers that reveal what is happening deep within QGP, playing a role analogous to light in an ordinary microscope. The
The Beam Energy Scan at the Relativistic Heavy Ion Collider 32
Table 2. Event statistics (in millions) needed for Beam Energy Scan Phase-II for various observables.

Collision Energy (GeV)                  7.7    9.1    11.5   14.5   19.6
µB (MeV) in 0-5% central collisions     420    370    315    260    205

Observables
RCP up to pT = 5 GeV/c                  –      –      160    125    92
Elliptic Flow (φ mesons)                100    150    200    200    400
Chiral Magnetic Effect                  50     50     50     50     50
Directed Flow (protons)                 50     75     100    100    200
Azimuthal Femtoscopy (protons)          35     40     50     65     80
Net-Proton Kurtosis                     80     100    120    200    400
Dileptons                               100    160    230    300    400
Required Number of Events               100    160    230    300    400
3.1.1. R_CP of identified hadrons up to p_T = 5 GeV/c
High-p_T suppression is seen as an indication of the energy loss of leading partons in a colored medium. Therefore, the R_AA measurements are one of the clearest signatures for the formation of the quark-gluon plasma. Because there was not a comparable p+p energy scan, the BES analysis has had to resort to R_CP measurements as a proxy. Still, the study of the shape of R_CP(p_T) will allow us to quantitatively address the evolution of the phenomenon of jet-quenching to lower beam energies. A very clear change in behavior as a function of beam energy is seen in these data (see Figs. 12 and 13); at the lowest energies (7.7 and 11.5 GeV) there is no evidence of suppression for the highest p_T values that are reached. However, it should be noted that for these energies the BES-I measurements are only able to reach 3-4 GeV/c for inclusive hadrons and 2-3 GeV/c for identified hadrons. Typically, one considers p_T of 5 GeV/c and above to be dominated by partonic behavior. Therefore, although the BES-I R_CP results are suggestive of a disappearance of this QGP signature, they are not conclusive. The p_T reach expected in the proposed BES Phase-II measurements will be crucial in drawing definitive conclusions about evidence for the creation of QGP at a given collision energy.
Although the BES-I spectra do not reach high enough p_T to extend into the purely hard-scattering regime, they do allow us to make detailed projections of how many events would be needed to reach a given p_T for a given beam energy. We propose to acquire about 400 tracks in the p_T range of 4-5 GeV/c for the 11.5, 14.5, and 19.6 GeV energies. At the lower energies of 7.7 and 9.1 GeV, there is simply not enough kinematic reach to get out to 4-5 GeV/c. These required numbers of events are listed in Table 2.
We have used the yields of identified particles measured in BES-I to make projections of the expected errors for the R_CP measurements with increased statistics expected to be available in BES Phase-II. For each particle species, energy, and centrality, we have used an exponential extrapolation (note that this is a more conservative estimate than the power law extrapolation) to estimate the expected number of particles to be measured in each p_T bin based on the expected number of events at each energy shown in Table 2. From this expected number of particles per bin, we can estimate the statistical error for the central and peripheral bins and propagate these to estimate the expected error on the R_CP measurements. These projected
- The 2015 Long Range Plan for Nuclear Science - Studying the Phase Diagram of QCD Matter at RHIC, BES-II Whitepaper
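The projection procedure described in the whitepaper excerpt (exponential extrapolation of the spectrum, expected counts per p_T bin, then propagation of the central and peripheral errors to R_CP) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the spectrum parameters and the event numbers in the example are hypothetical and are not STAR values, and the errors are taken as purely Poisson.

```python
import math

def expected_counts(amplitude, slope, pt_lo, pt_hi, n_events):
    """Expected particles in [pt_lo, pt_hi] for a per-event spectrum
    dN/dpT = amplitude * exp(-pT / slope), multiplied by n_events."""
    per_event = amplitude * slope * (
        math.exp(-pt_lo / slope) - math.exp(-pt_hi / slope)
    )
    return per_event * n_events

def rcp_relative_error(n_central, n_peripheral):
    """Relative statistical error on R_CP from Poisson counts.

    R_CP is a ratio of central to peripheral yields, so the relative
    Poisson errors 1/sqrt(N) of the two bins add in quadrature.
    """
    return math.sqrt(1.0 / n_central + 1.0 / n_peripheral)

# Hypothetical spectrum parameters (illustrative only, not measured values).
n_events = 160e6  # e.g. the 11.5 GeV entry of Table 2, used for both classes
n_c = expected_counts(amplitude=5.0, slope=0.30, pt_lo=4.0, pt_hi=5.0, n_events=n_events)
n_p = expected_counts(amplitude=0.5, slope=0.30, pt_lo=4.0, pt_hi=5.0, n_events=n_events)
print(f"central counts: {n_c:.0f}, peripheral counts: {n_p:.0f}")
print(f"projected relative error on R_CP: {rcp_relative_error(n_c, n_p):.3f}")
```

The peripheral bin usually dominates the error because it has far fewer counts, which is why the required event numbers are driven by the highest p_T bin.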
• iTPC extends the eta coverage
• 45 pad rows to 72 pad rows
Current BFC Performance
• Au+Au @ 62 GeV • ~7 s per event • ~39 GeV during BES-II ?
• Au+Au @ 200 GeV • ~40 s per event
• 70+% of the time is spent on tracking: CA track finding + Sti track fitting
STAR BFC Performance @ 39GeV
[Plots: memory (MB) and CPU time (s/event)]
• Considering that the iTPC will increase the TPC data volume by ~50%, use 39 GeV as the baseline for estimates.
Expected Rates during BES-II
• DAQ hours per day during Run10 39 GeV
• ~70 TB per day on average
• Need to be prepared for the worst-case scenario
D. Cebra
Available HLT Resources
Existing Computing Resources
• 1192 CPU logical cores
• 45 Xeon Phi 7110P coprocessors (2 per node, each with 240 hardware threads and 16 GB RAM)
• Up to 48 TB of disk storage for online calibration, QA, etc.
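A back-of-the-envelope estimate ties the numbers on the last few slides together. The sketch below is only illustrative and rests on assumptions not stated in the slides: that each logical core reconstructs one event at a time independently, that CPU time per event scales linearly with the ~50% larger TPC data volume from the iTPC, and that the ~7 s per event quoted for Au+Au @ 62 GeV can stand in for the 39 GeV baseline. The function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400

def farm_event_rate(n_cores, cpu_seconds_per_event):
    """Events per second if every logical core processes events independently."""
    return n_cores / cpu_seconds_per_event

def average_rate_mb_per_s(terabytes_per_day):
    """Average data rate in MB/s for a daily volume (1 TB = 1e6 MB)."""
    return terabytes_per_day * 1e6 / SECONDS_PER_DAY

cores = 1192          # CPU logical cores quoted on the resources slide
t_event = 7.0         # s/event, quoted for Au+Au @ 62 GeV (stand-in baseline)
itpc_factor = 1.5     # ASSUMPTION: CPU time grows linearly with +50% data volume

print(f"rate without iTPC: {farm_event_rate(cores, t_event):.0f} events/s")
print(f"rate with iTPC:    {farm_event_rate(cores, t_event * itpc_factor):.0f} events/s")
print(f"average data rate: {average_rate_mb_per_s(70):.0f} MB/s")
```

The Xeon Phi coprocessors are left out of this estimate because the slides do not give a per-event time for them.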
dN/dx Status and Plans
Status
• A year ago, a new dN/dx method that uses the ionization measurement from the TPC was proposed (https://drupal.star.bnl.gov/STAR/system/files/dNdx_rev3.pptx.pdf). The main goal is to improve particle identification (PiD) in the iTPC era.
• The main idea of the dN/dx method is to use a model for the shape of the energy deposited in a given TPC cluster versus the number of primary clusters created during the passage of a charged particle through the TPC. The shape has been calculated using the Photo-Absorption Ionization (PAI) model (http://www.annualreviews.org/doi/abs/10.1146/annurev.ns.30.120180.001345).
• The dN/dx method has demonstrated a ~15% improvement in PiD using the 2016 AuAu200 calibration sample.
• It was proposed to use dN/dx for the whole 2016 AuAu200 sample and to check its performance.
• The production has been done and the PWGs have tested the dN/dx performance (https://drupal.star.bnl.gov/STAR/system/files/dNdxdEdx_Run16.pdf, https://drupal.star.bnl.gov/STAR/system/files/20170726_ComparisionStiSticaStihr.pdf)
PWG test results
• dN/dx PiD resolution is ~10% better than that of dE/dx.
• The dN/dx distribution has longer tails wrt dE/dx.
• dN/dx measured for kaons is significantly shifted wrt the model prediction.
• Using dN/dx does NOT give any improvement in the significance of the D0 signal wrt dE/dx.
• The ϕ → K+K- signal is ~5% narrower for dN/dx wrt dE/dx (?).
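To put the ~10% resolution gain in perspective, the sketch below computes the separation between two species in units of the resolution. It is a toy illustration only: the function name, the 7.5% baseline resolution and the mean values are hypothetical, and a Gaussian response is assumed, which ignores the longer tails and the shifted kaon band noted above.

```python
def n_sigma_separation(mean_a, mean_b, relative_resolution):
    """Separation of two species in units of sigma, where sigma is the
    relative resolution times the average of the two mean values."""
    average = 0.5 * (mean_a + mean_b)
    return abs(mean_a - mean_b) / (relative_resolution * average)

DEDX_RESOLUTION = 0.075                   # hypothetical baseline resolution
DNDX_RESOLUTION = 0.9 * DEDX_RESOLUTION   # ~10% better, as quoted above
pion_mean, kaon_mean = 1.00, 1.20         # hypothetical means, arbitrary units

sep_dedx = n_sigma_separation(pion_mean, kaon_mean, DEDX_RESOLUTION)
sep_dndx = n_sigma_separation(pion_mean, kaon_mean, DNDX_RESOLUTION)
print(f"dE/dx separation: {sep_dedx:.2f} sigma")
print(f"dN/dx separation: {sep_dndx:.2f} sigma")
print(f"gain: {sep_dndx / sep_dedx:.3f}x")
```

Under these assumptions the separation improves only by a factor of 1/0.9, about 11%, so a modest core-resolution gain can easily be offset by tails or a biased model prediction.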
Conclusions and plans for dN/dx
NO obvious improvement with dN/dx wrt dE/dx is seen. The plan to improve it includes:
1. It is necessary to improve the dN/dx fit for tracks in order to avoid long tails.
2. It is necessary to correct the model predictions of dN/dx for kaons, protons, …
3. It is necessary to revisit the dN/dx model in order to understand why these corrected predictions are needed.