LundNet: Robust jet identification using graph neural networks...Lundrepresentationofajet I...
Transcript of LundNet: Robust jet identification using graph neural networks...Lundrepresentationofajet I...
-
LundNet: Robust jet identificationusing graph neural networks4th IML Machine Learning Workshop, CERN, 22 October 2019
Frédéric Dreyer
arXiv:20xx.xxxxxwork in progress with Huilin Qu
http://arxiv.org/abs/
-
Boosted objects at the LHC
I At LHC energies, EW-scale particles (W/Z/t. . . ) are often producedwith ?C �
-
Boosted objects at the LHC
I At LHC energies, EW-scale particles (W/Z/t. . . ) are often producedwith ?C �
-
Lund diagrams
I Lund diagrams in the (ln I�, ln�)plane are a very useful way ofrepresenting emissions.
I Different kinematic regimes areclearly separated, used to illustratebranching phase space in partonshower Monte Carlo simulations andin perturbative QCD resummations.
I Soft-collinear emissions are emitteduniformly in the Lund plane
3F2 ∝ B3I
I
3��
Primary Lund-plane regions
soft-collinear
hard-collinear (largez)
ISR(larg
eΔ)
non-pert. (small kt)
MPI/UE
ln(R/Δ)ln(k
t/GeV)
Frédéric Dreyer 2/17
-
Lund diagrams
Features such as mass, angle and momentum can easily be read from aLund diagram.
jet mass ≡
-
Lund diagrams for substructure
Substructure algorithms can often also be interpreted as cuts in the Lundplane.
[Dasgupta, Fregoso, Marzani, Salam JHEP 1309 (2013) 029]
Frédéric Dreyer 4/17
https://arxiv.org/abs/1307.0007
-
Studying jets in the Lund plane
Lund diagrams can provide a useful approach to study a range of jet-relatedquestions
I First-principle calculations of Lund-plane variables.I Constrain MC generators, in the perturbative and non-perturbative
regions.I Brings many soft-drop related observables into a single framework.I Impact of medium interactions in heavy-ion collisions.I Experimental measurements and precision comparison to data.I Basis of generative models for fast simulations.I Boosted object tagging using Machine Learning methods.
Frédéric Dreyer 5/17
-
Lund plane representation
To create a Lund plane representation of a jet, recluster a jet 9 with theCambridge/Aachen algorithm then decluster the jet following the hardestbranch.
1. Undo the last clustering step, defining two subjets 91 , 92ordered in ?C .
2. Save the kinematics of the current declusteringΔ ≡ (H1 − H2)2 + ()1 − )2)2 , :C ≡ ?C2Δ,
-
Lund plane representation
ln k
(c)
ln k
t
ln k
t
LU
ND
DIA
GR
AM
PR
IM
AR
Y L
UN
D P
LA
NE
JE
T
(b)
(a)
(b)
(a) (c)
tt
ln k
(c)
ln 1/ ∆
ln 1/ ∆ ln 1/ ∆
ln 1/ ∆
(b)
(c)
(b)
(b)(b)
(c)
Frédéric Dreyer 6/17
-
Lund representation of a jet
I Each jet has an imageassociated with its primarydeclustering.
I For a C/A jet, Lund plane is filledleft to right as we progressthrough declusterings of hardestbranch.
I Additional information such asazimuthal angle # can beattached to each point. 0 1 2 3 4 5 6 7 8ln(R/ )
4
2
0
2
4
6
ln(k
t/GeV
)
Lund image for a 2 TeV QCD jet
Frédéric Dreyer 7/17
-
Lund representation of a jet
I Each jet has an imageassociated with its primarydeclustering.
I For a C/A jet, Lund plane is filledleft to right as we progressthrough declusterings of hardestbranch.
I Additional information such asazimuthal angle # can beattached to each point. 0 1 2 3 4 5 6 7 8ln(R/ )
4
2
0
2
4
6
ln(k
t/GeV
)
Lund image for a 2 TeV QCD jet
Frédéric Dreyer 7/17
-
Lund representation of a jet
I Each jet has an imageassociated with its primarydeclustering.
I For a C/A jet, Lund plane is filledleft to right as we progressthrough declusterings of hardestbranch.
I Additional information such asazimuthal angle # can beattached to each point. 0 1 2 3 4 5 6 7 8ln(R/ )
4
2
0
2
4
6
ln(k
t/GeV
)
Lund image for a 2 TeV QCD jet
Frédéric Dreyer 7/17
-
Lund representation of a jet
I Each jet has an imageassociated with its primarydeclustering.
I For a C/A jet, Lund plane is filledleft to right as we progressthrough declusterings of hardestbranch.
I Additional information such asazimuthal angle # can beattached to each point. 0 1 2 3 4 5 6 7 8ln(R/ )
4
2
0
2
4
6
ln(k
t/GeV
)
Lund image for a 2 TeV QCD jet
Frédéric Dreyer 7/17
-
Lund representation of a jet
I Each jet has an imageassociated with its primarydeclustering.
I For a C/A jet, Lund plane is filledleft to right as we progressthrough declusterings of hardestbranch.
I Additional information such asazimuthal angle # can beattached to each point. 0 1 2 3 4 5 6 7 8ln(R/ )
4
2
0
2
4
6
ln(k
t/GeV
)
Lund image for a 2 TeV QCD jet
Frédéric Dreyer 7/17
-
Lund representation of a jet
I Each jet has an imageassociated with its primarydeclustering.
I For a C/A jet, Lund plane is filledleft to right as we progressthrough declusterings of hardestbranch.
I Additional information such asazimuthal angle # can beattached to each point. 0 1 2 3 4 5 6 7 8ln(R/ )
4
2
0
2
4
6
ln(k
t/GeV
)
Lund image for a 2 TeV QCD jet
Frédéric Dreyer 7/17
-
Lund representation of a jet
I Each jet has an imageassociated with its primarydeclustering.
I For a C/A jet, Lund plane is filledleft to right as we progressthrough declusterings of hardestbranch.
I Additional information such asazimuthal angle # can beattached to each point. 0 1 2 3 4 5 6 7 8ln(R/ )
4
2
0
2
4
6
ln(k
t/GeV
)
Lund image for a 2 TeV QCD jet
Frédéric Dreyer 7/17
-
Jets as Lund images
Average over declusterings of hardest branch for 2 TeV QCD jets.
0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0ln(R/ )
2
1
0
1
2
3
4
5
6
7
ln(k
t/GeV
)
Pythia8.230(Monash13)s = 14 TeV, pt > 2 TeV
QCD jets, averaged primary Lund plane
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9( , kt)
0
0.05
0.1
0.15
0.2
0.25
0.02 0.2 0.5 0.01 0.1 1
16.4
-
Jets as Lund images
Average over declusterings of hardest branch for 2 TeV QCD jets.
0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0ln(R/ )
2
1
0
1
2
3
4
5
6
7
ln(k
t/GeV
)
Pythia8.230(Monash13)s = 14 TeV, pt > 2 TeV
QCD jets, averaged primary Lund plane
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9( , kt)
Perturbative
Non-perturbative– – – – – – – – –
Primary Lund-plane regions
soft-collinear
hard-collinear (largez)
ISR(larg
eΔ)
non-pert. (small kt)
MPI/UE
ln(R/Δ)ln(k
t/GeV)
Non-perturbative region clearly separated from perturbative one.Frédéric Dreyer 8/17
-
Jets as Lund images
Average over declusterings of hardest branch for 2 TeV QCD jets.
0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0ln(R/ )
2
1
0
1
2
3
4
5
6
7
ln(k
t/GeV
)
Pythia8.230(Monash13), partons = 14 TeV, pt > 2 TeV
QCD jets, averaged primary Lund plane
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9( , kt)
Perturbative
Non-perturbative– – – – – – – – –
Primary Lund-plane regions
soft-collinear
hard-collinear (largez)
ISR(larg
eΔ)
non-pert. (small kt)
MPI/UE
ln(R/Δ)ln(k
t/GeV)
Non-perturbative region clearly separated from perturbative one.
hadron+MPI→ parton
Frédéric Dreyer 8/17
-
Measurement of the Lund plane
Lund images provide an opportunity for experimental measurements andcomparisons with theory
Frédéric Dreyer 9/17
-
Lund images for QCD and W jets
I Hard splittings clearly visible, along the diagonal line with jet mass< = 2 TeV
QCD jets, averaged primary Lund plane
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9( , kt)
0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0ln(R/ )
2
1
0
1
2
3
4
5
6
7
ln(k
t/GeV
)
Pythia8.230(Monash13)s = 14 TeV, pt > 2 TeV
W jets, averaged primary Lund plane
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9S( , kt)
Frédéric Dreyer 10/17
-
Boosted] tagging
I LL approach alreadyprovides substantialimprovement overbest-performing substructureobservable.
I LSTM network substantiallyimproves on results obtainedwith other methods.
I Large gain in performance,particularly at higherefficiencies.
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0125
102050
100200500
100020005000
1/QC
D
Pythia8(Monash13)hadron+MPI
Delphes+SPRA1pt > 2 TeV
QCD rejection v. W efficiency
Jet Image+CNNLund Image+CNNLund+DNNLund+LSTMLund+likelihoodD[loose]2 +BDT
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0W
0.20.30.50.7
11.52
3ra
tio to
Lun
d+lo
g.lik
.
Frédéric Dreyer 11/17
-
Using the full jet information
I Performance can be improved further by taking secondary Lund planesinto account, particularly relevant for top tagging.
I Dynamic Graph CNN based methods perform particularly well, treatingthe full Lund diagram as a vertices on a graph.
[Wang, Sun, Liu, Sarma, Bronstein, Solomon arXiv:1801.07829][Qu, Gouskos, arXiv:1902.08570]
Frédéric Dreyer 12/17
https://arxiv.org/abs/1801.07829https://arxiv.org/abs/1902.08570
-
LundNet models
We study two separate models, one which optimises performance and onedesigned to reduce the complexity of the model and improve its robustness.
In both cases, the graph network starts by constructing a graph for a jet,with each node corresponding to a Lund declustering.
FullLundNet:
The kinematic input for each node is (ln I, lnΔ,#, ln
-
Boosted] tagging revisited
I Graph-based methodsoutperform our previousbenchmark significantly.
I The FullLundNet modelprovides a slightimprovement overParticleNet.
I FastLundNet achievesalmost the sameperformance with alightweight and robust model. 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
W
1
10
100
1000
10000
1/QC
D
Pythia 8.223 simulationsignal: pp WW, background: pp jj
anti-kt R = 1 jets, pt > 500 GeV
QCD rejection v. W tagging efficiency
RecNN (Louppe et al. '17)Lund+LSTM (Dreyer et al. '18)FastLundNetFullLundNetParticleNet (Gouskos et al. '19)
Frédéric Dreyer 14/17
-
Boosted top tagging
I For top tagging primarydeclustering sequencedoesn’t capture the fullsubstructure information.
I Can achieve large gains inperformance by taking intoaccount the full tree.
I FullLundNet model performsparticularly well, showingvast improvement overprevious benchmarks. 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
Top
1
10
100
1000
10000
1/QC
D
Pythia 8.223 simulationsignal: pp tt, background: pp jj
anti-kt R = 1 jets
pt > 500 GeV
QCD rejection v. Top tagging efficiency
RecNN (Louppe et al. '17)Lund+LSTM (Dreyer et al. '18)FastLundNetFullLundNetParticleNet (Gouskos et al. '19)
Frédéric Dreyer 15/17
-
Robustness to non-perturbative effectsI Performance compared to resilience to MPI and hadronisation corrections.I Vary cut on :C , which reduces sensitivity to the non-perturbative region.
5
50
10
1 10
anti kt R=1 jets, pt>2 TeV, εW=0.4signal: pp->WW, background: pp->jj
Pythia 8.223 simulation
perf
orm
ance
resilience to non-perturbative effects
performance v. resilience
Lund+LSTM (Dreyer et al 18)FastLundNetFullLundNet
ParticleNet (Gouskos et al. 19)RecNN (Louppe et al. 17)
Δ& = & − &′
� =©«Δ&2
(
〈&〉2(
+Δ&2
�
〈&〉2�
ª®®¬− 12
〈&〉 = 12 (& + &′)
I FullLundNet & ParticleNetreach high performance,but are not particularlyresilient to NP effects.
I FastLundNet modelachieves high resiliencewhile retaining most of theperformance.
(c.f. arXiv:1803.07977)
& ,√& Q
CD
�
Frédéric Dreyer 16/17
https://arxiv.org/abs/1803.07977
-
CONCLUSIONS
-
Conclusions
I Jet substructure provides a unique practical playground for recentdevelopments in machine learning.
I Discussed a new way to study and exploit radiation patterns in a jetusing the Lund plane.
I Introduced two models, FullLundNet and FastLundNet, which canachieve state-of-the-art performance on jet tagging benchmarks.
I Combination of physical insight and machine learning allows formodels that combine performance and robustness.
Frédéric Dreyer 17/17
Lund images and CNNsConclusions