Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired...

39
Exploiting the Unused Part of the Brain Deep Learning and Emerging Technology For High Energy Physics Jean-Roch Vlimant

Transcript of Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired...

Page 1: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

Exploiting the UnusedPart of the Brain

Deep Learning and Emerging TechnologyFor High Energy Physics

Jean-Roch Vlimant

Page 2: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

A 10 Megapixel Camera

Page 3: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

CMS 100 Megapixel Camera

Page 4: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

CMS Detector

Page 5: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

CMS Readout

Highly heterogeneous system Raw data is 100M channelssampled every 25 ns : 1Pb/s50EB per day in readout and

online processing.

Page 6: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

Event Filtering

From Big Data to Smart Datawith ultra fast decision

1000 Gb/s1000

Gb

/s

105 H

z

40 M

Hz

1-3

kHz

L1 HLT

Page 7: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7

Why Deep Learning● LHC Data Processing may use deep learning methods in many aspects (attend

other relevant talk at the Caltech booth)● Large volume of data and simulated data to analyze● Several class of LHC Data analysis make use of classifier for signal versus

background discrimination.✔ Use of BDT on high level features✔ Increasing use of MLP-like deep neural net

● Deep learning has delivered super-human performance at certain class of tasks(computer vision, speech recognition, ...)

✔ Use of convolutional neural net, recurrent topologies, long-short-term-memorycells, ...

● Deep learning has advantage at training on “raw” data➢ Several levels of data distillation in LHC data processing

● Neural net computation is highly parallelizable

➔ Going beyond fully connected networks with advanced deep neural net topologies➢ Multi-classification of LHC events from particle-level information➢ Charged particle tracking with recurrent and convolutional topologies➢ Particle identification and energy regression in highly granular future

calorimeter➢ ...

Page 8: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 8

Advanced Machine Learningand Deep Learning

(my selection)

Page 9: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 9

Machine Learning in a Nutshell

Balazs Kegl, CERN 2014

Page 10: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 10

Scene Labeling

Karpathy, Fei-Fei, CVPR 2015

● Create a description of images

➢ Generate a decay process description from collisionrepresentation, with application to triggers

Page 11: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 11

Scenery Interpretation

Farabet et al. ICML 2012, PAMI 2013

● Group and classify what each pixel belongs to● Real-time video processing with deep learning

➢ Multiple applications to pileup mitigation, objectidentification, tracking. All from “raw data”

Page 12: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 12

Attention Learning

● Identify people from faces with multiple attentionfilters

➢ Object identification, noise subtraction, ...

Page 13: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 13

Text Processing

● Question and Answer machine, languagetranslation, semantic arithmetic, ...

➢ Can the raw data of detector be interpreted astexts and translated into physics descriptions ?

Page 14: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 14

Embedded Symmetries

T.S. Cohen, M. Welling ICML2016

p4m group

p4 group

● Introduction of convolutionallayers was a ground-breaking advancement

● Research on embedding morefundamental symmetries intoneural nets

● Symmetries operate on thedata or internalrepresentation of data

● Next is to implementsymmetries of physics to buildphysics-specific NN

Page 15: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 15

Toolkit and Services

● Lots of libraries out there, several key components in each majorlanguages. Lots of big-data analytics services offered

● Common theme of going for spark-hdfs support➢ Question of having in-house software or embracing external

libraries is very much alive

Page 16: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 16

Application to Intensity andEnergy Frontiers

(a selected few)

Page 17: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 17

NOVA Event Classification

Slides on Paolo

Page 18: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 18

Particle Jet Identification

Neural net

Train

W to QCDdiscrimination

W tagger arXiv: 1511.05190, Oliveira, Kagan, Mackey, Nachman, Schwartzman

Top Tagger arXiv: 1501.05968 Almeida, Backovic, Cliche, Lee, Perelstein

W to QCDdiscrimination

Page 19: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 19

3D Calorimetry Imaging

100GeV Photon 100GeV Pi0

≠LCD Calorimeter configurationhttp://lcd.web.cern.ch5x5 mm Pixel calorimeter28 layer deep for Ecal70 layer deep for Hcal

Photon and pion particle gunClassification, regression andcombined models

Page 20: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 20

Irregular Geometry Challenge

Hexagonal cells

Projective Geometry

Variable Depth Segmentation

The images we are dealing with arenot as regular as standard images.Need for specific new treatment andmethods to feed neural nets

Page 21: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 21

Collision Event Classification

● Full event classification using reconstructed particle 4-vectors● Recurrent neural nets, Long short term memory cells● Dedicated layer with Lorentz boosting● Step toward event classification with lower level data : low

level feature as opposed to analysis level variables

Page 22: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 22

Ordering ChallengeText have naturalorder. RNN/LSTMcan correlate theinformation tointernalrepresentation

There isunderlying order incollision events.Smeared throughtiming resolution.No natural order in observable

Page 23: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 23

Charged Particle Tracking

● Perfect example of pattern recognition● Data sparsity is not common in image processing● Several angles to tackle the problem. Deep Kalman filter,

RNN to learn dynamics, sparse image processing, ...● Kaggle challenge in preparation

https://indico.hep.caltech.edu/indico/event/102

Page 24: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 24

Challenge of Tracking

Page 25: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 25

HEP Trk.Xhttps://heptrkx.github.io/

● LBL, FNAL, Caltech consortium sponsored by the DOE● Preparation of simulated data

➢ Accurate, fast and light● Explore new approaches to charged particle tracking using

advanced pattern recognition techniques➢ Recurrent neural nets in learning track kinematics➢ Convolutional neural nets for pattern recognition➢ Hough-like transformation from hit position space to track

parameter space➢ Advanced Kalman filter parallelization➢ ... exploration has started

● Tracking competition (a la kaggle) in preparation

Page 26: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 26

Other Applications

● Outliers selection● Anomaly detection● Data quality automation● Detector control● Experiment control● Data popularity prediction● Computing grid control● Denoising with auto-encoder● Fast simulation ● ...

Page 27: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 27

Accelerating and EmergingTechnologies

Page 28: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 28

Supermicro Server

Caltech Servers● 2 compute nodes :

Intel® Xeon® CPU E5-2697 v3 @ 2.60GHz processors pernode (28 cores per CPU) with 8 NVIDIA® Pascal® GTX 1080

● Theoretical Peak Performance :80 Tflops

➔ Theano, Tensorflow, Keras➔ MPI training➔ Spearmint hyper-optimization

Many thanks to our partners Nvidia and Supermicro

Page 29: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 29

ALCF

Cooley● 126 compute nodes :

Two 2.4 GHz Intel®Haswell® E5-2620 v3processors per node (6 coresper CPU, 12 cores total) andNVIDIA® Tesla® K80

● Theoretical Peak Performance :293 Tflops

➔ Development Project with 8kcore hours

➔ Theano, Tensorflow, Keras➔ MPI ready➔ Spearmint experimental

Page 30: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 30

CSCS

Piz Daint● 5272 compute nodes :

Intel® Xeon® E5-2670 @2.60GHz (8 cores, 16 virtualcores with hyperthreadingenabled, 32GB RAM) andNVIDIA® Tesla® K20X

● Theoretical Peak Performance :7.787 Pfops

● Scratch capacity : 2.7 PB

➔ Development Project with 36kcore hours

➔ Theano, Tensorflow, Keras➔ MPI ready➔ Spark experimental stage

Page 31: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

OLCF

Titan● 18688 compute nodes :

2.2GHz AMD® Opteron®6274 processors per node (16cores per CPU) and NVIDIA®Tesla® K20X

● Theoretical Peak Performance :20 Pflops

➔ Allocation in preparation

Page 32: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 32

Distributed Learning

● Deep learning with elastic averaging SGDhttps://arxiv.org/abs/1412.6651

● Revisiting Distributed Synchronous SGDhttps://arxiv.org/abs/1604.00981

● Implementation with Spark and MPI for the Kerasframework https://keras.io/

➔ https://github.com/JoeriHermans/dist-keras ➔ https://github.com/duanders/mpi_learn

Titan X

GTX 1080

Page 33: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 33

Performance

● Ran on the supermicro 8GPU server here at SC16● Normalized to 2 GPU performance point

➔ Please add a factor 2x

Linear speedup withnumber of workers

Saturation in co-scheduling GPU (2 workers on the same GPU)

7x max speedup

Page 34: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 34

Performance

● Ran on ALCF Cooley 126 GPU cluster● Normalized to 2 GPU performance point

➔ Please add a factor 2x

14x max speedup

Non-linear speedup withnumber of GPU

Linear speedup withnumber of GPU

Page 35: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 35

Applicability

Page 36: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 36

Training vs Inference

● GPUs are the workhorse forparallel computing

● Enable training large models, withlarge dataset

● Deep learning facility clusters

● Emergence of smaller GPU● Not dedicated to training● Strike the balance between

Tflops/$ for inference● Deployment on the grid

Page 37: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 37

Neuromorphic Hardware

http://www.nature.com/articles/srep14730

● Implementing plasticity in hardware ● Process signal from detector and adapt tocategories of pattern (unsupervised)

● Post-classified from data analysis or rate throttling➢ NCCR consortium assembling to develop thistechnology further, with our use case in mind

Page 38: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 38

Cognitive Computing

● Spiking neural net as processing units : ➔Cognitive Computing Processing Unit : CCPU

● Adopt a new programming scheme, translateexisting software

● See Rebecca Carney's talk for more details

Page 39: Exploiting the Unused Part of the Brain - Caltech Group at ... · 11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 7 Why Deep Learning LHC Data Processing mayuse deep learning

11/16/16 SC16, Brain Inspired Technologies, J.-R. Vlimant 39

Summary

Impressive achievement and promise ofmodern machine learning and deep learning

From realistic to speculative applicability tofield of High Energy and Frontier Physics

Emerging tools and technology to embrace

Thanks to our sponsors