12 September 2013, NEC2013/Varna René Brun/CERN*.
The Evolution of HEP Software
Plan

In this talk I present the views of someone involved in several aspects of scientific computing, as seen from a major HEP lab. Having been involved in the design and implementation of many systems, my views are necessarily biased by my path through several experiments and the development of some general tools. I plan to describe the creation and evolution of the main systems that have shaped current HEP software, with some views on the near future.
Machines

From mainframes to clusters, to walls of cores, to GRIDs & clouds.
Machine Units (bits)

Machine word sizes ranged over 16, 32, 36, 48, 56, 60 and 64 bits (PDP-11, Nord-50, BESM-6, CDC, Univac, and many others), with even more combinations of exponent/mantissa size and byte ordering. This was a strong push to develop portable, machine-independent I/O systems.
User machine interface
General Software in 1973

Software for bubble chambers: Thresh, Grind, Hydra
Histogram tool: SUMX from Berkeley
Simulation with EGS3 (SLAC), MCNP (Oak Ridge)
Small Fortran IV programs (1000 LOC, 50 kbytes)
Punched cards, line printers, pen plotters (GD3)
Small archive libraries (cernlib), lib.a
Software in 1974

First "large electronic experiments"
Data Handling Division == Track Chambers
Well-organized software in TC with HYDRA, Thresh, Grind; anarchy elsewhere
HBOOK: from 3 routines to 100, from 3 users to many
First software group in DD
GEANT1 in 1975

A very basic framework to drive a simulation program: reading data cards with FFREAD, step actions with GUSTEP and GUNEXT, applying the magnetic field (GUFLD).
Output (histograms/digits) was user defined
Histograms with HBOOK
About 2,000 LOC
ZBOOK in 1975

Extraction of the HBOOK memory manager into an independent package.
Creation of banks and data structures anywhere in common blocks
Machine-independent I/O, sequential and random
About 5,000 LOC
GEANT2 in 1976

Extension of GEANT1 with more physics (electromagnetic showers based on a subset of EGS, multiple scattering, decays, energy loss)
Kinematics, hits/digits data structures in ZBOOK
Used by several SPS experiments (NA3, NA4, NA10, Omega)
About 10,000 LOC
Problems with GEANT2

A very successful small framework. However, the detector description was user-written and defined via "if" statements at tracking time. This was becoming a hard task for large and constantly evolving detectors (as was the case with NA4 and C. Rubbia). There were many attempts to describe a detector geometry via data cards (a bit like XML), but the main problem was the poor and inefficient detector description in memory.
GEANT3 in 1980

A data structure (a ZBOOK tree) describing complex geometries was introduced, then gradually the geometry routines computing distances, etc. This was a huge step forward, implemented first in OPAL, then in L3 and ALEPH. Full electromagnetic showers (first based on EGS, then our own developments).
Systems in 1980

OS & Fortran: 1000 KLOC
Libraries (HBOOK, Naglib, cernlib): 500 KLOC
Experiment software: 100 KLOC
End-user analysis software: 10 KLOC
Hardware: CDC, IBM, Vax 780; tapes; RAM 1 MB
GEANT3 with ZEBRA

ZEBRA was implemented very rapidly in 1983. We introduced ZEBRA into GEANT3 in 1984. From 1984 to 1993 we introduced plenty of new features in GEANT3: extensions of the geometry, hadronic models with Tatina, Gheisha and Fluka, graphics tools. In 1998, GEANT3 was interfaced with ROOT via the VMC (Virtual Monte Carlo). GEANT3 has been used, and is still in use, by many experiments.
PAW

First minimal version in 1984. An attempt to merge with GEP (DESY) in 1985 did not succeed, but we took from GEP the idea of ntuples for storage and analysis (GEP was written in PL/1). The package kept growing until 1994, with more and more functions; column-wise ntuples arrived in 1990. Users liked it, mainly once the system was frozen in 1994.
Vectorization attempts

During the years 1985-1990 a big effort was invested in vectorizing GEANT3 (in collaboration with Florida State University) on the CRAY Y-MP, CYBER 205 and ETA10. The minor gains obtained did not justify the big manpower investment: GEANT3 transport was still essentially sequential, and we had a big overhead from vector creation and gather/scatter. However, this experience and failure was very important for us, and many of its lessons proved useful for the design of GEANT5 many years later.
Parallelism in the 80s & early 90s

Many attempts (all failing) with parallel architectures:
Transputers and OCCAM
MPPs (CM2, CM5, ELXI, ...) with OpenMP-like software
There were too many GLOBAL variables/structures in Fortran common blocks. RISC architectures or emulators were perceived as a cheaper solution in the early 90s. Then MPPs died with the advent of the Pentium Pro (1994) and farms of PCs or workstations.
1992: CHEP Annecy

Web, web, web, web...
There were attempts to replace/upgrade ZEBRA to support F90 modules and structures, but parsing and analysing modules was thought to be too difficult. With ZEBRA, the bank description was held within the bank itself (just a few bits); a bank was typically a few integers followed by a dynamic array of floats/doubles. We did not realize at the time that parsing user data structures was going to be a big challenge!
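To make the ZEBRA bank layout concrete, here is a minimal C++ sketch, assuming an illustrative layout (the names and field choices are hypothetical, not the real ZEBRA format): a small integer header describing the bank, followed by a dynamic array of floating-point words, flattened into one stream for machine-independent I/O.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical sketch of a ZEBRA-style "bank": a few header integers
// (the self-description, "just a few bits" of type info) followed by a
// dynamic array of floats/doubles. Illustrative only.
struct Bank {
    int32_t id;                 // bank identifier
    int32_t ndata;              // number of payload words
    std::vector<double> data;   // dynamic array of floats/doubles
};

// Serialize the bank into a flat word stream, the way a
// machine-independent I/O layer could write it: header first, payload after.
std::vector<double> flatten(const Bank& b) {
    std::vector<double> out;
    out.push_back(static_cast<double>(b.id));
    out.push_back(static_cast<double>(b.ndata));
    out.insert(out.end(), b.data.begin(), b.data.end());
    return out;
}
```

Because the description travels with the bank, a reader needs no external schema; parsing arbitrary user-defined C++ structures, by contrast, requires a full dictionary, which is the challenge alluded to above.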
Consequences

In 1993/1994 performance was no longer the main problem. Our field was invaded by computer scientists. Program design, object-oriented programming and a move to sexier languages were becoming the priority. The "goal" was thought less important than the "how". This situation deteriorated even further with the death of the SSC.
1993: Warning, Danger

Three "clans" in my group:
1/3 pro F90
1/3 pro C++
1/3 pro commercial products (any language) for graphics, user interfaces, I/O and databases
My proposal to continue with PAW, develop ZOO (ZEBRA Object-Oriented) and port the GEANT3 geometry to C++ was not accepted.
Evolution vs Revolution
1995: Roads for ROOT

The official line was GEANT4 and Objectivity; there is not much room left for success with an alternative product when you are alone. The best tactic had to be a mixture of sociology, technicalities and very hard work.
Strong support from PAW and GEANT3 users
Strong support from HP (workstations + manpower)
In November we were ready for a first ROOT show. Java was announced (a problem?).
1998: Work & Smile

RUN II projects at FNAL: data analysis and visualization; data formats and storage. ROOT was competing with HistoScope, JAS and LHC++. At CHEP98 (September, Chicago) ROOT was selected by FNAL, followed by RHIC: a vital decision for ROOT. But official support at CERN came only in 2002.
ROOT evolution

No time to discuss the creation and evolution of the 110 ROOT shared libraries/packages. ROOT has gradually evolved from a data storage, analysis and visualization system into a more general software environment, totally replacing what was previously known as CERNLIB. This has been possible thanks to MANY contributors from experiments, labs, and people working in other fields. ROOT6, coming soon, includes a new interpreter, CLING, and supports all the C++11 features.
Input/Output: Major Steps

From earliest to latest:
User-written streamers filling TBuffer
Streamers generated by rootcint
Automatic streamers from the dictionary, with StreamerInfos in self-describing files
Member-wise streaming for TClonesArray
Member-wise streaming for STL collections<T*>
TreeCache
Parallel merge
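The member-wise streaming steps above can be illustrated with a minimal sketch (this is the general idea, not ROOT's actual implementation): object-wise streaming writes each object's members together, while member-wise streaming writes one member across all objects, so similar values land adjacent in the buffer, which typically compresses better and improves locality.

```cpp
#include <vector>

// Illustrative sketch of object-wise vs member-wise streaming.
// 'Hit' is a stand-in for any collection element.
struct Hit { float x; float e; };

// Object-wise: members of each object are contiguous (x,e,x,e,...).
std::vector<float> streamObjectWise(const std::vector<Hit>& hits) {
    std::vector<float> buf;
    for (const Hit& h : hits) { buf.push_back(h.x); buf.push_back(h.e); }
    return buf;
}

// Member-wise: all x's first, then all e's (x,x,...,e,e,...).
std::vector<float> streamMemberWise(const std::vector<Hit>& hits) {
    std::vector<float> buf;
    for (const Hit& h : hits) buf.push_back(h.x);
    for (const Hit& h : hits) buf.push_back(h.e);
    return buf;
}
```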
GEANT4 Evolution

GEANT4 is an important software tool for current experiments, with more and more physics improvements and validation procedures. However, the GEANT4 transport system is no longer suitable for parallel architectures; too many changes would be required. GEANT5: keep the GEANT4 physics with a radically new transport system.
Tools & Libs

[Timeline figure of the major tools and libraries: hydra, zbook, bos, hbook, minuit, zebra, paw, geant1, geant2, geant3, geant4, Root 1,2,3,4,5,6, Geant4+5]
Systems today

OS & compilers: 20 MLOC
Frameworks like ROOT, Geant4: 5 MLOC
Experiment software: 4 MLOC
End-user analysis software: 0.1 MLOC
Hardware: clusters of multi-core machines (10000 x 8 cores), GRIDs, clouds; networks: 10 Gbit/s; disks: 10 PB; RAM: 16 GB
Systems in 2025?

OS & compilers: 40 MLOC
Frameworks like ROOT, Geant5: 10 MLOC
Experiment software: 10 MLOC
End-user analysis software: 0.2 MLOC
Hardware: multi-level parallel machines (10000 x 1000 x 1000), GRIDs, clouds on demand; networks: 100 Gbit/s to 10 Tbit/s; disks: 1000 PB; RAM: 10 TB
BUT!!!

It looks like the amount of money devoted to computing is not going to increase with the same slope as it did in the past few years. Moore's law no longer applies to a single processor. However, Moore's law still looks OK when measured as computing delivered per $/€, provided we REALLY use parallel architectures. Using these architectures is going to be a big challenge, but we do not have a choice!
Software and Hardware

GRIDs/clouds are inherently parallel. However, because hardware has been relatively cheap, GRIDs have pushed towards job-level parallelism at the expense of parallelism within one job. It is not clear today which hardware systems will win: supercomputers? walls of cores with accelerators? zillions of ARM-like systems? Our software must be upgraded with all these possible solutions in mind. A big challenge!
Expected Directions

Parallelism: today we do not exploit the existing hardware well (0.6 instructions/cycle on average) because our code was designed sequentially. Important gains are foreseen (10x?), e.g. in detector simulation.
Automatic data caches: many improvements are required to speed up and simplify skimming procedures and data analysis.
Data caches

More effort is required to simplify the analysis of large data sets (typically ROOT Trees). When zillions of files are distributed across Tier-1/2 sites, automatic, transparent, performant and safe caches become mandatory on Tier-2/3 sites or even laptops. This must be taken into account in the dilemma of sending jobs to data or vice versa. It will require changes in ROOT itself and in the various data-handling or parallel file systems.
Parallelism: key points

Minimize the sequential/synchronization parts (Amdahl's law): very difficult
Run the same code (processes) on all cores to optimize memory use (sharing of code and read-only data)
Job-level parallelism is better than event-level parallelism for offline systems
Use the good old principle of data locality to minimize cache misses
Exploit the vector capabilities, but be careful with the new/delete/gather/scatter problem
Reorganize your code to reduce tails
Data Structures & parallelism

[Figure: an event tree (event, vertices, tracks) linked by C++ pointers specific to one process.]
Copying the structure implies a relocation of all pointers.
I/O is a nightmare.
Updating the structure from a different thread implies a lock/mutex.
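One common remedy for the pointer problems above is to replace cross-object pointers with integer indices into contiguous arrays. This is a sketch of that idea under illustrative names, not a description of any specific framework: copies and I/O then need no pointer relocation, because an index stays valid in any copy of the event.

```cpp
#include <vector>

// Sketch: tracks refer to vertices by index into a flat array, not by
// pointer. The whole event can be copied (or written out) without
// relocating anything. Names are hypothetical.
struct Vertex { float x, y, z; };
struct Track  { int vertexIndex; float pt; };  // index, not Vertex*

struct Event {
    std::vector<Vertex> vertices;
    std::vector<Track>  tracks;
};
```

The copy constructor generated by the compiler is now correct by construction, and the layout is the same in every process that reads the event back.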
Data Structures & Locality

Sparse data structures defeat the system memory caches. Group object elements/collections such that the storage matches the traversal processes. For example: group the cross-sections for all processes per material, instead of all materials per process.
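The material-grouping example can be sketched as follows (an illustrative layout, not actual GEANT code): a transport step looks up every process for the current material, so storing those cross-sections contiguously turns the lookup into one linear scan over a single cache-friendly block.

```cpp
#include <vector>

// Sketch: per-material storage of cross-sections for all processes.
// A step in material m touches only m.perProcess, one contiguous block,
// instead of striding across separate per-process tables.
struct MaterialXsec {
    std::vector<double> perProcess;  // one cross-section per process
};

// Total cross-section for one material: a single linear scan.
double totalXsec(const MaterialXsec& m) {
    double sum = 0.0;
    for (double s : m.perProcess) sum += s;
    return sum;
}
```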
Create Vectors & exploit Locality

By making vectors, you optimize the instruction cache (gain > 2) and the data cache (gain > 2). By making vectors, you can use the built-in pipelined instructions of existing processors (gain > 2). But there is no point in making vectors if your algorithm is still sequential or badly designed for parallelism, e.g.:
Too many thread synchronization points (Amdahl)
Vector gather/scatter
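A minimal sketch of the kind of loop this is about (the physics content is a placeholder): a single branch-free pass over contiguous data, which compilers can auto-vectorize onto SIMD lanes, in contrast with per-particle pointer chasing.

```cpp
#include <vector>

// Sketch: a flat, branch-free loop over contiguous data. Because there
// is no pointer chasing and no gather/scatter, the compiler can map
// iterations onto vector lanes and keep the pipeline full. The "physics"
// (a uniform energy-loss update) is a stand-in.
void applyEnergyLoss(std::vector<float>& energy, float dE) {
    for (float& e : energy) e -= dE;
}
```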
Conventional Transport

[Figure: tracks T1-T4 stepping individually through the detector.] Each particle is tracked step by step through hundreds of volumes. When all hits for all tracks are in memory, the summable digits are computed.

Analogy with car traffic
New Transport Scheme

[Figure: the same tracks T1-T4, now grouped by volume.] All particles in the same volume type are transported in parallel. Particles entering new volumes, or newly generated ones, are accumulated in the volume's basket. Events for which all hits are available are digitized in parallel.
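The basket idea can be sketched in a few lines (an illustrative toy, not the actual GEANT prototype): particles are queued per volume, and the scheduler repeatedly picks the fullest basket so that the same geometry and physics code runs over a whole vector of particles at once.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Toy sketch of basket-based scheduling. Each volume accumulates the
// particles currently inside it; the scheduler processes the fullest
// basket so the per-volume code runs over many particles together.
struct Particle { int volume; float energy; };

class BasketScheduler {
public:
    void add(const Particle& p) { baskets_[p.volume].push_back(p); }

    // Process the fullest basket; returns how many particles it held
    // (0 when no baskets remain).
    std::size_t processOneBasket() {
        if (baskets_.empty()) return 0;
        auto best = baskets_.begin();
        for (auto it = baskets_.begin(); it != baskets_.end(); ++it)
            if (it->second.size() > best->second.size()) best = it;
        std::size_t n = best->second.size();
        // ... transport all particles of this volume together ...
        baskets_.erase(best);
        return n;
    }

private:
    std::map<int, std::vector<Particle>> baskets_;
};
```

In a real implementation, transporting a basket would push surviving particles into the baskets of the volumes they enter; the toy stops after draining each basket once.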
Towards Parallel Software

A long way to go! There is no point in just making your code thread-safe; using parallel architectures requires a deep rethinking of algorithms and dataflow. One such project is GEANT4+5, launched two years ago. We are starting to get very nice results, but there is still a long way to go to adapt our software (or write radically new software) for the emerging parallel systems.
A global effort

Software development is nowadays a world-wide effort, with people scattered across many labs developing simulation, production or analysis code. It remains a very interesting area for new people not scared by big challenges. I had the fantastic opportunity to work for many decades on the development of many general tools, in close cooperation with many people to whom I am very grateful.