Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT.
-
Upload
egbert-davis -
Category
Documents
-
view
214 -
download
0
description
Transcript of Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT.
Current status of the development of
the Unified Solids libraryMarek GayerCERN PH/SFT
Current status of the development of the Unified Solids library 2
Motivations for a common solids
library• Optimize and guarantee better long-term
maintenance of ROOT and Gean4 solids librarieso A rough estimation indicates that about 70-80% of code investment for
the geometry modeler concerns solids, to guarantee the required precision and efficiency in a huge variety of combinations
• Create a single high quality library to replace solid libraries in Geant4 and ROOTo Starting from what exists today in Geant4 and ROOTo Adopt a single type for each shapeo Significantly optimize (Multi-Union, Tessellated Solid, Polyhedra,
Polycone)o Reach complete conformance to GDML solids schema
• Create extensive testing suite
Current status of the development of the Unified Solids library 3
Solids implemented so far
• Box• Orb• Trapezoid• Sphere (+ sphere section)• Tube (+ cylindrical section) • Cone (+ conical section) • Generic trapezoid• Tetrahedron• Arbitrary Trapezoid (ongoing)• Multi-Union• Tessellated Solid• Polycone• Polyhedra• Extruded solid (ongoing)
0
500
1000
0
500
1000
200
400
600
800
1000
-10-5
05x 10
4 -5
0
5
x 104
-6
-4
-2
0
2
4
6
x 104
-4000-3000
-2000-1000
01000
20003000
4000
-4000
-3000
-2000
-10000
10002000
3000
4000-1000
0
1000
-1000
-500
0
500
1000
-1000
-500
0
500
1000-1000
-500
0
500
1000
-1000
-500
0
500
1000
-1000
-500
0
500
1000-1000
-500
0
500
1000
Current status of the development of the Unified Solids library 4
Testing Suite• Unified Solid Batch Test• Optical Escape• Specialized tests (e.g. quick performance
scalability test for multi-union)
Current status of the development of the Unified Solids library 5
Unified Solid Batch Test• New powerful diagnosis tool for testing solids of ROOT,
Geant4 and Unified Solids library designed for transition to Unified Solids
• Compares performance and output values from different codes
• Helps us to make sure we have similar or better performance in each method and different test cases
• Support for batch configuration and execution of tests• Useful to detect, report and fix numerous bugs or suspect
return values in Geant4, ROOT and Unified Solids• Two phases
o Sampling phase (generation of data sets, implemented in C++o Analysis phase (data post-processing, implemented as MATLAB
scripts)
Current status of the development of the Unified Solids library 6
Visualization of scalar and vector data sets
Current status of the development of the Unified Solids library 7
3D plots allowing to overview data sets
Current status of the development of the Unified Solids library 8
3D visualization of investigated shapes
-5000
0
5000
0
50000
0.2
0.4
0.6
0.8
1
1.2
1.4
1.6
1.8
2
x 104
Current status of the development of the Unified Solids library 9
Support for regions of data, focusing on sub-parts
Current status of the development of the Unified Solids library 10
Visual analysis of differences
Current status of the development of the Unified Solids library 11
Visual analysis of differences in 3D
Current status of the development of the Unified Solids library 12
Inside DistanceToOut DistanceToIn Normal SafetyFromOutside SafetyFromInside0
20
40
60
80
100
120
140
Method
Tim
e pe
r one
met
hod
call
[nan
osec
onds
]
Performance of methods at folder trap-1m-performance
Geant4ROOTUSolids
Graphs with comparison of performance
Current status of the development of the Unified Solids library 13
Visualization of scalability performance for specific
solids
2 4 6 8 10 20 500
0.2
0.4
0.6
0.8
1
1.2
1.4
1.6
1.8
2x 10
4
Count of solid parts
Tim
e [n
anos
econ
ds p
er o
pera
tion]
Performance of method DistanceToOut at folder log
Geant4ROOTUSolid
Current status of the development of the Unified Solids library 14
New Multi-Union solid
Current status of the development of the Unified Solids library 15
Multi-Union solid• We implemented a new
solid as a union of many solids using voxelization technique to optimize speed and scalabilityo 3D space partition for fast
localization of componentso Aiming for a log(n) scalability,
unlike the Geant4 Boolean solid• Useful for complex
composites made of many solids
-0.8-0.6
-0.4-0.2
00.2
0.40.6
0.8
-0.8
-0.6
-0.4
-0.2
0
0.2
0.4
0.6
0.8
-0.8
-0.6
-0.4
-0.2
0
0.2
0.4
0.6
0.8
Current status of the development of the Unified Solids library 16
Test multi-union solid for scalability measurements
Union of random boxes, orbs and trapezoids (on pictures with 5, 10, 20 solids)
Current status of the development of the Unified Solids library 17
The most performance critical methods
Current status of the development of the Unified Solids library 18
Fast Tessellated Solid
Current status of the development of the Unified Solids library 19
Fast Tessellated Solid• Made from connected triangular and
quadrangular facets forming solid• Old implementation was slow, no
spatial optimization• We use spatial division of facets into
3D grid forming voxels• Voxelization is based on:
o Bitmasks and logical and operations during initialization
o NEW: Bitmasks are not used at runtime, only pre-calculated lists of facets candidates are used during runtime
• Speedup factor ~10x additional improvement since the previous Geant4 collaboration meeting
Current status of the development of the Unified Solids library 20
Fast Tessellated Solid• Old Tessellated Solid had also several weak parts
of algorithm, used at initialization with n2 complexityo This sometimes caused very huge delays when loading (e.g. in case of
foil with 164k faces)o Rewrote to have n ∙ log n complexityo Loading the solid is much faster. What before took minutes, now takes
seconds• Backported in autumn 2012 also to Geant4 9.6• By improving design of classes and removing
inefficiencies, total memory save is about ~50%, o Voxelization has overhead vs. original G4TS, which reduces to a total memory
save of ~25%
Current status of the development of the Unified Solids library 21
LHCb VELO foil benchmarkMethod Speedu
pInside 2423xDistanceToIn 1334xDistanceToOut 1976xInformation ValueNumber of facets
164.149
Number of voxels
158.928
Memory saved compared with original Geant4
22% (51MB)
Current status of the development of the Unified Solids library 22
Inside Normal SafetyFromOutside SafetyFromInside DistanceToIn DistanceToOut0
5
10
15x 10
6
Method
Tim
e pe
r one
met
hod
call
[nan
osec
onds
]
Performance of methods at folder tessellatedsolid-test-t100-100k
G4TessellatedSolidNew G4TessellatedSolid
• OLD Speedup: 240x None 10000x+ None 133x 397x• NEW Speedup: 894x 10.3x 10000x+ ~8.7x 412x
1183x• NEW Speedup: 2423x 72x 10000x+ ~10.9x 1334x
1976x• NEW Speedup: 2925x 46x 10000x+ ~9.8x 1332x
2968x
Performance – 164k/SCL5 with
10k/100k/1M voxels
Current status of the development of the Unified Solids library 23
Polycone
Current status of the development of the Unified Solids library 24
New Ordinary Polycone
implementation• Polycone is an important solid, heavily used in most
experimental setups• Special optimization for common cases: Z sections
can only increase (big majority of real cases)o Based on composition of separate instances of cones, tubes (or their
sections)o This way, providing brief, readable, clean and fast codeo Excellent scalability over the number of Z sectionso Very significant performance improvement
• Will make visible CPU improvement in full simulation of complex setups like ATLAS and CMS o In CMS, 6.92% was measured as the total CPU share of Polycone methodso Assuming possible speedup factor ~5x, this would save around 5%
total time of the whole CMS experimento Cost of polycone in ATLAS is “around 10%”
-500005000-500005000
2
4
6
8
10
12
x 104
Current status of the development of the Unified Solids library 25
Inside DistanceToOut DistanceToIn Normal SafetyFromOutside SafetyFromInside0
500
1000
1500
2000
2500
Method
Tim
e pe
r one
met
hod
call
[nan
osec
onds
]
Performance of methods at folder polycone-3s-360-perf
Geant4ROOTUSolids
Ordinary Polycone performance, 3 z
sections • Speedup factor 3.3x vs. Geant4, 7.6x vs. ROOT
for most performance critical methods, i.e.: Inside, DistanceToOut, DistanceToIn
Current status of the development of the Unified Solids library 26
Ordinary Polycone performance, 10 z
sections • Speedup factor 8.3x vs. Geant4, 7.9x vs. ROOT
for most performance critical methods
Inside DistanceToOut DistanceToIn Normal SafetyFromOutsideSafetyFromInside0
200
400
600
800
1000
1200
1400
1600
1800
2000
Method
Tim
e pe
r one
met
hod
call
[nan
osec
onds
]
Performance of methods at folder polycone-10s-360-perf
Geant4ROOTUSolids
Current status of the development of the Unified Solids library 27
Ordinary Polycone performance, 100 z
sections • Speedup factor 34.3x vs. Geant4, 10.7x vs.
ROOT for most performance critical methods
Inside DistanceToOut DistanceToIn Normal SafetyFromOutsideSafetyFromInside0
1000
2000
3000
4000
5000
6000
7000
Method
Tim
e pe
r one
met
hod
call
[nan
osec
onds
]
Performance of methods at folder polycone-100s-360-perf
Geant4ROOTUSolids
Current status of the development of the Unified Solids library 28
3s 10s 100s0
1000
2000
3000
4000
Count of solid partsTim
e [n
anos
econ
ds p
er o
pera
tion]Performance of method SafetyFromInside at folder ordinary
Geant4ROOTUSolid
3s 10s 100s0
2000
4000
6000
8000
Count of solid parts
Tim
e [n
anos
econ
ds p
er o
pera
tion] Performance of method DistanceToOut at folder ordinary
Geant4ROOTUSolid
3s 10s 100s0
500
1000
1500
2000
2500
Count of solid parts
Tim
e [n
anos
econ
ds p
er o
pera
tion] Performance of method Inside at folder ordinary
Geant4ROOTUSolid
Ordinary Polycone scalability
-500005000-500005000
2
4
6
8
10
12
x 104
3s 10s 100s0
1000
2000
3000
4000
Count of solid parts
Tim
e [n
anos
econ
ds p
er o
pera
tion] Performance of method DistanceToIn at folder ordinary
Geant4ROOTUSolid
Geant4 poor performance scalability is explained by the lack of spatial optimization
Current status of the development of the Unified Solids library 29
Ordinary Polycone: sometimes only proper, careful coding makes big
difference• Performance gain over Geant4 could be explained
due to lack of spatial division by voxelization• But new USolids polycone uses very similar
algorithms and voxelization as ROOT does: why it is factor 7 – 10x faster, much more clear, shorter, readable and easier to maintain?
• Answer is given in 2 backup slides in the end of presentation
Current status of the development of the Unified Solids library 30
Generic Polycone• Optimization made as well
o Using voxelization on generalized surface facet model
o Performance improvement over Geant4 (generic polycone does not exist in ROOT)
o Performance improvement depends on the number of sections and is less than in case of ordinary polycone
o Scalability is excellent as well as for the ordinary polycone case
o Tested and measured on original Geant4 test cases of solids
o Values are 100% conformal with Geant4• This work was achieved thanks to
the 3 month extension negotiated by John Harvey and supported by the Linear Collider Detector Team
Current status of the development of the Unified Solids library 31
3s 10s 100s0
1000
2000
3000
4000
Count of solid parts
Tim
e [n
anos
econ
ds p
er o
pera
tion]Performance of method SafetyFromInside at folder combined
Geant4 PolyconeNew Generic PolyconeNew Ordinary Polycone
3s 10s 100s0
2000
4000
6000
8000
Count of solid parts
Tim
e [n
anos
econ
ds p
er o
pera
tion] Performance of method DistanceToOut at folder combined
Geant4 PolyconeNew Generic PolyconeNew Ordinary Polycone
3s 10s 100s0
1000
2000
3000
4000
Count of solid parts
Tim
e [n
anos
econ
ds p
er o
pera
tion]
Performance of method DistanceToIn at folder combined
Geant4 PolyconeNew Generic PolyconeNew Ordinary Polycone
Ordinary vs. Generic Polycone
-500005000-500005000
2
4
6
8
10
12
x 104
3s 10s 100s0
500
1000
1500
2000
2500
Count of solid parts
Tim
e [n
anos
econ
ds p
er o
pera
tion] Performance of method Inside at folder combined
Geant4 PolyconeNew Generic PolyconeNew Ordinary Polycone
Geant4 poor performance scalability is explained by the lack
of spatial optimization
Current status of the development of the Unified Solids library 32
Polyhedra
Current status of the development of the Unified Solids library 33
New Polyhedra implementation
• Uses similar methodology as was done for the Generic Polycone
• Works for both ordinary and generic polyhedra
• Provides improvement of performance and scalability
• Values are 100% conformal with Geant4• Same as for the case of generic polycone, this work
was made during my last 3 months contract extension
Current status of the development of the Unified Solids library 34
Inside scalability
2 4 6 8 10 20 500
500
1000
1500
2000
2500
Count of solid parts
Tim
e [n
anos
econ
ds p
er o
pera
tion]
Performance of method Inside at folder log
Geant4ROOTUSolid
Geant4 poor performance scalability is explained by the lack of spatial optimization
Current status of the development of the Unified Solids library 35
DistanceToIn scalability
2 4 6 8 10 20 500
2000
4000
6000
8000
10000
12000
14000
16000
Count of solid parts
Tim
e [n
anos
econ
ds p
er o
pera
tion]
Performance of method DistanceToIn at folder log
Geant4ROOTUSolid
Geant4 poor performance scalability is explained by the lack of spatial optimization
Current status of the development of the Unified Solids library 36
DistanceToOut scalability
2 4 6 8 10 20 500
0.2
0.4
0.6
0.8
1
1.2
1.4
1.6
1.8
2x 10
4
Count of solid parts
Tim
e [n
anos
econ
ds p
er o
pera
tion]
Performance of method DistanceToOut at folder log
Geant4ROOTUSolid
Geant4 poor performance scalability is explained by the lack of spatial optimization
Current status of the development of the Unified Solids library 37
Normal scalability
2 4 6 8 10 20 500
1000
2000
3000
4000
5000
6000
Count of solid parts
Tim
e [n
anos
econ
ds p
er o
pera
tion]
Performance of method Normal at folder log
Geant4ROOTUSolid
Geant4 poor performance scalability is explained by the lack of spatial optimization
Current status of the development of the Unified Solids library 38
SafetyFromInside scalability
2 4 6 8 10 20 500
1000
2000
3000
4000
5000
6000
Count of solid parts
Tim
e [n
anos
econ
ds p
er o
pera
tion]
Performance of method SafetyFromInside at folder log
Geant4ROOTUSolid
Geant4 poor performance scalability is explained by the lack of spatial optimization
Current status of the development of the Unified Solids library 39
2 4 6 8 10 20 500
1000
2000
3000
4000
5000
6000
7000
Count of solid parts
Tim
e [n
anos
econ
ds p
er o
pera
tion]
Performance of method SafetyFromOutside at folder log
Geant4ROOTUSolid
SafetyFromOutside scalability
TODO
Geant4 poor performance scalability is explained by the lack of spatial optimization
Current status of the development of the Unified Solids library 40
Status of work Types and Unified Solids interface are defined Bridge classes for comparison with Geant4 and
ROOT implemented Testing suite defined and deployed Implemented 9 solid primitives New, high performance implementation of
composed solids: Multi-Union, Tessellated solid, Polycone, Polyhedra and Extruded-Solid (ongoing)
Code and knowledge is being passed since Q1 2013 to Tatiana Nikitina
Current status of the development of the Unified Solids library 41
Future work plan• Marek Gayer
o Finish documentation• Tatiana Nikitina
o Analyze and implement remaining solids for the new libraryo Verify implementation of secondary methods for all solidso Integration to Geant4 10
Thank you for your attention.
???Questions and Answers
Current status of the development of the Unified Solids library 42
Backup
Current status of the development of the Unified Solids library 43
Current status of the development of the Unified Solids library 44
Polycone: sometimes only proper, careful coding makes huge difference
1/2• But new USolids polycone uses very similar
algorithms and voxelization as ROOT does: why it is factor 7 – 10x faster, much more clear, shorter, readable and easier to maintain?
• Pre-calculates as much as possible in the constructor and not during runtime, not needing to do it in navigation methods again (e.g. if it is cylindrical or tubular section)
• Using directly methods from Tubs on Cons for polycone section to separate and move the logic from Polycone to Polycone sections o not re-implementing it again in new methods => difficult to read, maintain
• Lot of polycone data is instead contained in Tubs and Cons
Current status of the development of the Unified Solids library 45
Polycone: sometimes only proper, careful coding makes huge difference
2/2• No use of recursion, troubling compiler, processor and
code readability• DistanceToIn moves point to bounding box• std::lower_bound is much faster than doing binary
search by own coding• C++, not feeling of mere C wrapped in classes• Using class for vector (x,y,z) is better than using C array• No use of pointers, memcpy, memset, …• Section data not contained in Tubs, Cons are grouped
together in a std::vector of struct’s• Logical components, namely repeatedly used are in
separated methods, often inlined