CERN
2001 Summer Student Lectures
Computing at CERN
Lecture 1 — Looking Around
Tony Cass — [email protected]
Acknowledgements
The choice of material presented here is entirely my own. However, I could not have prepared these lectures without the help of
– Charles Granieri, Hans Grote, Mats Lindroos, Franck Di Maio, Olivier Martin, Pere Mato, Bernd Panzer-Steindel, Les Robertson, Stephan Russenschuck, Frank Schmidt, Archana Sharma and Chip Watson
who spent time discussing their work with me, generously provided material they had prepared, or both.
For their general advice, help, and reviews of the slides and lecture notes, I would also like to thank
– Marco Cattaneo, Mark Dönszelmann, Dirk Düllmann, Steve Hancock, Vincenzo Innocente, Alessandro Miotto, Les Robertson, Tim Smith and David Stickland.
Some Definitions
General:
Computing Power
– CERN Unit
– MIPS
– SPECint92, SPECint95
Networks
– Ethernet
  » Normal (10baseT, 10Mb/s)
  » Fast (100baseT, 100Mb/s)
  » Gigabit (1000Mb/s)
– FDDI
– HiPPI
bits and Bytes
– 1MB/s = 8Mb/s
Factors
– K=1024, K=1000
CERN:
Interactive Systems
– Unix: WGS & PLUS
  » CUTE, SUE, DIANE
– NICE
Batch Systems
– Unix: SHIFT, CSF
  » CORE
– PCSF
Other:
Data Storage, Data Access & Filesystems
– AFS, NFS, RFIO, HPSS, Objectivity[/DB]
CPUs
– Alpha, MIPS, PA-Risc, PowerPC, Sparc
– Pentium, Pentium II, Merced
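The "bits and Bytes" and "Factors" entries above are easy to mix up when comparing network rates with file sizes. The following is a minimal sketch of the arithmetic only; the function and constant names are illustrative assumptions and do not come from any CERN tool, and the transfer time ignores protocol overhead.

```python
BITS_PER_BYTE = 8

K_BINARY = 1024   # factor commonly used for memory and file sizes
K_DECIMAL = 1000  # factor commonly used for network rates


def megabytes_to_megabits_per_s(mb_per_s: float) -> float:
    """Convert a rate in MB/s (bytes) to Mb/s (bits): 1MB/s = 8Mb/s."""
    return mb_per_s * BITS_PER_BYTE


def seconds_to_transfer(size_bytes: int, link_mbit_per_s: float) -> float:
    """Ideal transfer time over a link, assuming no protocol overhead.

    Network rates are taken with decimal factors: 1 Mb/s = 1,000,000 bits/s.
    """
    bits = size_bytes * BITS_PER_BYTE
    return bits / (link_mbit_per_s * K_DECIMAL * K_DECIMAL)


# One "binary" megabyte is slightly larger than one "decimal" megabyte.
one_mb_binary = K_BINARY * K_BINARY     # 1,048,576 bytes
one_mb_decimal = K_DECIMAL * K_DECIMAL  # 1,000,000 bytes
```

For example, `seconds_to_transfer(one_mb_binary, 10)` gives the ideal time to move one binary megabyte over Normal (10Mb/s) Ethernet; real transfers are slower.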
How to start?
Computing is everywhere at CERN!
– experiment computing facilities, administrative computing, central computing, many private clusters.
How should this lecture course be organised?
– From a rigorous academic standpoint?
– From a historical standpoint
– ...
– From a physics based viewpoint
Weekly use of Interactive Platforms, 1987-2001
[Figure: Number of Users each Week (axis from 0 to 12000) plotted against Week, for Windows 95, Windows NT, WGS and PLUS, CERNVM and VXCERN.]
Computing at CERN
Computing “purely for (experimental) physics” will be the focus of the other two lectures of this series. Leaving this area aside, other activities at CERN can be considered as falling into one of three areas:
– administration,
– technical and engineering activities, and
– theoretical physics.
We will take a brief look at some of the ways in which computing is used in these areas in the rest of this first lecture.
Technical and Engineering Computing
Engineers and physicists working at CERN must
– design,
– build, and
– operate
both
– accelerators and
– detectors
for experimental physicists to be able to collect the data that they need.
As in many other areas of engineering design, computer aided techniques are essential for the construction of today’s advanced accelerators and detectors.
Accelerator design issues
Oliver Brüning’s lectures will tell you more about accelerators. For the moment, all we need to know is that
– particles travelling in bunches around an accelerator are bent by dipole magnets and must be kept in orbit.
  » Of course, they must be accelerated as well(!), but we don’t consider that here.
Important studies for LHC are
– magnet design
  » how can we build the (superconducting) dipole magnets that are needed?
– transverse studies
  » will any particles leave orbit? (and hit the magnets!)
– longitudinal studies
  » how can we build the right particle bunches for LHC?
LHC Magnet Design
[Figure: 2D field picture for LHC dipole coil.]
[Figure: 3D representation of dipole coil end with magnetic field vectors.]
Pictures generated with ROXIE.
Genetic Algorithms for Magnet Design
Original coil design.
New coil design found using a genetic algorithm. This was further developed using deterministic methods and replaced the original design.
Genetic Algorithm convergence plot.
The algorithm is designed to come up with a number of alternative solutions which can then be further investigated.
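The slides do not say how the genetic algorithm used for the coil design works internally, so the following is only a generic sketch of the technique under stated assumptions. The objective `field_error`, its target values and all parameter settings are hypothetical stand-ins for a real field-quality calculation. The sketch shows selection, one-point crossover and mutation, and it returns several good candidates rather than a single answer, in the spirit of the last point above.

```python
import random


def field_error(params):
    """Toy objective (hypothetical): squared distance from an arbitrary target."""
    target = [0.5, -1.0, 2.0]
    return sum((p - t) ** 2 for p, t in zip(params, target))


def genetic_minimise(objective, n_params=3, pop_size=40, generations=60,
                     mutation_sigma=0.1, keep=5, seed=1):
    """Return the `keep` best parameter sets found by a simple genetic algorithm."""
    rng = random.Random(seed)
    population = [[rng.uniform(-3, 3) for _ in range(n_params)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=objective)
        parents = population[:pop_size // 2]   # selection: keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_params)   # one-point crossover
            child = a[:cut] + b[cut:]
            # mutation: small Gaussian change to every parameter
            child = [g + rng.gauss(0, mutation_sigma) for g in child]
            children.append(child)
        population = parents + children
    population.sort(key=objective)
    return population[:keep]                   # several alternatives, best first


candidates = genetic_minimise(field_error)
```

Each entry of `candidates` could then be handed to a deterministic optimiser for refinement, which mirrors the two-stage approach described on the slide.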
Longitudinal Studies
Not all particles in a bunch have the same energy. Studies of energy distribution show aspects of bunch shape.
– The energy of a particle affects its arrival time at the accelerating cavity… which then in turn affects the energy.
Need to measure both energy and arrival time, but can’t measure energy directly. Measuring arrival times is easy
– but difficult to interpret successive slices.
Tomography techniques lead to a complete picture
– like putting together X-ray slices through a person.
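The coupling between energy and arrival time described above can be illustrated with a toy turn-by-turn map. This is not the tomography method itself and not a model of any real machine: the units are arbitrary, and the `kick` and `slip` constants and the sign convention are assumptions chosen only so that the motion is a stable oscillation.

```python
import math


def track(phase, d_energy, turns, kick=0.01, slip=0.5):
    """Toy longitudinal map in arbitrary units (illustrative only).

    phase:    arrival phase at the cavity relative to the ideal particle
    d_energy: energy offset relative to the ideal particle
    """
    history = []
    for _ in range(turns):
        # The cavity changes the energy by an amount that depends on arrival time...
        d_energy -= kick * math.sin(phase)
        # ...and the energy offset changes the arrival time on the next turn.
        phase += slip * d_energy
        history.append((phase, d_energy))
    return history


orbit = track(phase=0.3, d_energy=0.0, turns=2000)
```

Plotting `d_energy` against `phase` for the points in `orbit` traces a closed curve: the particle oscillates around the ideal arrival time. Only the phase axis is directly measurable, which is why a reconstruction technique is needed to recover the full picture.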
CERN and the World Wide Web
The World Wide Web started as a project to make information more accessible, in particular, to help improve information dissemination within an experiment.
– These aspects of the Web are widely used at CERN today. All experiments have their own web pages and there are now web pages dedicated to explaining Particle Physics to the general public.
– In a wider sense, the web is being used to distribute graphical information on system, accelerator and detector status. The release of Java has given a big push to these uses.
Web browsers are also used to provide a common interface, e.g.
  » currently to the administrative applications, and
  » possibly in future as a batch job submission interface for PCs.
2000 to 2001: What has changed? I
Windows 2000 has arrived and Wireless Ethernet is arriving.
– Portable PCs replacing desktops.
– Integration of home directory, web files, working offline makes things easier, just like AFS and IMAP revolutionised my life 8 years ago.
I now have ADSL at home rather than ISDN.
– I am now outside the CERN firewall when connected from home but this doesn’t matter so much with all my files cached on my portable.
  » I just need to bolt on a wireless home network so I can work in the garden!
– The number of people connecting from outside the firewall will grow
  » CERN will probably have to support Virtual Private Networks for privileged access
  » And users will have to worry about securing their home network against hackers…
Looking Around: Summary
Computing extends to all areas of work at CERN.
In terms of CERN’s “job”, producing particle physics results, computing is essential for
– the design, construction and operation of accelerators and detectors, and
– theoretical studies, as well as
– the data reconstruction and analysis phases.
The major computing facilities at CERN, though, are provided for particle physics work and these will be the subject of the next two lectures.
2001 Summer Student Lectures
Computing at CERN
Lecture 2 — Looking at Data
Tony Cass — [email protected]
Data and Computation for Physics Analysis
[Diagram of data flow. Elements shown: detector; event filter (selection & reconstruction); raw data; event reconstruction; event simulation; processed data; event summary data; batch physics analysis; analysis objects (extracted by physics topic); interactive physics analysis.]
Central Data Recording
CDR marks the boundary between the experiment and the central computing facilities.
It is a loose boundary which depends on an experiment’s approach to data collection and analysis.
CDR developments are also affected by
– network developments, and
– event complexity.
[Diagram elements: detector; event filter (selection & reconstruction); raw data.]
Monte Carlo Simulation
From a physics standpoint, simulation is needed to study
– detector response
– signal vs. background
– sensitivity to physics parameter variations.
From a computing standpoint, simulation
– is CPU intensive, but
– has low I/O requirements.
Simulation farms are therefore good testbeds for new technology:
– CSF for Unix and now PCSF for PCs and Windows/NT.
[Diagram element: event simulation.]
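A toy example may help to show why simulation is CPU intensive with low I/O: the work is a long loop of random sampling that reads almost nothing and writes out only a small result. The sketch below is not taken from any experiment's software. The detector is modelled, as an assumption, by simple Gaussian smearing of the energy, and every number is hypothetical.

```python
import random


def simulate_efficiency(n_events, true_energy=50.0, resolution=0.1,
                        threshold=45.0, seed=42):
    """Fraction of simulated events whose measured energy passes a threshold.

    The 'detector response' is Gaussian smearing with a relative width of
    `resolution`; all values are hypothetical.
    """
    rng = random.Random(seed)
    passed = 0
    for _ in range(n_events):
        measured = rng.gauss(true_energy, resolution * true_energy)
        if measured > threshold:
            passed += 1
    return passed / n_events


efficiency = simulate_efficiency(100_000)
```

With these settings the threshold sits one standard deviation below the true energy, so the result should come out close to the analytic value of about 0.84. A realistic detector simulation does far more work per event, but the pattern of heavy computation and tiny output is the same.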
Data Reconstruction
The event reconstruction stage turns detector information into physics information about events. This involves
– complex processing
  » i.e. lots of CPU capacity
– reading all raw data
  » i.e. lots of input, possibly read from tape
– writing processed events
  » i.e. lots of output which must be written to permanent storage.
[Diagram elements: raw data; event reconstruction; event summary data.]
Batch Physics Analysis
Physics analysis teams scan over all events to find those that are interesting to them.
– Potentially enormous input
  » at least data from current year.
– CPU requirements are high.
– Output is “small”
  » O(10^2) MB
– but there are many different teams and the output must be stored for future studies
  » large disk pools needed.
[Diagram elements: event summary data; batch physics analysis; analysis objects (extracted by physics topic).]
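The pattern described above, a scan over every event that keeps only a small selected subset, can be sketched as a streaming filter. The record layout `EventSummary`, its fields and the cut values are all hypothetical; real event summary data is much richer and would be read from disk or tape rather than from a list.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class EventSummary:
    run: int
    event: int
    n_tracks: int
    total_energy: float  # GeV (hypothetical field)


def select(events: Iterable[EventSummary], min_tracks: int = 2,
           min_energy: float = 80.0) -> Iterator[EventSummary]:
    """Stream over every event once, yielding only those that pass the cuts."""
    for ev in events:
        if ev.n_tracks >= min_tracks and ev.total_energy >= min_energy:
            yield ev


sample = [
    EventSummary(1, 1, 4, 91.2),
    EventSummary(1, 2, 1, 88.0),   # fails the track cut
    EventSummary(1, 3, 6, 45.5),   # fails the energy cut
    EventSummary(2, 1, 2, 80.0),
]
selected = list(select(sample))
```

Because `select` is a generator, the input never has to fit in memory; only the selected events are kept, which is why the output of each team is small even though the input is enormous.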
Today’s CORE Computing Systems
[Diagram: CORE Physics Services and Central Data Services attached to the CERN Network. Boxes as labelled:]
– Dedicated RISC clusters: 300 computers, 750 processors (DEC, HP, SGI, SUN)
– “Queue shared” Linux Batch Service: 350 dual processor PCs
– “Timeshared” Linux cluster: 200 dual processor PCs
– Dedicated Linux clusters: 250 dual processor PCs
– RISC Simulation Facility: maintained for LEP only
– Shared Disk Servers: 5 TeraByte disk, 3 Sun servers, 6 PC based servers
– Shared Tape Servers: 10 tape robots, 100 tape drives (9940, Redwood, 9840, DLT, IBM 3590E, 3490, 3480, EXABYTE, DAT, Sony D1)
– PC & EIDE based disk servers: 40TB mirrored disk (80TB raw capacity), 25 PC servers
– Home directories & registry
– Consoles & monitors
– Interactive Services (DXPLUS, HPPLUS, RSPLUS, LXPLUS, WGS): 120 systems (HP, SUN, IBM, DEC, Linux)
– NAP - accelerator simulation service: 10-CPU DEC 8400, 12 DEC workstations, 20 dual processor PCs
– PaRC Engineering Cluster: 13 DEC workstations, 5 dual processor PCs, 5 Sun workstations
Hardware Evolution at CERN, 1989-2001
[Figure: timeline from 1989 to 2001 (axis labelled 89 to 01). Services shown: Event Filter, Engineering, Disk Servers, Tape Servers, Interactive, Batch. Hardware types shown: Mainframes (IBM, Cray), RISC systems, Scalable systems (SP2, CS2), PCs.]
Interactive Physics Analysis
Interactive systems are needed to enable physicists to develop and test programs before running lengthy batch jobs.
– Physicists also
  » visualise event data and histograms
  » prepare papers, and
  » send Email
Most physicists use workstations, either private systems or central systems accessed via an Xterminal or PC.
We need an environment that provides access to specialist physics facilities as well as to general interactive services.
[Diagram element: analysis objects (extracted by physics topic).]
Unix based Interactive Architecture
[Diagram: components connected through the CERN Internal Network. Boxes as labelled:]
– X Terminals, PCs, Private Workstations
– WorkGroup Server Clusters
– PLUS Clusters
– Central Services (mail, news, ccdb, etc.)
– ASIS: Replicated AFS Binary Servers
– AFS Home Directory Services
– General Staged Data Pool
– X-terminal Support
– Backup & Archive
– Reference Environments
– CORE Services
– Optimized Access
Event Displays
Event displays, such as this ALEPH display, help physicists to understand what is happening in a detector. A Web based event display, WIRED, was developed for DELPHI and is now used elsewhere.
Clever processing of events can also highlight certain features, such as in the V-plot views of ALEPH TPC data.
[Figure panels: Standard X-Y view; V-plot view.]
Data Analysis Work
By selecting a dE/dx vs. p region on this scatter plot, a physicist can choose tracks created by a particular type of particle.
Most of the time, though, physicists will study event distributions rather than individual events.
RICH detectors provide better particle identification, however. This plot shows that the LHCb RICH detectors can distinguish pions from kaons efficiently over a wide momentum range.
Using RICH information greatly improves the signal/noise ratio in invariant mass plots.
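The link between particle identification and invariant mass plots can be made concrete with the standard two-body formula, m² = (E1 + E2)² − |p1 + p2|², where each energy depends on the mass assigned to the track. The sketch below is a generic illustration and not code from any experiment. Natural units (c = 1) with GeV are assumed, and the track momenta in the usage note are made up.

```python
import math

M_PION = 0.13957  # GeV, charged pion mass
M_KAON = 0.49368  # GeV, charged kaon mass


def invariant_mass(p1, m1, p2, m2):
    """Invariant mass of two particles from momenta (px, py, pz) and masses.

    Natural units (c = 1); GeV is assumed for both momentum and mass.
    """
    e1 = math.sqrt(m1 ** 2 + sum(c ** 2 for c in p1))
    e2 = math.sqrt(m2 ** 2 + sum(c ** 2 for c in p2))
    px, py, pz = (a + b for a, b in zip(p1, p2))
    m_squared = (e1 + e2) ** 2 - (px ** 2 + py ** 2 + pz ** 2)
    return math.sqrt(max(m_squared, 0.0))  # guard against rounding below zero


# Same two (made-up) tracks under two different mass hypotheses.
track_a = (1.2, 0.3, 4.0)
track_b = (-0.8, 0.1, 3.5)
mass_if_pi_pi = invariant_mass(track_a, M_PION, track_b, M_PION)
mass_if_k_pi = invariant_mass(track_a, M_KAON, track_b, M_PION)
```

The two hypotheses give different masses for the same pair of tracks, so assigning the wrong particle type smears entries across the plot. Better identification therefore sharpens the signal peak relative to the background.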
Looking at Data: Summary
Physics experiments generate data!
– and physicists need to simulate real data to model physics processes and to understand their detectors.
Physics data must be processed, stored and manipulated.
[Central] computing facilities for physicists must be designed to take into account the needs of the data processing stages
– from generation through reconstruction to analysis
Physicists also need to
– communicate with outside laboratories and institutes, and to
– have access to general interactive services.