Feb. 26, 2001 L. Dennis, FSU
The Search for Exotic Mesons – The Critical Role of Computing in Hall D
Production of Mesons and Gluonic Excitations Using 6-12 GeV Photons
Fundamental Physics
Role of “glue” in strong QCD
Experimental Goal
Unambiguous identification of gluonic excitations, starting with exotic hybrids
Experimental Requirements
Hybrids are expected to exist precisely where we have almost no experimental information: photoproduction
Requires 6 – 12 GeV photon beam energies
Looking for Hybrids
We should observe exotic hybrids precisely where we have no data: PHOTOPRODUCTION
S = 0 – For pion and kaon probes, where most of our data exist
S = 1 – Use a probe with quark spins aligned (the photon), where we have essentially no data
Predicted Meson Spectrum
Predictions for exotic mesons come from lattice QCD and flux tube models.
In the flux tube picture, gluons in hadrons are confined to flux tubes.
Conventional mesons arise when the flux tube is in its ground state.
Hybrid mesons arise when the flux tube is in an excited state.
Meson Map
Hall D Online Data Acquisition
CEBAF provides us with a tremendous scientific opportunity for understanding one of the fundamental forces of nature.
[Figure: data acquisition diagram with labeled data rates of 900 MB/s and 75 MB/s]
Critical Role for Computing in Hall D
The quality of Hall D science depends critically upon the collaboration’s ability to conduct its computing tasks.
The Challenge
Minimize the effort required to perform computing
  Data Intensive Application
  Compute Intensive Applications
  Information Intensive Analysis
  Research Application – methods and algorithms are not fully defined.
Trigger Rates for Hall D
Detector: 180 kev/s
Trigger: 15 kev/s
Event size: 5 kB/ev, giving 75 MB/s
Trigger requires ~100 CPUs*
* Assumes a factor of 10 improvement over existing CPUs
Full Reconstruction: 5 CPU-ms/ev (CLAS: 50 ms/ev today)
Full Simulation: 100 CPU-ms/ev (CLAS: 1-3 s/ev today)
Assumed detector & accelerator efficiency: 1/3
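The rates on this slide can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the function name is made up, and it assumes that events have the same 5 kB size before and after the trigger, which the slide does not state.

```python
# Figures quoted on the slide.
DETECTOR_RATE_KEV_S = 180  # events seen by the detector, thousands per second
TRIGGER_RATE_KEV_S = 15    # events kept by the trigger, thousands per second
EVENT_SIZE_KB = 5          # size of one event in kB


def data_rate_mb_s(rate_kev_s, event_size_kb):
    """Return MB/s: (10^3 ev/s) * (10^3 B/ev) = 10^6 B/s."""
    return rate_kev_s * event_size_kb


# ASSUMPTION: events have the same size before and after the trigger.
into_trigger_mb_s = data_rate_mb_s(DETECTOR_RATE_KEV_S, EVENT_SIZE_KB)
to_storage_mb_s = data_rate_mb_s(TRIGGER_RATE_KEV_S, EVENT_SIZE_KB)
trigger_rejection = DETECTOR_RATE_KEV_S / TRIGGER_RATE_KEV_S

print(into_trigger_mb_s, to_storage_mb_s, trigger_rejection)
```

Under that assumption the two data rates agree with the 900 MB/s and 75 MB/s figures shown for the online data acquisition.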
Required Sustained Reconstruction Rate
[15 kev/s] × [1/3] × [2] = 10 kev/s
  (raw rate) × (equipment duty factor) × (duplication factor)
10 kev/s × 5 CPU-ms/ev = 50 CPUs
Required Sustained Simulation Rate
[15 kev/s] × [1/3] × [10] × [1/10] = 5 kev/s
  (raw rate) × (equipment duty factor) × (systematics studies) × (good event fraction)
5 kev/s × 100 CPU-ms/ev = 500 CPUs
PWA error is determined by one’s knowledge of systematic errors. This requires extensive simulations, but not all events simulated are accepted events.
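The two sustained-rate estimates above follow the same pattern, so they can be reproduced with one helper. This is a minimal sketch: `cpus_needed` and the constant names are hypothetical, and exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction

RAW_RATE_KEV_S = 15           # trigger output rate from the slides
DUTY_FACTOR = Fraction(1, 3)  # assumed detector & accelerator efficiency


def cpus_needed(rate_kev_s, cpu_ms_per_event):
    """(10^3 ev/s) * (10^-3 CPU-s/ev) = number of CPUs kept busy."""
    return rate_kev_s * cpu_ms_per_event


# Reconstruction: raw rate * duty factor * duplication factor of 2.
recon_rate = RAW_RATE_KEV_S * DUTY_FACTOR * 2
recon_cpus = cpus_needed(recon_rate, 5)

# Simulation: factor 10 for systematics studies, 1/10 good event fraction.
sim_rate = RAW_RATE_KEV_S * DUTY_FACTOR * 10 * Fraction(1, 10)
sim_cpus = cpus_needed(sim_rate, 100)

print(f"reconstruction: {recon_rate} kev/s -> {recon_cpus} CPUs")
print(f"simulation:     {sim_rate} kev/s -> {sim_cpus} CPUs")
```

Simulation costs 20 times more CPU per event than reconstruction, which is why it dominates the total even at half the event rate.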
Annual Data Rate to Archive
Raw Data
75 MB/sec × (3 × 10^7 s/yr) × (1/3) = 0.75 PB/yr
Simulation Data
25 MB/sec × (3 × 10^7 s/yr) = 0.75 PB/yr
Reconstructed Data
50 MB/sec × (3 × 10^7 s/yr) = 1.50 PB/yr
Total Rate to Archive ~ 3 PB/yr
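The archive estimate is a unit conversion and can be checked as follows. The sketch uses decimal prefixes (1 MB = 10^6 bytes, 1 PB = 10^15 bytes); the helper name is hypothetical.

```python
SECONDS_PER_YEAR = 3e7  # the slide's round figure for one year
MB = 1e6
PB = 1e15


def pb_per_year(rate_mb_s, duty_factor=1.0):
    """Convert a sustained MB/s rate into PB written per year."""
    return rate_mb_s * MB * SECONDS_PER_YEAR * duty_factor / PB


archive = {
    "raw": pb_per_year(75, duty_factor=1 / 3),  # only raw data carries the 1/3
    "simulation": pb_per_year(25),
    "reconstructed": pb_per_year(50),
}
total = sum(archive.values())

for name, volume in archive.items():
    print(f"{name:14s}{volume:5.2f} PB/yr")
print(f"{'total':14s}{total:5.2f} PB/yr")
```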
Requirements Summary
Hall D CPU Requirements
First Pass: 7%
Trigger: 13%
Analysis: 13%
Simulation: 67%
Hall D Annual Data Rates
Simulation: 50%
Raw Data: 25%
Analyzed: 25%
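The CPU percentages quoted on this slide can be recovered from the earlier CPU counts. The sketch below rests on two assumptions that are not in the slides: that "First Pass" corresponds to the 50-CPU reconstruction estimate, and that physics analysis needs about 100 CPUs. The second number is a guess chosen because it reproduces the quoted shares.

```python
# Trigger, first pass and simulation counts come from the earlier estimates.
# ASSUMPTION: the analysis count (100) is not given anywhere in the slides.
cpus = {"First Pass": 50, "Trigger": 100, "Analysis": 100, "Simulation": 500}

total_cpus = sum(cpus.values())
shares = {task: round(100 * n / total_cpus) for task, n in cpus.items()}

for task, percent in shares.items():
    print(f"{task:11s}{percent:3d}%")
```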
Hall D Computing Tasks
First Pass Analysis
Data Mining
Physics Analysis
Partial Wave Analysis
Physics Analysis
Acquisition
Monitoring
Slow Controls
Data Archival
Planning
Simulation
Publication
Calibrations
Meeting the Hall D Computational Challenges
Moore’s law: Computer performance increases by a factor of 2 every 18 months.
Gilder’s Law: Network bandwidth triples every 12 months.
Dennis’ Law: Neither Moore’s Law nor Gilder’s Law will solve our computing problems.
Solving the information management problems requires people working on the software and developing a workable computing environment.
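To put numbers on the first two laws, the sketch below compounds the stated growth factors over a planning horizon. The five-year horizon is a hypothetical choice, not a figure from the talk.

```python
def growth(factor, period_months, elapsed_months):
    """Multiplicative gain after elapsed_months, given `factor` per period."""
    return factor ** (elapsed_months / period_months)


YEARS = 5  # ASSUMPTION: illustrative planning horizon
cpu_gain = growth(2, 18, 12 * YEARS)  # Moore: x2 every 18 months
net_gain = growth(3, 12, 12 * YEARS)  # Gilder: x3 every 12 months

print(f"CPU x{cpu_gain:.1f}, network x{net_gain:.0f} after {YEARS} years")
```

With this horizon the CPU gain comes out at roughly 10, the same factor assumed in the trigger estimate. Neither number reduces the effort needed to organize and manage the information, which is the point of the third law.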
“Chaos of Analysis”
Problem:
It is impossible to efficiently complete our computing in a single large, common, democratic computer facility.
Solution:
Provide several sites with the resources required to complete specific tasks. Choose those sites which seek to become lead institutions in specific efforts, such as simulations, calibrations or partial wave analysis.
Grid Computing Advantages
Common access for physicists everywhere.
Utilizing all intellectual resources:
  JLab, universities, remote sites
  Scientists, students
Maximize total funding resources while meeting the total computing need.
Reduce systems’ complexity:
  Partitioning of facility tasks, to manage and focus resources.
Optimization of computing resources to solve the problem:
  Tier-n or “Grid” model.
Reduce long-term computational management problems.
Digital Hall D Ground Rules
Distributed objects: define all programs and data as objects.
Define “or wrap” everything in XML.
Implement in the object model du jour (CORBA, Java, COM, SOAP …).
Does not require that we use an object database or that we use relational databases inappropriately.
Move and query metadata rather than data whenever possible.
Move the applications to the data.
Assume everybody has wireless access to the “Digital Hall D” through hand-held and conventional computers.
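The rules "query metadata rather than data" and "move the applications to the data" can be illustrated with a toy catalog. Everything in this sketch is hypothetical: the record fields, site names, dataset names and sizes are invented for illustration and do not describe an actual Hall D catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    """Metadata only: a small record describing a dataset stored elsewhere."""
    name: str
    site: str
    kind: str
    size_gb: float


# Hypothetical catalog; every name and number here is invented.
CATALOG = [
    DatasetRecord("run1001.raw", "jlab", "raw", 900.0),
    DatasetRecord("run1001.sim", "site-a", "simulation", 300.0),
    DatasetRecord("run1002.sim", "site-b", "simulation", 280.0),
]


def find(kind):
    """Query the metadata; no dataset is opened or copied."""
    return [rec for rec in CATALOG if rec.kind == kind]


def plan_jobs(records):
    """Group dataset names by hosting site so each job runs next to its data."""
    plan = {}
    for rec in records:
        plan.setdefault(rec.site, []).append(rec.name)
    return plan


jobs = plan_jobs(find("simulation"))
print(jobs)
```

Only the small records travel over the network; the resulting plan says which site should run which job.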
Digital Hall D Technologies
HallD Grid: Globus provides infrastructure to access computer resources around the world.
Structure access to Digital Hall D as a Portal – myHallD.org.
Use a multi-tier software architecture separating resources, servers/brokers, display engines, display devices.
Do not write any HTML – use XML and convert.
Program in C++ or Java.
Vision for Grid Environment
Work toward a Grid-based operating system.
Standard toolkit for manipulating objects.
  For example: copy, find, create, delete, …
Standards for developing additional complex Grid-based tools.
  For example: a tool that builds an acceptance function from available GEANT simulations, whose results are stored in several locations.
Tools to share intermediate results of large computations.
Many of these tools exist; what remains is to select the appropriate ones and wrap them in standardized interfaces so they can work with Hall D objects.
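The acceptance-function example could look roughly like the sketch below once each site has reported its per-bin counts. The site names and numbers are invented, the binning is assumed to be identical at every site, and fetching the results from remote locations is left out entirely.

```python
import math

# Hypothetical per-site results: for each kinematic bin, the number of events
# generated and the number accepted after detector simulation.
SITE_RESULTS = {
    "site-a": {"generated": [1000, 1000, 1000], "accepted": [420, 610, 150]},
    "site-b": {"generated": [3000, 1000, 1000], "accepted": [1280, 590, 170]},
}


def merge_counts(site_results, key):
    """Sum one kind of count bin by bin over all sites."""
    columns = zip(*(result[key] for result in site_results.values()))
    return [sum(column) for column in columns]


def acceptance(generated, accepted):
    """Return (efficiency, binomial error) for every bin with n_gen > 0."""
    table = []
    for n_gen, n_acc in zip(generated, accepted):
        eff = n_acc / n_gen
        err = math.sqrt(eff * (1.0 - eff) / n_gen)
        table.append((eff, err))
    return table


generated = merge_counts(SITE_RESULTS, "generated")
accepted = merge_counts(SITE_RESULTS, "accepted")
for i, (eff, err) in enumerate(acceptance(generated, accepted)):
    print(f"bin {i}: acceptance = {eff:.3f} +/- {err:.3f}")
```

Summing counts before dividing weights each site by its statistics, which is one reason to share intermediate results rather than final efficiencies.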
Foundations for Grid Sites
Grid Services
Data Services
Compute Services
Information Services
Interactive Services
Batch Services
Needs very reliable hardware & software at remote sites
Needs very reliable, easy to install software at remote sites
Hierarchy of Portals and Their Technology
Portal Building Tools and Frameworks (XUL, Ninja, iPlanet, E-Speak, Portlets, WebSphere, www.desktop.com)
Enterprise Portals
Generic Portals
Education & Training Portals
Science Portals
K-12, University, Biology, Chem Eng
Collaboration, Universal Access, Security, …
Databases, …
User customization, component libraries, fixed channels
Education Services
Compute Services
Information Services
Generic Services
Collaborative Objects
Digital objects shared by more than one person.
Asynchronous sharing: You create/modify an object. Others access/modify it at a later time.
Synchronous Collaboration: Real-time access/modification of objects by several people in distributed locations.
Virtual Experimental Control Room
Could be a big win as (unexpected) real-time decisions need “experts-on-demand.”
Model being considered by NASA for remote spacecraft mission control and real-time scientific analysis of earthquakes.
Need collaborative decision making (vote?) and planning tools.
Needs shared streaming data and shared read-outs of experimental monitors (output of all devices must be distributed objects which can be shared).
Needs to support experts caught on the beach with poor connectivity or in their car with just a cell phone and a PDA.
Building Computer Science & Physics Teams for Computing System Development
Physicists
Computer Scientists
Computing environment we need to be successful
$’s, Prestige, Tradition
Conclusions
Hall D provides tremendous opportunities for new physics.
Requires unprecedented computing.
Grid and portal technology provide a unique new method of involving distributed intellectual resources in this important problem.
The resources required to create those solutions are not yet in place.
Collaboration Computing Organization
Attracting physicists to work on software is difficult.
Perceived importance is based on capital “$’s” spent.
  Accelerator, Detector, Computing.
Once it works, they have nothing they can show to their dean and say, “I built that!”
“Everyone” thinks it is easy.
One good way to have a really positive impact on the science.
Helps train and attract students for a variety of careers.
Collaboration Computing Organization
Attracting computer scientists to work on physics software is difficult.
Perceived importance is based on computer science research, not computer science applications.
Physics publications don’t help computer scientists get tenure.
“Everyone” thinks it is easy.
A good way to actually test computer science theory.
  Science requires experimental testing to progress.
Real world training ground for students.
[Figure: tape staging schedule showing start-up, equilibrium, and shut-down phases. 2 tape drives; 4/1 ratio of processing to I/O per tape; 1.2 TBytes of disk required.]
Obtaining Optimum System Performance
[Figure: Data Reduction System Efficiency. Efficiency (0.40 to 1.00) versus equilibrium cycle count (1 to 34), with curves for tape efficiency, CPU efficiency, and estimated system efficiency.]
Efficient Information Access is Key to Using the HallD Grid
Data Acquisition: Raw Data, Experimental Conditions
Calibrations
Simulations
Data Reduction
Physics Analysis
PWA
Information From Researchers
Hall D Experimental Information
Focus: Accurate, Timely Analysis
Provide people with the information and resources they need to conduct their analysis
  Provide it reliably
  Provide it in the way scientists need it
  Provide it efficiently (speed, effort)
  Provide flexibility for other applications
Hall D Portal: MyHallD
What’s involved in MyHallD?
  Probably needs some money, but < $30.9442 M.
  Commitment to use the “HallD Digital Object Framework”.
  Basic functions are available in existing commercial systems. Start to use these.
  Prototype some of the special capabilities needed.
What is involved in making HallD objects collaborative?
  First use objects!
  Then we have choices – which vary in ease of use and functionality.
MyHallD: The Portal Door to:
  Experiment Control Room
  Simulation Farms & Data
  Calibration Farm & Data
  Reconstruction Farm
  Analysis Farms & Data
  Board Room & Archive
  Personalized Electronic Logbook
  Hall D Education and Outreach Area
Collaborative Computing Organization
Clearly establishes responsibility for software subsystems.
Gives University groups working on software something to show for their efforts.
Helps to attract people and resources to the computing efforts.
Can leverage other University and National resources.
  Infrastructure, personnel, funding, NSF & DOE ITR initiatives.
Eases the creation of customized (Grid) computing systems.
Establishes new capabilities within the JLab/NP community.
  These capabilities allow JLab to take advantage of new opportunities.
Critical Software Issues
Early creation of a “core group” of software developers.
Creation of key design elements.
Commitment to key design goals.
Key Software Problems:
  Simulations.
  Software organization and management.
  Data formats for raw and derived data.
  Software for defining and accessing raw and derived data.
  Event visualization.
Using available software.
Developing & maintaining high-quality software.
Computing Organization Issues
Recommendations:
  Online database – rely totally on automated methods.
  Offline database – rely totally on automated methods.
  Integrated online/offline/simulation database.
  Event Analysis – do it at Jefferson Lab.
  Calibrations – possible to do elsewhere.
  Physics Analysis – possible to do elsewhere.
  Simulations – possible to do elsewhere.
  PWA – possible to do elsewhere.
Computing Organization Issues (continued)
Recommendations:
  Develop infrastructure to easily share computing resources and information.
  Develop customized computing approach to Hall D computing.
  Provides clear lines of responsibility for software and computing tasks.
These are social decisions – not technical or financial decisions.
Collaboration Computing Organization
The job is too big to be managed without databases.
Provides wider access to experimental information.
Databases are optimized for managing large data sets.
  We will create 5 – 10 M files every year.
Database use can be organized to minimize its impact on time-critical applications.
Experiments Database
[Figure: database diagram linking Run, Detector Config., Analysis, Simulation, and Calibration through one-to-many (1/M) and many-to-many (M/M) relationships]
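The 1/M and M/M labels in the diagram correspond to ordinary relational patterns: a foreign key for one-to-many, a link table for many-to-many. The sketch below shows both with SQLite. Which entities are joined by which kind of relationship is not recoverable from the slide, so the choices here (one detector configuration for many runs, runs and calibrations many-to-many) are assumptions, as are all table names, column names and sample rows.

```python
import sqlite3

# ASSUMPTION: hypothetical schema; the real relationships are not specified.
SCHEMA = """
CREATE TABLE detector_config (id INTEGER PRIMARY KEY, label TEXT NOT NULL);
CREATE TABLE run (
    id INTEGER PRIMARY KEY,
    number INTEGER NOT NULL UNIQUE,
    config_id INTEGER NOT NULL REFERENCES detector_config(id)  -- 1/M
);
CREATE TABLE calibration (id INTEGER PRIMARY KEY, label TEXT NOT NULL);
CREATE TABLE run_calibration (                                 -- M/M link
    run_id INTEGER NOT NULL REFERENCES run(id),
    calibration_id INTEGER NOT NULL REFERENCES calibration(id),
    PRIMARY KEY (run_id, calibration_id)
);
"""


def build_demo_db():
    """Create an in-memory database with a few invented rows."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    db.execute("INSERT INTO detector_config VALUES (1, 'nominal field')")
    db.executemany("INSERT INTO run VALUES (?, ?, ?)",
                   [(1, 1001, 1), (2, 1002, 1)])
    db.executemany("INSERT INTO calibration VALUES (?, ?)",
                   [(1, "tof-v1"), (2, "tof-v2")])
    db.executemany("INSERT INTO run_calibration VALUES (?, ?)",
                   [(1, 1), (2, 1), (2, 2)])
    return db


def calibrations_for_run(db, run_number):
    """Follow the many-to-many link from a run number to calibration labels."""
    rows = db.execute(
        "SELECT c.label FROM calibration c "
        "JOIN run_calibration rc ON rc.calibration_id = c.id "
        "JOIN run r ON r.id = rc.run_id "
        "WHERE r.number = ? ORDER BY c.label",
        (run_number,),
    )
    return [label for (label,) in rows]


db = build_demo_db()
print(calibrations_for_run(db, 1002))
```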
Online & Offline Analysis
Integrated online & offline analysis systems.
Pros:
  Common system requires less effort.
  Encourages cooperation between online & offline.
  Potentially higher reliability.
Challenges:
  Broad contributions to offline analysis require standards and convenience, which add performance overhead.
  Level 3 trigger performance must be acceptable.
  No working Level 3 trigger system at JLab.
  No “suitable” memory management system for CODA events.
Online, Offline & Simulation Database
Automated Experiments Database.
Pros:
  Common system requires less effort.
  Encourages cooperation between different computing groups.
  Better organization of needed information.
  Higher reliability and better access.
Challenges:
  Any one software developer in the information chain can break it.
  Distributed simulations require modern organization of the database.
Where to Perform 1st Pass Analysis?
1st Pass Analysis at JLab.
Pros:
  Don’t need to transport the data.
  Computer system support is in place.
  Detector experts on site.
Challenges:
  Oversubscribed computer system.
  Obtaining efficient tape access and system throughput is unlikely in a heterogeneous computing environment.
Where to Perform Physics Analysis?
Physics analysis is done where the researchers live.
Pros:
  Not competing with major analysis & simulation efforts.
  Easier to involve more people.
Challenges:
  Requires a portable analysis code.
  Requires a good system for quality control of results.
Where to Perform Simulations?
Simulations done at a few institutions.
Pros:
  Get more groups invested in simulation effort.
  Probably don’t need to transport the data.
  Easy to do remotely.
Challenges:
  Need computer infrastructure in place.
  Need software infrastructure in place.
Key Differences Between Halls B and D
More uniform physics goals in Hall D.
Jefferson Lab computing infrastructure is in place.
Hall B computing personnel hired late in the process.
  Fundamentally changed the direction of the software and organizational approach to the problems.
  Many things had to wait until the very last minute.
Related Computing Trends
We depend on commodity computing
  Clusters
  Networks
  Storage media (disks & tapes)
Intel’s Merced processors (Itanium)
  500 MHz, 64 bits, 4-way processor
  A year late
File size
  Currently 2 GB software limit
  2 GB going to 2^32 × 2 GB (effectively infinite for us)
  What determines the optimum file size?
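A short calculation shows why the 2 GB limit matters at Hall D rates. The sketch assumes the limit is the usual one imposed by a signed 32-bit file offset (2^31 bytes) and reuses the 5 kB event size and 15 kev/s trigger rate from the trigger-rate estimate; the variable names are illustrative.

```python
# ASSUMPTION: the 2 GB software limit is the signed 32-bit offset limit.
OLD_LIMIT = 2**31              # bytes addressable with a signed 32-bit offset
NEW_LIMIT = 2**32 * OLD_LIMIT  # the slide's "2^32 x 2 GB"

EVENT_SIZE = 5_000     # bytes per event (5 kB/ev)
TRIGGER_RATE = 15_000  # events per second (15 kev/s)

events_per_file = OLD_LIMIT // EVENT_SIZE
seconds_per_file = events_per_file / TRIGGER_RATE

print(f"new limit = 2^{NEW_LIMIT.bit_length() - 1} bytes")
print(f"a 2 GB file holds {events_per_file} events, "
      f"about {seconds_per_file:.0f} s of data taking")
```

At the full trigger rate a 2 GB file fills in under half a minute, so the optimum file size is set by bookkeeping and tape handling rather than by the file system once the larger limit is available.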
Related Computing Trends (Continued)
Grid Computing
  High speed networks
  Distributed “service” or “data” centers
  GLOBUS, Legion, home-grown
XML – not just a better HTML
  Standard method for creating self-describing data
  Many tools available (B2B)
Mobile Computing, Portal Technology
  Customized access to computing resources via data-starved devices
  Customized view of an experiment or equipment
Benefits of XML
Standardized access to databases and applications.
[Figure: data flow among DB, DB to XML, XML to XML Select, XML to DB, Config. View, Launcher, and XML App components]
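The "DB to XML" step in the diagram can be sketched with the Python standard library. The table, column names and values below are invented for illustration, and the helper assumes that column names are valid XML element names.

```python
import sqlite3
import xml.etree.ElementTree as ET


def select_to_xml(db, query, root_tag, row_tag):
    """Run a SELECT and wrap each result row in an element named row_tag."""
    cursor = db.execute(query)
    columns = [col[0] for col in cursor.description]
    root = ET.Element(root_tag)
    for row in cursor:
        item = ET.SubElement(root, row_tag)
        for column, value in zip(columns, row):
            ET.SubElement(item, column).text = str(value)
    return root


# Hypothetical table standing in for part of an experiments database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE run (number INTEGER, beam_gev REAL)")
db.executemany("INSERT INTO run VALUES (?, ?)", [(1001, 8.0), (1002, 9.0)])

runs = select_to_xml(
    db, "SELECT number, beam_gev FROM run ORDER BY number", "runs", "run"
)
xml_text = ET.tostring(runs, encoding="unicode")
print(xml_text)
```

Any application that understands the resulting element names can consume the data without knowing the database schema or vendor.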
Benefits of XML
Standard routines exist in Perl, C++ and Java for converting between internal and external storage.
[Figure: XML Apps and SII Apps exchanging data through XML/SII converters]
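The conversion between internal and external storage amounts to a round trip like the one below. Python's `xml.etree.ElementTree` stands in here for the Perl, C++ and Java routines mentioned on the slide, and the `Calibration` record and its fields are hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Calibration:
    """Hypothetical internal representation of one calibration constant."""
    detector: str
    run: int
    gain: float


def to_xml(cal):
    """Internal storage -> external, self-describing XML text."""
    elem = ET.Element("calibration", detector=cal.detector, run=str(cal.run))
    ET.SubElement(elem, "gain").text = repr(cal.gain)
    return ET.tostring(elem, encoding="unicode")


def from_xml(text):
    """External XML text -> internal storage."""
    elem = ET.fromstring(text)
    return Calibration(
        detector=elem.get("detector"),
        run=int(elem.get("run")),
        gain=float(elem.findtext("gain")),
    )


original = Calibration("tof", 1001, 1.025)
restored = from_xml(to_xml(original))
print(to_xml(original))
```

Writing the float with `repr` keeps the round trip exact, so `restored` compares equal to `original`.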