Experiment Applications: Applying the Power of the Grid to Real Science
Rick Cavanaugh
University of Florida
GriPhyN/iVDGL
External Advisory Committee
13 January, 2003
GriPhyN/iVDGL and ATLAS
Argonne, Boston, Brookhaven, Chicago, Indiana, Berkeley, Texas
13.01.2003 EAC Review 3
ATLAS at SC2002
Grappa: manages the overall grid experience
Magda: distributed data management and replication
Pacman: defines and produces software environments
DC1 production with GRAT: data challenge simulations for ATLAS
Instrumented Athena: grid monitoring of ATLAS analysis applications
VO-gridmap: virtual organization management
Gridview: monitoring U.S. ATLAS resources
WorldGrid: world-wide US/EU grid infrastructure
Pacman at SC2002
How did we install our software for this demo?
  % pacman --get iVDGL:WorldGrid
  % pacman --get ScienceGrid
Pacman lets you define how a mixed tarball/rpm/gpt/native software environment is
• Fetched
• Installed
• Set up
• Updated
This can be figured out once and exported to the rest of the world via caches:
  % pacman --get atlas_testbed
Dependencies are automatically resolved from the caches you have decided to trust; the result is installed software with pointers to local documentation.
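The cache-driven dependency resolution described above can be sketched in a few lines. This is an illustration only, not Pacman's actual implementation, and all package names and cache entries here are hypothetical:

```python
# Minimal sketch of cache-based dependency resolution in the spirit of
# "pacman --get": a trusted cache maps package names to definitions
# (how the package is fetched, what it depends on), and installation
# resolves dependencies transitively, depth-first.

def resolve(package, cache, installed=None):
    """Return an install order with each dependency before its dependents."""
    if installed is None:
        installed = []
    for dep in cache[package]["depends"]:
        if dep not in installed:
            resolve(dep, cache, installed)
    if package not in installed:
        installed.append(package)
    return installed

# A toy trusted cache (names are invented for illustration).
cache = {
    "atlas_testbed": {"source": "tarball", "depends": ["globus", "athena"]},
    "athena":        {"source": "rpm",     "depends": ["globus"]},
    "globus":        {"source": "gpt",     "depends": []},
}

order = resolve("atlas_testbed", cache)
print(order)  # ['globus', 'athena', 'atlas_testbed']
```

The point mirrors the slide: the ordering is figured out once, published in a cache, and every site that trusts the cache gets the same environment.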
Grappa at SC2002
Web-based interface for Athena job submission to Grid resources
Based on XCAT Science Portal technology developed at Indiana
EDG JDL backend to Grappa
Common submission to US gatekeepers and EDG resource broker (through EDG “user interface” machine)
Grappa Communications Flow
[diagram: the Grappa portal machine (an XCAT Tomcat server) is driven over https/JavaScript from a web-browsing machine (Netscape/Mozilla/Internet Explorer/PalmScape) or by script-based submission (interactive or cron-job, e.g. via the Cactus framework). CoG kits handle submission and monitoring to compute resources A ... Z, and data copy to data storage (data disk, HPSS). MAGDA registers file locations and file metadata via a spider process, and its catalogue can be browsed over http; input files flow to the compute resources.]
Instrumented Athena at SC2002
Part of the SuperComputing 2002 ATLAS demo
Prophesy (http://prophesy.mcs.anl.gov/)
• An Infrastructure for Analyzing & Modeling the Performance of Parallel & Distributed Applications
• Normally a parse-and-auto-instrument approach (C & FORTRAN)
NetLogger (http://www-didc.lbl.gov/NetLogger/)
• End-to-End Monitoring & Analysis of Distributed Systems
• C, C++, Java, Python, Perl, Tcl APIs
• Web Service Activation
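The end-to-end monitoring idea behind instrumenting Athena can be illustrated with a toy event logger that records timestamped begin/end events around each stage. The class and event names here are invented for illustration; this is not NetLogger's or Prophesy's actual API:

```python
import time

class EventLog:
    """Toy end-to-end monitor: timestamped begin/end events per stage."""
    def __init__(self):
        self.events = []

    def write(self, event, **fields):
        # Each record carries a wall-clock timestamp plus arbitrary fields.
        self.events.append({"ts": time.time(), "event": event, **fields})

    def timed(self, stage):
        # Context manager that brackets a stage with begin/end events.
        log = self
        class _Timer:
            def __enter__(self):
                log.write(f"{stage}.begin")
            def __exit__(self, *exc):
                log.write(f"{stage}.end")
        return _Timer()

log = EventLog()
with log.timed("athena.reconstruction"):
    sum(range(1000))  # stand-in for real reconstruction work
print([e["event"] for e in log.events])
# ['athena.reconstruction.begin', 'athena.reconstruction.end']
```

Correlating such begin/end pairs across hosts is what lets a distributed run be analyzed end to end.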
GriPhyN/iVDGL and CMS
Caltech, Fermilab, Florida, San Diego, Wisconsin
Bandwidth Gluttony at SC2002
A "Grid-enabled" particle physics analysis application that
• issued remote database selection queries and prepared data object collections,
• moved the collections across the WAN using specially enhanced TCP/IP stacks, and
• rendered the results in real time on the analysis client workstation in Baltimore.
MonaLisa at SC2002
MonaLisa (Caltech)
– Deployed on the US-CMS Test-bed
– Dynamic information/resource discovery mechanism using agents
– Implemented in
  > Java/Jini with interfaces to SNMP, MDS, and Ganglia
  > WSDL/SOAP with UDDI
– Proved critical during live CMS production runs
Pictures courtesy of Iosif Legrand
MOP and Clarens at SC2002
Simple, robust grid planner integrated with CMS production software
1.5 million simulated CMS events produced over 2 months (~30 CPU years)
[diagram: a mop-submitter on the VDT client (Linker and ScriptGen, fed by Config, Req., Self Desc., and Master inputs) drives MCRunJob; DAGMan/Condor-G dispatches the jobs to VDT servers 1 ... N, each running Condor and GridFTP; Clarens clients communicate with Clarens servers alongside the production chain.]
Chimera Production at SC2002
Used VDL to describe virtual data products and their dependencies
Used the Chimera Planners to map abstract workflows onto concrete grid resources
Implemented a WorkRunner to continuously schedule jobs across all grid sites
Example CMS concrete DAG
[figure: the CMS production/analysis pipeline (Generator → Simulator → Formator → Reconstructor → Ntuple), with parameters, executables, and data flowing between the production and analysis stages; each concrete DAG node expands into Stage File In → Execute Job → Stage File Out → Register File.]
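The idea of describing virtual data products by their derivations can be sketched as follows. This is a toy Python analogue only (Chimera's actual VDL is a declarative language), and the transformation names are illustrative stand-ins for the CMS pipeline stages:

```python
# Each virtual data product records the transformation that derives it
# and the input products that transformation consumes. Walking the
# dependencies yields a concrete execution order, as a planner would.
derivations = {
    "ntuple":    ("analyze",     ["reco"]),
    "reco":      ("reconstruct", ["formatted"]),
    "formatted": ("format",      ["simulated"]),
    "simulated": ("simulate",    ["generated"]),
    "generated": ("generate",    []),
}

def plan(product, dag=None):
    """Topologically order the transformations needed to produce `product`."""
    if dag is None:
        dag = []
    transform, inputs = derivations[product]
    for inp in inputs:
        plan(inp, dag)
    if transform not in dag:
        dag.append(transform)
    return dag

print(plan("ntuple"))
# ['generate', 'simulate', 'format', 'reconstruct', 'analyze']
```

Because each product names its derivation, requesting any product implicitly requests everything upstream of it, which is exactly what makes the virtual-data view work.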
Data Provenance at SC2002
A virtual space of simulated data is created for future use by scientists...
[figure: a tree of virtual data products keyed by their parameters: mass = 200 at the root; branches for decay = WW and decay = ZZ; under decay = WW, stability = 1 and stability = 3; and derived products such as event = 8 and plot = 1 at the leaves.]
Data Provenance at SC2002
Search for WW decays of the Higgs Boson where only stable, final-state particles are recorded: mass = 200; decay = WW; stability = 1
[figure: the same virtual data tree, with the branch mass = 200, decay = WW, stability = 1 and its derived products (event = 5, plot = 1) selected.]
Data Provenance at SC2002
...The scientist adds a new derived data branch and continues to investigate!
[figure: the virtual data tree extended with a new derived product: mass = 200, decay = WW, stability = 1, LowPt = 20, HighPt = 10000.]
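Selecting virtual data products by metadata attributes, as in the search above, amounts to filtering a catalog of parameter sets. A minimal sketch, with the catalog contents mirroring the figure and the `search` helper invented for illustration:

```python
# Toy virtual-data catalog: each entry is the parameter set that
# defines (and uniquely identifies) one product in the virtual space.
catalog = [
    {"mass": 200},
    {"mass": 200, "decay": "WW"},
    {"mass": 200, "decay": "ZZ"},
    {"mass": 200, "decay": "WW", "stability": 1},
    {"mass": 200, "decay": "WW", "stability": 1, "event": 8},
    {"mass": 200, "decay": "WW", "stability": 1, "plot": 1},
]

def search(catalog, **attrs):
    """Return products whose parameters include all requested attributes."""
    return [p for p in catalog if all(p.get(k) == v for k, v in attrs.items())]

# The scientist's query from the slide: mass = 200; decay = WW; stability = 1
hits = search(catalog, mass=200, decay="WW", stability=1)
print(len(hits))  # 3: the stability = 1 product and its two derived products
```

A query matches a product and everything derived from it, which is why adding a new branch (the LowPt/HighPt derivation) simply adds one more entry to the catalog.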
GriPhyN and LIGO (Laser Interferometer Gravitational-wave Observatory)
ISI, Caltech, Milwaukee
LIGO’s Pulsar Search
[diagram: the interferometer writes raw channels into long time frames, which are stored and archived in a DB alongside short time frames; from a single frame a channel is extracted; a Short Fourier Transform is applied, the frequency range of interest is extracted, and the spectra are transposed to construct a time-frequency image (Hz vs. time, 30 minutes wide), in which candidate events are found.]
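The short-Fourier-transform step of the pipeline can be sketched with NumPy: slice the channel into short segments, transform each, keep a frequency band, and stack the spectra into a time-frequency image. The sample rate, segment length, and band edges below are arbitrary illustrations, not LIGO's actual parameters:

```python
import numpy as np

def time_frequency_image(channel, seg_len, f_lo, f_hi, sample_rate):
    """Short Fourier transforms of consecutive segments, restricted to a band."""
    n_segs = len(channel) // seg_len
    freqs = np.fft.rfftfreq(seg_len, d=1.0 / sample_rate)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    columns = []
    for i in range(n_segs):
        seg = channel[i * seg_len:(i + 1) * seg_len]
        spectrum = np.abs(np.fft.rfft(seg))
        columns.append(spectrum[band])
    # Transpose so rows are frequency (Hz) and columns are time, as in the figure.
    return np.array(columns).T

rate = 1024  # Hz (illustrative)
t = np.arange(8 * rate) / rate
channel = np.sin(2 * np.pi * 60.0 * t)  # a fake 60 Hz line in the data
image = time_frequency_image(channel, seg_len=rate, f_lo=50, f_hi=70, sample_rate=rate)
print(image.shape)  # (21 frequency bins in the 50-70 Hz band, 8 time segments)
```

A pulsar candidate would show up as a narrow horizontal track in such an image; here the injected 60 Hz line plays that role.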
Pegasus: Planning for Execution in Grids
Developed at ISI as part of the GriPhyN project
Configurable system that can map and execute complex workflows on the Grid
Integrated with the GriPhyN Chimera system
• Receives an abstract workflow (AW) description from Chimera and produces a concrete workflow (CW)
• Submits the CW to DAGMan for execution
• Optimizations of the CW are done from the point of view of virtual data
Can perform AW planning based on application-level metadata attributes: given attributes such as time interval, frequency of interest, location in the sky, etc., Pegasus is currently able to produce any virtual data product present in the LIGO pulsar search.
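The core of abstract-to-concrete planning can be sketched in miniature: prune jobs whose outputs already exist somewhere on the grid (the virtual-data optimization above), then assign the remaining jobs to sites. This is a toy analogue only; the real planner consults the replica and transformation catalogs, and all file, job, and site names here are illustrative:

```python
# Abstract workflow: job -> (output file, input files).
abstract = {
    "extract": ("channel.dat",  ["frame.dat"]),
    "sft":     ("sft.dat",      ["channel.dat"]),
    "image":   ("tf_image.dat", ["sft.dat"]),
}

# Toy replica catalog: files already materialized somewhere on the grid.
existing = {"frame.dat", "channel.dat"}

def concretize(abstract, existing, sites):
    """Drop jobs whose outputs exist; round-robin the rest onto sites."""
    needed = [job for job, (out, _) in abstract.items() if out not in existing]
    return {job: sites[i % len(sites)] for i, job in enumerate(needed)}

plan = concretize(abstract, existing, sites=["caltech", "usc", "uwm"])
print(plan)  # {'sft': 'caltech', 'image': 'usc'} -- 'extract' was pruned
```

The round-robin site choice stands in for the resource-selection interface listed as "in development" on the next slide.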
[architecture diagram, components: Request Manager; Metadata Catalog Service (MCS); Replica Location Service (RLS); Transformation Catalog; MDS; Current State Generator; Chimera; Abstract and Concrete Planner with in-time scheduler; VDL Generator and Submit File Generator for Condor-G; DAGMan Submission and Monitoring; Condor-G/DAGMan. Numbered flows: (1-3) metadata attributes from the user; (4-5) list of existing virtual data products matching the request, as logical file names (LFNs); (6) physical file names (PFNs); (7) current state; (8) metadata attributes and current state; (9b) derivations to Chimera; (10b) VDLx; (9-10) concrete DAG; (11) physical transformations; (12) execution environment information, plus the user's VO information and available resources from MDS; (13-14) DAGMan files; (15) DAG; (16) log files; (17) monitoring; (18) results. In development: resource selection interface, replica selection interface, abstract DAG reduction, metadata-driven configuration.]
LIGO’s pulsar search at SC2002
The pulsar search conducted at SC2002:
• Used LIGO data collected during the first scientific run of the instrument
• Targeted a set of 1000 locations of known pulsars as well as random locations in the sky
• Results of the analysis were published via LDAS (LIGO Data Analysis System) to the LIGO Scientific Collaboration
• Was performed using LDAS and compute and storage resources at Caltech, the University of Southern California, and the University of Wisconsin-Milwaukee
Results
SC2002 demo: over 58 pulsar searches, a total of
• 330 tasks
• 469 data transfers
• 330 output files
Total runtime: 11:24:35
To date: 185 pulsar searches, a total of
• 975 tasks
• 1365 data transfers
• 975 output files
Total runtime: 96:49:47
Virtual Galaxy Cluster System:
An Application of the GriPhyN Virtual Data Toolkit to Sloan Digital Sky Survey Data
Chicago, Argonne, Fermilab
The Brightest Cluster Galaxy Pipeline
[diagram: tsObj field data flows through the five stages (Field/tsObj → BRG → Core → Cluster → Catalog), with several fields feeding each BRG and core computation.]
Interesting intermediate data reuse is made possible by Chimera: maxBcg is a series of transformations.
1. Extract galaxies from the full tsObj data set.
2. Filter the field for Bright Red Galaxies.
3. Calculate the weighted BCG likelihood for each galaxy (the most expensive step).
4. Is this galaxy the most likely galaxy in the neighborhood?
5. Remove extraneous data, and store in a compact format.
Cluster finding works well with 1 Mpc radius apertures. If one were instead looking for the sites of gravitational lensing, one would rather use a 1/4 Mpc radius; this would start at transformation 3.
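The five stages and the reuse point can be made concrete with a small sketch: a lensing search re-enters the chain at stage 3 with a different aperture, reusing the cached BRG product from stages 1-2. All functions here are illustrative stand-ins for the real maxBcg transformations:

```python
# Stand-ins for the five maxBcg transformations; each returns a label
# recording what has been done, so the chain of derivations is visible.
def extract_galaxies(ts_obj):        return ts_obj + " -> galaxies"                   # stage 1
def filter_brg(galaxies):            return galaxies + " -> BRG"                      # stage 2
def bcg_likelihood(brg, radius_mpc): return f"{brg} -> likelihood(r={radius_mpc})"    # stage 3
def most_likely(core):               return core + " -> cluster"                      # stage 4
def compact(cluster):                return cluster + " -> catalog"                   # stage 5

def pipeline(ts_obj, radius_mpc=1.0, from_stage=1, cached_brg=None):
    """Run stages from_stage..5; a cached BRG product lets stage 3 be
    re-run with a new aperture without redoing stages 1-2."""
    brg = cached_brg if from_stage >= 3 else filter_brg(extract_galaxies(ts_obj))
    core = bcg_likelihood(brg, radius_mpc)
    return compact(most_likely(core))

full = pipeline("tsObj")                      # cluster finding, 1 Mpc aperture
brg = filter_brg(extract_galaxies("tsObj"))   # intermediate product, reusable
lensing = pipeline("tsObj", radius_mpc=0.25, from_stage=3, cached_brg=brg)
print(lensing)
```

In the real system it is Chimera, not the caller, that notices the BRG product already exists and starts the derivation at stage 3.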
The DAG
[figure: BRG → Core → Cluster → Catalog]
A DAG for 50 Fields
744 files, 387 nodes, 40 minutes (stage node counts: 108, 168, 60, 50)
With Jim Annis & Steve Kent, FNAL
Example: Sloan Galaxy Cluster Analysis
[figure: Sloan Data feeding the DAG, and a log-log plot of the galaxy cluster size distribution: number of clusters (1 to 100,000) vs. number of galaxies (1 to 100).]
Conclusion
Built a virtual cluster system based on Chimera and SDSS cluster finding.
Described the five stages and data dependencies in VDL.
Tested the system on a virtual data grid. Performance analysis is ongoing. The exercise helped improve Chimera.
Some CMS Issues/Challenges
How to generate more buy-in from the experiments? Sociological trust problem, not technical.
More exploitation of (virtual) collections of objects and further use of web services (work already well underway).
What is required to store the complete provenance of data generated in a grid environment?
Creation of collaborative peer-to-peer environments.
Data Challenge 2003-4: generate and analyze 5% of the expected data at startup (~1/2 year of continuous production).
What is the relationship between WorldGrid and the LCG?
Robust, portable applications!
Virtual Organization Management and Policy Enforcement.
Some ATLAS Issues/Challenges
How to generate more buy-in from the experiments? Sociological trust problem, not technical.
Fleshing out the notion of Pacman "Projects" and prototyping them.
What is the best integration path for Chimera infrastructure with international ATLAS catalog systems? Is a standardized Virtual Data API needed?
Packaging and distribution of ATLAS SW releases for each step in the production/analysis chain: gen, sim, reco, analysis.
The LCG SW application development environment is now SCRAM: ATLAS is evaluating a possible migration from CMT to SCRAM.
SDSS Challenges
Cluster Finding
• Distribution of clusters in the universe
• Evolution of the mass function
• Balanced I/O and compute
Power Spectrum
• Distribution of galaxies in the universe
• Direct constraints on cosmological parameters
• Compute intensive, prefer MPI systems
• Premium on discovering similar results
Analyses based on pixel data
• Weak lensing analysis of the SDSS coadded southern survey data
• Near-Earth asteroid searches
• Galaxy morphological properties: NVO Galaxy Morphology Demo
• All involve moving around terabytes of data, or choosing not to
LIGO Challenges