Foundations for an LHC Data Grid
Stu Loken
Berkeley Lab
The Message
• Large-scale Distributed Computing (known as Grids) is a major thrust of the U.S. Computing community
• Annual investment in Grid R&D and infrastructure is ~$100M
• This investment can and should be leveraged to provide the Regional computing model for the LHC
The Vision for the Grid
• Persistent, Universal and Ubiquitous Access to Networked Resources
• Common Tools and Infrastructure for Building 21st Century Applications
• Integrating HPC, Data-Intensive Computing, Remote Visualization and Advanced Collaboration Technologies
The Grid from a Services View
[Figure: the Grid as a layered architecture, from applications down to resources]
• Applications: Chemistry, Biology, Cosmology, High Energy Physics, Environment
• Application Toolkits: Distributed Computing, Data-Intensive, Collaborative, Remote Visualization, Problem Solving, and Remote Instrumentation applications toolkits
• Grid Services (Middleware): resource-independent and application-independent services, e.g., authentication, authorization, resource location, resource allocation, events, accounting, remote data access, information, policy, fault detection
• Grid Fabric (Resources): resource-specific implementations of basic services, e.g., transport protocols, name servers, differentiated services, CPU schedulers, public key infrastructure, site accounting, directory service, OS bypass
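Nothing in the talk prescribes code, but the layering can be made concrete with a small sketch: a resource-independent middleware interface that applications code against, backed by a resource-specific fabric implementation. All names here (Authenticator, PKIAuthenticator, the verify callable) are invented for illustration.

```python
from abc import ABC, abstractmethod

class Authenticator(ABC):
    """Resource-independent service interface (Grid Services layer):
    toolkits and applications code against this, never against a site."""
    @abstractmethod
    def authenticate(self, user: str, credential: bytes) -> bool: ...

class PKIAuthenticator(Authenticator):
    """Resource-specific implementation (Grid Fabric layer), e.g. one
    site's public key infrastructure. The verify callable is a stand-in."""
    def __init__(self, site_pki_verify):
        self.verify = site_pki_verify

    def authenticate(self, user, credential):
        return self.verify(user, credential)

# An application toolkit receives an Authenticator and stays portable
# across sites, whatever each site's fabric actually uses.
auth: Authenticator = PKIAuthenticator(lambda u, c: bool(u and c))
print(auth.authenticate("alice", b"cert-bytes"))  # True
```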
Grid-based Computing Projects
• China Clipper
• Particle Physics Data Grid
• NASA Information Power Grid: Distributed Problem Solving
• Access Grid: The Future of Distributed Collaboration
Clipper Project
• ANL-SLAC-Berkeley
• Push the limits of very high-speed data transmission
• Builds on Globus middleware and high-performance distributed storage
• Demonstrated data rates up to 50 Mbytes/sec
China Clipper Tasks
• High-Speed Testbed – Computing and networking infrastructure
• Differentiated Network Services – Traffic shaping on ESnet (see the sketch after this list)
• Monitoring Architecture – Traffic analysis to support traffic shaping and CPU scheduling
• Data Architecture – Transparent management of data
• Application Demonstration – Standard Analysis Framework (STAF)
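As a concrete (and purely illustrative) instance of differentiated network services, an application can mark its traffic with a DSCP codepoint so that diffserv-capable routers shape the flow. The talk does not specify any particular codepoint, so AF11 below is an assumption.

```python
import socket

# Mark a bulk-transfer socket with a DSCP codepoint (AF11, "assured
# forwarding") so diffserv-capable routers can shape this flow.
# The DSCP occupies the upper six bits of the IP TOS byte.
AF11 = 0x0A
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, AF11 << 2)
```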
China Clipper Testbed
Clipper Architecture
Monitoring
End-to-end monitoring of the assets in a computational grid is necessary both for resolving network throughput problems and for dynamically scheduling resources.
China Clipper adds precision-timed event monitor agents to:
– ATM switches
– DPSS servers
– Testbed computational resources
• Produce trend analysis modules for monitor agents
• Make results available to applications
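A minimal sketch of what such a monitor agent might emit, assuming an invented UDP/JSON wire format and collector host; the actual Clipper agents used their own event-log formats.

```python
import json, socket, time

COLLECTOR = ("monitor.example.org", 9999)  # hypothetical collector host

def emit_event(component, event, value):
    """Send one timestamped monitoring event. Precision timing assumes
    the agent hosts are clock-synchronized (e.g., via NTP)."""
    record = {
        "ts": time.time(),            # seconds since the epoch
        "host": socket.gethostname(),
        "component": component,       # e.g. "dpss-server-3"
        "event": event,               # e.g. "net.bytes_sent"
        "value": value,
    }
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(json.dumps(record).encode(), COLLECTOR)

emit_event("dpss-server-3", "net.bytes_sent", 52_428_800)
```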
Particle Physics Data Grid
• HENP Labs and Universities (Caltech-SLAC lead)
• Extend GRID concept to large-scale distributed data analysis
• Uses NGI testbeds as well as production networks
• Funded by DOE-NGI program
NGI: “Particle Physics Data Grid”
Participants: ANL (CS/HEP), BNL, Caltech, FNAL, JLAB, LBNL (CS/HEP), SDSC, SLAC, U. Wisconsin
High-Speed Site-to-Site File Replication Service
FIRST YEAR:
• SLAC-LBNL at least;
• Goal intentionally requires > OC12;
• Use existing hardware and networks (NTON);
• Explore “Diffserv”, instrumentation, reservation/allocation.
Bulk Transfer Service: 100 Mbytes/s, 100 Tbytes/year
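These targets explain the OC-12 remark: 100 Mbytes/s is 800 Mbit/s, above the ~622 Mbit/s line rate of an OC-12 circuit, so the goal cannot be met on a single OC-12. A quick check of the arithmetic:

```python
rate_bps = 100e6 * 8        # 100 Mbytes/s in bits/s -> 800 Mbit/s
oc12_bps = 622.08e6         # OC-12 line rate
print(rate_bps / oc12_bps)  # ~1.29: the target deliberately exceeds OC-12

seconds = 100e12 / 100e6    # 100 Tbytes moved at 100 Mbytes/s
print(seconds / 86400)      # ~11.6 days of continuous transfer per year
```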
[Diagram: a Primary Site (Data Acquisition, CPU, Disk, Tape Robot) replicating data to a partial Replica Site (CPU, Disk, Tape Robot)]
NGI: “Particle Physics Data Grid”
Deployment of Multi-Site Cached File Access
[Diagram: a Primary Site (Data Acquisition, CPU, Disk, Tape Robot) serving several Satellite Sites (CPU, Disk, Tape Robot), which in turn serve University sites (CPU, Disk) and their Users]
FIRST YEAR:
• Read access only;
• Optimized for 1-10 GB files;
• File-level interface to ODBMSs;
• Maximal use of Globus, MCAT, SAM, OOFS, Condor, Grand Challenge etc.;
• Focus on getting users. (A cached-access sketch follows.)
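A minimal sketch of the read-only, whole-file cached access pattern listed above, assuming an invented cache directory and primary-site URL; the real deployment planned to use Globus, MCAT, OOFS and related machinery rather than plain HTTP.

```python
import os, shutil, urllib.request

CACHE = "/var/cache/ppdg"                      # hypothetical satellite cache
PRIMARY = "https://primary.example.org/data"   # hypothetical primary site

def open_cached(name):
    """Read-only file access: serve from the local satellite cache,
    fetching the whole file from the primary site on a miss."""
    local = os.path.join(CACHE, name)
    if not os.path.exists(local):              # cache miss
        os.makedirs(CACHE, exist_ok=True)
        with urllib.request.urlopen(f"{PRIMARY}/{name}") as src, \
             open(local, "wb") as dst:
            shutil.copyfileobj(src, dst)       # whole-file transfer
    return open(local, "rb")                   # read-only handle

# Usage (would fetch on first call, hit the cache afterwards):
# header = open_cached("run1234.evt").read(4096)
```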
Information Power Grid
Distributed High-Performance Computing, Large-Scale Data Management, and Collaboration Environments for Science and Engineering
Building Problem-Solving Environments
William E. Johnston, Dennis Gannon, William Nitzberg
IPG Problem Environment
IPG Requirements
• Multiple datasets
• Complex workflow scenarios
• Data-streams from instrument systems
• Sub-component simulations coupled simultaneously
• Appropriate levels of abstraction
• Search, interpret and fuse multiple data archives
• Share all aspects of work processes
• Bursty resource availability and scheduling
• Sufficient available resources
• VR and immersive techniques
• Software agents to assist in routine/repetitive tasks
• All this will be supported by the Grid. PSEs are the primary scientific/engineering user interface to the Grid. (A toy workflow sketch follows.)
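As a toy illustration only: the workflow and data-fusion requirements above amount to chaining stages in dependency order, which a sketch can show. Every name below is invented; IPG itself built such workflows on Globus services, not on this code.

```python
# A toy problem-solving-environment workflow: fuse two data archives,
# feed a sub-component simulation, then return the result.

def fuse_archives(a: dict, b: dict) -> dict:
    """'Search, interpret and fuse multiple data archives' (toy version)."""
    return {**a, **b}

def simulate(inputs: dict, steps: int = 10) -> dict:
    """Stand-in for a coupled sub-component simulation."""
    state = inputs.get("initial", 0.0)
    for _ in range(steps):
        state = 0.5 * state + 1.0   # converges toward 2.0
    return {"final_state": state}

def workflow(archive_a: dict, archive_b: dict) -> dict:
    # Stages run in dependency order, as a PSE's workflow engine would.
    return simulate(fuse_archives(archive_a, archive_b))

print(workflow({"initial": 4.0}, {"mesh": "coarse"}))
```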
The Future of Distributed Collaboration Technology:
The Access Grid
Ian Foster, Rick Stevens
Argonne National Laboratory
Beyond Teleconferencing:
• Physical spaces to support distributed groupwork
• Virtual collaborative venues
• Agenda driven scenarios and work sessions
• Integration with integrated Grid services
Access Grid Project Goals
• Enable Group-to-Group Interactions at a Distance
• Provide a Sense of Presence
• Use Quality but Affordable Digital IP-Based Audio/Video (Open Source) – see the multicast sketch below
• Enable Complex Multi-site Visual and Collaborative Experiences
• Build on Integrated Grid Services Architecture
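Group-to-group IP audio/video of this kind classically rides on IP multicast, so one sent packet reaches every site that has joined a venue's group address. A minimal sketch with an invented group address and payload; real Access Grid nodes used dedicated open-source media tools, not this code.

```python
import socket, struct

GROUP, PORT = "224.2.0.1", 51000   # invented multicast "venue" address

# Receiver: join the multicast group; every member site then gets
# each packet exactly once from the network.
recv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
recv.bind(("", PORT))
mreq = struct.pack("4sl", socket.inet_aton(GROUP), socket.INADDR_ANY)
recv.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

# Sender: one send reaches all joined sites (TTL bounds its reach).
send = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 32)
send.sendto(b"one audio/video frame", (GROUP, PORT))

print(recv.recvfrom(65536))        # the frame, as every member sees it
```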
The Docking Concept for Access Grid
Private Workspaces - Docked into the Group Workspace
[Diagram: two docked workspaces, each equipped with an ambient tabletop mic, a presenter mic, a presenter camera, and an audience camera]
Access Grid Nodes
• Access Grid Nodes Under Development:
– Library, Workshop
– ActiveMural Room
– Office
– Auditorium
Conclusion
A set of closely coordinated projects is laying the foundation for a high-performance distributed computing environment.
There appear to be good prospects for a significant long-term investment to deploy the necessary infrastructure to support Particle Physics Data Analysis.