LIGO LSC DataGrid Workshop
March 24-26, 2005
Livingston Observatory
Part One: Introduction
• A: Workshop Agenda and Pragmatics
• B: Defining “the Grid”
• C: Who’s Who in the Grid World
• D: Overview of the LSC DataGrid
• E: Lab 1: Getting Started
A: Workshop Agenda and Pragmatics
Workshop Agenda
• Thursday, March 24
  • Introduction
  • Grid Security
  • Data Management
• Friday, March 25
  • Job Management
  • Workflow Management
  • MyProxy (Coming Attractions!)
• Saturday, March 26
  • Local Presentations
Preparation for the Labs
• We assume a RedHat 9 installation.
  • Other platforms may work just as well.
• We’ll assume you’ve installed the LSC DataGrid Client Toolkit.
• We assume your security credentials are already in place.
Bio-Imperatives
• Food
  • Lunches
  • Dinner
• Plumbing
Temporal Disclaimer
• The state of the art is: the art is always changing.
• Grid infrastructure standards are, however, firming up.
• For the most part, we’re going to be talking about how things work at the moment.
• We’ll warn you when we go into Coming Attractions mode.
Who Are Those Guys?
• GRIDS Center
  • David Gehrig, NCSA-UIUC
  • Mike Freemon, NCSA-UIUC
  • Jaime Frey, University of Wisconsin-Madison
Now, everybody—
B: Defining “the Grid”
“Grid”
• Buzzword of the year(s).
• In enterprise computing, different meanings at different times.
  • It often simply means “cluster computing.”
• In research, it usually means…
Definition: 1998
“A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities.”
Ian Foster and Carl Kesselman: The Grid: Blueprint for a New Computing Infrastructure
Definition: 2002
“A Grid is a system that
• coordinates resources that are not subject to centralized control
• using standard, open, general-purpose protocols and interfaces
• to deliver nontrivial qualities of service.”
Ian Foster, ANL: What is the Grid? A Three-Point Checklist
A Working Definition
• A distributed computing environment that coordinates
  • Computational jobs
  • Data placement
  • Information management
• Scales from one computer to thousands
• Capable of working across many administrative domains
C: Who’s Who in the Grid World
NSF Middleware Initiative
• Middleware: an evolving layer of services that resides between the network and more traditional applications for managing security, access, and information exchange
• www.nsf-middleware.org
• Funds GRIDS Center
• Funds Open Grid Computing Environment
GRIDS Center
• www.grids-center.org
• Grid Research Integration, Deployment, and Support Center
• Mission: making grid technology deployable and useful outside the development labs
  • Packaging
  • Education
The Globus Alliance
• www.globus.org
• Creates core infrastructure services
• Sponsors include:
  • DARPA, DoE, NSF, NASA
  • e-Science (UK), Vetenskapsrådet (Sweden), KTH (Royal Institute of Technology, Stockholm)
  • IBM, Microsoft Research, Cisco Systems
Globus: Participating Institutions
• Argonne National Laboratory
• Information Sciences Institute/USC
• University of Chicago
• University of Edinburgh (UK)
• Center for Parallel Computers (Sweden)
• “Globus Academic Affiliates”
Globus Toolkit: GT3
• Software services and libraries
  • Resource monitoring, discovery, and management
  • Security
  • File management
• Note! GT4: Expected release sixth quarter of 2004
PyGlobus
• www-itg.lbl.gov/gtg/projects/pyGlobus/
• Lawrence Berkeley National Laboratory
• An interface to the Globus toolkit using the Python scripting language
Condor
• A serial/parallel job management system for a pool of compute nodes:
  • job queueing mechanism, scheduling policy, priority scheme, resource monitoring, and resource management.
• Can be used with Globus Toolkit
• www.cs.wisc.edu/condor/
• We’ll use “local Condor” and Condor-G
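To make the job queueing idea concrete, here is a minimal sketch of how a job could be described to a local Condor pool. It is an illustration under stated assumptions, not the workshop's lab material: the helper name render_submit, the file names, and the choice of the vanilla universe are all hypothetical. Condor-G jobs need a different universe and a line naming the remote resource, which this sketch does not attempt.

```python
def render_submit(executable, arguments="", universe="vanilla", count=1):
    """Build the text of a minimal Condor submit description file.

    Hypothetical helper for illustration; the file names below are assumptions.
    """
    lines = [
        "universe   = " + universe,
        "executable = " + executable,
    ]
    if arguments:
        lines.append("arguments  = " + arguments)
    lines += [
        # $(Cluster) and $(Process) are Condor macros that keep per-job files apart.
        "output     = job.$(Cluster).$(Process).out",
        "error      = job.$(Cluster).$(Process).err",
        "log        = job.$(Cluster).log",
        # "queue N" asks Condor to enqueue N copies of the job.
        "queue " + str(count),
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_submit("/bin/hostname"), end="")
```

Saving the printed text to a file and passing that file to condor_submit is the usual way to hand such a description to the queue.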
iVDGL: International Virtual Data Grid Laboratory
• www.ivdgl.org
• Goals
  • Deploy a Grid laboratory
  • Use Grid software tools in experiments
  • Support delivery of Grid technologies
  • Education and outreach
• iVDGL pacman and VDT
• LSC is an active participant
GriPhyN: Grid Physics Network
• www.griphyn.org
• Coalesced around four experiments
  • Compact Muon Solenoid and ATLAS (“A Toroidal LHC ApparatuS”) at LHC/CERN
  • Laser Interferometer Gravitational-wave Observatory
  • Sloan Digital Sky Survey
• Petabytes of data annually
VDT: Virtual Data Toolkit
• www.cs.wisc.edu/vdt/
• Goal: to make it as easy as possible for users to deploy, maintain and use grid middleware
• Initially developed by GriPhyN and iVDGL
• Now includes LHC Computing Grid (LCG) and Particle Physics Data Grid (PPDG).
VDT: Components
• Basic Grid Services
  • Condor, Globus
• Virtual Data Tools
  • Virtual Data System
• Utilities
  • Such as GSI-OpenSSH
D: Overview of the LSC DataGrid
What is the LSC DataGrid?
• A collection of LSC computational and storage resources…
• … linked through Grid middleware…
• … into a uniform LSC data analysis environment.
LSC DataGrid Sites
• Tier 1: Caltech
• Tier 2: UWM and PSU
• Tier 3: UT-Brownsville and Salish Kootenai College (SKC)
• Linux clusters at GEO sites Birmingham, Cardiff and the Albert Einstein Institute (AEI)
• LDAS instances at Caltech, MIT, PSU, and UWM
For this Workshop
• LSC DataGrid Sites
  • ldas-grid.ligo.caltech.edu
  • ldas-grid.ligo-wa.caltech.edu
  • ldas-grid.ligo-la.caltech.edu
• We’ll use ldas-grid.ligo-la.caltech.edu as our head node
• Full list of LSC DataGrid resources at www.lsc-group.phys.uwm.edu/lscdatagrid/resources
• More discussion of LSC DataGrid later
E: Lab 1 — Getting Started
Lab 1 — Getting Started
This lab will verify:
• Your software is installed correctly
• Your sacrifices have pleased the webgod Ping
• Your security credential (i.e. proxy certificate) is okay
• Your environment variables won’t suddenly go away
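A rough sketch of what the first and last checks might look like when scripted. The variable and command names here (GLOBUS_LOCATION, grid-proxy-info, condor_q) are assumptions about a typical Globus and Condor client install, not names taken from these slides, so adjust them to match your own setup. The script only reports what it finds; it does not check the network or validate the proxy certificate.

```python
import os
import shutil

# Assumed names, not taken from the slides: adjust to your installation.
REQUIRED_VARS = ["GLOBUS_LOCATION"]
REQUIRED_TOOLS = ["grid-proxy-info", "condor_q"]


def check_setup(env=None, which=shutil.which):
    """Return a dict mapping each check name to True (found) or False (missing)."""
    env = os.environ if env is None else env
    results = {}
    for var in REQUIRED_VARS:
        # A variable counts as set only if it is present and non-empty.
        results["env:" + var] = bool(env.get(var))
    for tool in REQUIRED_TOOLS:
        # shutil.which returns None when the tool is not on PATH.
        results["tool:" + tool] = which(tool) is not None
    return results


if __name__ == "__main__":
    for name, ok in check_setup().items():
        print(("ok      " if ok else "MISSING ") + name)
```

Running it in a fresh login shell, rather than the shell where you first set things up, is one way to see whether the environment variables survive a new session.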
Credits
• Some slides in this presentation were adapted from presentations from
  • GriPhyN Grid Summer Workshop 2004
  • The Globus Consortium