
Transcript of LIGO LSC DataGrid Workshop March 24-26, 2005 Livingston Observatory.

LIGO LSC DataGrid Workshop

March 24-26, 2005

Livingston Observatory

Part One: Introduction

• A: Workshop Agenda and Pragmatics

• B: Defining “the Grid”

• C: Who’s Who in the Grid World

• D: Overview of the LSC DataGrid

• E: Lab 1: Getting Started

A: Workshop Agenda and Pragmatics

Workshop Agenda

• Thursday, March 24
  • Introduction
  • Grid Security
  • Data Management

• Friday, March 25
  • Job Management
  • Workflow Management
  • MyProxy (Coming Attractions!)

• Saturday, March 26
  • Local Presentations

Preparation for the Labs

• We assume a Red Hat 9 installation
  • Although it’s not impossible that other platforms may work just as well.

• We’ll assume you’ve installed the LSC DataGrid Client Toolkit.

• We assume your security credentials are already in place.
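A quick way to confirm that last point before the labs is a minimal sanity check from the shell. This is a sketch, assuming your certificate lives in the default $HOME/.globus location used by the Globus tools:

  # The toolkit expects these two files (the key should be readable only by you)
  $ ls ~/.globus
  usercert.pem  userkey.pem

  # Print the subject and expiration date of your certificate
  $ grid-cert-info -subject -enddate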

Bio-Imperatives

• Food
  • Lunches
  • Dinner

• Plumbing

Temporal Disclaimer

• The state of the art is: the art is always changing.

• Grid infrastructure standards are, however, firming up.

• For the most part, we’re going to be talking about how things work at the moment.

• We’ll warn you when we go into Coming Attractions mode.

Who Are Those Guys?

• GRIDS Center
  • David Gehrig, NCSA-UIUC
  • Mike Freemon, NCSA-UIUC
  • Jaime Frey, University of Wisconsin–Madison

Now, everybody—

B: Defining “the Grid”

“Grid”

• Buzzword of the year(s).

• In enterprise computing, different meanings at different times.
  • It often simply means “cluster computing.”

• In research, it usually means…

Definition: 1998

“A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities.”

Ian Foster and Carl Kesselman: The Grid: Blueprint for a New Computing Infrastructure

Definition: 2002

“A Grid is a system that

• coordinates resources that are not subject to centralized control

• using standard, open, general-purpose protocols and interfaces

• to deliver nontrivial qualities of service.”

Ian Foster, ANL: What is the Grid? A Three-Point Checklist

A Working Definition

• A distributed computing environment that coordinates
  • Computational jobs
  • Data placement
  • Information management

• Scales from one computer to thousands

• Capable of working across many administrative domains

C: Who’s Who in the Grid World

NSF Middleware Initiative

• Middleware: an evolving layer of services that resides between the network and more traditional applications for managing security, access, and information exchange

• www.nsf-middleware.org
• Funds GRIDS Center
• Funds Open Grid Computing Environment

GRIDS Center

• www.grids-center.org

• Grid Research Integration, Deployment, and Support Center

• Mission: making grid technology deployable and useful outside the development labs
  • Packaging
  • Education

The Globus Alliance

• www.globus.org

• Creates core infrastructure services

• Sponsors include:
  • DARPA, DoE, NSF, NASA
  • e-Science (UK), Vetenskapsrådet (Sweden), KTH (Royal Institute of Technology, Stockholm)
  • IBM, Microsoft Research, Cisco Systems

Globus: Participating Institutions

• Argonne National Laboratory

• Information Sciences Institute/USC

• University of Chicago

• University of Edinburgh (UK)

• Center for Parallel Computers (Sweden)

• “Globus Academic Affiliates”

Globus Toolkit: GT3

• Software services and libraries (a command-line sketch follows)
  • Resource monitoring, discovery, and management
  • Security
  • File management

• Note! GT4: Expected release sixth quarter of 2004
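To make those categories concrete, here is a minimal sketch using the pre-WS command-line tools that ship with the toolkit; grid.example.edu is a placeholder host, and your site’s gatekeeper and paths will differ:

  # Security: create a short-lived proxy from your certificate
  $ grid-proxy-init

  # Resource management: run a command through a remote gatekeeper
  $ globus-job-run grid.example.edu /bin/hostname

  # File management: copy a file over GridFTP
  $ globus-url-copy gsiftp://grid.example.edu/tmp/data.dat file:///tmp/data.dat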

PyGlobus

• www-itg.lbl.gov/gtg/projects/pyGlobus/

• Lawrence Berkeley National Laboratory

• An interface to the Globus toolkit using the Python scripting language

Condor

• A serial/parallel job management system for a pool of compute nodes:
  • Job queueing mechanism, scheduling policy, priority scheme, resource monitoring, and resource management.

• Can be used with the Globus Toolkit

• www.cs.wisc.edu/condor/

• We’ll use “local Condor” and Condor-G (a submit-file sketch follows)
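To give a flavor of local Condor, here is a minimal sketch of a submit description file; the executable and file names are placeholders. (A Condor-G job of this vintage would use the globus universe plus a globusscheduler line instead of vanilla.)

  # hostname.sub -- minimal Condor submit description (placeholder names)
  universe   = vanilla
  executable = /bin/hostname
  output     = hostname.out
  error      = hostname.err
  log        = hostname.log
  queue

  # Submit the job, then watch the queue
  $ condor_submit hostname.sub
  $ condor_q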

iVDGL: International Virtual Data Grid Laboratory

• www.ivdgl.org

• Goals
  • Deploy a Grid laboratory
  • Use Grid software tools in experiments
  • Support delivery of Grid technologies
  • Education and outreach

• iVDGL pacman and VDT

• LSC is an active participant

GriPhyN: Grid Physics Network

• www.griphyn.org

• Coalesced around four experiments
  • Compact Muon Solenoid and ATLAS (“A Toroidal LHC ApparatuS”) at LHC/CERN
  • Laser Interferometer Gravitational-wave Observatory
  • Sloan Digital Sky Survey

• Petabytes of data annually

VDT: Virtual Data Toolkit

• www.cs.wisc.edu/vdt/

• Goal: to make it as easy as possible for users to deploy, maintain and use grid middleware

• Initially developed by GriPhyN and iVDGL

• Now includes the LHC Computing Grid (LCG) and the Particle Physics Data Grid (PPDG).

VDT: Components

• Basic Grid Services
  • Condor, Globus

• Virtual Data Tools
  • Virtual Data System

• Utilities
  • Such as GSI-OpenSSH (a login sketch follows)
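GSI-OpenSSH, for instance, gives you ssh-style logins and copies authenticated by your Grid proxy rather than a password. A minimal sketch, assuming a valid proxy and using the head node named later in this workshop:

  # Log in with your Grid proxy instead of a password
  $ gsissh ldas-grid.ligo-la.caltech.edu

  # scp equivalent (the file name is a placeholder)
  $ gsiscp results.dat ldas-grid.ligo-la.caltech.edu: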

D: Overview of the LSC DataGrid

What is the LSC DataGrid?

• A collection of LSC computational and storage resources…

• … linked through Grid middleware…

• … into a uniform LSC data analysis environment.

LSC DataGrid Sites

• Tier 1: Caltech

• Tier 2: UWM and PSU

• Tier 3: UT-Brownsville and Salish Kootenai College (SKC)

• Linux clusters at GEO sites Birmingham, Cardiff, and the Albert Einstein Institute (AEI)

• LDAS instances at Caltech, MIT, PSU, and UWM

For this Workshop

• LSC DataGrid Sites
  • ldas-grid.ligo.caltech.edu
  • ldas-grid.ligo-wa.caltech.edu
  • ldas-grid.ligo-la.caltech.edu

• We’ll use ldas-grid.ligo-la.caltech.edu as our head node (a quick connectivity check follows below)

• Full list of LSC DataGrid resources at www.lsc-group.phys.uwm.edu/lscdatagrid/resources

• More discussion of LSC DataGrid later
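A quick connectivity check against that head node, assuming you already hold a valid proxy and the default job manager is running there:

  # Should echo the remote machine's name back to you
  $ globus-job-run ldas-grid.ligo-la.caltech.edu /bin/hostname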

E: Lab 1: Getting Started

Lab 1: Getting Started

This lab will verify (the commands are sketched after this list):

• Your software is installed correctly

• Your sacrifices have pleased the webgod Ping

• Your security credential (i.e. proxy certificate) is okay

• Your environment variables won’t suddenly go away
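In command terms, those checks amount to roughly the following sketch; exact output will vary, and the host name matches the head node chosen earlier:

  # Network reachability
  $ ping -c 1 ldas-grid.ligo-la.caltech.edu

  # Create a proxy certificate (prompts for your key's pass phrase)
  $ grid-proxy-init

  # Confirm the proxy exists and see how long it has left
  $ grid-proxy-info

  # GLOBUS_LOCATION should point at your client toolkit installation
  $ echo $GLOBUS_LOCATION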

Credits

• Some slides in this presentation were adapted from presentations by
  • GriPhyN Grid Summer Workshop 2004
  • The Globus Consortium