An overview of the EGEE infrastructure and middleware

Post on 11-Jan-2016

EGEE is funded by the European Union under contract IST-2003-508833

Elena Slabospitskaya, IHEP

NA3 manager for Russia

Sources of information

LCG-2 User Guide: https://edms.cern.ch/file/454439//LCG-2-UserGuide.html

LCG Releases: http://grid-deployment.web.cern.ch/grid-deployment/cgi-bin/index.cgi?var=releases

LCG-2 Install Notes (for administrators)

LCG-2 Manual Installation Guide (for administrators): https://edms.cern.ch/file/434070//LCG2Install.html

Site with EDG Tutorials: http://hep-proj-grid-tutorials.web.cern.ch/hep-proj-grid-tutorials/

Overall

1. GSI – Grid Security Infrastructure

2. Information System

3. Job Management

4. Data Management

5. Monitoring System

Conclusions

Main Logical Machine Types (1)

[Figure: logical machine types across sites. Labels: RMS (CERN), PS, RB (MSU), BDII (MSU); SE, CE and UI nodes at Protvino (IHEP), Dubna (JINR), Moscow (SINP MSU) and Moscow (ITEP).]

Main Logical Machine Types (2)

Distributed system – a collection of (probably heterogeneous) automata whose distribution is transparent to the user, so that the system appears as one local machine.

UI – User Interface

CE – Compute Element: Grid Gate (GG) and Worker Nodes

GG – Grid Gate: Globus Gatekeeper, Globus Resource Allocation Manager, master server of the Local Resource Management System, local Logging and Bookkeeping server

SE – Classic Storage Element: a GridFTP server. An SE may control large disk arrays or a Mass Storage System (MSS). These storage resources are managed by a Storage Resource Manager (SRM). The SRM interacts with the OS, the MSS and the transfer protocols (to perform file transfer operations). As MSS, LCG-2 supports the dCache disk pool (GridFTP and rfio) and the tape archiving systems Castor (GridFTP and rfio) and Enstore (GridFTP).

RB – Resource Broker

RMS – Replica Management System

BDII – Berkeley DB Information Index

PS – Proxy Server

How do I log in to the Grid?

Two basic concepts:

Authentication: Who am I? “Equivalent” to a passport, ID card, etc.

Authorisation: What can I do? Certain permissions, duties, etc.

The Grid Security Infrastructure (GSI) in LCG-2 enables secure authentication and communication over an open network. GSI is based on public key encryption, X.509 certificates, and the Secure Sockets Layer (SSL) communication protocol.

Information System

- Provides information about grid resources and their status

- GLUE (Grid Laboratory for a Uniform Environment) schema – common conceptual data model for CE, SE and the CE-SE binding

- MDS (Monitoring and Discovery Service) from Globus has been adopted as the provider of the IS

- The IS implements the GLUE schema using OpenLDAP, an implementation of the Lightweight Directory Access Protocol (LDAP)

- GRIS – Grid Resource Information System – local on CE and SE

- GIIS – Grid Index Information Service – site (CE)

- BDII – Berkeley DB Information Index
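The three levels named for the Information System (GRIS on each CE and SE, GIIS per site, BDII above them) can be pictured as successive merges of published entries. The sketch below is only an illustration under that assumption: the function names, attribute keys and host names are hypothetical, plain Python lists stand in for the LDAP servers, and no real MDS or BDII API is used.

```python
def gris(resource, kind, **attributes):
    """Entries a single CE or SE publishes about itself (local level)."""
    return [{"resource": resource, "kind": kind, **attributes}]


def giis(site, *gris_outputs):
    """Site level: merge the entries of every GRIS at one site."""
    return [{"site": site, **entry} for entries in gris_outputs for entry in entries]


def bdii(*giis_outputs):
    """Top level: one flat index over all sites."""
    return [entry for entries in giis_outputs for entry in entries]


def query(index, **conditions):
    """Return the entries whose attributes equal all given conditions."""
    return [e for e in index if all(e.get(k) == v for k, v in conditions.items())]


# Hypothetical sites and resources, for illustration only.
SITE_A = giis(
    "siteA",
    gris("ce.a.example.org", "CE", free_cpus=12),
    gris("se.a.example.org", "SE", free_gb=500),
)
SITE_B = giis("siteB", gris("ce.b.example.org", "CE", free_cpus=0))
INDEX = bdii(SITE_A, SITE_B)

if __name__ == "__main__":
    for entry in query(INDEX, kind="CE"):
        print(entry["site"], entry["resource"], entry["free_cpus"])
```

A user-level tool would normally ask only the top level, which is why the sketch exposes a single `query` over the merged index.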

Information system in LCG-2

An LDAP Information System is based on entries. Each entry describes an object (a person, a computer, etc.) and has a unique Distinguished Name (DN). The kind of information that can be stored in each entry is specified in an LDAP schema.

Directory Information Tree

Directory Information Tree (DIT) – a tree of directory entries
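To make the relation between DNs and the DIT concrete, here is a minimal sketch that rebuilds a tree from a handful of DNs. The DNs, host names and site names are hypothetical placeholders chosen for illustration (they only imitate a GLUE-style layout), and the parser is simplified: it assumes values contain no escaped commas.

```python
def parse_dn(dn):
    """Split a DN into (attribute, value) pairs, leaf first.

    Simplification: values must not contain escaped commas.
    """
    pairs = []
    for rdn in dn.split(","):
        attr, _, value = rdn.strip().partition("=")
        pairs.append((attr, value))
    return pairs


def build_dit(dns):
    """Build a nested dict keyed by 'attr=value' strings, root first."""
    tree = {}
    for dn in dns:
        node = tree
        # A DN lists the leaf first, so walk it in reverse to start at the root.
        for attr, value in reversed(parse_dn(dn)):
            node = node.setdefault(f"{attr}={value}", {})
    return tree


def render(tree, depth=0):
    """Return the tree as indented lines, children sorted by name."""
    lines = []
    for rdn in sorted(tree):
        lines.append("  " * depth + rdn)
        lines.extend(render(tree[rdn], depth + 1))
    return lines


# Hypothetical entries; not taken from a real BDII.
EXAMPLE_DNS = [
    "GlueCEUniqueID=ce.a.example.org,mds-vo-name=siteA,mds-vo-name=local,o=grid",
    "GlueSEUniqueID=se.a.example.org,mds-vo-name=siteA,mds-vo-name=local,o=grid",
    "GlueCEUniqueID=ce.b.example.org,mds-vo-name=siteB,mds-vo-name=local,o=grid",
]

if __name__ == "__main__":
    print("\n".join(render(build_dit(EXAMPLE_DNS))))
```

Because every entry's DN ends with the DN of its parent, the whole tree can be recovered from the DNs alone, which is what `build_dit` relies on.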

LDAP directory of an LCG-2 BDII

Job management

The Workload Management System (WMS) services usually run on the Resource Broker. They are:

Network Server (NS), which accepts the incoming job requests from the UI and provides the job control functionality.

Workload Manager, which is the core component of the system.

Match-Maker (also called Resource Broker), whose duty is to find the best resource matching the requirements of a job (the match-making process).

Job Adapter, which prepares the environment for the job and its final description before passing it to the Job Control Service.

Job Control Service (JCS), which finally performs the actual job management operations (job submission, removal, ...).

Logging and Bookkeeping service (LB), which logs all job management Grid events; these can then be retrieved by users or system administrators for monitoring or troubleshooting.
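The match-making step can be sketched as "filter, then rank". The example below is a toy illustration, not the Match-Maker's actual algorithm: the dictionary keys, the ranking criterion (shortest estimated response time) and the host names are all assumptions made for this sketch.

```python
def match_make(job, compute_elements):
    """Return the CEs that satisfy the job's requirements, best-ranked first."""
    candidates = [
        ce
        for ce in compute_elements
        if job["vo"] in ce["supported_vos"]
        and ce["free_cpus"] >= job["min_free_cpus"]
        and ce["max_cpu_time_min"] >= job["cpu_time_min"]
    ]
    # Assumed ranking: prefer the CE expected to start the job soonest.
    return sorted(candidates, key=lambda ce: ce["estimated_response_s"])


# Hypothetical resource descriptions, as an information system might publish them.
CES = [
    {"id": "ce1.example.org", "free_cpus": 4, "max_cpu_time_min": 2880,
     "supported_vos": {"dteam", "atlas"}, "estimated_response_s": 600},
    {"id": "ce2.example.org", "free_cpus": 0, "max_cpu_time_min": 2880,
     "supported_vos": {"dteam"}, "estimated_response_s": 30},
    {"id": "ce3.example.org", "free_cpus": 10, "max_cpu_time_min": 2880,
     "supported_vos": {"dteam"}, "estimated_response_s": 120},
    {"id": "ce4.example.org", "free_cpus": 8, "max_cpu_time_min": 60,
     "supported_vos": {"dteam"}, "estimated_response_s": 10},
]
JOB = {"vo": "dteam", "min_free_cpus": 1, "cpu_time_min": 720}

if __name__ == "__main__":
    for ce in match_make(JOB, CES):
        print(ce["id"])
```

Here `ce2` is dropped for having no free CPUs and `ce4` for its short CPU-time limit, so only `ce3` and `ce1` remain, in that order.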

A Job Submission Example

[Figure sequence showing the components UI (with JDL), Resource Broker (RB), Job Submission Service (JSS), Storage Element (SE), Compute Element (CE), Information Service (IS), Replica Catalogue (RC) and Logging & Book-keeping (LB). Each frame adds one job status.]

Job Status, in order of appearance, with the labels shown in the corresponding frame:

submitted – Job Submit Event, Input Sandbox

waiting

ready

scheduled – BrokerInfo

running – Input Sandbox, Job Status

done – Job Status

outputready – Output Sandbox

cleared – Output Sandbox

Possible Job States

SUBMITTED

WAITING

READY

SCHEDULED

RUNNING

DONE(ok)

DONE(failed)

OUTPUTREADY

CLEARED

ABORTED

DONE(cancelled)
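The states listed above can be written down as a small state machine. The normal path follows the order shown in the submission example; which states may lead to ABORTED, DONE(failed) or DONE(cancelled) is an assumption of this sketch, not something the slides specify.

```python
from enum import Enum


class JobState(Enum):
    SUBMITTED = "submitted"
    WAITING = "waiting"
    READY = "ready"
    SCHEDULED = "scheduled"
    RUNNING = "running"
    DONE_OK = "done(ok)"
    DONE_FAILED = "done(failed)"
    DONE_CANCELLED = "done(cancelled)"
    OUTPUTREADY = "outputready"
    CLEARED = "cleared"
    ABORTED = "aborted"


# Order taken from the job submission example.
NORMAL_PATH = [
    JobState.SUBMITTED, JobState.WAITING, JobState.READY, JobState.SCHEDULED,
    JobState.RUNNING, JobState.DONE_OK, JobState.OUTPUTREADY, JobState.CLEARED,
]


def allowed_transitions():
    """Map each state to the set of states that may follow it."""
    table = {state: set() for state in JobState}
    for current, following in zip(NORMAL_PATH, NORMAL_PATH[1:]):
        table[current].add(following)
    # Assumption: only a running job can end as DONE(failed).
    table[JobState.RUNNING].add(JobState.DONE_FAILED)
    # Assumption: any state up to RUNNING can be aborted or cancelled.
    for state in NORMAL_PATH[:5]:
        table[state].update({JobState.ABORTED, JobState.DONE_CANCELLED})
    return table


def is_valid_history(history):
    """Check that every consecutive pair in a status history is allowed."""
    table = allowed_transitions()
    return all(b in table[a] for a, b in zip(history, history[1:]))


if __name__ == "__main__":
    print(is_valid_history(NORMAL_PATH))
```

A check like `is_valid_history` is the kind of sanity test one could run over the status events retrieved from a logging service.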

Data Management: Data Naming

SURL – Storage URL. An SURL is a locator for a physical file. An SURL is often called a PFN (Physical File Name).

srm://lxshare0282.cern.ch:8443/castor/cern.ch/home/dteam/generated/2004-02-11/filed8f59bcf-5c85-11d8-bbf3-c59c9bed1519

UUID – Universally Unique IDentifier. A UUID is a 128-bit number.

GUID – Grid Unique IDentifier. A UUID generated by the Replica Management System.

guid:e4fbe9b0-5c85-11d8-bbf3-c59c9bed1519

LFN – Logical File Name. A Logical File Name is a user-defined alias to a GUID.

lfn:anjita-demo0236-2004-11-02

TURL – Transport URL. A Transport URL is returned by an SRM in response to a request for a way to access an SURL.

rfio://lxshare0282.cern.ch//data/dt/stage/filec0fabd63-5cba-11d8-ba4c-e2aa3666572b.4003

Different filenames in LCG-2
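The four kinds of file name can be told apart by their scheme prefix. The following sketch shows one way to do that with the Python standard library; the helper names are hypothetical, and treating `rfio` and `gsiftp` as the only transport schemes is an assumption made for the example.

```python
import uuid
from urllib.parse import urlparse

# Assumption: schemes treated as transport protocols in this sketch.
TRANSPORT_SCHEMES = {"rfio", "gsiftp"}


def classify(name):
    """Return 'LFN', 'GUID', 'SURL' or 'TURL' based on the scheme prefix."""
    scheme, sep, _ = name.partition(":")
    if not sep:
        raise ValueError(f"no scheme in {name!r}")
    scheme = scheme.lower()
    if scheme == "lfn":
        return "LFN"
    if scheme == "guid":
        return "GUID"
    if scheme == "srm":
        return "SURL"
    if scheme in TRANSPORT_SCHEMES:
        return "TURL"
    raise ValueError(f"unknown scheme {scheme!r}")


def storage_endpoint(url):
    """Host and port of an SURL or TURL (port is None when absent)."""
    parsed = urlparse(url)
    return parsed.hostname, parsed.port


def guid_bits(guid):
    """Size in bits of the UUID carried by a 'guid:' name."""
    value = uuid.UUID(guid.partition(":")[2])
    return len(value.bytes) * 8


# File names as they appear on the slide (the SURL rejoined into one string).
SURL = ("srm://lxshare0282.cern.ch:8443/castor/cern.ch/home/dteam/generated/"
        "2004-02-11/filed8f59bcf-5c85-11d8-bbf3-c59c9bed1519")
GUID = "guid:e4fbe9b0-5c85-11d8-bbf3-c59c9bed1519"
LFN = "lfn:anjita-demo0236-2004-11-02"
TURL = "rfio://lxshare0282.cern.ch//data/dt/stage/filec0fabd63-5cba-11d8-ba4c-e2aa3666572b.4003"

if __name__ == "__main__":
    for name in (LFN, GUID, SURL, TURL):
        print(classify(name), name)
```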

REPLICA MANAGEMENT SYSTEM (RMS)

The main services offered by the RMS are the Replica Location Service (RLS) and the Replica Metadata Catalog (RMC).

The RLS maintains information about the physical location of the replicas (mapping with the GUIDs). It is composed of several Local Replica Catalogs (LRCs), which hold the replica information for a single VO.

The RMC stores the mapping between GUIDs and their aliases (LFNs), and maintains other metadata (sizes, dates, ownerships, ...).

The last component of the Data Management framework is the Replica Manager. It presents a single interface to the RMS for the user and interacts with the other services.
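The division of labour between the RMC (LFN to GUID) and an LRC (GUID to replica locations), with the Replica Manager as the single entry point, can be sketched with in-memory dictionaries. The class and method names are hypothetical and do not reproduce the real Replica Manager interface; the file names are placeholders.

```python
class ReplicaMetadataCatalog:
    """Stores LFN aliases for GUIDs."""

    def __init__(self):
        self._lfn_to_guid = {}

    def add_alias(self, lfn, guid):
        self._lfn_to_guid[lfn] = guid

    def guid_for(self, lfn):
        return self._lfn_to_guid[lfn]


class LocalReplicaCatalog:
    """Stores the physical locations (SURLs) known for each GUID."""

    def __init__(self):
        self._guid_to_surls = {}

    def add_replica(self, guid, surl):
        self._guid_to_surls.setdefault(guid, []).append(surl)

    def surls_for(self, guid):
        return list(self._guid_to_surls.get(guid, []))


class ReplicaManager:
    """Single interface for the user; delegates to the two catalogs."""

    def __init__(self, rmc, lrc):
        self.rmc = rmc
        self.lrc = lrc

    def register(self, lfn, guid, surl):
        self.rmc.add_alias(lfn, guid)
        self.lrc.add_replica(guid, surl)

    def list_replicas(self, lfn):
        return self.lrc.surls_for(self.rmc.guid_for(lfn))


if __name__ == "__main__":
    rm = ReplicaManager(ReplicaMetadataCatalog(), LocalReplicaCatalog())
    # Hypothetical names, for illustration only.
    rm.register("lfn:demo-file", "guid:0001", "srm://se1.example.org/data/demo-file")
    rm.lrc.add_replica("guid:0001", "srm://se2.example.org/data/demo-file")
    print(rm.list_replicas("lfn:demo-file"))
```

The user only ever supplies the LFN; the GUID is the key that links the two catalogs.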

Interactions of the RM with other grid components

CONCLUSIONS

The EGEE Grid requires resources, an infrastructure and middleware that allows for:

- Authentication and Authorization

- Information services

- Job and Data Management

- Monitoring and fault recovery

Appendix. Data Management Services

SRM – Storage Resource Manager. A high-level interface to a storage system.

RLS – Replica Location Service. The distributed service providing the mappings between GUIDs and SURLs. An RLS has two components: LRC and RLI.

LRC – Local Replica Catalog. The catalog storing GUID to SURL mappings, along with SURL attributes, for a given site or for a single Storage Resource Manager at a site.

RLI – Replica Location Index. The catalog storing information about which Local Replica Catalogs have GUID to SURL mappings for a particular GUID. It thus provides the link between different LRCs, allowing for distributed indexing and querying of the catalogs.

RMC – Replica Metadata Catalog. The catalog storing LFN aliases for GUIDs, as well as attributes on GUIDs and LFNs.

ROS – Replica Optimization Service. A service providing information to guide selection between replicas located at different sites. This is based on network information collected from available network monitors.
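How the RLI links several LRCs can be shown with a small index: the RLI records only which LRCs know a GUID, and the SURLs themselves are then fetched from those LRCs. This is a sketch under simplifying assumptions (in-memory dictionaries, hypothetical site names and file names), not the RLS implementation.

```python
def build_rli(lrcs):
    """Map each GUID to the names of the LRCs that hold a mapping for it."""
    index = {}
    for lrc_name, mappings in lrcs.items():
        for guid in mappings:
            index.setdefault(guid, set()).add(lrc_name)
    return index


def locate(guid, rli, lrcs):
    """Ask the RLI which LRCs to consult, then collect their SURLs."""
    surls = []
    for lrc_name in sorted(rli.get(guid, ())):
        surls.extend(lrcs[lrc_name][guid])
    return surls


# Hypothetical LRC contents: LRC name -> {GUID: [SURLs]}.
LRCS = {
    "lrc-siteA": {"guid:0001": ["srm://se.a.example.org/data/f1"]},
    "lrc-siteB": {
        "guid:0001": ["srm://se.b.example.org/data/f1"],
        "guid:0002": ["srm://se.b.example.org/data/f2"],
    },
}
RLI = build_rli(LRCS)

if __name__ == "__main__":
    print(locate("guid:0001", RLI, LRCS))
```

Keeping only GUID-to-LRC pointers in the index is what lets each site keep its own catalog while queries still span all of them.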

http://lspitsky.home.cern.ch/lspitsky/

MDS – Monitoring and Discovery Service

LCFG – Local ConFiguration System (Edinburgh)