SAN DIEGO SUPERCOMPUTER CENTER

By: Roman Olschanowsky roman2u@sdsc.edu

An Introduction to the

Outline

• SDSC and History of SRB
  • Example Project
• Introduction to SRB
  • Discussion on SRB basics
  • SRB Clients
• Overview of a Data Grid
  • Infrastructure
  • Topology

Storage and Compute Resources

• Archival Systems: 6 PB
• DataStar (IBM Power4): 10.4 TF
• TeraGrid Linux Cluster (IA64): 4.4 TF
• Blue Gene/L (due 12/04): 2.8/5.7 TF
• Storage Area Network Disk: 600 TB
• Sun F15K Disk Server
• Networking, Visualization

Human infrastructure: Experienced multi-disciplinary staff support a broad spectrum of national science, engineering, and technology projects

www.sdsc.edu

Sites Using the SRB

• CiteSeer, Penn State
• City Univ. of New York
• Geospatial Environment, UCSD
• Drexel University
• EOSDIS Distributed Active, NASA Goddard
• Georgia Tech
• Kentucky State Libraries & Archives
• Library of Congress
• Los Alamos National Lab
• NASA Ames
• NASA Goddard Space Flight Center
• NCSA Grid Computing
• NIH (NCI Center for Bioinformatics)
• Penn State University
• Pittsburgh Supercomputing Center
• Purdue University, Indiana
• Stanford University
• TACC, University of Texas
• Texas A & M
• UC Santa Cruz
• UCLA
• UCSD Neuroscience
• University of Maryland
• University of Michigan, CAC department
• University of New Mexico
• University of Washington
• University of Wisconsin
• USC
• Yale University

• Academia Sinica, Taiwan
• ASCC, Computing Centre, Taiwan
• Australian National University
• Bedford Oceanography, Canada
• Bioinformatics Institute, Singapore
• CSIRO, Australia
• Data Storage Institute, Singapore
• EGEE, French National Center
• GeoForschungsZentrum, Germany
• James Cook University, Australia
• KEK High Energy Physics, Japan
• Max Planck Institute, Netherlands
• Parallab, Norway
• South Australian Advanced Computing
• UIB (Parallab), Norway
• University of Amsterdam
• University of Cambridge, Astronomy
• University of Cambridge, e-Science
• University of Edinburgh
• University of Genoa, Italy
• University of Hong Kong
• University of Manchester
• University of Oslo
• University of Southampton
• York Univ (UK)

SDSC SRB Projects (60 million, 0.5 PB)

• Digital Libraries
  • UCB, Umich, UCSB, Stanford, CDL
  • NSF NSDL - UCAR / DLESE
• NASA Information Power Grid
• Astronomy
  • National Virtual Observatory
  • 2MASS Project (2 Micron All Sky Survey)
• Particle Physics
  • Particle Physics Data Grid (DOE)
  • GriPhyN
  • SLAC Synchrotron Data Repository
• Medicine
  • Digital Embryo (NLM)
• Earth Systems Sciences
  • ESIPS
  • LTER
• Persistent Archives
  • NARA
  • LOC
• Neuro Science & Molecular Science
  • TeleScience/NCMIR, BIRN
  • SLAC, AfCS, …

The SCEC Project

• Southern California Earthquake Center
• 400 people, the best earthquake seismologists in the country (33 states) and several from abroad (9 countries). (Sep. 2004 SCEC AHM attendees)
• Simulating a magnitude 7.7 earthquake in the L.A. basin
  • 10 year effort
  • 100+ TB of input data (soil conditions, topography, grid coordinates, etc.)
  • 240 procs on SDSC DataStar cluster, 5 days, 1 TB RAM, 2 GB/sec IO

Thanks!

• SDSC scientific applications group: helped with porting the code, parallelizing the calculation and the IO, and generalizing the code for scaling up to a large run. Offered invaluable insights regarding IO management.
• SRB: took care of draining the GPFS cache regularly, moving 43 TB of data safely to archive storage. That task was completed a mere 36 hours after the end of the calculation.

The SRB was critical in this achievement.
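
To put the archiving figures in perspective, here is a back-of-the-envelope estimate of the average sustained rate needed to move the 43 TB. It assumes (this is not stated on the slide) that the draining was spread evenly over the 5-day run plus the 36 hours that followed, and that 1 TB means 10^12 bytes.

```python
# Rough estimate only; the even-spread assumption is ours, not the slide's.
TB = 10**12                      # decimal terabyte, in bytes
data_bytes = 43 * TB             # data moved to archive storage
run_hours = 5 * 24               # length of the calculation
tail_hours = 36                  # archiving finished this long after the run
window_seconds = (run_hours + tail_hours) * 3600

avg_rate_mb_s = data_bytes / window_seconds / 10**6
print(f"average archive rate: {avg_rate_mb_s:.1f} MB/s")
```

Under those assumptions the average comes to roughly 77 MB/s, well below the 2 GB/sec the simulation itself needed for IO.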

SDSC & SRB Example

Storage Resource Broker (SRB)

• A distributed file system (Data Grid)
  • Client-Server, Server-Server architecture.
  • Abstracts physical
• SRB provides the ability to transparently share data across remote sites.
  • Heterogeneous Resources
  • Single sign on
  • Single logical file hierarchy
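
The idea of a single logical file hierarchy over heterogeneous storage can be sketched with a toy catalog that maps one logical path to any number of physical copies. This is an illustration only: the class names, resource names, and paths below are hypothetical and do not reflect the actual SRB implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One physical copy of a logical file (all names are illustrative)."""
    resource: str        # storage resource holding the copy
    physical_path: str   # where the bytes live on that resource


@dataclass
class Catalog:
    """Toy metadata catalog: logical path -> list of physical replicas."""
    entries: dict = field(default_factory=dict)

    def register(self, logical_path, replica):
        self.entries.setdefault(logical_path, []).append(replica)

    def resolve(self, logical_path):
        # Clients ask by logical name only; the catalog supplies locations.
        return self.entries.get(logical_path, [])


catalog = Catalog()
catalog.register("/home/demo/Doc.txt", Replica("site-a-disk", "/vault/01/Doc.txt"))
catalog.register("/home/demo/Doc.txt", Replica("site-b-tape", "/archive/7f/Doc.txt"))

for replica in catalog.resolve("/home/demo/Doc.txt"):
    print(replica.resource, replica.physical_path)
```

The user only ever sees `/home/demo/Doc.txt`; which site or device holds the bytes is a catalog detail.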

What we are familiar with

What we are not familiar with, yet

How do the file systems differ?

• Logical Abstraction
  • Folders are NOT physical
  • Files do NOT inherit physical location
  • Everything is potentially distributed
• Access Control
  • Permissions are NOT rwxrwxrwx
  • Permissions ARE on an object-by-object basis
  • Groups and permissions ARE more similar to NTFS
• Domains
  • Geographical / logical grouping of users
  • Namespace scalability: john@harvard, john@mit
  • Also doubles as groups
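
A minimal sketch of per-object permissions with domain-qualified users, to contrast with the rwxrwxrwx model. The permission levels, the "@domain" convention for domain-wide grants, and the class itself are assumptions made for illustration; SRB's real permission names and rules may differ.

```python
from collections import defaultdict

# Hypothetical permission levels, ordered from weakest to strongest.
LEVELS = {"none": 0, "read": 1, "write": 2, "own": 3}


class AccessControl:
    def __init__(self):
        # object path -> {principal -> level}
        # A principal is "user@domain" or "@domain" (the whole domain as a group).
        self.acl = defaultdict(dict)

    def grant(self, obj, principal, level):
        self.acl[obj][principal] = level

    def level_for(self, obj, user, domain):
        entries = self.acl.get(obj, {})
        user_level = entries.get(f"{user}@{domain}", "none")
        domain_level = entries.get(f"@{domain}", "none")
        # The stronger of the personal grant and the domain-wide grant wins.
        return max(user_level, domain_level, key=LEVELS.get)

    def allowed(self, obj, user, domain, needed):
        return LEVELS[self.level_for(obj, user, domain)] >= LEVELS[needed]


ac = AccessControl()
ac.grant("/home/demo/Doc.txt", "john@harvard", "write")
ac.grant("/home/demo/Doc.txt", "@mit", "read")

print(ac.allowed("/home/demo/Doc.txt", "john", "harvard", "write"))
print(ac.allowed("/home/demo/Doc.txt", "john", "mit", "write"))
print(ac.allowed("/home/demo/Doc.txt", "john", "mit", "read"))
```

Here john@harvard and john@mit are distinct users, and the grant to "@mit" shows a domain doubling as a group.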

Interfaces to the Storage Resource Broker

• inQ – Windows Client
• Scommands – UNIX, DOS Command line Client
• Jargon – Java API and GUI components
• mySRB – Web Client
• Matrix – WSDL, Data Grid Workflows
• C, C++ – C and C++ API
• Python – Python API
• Perl – Perl API

Common Scommands (69 total)

• Sinit
• Senv
• Spwd
• Sls
• Scd
• Sget
• Sput
• Ssh
• Scp
• Smv (logical)
• Sphymove (physical)
• Srm
• Smkdir
• Srmdir
• Serror
• Schmod
• Sexit
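
As a rough picture of how these commands fit together, the sketch below lays out a typical session as argument lists and prints it. It defaults to a dry run, so nothing is executed; the collection and file names are made up, only the simplest positional forms of the commands are used, and the comments describe the commands by analogy with their UNIX counterparts. Check each command's own help before relying on the exact arguments.

```python
import shlex
import shutil
import subprocess

# A plausible session; "tutorial" and the file names are illustrative.
SESSION = [
    ["Sinit"],                             # start an SRB session
    ["Spwd"],                              # show the current collection
    ["Smkdir", "tutorial"],                # create a collection (logical folder)
    ["Scd", "tutorial"],                   # change into it
    ["Sput", "Doc.txt", "Doc.txt"],        # upload a local file
    ["Sls"],                               # list the collection
    ["Sget", "Doc.txt", "Doc-copy.txt"],   # download it under a new local name
    ["Sexit"],                             # end the session
]


def run_session(commands, dry_run=True):
    """Return the command lines; execute them only when dry_run is False
    and the corresponding Scommand is found on the PATH."""
    lines = []
    for argv in commands:
        lines.append(shlex.join(argv))
        if not dry_run and shutil.which(argv[0]):
            subprocess.run(argv, check=True)
    return lines


for line in run_session(SESSION):
    print(line)
```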

mySRB

BIRN Portal (perl based)

NEEScentral Portal (php based)

Biomedical Informatics Research Network (BIRN)

• Major collaboration with SDSC; several of the project's Co-Investigators and Co-PIs are at SDSC.

• BIRN's purpose is to give its consortium of neuroscience laboratories the ability to share, compute, and collaborate.

• The Storage Resource Broker provides the ability to transparently share data across remote sites.

The BIRN SRB Data Grid

Doing this “Manually”

The BIRN Data Grid

The grid is in the details

File Replication

Sls
/home/Demo/SRB-Tutorial/files-2:
  Doc.txt

Sls -l
/home/Demo/SRB-Tutorial/files-2:
  romanoly 0 z-ucsd-ncmir-nas1 15 2003-07-09-05.15 Doc.txt
  romanoly 1 z-jhu-cis-nas0 15 2003-07-09-05.16 Doc.txt
  romanoly 2 z-stanford-lucas-nas 15 2003-07-09-05.16 Doc.txt
  romanoly 3 z-umn-cmrr-nas0 15 2003-07-09-05.16 Doc.txt
  romanoly 4 z-uci-bic-nas0 15 2003-07-09-05.17 Doc.txt
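
The long listing above can be read as one row per replica. The sketch below parses it under the assumption that the six whitespace-separated columns are owner, replica number, resource, size in bytes, timestamp, and file name; that column interpretation is inferred from the output shown here, and file names containing spaces would need more careful handling.

```python
from collections import namedtuple

# Column meanings are inferred from the sample output, not from a format spec.
ListingRow = namedtuple("ListingRow", "owner replica_num resource size timestamp name")

LISTING = """\
/home/Demo/SRB-Tutorial/files-2:
romanoly 0 z-ucsd-ncmir-nas1 15 2003-07-09-05.15 Doc.txt
romanoly 1 z-jhu-cis-nas0 15 2003-07-09-05.16 Doc.txt
romanoly 2 z-stanford-lucas-nas 15 2003-07-09-05.16 Doc.txt
romanoly 3 z-umn-cmrr-nas0 15 2003-07-09-05.16 Doc.txt
romanoly 4 z-uci-bic-nas0 15 2003-07-09-05.17 Doc.txt
"""


def parse_long_listing(text):
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 6:
            continue  # skip blank lines and the collection header
        owner, num, resource, size, stamp, name = parts
        rows.append(ListingRow(owner, int(num), resource, int(size), stamp, name))
    return rows


rows = parse_long_listing(LISTING)
print(len(rows), "replicas of", rows[0].name)
for row in rows:
    print(row.replica_num, row.resource)
```

One logical file, Doc.txt, appears once in the plain listing but has five physical copies, each on a different site's resource.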

SRB “Location” or “Slave Server”

[Figure labels: SRB servers; “Location”; “Physical Resources” z-jhu-cis-nas0, z-jhu-cis-nas1, z-jhu-cis-nas2; “Logical Resource” “jhu-cis-nas”; DR]

Pooling physical resources

[Figure: per-resource storage figures, in TB: 0.7, 5.2, 0, 1.6, 0.8, 0.8, 3.2, 0.8, 2.4, 0.8, 0.8, 2.4, 1.6, 0.8, 5.0, 0.78, 0.08]

Logical / Compound Resources

[Figure: SRB servers and a logical resource “My-Resource”]

• “instant replication”
• “fast archival”
• “resource pooling”
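
Two of the ideas named above, replication and pooling, can be sketched with a toy logical resource that groups several physical ones. The resource names reuse those shown earlier in the deck, but the free-space figures, the "most free space" placement rule, and the classes themselves are assumptions for illustration, not SRB's actual behavior.

```python
class PhysicalResource:
    def __init__(self, name, free_tb):
        self.name = name
        self.free_tb = free_tb
        self.files = set()

    def store(self, logical_path, size_tb):
        if size_tb > self.free_tb:
            raise OSError(f"{self.name} is full")
        self.free_tb -= size_tb
        self.files.add(logical_path)


class LogicalResource:
    """Groups physical resources under one name.

    mode="replicate": every member receives a copy.
    mode="pool": one member is chosen (here: the one with the most free space).
    """

    def __init__(self, name, members, mode="replicate"):
        self.name = name
        self.members = members
        self.mode = mode

    def put(self, logical_path, size_tb):
        if self.mode == "replicate":
            targets = list(self.members)
        else:
            targets = [max(self.members, key=lambda m: m.free_tb)]
        for target in targets:
            target.store(logical_path, size_tb)
        return [target.name for target in targets]


# Free-space values are invented for the example.
nas = [PhysicalResource(f"z-jhu-cis-nas{i}", free) for i, free in enumerate([0.8, 1.6, 2.4])]

mirror = LogicalResource("jhu-cis-nas", nas, mode="replicate")
mirrored = mirror.put("/home/demo/Doc.txt", 0.1)
print("replicated to:", mirrored)

pool = LogicalResource("My-Resource", nas, mode="pool")
pooled = pool.put("/home/demo/Big.dat", 0.5)
print("pooled onto:", pooled)
```

In both cases the user names only the logical resource; which physical devices end up holding the data is decided behind it.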

Logical Resources

Thanks!

SRB handles large data and provides the ability to share and collaborate on distributed heterogeneous resources.

Questions?

www.sdsc.edu/

roman2u@sdsc.edu