The Grid : Grids for Worldwide Science Vicky White Head, Computing Division, Fermilab May 21, 2003...
The Grid: Grids for Worldwide Science
Vicky White
Head, Computing Division, Fermilab
May 21, 2003
Fermilab Colloquium
Acknowledgements for Materials possibly stolen from talks by
Fran Berman (Director SDSC), Harvey Newman (Caltech), Miron Livny (U.Wisc), Ian Foster (U.Chicago), Doug Olson (LBNL), Tony Hey (UK e-science), John Taylor (UK), Fabrizio Gagliardi (CERN), Federico Carmenati (CERN), Lee Lueking (FNAL), Ruth Pordes (FNAL), Lothar Bauerdick (FNAL), Irwin Wladavsky-berger (IBM), DaSilva (EU), Igor Terakhov (FNAL)
Grid Computing in the News
FEATURE ARTICLE, Scientific American, April 2003 issue
INFORMATION TECHNOLOGY
The Grid: Computing without Bounds
By linking digital processors, storage systems and software on a global scale, grid technology is poised to transform computing from an individual and corporate activity into a general utility
By Ian Foster
Scientists Giddy About the Grid
By Randy Dotinga
WIRED NEWS
02:00 AM Jan. 20, 2003 PT
SAN DIEGO, California -- One pesky word has doomed many collaborative supercomputer projects to the purgatory of the suggestion box: feasible. Sure, universities could conceivably link their most powerful machines. But just like back in the kindergarten sandbox, differing standards prevented everyone from playing well with others.
Wired News articles last year
• Grid Computing Good for Business Jan. 16, 2003
• Tackling Breast Cancer on a Grid Oct. 14, 2002
• Library of Congress Taps the Grid Oct. 02, 2002
• Open Sourcers Say Grid Is Good Aug. 15, 2002
• Time to Hop on the Gridwagon Jul. 26, 2002
• Supercomputing: Suddenly Sexy Jul. 08, 2002
• The Grid Draws Its Battle Lines Feb. 20, 2002
Nature: April 2003
Article about CERN and the LHC Computing Grid (and US Grid projects)
What is GRID computing:
Coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations. [Ian Foster]
• A virtual organization is a collection of users sharing similar needs and requirements in their access to processing, data and distributed resources and pursuing similar goals.
Key concept:
– ability to negotiate resource-sharing arrangements among a set of participating parties (providers and consumers) and then to use the resulting resource pool for some purpose. [Ian Foster]
Who is this guy Ian Foster?
Ian Foster and Carl Kesselman, editors,
“The Grid: Blueprint for a New Computing Infrastructure,”
Morgan Kaufmann, 1999,
http://www.mkp.com/grids
Computer Scientist at University of Chicago and Argonne National Lab
Leader of a sequence of efforts in distributed computing: the Globus Project
Globus is a toolkit of software that helps do some of the things you need to do to make Grid Computing a reality
It is not the only software offering
And what is he talking about?
An idea of how to work using distributed computers and storage devices and disks and networks in a global way based on emerging standards for interoperability
• Based on the metaphor of the Power Grid – you just plug into the wall and get power wherever you go!
(but it's not quite as simple as that)
The Grid Vision
The GRID: networked data processing centres and “middleware” software as the “glue” of resources.
Researchers perform their activities regardless of geographical location, interact with colleagues, share and access data
Scientific instruments and experiments (and simulations) provide huge amounts of data
What does Grid Computing involve?
• Some specific Grid software that is evolving, often called “Grid Middleware”
• Funding opportunities and some hype
• Many funded grid projects all over the world
• An opportunity for Computer Scientists and Scientists with a job to do (e.g. Physicists) to get together and make something work
• Potential broad benefit to society. Vendor interest has peaked.
• Is this what comes after the web?
• Lots and lots of buzzwords to learn
1995 – 2000+: Maturation of Grid Computing
• “Grid book” gave a comprehensive view of the state of the art
• Important infrastructure and middleware efforts initiated
– Globus, Legion, Condor, NWS, SRB, NetSolve, AppLes, etc.
• 2000+: Beginnings of a Global Grid
– Evolution of the Global Grid Forum
– Some projects evolving to de facto standards (e.g. Globus, Condor, NWS)
The Condor Project (Established ‘85)
Distributed Computing research performed by a team of 30 faculty, full time staff and students who
– face software engineering challenges in a UNIX/Linux/NT environment,
– are involved in national and international collaborations,
– actively interact with users,
– maintain and support a distributed production environment,
– and educate and train students.
Funding – DoD, DoE, NASA, NIH, NSF, AT&T, INTEL, Microsoft and the UW Graduate School
Distributed batch system with many features for high throughput computing across clusters spread worldwide.
What’s different about Grids and are they useful now?
• Lots of production quality Distributed Applications in real life
• Much work in High Energy Physics and much progress on building early grids, prototypes, testbeds, even spanning continents
– Our Virtual Organizations are our Experiments
Real-world example - Walmart Inventory Control
– Satellite technology used to track every item
• Bar code information sent to remote data centers to update inventory database and cash flow estimates
• Satellite networking used to coordinate vast operations
– Inventory adjusted in real time to avoid shortages and predict demand
• Data management, prediction, real-time, wide-area synchronization
More Real World Distributed Applications
• SETI@home
– 3.8M users in 226 countries
– 1200 CPU years/day
– 38 TF sustained (Japanese Earth Simulator is 40 TF peak)
– 1.7 ZETAflop over last 3 years (10^21, beyond peta and exa …)
– Highly heterogeneous: >77 different processor types
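The SETI@home figures above can be cross-checked with a little arithmetic. The sketch below is only a sanity check on the slide's numbers, not a statement about how the project measured them. It assumes a 365.25-day year, treats "1200 CPU years/day" as that much work delivered each day, and treats the 1.7 zettaflop as a total operation count over three years; the variable names are illustrative.

```python
SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365.25          # assumption: average calendar year

# Figures quoted on the slide
cpu_years_per_day = 1200        # CPU-years of work delivered per day
sustained_flops = 38e12         # 38 TF sustained
total_flop_3_years = 1.7e21     # 1.7 zettaflop accumulated over 3 years

# 1200 CPU-years of work per day is equivalent to this many CPUs
# running around the clock.
equivalent_cpus = cpu_years_per_day * DAYS_PER_YEAR

# Sustained rate per continuously busy CPU implied by the two figures.
flops_per_cpu = sustained_flops / equivalent_cpus

# Average rate implied by the 3-year total.
average_flops = total_flop_3_years / (3 * DAYS_PER_YEAR * SECONDS_PER_DAY)

print(f"equivalent busy CPUs: {equivalent_cpus:,.0f}")
print(f"implied rate per CPU: {flops_per_cpu / 1e6:.1f} MFLOPS")
print(f"3-year average rate:  {average_flops / 1e12:.1f} TFLOPS")
```

Under these assumptions the three-year average comes out at roughly half of the quoted sustained rate, which is what one would expect if participation grew over the period.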
Grid for Science
• Walmart – controlled/owned distributed computing system
• SETI - other end of the spectrum
– Same “job” run on anyone’s desktop
• Science using the GRID –
– Focus on standards and interfaces, not all resources owned and controlled
– Sharing with other disciplines
– Opportunistic use of resources for ever-changing jobs and investigations
The Grid and the Web
• As the Web became an everyday tool for “everyone,” the Grid concept has become much easier to explain
– Web users are familiar with accessing remote resources
– typically the resources accessed are static documents but some are dynamic.
– Consequently, the idea of harnessing major remote computational and data resources is no longer quite so foreign.
Global Grid Forum
• International Working Group on Grid Computing
• Has grown from first meeting in Amsterdam (Mar 2001) with 200 registrants to 4th meeting in Toronto (Feb 2002) with 450 registrants
• Strong participation by developers, application users, industry
• Promote Grid technologies via "best practices," implementation guidelines, and standards
• Meetings three times a year
– International participation, hundreds of attendees
• Many Physics participants contributing to GGF
– Working group chairs, document production, etc.
• Mature Physics technologies should transition to GGF
Vendors and the Grid
Sun1 Grid Engine
When you move from network computing to grid computing, you will notice reduced costs, shorter time to market, increased quality and innovation and you will develop products you couldn't do before. Sun Grid Computing solutions are ideal for compute-intensive industries such as scientific research, EDA, life sciences, MCAE, geosciences, financial services, and others.
Vendors and the Grid
IBM Says Its Grid Services Contributions Will Be Royalty-Free
By Paul Shread
The Open Grid Services Architecture vision for the convergence of Web services and Grid computing received a big boost today when IBM announced that its contributions to the core Grid Services Specification will be royalty-free.
"IBM is pleased to announce that any essential patent claims, held by IBM, that are necessary to implement the 'Grid Services Specification' document submitted to the Global Grid Forum, will be licensed on a royalty-free basis," IBM Grid Computing General Manager Tom Hawk wrote in a letter to Charlie Catlett, chair of the standard-setting Global Grid Forum.
"The 'Grid Services Specification' is built upon both Grid and Web Services technologies, referred to as the Open Grid Services Architecture (OGSA)," Hawk wrote. "Like Web Services and XML before it, this foundational Grid work will be an essential part of the Web and network infrastructure for businesses, as well as for governments and scientific institutions."
Next generation Grids will include new technologies
• New devices
– PDAs, sensors, cars, clothes, smart dust, smart bandaids, …
Wired and Wireless
Many BioGrid Projects
• EUROGRID BioGRID
• Asia Pacific BioGRID
• NC BioGrid
• Bioinformatics Research Network
• Osaka University Biogrid
• Indiana University BioArchive BioGrid
Why GRIDs for HEP and other sciences
• The scale of the problems human beings have to face to perform frontier research in many different fields is constantly increasing.
• Performing frontier research in these fields already today requires world-wide collaborations (i.e. multi domain access to distributed resources).
Farms (Production Processing resources)
[Photo labels: CMS; 1U Run I; Fixed Target; Run II]
Feynman Computing Center
• Run2 and CMS require massive PC computing clusters
– Very high physical density
– ~200 Watts per CPU chip (similar for SMP and PC)
One Gigantic Computing Center?
• At Fermilab for Run II expts, MINOS, BTeV, CKM, SDSS?
• At CERN for LHC?
• Makes no sense technically, politically, socially
– Changing role for Fermilab Computing
Getting to the Vision
• Democratization of Science
– Not just a question of computers and disks and tapes and access to them all
– Seeds a way of working that puts emphasis on equal access for all and standardization of the way things are done
– Computer Scientists’ focus is on the technology
– Our focus is on enabling Scientists to work by creating a massive “virtual” environment
• Long way to go to get to the vision
How close to the Vision?
• State of the art today for “The Grid” is
– Moving data files between institutions, across continents, (almost) seamlessly and automatically
• Almost here today is
– submitting jobs from your desktop and having them run somewhere at one of the “centers” of your “virtual organization”
• Seamless
• You don’t care where
• Choice of where to run the job optimized for you
• All the errors/resubmits handled for you
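To make the last bullets concrete, here is a toy sketch of what "you don't care where" and "errors/resubmits handled for you" could look like from the user's side. It is not the Globus, Condor or SAM interface. The `Site` class, the `choose_site` and `submit` functions, the "most free slots" policy and the site names are all hypothetical, chosen only to illustrate the idea of a broker that picks a center and retries elsewhere on failure.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass
class Site:
    """A hypothetical computing center in the virtual organization."""
    name: str
    free_slots: int


def choose_site(sites: List[Site], exclude: Set[str]) -> Optional[Site]:
    """Pick the untried site with the most free slots (an assumed policy)."""
    candidates = [s for s in sites if s.free_slots > 0 and s.name not in exclude]
    return max(candidates, key=lambda s: s.free_slots, default=None)


def submit(job: str, sites: List[Site],
           run: Callable[[str, Site], bool],
           max_attempts: int = 3) -> Optional[str]:
    """Run the job somewhere, resubmitting to another site on failure.

    Returns the name of the site where the job succeeded, or None if
    every attempt failed or no site was available.
    """
    tried: Set[str] = set()
    for _ in range(max_attempts):
        site = choose_site(sites, tried)
        if site is None:
            break
        tried.add(site.name)
        if run(job, site):
            return site.name
    return None


# Usage: the user names the job, not the place where it runs.
sites = [Site("site_a", 10), Site("site_b", 40), Site("site_c", 25)]


def flaky_run(job: str, site: Site) -> bool:
    # Pretend the site with the most free slots is currently failing.
    return site.name != "site_b"


result = submit("analysis_job", sites, flaky_run)
print(result)
```

In this run the broker first tries the site with the most free slots, sees the failure, and resubmits to the next best site without the user being involved.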
How do the economics work?
• So we need massive amounts of CPU, disk, tape to do analysis
– The more we have the more we can
• explore our data quickly
• reprocess multiple times
• let every grad student sift through masses of data
• How does getting tied up with Grid help to make more
– Aren’t there only so many $$s in total?
– Doesn’t distributing all the computing make it all more complicated?
– Isn’t this all nonsense because the real scarce resource for scientific discovery is brainpower?
Does Grid Computing make more Computing and more capabilities?
• Information technology is now accepted as an essential enabling infrastructure
– For a University
– For a Region
– For a Country
– For a Business
– For Global competitiveness
• The “GRID” word has captured the vision for a new way of using computers (and so has captured the $$s)
Funding for Grid Projects
HENP Related Grid Projects
– PPDG I, USA, DOE, $2M, 1999-2001
– GriPhyN, USA, NSF, $11.9M + $1.6M, 2000-2005
– EU DataGrid, EU, EC, €10M, 2001-2004
– PPDG II (CP), USA, DOE, $9.5M, 2001-2004
– iVDGL, USA, NSF, $13.7M + $2M, 2001-2004
– DataTAG, EU, EC, €4M, 2002-2004
– GridPP, UK, PPARC, >$15M, 2001-2004
– LCG Phase1, CERN MS, 30 MCHF, 2002-2004
Many Other Projects of interest to HENP
– Initiatives in US, UK, Italy, France, NL, Germany, Japan, …
– US and EU networking initiatives: AMPATH, I2, DataTAG
– US Distributed Terascale Facility: ($53M, 12 TeraFlops, 40 Gb/s network)
– Storage Resource Manager (Fermilab participating with LBL and others) – gaining acceptance as a standard
The Particle Physics Data Grid
[Figure: PPDG collaboration diagram. Labels: BaBar Data Management; D0 Data Management; STAR Data Management; Jlab Data Management; CMS Data Management; Atlas Data Management; BaBar; D0; STAR; Jefferson Lab; CMS; Atlas; Globus Team; Condor; SRB Team; Globus Users; SRB Users; Condor Users; Stacs Users; HENP GC]
GriPhyN: PetaScale Virtual Data Grids
[Figure: GriPhyN architecture diagram. Labels: Production Team; Individual Investigator; Workgroups; Interactive User Tools; Virtual Data Tools; Request Planning & Scheduling Tools; Request Execution & Management Tools; Resource Management Services; Security and Policy Services; Other Grid Services; Transforms; Raw data source; Distributed resources (code, storage, computers, and network)]
There is funding for Grid projects worldwide
• European Commission and many individual European countries
– And for Network infrastructure – e.g. GEANT and Netherlands
• Asia – coordinated and individual countries
• Australia, Brazil, and many others…
UK e-Science Initiative
• £120M Programme over 3 years
• £75M is for Grid Applications in all areas of science and engineering
• £35M ‘Core Program’ to encourage development of generic ‘industrial strength’ Grid middleware
Require £20M additional ‘matching’ funds from industry
e-Science
‘e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it.’
John Taylor
(Director General of Research Councils, Office of Science and Technology, Dept of Trade and Industry, UK)
[Figure: DAME data flow. Labels: In flight data; Airline; Maintenance Centre; Ground Station; Global Network (eg: SITA); Internet, e-mail, pager; DS&S Engine Health Center; Data centre]
DAME is an e-Science pilot project, demonstrating the use of the GRID to implement a distributed decision support system for deployment in maintenance applications and environments. It is funded by the EPSRC under the UK e-Science programme, and is one of six EPSRC projects launched in the first phase of e-Science funding. DAME will demonstrate how the GRID and web services (based on OGSA) can facilitate the design and development of systems for diagnosis and maintenance applications which combine geographically distributed resources and data within a localised decision support system.
European Union Funds Projects – next round of funding – for the 6th Framework Program
[Figure: structure of the 6th Framework Program.
Integrating European Research
– Priority thematic areas: Genomic and biotechnology for health; Information society technologies; Nanotechnologies, intelligent materials, new production processes; Aeronautics and space; Food safety and health risks; Sustainable development and global change; Citizens and governance in the knowledge society
– Anticipating S/T needs: Research for policy support; Frontier research, unexpected developments; Specific SME activities; Specific international co-operation activities
– JRC activities
Structuring the ERA
– Research and innovation; Human resources & mobility; Research infrastructures; Science and society
Strengthening the foundations of ERA
– Coordination of research activities; Development of research/innovation policies]
European Union Funded, CERN led, Data Grid Project 2001-2003
1. To deliver production level Grid services, the essential elements of which are manageability, robustness, resilience to failure and a consistent security model, as well as the scalability needed to rapidly absorb new resources as these become available, while ensuring the long-term viability of the infrastructure.
2. To carry out a professional Grid middleware re-engineering and development activity in support of the production services. This will support and continuously upgrade a suite of software tools capable of providing production level Grid services to a base of users which is anticipated to rapidly grow and diversify.
3. To ensure outreach and training effort which can proactively market Grid services to new research communities in academia and industry, capture new e-Science requirements for the middleware and service activities, and provide the necessary education to enable new users to benefit from the Grid infrastructure.
EGEE is the new EU proposal just submitted
The LHC Computing Grid Project Structure
[Figure: organization chart of the LHC Computing Grid Project. Labels: LHCC; Project Overview Board; RTAG; Reports; Reviews; Common Computing RRB; Resource Matters; Other Computing Grid Projects; EU DataGrid Project; implementation teams; Other HEP Grid Projects; Other Labs; Project Manager; Project Execution Board; Requirements, Monitoring; Software and Computing Committee (SC2)]
Launch Workshop March 11-15
DOE/NSF Partnership for Global Infostructure
I.Gaines, 4-Agency meeting at CERN, March 21st, 2003
Physics + Computer Science/Information Technology Funding Agencies
Closer to home – what are we doing at Fermilab?
• SAM for Run II Experiments
• Grid-enabled storage systems using standards like SRM (Storage Resource Manager)
• Participating in Grid projects and submitting proposals in the areas of
– Grid, Cyber Security, Network Research, Collaborative Tools, Global Analysis, Lattice QCD, Sloan Digital Sky Survey
• US-CMS Software and Computing leadership
– Building US-CMS production Grid for Data Challenges
• Playing a big role in coordination of Grid Projects
– Trillium - joining together 3 US HEP Grid Projects (Ruth Pordes)
– HICB – international coordination body
Dzero has a prototype Grid
• SAM - data handling system
• Now also being used by CDF offsite
• Also working to incorporate de facto Grid standards – Globus, Condor, GridFTP, SRM, etc.
– UK Grid funding as well as the DOE-funded PPDG project is supporting this effort
What’s SAM?
• SAM (Sequential data Access via Meta-data) is a data handling system with a Grid job management service (JIM) being added
• http://d0db.fnal.gov/sam
• Joint project between D0 and Computing Division started in 1997 to meet the FNAL Run II data handling needs
• Globally distributed system
• Provides high-level collective services of reliable data storage and replication
– Arguably the most functional “Grid” in HEP currently
SAM Data Handling System at D0
Summary of Resources (DØ)
Registered Users: 600
Number of Stations: 56
Registered Nodes: 900
Total Disk Cache: 40 TB
Number of Files: 1.5M
[Map legend: Regional Center; Analysis site]
[Plots: Integrated Files Consumed vs Month (DØ), 4.0 M Files Consumed; Integrated GB Consumed vs Month (DØ), 1.2 PB Consumed; time axis 3/02 to 3/03]
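The two plot totals give a feel for the scale of SAM data delivery. The sketch below derives an average file size and an average delivery rate from them. It assumes that 1 PB means 10^15 bytes and that both totals accumulated over the one year shown on the time axis (3/02 to 3/03); neither assumption is stated on the slide, so the results are order-of-magnitude estimates only.

```python
# Totals read from the plot annotations
files_consumed = 4.0e6            # 4.0 M files
bytes_consumed = 1.2e15           # 1.2 PB, assuming 1 PB = 10**15 bytes

# Assumption: the totals cover the one-year span of the time axis.
period_seconds = 365 * 86_400

# Average size of a consumed file, in megabytes (10**6 bytes).
avg_file_size_mb = bytes_consumed / files_consumed / 1e6

# Average delivery rate over the whole period, in megabytes per second.
avg_rate_mb_per_s = bytes_consumed / period_seconds / 1e6

print(f"average file size:     {avg_file_size_mb:.0f} MB")
print(f"average delivery rate: {avg_rate_mb_per_s:.1f} MB/s")
```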
LHC Physics Discoveries at Universities and Labs!
U.S. CMS is Committed to Empower the CMS Scientists at U.S. Universities and Labs to do Research on LHC Physics Data
This is why we are pushing Grids and other Enabling Technology
Lothar Bauerdick – U.S. CMS Software and Computing Project Manager
• Grid Testbeds for Research, Development and Dissemination!
– USCMS Testbeds real-life large Grid installations, becoming production quality
– Strong Partnership between Labs, Universities, with Grid (iVDGL, GriPhyN, PPDG) and Middleware Projects (Condor, Globus)
– Strong dissemination component, together with Grid Projects
– Caltech, UCSD, U.Florida, UW Madison, Fermilab, CERN
MIT, Rice, Minnesota, Iowa, Princeton
Grid Testbeds becoming Production Grids
1.5 Million Events Delivered to CMS Physicists! (nearly 30 CPU years)
• Now joining: Brazil, South Korea
Facility-Centric Components View
Peta Scales!!
... Towards The Environment View: DAWN (Recent proposal to NSF)
• Communities of Scientists Working Locally within a Global Context
• Infrastructure for sharing, consistency of physics and calibration data, software
Communities!!
Transition to Production-Quality Grid
• LHC is Starting a Production Grid Service Around the World
DOE/NSF Partnership for Global Infostructure
I. Gaines, 4-Agency meeting at CERN, March 21st, 2003
Physics + Computer Science/Information Technology Funding Agencies
Fermilab role – what’s next
From participating in iVDGL (International Virtual Data Grid Laboratory)
– NSF funded project that provides the $$s for the prototype “Tier 2” University based computing centers in the US for LHC (shared with SDSS and LIGO)
To helping “WorldGrid” demonstrations of interoperability and broader participation
Playing a leading role in a proposed Open Science Grid
– Come together in the US to build a persistent Grid Infrastructure for our Science, driven by LHC experiments, used by Run II experiments, open more broadly and “matching” the EGEE goals in Europe.
LHC Computing – a driver for Global Science
• LHC is visible, strongly international, pushed by CERN, supported by funding agencies worldwide, aligned with “Grid” vision and motivation
• We can either use this fact and build on it - or go in a corner and “just do a physics experiment”
• We can either build on this opportunity to explain our science to other scientists, to those with $$s, and to the public – or let it pass us by
• We can either capitalize on the perception that we may be leading the way in working in a high-tech way across global “virtual organizations” – showing the way for the future for corporations as well as for other sciences that are not yet at that scale – or we can hide.
Will the Grid help us with Global Science projects?
I think it can and it is. High Energy Physics is a prime example and a real driving force for global scale science projects with massive data and huge data processing needs.
The promise of the Grid has not been oversold, but the difficulty of developing the requisite Grid infrastructure has been underestimated.
It is a new way of working – a bit less focused on HEP but with a greater chance to see and be seen and have an impact more broadly than our science, and more aligned with the long-term standards and developments in Information Technology.
The End
A few URLs – to start from
• http://ppdg.net
• http://www.griphyn.org
• http://www.ivdgl.org
• http://d0db.fnal.gov/sam
• http://www.uscms.org/scpages/sc.html
• http://lcg.web.cern.ch/lcg
• http://eu-datagrid.web.cern.ch/eu-datagrid/
• http://www.globalgridforum.org/
• http://www.crossgrid.org/
• http://www.nature.com/nature/webmatters/grid/grid.html
• http://www.escience-grid.org.uk/docs/background.htm
• http://grid.fzk.de/grid/
• …… These contain many links to Globus, Condor, SRM, Network and Security projects that Fermilab and Fermilab experiments are involved with.