Issues for Grids and WorldWide Computing
Harvey B. Newman, California Institute of Technology
ACAT2000, Fermilab, October 19, 2000
LHC Vision: Data Grid Hierarchy

[Diagram: the experiment delivers ~PBytes/sec to the Online System, which sends ~100 MBytes/sec to the Tier 0+1 Offline Farm / CERN Computer Centre (> 20 TIPS). Tier 1 centres (France, FNAL, Italy, UK) connect to CERN at ~2.5 Gbits/sec; Tier 2 centres connect at ~0.6-2.5 Gbits/sec (~622 Mbits/sec); Tier 3 institutes (~0.25 TIPS each) connect at 100-1000 Mbits/sec, down to Tier 4 workstations, with a physics data cache along the way.]

- One bunch crossing, with ~17 interactions, every 25 nsecs; 100 triggers per second; each event is ~1 MByte in size
- Physicists work on analysis "channels"; each institute has ~10 physicists working on one or more channels
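As a quick consistency check of the rates on this slide, the trigger rate and event size reproduce the ~100 MBytes/sec figure directly. A minimal sketch in Python; the "accelerator year" of live time is an assumption, not taken from the slide:

```python
# Back-of-the-envelope rates from the Data Grid Hierarchy slide.
event_size_bytes = 1e6        # ~1 MByte per event
trigger_rate_hz = 100         # 100 triggers per second after online selection

raw_rate = event_size_bytes * trigger_rate_hz   # bytes/sec to the offline farm
print(f"Online -> offline farm: {raw_rate / 1e6:.0f} MBytes/sec")  # ~100 MB/s

seconds_per_year = 1e7        # assumed canonical "accelerator year" of live time
yearly_volume = raw_rate * seconds_per_year
print(f"Raw data per year: {yearly_volume / 1e15:.0f} PBytes")     # ~1 PB/year
```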
US-CERN Link BW Requirements Projection (PRELIMINARY)

Year (FY)                              2001    2002    2003    2004    2005    2006
Installed Link BW in Mbps              310     622     1600    2400    4000    6500 [#]
(incl. new SLAC)
Throughput [*] (Mbps)                  (120)   (250)   (400)   (600)   (1000)  (1600)

[#] Includes ~1.5 Gbps each for ATLAS and CMS, plus BaBar, Run2 and other
[*] D0 and CDF at Run2: needs presumed to be comparable to BaBar
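The parenthesized throughput figures run at roughly 25-40% of installed link bandwidth. A minimal sketch computing that ratio, with the values transcribed from the table above:

```python
# Installed US-CERN link bandwidth vs. projected achievable throughput (Mbps),
# FY2001-2006, transcribed from the projection table above.
installed  = {2001: 310, 2002: 622, 2003: 1600, 2004: 2400, 2005: 4000, 2006: 6500}
throughput = {2001: 120, 2002: 250, 2003: 400,  2004: 600,  2005: 1000, 2006: 1600}

for year in installed:
    ratio = throughput[year] / installed[year]
    print(f"FY{year}: {throughput[year]:>4} / {installed[year]:>4} Mbps = {ratio:.0%}")
```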
Grids: The Broader Issues and Requirements

- A New Level of Intersite Cooperation, and Resource Sharing
- Security and Authentication Across World-Region Boundaries
- Start with Cooperation Among Grid Projects (PPDG, GriPhyN, EU DataGrid, etc.)
- Develop Methods for Effective HEP/CS Collaboration in Grid and VDT Design
  - Joint design and prototyping effort, with (iterative) design specifications
- Find an Appropriate Level of Abstraction, Adapted to > 1 Experiment and > 1 Working Environment
- Be Ready to Adapt to the Coming Revolutions in Network, Collaborative, and Internet Information Technologies
PPDG

[Diagram: the Particle Physics Data Grid collaboration, shown as overlapping experiment data management efforts (BaBar, D0, CDF, Nuclear Physics, CMS and Atlas data management) and their user communities, together with the middleware teams and user groups they share: the Globus team and Globus users, the Condor team and Condor users, the SRB team and SRB users, and HENP GC and its users.]
GriPhyN: PetaScale Virtual Data Grids

Build the Foundation for Petascale Virtual Data Grids

[Diagram: layered architecture. Production teams, individual investigators and workgroups use Interactive User Tools on top of Virtual Data Tools, Request Planning & Scheduling Tools, and Request Execution & Management Tools. These rest on common Resource Management Services, Security and Policy Services, and Other Grid Services. Transforms act on the distributed resources (code, storage, computers, and network), fed by the raw data source.]
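The "virtual data" idea behind this architecture is that a request names a data product, and the grid either locates an existing replica or plans the transformations needed to (re)derive it. A minimal illustrative sketch of that lookup-or-derive logic; the catalog contents and names are hypothetical, and this is not GriPhyN's actual API:

```python
# Hypothetical sketch of a virtual-data request: serve an existing replica if
# one is catalogued, otherwise plan the transformation that derives the product.
replica_catalog = {"higgs_esd_v2": ["tier0.cern.ch"]}             # existing replicas
derivations = {"higgs_aod_v2": ("aod_filter", ["higgs_esd_v2"])}  # product -> (transform, inputs)

def resolve(product: str) -> list[str]:
    """Return a plan: either a fetch from a replica site, or derive-then-fetch."""
    if product in replica_catalog:
        return [f"fetch {product} from {replica_catalog[product][0]}"]
    transform, inputs = derivations[product]
    plan = [step for inp in inputs for step in resolve(inp)]
    plan.append(f"run {transform} on {inputs} -> {product}")
    return plan

print(resolve("higgs_aod_v2"))
# ['fetch higgs_esd_v2 from tier0.cern.ch',
#  "run aod_filter on ['higgs_esd_v2'] -> higgs_aod_v2"]
```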
EU-Grid Project Work Packages

Work Package   Work Package Title                        Lead Contractor
WP1            Grid Workload Management                  INFN
WP2            Grid Data Management                      CERN
WP3            Grid Monitoring Services                  PPARC
WP4            Fabric Management                         CERN
WP5            Mass Storage Management                   PPARC
WP6            Integration Testbed                       CNRS
WP7            Network Services                          CNRS
WP8            High Energy Physics Applications          CERN
WP9            Earth Observation Science Applications    ESA
WP10           Biology Science Applications              INFN
WP11           Dissemination and Exploitation            INFN
WP12           Project Management                        CERN
Grid Issues: A Short List of Coming Revolutions

- Network Technologies
  - Wireless Broadband (from ca. 2003)
  - 10 Gigabit Ethernet (from 2002: see www.10gea.org)
  - 10GbE/DWDM-Wavelength (OC-192) integration: OXC
- Internet Information Software Technologies
  - Global Information "Broadcast" Architecture, e.g. the Multipoint Information Distribution Protocol (MIDP; [email protected])
  - Programmable Coordinated Agent Architectures, e.g. Mobile Agent Reactive Spaces (MARS) by Cabri et al., Univ. of Modena
- The "Data Grid" - Human Interface
  - Interactive monitoring and control of Grid resources, by authorized groups and individuals, and by autonomous agents
CA*net 3 National Optical Internet in Canada

[Map: GigaPOPs at Vancouver, Calgary, Regina, Winnipeg, Toronto, Ottawa, Montreal, Fredericton, Charlottetown, Halifax and St. John's, connecting the regional networks BCnet, Netera, SRnet, MRnet, ONet, ORAN, RISQ and ACORN, with the CA*net 3 primary route across Canada and a diverse route via Seattle, Los Angeles, New York and Chicago (STAR TAP). Annotations:]

- Deploying a 4-channel CWDM Gigabit Ethernet network - 400 km
- Deploying a 4-channel Gigabit Ethernet transparent optical DWDM - 1500 km
- Multiple customer-owned and condo dark fiber networks connecting universities and schools; a condo fiber network linking all universities and hospitals
- 16-channel DWDM: 8 wavelengths @ OC-192 reserved for CANARIE, 8 wavelengths for carrier and other customers
- Consortium partners: Bell Nexxia, Nortel, Cisco, JDS Uniphase, Newbridge
CA*net 4 Possible Architecture

[Map: a large-channel WDM system linking Vancouver, Calgary, Regina, Winnipeg, Toronto, Ottawa, Montreal, Fredericton, Charlottetown, Halifax and St. John's, with international connections at Seattle, Los Angeles, Chicago, New York, Miami and Europe. Sites connect via dedicated wavelengths or SONET channels through OBGP switches, with an optional Layer 3 aggregation service.]
OBGP Traffic Engineering - Physical

[Diagram: AS 1 through AS 4 connect through an intermediate ISP (AS 5) to Tier 1 and Tier 2 ISPs, with a router dual-connected to AS 5. Key points:]

- The optical switch looks like a BGP router, so AS 1 is direct-connected to the Tier 1 ISP but still transits AS 5
- The router redirects networks with heavy traffic load to the optical switch, but routing policy is still maintained by the ISP
- The bulk of AS 1's traffic is to the Tier 1 ISP
- Red marks the default wavelength; for simplicity, only data forwarding paths in one direction are shown
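The redirection logic amounts to: watch per-prefix traffic, and when a prefix's load crosses a threshold, move it onto a direct optical path while the ISP's BGP policy stays in force. A minimal illustrative sketch; the threshold, path names and prefixes are hypothetical, not from any OBGP implementation:

```python
# Hypothetical sketch of OBGP-style traffic engineering: prefixes whose measured
# load exceeds a threshold are redirected to a direct optical bypass; everything
# else keeps following the default (routed) wavelength via the intermediate AS.
REDIRECT_THRESHOLD_MBPS = 500  # assumed cutoff for "heavy" traffic

def assign_paths(prefix_load_mbps: dict[str, float]) -> dict[str, str]:
    paths = {}
    for prefix, load in prefix_load_mbps.items():
        if load > REDIRECT_THRESHOLD_MBPS:
            paths[prefix] = "optical-bypass-to-tier1"    # direct lightpath
        else:
            paths[prefix] = "default-wavelength-via-AS5" # normal transit
    return paths

print(assign_paths({"10.1.0.0/16": 800.0, "10.2.0.0/16": 40.0}))
```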
VRVS Remote Collaboration System: Statistics

[Chart: number of machines and people registered in VRVS, by month from Jan-97 through Sep-00 (scale 0-3400).]

- 30 Reflectors, 52 Countries
- Mbone, H.323, MPEG2 Streaming, VNC
VRVS: Mbone/H.323/QT Snapshot
VRVS R&D: Sharing Desktop

VNC technology integrated in the upcoming VRVS release
Worldwide Computing Issues

- Beyond Grid Prototype Components: Integration of Grid Prototypes for End-to-end Data Transport
  - Particle Physics Data Grid (PPDG) ReqM; SAM in D0
  - PPDG/EU DataGrid GDMP for CMS HLT Productions
- Start Building the Grid System(s): Integration with Experiment-specific Software Frameworks
- Derivation of Strategies (MONARC Simulation System)
  - Data caching, query estimation, co-scheduling
  - Load balancing and workload management among Tier0/Tier1/Tier2 sites (SONN by Legrand)
  - Transaction robustness: simulate and verify
- Transparent Interfaces for Replica Management
  - Deep versus shallow copies: thresholds; tracking, monitoring and control (see the sketch below)
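A deep copy physically replicates the data, while a shallow copy just records a reference to an existing replica, with a threshold deciding between them. A minimal illustrative sketch of that distinction; the catalog structure and the 100 MB threshold are hypothetical:

```python
# Hypothetical sketch: replicate a dataset "deeply" (copy the bytes) only when
# it is small enough; otherwise make a "shallow" copy that points at the master.
import shutil
from pathlib import Path

DEEP_COPY_THRESHOLD = 100 * 1024**2  # 100 MB, an assumed policy threshold

def replicate(master: Path, local_dir: Path, catalog: dict[str, str]) -> None:
    if master.stat().st_size <= DEEP_COPY_THRESHOLD:
        dest = local_dir / master.name
        shutil.copy2(master, dest)                   # deep copy: data lives locally
        catalog[master.name] = str(dest)
    else:
        catalog[master.name] = f"remote://{master}"  # shallow copy: reference only

# Usage: replicate(Path("/data/aod_2000.db"), Path("/cache"), catalog={})
```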
Grid Data Management Prototype (GDMP)

Distributed Job Execution and Data Handling. Goals: Transparency, Performance, Security, Fault Tolerance, Automation.

[Diagram: jobs are submitted to Sites A, B and C and executed locally or remotely; each job writes its data locally, and the data is then replicated to the remote sites.]

GDMP V1.1: Caltech + EU DataGrid WP2. Tests by Caltech, CERN, FNAL, Pisa for CMS "HLT" Production 10/2000; integration with ENSTORE, HPSS, Castor.
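The write-locally-then-replicate flow in the diagram can be pictured as a publish loop: a site records newly written files and pushes them to its peer sites. A minimal illustrative sketch of that flow; all class and method names are hypothetical, not GDMP's actual interface:

```python
# Hypothetical sketch of a write-locally / replicate-to-remote-sites flow,
# in the spirit of the GDMP diagram above.
class Site:
    def __init__(self, name: str):
        self.name = name
        self.files: set[str] = set()
        self.subscribers: list["Site"] = []

    def job_writes(self, filename: str) -> None:
        self.files.add(filename)          # data is always written locally first
        self.publish(filename)

    def publish(self, filename: str) -> None:
        for site in self.subscribers:     # then replicated to remote sites
            if filename not in site.files:
                site.files.add(filename)  # stand-in for the actual file transfer
                print(f"replicated {filename}: {self.name} -> {site.name}")

a, b, c = Site("A"), Site("B"), Site("C")
a.subscribers = [b, c]
a.job_writes("hlt_events_001.db")
```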
MONARC Simulation: Physics Analysis at Regional Centres

- Similar data processing jobs are performed in each of several RCs
- There is a profile of jobs, each submitted to a job scheduler
- Each Centre has "TAG" and "AOD" databases replicated
- The Main Centre provides "ESD" and "RAW" data
- Each job processes AOD data, and also a fraction of ESD and RAW data
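A toy version of what such a simulation estimates: a job reads all of its AOD locally, plus fractions of ESD and RAW from the remote Main Centre, so its data-access time is dominated by where the data lives. A minimal illustrative sketch; the sizes, fractions and throughputs are hypothetical, not MONARC parameters:

```python
# Hypothetical toy estimate of one analysis job's data-access time at a
# regional centre, in the spirit of the MONARC simulations.
AOD_GB, ESD_GB, RAW_GB = 10.0, 100.0, 1000.0   # assumed per-job dataset sizes
ESD_FRACTION, RAW_FRACTION = 0.05, 0.01        # assumed fractions actually read
LOCAL_MBPS, WAN_MBPS = 800.0, 150.0            # assumed local vs. WAN throughput

local_gb = AOD_GB                              # TAG/AOD are replicated locally
remote_gb = ESD_GB * ESD_FRACTION + RAW_GB * RAW_FRACTION  # ESD/RAW at Main Centre

time_s = local_gb * 8e3 / LOCAL_MBPS + remote_gb * 8e3 / WAN_MBPS
print(f"~{time_s:.0f} s of data access per job ({remote_gb:.0f} GB over the WAN)")
```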
ORCA Production on CERN/IT-Loaned Event Filter Farm Test Facility

[Diagram: a farm of 140 processing nodes served by Objectivity database servers: 6 servers for Signal (SignalDB); 24 Pile-Up servers in total (PileupDB, backed by HPSS), in groups labelled 17 servers and 9 servers; 2 Output Servers; 2 Lock Servers; and a SUN host; organized as 2 Objectivity federations.]

The strategy is to use many commodity PCs as Database Servers.
Network Traffic & Job Efficiency

[Chart: measured vs. simulated network traffic; mean measured value ~48 MB/s. Job efficiencies: Jet <0.52>, Muon <0.90>.]

From UserFederation To Private Copy

[Diagram from the ORCA 4 tutorial, part II, 14 October 2000: a user federation (UF.boot) and a private federation (MyFED.boot) sharing databases labelled CD, CH, MD, MH, TH, MC and TD, plus a UserCollection, served via AMS.]
Beyond Traditional Architectures: Mobile Agents

"Agents are objects with rules and legs" -- D. Taylor

Mobile Agents: (Semi)-Autonomous, Goal-Driven, Adaptive. They:

- Execute asynchronously
- Reduce network load: local conversations
- Overcome network latency, and some outages
- Are adaptive, robust, and fault tolerant
- Are naturally heterogeneous
- Are an extensible concept: coordinated agent architectures
[Diagram: agents migrating among application and service nodes.]
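A minimal illustrative sketch of the "objects with rules and legs" idea: the agent carries its own state and rule to each node, its conversations with the data stay local, and only a small summary crosses the network. The classes here are hypothetical, not any agent framework's API:

```python
# Hypothetical sketch of a mobile agent: state + rules travel to the data,
# so only the small result (not the raw data) crosses the network.
class Node:
    def __init__(self, name: str, events: list[float]):
        self.name, self.events = name, events

class MonitoringAgent:
    def __init__(self, threshold: float):
        self.threshold = threshold          # the agent's "rule" travels with it
        self.report: dict[str, int] = {}

    def visit(self, node: Node) -> None:
        # All conversations with the data are local to the node being visited.
        self.report[node.name] = sum(1 for e in node.events if e > self.threshold)

agent = MonitoringAgent(threshold=0.9)
for node in [Node("tier1", [0.2, 0.95, 0.99]), Node("tier2", [0.5, 0.91])]:
    agent.visit(node)                       # stand-in for migrating to the node
print(agent.report)                         # only the summary returns home
```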
Coordination Architectures for Mobile Java Agents

- A lot of progress since 1998
- Fourth-Generation Architecture: "Associative Blackboards", after 1) Client/Server, 2) Meeting-Oriented, 3) Blackboards
  - Analogous to CMS ORCA software: observer-based "action on demand"
- MARS: Mobile Agent Reactive Spaces (Cabri et al.); see http://sirio.dsi.unimo.it/MOON
  - Resilient and scalable; simple implementation
  - Works with standard agent implementations (e.g. Aglets: http://www.trl.ibm.co.jp)
  - Data-oriented, to provide temporal and spatial asynchronicity (see Java Spaces, Page Spaces)
  - Programmable, authorized reactions, based on "virtual Tuple spaces"
Mobile Agent Reactive Spaces (MARS) Architecture

- MARS programmed reactions: based on metalevel 4-ples (Reaction, Tuple, Operation-Type, Agent-ID)
- Allows security, policies
- Allows production of tuples on demand

[Diagram: network nodes on the Internet, each with an agent server, a Tuple Space, and a MetaLevel Tuple Space. A: agents arrive; B: they get a reference to the local Tuple Space; C: they access the Tuple Space; D: the Tuple Space reacts, with programmed behavior.]
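A minimal illustrative sketch of a reactive tuple space in this spirit: agents read and write tuples, and a metalevel table of (reaction, tuple-pattern, operation-type, agent-id) entries fires programmed reactions on matching accesses, including producing a tuple on demand. This is a toy, not the MARS implementation:

```python
# Hypothetical toy of a MARS-style reactive tuple space: the metalevel maps
# (tuple-pattern, operation-type, agent-id) to a reaction fired on access.
class ReactiveTupleSpace:
    def __init__(self):
        self.tuples: list[tuple] = []
        self.reactions: list[tuple] = []   # (reaction_fn, pattern, op, agent_id)

    def register(self, reaction, pattern, op, agent_id=None):
        self.reactions.append((reaction, pattern, op, agent_id))

    def _react(self, pattern, op, agent_id):
        for fn, pat, rop, aid in self.reactions:
            if pat == pattern and rop == op and aid in (None, agent_id):
                fn(self, pattern, agent_id)  # programmed behavior at the metalevel

    def write(self, tup, agent_id):
        self.tuples.append(tup)
        self._react(tup[0], "write", agent_id)

    def read(self, key, agent_id):
        self._react(key, "read", agent_id)   # reaction may produce the tuple on demand
        return [t for t in self.tuples if t[0] == key]

def produce_status(space, key, agent_id):
    space.tuples.append((key, f"generated for {agent_id}"))

ts = ReactiveTupleSpace()
ts.register(produce_status, "site-status", "read")  # tuple produced on demand
print(ts.read("site-status", agent_id="agent-42"))
```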
GRIDs In 2000: Summary

- Grids are (in) our Future...
- Let's Get to Work
Grid Data Management Issues

- Data movement and responsibility for updating the Replica Catalog (see the sketch below)
- Metadata update and replica consistency
- Concurrency and locking
- Performance characteristics of replicas
- Advance reservation: policy, time-limit
- How to advertise policy and resource availability
- Pull versus push (strategy; security)
- Fault tolerance; recovery procedures
- Queue management
- Access control, both global and local
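On the first two points: whichever party moves the data must also update the replica catalog under concurrency control, or readers can see a catalog entry with no file behind it (or the reverse). A minimal illustrative sketch using a per-entry lock; all names are hypothetical:

```python
# Hypothetical sketch: the mover holds a per-file lock while it transfers the
# data AND updates the replica catalog, so the two never get out of step.
import threading

replica_catalog: dict[str, list[str]] = {}
catalog_locks: dict[str, threading.Lock] = {}

def register_replica(filename: str, site: str, transfer) -> None:
    lock = catalog_locks.setdefault(filename, threading.Lock())
    with lock:                          # concurrency control on this entry
        transfer(filename, site)        # move the bytes first...
        replica_catalog.setdefault(filename, []).append(site)  # ...then record it

register_replica("aod_2000.db", "tier2.caltech.edu",
                 transfer=lambda f, s: print(f"copying {f} -> {s}"))
print(replica_catalog)
```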