Welcome, Opening Remarks, Goals and Agenda
Transcript of Welcome, Opening Remarks, Goals and Agenda
Towards a Grid-enabled Analysis Environment
Harvey B. Newman, California Institute of Technology
Grid-enabled Analysis Environment Workshop, June 23, 2003
Welcome to Caltech and Our Workshop
Caltech
Logistics
Agenda
Our Staff
GAE Workshop Agenda Overview
Monday Until 3:15 PM: Presentations on Existing Work, Ideas (Lunch at Noon Next to Lauritsen; Breaks at 10:30, 3:15)
3:15 - 6:00  GAE Demonstrations
Tuesday 9:00 – 9:15  PDA JAS Client (A. Anjum, Pakistan)
9:15 – 10:30  Discussion of Workshop Goals and Plan
10:30 – 11:00  BREAK
11:00 – 12:45  Discussion of GAE Architectures for LHC Global Analysis and Remote Working
12:45 – 2:00  LUNCH
2:00 – 4:00  Discussion of Simulating GAE Architectures
4:00 – 4:30  BREAK
4:30 – 5:30  Existing Analysis Software, Grid Prod. Systems; Integration in Candidate Arch.
6:30  Workshop Dinner at the Athenaeum
GAE Workshop Agenda Overview (Cont’d)
Wednesday 9:00 – 10:30  Existing Analysis Software, Grid Prod. Systems; Integration in Arch. (Cont’d)
10:30 – 11:00  BREAK
11:00 – 12:15  Future Activities, Relationships with LHC Experiments, LCG, US-Grid Projects, CrossGrid, etc.
12:15  Conclusions, Comments, Workshop Wrapup
12:30  ADJOURN
LHC Data Grid Hierarchy
[Diagram: a tiered architecture. The Experiment and Online System feed CERN (Tier 0+1: 700k SI95; ~1 PB disk; tape robot) at ~PByte/sec from the detector and ~100-1500 MBytes/sec into the center. Tier 1 centers (FNAL, IN2P3, INFN, RAL) connect at 2.5-10 Gbps; Tier 2 centers connect at ~2.5-10 Gbps; Tier 3 institutes (~0.25 TIPS each, with a physics data cache) at 0.1 to 10 Gbps; Tier 4 workstations below them. Tens of Petabytes by 2007-8; an Exabyte ~5-7 years later. CERN/Outside resource ratio ~1:2; Tier0/Tier1/Tier2 ~1:1:1.]
Emerging Vision: A Richly Structured, Global Dynamic System
Issues For Grid Enabled Analysis (GEA): Introduction (1)
The Problem: How can [experiments with] ~1000 physicists at ~100 Institutes in ~30 countries do “their analysis” effectively: A Balance
Efficient use of resources versus turnaround time
Central control/sharing of distributed resources, versus use of local and regional resources under group and regional control
Proximity of the Jobs to the Data (if enough CPU+Priority), versus data transport to, and use of, more local resources
And many other related issues…
The Problem, and its solution, also apply to “Production”, but the problem is most evident and severe for GEA
A large and diverse community of users
A wide range of tasks with a wide range of priorities; representing large and small groups, and individuals
But all the work has to get done
Which Approach: from Centralized Computing to Peer-to-Peer
For example: Managed, Structured P2P?
Computing Model Progress: CMS Internal Review of Software and Computing
Example Tag (JetMET) Web Services
CAIGEE Draft Architecture
NSF ITR: “Private” Grids & P2P Sub-Communities in Global HEP
L. Bauerdick, FNAL
Develop and build Dynamic Workspaces
Construct Autonomous Communities Within Global Collaborations
Build Private Grids to support scientific analysis communities
e.g. Using Agent-Based Peer-to-peer Web Services
NSF ITR: Globally Enabled Analysis Communities
HENP Major Links: Bandwidth Roadmap (Scenario) in Gbps

Year   Production             Experimental             Remarks
2001   0.155                  0.622-2.5                SONET/SDH
2002   0.622                  2.5                      SONET/SDH; DWDM; GigE Integ.
2003   To 2.5                 To 10                    DWDM; 1 + 10 GigE Integration
2005   10                     2-4 X 10                 Switch; Provisioning
2007   2-4 X 10               ~10 X 10; 40 Gbps        1st Gen. Grids
2009   ~10 X 10 or 1-2 X 40   ~5 X 40 or ~20-50 X 10   40 Gbps Switching
2011   ~5 X 40 or ~20 X 10    ~25 X 40 or ~100 X 10    2nd Gen Grids; Terabit Networks
2013   ~Terabit               ~MultiTbps               ~Fill One Fiber

Continuing the Trend: ~1000 Times Bandwidth Growth Per Decade
GRIDS WILL BECOME MORE DYNAMIC
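The “~1000 times per decade” trend can be sanity-checked against the roadmap’s own Production column. A small calculation (values copied from the table above; the 2013 “~Terabit” entry is read as 1000 Gbps):

```python
# Check the roadmap's "~1000x per decade" trend against the table endpoints.
# Production-column values in Gbps, from the roadmap above.
production = {2001: 0.155, 2003: 2.5, 2013: 1000.0}  # 2013: ~Terabit

def growth_per_decade(y0, y1):
    """Extrapolate the growth factor between two roadmap years to 10 years."""
    factor = production[y1] / production[y0]
    years = y1 - y0
    return factor ** (10.0 / years)

# 2003 -> 2013 spans exactly one decade: 1000 / 2.5 = 400x
print(round(growth_per_decade(2003, 2013)))   # 400
# 2001 -> 2013 extrapolates to over 1000x per decade, matching the claim
print(round(growth_per_decade(2001, 2013)))
```

So the middle of the table grows a bit slower than 1000x per decade, while the early years grow faster; the claim describes the overall trend.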
[email protected]  ARGONNE / CHICAGO
Grid Architecture Layers
Application
Collective: “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services
Resource: “Sharing single resources”: negotiating access, controlling use
Connectivity: “Talking to things”: communication (Internet protocols) & security
Fabric: “Controlling things locally”: access to, & control of, resources
[Shown beside the Internet Protocol Architecture: Application, Transport, Internet, Link]
More info: www.globus.org/research/papers/anatomy.pdf
HENP Data Grids Versus Classical Grids
The original Computational and Data Grid concepts are largely stateless, open systems: known to be scalable
Analogous to the Web
The classical Grid architecture has a number of implicit assumptions
The ability to locate and schedule suitable resources, within a tolerably short time (i.e. resource richness)
Short transactions with relatively simple failure modes
HEP Grids are Data Intensive, and Resource-Constrained
1000s of users competing for resources at 100s of sites
Resource usage governed by local and global policies
Long transactions; some long queues
Need Realtime Monitoring and Tracking
Distributed failure modes
Strategic task management
Upcoming HEP Grid Challenges: Workflow Management and Optimization
Maintaining a Global View of Resources and System State
End-to-end System Monitoring
Adaptive Learning: new paradigms for optimization, problem resolution (eventually automated, in part)
Workflow Management, Balancing Policy Versus Moment-to-moment Capability to Complete Tasks
Balance High Levels of Usage of Limited Resources Against Better Turnaround Times for Priority Jobs
Realtime Error Detection, Propagation; Recovery
Handling User-Grid Interactions: Guidelines
Higher Level Services, and an Integrated User Environment for the Above
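One simple way to picture the usage-versus-turnaround balance described above (purely illustrative, not a mechanism from the talk) is a scheduler with priority aging: low-priority bulk jobs keep the limited resources fully used, but their effective priority rises with waiting time, so urgent jobs still jump ahead without starving the rest.

```python
import itertools

# Illustrative aging scheduler: effective priority = base + rate * wait time.
class AgingScheduler:
    def __init__(self, aging_rate=0.1):
        self.aging_rate = aging_rate  # priority gained per unit waiting time
        self.jobs = []                # (submit_time, priority, tiebreak, name)
        self._tie = itertools.count()

    def submit(self, name, priority, now):
        self.jobs.append((now, priority, next(self._tie), name))

    def next_job(self, now):
        """Pick the job with the highest effective priority right now."""
        best = max(self.jobs,
                   key=lambda j: j[1] + self.aging_rate * (now - j[0]))
        self.jobs.remove(best)
        return best[3]

sched = AgingScheduler(aging_rate=0.1)
sched.submit("bulk-reco", priority=1, now=0)    # low priority, submitted early
sched.submit("urgent-fit", priority=5, now=30)  # high priority, submitted late
print(sched.next_job(now=31))   # urgent-fit: 5 + 0.1 beats 1 + 3.1
print(sched.next_job(now=31))   # bulk-reco runs next, its wait not wasted
```

The aging rate is the policy knob: at zero, priority jobs always win; made large, the system approaches fair first-come-first-served.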
HENP Grid Architecture: Layers Above the Collective Layer
Physicists’ Applications: Reconstruction, Calibration, Analysis; Code Development
Experiments’ Software Framework Layer: Modular and Grid-aware; architecture able to interact effectively with the lower layers
Grid Applications Layer: Time-Dependent Metrics and Methods (parameters and algorithms that govern system operations): policy and priority; workflow management; task placement; overall performance: STEERING methods
Global End-to-End System Services Layer: Mechanisms: monitoring and tracking; workflow monitoring; error recovery and redirection; system self-monitoring, evaluation and optimisation
The Move to OGSA and Stateful, Managed Systems
[Diagram: evolution over time, with increasing functionality and standardization:
Custom solutions, built on X.509, LDAP, FTP, …
Globus Toolkit: de facto standards; GGF: GridFTP, GSI
Web services + Open Grid Services Arch; GGF: OGSI, … (+ OASIS, W3C); multiple implementations, including Globus Toolkit
App-specific Services / ~Integrated Systems: Stateful; Managed]
OGSA Example: Reliable File Transfer Service
[Diagram: a Grid service whose service data elements expose the transfer’s Performance, Policy, Faults, Pending and Internal State; a Notification Source, policy interfaces, and Fault and Performance Monitors surround the File Transfer state. Clients query and/or subscribe to the service data, and request & manage file transfer operations; the service carries out the data transfer operations.]
A standard substrate: the Grid service
Standard interfaces & behaviors, to address key distributed system issues
Refactoring, extension of Globus Toolkit protocol suite
Example “COG” Computing Environment: Basic, Advanced and Commodity Services
G. Von Laszewski et al., GC2002: Session and Execution Environment Concepts
Building an Architecture Requires:
Components that Communicate and Interwork
Defining How the Components are Integrated
Interfaces
Cooperative Behaviors
A System Concept
Analysis Desktop; Applications & Frameworks
Advanced Grid Service Sessions
Managed Grid Services
Basic Grid Services
Dynamic Distributed Services Architecture (DDSA): Caltech/Romania/Pakistan
“Station Server” services-engines at sites host “Dynamic Services”
Auto-discovering, Collaborative Service Agents: Goal-Oriented, Autonomous, Adaptive
Servers interconnect dynamically; form a robust fabric in which mobile agents travel, with a payload of (analysis) tasks
Event notification, subscription
Adaptable to Web services: OGSA; many platforms. Also mobile working environments
[Diagram: Station Servers interconnected by Proxy Exchange and Remote Notification links; each registers with Lookup Services through Registration and Service Listeners, and finds peers via a Lookup Discovery Service.]
By I. Legrand et al. Deployed on US CMS Grid, + an increasing number of sites
Agent-based dynamic information / resource discovery mechanism
Talks with other monitoring systems
Implemented in Java/Jini; SNMP; WSDL / SOAP with UDDI
Part of a Global Grid Control Room Service
MonALISA: A Globally Scalable Grid Monitoring System
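The registration/lookup cycle in the diagram follows the Jini pattern: servers register self-describing service entries with a lookup service, and peers discover each other by attribute matching. A minimal Python sketch (MonALISA itself is Java/Jini; `LookupService` and the attribute scheme here are illustrative):

```python
# Illustrative Jini-style registration and attribute-based discovery.
class LookupService:
    def __init__(self):
        self.registry = []   # registered, self-describing service entries

    def register(self, description):
        self.registry.append(description)

    def discover(self, **attrs):
        """Return services whose description matches all requested attributes."""
        return [d for d in self.registry
                if all(d.get(k) == v for k, v in attrs.items())]

lookup = LookupService()
# Each Station Server registers the dynamic services it hosts.
lookup.register({"site": "Caltech", "type": "monitor", "proto": "WSDL/SOAP"})
lookup.register({"site": "CERN",    "type": "monitor", "proto": "SNMP"})
lookup.register({"site": "NUST",    "type": "executor"})

monitors = lookup.discover(type="monitor")
print([m["site"] for m in monitors])   # ['Caltech', 'CERN']
```

In the real fabric the lookup services are themselves replicated and discovered by multicast, so no single registry is a point of failure.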
http://www.naradabrokering.org  gcf,spallick,[email protected]
P2P Narada Broker Network
[Diagram: (P2P) communities of clients, plus Database and Resource nodes, connected through a network of cooperating Brokers; software multicast carries the message/events service.]
G. Fox et al., GC2002, Ch. 22
Services-Oriented Architecture: Service Contracts
A Grid Market Economy: Procure and/or Trade Resources; Service Contracts
Service Owners: Institutional Context
Service Contracts Negotiated in Marketplaces, Each with Its Own Rules, Set by Marketplace Owners (VOs)
Renegotiation Possible at Contract Creation Time
Track Service: Oversee Satisfaction of Contract
D. De Roure et al., The Semantic Grid, GC 2002 Ch. 17
GAE Issues: A Computing Model; Key Role of Simulation and Prototyping
Need a “Computing Model”: Common Aspects; By Experiment
A complete picture of what should happen: All Tasks, Policies and Priorities, Performance Targets, Corrective Actions
Answer these Questions (Among Others):
1. “How many jobs using how much CPU and how much data are accessed by how many physicists at how many locations, how often, and how?” [Specify the Tasks and Their Profiles]
2. “What Performance Can Users Expect? What’s Normal?” [Specify Turnaround Time Profiles]
3. “What happens when it doesn’t work; what do you (or the system) do then?” [Specify Corrective Actions: Strategies & Methodology]
Therefore: A Key Role of Modeling and Simulation
MONARC/SONN: 3 Regional Centres Learning to Export Jobs (Day 9)
[Diagram: three regional centres (NUST, 20 CPUs; CERN, 30 CPUs; CALTECH, 25 CPUs) linked by 1 MB/s (150 ms RTT), 1.2 MB/s (150 ms RTT) and 0.8 MB/s (200 ms RTT) paths. At Day 9 the efficiencies are <E> = 0.73, 0.66 and 0.83.]
Model Higher Level Services
Simulations for Strategy and HLS Development
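A much-simplified sketch of what such a simulation computes: there is no self-organizing neural network here, just a least-loaded-per-CPU export rule, with the CPU counts from the figure. The routing policy and the workload are illustrative.

```python
import random

# Toy job-export simulation across three regional centres.
random.seed(9)
centres = {"NUST": 20, "CERN": 30, "CALTECH": 25}   # CPUs, as in the figure
queued = {name: 0 for name in centres}
exported = {name: 0 for name in centres}

def submit(origin):
    # Export rule: route to the centre with the fewest queued jobs per CPU.
    target = min(centres, key=lambda c: queued[c] / centres[c])
    queued[target] += 1
    if target != origin:
        exported[origin] += 1

# Jobs arrive at random centres; the rule spreads them across the fabric.
for _ in range(300):
    submit(random.choice(list(centres)))

print(queued)   # load ends up roughly proportional to each centre's CPUs
```

The real MONARC/SONN study replaces this fixed rule with a learned one that also weighs link bandwidth and RTT, which is why the three centres reach different efficiencies.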
GAE Workshop Goals (1)
“Getting Our Arms Around” the Grid-Enabled Analysis “Problem”
Review Existing Work Towards a GAE: Components, Interfaces, System Concepts
Review Client Analysis Tools; Consider How to Integrate Them
User Interfaces: What does the GAE Desktop Look Like? (Different Flavors)
Look At Requirements, Ideas for a GAE Architecture
A Vision of the System’s Goals and Workings
Attention to Strategy and Policy
Develop (Continue) a Program of Simulations of the System
For the Computing Model, and Defining the GAE
Essential for Developing a Feasible Vision; Developing Strategies, Solving Problems and Optimizing the System
With a Complementary Program of Prototyping
GAE Collaboration Desktop Example
Four-screen Analysis Desktop: 4 Flat Panels, 5120 x 1024; RH9
Driven by a single server and single graphics card
Allows simultaneous work on:
Traditional analysis tools (e.g. ROOT)
Software development
Event displays (e.g. IGUANA)
MonALISA monitoring displays; Other “Grid Views”
Job-progress Views
Persistent collaboration (e.g. VRVS; shared windows)
Online event or detector monitoring
Web browsing, email
GAE Workshop Goals (2)
Architectural Approaches: Choose A Feasible Direction
For example a Managed Services Architecture
Be Prepared to Learn by Doing; Simulating and Prototyping
Where to Start, and the Development Strategy
Existing and Missing Parts of the System [Layers; Concepts]
When to Adapt Existing Components, Or to Re-Build Them “from Scratch”
Manpower Available to Meet the Goals; Shortfalls
Allocation of Tasks; Including Generating a Plan
Linkage Between Analysis and Grid-Enabled Production
Planning for Closer Relationship with LCG, Trillium, and the Experiments’ starting Efforts in this area
Computing Model Progress: CMS Internal Review of Software and Computing
Some Extra Slides Follow
HENP Grids: Services Architecture Design for a Global System
Self-Discovering, Cooperative
Registered Services, Lookup Services; self-describing
“Spaces” for Mobile Code and Parameters
Scalable and Robust
Multi-threaded: with a thread-pool managing engine
Loosely Coupled: errors in a thread don’t stop the task
Stateful: System State as well as task state
Rich set of “problem” situations: implies Grid Views, and User/System Dialogues on what to do
For Example: Raise Priority (Burn Quota); or Redirect Work
Eventually may be increasingly automated as we scale up and gain experience
Managed; to deal with a Complex Execution Environment
Real time higher-level supervisory services monitor, track, optimize and Revive/Restart services as needed
Policy and strategy-driven; Self-Evaluating and Optimizing
Investable with increasing intelligence
Agent Based; Evolutionary Learning Algorithms
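The “loosely coupled” property above (an error in one thread does not stop the task) can be sketched with a standard thread pool: each sub-task’s failure is captured in its Future, and a supervisory layer can retry or redirect the failed piece. The task names and failure mode are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor

# Each chunk is processed in its own pool thread; one failure is isolated.
def process(chunk):
    if chunk == "bad":
        raise RuntimeError("site unreachable")
    return f"{chunk}: done"

chunks = ["a", "bad", "c"]
results, failures = [], []
with ThreadPoolExecutor(max_workers=3) as pool:
    futures = {pool.submit(process, ch): ch for ch in chunks}
    for fut, ch in futures.items():
        try:
            results.append(fut.result())   # re-raises the worker's exception
        except RuntimeError:
            failures.append(ch)            # supervisor could revive/redirect

print(results)    # ['a: done', 'c: done']
print(failures)   # ['bad']
```

The failed chunk stays visible as state, which is what lets a higher-level supervisory service restart or redirect it rather than abandoning the whole task.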
Building a Computing Model and an Analysis Strategy (I)
Generate a Blueprint: A “Computing Model”
Tasks: Workload, Facilities, Priorities & GOALS
Persistency; Modes of Accessing Data (e.g. Object Collections)
What runs where; when to redirect
The User’s Working Environment
What is normal (managing expectations)?
Guidelines for dealing with problems: based on which information?
Performance and problem reporting/tracking/handling?
Known Problems: Strategies to deal with those
Set up, code a Simulation of the Model
Develop mechanisms and sub-models as needed
Set up prototypes to measure the performance parameters where not already known to sufficient precision
Building a Computing Model and an Analysis Strategy (II)
Run simulations (avatars for “actors”; agents; tasks; mechanisms)
Analyze and evaluate performance
General performance (throughput; turnaround)
Ensure “all” work is done: learn how to do this within a reasonable time, compatible with the Collaboration’s guidelines
Vary the Model to Improve Performance
Deal with bottlenecks and other problems
New strategies and/or mechanisms to manage workflow
Represent key features and behaviors, for example:
Responses to Link or Site failures
User input to redirect data or jobs
Monitoring information gathering
Monitoring and management agent actions and behaviors in a variety of situations
Validate the Model
Using Dedicated setups
Using Data Challenges (measure, evaluate, compare; fix key items)
Learn of new factors and/or behaviors to take into account
Building a Computing Model and an Analysis Strategy (III)
MAJOR Milestone: Obtain a first picture of a Model that Seems to Work
This may or may not involve changes in the computing resource requirements-estimates, or Collaboration policies and expectations
It is hard to estimate how long it will take to reach this milestone [most experiments until now have reached it after the start of data taking]
Evolve the Model to:
Distinguish what works and what does not
Incorporate evolving site hardware and network performance
Progressively incorporate new and “better” strategies, to improve throughput and/or turnarounds, or fix critical problems
Take into account experience with the actual software-system components as they develop
In parallel with the Model evolution, keep developing the overall data analysis + Grid + monitoring “system”; represent it in the simulation
And the associated strategies
NaradaBrokering: Based on a network of cooperating broker nodes
• Cluster-based architecture allows the system to scale to arbitrary size
Originally built to provide uniform software multicast supporting real-time collaboration, linked to publish-subscribe for asynchronous systems.
Now has four major core functions:
• Message transport (based on performance) in multi-link fashion
• General publish-subscribe, including JMS & JXTA
• Support for RTP-based audio/video conferencing
• Federation of multiple instances of Grid services (just starting)
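The broker’s core role reduces to topic-based publish-subscribe. A single-node toy sketch (the real system is a distributed Java broker network; the prefix-matching rule here stands in for its richer subscription formats):

```python
import time

# Toy event broker: an event is a time-stamped message routed to subscribers.
class Broker:
    def __init__(self):
        self.subscriptions = []   # (topic prefix, callback)

    def subscribe(self, topic, callback):
        self.subscriptions.append((topic, callback))

    def publish(self, topic, payload):
        event = {"topic": topic, "time": time.time(), "payload": payload}
        for sub_topic, callback in self.subscriptions:
            if topic.startswith(sub_topic):   # simple prefix matching
                callback(event)

broker = Broker()
seen = []
broker.subscribe("cms/monitor", lambda e: seen.append(e["payload"]))
broker.publish("cms/monitor/cpu", "load=0.7")   # delivered
broker.publish("atlas/monitor", "load=0.2")     # no matching subscription
print(seen)   # ['load=0.7']
```

Because routing depends only on the event and the subscription, not on the application, many applications can share the same broker fabric, which is the design point the slide makes.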
Role of Event/Message Brokers
We will use events and messages interchangeably
• An event is a time-stamped message
Our systems are built from clients, servers and “event brokers”
• These are logical functions: a given computer can have one or more of these functions
• In P2P networks, computers typically multifunction; in Grids one tends to have separate-function computers
• Event Brokers “just” provide message/event services; servers provide traditional distributed-object services as Web services
There are functionalities that depend only on the event itself and perhaps the data format; they do not depend on details of the application and can be shared among several applications
• NaradaBrokering is designed to provide these functionalities
Why P2P? Core features
• Resource Sharing & Discovery
  CPU cycles: SETI@home, Folding@HOME
  File Sharing: Napster, Gnutella
Deployments are user driven
• No dedicated management
Management of resources
• Expose resources & specify security strategy
• Replicate resources based on demand
Dynamic peer groups, fluid group memberships
Sophisticated search mechanisms
• Peers respond to queries based on their interpretations
• Responses do not conform to traditional templates
What are the downsides?
Interactions are attenuated
• Localized
• Fragmented world of multiple P2P subsystems
Routing is not very sophisticated
• Inefficient network utilization (Tragedy of the Commons)
• Simple forwarding
• Peer Traces (to eliminate echoing)
• Attenuations (to suppress propagation)
• TTLs associated with interactions
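The two damping mechanisms listed, traces to eliminate echoing and a TTL to attenuate propagation, can be shown in a toy flood-forwarding sketch (the topology and peer names are invented):

```python
# Toy P2P query flooding with a per-query trace and TTL.
neighbors = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}

def flood(origin, ttl):
    """Forward a query from origin; return every peer it reaches."""
    reached = set()
    frontier = [(origin, ttl, {origin})]
    while frontier:
        peer, hops, trace = frontier.pop()
        reached.add(peer)
        if hops == 0:
            continue   # TTL expired: attenuation suppresses propagation
        for nxt in neighbors[peer]:
            if nxt not in trace:   # trace check: don't echo back to a visited peer
                frontier.append((nxt, hops - 1, trace | {nxt}))
    return reached

print(sorted(flood("A", ttl=1)))   # ['A', 'B', 'C']
print(sorted(flood("A", ttl=2)))   # ['A', 'B', 'C', 'D']
```

Even with both mechanisms, each query still touches every peer within the TTL horizon, which is the inefficiency the slide is pointing at.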
Narada-JXTA events
[Diagram: three event layouts, for requests/responses to be part of a certain peer group:
(a) Peer Advertisement Request: Narada Headers; JXTA Interaction Type; Narada Connection Info; Peer group id; Peer Advertisement; Narada Event Distribution Traces.
(b) Peer Advertisement Response: Narada Headers; JXTA Interaction Type; Narada Connection Info; Peer group id; Peer id; Peer Advertisement; Narada Event Distribution Traces.
(c) After Peer Advertisement Response: Narada Headers; JXTA Interaction Type (Subscription); Peer group id; Peer id; Narada Event Distribution Traces.]
NaradaBrokering Results (G. Fox et al.)
[Plot: mean transit delay (0-450 milliseconds) between publisher and subscriber, versus publish rate (0-1000 events/sec) and event size (0-500 bytes), for a network of 22 brokers and 102 clients at match rates of 100%, 50% and 10%.]
Services-Oriented Architecture: Key Functions of Components
D. De Roure et al., The Semantic Grid, GC 2002 Ch. 17
Grid Enabled Analysis (GEA) CrossGrid “Componentology”Grid Enabled Analysis (GEA) CrossGrid “Componentology”
Specify, Study, Iterate: Evolve with Specify, Study, Iterate: Evolve with Experience and Advancing Technologies Experience and Advancing Technologies
Workload: Actors (Tasks) Workload: Actors (Tasks) Job & Data Transport Profiles, Frequency Job & Data Transport Profiles, Frequency Site components and architectures: Performance vs. Load; Failure ModesSite components and architectures: Performance vs. Load; Failure Modes Data Structures, Streams, and Access Methods: (Sub-collections)Data Structures, Streams, and Access Methods: (Sub-collections) Networks: scale, operations, main behaviorsNetworks: scale, operations, main behaviors Operational Modes (Develop a Common Understanding ?)Operational Modes (Develop a Common Understanding ?)
e.g. What are the guidelines and steps that make up the data access/processing/analysis policy and strategy (the GEA)?
e.g. What should the user do under different situations? If we have automated services to help users: what do they do?
What are the technical goals + emphasis of the system? How is it intended to be used by the Collaboration?
High-Level Software Services architecture: adaptive, partly autonomous, e.g. agent-based. How, and how much, should they steer the system to get the work done?
Note: Common services among experiments imply some similar operational modes.
The LHC Distributed Computing Model: Getting Started
2001 Transatlantic Net WG Bandwidth Requirements [*] (Mbps)

Experiment     2001    2002    2003    2004    2005    2006
CMS             100     200     300     600     800    2500
ATLAS            50     100     300     600     800    2500
BaBar           300     600    1100    1600    2300    3000
CDF             100     300     400    2000    3000    6000
D0              400    1600    2400    3200    6400    8000
BTeV             20      40     100     200     300     500
DESY            100     180     210     240     270     300
CERN BW     155-310     622    2500    5000   10000   20000
[*] See http://gate.hep.anl.gov/lprice/TAN. The 2001 LHC requirements outlook now looks very conservative in 2003.
FAST TCP: Aggregate Throughput

Flows                   1      2      7      9     10
Average utilization    95%    92%    90%    90%    88%

Measurements with standard packet size; utilization averaged over > 1 hr on a 3000 km path.
RTT estimation: fine-grain timer
Fast convergence to equilibrium
Delay monitoring in equilibrium
Pacing: reducing burstiness
Now working towards 10 Gbps in ~2 Flows
On Feb. 27-28, a Terabyte of data was transferred in 3700 seconds by S. Ravot of Caltech between the Level3 PoP in Sunnyvale near SLAC and CERN, through the TeraGrid router at StarLight, from memory to memory, as a single TCP/IP stream at an average rate of 2.38 Gbps (using large windows and 9 kB "Jumbo frames"). This beat the former record by a factor of ~2.5, and used the US-CERN link at 99% efficiency.
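A quick back-of-envelope check of the quoted rate (a sketch, not from the talk; it assumes the "Terabyte" was a binary terabyte, 2^40 bytes):

```python
# Sanity check of the Feb. 2003 record rate.
# Assumption (not stated in the slide): 1 Terabyte = 2**40 bytes.
bytes_moved = 2**40               # ~1.1e12 bytes
seconds = 3700
gbps = bytes_moved * 8 / seconds / 1e9
print(f"{gbps:.2f} Gbps")         # ~2.38 Gbps, consistent with the quoted average
```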
10GigE Data Transfer Trial
European Commission
10GigE NIC

UltraLight: An Ultra-scale Optical Network Laboratory for Next Generation Science
Caltech, UF, FIU, UMich, SLAC, FNAL, MIT/Haystack, CERN, UERJ (Rio), NLR, CENIC, UCAID, Translight, UKLight, Netherlight, UvA, UCLondon, KEK, Taiwan, Cisco, Level(3)
[Figure: UltraLight layered architecture. Flagship Applications (HENP, VLBI, Oncology, ...) sit atop Application Frameworks and Grid Middleware; below them, Grid/Storage Management, Network Protocols & Bandwidth Management, Distributed CPU & Storage, and the Network Fabric. End-to-end Monitoring and Intelligent Agents span the layers.]
http://ultralight.caltech.edu
GECSR Proposal: A Grid-Enabled Collaboratory for Scientific Research
Create a Persistent, Data-Intensive Collaboratory for Analysis by Global HENP collaborations; built to be applicable to a wide range of other large-scale science projects
"Giving scientists from all world regions the means to function as full partners in the process of search and discovery"
Customization for discipline- and project-specific applications
Community-Specific Knowledge Environments for Research and Ed. (collaboratory, grid community, e-science community, virtual community):
(1) High-performance computation services
(2) Data, information and knowledge management services
(3) Observation, measurement and fabrication services
(4) Interfaces and visualization services
(5) Collaborative services
Built on: Networking, Operating systems, Middleware
Base Technology: computation, storage, communication
CMS/MONARC Analysis Model
Hierarchy of Processes (Experiment, Analysis Groups, Individuals)
Selection: iterative selection, once per month. ~20 Groups' Activity (10^9 -> 10^7 events); trigger-based and physics-based refinements. 25 SI95 sec/event; ~20 jobs per month.

Analysis: different physics cuts & MC comparison, ~once per day. ~25 Individuals per Group Activity (10^6 - 10^8 events); algorithms applied to data to get results. 10 SI95 sec/event; ~500 jobs per day.

Monte Carlo: 5k SI95 sec/event.

RAW Data Reconstruction: Experiment-wide activity (10^9 events); new detector calibrations or understanding. Reconstruction: 3000 SI95 sec/event, 1 job per year. Re-processing 3 times per year: 3000 SI95 sec/event, 3 jobs per year.
CMS CPU and Storage Total
TOTAL Active tape for CMS: Tier0/1 CERN + 5 Tier1 = 1540 + 5 x 590 = 4490 TB
TOTAL Archive tape for CMS: Tier0/1 CERN + 5 Tier1 + 25 Tier2 = 2632 + 5 x 433 + 25 x 50 = 6047 TB
TOTAL Tape for CMS: Tier0/1 CERN + 5 Tier1 + 25 Tier2 = 4172 + 5 x 1023 + 25 x 50 = 10537 TB
TOTAL Disk for CMS: Tier0/1 CERN + 5 Tier1 + 25 Tier2 = 796 + 5 x 313 + 25 x 70 = 4111 TB
TOTAL CPU for CMS: Tier0/1 CERN + 5 Tier1 + 25 Tier2 = 615 + 5 x 167 + 25 x 32 = 2250 kSI95
[Where 1 PC (ca. 2000) ~ 25 SI95]
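The totals follow directly from the per-tier figures; a minimal sketch that re-checks the arithmetic, with the tier counts and per-tier numbers taken from the slide above:

```python
# Re-check the CMS resource totals: Tier0/1 at CERN + 5 Tier1 (+ 25 Tier2).
def cms_total(cern, tier1, tier2=0):
    return cern + 5 * tier1 + 25 * tier2

assert cms_total(1540, 590) == 4490        # active tape, TB
assert cms_total(2632, 433, 50) == 6047    # archive tape, TB
assert cms_total(4172, 1023, 50) == 10537  # total tape, TB
assert cms_total(796, 313, 70) == 4111     # disk, TB
assert cms_total(615, 167, 32) == 2250     # CPU, kSI95
print("all totals check out")
```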
HENP Lambda Grids: Fibers for Physics
Problem: Extract "Small" Data Subsets of 1 to 100 Terabytes from 1 to 1000 Petabyte Data Stores
Survivability of the HENP Global Grid System, with hundreds of such transactions per day (circa 2007), requires that each transaction be completed in a relatively short time.
Example: take 800 secs to complete the transaction. Then:

Transaction Size (TB)    Net Throughput (Gbps)
          1                       10
         10                      100
        100                     1000 (capacity of fiber today)

Summary: providing switching of 10 Gbps wavelengths within ~3-5 years, and Terabit switching within 5-8 years, would enable "Petascale Grids with Terabyte transactions", as required to fully realize the discovery potential of major HENP programs, as well as other data-intensive fields.
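The throughput figures in the table above are just transaction size divided by the 800-second window; a small sketch, assuming decimal terabytes (1 TB = 10^12 bytes):

```python
# Net throughput (Gbps) needed to move a data subset of size_tb terabytes
# within a fixed transaction window (default 800 seconds, as in the slide).
def required_gbps(size_tb, window_s=800):
    return size_tb * 1e12 * 8 / window_s / 1e9

for size in (1, 10, 100):
    print(f"{size:>4} TB -> {required_gbps(size):,.0f} Gbps")
# 1 TB -> 10 Gbps; 10 TB -> 100 Gbps; 100 TB -> 1,000 Gbps (a full fiber today)
```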
PPDG Past, Present and Future: Outline
HENP Challenges: Science Drivers of Data-Intensive Grid Systems
Progress in Grids for Physics: 1999-2003
Key Roles of the Particle Physics Data Grid: Mission, Focus, Accomplishments
The Coming Generation Change
HENP Grids: Global End-to-end Managed System Architecture
OGSA: Transition to a stateful services architecture, appropriate for systems of this complexity
Rapid Advances in Networks
Future Vision: Dynamic, PetaScale Grids with Terabyte transactions
Computing Model Progress: CMS Internal Review of Software and Computing
Layered Grid Architecture
Application
Collective: "Coordinating multiple resources" (ubiquitous infrastructure services, app-specific distributed services)
Resource: "Sharing single resources" (negotiating access, controlling use)
Connectivity: "Talking to things" (communication via Internet protocols, and security)
Fabric: "Controlling things locally" (access to, and control of, resources)

[Shown alongside the Internet Protocol Architecture: Application; Internet/Transport; Link]
“The Anatomy of the Grid: Enabling Scalable Virtual Organizations”, Foster, Kesselman, Tuecke, Intl J. High Performance Computing Applications, 15(3), 2001.