Rick Cavanaugh, University of Florida
CHEP06, Mumbai, 13 February 2006
An Ultrascale Information Facility for Data Intensive Research
13.02.2006 R. Cavanaugh, CHEP06, Mumbai, India 2
The UltraLight Collaboration
• California Institute of Technology
• University of Michigan
• University of Florida
• Florida International University
• Internet2
• Fermilab
• Brookhaven
• SLAC
• University of California, San Diego
• Massachusetts Institute of Technology
• Boston University
• University of California, Riverside
• UCAID
The Project
• UltraLight is
  – A four-year, $2M NSF ITR funded by MPS
  – Application-driven network R&D
• Two primary, synergistic activities
  – Network "Backbone": perform network R&D / engineering
  – Applications "Driver": system services R&D / engineering
• Ultimate goal: enable physics analyses and discoveries which could not otherwise be achieved
The Motivation
The ability to rapidly transport large datasets will strongly impact computing models:
– Datasets (used for analysis) no longer need to be pinned for long periods
– Storage Elements (SEs) are more willing to grant greater temporary storage
– Enhanced opportunistic use of volatile (non-VO-controlled) resources
– Particularly important in resource-oversubscribed environments
A New Class of Integrated Information Systems (see talk by S. McKee)
• Expose the network as an actively managed resource
• Based on a "hybrid" packet- and circuit-switched optical network infrastructure
  – Ultrascale protocols (e.g. FAST) and dynamic optical paths
• Monitor, manage, and optimize resources in real time
  – Using a set of agent-based intelligent global services
• Leverages already-existing and developing software infrastructure in round-the-clock operation
  – MonALISA, GAE/Clarens, OSG
• Exceptional support from
  – Industry: Cisco & Calient
  – The research community: NLR, CENIC, Internet2/Abilene, ESnet
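The "ultrascale protocols" bullet refers to FAST TCP, which paces its congestion window from measured queueing delay rather than packet loss. A rough sketch of the published FAST window update rule follows; the parameter values are illustrative defaults, not the project's tuned deployment:

```python
def fast_window_update(w, base_rtt, rtt, alpha=200.0, gamma=0.5):
    """One FAST TCP congestion-window update step.

    w        -- current window size (packets)
    base_rtt -- minimum RTT observed (propagation-delay estimate)
    rtt      -- current measured RTT (propagation + queueing delay)
    alpha    -- target number of this flow's packets queued in the network
    gamma    -- smoothing factor in (0, 1]
    """
    # Equilibrium target: shrink w by the queueing-delay ratio, then add alpha.
    target = (base_rtt / rtt) * w + alpha
    # Smooth toward the target; never more than double in one step.
    return min(2 * w, (1 - gamma) * w + gamma * target)

# With empty queues (rtt == base_rtt) the window grows toward w + alpha;
# as queueing delay builds, base_rtt/rtt < 1 and growth slows, settling
# where w * (1 - base_rtt/rtt) == alpha.
```

Because the controller reacts to delay rather than loss, it can hold a long fat pipe near full utilization without the deep sawtooth oscillations of standard loss-based TCP.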
UltraLight Activities. Teams: physicists, computer scientists, network engineers
• High Energy Physics Application Services
  – Integrate and develop physics applications into the UltraLight fabric: production codes, Grid-enabled analysis, user interfaces to the fabric
• Global System Services
  – Critical "upperware" software components in the UltraLight fabric: monitoring, scheduling, agent-based services, etc.
• Network Engineering
  – Routing, switching, dynamic path construction, operations, management
• Testbed Deployment and Operations
  – Including optical network, compute cluster, storage, kernel, and UltraLight system software configurations
• Education and Outreach
Project Structure
• Steering Committee: overall project guidance
• Management Team: day-to-day internal and external coordination
• Technical Groups (day-to-day activities and operations): Network, Applications, Education & Outreach, User Community
• External projects: ATLAS, CMS, DISUN, LCG, OSG, TeraPaths, CHEPREO, AMPATH, KyaTera, …
• External peering partners: NLR, ESnet, USNet, LHCNet, HOPI, TeraGrid, Pacific Wave, WIDE, AARNet, Brazil/HEPGrid, CA*net4, GLORIAD, IEEAF, JGN2, KEK, Korea, NetherLight, TIFR, UKLight/ESLEA, …
Main Science Driver: The LHC
[Figure: the LHC trigger and data-flow chain. The Level-1 trigger (hardwired ASIC/FPGA processors, pipelined and massively parallel) acts at 25 ns; High-Level Triggers (farms of processors) at roughly 3 µs; off-line reconstruction and analysis at Tier-0/1/2 centers spans milliseconds to years. Data rates run from gigabit through terabit to petabit scales. Example event: H → ZZ → 4ℓ.]
Main Science Driver: The LHC (continued)
New physics searches imply multi-terabyte-scale datasets! Requests come from multiple users, for multiple types of data, multiple times. Individual TB-scale transactions should finish in minutes to hours, rather than hours to days.
Evolving Science Requirements for Networks (DOE High-Performance Network Workshop):

| Science Area | 2005 End2End Throughput | 5 Years End2End Throughput | 5-10 Years End2End Throughput | Remarks |
|---|---|---|---|---|
| High Energy Physics | 0.5 Gb/s | 100 Gb/s | 1000 Gb/s | High bulk throughput |
| Climate (Data & Computation) | 0.5 Gb/s | 160-200 Gb/s | N x 1000 Gb/s | High bulk throughput |
| SNS NanoScience | Not yet started | 1 Gb/s | 1000 Gb/s + QoS for control channel | Remote control and time-critical throughput |
| Fusion Energy | 0.066 Gb/s (500 MB/s burst) | 0.198 Gb/s (500 MB per 20 s burst) | N x 1000 Gb/s | Time-critical throughput |
| Astrophysics | 0.013 Gb/s (1 TByte/week) | N*N multicast | 1000 Gb/s | Computational steering and collaborations |
| Genomics Data & Computation | 0.091 Gb/s (1 TByte/day) | 100s of users | 1000 Gb/s + QoS for control channel | High throughput and steering |

(Slide taken from H. Newman)
Ever-Increasing Network Flows
• Amsterdam Internet Exchange Point
• ESnet total traffic, Jan 2006: 120+ Gbits/sec; should now be at the petabyte/month scale
• These two examples are representative of the trend in research and education networks worldwide
• ATLAS/CMS data flows are "in the ballpark" by comparison
Project Scope and Context
• Application frameworks augmented to interact effectively with the Global Services (GS)
  – The GS interact in turn with the storage access and local execution service layers
• Applications provide hints to high-level services about their requirements
  – Interfaced also to managed network and storage services
  – Allows effective caching and pre-fetching, and opportunities for global and local optimization of throughput
[Figure: layered UltraLight architecture. Applications (ROOT, IGUANA, COBRA, ATHENA, other apps) sit on an Application Interface; application-layer services include workflow management and request planning; UltraLight Global Services provide end-to-end monitoring, network management, and intelligent agents; the UltraLight infrastructure offers network access, storage access (e.g. PhEDEx), and execution services over networking, storage, and computation resources.]
Make the network an integrated, managed resource, a la CPU and storage.
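The "hints" mechanism described above can be illustrated with a toy sketch. The names (`TransferHint`, `RequestPlanner`) and data are hypothetical, not UltraLight APIs; the point is that an application declaring dataset, size, and deadline lets a global service order pre-fetches so data is staged before jobs run:

```python
from dataclasses import dataclass, field
import heapq

@dataclass(order=True)
class TransferHint:
    # order=True generates comparisons on deadline_s only, so the heap
    # below yields earliest-deadline-first.
    deadline_s: float
    dataset: str = field(compare=False)
    size_gb: float = field(compare=False)

class RequestPlanner:
    """Toy global service: queues hints and orders pre-fetches by deadline."""
    def __init__(self):
        self._queue = []

    def submit(self, hint: TransferHint):
        heapq.heappush(self._queue, hint)

    def next_prefetch(self):
        # The hint with the earliest deadline is pre-fetched first, so the
        # dataset is already cached when the analysis job starts.
        return heapq.heappop(self._queue) if self._queue else None

planner = RequestPlanner()
planner.submit(TransferHint(deadline_s=3600, dataset="/cms/higgs/4l", size_gb=800))
planner.submit(TransferHint(deadline_s=600, dataset="/atlas/top/skim", size_gb=50))
first = planner.next_prefetch()  # the 600 s deadline wins
```

A real planner would additionally weigh hint sizes against monitored link capacity when choosing which path (packet-switched or dedicated circuit) to schedule the transfer on.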
GAE and UltraLight: make UltraLight available to physics applications and their environments
• Unpredictable multi-user analysis
• Overall demand typically fills the capacity of the resources
• Real-time monitoring systems for networks, storage, computing resources, …: E2E monitoring
[Figure: application interfaces drive request planning and network planning services, backed by monitoring and network resources.]
Support data transfers ranging from predictable movement of large-scale (simulated and real) data to highly dynamic analysis tasks initiated by rapidly changing teams of scientists.
Network Resource Testbed (see talk by S. McKee)
Global Network Planning Services
• VINCI: Virtual Intelligent Networks for Computing Infrastructures
  – Based on the existing MonALISA framework
• LISA: Localhost Information Service Agent
  – Monitors end-systems (user machines and servers)
(See talk by I. Legrand)
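The kind of end-system snapshot a host-monitoring agent like LISA gathers can be sketched as follows. The function name and report fields are hypothetical illustrations; the real agent publishes its measurements into the MonALISA framework rather than returning a dict:

```python
import os
import socket
import time

def end_host_report():
    """Collect a minimal snapshot of the local host, in the spirit of an
    end-system monitoring agent. Fields and names are illustrative only."""
    load1, load5, load15 = os.getloadavg()  # POSIX-only load averages
    return {
        "host": socket.gethostname(),
        "timestamp": time.time(),
        "load_1min": load1,
        "ncpu": os.cpu_count(),
    }

report = end_host_report()
```

End-host metrics like these matter because, as the summary slide notes, sustained throughput is now usually limited by the end-systems rather than the wide-area network itself.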
Prototype Application Layer: E2E
• ATLAS/CMS software stacks are complex and still developing
  – Integration work is challenging and constantly evolving
• A generic service-oriented architecture is crucial for integration
  – Catalogs to select datasets
  – Resource and application discovery
  – Schedulers guide jobs to resources
  – Policies enable "fair" access to resources
  – Robust transfer of large datasets
[Figure: numbered workflow (steps 1-9) linking the client application, discovery, dataset service and catalogs, planner/scheduler, monitor information, policy, steering, job submission, execution, storage management, and data transfer.]
UltraLight focus: data transfer, planning and scheduling, (sophisticated) policy management at the VO level, and integration.
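The service-oriented chain above can be sketched end to end. Every service below is a stand-in stub with invented names and data, not a real Grid interface; it only shows how discovery, catalog lookup, and policy-constrained scheduling compose:

```python
# Toy walk-through of the service-oriented analysis chain described above.

def discover(service_registry, kind):
    """Resource and application discovery via a registry lookup."""
    return [s for s in service_registry if s["kind"] == kind]

def select_dataset(catalog, pattern):
    """Catalogs select the datasets to analyze."""
    return [d for d in catalog if pattern in d]

def schedule(sites, policy):
    """The planner guides the job to the least-loaded site the policy allows."""
    allowed = [s for s in sites if policy(s)]
    return min(allowed, key=lambda s: s["load"]) if allowed else None

# Invented registry and catalog entries, for illustration only.
registry = [{"kind": "compute", "name": "ufl-t2", "load": 0.3},
            {"kind": "compute", "name": "caltech-t2", "load": 0.7},
            {"kind": "storage", "name": "fnal-se", "load": 0.1}]
catalog = ["/store/mc/higgs_4l", "/store/data/minbias"]

sites = discover(registry, "compute")
dataset = select_dataset(catalog, "higgs")
# "Fair" access: a VO policy here caps acceptable site load at 50%.
site = schedule(sites, policy=lambda s: s["load"] < 0.5)
# Job submission, execution, and data transfer would follow against `site`.
```

In the real system each stub is a distributed service, and the policy step is where VO-level fair-share rules, the UltraLight focus noted above, are enforced.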
Supercomputing 2005
• Internet Land Speed Record
• 151 Gbps peak rate
• 100+ Gbps throughput sustained for hours
• 475 terabytes of physics data transported in less than 24 hours
• A sustained rate of 100+ Gbps translates to more than 1 petabyte per day
[Figure: transfer rate (Gb/s) and cumulative volume (TB) versus time over the 24-hour run.]
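The petabyte-per-day figure follows from straightforward unit conversion, which a few lines make explicit:

```python
# 100 Gb/s sustained for one day, converted to petabytes.
gbps = 100                      # sustained rate in gigabits per second
seconds_per_day = 24 * 3600
bits = gbps * 1e9 * seconds_per_day
petabytes = bits / 8 / 1e15     # 8 bits per byte; 1 PB = 1e15 bytes
# petabytes is about 1.08, i.e. just over a petabyte per day
```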
Project Milestones
• High-level milestones
  – Link critical services and applications together
  – Multiple services and multiple clients
  – Distributed system with some fault tolerance
  – Logical grid (physical details hidden)
  – Strategic steering of work- and dataflows
  – Self-organizing, robust, distributed E2E system
• User adoption
  – Identify a small community of users (some within UltraLight)
  – Integrate UltraLight services seamlessly with the LHC environment
  – Deliver LHC physics analyses within the LHC timeline
UltraLight Plans
UltraLight is a four-year program delivering a new, high-performance, network-integrated infrastructure:
• Phase I (12 months, 2004-2005): focused on deploying the initial network infrastructure and bringing up the first services
• Phase II (18 months, 2005-2006): concentrates on implementing all the needed services and extending the infrastructure to additional sites
• Phase III (18 months, 2007-2008): will focus on a transition to production in support of LHC physics
Beyond UltraLight: PLaNetS – Physics Lambda Network System
Summary
• For many years the WAN was the bottleneck; this is no longer the case in many countries
  – Deployment of Grid infrastructure is now a reality!
  – Recent land-speed records show that
    • the network can be truly transparent
    • throughputs are limited by the end-hosts
  – The challenge is shifting
    • from getting adequate bandwidth
    • to deploying adequate infrastructure to make effective use of it!
• UltraLight is delivering a critical missing component for future eScience: the integrated, managed network
  – Extend and augment existing Grid infrastructures (currently focused on CPU and storage) to include the network as an integral component