National Center for Atmospheric Research Pittsburgh Supercomputing Center National Center for Supercomputing Applications Web100 Wendy Huntoon - PSC Jim Ferguson - NCSA I2 Members Meeting May 2002

Outline

• Project Overview

– Motivation: What is the problem?

– Web100 Collaboration

• Progress to Date

– Standardization Process

– Code Release

• Code Capabilities

• Overview of Users

• Web100 Resources

Motivations: What’s the Problem?

• High performance flows slower than line rate

– Delays continue/increase even with higher bandwidth

• TCP tuning issues are non-trivial

• Poorly conceived stacks

• Router/switch buffer queues inadequate

• Slow start and AIMD algorithm

• Eliminate/dramatically reduce the “wizard gap”

• Need for kernel instrumentation set for TCP variables
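To make the slow start and AIMD point concrete, here is a minimal sketch of the widely cited steady-state estimate for AIMD throughput, rate ≈ (MSS / RTT) × C / √p, with C = √(3/2) under the simplifying assumptions of periodic loss and an ACK for every segment. The function names and the example path (1460-byte MSS, 70 ms RTT) are illustrative assumptions, not figures from the slides.

```python
import math

# Constant for the simplified model (periodic loss, every segment ACKed).
AIMD_CONSTANT = math.sqrt(3.0 / 2.0)


def aimd_throughput_bps(mss_bytes: int, rtt_s: float, loss_rate: float) -> float:
    """Estimate steady-state AIMD throughput in bits per second."""
    if rtt_s <= 0 or not 0 < loss_rate <= 1:
        raise ValueError("rtt_s must be positive and loss_rate in (0, 1]")
    return (mss_bytes * 8 / rtt_s) * AIMD_CONSTANT / math.sqrt(loss_rate)


def loss_rate_for_target(mss_bytes: int, rtt_s: float, target_bps: float) -> float:
    """Invert the estimate: loss rate at which the model yields target_bps."""
    if rtt_s <= 0 or target_bps <= 0:
        raise ValueError("rtt_s and target_bps must be positive")
    return (mss_bytes * 8 * AIMD_CONSTANT / (rtt_s * target_bps)) ** 2


if __name__ == "__main__":
    # Assumed example path: 1460-byte MSS, 70 ms round trip time.
    rate = aimd_throughput_bps(1460, 0.070, 1e-4)
    print(f"loss 1e-4 -> about {rate / 1e6:.1f} Mb/s")
    needed = loss_rate_for_target(1460, 0.070, 1e9)
    print(f"1 Gb/s needs loss rate near {needed:.1e}")
```

With these assumed numbers, a loss rate of 1 in 10,000 caps a single flow at roughly 20 Mb/s, and reaching 1 Gb/s would need a loss rate on the order of 4 in 100 million.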

The Wizard Gap

TCP over a long haul path

Year   Wizards    Non-wizards   Ratio
1988   1 Mb/s     300 kb/s      3:1
1991   10 Mb/s
1995   100 Mb/s
1999   1 Gb/s     3 Mb/s        300:1

Scientists/researchers not happy with this

TCP tuning is painful debugging

• All problems limit performance

– IP routing, long round trip times

– Improper MSS negotiations or path MTU discovery

– IP Packet reordering

– Packet losses, congestion, lame hardware

– TCP sender or receive buffer space

– Inefficient applications

• Any one problem can mask all the others and confound all but the best (and few) tuning gurus

• Need for better diagnostics and visibility into problems
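As a hedged illustration of how long round trip times and buffer space interact: a TCP sender can have at most one window of data in flight per round trip, so throughput is bounded by window / RTT, and keeping a path full needs a buffer of about bandwidth × RTT. The helper names and the 70 ms example path below are assumptions made for illustration.

```python
def window_limited_throughput_bps(window_bytes: int, rtt_s: float) -> float:
    """Upper bound on throughput when at most one window is sent per RTT."""
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    return window_bytes * 8 / rtt_s


def buffer_needed_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes in flight needed to fill the path."""
    if bandwidth_bps <= 0 or rtt_s <= 0:
        raise ValueError("bandwidth_bps and rtt_s must be positive")
    return bandwidth_bps * rtt_s / 8


if __name__ == "__main__":
    # 65535 bytes is the largest window without TCP window scaling.
    limit = window_limited_throughput_bps(65535, 0.070)
    print(f"64 KB window, 70 ms RTT: at most {limit / 1e6:.1f} Mb/s")
    need = buffer_needed_bytes(1e9, 0.070)
    print(f"1 Gb/s at 70 ms RTT needs about {need / 1e6:.2f} MB of buffer")
```

On this assumed path an untuned 64 KB window tops out near 7.5 Mb/s regardless of link speed, while filling 1 Gb/s takes about 8.75 MB of buffer at each end.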

Goal and Method

• Make it “easy” (transparent) for non-experts to achieve higher throughput performance

• Enhance TCP capabilities with better (finer grain) kernel instrumentation and automatic controls

• Real time triage capability determines sender, receiver, and/or network bottlenecks

Why Focus on TCP

• TCP has an ideal vantage point into throughput problem space

• TCP can identify bottleneck subsystem(s)

• TCP already measures the network (some)

• TCP can measure the application

• TCP can adjust itself (auto-tuning feedback)

Web100 Collaboration

• Funded by the NSF

– Currently Year 2 of a 3 Year grant.

– Cisco URP for initial seed funding.

• Collaborators

– PSC (Matt Mathis, R. Reddy, Janet Brown, John Heffner)

– NCAR (Peter O’Neil, Marla Meehl)

– NCSA (John Estabrook, Tanya Brethour, Stephen Engelhardt, Jim Ferguson)

What is in the code

• Web100 software consists of:

– TCP Kernel Instrument Set (TCP-KIS)

• Instruments coded directly into the operating system kernel.

– Derived Instrument Set (DIS)

• Information that is collected based on KIS parameters.

– Application Code

• Tools, applications, etc. that use the information provided by the KIS and DIS.

Kernel Instrument Set

• Definition

– Set of instruments designed to collect as much of the information as possible to enable a user to isolate the performance problems of a TCP connection.

• How it is implemented

– Each instrument is a variable in a "stats" structure that is linked through the kernel socket structure.

– The Linux /proc interface is used to expose these instruments outside the kernel.
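The arrangement described above can be sketched in user-space Python as a toy model: a per-connection stats object hangs off a socket object, the send path updates it, and an export function renders it for readers outside the "kernel". All class, field, and function names here are hypothetical, and the text rendering is an assumption for illustration, not the actual Web100 /proc layout.

```python
from dataclasses import asdict, dataclass, field
from typing import Tuple


@dataclass
class ConnStats:
    """Hypothetical per-connection instruments (plain counters)."""
    pkts_out: int = 0
    data_bytes_out: int = 0
    pkts_retrans: int = 0


@dataclass
class Sock:
    """Toy stand-in for a kernel socket structure that links to its stats."""
    local: Tuple[str, int]
    remote: Tuple[str, int]
    stats: ConnStats = field(default_factory=ConnStats)

    def send_segment(self, payload_len: int, retransmit: bool = False) -> None:
        # The instruments are updated inline on the data path.
        self.stats.pkts_out += 1
        self.stats.data_bytes_out += payload_len
        if retransmit:
            self.stats.pkts_retrans += 1


def export_text(sock: Sock) -> str:
    """Render the instruments as 'name value' lines for an outside reader."""
    return "\n".join(f"{name} {value}" for name, value in asdict(sock.stats).items())


if __name__ == "__main__":
    s = Sock(("10.0.0.1", 5001), ("10.0.0.2", 80))
    s.send_segment(1460)
    s.send_segment(1460, retransmit=True)
    print(export_text(s))
```

Keeping the counters in one structure per connection is what lets a reader find every instrument for a flow by following a single link from the socket.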

What is the TCP-KIS?

• TCP-KIS instruments group naturally into categories.

– Currently roughly 19 categories.

• Already more than 125 instruments have been developed.

• For each instrument:

– Precise (standards ready) definition.

– Instrument code in the kernel.

– Implementation verification tests

• Does the kernel implementation meet the definition?

• Prototype diagnostic tool(s) to demonstrate functionality and effectiveness.

TCP-KIS

• Basic instrumentation examples

• Connection ID: 5-tuple that uniquely identifies a connection.

• State: determines what protocol features or algorithms are enabled.

• Traffic out: statistics that aggregate the packets and traffic sent out on a connection.

Local Sender Triage

• Group of instruments associated with the local sender.

– Determine what subsystems are throttling TCP data transmission.

– Three parallel sets of instruments that measure:

• Receiver Window

• Network Congestion

• Sender's Availability
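One way to picture the triage is as a comparison of how long the sender was held back by each of the three causes. The sketch below assumes the instruments report accumulated time per cause; the class, field, and function names are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class SenderLimitTimes:
    """Hypothetical accumulated time (seconds) the sender was limited by each cause."""
    rwin_s: float    # receiver window was the limit
    cwnd_s: float    # congestion window (network) was the limit
    sender_s: float  # the sender itself had nothing ready to send


def triage(times: SenderLimitTimes) -> Dict[str, float]:
    """Return the fraction of limited time attributed to each subsystem."""
    total = times.rwin_s + times.cwnd_s + times.sender_s
    if total <= 0:
        raise ValueError("no limited time recorded")
    return {
        "receiver": times.rwin_s / total,
        "network": times.cwnd_s / total,
        "sender": times.sender_s / total,
    }


def dominant(shares: Dict[str, float]) -> str:
    """Name the subsystem that accounts for the largest share."""
    return max(shares, key=shares.get)


if __name__ == "__main__":
    shares = triage(SenderLimitTimes(rwin_s=1.0, cwnd_s=7.0, sender_s=2.0))
    for name, share in shares.items():
        print(f"{name}: {share:.0%}")
    print("dominant bottleneck:", dominant(shares))
```

Because the three measurements are parallel, their shares sum to one, which makes it easy to tell a non-expert where to look first.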

Local Sender Groups

• Other groups of instruments associated with the Local Sender:

• Local Sender Congestion Model

• Local Sender Loss Model

• Local Sender Re-order Model

• Local Sender RTT

• Local Sender Segment Size

• Local Sender Bottlenecks

• Local Sender Tuning

Other Instruments

• Similar instruments for the Local Receiver.

• Observed Receiver instruments

– Often inferred from the data stream.

– E.g., Observed Receiver: the receiver's state is inferred from the ACK stream.

• Application Interface

– Future instruments to collect statistics on how the application is using the network.

Userland Distribution

• Released asynchronously with kernel distribution

• Currently at Alpha 1.1

– Version 1.2 release imminent

• Consists of

– The web100 library

– Command line utilities

– GUI utilities

Web100 Library

• Web100 kernel exposes critical TCP variables/instruments through /proc

• Web100 library provides the necessary access functions to access these variables/instruments

• Functions

– Read the value of a variable/instrument

– Snapshot of a group (facilitates atomic reading of a group of variables)

– Modify tunable variables (e.g., send buffer size)

– Etc.
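To show why a group snapshot matters, here is a minimal sketch in which several counters are copied under one lock so they all describe the same instant, and two snapshots are then subtracted to get activity over an interval. This is a toy model with hypothetical names; it does not reproduce the web100 library's actual functions.

```python
import threading
from typing import Dict


class InstrumentGroup:
    """Hypothetical group of named counters guarded by a single lock."""

    def __init__(self, **initial: int) -> None:
        self._values: Dict[str, int] = dict(initial)
        self._lock = threading.Lock()

    def add(self, **increments: int) -> None:
        """Increase several counters as one atomic update."""
        with self._lock:
            for name, amount in increments.items():
                self._values[name] += amount

    def read(self, name: str) -> int:
        """Read one counter; separate reads may straddle an update."""
        with self._lock:
            return self._values[name]

    def snapshot(self) -> Dict[str, int]:
        """Copy every counter under the lock so the values are consistent."""
        with self._lock:
            return dict(self._values)


def delta(older: Dict[str, int], newer: Dict[str, int]) -> Dict[str, int]:
    """Per-counter change between two snapshots."""
    return {name: newer[name] - older[name] for name in newer}


if __name__ == "__main__":
    group = InstrumentGroup(pkts_out=0, pkts_retrans=0)
    before = group.snapshot()
    group.add(pkts_out=100, pkts_retrans=2)
    after = group.snapshot()
    print(delta(before, after))
```

Reading the counters one at a time could pair a new packet count with an old retransmit count; the snapshot avoids that, so ratios computed from it stay meaningful.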

Utilities

• Command line utilities

– Useful in batch scripts

– Serve as demo codes for the usage of web100 library

• GUI utilities

– Based on GTK+

– Useful for troubleshooting network applications

– Serve as examples for application developers

GUI Sample Screens – DTB

Connection Selector

Looking at a Variable

Timeline - Year 1

• Alpha code development

• Establish User Support

– www.web100.org

• Initial User Community

– Very limited to begin with.

– Knowledgeable users, expected to provide technical input on the code.

– Understand and develop applications.

Timeline - Year 2

• Began standardization process.

– Develop MIB

– Submit to IETF

• Develop public code

– Fix bugs in alpha versions

– Add instrumentation

– Code release

• Continue code development

– Identify and add new instruments

Code Releases - To date

• Initial Release

– Alpha0.2, released May 23, 2001

– Alpha0.3, released Sept. 19, 2001

• Alpha 1.0 - Separation of Kernel and Userland code

– Kernel Patch:

• Alpha 1.1 for Linux 2.4.16, released March 18, 2002

• Alpha 1.0, released March 1, 2002

• Alpha 1.0, released February 26, 2002

– Userland:

• Alpha 1.1, released February 28, 2002

• Alpha 1.0, released February 26, 2002

Timeline - Year 3

• New pathprobe diagnostic tool (work in progress, unreleased).

• Add another 10-12 instruments.

• Review instruments and code with other wizards.

• Gain vendor support for ideas and code.

• Finalize IETF draft by December IETF meeting.

Milestones

• Over a year of ~30 alpha testers

– Including: SLAC, ORNL, LBNL, and universities

– www.net100.org

• Modified Linux kernel supports 2.4.16

• Separation between KIS and library functions

• draft-ietf-tsvwg-tcp-mib-extension-00.txt

• draft-ietf-ipngwg-rfc2012-update-01.txt

Web100 Collaborator Activity

• Rich Carlson, ANL

• Tom Dunnigan, ORNL

• Tom Hacker, U. of Michigan

• Doug Chang, SLAC

• Andreas Burkhardt & Matt Grob, Qualcomm

• Larry Dunn & Scott Dier, Cisco/U. of Minnesota

• Jason Lee, LBL

Collaborator Assistance

• Bugs!

– Kernel

– Utilities

– Release

• Request new features

• Review and criticize documentation

– Way too easy on us

Collaborator Activity

• Carlson/ANL working on a troubleshooting guide for LANs.

• Set up a network of 13 identically equipped PIII machines connected via a Cisco 5500 network switch, running Web100-enabled Linux.

• Introduces typical network faults (duplex mismatches, other config errors) and analyzes data for “signatures” of these faults.

• Modified Iperf 1.2 to collect variables and reverse flow.

Collaborator Activity

• Dunnigan/ORNL has found web100 helpful in seeing losses/retransmission and congestion avoidance parameters of individual TCP flows, and for tuning flows

• Has developed a Web100-enabled ttcp

• Has developed a daemon that logs web100 variables for designated paths when a flow closes

• Has developed an autotuning daemon that uses web100 to tune flows, including modifications to web100 to support "event notification", so the daemon knows when a new flow/socket is opened
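The event-notification idea can be sketched as a small publish/subscribe loop: a notifier tells subscribers when a flow opens, and a tuner raises that flow's send buffer toward the bandwidth-delay product of a known path. This is a hypothetical illustration of the pattern only; it is not the ORNL daemon's code, and every name, host, and path figure here is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Flow:
    """Hypothetical view of a newly opened flow with one tunable."""
    remote_host: str
    sndbuf_bytes: int = 65536


@dataclass
class PathHint:
    """Assumed per-destination knowledge: bottleneck bandwidth and RTT."""
    bandwidth_bps: float
    rtt_s: float


class FlowNotifier:
    """Delivers a callback to each subscriber when a flow is opened."""

    def __init__(self) -> None:
        self._handlers: List[Callable[[Flow], None]] = []

    def subscribe(self, handler: Callable[[Flow], None]) -> None:
        self._handlers.append(handler)

    def flow_opened(self, flow: Flow) -> None:
        for handler in self._handlers:
            handler(flow)


class AutoTuner:
    """Raises the send buffer to the path's bandwidth-delay product, with a cap."""

    def __init__(self, hints: Dict[str, PathHint], max_buf_bytes: int = 16 * 1024 * 1024) -> None:
        self.hints = hints
        self.max_buf_bytes = max_buf_bytes

    def on_flow_opened(self, flow: Flow) -> None:
        hint = self.hints.get(flow.remote_host)
        if hint is None:
            return  # unknown path: leave the default buffer alone
        bdp_bytes = int(hint.bandwidth_bps * hint.rtt_s / 8)
        flow.sndbuf_bytes = min(max(flow.sndbuf_bytes, bdp_bytes), self.max_buf_bytes)


if __name__ == "__main__":
    notifier = FlowNotifier()
    tuner = AutoTuner({"far.example.org": PathHint(bandwidth_bps=100e6, rtt_s=0.080)})
    notifier.subscribe(tuner.on_flow_opened)
    flow = Flow("far.example.org")
    notifier.flow_opened(flow)
    print(flow.sndbuf_bytes)
```

Being notified at open time lets the tuner act before the flow has sent much data, instead of polling a connection list and discovering the flow late.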

Collaborator Activity

• Hacker/U.Michigan has been using the web100 software to help tune and diagnose end-to-end network performance problems across the U-M campus network as well as across Abilene for the Visible Human and Atlas projects at U-M.

• Chang/SLAC is looking to fix a performance problem between Linux and Solaris machines.

Collaborator Activity

• Qualcomm is using Web100 to measure TCP performance over certain types of high speed wireless links under development. Web100 is partially integrated into some other tools - in the sense that output reports are published automatically in a format similar to other tools Qualcomm uses.

• Dunn/Cisco currently using Web100 for a class at U.Minnesota. Includes accounts on test machine at NCSA.

Collaborator Activity

• Lee/LBL has obtained accounts at SLAC and ANL for WAN testing, and has co-located one of LBL's machines in Washington D.C. to do testing over SuperNet. Still in the process of testing all this out.

• Keith Jackson at LBL has written Python wrappers to the Web100 calls using SWIG.

Web100 Summary

• Main WWW site: www.web100.org

• Freely available software distribution

– www.web100.org/download

– hundreds of downloads

• Please be cognizant of impacts on others

• Please use, test, provide feedback, contribute code

• IETF standards process to benefit all

• Attention turning to working with OS vendors to incorporate standards enhancements into their stacks