High Throughput Computing On Campus
High Throughput Computing: How we got here, Where we are
May 2011
Todd Tannenbaum, Tim Cartwright {tannenba, cat} @cs.wisc.edu
Center for High Throughput Computing, Department of Computer Sciences
Center for High Throughput Computing
Agenda
• Center for High Throughput Computing
  – What is High Throughput Computing?
  – What is the CHTC, and what does it offer?
  – How to get started
• Sugar and caffeine break
• Introduction to using Condor
• Campus case studies
FREE!
Why we need compute power
• Three-fold scientific method
  – Theory
  – Experiment
  – Computational analysis
• Cycles as “fertilizer”
[Diagram: Theory + Experiment + Simulation]
The basic problem
• Speed of light: roughly 1 ft / nanosecond
• 3 GHz CPU = 1/3 foot per clock cycle
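As a quick worked check (using the slide's figure of roughly 1 ft/ns, i.e. about $3 \times 10^{8}$ m/s):

```latex
% Distance a signal can travel in one clock period at f = 3 GHz:
d = \frac{c}{f}
  \approx \frac{3 \times 10^{8}\ \mathrm{m/s}}{3 \times 10^{9}\ \mathrm{Hz}}
  = 0.1\ \mathrm{m}
  \approx \tfrac{1}{3}\ \mathrm{ft}
```

So a signal cannot even cross a typical motherboard in a single clock cycle, which is why simply raising the clock rate stopped being an option.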
Parallelism: The Solution
• Parallelism can cover latency with bandwidth:
  “If I cannot get a computer to run 1000x faster, how about using 1000 computers?”
• But at what level should the parallelism be? This has been the basic question for the last 40 years:
  – Programmer visible?
  – Compiler visible?
  – Instruction level?
  – Machine level?
  – etc.
HPC supercomputers are one avenue…
• Specialized, expensive
  – Supercomputer centers
• Difficult to program
  – Shared memory? Message passing? Multiple threads?
Late 90s and beyond: Why Commodity Clusters?
• Massive cost/performance gain of CPU and network technology.
• The industry started to provide fully assembled subsystems (microprocessors, motherboards, disks and network interface cards).
• Mass market competition has driven the prices down and reliability up for these subsystems.
• The availability of open-source software: the Linux operating system, GNU compilers and programming tools, MPI/PVM message-passing libraries, and workload managers such as Condor and PBS.
• The recognition that obtaining high performance on parallel platforms, even vendor-provided ones, can be hard work.
• An increased reliance on computational science which demands high performance computing.
Birth of off-the-shelf compute clusters
• Beowulf Project started in 1994
• Off-the-shelf commodity PCs, networking
• Parallelism via processes
HPC vs HTC
• High Performance: very large amounts of processing capacity over short time periods (FLOPS, floating-point operations per second)
• High Throughput: large amounts of processing capacity sustained over very long time periods (FLOPY, floating-point operations per year)
  – Process-level parallelism, opportunistic
FLOPY ≈ 30,758,400 × FLOPS
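The conversion constant is just the number of seconds in the accounting period. The slide's figure of 30,758,400 s works out to exactly 356 days, presumably discounting some downtime from the 31,536,000 s in a 365-day year:

```latex
\mathrm{FLOPY} = \mathrm{FLOPS} \times 30{,}758{,}400\ \mathrm{s},
\qquad 30{,}758{,}400\ \mathrm{s} = 356\ \mathrm{days} \times 86{,}400\ \mathrm{s/day}
```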
So given that background...
In August 2006, the UW Academic Planning Committee approved the Center for High Throughput Computing (CHTC). The College of L&S then staffed positions for the center.
Since 2006: 150M CPU Hours (17K years!)
Today ~6.5k cores busy
About the CHTC
• Goal
  – Offer the UW community computing capabilities that enable scientific discovery.
• How
  – Work with research projects and the campus on the organization, funding, creation, and maintenance of a campus computing grid
  – Strong commitment to the engagement model; be ready to work with research groups and applications
  – Involve UW in national and international grid efforts.
About the CHTC, cont.
• Who
  – Steering Committee
    • Jeff Naughton, CS Department Chair
    • Miron Livny, Condor Project PI
    • Todd Tannenbaum, Condor Project Technical Lead
    • Juan de Pablo, Professor of Chemical Engineering
    • David C. Schwartz, Professor of Genetics/Chemistry
    • Wesley Smith, Professor of Physics
  – Staffing
    • Primarily comes from the projects associated with the CHTC
About the CHTC, cont.
• What
  – Administer UW campus distributed computing resources
  – Linked with the world-wide grid via the OSG
• Low-ceremony engagements
About the CHTC, cont.
• Force multiplier for the UW campus
• Can help with grantsmanship
• Help UW researchers with HTC best practices
  – We won’t write your end application code, but we will do much to help move it into a turnkey HTC application
  – Can play a matchmaker role, connecting you with others on campus
The rest of this presentation
• Introduce CHTC resources
  – Infrastructure resources
    • Middleware
    • Hardware
  – People resources
• How to get involved...
So you can be like this student
Adam Butts, now with IBM Research: on the Thursday before his Monday PhD defense, he was still many simulations short of having final numbers. GLOW provided him with an on-demand burst of 15,000 CPU hours over the weekend, getting him the numbers he needed:
From: "J. Adam Butts" <[email protected]>
Subject: Condor throughput
The Condor throughput I'm getting is really incredible. THANK YOU so very much for the extra help. My deadline is Monday, and if I'm able to meet it, you will deserve much of the credit. The special handling you guys have offered when a deadline must be met (not just to me) is really commendable. Condor is indispensable, and I'm sure it has a lot to do with the success of (at least) the architecture group at UW. I can't imagine how graduate students at other universities manage...
Or like this student
Adam Tregre, a physics graduate student, has used over 70,000 hours of compute time. He reports:
In "Neutrino mass limits from SDSS, 2dFGRS and WMAP" (PLB 595:55-59, 2004), we performed a six-dimensional grid computation, which allowed us to put an upper bound on the neutrino mass of 0.75 eV at two sigma (1.11 eV at three sigma). We calculated CMB and matter power spectra, and found the corresponding χ² value at approximately 10^6 parameter points, a feat only made possible by [CHTC]… CHTC provides an excellent tool for the many parameter fittings and different starting points necessary for such an analysis.
Infrastructure: Condor
The Condor Project
A distributed computing research project in the Computer Sciences department, performed by a team of faculty, full-time staff, and students who face software/middleware engineering challenges, are involved in national and international collaborations, interact with users in academia and industry, maintain and support distributed production environments, and educate and train students.
Funding: ~$4.5M annual budget
Condor Project has a lot of experience... (Est 1985!)
Condor Basics
• Condor software powers all CHTC resources
• Two key concepts in Condor: jobs and machines
• Condor takes your jobs (lots of ‘em) and runs them on our machines (lots of ‘em) in a reliable way
Jobs
• Job: a unit of work, i.e. a batch OS process with input and output
• Ideally between 5 minutes and 4 hours, CPU bound
• In any (cough) programming language
• Along with any input or output files
HTC in a nutshell
• With HTC, unlike HPC, we have many independent* batch jobs. By running as many as we can in parallel, we get a big speedup; they could also be run sequentially, correctly but slowly.
Submit files
Submit files describe your job to Condor:

universe = vanilla
executable = my_program
arguments = -r 100 -l --run-fast
output = output_file
should_transfer_files = yes
when_to_transfer_output = on_exit
queue
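Assuming the submit description above is saved as `my_program.sub` (the filename is our choice, not from the slide), submission is a one-line command. A minimal sketch; the `condor_submit` and `condor_q` calls are commented out since they require a Condor installation:

```shell
# Write the submit file from the slide to disk.
cat > my_program.sub <<'EOF'
universe = vanilla
executable = my_program
arguments = -r 100 -l --run-fast
output = output_file
should_transfer_files = yes
when_to_transfer_output = on_exit
queue
EOF

# Hand the job to Condor and watch the queue (requires Condor):
# condor_submit my_program.sub
# condor_q
```

`condor_submit` parses the file, creates one job per `queue` statement, and places it in the local queue; `condor_q` then shows its progress.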
Sample simple HTC
• Create 100 submit files
• Run condor_submit on each file
• Wait for output…
• Condor tools are all command-line
  – making scripting easy; they are the glue
  – a necessity for reliably running large numbers of jobs
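The "100 submit files" pattern above is usually a short script. A sketch under stated assumptions: the executable `my_program`, its `-run` argument, and the file naming are all hypothetical, and the actual `condor_submit` call is commented out since it needs a Condor installation:

```shell
# Generate one submit file per run, varying the argument and output file.
for i in $(seq 1 100); do
  cat > "run_${i}.sub" <<EOF
universe = vanilla
executable = my_program
arguments = -run ${i}
output = output_${i}.txt
should_transfer_files = yes
when_to_transfer_output = on_exit
queue
EOF
  # condor_submit "run_${i}.sub"   # requires Condor
done
echo "Generated $(ls run_*.sub | wc -l) submit files"
```

In practice a single submit file with `queue 100` and the `$(Process)` macro can replace this loop, but the explicit loop shows why command-line tools make good glue.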
I’ve got dependency issues
• Condor supports not just sets of independent jobs, but workflows where some jobs depend on the output of previous jobs.
• Note: all communication is via files
DAGMan
• DAGMan sequences jobs with input and output file dependencies
• (defined via a text file)
Defining a DAG
• A DAG is defined by a .dag file, listing each of its nodes and their dependencies:

# diamond.dag
Job A a.sub
Job B b.sub
Job C c.sub
Job D d.sub
Parent A Child B C
Parent B C Child D

• Each node will run the Condor job specified by its accompanying Condor submit file
[Diagram: Job A at the top, fanning out to Job B and Job C, which both feed Job D]
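The diamond DAG is submitted with `condor_submit_dag`, which runs DAGMan as a job that in turn submits each node once its parents finish. A sketch; the four node submit files (`a.sub` through `d.sub`) are assumed to exist, and the submission itself is commented out since it requires Condor:

```shell
# Write the DAG description from the slide.
cat > diamond.dag <<'EOF'
Job A a.sub
Job B b.sub
Job C c.sub
Job D d.sub
Parent A Child B C
Parent B C Child D
EOF

# Submit the whole workflow; DAGMan submits B and C only after A
# completes, and D only after both B and C complete (requires Condor):
# condor_submit_dag diamond.dag
```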
LIGO inspiral search application
• Describe…
The inspiral workflow application is the work of Duncan Brown (Caltech), Scott Koranda (UW Milwaukee), and the LSC Inspiral group
What about machines?
• Machines are grouped into pools
• Pools have one or more submit points
  – where you can run condor_submit
• Machines can be heterogeneous
• Machines (usually) have 1 slot per core
  – so jobs should be single-threaded
• The condor_status command shows machines
But I absolutely have a parallel job
• Either MPI, OpenMP, or pthreads…
• Condor HTPC allows a job to claim a whole machine and use all CPUs/cores and memory
• But there is a much longer wait, and fewer such machines (and you still need to debug your job)
What kind of machines are we talking about?
• Depends on the pool
  – Example: the CHTC B-240 cluster
    • 1700 Linux slots, 64-bit, ~1 GB RAM each
    • 1 PB spinning disk, 150 TB in HDFS
    • 240 Windows slots, 32-bit
  – Others may be different
• You have access to other pools on campus and around the world
Linking Condor Pools
• Across campus
  – Simple to set up “flocking”
  – If there are not enough machines in your pool, your job will “flock” to a friendly pool
  – You are responsible for finding friends
• Across the world
  – Condor interoperates with many evolving “grid protocols”
  – Remote machines do not need to be running Condor
That can’t possibly be it, can it?
• No
• 900-page manual at http://www.cs.wisc.edu/condor
• Lots more power, lots more fun
  – Ask us! We’ll help…
Infrastructure: Machines
The Story of GLOW: The Grid Laboratory of Wisconsin
How GLOW got started
Seven departments on campus needed computing, big time:
• Computational Genomics, Chemistry
• High Energy Physics (CMS, ATLAS)
• Materials by Design, Chemical Engineering
• Radiation Therapy, Medical Physics
• Computer Science
• AMANDA, IceCube, Physics/Space Science
• Plasma Physics
Diverse users with different conference deadlines and usage patterns.
GLOW
• The GLOW Condor pool is distributed across the campus at the sites of the machine owners
  – 3200 cores
  – Over 100 TB of disk
  – Over 25 million CPU-hours served
  – Contributed to ~50 publications
  – A cluster's owner always has highest priority
Why join together to build the GLOW campus grid?
• Very high utilization
  – More diverse users = fewer wasted cycles
• Simplicity
  – All we need is Condor at the campus level. Plus, we get the full feature set rather than the lowest common denominator.
• Collective buying power
  – We speak to vendors with one voice.
• Consolidated administration
  – Fewer chores for scientists. Fewer holes for hackers. More money for research, less money for overhead.
• Synergy
  – Face-to-face technical meetings between members. The mailing list scales well at the campus level.
Fractional Usage of GLOW
• ATLAS: 23%
• ChemE: 21%
• LMCG: 16%
• CMS: 12%
• Others: 12%
• CS (Condor): 4%
• IceCube: 4%
• MedPhysics: 4%
• CMPhysics: 2%
• Multiscalar: 1%
• OSG: 1%
• Plasma: 0%
Molecular and Computational Genomics
Availability of computational resources serves as a catalyst for the development of new genetic analysis approaches.
Professor David C. Schwartz: “We have been a Condor user of several years, and it is now unthinkable that we could continue our research without this critical resource.”
Physics
The Large Hadron Collider at CERN will produce over 100 PB of data over the next few years. About 200 “events” are produced per second, each requiring about 10 minutes of processing on a 1 GHz Intel processor.
Professors Sridhara Rao Dasu, Wesley H. Smith and Sau Lan Wu: “The only way to achieve the level of processing power needed for LHC physics analysis is through the type of grid tools developed by Prof. Livny and the Condor group here at Wisconsin.”
Chemical Engineering
In computational fluids and materials modeling research, a single simulation commonly requires several months of dedicated compute time. Professor Juan de Pablo's group leverages Condor and GLOW to satisfy this demand.
Professor Juan de Pablo: “The University of Wisconsin is a pioneer in [grid computing]. A case in point is provided by a number of outstanding, world-renowned research groups within the UW that now rely on grid computing and your team for their research.”
Why not GLOW across campus?
UW Campus Grid: many federated Condor pools across campus, including CHTC and GLOW resources
Condor installs at UW-Madison
• Numerous Condor pools operating at UW, several at a department or college level:
  – CHTC Pool: ~1700 CPUs and growing
  – GLOW Pool: ~3200 CPUs and growing
  – CAE Pool: ~1200 CPUs
  – Comp Sci Pool: ~1215 CPUs
  – Biostat Pool: ~132 CPUs
  – DoIT InfoLabs Pools: ~200 CPUs and growing
  – Euclid Pool: ~2000 CPUs
• Condor flocking
  – Jobs move from one pool to another based upon availability.
Usage from just the CS Pool
User                             Hours      Pct
-------------------------------  ---------  -----
[email protected]              313205.7   6.1%
[email protected]              302318.1   5.9%
[email protected]              284046.1   5.5%
[email protected]              259001.3   5.0%
[email protected]              211382.4   4.1%
[email protected]              166260.7   3.2%
[email protected]              150174.0   2.9%
[email protected]              121764.7   2.4%
[email protected]              107089.7   2.1%
[email protected]              106475.1   2.1%
[email protected]              103529.2   2.0%
[email protected]               98983.6   1.9%
...
TOTAL                            5142471    100%
Open Science Grid (OSG): Linking grids together
The Open Science Grid
• OSG is a consortium of software, service, and resource providers and researchers, from universities, national laboratories, and computing centers across the U.S., who together build and operate the OSG project. The project is funded by the NSF and DOE, and provides staff for managing various aspects of the OSG.
• Brings petascale computing and storage resources into a uniform grid computing environment
• Integrates computing and storage resources from over 50 sites in the U.S. and beyond
• A framework for large-scale distributed resource sharing, addressing the technology, policy, and social requirements of sharing
Principal Science Drivers
• High energy and nuclear physics
  – 100s of petabytes (LHC), 2007
  – Several petabytes, 2005
• LIGO (gravity wave search)
  – 0.5 to several petabytes, 2002
• Digital astronomy
  – 10s of petabytes, 2009
  – 10s of terabytes, 2001
• Other sciences emerging
  – Bioinformatics (10s of petabytes)
  – Nanoscience
  – Environmental
  – Chemistry
  – Applied mathematics
  – Materials science
Virtual Organizations (VOs)
• The OSG infrastructure trades in groups, not individuals
• VO management services allow registration, administration, and control of members of the group
• Facilities trust and authorize VOs
• Storage and compute services prioritize according to VO group
[Diagram: a VO management service and applications spanning OSG, the WAN, and campus grids over a set of available resources. Image courtesy: UNM]
Current OSG Resources
• OSG has more than 80 participating institutions, including self-operated research VOs, campus grids, regional grids, and OSG-operated VOs
• Provides about 45,000 CPU-days per day in processing
• Provides 600 terabytes per day in data transport
• OSG is starting to offer support for MPI
Simplified View: Campus Grid
Simplified View: OSG Cloud
Why should UW facilitate (or drive) cross-campus resource sharing?

Because it's the right thing to do:
– Enables new modalities of collaboration
– Enables new levels of scale
– Democratizes large-scale computing
– Sharing locally leads to sharing globally
– Better overall resource utilization
– Funding agencies:

"At the heart of the cyberinfrastructure vision is the development of a cultural community that supports peer-to-peer collaboration and new modes of education based upon broad and open access to leadership computing; data and information resources; online instruments and observatories; and visualization and collaboration services."
- Arden Bement, Cyberinfrastructure Vision for 21st Century Discovery, introduction
What level of know-how, experience, and influence do the CHTC and UW-Madison have in OSG?
Notice any similarities?
UW CHTC Director = Open Science Grid PI
Great! How can I get involved?
How does it work?
• Any PI at UW can get access, FREE!
• Email [email protected] or fill out the survey
• Provide a short description of your work
• Initial engagement:
  – Two CHTC staff assigned to your project
  – Talk about your problem
  – We can help recommend/suggest resources and tools
  – We can help get you going (scripting, DAGMan files, etc.)
  – Create accounts and/or install submit machines in your lab
  – Start small, grow to ???
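To give a flavor of what such an engagement might produce, here is a minimal sketch of a Condor submit-description file for a batch of 100 independent jobs (the program and file names are hypothetical placeholders):

```
# my_jobs.sub -- hypothetical submit description for 100 independent jobs
universe   = vanilla
executable = my_analysis          # your program (hypothetical name)
arguments  = input.$(Process)     # each job gets its own input file
output     = out.$(Process)
error      = err.$(Process)
log        = jobs.log
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
queue 100
```

Submitting this with `condor_submit my_jobs.sub` queues 100 jobs that Condor matches to idle machines; DAGMan can layer dependencies on top of submit files like this one.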
Education
CHTC Education Opportunities
• CHTC engagements
• Condor Week
• OSG Summer School: http://opensciencegrid.org/GridSchool
• Computer Sciences 368, Scripting (1 cr)
Why Scripting?
• Fast, easy, high-level
• Pervasive
• Automate workflows:
  – Data manipulation
  – Glue between incompatible apps
  – Drive CHTC workflows
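A short script is often all the "glue" a workflow needs. As an illustration (all names here are hypothetical), a few lines of Python can slice one large input into roughly equal per-job chunks for a batch run:

```python
# Hypothetical glue script: split one large input into per-job chunks
# so each batch job can process its own piece.

def split_lines(lines, n_chunks):
    """Divide a list of lines into n_chunks roughly equal pieces."""
    size, rem = divmod(len(lines), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < rem else 0)  # spread the remainder
        chunks.append(lines[start:end])
        start = end
    return chunks

if __name__ == "__main__":
    # Stand-in for reading a real data file.
    records = [f"record-{i}" for i in range(10)]
    for i, chunk in enumerate(split_lines(records, 3)):
        # In a real workflow, each chunk would be written to its own input file.
        print(f"chunk {i}: {len(chunk)} records")
```

In practice each chunk would be written out as one of the per-job input files that a submit description enumerates.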
CS 368 Options
• Perl (intro)
• Python (intro, advanced)
• Other ideas:
  – Ruby (standalone + Rails for web)
  – (Perl or Python) + CHTC
  – What is used in your community?
• Please complete the survey
NMI Build and Test Lab
• The BaTLab is a distributed, multi-platform facility designed to provide automated software build and test services
• 50 unique platforms, named for specific architecture and OS combinations (x86_rhap_5, x86_64_fedora_13, etc.)
• The Metronome software is free to download, and others have set up their own local sites
• See http://nmi.cs.wisc.edu
Conclusion
• Computation enables better research.
• There are many opportunities to run jobs on campus.
• The CHTC is here to help you:
  – Get access to campus resources
  – Get access to campus expertise
  – Leverage computing to do better science!
Thank you! Email us at [email protected]