User Support, Campus Integration, OSG XSEDE
Rob Gardner
OSG Council Meeting, June 25, 2015
149. Present to Council a 1-page document on "Enabling Campus Resource Sharing and use of remote OSG resources" in 15 minutes - Rob Gardner, Frank
Enabling Campus Sharing & Use of OSG
● Clemson helping drive this development
● Two-track strategy to integrate the Palmetto resource and user community
  ○ Track 1: “light, quick”
    ■ Submit from Palmetto to OSG, and back
    ■ “Quick Connect” → OSG Connect to Palmetto via hosted Bosco service (ssh)
  ○ Track 2: “full OSG capabilities”
    ■ Full HTCondor CE, OASIS+Squid, {StashCache}

Working document: goo.gl/9aNkJs
Track 1: OSG Connect to Clemson-Palmetto
● Hosted service @ OSG Connect
● Addressed OrangeFS + Condor file locking
● Current opportunistic limit on Palmetto: capped at 500 jobs due to a PBS Pro limitation that prevents Clemson users in the general pool (non-owners) from preempting OSG users. A fix is expected in the next release of PBS Pro so that OSG jobs can claim additional idle cycles on Palmetto.
● Jobs submitted from login.osgconnect.net
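A job submitted from login.osgconnect.net in this setup is an ordinary HTCondor job. The sketch below is a minimal submit description; the executable name and project name are illustrative assumptions, not taken from the slides.

```
# Minimal HTCondor submit description for a job run from
# login.osgconnect.net. Executable and +ProjectName are placeholders.
universe     = vanilla
executable   = run_analysis.sh
arguments    = $(Process)
output       = job.$(Cluster).$(Process).out
error        = job.$(Cluster).$(Process).err
log          = job.log
+ProjectName = "ConnectTrain"
queue 10
```

Submission is then `condor_submit job.sub`; with Quick Connect in place, matched jobs can run on Palmetto subject to the 500-job opportunistic cap noted above.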
Track 1: Submit from Clemson-Palmetto
● Download Campus Connect client from github
● Minutes to submission to OSG

Campus Connect Client
● lightweight module to manage submission from a campus login host
● heavy lifting done at the hosted schedd
● In Year 4, extend to reach, monitor, and account for:
  ○ local campus allocation
  ○ XD allocation
  ○ full integration with campus IDM & signup

Evaluating at:
Longer term: Hosted Campus CE-ssh
● Discussions to establish an approach for hosted CE services on behalf of campuses short of manpower
● Quick(er) on-ramp of a campus HPC cluster without requiring local OSG expertise
● @ the campus: provide ssh access and local accounts for supported VOs
● Normal CE operations handled by OSG staff
● Possible “umbrella CE” for small campuses
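The campus-side requirement above ("ssh access, local accounts for supported VOs") amounts to creating an account and authorizing a key held by OSG operations. A minimal sketch, with assumed account, VO, and key names; the commands are only echoed here since they require root:

```shell
#!/bin/bash
# Sketch of campus-side setup for a hosted CE reached over ssh.
# Account name, VO, and key file name are assumptions, not from the slides.
VO="osg"
ACCOUNT="${VO}-pilot"
KEYFILE="hosted-ce.pub"   # public key provided by OSG operations

# Echo the steps instead of running them (they require root).
echo "useradd --create-home ${ACCOUNT}"
echo "install -d -m 700 -o ${ACCOUNT} /home/${ACCOUNT}/.ssh"
echo "cat ${KEYFILE} >> /home/${ACCOUNT}/.ssh/authorized_keys"
```

OSG staff then operate the CE remotely through this account; nothing OSG-specific needs to be installed on the campus cluster itself.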
152. Pay attention to "Sound Bites" that communicate the scale and reach of OSG to outside agencies/projects - Rob G, Bo, Clemmie
Open Science Grid: HTC supercomputer
● 2014 stats
  ○ 67% the size of XD, 35% of Blue Waters
  ○ 2.5 million CPU hours/day
  ○ 800M hours/year
  ○ 125M hours/year provided opportunistically
● >1 petabyte data transfer/day
● 50+ research groups
● thousands of users
● XD service provider for XSEDE
Rudi Eigenmann
Program Director, Division of Advanced Cyberinfrastructure (ACI), NSF CISE
CASC Meeting, April 1, 2015
Lowering barriers to usability
OSG as a campus research computing cluster
★ Login host
★ Job scheduler
★ Software (modules)
★ Storage
★ Tools
Software & tools on the OSG
● Distributed software file system
● Special module command
  ○ identical software on all clusters
● Common tools & libs
● Curated on demand, continuously
● HTC apps in the XSEDE campus bridging yum repo
$ switchmodules oasis    # use the grid-wide OASIS software stack
$ module avail           # list modules available on OASIS
$ module load R
$ module load namd
$ module purge           # unload all loaded modules
$ switchmodules local    # return to the site-local software stack
http://goo.gl/TlLq1M
Modules now used at most sites
All software accesses via module are monitored centrally for support purposes.
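Inside a job, the same module commands are typically issued by a small wrapper script. A sketch, assuming an OASIS init path and an R payload (both assumptions, not from the slides); without OASIS mounted it only reports the intended steps:

```shell
#!/bin/bash
# Job wrapper sketch: load software from OASIS via modules, then run
# the payload. The init path below is an assumed OASIS layout.
MODULE_INIT=/cvmfs/oasis.opensciencegrid.org/osg/modules/lmod/current/init/bash

if [ -r "$MODULE_INIT" ]; then
    # Worker node with OASIS mounted: identical software everywhere.
    source "$MODULE_INIT"
    module load R
    Rscript analysis.R
else
    # No OASIS here (e.g. testing on a laptop): report the plan only.
    PLAN="module load R; Rscript analysis.R"
    echo "OASIS not mounted; would run: $PLAN"
fi
```

Because the module environment is the same on all OASIS-enabled sites, the wrapper needs no per-site logic.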
User Tools
● $ tutorial
● $ connect
  ○ on login.osgconnect.net, on campus, or laptop
● $ module (software, all OASIS-enabled sites)
● $ stash-cp (Stash to job, in development)
Education and Training assets
● Helpdesk with community forum and knowledge base
● github seen as the strategy for formal management of user documentation
  ○ Markdown tutorials → same place as code
  ○ tutorial write-ups track code samples closely
  ○ auto HTML conversion and upload to the help desk (in seconds)
● Expect to announce helpdesk support.opensciencegrid.org this week
Code and Markdown managed in Github
Content indexed, searchable
Users see a personal history of support requests to OSG and can drill down to see the full interaction history. Staff can make private notes, or link to a Jira issue for technical support tracking.
Of course, all available via email: [email protected]
Can DM tweet to @osgusers which generates a ticket
Uber-like feedback is collected (except we don’t rate users :)
● Software Carpentry includes a section on scientific programming using Python. IPython Notebook is used for instruction.
● SWC typically asks users to install IPython on laptops; this is a top source of delays and confusion.
● In our DHTC edition of SWC, we already have a multiuser server with login accounts that users retain indefinitely.
● Idea: use this framework to provide a shared IPython, establishing a common baseline for the toolchain.
IPython Notebooks
IPython Notebook Service
Developed a platform to launch per-user IPython Notebook servers:
1. User visits http://ipython.osgconnect.net and logs in.
2. Server launches pre-configured IPython Docker container. Docker provides user and data isolation. Containers can be shut down and re-instantiated on demand.
3. Within moments, a newly provisioned IPython instance is available. Notebook storage is persistent and accessible via login.osgconnect.net.
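Step 2 above can be sketched as a docker invocation. The image name, port, and storage path are assumptions; the real service also handles login, proxying, and shutdown/re-instantiation, so the command is only constructed and echoed here:

```shell
#!/bin/bash
# Sketch of the per-user notebook container launch. All names below
# (image, port, storage path) are illustrative assumptions.
USER_NAME="alice"
HOST_PORT=8900

DOCKER_CMD="docker run -d --name notebook-${USER_NAME} \
  -p ${HOST_PORT}:8888 \
  -v /stash/user/${USER_NAME}/notebooks:/notebooks \
  osgconnect/ipython-notebook"

# Construct and show the command rather than running it.
echo "$DOCKER_CMD"
```

The bind-mounted notebook directory is what makes the storage persistent and visible from login.osgconnect.net, while the container itself stays disposable.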
Education and Training activities
● Working with Tim Cartwright and Lauren Michael (ACI-REF) to support the 2015 OSG User School
● UChicago-Northwestern roundtable (postponed to “Fall”)
● OSG-SWC @ Duke, October 26-29 (tentative dates)
Joint Software Carpentry & Open Science Grid Workshop at Duke University
Distributed high throughput computation is concerned with using many computing resources, potentially spread over large geographic distances and shared between organizations. These could be university research computing clusters, national leadership-class HPC facilities, or public cloud resources. Incorporating these into science workflows can dramatically benefit your research program. However, getting the most out of these systems requires some knowledge and skill in scientific computation. This workshop extends basic instruction on Linux programming from the Software Carpentry series with concepts and exercises on distributed high throughput computation. Participants will use resources of the Scalable Computing Support Center as well as the Open Science Grid, a national supercomputing-scale high throughput computing facility. There will be experts on hand to answer questions about distributed high throughput computing and whether it is a good fit for your science.
Other Campus Outreach Events
● Internet2 Technology Exchange, October 4-7, Cleveland (formal decision next week)
  ○ Distributed High Throughput Computation: a Campus Roundtable Discussion (Research Track)
● Rocky Mountain Advanced Computing Consortium, HPC Symposium (Aug 11-13, Boulder)
  ○ 30-minute slot shared with XSEDE
● XSEDE15, CLUSTER15 (Campus Bridging)
OSG as XD Provider to XSEDE
OSG XD - Last 12 months

Project Name | PI | Institution | Field of Science | Allocation | Wall Hours
TG-IBN130001 | Donald Krieger | University of Pittsburgh | Biological Sciences | Research | 54,881,313
TG-CHE140110 | John Stubbs | University of New England | Chemistry | Research | 1,047,897
TG-DMR130036 | Emanuel Gull | University of Michigan | Materials Science | Research | 563,106
TG-PHY120014 | Qaisar Shafi | University of Delaware | Physics and astronomy | Research | 309,036
TG-CHE140098 | Paul Siders | University of Minnesota, Duluth | Chemistry | Research | 88,047
TG-CHE130091 | Paul Siders | University of Minnesota, Duluth | Chemistry | Startup | 58,086
TG-MCB140160 | David Rhee | Albert Einstein College of Medicine | Molecular and Structural Biosciences | Startup | 39,517
TG-AST140088 | Francis Halzen | University of Wisconsin-Madison | High Energy Physics | Startup | 30,850
TG-CHE140094 | John Stubbs | University of New England | Chemistry | Startup | 27,057
TG-OCE130029 | Yvonne Chan | University of Hawaii, Manoa | Ocean Sciences | Startup | 22,007
TG-IRI130016 | Joseph Cohen | University of Massachusetts, Boston | Information Robotics and Intelligent Systems | Startup | 20,401
TG-DMR140072 | Adrian Del Maestro | University of Vermont | Materials Science | Startup | 20,179
TG-OCE140013 | Yvonne Chan | University of Hawaii, Manoa | Ocean Sciences | Research | 19,861
TG-AST150012 | Gregory Snyder | Space Telescope Science Institute | Mathematical Sciences | Startup | 18,099
TG-MCB090163 | Michael Hagan | Brandeis University | Molecular and Structural Biosciences | Research | 10,676
TG-DEB140008 | Robert Toonen | University of Hawaii, Manoa | Biological Sciences | Startup | 4,147
TG-TRA130011 | John Chrispell | Indiana University of Pennsylvania | Other | Campus Champions | 1,578
TG-MCB140232 | Alan Chen | SUNY at Albany | Molecular Biosciences | Startup | 598
TG-SEE140006 | Sheila Kannappan | University of North Carolina, Chapel Hill | Physics and astronomy | Educational | 46
TG-CDA100013 | Mark Reed | University of North Carolina, Chapel Hill | Mathematical Sciences | Campus Champions | 6
TG-CCR120041 | Luca Clementi | San Diego Supercomputer Center | Computer and Information Science and Engineering | Startup | 1
Total | | | | | 57,162,509
OSG XD: June XRAC Meeting (Nashville)
● OSG pledges 2M CPU-hours (SUs) per quarter
● There were 199 requests for XSEDE resources, mostly for Stampede and Comet
● There were no requests for OSG resources
● Post-meeting, the following were granted:
  ○ 50k SU to Kettimuthu/ANL (CS: workflow modeling)
  ○ 100k SU to Qin/Spellman (class on gene networks)
  ○ 1.39M SU to Gull/UMich (PHYS: condensed matter)
● Many NAMD requests
  ○ → start an MD-HTC activity with ACI-REF?
Conclusions & Outlook
● “Clemson on the air”
  ○ Local submit to OSG validated at a scale of 1500 jobs
  ○ Joint use of campus and OSG resources in the same work environment
    ■ Model for other campuses, ACI-REF as channel
  ○ Quick Connect to share resources is functional
● HTC training materials now formally managed
● Helping users via XSEDE
  ○ plan detailed studies of common application scaling properties & potential conversion to HTC workflows
@osgusers