Yellowstone and HPSS
Transcript of Yellowstone and HPSS
Yellowstone and HPSS, or: What you’re doing on Bluefire that you should stop
David Hart, CISL User Services Section
August 7, 2012
Think different
• Yellowstone is not Bluefire
  – Yellowstone delivers 29x the computing capacity
• GLADE is not /ptmp
  – New /glade/scratch (~5 PB) is
    • 37x larger than /ptmp
    • 25x larger than old /glade/scratch (see the quick size check after this list)
  – New GLADE is 7x larger, 15x faster than old GLADE
• HPSS tape capacity is not infinite
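To put those multipliers in perspective, here is a quick back-of-the-envelope check; the implied old file-system sizes are derived from the quoted ratios and the ~5 PB figure, not official numbers:

```python
# Back-of-the-envelope check of the scratch-space multipliers quoted above.
# The 5 PB figure and the 37x / 25x ratios come from the slide; the implied
# old-system sizes are derived estimates, not official figures.
PB_IN_TB = 1000  # report results in TB (decimal petabytes assumed)

new_scratch_tb = 5 * PB_IN_TB                 # new /glade/scratch, ~5 PB
old_ptmp_tb = new_scratch_tb / 37             # implied size of Bluefire's /ptmp
old_scratch_tb = new_scratch_tb / 25          # implied size of old /glade/scratch

print(f"Implied old /ptmp size:          ~{old_ptmp_tb:.0f} TB")
print(f"Implied old /glade/scratch size: ~{old_scratch_tb:.0f} TB")
# ~135 TB and ~200 TB, respectively: a reminder of how much more room
# Yellowstone users now have to keep interim data on disk instead of tape.
```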
Tape ≠ slow disk
• “Temporary” HPSS files use tape space that is not easily reclaimed
  – Deleting files from tapes leaves gaps that are not refilled (unlike disk)
  – “Repacking” is not practical
    • Time consuming, may recover only 10% of tape space, occupies tape drives
  – Space is recovered only* when the entire archive is migrated to new media
• Wasted tape = smaller future HPC systems (see the rough illustration below)
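As a rough illustration of the cost, suppose a hypothetical 500 TB of “temporary” files were deleted from tape; with the ~10% repack recovery quoted above, most of that space stays stranded until the next media migration:

```python
# Rough illustration of why "temporary" files on tape are costly.
# The ~10% repack recovery figure is from the slide; the 500 TB deletion
# volume is a made-up example, not a real workload.
deleted_tb = 500.0        # hypothetical volume of temporary files deleted from tape
repack_recovery = 0.10    # repacking may recover only ~10% of the gapped space

recovered_tb = deleted_tb * repack_recovery
stranded_tb = deleted_tb - recovered_tb

print(f"Deleted from tape:        {deleted_tb:.0f} TB")
print(f"Recovered by repacking:   {recovered_tb:.0f} TB")
print(f"Stranded until migration: {stranded_tb:.0f} TB")
# On disk the full 500 TB would be reusable immediately; on tape roughly
# 450 TB stays unusable until the entire archive moves to new media.
```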
HPSS today
HPSS, May 2012: 15.75 PB
[Pie chart of HPSS holdings by entity; labels show entity, PB, and percentage. NCAR labs: 47%; CSL: 28%.]
HPSS Growth May 2011-May 2012
• 12.8 PB (May 2011) → 15.75 PB (May 2012)
  – +3 PB in one year, ~23% growth overall
  – ~70 TB added every week
• Largest increases
  – CGD: 741 TB
  – CESM: 624 TB
  – RDA: 374 TB
  – University: 302 TB
  – RAL: 295 TB
• NCAR lab holdings grew 1.6 PB (checked in the sketch after this list)
  – Excluding CESM and other CSL activity
  – From 5.96 PB to 7.58 PB (+27%)
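The percentages follow directly from the PB figures above; a quick arithmetic check:

```python
# Arithmetic check of the growth figures quoted above
# (all values in PB, taken directly from the slide).
total_may_2011, total_may_2012 = 12.8, 15.75
ncar_may_2011, ncar_may_2012 = 5.96, 7.58   # NCAR lab holdings, excluding CESM/CSL

total_growth = total_may_2012 - total_may_2011
ncar_growth = ncar_may_2012 - ncar_may_2011

print(f"Total growth:    {total_growth:.2f} PB "
      f"({100 * total_growth / total_may_2011:.0f}% in one year)")
print(f"NCAR lab growth: {ncar_growth:.2f} PB "
      f"({100 * ncar_growth / ncar_may_2011:.0f}% in one year)")
# ~2.95 PB (~23%) overall and ~1.62 PB (~27%) for the NCAR labs,
# consistent with the bullets above.
```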
HPSS by 2014: Hit the wall running…
Potential HPSS growth by Jan 2014
[Chart: projected HPSS growth through January 2014, only ~15 months away.]
HPSS allocations
• CISL’s new accounting system
  – Lets us set HPSS allocations
  – Helps you more easily monitor your holdings
• We reduced allocations for CSL awardees and CHAP awardees
• We will set a “budget” for NCAR labs, too
HPSS holdings, Jan 2014 (projected)
• 30+ PB of data (a rough extrapolation of the implied ingest rate follows below)
• 200M+ files
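The projection implies a sharp jump in ingest rate once Yellowstone ramps up. A rough extrapolation is sketched below; the 20-month window and the derived weekly rate are estimates based on the slide’s figures, not official CISL projections:

```python
# Rough extrapolation behind the Jan 2014 projection. The 15.75 PB (May 2012)
# and 30 PB (Jan 2014) figures are from the slides; the month count and the
# implied ingest rate are derived estimates.
start_pb, projected_pb = 15.75, 30.0
months = 20                      # May 2012 to Jan 2014

implied_pb_per_month = (projected_pb - start_pb) / months
implied_tb_per_week = implied_pb_per_month * 1000 / 4.33

print(f"Implied average growth: {implied_pb_per_month:.2f} PB/month "
      f"(~{implied_tb_per_week:.0f} TB/week)")
# Roughly 0.7 PB/month, more than double the ~70 TB/week of the Bluefire era;
# that is the scale of change behind the new HPSS allocations.
```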
Action items
1. Cleaning house
• USERS: Opportune time to delete old files (see the hsi sketch after this list)
  – CISL will eventually migrate current holdings to new media
  – Help us avoid migrating unnecessary data
  – Closing old projects will provide you with details about files associated with those projects
• CISL: Convert dual-copy files to single-copy
  – Recovers ~3 PB of space, mostly older MSS files (where dual-copy was the default)
  – Since moving to HPSS, the net amount of dual-copy data has decreased by 44 TB
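For users, a minimal sketch of how the review-then-delete step might be scripted around the hsi client. This assumes hsi is installed, authenticated, and accepts one-shot commands; the project path is purely hypothetical:

```python
# Minimal sketch of reviewing (and optionally removing) old HPSS holdings
# with the hsi command-line client. Assumes hsi is in your PATH, already
# authenticated, and accepts one-shot commands; the path is a hypothetical example.
import subprocess

def hsi(command: str) -> str:
    """Run a single hsi command and return its combined output."""
    result = subprocess.run(["hsi", command],
                            capture_output=True, text=True, check=True)
    # hsi often writes listings to stderr, so return both streams.
    return result.stdout + result.stderr

# 1. Review what an old project directory still holds before deciding.
print(hsi("ls -l /home/username/old_project"))   # hypothetical path

# 2. Only after reviewing: remove files that are clearly no longer needed.
# hsi("rm /home/username/old_project/run042_restart.tar")   # uncomment deliberately
```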
Limits to HPSS deletion
[Chart: HPSS holdings by year, in PB; category amounts estimated. This represents an upper limit on possible deletions, since some files may already have been removed.]
2. Manual second copies
• Eliminating the dual-copy class of service in favor of a backup area for user-managed second copies (one possible copy workflow is sketched after this list)
  – Currently the approach used by the Research Data Archive (RDA)
  – Advantages:
    • Guarantees the second copy is on different media
    • Reduces confusion about dual-copy limitations
    • Protects against user error (not true of the 2-copy CoS!)
      – Removal or overwriting of the original won’t clobber the second copy
    • Changing your mind consumes less tape this way
      – And less cost!
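One user-managed way to make that second copy is to stage the archived file back to scratch, then write it into the backup area. The sketch below uses hsi get/put with the “local : HPSS” argument form; every path, including the backup area location, is a hypothetical placeholder, so check CISL documentation for the actual conventions:

```python
# Sketch of a user-managed second copy, staged through /glade/scratch with
# hsi get/put. All paths, including the backup area name, are hypothetical.
import subprocess

def hsi(command: str) -> None:
    """Run a single hsi command, raising if it fails."""
    subprocess.run(["hsi", command], check=True)

primary = "/home/username/results/final_output.tar"    # existing HPSS file
staging = "/glade/scratch/username/final_output.tar"    # temporary disk copy
backup = "/backup/username/results/final_output.tar"    # assumed backup area path

hsi(f"get {staging} : {primary}")   # stage the archived file back to disk
hsi(f"put {staging} : {backup}")    # write the second copy into the backup area
# The second copy now lives on different media than the original, and deleting
# or overwriting the original in HPSS cannot clobber it.
```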
3. Think “archive”
• “A long-term storage area, often on magnetic tape, for backup copies of files or for files that are no longer in active use.”
  – American Heritage Dictionary
• “Records or documents with historical value, or the place where such records and documents are kept.”
• “To transfer files to slower, cheaper media (usually magnetic tape) to free the hard disk space they occupied. … [I]n the 1960s, when disk was much more expensive, files were often shuffled regularly between disk and tape.”
  – Free On-Line Dictionary of Computing
Updated GLADE policies for Yellowstone
• /glade/scratch (5 PB total)
  – 90-day file retention from last access
  – 10 TB quota default
    • If you need more, ask.
  – Use it!
    • Use responsibly! Don’t let large piles of data sit, untouched, for 88 days (see the self-audit sketch after this list)
    • We will rein in the 90 days, if needed.
• /glade/work (1 PB total)
  – 500 GB quota default for everyone
  – No purging or scrubbing!
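One way to “use it responsibly” is to audit your own scratch area for files that have not been touched recently, before the 90-day retention policy does it for you. A minimal sketch, with a hypothetical directory and an illustrative 75-day threshold:

```python
# Sketch of a scratch-space self-audit: list files under your /glade/scratch
# area that have not been accessed within a given number of days, so you can
# archive or delete them before the retention window catches them.
# The path and the 75-day threshold are examples, not policy.
import os
import time

SCRATCH = "/glade/scratch/username"   # hypothetical user directory
THRESHOLD_DAYS = 75
cutoff = time.time() - THRESHOLD_DAYS * 86400

for dirpath, _dirnames, filenames in os.walk(SCRATCH):
    for name in filenames:
        path = os.path.join(dirpath, name)
        try:
            if os.stat(path).st_atime < cutoff:
                print(path)
        except OSError:
            pass   # file removed or unreadable; skip it
```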
Optimize your workflows
• Don’t use tape for data/files you know are temporary or interim
• Plan ahead
• Leave temporary data in /glade/scratch
• Post-process to final form before archiving (see the bundling sketch after this list)
• Take advantage of LSF-controlled Geyser and Caldera to automate post-processing tasks
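As a concrete example of post-processing to final form, the sketch below bundles a run’s many small output files into a single compressed tar file before archiving, so HPSS stores one large object instead of thousands of tiny ones. The directory names are hypothetical, and the right bundling granularity depends on your workflow:

```python
# Sketch: bundle a run's processed outputs into one compressed tar file,
# then archive that single file to HPSS instead of the individual outputs.
# Paths are hypothetical placeholders.
import tarfile
from pathlib import Path

run_dir = Path("/glade/scratch/username/case01/post")     # processed final output
bundle = Path("/glade/scratch/username/case01_post.tar.gz")

with tarfile.open(bundle, "w:gz") as tar:
    for path in sorted(run_dir.rglob("*")):
        if path.is_file():
            tar.add(path, arcname=str(path.relative_to(run_dir)))

print(f"Wrote {bundle} ({bundle.stat().st_size / 1e9:.1f} GB); "
      "archive this single file rather than the individual outputs.")
```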
4. Monitor off-site data
• HPSS sizing and plans are estimated based on the size of CISL’s production HPC resources
• Not for NCAR data production on Hopper, Intrepid, Jaguar, Kraken, Pleiades, Blue Waters, Stampede …
  – HPC sites are pushing the data problem around
• Projects and labs need to be aware of their users’ data migration to NCAR from off-site
  – Factor this into local data management plans
5. Plan ahead
• CISL is working with B&P on whether to formalize tape storage needs and costs associated with proposal activity
• Most important for projects with plans to store “significant” amounts of data
  – How much is “significant” is TBD, but amounts that can be described in tenths of petabytes or more probably qualify.
  – If this applies to you, CISL can provide a cost for tape storage to include in your co-sponsorship budget.
Looking ahead… 1 year
• GLADE will expand by ~5 PB in Q1 2014
• How would you like to take advantage of the new disk?
  – Near-term, online backup
    • E.g., an area for 6-month “insurance” copies
  – Longer scratch retention
  – Larger permanent “work” space
  – Other ideas?
• HPSS procurement for the next-generation archive is in the planning stages
Looking ahead… 3-4 years
• Recap: Yellowstone may lead to 25+ PB per year stored in HPSS
• The successor to Yellowstone may be 10+ times more powerful
  – Anywhere from 15-40 Pflops, likely with GPU, MIC, or other many-core accelerators
• Can we afford to maintain and manage 10x the HPSS storage?
  – 250 PB per year, or 0.25 exabyte per year?
Questions?