Page 1:

Pawsey Site Report

Andrew Elwell

Page 2:

Overview
• Compute Resources
• Installed Systems (/scratch, /scratch2, /group)
• /scratch purge policy (robinhood)

Page 3:

Compute Services
• Magnus
• Galaxy
• Zeus
• Zythos

Page 4:

Sonexions
• 2 × Cray Sonexion 1600 (Seagate, née Xyratex)
• /scratch – 14 SSUs, 3.1 PB, general purpose
  – Magnus, Galaxy, Zeus, Zythos
  – 750M inodes (40%), 2.1 PB (68%) used
• /scratch2 – 6 SSUs, 1.4 PB, astronomy
  – Galaxy, CASDA
  – 40M inodes (3%), 1.2 PB (88%) used

Page 5:

Sonexion Software
• Appliance package of SL + Lustre
  – Add-on HA, CLI (mostly works), GUI (later works)
  – Nagios web interface, LTOP/cerebro, Puppet
  – Slurm on controllers. Why???
• Versions: 1.2.3 (/scratch2), 1.3.1 (/scratch)
• Upgrade to 1.4, 1.5 “not advised”

Page 6:

DIY method
• SGI (NetApp) arrays, 4 × 3U SGI (Intel) servers
• /group – 768 TB, project-duration storage
  – Magnus, Galaxy, Zeus, Zythos
• Lustre 2.4.3 on CentOS 6

Page 7:

LNET fun
• Each Cray has an Aries internal high-speed network
• Service nodes with IB cards to the fabric
• Fine-grained routing – priority routes to controllers
• Headache when adding Chaos (T&D) into the pool

Page 8:

Not just Lustre…
• SGI DMF, 6 PB
• 2 × Spectra Tfinity (10,000 slots, 5,000 tapes each)
• Small Copan MAID (2 drawers)
• 5 PB GPFS as RDS storage + tape backup

Page 9:

Test & Development
• Lustre test system
• Very small (4 × 14 TB OSTs)
• Test upgrade of /group from 2.4.3

Page 10:

/scratch Purge Policy
• “Any files not accessed >30 days will be removed”
• No quota restrictions
• /scratch/projectcode/username
• Use Robinhood (CEA)
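A minimal sketch of what such a 30-day last-access rule might look like in Robinhood v2.5-era purge-policy syntax; this is an illustration only, not the actual Pawsey configuration:

```
# Hypothetical sketch of a "not accessed in 30 days" purge rule
# (Robinhood v2.5 syntax; not the real Pawsey config).
purge_policies {
    policy default {
        condition {
            last_access > 30d
        }
    }
}
```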

Page 11:

Robinhood
• Compiled from Git, same as 2.5.5-2
  – Built then installed as RPM onto head nodes
• Robinhood runs on a ‘spare’ data mover node
  – Dell (Cray) node, 64 GB RAM, 2 × 8-core Sandy Bridge
  – All Lustre filesystems mounted locally
  – Remote MySQL (MariaDB 10.0) database

Page 12:

Robinhood Configs
• On NFS: /ivec/etc/robinhood/<fs>.conf
• DB password readable by supercomputing group
• High defaults (e.g. scan interval 180 d)
• rbh-config tool assumes DB + daemons on the same host (drop + recreate the DB by hand)
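Since rbh-config assumes the database and daemons share a host, the manual drop-and-recreate on the remote MariaDB server is plain SQL along these lines (database, user, host, and password are placeholders, not Pawsey's real values):

```sql
-- Recreate a per-filesystem Robinhood database by hand.
-- All names and the password below are hypothetical.
DROP DATABASE IF EXISTS robinhood_scratch;
CREATE DATABASE robinhood_scratch;
GRANT ALL PRIVILEGES ON robinhood_scratch.*
    TO 'robinhood'@'datamover.example' IDENTIFIED BY 'changeme';
FLUSH PRIVILEGES;
```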

Page 13:

Database issues
• Initial DB on a KVM guest (10 GB RAM, separate data LUN)
  – Massively underspecced
• Replaced with physical hardware
  – 800 GB SSD, 256 GB RAM, 2 × 1 TB SATA
  – Tables are ~1 GB per 1M inodes on Lustre
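That rule of thumb makes capacity planning straightforward; a quick sketch using the inode counts quoted on the earlier Sonexion slide:

```shell
# Estimate Robinhood DB size from the ~1 GB per 1M inodes rule of thumb.
est_db_gb() {
    # $1 = inode count on the Lustre filesystem
    echo $(( $1 / 1000000 ))
}
est_db_gb 750000000   # /scratch: 750M inodes -> ~750 GB of tables
est_db_gb 40000000    # /scratch2: 40M inodes -> ~40 GB of tables
```

A /scratch database alone comes close to filling the 800 GB SSD, which is consistent with the hardware listed above.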

Page 14:

DB Tuning
• /var/lib/mysql mounted noatime
• Tuning advice from the Robinhood website
  – Transaction flushing every second
  – Increase threads
  – Log file size
• Added a second SSD
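The three tuning points map onto familiar InnoDB knobs; a hedged sketch of a corresponding my.cnf fragment (the values are illustrative guesses, not Pawsey's actual settings):

```ini
# Illustrative my.cnf fragment; values are assumptions, not Pawsey's config.
[mysqld]
# "Transaction flushing every sec": flush the redo log once per second
# instead of at every commit (trades a little durability for throughput).
innodb_flush_log_at_trx_commit = 2
# "Increase threads"
innodb_thread_concurrency      = 16
# "Log file size": larger redo logs suit bulk-insert scan workloads.
innodb_log_file_size           = 1G
```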

Page 15:

Populate Database
• Enable changelogs
  – Discover that MDS/MGS HA is “strange”
    • Changing configs works best when failed over…
  – robinhood -r -d -f configfile
• Initial scan
  – robinhood --scan --once --no-gc --detach -f config

Page 16:

Monitor progress
• Watch STATS in the logfile (20 min default)
• ‘rbh-report -aif config’ displays the last stats
• ~100M inodes/day scan rate
• Big directories are SLOW
• ‘lsof’ to get the current location
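At the quoted rate, a full scan of the larger filesystem takes about a week; the arithmetic, using the 750M-inode /scratch figure from the Sonexion slide:

```shell
# Scan-time estimate: 750M inodes on /scratch at ~100M inodes/day.
awk 'BEGIN { printf "%.1f days\n", 750e6 / 100e6 }'   # prints "7.5 days"
```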

Page 17:

Finally! Start removing
• Test run OK (/scratch/project/2014/user)
• Empty dirs have a separate purge policy
  – OpenFOAM still hangs around for another 30 d
• Disable ‘oldest first’ sort
• Purge log split up (grep) into one file per project
• Dry run done 1 month before (empties the DB)
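The per-project split of the purge log is just grep keyed on a path component; a self-contained sketch, with invented project codes and paths following the /scratch/projectcode/username layout:

```shell
# Split a purge log into one file per project, keyed on the
# second path component of /scratch/<projectcode>/<username>/...
# All paths and project codes below are made up for illustration.
mkdir -p /tmp/purgesplit && cd /tmp/purgesplit
cat > purge.log <<'EOF'
/scratch/astro01/alice/run1.dat
/scratch/chem02/bob/out.log
/scratch/astro01/carol/mesh.vtk
EOF
for proj in $(cut -d/ -f3 purge.log | sort -u); do
    grep "^/scratch/$proj/" purge.log > "purged_$proj.log"
done
wc -l purged_*.log
```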

Page 18:

Oops…
• Discover that the filesystems were mounted ‘noatime’, so atime was stale and users may have accessed files more recently than recorded
• Alter mounts at the next maintenance window
  – Needed a rebuild of the compute image
• Wait.
• Some users requested a 90 d purge delay

Page 19:

User reporting
• Robinhood web GUI: “OK”, but…
  – Hacked to make it work with multiple filesystems
  – Sometimes serves cached images
  – Want a simple PI overview
• Spare-time refactor using Highcharts
• Use the CLI (mostly ops’ “who’s using what” questions)

Page 20:

Monitoring
• Log files
  – Changelog delta (last record / last purged)
  – Progress ratio
  – Entries/s scanned
• No out-of-the-box collectd plugins
  – CEA show some RRD plots; they tune the stats interval up to 1 min
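The changelog-delta metric above is cheap to derive: compare the newest changelog record the MDT has issued with the last record Robinhood has consumed. A sketch with made-up record numbers:

```shell
# Changelog backlog = newest MDT changelog record - last record read.
# In practice both numbers would come from the MDT (e.g. via
# 'lctl get_param mdd.*.changelog_users') and the Robinhood log;
# the values here are hypothetical.
last_rec=1500000     # hypothetical current changelog head on the MDT
last_read=1420000    # hypothetical last record consumed by robinhood
backlog=$(( last_rec - last_read ))
echo "$backlog"      # records the reader is behind
```

Emitting this number on an interval is the kind of thing a small collectd exec script could do, given the lack of out-of-the-box plugins.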

Page 21:

Future plans
• New hardware for /group
• Graph ALL THE THINGS
• Fabric refactor (block vs Lustre)
• Upgrade both Sonexions to the latest 1.3 release

Page 22:

Thanks
• David Schibeci – “1.8 just works on Epic”
• Rest of the Pawsey sysadmins
• Frithjov Iversen (Cray) – LNET FGR
• Peter Castle / Kurt Kappeler (Cray)