11/30/2015 2:31:26 PM 5864_ER_FED 1 Doing More With Less: Managing Software at a DoD HPC Center Jay...

Post on 14-Dec-2015


Doing More With Less: Managing Software at a DoD HPC Center

Jay Blair

CSC High Performance Computing Center of Excellence

Introduction

• DoD HPCMP

– The High Performance Computing Modernization Program (HPCMP) was initiated in 1992 in response to congressional direction to modernize the Department of Defense (DoD) laboratories' high performance computing (HPC) capabilities

• Major Shared Resource Centers

– Complete HPC environment, including hardware, software, data storage, archiving, visualization, training, and expertise in specific computational technology areas

Introduction

• ASC MSRC

– Wright-Patterson AFB, OH

–Over 3000 CPUs on the floor

• HP, IBM, SGI & Linux

–Thousands of users (Gov, civilian, contractor)

–Software usage is mix of home grown and commercial

• 80% home grown

• 20% commercial

• Broken down into computational technology areas

• Jobs are managed in a batch only environment

The Dilemma

[Diagram labels: Users, Vendors, Management, Neverland, Wants and Needs]

The Method

[Diagram labels: Users, Cost, Vendors, Perfect Solution]

5 Whats

1. What do we have?

2. What is being used?

3. What do we need?

4. What are the options?

5. What will the future hold?

What Do We Have?

• Software

– What capabilities does the vendor say we have

–What is the competition

–How many seats/licenses/tokens

–What are the lifecycle and maintenance costs

–What is the license model

• Floating, node locked or token

• Open source

• Lease / Paid up

What Is Being Used?

• Log files are your friends

– Parsing in Perl

–Macrovision tools

• SAM Suite aka FLEXnet Manager

–Metrics of merit

• CPU hours

• Number of license hits

• Number of users

• Maximum concurrent users

• Denials (Hours unavailable)
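The metrics of merit above can be sketched in a few lines of log parsing. The slides mention Perl; this sketch uses Python instead. It assumes a FLEXlm-style debug log in which each event line looks like `HH:MM:SS (daemon) OUT: "feature" user@host`; that layout, the daemon name `vendord`, the feature name `solver`, and the user names are illustrative assumptions, not taken from the center's actual logs. It counts denial events rather than hours unavailable, tracks peak concurrent checkouts as a stand-in for maximum concurrent users, and leaves CPU hours out because they would normally come from batch accounting rather than the license log.

```python
import re
from collections import Counter

# Assumed event line layout (hypothetical sample, not a real site log):
#   9:05:13 (vendord) OUT: "solver" anne@node01
LINE_RE = re.compile(
    r'^\s*(\d{1,2}:\d{2}:\d{2})\s+\((\w+)\)\s+(OUT|IN|DENIED):\s+"([^"]+)"\s+(\S+?)@(\S+)'
)


def summarize(lines):
    """Return per-feature hits, distinct users, peak concurrency, and denials."""
    hits = Counter()      # number of successful checkouts (license hits)
    denials = Counter()   # number of denied checkout attempts
    users = {}            # feature -> set of user names seen
    in_use = Counter()    # licenses currently checked out
    peak = Counter()      # highest value in_use ever reached

    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip anything that is not a checkout event
        _time, _daemon, event, feature, user, _host = match.groups()
        if event == "OUT":
            hits[feature] += 1
            users.setdefault(feature, set()).add(user)
            in_use[feature] += 1
            peak[feature] = max(peak[feature], in_use[feature])
        elif event == "IN":
            in_use[feature] = max(0, in_use[feature] - 1)
        else:  # DENIED
            denials[feature] += 1

    return {
        feature: {
            "hits": hits[feature],
            "users": len(users.get(feature, ())),
            "peak_concurrent": peak[feature],
            "denials": denials[feature],
        }
        for feature in set(hits) | set(denials)
    }


sample = [
    ' 9:05:13 (vendord) OUT: "solver" anne@node01',
    ' 9:06:02 (vendord) OUT: "solver" bob@node02',
    ' 9:07:45 (vendord) DENIED: "solver" chris@node03 (license limit reached)',
    ' 9:30:00 (vendord) IN: "solver" anne@node01',
    ' 9:31:10 (vendord) OUT: "solver" chris@node03',
]
report = summarize(sample)
print(report["solver"])
```

A real log would also need date handling across midnight and server restarts, which this sketch ignores.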

What Do We Need?

• Vendor NDA

– Site specific issues

• Hardware/OS support

–The alpha is dead… long live the alpha

• Security

–No root, no gui, no problem

–Probe for interest in working relationship

• White papers

• Tech day

• Usage demographics

–36 month technology roadmap

What Do We Need?

• User Input

– What do they need

• Functional needs

• Experience base

• Training budget

–HTML survey to targeted group of users

• Brief

• Highly focused questions

• Leverage intranet or in house web tools

• Publicize the results

• Graph it

What Do We Need?

• Management thoughts

– Corporate initiatives

• Left asks Right: what are you doing?

–Existing partnerships

• Corporate rates, agreements, initiatives…

–Leverage

• Do not reinvent the wheel

–Timeline

• Time is money

–Access to data

• Is effort in the loop

–Do you have autonomy?

What Do We Need?

• Proof is in the pudding… er… benchmark

– User requirements drive problem set

–Best / worst of class

–Know the answer and tell the vendor

–Brief vendor

• What

• Why

• Time

• Cost/Benefit

• Performance

What Are the Options?

• Usual Suspects

– Add

– Remove

– Retain

• Users want the same or more as long as you pay

• Management wants less because they do pay

• Vendors want more because those payments on that Jag and the timeshare in Cancun aren’t getting any cheaper!

What Are the Options?

• Fallacy of Removal

– Cost savings for removal is never 100%

– More than likely have to add to remove

– Consider freezing maintenance on paid-ups

– Consider divestiture to another group

– Trade in value to OEM or competitor

– Consider Open Source

• Free as in free or as in beer?

What Will the Future Hold?

• The Art and Practice of Prediction

– Historical trends

–Current usage

–User-centric

–Project-centric

–Business methodology

• Make and Break

• Simulation

–If budget allows… Pad

Analysis Methodologies

• Return on Investment

– Ingredients

• Hours

• Costs

• Users

[Chart: Cost per User and Cost per Hour for Anne, Bob, Chris, Dan, Ellen]
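As a rough illustration of how the three ingredients combine, the sketch below computes cost per hour and cost per user for each package. The class name `AppUsage`, the package names, and all figures are made up for the example; the slides do not give the underlying numbers.

```python
from dataclasses import dataclass


@dataclass
class AppUsage:
    """One package's yearly figures (all values here are hypothetical)."""
    name: str
    annual_cost: float  # license plus maintenance, in dollars
    cpu_hours: float    # hours consumed by jobs using the package
    users: int          # distinct users who ran it

    def cost_per_hour(self) -> float:
        # An unused package has no meaningful rate; report infinity.
        return self.annual_cost / self.cpu_hours if self.cpu_hours else float("inf")

    def cost_per_user(self) -> float:
        return self.annual_cost / self.users if self.users else float("inf")


apps = [
    AppUsage("app_a", annual_cost=120_000, cpu_hours=40_000, users=8),
    AppUsage("app_b", annual_cost=30_000, cpu_hours=2_000, users=2),
]

for app in apps:
    print(f"{app.name}: ${app.cost_per_hour():.2f}/hour, ${app.cost_per_user():,.0f}/user")
```

In this invented data both packages cost the same per user, yet the second is five times more expensive per hour, which is why the two ratios are worth reading side by side.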

Analysis Methodologies

• Software Utility

– Ingredients

• Users

• Hours

• Number of Jobs

• Cost

[Chart: axes Cost, Hours, Number of Users, Jobs; Utility = Area]
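One plausible reading of "utility as area" is the area of the polygon drawn on a four-axis radar chart. The sketch below works under that assumption: each metric is scaled to 0..1 by dividing by the largest value seen across packages, and the area is divided by the area of the all-ones polygon. The scaling rule, the axis order, and whether cost should be inverted so that cheaper scores higher are all assumptions; the slides do not say how the tool normalizes.

```python
import math


def max_normalize(values):
    """Scale values to 0..1 by dividing by the largest one (assumed rule)."""
    peak = max(values)
    return [v / peak if peak else 0.0 for v in values]


def radar_area(radii):
    """Area of a radar-chart polygon whose axes are equally spaced."""
    n = len(radii)
    angle = 2 * math.pi / n
    # Sum the triangles formed by each pair of neighbouring axes.
    return 0.5 * math.sin(angle) * sum(
        radii[i] * radii[(i + 1) % n] for i in range(n)
    )


def normalized_area(radii):
    """Polygon area as a fraction of the area when every axis is at 1.0."""
    return radar_area(radii) / radar_area([1.0] * len(radii))


# Axis order assumed here: cost, hours, users, jobs (already scaled to 0..1).
print(normalized_area([1.0, 0.3, 0.2, 0.1]))
```

Because the area depends on which axes sit next to each other, the axis order has to be fixed before values are compared between packages.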

Analysis Methodologies

• Historical Baseline

– Examine costs over time

• Across all applications

• Across groups of similar functionality

[Chart: Cost per Hour for Uno, Dos, Tres, Cuatro, Cinco, Seis; All Software for FY04]

[Chart: Cost per Hour for Alpha, Beta, Gamma, Delta, Epsilon, Zeta; Cinco Software for FY04]

Analysis Methodologies

• Software Usage

– Ingredients

• Usage

• Hours

[Chart: Number of Jobs and Max Concurrent Users by month, January to June; Cinco Software, Vendor Epsilon]

Example CSM

• Computational Structural Mechanics

– In 2002 there were four major commercial codes

• Alpha, Beta, Gamma, Delta

–Maintenance tail was excessive

–Contract with most expensive vendor was at end of life

–Some paid-up licenses and some leases

–Number of licenses was large

Goal was to reduce cost w/o adversely affecting users

What Is Being Used? Sample Alpha Usage at ASC MSRC

[Chart: Number of Jobs (0 to 450) and Max Concurrent Usage (0 to 14) by month, Oct-01 to Feb-02; series: Jobs, Max Concurrent License, Average, Licenses Purchased]
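The comparison this chart invites, peak and average concurrent use against the number of licenses purchased, can be sketched as follows. The monthly peaks and the purchased count are invented for illustration; the chart's actual values did not survive in this transcript, and the function name `license_headroom` is hypothetical.

```python
def license_headroom(monthly_peaks, purchased):
    """Compare observed peak concurrent usage with licenses purchased."""
    peaks = list(monthly_peaks.values())
    peak = max(peaks)
    average = sum(peaks) / len(peaks)
    return {
        "peak": peak,
        "average": average,
        "unused_at_peak": purchased - peak,  # seats never in use, even at peak
    }


# Hypothetical monthly maximum concurrent license counts.
monthly_peaks = {"Oct-01": 6, "Dec-01": 9, "Jan-02": 7, "Feb-02": 8}
summary = license_headroom(monthly_peaks, purchased=12)
print(summary)
```

A consistently positive `unused_at_peak` suggests seats that could be dropped, while a value near zero alongside logged denials suggests the opposite.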

Example CSM

Describe your usage of CSM software

Example CSM

How much alpha code experience do you have?

Example CSM

How many resources do you request per job?

Example CSM

If you could run a parallel job, would you?

Example

Features of Importance

Example CSM

• User Defined Benchmarks

– Pulled from vendor documentation

–Industry standards

–User examples

• 25 problems

–Answers had to have decks and results

–Verification performed

• “Trivial” – Nameless Vendor

• 6 week timeline

Example CSM

CSM Solvers Feature Matrix

Feature Alpha Beta Delta Gamma

Implicit Linear X X X X

Implicit Nonlinear X X X X

Implicit Dynamics X X X X

User Defined Code X X X X

Heat Transfer X X X X

Aeroelasticity X

Explicit X X

Substructuring X X X

Optimization X X X

Distributed Memory Parallel (MPI) X X X

Shared Memory Parallel X X X X

Compaq Tru64 X X X X

SGI Irix X X X X

IBM AIX X X X X

= Support via 3rd party package = Not Available

Example CSM

CSM Decision

• Codes are functionally equivalent at the 85%+ level

• Removed 2 packages (leased codes)

• Added seats to 2 packages

• Added 1 new package

• 6 weeks of training

Net savings in excess of $3M over 5 years

No unhappy users

Example CCM

• Computational Chemistry Codes

– In 2004 there were several major commercial codes

• Alpha, Beta, Gamma, Delta, Epsilon, Zeta

–Large portion of costs in single code

–Contract with most expensive vendor was at end of life

–User requirements were undefined hence vendor was in control of situation

Goal was to right size w/o adversely affecting users

FY04 Costs

Alpha 93%

Gamma 1%

Beta 1%

Epsilon 1%

Delta 2%

Zeta 2%

Example CCM

GRAPHITE / ROI

Alpha CCM Code

Report Period: October 1, 2009 to Sept 19, 2010

Normalized Area: .025

Cost: $500,000

CPU Hours: 150,000

Number of Hits: 2000

Number of Users: 10

Cost / Hour: $3.33

[Chart axes: Cost, Hours, Number of Users, Jobs]
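The cost-per-hour figure in this report follows directly from the cost and CPU hours shown with it. The short check below reproduces it and derives two further ratios, cost per user and cost per hit, that the report does not list; those two are added here only as illustrations of the same arithmetic.

```python
# Figures as given in the Alpha CCM Code report above.
cost = 500_000       # dollars
cpu_hours = 150_000
hits = 2_000         # number of license hits
users = 10

cost_per_hour = cost / cpu_hours  # matches the report's Cost / Hour
cost_per_user = cost / users      # not in the report; derived here
cost_per_hit = cost / hits        # not in the report; derived here

print(f"Cost / Hour: ${cost_per_hour:.2f}")
print(f"Cost / User: ${cost_per_user:,.0f}")
print(f"Cost / Hit:  ${cost_per_hit:,.0f}")
```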

Example CCM

What Is Being Used? Sample Alpha Usage at ASC MSRC

[Chart: Number of Jobs (0 to 450) and Max Concurrent Usage (0 to 14) by month, Oct-01 to Feb-02; series: Jobs, Max Concurrent License, Average, Licenses Purchased]

Example CCM

[Chart: Cost per User and Cost per Hour for Alpha, Beta, Gamma, Delta, Epsilon, Zeta]

Example CCM

What are the costs across CCM?

Which features of Alpha are we using, and what do they cost?

[Chart: Cost per User and Cost per Hour by feature]

Example CCM

CCM Decision

• Not all features needed

• Reduced number of features within a package

• Pushed high $/hr or $/user features to users that need them

• Supported the idea of a core set of necessary features

• Offset some use to Government or Open Source codes

• Multiyear lease deal with vendor

Net savings in excess of $800K over 5 years

Users are more aware of metrics

Summary

• Software Utility is possible in a Government environment

• Users will come along if you make a good case

• Look to commercial offerings to increase value add

–Macrovision FLEXnet Manager

• Information is power with OEM vendors

• Decreasing costs can be done so that users do not suffer

Thanks!

• You’ve Got Questions: Excellent, I have answers!