Virtual Machines for HPC Paul Lu, Cam Macdonell Dept of Computing Science.

Date posted: 21-Dec-2015

Virtual Machines for HPC

Paul Lu, Cam Macdonell

Dept of Computing Science

The Problems

1. Making applications run faster
   – Not discussed today
   – Parallelism is not always the answer

2. Making it easier to use different clusters
   – Packaging of applications, scripts, and libraries
   – Dealing with differences in environment

3. Making it easier to manage your files
   – Distributed file systems

Making Use of Clusters

• Heterogeneity creates complexity

• How can a scientist make use of all these clusters, without becoming a computing scientist?

[Figure: clusters with differing software stacks. Labels: Scientific Linux, Red Hat Linux, GROMACS, BLAST, Python 2.3.5, Python 2.2, FFTW, Globus, Trellis, Library X]

Shrink-Wrapped VMs

• Package once
  – OS (e.g., Linux)
  – Libraries
  – Application(s)

• Run many places
  – Busby
  – Glacier
  – Favourite workstation

[Figure: a VM containing Linux, GROMACS, and Trellis. Host label: Linux, Windows, Mac OS]

HPC using VMs

• Packaged once, run on many x86 clusters

• Using Trellis, data is automatically moved from local-to-remote, and back

[Figure: four copies of the same VM stack (GROMACS, Linux, Trellis). Labels: Glacier, Busby, AICT; File Server, Laptop; Local; Remote]
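The local-to-remote-and-back data movement described above can be sketched as a stage-in, run, stage-out pattern. This is not Trellis and does not use its interface; it only imitates the pattern, with two temporary directories standing in for the local and remote sides. The names `run_with_staging` and `uppercase_job` are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def run_with_staging(local_input, remote_dir, job):
    """Copy the input to the remote side, run the job there, copy the output back."""
    local_input = Path(local_input)
    remote_dir = Path(remote_dir)

    # Stage in: local -> remote.
    remote_input = remote_dir / local_input.name
    shutil.copy(local_input, remote_input)

    # Run the job on the remote copy; the job returns the path of its output file.
    remote_output = Path(job(remote_input))

    # Stage out: remote -> local, next to the original input.
    local_output = local_input.parent / remote_output.name
    shutil.copy(remote_output, local_output)
    return local_output


def uppercase_job(input_path):
    """Stand-in for a real application: writes an upper-cased copy of its input."""
    output_path = input_path.with_suffix(".out")
    output_path.write_text(input_path.read_text().upper())
    return output_path


if __name__ == "__main__":
    # Two temporary directories play the roles of "local" and "remote".
    with tempfile.TemporaryDirectory() as local, tempfile.TemporaryDirectory() as remote:
        source = Path(local) / "input.txt"
        source.write_text("hello")
        result = run_with_staging(source, remote, uppercase_job)
        print(result.read_text())
```

In a real deployment the two `shutil.copy` calls would be replaced by whatever transfer mechanism the site provides; the point is only that the user never moves files by hand.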

GROMACS on VM and HW

Concluding Remarks

• Small performance hit with VMs

• Much easier to package and use

• Potentially, access to many more compute nodes

There is hope!

• Virtualization!

What is Computing Science?

• “So…you…like…write programs or something?”

• Can you fix my printer?

Scientific Computing

• Scientific applications are on the leading edge of computing
  – Lots of resources
  – Complex interactions
  – Huge amounts of data

Fastest Supercomputer

• Fastest supercomputer
  – IBM BlueGene/L @ LLNL

• Previously fastest
  – NEC Earth Simulator

• Are computers good at solving problems in natural science?

Computing in Canada

• Canada lacks world class computing facilities

• We have to be able to aggregate resources from numerous institutions

• The CISS experiments explored aggregating computing resources
  – 4000 CPUs, 19 ADs

Aggregating is difficult

• Different administration domains

• Running GROMACS
  – Requires FFTW
  – Doesn’t like new compilers
  – Files must be in certain locations

• And this is just for one application!
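The kind of per-cluster checking that these requirements force on a user can be sketched as a small pre-flight script. This is only an illustration: the function name `check_environment` is hypothetical, and the executable, library, and path used in the example are assumptions, not the actual requirements of any GROMACS installation.

```python
import ctypes.util
import os
import shutil


def check_environment(executables, libraries, required_paths):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    for exe in executables:
        # shutil.which returns None when the program is not on PATH.
        if shutil.which(exe) is None:
            problems.append(f"executable not on PATH: {exe}")
    for lib in libraries:
        # find_library returns None when no matching shared library is found.
        if ctypes.util.find_library(lib) is None:
            problems.append(f"shared library not found: {lib}")
    for path in required_paths:
        if not os.path.exists(path):
            problems.append(f"required path missing: {path}")
    return problems


if __name__ == "__main__":
    # Hypothetical requirements, for illustration only.
    for problem in check_environment(["mdrun"], ["fftw3"], ["/opt/gromacs"]):
        print(problem)
```

A script like this has to be written and maintained per application and per cluster, which is the overhead that packaging everything inside one virtual image is meant to remove.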

Virtualization

• Is it appropriate for Scientific Computing?

– Performance has improved

– Pricing has improved (it’s become free)

Virtual Images

• Positives
  – Completely portable
    • Less administration
  – Control entire environment within Virtual Image
    • We can run any application in them
    • We can bundle data control software within them

Virtual Images

• Negatives
  – Large size
    • GBs for virtual disks
  – Performance loss
    • Virtualization is slower than running on hardware

VMware on Busby

• GROMACS test run on Busby1

[Bar chart: run time in hours (axis from 0 to 2), Hardware vs. VMware]

Future Directions

• Resolve performance anomaly

• More accurate timings of phases

• Run other applications

• Get all 4 nodes running concurrently
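One way to get more accurate per-phase timings is to wrap each phase of a run in a timer. The following is a minimal sketch under that assumption; the names `timed_phase` and `phase_seconds` and the phase labels are hypothetical, and the two phases shown are placeholders rather than real GROMACS steps.

```python
import time
from contextlib import contextmanager

# Accumulated wall-clock seconds per phase name.
phase_seconds = {}


@contextmanager
def timed_phase(name):
    """Accumulate wall-clock seconds spent inside the with-block under `name`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Record the time even if the phase raises an exception.
        elapsed = time.perf_counter() - start
        phase_seconds[name] = phase_seconds.get(name, 0.0) + elapsed


if __name__ == "__main__":
    with timed_phase("stage-in"):
        time.sleep(0.01)  # placeholder for copying input files
    with timed_phase("compute"):
        sum(i * i for i in range(100_000))  # placeholder for the real computation
    for name, seconds in phase_seconds.items():
        print(f"{name}: {seconds:.4f} s")
```

Running the same instrumented job on hardware and inside the VM would show which phase accounts for the difference, rather than only the total run time.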