Scientific Computing - Hardware

Post on 25-May-2015


Introduction to servers, HPC, and Cloud Computing


Computing Hardware

Jeff Allen

Quantitative Biomedical Research Center
UT Southwestern Medical Center

BSCI5096 - 3.26.2013

Outline

• Servers
• Clusters
• The Cloud

Outline

• Servers
– Concepts & Definitions
– Novel properties of servers
• Clusters
• The Cloud

Servers – Concepts & Definitions

“A computer or program that supplies data or resources to other machines on a network.”

server. (n.d.). Collins English Dictionary - Complete & Unabridged 10th Edition. Retrieved March 25, 2013, from Dictionary.com website: http://dictionary.reference.com/browse/server

• File Server
• Database Server
• Web Server
• Email Server
• iTunes Server
• Computing Server

Servers – Concepts & Definitions

• Same hardware components as your Personal Computer
– Processor, Memory, Power Supply, Hard Drive
• Often stacked in a rack

Image from: http://www.stealth.com/industrial_rackmounts_sr1501datasheet.htm


http://www.daystarinc.com/hosting-facility

Is this a server?

Image from: http://www.stealth.com/industrial_rackmounts_sr1501datasheet.htm

Is this a server?

Image from: http://mediapool.getthespec.com/media.jpg?m=gBLSSTJ6IbHLuZD1JNnmyw%3D%3D&v=HR

Is this a server?

Image from: http://www.maximumpc.com/articles/reviews/hardware

Is this a server?

Image from: http://www.phonearena.com/image.php?m=Articles.Images&f=name&id=28259&name=GT-I8520_1.jpg&caption=&title=Image+from+%22UPDATED%3A+Samsung+I8520+is+an+Android+phone+with+built-in+projector%22&kw=&popup=1

Is this a server?

Common Attributes of a Server

• Often runs an Operating System geared towards servers
• Primarily accessed remotely
– Often “headless” (no monitor)
• Runs 24/7, minimize downtime
• May be kept in a data center
– Superior cooling, increased security, etc.
• Redundancy (Power, Disk Storage)
• More powerful and expensive

Operating System

Client PCs
• Windows (XP, Vista, 7, 8)
• Mac OS
• Linux (Ubuntu, Mint, openSUSE)

Servers
• Linux (Red Hat, SUSE Enterprise, Ubuntu Server)
• Windows (Windows Server 2003, 2008, 2012)
• Non-Linux Unix (BSD, Solaris, AIX)


Remote Access & the Shell

• Typically don’t have physical access to the server, must access over a network

• Windows is heavily graphical, access using “Remote Desktop”

• Linux is less graphical, access via a “Shell”

Image from http://www.softsalad.com/software/remote-desktop-control.html

Shell Access

1. User logs in
2. User types command
3. Computer executes command and prints output
4. User types another command
5. …
6. User logs off

Modified from http://software-carpentry.org/4_0/shell/intro.html
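The type-a-command, see-the-output cycle above can be sketched as a tiny loop. This is only an illustration of the interaction pattern, not how a real shell is implemented: the names `run_command` and `mini_shell` are hypothetical, and each line is simply handed to the system shell through Python's `subprocess` module.

```python
import subprocess


def run_command(command: str) -> str:
    """Execute one command line and return what it printed (stdout)."""
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout


def mini_shell() -> None:
    """Read a command, run it, print the output, repeat until 'exit'."""
    while True:
        try:
            command = input("$ ")            # user types a command
        except EOFError:                     # end of input also ends the session
            break
        if command.strip() == "exit":        # user logs off
            break
        print(run_command(command), end="")  # computer executes and prints output


# Non-interactive demonstration of a single step of the loop:
print(run_command("echo hello"), end="")

# mini_shell()  # uncomment to try the loop interactively
```

On a server the same loop runs on the remote machine; only the typed text and the printed output travel over the network.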

Shell

Shell Comparison

Windows (Graphical) Linux (Shell)

Image from http://www.dedoimedo.com/computers/windows-7.html


Shell Access

• Slow learning curve
• Can often be confusing at first, requires a new way of thinking
• Ultimately very powerful and efficient
• Three reasons to use:
1. It’s your only choice for remote access on some non-graphical systems
2. Many software tools only offer Command Line interfaces
3. Allows for powerful new combinations of tools

Modified from http://software-carpentry.org/4_0/shell/intro.html

Data Centers

• Redundant, independent power feeds
– Diesel generator backup
• Redundant Internet connections
• Redundant cooling
• 24/7/365 staffing, restricted access

RAID

• “Redundant Array of Independent Disks”

• Store information redundantly

• Support failure of one or more hard drives without losing data

[Figure: a RAID array made up of Disk 1 through Disk 5]
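One way redundancy lets an array survive a failed drive is parity. The slides do not name a RAID level, so the sketch below assumes a simplified single-parity scheme (four data disks plus one parity disk, in the style of RAID 5 but without rotating the parity across disks). The disk contents and the helper name `xor_blocks` are made up for illustration.

```python
from functools import reduce
from typing import Sequence


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Four data disks plus one parity disk (five disks in total).
data_disks = [b"ACGT", b"TTGA", b"CCAT", b"GGTA"]
parity_disk = xor_blocks(data_disks)

# Simulate losing the third data disk and rebuilding it from the
# surviving data disks plus the parity disk.
lost_index = 2
survivors = [d for i, d in enumerate(data_disks) if i != lost_index]
rebuilt = xor_blocks(survivors + [parity_disk])

print(rebuilt == data_disks[lost_index])
```

Because XOR-ing a value with itself cancels out, combining the survivors with the parity block leaves exactly the missing block. A single parity block protects against one failed disk; tolerating more failures needs additional redundancy.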

Server Computing Power

• Often very expensive machines
• Hardware designed to support more resources than a PC
– May have dozens or hundreds of GB of RAM
– Very expensive powerful processor, or even multiple processors

Outline

• Servers
• Clusters
– Motivation & Concept
– Job submission
– Example
• The Cloud

Example Problem

• Group of 10 researchers

• Too many concurrent users, runs slowly

• Have some very large jobs

Naïve Solution

• Buy more independent servers!

• Let people connect to whichever server they want

• Problems:
– Not sure which servers are busiest
– Still takes weeks to run big simulations

Clustered Solution

• Servers are “nodes” in a cluster
• Log in via head node
• Head node manages requested jobs
– Submits them to “worker” or “slave” nodes
– Intelligently calculates available resources on each worker node
• Multiple nodes can work on a single task

Job Submission

• Prepare a script to be executed (“myjob.sh”)
– Include specifications on resources required (“-l nodes=2:ppn=4”)
– Or what queue it should be submitted to
  (different queues have different priorities and permissions)
• Submit that job to the head node (“qsub myjob.sh”)
• Head node will start the job as soon as sufficient resources are available
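A job script of the kind described above might be generated as follows. This is a sketch under assumptions: the `-l nodes=2:ppn=4` request and the `qsub` command come from the slides, while the `#PBS` directive prefix and the `-q` queue flag assume a PBS/Torque-style scheduler, which the slides do not name. The function `make_pbs_script` and the queue name `long` are hypothetical.

```python
def make_pbs_script(command: str, nodes: int = 2, ppn: int = 4, queue: str = "") -> str:
    """Return the text of a job script with resource requests as #PBS lines."""
    lines = ["#!/bin/bash", f"#PBS -l nodes={nodes}:ppn={ppn}"]
    if queue:
        # Assumed PBS/Torque syntax for choosing a queue.
        lines.append(f"#PBS -q {queue}")
    lines.append(command)
    return "\n".join(lines) + "\n"


script = make_pbs_script("./align.sh", nodes=2, ppn=4)
print(script)

# Writing the script to disk and submitting it needs a real cluster, so the
# submission step is shown only as a comment:
#   with open("myjob.sh", "w") as handle:
#       handle.write(script)
#   subprocess.run(["qsub", "myjob.sh"], check=True)
```

The directive lines start with `#`, so the shell treats them as comments; only the scheduler reads them when the script is submitted.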


Clustered Solution

Job #1 User: User4 Nodes Req’d: 1 Program: align.sh

Job #2 User: User9 Nodes Req’d: 1 Program: simul.sh

Job #3 User: User2 Nodes Req’d: 2 Program: splice.sh

Queued

Clustered Solution

Job #1 User: User4 Nodes Req’d: 1 Program: align.sh

Job #3 User: User2 Nodes Req’d: 2 Program: splice.sh

Queued
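The sequence in these slides can be sketched as a toy scheduler. This is one plausible reading of the animation, not the scheduler any real cluster uses: the class `HeadNode`, the first-come, first-served policy, and the cluster size of three worker nodes are all assumptions made for illustration. The job names and node counts are taken from the slides.

```python
from collections import deque


class HeadNode:
    """Toy head node: starts a job when enough worker nodes are free."""

    def __init__(self, total_nodes: int) -> None:
        self.free_nodes = total_nodes
        self.running = {}      # job name -> number of nodes in use
        self.queue = deque()   # jobs waiting for resources, oldest first

    def submit(self, name: str, nodes_required: int) -> None:
        self.queue.append((name, nodes_required))
        self._start_waiting_jobs()

    def finish(self, name: str) -> None:
        self.free_nodes += self.running.pop(name)
        self._start_waiting_jobs()

    def _start_waiting_jobs(self) -> None:
        # First-come, first-served: stop at the first job that does not fit.
        while self.queue and self.queue[0][1] <= self.free_nodes:
            name, nodes_required = self.queue.popleft()
            self.free_nodes -= nodes_required
            self.running[name] = nodes_required


head = HeadNode(total_nodes=3)   # cluster size is an assumption
head.submit("align.sh", 1)       # Job #1 starts immediately
head.submit("simul.sh", 1)       # Job #2 starts immediately
head.submit("splice.sh", 2)      # Job #3 needs 2 nodes, only 1 is free
print(list(head.queue))          # Job #3 is waiting
head.finish("simul.sh")          # Job #2 completes, freeing a node
print(sorted(head.running))      # Job #3 has now started
```

Production schedulers add priorities, multiple queues, and time limits on top of this basic bookkeeping.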


Clusters

• Solve problem of sharing resources
• Allow multiple nodes to collaborate on a single job
– Programs must be specifically designed to run in this fashion
• Can solve very large problems by combining hundreds of nodes together
– Global weather forecasting, particle collisions at CERN, etc.
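What "specifically designed" means can be illustrated by a program that splits its work into independent pieces and then combines the partial results. The sketch below is an analogy only: threads on one machine stand in for cluster nodes, so it shows the split-and-combine structure rather than any real speed-up, and the function names are hypothetical. Programs on real clusters commonly use a message-passing library such as MPI for this.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(chunk: range) -> int:
    """Work done by one 'node': sum the squares in its share of the data."""
    return sum(n * n for n in chunk)


def parallel_sum_of_squares(limit: int, workers: int = 4) -> int:
    """Split 0..limit-1 into one chunk per worker, then combine the results."""
    step = max(1, -(-limit // workers))  # ceiling division, at least 1
    chunks = [range(start, min(start + step, limit)) for start in range(0, limit, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))


print(parallel_sum_of_squares(1000) == sum(n * n for n in range(1000)))
```

The split works here because each chunk can be processed without knowing anything about the others; a program whose steps depend on each other cannot be divided this simply.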

HPC at UT Southwestern

• QBRC manages an 18-node cluster on campus.
• Have access to the Texas Advanced Computing Center (TACC) at UT Austin
– 6,400-node cluster with > 100k cores
– Attracts many users, so there is often a queue before your jobs will run.

Outline

• Servers• Clusters• The Cloud

Cloud Computing

• Vendors with access to massive computing resources began leasing their servers out
– Amazon, Microsoft, Google, Rackspace
– Charge per hour of use, usually just a few cents.

Cloud Computing - Advantages

• No up-front purchase/cost
• No hardware to manage
• 100 servers running in parallel for 1 hour cost the same as a single server running for 100 hours
– Can get parallel jobs done much more quickly
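The cost comparison is simple arithmetic: with per-hour pricing, the bill depends on total server-hours, not on how they are spread across machines. The rate of 10 cents per server-hour below is a hypothetical figure, and the sketch assumes the job divides perfectly across servers.

```python
def rental_cost_cents(servers: int, hours_each: int, cents_per_server_hour: int) -> int:
    """Total bill in cents for renting `servers` machines for `hours_each` hours."""
    return servers * hours_each * cents_per_server_hour


RATE = 10  # hypothetical price: 10 cents per server-hour

one_server = rental_cost_cents(servers=1, hours_each=100, cents_per_server_hour=RATE)
hundred_servers = rental_cost_cents(servers=100, hours_each=1, cents_per_server_hour=RATE)

print(one_server, hundred_servers)  # same bill, but the second finishes in 1 hour
```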

Cloud Computing - Disadvantages

• Data must be transferred over the Internet
– Can take hours to upload a large sequencing experiment.
• Can be more expensive than internal clusters
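A rough estimate shows why uploads can take hours. The data size (100 GB) and connection speed (100 megabits per second) below are hypothetical figures chosen for illustration, and the calculation ignores protocol overhead and shared bandwidth, so real transfers would be slower.

```python
def upload_hours(size_gigabytes: float, bandwidth_megabits_per_second: float) -> float:
    """Idealized upload time in hours for a file of the given size."""
    megabits = size_gigabytes * 8 * 1000  # 1 GB = 8,000 megabits (decimal units)
    seconds = megabits / bandwidth_megabits_per_second
    return seconds / 3600


print(round(upload_hours(100, 100), 1))  # about 2.2 hours
```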