Scientific Computing - Hardware
Computing Hardware
Jeff Allen
Quantitative Biomedical Research Center
UT Southwestern Medical Center
BSCI5096 - 3.26.2013
Outline
• Servers
• Clusters
• The Cloud
Outline
• Servers
  – Concepts & Definitions
  – Novel properties of servers
• Clusters
• The Cloud
Servers – Concepts & Definitions
“A computer or program that supplies data or resources to
other machines on a network.”
server. (n.d.). Collins English Dictionary - Complete & Unabridged 10th Edition. Retrieved March 25, 2013, from Dictionary.com website: http://dictionary.reference.com/browse/server
• File Server
• Database Server
• Web Server
• Email Server
• iTunes Server
• Computing Server
Servers – Concepts & Definitions
• Same hardware components as your Personal Computer
  – Processor, Memory, Power Supply, Hard Drive
• Often stacked in a rack
Image from: http://www.stealth.com/industrial_rackmounts_sr1501datasheet.htm
Image from: http://www.daystarinc.com/hosting-facility
Is this a server?
Image from: http://www.stealth.com/industrial_rackmounts_sr1501datasheet.htm
Is this a server?
Image from: http://mediapool.getthespec.com/media.jpg?m=gBLSSTJ6IbHLuZD1JNnmyw%3D%3D&v=HR
Is this a server?
Image from: http://www.maximumpc.com/articles/reviews/hardware
Is this a server?
Image from: http://www.phonearena.com/image.php?m=Articles.Images&f=name&id=28259&name=GT-I8520_1.jpg&caption=&title=Image+from+%22UPDATED%3A+Samsung+I8520+is+an+Android+phone+with+built-in+projector%22&kw=&popup=1
Is this a server?
Common Attributes of a Server
• Often runs an operating system geared towards servers
• Primarily accessed remotely
  – Often “headless” (no monitor)
• Runs 24/7 with minimal downtime
• May be kept in a data center
  – Superior cooling, increased security, etc.
• Redundancy (Power, Disk Storage)
• More powerful and expensive
Operating System
Client PCs
• Windows (XP, Vista, 7, 8)
• Mac OS
• Linux (Ubuntu, Mint, openSUSE)
Servers
• Linux (Red Hat, SUSE Enterprise, Ubuntu Server)
• Windows (Windows Server 2003, 2008, 2012)
• Non-Linux Unix (BSD, Solaris, AIX)
Remote Access & the Shell
• Users typically don’t have physical access to the server and must reach it over a network
• Windows is heavily graphical; access it using “Remote Desktop”
• Linux is less graphical; access it via a “Shell”
Image from http://www.softsalad.com/software/remote-desktop-control.html
Shell Access
1. User logs in
2. User types command
3. Computer executes command and prints output
4. User types another command
5. …
6. User logs off
Modified from http://software-carpentry.org/4_0/shell/intro.html
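The type, execute, print cycle in the steps above can be imitated in a few lines of Python. This is only a sketch of the idea, not how a real shell or remote login works: the function names `run_command` and `shell_session` are made up for illustration, and the commands are run on the local machine rather than on a remote server.

```python
import subprocess


def run_command(command):
    """Execute one command line and return everything it printed."""
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout + result.stderr


def shell_session(commands):
    """Replay a list of typed commands, one after another."""
    transcript = []
    for command in commands:            # steps 2 and 4: the user types a command
        output = run_command(command)   # step 3: the computer executes it
        transcript.append((command, output))
    return transcript                   # step 6: the session ends


# Hypothetical session with two harmless commands.
for command, output in shell_session(["echo hello", "echo goodbye"]):
    print(f"$ {command}")
    print(output, end="")
```

A real session keeps waiting for the next command until the user logs off; here the list of commands stands in for the user's typing.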
Shell Comparison
Windows (Graphical) Linux (Shell)
Image from http://www.dedoimedo.com/computers/windows-7.html
Shell Access
• Steep learning curve
• Can often be confusing at first, requires a new way of thinking
• Ultimately very powerful and efficient
• Three reasons to use:
  1. It’s your only choice for remote access on some non-graphical systems
  2. Many software tools only offer Command Line interfaces
  3. Allows for powerful new combinations of tools
Modified from http://software-carpentry.org/4_0/shell/intro.html
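The third reason, combining tools, is easier to see with a small example. Command-line tools tend to do one job each, and the output of one is fed into the next. The sketch below imitates that style in Python; the three functions and the sample log lines are invented for illustration and are not real shell commands or real data.

```python
from collections import Counter


def keep_matching(lines, word):
    """Filter step: pass through only the lines that contain `word`."""
    return (line for line in lines if word in line)


def first_field(lines):
    """Cut step: keep the first whitespace-separated field of each line."""
    return (line.split()[0] for line in lines)


def count_each(items):
    """Count step: how often each item occurs, most frequent first."""
    return Counter(items).most_common()


# Made-up activity log, one event per line.
log = [
    "user4 ran align.sh",
    "user9 ran simul.sh",
    "user4 ran splice.sh",
    "user2 opened notes.txt",
]

# Chain the three small tools: filter, then cut, then count.
jobs_per_user = count_each(first_field(keep_matching(log, "ran")))
print(jobs_per_user)
```

None of the three pieces knows about the others, which is what makes new combinations cheap to build.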
Data Centers
• Redundant, independent power feeds
  – Diesel generator backup
• Redundant Internet connections
• Redundant cooling
• 24/7/365 staffing, restricted access
RAID
• “Redundant Array of Independent Disks”
• Stores information redundantly
• Tolerates the failure of one or more hard drives without losing data
Figure: a RAID Array containing Disk 1, Disk 2, Disk 3, Disk 4 and Disk 5
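The slides do not say which RAID level is pictured, so the following is a sketch under an assumption: four of the five disks hold data and the fifth holds XOR parity, which is one common way to survive a single disk failure. The disk contents are made up, and real arrays work on much larger blocks and spread the parity across disks.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Assumed layout: four data disks plus one parity disk (five disks in total).
data_disks = [b"ACGT", b"TTGA", b"CCAT", b"GGGA"]
parity_disk = xor_blocks(data_disks)

# Simulate losing one data disk (index 2), then rebuild it from the rest.
failed = 2
survivors = [disk for i, disk in enumerate(data_disks) if i != failed]
rebuilt = xor_blocks(survivors + [parity_disk])

print(rebuilt == data_disks[failed])
```

With a single parity disk, a second simultaneous failure would lose data; surviving more than one failure needs additional redundancy.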
Server Computing Power
• Often very expensive machines
• Hardware designed to support more resources than a PC
  – May have dozens or hundreds of GB of RAM
  – Very expensive, powerful processor, or even multiple processors
Outline
• Servers
• Clusters
  – Motivation & Concept
  – Job submission
  – Example
• The Cloud
Example Problem
• Group of 10 researchers
• Too many concurrent users, so everything runs slowly
• Some jobs are very large
Naïve Solution
• Buy more independent servers!
• Let people connect to whichever server they want
• Problems:
  – Not sure which servers are busiest
  – Still takes weeks to run big simulations
Clustered Solution
• Servers are “nodes” in a cluster
• Log in via head node
• Head node manages requested jobs
  – Submits them to “worker” or “slave” nodes
  – Intelligently calculates available resources on each worker node
• Multiple nodes can work on a single task
Job Submission
• Prepare a script to be executed (“myjob.sh”)
  – Include specifications on resources required
    • (“-l nodes=2:ppn=4”)
  – Or what queue it should be submitted to
    • Different queues have different priorities and permissions
• Submit that job to the head node (“qsub myjob.sh”)
• Head node will begin executing the job as soon as sufficient resources are available
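The slides do not name the scheduler, but “qsub” and “-l nodes=2:ppn=4” match the PBS/Torque family, so the sketch below assumes that style of job script, where resource requests are written as “#PBS” comment lines at the top. The helper `build_job_script`, the command `./align.sh` and the queue name `short` are hypothetical; the actual queues and limits depend on the cluster.

```python
def build_job_script(command, nodes=2, ppn=4, queue=None):
    """Return the text of a PBS-style job script (assumed format)."""
    lines = [
        "#!/bin/bash",
        # Resource request: number of nodes and processors per node (ppn).
        f"#PBS -l nodes={nodes}:ppn={ppn}",
    ]
    if queue is not None:
        # Optional: ask for a specific queue instead of the default one.
        lines.append(f"#PBS -q {queue}")
    lines.append(command)
    return "\n".join(lines) + "\n"


# Hypothetical job: run align.sh on 2 nodes with 4 processors each.
script_text = build_job_script("./align.sh", nodes=2, ppn=4)
print(script_text, end="")
```

Saved as “myjob.sh”, a script like this would then be handed to the head node with “qsub myjob.sh”, as described above.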
Clustered Solution
Job #1 User: User4 Nodes Req’d: 1 Program: align.sh
Job #2 User: User9 Nodes Req’d: 1 Program: simul.sh
Job #3 User: User2 Nodes Req’d: 2 Program: splice.sh
Queued
Clustered Solution
Job #1 User: User4 Nodes Req’d: 1 Program: align.sh
Job #3 User: User2 Nodes Req’d: 2 Program: splice.sh
Queued
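The walkthrough above can be replayed with a toy head node. This is a sketch under assumptions: the cluster is taken to have two worker nodes (the slides do not give the number), jobs are started strictly first come, first served, and the class name `HeadNode` is invented. Real schedulers also weigh priorities, queues and permissions.

```python
from collections import deque


class HeadNode:
    """Toy head node: starts queued jobs whenever enough nodes are free."""

    def __init__(self, total_nodes):
        self.free_nodes = total_nodes
        self.running = {}       # program name -> number of nodes in use
        self.queue = deque()    # waiting jobs, oldest first

    def submit(self, program, nodes):
        self.queue.append((program, nodes))
        self._start_waiting_jobs()

    def finish(self, program):
        self.free_nodes += self.running.pop(program)
        self._start_waiting_jobs()

    def _start_waiting_jobs(self):
        # Start jobs from the front of the queue for as long as they fit.
        while self.queue and self.queue[0][1] <= self.free_nodes:
            program, nodes = self.queue.popleft()
            self.free_nodes -= nodes
            self.running[program] = nodes


head = HeadNode(total_nodes=2)      # assumed cluster size
head.submit("align.sh", 1)          # Job #1 starts at once
head.submit("simul.sh", 1)          # Job #2 starts at once
head.submit("splice.sh", 2)         # Job #3 needs 2 nodes, so it waits
print(sorted(head.running), list(head.queue))

head.finish("simul.sh")             # one node frees up, still not enough
print(sorted(head.running), list(head.queue))

head.finish("align.sh")             # both nodes free, Job #3 can start
print(sorted(head.running), list(head.queue))
```

Under these assumptions the second state matches the last frame above: Job #1 is still running and Job #3 is still queued, because only one of the two nodes it needs is free.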
Clusters
• Solve problem of sharing resources
• Allow multiple nodes to collaborate on a single job
  – Programs must be specifically designed to run in this fashion
• Can solve very large problems by combining hundreds of nodes together
  – Global weather forecasting, particle collisions at CERN, etc.
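“Specifically designed” usually means the program itself splits the work into pieces, hands the pieces out, and combines the partial answers. The sketch below shows that shape on one machine, with threads standing in for cluster nodes; that substitution is an assumption made to keep the example runnable, and the sequence data is made up. Code that really spans several nodes needs a message-passing layer between machines, which is not shown here.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, parts):
    """Cut a sequence into at most `parts` pieces of nearly equal size."""
    size = -(-len(items) // parts)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def count_gc(sequence):
    """Work done by one worker: count the G and C bases in its piece."""
    return sum(1 for base in sequence if base in "GC")


# Made-up DNA sequence, split four ways (one piece per pretend node).
sequence = "ACGTGGCCATTA" * 1000
chunks = split_into_chunks(sequence, parts=4)

with ThreadPoolExecutor(max_workers=4) as pool:
    partial_counts = list(pool.map(count_gc, chunks))

# Combine step: the final answer is the sum of the partial answers.
total_gc = sum(partial_counts)
print(total_gc == count_gc(sequence))
```

The split and combine steps are the part a program has to be written for; a program without them simply runs on one node no matter how many are available.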
HPC at UT Southwestern
• QBRC manages an 18-node cluster on-campus.
• Have access to Texas Advanced Computing Center (TACC) at UT Austin
  – 6,400-node cluster with > 100k cores
  – Attracts many users, so jobs often wait in a queue before they run.
Outline
• Servers
• Clusters
• The Cloud
Cloud Computing
• Vendors with access to massive computing resources began leasing their servers out
  – Amazon, Microsoft, Google, Rackspace
  – Charge per hour of use, usually just a few cents.
Cloud Computing - Advantages
• No up-front purchase/cost
• No hardware to manage
• 100 servers running in parallel for one hour cost the same as a single server running for 100 hours
  – Can get parallel jobs done much more quickly
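The cost claim is simple arithmetic once billing is per server-hour. The sketch below uses an assumed price of 10 cents per server-hour, which is illustrative rather than any vendor's actual rate, and assumes the job splits perfectly across servers.

```python
def rental_cost_cents(servers, hours_each, cents_per_server_hour):
    """Total bill when every server-hour costs the same amount."""
    return servers * hours_each * cents_per_server_hour


PRICE_CENTS = 10  # assumed price per server-hour, for illustration only

one_server = rental_cost_cents(servers=1, hours_each=100, cents_per_server_hour=PRICE_CENTS)
hundred_servers = rental_cost_cents(servers=100, hours_each=1, cents_per_server_hour=PRICE_CENTS)

print(one_server, hundred_servers)  # same bill, but 1 hour of waiting instead of 100
```

Both runs use 100 server-hours, so the bill is identical; only the time to a result changes.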
Cloud Computing - Disadvantages
• Data must be transferred over the Internet
  – Can take hours to upload a large sequencing experiment.
• Can be more expensive than internal clusters
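A rough upload-time estimate shows where “hours” comes from. The numbers below are assumptions chosen for illustration (a 100 GB data set and a sustained 100 megabit-per-second connection); real experiments and real network speeds vary widely.

```python
def upload_hours(size_gigabytes, megabits_per_second):
    """Estimate transfer time, ignoring protocol overhead and slowdowns."""
    megabits = size_gigabytes * 8 * 1000   # 1 GB = 8,000 megabits (decimal units)
    seconds = megabits / megabits_per_second
    return seconds / 3600


# Assumed example: 100 GB over a sustained 100 Mbit/s link.
hours = upload_hours(size_gigabytes=100, megabits_per_second=100)
print(f"{hours:.1f} hours")
```

Halving the bandwidth or doubling the data doubles the wait, which is why transfer time belongs in the comparison with an internal cluster.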