GXP in a nutshell
• You can send jobs (Unix shell command lines) to many machines, very fast
• Very small prerequisites
  – Each node has Python (ver. 2.3.4 or later)
  – You have ssh access to each node without being asked to enter passphrases (e.g., use ssh-agent for ssh)
  – Install GXP only on your home node. GXP multiplies itself to the nodes you want to use
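The two per-node prerequisites can be checked from the home node before trying GXP. The sketch below is an illustration only and is not part of GXP: the function names and the host name `node01` are hypothetical, and it assumes OpenSSH, whose `BatchMode=yes` option makes ssh fail instead of prompting for a passphrase.

```python
import subprocess

MIN_PYTHON = (2, 3, 4)  # minimum Python version stated on the slide


def ssh_probe_argv(host):
    """Build an ssh command line that fails rather than prompting."""
    # BatchMode=yes disables passphrase and password prompts in OpenSSH.
    return ["ssh", "-o", "BatchMode=yes", host, "true"]


def can_login_silently(host, timeout=10):
    """Return True if `host` accepts an ssh login without any prompt."""
    try:
        result = subprocess.run(
            ssh_probe_argv(host),
            timeout=timeout,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def python_is_new_enough(version):
    """Compare a (major, minor, micro) tuple against the slide's minimum."""
    return tuple(version) >= MIN_PYTHON
```

If `can_login_silently("node01")` returns False, the usual remedy is the one the slide names: load your key into ssh-agent and try again.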
What is it useful for?
• With GXP, you can comfortably
  – operate many nodes, interactively or non-interactively
  – use nodes across multiple clusters
  – reach nodes behind firewalls/NATs
  – deal with many nodes, some of which are dead or unavailable on any given day
  – coordinate multiple clusters as a parallel processing resource without installing any job scheduling software (PBS, Condor, etc.)
Things made simple by GXP (1)
• Launch a parallel program on many nodes across multiple clusters
• Kill them with a single stroke of Ctrl-C
• Simple PBS/Condor-like job scheduling
• Monitor specific programs (ps … on all nodes)
• Kill specific programs (killall … on all nodes)
• Clean up all processes as a last resort (bomb)
Things made simple by GXP (2)
• Copy a file to many nodes, some behind firewalls/NATs
• Elect a single node from each file system
• Get the load average of all nodes and drop highly loaded nodes
• Check installation of a command and drop nodes that don’t have it
• List processes consuming a significant amount of CPU time
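Two of the items above, electing one node per file system and dropping highly loaded nodes, come down to simple filtering once each node has reported a value. The following is a minimal sketch of that filtering step only. It does not show how GXP collects the values, and the function names, the threshold, and the sample data shapes are all hypothetical.

```python
def elect_per_filesystem(fs_of_node):
    """Pick one representative node for each file system.

    fs_of_node maps a node name to an identifier of the file system it
    mounts. The alphabetically first node of each file system is elected.
    """
    elected = {}
    for node in sorted(fs_of_node):
        # setdefault keeps the first node seen for each file system.
        elected.setdefault(fs_of_node[node], node)
    return elected


def drop_loaded(load_of_node, threshold=1.0):
    """Return the nodes whose load average is below `threshold`, sorted."""
    return sorted(n for n, load in load_of_node.items() if load < threshold)
```

Electing one node per file system is useful when a file only needs to be written once per shared disk rather than once per node.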
Our Experience
• A fairly large natural language processing task
  – parse > 100M web documents collected and archived by our web crawler
  – resources: 350 CPUs across two clusters
  – GXP integrates them without any special effort
• Environments
  – No Globus/PBS installed (Globus/rsh ports are blocked across clusters)
  – Documents must be staged on demand due to disk capacity
  – Nodes in one of the two clusters cannot connect to hosts outside the cluster. We used GXP to stage a file through multiple relaying hosts
Basic Usage
• `explore’ command: log in and authenticate yourself to many nodes
• `e’ command: send and execute a command line (very fast)
Feature (1) multihop logins
• You can reach nodes through other nodes
• Two typical scenarios where this is important
  – Home → a cluster gateway → cluster nodes
  – Very large clusters, for which trees are mandatory
• Subsequent command submissions transparently reach all nodes
[Figure labels: home, cluster gateways]
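One way to picture multihop logins is as a tree rooted at the home node, where every node remembers which node it was logged in from. The sketch below is a hypothetical model of that idea, not GXP's internal data structure. The function name and host names are invented for illustration.

```python
def route_to(node, parent):
    """Return the chain of hosts from the home node down to `node`.

    parent maps each node to the node it was logged in from;
    the home node maps to None.
    """
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    path.reverse()  # collected leaf-to-root, reported root-to-leaf
    return path


# Hypothetical login tree: home -> gateway -> two cluster nodes.
LOGIN_TREE = {"home": None, "gateway": "home", "n1": "gateway", "n2": "gateway"}
```

With this tree, `route_to("n1", LOGIN_TREE)` lists the relays a command passes through on its way to `n1`. The same kind of chain would carry a file staged through relaying hosts.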
Feature (2) node selection
• You don’t always want to send commands to all nodes
• After logging in to many nodes, you can interactively select some of them
• `smask’ command selects the nodes on which the last command succeeded
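The selection behaviour described above can be modelled as a set of nodes that shrinks to those whose last command exited with status 0. This is a hypothetical model written for illustration, not GXP code. The class name, the method names other than `smask`, and the choice to leave the selection unchanged when no command has run yet are all assumptions.

```python
class Selection:
    """Tracks which nodes receive the next command."""

    def __init__(self, nodes):
        self.selected = set(nodes)
        self.last_status = {}

    def run(self, command):
        """Call `command(node)` for every selected node; record exit statuses."""
        self.last_status = {node: command(node) for node in self.selected}
        return self.last_status

    def smask(self):
        """Narrow the selection to nodes whose last command exited with 0."""
        if not self.last_status:
            # Assumption: with no previous command, keep the selection as is.
            return self.selected
        self.selected = {n for n, status in self.last_status.items() if status == 0}
        return self.selected
```

Here `command` stands in for whatever runs on a node and returns its exit status. For example, a check that a program is installed would return a non-zero status on nodes that lack it, and a following `smask()` would drop those nodes.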
Feature (3) coordination syntax
• `e’ command takes an extended shell syntax
• e {{ S }} M
  – Run S on all selected nodes
  – Run M on the home node
  – Merge all S’s standard out and feed it to M
• e B {{ S }}
  – Feed B’s standard out to all S’s
• e S is an abbreviation of e {{ S }}
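The data flow of the two forms above can be imitated on a single machine. In the sketch below, every "node" is just another local shell process, the function names are hypothetical, and the copies of S run one after another. It illustrates the described semantics under those assumptions and is not how GXP itself runs or merges remote output.

```python
import subprocess


def run_shell(command, stdin_text=""):
    """Run a shell command locally and return its standard output."""
    result = subprocess.run(
        command, shell=True, input=stdin_text,
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def gather(s_command, m_command, node_count):
    """Model of `e {{ S }} M`: merge every S's output and feed it to M."""
    merged = "".join(run_shell(s_command) for _ in range(node_count))
    return run_shell(m_command, merged)


def scatter(b_command, s_command, node_count):
    """Model of `e B {{ S }}`: feed B's output to every copy of S."""
    b_output = run_shell(b_command)
    return [run_shell(s_command, b_output) for _ in range(node_count)]
```

For instance, `gather("hostname", "sort", 3)` plays the role of collecting one line per node and sorting the result on the home node, while `scatter("cat list.txt", "wc -l", 3)` would hand the same input to every node (the file name is a placeholder).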