Debugging Cluster Programs
using symbolic debuggers

Debugging Code
- Carefully review your code
- Add debugging code to your code
  - print statements at strategic locations in the code
  - remove them later
- Use a symbolic debugger

Careful review of your code
- Rereading your code is often helpful
- Most parallel code errors are serial errors
- Compare your code to the specs
- Take a break, then review your code with a fresh brain
- Have someone else help you review your code

Common sources of errors
- Beyond what the compiler catches
  - Usually run-time errors
- Incorrect use of pointers
  - Pointer points outside valid memory
  - Reference used where a pointer should have been used
- Referenced the wrong variable
- Index initialized wrong, wrong exit condition
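
The index and exit-condition errors listed above can be illustrated with a small sketch. The function name `sum_array` is hypothetical and chosen only for this example; the buggy loop is kept in a comment so the block stays safe to compile and run.

```c
#include <stddef.h>

/* Hypothetical helper for illustration: sum the n elements of v. */
long sum_array(const int *v, size_t n)
{
    long total = 0;

    /* Buggy form (do not use): the index starts at 1 and the exit
     * condition is i <= n, so the loop skips v[0] and reads v[n],
     * which lies outside the array:
     *
     *     for (size_t i = 1; i <= n; i++) total += v[i];
     */

    /* Corrected form: valid indices are 0 .. n-1. */
    for (size_t i = 0; i < n; i++) {
        total += v[i];
    }
    return total;
}
```

The compiler accepts both loops; only the run-time behavior differs, which is why this kind of error usually has to be found by review or with a debugger.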

Common parallel errors
- Deadlock errors
  - Receive before send
  - Receive, but no send
- Incorrect arguments in MPI calls
  - Mismatch on tags
  - Mismatch of source/destination
  - Misunderstanding of the use of an argument
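
One way to guard against the source/destination and ordering errors listed above is to compute the communication partners in one place and check them before any message is sent. The sketch below assumes a ring of at least two ranks in which each rank sends to its right neighbour and receives from its left one; all function names are hypothetical, and the actual `MPI_Send` and `MPI_Recv` calls are left out so the logic can be checked without an MPI installation.

```c
#include <stdbool.h>

/* Rank that this process sends to in a ring of `size` ranks. */
int ring_dest(int rank, int size)
{
    return (rank + 1) % size;
}

/* Rank that this process receives from. */
int ring_source(int rank, int size)
{
    return (rank - 1 + size) % size;
}

/* Even ranks send first and odd ranks receive first, so the ring is
 * never in a state where every rank is blocked in a receive.
 * Assumes size >= 2. */
bool sends_first(int rank)
{
    return rank % 2 == 0;
}

/* Sanity check to run before communicating: the rank we send to must
 * expect us as its source. Returns true when every pairing matches. */
bool ring_is_consistent(int size)
{
    for (int r = 0; r < size; r++) {
        if (ring_source(ring_dest(r, size), size) != r) {
            return false;
        }
    }
    return true;
}
```

In an MPI program, a rank for which `sends_first` is true would call `MPI_Send` to `ring_dest` and then `MPI_Recv` from `ring_source`; the other ranks would make the same two calls in the opposite order.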

Add Debugging Code
- Add strategically placed code to your program to display critical information
- Watch values of variables as the program progresses
- Can create data-dump functions and call them when you need them
- Have a way to remove them in production code
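
A minimal sketch of these ideas follows, assuming a C program built with a C99 compiler. The macro `DBG_PRINT`, the `DEBUG` build flag, and the helper `format_dump` are hypothetical names introduced for this example. The macro disappears from production builds, and the dump helper prefixes each line with the process rank so output from several processes can be told apart.

```c
#include <stdio.h>
#include <stddef.h>

/* Compiled in only when built with -DDEBUG; otherwise expands to nothing. */
#ifdef DEBUG
#define DBG_PRINT(...) fprintf(stderr, __VA_ARGS__)
#else
#define DBG_PRINT(...) ((void)0)
#endif

/* Hypothetical data-dump helper: formats "[rank R] name: v0 v1 ..." into
 * buf, truncating if buf is too small. Returns the number of characters
 * stored, not counting the terminating null. */
size_t format_dump(char *buf, size_t cap, int rank,
                   const char *name, const int *v, size_t n)
{
    if (cap == 0) {
        return 0;
    }
    buf[0] = '\0';

    int w = snprintf(buf, cap, "[rank %d] %s:", rank, name);
    if (w < 0) {
        return 0;
    }
    size_t used = ((size_t)w < cap) ? (size_t)w : cap - 1;

    for (size_t i = 0; i < n && used < cap - 1; i++) {
        w = snprintf(buf + used, cap - used, " %d", v[i]);
        if (w < 0) {
            break;
        }
        size_t room = cap - used;
        used += ((size_t)w < room) ? (size_t)w : room - 1;
    }
    return used;
}
```

A call such as `DBG_PRINT("%s\n", buf);` after `format_dump` prints the dump in debug builds and compiles to nothing otherwise.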

Add Debugging Code
- Can be difficult to get the right debugging code in the right place
- Does not scale well in a parallel environment
- Can produce unmanageable or unintelligible output

Symbolic Debuggers
Allow you to:
- inspect your code
- monitor its behavior
- modify the data values
on the fly, as your code executes

gdb – GNU debugger
Frequently used GDB commands:
break [file:]function - Set a breakpoint at function (in file).
run [arglist] - Start your program (with arglist, if specified).
bt - Backtrace: display the program stack.
print expr - Display the value of an expression.
c - Continue running your program (after stopping, e.g. at a breakpoint).
next - Execute next program line (after stopping); step over any function calls in the line.
step - Execute next program line (after stopping); step into any function calls in the line.
help [name] - Show information about GDB command name, or general information about using GDB.
quit - Exit from GDB.
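
As a hedged illustration of how these commands fit together, the sketch below shows a small function and, in its leading comment, one possible session built only from the commands listed above. The file name `demo.c`, the program name `demo`, and the function `sum_to` are hypothetical; the sketch also assumes the program is compiled with `gcc -g` and that you supply a `main` that calls `sum_to`.

```c
/* Build with debugging symbols so gdb can show source lines and variables:
 *     gcc -g -o demo demo.c
 *
 * One possible session:
 *     gdb ./demo
 *     break sum_to      stop when sum_to is entered
 *     run               start the program
 *     bt                show the chain of calls that led here
 *     next              execute one source line
 *     print total       inspect a variable
 *     c                 continue running
 *     quit
 */

/* Sum of the integers 1..n; returns 0 when n <= 0. */
int sum_to(int n)
{
    int total = 0;
    for (int i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}
```

Stepping through the loop with `next` and checking `total` with `print` after each pass is a quick way to confirm the start index and exit condition.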
gdb

Running in X-windows
- Linux (Unix) to Linux
  - ssh to host, log in, and start the X application
- Other platforms (Windows, Mac)
  - Use an X-windows server application
- VNC on most platforms
  - VNC operates as a remote control application in Linux
  - VNC operates as an X-windows server viewer for Windows, Macintosh, Solaris

Running in X-windows
- Using VNC
  - ssh to host and log in
  - start vncserver
    - pay attention to the display id (:n)
  - from your desktop run VNCViewer
    - select the host with the correct display id
  - After the session, kill vncserver
    - vncserver -kill :n (n is the display id number)

Using VNC

X desktop with VNC

ddd – a graphic front end to gdb…

pgdbg
- Debugger from the Portland Group (PGI)
- Can be used with PG compilers
- Can be used with GNU compilers

pgdbg – common commands
Back to text mode for a bit
lis[t] [count | low:high | routine | line,count] - display lines from the source code file or routine
att[ach] <pid> [<exe> | <exe> <host>] - attach to a running process <pid>, or start a local executable and attach to it, or start an executable <exe> on <host>
c[ont] - continue executing from the current location

pgdbg – common commands
det[ach] - detach from the currently attached process
halt - halt the executing process or thread
n[ext] [count] - continue executing and stop after count lines of source code
nexti [count] - continue executing and stop after count instructions

pgdbg – common commands
q[uit] - terminate pgdbg and exit
ru[n] [arg0 arg1 … argn] - run the program from the beginning with arguments arg0, arg1, …
s[tep] [count] - execute the next count lines of source code and stop; steps into called routines
s[tep] up - step out of the current routine
stepi [count] - execute the next count instructions and stop; steps into called routines

pgdbg – common commands
stepi up - step out of the current routine and stop
Event commands:
break [line | function] - set a breakpoint at the specified line or function; if no line or function is specified, list the existing breakpoints. A breakpoint stops execution at the specified point.
clear [all | line | func] - clear all breakpoints, or the breakpoint at line line or at function func

pgdbg – common commands
stop var - break when the value of var changes at a location
watch expr - stop and display the value of expr when it changes
track expr - like watch, except that it does not stop execution
trace var - display a trace of source line execution when the value of var changes

pgdbg – common commands
p[rint] var - display the value of a variable
edit filename - invoke an editor to edit the file filename; if no filename is given, edit the current file
decl[aration] name - display the type declaration for the object name
as[sign] var = expr - assign the value expr to the variable var
proc [number] - set the current process to process number number
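
To show how the event and data commands above might be combined, here is a small sketch with a possible pgdbg session in its leading comment. The function `find_max` and the program name `demo` are hypothetical, the sketch assumes the program was compiled with debugging symbols (for example with a `-g` flag), and the session uses only commands from the lists above; you would supply a `main` that calls `find_max`.

```c
#include <stddef.h>

/* One possible pgdbg session on a program named demo:
 *     break find_max      stop when find_max is entered
 *     run                 start the program
 *     watch best          stop and show best each time it changes
 *     cont                run until the next event
 *     print best          display the current value
 *     assign best = 0     change the value on the fly
 *     quit
 */

/* Returns the largest of the n values in v; n must be at least 1. */
int find_max(const int *v, size_t n)
{
    int best = v[0];
    for (size_t i = 1; i < n; i++) {
        if (v[i] > best) {
            best = v[i];   /* a watch on best would report this change */
        }
    }
    return best;
}
```

Replacing `watch` with `track` in the session would report the same changes without stopping execution each time.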

Resources
- gdb
  - man gdb
  - info gdb; Using GDB: A Guide to the GNU Source-Level Debugger, Richard M. Stallman and Roland H. Pesch, July 1991.
- ddd
  - man ddd
- VNC
  - http://www.uk.research.att.com/vnc/
  - http://www.realvnc.com

Resources
- PGI Debugger User’s Guide
  - http://www.pgroup.com/ppro_docs/pgdbg_ug/PGDBG4.htm
- PGI Users Guide, PGI 4.1 Release Notes, FAQ, Tutorials
  - http://www.pgroup.com/docs.htm
- MPI-CH
  - http://www.netlib.org/
- OpenMP
  - http://www.openmp.org/
- HPDF (High Performance Debugging Forum) Standard
  - http://www.ptools.org/hpdf/draft/intro.html