Debugging Cluster Programs using symbolic debuggers.
Transcript of Debugging Cluster Programs using symbolic debuggers.
- Slide 1
- Debugging Cluster Programs using symbolic debuggers
- Slide 2
- Debugging Code
  - Careful review of your code
  - Add debugging code to your code
    - print statements at strategic locations in code
    - remove later
  - Use a symbolic debugger
- Slide 3
- Careful review of your code
  - Rereading your code is often helpful
  - Most parallel code errors are serial errors
  - Compare your code to specs
  - Take a break, review your code with a fresh brain
  - Have someone else help you review your code
- Slide 4
- Common sources of errors
  - Beyond what the compiler catches
  - Usually run-time errors
  - Incorrect use of pointers
    - Pointer points out of memory
    - Reference should have used a pointer
  - Referenced wrong variable
  - Index initialized wrong, wrong exit condition
- Slide 5
- Common parallel errors
  - Deadlock errors
    - Receive before send
    - Receive, but no send
  - Incorrect arguments in MPI calls
    - Mismatch on tags
    - Mismatch of source/destination
    - Misunderstanding of the use of an argument
- Slide 6
- Add Debugging Code
  - Add strategically placed code in your code to display critical information
  - Watch values of variables as the program progresses
  - Can create data-dump functions, call them when you need them
  - Have a way to remove them in production code
- Slide 7
- Add Debugging Code
  - Can be difficult to get the right debugging code in the right place
  - Does not scale well in a parallel environment
  - Can produce unmanageable or unintelligible output
- Slide 8
- Symbolic Debuggers
  - Allow you to
    - inspect your code
    - monitor its behavior
    - modify the data values
  - on the fly as your code executes
- Slide 9
- gdb
  - GNU debugger
- Slide 10
- Frequently used GDB commands:
  - break [file:]function - Set a breakpoint at function (in file).
  - run [arglist] - Start your program (with arglist, if specified).
  - bt - Backtrace: display the program stack.
  - print expr - Display the value of an expression.
  - c - Continue running your program (after stopping, e.g. at a breakpoint).
  - next - Execute next program line (after stopping); step over any function calls in the line.
  - step - Execute next program line (after stopping); step into any function calls in the line.
  - help [name] - Show information about GDB command name, or general information about using GDB.
  - quit - Exit from GDB.
- Slide 11
- gdb
- Slide 12
- Slide 13
- Slide 14
- Slide 15
- Running in X-windows
  - Linux (Unix) to Linux
    - ssh to host, login and enter X application
  - Other platforms (Windows, Mac)
    - Use X-windows server application
  - VNC in most platforms
    - VNC operates as a remote control application in Linux
    - VNC operates as an X-windows server viewer for Windows, Macintosh, Solaris
- Slide 16
- Running in X-windows
  - Using VNC
    - ssh to host and login
    - start vncserver
      - pay attention to display id (:n)
    - from your desktop run VNCViewer
      - select host with correct display id
    - After session kill vncserver
      - vncserver -kill :n (n is display id number)
- Slide 17
- Using VNC
- Slide 18
- Slide 19
- x desktop with VNC
- Slide 20
- ddd
  - a graphical front end to gdb
- Slide 21
- pgdbg
  - Debugger from the Portland Group (PGI)
  - Can use with PG compilers
  - Can use with GNU compilers
- Slide 22
- pgdbg common commands
  - Back to text mode for a bit
  - lis[t] [count | low:high | routine | line,count] - display lines from the source code file or routine
  - att[ach] [ | ] - attach to a running process or start a local executable and attach to it, or start an executable on
  - c[ont] - continue executing from the current location
- Slide 23
- pgdbg common commands
  - det[ach] - detach from the currently attached process
  - halt - halt the executing process or thread
  - n[ext] [count] - continue executing and stop after count lines of source code
  - nexti [count] - continue executing and stop after count instructions
- Slide 24
- pgdbg common commands
  - q[uit] - terminate pgdbg and exit
  - ru[n] [arg0 arg1 argn] - run program from beginning with arguments arg0, arg1
  - s[tep] [count] - execute next count lines of source code and stop. Steps into called routines
  - s[tep] up - steps out of current routine
  - stepi [count] - execute next count instructions and stop. Steps into called routines
- Slide 25
- pgdbg common commands
  - stepi up - steps out of current routine and stops
  - Event commands
    - break line | function - sets a breakpoint at the specified line or function. If no line or function is specified, lists existing breakpoints. A breakpoint stops execution at the specified point
    - clear [all | line | func] - clears all breakpoints, or a breakpoint at line line or at function func.
- Slide 26
- pgdbg common commands
  - stop var - break when the value of var changes at a location
  - watch expr - stops and displays the value of expr when it changes
  - track expr - like watch except does not stop execution
  - trace var - displays a trace of source line execution when the value of var changes
- Slide 27
- pgdbg common commands
  - p[rint] var - displays the value of a variable
  - edit filename - invokes an editor to edit file filename. If no filename is given, edits the current file
  - decl[aration] name - displays the type declaration for the object name
  - as[sign] var = expr - assigns the value expr to the variable var
  - proc [number] - sets the current process to process number number
- Slide 28
- Slide 29
- Slide 30
- Slide 31
- Slide 32
- Resources
  - gdb
    - man gdb
    - info gdb
    - Using GDB: A Guide to the GNU Source-Level Debugger, Richard M. Stallman and Roland H. Pesch, July 1991.
  - ddd
    - man ddd
  - VNC
    - http://www.uk.research.att.com/vnc/
    - http://www.realvnc.com
- Slide 33
- Resources
  - PGI Debugger Users Guide
    - http://www.pgroup.com/ppro_docs/pgdbg_ug/PGDBG4.htm
  - PGI Users Guide, PGI 4.1 Release Notes, FAQ, Tutorials
    - http://www.pgroup.com/docs.htm
  - MPI-CH
    - http://www.netlib.org/
  - OpenMP
    - http://www.openmp.org/
  - HPDF (High Performance Debugging Forum) Standard
    - http://www.ptools.org/hpdf/draft/intro.html
- Slide 34