TotalView is a graphical debugger that can be used for tracing errors in MPI programs. It can help you to localise errors to specific functions, as well as letting you inspect the contents of variables.

Introduction

You can use TotalView at NCI both on the login node and in the interactive queue. If you're only debugging a small program it is fine to use the login node; for larger MPI programs use the queue. If using the queue you'll have to add some flags to the qsub command:

qsub -v DISPLAY -l software=totalview -I

This does three things. Firstly, -v DISPLAY forwards your X11 connection to the compute nodes so that you can see the graphical interface. Secondly, -l software=totalview requests a licence from the licence server; if you don't request a licence then TotalView may kick you off in the middle of a debugging session. Finally, -I creates an interactive session.

As with most programs at NCI, TotalView is contained in a module. To access it, run
module load totalview

To run the debugger we'll first need something to debug. Download the CMS totalview example repository and build the examples:
git clone git://github.com/ScottWales/totalview
cd totalview
make

We'll start off with a very simple example: an MPI-enabled hello world. To start TotalView you must add the flag --debug to mpirun:
mpirun --debug ./helloworld

Three new screens will appear, looking something like
totalview_startup.png

In the top left of this screenshot is the list of currently running processes. The actual program hasn't started yet, so only mpirun is listed at the moment, but eventually this will show all the MPI ranks.

The top right is the startup options. Just click 'Ok' to dismiss this window; the defaults are generally fine.

The third window is the main debugging window, where most of the action happens. Press the big green play button to start the program running. When it asks if you want to stop the job, say 'Yes'. This makes the debugger stop at the call to MPI_Init; it should now look like

totalview_hellostart.png

The top left area is the stack trace. This gives the hierarchy of functions that were called to get to this point; in this case main called helloworld, which called mpi_init_f, which then called PMPI_Init, and so on.

The top right shows the current values of variables and registers. The registers are the values the CPU uses when it's running the program; program variables are loaded into these so that the CPU can perform operations on them. Mostly you don't have to worry about them; they are only being shown at the moment because there's no debugging information for the current function, __poll.

The main area shows the code currently being run. Right now it's showing assembly code: very basic instructions that the CPU can run directly (at the highlighted line it's comparing the value in register %rax to the value -4096). This is a very low-level function and doesn't have any source associated with it. Notice that in the stack trace the function helloworld is marked [f90]? This means there's source code associated with the function, which you can see by clicking the name in the stack trace.

totalview_hellomain.png

Now we're seeing actual source code. This function has source code associated with it because it was compiled with the -g flag. Make sure to compile with -g if you want to debug your code; debugging assembly isn't much fun.
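As a sketch of what that looks like, assuming mpif90 is the MPI Fortran wrapper on your system (the example repository's Makefile handles this for you):

```shell
# Hypothetical compile line: -g adds debugging information and -O0 turns
# off optimisation, so that lines and variables match the source exactly.
mpif90 -g -O0 -o helloworld helloworld.f90
```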

The stack frame is now showing two local variables, ierr and rank, both with the value 0. The yellow arrow shows that the program is currently in the process of calling the function MPI_Init.

The toolbar has a variety of ways to run the program. The green 'Go' button sets the program running; if you don't otherwise intervene the program will run to the end and stop. The 'Halt' button stops the program wherever it happens to be and shows the current state. This is helpful if the program is taking a long time and you want to see what the holdup is, for instance if it has gotten into an infinite loop. The red 'Kill' button shuts the program down (but leaves TotalView running); this is basically the equivalent of pressing Ctrl-C on the command line. The 'Restart' button shuts down the program and starts it again from the beginning, as if you had stopped it and re-run it on the command line.

The next section of the toolbar is for finer control. 'Next' runs until the program reaches the next line in the current file, stepping over any function calls. 'Step' will go to the next line to be executed, so if the current line is a function call it will step into that function's code. 'Out' runs to the end of the current function. 'Run To' runs the program until it reaches the currently selected line.

The dropdown at the left of the toolbar allows you to run different MPI processes or threads individually, which can help to debug race conditions in parallel programs. We'll leave that alone for the moment.

Debugging the UM


In order to debug a UM run you need to run the job on the interactive queue. You can use the umuisubmit_run script that the UM produces to help set this up. On Vayu, run
qsub -l software=totalview -v DISPLAY -I ~/umui_runs/uabcd-012345678/umuisubmit_run

This will submit an interactive job to the queue with the same CPU and memory settings as your normal UM run (substitute uabcd-012345678 with your own run folder). Rather than starting the UM automatically, the interactive job will leave you at the command line so that you can start setting things up.

First, load TotalView with
module load totalview

Next you'll have to tell the UM run scripts to enable TotalView. Some branches of the code are set up to do this easily; however, the surefire way to enable it is to edit the 'qsexecute' script, found in your job's bin folder ($DATAOUTPUT/UM_ROUTDIR/$USER/uabcd/bin). This script calls the actual UM executable.

There are lots of processor-specific options in this file. On NCI you want to search for LINUXMPP; you should get three matches. The first match is for running the reconfiguration, the second runs the UM with automatic post-processing and the third runs the UM without post-processing. If you're debugging the reconfiguration you'll need to edit the first match; otherwise edit the latter two matches to debug the UM.

The code to run the UM will either look like
    elif [[ $LINUXMPP = true ]]; then
      if [[ "$OASIS" = true ]]; then
        mpiexec -configfile o3coupled.conf  >> $OUTPUT
      else
        if [[ "$RUN_TYPE" = totalview ]]; then
            mpirun --debug -np $UM_NPES $LOADMODULE >>$OUTPUT
        else
            mpirun -np $UM_NPES $PAREXE >>$OUTPUT
        fi
      fi
or
    elif [[ $LINUXMPP = true ]]; then
      if [[ "$OASIS" = true ]]; then
        mpiexec -configfile o3coupled.conf  >> $OUTPUT
      else
        mpirun -np $UM_NPES $PAREXE >>$OUTPUT
      fi

If it looks like the first example you've got it easy: go back to the file ~/umui_runs/uabcd-012345678/umuisubmit_run and change the variable RUN_TYPE from normal to totalview. If your run looks like the second example you'll want to edit the script to be
    elif [[ $LINUXMPP = true ]]; then
      if [[ "$OASIS" = true ]]; then
        mpiexec -configfile o3coupled.conf  >> $OUTPUT
      else
        mpirun --debug -np $UM_NPES $LOADMODULE >>$OUTPUT
      fi
Note that the --debug flag has been added and PAREXE has been changed to LOADMODULE. Be sure to make this change for both the second and third LINUXMPP instances.
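If you prefer to script the change rather than edit by hand, a sed one-liner can make the swap. This is a sketch against a sample file; point it at the real qsexecute in your job's bin folder instead. The variable names ($PAREXE, $LOADMODULE) come from the excerpts above:

```shell
# Work on a sample copy of the relevant line from qsexecute.
qsexecute=qsexecute.sample
cat > "$qsexecute" <<'EOF'
        mpirun -np $UM_NPES $PAREXE >>$OUTPUT
EOF

# Add --debug and swap $PAREXE for $LOADMODULE, matching the edit above.
sed -i 's/mpirun -np \$UM_NPES \$PAREXE/mpirun --debug -np $UM_NPES $LOADMODULE/' "$qsexecute"

cat "$qsexecute"
```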

Now you'll want to run the model, which you can do by running the script
~/umui_runs/uabcd-012345678/umuisubmit_run

After a moment this will bring up the TotalView instance and you can begin debugging the UM. If the run exits immediately, make sure you've added '-v DISPLAY' to your qsub command; TotalView doesn't work if it can't connect to your display.
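A quick way to check this from your interactive shell, as a sketch (the check_display helper is just for illustration):

```shell
# TotalView needs a working X11 display; an empty DISPLAY usually means
# the -v DISPLAY flag was missing from the qsub command.
check_display() {
    if [ -n "$DISPLAY" ]; then
        echo "DISPLAY is set ($DISPLAY) - TotalView should be able to start"
    else
        echo "DISPLAY is empty - resubmit with 'qsub -v DISPLAY ...'"
    fi
}
check_display
```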