Debugger Launch Process
In the following description, the term "debugger client" refers to the Eclipse client side of the parallel debugger. The term "sdm" or "debugger server" refers to the Scalable Debug Manager (SDM) or server side of the parallel debugger.
When the user hits the debug button, debug launch proceeds as follows:
- The debugger client is initialized
- The job is submitted by calling IResourceManager#submitJob()
- Optionally, debugger launch help actions are performed
- The SDM starts
Debugger initialization is performed on the client side of the debugger. It comprises the following steps:
- The debugger client binds to a random port on the local machine. This port is the debugger session port.
- If port forwarding is enabled on the connection, a port on the target machine will be forwarded to the debugger session port.
- The debugger arguments are computed, including the remote port obtained in the previous step, and saved to the launch configuration using the DEBUGGER_ARGS launch attribute.
- The debugger executable (as specified in the Debugger tab) is verified on the target machine. If the executable does not exist, an exception is thrown.
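The first and third initialization steps can be sketched in plain Java. This is a minimal illustration, not the PTP implementation: the class and method names are hypothetical, and the `--host=`/`--port=` argument spellings are assumptions about how the debugger arguments might look.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; all names here are hypothetical.
class DebuggerInitSketch {

    // Bind to a random free port: passing 0 lets the OS pick one.
    static ServerSocket bindSessionPort() {
        try {
            return new ServerSocket(0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Build an argument list from the session host and port.
    // The flag names are assumed for illustration.
    static List<String> buildDebuggerArgs(String sessionHost, int port) {
        List<String> args = new ArrayList<>();
        args.add("--host=" + sessionHost);
        args.add("--port=" + port);
        return args;
    }

    public static void main(String[] argv) throws IOException {
        try (ServerSocket session = bindSessionPort()) {
            int port = session.getLocalPort(); // the debugger session port
            System.out.println(buildDebuggerArgs("localhost", port));
        }
    }
}
```

If port forwarding is enabled, the port placed in the arguments would be the forwarded port on the target machine rather than the local session port.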
Debug Job Submission
The debug job is submitted using IResourceManager#submitJob() with mode set to "debug". It is up to the resource manager to determine what command will be used and what arguments will be passed to it. In the case of the Open MPI and MPICH2 RMs, the submitted command is "mpirun", which is passed the debugger executable followed by the debugger arguments from the launch configuration.
A thread is also started that monitors the job state.
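As a rough illustration of the command a tool-based RM might assemble, the sketch below builds an mpirun command line from the debugger executable and its arguments. The helper name, the executable path, the use of `-np`, and the argument spellings are all assumptions; the actual command is decided by each resource manager.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; names, paths and flags are hypothetical.
class DebugCommandSketch {

    // Assemble: mpirun -np <n> <debugger executable> <debugger args...>
    static List<String> buildCommand(int numProcs, String debuggerPath,
                                     List<String> debuggerArgs) {
        List<String> cmd = new ArrayList<>();
        cmd.add("mpirun");
        cmd.add("-np");
        cmd.add(Integer.toString(numProcs));
        cmd.add(debuggerPath);
        cmd.addAll(debuggerArgs);
        return cmd;
    }

    public static void main(String[] argv) {
        List<String> args = Arrays.asList("--host=localhost", "--port=40000");
        List<String> cmd = buildCommand(4, "/opt/ptp/bin/sdm", args);
        // prints: mpirun -np 4 /opt/ptp/bin/sdm --host=localhost --port=40000
        System.out.println(String.join(" ", cmd));
    }
}
```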
Debugger Launch Help
Debugger launch help is available if a resource manager needs assistance in order to start the debug session. It is enabled by overriding AbstractResourceManagerConfiguration#needsDebuggerLaunchHelp() and returning true.
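The override can be pictured with the self-contained sketch below. It uses a stand-in base class rather than the real AbstractResourceManagerConfiguration so that it compiles without PTP on the classpath. The stand-in's default of false, the JobState enum, and the shouldActivate helper are assumptions for illustration; the helper encodes the activation condition this section describes (help is requested and the job is RUNNING).

```java
// Illustrative sketch only; these types are stand-ins, not PTP classes.
class LaunchHelpSketch {

    // Hypothetical job states; only RUNNING matters for this sketch.
    enum JobState { SUBMITTED, RUNNING, COMPLETED }

    // Stand-in for AbstractResourceManagerConfiguration.
    // The default return value is an assumption.
    static abstract class ConfigStandIn {
        boolean needsDebuggerLaunchHelp() {
            return false;
        }
    }

    // A tool-based RM that asks the client to help start the debug session.
    static class ToolRmConfig extends ConfigStandIn {
        @Override
        boolean needsDebuggerLaunchHelp() {
            return true;
        }
    }

    // A proxy-based RM that starts the session itself.
    static class ProxyRmConfig extends ConfigStandIn {
    }

    // Launch help runs only when it is requested AND the job is RUNNING.
    static boolean shouldActivate(ConfigStandIn config, JobState state) {
        return config.needsDebuggerLaunchHelp() && state == JobState.RUNNING;
    }

    public static void main(String[] argv) {
        System.out.println(shouldActivate(new ToolRmConfig(), JobState.RUNNING));   // true
        System.out.println(shouldActivate(new ToolRmConfig(), JobState.SUBMITTED)); // false
        System.out.println(shouldActivate(new ProxyRmConfig(), JobState.RUNNING));  // false
    }
}
```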
Debugger launch help is activated when needsDebuggerLaunchHelp() returns true AND the job state changes to RUNNING. When this happens, the following steps are performed:
- A routing file is created that contains the tuples (index, host, port) for each process. The host information is obtained from the IPJob#getProcessNodeId() method (the other values are computed). The routing file is assumed to be in a location that is accessible to all sdm processes, such as a shared file system. The debug server does not proceed until this file is available.
- The client waits 3 seconds
- The sdm master process is launched on the target machine using an IRemoteProcessBuilder over the resource manager connection. The command is the debugger executable with the argument "--master", followed by the debugger arguments computed in the initialization step.
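The routing file step above can be sketched as follows. The on-disk format is an assumption (one whitespace-separated "index host port" line per process), as is the way ports are computed (a hypothetical base port plus the process index); the source only states that the file holds (index, host, port) tuples and that the host comes from IPJob#getProcessNodeId().

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; file format and port scheme are assumptions.
class RoutingFileSketch {

    // Produce one "<index> <host> <port>" line per process.
    static List<String> routingLines(List<String> hosts, int basePort) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < hosts.size(); i++) {
            lines.add(i + " " + hosts.get(i) + " " + (basePort + i));
        }
        return lines;
    }

    // Write the lines to a location visible to all sdm processes.
    static void writeRoutingFile(Path target, List<String> lines) {
        try {
            Files.write(target, lines);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] argv) throws IOException {
        List<String> hosts = Arrays.asList("node0", "node1", "node1");
        Path file = Files.createTempFile("routing", ".txt");
        writeRoutingFile(file, routingLines(hosts, 5000));
        Files.readAllLines(file).forEach(System.out::println);
        Files.delete(file);
    }
}
```

In a real deployment the target path would need to be on storage that every sdm process can read, such as a shared file system.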
Once launched, the sdm master will attempt to connect to the debug client. If the debugger port was forwarded over the resource manager connection, the forwarded port is used for the sdm master connection. Otherwise, the sdm master will attempt to connect to the session host (as specified in the Debugger tab).
When the debugger server processes detect the routing file, they read it to obtain the host/port combination for their immediate parent in the communication tree. Each process then attempts to connect to its parent using this information. Once all connections have been established (including the connections to the sdm master process), debugger startup is complete and the debugger is ready to begin processing commands.
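The parent lookup described above can be sketched as follows. The source does not say how the communication tree is shaped, so this sketch assumes, purely for illustration, a binary tree rooted at index 0 in which the parent of index i is (i - 1) / 2. The "index host port" line format and all class and method names are likewise hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; tree shape and file format are assumptions.
class ParentLookupSketch {

    // Host/port pair taken from one routing file entry.
    static final class Endpoint {
        final String host;
        final int port;

        Endpoint(String host, int port) {
            this.host = host;
            this.port = port;
        }
    }

    // Parse "<index> <host> <port>" lines into a lookup table.
    static Map<Integer, Endpoint> parse(List<String> lines) {
        Map<Integer, Endpoint> table = new HashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = trimmed.split("\\s+");
            table.put(Integer.parseInt(parts[0]),
                      new Endpoint(parts[1], Integer.parseInt(parts[2])));
        }
        return table;
    }

    // Assumed binary tree: the root (index 0) has no parent, signalled by -1.
    static int parentIndex(int index) {
        return index == 0 ? -1 : (index - 1) / 2;
    }

    // Return the endpoint this process should connect to, or null for the root.
    static Endpoint parentOf(int index, Map<Integer, Endpoint> table) {
        int parent = parentIndex(index);
        return parent < 0 ? null : table.get(parent);
    }

    public static void main(String[] argv) {
        Map<Integer, Endpoint> table = parse(Arrays.asList(
            "0 node0 5000", "1 node1 5001", "2 node1 5002", "3 node2 5003"));
        Endpoint p = parentOf(3, table);
        System.out.println(p.host + ":" + p.port); // prints node1:5001
    }
}
```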
Existing RM Support
|Resource Manager||Type||Launch Help||Notes|
|Open MPI||Tool||Y||Host information is obtained when the mpirun command starts (using the XML option) and is used to populate the model with the process information.|
|MPICH2||Tool||Y||Host information is obtained by running a periodic monitor command and parsing its output.|
|PE||Proxy||N||The proxy creates the routing file using information obtained from the poe attach.config file. The proxy also launches the sdm master process.|
|SLURM||Proxy||N||The proxy creates the routing file using information obtained from SLURM. The proxy also launches the sdm master process.|