Serial and parallel batch jobs differ in the following ways:
- The program must be compiled with a parallel library (MPI).
- The communication mode can be specified for parallel jobs.
Set MPI_COMM=TCP
- Call vsched -m to create a file called machines
which lists the nodes allocated to your job.
Note: If you have @echo on in your script, vsched -m
will disable it. Reissue @echo on after vsched -m
to turn it back on.
- For Linux: start mpd (the MPICH multi-purpose daemon, which launches MPI processes) on every
machine listed in your machines file.
mpdboot -n $NMACHINES -f /tmp/machines
- Use mpirun (Windows) or mpiexec (Linux) to run
the MPI program executable on all nodes listed in the machines file.
- Specify the number of nodes in your XML file:
<nodes>5</nodes>
- Specify the number of tasks with -np in the mpirun command or -n in the mpiexec command:
Windows: mpirun -np 10 myprog.exe
Linux: mpiexec -n 2 myexe
- Make sure your files are on the appropriate nodes before the job runs, and collect your output
from all nodes afterward.
- Before you run:
Windows: run mpipasswd
Linux: check your ssh setup
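
The Linux steps above can be sketched as a few shell functions. This is only an illustration, not the site's official job script. It assumes vsched -m has already written the machines file, and that the file holds one host name per line (optionally with a ":ncpus" suffix, as mpd hosts files allow). The function names list_machines, count_machines, check_ssh and run_parallel are hypothetical helpers. The path /tmp/machines and the program name myexe come from the examples above.

```shell
#!/bin/sh
# Sketch of the Linux MPI job steps. Assumes the machines file already exists
# (created by vsched -m) with one host per line; this format is an assumption.

# Print the distinct host names in the machines file.
list_machines() {
    # Strip any ":ncpus" suffix, drop blank lines, then de-duplicate.
    sed -e 's/:.*$//' -e '/^[[:space:]]*$/d' "$1" | sort -u
}

# Count the distinct hosts; this is the value passed to mpdboot -n.
count_machines() {
    list_machines "$1" | wc -l | tr -d ' '
}

# Check that passwordless ssh works to every host before starting mpd.
check_ssh() {
    for host in $(list_machines "$1"); do
        # BatchMode makes ssh fail instead of prompting for a password.
        if ! ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
            echo "ssh to $host failed; check your ssh setup" >&2
            return 1
        fi
    done
}

# Start mpd on all hosts, run the MPI program, then shut the daemons down.
# Arguments: machines file, number of tasks, program to run.
run_parallel() {
    machinefile=$1
    ntasks=$2
    prog=$3

    NMACHINES=$(count_machines "$machinefile")
    check_ssh "$machinefile" || return 1

    mpdboot -n "$NMACHINES" -f "$machinefile" || return 1
    mpiexec -n "$ntasks" "$prog"
    status=$?

    mpdallexit            # stop the mpd ring started by mpdboot
    return $status
}
```

A job script could then call, for example, run_parallel /tmp/machines 2 ./myexe. Deriving NMACHINES from the machines file keeps the mpdboot count in step with the nodes actually allocated to the job.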