There are several ways to test your script. The most obvious,
of course, is simply to submit it to the queue and wait to see if it all works.
However, this can waste time or service units, and you may not be
confident of some of the commands, especially if you are new to writing scripts.
In that case, here are some suggestions:
- Try out script commands in a small script on the login
nodes. The parts of your script that invoke command-line
commands are quite easy to test. For example, if you want to check setting a variable
and changing directories, you can put a portion of your script in a test script
like this:
REM set two variables
set WORKDIR=testdir
set CODE=karp.exe
REM create a subdirectory
mkdir %WORKDIR%
REM copy a file to the subdirectory
copy %CODE% %WORKDIR%
REM change working directory
cd %WORKDIR%
Then you simply run the test script on a login node. In the above example, if the
commands are saved in a file called test.bat in your home directory on the file
server, you just enter "test.bat" from a login node session.
However, when testing scripts on the login nodes, do not
run any executables, since running executables on the login nodes is against policy.
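To make failures easier to spot when trying such a test script on a login node, each step can be followed by a check. The following is only a sketch built on the example above: it reuses the WORKDIR and CODE names from that example, and the messages and exit codes are illustrative assumptions. It copies the executable but never runs it.

```batch
@echo off
REM same two variables as in the example above
set WORKDIR=testdir
set CODE=karp.exe

REM stop early if the file to be copied is missing
if not exist %CODE% (
    echo %CODE% not found
    exit /b 1
)

REM create the subdirectory only if it is not already there
if not exist %WORKDIR% mkdir %WORKDIR%

REM copy the file and stop if the copy reports an error
copy %CODE% %WORKDIR%
if errorlevel 1 (
    echo copy failed
    exit /b 1
)

REM change working directory, then print it to confirm where we are
cd %WORKDIR%
cd
```

Running "cd" with no argument prints the current directory, which confirms that the change of directory took effect.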
- Redirect standard error messages or output (for example, from "dir"
or "echo") to H:. At key points, send stderr and
(possibly) stdout directly back to the file server. You can easily watch for a nonzero-length
stderr file, then log into the batch node to try to correct the problem. Furthermore,
if your job fails or times out before your files are retrieved from T:, you'll still
retain the error messages. Missing input files are another common cause of job failure.
Therefore, in the following example, the script first creates a list of files on
H: to permit verification that all files were properly copied, then runs the executable
with stderr redirected to H:, then leaves evidence of its continuing progress:
dir > \\tc.cornell.edu\tc\Users\userid\test\jobstat.txt
myjob.exe 1> stdout.txt 2> \\tc.cornell.edu\tc\Users\userid\test\stderr.txt
echo Starting cleanup >> \\tc.cornell.edu\tc\Users\userid\test\jobstat.txt
Note that if your code tends to produce a large output file via its stdout, its performance may suffer when stdout is diverted to H: rather than T:.
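The check for a nonzero-length stderr file mentioned above can itself be scripted. The following is a minimal sketch, assuming the stderr path used in the example; ERRFILE is a name chosen here for illustration. It uses only built-in commands, so it does not run any executable on the login node.

```batch
@echo off
REM path taken from the example above; substitute your own userid and directory
set ERRFILE=\\tc.cornell.edu\tc\Users\userid\test\stderr.txt

REM the file does not appear until the job reaches the redirected command
if not exist "%ERRFILE%" (
    echo %ERRFILE% does not exist yet
    goto :eof
)

REM %%~zF expands to the size of the file in bytes
for %%F in ("%ERRFILE%") do (
    if %%~zF GTR 0 (
        echo %ERRFILE% contains %%~zF bytes of error output
    ) else (
        echo %ERRFILE% is empty so far
    )
)
```

Because the redirection creates the file as soon as the command starts, an existing but empty file simply means that nothing has been written to stderr up to that point.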
- Use "notify" to send messages to yourself from the script.
Submit your batch script normally, but include messages
to yourself in the script. For example, at several points
in the script, you might call the "notify" script to email yourself information
about how far the job has progressed:
call notify email_address "Batch job has started"
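As a sketch of how several such messages might bracket the main step of a job, the fragment below reuses the myjob.exe name from the earlier example and the same form of the "notify" call shown above. How "notify" treats its arguments beyond that one form is an assumption here, as is the wording of the messages.

```batch
REM announce that the job has begun
call notify email_address "Batch job has started"

REM run the main program, keeping stdout and stderr in local files
myjob.exe 1> stdout.txt 2> stderr.txt

REM %ERRORLEVEL% is expanded when this line is read, before notify runs,
REM so it still holds the exit code of myjob.exe
call notify email_address "myjob.exe finished with exit code %ERRORLEVEL%"

REM announce the final stage
call notify email_address "Starting cleanup"
```

A message that never arrives then tells you roughly where the script stopped.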
- Comment out "vsched -cancel" and log into the batch node
you were assigned. This is perhaps the most "hands-on"
method of checking your script. Submit your job as usual. When it has started, log on
to the batch node you were assigned. Examine the progression of the script. Try
executing the script commands manually. (When you are finished, be sure to run
"vsched -c" from a login node or from the batch node to release the nodes.)
- Ask consult@tc.cornell.edu to check your script.