Providing Job Configuration
To promote efficient usage of the research environment, the job scheduler is automatically configured with default run-time limits for jobs. You can override these defaults to help the scheduler understand how you want it to run your job.
Job instructions can be provided in two ways:

- On the command line, as parameters to your `sbatch` or `srun` command. For example, you can set the name of your job using the `--job-name=[name] | -J [name]` option:

  ```
  [flight@chead1 (mycluster1) ~]$ sbatch --job-name=mytestjob simplejobscript.sh
  Submitted batch job 51
  [flight@chead1 (mycluster1) ~]$ squeue
    JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
       51       all mytestjo   centos  R       0:02      1 node01
  ```
- In your job script, by including scheduler directives at the top of the script; this achieves the same effect as providing options with the `sbatch` or `srun` commands. Create an example job script, or modify your existing script, to include a scheduler directive that sets the job name:

  ```
  #!/bin/bash -l
  #SBATCH --job-name=mytestjob
  echo "Starting running on host $HOSTNAME"
  sleep 120
  echo "Finished running - goodbye from $HOSTNAME"
  ```
Including job-scheduler instructions in your job scripts is often the most convenient way to work with batch jobs. Follow the guidelines below for the best experience:

- Lines in your script that include job-scheduler directives must start with `#SBATCH` at the beginning of the line.
- You can put multiple instructions, separated by spaces, on a single line starting with `#SBATCH`.
- The scheduler parses the script from top to bottom and sets instructions in order; if you set the same parameter twice, the second value is used.
- Instructions are parsed at job submission time, before the job itself has actually run. This means you can't, for example, tell the scheduler to put your job output in a directory that the job script itself creates; the directory will not exist when the job starts running, and your job will fail with an error.
- You can use dynamic variables in your instructions (see next).
Warning

Place all of your `#SBATCH` lines immediately after the `#!/bin/bash -l` line. The scheduler stops looking for `#SBATCH` directives as soon as it reads a normal (non-comment) script line.
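As an illustrative sketch of this rule (the job-name values here are arbitrary), the following script shows the effect of directive placement: the first `#SBATCH` line is read by the scheduler, while the one that appears after the `echo` command is ignored:

```shell
#!/bin/bash -l
#SBATCH --job-name=named-correctly   # read: appears before any normal command
echo "Starting work"                 # first normal script line: directive scanning stops here
#SBATCH --job-name=never-seen        # ignored: appears after a command line
sleep 10
```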
Common Job Configuration Examples
Setting Output File Location
To set the output file location for your job, use the `-o [file_name] | --output=[file_name]` option. Both standard output and standard error from your job script, including any output generated by applications launched by the script, will be saved in the file you specify.
By default, the scheduler stores output relative to your home directory, but to avoid confusion we recommend specifying a full path to the file to be used. Although Linux can support several jobs writing to the same output file, the result is likely to be garbled; it's common practice to include something unique about the job (e.g. its job ID) in the output filename so that your job's output stays clear and easy to read.
Note
The directory used to store your job output file must exist and be writable by your user before you submit your job to the scheduler. Your job may fail to run if the scheduler cannot create the output file in the directory requested.
The following example uses the `--output=[file_name]` instruction to set the output file location:

```
#!/bin/bash -l
#SBATCH --job-name=myjob --output=output.%j
echo "Starting running on host $HOSTNAME"
sleep 120
echo "Finished running - goodbye from $HOSTNAME"
```
In the above example, assuming the job was submitted as the `centos` user and was given the job-ID number `24`, the scheduler will save the output data from the job in the file `/home/centos/output.24`.
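As noted above, the output directory must exist before the job is submitted. A minimal sketch of preparing a directory and submitting with a full path; the `results` directory name and `myjob.sh` script name are hypothetical examples:

```shell
# Create the output directory first; the scheduler will not create it for you.
mkdir -p "$HOME/results"

# Submit with a full path; %j expands to the job ID, keeping each job's output unique.
sbatch --output="$HOME/results/output.%j" myjob.sh
```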
Setting Working Directory
By default, jobs are executed from your home directory on the research environment (i.e. `/home/<your-user-name>`, `$HOME` or `~`). You can include `cd` commands in your job script to change to different directories; alternatively, you can provide an instruction to the scheduler to change to a different directory before running your job. The available options are:
- `-D | --workdir=[dir_name]` - instruct the job scheduler to move into the specified directory before starting to run the job on a compute node
Note
The directory specified must exist and be accessible by the compute node in order for the job you submitted to run.
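A minimal sketch of a job script using this option; the path `/home/centos/myproject` is a hypothetical example and, per the note above, must already exist and be accessible from the compute nodes:

```shell
#!/bin/bash -l
#SBATCH --job-name=wd-test
#SBATCH -D /home/centos/myproject   # hypothetical path; must exist before submission
echo "Job running in directory: $(pwd)"
```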
Waiting for a Previous Job Before Running
You can instruct the scheduler to wait for an existing job to finish before starting the job you are submitting with the `-d [state:job_id] | --dependency=[state:job_id]` option. For example, to wait until the job with ID 75 has finished before starting your new job, you could use the following syntax:
```
[flight@chead1 (mycluster1) ~]$ squeue
  JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
     75       all    myjob   centos  R       0:01      1 node01
[flight@chead1 (mycluster1) ~]$ sbatch --dependency=afterok:75 mytestjob.sh
Submitted batch job 76
[flight@chead1 (mycluster1) ~]$ squeue
  JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
     76       all    myjob   centos PD       0:00      1 (Dependency)
     75       all    myjob   centos  R       0:15      1 node01
```
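The same dependency can also be expressed as a directive inside the job script rather than on the command line. A sketch, reusing job ID 75 from the example above (`afterok` starts the job only if the named job completed successfully; Slurm also supports other dependency types, such as `afterany`):

```shell
#!/bin/bash -l
#SBATCH --job-name=mytestjob
#SBATCH --dependency=afterok:75   # start only after job 75 completes successfully
echo "Starting running on host $HOSTNAME"
```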