Providing Job Configuration
To promote efficient usage of the research environment, the job scheduler is automatically configured with default run-time limits for jobs. You can override these defaults to help the scheduler understand how you want it to run your job.
Job instructions can be provided in two ways:

- On the command line, as parameters to your `sbatch` or `srun` command. For example, you can set the name of your job using the `--job-name=[name] | -J [name]` option:

  ```
  [flight@chead1 (mycluster1) ~]$ sbatch --job-name=mytestjob simplejobscript.sh
  Submitted batch job 51
  [flight@chead1 (mycluster1) ~]$ squeue
    JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
       51       all mytestjo   centos  R       0:02      1 node01
  ```
- In your job script, by including scheduler directives at the top of the script; this achieves the same effect as providing options with the `sbatch` or `srun` commands. Create an example job script, or modify your existing script, to include a scheduler directive that sets the job name:

  ```
  #!/bin/bash -l
  #SBATCH --job-name=mytestjob
  echo "Starting running on host $HOSTNAME"
  sleep 120
  echo "Finished running - goodbye from $HOSTNAME"
  ```
Including job-scheduler instructions in your job scripts is often the most convenient way to work with batch jobs. Follow the guidelines below for the best experience:

- Lines in your script that include job-scheduler directives must start with `#SBATCH` at the beginning of the line.
- You can put multiple instructions, separated by spaces, on a single line starting with `#SBATCH`.
- The scheduler parses the script from top to bottom and sets instructions in order; if you set the same parameter twice, the second value is used.
- Instructions are parsed at job submission time, before the job itself has actually run. This means you can't, for example, tell the scheduler to put your job output in a directory that the job script itself creates; the directory will not exist when the job starts running, and your job will fail with an error.
- You can use dynamic variables in your instructions (see next).
Warning

Place all of your `#SBATCH` lines immediately after the `#!/bin/bash -l` line. The scheduler stops looking for `#SBATCH` directives as soon as it reads a normal (non-comment) script line.
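As an illustrative sketch of this rule (the job-name values here are arbitrary), the following script shows the effect of directive placement: the first `#SBATCH` line is read by the scheduler, while the one that appears after the `echo` command is ignored:

```shell
#!/bin/bash -l
#SBATCH --job-name=named-correctly   # read: appears before any normal command
echo "Starting work"                 # first normal script line: directive scanning stops here
#SBATCH --job-name=never-seen        # ignored: appears after a command line
sleep 10
```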
Common Job Configuration Examples
Setting Output File Location
To set the output file location for your job, use the `-o [file_name] | --output=[file_name]` option. Both standard output and standard error from your job script, including any output generated by applications launched by the script, will be saved in the file you specify.
By default, the scheduler stores output relative to your home directory, but to avoid confusion we recommend specifying a full path to the file to be used. Although Linux can support several jobs writing to the same output file, the result is likely to be garbled; it's common practice to include something unique about the job (e.g. its job ID) in the output filename so that your job's output stays clear and easy to read.
Note
The directory used to store your job output file must exist and be writable by your user before you submit your job to the scheduler. Your job may fail to run if the scheduler cannot create the output file in the directory requested.
The following example uses the `--output=[file_name]` instruction to set the output file location:

```
#!/bin/bash -l
#SBATCH --job-name=myjob --output=output.%j
echo "Starting running on host $HOSTNAME"
sleep 120
echo "Finished running - goodbye from $HOSTNAME"
```
In the above example, assuming the job was submitted as the `centos` user and was given the job-ID number `24`, the scheduler will save the output data from the job in the file `/home/centos/output.24`.
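As noted above, the output directory must exist before the job is submitted. A minimal sketch of preparing a directory and submitting with a full path; the `results` directory name and `myjob.sh` script name are hypothetical examples:

```shell
# Create the output directory first; the scheduler will not create it for you.
mkdir -p "$HOME/results"

# Submit with a full path; %j expands to the job ID, keeping each job's output unique.
sbatch --output="$HOME/results/output.%j" myjob.sh
```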
Setting Working Directory
By default, jobs are executed from your home directory on the research environment (i.e. `/home/<your-user-name>`, `$HOME` or `~`). You can include `cd` commands in your job script to change to different directories; alternatively, you can provide an instruction to the scheduler to change to a different directory before running your job. The available options are:
- `-D | --workdir=[dir_name]` - instruct the job scheduler to move into the specified directory before starting to run the job on a compute node
Note
The directory specified must exist and be accessible by the compute node in order for the job you submitted to run.
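A minimal sketch of a job script using this option; the path `/home/centos/myproject` is a hypothetical example and, per the note above, must already exist and be accessible from the compute nodes:

```shell
#!/bin/bash -l
#SBATCH --job-name=wd-test
#SBATCH -D /home/centos/myproject   # hypothetical path; must exist before submission
echo "Job running in directory: $(pwd)"
```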
Waiting for a Previous Job Before Running
You can instruct the scheduler to wait for an existing job to finish before starting the job you are submitting with the `-d [state:job_id] | --dependency=[state:job_id]` option. For example, to wait until the job with ID 75 has finished before starting your new job, you could use the following syntax:
```
[flight@chead1 (mycluster1) ~]$ squeue
  JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
     75       all    myjob   centos  R       0:01      1 node01
[flight@chead1 (mycluster1) ~]$ sbatch --dependency=afterok:75 mytestjob.sh
Submitted batch job 76
[flight@chead1 (mycluster1) ~]$ squeue
  JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
     76       all    myjob   centos PD       0:00      1 (Dependency)
     75       all    myjob   centos  R       0:15      1 node01
```
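The same dependency can also be expressed as a directive inside the job script rather than on the command line. A sketch, reusing job ID 75 from the example above (`afterok` starts the job only if the named job completed successfully; Slurm also supports other dependency types, such as `afterany`):

```shell
#!/bin/bash -l
#SBATCH --job-name=mytestjob
#SBATCH --dependency=afterok:75   # start only after job 75 completes successfully
echo "Starting running on host $HOSTNAME"
```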