Job submission
Introduction
The job submission system is Torque/MAUI.
Jobs are submitted with the qsub command to the cluster's single queue, which gives access to all of the resources available in that queue (see the system configuration page).
Recommendations
To ensure fair use of the cluster, please follow the recommendations below.
QUEUE QUIET: for sequential or array jobs
- Long (>12h00) sequential jobs may use up to 64 cores.
- There is no limit for short (<5h00) sequential jobs. However, jobs lasting 8h00-12h00 should preferably be submitted at the end of the day so that they run overnight.
- The slot limit for array jobs is now 64 cores.
- No user should take more than 40% of the resources available in this queue.
These rules ensure that as many users as possible can work on the cluster. They may be applied flexibly depending on the system load (memory, CPU, etc.) on the cluster.
If you have an exceptional request, please contact us via https://supportapc.in2p3.fr
Monitoring
- Cluster status
Information on the cluster's load is available on the web interface of our monitoring tool, Ganglia, at http://www.apc.univ-paris7.fr/ganglia
You can also check the status of the nodes with the pbsnodes command.
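As a rough illustration of checking node status from the command line, the sketch below counts how many nodes are in each state. It assumes the usual "pbsnodes -a" layout, in which each node name is followed by indented "key = value" lines including a "state = ..." line. The function name summarize_node_states is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count nodes per state from `pbsnodes -a` output on stdin.
# Assumes each node block contains an indented line of the form "state = <value>".
summarize_node_states() {
    awk '$1 == "state" && $2 == "=" { count[$3]++ }
         END { for (s in count) print s, count[s] }' | sort
}

# Usage on the cluster (not executed here):
#   pbsnodes -a | summarize_node_states
```

The output is one line per state, sorted alphabetically, with the number of nodes in that state.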
- Job status
Use qstat to check job status:
- Q: queued
- C: completed
- E: exiting
- H: held
- R: running
- S: suspended.
You can also use /usr/local/maui/bin/showq to display information about active, eligible, blocked, and/or recently completed jobs.
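The state letters reported by qstat can also be tallied automatically. The following is a minimal sketch, assuming the default qstat layout (two header lines, then one job per line with the state letter in the second-to-last column). The function name count_job_states is hypothetical, and other qstat output modes (for example with -u or -f) use a different layout that this sketch does not handle.

```shell
#!/bin/sh
# Hypothetical helper: count jobs per state letter from default `qstat` output.
# Skips the two header lines and reads the state from the second-to-last column.
count_job_states() {
    awk 'NR > 2 && NF >= 6 { count[$(NF-1)]++ }
         END { for (s in count) print s, count[s] }' | sort
}

# Usage on the cluster (not executed here):
#   qstat | count_job_states
```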
A simple job example
The qsub command takes a number of command-line arguments, which can also be specified in a job script file, together with the call to the actual computation code. For example, myjob.sh:
    #!/bin/bash
    #PBS -N myjobname
    #PBS -o out_file
    #PBS -e err_file
    #PBS -q quiet
    #PBS -M email@address
    #PBS -l nodes=1:ppn=4,mem=4gb,walltime=24:00:00
    export SCRATCH="/scratch/$USER.$PBS_JOBID"
    echo "Hello world! "
    cd my_working_dir
    ./my_executable_file
The command qsub myjob.sh will:
- submit the job to the queue
- execute it when 4 processors and 4 GB of memory are available
- create a temporary directory referenced in the $SCRATCH environment variable
- run the executable file
- print the standard output to the out_file file
- print the standard error to the err_file file
Use qdel to remove a job from the queue.
Note that it is important to declare the total memory of your job (regardless of the number of "ppn" requested) by setting the "mem" parameter. Otherwise the job gets the default value of "4gb" set for the quiet queue.
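Because "mem" is the total for the job rather than a per-core value, it can help to compute it from a per-core estimate before writing the resource line. The sketch below is one possible way to do this. The function name pbs_resources and the per-core figure are hypothetical, and it only covers the single-node case.

```shell
#!/bin/sh
# Hypothetical helper: build the value for "#PBS -l" from a per-core memory
# estimate, since "mem" is the total for the whole job, not per core.
pbs_resources() (
    ppn=$1              # cores requested on one node
    mem_per_core_gb=$2  # estimated memory per core, in GB (integer)
    walltime=$3         # HH:MM:SS
    total_gb=$((ppn * mem_per_core_gb))
    echo "nodes=1:ppn=${ppn},mem=${total_gb}gb,walltime=${walltime}"
)

# Example: 4 cores at about 2 GB each gives mem=8gb in total.
echo "#PBS -l $(pbs_resources 4 2 24:00:00)"
```

The printed line can be pasted into a job script in place of the resource line shown in the example above.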
Interactive submission
Jobs can also be submitted interactively with the qsub -I command, which makes it possible to:
- debug more easily
- perform interactive computations on declared and reserved resources
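An interactive session takes the same resource declarations as a batch job. As a hedged sketch, the helper below only prints a qsub -I command line so that it can be inspected first. The function name interactive_cmd and its default values are hypothetical, and the queue name and resource syntax are copied from the batch examples on this page.

```shell
#!/bin/sh
# Hypothetical wrapper that prints a "qsub -I" command line.
# The defaults (1 core, 4gb, 1 hour) are illustrative only.
interactive_cmd() (
    ppn=${1:-1}
    mem=${2:-4gb}
    walltime=${3:-01:00:00}
    echo "qsub -I -q quiet -l nodes=1:ppn=${ppn},mem=${mem},walltime=${walltime}"
)

# Print the command for inspection; on the cluster it could be run with:
#   eval "$(interactive_cmd 2 4gb 02:00:00)"
interactive_cmd 2 4gb 02:00:00
```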
MPI, OpenMP and hybrid jobs
MPI job sample script:
    #!/bin/bash
    #PBS -N myjobname
    #PBS -o out_file
    #PBS -e err_file
    #PBS -q quiet
    #PBS -M email@address
    #PBS -l nodes=5:ppn=2,mem=4gb,walltime=24:00:00
    export SCRATCH="/scratch/$USER.$PBS_JOBID"
    export PATH=/usr/local/openmpi/bin:$PATH
    export LD_LIBRARY_PATH=/usr/local/openmpi/lib/:/usr/local/openmpi/lib/openmpi/:$LD_LIBRARY_PATH
    /usr/local/openmpi/bin/mpirun -mca plm rsh my_mpi_job
OpenMP sample script:
    #!/bin/bash
    #PBS -N myjobname
    #PBS -o out_file
    #PBS -e err_file
    #PBS -q quiet
    #PBS -M email@address
    #PBS -l nodes=1:ppn=16,mem=4gb,walltime=24:00:00
    export SCRATCH="/scratch/$USER.$PBS_JOBID"
    export OMP_NUM_THREADS=32
    ./my_omp_job
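For hybrid jobs, or when the thread count should follow the reservation rather than being hard-coded, the layout can be derived from the node file that Torque provides in $PBS_NODEFILE. The sketch below assumes that this file contains one line per reserved core (the node name repeated ppn times) and that every node has the same number of cores reserved. The function name nodefile_layout is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: derive node and core counts from a Torque node file,
# assumed to hold one line per reserved core (node name repeated ppn times).
nodefile_layout() (
    nodefile=$1
    cores=$(wc -l < "$nodefile" | tr -d ' ')
    nodes=$(sort -u "$nodefile" | wc -l | tr -d ' ')
    echo "nodes=${nodes} cores=${cores} cores_per_node=$((cores / nodes))"
)

# Inside a job script (not executed here):
#   nodefile_layout "$PBS_NODEFILE"
# The cores_per_node value could then be used for OMP_NUM_THREADS
# (one thread per reserved core) and the nodes value for the MPI rank count.
```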
Background job
A job can be run in the background by using the #PBS nice option and requesting 0 CPUs.
Job arrays
Job arrays are submitted with the -t option. The PBS_ARRAYID environment variable can be used as a job id number (or seed).
    #!/bin/bash
    #PBS -N myjobname
    #PBS -o out_file
    #PBS -e err_file
    #PBS -q quiet
    #PBS -M email@address
    #PBS -l nodes=1:ppn=16,mem=4gb,walltime=24:00:00
    #PBS -t 0-99
    export SCRATCH="/scratch/$USER.$PBS_JOBID"
    export SEED=$PBS_ARRAYID
    ./my_executable_file $SEED
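Besides serving as a seed, the array id can select a different input for each task. The sketch below assumes a hypothetical parameter file with one set of arguments per line. The names param_for_task and params.txt are illustrative and not part of the cluster setup.

```shell
#!/bin/sh
# Hypothetical helper: pick the parameter line for one array task.
# Array ids start at 0 in the example above (-t 0-99); file lines start at 1.
param_for_task() (
    params_file=$1
    task_id=$2
    sed -n "$((task_id + 1))p" "$params_file"
)

# Inside the job script (not executed here):
#   ./my_executable_file $(param_for_task params.txt "$PBS_ARRAYID")
```

With this layout, task 0 reads the first line of the file, task 1 the second, and so on.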
Matlab
    #!/bin/bash
    #PBS -N myjobname
    #PBS -e err_file
    #PBS -l nodes=1:ppn=1,mem=4gb,walltime=24:00:00
    cd my_working_dir
    ./path_of_matlab_bin/matlab -nodesktop -nosplash < input.m > output
More information on job submission: http://www.clusterresources.com/torquedocs/2.1jobsubmission.shtml
Notification by mail
To receive a notification email, set the mail events with -m and the recipient address with -M:
    #PBS -m abe
    #PBS -M user@apc.univ-paris7.fr
The letters given to -m tell the scheduler when to send mail:
- a: mail is sent when the job is aborted by the batch system.
- b: mail is sent when the job begins execution.
- e: mail is sent when the job terminates.