Job submission on APC cluster
Introduction
The job submission system is Torque/MAUI.
Jobs are submitted with the qsub command to a single queue, which gives access to all the resources available in that queue (see the system configuration page).
Recommendations
To make good use of the cluster, please follow our recommendations.
QUEUE FURIOUS: for OpenMP or Open MPI parallel jobs.
- for a long (> 96h00) parallel job: no more than 64 cores.
- no limit for short (< 5h00) parallel jobs. However, jobs lasting 8h00 to 12h00 should preferably be submitted at the end of the day so that they run overnight.
- Each user of the queue should not take more than 40% of the resources available in this queue.
These rules ensure that as many users as possible can work on the cluster; they may be applied flexibly depending on the system load (memory, CPU, etc.) on the cluster.
If you have an exceptional request please contact us via https://supportapc.in2p3.fr
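The recommendations above can be turned into a quick check before submitting. The sketch below is a hypothetical helper (check_request is not a cluster tool) that only covers the 64-core rule for long jobs and the overnight suggestion; the 40% rule is left out because it depends on the current size of the queue.

```shell
#!/bin/bash
# Hypothetical helper (not a cluster tool): classify a planned request
# against the recommendations for the furious queue.
# Usage: check_request <cores> <walltime_hours>
check_request() {
    local cores=$1 hours=$2
    if (( hours > 96 && cores > 64 )); then
        echo "too-many-cores"      # long job above the 64-core recommendation
    elif (( hours >= 8 && hours <= 12 )); then
        echo "prefer-overnight"    # better submitted at the end of the day
    else
        echo "ok"
    fi
}

check_request 128 120
check_request 32 10
check_request 256 4
```

The three calls illustrate a long job that asks for too many cores, a medium job better run overnight, and a short job with no core limit.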
Monitoring
- Cluster status
Information on the cluster load is available on the web interface of our monitoring tool Ganglia at http://www.apc.univ-paris7.fr/ganglia/.
You can also check the status of the nodes with the pbsnodes command.
- Job status
Use qstat to check job status. The state codes are:
- Q: queued
- C: completed
- E: exiting
- H: held
- R: running
- S: suspended.
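The state letters above can be translated into readable text in a small wrapper. This is a minimal sketch: describe_state is a hypothetical helper, and the parsing of qstat output assumes the default layout (two header lines, state letter in the fifth column), which may differ with other qstat options or versions.

```shell
#!/bin/bash
# Translate a qstat state letter into the description listed above.
describe_state() {
    case "$1" in
        Q) echo "queued" ;;
        C) echo "completed" ;;
        E) echo "exiting" ;;
        H) echo "held" ;;
        R) echo "running" ;;
        S) echo "suspended" ;;
        *) echo "unknown" ;;
    esac
}

# Only query the batch system when qstat is installed (e.g. on the cluster).
if command -v qstat >/dev/null 2>&1; then
    # Assumption: default qstat layout, two header lines, state in column 5.
    qstat | awk 'NR > 2 { print $1, $5 }' | while read -r jobid state; do
        echo "$jobid: $(describe_state "$state")"
    done
fi
```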
You can also use /usr/local/maui/bin/showq to display information about active, eligible, blocked, and/or recently completed jobs.
All Maui diagnostic tools (showres, checkjob, checknode, diagnose) are available in /usr/local/maui/bin/. The Torque tools (tracejob, pbsnodes) are in /usr/local/bin/.
A simple job example
The qsub command takes a number of command-line arguments, which can also be specified in a job script together with the call to the actual computation code:
myjob.sh:
#!/bin/bash
#PBS -N myjobname
#PBS -o out_file
#PBS -e err_file
#PBS -q furious
#PBS -M email@address
#PBS -l nodes=1:ppn=4,mem=4gb,walltime=24:00:00
export SCRATCH="/scratch/$USER.$PBS_JOBID"
echo "Hello world! "
cd my_working_dir
./my_executable_file
The command qsub myjob.sh will:
- submit the job to the queue
- execute it when 4 processors and 4 GB of memory are available
- create a temporary directory referenced in the $SCRATCH environment variable. For array jobs, use the following line instead:
export SCRATCH="/scratch/$USER.${PBS_JOBID//[\[\]]/}"
- run the executable file
- print the standard output to the out_file file
- print the standard error to the err_file file
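The array-job variant of the SCRATCH line relies on bash pattern substitution to drop the square brackets from the job ID. The sketch below shows the effect in isolation; scratch_path is a hypothetical helper, and the user name and job IDs are made-up values for illustration.

```shell
#!/bin/bash
# Build the scratch path for a given user and job ID, dropping the square
# brackets that appear in array job IDs.
scratch_path() {
    local user=$1 jobid=$2
    echo "/scratch/${user}.${jobid//[\[\]]/}"
}

# Hypothetical values, for illustration only.
scratch_path alice "1234.master"      # regular job ID: left unchanged
scratch_path alice "1234[7].master"   # array job ID: brackets removed
```

For a job ID without brackets the substitution changes nothing, so the same line can be used for regular jobs as well.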
Use qdel to remove a job from the queue.
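The submit, check and delete steps can be chained from the command line. The following is a minimal sketch under stated assumptions: the job script is a shortened copy written to a temporary directory (a hypothetical location), and the Torque commands are only called when qsub is found, so the snippet does nothing harmful on a machine outside the cluster.

```shell
#!/bin/bash
# Write a shortened job script into a temporary directory (hypothetical location).
workdir=$(mktemp -d)
cat > "$workdir/myjob.sh" <<'EOF'
#!/bin/bash
#PBS -N myjobname
#PBS -q furious
#PBS -l nodes=1:ppn=4,mem=4gb,walltime=24:00:00
echo "Hello world!"
EOF

# Submit only when qsub is available, i.e. on the cluster itself.
if command -v qsub >/dev/null 2>&1; then
    jobid=$(qsub "$workdir/myjob.sh")   # qsub prints the new job ID
    echo "submitted: $jobid"
    qstat "$jobid"                      # check its state
    # qdel "$jobid"                     # uncomment to remove it from the queue
else
    echo "qsub not found; script written to $workdir/myjob.sh"
fi
```

Capturing the job ID printed by qsub avoids having to look it up again before calling qstat or qdel.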
Note that it is important to declare the total memory of your job (regardless of the "ppn" value you request) by setting the "mem" parameter; otherwise the default value of "8gb" is used.
Interactive submission
Jobs can also be submitted interactively with the qsub -I command, which makes it possible to:
- debug more easily
- perform interactive computations on declared and reserved resources
Interactive submission with X display:
- qsub -I -X
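An interactive session accepts the same resource requests as a batch job. The sketch below only prints the command line so that it can be inspected before use; interactive_cmd is a hypothetical helper and the resource values are example figures, not site recommendations.

```shell
#!/bin/bash
# Print a qsub command line for an interactive session on the furious queue.
# Usage: interactive_cmd <ppn> <mem> <walltime> [x11]
interactive_cmd() {
    local ppn=$1 mem=$2 walltime=$3 x11=${4:-}
    local cmd="qsub -I"
    if [ "$x11" = "x11" ]; then
        cmd="$cmd -X"              # forward the X display
    fi
    echo "$cmd -q furious -l nodes=1:ppn=${ppn},mem=${mem},walltime=${walltime}"
}

interactive_cmd 2 2gb 01:00:00
interactive_cmd 2 2gb 01:00:00 x11
```

Copy the printed line into a terminal on the cluster to open the session.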
MPI, OpenMP and hybrid jobs
MPI job sample script:
#!/bin/bash
#PBS -N myjobname
#PBS -o out_file
#PBS -e err_file
#PBS -q furious
#PBS -M email@address
#PBS -l nodes=5:ppn=2,mem=4gb,walltime=24:00:00
export SCRATCH="/scratch/$USER.$PBS_JOBID"
export PATH=/usr/local/openmpi/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/openmpi/lib/:/usr/local/openmpi/lib/openmpi/:$LD_LIBRARY_PATH
/usr/local/openmpi/bin/mpirun my_mpi_job
Warning: do not use the '-mca plm rsh' option with the furious queue, which uses a new TCP message protocol.
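Inside a job, Torque lists the allocated processors in the file named by $PBS_NODEFILE, one line per processor, so a request of nodes=5:ppn=2 yields ten lines. A minimal sketch of counting the MPI ranks from that file follows, for example for logging or for passing -np explicitly; count_ranks is a hypothetical helper, and the stand-in node file with made-up host names is only used when the snippet runs outside a job.

```shell
#!/bin/bash
# Count the MPI ranks implied by a Torque node file (one line per processor).
count_ranks() {
    awk 'NF { n++ } END { print n + 0 }' "$1"
}

# Outside a job, build a stand-in node file for nodes=2:ppn=2
# (hypothetical host names); inside a job, use the file Torque provides.
nodefile=${PBS_NODEFILE:-}
if [ -z "$nodefile" ]; then
    nodefile=$(mktemp)
    printf 'node01\nnode01\nnode02\nnode02\n' > "$nodefile"
fi

np=$(count_ranks "$nodefile")
echo "mpirun -np $np my_mpi_job"
```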
OpenMP sample script:
#!/bin/bash
#PBS -N myjobname
#PBS -o out_file
#PBS -e err_file
#PBS -q furious
#PBS -M email@address
#PBS -l nodes=1:ppn=16,mem=4gb,walltime=24:00:00
export SCRATCH="/scratch/$USER.$PBS_JOBID"
export OMP_NUM_THREADS=32
./my_omp_job
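The sample script sets the thread count by hand. A possible alternative, sketched here, ties the thread count to the requested ppn. It assumes that the installed Torque version exports the PBS_NUM_PPN variable inside the job, which has not been confirmed for this cluster; omp_threads is a hypothetical helper that falls back to a caller-supplied value when the variable is missing.

```shell
#!/bin/bash
# Pick the OpenMP thread count: use PBS_NUM_PPN when Torque exports it
# (assumption: depends on the Torque version), otherwise use the fallback.
omp_threads() {
    local fallback=$1
    echo "${PBS_NUM_PPN:-$fallback}"
}

OMP_NUM_THREADS=$(omp_threads 1)
export OMP_NUM_THREADS
echo "running with $OMP_NUM_THREADS OpenMP threads"
```

Whether to run more threads than requested cores, as the sample script does, is a separate tuning choice that this sketch does not address.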
Background job
A job can be run in the background by using the #PBS nice option and requesting 0 CPUs.
Job arrays
Job arrays are submitted with the -t option. The PBS_ARRAYID environment variable can be used as a job ID number (or seed).
#!/bin/bash
#PBS -N myjobname
#PBS -o out_file
#PBS -e err_file
#PBS -q furious
#PBS -M email@address
#PBS -l nodes=1:ppn=16,mem=4gb,walltime=24:00:00
#PBS -t 0-99
# Array job IDs contain square brackets, which are removed from the scratch path
export SCRATCH="/scratch/$USER.${PBS_JOBID//[\[\]]/}"
export SEED=$PBS_ARRAYID
./my_executable_file $SEED
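Besides serving as a seed, the array index can select one entry from a list of parameters. The sketch below assumes a job submitted with -t 0-3; the parameter values and the param_for_index helper are hypothetical, and the index defaults to 0 so the snippet can be tried outside a job, where PBS_ARRAYID is unset.

```shell
#!/bin/bash
# Hypothetical parameter values, one per array task (indices 0 to 3).
params=(0.1 0.5 1.0 2.0)

# Return the parameter associated with an array index.
param_for_index() {
    echo "${params[$1]}"
}

# Outside a job PBS_ARRAYID is unset, so default to 0 for a dry run.
idx=${PBS_ARRAYID:-0}
echo "task $idx uses parameter $(param_for_index "$idx")"
```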
Notification by mail
To receive a notification email:
#PBS -m abe
#PBS -M user@apc.univ-paris7.fr
The -M option gives the recipient address. The -m option tells the scheduler when to send the email:
- a: mail is sent when the job is aborted by the batch system.
- b: mail is sent when the job begins execution.
- e: mail is sent when the job terminates.