torque_tutorial
Overview
Jobs on Hopper execute on one or more “compute” nodes dedicated to that job. These nodes are distinct from the shared “login” nodes that host interactive sessions. Typically, users write a batch script with a text editor and submit it to the system with the “qsub” command. The batch script contains a number of job control directives (the “#PBS” lines) followed by the “aprun” command, which actually runs the program in parallel on the compute nodes.
Batch Jobs
Sample Batch Script
#PBS -q regular
#PBS -l mppwidth=48
#PBS -l walltime=00:10:00
#PBS -N jobname
cd $PBS_O_WORKDIR
aprun -n 48 ./my_executable
For additional batch options, see: http://www.nersc.gov/users/computational-systems/hopper/running-jobs/batch-jobs/
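In the sample script the core count appears twice, once as mppwidth and once as the aprun -n value. A minimal sketch of one way to keep the two in step is a small shell function that prints the script from a single core count. The function name make_batch_script and its argument order are hypothetical, chosen only for illustration; the directives are the ones from the sample above.

```shell
# Hypothetical helper (not part of Torque): print a batch script like the
# sample above, using one core count for both mppwidth and aprun -n.
#   $1 = total cores (e.g. 48)
#   $2 = walltime as HH:MM:SS (e.g. 00:10:00)
#   $3 = queue name, defaults to "regular" as in the sample
make_batch_script() {
    mbs_cores="$1"
    mbs_walltime="$2"
    mbs_queue="${3:-regular}"
    # \$PBS_O_WORKDIR is escaped so it is expanded when the job runs,
    # not when the script is generated.
    cat <<EOF
#PBS -q ${mbs_queue}
#PBS -l mppwidth=${mbs_cores}
#PBS -l walltime=${mbs_walltime}
#PBS -N jobname
cd \$PBS_O_WORKDIR
aprun -n ${mbs_cores} ./my_executable
EOF
}
```

For example, make_batch_script 48 00:10:00 > mybatchscript would reproduce the sample script in a file named mybatchscript.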
Submit a Batch Script
$ qsub mybatchscript
Check the Progress of a Job
$ qstat -u username
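The two commands above can be combined into a wrapper that submits a script and then waits for the job to leave the queue. This is a sketch under stated assumptions, not a recommended workflow: it assumes qsub prints the job ID on standard output and that qstat exits with a non-zero status once the server no longer knows the job. Completed jobs may stay listed for some time depending on server configuration, so the loop can return later than the job actually ends. The function name submit_and_wait is hypothetical.

```shell
# Hypothetical wrapper: submit a batch script, then poll qstat until the
# job is no longer listed.
#   $1 = batch script file
#   $2 = seconds between polls, defaults to 60
submit_and_wait() {
    saw_script="$1"
    saw_interval="${2:-60}"
    # Assumption: qsub prints the new job ID on stdout.
    saw_jobid="$(qsub "$saw_script")" || return 1
    echo "submitted $saw_jobid"
    # Assumption: qstat returns non-zero for a job ID it no longer knows.
    while qstat "$saw_jobid" >/dev/null 2>&1; do
        sleep "$saw_interval"
    done
    echo "$saw_jobid is no longer in the queue"
}
```

Usage would be submit_and_wait mybatchscript, or submit_and_wait mybatchscript 300 to poll every five minutes and put less load on the batch server.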
Tips for getting your job through the queue faster
- Use the debug queue for test jobs
- Reduce walltime to only what your job needs
- Run shorter jobs (great for codes that can restart from saved output, like NIMROD)
- Run jobs during off-peak hours (nights, weekends)
- Run jobs just before scheduled maintenance
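As an illustration of the first two tips, the walltime string can be computed from a run-time estimate in minutes instead of being typed by hand. The helper below is a hypothetical sketch. The queue name "debug" is assumed from the tip above, and the usage line assumes that options given on the qsub command line take precedence over the matching #PBS directives in the script.

```shell
# Hypothetical helper: convert a whole number of minutes into the
# HH:MM:SS form used by "#PBS -l walltime=".
walltime_from_minutes() {
    wfm_minutes="$1"
    printf '%02d:%02d:00\n' $((wfm_minutes / 60)) $((wfm_minutes % 60))
}
```

A short test run could then be submitted with qsub -q debug -l walltime="$(walltime_from_minutes 10)" mybatchscript.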
torque_tutorial.txt · Last modified: 2022/07/21 06:59 by 127.0.0.1