Updated 2021-01-11
How to Use Local Scratch Storage

What is Local Scratch Storage?
- Every node comes with a tmp directory that is local to the node.
- Since it is a local disk mounted on the node, it is faster than network storage (`/data`, `/home`, etc.).
- Most nodes have limited local disk space (<20 GB).
- Because of this limited disk space, a node may be forced to go offline if a large tmp directory is created on the node and then not deleted once the job ends. As a solution, `/scratch` directories have been created on nodes to support the use of tmp directories. The `/scratch` directory is automatically deleted after the job has finished running.
Scratch Directory on Node Explained

- Each node has its own `/scratch` directory.

Warning

The `/scratch` directory on the node is not the same as the `~/scratch` directory in your `/home` folder.
- Every job will create a directory on the node under `/scratch` named after the job ID, for example: `/scratch/20986925.shared-sched.pace.gatech.edu`
- This directory will be automatically deleted when the job is complete, so you don't have to worry about deleting it yourself.
- `${TMPDIR}` is the environment variable assigned to this path (which would normally be `/tmp`). `${TMPDIR}` is how you reference this directory in your code.
- Any place in your code that requires a tmp directory (normally `/tmp`) should be replaced with `${TMPDIR}`.
How to Use Scratch Directory + Example

Important

In your code, use `${TMPDIR}` wherever you would use `/tmp`, or anywhere you need to use local scratch storage.
- Since `${TMPDIR}` is automatically deleted after the job, this prevents the problem of a large number of nodes running out of local space.
- To update your code: wherever you need to access a tmp directory, use `${TMPDIR}`. For example:

```bash
python code.py > /tmp/intermediate_results.txt   # WRONG
```

would be changed to:

```bash
python code.py > ${TMPDIR}/intermediate_results.txt   # RIGHT
```
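If the same script is sometimes run outside of a job (where `${TMPDIR}` may not point to local scratch), a small fallback keeps it working in either case. This is a generic shell sketch, not a PACE-specific requirement; the variable name `WORKDIR` is a placeholder chosen for illustration:

```bash
#!/bin/bash
# Use the per-job scratch directory when the scheduler provides one,
# otherwise fall back to /tmp (e.g., when testing interactively).
WORKDIR="${TMPDIR:-/tmp}"

# Write intermediate results to local scratch instead of a hard-coded /tmp path.
python code.py > "${WORKDIR}/intermediate_results.txt"
```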
Example PBS Script
```bash
#PBS -N tmpTest
#PBS -l nodes=1:ppn=2
#PBS -l walltime=1:00
#PBS -l file=10g
#PBS -q inferno
#PBS -j oe
#PBS -o tmpTest.out

# Move into the per-job scratch directory and print its path
cd ${TMPDIR}
pwd
echo "this won't do anything but uses the right tmp directory" > ${TMPDIR}/intermediate_results.txt
```
Warning

The directive `#PBS -l file=10g` reserves 10 GB per core, not per job. In this PBS script, 20 GB of total scratch storage would be reserved (10 GB per core x 1 node x 2 cores per node). Keep this in mind, as it is easy to accidentally request excessive amounts of storage, which could prevent your job from running.
- The `#PBS` directives are standard; more on the directives can be found in the PBS guide.
- `cd ${TMPDIR}` enters the `/scratch` directory on the node the job is running on, and `pwd` then prints this path.
- The `echo` line prints text to a file in the tmp directory on the node. Notice how `${TMPDIR}` is used, NOT `/tmp`. This file is deleted after the job is run, so no actual results from this line will be apparent in the `.out` file. See the sketch below for how to copy such files somewhere permanent before the job ends.
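If you do want to keep files written to `${TMPDIR}`, copy them somewhere permanent at the end of your script. The sketch below extends the example above with a copy-back step; it uses `$PBS_O_WORKDIR`, the standard PBS variable holding the directory the job was submitted from, and the file contents are only illustrative:

```bash
#PBS -N tmpTest
#PBS -l nodes=1:ppn=2
#PBS -l walltime=1:00
#PBS -l file=10g
#PBS -q inferno
#PBS -j oe
#PBS -o tmpTest.out

# Write intermediate results to local scratch ...
echo "intermediate data" > ${TMPDIR}/intermediate_results.txt

# ... then copy anything worth keeping back to the submission directory
# before the job ends and ${TMPDIR} is deleted.
cp ${TMPDIR}/intermediate_results.txt ${PBS_O_WORKDIR}/
```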
Results

- From the same directory you ran `qsub` from, use a text editor to view the `.out` file, for example `vim tmpTest.out`. The results should look something like this:
```
Job name: tmpTest
Queue: inferno
End PBS Prologue Mon Mar 4 16:09:00 EST 2019
---------------------------------------
/scratch/24537771.shared-sched.pace.gatech.edu
---------------------------------------
Begin PBS Epilogue Mon Mar 4 16:09:01 EST 2019
Job ID: 24537771.shared-sched.pace.gatech.edu
```
- The path that `${TMPDIR}` represents is printed between the dashed lines.
Ensure Local Storage Availability for Job

- PACE clusters consist of many nodes with different specifications, including local drive capacities ranging from 20 GB to 7 TB.
- The `-l file=n` #PBS directive allows you to request local storage space for your job. For example, if you need 10 GB of space: `#PBS -l file=10g`. Remember: this storage amount is per core, not per job.
- This directive will make sure your job is allocated to a node that has at least 10 GB free on `${TMPDIR}`, assuming you requested only 1 node and 1 processor per node. Since the `-l file=10g` directive reserves storage per core, requesting, for example, 2 nodes and 4 ppn would suddenly leave you with 80 GB of requested storage instead of 10 (10 GB per core x 2 nodes x 4 ppn), as illustrated in the sketch after this list.
- See the section below for more information about the `-l file=` flag and how to ensure that you request the right amount of local storage when using it.
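As a concrete illustration of how the request scales, the directives below are only a sketch; the resource values are examples, not recommendations:

```bash
# 2 nodes x 4 cores per node = 8 cores in total
#PBS -l nodes=2:ppn=4
# 10 GB of local scratch is reserved per core, so this requests
# 10 GB x 8 cores = 80 GB of local scratch across the job.
#PBS -l file=10g
```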
More about -l file=

- `-l file=<>` requests local storage per task and also means that no single file can exceed the specified size.
- When using `-l nodes=<>:ppn=<>` or `-l procs=<>`, each core counts as one task, so the file size specified with `-l file=<>` is scaled by the number of processors.
- This can hold up a job in the queue if no node can provide that much local storage.
- However, if your job does not require multiple nodes, you can use the `-l ncpus=<>` flag to ensure that multiple cores are used for a single task, so the local storage specified by `-l file=<>` is not scaled; see the sketch below.
- If you need multiple nodes, then you will still need to use `-l nodes=<>:ppn=<>` or `-l procs=<>` and be mindful of the fact that each file cannot exceed the size indicated by `-l file=<>` and that total file size will scale with the number of requested processors.
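For a single-node job, the difference looks roughly like this (a hedged sketch; the core count and sizes are only example values):

```bash
# Single-node job using ncpus: the whole request counts as one task,
# so only 50 GB of local scratch is reserved in total.
#PBS -l ncpus=8
#PBS -l file=50g

# The same core count requested with nodes/ppn: each core is a task,
# so 50 GB x 8 cores = 400 GB would be reserved instead.
# #PBS -l nodes=1:ppn=8
# #PBS -l file=50g
```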
Retrieve Files from Unexpectedly Terminated Jobs

- Since `${TMPDIR}` is deleted once a job is finished, you may lose the files stored in it if the job terminates unexpectedly.
- However, you can use the `trap` command to make sure that files are copied out of `${TMPDIR}` before the job terminates and the directory is deleted.
- For `trap` to work, it must precede the command that causes the unexpected termination, so the suggested location is right after the PBS directives.
- Here is an example command that can be inserted right after the PBS directives:

```bash
trap "cp ${TMPDIR}/* ~/data/somewhere_to_store_these/;exit" TERM
```
- You should make this command more specific based on where you want to copy your files to.
- A good option is to copy the files over to your global scratch directory, `~/scratch/`, where you have 7 TB of space.
- Note that this copy will ONLY happen if the job terminates as a failure, so you should make sure that the existing procedure you have for copying files during a normal completion is still a part of the script. A fuller sketch combining both is shown below.