Updated 2023-02-21
Hive Migration to Slurm¶
The Hive cluster migrated to the Slurm scheduler in August and September 2022. PACE worked closely with the Hive PIs on the migration plan to ensure minimal interruption to research.
Join a PACE Slurm Orientation session to learn more about using Slurm on Hive.
Software Changes¶
The Hive-Slurm cluster features a new set of applications provided in the PACE Apps software stack. Please review this list of software we offer on Hive post-migration and let us know via email as soon as possible if any software you currently use on Hive is missing, to avoid any potential delay to your research as the migration concludes. We have reviewed batch job logs to determine which packages are in use and upgraded them to the latest versions.
To load the default/latest version of a software package, use module load without specifying a version number, e.g., module load anaconda3.
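For example, either command below works (a minimal sketch; the pinned version string is illustrative, and module avail anaconda3 lists the versions actually installed):
# load the default/latest version
module load anaconda3
# or pin a specific version (illustrative)
module load anaconda3/2022.05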
Warning
Researchers who install or write their own software will also need to recompile their applications against the new MPI and other libraries.
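For instance, rebuilding an MPI code after the migration might look like the following (a sketch only; the gcc and mvapich2 module names and the source file are assumptions and may differ from the modules actually available in PACE Apps):
# load a compiler and MPI stack, then recompile
module load gcc mvapich2
mpicc -O2 -o mycode mycode.c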
Login¶
To access Hive-Slurm via ssh, use the address login-hive.pace.gatech.edu. This will provide access to the new environment and your existing Hive home, project, and scratch storage.
Example:
ssh gburdell3@login-hive.pace.gatech.edu
Partitions¶
Hive's existing queues were converted into Slurm partitions, each keeping the same name as before. Initially, a portion of each node type was moved, with more following as the migration completed.
The hive-gpu-short partition works differently in Slurm, providing high-priority access to the same GPU nodes as hive-gpu. This setup increases utilization while continuing to support short and interactive workflows requiring GPUs. Each researcher is limited to 2 GPUs concurrently on hive-gpu-short (and 10 running or pending jobs). In the same way, the hive-interact partition overlays the hive partition for CPU nodes, and each researcher is limited to 96 concurrent cores (and 50 running or pending jobs). The limit of 500 running or pending jobs per researcher across the entire cluster remains in place.
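For example, a short interactive session on one of the overlay partitions could be requested as follows (a minimal sketch; the walltime, core/GPU counts, and the hive-gburdell3 tracking account described in the next section are illustrative):
# one GPU on the high-priority GPU overlay partition for up to 1 hour
salloc -p hive-gpu-short -N 1 --gres=gpu:1 -t 1:00:00 -A hive-gburdell3
# four cores on the CPU overlay partition
salloc -p hive-interact -N 1 --ntasks-per-node=4 -t 1:00:00 -A hive-gburdell3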
Tracking Accounts¶
There continues to be no charge to use Hive. As part of this migration, we are introducing a new feature: each job requires a “tracking account” to be provided for reporting purposes. Researchers who use the Phoenix cluster will be familiar with this accounting feature; however, the tracking accounts on Hive have neither balances nor limitations, as they are used solely for cluster utilization metrics.
To find your Hive-Slurm tracking accounts, run pace-quota while logged into the Hive-Slurm cluster. Tracking accounts take the form hive-<PI username>, e.g., hive-gburdell3 for researchers in Prof. Burdell's group. Researchers working with more than one faculty member may have multiple tracking accounts and should choose the one that best fits the project supervisor for each job run.
Add tracking accounts to your Slurm requests with the -A flag.
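For example, the tracking account can be supplied on the command line or inside the batch script itself (hive-gburdell3 is the example account above; myjob.sbatch is a hypothetical script name):
# on the command line
sbatch -A hive-gburdell3 myjob.sbatch
# or as a directive inside the script
#SBATCH -A hive-gburdell3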
Slurm Usage¶
Visit our Slurm usage on Hive guide to learn Slurm commands and find example scripts, and visit our conversion guide for detailed instructions on converting existing PBS scripts to Slurm scripts.
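As a rough illustration of the conversion (a minimal sketch; the job name and resource amounts are illustrative, and the full mapping of directives is covered in the conversion guide), a PBS header such as
#PBS -N myjob
#PBS -l nodes=1:ppn=4
#PBS -l walltime=1:00:00
#PBS -q hive
corresponds to a Slurm header along the lines of
#SBATCH -J myjob
#SBATCH -N 1
#SBATCH --ntasks-per-node=4
#SBATCH -t 1:00:00
#SBATCH -p hive
#SBATCH -A hive-gburdell3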
Open OnDemand, Jupyter, and VNC¶
The Hive OnDemand portal now supports the Slurm scheduler! You can access all OnDemand apps, including Jupyter and Interactive Desktop, from the “Interactive Apps” menu. Learn more about Open OnDemand in our guide.
The pace-jupyter-notebook and pace-vnc-job commands were retired with the migration to Slurm. Please use OnDemand to access Jupyter notebooks, VNC sessions, and more on Hive-Slurm via your browser.
This material is based upon work supported by the National Science Foundation under grant number 1828187. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.