Updated 2022-11-11

Phoenix Cluster Resources

Detailed Node Specs

  • Most nodes include the following common features:
    • Dual Intel Xeon Gold 6226 CPUs @ 2.7 GHz (24 cores/node)
    • DDR4-2933 MHz DRAM
    • HDR100 InfiniBand interconnect
  • The cpu-amd nodes include the following common features:
    • Dual AMD Epyc 7713 CPUs @ 2.0 GHz (128 cores/node)
    • 512 GB DDR4 DRAM
    • 1.6 TB NVMe
  • The gpu-a100 nodes include the following common features:
    • Dual AMD Epyc 7513 CPUs @ 2.6 GHz (64 cores/node)
    • 512 GB DDR4 DRAM
    • 2x Nvidia A100-40GB Tensor Core GPUs
    • 1.6 TB NVMe
  • The following table provides detailed specifications for the 800 nodes that were migrated to the Phoenix-Slurm cluster in Phase 1 (October 10) and Phase 2 (November 2-4), along with the 4 additional cpu-amd nodes and 5 additional gpu-a100 nodes added on November 7 (a query example follows the table):
Node Class          Quantity  RAM     Storage      Extra Unique Specs
CPU-192GB           522       192 GB  1.6 TB NVMe
CPU-384GB           144       384 GB  1.6 TB NVMe
CPU-768GB           39        768 GB  1.6 TB NVMe
CPU-384GB-SAS       46        384 GB  8.0 TB SAS
CPU-768GB-SAS       2         768 GB  8.0 TB SAS
GPU-192GB-V100      11        192 GB               2x Nvidia Tesla V100
GPU-384GB-V100      14        384 GB               2x Nvidia Tesla V100
GPU-768GB-V100      2         768 GB               2x Nvidia Tesla V100
GPU-384GB-RTX6000   18        384 GB               4x Nvidia Quadro RTX 6000
GPU-768GB-RTX6000   2         768 GB               4x Nvidia Quadro RTX 6000
CPU-512GB-AMD       4         512 GB  1.6 TB NVMe  2x AMD Epyc 7713
GPU-512GB-A100      5         512 GB  1.6 TB NVMe  2x AMD Epyc 7513, 2x Nvidia A100-40GB Tensor Core
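
  • One way to inspect these node classes on the system is to query Slurm directly; a minimal sketch (output columns and GRES names depend on the site configuration):

      # List each node with its CPU count, memory (in MB), and any GPUs (GRES).
      sinfo -N -o "%N %c %m %G"

      # Show full details for a single node (substitute a real node name).
      scontrol show node <nodename>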

Partitions

  • Jobs are assigned to Slurm partitions automatically, based on your charge account (internal or external) and the most significant resources requested (GPUs, memory, etc.).
  • Jobs are charged only if the inferno QOS is selected (see the sample script after this list).
  • The assigned Slurm partition determines how much users are charged, based on current rates.
  • Slurm partitions include the following node classes and are assigned by the scheduler based on availability:
Partition     Node Classes
cpu-small     CPU-192GB, CPU-384GB, CPU-384GB-SAS, CPU-768GB, CPU-768GB-SAS
cpu-medium    CPU-384GB, CPU-384GB-SAS, CPU-768GB, CPU-768GB-SAS
cpu-large     CPU-768GB, CPU-768GB-SAS
cpu-sas       CPU-384GB-SAS, CPU-768GB-SAS
gpu-v100      GPU-192GB-V100, GPU-384GB-V100, GPU-768GB-V100
gpu-rtx6000   GPU-384GB-RTX6000, GPU-768GB-RTX6000
cpu-amd       CPU-512GB-AMD
gpu-a100      GPU-512GB-A100
  • The partitions for external users have the same names with "-X" appended (e.g., cpu-small-X, cpu-medium-X).
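  • For illustration, here is a minimal batch-script sketch; the job name and charge account (gts-example01) are placeholders, and this modest CPU/memory request would typically be assigned to cpu-small:

      #!/bin/bash
      #SBATCH -J example-job               # job name (placeholder)
      #SBATCH -A gts-example01             # charge account (placeholder; use your own)
      #SBATCH -q inferno                   # QOS: jobs are charged only under inferno
      #SBATCH -N 1 --ntasks-per-node=8     # 1 node, 8 tasks
      #SBATCH --mem-per-cpu=4G             # memory per core
      #SBATCH -t 1:00:00                   # walltime (1 hour)

      cd $SLURM_SUBMIT_DIR
      srun hostname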

Job Submit Flowchart

  • When submitting a job to the Slurm scheduler, either interactively (with salloc) or via a batch script (with sbatch), the resources requested determine the partition assigned, as illustrated in the following flowchart and the interactive example after it:

[Flowchart: job submission and partition assignment]
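
  • As a companion to the flowchart, a minimal interactive request might look like the following (the charge account is again a placeholder; with these requests the scheduler would likely assign cpu-small):

      # Interactive session: 1 node, 4 tasks, 1 hour, charged under the inferno QOS.
      salloc -A gts-example01 -q inferno -N 1 --ntasks-per-node=4 -t 1:00:00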

Tip

The scheduler reserves 8 GB of memory for system processes, so the total available memory for jobs on a given node is reduced accordingly.
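
In practice, this means capping per-node memory requests just below the nominal RAM. As a sketch, on a CPU-192GB node (the exact ceiling depends on the node's Slurm configuration):

    # Request at most ~184 GB on a CPU-192GB node (192 GB nominal minus 8 GB reserved).
    #SBATCH --mem=184G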