Updated 2023-05-03

Phoenix Cluster Resources

Detailed Node Specs

  • Most nodes include the following common features:
    • Dual Intel Xeon Gold 6226 CPUs @ 2.7 GHz (24 cores/node)
    • DDR4-2933 MHz DRAM
    • Infiniband 100HDR interconnect
  • 40 cpu-large nodes were added on February 21, 2023, with dual Intel Xeon Gold 6226R CPUs @ 2.9 GHz (32 cores/node) and 768 GB of RAM.
  • The cpu-amd nodes include the following common features (4 added November 7, 2022; 4 added February 23, 2023):
    • Dual AMD Epyc 7713 CPUs @ 2.0 GHz (128 cores/node)
    • 512GB DDR4 DRAM
    • 1.6 TB NVMe
  • The gpu-a100 nodes include the following common features (5 added November 7, 2022; 6 added February 23, 2023; 1 added February 28, 2023):
    • Dual AMD Epyc 7513 CPUs @ 2.6 GHz (64 cores/node)
    • 512GB DDR4 DRAM
    • 2x Nvidia Tensor Core A100 40GB (6 nodes) or 80GB (6 nodes) GPUs
    • 1.6 TB NVMe
  • The cpu-pmem node (added on April 11, 2023) has a large amount of memory (1.5 TB). This memory is composed of 192 GB of DDR4-2933 ECC-DRAM and 1.3125 TB of 2666MHz DCPMM (Intel Optane persistent memory). It has 24 cores (Dual Intel Xeon Gold 6226 CPUs @ 2.7 GHz) and Infiniband 100HDR interconnect.
  • The following table provides detailed specifications for the 1382 nodes that were part of the Phoenix-Slurm cluster migration for Phases 1 (October 10-12, 2022) through 6 (January 30-February 3, 2023). Details are also included for the 8 additional cpu-amd nodes and 12 additional gpu-a100 nodes:
Node Class | Quantity | RAM | Storage | Extra Unique Specs
CPU-192GB | 850 | 192 GB | 1.6 TB NVMe storage |
CPU-384GB | 239 | 384 GB | 1.6 TB NVMe storage |
CPU-768GB | 104 | 768 GB | 1.6 TB NVMe storage |
CPU-384GB-SAS | 75 | 384 GB | 8.0 TB SAS storage |
CPU-768GB-SAS | 4 | 768 GB | 8.0 TB SAS storage |
CPU-PMEM | 1 | 1.5 TB | 1.6 TB NVMe storage |
GPU-192GB-V100 | 21 | 192 GB | | 2x Tesla V100 (16GB or 32GB)
GPU-384GB-V100 | 27 | 384 GB | | 2x Tesla V100 (16GB or 32GB)
GPU-768GB-V100 | 5 | 768 GB | | 2x Tesla V100 (16GB)
GPU-384GB-RTX6000 | 32 | 384 GB | | 4x Quadro Pro RTX6000 (24GB)
GPU-768GB-RTX6000 | 5 | 768 GB | | 4x Quadro Pro RTX6000 (24GB)
CPU-512GB-AMD | 8 | 512 GB | 1.6 TB NVMe storage | 2x AMD Epyc 7713
GPU-512GB-A100 | 12 | 512 GB | 1.6 TB NVMe storage | 2x AMD Epyc 7513, 2x Tensor Core A100 (40GB or 80GB)
Total | 1383 | | |
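
The node classes above correspond to the Slurm partitions described in the next section. As a quick check from a login node, the per-partition node counts, cores, memory, and GPUs that Slurm itself advertises can be listed with standard commands; this is a minimal sketch, and the exact columns shown depend on the site's Slurm configuration:

    # List each partition with node count, cores per node, memory per node (MB), and GPUs (gres)
    sinfo -o "%P %D %c %m %G"

    # Show the full hardware record Slurm keeps for a single node
    # (replace <node-name> with a name reported by "sinfo -N")
    scontrol show node <node-name>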

Partitions

  • Jobs are assigned to Slurm partitions automatically based on your charge account (internal or external) and the most significant resources requested (GPUs, memory requirements, etc.).
  • Jobs will only be charged if the inferno QOS is selected (see the example submission script after this list).
  • The Slurm partition assigned determines how much users are charged, based on current rates.
  • Slurm partitions include the following node classes and are assigned by the scheduler based on availability:
Partition | Node Class
cpu-small | CPU-192GB, CPU-384GB, CPU-384GB-SAS, CPU-768GB, CPU-768GB-SAS
cpu-medium | CPU-384GB, CPU-384GB-SAS, CPU-768GB, CPU-768GB-SAS
cpu-large | CPU-768GB, CPU-768GB-SAS
cpu-sas | CPU-384GB-SAS, CPU-768GB-SAS
cpu-pmem | CPU-PMEM
gpu-v100 | GPU-192GB-V100, GPU-384GB-V100, GPU-768GB-V100
gpu-rtx6000 | GPU-384GB-RTX6000, GPU-768GB-RTX6000
cpu-amd | CPU-512GB-AMD
gpu-a100 | GPU-512GB-A100
  • The partitions for external users have the same names with "-X" appended (e.g., cpu-small-X, cpu-medium-X).
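
As a concrete illustration of how these pieces fit together, the sketch below shows a minimal batch script. The inferno QOS comes from this page; the account name, job name, and executable are placeholders, and the scheduler (not the script) chooses the partition from the resources requested:

    #!/bin/bash
    #SBATCH -J example-job              # job name (placeholder)
    #SBATCH --account=<charge-account>  # your internal or external charge account (placeholder)
    #SBATCH --qos=inferno               # jobs are only charged when the inferno QOS is selected
    #SBATCH --nodes=1
    #SBATCH --ntasks-per-node=24        # all 24 cores of a standard Xeon Gold 6226 node
    #SBATCH --mem-per-cpu=7G            # ~168 GB total, which fits the CPU-192GB node class
    #SBATCH --time=01:00:00
    #SBATCH -o %x-%j.out                # stdout written to <jobname>-<jobid>.out

    cd $SLURM_SUBMIT_DIR
    srun ./my_program                   # placeholder executable

With a request like this the job would typically land in one of the cpu partitions; adding a GPU request (for example via --gres) or a much larger memory footprint would instead route it to the corresponding gpu or larger-memory partition.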

Job Submit Flowchart

  • When submitting a job to the Slurm scheduler interactively (using salloc) or with a batch script (using sbatch), the resources requested determine the partition assigned, as illustrated in the following flowchart (an example interactive request follows it):

[Figure: Job Submit Flowchart]
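
For the interactive path shown in the flowchart, a request along the following lines would be typical. This is a minimal sketch: the account name is a placeholder, and requesting one GPU is just one example of a resource that routes the job to a GPU partition:

    # Interactive session: 1 node, 12 cores, 1 GPU, 2 hours, charged under the inferno QOS
    salloc --account=<charge-account> --qos=inferno \
           --nodes=1 --ntasks-per-node=12 \
           --gres=gpu:1 --time=02:00:00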

Tip

The scheduler reserves 8 GB of memory on each node for system processes, so the total memory available for jobs on a given node is reduced accordingly.
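
For example, under that reservation a CPU-192GB node can offer jobs at most about 184 GB, so a whole-node memory request sized like the sketch below would still fit that node class (the exact usable figure may vary slightly):

    #SBATCH --nodes=1
    #SBATCH --mem=184G    # 192 GB node minus the ~8 GB reserved for system processes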