Updated 2020-11-04

Participation

How can I participate?

There are several straightforward options to consider:

Note

FoRCE allocations will be held for the months of September and October due to PACE's migration from the Rich datacenter to the Coda datacenter. Requests submitted prior to August 31 will be reviewed for access to Rich. Any request submitted on or after September 1 will be held for creation in Coda in early November. Additionally, due to the migration and the transition to the new cost model, we aim to begin processing new PI purchase requests in January 2021.

While we understand this delay may inconvenience research groups, it allows new research groups to begin in PACE's Coda environment, which features a new software stack deployed on brand-new, leading-edge resources in a state-of-the-art datacenter, rather than starting in Rich just in time to move to a new computing environment, which would require rebuilding workflows.

  1. Immediate access to the shared resources available in the FoRCE Research Computing Environment.

    • This option provides faculty immediate access to the shared resources of the FoRCE cluster including compute nodes, GPUs, and basic storage. Policies affecting the FoRCE cluster are maintained by the Faculty Governance Committee, and detailed on the policy page.
    • Cost: This option is provided at no additional cost to participants.
    • How to sign-up: Submit a brief proposal online here. (Note: You must be logged into the site using your GT credentials. Log in here.)
  2. Faculty contribute nodes for shared access, augmenting the FoRCE cluster.

    • Faculty are encouraged to use their HPC and research computing dollars to add nodes to the FoRCE. In all cases, priority is given to the owner of these nodes. When unused by the owner, these nodes may be utilized by other faculty who have also contributed nodes. Policies affecting sharing are maintained by the Faculty Governance Committee, and detailed on the policy page. By participating in this way, faculty also have access to the FoRCE cluster and other shared clusters as time is available. This option is good for faculty who must periodically have nodes reserved for their use, but also have workloads that can be handled by a shared queue. As another advantage, participants gain longer-term access to recent architectures as the FoRCE cluster grows with the addition of new nodes.
    • Cost: As jobs in the shared environment may execute on other nodes, a baseline hardware configuration must be maintained. Participants pay for the compute nodes they add to FoRCE as well as expanded storage. (See #4)
    • How to sign-up: Contact pace-support@oit.gatech.edu
  3. Faculty purchase compute nodes and expanded storage dedicated for exclusive use.

    • Faculty who require a cluster environment with dedicated compute nodes can purchase these nodes and still take advantage of the federated infrastructure. These nodes are not shared with the FoRCE cluster or other shared clusters, and are available exclusively to the participant and to researchers they authorize for access. This option is good for faculty who expect to keep their nodes busy most of the time.
    • Cost: Participants pay for compute nodes sized precisely to their requirements, as well as expanded storage. (See #4)
    • How to sign-up: Contact pace-support@oit.gatech.edu
  4. Faculty purchase expanded storage.

    • All user accounts are granted a 5 GB home directory quota. For long-term storage of data sets, faculty may purchase dedicated storage to augment the existing base storage allocation by adding disk space to a project directory. This storage is fully backed up and implemented using best-practice redundant disk arrays to provide data integrity and availability. This option can be used independently of the compute node options above.
    • Cost: Project storage is provided as a dedicated portion of a shared highly expandable DDN/GPFS storage system. Storage may be purchased in increments as small as 1TB and may be easily expanded on demand.
    • How to sign-up: Contact pace-support@oit.gatech.edu
  5. Faculty who want central hosting of a stand-alone non-federated cluster.

    • To maximize the value of every dollar invested in HPC, we strongly encourage participation in the federated cluster model. An existing cluster that simply needs floor space, power, cooling, and a network connection may be hosted under the PACE Federation (e.g., the Hive cluster funded by an NSF MRI award). All such stand-alone requests will be evaluated on a case-by-case basis to ascertain their impact on the long-term availability of hosting facilities and associated resources.
    • Cost: TBD case-by-case.
    • How to sign-up: Contact pace-support@oit.gatech.edu to refine technical details, costs and options.

How much does it cost?

PACE offers various compute options according to the table below. We do not currently support compute elements based on AMD processors. A draft rate study has been submitted, which will formalize rates for PACE services and be reviewed annually. A detailed breakdown of charges is provided in the table below; also provided is an internal usage rate calculator to help convert between research budget, compute resources, and account credits. Participants are encouraged to seek support from PACE in choosing cost-effective and appropriate hardware for their purposes.

Going into CODA, PACE is moving to Intel's "Cascade Lake" CPU technology. We're also moving to a 100-gigabit HDR100 Infiniband high-performance network and 10-gigabit Ethernet on all compute nodes, and we're evaluating a move from SSD to next-generation NVMe technology for system drives.

Compute Node Configurations

Participants are strongly encouraged to select from the following configuration choices. Reducing the number of configurations helps PACE staff provide efficient service and focus efforts on projects of strategic value.

  • 192 GB Compute node

    • dual-socket, 12-core Intel Xeon Gold 6226 "Cascade Lake" @ 2.7 GHz (24 cores total)
    • 192 GB DDR4-2933 MHz memory
    • HDR100 Infiniband card
    • port on the PACE HDR Infiniband switch
    • 10-gigabit Ethernet cabling
    • shipping, installation and burn-in testing
    • 5-year next-business-day on-site warranty
  • 384 GB Compute node

    • same configuration as the 192 GB Intel node, just more memory
  • 768 GB Compute node

    • same configuration as the 192 GB Intel node, just more memory
  • 192 GB Compute node w/ local disk

    • same configuration as the 192 GB Intel node, plus (qty 4) 2TB SAS drives
  • 384 GB Compute node w/ local disk

    • same configuration as the 384 GB Intel node, plus (qty 4) 2TB SAS drives
  • 768 GB Compute node w/ local disk

    • same configuration as the 768 GB Intel node, plus (qty 4) 2TB SAS drives

Note

Due to licensing restrictions from nVidia, PACE cannot support consumer-grade nVidia GPUs (e.g. the GTX or Titan line). For those desiring single-precision GPU performance (e.g. AI or machine learning workloads), we do support the nVidia RTX6000.

  • 192 GB Compute node w/ single precision GPU

    • same configuration as the 192 GB compute node, plus (qty 4) nVidia RTX6000 GPUs
  • 384 GB Compute node w/ single precision GPU

    • same configuration as the 384 GB compute node, plus (qty 4) nVidia RTX6000 GPUs
  • 768 GB Compute node w/ single precision GPU

    • same configuration as the 768 GB compute node, plus (qty 4) nVidia RTX6000 GPUs
  • 192 GB Compute node w/ double precision GPU

    • same configuration as the 192 GB compute node, plus (qty 2) nVidia V100 GPUs
  • 384 GB Compute node w/ double precision GPU

    • same configuration as the 384 GB compute node, plus (qty 2) nVidia V100 GPUs
  • 768 GB Compute node w/ double precision GPU

    • same configuration as the 768 GB compute node, plus (qty 2) nVidia V100 GPUs
  • Storage

    • provisioned from shared GPFS filesystem
    • may be easily expanded on demand
    • smallest increment is 1TB
    • multiple years may be paid up front (e.g., $240/TB for 3 years; see the worked example after this list)
    • includes nightly backups
  • Archive Storage

    • Globus user interface, not directly accessible from PACE systems
    • No PACE account needed
    • may be easily expanded on demand
    • smallest increment is 1TB for 1 year
    • multiple years may be paid up front (e.g., $120/TB for 3 years)
    • triple replication for data reliability
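
To make the prepayment arithmetic concrete, the following is a minimal Python sketch, assuming the internal monthly rates from the draft rate study below ($6.67/TB/month for Project Storage and $3.33/TB/month for Archival storage). It shows how the multi-year up-front prices quoted above follow from those monthly rates.

    # Sketch: derive the quoted multi-year prepaid storage prices from the
    # internal monthly per-TB rates in the draft rate study below.
    MONTHS_PER_YEAR = 12
    YEARS = 3

    project_rate = 6.67   # $/TB/month, Project Storage (internal rate)
    archive_rate = 3.33   # $/TB/month, Archival storage (internal rate)

    print(f"Project storage, {YEARS} years prepaid: ${project_rate * MONTHS_PER_YEAR * YEARS:.0f}/TB")  # ~$240
    print(f"Archive storage, {YEARS} years prepaid: ${archive_rate * MONTHS_PER_YEAR * YEARS:.0f}/TB")  # ~$120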

Draft Rate Study

Service                        Unit    Calculated Usage Rate  Internal  External
[GEN] cpu-192GB                CPUh    $0.0273                $0.0068   $0.0246
[GEN] cpu-384GB                CPUh    $0.0255                $0.0077   $0.0275
[GEN] cpu-384GB-SAS            CPUh    $0.0251                $0.0091   $0.0288
[GEN] cpu-768GB                CPUh    $0.0254                $0.0091   $0.0297
[GEN] cpu-768GB-SAS            CPUh    $0.0284                $0.0119   $0.0339
[GEN] gpu-192GB-v100           GPUh    $0.3664                $0.2307   $0.5156
[GEN] gpu-384GB-RTX6000        GPUh    $0.2690                $0.1491   $0.2775
[GEN] gpu-384GB-v100           GPUh    $0.4784                $0.2409   $0.5079
[GEN] gpu-768GB-RTX6000        GPUh    $0.1550                $0.1550   $0.2830
[GEN] gpu-768GB-v100           GPUh    $0.4562                $0.2627   $0.4940
[CUI] Server-192GB             CPUh    $0.0567                $0.0067   $0.0567
[CUI] Server-384GB             CPUh    $0.0632                $0.0102   $0.0637
[CUI] Server-384GB-SAS         CPUh    $0.0793                $0.0151   $0.0814
[CUI] Server-768GB             CPUh    $0.0648                $0.0127   $0.0648
[LIGO/OSG] cpu-192GB           CPUh    $0.0467                $0.0068   $0.0467
[LIGO/OSG] cpu-384GB           CPUh    $0.0352                $0.0077   $0.0412
[LIGO/OSG] gpu-384GB-RTX6000   GPUh    $0.2304                $0.1639   $0.3780
[Storage] Project Storage      TB/Mo   $7.85                  $6.67     $7.61
[Storage] CUI Drive            Bay/Mo  $41.20                 $41.20    $41.20
[Storage] Hive                 TB/Mo   $6.60                  $6.60     $6.60
[Storage] LIGO/OSG             TB/Mo   $2.61                  $2.61     $2.61
[Storage] Archival             TB/Mo   $4.89                  $3.33     $4.89
General Consulting             Hour    $98                    $98       $98
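
As a rough illustration of how these draft rates translate into charges, here is a minimal Python sketch. The rate values are copied from the Internal column of the table above, and the usage quantities are hypothetical examples, not actual PACE billing logic.

    # Estimate a monthly charge from the draft internal rates above.
    # The usage quantities below are hypothetical examples.
    INTERNAL_RATES = {
        "cpu-192GB ($/CPU-hour)":       0.0068,
        "gpu-192GB-v100 ($/GPU-hour)":  0.2307,
        "Project Storage ($/TB-month)": 6.67,
    }

    usage = {
        "cpu-192GB ($/CPU-hour)":       50_000,  # CPU-hours used this month
        "gpu-192GB-v100 ($/GPU-hour)":     500,  # GPU-hours used this month
        "Project Storage ($/TB-month)":     10,  # TB of project storage
    }

    total = 0.0
    for item, qty in usage.items():
        cost = qty * INTERNAL_RATES[item]
        total += cost
        print(f"{item:30s} {qty:>7,} x ${INTERNAL_RATES[item]:<7} = ${cost:>8,.2f}")
    print(f"Estimated internal monthly charge: ${total:,.2f}")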

PACE Participation Calculator

Enter your expected budget or compute needs in the calculator below to update the values based on the internal usage rates. The calculator reports:

  • equivalent nodes purchased for 5 years of compute
  • CPU hours on each node class for a given credits allocation
  • GPU hours on each node class for a given credits allocation
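
For readers of this text version, the following is a minimal Python sketch of the budget-to-compute conversion, assuming the internal rates from the draft table above, 24 cores per node (per the node specifications), continuous availability over 5 years, and full utilization. It is an approximation for planning purposes, not the calculator's exact logic.

    # Sketch: convert a research budget into CPU-hours and 5-year
    # node-equivalents, using internal rates from the draft rate study.
    # Assumptions: 24 cores/node, 24x365 hours/year, 100% utilization.
    INTERNAL_RATES = {      # $ per CPU-hour (internal)
        "cpu-192GB": 0.0068,
        "cpu-384GB": 0.0077,
        "cpu-768GB": 0.0091,
    }

    CORES_PER_NODE = 24
    HOURS_PER_YEAR = 24 * 365
    YEARS = 5

    def budget_to_compute(budget_dollars, node_class="cpu-192GB"):
        """Return (CPU-hours, 5-year node-equivalents) for a given budget."""
        cpu_hours = budget_dollars / INTERNAL_RATES[node_class]
        node_hours = CORES_PER_NODE * HOURS_PER_YEAR * YEARS
        return cpu_hours, cpu_hours / node_hours

    # Example: a hypothetical $10,000 budget on the 192 GB node class
    hours, nodes = budget_to_compute(10_000, "cpu-192GB")
    print(f"{hours:,.0f} CPU-hours, ~{nodes:.2f} node-equivalents over 5 years")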

What is the FY20 Phase 3 schedule?

PACE equipment will be moving to the new CODA facility in 2020. While we will do our best to limit the impact of this transition on the GT research community, it will require adherence to a prescribed schedule of procurements. Note that storage purchases may occur outside of this schedule. These schedules are based on the best available information at this time and reflect a realistic timeline informed by experience with previous orders.

FY20-Phase3 - to be deployed directly to CODA. This is a tentative schedule driven by year-end deadlines. Due to compliance requirements of the procurement process, we will have reduced room for configuration changes after the "actionable requests" period.

Deadline   Description                     Notes
02/20/20   intent period                   Intent to participate in FY20-Phase 3 due to pace-support@oit.gatech.edu
02/27/20   actionable requests             All details due to PACE (configuration, quantity, not-to-exceed budget, account number, financial contact, queue name)
04/22/20   bid award                       Anticipated date to award bid
04/29/20   final bid quote                 Anticipated date to finalize quote with selected vendor
05/06/20   final quote, faculty approvals  Exact pricing communicated to faculty, all formal approvals received
05/08/20   enter requisition               Requisition entered into Workday
05/22/20   release PO                      GT-Procurement issues purchase order to vendor
07/31/20   installation                    Estimated completion of physical installation by the vendor
08/28/20   acceptance testing              PACE will perform acceptance testing to ensure proper operation. Upon completion, resources will be made "ready for research".

What do I get in return?

The advantage of the federated model is that everyone benefits from the Institute's commitment to pre-load infrastructure and support. This applies to every participant who chooses any of the first 4 options described above. These benefits include:

  • lower direct costs to participants for HPC equipment purchases by leveraging shared resources
  • guidance in developing system specifications consistent with the federated architecture
  • full life-cycle procurement management
  • hosting in a professionally managed data center with a 24/7 operations staff
  • racks, power and cooling to meet high-density demands
  • installation management
  • acceptance testing in accordance with both PACE and end-user requirements
  • secure high-speed scratch storage
  • head node for login access
  • a small home directory (bulk storage is funded by the faculty member)
  • commodity Ethernet networking (Infiniband, if desired, is funded by the faculty member)
  • back-up and restore
  • queue management
  • system administration
  • software and compiler administration (loads, updates, licensing)
  • hardware fixes
  • a dedicated support team that manages all aspects of the cluster
  • shared access to recent architecture