OSG Connect provides users with access to the OSPool, an aggregation of computational resources from around the country for distributed high-throughput computing (dHTC). This page briefly summarizes the steps to register an account and begin using the OSPool; OSG maintains extensive documentation that covers all relevant topics in detail.
Learn more about OSG Connect at a PACE OSG Orientation session.
Is OSG Connect for me?
Unlike the traditional HPC clusters managed by PACE and XSEDE, OSG resources are best suited for large volumes of smaller, shorter jobs. As a general rule of thumb, ideal workflows are those that
- Use fewer than 8 cores in a shared-memory model,
- Require less than 20 hours of walltime,
- Can operate with a few GB of RAM and less than 1 GB of storage, and
- Use precompiled binaries or containers for their software.
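The rule of thumb above can be turned into a quick self-check before you commit to OSG Connect. The following is only a sketch: `JobProfile`, `guideline_issues`, and the 4 GB default for "a few GB of RAM" are illustrative assumptions and not part of any OSG tool or policy.

```python
from dataclasses import dataclass


@dataclass
class JobProfile:
    """Hypothetical description of a single job's needs."""
    cores: int
    walltime_hours: float
    memory_gb: float
    storage_gb: float
    uses_portable_software: bool  # precompiled binaries or containers


def guideline_issues(job, max_memory_gb=4.0):
    """Return the rule-of-thumb guidelines that a job does not meet.

    max_memory_gb is an assumed reading of "a few GB of RAM";
    adjust it to match your own interpretation.
    """
    issues = []
    if job.cores >= 8:
        issues.append("uses 8 or more cores")
    if job.walltime_hours >= 20:
        issues.append("needs 20 or more hours of walltime")
    if job.memory_gb > max_memory_gb:
        issues.append("needs more than a few GB of RAM")
    if job.storage_gb >= 1:
        issues.append("needs 1 GB or more of storage")
    if not job.uses_portable_software:
        issues.append("does not use precompiled binaries or containers")
    return issues


if __name__ == "__main__":
    job = JobProfile(cores=1, walltime_hours=2, memory_gb=2,
                     storage_gb=0.5, uses_portable_software=True)
    print(guideline_issues(job) or "Looks like a good fit for the OSPool.")
```

An empty list suggests the workflow is a good candidate; any reported issue is a prompt to restructure the job or to discuss alternatives with PACE.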
Much of OSG's computing power comes from its ability to run a large number of jobs simultaneously. To get the most out of OSG resources, break your work into small, independently executable jobs and request only the memory, disk, and CPUs that each job truly needs. This practice reduces the time your jobs remain idle before running and maximizes your throughput, helping you complete your work sooner.
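As a rough illustration of this practice, the sketch below splits a workload into independent chunks and writes an HTCondor-style submit description with modest, explicit resource requests. The helper names, the script name `process_chunk.sh`, and the resource values are assumptions chosen for illustration; follow the OSG documentation for the settings recommended for your work.

```python
def chunk_ranges(n_items, chunk_size):
    """Return (start, stop) index pairs that together cover range(n_items)."""
    return [(start, min(start + chunk_size, n_items))
            for start in range(0, n_items, chunk_size)]


def build_submit_description(executable, n_jobs, cpus=1,
                             memory="2GB", disk="1GB"):
    """Build the text of a submit description for n_jobs independent jobs.

    Each job receives its index through $(Process); the executable is
    assumed to map that index to its own chunk of the work.
    """
    lines = [
        f"executable = {executable}",
        "arguments = $(Process)",
        # Request only what a single job needs.
        f"request_cpus = {cpus}",
        f"request_memory = {memory}",
        f"request_disk = {disk}",
        "output = job.$(Cluster).$(Process).out",
        "error = job.$(Cluster).$(Process).err",
        "log = job.$(Cluster).log",
        f"queue {n_jobs}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example: 1,000 input items processed 100 at a time.
    chunks = chunk_ranges(1000, 100)
    # process_chunk.sh is a hypothetical wrapper script.
    print(build_submit_description("process_chunk.sh", len(chunks)))
```

Keeping each chunk small and the requests tight lets more of your jobs match available slots at once, which is where the throughput gain comes from.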
If you have any questions about the benefits of OSG Connect for your research, feel free to discuss them with PACE.
Getting Started with OSG Connect
This documentation serves as a very brief overview of the process to register for and use the OSPool. For the most current and detailed instructions, please review the documentation at the OSG Help Desk.
To start running your dHTC workload on the OSPool, you should:
- Review the policies for using OSG.
- Register for an account.
- Generate SSH keys and activate your OSG login; instructions are available.
- Join a project in OSG Connect.
- Review the OSG Connect Quickstart guide to begin submitting jobs.
In general, OSG Connect can support most popular open source software that fits the distributed high-throughput computing model. At present, we do not have or support most commercial software due to licensing issues. See here to review options, with links to additional information, for using software installed by users, software available as precompiled binaries or via containers, and preinstalled software via modules.