Updated 2022-11-13

Link Gateway Account to Hive Cluster

Note

The Hive Gateway has been in production since October 2020.

Overview

  • Users who have access to the general Hive cluster can take advantage of additional resources through the gateway. To do this, you need to create a Resource Profile.
  • This guide shows you how to link your gateway account to the Hive cluster as your personal resource, so you can submit jobs as your PACE user and access your PACE Hive storage.

Step 1: Set up a Resource Profile

  • Go to the Settings tab (it appears in a dropdown after you click Workspace)
  • Click the Group Resource Profile icon that looks like three stacked servers in the left navbar
  • Click the blue New Group Resource Profile + button in the top right

Screenshot

  • In the Name field, enter something like "[your_username]'s Resource Profile"
  • In the Default SSH Credential field, you can either use the Default Key that appears after clicking on the field, or you can create a new SSH Credential by clicking the plus sign next to the field

Screenshot

  • Give the new key a name, and click Create.

Screenshot

Important

Copy the SSH key to your clipboard by selecting the copy icon next to the plus icon. You'll need it in Step 2.

  • You can manage who you share this Resource Profile with by clicking the Share button below

Screenshot

  • Your screen should now look something like this

Screenshot

  • Click on the New Compute Preference + button and select Hive as your compute resource

Screenshot

  • You'll be taken to a new screen where you define the compute preferences

Screenshot

  • In the Login Username field, enter your Hive cluster login username
  • In the SSH Credential field, select the same SSH Credential you selected before for the resource profile
  • Change the Allocation Project Number field to your tracking account, in the format hive-<gt-username>.
  • In the Scratch Location field, enter your scratch location. You can find it by running the command cd ~/scratch; pwd -P after connecting to the Hive cluster via ssh. The path will look something like /storage/hive/scratch/N/username, where N is a number.
  • In the Policy section, unselect the access-hive and access-gpu queues. If you are a GT user, you will get errors when trying to submit to those queues.
  • Click Save, then click Save again once you are taken back to the resource profile page
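
The scratch lookup from the Scratch Location step can be wrapped in a small shell helper. This is only a sketch: resolve_scratch is a hypothetical function name, not a command provided on Hive, and it assumes ~/scratch is a link in your home directory, as the command in that step implies.

```shell
# resolve_scratch: hypothetical helper, not a PACE-provided command.
# Prints the physical path behind a (possibly symlinked) directory.
#   $1  directory to resolve; defaults to ~/scratch
resolve_scratch() {
    # The subshell keeps the cd from changing your current directory.
    # pwd -P prints the path with all symbolic links resolved.
    ( cd "${1:-$HOME/scratch}" && pwd -P )
}
```

Running resolve_scratch with no arguments on a Hive login node should print the value to paste into the Scratch Location field.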

Step 2: Add SSH Key to authorized_keys

  • Log in to your Hive cluster account through a terminal window
  • When you are logged in, navigate to the .ssh folder with cd .ssh
  • Open the authorized_keys file with a text editor of your choice (to stay in the terminal, you can use nano by typing nano authorized_keys)
  • Paste the SSH key you copied earlier on a new line at the end of this file, then save the file and exit
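
If you would rather not open an editor, the same result can be reached from the shell. The following is a minimal sketch, not an official PACE procedure: add_authorized_key is a hypothetical helper name, and the permission settings are an assumption based on the common OpenSSH default of ignoring key files that are writable by other users.

```shell
# add_authorized_key: hypothetical helper, not a PACE-provided command.
#   $1  the public key line copied from the gateway
#   $2  the .ssh directory; defaults to ~/.ssh
add_authorized_key() {
    key=$1
    ssh_dir=${2:-$HOME/.ssh}
    auth_file=$ssh_dir/authorized_keys

    # Create the directory and file if they are missing.
    # Assumption: restrictive permissions are wanted, as OpenSSH usually expects.
    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"
    touch "$auth_file"
    chmod 600 "$auth_file"

    # If the file does not end with a newline, add one so the new key
    # lands on its own line instead of being glued to the previous key.
    if [ -s "$auth_file" ] && [ -n "$(tail -c 1 "$auth_file")" ]; then
        printf '\n' >> "$auth_file"
    fi

    # Append the key only if this exact line is not already present.
    if ! grep -qxF -e "$key" "$auth_file"; then
        printf '%s\n' "$key" >> "$auth_file"
    fi
}
```

On the Hive cluster you would call it with the key in single quotes, for example add_authorized_key '<paste the copied key here>'. Calling it twice with the same key leaves a single entry in the file.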

Step 3: Submit a Job Through Your Personal Resource

  • Go back to your workspace and select an application that you would like to submit a job for using your personal resource on Hive
  • Set up the job as usual, but make sure to change the Allocation to the Resource Profile you set up in Step 1

Screenshot

  • Save and Launch the job
  • When the job is complete, you can access the files from your ~/scratch directory on the Hive Cluster in addition to viewing them in the Experiment Summary page
  • Note: The job name is generated automatically, so the folder in ~/scratch that contains your results might have a name that is not very readable (it will most likely appear as PROCESS followed by a series of numbers)
    • You can use the mv command to rename it: mv <old_folder_name> <new_folder_name>
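
The rename can also be done with a small guard around mv so that an existing folder is never overwritten or merged into by accident. This is a sketch under assumptions: rename_result is a hypothetical helper name, and the PROCESS prefix in the example comments is taken from the note above, so actual folder names may differ.

```shell
# rename_result: hypothetical helper, not a PACE-provided command.
#   $1  existing results folder
#   $2  new, more readable folder name
rename_result() {
    old=$1
    new=$2
    if [ ! -d "$old" ]; then
        echo "no such directory: $old" >&2
        return 1
    fi
    # Without this check, mv would move the folder inside an existing one.
    if [ -e "$new" ]; then
        echo "refusing to overwrite: $new" >&2
        return 1
    fi
    mv -- "$old" "$new"
}

# Example, to be run on the Hive cluster:
#   ls -dt ~/scratch/PROCESS*     # list result folders, newest first
#   rename_result ~/scratch/<old_folder_name> ~/scratch/<new_folder_name>
```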