Updated 2021-03-22

Run Jupyter Notebooks Interactively

Overview

  • The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text.
  • Uses include
    • Data cleaning and transformation
    • Numerical simulation
    • Statistical modeling
    • Data visualization
    • Machine learning
    • Much more!
  • You can run Jupyter Notebook on the cluster via a browser on your local computer using pace-jupyter-notebook

Using pace-jupyter-notebook

  • Start by connecting to the cluster. If you are not on GT's network, please connect to the VPN first.

Warning

Third-party applications such as PuTTY do not recognize the SSH escape character '~'. If you are using Windows, it is recommended that you use PowerShell with OpenSSH (included by default in Windows 10), as other terminal applications do not establish port forwarding correctly with the instructions below.

  • To start a remote Jupyter Notebook, run the command pace-jupyter-notebook -q <QUEUENAME>, where <QUEUENAME> should be replaced with the queue to which you wish to submit your job

    • By default, this will start a Jupyter Notebook job on 1 node, with 1 processor and 1 GB of memory, for 1 hour
    • Important: If you are using the Phoenix or Firebird clusters, you must include the -A flag followed by your account name: pace-jupyter-notebook -q <QUEUENAME> -A <ACCOUNTNAME>
      • Run pace-whoami to see more information about your account and to find your account name
    • If more resources are required for the job, use the standard qsub options to request them:
      • -l can be used to set resource requests such as nodes, ppn, walltime, mem, pmem, etc.
      • -N can be used to set a custom name for the job
      • -j can be used to join the output and error streams (e.g. -j oe)
      • -o can be used to set the output file path
    • For example, to run a 4-hour Jupyter Notebook job with access to 1 node, 12 cores, and 32 GB of memory, run the command pace-jupyter-notebook -q inferno -l nodes=1:ppn=12,mem=32gb,walltime=4:00:00
    • By default, the Jupyter Notebook uses the base environment with the Python3 kernel from the latest Anaconda3 module
      • To select a different Anaconda module, use the --anaconda=<ANACONDA_MODULE> option when starting your job
      • <ANACONDA_MODULE> should match the full name of the Anaconda module to be used (e.g. anaconda3/2019.03)
      • A full list of available Anaconda modules can be seen by running the module avail command
    • You can also specify a custom conda environment to utilize packages that you have installed through Anaconda
      • To activate a custom conda environment, use the --conda-env=<CONDA_ENVIRONMENT> option when starting your job
      • <CONDA_ENVIRONMENT> should be the name you gave the conda environment when you created it
      • For more details on creating custom conda environments, see the documentation
  • The script will print output to the screen as the job is started
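As an illustration of how the options above combine, the following shell sketch assembles a few pace-jupyter-notebook command lines and prints them instead of submitting anything. The queue inferno and the module anaconda3/2019.03 come from the examples above; the account name GT-example and the environment name my-env are hypothetical placeholders, and build_cmd is only a helper for this sketch, not part of the PACE tooling.

```shell
# Dry-run helper: prints the command line rather than running it.
# $1 is the queue name; any remaining arguments are passed through unchanged.
build_cmd() {
    queue=$1
    shift
    if [ "$#" -gt 0 ]; then
        echo "pace-jupyter-notebook -q $queue $*"
    else
        echo "pace-jupyter-notebook -q $queue"
    fi
}

# Defaults: 1 node, 1 processor, 1 GB of memory, 1 hour
build_cmd inferno

# Phoenix or Firebird: add your account (GT-example is a placeholder)
build_cmd inferno -A GT-example

# 4 hours on 1 node with 12 cores and 32 GB of memory
build_cmd inferno -l nodes=1:ppn=12,mem=32gb,walltime=4:00:00

# A specific Anaconda module plus a custom conda environment (my-env is a placeholder)
build_cmd inferno --anaconda=anaconda3/2019.03 --conda-env=my-env
```

Once the printed line looks right, the same command can be typed directly on a login node to submit the job.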

[Screenshot: pace-jupyter-notebook output]

Tip

To be recognized as the SSH escape character, ~ MUST be the first character on a new line. If the ~ character appears on screen when you start to type, delete it, hit ENTER to start a new line, and try again.

Caution

If you encounter an error stating that the port is already in use and forwarding failed, you are already forwarding that port to your local machine. To fix this, cancel the existing port forwarding by opening a new SSH prompt (SHIFT+~+C) and entering -KL<PORT>, where <PORT> is the port number you wish to clear.

  • Copy and paste the purple text that begins with -L into the SSH prompt and hit ENTER to begin port-forwarding
    • The port and compute node combination are unique to your job, so make sure you use the values provided
    • The prompt will display "Forwarding port." if successful; to return to your normal shell prompt, hit ENTER once more
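For readers unfamiliar with the syntax, here is a minimal sketch of what the text entered at the SSH prompt looks like. OpenSSH local forwarding takes the form -L <local_port>:<host>:<host_port>, and -KL<PORT> cancels an existing local forward. The port 8888 and the node name compute-node-1 are hypothetical, as is the assumption that the same port number is used on both sides; always use the values printed for your own job. The helper functions exist only for this illustration.

```shell
# Hypothetical values for illustration; use the ones printed for your job
PORT=8888
NODE=compute-node-1

# Text to enter at the SSH prompt to start forwarding.
# OpenSSH syntax: -L <local_port>:<host>:<host_port>
forward_spec() {
    echo "-L $1:$2:$1"
}

# Text to enter at the SSH prompt to cancel an existing local forward
cancel_spec() {
    echo "-KL$1"
}

forward_spec "$PORT" "$NODE"   # prints: -L 8888:compute-node-1:8888
cancel_spec "$PORT"            # prints: -KL8888
```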

[Screenshot: SSH port forwarding]

  • Open a browser on your local machine, and copy and paste the URL and token into the address bar
    • The URL and token are unique to your job, so be sure to copy the correct link from the green text

[Screenshot: remote Jupyter Notebook in browser]

  • Once connected to your Jupyter Notebook, you can start a new kernel or open an existing notebook
  • Congratulations! You are now running a Jupyter Notebook on the cluster!
  • Once finished, close your browser and log off
    • If you would like to clean up further, you can delete the job with qdel <JOBID>
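If you did not note the job ID when the job started, one way to find it is to list your own jobs first. The sketch below assumes the standard qstat command is available alongside qsub and qdel on the login node; the job ID 123456 is a placeholder, and delete_cmd is a helper for this illustration that only prints the command.

```shell
# List your jobs to find the job ID (skipped if qstat is not installed here)
if command -v qstat >/dev/null 2>&1; then
    qstat -u "$(id -un)"
fi

# Build the delete command for a given job ID without running it
delete_cmd() {
    echo "qdel $1"
}

delete_cmd 123456   # prints: qdel 123456
```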