Updated 2022-08-08
Run Jupyter Notebooks Interactively
Overview
- The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text.
- Uses include
  - Data cleaning and transformation
  - Numerical simulation
  - Statistical modeling
  - Data visualization
  - Machine learning
  - Much more!
- You can run Jupyter on PACE clusters via Open OnDemand, entirely within a browser window.
- The older pace-jupyter-notebook utility remains available on Moab/Torque clusters.
Tip
PACE now recommends using Open OnDemand for Jupyter notebooks. The following information provides instructions for our prior utility and is recommended only for clusters without OnDemand.
Using pace-jupyter-notebook
- Start by connecting to the cluster. If you are not on GT's network, please connect to the VPN first.
Warning
Third-party applications such as PuTTY do not recognize the SSH escape character '~'. If you are using Windows, we recommend PowerShell with OpenSSH (default on Windows 10), as other terminal applications do not establish port-forwarding correctly with the instructions below.
- To start a remote Jupyter Notebook, run the following command, replacing <QUEUENAME> with the queue to which you wish to submit your job:
pace-jupyter-notebook -q <QUEUENAME>
  - By default, this will start a Jupyter Notebook job on 1 node, with 1 processor and 1 GB of memory, for 1 hour
  - Important: If you are using Phoenix or Firebird clusters, you must include the -A flag followed by your account name:
  pace-jupyter-notebook -q <QUEUENAME> -A <ACCOUNTNAME>
    - Run pace-quota to see more information about your account or to find your account name
  - If more resources are required for the job, use the standard qsub options to request them:
    - -l can be used to set resource requests such as nodes, ppn, walltime, mem, pmem, etc.
    - -N can be used to set a custom name for the job
    - -j can be used to join the output and error streams (e.g. -j oe)
    - -o can be used to set the output file path
  - For example, to run a 4-hour Jupyter Notebook job with access to 1 node, 12 cores, and 32 GB of memory, we could run the command
  pace-jupyter-notebook -q inferno -l nodes=1:ppn=12,mem=32gb,walltime=4:00:00
  - By default, the Jupyter Notebook uses the base environment with the Python3 kernel from the latest Anaconda3 module
    - To select a different Anaconda module, use the --anaconda=<ANACONDA_MODULE> option when starting your job
    - <ANACONDA_MODULE> should match the full name of the Anaconda module to be used (e.g. anaconda3/2019.03)
    - A full list of available Anaconda modules can be seen by running the module avail command
  - You can also specify a custom conda environment to use packages that you have installed through Anaconda
    - To activate a custom conda environment, use the --conda-env=<CONDA_ENVIRONMENT> option when starting your job
    - <CONDA_ENVIRONMENT> should be the name you gave the conda environment when you created it
    - For more details on creating custom conda environments, see the documentation
    - Alternatively, you can choose the conda environment from within Jupyter, if you do not include it here.
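The options described in this step can be combined in a single command. As a minimal sketch, the helper below assembles a pace-jupyter-notebook command line from the flags covered above and prints it for review instead of running it. The name build_jupyter_cmd is hypothetical and not part of the PACE tooling, and any account, module, or environment values passed to it are placeholders.

```shell
# Sketch only: build_jupyter_cmd is a hypothetical helper, not a PACE utility.
# It prints a pace-jupyter-notebook command using the flags described above
# (-q, -A, -l, --anaconda, --conda-env) so the command can be reviewed first.
# Usage: build_jupyter_cmd QUEUE [ACCOUNT] [RESOURCES] [ANACONDA_MODULE] [CONDA_ENV]
# Pass an empty string ("") to skip an optional argument.
build_jupyter_cmd() {
    local queue="${1:-}" account="${2:-}" resources="${3:-}"
    local anaconda="${4:-}" conda_env="${5:-}"
    local cmd

    # The queue is the only required setting.
    if [ -z "$queue" ]; then
        echo "error: a queue name is required" >&2
        return 1
    fi

    # Append each optional flag only when a value was supplied.
    cmd="pace-jupyter-notebook -q $queue"
    if [ -n "$account" ];   then cmd="$cmd -A $account"; fi
    if [ -n "$resources" ]; then cmd="$cmd -l $resources"; fi
    if [ -n "$anaconda" ];  then cmd="$cmd --anaconda=$anaconda"; fi
    if [ -n "$conda_env" ]; then cmd="$cmd --conda-env=$conda_env"; fi

    echo "$cmd"
}
```

For instance, build_jupyter_cmd inferno "" nodes=1:ppn=12,mem=32gb,walltime=4:00:00 prints the 4-hour example command given earlier in this step. Printing rather than executing keeps the sketch safe to try on any machine.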
- The script will print output to the screen as the job is started
- To connect to your Jupyter Notebook, you first need to establish port-forwarding through your current SSH session
  - The escape sequence SHIFT + ~ + C (hold the SHIFT key and press ~ then C) will open an SSH console to modify your current session
  - When successfully entered, a prompt displaying ssh> will appear on a new line
Tip
To be recognized as the SSH escape character, the ~ MUST be the first character on a new line. If you see the ~ character appear when you start to type, delete it, hit ENTER to start a new line, and try again.
Tip
If your keyboard does not use the US layout, you may need to remap the SSH escape character. To do this, exit any SSH sessions, add 'EscapeChar +' (without quotes) to the start of ~/.ssh/config, and log in again. When you run pace-jupyter-notebook, type SHIFT + + and then SHIFT + C.
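The ~/.ssh/config change described in this tip can also be scripted. The sketch below assumes a shell with mktemp and grep available; set_escape_char is a hypothetical helper name. It places the EscapeChar + line at the top of the file and leaves the file alone if an EscapeChar setting is already present.

```shell
# Sketch only: set_escape_char is a hypothetical helper name.
# Adds "EscapeChar +" as the first line of an SSH client config file.
# Usage: set_escape_char [CONFIG_FILE]   (default: ~/.ssh/config)
set_escape_char() {
    local config="${1:-$HOME/.ssh/config}"
    local tmp

    # Leave the file untouched if an EscapeChar setting already exists.
    if [ -f "$config" ] && grep -qi '^[[:space:]]*EscapeChar' "$config"; then
        echo "EscapeChar is already set in $config" >&2
        return 0
    fi

    mkdir -p "$(dirname "$config")" || return 1
    tmp="$(mktemp)" || return 1

    # New line first, then the existing contents (if any).
    echo 'EscapeChar +' > "$tmp"
    if [ -f "$config" ]; then
        cat "$config" >> "$tmp"
    fi

    # Replace the config and keep it private to the user.
    mv "$tmp" "$config" && chmod 600 "$config"
}
```

Run it only after exiting your SSH sessions, as the tip describes, and consider keeping a copy of the original file before trying it.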
Caution
If you encounter the error that the port is already in use and forwarding failed, this means that you are already forwarding that port to your local machine. To fix this issue, cancel the existing port-forwarding by opening a new SSH interface (SHIFT + ~ + C) and entering -KL<PORT>, where <PORT> is the port number you wish to clear.
- Copy and paste the purple text that begins with -L into the SSH prompt and hit ENTER to begin port-forwarding
  - The port and compute node combination are unique to your job, so make sure you use the values provided
  - The prompt will display "Forwarding port." if successful; to return to your normal shell prompt, hit ENTER once more
- Open a browser on your local machine, and copy and paste the URL and token into the address bar
  - The URL and token are unique to your job, so be sure to copy the correct link from the green text
- Once connected to your Jupyter Notebook, you can start a new kernel or open an existing notebook
- Congratulations! You are now running a Jupyter Notebook on the cluster!
- Once finished, close your browser and log off
  - If you would like to clean up further, you can delete the job with qdel <JOBID>
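To make the entries typed at the ssh> prompt in the steps above easier to recognize, here is a minimal sketch of how they are shaped. It uses the general OpenSSH form -L port:host:hostport and the -KL<PORT> cancel form mentioned in the caution above. The helper names, the node name, and the assumption that the local and remote port numbers are the same are all hypothetical; the exact text printed by pace-jupyter-notebook for your job always takes precedence.

```shell
# Sketch only: helper names and example values are hypothetical.
# Always prefer the exact -L text printed by pace-jupyter-notebook.

# Succeeds if the argument is an integer in the valid TCP port range.
is_valid_port() {
    case "${1:-}" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

# Print the entry that forwards local PORT to the same PORT on NODE.
# Usage: forward_entry PORT NODE
forward_entry() {
    local port="${1:-}" node="${2:-}"
    if ! is_valid_port "$port" || [ -z "$node" ]; then
        echo "usage: forward_entry PORT NODE" >&2
        return 1
    fi
    echo "-L${port}:${node}:${port}"
}

# Print the entry that cancels an existing local forward of PORT.
# Usage: cancel_entry PORT
cancel_entry() {
    local port="${1:-}"
    if ! is_valid_port "$port"; then
        echo "usage: cancel_entry PORT" >&2
        return 1
    fi
    echo "-KL${port}"
}
```

Both functions only print text; nothing is forwarded until an entry is typed at the ssh> prompt.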
Using Jupyter with Your Own Conda Environment
You can build your own conda environments and use them in Jupyter. If you install the ipykernel package in a conda environment you have created, it will appear as a choice in the list of Jupyter kernels.
Complete these steps on the command line before launching Jupyter to set up the environment:
- Complete the One-Time Setup for Anaconda on PACE
- Load Anaconda:
module load anaconda3
- Create a conda environment with the name of your choice (e.g., "myenv"):
conda create --name myenv
- Activate the environment:
conda activate myenv
- Install the ipykernel package to support Jupyter:
conda install ipykernel
- Install any other packages you would like to use in your environment
After you have set up your environment, you can use it any time you launch a Jupyter notebook. Be sure to create the environment and install the ipykernel package before submitting the Jupyter job. This setup needs to be completed only once per environment.
Inside Jupyter, you will now see a choice of kernel named Python [conda env:.conda-myenv]. Select it to run the notebook's code in your environment.
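The command-line setup steps listed above can be collected into one shell function. This is a minimal sketch rather than a PACE-provided script: setup_jupyter_env is a hypothetical name, and it assumes the One-Time Setup for Anaconda on PACE is already complete and that the function is sourced into an interactive shell on the cluster where the module and conda commands are available.

```shell
# Sketch only: setup_jupyter_env is a hypothetical helper name.
# It runs the setup steps listed above for one conda environment.
# Assumes the One-Time Setup for Anaconda on PACE is complete and that
# this file is sourced into an interactive shell on the cluster, where
# the "module" and "conda" commands are available.
# Usage: setup_jupyter_env [ENV_NAME]   (default: myenv)
setup_jupyter_env() {
    local env_name="${1:-myenv}"

    # Stop at the first step that fails.
    module load anaconda3           || return 1   # load Anaconda
    conda create --name "$env_name" || return 1   # create the environment
    conda activate "$env_name"      || return 1   # switch into it
    conda install ipykernel         || return 1   # add Jupyter kernel support

    echo "Environment '$env_name' now has ipykernel installed."
}
```

After sourcing the file, setup_jupyter_env myenv performs the same four steps in order. conda may still prompt for confirmation during the create and install steps, and any other packages you need can be installed afterwards with the environment active.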