Updated 2021-03-22
Run Jupyter Notebooks Interactively
Overview
- The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text.
- Uses include
- Data cleaning and transformation
- Numerical simulation
- Statistical modeling
- Data visualization
- Machine learning
- Much more!
- You can run Jupyter Notebook on the cluster via a browser on your local computer using `pace-jupyter-notebook`
Using pace-jupyter-notebook
- Start by connecting to the cluster. If you are not on GT's network, please connect to the VPN first.
Warning
Third-party applications such as PuTTY do not recognize the SSH escape character '~'. If you are using Windows, it is recommended that you use PowerShell with OpenSSH (default on Windows 10), as other terminal applications do not establish port-forwarding correctly via the instructions below.
- To start a remote Jupyter Notebook, run the command `pace-jupyter-notebook -q <QUEUENAME>`, where <QUEUENAME> should be replaced with the queue to which you wish to submit your job
- By default, this will start a Jupyter Notebook job on 1 node, with 1 processor and 1 GB of memory, for 1 hour
- Important: If you are using the Phoenix or Firebird clusters, you must include the `-A` flag followed by your account name: `pace-jupyter-notebook -q <QUEUENAME> -A <ACCOUNTNAME>`
- Run `pace-whoami` to see more information about your account or to find your account name
- If more resources are required for the job, use the standard `qsub` options to request them:
- `-l` can be used to set resource requests such as nodes, ppn, walltime, mem, pmem, etc.
- `-N` can be used to set a custom name for the job
- `-j` can be used to merge the job's output streams (e.g. `-j oe` joins standard error into standard output)
- `-o` can be used to set the output file path
- For example, to run a 4-hour Jupyter Notebook job with access to 1 node, 12 cores, and 32 GB of memory, run the command
pace-jupyter-notebook -q inferno -l nodes=1:ppn=12,mem=32gb,walltime=4:00:00
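The resource request above can also be assembled from named values, which makes it easier to change one setting at a time. The following is only a sketch: the variable names and the job name `my-notebook` are hypothetical, and the script prints the command rather than running it, so the result can be checked before it is pasted into a cluster session.

```shell
# Hypothetical values for illustration; adjust them to your own job.
queue="inferno"
nodes=1
ppn=12
mem="32gb"
walltime="4:00:00"
job_name="my-notebook"   # hypothetical custom job name for -N

# Build the comma-separated resource list expected by the -l option.
resources="nodes=${nodes}:ppn=${ppn},mem=${mem},walltime=${walltime}"

# Assemble the full command; -j oe merges the output streams.
cmd="pace-jupyter-notebook -q ${queue} -l ${resources} -N ${job_name} -j oe"

# Print the command for review instead of executing it.
printf '%s\n' "$cmd"
```

Run the printed command on the cluster once the values look right.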
- By default, the Jupyter Notebook uses the base environment with the Python3 kernel from the latest Anaconda3 module
- To select a different Anaconda module, use the `--anaconda=<ANACONDA_MODULE>` option when starting your job
- <ANACONDA_MODULE> should match the full name of the Anaconda module to be used (e.g. anaconda3/2019.03)
- A full list of available Anaconda modules can be seen by running the `module avail` command
- You can also specify a custom conda environment to utilize packages that you have installed through Anaconda
- To activate a custom conda environment, use the `--conda-env=<CONDA_ENVIRONMENT>` option when starting your job
- <CONDA_ENVIRONMENT> should be the name you gave the conda environment when you created it
- For more details on creating custom conda environments, see the documentation
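As a rough illustration of the Anaconda options, the sketch below builds a command that selects both a module and a custom environment. It assumes the two options can be combined in one invocation, which the text above does not state explicitly; the environment name `my-env` is hypothetical, and the command is printed rather than executed.

```shell
# Placeholder values for illustration only.
queue="inferno"
anaconda_module="anaconda3/2019.03"   # full module name, as listed by `module avail`
conda_env="my-env"                    # hypothetical environment name

# Assumption: --anaconda and --conda-env may be passed together.
cmd="pace-jupyter-notebook -q ${queue} --anaconda=${anaconda_module} --conda-env=${conda_env}"

# Print the command for review instead of executing it.
printf '%s\n' "$cmd"
```

On Phoenix or Firebird, the `-A <ACCOUNTNAME>` flag would need to be added as well.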
- The script will print output to the screen as the job is started
Tip
To be recognized as the SSH escape character, the ~ MUST be the first character on a new line. If you see the ~ character appear when you start to type, delete it, hit ENTER to start a new line, and try again.
Caution
If you encounter the error that the port is already in use and forwarding failed, this means that you are already forwarding that port to your local machine. To fix this issue, cancel the existing port-forwarding by opening a new SSH interface (SHIFT + ~ + C) and entering `-KL<PORT>`, where <PORT> is the port number you wish to clear.
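For orientation, the strings typed at the SSH prompt follow the OpenSSH command-line forms `-L port:host:hostport` (add a forward) and `-KL port` (cancel one). The sketch below only prints an example pair. The port number and node placeholder are hypothetical, and it assumes the local and remote ports are the same; always use the exact text printed by your own job.

```shell
# Hypothetical values; your job prints its own port and compute node.
port=8888
node="<COMPUTE_NODE>"

# Forwarding request, of the kind pasted at the SSH prompt.
# Assumption: the same port number is used locally and on the node.
forward="-L ${port}:${node}:${port}"

# Matching cancel request for the same local port.
cancel="-KL${port}"

printf '%s\n' "$forward" "$cancel"
```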
- Copy and paste the purple text that begins with `-L` into the SSH prompt and hit ENTER to begin port-forwarding
- The port and compute node combination is unique to your job, so make sure you use the values provided
- The prompt will display "Forwarding port." if successful; to return to your normal shell prompt, hit ENTER once more
- Open a browser on your local machine, and copy and paste the URL and token into the address bar
- The URL and token are unique to your job, so be sure to copy the correct link from the green text
- Once connected to your Jupyter Notebook, you can start a new kernel or open an existing notebook
- Congratulations! You are now running a Jupyter Notebook on the cluster!
- Once finished, close your browser and log off
- If you would like to clean up further, you can delete the job with `qdel <JOBID>`
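If the job ID is no longer on screen, one way to recover it is to list your jobs before deleting. The sketch below assumes the standard `qstat` command is available alongside `qsub` and `qdel` (it is not mentioned above), and it uses placeholder values; it prints the two commands rather than running them.

```shell
# Placeholder values for illustration only.
username="<USERNAME>"
jobid="<JOBID>"

# Assumption: qstat is available on the cluster; -u lists one user's jobs.
list_cmd="qstat -u ${username}"

# Delete the notebook job once its ID is known.
delete_cmd="qdel ${jobid}"

printf '%s\n' "$list_cmd" "$delete_cmd"
```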