Using Anaconda and Creating User Conda Environments¶
Anaconda distributions have started to use a
year.month scheme, starting from late last year. All PACE resources will now adopt the same convention in the use of anaconda2 and anaconda3 modules. Therefore, anaconda module files for
latest will be removed to avoid ambiguities. Users currently loading an anaconda module ending in
latest should modify their commands to reference a specific version of Anaconda (ex:
- Conda is a powerful tool that makes it easy to:
- install whatever python packages you want
- create a personal environment to use these packages
- To use any conda commands, you must have an anaconda module loaded.
Step 1: Use the Right Storage¶
If using Hive, follow the same directions below to create symlinks, but use
data as the name of the symbolic link to your project storage, instead of
p-<pi-username>-<number>. See Hive Storage for details.
If you are a Phoenix user with storage provided by a school or college, your symbolic link will start with
d- instead of
p-. See Phoenix Storage for details.
For ICE clusters, skip Step 1, as you have only a home directory.
Conda environments can easily surpass the quota of the home directory (5 GB on Hive or 10 GB on Phoenix and Firebird). To work around this, make sure Conda stores its environments in the $HOME/p-<pi-username>-<number> 'project' directory on Phoenix, since this directory has much more storage available.
- We will use a symlink so that the .conda directory in your home directory points into p-<pi-username>-<number>, so Conda doesn't exceed storage limits.
- Do not load the anaconda module before doing these steps.
- Check for and remove existing symlinks: If a symlink involving the .conda directory already exists, please remove it before continuing. This is a common issue, so double check existing symlinks if you run into problems.
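The check described above can be scripted. The following is a minimal sketch rather than a PACE-provided tool: the function name conda_dir_state is hypothetical, and it only reports what it finds without changing anything.

```shell
#!/bin/bash
# Hypothetical helper: report what currently sits at <dir>/.conda.
# Prints one of: symlink, directory, other, missing.
conda_dir_state() {
    local target="${1:-$HOME}/.conda"
    if [ -L "$target" ]; then
        # Test for a symlink first, because -d would follow the link.
        echo "symlink"
    elif [ -d "$target" ]; then
        echo "directory"
    elif [ -e "$target" ]; then
        echo "other"
    else
        echo "missing"
    fi
}

# Report the state of .conda in your own home directory.
conda_dir_state "$HOME"
```

A result of "symlink" means an old link should be removed first (rm ~/.conda removes only the link, not the data it points to). A result of "directory" corresponds to Scenario 1, and "missing" corresponds to Scenario 2.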
Scenario 1: You had previously run Anaconda, and can see a
.conda directory in your home directory.
Make sure that you don't have another
.conda directory under
- (Move .conda to your project directory)
mv ~/.conda ~/p-<pi-username>-<number>
Please replace p-<pi-username>-<number> with your project directory.
- (Check if it's moved successfully)
ls -hd ~/.conda
You should see 'No such file or directory'.
- (Create a symbolic link under your home, where Anaconda is expecting to find this directory)
ln -s ~/p-<pi-username>-<number>/.conda ~/.conda
- (Finally, confirm that the symbolic link is created and you can list its contents)
ls -l ~/.conda/*
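The Scenario 1 steps can be combined into one guarded function. This is a sketch under stated assumptions, not an official PACE script: the name relocate_conda and its two arguments (home directory and project directory) are hypothetical, and the function refuses to proceed if a symlink or a second .conda directory is already in the way.

```shell
#!/bin/bash
# Hypothetical helper: move <home>/.conda into <project> and link it back.
relocate_conda() {
    local home_dir="$1" project_dir="$2"
    if [ -L "$home_dir/.conda" ]; then
        echo "$home_dir/.conda is already a symlink; remove it first." >&2
        return 1
    fi
    if [ ! -d "$home_dir/.conda" ]; then
        echo "No .conda directory in $home_dir; nothing to move." >&2
        return 1
    fi
    if [ ! -d "$project_dir" ]; then
        echo "Project directory $project_dir not found." >&2
        return 1
    fi
    if [ -e "$project_dir/.conda" ] || [ -L "$project_dir/.conda" ]; then
        echo "$project_dir/.conda already exists; not overwriting it." >&2
        return 1
    fi
    # Move the directory, then create the link where Anaconda expects it.
    mv "$home_dir/.conda" "$project_dir/" &&
        ln -s "$project_dir/.conda" "$home_dir/.conda"
}

# Usage (replace the placeholder with your own project directory):
#   relocate_conda "$HOME" "$HOME/p-<pi-username>-<number>"
```

Keeping the checks ahead of the mv means a failed precondition leaves both directories untouched.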
Scenario 2: You have never run Anaconda before and there is no .conda directory in your home directory.
Use ls -a to check whether .conda exists. If the anaconda module is loaded or .conda already exists in the home folder, the symlink won't work properly. The environment will still be stored in your home folder, and you might exceed the home directory quota. Again, if a symlink already exists, remove it before continuing.
To make the symlink, follow these steps (log into the cluster first):
ln -s ~/data/.conda .conda
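For Scenario 2, a slightly more defensive version of the link step might look like the sketch below. The function name link_new_conda is hypothetical, and creating the target directory with mkdir -p before linking is an assumption on our part (the instructions above only show the ln -s command), made so that the link never points at a path that does not exist.

```shell
#!/bin/bash
# Hypothetical helper: create <project>/.conda and link <home>/.conda to it.
link_new_conda() {
    local home_dir="$1" project_dir="$2"
    if [ -e "$home_dir/.conda" ] || [ -L "$home_dir/.conda" ]; then
        echo "$home_dir/.conda already exists; remove the old symlink or follow Scenario 1." >&2
        return 1
    fi
    # Assumption: make sure the link target exists before linking to it.
    mkdir -p "$project_dir/.conda" || return 1
    ln -s "$project_dir/.conda" "$home_dir/.conda"
}

# Usage on Hive, where the project link is named data:
#   link_new_conda "$HOME" "$HOME/data"
```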
Step 2: Load the Anaconda Module¶
If you are using a PACE system, you cannot use anaconda3/latest. Instead:
- Use module avail anaconda3 to determine the latest version (currently 2021.05)
- Use module avail anaconda to see all the available versions
- Load with module load anaconda3/2021.05, or replace anaconda3/2021.05 with any available version
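Because the modules follow the year.month naming scheme, the newest one can be picked out of a list mechanically. The sketch below assumes GNU sort (for the -V version ordering); the helper name latest_module is hypothetical, and the sample version strings other than 2021.05 are illustrative only, not a statement of what is installed.

```shell
#!/bin/bash
# Hypothetical helper: read module names (one per line) on stdin and
# print the anaconda module with the highest year.month version.
latest_module() {
    # Keep only lines that look like anaconda2/YYYY.MM or anaconda3/YYYY.MM,
    # then let GNU sort order them by version and take the last one.
    grep -E '^anaconda[23]/[0-9]{4}\.[0-9]{2}' | sort -V | tail -n 1
}

# Illustrative input only.
printf '%s\n' anaconda3/2019.10 anaconda3/2021.05 anaconda3/2020.02 | latest_module
```

On a cluster, the input could come from something like module -t avail anaconda3 2>&1 | latest_module, assuming your module command supports the terse -t flag and writes its listing to standard error.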
Step 3: List Available Conda Environments¶
conda env list
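The listing can also be checked from a script, for example to avoid creating an environment that already exists. This is a minimal sketch: env_exists is a hypothetical helper, and it assumes the usual conda env list layout, with comment lines starting with # and the environment name in the first column.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the named environment appears in
# `conda env list` output supplied on stdin.
env_exists() {
    awk -v name="$1" '
        /^#/ { next }              # skip the header comment lines
        $1 == name { found = 1 }   # first column is the environment name
        END { exit !found }        # exit status 0 only when a match was seen
    '
}

# Usage, with an anaconda module loaded:
#   conda env list | env_exists mymacs2 && echo "mymacs2 already exists"
```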
Step 4: Create Environment¶
Run conda create --name <your-env-name> to create a completely empty conda environment.
Step 5: Activate Environment¶
To activate your environment, run:
source activate <your-env-name>
Newer Anaconda installations permit the use of
conda activate <your-env-name>
Step 6: Install Packages¶
- Make sure your conda environment is activated. The env name will show up in parentheses next to your name in the terminal. Install Python packages how you normally would. Use conda as the preferred package manager, or pip install, following the installation steps for whatever packages you choose.
- Install nltk example:
- when your conda env is activated, just run:
conda install nltk
How to run a locally-installed Anaconda virtual env when some dependency is needed¶
- You may need to load other modules on which your libraries are dependent. For example, you may need the intel module to make use of a Python2 library. In that case, run a series of commands:
module load anaconda3/2021.05
source activate my-conda-environment
module load intel/19.0.5
command-depending-on-intel
Deactivating & Removing Environments¶
Deactivating an environment is useful if you need to activate another environment.
To deactivate, run:
source deactivate
Newer Anaconda installations use conda deactivate instead. Neither form needs the environment name.
When running a job that uses your environment, make sure you activate it in the PBS script. Conda environments may use up a lot of your storage quota. If you are done with an environment and want to delete it, run:
conda env remove -n <your-env-name>
- Users who want improved interoperability when installing pip packages inside conda can read more here.
At this time, users that wish to use Anaconda 2 or Anaconda 3, and csh/tcsh shells, need to use the module
anaconda3/2021.05 or newer versions. Older versions will not work reliably.
Furthermore, PACE staff advises tcsh users to consider using Bash scripts for job scripts:
- #!/bin/bash on the first line of the job script
- After all
if [ -f /etc/bashrc ]; then
    . /etc/bashrc
fi
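Put together, a Bash job script that activates a conda environment might look like the sketch below. Treat it as an illustration under assumptions: the #PBS values (job name, resources, walltime) are placeholders, the environment name my-conda-environment is the example name used earlier on this page, and placing the bashrc block directly after the #PBS lines is our reading of the advice above.

```shell
#!/bin/bash
#PBS -N conda-job
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:10:00

# Source the system bashrc so that batch shells pick up the usual setup.
if [ -f /etc/bashrc ]; then
    . /etc/bashrc
fi

# Placeholder environment name; replace with your own.
ENV_NAME="my-conda-environment"

if command -v module >/dev/null 2>&1; then
    module load anaconda3/2021.05
    source activate "$ENV_NAME"
    # Replace this line with the actual work of the job.
    python --version
else
    echo "module command not found; run this script on a cluster node" >&2
fi
```

The command -v guard only keeps the script from failing with a confusing error when it is run somewhere the module command is not defined.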
Examples for Building Custom Conda environments¶
In this section we provide custom conda examples that walk through a conda environment from start to finish.
In the example below, we demonstrate an installation of the package MACS2 on
login-s.pace.gatech.edu using Anaconda 3 (2021.05), with pip interoperability.
An example with MACS2 (RHEL7 PACE Systems on 10/2021)
# load anaconda3 2021.05 for conda python3
module load anaconda3/2021.05
# sanity check the python and python version
#type python3
python3 --version
# create and activate the conda environment
# noting the storage practices we mentioned above!!!
conda create --name mymacs2
conda activate mymacs2
# ensure that we install pip locked into using python3.8.8
# this is specifically a requirement for the version of MACS2 we want
conda install pip python=3.8.8
conda config --set pip_interop_enabled True
pip install -U numpy
pip install -U macs2
# sanity check packages
conda list
macs2 --version
# macs2 user should know what to do here
# when you are finished or want to work with another conda environment
conda deactivate
In many cases, we emphasize creating environments with environment.yml because it makes the versions being used explicit. Both you, the user, and I, the PACEr, can then agree that we used the same package components, which improves our chances at reproducibility for all.
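As an illustration, the MACS2 example above could be captured in an environment.yml roughly as follows. This is a sketch, not a tested PACE recipe: the defaults channel is an assumption, the helper name write_env_file is hypothetical, and numpy and macs2 are left unpinned only because the example above does not state their versions (pinning them is what makes the file reproducible).

```shell
#!/bin/bash
# Hypothetical helper: write an environment.yml for the MACS2 example.
write_env_file() {
    cat > "$1" <<'EOF'
name: mymacs2
channels:
  - defaults
dependencies:
  - python=3.8.8
  - pip
  - pip:
      - numpy
      - macs2
EOF
}

# Usage, with an anaconda module loaded:
#   write_env_file environment.yml
#   conda env create -f environment.yml
#   conda activate mymacs2
```

Once an environment works, conda env export > environment.yml records the exact versions that were actually installed.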
Problems with Installation of pip Packages¶
- While trying to install pip packages in a conda environment, you may come across an issue such as the install clashing with items in other local directories
- One way to resolve the issue is to move the ~/.local folder out of the way when doing pip installs under the conda environment, with a command like mv ~/.local ~/.local.bak.
- If that doesn't work, then try to manually edit site.py in the conda Python by setting ENABLE_USER_SITE = False in the file.
- If you are using the tiny conda environment, you can find the file with the following:
module load anaconda3/2021.05
conda env list
conda activate tiny
python -c 'import site; print(site.__file__)'
- More information about ENABLE_USER_SITE can be found here.
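A less invasive alternative to editing site.py, not covered in the steps above, is Python's standard PYTHONNOUSERSITE environment variable, which tells the interpreter to skip the per-user site-packages directory under ~/.local. The sketch below only inspects the setting; the two function names are hypothetical, and it assumes python3 is on your PATH.

```shell
#!/bin/bash
# Report whether Python will add the per-user site-packages directory
# (under ~/.local) to sys.path in the current shell.
user_site_enabled() {
    python3 -c 'import site; print(site.ENABLE_USER_SITE)'
}

# Same check with the user site directory switched off for this call only.
user_site_enabled_isolated() {
    PYTHONNOUSERSITE=1 python3 -c 'import site; print(site.ENABLE_USER_SITE)'
}

user_site_enabled
user_site_enabled_isolated
```

If the second call reports False, prefixing a command in the same way (for example PYTHONNOUSERSITE=1 pip install -U macs2) should keep packages in ~/.local out of that run without renaming the folder.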