Conda Enviroments

Conda is a package management environment and package manager that is primarily used for open-source software packages. Conda is associated with Python and R programming languages but also supports C++ and Fortran.

Below is documentation on installing and maintaining a Conda environment on the HPC clusters.

Build your own conda environment

Load and initializing conda

If you have not ran conda init to add it as a source in your .bashrc file, then this is a necessary first step to use Conda. Once users log into the HPC system, you need to type the following command:

module load anaconda3
conda init

note

if you get a warning messages after conda init command, you need to log out the HPC system, and then log back in. You will then be able to initialize the system. Once the conda environment is initialized, you will see the environment name (default one is called base) next be listed next to your username in the command line.

Optional -- Editing the `.condarc` file

By default, conda stores packages in the HOME directory. If you want to change the default path to somewhere you desire, for example /data/$USER/, you can use a text editor to create a file called .condarc under the path /home/$USER/.condarc and it should contain the path to the alternative location.

e.g.:

pkgs_dirs:
 - /data/$USER/conda/pkgs

tip

More .condarc examples can be found here

Installing and activating a `conda` environment

To create and install a conda environment, you can specify a prefix, which will be the path to where the environment will be installed. If you don't include prefix, the default path will be under your HOME. Choose a descriptive name for the environment- Conda will create the directory, the directory should not already exist. For example, if you want to create a conda environment called env_aaa containing the packages AAA, BBB, CCC, and insatll it in the directory /data/$USER/packages, you can do following

conda create --prefix /data/$USER/packages/env_aaa AAA BBB CCC

To activate the environment, you can do:

conda activate /data/$USER/packages/env_aaa

note

If you are using the default package path, you do not need to include path to activate the environment, you can just do:

conda activate ENV_NAME

note

User can also create a YAML file with all of the desired conda packages so that conda will find the correct configuration to solve the environment. The YAML file will contain a name, a list of Conda channels to look for the packages, and a list of all the desired packages.

Here are some sample YAML files:

datascience.yml - Contains many common data science programs
sklearn.yml - Machine learning with scikit-learn
biotools.yml - Contains applications for a bioinformatics workflow

Using Anaconda in SLURM job scripts

Conda does not know the source when it is called in a SLURM job script. To ensure conda knows the source of the environment that it will be activating during the execution of the SLURM job script activate the conda environment by running:

source activate <env name>`

This will ensure SLURM to look for the called environment in the correct PATH and activates it.

An example job script utilizing a conda environment can be found ADD LINK HERE

warning

Note using conda activate <env name> may result in error as conda activate makes the assumption that the environment is being called in the same PATH it exists, which is not the case for SLURM jobs.

Common Issues with Conda

Without running conda init the commands conda activate and conda deactivate will present the following warning.

[tester001@rclogin01 ~]$ conda activate
CommandNotFoundError: Your shell has not been properly configured to use 'conda activate'.
To initialize your shell, run

    $ conda init <SHELL_NAME>

Currently supported shells are:
- bash
- fish
- tcsh
- xonsh
- zsh
- powershell

See 'conda init --help' for more information and options.

IMPORTANT: You may need to close and restart your shell after running 'conda init'.

Since conda init injects some logic into your .bashrc file it must be sourced. For interactive job sessions conda init does not need to be called as the source is already known. However this is not the case for SLURM jobs, therefore it is important to add source .bashrc to the submission script to allow subsequent conda activate or conda deactivate commands to work.

Build your own conda environment​

Load and initializing conda​

Optional -- Editing the .condarc file​

Installing and activating a conda environment​

Using Anaconda in SLURM job scripts​

Common Issues with Conda​

Build your own conda environment

Load and initializing conda

Optional -- Editing the `.condarc` file

Installing and activating a `conda` environment

Using Anaconda in SLURM job scripts

Common Issues with Conda