Conda is a package manager and installation tool available on Mac, Windows, and Linux. It makes it easy to find the packages (and versions of packages) you want, and it installs them along with all of their dependencies. It can also help you keep software tools up to date. Conda is also very useful for managing environments. A common issue in bioinformatics is the need to switch between Python 2 and Python 3. This can get really cumbersome; a good alternative is to keep a separate environment for each version of Python and switch into the one you need when you work with a given tool.
First, download the installer from https://conda.io/miniconda.html. The table on this page contains the installers for Mac, Windows, and Linux. These instructions will be for a Linux OS.
cd
curl -O https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
Then follow the prompts on the screen. The default settings should be good for most installations, but we will also have conda add itself to our PATH. The PATH is a list of directories that your computer uses to search for software. To do this, you will:
- press enter to go to the license agreement screen
- press q when you finish reading the license
- type yes and press enter to agree to the license
- press enter to agree to the default installation directory
- type yes and press enter to automatically update your PATH.
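To make the idea of the PATH concrete, here is a small sketch that prints the directories on your current PATH, one per line, in the order the shell searches them. The variable names path_entries and path_count are just illustrative.

```shell
# Split the colon-separated PATH into one directory per line.
path_entries=$(printf '%s\n' "$PATH" | tr ':' '\n')
printf '%s\n' "$path_entries"

# Count the entries; tr strips the padding some versions of wc add.
path_count=$(printf '%s\n' "$path_entries" | wc -l | tr -d ' ')
echo "PATH has $path_count entries"
```

Once the installer has updated your PATH and you have started a new session, a directory inside your Miniconda installation should appear in this list.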
After conda is installed, it will update a hidden file in your home directory called .bashrc (or .bash_profile). This file contains the definition of the PATH. To update your current environment with the new value of the PATH variable, conda recommends logging out and logging back in. However, you can also run the command:
source ~/.bashrc
which will add the updated variables in .bashrc to your environment.
Test the installation
Now if you enter the command conda, you should get a help message about how to run conda.
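If you would rather check from a script, a minimal sketch like the following reports whether the shell can find conda. The conda_status variable is only an illustrative name.

```shell
# command -v prints how the shell resolves a command name,
# and fails if the name cannot be resolved.
if command -v conda >/dev/null 2>&1; then
    conda_status="found"
    echo "conda resolves to: $(command -v conda)"
    conda --version
else
    conda_status="missing"
    echo "conda is not on the PATH yet; try logging out and back in."
fi
```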
Now you may want to try to install a piece of software like bwa. Try running
conda install bwa
At this point in the process, however, we get an error message:
Solving environment: failed

PackagesNotFoundError: The following packages are not available from current channels:

  - bwa

Current channels:

  - https://repo.anaconda.com/pkgs/main/linux-64
  - https://repo.anaconda.com/pkgs/main/noarch
  - https://repo.anaconda.com/pkgs/free/linux-64
  - https://repo.anaconda.com/pkgs/free/noarch
  - https://repo.anaconda.com/pkgs/r/linux-64
  - https://repo.anaconda.com/pkgs/r/noarch
  - https://repo.anaconda.com/pkgs/pro/linux-64
  - https://repo.anaconda.com/pkgs/pro/noarch

To search for alternate channels that may provide the conda package you're
looking for, navigate to

    https://anaconda.org

and use the search bar at the top of the page.
Channels and Bioconda
This error is telling us that conda cannot find a package called bwa. That’s because conda maintains a list of “channels” where it looks for software to install, and BWA lives in a channel called bioconda that isn’t included by default.
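To see which channels conda is currently configured to search, you can ask conda to print that part of its configuration. This is a small sketch; the channel_report variable and the fallback message are illustrative only.

```shell
# Show the channels conda currently searches, highest priority first.
# If conda cannot be run, fall back to a short message instead.
channel_report=$(conda config --show channels 2>/dev/null) \
    || channel_report="conda is not available in this shell"
printf '%s\n' "$channel_report"
```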
We can install bwa by specifying the channel name with the -c flag:
conda install -c bioconda bwa
However, it is useful to add the channel to our conda configuration so we don’t have to remember which channel each piece of software lives in.
To make Bioconda (and some other common channels) default channels, run the following commands:
conda config --add channels defaults
conda config --add channels conda-forge
conda config --add channels bioconda
The order here is important as it specifies the priority of the channels: each --add places the channel at the top of the list, so bioconda, which is added last, gets the highest priority. You should now be able to install bioconda packages without needing to specify the bioconda channel. Running
conda install bwa
should now properly install BWA.
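Why does the order of the three configuration commands matter? Each --add puts the new channel at the front of the list, so the last channel added ends up with the highest priority. The sketch below imitates that behaviour with a plain shell variable; add_channel is a hypothetical helper written for illustration, not a conda command.

```shell
channels=""

# Hypothetical stand-in for `conda config --add channels NAME`:
# put the new name in front of whatever is already listed.
add_channel() {
    channels="$1${channels:+ $channels}"
}

add_channel defaults
add_channel conda-forge
add_channel bioconda

# Highest priority first.
echo "$channels"   # prints: bioconda conda-forge defaults
```

With the real commands, running conda config --show channels afterwards should list the channels in the same order.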
R and R utility software like RStudio and RStudio Server are in a channel called r. How would you add the r channel to your conda configuration? How would you install a package in this channel without adding it to your configuration?
Searching for conda packages
You can search for conda packages with the conda search command.
For example, you can run the following to search for samtools and see the different versions available and which channel the package is in:
conda search samtools
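The search output can be long, since each version is usually listed once per build. As a rough sketch, the helper below reduces it to a list of distinct version numbers. It assumes the output has the package name in the first column and the version in the second, which may differ between conda releases; list_versions is a hypothetical name.

```shell
# Read `conda search` output on stdin and print each distinct version
# of the package named in $1.
list_versions() {
    awk -v pkg="$1" '$1 == pkg { print $2 }' | sort -u
}

# Usage (requires conda and network access):
#   conda search samtools | list_versions samtools
```

Note that sort -u orders the versions as plain text, so 1.10 would be listed before 1.2.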
You can install a specific version of a package by typing an equal sign and the version number after the package name. For example, to install samtools version 1.2, rather than the most recent version (1.8), run
conda install samtools=1.2
Why might you want to install a specific version of a piece of software instead of the most recent one? You might be trying to access a function that has been deprecated in current versions (it’s really good practice to read the documentation when following a workflow like this, since functions are often deprecated for very good reasons). You may also be following an older workflow or analysis and trying to reproduce it. This is really common in bioinformatics, as we often look to previously performed analyses that worked for a specific type of data and replicate them with a current dataset. There are endless more specific examples that I won’t get into right now, but suffice it to say that accessing prior versions of software is a very handy skill to have!