BmDCA
Fork of matteofigliuzzi/bmDCA repository for Boltzmann-machine Direct Coupling Analysis (bmDCA).
Boltzmann-machine Direct Coupling Analysis (bmDCA)
This repository contains a C++ reimplementation of bmDCA adapted from the original code. The method is described in:
Figliuzzi, M., Barrat-Charlaix, P. & Weigt, M. How Pairwise Coevolutionary Models Capture the Collective Residue Variability in Proteins? Molecular Biology and Evolution 35, 1018–1027 (2018).
This code is designed to eliminate the original's excessive file I/O and to parallelize the MCMC in the inference loop.
Dependencies (installation instructions are detailed below): GCC, AutoTools, pkg-config, and Armadillo.
Installing dependencies
GCC is used to compile the source code (and dependencies, if necessary).
The code relies on the -fopenmp flag for parallelization, so GCC is preferred
over Clang. It also needs support for the C++11 standard, so GCC 4.8.1 or later
(the first release with complete C++11 support) is required.
AutoTools are a set of programs used to generate makefiles for cross-platform compilation and installation.
pkg-config is a program that provides a simple interface between installed programs (e.g. libraries and header files) and the compiler. It's used by AutoTools to check for dependencies before compilation.
Armadillo is a C++ linear algebra library. It's used for storing data in matrix structures and performing quick computations in the bmDCA inference loop. To install it, check your system's package repository.
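To confirm that a given compiler meets the C++11 and OpenMP requirements described above, one option is to compile a tiny test program with both flags. The following is a minimal sketch, not part of bmDCA: check_compiler is a hypothetical helper name, and the test program is purely illustrative.

```shell
# Hypothetical helper (not part of bmDCA): report whether a compiler accepts
# both -std=c++11 and -fopenmp by building a small test program.
# Usage: check_compiler [compiler]   (defaults to g++)
check_compiler() (
  cxx="${1:-g++}"
  tmpdir="$(mktemp -d)" || exit 2
  cat > "${tmpdir}/omp_test.cpp" <<'EOF'
#include <omp.h>
#include <cstdio>
int main() {
  auto threads = omp_get_max_threads();  // 'auto' requires C++11
  std::printf("%d\n", threads);
  return 0;
}
EOF
  if "${cxx}" -std=c++11 -fopenmp "${tmpdir}/omp_test.cpp" \
      -o "${tmpdir}/omp_test" 2>/dev/null; then
    echo "${cxx}: C++11 and OpenMP supported"
    status=0
  else
    echo "${cxx}: C++11/OpenMP test failed"
    status=1
  fi
  rm -rf "${tmpdir}"
  exit "${status}"   # body runs in a subshell, so this only ends the function
)
```

For instance, running check_compiler g++-9 would test a Homebrew-installed GCC 9, assuming that is the binary name on your system.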
Linux
To install the dependencies on Linux, simply use your distribution's package manager. Commands for Debian/Ubuntu and Arch Linux are provided below:
Debian/Ubuntu
Run:
sudo apt-get update
sudo apt-get install git gcc g++ automake autoconf pkg-config \
libarmadillo-dev libopenblas-dev libarpack++2-dev
Arch Linux
For Arch Linux, GCC should have been installed with the base and base-devel
metapackages (sudo pacman -S base base-devel), but if not installed, run:
sudo pacman -S gcc automake autoconf pkgconf
For Arch, Armadillo is not in the package repositories. You will need to check the AUR.
First, install the SuperLU library:
git clone https://aur.archlinux.org/superlu.git
cd superlu
makepkg -si
cd ..
SuperLU is a fast matrix factorization library required as a build dependency
for Armadillo. Other build dependencies will be installed via makepkg from
the official repositories.
Now, download and install Armadillo:
git clone https://aur.archlinux.org/armadillo.git
cd armadillo
makepkg -si
cd ..
macOS
The macOS instructions rely on Xcode developer tools and Homebrew for package management. All commands will be entered into the Terminal.
First, install Xcode developer tools. Open the 'Terminal' application from the launcher and run:
xcode-select --install
You may already have this installed.
Next, install Homebrew. From the online instructions, run:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"
If you run into permissions errors when installing Homebrew, complaining that
root owns the /usr/local/ directory, you can change the ownership by running:
sudo chown -R <user> /usr/local/
where <user> should be substituted with your username, e.g. john.
Once Homebrew is installed, run:
brew install gcc automake autoconf pkg-config armadillo
This will install the most recent GCC (9.3.0 as of writing) along with AutoTools and pkg-config.
IMPORTANT: The default gcc, located at /usr/bin/gcc, is actually an alias
for clang, which is a different compiler. While in principle this is not an issue,
this version of Clang is not compatible with the -fopenmp compiler flag that
is used to enable parallelization of the MCMC sampler. Additionally, libraries
installed via Homebrew (such as Armadillo) are not by default
known to pkg-config or the linker.
Addressing all of these issues involves overriding the CC and CXX
environment variables with the new GCC, updating PKG_CONFIG_PATH with the paths
to any relevant *.pc files, and updating LD_LIBRARY_PATH with the directories of
any shared object libraries linked at compile time.
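To illustrate what updating these variables involves, here is a minimal sketch. It is not the content of the rcparams file: path_prepend is a hypothetical helper, and the compiler names and directories in the comments are assumptions that depend on your Homebrew prefix and GCC version.

```shell
# Hypothetical helper (not taken from rcparams): prepend DIR to the
# colon-separated variable VAR unless DIR is already listed.
# Usage: path_prepend VAR DIR
path_prepend() {
  var_name="$1"
  new_dir="$2"
  # Read the current value of the variable whose name is in var_name.
  eval "current=\${${var_name}:-}"
  case ":${current}:" in
    *":${new_dir}:"*) ;;  # already present, nothing to do
    *) eval "export ${var_name}=\"\${new_dir}\${current:+:\${current}}\"" ;;
  esac
}

# Example usage (all values below are assumptions; adjust to your system):
#   export CC=gcc-9 CXX=g++-9
#   path_prepend PKG_CONFIG_PATH /usr/local/lib/pkgconfig
#   path_prepend LD_LIBRARY_PATH /usr/local/lib
```

Checking for an existing entry keeps the variables from growing each time the run commands are sourced.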
Doing this for the first time is a bit bewildering, so for convenience, use the
rcparams file in the tools directory in this repository. In it are a few
helper functions and aliases. Simply append the contents of that file to your
shell run commands. If you don't know what shell you're using, run:
echo $SHELL
For bash, copy the contents of rcparams to ${HOME}/.bashrc, and for zsh,
copy to ${HOME}/.zshrc. As a general rule, macOS versions <=10.14
(Mojave and earlier) use bash as the default shell, while versions >=10.15
(Catalina and later) use zsh.
You can append the rcparams file by copy-pasting the code in your favorite
text editor. You could also do something like cat tools/rcparams >> ${HOME}/.bashrc, for example.
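The choice of target file can also be scripted. The sketch below assumes only the two shells discussed above; rc_file_for is a hypothetical helper and is not provided by this repository.

```shell
# Hypothetical helper: print the run-commands file matching a shell path
# such as the value of $SHELL. Only bash and zsh are handled.
rc_file_for() {
  case "${1##*/}" in   # strip the directory part, e.g. /bin/zsh -> zsh
    bash) echo "${HOME}/.bashrc" ;;
    zsh)  echo "${HOME}/.zshrc" ;;
    *)    echo "unsupported shell: $1" >&2; return 1 ;;
  esac
}

# Example usage, run from the repository root:
#   cat tools/rcparams >> "$(rc_file_for "${SHELL}")"
```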
As a side note, macOS may not actually source the .bashrc file by
default. If you notice that adding the rcparams functions has no effect in
new terminals, check that the ${HOME}/.bash_profile file exists. In it, there
should be a line like [ -f $HOME/.bashrc ] && . $HOME/.bashrc. (If the
.bashrc file exists, source it.) If no such line is there, add it
and reload your terminal.
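If you prefer to add that line from the command line, the following sketch appends it only when it is not already present. ensure_line is a hypothetical helper, not something shipped with bmDCA.

```shell
# Hypothetical helper: append LINE to FILE unless FILE already contains
# exactly that line. Creates FILE if it does not exist.
# Usage: ensure_line FILE LINE
ensure_line() {
  touch "$1" || return 1
  # -x matches whole lines, -F treats the pattern as a fixed string.
  if ! grep -qxF -- "$2" "$1"; then
    printf '%s\n' "$2" >> "$1"
  fi
}

# Example usage:
#   ensure_line "${HOME}/.bash_profile" '[ -f $HOME/.bashrc ] && . $HOME/.bashrc'
```

Because the check is for an exact match, running it repeatedly leaves a single copy of the line in the file.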
The libraries and headers will be found via the pkgconfig_find() and
ld_lib_add() functions specified in the rcparams file.
Note: Run commands are executed when the shell starts, not when the files are edited. To update your shell to reflect changes, you can either run:
source ${HOME}/.bashrc
Or simply open a new shell. (For remote systems, you can just log out and log in again.)
Windows
Before starting, install MSYS2. This program is a package distribution for GNU/Unix tools that can be used to build programs for Windows.
The installer defaults work fine, and if prompted, open the "MSYS2" shell in the dialog window.
Once MSYS2 is installed and open, update the base libraries by running:
pacman -Syu
This will download and install some packages. You will then be prompted to close the terminal. Close it and open it again. Then, again run:
pacman -Syu
This will upgrade the remaining packages bundled with the installer to their most recent versions.
Next, install the dependencies for bmDCA:
pacman -S nano vim git \
autoconf automake-wrapper pkg-config make \
mingw-w64-x86_64-toolchain \
mingw-w64-x86_64-openmp \
mingw-w64-x86_64-arpack \
mingw-w64-x86_64-lapack \
mingw-w64-x86_64-openblas \
mingw-w64-x86_64-armadillo
The above command will install the required programs in the /mingw64/bin
directory. Unfortunately, this directory is not on the default PATH. You will
need to add it manually.
Open your .bashrc file in a text editor (e.g. vim ~/.bashrc). Nano and Vim
were installed in the above command block.
Once open, add the line (at the end of the file):
export PATH="/mingw64/bin:$PATH"
Then, close and open the MSYS2 terminal again.
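After reopening the terminal, it may be worth confirming that the build tools are actually found on the PATH. The sketch below uses a hypothetical helper, missing_tools, which is not part of bmDCA or MSYS2. The tool names in the usage comment are assumptions based on the packages installed above.

```shell
# Hypothetical helper: print each tool from the argument list that cannot be
# found on PATH, and return non-zero if any are missing.
# Usage: missing_tools TOOL...
missing_tools() {
  missing_count=0
  for tool in "$@"; do
    if ! command -v "${tool}" >/dev/null 2>&1; then
      echo "missing: ${tool}"
      missing_count=$((missing_count + 1))
    fi
  done
  [ "${missing_count}" -eq 0 ]
}

# Example usage:
#   missing_tools git gcc g++ make autoconf pkg-config
```

The same check works on Linux and macOS. It only covers executables, so it does not tell you whether libraries such as Armadillo are visible to the compiler.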
Optionally, edit the /etc/pacman.conf file: uncomment the line #Color and
add the line ILoveCandy. This is just a cosmetic flourish for pacman.
Installing bmDCA (all platforms)
Now that all the dependencies have been installed, compile and install bmDCA
globally (default: /usr/local) by running:
git clone https://github.com/ranganathanlab/bmDCA.git
cd bmDCA
./autogen.sh --prefix=/usr/local && \
make -j4 && \
make install
cd ..
Depending on your platform, the make install command may fail due to
permissions issues. To remedy this, you can either run sudo make install
instead, or you can specify a different installation directory that does not
require administrator privileges. The latter option is particularly useful when
working on a remote system not under your control.
Should you want to specify a local directory, for example $HOME/.local, run:
./autogen.sh --prefix=${HOME}/.local && \
make -j4 && \
make install
You can replace the value to the right of --prefix= with any other path.
Note that the bin directory under that prefix (e.g. ${HOME}/.local/bin) must be on your PATH for the shell to find bmdca.
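One way to check whether a directory is on the PATH is sketched below. on_path is a hypothetical helper, and the ${HOME}/.local/bin location in the usage comment assumes the --prefix=${HOME}/.local example above together with the standard AutoTools layout, in which programs are installed under the bin subdirectory of the prefix.

```shell
# Hypothetical helper: succeed if DIR is one of the entries in PATH.
# Usage: on_path DIR
on_path() {
  case ":${PATH}:" in
    *":$1:"*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Example usage (assumes bmDCA was installed with --prefix=${HOME}/.local):
#   on_path "${HOME}/.local/bin" || \
#     echo 'export PATH="${HOME}/.local/bin:${PATH}"' >> "${HOME}/.bashrc"
```

Wrapping PATH in colons before matching avoids false positives on partial directory names.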
In the event you wish to uninstall bmDCA, simply run sudo make uninstall
or make uninstall as appropriate.
Test the installation by running in the terminal:
bmdca
If the installation worked correctly, this will print the usage information, e.g.:
bmdca usage:
(e.g. bmdca -i <input MSA> -r -d <direc
