R2g
A homology-based, computationally lightweight pipeline for discovering genes in the absence of an assembly
Reads to Genes (r2g)
Introduction
<div align=center><img src="https://raw.githubusercontent.com/yangwu91/r2g/master/images/banner.png" alt="banner"/></div>
Reads to Genes, or r2g, is a computationally lightweight, homology-based pipeline that rapidly identifies genes or gene families from raw sequence databases in the absence of an assembly. It takes advantage of over 44.3 petabases of sequencing data for all kinds of species deposited in the Sequence Read Archive (SRA), hosted by the National Center for Biotechnology Information (NCBI), and runs effectively on most common computers without high-end specs.
Implementation
The GUI wrapper, r2g GUI, has now been released. Please visit here if you prefer a graphical user interface (GUI) for r2g. The following methods install the command-line interface (CLI) for r2g. Please note that the GUI is still under development, and the CLI is more stable than the GUI.
Pulling the Docker image (recommended)
Please follow the instructions here to download and install Docker for your operating system before running the Docker image. Windows users, please check here to configure Docker if this is your first time using it.
This installation method is recommended as it is compatible with most common operating systems including Linux, macOS and Windows.
Then, pull the r2g Docker image, which has all required software packages installed and configured, with a single command:
docker pull yangwu91/r2g:latest
Now, you are good to go.
Installing with Conda channels for Linux users
For Linux users, r2g can be installed with Conda as follows. Either miniconda3 (recommended) or anaconda needs to be installed first.
# Install miniconda3:
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
sh Miniconda3-latest-Linux-x86_64.sh
# Set up bioconda channel (or its mirrors):
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
# Install r2g:
conda install -c yangwu91 r2g
After that, the Google Chrome web browser and the matching version of ChromeDriver (or the selenium/standalone-chrome Docker image) need to be installed.
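Since ChromeDriver has to correspond to the installed Chrome version, a quick check can catch a mismatch early. The following is a minimal sketch, not part of r2g: the helper names `major_version` and `same_major` are hypothetical, the executable name `google-chrome` is an assumption (it differs between platforms), and comparing only the major version is a simplification.

```shell
# Extract the major version (the number before the first dot) from a
# "--version" string such as "Google Chrome 85.0.4183.83".
major_version() {
    printf '%s\n' "$1" | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1 | cut -d. -f1
}

# Succeed only when both version strings carry the same major version.
same_major() {
    [ -n "$(major_version "$1")" ] &&
        [ "$(major_version "$1")" = "$(major_version "$2")" ]
}

# Query the real binaries only when both are on PATH.
# Assumption: Chrome is installed as "google-chrome".
if command -v google-chrome >/dev/null 2>&1 && command -v chromedriver >/dev/null 2>&1; then
    if same_major "$(google-chrome --version)" "$(chromedriver --version)"; then
        echo "Chrome and ChromeDriver major versions match"
    else
        echo "Chrome and ChromeDriver major versions differ" >&2
    fi
fi
```

If either program is missing, the script prints nothing and leaves the system untouched.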
In the future, I plan to create a pull request to the Bioconda recipes.
Installing with Homebrew for macOS users
Progress:
- [x] Build Homebrew Formula
- [x] Init a pull request to the brewsci/bio Tap.
- [ ] Be permitted by the brewsci/bio Tap.
Since the r2g formula is still waiting for approval from the brewsci/bio Tap, macOS users can download the r2g formula and add it manually on their local computer.
# Install Homebrew and add the tap
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"
brew tap brewsci/bio
# Download the r2g formula and put it in the correct directory:
/usr/local/Cellar/curl/7.72.0/bin/curl -o /usr/local/Homebrew/Library/Taps/brewsci/homebrew-bio/Formula/r2g.rb -fsSL https://raw.githubusercontent.com/yangwu91/r2g/master/brewsci-Formula/r2g.rb
# Install r2g:
brew install r2g
Then, the Google Chrome web browser and the matching version of ChromeDriver (or the selenium/standalone-chrome Docker image) need to be installed.
Manual installation for all platforms
Required third-party applications
r2g requires three third-party software packages: the NCBI SRA Toolkit, Trinity, and the Google Chrome web browser with ChromeDriver (or the selenium/standalone-chrome Docker image).
- NCBI SRA Toolkit
  - For Linux and macOS users, it can also be installed using Conda via the Bioconda channel:
    conda install -c bioconda sra-tools
    If the installed version of SRA Toolkit is above 2.10.3, you have to execute the following command before the first run:
    vdb-config --interactive
    Then press x to set up the default configs. This is a known issue that can't be avoided.
- Trinity
  - Follow the instructions to compile the source code. Please note that Trinity has its own dependencies, including samtools, Python 3 with NumPy, bowtie2, jellyfish, salmon, and trimmomatic. If you are compiling Trinity on macOS, please use the gcc compiler instead of the native clang compiler to avoid errors.
  - For macOS users, Trinity can be installed using Homebrew as well:
    brew tap brewsci/bio
    brew install trinity
  - For Linux users, Trinity can be installed easily using Conda, which takes care of the other dependencies:
    conda install -c bioconda trinity=2.8.5 numpy samtools=1.10
    The compatibility of Trinity version 2.8.5 with r2g has been fully tested; in theory, later versions should work too.
- Google Chrome web browser with ChromeDriver
  - Install the Google Chrome web browser and then download the corresponding version of ChromeDriver.
  - Or, you can simply run the selenium/standalone-chrome Docker image in the background (make sure you have permission to bind port 4444 on localhost):
    docker run -d -p 4444:4444 -v /dev/shm:/dev/shm selenium/standalone-chrome
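The SRA Toolkit note in the list above depends on a version threshold (above 2.10.3), which can be checked with a short script. The following is a rough sketch under stated assumptions: the helper names `version_gt` and `needs_vdb_config` are hypothetical, `sort` is assumed to support `-V` (version sort, as in GNU coreutils), and the installed version is assumed to be the first `x.y.z` token printed by `fastq-dump --version`.

```shell
# Succeed when dotted version $1 is strictly greater than $2.
# Assumption: sort(1) supports -V (version sort), e.g. GNU coreutils.
version_gt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Threshold from the note above: versions above 2.10.3 need the one-time setup.
needs_vdb_config() {
    version_gt "$1" "2.10.3"
}

# Inspect the installed toolkit only when fastq-dump is on PATH.
if command -v fastq-dump >/dev/null 2>&1; then
    # Assumption: the version is the first x.y.z token in the output.
    installed=$(fastq-dump --version 2>&1 | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)
    if [ -n "$installed" ] && needs_vdb_config "$installed"; then
        echo "SRA Toolkit $installed: run 'vdb-config --interactive' once before the first run"
    fi
fi
```

The script only reports whether the one-time setup applies; running `vdb-config --interactive` is still a manual step.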
Installing the r2g package
The r2g package has been deposited to PyPI, so it can be installed as follows:
pip install r2g
Setting up the environment
If the required third-party applications above were installed using Conda, no further setup is needed.
If you compiled or downloaded these packages yourself, either add them to $PATH with a command such as the following:
# Linux and macOS:
export PATH="$PATH:/path/to/fastq-dump:/path/to/Trinity:/path/to/chromedriver"
# Windows:
set PATH=%PATH%;DRIVER:\path\to\fastq-dump;DRIVER:\path\to\Trinity;DRIVER:\path\to\chromedriver
or fol
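Whichever way the environment is set up, it can help to confirm that the executables named in the PATH example above are actually reachable. This is a minimal sketch; the helper name `missing_tools` is hypothetical, and the list of executables is taken from the PATH example rather than from r2g itself.

```shell
# Print the name of every given executable that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Executable names taken from the PATH example above.
missing=$(missing_tools fastq-dump Trinity chromedriver)
if [ -n "$missing" ]; then
    echo "Not found on PATH:" $missing >&2
else
    echo "All required executables found on PATH"
fi
```

When the selenium/standalone-chrome Docker image is used instead of a local ChromeDriver, `chromedriver` is expected to be reported as missing.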
