Ray
Ray -- Parallel genome assemblies for parallel DNA sequencing
Ray assembler
Ray is a parallel de novo genome assembler that uses the message-passing interface (MPI) throughout and is implemented with peer-to-peer communication.
Ray is free software distributed under the terms of the GNU General Public License, version 3 (GPLv3).
Ray is implemented using RayPlatform, a message-passing-interface programming framework.
Ray is documented in
- Documentation/ (many files)
- MANUAL_PAGE.txt (command-line options, same as Ray -help)
- README.md (general)
- INSTALL.txt (quick installation)
Solutions (all bundled in a single product called Ray)
Standard:
- de novo genome assembly (works by default) => http://online.liebertpub.com/doi/abs/10.1089/cmb.2009.0238 => Documentation/README-heuristics
- quantification of contig abundances (works by default)
Metagenomics:
- Ray Meta: de novo metagenome assembly (works by default) => http://genomebiology.com/2012/13/12/R122
- Ray Communities: quantification of microbiome consortia members (with -search) => Documentation/BiologicalAbundances.txt
- Ray Communities: taxonomy profiling of samples (with -search and -with-taxonomy) => Documentation/Taxonomy.txt
- Ray Ontology: gene ontology profiling of samples (with -search and -gene-ontology) => Documentation/GeneOntology.txt
- Ray Surveyor: compare genomic content between samples (with -run-surveyor) => Documentation/Ray-Surveyor.md
Transcriptomics:
- de novo transcriptome assembly (works, but not extensively tested)
- quantification of transcript expression
Distributors
- Geeknet, Inc. http://sourceforge.net/projects/denovoassembler/
- GitHub, Inc. https://github.com/sebhtml/ray
- Debian (Software in the Public Interest, Inc.) http://packages.debian.org/sid/ray
- Ubuntu (Canonical Ltd.) https://launchpad.net/ubuntu/+source/ray
- Arch Linux (AUR Development Team) https://aur.archlinux.org/packages/ray/
- DNAnexus, Inc. https://platform.dnanexus.com/app/ray
- CloudBioLinux https://github.com/chapmanb/cloudbiolinux/blob/master/cloudbio/custom/bio_nextgen.py
In progress:
- Fedora (Red Hat, Inc.) https://bugzilla.redhat.com/show_bug.cgi?id=872783
- Galaxy (Galaxy Team) http://user.list.galaxyproject.org/How-do-I-add-Ray-to-Galaxy-Central-in-the-tool-shed-td4655623.html#none
- BaseSpace (Illumina, Inc.)
Website
- http://denovoassembler.sf.net
Code repositories
- http://github.com/sebhtml/ray (Ray plugins for genomics)
- http://github.com/sebhtml/RayPlatform (the engine RayPlatform)
If you want to contribute, clone the repository, make changes and I (Sébastien Boisvert) will pull from you after reviewing the code changes.
Other related repositories
- http://github.com/sebhtml/Ray-TestSuite (system tests & unit tests)
- http://github.com/sebhtml/Ray.web (Ray SourceForge web site)
- http://github.com/sebhtml/ray-logo (artworks)
Mailing lists
- Users: denovoassembler-users AT lists.sourceforge.net
- Read it on gmane: http://blog.gmane.org/gmane.science.biology.ray-genome-assembler
- Development/hacking: denovoassembler-devel AT lists.sourceforge.net
- SEQanswers: http://seqanswers.com/forums/showthread.php?t=4301
Installation
You need a C++ compiler (supporting C++ 1998), make, and an implementation of MPI (supporting MPI 2.2).
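Before compiling, it can help to confirm that the required tools are on the PATH. The following is a minimal sketch, not part of Ray: check_tools is a hypothetical helper, and the tool names in the example (mpicxx, mpiexec) are assumptions that depend on your MPI installation. It only checks that the commands exist; it does not verify C++ 1998 or MPI 2.2 support.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Ray): report which of the
# named tools are present on PATH.
# Returns 0 when every tool is found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example (tool names are assumptions; adjust to your MPI installation):
# check_tools make mpicxx mpiexec
```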
Compilation
tar xjf Ray-x.y.z.tar.bz2
cd Ray-x.y.z
make PREFIX=build
make install
ls build
Compilation using CMake
tar xjf Ray-x.y.z.tar.bz2
cd Ray-x.y.z
mkdir build
cd build
cmake ..
make
Change the compiler
make PREFIX=build2000 MPICXX=/software/openmpi-1.4.3/bin/mpicxx
make install
Tested C++ compilers: see Documentation/COMPILERS.txt
Parallel I/O
To compile with MPI I/O, use this:
make MPI_IO=y
Faster execution
Some processors support the popcnt instruction and other useful instruction-set extensions. With gcc, add -march=native to build Ray for the processor of the machine used for the compilation.
make PREFIX=Build.native DEBUG=n ASSERT=n EXTRA=" -march=native"
make install
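To see whether the build machine supports popcnt, one option on x86 Linux is to look at the flags line of /proc/cpuinfo. This is a sketch under that assumption: has_cpu_flag is a hypothetical helper, and other operating systems or architectures expose this information differently.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if the given CPU flag is listed in a
# cpuinfo-style file, 1 if it is not, 2 if the file cannot be read.
# The file defaults to /proc/cpuinfo (an x86 Linux assumption).
has_cpu_flag() {
    flag=$1
    cpuinfo=${2:-/proc/cpuinfo}
    [ -r "$cpuinfo" ] || return 2
    while IFS= read -r line; do
        case "$line" in
            flags*:*)
                # Pad with spaces so only whole flag names match.
                case " ${line#*:} " in
                    *" $flag "*) return 0 ;;
                esac
                ;;
        esac
    done < "$cpuinfo"
    return 1
}

# Example:
# has_cpu_flag popcnt && echo "popcnt is available"
```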
Another way to build Ray is to use whole-program optimization. With gcc, use this script:
./scripts/Build-Link-Time-Optimization.sh
Use large k-mers
make PREFIX=Ray-Large-k-mers MAXKMERLENGTH=64
# wait
make install
mpirun -np 512 Ray-Large-k-mers/Ray -k 63 -p lib1_1.fastq lib1_2.fastq \
-p lib2_1.fastq lib2_2.fastq -o DeadlyBug,Assembler=Ray,K=63
# wait
ls DeadlyBug,Assembler=Ray,K=63/Scaffolds.fasta
Compilation options
make PREFIX=build-3000 MAXKMERLENGTH=64 HAVE_LIBZ=y HAVE_LIBBZ2=y \
ASSERT=n FORCE_PACKING=y
# wait
make install
ls build-3000
See the Makefile for more options.
Run Ray
To run Ray on paired reads:
mpiexec -n 25 Ray -k31 -p lib1.left.fasta lib1.right.fasta -p lib2.left.fasta lib2.right.fasta -o RayOutput
ls RayOutput/Contigs.fasta
ls RayOutput/Scaffolds.fasta
ls RayOutput/
Using a configuration file
Ray can be run with a configuration file instead.
mpiexec -n 16 Ray Ray.conf
Content of Ray.conf:
-k 31 # this is a comment
-p lib1.left.fasta lib1.right.fasta
-p lib2.left.fasta lib2.right.fasta
-o RayOutput
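A configuration file like this one can also be generated from a script, which is convenient when launching several runs. The sketch below assumes one option per line, as in the example; write_ray_conf and its parameters are hypothetical names, and the read file names are the placeholders used in this README.

```shell
#!/bin/sh
# Hypothetical helper: write a Ray configuration file.
#   $1 = path of the file to write (default: Ray.conf)
#   $2 = k-mer length              (default: 31)
#   $3 = output directory          (default: RayOutput)
write_ray_conf() {
    conf=${1:-Ray.conf}
    k=${2:-31}
    out=${3:-RayOutput}
    cat > "$conf" <<EOF
-k $k
-p lib1.left.fasta lib1.right.fasta
-p lib2.left.fasta lib2.right.fasta
-o $out
EOF
}

# Example:
# write_ray_conf Ray.conf 31 RayOutput
# mpiexec -n 16 Ray Ray.conf
```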
Output files
RayOutput/Contigs.fasta and RayOutput/Scaffolds.fasta
Type Ray -help for a full list of options and outputs.
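As a quick sanity check after a run, you can count the sequences in either FASTA file by counting header lines. This is a generic FASTA check rather than a Ray feature, and count_fasta_records is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the number of records in a FASTA file,
# that is, the number of lines starting with '>'.
# Note: grep -c prints 0 and exits with status 1 when nothing matches.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Example:
# count_fasta_records RayOutput/Contigs.fasta
# count_fasta_records RayOutput/Scaffolds.fasta
```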
Color space
Ray assembles color-space reads and generates color-space contigs. Files must have the .csfasta extension. Nucleotide reads cannot be mixed with color-space reads. This is an experimental feature.
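Because the two kinds of reads cannot be mixed, a launch script may want to check the input list before starting a long job. The sketch below classifies files purely by the .csfasta extension, as described above; check_reads_not_mixed is a hypothetical helper and does not inspect file contents.

```shell
#!/bin/sh
# Hypothetical helper: return 0 when the given read files are either
# all color-space (.csfasta) or all nucleotide; return 1 when mixed.
check_reads_not_mixed() {
    color=0
    nucleotide=0
    for f in "$@"; do
        case "$f" in
            *.csfasta) color=1 ;;
            *)         nucleotide=1 ;;
        esac
    done
    if [ "$color" -eq 1 ] && [ "$nucleotide" -eq 1 ]; then
        echo "error: color-space (.csfasta) and nucleotide reads are mixed" >&2
        return 1
    fi
    return 0
}

# Example:
# check_reads_not_mixed lib1_1.csfasta lib1_2.csfasta || exit 1
```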
Publications
http://denovoassembler.sf.net/publications.html
Code
Code documentation
cd code
doxygen DoxygenConfigurationFile
cd DoxygenDocumentation/html
firefox index.html
Useful links
Cloud computing
- http://aws.amazon.com/ec2/hpc-applications/
- https://cloud.genomics.cn/
- http://szdaily.sznews.com/html/2011-08/04/content_1689998.htm
- http://www.nature.com/nbt/journal/v28/n7/full/nbt0710-691.html
Message-passing interface
- http://dskernel.blogspot.com/2011/07/understand-main-loop-of-message-passing.html
- http://cw.squyres.com/
- http://blogs.cisco.com/performance/
- http://www.parawiki.org/index.php/Message_Passing_Interface#Peer_to_Peer_Communication
Funding
Doctoral Award to S.B., Canadian Institutes of Health Research (CIHR)
Authors
see AUTHORS
Compile Ray on Microsoft Windows with Microsoft Visual Studio
see Documentation/VISUAL_STUDIO.txt
