CloudBrush
A De Novo Next Generation Genomic Sequence Assembler Based on String Graph and MapReduce Cloud Computing Framework
Install / Use
/learn @CSCLabTW/CloudBrushREADME
CloudBrush
CloudBrush is a de novo genome assembler based on the string graph and mapreduce framework, it is released under Apache License 2.0 (http://www.apache.org/licenses/LICENSE-2.0) as a Free and Open Source Software Project.
More details about CloudBrush Project you can get under: https://github.com/ice91/CloudBrush
requirement: hadoop cluster
step 1: convert *.fastq to *.sfa
e.g. perl data/preprocessor.pl -renum data/Ec10k.sim.fastq > data/Ec10k.sim.sfa
step 2: upload *.sfa to HDFS
e.g. hadoop fs -put data/Ec10k.sim.sfa Ec10k
(You can use CloudRS(ReadStackCorrector) as preprocesser to correct sequencing errors and prepare *.sfa in HDFS. More details about CloudRS Project you can get under: https://github.com/ice91/ReadStackCorrector)
step 3 perform the CloudBrush using hadoop cluster
e.g. hadoop jar CloudBrush.jar -reads Ec10k -asm Ec10k_Brush -k 21 -readlen 36
e.g. hadoop jar CloudBrush.jar -reads Ec10k -asm Ec10k_Brush -k 21 -readlen 36 -javaopts -Djava.util.Arrays.useLegacyMergeSort=True
step 4 download *.fasta from HDFS
e.g. hadoop fs -cat Ec10k_Brush/* > Ec10k_Brush.fasta
[Reference] Yu-Jung Chang, Chien-Chih Chen, Chuen-Liang Chen and Jan-Ming Ho, "De Novo Next Generation Genomic Sequence Assembler Based on String Graph and MapReduce Cloud Computing Framework," BMC Genomics, volume 13, number Suppl 7, pages S28, December 2012.
Related Skills
node-connect
347.6kDiagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps
frontend-design
108.4kCreate distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.
openai-whisper-api
347.6kTranscribe audio via OpenAI Audio Transcriptions API (Whisper).
qqbot-media
347.6kQQBot 富媒体收发能力。使用 <qqmedia> 标签,系统根据文件扩展名自动识别类型(图片/语音/视频/文件)。
