Bashbone
A bash/biobash library for workflow and pipeline design
A bash and biobash library for workflow and pipeline design within but not restricted to the scope of Next Generation Sequencing (NGS) data analyses.
Outline
- Bashbone
- License
- Download
- Bash library usage (without full installation)
- Installation
- Biobash library usage (requires installation)
- Third-party software
- Supplementary information
- Closing remarks
For developers - bash library
- Get a full bash error stack trace in interactive shells or within scripts
- Write command line code in your favorite programming language via Here-documents for later orchestrated execution
- Add object-oriented programming (oop) like syntactic sugar to bash variables and arrays to avoid complex parameter-expansions, variable-expansions and brace-expansions
- Execute commands in parallel on your machine or submit them as jobs to a workflow manager like sun grid engine (SGE) and log stdout, stderr and exit codes per job
- Benchmark runtime and memory usage
- Infer number of parallel instances according to targeted memory consumption or targeted threads per instance
- Log execution of bash functions at different verbosity levels
- Extend the library by custom bash functions which will inherit
- Stack trace
- Termination of all function related (sub-)processes, including asynchronous background jobs upon error/exit or when reaching prompt-command (interactive shell)
- Removal of temporary files created via mktemp and execution of custom cleanup commands upon error/exit or when reaching prompt-command (interactive shell)
- Profit from helper functions that implement
- Multi-threaded joining of multiple files
- Multi-threaded sorting
- Multi-threaded de-compression
- Multi-threaded compression plus indexing for random access by byte offset or line number without noticeable overhead
- Multi-threaded application of commands on a compressed and indexed or flat file on a per-line or per-chunk basis
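The inference of parallel instances from a targeted memory consumption or targeted threads per instance boils down to an integer division. The following is a minimal sketch in plain bash, not bashbone's own implementation. The function name infer_instances, its argument order and the example numbers are assumptions made for illustration only.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of bashbone: derive how many instances of a
# tool may run side by side, given a thread budget and a memory budget.
infer_instances(){
    local total_threads=$1 total_mem_mb=$2 mem_per_instance_mb=$3 threads_per_instance=$4
    local by_mem=$(( total_mem_mb / mem_per_instance_mb ))
    local by_threads=$(( total_threads / threads_per_instance ))
    # the tighter of the two limits wins
    local n=$(( by_mem < by_threads ? by_mem : by_threads ))
    # always allow at least one instance
    if (( n < 1 )); then
        n=1
    fi
    echo "$n"
}

# 32 threads and 64000 MB in total, 10000 MB and 4 threads per instance
infer_instances 32 64000 10000 4
```

With these example numbers, memory is the limiting factor (6 instances by memory versus 8 by threads).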
For users - biobash library
- Get a full bash error stack trace in interactive shells or within scripts
- Easily design multi-threaded pipelines to perform NGS related tasks
- Use many best-practice parameterized and heavily run-time tweaked software wrappers
- Most software related parameters will be inferred directly from input data, so that all functions require just a minimal set of input arguments
- Benefit from a non-root stand-alone installer without need for any prerequisites
- Get genomes, annotations from Ensembl, variants from GATK resource bundle and RAW sequencing data from NCBI Sequence Read Archive (SRA)
Covered tasks
- For paired-end and single-end derived raw sequencing or prior mapped read data
- RNA-Seq protocols (RNA, RIP, m6A, ..)
- DNA-Seq protocols (WGS, ChIP, Chip-exo, ATAC, CAGE, Quant, Cut&Tag, ..)
- Bisulfite converted DNA-Seq protocols (WGBS, RRBS)
- Data quality analysis and preprocessing
- adapter and poly-mono/di-nucleotide clipping
- quality trimming
- error correction
- artificial rRNA depletion
- Read alignment and post-processing
- knapsack problem based slicing of alignment files for parallel task execution
- sorting, filtering
- UMI based de-duplication or removal of optical and PCR duplicates
- generation of pools and pseudo-replicates
- read group modification, split N-cigar reads, left-alignment and base quality score recalibration
- Gene fusion detection
- Methyl-C calling and prediction of differentially methylated regions
- Expression analysis
- Read quantification (also from quasi-mappings), TPM and Z-score normalization and heatmap plotting
- Inference of strand specific library preparation methods
- Inference of differential expression as well as clusters of co-expression
- Detection of differential splice junctions and differential exon usage
- Gene ontology (GO) gene set enrichment and over representation analysis plus semantic similarity based clustering and visualizations
- Implementation of ENCODE v3 best-practice ChIP-Seq Peak calling
- Peak calling from RIP-Seq, MeRIP-Seq, m6A-Seq and other related IP-Seq data
- Inference of effective genome sizes
- Variant detection from DNA or RNA sequencing experiments
- Integration of multiple solutions for germline and somatic calling
- VCF normalization
- Tree reconstruction from homozygous sites
- ssGSEA and survival analysis from TCGA cancer expression data
- Genome and SRA data retrieval
- Genome to transcriptome conversion
- Data visualization via IGV batch processing
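As an illustration of the TPM normalization named in the list above, here is a minimal sketch using bash and awk. It is not the bashbone implementation. The function name tpm and the three-column input layout (feature id, length in bp, read count, tab-separated) are assumptions, and feature lengths are assumed to be greater than zero.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not a bashbone function.
# Input on stdin: tab-separated lines of <feature id> <length in bp> <read count>
tpm(){
    awk -F'\t' '
        {
            id[NR] = $1
            rate[NR] = $3 / $2      # reads per base
            sum += rate[NR]
        }
        END {
            # scale each rate so that all values sum up to one million
            for (i = 1; i <= NR; i++)
                printf "%s\t%.1f\n", id[i], (sum > 0 ? rate[i] / sum * 1e6 : 0)
        }
    '
}

# two features with equal read counts but different lengths
printf 'geneA\t1000\t100\ngeneB\t2000\t100\n' | tpm
```

In this toy input both features have the same read count, so the shorter one receives twice the TPM value of the longer one.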
License
The whole project is licensed under the GPL v3 (see LICENSE file for details), except for the third-party tools set up during installation. Please refer to their corresponding licenses.
Copyleft (C) 2020, Konstantin Riege
Download
This downloads a copy that includes the latest developments
git clone --recursive https://github.com/Hoffmann-Lab/bashbone
To check out the latest release (irregularly compiled) do
cd bashbone
git checkout $(git describe --tags --abbrev=0)
Bash library usage (without full installation)
Do's and don'ts
When used in a script, bashbone is meant to be sourced at the very top to handle positional arguments (-a "$@") and to re-execute the script under its own process group ID (-r true) in order to take care of proper termination. It enables error stack tracing and sub-process handling globally by setting traps for EXIT, ERR, RETURN and INT, so do not override them. If your script intends to spawn daemons, use setsid or disable bashbone first.
#!/usr/bin/env bash
source <path/to/bashbone>/activate.sh -r true -a "$@"
# do stuff
# now spawn daemons
# either detach them via setsid
setsid daemon1 &
# or disable bashbone first
bashbone -x
daemon2 &
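To give an idea of what an ERR trap based stack trace looks like, here is a minimal sketch in plain bash. It is not bashbone's implementation. The names print_trace, inner, outer and demo are hypothetical, and the trap is set inside a subshell so that it does not interfere with anything else.

```shell
#!/usr/bin/env bash
# Plain bash sketch of an ERR trap that prints the function call stack.
# All function names here are hypothetical and not part of bashbone.

print_trace(){
    local i
    echo "error in:"
    # FUNCNAME[0] is print_trace itself, so the walk starts at index 1
    for (( i=1; i<${#FUNCNAME[@]}; i++ )); do
        echo "  ${FUNCNAME[$i]} (called from line ${BASH_LINENO[$i]})"
    done
}

inner(){ false; }     # stands in for a failing command
outer(){ inner; }

demo(){
    (   # subshell keeps the trap and shell options local to this demo
        set -o errtrace                   # let functions inherit the ERR trap
        trap 'print_trace; exit 1' ERR
        outer
        echo "not reached"
    )
    echo "subshell exited with code $?"
}

demo
```

The trace lists the failing function first and its callers below it. Without errtrace, the trap would not be inherited by inner and outer.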
Please note that error tracing in bash is circumvented by || and && constructs. Therefore, avoid them in any context of function calls.
#!/usr/bin/env bash
source <path/to/bashbone>/activate.sh -r true -a "$@"
function myfun(){
cat file_not_found
echo "error ignored. starting time consuming calculation now."
}
# DON'T !
myfun || echo "failed with code $?"
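The effect can be reproduced without bashbone. The following plain bash sketch calls the same function twice, once inside an || list and once on its own, with a simple ERR trap in place. The trap message and the demo wrapper are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Plain bash sketch, bashbone is not involved here.

myfun(){
    false                      # stands in for a failing command
    echo "error ignored"
}

demo(){
    (   # subshell keeps the trap and shell options local to this demo
        set -o errtrace                    # let functions inherit the ERR trap
        trap 'echo "ERR trap fired"' ERR
        echo "--- called inside an || list"
        myfun || echo "failed with code $?"
        echo "--- called on its own"
        myfun
    )
}

demo
```

When run, the first call should print only the function's own message, because bash does not execute the ERR trap for commands inside a function that is called as part of an || or && list. The fallback after || does not run either, since the function returns the exit code of its last command. Only the second call triggers the trap.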
Quick start
To get all third-party tools set up and subsequently all biobash bashbone functions to work properly, see the Installation section.
For a lite installation that gets you the minimum required tools (GNU parallel, gztool, mdless) in order to make use of developer functions, execute
scripts/setup.sh -i lite -d <path/to/installation>
To see the usage, do
scripts/setup.sh -h
DESCRIPTION
Bashbone setup routine
SYNOPSIS
setup.sh -i [all|upgrade] -d [path]
OPTIONS
-i | --install [lite|all|upgrade] : install into given directory
-g | --use-config : use supplied yaml files and URLs instead of cutting edge tools
-d | --directory [path] : installation path
-t | --threads [value] : threads - predicted default: 32
-l | --log [path] : log file - default: [-d]/install.log
-v | --verbose : enable verbose mode
-h | --help : prints this message
DEVELOPER OPTIONS
-s | --source [path,..] : source file(s) to overload compile::[lite|all|upgrade|<tool>] functions
-i | --install [<tool>,..] : install into given directory
Now load the bashbone library in an interactive terminal session. Note that none of your environment settings will be modified.
source <path/of/installation>/latest/bashbone/activate.sh
To see the activate script usage, do
source <path/of/installation>/latest/bashbone/activate.sh -h
This is bashbone activation script.
To see lists of available options and functions, source me and execute bashbone -h
Usage:
-h | this help
-l <legacymode> | true/false let commander inerts line breaks, thus crafts one-liners from makecmd here-documents
default: false
-i <path> | to installation root <path>/latest
default: infe
