Pafchainer
A tool for merging WFMASH's alignment chains using the WFA algorithm.
pafchainer
A tool for processing PAF (Pairwise Alignment Format) files and merging alignment chains using the Wavefront Alignment Algorithm (WFA).
Overview
pafchainer merges consecutive alignments in a chain by connecting them with wavefront alignments. It erodes the boundaries of adjacent alignments and fills in the gaps with new alignments, creating a single, contiguous alignment path. This is particularly useful for genomic alignment workflows that require complete end-to-end alignments.
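The boundary erosion described above can be sketched in Rust as follows. This is an illustration under stated assumptions, not pafchainer's actual code: `CigarOp`, `consumed`, and `erode_end` are hypothetical names, and erosion is counted here in alignment columns, which may differ from how the tool measures base pairs.

```rust
/// One CIGAR operation: a run length plus an op code such as '=', 'X', 'I', 'D', or 'M'.
/// (Hypothetical type, introduced only for this sketch.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CigarOp {
    pub len: u32,
    pub op: char,
}

/// Bases consumed on (query, target) by `len` columns of `op`.
/// Ops other than = X M I D are treated as consuming nothing in this sketch.
fn consumed(op: char, len: u32) -> (u32, u32) {
    match op {
        '=' | 'X' | 'M' => (len, len),
        'I' => (len, 0),
        'D' => (0, len),
        _ => (0, 0),
    }
}

/// Remove up to `n` alignment columns from the end of `ops`.
/// Returns how many query and target bases were removed, so the caller
/// can move the alignment's end coordinates back by the same amounts.
pub fn erode_end(ops: &mut Vec<CigarOp>, n: u32) -> (u32, u32) {
    let mut remaining = n;
    let mut query_removed = 0;
    let mut target_removed = 0;
    while remaining > 0 {
        let last = match ops.last_mut() {
            Some(op) => op,
            None => break, // nothing left to erode
        };
        let take = remaining.min(last.len);
        let (dq, dt) = consumed(last.op, take);
        query_removed += dq;
        target_removed += dt;
        last.len -= take;
        remaining -= take;
        if last.len == 0 {
            ops.pop();
        }
    }
    (query_removed, target_removed)
}
```

Eroding the start of the next alignment in the chain would be the mirror image of this function. The sequence between the two eroded ends is then what a gap alignment would need to cover.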
Features
- Processes PAF files with chain information
- Uses the Wavefront Alignment Algorithm for fast and accurate gap alignments
- Indexes chains for efficient processing
- Handles both compressed (BGZF) and uncompressed PAF files
- Configurable boundary erosion size
- Outputs in PAF or SAM format
- Multi-threaded processing
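For readers unfamiliar with PAF, the sketch below parses one line into its 12 mandatory tab-separated columns plus any optional `key:type:value` tags. It is a minimal illustration, not pafchainer's parser: `PafRecord` and `parse_paf_line` are hypothetical names. Because this README does not say which tag carries the chain information, the sketch simply keeps every tag in a map.

```rust
use std::collections::HashMap;

/// The 12 mandatory PAF columns plus optional tags (hypothetical type for this sketch).
#[derive(Debug, Clone, PartialEq)]
pub struct PafRecord {
    pub query_name: String,
    pub query_len: u64,
    pub query_start: u64,
    pub query_end: u64,
    pub strand: char,
    pub target_name: String,
    pub target_len: u64,
    pub target_start: u64,
    pub target_end: u64,
    pub matches: u64,
    pub block_len: u64,
    pub mapq: u8,
    /// Optional tags keyed by tag name, e.g. "cg" for the CIGAR string.
    pub tags: HashMap<String, String>,
}

/// Parse one PAF line; returns None if a mandatory column is missing or malformed.
pub fn parse_paf_line(line: &str) -> Option<PafRecord> {
    let f: Vec<&str> = line.trim_end().split('\t').collect();
    if f.len() < 12 {
        return None;
    }
    let mut tags = HashMap::new();
    for field in &f[12..] {
        // Tags look like "cg:Z:10=1X5="; split into name, type, and value.
        let mut parts = field.splitn(3, ':');
        if let (Some(key), Some(_ty), Some(value)) = (parts.next(), parts.next(), parts.next()) {
            tags.insert(key.to_string(), value.to_string());
        }
    }
    Some(PafRecord {
        query_name: f[0].to_string(),
        query_len: f[1].parse().ok()?,
        query_start: f[2].parse().ok()?,
        query_end: f[3].parse().ok()?,
        strand: f[4].chars().next()?,
        target_name: f[5].to_string(),
        target_len: f[6].parse().ok()?,
        target_start: f[7].parse().ok()?,
        target_end: f[8].parse().ok()?,
        matches: f[9].parse().ok()?,
        block_len: f[10].parse().ok()?,
        mapq: f[11].parse().ok()?,
        tags,
    })
}
```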
Installation
Clone the repository and build using Cargo:
git clone https://github.com/waveygang/pafchainer
cd pafchainer
cargo build --release
Usage
pafchainer --paf input.paf --query query.fa --target target.fa [OPTIONS]
Required Arguments:
- --paf, -p: Input PAF file with chain information
- --query, -q: Query FASTA file
- --target, -t: Target FASTA file
Optional Arguments:
- --erosion-size, -e: Size of boundary erosion in base pairs (default: 100)
- --output, -o: Output file (default: stdout)
- --sam: Output in SAM format instead of PAF
- --threads, -t: Number of threads to use (default: 4)
- --verbose, -v: Verbosity level (0=error, 1=info, 2=debug)
Examples
Basic usage:
pafchainer -p alignments.paf -q query.fa -t reference.fa -o connected.paf
Output in SAM format:
pafchainer -p alignments.paf -q query.fa -t reference.fa --sam -o aligned.sam
How It Works
- Builds or loads a chain index for efficient access to PAF entries
- For each chain, erodes boundaries between adjacent entries
- Performs wavefront alignment to connect the eroded regions
- Merges CIGAR strings to create a single continuous alignment
- Outputs the merged alignment in PAF or SAM format
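The CIGAR-merging step can be pictured as concatenating three pieces: the eroded left alignment, the new gap alignment, and the eroded right alignment. Adjacent runs of the same operation are fused along the way. The sketch below assumes that model. `CigarOp`, `merge_cigars`, and `cigar_to_string` are hypothetical names and are not taken from the pafchainer source.

```rust
/// One CIGAR operation: a run length plus an op code (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CigarOp {
    pub len: u32,
    pub op: char,
}

/// Concatenate CIGAR pieces in order, fusing neighbouring runs of the same op
/// so that, for example, "10=" followed by "3=" becomes "13=".
pub fn merge_cigars(pieces: &[&[CigarOp]]) -> Vec<CigarOp> {
    let mut merged: Vec<CigarOp> = Vec::new();
    for piece in pieces {
        for &op in piece.iter() {
            if op.len == 0 {
                continue; // skip empty runs left behind by erosion
            }
            if let Some(last) = merged.last_mut() {
                if last.op == op.op {
                    last.len += op.len;
                    continue;
                }
            }
            merged.push(op);
        }
    }
    merged
}

/// Render operations back into a CIGAR string such as "13=1X10=2D".
pub fn cigar_to_string(ops: &[CigarOp]) -> String {
    ops.iter().map(|o| format!("{}{}", o.len, o.op)).collect()
}
```

In a full implementation the merged record's start and end coordinates would come from the first and last entries of the chain. That bookkeeping is omitted here.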
