Fastp
An ultra-fast all-in-one FASTQ preprocessor (QC/adapters/trimming/filtering/splitting/merging...)
fastp
A tool designed to provide ultrafast all-in-one preprocessing and quality control for FastQ data.
This tool is designed for processing short reads (e.g. Illumina NovaSeq, MGI). If you are looking for a tool to process long reads (e.g. Nanopore, PacBio, Cyclone), please use fastplong.
fastp supports batch processing of multiple FASTQ files in a folder; see the batch processing section.
If you use fastp in your work, you can cite fastp as: Shifu Chen. fastp 1.0: An ultra-fast all-round tool for FASTQ data quality control and preprocessing. iMeta 4.5 (2025): e70078
- features
- simple usage
- examples of report
- get fastp
- input and output
- filtering
- adapters
- per read cutting by quality score
- base correction for PE data
- global trimming
- polyG tail trimming
- polyX tail trimming
- unique molecular identifier (UMI) processing
- output splitting
- overrepresented sequence analysis
- merge paired-end reads
- duplication rate and deduplication
- batch processing
- all options
- citations
features
- comprehensive quality profiling of data both before and after filtering (quality curves, base contents, KMER, Q20/Q30, GC ratio, duplication, adapter contents...)
- filter out bad reads (too low quality, too short, or too many N...)
- cut low quality bases from each read at its 5' and 3' ends by evaluating the mean quality in a sliding window (like Trimmomatic but faster).
- trim all reads in front and tail
- cut adapters. Adapter sequences can be automatically detected, which means you don't have to input the adapter sequences to trim them.
- correct mismatched base pairs in the overlapped regions of paired-end reads, if one base has high quality while the other has ultra-low quality
- trim polyG at 3' ends, which is commonly seen in NovaSeq/NextSeq data. Trim polyX at 3' ends to remove unwanted polyX tails (e.g. polyA tails in mRNA-Seq data)
- preprocess unique molecular identifier (UMI) enabled data and shift the UMI to the sequence name.
- report results in JSON format for further interpretation.
- visualize quality control and filtering results on a single HTML page (like FASTQC but faster and more informative).
- split the output into multiple files (0001.R1.gz, 0002.R1.gz...) to support parallel processing. Two modes can be used: limiting the total number of split files, or limiting the number of lines in each split file.
- support long reads (data from PacBio / Nanopore devices).
- support reading from STDIN and writing to STDOUT
- support interleaved input
- support ultra-fast FASTQ-level deduplication
- ...
If you find a bug or have an additional requirement for fastp, please file an issue: https://github.com/OpenGene/fastp/issues/new
simple usage
- for single end data (not compressed)
fastp -i in.fq -o out.fq
- for paired end data (gzip compressed)
fastp -i in.R1.fq.gz -I in.R2.fq.gz -o out.R1.fq.gz -O out.R2.fq.gz
By default, the HTML report is saved to fastp.html (can be specified with -h option), and the JSON report is saved to fastp.json (can be specified with -j option).
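When several samples are processed in the same directory, each run writes to the same default fastp.html and fastp.json, so it can help to name the reports per sample. The sketch below is only an illustration: the helper name `fastp_pe_cmd` and the file naming scheme (`<sample>.R1.fq.gz`, `<sample>.clean.R1.fq.gz`) are assumptions, not fastp conventions. It uses only the `-i`, `-I`, `-o`, `-O`, `-h` and `-j` options described above, and prints the command instead of running it.

```shell
# Hypothetical helper: build a paired-end fastp command with per-sample reports.
# The naming scheme (<sample>.R1.fq.gz, <sample>.clean.R1.fq.gz) is an
# assumption for illustration only.
fastp_pe_cmd() {
    sample="$1"
    printf 'fastp -i %s.R1.fq.gz -I %s.R2.fq.gz -o %s.clean.R1.fq.gz -O %s.clean.R2.fq.gz -h %s.html -j %s.json\n' \
        "$sample" "$sample" "$sample" "$sample" "$sample" "$sample"
}

# Print the command for review before running it.
fastp_pe_cmd sampleA
```

Printing the command first makes it easy to check the file names. For sample names without spaces or shell metacharacters, the printed line can then be pasted into a terminal or piped to `sh`.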
examples of report
fastp creates reports in both HTML and JSON format.
- HTML report: http://opengene.org/fastp/fastp.html
- JSON report: http://opengene.org/fastp/fastp.json
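The JSON report can be read by scripts. The sketch below prints the read counts before and after filtering. It assumes the report contains `summary.before_filtering.total_reads` and `summary.after_filtering.total_reads`, so check the layout of your own fastp.json first. The function name `fastp_read_counts` is hypothetical, and `python3` is assumed to be installed.

```shell
# Hypothetical helper: summarize read counts from a fastp JSON report.
# Assumption: the report has summary.before_filtering.total_reads and
# summary.after_filtering.total_reads; verify against your own report.
fastp_read_counts() {
    report="${1:-fastp.json}"
    python3 - "$report" <<'PY'
import json
import sys

# Load only the "summary" object from the report file given as argument.
with open(sys.argv[1]) as handle:
    summary = json.load(handle)["summary"]

before = summary["before_filtering"]["total_reads"]
after = summary["after_filtering"]["total_reads"]
# Guard against an empty input file (zero reads before filtering).
kept = 100.0 * after / before if before else 0.0
print(f"before={before} after={after} kept={kept:.1f}%")
PY
}
```

Call it as `fastp_read_counts fastp.json`, or pass the path given to `-j` if a different report name was used.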
get fastp
install with Bioconda
# note: the fastp version in bioconda may not be the latest
conda install -c bioconda fastp
or download the latest prebuilt binary for Linux users
This binary was compiled on CentOS, and tested on CentOS/Ubuntu
# download the latest build
wget http://opengene.org/fastp/fastp
chmod a+x ./fastp
# or download a specific version, e.g. fastp v0.23.4
wget http://opengene.org/fastp/fastp.0.23.4
mv fastp.0.23.4 fastp
chmod a+x ./fastp
or compile from source
fastp depends on libisal, libdeflate and libhwy (Google Highway >= 1.1.0). Please install all three before building.
You can install all dependencies at once with conda:
conda install -c conda-forge isa-l libdeflate libhwy
Or install them individually using your system package manager:
Step 1: install isa-l
Install via brew install isa-l (macOS) or apt install libisal-dev (Ubuntu, dynamic linking only). Note: Ubuntu's libisal-dev does not ship a static library (.a). For static linking, compile from source (requires nasm, autoconf, automake, libtool):
git clone --depth=1 --branch v2.31.0 https://github.com/intel/isa-l.git
cd isa-l
./autogen.sh
./configure --prefix=/usr --libdir=/usr/lib64
make -j
sudo make install
Step 2: install libdeflate
Install via package manager: apt install libdeflate-dev (Ubuntu) or brew install libdeflate (macOS). Or compile from source:
git clone https://github.com/ebiggers/libdeflate.git
cd libdeflate
cmake -B build
cmake --build build
sudo cmake --install build
Step 3: install Highway
Google Highway (>= 1.1.0) provides portable SIMD acceleration. Install via brew install highway (macOS) or conda install -c conda-forge libhwy. Note: apt install libhwy-dev on Ubuntu 24.04 provides 1.0.7, which is too old; compile from source instead:
git clone --depth=1 --branch 1.3.0 https://github.com/google/highway.git
cd highway
cmake -B build -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=OFF -DHWY_ENABLE_TESTS=OFF -DHWY_ENABLE_EXAMPLES=OFF
cmake --build build
sudo cmake --install build
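Before building fastp, it may be worth checking that the installed Highway meets the >= 1.1.0 requirement. The sketch below compares version strings with `sort -V`. It assumes a `sort` that supports version sorting (e.g. GNU coreutils) and that Highway is discoverable through pkg-config under the name `libhwy`; the helper name `version_ge` is hypothetical.

```shell
# Return success if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort): the smaller version sorts first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Assumption: Highway installs a pkg-config file named libhwy.
required="1.1.0"
installed="$(pkg-config --modversion libhwy 2>/dev/null || true)"

if [ -z "$installed" ]; then
    echo "Highway not found via pkg-config"
elif version_ge "$installed" "$required"; then
    echo "Highway $installed is new enough"
else
    echo "Highway $installed is older than $required; build it from source"
fi
```

If pkg-config cannot find the library, that does not necessarily mean it is missing. It may be installed in a prefix that is not on `PKG_CONFIG_PATH`.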
Step 4: download and build fastp
# get source (you can also use a browser to download from master or releases)
git clone
