Fastp

An ultra-fast all-in-one FASTQ preprocessor (QC/adapters/trimming/filtering/splitting/merging...)


fastp

A tool designed to provide ultrafast all-in-one preprocessing and quality control for FastQ data.

This tool is designed for processing short reads (e.g. Illumina NovaSeq, MGI). If you are looking for a tool to process long reads (e.g. Nanopore, PacBio, Cyclone), please use fastplong.

fastp supports batch processing of multiple FASTQ files in a folder; see the batch processing section.

If you use fastp in your work, you can cite fastp as: Shifu Chen. fastp 1.0: An ultra-fast all-round tool for FASTQ data quality control and preprocessing. iMeta 4.5 (2025): e70078

features

  1. comprehensive quality profiling of the data both before and after filtering (quality curves, base contents, KMER, Q20/Q30, GC ratio, duplication, adapter contents...)
  2. filter out bad reads (too low quality, too short, or too many N...)
  3. cut low quality bases at the 5' and 3' ends of each read by evaluating the mean quality in a sliding window (like Trimmomatic but faster).
  4. trim all reads in front and tail
  5. cut adapters. Adapter sequences can be automatically detected, which means you don't have to input the adapter sequences to trim them.
  6. correct mismatched base pairs in the overlapped regions of paired end reads, when one base has high quality and the other has ultra low quality
  7. trim polyG in 3' ends, which is commonly seen in NovaSeq/NextSeq data. Trim polyX in 3' ends to remove unwanted polyX tails (e.g. polyA tails in mRNA-Seq data)
  8. preprocess unique molecular identifier (UMI) enabled data, moving the UMI into the sequence name.
  9. report JSON format result for further interpreting.
  10. visualize quality control and filtering results on a single HTML page (like FASTQC but faster and more informative).
  11. split the output into multiple files (0001.R1.gz, 0002.R1.gz...) to support parallel processing. Two modes can be used: limiting the total number of split files, or limiting the number of lines in each split file.
  12. support long reads (data from PacBio / Nanopore devices).
  13. support reading from STDIN and writing to STDOUT
  14. support interleaved input
  15. support ultra-fast FASTQ-level deduplication
  16. ...

If you find a bug or have an additional requirement for fastp, please file an issue: https://github.com/OpenGene/fastp/issues/new

simple usage

  • for single end data (not compressed)
fastp -i in.fq -o out.fq
  • for paired end data (gzip compressed)
fastp -i in.R1.fq.gz -I in.R2.fq.gz -o out.R1.fq.gz -O out.R2.fq.gz

By default, the HTML report is saved to fastp.html (can be specified with -h option), and the JSON report is saved to fastp.json (can be specified with -j option).
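When several samples are processed in the same directory, the default report names would be overwritten on each run. The sketch below gives each report a per-sample name using only the -i, -o, -h and -j options described above. The helper sample_name and the file names are hypothetical, and the fastp call is skipped unless fastp is installed and the input file exists.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive a sample name from a FASTQ path by
# dropping the directory and the .gz / .fq / .fastq suffixes.
sample_name() {
    local base
    base=$(basename "$1")
    base=${base%.gz}
    base=${base%.fq}
    base=${base%.fastq}
    printf '%s\n' "$base"
}

input=in.fq                      # example input name, as in the usage above
name=$(sample_name "$input")     # "in" for in.fq

# Run fastp only when it is installed and the input is present.
if command -v fastp >/dev/null 2>&1 && [ -f "$input" ]; then
    fastp -i "$input" -o "${name}.filtered.fq" \
          -h "${name}.html" -j "${name}.json"
fi
```

With this naming, a run on in.fq writes in.html and in.json instead of fastp.html and fastp.json.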

examples of report

fastp creates reports in both HTML and JSON format.

  • HTML report: http://opengene.org/fastp/fastp.html
  • JSON report: http://opengene.org/fastp/fastp.json

get fastp

install with Bioconda

# note: the fastp version in bioconda may not be the latest
conda install -c bioconda fastp

or download the latest prebuilt binary for Linux users

This binary was compiled on CentOS, and tested on CentOS/Ubuntu

# download the latest build
wget http://opengene.org/fastp/fastp
chmod a+x ./fastp

# or download a specific version, e.g. fastp v0.23.4
wget http://opengene.org/fastp/fastp.0.23.4
mv fastp.0.23.4 fastp
chmod a+x ./fastp
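For scripted installs, the two download variants above can be wrapped in one small helper. This is only a sketch: fastp_url is a hypothetical function, and it assumes that other versions follow the same fastp.<version> URL pattern as the v0.23.4 example, which the text above does not guarantee.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the download URL for the latest build
# (no argument) or for a given version (assumed URL pattern).
fastp_url() {
    local version=${1:-}
    if [ -z "$version" ]; then
        printf 'http://opengene.org/fastp/fastp\n'
    else
        printf 'http://opengene.org/fastp/fastp.%s\n' "$version"
    fi
}

# Example (uncomment to actually download and make executable):
# wget -O fastp "$(fastp_url 0.23.4)" && chmod a+x ./fastp
```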

or compile from source

fastp depends on libisal, libdeflate and libhwy (Google Highway >= 1.1.0). Please install all three before building.

You can install all dependencies at once with conda:

conda install -c conda-forge isa-l libdeflate libhwy

Or install them individually using your system package manager:

Step 1: install isa-l

Install via brew install isa-l (macOS) or apt install libisal-dev (Ubuntu, dynamic linking only). Note: Ubuntu's libisal-dev does not ship a static library (.a). For static linking, compile from source (requires nasm, autoconf, automake, libtool):

git clone --depth=1 --branch v2.31.0 https://github.com/intel/isa-l.git
cd isa-l
./autogen.sh
./configure --prefix=/usr --libdir=/usr/lib64
make -j
sudo make install

Step 2: install libdeflate

Install via package manager: apt install libdeflate-dev (Ubuntu) or brew install libdeflate (macOS). Or compile from source:

git clone https://github.com/ebiggers/libdeflate.git
cd libdeflate
cmake -B build
cmake --build build
sudo cmake --install build

Step 3: install Highway

Google Highway (>= 1.1.0) provides portable SIMD acceleration. Install via brew install highway (macOS) or conda install -c conda-forge libhwy. Note: apt install libhwy-dev on Ubuntu 24.04 provides 1.0.7, which is too old, so compile from source instead:

git clone --depth=1 --branch 1.3.0 https://github.com/google/highway.git
cd highway
cmake -B build -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=OFF -DHWY_ENABLE_TESTS=OFF -DHWY_ENABLE_EXAMPLES=OFF
cmake --build build
sudo cmake --install build
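Before building fastp, it may be worth checking that the three libraries are visible and that Highway meets the 1.1.0 minimum. The sketch below is one way to do that, assuming pkg-config is installed and that the libraries registered the module names libisal, libdeflate and libhwy (an assumption; a custom install prefix may also need PKG_CONFIG_PATH set). The version_ge helper is hypothetical and relies on sort -V.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when version $1 is >= version $2.
# The smaller of the two versions sorts first under `sort -V`.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

min_hwy=1.1.0   # minimum Highway version stated above

if command -v pkg-config >/dev/null 2>&1; then
    # Report each dependency (module names are assumptions).
    for dep in libisal libdeflate libhwy; do
        if pkg-config --exists "$dep"; then
            echo "$dep: $(pkg-config --modversion "$dep")"
        else
            echo "$dep: not found by pkg-config"
        fi
    done
    # Compare the installed Highway version with the minimum.
    if pkg-config --exists libhwy; then
        hwy=$(pkg-config --modversion libhwy)
        if version_ge "$hwy" "$min_hwy"; then
            echo "Highway $hwy satisfies >= $min_hwy"
        else
            echo "Highway $hwy is older than $min_hwy"
        fi
    fi
else
    echo "pkg-config not available; skipping dependency check"
fi
```

With this helper, the Ubuntu 24.04 package version 1.0.7 is rejected, while the 1.3.0 source build is accepted.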

Step 4: download and build fastp

# get source (you can also use browser to download from master or releases)
git clone 