
Seqinspector

A dedicated QC-only pipeline for sequencing data. It runs a (potentially large) set of QC tools and can output global and group-specific MultiQC reports. The pipeline targets core facilities and research groups with high sequencing throughput.


<h1> <picture> <source media="(prefers-color-scheme: dark)" srcset="docs/images/nf-core-seqinspector_logo_dark.png"> <img alt="nf-core/seqinspector" src="docs/images/nf-core-seqinspector_logo_light.png"> </picture> </h1>


Introduction

nf-core/seqinspector is a bioinformatics pipeline that processes raw sequence data (FASTQ) to provide comprehensive quality control. It can perform subsampling, quality assessment, duplication level analysis, and complexity evaluation on a per-sample basis, while also detecting adapter content, technical artifacts, and common biological contaminants. The pipeline generates detailed MultiQC reports with flexible output options, ranging from individual sample reports to project-wide summaries, making it particularly useful for sequencing core facilities and research groups with access to sequencing instruments. If provided, nf-core/seqinspector can also parse statistics from an Illumina run folder directory into the final MultiQC reports.

Compatibility between tools and data type

<!-- TODO: add a search tool that accepts a tree for `Compatibility with Data`. -->

| Tool Type | Tool Name | Tool Description | Compatibility with Data | Dependencies | Default tool |
| --- | --- | --- | --- | --- | --- |
| Subsampling | Seqtk | Global subsampling of reads. Only performs subsampling if `--sample_size` parameter is given. | [RNA, DNA, synthetic] | [N/A] | no |
| Indexing, Mapping | Bwamem2 | Align reads to reference | [RNA, DNA] | [N/A] | yes |
| Indexing | SAMtools | Index aligned BAM files, create FASTA index | [DNA] | [N/A] | yes |
| QC | FastQC | Read QC | [RNA, DNA] | [N/A] | yes |
| QC | FastqScreen | Basic contamination detection | [RNA, DNA] | [N/A] | yes |
| QC | SeqFu Stats | Sequence statistics | [RNA, DNA] | [N/A] | yes |
| QC | Picard collect multiple metrics | Collect multiple QC metrics | [RNA, DNA] | [Bwamem2, SAMtools, `--genome`] | yes |
| QC | Picard_collecthsmetrics | Collect alignment QC metrics of hybrid-selection data. | [RNA, DNA] | [Bwamem2, SAMtools, `--fasta`, `--run_picard_collecths_metrics`, `--bait_intervals`, `--target_intervals` (`--ref_dict`)] | no |
| Reporting | MultiQC | Present QC for raw reads | [RNA, DNA, synthetic] | [N/A] | yes |
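The Dependencies column can be turned into a quick pre-flight check before launching a run. The sketch below is an illustration in Python, not the pipeline's own validation logic: the function names and example values are hypothetical, and only the parameter names come from the table. It assumes `--ref_dict` is optional because the table lists it in parentheses.

```python
# Parameter names copied from the Dependencies column for Picard_collecthsmetrics.
# Assumption: --ref_dict is optional, since the table shows it in parentheses.
HSMETRICS_REQUIRED = (
    "fasta",
    "run_picard_collecths_metrics",
    "bait_intervals",
    "target_intervals",
)


def missing_hsmetrics_params(params):
    """List the HsMetrics-related parameters that are absent or empty."""
    return [name for name in HSMETRICS_REQUIRED if not params.get(name)]


def will_subsample(params):
    """Per the table, Seqtk only subsamples when --sample_size is given."""
    return params.get("sample_size") is not None


# Hypothetical parameter set for illustration only.
example = {"fasta": "genome.fa", "run_picard_collecths_metrics": True}
print("Missing for HsMetrics:", missing_hsmetrics_params(example))
print("Subsampling enabled:", will_subsample(example))
```

A check like this only mirrors the table; the pipeline itself remains the authority on which parameter combinations are valid.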

Workflow diagram

<picture> <source media="(prefers-color-scheme: dark)" srcset="docs/images/seqinspector_tubemap_dark.png"> <source media="(prefers-color-scheme: light)" srcset="docs/images/seqinspector_tubemap_light.png"> <img alt="Fallback image description" src="docs/images/seqinspector_tubemap_light.png"> </picture>

Summary of tools and versions used in the pipeline

| Tool | Version |
| --- | --- |
| bwamem2 | 2.3 |
| fastqc | 0.12.1 |
| fastqscreen | 0.16.0 |
| multiqc | 1.33 |
| picard | 3.4.0 |
| samtools | 1.22.1 |
| seqfu | 1.22.3 |
| seqtk | 1.4 |

Usage

> [!NOTE]
> If you are new to Nextflow and nf-core, please refer to the nf-core documentation on how to set up Nextflow. Make sure to test your setup with `-profile test` before running the workflow on actual data.

First, prepare a samplesheet with your input data that looks as follows:

samplesheet.csv:

sample,fastq_1,fastq_2,rundir,tags
CONTROL_REP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz,200624_A00834_0183_BHMTFYDRXX,lane1:project5:group2

Each row represents a FASTQ file (single-end, with only `fastq_1`) or a pair of FASTQ files (paired-end, with `fastq_1` and `fastq_2`). `rundir` is the path to the run folder. `tags` is a colon-separated list of tags that will be added to the MultiQC report for this sample.
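To make the column semantics concrete, here is a minimal Python sketch that reads a samplesheet, infers single-end or paired-end layout from `fastq_2`, and splits `tags` on colons. It is not part of the pipeline; the helper names `parse_samplesheet` and `samples_by_tag` are hypothetical, and grouping samples by tag is only an illustration of how tags could relate samples to each other.

```python
import csv
import io
from collections import defaultdict

# Example row taken from the samplesheet shown above.
SAMPLESHEET = """\
sample,fastq_1,fastq_2,rundir,tags
CONTROL_REP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz,200624_A00834_0183_BHMTFYDRXX,lane1:project5:group2
"""


def parse_samplesheet(text):
    """Return one dict per row with the layout and tag list resolved."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        fastq_2 = (row.get("fastq_2") or "").strip()
        # Tags are colon-separated; empty entries are dropped.
        tags = [t for t in (row.get("tags") or "").split(":") if t]
        rows.append({
            "sample": row["sample"],
            "layout": "paired-end" if fastq_2 else "single-end",
            "rundir": row.get("rundir") or None,
            "tags": tags,
        })
    return rows


def samples_by_tag(rows):
    """Map each tag to the samples that carry it (hypothetical helper)."""
    groups = defaultdict(list)
    for row in rows:
        for tag in row["tags"]:
            groups[tag].append(row["sample"])
    return dict(groups)


rows = parse_samplesheet(SAMPLESHEET)
groups = samples_by_tag(rows)
print(rows[0]["layout"], rows[0]["tags"])
print(groups)
```

A row with an empty `fastq_2` field would be reported as single-end by this sketch, and a row with no tags would simply not appear in any group.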

Now, you can run the pipeline using:

nextflow run nf-core/seqinspector \
   -profile <docker/singularity/.../institute> \
   --input samplesheet.csv \
   --outdir <OUTDIR>
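
The same parameters can be kept in a file and passed through Nextflow's `-params-file` option, which accepts JSON or YAML. The Python sketch below writes such a file and assembles the matching command line. The output directory name `results`, the `docker` profile choice, and the helper names are assumptions for illustration; only `input` and `outdir` come from the command above.

```python
import json
import shlex
import tempfile
from pathlib import Path

# Keys mirror the --input and --outdir options; "results" is a placeholder value.
params = {"input": "samplesheet.csv", "outdir": "results"}


def write_params(path, params):
    """Write pipeline parameters as JSON for use with -params-file."""
    path = Path(path)
    path.write_text(json.dumps(params, indent=2) + "\n")
    return path


def build_command(params_file, profile="docker"):
    """Assemble the nextflow invocation as an argument list (not executed here)."""
    return [
        "nextflow", "run", "nf-core/seqinspector",
        "-profile", profile,
        "-params-file", str(params_file),
    ]


# A temporary directory keeps the demo from touching the working directory.
params_file = write_params(Path(tempfile.mkdtemp()) / "params.json", params)
command = build_command(params_file)
print(shlex.join(command))
```

Keeping parameters in a file makes a run easier to repeat and review than a long command line.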

> [!WARNING]
> Please provide pipeline parameters via the CLI or Nextflow `-params-file` option. Custom config files including those provided by the `-c` Nextflow option can be used to
