Foldseek

Foldseek enables fast and sensitive comparisons of large protein structure sets, supporting monomer and multimer searches, as well as clustering. It runs on CPU, supports GPU acceleration for faster searches, and optionally allows ultra-fast and sensitive comparisons directly from protein sequence inputs using a language model, bypassing the need for structures.

Publications

van Kempen M, Kim S, Tumescheit C, Mirdita M, Lee J, Gilchrist CLM, Söding J, and Steinegger M. Fast and accurate protein structure search with Foldseek. Nature Biotechnology, doi:10.1038/s41587-023-01773-0 (2023)

Barrio-Hernandez I, Yeo J, Jänes J, Mirdita M, Gilchrist CLM, Wein T, Varadi M, Velankar S, Beltrao P and Steinegger M. Clustering predicted structures at the scale of the known protein universe. Nature, doi:10.1038/s41586-023-06510-w (2023)

Kim W, Mirdita M, Levy Karin E, Gilchrist CLM, Schweke H, Söding J, Levy E, and Steinegger M. Rapid and sensitive protein complex alignment with Foldseek-Multimer. Nature Methods, doi:10.1038/s41592-025-02593-7 (2025)

Kallenborn F, Chacon A, Hundt C, Sirelkhatim H, Didi K, Cha S, Dallago C, Mirdita M, Schmidt B, and Steinegger M. GPU-accelerated homology search with MMseqs2. bioRxiv, doi:10.1101/2024.11.13.623350 (2024)

Webserver

Search your protein structures against the AlphaFoldDB and PDB in seconds using the Foldseek webserver: search.foldseek.com 🚀

Installation

# Linux AVX2 build (check using: cat /proc/cpuinfo | grep avx2)
wget https://mmseqs.com/foldseek/foldseek-linux-avx2.tar.gz; tar xvzf foldseek-linux-avx2.tar.gz; export PATH=$(pwd)/foldseek/bin/:$PATH

# Linux ARM64 build
wget https://mmseqs.com/foldseek/foldseek-linux-arm64.tar.gz; tar xvzf foldseek-linux-arm64.tar.gz; export PATH=$(pwd)/foldseek/bin/:$PATH

# Linux AVX2 & GPU build (req. glibc >= 2.17 and nvidia driver >=525.60.13)
wget https://mmseqs.com/foldseek/foldseek-linux-gpu.tar.gz; tar xvfz foldseek-linux-gpu.tar.gz; export PATH=$(pwd)/foldseek/bin/:$PATH

# macOS
wget https://mmseqs.com/foldseek/foldseek-osx-universal.tar.gz; tar xvzf foldseek-osx-universal.tar.gz; export PATH=$(pwd)/foldseek/bin/:$PATH

# Conda installer (Linux and macOS)
conda install -c conda-forge -c bioconda foldseek

Other precompiled binaries are available at https://mmseqs.com/foldseek.

> [!NOTE]
> We recently added support for GPU-accelerated protein sequence and profile searches. Full speed requires an NVIDIA GPU of the Ampere generation or newer; Turing-generation GPUs also work, at reduced speed. The bioconda and precompiled binaries will not work on older GPU generations (e.g. Volta or Pascal).

Memory requirements

For optimal software performance, consider three options based on your RAM and search requirements:

- With Cα info (default). Estimate RAM as (6 bytes Cα + 1 byte 3Di + 1 byte AA) * (database residues). The 54M AFDB50 entries require 151GB.
- Without Cα info. Setting --sort-by-structure-bits 0 reduces the RAM requirement to 35GB. This alters hit rankings and final scores, but not E-values. Structure bits mostly matter for ranking hits with E-value > 10^-1.
- Single query searches. Use --prefilter-mode 1, which is not memory-limited and computes all optimal ungapped alignments. This option makes full use of Foldseek's multithreading for single queries and supports GPU acceleration.
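
The formula for the default mode (with Cα info) can be turned into a quick back-of-the-envelope estimate. The sketch below is only an illustration: the function names `foldseek_ram_bytes` and `foldseek_ram_gb` are made up for this example and are not part of Foldseek, the residue count is hypothetical, and the result covers only the per-residue database cost stated above, not any other memory the program may use.

```shell
#!/bin/sh
# Estimate RAM for a Foldseek database with Cα info (default mode), using
# the formula: (6 bytes Cα + 1 byte 3Di + 1 byte AA) * database residues.
# The function names are illustrative and not part of Foldseek.

foldseek_ram_bytes() {
    # $1: total number of residues in the target database
    echo $(( $1 * (6 + 1 + 1) ))
}

foldseek_ram_gb() {
    # Same estimate in decimal gigabytes (1 GB = 10^9 bytes), one decimal place
    awk -v residues="$1" 'BEGIN { printf "%.1f\n", residues * 8 / 1e9 }'
}

# Example with a hypothetical database of 1,000,000,000 residues
foldseek_ram_bytes 1000000000   # prints 8000000000
foldseek_ram_gb 1000000000      # prints 8.0
```

If the estimate exceeds the available RAM, the other two options in the list above (dropping Cα info or using --prefilter-mode 1) are the ones to consider.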

Tutorial Video

A Foldseek tutorial covering the webserver and command-line usage is available at https://www.youtube.com/watch?v=k5Rbi22TtOA.

Documentation

Many of Foldseek's modules (subprograms) rely on MMseqs2. For more information about these modules, refer to the MMseqs2 wiki. For documentation specific to Foldseek, check out the Foldseek wiki.

Quick start

Search

The easy-search module queries one or more single-chain proteins, provided either as protein structures in PDB/mmCIF format (flat or gzipped) or as protein sequences in FASTA format, against a target database, a folder, or individual single-chain protein structures (for multi-chain proteins, see complexsearch). The default output is a tab-separated alignment file, but Foldseek also supports superposed Cα PDBs and HTML.

foldseek easy-search example/d1asha_ example/ aln tmpFolder

Output Search

Tab-separated

The default output fields are query,target,fident,alnlen,mismatch,gapopen,qstart,qend,tstart,tend,evalue,bits. They can be customized with the --format-output option; for example, --format-output "query,target,qaln,taln" returns the query and target accessions and the pairwise alignments in tab-separated format. Many other output columns are available:

| Code | Description |
| --- | --- |
| query | Query sequence identifier |
| target | Target sequence identifier |
| qca | Cα coordinates of the query |
| tca | Cα coordinates of the target |
| alntmscore | TM-score of the alignment |
| qtmscore | TM-score normalized by the query length |
| ttmscore | TM-score normalized by the target length |
| u | Rotation matrix (computed by TM-score) |
| t | Translation vector (computed by TM-score) |
| lddt | Average LDDT of the alignment |
| lddtfull | LDDT per aligned position |
| prob | Estimated probability for query and target to be homologous (e.g. being within the same SCOPe superfamily) |

Check out the MMseqs2 documentation for additional output format codes.
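
Because the default output is plain tab-separated text, it can be post-processed with standard tools. The sketch below assumes the default 12-column layout listed above (so evalue is column 11 and bits is column 12) and would not match a file written with a custom --format-output. The helper name `filter_hits` is invented for this example and is not a Foldseek command.

```shell
#!/bin/sh
# Keep only hits at or below an E-value cutoff from Foldseek's default
# tab-separated output, printing query, target, evalue and bits.
# Assumes the default column order:
#   query,target,fident,alnlen,mismatch,gapopen,qstart,qend,tstart,tend,evalue,bits
# filter_hits is an illustrative helper, not part of Foldseek.

filter_hits() {
    # $1: alignment file (e.g. "aln" written by easy-search)
    # $2: maximum E-value to keep (e.g. 1e-3)
    awk -F '\t' -v max="$2" '
        # "+ 0" forces a numeric comparison, including scientific notation
        ($11 + 0) <= (max + 0) { print $1 "\t" $2 "\t" $11 "\t" $12 }
    ' "$1"
}

# Usage, after running the easy-search example above:
#   filter_hits aln 1e-3
```

Filtering after the search keeps the original result file intact, so different cutoffs can be tried without rerunning Foldseek.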

Superpositioned Cα only PDB files

Foldseek's --format-mode 5 generates PDB files with all target Cα atoms superimposed onto the query structure based on the aligned coordinates. It writes a separate PDB file for each pairwise alignment, so be careful when using this option for large searches.

Interactive HTML

Locally run Foldseek can generate an HTML search result, similar to the one produced
