🔱 Trident
arXiv | Blog | Cite | Documentation | License
Trident is a toolkit for large-scale whole-slide image processing. This project was developed by the Mahmood Lab at Harvard Medical School and Brigham and Women's Hospital. This work was funded by NIH NIGMS R35GM138216.
[!NOTE] Contributions are welcome! Please report any issues. You may also contribute by opening a pull request.
Key Features:
<img align="right" src="_readme/trident_crop.jpg" width="250px" />

- Tissue Segmentation: Extract tissue from background (H&E, IHC, etc.).
- Patch Extraction: Extract tissue patches of any size and magnification.
- Patch Feature Extraction: Extract patch embeddings from 20+ foundation models, including UNI, Virchow, H-Optimus-0 and more...
- Slide Feature Extraction: Extract slide embeddings from 5+ slide foundation models, including Threads (coming soon!), Titan, and GigaPath.
Updates:
- 07.25: Support for the Feather model.
- 05.25: New batch-wise WSI caching for scalable processing on limited SSD space + nested WSI search (`--search_nested`).
- 04.25: Native support for PIL.Image and CuCIM (use `wsi = load_wsi("xxx.svs")`). Support for seg + patch encoding without Internet access.
- 04.25: Remove artifacts/penmarks from the tissue segmentation with `--remove_artifacts` and `--remove_penmarks`.
- 02.25: New image converter from `czi`, `png`, etc. to `tiff`.
- 02.25: Support for GrandQC (citation necessary, non-commercial use, original repository) tissue vs. background segmentation.
- 02.25: Support for Madeleine, Hibou, Lunit, Kaiko, and H-Optimus-1 models.
[!NOTE] GrandQC is integrated into Trident under the CC BY-NC-SA 4.0 license. If you use GrandQC, please cite their original publication.
🔨 1. Installation:
- Create an environment (Python 3.10 or 3.11): `conda create -n "trident" python=3.10`, then activate it: `conda activate trident`.
- Clone: `git clone https://github.com/mahmoodlab/trident.git && cd trident`.
- Local installation: `pip install -e .`. This installs the shared model stack (`transformers`, `timm`, `safetensors`, etc.).
Optional install profiles:
- `pip install -e ".[patch-encoders]"` for CONCH/MUSK/CTransPath-related extras.
- `pip install -e ".[slide-encoders]"` for PRISM/GigaPath/Madeleine-related extras.
- `pip install -e ".[convert]"` for slide conversion dependencies.
- `pip install -e ".[full]"` to install all pip-installable optional dependencies.
Run checks before launching jobs:
- `trident-doctor --profile base`
- `trident-doctor --profile patch-encoders --check-gated`
- `trident-doctor --profile slide-encoders`
- `trident-doctor --profile convert`
- `trident-doctor --profile full --check-gated`
[!NOTE] Some models still require manual setup (e.g., a local CHIEF repository path in `trident/slide_encoder_models/local_ckpts.json`) or HuggingFace gated-access approvals.
🔨 2. Running Trident:
CLI options (all are supported):
- `python run_batch_of_slides.py ...` (existing command)
- `python run_single_slide.py ...` (existing command)
- `trident batch ...`, `trident single ...`, `trident convert ...`, and `trident doctor ...` (wrapper CLI)
Already familiar with WSI processing? Perform segmentation, patching, and UNI feature extraction from a directory of WSIs with:
`python run_batch_of_slides.py --task all --wsi_dir ./wsis --job_dir ./trident_processed --patch_encoder uni_v1 --mag 20 --patch_size 256`
Equivalent wrapper CLI:
`trident batch -- --task all --wsi_dir ./wsis --job_dir ./trident_processed --patch_encoder uni_v1 --mag 20 --patch_size 256`
Feeling cautious?
Run this command to perform all processing steps for a single slide:
`python run_single_slide.py --slide_path ./wsis/xxxx.svs --job_dir ./trident_processed --patch_encoder uni_v1 --mag 20 --patch_size 256`
Equivalent wrapper CLI:
`trident single -- --slide_path ./wsis/xxxx.svs --job_dir ./trident_processed --patch_encoder uni_v1 --mag 20 --patch_size 256`
Convert images/WSIs to pyramidal TIFF:
`trident convert --input_dir ./wsis --mpp_csv ./wsis/to_process.csv --job_dir ./pyramidal_tiff --downscale_by 1 --num_workers 1`
`--mpp_csv` is required and must contain `wsi,mpp` columns. Only files listed in the CSV are converted.
If embedded MPP metadata is detected in a slide, Trident compares it to the CSV value and logs mismatches.
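The `wsi,mpp` CSV can be generated with a few lines of stdlib Python. The slide names and microns-per-pixel values below are placeholders; substitute your own files and measured MPPs:

```python
import csv

# Placeholder slide filenames and microns-per-pixel values; replace with your own.
slides = [("slide_001.svs", 0.25), ("slide_002.svs", 0.50)]

with open("to_process.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["wsi", "mpp"])  # header columns expected by `trident convert`
    writer.writerows(slides)
```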
Or follow step-by-step instructions:
Step 1: Tissue Segmentation: segments tissue vs. background from a directory of WSIs.
- Command: `python run_batch_of_slides.py --task seg --wsi_dir ./wsis --job_dir ./trident_processed --gpu 0 --segmenter hest`
  - `--task seg`: Specifies that you want to do tissue segmentation.
  - `--wsi_dir ./wsis`: Path to the dir with your WSIs.
  - `--job_dir ./trident_processed`: Output dir for processed results.
  - `--gpu 0`: Uses the GPU with index 0.
  - `--segmenter`: Segmentation model. Defaults to `hest`. Use `grandqc` (citation necessary, non-commercial use, original repository) for fast H&E segmentation, or `otsu` for a classical image-processing-only fallback. Add `--remove_artifacts` for additional artifact cleanup.
- Outputs:
  - WSI thumbnails in `./trident_processed/thumbnails`.
  - WSI thumbnails with tissue contours in `./trident_processed/contours`.
  - GeoJSON files containing tissue contours in `./trident_processed/contours_geojson`. These can be opened in QuPath for editing/quality control, if necessary.
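The contour GeoJSON files follow the standard FeatureCollection layout, so they can be inspected with the stdlib `json` module. A minimal sketch, using a synthetic one-contour document in place of real Trident output (the `properties` Trident actually writes may differ and are left empty here):

```python
import json

# Synthetic stand-in for a Trident contours_geojson file: one square tissue contour.
geojson_text = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {
                "type": "Polygon",
                "coordinates": [[[0, 0], [1000, 0], [1000, 1000], [0, 1000], [0, 0]]],
            },
            "properties": {},
        }
    ],
})

data = json.loads(geojson_text)  # same as json.load(open(path)) on a real file
n_contours = len(data["features"])
print(n_contours)  # number of tissue contours in the file
```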
Step 2: Tissue Patching: extracts patches from segmented tissue regions at a specific magnification.
- Command: `python run_batch_of_slides.py --task coords --wsi_dir ./wsis --job_dir ./trident_processed --mag 20 --patch_size 256 --overlap 0`
  - `--task coords`: Specifies that you want to do patching.
  - `--wsi_dir ./wsis`: Path to the dir with your WSIs.
  - `--job_dir ./trident_processed`: Output dir for processed results.
  - `--mag 20`: Extracts patches at 20x magnification.
  - `--patch_size 256`: Each patch is 256x256 pixels.
  - `--overlap 0`: Patches overlap by 0 pixels (always an absolute number in pixels, e.g., `--overlap 128` for 50% overlap with 256x256 patches).
- Outputs:
  - Patch coordinates as h5 files in `./trident_processed/20x_256px/patches`.
  - WSI thumbnails annotated with patch borders in `./trident_processed/20x_256px/visualization`.
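Because `--overlap` is an absolute pixel count, the stride between patch origins is `patch_size - overlap`. This hypothetical helper (not part of Trident) illustrates the arithmetic for one axis:

```python
def patch_grid_1d(region_px: int, patch_size: int, overlap: int) -> list[int]:
    """Return patch start coordinates along one axis (full patches only)."""
    stride = patch_size - overlap  # e.g., 256 - 128 = 128 for 50% overlap
    return list(range(0, region_px - patch_size + 1, stride))

# 256px patches with no overlap across 1024px of tissue: 4 patch origins.
print(patch_grid_1d(1024, 256, 0))    # [0, 256, 512, 768]
# --overlap 128 halves the stride, roughly doubling the patch count.
print(patch_grid_1d(1024, 256, 128))  # [0, 128, 256, 384, 512, 640, 768]
```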
Step 3a: Patch Feature Extraction: extracts features from tissue patches using a specified encoder.
- Command: `python run_batch_of_slides.py --task feat --wsi_dir ./wsis --job_dir ./trident_processed --patch_encoder uni_v1 --mag 20 --patch_size 256`
  - `--task feat`: Specifies that you want to do feature extraction.
  - `--wsi_dir ./wsis`: Path to the dir with your WSIs.
  - `--job_dir ./trident_processed`: Output dir for processed results.
  - `--patch_encoder uni_v1`: Uses the UNI patch encoder. See below for the list of supported models.
  - `--mag 20`: Features are extracted from patches at 20x magnification.
  - `--patch_size 256`: Patches are 256x256 pixels in size.
- Outputs:
  - Features saved as h5 files in `./trident_processed/20x_256px/features_uni_v1` (shape: `(n_patches, feature_dim)`).
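Downstream code consumes the feature matrix as a `(n_patches, feature_dim)` array. As one simple illustration (not how Trident's slide encoders aggregate), mean pooling collapses it into a single slide-level vector; random data stands in for features loaded from the h5 file:

```python
import numpy as np

# Stand-in for features read from an h5 file: 500 patches, 1024-dim (UNI's embedding size).
features = np.random.rand(500, 1024).astype(np.float32)

# Naive slide-level representation: average over the patch axis.
slide_vector = features.mean(axis=0)
print(features.shape, slide_vector.shape)  # (500, 1024) (1024,)
```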
Trident supports 24 patch encoders, loaded via a patch `encoder_factory`. Models requiring specific installations return error messages with additional instructions. Gated models on HuggingFace require access requests.
| Patch Encoder | Embedding Dim | Args | Link |
|-----------------------|---------------:|------------------------------------------------------------------|------|
| UNI | 1024 | --patch_encoder uni_v1 --patch_size 256 --mag 20 | MahmoodLab/UNI |
| UNI2-h | 1536 | --patch_encoder uni_v2 --patch_size 256 --mag 20 | MahmoodLab/UNI2-h |
| CONCH | 512 | --patch_encoder conch_v1 --patch_size 512 --mag 20 | MahmoodLab/CONCH |
| CONCHv1.5 | 768 | --patch_encoder conch_v15 --patch_size 512 --mag 20 | MahmoodLab/conchv1_5 |
| Virchow | 2560 | --patch_encoder virchow --patch_size 224 --mag 20 | paige-ai/Virchow |