🔱 Trident

arXiv | Blog | Cite | Documentation | License

Trident is a toolkit for large-scale whole-slide image processing. This project was developed by the Mahmood Lab at Harvard Medical School and Brigham and Women's Hospital. This work was funded by NIH NIGMS R35GM138216.

[!NOTE] Contributions are welcome! Please report any issues. You may also contribute by opening a pull request.

Key Features:

<img align="right" src="_readme/trident_crop.jpg" width="250px" />
  • Tissue Segmentation: Extract tissue from background (H&E, IHC, etc.).
  • Patch Extraction: Extract tissue patches of any size and magnification.
  • Patch Feature Extraction: Extract patch embeddings from 20+ foundation models, including UNI, Virchow, H-Optimus-0 and more...
  • Slide Feature Extraction: Extract slide embeddings from 5+ slide foundation models, including Threads (coming soon!), Titan, and GigaPath.

Updates:

  • 07.25: Support for Feather model.
  • 05.25: New batch-wise WSI caching for scalable processing on limited SSD space + nested WSI search (--search_nested).
  • 04.25: Native support for PIL.Image and CuCIM (use wsi = load_wsi(xxx.svs)). Support for seg + patch encoding without Internet.
  • 04.25: Remove artifacts/penmarks from the tissue segmentation with --remove_artifacts and --remove_penmarks.
  • 02.25: New image converter from czi, png, etc. to tiff.
  • 02.25: Support for GrandQC (Citation necessary, Non-commercial use, Original repository) tissue vs. background segmentation.
  • 02.25: Support for Madeleine, Hibou, Lunit, Kaiko, and H-Optimus-1 models.

[!NOTE] GrandQC is integrated into Trident under the CC BY-NC-SA 4.0 license. If you use GrandQC, please cite their original publication.

🔨 1. Installation:

  • Create an environment (Python 3.10 or 3.11): conda create -n "trident" python=3.10, then activate it with conda activate trident.
  • Clone the repository: git clone https://github.com/mahmoodlab/trident.git && cd trident.
  • Local installation: pip install -e .
    • This installs the shared model stack (transformers, timm, safetensors, etc.).

Optional install profiles:

  • pip install -e ".[patch-encoders]" for CONCH/MUSK/CTransPath-related extras.
  • pip install -e ".[slide-encoders]" for PRISM/GigaPath/Madeleine-related extras.
  • pip install -e ".[convert]" for slide conversion dependencies.
  • pip install -e ".[full]" to install all pip-installable optional dependencies.

Run checks before launching jobs:

  • trident-doctor --profile base
  • trident-doctor --profile patch-encoders --check-gated
  • trident-doctor --profile slide-encoders
  • trident-doctor --profile convert
  • trident-doctor --profile full --check-gated

[!NOTE] Some models still require manual setup (e.g., local CHIEF repository path in trident/slide_encoder_models/local_ckpts.json) or HuggingFace gated access approvals.

🔨 2. Running Trident:

CLI entry points (all are supported):

  • python run_batch_of_slides.py ... (existing command)
  • python run_single_slide.py ... (existing command)
  • trident batch ..., trident single ..., trident convert ..., and trident doctor ... (wrapper CLI)

Already familiar with WSI processing? Perform segmentation, patching, and UNI feature extraction from a directory of WSIs with:

python run_batch_of_slides.py --task all --wsi_dir ./wsis --job_dir ./trident_processed --patch_encoder uni_v1 --mag 20 --patch_size 256

Equivalent wrapper CLI:

trident batch -- --task all --wsi_dir ./wsis --job_dir ./trident_processed --patch_encoder uni_v1 --mag 20 --patch_size 256

Feeling cautious?

Run this command to perform all processing steps for a single slide:

python run_single_slide.py --slide_path ./wsis/xxxx.svs --job_dir ./trident_processed --patch_encoder uni_v1 --mag 20 --patch_size 256

Equivalent wrapper CLI:

trident single -- --slide_path ./wsis/xxxx.svs --job_dir ./trident_processed --patch_encoder uni_v1 --mag 20 --patch_size 256

Convert images/WSIs to pyramidal TIFF:

trident convert --input_dir ./wsis --mpp_csv ./wsis/to_process.csv --job_dir ./pyramidal_tiff --downscale_by 1 --num_workers 1

--mpp_csv is required and must contain wsi,mpp columns. Only files listed in the CSV are converted. If embedded MPP metadata is detected in a slide, Trident compares it to the CSV value and logs mismatches.
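The CSV parsing and the MPP cross-check can be sketched as below. This is a minimal illustration, not Trident's actual implementation; the function names and the 5% tolerance are assumptions.

```python
import csv
import io

def load_mpp_csv(text):
    """Parse a wsi,mpp CSV into a {filename: mpp} mapping."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["wsi"]: float(row["mpp"]) for row in reader}

def mpp_mismatch(csv_mpp, embedded_mpp, rel_tol=0.05):
    """Flag a mismatch when embedded metadata deviates from the CSV value."""
    if embedded_mpp is None:  # no embedded metadata: fall back to the CSV value
        return False
    return abs(embedded_mpp - csv_mpp) / csv_mpp > rel_tol

table = load_mpp_csv("wsi,mpp\nslide_a.svs,0.25\nslide_b.svs,0.50\n")
print(mpp_mismatch(table["slide_a.svs"], 0.2497))  # within tolerance: False
print(mpp_mismatch(table["slide_b.svs"], 0.25))    # logged as a mismatch: True
```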

Or follow step-by-step instructions:

Step 1: Tissue Segmentation: Segments tissue vs. background from a dir of WSIs

  • Command:
    python run_batch_of_slides.py --task seg --wsi_dir ./wsis --job_dir ./trident_processed --gpu 0 --segmenter hest
    
    • --task seg: Specifies that you want to do tissue segmentation.
    • --wsi_dir ./wsis: Path to dir with your WSIs.
    • --job_dir ./trident_processed: Output dir for processed results.
    • --gpu 0: Uses GPU with index 0.
    • --segmenter: Segmentation model. Defaults to hest. Use grandqc (Citation necessary, Non-commercial use, Original repository) for fast H&E segmentation or otsu for a classical image-processing-only fallback. Add --remove_artifacts for additional artifact cleanup.
  • Outputs:
    • WSI thumbnails in ./trident_processed/thumbnails.
    • WSI thumbnails with tissue contours in ./trident_processed/contours.
    • GeoJSON files containing tissue contours in ./trident_processed/contours_geojson. These can be opened in QuPath for editing/quality control, if necessary.
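The contour GeoJSONs can be consumed with plain json as sketched below. This assumes a standard GeoJSON FeatureCollection of Polygon features in level-0 pixel coordinates; the exact properties Trident writes may differ.

```python
import json

# A minimal GeoJSON FeatureCollection of tissue contours (schema assumed;
# the files Trident writes may carry different properties).
geojson_text = json.dumps({
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {"objectType": "tissue"},
        "geometry": {
            "type": "Polygon",
            # Exterior ring, closed (first point repeated at the end).
            "coordinates": [[[0, 0], [1000, 0], [1000, 800], [0, 800], [0, 0]]],
        },
    }],
})

contours = json.loads(geojson_text)
for feature in contours["features"]:
    ring = feature["geometry"]["coordinates"][0]
    xs = [p[0] for p in ring]
    ys = [p[1] for p in ring]
    print(min(xs), min(ys), max(xs), max(ys))  # bounding box of the contour
```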

Step 2: Tissue Patching: Extracts patches from segmented tissue regions at a specific magnification.

  • Command:
    python run_batch_of_slides.py --task coords --wsi_dir ./wsis --job_dir ./trident_processed --mag 20 --patch_size 256 --overlap 0
    
    • --task coords: Specifies that you want to do patching.
    • --wsi_dir ./wsis: Path to the dir with your WSIs.
    • --job_dir ./trident_processed: Output dir for processed results.
    • --mag 20: Extracts patches at 20x magnification.
    • --patch_size 256: Each patch is 256x256 pixels.
    • --overlap 0: Patches overlap by 0 pixels. Overlap is always an absolute number of pixels (e.g., --overlap 128 gives 50% overlap for 256x256 patches).
  • Outputs:
    • Patch coordinates as h5 files in ./trident_processed/20x_256px/patches.
    • WSI thumbnails annotated with patch borders in ./trident_processed/20x_256px/visualization.
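The relationship between patch size, overlap, and the resulting coordinate grid can be sketched as below. This is a simplified illustration of the arithmetic only; Trident additionally filters patches against the tissue segmentation.

```python
def patch_grid(width, height, patch_size=256, overlap=0):
    """Top-left coordinates of a regular patch grid.

    Overlap is an absolute number of pixels, so the stride between
    neighbouring patches is patch_size - overlap.
    """
    stride = patch_size - overlap
    coords = []
    for y in range(0, height - patch_size + 1, stride):
        for x in range(0, width - patch_size + 1, stride):
            coords.append((x, y))
    return coords

# --overlap 0: stride 256, neighbouring patches share no pixels.
print(len(patch_grid(1024, 1024, 256, overlap=0)))    # 4 x 4 = 16 patches
# --overlap 128: stride 128, i.e. 50% overlap for 256x256 patches.
print(len(patch_grid(1024, 1024, 256, overlap=128)))  # 7 x 7 = 49 patches
```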

Step 3a: Patch Feature Extraction: Extracts features from tissue patches using a specified encoder

  • Command:
    python run_batch_of_slides.py --task feat --wsi_dir ./wsis --job_dir ./trident_processed --patch_encoder uni_v1 --mag 20 --patch_size 256 
    
    • --task feat: Specifies that you want to do feature extraction.
    • --wsi_dir ./wsis: Path to the dir with your WSIs.
    • --job_dir ./trident_processed: Output dir for processed results.
    • --patch_encoder uni_v1: Uses the UNI patch encoder. See below for list of supported models.
    • --mag 20: Features are extracted from patches at 20x magnification.
    • --patch_size 256: Patches are 256x256 pixels in size.
  • Outputs:
    • Features are saved as h5 files in ./trident_processed/20x_256px/features_uni_v1. (Shape: (n_patches, feature_dim))

Trident supports 24 patch encoders, loaded via a patch encoder factory. Models requiring specific installations will return error messages with additional instructions. Gated models on HuggingFace require access requests.

| Patch Encoder | Embedding Dim | Args | Link |
|---------------|--------------:|------|------|
| UNI | 1024 | --patch_encoder uni_v1 --patch_size 256 --mag 20 | MahmoodLab/UNI |
| UNI2-h | 1536 | --patch_encoder uni_v2 --patch_size 256 --mag 20 | MahmoodLab/UNI2-h |
| CONCH | 512 | --patch_encoder conch_v1 --patch_size 512 --mag 20 | MahmoodLab/CONCH |
| CONCHv1.5 | 768 | --patch_encoder conch_v15 --patch_size 512 --mag 20 | MahmoodLab/conchv1_5 |
| Virchow | 2560 | --patch_encoder virchow --patch_size 224 --mag 20 | |
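A registry-style factory like the one described can be sketched as follows. This is a generic illustration of the pattern, not Trident's actual factory; the class and function names are assumptions.

```python
from dataclasses import dataclass

@dataclass
class EncoderSpec:
    """What the factory tracks for one patch encoder (illustrative only)."""
    name: str
    embedding_dim: int
    patch_size: int
    mag: int

_REGISTRY = {}

def register(spec):
    _REGISTRY[spec.name] = spec

def encoder_factory(name):
    """Look up an encoder by its CLI name, failing with guidance if unknown."""
    try:
        return _REGISTRY[name]
    except KeyError:
        raise ValueError(
            f"Unknown patch encoder '{name}'. Available: {sorted(_REGISTRY)}"
        )

register(EncoderSpec("uni_v1", 1024, 256, 20))
register(EncoderSpec("conch_v1", 512, 512, 20))

print(encoder_factory("uni_v1").embedding_dim)  # 1024
```

Keeping encoders behind a single lookup is what lets a CLI flag like --patch_encoder uni_v1 select a model, and lets unknown names fail early with the list of valid choices.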
