SkillAgentSearch skills...

Parachute

[WACV2026] PaRaChute: Pathology-Radiology Cross-Modal Fusion for Missing-Modality-Robust Survival Prediction

Install / Use

/learn @PietroCaforio/Parachute
About this skill

Quality Score

0/100

Supported Platforms

Universal

README

🪂 PaRaChute

Introduction

This repository contains the implementation for the WACV 2026 paper "PaRaChute: Pathology-Radiology Cross-Modal Fusion for Missing-Modality-Robust Survival Prediction". <img width="3847" height="1288" alt="parachute" src="https://github.com/user-attachments/assets/03892bc4-13e0-4f04-b0f3-b8e6fce9a924" />

Setup

Create the Conda environment:

conda env create -f environment.yml
conda activate parachute
pip install -r requirements.txt

Data

We use three publicly available multimodal cancer datasets:

Splits and labels are stored under data/labels_splits/<dataset>/ (for example data/labels_splits/CPTACPDA/k=all.tsv). The CPTAC-PDA and CPTAC-UCEC splits follow the ones from Patho-Bench for consistency, while the MMISTCCRCC splits were generated by us.

Features

CT volume features (after preprocessing) are extracted with MedImageInsights Histopathology features are extracted with TITAN through the Trident framework

Expected structure:

data/features/
  CPTACPDA/
    MedImageInsights/<CASE_ID>/*.npy (or .pt)
    TITAN/*.h5
  CPTACUCEC/
    MedImageInsights/<CASE_ID>/*.npy (or .pt)
    TITAN/*.h5
  MMISTCCRCC/
    MedImageInsights/<CASE_ID>/*.npy (or .pt)
    TITAN/*.h5

Notes:

  • CT features are loaded from per-patient folders (ct_path/<case_id>/*.npy or .pt) and were extracted with MedImageInsights.
  • WSI features are .h5 files with a features dataset, extracted with TITAN via Trident.
  • Make sure the feature dimensions match the config (rad_input_dim, histo_input_dim, n_patches).

Directory Structure And Outputs

Repository layout (top level and key subfolders):

.
├── README.md
├── data/
│   ├── features/
│   │   ├── CPTACPDA/
│   │   │   ├── MedImageInsights/
│   │   │   └── TITAN/
│   │   ├── CPTACUCEC/
│   │   │   ├── MedImageInsights/
│   │   │   └── TITAN/
│   │   └── MMISTCCRCC/
│   │       ├── MedImageInsights/
│   │       └── TITAN/
│   └── labels_splits/
│       ├── CPTACPDA/
│       ├── CPTACUCEC/
│       └── MMISTCCRCC/
├── models/
│   ├── ckpts/
│   └── parachute/
├── scripts/
│   ├── presets/
│   │   ├── titan_cptacpda/
│   │   ├── titan_cptacucec/
│   │   └── titan_mmist_ccrcc/
│   ├── test.py
│   ├── train.py
│   └── train_multival.py
├── training/
│   ├── losses.py
│   ├── metrics.py
│   ├── survival_trainer.py
│   └── trainer.py
├── environment.yml
└── requirements.txt

What each directory is for:

  • data/features/: Extracted CT and WSI features used for training and testing.
  • data/labels_splits/: Fold-aware TSVs describing train/test splits and labels.
  • models/parachute/: Model and fusion code.
  • models/ckpts/: All training outputs (checkpoints, logs, and CV summaries).
  • scripts/: Train/test entry points and dataset presets.
  • training/: Trainer implementations, losses, and metrics.

Training

This repo provides two training entry points.

Standard CV training:

python scripts/train.py \
  --config scripts/presets/titan_cptacpda/MedImSight_trainfull.json \
  --experiment-name cptacpda_trainfull

Multi-validation training with missing-modality evaluations:

python scripts/train_multival.py \
  --config scripts/presets/titan_cptacpda/MedImSight_multival.json \
  --experiment-name cptacpda_multival

Outputs (standard CV training, train.py):

models/ckpts/<experiment_name>/
  cv_results.json
  fold_0/
    config_fold_0.json
    <experiment_name>_fold_0/
      <experiment_name>_fold_0.log
      <experiment_name>_fold_0_latest.pth
      <experiment_name>_fold_0_best.pth
      <experiment_name>_fold_0_best_val_loss.pth
  fold_1/
    ...

Outputs (multival training, train_multival.py, checkpoints per missing-modality split / missing ratio):

models/ckpts/<experiment_name>/
  cv_results_multival.json
  fold_0/
    config_fold_0.json
    <experiment_name>_fold_0/
      <experiment_name>_fold_0.log
      <experiment_name>_fold_0_best_ct_missing.pth
      <experiment_name>_fold_0_best_histo_missing.pth
      <experiment_name>_fold_0_best_mixed_missing.pth
      <experiment_name>_fold_0_ct_missing_best_metrics.json
      <experiment_name>_fold_0_histo_missing_best_metrics.json
      <experiment_name>_fold_0_mixed_missing_best_metrics.json
  fold_1/
    ...

Missing-modality splits are ct_missing, histo_missing, and mixed_missing.

Notes:

  • Ensure your config contains n_folds (see scripts/presets/titan_cptacpda/).
  • training.gpu selects the CUDA device id.
  • The performance numbers reported in the paper correspond to the best validation fold, not the mean across folds.

Testing

Use scripts/test.py to evaluate checkpoints on the test split.

Example: full evaluation

python scripts/test.py \
  --checkpoint-dir models/ckpts/cptacpda_trainfull \
  --mode full \
  --checkpoint-tag best

Example: multival evaluation (missing-modality splits)

python scripts/test.py \
  --checkpoint-dir models/ckpts/cptacpda_multival \
  --mode multival \
  --checkpoint-tag best

Notes:

  • --checkpoint-dir can be an experiment directory, a fold_* directory, or a run directory that contains _fold_ in the name.
  • --config lets you override the config stored in the checkpoint (for example models/ckpts/.../fold_0/config_fold_0.json).
  • Results are written to eval_results.json (full) or eval_results_multival.json (multival) in the checkpoint root unless --output is set.
  • Logs are written under ./models/test_logs/ by default, with one folder per fold named test_fold_X/.

Expected testing output layout (default paths):

<checkpoint_root>/
  eval_results.json
  eval_results_multival.json
models/test_logs/
  test_fold_0/
    test_fold_0.log
  test_fold_1/
    ...

Output details:

  • cv_results.json: Per-fold metrics from train.py plus mean and standard deviation across folds.
  • cv_results_multival.json: Per-fold, per-split metrics from train_multival.py plus mean and standard deviation across folds.
  • *_latest.pth: Most recent checkpoint in a fold run.
  • *_best.pth: Best checkpoint according to training.monitor_metric.
  • *_best_val_loss.pth: Best checkpoint by validation loss.
  • *_best_<split>.pth: Best checkpoint for a missing-modality split (ct_missing, histo_missing, mixed_missing).
  • *_<split>_best_metrics.json: Best metrics for that split (written after training finishes).
  • eval_results.json: Evaluation results for full mode testing.
  • eval_results_multival.json: Evaluation results for multival testing.
  • models/test_logs/test_fold_X/test_fold_X.log: Per-fold evaluation logs.

Acknowledgements

We acknowledge NVIDIA Corporation for the Academic Hardware Grant Program, ISCRA for awarding this project access to the LEONARDO supercomputer, owned by the EuroHPC Joint Undertaking, hosted by CINECA (Italy).

This work leverages:

  • MedImageInsights: https://github.com/medimageinsight/medimageinsight
  • TITAN: https://github.com/mahmoodlab/TITAN
  • Trident: https://github.com/mahmoodlab/TRIDENT
  • Patho-Bench: https://github.com/mahmoodlab/Patho-Bench

Citation

If you find our work useful in your research, please consider citing our paper:

@InProceedings{Caforio_2026_WACV,
    author    = {Caforio, Pietro and Poles, Isabella and Santambrogio, Marco D.},
    title     = {PaRaChute: Pathology-Radiology Cross-Modal Fusion for Missing-Modality-Robust Survival Prediction},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {March},
    year      = {2026},
    pages     = {718-728}
}
View on GitHub
GitHub Stars4
CategoryDevelopment
Updated27d ago
Forks1

Languages

Python

Security Score

85/100

Audited on Mar 8, 2026

No findings