Parachute
[WACV2026] PaRaChute: Pathology-Radiology Cross-Modal Fusion for Missing-Modality-Robust Survival Prediction
🪂 PaRaChute
Introduction
This repository contains the implementation for the WACV 2026 paper "PaRaChute: Pathology-Radiology Cross-Modal Fusion for Missing-Modality-Robust Survival Prediction".
<img width="3847" height="1288" alt="parachute" src="https://github.com/user-attachments/assets/03892bc4-13e0-4f04-b0f3-b8e6fce9a924" />
Setup
Create the Conda environment:
conda env create -f environment.yml
conda activate parachute
pip install -r requirements.txt
Data
We use three publicly available multimodal cancer datasets:
- CPTAC-PDA: Pancreatic Ductal Adenocarcinoma preoperative CT volumes and H&E WSI with clinical metadata (grading and survival) from CPTAC, hosted on TCIA. Sources: TCIA collection, CPTAC consortium.
- CPTAC-UCEC: Uterine Corpus Endometrial Carcinoma preoperative CT volumes and WSI histopathology with clinical and survival metadata from CPTAC, hosted on TCIA. Sources: TCIA collection, CPTAC consortium.
- MMIST-CCRCC: Clear Cell Renal Cell Carcinoma (ccRCC) benchmark curated from CPTAC-CCRCC (TCIA) and TCGA-KIRC (GDC). Sources: MMIST dataset page, CPTAC-CCRCC collection, TCGA-KIRC (GDC).
Splits and labels are stored under data/labels_splits/<dataset>/ (for example data/labels_splits/CPTACPDA/k=all.tsv). The CPTAC-PDA and CPTAC-UCEC splits follow those from Patho-Bench for consistency, while the MMIST-CCRCC splits were generated by us.
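To see what a split file contains before training, a small stdlib-only sketch like the one below may help. It assumes only that the file is tab-separated with a header row. The column names are not documented here, so the code reports whatever it finds instead of expecting specific ones. The helper name summarize_split is hypothetical and not part of this repository.

```python
import csv
from collections import Counter
from pathlib import Path


def summarize_split(tsv_path, count_column=None):
    """Return (column_names, n_rows, value_counts) for a tab-separated file.

    Assumes a header row. `count_column` is optional: when it is given and
    present in the header, its values are tallied (handy for a fold or split
    column, if the file has one).
    """
    with Path(tsv_path).open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        rows = list(reader)
        columns = list(reader.fieldnames or [])
    counts = Counter()
    if count_column in columns:
        counts.update(row[count_column] for row in rows)
    return columns, len(rows), dict(counts)


if __name__ == "__main__":
    path = Path("data/labels_splits/CPTACPDA/k=all.tsv")
    if path.exists():
        columns, n_rows, _ = summarize_split(path)
        print(f"{path}: {n_rows} rows, columns = {columns}")
    else:
        print(f"{path} not found; run this from the repository root")
```

Once the printed header shows which column holds the fold or split assignment, pass its name as count_column to get per-value row counts.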
Features
- CT volume features (after preprocessing) are extracted with MedImageInsights.
- Histopathology features are extracted with TITAN through the Trident framework.
Expected structure:
data/features/
  CPTACPDA/
    MedImageInsights/<CASE_ID>/*.npy (or .pt)
    TITAN/*.h5
  CPTACUCEC/
    MedImageInsights/<CASE_ID>/*.npy (or .pt)
    TITAN/*.h5
  MMISTCCRCC/
    MedImageInsights/<CASE_ID>/*.npy (or .pt)
    TITAN/*.h5
Notes:
- CT features are loaded from per-patient folders (ct_path/<case_id>/*.npy or .pt) and were extracted with MedImageInsights.
- WSI features are .h5 files with a features dataset, extracted with TITAN via Trident.
- Make sure the feature dimensions match the config (rad_input_dim, histo_input_dim, n_patches).
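Before training, it can be useful to confirm that the feature folders follow the layout above. The sketch below uses only the standard library and inspects file names, not file contents, so it does not verify feature dimensions; checking shapes against rad_input_dim or histo_input_dim would additionally need NumPy, PyTorch, or h5py. The function name scan_features is hypothetical, and whether the .h5 file names correspond to case ids or slide ids is not assumed.

```python
from pathlib import Path

CT_SUFFIXES = {".npy", ".pt"}


def scan_features(dataset_root):
    """Summarize one dataset folder laid out as described above.

    Returns a dict with the case ids that have CT feature files, the case
    folders without any, and the stems of the .h5 files found under TITAN/.
    """
    root = Path(dataset_root)
    ct_root = root / "MedImageInsights"
    wsi_root = root / "TITAN"

    ct_cases, empty_cases = [], []
    if ct_root.is_dir():
        for case_dir in sorted(p for p in ct_root.iterdir() if p.is_dir()):
            has_features = any(
                f.suffix in CT_SUFFIXES for f in case_dir.iterdir() if f.is_file()
            )
            (ct_cases if has_features else empty_cases).append(case_dir.name)

    wsi_files = []
    if wsi_root.is_dir():
        wsi_files = sorted(p.stem for p in wsi_root.glob("*.h5"))
    return {"ct_cases": ct_cases, "empty_cases": empty_cases, "wsi_files": wsi_files}


if __name__ == "__main__":
    for name in ("CPTACPDA", "CPTACUCEC", "MMISTCCRCC"):
        summary = scan_features(Path("data/features") / name)
        print(name, {key: len(value) for key, value in summary.items()})
```

A non-empty empty_cases list points at case folders that were created but never populated, which is a common cause of loader errors.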
Directory Structure And Outputs
Repository layout (top level and key subfolders):
.
├── README.md
├── data/
│   ├── features/
│   │   ├── CPTACPDA/
│   │   │   ├── MedImageInsights/
│   │   │   └── TITAN/
│   │   ├── CPTACUCEC/
│   │   │   ├── MedImageInsights/
│   │   │   └── TITAN/
│   │   └── MMISTCCRCC/
│   │       ├── MedImageInsights/
│   │       └── TITAN/
│   └── labels_splits/
│       ├── CPTACPDA/
│       ├── CPTACUCEC/
│       └── MMISTCCRCC/
├── models/
│   ├── ckpts/
│   └── parachute/
├── scripts/
│   ├── presets/
│   │   ├── titan_cptacpda/
│   │   ├── titan_cptacucec/
│   │   └── titan_mmist_ccrcc/
│   ├── test.py
│   ├── train.py
│   └── train_multival.py
├── training/
│   ├── losses.py
│   ├── metrics.py
│   ├── survival_trainer.py
│   └── trainer.py
├── environment.yml
└── requirements.txt
What each directory is for:
- data/features/: Extracted CT and WSI features used for training and testing.
- data/labels_splits/: Fold-aware TSVs describing train/test splits and labels.
- models/parachute/: Model and fusion code.
- models/ckpts/: All training outputs (checkpoints, logs, and CV summaries).
- scripts/: Train/test entry points and dataset presets.
- training/: Trainer implementations, losses, and metrics.
Training
This repo provides two training entry points.
Standard CV training:
python scripts/train.py \
--config scripts/presets/titan_cptacpda/MedImSight_trainfull.json \
--experiment-name cptacpda_trainfull
Multi-validation training with missing-modality evaluations:
python scripts/train_multival.py \
--config scripts/presets/titan_cptacpda/MedImSight_multival.json \
--experiment-name cptacpda_multival
Outputs (standard CV training, train.py):
models/ckpts/<experiment_name>/
  cv_results.json
  fold_0/
    config_fold_0.json
    <experiment_name>_fold_0/
      <experiment_name>_fold_0.log
      <experiment_name>_fold_0_latest.pth
      <experiment_name>_fold_0_best.pth
      <experiment_name>_fold_0_best_val_loss.pth
  fold_1/
    ...
Outputs (multival training, train_multival.py, checkpoints per missing-modality split / missing ratio):
models/ckpts/<experiment_name>/
  cv_results_multival.json
  fold_0/
    config_fold_0.json
    <experiment_name>_fold_0/
      <experiment_name>_fold_0.log
      <experiment_name>_fold_0_best_ct_missing.pth
      <experiment_name>_fold_0_best_histo_missing.pth
      <experiment_name>_fold_0_best_mixed_missing.pth
      <experiment_name>_fold_0_ct_missing_best_metrics.json
      <experiment_name>_fold_0_histo_missing_best_metrics.json
      <experiment_name>_fold_0_mixed_missing_best_metrics.json
  fold_1/
    ...
Missing-modality splits are ct_missing, histo_missing, and mixed_missing.
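To check that a multival run produced everything it should, the file names above can be generated programmatically. The following is a minimal sketch, assuming the run directory <experiment_name>_fold_N/ sits inside fold_N/ and holds the log, checkpoints, and metrics files. The helper names expected_multival_files and missing_files are hypothetical and not part of the repository.

```python
from pathlib import Path

SPLITS = ("ct_missing", "histo_missing", "mixed_missing")


def expected_multival_files(ckpt_root, experiment_name, fold):
    """Build the per-fold file paths listed above for a multival run.

    Assumes the nesting <ckpt_root>/<experiment>/fold_N/<experiment>_fold_N/.
    """
    run_name = f"{experiment_name}_fold_{fold}"
    run_dir = Path(ckpt_root) / experiment_name / f"fold_{fold}" / run_name
    files = [run_dir / f"{run_name}.log"]
    for split in SPLITS:
        files.append(run_dir / f"{run_name}_best_{split}.pth")
        files.append(run_dir / f"{run_name}_{split}_best_metrics.json")
    return files


def missing_files(paths):
    """Return the subset of `paths` that does not exist on disk."""
    return [p for p in paths if not p.exists()]


if __name__ == "__main__":
    paths = expected_multival_files("models/ckpts", "cptacpda_multival", 0)
    for path in missing_files(paths):
        print("missing:", path)
```

Since the per-split metrics files are written after training finishes, a missing *_best_metrics.json next to an existing checkpoint may simply mean the run is still in progress or was interrupted.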
Notes:
- Ensure your config contains n_folds (see scripts/presets/titan_cptacpda/).
- training.gpu selects the CUDA device id.
- The performance numbers reported in the paper correspond to the best validation fold, not the mean across folds.
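A quick pre-flight check of a config file can catch a missing n_folds or training.gpu before a long run starts. The sketch below is illustrative only: it assumes training.gpu means a gpu key inside a top-level training object, and it searches for n_folds at any nesting depth because its exact position in the presets is not stated here. The functions find_key and check_config are hypothetical helpers, not part of the repository.

```python
import json
from pathlib import Path


def find_key(node, key):
    """Depth-first search for `key` in nested dicts/lists; returns (found, value)."""
    if isinstance(node, dict):
        if key in node:
            return True, node[key]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return False, None
    for child in children:
        found, value = find_key(child, key)
        if found:
            return True, value
    return False, None


def check_config(config):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    found, n_folds = find_key(config, "n_folds")
    if not found:
        problems.append("n_folds is missing")
    elif isinstance(n_folds, bool) or not isinstance(n_folds, int) or n_folds < 1:
        problems.append(f"n_folds should be a positive integer, got {n_folds!r}")
    # Assumption: "training.gpu" denotes config["training"]["gpu"].
    training = config.get("training") if isinstance(config, dict) else None
    if not isinstance(training, dict) or "gpu" not in training:
        problems.append("training.gpu is missing")
    return problems


if __name__ == "__main__":
    preset = Path("scripts/presets/titan_cptacpda/MedImSight_trainfull.json")
    if preset.exists():
        config = json.loads(preset.read_text(encoding="utf-8"))
        print(check_config(config) or "no problems found")
    else:
        print(f"{preset} not found; run this from the repository root")
```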
Testing
Use scripts/test.py to evaluate checkpoints on the test split.
Example: full evaluation
python scripts/test.py \
--checkpoint-dir models/ckpts/cptacpda_trainfull \
--mode full \
--checkpoint-tag best
Example: multival evaluation (missing-modality splits)
python scripts/test.py \
--checkpoint-dir models/ckpts/cptacpda_multival \
--mode multival \
--checkpoint-tag best
Notes:
- --checkpoint-dir can be an experiment directory, a fold_* directory, or a run directory that contains _fold_ in the name.
- --config lets you override the config stored in the checkpoint (for example models/ckpts/.../fold_0/config_fold_0.json).
- Results are written to eval_results.json (full) or eval_results_multival.json (multival) in the checkpoint root unless --output is set.
- Logs are written under ./models/test_logs/ by default, with one folder per fold named test_fold_X/.
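The three accepted forms of --checkpoint-dir can be told apart by directory name alone. The sketch below shows one way such a resolution could work; it is not the logic used by scripts/test.py, and the function names are hypothetical. It assumes a run directory sits directly inside its fold_* directory, and it would misclassify an experiment whose own name contains _fold_.

```python
import re
from pathlib import Path

FOLD_DIR = re.compile(r"fold_(\d+)")


def classify_checkpoint_dir(path):
    """Classify a path as 'experiment', 'fold', or 'run' from its name alone."""
    name = Path(path).name
    if FOLD_DIR.fullmatch(name):
        return "fold"
    if "_fold_" in name:
        return "run"
    return "experiment"


def fold_dirs(path):
    """Return the fold_* directories covered by `path`, sorted by fold index."""
    path = Path(path)
    kind = classify_checkpoint_dir(path)
    if kind == "fold":
        return [path]
    if kind == "run":
        # Assumption: the run directory lives directly inside fold_N/.
        return [path.parent]
    found = [
        p for p in path.glob("fold_*")
        if p.is_dir() and FOLD_DIR.fullmatch(p.name)
    ]
    return sorted(found, key=lambda p: int(FOLD_DIR.fullmatch(p.name).group(1)))


if __name__ == "__main__":
    for fold in fold_dirs("models/ckpts/cptacpda_trainfull"):
        print(fold)
```

Sorting by the numeric fold index keeps fold_10 after fold_2, which a plain string sort would not.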
Expected testing output layout (default paths):
<checkpoint_root>/
  eval_results.json
  eval_results_multival.json
models/test_logs/
  test_fold_0/
    test_fold_0.log
  test_fold_1/
    ...
Output details:
- cv_results.json: Per-fold metrics from train.py plus mean and standard deviation across folds.
- cv_results_multival.json: Per-fold, per-split metrics from train_multival.py plus mean and standard deviation across folds.
- *_latest.pth: Most recent checkpoint in a fold run.
- *_best.pth: Best checkpoint according to training.monitor_metric.
- *_best_val_loss.pth: Best checkpoint by validation loss.
- *_best_<split>.pth: Best checkpoint for a missing-modality split (ct_missing, histo_missing, mixed_missing).
- *_<split>_best_metrics.json: Best metrics for that split (written after training finishes).
- eval_results.json: Evaluation results for full mode testing.
- eval_results_multival.json: Evaluation results for multival testing.
- models/test_logs/test_fold_X/test_fold_X.log: Per-fold evaluation logs.
Acknowledgements
We acknowledge NVIDIA Corporation for the Academic Hardware Grant Program and ISCRA for awarding this project access to the LEONARDO supercomputer, owned by the EuroHPC Joint Undertaking and hosted by CINECA (Italy).
This work leverages:
- MedImageInsights: https://github.com/medimageinsight/medimageinsight
- TITAN: https://github.com/mahmoodlab/TITAN
- Trident: https://github.com/mahmoodlab/TRIDENT
- Patho-Bench: https://github.com/mahmoodlab/Patho-Bench
Citation
If you find our work useful in your research, please consider citing our paper:
@InProceedings{Caforio_2026_WACV,
author = {Caforio, Pietro and Poles, Isabella and Santambrogio, Marco D.},
title = {PaRaChute: Pathology-Radiology Cross-Modal Fusion for Missing-Modality-Robust Survival Prediction},
booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
month = {March},
year = {2026},
pages = {718-728}
}
