VLSA: Interpretable Vision-Language Survival Analysis with Ordinal Inductive Bias for Computational Pathology

Abstract: Histopathology Whole-Slide Images (WSIs) provide an important tool to assess cancer prognosis in computational pathology (CPATH). While existing survival analysis (SA) approaches have made exciting progress, they are generally limited to adopting highly-expressive architectures and only coarse-grained patient-level labels to learn prognostic visual representations from gigapixel WSIs. Such learning paradigm suffers from important performance bottlenecks, when facing present scarce training data and standard multi-instance learning (MIL) framework in CPATH. To overcome it, this paper, for the first time, proposes a new Vision-Language-based SA (VLSA) paradigm. Concretely, (1) VLSA is driven by pathology VL foundation models. It no longer relies on high-capability networks and shows the advantage of data efficiency. (2) In vision-end, VLSA encodes prognostic language prior and then employs it as auxiliary signals to guide the aggregating of prognostic visual features at instance level, thereby compensating for the weak supervision in MIL. Moreover, given the characteristics of SA, we propose i) ordinal survival prompt learning to transform continuous survival labels into textual prompts; and ii) ordinal incidence function as prediction target to make SA compatible with VL-based prediction. Notably, VLSA's predictions can be interpreted intuitively by our Shapley values-based method. The extensive experiments on five datasets confirm the effectiveness of our scheme. Our VLSA could pave a new way for SA in CPATH by offering weakly-supervised MIL an effective means to learn valuable prognostic clues from gigapixel WSIs.

<div align="center"> <a href="https://"><img width="100%" height="auto" src="./docs/fig-vlsa-overview.png"></a> </div>

On updating. Stay tuned.

📚 Recent updates:

25/02/08: upload the patch features (31.86G in files) used in VLSA; you can download them from here.
25/01/23: VLSA is accepted to ICLR 2025
24/10/07: add the Notebook - VLSA Walkthrough
24/09/24: codes & papers are live
24/09/10: release VLSA

VLSA Walkthrough

Please refer to our Notebook - VLSA Walkthrough. It provides the detail of

individual incidence function prediction in VLSA models;
and prediction interpretation using our Shapley values-based method.

👩‍💻 Running the Code

Pre-requisites

All experiments are run on a machine with

two NVIDIA GeForce RTX 3090 GPUs
python 3.8 and pytorch==1.11.0+cu113

Detailed package requirements:

for pip or conda users, full requirements are provided in requirements.txt.
for Docker users, you could use our base Docker image via docker pull yuukilp/deepath:py38-torch1.11.0-cuda11.3-cudnn8-devel and then install additional essential python packages (see requirements.txt) in the container.

Training models

Use the following command to load an experiment configuration and train the VLSA model (5-fold cross-validation):

python3 main.py --config config/IFMLE/tcga_blca/cfg_vlsa_conch.yaml --handler VLSA --multi_run

All important arguments are explained in config/IFMLE/tcga_blca/cfg_vlsa_conch.yaml.

For the traditional SA models only using visual features, use this one:

python3 main.py --config config/IFMLE/tcga_blca/cfg_sa_base_conch.yaml --handler SA --multi_run

Training Logs

We advocate open-source research. Our full training logs for VLSA models can be accessed at Google Drive.

🔥 Awesome Papers of Pathology VLMs

Foundational VLMs for computational pathology:

VLM-driven computational pathology tasks:

NOTE: please open a new PR if you want to add your work into this table.

WSI Preprocessing

Following CONCH, we first divide each WSI into patches of 448 * 448 pixels at 20x magnification. Then we adopt the image encoder of CONCH to extract patch features.

Our complete procedure in WSI preprocessing follows Pipeline-Processing-TCGA-Slides-for-MIL. You could

download the patch

VLSA

Install / Use

README