ApricotM
This repository contains the official code for the paper "Real-time prediction of intensive care unit patient acuity and therapy requirements using state-space modelling" (Nature Communications), which presents a deep learning framework for real-time patient acuity prediction using EHR data.
<p align="center"> <img src="main/image/apricot_mamba.png" alt="APRICOT Logo" style="width: 80px;"><br> APRICOT-Mamba: Acuity Prediction in Intensive Care Unit (ICU) </p>
<p align="center"> <a href="https://www.gnu.org/licenses/gpl-3.0"> <img src="https://img.shields.io/badge/License-GPLv3-blue.svg" alt="License: GPL v3"> </a> <img src="https://img.shields.io/badge/python-3.8%2B-blue" alt="Python 3.8+"> <img src="https://img.shields.io/badge/PyTorch-1.10%2B-red" alt="PyTorch"> <a href="https://arxiv.org/abs/2401.04081"> <img src="https://img.shields.io/badge/arXiv-2401.04081-b31b1b" alt="arXiv"> </a> <a href="https://doi.org/10.1038/s41467-025-62121-1"><img src="https://zenodo.org/badge/DOI/10.1038/s41467-025-62121-1.svg" alt="DOI"></a> </p>
💻 Code implementation for the paper "Real-time prediction of intensive care unit patient acuity and therapy requirements using state-space modelling".
📝 The paper was accepted for publication in Nature Communications (📄 PDF).
📘 Overview
APRICOT-Mamba is a deep learning framework designed to continuously predict patient acuity in the ICU using Electronic Health Records (EHR). It extends the APRICOT family by integrating Mamba-based state space models and Transformer architectures, enabling real-time, interpretable predictions of patient stability and transitions.
This repository includes:
- Data preprocessing pipelines for retrospective and prospective ICU cohorts.
- Training and evaluation scripts for APRICOT-Mamba, APRICOT-Transformer, GRU, CatBoost, and Transformer baselines.
- Post-hoc analysis tools for calibration, feature attribution, and prospective validation.
📂 Project Structure
├── README.md
└── main/
    ├── analyses/ # Post-training analyses (calibration, performance, etc.)
    │   ├── calibration/
    │   ├── confusion_matrix/
    │   ├── integrated_gradients/ # Feature importance analysis
    │   └── ...
    ├── baseline_models/ # Baseline models (CatBoost, GRU, Transformer)
    │   ├── catboost/
    │   ├── gru/
    │   └── transformer/
    ├── datasets/ # Data loading and description
    │   ├── README.md
    │   ├── eicu/
    │   ├── mimic/
    │   └── uf/
    ├── models/ # Core model implementations (APRICOT-Mamba, APRICOT-T)
    │   ├── apricotm/ # APRICOT-Mamba model
    │   ├── apricott/ # APRICOT-Transformer model
    │   ├── model_comparison.py
    │   └── variables.py # Configuration variables
    ├── prospective_cohort/ # Prospective cohort data processing
    ├── retrospective_cohort/ # Retrospective cohort data processing
    ├── sofa_baseline/ # SOFA score baseline calculation
    └── summary/ # Summary generation scripts
⚙️ Requirements
Software
- Python ≥ 3.8
- Package Manager: pip or conda
- Key Python Libraries: pandas, numpy, scikit-learn, h5py, torch (PyTorch), optuna, catboost, captum
Install all dependencies with:
pip install -r requirements.txt
Note: For GPU support with PyTorch, refer to the official installation guide.
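Before training, it can help to confirm that the libraries listed above are importable in the active environment. The following is a minimal sketch, not part of the repository: the function name `missing_modules` and the list `REQUIRED_MODULES` are hypothetical, and the import names are assumed from the package names (for example, scikit-learn is imported as `sklearn`).

```python
from importlib.util import find_spec
from typing import Iterable, List

# Import names assumed for the libraries listed in this README.
# The import name can differ from the package name (scikit-learn -> sklearn).
REQUIRED_MODULES = [
    "pandas", "numpy", "sklearn", "h5py",
    "torch", "optuna", "catboost", "captum",
]


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules are importable.")
```

This only checks that each module can be located. It does not verify versions or CUDA availability.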
Hardware
- CPU: Multi-core processor
- RAM: ≥ 16GB
- GPU: NVIDIA GPU with CUDA support (recommended for training deep learning models)
🏥 Data Sources
This project utilizes EHR data from:
- eICU Collaborative Research Database: A multi-center ICU database with high granularity data for over 200,000 admissions. Access requires credentialed approval.
- MIMIC-IV: A large, freely accessible critical care database comprising de-identified health-related data associated with over 60,000 ICU admissions.
- University of Florida Health (UFH): Internal EHR data from UF Health. Note: This dataset is not publicly available at this time.
Data processing scripts are located in:
- main/datasets/
- main/retrospective_cohort/
- main/prospective_cohort/
The primary data format for training and evaluation is HDF5 (.h5). The script main/retrospective_cohort/5_build_hdf5.py demonstrates the structure of the final dataset.h5 file, which includes training, validation, external test, and temporal test sets with features (X), static data (static), and labels (y_main, y_trans).
Refer to main/datasets/README.md for detailed information on data sources and initial setup.
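To check what a generated dataset.h5 actually contains, one option is to walk the file and list every dataset with its shape. The sketch below is an illustration under assumptions: the helper names `summarize` and `summarize_file` are hypothetical, and no particular group names are assumed, since the exact layout is defined by 5_build_hdf5.py.

```python
from typing import Any, Dict, Mapping, Tuple


def summarize(node: Mapping[str, Any], prefix: str = "") -> Dict[str, Tuple[int, ...]]:
    """Map each dataset path to its shape, recursing through nested groups."""
    shapes: Dict[str, Tuple[int, ...]] = {}
    for name, item in node.items():
        path = f"{prefix}/{name}"
        if hasattr(item, "shape"):
            # Leaf: an array-like dataset such as X, static, y_main, or y_trans.
            shapes[path] = tuple(item.shape)
        else:
            # Nested group: descend one level.
            shapes.update(summarize(item, path))
    return shapes


def summarize_file(path: str) -> Dict[str, Tuple[int, ...]]:
    """Open an HDF5 file read-only and summarize its datasets."""
    import h5py  # imported lazily so summarize() works without h5py installed

    with h5py.File(path, "r") as handle:
        return summarize(handle)
```

Calling `summarize_file("dataset.h5")` returns a dictionary keyed by dataset path, which makes it easy to confirm that the feature, static, and label arrays of each split have matching first dimensions.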
🚀 Getting Started
1. Data Preparation
Process raw EHR data to generate the required dataset.h5 file:
python main/retrospective_cohort/5_build_hdf5.py
Note: Adjust paths and parameters as needed in the script.
2. Model Training
Navigate to the desired model directory and run the training script:
cd main/models/apricotm/
python 1_train.py
This script performs hyperparameter optimization using optuna, trains the model with PyTorch, and saves:
- Best hyperparameters: best_params.pkl
- Model weights: apricotm_weights.pth
- Model architecture: apricotm_architecture.pth
Training duration is approximately 2 hours on an NVIDIA A100 GPU.
Repeat the process for other models as needed.
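After training finishes, a quick check that the three files named above exist can save a failed evaluation run. This is a minimal sketch under assumptions: the helper names are hypothetical, the directory the script writes to is not stated here and must be supplied by you, and best_params.pkl is assumed to be a standard `pickle` file.

```python
import pickle
from pathlib import Path
from typing import Any, List

# File names taken from the training step described in this README.
EXPECTED_ARTIFACTS = [
    "best_params.pkl",
    "apricotm_weights.pth",
    "apricotm_architecture.pth",
]


def missing_artifacts(output_dir) -> List[str]:
    """Return the expected training outputs that are absent from output_dir."""
    base = Path(output_dir)
    return [name for name in EXPECTED_ARTIFACTS if not (base / name).is_file()]


def load_best_params(output_dir) -> Any:
    """Load the saved hyperparameters (assumed to be a pickled object)."""
    # Only unpickle files you created yourself: pickle can execute arbitrary code.
    with open(Path(output_dir) / "best_params.pkl", "rb") as handle:
        return pickle.load(handle)
```

The two .pth files are PyTorch artifacts and would be loaded with PyTorch itself. They are only checked for existence here.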
3. Model Evaluation
Evaluate the trained model on test sets:
python 2_eval.py
Evaluation results are saved in the results subdirectory within the model's directory.
4. Prospective Run
If prospective data is prepared, apply the trained model:
python 3_prospective.py
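Steps 2 to 4 can also be chained from a single script. The sketch below is one possible wrapper, not something shipped with the repository: `build_commands` and `run_pipeline` are hypothetical names, and it assumes that 2_eval.py and 3_prospective.py sit in the same model directory as 1_train.py.

```python
import subprocess
import sys
from pathlib import Path
from typing import List

# Script names from the steps above, in execution order.
STEPS = ["1_train.py", "2_eval.py", "3_prospective.py"]


def build_commands(include_prospective: bool = False) -> List[List[str]]:
    """Build one command per step, using the current Python interpreter."""
    steps = STEPS if include_prospective else STEPS[:2]
    return [[sys.executable, script] for script in steps]


def run_pipeline(model_dir: str = "main/models/apricotm",
                 include_prospective: bool = False) -> None:
    """Run training, evaluation, and optionally the prospective step in order."""
    for command in build_commands(include_prospective):
        # check=True raises CalledProcessError and stops at the first failure.
        subprocess.run(command, cwd=Path(model_dir), check=True)
```

The prospective step is off by default because it only applies when prospective data has been prepared.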
5. Post-hoc Analyses
Perform analyses on model predictions:
python main/analyses/calibration/1_calibration.py
python main/analyses/integrated_gradients/1_integrated_gradients_table.py
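For readers unfamiliar with calibration analysis, the idea is to compare predicted probabilities with observed event rates. The following is a generic illustration of a binned reliability table and expected calibration error. It is not the method implemented in 1_calibration.py, and the function names and the choice of ten equal-width bins are assumptions.

```python
from typing import List, Sequence, Tuple


def reliability_bins(y_true: Sequence[int], y_prob: Sequence[float],
                     n_bins: int = 10) -> List[Tuple[float, float, int]]:
    """Return (mean predicted probability, observed rate, count) per non-empty bin."""
    sums = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for label, prob in zip(y_true, y_prob):
        # Equal-width bins on [0, 1]; a probability of exactly 1.0 goes in the last bin.
        idx = min(int(prob * n_bins), n_bins - 1)
        sums[idx][0] += prob
        sums[idx][1] += label
        sums[idx][2] += 1
    return [(p / n, t / n, n) for p, t, n in sums if n > 0]


def expected_calibration_error(y_true: Sequence[int], y_prob: Sequence[float],
                               n_bins: int = 10) -> float:
    """Count-weighted mean absolute gap between predicted and observed rates."""
    bins = reliability_bins(y_true, y_prob, n_bins)
    total = sum(n for _, _, n in bins)
    if total == 0:
        raise ValueError("at least one prediction is required")
    return sum(n / total * abs(obs - pred) for pred, obs, n in bins)
```

A well-calibrated model has observed rates close to the mean predicted probability in every bin, which gives an error near zero.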
6. Expected Output
Results are generated under the user-defined home directory (HOME_DIR), time window (time_window), and model:
{HOME_DIR}/deepacu/main/{time_window}h_window/model/{model}/results
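The path template above can be assembled programmatically when collecting results from several runs. This is a small sketch: the function name `results_dir` is hypothetical, and the example values (a 4-hour window and a model directory called `apricotm`) are placeholders rather than values confirmed by this README.

```python
from pathlib import Path


def results_dir(home_dir: str, time_window: int, model: str) -> Path:
    """Build {HOME_DIR}/deepacu/main/{time_window}h_window/model/{model}/results."""
    return (
        Path(home_dir)
        / "deepacu"
        / "main"
        / f"{time_window}h_window"
        / "model"
        / model
        / "results"
    )


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(results_dir("/data", 4, "apricotm"))
```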
📊 Results & Performance
APRICOT-Mamba demonstrates high performance in predicting patient acuity, with AUROC scores comparable to state-of-the-art models. Detailed performance metrics, calibration plots, and feature importance analyses are available in the results directories and can be visualized using the provided analysis scripts.
🧑‍💻 Contributing
We welcome contributions from the community! To contribute:
- Fork the repository.
- Create a new branch for your feature or bugfix.
- Commit your changes with clear messages.
- Submit a pull request detailing your changes.
📄 License
This project is licensed under the GNU General Public License v3.0.
📚 Citation
If you use this work in your research, please cite:
@article{contreras2025real,
author = {Miguel Contreras and Brandon Silva and Benjamin Shickel and Andrea Davidson and Tezcan Ozrazgat-Baslanti and Yuanfang Ren and Ziyuan Guan and Jeremy Balch and Jiaqing Zhang and Sabyasachi Bandyopadhyay and Tyler Loftus and Kia Khezeli and Gloria Lipori and Jessica Sena and Subhash Nerella and Azra Bihorac and Parisa Rashidi},
title = {Real-time prediction of intensive care unit patient acuity and therapy requirements using state-space modelling},
journal = {Nature Communications},
year = {2025},
month = {July},
doi = {10.1038/s41467-025-62121-1},
}
📬 Contact
- Dr. Parisa Rashidi: parisa.rashidi@bme.ufl.edu
