<p align="center"> <img src="https://i.imgur.com/waxVImv.png" alt="Oryx Video-ChatGPT"> </p>  <h1 align="left" style="margin:24px 0;"> MIRA: A Novel Framework for Fusing Modalities in Medical RAG </h1>  <p align="center"> <img src="https://i.imgur.com/waxVImv.png" alt="Oryx Video-ChatGPT"> </p> <div align="center">

</div>

Authors: Jinhong Wang*, Tajamul Ashraf*, Zongyan Han, Jorma Laaksonen, Rao Muhammad Anwer

* Equal contribution, Correspondence: Tajamul Ashraf

Updates

[2025-07-09]: 🎉 MIRA paper accepted at ACM Multimedia 2025
[2025-06-02]: MIRA paper published on arXiv:2507.07902
[2025-05-29]: Released evaluation & deployment code for MIRA
[2025-05-22]: Published the MIRA dataset on Hugging Face

Introduction

Multimodal Large Language Models (MLLMs) have significantly advanced AI-assisted medical diagnosis, but they often generate factually inconsistent responses that deviate from established medical knowledge. Retrieval-Augmented Generation (RAG) enhances factual accuracy by integrating external sources, but it presents two key challenges. First, insufficient retrieval can miss critical information, whereas excessive retrieval can introduce irrelevant or misleading content, disrupting model output. Second, even when the model initially provides correct answers, over-reliance on retrieved data can lead to factual errors.

What is MIRA?

We introduce the Multimodal Intelligent Retrieval and Augmentation (MIRA) framework, designed to optimize factual accuracy in MLLMs. MIRA consists of two key components: (1) a calibrated Rethinking and Rearrangement module that dynamically adjusts the number of retrieved contexts to manage factual risk, and (2) a medical RAG framework that integrates image embeddings and a medical knowledge base with a query-rewrite module for efficient multimodal reasoning. Together, these let the model combine its inherent knowledge with external references effectively. Our evaluation on publicly available medical VQA and report generation benchmarks demonstrates that MIRA substantially enhances factual accuracy and overall performance, achieving new state-of-the-art results.
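To make the idea of adjusting the number of retrieved contexts more concrete, here is a minimal Python sketch. It is not MIRA's implementation: the `RetrievedContext` class, the `select_contexts` function, the relevance scores, and the `min_score` and `max_k` parameters are all hypothetical names chosen for illustration. The sketch assumes each retrieved passage already carries a relevance score, and it simply drops weak matches and caps how many passages are passed to the model.

```python
from dataclasses import dataclass


@dataclass
class RetrievedContext:
    text: str
    score: float  # hypothetical relevance score in [0, 1]


def select_contexts(candidates, min_score=0.5, max_k=3):
    """Rank contexts by score, drop those below min_score, keep at most max_k.

    Illustrative stand-in only; the actual MIRA module is described in the paper.
    """
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    kept = [c for c in ranked if c.score >= min_score]
    return kept[:max_k]


# Made-up example passages and scores, for demonstration only.
candidates = [
    RetrievedContext("Report A: right lower lobe consolidation.", 0.91),
    RetrievedContext("Report B: unrelated knee radiograph.", 0.22),
    RetrievedContext("Guideline: pneumonia imaging findings.", 0.74),
]
selected = select_contexts(candidates)
print([c.score for c in selected])  # prints [0.91, 0.74]
```

Raising `min_score` or lowering `max_k` in this sketch trades recall for a smaller chance of passing irrelevant passages to the model, which is the tension between insufficient and excessive retrieval described in the Introduction.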

Evaluation Scripts

To be released...

📝 Citation

If you use MIRA in your research, please cite the following paper:

@misc{mira,
      title={MIRA: A Novel Framework for Fusing Modalities in Medical RAG}, 
      author={Jinhong Wang and Tajamul Ashraf and Zongyan Han and Jorma Laaksonen and Rao Muhammad Anwer},
      year={2025},
      eprint={2507.07902},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2507.07902}, 
}
