FedMLLM

FedMLLM: Federated Fine-tuning MLLM on Multimodal Heterogeneity Data (arXiv:2411.14717)

YOCO

You Only Communicate Once: One-shot Federated Learning for Multimodal Large Language Models

🏆 Accepted at NeurIPS 2025

TODO

[x] Release YOCO code: YOCO Implementation

Directory Structure

<details> <summary>Click to expand / collapse</summary>

.
└── root/
    ├── data/
    │   ├── hateful_memes/
    │   │   ├── minicpmv_data/
    │   │   │   ├── modality-missing/
    │   │   │   │   ├── mrate-0.3/
    │   │   │   │   │   └── partition-alpha0.5-clt10
    │   │   │   │   ├── mrate-0.4/
    │   │   │   │   │   └── partition-alpha0.5-clt10
    │   │   │   │   └── mrate-0.5/
    │   │   │   │       └── partition-alpha0.5-clt10
    │   │   │   ├── modality-single/
    │   │   │   │   ├── image-3/
    │   │   │   │   │   └── partition-alpha0.5-clt10
    │   │   │   │   ├── image-5/
    │   │   │   │   │   └── partition-alpha0.5-clt10
    │   │   │   │   └── image-7/
    │   │   │   │       └── partition-alpha0.5-clt10
    │   │   │   ├── modality-mix/
    │   │   │   │   ├── qrate-0.2/
    │   │   │   │   │   └── partition-alpha0.5-clt10
    │   │   │   │   ├── qrate-0.3/
    │   │   │   │   │   └── partition-alpha0.5-clt10
    │   │   │   │   └── qrate-0.4/
    │   │   │   │       └── partition-alpha0.5-clt10
    │   │   │   ├── partition-alpha5.0-clt10
    │   │   │   ├── partition-alpha1.0-clt10
    │   │   │   └── partition-alpha0.5-clt10
    │   │   └── raw_data/ # Extracted files of the downloaded dataset
    │   │       ├── partition-alpha5.0-clt10
    │   │       ├── partition-alpha1.0-clt10
    │   │       └── partition-alpha0.5-clt10
    │   └── crisis-mmd # Consistent with the *hateful_memes* folder structure.
    └── code/
        ├── data_gen/
        │   ├── data_partition_crisismmd.py
        │   ├── data_partition_hateful.py
        │   ├── data_process_medalpaca.py
        │   ├── data_process_vqarad.py
        │   ├── gen_data_crisismmd_missing_aug.py
        │   ├── gen_data_crisismmd_missing.py
        │   ├── gen_data_crisismmd_mix_aug.py
        │   ├── gen_data_crisismmd_mix.py
        │   ├── gen_data_crisismmd_single_aug.py
        │   ├── gen_data_crisismmd_single.py
        │   ├── gen_data_crisismmd.py
        │   ├── gen_data_hateful_missing_aug.py
        │   ├── gen_data_hateful_missing.py
        │   ├── gen_data_hateful_mix_aug.py
        │   ├── gen_data_hateful_mix.py
        │   ├── gen_data_hateful_single_aug.py
        │   ├── gen_data_hateful_single.py
        │   ├── gen_data_hateful.py
        │   ├── gen_data_medical_vtqa_single.py
        │   └── gen_data_medical_vtqa_mix.py
        ├── finetune/
        │   ├── federated_learning/
        │   │   ├── __init__.py
        │   │   ├── fed_global.py
        │   │   └── fed_utils.py
        │   ├── __init__.py
        │   ├── dataset.py
        │   ├── finetune_lora.sh
        │   ├── finetune.py
        │   └── trainer.py
        ├── eval_crisismmd_aug.py
        ├── eval_crisismmd.py
        ├── eval_hateful_aug.py
        ├── eval_hateful.py
        ├── eval_medical_gpt_slake.py
        ├── eval_medical_gpt.py
        ├── eval_medical_slake.py
        ├── eval_medical.py
        ├── vqa_eval_slake.py
        ├── vqa_eval.py
        ├── vqa_slake.py
        ├── vqa.py
        └── start.sh

</details>
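
The `partition-alpha0.5-clt10` folder names look like they encode a Dirichlet concentration parameter (`alpha`) and a client count (`clt`). That reading is an assumption, as are the function name `dirichlet_partition` and its parameters. The sketch below shows how a label-skewed split of this kind is commonly built. It is not the code in `data_partition_hateful.py` or `data_partition_crisismmd.py`.

```python
import random
from collections import defaultdict


def dirichlet_partition(labels, num_clients=10, alpha=0.5, seed=0):
    """Split sample indices across clients with per-class Dirichlet(alpha) shares.

    Hypothetical helper: smaller alpha gives more skewed (non-IID) clients.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    clients = [[] for _ in range(num_clients)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n = len(indices)

        # A Dirichlet sample is a set of independent Gamma(alpha, 1) draws, normalized.
        weights = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(weights)

        # Turn the shares into cumulative cut points over this class's indices.
        start, cumulative = 0, 0.0
        for client_id, weight in enumerate(weights):
            cumulative += weight / total
            if client_id == num_clients - 1:
                end = n  # the last client takes whatever remains
            else:
                end = min(n, round(cumulative * n))
            clients[client_id].extend(indices[start:end])
            start = end
    return clients


if __name__ == "__main__":
    toy_labels = [i % 2 for i in range(100)]  # two balanced toy classes
    parts = dirichlet_partition(toy_labels, num_clients=10, alpha=0.5)
    print([len(p) for p in parts])
```

With a large `alpha` such as 5.0 the per-client class shares move toward uniform. With 0.5 a few clients tend to hold most of a class.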

Install

conda create -n FedMLLM python=3.10 -y
conda activate FedMLLM
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
pip install -r requirements.txt
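
After installation, a quick sanity check can confirm that the interpreter version and the PyTorch build match what the commands above aim for. This is a minimal sketch. The helper name `check_environment` is hypothetical and not part of this repo.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 10)):
    """Report whether Python is recent enough and whether torch can see CUDA."""
    report = {
        "python_ok": sys.version_info >= min_python,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": None,  # stays None when torch is not installed
    }
    if report["torch_installed"]:
        import torch

        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    print(check_environment())
```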

Dataset

Download dataset

Hateful-Memes download
CrisisMMD download
VQA-RAD download
MedAlpaca download
SLAKE download

Dataset processing

cd data_gen/

python data_partition_crisismmd.py
python gen_data_crisismmd.py # modality-aligned scenario
python gen_data_crisismmd_missing.py # missing-modality scenario
python gen_data_crisismmd_missing_aug.py # missing-modality scenario with the prompt strategy
python gen_data_crisismmd_single.py # cross-modality scenario
python gen_data_crisismmd_mix.py # hybrid-modality scenario

python data_process_medalpaca.py
python data_process_vqarad.py
python gen_data_medical_vtqa_mix.py
python gen_data_medical_vtqa_single.py
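
The CrisisMMD commands above can also be chained from one small driver that stops at the first failing step. This is a sketch under assumptions: the names `CRISISMMD_STEPS` and `run_pipeline` are hypothetical, and the order simply mirrors the listing above, with the partition script first.

```python
import subprocess
import sys

# Script names taken from the commands above; the ordering is an assumption.
CRISISMMD_STEPS = [
    "data_partition_crisismmd.py",
    "gen_data_crisismmd.py",
    "gen_data_crisismmd_missing.py",
    "gen_data_crisismmd_missing_aug.py",
    "gen_data_crisismmd_single.py",
    "gen_data_crisismmd_mix.py",
]


def run_pipeline(scripts, cwd="data_gen", dry_run=True):
    """Run each script in order inside `cwd`; with dry_run, only print the commands."""
    commands = [[sys.executable, script] for script in scripts]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(cmd, cwd=cwd, check=True)
    return commands


if __name__ == "__main__":
    run_pipeline(CRISISMMD_STEPS, dry_run=True)
```

Set `dry_run=False` to execute the steps for real. `cwd` should point to the folder that holds the scripts.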

Training and Testing

sh start.sh

Citation

@article{xu2024fedmllm,
  title={FedMLLM: Federated Fine-tuning MLLM on Multimodal Heterogeneity Data},
  author={Xu, Binqian and Shu, Xiangbo and Mei, Haiyang and Xie, Guosen and Fernando, Basura and Tang, Jinhui},
  journal={arXiv preprint arXiv:2411.14717},
  year={2024}
}

@inproceedings{xu2025you,
  title={You Only Communicate Once: One-shot Federated Low-Rank Adaptation of MLLM},
  author={Xu, Binqian and Mei, Haiyang and Bai, Zechen and Gong, Jinjin and Yan, Rui and Xie, Guo-Sen and Yao, Yazhou and Fernando, Basura and Shu, Xiangbo},
  booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
  year={2025}
}

Acknowledgements

This repo is based on MiniCPM-V, OpenFedLLM, and PeFoMed. Thanks to the original authors for their work!
