FedModule
A modular federated learning framework supporting various types of FL. A universal federated learning framework, free to switch between thread and process modes.
<img src="./doc/pic/header.png" style="width:800px"></img>
keywords:
federated-learning,asynchronous,synchronous,semi-asynchronous,personalized
Brief
- One codebase adapts to multiple operating modes: thread, process, MPMT, and distributed.
- One-click start; change the experimental environment without modifying the code.
- Supports random seeds for reproducible experiments.
- A redesigned, modular FL framework with high extensibility, supporting the mainstream federated learning paradigms: synchronous, asynchronous, semi-asynchronous, personalized, etc.
- With wandb, experimental data is synchronized to the cloud, avoiding data loss.

For more project information, please see the wiki.
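The reproducibility claim above relies on the standard seeding pattern. As a minimal sketch (not the framework's own code), seeding Python's `random` module and PyTorch before each run makes repeated runs produce identical random draws:

```python
import random
import torch

def set_seed(seed: int) -> None:
    # Seed every RNG that training touches so repeated runs match.
    random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)

set_seed(42)
a = torch.rand(3)
set_seed(42)
b = torch.rand(3)  # identical to a, since the RNG state was reset
```

In practice the framework would read the seed from the experiment's configuration file rather than hard-coding it.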
Requirements
Python 3.8 + PyTorch + Linux
It has also been validated on macOS.
Both single-GPU and multi-GPU setups are supported.
Getting Started
Environment
Install the dependencies into an existing Python environment with pip:
pip install -r requirements.txt
or create a new Python environment with conda:
conda env create -f environment.yml
Experiments
You can run python main.py (the entry point in the fl directory) directly. The program automatically reads the config.json file in the root directory and stores the results, together with the configuration file, under the specified path in results.
You can also specify a configuration file explicitly: python main.py ../../config.json. Note that the path to config.json is resolved relative to main.py.
The config folder in the root directory provides configuration files for algorithms proposed in papers. The following algorithms are currently implemented:
Centralized Learning
FedAvg
FedAsync
FedProx
FedAT
FedLC
FedDL
M-Step AsyncFL
FedBuff
FedAdam
FedNova
FedBN
TWAFL
More methods can be found in the wiki.
Docker
You can now pull and run a Docker image directly; the commands are as follows:
docker pull desperadoccy/async-fl
docker run -it async-fl config/FedAvg-config.json
Similarly, it supports passing a config file path as a parameter. You can also build the Docker image yourself:
cd docker
docker build -t async-fl .
docker run -it async-fl config/FedAvg-config.json
Features
- [x] Asynchronous Federated Learning
- [x] Support model and dataset replacement
- [x] Support scheduling algorithm replacement
- [x] Support aggregation algorithm replacement
- [x] Support loss function replacement
- [x] Support client replacement
- [x] Synchronous federated learning
- [x] Semi-asynchronous federated learning
- [x] Provide test loss information
- [x] Custom label heterogeneity
- [x] Custom data heterogeneity
- [x] Support Dirichlet distribution
- [x] wandb visualization
- [x] Support for multiple GPUs
- [x] Docker deployment
- [x] Process thread switching
Add new methods
Please refer to the wiki
Existing Bugs
Currently there is a core issue in the framework: communication between clients and the server is implemented with multiprocessing queues. When a CUDA tensor is put on a queue and retrieved by another thread, it can cause a memory leak and may crash the program.
This bug stems from the interaction between PyTorch and multiprocessing queues. The current workaround is to put only non-CUDA (CPU) tensors on the queue and convert them back to CUDA tensors during aggregation. Therefore, when adding an aggregation algorithm, code like the following is needed:
import torch

updated_parameters = {}
for key, var in client_weights.items():
    updated_parameters[key] = var.clone()
    if torch.cuda.is_available():
        updated_parameters[key] = updated_parameters[key].cuda()
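On the sending side, the same workaround means moving weights to CPU before putting them on the queue. A minimal sketch of this pattern (the model and queue here are illustrative stand-ins, not the framework's actual classes):

```python
import multiprocessing as mp

import torch
import torch.nn as nn

model = nn.Linear(4, 2)
queue = mp.Queue()

# Workaround: detach every tensor to CPU before enqueueing, so no CUDA
# tensor ever crosses the queue; the aggregator moves them back to GPU.
cpu_weights = {k: v.detach().cpu() for k, v in model.state_dict().items()}
queue.put(cpu_weights)

received = queue.get()  # safe to hand to the aggregation code above
```

Only the receiving (aggregation) side then needs the `.cuda()` conversion shown above.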
Contributors
<!-- readme: contributors -start --> <table> <tr> <td align="center"> <a href="https://github.com/desperadoccy"> <img src="https://avatars.githubusercontent.com/u/44546125?v=4" width="100;" alt="desperadoccy"/> <br /> <sub><b>Desperadoccy</b></sub> </a> </td> <td align="center"> <a href="https://github.com/jzj007"> <img src="https://avatars.githubusercontent.com/u/73173984?v=4" width="100;" alt="jzj007"/> <br /> <sub><b>Jzj007</b></sub> </a> </td> <td align="center"> <a href="https://github.com/cauchyguo"> <img src="https://avatars.githubusercontent.com/u/41313807?v=4" width="100;" alt="cauchyguo"/> <br /> <sub><b>Cauchy</b></sub> </a> </td></tr> </table> <!-- readme: contributors -end -->
Citation
Please cite our paper in your publications if this code helps your research.
@misc{chen2024fedmodulemodularfederatedlearning,
  title={FedModule: A Modular Federated Learning Framework},
  author={Chuyi Chen and Zhe Zhang and Yanchao Zhao},
  year={2024},
  eprint={2409.04849},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2409.04849},
}
Contact us
We created a QQ group to discuss the asyncFL framework and FL; everyone is welcome to join.
Group number: 895896624
QQ: 527707607
email: desperado@qq.com
Suggestions for the project are welcome. If you'd like to contribute, please contact us.