FedModule
A modular federated learning framework supporting various types of FL. A universal federated learning framework, free to switch between thread and process modes.
<img src="./doc/pic/header.png" style="width:800px"></img>
keywords:
federated-learning,asynchronous,synchronous,semi-asynchronous,personalized
Brief
- One codebase adapts to multiple operating modes: thread, process, MPMT, and distributed.
- One-click start; change the experimental environment without modifying the code.
- Supports random seeds for reproducible experiments.
- A redesigned, modular FL framework with high extensibility, supporting the mainstream federated learning paradigms: synchronous, asynchronous, semi-asynchronous, personalized, etc.
- With wandb, experimental data is synchronized to the cloud, avoiding data loss.

For more project information, please see the wiki.
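The reproducibility claim above relies on the standard seeding pattern. As a minimal sketch (not the framework's own code), seeding Python's `random` module and PyTorch before each run makes repeated runs produce identical random draws:

```python
import random
import torch

def set_seed(seed: int) -> None:
    # Seed every RNG that training touches so repeated runs match.
    random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)

set_seed(42)
a = torch.rand(3)
set_seed(42)
b = torch.rand(3)  # identical to a, since the RNG state was reset
```

In practice the framework would read the seed from the experiment's configuration file rather than hard-coding it.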
Requirements
Python 3.8 + PyTorch + Linux
It has also been validated on macOS.
Both single-GPU and multi-GPU setups are supported.
Getting Started
Environment
Install the dependencies into an existing Python environment with pip:
pip install -r requirements.txt
or create a new Python environment with conda:
conda env create -f environment.yml
Experiments
You can run python main.py (the entry point in the fl directory) directly. The program automatically reads the config.json file in the root directory and stores the results, together with the configuration file, under the specified path in results.
You can also specify a configuration file explicitly: python main.py ../../config.json. Note that the path to config.json is resolved relative to main.py.
The config folder in the root directory provides configuration files for algorithms proposed in papers. The following algorithms are currently implemented:
Centralized Learning
FedAvg
FedAsync
FedProx
FedAT
FedLC
FedDL
M-Step AsyncFL
FedBuff
FedAdam
FedNova
FedBN
TWAFL
More methods can be found in the wiki.
Docker
You can now pull and run a Docker image directly; the commands are as follows:
docker pull desperadoccy/async-fl
docker run -it async-fl config/FedAvg-config.json
Similarly, it supports passing a config file path as a parameter. You can also build the Docker image yourself:
cd docker
docker build -t async-fl .
docker run -it async-fl config/FedAvg-config.json
Features
- [x] Asynchronous Federated Learning
- [x] Support model and dataset replacement
- [x] Support scheduling algorithm replacement
- [x] Support aggregation algorithm replacement
- [x] Support loss function replacement
- [x] Support client replacement
- [x] Synchronous federated learning
- [x] Semi-asynchronous federated learning
- [x] Provide test loss information
- [x] Custom label heterogeneity
- [x] Custom data heterogeneity
- [x] Support Dirichlet distribution
- [x] wandb visualization
- [x] Support for multiple GPUs
- [x] Docker deployment
- [x] Process thread switching
Add new methods
Please refer to the wiki
Existing Bugs
Currently there is a core issue in the framework: communication between clients and the server is implemented with multiprocessing queues. When a CUDA tensor is put on a queue and retrieved by another thread, it can cause a memory leak and may crash the program.
This bug stems from the interaction between PyTorch and multiprocessing queues. The current workaround is to put only non-CUDA (CPU) tensors on the queue and convert them back to CUDA tensors during aggregation. Therefore, when adding an aggregation algorithm, code like the following is needed:
import torch

updated_parameters = {}
for key, var in client_weights.items():
    updated_parameters[key] = var.clone()
    if torch.cuda.is_available():
        updated_parameters[key] = updated_parameters[key].cuda()
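On the sending side, the same workaround means moving weights to CPU before putting them on the queue. A minimal sketch of this pattern (the model and queue here are illustrative stand-ins, not the framework's actual classes):

```python
import multiprocessing as mp

import torch
import torch.nn as nn

model = nn.Linear(4, 2)
queue = mp.Queue()

# Workaround: detach every tensor to CPU before enqueueing, so no CUDA
# tensor ever crosses the queue; the aggregator moves them back to GPU.
cpu_weights = {k: v.detach().cpu() for k, v in model.state_dict().items()}
queue.put(cpu_weights)

received = queue.get()  # safe to hand to the aggregation code above
```

Only the receiving (aggregation) side then needs the `.cuda()` conversion shown above.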
Contributors
<!-- readme: contributors -start --> <table> <tr> <td align="center"> <a href="https://github.com/desperadoccy"> <img src="https://avatars.githubusercontent.com/u/44546125?v=4" width="100;" alt="desperadoccy"/> <br /> <sub><b>Desperadoccy</b></sub> </a> </td> <td align="center"> <a href="https://github.com/jzj007"> <img src="https://avatars.githubusercontent.com/u/73173984?v=4" width="100;" alt="jzj007"/> <br /> <sub><b>Jzj007</b></sub> </a> </td> <td align="center"> <a href="https://github.com/cauchyguo"> <img src="https://avatars.githubusercontent.com/u/41313807?v=4" width="100;" alt="cauchyguo"/> <br /> <sub><b>Cauchy</b></sub> </a> </td></tr> </table> <!-- readme: contributors -end -->
Citation
Please cite our paper in your publications if this code helps your research.
@misc{chen2024fedmodulemodularfederatedlearning,
  title={FedModule: A Modular Federated Learning Framework},
  author={Chuyi Chen and Zhe Zhang and Yanchao Zhao},
  year={2024},
  eprint={2409.04849},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2409.04849},
}
Contact us
We created a QQ group to discuss the asyncFL framework and FL; everyone is welcome to join.
Group number: 895896624
QQ: 527707607
email: desperado@qq.com
Suggestions for the project are welcome. If you'd like to contribute, please contact us.