Dordis

[ACM EuroSys'24] Dordis: Efficient Federated Learning with Dropout-Resilient Differential Privacy


<p align="center"> <img src="asset/dordis.png" height=400> </p> <p align="center"> <a href="https://arxiv.org/abs/2209.12528"><img src="https://img.shields.io/badge/arxiv-2209.12528-silver" alt="Paper"></a> <a href=""><img src="https://img.shields.io/badge/Pub-EuroSys'24-olive" alt="Pub"></a> <a href="https://github.com/SamuelGong/Dordis"><img src="https://img.shields.io/badge/-github-teal?logo=github" alt="github"></a> <a href="https://github.com/SamuelGong/Dordis/blob/main/LICENSE"><img src="https://img.shields.io/github/license/SamuelGong/Dordis?color=yellow" alt="License"></a> <img src="https://badges.toozhao.com/badges/01HCERSP3HP3DQDCZGBGN0BFYX/green.svg" alt="Count"/> </p> <h1 align="center">Dordis: Efficient Federated Learning with Dropout-Resilient Differential Privacy (ACM EuroSys 2024)</h1>

You can also explore the DeepWiki for this repository, which offers additional insights and helps with answering questions.

This repository contains the evaluation artifacts of our paper, Dordis: Efficient Federated Learning with Dropout-Resilient Differential Privacy, which will be presented at the ACM EuroSys'24 conference. You can find the paper on arXiv (https://arxiv.org/abs/2209.12528).

Zhifeng Jiang, Wei Wang, Ruichuan Chen

Keywords: Federated Learning, Distributed Differential Privacy, Client Dropout, Secure Aggregation, Pipeline

<details> <summary><b>Abstract (Tap here to expand)</b></summary>

Federated learning (FL) is increasingly deployed among multiple clients to train a shared model over decentralized data. To address privacy concerns, FL systems need to safeguard the clients' data from disclosure during training and control data leakage through trained models when exposed to untrusted domains. Distributed differential privacy (DP) offers an appealing solution in this regard as it achieves a balanced tradeoff between privacy and utility without a trusted server. However, existing distributed DP mechanisms are impractical in the presence of client dropout, resulting in poor privacy guarantees or degraded training accuracy. In addition, these mechanisms suffer from severe efficiency issues.

We present Dordis, a distributed differentially private FL framework that is highly efficient and resilient to client dropout. Specifically, we develop a novel add-then-remove scheme that enforces a required noise level precisely in each training round, even if some sampled clients drop out. This ensures that the privacy budget is utilized prudently, despite unpredictable client dynamics. To boost performance, Dordis operates as a distributed parallel architecture via encapsulating the communication and computation operations into stages. It automatically divides the global model aggregation into several chunk-aggregation tasks and pipelines them for optimal speedup. Large-scale deployment evaluations demonstrate that Dordis efficiently handles client dropout in various realistic FL scenarios, achieving the optimal privacy-utility tradeoff and accelerating training by up to 2.4× compared to existing solutions.

</details>

Overview
Prerequisites
- Install the necessary dependencies before anything else.
Simulation
- Learn how to run experiments in simulation mode.
Cluster Deployment
- Learn how to run experiments in a distributed manner.
Reproducing Experimental Results
- Learn how to reproduce the experiments in the paper.
Repo Structure
- What the project root folder contains.
Support
License
Citation

1. Overview

The system supports two modes of operation:

Simulation: This mode allows you to run experiments on a local machine or GPU server. It is primarily used for validating functionality, privacy, or utility.
Cluster Deployment: This mode enables you to run experiments on an AWS EC2 cluster funded by your own account. Alternatively, you can also run experiments on an existing cluster of Ubuntu nodes (currently undocumented). Cluster Deployment Mode is typically used for evaluating runtime performance.

2. Prerequisites

To work with the project, you need a Python 3 Anaconda environment set up on your host machine (an Ubuntu system is assumed) with specific dependencies installed. To simplify the setup process, we provide a shortcut:

# assumes you are working from the project folder
cd exploration/dev
bash standalone_install.sh
conda activate dordis

Note

Most of the dependencies will be installed in a newly created environment called dordis, minimizing interference with your original system setup.
However, please note that the redis-server application needs to be installed at the system level with sudo privileges, as done in Lines 49-52 of the standalone_install.sh script. If you do not have sudo privileges, you can follow the instructions provided here to install Redis without root access. In that case, you should comment out these lines before executing the command bash standalone_install.sh.
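
After the installation, a quick check that the tools mentioned above are on your PATH may save debugging time later. The following is a minimal sketch and not part of the repository: check_tools is a hypothetical helper name, and it relies only on the shell built-in command -v.

```shell
# Hypothetical helper (not shipped with the repository).
# Prints one line per command that is missing from PATH.
# Returns 0 if every command is present, 1 otherwise.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, check_tools conda redis-server reports any of the two that cannot be found and returns a non-zero status in that case.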

3. Simulation

3.1 Preparing Working Directory

Start by choosing a name for the working directory. For example, let's use ae-simulator in the following instructions.

# assumes you are working from the project folder
cd exploration
cp -r simulation_folder_template ae-simulator
cd ae-simulator

3.2 Run Experiments

To run an experiment with a specific configuration file in the background, follow these steps:

bash simulator_run.sh start_a_task [target folder]/[target configuration file]

The main log output is written to the following file:

[target folder]/[timestamp]/dordis-coordinator/log.txt

Note

When you execute the above command, the terminal prints the [timestamp] value, which identifies both the run and its output folder.
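
The log path can be assembled from the two values you already have at this point. This is a minimal sketch, assuming the directory layout shown above; coordinator_log is a hypothetical helper, not a script shipped with the repository.

```shell
# Hypothetical helper (not shipped with the repository).
# Usage: coordinator_log <target folder> <timestamp>
# Prints the path of the coordinator log for that run.
coordinator_log() {
  # Drop a trailing slash from the folder so the path has no "//".
  folder=${1%/}
  printf '%s/%s/dordis-coordinator/log.txt\n' "$folder" "$2"
}
```

With it, tail -f "$(coordinator_log [target folder] [timestamp])" follows the log while the task is running.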

You can use the simulator_run.sh script for task-related control. You don't need to remember the commands because the prompt will inform you whenever you start a task. Here are a few examples:

# To kill a task halfway
bash simulator_run.sh kill_a_task [target folder]/[timestamp]
# To analyze the output and generate insightful figures/tables
bash simulator_run.sh analyze_a_task [target folder]/[timestamp]

3.3 Batch Tasks to Run

The simulator also supports batching tasks to run. You can specify the tasks to run in the background by writing them in the batch_plan.txt file, as shown below:

[target folder]/[target configuration file]
[target folder]/[target configuration file]
[target folder]/[target configuration file]

To sequentially run the tasks in a batch, execute the following command:

bash batch_run.sh batch_plan.txt

The execution log will be available at batch_log.txt.
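
If a folder holds many configuration files, the plan file can be generated rather than typed by hand. The following is a sketch under assumptions: make_batch_plan is a hypothetical helper, and the '*.yml' pattern in the usage note is only an example, so substitute the extension your configuration files actually use.

```shell
# Hypothetical helper (not shipped with the repository).
# Usage: make_batch_plan <target folder> <glob pattern>
# Prints one "<target folder>/<configuration file>" line per matching file.
make_batch_plan() {
  folder=${1%/}
  pattern=$2
  # $pattern is left unquoted on purpose so the shell expands the glob.
  for f in "$folder"/$pattern; do
    # When nothing matches, the glob stays unexpanded; skip that case.
    [ -f "$f" ] && printf '%s\n' "$f"
  done
  return 0
}
```

For example, make_batch_plan [target folder] '*.yml' > batch_plan.txt writes the plan, which you can review before passing it to batch_run.sh.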

Note

To stop the batching logic halfway and prevent it from issuing any new tasks, you can use the command kill -9 [pid]. The [pid] value can be found at the beginning of the file batch_log.txt.
If you want to stop a currently running task halfway, you can kill it using the command bash simulator_run.sh kill_a_task [...], as explained in the previous subsection. The information needed to kill the job will also be available in the log.

4. Cluster Deployment

You can initiate the cluster deployment process either from your local host machine (ensuring a stable network connection) or from a dedicated remote node specifically designed for coordination purposes (we thus call it the coordinator node). It is important to note that the remote node does not necessarily need to be a powerful machine.

4.1 Install and Configure AWS CLI

Before proceeding, please ensure that you have an AWS account. Additionally, on the coordinator node, it is essential to have the latest version of aws-cli installed and properly configured with the necessary credentials. This configuration will allow us to conveniently manage all the nodes in the cluster remotely using command-line tools.

Reference

Install AWS CLI.

Example commands for installing on Linux x86 (64-bit):

# You can work from any directory, e.g., at your home directory
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
sudo apt install unzip
unzip awscliv2.zip
sudo ./aws/install

Configure AWS CLI.
- Example command for configuring one's AWS CLI:
```
# You can work from any directory, e.g., at your home directory
aws configure
```
You will be prompted to enter your AWS Access Key ID, AWS Secret Access Key, Default region name, and Default output format. Provide the required information as prompted.
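
After configuring, you may want to confirm that the CLI is installed and that the credentials are accepted before allocating any nodes. The following is a minimal sketch: aws_ready is a hypothetical helper name, while aws sts get-caller-identity is a standard AWS CLI command that returns the identity behind the configured credentials and fails when they are not usable.

```shell
# Hypothetical helper (not shipped with the repository).
# Returns 0 if the AWS CLI is installed and the identity call succeeds,
# 1 if the CLI is missing, 2 if the identity call fails.
aws_ready() {
  if ! command -v aws >/dev/null 2>&1; then
    echo "aws CLI not found on PATH" >&2
    return 1
  fi
  if ! aws sts get-caller-identity >/dev/null 2>&1; then
    echo "aws is installed, but the identity call failed" >&2
    return 2
  fi
  return 0
}
```

Running aws_ready && echo "AWS CLI is ready" on the coordinator node gives a quick yes/no answer. A return value of 2 can also indicate a network problem rather than wrong credentials.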

4.2 Configure and Allocate a Cluster

To begin, let's choose a name for the working directory. For example, we can use ae-cluster as the name throughout this section (please note that this name should not be confused with the above-mentioned simulator folder).

# assumes you are working from the project folder
cd exploration
cp -r cluster_folder_template ae-cluster
cd ae-cluster

# make some modifications that suit your need/budget
# 1. EC2-related
# relevant key:
#   BlockDeviceMappings/Ebs/VolumeSize: how large is the storage of each node
#   KeyName: the path to the key file (relative to ~/.ssh/) you plan to use to 
#
