
CCSD

Combinatorial Complex Score-based Diffusion model using stochastic differential equations


CCSD - Combinatorial Complex Score-based Diffusion Modelling through Stochastic Differential Equations

Code style: black pypi version Documentation Status visitors Downloads Python versions Test Lint Codecov Imports: isort License: MIT

<p align="center"><img src="https://github.com/AdrienC21/CCSD/blob/main/logo.png?raw=true" alt="CCSD_logo" width="600"/></p>

CCSD is a sophisticated score-based diffusion model designed to generate Combinatorial Complexes using Stochastic Differential Equations. This cutting-edge approach enables the generation of complex objects with higher-order structures and relations, thereby enhancing our ability to learn underlying distributions and produce more realistic objects.


Introduction

Complex object generation is a challenging problem with applications in various fields such as drug discovery. The CCSD model offers a novel approach to this problem by leveraging Diffusion Models and Stochastic Differential Equations to generate Combinatorial Complexes (CC). This topological structure generalizes the different mathematical structures used in Topological/Geometric Deep Learning to represent complex objects with higher-order structures and relations. Integrating the higher-order domain during generation improves the learning of the underlying data distribution and thus allows for better data generation.
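To make the notion of a Combinatorial Complex more concrete, here is a minimal pure-Python sketch. It assumes the usual definition: a set of non-empty cells, each with a non-negative rank, where a cell contained in another cell never has a higher rank. The class `ToyCombinatorialComplex` and its methods are hypothetical names used for illustration only; they are not the CCSD or TopoNetX API. The example encodes a three-atom ring with atoms as rank-0 cells, bonds as rank-1 cells and the ring itself as a rank-2 cell.

```python
class ToyCombinatorialComplex:
    """Illustrative combinatorial complex: cells are vertex sets with a rank."""

    def __init__(self):
        # Maps each cell (a frozenset of vertices) to its rank.
        self.rank = {}

    def add_cell(self, vertices, rank):
        cell = frozenset(vertices)
        if not cell:
            raise ValueError("a cell must contain at least one vertex")
        if rank < 0:
            raise ValueError("ranks must be non-negative")
        # The rank function must be order-preserving:
        # if x is contained in y, then rank(x) <= rank(y).
        for other, other_rank in self.rank.items():
            if other <= cell and other_rank > rank:
                raise ValueError("a sub-cell already has a higher rank")
            if cell <= other and rank > other_rank:
                raise ValueError("a containing cell already has a lower rank")
        # Every vertex of a cell is itself a cell (rank 0 unless set otherwise).
        for vertex in cell:
            self.rank.setdefault(frozenset([vertex]), 0)
        self.rank[cell] = rank

    def cells(self, rank):
        """Return the cells of a given rank as sorted lists of vertices."""
        return sorted(sorted(c) for c, r in self.rank.items() if r == rank)


# A three-atom ring: atoms (rank 0), bonds (rank 1), the ring (rank 2).
cc = ToyCombinatorialComplex()
for atom in (0, 1, 2):
    cc.add_cell([atom], rank=0)
for bond in ([0, 1], [1, 2], [0, 2]):
    cc.add_cell(bond, rank=1)
cc.add_cell([0, 1, 2], rank=2)

print("bonds:", cc.cells(1))
print("rings:", cc.cells(2))
```

A plain graph only uses ranks 0 and 1; the rank-2 cell is what lets the structure carry higher-order relations such as rings.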

If you find this project interesting, we would appreciate your support by leaving a star ⭐ on this GitHub repository.

Code still in Alpha version!

Why CCSD?

CCSD stands out from traditional complex object generation models due to the following key advantages:

  • Combinatorial Complexes: The model generates Combinatorial Complexes, enabling the synthesis of complex objects with rich structures and relationships.

  • Score-Based Diffusion: CCSD utilizes score-based diffusion techniques, allowing for efficient, high-quality and state-of-the-art complex object generation.

  • Enhanced "Realism": By incorporating higher-order structures, the generated objects are more representative of the underlying data distribution.

This repository is also thoroughly documented and commented, which makes the code easier to use, understand, deploy, and extend.
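As a toy illustration of the score-based diffusion idea listed above, the sketch below samples from a one-dimensional Gaussian by integrating a reverse-time SDE with the Euler-Maruyama method. It is not CCSD's implementation: the data distribution, the constant noise rate `beta`, and all function names are assumptions chosen so that the score is known in closed form. In a real model, a trained neural network would replace the `score` function.

```python
import math
import random


def marginal_mean_var(t, data_mean, data_std, beta):
    """Mean and variance at time t of dx = -0.5*beta*x dt + sqrt(beta) dW,
    started from x(0) ~ N(data_mean, data_std**2)."""
    decay = math.exp(-beta * t)
    mean = data_mean * math.sqrt(decay)
    var = data_std ** 2 * decay + (1.0 - decay)
    return mean, var


def score(x, t, data_mean, data_std, beta):
    """Exact score d/dx log p_t(x) of the Gaussian marginal at time t."""
    mean, var = marginal_mean_var(t, data_mean, data_std, beta)
    return -(x - mean) / var


def reverse_sde_sample(rng, data_mean=2.0, data_std=0.5, beta=10.0,
                       t_end=1.0, n_steps=500):
    """Draw one sample by integrating the reverse-time SDE from t_end to 0."""
    dt = t_end / n_steps
    x = rng.gauss(0.0, 1.0)  # the prior at t_end is close to N(0, 1)
    for i in range(n_steps):
        t = t_end - i * dt
        # Reverse drift: f(x, t) - g(t)^2 * score, with f = -0.5*beta*x, g^2 = beta.
        drift = -0.5 * beta * x - beta * score(x, t, data_mean, data_std, beta)
        # Step backward in time, adding fresh Gaussian noise.
        x = x - drift * dt + math.sqrt(beta * dt) * rng.gauss(0.0, 1.0)
    return x


rng = random.Random(0)
samples = [reverse_sde_sample(rng) for _ in range(200)]
sample_mean = sum(samples) / len(samples)
print(f"sample mean: {sample_mean:.2f} (data mean is 2.0)")
```

Starting from pure noise, the samples drift back toward the data distribution because the score points toward regions of higher probability at every noise level.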

Author

The research has been conducted by Adrien Carrel as part of his requirements for the MSc degree in Advanced Computing of Imperial College London, United Kingdom, and his requirements for the MEng in Applied Mathematics (Diplôme d'Ingénieur) at CentraleSupélec, France.

<a href="https://linkedin.com/in/adrien.carrel/" target="_blank"><img align="center" src="https://cdn.jsdelivr.net/npm/simple-icons@3.0.1/icons/linkedin.svg" alt="linkedin" height="39" width="52"/></a> <a href="https://www.instagram.com/adrien.carrel" target="_blank"><img align="center" src="https://cdn.jsdelivr.net/npm/simple-icons@3.0.1/icons/instagram.svg" alt="instagram" height="39" width="52" /></a> <a href="https://github.com/AdrienC21/" target="_blank"><img align="center" src="https://cdn.jsdelivr.net/npm/simple-icons@3.0.1/icons/github.svg" alt="github" height="39" width="52" /></a>

This project has been supervised by Dr. Tolga Birdal, Assistant Professor (Lecturer) in the Department of Computing of Imperial College London.

Contributions

We welcome new contributors of all backgrounds and programming levels who would like to contribute to the fields of diffusion models and topological deep learning. Feel free to suggest new ideas, submit pull requests, etc.

Feel free to check our Code of Conduct if you wish to contribute.

Installation

If you encounter an error during the installation, please refer to the section Commons errors below. If you are creating an Ubuntu instance on a Public Cloud service to train/sample from the model, you may want to use the post_installation_script.sh script provided to automate the process (just modify the Git configurations section inside the script with your details).

Using pip

To get started with CCSD, you can install the package using pip by typing the command:

pip install ccsd

Manually

If you encounter an error with pip, if you want to use the latest version, or if you prefer the command line interface, you can use CCSD locally by cloning or forking this repository to your local machine.

git clone https://github.com/AdrienC21/CCSD.git

Next steps

Plotly engine

For Windows users, the recommended version of kaleido is 0.1.0.post1. You can install it by typing:

pip install kaleido==0.1.0.post1

For Linux users, the latest version of kaleido seems fine. For those who want to use orca, you can install it via the npm command of Node.js. For the installation, type:

sudo NEEDRESTART_MODE=a apt install -y nodejs
sudo NEEDRESTART_MODE=a apt install -y npm
npm install -g electron@6.1.4 orca

Dependencies and other packages/libraries

Install the dependencies (see the section Dependencies below).

When installing PyTorch and its components, or when installing TopoModelX along with TopoNetX, run the commands below:

pip install torch==2.0.1 --extra-index-url https://download.pytorch.org/whl/${CUDA}
pip install torch-scatter torch-sparse -f https://data.pyg.org/whl/torch-2.0.1+${CUDA}.html

where ${CUDA} could be cu117, cu118, or cpu if you want to use the CPU. For GPU, we recommend cu118. Also, TopoModelX should be installed after TopoNetX to avoid versioning issues.
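The substitution of `${CUDA}` can be sketched in Python as follows. The helper `build_torch_install_commands` is a hypothetical name, not part of CCSD; it only formats the two pip commands shown above for one of the three tags mentioned (`cu117`, `cu118`, `cpu`) and rejects anything else.

```python
SUPPORTED_TAGS = ("cu117", "cu118", "cpu")


def build_torch_install_commands(cuda_tag="cu118", torch_version="2.0.1"):
    """Return the two pip commands with the CUDA tag substituted in."""
    if cuda_tag not in SUPPORTED_TAGS:
        raise ValueError(
            f"unsupported tag {cuda_tag!r}; expected one of {SUPPORTED_TAGS}"
        )
    return [
        f"pip install torch=={torch_version} "
        f"--extra-index-url https://download.pytorch.org/whl/{cuda_tag}",
        f"pip install torch-scatter torch-sparse "
        f"-f https://data.pyg.org/whl/torch-{torch_version}+{cuda_tag}.html",
    ]


# Print the commands for a CPU-only installation.
for command in build_torch_install_commands("cpu"):
    print(command)
```

Printing the commands rather than running them keeps the sketch free of side effects; copy the output into a shell once the tag matches your hardware.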

If you are using a Linux system, you may need to install libxrender1 by typing:

sudo apt-get install libxrender1

Tests and errors

To test your installation, refer to the section Testing below.

If you encounter an error, please refer to the section Commons errors below.

Dependencies

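CCSD requires a recent version of Python, preferably 3.10 or higher. Python 3.7 is unlikely to work, since some of the minimum versions listed below (for example scipy>=1.11.1) require Python 3.9 or later.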

It also requires the following dependencies:

  • dill>=0.3.6
  • easydict>=1.10
  • freezegun>=1.2.2
  • hypernetx>=1.2.5
  • imageio>=2.31.1
  • joblib>=1.3.1
  • kaleido>=0.1.0.post1
  • matplotlib>=3.7.2
  • networkx>=2.8.8
  • numpy>=1.24.4
  • pandas>=2.0.3
  • plotly>=5.15.0
  • pyemd>=1.0.0
  • pytest>=7.4.0
  • pytz>=2023.3
  • pyyaml>=6.0
  • rdkit>=2023.3.2
  • scikit_learn>=1.3.0
  • scipy>=1.11.1
  • Cython
  • pomegranate
  • toponetx
  • torch>=2.0.1
  • tqdm>=4.65.0
  • molsets>=0.3.1
  • pytest-cov
  • wandb

Please make sure you have the required dependencies installed before using CCSD.

You can install all of them by running the command:

pip install -r requirements.txt
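After installing, a quick way to see which requirements are satisfied is to compare installed versions with the minimums listed above. The sketch below uses only the standard library (`importlib.metadata`). The function names and the selected subset of packages are illustrative assumptions, and the comparison only looks at the leading numeric components of a version, so it ignores pre-release and post-release markers; the `packaging` library would be a more robust choice.

```python
import re
from importlib import metadata

# A subset of the minimum versions listed in this README (illustrative only).
MINIMUM_VERSIONS = {
    "numpy": "1.24.4",
    "pandas": "2.0.3",
    "networkx": "2.8.8",
    "plotly": "5.15.0",
    "torch": "2.0.1",
}


def numeric_prefix(version):
    """Leading integer components, e.g. '2.0.1+cu118' -> (2, 0, 1)."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def is_at_least(installed, minimum):
    """Compare two version strings on their numeric components only."""
    a, b = numeric_prefix(installed), numeric_prefix(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_dependencies(minimums=MINIMUM_VERSIONS):
    """Return a {package: status} report for the given minimum versions."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if is_at_least(installed, minimum):
            report[name] = f"ok ({installed})"
        else:
            report[name] = f"too old ({installed} < {minimum})"
    return report


for package, status in check_dependencies().items():
    print(f"{package}: {status}")
```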

Testing

To ensure the correctness and robustness of CCSD and to allow researchers to build upon this tool, we have provided an extensive test suite. To run the tests, clone the repository, change your directory to the root folder of this project and execute the following command:

pytest tests/ -W ignore::DeprecationWarning

If you encounter an error during the testing, please refer to the section Commons errors below.

The output should look like this:

=====