
  /$$$$$$              /$$               /$$$$$$$   /$$$$$$ 
 /$$__  $$            | $$              | $$__  $$ /$$__  $$
| $$  \ $$ /$$   /$$ /$$$$$$    /$$$$$$ | $$  \ $$| $$  \ $$
| $$$$$$$$| $$  | $$|_  $$_/   /$$__  $$| $$$$$$$ | $$$$$$$$
| $$__  $$| $$  | $$  | $$    | $$  \ $$| $$__  $$| $$__  $$
| $$  | $$| $$  | $$  | $$ /$$| $$  | $$| $$  \ $$| $$  | $$
| $$  | $$|  $$$$$$/  |  $$$$/|  $$$$$$/| $$$$$$$/| $$  | $$
|__/  |__/ \______/    \___/   \______/ |_______/ |__/  |__/
                                                            
           Automated Bioinformatics Analysis
               www.joshuachou.ink/about

An AI Agent for Fully Automated Multi-omic Analyses.

(Automated Bioinformatics Analysis via AutoBA)

Juexiao Zhou, Bin Zhang, Guowei Li, Xiuying Chen, Haoyang Li, Xiaopeng Xu, Siyuan Chen, Wenjia He, Chencheng Xu, Liwei Liu, Xin Gao

King Abdullah University of Science and Technology, KAUST

Huawei Technologies Co., Ltd

<a href='media/advs.202407094.pdf'><img src='https://img.shields.io/badge/Paper-PDF-red'></a>

https://github.com/JoshuaChou2018/AutoBA/assets/25849209/3334417a-de59-421c-aa5e-e2ac16ce90db

What's New

  • [2024/08] Our paper was published online in Advanced Science
  • [2024/08] We integrated ollama to make it easier to use local LLMs and released the latest stable version v0.4.0
  • [2024/03] Now we support retrieval-augmented generation (RAG) to increase the robustness of AutoBA; to use it, please upgrade to openai==1.13.3 and install llama-index.
  • [2024/02] Now we support deepseek-coder-6.7b-instruct (failed test), deepseek-coder-7b-instruct-v1.5 (passed test), and deepseek-coder-33b-instruct (passed test); to use them, please upgrade to transformers==4.35.0.
  • [2024/01] Don't like the command line mode? Now we provide a new GUI and released the milestone stable version v0.2.0 🎉
  • [2024/01] Updated JSON mode for gpt-3.5-turbo-1106 and gpt-4-1106-preview; the output of these two models will be more stable
  • [2024/01] Updated the support for ChatGPT-4 (gpt-4-32k-0613: 32,768 tokens, up to Sep 2021; gpt-4-1106-preview: GPT-4 Turbo, 128,000 tokens, up to Apr 2023)
  • [2024/01] Updated the support for ChatGPT-3.5 (gpt-3.5-turbo: openai chatgpt-3.5, 4,096 tokens and gpt-3.5-turbo-1106: openai chatgpt-3.5, 16,385 tokens)
  • [2023/12] We added LLM support for the executor and the ACR module and released the milestone stable version v0.1.1
  • [2023/12] We provided the latest docker version to simplify the installation process.
  • [2023/12] New feature: automated code repairing (ACR module) added; also added llama2-chat backends.
  • [2023/11] We updated the executor and released the latest stable version (v0.0.2) and are working on automatic error feedback and code fixing.
  • [2023/09] We integrated codellama 7b-Instruct, 13b-Instruct, and 34b-Instruct, so users can now choose ChatGPT or a local LLM as the backend. We currently recommend ChatGPT, because our tests found that codellama is not as effective for complex bioinformatics tasks.
  • [2023/09] We are pleased to announce the official release of AutoBA's latest version v0.0.1! 🎉🎉🎉

TODO list

We're working hard to add more features; PRs are welcome!

  • [x] Automatic error feedback and code fixing
  • [x] Offer local LLMs (e.g., code llama) as options for users
  • [x] Provide a Docker version to simplify the installation process
  • [x] A UI-based YAML generator
  • [x] Support deepseek coder
  • [x] Support RAG
  • [x] Support ollama
  • [ ] Pack into a conda package to simplify the installation process
  • [ ] Interactive mode
  • [ ] GUI for data visualization
  • [ ] Continue from breakpoint
  • [ ] ...

We appreciate all contributions to improve AutoBA.

The primary branch is main, while development happens on the dev branch.

Thank you for your unwavering support and enthusiasm, and let's work together to make AutoBA even more robust and powerful! If you want to contribute, please open a PR against the latest dev branch. 💪

Installation

Command line

# (mandatory) for basic functions
curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-$(uname)-$(uname -m).sh"
bash Mambaforge-$(uname)-$(uname -m).sh
git clone https://github.com/JoshuaChou2018/AutoBA.git

mamba create -n abc_runtime python==3.10 -y
mamba activate abc_runtime
# Then manually add the conda-forge and bioconda channels to ~/mambaforge/.condarc
mamba create -n abc python==3.10
mamba activate abc
mamba install -c anaconda yaml==0.2.5 -y
pip install openai==1.13.3 pyyaml==6.0
pip install transformers==4.35.0
pip install accelerate==0.29.2
pip install bitsandbytes==0.43.1
pip install vllm==0.4.1

# (optional) for RAG
pip install llama-index==0.10.14
pip install llama-index-embeddings-huggingface

# (optional) for local llm with ollama
mamba install langchain-community==0.2.6 -y
curl -fsSL https://ollama.com/install.sh | OLLAMA_VERSION=0.3.4 sh
## pull the model before using it with AutoBA
ollama run llama3.1

# (optional) for gui version
pip install gradio==4.14.0

# (optional) for local llm (llama2)
cd AutoBA/src/codellama-main
pip install -e .

## apply for a download link at https://ai.meta.com/resources/models-and-libraries/llama-downloads/
## download codellama model weights: 7b-Instruct,13b-Instruct,34b-Instruct
cd src/codellama-main
bash download.sh
## download llama2 model weights: 7B-chat,13B-chat,70B-chat
cd src/llama-main
bash download.sh
## download hf version model weights
git lfs install
cd src/codellama-main
git clone https://huggingface.co/codellama/CodeLlama-7b-Instruct-hf
git clone https://huggingface.co/codellama/CodeLlama-13b-Instruct-hf
git clone https://huggingface.co/codellama/CodeLlama-34b-Instruct-hf

# (optional) for local llm (deepseek)
cd AutoBA/src/deepseek
git clone https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct
git clone https://huggingface.co/deepseek-ai/deepseek-coder-7b-instruct-v1.5
git clone https://huggingface.co/deepseek-ai/deepseek-coder-33b-instruct
git clone https://huggingface.co/deepseek-ai/deepseek-llm-67b-chat

# (optional) for features under development: the yaml generator UI
pip install plotly==5.14.1 dash==2.9.3 pandas==2.0.1 dash-mantine-components==0.12.1

Docker

Please refer to https://docs.docker.com/engine/install to install Docker first.

# (mandatory) for basic functions
docker pull joshuachou666/autoba:cuda12.2.2-cudnn8-devel-ubuntu22.04-autoba0.1.2
docker run --rm --gpus all -it joshuachou666/autoba:cuda12.2.2-cudnn8-devel-ubuntu22.04-autoba0.1.2 /bin/bash
## Enter the shell in docker image
conda activate abc
cd AutoBA

If you get the error could not select device driver "" with capabilities: [[gpu]], run the following commands:

# (optional) for using GPU in docker
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \
  sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update
sudo apt install -y nvidia-docker2
sudo systemctl daemon-reload
sudo systemctl restart docker

Then retry the previous commands.

Conda

Coming soon...

Get Started

Understand files

./examples contains several examples to get you started.

Under ./examples, config.yaml defines your files and goals. Defining data_list, output_dir, and goal_description in config.yaml is mandatory before running app.py. Absolute paths, rather than relative paths, are recommended for all file paths defined in config.yaml.
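As an illustration only, a minimal config.yaml might look like the sketch below. The paths, file descriptions, and goal here are made up; consult the configs under ./examples for the exact schema AutoBA expects.

```yaml
# Illustrative sketch -- these values are invented; see ./examples for
# real, working configurations.
data_list:
  - '/abs/path/sample_1.fastq.gz: read 1 of paired-end RNA-seq data'
  - '/abs/path/sample_2.fastq.gz: read 2 of paired-end RNA-seq data'
output_dir: '/abs/path/output'
goal_description: 'align the reads to the reference genome and produce a gene count matrix'
```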

app.py: run this file to start an analysis.

Start with one command

Run this command to start a simple example with ChatGPT as the backend (recommended):

python app.py --config ./examples/case1.1/config.yaml --openai YOUR_OPENAI_API --model gpt-4

To execute the code while generating it, with the ACR module loaded:

python app.py --config ./examples/case1.1/config.yaml --openai YOUR_OPENAI_API --model gpt-4 --execute True

Please note that this work uses the GPT-4 API; we do not guarantee that GPT-3.5 will work properly in all cases.
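Conceptually, the generate-execute-repair loop that the ACR module enables can be sketched as below. This is a simplified illustration under stated assumptions, not AutoBA's actual implementation: ask_llm is an invented stand-in that always returns working code, whereas the real module would feed the traceback to the configured LLM backend.

```python
import os
import subprocess
import sys
import tempfile

def ask_llm(prompt):
    # Stand-in for a real LLM call; here it always "repairs" the script
    # by returning code that runs successfully.
    return "print('hello from repaired code')"

def run_with_repair(code, max_attempts=3):
    """Execute generated code; on failure, feed the error back for repair."""
    for _ in range(max_attempts):
        # Write the candidate script to a temporary file and run it.
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(code)
            path = f.name
        result = subprocess.run([sys.executable, path],
                                capture_output=True, text=True)
        os.unlink(path)
        if result.returncode == 0:
            return result.stdout
        # On failure, ask the LLM to repair the script given the traceback.
        code = ask_llm(f"Fix this code:\n{code}\nError:\n{result.stderr}")
    raise RuntimeError("could not repair the code automatically")

print(run_with_repair("raise ValueError('boom')"))
```

The key design point is the feedback edge: the stderr of a failed run becomes part of the next prompt, so each repair attempt is grounded in the concrete error rather than the original task alone.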

Or use a local LLM as the backend (not recommended for now; still in development and for testing purposes only):

python app.py --config ./examples/case1.1/config.yaml --model codellama-7bi

Or use a local LLM served by ollama as the backend:

python app.py --config ./examples/case1.1/config.yaml --model ollama_llama3.1

Start GUI version

Run this command to start the GUI version of AutoBA:

python gui.py


Model Zoo

Dynamic Engine: dynamically updated versions

  • gpt-3.5-turbo: Points to the latest gpt-3.5 model
  • gpt-4-turbo: Points to the latest gpt-4 model
  • gpt-4o: Points to the latest gpt-4o model
  • gpt-4o-mini: Points to the latest gpt-4o-mini model
  • gpt-4: (default)
  • For more information, please check: https://platform.openai.com/docs/models/gpt-4-turbo-and-gpt-4

Ollama Engine:

  • ollama_llama3.1: llama3.1
  • ollama_llama3.1:8b: llama3.1:8b
  • ollama_mistral: mistral
  • ...
  • the ollama_ prefix is mandatory; for more models, please refer to https://ollama.com/library
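To illustrate why the ollama_ prefix is mandatory, here is a hypothetical sketch of how a --model string could be routed to a backend. resolve_engine is an invented helper for illustration; AutoBA's actual dispatch logic may differ.

```python
# Hypothetical model-string dispatch; not AutoBA's actual code.
def resolve_engine(model: str) -> tuple:
    """Return an (engine, model_name) pair for a --model argument."""
    if model.startswith("ollama_"):
        # The prefix selects the ollama engine; the rest names the model.
        return ("ollama", model[len("ollama_"):])
    if model.startswith("gpt-"):
        return ("openai", model)
    return ("local", model)  # e.g. codellama-7bi, deepseek variants

print(resolve_engine("ollama_llama3.1"))  # ('ollama', 'llama3.1')
print(resolve_engine("gpt-4"))            # ('openai', 'gpt-4')
```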

Fixed Engine: snapshot version

  • gpt-3.5-turbo-1106: Updated GPT 3.5 Turbo, 16,385 tokens, Up to Sep 2021
  • gpt-4-0613: Snapshot of gpt-4 from June 13th 2023 with improved function calling support, 8,192 tokens, Up to Sep 2021
  • gpt-4-32k-0613: Snapshot of gpt-4-32k from June 13th 2023 with improved function calling support, 32,768 tokens, Up to Sep 2021
  • gpt-4-1106-preview: GPT-4 Turbo, 128,000 tokens, Up to Apr 2023
  • codellama-7bi: 7b-Instruct
  • codellama-13bi: 13b-Instruct
  • codellama-34bi: 34b-Instruct