Accucopy

Accucopy is a computational method that infers Allele-Specific Copy Number alterations from low-coverage low-purity tumor sequencing data.

Introduction

Accucopy is a CNA-calling method that extends our previous Accurity model to predict both total copy numbers (TCN) and allele-specific copy numbers (ASCN) for the tumor genome. Accucopy adopts a tiered Gaussian mixture model coupled with an innovative autocorrelation-guided EM algorithm to find the optimal solution quickly. The model uses information from both total sequencing coverage and allelic sequencing coverage. In benchmarks on both simulated and real sequencing samples, we demonstrate that Accucopy is more accurate than existing methods.

Accucopy's main strength is in handling low coverage and/or low tumor-purity samples.

Publication

X Fan, G Luo, YS Huang# (2021) BMC Bioinformatics. Accucopy: Accurate and Fast Inference of Allele-specific Copy Number Alterations from Low-coverage Low-purity Tumor Sequencing Data.

License

Under our institute's policy, Accucopy is free to use as long as it is used strictly for non-profit research purposes. Commercial use requires a license; please contact polyactis@gmail.com to obtain one.

The full-text of the license is included in the software package.

Get our software

News

  • 2023/6: Fixed the ploidy (must be within 1-4) bug.
  • 2022/3: Added a command-line argument that lets the user choose which period (TRE histogram) to use.
  • 2020/3: Can handle non-human genomes.
  • 2019/10/22: First release.

Register to receive updates

Please register here to receive updates and the download link (a standalone Accucopy package without dependencies). If you have trouble installing packages described below, use the docker image instead.

Docker image

NOTE: Running our binary software can be difficult (for example, no root access to install the required libraries, or incompatible libraries). We have therefore made a docker image available on Docker Hub, which contains the latest development version of our software and all dependent libraries. Accucopy inside the image is usually newer than what is downloadable from this website.

  1. Install docker before you do anything below.
  2. Download the ref genome package.
  3. To run it on an HPC cluster, Singularity might be a better fit than docker.

An example docker session:


yh@cichlet:~$ docker pull polyactis/accucopy
Using default tag: latest         
latest: Pulling from polyactis/accucopy
...                           
fd6992ef54e0: Pull complete
Digest: sha256:a6f72af3114ba903f26b60265e10e6f13b8d943d25e740ab0a715d1a99000188
Status: Downloaded newer image for polyactis/accucopy:latest
yh@cichlet:~$ docker images
REPOSITORY                  TAG                 IMAGE ID            CREATED             SIZE
polyactis/accucopy          latest              a11fdb62c5d4        5 months ago        1.04GB

# Get inside the image, without mounting. Useful to just check what's inside the image.
yh@cichlet:~$ docker run -i -t polyactis/accucopy /bin/bash

# Download the reference genome folder (links on this page) into /home/mydata (or any folder)
# Put your bam files into /home/mydata
# Mount /home/mydata to /mnt inside the image
# Get inside the docker image.
yh@cichlet:~$ docker run -i -t -v /home/mydata:/mnt polyactis/accucopy /bin/bash

root@cc7807445e40:/$ cd /usr/local/Accucopy/
/usr/local/Accucopy
root@cc7807445e40:/usr/local/Accucopy$ ls
GADA         maestre    main.py             plot_autocor_diff.py                  plot_snp_maf_peak.py
LICENSE      configure  plot.tre.autocor.R  plot_coverage_after_normalization.py  plot_tre.py
__init__.py  infer      plotCPandMCP.py     plot_snp_maf_exp.py

root@cc7807445e40:/usr/local/Accucopy$ ./main.py
usage: main.py [-h] [-v] -c CONFIGURE_FILEPATH -t TUMOR_BAM -n NORMAL_BAM -o
               OUTPUT_DIR [--snp_output_dir SNP_OUTPUT_DIR] [--clean CLEAN]
               [--segment_stddev_divider SEGMENT_STDDEV_DIVIDER]
               [--snp_coverage_min SNP_COVERAGE_MIN]
               [--snp_coverage_var_vs_mean_ratio SNP_COVERAGE_VAR_VS_MEAN_RATIO]
               [--max_no_of_peaks_for_logL MAX_NO_OF_PEAKS_FOR_LOGL]
               [--nCores NCORES] [-s STEP] [-l LAM] [-d DEBUG] [--auto AUTO]
main.py: error: argument -c/--configure_filepath is required
# modify file "configure" to reflect paths of input data and relevant binaries
root@cc7807445e40:/usr/local/Accucopy$ cat configure 
read_length     101
window_size     500
reference_folder_path   /mnt/hs37d5
samtools_path   /usr/local/bin/samtools
caller_path     /usr/local/strelka
binary_folder   /usr/local/Accucopy

root@cc7807445e40:/usr/local/Accucopy$ ls /usr/local/bin/
total 11640
-rwxrwxr-x  1 root root 4436160 Jul  7  2018 samtools*
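
The usage message in the session above lists four required arguments (-c, -t, -n, -o) and several optional ones. As a minimal sketch, assuming the /mnt mount layout from that session, the command line could be assembled and launched from Python as shown here. The helper name build_accucopy_command and the BAM file names are hypothetical; only flags that appear in the usage message are used.

```python
import subprocess
from pathlib import Path


def build_accucopy_command(configure, tumor_bam, normal_bam, output_dir,
                           n_cores=None,
                           main_py="/usr/local/Accucopy/main.py"):
    """Assemble a main.py command line (hypothetical helper).

    Only flags that appear in the usage message are emitted.
    """
    cmd = [
        str(main_py),
        "-c", str(configure),
        "-t", str(tumor_bam),
        "-n", str(normal_bam),
        "-o", str(output_dir),
    ]
    if n_cores is not None:
        cmd += ["--nCores", str(n_cores)]
    return cmd


if __name__ == "__main__":
    # File names under /mnt are placeholders for your own data.
    cmd = build_accucopy_command(
        configure="/usr/local/Accucopy/configure",
        tumor_bam="/mnt/tumor.bam",
        normal_bam="/mnt/normal.bam",
        output_dir="/mnt/output",
        n_cores=4,
    )
    print(" ".join(cmd))
    # Run only where main.py actually exists (i.e. inside the container).
    if Path(cmd[0]).exists():
        subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when paths contain spaces.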

Install Accucopy and all its dependencies

Prerequisites

  • A computer with at least 32GB of memory (recommend 64GB).
  • Strelka2. A variant caller that is used to call SNPs.
  • Python
  • matplotlib
  • numpy
  • pandas
  • Pyflow
  • samtools
  • libbz2-1.0 (a high-quality block-sorting file compressor library, install it via "apt install libbz2-1.0" in Debian/Ubuntu)
    • If your OS (like CentOS) has this library installed but Accucopy still fails to load it, you can create a symlink from the installed library file to "libbz2.so.1.0".
  • libgsl2
  • liblzma5 (XZ-format compression library)
  • libssl1.0.0
  • libboost-program-options1.58.0
  • libboost-iostreams1.58.0
  • libhdf5-dev
  • (Only for building from source) pkg-config: used by the Rust compiler to find library paths, e.g. "pkg-config --libs --cflags openssl"
  • (Optional) R packages ggplot2, grid, scales. Only needed if you obtain a development version of Accucopy, where they are required to make one R plot.
    • The R plot is not a must-have: one of the Python plots has similar content.
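
The Python packages and the samtools executable listed above can be checked before a first run. The following is a minimal sketch, not part of Accucopy: the function name check_prerequisites is hypothetical, and it assumes that pyflow is importable under the module name "pyflow" and that samtools is on the PATH. It does not check the system libraries (libbz2, libgsl2, and so on) or Strelka2.

```python
import importlib.util
import shutil


def check_prerequisites(modules=("matplotlib", "numpy", "pandas", "pyflow"),
                        executables=("samtools",)):
    """Return {name: True/False} for each Python module and executable."""
    status = {}
    for name in modules:
        # find_spec returns None when a top-level module is not installed.
        status[name] = importlib.util.find_spec(name) is not None
    for name in executables:
        # shutil.which returns None when the executable is not on the PATH.
        status[name] = shutil.which(name) is not None
    return status


if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        print(f"{name:12s} {'ok' if found else 'MISSING'}")
```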

Running Accucopy requires a project-specific configure file (details below). Adjust it to match your OS environment.
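
The sample configure file printed in the docker session consists of one "key value" pair per line, separated by whitespace. As a rough sketch under that assumption, a configure file could be sanity-checked before launching a run. The functions parse_configure and find_problems are hypothetical helpers, and the list of expected keys is taken only from that sample, so it may not be complete for every Accucopy version.

```python
from pathlib import Path

# Keys taken from the sample configure file in the docker session.
EXPECTED_KEYS = (
    "read_length", "window_size", "reference_folder_path",
    "samtools_path", "caller_path", "binary_folder",
)
# Keys whose values should point at an existing file or folder.
PATH_KEYS = (
    "reference_folder_path", "samtools_path", "caller_path", "binary_folder",
)


def parse_configure(text):
    """Parse 'key<whitespace>value' lines into a dict, skipping blank lines."""
    settings = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        if not parts:
            continue
        settings[parts[0]] = parts[1].strip() if len(parts) == 2 else ""
    return settings


def find_problems(settings):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = [f"missing key: {k}" for k in EXPECTED_KEYS if k not in settings]
    for key in PATH_KEYS:
        if key in settings and not Path(settings[key]).exists():
            problems.append(f"path not found for {key}: {settings[key]}")
    return problems


if __name__ == "__main__":
    config_file = Path("configure")  # adjust to your own configure file
    if config_file.exists():
        for problem in find_problems(parse_configure(config_file.read_text())):
            print(problem)
```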

Install pyflow and other Python packages

git clone https://github.com/Illumina/pyflow.git pyflow
cd pyflow/pyflow
python setup.py build install

Other Python packages can be installed with pip ("pip install ...") or through the Ubuntu package system (dpkg/apt-get).

Register to download the Accucopy binary package and receive update emails

Please register here to receive an email that contains a download link. After the download finishes, unpack the package:

tar -xvzf Accucopy.tar.gz

The Accucopy package contains a few binary executables and R/Python scripts. All binary executables were compiled for the Linux platform (Ubuntu 18 tested). It also contains a sample configure file. Denote the full path of the Accucopy folder as accucopy_path in the configure file (described below).

NOTE

  1. If you have difficulty getting Accucopy to work, please use the docker image instead.
  2. This binary package is older than the docker release.

Compile source code (for advanced users)

Instead of downloading the binary package, you can compile the source code. Be forewarned: you may run into problems (missing packages, wrong paths, etc.) when compiling the C++ portion on non-Ubuntu platforms. Compiling the Rust portion is relatively easy.

Compiling Accucopy requires the "lib..." packages mentioned above and their corresponding development packages (for example, libbz2-dev). In addition, it requires an installation of Rust, https://www.rust-lang.org/. We have compiled successfully on Ubuntu 16.04 and 18.04.

cd src_o

# to get a debug version (recommended)
make debug

# to get a release version
make release

The difference between debug and release version:

  • The debug version will contain a Rust binary that can print a stack trace in case an error happens. It is only slightly slower than the release version Rust binary.
  • The debug version will