E2EDNA 2.0 - OpenMM Implementation of E2EDNA!

New feature: DeltaGzip [JCIM paper][code]

An automated pipeline for simulating DNA aptamers complexed with target ligands (peptide, DNA, RNA or small molecules).

Please note that the main branch is under active development, so its tests may or may not pass. For a fully working version, use the released code v2.0.0.
To view the Tinker-based version of E2EDNA, refer to its GitHub repo and DOI.
Interested in contributing to E2EDNA? Check out how to contribute here.
Please download the most recent release, v2.0.0, here or here.

Reference

If you use this code in any future publications, please cite our work: Kilgour et al. (2022). E2EDNA 2.0: Python Pipeline for Simulating DNA Aptamers with Ligands. Journal of Open Source Software, 7(73), 4182.

The E2EDNA pipeline makes use of several other open-source software packages, so please be mindful of citing them as well.

Installation
Usage
Running a job
Functionality of eight different operation modes
Automated test runs

1. Installation

Download the E2EDNA 2.0 package from this repository.
Locate macos_installation.sh in the downloaded E2EDNA2 codebase directory. From that directory, run $ source macos_installation.sh on the command line to create a conda virtual environment named e2edna and install the required dependencies. The e2edna environment should be active when the installation script finishes, in which case the string '(e2edna)' appears at the beginning of the command-line prompt.
- If the script fails to activate the environment automatically, the likely cause is that the $ conda activate e2edna command in the script raised an error such as Your shell has not been properly configured to use 'conda activate'.
- In that case, manually run $ source activate <path_to_e2edna_conda_environment> to activate the environment. To find the path, run $ conda info -e, which lists all conda environments and their paths on your computer.
As the message at the end of the installation process indicates, if you wish to run the E2EDNA pipeline with a DNA aptamer sequence rather than its 3D structure, please register and download MMB from https://simtk.org/projects/rnatoolbox. Then copy or move the downloaded MMB folder to the codebase directory, and remember to fill in the MMB-related paths section of the configuration file simu_config.yaml.
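The installation steps above can be double-checked before a first run with a small preflight script. The following is a minimal sketch, not part of E2EDNA: the helper name preflight_problems is hypothetical, and it assumes conda exports the CONDA_DEFAULT_ENV variable for the active environment (either the environment name or its full path). The MMB checks only matter when starting from an aptamer sequence, and the paths passed in are whatever you entered in simu_config.yaml.

```python
import os
from pathlib import Path


def preflight_problems(environ, mmb_dir=None, mmb_exe=None, env_name="e2edna"):
    """Return a list of human-readable problems; an empty list means all checks passed.

    Hypothetical helper for illustration only; E2EDNA does not ship this function.
    """
    problems = []

    # Assumption: conda sets CONDA_DEFAULT_ENV to the environment name, or to the
    # full path when the environment was activated by path. Taking the last path
    # component handles both forms.
    active = environ.get("CONDA_DEFAULT_ENV", "")
    if Path(active).name != env_name:
        problems.append(
            f"conda environment '{env_name}' is not active (found: '{active or 'none'}')"
        )

    # The MMB paths are optional: they are only needed when the pipeline starts
    # from a sequence rather than a known 3D structure.
    if mmb_dir is not None and not Path(mmb_dir).is_dir():
        problems.append(f"MMB library directory not found: {mmb_dir}")
    if mmb_exe is not None and not Path(mmb_exe).is_file():
        problems.append(f"MMB executable not found: {mmb_exe}")

    return problems


if __name__ == "__main__":
    # Check only the conda environment; pass mmb_dir/mmb_exe to check MMB as well.
    for problem in preflight_problems(os.environ):
        print("WARNING:", problem)
```

Passing the environment mapping as an argument, rather than reading os.environ inside the function, keeps the check easy to exercise without touching the real shell state.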

2. Usage

The usage and help statements can be accessed with the -h/--help flags:

(e2edna)$ ./main.py --help
usage: main.py [-h] -yaml [-ow] [-d] [-os] [-p] [--CUDA_precision] [-w DIR] [-mbdir] [-mb] [--quick_check_mode] [-r] [-m] [-a]
               [-l] [-lt] [-ls] [--example_target_pdb] [--example_peptide_seq] [--skip_MMB] [-init] [--secondary_structure_engine]
               [--N_2D_structures] [--Mg_conc] [--fold_fidelity] [--fold_speed] [--mmb_normal_template] [--mmb_quick_template]
               [--mmb_slow_template] [--mmb_params] [-pk] [--pickup_from_freeAptamerChk] [--pickup_from_complexChk] [--chk_file]
               [--pickup_pdb] [--pressure] [--temperature] [--ionicStrength] [--pH] [--auto_sampling] [--autoMD_convergence_cutoff]
               [--max_aptamer_sampling_iter] [--max_walltime] [--skip_smoothing] [--equilibration_time] [--smoothing_time]
               [--aptamer_sampling_time] [--complex_sampling_time] [--time_step] [--print_step] [--force_field] [--hydrogen_mass]
               [--water_model] [--box_offset] [--constraints] [--constraint_tolerance] [--rigid_water] [--nonbonded_method]
               [--nonbonded_cutoff] [--ewald_error_tolerance] [--friction] [--implicit_solvent] [--implicit_solvent_model]
               [--soluteDielectric] [--solventDielectric] [--implicit_solvent_Kappa] [--leap_template] [--DNA_force_field]
               [--docking_steps] [--N_docked_structures]

E2EDNA: Simulate DNA aptamers complexed with target ligands

optional arguments:
  -h, --help            show this help message and exit
  -yaml, --yaml_config 
                        A YAML configuration file that can specify all the arguments (default: simu_config.yaml)
  -ow, --overwrite      Overwrite existing --run_num (default: False)

Compute Platform Configuration:
  -d, --device      Device configuration (default: local)
  -os, --operating_system 
                        Operating system (default: macos)
  -p, --platform    Processing platform (default: CPU)
  --CUDA_precision    Precision of CUDA, if used (default: single)

Directory Settings:
  -w DIR, --workdir DIR
                        Working directory to store individual output runs (default: ./localruns)
  -mbdir, --mmb_dir 
                        MMB library directory (default: None)
  -mb, --mmb        Path to MMB executable (default: None)

Run Parameters:
  --quick_check_mode  Rapidly run a certain mode for quick check using default test parameters (default: Yes)
  -r, --run_num     Run number. Output will be written to {--workdir}/run{--run_num} (default: 1)
  -m, --mode        Run mode (default: None)
  -a, --aptamer_seq 
                        DNA Aptamer sequence (5'->3') (default: None)
  -l, --ligand      Name of PDB file for ligand structure; None if not to have ligand (default: None)
  -lt, --ligand_type 
                        Type of ligand molecule (default: None)
  -ls, --ligand_seq 
                        Ligand sequence if peptide, DNA, or RNA (default: None)
  --example_target_pdb 
                        An example peptide ligand included in E2EDNA package: used when wish to test docking (default:
                        examples/example_peptide_ligand.pdb)
  --example_peptide_seq 
                        The sequence of the example peptide ligand (default: YQTQTNSPRRAR)
  --skip_MMB          If `Yes`: skip both 2D structure analysis and MMB folding, and start with a known --init_structure (default: No)
  -init, --init_structure 
                        Name of PDB file if starting pipeline on a DNA aptamer with known structure (default: None)
  --secondary_structure_engine 
                        Pipeline module that is used to predict secondary structures (default: NUPACK)
  --N_2D_structures   Number of predicted secondary structures (default: 1)
  --Mg_conc           Magnesium molar concentration used in NUPACK: [0, 0.2] (default: 0.0)
  --fold_fidelity     Refold in MMB if score < `fold_fidelity` unless the `fold_speed` is `quick` (default: 0.9)
  --fold_speed        MMB folding speed (default: normal)
  --mmb_normal_template 
                        Path to MMB folding protocol of normal speed (default: lib/mmb/commands.template.dat)
  --mmb_quick_template 
                        Path to MMB folding protocol of quick speed (default: lib/mmb/commands.template_quick.dat)
  --mmb_slow_template 
                        Path to MMB folding protocol of slow speed (default: lib/mmb/commands.template_long.dat)
  --mmb_params        Path to parameter file bundled with MMB package (default: lib/mmb/parameters.csv)
  -pk, --pickup     Whether the run is to resume MD sampling of an unfinished run or an old run (default: No)
  --pickup_from_freeAptamerChk 
                        Resume MD sampling of free aptamer: skip everything before it (default: No)
  --pickup_from_complexChk 
                        Resume MD sampling of aptamer-ligand: skip everything before it (default: No)
  --chk_file          Name of checkpoint file for resuming MD sampling, format: <path>/<filename>.chk (default: None)
  --pickup_pdb        PDB file (topology+coordinates) for resuming MD sampling in explicit solvent, format: <path>/<filename>.pdb (default: None)
  --pressure          Pressure in the unit of atm (default: 1.0)
  --temperature       Temperature in Kelvin (default: 298.0)
  --ionicStrength     Sodium molar concentration (could be used by NUPACK and OpenMM) (default: 0.1)
  --pH                Could be used by OpenMM (default: 7.4)
  --auto_sampling     If `Yes`: run MD sampling till convergence, currently only feasible in free aptamer sampling (default: No)
  --auto
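To illustrate how a few of the flags listed above fit together, here is a minimal sketch that assembles a main.py command line and computes where its output would land, following the help text's rule that output is written to {--workdir}/run{--run_num}. The helper names run_output_dir and build_command are hypothetical, the sketch assumes that -yaml, -r, -w, -m and -a each take one value, and the aptamer sequence is an arbitrary placeholder. Valid mode values are not shown in this section, so the mode is left for the caller to supply.

```python
import shlex
from pathlib import Path


def run_output_dir(workdir="./localruns", run_num=1):
    """Mirror the help text: output is written to {--workdir}/run{--run_num}."""
    return Path(workdir) / f"run{run_num}"


def build_command(yaml_config="simu_config.yaml", run_num=1,
                  workdir="./localruns", mode=None, aptamer_seq=None):
    """Assemble an argv list for main.py from a handful of the documented flags.

    Hypothetical helper; it assumes each flag below accepts a single value.
    """
    argv = ["./main.py", "-yaml", yaml_config, "-r", str(run_num), "-w", workdir]
    if mode is not None:
        # The mode value must come from the list of operation modes in the README.
        argv += ["-m", str(mode)]
    if aptamer_seq is not None:
        # The sequence is given 5'->3', as stated in the help text.
        argv += ["-a", aptamer_seq]
    return argv


if __name__ == "__main__":
    argv = build_command(run_num=2, aptamer_seq="ACGTACGT")  # placeholder sequence
    print(shlex.join(argv))          # shell-quoted command line, ready to paste
    print(run_output_dir(run_num=2))  # directory where this run's output would go
```

Building the command as a list and quoting it with shlex.join avoids quoting mistakes if a path contains spaces.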
