Eltetrado
A Python application to find and classify tetrads and quadruplexes in DNA/RNA 3D structures
Project description
This is an application that analyzes base-pairing patterns in DNA/RNA 3D structures to find and classify tetrads and quadruplexes. ElTetrado assigns each tetrad to one of the ONZ classes (O, N, Z) along with its directionality (+/-), which is determined by the bonds between bases and their non-canonical interactions. The interactions follow the Leontis/Westhof classification (Leontis et al. 2001). The Watson-Crick (W) edge of the first base in the tetrad, exposed to the Hoogsteen (H) edge of the next nucleobase in the same tetrad, sets the tetrad directionality: clockwise (+) or anticlockwise (-). For more details, please refer to Zok et al. (2020) and Popenda et al. (2020).
Installation
This project uses Poetry for dependency management.
To install the package, run:
poetry install
Dependencies
The project is written in Python 3.12+ and requires mmcif, orjson, NumPy and rnapolis.
Visualization is created by an R 3.6+ script that uses the
R4RNA (Lai et al. 2012) library. This
dependency will be installed automatically if not present.
Base pairs and stacking interactions are identified by RNApolis.
Usage
ElTetrado is a command-line application that must be given
--input followed by a path to a PDB or PDBx/mmCIF file.
By default, ElTetrado prints textual results to standard output. A
JSON version of the output can be obtained with the --output switch
followed by the path where the file should be created.
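As a minimal sketch of consuming the file written with --output, the snippet below counts syn and anti glycosidic bonds. The key names (nucleotides, fullName, glycosidicBond) are taken from the 2HY9 example output in the Examples section. The helper name glycosidic_bond_counts and the trimmed inline sample are illustrative assumptions, not part of ElTetrado.

```python
import json
from collections import Counter


def glycosidic_bond_counts(report):
    """Count nucleotides per glycosidic bond type in a parsed ElTetrado JSON report."""
    return Counter(n["glycosidicBond"] for n in report.get("nucleotides", []))


# Trimmed stand-in for a real report; in practice one would load the file, e.g.
#   with open("2hy9.json") as handle:
#       report = json.load(handle)
sample = json.loads(
    """
    {"metals": [], "nucleotides": [
      {"index": 1, "fullName": "1.DA1", "glycosidicBond": "syn"},
      {"index": 2, "fullName": "1.DA2", "glycosidicBond": "anti"}
    ]}
    """
)

counts = glycosidic_bond_counts(sample)
print(dict(counts))  # {'syn': 1, 'anti': 1}
```

Using report.get("nucleotides", []) keeps the helper from failing on a report that lacks the key, which is a defensive choice rather than a documented guarantee about the output format.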
ElTetrado can prepare a visualization of the whole structure and of each
N4-helix, quadruplex and tetrad. This can be supplemented with
canonical base pairs when --complete-2d is set. All
color settings are located in the first several lines of the quadraw.R
file, so you can change them easily without knowing the R language.
Visualizations are saved as PDF files in the directory given with --image;
if --image is omitted, no images are generated.
usage: eltetrado [-h] [-i INPUT] [-o OUTPUT] [-m MODEL] [--no-reorder]
                 [--complete-2d] [--image DIR] [-e [EXTERNAL_FILES ...]]
                 [--tool {fr3d,dssr,rnaview,bpnet,maxit,barnaba,mc-annotate}]
                 [-v]

options:
  -h, --help            show this help message and exit
  -i, --input INPUT     path to input PDB or PDBx/mmCIF file
  -o, --output OUTPUT   (optional) path for output JSON file
  -m, --model MODEL     (optional) model number to process
  --no-reorder          chains of bi- and tetramolecular quadruplexes should
                        be reordered to be able to have them classified; when
                        this is set, chains will be processed in original
                        order, which for bi-/tetramolecular means that they
                        will likely be misclassified; use with care!
  --complete-2d         when set, the visualization will also show canonical
                        base pairs to provide context for the quadruplex
  --image DIR           directory where visualization files (PDF) will be
                        saved; if omitted, no images are generated
  -e, --external-files [EXTERNAL_FILES ...]
                        path(s) to external tool output file(s); if omitted
                        ElTetrado will compute interactions itself
  --tool {fr3d,dssr,rnaview,bpnet,maxit,barnaba,mc-annotate}
                        name of the external tool that produced the files
                        (auto-detected when not provided)
  -v, --version         show program's version number and exit
Chains reorder
ElTetrado keeps a global, unique 5’-3’ index for every nucleotide, which is independent of residue numbers. For example, if a structure has chain M with 60 nucleotides and chain N with 15 nucleotides, then ElTetrado keeps an index between 0 and 74 that uniquely identifies every nucleotide. Initially, ElTetrado assigns these indices according to the order of chains in the input file. Therefore, if M precedes N, then nucleotides in M are indexed from 0 to 59 and those in N from 60 to 74. Otherwise, nucleotides in N are indexed from 0 to 14 and those in M from 15 to 74.
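The indexing described above can be sketched in a few lines. This is an illustration of the idea only, not ElTetrado's actual code; the function name assign_global_indices and the dictionary-of-ranges representation are assumptions made for the example.

```python
def assign_global_indices(chain_lengths, chain_order):
    """Give every chain a contiguous block of global 5'-3' indices.

    chain_lengths: mapping from chain name to its number of nucleotides.
    chain_order: chain names in the order they should be indexed.
    Returns a mapping from chain name to the range of its global indices.
    """
    indices = {}
    offset = 0
    for chain in chain_order:
        length = chain_lengths[chain]
        indices[chain] = range(offset, offset + length)
        offset += length
    return indices


chain_lengths = {"M": 60, "N": 15}

m_first = assign_global_indices(chain_lengths, ["M", "N"])
print(m_first["M"][0], m_first["M"][-1], m_first["N"][0], m_first["N"][-1])  # 0 59 60 74

n_first = assign_global_indices(chain_lengths, ["N", "M"])
print(n_first["N"][0], n_first["N"][-1], n_first["M"][0], n_first["M"][-1])  # 0 14 15 74
```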
When --no-reorder is present, this initial assignment is used.
Otherwise, ElTetrado exhaustively checks all permutations of the chain
order. Each permutation triggers a recalculation of the global,
unique 5’-3’ index, which in turn changes the ONZ classification of
tetrads.
ElTetrado keeps a table of tetrad classification scores according to these rules:
- Type preference: O > N > Z
- Direction preference: + > -
The table keeps low values for preferred classes, i.e. O+ is 0, O- is
1 and so on up to Z- with score 5. For every permutation of the chain
order, ElTetrado computes the sum of scores for the tetrad classification
induced by the 5’-3’ indexing and selects the permutation with the minimum value.
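The selection step can be sketched as follows. The score table is built directly from the two preference rules, so O+ maps to 0 and Z- to 5. The function best_chain_order and its classify callback are hypothetical names introduced for this illustration: classify stands in for whatever recomputes the ONZ labels of all tetrads for a given chain order, and the toy lookup table below is made-up data, not real ElTetrado output.

```python
from itertools import permutations, product

# Type preference O > N > Z first, then direction preference + > -.
# Yields O+ = 0, O- = 1, N+ = 2, N- = 3, Z+ = 4, Z- = 5.
SCORES = {f"{kind}{sign}": rank for rank, (kind, sign) in enumerate(product("ONZ", "+-"))}


def best_chain_order(chains, classify):
    """Return (order, total_score) for the chain permutation with the lowest score.

    classify is a hypothetical callback: it receives a tuple of chain names
    and returns the list of tetrad class labels (e.g. "O+") for that order.
    """
    best_order, best_score = None, None
    for order in permutations(chains):
        score = sum(SCORES[label] for label in classify(order))
        if best_score is None or score < best_score:
            best_order, best_score = order, score
    return best_order, best_score


# Toy data for two chains and two tetrads (illustrative only).
toy_labels = {
    ("A", "B"): ["N-", "Z+"],  # 3 + 4 = 7
    ("B", "A"): ["O+", "O-"],  # 0 + 1 = 1
}
order, score = best_chain_order(["A", "B"], lambda o: toy_labels[o])
print(order, score)  # ('B', 'A') 1
```

Because every permutation is checked, the number of candidate orders grows factorially with the number of chains, which is manageable for the bi- and tetramolecular quadruplexes mentioned above.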
Examples
2HY9: Human telomere DNA quadruplex structure in K+ solution hybrid-1 form

$ curl ftp://ftp.wwpdb.org/pub/pdb/data/structures/divided/mmCIF/my/2hy9.cif.gz | gzip -d > 2hy9.cif
$ ./eltetrado --input 2hy9.cif --output 2hy9.json
Chain order: 1
n4-helix with 3 tetrads
Oh* V 9a -(pll) quadruplex with 3 tetrads
1.DG4 1.DG22 1.DG18 1.DG10 cWH cWH cWH cWH O- Vb planarity=0.06
direction=hybrid rise=3.15 twist=28.48
1.DG5 1.DG23 1.DG17 1.DG11 cHW cHW cHW cHW O+ Va planarity=0.05
direction=hybrid rise=3.08 twist=29.27
1.DG6 1.DG24 1.DG16 1.DG12 cHW cHW cHW cHW O+ Va planarity=0.05
Tracts:
1.DG4, 1.DG5, 1.DG6
1.DG22, 1.DG23, 1.DG24
1.DG18, 1.DG17, 1.DG16
1.DG10, 1.DG11, 1.DG12
Loops:
propeller- 1.DT7, 1.DT8, 1.DA9
lateral- 1.DT13, 1.DT14, 1.DA15
lateral+ 1.DT19, 1.DT20, 1.DA21
AAAGGGTTAGGGTTAGGGTTAGGGAA
...(([...{)]...[[}...)]]..
...([{...)((...))(...)]}..
<details>
<summary>
Click to see the output JSON
</summary>
{
"metals": [],
"nucleotides": [
{
"index": 1,
"chain": "1",
"number": 1,
"icode": null,
"molecule": "DNA",
"fullName": "1.DA1",
"shortName": "A",
"chi": 22.30828283085781,
"glycosidicBond": "syn"
},
{
"index": 2,
"chain": "1",
"number": 2,
"icode": null,
"molecule": "DNA",
"fullName": "1.DA2",
"shortName": "A",
"chi": -123.05454402191421,
"glycosidicBond": "anti"
},
{
"index": 3,
"chain": "1",
"number": 3,
"icode": null,
"molecule": "DNA",
"fullName": "1.DA3",
"shortName": "A",
"chi": -94.96579955603106,
"glycosidicBond": "anti"
},
{
"index": 4,
"chain": "1",
"number": 4,
"icode": null,
"molecule": "DNA",
"fullName": "1.DG4",
"shortName": "G",
"chi": 79.28363721639316,
"glycosidicBond": "syn"
},
{
"index": 5,
"chain": "1",
"number": 5,
"icode": null,
"molecule": "DNA",
"fullName": "1.DG5",
"shortName": "G",
"chi": -126.01709201555563,
"glycosidicBond": "anti"
},
{
"index": 6,
"chain": "1",
"number": 6,
"icode": null,
"molecule": "DNA",
"fullName": "1.DG6",
"shortName": "G",
"chi": -127.26656202302102,
"glycosidicBond": "anti"
},
{
"index": 7,
"chain": "1",
"number": 7,
"icode": null,
"molecule": "DNA",
"fullName": "1.DT7",
"shortName": "T",
"chi": -63.10830751967371,
"glycosidicBond": "anti"
},
{
"index": 8,
"chain": "1",
"number": 8,
"icode": null,
"molecule": "DNA",
"fullName": "1.DT8",
"shortName": "T",
"chi": -138.79520345559828,
"glycosidicBond": "anti"
},
{
"index": 9,
"chain": "1",
"number": 9,
"icode": null,
"molecule": "DNA",
"fullName": "1.DA9",
"shortName": "A",
"chi": -148.83990757408878,
"glycosidicBond": "anti"
},
{
"index": 10,
"chain": "1",
"number": 10,
"icode": null,
"molecule": "DNA",
"fullName": "1.DG10",
"shortName": "G",
"chi": 58.77875250191579,
"glycosidicBond": "syn"
},
{
"index": 11,
"chain": "1",
"number": 11,
"icode": null,
"molecule": "DNA",
"fullName": "1.DG11",
"shortName": "G",
"chi": -123.85746807924986,
"glycosidicBond": "anti"
},
{
"index": 12,
"chain": "1",
"number": 12,
"icode": null,
"molecule": "DNA",
"fullName": "1.DG12",
"shortName": "G",
"chi": -84.36679807284759,
"glycosidicBond": "anti"
},
{
"index": 13,
"chain": "1",
"number": 13,
"icode": null,
"molecule": "DNA",
"fullName": "1.DT13",
"shortName": "T",
"chi": -30.819029132834157,
"glycosidicBond": "anti"
},
{
"index": 14,
"chain": "1",
"number": 14,
"icode": null,
"molecule": "DNA",
"fullName": "1.DT14",
"shortName": "T",
"chi": -168.51776782812965,
"glycosidicBond": "anti"
},
{
"index": 15,
"chain": "1",
"number": 15,
"icode": null,
"molecule": "DNA",
"fullName": "1.DA15",
"shortName": "A",
"chi": -105.72881577106517,
"glycosidicBond": "anti"
},
{
"index": 16,
"chain": "1",
"number": 16,
"icode": null,
"molecule": "DNA",
"fullName": "1.DG16",
"shortName":
