G4Catchall
G4Catchall is a Python package that scans DNA/RNA sequences for G-quadruplexes with or without atypical features.
Please cite: Doluca, O. (2019). G4Catchall: A G-quadruplex prediction approach considering atypical features. Journal of Theoretical Biology, 463, 92-98. doi: 10.1016/j.jtbi.2018.12.007
DESCRIPTION
Searches a fasta file for matches to a G-quadruplex-fitting regex,
filters them through a G4Hunter-like secondary scoring scheme and returns a bed file with
the coordinates of each match, the matched sequence, the G-quadruplex-forming sequence and the score.
The output bed file has the following columns:
1. description of the fasta sequence (e.g. NC_00024.11 Y chromosome)
2. start of the match
3. end of the match
4. size of the match
5. strand of the match (e.g. +)
6. positive strand sequence of the match (e.g. CCCTTCCCTTTCCCTCCC)
7. matched G-quadruplex-forming sequence (e.g. GGGAGGGAAAGGGAAGGG)
8. score of the matched G-quadruplex-forming sequence based on selected scoring scheme
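The column layout above can be read back into named fields with a few lines of Python. This is a minimal sketch, not part of G4Catchall: the names `G4Record`, `parse_line` and `reverse_complement` are hypothetical, and the code assumes the eight columns are whitespace- or tab-separated and that only the first column (the fasta description) may contain spaces. It also checks the relationship visible in the examples for columns 6 and 7, where a "-" strand match is the reverse complement of the positive strand sequence.

```python
from typing import NamedTuple


class G4Record(NamedTuple):
    """One output line; field names are illustrative, not defined by G4Catchall."""
    description: str
    start: int
    end: int
    size: int
    strand: str
    plus_strand_seq: str
    g4_seq: str
    score: float


def parse_line(line: str) -> G4Record:
    # Split from the right so that a description containing spaces stays intact.
    desc, start, end, size, strand, plus_seq, g4_seq, score = (
        line.rstrip("\n").rsplit(None, 7)
    )
    return G4Record(desc, int(start), int(end), int(size),
                    strand, plus_seq, g4_seq, float(score))


def reverse_complement(seq: str) -> str:
    # Complement each base (U is treated like T), then reverse the string.
    table = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")
    return seq.translate(table)[::-1]


record = parse_line("mychr 49 67 18 - CCCTTCCCTTCCCTTCCC GGGAAGGGAAGGGAAGGG -2.0")
print(record.size == record.end - record.start)
print(reverse_complement(record.plus_strand_seq) == record.g4_seq)
```

Splitting from the right is a design choice for descriptions such as "NC_00024.11 Y chromosome"; if the file is known to be strictly tab-delimited, `line.split("\t")` would work equally well.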
EXAMPLE
## Test data (Windows cmd syntax, where ^ escapes >; in a POSIX shell use: echo '>mychr' > mychr.fa)
echo ^>mychr > mychr.fa
echo TTGGGTTGGGACTGGGTACGGGAATAAATAGGTTAGGAATGGATAGGATCCCTTCCCTTCCCTTCCCTTGGCGCGGCCGGCGG >> mychr.fa
python G4Catchall.py -f mychr.fa --G3L 1..3
mychr 2 22 20 + GGGTTGGGACTGGGTACGGG GGGTTGGGACTGGGTACGGG 1.7
mychr 49 67 18 - CCCTTCCCTTCCCTTCCC GGGAAGGGAAGGGAAGGG -2.0
## G-quadruplexes with two guanine tetrads can be included using --G2L
python G4Catchall.py -f mychr.fa --G3L 1..3 --G2L 1..3
mychr 2 22 20 + GGGTTGGGACTGGGTACGGG GGGTTGGGACTGGGTACGGG 1.7
mychr 30 47 17 + GGTTAGGAATGGATAGG GGTTAGGAATGGATAGG 0.9411764705882353
mychr 49 67 18 - CCCTTCCCTTCCCTTCCC GGGAAGGGAAGGGAAGGG -2.0
## Score threshold can be changed using --G4HThreshold
python G4Catchall.py -f mychr.fa --G3L 1..3 --G2L 1..3 --G4HThreshold 0.4
mychr 2 22 20 + GGGTTGGGACTGGGTACGGG GGGTTGGGACTGGGTACGGG 1.7
mychr 30 47 17 + GGTTAGGAATGGATAGG GGTTAGGAATGGATAGG 0.9411764705882353
mychr 49 67 18 - CCCTTCCCTTCCCTTCCC GGGAAGGGAAGGGAAGGG -2.0
mychr 69 83 14 + GGCGCGGCCGGCGG GGCGCGGCCGGCGG 0.7142857142857143
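The scores in the listings above are consistent with the standard G4Hunter rule: every base in a run of n consecutive G's scores min(n, 4), every base in a run of C's scores the negative of that, all other bases score 0, and the result is the mean over the sequence. The sketch below assumes that rule; the function names are hypothetical and G4Catchall's own scoring code may differ in detail. Negative values arise when the C-rich positive strand sequence is the one being scored.

```python
import re


def g4hunter_like_score(seq: str) -> float:
    """Mean per-base score under the assumed G4Hunter rule."""
    if not seq:
        raise ValueError("sequence must not be empty")
    total = 0
    for run in re.finditer(r"G+|C+", seq.upper()):
        n = len(run.group(0))
        per_base = min(n, 4)                       # run score is capped at 4
        sign = 1 if run.group(0)[0] == "G" else -1
        total += sign * per_base * n
    return total / len(seq)


def passes_threshold(score: float, threshold: float) -> bool:
    # Keep a prediction when its absolute score reaches the threshold.
    return abs(score) >= threshold


for s in ("GGGTTGGGACTGGGTACGGG", "CCCTTCCCTTCCCTTCCC", "GGCGCGGCCGGCGG"):
    print(s, g4hunter_like_score(s))
```

For example, GGGTTGGGACTGGGTACGGG has four G-runs of length 3 (12 bases at +3) and two isolated C's (-1 each), giving (36 - 2) / 20 = 1.7.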
## When no fasta file is given, the program only constructs the regex from the given parameters and prints it.
python G4Catchall.py --G3L 1..3 -I 0
([Gg]{3,}) (\w{1,3}) ([Gg]{3,}) (\w{1,3}) ([Gg]{3,}) (\w{1,3}) ([Gg]{3,})
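To illustrate how a loop range such as "1..3" maps onto the printed regex, the sketch below rebuilds the simplest case (`--G3L 1..3 -I 0`) and applies it with Python's `re` module. It is an illustration under assumptions, not the tool's code: `loop_quantifier` and `simple_g3_regex` are hypothetical names, the spaces between groups in the printed regex are assumed to be display separators only, and imperfect G-tracts, two-guanine tracts, extreme loops and the reverse strand are not handled.

```python
import re


def loop_quantifier(loop_range: str) -> str:
    """Turn a range such as "1..3" into a regex quantifier such as "{1,3}"."""
    low, high = (int(part) for part in loop_range.split(".."))
    if low > high:
        raise ValueError("lower loop limit exceeds upper limit")
    return "{%d,%d}" % (low, high)


def simple_g3_regex(loop_range: str) -> str:
    # Four G-tracts of at least three guanines, joined by three loops.
    tract = r"([Gg]{3,})"
    loop = r"(\w" + loop_quantifier(loop_range) + ")"
    return loop.join([tract] * 4)


pattern = re.compile(simple_g3_regex("1..3"))
seq = ("TTGGGTTGGGACTGGGTACGGGAATAAATAGGTTAGGAATGGATAGGATCC"
       "CTTCCCTTCCCTTCCCTTGGCGCGGCCGGCGG")
match = pattern.search(seq)
print(match.start(), match.end(), match.group(0))
# → 2 22 GGGTTGGGACTGGGTACGGG
```

The span found this way agrees with the first line of the example output for mychr (start 2, end 22).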
DOWNLOAD
G4Catchall.py is hosted at http://github.com/odoluca/G4Catchall
optional arguments:
-h, --help show this help message and exit
--fasta FASTA, -f FASTA
Input file in fasta format containing one or more sequences.
If omitted, only the regular expression constructed from the given
arguments is printed.
--min_Gtract_for_extreme_loop MIN_GTRACT_FOR_EXTREME_LOOP, -E MIN_GTRACT_FOR_EXTREME_LOOP
Defines the minimum G-tract length required to permit an extreme
loop. Works only with --extreme_loop. Can be set to 2 or 3. Default=3
--extreme_loop [EXTREME_LOOP], --XL [EXTREME_LOOP]
Allows search for an extreme loop. If followed by a value
such as "1..20", that value also defines the length limits of the loop. Omit
the value to use the default limits. Default="1..30"
--G2GQs_allowed Allows G-quadruplexes with G-tracts of two guanines. Not necessary
with the --G2GQ_loop option.
--G2GQ_loop [G2GQ_LOOP], --G2L [G2GQ_LOOP]
Allows G-quadruplexes with G-tracts of two guanines. If followed by a value
such as "1..7", that value defines the loop length limits for such G-quadruplexes.
Omit the value to use the default loop limits.
Default="1..2"
--G3GQ_loop G3GQ_LOOP, --G3L G3GQ_LOOP
Defines the loop length limits for typical G-quadruplexes, given as a
range such as "1..7". Omit this option to use the default loop limits.
Default="1..8"
--max_imperfect_Gtracts MAX_IMPERFECT_GTRACTS, -I MAX_IMPERFECT_GTRACTS
Defines the maximum number of atypical or "imperfect" G-tracts allowed for
G-quadruplexes with G-tracts of at least 3 guanines. It can be set to 0, 1
or 2. Default=1
--bulge_only, -B Defines the nature of the imperfect G-tracts allowed. If used, only
bulged G-tracts are allowed. Otherwise, mismatches are also allowed.
--max_GQ_length MAX_GQ_LENGTH, --max MAX_GQ_LENGTH
Maximum allowed G-quadruplex length for a single discovery. Use
with caution: unless combined with --dont_merge_overlapping,
merged sequences may be longer than the given value. This parameter is
mainly intended to limit the cumulative negative impact of long loops.
--no_reverse, -R By default the program searches both strands by reversing the regex.
If used, only the + strand is searched for matches.
--dont_merge_overlapping
Putative G-quadruplex-forming sequences may be found overlapping
on the same strand. By default the program merges these sequences. If
used, they are not merged. Using this option may produce a huge number of
matches and cause memory issues.
--include_flanks, --F
By default the program extracts only matching sequences.
If used, flanking nucleotides are also included in the search.
Note that G-quadruplex-forming sequences at the very beginning or
end of a sequence may then be missed. Consider adding "N" to both ends
of the sequence if G-quadruplex-forming sequences are expected
at the very edge of the target sequence.
--G4HThreshold G4HTHRESHOLD
Removes G-quadruplex predictions whose absolute G4Hunter score is lower than
the given threshold value. Default: 0.0.
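The default merging of overlapping matches on one strand, which --dont_merge_overlapping switches off, can be pictured as ordinary interval merging. The sketch below is a generic illustration and not G4Catchall's implementation: `merge_overlapping` is a hypothetical name, the coordinates are made-up examples, and intervals are assumed to be half-open (start, end) pairs so that size equals end minus start.

```python
def merge_overlapping(intervals):
    """Merge half-open (start, end) intervals that overlap on one strand."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start < merged[-1][1]:
            # Overlaps the previous interval: extend it instead of adding a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical coordinates: the first two overlap, the third stands alone.
print(merge_overlapping([(2, 22), (10, 30), (49, 67)]))
# → [(2, 30), (49, 67)]
```

A merged interval can be longer than any single match, which is why --max_GQ_length alone does not bound the length of reported sequences.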
