BioAider

A feature-rich desktop platform for bioinformatics data analysis, designed especially for quick sequence annotation and mutation analysis of large-scale viral (or other) genome-sequencing data.

Bioinformatics Aider (BioAider)

<b>Note</b> that versions earlier than 1.423 were not optimized for read speed on large datasets.

<b>New: BioAider v1.727 (2024/09/25) is stronger and more stable, and we highly recommend it.</b>

1. Introduction

With the development of sequencing technology, a large amount of genomic sequencing data has accumulated. Analyzing these data helps us understand genetic variation at the molecular level. However, processing large-scale sequence data is difficult for biological or clinical experts without bioinformatics or programming skills. In addition, needs vary with the research purpose. Software that combines diverse functions with simple operation is therefore very valuable.

BioAider is developed in Python 3 and is a user-friendly program with a graphical user interface (GUI). As a desktop platform, <b>the design concept of BioAider is simple operation and highly summarized analysis results, which can save researchers a lot of time</b>.

BioAider

<div align="center"> <a href=https://www.sciencedirect.com/science/article/pii/S2210670720306867>Zhou et al., <i> Sustainable Cities and Society</i>, 2020</a> </div> <br />

Since its release, BioAider has been used by many researchers in their studies. We will continue to optimize BioAider and add new features.

<div align="left"> <kbd><img src="https://github.com/ZhijianZhou01/BioAider/blob/master/Figures/download_count_2024-01-24-v.jpg" width="504" height="457" alt="download_count" align="left" /></kbd> </div> <div align="left"> <i>BioAider V1.0~V1.527 (prior to January 24, 2024)</i> </div>

2. Download, install and run

BioAider and all updated versions are freely available to non-commercial users. After obtaining the program, users can run the executable file in the "main" directory directly. BioAider runs on Windows, Linux (Ubuntu 16.04 or later) and macOS.

Github links

Other download links (China)

(1) On Windows or macOS, run BioAider directly by clicking BioAider.exe (on Windows) or bioaider (on macOS) in the directory main.

(2) On Linux (Ubuntu 16.04 or later), first switch to the directory main, then run:

$ ./bioaider

If you do not have permission to run BioAider on Linux, you can grant it with:

$ chmod -R 777 BioAider_v1.423_linux_20220324

3. Preview of BioAider

BioAider GUI

4. Example of functions

<b>Note:</b> BioAider will remain under long-term development, and its functions will continue to be improved. <b>Only a small part of the features are shown here</b>; please refer to the instruction Manual V1.423 and the Update record for details.

4.1. Mutation Analysis

This function analyzes the <b>mutation characteristics of large numbers of sequenced strains</b>. The sequence data must be aligned in advance and can be nucleotide sequences, protein (amino acid) sequences or coding gene fragments. For nucleotide and protein sequences, BioAider summarizes all mutation sites with their corresponding frequencies and strains.
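The per-site summary described above can be illustrated with a minimal Python sketch. It is not BioAider's own code: the function names, the choice of a single reference strain, and the `G4A`-style variant labels are assumptions made for illustration only.

```python
from collections import defaultdict


def summarize_mutation_sites(aligned, reference_id):
    """Map each variable site (1-based) to {variant label: [strain ids]}.

    `aligned` is a dict of equal-length aligned sequences keyed by strain id.
    Comparing every strain against one reference is an assumption of this sketch.
    """
    ref = aligned[reference_id]
    summary = defaultdict(lambda: defaultdict(list))
    for strain, seq in aligned.items():
        if strain == reference_id:
            continue
        if len(seq) != len(ref):
            raise ValueError(f"{strain} is not aligned to the reference")
        for pos, (ref_char, alt_char) in enumerate(zip(ref, seq), start=1):
            if alt_char != ref_char:
                summary[pos][f"{ref_char}{pos}{alt_char}"].append(strain)
    return summary


def mutation_frequencies(summary, n_strains):
    """Fraction of the compared strains that carry each variant."""
    return {
        variant: len(strains) / n_strains
        for site in summary.values()
        for variant, strains in site.items()
    }


aligned = {
    "ref": "ATGGCT",
    "s1": "ATGGCT",
    "s2": "ATGACT",
    "s3": "ATGACA",
}
summary = summarize_mutation_sites(aligned, "ref")
freqs = mutation_frequencies(summary, n_strains=len(aligned) - 1)
for variant, freq in sorted(freqs.items()):
    print(variant, round(freq, 2))
```

Keeping the strain ids next to each variant mirrors the idea of reporting both the frequency of a site and the strains that carry it.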

If the data are coding gene (codon) sequences, BioAider provides several codon tables, scans each codon site in the aligned sequence dataset, and identifies the type of mutation: synonymous, non-synonymous, insertion, deletion or early termination. Finally, BioAider automatically summarizes and outputs the analysis results.
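As a rough illustration of codon-level classification, the sketch below compares two aligned coding sequences codon by codon using the standard genetic code. It is an assumption-laden example, not BioAider's implementation: the function names are hypothetical, only the standard codon table is included, and gaps are assumed to cover whole codons.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon(ref_codon, alt_codon):
    """Classify one aligned codon pair (hypothetical helper)."""
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    if ref_codon == alt_codon:
        return "identical"
    ref_gap, alt_gap = "-" in ref_codon, "-" in alt_codon
    if ref_gap and alt_gap:
        return "ambiguous"
    if ref_gap:
        return "insertion"
    if alt_gap:
        return "deletion"
    ref_aa, alt_aa = CODON_TABLE.get(ref_codon), CODON_TABLE.get(alt_codon)
    if ref_aa is None or alt_aa is None:
        return "ambiguous"  # e.g. codons containing N
    if alt_aa == "*" and ref_aa != "*":
        return "early termination"
    return "synonymous" if ref_aa == alt_aa else "non-synonymous"


def scan_codons(ref_seq, alt_seq):
    """Return [(codon number, mutation type)] for every differing codon."""
    if len(ref_seq) != len(alt_seq) or len(ref_seq) % 3:
        raise ValueError("sequences must be codon-aligned and of equal length")
    calls = []
    for i in range(0, len(ref_seq), 3):
        kind = classify_codon(ref_seq[i:i + 3], alt_seq[i:i + 3])
        if kind != "identical":
            calls.append((i // 3 + 1, kind))
    return calls


calls = scan_codons("ATGGCTCAATTA", "ATGGCCTAA---")
print(calls)
```

Swapping in another codon table would only require replacing the `AMINO_ACIDS` string with the one for that genetic code.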

<b>Note: </b>Coding gene sequences for mutation analysis have to be aligned by the translation-alignment method in advance. BioAider packages three multiple-sequence-alignment programs (mafft, muscle and clustal-omega) in the graphical interface and additionally provides translation alignment.

Whether the data are nucleotides, amino acids or coding genes, BioAider can plot the frequency distribution of mutation sites according to user-specified groups of substitution frequency.

Example of mutation analysis for aligned SARS-CoV-2 ORF3a gene sequences. First, drag the sequence file to be analyzed into the input box and select the "Codon" radio button under "Datas type":
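The general idea behind translation alignment is to align the translated proteins and then thread each coding sequence back onto its aligned protein, so that gaps always cover whole codons. The sketch below shows only that threading step, under the assumptions that the protein alignment already exists and that the coding sequence has exactly one codon per residue (any trailing stop codon removed). The function name is hypothetical, and this is not a description of how BioAider performs the step internally.

```python
def back_translate_alignment(aligned_protein, cds):
    """Thread an unaligned coding sequence onto its aligned protein.

    Each residue consumes one codon; each protein gap becomes '---'.
    """
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    n_residues = sum(1 for aa in aligned_protein if aa != "-")
    if len(cds) % 3 or len(codons) != n_residues:
        raise ValueError("coding sequence does not match the protein length")
    remaining = iter(codons)
    return "".join(
        "---" if aa == "-" else next(remaining) for aa in aligned_protein
    )


codon_alignment = back_translate_alignment("MA-L", "ATGGCTTTA")
print(codon_alignment)  # ATGGCT---TTA
```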

Mutation Analysis.png

After the run finishes, the analysis results can be found in the directory where the source file is located. Scan the <b>*_mutation site summary</b> file to see the overall variation and the mutation hotspots.

SARS-CoV-2_ORF3a_aligned_summary_file.png

If you also need to plot the distribution of synonymous/non-synonymous substitution bases, you can prepare a grouping table first:

Groups of mutation frequency.png

Each group of substitution frequency consists of a start value and an end value separated by a tab. <b>Note: the start value</b> of each group is not included in the frequency range.

group_in_mutation.png

You can also see the number of mutation sites in each mutation frequency group by viewing <b>*_substitution frequency distribution.png</b>:
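The grouping rule can be sketched in a few lines of Python. The example assumes one tab-separated `start` and `end` pair per line and treats each group as the half-open interval (start, end]. The source only states that the start value is excluded, so including the end value is an assumption, as are the function names and the sample values.

```python
def parse_groups(text):
    """Parse tab-separated 'start<TAB>end' lines into (start, end) tuples."""
    groups = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        start, end = (float(value) for value in line.split("\t"))
        groups.append((start, end))
    return groups


def count_sites_per_group(frequencies, groups):
    """Count the sites whose frequency falls in each (start, end] group."""
    counts = {group: 0 for group in groups}
    for freq in frequencies:
        for start, end in groups:
            if start < freq <= end:  # the start value itself is excluded
                counts[(start, end)] += 1
                break
    return counts


group_table = "0\t1\n1\t10\n10\t100\n"
groups = parse_groups(group_table)
counts = count_sites_per_group([1, 1, 2, 10, 50], groups)
for (start, end), n_sites in counts.items():
    print(f"({start:g}, {end:g}]: {n_sites} sites")
```

Because the start value is excluded, a frequency of exactly 1 lands in the first group here and not in the second.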

SARS-CoV-2_ORF3a_aligned_substitution frequency distribution.png

Although the ORF3a gene has many mutation sites, more than half of them appear in only a single strain.

You can also find the mutant strains corresponding to these variant sites in the detailed <b>*_log.txt</b> file:

SARS-CoV-2_ORF3a_aligned log.png

4.2. Lollipop chart of gene mutation

A lollipop map is an efficient way to display gene mutation sites and frequencies. It looks like the following:

Lollipop map of mutation

In BioAider, you only need to prepare the corresponding matrix file and set a few parameters to quickly complete the drawing.

4.3. Figure of sequence alignment

Since version 1.727, BioAider includes a sequence editor that supports viewing and editing sequences. More importantly, you can easily export a sequence alignment diagram, and BioAider supports custom colors for each base (or amino acid).

alignment_diagram

4.4. Fast Annotation

Different strains of the same virus usually have relatively high nucleotide identity. Therefore, after multiple sequence alignment, sequences can be annotated based on the gene information of the reference sequence.

BioAider provides a quick sequence annotation function. Import the aligned complete genome sequence set (FASTA format file) and move the reference sequence used for annotation to the top of the file. Paste the gene information of the reference sequence in the aligned set (name, start string and end string, separated by ",") into the text box, then extract the genes in batch. Note that the start string and end string of a gene are not limited in length, but each must be unique in the reference sequence. In addition, the higher the similarity among the sequences, the more accurate the annotation.
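The annotation idea can be sketched as follows: find the unique start and end strings in the aligned reference, then cut the same alignment columns out of every sequence. This is a minimal illustration, not BioAider's code. The function names are hypothetical, the reference is assumed to be the first entry of the alignment, and the start and end strings are assumed to be copied from the aligned reference so that they match it literally.

```python
def find_unique(reference, pattern):
    """Return the index of `pattern`, requiring exactly one occurrence."""
    first = reference.find(pattern)
    if first == -1 or reference.find(pattern, first + 1) != -1:
        raise ValueError(f"{pattern!r} must occur exactly once in the reference")
    return first


def extract_genes(aligned, gene_lines):
    """Cut each gene's alignment columns out of every sequence.

    `aligned` maps strain id -> aligned sequence, reference first.
    `gene_lines` holds 'name,start string,end string' entries.
    """
    reference = next(iter(aligned.values()))
    genes = {}
    for line in gene_lines:
        name, start_string, end_string = (f.strip() for f in line.split(","))
        start = find_unique(reference, start_string)
        end = find_unique(reference, end_string) + len(end_string)
        if end <= start:
            raise ValueError(f"end string precedes start string for {name}")
        genes[name] = {strain: seq[start:end] for strain, seq in aligned.items()}
    return genes


aligned = {
    "ref": "CCATGAAATTTGGGTAACC",
    "s1": "CCATGAAACTTGGGTAACC",
    "s2": "CCATGAA-TTTGGGTAACC",
}
genes = extract_genes(aligned, ["geneX,ATGAAA,GGGTAA"])
for strain, seq in genes["geneX"].items():
    print(strain, seq)
```

The uniqueness check reflects the requirement that each start and end string occur only once in the reference. Because the slice is taken by alignment column, strains that differ from the reference at the boundaries are still extracted.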

Fast_Annotation.png
