BioScripts

Cool Bioinformatics Scripts

Generate Convert Improve

Install / Use

/learn @Zilong-Li/BioScripts

About this skill

Quality Score

0/100

README

#+TITLE: Cool Bioinformatics Scripts

Table of Content :TOC:QUOTE: #+BEGIN_QUOTE

[[#qqplot][qqplot]]
[[#downsampling-bam-files][downsampling bam files]]
[[#genotype-refinement-using-beagle-3][genotype refinement using beagle 3]]
[[#calculate-genotype-discordance][calculate genotype discordance]]
[[#r2-vs-maf-plot][R2 vs MAF plot]]
[[#phylogenetic-tree][Phylogenetic Tree]]
[[#fixref][fixref]] #+END_QUOTE

[[file:qqplot.py][qqplot]] You can use make a QQ plot in the following ways.

one-liner for reading tons of millions of P values from the pipe

#+begin_src shell

python

zcat pval.txt.gz | qqplot.py -out test -title "QQ plot on the fly"

julia (recommand to run it in the REPL)

zcat pval.txt.gz | qqplot.jl --out test --title "QQ plot on the fly" #+end_src

warning : If you have 100 billion P values to process you should definitely use [[qqplot.jl]] instead of [[file:qqplot.py][qqplot.py]]. The hourly processed number of lines of julia version is 5 billion while python is only 700 million on my server.

running in a julia REPL (recommanded)

#+begin_src julia include("qqplot.jl") cmd = pipeline(zcat pval.gz, awk 'NR>1{print $10}') sigp, expp = qqfly("test", cmd=cmd) #+end_src

use qqplot.py in your script

#+begin_src python import numpy as np from qqplot import qq p = np.random.random(1000000) qq(x=p, figname="test.png") #+end_src

[[file:image/qqplot.png]]

[[file:downsample.sh][downsampling bam files]]

#+begin_src shell Usage: downsample.sh [-b <bamlist>] [-d <depth>] [-n <cores>] [-o <outdir>] #+end_src

[[file:beagle3-imputation.sh][genotype refinement using beagle 3]]

#+begin_src shell Usage: beagle3-imputation.sh [options] Pipeline of genotype refinement for median depth sequencing data using beagle3

-h, Display help -i, Input VCF/BCF file -o, Output folder -f, MAF filters before imputation #+end_src

[[file:calc_imputed_gt_discord.py][calculate genotype discordance]]

When you run imputation analysis with =BEAGLE= (or other imputation tools), you may want to know the distribution of genotype discordance between the original vcf and imputed vcf.

#+begin_src shell usage: calc_imputed_gt_discord.py [-h] [-chr STRING] VCF1 VCF2 OUT #+end_src

warning : Before running the script, you must be sure the two vcfs have the exact same sites and samples for each chromosome.

[[file:image/calc_imputed_gt_discord.png]]

[[file:plot-r2-vs-maf.R][R2 vs MAF plot]] plot INFO/R2 after imputation by =BEAGLE= etc.

[[file:image/r2-vs-maf.png]]

[[file:njtree.R][Phylogenetic Tree]]

[[file:image/njtree-circular.png]]

[[file:fixref.py][fixref]]

Before running =bcftools merge=, you maybe need to fix the ref and alt and corresponding genotypes, otherwise =bcftools= will surprise you.

#+begin_src shell usage: fixref.py [-h] REF_VCF IN_VCF OUT_VCF #+end_src

Related Skills

node-connect

339.3k

Diagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps

frontend-design

83.9k

Create distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.

openai-whisper-api

339.3k

Transcribe audio via OpenAI Audio Transcriptions API (Whisper).

commit-push-pr

83.9k

Commit, push, and open a PR