BioScripts
Cool Bioinformatics Scripts
Install / Use
/learn @Zilong-Li/BioScriptsREADME
#+TITLE: Cool Bioinformatics Scripts
- Table of Content :TOC:QUOTE: #+BEGIN_QUOTE
- [[#qqplot][qqplot]]
- [[#downsampling-bam-files][downsampling bam files]]
- [[#genotype-refinement-using-beagle-3][genotype refinement using beagle 3]]
- [[#calculate-genotype-discordance][calculate genotype discordance]]
- [[#r2-vs-maf-plot][R2 vs MAF plot]]
- [[#phylogenetic-tree][Phylogenetic Tree]]
- [[#fixref][fixref]] #+END_QUOTE
- [[file:qqplot.py][qqplot]] You can use make a QQ plot in the following ways.
- one-liner for reading tons of millions of P values from the pipe
#+begin_src shell
python
zcat pval.txt.gz | qqplot.py -out test -title "QQ plot on the fly"
julia (recommand to run it in the REPL)
zcat pval.txt.gz | qqplot.jl --out test --title "QQ plot on the fly" #+end_src
warning : If you have 100 billion P values to process you should definitely use [[qqplot.jl]] instead of [[file:qqplot.py][qqplot.py]]. The hourly processed number of lines of julia version is 5 billion while python is only 700 million on my server.
- running in a julia REPL (recommanded)
#+begin_src julia
include("qqplot.jl")
cmd = pipeline(zcat pval.gz, awk 'NR>1{print $10}')
sigp, expp = qqfly("test", cmd=cmd)
#+end_src
- use qqplot.py in your script
#+begin_src python import numpy as np from qqplot import qq p = np.random.random(1000000) qq(x=p, figname="test.png") #+end_src
[[file:image/qqplot.png]]
- [[file:downsample.sh][downsampling bam files]]
#+begin_src shell Usage: downsample.sh [-b <bamlist>] [-d <depth>] [-n <cores>] [-o <outdir>] #+end_src
- [[file:beagle3-imputation.sh][genotype refinement using beagle 3]]
#+begin_src shell Usage: beagle3-imputation.sh [options] Pipeline of genotype refinement for median depth sequencing data using beagle3
-h, Display help -i, Input VCF/BCF file -o, Output folder -f, MAF filters before imputation #+end_src
- [[file:calc_imputed_gt_discord.py][calculate genotype discordance]]
When you run imputation analysis with =BEAGLE= (or other imputation tools), you may want to know the distribution of genotype discordance between the original vcf and imputed vcf.
#+begin_src shell usage: calc_imputed_gt_discord.py [-h] [-chr STRING] VCF1 VCF2 OUT #+end_src
warning : Before running the script, you must be sure the two vcfs have the exact same sites and samples for each chromosome.
[[file:image/calc_imputed_gt_discord.png]]
- [[file:plot-r2-vs-maf.R][R2 vs MAF plot]] plot INFO/R2 after imputation by =BEAGLE= etc.
[[file:image/r2-vs-maf.png]]
- [[file:njtree.R][Phylogenetic Tree]]
[[file:image/njtree-circular.png]]
- [[file:fixref.py][fixref]]
Before running =bcftools merge=, you maybe need to fix the ref and alt and corresponding genotypes, otherwise =bcftools= will surprise you.
#+begin_src shell usage: fixref.py [-h] REF_VCF IN_VCF OUT_VCF #+end_src
Related Skills
node-connect
339.3kDiagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps
frontend-design
83.9kCreate distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.
openai-whisper-api
339.3kTranscribe audio via OpenAI Audio Transcriptions API (Whisper).
commit-push-pr
83.9kCommit, push, and open a PR
