findGSE

findGSE is a tool for estimating the size of (heterozygous diploid or homozygous) genomes by iteratively fitting k-mer frequencies with a skew normal distribution model. It is written in R (code). The current version works on Linux and Mac OS X with R version 3.3.1 or above.

To use findGSE, provide a k value and the corresponding k-mer histo file generated from short reads. The file contains two tab-separated columns: the first gives the frequency at which k-mers occur in the reads, and the second gives the number of distinct k-mers with that frequency (example).
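A quick format check can catch a malformed histo file before it reaches R. The following is a minimal sketch, not part of findGSE: the helper name `check_histo` and the file `toy.histo` are hypothetical, and the numbers in the toy file are made up for illustration. It relies on awk's default whitespace splitting, so it accepts tab- or space-separated columns.

```shell
# Build a tiny toy histo file (made-up numbers, for illustration only).
printf '1\t5000\n2\t300\n3\t40\n' > toy.histo

# check_histo FILE: succeed only if the file is non-empty and every line has
# exactly two non-negative integer columns
# (k-mer frequency, number of distinct k-mers at that frequency).
check_histo() {
  awk 'NF != 2 || $1 !~ /^[0-9]+$/ || $2 !~ /^[0-9]+$/ { bad = 1 }
       END { if (bad || NR == 0) exit 1 }' "$1"
}

check_histo toy.histo && echo "toy.histo looks well-formed"
```

The same function can be pointed at a real file, for example `check_histo test_21mer.histo`.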

Given multiple fastq.gz files, here is a two-step example for counting k-mers with jellyfish:

  zcat *.fastq.gz | jellyfish count /dev/fd/0 -C -o test_21mer -m 21 -t 1 -s 5G
  jellyfish histo -h 3000000 -o test_21mer.histo test_21mer

Once the .histo file is ready and findGSE has been installed (see INSTALL), genome size estimation (GSE) can be run in R:

  library("findGSE")
  findGSE(histo="test_21mer.histo", sizek=21, outdir="hom_test_21mer")

The result is printed in the form "Genome size estimate for test_21mer.histo: 1498918 bp." For more details on the estimation, check the .txt and .pdf files in the output directory.
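As a rough cross-check of the printed estimate, the textbook back-of-envelope calculation divides the total number of k-mer occurrences by the frequency at the main peak of the histogram. The sketch below is not findGSE's method (it does no skew normal fitting and ignores heterozygosity and repeats); the helper name `naive_gse`, the default error cutoff of 5, and the toy numbers are all assumptions made for illustration.

```shell
# Toy histogram (made-up numbers): low-frequency error k-mers plus one peak at 20.
printf '1\t90000\n2\t8000\n19\t400\n20\t600\n21\t400\n' > toy_peak.histo

# naive_gse FILE [MIN_FREQ]
# Ignore k-mers seen fewer than MIN_FREQ times (assumed sequencing errors), then
# print (sum of frequency * distinct k-mers) / (frequency with the most distinct k-mers).
naive_gse() {
  awk -v min="${2:-5}" '
    $1 >= min {
      total += $1 * $2                          # k-mer occurrences at this frequency
      if ($2 > best) { best = $2; peak = $1 }   # track the main peak
    }
    END { if (peak > 0) printf "%.0f\n", total / peak }
  ' "$1"
}

naive_gse toy_peak.histo 5   # prints 1400
```

A large gap between this number and the findGSE estimate is not necessarily an error, since the naive calculation is sensitive to the cutoff and to heterozygous or repetitive genomes; it is only a sanity check on the order of magnitude.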

Two detailed toy examples of GSE, one for a heterozygous genome and one for a homozygous genome, are provided for experimentation.
