Piawka
A powerful `awk` script to calculate pi, Dxy and Fst in polyploid VCF files, with support for mixed-ploidy groups
piawka <img src="logo/logo.svg" align="right" width="25%">
A powerful awk script to calculate π, Dxy (or πxy, or Nei's D) and a few other simple statistics (Fst, Tajima's D, Ronfort's rho) from VCF files on the command line. Developed to analyze groups of arbitrary ploidy with substantial amounts of missing data.
:warning:
piawka is under development. If something does not seem to work well, check for updates and do not hesitate to file an issue!
Install it!
Quickly: conda
conda install -c bioconda piawka
Quickly but slower
Make the following programs available in the command line (install and add to PATH):
gawk v5.0.0 and above
tabix
Then, get piawka by cloning the repo and add the scripts to PATH:
git clone https://github.com/novikovalab/piawka.git
export PATH="$( realpath ./piawka/scripts ):${PATH}"
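Before running piawka from a manual install, it can help to confirm that both prerequisites are reachable. The following is a minimal sketch, not part of piawka itself: the helper names `version_ge` and `check_piawka_deps` are hypothetical, and the version comparison assumes a `sort` that supports `-V` (as GNU coreutils does).

```shell
#!/bin/sh
# Hypothetical helper: succeed if version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: report whether gawk (v5.0.0 and above) and tabix are in PATH.
check_piawka_deps() {
    status=0
    if command -v gawk >/dev/null 2>&1; then
        # The first line of `gawk --version` looks like "GNU Awk 5.1.0, API: 3.0 ..."
        gawk_version=$(gawk --version | head -n 1 | sed 's/^GNU Awk \([0-9.]*\).*/\1/')
        if version_ge "$gawk_version" "5.0.0"; then
            echo "gawk $gawk_version: OK"
        else
            echo "gawk $gawk_version: too old, need v5.0.0 or above"
            status=1
        fi
    else
        echo "gawk: not found in PATH"
        status=1
    fi
    if command -v tabix >/dev/null 2>&1; then
        echo "tabix: OK"
    else
        echo "tabix: not found in PATH"
        status=1
    fi
    return "$status"
}

check_piawka_deps || echo "Some prerequisites are missing; see the messages above."
```

The check only prints a report and never aborts the shell, so it is safe to paste into an interactive session.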
Use it!
$ piawka
piawka v0.8.11
Usage:
piawka -g groups_tsv -v vcf_gz [OPTIONS]
Options:
-1, --persite output values for each site
-b, --bed <arg> BED file with regions to be analyzed
-B, --targets <arg> BED file with targets (faster for numerous small regions)
-D, --nodxy do not output Dxy
-f, --fst output Hudson Fst
-F, --fstwc output Weir and Cockerham Fst instead
-g, --groups <arg> either 2-columns sample / group table or
keywords "unite" (1 group) or "divide" (n_samples groups)
-h, --help show this help message
-H, --het output only per-sample pi = heterozygosity
-j, --jobs <arg> number of parallel jobs to run
-m, --mult use multiallelic sites
-M, --miss <arg> max share of missing GT per group at site, 0.0-1.0
-P, --nopi do not output pi
-q, --quiet do not output progress and warning messages
-r, --rho output Ronfort's rho
-t, --tajimalike output TajimaD-like stat (manages missing data but untested)
-T, --tajima output classic TajimaD instead (affected by missing data)
-v, --vcf <arg> gzipped and tabixed VCF file
-w, --watterson output Watterson's theta
See the wiki for further details.
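As an illustration of how the options above combine, here is a hedged sketch of a typical run. The sample names, group names and the file name `input.vcf.gz` are placeholders, not files shipped with piawka; only flags listed in the help message are used. The guards make the snippet a no-op when piawka or the VCF is absent.

```shell
#!/bin/sh
# Two-column sample / group table for -g (tab-separated).
# Hypothetical sample IDs: replace them with the ones in your VCF header.
printf 'sample1\tpopA\nsample2\tpopA\nsample3\tpopB\nsample4\tpopB\n' > groups.tsv

# Placeholder input: -v expects a gzipped and tabixed VCF file.
vcf=input.vcf.gz

# Build the tabix index if the VCF exists but has not been indexed yet.
if [ -f "$vcf" ] && [ ! -f "$vcf.tbi" ]; then
    tabix -p vcf "$vcf"
fi

if command -v piawka >/dev/null 2>&1 && [ -f "$vcf" ]; then
    # pi and Dxy are reported unless -P / -D are given; additionally:
    #   -f      output Hudson Fst
    #   -M 0.5  allow at most 50% missing genotypes per group at a site
    #   -j 4    run 4 parallel jobs
    piawka -g groups.tsv -v "$vcf" -f -M 0.5 -j 4
else
    echo "piawka or $vcf not available; skipping the example run"
fi
```

To treat all samples as a single group instead of supplying a table, the keyword form `-g unite` from the help message can replace `-g groups.tsv`.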
Cite it!
If you want to express your gratitude for having piawka, please cite our Siberian Arabidopsis paper, where we introduced and first used it.
