# YaGGUF - Yet Another GGUF Converter
There are simultaneously too many and not enough GGUF converters in the world.
## Features
- llama.cpp under the hood - so that part works
- Download - automatically download models and their auxiliary files from HuggingFace
- Convert - safetensors and PyTorch models to GGUF format
- Quantize - to multiple formats at once
- Cross-platform - works on Windows and Linux (and probably Mac but untested)
- Easy - auto-installs an environment + llama.cpp + CPU binaries for quantizing
- Flexible - can use any local llama.cpp repo or binary installation for quantizing
- Minimal mess - a virtual environment prevents conflicts with your Python setup
## Advanced Features
- Single or split files mode - Generate single or split files for intermediates and quants
- Split/Merge Shards - Split, merge, or resplit GGUF and safetensors files with custom shard sizes
- Importance Matrix - Generate or reuse imatrix files for better low-bit quantization (IQ2, IQ3)
- Imatrix Statistics - Analyze importance matrix files to view statistics
- Custom intermediates - Use existing GGUF files as intermediates for quantization
- Enhanced dtype detection - Detects model precision (BF16, F16, etc.) from configs and safetensors headers
- Model quirks detection - Handles Mistral format, pre-quantized models, and architecture-specific flags
- Vision/Multimodal models - Automatic detection and two-step conversion (text model + mmproj-*.gguf)
- Sentence-transformers - Auto-detect and include dense modules for embedding models
- VRAM Calculator - Estimate VRAM usage and recommended GPU layers (-ngl) for GGUF models
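Detecting model precision from a safetensors file comes down to reading its header: an 8-byte little-endian length followed by that many bytes of JSON metadata. A minimal sketch of that parsing (the demo file and tensor name are illustrative, not YaGGUF's actual code):

```python
import json
import struct

def read_safetensors_dtypes(path):
    """Return the set of tensor dtypes declared in a safetensors header."""
    with open(path, "rb") as f:
        # The file starts with an 8-byte little-endian header length,
        # followed by that many bytes of JSON metadata.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    return {
        entry["dtype"]
        for name, entry in header.items()
        if name != "__metadata__"
    }

# Build a tiny fake safetensors file to demonstrate (illustrative only):
meta = {"model.weight": {"dtype": "BF16", "shape": [2, 2], "data_offsets": [0, 8]}}
blob = json.dumps(meta).encode()
with open("demo.safetensors", "wb") as f:
    f.write(struct.pack("<Q", len(blob)) + blob + b"\x00" * 8)

print(read_safetensors_dtypes("demo.safetensors"))  # {'BF16'}
```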
## Quantization Types
All quantization types from llama.cpp are supported. Choose based on your size/quality tradeoff:
| Type | Size | Quality | Category | Imatrix | Notes |
|------|------|---------|----------|---------|-------|
| F32 | Largest | Original | Unquantized | - | Full 32-bit precision |
| F16 | Large | Near-original | Unquantized | - | Half precision |
| BF16 | Large | Near-original | Unquantized | - | Brain float 16-bit |
| Q8_0 | Very Large | Excellent | Legacy | - | Near-original quality |
| Q5_1, Q5_0 | Medium | Good | Legacy | - | Legacy 5-bit |
| Q4_1, Q4_0 | Small | Fair | Legacy | - | Legacy 4-bit |
| Q6_K | Large | Very High | K-Quant | Suggested | Near-F16 quality |
| Q5_K_M | Medium | Better | K-Quant | Suggested | Higher quality |
| Q5_K_S | Medium | Better | K-Quant | Suggested | 5-bit K small |
| Q4_K_M | Small | Good | K-Quant | Suggested | 4-bit K medium |
| Q4_K_S | Small | Good | K-Quant | Suggested | 4-bit K small |
| Q3_K_L | Very Small | Fair | K-Quant | Recommended | 3-bit K large |
| Q3_K_M | Very Small | Fair | K-Quant | Recommended | 3-bit K medium |
| Q3_K_S | Very Small | Fair | K-Quant | Recommended | 3-bit K small |
| Q2_K | Tiny | Minimal | K-Quant | Recommended | 2-bit K |
| Q2_K_S | Tiny | Minimal | K-Quant | Recommended | 2-bit K small |
| IQ4_NL | Small | Good | I-Quant | Recommended | 4-bit non-linear |
| IQ4_XS | Small | Good | I-Quant | Recommended | 4-bit extra-small |
| IQ3_M | Very Small | Fair | I-Quant | Recommended | 3-bit medium |
| IQ3_S | Very Small | Fair+ | I-Quant | Recommended | 3.4-bit small |
| IQ3_XS | Very Small | Fair | I-Quant | Required | 3-bit extra-small |
| IQ3_XXS | Very Small | Fair | I-Quant | Required | 3-bit extra-extra-small |
| IQ2_M | Tiny | Minimal | I-Quant | Required | 2-bit medium |
| IQ2_S | Tiny | Minimal | I-Quant | Required | 2-bit small |
| IQ2_XS | Tiny | Minimal | I-Quant | Required | 2-bit extra-small |
| IQ2_XXS | Tiny | Minimal | I-Quant | Required | 2-bit extra-extra-small |
| IQ1_M | Extreme | Poor | I-Quant | Required | 1-bit medium |
| IQ1_S | Extreme | Poor | I-Quant | Required | 1-bit small |
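For the I-quant types marked Required, an importance matrix must be supplied at quantization time. With a local llama.cpp build, the flow looks roughly like this (file names and the calibration text are illustrative; YaGGUF drives these tools for you):

```shell
# Generate an importance matrix from a calibration text file
llama-imatrix -m model-f16.gguf -f calibration.txt -o imatrix.dat

# Quantize using that matrix
llama-quantize --imatrix imatrix.dat model-f16.gguf model-iq2_m.gguf IQ2_M
```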
Quick Guide:
- Bigger is better (more precision)
- For best quality use F16 or Q8_0
- For decent quality use Q6_K or Q5_K_M
- For medium quality use Q4_K_M
- For smallest size use IQ3_M or IQ2_M with importance matrix
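The size column maps roughly to bits per weight, so you can sanity-check a download before quantizing. A back-of-the-envelope sketch (the bits-per-weight figures are approximate and vary by architecture and llama.cpp version; the 7B parameter count is illustrative):

```python
# Approximate bits-per-weight for a few common types (illustrative values).
BITS_PER_WEIGHT = {"Q8_0": 8.5, "Q6_K": 6.56, "Q4_K_M": 4.85, "IQ2_M": 2.7}

def estimate_gguf_gib(n_params: float, quant: str) -> float:
    """Rough file size in GiB for n_params weights at the given quant type."""
    total_bits = n_params * BITS_PER_WEIGHT[quant]
    return total_bits / 8 / 1024**3

for quant in BITS_PER_WEIGHT:
    print(f"{quant:7s} ~{estimate_gguf_gib(7e9, quant):.1f} GiB")
```

Actual files run slightly larger because embeddings, output layers, and metadata are not quantized to the same width.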
## Requirements
## Installation - Windows

```shell
# Clone the repository
git clone https://github.com/usrname0/YaGGUF.git
cd YaGGUF

# Run the launcher script for Windows (runs a setup script if no venv is detected):
.\run_gui.bat
```
## Installation - Linux

```shell
# Optional: install tkinter if you want to select folders via the GUI
sudo apt install python3-tk        # Ubuntu/Debian
sudo dnf install python3-tkinter   # Fedora/RHEL
sudo pacman -S tk                  # Arch

# Clone the repository
git clone https://github.com/usrname0/YaGGUF.git
cd YaGGUF

# Run the launcher script for Linux (runs a setup script if no venv is detected):
./run_gui.sh
```
## Usage

Windows:
- Double-click `run_gui.bat`

Linux:
- Run `./run_gui.sh` in a terminal
The GUI opens automatically in your browser on a free port, e.g. http://localhost:8501
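Picking a free port generally means asking the OS to bind to port 0 and reading back the port it assigned. A minimal sketch of that idea (Streamlit's own port selection may differ):

```python
import socket

def find_free_port() -> int:
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("localhost", 0))
        return s.getsockname()[1]

port = find_free_port()
print(f"http://localhost:{port}")
```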
## License
MIT License - see LICENSE file for details
## Credits
- llama.cpp - GGUF format and conversion/quantization tools
- HuggingFace - Model hosting and transformers library
- Streamlit - Pythonic data apps
