# YaGGUF - Yet Another GGUF Converter
There are simultaneously too many and not enough GGUF converters in the world.
## Features
- llama.cpp under the hood - so that part works
- Download - automatically download models and their auxiliary files from HuggingFace
- Convert - safetensors and PyTorch models to GGUF format
- Quantize - to multiple formats at once
- Cross-platform - works on Windows and Linux (and probably Mac but untested)
- Easy - auto-installs an environment + llama.cpp + CPU binaries for quantizing
- Flexible - can use any local llama.cpp repo or binary installation for quantizing
- Minimal mess - a virtual environment prevents conflicts with your Python setup
## Advanced Features
- Single or split files mode - Generate single or split files for intermediates and quants
- Split/Merge Shards - Split, merge, or resplit GGUF and safetensors files with custom shard sizes
- Importance Matrix - Generate or reuse imatrix files for better low-bit quantization (IQ2, IQ3)
- Imatrix Statistics - Analyze importance matrix files to view statistics
- Custom intermediates - Use existing GGUF files as intermediates for quantization
- Enhanced dtype detection - Detects model precision (BF16, F16, etc.) from configs and safetensors headers
- Model quirks detection - Handles Mistral format, pre-quantized models, and architecture-specific flags
- Vision/Multimodal models - Automatic detection and two-step conversion (text model + mmproj-*.gguf)
- Sentence-transformers - Auto-detect and include dense modules for embedding models
- VRAM Calculator - Estimate VRAM usage and recommended GPU layers (-ngl) for GGUF models
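Detecting model precision from a safetensors file comes down to reading its header: an 8-byte little-endian length followed by that many bytes of JSON metadata. A minimal sketch of that parsing (the demo file and tensor name are illustrative, not YaGGUF's actual code):

```python
import json
import struct

def read_safetensors_dtypes(path):
    """Return the set of tensor dtypes declared in a safetensors header."""
    with open(path, "rb") as f:
        # The file starts with an 8-byte little-endian header length,
        # followed by that many bytes of JSON metadata.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    return {
        entry["dtype"]
        for name, entry in header.items()
        if name != "__metadata__"
    }

# Build a tiny fake safetensors file to demonstrate (illustrative only):
meta = {"model.weight": {"dtype": "BF16", "shape": [2, 2], "data_offsets": [0, 8]}}
blob = json.dumps(meta).encode()
with open("demo.safetensors", "wb") as f:
    f.write(struct.pack("<Q", len(blob)) + blob + b"\x00" * 8)

print(read_safetensors_dtypes("demo.safetensors"))  # {'BF16'}
```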
## Quantization Types
All quantization types from llama.cpp are supported. Choose based on your size/quality tradeoff:
| Type | Size | Quality | Category | Imatrix | Notes |
|------|------|---------|----------|---------|-------|
| F32 | Largest | Original | Unquantized | - | Full 32-bit precision |
| F16 | Large | Near-original | Unquantized | - | Half precision |
| BF16 | Large | Near-original | Unquantized | - | Brain float 16-bit |
| Q8_0 | Very Large | Excellent | Legacy | - | Near-original quality |
| Q5_1, Q5_0 | Medium | Good | Legacy | - | Legacy 5-bit |
| Q4_1, Q4_0 | Small | Fair | Legacy | - | Legacy 4-bit |
| Q6_K | Large | Very High | K-Quant | Suggested | Near-F16 quality |
| Q5_K_M | Medium | Better | K-Quant | Suggested | Higher quality |
| Q5_K_S | Medium | Better | K-Quant | Suggested | 5-bit K small |
| Q4_K_M | Small | Good | K-Quant | Suggested | 4-bit K medium |
| Q4_K_S | Small | Good | K-Quant | Suggested | 4-bit K small |
| Q3_K_L | Very Small | Fair | K-Quant | Recommended | 3-bit K large |
| Q3_K_M | Very Small | Fair | K-Quant | Recommended | 3-bit K medium |
| Q3_K_S | Very Small | Fair | K-Quant | Recommended | 3-bit K small |
| Q2_K | Tiny | Minimal | K-Quant | Recommended | 2-bit K |
| Q2_K_S | Tiny | Minimal | K-Quant | Recommended | 2-bit K small |
| IQ4_NL | Small | Good | I-Quant | Recommended | 4-bit non-linear |
| IQ4_XS | Small | Good | I-Quant | Recommended | 4-bit extra-small |
| IQ3_M | Very Small | Fair | I-Quant | Recommended | 3-bit medium |
| IQ3_S | Very Small | Fair+ | I-Quant | Recommended | 3.4-bit small |
| IQ3_XS | Very Small | Fair | I-Quant | Required | 3-bit extra-small |
| IQ3_XXS | Very Small | Fair | I-Quant | Required | 3-bit extra-extra-small |
| IQ2_M | Tiny | Minimal | I-Quant | Required | 2-bit medium |
| IQ2_S | Tiny | Minimal | I-Quant | Required | 2-bit small |
| IQ2_XS | Tiny | Minimal | I-Quant | Required | 2-bit extra-small |
| IQ2_XXS | Tiny | Minimal | I-Quant | Required | 2-bit extra-extra-small |
| IQ1_M | Extreme | Poor | I-Quant | Required | 1-bit medium |
| IQ1_S | Extreme | Poor | I-Quant | Required | 1-bit small |
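For the I-quant types marked Required, an importance matrix must be supplied at quantization time. With a local llama.cpp build, the flow looks roughly like this (file names and the calibration text are illustrative; YaGGUF drives these tools for you):

```shell
# Generate an importance matrix from a calibration text file
llama-imatrix -m model-f16.gguf -f calibration.txt -o imatrix.dat

# Quantize using that matrix
llama-quantize --imatrix imatrix.dat model-f16.gguf model-iq2_m.gguf IQ2_M
```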
Quick Guide:
- Bigger is better (more precision)
- For best quality use F16 or Q8_0
- For decent quality use Q6_K or Q5_K_M
- For medium quality use Q4_K_M
- For smallest size use IQ3_M or IQ2_M with importance matrix
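The size column maps roughly to bits per weight, so you can sanity-check a download before quantizing. A back-of-the-envelope sketch (the bits-per-weight figures are approximate and vary by architecture and llama.cpp version; the 7B parameter count is illustrative):

```python
# Approximate bits-per-weight for a few common types (illustrative values).
BITS_PER_WEIGHT = {"Q8_0": 8.5, "Q6_K": 6.56, "Q4_K_M": 4.85, "IQ2_M": 2.7}

def estimate_gguf_gib(n_params: float, quant: str) -> float:
    """Rough file size in GiB for n_params weights at the given quant type."""
    total_bits = n_params * BITS_PER_WEIGHT[quant]
    return total_bits / 8 / 1024**3

for quant in BITS_PER_WEIGHT:
    print(f"{quant:7s} ~{estimate_gguf_gib(7e9, quant):.1f} GiB")
```

Actual files run slightly larger because embeddings, output layers, and metadata are not quantized to the same width.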
## Requirements
## Installation - Windows

```shell
# Clone the repository
git clone https://github.com/usrname0/YaGGUF.git
cd YaGGUF

# Run the launcher script for Windows (runs a setup script if no venv is detected):
.\run_gui.bat
```
## Installation - Linux

```shell
# Optional: install tkinter if you want to select folders via the GUI
sudo apt install python3-tk        # Ubuntu/Debian
sudo dnf install python3-tkinter   # Fedora/RHEL
sudo pacman -S tk                  # Arch

# Clone the repository
git clone https://github.com/usrname0/YaGGUF.git
cd YaGGUF

# Run the launcher script for Linux (runs a setup script if no venv is detected):
./run_gui.sh
```
## Usage

Windows:
- Double-click `run_gui.bat`

Linux:
- Run `./run_gui.sh` in a terminal
The GUI opens automatically in your browser on a free port, e.g. http://localhost:8501
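Picking a free port generally means asking the OS to bind to port 0 and reading back the port it assigned. A minimal sketch of that idea (Streamlit's own port selection may differ):

```python
import socket

def find_free_port() -> int:
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("localhost", 0))
        return s.getsockname()[1]

port = find_free_port()
print(f"http://localhost:{port}")
```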
## License
MIT License - see LICENSE file for details
## Credits
- llama.cpp - GGUF format and conversion/quantization tools
- HuggingFace - Model hosting and transformers library
- Streamlit - Pythonic data apps
