<div align="center">
  <img src="images/optiPfair.png" alt="optipfair Logo" width="600"/>
</div>
<div align="center">
  <h1>optipfair</h1>
  <strong>The Python library for making LLMs both efficient (via pruning) and fair (via bias analysis).</strong>
</div>
<p align="center">
  <a href="https://pypi.org/project/optipfair/"><img alt="PyPI Version" src="https://img.shields.io/pypi/v/optipfair?color=blue"></a>
  <a href="https://pypi.org/project/optipfair/"><img alt="Downloads" src="https://img.shields.io/pypi/dm/optipfair?color=orange"></a>
  <a href="https://github.com/peremartra/optipfair/blob/main/LICENSE"><img alt="License" src="https://img.shields.io/github/license/peremartra/optipfair?color=green"></a>
  <a href="https://github.com/peremartra/optipfair/stargazers"><img alt="GitHub Stars" src="https://img.shields.io/github/stars/peremartra/optipfair?style=social"></a>
</p>
<div align="center">
  <h3>
    <a href="https://peremartra.github.io/optipfair/" target="_blank">Documentation</a> ·
    <a href="https://github.com/peremartra/optipfair/issues" target="_blank">Report Bug</a> ·
    <a href="https://github.com/peremartra/optipfair/issues" target="_blank">Request Feature</a>
  </h3>
</div>

New to optipfair? Use our LLM Reference Manual: paste it into ChatGPT, Claude, or your favorite LLM for guided assistance with any optipfair task.
**Note on Terminology:** The default neuron selection method is PPM (Peak-to-Peak Magnitude), which calculates neuron importance based on the full dynamic range of weights (max + |min|). This method is formally described in: Martra, P. (2025). *Fragile Knowledge, Robust Instruction-Following: The Width Pruning Dichotomy in Llama-3.2.* ArXiv. https://arxiv.org/abs/2512.22671. For backward compatibility, the parameter value `"MAW"` is still accepted and maps to PPM.
🚀 Interactive Demos: Try optipfair NOW
Experience optipfair's capabilities directly in your browser.
| Live Bias Visualization Demo |
| :--------------------------: |
| Analyze any compatible model from Hugging Face with a full UI. No setup required. |
| 🚀 Launch the Live Demo on HF Spaces |
Tutorials on Google Colab
Explore optipfair’s features with these interactive notebooks.
| Tutorial | Description | Link |
| :--- | :--- | :---: |
| Depth Pruning | Learn how to remove entire transformer layers from models like Llama-3. | |
| Layer Importance | Identify which transformer layers contribute the least to your model. | |
| Pruning Compatibility | Check whether your model's architecture can be pruned by optipfair. | |
| Bias Compatibility | The coder's alternative to our live demo for bias analysis. | |
✅ Why optipfair?
optipfair is more than just another pruning library. It's a toolkit designed for the modern AI developer who cares about both performance and responsibility.
- **Efficiency & Fairness in One Place:** Stop juggling tools. optipfair is the only library designed to integrate structured pruning with powerful, intuitive bias visualization and analysis.
- **Dual Pruning Strategies:** optipfair supports both Width Pruning (removing neurons from MLP layers) and Depth Pruning (removing entire transformer layers), giving you flexible control over the efficiency-performance trade-off.
- **Optimized for Modern Architectures:** We focus on what works now. The library is specialized for GLU-based models like LLaMA, Mistral, Gemma, and Qwen, ensuring relevant and effective pruning.
- **Go Beyond Numbers with Bias Visualization:** Don't just get a bias score. Our visualization tools (PCA, heatmaps, mean differences) help you understand how and where your model encodes bias, enabling more effective mitigation.
- **🤖 AI-Assisted Development:** Accelerate your workflow using the included LLM Reference Manual. Provide it to your favorite LLM (ChatGPT, Claude) to get expert-level help and generate integration code instantly.
- 🔬 Backed by Research: Our methods aren't arbitrary. They are built upon and validated by ongoing applied research in model optimization and fairness analysis.
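For context on what width pruning operates on, the GLU-style MLP block used by LLaMA-family models looks roughly like the simplified sketch below (our own toy module, not optipfair code). Pruning a neuron here means removing the same row from the gate and up projections and the matching column from the down projection.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GLUMLP(nn.Module):
    """Simplified LLaMA-style MLP: down(silu(gate(x)) * up(x)).

    Width pruning shrinks the intermediate dimension by removing rows
    from gate_proj/up_proj and the matching columns from down_proj.
    """
    def __init__(self, hidden: int, intermediate: int):
        super().__init__()
        self.gate_proj = nn.Linear(hidden, intermediate, bias=False)
        self.up_proj = nn.Linear(hidden, intermediate, bias=False)
        self.down_proj = nn.Linear(intermediate, hidden, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down_proj(F.silu(self.gate_proj(x)) * self.up_proj(x))

mlp = GLUMLP(hidden=64, intermediate=256)
out = mlp(torch.randn(2, 64))
print(out.shape)  # torch.Size([2, 64]) — output shape is unchanged by pruning
```

Because the three projections share the intermediate dimension, neurons must be removed in a coordinated way across all three matrices, which is what structured GLU pruning handles for you.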
⚙️ Installation
Choose the installation method that best suits your needs. For bias visualization features, you'll need the `[viz]` extra.

**Standard Installation**

For core pruning functionality:

```bash
pip install optipfair
```

**Full Installation (with Bias Visualization)**

To use the bias analysis and visualization tools, install with the `[viz]` extra dependencies:

```bash
pip install "optipfair[viz]"
```

**Developer Installation**

To install from source for contributing or development:

```bash
git clone https://github.com/peremartra/optipfair.git
cd optipfair
pip install -e .
```
⚡ Quick Start
See how to use optipfair's core features in just a few lines of code.
Pruning with the Python API
Prune 20% of the MLP neurons from a model using the Peak-to-Peak Magnitude (PPM) method.
```python
from transformers import AutoModelForCausalLM
import optipfair as opf

# Load a pre-trained model
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")

# Prune 20% of neurons from MLP layers
pruned_model, stats = opf.prune_model(
    model=model,
    pruning_type="MLP_GLU",
    neuron_selection_method="MAW",  # "MAW" maps to the PPM method
    pruning_percentage=20,
    expansion_divisor=None,  # Optional: round to divisor (32, 64, 128, 256)
    show_progress=True,
    return_stats=True,
)

# Print pruning statistics
print(f"Original parameters: {stats['original_parameters']:,}")
print(f"Pruned parameters: {stats['pruned_parameters']:,}")
print(f"Reduction: {stats['reduction']:,} parameters ({stats['percentage_reduction']:.2f}%)")

# Save the pruned model
pruned_model.save_pretrained("./pruned-llama-model")
```
The pruning process yields tangible results in model size and performance. Here's a sample comparison for Llama-3.2-1B after pruning 20% of its MLP neurons:
| Metric | Original Model | Pruned Model | Improvement |
| :--- | :---: | :---: | :---: |
| Total Parameters | 1.24B | 1.07B | -13.03% |
| Inference Speed | Benchmark in progress | Benchmark in progress | Coming soon |
| MMLU Score | Benchmark in progress | Benchmark in progress | Minimal change expected |
Results based on the PPM pruning method (parameter "MAW"). Full benchmark results will be published shortly.
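Parameter counts like those in the table can be verified directly from any PyTorch model; a minimal sketch (the helper name is ours, not part of optipfair):

```python
import torch.nn as nn

def count_parameters(model: nn.Module) -> int:
    """Total number of parameters in a PyTorch module."""
    return sum(p.numel() for p in model.parameters())

# Toy modules standing in for a model before and after width pruning
before = count_parameters(nn.Linear(4, 3))  # 4*3 weights + 3 biases = 15
after = count_parameters(nn.Linear(4, 2))   # one output neuron removed = 10
reduction_pct = 100 * (before - after) / before
print(f"{before:,} -> {after:,} ({reduction_pct:.2f}% reduction)")
```

The same helper works on a full `AutoModelForCausalLM`, so you can sanity-check the `stats` dictionary returned by `prune_model`.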
Data-Driven Width Pruning (NEW in v0.2.0)
Enhance pruning decisions with activation statistics from calibration data. This hybrid approach combines weight magnitudes with real data patterns for more intelligent neuron selection.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from torch.utils.data import DataLoader, TensorDataset
import torch
import optipfair as opf

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")
tokenizer.pad_token = tokenizer.eos_token

# Prepare calibration data (use your domain-specific dataset)
texts = [
    "Your domain-specific text here...",
    "More examples from your use case...",
    # Add 100-1000 samples for best results
]
inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=512)
dataset = TensorDataset(inputs['input_ids'], inputs['attention_mask'])
dataloader = DataLoader(dataset, batch_size=8)

# Prune with data-driven importance calculation
pruned_model, stats = opf.prune_model(
    model=model,
    neuron_selection_method="MAW",  # Only PPM (parameter "MAW") supports data-driven pruning
    pruning_percentage=20,
    dataloader=dataloader,  # ← Enables hybrid pruning
    show_progress=True,
    return_stats=True,
)

print(f"Reduction: {stats['reduction']:,} parameters ({stats['percentage_reduction']:.2f}%)")
pruned_model.save_pretrained("./pruned-datadriven-model")
```
Key Benefits:
- 📊 Better Preservation: Keeps neurons important for your specific use case
- 🎯 Domain Adaptation: Use calibration data from your target domain
- 🔬 Research-Backed: Based on CFSP methodology (arXiv:2409.13199v2)
- ⚡ Easy Integration: Just add a dataloader - no other changes needed
Note: Data-driven pruning is currently only available with `neuron_selection_method="MAW"` (the PPM method). Using a dataloader with `"VOW"` or `"PON"` will raise a `ValueError`.
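To build intuition for the hybrid approach, here is an illustrative score that scales each neuron's weight range by its mean absolute activation over calibration data. This is a sketch of the general idea only, not optipfair's exact formula (which follows the CFSP methodology):

```python
import torch

def hybrid_importance(weight: torch.Tensor, activations: torch.Tensor) -> torch.Tensor:
    """Illustrative hybrid score: weight peak-to-peak range per neuron,
    scaled by that neuron's mean absolute activation on calibration data.
    Names and formula are ours, for intuition only."""
    ppm = weight.max(dim=1).values + weight.min(dim=1).values.abs()
    act_scale = activations.abs().mean(dim=0)
    return ppm * act_scale

w = torch.randn(8, 16)       # 8 intermediate neurons, 16 inputs
acts = torch.randn(100, 8)   # activations of those neurons over 100 calibration tokens
scores = hybrid_importance(w, acts)
keep = scores.topk(6).indices  # keep the top 6 of 8 neurons (prune 25%)
print(sorted(keep.tolist()))
```

A neuron with large weights but near-zero activations on your domain data gets a low score, which is exactly the case pure weight-magnitude methods miss.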
Selective Layer Width Pruning (NEW in v0.2.0)
Prune neurons only in specific layers while leaving others unchanged. Perfect for preserving critical layers or implementing layer-specific optimization strategies.
```python
from transformers import AutoModelForCausalLM
import optipfair as opf

# Load a pre-trained model
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")

# Prune neurons only in specific layers (e.g., middle layers)
# NOTE: the original snippet was truncated here; the call below mirrors the
# earlier examples, and the layer-selection argument is illustrative only.
# Check the optipfair documentation for the exact parameter name.
pruned_model, stats = opf.prune_model(
    model=model,
    pruning_type="MLP_GLU",
    pruning_percentage=20,
    layers_to_prune=[4, 5, 6, 7],  # hypothetical: indices of the layers to prune
    show_progress=True,
    return_stats=True,
)
```