# WhiteLightning
WhiteLightning distills massive, state-of-the-art language models into lightweight, hyper-efficient text classifiers. It's a command-line tool that lets you create specialized models that run anywhere—from the cloud to the edge—using the universal ONNX format for maximum compatibility.
## What do we mean by "Distillation"?
We use large, powerful frontier models as "teachers" to train much smaller, task-specific "student" models. WhiteLightning automates this process for text classification, allowing you to create high-performance classifiers with a fraction of the computational footprint.
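Conceptually, the distillation loop looks like this. Note this is a minimal sketch, not WhiteLightning's actual pipeline: the real agent drives an LLM teacher over the OpenRouter API and trains a TF-IDF-based student, while here a trivial rule-based function stands in for the teacher and a bag-of-words centroid model stands in for the student:

```python
# Conceptual sketch of LLM-to-classifier distillation.
# The "teacher" is a rule-based stand-in for a frontier LLM;
# the "student" is a tiny nearest-centroid bag-of-words classifier.
from collections import Counter

def teacher_label(text: str) -> str:
    # Stand-in for an LLM labeling training texts.
    return "positive" if "great" in text or "love" in text else "negative"

# 1. Generate (or collect) raw texts and let the teacher label them.
texts = ["great product, love it", "terrible, broke in a day",
         "love the battery life", "awful support, never again"]
dataset = [(t, teacher_label(t)) for t in texts]

# 2. Train the small student on the teacher's labels.
centroids: dict[str, Counter] = {}
for text, label in dataset:
    centroids.setdefault(label, Counter()).update(text.split())

def student_predict(text: str) -> str:
    # 3. At inference time, only the tiny student runs.
    words = text.split()
    scores = {label: sum(c[w] for w in words) for label, c in centroids.items()}
    return max(scores, key=scores.get)

print(student_predict("love this"))  # the student mimics the teacher's behavior
```

The key property: once trained, the student never needs the teacher again, which is why the final model can run on the edge with a fraction of the footprint.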
<p align="center"> <img src="media/openart-image_m8sOEHVQ_1753429527461_raw.png" width="800" alt="The WhiteLightning metaphor: from a complex still to a pure, potent product."> </p>

## How are the models saved?
WhiteLightning exports every trained model to ONNX (Open Neural Network Exchange). This standard format makes your models instantly portable. Run them natively in Python, JavaScript, C++, Rust, Java, and more, ensuring total flexibility for any project. Learn more at onnx.ai.
## ⚡ Cross-Platform Compatibility
WhiteLightning is designed as a "generic" Docker image that works seamlessly across macOS, Linux, and Windows with identical commands:
- Zero Configuration: No need for complex `--user` flags or platform-specific commands
- Automatic Permission Handling: Intelligently detects your system and sets correct file ownership
- Universal Commands: The same `docker run` command works everywhere
- Smart User Management: Internally manages user creation and permission mapping
- Secure by Default: Always runs as a non-root user with proper privilege dropping
## Key Features
- Multiple Model Architectures: Generate models for binary and multiclass classification with different activation functions.
- Instant Cross-Platform Deployment: Export to ONNX for use in any environment or language.
- Lightweight & Incredibly Fast: Optimized for high-speed inference with minimal resource consumption.
- Framework Agnostic: The final ONNX model has zero dependencies on TensorFlow or PyTorch. It's pure, portable compute.
- Multilingual Support: Generate training data and classifiers in a wide variety of languages.
- Smart & Automatic: Intelligently generates and refines prompts based on your classification task.
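The "different activation functions" in the first bullet map to the two classifier shapes: binary models typically end in a sigmoid (one probability for the positive class), multiclass models in a softmax (one probability per class, summing to 1). A quick plain-Python illustration of the difference:

```python
import math

def sigmoid(x: float) -> float:
    # Binary head: a single logit -> probability of the positive class.
    return 1.0 / (1.0 + math.exp(-x))

def softmax(logits: list[float]) -> list[float]:
    # Multiclass head: one logit per class -> probabilities summing to 1.
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

print(round(sigmoid(0.0), 3))                       # 0.5: maximally uncertain
print([round(p, 3) for p in softmax([2.0, 1.0, 0.1])])
```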
## 🚀 Quick Start
1. Clone the repository:

   ```bash
   git clone https://github.com/Inoxoft/whitelightning.git
   cd whitelightning
   ```

2. Get an OpenRouter API key at openrouter.ai/settings/keys.

3. Run the Docker image.

   macOS / Linux:

   ```bash
   docker run --rm \
     -v "$(pwd)":/app/models \
     -e OPEN_ROUTER_API_KEY="YOUR_OPEN_ROUTER_KEY_HERE" \
     ghcr.io/inoxoft/whitelightning:latest \
     python -m text_classifier.agent \
     -p "Categorize customer reviews as positive, neutral, or negative"
   ```

   Windows (PowerShell, which uses backticks for line continuation):

   ```powershell
   docker run --rm `
     -v "${PWD}:/app/models" `
     -e OPEN_ROUTER_API_KEY="YOUR_OPEN_ROUTER_KEY_HERE" `
     ghcr.io/inoxoft/whitelightning:latest `
     python -m text_classifier.agent `
     -p "Categorize customer reviews as positive, neutral, or negative"
   ```

4. That's it! You'll see the generation process in your terminal.

   <img src="media/demo.gif" width="600" alt="WhiteLightning CLI Demo">

   When it's finished, list the files in your directory (`ls -l`). You'll find all the assets for your new model, ready to go.

🎮 Try your trained model right here: WhiteLightning Playground
## 📊 Use Your Own Data
NEW! Skip LLM data generation and train directly on your existing datasets. WhiteLightning automatically analyzes your data structure and creates optimized models from real domain data.
```bash
# Create a folder for your data
mkdir own_data
cp your_dataset.csv own_data/

# Train on your data (faster, cheaper, more accurate!)
docker run --rm \
  -v "$(pwd)":/app/models \
  -e OPEN_ROUTER_API_KEY="YOUR_OPEN_ROUTER_KEY_HERE" \
  ghcr.io/inoxoft/whitelightning:latest \
  python -m text_classifier.agent \
  -p "Categorize customer reviews as positive, neutral, or negative" \
  --use-own-dataset="/app/models/own_data/your_dataset.csv"
```
Benefits:
- ⚡ 3-5x Faster: No data generation needed
- 💰 95% Cheaper: Only uses LLM for data analysis (~$0.01 vs $1-10)
- 🎯 Higher Accuracy: Real domain data vs synthetic
- 📁 Multiple Formats: Supports CSV, JSON, JSONL, and TXT files
- 🔍 Auto-Detection: Automatically identifies text/label columns and classification type
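Since the agent auto-detects text and label columns, a plain two-column CSV is enough. As a sketch, a dataset like `own_data/your_dataset.csv` could be produced with the standard library (the column names `text` and `label` here are just an illustration; auto-detection should handle other reasonable names):

```python
import csv

# Write a tiny example dataset in the text-plus-label shape
# that the agent can analyze and auto-detect.
rows = [
    ("The battery lasts all day, very happy", "positive"),
    ("Arrived broken and support never replied", "negative"),
    ("Does the job, nothing special", "neutral"),
]
with open("your_dataset.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["text", "label"])  # header row
    writer.writerows(rows)
```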
A finished run leaves these files in your mounted directory:

```
config.json          # Configuration and analysis
training_data.csv    # Generated training data
edge_case_data.csv   # Challenging test cases
model.onnx           # ONNX model file
model_scaler.json    # StandardScaler parameters
model_vocab.json     # TF-IDF vocabulary
```
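Together, `model_vocab.json` and `model_scaler.json` let you reproduce the input pipeline in any language before calling the ONNX model. Here is a hedged sketch of that preprocessing, assuming the vocab file maps tokens to an index plus an IDF weight and the scaler file holds StandardScaler `mean`/`scale` arrays; inspect your actual files, since the real JSON layout may differ:

```python
# Hypothetical, inlined contents of model_vocab.json / model_scaler.json
# (illustrative only; real files are larger and may be shaped differently).
vocab = {"good": {"index": 0, "idf": 1.2}, "bad": {"index": 1, "idf": 1.5}}
scaler = {"mean": [0.1, 0.1], "scale": [0.5, 0.5]}

def preprocess(text: str) -> list[float]:
    # TF-IDF over the saved vocabulary...
    counts: dict[str, int] = {}
    for tok in text.lower().split():
        if tok in vocab:
            counts[tok] = counts.get(tok, 0) + 1
    vec = [0.0] * len(scaler["mean"])
    total = sum(counts.values()) or 1
    for tok, c in counts.items():
        entry = vocab[tok]
        vec[entry["index"]] = (c / total) * entry["idf"]
    # ...then StandardScaler normalization, feature-wise.
    return [(v - m) / s for v, m, s in zip(vec, scaler["mean"], scaler["scale"])]

features = preprocess("good good bad")
# `features` would then be fed to the ONNX session, e.g. with onnxruntime:
#   session = onnxruntime.InferenceSession("model.onnx")
#   probs = session.run(None, {session.get_inputs()[0].name: [features]})
print(features)
```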
See our Complete Documentation for guides on how to use these files in your language of choice (C++, Rust, iOS, Android, and more).
## 💡 Making It Your Own: Example Prompts
The power of WhiteLightning is the -p (prompt) argument. You can create a classifier for almost anything just by describing it. Here are some ideas to get you started:
- Spam Filter: `-p "Classify emails as 'spam' or 'not_spam'"`
- Topic Classifier: `-p "Determine if a news headline is about 'tech', 'sports', 'world_news', or 'finance'"`
- Toxicity Detector: `-p "Detect whether a user comment is 'toxic' or 'safe'"`
- Urgency Detection: `-p "Categorize a support ticket's urgency as 'high', 'medium', or 'low'"`
- Intent Recognition: `-p "Classify the user's intent as 'book_flight', 'check_status', or 'customer_support'"`
The possibilities are endless. For more inspiration and advanced prompt engineering techniques, check out our Complete Documentation.
## 🔧 Docker Command Generator
Don't want to manually construct Docker commands? Use our Interactive Command Generator to build your personalized WhiteLightning commands with a user-friendly interface:
- 📝 Simple Configuration: Enter your API key and describe your classification task
- ⚙️ Advanced Options: Configure model type, activation functions, language settings, and more
- 🖥️ Platform Detection: Automatically generates the correct command format for macOS, Linux, or Windows
- 📋 One-Click Copy: Copy the generated command directly to your clipboard
- 💡 Smart Defaults: Intelligent parameter suggestions based on your task description
Features:
- Model Type Selection: Choose between TensorFlow, PyTorch, or Scikit-learn
- Activation Functions: Auto-detect or manually select sigmoid/softmax
- Custom Datasets: Easy configuration for using your own data files
- Language Support: Set primary language for multilingual classification
- Performance Tuning: Adjust batch size, refinement cycles, and feature limits
Perfect for:
- First-time users who want guided setup
- Complex configurations with multiple parameters
- Teams sharing standardized commands
- Quick experimentation with different settings
## 🧪 Testing & Validation
Want to test your ONNX models across multiple programming languages? Check out our WhiteLightning Test Framework - a comprehensive cross-language testing suite that validates your models in:
- 8 Programming Languages: Python, Java, C++, C, Node.js, Rust, Dart, and Swift
- Performance Benchmarking: Detailed timing, memory usage, and throughput analysis
- Automated Testing: GitHub Actions workflows for continuous validation
- Real-world Scenarios: Test with custom inputs and edge cases
Perfect for ensuring your WhiteLightning models work consistently across all target platforms and deployment environments.
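At its core, cross-language validation boils down to: feed each runtime the same input and require the output probabilities to agree within a small tolerance. A minimal sketch of that comparison logic (the two probability lists below are hypothetical stand-ins for outputs from two different ONNX runtimes, not real framework output):

```python
def outputs_match(a: list[float], b: list[float], tol: float = 1e-5) -> bool:
    # Two runtimes agree if every class probability matches within tol.
    return len(a) == len(b) and all(abs(x - y) <= tol for x, y in zip(a, b))

# Hypothetical outputs for the same input from, say, Python and Rust runners.
python_probs = [0.91, 0.06, 0.03]
rust_probs = [0.9100001, 0.0599999, 0.03]
print(outputs_match(python_probs, rust_probs))  # small float drift is tolerated
```

Exact bit-equality is the wrong target here: different runtimes legitimately differ in the last few ULPs, so a tolerance-based check is standard practice.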