Classifier
Neurlang binary classifier (Hashtron)
Neurlang Classifier
Neurlang Classifier is a lightweight ML library for binary and quaternary neural networks that train quickly on CPUs. It models neurons with simple integer-to-boolean filters, enabling networks to be trained purely with integer arithmetic—no backpropagation, no floating-point math, and no GPU required. This makes training fast on multi-core CPUs while keeping dependencies minimal.
The framework has been proven in production, including training large-scale transformers for the goruut phonemizer. Use cases include virus detection, handwritten digit recognition, phoneme modeling, speech command classification, and more.
Features
- No backpropagation or perceptrons: Uses simple integer-based logic instead of weight gradients, enabling an alternative ML paradigm
- CPU-optimized, hardware-light: Requires no GPU—training is fast on multi-core CPUs using bitwise and integer operations
- Quaternary neurons: Implements custom layers (convolution, attention, pooling, parity, etc.) that operate on boolean/integer data
- Tiny dependencies: Written in pure Go with minimal external libraries, simplifying installation and portability
- Hash-based models: Resulting models are extremely fast for inference using hash-based feature extraction
- Proven at scale: Already used in production projects to train large-scale transformers
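As a rough illustration of what an "integer-to-boolean filter" fitted without gradients can look like, here is a minimal toy sketch. It is not the library's hashtron: `mix`, `filter`, and `fitSalt` are hypothetical names, the mixer is a generic murmur3-style finalizer, and the brute-force salt search only shows that such a filter can be fitted with integer operations alone.

```go
package main

import "fmt"

// mix is a hypothetical 32-bit integer mixer (a murmur3-style finalizer),
// chosen for illustration only. It is NOT the Neurlang hash function.
func mix(x, salt uint32) uint32 {
	h := x ^ salt
	h ^= h >> 16
	h *= 0x85ebca6b // multiplication wraps modulo 2^32
	h ^= h >> 13
	h *= 0xc2b2ae35
	h ^= h >> 16
	return h
}

// filter maps an integer feature to a boolean using integer operations only.
func filter(x, salt uint32) bool {
	return mix(x, salt)&1 == 1
}

// fitSalt brute-forces a salt under which filter reproduces every label.
// It returns false if no salt is found within the given number of tries.
func fitSalt(inputs []uint32, labels []bool, tries uint32) (uint32, bool) {
	for salt := uint32(0); salt < tries; salt++ {
		ok := true
		for i, x := range inputs {
			if filter(x, salt) != labels[i] {
				ok = false
				break
			}
		}
		if ok {
			return salt, true
		}
	}
	return 0, false
}

func main() {
	// Hypothetical toy data: four integer inputs with boolean labels.
	inputs := []uint32{3, 8, 21, 40}
	labels := []bool{true, false, false, true}
	salt, found := fitSalt(inputs, labels, 1<<16)
	fmt.Println("salt:", salt, "found:", found)
}
```

The point of the sketch is only that both fitting and inference use integer and bitwise operations; the library's actual training procedure may differ substantially.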
Getting Started
Prerequisites
- Go 1.18 or higher
Installation
go get github.com/neurlang/classifier
Usage Examples
Training MNIST Digit Classifier
cd cmd/train_mnist
go run .
Running Inference on MNIST
cd cmd/infer_mnist
go run .
Training Virus Detection Classifier
cd cmd/train_is_virus
go run .
Other Examples
The cmd/ directory contains additional demo programs:
- train_is_alnum / infer_is_alnum - Alphanumeric character classification
- train_speak - Speech command recognition
- train_squareroot / infer_squareroot - Mathematical function learning
- train_phonemizer_multi / train_phonemizer_ulevel - Grapheme-to-phoneme conversion
Run ./cmd/trainall.sh to train all examples or ./cmd/runall.sh to run all inference demos.
Package Overview
- cmd - Demo programs with train_* and infer_* commands for various tasks
- datasets - Core dataset interface and implementations:
  - isalnum - Alphanumeric character dataset
  - isvirus - TLSH file hash signatures for virus detection
  - mnist - Standard MNIST handwritten digits (60k train / 10k test)
  - phonemizer_multi / phonemizer_ulevel - Grapheme-to-phoneme datasets
  - speak - Speech commands dataset
  - squareroot - Synthetic dataset for numeric relations
  - stringhash - String hashing and classification
- hash - Fast modular hash function implementation used by Neurlang layers
- hashtron - Core "hashtron" classifier model implementing the neuron logic
- layer - Abstract interfaces and implementations:
  - conv2d - 2D binary convolutional layer
  - crossattention - Cross-attention layer for transformer-like models
  - full - Fully connected (dense) layer
  - majpool2d - 2D majority pooling layer
  - parity - Parity (XOR-like) layer
  - sochastic - Stochastic/randomly connected layer
  - sum - Element-wise sum layer
- net - Network architecture definitions:
  - feedforward - Feedforward network architecture
- parallel - Concurrency utilities (ForEach, LoopUntil) to speed up training
- trainer - High-level training orchestration managing training loops over datasets
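To give a feel for the kind of helper the parallel package provides, here is a minimal standard-library sketch of a batched parallel for-each. It is not the library's `parallel.ForEach`: `forEachBatched` is a hypothetical name, and treating the second argument as a batch size is an assumption.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// forEachBatched is a hypothetical helper, not the library's parallel.ForEach.
// It calls fn(i) for every i in [0, length), running one goroutine per batch.
func forEachBatched(length, batch int, fn func(i int)) {
	if batch < 1 {
		batch = 1
	}
	var wg sync.WaitGroup
	for start := 0; start < length; start += batch {
		end := start + batch
		if end > length {
			end = length
		}
		wg.Add(1)
		go func(lo, hi int) {
			defer wg.Done()
			for i := lo; i < hi; i++ {
				fn(i)
			}
		}(start, end)
	}
	wg.Wait() // block until every batch has finished
}

func main() {
	var sum int64
	forEachBatched(10, 3, func(i int) {
		atomic.AddInt64(&sum, int64(i)) // fn must be safe for concurrent use
	})
	fmt.Println(sum) // 0+1+...+9 = 45
}
```

Because batches run concurrently, the callback has to avoid unsynchronized writes to shared state. Writing to a distinct slice element per index, or using atomics as above, are two simple ways to do that.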
Implementing a Dataset
To implement a dataset, define a slice of samples where each sample has these methods:
type Sample interface {
	Feature(int) uint32 // Returns the feature at the specified index
	Parity() uint16     // Returns parity for dataset balancing (0 if balanced)
	Output() uint16     // Returns the output label/prediction
}
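A minimal sketch of a type that satisfies this interface might look as follows. `EvenSample` is a hypothetical example type, and the interface is redeclared locally so the snippet compiles on its own. In real code the library's own interface would be used instead.

```go
package main

import "fmt"

// Sample mirrors the interface shown above so this file is self-contained.
type Sample interface {
	Feature(int) uint32
	Parity() uint16
	Output() uint16
}

// EvenSample is a hypothetical sample: the input is a single integer and
// the label is 1 when that integer is even, 0 otherwise.
type EvenSample uint32

// Feature returns the input value; this toy sample has one feature,
// so the index is ignored.
func (s EvenSample) Feature(n int) uint32 { return uint32(s) }

// Parity returns 0, which this sketch uses to mean "no balancing needed".
func (s EvenSample) Parity() uint16 { return 0 }

// Output returns the expected label for this sample.
func (s EvenSample) Output() uint16 {
	if s%2 == 0 {
		return 1
	}
	return 0
}

func main() {
	dataslice := []EvenSample{4, 7, 10}
	for _, s := range dataslice {
		var sample Sample = s // compile-time check that the interface is satisfied
		fmt.Println(sample.Feature(0), sample.Output())
	}
}
```

Value receivers are used so that both `EvenSample` and `*EvenSample` satisfy the interface, which matters when samples are passed by pointer.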
Implementing a Network
Example network with majority pooling layers:
const fanout1 = 3
const fanout2 = 5
const fanout3 = 3
const fanout4 = 5
var net feedforward.FeedforwardNetwork
net.NewLayerP(fanout1*fanout2*fanout3*fanout4, 0, 1<<fanout4)
net.NewCombiner(majpool2d.MustNew(fanout1*fanout2*fanout4, 1, fanout3, 1, fanout4, 1, 1))
net.NewLayerP(fanout1*fanout2, 0, 1<<fanout2)
net.NewCombiner(majpool2d.MustNew(fanout2, 1, fanout1, 1, fanout2, 1, 1))
net.NewLayer(1, 0)
- fanout1 and fanout3 define majority pooling dimensions
- fanout2 and fanout4 define the number of hashtrons
- The final layer contains one hashtron for predictions (0 or 1 means 1 bit predicted, up to 16 bits supported)
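To make the numbers in the example concrete, the sketch below evaluates the constant expressions passed to the layer constructors and shows a plain majority vote over a group of bits. The `majority` helper is hypothetical, and how majpool2d actually groups its inputs is not reproduced here.

```go
package main

import "fmt"

const (
	fanout1 = 3
	fanout2 = 5
	fanout3 = 3
	fanout4 = 5
)

// majority reports whether more than half of the votes are true.
// Hypothetical helper illustrating a majority vote over one group.
func majority(votes []bool) bool {
	ones := 0
	for _, v := range votes {
		if v {
			ones++
		}
	}
	return 2*ones > len(votes)
}

func main() {
	fmt.Println(fanout1 * fanout2 * fanout3 * fanout4) // 225: first argument of the first NewLayerP call
	fmt.Println(1 << fanout4)                          // 32: third argument of the first NewLayerP call
	fmt.Println(fanout1 * fanout2)                     // 15: first argument of the second NewLayerP call
	fmt.Println(majority([]bool{true, false, true}))   // true: two of three votes are set
}
```

Odd pooling dimensions such as 3 avoid ties in a majority vote, which may be one reason the example uses them.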
Training and Inference
Training uses the trainer package with custom evaluation and training functions:
import (
	"github.com/neurlang/classifier/datasets"
	"github.com/neurlang/classifier/parallel"
	"github.com/neurlang/classifier/trainer"
)

// dataslice, customErrorFunc, improved_success_rate, dstmodel and successRate
// are placeholders that the caller defines.

// Define training function
trainWorst := trainer.NewTrainWorstFunc(net, nil, nil, nil,
	func(worst []int, tally datasets.AnyTally) {
		parallel.ForEach(len(dataslice), 1000, func(i int) {
			var sample = dataslice[i]
			net.AnyTally(&sample, worst, tally, customErrorFunc)
		})
	})

// Define evaluation function
evaluate := trainer.NewEvaluateFunc(net, len(dataslice), 99, &improved_success_rate, dstmodel,
	func(length int, h trainer.EvaluateFuncHasher) int {
		// Evaluate accuracy on dataset
		return successRate
	})

// Run training loop
trainer.NewLoopFunc(net, &improved_success_rate, 100, evaluate, trainWorst)()
Inference is straightforward:
predicted := net.Infer2(&sample) // Returns predicted output
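The evaluation callback above returns a success rate but leaves its computation open. A minimal sketch, assuming the rate is an integer percentage (suggested by the 99 and 100 arguments, but not confirmed here), could count matching predictions as follows. `successRate` and `predict` are hypothetical names, with `predict` standing in for a call such as `net.Infer2`.

```go
package main

import "fmt"

// successRate returns the percentage (0..100) of inputs whose prediction
// matches the expected output, using integer arithmetic only.
// predict is a hypothetical stand-in for the network's inference call.
func successRate(inputs []uint32, expected []uint16, predict func(uint32) uint16) int {
	if len(inputs) == 0 {
		return 0
	}
	correct := 0
	for i, x := range inputs {
		if predict(x) == expected[i] {
			correct++
		}
	}
	return correct * 100 / len(inputs) // integer division rounds down
}

func main() {
	inputs := []uint32{1, 2, 3, 4}
	expected := []uint16{1, 0, 1, 1}
	// Toy predictor: outputs the lowest bit of the input.
	predict := func(x uint32) uint16 { return uint16(x & 1) }
	fmt.Println(successRate(inputs, expected, predict)) // 3 of 4 correct: 75
}
```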
Contributing
Contributions are welcome! Please feel free to submit issues, feature requests, or pull requests.
License
Neurlang Classifier is licensed under Apache 2.0 or Public Domain, at your option.
