MdlARC
The goal is to solve sample efficiency, using ARC-AGI as the benchmark.
44% on ARC-AGI-1: trained from scratch for just ~$0.67
- Takes 2hrs on a 5090
- Uses a standard transformer
- 75M parameters
This is one of the best non-LLM scores in the world today (if not THE best).
It is also the cheapest, by far, at that performance.
<a href="https://mvakde.github.io/blog/44-on-arc-1/"><img src="graph.png"></a>
New score
Performance: 44% on ARC-1 public eval
Total compute cost: ~$0.67 (2hrs on a 5090 rented on vast.ai)
Old score
Performance: 27.5% on ARC-1 public eval
Total compute cost: $1.8 (<3hrs on an A100 rented on Google Colab)
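To make the comparison between the two runs concrete, here is a minimal sketch of the arithmetic behind the figures above. The derived metrics (dollars per percentage point, implied hourly rental rate) are illustrative and not reported by the author, and the old run's duration is taken as 3.0 hours, which is only an upper bound since the source says "<3hrs".

```python
# Figures quoted above; the derived metrics are illustrative, not from the author.
runs = {
    "new": {"score_pct": 44.0, "cost_usd": 0.67, "hours": 2.0},
    "old": {"score_pct": 27.5, "cost_usd": 1.80, "hours": 3.0},  # "<3hrs": upper bound
}


def cost_per_point(run):
    """Dollars spent per percentage point on the ARC-1 public eval."""
    return run["cost_usd"] / run["score_pct"]


def implied_hourly_rate(run):
    """Rental price per hour implied by the quoted cost and duration."""
    return run["cost_usd"] / run["hours"]


for name, run in runs.items():
    print(f"{name}: ${cost_per_point(run):.4f} per point, "
          f"~${implied_hourly_rate(run):.3f}/hr")

improvement = cost_per_point(runs["old"]) / cost_per_point(runs["new"])
print(f"cost per point improved by ~{improvement:.1f}x")
```

Under these assumptions the new run costs roughly $0.015 per percentage point against roughly $0.065 for the old one, about a 4.3x improvement.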
Deployment
1. Rent a 5090; ensure CUDA >12.8, ideally >13.0
2. Create a virtual environment and install torch, numpy, numba, matplotlib and flash-attn
3. Download and build the dataset
4. (optional) Delete the raw data, solutions file and dataset scripts to prove no leakage
5. Run the training and inference script
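The first two steps (GPU and environment setup) can be sanity-checked before the dataset is built. The following is a minimal sketch, not part of the repository: the helper names are hypothetical, `flash_attn` is assumed to be the import name for flash-attn, and 12.8 is treated as the lowest acceptable CUDA version.

```python
import importlib.util

# Packages named in the setup step. "flash_attn" is assumed to be the import
# name of the flash-attn distribution; the others import under their own names.
REQUIRED = ["torch", "numpy", "numba", "matplotlib", "flash_attn"]
MIN_CUDA = (12, 8)  # assumption: treat 12.8 as the lowest acceptable version


def parse_version(text):
    """Turn a version string such as '12.8' into a comparable tuple (12, 8)."""
    return tuple(int(part) for part in text.split(".")[:2])


def missing_packages(names=REQUIRED):
    """Return the packages that cannot be found in the current environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]


def cuda_ok(version_text, minimum=MIN_CUDA):
    """True if the CUDA version string is present and meets the minimum."""
    return version_text is not None and parse_version(version_text) >= minimum


if __name__ == "__main__":
    missing = missing_packages()
    print("missing packages:", missing or "none")
    if "torch" not in missing:
        import torch

        print("CUDA available:", torch.cuda.is_available())
        print("CUDA version torch was built with:", torch.version.cuda)
        print("meets minimum:", cuda_ok(torch.version.cuda))
```

Note that `torch.version.cuda` reports the CUDA version PyTorch was built against, which may differ from the driver version on the rented machine.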
This script takes care of (3)-(5):
git clone https://github.com/mvakde/mdlARC.git
# download and build the datasets
cd mdlARC/dataset_building_scripts
python download_and_group.py
python build_datasets.py arc1 --add-conceptarc --with-filtered
cd ..
# prove no data leakage (optional, uncomment to run)
# rm -r assets_tmp # deletes raw data
# rm assets/solutions.json # deletes solutions file
# rm -r dataset_building_scripts # deletes dataset related files
# run the training + inference script
python run_script.py high # Choose between 3 modes: low, medium, high
Note: To get the best speed, I have disabled logging of loss values. Feel free to add it back.
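If you do add loss logging back, one way to limit the slowdown is to record the loss only every N steps. This is a generic sketch, not the repository's code: `ThrottledLossLogger` is a hypothetical helper, and the loop below uses placeholder numbers in place of a real training step. With PyTorch, converting a GPU loss tensor to a Python float forces a host-device sync, which is why the conversion is skipped on unlogged steps.

```python
class ThrottledLossLogger:
    """Record the loss only every `every` steps to limit logging overhead."""

    def __init__(self, every=100):
        self.every = every
        self.history = []  # (step, loss) pairs that were actually recorded

    def maybe_log(self, step, loss):
        # float() on a GPU tensor forces a host-device sync, so skip the
        # conversion entirely on steps that are not logged.
        if step % self.every != 0:
            return False
        value = float(loss)
        self.history.append((step, value))
        print(f"step {step}: loss {value:.4f}")
        return True


# Stand-in loop: real code would pass the loss from the training step instead.
logger = ThrottledLossLogger(every=50)
for step in range(200):
    fake_loss = 1.0 / (step + 1)  # placeholder values, not real training data
    logger.maybe_log(step, fake_loss)
```

A larger `every` keeps the run closer to the unlogged speed at the cost of a coarser loss curve.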
Citation
@misc{vakde2025mdlarc,
author = {Mithil Vakde},
title = {mdlARC},
year = {2025},
url = {https://github.com/mvakde/mdlARC},
}
