LIME
Official code for paper LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning
This is the official code repository for LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning.
This repository contains two directories:
- reason/generate_data.py generates all but one task introduced in LIME (see Appendix B): induct, deduct, abduct, induct_v2, induct_v3, induct_rewrite, rewrite.
- rewrite_multi/generate_data.py generates two variants of rewrite_multi_step introduced in LIME, Appendix B.1. The "rewrite_multistep_hard" version is the one described in the paper, and it is the one we recommend using.
Arguments of generate_data.py:
- To generate synthetic data, we recommend generating 5 to 10 M examples, specified by the arg num_train.
- To generate a mix of tasks, put multiple task names after the arg mode. For example, python reason/generate_data.py --mode induct deduct abduct generates examples that are a mix of three tasks, which is called the "mixed" task in the paper.
- The arg vocab_size is the vocab size of the synthetic task. Appendix C.1 of LIME discusses its effect, which seems to matter somewhat. Empirically, we observed that the closer it is to the downstream task's vocab size, the better, but a larger vocab size makes training harder. We therefore recommend vocab_size 1000, which seems to work well for most of our tasks. Usually, it takes 3-5M steps to converge for vocab_size 1000, with batch size 444096.
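The arguments described above can be combined into a single invocation. The following is a minimal sketch that only assembles and prints such a command. The helper name build_command is hypothetical, and the flag spellings --num_train and --vocab_size are assumptions inferred from the argument names (only --mode appears verbatim in this README), so check the script's argument parser before relying on them.

```python
import shlex


def build_command(modes, num_train=5_000_000, vocab_size=1000,
                  script="reason/generate_data.py"):
    """Assemble a generate_data.py command line as a list of tokens.

    build_command is a hypothetical helper, not part of the repository.
    The flags --num_train and --vocab_size are assumed spellings; only
    --mode is confirmed by the README.
    """
    if not modes:
        raise ValueError("at least one task name is required")
    return [
        "python", script,
        "--mode", *modes,                 # several names give a mixed task
        "--num_train", str(num_train),    # README suggests 5 to 10 M examples
        "--vocab_size", str(vocab_size),  # README recommends 1000
    ]


# The "mixed" task from the paper: induct, deduct, and abduct together.
cmd = build_command(["induct", "deduct", "abduct"])
print(shlex.join(cmd))
```

Printing the command instead of running it keeps the sketch free of side effects. Once the flag names are confirmed, the token list can be passed to subprocess.run.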
