LIME

Official code for the paper LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning

Install / Use

/learn @tonywu95/LIME

README

This is the official code repository for LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning.

There are two directories in this repository, each with its own generate_data.py:

reason/generate_data.py generates all but one of the tasks introduced in LIME (see Appendix B): induct, deduct, abduct, induct_v2, induct_v3, induct_rewrite, rewrite.
rewrite_multi/generate_data.py generates two variants of rewrite_multi_step, introduced in Appendix B.1 of LIME. The version "rewrite_multistep_hard" is the one described in the paper, and it is the one we recommend using.
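
As a quick reference, the split between the two scripts can be written as a small lookup. This is only an illustrative sketch: TASKS_BY_SCRIPT and script_for_task are hypothetical names that do not exist in the repository, and the second variant produced by rewrite_multi/generate_data.py is left out because its name is not given above.

```python
from typing import Dict, Tuple

# Task names per script, copied from the list above.
# Hypothetical helper for illustration; not part of the LIME repository.
TASKS_BY_SCRIPT: Dict[str, Tuple[str, ...]] = {
    "reason/generate_data.py": (
        "induct", "deduct", "abduct",
        "induct_v2", "induct_v3", "induct_rewrite", "rewrite",
    ),
    # Only the variant named in this README is listed here.
    "rewrite_multi/generate_data.py": ("rewrite_multistep_hard",),
}


def script_for_task(task: str) -> str:
    """Return the path of the script that generates the given task."""
    for script, tasks in TASKS_BY_SCRIPT.items():
        if task in tasks:
            return script
    raise KeyError(f"unknown task: {task!r}")


if __name__ == "__main__":
    print(script_for_task("abduct"))
    print(script_for_task("rewrite_multistep_hard"))
```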

Arguments of generate_data.py:

We recommend generating 5 to 10 million synthetic examples; the number is set by the num_train argument.
To generate a mixture of tasks, list multiple task names after the mode argument. For example, python reason/generate_data.py --mode induct deduct abduct generates examples drawn from a mixture of the three tasks, which is called the "mixed" task in the paper.
The vocab_size argument sets the vocabulary size of the synthetic task. Appendix C.1 of LIME discusses its effect, which seems to matter somewhat. Empirically, we observed that the closer it is to the downstream task's vocabulary size, the better, but larger vocabularies are harder to train. We therefore recommend vocab_size 1000, which seems to work well for most of our tasks. Training usually takes 3-5M steps to converge for vocab_size 1000, with batch size 444096.
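
The arguments above can be combined into a single invocation. The sketch below only assembles and prints the command line; it does not run the script. It assumes that num_train and vocab_size are passed as --num_train and --vocab_size, mirroring the --mode form shown above; check the script's argument parser before relying on those spellings. build_generate_command is a hypothetical helper, not part of the repository.

```python
import shlex
from typing import List, Sequence


def build_generate_command(
    modes: Sequence[str],
    num_train: int = 5_000_000,  # lower end of the recommended 5 to 10 M
    vocab_size: int = 1000,      # the recommended vocabulary size
    script: str = "reason/generate_data.py",
) -> List[str]:
    """Assemble an argv list for generate_data.py without executing it.

    Assumption: the flags are spelled --num_train and --vocab_size.
    """
    if not modes:
        raise ValueError("at least one task name is required after --mode")
    return [
        "python", script,
        "--mode", *modes,
        "--num_train", str(num_train),
        "--vocab_size", str(vocab_size),
    ]


# The "mixed" task from the paper: induct, deduct and abduct together.
cmd = build_generate_command(["induct", "deduct", "abduct"])
print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps each task name a separate argument, which matches how several names are passed after --mode.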

tonywu95

GitHub Stars: 29

Category: Education

Updated: 11mo ago

Forks: 5

tonywu95/LIME

Languages

Python

Security Score

67/100

Audited on Apr 7, 2025

No findings