MathBot
MathBot is a transformer-based Math Word Problem (MWP) solver made as the Lab project for CSE 4622: Machine Learning Lab.
Built With:
Frameworks and Dependencies:
Production:
We have deployed the model with a simple Gradio UI. Visit https://huggingface.co/spaces/Casio991ms/MathBot to try it out!
Team Members:
- Syed Rifat Raiyan - 180041205
- Md. Nafis Faiyaz - 180041101
- Shah Md. Jawad Kabir - 180041234
Foreword:
The goal of this model is to translate an MWP statement into a valid math expression which, when evaluated, yields the solution to the problem. For a better understanding of the underlying transformer model, please go through the MathBot.ipynb file and the relevant literature cited there.
Introduction:
Definition:
A Math Word Problem is a textual narrative that states a problem description and poses a question about one or more unknown quantities. These types of problems are generally found in math textbooks for 1st to 3rd grade students.
Example:
Problem: $\text{69 teddy bears are sold for 23 dollars each.}$ $\text{There are total 420 teddy bears in a store and the remaining teddy bears are sold for 17 dollars each.}$ $\text{How much did the store earn after selling all the teddy bears?}$
Expression: $x = 69 \times 23 + (420 - 69) \times 17$
Our approach is to use a Transformer-based $\text{Seq2Seq}$ model to generate the mathematical expression from the problem statement.
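To illustrate the "evaluate the expression" step, here is a minimal sketch of how a predicted expression string could be turned into a numeric answer. The helper name `evaluate_expression` and the set of supported operators are assumptions made for illustration; they are not taken from MathBot.ipynb.

```python
import ast
import operator

# Binary operators allowed in a predicted expression (an assumed set).
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate_expression(expression: str):
    """Evaluate a string such as 'x = 69*23 + (420 - 69)*17' (hypothetical helper)."""
    # Keep only the right-hand side and normalise typographic operators.
    rhs = expression.split("=", 1)[-1]
    rhs = rhs.replace("×", "*").replace("−", "-").replace("÷", "/")
    tree = ast.parse(rhs.strip(), mode="eval")

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        # Anything else (names, calls, attribute access, ...) is rejected.
        raise ValueError(f"unsupported syntax in expression: {expression!r}")

    return walk(tree)


print(evaluate_expression("x = 69*23 + (420 - 69)*17"))  # 7554
```

Walking the syntax tree instead of calling `eval` keeps the evaluator restricted to plain arithmetic, which matters when the input is model output rather than trusted code.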
Dataset:
We used the MAWPS dataset, which contains 3,320 problems along with their solution expressions. Of those, we kept the 2,373 problems relevant to our task, as the rest were geometry problems. We then used a question generator to produce similar problems, which gave a final dataset of 38,144 problems in total. Our train-test split was 95:5.
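As a rough sketch of what a 95:5 split of 38,144 problems looks like, the snippet below shuffles placeholder items and holds out 5% of them. The function name, the fixed seed, and the plain random shuffle are assumptions; the notebook may split the data differently.

```python
import random


def train_test_split(problems, test_fraction=0.05, seed=0):
    """Shuffle a list of problems and split it into train and test parts."""
    shuffled = list(problems)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder items standing in for the 38,144 (problem, expression) pairs.
dataset = list(range(38144))
train, test = train_test_split(dataset)
print(len(train), len(test))  # 36237 1907
```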
Features:
Provide a simple Math Word Problem statement in the text-box on the left and click on the "Submit" button. After a few seconds, the model should yield a predicted math expression.
You can also click on one of the many MWP examples shown below the text-boxes.

Result Analysis:
Results:
- Training-set Accuracy → $98.4$%
- Test-set Accuracy → $73.7$%
- Corpus BLEU (Bilingual Evaluation Understudy) → $87.2$%
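For readers unfamiliar with corpus BLEU, the following is a minimal pure-Python sketch of the standard formula (clipped n-gram precisions up to 4-grams, uniform weights, brevity penalty), assuming one reference expression per prediction. It is not necessarily the implementation or the tokenization used to obtain the score above.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(references, hypotheses, max_n=4):
    """Corpus-level BLEU with one reference per hypothesis and uniform weights."""
    matches = [0] * max_n
    totals = [0] * max_n
    ref_len = hyp_len = 0
    for ref, hyp in zip(references, hypotheses):
        ref_len += len(ref)
        hyp_len += len(hyp)
        for n in range(1, max_n + 1):
            ref_counts = ngrams(ref, n)
            hyp_counts = ngrams(hyp, n)
            # Clipping: an n-gram is credited at most as often as it occurs in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # no smoothing: any empty precision gives a score of zero
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


reference = [["x", "=", "764", "/", "23"]]
print(corpus_bleu(reference, reference))  # 1.0
```

Because the counts are pooled over the whole corpus before the precisions are computed, a single prediction with no matching 4-gram does not zero out the score, unlike averaging sentence-level BLEU.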
Attention Weights:
Let’s look at a test sample (please overlook the bad English)...
Problem: $\text{Sarah wants to diverge 764 plums among 23 friends. How many would each friend experience?}$
Predicted Translation: $\text{x = 764/23}$
In the attention-weight plots for this sample, the tokens of the prompt are shown in columns and the tokens of the target expression in rows. The attention heads are somewhat similar to kernels in Convolutional Neural Networks (CNNs). Every head except heads $5$ and $7$ attends heavily to the numbers on both sides. Also, note that head $5$ attends strongly to the word "each".
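To make the row and column layout concrete, here is a small sketch of how one head's attention weights are computed with scaled dot-product attention. The toy query and key vectors are made up for illustration; in the real model they come from learned projections of the token embeddings, one set per head.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(queries, keys):
    """Return a (target_len x source_len) matrix of scaled dot-product attention weights."""
    d_k = len(keys[0])
    weights = []
    for q in queries:  # one row per target (expression) token
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights.append(softmax(scores))  # one column per source (prompt) token
    return weights


# Toy vectors: 2 target tokens attending over 3 prompt tokens.
queries = [[1.0, 0.0], [0.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
for row in attention_weights(queries, keys):
    print([round(w, 3) for w in row])
```

Each row sums to 1, so a bright cell in a plotted row means that the target token draws most of its context from that prompt token.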
Critique:
Strengths:
- Correctly identifies which tokens to attend to in order to work out the expression.
- Robust to grammatical errors.
- Achieves $73.7$% test-set accuracy, which is better than some of the earlier work on this dataset.
Weaknesses:
- Trained on a small dataset.
- Struggles with problems that require multiple steps and more than 2 operators.
- Tokenizes numbers as individual digits rather than whole numbers, so changing a single number in the problem can change the output dramatically.
- Can produce erroneous outputs if the statement's grammar is slightly changed or if the given problem statement deviates too much from the structure of the problems in the training set.
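The digit-level tokenization weakness can be illustrated with a toy comparison. Both tokenizers below are hypothetical regex-based stand-ins, not the tokenizer used in MathBot.ipynb; they only show how the same number becomes several tokens in one scheme and a single token in the other.

```python
import re


def tokenize_digits(text):
    """Split into words and punctuation, emitting every digit as its own token."""
    return re.findall(r"\d|[A-Za-z]+|[^\sA-Za-z\d]", text)


def tokenize_numbers(text):
    """Same as above, but keep each run of digits together as one number token."""
    return re.findall(r"\d+|[A-Za-z]+|[^\sA-Za-z\d]", text)


problem = "Sarah has 764 plums."
print(tokenize_digits(problem))   # ['Sarah', 'has', '7', '6', '4', 'plums', '.']
print(tokenize_numbers(problem))  # ['Sarah', 'has', '764', 'plums', '.']
```

With digit tokens, replacing 764 by 1764 changes the sequence length and shifts every later position, which is one plausible reason a small change to a number can alter the predicted expression.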
Resources:
Tutorials:
Inspirations:
We were inspired by similar research works and projects like: