NumberRecognition

NumberRecognition is a project aimed at recognizing handwritten digits from the MNIST dataset using PyTorch. It includes scripts for training and inference, along with utilities for dataset preparation. This project is ideal for learning the basics of neural networks and digit recognition. This is a "Hello World" program in the field of AI.

Install / Use

/learn @linuslau/NumberRecognition

Quick start

pip install -r requirements.txt
python parse_train_images_labels.py
python parse_t10k_images_labels.py
python model_train.py
python model_inference.py

MNIST Digit Recognition with PyTorch

This is a "Hello World" program in the field of artificial intelligence (AI) and deep learning, written for learning purposes.

This repository contains a complete implementation of a neural network for recognizing handwritten digits from the MNIST dataset using PyTorch. The project includes both training and inference scripts, along with utilities for preparing the dataset.

Overview

The project consists of the following scripts:

  • model_train.py: Trains the neural network model on the MNIST training dataset.
  • model_inference.py: Runs inference with the trained model on the MNIST test dataset.
  • parse_train_images_labels.py: Reads and saves the MNIST training images, organized by digit labels.
  • parse_t10k_images_labels.py: Reads and saves the MNIST test labels and images, organized by digit labels.

Contents

parse_train_images_labels.py

This script handles the following tasks:

  • Reads the MNIST training images from the .idx3-ubyte file.
  • Parses the images and saves them into the mnist_train directory, organized by digit labels.
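
As an illustration of the reading step above, here is a minimal sketch of decoding the IDX3 image format using only the standard library. It is not taken from parse_train_images_labels.py; the function name `parse_idx3_images` is hypothetical, and the sample bytes are synthetic so that the block runs without the real dataset file.

```python
import struct

IDX3_MAGIC = 2051  # 0x00000803: unsigned-byte data with 3 dimensions


def parse_idx3_images(data: bytes):
    """Return (rows, cols, images); each image is rows*cols raw pixel bytes."""
    # The header is four big-endian unsigned 32-bit integers.
    magic, count, rows, cols = struct.unpack_from(">IIII", data, 0)
    if magic != IDX3_MAGIC:
        raise ValueError(f"not an IDX3 image file (magic={magic})")
    header_size = 16
    image_size = rows * cols
    if len(data) < header_size + count * image_size:
        raise ValueError("file is shorter than its header claims")
    images = [
        data[header_size + i * image_size: header_size + (i + 1) * image_size]
        for i in range(count)
    ]
    return rows, cols, images


# Synthetic stand-in for a real file: two 2x2 images.
sample = struct.pack(">IIII", IDX3_MAGIC, 2, 2, 2) + bytes([0, 255, 128, 64, 1, 2, 3, 4])
rows, cols, images = parse_idx3_images(sample)
print(rows, cols, len(images))
```

With the real dataset, the same function would be given the full contents of the .idx3-ubyte file read in binary mode, and each 28x28 image could then be written out with an imaging library of your choice.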

parse_t10k_images_labels.py

This script performs the following tasks:

  • Reads the MNIST test labels from the .idx1-ubyte file.
  • Parses the labels and saves the corresponding images into the mnist_test directory, organized by digit labels.
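
The label file and the per-digit layout can be sketched in the same way. This is an assumption-laden illustration, not the code in parse_t10k_images_labels.py: `parse_idx1_labels` and `output_path` are hypothetical names, and the file name pattern is modeled on the paths that appear in the test log (for example mnist_test_126.png inside the folder for its digit).

```python
import struct
from pathlib import PurePosixPath

IDX1_MAGIC = 2049  # 0x00000801: unsigned-byte data with 1 dimension


def parse_idx1_labels(data: bytes) -> list:
    """Return the labels stored in an IDX1 file as a list of ints."""
    # The header is two big-endian unsigned 32-bit integers.
    magic, count = struct.unpack_from(">II", data, 0)
    if magic != IDX1_MAGIC:
        raise ValueError(f"not an IDX1 label file (magic={magic})")
    labels = list(data[8:8 + count])
    if len(labels) != count:
        raise ValueError("file is shorter than its header claims")
    return labels


def output_path(root: str, label: int, index: int) -> PurePosixPath:
    """Build <root>/<label>/mnist_test_<index>.png (assumed naming scheme)."""
    return PurePosixPath(root) / str(label) / f"mnist_test_{index}.png"


# Synthetic stand-in for a real label file holding three labels.
sample = struct.pack(">II", IDX1_MAGIC, 3) + bytes([7, 2, 1])
labels = parse_idx1_labels(sample)
paths = [str(output_path("mnist_test", label, i)) for i, label in enumerate(labels)]
print(paths)
```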

model_train.py

This script handles the following tasks:

  • Loads the training and test datasets from the ./mnist_train and ./mnist_test directories respectively.
  • Applies transformations to the images, including converting them to grayscale and tensor format.
  • Initializes and trains a neural network using the Adam optimizer and cross-entropy loss.
  • Logs the progress of the training process, including dataset lengths, batch information, and loss values.
  • Saves the trained model to mnist.pth.
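
The training steps listed above can be sketched roughly as follows. This is a minimal illustration rather than the contents of model_train.py: the network architecture, learning rate, batch size, and step count are all assumptions, and a random batch stands in for images loaded from ./mnist_train so that the block runs on its own.

```python
import os
import tempfile

import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical architecture; the README does not describe the real one.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # assumed learning rate
loss_fn = nn.CrossEntropyLoss()

# Random stand-in for one batch of grayscale 28x28 images and their labels.
images = torch.rand(32, 1, 28, 28)
labels = torch.randint(0, 10, (32,))

losses = []
model.train()
for step in range(20):
    optimizer.zero_grad()
    logits = model(images)          # shape: (batch, 10)
    loss = loss_fn(logits, labels)  # cross-entropy on raw logits
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
    if step % 5 == 0:
        print(f"step {step}: loss = {loss.item():.4f}")

# Save only the weights; a temporary directory avoids overwriting a real mnist.pth.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mnist.pth")
    torch.save(model.state_dict(), path)
    restored = torch.load(path)
```

In a real run, the random batch would be replaced by batches from a DataLoader over the image folders, and the loop would iterate over the whole dataset for one or more epochs.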

model_inference.py

This script performs the following tasks:

  • Loads the test dataset from the ./mnist_test directory and applies necessary transformations.
  • Loads the pre-trained model from mnist.pth.
  • Performs inference on the test dataset and prints cases where the model's predictions do not match the actual labels.
  • Calculates and prints the accuracy of the model on the test dataset.
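
The mismatch reporting and accuracy calculation described above can be sketched independently of the model. In this hypothetical example, `report_accuracy` takes any callable that maps an image path to a predicted digit; a dictionary lookup stands in for the trained network, and the message format is modeled on the test log in this README.

```python
def report_accuracy(samples, predict):
    """Print each misclassified sample and return the fraction predicted correctly.

    samples: iterable of (img_path, actual_label) pairs
    predict: callable mapping img_path to a predicted label
    """
    total = 0
    correct = 0
    for img_path, actual in samples:
        predicted = predict(img_path)
        total += 1
        if predicted == actual:
            correct += 1
        else:
            print(f"wrong case: predict = {predicted} actual = {actual} img_path = {img_path}")
    return correct / total if total else 0.0


# Stand-in predictions for four made-up image paths; one of them is wrong.
fake_predictions = {"a.png": 0, "b.png": 1, "c.png": 9, "d.png": 3}
samples = [("a.png", 0), ("b.png", 1), ("c.png", 2), ("d.png", 3)]
accuracy = report_accuracy(samples, fake_predictions.get)
print(f"accuracy = {accuracy:.2%}")
```

By the same arithmetic, an accuracy of 97.77% on 10000 test images would correspond to 9777 correct predictions and 223 wrong cases.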

How to Use

  1. Dataset Preparation: Ensure the MNIST dataset is available in the ./mnist_train and ./mnist_test directories, which is where the two parse scripts save their output. Each digit (0-9) should have its own subdirectory containing the corresponding images.

  2. Training the Model:

python model_train.py

This command will train the model and save the trained weights to mnist.pth.

  3. Running Inference:

python model_inference.py

This command will evaluate the trained model on the test dataset and print the accuracy along with any misclassified images.
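
Before training or inference, the directory layout from the Dataset Preparation step can be checked with a small helper. This is a sketch under assumptions: `missing_digit_dirs` is a hypothetical name, the .png extension is taken from the paths in the test log, and a temporary directory stands in for ./mnist_train or ./mnist_test.

```python
import tempfile
from pathlib import Path


def missing_digit_dirs(root):
    """Return the digits (as strings) whose subdirectory is absent or has no .png files."""
    root = Path(root)
    missing = []
    for digit in range(10):
        folder = root / str(digit)
        if not folder.is_dir() or not any(folder.glob("*.png")):
            missing.append(str(digit))
    return missing


# Build a throwaway layout that deliberately lacks the folder for digit 9.
with tempfile.TemporaryDirectory() as tmp:
    for digit in range(9):
        folder = Path(tmp) / str(digit)
        folder.mkdir()
        (folder / f"sample_{digit}.png").touch()
    missing = missing_digit_dirs(tmp)

print(missing)
```

An empty list from `missing_digit_dirs("./mnist_train")` would indicate that every digit has at least one image in place.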

Test result (97.77%)

C:\Users\kz\.conda\envs\nn\python.exe C:\Users\kz\Documents\Code\NumberRecognition\model_inference.py 
test_dataset length:  10000
wrong case: predict = 9 actual = 0 img_path = ./mnist_test\0\mnist_test_126.png
wrong case: predict = 2 actual = 0 img_path = ./mnist_test\0\mnist_test_1748.png
wrong case: predict = 9 actual = 0 img_path = ./mnist_test\0\mnist_test_1987.png
wrong case: predict = 4 actual = 0 img_path = ./mnist_test\0\mnist_test_2033.png
wrong case: predict = 9 actual = 0 img_path = ./mnist_test\0\mnist_test_3251.png
wrong case: predict = 4 actual = 0 img_path = ./mnist_test\0\mnist_test_3818.png
wrong case: predict = 8 actual = 0 img_path = ./mnist_test\0\mnist_test_4065.png
wrong case: predict = 8 actual = 0 img_path = ./mnist_test\0\mnist_test_4880.png
wrong case: predict = 4 actual = 0 img_path = ./mnist_test\0\mnist_test_6400.png
wrong case: predict = 7 actual = 0 img_path = ./mnist_test\0\mnist_test_6597.png
wrong case: predict = 6 actual = 0 img_path = ./mnist_test\0\mnist_test_7216.png
wrong case: predict = 6 actual = 0 img_path = ./mnist_test\0\mnist_test_8325.png
wrong case: predict = 3 actual = 0 img_path = ./mnist_test\0\mnist_test_9634.png
wrong case: predict = 2 actual = 1 img_path = ./mnist_test\1\mnist_test_2182.png
wrong case: predict = 2 actual = 1 img_path = ./mnist_test\1\mnist_test_3073.png
wrong case: predict = 3 actual = 1 img_path = ./mnist_test\1\mnist_test_3906.png
wrong case: predict = 7 actual = 1 img_path = ./mnist_test\1\mnist_test_4201.png
wrong case: predict = 6 actual = 1 img_path = ./mnist_test\1\mnist_test_5331.png
wrong case: predict = 8 actual = 1 img_path = ./mnist_test\1\mnist_test_5457.png
wrong case: predict = 5 actual = 1 img_path = ./mnist_test\1\mnist_test_5642.png
wrong case: predict = 8 actual = 1 img_path = ./mnist_test\1\mnist_test_619.png
wrong case: predict = 6 actual = 1 img_path = ./mnist_test\1\mnist_test_6783.png
wrong case: predict = 0 actual = 1 img_path = ./mnist_test\1\mnist_test_7928.png
wrong case: predict = 8 actual = 1 img_path = ./mnist_test\1\mnist_test_8020.png
wrong case: predict = 2 actual = 1 img_path = ./mnist_test\1\mnist_test_956.png
wrong case: predict = 3 actual = 2 img_path = ./mnist_test\2\mnist_test_1395.png
wrong case: predict = 6 actual = 2 img_path = ./mnist_test\2\mnist_test_1609.png
wrong case: predict = 7 actual = 2 img_path = ./mnist_test\2\mnist_test_1790.png
wrong case: predict = 0 actual = 2 img_path = ./mnist_test\2\mnist_test_2098.png
wrong case: predict = 4 actual = 2 img_path = ./mnist_test\2\mnist_test_2488.png
wrong case: predict = 7 actual = 2 img_path = ./mnist_test\2\mnist_test_321.png
wrong case: predict = 8 actual = 2 img_path = ./mnist_test\2\mnist_test_3796.png
wrong case: predict = 4 actual = 2 img_path = ./mnist_test\2\mnist_test_3817.png
wrong case: predict = 8 actual = 2 img_path = ./mnist_test\2\mnist_test_4248.png
wrong case: predict = 7 actual = 2 img_path = ./mnist_test\2\mnist_test_4289.png
wrong case: predict = 4 actual = 2 img_path = ./mnist_test\2\mnist_test_4615.png
wrong case: predict = 4 actual = 2 img_path = ./mnist_test\2\mnist_test_4876.png
wrong case: predict = 4 actual = 2 img_path = ./mnist_test\2\mnist_test_5086.png
wrong case: predict = 7 actual = 2 img_path = ./mnist_test\2\mnist_test_583.png
wrong case: predict = 8 actual = 2 img_path = ./mnist_test\2\mnist_test_613.png
wrong case: predict = 6 actual = 2 img_path = ./mnist_test\2\mnist_test_6574.png
wrong case: predict = 1 actual = 2 img_path = ./mnist_test\2\mnist_test_659.png
wrong case: predict = 7 actual = 2 img_path = ./mnist_test\2\mnist_test_7457.png
wrong case: predict = 4 actual = 2 img_path = ./mnist_test\2\mnist_test_7886.png
wrong case: predict = 8 actual = 2 img_path = ./mnist_test\2\mnist_test_8094.png
wrong case: predict = 4 actual = 2 img_path = ./mnist_test\2\mnist_test_8198.png
wrong case: predict = 7 actual = 2 img_path = ./mnist_test\2\mnist_test_9664.png
wrong case: predict = 0 actual = 2 img_path = ./mnist_test\2\mnist_test_9768.png
wrong case: predict = 8 actual = 2 img_path = ./mnist_test\2\mnist_test_9811.png
wrong case: predict = 3 actual = 2 img_path = ./mnist_test\2\mnist_test_9839.png
wrong case: predict = 7 actual = 3 img_path = ./mnist_test\3\mnist_test_1128.png
wrong case: predict = 5 actual = 3 img_path = ./mnist_test\3\mnist_test_1531.png
wrong case: predict = 7 actual = 3 img_path = ./mnist_test\3\mnist_test_1681.png
wrong case: predict = 8 actual = 3 img_path = ./mnist_test\3\mnist_test_18.png
wrong case: predict = 9 actual = 3 img_path = ./mnist_test\3\mnist_test_2109.png
wrong case: predict = 9 actual = 3 img_path = ./mnist_test\3\mnist_test_2408.png
wrong case: predict = 5 actual = 3 img_path = ./mnist_test\3\mnist_test_2618.png
wrong case: predict = 2 actual = 3 img_path = ./mnist_test\3\mnist_test_2921.png
wrong case: predict = 2 actual = 3 img_path = ./mnist_test\3\mnist_test_2927.png
wrong case: predict = 7 actual = 3 img_path = ./mnist_test\3\mnist_test_381.png
wrong case: predict = 2 actual = 3 img_path = ./mnist_test\3\mnist_test_4437.png
wrong case: predict = 2 actual = 3 img_path = ./mnist_test\3\mnist_test_4443.png
wrong case: predict = 6 actual = 3 img_path = ./mnist_test\3\mnist_test_5078.png
wrong case: predict = 4 actual = 3 img_path = ./mnist_test\3\mnist_test_5140.png
wrong case: predict = 7 actual = 3 img_path = ./mnist_test\3\mnist_test_5734.png
wrong case: predict = 8 actual = 3 img_path = ./mnist_test\3\mnist_test_5955.png
wrong case: predict = 8 actual = 3 img_path = ./mnist_test\3\mnist_test_5973.png
wrong case: predict = 9 actual = 3 img_path = ./mnist_test\3\mnist_test_6009.png
wrong case: predict = 9 actual = 3 img_path = ./mnist_test\3\mnist_test_6011.png
wrong case: predict = 9 actual = 3 img_path = ./mnist_test\3\mnist_test_6023.png
wrong case: predict = 9 actual = 3 img_path = ./mnist_test\3\mnist_test_6045.png
wrong case: predict = 8 actual = 3 img_path = ./mnist_test\3\mnist_test_6046.png
wrong case: predict = 0 actual = 3 img_path = ./mnist_test\3\mnist_test_6059.png
wrong case: predict = 2 actual = 3 img_path = ./mnist_test\3\mnist_test_7800.png
wrong case: predict = 2 actual = 3 img_path = ./mnist_test\3\mnist_test_7821.png
wrong case: predict = 9 actual = 3 img_path = ./mnist_test\3\mnist_test_8246