Yasir Insights
09 Oct 2025

🎯 Handwritten Digit Recognition Model By Mirza Yasir Abdullah Baig

🧠 Introduction

Handwriting recognition is one of the most fascinating and practical applications of Computer Vision and Deep Learning. It lies at the intersection of pattern recognition, neural networks, and image processing — enabling systems like postal address reading, bank check verification, and digital note recognition.

To explore the power of deep learning, I built a Handwritten Digit Recognition App using the MNIST dataset. The app allows users to draw digits (0–9) on an interactive canvas or upload images, and it predicts the correct digit using a Convolutional Neural Network (CNN) model.

This project demonstrates an end-to-end AI pipeline — from dataset handling and model building to web app deployment — showing how deep learning can bridge the gap between data and real-world applications.

Source Code: https://github.com/mirzayasirabdullahbaig07/HandWritten-Classification-Model

💡 Motivation & Intuition Behind the Project

Exploring AI in Computer Vision:
I wanted to understand how neural networks can “see” and “interpret” images — the same fundamental principle behind OCR (Optical Character Recognition) and modern vision systems like Google Lens.
Deep Learning Skill Development:
The goal was to gain hands-on experience with CNN architecture, image preprocessing, and TensorFlow/Keras model training.
End-to-End ML Pipeline Practice:
Building a model isn’t enough — I wanted to learn the entire process:
➜ Preprocessing the dataset
➜ Designing and training a CNN
➜ Evaluating accuracy
➜ Saving the trained model
➜ Deploying it using Streamlit with an interactive UI
Portfolio & Interview Readiness:
This project reflects practical knowledge of deep learning, model deployment, and user experience design, all of which are highly valued in AI/ML engineer interviews.

📊 Dataset Overview – MNIST (Modified National Institute of Standards and Technology)

Dataset Name: MNIST Handwritten Digit Dataset
Source: Yann LeCun’s MNIST Database (a standard benchmark for digit recognition)
Training Samples: 60,000 images
Testing Samples: 10,000 images
Image Size: 28 × 28 pixels (Grayscale)
Classes: 10 (Digits 0–9)

Why MNIST?

MNIST is simple yet powerful. It’s often called the “Hello World” of computer vision because it helps beginners learn how image data can be understood by neural networks. Despite its simplicity, it demonstrates key deep learning concepts like:

Image normalization
Feature extraction via convolution layers
Classification using dense layers
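
To make "feature extraction via convolution layers" concrete, here is a minimal NumPy sketch of one convolution, a ReLU, and a 2×2 max pool. It is an illustration only: the image is synthetic (not MNIST data), the kernel is hand-picked rather than learned, and the helper names `conv2d_valid` and `max_pool_2x2` are made up for this example.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with no padding and stride 1."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the patch and the kernel, then sum.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool_2x2(feature_map):
    """Keep the maximum of each non-overlapping 2x2 block."""
    h, w = feature_map.shape
    h, w = h - h % 2, w - w % 2  # drop a trailing odd row/column, if any
    blocks = feature_map[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

# Synthetic 28x28 "image": dark left half, bright right half.
image = np.zeros((28, 28))
image[:, 14:] = 1.0

# Hand-picked vertical-edge kernel; a trained CNN learns its kernels instead.
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0]])

features = np.maximum(conv2d_valid(image, kernel), 0.0)  # ReLU
pooled = max_pool_2x2(features)
print(features.shape, pooled.shape)  # (26, 26) (13, 13)
```

The feature map is non-zero only where the kernel straddles the dark-to-bright edge, which is the sense in which a filter "detects" a stroke.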

⚙️ Techniques & Workflow

1️⃣ Data Preprocessing

Loaded the dataset using TensorFlow/Keras.
Normalized pixel values (0–255 → 0–1) for faster and more stable training.
Reshaped images into 28×28×1 tensors for CNN compatibility.
One-hot encoded target labels (0–9).
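
The preprocessing steps above can be sketched in a few lines of NumPy. This is a minimal sketch, not the project's actual code: the function name `preprocess` is hypothetical, and the example runs on random stand-in arrays so that it works without downloading the dataset.

```python
import numpy as np

def preprocess(images, labels, num_classes=10):
    """Normalize, reshape, and one-hot encode MNIST-style arrays."""
    x = images.astype("float32") / 255.0              # 0-255 -> 0-1
    x = x.reshape(-1, 28, 28, 1)                       # add the channel axis
    y = np.eye(num_classes, dtype="float32")[labels]   # one-hot rows
    return x, y

# Random stand-in data with the same shape and dtype as MNIST images.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)
fake_labels = np.array([3, 0, 9, 1, 7])

x, y = preprocess(fake_images, fake_labels)
print(x.shape, y.shape)  # (5, 28, 28, 1) (5, 10)
```

With the real dataset, the same function could be applied to the arrays returned by `tf.keras.datasets.mnist.load_data()`, which yields `(x_train, y_train), (x_test, y_test)`.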

2️⃣ Model Architecture (CNN)

I experimented with different architectures (a simple dense network vs. a CNN) and settled on a Convolutional Neural Network, because its convolutional filters exploit the spatial structure of images.

Final CNN Architecture:

Conv2D Layer 1: 32 filters, 3×3 kernel, ReLU activation
MaxPooling Layer 1: 2×2 pool size
Conv2D Layer 2: 64 filters, 3×3 kernel, ReLU activation
MaxPooling Layer 2: 2×2 pool size
Flatten Layer: Converts 2D features to 1D
Dense Layer: 128 neurons, ReLU activation
Output Layer: 10 neurons (for digits 0–9), Softmax activation
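
A sketch of how the layer list above might look in Keras, together with a plain-Python walk through the tensor shapes. Both rest on assumptions the list does not state: no padding and stride 1 for the convolutions (the Keras defaults) and no layers beyond those listed. The function names `trace_shapes` and `build_model` are hypothetical.

```python
def trace_shapes(height=28, width=28, channels=1):
    """Return (layer name, output shape, parameter count) for each listed layer."""
    rows = []
    h, w, c = height, width, channels
    for filters in (32, 64):
        # 3x3 convolution, no padding, stride 1: each side shrinks by 2.
        params = (3 * 3 * c + 1) * filters  # weights plus one bias per filter
        h, w, c = h - 2, w - 2, filters
        rows.append((f"conv2d_{filters}", (h, w, c), params))
        # 2x2 max pooling halves each side (rounding down) and has no weights.
        h, w = h // 2, w // 2
        rows.append((f"maxpool_after_{filters}", (h, w, c), 0))
    flat = h * w * c
    rows.append(("flatten", (flat,), 0))
    rows.append(("dense_128", (128,), (flat + 1) * 128))
    rows.append(("output_10", (10,), (128 + 1) * 10))
    return rows

def build_model():
    """Keras version of the same stack (TensorFlow is imported only when called)."""
    from tensorflow import keras
    from tensorflow.keras import layers

    return keras.Sequential([
        layers.Input(shape=(28, 28, 1)),
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(10, activation="softmax"),
    ])

for name, shape, params in trace_shapes():
    print(f"{name:20s} {str(shape):14s} {params}")
total_params = sum(params for _, _, params in trace_shapes())
print("total parameters:", total_params)  # 225034 under these assumptions
```

Most of the parameters sit in the 128-neuron dense layer (1,600 flattened inputs × 128 neurons), not in the convolutions.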

3️⃣ Model Training

Optimizer: Adam (adaptive learning rate optimization)
Loss Function: Categorical Crossentropy
Metrics: Accuracy
Epochs: 10–20
Batch Size: 128
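
To show what the loss function measures and how these settings could be passed to Keras, here is a minimal sketch. The NumPy `categorical_crossentropy` below is a hand-written illustration of the formula, not the Keras implementation. The `train` helper is hypothetical, and its `validation_split=0.1` is an assumption: the post does not say how validation data was chosen.

```python
import numpy as np

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean over the batch of -sum(y_true * log(y_prob))."""
    y_prob = np.clip(y_prob, eps, 1.0)  # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_prob), axis=1)))

def train(model, x_train, y_train, epochs=10, batch_size=128):
    """Compile and fit a Keras model with the settings listed above."""
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    # validation_split is an assumption, not taken from the original project.
    return model.fit(x_train, y_train,
                     epochs=epochs,
                     batch_size=batch_size,
                     validation_split=0.1)

# One sample whose true class is 3, scored against two different predictions.
y_true = np.eye(10)[[3]]
uniform = np.full((1, 10), 0.1)        # model has no idea
confident = np.full((1, 10), 0.01)
confident[0, 3] = 0.91                 # model is fairly sure it is a 3

print(categorical_crossentropy(y_true, uniform))    # ln(10), about 2.30
print(categorical_crossentropy(y_true, confident))  # -ln(0.91), about 0.09
```

The loss falls as the probability assigned to the correct digit rises, which is what Adam pushes the weights toward during training.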

4️⃣ Model Evaluation

Achieved Accuracy: ~99% on test data.
Error rate of roughly 1% on the test set, with minimal overfitting observed.
Plotted confusion matrix to visualize per-digit performance.
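
A confusion matrix can be computed directly from true and predicted labels. The sketch below uses NumPy only and a handful of made-up labels. The function names are hypothetical, and the numbers are not results from this project.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes=10):
    """Rows are true digits, columns are predicted digits."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm

def per_class_recall(cm):
    """Fraction of each true digit that was predicted correctly."""
    totals = cm.sum(axis=1)
    return np.divide(np.diag(cm), totals,
                     out=np.zeros(len(cm)), where=totals > 0)

# Made-up labels for illustration only.
y_true = [0, 1, 2, 2, 4, 9, 9, 9]
y_pred = [0, 1, 2, 7, 4, 9, 4, 9]

cm = confusion_matrix(y_true, y_pred)
accuracy = np.trace(cm) / cm.sum()
print(accuracy)                  # 0.75
print(per_class_recall(cm)[9])   # 2 of the 3 nines were recognized
```

With a trained model, `y_pred` would typically come from `model.predict(x_test).argmax(axis=1)`. Off-diagonal cells show which pairs of digits are confused with each other.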

5️⃣ Model Serialization

Saved trained model as mnist_model.h5 using Keras.
Kept Pickle/Joblib as an alternative serialization option for future reuse.
Loaded the model in the web app to make real-time predictions.
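
Saving and reloading the model might look like the sketch below. The wrapper names `save_trained_model` and `load_trained_model` are hypothetical; `model.save` and `keras.models.load_model` are the Keras calls they rely on.

```python
def save_trained_model(model, path="mnist_model.h5"):
    """Write the trained Keras model to a single HDF5 file and return the path."""
    model.save(path)
    return path

def load_trained_model(path="mnist_model.h5"):
    """Read the model back (TensorFlow is imported only when called)."""
    from tensorflow import keras
    return keras.models.load_model(path)
```

In a Streamlit app, decorating the loader with `st.cache_resource` is one way to avoid reloading the file on every rerun. Whether the original app does this is not stated.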

💻 Web App Development – Streamlit Deployment

The model is deployed using Streamlit, making it accessible via a clean, interactive interface.

🌐 Key Features:

🖊️ Interactive Drawing Canvas: Draw digits directly for prediction.
📁 Upload Option: Upload any digit image for recognition.
🔮 Instant Predictions: Displays recognized digit and model confidence score.
💡 Visual Feedback: Probability bar chart of all digits.
👨‍💻 About Me Sidebar: Includes links to LinkedIn, GitHub, and Kaggle.

🔧 Tech Stack:

Python 3.9+
Streamlit – Frontend UI
TensorFlow/Keras – Model training
NumPy & Pandas – Data handling
Matplotlib & Seaborn – Visualization
OpenCV – Image preprocessing
Pickle/Keras – Model saving/loading
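
Before the app can predict, a drawn or uploaded image has to be converted to the format the model was trained on. The sketch below shows one way to do that with NumPy alone. It assumes a square grayscale canvas whose side is a multiple of 28 (for example 280×280) and shrinks it by block averaging. The original app lists OpenCV for preprocessing, so its actual resizing code likely differs. The names `canvas_to_input` and `summarize` are hypothetical.

```python
import numpy as np

def canvas_to_input(canvas, invert=False):
    """Convert a square grayscale canvas to a (1, 28, 28, 1) batch in [0, 1]."""
    side = canvas.shape[0]
    if canvas.shape != (side, side) or side % 28 != 0:
        raise ValueError("expected a square 2D array whose side is a multiple of 28")
    block = side // 28
    # Average each block x block patch down to one pixel.
    small = canvas.astype("float32").reshape(28, block, 28, block).mean(axis=(1, 3))
    small /= 255.0
    if invert:
        # MNIST digits are bright on a dark background; flip dark-on-light drawings.
        small = 1.0 - small
    return small.reshape(1, 28, 28, 1)

def summarize(probabilities):
    """Return (predicted digit, confidence) from a softmax output."""
    probs = np.asarray(probabilities).reshape(-1)
    digit = int(probs.argmax())
    return digit, float(probs[digit])

# A 280x280 canvas with one bright vertical stroke, roughly like a "1".
canvas = np.zeros((280, 280), dtype=np.uint8)
canvas[40:240, 130:150] = 255

x = canvas_to_input(canvas)
print(x.shape)  # (1, 28, 28, 1)

# Made-up probabilities standing in for model.predict(x)[0].
fake_probs = [0.01, 0.90, 0.01, 0.01, 0.01, 0.01, 0.01, 0.02, 0.01, 0.01]
print(summarize(fake_probs))  # (1, 0.9)
```

In the app, the full probability vector could be passed to `st.bar_chart` for the visual feedback, and uploads could be read with `st.file_uploader`. The post does not say which component provides the drawing canvas.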

📈 Results & Insights

Training Accuracy: ~99%
Validation Accuracy: ~98–99%
Minimal Overfitting: Regularization & pooling helped maintain generalization.
Feature Learning: Lower layers detect strokes, while deeper layers detect digit structure.

Visualizations:

Confusion Matrix
Accuracy vs. Epochs
Loss vs. Epochs

These metrics demonstrate robust model performance and reliability for educational or demonstration use cases.

💬 Interview Talking Points

If asked in an interview, you can explain:

What problem does your model solve?
It automates digit recognition with a CNN: given a 28×28 grayscale image of a handwritten digit, it predicts which of the ten digits (0–9) the image shows.
Why did you use a CNN instead of a normal neural network?
CNNs are better for image data because they can automatically extract spatial and visual features using filters and pooling layers.
What preprocessing steps were necessary?
Normalization, reshaping, and one-hot encoding were key to making the data compatible with CNNs.
How did you handle overfitting?
I used dropout, pooling layers, and a limited number of epochs.
What metrics did you use?
Accuracy, Confusion Matrix, and visual learning curves.
What did you learn from this project?
End-to-end ML project flow — dataset handling, CNN design, evaluation, model saving, and deployment in Streamlit.

🚀 Demo

🎥 Live Demo: https://handwrittenclassification07.streamlit.app/
📸 Video Demo: Handwritten-Prediction.webm
🧾 Model File: mnist_model.h5

About Me

Name: Mirza Yasir Abdullah Baig
Portfolio: Kaggle, LinkedIn, GitHub

Goal: Building intelligent, human-centered AI applications that make technology intuitive and accessible.

⚠️ Disclaimer

This project is for educational and demonstration purposes only.
It is not designed for real-world deployment in production systems.
