Breast Cancer Prediction Model By Mirza Yasir Abdullah Baig - Yasir Insights

  • Yasir Insights
  • 09 Oct 2025

🎗️ Breast Cancer Prediction Model By Mirza Yasir Abdullah Baig

Introduction

Breast cancer is one of the most common cancers among women worldwide, and early detection is crucial for improving survival rates. That motivated me to build a Breast Cancer Prediction App that shows how machine learning can assist in the early detection of breast tumors.

The app predicts whether a tumor is Malignant (cancerous) or Benign (non-cancerous) based on key medical features. While it cannot replace professional diagnosis, it serves as an educational and demonstration tool for machine learning applications in healthcare.

Github Repo: https://github.com/mirzayasirabdullahbaig07/Breast-Cancer-Prediction-Model


💡 Motivation Behind Building This

  1. Personal Interest in Healthcare AI: I wanted to explore how AI can assist in critical real-world problems like cancer detection.

  2. Demonstrate End-to-End ML Project Skills: The project covers data preprocessing, model building, web deployment, and an interactive UI built with Streamlit.

  3. Portfolio & Interview Ready: A project that shows practical implementation of ML in healthcare, which is appealing to recruiters in AI/ML roles.

  4. Simplifying Complexity for Users: Instead of 30 medical features, I reduced the input to 8 essential features for usability, showing UX awareness in ML applications.


🧠 What This Model Does

The model predicts the risk of breast cancer by classifying tumors into:

  • Malignant → Cancerous tumor (⚠️ consult a doctor)

  • Benign → Non-cancerous tumor (✅ no immediate concern)

It uses supervised machine learning: the model is trained on labeled tumor records and then predicts which of the two classes a new set of inputs most likely belongs to.


📊 Dataset Used

  • Dataset: Breast Cancer Wisconsin (Diagnostic) Dataset

  • Source: UCI Machine Learning Repository

  • Details:

    • Total Features: 30 medical features per tumor

    • Simplified Features Used: 8 key features (mean radius, mean texture, mean perimeter, mean area, mean smoothness, etc.) for easy input.

  • Classes:

    • Malignant → Cancerous

    • Benign → Non-cancerous


🛠️ Techniques and Workflow

The project follows an end-to-end ML pipeline:

  1. Data Preprocessing

    • Handling missing values

    • Removing duplicates

    • Scaling and normalizing features

    • Selecting key features for prediction

  2. Model Selection & Training

    • Techniques Used:

      • Logistic Regression (Baseline)

      • Random Forest Classifier (For robustness)

      • Neural Network (TensorFlow/Keras)

    • Selected Model: Random Forest Classifier (or the Neural Network, whichever scores higher on accuracy)

    • Reason:

      • Handles non-linear relationships well

      • Provides feature importance insights

      • High accuracy on this dataset with little sign of overfitting

  3. Model Serialization

    • Saved the trained model with Pickle (trained_model.sav) for deployment

  4. Web App Development

    • Framework: Streamlit

    • UI Features:

      • Input form for 8 tumor features

      • Clear display of prediction result (Malignant/Benign)

      • About Me sidebar with portfolio links

    • User Experience: Focused on simplicity and clarity

  5. Prediction

    • Users enter tumor features → Click Predict → Receive visual result cards

    • ⚠️ Malignant → consult a doctor

    • ✅ Benign → no immediate concern
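The workflow above can be sketched in a few lines of scikit-learn. This is an illustration under stated assumptions, not the exact code from the repo: it uses scikit-learn's bundled copy of the dataset, only the five features named in this post, a Random Forest with arbitrarily chosen hyperparameters, and a hypothetical helper called predict_tumor. Scaling is included because it is part of the preprocessing list, although tree-based models do not strictly need it.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: only the five features named in the post are used here.
NAMED_FEATURES = [
    "mean radius", "mean texture", "mean perimeter",
    "mean area", "mean smoothness",
]

# 1. Data preprocessing: load the data and select the input columns.
data = load_breast_cancer()
names = list(data.feature_names)
X = data.data[:, [names.index(n) for n in NAMED_FEATURES]]
y = data.target  # 0 = malignant, 1 = benign

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# 2. Model training: scaling and the classifier live in one pipeline,
#    so the same scaling is applied again at prediction time.
model = make_pipeline(
    StandardScaler(),
    RandomForestClassifier(n_estimators=200, random_state=42),
)
model.fit(X_train, y_train)
test_accuracy = accuracy_score(y_test, model.predict(X_test))

# 3. Model serialization: write the fitted pipeline to trained_model.sav
#    (in a temporary directory here) and load it back.
model_path = Path(tempfile.mkdtemp()) / "trained_model.sav"
with open(model_path, "wb") as f:
    pickle.dump(model, f)
with open(model_path, "rb") as f:
    loaded_model = pickle.load(f)


# 5. Prediction: hypothetical helper that a UI could call on Predict.
def predict_tumor(fitted_model, features):
    """Return (label, probability of malignancy) for one tumor."""
    proba = fitted_model.predict_proba([features])[0]
    malignant_index = list(fitted_model.classes_).index(0)
    p_malignant = float(proba[malignant_index])
    label = "Malignant" if p_malignant >= 0.5 else "Benign"
    return label, p_malignant


label, p_malignant = predict_tumor(loaded_model, X_test[0])
print(f"Held-out accuracy: {test_accuracy:.3f}")
print(f"First test sample: {label} (p_malignant={p_malignant:.2f})")
```

Pickling the whole pipeline rather than the classifier alone keeps the scaler and the model together, so the app cannot accidentally feed unscaled inputs to a model trained on scaled ones. Only unpickle files you created yourself, since loading a pickle can execute arbitrary code.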


💻 Tech Stack

  • Python 3.9+ – Programming language

  • Streamlit – Frontend web app development

  • NumPy & Pandas – Data processing

  • Scikit-learn – Model training and evaluation

  • Pickle – Model serialization for deployment

  • TensorFlow/Keras – Neural network model (optional)


📈 Model Performance (Interview Talking Points)

  • Evaluation Metrics:

    • Accuracy, Precision, Recall, F1-Score

  • Example Results:

    • Accuracy: ~97–99%

    • Confusion Matrix shows minimal false negatives (critical in medical applications)

  • Feature Importance:

    • Radius, perimeter, and area are the most influential features for tumor prediction
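To show how these metrics relate to the confusion matrix, here is a small pure-Python sketch that treats malignant as the positive class. The labels below are made up for illustration and are not results from this project; the function names are my own.

```python
def confusion_counts(y_true, y_pred, positive="malignant"):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1  # benign tumor flagged as malignant
        elif truth == positive:
            fn += 1  # malignant tumor missed: the costly error
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from the counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up labels for illustration only: 4 malignant and 6 benign tumors,
# with one missed malignant case and one false alarm.
y_true = ["malignant"] * 4 + ["benign"] * 6
y_pred = (
    ["malignant", "malignant", "malignant", "benign"]
    + ["benign"] * 5
    + ["malignant"]
)

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
metrics = classification_metrics(tp, fp, fn, tn)
print(tp, fp, fn, tn)
print(metrics)
```

Recall on the malignant class is the number to watch here: each false negative is a cancerous tumor reported as benign, and a model can post high accuracy while still missing malignant cases, because benign samples outnumber malignant ones in this dataset.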



⚠️ Disclaimer

This project is for educational purposes only. It should not be used as a substitute for professional medical diagnosis.
