Yasir Insights
06 Dec 2025

📉 Customer Churn Prediction Model By Mirza Yasir Abdullah Baig

Introduction

Customer churn is one of the most critical issues for telecom companies, as losing customers directly impacts revenue and growth. I built this Customer Churn Prediction App to demonstrate how machine learning can be used to predict whether a telecom customer is likely to leave the company (churn) or stay. The app allows businesses to proactively retain customers and optimize their marketing strategies while serving as an educational project showcasing end-to-end ML implementation.

GitHub Repo: https://github.com/mirzayasirabdullahbaig07/Customer_Churn_Prediction_Model

Motivation & Intuition Behind the Project

The main motivation was to explore how AI can solve real-world business problems. Customer churn prediction is a high-impact application of machine learning because it helps companies reduce revenue loss and improve customer retention. Additionally, I aimed to create a project that covers the entire ML lifecycle: from data preprocessing and exploratory data analysis to model building, evaluation, and deployment in a web app.

Dataset & Features

The project uses the Telco Customer Churn Dataset from IBM Sample Data, which contains both categorical and numerical features about customers:

Demographics: Gender, SeniorCitizen, Partner, Dependents
Account Information: Tenure, Contract type, Payment method, Paperless billing
Services: Phone service, Internet, Online security, Streaming services, Tech support
Charges: MonthlyCharges, TotalCharges

The target variable is Churn:

Yes → Customer left the company (churned)
No → Customer stayed

The dataset required preprocessing, including handling missing values, encoding categorical features, scaling numeric features, and addressing class imbalance using SMOTE.
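
The first three preprocessing steps can be illustrated with a small, dependency-free sketch. This is not the project's code: the toy rows, the helper names (`to_float`, `label_encode`, `standardize`), and the choice to fill blank charges with 0.0 are assumptions made for illustration. In the project itself these steps would typically go through Pandas and Scikit-learn.

```python
from statistics import mean, pstdev

# Toy rows shaped like the Telco data; the values are made up for illustration.
rows = [
    {"Contract": "Month-to-month", "MonthlyCharges": "70.0", "TotalCharges": "140.0"},
    {"Contract": "Two year", "MonthlyCharges": "20.0", "TotalCharges": " "},
    {"Contract": "One year", "MonthlyCharges": "60.0", "TotalCharges": "720.0"},
]

def to_float(text, default=0.0):
    """Parse a numeric string; blank or malformed values fall back to `default`."""
    try:
        return float(text)
    except ValueError:
        return default

def label_encode(values):
    """Map each distinct category to an integer, assigned in sorted order."""
    mapping = {cat: idx for idx, cat in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping

def standardize(values):
    """Scale values to zero mean and unit variance (population std)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]

# Missing values: the blank TotalCharges becomes 0.0 (an assumed fill value).
total = [to_float(r["TotalCharges"]) for r in rows]
# Encoding: contract types become integer codes.
contract, contract_map = label_encode([r["Contract"] for r in rows])
# Scaling: monthly charges are centred and rescaled.
monthly = standardize([to_float(r["MonthlyCharges"]) for r in rows])

print(total, contract, contract_map)
```

Whatever fill value and encoding are chosen, the same mapping and the same scaling statistics need to be reused when the app scores new customers, otherwise the model sees inputs on a different scale than it was trained on.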

Techniques & Model Used

I explored multiple machine learning models, including Random Forest and XGBoost, to identify the best-performing approach.

Why Random Forest/XGBoost:
- Handles high-dimensional data with mixed categorical and numerical features
- Relatively resistant to overfitting (Random Forest averages many trees; XGBoost adds regularization)
- Provides feature importance insights

Data Handling & Preprocessing:
- Label encoding for categorical variables
- Feature scaling for numerical columns
- SMOTE to oversample the minority (churn) class so the model does not simply favor the majority class
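
To make the SMOTE step less of a black box, here is a minimal sketch of its core idea: new minority-class points are created by interpolating between an existing minority sample and one of its nearest minority neighbours. The function name `smote_sketch`, the toy points, and `k=2` are assumptions for illustration; the project relies on the Imbalanced-learn implementation rather than anything like this.

```python
import random

def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by squared distance to `base`.
        others = minority[:i] + minority[i + 1:]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic

# Toy minority-class points as (tenure, MonthlyCharges); values are made up.
churners = [(1.0, 70.0), (2.0, 80.0), (3.0, 75.0), (5.0, 90.0)]
new_points = smote_sketch(churners, n_new=4)
print(new_points)
```

Because the synthetic points are derived from existing samples, oversampling is usually applied to the training split only, so that evaluation data contains real customers exclusively.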

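Evaluation Metrics: Accuracy, Precision, Recall, F1-Score, and AUC. Because churners are the minority class, accuracy alone can look good while most churners are missed, so precision and recall are reported alongside it. The models performed well across these metrics, keeping both false positives (loyal customers flagged as churners) and false negatives (missed churners) low, which is critical in customer retention scenarios.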
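
As a small worked example of how these metrics relate, the sketch below computes accuracy, precision, recall, and F1 from raw predictions. The labels are invented (1 = churn, 0 = stay) and the function name `classification_metrics` is an assumption; the project computes these through Scikit-learn.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels (1 = churn)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Invented labels: 3 churners out of 10 customers.
y_true = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0, 0, 0]

model_scores = classification_metrics(y_true, y_pred)
# A "model" that always predicts "stay" still looks decent on accuracy.
baseline_scores = classification_metrics(y_true, [0] * len(y_true))
print(model_scores)
print(baseline_scores)
```

The always-"stay" baseline reaches 70% accuracy on this toy data while catching no churners at all (recall of 0), which is why recall and F1 matter more than accuracy for this problem.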

Web App & User Experience

The model is deployed using Streamlit, providing a user-friendly interface:

Single Customer Prediction: Users input customer details and receive a prediction (Yes → will churn, No → will stay).
Batch Prediction: Users upload a CSV file containing multiple customer profiles to receive batch predictions.
Visuals: Probability-based predictions and clear charts to help interpret results.
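
The batch flow can be sketched without Streamlit: read the uploaded CSV, score each row, and turn the probability into a Yes/No label. Everything here is illustrative. `churn_probability` is a hypothetical stand-in for the trained model with an invented scoring rule, and `predict_batch` and the 0.5 threshold are assumptions, not the app's actual code.

```python
import csv
import io

def churn_probability(row):
    """Hypothetical stand-in for the trained model's probability output."""
    score = 0.2
    if row["Contract"] == "Month-to-month":
        score += 0.4
    if float(row["tenure"]) < 12:
        score += 0.3
    return min(score, 1.0)

def predict_batch(csv_text, threshold=0.5):
    """Score every row of an uploaded CSV and attach a Yes/No churn label."""
    reader = csv.DictReader(io.StringIO(csv_text))
    results = []
    for row in reader:
        prob = churn_probability(row)
        results.append({
            "customerID": row["customerID"],
            "probability": round(prob, 2),
            "Churn": "Yes" if prob >= threshold else "No",
        })
    return results

# Made-up upload with two customer profiles.
uploaded = (
    "customerID,tenure,Contract\n"
    "A-001,3,Month-to-month\n"
    "B-002,48,Two year\n"
)
predictions = predict_batch(uploaded)
for item in predictions:
    print(item)
```

Returning the probability next to the label lets the interface chart risk levels and lets a business pick a different threshold, for example a lower one when missing a churner costs more than a wasted retention offer.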

The app also features a modern sidebar with portfolio links and serves as an educational tool for demonstrating real-world ML deployment.

Interview-Ready Points

If asked to explain this project in an interview, you can highlight:

Problem Motivation: Reducing customer churn is critical for revenue optimization in telecom.
Dataset Understanding: Mixed categorical and numerical features; target variable is churn.
Model Choice: Random Forest and XGBoost for strong accuracy on tabular data, interpretability through feature importance, and handling complex feature relationships.
Techniques Used:
- Data preprocessing (missing values, scaling, encoding)
- Handling class imbalance (SMOTE)
- Model evaluation using multiple metrics
Deployment: Interactive Streamlit app for single and batch predictions.
Learning Outcome: End-to-end ML workflow, from data analysis to deployment and visualization.

Tech Stack

Python 3.9+ – Core programming
Streamlit – Web app deployment
Pandas & NumPy – Data processing
Matplotlib & Seaborn – Data visualization and EDA
Scikit-learn – Model training, encoding, and evaluation
XGBoost & Random Forest – Predictive modeling
Imbalanced-learn (SMOTE) – Handling class imbalance
Pickle – Model serialization

Demo & Screenshots

Live App: https://customerchurnprediction07.streamlit.app/

Disclaimer

This project is for educational purposes only and should not be used for real-world business decisions without further validation.
