This project analyzes customer behavior and cancellation patterns using the Kaggle E-Commerce dataset.
The goal is to identify key drivers of cancellations, quantify revenue impact, and segment high-risk customers to support data-driven business decisions.
Objectives:
- Analyze overall cancellation rate
- Quantify revenue lost due to cancellations
- Identify high-risk products and customer segments
- Explore customer behavior patterns
- Build actionable business insights from transactional data
Key metrics:
- Overall Cancellation Rate
- Revenue Lost from Cancellations
- Top Products by Cancellation Count
- Top Products by Revenue Impact
- Customer-Level Cancellation Rate
- Segment-Based Cancellation Trends
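The headline metrics above could be computed roughly as follows. This is a minimal sketch on toy data, not the project's actual code: the column names (`InvoiceNo`, `StockCode`, `Quantity`, `UnitPrice`) and the convention that a cancelled invoice number starts with `C` are assumptions about the dataset's schema, and `cancellation_metrics` is a hypothetical helper name.

```python
import pandas as pd


def cancellation_metrics(df: pd.DataFrame, top_n: int = 3) -> dict:
    """Compute headline cancellation metrics from line-item transactions.

    Assumed (hypothetical) schema: InvoiceNo, StockCode, Quantity, UnitPrice.
    Assumed convention: a cancelled invoice has an InvoiceNo starting with "C".
    """
    df = df.copy()
    df["IsCancelled"] = df["InvoiceNo"].astype(str).str.startswith("C")
    # Absolute value, so cancelled lines recorded with negative quantities
    # still contribute a positive amount.
    df["TotalPrice"] = (df["Quantity"] * df["UnitPrice"]).abs()

    # The rate is computed per invoice rather than per line item.
    invoice_cancelled = df.groupby("InvoiceNo")["IsCancelled"].any()
    cancelled = df[df["IsCancelled"]]

    return {
        "cancellation_rate": invoice_cancelled.mean(),
        "revenue_lost": cancelled["TotalPrice"].sum(),
        "top_by_count": cancelled["StockCode"].value_counts().head(top_n),
        "top_by_revenue": (
            cancelled.groupby("StockCode")["TotalPrice"]
            .sum()
            .sort_values(ascending=False)
            .head(top_n)
        ),
    }


# Toy data for illustration only.
orders = pd.DataFrame(
    {
        "InvoiceNo": ["1001", "1001", "C1002", "1003", "C1004", "C1004"],
        "StockCode": ["A", "B", "A", "C", "B", "A"],
        "Quantity": [2, 1, -2, 5, -1, -1],
        "UnitPrice": [10.0, 20.0, 10.0, 4.0, 20.0, 10.0],
    }
)
metrics = cancellation_metrics(orders)
print(metrics["cancellation_rate"], metrics["revenue_lost"])
print(metrics["top_by_revenue"])
```

Computing the rate per invoice is one reasonable choice; a per-line-item or per-customer rate would give different numbers, so the definition should be stated alongside any reported figure.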
Key insights:
- A small subset of products drives a disproportionate share of cancellation impact.
- Certain customer segments exhibit significantly higher cancellation rates.
- Revenue-weighted cancellation analysis provides stronger business insight than raw counts.
- High-value customers with frequent cancellations may require targeted retention strategies.
Tools:
- Python
- Pandas
- NumPy
- Matplotlib
- Seaborn
- Scikit-Learn (for segmentation & modeling)
Source: Kaggle – E-Commerce Data
The dataset contains transaction-level order data, including:
- Customer ID
- Product / Stock Code
- Quantity
- Price
- Invoice Information
- Cancellation Indicators
Methodology:
- Data Cleaning & Preprocessing
- Feature Engineering (Total Price, Cancellation Flags)
- Exploratory Data Analysis (EDA)
- Cancellation Rate & Revenue Impact Analysis
- Customer-Level Behavioral Segmentation
- Visualization & Business Interpretation
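As a rough sketch of the feature-engineering and customer-level segmentation steps, the example below aggregates line items per customer and assigns a simple rule-based risk segment. The schema (`CustomerID`, `InvoiceNo`, `Quantity`, `UnitPrice`), the `C` prefix for cancelled invoices, the 0.3 rate threshold, the median split on purchase value, and the segment labels are all illustrative assumptions. The rule stands in for whatever segmentation the project uses; a Scikit-Learn clustering model could replace it.

```python
import numpy as np
import pandas as pd


def segment_customers(df: pd.DataFrame, rate_threshold: float = 0.3) -> pd.DataFrame:
    """Build per-customer features and a rule-based risk segment (hypothetical helper)."""
    df = df.copy()
    # Feature engineering: cancellation flag and line value.
    df["IsCancelled"] = df["InvoiceNo"].astype(str).str.startswith("C")
    df["TotalPrice"] = (df["Quantity"] * df["UnitPrice"]).abs()
    # Split each line's value into purchased vs. cancelled amounts.
    df["PurchasedValue"] = np.where(df["IsCancelled"], 0.0, df["TotalPrice"])
    df["CancelledValue"] = np.where(df["IsCancelled"], df["TotalPrice"], 0.0)

    customers = df.groupby("CustomerID").agg(
        line_items=("InvoiceNo", "size"),
        cancelled_items=("IsCancelled", "sum"),
        purchased_value=("PurchasedValue", "sum"),
        cancelled_value=("CancelledValue", "sum"),
    )
    customers["cancellation_rate"] = (
        customers["cancelled_items"] / customers["line_items"]
    )

    # Illustrative rules: "high value" means at or above the median purchase
    # value; "high risk" means a cancellation rate at or above the threshold.
    high_value = (
        customers["purchased_value"] >= customers["purchased_value"].median()
    ).to_numpy()
    high_risk = (customers["cancellation_rate"] >= rate_threshold).to_numpy()
    customers["segment"] = np.select(
        [high_value & high_risk, high_risk, high_value],
        ["high-value / high-risk", "low-value / high-risk", "high-value / low-risk"],
        default="low-value / low-risk",
    )
    return customers


# Toy data for illustration only.
orders = pd.DataFrame(
    {
        "CustomerID": [1, 1, 1, 2, 2, 3, 3, 3],
        "InvoiceNo": ["1001", "1002", "C1003", "1004", "1005", "1006", "C1007", "C1008"],
        "Quantity": [10, 4, -4, 1, 2, 1, -1, -3],
        "UnitPrice": [5.0, 5.0, 5.0, 3.0, 3.0, 2.0, 2.0, 2.0],
    }
)
segments = segment_customers(orders)
print(segments[["purchased_value", "cancellation_rate", "segment"]])
```

The "high-value / high-risk" group corresponds to the high-value customers with frequent cancellations that a retention effort would likely look at first.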
This analysis helps answer:
- Which products contribute most to cancellation revenue loss?
- Which customer segments are high-risk?
- What is the financial impact of cancellations?
- Where should retention strategies be prioritized?
Visualizations:
- Confusion Matrix (if a predictive model is applied)
- Revenue Impact Bar Charts
- Cancellation Distribution Charts
- Segment-Level Risk Heatmaps
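A revenue impact bar chart of the kind listed above might be drawn along these lines. This is only a sketch: `plot_revenue_impact` is a hypothetical helper, the input is assumed to be a Series of revenue lost per stock code, and the values shown are toy numbers rather than results from the dataset.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the script also runs headless
import matplotlib.pyplot as plt
import pandas as pd


def plot_revenue_impact(revenue_lost: pd.Series, top_n: int = 10):
    """Horizontal bar chart of the products with the largest cancellation revenue."""
    top = revenue_lost.sort_values(ascending=False).head(top_n)
    # Reverse the order so the largest bar is drawn last and appears at the top.
    labels = [str(code) for code in top.index][::-1]
    values = top.to_numpy()[::-1]

    fig, ax = plt.subplots(figsize=(8, 4))
    ax.barh(labels, values)
    ax.set_xlabel("Revenue lost to cancellations")
    ax.set_ylabel("Stock code")
    ax.set_title(f"Top {len(top)} products by cancellation revenue impact")
    fig.tight_layout()
    return fig, ax


# Toy values for illustration only.
revenue_lost = pd.Series({"A": 30.0, "B": 20.0, "C": 55.0, "D": 5.0})
fig, ax = plot_revenue_impact(revenue_lost, top_n=3)
```

Calling `fig.savefig("revenue_impact.png")` afterwards would write the chart to disk; the file name here is arbitrary.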
Future work:
- Build a predictive cancellation model
- Revenue-weighted churn scoring
- Time-series cancellation forecasting
- Customer lifetime value analysis
- Risk-based intervention strategies