This project segments mall customers by their demographics and spending behavior using K-means clustering. The analysis covers data cleaning, data visualization, and clustering to identify distinct customer segments.
The dataset used in this project is "Mall_Customers.csv", containing information about customers including their age, gender, annual income, and spending score.
- Python 3.x
- pandas
- numpy
- matplotlib
- scikit-learn
- gap-statistic
You can install the required Python libraries using pip:
pip install pandas numpy matplotlib scikit-learn gap-statistic
- Make sure you have Python and the required libraries installed.
- Clone the repository or download the script "mall_customer_segmentation.py" along with the dataset "Mall_Customers.csv".
- Run the script using any Python IDE or execute it from the command line:
python mall_customer_segmentation.py
- The script will perform data visualization, clustering, and display the results.
- The elbow method is used to estimate the optimal number of clusters.
- Silhouette analysis is used to validate the resulting clusters.
- The gap statistic provides a second estimate of the optimal number of clusters.
- The clustering results are visualized using PCA.
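
As a rough illustration of the elbow, silhouette, and PCA steps listed above, here is a minimal sketch using scikit-learn. It is not the project's actual script: the data is synthetic (generated with `make_blobs` as a stand-in for the numeric customer features), and the range of `k` values, the variable names, and the choice to pick `k` by the highest silhouette score are all assumptions made for the example. The gap statistic is left out.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for three numeric customer features
# (e.g. age, annual income, spending score). Not the real dataset.
X, _ = make_blobs(n_samples=200, centers=4, n_features=3, random_state=0)

# K-means is distance based, so put all features on a comparable scale.
X_scaled = StandardScaler().fit_transform(X)

# Candidate cluster counts (an assumed range for illustration).
k_values = range(2, 9)
inertias = {}      # within-cluster sum of squares, used for the elbow plot
silhouettes = {}   # mean silhouette coefficient, used for validation

for k in k_values:
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_scaled)
    inertias[k] = model.inertia_
    silhouettes[k] = silhouette_score(X_scaled, model.labels_)

for k in k_values:
    print(f"k={k}  inertia={inertias[k]:.1f}  silhouette={silhouettes[k]:.3f}")

# One simple selection rule: take the k with the highest silhouette score.
best_k = max(silhouettes, key=silhouettes.get)

# Fit the final model and project the data to 2D with PCA for plotting.
final_model = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit(X_scaled)
coords = PCA(n_components=2).fit_transform(X_scaled)
labels = final_model.labels_
```

To draw the elbow curve, plot `inertias` against `k_values` with matplotlib and look for the point where the decrease flattens. `coords[:, 0]` and `coords[:, 1]` can be passed to `plt.scatter` with `c=labels` to show the clusters in the PCA plane.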
[MOHAMMAD FARHAAN ALI]
This project is licensed under the [MIT License]