Maven Music, a music streaming service, has been experiencing higher-than-usual customer churn in recent months. As a newly hired Jr. Data Scientist, your role is to investigate the churn problem by analyzing customer data and preparing it for predictive modeling.
The project involves working with two key data sources:
- Customer subscription details
- Music listening history
The objective is to gather, clean, explore, and engineer features that will help explain churn patterns and provide a foundation for future machine learning models.
- Revisit the project scope β Clarify the problem statement and goals.
- Data Gathering β Load customer and listening history data into Python.
- Data Cleaning β
- Convert data types
- Resolve data quality issues
- Handle missing values
- Create new useful columns
- Exploratory Data Analysis (EDA) β
- Explore each dataset individually
- Join datasets for deeper insights
- Identify patterns related to churn
- Feature Engineering β
- Build non-null, numeric features
- Create potential predictors of churn (e.g., listening behavior, subscription details, activity levels)
- Visualization & Insights β
- Generate plots to reveal trends
- Interpret churn drivers
- Data Preparation for Modeling β
- Produce a clean DataFrame suitable for machine learning
- Ensure consistency and reproducibility