Churn prediction models support customer relationship management by identifying customers who are likely to discontinue a service, so that they can be retained. This study aims to identify the variables most relevant to predicting customer churn. Secondary data on 7403 customers were obtained from Kaggle, covering customer demographics, subscribed services, and account information. The dataset contained 19 explanatory variables together with a binary churn label for each customer. Decision tree algorithms were used to develop a classification model for customer churn from this dataset, and the performance of the resulting models was compared across varying proportions of training and testing data. The results revealed that the CART decision tree gave the best overall performance, with an accuracy of 81.8%. The results also showed that increasing the proportion of data used for training improved the accuracy of the predictive model. The study concluded that the CART algorithm selected a small but relevant subset of variables for classifying customer churn.
Keywords: Customer Churn, Machine Learning, Classification Modeling, Decision Trees
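
As a rough illustration of the procedure summarized above, the following minimal sketch fits a one-level, CART-style split chosen by Gini impurity and reports test accuracy for several training proportions. It is not the study's implementation: the data are synthetic, the single `tenure` feature, the churn probabilities, the 60/70/80 percent splits, and all function names are assumptions made for illustration, and a full CART tree would apply the split search recursively over all 19 variables.

```python
import random


def gini(labels):
    """Gini impurity of a list of binary labels (0 = stays, 1 = churns)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 1.0 - p * p - (1.0 - p) * (1.0 - p)


def fit_stump(xs, ys):
    """Find the threshold on one numeric feature with the lowest weighted Gini.

    Returns (threshold, left_prediction, right_prediction).
    """
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if best is None or score < best[0]:
            # Each side of the split predicts its majority class.
            left_pred = int(sum(left) * 2 >= len(left)) if left else 0
            right_pred = int(sum(right) * 2 >= len(right)) if right else 0
            best = (score, t, left_pred, right_pred)
    return best[1:]


def predict(stump, x):
    t, left_pred, right_pred = stump
    return left_pred if x <= t else right_pred


def accuracy_for_split(xs, ys, train_percent):
    """Train on the first train_percent of rows, test on the remainder."""
    n_train = len(xs) * train_percent // 100
    stump = fit_stump(xs[:n_train], ys[:n_train])
    test = list(zip(xs[n_train:], ys[n_train:]))
    hits = sum(predict(stump, x) == y for x, y in test)
    return hits / len(test)


# Synthetic, hypothetical data: short-tenure customers churn more often.
rng = random.Random(0)
tenure = [rng.uniform(0, 72) for _ in range(500)]
churn = [int(rng.random() < (0.7 if t < 12 else 0.15)) for t in tenure]

# Compare accuracy across training proportions (assumed values).
results = {pct: accuracy_for_split(tenure, churn, pct) for pct in (60, 70, 80)}
for pct, acc in results.items():
    print(f"train {pct}% / test {100 - pct}%: accuracy = {acc:.3f}")
```

On real data, the rows would normally be shuffled before splitting, and a library implementation of CART would replace the single-split stump used here.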