Application of data mining in telecommunication industry

This paper applies a data mining model in the sales and marketing department of the Telecommunication Industry (TI) in Nigeria. The work is motivated by competitive challenges facing most TI sales and marketing departments globally, such as the inability to gain a precise view of targeted data, the inability to translate and formulate business questions correctly, and the problem of addressing data quality. The aim of this research is to develop and implement a model that can be used to retain existing customers, attract new ones, and effectively manage and allocate resources, goods and services in TI. The data mining techniques used were classification, association, sequence discovery, visualization and prediction. The tools used to implement the model were PHP, JavaScript, CSS and HTML. The Telecommunication Service Providers (TSP) considered were Mobile Telephone Networks (MTN), GlobaCom (GLO), Airtel and emerging telecommunication markets (EMTs), also known as Etisalat. Three products of the sales and marketing department of TI were considered: Airtime, Electronic Recharge (e-top up) and SIM card sales. The training data used for exploratory analysis of the model cover 2008 to 2015 (eight years) and were collected from the historical sales records of EMTs. The data were cleaned and transformed. The enhanced system was achieved through implementation of the model, which proved to be more efficient than the existing system. The implemented model was able to extract relevant information from the TI database and make sales forecasts for the subsequent year. The system is therefore recommended for use by the TI to enhance productivity.


INTRODUCTION
Information and Communication Technology (ICT) has made business so easy that many now describe the world as a global village. According to Uduchukwu (2013), it is a widely known fact that life in the world today has been made easier through ICT. Therefore, adoption of ICT by organizations and industries will extensively improve the standard of their operations. Generally, organizations, including Telecommunication Industries (TI), face several challenges in analyzing their large volumes of data in order to extract meaningful information that will enhance their decision making. Given these challenges, it is imperative to adopt an appropriate modeling tool (data mining) to support the operations of these organizations, specifically the TI, to enable them to achieve their statutory goal, which is 'profit making'.
*Corresponding author. E-mail: drezeudokaf@yahoo.com. Tel: 08037723003.
Author(s) agree that this article remain permanently open access under the terms of the Creative Commons Attribution License 4.0 International License.
Data mining can be seen as the exploration and analysis of large quantities of data in order to discover meaningful patterns and rules (Meens, 2012).
According to Zentut (2011) and Rijmenam (2014), several major data mining techniques have been developed and used in recent data mining projects, such as classification, clustering, regression, association rule learning, sequence discovery, prediction and visualization (Oyeniyi and Adeyemo, 2015). The data mining techniques used in this research are classification, association, sequence discovery and prediction. Telecommunication companies utilize data mining to improve their sales and marketing operation strategies.
The aim of the work is to develop and implement a data mining model in the sales and marketing department of TI to enable them to discover meaningful patterns and rules that will enhance their decision making. The specific objectives of the research are to: develop a data mining model for data analysis, implement the model developed, and analyze the organizational sales data (data from customers) to extract useful information.
This work considered four telecommunication service providers (TSP) in Nigeria, namely Etisalat, MTN, AIRTEL and GLO.
All the TSP in Nigeria offer similar products and services but package them in different ways. The operations within the companies are almost the same. The major source of revenue for these TSP is sales made through their distribution partners located in all their regions. In the sales and marketing department of TI, three products were considered: airtime, electronic recharge (e-top up) and SIM card sales. The training data analyzed cover eight years (2008 to 2015). From the model results, one can mine information such as: a precise view of sales in a particular region of the country over a specified period, total sales in a region over a specified period, the region with the highest sales in any of the three products considered, and the year with the highest number of sales. Appropriate application of this research work will also assist the organization in forecasting future outcomes, such as which region is likely to make the highest sales or profit in the next two or three years, and which product is likely to have the highest sales in future.

REVIEW OF RELATED WORKS
Oyeniyi and Adeyemo (2015) developed a data mining model for customer churn analysis in the banking sector. Simple K-Means was used for the clustering phase, while a rule-based algorithm, JRip (an implementation of RIPPER, Repeated Incremental Pruning to Produce Error Reduction), was used for the rule generation phase. The data were analyzed using the Waikato Environment for Knowledge Analysis (WEKA). Performance evaluation of the applied data mining techniques was carried out to test the goodness of fit and adequacy of the constructed models in churn and non-churn prediction and analysis. Model validation and performance evaluation showed that the applied model could accurately predict churn and non-churn customers.
Also, Fashoto et al. (2013) researched the application of data mining to fraud detection in the National Health Insurance Scheme (NHIS) in Nigeria. They applied the Knee-Point K-means clustering algorithm to detect fraudulent claims in the NHIS. The aim of the model is to produce results that are easy to interpret, making use of a visualization tool that provides high levels of understanding and trust.
Cortes and Pregibon (2001) developed signature-based methods that were applied to data streams of call detail records for fraud detection in TI. Their work generated a signature from a data stream of call detail records to concisely describe the calling behavior of customers, and then used anomaly detection to measure how unusual a new call is relative to a particular account. The signature-based model was tested on France Telecom, AT&T and SBC databases of 29, 26 and 25 terabytes, respectively, and was successful, but the model has a few drawbacks. First, the signature-based method cannot handle fraud incidents that do not follow the profiles. Second, these systems require upgrading to keep them current with new fraud methods. Customer-level data such as price plan and credit rating information also help in fraud analysis (Alves et al., 2006). More recent work using signatures has adopted dynamic clustering as well as deviation detection to detect fraud (Kantardzic, 2011). Clustering has the drawback of not producing a single output variable that leads to easy conclusions; instead, the analyst must observe the output and draw conclusions from it (Cox et al., 1997).
Another method for detecting fraud exploits human pattern recognition skills. Cortes and Pregibon (2001) built a suite of tools for visualizing data, tailored to show calling activity in such a way that unusual patterns are easily detected by users. These tools were then used to identify international calling fraud.
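The signature idea reviewed above can be illustrated with a minimal Python sketch. It is not the method of Cortes and Pregibon (2001); it assumes, for illustration only, that a signature is an exponentially weighted running mean and variance of a single call attribute (call duration in seconds), and that unusualness is the distance of a new call from that mean in standard deviations. The class name `CallSignature`, the smoothing factor `alpha` and the sample durations are all hypothetical.

```python
import math


class CallSignature:
    """Hypothetical per-account signature: exponentially weighted mean and
    variance of one call attribute (assumed here to be duration in seconds)."""

    def __init__(self, alpha: float = 0.2) -> None:
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.mean = None  # no calls observed yet
        self.var = 0.0

    def score(self, value: float) -> float:
        """Distance of a new call from the signature, in standard deviations."""
        if self.mean is None or self.var == 0.0:
            return 0.0  # not enough history to judge
        return abs(value - self.mean) / math.sqrt(self.var)

    def update(self, value: float) -> None:
        """Fold a new call into the signature, giving recent calls more weight."""
        if self.mean is None:
            self.mean = value
            return
        diff = value - self.mean
        self.mean += self.alpha * diff
        self.var = (1.0 - self.alpha) * (self.var + self.alpha * diff * diff)


# Illustrative (made-up) call durations for one account.
signature = CallSignature(alpha=0.2)
for duration in [60.0, 62.0, 58.0, 61.0, 59.0, 60.0]:
    signature.update(duration)

typical = signature.score(61.0)   # close to past behavior
unusual = signature.score(600.0)  # far from past behavior
print(f"typical call score: {typical:.2f}, unusual call score: {unusual:.2f}")
```

In such a scheme a call would be scored before the signature is updated, so that a suspicious call does not immediately become part of the account's normal profile.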

METHODOLOGY
The software methodology used in the analysis and design of the

SYSTEM ANALYSIS AND DESIGN
This section presents the system analysis and design of the existing and proposed systems, and the data mining model in the form of an algorithm.

Problems of the existing system
1) The difficulty decision makers face in gaining a precise view of a target area within voluminous collated business transaction data.
2) Difficulty in information retrieval and data analysis.

Data flow diagram (DFD)
Figure 2 shows the business operations of the proposed system. The conventional way of accessing and analyzing the records in the repository is replaced by the proposed data mining approach.

Use case diagram of the system
Figure 3 presents the relationship between the main entities (actors) and the system, that is, actors and activities they perform.

System architecture
Figure 4 illustrates the operational 3-tier framework of the proposed system. The diagram shows the communication between the users (the interface), sales records, database storage and the data mining approach, with respect to the server infrastructure that controls all the entities. The graphical user interface (presentation tier) is responsible for the interaction between the system and humans. The users are the top management and the analyst. The programming tools used to design the interface were HTML and CSS.
The server infrastructure enables privileged users to communicate with and access the dynamic web document. It includes the middle tier and the data tier. The middle tier connects the data tier and the presentation tier via programming languages; PHP and JavaScript were used to achieve this. The data tier stores data for the application, and MySQL was used to query the database. The data mining approach involves applying the data mining model to the sales data in order to carry out analysis.

Database design, input, output
This deals with the designs of the software implementation.

Database design
This section shows the database tables that store the data used in the work. Some of the tables are as shown. Table 1 stores the admin/user login details with respect to their levels of permission. SIM card sales are stored in Table 2. Airtime and e-top-up sales are captured by tables with slight differences in field names.
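To make the table layout concrete, the following is a minimal Python sketch of how the login table and the SIM card sales table might be declared. The field names are not given in this section, so every table and column name below is a hypothetical assumption, and Python's built-in `sqlite3` module is used only as a stand-in for the MySQL database of the actual system.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# not the field names used in the original system (which ran on MySQL).
SCHEMA = """
CREATE TABLE users (
    user_id          INTEGER PRIMARY KEY,
    username         TEXT NOT NULL UNIQUE,
    password_hash    TEXT NOT NULL,
    permission_level INTEGER NOT NULL   -- level of permission for this login
);
CREATE TABLE sim_card_sales (
    sale_id      INTEGER PRIMARY KEY,
    region       TEXT NOT NULL,
    year         INTEGER NOT NULL,
    sales_value  REAL NOT NULL,
    target_value REAL NOT NULL
);
"""


def create_database() -> sqlite3.Connection:
    """Create an in-memory database holding the two sketched tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


conn = create_database()
tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(tables)
```

Under these assumptions, airtime and e-top-up tables would follow the same pattern as `sim_card_sales`, differing only in a few field names.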

Input design
Input design is used by the system to capture information from the external environment, e.g. the new user registration template, airtime sales registration, mine airtime sales, the new SIM card registration template and E-Top Up sales. Figure 5 is used to capture a new user. Figure 6 captures Airtime sales; E-Top Up sales has almost the same template. Airtime sales data is analyzed using Figure 7; SIM card and e-top up have a similar template. New SIM Card Registration is captured in Figure 8.

Output design
The system uses this module to convey information such as results and acknowledgement receipts to users. Figure 9 displays the output of the new user registration.
Figure 10 displays the result of airtime sales data analysis for a specified period. The E-top up and SIM card sales mining outputs use the same template.

System algorithm, flowchart and data mining model
Algorithm design is of utmost importance in software development: it simplifies the job of the programmer, informs each next step in the conception of a program, and guides the realization of the entire program. The algorithms were developed from the knowledge gained in reviewing the data mining techniques applied in this study. Together they form the data mining model developed for the data analysis (Appendix).

System implementation
This section shows some screenshots of the research output and an exploratory analysis of the system implementation. It presents a few illustrative analyses that can be done with the data mining model.
Figure 11 shows the general products report analysis table. This table contains the analysis of all sales made from 2008 to 2014 for all the products considered. Figure 12 shows the bar chart interpretation of the sales analysis shown in Figure 11. Figure 13 is an interface where one can analyze and manage sales data for all the products considered in the work. With "Mine sales by components", selecting a component (airtime, E-top up or SIM card) and a range of years (such as 2008-2009, 2008-2010, 2009-2012 or 2008-2015) displays the corresponding records. Clicking on a graph type (pie chart, bar chart or line graph) displays the result in a form that is easy to interpret. The displayed report is then analyzed and a prediction is made.
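The "mine sales by components" operation described above amounts to filtering sales by product and year range and aggregating by region. The Python sketch below illustrates one way this could be expressed as a SQL query. It is an assumption-laden illustration, not the system's PHP/MySQL code: the `sales` table, its columns, the function `mine_sales` and the sample rows are all hypothetical, and `sqlite3` stands in for MySQL.

```python
import sqlite3


def mine_sales(conn, product, start_year, end_year):
    """Total sales per region for one product over an inclusive year range,
    ordered from the highest-selling region to the lowest."""
    query = """
        SELECT region, SUM(sales_value) AS total
        FROM sales
        WHERE product = ? AND year BETWEEN ? AND ?
        GROUP BY region
        ORDER BY total DESC
    """
    return conn.execute(query, (product, start_year, end_year)).fetchall()


# Hypothetical table and made-up rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (product TEXT, region TEXT, year INTEGER, sales_value REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("airtime", "Region A", 2008, 100.0),
        ("airtime", "Region A", 2009, 150.0),
        ("airtime", "Region B", 2008, 80.0),
        ("airtime", "Region B", 2011, 500.0),  # outside the queried range
        ("sim_card", "Region A", 2008, 40.0),  # different product
    ],
)

result = mine_sales(conn, "airtime", 2008, 2010)
print(result)
```

The parameterized query keeps the user's selections (component and year range) out of the SQL string itself, and the resulting rows could be passed directly to a charting component.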
Figure 14 shows a snapshot of the mined airtime sales records for all regions (showing the region of sales, the value obtained and the target given) from 2008 to 2010. Figures 15 and 16 show, respectively, the bar chart interpretation and the analysis report (the airtime sales forecast for 2011).
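The forecasting method behind the 2011 figure is not specified in this section. As one possible illustration, the Python sketch below fits an ordinary least-squares straight line to yearly totals and extrapolates it one year ahead. This is a swapped-in technique chosen for simplicity, not necessarily the one used by the system; the function name `linear_trend_forecast` and the sample values are hypothetical.

```python
def linear_trend_forecast(years, values, target_year):
    """Fit a least-squares line through (year, value) points and
    evaluate it at target_year."""
    n = len(years)
    if n < 2 or n != len(values):
        raise ValueError("need at least two (year, value) pairs of equal length")
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    if sxx == 0:
        raise ValueError("years must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * target_year


# Made-up yearly totals (arbitrary units), for illustration only.
forecast_2011 = linear_trend_forecast([2008, 2009, 2010], [10.0, 12.0, 14.0], 2011)
print(forecast_2011)
```

With only three yearly points, such a trend line is a rough indication at best; a longer history or a seasonal model would be needed for a more dependable forecast.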
All product data target: Figure 17 shows, for all the products, the target, the sales made and the difference between them, which determines whether the target was actualized or not for each year of business.
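The target comparison described above can be sketched in a few lines of Python. The rule that a target counts as actualized when sales meet or exceed it is an assumption, since the exact criterion is not stated here; the class `TargetReport` and the sample figures are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetReport:
    """One row of a hypothetical target report for a given year."""

    year: int
    target: float
    sales: float

    @property
    def difference(self) -> float:
        """Sales minus target; negative when the target was missed."""
        return self.sales - self.target

    @property
    def actualized(self) -> bool:
        # Assumption: a target is actualized when sales meet or exceed it.
        return self.sales >= self.target


# Made-up figures, for illustration only.
reports = [
    TargetReport(year=2008, target=100.0, sales=120.0),
    TargetReport(year=2009, target=100.0, sales=90.0),
]
for report in reports:
    status = "actualized" if report.actualized else "not actualized"
    print(report.year, report.difference, status)
```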
Figure 19 shows the analysis report for Figures 17 and 18 with respect to regions. From this analysis, it can be deduced that: (1) the maximum sales were made in Lagos-North in 2009, amounting to N27,359,689,087.00; (2) the minimum sales were made in North2 in 2008, amounting to N1,716,116,709.00; and (3) the maximum and minimum targets can be seen in the analysis. Based on the sales trend, the organization can predict that Lagos-North sales will increase in 2010.
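Finding the maximum and minimum sales in such a report is a simple scan over the records, as the Python sketch below illustrates. Only the Lagos-North 2009 and North2 2008 amounts come from the analysis above; the other two rows are hypothetical placeholders added so that the example has something to compare against, and the function name `sales_extremes` is an assumption.

```python
def sales_extremes(records):
    """Return the (region, year, sales) records with the highest and
    lowest sales value."""
    if not records:
        raise ValueError("records must not be empty")
    highest = max(records, key=lambda record: record[2])
    lowest = min(records, key=lambda record: record[2])
    return highest, lowest


records = [
    ("Lagos-North", 2009, 27_359_689_087),  # maximum reported in the text
    ("North2", 2008, 1_716_116_709),        # minimum reported in the text
    ("Region X", 2008, 5_000_000_000),      # hypothetical placeholder
    ("Region Y", 2009, 9_000_000_000),      # hypothetical placeholder
]

highest, lowest = sales_extremes(records)
print("maximum:", highest)
print("minimum:", lowest)
```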

CONCLUSION
In this work, a data mining model has been developed and implemented to enhance the operations of the sales and marketing department [...] organizational large and complex datasets and reveal relationships and trends hidden in the geospatial data. The model offers a wide range of graphs, charts and techniques for describing relationships in data and for knowledge acquisition, which addresses the present data analysis needs of the sales and marketing departments of TSP. Based on these benefits, it is recommended that the new system be adopted by the TSP in Nigeria.

Figure 1 shows a summary of the research work in a block diagram.

Figure 1. Conceptual framework of the study.

Figure 2 .
Fig. 1.1: Conceptual Framework of the Study

Figure 3. Use case of the proposed system.
Figure 5 is used to capture a new user. Figure 6 captures Airtime sales. E-Top Up sales has almost

Figure 4.
Fig. 4.1: Business operation in the proposed system

Figure 8. New SIM Card Registration Template.

Figure 9. Sample output for User Registration.

Figure 10. Sample output of airtime sales mining.

Figure 11. General products report analysis table.

Figure 13. Mining component by classification interface.

Figure 14. Result of Airtime sales department from 2008 to 2010.

Figure 16. 2008 to 2010 Analysis report and 2011 Airtime sales forecast.

Figure 19. Analysis Report for Figures 17 and 18.