Modification of the adaptive Nadaraya-Watson kernel regression estimator

The Nadaraya-Watson (NW) kernel regression estimator is a widely used and flexible nonparametric estimator of a regression function, and it is often obtained by using a fixed bandwidth. Several studies have shown that adaptive kernel estimators with varying bandwidths perform better. In this paper, a new improvement of the NW kernel regression estimator is proposed, in which the bandwidth is obtained from the range of the observations. A simulated example is presented, including comparisons with three other NW estimators. The performance of the proposed estimator is evaluated via the mean squared error (MSE) criterion. The results of the simulation study are promising: the modified NW estimator performs well in all cases considered.


INTRODUCTION
In many statistical problems, nonparametric regression techniques are commonly used to describe the relationship between a response variable and some covariates. Let $(x_i, y_i)$, $i = 1, \dots, n$, be a random sample of bivariate data of size $n$. The nonparametric regression model is defined as

$$y_i = m(x_i) + \varepsilon_i, \qquad i = 1, \dots, n, \tag{1}$$

where $m(\cdot)$ is the unknown regression function and the $\varepsilon_i$ are independent random errors with zero mean and variance $\sigma^2$.
Nonparametric regression techniques are weighted averages of the response variable, where the weights depend on the technique and on the distance between the observations of the explanatory variable, scaled by a smoothing parameter. One of these techniques is the Nadaraya-Watson (NW) kernel estimator. It is more flexible than the other nonparametric methods and provides accurate predictions of the observations. The NW kernel estimator was first proposed by Nadaraya (1964) and Watson (1964). Nadaraya (1964) introduced the NW estimator as an approximation to the regression curve based on empirical data, and studied the properties of the estimator as the sample size increases to infinity. Watson (1964) presented the NW estimator as a simple computer method for obtaining a "graph" from a large number of observations.
The NW kernel estimator depends on one parameter, called the bandwidth $h$, which controls the amount of smoothing: a large $h$ produces a smooth estimate (Wand and Jones, 1995). The bandwidth of the NW kernel estimator can be fixed or variable, and the choice of the optimal bandwidth is a critical issue. The optimal bandwidth is the value that minimizes the mean integrated squared error (MISE), which is obtained by integrating the mean squared error (MSE). Several methods of selecting $h$ can be used; Silverman (1986), Wand and Jones (1995) and Härdle et al. (2004) discuss bandwidth selection methods in detail. One of these methods is least squares cross-validation, also called unbiased cross-validation; Scott and Terrell (1987) discussed it and presented a relationship between biased and unbiased cross-validation.

A variable bandwidth should be used rather than a fixed bandwidth in the case of long-tailed or multimodal distributions. Abramson (1982) suggested the inverse-square-root rule for the bandwidth $h$ of a variable-kernel density estimator, which reduces the bias relative to the fixed-bandwidth estimator even when a nonnegative kernel is used. Silverman (1986) discussed kernel density estimation exhaustively. He gave details about the assumptions on the kernel weight and the properties of the estimator, such as bias and variance. In addition, he proposed an adaptive kernel estimator for nonparametric density estimation in which the bandwidth varies. Demir and Toktamiş (2010) considered the adaptive Nadaraya-Watson (ANW) kernel regression estimators as a way to estimate the regression function. The results of their simulation study showed that the NW kernel estimator performs better when the local bandwidth factor is evaluated using the arithmetic mean instead of the geometric mean. Their results also agree with previous studies in that the NW kernel estimator with variable bandwidths is better than the fixed NW kernel estimator.
The purpose of this paper is to propose a new modification of the NW kernel regression estimator. The bandwidth of our modification is obtained by using the range of the probability density function of $x$. The idea behind our modification is that increasing the local bandwidth factor, and thus the bandwidth, yields better performance of the NW kernel estimator. In what follows, several types of NW estimators, including our modified NW kernel estimator, are presented, and a brief description of the MSE criterion is given. Finally, a simulation study is conducted and concluding remarks are given.

METHODS
An important factor that has a great impact on the smoothing results is the choice of the bandwidth or smoothing parameter $h$. Here, different Nadaraya-Watson kernel estimators are presented according to the selected type of bandwidth.

Aljuhani and Al turk 967

Fixed Nadaraya-Watson kernel estimator
The bandwidth can be chosen to be constant over the whole range of $x$; this choice is suitable when the unknown regression function behaves similarly over the whole estimation range. The NW kernel estimator is often obtained with a fixed bandwidth and is defined as

$$\hat{m}(x) = \frac{\sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right) y_i}{\sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right)}, \tag{2}$$

where $h > 0$ is the fixed bandwidth and $K(\cdot)$ is the kernel function, which satisfies the following conditions (Silverman, 1986):

$$\int K(u)\,du = 1, \qquad \int u\,K(u)\,du = 0, \qquad \int u^{2} K(u)\,du = k_2 \neq 0.$$

Several kernel functions have been proposed in the literature. The Gaussian kernel is one of the most commonly used in practice (Härdle, 1990; Silverman, 1986) and is defined as

$$K(u) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{u^{2}}{2}\right). \tag{3}$$

The fixed bandwidth can be selected by various methods, such as Silverman's rule of thumb and cross-validation. In this paper, least squares cross-validation (LSCV) is used because it is simple to evaluate and can be applied in any regression model. The LSCV minimizes the integrated squared error (ISE) rather than the MISE (Scott and Terrell, 1987), where the MISE is the average of the ISE, and the ISE measures the distance between the fitted density and the true density $f$:

$$\mathrm{ISE}(h) = \int \left[\hat{f}_h(x) - f(x)\right]^{2} dx, \tag{4}$$

and $\hat{f}_h(x) = \frac{1}{nh}\sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right)$ is the kernel density estimator.

968 Sci. Res. Essays

The bandwidth $h_{\mathrm{LSCV}}$ is the value that minimizes the LSCV criterion, which is defined as

$$\mathrm{LSCV}(h) = \int \hat{f}_h^{\,2}(x)\,dx - \frac{2}{n}\sum_{i=1}^{n} \hat{f}_{h,-i}(x_i), \tag{5}$$

where $\hat{f}_{h,-i}$ is the leave-one-out kernel density estimator, which is computed from the remaining $n - 1$ observations and is defined by

$$\hat{f}_{h,-i}(x_i) = \frac{1}{(n-1)h}\sum_{j \neq i} K\!\left(\frac{x_i - x_j}{h}\right). \tag{6}$$
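The fixed-bandwidth estimator and the LSCV bandwidth search described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names (`nw_fixed`, `lscv_score`, `lscv_bandwidth`) and the grid search over candidate bandwidths are assumptions made here, and the closed form used for the integral of the squared density estimate holds only for the Gaussian kernel.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard Gaussian kernel K(u) = exp(-u^2 / 2) / sqrt(2 * pi)."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def nw_fixed(x_eval, x, y, h):
    """Fixed-bandwidth NW estimate of the regression function at x_eval."""
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # weights[j, i] = K((x_eval[j] - x[i]) / h)
    weights = gaussian_kernel((x_eval[:, None] - x[None, :]) / h)
    return weights @ y / weights.sum(axis=1)

def lscv_score(x, h):
    """LSCV(h) = int f_hat^2 dx - (2 / n) * sum_i f_hat_{-i}(x_i)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    diffs = x[:, None] - x[None, :]
    # Gaussian kernel only: int f_hat^2 is a double sum of N(0, 2 h^2) densities.
    s = np.sqrt(2.0) * h
    integral_sq = gaussian_kernel(diffs / s).sum() / (n**2 * s)
    k = gaussian_kernel(diffs / h)
    np.fill_diagonal(k, 0.0)  # drop the i == j terms (leave-one-out)
    loo = k.sum(axis=1) / ((n - 1) * h)
    return integral_sq - 2.0 * loo.mean()

def lscv_bandwidth(x, grid):
    """Return the candidate bandwidth in `grid` with the smallest LSCV score."""
    grid = np.asarray(grid, dtype=float)
    scores = np.array([lscv_score(x, h) for h in grid])
    return grid[int(np.argmin(scores))]
```

A typical call would pick `h = lscv_bandwidth(x, np.linspace(0.02, 0.5, 25))` and then evaluate `nw_fixed(x_grid, x, y, h)` on a grid of points; the grid limits are illustrative and should be adapted to the scale of the data.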

Variable Nadaraya-Watson kernel estimator
The fixed NW kernel estimator is not a good choice for multivariate, long-tailed, or multimodal distributions. The multivariate problem can be handled by increasing the sample size, but for long-tailed and multimodal distributions, varying the bandwidth is recommended. The estimator based on a varying bandwidth is called the variable NW kernel estimator and has the following form:

$$\hat{m}(x) = \frac{\sum_{i=1}^{n} \frac{1}{h_i} K\!\left(\frac{x - x_i}{h_i}\right) y_i}{\sum_{i=1}^{n} \frac{1}{h_i} K\!\left(\frac{x - x_i}{h_i}\right)}, \tag{7}$$

where $h_i$ is the variable bandwidth. Abramson (1982) gave the following formula to compute $h_i$:

$$h_i = \frac{h}{\sqrt{f(x_i)}},$$

where $f(x)$ is the probability density function of $x$, which can be estimated by the kernel density estimator.
The variable bandwidth can be obtained by the algorithm presented in Silverman (1986) for the Abramson-style estimator, which he referred to as an adaptive kernel estimator. In the first step, a prior kernel estimate with fixed bandwidth, denoted by $\tilde{f}(x)$, is obtained. Then the local bandwidth factor $\lambda_i$ is defined as

$$\lambda_i = \left[\frac{\tilde{f}(x_i)}{g}\right]^{-\alpha}, \tag{8}$$

where $g$ is the geometric mean of the $\tilde{f}(x_i)$, $\log g = n^{-1}\sum_{i=1}^{n} \log \tilde{f}(x_i)$, and $\alpha$ is the sensitivity parameter, which satisfies $0 \le \alpha \le 1$.
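In the last step, the adaptive bandwidth is defined as

$$h_i = h\,\lambda_i. \tag{9}$$

Abramson (1982) gave the sensitivity parameter the value 0.5, since this value leads to good prediction results. The variable NW kernel estimator can then be written as follows:

$$\hat{m}(x) = \frac{\sum_{i=1}^{n} \frac{1}{h\lambda_i} K\!\left(\frac{x - x_i}{h\lambda_i}\right) y_i}{\sum_{i=1}^{n} \frac{1}{h\lambda_i} K\!\left(\frac{x - x_i}{h\lambda_i}\right)}. \tag{10}$$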
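The three steps of this algorithm (a fixed-bandwidth prior density estimate, local bandwidth factors scaled by the geometric mean, and an NW fit with per-observation bandwidths) can be sketched as follows. The function names are hypothetical, the Gaussian kernel is assumed, and the kernel weights include the factor $1/(h\lambda_i)$ that arises when the estimator is derived from a sample-point adaptive density estimate; this is one reading of the estimator, not a confirmed transcription of the authors' code.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard Gaussian kernel."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def pilot_density(x, h):
    """Step 1: fixed-bandwidth kernel density estimate at the sample points."""
    x = np.asarray(x, dtype=float)
    u = (x[:, None] - x[None, :]) / h
    return gaussian_kernel(u).sum(axis=1) / (x.size * h)

def local_factors_geometric(f_pilot, alpha=0.5):
    """Step 2: lambda_i = (f_pilot_i / g)^(-alpha), g = geometric mean."""
    f_pilot = np.asarray(f_pilot, dtype=float)
    g = np.exp(np.mean(np.log(f_pilot)))
    return (f_pilot / g) ** (-alpha)

def nw_variable(x_eval, x, y, h, lam):
    """Step 3: NW estimate with per-observation bandwidths h_i = h * lam_i."""
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    h_i = h * np.asarray(lam, dtype=float)
    # w[j, i] = K((x_eval[j] - x[i]) / h_i) / h_i
    w = gaussian_kernel((x_eval[:, None] - x[None, :]) / h_i[None, :]) / h_i[None, :]
    return w @ y / w.sum(axis=1)
```

Because the factors are scaled by the geometric mean, their own geometric mean equals one, so the global bandwidth $h$ keeps its role as the overall smoothing level while the factors only redistribute it across observations.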

Adaptive Nadaraya-Watson kernel estimator
Demir and Toktamiş (2010) modified the NW kernel estimator by using the arithmetic mean instead of the geometric mean when computing the local bandwidth factor, which is given as

$$\lambda_i^{A} = \left[\frac{\tilde{f}(x_i)}{\bar{f}}\right]^{-\alpha}, \tag{11}$$

where $\bar{f} = n^{-1}\sum_{i=1}^{n} \tilde{f}(x_i)$ is the arithmetic mean of the $\tilde{f}(x_i)$.
The ANW kernel estimator is then defined by Equation (10) with $\lambda_i$ replaced by the arithmetic-mean factor $\lambda_i^{A}$. The authors used the arithmetic mean because it is greater than or equal to the geometric mean (Lidstone, 1932), which makes $\lambda_i^{A}$ greater than or equal to $\lambda_i$. Increasing the local bandwidth factor increases the bandwidth as well, and this is intended to enhance the performance of the NW estimator. In their simulation study and real data application, they showed that the ANW kernel estimator performs better than the NW kernel estimator.
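The effect of replacing the geometric mean by the arithmetic mean can be checked numerically. The sketch below assumes the local bandwidth factor has the form (prior density / centre) raised to the power $-\alpha$; the helper name `local_factors` and the prior density values are hypothetical and serve only as an illustration.

```python
import numpy as np

def local_factors(f_pilot, center, alpha=0.5):
    """Local bandwidth factors (f_pilot_i / center)^(-alpha)."""
    return (np.asarray(f_pilot, dtype=float) / center) ** (-alpha)

# Hypothetical prior density values at four sample points.
f_pilot = np.array([0.4, 0.9, 1.3, 2.0])

g = np.exp(np.mean(np.log(f_pilot)))  # geometric mean
a = np.mean(f_pilot)                  # arithmetic mean

lam_g = local_factors(f_pilot, g)     # variable NW factors
lam_a = local_factors(f_pilot, a)     # ANW factors

# Since a >= g, every ANW factor is at least as large as its counterpart.
ratio = lam_a / lam_g
```

Under this form, the ratio of the two sets of factors is the same for every observation and equals $(\bar{f}/g)^{\alpha}$, so switching the centre rescales all local bandwidths by a common constant.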

Modified Nadaraya-Watson kernel estimator
This part of the paper is devoted to our modification, which aims to enhance the predictive ability of the NW kernel estimator by increasing the bandwidth. In the proposed NW kernel estimator, we suggest evaluating the local bandwidth factor based on the range of the observations instead of the geometric or the arithmetic mean. In most cases the range will have a larger value, particularly if the phenomenon being studied has outliers. Thus, the local bandwidth factor and the bandwidth will be increased. The modified local bandwidth factor is given as

$$\lambda_i^{R} = \left[\frac{\tilde{f}(x_i)}{R}\right]^{-\alpha}, \tag{12}$$

where $R$ is the range of the $\tilde{f}(x_i)$, that is, the difference between the largest and smallest values.

Therefore, the modified Nadaraya-Watson (MNW) kernel estimator is obtained by substituting $\lambda_i^{R}$ for $\lambda_i$ in Equation (10).
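A minimal sketch of the range-based factor and the resulting estimator is given below. It rests on two assumptions that should be checked against the original equations: the range is taken over the prior density values at the sample points (the optional argument `r` allows a different range, such as that of the observations, to be supplied), and the estimator uses the same adaptive NW form as the variable-bandwidth case. All function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard Gaussian kernel."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def pilot_density(x, h):
    """Fixed-bandwidth kernel density estimate at the sample points."""
    x = np.asarray(x, dtype=float)
    u = (x[:, None] - x[None, :]) / h
    return gaussian_kernel(u).sum(axis=1) / (x.size * h)

def local_factors_range(f_pilot, alpha=0.5, r=None):
    """Range-based factors (f_pilot_i / r)^(-alpha).

    ASSUMPTION: when r is None, the range is taken over f_pilot itself.
    """
    f_pilot = np.asarray(f_pilot, dtype=float)
    if r is None:
        r = f_pilot.max() - f_pilot.min()
    if r <= 0:
        raise ValueError("range must be positive")
    return (f_pilot / r) ** (-alpha)

def nw_adaptive(x_eval, x, y, h, lam):
    """NW estimate with per-observation bandwidths h * lam_i."""
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    h_i = h * np.asarray(lam, dtype=float)
    w = gaussian_kernel((x_eval[:, None] - x[None, :]) / h_i[None, :]) / h_i[None, :]
    return w @ y / w.sum(axis=1)
```

Whether the range-based factors are larger than the mean-based ones depends on the data: the range of a set of positive values can be smaller than their mean, so it is worth comparing the two centres on the sample at hand before relying on the larger-bandwidth argument.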

Evaluation criteria
Several criteria can be used to select the best-performing NW kernel estimator. In general, evaluation criteria are based on the distance between the observed values and the predicted values obtained from the estimated models. Here, the mean squared error (MSE) is used; the best estimator is the one with the smallest MSE value. The MSE is computed by the following formula:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n} \left[y_i - \hat{m}(x_i)\right]^{2}, \tag{13}$$

where $y_i$ are the observed values, $\hat{m}(x_i)$ are the predicted values, and $n$ is the number of observations.
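As a small illustration, the MSE criterion can be computed as below; the function name `mse` is an assumption made here, and the inputs are taken to be the observed values and the fitted values at the same sample points.

```python
import numpy as np

def mse(y, y_hat):
    """Mean squared error between observed values y and predictions y_hat."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    if y.shape != y_hat.shape:
        raise ValueError("y and y_hat must have the same shape")
    return float(np.mean((y - y_hat) ** 2))
```

For example, observed values (1, 2, 3) and predictions (1, 2, 5) differ only in the last entry, by 2, so the criterion is 4/3.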

Simulation study
Here, the performance of the proposed MNW kernel estimator is compared with that of three other NW kernel estimators (the fixed NW, the variable NW, and the ANW) through a simulation study. The explanatory variable is generated from the uniform distribution on the interval [0,1] with six different sample sizes: 25, 50, 100, 250, 300 and 600. The regression function is taken from Härdle (1990), and the random errors are normally distributed with mean 0 and variance 0.1. The fixed bandwidth is obtained by the unbiased cross-validation method. The Gaussian kernel function is used for all NW kernel estimators. For each sample size, 1000 simulation replications are used to compute the MSE criterion.
The graphs of the real regression function and the estimated regression functions, computed from samples of sizes 50, 100 and 250, are presented in Figures 1, 2, and 3. To compare the MNW kernel estimator objectively with the three selected NW kernel estimators, the comparison is based on the MSE criterion. For each sample size, the MSE values of the fixed NW, variable NW, ANW and MNW kernel estimators, all based on the Gaussian kernel function, are computed. The results for the MSE criterion are presented in Table 1.
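The structure of such a simulation can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of the study: the regression curve `m_standin` is a placeholder and is not the function used in the paper, the fixed bandwidth is passed in directly instead of being selected by cross-validation, the range is taken over the prior density values, and the default number of replications is kept small. The resulting numbers therefore should not be compared with Table 1.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard Gaussian kernel."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def nw_fit(x, y, h_i):
    """NW fitted values at the sample points; h_i is scalar or per-observation."""
    h_i = np.broadcast_to(np.asarray(h_i, dtype=float), x.shape)
    w = gaussian_kernel((x[:, None] - x[None, :]) / h_i[None, :]) / h_i[None, :]
    return w @ y / w.sum(axis=1)

def m_standin(x):
    # ASSUMPTION: placeholder curve, NOT the regression function of the paper.
    return np.sin(2.0 * np.pi * x)

def simulate(n, h, reps=20, alpha=0.5, seed=0):
    """Average in-sample MSE of the four estimators over `reps` replications."""
    rng = np.random.default_rng(seed)
    totals = {"fixed": 0.0, "variable": 0.0, "ANW": 0.0, "MNW": 0.0}
    for _ in range(reps):
        x = rng.uniform(0.0, 1.0, n)
        y = m_standin(x) + rng.normal(0.0, np.sqrt(0.1), n)  # error variance 0.1
        # Prior density estimate at the sample points (fixed bandwidth h).
        f = gaussian_kernel((x[:, None] - x[None, :]) / h).sum(axis=1) / (n * h)
        centers = {
            "variable": np.exp(np.mean(np.log(f))),  # geometric mean
            "ANW": np.mean(f),                       # arithmetic mean
            "MNW": f.max() - f.min(),                # range (assumed over f)
        }
        fits = {"fixed": nw_fit(x, y, h)}
        for name, c in centers.items():
            fits[name] = nw_fit(x, y, h * (f / c) ** (-alpha))
        for name, fit in fits.items():
            totals[name] += np.mean((y - fit) ** 2)
    return {name: total / reps for name, total in totals.items()}
```

A call such as `simulate(50, 0.16, reps=1000)` mirrors the sample size and bandwidth quoted for Figure 1; replacing `m_standin` with the regression function from Härdle (1990) and the fixed `h` with a cross-validated bandwidth would bring the sketch closer to the design described above.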

RESULTS
From the figures, the performance of the MNW is superior to that of the fixed NW, the variable NW and the ANW kernel estimators. The figures also show that, in general, the performance of all the studied NW kernel estimators improves as the sample size increases. From Table 1 we can conclude that:
1) The variable NW estimator gives noticeably better prediction results than the fixed NW estimator.
2) The ANW estimator performs better than the variable NW estimator; the same result was obtained by Demir and Toktamiş (2010).
3) Our suggested estimator gives better predictive capability in all cases.
4) All the estimators improve as the sample size increases.
In general, our study and that of Demir and Toktamiş (2010) suggest that a modification that increases the local bandwidth factor gives improved prediction results.

DISCUSSION
The Nadaraya-Watson (NW) kernel estimator is a nonparametric method for regression estimation. It is an easy and flexible method and has previously been shown to provide accurate prediction results. In this paper we have proposed a new NW kernel regression estimator as a modification of the adaptive Nadaraya-Watson kernel estimator. Our suggestion is based on enhancing the predictive ability of the ANW kernel estimator by increasing the local bandwidth factor, using the range instead of the arithmetic mean when calculating the bandwidths. In a simulation study with different sample sizes, various NW kernel regression estimators have been compared with the newly suggested kernel estimator. The MNW kernel estimator appears to be superior to the other three NW kernel estimators in all cases. It was also more stable than the other kernel estimators; therefore, according to our study, the MNW kernel estimator is recommended for estimating the regression function.

Figure 1. The real regression function and the NW kernel estimators of the regression function using sample size 50 and h = 0.16.

Figure 2. The real regression function and the NW kernel estimators of the regression function using sample size 100 and h = 0.08.

Figure 3. The real regression function and the NW kernel estimators of the regression function using sample size 250 and h = 0.06.

Table 1. The MSE values of the NW kernel estimators.