Production and prediction of major Chinese agricultural fruits using an econometric analysis and machine learning technique

This paper investigates the relationship between agricultural gross domestic product (AGDP) and the output of major fruits in China: apples, citrus, pears, grapes and bananas. The ordinary least squares (OLS) method and the augmented Dickey-Fuller (ADF) test were used to analyze the data, and the Johansen co-integration test was used to interpret the results. A machine learning technique was used to examine and predict future agricultural productivity in China. Our study found that the coefficient of apple output has a positive and significant relationship with the AGDP. The results also show that the outputs of citrus, grapes and pears have coefficients that demonstrate a positive relationship with the AGDP, while banana output bears a negative relationship with China’s AGDP and is statistically insignificant. The use of an econometric analysis together with a machine learning technique to examine the relationship between the AGDP and the output of major fruit production in China makes the current study unique. A review of the literature suggests that only limited research has been conducted in this area.


INTRODUCTION
China's population continues to grow, and China is now the world's largest food consumer. Each year, China consumes about 5 million tonnes of food and feeds about 20% of the world's population. Currently, agriculture accounts for 36% of the world's land cover area (Xiao et al., 2014; Rehman et al., 2017). In a broad sense, the cultivation of agricultural crops is important for environmental and socio-economic growth in most countries (Rajalahti et al., 2012; Rehman et al., 2017).
Developing countries face challenges and constraints regarding agricultural development and food security, both of which are required for subsistence and rural development.
Socio-economic and protective development requires robust infrastructure for innovative management systems, facilities and operations for agricultural output. It has been reported that agricultural output, together with its supporting facilities and operations, will increase by at least 70% over the next 40 years (Kilelu and Leeuwis, 2013).
Through new policies and increased financial support, China's government has made systematic efforts to accelerate the development of agricultural cooperatives. Because of these efforts, new laws governing farming cooperatives were introduced in 2007 with the goal of promoting sustainable agricultural development. The Chinese Ministry of Agriculture reported that about 25.2% of the country's farmers participated in agricultural cooperatives in 2013 (Ma and Abdulai, 2016). However, farmers are affected by high transaction costs, which limit the ability of several villages to participate in the cooperatives (Deng et al., 2010; Francesconi and Wouterse, 2015). Many studies have shown that agricultural cooperatives have been used to develop and modernize agricultural technology to improve the welfare of farmers and their families (Fischer and Qaim, 2012; Ito et al., 2012; Abebaw and Haile, 2013).
In the last few decades, China's agricultural productivity has increased rapidly; however, this has led to serious environmental and ecological problems (Ju et al., 2004; Na et al., 2016; Zhang et al., 2011; Fan et al., 2011). In addition, compared with that of developed countries, China's nitrogen use efficiency is low for its economic crops. Because large amounts of fertilizer emissions and pesticides enter the soil, atmosphere and surface water, the combination of heavy chemical fertilizer application and low nutrient use efficiency has caused serious environmental problems (Ju et al., 2009). The world recognizes the negative effects of agriculture on ecosystems and the environment. The sustained increase in crop production should be accompanied by the management and protection of ecosystems, even as crop productivity is maximized (FAO, 2014). The major objective of this study was to investigate the relationship between agricultural gross domestic product (AGDP) and the output of major fruits in China, including apples, citrus, pears, grapes and bananas. Data were analysed by employing the ordinary least squares (OLS) method and the augmented Dickey-Fuller (ADF) unit root test. The Johansen co-integration test was used to interpret the results, and a machine learning technique was used to examine and predict future agricultural productivity in China.

Apple production
The apple industry plays a vital role in China's national economy. China accounted for nearly 30% of the world's total apple output, exporting nearly 24,000 tonnes of apples (Yang et al., 2006a). With 2.13 million hectares under cultivation, China has the world's largest apple production with an output of 31 million tonnes, accounting for about 43% to 54% of the world's total production. Major issues affecting the production of high-quality apples include out-dated orchard management methods and models as well as major pests and other fruit trees. About 90% of apple orchards in China use out-dated, densely planted systems that are inefficient and expensive (Sun and Liu, 2012; Yang et al., 2006b; Zhou et al., 2013). Rigorous modern apple cultivation methods using dwarf stocks and wide rows have not yet been adopted. Pest control is still an obstacle to improving the efficiency and production of apples. In apple orchards, out-dated pest control methods rely on chemicals, and this frequently leads to very high pesticide usage, genetic mutations and insecticide resistance that cause serious ecological problems (Zhai et al., 2007; Chen et al., 2010).
Fruit trees produce abundant flowers, and not all of the resulting fruit can be supported through ripening. For example, only about 7% of the flowers are needed for a commercially profitable apple harvest (Untiedt et al., 2001). Too many flowers on a tree can reduce the size and quality of the fruit, deplete the tree's reserves and reduce its cold resistance (Dennis et al., 2000). The balance between vegetative and reproductive growth is key for ensuring good quality and high yields (Solomakhin et al., 2010; Janoudi and Flore, 2005). Thinning is accomplished by removing certain flowers or fruits chemically, manually, mechanically or by a combination of mechanical and chemical methods (Seehuber et al., 2011).
In the previous decade, apple production gradually increased, but growth is likely to be modest as less land becomes available for apple production. In 2014, the export of apples dropped nearly 20% as domestic prices reached a record high. The production of apples from 1980 to 2015 is shown in Figure 1 in tens of thousands of tonnes.

Citrus production
Citrus is among the most popular fruits in the global market. In the early 1960s, the global production of citrus was 16 million tons. By 2012, it had increased to 68 million tons (FAO, 2012). Citrus trees tend to bloom in the spring; the fruit may need 6-8 months to mature (Steduto et al., 2012; Qin et al., 2016). The other important citrus-producing countries are the United States (Florida), Brazil and Spain. It is necessary to supplement any shortage of rainfall with irrigation (Morgan et al., 2010; Ballester et al., 2013; Romero et al., 2009).
Citrus has become one of China's high-quality agricultural products; it is also the principal industry in southern rural China. Since the early 1970s, Chinese scientists have made scientific and technological progress in improving the quality of citrus products. This includes advances in areas such as breeding, germplasm utilization, pest control and fruit storage processing (Shen et al., 2009). China has more than 74 species of insect pests (Yang et al., 2004; Niu et al., 2013).
In 2010, production increased dramatically, and a significant number of navel oranges were planted. The production of citrus fruits from 1980 to 2015 is shown in Figure 2 in tens of thousands of tonnes.

Pear production
The sand pears of China are cultivated widely in Korea and Japan, and the colour of the fruit can vary from green or yellow to russet-brown (Teng and Tanabe, 2004). In recent decades, China has discovered and developed several varieties of red fruit (Tao et al., 2004). These red pears are favoured by consumers because of their attractive appearance and nutritional value; however, the red colour is uneven because of variations in growth conditions (Huang et al., 2009). The quality of the pear is affected by internal and external characteristics (quality of taste and nutrition) (Choi et al., 2007). Pears have a high nutritional value with an appropriate amount of amino acids, sugar and minerals such as calcium, sodium, potassium, magnesium and iron (Yim and Nam, 2016). They also have a higher dietary fiber level than most common fruits and vegetables and have produced excellent results in the treatment of constipation and intestinal inflammation (Silva et al., 2014). The production of pears from 1980 to 2015 is shown in Figure 3 in tens of thousands of tonnes.

Grape production
Grapes are very important fruits. There are eight million varieties of grapes in the world (Ramezani et al., 2009).
China is one of the main producers of grapes. The country grows a wide variety of grapes and therefore holds one of the world's richest germplasm resources. Grape wild relatives (GWRs) are important quality sources for cultivation. They have significant resistance to cold, drought, pests and other stresses. The breeding of grapevines has demonstrated the importance of wild germplasm resources for disease resistance gene breeding (Wan et al., 2008). In wine production, grapes are the main source of natural yeast. The grapevine flora can determine whether a wine product is beneficial or harmful. Thus, yeast producers have a great deal of information that is very important for helping wine producers to produce high-quality wines (Chavan et al., 2009; González et al., 2007). Grape growers may have significantly different breed selection criteria than breeders or nurseries. Previous studies of crops in developed and developing countries have shown that farmers use biological and economic criteria that are more complex than those of breeders. Farmers' choices are also strongly influenced by other factors in the agricultural supply chain, such as agrichemical and extension services (Mulatu et al., 2002; Vanloqueren et al., 2008; Macholdt and Honermeier, 2016). As a grower of grapes, China is now a large producer of red grape wine in particular. In 2015, table grape production was 9.7 million tonnes, which was greater than that of the previous year, and the acreage of vineyards is expected to increase by 5%. The production of grapes from 1980 to 2015 is shown in Figure 4 in tens of thousands of tonnes.

Banana production
Bananas are considered the fourth largest crop in the world after rice, wheat and corn. They account for about 15% of the world's total fruit production. Bananas play an essential role in food security in developing countries. Cavendish, the most traded variety, accounts for half of the world's banana production (FAO, 2006; FAO, 2015). Bananas (Musa paradisiaca), of the family Musaceae and the genus Musa, are perennial herbs that are widely distributed in tropical and subtropical regions (Pelissari et al., 2012). It is estimated that about 20-25% (10-15 million tons) of bananas are rejected each year because they do not meet quality standards and are not suitable for retail sales (Pillay and Tenkouano, 2011). From an economic perspective, the banana is the fifth most important crop in world trade after coffee, grain, sugar and cocoa. Bananas are cultivated in more than 130 countries. India, China, the Philippines and Brazil are the main producers (Singh et al., 2016). Brazil is ranked fourth in the world in terms of banana production (FAO, 2013). Intensive banana farming produces many types of organic residues, such as pseudo-stems, peduncles, bulbs, leaf sheaths and shafts. These account for about 70% of the total weight of the plant. These residues tend to accumulate in large roadside piles. Their fermentation releases greenhouse gases and volatile organic compounds and attracts pathogens and mosquitoes (Awedem et al., 2016). Some of these residues are composed mainly of cellulose and lignin, which are difficult to break down with the usual windrow composting (Chanakya and Sreesha, 2012; Kamdem et al., 2015).
According to Food and Agriculture Organization of the United Nations (FAO) estimates, 413,000 hectares of land were used for the cultivation of bananas, with a total production of about 9.85 million tonnes (FAO, 2010). The production of bananas from 1980 to 2015 is shown in Figure 5 in tens of thousands of tonnes.

MATERIALS AND METHODS
Time series data from 1980 to 2015 were used to determine the relationship between AGDP and production outputs for major fruits, including apples, bananas, citrus, grapes and pears. The data were taken from the Ministry of Agriculture (MOA) of China and the China Bureau of Statistics. In the current study, the variables were AGDP (in million RMB) and the production outputs of apples, bananas, citrus, grapes and pears (each in 10,000 tonnes).

Ordinary least squares method
The ordinary least squares (OLS) results demonstrated the model's predictive ability and provided the parameters for the short-run relationship. The Johansen co-integration test was used to check the long-run relationship between the AGDP and the production output of the major fruits.
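The regression results are later read as elasticities (a 1% rise in output associated with a percentage change in AGDP), which suggests a log-log specification. The following is a minimal sketch under that assumption, not the authors' implementation: the function name `fit_log_log_ols` and the synthetic data are hypothetical and stand in for the MOA series.

```python
import numpy as np

def fit_log_log_ols(agdp, fruit_outputs):
    """OLS of ln(AGDP) on an intercept and ln(output) of each fruit.

    Returns [intercept, beta_1, ..., beta_k]; in a log-log model each
    beta_j is read as the % change in AGDP per 1% change in output j.
    """
    y = np.log(np.asarray(agdp, dtype=float))
    X = np.log(np.asarray(fruit_outputs, dtype=float))
    X = np.column_stack([np.ones(len(y)), X])   # prepend intercept column
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Synthetic placeholder data (assumption): 36 "years", 5 "fruits".
rng = np.random.default_rng(0)
outputs = np.exp(rng.normal(5.0, 0.5, size=(36, 5)))
true_beta = np.array([1.0, 0.2, -0.1, 0.4, 0.1, 0.3])
agdp = np.exp(true_beta[0] + np.log(outputs) @ true_beta[1:])

beta_hat = fit_log_log_ols(agdp, outputs)
print(np.round(beta_hat, 3))
```

Because the synthetic AGDP is generated without noise, the fitted coefficients should match `true_beta`; with real data the estimates would carry standard errors that determine significance.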

Results of ADF unit root test
To check the stationarity of each variable, the augmented Dickey-Fuller (ADF) unit root test was used. The modelled results and statistics of the ADF test are presented in Table 1.
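As a rough illustration of what the ADF test computes, the sketch below regresses the first difference of a series on an intercept, a linear trend, the lagged level and lagged differences, and returns the t-statistic on the lagged level. It is a minimal NumPy sketch rather than the software used in the study; the function name `adf_tstat`, the lag length and the synthetic series are assumptions for illustration. The statistic is compared with Dickey-Fuller critical values rather than standard t tables; for the trend-and-intercept case the asymptotic 5% value is roughly -3.41.

```python
import numpy as np

def adf_tstat(y, lags=1):
    """t-statistic on y_{t-1} in the ADF regression with trend and intercept."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)                       # first differences
    n = len(dy)
    rows = n - lags                       # usable observations
    target = dy[lags:]                    # dependent variable: current difference
    cols = [np.ones(rows),                # intercept
            np.arange(rows, dtype=float), # linear trend
            y[lags:-1]]                   # lagged level y_{t-1}
    for k in range(1, lags + 1):
        cols.append(dy[lags - k:n - k])   # k-th lagged difference
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    sigma2 = resid @ resid / (rows - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[2] / np.sqrt(cov[2, 2])

# Synthetic series (assumption): white noise vs. a random walk.
rng = np.random.default_rng(42)
noise = rng.normal(size=200)              # stationary by construction
walk = np.cumsum(rng.normal(size=200))    # unit root by construction
t_noise = adf_tstat(noise, lags=1)
t_walk = adf_tstat(walk, lags=1)
print(t_noise, t_walk)
```

A strongly negative statistic, as expected for the white-noise series, rejects the unit root; a series that fails the test in levels is usually differenced and tested again.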

Co-integration test results
The Johansen co-integration tests based on the trace statistic and the Max-Eigenvalue statistic are presented in Tables 2 and 3. The presence of co-integration showed that the dependent and independent variables have a long-run equilibrium relationship. The trace statistic and the Max-Eigenvalue statistic revealed one (1) co-integrating equation at the 5% level.

Results of regression
Table 4 presents the results of the regression analysis. The value of R-squared was 0.995, and the adjusted R-squared was 0.994. The computed F-statistic was 1348.737 with a p-value of 0.000000. This demonstrates the model's overall goodness of fit.
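The fit statistics above are linked by standard formulas, which the short sketch below makes explicit. It assumes n = 36 annual observations (1980 to 2015) and k = 5 regressors (one per fruit); both values and the helper names are assumptions, since the degrees of freedom are not stated in the text.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k regressors (plus intercept)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

def f_statistic(r2, n, k):
    """Overall F-statistic implied by R-squared."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

n_obs, n_regressors = 36, 5      # assumptions, see lead-in
adj = adjusted_r2(0.995, n_obs, n_regressors)
f_rounded = f_statistic(0.995, n_obs, n_regressors)
print(round(adj, 3), round(f_rounded, 1))
```

Under these assumptions the adjusted R-squared comes out at about 0.994, in line with the reported value. The F-statistic implied by the rounded R-squared is about 1194 rather than 1348.737; near R-squared = 1 the F-statistic is very sensitive to the third decimal place, so a gap of this size is plausible from rounding alone.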
The results of the regression analysis in Table 4 demonstrate that the coefficient of apple output has a positive relationship with the AGDP. The outputs of citrus, grapes and pears also have coefficients that demonstrate a positive relationship with the AGDP, but statistically these are insignificant. A 1% rise in the output of apples, citrus fruits, grapes and pears is associated with an increase in the AGDP of 0.153, 0.736, 0.153 and 0.781%, respectively; these positive relationships are not statistically significant. Moreover, the coefficient of the output of bananas is not significant at the 1% and 5% levels of significance. In addition, there was a negative relationship between the AGDP and the output of bananas: a 1% rise in the output of bananas is associated with a decrease of 0.203806% in AGDP. This negative result was not expected. The major reasons for this negative

Prediction of major Chinese agricultural fruits
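In the prediction of major Chinese agricultural fruit production, linear regression was applied to the study variables. Statistical classification was used to interpret the results (Figure 6). Time series data were used in this analysis, and they were collected from the Economic Survey of Pakistan. The model for linear regression is specified as:

$$y_i = \beta_1 x_{i1} + \cdots + \beta_p x_{ip} + \varepsilon_i = \mathbf{x}_i^{T}\boldsymbol{\beta} + \varepsilon_i, \quad i = 1, \ldots, n \tag{4}$$

In Equation 4, $\mathbf{x}_i^{T}\boldsymbol{\beta}$ indicates the inner product between the vectors $\mathbf{x}_i$ and $\boldsymbol{\beta}$, and $T$ denotes the transpose.
Thus, the vector form is:

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon} \tag{5}$$

The confidence interval of $E(y \mid x^{*})$, the expected value of $y$ for a specific given $x^{*}$, is:

$$\hat{y}^{*} \pm t_{\alpha/2,\,n-2}\; S_y \sqrt{\frac{1}{n} + \frac{(x^{*} - \bar{x})^{2}}{(n-1)\,S_x^{2}}} \tag{6}$$

In Equation 6, $S_y$ is the standard deviation of the residuals, calculated as

$$S_y = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^{2}}{n-2}} \tag{7}$$

and is known as the residual standard error in R regression output; $S_x$ is the sample standard deviation of $x$.

The proposed model consists of m vectors in a dimensional feature space. Each point x in the feature space is projected onto m and converted into a real number z, whose range is −∞ to +∞.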
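The interval for E(y|x) described above can be illustrated for a single predictor, here the calendar year. This is a sketch under stated assumptions, not the authors' code: it uses the standard simple-regression form of the interval, the function name `predict_mean_with_ci` is hypothetical, the series is synthetic, and the critical value 2.032 is an approximate two-sided 95% t value for 34 degrees of freedom that should be checked against a t table.

```python
import numpy as np

def predict_mean_with_ci(x, y, x_new, t_crit):
    """Fit y = a + b*x by least squares; return (prediction, lower, upper)
    for the mean response at x_new."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    x_bar, y_bar = x.mean(), y.mean()
    sxx = np.sum((x - x_bar) ** 2)               # equals (n - 1) * S_x**2
    slope = np.sum((x - x_bar) * (y - y_bar)) / sxx
    resid = (y - y_bar) - slope * (x - x_bar)    # residuals of the fitted line
    s_y = np.sqrt(np.sum(resid ** 2) / (n - 2))  # residual standard error
    y_hat = y_bar + slope * (x_new - x_bar)
    half_width = t_crit * s_y * np.sqrt(1.0 / n + (x_new - x_bar) ** 2 / sxx)
    return y_hat, y_hat - half_width, y_hat + half_width

# Synthetic placeholder series (assumption), in 10,000 tonnes.
rng = np.random.default_rng(1)
years = np.arange(1980, 2016)
output = 100.0 + 25.0 * (years - 1980) + rng.normal(0.0, 20.0, size=years.size)

pred, lower, upper = predict_mean_with_ci(years, output, 2030, t_crit=2.032)
print(pred, lower, upper)
```

The interval widens as the target year moves away from the mean of the observed years, so forecasts out to 2030 are less certain than in-sample fitted values.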

Algorithm
Step 1: Take matrix M of the last one year of data (size 1×6).
Step 2: Take matrix P of the previous 35 years of data (size 35×6).
Step 3: Make a sliding window of size 1×6 over matrix P, giving windows W1, W2, …, W35.
Step

CONCLUSION AND RECOMMENDATIONS
The agriculture sector has made a rich contribution to the Chinese economy. To check the actual relationship between the dependent and the independent variables, time series data from 1980 to 2015 were used. The data were collected from the Ministry of Agriculture (MOA) of China, the China Bureau of Statistics and various publications. The augmented Dickey-Fuller unit root test and the ordinary least squares method were used to analyse the data. The results were interpreted using the Johansen co-integration test. A machine learning technique was used to examine and predict future agricultural productivity in China. The results of the study show that the coefficient of apple output has a positive relationship with the AGDP. The results also show that the outputs of citrus, grapes and pears have coefficients that demonstrate a positive relationship with the AGDP, but these are statistically insignificant. Banana output has a negative, but not significant, relationship with China's AGDP. The negative relationship is probably the result of fluctuations in operating costs and bad climatic conditions. This negative result was not expected.
The population of China continues to grow; thus, increases in fruit production are essential. It is the responsibility of the government to provide resources to farmers to increase fruit production. To this end, it is necessary for the Government of China to initiate new programmes and methods of financial support. China should also adopt new policies in the coming decade to improve yield, a major factor in fruit production.
TP indicates True Positive, TN indicates True Negative, FP indicates False Positive and FN indicates False Negative. The design of the algorithm is stated below.
Algorithm 1: Agriculture Fruit Production Prediction
Input: A set of fruit data X
Output: Fruit data X future prediction P

Figure 6. Prediction of major Chinese fruit production.

Table 1. ADF unit root test (including trend and intercept).

Table 2. Johansen co-integration test using the trace statistic.

Column headings: Hypothesized no. of co-integration equations; Eigenvalue; Trace statistic; 5% critical value; Prob.**
* Denotes rejection of the hypothesis at the 0.05 level. ** Indicates values are accurate. The trace test indicates 2 co-integrating equations at the 0.05 level of significance.

Table 3. Johansen co-integration test using the Max-Eigenvalue statistic.

Column headings: Hypothesized no. of co-integration equations; Max-Eigen statistic; 5% critical value; Prob.**
* Denotes rejection of the hypothesis at the 0.05 level of significance. ** Indicates values are accurate. The max-eigenvalue test indicates 2 co-integrating equations at the 0.05 level of significance.

Table 5. Predicted fruit data and results.
Table 5 shows the predicted production of major Chinese fruits up to 2030.