Modeling climate variables using time series analysis in arid and semi-arid regions

Stochastic models have been proposed as one technique for generating scenarios of future climate change. Temperature and precipitation are among the main indicators in climate studies. The goal of this study is to simulate and model monthly precipitation and the mean of monthly temperature using stochastic methods. Twenty-one years of monthly precipitation and mean monthly temperature data from Shiraz Synoptic Station in the south of Iran were used. Suitable models for forecasting both variables were obtained with the ARIMA approach, using the autocorrelation and partial autocorrelation methods, parameter estimation and comparison of candidate model types. After model validation and evaluation, forecasts were made for the crop years 2008 to 2009 and 2009 to 2010. The forecasts indicate that precipitation will improve relative to recent years. As regards the mean monthly temperature, the forecasts show an increase in temperature along with a narrowing of the range of variation.


INTRODUCTION
The study of meteorological parameters is highly important in hydrological problems, since these parameters largely determine the climate of a region, and it is their variations (in water, wind, rain, etc.) that give rise to problems such as drought. Therefore, accuracy in collecting data on such parameters is of particular importance. Studying the long-term statistical behavior and fluctuations of climatic parameters, and analyzing how a phenomenon behaved in the past, makes it possible to assess its probable trend in the future. One can therefore study climatic variations by forecasting and estimating parameters such as precipitation and temperature on the basis of their past behavior.
Stochastic and time series methods can be used for modeling and forecasting. Statistical methods have two objectives: (1) understanding random processes and (2) forecasting series (Anderson, 1971). Time series analysis has developed rapidly in theory and practice since the 1970s as a tool for forecasting and control. This type of analysis generally deals with data that are not independent but serially dependent on one another. In one study, the mean monthly temperature of Tabriz Station in Iran was investigated using the Box and Jenkins Autoregressive Integrated Moving Average (ARIMA) model. The monthly temperature of Tabriz over a 40-year statistical period (1959 to 1998) was examined using the autocorrelation and partial autocorrelation methods, and the normality of the residuals was checked. Based on the models obtained, the variations of the mean temperature at Tabriz Station were forecast up to the year 2010 (Jahanbakhsh and Babapour-Basser, 2003).
Another study analyzed the climate of Birjand Synoptic Station in Iran to identify climatic fluctuations, especially drought and wetness, and to find the best model for forecasting them using statistical methods and Box-Jenkins time series models of precipitation and temperature. Climatic forecasts are needed for large-scale state planning concerning natural disasters; the precipitation and temperature of Birjand Station were therefore studied to identify climatic fluctuations and the possibility of forecasting them (Bani-Waheb and Alijani, 2005). Bouhaddou et al. (1997) used the autoregressive moving average (ARMA) model to simulate weather parameters such as ambient temperature, humidity and clearness index. Frausto et al. (2003) suggested that autoregressive (AR) and ARMA models could be used to describe the inside air temperature of an unheated. Kurunc et al. (2005) applied the ARIMA approach to water quality constituents and stream flows of the Yesilirmak River in Turkey. Yurekli and Kurunc (2006) predicted drought periods based on the water consumption of selected critical crops using the ARIMA approach. Yurekli et al. (2005) used the ARIMA model to simulate the monthly stream flow of Kelkit Stream in Turkey. Yurekli and Ozturk (2003) examined whether the daily extreme stream flow sequences of Kelkit Stream could be generated by stochastic models.
In another study, drought in Fars Province in Iran was modeled using the Box-Jenkins method and the ARIMA model, and a drought forecasting model for each region was obtained after zoning the province into different regions (Shamsnia et al., 2009). Shahidi et al. (2010) used the ITSM software to model and forecast groundwater level fluctuations in the Shiraz Plain in Iran. An autoregressive model of order 24 was fitted to the series, with AIC = 165.117, and the coefficients of the fitted model were finalized by residual tests. In a further study, the monthly maxima of the 24 h average time series of ambient air quality data, namely sulphur dioxide (SO₂), nitrogen dioxide (NO₂) and suspended particulate matter (SPM) concentrations monitored at the six national ambient air quality monitoring (NAAQM) stations in Delhi, were analysed using the Box-Jenkins modelling approach.
The model evaluation statistics suggested that satisfactory real-time forecasts of pollution concentrations can be generated using the Box-Jenkins approach. The developed models can be used to provide short-term, real-time forecasts of extreme air pollution concentrations for the air quality control region (AQCR) of Delhi City, India (Sharma et al., 2009).
Babazadeh and Shamsnia 2019
Therefore, considering the importance of the climatic parameters of precipitation and temperature, and the role they play in determining other climatic elements, modeling and forecasting them with advanced statistical methods is a necessity and could be a basic pillar of agricultural and water resource management. The goal of the present study is to analyze the behavior of monthly precipitation and mean temperature, to simulate them, and to provide a model for forecasting them using the statistical models of time series analysis at the Synoptic Station of Shiraz City.

RESEARCH METHODOLOGY
In this study, monthly precipitation and mean temperature data from Shiraz Synoptic Station were used; the required information was collected from the available tables and databases. Shiraz City, in Fars Province in the southern part of Iran, is located at 53° 37′ E longitude and 29° 57′ N latitude and covers an area of 10434 km². The mean annual precipitation is 330 mm and the mean annual temperature of the study area is about 18°C (I.R. of Iran Meteorological Org.). The geographical location of the study region is shown in Figure 1. The statistical period under study covers the crop years 1983 to 1984 through 2003 to 2004. Initially, the homogeneity of the data was confirmed using the run test. A homogeneity test should be carried out before any statistical analysis to ensure that the data are random. The test was performed using SPSS software.
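The run test mentioned above can be sketched in a few lines. The version below is the Wald-Wolfowitz runs test with the median as cut point, using the normal approximation; the cut point and the function name `runs_test` are assumptions for illustration, since the source only states that the test was run in SPSS.

```python
import math
from statistics import median

def runs_test(values):
    """Wald-Wolfowitz runs test about the median.

    Returns (number_of_runs, z_statistic). With the normal approximation,
    |z| < 1.96 means randomness is not rejected at the 5% level.
    """
    med = median(values)
    # Mark each observation as above (True) or below (False) the median;
    # observations equal to the median are dropped, a common convention.
    signs = [v > med for v in values if v != med]
    n1 = sum(signs)             # observations above the median
    n2 = len(signs) - n1        # observations below the median
    if n1 == 0 or n2 == 0:
        raise ValueError("need observations on both sides of the median")
    # A new run starts whenever consecutive observations change side.
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    n = n1 + n2
    expected = 2.0 * n1 * n2 / n + 1.0
    variance = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n ** 2 * (n - 1.0))
    if variance == 0:
        raise ValueError("too few observations for the normal approximation")
    z = (runs - expected) / math.sqrt(variance)
    return runs, z

# A strictly alternating toy series has far more runs than a random one.
example_runs, example_z = runs_test([1, 9, 2, 8, 3, 7, 4, 6])
```

A large positive z indicates too many runs (rapid oscillation) and a large negative z indicates too few (trend or clustering); either would cast doubt on the randomness assumption.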
Then, based on the results obtained and on the sequence of observations and the past behavior of the phenomenon, an appropriate forecasting model was devised using time series analysis and stochastic methods. To model the data, separate time series of the precipitation and mean temperature observations were prepared and then made stationary.
Fitting an ARIMA model to a time series consists of three phases: model identification, parameter estimation and diagnostic testing (Yurekli and Ozturk, 2003). The identification stage determines the differencing required to produce stationarity and the orders of the AR and MA operators for a given series. Stationarity is a necessary condition for building an ARIMA model that is useful for forecasting. A stationary time series has the property that its statistical characteristics, such as the mean and the autocorrelation structure, are constant over time. When the observed time series shows trend and heteroscedasticity, differencing and power transformation are often applied to the data to remove the trend and stabilize the variance before an ARIMA model is fitted. The estimation stage uses the data to estimate, and make inferences about, the values of the parameters conditional on the tentatively identified model. The parameters are estimated such that an overall measure of the residuals is minimized, which can be done with a nonlinear optimization procedure.
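The differencing step of the identification stage can be illustrated with a small sketch. The series below is synthetic (a linear trend plus a fixed 12-month pattern), not the Shiraz data, and the helper name `difference` is hypothetical.

```python
def difference(series, lag=1):
    """Return the lag-`lag` difference y[t] = x[t] - x[t - lag]."""
    if lag < 1 or lag >= len(series):
        raise ValueError("lag must be between 1 and len(series) - 1")
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Synthetic monthly series: linear trend plus a repeating 12-month pattern.
pattern = [5, 7, 10, 15, 20, 25, 28, 27, 23, 17, 11, 6]
series = [0.1 * t + pattern[t % 12] for t in range(48)]

# Lag-12 differencing removes the 12-month cycle; for this purely linear
# trend it leaves a constant (12 times the monthly slope).
seasonal = difference(series, lag=12)

# A further lag-1 difference removes that constant, leaving values near zero.
stationary = difference(seasonal, lag=1)
```

Each differencing pass shortens the series by the lag used, which is why a seasonal difference at lag 12 costs one full year of monthly data.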
Diagnostic checking of model adequacy is the last stage of model building. This stage determines whether the residuals are independent, homoscedastic and normally distributed. Several diagnostic statistics and plots of the residuals can be used to examine the goodness of fit. If the model proves inadequate, a new tentative model should be identified, again followed by parameter estimation and model verification; the diagnostic information may help to suggest alternative model(s). This three-step model building process is typically repeated several times until a satisfactory model is finally selected, which can then be used for prediction. Plotting the original series may reveal trends in the mean and variance (Box and Jenkins, 1976). The ARIMA model is essentially an approach to forecasting time series data; however, it requires stationary time series data (Dickey and Fuller, 1981).
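One common way to check residual independence in the diagnostic stage is to compare the sample autocorrelations of the residuals with the approximate 95% bounds ±1.96/√n expected for white noise. The sketch below assumes that check; the function names are illustrative and the source does not state which diagnostics were computed.

```python
import math

def sample_acf(x, max_lag):
    """Sample autocorrelations r(1), ..., r(max_lag) of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    if c0 == 0:
        raise ValueError("series is constant")
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]

def lags_outside_bounds(residuals, max_lag=24):
    """Lags whose sample autocorrelation exceeds +/-1.96/sqrt(n)."""
    bound = 1.96 / math.sqrt(len(residuals))
    acf = sample_acf(residuals, max_lag)
    return [k for k, r in enumerate(acf, start=1) if abs(r) > bound]
```

For a white noise series, roughly one lag in twenty is expected to fall outside the bounds by chance, so an occasional exceedance is not by itself evidence of a poor fit.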

The modeling procedures
Time series modeling can be carried out by several methods, one of which is the ARIMA or Box-Jenkins method, also called the (p,d,q) model (Box and Jenkins, 1976). In the (p,d,q) model, p denotes the number of autoregressive terms, q denotes the number of moving average terms and d is the order of differencing, that is, the number of times the series must be differenced to bring it to statistical equilibrium. In an ARIMA model, (p,d,q) is called the non-seasonal part of the model; p describes the dependence of the series on its own past values and q describes its dependence on the factors that generate it. The mathematical formulation of ARIMA models is given by Equation (1). A time series is analyzed in several stages. In the first stage, initial values of p, d and q are determined using the Autocorrelation Function (ACF) and the Partial Autocorrelation Function (PACF). A careful study of the autocorrelation and partial autocorrelation diagrams and their elements gives a general view of the nature of the time series, its trend and its characteristics.
This general view is usually the basis for selecting a suitable model. The diagrams are also used to confirm the goodness of fit and the correctness of the model selection. In the second stage, it is examined whether the p and q terms (the autoregressive and moving average terms, respectively) should remain in the model or be dropped. In the third stage, it is evaluated whether the residuals are random and normally distributed; only then can the model be said to fit well and be appropriate. If the time series is seasonal, the modeling becomes two-dimensional: in principle, part of the variation in the series occurs within each season and another part occurs between seasons. A special type of seasonal model that gives desirable results in practice and coincides with the general structure of ARIMA models was devised by Box and Jenkins (1976) and is called the multiplicative seasonal model. It takes the form ARIMA (p d q) (P D Q). Finally, selection criteria must be used to test and compare the candidate models, so that the best model is chosen for forecasting.

(1)

In Equation (1), X(t) is the value of the variable at time t, Z(t) is the residual term of the model and σ² is the white noise variance (Brockwell and Davis, 2002).
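For reference, the multiplicative seasonal ARIMA (p d q) (P D Q) model is usually written in backshift notation as below. This is the textbook form (as in Brockwell and Davis, 2002) and is assumed, not confirmed, to correspond to Equation (1); the symbols B, s, φ, Φ, θ and Θ are not defined in the source.

```latex
\phi(B)\,\Phi(B^{s})\,(1-B)^{d}\,(1-B^{s})^{D}\,X(t)
  = \theta(B)\,\Theta(B^{s})\,Z(t),
\qquad Z(t) \sim \mathrm{WN}(0,\sigma^{2})

\phi(B) = 1 - \phi_{1}B - \dots - \phi_{p}B^{p}, \qquad
\theta(B) = 1 + \theta_{1}B + \dots + \theta_{q}B^{q}

\Phi(B^{s}) = 1 - \Phi_{1}B^{s} - \dots - \Phi_{P}B^{Ps}, \qquad
\Theta(B^{s}) = 1 + \Theta_{1}B^{s} + \dots + \Theta_{Q}B^{Qs}
```

Here B is the backshift operator, B X(t) = X(t-1), and s is the seasonal period (s = 12 for monthly data). Setting P = D = Q = 0 reduces the expression to the non-seasonal ARIMA (p,d,q) model.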

Model selection criteria
Several candidate models may adequately represent a given set of data, whether in time series analysis or in data analysis generally. Sometimes the selection is easy, whereas at other times it may be much more difficult. Therefore, numerous criteria have been introduced for comparing models, which are distinct from the methods used for model identification. Some of these criteria are based on statistics summarized from the residuals (computed from a fitted model) and others are based on the forecasting error (computed from out-of-sample forecasts). Criteria of the first kind include the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC) and the Schwarz-Bayesian Criterion (SBC); criteria based on the forecasting error include the mean percent error (MPE), the mean square error (MSE), the mean absolute error (MAE) and the mean absolute percent error (MAPE). The model for which these statistics are lowest is selected as the appropriate model. Akaike (1974) suggested a mathematical formulation of the parsimony criterion of model building, the Akaike Information Criterion (AIC), for selecting the optimal model for a given data set. The AIC is defined as:

$$\mathrm{AIC}(M) = n \ln \sigma_a^{2} + 2M \qquad (2)$$

where M is the number of AR and MA parameters to be estimated, $\sigma_a^{2}$ is the residual variance and n is the number of observations.
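The selection statistics listed above are simple to compute once residuals or out-of-sample forecasts are available. The sketch below assumes the AIC form n·ln(residual variance) + 2M, which is consistent with the symbol definitions given here, and the usual definitions of MPE, MSE, MAE and MAPE; the function names are illustrative.

```python
import math

def aic(residual_variance, n_obs, n_params):
    """AIC(M) = n * ln(residual variance) + 2 * M."""
    return n_obs * math.log(residual_variance) + 2 * n_params

def forecast_errors(observed, predicted):
    """Return MPE, MSE, MAE and MAPE for paired observed/predicted values.

    MPE and MAPE are percentages of the observed value; they are returned
    as None when any observation is zero (e.g. a rainless month).
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    pct_ok = all(o != 0 for o in observed)
    return {
        "MPE": 100.0 * sum(e / o for e, o in zip(errors, observed)) / n if pct_ok else None,
        "MSE": sum(e * e for e in errors) / n,
        "MAE": sum(abs(e) for e in errors) / n,
        "MAPE": 100.0 * sum(abs(e / o) for e, o in zip(errors, observed)) / n if pct_ok else None,
    }
```

Because the percentage measures are undefined when an observation is zero, MSE and MAE are the safer choices for monthly precipitation in a region with dry months.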
The model that gives the minimum AIC is selected as the parsimonious model. Akaike (1974) has shown that the AIC criterion tends to overestimate the order of the autoregression. However, Akaike (1978 and 1979) developed a Bayesian extension of the minimum AIC procedure, called BIC. Another index for evaluating a model is the efficiency factor. The model efficiency (EF) indicates the robustness of the model (Raes et al., 2006). EF ranges from −∞ to 1, with higher values indicating better agreement. If EF is negative, the model prediction is worse than the mean of the observations:

$$\mathrm{EF} = 1 - \frac{\sum_{i=1}^{n} (O_i - P_i)^{2}}{\sum_{i=1}^{n} (O_i - \bar{O})^{2}} \qquad (3)$$

where $O_i$ and $P_i$ are the observed and predicted values, $\bar{O}$ is the mean of the observations and n is the number of observations.

The time series of monthly precipitation and the mean of monthly temperature are shown in Figures 2 and 3. The trend and seasonal components identified from the ACF/PACF diagrams (Figures 4 and 5) show peaks at lags 12 and 24. These deterministic components were removed with the difference operator, resulting in stationary series (Figures 6 and 7). The ACF/PACF of the stationary monthly precipitation and mean monthly temperature series after differencing are shown in Figures 8 and 9. Residual testing was used for validation. The ACF/PACF values of the residuals all lie within the 95% confidence interval (Figures 10 and 11). The residual autocorrelation functions (RACFs) drawn for the best models indicated that the residuals were not significantly different from a white noise series at the 5% significance level. Inspection of the RACFs and of the integrated periodogram of the residuals confirmed a strong model fit.
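The efficiency factor can be computed directly from paired observations and predictions. The sketch below assumes EF is the Nash-Sutcliffe form, one minus the ratio of the squared prediction errors to the squared deviations of the observations from their mean, which matches the stated range of −∞ to 1. It also computes R² as the squared Pearson correlation, which is an assumption about how R² was obtained in the study.

```python
def efficiency(observed, predicted):
    """Model efficiency EF = 1 - SSE / sum of squared deviations from the observed mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    if sst == 0:
        raise ValueError("observations are constant")
    return 1.0 - sse / sst

def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mo = sum(observed) / n
    mp = sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("one of the series is constant")
    return cov * cov / (var_o * var_p)

obs = [1.0, 2.0, 3.0, 4.0]
perfect = efficiency(obs, obs)               # predictions equal observations
mean_only = efficiency(obs, [2.5] * 4)       # predicting the observed mean
biased = efficiency(obs, [2.0, 4.0, 6.0, 8.0])  # perfectly correlated but biased
```

The last case shows why the two statistics are reported together: predictions that are perfectly correlated with the observations give R² = 1 even when a systematic bias makes EF negative.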

Modeling of monthly precipitation and the mean of monthly temperature
Using the ACF and PACF methods, the autoregressive and moving average orders were assessed, and an appropriate model for estimating precipitation at Shiraz Station was eventually found to be ARIMA (0 0 0) (2 1 0)12. Similarly, following the same modeling procedure, ARIMA (2 1 0) (2 1 0)12 was found to be the suitable model for estimating the mean of monthly temperature at Shiraz Station.
To avoid overfitting, the AIC and BIC criteria were used. The candidate models were compared, and the one with the lowest AIC and BIC values was taken as the final, best-fitting model; it was obtained using the method of maximum likelihood in the ITSM software. The results are shown in Equations (4) and (5) and in Table 1. In Equation (4), X(t) is the total precipitation at time t, Z(t) is the residual term of the model and σ² is the white noise variance (Brockwell and Davis, 2002). In Equation (5), X(t) is the mean monthly temperature at time t and Z(t), as for precipitation, is the white noise term.
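As a rough illustration of how a fitted autoregressive equation of this kind produces forecasts, the sketch below applies the lag coefficients reported for the precipitation model (Equation 4) recursively, with the noise term Z(t) set to its expected value of zero. It assumes X(t) denotes the series the model was fitted to, presumably the differenced and mean-corrected series, which the source does not state explicitly, so the output would still need to be transformed back to millimetres. The names `PRECIP_AR`, `predict_next` and `forecast` are illustrative.

```python
# Lag -> coefficient, as reported for the precipitation model (Equation 4).
PRECIP_AR = {1: 0.1066, 2: -0.08771, 9: -0.1019,
             12: -0.2363, 23: 0.1581, 24: -0.2111}

def predict_next(history, coeffs):
    """One-step prediction: sum of coeff_k * X(t - k), with Z(t) taken as zero."""
    max_lag = max(coeffs)
    if len(history) < max_lag:
        raise ValueError(f"need at least {max_lag} past values")
    return sum(c * history[-k] for k, c in coeffs.items())

def forecast(history, coeffs, steps):
    """Multi-step forecast: each prediction is fed back in as the newest value."""
    extended = list(history)
    for _ in range(steps):
        extended.append(predict_next(extended, coeffs))
    return extended[len(history):]
```

Because the largest lag is 24, at least two full years of the transformed monthly series are needed before the first forecast can be made.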
$$X(t) = 0.1066\,X(t-1) - 0.08771\,X(t-2) - 0.1019\,X(t-9) - 0.2363\,X(t-12) + 0.1581\,X(t-23) - 0.2111\,X(t-24) + Z(t) \qquad (4)$$

$$X(t) = -0.6683\,X(t-1) - 0.5921\,X(t-2) - 0.3740\,X(t-3) - 0.2680\,X(t-4) - 0.1849\,X(t-5) - 0.1206\,X(t-6) - 0.6832\,X(t-12) - 0.4682\,X(t-13) - 0.3631\,X(t-14) - 0.1038\,X(t-15) - 0.3448\,X(t-24) - 0.1471\,X(t-25) - 0.1030\,X(t-26) + Z(t) \qquad (5)$$

Figure 12 shows the correlation between observed and predicted data from the ARIMA models in crop years 2004 to 2005 through 2007 to 2008.

CONCLUSION
Recent droughts in Fars Province, with Shiraz as its center, have caused much damage. To prevent such extensive damage, knowledge of the fluctuations during the statistical period, and forecasts of them, are necessary in planning. The findings from the study of monthly precipitation and the mean of monthly temperature, and the evaluation of the diagrams, showed that the variations of precipitation in the Shiraz region indicate the existence of severe and, in some instances, long-term droughts. The Box-Jenkins model was used to forecast the studied parameters, and the final model was selected using the AIC and BIC criteria; the results showed that, given its high accuracy, it can be used to forecast the monthly variations in precipitation and mean temperature in the city of Shiraz.
For model validation, the EF value was calculated as 0.7 for monthly precipitation and 0.94 for the mean of monthly temperature. In addition, R² for the two variables was 0.78 and 0.95, respectively. Consequently, the models can be used to forecast the studied variables. The forecasts indicate that, despite a continuing drought, precipitation is likely to improve. As regards the mean monthly temperature, the trend of increasing temperature, especially in recent years, has continued, and the forecasts show an increase in temperature along with a narrowing of the range of variation.

Figure 1. Regional map of Iran, Fars Province and Shiraz City (I.R. of Iran Meteorological Org.).

Figure 2. Time series of monthly precipitation in Shiraz Station.

Figure 3. Time series of the mean of monthly temperature in Shiraz Station.

Figure 5. ACF/PACF of the mean of monthly temperature in Shiraz Station.

Figure 7. Stationary series of the mean of monthly temperature.

Figure 11. ACF/PACF of residuals for the mean of monthly temperature.

Figure 12. Correlation between observed and predicted data from ARIMA models in crop years 2004 to 2005 through 2007 to 2008.

Figure 13. Predicted data of the monthly precipitation for the agriculture years 2008 to 2009 and 2009 to 2010.

Figure 14. Predicted data of the mean of monthly temperature for the agriculture years 2008 to 2009 and 2009 to 2010.

Table 1. The ARIMA models selected for variables.