Maize leaf area estimation in different growth stages based on allometric descriptors

1 Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre s.n., 4169-007 Porto, Portugal. 2 Escola Superior de Desenvolvimento Rural, Universidade Eduardo Mondlane, Vilankulo, Moçambique. 3 Linking Landscape, Environment, Agriculture and Food (LEAF), Instituto Superior de Agronomia, Universidade de Lisboa, Tapada da Ajuda, 1349-017 Lisboa, Portugal. 4 Geo-Space Sciences Research Center, Rua do Campo Alegre s.n., 4169-007 Porto, Portugal. 5 Institute for Systems and Computer Engineering, Technology and Science (INESC-TEC), CRIIS, Campus da Faculdade de Engenharia da Universidade do Porto, Rua Dr. Roberto Frias 4200-465 Porto, Portugal.


INTRODUCTION
Leaf area (LA) is a determinant factor in many physiological and agronomic processes, particularly growth, photosynthesis, transpiration, water and nutrient use, and productivity (Gao et al., 2012; Nangju and Wanki, 1980; Pandey and Singh, 2011).
Therefore, the implementation of operational and accurate procedures for measuring and estimating crop LA has long been a concern for researchers. There are currently several approaches for LA determination, which include direct and indirect methods. Direct methods include planimetric or gravimetric analyses of leaves, harvested directly or indirectly (Breda, 2003; Jonckheere et al., 2004). Portable scanning planimeters (e.g., LI-3000, LI-COR, NE, USA) are often used as a reference method for obtaining the LA.
Direct methods are more accurate but have the disadvantages of being very time-consuming, not user-friendly, and constrained by equipment acquisition, price, and operation (Jonckheere et al., 2004). Moreover, direct methods can be destructive, which prevents successive measurements of LA on the same plant (Peksen, 2007; Rouphael et al., 2010). One of the most frequently used indirect methods for LA estimation is based on observations and measurements of allometric parameters of the plants, which are used as inputs in mathematical models (Montgomery, 1911; Peksen, 2007). Such mathematical models are based on the correlation between the allometric measures of plants and the area of the leaves. These methods are non-destructive and allow for faster LA determination, and may eventually be suitable for automation. Nevertheless, adequate parameterization and calibration of such methods is necessary.
The development of models for estimating maize LA from allometric measurements has long been a concern for growers, breeders and researchers. A generalized leaf area equation, LA = α × L × W, was proposed for maize plants by Montgomery (1911), based on the area of a rectangle L × W (L, leaf length; W, leaf width) and on a weighting factor (α) equal to 0.75. However, several authors indicated that the weighting factor may vary depending on the maize variety (Bange et al., 2000; Carvalho and Christoffoleti, 2007; Tivet et al., 2001), plant development stage (Bange et al., 2000), environmental conditions and agronomic practices (Elings, 2000; Sezer et al., 2009). Moreover, application of this classic equation requires measuring the length and width of every leaf on a plant, which is very labour-intensive and time-consuming, and can be a source of errors.
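As an illustration of the classic equation, the following minimal sketch applies LA = α × L × W leaf by leaf and sums the result per plant. The function names, the measurements and the use of centimetres are hypothetical and serve only as an example; α = 0.75 is the weighting factor cited above.

```python
def leaf_area(length, width, alpha=0.75):
    """Area of one leaf from LA = alpha * L * W (Montgomery, 1911)."""
    return alpha * length * width


def plant_leaf_area(leaves, alpha=0.75):
    """Sum the estimated area over all (length, width) pairs of one plant."""
    return sum(leaf_area(length, width, alpha) for length, width in leaves)


# Hypothetical measurements (cm) for a plant with three leaves.
leaves = [(60.0, 8.0), (75.0, 9.0), (50.0, 6.0)]
total = plant_leaf_area(leaves)
print(f"Estimated plant leaf area: {total:.2f} cm^2")
```

The sketch makes the practical cost visible: every leaf of every plant must be measured, which motivates models based on a single leaf.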
An alternative approach for estimating maize LA based only on allometric measurements of the largest leaf was developed for varieties adapted to temperate regions (Valentinuz and Tollenaar, 2006). When applied to tropical varieties, these equations underestimated LA (Elings, 2000). Mondo et al. (2009) estimated maize LA based on one leaf, but not necessarily the largest. Although these models can perform well in estimating LA at specific stages of the season, their portability to different stages of maize development is not yet known.
According to Costa et al. (2016), the flexibility of LA models for use at different crop development stages is an important feature to support, throughout the crop cycle, different agricultural practices of high agronomic, economic and environmental importance, such as the management of crop water requirements and the parameterization of pesticide application doses.

Mananze et al. 203
The objective of this study was to develop a non-destructive and expeditious method and mathematical model for estimating the total leaf area (TLA) of the maize crop, variety PAN 53, at different phenological stages. The specific goals included (i) the development of an estimation methodology based on biometric measurements of a specific plant leaf using image processing; and (ii) the development of a dynamic mathematical model that estimates the TLA of maize plants throughout the crop cycle.

MATERIALS AND METHODS
The current study was conducted in a 3 ha field operated by Joint Aid Management (JAM), a non-governmental organization, in the district of Vilankulo, in the province of Inhambane, southern Mozambique (latitude 21° 58′ S, longitude 035° 09′ E, altitude 31 m above sea level) (Figure 1).
The district of Vilankulo is characterized by a semi-arid to arid climate, with sandy soils of low fertility, and a high risk of agricultural production failure due to drought. The total annual rainfall is 733.9 mm, while the total annual evapotranspiration is 1135.1 mm, and the average annual temperature is 24.5°C. The hot and rainy season occurs between November and March, with February being the hottest month (average monthly temperature of 26.9°C), when the average rainfall is about 166 mm. The cold and dry season occurs from April to October. July is the coldest (average monthly temperature of 19.4°C) and driest month, with about 17 mm of monthly rainfall.
Maize seeds of the PAN 53 variety (from PANNAR Seeds Company) were used for the present study. Sowing was done on June 9, 2015, in the cold and dry season, at a spacing of 0.50 × 0.20 m. A drip irrigation system was used and fertilization was applied during irrigation. Harvest was done in October 2015. The PAN 53 variety has a medium maturity, is resistant to major maize diseases and has a potential yield of 8 to 10 t/ha (PANNAR, n.d.).
The allometric measurements took place from June to September 2015 at different phenological stages. The phenological stage description of Lancashire et al. (1991) was adopted, and data were collected at the following stages: plants with 3 (V3), 6 (V6), 8 (V8), 12 (V12) and 15 (V15) leaves unfolded; flag leaf just visible (VT); inflorescence emergence (R1); and medium milk (RT). Fourteen maize plants were randomly selected and monitored at each phenological stage. The variables recorded at each stage were: i) length and width of the largest leaf, ii) number of leaves per plant and iii) height and diameter of the stem. Additionally, at stages V8 and R1, the full set of leaves of 30 randomly selected plants was collected, identified, marked and transported to the laboratory for measurement of length and width using a graduated ruler.
The leaves were also digitized using a camera (Sony Optical SteadyShot® DSC-W730; 16.1 megapixels; 8x optical zoom), keeping the image acquisition distance constant. The area of each leaf was determined by digital image processing, using the ImageJ software, version 1.48 (Wayne Rasband, National Institutes of Health, USA), and following the methodology described by Glozer (2008). Previous studies have shown reasonable results of LA estimations using ImageJ and other image processing software (Costa et al., 2016). In fact, several authors found no statistically significant differences between the results provided by this approach and those of the portable leaf area meter (LI-COR Inc., Lincoln, Nebraska, USA), which is considered the most accurate equipment for measuring LA (Dombroski et al., 2010; Santos et al., 2014).
A linear regression analysis was performed to assess the relationship between the total leaf area (TLA, which is the sum of LA for all leaves on a plant) and the measured allometric variables.
The dependent variable (TLA) was estimated according to the allometric measurements and their derivatives (transformations), to test the following linear regression models: where NL is the number of leaves on a plant; L and W are the length and width of the largest leaf; H is the plant height; D is the stem diameter; β0, β1, β2, β3, β4 and β5 are the regression parameters estimated specifically for each model using the ordinary least squares method. For the model calibration, data from the 60 (30 + 30) plants collected at the phenological stages V8 and R1 were aggregated into one sample. The aggregated sample was then divided into two independent random samples, one used for calibration and the other for validation.
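The calibration step can be sketched as follows, assuming a single-predictor model of the form TLA = β0 + β1 × (NL × L × W). This form, the helper names `fit_ols` and `split_half`, and the synthetic noise-free data are illustrative assumptions only; they are not the equations or measurements of this study.

```python
import random


def fit_ols(x, y):
    """Ordinary least squares for y = b0 + b1 * x (closed form)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def split_half(records, seed=0):
    """Randomly split records into two independent halves."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical plants: (number of leaves, length, width, measured TLA).
rng = random.Random(42)
plants = []
for _ in range(60):
    nl = rng.randint(8, 15)
    length = rng.uniform(60.0, 90.0)
    width = rng.uniform(7.0, 10.0)
    tla = 10.0 + 0.5 * nl * length * width  # synthetic, noise-free TLA
    plants.append((nl, length, width, tla))

calibration, validation = split_half(plants)
x_cal = [nl * length * width for nl, length, width, _ in calibration]
y_cal = [tla for _, _, _, tla in calibration]
b0, b1 = fit_ols(x_cal, y_cal)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
```

Because the synthetic TLA is generated without noise, the fit recovers the generating coefficients; with field data the estimates carry sampling error.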
Analysis of variance was performed to test statistical differences (F test) for each model. In addition, the standard deviation (SD) was computed for each parameter, and the statistical significance of the model parameters was determined using the t-test. For each model, the normality and homoscedasticity assumptions were tested, and the absence of multicollinearity between independent variables was assessed. The normal distribution of the residuals was assessed through the Jarque-Bera test (Gujarati, 1995). The Breusch-Pagan test (Breusch and Pagan, 1979) was used to check homoscedasticity by testing the dependence of the residual variance on the independent variables. The null hypotheses were a normal distribution of the residuals (Jarque-Bera test) and a homogeneous variance of the residuals (Breusch-Pagan test). The null hypothesis was rejected for a p-value lower than 5% under the χ² distribution (2 df). Extreme observations, or outliers, were diagnosed through the leverage test, establishing a maximum acceptable value of 1.5 (Montgomery et al., 2012).
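For illustration, the Jarque-Bera statistic can be computed directly from the residuals as JB = n/6 × (S² + (K − 3)²/4), where S is the sample skewness and K the sample kurtosis. The sketch below is a hand-written version under that standard definition, not the SPSS routine used in this study, and the residual values are hypothetical. For a χ² distribution with 2 df the upper-tail probability is exp(−JB/2), so no statistical library is needed.

```python
import math


def jarque_bera(residuals):
    """Return the Jarque-Bera statistic and its chi-squared (2 df) p-value."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((r - mean) ** 2 for r in residuals) / n  # variance (biased)
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)  # survival function of chi-squared, 2 df
    return jb, p_value


# Hypothetical residuals; normality is rejected when p_value < 0.05.
jb, p_value = jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0])
print(f"JB = {jb:.4f}, p = {p_value:.4f}")
```

The χ² approximation is asymptotic, so with samples as small as this example the p-value should be read with caution.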
The model's goodness-of-fit was assessed using the coefficient of determination (R²), the Nash-Sutcliffe efficiency coefficient (NSE), the linear regression through the origin and the index of agreement (IoA) between simulated and observed values. The NSE is a standardized statistic that compares the relative magnitude of the residual variance with the variance of the observed data (Cunha et al., 2016). It ranges from -∞ to 1; the closer to 1, the more accurate the model. Unlike R², the NSE is sensitive to differences between the means and variances of the observed and predicted values. However, both are sensitive to extreme values, as reported by Legates and McCabe (1999), cited in Cunha et al. (2016). The IoA ranges from 0 to 1, with 0 indicating lack of agreement and 1 perfect agreement.
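The two agreement statistics can be sketched as below. The NSE follows its usual definition, 1 − Σ(O − P)² / Σ(O − Ō)². For the IoA, the text does not state which variant was used, so the sketch assumes Willmott's original index, 1 − Σ(O − P)² / Σ(|P − Ō| + |O − Ō|)². The observed and predicted values are hypothetical.

```python
def nash_sutcliffe(observed, predicted):
    """NSE: 1 minus residual sum of squares over observed sum of squares."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_obs


def index_of_agreement(observed, predicted):
    """Willmott's index of agreement (assumed variant), bounded by 0 and 1."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    potential = sum(
        (abs(p - mean_obs) + abs(o - mean_obs)) ** 2
        for o, p in zip(observed, predicted)
    )
    return 1.0 - ss_res / potential


# Hypothetical observed and predicted values.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
nse = nash_sutcliffe(observed, predicted)
ioa = index_of_agreement(observed, predicted)
print(f"NSE = {nse:.3f}, IoA = {ioa:.3f}")
```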
The residuals between observed and estimated values were analysed to evaluate the model accuracy and precision. Several indicators were considered: i) the absolute average error (AAE), ii) the mean squared error (MSE), iii) the root mean squared error (RMSE) and iv) the relative root mean squared error (RRMSE). The Durbin-Watson test (DW) was used to evaluate the autocorrelation between residuals, assuming that values close to 2 denote the absence of autocorrelation.
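A minimal sketch of these residual indicators follows. The exact formulas are not given in the text, so the sketch assumes the common definitions: residuals taken as observed minus predicted, RRMSE expressed as RMSE divided by the observed mean, and the Durbin-Watson statistic as Σ(e_t − e_(t−1))² / Σe_t². Function names and data are hypothetical.

```python
import math


def error_metrics(observed, predicted):
    """Return AAE, MSE, RMSE and RRMSE (RMSE relative to the observed mean)."""
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    aae = sum(abs(e) for e in residuals) / n
    mse = sum(e ** 2 for e in residuals) / n
    rmse = math.sqrt(mse)
    rrmse = rmse / (sum(observed) / n)
    return {"AAE": aae, "MSE": mse, "RMSE": rmse, "RRMSE": rrmse}


def durbin_watson(residuals):
    """Durbin-Watson statistic; values near 2 suggest no autocorrelation."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical observed and predicted values.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
metrics = error_metrics(observed, predicted)
dw = durbin_watson([o - p for o, p in zip(observed, predicted)])
print(metrics, f"DW = {dw:.3f}")
```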
For selection of the best-performing model, the Akaike information criterion (AIC, dimensionless) was also used. It is based on the maximum likelihood function and allows generic comparison of models with different numbers of predictors. The AIC is calculated as follows:
$$\mathrm{AIC} = N \ln\left(\frac{\mathrm{SQE}}{N}\right) + 2K \qquad (5)$$
where N is the number of observations, SQE is the sum of squared errors, and K is the number of parameters + 1. Lower values of AIC indicate better models.
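As a worked example, the sketch below computes the AIC assuming the common least-squares form AIC = N ln(SQE/N) + 2K, with K equal to the number of parameters plus one. This form is an assumption consistent with the symbols defined above, and the function name and numbers are hypothetical.

```python
import math


def aic_least_squares(sqe, n_obs, n_params):
    """AIC for a least-squares fit, with K = number of parameters + 1."""
    k = n_params + 1
    return n_obs * math.log(sqe / n_obs) + 2 * k


# Hypothetical comparison of two models fitted to the same 30 observations.
aic_simple = aic_least_squares(sqe=3000.0, n_obs=30, n_params=2)
aic_complex = aic_least_squares(sqe=2950.0, n_obs=30, n_params=5)
print(f"simple: {aic_simple:.2f}, complex: {aic_complex:.2f}")
```

In this example the more complex model reduces the error only slightly, so the penalty term 2K leaves the simpler model with the lower AIC.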
Evaluating the regression assumptions and validating the model are very important for verifying its suitability as a forecasting tool when applied to new observations of the independent variables. In fact, a regression model can provide a good fit for the calibration sample data, but not when transposed outside the calibration confidence interval. For this reason, the statistical indicators of both the calibration and validation phases were used for model evaluation and selection. In addition to the statistical indicators, the ease of application and the biophysical meaning were also taken into consideration.
Two validation procedures were applied: cross-validation and external validation. The cross-validation was applied over the full set of data (n = 60) using the "leave-one-out" (LOO) cross-validation method (Cunha et al., 2016). The LOO cross-validation evaluates the model performance for observations not considered in the estimation step, thus providing independent estimates of the predictive capability of the selected models. This technique consists of removing one observation from the dataset and estimating a new regression model with the remaining observations. This new regression model is then used to estimate the TLA of the removed observation, and the procedure is repeated for every observation.
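The leave-one-out procedure can be sketched as follows for a single-predictor linear model. The helper names and the data are hypothetical, and the simple closed-form OLS fit stands in for whichever regression model is being validated.

```python
def fit_ols(x, y):
    """Ordinary least squares for y = b0 + b1 * x (closed form)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


def loo_predictions(x, y):
    """Predict each observation from a model fitted without it."""
    predictions = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]  # drop observation i
        y_train = y[:i] + y[i + 1:]
        b0, b1 = fit_ols(x_train, y_train)
        predictions.append(b0 + b1 * x[i])
    return predictions


# Hypothetical predictor and response values lying on an exact line.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]
loo = loo_predictions(x, y)
print(loo)
```

The held-out predictions can then be compared with the observed values using the same accuracy indicators as for the calibration sample.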
For the external validation, about 50% of the observations (30 plants), not used in the model parameter estimation, were used to evaluate the predictive quality of the model. We assume that the quality of the model validation is greater when the values of the indicators MSE, RMSE, AAE and RRMSE are similar for the calibration and validation samples. The SPSS 23 software was used for all statistical analyses.

RESULTS AND DISCUSSION
The dates of occurrence of the phenological stages and the dynamics of plant growth are presented in Table 1. The maximum height (2325 mm) was recorded when plants presented the flag leaf just visible. The growth rate is in agreement with the expected patterns of crop growth (Table 1).
Initially, there was exponential growth up to the 15th leaf stage, and thereafter the growth rate became very small. The mean and standard deviation values for all the allometric descriptors presented in Table 2 were very close for the calibration and validation samples.
Model 1 explains 90% of the variability of maize TLA at different stages of crop development (R² = 0.90, n = 30; P < 0.001) in both calibration and validation datasets. The value of the regression coefficient β1 was significantly different from zero (t-test, P < 0.001) and the confidence interval for its estimate does not include zero.
Table 3. Assumption diagnostics and goodness-of-fit indicators for the calibration and the validation of the models proposed for estimating the total leaf area. The vertical line separates the statistical indicators for the calibration and validation samples, respectively.
The Jarque-Bera test indicated normality of the residuals for both calibration and validation samples, and the Breusch-Pagan test confirmed the homoscedasticity of the residual variance (Table 3). The efficiency coefficients for calibration (NSE = 0.90) and validation (NSE = 0.91) are within the range defined for accurate models. Additionally, the model shows excellent predictive power, considering its high level of agreement (IoA = 0.84). The measures of association suggest a strong correlation between observed and predicted TLA, with the coefficient b higher than 1 for both calibration and validation data sets, suggesting good accuracy. The slope of the regression through the origin was very close to one (0.99 for calibration and 0.90 for validation) and the coefficient of determination was 89%, showing that the model produced TLA values with high accuracy and precision at different plant development stages (Table 3).

Figure 2 illustrates the relationship between observed and predicted TLA for the full data set (n = 60). The slope of the regression line (b) is very close to 1 (0.98), and the value of the coefficient of determination is high (0.89), indicating good agreement between the observed and predicted values.
The relative difference between observed and predicted leaf area is less than 10% in over 91% of the cases, as shown in Figure 3. The deviations exceeded 10% in only 6.7% and 10% of the cases for the calibration and validation data, respectively. The largest deviation (24.6%) was registered in the validation series, with all other cases presenting deviations lower than 20% (Figure 3).
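The share of cases within a given relative deviation can be computed as in the sketch below. It assumes the deviation is expressed relative to the observed value, which the text does not state explicitly, and the function name and values are hypothetical.

```python
def share_within(observed, predicted, threshold=0.10):
    """Fraction of cases whose relative deviation is below the threshold."""
    deviations = [abs(o - p) / o for o, p in zip(observed, predicted)]
    within = sum(1 for d in deviations if d < threshold)
    return within / len(deviations)


# Hypothetical observed and predicted leaf areas.
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [105.0, 230.0, 290.0, 400.0]
share = share_within(observed, predicted)
print(f"{share:.0%} of cases deviate by less than 10%")
```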
The model based on the product of the number of leaves and the length and width of the largest leaf (model 1) proved to be the most suitable for estimating the TLA of maize, variety PAN 53, under the agro-ecological conditions and agronomic practices of the study area. This model performed well when applied to the validation dataset, which suggests its accuracy in forecasting maize TLA. Moreover, the selected model enabled the estimation of LA at different stages of the crop cycle, unlike the other evaluated models, which resulted in negative LA values at the initial stages of crop development (data not shown).
Studies using temperate (Valentinuz and Tollenaar, 2006) or tropical (Elings, 2000; Mondo et al., 2009) maize varieties demonstrated that the product of the length and width of the largest leaf is an important descriptor for estimating the total LA. These models were developed for a specific stage of crop development and therefore do not include the number of leaves, unlike the model developed in the current study. According to Elings (2000), models developed for temperate varieties are not suitable for application to tropical varieties. Likewise, in the current study, an attempt to apply the models developed by Mondo et al. (2009), Sezer et al. (2009) and Montgomery (1911) resulted in a substantially lower fit (R² = 0.597, 0.6416 and 0.6453, respectively; data not shown) when compared with model 1. As noted by several authors, these differences probably stem from genetic aspects of the studied varieties, agro-ecological conditions and agricultural practices of the study areas (Stoppani et al., 2003; Tsialtas et al., 2008).

Conclusion
The model equation developed in the current study is deemed suitable for estimating the total leaf area of maize plants based on data collected at various stages of the crop cycle. The accuracy of the leaf area estimates and the operability of the model indicate its potential use in agricultural practices in which decision-making depends on plant leaf area, such as spraying, fertilization and irrigation, as well as in support of research projects.

Figure 1. Geographic location of the study area in Vilankulo, Mozambique.

Figure 2. Linear regression through the origin between predicted and observed leaf area for calibration and validation data sets.

Figure 3. Frequencies (%) of differences between observed and predicted leaf area for calibration and validation sets.

Table 1. Day of the year (DOY) for the occurrence of phenological stages and parameters of crop growth dynamics.

Table 2. Descriptive statistics for the allometric descriptors used for model calibration and validation.