Mathematical models to estimate leaf area of citrus genotypes

Mathematical models to estimate leaf area were developed using 22 citrus genotypes. The ratio between leaf length and width (L/W) for simple leaf blades (elliptic, ovate, obovate, lanceolate) and the ratio between the lengths of the three folioles, (L2 + L3)/L1, for compound (trifoliate) leaves were used to separate the genotypes into groups of similar leaf blade form and thereby improve the accuracy of the estimates. The best models were highly precise, with errors ranging from 1.2 to 6.2% and r² higher than 0.95 for most of the models tested. For simple leaf blades, the linear model (Y = β·L·W) gave the lowest mean deviation and the lowest squared deviation. For compound leaves, the potential models are simple to use because they require only the length of the central foliole, L1 (Y = β·L1^μ), although the linear models gave the best precision, as observed with the model Y = β·L1·W1. Furthermore, one model may be used as a single model regardless of the ratio (L2 + L3)/L1: Y = β·(L1·W1 + L2·W2 + L3·W3), r² = 0.98.

*Corresponding author. E-mail: mauricio-antonio.coelho@embrapa.br.
Author(s) agree that this article remain permanently open access under the terms of the Creative Commons Attribution License 4.0 International License.

Among the methods for estimating leaf area, mathematical models based on biometric variables (leaf width and length) are widely used for various plant species (Serdar and Demirsoy, 2006) and can be applied in many types of studies (Ramirez and Zullo Júnior, 2010; Bu et al., 2013; Coelho Filho et al., 2013; Padrón et al., 2016; Silva et al., 2008; Yi et al., 2010). However, because of the genetic variability in these characteristics, further studies and specific equations for each genotype are needed within a given species (Malagi et al., 2010). Instruments such as portable scanners and optical lasers are designed for measuring leaf area index (LAI). However, they are often too expensive and complex for basic studies (Serdar and Demirsoy, 2006) and involve destructive measurements, which makes sequential readings unfeasible (Cristofori et al., 2008).
Citrus breeding programs have generated several hybrids that should be evaluated for tolerance to abiotic and biotic stresses, and leaf area is constantly assessed and correlated with most other physiological traits. Thus, the present study aimed to develop an accurate mathematical model to estimate single-blade leaf area that is easily applicable and adaptable to any Citrus hybrid.

Genotypes used and growing conditions
This study was conducted with 22 genotypes from the Citrus Genetic Breeding Program (GBP Citrus) of Embrapa Cassava and Fruits, classified into two groups according to leaf type: simple and compound (Table 1). Leaves were collected from five plants of each genotype, grown in a greenhouse for one year in 40 L pots.

Modeling and statistics of the results
From each genotype, 22 to 49 leaves were randomly collected, sampling as wide a range of leaf sizes as possible. The area of each leaf was determined using the methodology of Marshall (1968).
For the simple leaves, the maximum length (L) and the maximum width (W) of the leaf were measured. For the compound leaves, the maximum lengths of the central foliole (L1) and the lateral folioles (L2 and L3) and the maximum widths of the central foliole (W1) and the lateral folioles (W2 and W3) were measured. Using the Table Curve software, the biometric measurements were treated as independent variables and the leaf area as the dependent variable. The best models were selected based on the coefficient of determination (r²) (Table 2).
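The fitting step can be illustrated with a minimal sketch. It assumes ordinary least squares through the origin for the linear model and a log-log regression for the potential model; the fitting routine actually used by Table Curve is not described in the text, so the function names and the synthetic measurements below are illustrative assumptions only.

```python
import math

def fit_linear_through_origin(x, y):
    """Least-squares beta for Y = beta * x (no intercept)."""
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    return sxy / sxx

def fit_power(x, y):
    """Fit Y = beta * x**mu by least squares on log-transformed data
    (an assumed method, chosen for simplicity)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    mu = (sum((a - mx) * (b - my) for a, b in zip(lx, ly))
          / sum((a - mx) ** 2 for a in lx))
    beta = math.exp(my - mu * mx)
    return beta, mu

def r_squared(y, y_hat):
    """Coefficient of determination between observed and fitted values."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

# Synthetic, made-up leaves (not data from the study): W = L / 2 and
# area = 0.7 * L * W, so the expected fits are known in advance.
lengths = [4.0, 6.0, 8.0, 10.0]                      # cm
widths = [2.0, 3.0, 4.0, 5.0]                        # cm
areas = [0.7 * l * w for l, w in zip(lengths, widths)]  # cm^2

lw = [l * w for l, w in zip(lengths, widths)]
beta_linear = fit_linear_through_origin(lw, areas)   # Y = beta * (L * W)
beta_power, mu_power = fit_power(lengths, areas)     # Y = beta * L**mu
r2_linear = r_squared(areas, [beta_linear * v for v in lw])
```

Because the synthetic areas are generated exactly from 0.7·L·W with W = L/2, the linear fit recovers β = 0.7 and the length-only potential fit recovers β = 0.35 and μ = 2, which makes the sketch easy to check by hand.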
To compare the proposed models, in addition to the correlation analysis, we calculated the total error of the leaf area estimate and the relative error. The total error of the estimate for each model was calculated by Equation (1):
$$E = \sum_{i=1}^{n} \left( A_{m,i} - A_{r,i} \right) \tag{1}$$
in which E is the total error of the leaf area estimate (cm²); Am is the estimated leaf area (cm²); and Ar is the measured leaf area (cm²).
The relative error was calculated as the difference between the sum of the estimated leaf areas and the sum of the corresponding measured values, divided by the sum of the measured leaf areas (Equation 2):
$$RE = \frac{\sum_{i=1}^{n} A_{m,i} - \sum_{i=1}^{n} A_{r,i}}{\sum_{i=1}^{n} A_{r,i}} \times 100 \tag{2}$$
in which RE is the relative error (%); $\sum_{i=1}^{n} A_{m,i}$ is the sum of the leaf areas of all the leaves of a genotype estimated by the proposed model (cm²); and $\sum_{i=1}^{n} A_{r,i}$ is the sum of the measured leaf areas of all the leaves of that genotype (cm²).
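The two error measures can be sketched in a few lines. The sketch assumes that the total error E is the signed sum of the differences between estimated and measured areas (consistent with the table notes, where a positive E means overestimation) and that RE follows the ratio described above. The function names and the three example leaves are hypothetical.

```python
def total_error(estimated, measured):
    """Signed sum of (estimated - measured) leaf areas, in cm^2.
    Positive means the model overestimated; negative, underestimated."""
    return sum(am - ar for am, ar in zip(estimated, measured))

def relative_error(estimated, measured):
    """Relative error in percent: difference of the sums over the measured sum."""
    total_measured = sum(measured)
    return 100.0 * (sum(estimated) - total_measured) / total_measured

# Made-up areas for three leaves of one genotype (cm^2).
measured_areas = [10.0, 20.0, 30.0]
estimated_areas = [11.0, 19.0, 33.0]

e = total_error(estimated_areas, measured_areas)      # 1 - 1 + 3 = 3 cm^2
re = relative_error(estimated_areas, measured_areas)  # 100 * 3 / 60 = 5 %
```

Because both measures are signed, over- and underestimates on individual leaves can cancel out, so a small E or RE does not by itself guarantee a small error on each leaf.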

Adjusted models
The mathematical models with the best fits were linear and potential (power) models, so they were selected for more detailed analysis. For genotypes with single leaves, three models were chosen: one linear and two potential. For genotypes with compound leaves, six were chosen: three linear and three potential (Table 2).

Models for genotypes with single leaves
All equations fitted individually for the genotypes with single leaves had r² above 0.9 (Table 3). The constant μ of Model 2 (simple leaf blade) tended to unity, showing that, regardless of leaf shape, leaf area is approximately 70% of the area of the rectangle (L·W) and that the potential model brings no gain in accuracy. When only the length of the midrib was used as the independent variable (Model 3), the lowest value of the constant μ was approximately 1.8, which characterizes the relationship as potential (Table 3).
As shown in Figure 1, the models fitted to the three leaf groups (simple leaves) explained the variation in the data very well, with excellent fits (r² ≥ 0.99). The responses of the models were close to one another in the part of the abscissa corresponding to small leaves (L·W ≤ 30 cm², L ≤ 5 cm) (Figure 1A to C). Consequently, grouping by the ratio (L/W)² improves the estimates of leaf area (LA) mainly for larger leaves, the range in which the models diverge most, regardless of the genotype tested. Grouping leaves by (L/W)² even made it possible to distinguish accessions selected from the same genotype, as in the case of Rangpur lime (RL), in which the selections Aluminum 01 and 02 (Group 3) fell in a different group from the Santa Cruz selection (RLSTC) (Group 2) (Table 3).
The estimation errors for each genotype, obtained with the models fitted for each group (Figure 1), are presented in Table 4. For the linear model (Y = β·(L·W)), the errors were lower than for the two potential models (Models 2 and 3), with RE ranging from 0.62% for RL Aluminum 02 to 5.31% for SMFL x CTC 13-012, and the average deviation was the lowest among the three models tested (Table 4). The third model, which used only the length (L) as the independent variable, proved comparatively less precise, especially for the genotypes in Groups 2 and 3. That result indicates that both variables, L and W, are needed to estimate the area of single leaves with greater precision, regardless of the group. Responses differed by genotype, with a minimum relative error (RE) of 0.05% for SO and a maximum of 20.73% for SM x CTARG -044, proportionally different from the models that used L·W. For the simple leaf format, the most appropriate model was the linear one (Y = β·(L·W)). Its advantages are the high precision of the estimates and the ease of practical application, confirming and justifying its widespread use for estimating leaf area in different plant species (Blanco and Folegatti, 2005; Coelho Filho et al., 2005, 2012; Cristofori et al., 2007; Malagi et al., 2010; Sousa et al., 2014; Souza and Amaral, 2015).
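The grouping procedure for simple leaves can be sketched as follows. The group boundaries and the group-specific β values are not given in the text (they are in Table 3 and Figure 1), so the cut-offs and coefficients below are placeholders chosen only for illustration, as is the assumption that group number increases with (L/W)².

```python
def shape_ratio(length, width):
    """Squared length-to-width ratio, (L/W)^2, used to group leaf shapes."""
    return (length / width) ** 2

def assign_group(ratio, cutoffs):
    """Return a 1-based group index; cutoffs are ascending upper bounds."""
    for index, upper in enumerate(cutoffs, start=1):
        if ratio <= upper:
            return index
    return len(cutoffs) + 1

def estimate_area(length, width, cutoffs, betas):
    """Linear model Y = beta * L * W with a group-specific beta."""
    group = assign_group(shape_ratio(length, width), cutoffs)
    return betas[group - 1] * length * width

# Placeholder values, NOT taken from the study.
example_cutoffs = [3.0, 5.0]          # hypothetical (L/W)^2 bounds for Groups 1 and 2
example_betas = [0.72, 0.70, 0.68]    # hypothetical beta per group

# A leaf 8 cm long and 4 cm wide has (L/W)^2 = 4, so it falls in Group 2 here.
area = estimate_area(8.0, 4.0, example_cutoffs, example_betas)
```

Passing the cut-offs and coefficients as arguments keeps the sketch independent of any particular table of fitted values.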

Models for compound leaves genotypes
The mathematical models tested fitted well for all genotypes, as shown by r² values ≥ 0.84 (Tables 4 and 5). The choice of the ratio (L2 + L3)/L1, originally based on visual observations of variability, was supported by its high correlation with the constant β of Model 1 (Table 5), with a Spearman's correlation coefficient of 0.98 (figure not shown).
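For readers who want to repeat this kind of check, Spearman's coefficient is the Pearson correlation of the ranks. The sketch below implements it with the standard library only; the ratio and β values are made up for illustration and are not the values in Table 5.

```python
def ranks(values):
    """Average ranks (1-based), with ties sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Made-up values: foliole ratio (L2 + L3)/L1 and fitted beta per genotype.
foliole_ratios = [1.2, 1.5, 1.8, 2.0]
fitted_betas = [0.60, 0.65, 0.72, 0.80]
rho = spearman(foliole_ratios, fitted_betas)  # perfectly monotonic example
```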
Considering only the linear models (Table 5), the values of the constant β varied least in the third model proposed; therefore, the model was sensitive to changes in the shape of the leaves. Conversely, the variation was greatest for the first model. These results probably reflect the number of variables used in each model.
Analyzing the estimates of leaf area within each genotype, based on the models fitted for each group (Figure 1), the larger number of independent variables used in Model 2 was reflected in higher coefficients of determination, except for the genotype SM x (RL x TR) -016, for which the third model gave the best fit (Table 5). The larger number of independent variables in Model 2 possibly increased its sensitivity regardless of the leaf group (1, 2 and 3), as shown by the similar slopes obtained (Figure 1E). That result suggests that a single average value could be used, regardless of grouping.
Constants β and μ used in the models, their respective coefficients of determination, and grouping of genotypes (compound leaves). Model 4: ( ), Model 5: ( ), Model 6: ( ).
Table 7. Sum of errors, relative errors and coefficient of correlation between the area of each leaf and the estimated area for the linear models of genotypes with compound leaves. Estimates are based on the specific model for each group presented in Figure 1.

Considering that observation, a single regression was performed with the data from the 11 genotypes with compound leaves, based on that model, Y = β·(L1·W1 + L2·W2 + L3·W3). The value of β was 0.776, and the model explained the observed values very well, with a coefficient of determination of 0.976 (figure not shown). In that case, because the response is independent of genotype, not having to consider groupings is an advantage. However, the model requires a greater number of independent variables, which can restrict its practical use when the goal is to take a large number of measurements.
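As a worked illustration, the pooled model can be applied directly to the six measurements of a trifoliate leaf. The sketch takes the model form Y = β·(L1·W1 + L2·W2 + L3·W3) from the abstract and β = 0.776 from the text above; the function name and the example leaf dimensions are hypothetical.

```python
POOLED_BETA = 0.776  # value reported for the single regression over 11 genotypes

def compound_leaf_area(l1, w1, l2, w2, l3, w3, beta=POOLED_BETA):
    """Estimate trifoliate leaf area (cm^2) from the foliole lengths and widths (cm).

    l1, w1: central foliole; l2, w2 and l3, w3: lateral folioles.
    """
    return beta * (l1 * w1 + l2 * w2 + l3 * w3)

# Made-up leaf: central foliole 5.0 x 2.5 cm, lateral folioles 3.0 x 1.5 cm each.
# Sum of rectangles: 12.5 + 4.5 + 4.5 = 21.5 cm^2, times 0.776.
area = compound_leaf_area(5.0, 2.5, 3.0, 1.5, 3.0, 1.5)
```

The example makes the practical cost visible: six measurements per leaf are needed, against two for the model that uses only the central foliole.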
The similarity of the results obtained with the linear models (Table 7), with the average deviation of the relative error (RE) ranging from 1.01 to 1.48 and the average deviation of the error (E) ranging from 4.97 to 9.43, justifies the use of Model 1 (Y = β·L1·W1) because of its greater simplicity and practicality, confirming once more its widespread use by different authors.
Analyzing the potential models, the constant μ for the leaf groups in Model 4 was lower than one (Table 6), suggesting that the rate of increase of the estimated leaf area falls as leaf length increases (Figure 1G), which can cause large errors when estimating the area of long leaves. In the fifth and sixth models, because the exponents are greater than one (Table 6), the slope of the tangent to the curve increases as the input variable increases, the opposite of what happened in the fourth model (Figure 1G to I).
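The contrast between exponents below and above one can be made explicit by differentiating a generic potential model. The derivation assumes a model of the form Y = β·x^μ with β > 0 and x > 0, where x is an illustrative symbol for whichever input variable the model uses.

```latex
\begin{align}
Y &= \beta x^{\mu}, \\
\frac{dY}{dx} &= \beta \mu x^{\mu - 1}, \\
\frac{d^{2}Y}{dx^{2}} &= \beta \mu (\mu - 1) x^{\mu - 2}.
\end{align}
```

For 0 < μ < 1 the second derivative is negative, so the slope of the curve decreases as x grows; for μ > 1 it is positive, so the slope increases with x.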
Despite the high accuracy of the estimates obtained individually for the genotypes with the potential Models 4, 5 and 6, according to the coefficients of determination (Table 8), the statistical parameters 'average error' and 'standard error' show greater precision and accuracy for the linear models (Tables 7 and 8).
In general, the estimates of leaf area for all genotypes using the six proposed models resulted in high coefficients of determination (> 0.88). The exceptions were SM x CTTR -002 and SM x (RL x TR) -016, with values of 0.78 and 0.84, respectively, both for the sixth model. Among the potential models alone, the fifth model gave the best fit, with lower estimation errors (E and RE) and a higher r² (Table 8). Among the linear models, given the similar errors and high r² values, the model Y = β·L1·W1 is attractive because it is easier to apply in practice, which favors larger samples in studies of plant growth. Comparatively small gains in accuracy can be obtained with linear Models 2 and 3, despite the larger number of variables to be measured. The potential models, compared with the linear ones, can give higher estimation errors for some genotypes.
Different types of mathematical models have been generated for different plant species and leaf types. However, the models are developed for specific species and are usually restricted to a few varieties or leaf forms, as in Coelho Filho et al. (2005, 2012), Malagi et al. (2010), Souza and Amaral (2015) and Toebe et al. (2012). In the present study, different mathematical models for estimating leaf area, with different levels of accuracy, were developed. They have the advantage of being applicable to any citrus genotype, requiring only knowledge of the biometric relations that differentiate leaf shape.

Conclusions
The greatest precision of the estimates is achieved when using specific models for each leaf type (simple and compound) and separating these types into homogeneous groups according to the dimensions of the leaves and folioles. For simple and compound leaves, the respective linear models (Y = β·L·W; Y = β·L1·W1) showed the best statistical performance, besides being easy to use. The potential models Y = β·L^µ and Y = β·L1^µ, for simple and compound leaves respectively, require only one biometric input variable, which in practice allows an increase in the number of repetitions, but provide errors

Figure 1. Regressions fitted for grouped genotypes (A to C, single leaves; D to I, compound leaves) and their coefficients of determination (-•- Group 1*, - Group 2**, --- Group 3***; L, maximum length of the leaf; W, maximum width of the leaf; L1, maximum length of the central foliole; L2 and L3, maximum lengths of the lateral folioles; W1, maximum width of the central foliole; W2 and W3, maximum widths of the lateral folioles; the Y axis has the same scale in panels A to C and in panels D to I).

Table 1. Genotypes used for leaf area estimation according to the leaf blade form.

Table 2. Description of the models obtained, where β and μ are constants estimated by the Table Curve software.

Table 3. Number of leaves (NL) and total leaf area (TLA, cm²) of genotypes; constants β and μ of the models; coefficients of determination (r²); ratio of length to width raised to the second power, (L/W)²; and grouping of genotypes (single leaves).
* ME, mean absolute and relative error; E, error (if positive, the model overestimated and, if negative, the model underestimated the leaf area); R.E., relative error (the percentage by which the model over- or underestimated); Model 1: Y = β·(L·W); Model 2: Y = β·(L·W)^μ; Model 3: Y = β·L^μ.

Table 4. Sum of the errors of the estimation of leaf area (E) of genotypes with single leaves, relative errors (RE) and coefficient of determination (r²). The estimates were based on the models proposed for each group presented in Figure 1. *ME, mean absolute and relative error; E, error (if positive, the model overestimated and, if negative, the model underestimated the leaf area); R.E., relative error (the percentage by which the model over- or underestimated); Model 1:

Table 8. Sum of errors, relative errors and coefficient of determination between the average area of each leaf and the estimated area for the potential models of genotypes with compound leaves. Estimates are based on the model specific to each group presented in Figure 1. *ME, mean absolute and relative error; E, error (if positive, the model overestimated and, if negative, the model underestimated the leaf area); R.E., relative error (represents the percentage of over- or underestimate of