Modeling of hypsometric distribution of Handroanthus heptaphyllus seedlings in different containers

Hypsometric information on seedlings supports more assertive silvicultural decisions in nurseries. This study aimed to evaluate the efficiency of different Probability Density Functions (PDF) to estimate and compare the height distribution of Handroanthus heptaphyllus seedlings in different cultivation containers. One hundred and thirty-one seedlings were produced in two types of containers: 81 in tubes and the remainder in plastic bags. The census was conducted at 122 days of age, measuring the total height of the seedlings with a millimeter ruler. The seedlings did not present a tender or brittle stem. Data were grouped into biometric classes with regular intervals of 2.5 cm in height. Seven PDF were fitted by the maximum likelihood method, and the one with the best predictive performance was selected to test the statistical equality between the distributions estimated for seedling size in each container. The order of predictive efficiency of the PDF differed between containers. The two-parameter Weibull function can be used to model the height distribution of H. heptaphyllus seedlings at 122 days of age produced in plastic bags and tubes. The hypsometric distribution differed between containers.


INTRODUCTION
Handroanthus heptaphyllus (Vell.) Mattos (Bignoniaceae), commonly known as ipê-roxo and pau-d'arco, is a tree species with records of occurrence in almost all Brazilian territory, from the South to the Northeast (Oliveira et al., 2015; Dullius et al., 2016). It has been widely used for timber, due to its heavy wood (1.07 g cm⁻³), as well as for urban afforestation and the restoration of legal reserves and permanent preservation areas (Mori et al., 2012; Oliveira et al., 2015). In the production of seedlings of native species, successful planning depends on the quantity and quality of available information. Hypsometric distribution modeling is an important statistical technique capable of supporting silvicultural decisions, taking the choice of culture containers as an example.
*Corresponding author. E-mail: bruno.lafeta@ifmg.edu.br. Tel: +55 33 3412 2925.
Author(s) agree that this article remain permanently open access under the terms of the Creative Commons Attribution License 4.0 International License.
The choice of containers for seedling production should take into account various technical and financial criteria. Plastic bags and tubes are the most recommended containers for seedling production in Brazil, each with its own advantages and limitations (Pias et al., 2015). Plastic bags have a low acquisition cost, but require more production space and more labor to handle the seedlings, and they tear easily under inadequate handling (Teixeira et al., 2009). An alternative that facilitates the sequencing of operations in nurseries is the use of reusable polypropylene tubes, which have internal ribs and a lower hole (open bottom) for directing root growth and minimizing problems with root folding (Dominguez-Lerena et al., 2006; Ferraz and Engel, 2011). Containers shape the root system, protect the roots from mechanical damage, and provide support for nutrition. Despite their greater demand for water and fertilizers, voluminous containers provide more space for root development and, in some cases, intensify the growth rate of seedlings (Dominguez-Lerena et al., 2006; Pias et al., 2015). Voluminous containers are routinely used for the seed propagation of native species, because the growth pattern and distribution of their root systems are poorly known.
Height and its ratio to shoot dry mass are excellent attributes for the qualitative evaluation of seedlings (Gomes et al., 2002). Height is a biometric attribute that allows inferences about the potential performance of seedlings in a fast, objective and non-destructive way. In addition, taller seedlings tend to be more successful in establishing and surviving in the field (Landergott et al., 2012; Westfall and McWilliams, 2017).
It is common to represent large datasets by means of generalist measures of position and dispersion, neglecting information related to their distribution. Detailed information on the hypsometric structure of seedlings allows greater precision and assertiveness of silvicultural decisions by individual owners and large forest companies, such as the application of a nutritional and logistics management plan. The adjustment of biometric distribution models allows the estimation of the number of seedlings within intervals or size classes, with a lower and upper limit (Amaral et al., 2009;Rana et al., 2017). The optimization of the input allocation for the growth of certain seedlings with interest size rationalizes the application of correctives and fertilizers.
Modeling of biometric distributions has gained increasing importance due to its contribution to the forest enterprises planning, whose focus is the obtaining of multiproducts. In the specific case of nurseries, the seedlings expedition for different purposes (multiproducts) is a practice that must be analyzed from an operational and logistic point of view in order to maximize yields. A common approach in distribution models is the use of statistical probability functions, known as Probability Density Functions (PDF), to characterize the size structure of a set of plants (Tsogt and Lin, 2014;Diamantopoulou et al., 2015).
Hypsometric distribution of seedlings is based on a histogram of height frequency and is expected to take different forms according to species, age and silvicultural treatments. Despite the intensive use of containers in nurseries, detailed research on their implications for the hypsometric distribution of plants is still lacking. Given the above, the following questions were addressed: (i) Is the Weibull function flexible enough to model the height distribution of seedlings? (ii) Are the hypsometric distributions of seedlings cultivated in tubes and plastic bags similar? The objective was to evaluate the efficiency of different PDF to estimate and compare the height distribution of H. heptaphyllus seedlings in different culture containers.

MATERIALS AND METHODS
Fruits used in the present work came from routine collections conducted by the Federal Institute of Education, Science and Technology of Minas Gerais (IFMG) in the municipality of Governador Valadares-MG, from June to September 2017. The trees selected for collection had vigorous crowns and no apparent signs of pest or insect attack. The collection region has a humid tropical savanna climate, classified as Aw in the Köppen International System (rainy summer and dry winter).
The collected fruits were stored in Kraft paper bags and processed by hand at the seedling nursery of the IFMG São João Evangelista-MG campus, located at 18°33'11" south latitude and 42°45'10" west longitude (Datum WGS84). The climate is classified as Cwa (dry winter and rainy summer), with an average minimum temperature of 22°C and average maximum of 27°C per year, annual average rainfall of 1,180 mm and altitude of 730 m. Processing consisted of separating the seeds from the fruits and discarding material with any atrophy or injury.
The nursery produced 131 seedlings of H. heptaphyllus in two types of containers: 81 units in 280 cm³ tubes (6.5 cm outside diameter, 5.2 cm inner diameter, 19 cm length, 8 internal ribs) and the rest in perforated plastic bags (height 30 cm, width 25 cm). The seedlings were cultivated in a shade house, covered with sombrite (50% mesh), until 30 days after sowing and were irrigated three times a day for 15 min (nozzle flow rate of 103 L h⁻¹). Afterwards, the seedlings were conditioned in the open air (full sun) for 92 days, irrigated four times a day for 10 min (nozzle flow rate of 118 L h⁻¹). The seedlings did not present a tender or brittle caulinar system.
[Table 1. Columns: Name, Probability Density Functions. Rows include Weibull 2P and Logistic 2P.]
The census was carried out by measuring the total height (H, cm) of all seedlings produced with a millimeter ruler. Total height was defined as the linear distance from the root collar to the last leaf. Data were submitted to descriptive statistical analysis (minimum, mean, median, mode, maximum, coefficient of variation and, by the method of moments, asymmetry and kurtosis). The unpaired t test was applied to compare the mean heights of the seedlings produced in the different container types.
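The descriptive measures and the comparison of means described above can be sketched as follows. This is a minimal illustration in Python (the study itself used R), and it rests on assumptions: the height values are hypothetical, kurtosis is taken as excess kurtosis (so that positive values indicate a leptokurtic distribution), and the unpaired t test is assumed to use a pooled variance, which the text does not specify.

```python
import math
import statistics

def moment_skewness(values):
    """Asymmetry by the method of moments: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def moment_excess_kurtosis(values):
    """Excess kurtosis by the method of moments: m4 / m2**2 - 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0

def unpaired_t(a, b):
    """Student's t statistic and degrees of freedom (pooled variance assumed)."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se, na + nb - 2

# Hypothetical heights (cm), not the data of the study.
tubes = [18.0, 21.5, 23.0, 24.5, 25.0, 26.0, 26.5, 27.0]
bags = [34.0, 39.5, 42.0, 44.0, 45.5, 47.0, 48.5, 50.0]

print(moment_skewness(tubes), moment_excess_kurtosis(tubes))
print(unpaired_t(tubes, bags))
```

A negative skewness value corresponds to the longer left tail discussed in the results; the t statistic would still need to be compared with the critical value at the 1% level.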
Data were grouped into biometric classes with regular intervals of 2.5 cm in height. The functions tested were: two-parameter Weibull (Weibull 2P); two-parameter Logistic (Logistic 2P); two-parameter Log-logistic (Log-logistic 2P); Cauchy; two-parameter Gamma (Gamma 2P); Normal; and Log-normal. All functions were fitted by the maximum likelihood method, using the Nelder-Mead optimization algorithm. The PDF are listed in Table 1.
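The grouping into 2.5 cm classes and the maximum likelihood fit of the Weibull 2P function can be illustrated with the sketch below. It is not the procedure of the study: instead of the Nelder-Mead search, it solves the profile-likelihood equation for the shape parameter by bisection, which reaches the same maximum likelihood estimates for this particular function. The function names, the search bounds and the assumption that classes start at multiples of 2.5 cm are illustrative.

```python
import math

def class_counts(heights, width=2.5):
    """Count seedlings per height class [lower, lower + width)."""
    counts = {}
    for h in heights:
        lower = math.floor(h / width) * width
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))

def fit_weibull_2p(heights, lo=0.05, hi=100.0, iters=200):
    """Maximum likelihood estimates (shape, scale) of a two-parameter Weibull.

    Bisection on the profile-likelihood equation for the shape parameter,
    used here in place of the Nelder-Mead search cited in the text.
    """
    n = len(heights)
    logs = [math.log(h) for h in heights]
    mean_log = sum(logs) / n
    h_max = max(heights)

    def score(k):
        # Heights are scaled by h_max so that large powers do not overflow.
        w = [(h / h_max) ** k for h in heights]
        return sum(wi * li for wi, li in zip(w, logs)) / sum(w) - 1.0 / k - mean_log

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:   # score increases with k, so the root lies below mid
            hi = mid
        else:
            lo = mid
    shape = 0.5 * (lo + hi)
    scale = h_max * (sum((h / h_max) ** shape for h in heights) / n) ** (1.0 / shape)
    return shape, scale

# Hypothetical heights (cm), not the data of the study.
heights = [18.0, 21.5, 23.0, 24.5, 25.0, 26.0, 26.5, 27.0]
print(class_counts(heights))
print(fit_weibull_2p(heights))
```

The closed-form score equation exists only for the Weibull function; a general-purpose optimizer such as Nelder-Mead has the advantage of handling all seven functions with the same code.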
Fit quality was evaluated by the Mean Absolute Error (MAE), Pearson correlation coefficient (r), Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC). Lower values of MAE, AIC and BIC imply higher predictive quality.
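The four fit statistics can be written compactly, as in the sketch below. It uses the standard definitions (AIC = 2p - 2 ln L, BIC = p ln n - 2 ln L); whether the study computed MAE and r on class frequencies or on individual heights is not stated, so the inputs are left generic. The Weibull log-likelihood is included only as an example of how ln L would be obtained.

```python
import math

def mean_absolute_error(observed, estimated):
    """Mean of the absolute differences between observed and estimated values."""
    return sum(abs(o - e) for o, e in zip(observed, estimated)) / len(observed)

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def weibull_log_likelihood(heights, shape, scale):
    """Log-likelihood of a two-parameter Weibull for individual heights."""
    return sum(
        math.log(shape / scale)
        + (shape - 1) * math.log(h / scale)
        - (h / scale) ** shape
        for h in heights
    )

def aic(log_lik, n_params):
    """Akaike Information Criterion."""
    return 2 * n_params - 2 * log_lik

def bic(log_lik, n_params, n_obs):
    """Bayesian Information Criterion."""
    return n_params * math.log(n_obs) - 2 * log_lik
```

Because BIC penalizes each parameter by ln n rather than 2, the two criteria can rank functions differently when the number of parameters differs; all two-parameter functions compared on the same data receive the same penalty.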
Adherence of the functions to the data was evaluated by the Kolmogorov-Smirnov test (Gibbons and Subhabrata, 1992). It is a test that compares the estimated cumulative frequency with the observed frequency. The point of greatest divergence between distributions is the test statistic value (dn). In addition, graphical analysis was performed between values observed and estimated by obtained equations.
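The point of greatest divergence between the cumulative distributions can be computed as in the following sketch. It implements the usual one-sample form of the Kolmogorov-Smirnov statistic on ungrouped heights; the text does not detail how dn was obtained from the grouped data, so the values produced here are not expected to reproduce those reported. The Weibull parameters are hypothetical.

```python
import math

def weibull_cdf(h, shape, scale):
    """Cumulative probability P(H <= h) for a two-parameter Weibull."""
    return 1.0 - math.exp(-((h / scale) ** shape))

def ks_statistic(heights, cdf):
    """Largest absolute gap between the empirical and the estimated CDF."""
    xs = sorted(heights)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x; check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

# Hypothetical heights (cm) and parameters, not those of the study.
heights = [18.0, 21.5, 23.0, 24.5, 25.0, 26.0, 26.5, 27.0]
dn = ks_statistic(heights, lambda h: weibull_cdf(h, 9.0, 25.0))
print(dn)
```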
For the function with the best predictive efficiency, the model identity test (Graybill, 1976) was applied to assess the statistical equality between the distributions estimated for the size of seedlings produced in each container. This hypothesis was tested by the F statistic, whose non-rejection (FH0 < Fα) allows admitting that the frequency distributions do not differ from each other.
In order to diagnose statistical effect, the significance level of 1% was adopted in all analyses. These were carried out using software R 3.3 (R Core Team, 2017).

RESULTS
The height distribution of H. heptaphyllus seedlings produced in tubes was more asymmetrical than that of seedlings from plastic bags; both distributions presented negative asymmetry, with a longer left tail (Table 2). Leptokurtic (positive kurtosis) and platykurtic (negative kurtosis) behaviors were observed in the height distributions for tubes and plastic bags, respectively. Seedlings in plastic bags exhibited a greater range (absolute difference between the maximum and minimum values) and a greater coefficient of variation of height. The mean, median and mode values were close to each other, indicating a trend towards normality.
By the t test, the average height of the seedlings produced in plastic bags was higher than that of seedlings from tubes. For tubes, all fits presented adherence by the Kolmogorov-Smirnov test (mean of the test statistic of 0.47 ± 0.11). For plastic bags, adherence was verified only for the Weibull 2P (dn = 0.64) and Cauchy (dn = 0.64) fits; the mean test statistic for the other functions was 0.80 ± 0.09. The fitted functions showed small deviations, with low MAE values (Table 3). In general, the correlation coefficients were high (above 0.90) for seedlings produced in tubes. The correlation coefficients were significant (p ≤ 0.01) in all fits. The efficiency ranking of most fitted functions differed between containers. According to the AIC and BIC criteria, the increasing order of predictive efficiency for the height of seedlings produced in tubes was: Cauchy, Log-normal, Gamma 2P, Log-logistic 2P, Normal, Logistic 2P and Weibull 2P. For plastic bags, the order changed to: Cauchy, Log-logistic 2P, Log-normal, Gamma 2P, Logistic 2P, Normal and Weibull 2P. The predictive superiority of the Weibull 2P function was also confirmed by its lower MAE value, high correlation coefficient and adherence. The parameters obtained for the PDF (Table 2) were then used to estimate the relative frequency of seedlings per height class. The negative asymmetry of the height data (Table 2) was evident in the observed frequency distributions (Figure 1). The observed height frequencies showed no discontinuity (absence of seedlings in one or more biometric classes) within the intervals formed by the minimum and maximum limits (Table 2).
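The step from fitted parameters to estimated relative frequencies per height class can be sketched as below: the frequency of a class is the difference of the cumulative distribution between its upper and lower limits. The shape and scale values and the class limits are hypothetical, chosen only to make the example run; they are not the estimates of the study.

```python
import math

def weibull_cdf(h, shape, scale):
    """Cumulative probability P(H <= h) for a two-parameter Weibull."""
    if h <= 0:
        return 0.0
    return 1.0 - math.exp(-((h / scale) ** shape))

def expected_class_frequencies(lower_limits, width, shape, scale):
    """Relative frequency estimated for each class [lower, lower + width)."""
    return [
        weibull_cdf(lo + width, shape, scale) - weibull_cdf(lo, shape, scale)
        for lo in lower_limits
    ]

# Hypothetical parameters and class limits (cm), not the fitted values of the study.
shape, scale = 6.0, 25.0
lowers = [10.0 + 2.5 * i for i in range(10)]   # classes from 10.0 to 35.0 cm
rel_freq = expected_class_frequencies(lowers, 2.5, shape, scale)

n_seedlings = 81                               # seedlings produced in tubes
expected_counts = [n_seedlings * p for p in rel_freq]
for lo, count in zip(lowers, expected_counts):
    print(f"{lo:5.1f} to {lo + 2.5:5.1f} cm: {count:5.1f} seedlings")
```

Multiplying the relative frequencies by the number of seedlings gives the expected count per class, which is the quantity compared with the observed histogram.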
Given its best predictive efficiency, the Weibull 2P function was selected for graphical analysis (Figure 2) and subsequent comparison. Graybill's F test showed a statistically significant difference (FH0 ≥ Fα) between the estimated height distributions of the seedlings for the two types of containers. The curve for the plastic bags was displaced further to the right, showing greater dispersion (standard deviation of 7.03 cm) and a higher central tendency (average of 43.81 cm).

DISCUSSION
Providing accurate estimates requires good function adherence to structure data. The adherence of the PDF demonstrated potential to describe the hypsometric structure of seedlings produced in tubes. However, not all of them were efficient to model the height distribution of seedlings from plastic bags. The values of dn statistic (Kolmogorov-Smirnov) were smaller with use of tubes; variation amplitude of this statistic was from 0.29 to 0.57 for the tubes and from 0.64 to 0.82 for the plastic bags. The smaller number of seedlings produced in plastic bags, together with the greater variation amplitude of height (25.60 cm), may have negatively affected the predictive performance of PDF Logistic 2P, Log-logistic 2P, Gamma 2P, Normal and Log-normal. It is important to emphasize that other statistical approaches, such as the use of artificial intelligence and regularized regressions (Binoti et al., 2013;Castro et al., 2013;Binoti et al., 2014;Kadyrova and Pavlova, 2014;Diamantopoulou et al., 2015;Chai et al., 2016), can be applied in order to improve predictive quality in complex database modeling. The most common regularization methods for machine learning focused on regression problems are the Ridge and Least Absolute Shrinkage and Selection Operator (LASSO) (Chai et al., 2016). Biological network-regularized logistic models are examples that have been extensively used in the genomic area (Zhang et al., 2013;Huang et al., 2015;Chai et al., 2016;Huang et al., 2016;Kang et al., 2017), but its application has not yet been found in Brazilian silviculture.
It is emphasized that hypsometric distribution can be obtained by measuring height of all seedlings produced (census) or a representative subset of them (sampling). The census for a specific species is justified for a reduced number of seedlings, which is common in nurseries that propagate native species and prioritize the diversification of production. On the other hand, sampling techniques may be indicated to characterize the biometric distribution of large-scale production of a given species. The graphical analysis showed a greater clarity in the judgment of the height distributions estimated by PDF. When analyzed in Figure 1, Weibull 2P function was the one that best represented data series. All functions underestimated the number of seedlings produced in  plastic bags in the extreme height classes. It is easy to notice the non-adherence of those functions whose null hypothesis was rejected by the Kolmogorov-Smirnov test. Although Weibull 2P and Cauchy PDF were the only ones whose adherence was verified to describe the height of seedlings in both tubes and plastic bags, the first showed a better predictive efficiency of hypsometric distribution of H. heptaphyllus seedlings (Table 3 and Figure 1). Adjustment statistics showed that the scale and shape parameters of the Weibull 2P function were not biased. Scale parameter represents the amplitude of the distribution and it was smaller when using tubes for seedlings production, consistent with the limits shown in Table 2. The use of plastic bags resulted in a distribution relatively closer to the axis of symmetry. The values of shape parameter denoted negative asymmetry, revealing an accumulation of larger size seedlings. This assumption was based on the premise that shape parameter (values above 3.6) increases as the asymmetry becomes progressively more negative (Bailey and Dell, 1973).
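The relation between the shape parameter and the sign of the asymmetry can be checked numerically from the standard skewness formula of the Weibull distribution, which depends only on the shape parameter. The sketch below is an illustration with arbitrary shape values, not a reproduction of the estimates of the study.

```python
import math

def weibull_skewness(shape):
    """Skewness of a two-parameter Weibull; it does not depend on the scale."""
    g1 = math.gamma(1 + 1 / shape)
    g2 = math.gamma(1 + 2 / shape)
    g3 = math.gamma(1 + 3 / shape)
    variance = g2 - g1 ** 2
    return (g3 - 3 * g1 * g2 + 2 * g1 ** 3) / variance ** 1.5

# Arbitrary shape values chosen to bracket the change of sign.
for shape in (2.0, 3.6, 5.0, 8.0):
    print(shape, round(weibull_skewness(shape), 3))
```

Skewness is positive for small shape values, close to zero near 3.6, and increasingly negative above that value, which is the premise used to interpret the shape estimates.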
The choice of the function for modeling distributions determines the accuracy of the estimates. The evaluation of biometric distribution models should consider interpretations of qualitative nature (biological realism) and quantitative (statistical). The Weibull 2P function proved to be an efficient probabilistic model capable of accurately representing reality, even on occasion with reduced data and large variation amplitude. Currently, this function is intensely adjusted in forest area due to its flexibility to assume different asymmetries and shapes, modeling several distribution tendencies, from normal to exponential (Bailey and Dell, 1973;Leite et al., 2010;Tsogt and Lin, 2014;Diamantopoulou et al., 2015). The ability to model distributions with different asymmetries and shapes was confirmed in the present work (Table 3 and Figure 1). There is a consensus of the superiority of the Weibull function over the other PDF for the description of biometric attributes, above all, the diametrical structure of trees (Campos and Leite, 2017). However, no studies have been found using the function to describe the hypsometric structure at nursery level.
Comparing the hypsometric distributions estimated by the Weibull 2P function in each container (Figure 2), rejection of the similarity hypothesis was recorded (F H0 ≥ F α ). This difference between distributions hinders the predictive performance of a possible fit for the entire data set, and it is indicative that the seedlings in some cases require different management per container. In terms of mean, the difference in height was also observed by the t test. The definition of stratification criteria for PDF adjustment, as silvicultural treatments, provides detailed information about the effect of the management in seedlings production in commercial and technological routine.
Size is crucial for the establishment of seedlings in the field or in urbanized areas, and small differences may influence the establishment and/or dominance of the species (Dominguez-Lerena et al., 2006). The classification of nursery seedling lots in terms of height is a quality control methodology (Gomes et al., 2002). The fit of the hypsometric distribution for each container showed that it is possible to outline the quality of seedlings using PDF, because the larger the value of the scale parameter, the greater the growth rate. In this perspective, the seedlings produced in plastic bags (distribution more symmetric and with a lower value of the shape parameter) presented higher growth potential in height than those from tubes. It is emphasized that the standardization of an index, or reference, age is essential for making inferences about the classification of productive potential (Campos and Leite, 2017). In the presence of seedlings with different ages or periodic measurements, regression models can be fitted to estimate height at the index age.
Knowledge of productive potential stimulates the search for silvicultural strategies that minimize reducing factors and limiting plant development. The containers used in seedlings production influenced the growth rate of seedlings, corroborating Dominguez-Lerena et al. (2006) for Pinus pinea L. The H. heptaphyllus seedlings produced in plastic bags presented a higher growth rate, reducing their time in nursery and making possible the anticipated expedition in relation to those of tubes. This fact was probably due to the greater volume of plastic bags, which provide more nutrients, space for the development of the root system and water utilization, whose losses in tubes can reach 78% of the volume applied (Bomfim et al., 2009;Barroso et al., 2000). In addition, the depth of the plastic bags was 1.58 times greater than that of the tubes. According to Dominguez-Lerena et al. (2006), the depth of the container is one of the most important attributes that act on the morphology of seedlings.
A greater non-uniformity of height was observed among seedlings produced in plastic bags, with a flattened (platykurtic) distribution and greater data dispersion. The seedlings from tubes had lower hypsometric variation, resulting in greater production uniformity. The use of tubes proved to be a promising technique to homogenize seedling propagation. With regard to the choice of growing container to meet a rapid demand for seedlings, plastic bags can be a viable alternative, provided all the costs involved in the production chain are analyzed and waste is avoided.
The information of hypsometric structure of seedlings allows a better targeting of the inputs. The accelerated growth rate favors the production of high-quality seedlings, which are more valued for commercialization and frequently used in urban afforestation. Seedlings that grow faster require more attention to the application of silvicultural treatments, such as the additional application of N based on nutritional balance. As for the height, the plastic bags were the most suitable containers for the production of high-standard seedlings of H. heptaphyllus.
The most frequent intervals, formed by three consecutive size classes (Figure 2), concentrated 30.99% (20.00 to 27.50 cm) of the seedlings produced in tubes and 17.94% (42.50 to 50.00 cm) of those from plastic bags. Assuming a hypothetical scenario in which seedlings reaching at least 90% of the average height of the 10 tallest seedlings are suitable for production to a high standard, 40% (height ≥ 24.30 cm) and 28% (height ≥ 46.58 cm) of the seedlings from tubes and plastic bags, respectively, met this condition. Therefore, investment in fertilizers and larger containers, such as buckets, should be considered to maximize yields.
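The hypothetical scenario above reduces to a short calculation: take the mean height of the 10 tallest seedlings, apply the 90% factor to obtain the threshold, and count the seedlings at or above it. The sketch below applies the rule to observed heights; the same threshold could instead be evaluated on a fitted distribution. The function name and the sample values are illustrative assumptions, not the data of the study.

```python
def high_standard_share(heights, n_top=10, fraction=0.90):
    """Threshold (cm) and share of seedlings at or above it.

    The threshold is `fraction` times the mean height of the `n_top`
    tallest seedlings.
    """
    top = sorted(heights, reverse=True)[:n_top]
    threshold = fraction * sum(top) / len(top)
    share = sum(1 for h in heights if h >= threshold) / len(heights)
    return threshold, share

# Hypothetical heights (cm): 20 seedlings from 1 to 20 cm.
example = [float(h) for h in range(1, 21)]
threshold, share = high_standard_share(example)
print(f"threshold = {threshold:.2f} cm, share = {share:.0%}")
```

Under this rule, the thresholds of 24.30 cm and 46.58 cm reported in the text imply mean heights of the 10 tallest seedlings of about 27.0 cm and 51.8 cm, respectively.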
Seedlings with slow growth rates require more time to reach specific quality standards. Decisions to invest in the growth of these seedlings or to dispatch them immediately should be evaluated with caution, if possible considering market requirements and available inputs. Smaller seedlings that meet a certain level of quality can be targeted to medium and small-scale farmers, who traditionally have greater financial limitations for forest recovery (Ferraz and Engel, 2011; Pinto et al., 2011). Another target audience is companies with capital restrictions to start reforestation projects. Larger seedlings, in turn, are indicated for reforestation in places where there is a clear possibility of survival and establishment of the species (Pinto et al., 2011).
The detailed statistical analysis of biometric surveys performed in nursery for better control of seedling production is recommended, evaluating size distributions and general measures of position and dispersion. The hypsometric distribution is an indicator of growing stock (Amaral et al., 2009), with potential use in the planning and management of seedlings production, helping to define silvicultural strategies and logistic strategies in nurseries. It is important to point out that difficulties in adjusting biometric distributions have been solved due to advances in computer science and statistical techniques, making adjustments ever simpler and facilitating the choice of the best function in relation to the dataset (Lana et al., 2013).

CONCLUSION
The two-parameter Weibull function is efficient for modeling the height distribution of H. heptaphyllus seedlings at 122 days of age produced in tubes and plastic bags. This function is flexible and promising for fitting seedling height.
The hypsometric distribution is an efficient tool to classify seedlings and to support strategic decisions about logistic and silvicultural treatments in nurseries.
The hypsometric distribution of H. heptaphyllus seedlings at 122 days of age may differ between plastic bags and tubes.