The growth of a tree or a forest stand is of great value to a forest enterprise, because many decisions depend directly on this information, for instance, determining the optimal cutting age. This study aims to apply a new class of models to fit growth curves for diameter and height of Eucalyptus grandis X Eucalyptus urophylla seedling data. Data were collected from a trial conducted in a greenhouse at the Natural Resources Department of the School of Agriculture, Botucatu, São Paulo, Brazil. The experimental design was completely randomized with eight treatments and four replications. In this trial, the growth variables height and diameter were evaluated, being measured five and four times, respectively. The analysis was carried out with a mixed longitudinal model using a new approach based on the Box-Cox Normal (BCN) distribution, and this model was compared with models assuming normality of the data. The results revealed that the BCN mixed model gave results similar to those of the standard model for estimating growth curves; however, the BCN model gave the best fit according to the Akaike criterion, given the slight asymmetry in the data set. This approach is of great interest in the presence of outliers and when robust procedures for parameter estimation are needed.
The increase in the consumption of wood and its derivatives highlights the need for new seedling production technologies that deliver an appropriate standard of quality for establishing more productive forests.
In selecting seedlings for planting, the criteria are based on characteristics that generally do not determine the real quality, because the seeds vary according to the species, ecological sites, cultivation, transportation, distribution and planting. There are therefore several reasons for using tests to define standards of seedling quality, which can add values that are often required by the market (Gomes et al., 2002).
Survival, establishment, cultivation frequency and early forest growth are necessary ratings for the success of a forest enterprise, which is directly related to seedling quality at planting (Gomes, 2001).
In this context, the height/diameter ratio is one of the features used to evaluate forest seedling quality, because it reflects reserve accumulation, greater resistance and better fixation in the soil (Carneiro, 1995). Furthermore, average height and base diameter are two indicators of high plant quality that forestry companies consider relevant (Gomes et al., 1996).
Many authors have proposed linear and nonlinear mathematical models for growth in height and diameter, including Schumacher (1939), Bertalanffy (1951), Richards (1959), Prodan (1968), Machado (1979) and Silva (1986); among the best known are the Brody, von Bertalanffy, logistic and Gompertz models. These models have been widely and successfully used in many works, such as Machado et al. (1997), Dias et al. (2005), Oliveira et al. (2008), Teo et al. (2011) and Carvalho et al. (2014). Such models assume a normal distribution for the data, but this assumption generally does not hold.
Ferrari & Fumes (2017) proposed a new class of distributions, called the Box-Cox symmetric class, containing distributions that can model asymmetry and can also accommodate outliers. Special cases include the log-symmetric distributions (Vanegas & Paula, 2015), for instance the log-normal distribution, and the truncated symmetric distributions with support on the positive real line, such as the Box-Cox t (Rigby & Stasinopoulos, 2006), the Box-Cox Normal (Cole & Green, 1992; Stasinopoulos et al., 2008) and the Box-Cox power exponential distributions (Voudouris et al., 2012), among others.
Therefore, the aim of this work is to apply models that take this new distribution class into account to fit growth curves for height and diameter data from a forestry trial with Eucalyptus grandis X Eucalyptus urophylla hybrid seedlings, and to compare them with models that assume normality of the data.
This work arose from data collected in a trial carried out in a greenhouse of the Department of Natural Resources - Forestry Sciences, School of Agriculture, UNESP, Botucatu, São Paulo, Brazil (Bazzo, 2009). The experiment evaluated the effects of different substrates on the development of Eucalyptus grandis X Eucalyptus urophylla hybrid seedlings. Substrates composed of different proportions of sewage sludge and carbonized rice husk were compared with a standard substrate used in the greenhouse. The proportions of sewage sludge to carbonized rice husk were 100:0, 80:20, 60:40, 50:50, 40:60, 20:80 and 0:100. In this article, these proportions correspond to treatments one to seven, in the order shown above, and the eighth treatment was the standard substrate. The experiment followed a completely randomized design with eight treatments and four replications, each replication being formed by the average growth of 24 plants. The growth variables measured were height at five times (90, 105, 120, 135 and 150 days) and diameter at four times (105, 120, 135 and 150 days), both measured in centimeters.
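The layout of the trial described above can be sketched as a small data structure. This is only an illustration of the design (treatments, replications and measurement days as stated in the text); the names `TREATMENTS`, `design_rows`, `height_rows` and `diameter_rows` are hypothetical and not taken from the original analysis.

```python
from itertools import product

# Sewage sludge : carbonized rice husk proportions for treatments 1-7,
# in the order listed in the text; treatment 8 is the standard substrate.
PROPORTIONS = ["100:0", "80:20", "60:40", "50:50", "40:60", "20:80", "0:100"]
TREATMENTS = {i + 1: p for i, p in enumerate(PROPORTIONS)}
TREATMENTS[8] = "standard"

REPLICATIONS = range(1, 5)                # four replications per treatment
HEIGHT_DAYS = [90, 105, 120, 135, 150]    # five height measurements
DIAMETER_DAYS = [105, 120, 135, 150]      # four diameter measurements


def design_rows(days):
    """Return one (treatment, replication, day) tuple per expected record."""
    return list(product(TREATMENTS, REPLICATIONS, days))


height_rows = design_rows(HEIGHT_DAYS)
diameter_rows = design_rows(DIAMETER_DAYS)
print(len(height_rows), len(diameter_rows))  # 160 128
```

Each record corresponds to a plot average, so a complete data set would hold 8 x 4 x 5 height values and 8 x 4 x 4 diameter values.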
The eighth treatment, which was the standard substrate, presented a better growth performance.
The histograms for the height data by time in
This section describes the (non)linear mixed model approach used to model height and diameter, which was compared with a new approach based on the Box-Cox symmetric class (Ferrari & Fumes, 2017).
In problems involving growth data, the linear mixed model structure is ordinarily used to model longitudinal data (Melesse & Zewotir, 2017; Soares et al., 2017; Pinheiro & Bates, 2000). In this case, the height and diameter data were assumed to be normally distributed; because the independence assumption is violated in longitudinal data, random effects were required.
So, for the height data, which presented a linear trend in descriptive analysis (Section 2.1), the model was defined by:
$$Y_{ijk} = \beta_{i1} + \beta_{i2} t_k + \epsilon_{ij} + \varepsilon_{ijk}, \tag{1}$$
where $Y_{ijk}$ was the height at the $k$th time, in the $j$th replication of the $i$th treatment, $\beta_{i1}$ was the mean of the $i$th treatment, $\beta_{i2}$ was the slope parameter of the $i$th treatment, $t_k$ was the time (continuous), $\epsilon_{ij} \sim N(0, \sigma_p^2)$, iid, was the plot random effect and $\varepsilon_{ijk} \sim N(0, \sigma^2)$, iid, was the error of the model, for $i = 1, \dots, 8$, $j = 1, \dots, 4$ and $k = 1, \dots, 5$.
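To make the structure of model (1) concrete, the following minimal sketch simulates heights from it: a treatment-specific line plus a plot effect shared by all measurements of the same plot, plus an independent error. All numeric values and the function name `simulate_heights` are hypothetical choices for illustration, not estimates or code from the study.

```python
import random


def simulate_heights(beta1, beta2, times, n_reps, sigma_p, sigma, seed=0):
    """Simulate Y_ijk = beta1[i] + beta2[i] * t_k + plot_ij + error_ijk.

    beta1, beta2: per-treatment intercepts and slopes (hypothetical values).
    sigma_p, sigma: standard deviations of the plot effect and of the error.
    Returns a list of (treatment, replication, time, height) tuples.
    """
    rng = random.Random(seed)
    rows = []
    for i, (b1, b2) in enumerate(zip(beta1, beta2), start=1):
        for j in range(1, n_reps + 1):
            # One draw per plot: shared by every time point of plot (i, j),
            # which is what induces the within-plot correlation.
            plot_effect = rng.gauss(0.0, sigma_p)
            for t in times:
                error = rng.gauss(0.0, sigma)
                rows.append((i, j, t, b1 + b2 * t + plot_effect + error))
    return rows


# Hypothetical example: two treatments, four plots each, five time points.
rows = simulate_heights([2.0, 5.0], [4.0, 5.5], [0, 1, 2, 3, 4],
                        n_reps=4, sigma_p=0.5, sigma=0.3)
print(len(rows))  # 40
```

Setting `sigma_p` to zero in this sketch removes the plot effect and gives independent observations, which is the situation the random effect is meant to avoid assuming.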
For the diameter data, which presented a nonlinear trend in the descriptive analysis (Section 2.1), the Gompertz mixed model was proposed with the structure:
$$Y_{ijk} = (\beta_1 + \epsilon_{ij}) \exp\left(-\beta_2 \beta_3^{t_k}\right) + \varepsilon_{ijk}, \tag{2}$$
where $Y_{ijk}$ was the diameter at the $k$th time, in the $j$th replication of the $i$th treatment, $\beta_1$ was the parameter representing the asymptote, $\epsilon_{ij} \sim N(0, \sigma_p^2)$, iid, was the plot random effect, $\beta_2$ was the parameter related to the value of the function at $t_k = 0$, $\beta_3$ was the parameter related to the scale of the $t$ variable, $t_k$ was the time (continuous) and $\varepsilon_{ijk} \sim N(0, \sigma^2)$, iid, was the error of the model, for $i = 1, \dots, 8$, $j = 1, \dots, 4$ and $k = 1, \dots, 4$.
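The mean curve of the Gompertz model can be sketched as a short function. The form assumed here is the common three-parameter Gompertz in which the third parameter is raised to the power of time, which agrees with the parameter descriptions given above; the function name `gompertz` and the time grid are hypothetical, and the parameter values are the fixed-effect estimates reported later in the paper.

```python
import math


def gompertz(t, beta1, beta2, beta3, plot_effect=0.0):
    """Mean diameter (beta1 + plot_effect) * exp(-beta2 * beta3 ** t).

    beta1: asymptote; beta2: sets the value at t = 0, which is
    beta1 * exp(-beta2); beta3: scale of the time axis (0 < beta3 < 1
    gives a curve that increases towards the asymptote).
    """
    return (beta1 + plot_effect) * math.exp(-beta2 * beta3 ** t)


# Fixed-effect estimates reported in the results; the time index
# 0, 1, 2, ... is an assumption made only for this illustration.
b1, b2, b3 = 3.51, 1.55, 0.47
curve = [gompertz(t, b1, b2, b3) for t in range(6)]
print([round(v, 2) for v in curve])
```

With these values the curve starts at `b1 * exp(-b2)` and rises monotonically towards the asymptote `b1`.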
The variance components of the linear mixed model were estimated by restricted maximum likelihood (REML). This methodology is a particular form of maximum likelihood estimation that does not base the estimates on a maximum likelihood fit of all the information but instead uses a likelihood function calculated from a transformed set of data. REML can produce less biased estimates of variance and covariance parameters (Pinheiro & Bates, 2000).
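The difference between ML and REML is easiest to see in the simplest possible case, a single mean with no random effects, where the ML variance estimate divides the sum of squares by n and the REML estimate divides it by n - 1. The sketch below illustrates only this special case, not the mixed-model fit used in the study; the function names and the sample values are hypothetical.

```python
def variance_ml(values):
    """ML variance estimate for an iid normal sample: divides by n."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


def variance_reml(values):
    """REML variance estimate for the same model: divides by n - 1,
    accounting for the degree of freedom spent on estimating the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical data
print(variance_ml(sample), variance_reml(sample))
```

The ML estimate is always the smaller of the two, which is the downward bias that REML is designed to reduce.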
Ferrari & Fumes (2017) proposed a new class of distributions called the Box-Cox symmetric class, which includes the Box-Cox Normal distribution and can be fitted to data with symmetric or asymmetric shapes. An alternative approach, using the same structure as the linear mixed model for the height data, was then:
$$Y_{ijk} \mid \epsilon_{ij} \sim \mathrm{BCN}(\mu_{ijk}, \sigma, \nu), \quad \text{where } \mu_{ijk} = \beta_{i1} + \beta_{i2} t_k + \epsilon_{ij}, \tag{3}$$
where $Y_{ijk}$ was the height at the $k$th time, in the $j$th replication of the $i$th treatment, with a Box-Cox Normal (BCN) distribution, in which $\mu_{ijk}$ was associated with the median of the $i$th treatment ($\beta_{i1}$), $\beta_{i2}$ was the slope parameter of the $i$th treatment, $t_k$ was the time (continuous) and $\epsilon_{ij} \sim N(0, \sigma_p^2)$, iid, was the plot random effect, for $i = 1, \dots, 8$, $j = 1, \dots, 4$ and $k = 1, \dots, 5$; in addition, $\sigma$ was the parameter related to the coefficient of variation based on the median and $\nu$ was the parameter associated with the asymmetry, with $\mu_{ijk} > 0$, $\sigma > 0$ and $\nu \in \mathbb{R}$.
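The roles of the three BCN parameters can be explored with a small density function. The paper does not print the density, so the form used below is an assumption: the usual truncated Box-Cox normal density, in which a Box-Cox transform of y/mu is standard normal restricted to the range that keeps y positive. The names `bcn_pdf` and `integrate` and all numeric values are hypothetical.

```python
import math


def _phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _Phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bcn_pdf(y, mu, sigma, nu):
    """Assumed Box-Cox normal density at y, with mu > 0 and sigma > 0.

    z = ((y/mu)**nu - 1) / (nu*sigma) for nu != 0, z = log(y/mu)/sigma
    for nu == 0 (the log-normal case). The factor Phi(1/(sigma*|nu|))
    renormalises for the truncation to y > 0.
    """
    if y <= 0:
        return 0.0
    if nu == 0:
        z = math.log(y / mu) / sigma
        return _phi(z) / (y * sigma)
    z = ((y / mu) ** nu - 1.0) / (nu * sigma)
    truncation = _Phi(1.0 / (sigma * abs(nu)))
    return y ** (nu - 1.0) / (mu ** nu * sigma) * _phi(z) / truncation


def integrate(f, lower, upper, steps=20000):
    """Midpoint-rule approximation of the integral of f on [lower, upper]."""
    width = (upper - lower) / steps
    return width * sum(f(lower + (k + 0.5) * width) for k in range(steps))


mu, sigma, nu = 10.0, 0.1, 0.5  # hypothetical parameter values
total = integrate(lambda y: bcn_pdf(y, mu, sigma, nu), 0.0, 30.0)
below_mu = integrate(lambda y: bcn_pdf(y, mu, sigma, nu), 0.0, mu)
print(round(total, 4), round(below_mu, 4))
```

Under these assumptions the density integrates to about one and places about half of its mass below `mu`, consistent with reading mu as a median; `nu = 1` gives a normal distribution truncated at zero, and other values of `nu` introduce asymmetry.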
For the diameter data, the alternative approach using the BCN model was:
$$Y_{ijk} \mid \epsilon_{ij} \sim \mathrm{BCN}(\mu_{ijk}, \sigma, \nu), \quad \text{where } \mu_{ijk} = \beta_i + f(t_k) + \epsilon_{ij}, \tag{4}$$
where $Y_{ijk}$ was the diameter at the $k$th time, in the $j$th replication of the $i$th treatment, with a Box-Cox Normal (BCN) distribution, in which $\mu_{ijk}$ was associated with the median of the $i$th treatment ($\beta_i$), $f(t_k)$ was the smooth function that models the time effects (in this case, a P-spline smoother was used; see Stasinopoulos et al. (2008) for details), $\epsilon_{ij} \sim N(0, \sigma_p^2)$, iid, was the plot random effect, $\sigma$ was the parameter related to the coefficient of variation based on the median and $\nu$ was the parameter associated with the asymmetry, for $i = 1, \dots, 8$, $j = 1, \dots, 4$ and $k = 1, \dots, 4$, with $\mu_{ijk} > 0$, $\sigma > 0$ and $\nu \in \mathbb{R}$.
Marginal maximum likelihood was used for the BCN models; in this context, the main distribution was marginalized (Rigby et al., 2014). All procedures were performed in R (R Core Team, 2015), version 3.3.1, using the gamlss routine for the linear as well as the semiparametric BCN models (Rigby et al., 2014). For goodness of fit, quantile residual plots were produced for the BCN models (Rigby et al., 2014).
In these approaches, the random effect $\epsilon_{ij}$ represented the between-plot variability in the experiment, while the within-plot variability was modelled by the variance of the residual error $\varepsilon_{ijk}$ in the standard models and by the parameter $\sigma$ in the BCN models. The Akaike information criterion (AIC) was calculated to compare the different approaches.
The BCN and the normal distributions were plotted on top of the raw data at each time. For the height data, the BCN distribution performed better than the normal distribution.
Treatments (i) | Linear mixed model (1), $\beta_{i1}$ | Linear mixed model (1), $\beta_{i2}$ | Linear BCN mixed model (3), $\beta_{i1}$ | Linear BCN mixed model (3), $\beta_{i2}$ | Nonlinear BCN mixed model (4), $\beta_i$
---|---|---|---|---|---
1 | 2.14 | 4.17 | 2.76 | 4.46 | 1.14
2 | 2.16 | 3.44 | 2.84 | 3.15 | 0.99
3 | 2.61 | 4.70 | 2.72 | 3.61 | 1.20
4 | 3.14 | 4.09 | 3.38 | 3.98 | 1.12
5 | 3.18 | 3.19 | 3.69 | 2.98 | 1.08
6 | 3.10 | 4.87 | 3.51 | 4.65 | 1.22
7 | 2.04 | 3.91 | 2.14 | 3.87 | 1.03
8 | 5.26 | 5.42 | 5.42 | 5.33 | 1.58
$\lvert\mathrm{AIC}\rvert$ | 584.15 | | 549.98 | | 52.08
For the nonlinear mixed model (model (2)), the estimated Gompertz parameters were $\hat{\beta}_1 = 3.51$, $\hat{\beta}_2 = 1.55$ and $\hat{\beta}_3 = 0.47$ for fixed effects and |AIC| = 111.51. This means that the estimate of the parameter associated with asymptotic growth for the observed period was around 3.5 cm, with an estimated growth efficiency of 0.47 cm in each period. According to the AIC, the BCN model performed better than the standard one, even though the parameter estimates, as well as the AIC values, were quite close (models (1) and (3)). This probably occurred because of the slight asymmetry already mentioned above, which still indicates that the BCN model can be considered superior to the other.
For diameter, the nonlinear BCN mixed model (model (4)) presented a lower AIC than the Gompertz model (model (2)). It is important to note that the BCN model took a smooth function into account to obtain a better fit. The smooth function can lead to a better fit, but at the cost of parsimony, whereas the Gompertz model uses only three parameters. We believe this type of modelling deserves further investigation in order to obtain a good fit with few parameters.
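The AIC comparisons discussed above amount to simple differences between the reported values. The sketch below takes the reported |AIC| values at face value on a lower-is-better scale, as the text does; the function names are hypothetical, and the general formula is included only as a reminder of how AIC penalises the number of parameters.

```python
def aic(log_likelihood, n_parameters):
    """Akaike information criterion, 2k - 2 log L (lower is better)."""
    return 2 * n_parameters - 2 * log_likelihood


def aic_gain(aic_reference, aic_candidate):
    """Positive when the candidate model has the lower (better) AIC."""
    return aic_reference - aic_candidate


# Values as reported in the paper.
height_gain = aic_gain(584.15, 549.98)    # linear mixed vs linear BCN mixed
diameter_gain = aic_gain(111.51, 52.08)   # Gompertz vs nonlinear BCN mixed
print(round(height_gain, 2), round(diameter_gain, 2))  # 34.17 59.43
```

Because AIC already includes a penalty of two units per parameter, the larger gain for diameter is obtained despite the extra flexibility of the smooth term, although the parsimony concern raised above still applies.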
Furthermore, for all the models, comparisons among the eight treatments were made. The eighth treatment, which was the standard substrate, was statistically different from the others, with a better growth performance.
This paper presented a new approach based on the Box-Cox Normal distribution to analyse growth data. Additionally, the classical linear and nonlinear models were used for comparison with the new proposal. We observed that the fitted models presented similar performances, which means that the BCN model can be used to estimate growth curves. Moreover, its structure accounted for the slight asymmetry of the data, and the AIC indicated the BCN models as the best ones. Furthermore, in the presence of outliers, this model can be more appropriate, because it involves a robust estimation process. However, further studies involving smooth functions in the nonlinear case are still needed.
Growth curves are important for determining when seedlings should be delivered to the field. Besides that, species of the genus Eucalyptus are very important for commercial wood production in Brazil. Finally, curve predictions can be useful for making efficient decisions on the use of this renewable natural resource.
Fumes, G., Demétrio, C. G. B., Villegas, C., Corrente, J. E., & Bazzo, J. F. (2017). Growth Curves for Diameter and Height Using Mixed Models: An Application in Eucalyptus Seedling. Open Journal of Forestry, 7, 403-415. https://doi.org/10.4236/ojf.2017.74024