Estimating Tree Volume of Dry Tropical Forest in The Brazilian Semi-Arid
To cite this article: Robson B. de Lima, Rinaldo L. Caraciolo Ferreira, José A. Aleixo da Silva,
Francisco T. Alves Júnior & Cinthia P. de Oliveira (2020): Estimating Tree Volume of Dry Tropical
Forest in the Brazilian Semi-Arid Region: A Comparison Between Regression and Artificial Neural
Networks, Journal of Sustainable Forestry, DOI: 10.1080/10549811.2020.1754241
ABSTRACT
The dry tropical forests of the Brazilian semi-arid region are a key component in the
sustainable production of coal and firewood for power generation, although their estimates of
volume and wood stock depend almost exclusively on equations adjusted from other semi-arid
regions or form factor for data of managed species. Therefore, a systematic evaluation of new
methodologies such as artificial neural networks and regression models is justifiable for the
locale, since it aims to select a tool that reports reliable predictions of volume and that is
low cost in the forest management of the region. Our main results show that less reliable
estimates of trunk and branch volumes are obtained by simple input models and perceptron
networks. The Schumacher-Hall linearized equation provides reliable estimates of volume,
although the Multilayer-Perceptron neural networks indicate estimates, which are no less
biased. Our results suggest that using volumetric equations to predict trunk and tree branch
volume in the Brazilian semi-arid region is still more statistically advantageous, although the
use of ANNs is not ruled out. This shows that there are obviously complex relationships between
dependent and independent biological factors and that volumetric models are able to better
explain such relationships.
KEYWORDS
Caatinga domain; regression analysis; artificial neural networks; forest management
Introduction
Dry forests comprise just under half of the tropical and subtropical forest categories in the
world (Powers et al., 2009; Ranaivoson et al., 2017; Raymundo et al., 2018; Sabogal, 1992).
Despite their importance, they are among the most threatened and least studied forest
ecosystems, and as a result may be at greater risk than moist forests (Bastin et al., 2017;
Gillespie et al., 2012; Miles et al., 2006; Portillo-Quintero & Sánchez-Azofeifa, 2010).
To date, research on dry forest management has focused on Africa (Blackie et al., 2014;
Giday et al., 2013; Hasen-Yusuf et al., 2013; Henry et al., 2011) and Asia (Nath et al., 2006;
Naveenkumar et al., 2017). However, specific studies on the volumetry of these forests and
their species are still incipient, especially in dry forests (Caatinga forests) located in the
Brazilian semi-arid region that are heavily harvested to supply steel mills and generate
energy through burning coal (Althoff et al., 2018; Lima et al., 2017; Sampaio et al., 2010).
In Brazil, the Brazilian Institute for the Environment and Renewable Natural Resources
(IBAMA) determines through “Normative Instruction” (NI no. 030) that the tree volume
calculation in management plans will only be accepted by a volume equation. In the study
region, however, it is still common to estimate the commercial volume with bark using a form
factor or fixed expansion factor applied in a generalized way across different species,
sites, and dry forest environments, which causes serious errors in volume and biomass
estimates (Magalhães & Seifert, 2015). The use of specific equations, or testing the accuracy
of new predictive modeling methodologies, is therefore recommended (Chen et al., 2007; Del
Frate & Solimini, 2004; Mohammadi et al., 2011).
Obtaining the wood volume via robust statistical methods is substantially important in
forest inventories (Almeida et al., 2014; Masota, 2014; Serinaldi et al., 2012), mainly for the
areas submitted to forest management, since in addition to quantifying the raw material
stock, these methods can support biomass and carbon stock estimates of a given location
(e.g., Aigbe et al., 2012; Brandeis et al., 2006; Chaturvedi & Raghubanshi, 2013; Chave
et al., 2005; Mate et al., 2014; Vahedi, 2016; Weggler et al., 2012).
However, estimating individual tree volume is not a trivial task. The limiting factor has
always been the destructive sampling of trees. In practice, local relationships between
volume, diameter, and height are often modeled using regression analysis (Burkhart &
Tomé, 2012), or artificial neural networks (Bhering et al., 2015; Blanco et al., 2012;
Diamantopoulou & Milios, 2010; Marques da Silva Binoti et al., 2014; Özçelik et al.,
2010; Soares et al., 2012). Although it is a routine operation to estimate trunk volume
through classical regression models, neural network tools are still poorly studied in dry
forests (Vahedi, 2016), and few works compare different alternatives for volumetric
prediction in areas submitted to forest management (Miguel et al., 2015; Razi &
Athappilly, 2005).
Artificial neural networks (ANNs) are computational models that resemble the struc-
ture of the human brain and integrate simple processing units (artificial neurons) that
calculate certain mathematical functions (see Valença et al., 2011). These networks create
the possibility of superior performance to that of the conventional models which makes
them attractive to solve a series of problems (see Haykin, 2001).
Bastos Gorgens et al. (2009) used the ANNs with the objective of constructing a neural
network that efficiently estimates tree volume. They concluded that modeling by such
methodology was perfectly viable. Its generalization and connectivity capacity enabled
only one network to be used to predict the tree volume from five different sites and two
different species.
In this context, reliably obtaining variables such as volume is essential in planning and
evaluating the amount of impact to be caused in the area submitted to forest management,
and can provide information to mitigate such problems. However, comparisons of
different predictive modeling methodologies to accurately measure the individual volumes
(trunks and branches) of all trees in the inventory lots of these forests are rare. As a result, in
practice total wood stocks and hence biomass and aboveground carbon remain unknown
(Ubuy et al., 2018).
Herein we approach this challenge by assembling a set of local tree data (trunks and
branches) harvested for volume measurement and examining them to quantify how well
the regression models and the structure of locally derived artificial neural networks predict
tree volume. We used a Leave-one-out approach to enable testing the performance of
regression models and neural networks in data that are independent of those used for
model fit. In this task, some statistical considerations were made in the comparison to
select the most efficient methodology for the amplitude of the obtained data.
Figure 1. Location of Itapemirim farm in the municipality of Floresta, Pernambuco, semi-arid region of
Brazil.
with sections not more than 1 m in length and up to a diameter of 3 cm in bark, which
represents the minimum diameter for branches established in the region. The trunk and
branch volumes were obtained by the Smalian method, which were summed up compos-
ing the total commercial volume of the trees volumetrically measured in this work. In
total, 316 volumes were obtained. We randomly split the volumetric data into two subsets
for adjustment and validation using the Leave-one-out (LOOCV) cross-validation tool,
where we selected a sample of N1 = 30 trunk volumes (60% of the sample) and N2 = 213
branch volumes (80% of the sample). Table 1 shows a descriptive summary with the
confidence intervals (p > .05) for the variables of trees (trunks and branches) harvested.
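The Smalian method mentioned above takes the volume of each section as the mean of its two end cross-sectional areas multiplied by the section length. The following is a minimal sketch of that calculation; the function names and the example diameters are hypothetical and are not measurements from this study.

```python
import math

def cross_section_area(diameter_cm):
    """Cross-sectional area in m2 from a diameter given in cm."""
    d_m = diameter_cm / 100.0
    return math.pi * d_m ** 2 / 4.0

def smalian_volume(diameters_cm, section_length_m=1.0):
    """Sum of Smalian section volumes (m3).

    diameters_cm holds the diameters at successive section ends,
    so n diameters describe n - 1 sections of equal length.
    """
    total = 0.0
    for d_low, d_up in zip(diameters_cm, diameters_cm[1:]):
        mean_area = (cross_section_area(d_low) + cross_section_area(d_up)) / 2.0
        total += mean_area * section_length_m
    return total

# Hypothetical diameters (cm) taken every 1 m, down to the 3 cm limit
example_diameters = [11.0, 9.5, 8.0, 6.0, 3.0]
volume_m3 = smalian_volume(example_diameters)
print(round(volume_m3, 4))
```

Summing the trunk and branch results obtained this way would give the total commercial volume of a tree, as described in the text.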
Volumetric modeling
Regression models
Three single-entry volumetric models (only having the base diameter as an explanatory
variable) and two double-entry models were tested for the 30 trunks and 213 branches,
with the explanatory variables being the Db and the total trunk height (Table 2).
We estimated the model parameters using the OLS (Ordinary Least Squares) method.
In general, the parameters were calculated using the total data of measured trunks and
branches, and are assumed as the true parameters that represent the tree volume.
However, samples were taken from the complete data set to evaluate the influence of
sample size (60% of the number of trunks and 80% of the branches) on the adjusted
parameters (LOOCV cross-validation). The objective of this tool is to estimate the value of
a set of evaluation statistics through the LOOCV. This type of estimate is obtained by
performing N repetitions of a test cycle, where N is the size of the data set provided. At
each repetition, one of the N observations is left out to serve as the test set, while the
Table 1. Confidence interval (mean ± standard error (se) of the mean, p > .05) of the variables
diameter, height and volume obtained in the volume measurement process for trunks and branches.

General    Data         n     Db (cm) ± se    Ht (m) ± se    Mean volume (m3) ± se   Total volume (m3)
Trunk      Adjustment    30   11.04 ± 0.67    4.80 ± 0.31    0.0662 ± 0.0098         1.98
Trunk      Validation    20   10.06 ± 0.78    5.55 ± 0.39    0.0696 ± 0.0170         1.39
Branches   Adjustment   213    4.49 ± 0.13    4.65 ± 0.13    0.0095 ± 0.0010         2.03
Branches   Validation    53    4.21 ± 0.18    4.45 ± 0.25    0.0074 ± 0.0010         0.39
Total      Trunk         50   10.65 ± 0.51    5.10 ± 0.25    0.0675 ± 0.0089         5.80
Total      Branches     266    4.43 ± 0.11    4.61 ± 0.12    0.0091 ± 0.0008
Table 2. Statistical models tested for the volumetric estimation of Caatinga vegetation of the
Brazilian semi-arid region.

Explanatory variable   Author                     Model
Db                     1. Husch                   $\ln V_i = \beta_0 + \beta_1 \ln(d) + \varepsilon_i$
Db                     2. Koperzky−Gehrhardt      $V_i = \beta_0 + \beta_1 d^2 + \varepsilon_i$
Db                     3. Hohenald−Krenn          $V_i = \beta_0 + \beta_1 d + \beta_2 d^2 + \varepsilon_i$
Db/Ht                  4. Spurr                   $V_i = \beta_0 + \beta_1 d^2 h$
Db/Ht                  5. Schumacher−Hall (Ln)    $\ln V_i = \beta_0 + \beta_1 \ln(d) + \beta_2 \ln(h) + \varepsilon_i$

In which: $V_i$ is the volume with bark in m3; d is the measured base diameter (0.30 m) with trunk
bark; h is the total height of the trees in m; Ln is the Napierian (natural) logarithm; $\beta_i$
are the model parameters; and $\varepsilon_i$ is the random error.
remaining N−1 cases are used to obtain the model. The process is repeated N times,
leaving aside each of the N observations given. The LOOCV estimates are obtained by the
mean of the N scores obtained in the different repetitions. All computations and analyses
were performed using the R® statistical software (R Core Team, 2018).
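The leave-one-out cycle described above can be sketched as follows. This is an illustration only, written in plain Python rather than the R software used in the study. It assumes the single-entry Husch model from Table 2 (ln V = β0 + β1 ln d) fitted by closed-form OLS, and it reports the root of the mean squared error on the log scale as the evaluation statistic. The diameters and volumes are hypothetical.

```python
import math

def fit_simple_ols(x, y):
    """Closed-form OLS estimates (b0, b1) for y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

def loocv_rmse_log(diameters, volumes):
    """Leave-one-out RMSE (log scale) for ln V = b0 + b1 * ln d."""
    x = [math.log(d) for d in diameters]
    y = [math.log(v) for v in volumes]
    squared_errors = []
    for i in range(len(x)):
        # Observation i is the test set; the other N - 1 fit the model
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        b0, b1 = fit_simple_ols(x_train, y_train)
        prediction = b0 + b1 * x[i]
        squared_errors.append((y[i] - prediction) ** 2)
    # Mean of the N scores, then the square root
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical base diameters (cm) and volumes (m3)
d = [4.0, 5.5, 7.0, 9.0, 11.0, 13.5, 16.0]
v = [0.004, 0.009, 0.016, 0.030, 0.050, 0.082, 0.125]
print(loocv_rmse_log(d, v))
```

The same loop applies to the double-entry models; only the fitting step changes.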
The obtained equations were analyzed through statistical criteria comparisons
(Vanclay, 2001) obtained according to Equations (1)–(4).
– Akaike Information Criterion (AIC):
\[ \mathrm{AIC} = -2\,LL + 2k \tag{1} \]
In which: LL is the log-likelihood and k is the number of model parameters. This criterion
penalizes the addition of parameters in the analyzed models. It indicates the quality of fit
by the equations. The best equation minimizes the value of the AIC.
– Adjusted coefficient of determination (R2adj):
\[ R^2_{adj} = R^2 - \frac{k-1}{n-k}\left(1 - R^2\right) \tag{2} \]
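As a hedged illustration of Equations (1) and (2), the sketch below computes the AIC in its usual form (−2LL + 2k) and the adjusted coefficient of determination from observed and predicted values. The Gaussian log-likelihood helper is an assumption about how LL would be obtained for an OLS fit, and whether k includes the error variance is left to the caller. The example values are hypothetical.

```python
import math

def gaussian_log_likelihood(residuals):
    """Maximized Gaussian log-likelihood of an OLS fit.

    Assumes the error variance is estimated as SSE / n.
    """
    n = len(residuals)
    sse = sum(r ** 2 for r in residuals)
    sigma2 = sse / n
    return -0.5 * n * (math.log(2.0 * math.pi * sigma2) + 1.0)

def aic(log_likelihood, k):
    """Equation (1): AIC = -2 LL + 2 k."""
    return -2.0 * log_likelihood + 2.0 * k

def adjusted_r2(observed, predicted, k):
    """Equation (2): R2adj = R2 - ((k - 1) / (n - k)) * (1 - R2)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - sse / sst
    return r2 - (k - 1) / (n - k) * (1.0 - r2)

# Hypothetical observed and predicted values for a two-parameter model
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
pred = [1.1, 1.9, 3.2, 3.8, 5.0]
res = [o - p for o, p in zip(obs, pred)]
print(aic(gaussian_log_likelihood(res), k=2))
print(adjusted_r2(obs, pred, k=2))
```

Among candidate equations fitted to the same response, the one with the lowest AIC would be preferred.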
This statistic indicates a tendency of sub or overestimation, thus measuring error and
quality in the predictions made, so that the lower the error, the greater the efficiency in the
generalizations.
We concomitantly analyzed the percentage error dispersion of the best
selected equations together with the statistical criteria.
(2009), linear interpolation performs both the data normalization and the equalization
according to Equation (5):
x þ x
Transformedvalue ¼ 1 þ (5)
xmax
At this stage, the networks had either one input neuron, related to the diameter of trunks and
branches, or two input neurons, related to the diameter and the height of trunks and
branches. The volumetric data were the same as those used in the adjustment of the
models, being 60% for trunk and 80% for branches.
We trained the networks in a supervised way, adopting perceptron and Multilayer
Perceptron structures. We used the backpropagation algorithm, which adopts a learning
rule known as the Delta Rule (mean square error minimization). In this rule, we adjust
the weights of the connections between the network neurons according to the error in
order to find a set of weights and biases that reduce the error function (Haykin,
2001; Valença, 2011).
According to Haykin (2001), Valença (2011), and Pandorfi et al. (2011), the backpropagation
algorithm for training the networks can be described by the following steps:
− step 1 – initialize the weights, biases and other training parameters;
− step 2 – present an input pattern of the training set to the network consisting of
inputs and outputs;
− step 3 – calculate the error for the neurons of the output layer (e_k), subtracting the
calculated output (internal processing of the network) from the desired output
(Equation (6));
\[ e_k = y_k - \hat{y}_k \tag{6} \]
− step 6 – calculate the accumulated network error. In this step, we verify whether the
total error on all input patterns can be considered negligible, meaning below an
acceptance threshold. In this case, the algorithm must stop, otherwise it returns to
step 2.
As training parameters used in the algorithm for estimating trunk and branch volume,
we adopted a learning rate of 0.1, a target error of 0.005, and around three thousand
training cycles (3000 cycles), with the sigmoid logistic activation function as the transfer
function in both the output layer and the input layer (Equation (9)).
\[ f(u_k) = \frac{e^{p u_k}}{e^{p u_k} + 1} = \frac{1}{1 + e^{-p u_k}}; \qquad \frac{\partial f}{\partial u_k} = p u_k (1 - u_k) > 0 \tag{9} \]
In which: puk = product of the synaptic weight (corrected by the learning rate) with the
input variable.
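The training procedure can be illustrated with the following sketch of online backpropagation for a small Multilayer Perceptron with logistic units, using the learning rate (0.1), cycle limit (3000) and error threshold (0.005) reported above. Several points are assumptions made for illustration: the gradient and weight-update steps follow the standard textbook form of the algorithm, the threshold is compared with the mean squared error of each cycle, the weights start from uniform random values, and the scaled input and target values are hypothetical.

```python
import math
import random

def sigmoid(u):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-u))

def train_mlp(samples, n_hidden=3, learning_rate=0.1, max_cycles=3000,
              error_threshold=0.005, seed=1):
    """Online backpropagation for an MLP with one hidden layer.

    samples: list of (inputs, target) pairs scaled to [0, 1].
    Returns hidden weights, output weights and the per-cycle MSE.
    """
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # Step 1: initialize weights; the last entry of each row is the bias
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    history = []
    for _ in range(max_cycles):
        total_sq_error = 0.0
        for inputs, target in samples:
            # Step 2: present a pattern and run the forward pass
            hidden = [sigmoid(sum(w * x for w, x in zip(row[:-1], inputs)) + row[-1])
                      for row in w_hidden]
            output = sigmoid(sum(w * h for w, h in zip(w_out[:-1], hidden)) + w_out[-1])
            # Step 3: output error, e_k = y_k - yhat_k
            error = target - output
            total_sq_error += error ** 2
            # Assumed standard steps: local gradients from the logistic derivative
            delta_out = error * output * (1.0 - output)
            delta_hidden = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                            for j in range(n_hidden)]
            # Delta-rule weight updates, scaled by the learning rate
            for j in range(n_hidden):
                w_out[j] += learning_rate * delta_out * hidden[j]
                for i in range(n_in):
                    w_hidden[j][i] += learning_rate * delta_hidden[j] * inputs[i]
                w_hidden[j][-1] += learning_rate * delta_hidden[j]
            w_out[-1] += learning_rate * delta_out
        # Step 6: accumulated error over all patterns; stop if negligible
        mse = total_sq_error / len(samples)
        history.append(mse)
        if mse < error_threshold:
            break
    return w_hidden, w_out, history

# Hypothetical scaled (diameter, height) inputs and volume targets
data = [([0.1, 0.2], 0.05), ([0.3, 0.4], 0.15), ([0.5, 0.5], 0.30),
        ([0.7, 0.6], 0.50), ([0.9, 0.8], 0.80)]
w_hidden, w_out, history = train_mlp(data)
print(len(history), history[0], history[-1])
```

Changing `n_hidden` reproduces the kind of architecture search described in the text, where networks differ only in the number of hidden neurons.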
We selected the ANNs that presented the best performance in the training using only
the base diameter and ANNs that had the base diameter and height in the input layer,
both for modeling with trunks and for modeling with branches. These ANNs were
compared among themselves and the one with better performance in the estimation was
selected for comparison in the validation process and generalization with the best equation
obtained in the model adjustments.
Regarding the criteria for selecting the best ANN, as well as for its validation in another
volumetric database, we used the correlation statistics between the estimated and observed
values (rYŶ) and the root-mean-square error (RMSE %), as recommended by Leite et al.
(2011) and Marques da Silva Binoti et al. (2014):
\[ r_{Y\hat{Y}} = \frac{\mathrm{cov}(V, \hat{V})}{\sqrt{S^2(V) \cdot S^2(\hat{V})}} \tag{10} \]
\[ \mathrm{RMSE}(\%) = \sqrt{\frac{\sum_{i=1}^{n} \left(V_i - \hat{V}_i\right)^2}{n}} \cdot 100 \tag{11} \]
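The two selection statistics can be computed as in the sketch below, which follows Equations (10) and (11) as printed. The optional `relative_to_mean` flag is an assumption added here: it divides the RMSE by the mean observed volume, a common convention for expressing the error as a percentage, and is not stated in the text. The example volumes are hypothetical.

```python
import math

def correlation(observed, predicted):
    """Equation (10): covariance divided by the root of the product of variances."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted)) / (n - 1)
    var_o = sum((o - mean_o) ** 2 for o in observed) / (n - 1)
    var_p = sum((p - mean_p) ** 2 for p in predicted) / (n - 1)
    return cov / math.sqrt(var_o * var_p)

def rmse_percent(observed, predicted, relative_to_mean=False):
    """Equation (11): root-mean-square error multiplied by 100."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    if relative_to_mean:
        # Assumption: normalize by the mean observed volume
        return rmse / (sum(observed) / n) * 100.0
    return rmse * 100.0

# Hypothetical observed and ANN-estimated volumes (m3)
v_obs = [0.020, 0.045, 0.060, 0.090, 0.120]
v_hat = [0.024, 0.041, 0.066, 0.085, 0.118]
print(correlation(v_obs, v_hat))
print(rmse_percent(v_obs, v_hat, relative_to_mean=True))
```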
Results
Adjustment by regression analysis
For the trunks and branches, the obtained equations resulted in estimates with larger
errors in the single input models (Table 3). The logarithmic Spurr and Schumacher-Hall
models, which include the base diameter and total height, presented better performance
(lower AIC, Table 3) and returned the best predictions with more than 92% of the total
variance explained (R2adj, Table 3). These models suggest estimation errors of less than
5% for trunks (Bias = 0.0401 and 0.0389, respectively) and less than 7% for branches
(Bias = 0.0629 and 0.0405, respectively). In both cases, with the exception of the Hohenald-Krenn
model, the selected models presented parameters with valid confidence intervals
(p > .05).
The Koperzky-Gehrhardt and Hohenald-Krenn equations slightly present both underestimation
and overestimation for the smallest diametric amplitudes, indicating the presence of
outliers (trunks and branches that are unusually tall or short for their diameter),
discrediting the confidence limits of the estimates (see Figures 1A and 2A, electronic
version only).
The Spurr and Schumacher and Hall equations of trunks and branches predict that the
logarithmic transformation of the diameter and height of the tree or the linear combina-
tion of these variables for a given volume decreases the bias in the estimate. The functional
form of the tested equations is biologically consistent, especially with the inclusion of the
height variable, and these results can be illustrated in the residual distribution for each
case (Figure 2).
The residual scatter plot indicates a line (trend or bias) at the highest predicted values.
In all cases there is no evidence regarding the choice of a general equation. However,
discrepant values should be considered to suggest a curvature possibly caused by model
errors, rather than the data selected in the fit. In this type of graph one may observe the
good capacity of the equations in estimating the trunk and branch volumes, suggesting
that the highest density of residual points is within a range of less than 20%.
The statistical results confidently suggest that the Schumacher and Hall equations were
slightly better in the volumetric prediction when compared to the Spurr equation, but
do not invalidate the latter. Thus, we selected the Schumacher and Hall equation for
validating a new set of trunk and branch data to verify the applicability and accuracy of
the new predictions. Thus, we consider that the errors are independent, with null mean
and constant variance. The good results obtained through the statistical scores confirm its
practical applicability, together with the ease of obtaining the volume within the collected
dendrometric amplitude.
Figure 2. Distribution of the percentage error for the selected equations to estimate trunk and branch
volume.
Table 4. ANN structure trained separately with diameter and with diameter and height for volumetric
estimation of trunks and branches.

                   Trunk volume                        Branch volume
Variable   ANN   Structure        rYŶ    RMSE (%)    Structure        rYŶ    RMSE (%)
Db         1     MLP 1 − 6 − 1    0.82   45.72       MLP 1 − 6 − 1    0.92   60.21
Db         2     MLP 1 − 2 − 1    0.82   45.75       MLP 1 − 4 − 1    0.92   60.12
Db         3     MLP 1 − 6 − 1    0.82   45.54       MLP 1 − 3 − 1    0.92   59.65
Db         4     MLP 1 − 7 − 1    0.82   45.59       MLP 1 − 2 − 1    0.92   59.94
Db/Ht      5     MLP 2 − 6 − 1    0.97   19.81       MLP 2 − 4 − 1    0.98   29.16
Db/Ht      6     MLP 2 − 4 − 1    0.97   18.6        MLP 2 − 8 − 1    0.99   19.18
Db/Ht      7     MLP 2 − 3 − 1    0.98   16.61       MLP 2 − 7 − 1    0.99   17.52
Db/Ht      8     MLP 2 − 4 − 1    0.98   17.33       MLP 2 − 3 − 1    0.99   20.41

In which: rYŶ – correlation in training between observed volume and volume estimated by the ANN;
RMSE (%) – root-mean-square error; MLP – Multilayer Perceptron.
perceptron-type networks (Adaline); this justifies including the trunk height and branches
variables for volumetric prediction in both cases (Table 4).
In predicting the volume with only the base diameter, networks 1, 2, 3 and 4 were
statistically similar, but with poorer predictions, although a slight decrease in RMSE%
values occurred for branch volume. In networks 5–8, which employ the base diameter and the
height of trunks and branches, an increase in correlation values and a significant decrease in
RMSE% values were noted. All networks had the same number of hidden layers, varying only
in the number of neurons. For trunks, networks 1 and 3 had six neurons in the
hidden layer, and networks 6 and 8 had a hidden layer with four neurons each. The
ANN architectures that predict trunk and branch volume with the lowest error, with
synapses and weights corrected by variable and activation function can be visualized in the
figure presented in the appendix (Figure 5A and Figure 6A electronic version only).
In Figure 3, we verified that the residual distribution in percentage for the trunk and
branch volumes estimated by the best ANNs tends to present homogeneous variance. The
networks that only had the diameter as input variable presented distributions with larger
amplitudes, evidencing non-homogeneous variance. For networks with two neurons in the
input layer (Db, Ht), discrepant sub or overestimated rates are not apparent for estimated
volumes greater than 0.10 m3 for stem and 0.04 m3 for branches. However, there is a slight
trend in overestimations for the lower volumes in the two analyzed cases.
Figure 3. Distribution of the percentage errors of the best volumetric estimates generated by trained
ANNs for trunks and branches.
Statistically, ANN 7, with three neurons in the hidden layer for the trunk and seven
neurons in the hidden layer for branches, reported the lowest prediction error
(RMSE%) values and was therefore selected for the validation process.
Statistical validation
In the validation, t-test results for the trunk and branch volumes measured in the field
with the volumes obtained by the Schumacher and Hall equations and ANN 7 reported
that there is no significant difference (p > .05), meaning the volumes are similar from the
statistical point of view. The aggregate difference (AD) and mean error values corroborate
these similarities. The new predictions showed differences and positive errors for trunks
and branches indicating a low underestimation, although there is a slight tendency to
overestimation (AD = −28%) by ANN 7 for the branch volume.
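The validation statistics can be sketched as follows. The definitions are assumptions, since the text does not spell them out: the aggregate difference is taken as the sum of observed volumes minus the sum of predicted volumes (positive values meaning underestimation), AD% expresses it relative to the observed total, and the t statistic is computed for paired observations. The example volumes are hypothetical.

```python
import math

def aggregate_difference(observed, predicted):
    """Assumed definition: AD = sum(observed) - sum(predicted); AD% relative to observed."""
    total_obs = sum(observed)
    ad = total_obs - sum(predicted)
    return ad, ad / total_obs * 100.0

def mean_error(observed, predicted):
    """Mean of the differences between observed and predicted volumes."""
    return sum(o - p for o, p in zip(observed, predicted)) / len(observed)

def paired_t_statistic(observed, predicted):
    """t statistic for paired differences (assumes a paired design)."""
    diffs = [o - p for o, p in zip(observed, predicted)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    return mean_d / math.sqrt(var_d / n)

# Hypothetical observed and predicted volumes (m3) of a validation sample
v_field = [0.031, 0.052, 0.047, 0.088, 0.120, 0.064]
v_model = [0.029, 0.055, 0.044, 0.081, 0.118, 0.066]
ad, ad_pct = aggregate_difference(v_field, v_model)
print(ad, ad_pct, mean_error(v_field, v_model), paired_t_statistic(v_field, v_model))
```

Under this reading, a negative AD% such as the one reported for ANN 7 on branch volume would correspond to predicted totals exceeding the observed total.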
In the predictions which presented similar estimates to the data observed before the
t-test, a deeper analysis was conducted to evaluate the behavior of the estimated data. To
do so, we used an observed versus estimated dispersion plot (Figure 4).
The predictions generated in the new dataset presented a more regular distribution for the
branch volume, and did not present critical points of bias. For trunk volume there is a slight
underestimation generated by network 7, as shown by the aggregate difference and mean error
(Table 5), although this difference is not shown graphically. In generalizations, the
Schumacher and Hall equations for the trunk and branches showed a slight decrease in the
root-mean-square error value (RMSE = 0.0198 and 0.0011, respectively; Figure 4).
Taking into account the volumetric variability of the Caatinga species conditioned by varia-
tions in the trunk shape, branches shape, density and even genetics, the estimates generated by
equations and MLP-type networks can be considered as satisfactory and valid.
Figure 4. Validation of the best alternatives selected in the adjustment for predicting trunk and branch
volume, tested in a random sample independent of the data corresponding to 40% of the measured
trunks (n = 20) and 20% of the measured branches (n = 53). The graphs compare expected and
observed volume values with the finest dashed line corresponding to a ratio of 1:1. In both cases, the
generalizations of the best ANN architectures are represented by a black dashed line, and the best
equation by a red dashed line.
Table 5. Statistics of the equations selected for the validation sample of trunk and branch volume.

           Methods            AD        AD%        Mean error   Test statistic   Critical T-value (unilateral p > .05)
Trunk      Schumacher−Hall    0.101     7.297      0.0066       0.309            .379
Trunk      ANN 7              0.258     18.553     0.0020       0.606            .273
Branches   Schumacher−Hall    0.012     3.125      0.0002       0.052            .479
Branches   ANN 7              −0.113    −28.907    −0.0020      −1.129           .130

In which: AD – absolute aggregate difference; AD% – relative aggregate difference.
Discussion
Theoretically, regression analysis has been used with an emphasis on solving most forest
problems, especially when estimating the forest parameters through biometric relations
(Robinson & Hamann, 2011). The use of regression models capable of accurately deter-
mining forest production based on estimated timber volume is fundamental for imple-
menting sustainable management (Berger et al., 2014; McRoberts & Westfall, 2014).
Logarithmic models have been used constantly in studying biometric relations, mainly
for developing equations for biomass and volume in dry tropical forests (Abich et al.,
2018; Brandeis et al., 2006; Chave et al., 2005; Mwakalukwa et al., 2014; Návar et al., 2013;
Ubuy et al., 2018), although few studies have been applied to the back transformation
from the log-log scale to the original scale using the corrective factor (Sprugel, 1983;
Vibrans et al., 2015). Moreover, R2 is often used to describe the quality of fit of the model;
however, very few studies have calculated the R2 for back transformed data (or original
scale), evidencing a misleading use of R2 since it has usage limitations in non-linear
models (Anderson-Sprecher, 1994; Tellinghuisen & Bolster, 2011; Vibrans et al., 2015).
Other parameters such as RMSE and Bias are rarely calculated based on the original
residuals’ scale (Zeng et al., 2011). In addition, robust methods to assess the quality of model
fit, such as AIC or Bayesian information criteria, are rarely used and should be
incorporated into statistical model fit routines (Carpenter et al., 2015; Monnahan et al.,
2017). This result supports the decision to use regression methods to construct models
and estimate their parameters.
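The back transformation discussed above can be illustrated with the correction factor described by Sprugel (1983), CF = exp(SEE²/2), where SEE is the standard error of estimate of the log-scale regression. The sketch below applies it to a Schumacher-Hall type prediction. The residuals and coefficients are hypothetical and are not the values fitted in this study.

```python
import math

def correction_factor(log_residuals, n_params):
    """CF = exp(SEE^2 / 2), with SEE^2 = SSE / (n - p) on the ln scale."""
    n = len(log_residuals)
    see_squared = sum(r ** 2 for r in log_residuals) / (n - n_params)
    return math.exp(see_squared / 2.0)

def predict_volume(d, h, b0, b1, b2, cf=1.0):
    """Back-transformed prediction: V = CF * exp(b0 + b1 ln d + b2 ln h)."""
    return cf * math.exp(b0 + b1 * math.log(d) + b2 * math.log(h))

# Hypothetical log-scale residuals and coefficients, for illustration only
residuals_ln = [0.20, -0.15, 0.05, -0.10, 0.12, -0.08, 0.03, -0.07]
cf = correction_factor(residuals_ln, n_params=3)
naive = predict_volume(10.0, 5.0, b0=-9.5, b1=2.0, b2=0.9)
corrected = predict_volume(10.0, 5.0, b0=-9.5, b1=2.0, b2=0.9, cf=cf)
print(cf, naive, corrected)
```

Because CF is always at least 1, the corrected prediction is never smaller than the naive back transformation; statistics such as RMSE and Bias would then be computed on these original-scale values.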
Although ANNs are a new predictive alternative for dry tropical forests (Vahedi, 2016),
the Schumacher-Hall equation in the linearized form using local data sets can reduce the
uncertainty in estimating commercial trunk volume and is less costly in forest manage-
ment plans. The biases of the estimates (RMSE; Bias) suggest a good measure of the
general predictive value of the alternatives (Akindele & LeMay, 2006; Tesfaye et al., 2016).
In both methodologies, the largest estimation errors are not only absolute numerical
values, but there is also a percentage increase mainly in the simple input equations and
perceptron type networks (Adaline) that only use the diameter as the explanatory variable.
In the ANN training and testing phase, even though only the diameter variable reported
estimates indicating good adjustments (r = 0.92) (Table 4), the RMSE (%) values were
high, which contributes to greater residual distribution amplitude, especially in the
smallest diameters (see Figure 3).
There are two important and visible trends in Table 4 and Figure 3. First, for the
predictions with high bias for the smallest diameters (Figure 3), the trunk and smaller
branches have small commercial volume given their small dimensions, however the height
data have weak correlation with the diameter (Sullivan et al., 2018). Thus, the commercial
volume for branches is further from the total volume in the smaller trees than in the larger
ones, changing the relation between the diameter, height and volume variables (Amaral
Machado et al., 2008). This would explain the tendency in the volumetric estimates of the
trunks and minor branches, as shown in the networks (Table 4), since the trunk and larger
branches would “pull” the values of the synaptic weights corrected in the training to values
of smaller deviations, since these are in greater number and amplitude in the total data set.
Second, they can indicate tendencies resulting from the minimum diameters used in
volume measurement, since the smaller trees present a correlation pattern between the
diameter, height and differentiated volume variables of the larger ones (Amaral Machado
et al., 2008). In addition, the base diameter of the trunk also has a smaller sample number,
and in most cases is larger than the base diameter of the branches, thus presenting greater
variability which results in a residual distribution with greater amplitude in the initial
diameters.
In this case, we observed that Multilayer Perceptron ANNs with more than one
input variable are the most suitable analysis tool due to their high generalization
capacity (Leite et al., 2011). It should also be noted that according to Pandorfi et al.
(2011), one of the problems in training the Multilayer Perceptron neural network type
with backpropagation training is the definition of its parameters. Selection of the
algorithm's training parameters is a process that demands great effort, because small
differences in these parameters lead to great changes in both training time and in the obtained
generalization. Another important fact is that obtaining the appropriate ANN architecture
depends on numerous attempts in order to generate satisfactory results. Since this process
is random, the number of neurons per layer is not based on any criterion, just on an
attempt to reach the error established as a training stop criterion (Bastos Gorgens
et al., 2014; Blanco et al., 2012; Vahedi, 2016).
The good performance of MLP networks in reporting accurate values in function
approximation is due to the activation function in addition to the backpropagation
algorithm (Bastos Gorgens et al., 2009), which in this case is the non-linear
sigmoid logistic, since this function uses the scalar product input (diameter × synaptic
weight and height × synaptic weight) and can approximate any arbitrary continuous
function (Valença, 2011). Soares et al. (2012) used Multilayer Perceptron type networks
and compared them with Radial Basis Function (RBF) networks and with the Schumacher
and Hall (log) equation; the results were close, but the MLP reported statistically more
consistent estimates.
The efficiency of neural networks in volume estimation has already been observed in
some works in natural and planted forests (Bastos Gorgens et al., 2009; Diamantopoulou
& Milios, 2010; Marques da Silva et al., 2009; Özçelik et al., 2010; Vahedi, 2016). The
application of ANNs to dendrometric Caatinga data is still pioneering, although good
applicability is noted due to their capacity to overcome the problems of forest data such as
non-linear relationships, non-Gaussian distribution of residuals, multicollinearity, outliers and noise.
This work seeks to fill the gap on reliable volume predictions in a dry tropical forest in
the Brazilian semi-arid region, as well as to suggest an easy-to-apply and low cost
methodology for local management plans. Currently, the volumetry of forests under forest
management in the region remains estimated by a fixed expansion factor (form factor of
0.55), which in many cases considerably underestimates the total volumetric production
without taking into account the volume of the branches that also have commercial use.
The limiting factor has always been the destructive sampling of the trees for adjusting and
selecting models. Highly accurate volume and biomass estimates of individual trees are
increasingly available through Lidar technology (Castillo et al., 2012; Chen et al., 2007;
Estornell et al., 2011, 2012; Hill et al., 2014). These estimates do not require the destructive
sampling of trees and can be performed systematically in the field (Duncanson et al.,
2015). With adequate sampling, a system could be developed to sample the in situ tree
volume (trunk and branches) in environmental gradients, providing a potential solution to
outstanding problems related to forest biomass and carbon stock.
Conclusions
In this study, site-specific volumetric equations and ANN structures were developed to
predict volume. Our results suggest that using volumetric equations to predict trunk and
tree branch volume in the Brazilian semi-arid region is still more statistically advanta-
geous, although the use of ANNs is not ruled out. The results showed that applying
volumetric models provided a slightly more accurate forecast for volume compared to
ANNs. This shows that there are obviously complex relationships between dependent and
independent biological factors and that volumetric models are able to better explain such
relationships. However, ANNs generalize such relations between the input and output
variables and automatically assimilate them into the network connection weights by
providing reliable predictions in some cases, such as for the trunk (Db > 10 cm).
ORCID
Robson B. de Lima http://orcid.org/0000-0001-5915-4045
Rinaldo L. Caraciolo Ferreira http://orcid.org/0000-0001-7349-6041
José A. Aleixo da Silva http://orcid.org/0000-0003-0675-3524
References
Abich, A., Mucheye, T., Tebikew, M., Gebremariam, Y., & Alemu, A. (2018). Species-specific
allometric equations for improving aboveground biomass estimates of dry deciduous woodland
ecosystems. Journal of Forestry Research, 30, 1619–1632. https://doi.org/10.1007/s11676-018-
0707-5
Aigbe, H. I., Modogu, W. W., & Oyebade, B. A. (2012). Modeling volume from stump diameter of
Terminalia ivorensis (A. CHEV) in Sokponba Forest Reserve, Edo State, Nigeria. ARPN Journal
of Agricultural and Biological Science, 7(3), 146–151. http://www.arpnjournals.com/jabs/research_
papers/rp_2012/jabs_0312_369.pdf
Akindele, S. O., & LeMay, V. M. (2006). Development of tree volume equations for common timber
species in the tropical rain forest area of Nigeria. Forest Ecology and Management, 226(1–3),
41–48. https://doi.org/10.1016/j.foreco.2006.01.022
Almeida, A. Q., Mello, A. A., Neto, A. L. D., & Ferraz, R. C. (2014). Relações empíricas entre
características dendrométricas da Caatinga brasileira e dados TM Landsat 5. Pesquisa
Agropecuária Brasileira, 49(4), 306–315. https://doi.org/10.1590/S0100-204X2014000400009
Althoff, T. D., Menezes, R. S. C., Pinto, A. S., Pareyn, F. G. C., Carvalho, A. L. D., Martins, J. C. R.,
de Carvalho, E. X., Silva, A. S. A. D., Dutra, E. D., & Sampaio, E. V. D. S. B. (2018). Adaptation of
the century model to simulate C and N dynamics of Caatinga dry forest before and after
deforestation. Agriculture, Ecosystems & Environment, 254(15), 26–34. https://doi.org/10.1016/j.
agee.2017.11.016
Amaral Machado, S., Profumo Aguiar, L., Figueiredo Filho, A., & Soares Koehler, H. (2008).
Modelagem do volume do povoamento para Mimosa scabrella Benth. na região metropolitana
de Curitiba. Revista Árvore, 32(3), 465–478. https://doi.org/10.1590/S0100-67622008000300009
Anderson-Sprecher, R. (1994). Model comparisons and R². The American Statistician, 48(2),
113–117. https://doi.org/10.2307/2684259
Bastin, J.-F., Berrahmouni, N., Grainger, A., Maniatis, D., Mollicone, D., Moore, R., Patriarca, C.,
Picard, N., Sparrow, B., Abraham, E. M., Aloui, K., Atesoglu, A., Attore, F., Bassüllü, Ç., Bey, A.,
Garzuglia, M., García-Montero, L. G., Groot, N., Guerin, G., Laestadius, L., & Castro, R. (2017).
The extent of forest in dryland biomes. Science, 356(6338), 635–638. https://doi.org/10.1126/
science.aam6527
Bastos Gorgens, E., Garcia Leite, H., Marinaldo Gleriani, J., Soares, C. P. B., & Ceolin, A. (2014).
Influência da arquitetura na estimativa de volume de árvores individuais por meio de redes
neurais artificiais. Revista Árvore, 38(2), 289–295. https://doi.org/10.1590/S0100-67622014000200009
Bastos Gorgens, E., Garcia Leite, H., Nascimento Santos, H., & Gleriani, J. M. (2009). Estimação do
volume de árvores utilizando redes neurais artificiais. Revista Árvore, 33(6), 1141–1147.
https://doi.org/10.1590/S0100-67622009000600016
Berger, A., Gschwantner, T., McRoberts, R. E., & Schadauer, K. (2014). Effects of measurement
errors on individual tree stem volume estimates for the Austrian National forest inventory. Forest
Science, 60(1), 14–24. https://doi.org/10.5849/forsci.12-164
Bhering, L. L., Cruz, C. D., Peixoto, L. A., Rosado, A. M., Laviola, B. G., & Nascimento, M. (2015).
Application of neural networks to predict volume in eucalyptus. Crop Breeding and Applied
Biotechnology, 15(3), 125–131. https://doi.org/10.1590/1984-70332015v15n3a23
Blackie, R., Baldauf, C., Gautier, D., Gumbo, D., Kassa, H., Parthasarathy, N., Paumgarten, F., Sola,
P., Pulla, S., Waeber, P., & Sunderland, T. C. H. (2014). Tropical dry forests: The state of global
knowledge and recommendations for future research. CIFOR (Discussion Paper no. 2). https://
www.cifor.org/library/4408/
Blanco, A. M., Sotto, A., & Castellanos, A. (2012). Prediction of the amount of wood using neural
networks. Journal of Mathematical Modelling and Algorithms, 11(3), 295–307. https://doi.org/10.
1007/s10852-012-9186-4
Brandeis, T. J., Delaney, M., Parresol, B. R., & Royer, L. (2006). Development of equations for
predicting Puerto Rican subtropical dry forest biomass and volume. Forest Ecology and
Management, 233(1), 133–142. https://doi.org/10.1016/j.foreco.2006.06.012
Burkhart, H. E., & Tomé, M. (2012). Modeling forest trees and stands. Springer Netherlands.
Carpenter, B., Hoffman, M. D., Brubaker, M., Lee, D., Li, P., & Betancourt, M. (2015). The Stan
math library: Reverse-mode automatic differentiation in C++. arXiv Preprint. https://arxiv.org/
abs/1509.07164
Castillo, M., Rivard, B., Sánchez-Azofeifa, A., Calvo-Alvarado, J., & Dubayah, R. (2012). LIDAR
remote sensing for secondary tropical dry forest identification. Remote Sensing of Environment,
121(5), 132–143. https://doi.org/10.1016/j.rse.2012.01.012
Chaturvedi, R. K., & Raghubanshi, A. S. (2013). Aboveground biomass estimation of small diameter
woody species of tropical dry forest. New Forests, 44(4), 509–519. https://doi.org/10.1007/s11056-
012-9359-z
Chave, J., Andalo, C., Brown, S., Cairns, M. A., Chambers, J. Q., Eamus, D., Fölster, H., Fromard, F.,
Higuchi, N., Kira, T., Lescure, J.-P., Nelson, B. W., Ogawa, H., Puig, H., Riéra, B., & Yamakura, T.
(2005). Tree allometry and improved estimation of carbon stocks and balance in tropical forests.
Oecologia, 145(1), 87–99. https://doi.org/10.1007/s00442-005-0100-x
Chen, Q., Gong, P., Baldocchi, D., & Tian, Y. Q. (2007). Estimating basal area and stem volume for
individual trees from lidar data. Photogrammetric Engineering and Remote Sensing, 73(12),
1355–1365. https://doi.org/10.14358/PERS.73.12.1355
Del Frate, F., & Solimini, D. (2004). On neural network algorithms for retrieving forest biomass
from SAR data. IEEE Transactions on Geoscience and Remote Sensing, 42(1), 24–34. https://doi.
org/10.1109/TGRS.2003.817220
Diamantopoulou, M. J., & Milios, E. (2010). Modelling total volume of dominant pine trees in
reforestations via multivariate analysis and artificial neural network models. Biosystems
Engineering, 105(3), 306–315. https://doi.org/10.1016/j.biosystemseng.2009.11.010
Duncanson, L., Rourke, O., & Dubayah, R. (2015). Small sample sizes yield biased allometric
equations in temperate forests. Scientific Reports, 5(1), 17153. https://doi.org/10.1038/
srep17153
Estornell, J., Ruiz, L. A., Velázquez-Martí, B., & Fernández-Sarría, A. (2011). Estimation of shrub
biomass by airborne LiDAR data in small forest stands. Forest Ecology and Management, 262(9),
1697–1703. https://doi.org/10.1016/j.foreco.2011.07.026
Estornell, J., Ruiz, L. A., Velázquez-Martí, B., & Hermosilla, T. (2012). Estimation of biomass and
volume of shrub vegetation using LiDAR and spectral data in a Mediterranean environment.
Biomass & Bioenergy, 46(11), 710–721. https://doi.org/10.1016/j.biombioe.2012.06.023
Giday, K., Eshete, G., Barklund, P., Aertsen, W., & Muys, B. (2013). Wood biomass functions for
Acacia abyssinica trees and shrubs and implications for provision of ecosystem services in
a community managed exclosure in Tigray, Ethiopia. Journal of Arid Environments, 94(7),
80–86. https://doi.org/10.1016/j.jaridenv.2013.03.001
Gillespie, T. W., Lipkin, B., Sullivan, L., Benowitz, D. R., Pau, S., & Keppel, G. (2012). The rarest
and least protected forests in biodiversity hotspots. Biodiversity and Conservation, 21(14),
3597–3611. https://doi.org/10.1007/s10531-012-0384-1
Hasen-Yusuf, M., Treydte, A. C., Abule, E., & Sauerborn, J. (2013). Predicting aboveground biomass
of woody encroacher species in semi-arid rangelands, Ethiopia. Journal of Arid Environments, 96
(9), 64–72. https://doi.org/10.1016/j.jaridenv.2013.04.007
Haykin, S. (2001). Redes neurais: Princípios e prática. Bookman Companhia.
Henry, M., Picard, N., Trotta, C., Manlay, R., Valentini, R., Bernoux, M., & Saint-André, L. (2011).
Estimating tree biomass of sub-Saharan African forests: A review of available allometric
equations. Silva Fennica, 45(3B), 477–569. https://doi.org/10.14214/sf.38
Hill, A., Breschan, J., & Mandallaz, D. (2014). Accuracy assessment of timber volume maps using
forest inventory data and LiDAR canopy height models. Forests, 5(9), 2253–2275. https://doi.org/
10.3390/f5092253
IBGE - Instituto Brasileiro de Geografia e Estatística. (2012). Manual técnico da vegetação brasileira
(2nd ed., R. J. Rio de Janeiro). Retrieved January 17, 2017, from http://biblioteca.ibge.gov.br/
visualizacao/livros/liv63011.pdf
Leite, H. G., Silva, M. L. M., Binoti, D. H. B., Fardin, L., & Takizawa, F. H. (2011). Estimation of
inside-bark diameter and heartwood diameter for Tectona grandis Linn. trees using artificial
neural networks. European Journal of Forest Research, 130(2), 263–269. https://doi.org/10.1007/
s10342-010-0427-7
Lima, R. B., Alves Júnior, F. T., Oliveira, C. P. D., Silva, J. A. A. D., & Ferreira, R. L. C. (2017).
Predicting of biomass in Brazilian tropical dry forest: A statistical evaluation of generic equations.
Anais Da Academia Brasileira De Ciências, 89(3), 1815–1828.
https://doi.org/10.1590/0001-3765201720170047
Magalhães, T. M., & Seifert, T. (2015). Estimation of tree biomass, carbon stocks, and error
propagation in mecrusse woodlands. Open Journal of Forestry, 05(4), 471–488. https://doi.org/
10.4236/ojf.2015.54041
Marques da Silva Binoti, M. L., Breda Binoti, D. H., Garcia Leite, H., Garcia, S. L. R., Ferreira, M. Z.,
Rode, R., & Silva, A. A. L. D. (2014). Redes neurais artificiais para estimação do volume de
árvores. Revista Árvore, 38(2), 283–288. https://doi.org/10.1590/S0100-67622014000200008
Marques da Silva, M. L., Breda Binoti, D. H., Gleriani, J. M., & Garcia Leite, H. (2009). Ajuste do
modelo de Schumacher e Hall e aplicação de redes neurais artificiais para estimar volume de
árvores de eucalipto. Revista Árvore, 33(6), 1133–1139. https://doi.org/10.1590/S0100-67622009000600015
Masota, A. M. (2014). Volume models for single trees in tropical rainforests in Tanzania. Journal of
Energy and Natural Resources, 3(5), 66. https://doi.org/10.11648/j.jenr.20140305.12
Mate, R., Johansson, T., & Sitoe, A. (2014). Biomass equations for tropical forest tree species in
Mozambique. Forests, 5(3), 535–556. https://doi.org/10.3390/f5030535
McRoberts, R. E., & Westfall, J. A. (2014). Effects of uncertainty in model predictions of individual
tree volume on large area volume estimates. Forest Science, 60(1), 34–42. https://doi.org/10.5849/
forsci.12-141
Miguel, E. P., Rezende, A. V., Leal, F. A., Matricardi, E. A. T., Vale, A. T. D., & Pereira, R. S. (2015).
Redes neurais artificiais para a modelagem do volume de madeira e biomassa do cerradão com
dados de satélite. Pesquisa Agropecuária Brasileira, 50(9), 829–839. https://doi.org/10.1590/
S0100-204X2015000900012
Miles, L., Newton, A. C., DeFries, R. S., Ravilious, C., May, I., Blyth, S., Kapos, V., & Gordon, J. E.
(2006). A global overview of the conservation status of tropical dry forests. Journal of
Biogeography, 33(3), 491–505. https://doi.org/10.1111/j.1365-2699.2005.01424.x
Mohammadi, J., Shataee, S., & Babanezhad, M. (2011). Estimation of forest stand volume, tree density
and biodiversity using Landsat ETM+ data, comparison of linear and regression tree analyses.
Procedia Environmental Sciences, 7, 299–304. https://doi.org/10.1016/j.proenv.2011.07.052
Monnahan, C. C., Thorson, J. T., Branch, T. A., & O’Hara, R. B. (2017). Faster estimation of
Bayesian models in ecology using Hamiltonian Monte Carlo. Methods in Ecology and Evolution, 8
(3), 339–348. https://doi.org/10.1111/2041-210X.12681
Mwakalukwa, E. E., Meilby, H., & Treue, T. (2014). Volume and aboveground biomass models for
dry Miombo woodland in Tanzania. International Journal of Forestry Research, 2014(5), 1–11.
https://doi.org/10.1155/2014/531256
Nath, C. D., Dattaraja, H. S., Suresh, H. S., Joshi, N. V., & Sukumar, R. (2006). Patterns of tree
growth in relation to environmental variability in the tropical dry deciduous forest at
Mudumalai, southern India. Journal of Biosciences, 31(5), 651–669. https://doi.org/10.1007/
BF02708418
Návar, J., Ríos-Saucedo, J., Pérez-Verdín, G., Rodríguez-Flores, F. D. J., & Domínguez-Calleros, P. A.
(2013). Regional aboveground biomass equations for North American arid and semi-arid forests.
Journal of Arid Environments, 97(10), 127–135. https://doi.org/10.1016/j.jaridenv.2013.05.016
Naveenkumar, J., Arunkumar, K. S., & Sundarapandian, S. (2017). Biomass and carbon stocks of
a tropical dry forest of the Javadi Hills, Eastern Ghats, India. Carbon Management, 8(5–6),
351–361. https://doi.org/10.1080/17583004.2017.1362946
Özçelik, R., Diamantopoulou, M. J., Brooks, J. R., & Wiant, H. V. (2010). Estimating tree bole
volume using artificial neural network models for four species in Turkey. Journal of
Environmental Management, 91(3), 742–753. https://doi.org/10.1016/j.jenvman.2009.10.002
Pandorfi, H., Silva, I. J. O., Sarnighausen, V. C. R., Vieira, F. M. C., Nascimento, S. T., &
Guiselini, C. (2011). Uso de redes neurais artificiais para predição de índices zootécnicos nas
fases de gestação e maternidade na suinocultura. Revista Brasileira De Zootecnia, 40(3), 676–681.
https://doi.org/10.1590/S1516-35982011000300028
Portillo-Quintero, C. A., & Sánchez-Azofeifa, G. A. (2010). Extent and conservation of tropical dry
forests in the Americas. Biological Conservation, 143(1), 144–155. https://doi.org/10.1016/j.bio
con.2009.09.020
Powers, J. S., Becknell, J. M., Irving, J., & Pérez-Aviles, D. (2009). Diversity and structure of
regenerating tropical dry forests in Costa Rica: Geographic patterns and environmental drivers.
Forest Ecology and Management, 258(6), 959–970. https://doi.org/10.1016/j.foreco.2008.10.036
R Core Team. (2018). R: A Language and Environment for Statistical Computing. R Foundation for
Statistical Computing, Vienna. https://www.R-project.org
Ranaivoson, T., Rakouth, B., Buerkert, A., & Brinkmann, K. (2017). Wood biomass availability for
smallholder charcoal production in dry forest and savannah ecosystems of south-western
Madagascar. Journal of Arid Environments, 146(11), 86–94. https://doi.org/10.1016/j.jaridenv.
2017.07.002
Raymundo, D., Prado-Junior, J., Alvim Carvalho, F., Santiago Do Vale, V., Oliveira, P. E., &
Sande, M. T. (2018). Shifting species and functional diversity due to abrupt changes in water
availability in tropical dry forests. Journal of Ecology, 107(1), 253–264. https://doi.org/10.1111/
1365-2745.13031
Razi, M., & Athappilly, K. (2005). A comparative predictive analysis of neural networks (NNs),
nonlinear regression and classification and regression tree (CART) models. Expert Systems with
Applications, 29(1), 65–74. https://doi.org/10.1016/j.eswa.2005.01.006
Robinson, A. P., & Hamann, J. D. (2011). Forest analytics with R: An introduction. Springer.
Sabogal, C. (1992). Regeneration of tropical dry forests in Central America, with examples from
Nicaragua. Journal of Vegetation Science, 3(3), 407–416. https://doi.org/10.2307/3235767
Sampaio, E., Gasson, P., Baracat, A., Cutler, D., Pareyn, F., & Lima, K. C. (2010). Tree biomass
estimation in regenerating areas of tropical dry vegetation in northeast Brazil. Forest Ecology and
Management, 259(6), 1135–1140. https://doi.org/10.1016/j.foreco.2009.12.028
Serinaldi, F., Grimaldi, S., Abdolhosseini, M., Corona, P., & Cimini, D. (2012). Testing copula
regression against benchmark models for point and interval estimation of tree wood volume in
beech stands. European Journal of Forest Research, 131(5), 1313–1326. https://doi.org/10.1007/
s10342-012-0600-2
Soares, F. A. A. M. N., Flôres, E. L., Cabacinha, C. D., Carrijo, G. A., & Veiga, A. C. P. (2012).
Recursive diameter prediction for calculating merchantable volume of Eucalyptus clones without
previous knowledge of total tree height using artificial neural networks. Applied Soft Computing,
12(8), 2030–2039. https://doi.org/10.1016/j.asoc.2012.02.018
Sprugel, D. G. (1983). Correcting for bias in log-transformed allometric equations. Ecology, 64(1),
209–210. https://doi.org/10.2307/1937343
Sullivan, M. J. P., Lewis, S. L., Hubau, W., Qie, L., Baker, T. R., Banin, L. F., Chave, J., Cuni-
Sanchez, A., Feldpausch, T. R., Lopez-Gonzalez, G., Arets, E., Ashton, P., Bastin, J.-F.,
Berry, N. J., Bogaert, J., Boot, R., Brearley, F. Q., Brienen, R., Burslem, D. F. R. P.,
Canniere, C., & Phillips, O. L. (2018). Field methods for sampling tree height for tropical forest
biomass estimation. Methods in Ecology and Evolution, 9(5), 1179–1189. https://doi.org/10.1111/
2041-210X.12962
Tellinghuisen, J., & Bolster, C. H. (2011). Using R² to compare least-squares fit models: When it
must fail. Chemometrics and Intelligent Laboratory Systems, 105(2), 220–222. https://doi.org/10.
1016/j.chemolab.2011.01.004
Tesfaye, M. A., Bravo-Oviedo, A., Bravo, F., & Ruiz-Peinado, R. (2016). Aboveground biomass
equations for sustainable production of fuelwood in a native dry tropical afro-montane forest
of Ethiopia. Annals of Forest Science, 73(2), 411–423. https://doi.org/10.1007/s13595-015-0533-
2
Ubuy, M. H., Eid, T., Bollandsås, O. M., & Birhane, E. (2018). Aboveground biomass models for
trees and shrubs of exclosures in the drylands of Tigray, northern Ethiopia. Journal of Arid
Environments, 156(9), 9–18. https://doi.org/10.1016/j.jaridenv.2018.05.007
Vahedi, A. A. (2016). Artificial neural network application in comparison with modeling allometric
equations for predicting above-ground biomass in the Hyrcanian mixed-beech forests of Iran.
Biomass & Bioenergy, 88(5), 66–76. https://doi.org/10.1016/j.biombioe.2016.03.020
Valença, M. (2011). Fundamentos das redes neurais: Exemplos em Java. Livro Rápido.
Vanclay, J. K. (2001). Modelling forest growth and yield: Applications to mixed tropical forests. CAB
International.
Vibrans, A. C., Moser, P., Oliveira, L. Z., & de Maçaneiro, J. P. (2015). Generic and specific stem
volume models for three subtropical forest types in southern Brazil. Annals of Forest Science, 72
(6), 865–874. https://doi.org/10.1007/s13595-015-0481-x
Weggler, K., Dobbertin, M., Jüngling, E., Kaufmann, E., & Thürig, E. (2012). Dead wood volume to
dead wood carbon: The issue of conversion factors. European Journal of Forest Research, 131(5),
1423–1438. https://doi.org/10.1007/s10342-012-0610-0
Zeng, W. S., & Tang, S. Z. (2011). Bias correction in logarithmic regression and
comparison with weighted regression for nonlinear models. Nature Precedings, 12 1–11. https://
doi.org/10.1038/npre.2011.6708.1