USE OF DIFFERENT STATISTICAL APPROACHES IN PREDICTION OF METABOLIZABLE ENERGY OF DIETS FOR BROILERS

The energy value of diets is important for feed producers and farmers. Methods for in vivo determination of metabolisable energy are highly accurate, but they are time-consuming and costly. The aim of this study was to investigate the effect of enzymatic digestible organic matter and values of proximate chemical analysis on prediction of the nitrogen-corrected true metabolisable energy (TMEn) of diets for broilers. The performance of an Artificial Neural Network was compared with the performance of a first order polynomial model, as well as with experimental data, in order to develop a rapid and accurate method for prediction of TMEn content. Analysis of variance and the post-hoc Tukey's HSD test at the 95% confidence limit were calculated to show significant differences between samples. Response Surface Methodology was applied for evaluation of TMEn. The first order polynomial model showed a high coefficient of determination (r² = 0.859). The Artificial Neural Network model also showed high prediction accuracy (r² = 0.992). Principal Component Analysis was successfully used in prediction of TMEn.


INTRODUCTION
The energy value of diets is important for feed producers and farmers. Since methods for in vivo metabolizable energy (ME) determination require the use of live animals, they can be considered the most accurate. On the other hand, these methods are often time-consuming and costly (Elkin, 1987; Mohamed et al., 1984; Palić and Leeuw, 2009; Pojić et al., 2008). There has been considerable interest in finding accurate methods for ME prediction that are also rapid and inexpensive (Robbins and Firman, 2005; Zhang et al., 1994).
Recently, mathematical modelling has been increasingly used for the study of such systems. Developed empirical models show a reasonable fit to experimental data and successfully predict ME (Perai et al., 2010). Nonlinear models have been found to be more suitable for real process simulation. First order polynomial (FOP) models, fitted using Response Surface Methodology (RSM), and Artificial Neural Network (ANN) models have gained momentum for modelling and control of processes (Khuri and Mukhopadhyay, 2010; Priddy and Keller, 2005).
ANN models are recognized as a good modelling tool since they provide an empirical solution to problems from a set of experimental data, and are capable of handling complex systems with nonlinearities and interactions between decision variables (Almeida, 2002). The specific objective of this study was to investigate the effect of EDOM and values of proximate chemical analysis on the nitrogen-corrected true metabolisable energy (TMEn) content of diets for broilers. The performance of the ANN was compared with the performance of the FOP model, as well as with experimental data, in order to develop a rapid and accurate method for prediction of TMEn.

Feed and assays
Twenty-one diets for broilers were used in the study. Proximate chemical composition of the diets was determined according to AOAC standard methods (AOAC, 2000). The enzymatic digestibility of organic matter (EDOM) was estimated using a modified method of Boisen and Fernandez (1997). In vivo TMEn content of the diets was determined using the assay described by McNab and Blair (1988).

Principal Component Analysis (PCA)
Principal Component Analysis (PCA) is a mathematical procedure used as a central tool in exploratory data analysis (Brlek et al., 2013). It is a multivariate technique in which the data are transformed into orthogonal components that are linear combinations of the original variables. PCA is performed by eigenvalue decomposition of the data correlation matrix (Abdi and Williams, 2010). This transformation is defined in such a way that the first component has the largest possible variance. The analysis is used to achieve maximum separation among clusters of parameters (Pezo et al., 2013). This approach, which reveals the spatial relationship between processing parameters, enabled differentiation between the samples.
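The decomposition described above can be written compactly as follows. The notation is an assumption introduced here for illustration: Z is the N × p matrix of standardized variables, R its correlation matrix, and v_k, λ_k the k-th eigenvector and eigenvalue.

$$R = \frac{1}{N-1} Z^{\mathsf{T}} Z, \qquad R\, v_k = \lambda_k v_k, \qquad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p \ge 0$$

$$t_k = Z v_k, \qquad \text{share of total variance explained by component } k = \frac{\lambda_k}{\sum_{j=1}^{p} \lambda_j} = \frac{\lambda_k}{p}$$

The last equality holds because the diagonal of a correlation matrix consists of ones, so its eigenvalues sum to p. The cumulative share of the first two components is the quantity typically reported when judging whether a two-dimensional plot represents the data adequately.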

First order polynomial (FOP) model
According to general recommendations, analysis of variance (ANOVA) was performed prior to ANN modelling, in order to check the significance of the effect of the input variables on the output, as well as to justify the later use of the ANN model by the coefficient of determination (r²). Analysis and mathematical modelling were performed using StatSoft Statistica 10.0 software (Statistica, 2010).
The FOP model was used for estimation of the main effect of the process variables on the response. The independent variables used for modelling were dry matter (DM), crude protein (CP), crude fibre (CFi), crude fat (CFa), crude ash (CA), organic matter (OM) and enzymatic digestible organic matter (EDOM), while TMEn was the response variable. The FOP model was fitted to data collected by experimental measurements:

$$Y = \beta_0 + \sum_{i=1}^{7} \beta_i X_i$$

where $\beta_0$ and $\beta_i$ are constant regression coefficients, $Y$ is the response variable, and $X_i$ are the independent variables. The significant terms in the model were found using ANOVA for each dependent variable.
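A minimal sketch of fitting such a model, assuming the usual first-order form Y = b0 + sum of b_i X_i and ordinary least squares via the normal equations. The study itself used Statistica and does not state the estimator, so the function names and the toy data below are hypothetical and serve only as an illustration.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_fop(X, y):
    """Least-squares fit of Y = b0 + sum_i b_i X_i; returns [b0, b1, ...]."""
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve_linear(xtx, xty)


def predict_fop(beta, x):
    """Evaluate the fitted first order polynomial at one observation x."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))


# Made-up data (not from the study), generated from y = 2 + 3*x1 - 1*x2
X = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (4.0, 3.0), (5.0, 8.0)]
y = [2 + 3 * a - b for a, b in X]
beta = fit_fop(X, y)
```

With real data, each row of `X` would hold the measured predictors of one diet and `y` the in vivo TMEn values. For strongly correlated predictors, a QR-based solver from a numerical library would be a more stable choice than the normal equations.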

Artificial Neural Network (ANN) modelling
The database for the ANN was randomly divided into training data (60%), cross-validation data (20%) and testing data (20%). The cross-validation data set was used to test the performance of the network while training was in progress, as an indicator of the level of generalization and of the time at which the network began to overtrain.
The testing data set was used to examine the network's generalization capability.
To improve the behaviour of the ANN, both input and output data were normalized. In order to obtain good network behaviour, it is necessary to follow a trial and error procedure and to choose the number of hidden layers and the number of neurons in the hidden layer(s). The multi-layer perceptron (MLP) model consisted of three layers (input, hidden and output). Such a model has been proven quite capable of approximating nonlinear functions (Hu and Weng, 2009), which is the reason it was chosen in this study. In this work, the number of hidden neurons for the optimal network was ten. The Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm was used for ANN modelling.
After defining the architecture of the ANN, the training step was initiated. The training process was repeated several times in order to get the best performance of the ANN, due to the high degree of variability of the parameters. Training was accepted as successful when the learning and cross-validation curves (Sum of Squares, SOS, vs. training cycles) approached zero. Testing was carried out with the best weights stored during the training step. The coefficient of determination (r²) and SOS were used as parameters to check the performance (i.e. the accuracy) of the obtained ANNs. After the best-behaved ANN was chosen, the model was implemented as an algebraic system of equations to predict the TMEn content of the studied diets.
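The 60/20/20 random division can be sketched as below. The function name, the fixed seed and the rounding rule are assumptions made for illustration; the source does not state how fractional sample counts were rounded.

```python
import random


def split_dataset(n_samples, fractions=(0.6, 0.2, 0.2), seed=0):
    """Randomly split sample indices into training, cross-validation and test sets."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)  # seeded for reproducibility
    n_train = round(fractions[0] * n_samples)
    n_val = round(fractions[1] * n_samples)
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]  # the remainder goes to testing
    return train, val, test


# Example with 21 samples, the number of diets in the study
train_idx, val_idx, test_idx = split_dataset(21)
```

Each index lands in exactly one subset, so the three sets never overlap. With a data set this small, the resulting subsets contain only a handful of samples each, which is one reason repeated training runs are useful.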
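Once trained, a three-layer perceptron reduces to a small set of algebraic equations. The sketch below assumes min-max normalization, a tanh hidden layer and a linear output; the source states none of these choices, and the weights shown are hypothetical placeholders rather than values from the study.

```python
import math


def minmax_scale(v, lo, hi):
    """Map v from [lo, hi] to [0, 1]."""
    return (v - lo) / (hi - lo)


def minmax_unscale(s, lo, hi):
    """Map a scaled value s in [0, 1] back to [lo, hi]."""
    return s * (hi - lo) + lo


def mlp_forward(x, w1, b1, w2, b2):
    """One-hidden-layer perceptron: y = w2 . tanh(W1 x + b1) + b2."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2


# Hypothetical tiny network: 2 inputs, 3 hidden neurons, 1 output
w1 = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]
b1 = [0.0, 0.1, -0.1]
w2 = [1.0, -0.5, 0.25]
b2 = 0.2
y_scaled = mlp_forward([0.0, 0.0], w1, b1, w2, b2)
```

In use, each input would first be passed through `minmax_scale` with the training-set minimum and maximum, and the network output mapped back to physical units with `minmax_unscale`.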

The goodness of fit
The goodness of fit of the developed models (FOP and ANN) was evaluated using the coefficient of determination (r²), the mean relative percent error (P), the root mean square error (RMSE) and the reduced chi-square (χ²). The higher the value of r² and the lower the values of P, RMSE and χ², the better the fit.

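These parameters can be calculated as follows:

$$\chi^2 = \frac{\sum_{i=1}^{N} (Y_{exp,i} - Y_{pre,i})^2}{N - n}, \qquad RMSE = \left[ \frac{1}{N} \sum_{i=1}^{N} (Y_{pre,i} - Y_{exp,i})^2 \right]^{1/2}, \qquad P = \frac{100}{N} \sum_{i=1}^{N} \frac{|Y_{pre,i} - Y_{exp,i}|}{Y_{exp,i}}$$

where $Y_{exp,i}$ is the $i$-th experimentally observed response $Y$, $Y_{pre,i}$ is the $i$-th predicted $Y$, $N$ is the number of observations and $n$ is the number of constants.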
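The four goodness-of-fit measures named above can be computed as in the following sketch. It assumes the standard definitions: r² as one minus the ratio of residual to total sum of squares, P as the mean absolute relative error in percent, and the reduced χ² with N - n degrees of freedom. The function name and the sample numbers are hypothetical.

```python
import math


def goodness_of_fit(y_exp, y_pre, n_constants):
    """Return r2, P (%), RMSE and reduced chi-square for paired observations."""
    n_obs = len(y_exp)
    resid = [e - p for e, p in zip(y_exp, y_pre)]
    ss_res = sum(r * r for r in resid)              # residual sum of squares
    mean_exp = sum(y_exp) / n_obs
    ss_tot = sum((e - mean_exp) ** 2 for e in y_exp)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "P": 100.0 / n_obs * sum(abs(r) / abs(e) for r, e in zip(resid, y_exp)),
        "RMSE": math.sqrt(ss_res / n_obs),
        "chi2": ss_res / (n_obs - n_constants),
    }


# Made-up values for illustration only
y_exp = [10.0, 12.0, 14.0, 16.0]
y_pre = [10.5, 11.5, 14.5, 15.5]
stats = goodness_of_fit(y_exp, y_pre, n_constants=2)
```

Note that some software reports r² as the squared correlation between observed and predicted values, which can differ slightly from the definition used here.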

Sensitivity analysis
Sensitivity analysis is a technique for studying the effects of the observed input variables, as well as the uncertainties in the obtained models and the general network behaviour. The neural networks were tested using sensitivity analysis to determine whether, and under what circumstances, the obtained model might result in an ill-conditioned system (Taylor, 2006). On the basis of the developed ANN model, sensitivity analysis was performed in order to define more precisely the influence of the processing variables on the observed outputs. An infinitesimal amount (+0.0001%) was added to each input variable at 10 equally spaced individual points between the minimum and maximum of the training data. These signals were normally distributed with a constant intensity and frequency. The procedure was used to test the model sensitivity and measurement errors.
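The perturbation scheme described above can be sketched as follows. Holding the other inputs at mid-range while one input is varied is an assumption made here, since the source does not say how they were fixed; `toy_model` is a hypothetical stand-in for the trained ANN.

```python
def sensitivity(model, x_min, x_max, n_points=10, rel_step=1e-6):
    """Output change for a +0.0001% (1e-6 relative) perturbation of each input.

    Each input is evaluated at n_points equally spaced values between its
    minimum and maximum; the remaining inputs are held at mid-range.
    """
    mid = [(lo + hi) / 2 for lo, hi in zip(x_min, x_max)]
    changes = []
    for j, (lo, hi) in enumerate(zip(x_min, x_max)):
        row = []
        for k in range(n_points):
            x = mid[:]
            x[j] = lo + k * (hi - lo) / (n_points - 1)
            x_pert = x[:]
            x_pert[j] = x[j] * (1 + rel_step)  # perturb only input j
            row.append(model(x_pert) - model(x))
        changes.append(row)
    return changes


def toy_model(x):
    """Hypothetical stand-in for the trained network."""
    return 2.0 * x[0] + x[1] ** 2


delta = sensitivity(toy_model, x_min=[1.0, 1.0], x_max=[10.0, 4.0])
```

Each row of `delta` corresponds to one input and each column to one of the equally spaced points, so plotting a row against its grid shows where in the input range the output responds most strongly.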

RESULTS AND DISCUSSION
Results of proximate chemical analysis, EDOM and TMEn content of broiler diets are presented using descriptive statistics in Table 1. DM, CP, CFa, CFi, CA, OM, EDOM and TMEn varied significantly, implying that the experimental data can be fitted using FOP and ANN modelling.

Principal component analysis (PCA)
A preliminary estimation of effects, using RSM on the experimental data, showed that only the EDOM, CFa, CFi and CA variables influenced TMEn at a statistically significant level (p<0.05). Therefore, DM, CP and OM were excluded from further calculations.
The PCA applied to the given data set showed a differentiation between the samples according to the process parameters used, and it served as a tool in exploratory data analysis to characterize and differentiate the neural network input parameters (Figure 1). As can be seen, there is a neat separation of the observed samples according to the assays used. The results show that the first two principal components, accounting for 81.53% of the total variability for TMEn, can be considered sufficient for data representation. CFi content, CA, TMEn and EDOM were more influential in the calculation of the first factor coordinate (contributing 25.5, 29.9, 22.9 and 20.1%, respectively), while CFa content was more influential in the calculation of the second factor coordinate (67.3%). PCA (Figure 1) showed quite good discrimination between samples.

Neurons in the ANN hidden layer
All variables considered in the RSM were also used for the ANN modelling. Determination of the appropriate number of hidden layers and the number of hidden neurons in each layer is one of the most critical tasks in ANN design. The number of neurons in a hidden layer depends on the complexity of the relationship between inputs and outputs. As this relationship becomes more complex, more neurons should be added (Ćurčić et al., 2014).
The optimum number of hidden neurons was chosen by minimizing the difference between the values predicted by the ANN and the desired outputs, using the Sum of Squares (SOS) during testing as the performance indicator. The multi-layer perceptron models (MLPs) used were labelled according to StatSoft Statistica's notation: MLP followed by the number of inputs, the number of neurons in the hidden layer and the number of outputs.

Sensitivity analysis
In order to assess the effect of changes in the inputs on the outputs, a sensitivity analysis was performed. A greater effect observed in the output implies a greater sensitivity with respect to the input (Pezo et al., 2013). The sensitivity analysis tested an infinitesimal change in an input value at 10 equally spaced individual points, ranging between the minimum and maximum of the observed assay, in order to explore the changes in the observed outputs. It was also used to test the model sensitivity and measurement errors. The influence of the input variables on the output variables, i.e. the calculated changes in the output variables for infinitesimal changes in the input variables, is shown in Figure 3. The obtained values corresponded to the level of experimental error, and also showed the influence of CFa, CFi, CA and EDOM on TMEn.
Sensitivity analysis is used to show the influence of the inputs, but it also shows the importance of an input variable at a given point in the input space (Saltelli and Annoni, 2010). As can be seen in Figure 3, TMEn was affected more strongly by the infinitesimal changes in CFa, CFi, CA and EDOM at the extreme values of the input range. These findings are in accordance with the PCA and ANOVA results, as well as with the experimental measurements.

CONCLUSIONS
This paper presented different statistical approaches for prediction of in vivo TMEn content in complete diets for broilers using the results of proximate chemical analysis and EDOM. FOP and ANN-based models were developed for prediction of TMEn for a wide range of input variables. Both models are easy to implement and could be effectively used for predictive purposes, modelling and optimization. Compared with RSM, the ANN model yielded a better fit to the experimental data. Taking into account that a considerable amount and wide variety of data were used in the present work to obtain the ANN model, and considering that the model yielded a sufficiently good representation of the experimental results, it can be expected to be useful in practice.

ACKNOWLEDGEMENTS
This paper is a result of the research within the project III 46012 "Investigation of contemporary biotechnological processes in animal feed production, aimed at

Figure 2. Comparison of experimentally obtained TMEn with ANN and FOP predicted values

Table 1. Results of proximate chemical analysis, EDOM and TMEn content of complete diets for broilers

Table 2. Analysis of variance (ANOVA) of TMEn in diets for broilers
+ Significant at p<0.001 level; * Significant at p<0.05 level, 95% confidence limit; df: degrees of freedom; SS: sum of squares; F: F-test value

Table 3. Performance of the optimal ANN