CREDIT RISK ASSESSMENT OF AGRICULTURAL ENTERPRISES IN THE REPUBLIC OF SERBIA: LOGISTIC REGRESSION VS DISCRIMINANT ANALYSIS

Credit risk assessment of agricultural enterprises in the Republic of Serbia was analyzed in this research by applying discriminant analysis and logistic regression. The aim of the research is to determine which of the financial indicators that analysts consider when reviewing a loan application have the most influence on the decision to approve or reject it. The internal determinants of credit risk of agricultural enterprises are analyzed, i.e., indicators of financial leverage, profitability, liquidity, solvency, financial stability and effectiveness. The two models gave different results regarding the significance of the observed indicators. Only the indicators of profitability and solvency stood out as significant in both models. The discriminant analysis model correctly classified 81.0% of cases, while the logistic regression model correctly classified 89.8%. In modeling the credit risk of agricultural enterprises in the Republic of Serbia, the logistic regression model gives better results. © 2021 EA. All rights reserved.


Introduction
Many authors consider credit risk one of the most important risks that can affect banks (Spasojević, 2013; Dragosavac, 2014). Credit risk is the risk that the client will fail to meet the obligations and terms of a contract. It is present in credit activities as well as in business and investment activities, payments and securities settlements (Sůvová, 2002). Many factors can affect credit risk, including those that are under the control of the bank and those that are beyond its control. Credit risk depends on two groups of factors: 1) exogenous (government regulations, general economic conditions, natural conditions, etc.); 2) endogenous (assessment of the creditworthiness of each client). Various events can generate credit risk, and they can occur at any time in the life of a loan. Credit risk should also be analyzed in terms of the clients' activity sector, taking into consideration the particularities of each client's business (Sbârcea, 2008). Agriculture is the activity sector with the highest risk, primarily due to the characteristics of agricultural production, including dependence on climatic conditions, slow turnover of funds, a specific method of reproduction, a lower level of marketability, the seasonal nature of work, etc. When modeling credit risk for agricultural loans, one must take into account the characteristics of both the agricultural sector and its borrowers. The performance of the sector is also influenced by economic cycles, and it is highly correlated with the typology, commodity, and geographical location of the company (Bandyopadhyay, 2007). Agricultural production is of great importance in the Republic of Serbia: its share in total GDP is about 6% (World Bank, 2019). Consequently, the performance of agricultural enterprises in the Republic of Serbia is one of the crucial aspects of the national economy. The main information on agricultural enterprises is given in their financial statements.
These reports provide a view of the financial position and business results for the observed period (Milić et al., 2018). Therefore, the paper analyzes the financial ratios considered by financial analysts when deciding to accept or reject a loan application from an agricultural company. Research on the credit risk of agricultural enterprises is rare. Wen (2013), treating nonperforming loans as the negative outcome of a bank's credit risk, investigated the impact of gross domestic product, interest rate and money supply on the ratio of nonperforming loans of the Agricultural Bank of China. The results of this study determined that all three observed factors have a significant joint impact on nonperforming loans. Shalini (2013), in a survey of farmers in India, identified the impact of a number of microeconomic variables on agricultural credit management. This research also proposes measures that can minimize the number of nonperforming loans in India. Khanam and Hasan (2013) examined the factors influencing nonperforming loans in the agricultural sector at a bank in Bangladesh. The authors concluded that a high percentage of nonperforming loans reduces the productivity and profitability of banks. Muhović
Finding an appropriate model for credit risk assessment is becoming an increasingly important issue for the banking system of the Republic of Serbia. Therefore, the aim of this research is to demonstrate how discriminant analysis and binary logistic regression models can be used for this purpose. In this paper we analyze credit risk assessment using a financial dataset consisting of 295 loan applications from one bank located in Serbia. The research started from the hypothesis that the models used, discriminant analysis and binary logistic regression, can be used to model the credit risk of the observed companies.

Material and methods
The selected indicators, which financial analysts consider when deciding to accept or reject a loan application, were analyzed by applying two statistical methods: discriminant analysis and binary logistic regression. These methods can be used to estimate the associations between a categorical outcome variable and various covariates. Logistic regression and discriminant analysis are widely used in practice (Ahsan ul Haq et al., 2015). Logistic regression, unlike discriminant analysis, is not based on assumptions about data normality and correlation of independent variables.

Discriminant analysis
The method of discriminant analysis is a multivariate technique for analyzing differences between individual groups of features. Discriminant analysis determines which variables better explain or predict the affiliation of observations to certain groups (Tillmanns & Krafft, 2017). It is used to determine the discriminant function and to classify objects into one of two or more groups based on a set of characteristics that describe the objects. The goal is to maximize the difference between the groups and minimize the differences between individual members of the same group (Gurný P. & Gurný M., 2013). Homoscedasticity was tested using Levene's test and the Brown-Forsythe test. Box's M statistic was used for testing the homogeneity of the group covariance matrices.
The test used to interpret the discriminant functions is Wilks' λ test, which is a measure of the differences among group means of the explanatory variables, and it was used to ascertain the level of significance for each group prediction (Heil & Schmidhalter, 2014). The analysis of the classification function coefficients gives further information on the importance of each indicator in the discriminant function.
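To make the role of Wilks' λ more concrete, the sketch below computes it for a single explanatory variable as the ratio of the within-group to the total sum of squares, so values near 0 indicate well separated group means and values near 1 indicate no separation. The function name and the two small groups of ROA-like values are hypothetical and are not taken from the analyzed sample.

```python
def wilks_lambda_univariate(groups):
    """Wilks' lambda for one variable: within-group SS divided by total SS."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    # Total sum of squares around the grand mean.
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    # Within-group sum of squares around each group's own mean.
    ss_within = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        ss_within += sum((v - group_mean) ** 2 for v in g)
    return ss_within / ss_total

# Hypothetical values of one ratio for approved and rejected applications.
approved = [0.10, 0.12, 0.08]
rejected = [0.01, 0.03, 0.02]

lam = wilks_lambda_univariate([approved, rejected])
print(f"Wilks' lambda = {lam:.4f}")
```

The multivariate statistic reported for a discriminant function generalizes this ratio using determinants of the within-group and total scatter matrices.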

Logistic regression
Logistic regression is a statistical technique for modeling categorical variables. It is most widely used in biomedical research, but it is also increasingly used in areas such as business and finance, marketing and economics (Meyers et al., 2006). Logistic regression is a special type of regression used to predict the outcome of binary variables, i.e., variables that can take one of two possible values (e.g., success / failure). The dependent variable in the binary logistic regression model must be dichotomous (Hosmer et al., 2013).
The logistic regression model has the following form:

\[ \pi(x) = \frac{e^{\alpha + \beta_1 x_1 + \dots + \beta_k x_k}}{1 + e^{\alpha + \beta_1 x_1 + \dots + \beta_k x_k}} \]

where π(x) is the expected value of Y for a given value of X, while α and β1, β2, ..., βk correspond to the parameters α and β1, β2, ..., βk of the linear regression model (Tekić et al., 2020). This function is nonlinear, and an appropriate transformation is needed to linearize it.
Transforming the logistic regression function gives (Kvesić, 2012):

\[ g(x) = \ln\left(\frac{\pi(x)}{1 - \pi(x)}\right) = \alpha + \beta_1 x_1 + \dots + \beta_k x_k \]

The resulting expression is called the logit, and it is linear in the parameters βi, i = 1, ..., k. The parameters of the logistic regression model are commonly estimated by the maximum likelihood method, while the Wald test is used to assess the significance of the coefficients (Basu et al., 2017).
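The relationship between the logistic function and the logit can be checked numerically. The following is a minimal sketch: the coefficient values and the two explanatory values are hypothetical, chosen only to show that applying the logit to π(x) returns the linear predictor.

```python
import math

def logistic(alpha, betas, x):
    """Return pi(x) = e^z / (1 + e^z), where z = alpha + sum(beta_i * x_i)."""
    z = alpha + sum(b * xi for b, xi in zip(betas, x))
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Return ln(p / (1 - p)), the log-odds of probability p."""
    return math.log(p / (1.0 - p))

# Hypothetical coefficients and ratio values, for illustration only.
alpha, betas = -1.0, [2.0, 0.5]
x = [0.4, 1.2]

p = logistic(alpha, betas, x)
linear_predictor = alpha + sum(b * xi for b, xi in zip(betas, x))

print(f"pi(x) = {p:.4f}")
print(f"logit(pi(x)) = {logit(p):.4f}, linear predictor = {linear_predictor:.4f}")
```

Because the logit is linear in the parameters, each coefficient can be read as the change in log-odds of approval per unit change in the corresponding ratio.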
The Hosmer-Lemeshow test was used to assess how well the model fits the data (Hosmer et al., 2013). The tool used to assess the predictive accuracy of the model is the classification matrix. The applied methods also included calculation of the Cox & Snell and Nagelkerke pseudo-R² coefficients. The closer the value of these coefficients is to 1, the better the model fits (Walker & Smith, 2016); the Nagelkerke coefficient rescales the Cox & Snell coefficient so that its maximum value is 1.
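As an illustration of how the two pseudo-R² coefficients are obtained from log-likelihoods, the sketch below uses the standard formulas. The null log-likelihood is computed for an intercept-only model from the 207 approved and 88 rejected applications reported for the sample; the fitted-model log-likelihood of -90.0 is a purely hypothetical value, not a result of this study.

```python
import math

def cox_snell_r2(ll_null, ll_model, n):
    """Cox & Snell pseudo-R2 from null and fitted log-likelihoods."""
    return 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)

def nagelkerke_r2(ll_null, ll_model, n):
    """Nagelkerke pseudo-R2: Cox & Snell divided by its maximum attainable value."""
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell_r2(ll_null, ll_model, n) / max_cox_snell

n_approved, n_rejected = 207, 88
n = n_approved + n_rejected

# Log-likelihood of the intercept-only model, which predicts the
# sample approval rate for every application.
ll_null = (n_approved * math.log(n_approved / n)
           + n_rejected * math.log(n_rejected / n))

ll_model = -90.0  # hypothetical fitted-model log-likelihood (assumption)

cs = cox_snell_r2(ll_null, ll_model, n)
nk = nagelkerke_r2(ll_null, ll_model, n)
print(f"Cox & Snell R2 = {cs:.3f}, Nagelkerke R2 = {nk:.3f}")
```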

Sample and used variables
The statistical analysis included 295 loan applications made by agricultural companies, taken from a bank operating in the Republic of Serbia. The bank processed these applications in the period 2017-2019. The software packages used for statistical data processing were SPSS 21 and R 3.6.3. The dependent variable in this analysis was dichotomous: credit application approved (yes or no). A set of financial and accounting ratios belonging to different categories, such as liquidity, solvency, profitability and economic structure, was selected from the accounts of these agricultural companies as independent variables.
The leverage indicator (LEV) indicates the degree of capital burden on total liabilities. In essence, the lower the value of this ratio, the greater the security of long-term creditors and the solvency of the company. The interest ratio (NIIE) measures the margin between interest costs and company earnings. The larger the margin, the safer the claims of long-term creditors, and vice versa. Net income to operating income ratio (NIOI): operating profit shows the company's earnings after all expenses have been removed, except for tax expenses and certain one-off items, whereas net profit shows the profit that remains after all operating expenses incurred in that period have been deducted from sales revenue. Return on assets (ROA) and return on equity (ROE) are the most commonly used indicators of a company's profitability (Walsh, 2003). Liquidity (LIQ) is the ability of a company to meet its obligations as they fall due. It can be measured in several ways; we used the quick ratio, which relates current assets to short-term liabilities. The ratio of financial liabilities to total assets (FLTA) shows the degree of indebtedness of companies to banks.
The ratio of financial stability (STB) indicates how long-term sources, which consist of capital and fixed liabilities, relate to fixed assets. Indirectly, this indicator reflects the size of working capital, which is an important factor in preserving liquidity because it represents a liquidity reserve. The fixed assets coverage ratio (SOL) shows the extent to which fixed assets are financed by equity. The limit value of this indicator is 1: if the value of the ratio is above 1, the company is considered to be solvent. The ratio of total cost-effectiveness of the enterprise (EFF) is obtained by comparing total revenues with total expenditures. When this ratio is less than 1, the company's revenues are lower than its expenses and its cost-effectiveness is unsatisfactory.
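A few of these ratios can be computed directly from financial statement items. The sketch below is illustrative only: the company figures are hypothetical, ROA and ROE are taken in their usual textbook form (net income over total assets and over equity, which is an assumption about how the bank computes them), and EFF follows the definition above as total revenues over total expenses.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    if denominator == 0:
        return None
    return numerator / denominator

# Hypothetical statement figures for one company (same currency unit).
company = {
    "net_income": 150.0,
    "total_assets": 2000.0,
    "equity": 800.0,
    "total_revenues": 1200.0,
    "total_expenses": 1000.0,
}

roa = safe_ratio(company["net_income"], company["total_assets"])
roe = safe_ratio(company["net_income"], company["equity"])
eff = safe_ratio(company["total_revenues"], company["total_expenses"])

# An EFF value below 1 means revenues do not cover expenses.
cost_effective = eff is not None and eff >= 1.0

print(f"ROA={roa:.4f}, ROE={roe:.4f}, EFF={eff:.2f}, cost-effective={cost_effective}")
```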

Results and Discussions
Out of 295 observed loan applications, 207 were accepted, while 88 were rejected. A list of the independent variables analyzed during the loan application processing and their descriptive statistics are presented in Table 2. From the results shown in Table 2, it can be seen that the average indebtedness of the analyzed companies is 2.79, and it ranged from 0 to 330.67. The high average value of the debt ratio is a consequence of the high indebtedness of those companies in the sample whose loan applications were not approved. The same can be concluded for the other observed indicators, especially the indicators of profitability and effectiveness (ROA, ROE and EFF), whose average values are extremely low due to the low profitability of the rejected companies in the sample. The liquidity of the observed companies was at a satisfactory level: the average observed value of the current liquidity ratio was above 2, which means that the ratio of current assets to short-term liabilities is higher than the recommended ratio of 2:1. The solvency of the observed companies, however, is endangered: the minimum value of the solvency ratio is 0, while the maximum is 0.99, below the limit value of 1.
The analysis started by testing the collinearity of the variables; a within-groups correlation matrix is used to show the correlation between the variables (Table 3). From the results shown in Table 3, it can be seen that the highest correlation coefficients were found between NIIE and ROE (r = -0.88), followed by NIOI and ROA (r = 0.56) and STB and SOL (r = 0.56).
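A collinearity screen of this kind can be sketched as a pairwise Pearson correlation check. In the example below, the variable names, the data values and the 0.8 cut-off are all hypothetical. Note also that it computes plain correlations over all observations, whereas the analysis above uses a within-groups matrix.

```python
import math
from itertools import combinations

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical ratio values for five companies.
data = {
    "ratio_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "ratio_b": [2.0, 4.0, 6.0, 8.0, 10.5],
    "ratio_c": [5.0, 1.0, 4.0, 2.0, 3.0],
}

threshold = 0.8  # illustrative cut-off for "high" correlation (assumption)

# Collect every pair whose absolute correlation exceeds the cut-off.
flagged = [
    (a, b)
    for a, b in combinations(data, 2)
    if abs(pearson_r(data[a], data[b])) > threshold
]

for a, b in combinations(data, 2):
    print(f"r({a}, {b}) = {pearson_r(data[a], data[b]):.3f}")
print("Highly correlated pairs:", flagged)
```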

Results of discriminant analysis
Based on the results shown in Table 5, it can be seen that only one canonical function was isolated.
In Table 6, explanatory variables are sorted according to their importance for separation. The variable ROA has the largest correlation with the discriminant function (0.555), followed by SOL (0.509). The variable STB (-0.005) has the smallest contribution and the smallest correlation. It can also be noticed that all coefficients are statistically significant. Standardized canonical coefficients of the discriminant function (Table 7) represent a relative measure of the influence of each explanatory variable on the discriminant function. The explanatory variable showing the greatest discriminatory influence is SOL, the second is ROA, then ROE, and the other explanatory variables have less influence. The most significant explanatory variables have a positive sign of the discriminant function coefficient, which means that the probability that the loan will be approved increases with the profitability and solvency of the company.
Based on the results from Table 7, the equation of isolated function is:

Results of logistic regression
The backward stepwise method is used to construct the regression model. The selection of variables is conducted in five steps. Only the results of the fifth step, the final model, will be presented. The Omnibus test, i.e., a "goodness of fit" test, is applied to assess the predictive performance of the chosen model. The goodness of fit test (Table 8) shows that the chosen model has good predictive performance and differs statistically significantly from the initial model without independent variables (Sig. < 0.05). Based on the results of the Hosmer-Lemeshow test, the same conclusion is reached. From the results shown in Table 10

Discriminant analysis vs. logistic regression
The receiver operating characteristic (ROC) curve is a measure for assessing the classification performance of the logistic regression and discriminant analysis models (Hosmer et al., 2013). The ROC curve is presented in Figure 1. For the purpose of additional analysis of the predictive power of the two applied statistical methods, the area under the ROC curve (AUC) was calculated. If the AUC is 0.5 or lower, the model has no predictive power. Based on the results shown in Table 12, it can be seen that the logistic regression has an AUC of 0.956, while discriminant analysis has an AUC of 0.902. The areas of both analyses show outstanding discrimination. In the next step of the analysis, the sensitivity and specificity of both models were calculated (Table 13). Based on the results (from Table 14), it can be seen that logistic regression has higher sensitivity and specificity than discriminant analysis. The AUC values also show that the logistic regression model performs better than the discriminant analysis model.
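The comparison measures used here can be illustrated with a small sketch. The labels, predicted probabilities and the 0.5 cut-off below are hypothetical and unrelated to the study's data. The AUC is computed through its rank interpretation, i.e., the probability that a randomly chosen approved application receives a higher score than a randomly chosen rejected one, with ties counted as one half.

```python
def classification_rates(y_true, y_pred):
    """Return (sensitivity, specificity, overall classification rate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / len(pairs)

def auc_score(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))

# Hypothetical outcomes (1 = approved, 0 = rejected) and model scores.
y_true = [1, 1, 1, 1, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.55]

cutoff = 0.5  # illustrative classification threshold (assumption)
y_pred = [1 if s >= cutoff else 0 for s in scores]

sensitivity, specificity, accuracy = classification_rates(y_true, y_pred)
auc = auc_score(y_true, scores)

print(f"sensitivity={sensitivity:.3f}, specificity={specificity:.3f}, "
      f"accuracy={accuracy:.3f}, AUC={auc:.3f}")
```

Sensitivity, specificity and the overall classification rate depend on the chosen cut-off, while the AUC summarizes performance across all cut-offs, which is why both kinds of measure are reported when comparing models.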

Conclusions
Credit risk modeling is a serious challenge in all branches of business, and certainly the biggest challenge is to model and predict credit risk in agriculture. In this paper we compared two statistical techniques, discriminant analysis and binary logistic regression, to determine the influence of eleven ratio indicators on the likelihood that a loan application will be accepted. Based on the results of discriminant analysis, the most important ratio indicators influencing the approval of a loan application are the total equity to total assets ratio, return on assets and return on equity (profitability indicators). Wilks' lambda test and the canonical correlation coefficient value showed the significance of the isolated function. The results of binary logistic regression indicated that the most important predictors included: leverage ratio, net income to interest expense ratio, net income to operating income ratio, profitability ratios (ROA and ROE), stability ratio and total cost-effectiveness. The significance of the logistic regression model was confirmed by the Omnibus test, the Hosmer-Lemeshow test and the pseudo-R² coefficients. Both models show that solvency and profitability indicators stand out as significant determinants of credit risk of the observed agricultural enterprises. The comparison of models was performed by using the overall classification rate, sensitivity, specificity and area under the ROC curve (AUC). The results showed that the logistic regression model outperforms the discriminant analysis model in all observed parameters. Based on all the above, it can be concluded that both statistical models can be successfully applied in financial institutions when modeling credit risk, but that for the analyzed enterprises from the Republic of Serbia, the logistic regression model is a better basis for prediction.
It is important to note that the research was conducted on a sample of only 295 agricultural companies, for the period of the last three years, so in future research the sample size should be increased.

Economics of Agriculture, Year 68, No. 4, 2021, (pp. 881-894), Belgrade