ARE THERE MACROECONOMIC PREDICTORS OF POINT-IN-TIME PD? RESULTS BASED ON DEFAULT RATE DATA OF THE ASSOCIATION OF SERBIAN BANKS

The internal models that banks use to assess the creditworthiness of their borrowers generally give estimates of the probability of default that cover the entire business cycle. For the purposes of applying IFRS 9, however, estimates of the probability of default for a specific point in time are required, as well as the inclusion of different macroeconomic scenarios. Such estimates are based primarily on incorporating the effects of the business cycle, and therefore presuppose a provable link between macroeconomic indicators and realized default rates. In this paper we analyze whether such a link exists in the data of banks operating in Serbia. We use several different approaches to establish this link: linear regression, an autoregressive process, an error correction model, static and dynamic panel approaches, and two Bayesian approaches. On the whole sample, the error correction model shows the best performance and yields factors with acceptable economic intuition. Data by type of product give somewhat less reliable results, which is partly due to the dominant influence of the small and medium-sized enterprise segment on total default rates. The most robust predictors of default rates are the lags of changes in these rates, the reference rate of the National Bank of Serbia and the growth rate of gross domestic product.


Introduction
Lending is the core business of commercial banks. In order to manage their placements as efficiently as possible, banks around the world develop their own methods of assessing the creditworthiness of their borrowers. These internal models differ according to the specifics of the credit portfolios and business segments they cover, as well as the level of sophistication in the use of statistical methods. For reasons of statistical reliability, internal models are usually developed on larger samples that include all historical data on loan repayments. Therefore, they generally give estimates of the Probability of Default (PD) that span the entire business cycle. These so-called Through-the-Cycle (TtC) PDs represent the "time-centered" values of the PD over the entire history that the bank uses to develop the model.
International Financial Reporting Standard (IFRS) 9 has applied since January 1, 2018. The Standard specifies how financial institutions should classify and measure financial instruments in their portfolios, and how to assess their impairment using the concept of expected credit loss. One of the essential differences between this standard and, for example, the Basel II and III standards used to assess capital adequacy is that IFRS 9 requires an estimate of the probability of default for a specific point in time. Point-in-Time (PiT) estimation of PD is necessary because the expected credit loss in IFRS 9 uses cash flow projections that can span a range of maturities, or even (for Stage 2 instruments) several years or decades. This requires that the future (conditional) probabilities of default be correctly predicted and that the effects of different macroeconomic scenarios be incorporated. PiT PD estimates are primarily based on incorporating the effects of the business cycle and on the ability to design forward-looking macroeconomic scenarios, and therefore imply the existence of a provable link between macroeconomic indicators and realized default rates. The basic issues to be addressed in examining this link are the quality of the data on which it is tested, the choice of methodology for establishing a statistical association, its economic interpretation, and the possibility of forming meaningful scenarios.
The existence of a quantifiable and measurable relationship between macroeconomic indicators and realized default rates is a question that has been a subject of interest for researchers over the past decade, since the Global Financial Crisis. Simons  In a slightly broader picture, it is important to take into account the exogenous credit risk determinants. Duffie et al. (2009) found that the likelihood of extreme losses in the event of default for large companies in the USA is significantly higher than the one obtained by assuming that the correlations depend only on observable risk factors. Hence, it is necessary to include other, latent factors. Fons (1991), in addition to many other authors, describes the techniques for forecasting defaults and other credit events.
In this paper, we analyze whether such a link exists in the data of banks operating in Serbia. Several different approaches were used to examine this link: linear regression, an autoregressive process, an error correction model, static and dynamic panel-data models, and two Bayesian approaches. The first part of the paper describes the data used and presents their descriptive statistics. The second part presents the methodological approach. The third part summarizes the main results. Concluding remarks are given in the fourth and final part of the paper.

Data
We used two groups of data in this paper: (1) data on default rates and (2) macroeconomic data. Aggregated, anonymized data from the Association of Serbian Banks (ASB) on the number and amounts of placements in default were used to calculate default rates (DR). These data were available at the loan-type level and were, in addition, divided into three groups of comparable banks (according to their total default rates). Anonymization here means that, due to the confidentiality of the information, the author did not have access to data at the level of individual banks or parts of their portfolios. The data have a quarterly frequency and cover the period from the fourth quarter of 2012 to the fourth quarter of 2018, inclusive.
The default rates in this paper were calculated as the ratio of the total amounts in default to the total outstanding amounts of all loans in the sample. This definition partially removes the sample bias that would appear under an alternative definition based on the number of placements. Under a count-based definition, the sample would contain a relatively large number of placements in default that do not represent large amounts of receivables and therefore make up a relatively small share of total banking assets. A more detailed discussion of the use of different definitions of the default rate is given by Fridson (1991).
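The difference between the two definitions can be illustrated with a minimal sketch. The loan records below are hypothetical and are not taken from the ASB database; the function names are likewise introduced here only for illustration.

```python
# Hypothetical loan records: (outstanding_amount, is_in_default).
# These figures are illustrative only, not ASB data.
loans = [
    (1_000_000.0, False),
    (500_000.0, False),
    (20_000.0, True),
    (10_000.0, True),
    (470_000.0, False),
]


def default_rate_by_amount(records):
    """Amount in default divided by total outstanding amount."""
    total = sum(amount for amount, _ in records)
    defaulted = sum(amount for amount, in_default in records if in_default)
    return defaulted / total


def default_rate_by_count(records):
    """Number of placements in default divided by number of placements."""
    return sum(1 for _, in_default in records if in_default) / len(records)


print(f"by amount: {default_rate_by_amount(loans):.2%}")  # by amount: 1.50%
print(f"by count:  {default_rate_by_count(loans):.2%}")   # by count:  40.00%
```

In this toy portfolio two small loans in default produce a count-based rate of 40 percent, while the amount-based rate is only 1.5 percent, which is the kind of distortion the amount-based definition is meant to reduce.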
The macroeconomic data used in this paper are also quarterly and cover the same period as the default rates (from the fourth quarter of 2012 to the fourth quarter of 2018). They include: the growth rate of gross domestic product (GDP), calculated on a year-to-year basis; the unemployment rate; the inflation rate measured by the consumer price index; monetary aggregate M3, i.e. the broad money; the exchange rate of the euro (EUR) against the Serbian dinar (RSD); the benchmark rate of the National Bank of Serbia (NBS). The data are publicly available in the databases of the Republic Statistical Office (RSO) and the NBS.
Descriptive statistics for all data are shown in Table 1. For each variable (dependent and independent), the table contains the mean value, standard deviation, minimum, maximum, and the value of the Augmented Dickey-Fuller (ADF) test statistic. Based on the ADF values, we note that the only time series for which we can reject the null hypothesis of a unit root is the inflation rate. All other series are therefore non-stationary, and this fact must be taken into account when constructing econometric models. Figure 1 shows the evolution of the aggregate default rate (for all banks and types of products). A downward trend is evident, from a maximum of 7.67% in the first quarter of 2014 to a minimum of 1.21% in the second quarter of 2018. At the same time, in spite of the relatively short series, there are somewhat noticeable patterns of cyclical behavior in the default rates, since the decline is not monotonic.
An overview of default rates by type of product and bank peer group is given in Table 2. The first column of Table 2 lists the types of products according to the categorization in the ASB database (credit cards, cash and consumer loans, loans to entrepreneurs, loans to large corporate entities, loans to local governments and municipalities, mortgage loans, current account overdrafts, loans to agricultural holdings, micro loans, and loans to small and medium-sized enterprises). The second to fourth columns present the default rates by bank peer group. The banks are grouped according to the default rate for all types of products combined (shown in the last row of the table). The last column of Table 2 contains default rates calculated for all banks in the sample, by product type.
Figure 2 shows the evolution of default rates by groups of comparable banks. The third group, which contains the banks with the highest average default rates, is particularly striking. In this group, we notice an increase in default rates, followed by a decrease. The same decrease can be seen in Figure 1 for all banks together. The first two groups of banks show a slightly more gradual decrease, i.e. they exhibit a monotonic downward trend. A careful analysis of Table 2 helps us understand where this effect comes from: default rates are most pronounced in the segments of legal entities and in micro loans. It was these segments that dominated the aggregate defaults during the observed period.

Methodology
We can test the existence of a relationship between the realized default rates and macroeconomic indicators in several ways. A "naïve" approach, which would involve calculating the correlation between default rates and the individual macro indicators, is questionable from the econometric point of view because it neglects the (trend-)nonstationarity of the dependent variable and the independent factors. Therefore, we base our analysis on the regression of stationary variables.
As Table 1 shows, all variables except the inflation rate are non-stationary. The ADF test on the first differences of the default rates and macroeconomic factors shows that the null hypothesis of non-stationarity can be rejected at all relevant levels of significance. Therefore, the first logical step is to run an ordinary linear regression on stationary variables, with the change in the default rate, $\Delta DR_t$, as the dependent variable and different macroeconomic factors $X_{k,t}$ as regressors. Since default rates are bounded between 0 and 1, an alternative that takes this bounded range of $DR_t$ into account is to apply a logistic transformation to the default rate and to replace the previous model with a linear regression on the transformed variable.
Cyclical effects can also be captured through autoregressive terms. Hence, we also include an ordinary autoregressive process, AR(M), with M lags of the dependent variable. Similarly, we can include up to L lags of the macroeconomic factors in an ordinary linear regression, which gives the model denoted X(L). By combining the autoregressive model with the model with lags of macroeconomic (or, more generally, exogenous) factors, we obtain the most complete model, AR(M)-X(L).
As an alternative to ordinary one-step regressions, we can apply an error correction model (ECM). In this model, a long-term equilibrium regression is estimated first, and the first lag of its residual then enters the regression for the changes in the default rate with a coefficient $\gamma$.
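For reference, the time-series specifications described above can be collected in one notation. The block below is a sketch rather than the author's own formulas: the coefficient symbols $\alpha$, $\beta$, $\varphi$, $\mu$, $\theta$, the residual $\hat{u}_t$, the choice to start the lag index of the exogenous factors at $l = 0$, and the differencing of every factor (the inflation rate, being stationary, could equally enter in levels) are all assumptions. Only $DR_t$, $X_{k,t}$, the lag orders $M$ and $L$, and the error-correction coefficient $\gamma$ come from the paper.

```latex
\begin{align*}
\text{Linear regression:}\quad
  & \Delta DR_t = \alpha + \sum_{k} \beta_k \,\Delta X_{k,t} + \varepsilon_t \\
\text{Logistic transformation:}\quad
  & y_t = \ln\frac{DR_t}{1 - DR_t}, \qquad
    \Delta y_t = \alpha + \sum_{k} \beta_k \,\Delta X_{k,t} + \varepsilon_t \\
\text{AR}(M):\quad
  & \Delta DR_t = \alpha + \sum_{m=1}^{M} \varphi_m \,\Delta DR_{t-m} + \varepsilon_t \\
\text{X}(L):\quad
  & \Delta DR_t = \alpha + \sum_{k} \sum_{l=0}^{L} \beta_{k,l} \,\Delta X_{k,t-l} + \varepsilon_t \\
\text{AR}(M)\text{-X}(L):\quad
  & \Delta DR_t = \alpha + \sum_{m=1}^{M} \varphi_m \,\Delta DR_{t-m}
    + \sum_{k} \sum_{l=0}^{L} \beta_{k,l} \,\Delta X_{k,t-l} + \varepsilon_t \\
\text{ECM, long-term step:}\quad
  & DR_t = \mu + \sum_{k} \theta_k \, X_{k,t} + u_t \\
\text{ECM, short-term step:}\quad
  & \Delta DR_t = \alpha + \gamma \,\hat{u}_{t-1}
    + \sum_{m=1}^{M} \varphi_m \,\Delta DR_{t-m}
    + \sum_{k} \sum_{l=0}^{L} \beta_{k,l} \,\Delta X_{k,t-l} + \varepsilon_t
\end{align*}
```

In this notation, a negative and significant $\gamma$ means that deviations from the long-term relationship are partly corrected in the following quarter.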
For the sample split by type of product, we can apply (dynamic) panel-data models. In the general regression equation of this approach, $X_{k,t}$ are different macroeconomic indicators and $D_{i,t}$ are dummies associated with every type of product $i$ and every moment $t$. We use several estimation methods: pooled panel, fixed-effects regression, random-effects regression, and an autoregressive distributed lag (ARDL) model in the dynamic panel.
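The data layout behind a pooled panel can be sketched as follows. This is only an illustration under simplifying assumptions: the series are hypothetical, a single macroeconomic factor is used, and only one product dummy is included instead of the full set of product and time dummies described above. The function name `stack_panel` is introduced here for illustration.

```python
# Hypothetical first differences of default rates by product type
# (percentage points per quarter); not ASB data.
d_dr_by_product = {
    "mortgage": [0.10, -0.05, -0.20],
    "sme": [0.40, -0.60, -0.90],
    "credit_cards": [0.00, 0.05, -0.10],
}

# Hypothetical macro factor, common to all products in a given quarter.
d_m3 = [1.2, 0.8, 1.5]


def stack_panel(series_by_product, macro, dummy_products):
    """Stack product-level series into one row per (product, quarter).

    Each row holds the dependent variable y and the regressors x, where x is
    the macro factor followed by one 0/1 dummy per name in dummy_products.
    """
    rows = []
    for product, series in series_by_product.items():
        dummies = [1.0 if product == name else 0.0 for name in dummy_products]
        for t, value in enumerate(series):
            rows.append({
                "product": product,
                "t": t,
                "y": value,
                "x": [macro[t]] + dummies,
            })
    return rows


panel = stack_panel(d_dr_by_product, d_m3, dummy_products=["sme"])
for row in panel:
    print(row["product"], row["t"], row["y"], row["x"])
```

A pooled regression treats all stacked rows as one sample, so the macro factor repeats across products within a quarter and only the dummies distinguish the segments.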

Results
In this section we summarize the most important results of this research. Simple linear regression with stationary variables and linear regression with a logistic transformation yield no significant factors. Due to the relatively short length of the time series, we ran the autoregressive process up to the fourth lag, AR(4). This corresponds to the assumption that the last four quarters of data on the default rate are sufficient to explain future default rates through the autoregressive mechanism. The AR(4) process gives a statistically significant coefficient only for the fourth lag, $\Delta DR_{t-4}$. The coefficient is negative, which implies that, for example, an increase in the default rate four quarters ago would most likely be followed by a fall in the default rate today. For linear regression with macro indicators, the last four quarters were used, i.e. up to four lags of the exogenous variables. In this model, X(4), the only statistically significant factor is the first lag of the inflation rate, which has a positive regression coefficient. This has a simple interpretation: an increase in the inflation rate leads to an increase in the default rate one quarter later. However, the inflation rate ceases to be a significant factor if the definition of $DR_t$ based on the number of placements in default is used. This makes it an insufficiently robust predictor of future defaults, at least in the X(4) model. The combined model, AR(4)-X(4), gives qualitatively the same result as AR(4): a statistically significant coefficient is obtained only for the fourth lag, $\Delta DR_{t-4}$.
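The cost of differencing and lagging on a short series can be made concrete with a small sketch. The sample period stated in the Data section (2012 Q4 to 2018 Q4 inclusive) gives 25 quarterly observations, so first-differencing leaves 24 and four lags leave 20 usable rows. The helper names and the series below are hypothetical and serve only to show how the lagged regressors of an AR(4) model are laid out.

```python
def first_difference(series):
    """Return the changes x_t - x_{t-1} for t = 1, ..., n-1."""
    return [b - a for a, b in zip(series, series[1:])]


def lagged_design(values, max_lag):
    """Build targets and lagged regressors for an autoregression.

    targets[i] is values[t] and rows[i] is
    [values[t-1], ..., values[t-max_lag]] for t = max_lag, ..., len(values)-1.
    """
    targets, rows = [], []
    for t in range(max_lag, len(values)):
        targets.append(values[t])
        rows.append([values[t - lag] for lag in range(1, max_lag + 1)])
    return targets, rows


# Hypothetical quarterly default rates in percent (not the ASB data).
dr = [7.0, 7.5, 7.2, 6.8, 6.9, 6.1, 5.5, 5.6, 4.9, 4.2]

d_dr = first_difference(dr)              # 10 levels -> 9 changes
y, X = lagged_design(d_dr, max_lag=4)    # 9 changes -> 5 usable rows

for target, row in zip(y, X):
    print(round(target, 2), [round(v, 2) for v in row])
```

Each row pairs the current change in the default rate with its four preceding changes, which is the input an ordinary least squares routine would need to estimate the AR(4) coefficients.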
For the models mentioned above, we do not report explicit values of the estimated coefficients or the corresponding test statistics, for the sake of brevity. Instead, we focus on other models in which the factors have greater explanatory power. One such model is the ECM(4,1), where the first, third and fourth lags of the change in the default rate, $\Delta DR_{t-1}$, $\Delta DR_{t-3}$ and $\Delta DR_{t-4}$, appear as statistically significant factors at the 95-percent confidence level, along with the first lag of the GDP growth rate, the M3 aggregate and its first lag, the NBS reference rate and its first lag, and the inflation rate. The unemployment rate is significant only at the 90-percent confidence level. These results are summarized in Table 3. The coefficients of the statistically significant macroeconomic factors have economically intuitive signs. In particular, the first lag of the GDP growth rate has a negative sign, which indicates that a decline in economic activity is reflected in an increase in loan defaults a quarter later.
Similarly, an increase in the unemployment rate leads to an increase in default rates. This can be seen from the positive sign of this factor, which points to a decline in the repayment ability of borrowers. An increase in the monetary aggregate M3 increases the availability of funding sources, which in turn leads (at least initially) to the allocation of these funds to new placements, potentially more efficiently than before the increase. We therefore see a combined effect of alternating increases and decreases in the default rate. The NBS reference rate has a positive sign, which can be interpreted as the direct effect of an increase in average interest rates, i.e. loan servicing costs. The negative sign of the lagged reference rate could be attributed to cyclicality. The inflation rate has a positive sign, which indicates that inflation adversely affects the debtor's repayment ability. Finally, the coefficient $\gamma$, associated with the first lag of the residual from the long-term equilibrium regression, is statistically significant and negative. This indicates that a long-term equilibrium relationship exists. As significant co-integrating factors, we obtain the GDP growth rate, the inflation rate and the M3 aggregate. The coefficient of determination in this model is very high and amounts to 0.987, indicating that the significant macroeconomic factors, combined with the lags of the default rate, can account for almost 99 percent of the variation in $DR_t$. The chi-square test statistic is highly significant.
In the Bayesian Model Averaging (BMA) approach, we averaged the classical beta estimates over 1024 models that combine factors up to their fourth lag. Table 4 summarizes the factors with a posterior inclusion probability greater than 0.5. We notice that the most robust predictors of changes in default rates are its fourth lag (with a negative sign), the NBS reference rate (with a positive sign) and the GDP growth rate (with a negative sign). Qualitatively the same results are obtained with the stepwise method, which progressively adds factors with a p-value less than 0.10 (Table 5). These three factors explain over 80 percent of the variation in the first difference of the default rate.
Panel regressions, which include the type of product as an additional dimension, give less convincing results. In particular, the pooled panel yields only the monetary aggregate M3 and a dummy variable associated with loans to small and medium-sized enterprises (SME) as significant factors. The coefficient of determination is considerably lower than in the regressions on the total data and equals 0.112. The fixed-effects model, in addition to M3, also gives the first and fourth lags of the difference in the default rate, $\Delta DR_{t-1}$ and $\Delta DR_{t-4}$, as significant factors (with $R^2 = 0.110$). The random-effects model includes the SME dummy as an additional significant variable ($R^2 = 0.227$). The ARDL model provides similar results, with $\Delta DR_{t-3}$ instead of $\Delta DR_{t-4}$ as a significant variable. However, this model is of little practical significance, as it shows that a long-term dependence does not exist. Hence, we do not display detailed results for panel regressions under the classical approaches.
In the Bayesian approach, BMA with panel data averages over 2048 models. Table 6 summarizes the factors with a posterior inclusion probability greater than 0.5. We see that the most robust predictors of the first difference of the default rate are its fourth lag (with a negative sign), the SME dummy (with a negative sign) and M3 (whose average beta coefficient has no significant sign). Similar results are obtained with the stepwise method, which progressively adds factors with a p-value less than 0.10 (Table 7). The significant factors, however, explain less than 20 percent of the variation in the first difference of the default rate.
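The model counts quoted above would be consistent with enumerating every subset of 10 and 11 candidate factors (2^10 = 1024 and 2^11 = 2048), although the paper does not spell out the weighting scheme. The sketch below shows one common variant under explicit assumptions: a uniform prior over models and posterior model weights approximated by the Bayesian information criterion (BIC). The data are synthetic, and the factor names `lag4`, `x2` and `x3` are hypothetical.

```python
import itertools
import math


def ols_rss(columns, y):
    """Residual sum of squares of an OLS fit of y on an intercept plus columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan
    # elimination with partial pivoting.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    for k in range(p):
        pivot = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[pivot] = A[pivot], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2
               for i in range(n))


def inclusion_probabilities(factors, y):
    """Posterior inclusion probability of each factor over all 2**K subsets.

    Assumes a uniform model prior and BIC-approximated model weights.
    """
    names = list(factors)
    n = len(y)
    bics, subsets = [], []
    for size in range(len(names) + 1):
        for subset in itertools.combinations(names, size):
            rss = ols_rss([factors[name] for name in subset], y)
            bics.append(n * math.log(rss / n) + (size + 1) * math.log(n))
            subsets.append(subset)
    best = min(bics)
    weights = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(weights)
    return {name: sum(w for w, s in zip(weights, subsets) if name in s) / total
            for name in names}


# Synthetic data: y depends on "lag4" only, plus small deterministic noise.
factors = {
    "lag4": [0.5, -1.2, 0.3, 1.1, -0.7, 0.9, -0.4, 1.5, -1.0, 0.2, 0.8, -0.6],
    "x2": [1.0, 0.4, -0.9, 0.1, 0.7, -1.3, 0.6, -0.2, 0.3, -0.8, 1.2, -0.5],
    "x3": [-0.3, 0.9, 0.5, -1.1, 0.2, 0.6, -0.8, 0.4, 1.0, -0.6, -0.2, 0.7],
}
noise = [0.05, -0.03, 0.02, -0.04, 0.01, 0.03, -0.05, 0.04, -0.02, 0.02, -0.01, -0.03]
y = [-0.8 * v + e for v, e in zip(factors["lag4"], noise)]

pip = inclusion_probabilities(factors, y)
print(f"models averaged: {2 ** len(factors)}")
for name, prob in pip.items():
    print(f"{name}: {prob:.3f}")
```

With three candidate factors the sketch averages over 8 models; the factor that actually drives the synthetic series receives an inclusion probability close to one, which mirrors the 0.5 threshold used to select factors in Tables 4 and 6.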

Concluding Remarks
In this paper, we have analyzed whether there is a link between default rates and macroeconomic factors, using the database on loan defaults of the Association of Serbian Banks. We applied several approaches to examine the existence of this link. For the data covering the whole sample of banks in Serbia, one-step regressions did not yield any robust relationships between default rates and macroeconomic factors. The error correction model shows the best performance and yields factors with acceptable economic intuition. Lags of the first differences of default rates, the NBS reference rate and the GDP growth rate are the most robust predictors of future default rates. We used panel regressions for the data separated by type of product. Changes in default rates are mostly explained by their own lags and the monetary aggregate M3. A large part of the decline in default rates does not actually arise from the business cycle, but can be explained by idiosyncrasies of the small and medium-sized enterprise segment. The low coefficients of determination, i.e. the small percentage of explained variation, can be attributed to the explanatory variables: additional segment-specific factors are missing that could, along with the dummy variables, contribute to greater explanatory and/or predictive power of the model.
The results obtained are perhaps not sufficiently robust and conclusive. One possible explanation is that in the observed period there were evident cycles in macroeconomic factors, but not necessarily cycles in default rates. In other words, the default rates still do not have enough history to make a "full circle", as is the case in some financially more developed markets. Asynchrony and differences in cycle periods are a major challenge for such models. This holds in general, and not only for Serbia or the region of South East Europe. In order to create a reliable model of the link between default rates and macroeconomic factors in Serbia, it would be desirable to have data at a lower level of aggregation that contain segment variations, so that the panels have more degrees of freedom. An alternative avenue for future research on this topic could use panel data for several comparable countries, for example by creating a regional database on loan defaults.

References