Measuring regional economic disparities in Serbia: Multivariate statistical approach

The identification of regional economic disparities and their extent is an important factor affecting regional development policy formulation. In this work we propose an alternative, multivariate statistical methodology for evaluation of level of economic development of districts in Serbia, and their classification in homogeneous groups, based on five economic indicators. First, the new composite indicator for measuring economic development level (IED) is created using factor analysis, and then the districts were classified according to the obtained IED values. The evaluation of structural quality of thus formed groups was conducted using the non-hierarchical clustering procedure. The approach presented in this paper takes account of statistical assumptions on which the valid application of multivariate methods is based, which makes it advantageous over the current approaches in the literature. The resulting categorization into three district groups clearly confirms the presence of very pronounced regional economic disparities between the less developed districts in southern and eastern part and more developed districts in northern part of Serbia. Districts with city of Belgrade and Novi Sad occupy the dominant positions compared to other districts.


Introduction
One of the most important economic policy issues in each country is to ensure a balanced economic development throughout its territory, as the primary condition for successful integration of individual countries and their regions into global economic trends. The Republic of Serbia is characterized by very pronounced regional development disproportions and asymmetries, which constantly tend to increase (Government of the Republic of Serbia, 2007;Molnar, 2013;Vukmirović, 2013). After decades of neglect and marginalization, a significant step in recognizing the importance and necessity of ensuring a balanced and sustainable regional development in Serbia was made in 2007 and 2009, by adopting the Regional Development Strategy and the Law on Regional Development, respectively. This provided an adequate institutional framework for implementation of principles and mechanisms of regional development management based on modern (European) concepts, and effective reduction of highly-present intra-regional and inter-regional inequalities that jeopardize the socio-economic development of Serbia at national and international level.
Objective measurement and understanding of the achieved level of development of territorial units of different administrative levels (i.e. regions, districts, municipalities), with respect for different aspects (dimensions) of development potential / limitation (i.e. economic, social, infrastructural, demographic, environmental) is the key prerequisite on which strategic planning of balanced regional development and effective implementation of regional policy are based (Government of the Republic of Serbia, 2007). Starting from the multidimensional nature of the balanced regional development concept, the research of regional characteristics mainly involves measurement of a large number of indicators of different development dimensions necessary for perception of the degree of general development of territorial units, their comparison and classification, thus creating a suitable basis for formulating appropriate strategies, development policies, and adequate allocation of incentives and other support instruments by the official authorities and institutions. This approach to the measurement of regional asymmetries has led to the shift of analytical framework from one-dimensional monitoring of values of a large number of individual indicators of different dimensions to the development and application of various multidimensional methodologies for construction of specific composite indicators for measuring poverty, human development, social cohesion / exclusion, quality of life, demographic resources, degree of development vulnerability and socioeconomic development of territories within the country (National Agency for Regional Development (NARD), 2013). However, in spite of the indisputable importance of all other dimensions of regional development and their indicators, it is necessary to emphasize the key role of economic development indicators in assessing the level of development of territorial units. Being indispensable when it comes to alleviating social inequalities, balancing demographic trends and environmental protection, a balanced regional development is primarily economic issue, since "without economic equality there is no national equality" (Government of the Republic of Serbia, 2007).
In addition to the aforementioned, this study was also motivated by the following: 1) there is a possibility to extend the existing methodological framework in researching the structure and modeling of relations within multidimensional economic phenomena, and thus issues of regional characteristics and economic inequalities by applying multivariate statistical methods, 2) it is necessary to demonstrate the advantages of using composite (multivariate) indicators against individual (univariate) analysis of a large number of separate indicators, 3) there is insufficient transparency regarding the proposed methodologies for construction of composite indicators used in theory and domestic practice, as well as the absence of an empirical and statistically based evaluation of their validity, and 4) there is a lack of research focus on the analysis of the degree of economic development of territorial units at district level (NUTS-3) in Serbia.
Thus, the subject of the research is to examine and present the possibilities of applying the methods of multivariate statistical analysis (MVA) in modeling relations between the relevant indicators of economic development of territorial units at district level in the Republic of Serbia. The first objective of the study is to develop an alternative approach to assessing the degree of economic development of districts in Serbia based on a simultaneous examination of the interdependence of five representative economic indicators and creation of their composite combination in the form of the Index of Economic Development (IED), using appropriate MVA methods. The second objective is to classify districts, based on IED values, into statistically evaluated, internally homogeneous and externally heterogeneous groups, for identifying the extent of the present regional disparities.
The practical benefit of the created multivariate classification of districts is that it enables a simpler and more practical, but equally informative comparison of territorial units according to their level of economic development in comparison to the individual, isolated interpretation of the individual economic indicators' values. Accordingly, the created "multidimensional snapshot" of economic development of the observed territorial units can represent a suitable basis for subsequent analyzes of regional asymmetries from the angle of the remaining, important development dimensions (social, demographic, infrastructural, ecological) and their interdependence with the achieved economic development, thus providing information useful for decision makers and experts in the field of planning and implementation of measures and policies of the regional development strategy.

Research background
The analysis of degree of development of different groups of countries and / or territorial units in their composition, using various combinations of indicators of one or several dimensions of development, and, accordingly, an appropriate classification of observed territories into relatively homogeneous groups, in order to identify the presence and / or the extent of regional disparities, represents an attractive research niche for both scientists and professionals, as evidenced by a significant number of published works, and conducted empirical studies. Most of the proposed approaches in assessing the level of regional development are predominantly based on individual or combined application of appropriate multivariate statistical methods. Accordingly, Table 1 shows the key methodological determinants of selected empirical research relevant for the research presented in this paper.

Source: Authors' representations
Meta-analysis of studies presented in Table 1 suggests that a small number of papers analyze regional economic disparities exclusively. Adding indicators of other dimensions of regional development can blur the general picture in terms of placement of territorial units as well as in terms of clear understanding of particular indicator and/or group(s) of indicators, which conditioned it. In other words, the lag in one dimension can be compensated in other dimensions and therefore result in a distorted picture of the level of socio-economic development of that territory (Perišić & Wagner, 2015). The classifications created in this way can not provide precise ranking of territories on the basis of certain, individual aspects of regional development, but only their aggregate position, since the impact of individual development dimensions is incorporated in general categorization.
On the other hand, Goletsis & Chletsos (2011), Igić (2014), Ozaslan, Dincer & Ozgur (2006 used different multivariate methodological approach based on the application of factor analysis or principal component analysis for determination of the weights for individual indicators and construction of specific composite indicators of regional development, thus providing precise quantification of the degree of achieved regional development regarding its different aspects. This created the basis for ranking the territorial units according to the calculated values of created composite indicators. The importance and applicative potential of multivariate methods of statistical analysis in the construction of composite indicators, and the creation of a more objective approach in determining the weights of individual indicators that constitute their structure, are presented and explained in detail in Perišić & Wagner (2015). More precisely, through a critical review of the methodological approach in calculating the index of development of Croatian territorial units (Government of the Republic of Croatia, 2010), these authors emphasize the role of MVA methods in eliminating subjectivity in determination of weights and the importance of ensuring the transparency of the complete process of construction of composite indicators. In addition, according to the authors' opinion, and observations made by Cziraky et al. (2006) and Perišić (2014), it is important to emphasize that one of the main disadvantages of most of the previous studies, is the fact that they neglect or incompletely test the fulfillment of appropriate statistical assumptions, which cast shadow on validity of MVA methods application.

Research methodology framework
In addition to the traditional univariate statistical methods, we also extensively applied several methods of multivariate statistical analysis aimed at examination of simultaneous interdependencies among individual economic indicators and discovery of an "unknown" structure hidden in the observed set of multivariate data. A schematic representation of the research methodology framework, complemented by detailed explanations of the implemented univariate and multivariate statistical methods, is presented in Figure 1. After the selection of individual indicators of economic development, in accordance with the detailed explanation of procedure, guidelines and critical activities in construction of composite indicators presented in the relevant literature (Salzman, 2003;OECD, 2008;Perišić & Wagner, 2015), and adequate verification of justification of its application on a particular set of multivariate observations, a factor analysis has been carried out (for details, see: Kovačić, 1994;Hair, Black, Babin & Anderson, 2010;Everitt, 2010), with the principal component analysis as a factor extraction method, and the application of relevant criteria for selection of optimal number of factors. Factor analysis was implemented in order to determine the total and individual values of communalities, i.e. the share of the variance of each individual indicator that was included and explained by the extracted, common factor solution. Communalities were used as a basis for calculating relative weights, ω j , for individual p variables in the structure of created composite indicator of economic development level (IED), using the following expression: The construction of composite indicator -Index of Economic Development (IED) was carried out using the method of linear additive aggregation of weighted normalized values of individual economic indicators (OECD, 2008): where n and p denote the total number of observation units (i.e. objects) and number of variables (i.e. individual indicators), X ij ' is a normalized value of j th variable for i th observation unit,  j is a weight of j th variable within common factor, and IED i is a value of composite indicator for i th observation unit.
By ranking the IED values and comparing them with the national average (i.e. average IED value), the classification of districts in Serbia, according to the degree of economic development, was made. In order to verify the validity of created IED classification and provide a statistically based evaluation of quality of composite IED indicator, a multivariate classification of districts was performed using the non-hierarchical clustering procedure (for details, see: Kovačić, 1994;Hair et al., 2010), based on the application of k-means method on the values of the initial economic indicators. The analysis of possible differences in the composition of clusters identified by these two classification alternatives was carried out using the silhouette coefficient, as a comprehensive indicator of internal homogeneity and external heterogeneity of the formed clusters (Rousseeuw, 1987;Tan, Steinbach & Kumar, 2006): Within the presented expressions (3), used symbols denote: s ivalue of the silhouette coefficient for i th object (for i =1,2,…,n); a iaverage distance of the i th object to all other objects in its cluster; b iminimal average distance of i th object to all objects belonging to other clusters, individually; and Svalue of the overall silhouette coefficient for particular classification.
In addition, the squared Euclidean distance (Expression 4) was used as an appropriate distance measure for calculating the values of silhouette coefficient for both classification alternatives: Symbols x ij and x kj denote value of i th observation unit (x i ) for variable X j , and value of k th observation unit (x k ) for variable X j , respectively. This is followed by deeper qualitative analysis and interpretation of identified clusters of districts, from the aspect of original variables' values used in the IED construction, along with the corresponding cartographic and multivariate graphical representations (e.g. Chernoff faces). In addition, special attention is devoted to ensuring the fulfillment of the statistical assumptions on which the valid application of presented multivariate analytical procedures is based (for details, see: Tobachnick & Fidell, 2006;Johnson & Wichern, 2007;Hair et al., 2010). Using the appropriate graphical methods and statistical tests, within data preparation phase, the following activities were carried out:  Univariate distribution normality testing using Anderson-Darling, and Shapiro-Wilk test, along with univariate outlier analysis using box plots;  Transformation of the original values of variables for which the deviation from normal distribution was identified, using Box-Cox procedure, defined as (Osborne, 2010): where T(X ij ) denotes the transformed value of i th observation unit for j th variable, X ij , the original value of i th unit for j th variable, and λ j represents the optimal value of transformation parameter for j th variable;  Multivariate distribution normality testing using Mardia's, and Henze-Zikler's test (Ramzan, Zahid & Ramzan, 2013;Korkmaz, Goksuluk & Zararsiz, 2014), multivariate outlier analysis based on the Mahalanobis distance measure (Varmuza & Filzmoser, 2009), and examination of multicolinearity and linearity between variables, based on the application of simple linear correlation analysis;  The normalization of indicators' values by the min-max transformation technique (OECD, 2008;Perišić & Wagner, 2015), with the correction of the range of values from 1 to 10, using the following expressions: where X ij ' represents normalized value of j th variable for i th observation unit, X ij is corresponding original value, while, X j min and X j max denote minimal, i.e. maximal value of j th variable, respectively. Data analysis and all the necessary statistical calculations were carried out using the following statistical software packages: IBM SPSS Statistics, version 20, MacroSAS for multivariate normality, and EduStat 4.05.

Variables, sources of data, and temporal-spatial scope
Using the current territorial organization, defined by the nomenclature of statistical territorial units (NUTS) in the Republic of Serbia (RS), secondary data for selected five indicators of economic development ( Table 2) were collected for 24 administrative districts and Belgrade area (NUTS-3 level). Data were obtained from electronic databases of complex publications of the Statistical Office of the Republic of Serbia (SORS, 2014), and Report on Small and Medium Enterprises and Entrepreneurship (ME & NARD, 2014). All collected data refer to the year 2013. In addition, since 1999 SORS provides no information for districts within the Autonomous Province of Kosovo and Metohiјa, so they are not included in the research. Values of variables X4 and X5 were taken from SORS (2014).

Source: Authors' representations
To obtain the values of selected economic indicators, the authors used not absolute numbers (i.e. number of SMEs, employed/unemployed persons, and total amount of GVA) but their values per capita, per 1000 inhabitants, or percentage participation. This was done in order to neutralize and/or mitigate the impact of total demographic mass of individual districts on the composite indicator values, and thus the resulting classification of districts.
Why were these five specific indicators of economic development selected?
The number of indicators that will constitute the composite index is the tradeoff between the risk of overlapping information and risk of information loss (Mazziotta & Pareto, 2013). It is also determined by the ratio of number of observations and number of variables (minimum acceptable ratio is 5:1), important for the implementation of factor analysis (Hair et al., 2010). Starting from the fact that SMEs sector represents the most efficient segment of the Serbian economy (ME & NARD, 2014;Jakopin, 2015), and given its positive contribution to employment, gross value added and total turnover, it is clear that indicator of density of SMEs territorial distribution (i.e. number of SMEs per 1000 inhabitants), unequivocally provides useful information on degree of economic development of districts. The rest of the selected variables are the most commonly used regional economic development indicators in the "official" studies analyzing regional development of territories within Serbia In addition, regardless of the included dimensions of regional development, the common characteristic of the multivariate studies presented in Table 1 is that they incorporate at least one or a group of indicators of economic development in the analysis of regional inequalities. Scientific papers focused exclusively on the economic character of regional disparities are relatively rare. In addition to that the lack of a consensus regarding the used economic indicators has caused a pronounced variation in their selection. Accordingly, it is important to emphasize that for the purposes of this research, a specific combination of economic indicators has been created by selecting those indicators that are most often used as indicators of the economic dimension of regional development in the presented studies (Table 1). More precisely, depending on the availability of data, the subjective choice of authors and the objectives of the analysis, in more than 55% of analyzed papers, the employment rate (or employment per 1000 inhabitants), the unemployment rate (or unemployment per 1000 inhabitants) or GDP (or GVA) per capita were used as indicators of the economic dimension of regional development. The average wage per employee, as an economic indicator, is present in approximately 40% of the studies in Table 1. Approximately 50% of presented papers simultaneously use employment rates and GDP (or GVA) per capita. One third of the studies include indicators of employment and unemployment (e.g. Cziraky et al., 2003;del Campo et al., 2008;Kurnoga-Živadinović, 2007;Melecky, 2014;Pintilescu, 2011;Rašić-Bakarić, 2006;Lepojević et al., 2015), while quarter of them simultaneously uses these two indicators in combination with GDP (or GVA) per capita (e.g. del Campo et al., 2008;Kurnoga-Živadinović, 2007;Melecky, 2014;Pintilescu, 2011;Rašić-Bakarić, 2006;etc.). Although the simultaneous use of employment and unemployment indicators may initially be debatable and questionable, their incorporation into the structure of composite indicator is fully justified. Namely, due to changes in the manner of recording and treatment of unemployment (SORS, 2014) 3 , employment and unemployment are not exactly the two sides of the same coin, since unemployment is not determined as a simple difference between the total working age population and the number of employees. In addition, relatively low correlation coefficient value (r = 0.475) between these indicators (Table 5) and the absence of multicolinearity problem, which will be explained in detail in the following section, clearly confirm this.

Data preparation and preprocessing
Although certain methods of multivariate analysis are less sensitive to nonfulfillment of certain statistical assumptions, before the application of selected multivariate methods, an examination of whether, and to what extent, the available sample of univariate and multivariate observations meets the defined statistical requirements is performed. Accordingly, normality testing of univariate and multivariate distribution of selected indicators, as a basic prerequisite of a valid implementation of factor analysis, is a necessary step in understanding the nature of the used variables. Presented graphical displays (Figure 2), indicate that in the case of most of the variables, a certain (smaller or larger) deviation of the distribution shape from the normal distribution is present, except in the case of variable X 4 , for which histogram of frequencies follows and reflects the contour of a normal curve. The remaining four variables are characterized by a positively skewed distribution, and kurtosis smaller than height characteristic for normal distribution.
The results of univariate normality testing ( Table 3, 2 nd and 3 rd column), at the level of significance α=0.05, confirm previously stated, visually determined conclusions. More precisely, in contrast to the variable X 4 for which there was not enough evidence to reject the null hypothesis, for variables X 2 , X 3 and X 5 , the results of Anderson-Darling and Shapiro-Wilk normality tests suggest the rejection of null hypothesis, which states that the observed sample of univariate observations is normally distributed, since the corresponding pvalues are lower than predefined level of test significance α.  Based on the graphical representations of univariate distributions (isolated intervals of values at the right tail of distributions), as well as the results of conducted statistical tests, it can be assumed that the identified deviations from normal distribution are, most probably, a consequence of the presence of outliers. Box-plots ( Figure 3) confirm this assumption in the case of variables X 2 , X 3 , and X 5 . For example, the variable X 2 , as one with the most pronounced deviation from normal distribution, is characterized by the presence of two outliers (marked with stars). For variable X 4 , as expected, the presence of outliers is not recorded, unlike the variable X 1 for which Belgrade area is identified as a potential outlier (marked with circle) and most likely, as a cause of positively asymmetric distribution.
In order to ensure the fulfillment of normality assumptions, and to mitigate the impact of identified non-standard observations, the Box-Cox's transformation (Expression 5) of the original values of variables X 1 , X 2 , X 3 and X 5 was performed. The optimal values of the transformation parameter λ for each variable are -1.267 (for X 1 ), -0.356 (X 2 ), -3.052 (X 3 ), and -3.049 (X 5 ). In order to verify the effects of the applied transformation method, previously used univariate normality tests were implemented on the transformed values, at the level of significance α=0.05, and obtained results are presented, for the purpose of comparing, in  In addition, by comparing the calculated values of Mahalanobis distance for each district (ranging from 1.235 to 10.021), as a measure that approximately follows a chi-square distribution with p degrees of freedom, and the value of 97.5 percentile of chi-square distribution (χ 2 5; 0,975 =12.8325), as cut-off value, the presence of multivariate outliers has not been identified in the sample of multivariate observations, since there is no Mahalanobis distance value that is higher than the cut-off value. For the verification of assumption regarding the absence of multicolinearity problem, the analysis of degree and statistical significance of correlation between the observed variables is performed. Based on the analysis of calculated simple linear correlation coefficient values (r) for all pairs of variables, presented in the form of correlation matrix type 5x5 (Table 5), the following conclusions were made: 1) there are no correlation coefficient values that are lower than 0.30, or higher than 0.80, since they all range from r min =0.439 to r max =0.769, (indicating no weak linearity or multicolinearity problem between the selected indicators), 2) the presence of direct correlation between all pairs of variables is dominant, except in the case of variable X 4 , which is negatively correlated with other indicators, and 3) based on the procedure of testing statistical hypothesis regarding the population correlation coefficient (ρ), the statistical significance of linear correlation between all pairs of variables was confirmed. Note: (**) and (*) denote statistical significance of correlation at 0.01 and 0.05 level of significance, respectively.

Source: Authors' calculation (IBM SPSS Statistics 20.0)
Finally, having in mind that the selected variables are measured in different units, within the last step of data preparation process, the procedure of normalization of their values is carried out, using the method of min-max transformation (Expression 6, 7) for converting original and/or transformed values of variables (X 1 , X 2 , X 3 , X 4 , X 5 ) into normalized values (X 1 ', X 2 ', X 3 ', X 4 ', X 5 ') ranging from 1 to 10. An inverse coding (Expression 6) has been performed only for the variable X 4 , since higher unemployment implies a lower level of economic development, and vice versa.

Construction of Index of Economic Development
Starting from the confirmed linearity, univariate and multivariate normality, as well as the absence of multicolinearity, univariate and multivariate outliers, in order to check the adequacy of selected indicators for the application of factor analysis, an additional examination of strength and statistical significance of linear interrelationship between variables, based on the interpretation of Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy, and the results of Barttlet's sphericity test, was carried out. The overall KMO value (0.754) for the set of variables is greater than the specified (Tobachnick & Fidell, 2006), minimally acceptable value (0.50), indicating the existence of a strong linear interdependency between the observed variables. Due to the Bartlett's test statistic value (χ 2 =68.244) with 10 degrees of freedom, and corresponding pvalue (0.000) at significance level 0.05, the null hypothesis is rejected, and the alternative, stating that correlation matrix (at the population level) differs from an identity matrix (i.e. there is a statistically significant linear interrelationship between the variables), is accepted. KMO values for individual variables range from 0.699 (moderate interdependence) to 0.881 (very strong interdependence), and thus confirm the statistical significance of correlation of individual variables with other variables. Based on the confirmed sampling adequacy, a valid implementation of the Rtype exploratory factor analysis, without a priori determination of a number of common factors, was carried out ( Table 6). In order to identify the best linear combination(s) of variables (i.e. factors) accounting for most of the variance in the data as a whole, following criteria were used: Kaiser-Guttman rule, percentage of variance explained criterion, and Cattell's scree test. After the interpretation of these criteria, a single-factor solution was extracted as an optimal, since of the total of 5 identified latent variables, only the first is characterized by an eigenvalue (3.399) greater than 1.00 (Kaiser-Guttman rule), accounting for approximately 68% of the total initial variance of the observed set of variables (more than 60% is defined threshold by percentage of variance explained criterion). This was confirmed by Cattell's scree test.
Communalities for individual variables ( Table 7, 3 rd column) also confirm the validity of extracted single-factor solution, which explains 67,976% of the total variance of the initial sets of variables. In fact, more than 50% of the variance of each of the observed variables is explained by a common factor ( Table 7,  4 th column), whereby variables X 2 ' and X 5 ' are characterized with highest percentages of explained initial variance (78.3% and 83.1%), followed by variable X 3 ' (66.8%), X 4 ' (58.8%), and finally X 1 ' (52.9%). In fifth column, the unrotated factor loadings are presented, since only one factor was extracted, and therefore, there were no empirical grounds, nor the need for the implementation of the factor rotation method.

Source: Authors' calculation (IBM SPSS Statistics 20.0)
In addition, based on insight into factor loadings of individual variables (ranging from 0.727 to 0.912), as indicators of strength of correlation between each of the variables and extracted factor ( Table 7, 5 th column), it can be concluded that practical significance of the obtained single-factor solution is satisfied, since all variables are characterized by factor loadings which are significantly greater than +0.30, as a minimum satisfactory threshold value. Since each of the five variables is used for measuring and describing particular, specific aspect of a common phenomenon (i.e. economic development level), the extracted factor is named Index of Economic Development (IED). Finally, on the basis of presented and interpreted values of communalities and factor loadings, determination of weights (ωj) of initial economic indicators in the structure of a composite indicator IED (extracted factor) is performed by calculating the relative share of communalities in the identified eigenvalue of extracted factor ( Table 7, 6 th column), using Expression (1). After weighting of the normalized values of individual indicators, the IED values are calculated, using the Expression (2). More specifically, the structure of the formed composite indicator (IED) can be presented as follows: where X ij ' are normalized values of individual indicators (for i = 1, 2,…, 25, and j =1, 2,…, 5). The IED values range from 1 to 10, where districts with a values closer to 10 are characterized by a higher degree of economic development, and vice versa. The maximum value of the index of economic development of districts in Serbia is 9.8851 (Belgrade area), and minimum value is 1.7513 (Toplica district), which is within defined range of values.

Classification of districts
Ranking of districts according to their IED values (Figure 4) clearly indicates the presence of significant regional disparities, in terms of the level of economic development.  Belgrade area and South Backa district are territories characterized by the high level of economic development (Group 1), since their IED values are over 150% of the national average. On the other hand, eight districts (mainly within the Region of Eastern and Southern Serbia) recorded IED values under 80% of the national average, due to which the same can be classified within the group of districts with low level of economic development (Group 3). The largest number of districts (precisely 15, i.e. approx. 60% of the total number of observed districts) is characterized by the IED values moderately above or slightly below the determined average value, i.e., between 80% and 130% (<150%) of the national average, which is why they can be classified within the group of district with moderate (average) level of achieved economic development (Group 2). For evaluation of identified structure of the three groups of districts, a nonhierarchical cluster analysis was implemented on normalized values of selected economic indicators, using k-means method. More precisely, the obtained results are used for statistical evaluation of quality and validity of constructed composite indicator (IED), and therefore, the justifiability of its application for classification of districts in Serbia according to the achieved degree of economic development. By comparing the classification solutions based on IED values and non-hierarchical clustering, a minimum deviation has been identified. In fact, identical allocation was obtained for all districts regarding their membership to certain groups, except in the case of Pirot district, which was, according to the results of applied k-means method, excluded from the group of districts with moderate economic development and assigned to the group of low economic development. The final decision regarding the position of Pirot district was made based on the comparison of silhouette coefficient values calculated for both classifications ( Table 8). The higher value of the overall silhouette coefficient, and the negative value of individual silhouette coefficient for Pirot district in the non-hierarchical solution, suggest the retention of the classification alternative based on IED values, as more favorable ("more optimal").

Interpretation of classification of districts
The formed classification of districts in Serbia, based on their economic development, is presented visually, through a cartographic representation (Figure 7), and Chernoff faces (Figure 8), along with their interpretation. / 5 (or ≈ 60%) of the districts within this group belong to the Region of Eastern and Southern Serbia. The average IED value in this cluster is 3.10, which is at the level of ≈60% of the national average (5.25). The average and even maximum values of all individually observed economic indicators are here significantly below the corresponding national average ( Table 9). More precisely, twice lower value of GVA per capita, ≈10% lower amount of average wage, 4.5% lower employment rate, 19% more unemployed, and 13% smaller number of SMEs per 1000 inhabitants, in average, compared to the corresponding national averages, unambiguously point to the general economic vulnerability and catastrophic consequences of mismatches of natural, production and material resources, as constraints from earlier periods, amplified by the wave of economic devastation that hit these districts during the transition process. By comparing with average values in Cluster 1, the presence of extremely uneven regional development and alarming regional economic disparities between the districts has been confirmed. The ratio of degree of development of these two clusters varies from 1.43:1, regarding the amount of average wages, to drastic 5.57:1, in terms of GVA per capita, in favor of Cluster 1. The seriousness of the observed situation from the aspect of economic development of districts in Cluster 3 is perhaps best illustrated by the "sad" expressions of Chernoff faces on the verge of tears, with lowered eyebrows, narrow and small nose, and small and downward-facing mouth, which is illustrated later, in Figure 8.  Group 1 (high level of economic development) includes only South Backa district and Belgrade area. This classification outcome is partially expected, bearing in mind that these same districts were, in the pre-processing phase, identified as outliers in almost all variables (Figure 3). Their allocation within Group 1 is conditioned by the fact that in their territory are the cities of Belgrade and Novi Sad, which represent the leading economic, administrative, cultural and university centers in the RS and Vojvodina region, with exceptionally favorable geographical position and infrastructure preconditions (road and railroad nodes, Danube corridor, airport) for intensive economic development and attraction of domestic and foreign investments.
The dominant position of these two areas in terms of economic development is confirmed by their average IED value of 9.18, which is ≈75% above the national average. Approximately 30% higher average wages and lower number of unemployed, 14.3% higher employment rate and 50% more SMEs per 1000 inhabitants explain almost three times higher amount of GVA per capita in these two areas, compared to the corresponding national averages per individual economic indicators. The "prepotent" expressions of their Chernoff faces, high-edged eyebrows, wide / big nose and cynical smile, best explain their position from the aspect of achieved economic development in relation to the rest of the RS territory.
Although there are numerous studies dealing with the analysis of regional disparities within the territory of Serbia (e.g. Republic Development Bureau, 2009; NARD, 2013; Province of Vojvodina Secretariat for Interregional Cooperation and Local Government, 2014;Molnar, 2013;Jakopin, 2014), regardless of similarity of the objectives, the results are not comparable with ours, due to theoretical and methodological differences and temporal scope of data. More precisely, these studies mainly provide univariate analysis based on the separated classifications of different territorial units (regions, districts, or municipalities) for one particular or several individual regional economic development indicators. In other words, each of the used economic indicators has its own specific classification of territorial units, and classifications based on different indicators mainly differ. The results of the multivariate classification of the districts presented in this paper, which is based on a simultaneous analysis of the five selected economic indicators, are not comparable with the correspending five univariate, individual classifications. The methodological approach applied here, enables sublimation, with minimal loss, of information contained in separated, individual classifications, and their presentation in a single common classification, providing more contextualized, less fragmented insight into the phenomenon at hand.
Nevertheless, the obtained results confirm the well-known fact that the Belgrade area and the South Bačka district represent the engine of economic development in Serbia, in contrast to the districts in the eastern and southern parts of the territory. However, the proposed methodological framework based on the construction and calculation of values of the innovative composite indicator provides quantification of the achieved level of economic development of each Individual territorial unit, allowing their comparison and precise view of their position, in terms of the values of all five used economic indicators, in relation to the leading districts, those least developed, or any other district within the same or some other cluster. These are precisely the key advantages, in relation to the multivariate classifications provided by the application of cluster analysis in the research carried out by Popović & Maletić (2008), Lepojević et al. (2015), and Janković-Milić, Marković & Igić (2013). In addition to this comparison of methodological approaches, due to the differences in the spatial-temporal coverage of data, as well as the difference in terms of used indicators and dimensions of regional development, a concrete comparison of empirical results would not be valid.
In comparison to Igić (2014), which uses data for the year 2011 and covers the same territorial units (districts) in Serbia, our research provides a more rigorous multivariate statistical analysis used for construction of a proposed composite indicator, and fulfillment of statistical assumptions on which the valid application of presented MVA methods is based. Methodological differences are reflected in the application of univariate and multivariate outlier analysis, testing the normality of univariate and multivariate distribution of indicators, and the use of non-hierarchical clustering procedure and corresponding optimality criteria for verification of validity of the performed classification of districts based on the composite indicator values. In addition, the use of indicators of demographic, technological and environmental dimensions of regional development, in addition to the economic one, results, as a rule, in different weights in the composite index structure preventing direct comparison of the obtained classification results.

Conclusions
In this paper, an alternative multivariate statistical approach for the classification of districts in Serbia according to economic development is proposed. It is based on simultaneous observation and measurement of values of five representative economic indicators, and their aggregation in the form of composite indicator IED. Developed as intensive and statistically justified use of selected MVA methods, the proposed approach in the analysis of regional economic disparities is characterized by the following practical and methodological advantages in relation to the majority of previously implemented research of similar character.
 Multivariate compression of five representative indicators to the level of univariate composite indicator provides a more practical yet equally informative classification of the observed territories according to the degree of economic development. In comparison to the approach based on a separate interpretation of values of individual economic indicators, the classification obtained by this study is more illustrative, comprehensible and informative when used to formulate and efficiently implement concrete measures in the domain of regional development strategy.
 When it comes to opportunities for future research, the classification of districts according to the values of the proposed composite indicator (IED) can serve as a suitable basis for further and deeper analysis of the situation and tendencies within their territories in relation to other important dimensions of regional development, and formulation of adequate conclusions regarding their interdependence and causality.
 The proposed multivariate statistical approach provides a tool for the estimation of the development level of each territorial unit that allows comparison of the evolution of development over time, identification of disparities and growth patterns and, therefore, can be used as a policy tool for planning and resource allocation.
 The key advantage of the proposed methodological approach is the thorough verification of fulfillment of statistical assumptions on which the valid application of presented MVA methods is based, as an indispensable and initial, though frequently neglected and omitted, step in securing the scientific foundation of the results and drawn conclusions.
 The statistical validity of the IED (i.e. weight coefficients of individual indicators in its composition), and thus derived classification of districts, was confirmed by the results of non-hierarchical clustering analysis.
 On the other hand, the obtained classification, complemented by detailed interpretation, cartographic representation and Chernoff faces, as a specific, rarely used, but very informative multivariate graphical representation, confirms pronounced regional economic inequalities among observed territories, in Serbia in 2013. By analyzing the IED values, three groups of districts have been identified according to the level of economic development, and they are as follows: group of districts with high (group 1), average (group 2), and low (group 3) economic development. The ratios of the average IED values of the identified groups vary from 1.56:1 (group 1 vs. group 2) to 1.89:1 (group 2 vs. group 3) and 2.96:1 (group 1 vs. group 3). The ratio of the best and worst-ranked districts (Belgrade and Toplica) of 5.64:1, best describes the degree and seriousness of the present disparities among the districts in Serbia. Generally, the performed classification confirms the well-known division between the developed northern and undeveloped southern part of Serbia, since all districts of the Vojvodina region (except for South Bačka) are positioned within the group of average economic development, while the largest part of the Region of Eastern and Southern Serbia is characterized by low economic development.
To sum up, the applied multivariate approach enables a clear, objective, transparent and statistically valid analysis of the degree of economic development of territorial units in Serbia, thus providing a reliable basis for redefining and efficient implementation of appropriate measures within the framework of the regional development strategy aimed at mitigating the evidently present and expressed regional economic asymmetries.