MODELING OF THE NEED FOR PARKING SPACE IN THE DISTRICTS OF MOSCOW METROPOLIS BY USING MULTIVARIATE METHODS

The growth of metropolitan cities and, consequently, of the number of vehicles cruising within their boundaries creates a permanent problem of dissatisfaction with the amount of parking space and of its over-occupancy. The results of continuous observation of parking lots in Moscow and data on registered cars in the city districts were the initial basis for this study. The data were processed with the IBM SPSS Statistics 20 statistical program to obtain descriptive statistics of parking space indicators in Moscow, to analyze cause-and-effect relations, and to carry out subsequent multivariate modeling using regression analysis, logit regression, discriminant analysis and “classification trees” (decision trees). The results clearly show the possibility of applying two methods of multivariate statistics: logit regression and “classification trees”. Both models allow for using the explanatory variables “proportion of parking lots with violations” and “number of parking spaces in the street and road network” to analyze the impact on parking lot occupancy. Also, the descriptive statistics analysis revealed that the number and proportion of parking lots with violations are on average 2 times higher in the districts with over-occupied parking lots than in the districts where the parking lot occupancy is not so high, and the number of paid parking lots is over 10 times lower. An increase in the proportion of parking spaces with violations from 0 to 0.2% entails a sharp increase in parking space occupancy (up to 90%), while a further increase in the proportion of parking spaces with violations does not entail a significant increase in parking occupancy.


INTRODUCTION
Mass urban migration will, per some forecasts, lead to the situation where "by 2040 cities will be populated by 70% of the world's population" [20]. Such population growth in the cities will lead to significant transport problems, one of which is the growth of traffic and of parking space occupancy. Solving this problem requires proper planning and parking policy management. This article presents a study on modeling the availability of parking spaces in the metropolis, as exemplified by the metropolitan city of Moscow. The study is aimed at finding tools for modeling parking space occupancy for the subsequent creation of technological smart solutions within the "Smart City" strategy. An important issue for the development of municipal parking policy in the metropolis is the provision of various districts with parking spaces. A significant challenge here is to develop a model that determines the demand for parking spaces in municipal districts. The importance of solving this problem stems from the fact that the arrangement of parking space is becoming critical in municipal planning [17,23] in the face of population growth in major metropolitan areas [2,7,8]. The metropolis of Moscow is no exception, as its number of cars per 1,000 inhabitants is equal to that in the world's largest metropolises [19]. In this paper, we present part of the results of a study of parking lots in Moscow that was conducted from December 26, 2018 through June 30, 2019. The outcomes of field studies of parking spaces available in the metropolis of Moscow, performed by 285 professors, experts and students of Plekhanov Russian University of Economics, were the basis of the initial data used for the analysis. An observation method was used to obtain the data [16].
The literature review revealed certain gaps in the analysis of the factors influencing parking space occupancy in metropolitan urban areas. Therefore, the objectives of this study were: • To obtain descriptive statistics of the indicators of parking space available in Moscow; • To identify the factors of influence on the basis of the analysis of cause-and-effect relations between the explanatory indicators of parking space and parking space occupancy; • To analyze and verify the effectiveness of predictive models for determining the parking occupancy. Based on these objectives, the following hypotheses are proposed: H1. The dependence of the parking occupancy indicator on predictors is nonlinear; H2. The models of parking space provision in the metropolis can be formulated on the basis of the revealed cause-and-effect relations between the explanatory indicators of parking space and parking space occupancy by applying the methods of multivariate modeling.
The data array was analyzed in this study using the IBM SPSS Statistics 20 software package. The total absolute and averaged relative indicators for each district of Moscow, as well as additional indicators, were calculated from the obtained data array using consolidated tables. Occupancy (%), the ratio of the number of parked cars to the number of parking spaces, was calculated on the basis of initial data for each parking lot. Then, with the help of correlation analysis and scatter plots, causal relations between the explanatory indicators of parking space and parking lot occupancy were determined. The parking lot occupancy models were developed on the basis of the obtained data using statistical methods of multivariate modeling: regression analysis, logit regression, discriminant analysis and classification trees. The obtained results showed that the models in which the predicted indicator is parking lot over-occupancy in the city district were of the greatest interest for parking occupancy modeling in terms of quality and efficiency. These are the logit regression model and the classification tree model. The novelty of the study stems, firstly, from the revealed cause-and-effect relations between the occupancy of parking spaces, over-occupancy of parking lots and parking spaces with violations; secondly, from the hypothesis about the nonlinear nature of the relation between the indicators and the parking lot occupancy, which was set and confirmed in the course of the study; thirdly, from the outcomes of multivariate modeling, which revealed the most appropriate methods of parking lot occupancy modeling. We assume that the proposed models can help the city authorities create smart solutions for parking policy implementation, both in the metropolis as a whole and in particular urban districts.

LITERATURE REVIEW
Two main directions in parking space modeling can be identified in the reviewed literature, namely simulation based on queuing theory and modeling by using statistical methods.

Queuing theory based modeling
The literature suggests the use of queuing theory to model various parameters of parking methods and parking spaces [9,29]. The modeling and management of the transport behavior of drivers based on queuing theory with the use of smart parking systems is an important issue discussed in the literature. Such systems are expected to reduce unstructured parking and parking violations and to increase the efficiency of parking space occupancy in the permitted parking lots [6,22,24,30,34]. Queuing theory studies and models the processes in which requests for a certain type of service arise and are served [11]. The studied objects may include production processes, supply processes, transport, trade and stocks. For example, the authors demonstrate the application of queuing theory to model the distribution of elevators in underground parking [35]. The random nature of the main parameters of queuing systems is a common feature of all problems of queuing theory: • The number of units received for service and the time interval between their arrivals are random variables; • The duration of service of each request is random [12]. The queuing system is usually divided into two parts: • The serviced subsystem (population, customers, drivers), where the demand (need) for servicing arises; • The servicing subsystem, which is designed to cover the incoming servicing demands. The literature presents studies that attempt to model a navigation system for predicting parking occupancy and the availability of parking spaces on the basis of queuing theory [15]. In some studies, this problem is localized to a part of the urban space, for example the business part of the city [32]. Other researchers propose a method, also based on queuing theory, for determining the number of entry terminals at the entrances to a closed parking lot (garage) [13].

Modeling based on statistical methods
Approaches to modeling based on statistical methods are more widely presented in the literature. Various statistical methods and sets of indicators can be used to model parking space occupancy [5]. The literature provides examples of modeling user behavior and the time required to park a car [14]. That study deploys two types of multiple linear regression models using SPSS software. Based on the analysis of the models, the most appropriate explanatory variables affecting the demand for parking are identified with regard to the socio-economic behavior of the user. The authors showed that the most influential explanatory variables for this model are the travel distance and the monthly income. The cumulative model study identifies the most appropriate explanatory variable for evaluating the parking demand and, according to the researchers, can be used to determine the number of parking spaces needed to meet the parking demand. In some works [3], parking modeling serves as the basis for finding "Smart City" solutions, where flow management, space availability management and transport system development are based on simulation and modeling that proceed from formalization using the Discrete Event System Specification. Other researchers base their models on forecasts that use cumulative data on the availability of parking spaces [26]. Zheng et al. [35] proposed an interesting approach that combines statistical data with the results of cluster analysis of "parking anomalies" to solve the problem of modeling parking space availability. Some researchers analyze the parking lots operated by private operators using a limited number of indicators: parking capacity, parking fee and access period [31]. Other researchers reduce the variety of indicators of parking spaces to the cost and the time spent searching for an available parking space: "From the view of microeconomics, the chosen car park is determined by the value of utility. Motorists will always choose the car park with the maximal utility value, which is related to the consumed time and the cost" [35]. A number of studies note that different cities will most likely face the same situation with parking spaces, namely high demand for parking between 8:00 and 18:00 on weekdays. Researchers can develop predictive models for urban areas (e.g. office buildings, restaurants, residential areas, etc.) for which parking occupancy data are available [10]. Summing up the review, it can be stated that the literature contains a gap in the use of cause-and-effect relations between the explanatory indicators of parking space and parking space occupancy for the subsequent modeling of the provision of parking spaces in the metropolis. In addition, the authors focus mainly on either a separate parking lot or the parking lots in one part of the city, without analyzing the parking lots in different districts of the city in their totality.

Field study and descriptive statistics
In this paper, we use data on parking lots in Moscow obtained during the field study and data collection by observation from December 26, 2018 through June 30, 2019. Two hundred and eighty faculty members, employees and students of Plekhanov Russian University of Economics participated in the observation [4]. The method of observation corresponded to the approaches described in the literature [16]. As a result, an array of data describing various indicators of parking lots in Moscow was obtained. For the subsequent processing, the following observation data on the parking space of Moscow were used: a list of parking lots with an indication of the city district for each parking lot, the parking ring reference, the number of parking spaces, the number of parked cars, the number of cars parked with violations, the position relative to the street and road network (roadside, SRN), the parking tariff and the method of placing (parking or storage). The data were processed with their distribution by districts of Moscow and by "ring references". The districts of Moscow include the following municipal administrative territories: 125 municipal districts, 19 settlements and 2 city districts. In our study, we used data on the districts that had been a part of Moscow before the administrative expansion in 2012 [1].
The ring reference is an indicator that determines the position of the parking lot relative to the ring roads dividing the city. The ring reference is used by the municipal authorities to set basic parking tariffs. There are 4 rings: Boulevard Ring (BR), Garden Ring (GR), Third Transport Ring (TTR) and Moscow Ring Road (MRR) [18]. The ring reference zones are shown in Table 1. The analysis of tariffs was based on the principle of their formation and not on their cost values: • Common, when the cost of parking does not depend on its duration; • Progressive, when the cost of parking increases with its duration [25]. In addition, the Department for Transport and Road Infrastructure Development of the Moscow City Government provided data on the number of cars registered in the districts of Moscow; these data are not available in the public domain. Parking occupancy (%), the ratio of the number of parked cars to the number of parking spaces, was calculated on the basis of initial data for each parking lot. The preliminary data processing consisted in calculating, by means of consolidated tables, the total absolute and averaged relative indicators for each district of Moscow, as well as additional indicators. As a result, a set of indicators was obtained for each district (Table 2).
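The preliminary processing described above can be illustrated with a minimal sketch. The records, district names and field order below are hypothetical and are not taken from the study data, and averaging the lot occupancies within a district is one possible reading of "averaged relative indicators", assumed here for illustration only.

```python
from collections import defaultdict

# Hypothetical observation records (not the study data):
# (district, parking spaces, parked cars, cars parked with violations)
observations = [
    ("District A", 40, 38, 2),
    ("District A", 25, 20, 0),
    ("District B", 60, 30, 1),
]


def lot_occupancy(spaces, parked):
    """Occupancy of a single lot in percent: parked cars / parking spaces * 100."""
    return 100.0 * parked / spaces


def district_summary(records):
    """Total absolute indicators and mean lot occupancy for each district."""
    grouped = defaultdict(list)
    for district, spaces, parked, violations in records:
        grouped[district].append((spaces, parked, violations))

    summary = {}
    for district, lots in grouped.items():
        occupancies = [lot_occupancy(s, p) for s, p, _ in lots]
        summary[district] = {
            "lots": len(lots),
            "spaces": sum(s for s, _, _ in lots),
            "violations": sum(v for _, _, v in lots),
            # Assumption: the district indicator is the mean of lot occupancies.
            "mean_occupancy": sum(occupancies) / len(occupancies),
        }
    return summary


for name, row in district_summary(observations).items():
    print(name, row)
```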

Modeling
Modeling of parking occupancy was carried out using a variety of statistical methods of multivariate modeling: • Regression analysis; • Logit regression; • Discriminant analysis; • "Classification trees". These modeling tools are multivariate statistical methods and make it possible to consider the impact of a set of indicators on the dependent variable [5]. Multiple regression analysis makes it possible to evaluate the relation between several explanatory variables (also called regressors or predictors) and the dependent variable. Regression analysis provides a mathematical description of a particular type of dependency and results in a model represented by the regression equation. The adequacy of the model to the initial data (model quality) is measured by the measure of agreement, the determination factor R², and by the characteristics of the residuals. Discriminant analysis and logit regression are predictive models in which the resulting measure (response) is categorical and the explanatory variables are measured on an interval (quantitative) scale. Discriminant analysis interprets the data area as consisting of separate sets, each of which is characterized by variables with a multivariate normal distribution. Logit regression evaluates the probability of occurrence of a certain event depending on the set of factors affecting it; this function is not linear but S-shaped. Classification trees are a heuristic method of examining many explanatory variables to determine the combinations of categories that yield the highest percentage of the desired response. If successful, the resulting tree shows which explanatory variable is most closely related to the target variable.
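The S-shaped dependence mentioned for logit regression can be illustrated with a short sketch of the standard logistic function, p = 1 / (1 + exp(-z)), where z is the linear combination of the explanatory variables. The values of z below are arbitrary illustration points, not coefficients or results from the study.

```python
import math


def logistic(z):
    """Map a linear predictor z to a probability p = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


# Equal steps in z produce unequal steps in p: small near the tails,
# largest around z = 0, which gives the S-shaped curve.
for z in (-4, -2, 0, 2, 4):
    print(f"z = {z:+d}  ->  p = {logistic(z):.3f}")
```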

RESULTS
Primary data analysis allows the solution of two problems: • To obtain descriptive statistics of the indicators of parking space available in Moscow; • To identify the cause-and-effect relations between the explanatory indicators of parking space and parking space occupancy.
Descriptive statistics of parking space indicators are presented in Table 3. The standard deviation shows the average deviation from the mean values, and the variation coefficient is the standard deviation expressed as a fraction of the mean; therefore, the variation coefficient should be used for the comparative assessment of the degree of variability of indicators between the districts (Figure 1). Moreover, the indicator with the lowest variation coefficient is parking lot occupancy (15%). Based on the occupancy rate of parking lots (y), an additional indicator of parking lot occupancy in the district (k) was introduced to characterize the average occupancy of parking lots, according to formula (1).
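As a minimal sketch of why the variation coefficient suits comparisons between indicators measured on different scales, the snippet below computes it for two hypothetical series that differ only by a factor of ten. The values are invented for illustration, and the sample standard deviation (n - 1 in the denominator) is assumed.

```python
import statistics


def variation_coefficient(values):
    """Sample standard deviation expressed as a fraction of the mean."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical district-level values of two indicators on different scales.
occupancy_percent = [80, 90, 100]
parking_spaces = [800, 900, 1000]

# The standard deviations differ tenfold, the variation coefficients coincide.
print(statistics.stdev(occupancy_percent), statistics.stdev(parking_spaces))
print(variation_coefficient(occupancy_percent), variation_coefficient(parking_spaces))
```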

The share of districts with over-occupied parking lots amounted to 52%. The cause-and-effect relations between the explanatory indicators of parking space and parking occupancy were assessed using correlation analysis and scatter plots. The correlation analysis revealed a weak statistical linear relation between parking occupancy and both the number of parking lots with violations and the share of parking lots with violations (pair correlation factors were r=0.37 and r=0.42, respectively). No statistically significant correlation with any other indicator was found. In this regard, a repeated correlation analysis with squared values of the variables was carried out, based on the assumption of the parabolic nature of the influence of the indicators on parking lot occupancy. As a result, statistically significant relations between the parking lot occupancy and several of the indicators were revealed. A comparison of average values was used to assess the relation between the indicator of parking lot occupancy in the district (k) and other factors; the statistical significance of differences in the mean values was assessed using the t-criterion. Differences between the average values at a significance level of less than 5% were found for the indicators listed in Table 4. Thus, in the districts with over-occupied parking lots, the number and the share of parking lots with violations are on average more than 2 times higher than in the districts without over-occupied parking lots, and the number of paid parking lots is over 10 times lower. It is noteworthy that there are no over-occupied parking lots between the Boulevard Ring and the Garden Ring. The number of parking lots in SRN in the districts with over-occupied parking lots is on average more than 2 times lower than in the districts without over-occupied parking lots.
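The reason for repeating the correlation analysis with squared variables can be shown with a small synthetic sketch: when the dependence is parabolic, the linear pair correlation can be close to zero even though the relation is strong. The data below are artificial and centered on zero purely for clarity; they are not the study data.

```python
import math


def pearson_r(xs, ys):
    """Pearson pair correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Synthetic parabolic dependence: y = 90 - x^2.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [90 - v ** 2 for v in x]
x_squared = [v ** 2 for v in x]

print("r(x, y)   =", pearson_r(x, y))          # linear relation is not detected
print("r(x^2, y) =", pearson_r(x_squared, y))  # squared variable reveals it
```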

Modeling of parking occupancy
Model 1-Multiple Linear Regression. Multiple regression analysis was used to develop a model of the dependence of parking occupancy (y) on other indicators. The preliminary study revealed the nonlinear nature of the relation of some variables with the parking lot occupancy, which was taken into account in model development. Initially, the following explanatory variables were included: x1 is the number of cars registered in the district, thousand cars; x2 is parking, thousand spaces; x3 is storage, thousand spaces; x5 is the share of parking lots with violations; x8 is the number of paid parking lots; x9 is the number of parking lots between the Boulevard Ring and the Garden Ring; x10 is the number of parking lots between the Garden Ring and TTR; x11 is the number of parking lots between TTR and MRR; x12 is the number of parking lots outside MRR; x13 is the number of parking lots in SRN.
A model including only the variables significant at a significance level of less than 5% was obtained after the regression analysis (2). This model explains only 40% of the variance of the input data (based on the value of the determination factor), since the parking occupancy (y) is also affected by parameters other than those included in the model. However, the model can be interpreted: • The increment of the parking occupancy caused by a change in the share of parking lots with violations, with the other parameters of the model unchanged, is nonlinear and is shown in Figure 3. The extremum falls on a 2.6% share of parking lots with violations: apparently, below 2.6% the direct impact of parking lots with violations on the parking lot occupancy is due to the lack of parking spaces, and above 2.6% the opposite effect may occur due to the high cost of parking, which forces drivers to look for ways of avoiding payment; • An increase in the number of paid parking lots (x8) by 100 units is associated with an average 5% increase in parking occupancy, which might be explained by the usual location of paid parking lots in places with the greatest demand for parking; • An increase in the number of parking lots between TTR and MRR (x11) and outside MRR (x12) by 100 leads to an average increase in the parking occupancy by 1.4% for each location; apparently, this is because bedroom communities are concentrated in these locations; • The increment of parking occupancy caused by a change in the number of parking lots in SRN (x13), with the other parameters of the model unchanged, is nonlinear and is shown in Figure 4; the inverse relation may be explained by the fact that parking lots in SRN are more convenient for parking, more visible and more accessible for drivers than the parking lots outside SRN.
Model 2-Logit Regression. Logit regression, which evaluates the probability of occurrence of an event depending on the set of factors affecting it, was applied to obtain a model of higher quality (with better predictive abilities than the 1st model). The probability is described by an S-curve.
The indicator of parking occupancy in the district (k) calculated by formula 1 was used as the dependent variable. Initially, the following explanatory variables were included: x1 is the number of cars registered in the district (B category), thousand cars; x2 is parking, thousand spaces; x3 is storage, thousand spaces; x5 is the share of parking lots with violations; x8 is the number of paid parking lots; x9 is the number of parking lots between the Boulevard Ring and the Garden Ring; x10 is the number of parking lots between the Garden Ring and TTR; x11 is the number of parking lots between TTR and MRR; x12 is the number of parking lots outside MRR; x13 is the number of parking lots in SRN.
A model including only the variables significant at a significance level of less than 5% was obtained after the logit regression analysis (3), where p is the probability of parking over-occupancy. This model is nonlinear in all explanatory variables and explains only 52% of the variance of the initial data (based on the Nagelkerke R² value). The proportion of correctly predicted values of the indicator of parking occupancy in the district (k) is 80% for "Not over-occupied parking lots" and 80% for "Over-occupied parking lots", which characterizes the quality of the model as satisfactory. The interpretation of the model is as follows: • The increment of the indicator z caused by a change in the share of parking lots with violations, with the other parameters of the model unchanged, is nonlinear and, as in model 1, the extremum falls on a 2.6% share of parking lots with violations (Figure 5); apparently, below 2.6% the direct impact of parking lots with violations on the probability of parking over-occupancy (p) may be explained by the lack of parking spaces, and above 2.6% the opposite effect may occur due to the high cost of parking, which forces drivers to look for ways of avoiding payment; • An increase in the number of paid parking lots (x8) by 100 units leads to an average increase of z by 0.43 and increases the probability of parking over-occupancy (p), which might be explained by the usual location of paid parking lots in places with the greatest demand for parking; • An increase in the number of parking lots between TTR and MRR (x11) by 100 leads to an average increase of z by 0.06 and increases the probability of parking over-occupancy (p); apparently, this is caused by the high concentration of bedroom communities in this location; • An increase in the number of parking lots in SRN (x13) by 100 leads to an average decrease of z by 0.8 and reduces the probability of parking over-occupancy (p); the inverse relation may be explained by the fact that parking lots in SRN are more convenient for parking, more visible and more accessible for drivers than the parking lots outside SRN.
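The changes in z listed above can be translated into changes of the odds p / (1 - p): in any logit model, adding dz to z multiplies the odds by exp(dz). The sketch below applies this general property to the three increments quoted in the text; the baseline probability of 0.5 is a hypothetical starting point chosen only for illustration, not a value from the study.

```python
import math

# Increments of z per 100 additional units of each predictor, as quoted in the text.
delta_z = {
    "x8, paid parking lots": 0.43,
    "x11, parking lots between TTR and MRR": 0.06,
    "x13, parking lots in SRN": -0.8,
}


def odds_multiplier(dz):
    """Factor by which the odds p / (1 - p) change when z changes by dz."""
    return math.exp(dz)


def shifted_probability(p, dz):
    """Probability after z changes by dz, starting from probability p."""
    odds = p / (1.0 - p) * math.exp(dz)
    return odds / (1.0 + odds)


baseline_p = 0.5  # hypothetical baseline probability of over-occupancy
for name, dz in delta_z.items():
    print(f"{name}: odds x {odds_multiplier(dz):.2f}, "
          f"p {baseline_p:.2f} -> {shifted_probability(baseline_p, dz):.2f}")
```

Because the curve is S-shaped, the same increment of z changes p less when the baseline probability is close to 0 or 1.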
It is easy to notice that the interpretations of the 1st and 2nd models are similar, which confirms the validity of the conclusions: • Multidirectional influence of the share of cars parked with violations on the parking lot occupancy; • Close direct connection between the number of paid parking lots and the parking lot occupancy; • Direct impact of the number of parking lots outside TTR on the parking lot occupancy; • Inverse influence of the number of parking lots in SRN on the parking lot occupancy; • Presence of significant additional factors affecting the parking lot occupancy that are not included in the model.
Model 3-Discriminant Analysis. Stepwise discriminant analysis was used as an alternative to the logit regression model to develop a predictive model that determines the category a district belongs to based on the average parking lot occupancy (k), that is, whether its parking lots are over-occupied or not. The indicator of parking occupancy in the district (k) calculated by formula 1 was used as the dependent variable. Initially, the following explanatory variables were included: x1 is the number of cars registered in the district, thousand cars; x2 is parking, thousand spaces; x3 is storage, thousand spaces; x5 is the share of parking lots with violations; x8 is the number of paid parking lots; x9 is the number of parking lots between the Boulevard Ring and the Garden Ring; x10 is the number of parking lots between the Garden Ring and TTR; x11 is the number of parking lots between TTR and MRR; x12 is the number of parking lots outside MRR; x13 is the number of parking lots in SRN. The discriminant analysis yielded a model consisting of 2 Fisher discriminant functions (one for each of the 2 categories of parking occupancy: not over-occupied and over-occupied), which includes only the variables significant at a significance level of less than 5% (4, 5).
Model 4-Classification Trees. The indicator of parking occupancy in the district (k), calculated by formula 1, was used as the target (dependent) variable. Initially, the following explanatory variables were included: x1 is the number of cars registered in the district (B category), thousand cars; x2 is parking, thousand spaces; x3 is storage, thousand spaces; x5 is the share of parking lots with violations; x8 is the number of paid parking lots; x9 is the number of parking lots between the Boulevard Ring and the Garden Ring; x10 is the number of parking lots between the Garden Ring and TTR; x11 is the number of parking lots between TTR and MRR; x12 is the number of parking lots outside MRR; x13 is the number of parking lots in SRN. The tree was constructed using the CHAID method, which is based on Pearson's Chi-squared test and identifies the explanatory variables that have the greatest impact on the change of the target variable. The Bonferroni adjustment was not applied; the minimum number of observations was set to 10 in terminal nodes and to 20 in parent nodes. The classification results are presented in Tables 5 and 6.
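The criterion underlying CHAID can be sketched as the Pearson Chi-squared statistic of a contingency table that crosses a candidate split of an explanatory variable with the target category. The sketch below computes only this statistic; a complete CHAID procedure additionally merges categories and compares p-values, which is omitted here. Both tables of counts are hypothetical.

```python
def chi_squared(table):
    """Pearson Chi-squared statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of the split and the target.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical counts of districts: rows are the two sides of a candidate split,
# columns are "not over-occupied" and "over-occupied".
uninformative_split = [[10, 10], [10, 10]]
informative_split = [[18, 2], [3, 17]]

print(chi_squared(uninformative_split))  # no association with the target
print(chi_squared(informative_split))    # larger value, stronger association
```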
In the gain tables, the nodes of the "classification tree" are arranged in descending order of the share of observations corresponding to the target category (response).
The first column contains the node number in the "classification tree"; the second and the third columns contain the number and the share of observations in the node (in our case, Moscow districts); the next two columns contain the number and the share of observations in the node that correspond to the target category (gain) across all nodes. Response share is the share of observations in the node that correspond to the target category. Index is the ratio of the response share for the node to that for the entire sample (the latter is represented in the root (zero) node). Thus, the nodes with an index significantly greater than 100% are of interest for each target category. Let us consider the most significant nodes for each target category. Target Category: Not over-occupied parking lots. The following nodes are of interest: No. 9 includes 17 districts (13.7% of the total number of districts), in 16 of which parking lots are not over-occupied (94.1%), which is 1.98 times higher than the average in the sample; characteristics: parking lots in SRN, over 202; the share of parking lots with violations, no more than 1.18%; No. 8 includes 26 districts (21% of the total number of districts), in 17 of which parking lots are not over-occupied (65.4%), which is 1.37 times higher than the average in the sample; characteristics: parking lots in SRN, from 130 to 202; the share of parking lots with violations, more than 1.18%. Target Category: Over-occupied parking lots. The following nodes are of interest: No. 5 includes 18 districts (14.5% of the total number of districts), in all of which parking lots are over-occupied (100%), which is 1.91 times higher than the sample average; characteristics: parking lots in SRN, less than 130; the share of parking lots with violations, more than 1.18%; No. 4 includes 19 districts (15.3% of the total number of districts), in 15 of which parking lots are over-occupied (78.9%), which is 1.51 times higher than the sample average; characteristics: parking lots in SRN, less than 130; the share of parking lots with violations, from 0.327% to 1.18%.
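The response share and index columns of the gain tables can be recomputed from the node counts quoted above. In the sketch below, the sample totals (124 districts, of which 65 have over-occupied parking lots and 59 do not) are an assumption inferred from the reported percentages (for example, 17 districts being 13.7% of the sample and 52% of districts being over-occupied); they are not stated explicitly in the text.

```python
def gain_row(node_size, node_hits, total_size, total_hits):
    """Return (response share, index) of a node for one target category."""
    response = node_hits / node_size   # share of the target category in the node
    baseline = total_hits / total_size  # share of the target category in the sample
    return response, response / baseline


# Assumed sample totals, inferred from the reported percentages.
TOTAL, OVER, NOT_OVER = 124, 65, 59

# (node, districts in node, districts in target category, target total)
nodes = [
    ("No. 9, not over-occupied", 17, 16, NOT_OVER),
    ("No. 8, not over-occupied", 26, 17, NOT_OVER),
    ("No. 5, over-occupied", 18, 18, OVER),
    ("No. 4, over-occupied", 19, 15, OVER),
]

for name, size, hits, target_total in nodes:
    response, index = gain_row(size, hits, TOTAL, target_total)
    print(f"{name}: response {response:.1%}, index {index:.2f}")
```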

DISCUSSION AND CONCLUSIONS
First, it is necessary to explain the selection of the modeling methodology. The literature review demonstrates that there are two approaches to modeling based on the queueing theory (QT) and the multivariate statistical methods. Having analyzed the literature, we believe that QT with consideration of the methodology, is applicable to an individual parking lot, as it was described in the literature [11,12]. In addition, with certain assumptions, QT may be used for clusters of parking lots or local urban   [32], which are considered as a multi-channel queuing system in the event when drivers in the process of searching for a parking space consider the parking lots included in the cluster as acceptable parking places. First, it is necessary to explain the selection of the modeling methodology. The literature review demonstrates that there are two approaches to modeling based on the queueing theory (QT) and the multivariate statistical methods. Having analyzed the literature, we believe that QT with consideration of the methodology is applicable to an individual parking lot, as it was described in the literature [11,12]. In addition, with certain assumptions, QT may be used only for clusters of individual parking lots or local urban spaces, as seen in the work Yang et al. [32]. This assumptions dictating necessity of adapting of QT to multi-channel queuing systems in the event that drivers in the process of searching for a parking space consider the parking lots included in the cluster as acceptable places to park. The QT can be used (as it is described in the literature and from our observations) for individual parking lots only and for urban spaces.
With some assumptions QT can be transferred to the multi-channel queuing systems for the individual parking lots, when drivers in the process of searching for a parking space consider the parking lots included in the cluster as acceptable places to park. In cases where the whole system of urban parking lots is considered, we believe it is more correct to use the methodology of statistical methods of multivariate modeling, as proposed in the literature [5]. This approach makes it possible to develop models based on sample data that describe the infl uence of a set of factors on the dependent variable or on the probability of a certain event.
The multivariate statistical methods make it possible to assess the adequacy of models and to include statistically significant factors in the model. Such tasks were fundamental for this study. In this paper, we do not describe the procedure for collecting the initial data, which was based on the observation of parking lots in Moscow. Various methods and models were used in processing the study findings.
The dependence of the parking occupancy indicator on the predictors was verified to confirm hypothesis 1. The cause-and-effect relations between the explanatory indicators of parking space and parking occupancy were assessed using correlation analysis and scatter plots.
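The pair correlation coefficient used in this kind of analysis can be sketched as follows. The study itself was carried out in IBM SPSS Statistics; the pure-Python function below is only an illustrative equivalent of the Pearson coefficient, and the district values in the usage example are invented placeholders, not data from the study.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson pair correlation coefficient between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("samples must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical districts: share of parking lots with violations
    # and parking occupancy (both as fractions).
    violation_share = [0.02, 0.10, 0.15, 0.30, 0.45]
    occupancy = [0.55, 0.90, 0.80, 0.95, 0.85]
    print(round(pearson_r(violation_share, occupancy), 2))
```

A coefficient well below 1 for data of this shape would be consistent with a relationship that rises sharply at first and then flattens, which is why the scatter plot is examined alongside the coefficient.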
The correlation analysis revealed a weak linear statistical relationship between parking occupancy and parking lots with violations (the pair correlation coefficient was r = 0.37). In addition, during the initial analysis of the collected data, descriptive statistics were calculated for each district. In contrast to the study conducted by Ionita et al. [10], the cost of fuel was not considered, but the parameter of cars parked with violations was included. In addition, we used data on all parking lots, including the "unorganized/informal" ones. The analysis revealed an interesting pattern: the number and proportion of parking violations were on average 2 times higher in the districts with over-occupied parking lots than in the districts where parking occupancy was not so high, and the number of paid parking lots was over 10 times lower. In addition, according to the diagram, an increase in the proportion of parking spaces with violations from 0 to 0.2% entails a sharp increase in parking lot occupancy (up to 90%), while a further increase in the proportion of parking lots with violations does not entail a significant increase in parking occupancy.
Four multivariate statistical approaches were analyzed to verify the hypothesis that multivariate modeling can be applied to create a model of parking adequacy in the metropolis on the basis of the cause-and-effect relations between the explanatory indicators of parking spaces and parking occupancy (y). The results of the comparative evaluation of the models are presented in Table 7.
Table 7: Comparative evaluation of models for determining the demand for parking spaces
Based on the comparative evaluation of the models, the following conclusions can be drawn:
1. All models share the following explanatory variables that affect parking occupancy:
• The share of parking lots with violations; and
• The number of parking lots in the street and road network (SRN).
2. The most interesting models in terms of quality and efficiency are Nos. 2 and 4 (Table 7), in which the predicted indicator is the fact of parking lot over-occupancy in the district:
• Logit regression; and
• "Classification tree".
Thus, we have identified two models that give similar results. The difference is that logit regression calculates the probability of parking over-occupancy for each district, whereas a "classification tree" assigns each district to a certain group (node), which represents the overall probability of parking over-occupancy for the districts in it. The obtained result confirms hypothesis 2.
The obtained models can be used in "Smart City" systems [27,28]. At present, several cities in the world have begun to monitor parking areas to provide drivers with information about free space availability. But, as the researchers note, "however, this approach is limited by the high costs of sensors that need to be installed throughout the city" [10]. Such sensor systems do not sufficiently cover the various districts and parking lots even in highly developed cities. In addition, a sensor-based system does not allow predicting the development of parking space in the city and operates only on the available data. Our approach makes it possible to integrate models of the dependence of parking occupancy on other indicators into smart technologies and to use them irrespective of sensor availability. We are considering a further study on integrating the model with the data flow received from city video cameras, in order to take into account temporary parking of a motor vehicle in a particular district based on the time it stayed in the city.
We suggest focusing on data that specify the vehicle license plate number and the time interval in which it was registered by cameras in different parts of the district or the city as a whole.
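The difference between the two selected models, logit regression and the "classification tree", can be illustrated with a minimal sketch of how each would score a district from the two explanatory variables. All coefficients, split thresholds and node probabilities below are hypothetical placeholders chosen for illustration; they are not the values fitted in the study, and the function names are assumptions.

```python
import math

# Hypothetical logit coefficients (NOT the fitted values from the study).
INTERCEPT = -1.0
COEF_VIOLATION_SHARE = 4.0
COEF_SRN_LOTS = 0.002


def logit_probability(violation_share: float, srn_lots: float) -> float:
    """Probability of over-occupancy for one district (logistic function)."""
    z = (INTERCEPT
         + COEF_VIOLATION_SHARE * violation_share
         + COEF_SRN_LOTS * srn_lots)
    return 1.0 / (1.0 + math.exp(-z))


def tree_node(violation_share: float, srn_lots: float) -> tuple[str, float]:
    """Assign a district to a terminal node with a shared probability.

    Split thresholds and node probabilities are hypothetical.
    """
    if violation_share <= 0.10:
        return ("node_1", 0.20)
    if srn_lots <= 300:
        return ("node_2", 0.55)
    return ("node_3", 0.85)


if __name__ == "__main__":
    # Two hypothetical districts that fall into the same tree node.
    for share, lots in [(0.20, 400), (0.40, 900)]:
        print(logit_probability(share, lots), tree_node(share, lots))
```

In this sketch both districts land in the same node and therefore share one probability, while the logit function returns a separate probability for each, which mirrors the distinction drawn between the two models above.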