Generation of a Flood Susceptibility Map of Evenly Weighted Conditioning Factors for Hungary

Over the past decades, several high-intensity storms in the mountainous, hilly and/or urban areas of Hungary were followed by severe flash flooding and other hydrologic consequences. The overall aim of this paper was to upgrade the national flash flood susceptibility map of Hungary first published by Czigány et al. (2011). One elementary watershed level (FFSIws) and three settlement level flash flood susceptibility maps (FFSIs) were constructed using 13 environmental factors that influence flash flood generation. The FFSI maps were verified against 2,677 documented flash flood events. In total, 5,458 watersheds were delineated. Almost exactly 10% of all delineated watersheds fell into the category of extreme susceptibility. While the class distribution of the mean-based FFSIs was quasi-Gaussian with very low percentages in the low and extreme categories, the maximum-based FFSIs overemphasized the proportion of settlements of high and extreme susceptibility. These two categories combined accounted for more than 50% of all settlements. The highest accuracy, 59.02% for class 5 (highest susceptibility), was found for the majority-based FFSIs. The current map improves on the former one through (i) a higher number of conditioning factors considered, (ii) higher resolution, (iii) being settlement-based and (iv) a higher number of events used for verification.


Introduction
Over the past decades many hydrological and hydraulic engineering analyses have focused on the assessment of the socio-economic consequences of flash floods (Georgakakos, 1986; Lóczy et al., 2012). In Hungary these studies have mainly sought to assess flood hazards in the floodplains of large rivers, primarily the Danube and its largest left-bank tributary, the Tisza, whereas Hungarian water management policy has paid less attention to mountainous and hilly areas (Lóczy, 2010).
A flash flood is commonly caused by heavy or excessive rainfall within a short period of time, generally interpreted in the US as less than 6 hours following the onset of the rainfall event (Elkhrachy, 2015; Georgakakos, 1987). In the UK, the concentration time for flash floods is less than 3 hours, which is in the range of nowcasting lead times (Collier, 2007). Others, like Schwartz and Dingle (1980), adopted the term "hybrid floods" for events with a lead time of 12 hours following the causative event. Flash floods are characterized by extreme flow uncertainty, which cannot be ignored in a reasonable estimation of flood risk or in the reliable mitigation of the hazard. Localization of flash flood hot spots is of paramount importance to prevent or mitigate losses triggered by flash floods. Today, preliminary rapid screening of flash flood-prone localities is commonly done by GIS (Al-Juaidi et al., 2018; Stathopoulos et al., 2017). This approach is the susceptibility mapping of the parameters that influence the magnitude of runoff, in other words the partitioning of rainfall into infiltration and runoff (Saleh et al., 2020). Rainfall-runoff models are integrated systems for assessing the possible impacts of severe flood events (Gioti et al., 2013).
Susceptibility, or flood potential index (FPI), is defined as the probability that a risk occurs in a particular area at an undetermined date (Santangelo et al., 2011). Susceptibility mapping (a kind of potential natural hazard mapping) is usually based on the comparison of certain conditioning factors with the distribution of previous events, the latter being used for model validation. In this sense, it is the degree to which an area can be affected by future flood hazards, i.e., an estimate of the location of future events. On the other hand, susceptibility does not consider the temporal probability of the event, i.e., when or how frequently the hazardous events may occur. Nonetheless, mapping the most susceptible locations helps us understand flood trends and can aid appropriate planning and flood prevention (Tehrany, 2014).
In contrast to susceptibility, physical vulnerability (flood risk) assessment implies the identification of the elements at risk and is commonly interpreted as the impact of natural disasters on physical, manmade structures (Aleotti & Chowdhury, 1999; Arrighi et al., 2020; Compton et al., 2013; Karagiorgos et al., 2016). In other words, vulnerability can be defined as a functional relationship between the magnitude of loss and the corresponding process intensity causing the damage (Fuchs et al., 2007; Fuchs et al., 2011; Khajehei et al., 2020; Totschnig et al., 2011).
A large number of techniques are available today for the susceptibility mapping of flash floods: traditional, empirical methods and various machine learning based methods. Statistical, rule-based and automated modelling approaches commonly outperform conventional flood models due to their suitability for hazard analyses (Tehrany et al., 2019). Numerous methods have coupled empirical models with Geographical Information Systems (GIS) for the purpose of flood susceptibility modelling (Saleh et al., 2020). Several papers focused on the impact of morphometric properties on flash flood susceptibility (Apaydin et al., 2006; Biswas, 2016; Fábián et al., 2016). Other papers quantified the impact of various conditioning factors, e.g., topography, land use and soil hydraulic properties, on the partitioning of rainfall into runoff and infiltration. The conditioning factors assumed by Abedi et al. (2021) to explain flash floods were slope inclination and aspect, land use/land cover, hydrological soil type, lithology, topographic wetness index (TWI), topographic position index, profile curvature, convergence index and stream power index. Youssef and Hegab (2019) used only 7 flood factors: distance from streams, slope, curvature, lithological units, angle, elevation, and TWI. Nevertheless, their results showed that the Analytical Hierarchical Process (AHP) provided a good estimation for flash flood susceptibility (83.3%). Tehrany et al. (2013) and Borga et al. (2014) combined bivariate probability and logistic regression for flood susceptibility analysis in Kelantan State of Malaysia.
Whereas warning systems for large riverine flooding are well applied all around the world, flash floods still represent prediction and detection challenges due to the large spatial heterogeneity of the influencing factors.
Although flash flood guidance (FFG) systems have been in operation since the 1970s in the United States and in many other countries across the globe, their prediction accuracy still lags behind that achieved for large riverine floods (Georgakakos, 2006; Norbiato et al., 2009). In Europe, where flash floods are also common (Gaume et al., 2009), there are also numerous efforts to implement FFG (for instance, in the Black Sea and Middle East regions). Although FFG provides a useful concept that simplifies communication between hydrologists and meteorologists and promotes mitigation, it does not predict flash flood timing (Collier, 2007).
Although the number of studies varies greatly among countries, flash floods have been widely studied by researchers in different countries and continents. The United States has carried out the most research in this field with 175 documents, followed by India with 77 documents and Italy with 71 documents (Aronica et al., 2012; De Marchi et al., 2012; Forte et al., 2005; Heredia-Calderon et al., 1999; Miglietta et al., 2008). China ranked fourth (70 documents), and France ranked fifth, with 64 published documents (Yang et al., 2022).
To assess the flash flood hazard in Hungary, especially following the floods in May and June 2010, a flash flood susceptibility index (FFSI) map was elaborated for Hungary in a collaboration between the General Directorate of Water Management and the University of Pécs (Czigány et al., 2011). However, in the wake of climate change and the increasing weather extremes in the headwaters of the hilly and mountainous areas of Hungary, an upgrade of the first national FFSI map became necessary. With the increased availability of environmental data and the increased GIS computation capacity, an improved version of the FFSI map could be produced jointly by the two research institutes in 2021.
The overall aims of this paper were (i) to develop a national flash flood susceptibility map and (ii) to illustrate the susceptibility conditions to flash floods in the hilly and mountainous areas of Hungary. Specifically, we aimed to evaluate the flash flood susceptibility of Hungary by analysing a total of 13 topographical, hydrological, geological, pedological and land use parameters by means of GIS. The spatial goodness of the map was verified with reported and documented hydrological damages related to intense rainfalls and flash floods.

Data acquisition and processing
All derived topographic parameters and the delineation of watersheds were based on the 10-meter resolution DEM of Hungary. The land use model was generated using the CLC-50 and CLC-2012 and the Artificial Surfaces 2012 databases.
The Hungarian Stream network spatial database of VARGEO, provided by the Hungarian Water Directorate, was applied for stream network analyses (density, bottlenecks and confluences) and the generation of watersheds.
Lithological data were obtained from the 1:100,000 scale geological database of the Mining and Geological Survey of Hungary. Soil data were obtained from the AGROTOPO (1:100,000 scale) and DoSoReMi (one-hectare resolution) soil databases, both developed by the Research Institute of Soil Sciences and Agricultural Chemistry (TAKI).
Two rainfall datasets were used: firstly, the interpolated and gridded dataset of 0.1° resolution of the Hungarian Meteorological Service (OMSZ); secondly, rainfall data collected by an automated rain gage network over the period of 2013 to 2020.
All geospatial data were processed in ArcMap 10.7.1, ArcGIS Pro 2.8.0 and SAGA GIS software environments.

Delineation of areas of flash flood potential
Hilly and mountainous areas were delineated by calculating (a) range, (b) slope variety and (c) slope majority. The three parameters were then clustered to differentiate watersheds of various topographic characters using the K-means clustering algorithm of the Multivariate Clustering model of ArcGIS Pro (Figure 1).
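The clustering step can be illustrated with a minimal K-means sketch in plain Python. This is not the Multivariate Clustering tool of ArcGIS Pro; the function name `kmeans`, the sample feature values and the choice of two clusters are assumptions made purely for illustration.

```python
import math
import random


def kmeans(points, k, iterations=50, seed=0):
    """Plain K-means on small feature tuples; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(v) / len(members) for v in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # leave an empty cluster in place
        if new_centroids == centroids:
            break  # converged: assignments no longer change the centroids
        centroids = new_centroids
    return labels, centroids


# Hypothetical watersheds: (range in m, slope variety, slope majority in degrees).
# In a real workflow the features would normally be standardized first.
watersheds = [
    (12.0, 3.0, 1.0), (15.0, 4.0, 1.0), (10.0, 2.0, 0.0),            # plain-like
    (310.0, 28.0, 14.0), (280.0, 25.0, 12.0), (350.0, 30.0, 16.0),   # hilly
]
labels, centroids = kmeans(watersheds, k=2)
```

With such well-separated sample values the three plain-like and the three hilly watersheds end up in different clusters; real data would need feature scaling and a justified number of clusters.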
The primary delineation was enhanced by defining valley widths at a height of 5 meters above the centerline of the channel by using the Vertical Distance to Channel Network function. The maximum search radius was set at 5 km. An area was considered a plain when no valley relief of at least 5 meters was found within the search radius of 5 km. Areas prone to inland excess water were excluded from the delineation of hilly and mountainous areas.
By unifying methods 1 and 2, polygons of areas potentially affected by flash floods were generated (Figure 1). For the optimization of calculation capacity, 5 areas were delineated. The total area of the potentially flash flood-prone areas was 32,759 km², covering 35% of the entire land area of Hungary.

Delineation of watersheds
Watersheds were delineated in a cumulative way, identifying watersheds of about 3 to 8 km², with a mean of 6 km², as the minimum unit, which is a typical size of the supercells of convective events that are primarily responsible for flash flood generation. The number of unit watersheds and streams totalled 5,458 and 3,103, respectively, while the total length of streams was 18,561 km.

Generation of the watershed based FFSI
Elementary watersheds were delineated using the Hydrology tools (extension) of ArcGIS/ArcMAP 10.5.1. A total of 13 conditioning factors, derived from the topographical, channel and flow properties, land use, pedological, geological and meteorological datasets, were used for creating a watershed-based flash flood susceptibility map (FFSIws) of Hungary (Table 1).
The formation of flash floods is affected by several active (meteorological) and passive (morphological) parameters. The parameters of the FFSIws (listed in Table 1) were selected and analyzed in terms of their impact on flash flood formation (see Table 1, influence on susceptibility). Only one active parameter was included in our model: the annual average number of days with extreme precipitation (≥ 30 mm).
The passive factors were divided into two categories: (i) catchment characteristics (such as surface cover, lithological properties, slope variety and rate of change in range); (ii) river basin characteristics (e.g., river density, confluence zones, bottleneck effect). All factors were ranked on a scale of 1 to 5 at watershed level (Fig. 1); the higher the value, the higher the susceptibility. To calculate FFSIws, the ranks (1 to 5) of all 13 factors were summed, so that each factor counted with equal weight and the potential maximum value was 65. The influence of the non-dynamic factors on flash flood susceptibility is listed below (Fig. 1 and Table 1).
• Valley density: valley density is not identical with river density, as intermittent valleys may act as linear conveyor paths facilitating water accumulation.
• Soil type: Clay and sand textured areas were calculated in all catchments. Clay and sand control runoff and infiltration in opposite ways, i.e., clay promotes runoff, whereas sand enhances infiltration. This parameter accounts for two conditioning factors, namely (i) clay and (ii) sand percentage.
• Specific relief: the topographic conditions of the catchment, e.g., the height differences (relative relief), control the time of concentration and flow velocity. However, catchment areas with large relative relief are not necessarily hazardous. If the accumulating water does not reach an inflection point in the area, the catchment cannot be classified as more dangerous than a catchment of lower relief. Areas that are "only" steep and have no inflection points will only conduct water through, and this parameter will not indicate a hazard. In the model, we calculated the difference between the highest and lowest points of the catchment and divided it by the catchment area.
• Slope variety: it demonstrates the ruggedness of the watershed, indicates the number of possible inflection points which dissect the terrain, and shows how many possible locations there are in the catchment where the accumulation process slows down.
• Bottleneck effect: in the case of heavy rainfall the valley bottlenecks impound runoff. On both sides of the river the valley height (up to 5 m) was examined to a horizontal distance of 5 km. Sections where the degree of constriction exceeded 50% in the flow direction compared to the previous point were selected, and then their number in each catchment was counted and normalized to the catchment area.
• Confinement: valley morphology and asymmetry were calculated to account for the blocking effect in the case of flooding. If the deviation from the center line exceeded the 50% asymmetry value, i.e., the watercourse swings left and right towards the valley sides, it was classified as a risk factor if the watercourse was closer than fifty meters to the valley edge on either side. The points in the catchment that meet this criterion were counted and normalized to the catchment area.
• Drainage density: in connection with the valley density, the drainage densities of the catchments were also included in the analysis. Lowland sections, artificial water networks, as well as watercourses shorter than 1 km were removed from the dataset.
• Number of confluence points: we determined the confluence points (at least 2) within a distance of 500 meters, then we calculated how many confluence points are found in a given catchment area.
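The bottleneck criterion from the list above can be sketched as follows. The exact definition of the "degree of constriction" is not given in the text, so the relative narrowing `(previous - current) / previous` used here is an assumption, as are the function names and the sample valley widths.

```python
def count_bottlenecks(valley_widths_m, threshold=0.5):
    """Count points where the valley narrows by more than `threshold`
    relative to the previous point along the flow direction."""
    count = 0
    for previous, current in zip(valley_widths_m, valley_widths_m[1:]):
        # Assumed constriction measure: relative loss of valley width.
        if previous > 0 and (previous - current) / previous > threshold:
            count += 1
    return count


def bottleneck_density(valley_widths_m, catchment_area_km2):
    """Number of bottlenecks normalized to the catchment area (per km2)."""
    return count_bottlenecks(valley_widths_m) / catchment_area_km2


# Hypothetical valley widths (m) sampled downstream along one channel.
widths = [400.0, 380.0, 150.0, 140.0, 60.0, 65.0]
n_bottlenecks = count_bottlenecks(widths)  # 380 -> 150 and 140 -> 60 qualify
density = bottleneck_density(widths, catchment_area_km2=6.0)
```

Normalizing by area keeps large catchments from scoring higher simply because they contain more channel length.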
Except for the forested, paved and karstic areas (%) and the clay and sand percentages, all other eight factors were normalized and averaged to the catchment area.
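The equal-weight summation of the factor ranks can be written in a few lines. The sketch below assumes the 13 ranks are already available per watershed; the function name and the example ranks are hypothetical.

```python
FACTOR_COUNT = 13
MIN_RANK, MAX_RANK = 1, 5


def ffsi_equal_weight(ranks):
    """Sum the 13 factor ranks (each 1 to 5) with equal weight."""
    if len(ranks) != FACTOR_COUNT:
        raise ValueError("expected 13 factor ranks")
    if any(not MIN_RANK <= r <= MAX_RANK for r in ranks):
        raise ValueError("each rank must lie between 1 and 5")
    return sum(ranks)


# Hypothetical ranks of one watershed for the 13 conditioning factors.
example_ranks = [3, 4, 2, 5, 1, 3, 3, 4, 2, 5, 4, 3, 2]
score = ffsi_equal_weight(example_ranks)
```

Under this scheme the index ranges from 13 (all factors ranked 1) to 65 (all factors ranked 5).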

Generation of FFSIs maps
To obtain the settlement level susceptibility indices (FFSIs), the feature polygons (watersheds) of each of the 13 conditioning factors were rasterized at a resolution of 100 × 100 m. FFSIs maps were calculated from the rasterized grid network by extracting the mean, maximum and majority raster values for the overlapping settlements using the Zonal Statistics function of ArcGIS. Each settlement was underlain by more than one watershed, hence calculating the majority was also an option. FFSIs was then calculated by summing the 13 values, each ranked on a scale of 1 to 5, with equal weight, giving a potential maximum of 65 points. Ranking was calculated using the Geometric Intervals classification of ArcGIS, which is commonly used for datasets of non-normal distribution and produces visually appealing and cartographically comprehensible results.
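The three settlement-level variants differ only in how the raster cells inside a settlement are summarized. The following sketch mimics that step in plain Python rather than calling the ArcGIS Zonal Statistics tool; the function names, the tie-breaking rule for the majority and the sample cell values are assumptions.

```python
from collections import Counter
from statistics import mean


def zonal_summary(cell_values):
    """Mean, maximum and majority of the raster cells inside one settlement."""
    counts = Counter(cell_values)
    top = max(counts.values())
    # Assumption: ties for the majority resolve to the lowest tied value.
    majority = min(v for v, n in counts.items() if n == top)
    return {
        "mean": mean(cell_values),
        "maximum": max(cell_values),
        "majority": majority,
    }


def ffsi_s(cells_by_factor, statistic):
    """Sum one zonal statistic over the conditioning factors of a settlement."""
    return sum(zonal_summary(cells)[statistic] for cells in cells_by_factor)


# Hypothetical settlement overlapping two watersheds, shown for only 3 of
# the 13 factors; each inner list holds the ranks of its 100 x 100 m cells.
cells_by_factor = [
    [2, 2, 2, 5],
    [3, 3, 4, 4],
    [1, 1, 1, 1],
]
by_mean = ffsi_s(cells_by_factor, "mean")
by_max = ffsi_s(cells_by_factor, "maximum")
by_majority = ffsi_s(cells_by_factor, "majority")
```

In this toy case a single high-ranked cell lifts the maximum-based sum well above the other two, which is the kind of behaviour that makes a maximum-based index lean towards the higher susceptibility classes.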

Data verification
FFSIs maps were validated using the locations of documented flash flood inventory data reported to insurance companies (Fig. 6). The database was provided by the General Directorate of Water Management of Hungary. Flash flood and intense rainfall related damages were selected from the database using the keywords "intense precipitation" and "flash flood". The accuracy of the susceptibility map was calculated for each of the quintiles by dividing the number of settlements with reported events by the total number of settlements:

\[ \mathrm{Accuracy} = \frac{S_e}{S} \times 100\% \]

where $S_e$ is the number of settlements with reported events and $S$ is the total number of settlements in the given bin.
Principal component analysis in MATLAB was employed to calculate the level of influence of each of the 13 input parameters and to assess the correlation of their influence on flash flood generation. The analysis showed that 10 components (out of the 13) explained 95% of the variability of our dataset. The first three components explained only 54.9% of the variation.
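The per-quintile accuracy described above can be computed as in the following sketch. The input layout (one tuple per settlement) and the sample data are hypothetical and only illustrate the ratio of settlements with reported events to all settlements in a class.

```python
def quintile_accuracy(settlements):
    """Percentage of settlements with reported events in each class.

    `settlements` is a list of (susceptibility_class, has_reported_event).
    """
    totals, with_events = {}, {}
    for cls, has_event in settlements:
        totals[cls] = totals.get(cls, 0) + 1
        with_events[cls] = with_events.get(cls, 0) + int(has_event)
    return {cls: 100.0 * with_events[cls] / totals[cls] for cls in sorted(totals)}


# Hypothetical settlements: three in class 5, four in class 1.
sample = [
    (5, True), (5, True), (5, False),
    (1, False), (1, False), (1, True), (1, False),
]
accuracy = quintile_accuracy(sample)
```

Here class 5 reaches about 66.7% and class 1 reaches 25%; a well-performing map should show this kind of increase from the lowest to the highest susceptibility class.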
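The number of components needed to reach a given share of explained variance follows from the eigenvalues of the analysis. The sketch below is not the MATLAB routine used in the study; it assumes the eigenvalues are already known, and the function name and sample eigenvalues are hypothetical.

```python
def components_needed(eigenvalues, threshold=0.95):
    """Smallest number of leading components whose cumulative share of
    the total variance reaches `threshold`."""
    total = sum(eigenvalues)
    cumulative = 0.0
    ordered = sorted(eigenvalues, reverse=True)
    for count, value in enumerate(ordered, start=1):
        cumulative += value / total
        # Small tolerance guards against floating point round-off.
        if cumulative >= threshold - 1e-12:
            return count
    return len(eigenvalues)


# Hypothetical eigenvalues of a four-variable dataset (shares 40/30/20/10%).
eigenvalues = [4.0, 3.0, 2.0, 1.0]
n_for_95 = components_needed(eigenvalues, threshold=0.95)
n_for_75 = components_needed(eigenvalues, threshold=0.75)
```

When many components are needed to reach the threshold, as with 10 out of 13 here, the input parameters carry largely independent information and few of them are redundant.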

Results and Discussion
The watershed based FFSIws
In total, 5,458 watersheds were delineated. Almost exactly 10% of all delineated watersheds fell into the category of extreme susceptibility (Table 2). The highest flash flood susceptibility was found in the Southern Transdanubian region, along the western national border (the region of Alpokalja), and in the Börzsöny and the Mátra Mountains in the north central part of Hungary (Figure 2). Mean watershed area decreased with increasing susceptibility, demonstrating the increasingly headwater character of the watersheds. Most watersheds belonged to the category of moderate susceptibility.
In terms of the environmental factors of flash flood generation, artificial surfaces covered 5.9% of the studied river basins on average, whereas paved surfaces covering 20% of the area occurred in about 7.5% of the catchments. In comparison, forest coverage was higher than 50% in 25% of the catchments, while 32% of the studied area was afforested. The average proportion of carbonate surfaces in the studied catchments was 2.99%. In 5.71% of the catchments, carbonate rocks covered more than 20% of the total area. Clay and sand as physical soil types accounted on average for 18.5% and 31.7%, respectively. Valley density values higher than 20% were found in 22.5% of the catchments (the average was 14.5%).

Settlement based susceptibility map (FFSIs)
The distribution of the five susceptibility classes on the three FFSIs maps was rather different. While the class distribution of the mean-based FFSIs was quasi-Gaussian with very low percentages in the low and extreme categories, the maximum-based FFSIs overemphasized the proportion of settlements of high and extreme susceptibility, with these two categories combined accounting for more than 50% of all settlements (Table 3). The majority-based FFSIs again showed a normal distribution, with less extreme values in the low and extreme quintiles than on the mean-based map.
The maximum based FFSIs hence showed a higher areal proportion of extreme susceptibility than the other two FFSIs maps (Figure 3). The lowest areal cover of the settlements of extreme susceptibility was found for the mean based FFSIs (Figure 5).

Data verification
The flash flood inventory data included a total of 2,677 events. On average, 62% of the analysed 1,912 settlements did not report any flash flood related losses. Hence, the average number of reported events per impacted settlement was 3.68.
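The figure of 3.68 events per impacted settlement can be reproduced from the numbers above. Because the 62% share is rounded, the number of impacted settlements obtained this way is only approximate.

```python
total_events = 2677          # documented flash flood events
settlements = 1912           # analysed settlements
share_without_events = 0.62  # share of settlements with no reported losses

# Settlements that reported at least one event (approximate, see above).
impacted = settlements * (1 - share_without_events)
events_per_impacted = total_events / impacted

print(round(impacted), round(events_per_impacted, 2))  # prints: 727 3.68
```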
According to the flood inventory, the highest absolute number of flash flood related events was reported from the city of Miskolc (38) and from the north central part of Hungary (Figure 6a). However, when the number of reported events was normalized for the number of residents, the SW part of Hungary demonstrated a higher level of risk (Figure 6b). The contrasting distributions of reported events are partly due to the different settlement structures and the higher percentage of settlements of low population in the SW and NE parts of Hungary.
The accuracy of the three FFSIs maps was tested using the accuracy of the five susceptibility quintiles. The highest accuracy was expected for quintile 5 (highest susceptibility) and the lowest for quintile 1 (lowest susceptibility). The highest accuracy, 59.02% for quintile 5, was found for the majority based FFSIs; however, the same map also showed the highest accuracy for the quintile of the lowest susceptibility (Table 4). The majority based FFSIs demonstrated the lowest accuracy for the quintile of high susceptibility (quintile 4), while the best-performing FFSIs reached 50.74%. When the accuracy of all quintiles was considered, the majority-based map again performed the best with a mean accuracy of 48.41%, whereas the mean and maximum based FFSIs maps were almost identical at 46.33% and 46.29%, respectively.

Conclusions
The currently proposed FFSIs maps represent the second-stage development of the flash flood susceptibility map (flood potential map) of Hungary first generated and published by Czigány et al. (2011). In the present study a watershed based FFSIws and three settlement level FFSIs maps were created. Using the maximum susceptibility value for statistical evaluation is a highly recommended approach, as this susceptibility level reflects the worst-case scenario for the relevant community. This approach enables decision makers to mitigate losses; however, it increases the cost of flood prevention measures.
The current map is a markedly improved version of the first susceptibility map. Improvements were made in the following four respects:
a) The current map includes a larger number of conditioning factors, specifically integrating multiple hydraulic factors that may influence channel flow.
b) The current map has a higher resolution. The current FFSI map was generated for 5,458 elementary watersheds in contrast to the 1,093 of the former FFSI map.
c) The prediction accuracy of the current map is verified by a much larger flood inventory dataset.
d) The current map focuses on settlement susceptibility/vulnerability, on the locations where damage happens.
The number of conditioning factors commonly used in the development of FFS maps is between 3 and 12 (Saleh et al., 2020; Tincu et al., 2018). As flash floods are generated by multiple conditioning factors, but in a site and climate specific manner, a multidisciplinary approach is needed for forecasting such extreme hydrological phenomena and nowcasting the causative heavy rainfalls. However, reliable historical records are often too short. In addition, measuring peak rainfall or storm flow is subject to error. Furthermore, rainfall patterns have also changed over the past decades in the wake of climate change.
In addition to the above-discussed environmental factors, it is essential to incorporate and regularly monitor other, dynamic, non-steady environmental factors, like discharge, antecedent soil moisture contents, groundwater table depths, rainfall pattern and canopy cover. The present model included 13 conditioning factors of even weight. A first option to improve the model of the current study is to perform a linear regression calculation to evaluate the weight of the conditioning factors, similarly to many previous studies on flood potential assessment. These papers used weighted parameters based on preliminary statistics and regression calculations. Secondly, other classification methods and alternative raster resolutions may also be applied during ArcGIS processing and analyses. Consequently, a novel spatial statistical method and a higher spatial resolution may be selected in the GIS environment to convert watershed level data to settlement level data. Thirdly, the current susceptibility map could also be verified and compared with the susceptibility map developed for the headwater catchments of the Mecsek Hills by Fábián et al. (2016) on the basis of the morpho- and geometric properties of the studied watersheds.
By combining the current FFSIs maps with the extent of infrastructural damage, the map may be further developed into a vulnerability map of advanced practical application for decision makers and end-users. For further refinement, other indirect factors may be included in the model. Khosravi et al. (2018) suggested the integration of the topographic wetness index (TWI) into FFS maps. (TWI was successfully employed in SW Hungary for the detection of soil moisture availability by Nagy et al. (2021)). Tincu et al. (2018) found a strong correlation between flow accumulation and flash flood susceptibility on a watershed with a surface area of 4,456 km².
Although the currently presented FFSI has a lower accuracy than most of the previously proposed ones, still it could serve as a useful tool for decision makers.
A highlight of the current model is that it was verified with an independent dataset of flash flood related disasters and damage. The lower accuracy may also be explained by the size of the analysed area. Most studies performed flood potential analysis on relatively small drainage areas, in some cases at city level (Tehrany et al., 2014), while others covered areas of up to several thousands of square kilometres (Tincu et al., 2018) or the entire state of Pennsylvania (Ceru, 2012). Hence, the novelty of the current maps is their resolution compared to the dimensions of the modelled area. However, at a resolution of a few km², vulnerability and risk mapping may also be enhanced by the actual forecasting and nowcasting of precipitation effects.