Comparative analysis of the traffic accidents in the territory of the city of Užice for 2021 and 2022 using open data and the Streamlit application

Abstract


Introduction
The present era can rightly be called the age of data. The paradigm that describes modern society dates from 2006, when the British mathematician Clive Humby declared: "Data is the new oil." With the development of information technologies, the Internet and social networks, the amount of information has been growing rapidly from year to year. It is estimated that 2.7 zettabytes of data had been collected by 2013, while it is assumed that in 2025 that number will be 175 zettabytes (Daley, 2022). Data processing has become both a challenge and a necessity in order to make quality business and life decisions. The new era characterized by ubiquitous computing, big data, the use of sensors, the Internet of Things, artificial intelligence, blockchain and a number of other technologies is called the Fourth Industrial Revolution (Industry 4.0) (Krivokapić et al, 2019). Industry 4.0 would not be possible without the use of large amounts of data in business. The European Union (EU) recognizes the importance of data as a resource for economic development, creation of new jobs, competitiveness, innovation and social progress. One of the pillars of the development of the EU economy is the construction of the data economy, the total share of which in 2016 was 300 billion euros (1.99% of European social income) with a tendency to grow in the following years (Krivokapić et al, 2019).

Open data
Open data is data that is publicly available to everyone and can be used for any purpose in any way, without copyright or other restrictions (Dymora et al, 2018). Most often, this term refers to data obtained from public institutions, but it is also used for other types of data (medical data and the like). The open data initiative was created in 2007, when Lessig and his collaborators started the initiative to open the data of government institutions in the United States of America (USA) in order to achieve transparency and introduce oversight of their work (Ayre & Craner, 2017). In 2013, then-President Obama signed an executive order on open data. The order opens an open data portal and states that openness in government strengthens democracy, promotes the delivery of efficient and effective services to the public and contributes to economic growth. The advantage of open government is that information resources become easier to find, access and use (Ayre & Craner, 2017).
Proper use of open data can help achieve the following goals (Terzić & Majstorović, 2019):
-Open data helps governments and citizens to make better decisions based on the availability of more data.
-Users can combine different datasets.
-The openness of data allows researchers to explore trends as well as uncover social and economic problems.
-Open data helps the public sector and institutions to achieve better results in areas such as health, education, public safety, ecology, natural disasters and the like.
-Open data contributes to economic development and helps in business operations.
-It improves the flow of information in the government, improves intersectoral cooperation and contributes to greater transparency.
-The openness of economic data helps to control the spending of public money.
-Government openness accelerates institutional reactions, reduces corruption and helps build new democratic spaces for citizens (Keserű & Kin-Sing Chan, 2015).
Open data is usually created by public institutions. Before being published, data must go through an anonymization process (Dymora et al, 2018). One of the basic requirements that open data must meet is to ensure privacy. Data that is not allowed to be shared is called closed data. The term closed data usually means personal data. Some data may be processed by a limited number of users for the purpose of various research. Such data is called shared data.
Open data must be published in a format that allows easy machine processing, and it must be possible to process that format with at least one open software tool (Open data portal).

Open data in Serbia
The first available set of open data in Serbia appeared in 2015 through the work of the Ministry of Education, Science and Technological Development, and significant progress has been made since then. In 2019, the open data portal data.gov.rs offered over 650 publicly available data sets from over 25 institutions (Krivokapić et al, 2019). In April 2023, that number was around 2166 from 111 institutions (Data.gov.rs, 2023a, 2023b). The Law on Electronic Government from 2018 requires public bodies to open data within their jurisdiction (Službeni glasnik RS, 27/2018). The degree of progress of countries in terms of opening data can be monitored on the Global Open Data Index portal (2023). Data are analyzed in 15 categories. Serbia is in 41st place with a degree of openness of 41%. According to the Global Open Data Index (2023), the highest degree of openness is in the area of public procurement as well as information on air quality. In four categories, the Government did not open its data, namely:
-election data,
-overview of locations,
-spending the Government's money, and
-ownership of plots.

Open data on traffic accidents
The use and processing of open data creates favorable conditions for the discovery of new knowledge in order to provide answers to important social problems. One of the major problems around the world is traffic accidents, which claim more and more lives and cause great material damage. With high-quality data analytics, conclusions can be drawn that could reduce harmful consequences in the future. Saxena & Robila (2021) describe the use of open data in the field of traffic and present a tool for the analysis of traffic accidents in the territory of the city of New York based on open data analytics. The tool analyzes all the factors that have influenced traffic accidents, visually represents the locations where accidents have occurred, and provides insight into the details in order to draw conclusions with the aim of reducing the number of accidents in the future. Gladović and Deretić describe the analysis of traffic accidents in the territory of the City of Belgrade (Gladović & Deretić, 2018). At the very beginning, the authors describe a set of open data, and then provide an analysis of the data in Excel. They examine the hypothesis that traffic safety in the territory of the City of Belgrade increased in 2016 compared to 2015.
Visualization means translating information into a visual context to make it easier to understand. Visualization can be used to create high-quality graphical representations of data, but also to perform exploratory data analysis (EDA). In the paper (

Methodology used in the work
For the purposes of building the traffic accident analysis application, data from the open data portal covering the period from 2015 to January 2023 was used (Open data portal). Datasets are provided in .xlsx format. The description of the columns is given in Table 1.
Table 1 - Description of the columns (excerpt)
... Kind of traffic accident
8 Type of traffic accident
9 Description of the traffic accident
The application for the automatic analysis of traffic accident data was developed in the Streamlit development environment using the Pandas library. Streamlit is an open source Python library for easily creating web applications for machine learning and data analytics. It provides an intuitive and simple way of working when building applications and does not require knowledge of other web tools. It is characterized by excellent documentation that further facilitates the work (Docs.streamlit.io, 2023).
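As a rough illustration of the loading step, one yearly dataset could be read with Pandas along the following lines. This is only a sketch under stated assumptions: the English column labels, the generic `col_N` names, the assumption that the three described columns come last, and the assumption that the sheet has no header row are all hypothetical and are not taken from the portal or from the application described here.

```python
import pandas as pd

# Hypothetical labels for the three columns whose descriptions are quoted in
# the text; the real files may use a different order or different names.
ASSUMED_TAIL_COLUMNS = ["accident_kind", "accident_type", "description"]


def label_columns(raw: pd.DataFrame) -> pd.DataFrame:
    """Return a copy whose last three columns carry the assumed labels."""
    n_other = raw.shape[1] - len(ASSUMED_TAIL_COLUMNS)
    if n_other < 0:
        raise ValueError("expected at least three columns")
    # Columns that are not described in the text get generic placeholders.
    names = [f"col_{i + 1}" for i in range(n_other)] + ASSUMED_TAIL_COLUMNS
    labelled = raw.copy()
    labelled.columns = names
    return labelled


def load_accidents(path: str) -> pd.DataFrame:
    """Read one yearly .xlsx file, assuming it has no header row."""
    # pandas needs an Excel engine such as openpyxl to read .xlsx files.
    raw = pd.read_excel(path, header=None)
    return label_columns(raw)
```

Keeping the labelling in its own function means it can be tried on a small in-memory table before any real file is downloaded.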
Streamlit has made it completely simple to create interfaces, display text, visualize data, render widgets, and manage a web application from start to deployment with its practical and highly intuitive application programming interface (Khorasani et al, 2022).
Live code reloading is a feature of the environment that lets developers see the effect of each change to the program in the running application as soon as it is made (Konova, 2022). The code can be written in any Python editor. Since Streamlit applications are Python scripts, they are compatible with Git and other version control software. Streamlit is also compatible with other Python libraries such as Keras, Scikit-Learn, NumPy and others. Installation is done through the Python package management system with the following command: pip install streamlit.
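Once the package is installed, a very small Streamlit page might look like the sketch below. It is not the code of the application presented in this paper: the page title, the widget labels and the helper `count_by` are hypothetical, and the script is assumed to be saved as `app.py` and started with `streamlit run app.py`.

```python
import pandas as pd

try:
    import streamlit as st
except ImportError:  # the helper below still works without Streamlit
    st = None


def count_by(df: pd.DataFrame, column: str) -> pd.Series:
    """Number of rows per distinct value in `column`, largest first."""
    return df[column].value_counts()


def main() -> None:
    st.title("Traffic accidents")
    # Drag-and-drop or file-dialog upload, restricted to .xlsx files.
    uploaded = st.file_uploader("Choose an .xlsx file", type=["xlsx"])
    if uploaded is None:
        st.info("Upload a dataset to see the report.")
        return
    df = pd.read_excel(uploaded)
    st.dataframe(df)  # interactive table
    column = st.selectbox("Group by", list(df.columns))
    st.bar_chart(count_by(df, column))  # bar chart of the counts


if st is not None and __name__ == "__main__":
    main()
```

Because Streamlit reruns the script from top to bottom after every interaction, the page needs no explicit callbacks for this simple case.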
The main feature of this web environment is its simplicity of use. Figure 1 illustrates how, in just a few lines of code using the API provided by the framework, one can get a web page that visually presents the data (Docs.streamlit.io, 2023).
Figure 1 - Example of a web page created in the Streamlit environment (Docs.streamlit.io, 2023)
Pandas is an open source Python library designed for fast and easy data processing. In terms of the number of questions asked, the Pandas library has the highest growth trend on the Stack Overflow site (Reddit.com, 2020). It allows easy manipulation of tabular data and has excellent documentation (Pandas.pydata.org, 2023).

Application for automatic analysis of open traffic accident data
The user of this application will be able to get acquainted with the concept of open data and, through the analysis of traffic accident data, gain a clearer picture of the statistics related to this social problem. The application consists of three pages: Homepage, About_project and Data. The central part of the application is the Data page, while the remaining two pages are informative and present some functionality of the Streamlit development framework, such as adding images and videos. The Data page contains a user manual for the application. The user is first required to select data for analysis by drag and drop or by using the file manager (Figure 2). The application processes only the traffic accident data sets available on the open data portal.
Figure 2 - The central part of the application for the automatic processing of traffic accident data (Gavrilović, 2023)
After the user loads data into the application, tabular, numerical and graphic reports on traffic accidents for the selected data set are obtained (Figure 3). In addition, the user can filter the data by selecting the police department as well as the type of offense and obtain a tabular display of the data according to the given criteria (Figure 4). At the end of the page, traffic accidents are displayed on a geographical map (Figure 5). The map can be zoomed in for a more precise insight into the geolocation of a traffic accident. In this way, potentially dangerous places in traffic can be easily identified.
Figure 3 - Tabular, numerical and graphic presentation of traffic accident data (Gavrilović, 2023)
Figure 4 - Using filters for data selection (Gavrilović, 2023)
Figure 5 - Traffic accident data represented on a map (Gavrilović, 2023)

Comparative analysis of the traffic accidents in the territory of the Police Department of Užice for 2021 and 2022
The practical application of the application for research purposes refers to the analysis of traffic accidents on the territory of the Užice Police Department (Užice PD). The primary goal of the research is to test the hypothesis that the analysis of open data on traffic accidents for previous years, together with preventive actions, has led to an increase in traffic safety in the territory of the Užice PD. The data was obtained by using the application and selecting the appropriate filters available on it.
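The filtering step used to obtain the data could be expressed in Pandas roughly as follows. The column names `police_department` and `accident_kind`, the function `filter_accidents` and the three sample rows are invented for illustration; they do not come from the dataset or from the application.

```python
import pandas as pd


def filter_accidents(df, department=None, kind=None):
    """Keep rows matching the chosen department and/or accident kind."""
    mask = pd.Series(True, index=df.index)
    if department is not None:
        mask &= df["police_department"] == department
    if kind is not None:
        mask &= df["accident_kind"] == kind
    return df[mask]


# Toy rows, made up for this example only.
sample = pd.DataFrame({
    "police_department": ["UŽICE", "UŽICE", "BEOGRAD"],
    "accident_kind": ["material damage", "injured persons", "material damage"],
})

uzice = filter_accidents(sample, department="UŽICE")
print(uzice["accident_kind"].value_counts())
```

Leaving a filter as `None` means "no restriction", which mirrors a user who selects only one of the two available criteria.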
A total of 664 traffic accidents were recorded on the territory of the Užice Police Department in 2021, namely 349 accidents with material damage, 297 traffic accidents with injured persons and 18 accidents with fatalities. Regarding the time interval when the traffic accident occurred, 219 traffic accidents occurred from midnight to 12:00 and 445 from 12:00 to midnight. The most serious forms of traffic accidents are those in which there are injured or killed participants. Traffic accidents with injured persons in 2021 occurred more often after 12:00 (201) than before 12:00 (96). In the period from midnight to 12:00, there were 5 fatal traffic accidents, while in the period from 12:00 to midnight that number was 13. Through graphic interpretation and visualization, it is possible to identify potentially dangerous road routes on the territory of the selected PD.
Figure 6 - Geolocations of the traffic accidents with injured persons for the year 2021 (Gavrilović, 2023)
A graphic representation of the geolocations of the traffic accidents with fatalities in the territory of the Užice PD for the year 2021 is given in Figure 7.
Figure 7 - Geolocations of the traffic accidents with fatalities for the year 2021 (Gavrilović, 2023)
A total of 647 traffic accidents were recorded on the territory of the Užice Police Department in 2022, of which 375 resulted in material damage, 255 resulted in injuries and 17 resulted in fatalities. In the period from midnight to 12:00, 236 traffic accidents were recorded, while in the period from noon to midnight, 411 were recorded. The number of traffic accidents with injured persons until noon in 2022 was 86, while 169 such accidents occurred from noon to midnight. Accidents with fatalities happened more often in the afternoon (10) than in the morning (7). The graphic representation of the accidents with injured persons is in Figure 8.
Figure 8 - Geolocations of the traffic accidents with injured persons for the year 2022 (Gavrilović, 2023)
The graphic representation of the locations of the traffic accidents with fatalities is presented in Figure 9.
Figure 9 - Geolocations of the traffic accidents with fatalities for the year 2022 (Gavrilović, 2023)
The analysis of the graphic interpretations in Figure 6 and Figure 8 shows that the largest number of traffic accidents with injured persons in the territory of the Užice Police Department occurs on two road directions, Užice-Čajetina and Užice-Požega. Apart from these main road routes, there is a noticeable problem with safety on the Požega-Arilje and Požega-Kosjerić roads, as well as on the branch of the road that leads from the town of Sušica to the Kotroman border crossing.
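The counts reported above for 2021 and 2022 can be cross-checked with a few lines of plain Python. The dictionary layout and the function names are only one possible arrangement chosen for this sketch; the numbers themselves are the ones given in the text.

```python
# Counts as reported in the text for the Užice PD.
COUNTS = {
    2021: {
        "total": 664,
        "by_outcome": {"material damage": 349, "injured": 297, "fatal": 18},
        "by_half_day": {"00-12": 219, "12-24": 445},
        "injured_by_half_day": {"00-12": 96, "12-24": 201},
        "fatal_by_half_day": {"00-12": 5, "12-24": 13},
    },
    2022: {
        "total": 647,
        "by_outcome": {"material damage": 375, "injured": 255, "fatal": 17},
        "by_half_day": {"00-12": 236, "12-24": 411},
        "injured_by_half_day": {"00-12": 86, "12-24": 169},
        "fatal_by_half_day": {"00-12": 7, "12-24": 10},
    },
}


def is_consistent(year_counts):
    """Check that every breakdown adds up to the figure it subdivides."""
    outcome = year_counts["by_outcome"]
    return (
        sum(outcome.values()) == year_counts["total"]
        and sum(year_counts["by_half_day"].values()) == year_counts["total"]
        and sum(year_counts["injured_by_half_day"].values()) == outcome["injured"]
        and sum(year_counts["fatal_by_half_day"].values()) == outcome["fatal"]
    )


def morning_share(year_counts):
    """Percentage of all accidents that occurred from midnight to noon."""
    morning = year_counts["by_half_day"]["00-12"]
    return round(100 * morning / year_counts["total"], 2)


for year, year_counts in COUNTS.items():
    print(year, is_consistent(year_counts), morning_share(year_counts))
# prints: 2021 True 32.98
#         2022 True 36.48
```

Both years pass the check, and the share of accidents before noon rises from 32.98% in 2021 to 36.48% in 2022.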
When the geolocations of the traffic accidents with fatalities in Figures 7 and 9 are compared, the conclusion is similar to the previous one. The largest number was recorded on the main road routes Užice-Čajetina and Užice-Požega. The comparison of the data from Figures 7 and 9 reveals a noticeable decrease in the number of traffic accidents with a fatal outcome in 2022 on the Užice-Požega road, while there is a slight increase on the Užice-Čajetina road.
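The comparison above relies on visual inspection of the maps. As a complementary illustration, and not the method used in this paper, locations could also be grouped into coarse grid cells and counted. The function `busiest_cells`, the rounding to two decimal places and the three coordinates below are assumptions made for this sketch; the coordinates are invented and are not taken from the dataset.

```python
from collections import Counter


def busiest_cells(points, decimals=2, top=3):
    """Count accidents per grid cell obtained by rounding (lat, lon)."""
    cells = Counter(
        (round(lat, decimals), round(lon, decimals)) for lat, lon in points
    )
    return cells.most_common(top)


# Made-up coordinates for illustration only.
demo_points = [(43.861, 19.842), (43.858, 19.839), (43.751, 19.713)]

print(busiest_cells(demo_points, top=1))
# prints: [((43.86, 19.84), 2)]
```

A coarser or finer grid only requires changing `decimals`; the choice would have to be matched to the length of the road sections of interest.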
The structure of traffic accidents for 2021 and 2022 by the time period when they occurred is presented in Table 2. The Table 2 data show that a greater number of traffic accidents occur in the afternoon. Comparing the two years, the share of traffic accidents that occurred from midnight to noon is higher in 2022 than in 2021. Table 3 shows the data on traffic accidents for 2021 and 2022. The number of traffic accidents with material damage increased by 7.45%, while the number of traffic accidents with injured persons decreased by 14.14%. The number of fatal traffic accidents also decreased, by 5.56%. The total number of traffic accidents is lower by 2.56%, so we can accept the hypothesis that there has been an increase in traffic safety in the territory of the Užice PD.
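The relative changes follow directly from the counts reported for the two years. The sketch below recomputes them; the function name `percent_change` and the dictionary layout are choices made for this example.

```python
def percent_change(old, new):
    """Relative change from `old` to `new`, in percent, rounded to 2 places."""
    return round(100 * (new - old) / old, 2)


# (2021, 2022) counts for the Užice PD as reported in the text.
counts = {
    "material damage": (349, 375),
    "injured persons": (297, 255),
    "fatalities": (18, 17),
    "total": (664, 647),
}

for label, (y2021, y2022) in counts.items():
    print(f"{label}: {percent_change(y2021, y2022):+.2f}%")
# prints: material damage: +7.45%
#         injured persons: -14.14%
#         fatalities: -5.56%
#         total: -2.56%
```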
The conclusions obtained from the statistical analysis of the data should be interpreted with caution due to potential shortcomings in the methods of data collection and recording and to uneven registration (Gladović & Deretić, 2018).
Comparative analysis of the traffic accidents in the territory of the city of Užice for 2021 and 2022 using open data and the Streamlit application, pp.616-633

Conclusion
This work shows the possibility of using the Streamlit framework for creating an application for processing open data. The work begins with a short introduction, then the concept of open data is explained and the previous works based on open data on traffic accidents are presented. The main part of the paper is a description of the web application for the analysis of traffic accidents in the Republic of Serbia and its practical application to the analysis of traffic accidents in the territory of the Užice PD. The aim of the work is to get acquainted with open data and the functional possibilities of the Streamlit development environment through the analysis of the application operation. Using the application enables a review of data on traffic accidents according to various criteria, which facilitates the review by researchers and services in charge of traffic safety. The analysis of the data for the Užice Police Department showed that there was an increase in the general level of traffic safety in 2022 compared to 2021. The visualization identified road routes where there were the most accidents with injured and dead persons and where preventive control measures should potentially be strengthened. The application presented in the paper has room for upgrading in the future, by introducing additional functionalities, in order to make the work even easier for professionals for whom it is intended.