BIOINFORMATICS ONLINE SUPPORT FOR BIOACTIVE SUBSTANCES CYTOTOXICITY TESTING AND THEIR STATISTICAL ANALYSIS

Preclinical in vitro/in vivo testing is the first step in the discovery of anticancer medicines and includes, among other things, evaluation of the cytotoxic activity of bioactive substances (BAS) on various normal and cancer human cell lines. Cytotoxicity expressed as an IC50 value (the dose that inhibits 50% of cell growth) is one of the most commonly used parameters for comparing activity between different substances. This study examines a number of BAS and their cytotoxic activity results obtained in the Laboratory for cell and molecular biology (LCMB), which require various statistical and computational techniques for proper and effective analysis. In order to make experimental data analysis faster, more effective and less error-prone, with a secure online data repository, a web application with the LCMB IC50 database was developed as a research tool that supports scientific data processing requirements. The analysis includes the cytotoxic effects of chemical and natural BAS (IC50 values) on the HCT-116, SW-480, MDA-MB-231, and MRC-5 cell lines. Generally, it can be concluded that BAS of different origin, chemical or natural, have various cytotoxic effects and cause different cell line sensitivity, which is presented and discussed. This paper presents the developed SQL database-centric web application with remote, user-friendly data management for a biology researcher user profile. The data processing in this article can be useful for further review and testing of cytotoxic substances.


INTRODUCTION
Cancer is one of the most common diseases worldwide. There are many attempts to improve anticancer therapy, one of the main approaches being the use of a variety of drugs with different origins and mechanisms of action. All of them must pass through preclinical and clinical testing on diverse model systems (LOPEZ-LAZARO, 2015). Preclinical investigations begin with the isolation, purification and/or modification of bioactive substances (BAS) originating from natural sources, as well as the synthesis of new chemical compounds. They include testing of structural characteristics, their interaction with biomacromolecules and many assays that evaluate their anticancer properties, mainly on the recommended panels of cell lines and animal models. In vitro cytotoxic testing of BAS on cancer cell lines allows early detection of cytotoxic treatments that can potentially be used in anticancer therapy (FLORENTO et al., 2012). Testing the effect of different BAS on cell viability in culture is a widely used method. There is a large number of in vitro cytotoxicity assays that test whether a compound is toxic to cultured cells, mainly by determining the number of living cells over a defined incubation period (RISS et al., 2011). The inhibitory concentration (IC50) of a tested substance, which inhibits 50% of cell growth, is a reference value for cytotoxicity (CALDWELL et al., 2012; ĆURČIĆ et al., 2012a). Intensive research in this field, the large number of cytotoxicity results obtained for various active substances, and their use in the prevention and treatment of cancer (MILUTINOVIĆ et al., 2015a) led to the statistical processing of these data and the implementation of the new LCMB IC50 database presented in this article.
The anticancer activity of chemical complexes has been known for decades and many of them have been used as a treatment for various types of cancer (NDAGI et al., 2017). The impressive effect of cisplatin on cancer cells has launched the development of new derivatives with improved pharmaceutical effects. In recent years, various chemical compounds and metal-based complexes, such as those of platinum, palladium, gold, ruthenium, etc., have been synthesized with different ligands and increasingly tested. In the Laboratory for cell and molecular biology (LCMB), the cytotoxicity of more than 50 chemically synthesized compounds was examined, as well as over 200 substances of natural origin. Thus, there was a need to collect the IC50 data of BAS from various sources in the LCMB IC50 database for efficient analysis and comparison of cytotoxic effects between different cell lines, as well as between the treatment incubation periods of 24 and 72 h. In addition, the difference in the cytotoxic effect of BAS isolated from plants, fungi or lichens was further analyzed depending on the cell line and incubation period, as well as on the type of extract used for isolation.
Besides the required complex laboratory procedures and tests, which are subject to permanent innovation and improvement, adequate support for processing large amounts of laboratory data becomes increasingly important. Laboratories generate more and more data that need to be adequately processed in order to extract new scientific information and produce new research insights and discoveries. The usual manual data processing, supported by software tools that require significant human assistance, such as Excel tables and similar, cannot cope with the increasing amount of data. It is slow and error-prone because of the intensive human interaction required, it has limited data processing and interpretation capabilities, and it lacks support for complex data processing, decomposition and restructuring. Decomposition and data restructuring (EMELYANOV, 2018) can relate and organize existing data in a new way, offering previously unavailable insights and discovering "hidden" information which may lead to new scientific achievements, thus adding value to the collected laboratory data.
The aims of this study are: (1) collecting IC50 values in the LCMB IC50 database for easy and rapid analysis of BAS cytotoxicity and comparison of the cytotoxic effects of BAS from different sources, and (2) developing a web application for remote management of the database containing laboratory data, supporting different modes of operation such as adaptive filtering for data selection and statistical analysis, CRUD (Create, Read, Update, Delete) operations on the parameter data that characterize IC50 values, and editing of IC50 data. All web application operational modes are available remotely to a logged-in user through an intuitive web user interface.
For the purpose of the assay (MOSMANN, 1983), cells were seeded in a 96-well plate, incubated for 24 h and then treated with different substances in a concentration range of 0.1 to 500 μM for chemically synthesized substances or 0.1 to 500 μg/ml for natural extracts. The assay was performed 24 and 72 h after treatment.

Description of Laboratory data and methods
Percentages of cell viability were calculated as the ratio of absorbance of the treated group divided by the absorbance of the control group (untreated cells), multiplied by 100. The IC50 value was presented as the parameter for cytotoxicity and comparable result between different substances. The IC50 values were calculated from dose curves of cell viability using CalcuSyn v 2.1. software. The IC50 values of BAS are grouped according to their origins, such as chemically synthesized (CHS) and natural extracts (NE). The data of natural extracts are grouped according to treatment taxonomy type: plants, fungi, lichens, as well as according to the type of extract used in extraction procedures: methanol, acetone and other types (ethanol, ethyl acetate, water, chloroform and petroleum ether).
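The viability formula above can be illustrated with a short Python sketch. The absorbance readings and doses are made-up values, and the function names are hypothetical. The IC50 estimate uses simple linear interpolation on log concentration as a stand-in for illustration only; it is not the method implemented in CalcuSyn.

```python
import math

def percent_viability(abs_treated, abs_control):
    """Viability (%) = absorbance of treated cells / absorbance of control cells * 100."""
    if abs_control <= 0:
        raise ValueError("control absorbance must be positive")
    return abs_treated / abs_control * 100.0

def interpolate_ic50(concentrations, viabilities):
    """Estimate IC50 by linear interpolation on log10(concentration).

    Illustrative stand-in only: this is NOT the CalcuSyn method.
    Returns None if viability never crosses 50% in the tested range.
    """
    points = sorted(zip(concentrations, viabilities))
    for (c_lo, v_lo), (c_hi, v_hi) in zip(points, points[1:]):
        if v_lo >= 50.0 >= v_hi and v_lo != v_hi:
            frac = (v_lo - 50.0) / (v_lo - v_hi)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None

# Made-up example: one control reading and five treated readings.
control = 1.20
doses = [0.1, 1, 10, 100, 500]               # same unit as the reported IC50
absorbances = [1.14, 0.96, 0.72, 0.48, 0.24]
viab = [percent_viability(a, control) for a in absorbances]
ic50 = interpolate_ic50(doses, viab)
print([round(v, 1) for v in viab], round(ic50, 2))
```

A return value of None corresponds to a substance whose viability stays above 50% across the tested range, which the analysis described later treats as non-cytotoxic.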

Statistical analysis of IC50 data
Statistical analysis of IC50 data was performed in the web application within the LCMB IC50 database presented in this article. The results are presented as mean ± standard error. Analysis of variance (ANOVA) in the web application was used to compare the selected data. All data that were not read by the MTT assay (values of 0) were excluded from the statistics by filtering. All data greater than 500 (values >500 µM and >500 µg/ml) were considered non-cytotoxic and were included in the statistics as values of 500.
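The preprocessing rules and statistics described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code of the web application: the function names are hypothetical and the standard error is assumed to be the sample standard deviation divided by the square root of the sample size.

```python
import math
from statistics import mean, stdev

CAP = 500.0  # values above the tested range are treated as 500 (non-cytotoxic)

def clean_ic50(values):
    """Drop unread values (0) and cap values above 500 at 500."""
    return [min(v, CAP) for v in values if v != 0]

def mean_and_se(values):
    """Return (mean, standard error) using the sample standard deviation."""
    return mean(values), stdev(values) / math.sqrt(len(values))

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Made-up IC50 readings for one treatment group at two incubation times.
sample_24 = clean_ic50([0, 120.0, 80.0, 650.0])   # -> [120.0, 80.0, 500.0]
sample_72 = clean_ic50([60.0, 0, 40.0, 500.0])    # -> [60.0, 40.0, 500.0]
print(mean_and_se(sample_24), mean_and_se(sample_72))
print(one_way_anova_f([sample_24, sample_72]))
```

The p value follows from the F distribution with the returned degrees of freedom. A statistics library, for example `scipy.stats.f_oneway`, returns the F statistic and the p value directly.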

LCMB IC50 database description
The starting point for the database and web application development was a collection of Laboratory IC50 data organized in a single Excel table. The table columns were the IC50 laboratory research data and various parameters characterizing the experimental conditions under which the IC50 data were obtained. Each of the more than 360 table rows corresponds to the results of a particular lab experiment. Such organization and storage of experimental data offered some basic features for analysis of results, while more advanced custom analysis and teamwork requirements could not be met with a single table document, even in a shared collaborative cloud environment such as Google Drive, Microsoft OneDrive and others. Such collaborative cloud environments are suitable for teamwork where standard documents like Excel, Word, and similar are appropriate. The specific analysis features required data decomposition and relational database storage, together with the development of an application with custom functionalities. A custom web application for remote management of a database containing decomposed laboratory data within the safe academic network domain of the Faculty was considered the optimal solution. Figure 1 presents part of the spreadsheet with the original Laboratory data. The Excel data were converted to an equivalent database table. For a web application with the required functionalities, the data from the obtained table should be decomposed into a number of tables, each containing data of the same kind and corresponding to the data in the spreadsheet columns. Decomposition is performed programmatically by SQL queries executed from the phpMyAdmin (https://www.phpmyadmin.net/) database tool or from PHP program files (https://www.php.net/). PHP is used as the web server programming language for the web application.
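The decomposition step can be sketched as follows. This is only an illustration: it uses Python with the built-in sqlite3 module instead of the PHP and phpMyAdmin setup described above, and all table names, column names and rows are hypothetical, since the actual schema is not given here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Flat table standing in for the imported spreadsheet (hypothetical columns and rows).
cur.execute("""CREATE TABLE ic50_flat (
    treatment TEXT, type TEXT, cell_line TEXT, unit TEXT,
    ic50_24 REAL, ic50_72 REAL)""")
cur.executemany("INSERT INTO ic50_flat VALUES (?, ?, ?, ?, ?, ?)", [
    ("compound A", "CHS", "HCT-116", "uM", 45.2, 30.1),
    ("compound A", "CHS", "MRC-5", "uM", 88.0, 61.5),
    ("extract B", "NE", "HCT-116", "ug/ml", 210.0, 150.3),
])

# One lookup table per repeated column, filled with SELECT DISTINCT.
for table, column in [("types", "type"), ("cell_lines", "cell_line"), ("units", "unit")]:
    cur.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
    cur.execute(f"INSERT INTO {table} (name) SELECT DISTINCT {column} FROM ic50_flat")

# Main table keeps the measurements and references the lookup tables by id.
cur.execute("""CREATE TABLE ic50 (
    id INTEGER PRIMARY KEY, treatment TEXT,
    type_id INTEGER REFERENCES types(id),
    line_id INTEGER REFERENCES cell_lines(id),
    unit_id INTEGER REFERENCES units(id),
    ic50_24 REAL, ic50_72 REAL)""")
cur.execute("""INSERT INTO ic50 (treatment, type_id, line_id, unit_id, ic50_24, ic50_72)
    SELECT f.treatment, t.id, l.id, u.id, f.ic50_24, f.ic50_72
    FROM ic50_flat f
    JOIN types t ON t.name = f.type
    JOIN cell_lines l ON l.name = f.cell_line
    JOIN units u ON u.name = f.unit""")
conn.commit()

n_lines = cur.execute("SELECT COUNT(*) FROM cell_lines").fetchone()[0]
n_rows = cur.execute("SELECT COUNT(*) FROM ic50").fetchone()[0]
print(n_lines, n_rows)
```

Keeping each parameter in its own table means that a cell line or treatment type is stored once and referenced by id, so a filter or an edit on a parameter applies consistently to every experiment row that uses it.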

Web application
The web application page for remote access and management of IC50 data is presented in Figure 3. The IC50 web application has 3 main functionalities: 1) data view with flexible filtering, 2) data management through CRUD operations, and 3) ANOVA statistical analysis, including flexible creation of single and compound samples. Each of these functionalities includes many sub-functionalities. Any number of filtering conditions can be selected, and only the selected parameters actually contribute to filtering. Figure 4 shows a case of data filtering that returns only 2 data rows. Without selection of any filtering parameter, all data are returned, as in Figure 3, where all 360 existing rows are returned. The filtering parameters are: Treatment, specifying the exact name of the treatment; Alias, specifying the name of the treatment more closely; Type, specifying the type of the treatment; Line, for the cell line used for testing; Unit, for the IC50 unit; and additional filtering parameters. The additional filtering parameters allow specifying the 24 or 72 h interval of exposure to the cytotoxic substance, filtering by IC50 value interval, with lower and upper interval values for 24 and 72 h separately, and finally filtering by the IC50 concentration error value. The error value is estimated as the maximal deviation of the concentration from the given nominal IC50 value. Figure 3 shows these additional filtering parameters as text boxes on the web page for entering the values. As with the previously mentioned filtering parameters, if values are not specified, there is no filtering. Figure 5 shows the web panel for data management CRUD operations. New laboratory experiment data can be added, and existing data can be updated or deleted. New parameter data can also be added, updated, or deleted, thus allowing full control of data editing. Statistical analysis can be performed on any filtered group of IC50 data, according to the filtering described above for the data view.
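The rule that only selected parameters contribute to filtering can be sketched as a query builder. This is a hypothetical Python illustration, not the application's PHP code: the filter keys, column names and the `ic50_view` table name are assumptions made for the example.

```python
def build_filter_query(filters):
    """Build a parameterized SELECT from optional filters.

    Only keys with a non-empty value contribute a condition.
    """
    columns = {  # hypothetical filter name -> (column, operator)
        "treatment": ("treatment", "="),
        "type": ("type", "="),
        "line": ("cell_line", "="),
        "unit": ("unit", "="),
        "ic50_24_min": ("ic50_24", ">="),
        "ic50_24_max": ("ic50_24", "<="),
        "ic50_72_min": ("ic50_72", ">="),
        "ic50_72_max": ("ic50_72", "<="),
    }
    conditions, params = [], []
    for key, (column, op) in columns.items():
        value = filters.get(key)
        if value is None or value == "":
            continue  # unspecified parameter: no filtering on this column
        conditions.append(f"{column} {op} ?")
        params.append(value)
    sql = "SELECT * FROM ic50_view"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params

# No filter selected: all rows are returned.
print(build_filter_query({}))
# Two filters selected, one left empty.
print(build_filter_query({"line": "HCT-116", "ic50_24_max": 100, "type": ""}))
```

Passing the values as query parameters rather than concatenating them into the SQL string keeps user input from the text boxes separate from the query itself.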
Figure 6 shows the statistical panel, which consists of 3 main subsections: 1) Single sample, 2) Super or compound sample and 3) ANOVA statistics. A single sample for the ANOVA test can be defined by any possible combination of filtering parameters. The sample name is automatically generated from the selected filter parameters, with an optional sample name prefix, and is stored in a sample table together with the selected parameter values. Two or more single samples can be compared by the ANOVA test.
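Automatic sample naming could work along the following lines. The naming scheme below is a guess made for illustration (the exact rule used by the application is not specified), and the function name and filter keys are hypothetical.

```python
import re

def sample_name(filters, prefix=""):
    """Join the selected (non-empty) filter values into a lowercase identifier."""
    parts = [prefix] if prefix else []
    for value in filters.values():
        if value is None or value == "":
            continue  # unselected parameters do not appear in the name
        parts.append(re.sub(r"[^a-z0-9]+", "", str(value).lower()))
    return "_".join(parts)

name = sample_name({"type": "CHS", "line": "HCT-116", "hours": 24})
print(name)
```

Deriving the name from the filter values makes a sample self-describing when it is later picked from the sample table for an ANOVA test.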

RESULTS
All data for the mean, standard error and significance of the investigated samples were obtained on the same principle as described in the following example. The IC50 values for CHS treatments on the HCT-116 cell line after 24 h, whose mean value is presented in Table 1, are obtained from the descriptive statistics in the web application from the LCMB IC50 database (Fig. 10). The p value of 0.3517 is obtained from the ANOVA test named hct116_24_72, which compares the two samples for CHS on HCT-116 after 24 and 72 h (Fig. 12). The column Sample title contains the names of all tested samples, with the obtained Sig (significance) value of 0.3517, which is the p value in Table 1. The ANOVA test for CHS treatments between cell lines after 24 h is shown in Figure 13. It can be seen that four samples are compared, one for each cell line. The obtained Sig is 0.0031, which is the p value for CHS between cell lines after 24 h (Tab. 1).

Figure 13. ANOVA test for CHS treatments between cell lines after 24 h.
Analysis of variance for the cytotoxicity of BAS was performed separately for each unit in which the IC50 values are expressed. The IC50 values differ according to their origin: those for substances isolated from natural sources (plants, lichens and fungi) are expressed in μg/ml and those for chemically synthesized substances in μM, which is why statistical significance was not examined between these two samples. Basic descriptive statistics and the significance values (p) obtained by the ANOVA test from the previously presented database are shown in Table 1. Analysis of variance shows that there is a statistically significant difference in the cytotoxicity of CHS treatments after 24 h between different cell lines (p = 0.0031). The MRC-5 cell line is the most sensitive to CHS treatments and the most resistant to NE treatments. MDA-MB-231 is more resistant to CHS treatments compared to the other cell lines. Analysis of variance for the cytotoxicity data of CHS and NE treatments after 72 h shows that there is no significant difference between cell lines. These results indicate a similar sensitivity of the cell lines under the selected treatment conditions after the longer incubation time. Table 1 shows that there is no significant difference within cell lines between the incubation times of 24 and 72 h, indicating that the treatments act cytotoxically at a constant level throughout the selected treatment periods. The exception is MDA-MB-231 cells, where CHS treatments show stronger cytotoxic activity over time. Analysis of variance shows that there is a statistically significant difference between the sources of NE (plants, fungi, lichens) on the HCT-116 cell line after 24 h (p = 0.0087) and 72 h (p = 0.0019) of exposure (Tab. 2). These results indicate that plants show the best cytotoxic effect on the HCT-116 cell line after both 24 h and 72 h of incubation. Table 2 shows that there is no significant difference between the tested treatments for the SW-480 cell line after 24 and 72 h.
Mean values show slightly stronger cytotoxic activity of plants after 24 h and of lichens after 72 h, indicating different time-dependent effects of these types of treatment on SW-480 cells. There is no statistically significant difference in any of the tested groups between the 24 and 72 h incubation times. In Table 2, the results are presented as mean ± standard error; p(a) is the statistical difference between sources of natural extracts; p(b) is the statistical difference between 24 and 72 h; * marks a statistically significant difference (p<0.05). Table 3 presents basic descriptive statistics and analysis of variance for NE IC50 values in the treatment of HCT-116, SW-480, MDA-MB-231 and MRC-5 cells after 24 and 72 h in relation to the type of extract (acetone, methanol and others) used for extraction of BAS. Based on the significance results between the investigated types of extracts, it can be concluded that all extracts generally isolate the BAS adequately and exhibit similar activity on the HCT-116 cell line, since no significant difference in cytotoxicity was shown depending on the type of extract. Analysis of variance shows that there is a statistically significant difference in NE IC50 values between the types of extracts (acetone, methanol and others) on the SW-480 cell line after 24 h of exposure (p = 0.0148) (Tab. 3). Acetone extracts show stronger cytotoxicity on SW-480 cells than the others. There is no statistically significant difference in any of the tested groups between the 24 and 72 h incubation times. In Table 3, the results are presented as mean ± standard error; p(a) is the statistical difference between types of extracts; p(b) is the statistical difference between 24 and 72 h; * marks a statistically significant difference (p<0.05).

DISCUSSION
There is a very limited number of studies that analyze and group database records on the cytotoxic effect of bioactive substances. In the following, in order to examine the different effects of cytotoxicity, we discuss whether and why there are significant differences in the action of chemically synthesized BAS and natural extracts on the HCT-116, SW-480, MDA-MB-231 and MRC-5 cell lines, as well as between the treatment incubation periods of 24 and 72 h. In addition, the results are discussed to investigate the difference in the cytotoxic effect of BAS isolated from plants, fungi or lichens depending on the cell line and incubation period, as well as on the type of extract by which they were isolated. Based on the results obtained for the cytotoxicity of NE treatments on HCT-116 cells after 24 h between the types of extract (methanol, acetone and others), it can be concluded that all the extracts generally adequately isolate BAS that exhibit similar activity. Comparing the mean IC50 values on HCT-116 cells after 72 h, a stronger effect is shown by the acetone extract compared to the methanolic and other extracts. For SW-480 cells over the 24 and 72 h incubation periods, a stronger cytotoxic effect was shown by the acetone extract compared to the methanolic and other extracts. GUPTA et al. (2012) showed that acetone and methanol are the most effective solvents for the isolation of BAS such as flavonoids and others. Considering the results shown above, it can be assumed that acetone extracts additional compounds that can give better effects on SW-480 cells. All types of extracts show time-dependent cytotoxicity on HCT-116, SW-480, MDA-MB-231 and MRC-5 cells.

CONCLUSION
In this study, the newly created LCMB IC50 database provides a useful way to store and compare results by different categories: the origin of the bioactive substances, the cell line tested, the origin of the cell line, the period of incubation, etc. Using the LCMB IC50 database, the results were processed and presented in a new way. This kind of data processing has proven useful for further review of databases of cytotoxic substances, and it can be helpful in the selection and preparation of new BAS, as well as for the prediction of their effect in future investigations. The database also makes it possible to highlight and select BAS with noticeable cytotoxic activity, and it supports detailed analysis, the development of effective anticancer substances and the identification of BAS with more noticeable selectivity against one type of cells, for example, cancer vs normal. Statistical processing of previously observed results can indicate the most applicable model system or cell line for a chosen treatment, the most applicable cell line for investigating cancer cell resistance to a treatment after a longer time of exposure, and which type of extract was more effective for natural substances. Using the web application for statistical analysis not only significantly improved the speed of analysis but also eliminated potential errors in defining samples and compound samples once they were selected and added to the ANOVA statistical test. Simultaneous use of the web application by many researchers is possible, which contributes to research efficiency by making the analysis of research results independent of time and place. Although the LCMB IC50 database is currently restricted to logged-in users and contains only data obtained in the LCMB, it has the potential to become a global platform for depositing and accessing such results in future updates, while meeting the requirements for adequate data protection.