COMPARATIVE ANALYSIS OF HUMAN DNA EXTRACTION METHODS AND MITOCHONDRIAL DNA HV1 AND HV2 HAPLOGROUP DETERMINATION

High integrity of extracted DNA is necessary for the determination of human haplogroups based on mitochondrial DNA (mtDNA) marker sequences. The aims of this study were to isolate total DNA from selected human samples and to establish a protocol that gives the best yield and adequate purity of DNA for further analysis. Furthermore, the human hypervariable regions HV1 and HV2, located on mtDNA, were sequenced to define human haplogroups. For extraction, samples of buccal mucosa were taken from ten human volunteers and three different protocols were used: a method with ammonium acetate, a salting-out method and the GeneJET Genomic DNA Purification Kit (Thermo Fisher Scientific). Concentration and purity of DNA were measured on a BioPhotometer. Sequencing was performed by the Sanger method on a 3130 Genetic Analyzer (Applied Biosystems) with the commercial BigDye™ Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems). Sequences were processed using Chromas and MEGA software and compared with sequence databases using the BLAST tool at NCBI. Haplogroups and mutations were determined in the MITOMAP online database. The extraction results indicated that the Purification Kit gave the highest concentration of DNA (11.86±0.63) but low purity (1.17±0.02), while the method with ammonium acetate gave the best purity (1.67±0.17). Sequencing of DNA extracted with the commercial kit yielded five different human haplogroups: U2e, T1a, J1c, H2a and K. In conclusion, it is recommended to use the Purification Kit for obtaining high integrity DNA that can be used for effective sequencing. Determination of haplogroups and haplotypes can reveal our ancestry and has important implications in molecular systematics, phylogenetics and phylogeography.


INTRODUCTION
The genome is the most studied model system in molecular biology, genetics and forensic research. Every field of study and every downstream test requires DNA extraction, and the many available methods vary in processing time, yield, purity of isolated DNA and cost. DNA yield and purity are the values that define the efficiency of a DNA extraction method (BOESENBERG-SMITH et al., 2012).
Extracted DNA can be used to form a DNA biobank for safe long-term storage of genomes. DNA is usually stored in a chamber at -80°C or in a liquid nitrogen container at -196°C for permanent cryopreservation (PASKAL et al., 2018). As it is stored at a very low temperature, the quality and integrity of extracted DNA are crucial for DNA biobanking (COPPOLA et al., 2019). Biobanks open many opportunities in biomedical research (large collections and databases, comparative analyses, conservation biology, preserving biodiversity and population genetic research, repeated testing over the years), as well as in biological anthropology (questions related to human evolution, population genetics, genealogy, the origin of modern humans) and molecular systematics (defining phylogenetic trees, haplogroups and haplotypes) (BRANKOVIĆ et al., 2014; DROEGE et al., 2014).
Mitochondrial DNA (mtDNA) is widely used in forensic analysis, human ancestry and genealogy, population genetics, and evolutionary and molecular systematics research. Compared to nuclear DNA, mtDNA has several advantages in studies of this kind, such as lack of recombination and repair mechanisms, presence in hundreds to thousands of copies per cell, small size and circular form, matrilineal inheritance ( It allows using these regions in defining the mtDNA haplogroups. A third hypervariable region (HV3, position 438-574 bp) and the coding region of mtDNA, with additional polymorphic positions, can be useful for more precise determination of haplogroups in a specific population (DAVIDOVIĆ, 2018). An important human mtDNA online database is MITOMAP. Its Mitomaster analysis tool allows identification of polymorphic positions, calculation of variant statistics and haplogroup prediction. A combination of alleles for different polymorphisms that occur on the same chromosome and are inherited through a single parent represents a specific haplotype. A haplogroup is a group of haplotypes that share the same origin; it is characterized by a combination of single nucleotide polymorphisms (SNPs) that are inherited together (AMORIM et al., 2019). According to AMORIM et al. (2019), more than 90% of individuals in European populations are categorized into 10 main haplogroups: H, I, J, K, M, T, U, V, W and X.
The main goal of this study was to establish a protocol that gives the concentration, purity and integrity of DNA required for further haplogroup analysis using Sanger sequencing, while remaining economical and easy to follow. In addition, the determination of mtDNA haplogroups can help reveal our ancestry and genealogy.

Sampling and extraction of DNA from buccal mucosa cells
DNA extraction and PCR analysis were performed in the Laboratory for Cell and Molecular Biology (Laboratory), Faculty of Science, University of Kragujevac. Sequencing was performed at the Veterinary Specialized Institute of Kraljevo.
For the extraction of the human genome, buccal mucosa cells were sampled from ten human volunteers. All volunteers were members of the Laboratory team. They participated in preliminary research that aimed to introduce a methodology for DNA isolation, sequencing and haplogroup analysis and to define the procedure for establishing biobanks (the procedure is ongoing). Volunteers signed an informed consent form for participation in the study, which is part of the Laboratory documentation.
All participants rinsed their mouths with water for 30 s. Both sides of the buccal mucosa were rubbed for 25 s with three different cotton swabs, one per protocol. Buccal mucosa cells were resuspended from swabs in 200 µL of a suitable solvent for each protocol.

Protocol 3 (Purification Kit)
In protocol 3, extraction was carried out in column tubes with silica membranes, following the instructions of the GeneJET Genomic DNA Purification Kit (Thermo Fisher Scientific).

Concentration and purity evaluation by spectrophotometry
Concentration and purity of extracted DNA were measured on a spectrophotometer (Eppendorf BioPhotometer); purity was expressed as the A260/A280 absorbance ratio. Values in the range of 1.8-2.0 were considered to represent pure DNA without RNA or protein contamination (GALLAGHER, 1998).
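The purity criterion above can be illustrated with a minimal sketch. The function names are hypothetical, the absorbance readings are invented for illustration, and the conversion factor of about 50 µg/mL per A260 unit for double-stranded DNA (1 cm path length) is a standard assumption that is not stated in the text. The BioPhotometer reports these values directly, so this only shows the arithmetic behind them.

```python
# Assumption: 1 A260 unit of double-stranded DNA corresponds to about 50 µg/mL
# at a 1 cm path length (standard factor, not taken from the study).
DSDNA_FACTOR_UG_PER_ML = 50.0
PURE_RANGE = (1.8, 2.0)  # A260/A280 range treated as pure DNA in the text


def dna_concentration(a260, dilution_factor=1.0):
    """Estimate dsDNA concentration (µg/mL) from absorbance at 260 nm."""
    return a260 * DSDNA_FACTOR_UG_PER_ML * dilution_factor


def purity_ratio(a260, a280):
    """Return the A260/A280 ratio used as the purity measure."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return a260 / a280


def classify_purity(ratio, pure_range=PURE_RANGE):
    """Place a ratio below, within or above the accepted purity range."""
    low, high = pure_range
    if ratio < low:
        return "below range"
    if ratio > high:
        return "above range"
    return "within range"


# Hypothetical readings, not measurements from the study.
a260, a280 = 0.24, 0.20
ratio = purity_ratio(a260, a280)
print(dna_concentration(a260), round(ratio, 2), classify_purity(ratio))
```

A ratio below the range, as in this invented example, is the situation the text associates with contamination of the extract.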

Statistical analysis
Results were statistically processed in SPSS for Windows, version 17 (2008) statistics software (SPSS Inc., Chicago, IL, USA). Repeated-measures ANOVA with post hoc LSD (least significant difference) tests was used to compare the studied protocols. Data are presented as mean ± standard error (SE). Results were considered statistically significant at p<0.05.
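As a small illustration of how the mean ± SE values are obtained, the sketch below computes the standard error as the sample standard deviation divided by the square root of the number of samples. It uses only the Python standard library rather than SPSS, the helper names are hypothetical, and the input values are invented, not data from the study. The repeated-measures ANOVA itself is not reproduced here.

```python
import math
import statistics


def mean_se(values):
    """Return (mean, standard error of the mean) for a list of measurements."""
    if len(values) < 2:
        raise ValueError("at least two values are needed to estimate the SE")
    mean = statistics.mean(values)
    # SE = sample standard deviation / sqrt(n)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean, se


def format_mean_se(values, digits=2):
    """Format measurements in the 'mean ± SE' style used for the results."""
    mean, se = mean_se(values)
    return f"{mean:.{digits}f} ± {se:.{digits}f}"


# Hypothetical readings for one protocol, not data from the study.
example = [1.0, 2.0, 3.0, 4.0, 5.0]
print(format_mean_se(example))
```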

PCR amplification
In the DNA extraction step described above, more samples were included and several protocols were performed. This was the initial step, after which samples were assessed and selected for further, more specific analyses (PCR amplification and sequencing). As the GeneJET Genomic DNA Purification Kit (Thermo Fisher Scientific) gave the highest and most constant DNA yield in all samples, we chose DNA obtained with the purification kit for sequencing. Given the preliminary aim of the study, the specificity of the methodology and economic factors, we chose the first five samples (H1-H5) of extracted DNA for specific analysis.
Two regions of the D-loop (HV1 and HV2) of human mtDNA were first amplified by polymerase chain reaction (PCR). The conventional primers used for amplification are presented in Table 1. Each band was cut from the agarose gel, and samples were extracted and purified following the manufacturer's protocol for the GeneJET Gel Extraction Kit (Thermo Fisher Scientific).

Sequencing of human HV1 and HV2 regions of mtDNA
The cycle sequencing reaction was performed on an Eppendorf Mastercycler gradient thermal cycler using the BigDye™ Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems) with the same primers that were used for PCR amplification (Table 1). Two sequencing reactions were performed for each region, one with the forward (F) primer and the other with the reverse (R) primer, for more accurate results. The sequencing reactions contained the following components: 4 µL Big Dye T, 2 µL Big Dye buffer, 0.1 µL of primer, 7 µL PCR clean H2O and 7 µL of amplified and purified HV1/HV2 regions.
Sequencing conditions:
Initial activation: 96°C, 1 min
25 cycles:
Denaturation: 96°C, 10 s
Annealing: 50°C, 5 s
Elongation: 60°C, 4 min
Purification of cycle sequencing reaction products was carried out using the BigDye XTerminator™ Purification Kit (Applied Biosystems). Capillary electrophoresis was performed on a 3130 Genetic Analyzer (Applied Biosystems).
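The reaction setup and cycling program listed above can be written down as a small data structure, which makes it easy to check the total reaction volume and the programmed run time. This is only a sketch: the variable and function names are hypothetical, the component labels are paraphrased from the text, and ramping time between temperatures is ignored.

```python
# Reaction components as listed in the text (volumes in µL).
REACTION_UL = {
    "BigDye Terminator mix": 4.0,
    "BigDye buffer": 2.0,
    "primer": 0.1,
    "PCR-clean water": 7.0,
    "purified HV1/HV2 amplicon": 7.0,
}

# Cycle sequencing program from the text: (step, temperature in °C, seconds).
INITIAL = ("initial activation", 96, 60)
CYCLE = [
    ("denaturation", 96, 10),
    ("annealing", 50, 5),
    ("elongation", 60, 240),
]
N_CYCLES = 25


def total_volume(components):
    """Total reaction volume in µL."""
    return sum(components.values())


def program_hold_time_s(initial, cycle, n_cycles):
    """Sum of programmed hold times in seconds (ramping is ignored)."""
    return initial[2] + n_cycles * sum(step[2] for step in cycle)


seconds = program_hold_time_s(INITIAL, CYCLE, N_CYCLES)
print(f"volume: {total_volume(REACTION_UL):.1f} µL")
print(f"hold time: {seconds} s ({seconds / 60:.2f} min)")
```

With the values from the text, the components add up to 20.1 µL per reaction and the hold times to 6435 s, a little over 107 minutes before ramping is taken into account.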

Sequence analysis
Raw sequences obtained from DNA sequencing of hypervariable regions, HV1 and HV2, of human mtDNA were processed using Chromas software, version 2.6.6 (Technelysium Ltd.). Regions were sequenced in both directions (forward and reverse) to compare forward and reverse sequences of the same sample. Molecular Evolutionary Genetics Analysis (MEGA) software, version 6.06 was used to align F and R sequences and obtain more accurate results.
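The idea of checking a forward read against a reverse read can be sketched as follows. This is a simplified stand-in, not what MEGA does: it assumes both reads have already been trimmed to exactly the same region and contain no insertions or deletions, so no real alignment is performed. The function names and toy sequences are hypothetical.

```python
# Translation table for complementing DNA bases (N stays N).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def consensus(forward, reverse_read):
    """Combine a forward read with the reverse complement of the reverse read.

    Positions where the two reads disagree are reported as 'N'.
    Assumes equal-length reads covering the same region (no indels).
    """
    forward = forward.upper()
    rc = reverse_complement(reverse_read)
    if len(forward) != len(rc):
        raise ValueError("reads must cover the same region at equal length")
    return "".join(f if f == r else "N" for f, r in zip(forward, rc))


# Toy reads, not sequences from the study.
print(consensus("ACCGTA", "TACGGT"))  # reads agree at every position
print(consensus("ACCGTA", "TACGGA"))  # reads disagree at the first position
```

Marking disagreements as N keeps uncertain positions visible, so they can be rechecked in the chromatogram instead of being silently resolved.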
The MitoMaster tool of the human mitochondrial genome database MITOMAP (https://www.mitomap.org/MITOMAP) was used to determine polymorphic sites in the processed mtDNA sequences by comparing them with the revised Cambridge Reference Sequence, and for haplogroup prediction. The Basic Local Alignment Search Tool (BLAST) of the National Center for Biotechnology Information (NCBI) was used to compare the obtained sequences with sequence databases (https://blast.ncbi.nlm.nih.gov/Blast.cgi).
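Conceptually, finding polymorphic sites means walking along a sample sequence and its reference and recording every position where they differ. The sketch below shows this for substitutions only. It is not the MitoMaster implementation: the function name, the toy sequences and the start position are hypothetical, the two sequences are assumed to be already aligned at equal length, and insertions and deletions are not handled.

```python
def call_substitutions(reference, sample, start_position):
    """List substitutions between an aligned reference and sample.

    Each variant is written as <reference base><position><sample base>,
    with positions counted from start_position. Ambiguous sample bases
    ('N') are skipped. Indels are outside the scope of this sketch.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    variants = []
    pairs = zip(reference.upper(), sample.upper())
    for offset, (ref_base, base) in enumerate(pairs):
        if base == "N" or base == ref_base:
            continue
        variants.append(f"{ref_base}{start_position + offset}{base}")
    return variants


# Toy sequences and a made-up start position, not real mtDNA coordinates.
reference = "ACGTACGT"
sample = "ACGTGCGT"
print(call_substitutions(reference, sample, start_position=100))
```

The resulting list of variants is the kind of input from which a haplogroup is then predicted by matching against known haplogroup-defining positions.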

Evaluation and statistical analysis of extracted DNA
Total DNA yield and purity values of the DNA extracted from human buccal swab cells were used for statistical analysis and comparison (Table 2). For protocol 1, the concentration and purity of extracted DNA varied between samples. The mean concentration and purity values were 2.42 µg/mL and 1.67 A260/280, respectively. Although the mean value was close to the optimal purity range, no individual sample had optimal purity; purity was very variable and ranged from 1.15 to 2.81 A260/280. Compared with protocol 1, the concentration and purity values of DNA extracted using protocol 2 were more stable and the yield was higher, but purity in all samples was lower than optimal. Protocol 3 gave a constant, high yield but low purity of extracted DNA compared with protocols 1 and 2. The results indicate the presence of protein, RNA or other contaminants in DNA extracted with the protocols used, implying that extraction needs further purification steps.
Statistical analysis showed a significant difference in the concentration of extracted DNA between protocol 3 and the other protocols. The commercial purification kit gives the highest concentration because extraction is performed in column tubes with a silica-based membrane that binds DNA.

Sequencing analysis of human HV1 and HV2 regions of mtDNA
Human hypervariable region 1 and hypervariable region 2, located in the non-coding region of mtDNA, are commonly used for the determination of the mtDNA haplogroup and haplotype of an individual. The obtained sequences were processed using Chromas and MEGA software and compared with the revised Cambridge Reference Sequence using the BLAST tool at NCBI. Predicted haplogroups were determined using the MITOMAP program. Sequencing results showed differences in nucleotide sequences among the five analyzed individuals. Each individual belongs to a different haplogroup, as presented in Table 3. There were 16 polymorphic sites in the HV1 region and 13 in the HV2 region. The mutations were mostly nucleotide substitutions, but deletions and insertions were also found.
[Table 3: predicted haplogroups of samples H1-H5; table body not recovered, surviving fragment: H2a (H2a2a), K. H1-5: human swab samples. Sample numbering corresponds to Table 2.]

DISCUSSION
The focus of the study was to establish the most efficient methodology for DNA extraction from human buccal swab samples, primarily in terms of DNA yield and purity, as extraction is the most critical step in many fundamental molecular biology, genetic and forensic applications. Additionally, the protocol should be easy to perform, economical and time efficient. Total DNA obtained using protocol 1 and measured by spectrophotometry ranged from 0.6 to 4.5 μg/mL, which is a much lower concentration range than that obtained by AIDAR and LINE (2007), who established the method. In the studies by GARBIERI et al. (2017) and KÜCHLER et al. (2012), which used the same extraction method, the obtained concentrations were also higher (5-173 μg/mL). The reason for this difference could lie in sampling, since they all collected saliva from individuals. This method is simple to follow and is not expensive. Its disadvantage is that the obtained concentration and purity values vary between samples; this variability agrees with the results of GARBIERI et al. (2017) and AIDAR and LINE (2007). Compared with protocol 2 and protocol 3, this method gave the best mean purity value of DNA, which may be a result of using ammonium acetate to eliminate proteins. Protocol 2 gave a higher and more constant yield, but the purity in all samples was lower than optimal. This method is used for DNA extraction from many different sample types and gives acceptable yield and purity for use in PCR (JAVADI et al., 2014; RIVERO et al., 2006). Protocol 3, the commercial kit, gave a constant and high yield, but purity was lower than expected. This result agrees with CLER et al. (2006), in which the performance of three commercial kits was superior to that of two manual methods, and two of the three purification kits did not reach the proper purity range.
In contrast, SHOKRZADEH and MOHAMMADPOUR (2018) obtained high DNA concentration and purity from blood samples with both methods, indicating the importance of considering the type of sample when selecting the optimal DNA extraction method. According to GALLAGHER (1998), purity values lower than 1.8 A260/280 indicate protein contamination, because proteins have peak absorption at 280 nm, which reduces the A260/A280 ratio.
The HV1 and HV2 regions of mtDNA are used in many studies of genealogy and ancestry, as described in a recent review article by AMORIM et al. (2019). We sequenced these two regions in five human individuals and obtained predicted haplogroups using the MitoMaster tool of the human mitochondrial genome database (MITOMAP). DNA obtained with the GeneJET Genomic DNA Purification Kit (Thermo Fisher Scientific) (protocol 3) was further processed for sequencing, mostly because it gave the highest and most constant yield in all samples. Protocol 1, which had a mean purity value close to the optimal range, was not chosen for further analyses because the values in individual samples were very variable. We chose five samples of extracted DNA for specific analyses because this was preliminary research aimed at setting up a methodology, and because of the specificity of the methodology itself and the economic factor. Five different haplogroups were obtained, and each was classified into a particular (sub)haplogroup, except the K haplogroup, which could not be classified further by sequencing only these two regions and needs additional analysis of coding-region mtDNA polymorphic sites or the HV3 region. All haplogroups found in our samples (U2e, T1a, J1c, H2a and K) were previously detected by DAVIDOVIĆ (2018), who sequenced the HV1 and HV2 regions from 172 samples of the Serbian population. There are a few more research papers that refer to mtDNA analysis of the Serbian population ( . It is important to emphasize that the yield and purity of DNA extracted with the commercial kit were sufficient for effective PCR amplification and sequencing. In this preliminary study, the complete procedures for sampling and extraction of human genomic DNA, PCR amplification of mtDNA sequences and sequencing were established. Procedures for processing and analyzing the obtained sequences and for determining human mtDNA haplogroups were adopted.
Prerequisites (expert, technical, informed consent) for the permanent preservation of the sampled DNA material and data protection are also defined. Established methodology for extraction and analysis of genomic sequences is essential for the determination of haplogroups and haplotypes that can reveal our ancestry and have important implications in molecular systematics, phylogenetics and phylogeography.