A Hierarchical Approach of Hybrid Image Classification for Land use and Land cover Mapping

Remote sensing data analysis can provide thematic maps describing land use and land cover (LULC) in a short period. Choosing a proper image classification method for an area is important for overcoming the limitations of satellite imagery when producing LULC maps. In the present study, a hierarchical hybrid image classification method was used to produce LULC maps from Landsat Thematic Mapper (TM) imagery for 1998 and Operational Land Imager (OLI) imagery for 2016. The proposed hybrid method combined a vegetation cover crown percentage map derived from the normalized difference vegetation index, Fisher supervised classification, and object-based image classification. Accuracy assessment showed that the hybrid method produced maps with an overall accuracy of 84 percent and a kappa statistic of 0.81 for TM, and 91 percent and 0.87 for OLI. The proposed method therefore worked better with the OLI sensor than with TM. Although OLI has a higher radiometric resolution than TM, the LULC map produced from TM is nearly as accurate as the one from OLI, which is attributed to the LULC definitions and the image classification methods used.


Introduction
Satellite data are often used to prepare land-use and land-cover maps (Chrysoulakis et al., 2010;Lakshmi et al., 2015). Selecting a proper land-use classification method is crucial in many inventories, especially in the uplands of watersheds, which are usually the water sources for wetlands (Anderson, 1976;Purkis et al., 2006;Mie et al., 2015;Tian et al., 2015). When satellite image data are used to produce LULC maps, it is often very difficult to identify spectrally unique land-use/land-cover classes because different features give similar spectral responses (Roy et al., 2014;Knudbya et al., 2014;Estoque & Murayama, 2015;Lakshmi et al., 2015). Several methods can be employed to produce LULC maps from remote sensing data (Purkis & Klemas, 2011;Lakshmi et al., 2015;Al-doski et al., 2013). However, when land surface objects have similar reflectance or cover a small area, most of these methods cannot provide highly accurate maps (Gao & Xu, 2016). With imagery of low radiometric resolution, land classification can be a serious challenge because of the spectral mixing of different surface elements and the complexity of the landscape (Julien et al., 2011;Stenzel et al., 2016).
In such cases, applying a hybrid classification approach in a hierarchical way can produce better land-use/land-cover maps (Lakshmi et al., 2015). In this approach, land-use/land-cover maps are produced by combining different methods, such as unsupervised, supervised, and object-based classification, with different indices derived from satellite images (Anderson, 1976;Homer et al., 2004;Di Gregorio, 2005;Disperati et al., 2015;Misra & Balaji, 2015;Lakshmi et al., 2015).
The objective of this study is to develop a hybrid classification method for preparing accurate land-use/land-cover maps even when imagery with lower radiometric resolution is used.

Methods and data
The study area
The study area was the Pelasjan sub-basin, which includes the western part of the Gavkhooni watershed in central Iran and covers approximately 412,999 hectares. The Zayandehrood is the major river in the Gavkhooni watershed, and the Pelasjan sub-basin contributes the largest share of its water. The Gavkhooni wetland is located in the eastern part of the Gavkhooni watershed and is the terminal basin of the Zayandehrood River. The average temperature of the Pelasjan sub-basin is 8-13 °C, with 400-1250 mm of precipitation. Agriculture and animal husbandry are the main activities of the people living in this area. Figure 1 shows the location of the Zayandehrood River Basin and of the Pelasjan sub-basin in the western part of the Gavkhooni watershed in Iran.

Dataset
The Operational Land Imager (OLI) and Thematic Mapper (TM) sensors are carried on Landsat satellites and are useful in natural resources studies. Landsat 8 acquires 11 bands in total: OLI measures nine bands in the visible, near-infrared, and shortwave-infrared portions of the spectrum, and the accompanying Thermal Infrared Sensor (TIRS) adds two thermal bands. TM has 7 bands in the visible and infrared wavelengths. The temporal resolution of both TM and OLI is 16 days. Because the study area lies between two Landsat paths, 164 and 165, two images were downloaded from the USGS website for each date. Because the highest vegetation cover occurred in June and August, satellite images were downloaded for those months. Table 1 shows the satellite data selected for this study. In addition, aerial images, a digital elevation model, and 1:25,000 topographic maps were used to better understand the conditions in the study area.

Field studies
Field studies were conducted to collect training areas for each LULC class to be used in the image classification. The positions of agricultural lands were determined with GPS. To check the vegetation cover crown percentage (VCCP), 270 plots of 7 × 3 meters were measured.
In this study, samples for each LULC class were collected with the spatial resolution of the imagery (30 m) in mind, and they were taken in homogeneous areas of that class. Accordingly, samples were taken in homogeneous areas located more than 30 meters from the class margins. By avoiding the mixed reflectance of marginal land use/land cover, almost pure reflectance samples were obtained for each LULC class.
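The margin rule above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the procedure used in the field: it assumes a hypothetical reference grid of class labels at 30 m resolution (`class_map`) and treats a pixel as a candidate pure sample when all eight of its neighbours carry the same label, so that it lies at least one pixel away from any class margin. The function name `interior_pixels` is introduced here for illustration.

```python
def interior_pixels(class_map):
    """Return (row, col) positions whose 8 neighbours all share their class.

    class_map is a hypothetical 2D list of class labels on a 30 m grid.
    Pixels on the edge of the grid are skipped because their full
    neighbourhood is unknown.
    """
    rows, cols = len(class_map), len(class_map[0])
    interior = []
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            label = class_map[r][c]
            if all(class_map[r + dr][c + dc] == label
                   for dr in (-1, 0, 1) for dc in (-1, 0, 1)):
                interior.append((r, c))
    return interior


# Hypothetical 4 x 4 grid: class "A" with a strip of class "B" on the right.
example_map = [["A", "A", "A", "B"] for _ in range(4)]
candidates = interior_pixels(example_map)
```

Pixels in the column next to the "B" strip are rejected because their neighbourhood touches the margin, which mirrors the intent of keeping samples away from mixed pixels.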
Because there were not enough field data for the 1998 image, the VCCP of each recorded plot for that year was estimated using topographic maps and aerial images, by comparing the false color composites (FCC) and NDVI values of the TM and OLI images, and by drawing on the field studies.

LULC classification
Based on the available data and field studies, 7 LULC classes were defined for the study area (Table 2).

Satellite image Pre-processing
The Earth's atmosphere is a mixture of gases and liquid and solid particles, most of which are optically active and cause absorption, diffusion, and scattering. The signal measured at the satellite is the radiation emerging from the Earth surface and atmosphere system in the sensor's observation direction. The radiance measured at the sensor is known as Top of Atmosphere (TOA) radiance. Atmospheric correction aims to convert the TOA radiance of objects into near-surface reflectance (Lakshmi et al., 2015). Atmospheric correction was done using the Fast Line-of-sight Atmospheric Analysis of Spectral Hypercubes (FLAASH) algorithm in ENVI 5.1. FLAASH was developed to provide an accurate, physics-based derivation of atmospheric properties. It includes correction for the adjacency effect, cirrus and opaque cloud classification, and adjustable spectral polishing for artifact suppression (Jia et al., 2014;Lakshmi et al., 2015).

Satellite image processing - Hierarchical image classification
For image processing, the three-level conceptual model of the earth's surface matrix shown in Figure 2 was applied to both the TM and OLI data.

First step
At this step, the lowest cover crown percentage of rain-fed agriculture, 50%, was taken as the threshold for separating it from sparse rangeland and other LULC classes. To map VCCP, the NDVI index was used as follows (Equation 1) (Mukherjee, 2004;Oldeland et al., 2010;Peña & Brenning, 2015):
\[ \mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{Red}}{\mathrm{NIR} + \mathrm{Red}} \qquad (1) \]
where NIR and Red are the reflectances in the near-infrared and red bands. A simple linear regression was fitted for each image, with the VCCP measured at the sample plots as the dependent variable and the corresponding NDVI values as the independent variable. Using the fitted VCCP models, VCCP maps were prepared with two classes: dense vegetation and sparse vegetation. Field control and comparison with the FCC image showed some mixing between cultivated areas (especially rain-fed agriculture) and dense rangeland. In this step, drainage agriculture areas were separated more reliably.
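The first step can be sketched in Python as follows. This is a minimal illustration rather than the processing chain used in the study: the reflectance values and plot measurements are hypothetical, the function names are introduced here, and the fitted coefficients are not those of the study's VCCP models. Treating a predicted VCCP of exactly 50% as dense is also an assumption.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def classify_vccp(ndvi_value, a, b, threshold=50.0):
    """Predict VCCP (%) from NDVI and split it at the 50% threshold."""
    vccp = a + b * ndvi_value
    return "dense" if vccp >= threshold else "sparse"


# Hypothetical field plots: (NIR reflectance, red reflectance, measured VCCP %)
plots = [(0.30, 0.20, 20.0), (0.40, 0.15, 45.0),
         (0.45, 0.10, 65.0), (0.50, 0.08, 75.0)]
x = [ndvi(nir, red) for nir, red, _ in plots]
y = [vccp for _, _, vccp in plots]
a, b = fit_line(x, y)
labels = [classify_vccp(value, a, b) for value in x]
```

In practice the same regression would be fitted separately for the TM and OLI images, since the text describes one model per image.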

Second step
For the initial separation of rain-fed agricultural areas and dense rangeland, field studies and an overlay of the first-step vegetation map on the slope percentage image made it clear that a slope of 30 percent was the threshold between rain-fed areas and dense rangeland. In other words, there was no rain-fed cultivation on slopes above 30 percent in the mountain areas. By applying the 30 percent slope threshold to the first-step result, dense vegetation on slopes above 30 percent, which was mostly dense rangeland and forest, was separated from dense vegetation on slopes below 30 percent, which was mostly drainage and rain-fed agriculture (Figure 2).
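The slope rule of the second step can be sketched as below. The grids, label strings, and function name are hypothetical and serve only to illustrate the 30 percent threshold; how pixels with a slope of exactly 30 percent are handled is an assumption, since the text does not specify it.

```python
def split_by_slope(vegetation, slope_percent, slope_threshold=30.0):
    """Split dense vegetation pixels by slope; other pixels keep their label.

    vegetation: 2D list of "dense" / "sparse" labels from the first step.
    slope_percent: 2D list of slope values (percent) on the same grid.
    """
    result = []
    for veg_row, slope_row in zip(vegetation, slope_percent):
        out_row = []
        for veg, slope in zip(veg_row, slope_row):
            if veg != "dense":
                out_row.append(veg)
            elif slope > slope_threshold:
                # Mostly dense rangeland and forest.
                out_row.append("dense_steep")
            else:
                # Mostly drainage and rain-fed agriculture.
                out_row.append("dense_gentle")
        result.append(out_row)
    return result


# Hypothetical one-row example.
second_level = split_by_slope([["dense", "dense", "sparse"]],
                              [[45.0, 10.0, 60.0]])
```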

Third step
On the third level of the hierarchical model (Figure 2), four categories (drainage agriculture, rain-fed agriculture, forest, and dense rangeland) were considered as sub-classes of dense vegetation (>50% vegetation coverage). Three categories (residential areas, sparse rangeland, and land under water) were considered as sub-classes of low-density vegetation (<50%). Because agricultural lands have geometric shapes, they were classified as rain-fed or drainage agriculture with an object-based image classification method that takes both their reflectance and their shape into account, and they were then separated from the satellite data. The other LULC classes were classified with Fisher supervised image classification. The residential area map for TM was produced from a TM image of January 1998, when the land was completely covered with snow and only the residential areas were snow-free; residential areas were separated from this image with the Fisher classification method.
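As an illustration of the supervised part of this step, the sketch below implements a two-class, two-band Fisher linear discriminant in plain Python. It assumes that "Fisher supervised classification" refers to a classifier built on Fisher's linear discriminant; the multi-class, multi-band classifier used in the study is not reproduced here, and the training reflectances and function names are hypothetical.

```python
def mean_vector(samples):
    """Mean of two-band samples given as (band1, band2) tuples."""
    n = len(samples)
    return [sum(s[i] for s in samples) / n for i in range(2)]


def scatter_matrix(samples, mean):
    """2 x 2 within-class scatter matrix for two-band samples."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for sample in samples:
        d = [sample[0] - mean[0], sample[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fit_fisher(class_a, class_b):
    """Return (w, c): projection vector and decision threshold."""
    mean_a = mean_vector(class_a)
    mean_b = mean_vector(class_b)
    sa = scatter_matrix(class_a, mean_a)
    sb = scatter_matrix(class_b, mean_b)
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mean_a[0] - mean_b[0], mean_a[1] - mean_b[1]]
    # Fisher direction: inverse within-class scatter times the mean difference.
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Threshold: midpoint of the two projected class means.
    proj_a = w[0] * mean_a[0] + w[1] * mean_a[1]
    proj_b = w[0] * mean_b[0] + w[1] * mean_b[1]
    return w, (proj_a + proj_b) / 2.0


def predict(pixel, w, c, labels=("a", "b")):
    """Assign a pixel to the first label if its projection exceeds c."""
    score = w[0] * pixel[0] + w[1] * pixel[1]
    return labels[0] if score > c else labels[1]


# Hypothetical (red, NIR) training reflectances.
rangeland = [(0.20, 0.30), (0.22, 0.28), (0.18, 0.33), (0.21, 0.31)]
water = [(0.05, 0.02), (0.06, 0.03), (0.04, 0.02), (0.05, 0.04)]
w, c = fit_fisher(rangeland, water)
```

The midpoint threshold assumes equal prior weight for the two classes; a production classifier would also handle more than two classes and all available bands.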
Finally, all the individual layers were combined to produce the LULC maps.
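One simple way to combine individual class layers is a priority overlay, sketched below. The text does not state how the layers were merged, so the rule that the first non-empty label wins, the order of the layers, and the function name are all assumptions made for illustration.

```python
def combine_layers(layers):
    """Overlay class layers in priority order; the first non-empty label wins.

    layers: list of 2D lists on the same grid, where each cell holds a class
    label or None when that layer says nothing about the pixel.
    """
    rows = len(layers[0])
    cols = len(layers[0][0])
    combined = [[None] * cols for _ in range(rows)]
    for layer in layers:
        for r in range(rows):
            for c in range(cols):
                if combined[r][c] is None and layer[r][c] is not None:
                    combined[r][c] = layer[r][c]
    return combined


# Hypothetical 1 x 3 layers: object-based agriculture first, then Fisher classes.
agriculture = [["rain-fed", None, None]]
fisher_classes = [["dense rangeland", "forest", None]]
lulc = combine_layers([agriculture, fisher_classes])
```

Pixels left as None after the overlay would still need to be assigned by one of the classifiers.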

Maps accuracy assessment
For accuracy assessment, samples collected in the field studies were used. For the TM images, some areas were selected as samples on the basis of the field study results and a comparison of FCC images. The overall accuracy, Kappa coefficient, commission error, and omission error were then determined.
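These measures can be computed from an error matrix as in the following sketch. The matrix layout (rows are mapped classes, columns are reference classes), the example counts, and the function name are assumptions for illustration; the study's own matrices are those reported in its tables. The sketch also assumes every class has at least one mapped and one reference sample.

```python
def accuracy_metrics(matrix):
    """Compute accuracy measures from a square error matrix.

    matrix[i][j] is the number of samples mapped as class i (rows) whose
    reference class is j (columns).
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    diagonal = sum(matrix[i][i] for i in range(n))

    overall = diagonal / total
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(n)) / total ** 2
    kappa = (overall - expected) / (1 - expected)

    # Commission error: mapped as class i but belonging to another class
    # (1 - user's accuracy).
    commission = [1 - matrix[i][i] / row_totals[i] for i in range(n)]
    # Omission error: reference class i mapped as another class
    # (1 - producer's accuracy).
    omission = [1 - matrix[i][i] / col_totals[i] for i in range(n)]
    return overall, kappa, commission, omission


# Hypothetical two-class error matrix.
overall, kappa, commission, omission = accuracy_metrics([[45, 5], [10, 40]])
```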

Satellite image classification
In supervised classification methods, and especially in the Fisher classification method, accurate maps require samples that are truly pure samples of each land-use reflectance (Al-doski et al., 2013). Therefore, samples for each land use must be taken in areas far from the margins of a land-use/land-cover class. In this study, by sampling in homogeneous areas of each land use/land cover and keeping a distance of at least 30 meters from the margins (pixels on the border between two land uses/land covers), we obtained pure samples that truly represent a single LULC class. The image classification results showed that LULC classes with similar reflectance values in different bands have more errors. Moreover, small patches of isolated land cover can increase classification errors because of the influence of reflections from adjacent pixels. In their studies, Luna, Cesar (2003) (2015) mentioned that similarity between LULCs increases errors in image classification. Kamusoko and Aniya (2007) explained that the accuracy of classification depends on the degree of differentiation among the spectral reflections of the LULC classes. Figures 3a and 3b show the spectral signatures over the bands used for TM and OLI, respectively, as spectral response patterns of mean reflectance (b 1…n = band number).
As shown in Figures 3a and 3b, dense rangeland and rain-fed agriculture in particular follow almost the same reflectance pattern in all bands of the imagery. In these images, rain-fed agriculture, dense rangeland, and forest have almost the same reflectance trend. Residential areas have high reflectance in all bands, and water has the lowest reflectance from the near-infrared band onward.
In this study, a hierarchical LULC scheme was designed for satellite image classification in view of the similarity and complexity of the LULC classes. Disperati et al. (2015) designed 3 and 4 levels of land classes for satellite image classification, produced land maps at each level, and finally combined all the results to obtain the final LULC map.
In this study, the VCCP models were prepared using the NDVI index, a common and useful vegetation index for surveying different kinds of plants (Mukherjee, 2004;Oldeland et al., 2010;Jovanović et al., 2015). Formulae 2 and 3 show the vegetation cover crown percentage models, where Y is the vegetation cover crown percentage and X is the NDVI value.
NDVI values range from -1 to 1; lower values indicate lower VCCP, and higher values correspond to areas with more VCCP. In this study, initial image classifications showed that it was not possible to separate agricultural areas, especially rain-fed agriculture, from dense rangeland. In addition, NDVI classification based on the fitted models could not separate dense rangeland from agricultural areas at the first level. Field sampling and the overlay of FCC images on the slope map indicated that dense rangelands are normally located on slopes greater than 30 percent in mountain areas, while drainage and rain-fed agriculture are located on slopes of less than 30 percent. Thus, by overlaying the slope layer on the satellite images in a GIS environment (using the multiply method), the dense rangelands, which were mostly located in mountain areas, were separated.
Finally, LULC maps were produced using the conceptual model, and the images were classified with the hybrid method into 7 layers for the years 1998 and 2016. Yuan et al. (2005) and Kamusoko and Aniya (2007) used hybrid image classification and explained that this method is applicable to land with complex reflectance. Figures 4a and b and 5a and b show the LULC maps of the area at the second and third stages, respectively. Table 4 shows the area of each class in hectares.
Residential areas occurred as small patches distributed across the study area, and their reflectance was therefore influenced by neighboring pixels (Malmir et al., 2015). The Fisher classification method was able to separate the residential areas for both sensors, although in some cases residential areas and low-density rangeland were classified as one class in the TM images. The Fisher supervised classification method can separate LULC classes with high accuracy when the training sites are collected accurately (Al-doski et al., 2013). In this study, samples for each LULC class were collected with the OLI spatial resolution in mind, and they were taken in homogeneous areas. Table 4 shows no significant change in the drainage agriculture area and an increase in the rain-fed agriculture area. From 1998 to 2016, the water area of the Zayandehrood dam decreased by 1671 hectares. According to Table 4, dense rangelands and forests also decreased during this time.

Accuracy assessment of classification
In this study, classification accuracy was assessed using the field study samples and the produced maps. Error matrices were produced for the hybrid method results, and the kappa coefficient, overall accuracy, producer's and user's accuracy, and commission and omission errors were calculated; they are shown in Tables 5 and 6 (Lunetta & Lyon, 2004;Benfield et al., 2007;Al-doski et al., 2015;Lakshmi et al., 2015). The error matrices in Tables 5 and 6 show that most misclassified areas involve rain-fed and drainage agriculture and dense rangelands. Figures 3a and 3b show that these misclassifications are related to the spectral similarity between drainage agriculture, rain-fed agriculture, and dense rangeland. The TM error matrix (Table 5) shows that most misclassifications arise from assigning rain-fed agriculture to dense rangelands and forests. Table 6 indicates that in the map prepared from the OLI image, rain-fed agriculture, drainage agriculture, and dense and sparse rangelands were separated correctly. In the prepared maps, some drainage agriculture pixels are wrongly assigned to residential areas because of small green spaces within residential areas. In addition, because forests include trees and rangeland together, this class is heavily confused with the other vegetation classes, and it therefore has high commission and omission errors in both images.
Using the TM and OLI images, water was extracted readily because its reflectance behavior differs from that of other features (Figures 3a and 3b). Tables 5 and 6 show that the map produced from the TM sensor has an overall accuracy of 84% with a kappa coefficient of 0.81, while the map produced from the OLI sensor has an overall accuracy of 91% with a kappa coefficient of 0.87, which is higher than for TM. This difference was expected because of OLI characteristics such as its higher radiometric and spectral resolution.

Conclusion
The LULC spectral profiles showed that the digital numbers of the LULC classes were better separated in the 16-bit OLI data than in the TM data, which explains the lower accuracy of the TM map with its 8-bit radiometric resolution (Figures 3a and 3b). Nevertheless, applying the proposed hybrid method within the land hierarchy concept produced maps of almost the same accuracy for the two image datasets. In this study, Fisher classification (Al-doski et al., 2013), object-based classification (Blaschke, 2010;Vieira et al., 2012;Phinn et al., 2012), and the NDVI vegetation index (Peña & Brenning, 2015;Oldeland et al., 2010) were used in the designed hybrid classification method.
The error matrices showed more accurate classification results in the map derived from the OLI sensor than in the map derived from TM, especially in mapping different vegetation types and in separating land surfaces such as residential areas.
Given the similarity of reflectance among some LULC classes in this study, the hybrid method applied within the hierarchy concept of the land matrix can produce acceptable LULC maps. Detailed maps of LULC classes that cover small areas and have similar reflectance can therefore be produced by applying an appropriate method at each defined land level, using different imagery.