Semantic land cover and land use classification using deep convolutional neural networks

17 November 2020, Master's Theses

Informatics Institute, Satellite Communication and Remote Sensing Program, İTÜ

Berk Güney; Elif Sertel, 2019

Abstract: In recent years, deep learning (DL), the successor of neural networks (NNs), has become the state-of-the-art approach in areas such as computer vision (CV), speech recognition and natural language processing. NNs are an established branch of artificial intelligence that has been revived by factors such as high-performance computing, algorithmic improvements and big data. Big data has also become the norm in the field of remote sensing. Remote sensing is the acquisition of information about an object or phenomenon, especially the Earth, without making physical contact. This definition includes the conventional areas of remote sensing, e.g. satellite and aerial photography. However, remote sensing also covers areas such as unmanned aerial vehicles (UAVs) and crowdsourcing (phone images, tweets, etc.). Several satellites with high spatial resolution have been launched in the last five years, such as Sentinel-1A/B and Sentinel-2A within the European Copernicus program, and Landsat-8 of the U.S. Geological Survey (USGS) and the National Aeronautics and Space Administration (NASA). All of these data sets are freely accessible on an operational basis. Land use and land cover classification is a standard remote sensing task in which either each image pixel is assigned a class label indicating the physical material of the surface (land cover), or each object is assigned a label describing the socio-economic function of the land (land use). Land use objects are therefore complex structures consisting of many different land cover elements. Because of this complexity, both spectral and spatial features need to be incorporated for successful land use/land cover mapping. Experiments combining both kinds of features have been carried out using the Conditional Random Field (CRF) model, the Markov Random Field model and the Composite Kernel (CK) method.
Nevertheless, in most cases, extracting a large number of features for supervised classification is time-consuming and requires comprehensive knowledge to identify useful features. In addition, hand-crafted classification methods rely mainly on low-level features and produce inadequate classification results. With the increasing amount of accessible data, the application of deep learning to overcome these challenges has become prominent. Compared to machine learning approaches such as Support Vector Machine (SVM) and Random Forest (RF), deep learning shows great promise when big data is available. Current deep learning models include the Deep Belief Net (DBN), the Stacked Auto-Encoder (SAE) and the Convolutional Neural Network (CNN). The CNN, the most well-known deep learning model, has shown great progress in processing remote sensing imagery. CNNs outperform shallow-structured machine learning tools in remote sensing applications such as object detection, segmentation and classification. In this thesis, two pre-trained CNN models, namely Inception-ResNet-V2 and Inception-v4, are used to classify scenes from satellite imagery. There are 20 classes with 700 images each: airport, chaparral, dense residential, forest, freeway, golf course, ground track field, industrial area, intersection, meadow, medium residential, overpass, parking lot, rectangular farmland, river, runway, sparse residential, storage tank, tennis court and terrace. Scenes acquired by the WorldView-3 satellite sensor are used to evaluate the performance of the networks. The proposed networks reached 91.2% and 87.2% accuracy on the 1,000 test images.
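
The figures quoted at the end of the abstract can be checked with a little arithmetic. The sketch below is illustrative only: it assumes the reported accuracy is plain overall accuracy (correct predictions divided by the number of test images), which the abstract does not state, and `overall_accuracy` and `correct_count` are hypothetical helper names, not code from the thesis.

```python
# The 20 scene classes listed in the abstract.
CLASSES = [
    "airport", "chaparral", "dense residential", "forest", "freeway",
    "golf course", "ground track field", "industrial area", "intersection",
    "meadow", "medium residential", "overpass", "parking lot",
    "rectangular farmland", "river", "runway", "sparse residential",
    "storage tank", "tennis court", "terrace",
]
IMAGES_PER_CLASS = 700  # from the abstract
TEST_IMAGES = 1000      # from the abstract


def overall_accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted label equals the true label.

    Assumption: the thesis reports this metric; the abstract does not say.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("label lists must not be empty")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def correct_count(accuracy_percent, n_images):
    """Number of correctly classified images implied by an accuracy in percent."""
    return round(accuracy_percent / 100 * n_images)


# Size of the labelled collection: 20 classes x 700 images.
total_images = len(CLASSES) * IMAGES_PER_CLASS
print(total_images)

# Correct predictions implied by the two reported accuracies.
print(correct_count(91.2, TEST_IMAGES), correct_count(87.2, TEST_IMAGES))
```

Under that assumption, the labelled collection holds 14,000 images, and the two reported accuracies correspond to 912 and 872 correctly classified scenes out of the 1,000 test images.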