Vessel Detection From Very High-Resolution Satellite Images With Deep Learning Methods

19 April 2023, Master Theses

Department of Communication Systems, Satellite Communication and Remote Sensing Programme, İTÜ


Furkan Büyükkanber; Prof. Dr. Mustafa Yanalak, 2022

Abstract: Vessel detection from remote sensing images is becoming an increasingly important component of marine surveillance applications such as maritime traffic control, combating illegal fishing, oil discharge control, marine pollution monitoring and marine safety. Very high and medium resolution (VHR and MR) earth observation satellites both significantly increase the detectability of many terrestrial objects and shorten revisit times to an unprecedented degree, making this technology attractive for a variety of maritime monitoring missions. However, detecting objects in huge satellite images that cover hundreds of square kilometers, and deriving results under near real-time constraints, is difficult and complex, and traditional methods struggle to process images of this size. Processing these images with deep learning methods makes it possible to minimize the unforeseen errors that analysts can make, and to save labor, time and cost. To build an artificial neural network and make it successful with the chosen deep learning method, it must be trained on as many examples as possible of the objects to be detected. With the designed convolutional neural networks, it is possible to detect more than one object in a given test image and to perform change analysis as well. For each input image processed by the multilayer convolutional neural network, the weights in each layer are updated, and the error is measured as the difference between the predicted value and the ground truth value. Many commercial, military and civil vessels are observed in international maritime areas, usually close to ports and coasts. High-resolution satellite images, which provide a wide field of view and monitoring from altitude, are very useful for vessel detection.
Vessel detection from satellite images plays a significant role in inspecting maritime areas, controlling maritime transport traffic and in defense applications. Open source datasets are widely used in object detection applications, since building a dataset for object recognition and detection from satellite images takes substantial time and cost. Within the scope of this thesis, models based on convolutional neural networks, including single-stage and two-stage deep learning methods, were applied to our own dataset images, which we built from the open source DOTA dataset selected for vessel detection. For the experiments in this research, three separate datasets were built. All images were labelled in the YOLO annotation format and then converted to the COCO and Pascal VOC annotation formats as required by the various models. Both inshore and offshore vessel images were collected, covering a wide variety of scales, shapes, orientations and weather conditions (hazy, cloudy, sunny, etc.). Experiments were performed with the Faster R-CNN, YOLOv3, YOLOv5 and YOLOX deep learning models on all three datasets. A dataset containing varied examples of the target object considerably improves the accuracy of deep learning applications, so various data augmentation techniques used in remote sensing, such as mosaic, mixup and image rotation, were applied. In some experiments, more than one augmentation approach was used simultaneously to improve the accuracy of the results. Not all data augmentation approaches had the same effect on the experiment outcomes. As a result, there is no clear-cut answer to the question of which data augmentation strategy is the most effective. The outcomes of the experiments were compared using the mean average precision (mAP) metric, and the YOLOv5 model achieved the best results.
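The conversion between annotation formats mentioned above can be sketched as follows. The sketch relies on the commonly documented conventions: a YOLO label line holds a class index followed by a normalized center x, center y, width and height; Pascal VOC stores pixel corner coordinates; COCO stores the top-left pixel corner plus width and height. The function names and the sample label line are illustrative assumptions, not code from the thesis.

```python
def yolo_to_voc(cx, cy, w, h, img_w, img_h):
    """Normalized YOLO box -> Pascal VOC corners (xmin, ymin, xmax, ymax) in pixels."""
    xmin = (cx - w / 2) * img_w
    ymin = (cy - h / 2) * img_h
    xmax = (cx + w / 2) * img_w
    ymax = (cy + h / 2) * img_h
    return xmin, ymin, xmax, ymax


def yolo_to_coco(cx, cy, w, h, img_w, img_h):
    """Normalized YOLO box -> COCO box (xmin, ymin, width, height) in pixels."""
    xmin, ymin, xmax, ymax = yolo_to_voc(cx, cy, w, h, img_w, img_h)
    return xmin, ymin, xmax - xmin, ymax - ymin


def parse_yolo_line(line):
    """Split one YOLO label line into (class_id, cx, cy, w, h)."""
    class_id, cx, cy, w, h = line.split()
    return int(class_id), float(cx), float(cy), float(w), float(h)


# Hypothetical label for an 800 x 600 pixel image tile.
class_id, cx, cy, w, h = parse_yolo_line("0 0.5 0.5 0.25 0.5")
voc_box = yolo_to_voc(cx, cy, w, h, img_w=800, img_h=600)
coco_box = yolo_to_coco(cx, cy, w, h, img_w=800, img_h=600)
```

Coordinates are kept as floats here; rounding to integer pixels and writing the XML (Pascal VOC) or JSON (COCO) files is left out of the sketch.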
All of the experiments yielded the same result when the depth of the network was raised by increasing the size of the input images: mAP values improved as the input size increased, although the selected models took longer to train. Experiments in deep learning studies are made easier by machines that have powerful graphics cards. The Faster R-CNN, YOLOv3, YOLOv5 and YOLOX models were trained on a local workstation equipped with an NVIDIA GeForce RTX 2080 Ti graphics card and an Intel® Core™ i9-9900K 3.60 GHz CPU. The deep learning applications were implemented in the Python programming language using the PyTorch deep learning framework.
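The mAP metric used to compare the models is the mean of per-class average precision (AP) values, and AP in turn rests on the intersection over union (IoU) between predicted and ground truth boxes. The following is a simplified sketch for a single class and a single image, assuming an IoU threshold of 0.5, greedy matching to the best unmatched ground truth box, and all-point interpolation of the precision-recall curve. The function names and sample boxes are illustrative; the evaluation code of the thesis may differ in these choices.

```python
def iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truths, iou_threshold=0.5):
    """AP for one class: area under the interpolated precision-recall curve.

    detections: list of (score, box); ground_truths: list of boxes.
    """
    if not ground_truths:
        return 0.0
    matched, tp_flags = set(), []
    # Visit detections from most to least confident.
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truths):
            overlap = iou(box, gt)
            if idx not in matched and overlap > best_iou:
                best_iou, best_idx = overlap, idx
        is_tp = best_idx is not None and best_iou >= iou_threshold
        if is_tp:
            matched.add(best_idx)   # each ground truth box can be matched once
        tp_flags.append(is_tp)
    precisions, recalls, tp = [], [], 0
    for rank, is_tp in enumerate(tp_flags, start=1):
        tp += is_tp
        precisions.append(tp / rank)
        recalls.append(tp / len(ground_truths))
    # Make precision non-increasing as recall grows (all-point interpolation).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


# Hypothetical boxes: two vessels, three detections (one is a false positive).
ground_truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60)), (0.7, (20, 20, 30, 30))]
ap = average_precision(detections, ground_truths)
```

With several object classes, mAP would be obtained by computing AP per class in this way and averaging the results.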