Authors :
Gauri Todur
Volume/Issue :
Volume 10 - 2025, Issue 10 - October
Google Scholar :
https://tinyurl.com/7puvux6y
Scribd :
https://tinyurl.com/5n6rh2h6
DOI :
https://doi.org/10.38124/ijisrt/25oct244
Abstract :
Small object detection in high-resolution satellite imagery for search and rescue (SAR) operations remains challenging,
with targets sometimes only 3-4 pixels wide in full images of 1000-pixel resolution. Using the SaRNet dataset, which
contains 2,552 satellite images from a real missing person search, we evaluated three modifications to a baseline
Faster R-CNN Feature Pyramid Network architecture, aiming to improve recall on small objects. We tested (A) Focal Loss
integration to address class imbalance, since targets represent <0.16% of image area, (B) multi-scale training and
testing at higher image resolutions (up-scaled by 10-20%), and (C) decreased anchor sizes. Results were mixed. Focal
Loss was the only successful modification, improving small object recall by 4.4 percentage points (10.4% relative
improvement) while also increasing recall on large objects. Surprisingly, both anchor optimization and multi-scale
training degraded performance despite their theoretical justification. Optimized anchor sizes decreased recall across
all object sizes and caused the worst AR-d20 performance drop (-12.64 points), showing that geometric anchor coverage
does not guarantee detection improvement in transfer learning contexts. Multi-scale training decreased medium-sized
object recall by 9.5 percentage points, contradicting recent super-resolution research. This work provides the first
systematic evaluation of modifications to the baseline model for the SaRNet dataset aimed at improved small object
detection. For operational SAR systems, where lives depend on detection performance, our results recommend Focal Loss
integration while cautioning against modifications that disrupt pre-trained model configurations.
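To illustrate why Focal Loss helps when targets occupy a tiny fraction of the image, the following is a minimal sketch of the binary focal loss as defined by Lin et al., FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). It is not the paper's training code: the function names are hypothetical, and alpha = 0.25 and gamma = 2.0 are the defaults from the Focal Loss paper, assumed here because the abstract does not state the values used.

```python
import math

def cross_entropy(p, y, eps=1e-12):
    """Standard binary cross-entropy for one prediction.

    p: predicted probability of the foreground (target) class.
    y: ground-truth label, 1 for target and 0 for background.
    """
    p_t = p if y == 1 else 1.0 - p
    return -math.log(min(max(p_t, eps), 1.0))

def binary_focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Binary focal loss for one prediction (hypothetical helper).

    alpha and gamma are assumed defaults, not values from this study.
    """
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    p_t = min(max(p_t, eps), 1.0)               # guard against log(0)
    # (1 - p_t)^gamma shrinks the loss of well-classified examples
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

# An easy background example (confidently correct) versus a missed target.
easy_background = binary_focal_loss(0.01, 0)
missed_target = binary_focal_loss(0.10, 1)

print("easy background: CE=%.5f FL=%.7f" % (cross_entropy(0.01, 0), easy_background))
print("missed target:   CE=%.5f FL=%.7f" % (cross_entropy(0.10, 1), missed_target))
```

Because the modulating factor (1 - p_t)^gamma is close to zero for confidently classified background, the many easy negatives contribute far less to the total loss than they would under cross-entropy, leaving the rare hard targets to dominate the gradient.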
Keywords :
Search and Rescue (SAR), Small Object Detection, Satellite Imagery, Faster R-CNN, Remote Sensing, SaRNet Dataset, Disaster Response, Focal Loss, Anchor Optimization.
References :
- Thoreau, Michael & Wilson, Frazer. (2021). SaRNet: A Dataset for Deep Learning Assisted Search and Rescue with Satellite Imagery. 204-208. 10.1109/ISPA52656.2021.9552103.
- S. Ren, K. He, R. Girshick and J. Sun, “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks,” in IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 39, no. 06, pp. 1137-1149, June 2017, doi: 10.1109/TPAMI.2016.2577031.
- Zhao, B., Song, R. Enhancing two-stage object detection models via data-driven anchor box optimization in UAV-based maritime SAR. Sci Rep 14, 4765 (2024). https://doi.org/10.1038/s41598-024-55570-z.
- T.-Y. Lin, P. Goyal, R. Girshick, K. He and P. Dollár, “Focal Loss for Dense Object Detection,” in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 42, no. 2, pp. 318-327, 1 Feb. 2020, doi: 10.1109/TPAMI.2018.2858826.
- J. Shermeyer and A. Van Etten, “The Effects of Super-Resolution on Object Detection Performance in Satellite Imagery,” in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Long Beach, CA, USA, 2019, pp. 1432-1441, doi: 10.1109/CVPRW.2019.00184.
- Yuxin Wu and Alexander Kirillov and Francisco Massa and Wan-Yen Lo and Ross Girshick, Detectron2, https://github.com/facebookresearch/detectron2, 2019.
- Lin, Tsung-Yi et al. “Feature Pyramid Networks for Object Detection.” 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016): 936-944.