Mathematical Foundations of Computing, November 2019, 2(4): 299-314. doi: 10.3934/mfc.2019019

Big Map R-CNN for object detection in large-scale remote sensing images

a. FIST LAB, School of Information Science and Engineering, Yunnan University, Kunming 650091, Yunnan, China

b. Yunnan Union Vision Technology Co., Ltd., Kunming 650091, Yunnan, China

c. School of Software, Yunnan University, Kunming 650091, Yunnan, China

* Corresponding author: Dapeng Tao

Published December 2019

Detecting sparse and multi-sized objects in very high resolution (VHR) remote sensing images remains a significant challenge in satellite imagery applications and analytics. The difficulties include broad geographical scene distributions and high pixel counts: a large-scale satellite image contains tens to hundreds of millions of pixels and dozens of complex backgrounds. Furthermore, the scale of objects within the same category can vary widely (e.g., ships can measure from several to thousands of pixels). To address these issues, we propose the Big Map R-CNN method to improve object detection in VHR satellite imagery. Big Map R-CNN introduces mean-shift clustering for quadric (second-pass) detection on top of the existing Mask R-CNN architecture. The method comprises four main steps: 1) cropping the big map into small sub-images; 2) detecting objects in these sub-images with a standard Mask R-CNN network; 3) screening out fragmented low-confidence targets and collecting uncertain image regions by clustering; 4) quadric detection of those regions to generate the final prediction boxes. We also introduce RSI LS-VHR-2, a new large-scale VHR remote sensing imagery dataset containing two categories, for detection performance verification. Comprehensive evaluations on the RSI LS-VHR-2 dataset demonstrate the effectiveness of the proposed Big Map R-CNN algorithm for object detection in large-scale remote sensing images.
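The big-map cropping in step 1 can be sketched as a sliding window with overlapping tiles, so that an object cut by one tile boundary appears whole in a neighboring tile. A minimal sketch follows; the 600-pixel tile and 100-pixel overlap are illustrative assumptions, not the paper's exact settings.

```python
def tile_boxes(width, height, tile=600, overlap=100):
    """Generate (x0, y0, x1, y1) crop boxes covering a width x height image.

    Adjacent tiles share `overlap` pixels, so an object cut by one tile
    boundary appears whole in a neighboring tile.
    """
    stride = tile - overlap
    boxes = []
    for y0 in range(0, max(height - overlap, 1), stride):
        for x0 in range(0, max(width - overlap, 1), stride):
            # Clamp the last tile in each row/column to the image border.
            x1, y1 = min(x0 + tile, width), min(y0 + tile, height)
            boxes.append((x0, y0, x1, y1))
    return boxes

# An 8000x8000 test image (the size used in Table II) yields a 16x16 grid.
boxes = tile_boxes(8000, 8000, tile=600, overlap=100)
```

Each sub-image is then fed to the detector independently, and box coordinates are shifted back by (x0, y0) to the big-map frame.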

Citation: Linfei Wang, Dapeng Tao, Ruonan Wang, Ruxin Wang, Hao Li. Big Map R-CNN for object detection in large-scale remote sensing images. Mathematical Foundations of Computing, 2019, 2 (4) : 299-314. doi: 10.3934/mfc.2019019
References:

[1] U. R. Acharya, H. Fujita and S. Bhat, Decision support system for fatty liver disease using GIST descriptors extracted from ultrasound images, Information Fusion, (2016), 32-39. doi: 10.1016/j.inffus.2015.09.006.
[2] H. Bay, T. Tuytelaars and L. Van Gool, SURF: Speeded up robust features, European Conference on Computer Vision, 3951 (2006), 404-417. doi: 10.1007/11744023_32.
[3] Y. S. Cao, X. Niu and Y. Dou, Region-based convolutional neural networks for object detection in very high resolution remote sensing images, 2016 12th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery, (2016), 548-554. doi: 10.1109/FSKD.2016.7603232.
[4] K. Chatfield, K. Simonyan and A. Vedaldi, Return of the devil in the details: Delving deep into convolutional nets, Proceedings of BMVC, (2014). doi: 10.5244/C.28.6.
[5] L. C. Chen, G. Papandreou and I. Kokkinos, Semantic image segmentation with deep convolutional nets and fully connected CRFs, arXiv: 1412.7062.
[6] G. Cheng, P. Zhou and J. Han, Learning rotation-invariant convolutional neural networks for object detection in VHR optical remote sensing images, IEEE Transactions on Geoscience and Remote Sensing, 54 (2016), 7405-7415. doi: 10.1109/TGRS.2016.2601622.
[7] J. Dai, Y. Li, K. He and J. Sun, R-FCN: Object detection via region-based fully convolutional networks, Advances in Neural Information Processing Systems, (2016), 379-387.
[8] N. Dalal and B. Triggs, Histograms of oriented gradients for human detection, International Conference on Computer Vision & Pattern Recognition, (2005), 886-893. doi: 10.1109/CVPR.2005.177.
[9] R. Girshick, J. Donahue, T. Darrell and J. Malik, Rich feature hierarchies for accurate object detection and semantic segmentation, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, (2014), 580-587. doi: 10.1109/CVPR.2014.81.
[10] R. Girshick, Fast R-CNN, Proceedings of the IEEE International Conference on Computer Vision, (2015), 1440-1448. doi: 10.1109/ICCV.2015.169.
[11] D. Gray and H. Tao, Viewpoint invariant pedestrian recognition with an ensemble of localized features, Proceedings of the European Conference on Computer Vision, 5302 (2008), 262-275. doi: 10.1007/978-3-540-88682-2_21.
[12] X. Han, Y. Zhong and L. Zhang, An efficient and robust integrated geospatial object detection framework for high spatial resolution remote sensing imagery, Remote Sensing, 9 (2017), 666-687. doi: 10.3390/rs9070666.
[13] K. He, X. Zhang, S. Ren and J. Sun, Spatial pyramid pooling in deep convolutional networks for visual recognition, ECCV, 8591 (2014), 346-361. doi: 10.1007/978-3-319-10578-9_23.
[14] K. He, G. Gkioxari, P. Dollár and R. Girshick, Mask R-CNN, Proceedings of the IEEE International Conference on Computer Vision, (2017), 2961-2969. doi: 10.1109/ICCV.2017.322.
[15] J. Jeong, H. Park and N. Kwak, Enhancement of SSD by concatenating feature maps for object detection, BMVC, (2017), 1-12. doi: 10.5244/C.31.76.
[16] K. Kanistras, G. Martins and M. J. Rutherford, Survey of unmanned aerial vehicles (UAVs) for traffic monitoring, Handbook of Unmanned Aerial Vehicles, (2016), 2643-2666. doi: 10.1109/ICUAS.2013.6564694.
[17] M. Kang, K. Ji and X. Leng, Contextual region-based convolutional neural network with multilayer fusion for SAR ship detection, Remote Sensing, (2017), 860-873.
[18] Y. Ke and R. Sukthankar, PCA-SIFT: A more distinctive representation for local image descriptors, CVPR, (2004), 506-513.
[19] S. Khanal, J. Fulton and S. Shearer, An overview of current and potential applications of thermal remote sensing in precision agriculture, Computers and Electronics in Agriculture, 139 (2017), 22-32. doi: 10.1016/j.compag.2017.05.001.
[20] V. Kyrki, J. K. Kamarainen and H. Kälviäinen, Simple Gabor feature space for invariant object recognition, Pattern Recognition Letters, 25 (2004), 311-318. doi: 10.1016/j.patrec.2003.10.008.
[21] Y. Li, Y. Tan and J. Deng, Cauchy graph embedding optimization for built-up areas detection from high-resolution remote sensing images, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 8 (2015), 2078-2096. doi: 10.1109/JSTARS.2015.2394504.
[22] W. Liu, D. Anguelov and D. Erhan, SSD: Single shot multibox detector, European Conference on Computer Vision, 9905 (2016), 21-37. doi: 10.1007/978-3-319-46448-0_2.
[23] D. G. Lowe, Distinctive image features from scale-invariant keypoints, International Journal of Computer Vision, 60 (2004), 91-110. doi: 10.1023/B:VISI.0000029664.99615.94.
[24] J. Ma, H. Zhou and J. Zhao, Robust feature matching for remote sensing image registration via locally linear transforming, IEEE Transactions on Geoscience and Remote Sensing, 53 (2015), 6469-6481. doi: 10.1109/TGRS.2015.2441954.
[25] M. Mazhar Rathore, A. Ahmad and A. Paul, Urban planning and building smart cities based on the internet of things using big data analytics, Computer Networks, 101 (2016), 63-80. doi: 10.1016/j.comnet.2015.12.023.
[26] B. S. Manjunath, J. R. Ohm and V. V. Vasudevan, Color and texture descriptors, IEEE Transactions on Circuits and Systems for Video Technology, 11 (2001), 703-715. doi: 10.1109/76.927424.
[27] V. Nair and G. E. Hinton, 3D object recognition with deep belief nets, Advances in Neural Information Processing Systems, (2009), 1339-1347.
[28] H. Noh, S. Hong and B. Han, Learning deconvolution network for semantic segmentation, Proceedings of the IEEE International Conference on Computer Vision, (2015), 1520-1528. doi: 10.1109/ICCV.2015.178.
[29] W. Ouyang, X. Wang and X. Zeng, DeepID-Net: Deformable deep convolutional neural networks for object detection, The IEEE Conference on Computer Vision and Pattern Recognition, (2015), 2403-2412. doi: 10.1109/CVPR.2015.7298854.
[30] M. T. Pham, G. Mercier and O. Regniers, Texture retrieval from VHR optical remotely sensed images using the local extrema descriptor with application to vineyard parcel detection, Remote Sensing, 8 (2016), 368-388. doi: 10.3390/rs8050368.
[31] J. Redmon, S. Divvala and R. Girshick, You only look once: Unified, real-time object detection, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, (2016), 779-788. doi: 10.1109/CVPR.2016.91.
[32] Y. Ren, C. Zhu and S. Xiao, Small object detection in optical remote sensing images via modified faster R-CNN, Applied Sciences, 8 (2018), 813-823. doi: 10.3390/app8050813.
[33] S. Ren, K. He and R. Girshick, Faster R-CNN: Towards real-time object detection with region proposal networks, Advances in Neural Information Processing Systems, (2015), 91-99.
[34] M. Simony, S. Milzy and K. Amendey, Complex-YOLO: An Euler-region-proposal for real-time 3D object detection on point clouds, Proceedings of the European Conference on Computer Vision, 11127 (2018), 197-209. doi: 10.1007/978-3-030-11009-3_11.
[35] M. Vakalopoulou, K. Karantzalos and N. Komodakis, Building detection in very high resolution multispectral data with deep learning features, 2015 IEEE International Geoscience and Remote Sensing Symposium, (2015), 1873-1876. doi: 10.1109/IGARSS.2015.7326158.
[36] K. S. Willis, Remote sensing change detection for ecological monitoring in United States protected areas, Biological Conservation, 182 (2015), 233-242. doi: 10.1016/j.biocon.2014.12.006.
[37] J. Yan, H. Wang and M. Yan, IoU-adaptive deformable R-CNN: Make full use of IoU for multi-class object detection in remote sensing imagery, Remote Sensing, (2019), 286-306.
[38] Y. Zhong, X. Han and L. Zhang, Multi-class geospatial object detection based on a position-sensitive balancing framework for high spatial resolution remote sensing imagery, ISPRS Journal of Photogrammetry and Remote Sensing, 138 (2018), 281-294. doi: 10.1016/j.isprsjprs.2018.02.014.
[39] H. Zhu, X. Chen and W. Dai, Orientation robust object detection in aerial images using deep convolutional neural network, 2015 IEEE International Conference on Image Processing, (2015), 3735-3739. doi: 10.1109/ICIP.2015.7351502.


Figure 1.  Motivation for the proposed method. (a) Remote sensing scene of Madrid Airport. (b) Remote sensing scene of the South China Sea. These examples are from the RSI LS-VHR-2 dataset; the targets in the images are indicated by red circles. The remote sensing scenes exhibit large scale, high resolution, and relatively sparse target distribution, which makes existing methods suboptimal for detection
Figure 2.  The scheme of Big Map R-CNN, containing three main components: 1) cropping the input big map in the form of a sliding window; 2) detecting each sub-image sequentially and filtering possible object areas; 3) using mean shift clustering to precisely locate candidate object areas, cropping the new sub-images containing possible objects, and using quadric-detecting to judge whether there is an object or not
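The third component groups the centers of uncertain, low-confidence boxes into candidate regions for quadric detection. Below is a minimal flat-kernel mean-shift sketch over 2-D box centers; the bandwidth value and the mode-merging rule are illustrative assumptions rather than the paper's exact settings.

```python
import math

def mean_shift(points, bandwidth=200.0, iters=50):
    """Flat-kernel mean shift: each point repeatedly moves to the mean of
    the original points within `bandwidth`, then nearby modes are merged."""
    modes = [list(p) for p in points]
    for _ in range(iters):
        for m in modes:
            nbrs = [p for p in points if math.dist(m, p) <= bandwidth]
            m[0] = sum(p[0] for p in nbrs) / len(nbrs)
            m[1] = sum(p[1] for p in nbrs) / len(nbrs)
    # Merge modes that converged to (nearly) the same location.
    centers = []
    for m in modes:
        if all(math.dist(m, c) > bandwidth / 2 for c in centers):
            centers.append(tuple(m))
    return centers

# Two well-separated groups of low-confidence box centers (pixels).
centers = mean_shift([(0, 0), (10, 5), (5, 10), (1000, 1000), (1010, 990)])
```

Each returned center then seeds a new sub-image crop that is re-detected in the quadric pass.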
Figure 3.  Large-scale image cropping
Figure 4.  Some examples from the RSI LS-VHR-2 dataset
Figure 5.  PRCs of the proposed Big Map R-CNN method and three other state-of-the-art detection methods (YOLOv3, Faster R-CNN, and Mask R-CNN). (a) PRC of the four methods for aircraft at IoU = 0.5; (b) PRC of the four methods for aircraft at IoU = 0.75; (c) PRC of the four methods for ships at IoU = 0.5; (d) PRC of the four methods for ships at IoU = 0.75
Figure 6.  Detection comparisons of the different methods. (a) Typical Mask R-CNN for aircraft; (b) Big Map R-CNN for aircraft; (c) typical Mask R-CNN for ships; (d) Big Map R-CNN for ships. The true positives are indicated by green rectangles, the false negatives are indicated by red circles, and the bounding boxes that deviate from the ground truth are indicated by red rectangles
Table Ⅰ.  DESCRIPTION OF THE RSI LS-VHR-2 DATASET

Label  Name      Total instances  Complete instances  Fragmentary instances  Scene classes  Images  Image width (px)  Sub-images
1      aircraft  103917           85975               17942                  203            2858    6000-15000        62129
2      ship      68436            54386               14050                  30             397     5000-18000        53860
Table Ⅱ.  DETAILS OF THE TEST IMAGES

Label     Scale (pixels)  Images  Instances  Sub-images
aircraft  8000×8000       5       272        980
ship      8000×8000       5       225        980
Table Ⅵ.  PARAMETER SETTINGS OF Mask R-CNN AND Big Map R-CNN

Input Size  Batch Size  Max Iterations  Anchor Strides      Base Learning Rate  Steps           Weight Decay  NMS Threshold  Momentum
600         8           90000           (4, 8, 16, 32, 64)  0.01                (60000, 80000)  0.0001        0.7            0.9
Table Ⅲ.  PERFORMANCE COMPARISON OF THREE CROPPING SIZES WITH THE Faster R-CNN NETWORK

Cropping Size  AP     Time cost (s)
C300           0.430  45.82
C600           0.651  13.20
C800           0.647  8.79
Table Ⅳ.  PERFORMANCE COMPARISONS OF THE FOUR METHODS ON AIRCRAFT

                IoU = 0.5                                 IoU = 0.75
Method          TP   FP   FN   Recall  Precision  AP      TP   FP   FN   Recall  Precision  AP
YOLOv3          213  25   59   0.783   0.895      0.727   166  72   106  0.610   0.697      0.494
Faster R-CNN    242  55   30   0.890   0.815      0.830   189  108  83   0.695   0.636      0.618
Mask R-CNN      245  38   27   0.901   0.866      0.843   184  99   88   0.676   0.650      0.570
Big Map R-CNN   261  4    11   0.960   0.985      0.959   241  24   31   0.886   0.909      0.850
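The recall and precision columns follow directly from the TP/FP/FN counts. As a check, the Big Map R-CNN row for aircraft at IoU = 0.5 can be reproduced from its counts:

```python
def recall_precision(tp, fp, fn):
    """Recall = TP / (TP + FN); Precision = TP / (TP + FP)."""
    return tp / (tp + fn), tp / (tp + fp)

# Big Map R-CNN on aircraft at IoU = 0.5 (Table IV): TP=261, FP=4, FN=11.
r, p = recall_precision(261, 4, 11)
# Rounded to three decimals these give 0.960 and 0.985, matching the table.
```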
Table Ⅴ.  PERFORMANCE COMPARISONS OF THE FOUR METHODS ON SHIP

                IoU = 0.5                                 IoU = 0.75
Method          TP   FP   FN   Recall  Precision  AP      TP   FP   FN   Recall  Precision  AP
YOLOv3          128  53   97   0.569   0.707      0.513   66   115  159  0.293   0.365      0.213
Faster R-CNN    164  185  61   0.729   0.470      0.651   78   271  147  0.347   0.223      0.259
Mask R-CNN      166  121  59   0.738   0.578      0.661   78   209  147  0.347   0.272      0.273
Big Map R-CNN   191  49   34   0.849   0.796      0.826   133  107  92   0.591   0.554      0.546
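The IoU thresholds in Tables Ⅳ and Ⅴ score a predicted box against a ground-truth box by intersection-over-union; a match counts as a true positive only when IoU exceeds the threshold (0.5 or 0.75). A minimal sketch for axis-aligned (x0, y0, x1, y1) boxes:

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x0, y0, x1, y1)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

# Two 10x10 boxes overlapping by half give IoU = 50 / 150 = 1/3,
# a true positive at the 0.5 threshold only if the overlap is larger.
score = iou((0, 0, 10, 10), (5, 0, 15, 10))
```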
Table Ⅶ.  AVERAGE PRECISION OF Mask R-CNN AND Big Map R-CNN ON THE RSI LS-VHR-2 DATASET

Method         Backbone  AP (%)
Mask R-CNN     ResNet50  75.2
Big Map R-CNN  ResNet50  89.2
Table Ⅷ.  COMPREHENSIVE PERFORMANCE COMPARISONS OF THE FOUR METHODS

Method         mAP (IoU=0.5)  mAP (IoU=0.75)  Inference time (s/image)
YOLOv3         0.620          0.354           3.310
Faster R-CNN   0.741          0.439           13.254
Mask R-CNN     0.752          0.422           13.310
Big Map R-CNN  0.892          0.700           16.005
