Building extraction based on advanced attention gate U-Net

REN Yuanrui (任远锐); CHEN Pengdi (陈朋弟); GAO Xiaolong (高小龙)

doi:10.12265/j.gnss.2023175

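1. College of Earth and Environment Sciences, Lanzhou University, Lanzhou 730000, China
2. Mapping Institution of Gansu Province, Lanzhou 730000, China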

Funding: Natural Resources Science and Technology Project of Gansu Province (202223)

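Author biographies:
REN Yuanrui (b. 1999), female, master's degree; research interest: applications of deep learning semantic segmentation algorithms. E-mail: renyr21@lzu.edu.cn

CHEN Pengdi (b. 1993), male, PhD; research interest: intelligent information extraction from remote sensing imagery based on deep learning. E-mail: cpdhn1058475189@163.com

GAO Xiaolong (b. 1989), male, master's degree; research interests include remote sensing image processing and applications. E-mail: 381940392@qq.com

Corresponding author:
REN Yuanrui. E-mail: renyr21@lzu.edu.cn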

CLC number: P232; P237
Publication history
- Received: 2023-09-06
- Accepted: 2023-09-06
- Published online: 2024-03-26

Abstract: To address the low accuracy, blurred boundaries, and difficulty in identifying small targets that classic deep learning semantic segmentation networks show in building extraction, we propose an advanced attention gate U-Net (AA_U-Net). The network modifies the classic U-Net structure in three ways: it uses VGG16 as the backbone feature extraction network, places attention gate modules in the skip connections, and replaces deconvolution with bilinear interpolation for upsampling. We use the Wuhan University building dataset (WHD) to compare the proposed network with several classic semantic segmentation networks and to examine how each modification affects extraction. The overall accuracy (OA), intersection over union (IoU), precision, recall, and F1 score of the network are 98.78%, 89.71%, 93.30%, 95.89%, and 94.58%, respectively, all higher than those of the classic semantic segmentation networks compared. Each modification improves extraction accuracy and mitigates unclear building outlines and the fragmentation of small buildings. The network can be used to accurately extract building information from high-resolution remote sensing images, which is relevant to urban planning, land use, production and daily life, and military reconnaissance.

Keywords:
- high-resolution remote sensing images
- deep learning
- semantic segmentation
- advanced attention gate U-Net
- building extraction
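
One of the modifications named in the abstract is the use of bilinear interpolation instead of deconvolution for upsampling. As a rough illustration of what that operation computes, here is a minimal pure-Python sketch of bilinear upsampling on a single-channel feature map. The function name `bilinear_upsample`, the default scale factor of 2, and the half-pixel sampling convention with edge clamping are assumptions made for this example; the abstract does not specify these details.

```python
def bilinear_upsample(fmap, scale=2):
    """Upsample a 2-D feature map (list of rows) by an integer factor.

    Assumption: half-pixel sampling grid with clamping at the borders.
    Unlike deconvolution, this step has no learnable weights.
    """
    h, w = len(fmap), len(fmap[0])
    out_h, out_w = h * scale, w * scale
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        # Map the output pixel centre back to input coordinates.
        y = min(max((i + 0.5) / scale - 0.5, 0.0), h - 1.0)
        y0 = int(y)
        y1 = min(y0 + 1, h - 1)
        dy = y - y0
        for j in range(out_w):
            x = min(max((j + 0.5) / scale - 0.5, 0.0), w - 1.0)
            x0 = int(x)
            x1 = min(x0 + 1, w - 1)
            dx = x - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = fmap[y0][x0] * (1 - dx) + fmap[y0][x1] * dx
            bottom = fmap[y1][x0] * (1 - dx) + fmap[y1][x1] * dx
            out[i][j] = top * (1 - dy) + bottom * dy
    return out


# A 2x2 map becomes a 4x4 map; the first output row is [0.0, 0.5, 1.5, 2.0].
upsampled = bilinear_upsample([[0.0, 2.0], [4.0, 6.0]])
```

Because the interpolation weights are fixed, a decoder built this way typically follows the upsampling step with ordinary convolutions to learn the refinement.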

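Figure 1 U-Net structure

Note: Each box corresponds to a multi-channel feature map; brownish-red boxes are encoder feature maps and pinkish-purple boxes are decoder feature maps. The number of channels is shown at the top of each box and the feature map size at its lower edge. Arrows denote the different operations.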

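Figure 2 Structure of the proposed network

Note: Each box corresponds to a multi-channel feature map; brownish-red boxes are encoder feature maps, cyan boxes are the feature maps that VGG16 adds relative to the U-Net encoder, yellowish-brown boxes are feature maps copied after processing by the attention gate module, and red boxes are decoder feature maps. The number of channels is shown at the top of each box and the feature map size at its lower edge. Arrows denote the different operations.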

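Figure 3 Comparison of the U-Net encoder and the VGG16 backbone

Note: Boxes and colors have the same meaning as in Figure 2; (a) is the encoder of U-Net and (b) is the backbone feature extraction part of VGG16; red text marks where the feature map channel counts of VGG16 and U-Net differ.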

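Figure 4 Structure of the attention gate module

Note: Boxes represent feature maps, rectangular grids represent convolution operations, and square grids represent resampling.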
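
The legend above names the ingredients of the module (feature maps, convolutions, resampling) but not the exact computation. As a hedged illustration, the sketch below assumes the module follows the additive attention gate of Attention U-Net (Oktay et al., 2018): the skip feature `x` and the gating feature `g` are each passed through a 1x1 convolution, summed, passed through ReLU, projected to one value per pixel, squashed by a sigmoid, and used to rescale `x`. The enhanced variant in the paper may differ. The names `attention_gate`, `w_x`, `w_g` and `psi` are hypothetical, biases are omitted, and `g` is assumed to be already resampled to the spatial size of `x`.

```python
import math


def matvec(matrix, vector):
    """Apply a 1x1 convolution at one pixel, i.e. a matrix-vector product."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def attention_gate(x, g, w_x, w_g, psi):
    """Additive attention gate on flattened feature maps.

    x:   skip-connection features, one channel vector per pixel
    g:   gating features from the decoder, one channel vector per pixel
    w_x: F_int x C_x weights of the 1x1 convolution applied to x
    w_g: F_int x C_g weights of the 1x1 convolution applied to g
    psi: F_int weights projecting the hidden vector to a scalar
    Returns x with every pixel scaled by its attention coefficient in (0, 1).
    """
    gated = []
    for x_pix, g_pix in zip(x, g):
        hidden = [max(0.0, a + b)  # ReLU of the summed projections
                  for a, b in zip(matvec(w_x, x_pix), matvec(w_g, g_pix))]
        logit = sum(p * h for p, h in zip(psi, hidden))
        alpha = 1.0 / (1.0 + math.exp(-logit))  # sigmoid
        gated.append([alpha * c for c in x_pix])
    return gated
```

With all weights set to zero every coefficient is 0.5, so the gate halves the skip features; trained weights would instead push the coefficient toward 1 on building pixels and toward 0 on background.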

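Figure 5 Decoder before and after the improvement

Figure 6 Sample images from the WHD dataset

Figure 7 Comparison of segmentation results of different networks

Figure 8 Comparison of segmentation results with different backbones

Figure 9 Comparison of segmentation results before and after the decoder improvement

Figure 10 Comparison of segmentation results in the module ablation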

Table 1 Segmentation accuracy of different networks (%)

Network	IoU	Precision	Recall	F1	OA
U-Net	87.41	91.46	95.18	93.28	98.47
SegNet	87.13	91.94	94.34	93.12	98.45
FCN	69.15	85.03	78.73	91.76	96.09
DeepLabV3	68.89	82.38	80.79	81.58	95.94
DeepLabV3+	79.31	88.27	88.65	88.46	97.43
PSPNet	79.80	85.13	92.72	88.76	97.39
AA_U-Net	89.71	93.30	95.89	94.58	98.78
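
The columns of Table 1 are not independent. Assuming the standard per-class definitions based on true positives, false positives and false negatives, F1 = 2PR/(P + R) and 1/IoU = 1/P + 1/R - 1, so F1 and IoU can be recomputed from precision and recall. The sketch below runs that check on the values copied from Table 1; the helper names are hypothetical. Under these assumptions every IoU value and six of the seven F1 values agree with the table to rounding, while the FCN row's precision and recall imply an F1 of about 81.76 rather than the listed 91.76, which may be a typographical error in the source.

```python
# (IoU, Precision, Recall, F1) in percent, copied from Table 1.
TABLE1 = {
    "U-Net": (87.41, 91.46, 95.18, 93.28),
    "SegNet": (87.13, 91.94, 94.34, 93.12),
    "FCN": (69.15, 85.03, 78.73, 91.76),
    "DeepLabV3": (68.89, 82.38, 80.79, 81.58),
    "DeepLabV3+": (79.31, 88.27, 88.65, 88.46),
    "PSPNet": (79.80, 85.13, 92.72, 88.76),
    "AA_U-Net": (89.71, 93.30, 95.89, 94.58),
}


def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall (inputs in percent)."""
    return 2 * precision * recall / (precision + recall)


def iou_from_pr(precision, recall):
    """IoU = TP / (TP + FP + FN), rewritten as 1/IoU = 1/P + 1/R - 1."""
    p, r = precision / 100, recall / 100
    return 100 / (1 / p + 1 / r - 1)


def inconsistent_rows(table, tol=0.02):
    """Names of rows whose listed F1 or IoU differs from the recomputed one."""
    bad = []
    for name, (iou, p, r, f1) in table.items():
        if abs(f1_from_pr(p, r) - f1) > tol or abs(iou_from_pr(p, r) - iou) > tol:
            bad.append(name)
    return bad


print(inconsistent_rows(TABLE1))  # prints ['FCN']
```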

Table 2 Extraction accuracy with different backbones (%)

Network	IoU	Precision	Recall	F1	OA
ResNet50	85.17	89.98	94.09	91.99	98.18
MobileNetV3	83.93	90.00	92.56	91.27	98.03
VGG16	89.16	93.29	95.27	94.27	98.71

Table 3 Segmentation accuracy before and after the decoder improvement (%)

Network state	IoU	Precision	Recall	F1	OA
Before	89.43	93.42	95.44	94.42	98.74
After	89.71	93.30	95.89	94.58	98.78

Table 4 Segmentation accuracy in the module ablation

Network	VGG	AG	BIU	IoU/%	Precision/%	Recall/%	F1/%	OA/%
U-Net	-	-	-	87.41	91.46	95.18	93.28	98.47
VGG	Yes	-	-	89.16	93.29	95.27	94.27	98.71
VGG+AG	Yes	Yes	-	89.43	93.42	95.44	94.42	98.74
VGG+AG+BIU	Yes	Yes	Yes	89.71	93.30	95.89	94.58	98.78
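
Table 4 adds one module per row, so the contribution of each module can be read as the IoU difference between consecutive rows. The short sketch below does that arithmetic on the IoU column; the helper name `incremental_gains` is hypothetical, and reading AG as the attention gate and BIU as the bilinear-interpolation upsampling step is an assumption based on the abstract.

```python
# (configuration, IoU in percent), copied from Table 4 in row order.
ABLATION_IOU = [
    ("U-Net", 87.41),
    ("VGG", 89.16),
    ("VGG+AG", 89.43),
    ("VGG+AG+BIU", 89.71),
]


def incremental_gains(steps):
    """IoU gain of each configuration over the row directly above it."""
    return [(cur[0], round(cur[1] - prev[1], 2))
            for prev, cur in zip(steps, steps[1:])]


for name, gain in incremental_gains(ABLATION_IOU):
    print(f"{name}: +{gain:.2f} IoU points")
```

On these numbers the VGG16 backbone accounts for 1.75 of the 2.30-point total IoU gain over U-Net, with the attention gate and the upsampling change adding 0.27 and 0.28 points respectively.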
