TY - GEN
T1 - Semantic segmentation of UAV image using combined U-net and heterogeneous UAV imagery datasets
AU - Song, A.
N1 - Publisher Copyright: © COPYRIGHT SPIE.
PY - 2022
Y1 - 2022
N2 - Semantic segmentation of urban areas can provide useful information for analyzing and detecting changes in urban development. Recently, numerous remote sensing image datasets have been acquired from various platforms, and many semantic segmentation studies have used them. However, these datasets contain relatively few images because of their large data volume and the difficulty of constructing label data. Furthermore, it is difficult to use them together because each dataset has a different spatial resolution, shooting angle, and set of meaningful objects. In this study, two different UAV image datasets, the UAVid semantic segmentation dataset and the semantic drone dataset, were used to train a combined U-net model so that heterogeneous remote sensing datasets can be used simultaneously for semantic segmentation. The UAVid dataset was acquired at a flight height of 50 m and contains 300 images with eight classes. In contrast, the semantic drone dataset was acquired at an altitude of 5-30 m above the ground and contains 598 images with 20 classes. The combined U-net model is based on the U-net architecture but receives input from two different data sources. The experimental results showed that training on the two datasets with the combined U-net yielded higher semantic segmentation accuracy than training on each dataset separately with a U-net. This study confirms that two different datasets acquired from different places and platforms can be trained on simultaneously, and thereby evaluates the applicability of semantic segmentation studies using heterogeneous remote sensing datasets.
KW - combined U-net
KW - heterogeneous dataset
KW - Semantic segmentation
KW - UAV
UR - http://www.scopus.com/inward/record.url?scp=85142614814&partnerID=8YFLogxK
U2 - 10.1117/12.2638354
DO - 10.1117/12.2638354
M3 - Conference contribution
AN - SCOPUS:85142614814
T3 - Proceedings of SPIE - The International Society for Optical Engineering
BT - Remote Sensing Technologies and Applications in Urban Environments VII
A2 - Erbertseder, Thilo
A2 - Chrysoulakis, Nektarios
A2 - Zhang, Ying
PB - SPIE
T2 - Remote Sensing Technologies and Applications in Urban Environments VII 2022
Y2 - 5 September 2022
ER -