Burned Area Segmentation in Optical Remote Sensing Images Driven by U-shaped Multi-stage Masked Autoencoder






Computer vision (CV) for natural disaster monitoring is an emerging topic in the analysis of optical remote sensing images (ORSIs). Recently, the masked autoencoder (MAE) has achieved great success in CV and shown promising potential for many downstream vision tasks. However, the vision transformer (ViT) in MAE has a fixed feature scale and models local spatial correlation poorly, so directly applying MAE to burned area segmentation (BAS) in ORSIs fails to achieve satisfactory results. To address this problem, we propose a novel dual-branch complement network (DCNet) driven by a U-shaped multi-stage masked autoencoder (UMMAE) for BAS in ORSIs, which is also the first application of MAE to BAS. UMMAE has four stages and introduces skip connections between the encoder and decoder at the same stage, which improves feature diversity and further enhances model performance. DCNet has three major components: a ViT encoder (global branch), a convolutional encoder (local branch), and a decoder. The global branch inherits its visual representation learning ability from the pretrained UMMAE and captures global contextual information from the input image, while the local branch extracts local spatial information at different scales. Features from the two branches are fused in the decoder for feature complementation, which improves feature discriminability and segmentation accuracy. In addition, we build a new BAS dataset containing ORSIs of burned areas in California, USA, from 2017 to 2022. Extensive experiments on two BAS datasets demonstrate that DCNet outperforms state-of-the-art methods. Code is available at: https://github.com/Voruarn/DCNet.
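
The abstract does not describe how UMMAE masks its input, so the following is only a minimal sketch of the random patch masking commonly used in MAE-style pretraining, not the authors' implementation. The function name `random_patch_mask`, the 16-pixel patch size, and the 75% mask ratio are assumptions chosen for illustration; the actual settings are in the linked repository.

```python
import random


def random_patch_mask(height, width, patch_size=16, mask_ratio=0.75, seed=0):
    """Split an image grid into non-overlapping patches and pick a random
    subset to mask. Returns (visible_ids, masked_ids) as sorted index lists.

    patch_size and mask_ratio are illustrative assumptions, not values
    taken from the paper.
    """
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be divisible by patch_size")

    # Patches are indexed row by row over the patch grid.
    num_patches = (height // patch_size) * (width // patch_size)
    num_masked = int(num_patches * mask_ratio)

    # Shuffle the patch indices with a local generator for reproducibility.
    ids = list(range(num_patches))
    random.Random(seed).shuffle(ids)

    masked_ids = sorted(ids[:num_masked])
    visible_ids = sorted(ids[num_masked:])
    return visible_ids, masked_ids


if __name__ == "__main__":
    visible, masked = random_patch_mask(224, 224)
    print(len(visible), "visible patches,", len(masked), "masked patches")
```

In this kind of pretraining the encoder would see only the visible patches, and the decoder would be trained to reconstruct the masked ones.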


This work is licensed under CC BY-NC-ND 4.0


Burned area segmentation (BAS), Decoding, deep learning, Feature extraction, forest fire monitoring, Forestry, Semantic segmentation, Task analysis, Transformers, Wildfires


Fu, Y., Fang, W., & Sheng, V.S. 2024. Burned Area Segmentation in Optical Remote Sensing Images Driven by U-shaped Multi-stage Masked Autoencoder. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing. https://doi.org/10.1109/JSTARS.2024.3402122