Road Scene Semantic Segmentation Network Based on Multi-Scale Transformer Features

CLC number: TP391.41; U491.1  Document code: A

Cite this article as: . Road scene semantic segmentation network based on multi-scale Transformer features[J]. Journal of East China Jiaotong University, 2025, 42(2): 110-118.

Road Scene Semantic Segmentation Network Based on Multi-Scale Transformer Features

Peng Yang, Wu Wenhuan, Zhang Haokun

(School of Intelligent and Connected Vehicle, Hubei University of Automotive Technology, Shiyan 442002, China)

Abstract: Image content in road scenes is usually complex: objects differ significantly in scale and shape, and lighting and shadows can make scenes difficult to recognize. However, existing semantic segmentation methods often fail to extract multi-scale semantic features effectively and to integrate them fully, resulting in poor generalization ability and robustness. To address these issues, this study proposes a semantic segmentation network model that fuses multi-scale Transformer features. Firstly, the CSWin Transformer was employed to extract semantic features at various scales, and a feature refinement module (FRM) was introduced to enhance the semantic discrimination capability of deep, fine-grained features. Secondly, an attention aggregation module (AAM) was adopted to aggregate the features at each scale separately. Finally, these enhanced multi-scale features were integrated to further strengthen the semantic expression ability of the features, thereby improving segmentation performance. Experimental results demonstrate that the network model achieves an accuracy of 82.3% on the Cityscapes dataset, outperforming SegNeXt and ConvNeXt by 2.2 percentage points and 1.2 percentage points, respectively. Moreover, it attains an accuracy of 47.4% on the highly challenging ADE20K dataset, surpassing SegNeXt and ConvNeXt by 3.2 percentage points and 2.8 percentage points, respectively. The proposed multi-scale Transformer feature fusion model not only achieves high semantic segmentation accuracy, accurately predicting the semantic category of each pixel in road scene images, but also shows strong generalization performance and robustness.

Key words: semantic segmentation; Transformer features; feature fusion; spatial expectation-maximization attention; channel attention

Citation format: PENG Y, WU W H, ZHANG H K. Road scene semantic segmentation network based on multi-scale Transformer features[J]. Journal of East China Jiaotong University, 2025, 42(2): 110-118.

The goal of semantic segmentation is to identify the object category label to which each pixel in an image belongs.
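
The per-pixel labelling task stated above can be illustrated with a minimal sketch. It is not the proposed network: it only assumes that some model has already produced one score per class for every pixel, and it assigns each pixel the class with the highest score. The class names, the score values, and the function name `segment` are hypothetical and chosen purely for illustration.

```python
def segment(scores):
    """Turn per-class score maps into a label map.

    scores[c][y][x] is the (hypothetical) score of class c at pixel (y, x).
    Returns labels[y][x], the index of the highest-scoring class per pixel.
    """
    num_classes = len(scores)
    height = len(scores[0])
    width = len(scores[0][0])
    labels = []
    for y in range(height):
        row = []
        for x in range(width):
            # Per-pixel argmax over the class dimension.
            best = max(range(num_classes), key=lambda c: scores[c][y][x])
            row.append(best)
        labels.append(row)
    return labels


# Hypothetical 2x2 image with three illustrative classes.
CLASSES = ["road", "car", "sky"]
scores = [
    [[0.1, 0.2], [0.8, 0.7]],  # road
    [[0.2, 0.1], [0.1, 0.2]],  # car
    [[0.7, 0.7], [0.1, 0.1]],  # sky
]
label_map = segment(scores)
named = [[CLASSES[i] for i in row] for row in label_map]
print(named)
```

In this toy input the top row of pixels is labelled "sky" and the bottom row "road"; a real segmentation network differs only in how the per-pixel scores are computed.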
