Research on an Open-World Scene Graph Generation Algorithm Based on Multidimensional Feature Fusion

CLC number: TP391 Document code: A
Open-world Multidimensional Feature Fusion Scene Graph Generation
GU Feifan1, ZHOU Mengmeng2, SONG Shimiao1, GE Jiashang1, YANG Jie1 (1. College of Mechanical and Electrical Engineering, Qingdao University, Qingdao 266071, China; 2. Qingdao QCIT Technology Co., Ltd., Qingdao 266100, China)
Abstract: Open-world scene graph generation struggles to detect unknown objects and the relationships among them. To address this issue, a relation-reasoning model based on multidimensional feature fusion (MDFF) is proposed and combined with an open-world object detector to form a two-stage open-world scene graph generation algorithm. First, the pretrained open-world object detector identifies objects in the input image. The MDFF model then infers relationships from the detection results. Comparative experiments between traditional methods and the MDFF model are conducted on the VG-150 dataset. The results indicate that the MDFF model achieves a 7% improvement in recall on the predicate classification task and a 3% performance improvement on the open-world scene graph generation and zero-shot inference tasks. Ablation studies further confirm the effectiveness of the different feature dimensions in improving model performance.
Keywords: scene graph generation; feature fusion; object detection; deep learning
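In open-world environments, scene graph generation is a complex task. In particular, when the scene is unknown or contains unseen objects, generating scene graphs that are both accurate and highly generalizable becomes the core research problem [1].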
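As a rough illustration of the two-stage flow summarized in the abstract (detect objects first, then infer a relationship for each object pair), the following minimal Python sketch may help. It is not the authors' implementation: `Detection`, `generate_scene_graph`, `recall_at_k`, `toy_detector`, and `toy_relation_model` are hypothetical names introduced here, the toy relation rule merely stands in for the MDFF model, and the recall function matches triplets by object index rather than by the box-overlap matching used in standard scene graph evaluation.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Triplet = Tuple[int, str, int]           # (subject index, predicate, object index)


@dataclass(frozen=True)
class Detection:
    label: str  # class name, or "unknown" for an object outside the known classes
    box: Box


def generate_scene_graph(
    image,
    detector: Callable[[object], List[Detection]],
    relation_model: Callable[[Detection, Detection], Tuple[str, float]],
    top_k: int = 20,
) -> List[Triplet]:
    """Stage 1: detect objects. Stage 2: score one predicate per ordered pair."""
    detections = detector(image)
    scored = []
    for i, subj in enumerate(detections):
        for j, obj in enumerate(detections):
            if i == j:
                continue  # an object has no relationship with itself
            predicate, score = relation_model(subj, obj)
            scored.append((score, (i, predicate, j)))
    # Keep the k most confident triplets, highest score first.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [triplet for _, triplet in scored[:top_k]]


def recall_at_k(predicted: List[Triplet], ground_truth: List[Triplet], k: int) -> float:
    """Fraction of ground-truth triplets found among the top-k predictions."""
    if not ground_truth:
        return 0.0
    top = set(predicted[:k])
    return sum(1 for t in ground_truth if t in top) / len(ground_truth)


def toy_detector(image) -> List[Detection]:
    # Hypothetical stand-in for a pretrained open-world detector.
    return [
        Detection("person", (0.0, 0.0, 10.0, 20.0)),
        Detection("unknown", (2.0, 15.0, 8.0, 25.0)),
    ]


def toy_relation_model(subj: Detection, obj: Detection) -> Tuple[str, float]:
    # Hypothetical stand-in for the relation-reasoning model: a geometric rule only.
    if subj.box[1] < obj.box[1]:
        return ("on", 0.9)
    return ("under", 0.4)


graph = generate_scene_graph(None, toy_detector, toy_relation_model, top_k=2)
```

Passing the detector and the relation model as callables keeps the two stages independent, so either stage could be swapped without touching the other, which mirrors the two-stage design the abstract describes.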