Improving Object Detection via Lightweight Cross-Attention-based Semantic Alignment

Hyungseop Lee; Jiho Lee; Woochul Kang

Vol. 14, No. 9, pp. 668-676, Sep. 2025

https://doi.org/10.3745/TKIPS.2025.14.9.668

Deep Learning

Object Detection

Multi-Scale Feature Fusion

Cross-Attention

Real-Time Inference

Abstract

Accurate detection of objects with varying scales requires effective multi-scale feature representation learning. To this end, most modern object detectors adopt Feature Pyramid Network (FPN)-based feature fusion strategies. However, due to differences in semantic granularity and information content across feature levels, direct fusion often results in semantic misalignment, leading to increased false positives and limited detection performance. In this paper, we propose a lightweight cross-attention-based semantic alignment module that aligns adjacent feature levels prior to fusion. The module leverages semantically weak low-level features as queries and semantically rich high-level features as keys and values, enabling effective modeling of inter-level semantic relationships. To ensure computational efficiency and real-time applicability, the sequence length is constrained based on the lowest-resolution feature map. We integrate the proposed module into both conventional and real-time object detectors and evaluate it on the MS COCO and PASCAL VOC datasets. Experimental results demonstrate consistent improvements in AP and AP50 metrics, validating the effectiveness and generality of our approach.
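
The alignment step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`avg_pool_to`, `cross_attention_align`), the random projection matrices, the use of average pooling to shorten the sequences, and the nearest-neighbour upsampling with a residual add are all assumptions made for illustration. Only the roles of the inputs (low-level features as queries, high-level features as keys and values) and the idea of tying the sequence length to the lowest-resolution map come from the abstract.

```python
import numpy as np

def avg_pool_to(x, out_h, out_w):
    """Average-pool a (C, H, W) map to (C, out_h, out_w); H and W must be multiples."""
    c, h, w = x.shape
    fh, fw = h // out_h, w // out_w
    return x.reshape(c, out_h, fh, out_w, fw).mean(axis=(2, 4))

def softmax(z, axis=-1):
    """Numerically stable softmax along one axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_align(low, high, wq, wk, wv, seq_hw):
    """Align a low-level map to a high-level map (hypothetical helper).

    low:  (C, Hl, Wl) semantically weak features -> queries
    high: (C, Hh, Wh) semantically rich features -> keys and values
    seq_hw: spatial size that fixes the token count (assumed: lowest pyramid level)
    """
    c, hl, wl = low.shape
    sh, sw = seq_hw
    # Assumption: both maps are pooled to sh x sw, so attention costs O((sh*sw)^2).
    q_tokens = avg_pool_to(low, sh, sw).reshape(c, -1).T    # (N, C)
    kv_tokens = avg_pool_to(high, sh, sw).reshape(c, -1).T  # (N, C)
    q, k, v = q_tokens @ wq, kv_tokens @ wk, kv_tokens @ wv
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))          # (N, N)
    out = (attn @ v).T.reshape(c, sh, sw)
    # Assumption: nearest-neighbour upsampling plus a residual connection.
    up = out.repeat(hl // sh, axis=1).repeat(wl // sw, axis=2)
    return low + up, attn

rng = np.random.default_rng(0)
low = rng.standard_normal((16, 32, 32))   # e.g. a finer pyramid level
high = rng.standard_normal((16, 16, 16))  # e.g. the adjacent coarser level
wq = rng.standard_normal((16, 8))
wk = rng.standard_normal((16, 8))
wv = rng.standard_normal((16, 16))        # keeps C channels for the residual add

aligned, attn = cross_attention_align(low, high, wq, wk, wv, seq_hw=(8, 8))
print(aligned.shape, attn.shape)
```

With an assumed 8 x 8 lowest-resolution map, each sequence has 64 tokens regardless of the input level, so the attention matrix stays at 64 x 64 even when the query map is 32 x 32. The aligned map would then be passed to the usual FPN-style fusion with the high-level features.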

Cite this article

[IEEE Style]

H. Lee, J. Lee, W. Kang, "Improving Object Detection via Lightweight Cross-Attention-based Semantic Alignment," The Transactions of the Korea Information Processing Society, vol. 14, no. 9, pp. 668-676, 2025. DOI: https://doi.org/10.3745/TKIPS.2025.14.9.668.

[ACM Style]

Hyungseop Lee, Jiho Lee, and Woochul Kang. 2025. Improving Object Detection via Lightweight Cross-Attention-based Semantic Alignment. The Transactions of the Korea Information Processing Society, 14, 9, (2025), 668-676. DOI: https://doi.org/10.3745/TKIPS.2025.14.9.668.
