Performance Analysis of Video–Audio Action Recognition Using a Cross-Attention-Based Multimodal Fusion Architecture
Vol. 15, No. 2, pp. 113-120, Feb. 2026
https://doi.org/10.3745/TKIPS.2026.15.2.113
Cite this article
[IEEE Style]
J. H. Kim, "Performance Analysis of Video–Audio Action Recognition Using a Cross-Attention-Based Multimodal Fusion Architecture," The Transactions of the Korea Information Processing Society, vol. 15, no. 2, pp. 113-120, 2026. DOI: https://doi.org/10.3745/TKIPS.2026.15.2.113.
[ACM Style]
Jun Hwa Kim. 2026. Performance Analysis of Video–Audio Action Recognition Using a Cross-Attention-Based Multimodal Fusion Architecture. The Transactions of the Korea Information Processing Society, 15, 2, (2026), 113-120. DOI: https://doi.org/10.3745/TKIPS.2026.15.2.113.
