Efficient Emotion Classification Method Based on Multimodal Approach Using Limited Speech and Text Data 


Vol. 13, No. 4, pp. 174-180, Apr. 2024
https://doi.org/10.3745/TKIPS.2024.13.4.174


  Abstract

In this paper, we explore an emotion classification method based on multimodal learning with the wav2vec 2.0 and KcELECTRA models. Multimodal learning that leverages both speech and text data is known to significantly improve emotion classification performance over methods that rely on speech data alone. To select an optimal text-processing model, we conduct a comparative analysis of BERT and its derivative models, which are known for their strong performance in natural language processing, and evaluate how effectively each extracts features from text data. The results confirm that the KcELECTRA model performs best on the emotion classification task. Furthermore, experiments on datasets made available by AI-Hub demonstrate that including text data yields superior performance with less data than using speech data alone. In our experiments, the KcELECTRA model achieved the highest accuracy, 96.57%. This indicates that multimodal learning can offer meaningful performance improvements in complex natural language processing tasks such as emotion classification.
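For illustration, below is a minimal sketch (Python, Hugging Face transformers) of a late-fusion multimodal classifier of the kind the abstract describes. The specific checkpoints ("facebook/wav2vec2-base", "beomi/KcELECTRA-base"), mean-pooling of encoder outputs, concatenation-based fusion, and the number of emotion classes are assumptions made for the sketch, not details taken from the paper.

import torch
import torch.nn as nn
from transformers import AutoModel, Wav2Vec2Model

NUM_CLASSES = 7  # hypothetical number of emotion labels; not specified in the abstract

class MultimodalEmotionClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        # Speech encoder: wav2vec 2.0; text encoder: KcELECTRA
        self.speech_encoder = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")
        self.text_encoder = AutoModel.from_pretrained("beomi/KcELECTRA-base")
        speech_dim = self.speech_encoder.config.hidden_size  # 768 for the base model
        text_dim = self.text_encoder.config.hidden_size      # 768 for the base model
        self.classifier = nn.Linear(speech_dim + text_dim, NUM_CLASSES)

    def forward(self, input_values, input_ids, attention_mask):
        # Mean-pool each encoder's last hidden states into one vector per sample
        speech_feat = self.speech_encoder(input_values).last_hidden_state.mean(dim=1)
        text_feat = self.text_encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state.mean(dim=1)
        # Late fusion by concatenation, followed by a linear emotion classifier
        fused = torch.cat([speech_feat, text_feat], dim=-1)
        return self.classifier(fused)

The fusion strategy shown here (concatenating pooled features before a single linear layer) is one common choice for combining speech and text representations; the paper's actual architecture may differ.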

  Cite this article

[IEEE Style]

M. Shin and Y. Shin, "Efficient Emotion Classification Method Based on Multimodal Approach Using Limited Speech and Text Data," The Transactions of the Korea Information Processing Society, vol. 13, no. 4, pp. 174-180, 2024. DOI: https://doi.org/10.3745/TKIPS.2024.13.4.174.

[ACM Style]

Mirr Shin and Youhyun Shin. 2024. Efficient Emotion Classification Method Based on Multimodal Approach Using Limited Speech and Text Data. The Transactions of the Korea Information Processing Society 13, 4 (2024), 174-180. DOI: https://doi.org/10.3745/TKIPS.2024.13.4.174.