A Study on Optimizing Korean Performance of Quantized Multilingual LLMs through Language-Specific Neuron Preservation 


Vol. 14, No. 9, pp. 687-694, Sep. 2025
https://doi.org/10.3745/TKIPS.2025.14.9.687


  Abstract

Quantization of large language models (LLMs) is an effective technique for reducing model size and computational overhead. However, most existing methods are designed with an English-centric perspective, which leads to performance degradation in non-English languages. To address this limitation, this study proposes an improved quantization approach that enhances the AWQ framework by identifying Korean-specific neurons using the Language Activation Probability Entropy (LAPE) metric and preserving their weights during quantization. Furthermore, we introduce an optimization strategy that focuses the analysis on deeper layers where language-specific activation patterns tend to emerge, thereby improving both computational efficiency and model performance. Experimental results demonstrate that the proposed method significantly improves Korean performance by up to 17.9% on the Llama-3.2-3B-Instruct model, while keeping the model size virtually unchanged.
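The core mechanism described in the abstract is the LAPE metric: each neuron is scored by the entropy of its activation probabilities, normalized across language-specific corpora, and a low entropy value indicates a neuron that activates for only a few languages. The sketch below illustrates that computation together with a simple Korean-neuron selection rule; the function names, quantile thresholds, and the toy activation table are illustrative assumptions based only on the abstract, not the authors' implementation.

```python
import numpy as np

def lape_scores(activation_probs: np.ndarray) -> np.ndarray:
    """Language Activation Probability Entropy (LAPE) per neuron.

    activation_probs has shape (num_languages, num_neurons); entry [k, j]
    is the probability that neuron j fires (activation > 0) on a corpus of
    language k. Low entropy means the neuron activates for only a few
    languages, i.e. it is language-specific.
    """
    eps = 1e-12
    # Normalize each neuron's activation probabilities over the languages.
    p = activation_probs / (activation_probs.sum(axis=0, keepdims=True) + eps)
    # Entropy across the language dimension.
    return -(p * np.log(p + eps)).sum(axis=0)

def korean_specific_neurons(activation_probs, korean_idx,
                            entropy_q=0.25, prob_q=0.75):
    """Neurons with low LAPE whose activation concentrates on Korean.

    The quantile thresholds here are illustrative assumptions, not values
    taken from the paper.
    """
    lape = lape_scores(activation_probs)
    low_entropy = lape <= np.quantile(lape, entropy_q)
    ko_active = activation_probs[korean_idx] >= np.quantile(
        activation_probs[korean_idx], prob_q)
    return np.where(low_entropy & ko_active)[0]

# Toy activation-probability table: rows = (English, Korean, Chinese),
# columns = 8 neurons. Neurons 1 and 5 fire almost only on Korean text.
probs = np.array([
    [0.60, 0.05, 0.70, 0.50, 0.40, 0.02, 0.60, 0.50],  # English
    [0.50, 0.90, 0.60, 0.40, 0.50, 0.85, 0.50, 0.40],  # Korean
    [0.60, 0.04, 0.70, 0.60, 0.50, 0.03, 0.60, 0.50],  # Chinese
])
print(korean_specific_neurons(probs, korean_idx=1))  # [1 5]
```

In an AWQ-style pipeline, the returned indices would mark weight channels to protect during low-bit quantization (for example, by keeping them at higher precision or assigning them larger protection scales), and, as the abstract notes, the LAPE analysis would be restricted to deeper layers where language-specific activation patterns tend to emerge.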

  Cite this article

[IEEE Style]

J. Lee and J. Y. Choi, "A Study on Optimizing Korean Performance of Quantized Multilingual LLMs through Language-Specific Neuron Preservation," The Transactions of the Korea Information Processing Society, vol. 14, no. 9, pp. 687-694, 2025. DOI: https://doi.org/10.3745/TKIPS.2025.14.9.687.

[ACM Style]

Jaeyoung Lee and Jin Young Choi. 2025. A Study on Optimizing Korean Performance of Quantized Multilingual LLMs through Language-Specific Neuron Preservation. The Transactions of the Korea Information Processing Society, 14, 9, (2025), 687-694. DOI: https://doi.org/10.3745/TKIPS.2025.14.9.687.