Backdoor Attacks on Generative Language Models in Disaster Situations: Attack Scenarios and Empirical Analysis 


Vol. 14,  No. 7, pp. 533-541, Jul.  2025
https://doi.org/10.3745/TKIPS.2025.14.7.533


PDF
  Abstract

With the remarkable advancement of artificial intelligence, AI models have begun to be used across various fields, while simultaneously revealing security vulnerabilities within AI itself. This paper assumes a situation in which a backdoor is inserted into a generative model used in disaster situations through data poisoning and transfer learning. By means of this backdoor, the model produces normal outputs for general inputs, yet deliberately yields manipulated outputs for inputs containing a specific trigger, thereby potentially causing social confusion. We experimentally demonstrate this possibility and emphasize its associated risks.

  Statistics


  Cite this article

[IEEE Style]

J. Kang and J. Lee, "Backdoor Attacks on Generative Language Models in Disaster Situations: Attack Scenarios and Empirical Analysis," The Transactions of the Korea Information Processing Society, vol. 14, no. 7, pp. 533-541, 2025. DOI: https://doi.org/10.3745/TKIPS.2025.14.7.533.

[ACM Style]

Jongyoung Kang and Jiyeon Lee. 2025. Backdoor Attacks on Generative Language Models in Disaster Situations: Attack Scenarios and Empirical Analysis. The Transactions of the Korea Information Processing Society, 14, 7, (2025), 533-541. DOI: https://doi.org/10.3745/TKIPS.2025.14.7.533.