A Comparative Study of Feature Importance Algorithms and Feature Selection for Static Feature-Based Ransomware Detection 


Vol. 14,  No. 8, pp. 576-587, Aug.  2025
https://doi.org/10.3745/TKIPS.2025.14.8.576


  Abstract

In this paper, we extract 54 static features from ransomware PE files, including header metadata, section sizes, and virtual memory sizes, and evaluate their importance using four algorithms: Gain Ratio, Information Gain, Gini Importance, and Mutual Information. For each algorithm, we select the top-K features to form a reduced feature set, which is then used to train and validate four classification models: Random Forest, Decision Tree, Support Vector Machine (SVM), and Multi-Layer Perceptron (MLP). Experimental results show that the Random Forest model, using 41 features selected by a Gain Ratio threshold of K = 0.01, achieves the highest accuracy of 99.33%. The Decision Tree, SVM, and MLP models also perform strongly, with accuracies of 98.67%, 96.67%, and 98.75%, respectively. These findings confirm that careful feature selection can substantially reduce computational costs while maintaining high detection accuracy.
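
As an illustration of the selection step described in the abstract, the following is a minimal sketch of Gain Ratio scoring followed by threshold-based selection, using only the Python standard library. It is not the authors' implementation: the feature names and toy data are hypothetical, the features are assumed to be discrete (continuous values such as section sizes would first need to be binned), and only the 0.01 cutoff is taken from the abstract.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def information_gain(feature, labels):
    """H(labels) - H(labels | feature) for one discrete feature column."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_ratio(feature, labels):
    """Information gain divided by the feature's own entropy (split information)."""
    split_info = entropy(feature)
    if split_info == 0:  # a constant feature cannot separate the classes
        return 0.0
    return information_gain(feature, labels) / split_info


def select_features(columns, labels, threshold=0.01):
    """Return the names of features whose Gain Ratio is >= threshold, best first."""
    scores = {name: gain_ratio(col, labels) for name, col in columns.items()}
    ranked = sorted(scores.items(), key=lambda item: -item[1])
    kept = [name for name, score in ranked if score >= threshold]
    return kept, scores


# Hypothetical toy data: 1 = ransomware, 0 = benign.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "has_high_entropy_section": [1, 1, 1, 1, 0, 0, 0, 0],  # matches the label exactly
    "num_sections_bucket": [0, 1, 0, 1, 0, 1, 0, 1],       # independent of the label
    "machine_type": [0, 0, 0, 0, 0, 0, 0, 0],              # constant
}

kept, scores = select_features(columns, labels, threshold=0.01)
print(kept)
```

In this toy case only the feature that matches the label survives the cutoff; the independent and constant features score zero and are dropped. The reduced column set would then be passed to whichever classifier is being evaluated.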

  Cite this article

[IEEE Style]

H. M. Jeon, D. S. Choi, and E. G. Im, "A Comparative Study of Feature Importance Algorithms and Feature Selection for Static Feature-Based Ransomware Detection," The Transactions of the Korea Information Processing Society, vol. 14, no. 8, pp. 576-587, Aug. 2025. DOI: https://doi.org/10.3745/TKIPS.2025.14.8.576.

[ACM Style]

Jeon Hye Min, Choi Doo Seop, and Im Eul Gyu. 2025. A Comparative Study of Feature Importance Algorithms and Feature Selection for Static Feature-Based Ransomware Detection. The Transactions of the Korea Information Processing Society, 14, 8, (2025), 576-587. DOI: https://doi.org/10.3745/TKIPS.2025.14.8.576.