Accelerating the EM Algorithm through Selective Sampling for Naive Bayes Text Classifier 


Vol. 13,  No. 3, pp. 369-376, Jun.  2006
10.3745/KIPSTD.2006.13.3.369


  Abstract

This paper presents a new method that significantly improves the conventional Bayesian statistical text classifier by incorporating an accelerated EM (Expectation Maximization) algorithm. The EM algorithm suffers from slow convergence and performance degradation during its iterative process, especially when real online textual documents do not follow EM's assumptions. In this study, we propose a new accelerated EM algorithm with uncertainty-based selective sampling, which is simple yet converges quickly and allows a more accurate classification model to be estimated for the Naive Bayes text classifier. Experiments on the popular Reuters-21578 document collection showed that the proposed algorithm effectively improves classification accuracy.
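
The abstract does not spell out the selection rule, so the following is only a minimal sketch of one plausible reading: semi-supervised EM for a multinomial Naive Bayes classifier in which, at each iteration, only unlabeled documents whose predicted class posterior has low entropy (that is, low uncertainty) contribute to the M-step. The function names, the entropy threshold `max_entropy`, the Laplace smoothing constant `alpha`, and the direction of the selection are all assumptions made for illustration and may differ from the paper's actual method.

```python
import math
from collections import Counter


def train_nb(docs, weights, n_classes, vocab, alpha=1.0):
    """M-step: estimate log priors and log word likelihoods from soft class weights."""
    class_mass = [alpha] * n_classes
    word_mass = [Counter() for _ in range(n_classes)]
    for doc, w in zip(docs, weights):
        counts = Counter(doc)
        for c in range(n_classes):
            class_mass[c] += w[c]
            for word, n in counts.items():
                word_mass[c][word] += w[c] * n
    total = sum(class_mass)
    log_prior = [math.log(m / total) for m in class_mass]
    log_like = []
    for c in range(n_classes):
        denom = sum(word_mass[c].values()) + alpha * len(vocab)
        log_like.append(
            {t: math.log((word_mass[c][t] + alpha) / denom) for t in vocab}
        )
    return log_prior, log_like


def posterior(doc, model):
    """E-step for one document: class posterior under the current model."""
    log_prior, log_like = model
    scores = [
        lp + sum(ll[t] for t in doc if t in ll)
        for lp, ll in zip(log_prior, log_like)
    ]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def entropy(p):
    """Shannon entropy (in nats), used here as the uncertainty measure."""
    return -sum(x * math.log(x) for x in p if x > 0)


def em_selective(labeled, labels, unlabeled, n_classes, max_entropy=0.6, n_iter=10):
    """EM over labeled + unlabeled docs, skipping high-uncertainty unlabeled docs.

    Assumption: documents with posterior entropy above max_entropy are left
    out of the M-step for that iteration (hypothetical selection rule).
    """
    vocab = {t for d in labeled + unlabeled for t in d}
    hard = [[1.0 if c == y else 0.0 for c in range(n_classes)] for y in labels]
    model = train_nb(labeled, hard, n_classes, vocab)  # start from labeled data only
    for _ in range(n_iter):
        post = [posterior(d, model) for d in unlabeled]
        keep = [(d, p) for d, p in zip(unlabeled, post) if entropy(p) <= max_entropy]
        docs = labeled + [d for d, _ in keep]
        weights = hard + [p for _, p in keep]
        model = train_nb(docs, weights, n_classes, vocab)
    return model
```

A call such as `em_selective(labeled_docs, labels, unlabeled_docs, n_classes=2)` takes tokenized documents (lists of strings) and returns the model consumed by `posterior`. Restricting the M-step to a subset of the unlabeled pool is what would reduce the per-iteration cost in this sketch; whether it also improves accuracy depends on the data and on the threshold.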

  Cite this article

[IEEE Style]

J. Y. Chang and H. J. Kim, "Accelerating the EM Algorithm through Selective Sampling for Naive Bayes Text Classifier," The KIPS Transactions:PartD, vol. 13, no. 3, pp. 369-376, 2006. DOI: 10.3745/KIPSTD.2006.13.3.369.

[ACM Style]

Jae Young Chang and Han Joon Kim. 2006. Accelerating the EM Algorithm through Selective Sampling for Naive Bayes Text Classifier. The KIPS Transactions:PartD, 13, 3, (2006), 369-376. DOI: 10.3745/KIPSTD.2006.13.3.369.