A Document Ranking Method by Document Clustering Using Bayesian SOM and Bootstrap 


Vol. 7,  No. 7, pp. 2108-2115, Jul.  2000
10.3745/KIPSTE.2000.7.7.2108


  Abstract

Although conventional Boolean retrieval systems based on the vector space model return results quickly, they cannot accurately reflect the user's retrieval purpose, including semantic information. Consequently, the retrieved results often differ greatly from what users expected, which forces users to spend considerable time finding the desired documents among those retrieved. In this paper, we design a Bayesian SOM (Self-Organizing feature Map) that combines Bayesian statistical methods with the Kohonen network, a form of unsupervised learning, and use it to classify documents in real time according to their semantic similarity to the user query. When fewer than 30 documents are available for clustering, which makes statistical characteristics difficult to observe, the number of documents must be increased to at least 50. In addition, to give a high rank to the documents that are semantically most similar to the user query within the generalized clusters, we compute the similarity using the Kohonen centroid of each document class and adjust the secondary rank according to that similarity.
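The two steps the abstract mentions, enlarging a small document set and ranking by centroid similarity, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the bootstrap named in the title is plain resampling with replacement, that an arithmetic mean stands in for the Kohonen centroid, and that cosine similarity is the similarity measure. All function names and parameters are hypothetical.

```python
import math
import random


def bootstrap_documents(docs, min_observed=30, target=50, seed=0):
    """Pad a small document set to `target` items by resampling with replacement.

    Assumption: the thresholds 30 and 50 come from the abstract; the
    resampling scheme itself is a guess at what "bootstrap" means here.
    """
    if not docs:
        raise ValueError("at least one document is required")
    if len(docs) >= min_observed:
        return list(docs)
    rng = random.Random(seed)
    extra = [rng.choice(docs) for _ in range(target - len(docs))]
    return list(docs) + extra


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Arithmetic mean of the vectors, used here as a stand-in for a Kohonen centroid."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def rank_clusters(query, clusters):
    """Return cluster ids ordered by centroid similarity to the query, highest first."""
    scores = {cid: cosine(query, centroid(vecs)) for cid, vecs in clusters.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

For example, `bootstrap_documents` called on three term vectors returns 50 vectors drawn from those three, and `rank_clusters` orders the clusters so that the one whose centroid points most nearly in the query's direction comes first.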


  Cite this article

[IEEE Style]

J. H. Choi, S. H. Jun, J. H. Lee, "A Document Ranking Method by Document Clustering Using Bayesian SOM and Bootstrap," The Transactions of the Korea Information Processing Society (1994 ~ 2000), vol. 7, no. 7, pp. 2108-2115, 2000. DOI: 10.3745/KIPSTE.2000.7.7.2108.

[ACM Style]

Jun Hyeog Choi, Sung Hae Jun, and Jung Hyun Lee. 2000. A Document Ranking Method by Document Clustering Using Bayesian SOM and Bootstrap. The Transactions of the Korea Information Processing Society (1994 ~ 2000), 7, 7, (2000), 2108-2115. DOI: 10.3745/KIPSTE.2000.7.7.2108.