Resampling Feedback Documents Using Overlapping Clusters 


Vol. 16,  No. 3, pp. 247-256, Jun.  2009
10.3745/KIPSTB.2009.16.3.247


  Abstract

Typical pseudo-relevance feedback methods assume that the top-retrieved documents are relevant and use these pseudo-relevant documents to expand terms. The initial retrieval set can, however, contain a great deal of noise. In this paper, we present a cluster-based resampling method to select better pseudo-relevant documents based on the relevance model. The main idea is to use document clusters to find dominant documents for the initial retrieval set, and to feed these documents back repeatedly to emphasize the core topics of a query. Experimental results on large-scale TREC web collections show significant improvements over the relevance model. To justify the resampling approach, we examine the relevance density of the feedback documents. The resampling approach shows higher relevance density than the baseline relevance model on all collections, resulting in better retrieval accuracy in pseudo-relevance feedback. This result indicates that the proposed method is effective for pseudo-relevance feedback.
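The resampling idea can be illustrated with a minimal Python sketch. It is not the paper's implementation. The abstract does not specify how clusters are built or scored, so the nearest-neighbour clusters, the cosine similarity, the cluster score (mean initial retrieval score of the members), and the names `cosine`, `resample_feedback`, `k`, and `top_clusters` are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two term-frequency mappings."""
    dot = sum(w * b.get(t, 0) for t, w in a.items())
    na = sqrt(sum(w * w for w in a.values()))
    nb = sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def resample_feedback(docs, scores, k=1, top_clusters=3):
    """Return how often each document is fed back.

    docs:   {doc_id: {term: frequency}} for the initial retrieval set
    scores: {doc_id: initial retrieval score}
    """
    clusters = []
    for d in docs:
        # Assumed clustering: each document plus its k most similar
        # neighbours, so clusters may overlap.
        neighbours = sorted(
            (o for o in docs if o != d),
            key=lambda o: cosine(docs[d], docs[o]),
            reverse=True,
        )[:k]
        members = [d] + neighbours
        # Assumed cluster score: mean initial score of the members.
        score = sum(scores[m] for m in members) / len(members)
        clusters.append((score, members))
    clusters.sort(key=lambda c: c[0], reverse=True)
    # A document that belongs to several top-ranked clusters is counted
    # once per cluster, so dominant documents are fed back repeatedly.
    return Counter(m for _, members in clusters[:top_clusters] for m in members)
```

The returned counts could serve as document weights when estimating expansion terms. Under these assumptions, a document shared by many highly ranked clusters contributes more than one that appears only once.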



  Cite this article

[IEEE Style]

K. S. Lee, "Resampling Feedback Documents Using Overlapping Clusters," The KIPS Transactions:PartB, vol. 16, no. 3, pp. 247-256, 2009. DOI: 10.3745/KIPSTB.2009.16.3.247.

[ACM Style]

Kyung Soon Lee. 2009. Resampling Feedback Documents Using Overlapping Clusters. The KIPS Transactions:PartB, 16, 3, (2009), 247-256. DOI: 10.3745/KIPSTB.2009.16.3.247.