Study on the Generation Methods of Composition Noun for Efficient Index Term Extraction 


Vol. 7,  No. 4, pp. 1122-1131, Apr.  2000
10.3745/KIPSTE.2000.7.4.1122


  Abstract

The efficiency of an information search or automatic indexing system depends on how accurately it extracts index terms, so accurate index term extraction is of utmost importance. This paper presents methods of generating composition (compound) nouns from high-frequency words for efficient index term extraction, so that the right documents can be found during information search. In the proposed method, compound-noun index terms are extracted by applying composition and decomposition rules to the nouns that appear most frequently in a document, namely those in the upper 30% to 40% by frequency ratio. To validate composing nouns from this upper 30% to 40% band, the paper also proposes an adequate frequency ratio to use during noun composition. When the method was applied to short documents of fewer than 300 syllables, low-frequency omissions were observed when compounds were composed from nouns in the upper 30% by frequency ratio, whereas for documents of more than 300 syllables, composing from nouns in the upper 40% considerably reduced such omissions. As a result, the total number of index terms decreased to 57.7% of the existing number, and index terms could be extracted accurately, with an adequacy ratio of 85.6%.
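
As a rough illustration only, the frequency-based selection described in the abstract might be sketched as follows. The function names, the reading of "upper 30%" as the top 30% of distinct nouns ranked by count, and the rule of pairing adjacent high-frequency nouns are all assumptions made for this sketch; the abstract does not state the paper's actual composition and decomposition rules.

```python
import math
from collections import Counter


def top_fraction_nouns(nouns, fraction):
    """Return the distinct nouns ranked in the upper `fraction` by count.

    Assumption: "upper 30%" means the top 30% of distinct nouns by
    frequency rank; ties are broken alphabetically for determinism.
    """
    counts = Counter(nouns)
    ranked = sorted(counts, key=lambda n: (-counts[n], n))
    keep = max(1, math.ceil(len(ranked) * fraction))
    return set(ranked[:keep])


def compound_candidates(nouns, fraction):
    """Count adjacent pairs in which both nouns are high-frequency.

    Assumption: a compound candidate is two different high-frequency
    nouns that appear next to each other in the extracted noun sequence.
    """
    high = top_fraction_nouns(nouns, fraction)
    pairs = Counter()
    for left, right in zip(nouns, nouns[1:]):
        if left in high and right in high and left != right:
            pairs[(left, right)] += 1
    return pairs
```

Calling compound_candidates with fraction 0.30 and then 0.40 on the same noun sequence shows how widening the frequency band admits more candidate pairs, which is the trade-off the abstract evaluates.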

  Cite this article

[IEEE Style]

M. J. Kim, M. S. Park, J. H. Choi, S. J. Lee, "Study on the Generation Methods of Composition Noun for Efficient Index Term Extraction," The Transactions of the Korea Information Processing Society (1994 ~ 2000), vol. 7, no. 4, pp. 1122-1131, 2000. DOI: 10.3745/KIPSTE.2000.7.4.1122.

[ACM Style]

Mi Jin Kim, Mi Sung Park, Jae Hyuk Choi, and Sang Jo Lee. 2000. Study on the Generation Methods of Composition Noun for Efficient Index Term Extraction. The Transactions of the Korea Information Processing Society (1994 ~ 2000), 7, 4, (2000), 1122-1131. DOI: 10.3745/KIPSTE.2000.7.4.1122.