A Reverse Segmentation Algorithm of Compound Nouns 


Vol. 8,  No. 4, pp. 357-364, Aug.  2001
10.3745/KIPSTB.2001.8.4.357


  Abstract

In this paper, we propose a new segmentation algorithm for compound noun analysis in Korean. The algorithm segments a compound noun into a sequence of unit nouns and affixes using a unit noun dictionary and an affix dictionary. Because the head of a compound noun appears at the end of the word in most cases, the proposed algorithm segments the given compound noun from the end of the word toward the beginning. To evaluate the accuracy of the proposed algorithm, an experiment was conducted with 3,230 compound nouns extracted from the ETRI tagged corpus. The experimental results show that the accuracy of the proposed method is 96.6% on average. For compound nouns containing unknown words, the accuracy drops to 77.5%. The experiment also shows that the proposed algorithm outperforms other methods on compound nouns containing unknown words.
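The abstract does not spell out the procedure, so the following is only a minimal sketch of right-to-left segmentation under stated assumptions: it tries the longest dictionary match at the end of the word first and backtracks to shorter matches when the remainder cannot be segmented. The function name `reverse_segment`, the toy dictionaries, and the backtracking strategy are illustrative assumptions, not the authors' implementation, and the sketch does not attempt the unknown-word handling the paper evaluates.

```python
from typing import List, Optional, Set


def reverse_segment(word: str, nouns: Set[str], affixes: Set[str]) -> Optional[List[str]]:
    """Segment `word` from its end toward its beginning.

    Returns the list of units in reading order, or None when the word
    cannot be fully covered by dictionary entries (hypothetical behavior;
    the paper handles unknown words in a way not described in the abstract).
    """
    if not word:
        return []
    # start = 0 gives the longest suffix (the whole word); larger values
    # of start give progressively shorter suffixes.
    for start in range(len(word)):
        tail = word[start:]
        if tail in nouns or tail in affixes:
            # Recurse on what is left in front of the matched suffix.
            head = reverse_segment(word[:start], nouns, affixes)
            if head is not None:
                return head + [tail]
    return None


# Toy dictionaries, assumed for illustration only.
NOUNS = {"정보", "검색", "시스템", "과학"}
AFFIXES = {"적"}

print(reverse_segment("정보검색시스템", NOUNS, AFFIXES))  # ['정보', '검색', '시스템']
print(reverse_segment("과학적", NOUNS, AFFIXES))          # ['과학', '적']
```

Matching from the end means the rightmost unit, which is usually the head, is fixed first. The plain recursion is adequate for short words; memoizing on the prefix would be a natural refinement for longer inputs.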

  Cite this article

[IEEE Style]

H. M. Lee and H. R. Park, "A Reverse Segmentation Algorithm of Compound Nouns," The KIPS Transactions:PartB, vol. 8, no. 4, pp. 357-364, 2001. DOI: 10.3745/KIPSTB.2001.8.4.357.

[ACM Style]

Hyun Min Lee and Hyuk Ro Park. 2001. A Reverse Segmentation Algorithm of Compound Nouns. The KIPS Transactions:PartB, 8, 4, (2001), 357-364. DOI: 10.3745/KIPSTB.2001.8.4.357.