A Korean Homonym Disambiguation System Using Refined Semantic Information and Thesaurus 


Vol. 12, No. 7, pp. 829-840, Dec. 2005
10.3745/KIPSTB.2005.12.7.829


  Abstract

Word sense disambiguation (WSD) is one of the most difficult problems in Korean information processing. We propose a WSD model that filters semantic information using the specific characteristics of dictionary definitions and adds information useful for sense determination, such as statistical, distance, and case information. We also propose a model that resolves the issues resulting from the scarcity of semantic information data by using the word hierarchy system (thesaurus) of Ulsan University's UOU Word Intelligent Network, a dictionary-based lexicological database. Among the WSD models developed in this study, the one using statistical, distance, and case information together with the thesaurus (hereinafter referred to as the "SDJ-X model") performed best. In an experiment on the sense-tagged corpus of 1,500,000 eojeols provided by the Sejong Project, the SDJ-X model improved on the accuracy baseline, maximum-frequency word sense determination (MFC), by 18.87% (21.73% for nouns, 17.11% for verbs). Its accuracy exceeded that of the model using statistical and inter-eojeol distance weights by 10.49% (8.84% for nouns, 11.51% for verbs). Finally, the SDJ-X model was more accurate than the model using only statistical, distance, and case information without the thesaurus, by a margin of 6.12% (5.29% for nouns, 6.64% for verbs).
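
For readers unfamiliar with the baseline, a maximum-frequency sense baseline of the kind named in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names `train_mfs` and `accuracy`, the romanized words, and the toy sense labels are all hypothetical, and the data is not drawn from the Sejong corpus.

```python
from collections import Counter, defaultdict


def train_mfs(tagged):
    """Map each word to its most frequent sense.

    `tagged` is an iterable of (word, sense) pairs taken from a
    sense-tagged corpus (toy data here, assumed for illustration).
    """
    counts = defaultdict(Counter)
    for word, sense in tagged:
        counts[word][sense] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def accuracy(model, tagged):
    """Fraction of (word, sense) pairs whose sense the model predicts."""
    tagged = list(tagged)
    if not tagged:
        return 0.0
    correct = sum(1 for word, sense in tagged if model.get(word) == sense)
    return correct / len(tagged)


# Hypothetical toy data: romanized Korean homonyms with made-up sense labels.
train = [("bae", "pear"), ("bae", "ship"), ("bae", "ship"), ("nun", "eye")]
test = [("bae", "ship"), ("bae", "pear"), ("nun", "eye"), ("nun", "snow")]

model = train_mfs(train)
baseline_accuracy = accuracy(model, test)
```

A baseline like this ignores context entirely, which is why models that add contextual evidence (co-occurrence statistics, distance, case information, or a thesaurus, as in the abstract) are reported as improvements over it.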

  Cite this article

[IEEE Style]

J. S. Kim and C. Y. Ock, "A Korean Homonym Disambiguation System Using Refined Semantic Information and Thesaurus," The KIPS Transactions: Part B, vol. 12, no. 7, pp. 829-840, 2005. DOI: 10.3745/KIPSTB.2005.12.7.829.

[ACM Style]

Jun Su Kim and Cheol Young Ock. 2005. A Korean Homonym Disambiguation System Using Refined Semantic Information and Thesaurus. The KIPS Transactions: Part B, 12, 7, (2005), 829-840. DOI: 10.3745/KIPSTB.2005.12.7.829.