Construction of Linearly Aligned Corpus Using Unsupervised Learning


Vol. 11, No. 3, pp. 387-394, Jun. 2004
10.3745/KIPSTB.2004.11.3.387


  Abstract

In this paper, we propose a modified unsupervised linear alignment algorithm for building an aligned corpus. The original algorithm inserts null characters into both aligned strings (the source string and the target string) because the two strings differ in length. This causes difficulties for applications that use the aligned corpus: the null characters can make the search space explode, and they prevent several machine learning algorithms from being applied. To alleviate these difficulties, we modify the algorithm so that the aligned source strings contain no null characters. We demonstrate the usability of our approach by applying it to several areas: Korean-English back-transliteration, English grapheme-phoneme conversion, and Korean morphological analysis.
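
The null-free source representation described above can be illustrated with a small sketch. This is not the authors' algorithm: it assumes a hand-written toy scoring table in place of parameters learned without supervision, and the names `align`, `toy_score`, and `max_chunk` are hypothetical. It only shows the shape of the output, in which every source character is paired with a contiguous (possibly empty) chunk of the target string, so the source side never needs a null character.

```python
from functools import lru_cache

NEG_INF = float("-inf")

def align(source, target, score, max_chunk=3):
    """Split `target` into len(source) contiguous chunks (possibly empty),
    one per source character, maximizing the summed score.
    Returns a list of (source_char, target_chunk) pairs, or None."""
    n, m = len(source), len(target)

    @lru_cache(maxsize=None)
    def best(i, j):
        # Best (score, chunks) for aligning source[i:] with target[j:].
        if i == n:
            return (0.0, ()) if j == m else (NEG_INF, ())
        result = (NEG_INF, ())
        for k in range(min(max_chunk, m - j) + 1):
            rest_score, rest_chunks = best(i + 1, j + k)
            if rest_score == NEG_INF:
                continue  # the remaining target cannot be consumed
            chunk = target[j:j + k]
            total = score(source[i], chunk) + rest_score
            if total > result[0]:
                result = (total, (chunk,) + rest_chunks)
        return result

    total, chunks = best(0, 0)
    return None if total == NEG_INF else list(zip(source, chunks))

# Toy scores (an assumption for illustration, not learned values).
TABLE = {("p", "f"): 2, ("h", ""): 1, ("o", "o"): 2, ("n", "n"): 2,
         ("e", ""): 1, ("b", "b"): 2, ("x", "ks"): 2}

def toy_score(src_char, tgt_chunk):
    # Unknown pairings get a flat penalty.
    return TABLE.get((src_char, tgt_chunk), -5)

print(align("phone", "fon", toy_score))
print(align("box", "boks", toy_score))
```

In this sketch a source character that produces nothing is paired with an empty target chunk, and a source character that produces several target characters is paired with a longer chunk, so the aligned sequence always has exactly as many positions as the source string.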


  Cite this article

[IEEE Style]

K. J. Lee and J. H. Kim, "Construction of Linearly Aligned Corpus Using Unsupervised Learning," The KIPS Transactions:PartB, vol. 11, no. 3, pp. 387-394, 2004. DOI: 10.3745/KIPSTB.2004.11.3.387.

[ACM Style]

Kong Joo Lee and Jae Hoon Kim. 2004. Construction of Linearly Aligned Corpus Using Unsupervised Learning. The KIPS Transactions:PartB, 11, 3, (2004), 387-394. DOI: 10.3745/KIPSTB.2004.11.3.387.