A Protein Sequence Prediction Method by Mining Sequence Data 


Vol. 10, No. 2, pp. 261-266, Apr. 2003
10.3745/KIPSTD.2003.10.2.261


  Abstract

A protein, a linear polymer of amino acids, is one of the most important biomolecules: proteins compose biological structures and regulate biochemical reactions. Because the characteristics and functions of a protein are determined, in principle, by its amino acid sequence, determining that sequence is the starting point of protein function studies. This paper proposes a protein sequence prediction method based on data mining techniques that can overcome the limitations of previous biochemical sequencing methods. After multiple proteases are applied to obtain overlapping protein fragments, candidate sequences for each fragment are identified by comparing the fragment mass values against peptide databases. We propose a method that constructs a multipartite graph from these candidates and searches for maximal paths, so that the protein sequence is determined by assembling the proper candidate sequences. Experimental results on the SWISS-PROT database that show the validity of the proposed method are also presented.
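
The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how graph edges are defined or how paths are ranked, so the sketch assumes that each measured fragment forms one part of the graph, that an edge joins two candidates from different parts when a suffix of one equals a prefix of the other (at least `min_overlap` residues), and that the best path is the one covering the most fragments. The peptide database, the fragment list, the mass tolerance, and the function names (`peptide_mass`, `candidates`, `overlap`, `assemble`) are all hypothetical, and the "measured" masses are computed from toy fragments rather than taken from an experiment.

```python
# Monoisotopic residue masses (Da) for the few amino acids used in this toy example.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "V": 99.06841, "L": 113.08406,
    "K": 128.09496, "E": 129.04259, "F": 147.06841, "R": 156.10111,
}
WATER = 18.01056  # one water is added per free peptide


def peptide_mass(seq):
    """Monoisotopic mass of a peptide given as a one-letter string."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def candidates(measured_mass, peptide_db, tol=0.01):
    """Database peptides whose mass matches the measured mass within tol (Da)."""
    return [p for p in peptide_db if abs(peptide_mass(p) - measured_mass) <= tol]


def overlap(a, b, min_len=2):
    """Length of the longest suffix of a that is a prefix of b, or 0 if shorter than min_len."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def assemble(parts, min_overlap=2):
    """Assumed assembly rule: return the merged sequences of the paths that
    use at most one candidate per part and cover the most parts."""
    nodes = [(i, s) for i, cands in enumerate(parts) for s in cands]
    edges = {u: [] for u in nodes}
    for u in nodes:
        for v in nodes:
            if u[0] != v[0]:  # multipartite: no edges inside a part
                k = overlap(u[1], v[1], min_overlap)
                if k:
                    edges[u].append((v, k))

    best = {"size": 0, "seqs": []}

    def extend(node, used, seq):
        extended = False
        for nxt, k in edges[node]:
            if nxt[0] not in used:
                extended = True
                extend(nxt, used | {nxt[0]}, seq + nxt[1][k:])
        if not extended:  # the path cannot grow further: record it
            if len(used) > best["size"]:
                best["size"], best["seqs"] = len(used), [seq]
            elif len(used) == best["size"] and seq not in best["seqs"]:
                best["seqs"].append(seq)

    for start in nodes:
        extend(start, {start[0]}, start[1])
    return best["seqs"]


# Hypothetical peptide database: true fragments, same-mass decoys, unrelated peptides.
PEPTIDE_DB = ["AGEVK", "AGVEK", "VKLSE", "KVLSE", "LSEFR", "SLEFR",
              "FRGAE", "RFGAE", "GAEK", "AGEK", "GASK", "LLEVR"]

# Simulated measurements: masses of overlapping fragments, in arbitrary order.
measured = [peptide_mass(f) for f in ["LSEFR", "AGEVK", "GAEK", "VKLSE", "FRGAE"]]

# One part per measured fragment; mass alone cannot separate same-composition peptides.
parts = [candidates(m, PEPTIDE_DB) for m in measured]

print(assemble(parts))  # ['AGEVKLSEFRGAEK']
```

In this toy case every mass matches two database peptides with the same composition, and only the overlap structure singles out the correct one. With real data several paths may tie, which is why `assemble` returns a list rather than a single sequence.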

  Cite this article

[IEEE Style]

S. I. Cho, D. H. Lee, K. H. Cho, Y. G. Won, B. K. Kim, "A Protein Sequence Prediction Method by Mining Sequence Data," The KIPS Transactions:PartD, vol. 10, no. 2, pp. 261-266, 2003. DOI: 10.3745/KIPSTD.2003.10.2.261.

[ACM Style]

Sun I Cho, Do Heon Lee, Kwang Hwi Cho, Yong Gwan Won, and Byoung Ki Kim. 2003. A Protein Sequence Prediction Method by Mining Sequence Data. The KIPS Transactions:PartD, 10, 2, (2003), 261-266. DOI: 10.3745/KIPSTD.2003.10.2.261.