XML Document Clustering Based on Sequential Pattern 


Vol. 10, No. 7, pp. 1093-1102, Dec. 2003
10.3745/KIPSTD.2003.10.7.1093


  Abstract

As Internet use grows, the amount of information is increasing rapidly, and XML, a standard for web data, offers flexible data representation. Web-based electronic document systems such as EDMS (Electronic Document Management System) and ebXML (Electronic Business using eXtensible Markup Language) have therefore been adopting XML as the standard format for document exchange. This calls for research on methods that can manage and search structured XML documents effectively. In this paper we propose a clustering method based on the structural similarity among many XML documents, using typical structures extracted from each document by sequential pattern mining in a pre-clustering step. The proposed algorithm improves clustering accuracy by computing a cost that considers cluster cohesion and inter-cluster similarity.
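
The idea of comparing documents by their typical structures can be illustrated with a minimal Python sketch. It is not the algorithm proposed in the paper: the function names (`tag_paths`, `frequent_subsequences`, `jaccard`), the support threshold, the restriction to contiguous tag subsequences, and the use of Jaccard similarity are all assumptions made for illustration. The abstract does not specify the cost function based on cluster cohesion and inter-cluster similarity, so it is left out.

```python
import xml.etree.ElementTree as ET


def tag_paths(xml_text):
    """Return the root-to-leaf tag paths of one document as tuples of tag names."""
    root = ET.fromstring(xml_text)
    paths = []

    def walk(node, prefix):
        path = prefix + (node.tag,)
        children = list(node)
        if not children:
            paths.append(path)  # leaf reached: record the full path
        for child in children:
            walk(child, path)

    walk(root, ())
    return paths


def frequent_subsequences(paths, min_support=2, max_len=3):
    """Keep contiguous tag subsequences that occur in at least min_support paths.

    Assumption: contiguous subsequences are a simplification; general
    sequential pattern mining also allows gaps between items.
    """
    counts = {}
    for path in paths:
        seen = set()  # count each subsequence once per path
        for n in range(2, max_len + 1):
            for i in range(len(path) - n + 1):
                seen.add(path[i:i + n])
        for sub in seen:
            counts[sub] = counts.get(sub, 0) + 1
    return {sub for sub, c in counts.items() if c >= min_support}


def jaccard(a, b):
    """Jaccard similarity of two pattern sets (assumed similarity measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical documents: two order-like structures and one memo-like structure.
doc_a = "<order><item><name/><price/></item><item><name/><price/></item></order>"
doc_b = "<order><item><name/><qty/></item><item><name/><qty/></item></order>"
doc_c = "<memo><to><name/></to><to><name/></to></memo>"

pa, pb, pc = (frequent_subsequences(tag_paths(d)) for d in (doc_a, doc_b, doc_c))
sim_ab = jaccard(pa, pb)  # the two order documents share several patterns
sim_ac = jaccard(pa, pc)  # the order and memo documents share none
```

In this sketch the two order-like documents score higher than the order and memo pair, so a clustering step that groups documents by such pairwise similarities would place them together.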

  Cite this article

[IEEE Style]

H. J. Hui and L. G. Ho, "XML Document Clustering Based on Sequential Pattern," The KIPS Transactions:PartD, vol. 10, no. 7, pp. 1093-1102, 2003. DOI: 10.3745/KIPSTD.2003.10.7.1093.

[ACM Style]

Hwang Jeong Hui and Lyu Geun Ho. 2003. XML Document Clustering Based on Sequential Pattern. The KIPS Transactions:PartD, 10, 7, (2003), 1093-1102. DOI: 10.3745/KIPSTD.2003.10.7.1093.