Published in:

Volume 1 Issue 5
eISSN: 2454-5988


Document Clustering Using Side Information for Mining Text Data


In many text mining applications, side information is available alongside the text documents. Such side information may be of various kinds, such as document provenance information, the links within the document, user-access behavior from web logs, or other non-textual attributes that are embedded in the text document. Such attributes may contain a tremendous amount of information for clustering purposes. However, the relative importance of this side information can be difficult to estimate, especially when some of the information is noisy. In such cases, it can be risky to incorporate side information into the text mining process, because it can either improve the quality of the representation used for mining or add noise to it. Therefore, we need a principled way to perform the mining process so as to maximize the benefits of using this side information. In this paper, we design an algorithm that combines classical partitioning algorithms with probabilistic models in order to create an effective clustering approach. We then show how to extend the approach to the classification problem. We present experimental results on a number of real data sets in order to illustrate the benefits of using such an approach.

Key Words

Classification, Text Mining, Side Information, Data Mining, Clustering.

