Publication Number

IJIRCT1201096

 

Page Numbers

453-455

Paper Details

Document Clustering Using Side Information for Mining Text Data

Authors

Kiran D. Gavali, Saurabh S. Kamthe, Ganesh Pingale, Prasad Vedpathak

Abstract

In many text mining applications, side information is available alongside the text documents. Such side information can be of various types, such as document provenance information, the links within the document, user-access behavior from web logs, or other non-textual attributes that are embedded in the text document. Such attributes can contain a large amount of information for clustering purposes. However, the relative importance of this side information can be difficult to estimate, especially when some of the information is noisy. In such cases, it can be risky to incorporate side information into the text mining process, because it can either improve the quality of the representation used for mining or add noise. Therefore, we need a principled way to perform the mining process so as to maximize the benefits of using this side information. In this paper, we design an algorithm that combines classical partitioning algorithms with probabilistic models in order to create an effective clustering approach. We then show how to extend the approach to the classification problem. We present experimental results on a number of real data sets in order to illustrate the benefits of using such an approach.
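The idea of combining a partitioning step with a probabilistic model of side information can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify in detail. It assumes a k-means-style assignment by cosine similarity, followed by a smoothed estimate of P(cluster | attribute) that re-weights each assignment. All names (`cluster`, `side_posterior`, the example documents and the `finance_site` / `sports_site` attributes) are hypothetical.

```python
import math
from collections import Counter, defaultdict

def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def centroid(vectors):
    """Mean of a non-empty list of sparse vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return {t: w / len(vectors) for t, w in total.items()}

def side_posterior(attrs, labels, k, smoothing=1.0):
    """Estimate P(cluster | attribute) from current labels (additive smoothing)."""
    counts = defaultdict(lambda: [0.0] * k)
    for attr_set, c in zip(attrs, labels):
        for a in attr_set:
            counts[a][c] += 1.0
    return {
        a: [(x + smoothing) / (sum(row) + smoothing * k) for x in row]
        for a, row in counts.items()
    }

def cluster(texts, attrs, seeds, iterations=10, eps=1e-9):
    """Alternate content-based partitioning with side-information re-weighting."""
    vecs = [Counter(t.lower().split()) for t in texts]
    k = len(seeds)
    cents = [dict(vecs[i]) for i in seeds]
    # Initial assignment uses text content only.
    labels = [max(range(k), key=lambda c: cosine(v, cents[c])) for v in vecs]
    for _ in range(iterations):
        post = side_posterior(attrs, labels, k)
        new_labels = []
        for v, attr_set in zip(vecs, attrs):
            # Log-score: content similarity plus side-information evidence.
            scores = [
                math.log(cosine(v, cents[c]) + eps)
                + sum(math.log(post[a][c]) for a in attr_set)
                for c in range(k)
            ]
            new_labels.append(max(range(k), key=lambda c: scores[c]))
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, l in zip(vecs, labels) if l == c]
            if members:  # keep the old centroid if a cluster empties
                cents[c] = centroid(members)
    return labels

# Hypothetical data: the last document shares no terms with either seed.
texts = [
    "stock market shares trading",
    "market shares investors trading",
    "football match goal team",
    "team goal league match",
    "results announced today",
]
attrs = [{"finance_site"}, {"finance_site"},
         {"sports_site"}, {"sports_site"}, {"sports_site"}]
labels = cluster(texts, attrs, seeds=[0, 2])
print(labels)
```

In this toy setup the last document cannot be placed by its text alone, so the side attribute decides its cluster. A noisy attribute would pull assignments the wrong way in the same manner, which is the risk the abstract describes.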

Keywords

Classification, Text Mining, Side Information, Data mining, Clustering.

 

Citation

Document Clustering Using Side Information for Mining Text Data. Kiran D. Gavali, Saurabh S. Kamthe, Ganesh Pingale, Prasad Vedpathak. 2016. IJIRCT, Volume 1, Issue 5. Pages 453-455. https://www.ijirct.org/viewPaper.php?paperId=IJIRCT1201096
