ISSN (Online): 2456-0774

Email : ijasret@gmail.com

A Comparative Study of K-means and K-medoid Clustering for Social Media Text Mining

Abstract

Text mining is a technique for analyzing text or other unstructured data using data mining algorithms. Data mining algorithms work in two different ways: supervised and unsupervised. When the data comes with pre-defined patterns and the task is to find data with similar patterns, supervised approaches such as classification are used. When no such patterns are available, unsupervised approaches such as clustering are used. Unsupervised learning techniques group data objects on the basis of their internal similarity. Unsupervised learning is now used in many settings to recover patterns, trends, and similarities in text data. The aim of the proposed work is social media text analysis using unsupervised learning techniques. Two popular algorithms that differ only slightly from each other, k-means clustering and k-medoid clustering, are applied to text data to find the better-performing algorithm for text mining. The proposed approach consists of data pre-processing, feature selection, and clustering. During pre-processing, the data is refined and filtered to retain the required content; during feature selection, word frequencies are computed and the data is transformed into a 2D vector. Finally, the implemented algorithms are applied to the transformed data. The proposed technique is implemented in Java. Additionally, performance is measured in terms of inter-cluster similarity. According to the experimental evaluation, k-medoid clustering is efficient in terms of time and space complexity and is able to produce uniformly distributed clusters of data.
Keywords: social media text mining, unsupervised learning, clustering, k-means, k-medoid
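
The abstract describes k-means and k-medoid as only slightly different. As a minimal sketch of that difference (not the authors' implementation), the Java fragment below computes a cluster center both ways: k-means uses the coordinate-wise mean, which need not be an actual data point, while k-medoid picks the cluster member with the smallest total distance to the others. The class name `ClusterCenters`, the method names, the choice of Euclidean distance, and the sample points are all assumptions made for illustration; the abstract does not specify them.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; names and distance measure are illustrative assumptions.
final class ClusterCenters {

    /** Euclidean distance between two vectors of equal length. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** k-means style center: the coordinate-wise mean of the cluster. */
    static double[] centroid(List<double[]> cluster) {
        int dim = cluster.get(0).length;
        double[] mean = new double[dim];
        for (double[] p : cluster) {
            for (int i = 0; i < dim; i++) {
                mean[i] += p[i];
            }
        }
        for (int i = 0; i < dim; i++) {
            mean[i] /= cluster.size();
        }
        return mean;
    }

    /** k-medoid style center: the member with the smallest summed distance to all members. */
    static double[] medoid(List<double[]> cluster) {
        double[] best = null;
        double bestCost = Double.POSITIVE_INFINITY;
        for (double[] candidate : cluster) {
            double cost = 0.0;
            for (double[] other : cluster) {
                cost += distance(candidate, other);
            }
            if (cost < bestCost) {
                bestCost = cost;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up 2D feature vectors with one outlier at (20, 20).
        List<double[]> cluster = Arrays.asList(
                new double[] {1.0, 1.0},
                new double[] {2.0, 2.0},
                new double[] {3.0, 3.0},
                new double[] {4.0, 4.0},
                new double[] {20.0, 20.0});
        System.out.println("centroid = " + Arrays.toString(centroid(cluster)));
        System.out.println("medoid   = " + Arrays.toString(medoid(cluster)));
    }
}
```

In this made-up cluster the outlier pulls the mean to (6, 6), which is not one of the points, whereas the medoid remains the actual member (3, 3). The exhaustive medoid search shown here compares every pair of points in the cluster, so it is quadratic in cluster size.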

