Modified Cosine Similarity Measure based Data Classification in Data Mining
D. Mabuni

D. Mabuni*, Department of Computer Science, Dravidian University, Kuppam, India. 

Manuscript received on May 29, 2020. | Revised Manuscript received on June 22, 2020. | Manuscript published on June 30, 2020. | PP: 649-654 | Volume-9 Issue-5, June 2020. | Retrieval Number: E9754069520/2020©BEIESP | DOI: 10.35940/ijeat.E9754.069520
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Text data analytics has become an integral part of World Wide Web data management and of Internet-based applications, which are growing rapidly all over the world. E-commerce applications are growing exponentially in the business field, and competitors in e-commerce are gradually increasing their use of machine learning techniques to predict business-related operations, with the aim of increasing product sales. Similarity measures are indispensable in modern real-world applications. Cosine similarity plays a dominant role in text data mining applications such as text classification, clustering, querying, and searching. A modified clustering-based cosine similarity measure called MCS is proposed in this paper for data classification. The proposed method is experimentally verified on many UCI machine learning datasets involving categorical attributes. It produces more accurate classification results in the majority of the experiments conducted on these datasets.
Keywords: Clustering based similarity measure, Modified cosine similarity measure, Text classification, Text clustering.