MRF: Multivariate Data Clustering using Heuristic Data Intensive Computing and Relevance Feedback Learning Approach
M. Sankara Prasanna Kumar1, A. P. Siva Kumar2, K. Prasanna3
1M. Sankara Prasanna Kumar, Research Scholar, Assistant Professor, JNTUH Hyderabad, AITS Rajampet (A.P), India.
2A. P. Siva Kumar, Assistant Professor, Department of Computer Science and Engineering, JNTUCE, Anantapuramu (A.P), India.
3K. Prasanna, Associate Professor, Department of Information Technology, AITS, Rajampet (A.P), India.
Manuscript received on 29 May 2019 | Revised Manuscript received on 11 June 2019 | Manuscript Published on 22 June 2019 | PP: 858-862 | Volume-8 Issue-3S, February 2019 | Retrieval Number: C11800283S19/19©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Most real-world problems are multivariate, i.e., they involve many variables. Multivariate data comprise datasets with more than one variable. Multivariate datasets can dramatically change how data are used as database size increases, and they give adequate results when predicting the effect that a change in one variable will have on another. These datasets contain transitive and intrinsic hidden relationships among the variables, for example how one variable is influenced by other process variables and preferences. In this situation, efficient multivariate data analysis techniques are needed to catalog data of this type. Several techniques have been proposed and analyzed in the literature; one such technique is multivariate data clustering. This paper presents a unified framework for multivariate data clustering using heuristic data intensive computing and relevance feedback learning. The implementation starts by formalizing a heuristic data intensive computing (HDIC) model that is able to handle data flows. The data are then clustered with the proposed relevance feedback learning algorithm using consensus functions. These functions are selected according to the change in cluster ensemble selection, combination and reduction. The proposed approach uses distance functions such as Canberra, Chi-square and Cityblock. The empirical analysis shows that the proposed approach attains better cluster ensembles on various multivariate datasets taken from UCI and outperforms k-nearest neighbour (KNN) in different settings. The performance of the proposed approach is assessed with accuracy and F1-measure.
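The three distance functions named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas the authors used, so the definitions below are common textbook forms and should be read as assumptions: in particular, the Chi-square distance is taken here as the sum of (a - b)^2 / (a + b) over non-negative features, and terms with a zero denominator are skipped. The function names and sample vectors are hypothetical.

```python
from typing import Sequence


def cityblock(x: Sequence[float], y: Sequence[float]) -> float:
    """Cityblock (Manhattan) distance: sum of absolute differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def canberra(x: Sequence[float], y: Sequence[float]) -> float:
    """Canberra distance: sum of |a - b| / (|a| + |b|), skipping 0/0 terms."""
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom != 0:
            total += abs(a - b) / denom
    return total


def chi_square(x: Sequence[float], y: Sequence[float]) -> float:
    """Chi-square distance (assumed form): sum of (a - b)^2 / (a + b).

    Assumes non-negative features; terms with a zero denominator are skipped.
    """
    total = 0.0
    for a, b in zip(x, y):
        denom = a + b
        if denom != 0:
            total += (a - b) ** 2 / denom
    return total


if __name__ == "__main__":
    # Two hypothetical feature vectors from a multivariate dataset.
    p = [1.0, 2.0, 3.0]
    q = [2.0, 4.0, 6.0]
    for name, fn in [("cityblock", cityblock),
                     ("canberra", canberra),
                     ("chi_square", chi_square)]:
        print(name, fn(p, q))
```

Canberra and Chi-square normalize each term by the magnitude of the features, so they are less dominated by large-valued variables than Cityblock, which may matter when variables in a multivariate dataset are on different scales.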
Keywords: Multivariate Data, Clustering, Consensus Functions, Cluster Ensembles, K-Nearest Neighbour (KNN).
Scope of the Article: Clustering