A Novel Clustering Algorithm to Process Big Data Using Hadoop Framework
D. Jayalatchumy1, P. Thambidurai2, D. Kadhirvelu3
1Mrs. D. Jayalatchumy, Computer Science and Engineering, Pondicherry University, Pondicherry, India.
2Prof. Dr. P. Thambidurai, Computer Science, Alagappa University, Karaikudi, India.
3Mr. D. Kadhirvelu, Associate Professor, Krishnaswamy College of Engineering and Technology, Cuddalore, India.
Manuscript received on July 20, 2019. | Revised Manuscript received on August 10, 2019. | Manuscript published on August 30, 2019. | PP: 5227-5231 | Volume-8 Issue-6, August 2019. | Retrieval Number: F8874088619/2019©BEIESP | DOI: 10.35940/ijeat.F8874.088619
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: The real challenge for data miners lies in extracting useful information from huge datasets. Choosing an efficient algorithm to analyze and process such unstructured data is a challenge in itself. Cluster analysis is an unsupervised technique for gaining insight into data in the era of Big Data. Hyperflated PIC (HPIC) is a Big Data processing solution designed to take advantage of clustering. It is a scalable, efficient algorithm that addresses the shortcomings of existing clustering algorithms and can process huge datasets quickly. The HPIC algorithms have been validated experimentally on synthetic and real datasets using different evaluation measures. The quality of the clustering results has also been analyzed, and the algorithm is shown to be highly efficient and suitable for Big Data processing.
Keywords: Inflation, Hyperflation, Deflation, Power method, Hadoop, MapReduce