Performance Evaluation of Machine Learning Techniques in Diabetes Prediction
Raghavendra S1, Santosh Kumar J2, Raghavendra B. K3
1Dr. Raghavendra, Associate Professor, Department of Computer Science and Engineering at Christ Deemed To Be University, Bangalore (Karnataka), India.
2Santosh Kumar, Associate Professor, Department of Computer Science and Engineering at K.S. School of Engineering and Management, Bangalore (Karnataka), India.
3Dr. Raghavendra B.K. pursued his Ph.D. from VTU Belgaum, Karnataka, his Masters from VTU Belagavi, and his Bachelors from Bangalore (Karnataka), India.
Manuscript received on 18 February 2019 | Revised Manuscript received on 27 February 2019 | Manuscript published on 28 February 2019 | PP: 363-369 | Volume-8 Issue-3, February 2019 | Retrieval Number: C5955028319/19©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Diagnosing diabetes at a preliminary stage is very important, more so than treatment. Today, devices such as sensors are used to detect diabetes, and accurate classification techniques are required to identify the disease automatically. From a research standpoint, diabetes should be predicted with a minimal number of attributes (test parameters); earlier research addresses feature reduction, but with lower predictive accuracy. In this regard, this work applies machine learning techniques, namely Logistic Regression (LR), Artificial Neural Network (ANN), Support Vector Machine (SVM), Random Forest (RF), and Neural Network (NN) with 10-fold Cross Validation (CV), to the classification and prediction of diabetes with Feature Selection Methods (FSMs) on the R platform. All of these models make it possible to investigate the relationship between a categorical outcome and a set of explanatory variables. The experiment was conducted on the PIMA Indian diabetes dataset from the UCI machine learning repository. The experimental results show that, for the full set of diabetes dataset attributes, the Classification Accuracy (CA) achieved was 84.25%, whereas with the reduced set of attributes an accuracy of 85.24% was achieved using NN with the 10-fold CV technique, higher than with the other techniques. This can help medical applications predict diabetes with minimal features.
Keywords: Logistic Regression; Artificial Neural Network; Random Forest; Support Vector Machine; Neural Network with 10-Fold Cross Validation.
Scope of the Article: Machine Learning
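The abstract's evaluation scheme, 10-fold cross validation on a full attribute set versus a reduced one, can be illustrated with a minimal sketch. This is not the authors' code: the paper reports using the R platform, whereas this sketch uses plain Python with only the standard library. The nearest-centroid classifier is a hypothetical stand-in for the paper's models (LR, ANN, SVM, RF, NN), the data is synthetic rather than the PIMA dataset, and all function names and parameters (`k_fold_indices`, `cross_val_accuracy`, `make_synthetic`, and so on) are assumptions introduced for illustration.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_centroids(X, y):
    """Stand-in classifier (assumption): store the per-class mean of each feature."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Assign x to the class with the nearest centroid (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
    )


def cross_val_accuracy(X, y, k=10, seed=0):
    """Mean accuracy over k folds; each fold is held out for testing exactly once."""
    scores = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = fit_centroids([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / k


def select_columns(X, cols):
    """Keep only the listed feature columns (a reduced attribute set)."""
    return [[row[j] for j in cols] for row in X]


def make_synthetic(n=200, n_features=8, n_informative=2, seed=1):
    """Synthetic two-class data (NOT the PIMA dataset): the first
    n_informative features depend on the label, the rest are pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(2.0 * label, 1.0) for _ in range(n_informative)]
        row += [rng.gauss(0.0, 1.0) for _ in range(n_features - n_informative)]
        X.append(row)
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    full = cross_val_accuracy(X, y, k=10)
    reduced = cross_val_accuracy(select_columns(X, [0, 1]), y, k=10)
    print(f"full attribute set:    {full:.4f}")
    print(f"reduced attribute set: {reduced:.4f}")
```

The two printed accuracies play the roles of the full-set and reduced-set CA figures in the abstract; the numbers from this synthetic sketch bear no relation to the reported 84.25% and 85.24%. The column subset `[0, 1]` is chosen by hand here, whereas the paper selects attributes with FSMs.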