Evaluating The Performance of Machine Learning using Feature Selection Methods on Dengue Dataset
Subhram Dasgupta1, Naman Sharma2, Sweta Sinha3, Raghavendra S4
1Dr. Raghavendra S, Associate Professor, Department of Computer Science and Engineering, CHRIST DEEMED TO BE UNIVERSITY, Bangalore (Karnataka), India.
2Subhram Dasgupta, Department of Computer Science and Engineering, CHRIST DEEMED TO BE UNIVERSITY, Bangalore (Karnataka), India.
3Naman Sharma, Department of Computer Science and Engineering, CHRIST DEEMED TO BE UNIVERSITY, Bangalore (Karnataka), India.
4Sweta Sinha, Department of Computer Science and Engineering, CHRIST DEEMED TO BE UNIVERSITY, Bangalore (Karnataka), India.
Manuscript received on 18 June 2019 | Revised Manuscript received on 25 June 2019 | Manuscript published on 30 June 2019 | PP: 2679-2685 | Volume-8 Issue-5, June 2019 | Retrieval Number: E7855068519/19©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Dengue fever is a mosquito-borne disease transmitted by the bite of an Aedes mosquito infected with a dengue virus. A female Aedes mosquito acquires the virus while feeding on an infected person's blood and then transmits it to others through its bites. Dengue transmission is sensitive to climatic factors such as temperature, humidity, and rainfall; areas with higher vapor pressure and rainfall are the most vulnerable to the spread of the disease. Machine learning is one of the key methods used in modern-day analysis, and many of its algorithms have been applied to medical problems. Because dengue is a serious infectious disease, we use popular machine learning classification algorithms to find the features most responsible for its spread. This work evaluates the performance of three machine learning techniques: Random Forest Classifier (RFC), Decision Tree Classifier (DTC), and Linear Support Vector Machine (LSVM). Predictive Mean Matching is applied to preprocess the data, and a percentage split is applied to resample it. Information gain is calculated for each attribute, and the attributes are sorted by these values. Feature selection methods (FSMs), namely Forward Selection (FS) and Backward Elimination (BE), are then applied to choose the best subset of attributes, so that the algorithms run more efficiently, with a lower run time and improved accuracy. The attributes chosen by the feature selection methods are the main attributes that indicate the probable effects of global weather change on human health.
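The ranking and selection steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy records, the attribute names (`rainfall`, `humidity`, `weekday`), and the helper `lookup_accuracy` are hypothetical, and the lookup-table scorer only stands in for the accuracy of a real classifier such as RFC, DTC, or LSVM measured on a held-out split.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(values, labels):
    """Label entropy minus the entropy left after splitting on a discrete attribute."""
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[v].append(y)
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_by_information_gain(columns, labels):
    """Return (attribute, gain) pairs sorted from highest to lowest gain."""
    gains = {name: information_gain(vals, labels) for name, vals in columns.items()}
    return sorted(gains.items(), key=lambda kv: kv[1], reverse=True)


def lookup_accuracy(columns, labels, subset):
    """Training accuracy of a majority-vote lookup table keyed on `subset`.

    Hypothetical toy scorer; a real study would use a classifier's
    accuracy on held-out data instead.
    """
    groups = defaultdict(list)
    for i, y in enumerate(labels):
        groups[tuple(columns[a][i] for a in subset)].append(y)
    correct = sum(Counter(g).most_common(1)[0][1] for g in groups.values())
    return correct / len(labels)


def forward_selection(candidates, score):
    """Greedily add the attribute that most improves score(subset); stop when none does."""
    selected, best = [], score([])
    remaining = list(candidates)
    while remaining:
        top_score, top_attr = max(((score(selected + [a]), a) for a in remaining),
                                  key=lambda t: t[0])
        if top_score <= best:
            break
        selected.append(top_attr)
        remaining.remove(top_attr)
        best = top_score
    return selected, best


# Hypothetical toy data: six records with discretised attributes.
labels = ["yes", "yes", "yes", "no", "no", "no"]
columns = {
    "rainfall": ["high", "high", "high", "low", "low", "low"],
    "humidity": ["high", "high", "low", "high", "low", "low"],
    "weekday":  ["mon", "tue", "tue", "mon", "tue", "tue"],
}

ranking = rank_by_information_gain(columns, labels)
ordered = [name for name, _ in ranking]
selected, best = forward_selection(
    ordered, lambda subset: lookup_accuracy(columns, labels, subset))
print(ranking)
print(selected, best)
```

Backward Elimination mirrors the loop in `forward_selection`: it would start from the full attribute set and repeatedly drop the attribute whose removal does not lower the score.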
Keywords: Logistic Regression; Artificial Neural Network; Random Forest; Support Vector Machine; Neural Network With 10-Fold; Dengue Disease; Neural Network; R.
Scope of the Article: Machine Learning