The Adequacy Assessment of Test Sets in Machine Learning using Mutation Testing
Hoijin Yoon
Hoijin Yoon*, Department of Computer Engineering, Hyupsung University, Hwaseung, Kyunggi, South Korea.
Manuscript received on September 23, 2019. | Revised Manuscript received on October 15, 2019. | Manuscript published on October 30, 2019. | PP: 4390-4395 | Volume-9 Issue-1, October 2019 | Retrieval Number: A1183109119/2019©BEIESP | DOI: 10.35940/ijeat.A1183.109119
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: In machine learning, accuracy is computed by applying a test dataset to a model that has been trained on a training dataset. The test dataset is therefore expected to validate whether a trained model is sufficiently accurate for use. This study addresses this issue through the research question, “How adequate are the test datasets used to validate machine learning models?” To answer this question, the study takes the seven most popular datasets registered in the UCI machine learning data repository and applies them to six different machine learning models. We conduct an empirical study to analyze the adequacy of the test sets used to validate these models. The testing adequacy for each model and each dataset is analyzed using the mutation analysis technique.
Keywords: Software testing, Mutation analysis, Machine learning, Test adequacy.
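Illustrative sketch: the idea of measuring test set adequacy by mutation analysis can be shown with a toy example. The code below is not the study's implementation. It assumes a hypothetical one-feature threshold classifier, hypothetical mutants obtained by shifting the threshold, and a kill criterion under which a mutant is killed when its predictions on the test set differ from those of the original model. The names `threshold_model`, `accuracy`, and `mutation_score` are introduced here for illustration only.

```python
from typing import Callable, List, Sequence, Tuple

Model = Callable[[float], int]   # maps one feature value to a class label
Sample = Tuple[float, int]       # (feature value, expected label)


def threshold_model(threshold: float) -> Model:
    """Hypothetical classifier: predicts 1 when x >= threshold, else 0."""
    return lambda x: 1 if x >= threshold else 0


def accuracy(model: Model, test_set: Sequence[Sample]) -> float:
    """Fraction of test samples whose predicted label matches the expected one."""
    correct = sum(1 for x, y in test_set if model(x) == y)
    return correct / len(test_set)


def mutation_score(original: Model, mutants: List[Model],
                   test_set: Sequence[Sample]) -> float:
    """Fraction of mutants killed by the test set.

    Assumed kill criterion: a mutant is killed when at least one test
    sample receives a different prediction than under the original model.
    """
    baseline = [original(x) for x, _ in test_set]
    killed = sum(1 for m in mutants
                 if [m(x) for x, _ in test_set] != baseline)
    return killed / len(mutants)


# Toy data, invented for this sketch only.
original = threshold_model(5.0)
mutants = [threshold_model(t) for t in (4.0, 4.9, 5.5, 7.0)]
test_set = [(1.0, 0), (4.5, 0), (5.2, 1), (9.0, 1)]

print("accuracy:", accuracy(original, test_set))
print("mutation score:", mutation_score(original, mutants, test_set))
```

In this toy setting the original model reaches full accuracy, yet the mutant with threshold 4.9 survives because no test sample lies between 4.9 and 5.0. A high accuracy therefore does not by itself show that the test set can distinguish the model from slightly faulty variants, which is the kind of gap a mutation score is meant to expose.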