Comprehensive Study on Techniques of Incremental Learning with Decision Trees for Streamed Data
Prerana Gupta¹, Amit Thakkar², Amit Ganatra³
¹Prerana Gupta, Department of Computer Engineering, Charotar Institute of Technology, Charotar University of Technology Changa, Anand, Gujarat, India.
²Amit Thakkar, Department of Information Technology, Charotar Institute of Technology, Charotar University of Technology Changa, Anand, Gujarat, India.
³Amit Ganatra, Department of Computer Engineering, Charotar Institute of Technology, Charotar University of Technology Changa, Anand, Gujarat, India.
Manuscript received on January 17, 2012. | Revised Manuscript received on February 05, 2012. | Manuscript published on February 29, 2012. | PP: 92-97 | Volume-1 Issue-3, February 2012. | Retrieval Number: C0201021312/2011©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Incremental learning is an approach to classification when datasets are too large to process at once or when new examples can arrive at any time. Data streams are inherently time-varying and exhibit various types of dynamics. Data stream mining faces several problems, including class imbalance, concept drift, and the arrival of novel classes. This paper focuses on concept drift. The presence of concept drift in the data significantly affects the accuracy of the learner, so efficient handling of non-stationary environments is an important problem. The paper studies how to detect changes of concept definitions in data streams and how to adapt classifiers to them. The classification technique studied is decision tree classification for streamed data, as decision trees are efficient and easily interpretable. A comparative study of the algorithms FIMT-DD, ORTO, FIOT, OVA-classifier, ilearning, UFFT, SCRIPT, and HOT is presented.
Keywords: Concept drift, Data stream mining, Incremental learning, Hoeffding Trees.