An Enhanced Classification Based Outlier Detection using Decision Tree for Multi class in Data Stream
R. Sangeetha1, S. Sathappan2
1R. Sangeetha, Research Scholar, Department of Computer Science, Erode Arts and Science College, Erode (Tamil Nadu), India.
2Dr. S. Sathappan, Associate Professor, Department of Computer Science, Erode Arts and Science College, Erode (Tamil Nadu), India.
Manuscript received on 25 August 2019 | Revised Manuscript received on 01 September 2019 | Manuscript Published on 14 September 2019 | PP: 207-213 | Volume-8 Issue-5S3, July 2019 | Retrieval Number: E10480785S319/19©BEIESP | DOI: 10.35940/ijeat.E1048.0785S319
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: A data stream consists of continuous, unbounded, large, and unstable records. Processing a data stream involves extracting significant ideas from the primary data, both static and dynamic, in a single sweep. In streaming, records are generated continuously and simultaneously by thousands of data sources. These data normally do not share a common range, and some records deviate from the rest on various factors. Such records are considered outliers, and they are difficult to find in a data stream because the data are multidimensional. Outliers, being the most abnormal observations, may include the sample maximum or the sample minimum. E-commerce is an application of data streams in which data are generated from millions of sources at a time, covering multiple products and transactions. Some products are cancelled during transactions and some are infrequent; these are termed outliers. This paper focuses on this challenge in e-commerce, and its objective is to help agencies make sound decisions at the right time by finding the outliers using a supervised learning scheme. The work is carried out in two phases. In the first phase, the outliers are detected and the products are classified as cancelled or delivered. In the second phase, the least frequent transaction is identified as an outlier using the enhanced methodology for multi-class classification. The work is implemented in WEKA 3.9.6 and is compared with existing works using evaluation metrics.
Keywords: Data Stream, Outlier, Decision Tree, Random Correction Code, Log Loss.
Scope of the Article: Classification
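Illustrative sketch: the two-phase idea summarized in the abstract can be pictured with a simple single-pass counter. This is not the paper's decision-tree method or its WEKA implementation; it is a minimal stand-in that only shows what each phase is looking for. The `Transaction` record, its `product` and `cancelled` fields, the function names, and the sample data are all hypothetical.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Transaction(NamedTuple):
    product: str      # hypothetical product identifier
    cancelled: bool   # hypothetical flag: True if the order was cancelled


def scan_stream(stream: Iterable[Transaction]):
    """Read the stream once; return (cancelled products, delivered counts)."""
    cancelled = set()
    delivered = Counter()
    for tx in stream:
        if tx.cancelled:
            cancelled.add(tx.product)    # phase 1: cancelled vs. delivered
        else:
            delivered[tx.product] += 1   # tally delivered transactions
    return cancelled, delivered


def least_transacted(delivered: Counter):
    """Phase 2 stand-in: products tied for the fewest delivered transactions."""
    if not delivered:
        return []
    low = min(delivered.values())
    return sorted(p for p, n in delivered.items() if n == low)


# Made-up sample data, for illustration only.
stream = [
    Transaction("mug", False),
    Transaction("lamp", False),
    Transaction("mug", False),
    Transaction("clock", True),
    Transaction("lamp", False),
    Transaction("vase", False),
]
cancelled, delivered = scan_stream(stream)
rare = least_transacted(delivered)
```

Here `cancelled` collects the products seen with a cancelled order, and `rare` lists the delivered products with the minimum count. A supervised classifier, as the abstract describes, would learn these labels from features rather than read them from a flag.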