An Improved Classifier Technique for Spam Filtering
Rahul Maheshwari1, Vivek Kapoor2, Sandeep Verma3
1Rahul Maheshwari, Department of Computer Science and Engineering, Institute of Engineering and Technology, DAVV, Indore (Madhya Pradesh), India.
2Dr. Vivek Kapoor, Department of Information Technology, Institute of Engineering and Technology, DAVV, Indore (Madhya Pradesh), India.
3Sandeep Verma, Department of Information Technology, Institute of Engineering and Technology, DAVV, Indore (Madhya Pradesh), India.
Manuscript received on 13 June 2017 | Revised Manuscript received on 20 June 2017 | Manuscript Published on 30 June 2017 | PP: 255-261 | Volume-6 Issue-5, June 2017 | Retrieval Number: E5075066517/17©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Email spam, or junk e-mail (unwanted e-mail “usually of a commercial nature sent out in bulk”), is one of the major problems of today’s Internet, causing financial damage to companies and annoying individual users. Of the various approaches developed to stop spam, filtering is an important and popular one. This paper explores the problems associated with spam and several approaches that attempt to deal with it. Because spam is a major issue for the web, the most appealing methods are those that are easy to maintain and deliver satisfactory performance. A learning algorithm based on the Naive Bayesian classifier has shown promising results in separating spam from legitimate mail, and it is among the most promising methods with regard to both performance and ease of implementation. Tokenization, probability estimation, and feature selection are performed prior to classification, and all have a significant influence on the performance of spam filtering. The main objective of this work is to examine and empirically test the currently known techniques used for each of these processes and to investigate the possibilities for improving the performance of the Bayesian classifier.
Keywords: E-mail Classification, Spam, Spam Filtering, Machine Learning, Algorithms.
Scope of the Article: Classification
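The pipeline named in the abstract (tokenization, probability estimation, then Naive Bayesian classification) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy training messages, the multinomial word-count model, and the use of Laplace (add-one) smoothing are all assumptions made for illustration, and feature selection is omitted for brevity.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def train(messages):
    """Count words per class from (text, label) pairs; labels are 'spam' or 'ham'."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in messages:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab

def log_posterior(text, label, model):
    """Unnormalized log P(label | text) under the naive independence assumption."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    total_words = sum(word_counts[label].values())
    # Class prior estimated from the fraction of training messages with this label.
    score = math.log(doc_counts[label] / total_docs)
    for token in tokenize(text):
        if token not in vocab:
            continue  # assumption: tokens never seen in training are ignored
        # Laplace (add-one) smoothing keeps unseen word/class pairs from zeroing the product.
        likelihood = (word_counts[label][token] + 1) / (total_words + len(vocab))
        score += math.log(likelihood)
    return score

def classify(text, model):
    """Return the label with the higher score; ties fall to 'ham'."""
    return max(("ham", "spam"), key=lambda label: log_posterior(text, label, model))

# Hypothetical toy training set, invented for this sketch.
TRAINING = [
    ("Win cash now, claim your free prize", "spam"),
    ("Cheap loans, free offer, act now", "spam"),
    ("Meeting agenda for Monday attached", "ham"),
    ("Can we reschedule lunch to Friday", "ham"),
]

model = train(TRAINING)
print(classify("free prize offer", model))            # spam
print(classify("agenda for lunch on Friday", model))  # ham
```

Summing log probabilities instead of multiplying raw probabilities avoids numerical underflow on long messages. Each stage (the tokenizer, the smoothing rule, the set of retained features) could be swapped independently, which is where the influence on filtering performance described in the abstract would show up.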