A Rule Based Stemmer
R. Cynthia Monica Priya¹, J.G.R. Sathiaseelan²
¹ R. Cynthia Monica Priya, Department of Computer Science, Bishop Heber College, Tiruchirapalli, India.
² Dr. J.G.R. Sathiaseelan, Department of Computer Science, Bishop Heber College, Tiruchirapalli, India.
Manuscript received on September 22, 2019. | Revised Manuscript received on October 20, 2019. | Manuscript published on October 30, 2019. | PP: 2026-2029 | Volume-9 Issue-1, October 2019 | Retrieval Number: A9545109119/2019©BEIESP | DOI: 10.35940/ijeat.A9545.109119
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: The present digital world generates enormous amounts of data every instant, and mining knowledge from that data effectively has become a pressing need. Sentiment analysis, a part of web content mining (itself a branch of web mining), has gained momentum as a research field. It analyses the opinions of a wide variety of people around the world. Sentiment analysis encompasses preprocessing, feature selection, classification, and sentiment prediction. Preprocessing is an important stage that comprises many techniques; stop word removal, punctuation removal, and conversion of numbers to number names are some of the basic ones. Stemming is another important preprocessing technique, which reduces the different forms of a word to its root. There are three basic types of stemmers: truncating, statistical, and hybrid. This paper proposes a rule based stemmer of the truncating type, built on rules for truncation and replacement. The input passes through a series of rules: if the condition specified by a rule is satisfied, the associated rule is executed; otherwise the input is checked against the next rule, and the process continues. The result of execution is the stemmed words. The performance of the proposed stemmer is compared with that of two existing stemmers in the same rule based category, Porter and Lancaster, using several evaluation metrics. The observations show that the proposed stemmer outperforms the Porter and Lancaster stemmers in terms of the correctly stemmed words factor, and shows a good average conflation factor with fewer over-stemming and under-stemming errors.
Keywords: Orthogonal Machining, Grains, Johnson–Cook model, Numerical Technique.
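The rule cascade described in the abstract, in which each word is checked against an ordered series of truncation and replacement rules, can be illustrated with a minimal Python sketch. The rules below are hypothetical and chosen only for illustration; they are not the rule set proposed in the paper. The names `Rule`, `RULES`, and `stem`, the minimum stem length condition, and the choice to stop at the first rule that fires are all assumptions.

```python
from typing import NamedTuple, Sequence


class Rule(NamedTuple):
    suffix: str        # condition: the word must end with this suffix
    replacement: str   # text appended after truncation ("" = pure truncation)
    min_stem_len: int  # condition: minimum length of what remains after truncation


# Hypothetical example rules (NOT the paper's rule set), checked in order.
RULES: Sequence[Rule] = (
    Rule("sses", "ss", 1),  # replacement: classes -> class
    Rule("ies", "y", 2),    # replacement: studies -> study
    Rule("ing", "", 3),     # truncation:  walking -> walk
    Rule("ed", "", 3),      # truncation:  jumped  -> jump
    Rule("s", "", 3),       # truncation:  cats    -> cat
)


def stem(word: str, rules: Sequence[Rule] = RULES) -> str:
    """Apply the first rule whose condition holds; otherwise try the next rule."""
    word = word.lower()
    for rule in rules:
        if word.endswith(rule.suffix):
            remainder = word[: len(word) - len(rule.suffix)]
            if len(remainder) >= rule.min_stem_len:
                # Condition satisfied: execute the rule and stop (assumed behaviour).
                return remainder + rule.replacement
    # No rule fired: the word is returned unchanged.
    return word


if __name__ == "__main__":
    for w in ["classes", "studies", "walking", "sing", "bus"]:
        print(w, "->", stem(w))
```

With these illustrative rules, `sing` and `bus` are left unchanged because truncating them would leave a remainder shorter than the assumed minimum length. A guard of this kind is one common way for a truncating stemmer to limit over-stemming.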