A Document Classification Framework for Efficient Retrieval
Aijazahamed Qazi¹, R.H. Goudar²
¹Aijazahamed Qazi, Department of CSE, SDMCET, Dharwad (Karnataka), India.
²R.H. Goudar, Department of CNE, Center for PG Studies, Visvesvaraya Technological University, Belgaum (Karnataka), India.
Manuscript received on 18 June 2019 | Revised Manuscript received on 25 June 2019 | Manuscript published on 30 June 2019 | PP: 2592-2597 | Volume-8 Issue-5, June 2019 | Retrieval Number: E7653068519/19©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Document classification has become an active field of research with the significant rise in the volume of digital information. Term weighting is a fundamental research issue in document classification, and researchers have proposed several alternatives to traditional weighting techniques such as TF_IDF. This paper introduces a novel method that weights a term by calculating the semantic similarity between the term and the category label. The proposed term weighting technique also incorporates the co-occurrence relation between terms. Experiments were carried out on the 20 Newsgroups and Reuters_21578 benchmark datasets. The results indicate that the proposed method outperforms the other weighting methods across various classifiers.
Keywords: TF_IDF, Fuzzy kNN, Accuracy
Scope of the Article: Fuzzy Logics