Enhanced Personalized Web Search using Pattern-based Topic Modelling
Ramitha A T¹, Jayasudha J S²
¹Ramitha A T, Department of Computer Science & Engineering, Sree Chitra Thirunal College of Engineering, Thiruvananthapuram (Kerala), India.
²Dr. Jayasudha J. S, Department of Computer Science & Engineering, Sree Chitra Thirunal College of Engineering, Thiruvananthapuram (Kerala), India.
Manuscript received on 13 August 2016 | Revised Manuscript received on 20 August 2016 | Manuscript Published on 30 August 2016 | PP: 208-211 | Volume-5 Issue-6, August 2016 | Retrieval Number: F4717085616/16©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Personalized web search is a search method that aims to improve the quality and accuracy of web search results, and it has gained much attention recently. Its main goal is to return search results that are more relevant and tailored to the user's interests. Effective personalization requires collecting and aggregating user information, which can be private or general. Personalized search results can be improved by information filtering, which removes irrelevant or unwanted information from an information stream based on document representations that capture the users’ interests. Traditional information filtering models assume that each user is interested in only a single topic. In statistical topic modelling, documents and collections can be represented by word distributions. However, directly applying topic models to information filtering is insufficient to distinctively represent documents with different semantic content. To alleviate these problems, patterns are used to represent topics for information filtering. Pattern-based representations are considered more meaningful and more accurate for representing topics than word-based representations. The Pattern-based Topic Model (PBTM) combines pattern mining with statistical topic modelling to generate more discriminative and semantically rich topic representations. In the proposed system, user information preferences are acquired as a collection of documents from the user's browsing history. Latent Dirichlet Allocation (LDA) is used to perform topic modelling on the collected documents. The word-topic assignments from LDA are used to construct a transactional dataset, and frequent patterns are discovered from the topic models. The Maximum matched Pattern-based Topic Model is then used to build a user interest model that represents the user's preference information from the collection of documents. Incoming documents are filtered according to these preferences by document relevance ranking.
Keywords: Topic Model, Information Filtering, Pattern Based Mining, User Interest Model
Scope of the Article: Data Mining
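The pipeline described in the abstract (word-topic assignments, transactional datasets, frequent patterns, maximum matched patterns, relevance ranking) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The LDA step is skipped: the `assignments` dictionary is a hypothetical stand-in for LDA's word-topic output. The topic weights, the support threshold, the brute-force miner, and the scoring rule (topic weight × pattern support × pattern length) are all assumptions chosen for illustration, since the abstract does not specify them.

```python
from collections import Counter
from itertools import combinations

# Hypothetical LDA output: for each document in the user's history,
# a list of (word, topic_id) assignments. Not real data.
assignments = {
    "d1": [("search", 0), ("query", 0), ("rank", 0), ("pattern", 1), ("mining", 1)],
    "d2": [("search", 0), ("query", 0), ("pattern", 1), ("mining", 1), ("itemset", 1)],
    "d3": [("search", 0), ("rank", 0), ("query", 0), ("mining", 1), ("pattern", 1)],
}

def build_transactions(assignments, topic):
    """One transaction per document: the set of words assigned to `topic`."""
    transactions = []
    for pairs in assignments.values():
        words = frozenset(w for w, t in pairs if t == topic)
        if words:
            transactions.append(words)
    return transactions

def frequent_patterns(transactions, min_support):
    """Brute-force frequent itemset mining; returns {pattern: support count}.
    Only suitable for tiny examples; a real system would use Apriori or FP-growth."""
    counts = Counter()
    for tx in transactions:
        for size in range(1, len(tx) + 1):
            for combo in combinations(sorted(tx), size):
                counts[frozenset(combo)] += 1
    return {p: c for p, c in counts.items() if c >= min_support}

def maximum_matched(patterns, doc_words):
    """Patterns fully contained in the document that have no matched proper superset."""
    matched = [p for p in patterns if p <= doc_words]
    return [p for p in matched if not any(p < q for q in matched)]

def relevance(doc_words, topic_patterns, topic_weights):
    """Assumed scoring rule: sum of topic weight * support * pattern length
    over the maximum matched patterns of every topic."""
    score = 0.0
    for topic, patterns in topic_patterns.items():
        for p in maximum_matched(patterns, doc_words):
            score += topic_weights[topic] * patterns[p] * len(p)
    return score

# Build the user interest model: frequent patterns per topic.
topic_weights = {0: 0.6, 1: 0.4}  # hypothetical topic proportions for the user
topic_patterns = {
    t: frequent_patterns(build_transactions(assignments, t), min_support=2)
    for t in topic_weights
}

# Filter incoming documents by ranking them against the user interest model.
incoming = {
    "a": {"search", "query", "engine"},
    "b": {"pattern", "mining", "search"},
    "c": {"weather", "forecast"},
}
scores = {name: relevance(words, topic_patterns, topic_weights)
          for name, words in incoming.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)  # ['b', 'a', 'c']
```

Keeping only the maximum matched patterns means a document is scored by its most specific matches. In this toy data, document "a" is credited once for {search, query} rather than separately for "search", "query", and their combination.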