Efficient Frequent Item set Discovery Technique in Uncertain Data
Deepak Chopra1, Dilip Vishwakarma2
1Deepak Chopra, Dept. of Computer Application, SATI, Vidisha (M.P.), India.
2Dilip Vishwakarma, Dept. of Computer Application, SATI, Vidisha (M.P.), India.
Manuscript received on July 17, 2012. | Revised Manuscript received on August 25, 2012. | Manuscript published on August 30, 2012. | PP: 73-78 | Volume-1 Issue-6, August 2012. | Retrieval Number: F0618081612/2012©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Frequent itemset mining, the task of finding sets of items that frequently occur together in a dataset, has been at the core of the field of data mining for the past sixteen years. In that time, datasets have grown much faster than the ability of existing algorithms to handle them, so improvements are needed. In this thesis, we take the classic algorithm for the problem, Apriori, and improve it significantly by introducing what we call a vertical sort. We then use a large dataset, web documents, to compare our performance against several state-of-the-art implementations. We demonstrate not only equal efficiency with lower memory usage at all support thresholds, but also the ability to mine at support thresholds not yet attempted in the literature. We also indicate how we believe this work can be extended to achieve still more impressive results.
Keywords: Uncertain Databases, Frequent Itemset Mining, Probabilistic Frequent Itemsets.
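
To make the task named in the abstract concrete, the following is a minimal sketch of classic level-wise Apriori mining over ordinary (certain) transactions. It is an illustration only: it does not implement the vertical sort mentioned in the abstract, whose details are not given here, and it does not handle uncertain or probabilistic data. The function name `apriori`, the parameter `min_support` (an absolute count), and the toy `baskets` data are assumptions made for this example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset (frozenset): support count} for all frequent itemsets.

    Hypothetical helper for illustration; min_support is an absolute count.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items and keep those meeting the threshold.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count step: one pass over the transactions per level.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1

    return frequent


# Toy data (assumed for illustration): four shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
result = apriori(baskets, min_support=2)
```

With a threshold of 2, every single item and every pair in this toy data is frequent, while the three-item set appears in only one basket and is discarded. The count step scans all transactions once per level, which is the cost that grows with dataset size and that refinements of Apriori aim to reduce.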