Efficient Method for De-Duplication and Periodicity Mining In Time Series Databases
S. Drishya¹, I. Nancy Jeba Jingle²
¹S. Drishya, Dept. of CSE, Vins Christian College of Engineering, Nagercoil, India.
²I. Nancy Jeba Jingle, Dept. of CSE, Vins Christian College of Engineering, Nagercoil, India.
Manuscript received on May 27, 2012. | Revised Manuscript received on June 12, 2012. | Manuscript published on June 30, 2012. | PP: 187-192 | Volume-1 Issue-5, June 2012. | Retrieval Number: E0469061512/2012©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Periodic pattern mining, or periodicity detection, has a number of applications, such as prediction, forecasting, and detection of unusual activities. The problem is not trivial because the data to be analyzed are mostly noisy and different periodicity types (namely symbol, sequence, and segment) have to be investigated. Noise here refers to the duplication of data that arises when different databases are used for the same purpose in different places, so it should be removed. A time series is a collection of data values gathered, generally at uniform intervals of time, to reflect certain behavior of an entity. Real life offers several examples of time series, such as the weather conditions of a particular location, transactions in a superstore, network delays, power consumption, and earthquake prediction. A time series is mostly characterized by being composed of repeating cycles. Identifying repeating (periodic) patterns could reveal important observations about the behavior and future trends of the case represented by the time series, and hence would lead to more effective decision making. The goal of analyzing a time series is to find whether and how frequently a periodic pattern (full or partial) is repeated within the series. There is a need for a comprehensive approach that can analyze the whole time series or a subsection of it, handle different types of noise effectively (to a certain degree), and at the same time detect different types of periodic patterns; combining these under one umbrella is by itself a challenge. In this paper, we present an algorithm that can detect symbol, sequence (partial), and segment (full cycle) periodicity in time series. The algorithm is noise resilient; it has been successfully demonstrated to work with replacement, insertion, deletion, or a mixture of these types of noise.
Keywords: Time series, periodicity detection, suffix tree, symbol periodicity, segment periodicity, sequence periodicity, noise resilient.
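To make the notion of symbol periodicity in the abstract concrete, the following is a minimal brute-force sketch. It is not the suffix-tree-based algorithm presented in the paper. The function name `symbol_periodicity`, the `min_conf` threshold, and the simplified confidence measure (the fraction of slots at a given offset that hold the same symbol) are assumptions introduced here for illustration only.

```python
from collections import Counter


def symbol_periodicity(series, period, min_conf=0.5):
    """Illustrative helper (hypothetical, not from the paper).

    For one candidate period, report every (symbol, offset, confidence)
    where the symbol recurs at positions offset, offset + period, ...
    Confidence here is simply matches / slots, an assumed simplification.
    """
    results = []
    for offset in range(period):
        # Every period-th symbol, starting at this offset.
        slots = series[offset::period]
        if not slots:
            continue
        for symbol, count in Counter(slots).items():
            confidence = count / len(slots)
            if confidence >= min_conf:
                results.append((symbol, offset, confidence))
    return results


# "abcabcabdabc" repeats "abc" with one replacement error ('d' for 'c').
print(symbol_periodicity("abcabcabdabc", period=3, min_conf=0.7))
# prints [('a', 0, 1.0), ('b', 1, 1.0), ('c', 2, 0.75)]
```

In this example the single replacement only lowers the confidence of `c` at offset 2 from 1.0 to 0.75, so the pattern is still reported. Insertion and deletion noise shift every later position, which this fixed-offset sketch does not handle; coping with those noise types is what the paper's algorithm is described as addressing.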