Novel Space Efficient Indices for Kannada Text: V-KTPY Trie Family
Yashaswini Hegde1, Padma S.K.2
1Yashaswini Hegde*, Graduate in Electronics and Communications Engineering from UBDT College of Engineering, Davangere.
2Dr. S.K Padma, Professor, Department of Information Science & Engineering, SJCE College Mysuru.
Manuscript received on September 21, 2019. | Revised Manuscript received on October 15, 2019. | Manuscript published on October 30, 2019. | PP: 6312-6320 | Volume-9 Issue-1, October 2019 | Retrieval Number: A1985109119/2019©BEIESP | DOI: 10.35940/ijeat.A1985.109119
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: The V-KTPY Trie family is a group of space-efficient indices designed to store and search Kannada text. Existing text-search and document-retrieval methods use several kinds of indices, including hash-based indices, lexicographical indices, and indices based on clustering techniques. Each of these indexing methods has its own advantages, including optimal time complexity. However, these indices are not space efficient. In this paper we propose a family of novel space-efficient indices called V-KTPY Tries, which have the features of both lexicographical and hash-based indexing. V-KTPY Tries combine the V-KTPY Rule (“Vistruta Katapayadi sutra”) with prefix trees (Tries): the text labels of the Trie are encrypted by the V-KTPY Rule. This rule is an extension of the ancient “Katapayadi Sutra” (KTPY Rule), which converts characters of Brahmi/Devanagari scripts to numbers. Because the V-KTPY Tries index V-KTPY-encrypted Kannada text, compression is possible. Experiments were conducted on the family of V-KTPY Tries and on the corresponding Tries built over Unicode Kannada. The results show that the simple V-KTPY Trie gives 35% space efficiency and the V-KTPY 10Ary Trie gives 65% space efficiency over the simple Unicode Trie, with almost the same time complexity. The Prefix Hashed Trie is a fully compressed V-KTPY Trie that gives 20% space efficiency compared to the fully compressed Unicode Trie. V-KTPY Tries can be used wherever Tries are applicable. The V-KTPY Prefix Hashed Tries are used in Kannada feature selection. V-KTPY Tries can be extended to index many (120+) Indian languages that follow the Brahmi or Devanagari script.
Keywords: Indices, VKTPY Tries, Prefix Hashed Trie, Kannada.
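To make the idea in the abstract concrete, the following is a minimal Python sketch of a 10-ary trie keyed by Katapayadi digits. It rests on several assumptions that are not taken from the paper: it uses only the classical Katapayadi mapping (the extended V-KTPY rule is not specified in the abstract), it skips vowel signs and any consonant followed by a virama, and it keeps the original words in a bucket at the terminal node, because the classical mapping is many-to-one. The names `encode`, `DigitTrie`, and `Node` are hypothetical and chosen only for illustration.

```python
# Sketch only: classical Katapayadi digits for Kannada consonants,
# NOT the paper's extended V-KTPY rule (which the abstract does not define).
VIRAMA = "\u0ccd"  # Kannada sign virama (halant)


def _build_table():
    table = {}
    # ka..nya (U+0C95..U+0C9E) and tta..na (U+0C9F..U+0CA8): 1..9, then 0
    for start in (0x0C95, 0x0C9F):
        for i in range(10):
            table[chr(start + i)] = (i + 1) % 10
    # pa..ma (U+0CAA..U+0CAE): 1..5
    for i in range(5):
        table[chr(0x0CAA + i)] = i + 1
    # ya, ra, la, va, sha, ssa, sa, ha: 1..8
    rest = (0x0CAF, 0x0CB0, 0x0CB2, 0x0CB5, 0x0CB6, 0x0CB7, 0x0CB8, 0x0CB9)
    for digit, cp in enumerate(rest, start=1):
        table[chr(cp)] = digit
    return table


TABLE = _build_table()


def encode(word):
    """Return the list of Katapayadi digits for a Kannada string."""
    digits = []
    for i, ch in enumerate(word):
        if "\u0c85" <= ch <= "\u0c94":      # independent vowel counts as 0
            digits.append(0)
        elif ch in TABLE:
            # A consonant followed by a virama carries no vowel: skip it.
            if i + 1 < len(word) and word[i + 1] == VIRAMA:
                continue
            digits.append(TABLE[ch])
        # Vowel signs and other marks are ignored in this sketch.
    return digits


class Node:
    __slots__ = ("children", "words")

    def __init__(self):
        self.children = [None] * 10  # one slot per digit 0..9
        self.words = []              # words whose digit string ends here


class DigitTrie:
    def __init__(self):
        self.root = Node()

    def insert(self, word):
        node = self.root
        for d in encode(word):
            if node.children[d] is None:
                node.children[d] = Node()
            node = node.children[d]
        if word not in node.words:
            node.words.append(word)

    def contains(self, word):
        node = self.root
        for d in encode(word):
            node = node.children[d]
            if node is None:
                return False
        return word in node.words


trie = DigitTrie()
kannada = "\u0c95\u0ca8\u0ccd\u0ca8\u0ca1"  # the word "Kannada" in Kannada script
trie.insert(kannada)
print(encode(kannada), trie.contains(kannada))
```

Each node holds at most ten child slots instead of one per Unicode character, and distinct strings can share a digit path. That is one plausible reading of how such an index mixes trie-like and hash-like behaviour. The space figures reported in the abstract come from the authors' experiments, not from this sketch.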