A New Indexing Technique XR+ Tree for Bio Informatics XML Data Compression
Dinh Duc Luong1, Vuong Quang Phuong2, Hoang Do Thanh Tung3
1Dinh Duc Luong, working at the Food Industrial College. Research interests: bioinformatics, big data.
2Vuong Quang Phuong, working at the Institute of Information Technology (IOIT), Vietnam Academy of Science and Technology (VAST).
3Hoang Do Thanh Tung, working at the Institute of Information Technology (IOIT), Vietnam Academy of Science and Technology (VAST).
Manuscript received on 18 June 2019 | Revised Manuscript received on 25 June 2019 | Manuscript published on 30 June 2019 | PP: 1168-1173 | Volume-8 Issue-5, June 2019 | Retrieval Number: E7409068519/19©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Bioinformatics data is growing rapidly because of regular contributions from the bioinformatics community. Because bioinformatics problems are so diverse, the documents that store this information need a structure that is flexible, easy to change, and above all easy to share and contribute to. For that reason, XML documents are the best solution for describing and storing this massive volume of bioinformatics data. However, because XML documents hold textual, semi-structured data, data reduction and data retrieval methods that are both well suited and effective are still lacking. In this paper, we apply a method that converts name tags into numerical spatial coordinates to reduce document size, and we propose an improved indexing method to increase the query efficiency of XML documents. The approach is based on the characteristics of the XML data tree structure and on the structure of the XR-tree index (an improvement of the R-tree); we refine the algorithms to optimize XPath queries. Experiments show that the new algorithm significantly improves performance compared with the previous method. The results also point to some practical issues arising from the variety of bioinformatics XML documents.
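As an illustration only, the idea of replacing name tags with numerical coordinates can be sketched with a region (interval) encoding, a labeling commonly used by XML structural indexes. The abstract does not give the authors' exact scheme, so everything below is an assumption: the function names `encode` and `is_ancestor`, the tag dictionary, the `(tag_code, start, end, level)` record layout, and the sample document are all hypothetical.

```python
import xml.etree.ElementTree as ET

def encode(xml_text):
    """Return (tag_codes, records) for an XML string.

    tag_codes maps each distinct tag name to a small integer (assumed scheme).
    records lists one (tag_code, start, end, level) tuple per element,
    in document order. start/end come from a single counter that is
    incremented on entering and on leaving each element.
    """
    root = ET.fromstring(xml_text)
    tag_codes = {}
    records = []
    counter = 0

    def visit(elem, level):
        nonlocal counter
        # Reuse the code of a known tag, or assign the next free integer.
        code = tag_codes.setdefault(elem.tag, len(tag_codes))
        counter += 1
        start = counter
        slot = len(records)      # reserve a slot to keep document order
        records.append(None)
        for child in elem:
            visit(child, level + 1)
        counter += 1
        records[slot] = (code, start, counter, level)

    visit(root, 0)
    return tag_codes, records

def is_ancestor(a, b):
    """True if record a strictly contains record b (a is an ancestor of b)."""
    return a[1] < b[1] and b[2] < a[2]

# Hypothetical sample document, not taken from the paper.
SAMPLE = ("<db><entry><name>BRCA1</name><seq>ATGC</seq></entry>"
          "<entry><name>TP53</name></entry></db>")

tag_codes, records = encode(SAMPLE)
```

Under this encoding, an ancestor-descendant step such as `//entry//name` reduces to interval containment tests on integers, which is the kind of comparison a region-based index can answer without re-reading tag strings. For example, `is_ancestor(records[1], records[2])` checks whether the first `entry` contains the first `name`.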
Keywords: Bioinformatics, Bioinformatics Data, Indexing Method.
Scope of the Article: Bioinformatics