A method to find unique sequences on distributed genomic databases

2003 
Thanks to advances in genetic engineering, many kinds of genomic information are being revealed, and it has become feasible to analyze entire genomes at once. At the same time, the quantity of genomic information stored in databases grows daily. Processing all of this information requires an effective method for handling large volumes of data, so it is indispensable not only to design efficient, fast algorithms but also to use high-speed computing resources for analyzing biological information. Grid computing has recently emerged as one of the most promising computing environments for this purpose. The European Data Grid (EDG) is one such data-oriented grid computing environment [11]. In bioinformatics, finding unique sequences is important for the success of molecular biological experiments [6]. Once found, unique sequences are useful for designing target-specific probes and primers, comparing gene sequences, and so on. In this paper, we propose a method to discover unique sequences among genomic databases located in a distributed environment. We then implement this method on the European Data Grid and show calculation results for E. coli genomes.
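
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general problem, not the authors' method. It assumes that a "unique sequence" means a fixed-length substring (k-mer) occurring exactly once across all databases, and that each site can count its own k-mers locally before the counts are merged. The names `local_kmer_counts`, `unique_kmers`, and the site labels are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def local_kmer_counts(sequence: str, k: int) -> Counter:
    """Count every length-k substring of one locally stored sequence."""
    # Sequences shorter than k yield an empty range and an empty Counter.
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def unique_kmers(partitions: Dict[str, Iterable[str]], k: int) -> Set[str]:
    """Return the k-mers that occur exactly once across all partitions.

    `partitions` maps a (hypothetical) site name to the sequences held there.
    In a real grid setting each site would compute its counts remotely; here
    the per-site step is simply run in a loop to show the merge logic.
    """
    total: Counter = Counter()
    for _site, sequences in partitions.items():
        for seq in sequences:
            total.update(local_kmer_counts(seq, k))  # merge local counts
    return {kmer for kmer, n in total.items() if n == 1}


if __name__ == "__main__":
    # Toy data, assumed for illustration only.
    toy = {"site_a": ["ACGTAC"], "site_b": ["GTACGG"]}
    print(sorted(unique_kmers(toy, 3)))  # prints ['CGG', 'CGT']
```

Counting locally and merging keeps the per-site work independent, which is the property a data-oriented grid would exploit; the merge step is where cross-database duplicates are detected.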