Secret Sequence Comparison on Public Grid Computing Resources

2005 
Once a new gene has been sequenced, it must be verified whether or not it is similar to previously sequenced genes. In many cases, the organization that sequenced a potentially novel gene needs to keep the sequence itself in confidence. However, to compare the potentially novel sequence with known sequences, it must either be sent as a query to public databases, or these databases must be downloaded onto a local computer. In both cases, the potentially new sequence is exposed to the public. In this work, we propose a novel method to compare sequences without leaking any exact sequence information to the public. This method is based on our previously proposed method [11] to find unique sequences on grid computing environments, which parallelizes well with reasonable performance. In order to keep the exact sequence information in confidence, this method samples intervals (subsequences) from a sequence, and these intervals are hashed. No key-based cryptosystem is used. The hashed data are open to the public to verify the novelty of the sequence. The experimental results for 19,797 H. sapiens genes show that the parallel implementation of this method performs reasonably well in terms of speed and memory usage. In this paper, the implementation on the world-wide testbeds of the European DataGrid (EDG) and its results are described.
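
The idea of sampling intervals and publishing only their hashes can be illustrated with a minimal sketch. It is not the implementation described in the paper: the abstract does not specify the interval length, the sampling scheme, or the hash function, so the fixed window length, the fixed sampling step, the use of SHA-256, and all function names below are assumptions made purely for illustration.

```python
import hashlib


def hash_interval(interval: str) -> str:
    """Hash one interval (subsequence). SHA-256 is an assumed choice."""
    return hashlib.sha256(interval.encode("ascii")).hexdigest()


def sampled_hashes(seq: str, length: int = 16, step: int = 8) -> set[str]:
    """Hash intervals of `length` bases taken every `step` positions.

    `length` and `step` are hypothetical parameters, not values from the paper.
    """
    seq = seq.upper()
    return {
        hash_interval(seq[i:i + length])
        for i in range(0, len(seq) - length + 1, step)
    }


def all_hashes(seq: str, length: int = 16) -> set[str]:
    """Hash every interval of a public reference sequence (step of 1)."""
    return sampled_hashes(seq, length, step=1)


def shared_fraction(query: set[str], reference: set[str]) -> float:
    """Fraction of the query's hashed intervals that also occur in the reference."""
    if not query:
        return 0.0
    return len(query & reference) / len(query)
```

Under these assumptions, the owner of a confidential sequence would publish only the output of `sampled_hashes`, and anyone holding known sequences could compute `all_hashes` over them and report `shared_fraction`. A fraction near 1 suggests the sequence is already known, while a fraction near 0 suggests novelty. Note that with a four-letter alphabet, short intervals can be recovered by exhaustively hashing all candidates, so the interval length in a sketch like this would need to be chosen with care.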