Novel Visual and Analytical Methods in Repurposing Legacy Scientific Code - A Case Study

2013 
Scientific computing is dominated by team-authored legacy code that has evolved over decades to capture the changing understanding of a scientific discipline. Accumulated deprecated code, various optimization techniques, and evolving algorithms produce convoluted source code that is impractical to reverse engineer using mainstream methods. As a result, such codes are neither truly repeatable nor understandable, two of the most essential needs in scientific computing. We refactored a long-standing implementation of a common biosequence alignment algorithm in an effort to reproduce its salient behaviors in usable form. Because of the sheer size and complexity of this code base, we developed custom tools to visualize and manipulate the behavior of the source code under a variety of conditions. We present a case study of extracting and refactoring the algorithmic core, together with a novel process of discovery, prototyping, and testing that combines openly available and custom-built tools. The result is a reduction in code size of over two orders of magnitude while reconstructing the key protein alignment function in BLAST.