MetaProFi: A Protein-Based Bloom Filter for Storing and Querying Sequence Data for Accurate Identification of Functionally Relevant Genetic Variants

2021 
MetaProFi (https://github.com/kalininalab/metaprofi) is a Bloom filter-based tool for storing and querying sequence data that achieves high time and space efficiency compared to state-of-the-art tools. For the first time, in addition to supporting traditional nucleotide sequence storage, it can directly index and query amino acid sequences and translated nucleotide sequences, thus bringing sequence comparison to a more biologically relevant protein level. We demonstrate the utility of MetaProFi by indexing several large datasets: UniProtKB datasets at the organism and sequence level, the Tara Oceans dataset, and a collection of 2585 human RNA-seq experiments.
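As a rough illustration of the general idea behind Bloom filter-based indexing of amino acid sequences, the sketch below splits a protein sequence into k-mers, inserts them into a small Bloom filter, and reports the fraction of a query's k-mers found in it. This is not MetaProFi's implementation or API: the class `BloomFilter`, the helpers `kmers`, `index_sequence`, and `query_fraction`, the k-mer size, the filter size, and the hashing scheme are all assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter over strings (illustrative, not MetaProFi's)."""

    def __init__(self, num_bits=1 << 16, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting BLAKE2b with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                item.encode("ascii"),
                digest_size=8,
                salt=seed.to_bytes(8, "little"),
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True may be a false positive; False is always correct.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def kmers(seq, k):
    """Return all overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def index_sequence(seq, k=5, num_bits=1 << 16, num_hashes=3):
    """Build a Bloom filter holding every k-mer of one sequence."""
    bf = BloomFilter(num_bits=num_bits, num_hashes=num_hashes)
    for km in kmers(seq, k):
        bf.add(km)
    return bf


def query_fraction(bf, query, k=5):
    """Fraction of the query's k-mers reported as present in the filter."""
    qs = kmers(query, k)
    if not qs:
        return 0.0
    return sum(km in bf for km in qs) / len(qs)


# Hypothetical amino acid sequence used only as example input.
protein = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSGAEKAVQ"
index = index_sequence(protein, k=5)
hit = query_fraction(index, "QISFVKSHFSRQ", k=5)
```

Because a Bloom filter never produces false negatives, a query that is an exact substring of the indexed sequence always scores 1.0; unrelated queries usually score near 0, with occasional false positive k-mers depending on the filter size and the number of hash functions.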