Longest Common Subsequences in Bacteria Taxonomic Classification

2018 
In the 1980s, Carl Woese made a groundbreaking contribution to microbiology by using rRNA genes for phylogenetic classification. He used them not only to explore microbial diversity but also as a method for bacterial annotation. Today, rRNA-based analysis remains a central method in microbiology. Many researchers followed this track, and several generations of artificial neural networks achieved high accuracy on the datasets available at the time. Since then, the number of known bacteria has increased enormously. In this article we use the Longest Common Subsequence similarity measure to classify the bacterial 16S rRNA gene sequences of 1,820,414 bacteria in SILVA, 3,196,038 bacteria in RDP, and 198,509 bacteria in Greengenes. The last two taxonomies have six taxonomic levels (phylum, class, order, family, genus, and species), while SILVA has two additional levels, subclass and suborder, but lacks the species level. The majority of classifications (98%) were of high accuracy (98%).
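
To make the idea of a Longest Common Subsequence similarity concrete, the following is a minimal sketch, not the article's implementation. It assumes a similarity defined as the LCS length divided by the length of the longer sequence, and a simple nearest-reference labelling rule; the names `lcs_length`, `lcs_similarity`, and `classify`, the normalization, and the toy sequences are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (dynamic programming)."""
    if len(b) > len(a):
        a, b = b, a  # keep the DP row as short as the shorter sequence
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a: str, b: str) -> float:
    """Assumed normalization: LCS length over the longer sequence length."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


def classify(query: str, references: List[Tuple[str, str]]) -> str:
    """Return the label of the reference most similar to the query.

    `references` is a non-empty list of (sequence, label) pairs; this
    nearest-reference rule is an illustrative assumption.
    """
    best_sequence, best_label = max(
        references, key=lambda ref: lcs_similarity(query, ref[0])
    )
    return best_label


# Toy sequences, far shorter than real 16S rRNA genes.
references = [("ACGTACGT", "taxon A"), ("TTTTGGGG", "taxon B")]
label = classify("ACGTACGA", references)
```

The dynamic program runs in time proportional to the product of the two sequence lengths, so comparing a query against millions of reference sequences, as in the databases named above, would need additional engineering that this sketch does not attempt.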