A correlation feature-based method for binning metagenomic sequences

2013 
Metagenomics has opened unprecedented opportunities for microbial research since it was proposed at the end of the last century. Meanwhile, rapid advances in high-throughput sequencing technologies have dramatically promoted metagenomic studies. Sequence analysis plays an important role in metagenomics research. One step of sequence analysis that deserves particular attention is binning, because its accuracy directly affects the precision and efficiency of metagenomics research. The key to improving binning accuracy is to extract sequence features that reflect the intrinsic characteristics of the sequenced fragments. While all current binning methods use the composition features of sequences as characteristics, this paper studies the correlation feature of sequences in depth and proposes a new binning method based on this feature, using machine learning algorithms. Across different classification levels and simulated datasets of different complexity, our method maintains both high accuracy and good stability, outperforming current unsupervised binning algorithms and binning methods that use only the frequency features of trinucleotides or tetranucleotides.
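For orientation, the composition feature that the abstract contrasts against (the frequency of trinucleotides or tetranucleotides) can be sketched as a normalized k-mer frequency vector. This is only an illustration of that baseline feature, not the correlation feature proposed in the paper, whose definition is not given in the abstract. The function name `kmer_frequencies` and the choice to skip windows containing non-ACGT characters are assumptions made for this sketch.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=4):
    """Return the normalized k-mer frequency vector of a DNA sequence.

    The vector has 4**k entries, ordered lexicographically over the
    alphabet ACGT. Windows containing any other character (for example N)
    are ignored; this handling is an assumption of the sketch.
    """
    seq = seq.upper()
    # All possible k-mers in a fixed order, so vectors are comparable.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Normalize only over valid ACGT k-mers.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


if __name__ == "__main__":
    vec = kmer_frequencies("ACGTACGTTTGACA", k=4)
    print(len(vec), sum(vec))
```

Calling the function with `k=3` gives the 64-dimensional trinucleotide vector and `k=4` the 256-dimensional tetranucleotide vector; either could serve as input to a clustering or classification step in a binning pipeline.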