Single-base mismatch profiles for NGS samples

2011 
Within the preprocessing pipeline of a Next Generation Sequencing sample, its set of Single-Base Mismatches is one of the first outcomes, together with the number of correctly aligned reads. The union of these two sets provides a 4x4 matrix (called Single Base Indicator, SBI in what follows) representing a blueprint of the sample and its preprocessing ingredients such as the sequencer, the alignment software, the pipeline parameters. In this note we show that, under the same technological conditions, there is a strong relation between the SBI and the biological nature of the sample. To reach this goal we need to introduce a similarity measure between SBIs: we also show how two measures commonly used in machine learning can be of help in this context.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    23
    References
    0
    Citations
    NaN
    KQI
    []