A Novel Analytical Pipeline for de novo Haplotype Phasing and Amplicon Analysis using SMRT® Sequencing Technology
2014
While identifying individual SNPs has been routine for some time, accurately phasing SNPs and structural variation across a haplotype has remained a challenge. With individual reads of up to 30 kb in length, SMRT® Sequencing technology allows the identification of combinations of mutations such as microdeletions, insertions, and substitutions without any predetermined reference sequence. Long amplicon analysis is a novel protocol that identifies distinct clusters of sequencing reads within a single library and reports their abundance. Graphs generated via hierarchical clustering of individual sequencing reads are used to generate Markov models representing the consensus sequence of each cluster found to be significantly different. Long amplicon analysis is capable of differentiating between underlying sequences that are 99.9% similar, such as haplotypes and pseudogenes. This protocol enabled the identification of structural variation in the MUC5AC gene sequence, despite the presence of a gap in the current genome assembly. Long amplicon analysis allows for the elucidation of complex regions missed by other sequencing technologies, which may contribute to the diagnosis and understanding of otherwise unexplained diseases.
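The general idea of clustering reads, building a per-cluster consensus, and reporting abundance can be illustrated with a toy sketch. This is not the pipeline's implementation: it assumes short, equal-length, pre-aligned reads, uses Hamming distance with average-linkage agglomerative clustering, and substitutes a simple per-column majority vote for the Markov-model consensus described above. All function names, the example sequences, and the `max_dist` threshold are hypothetical.

```python
from collections import Counter


def hamming(a, b):
    """Count mismatching positions between two equal-length reads."""
    return sum(x != y for x, y in zip(a, b))


def cluster_reads(reads, max_dist):
    """Average-linkage agglomerative clustering (toy stand-in).

    Repeatedly merges the two closest clusters until the closest
    pair is farther apart than max_dist (an arbitrary threshold).
    """
    clusters = [[r] for r in reads]

    def linkage(c1, c2):
        # Mean pairwise Hamming distance between two clusters.
        return sum(hamming(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        dist, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if dist > max_dist:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def consensus(cluster):
    """Majority vote per column (assumption: replaces the Markov-model consensus)."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*cluster))


def report(reads, max_dist):
    """Return (consensus, fraction of reads) per cluster, most abundant first."""
    clusters = cluster_reads(reads, max_dist)
    total = len(reads)
    summary = [(consensus(c), len(c) / total) for c in clusters]
    return sorted(summary, key=lambda item: -item[1])


# Hypothetical haplotypes differing at two positions, plus reads with errors.
hap_a = "ACGTACGTACGT"
hap_b = "ACGTTCGTACGA"
reads = [
    hap_a, hap_a, "ACCTACGTACGT", hap_a,   # one read carries a sequencing error
    hap_b, hap_b, "ACGTTCGAACGA",          # one read carries a sequencing error
]

result = report(reads, max_dist=1.5)
for sequence, fraction in result:
    print(f"{sequence}\t{fraction:.2f}")
```

Real long-read data would need alignment-aware distances and an error model that handles insertions and deletions, which this sketch deliberately leaves out.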