Comprehensive Analysis of Indels in Whole-genome Microsatellite Regions and Microsatellite Instability across 21 Cancer Types

2018 
Microsatellites are repeats of 1-6 bp units, and ~10 million microsatellites have been identified across the human genome. Microsatellites are sensitive to DNA mismatch errors and have therefore been used to detect cancers with mismatch repair deficiency. To reveal the mutational landscape of microsatellite repeat regions at the genome level, we analyzed approximately 9.2 million microsatellites in 2,717 whole genomes of pan-cancer samples across 21 tissue types. First, we developed an insertion and deletion caller that takes into consideration the error patterns of different types of microsatellites. Among the 2,717 pan-cancer samples, our analysis identified 31 samples, including colorectal, uterine, and stomach cancers, with a high microsatellite mutation rate (≥ 0.03), which we defined as microsatellite instability (MSI) cancers. Second, we identified 20 highly mutated microsatellites that detect MSI cancers with high sensitivity. Validation in an independent cancer cohort showed that this marker set was effective. Third, we found that replication timing and DNA shape were significantly associated with the mutation rates of the microsatellites. Analysis of germline variation in the microsatellites suggested that the amount of germline variation and the somatic mutation rate were correlated. Finally, analysis of mutations in the mismatch repair genes showed that somatic SNVs and short indels had a larger functional impact than germline mutations and structural variations. Our analysis provides a comprehensive picture of mutations in the microsatellite regions, reveals possible causes of these mutations, and provides a useful marker set for MSI detection.
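The MSI definition above (a microsatellite mutation rate ≥ 0.03) can be illustrated with a minimal Python sketch. It assumes the rate is the fraction of analyzed microsatellites that carry a somatic indel call; the abstract does not spell out the exact formula, so that reading, along with the names `SampleCounts`, `mutation_rate`, and `is_msi`, is hypothetical and not the authors' implementation.

```python
from dataclasses import dataclass

# Cutoff stated in the abstract for calling a sample MSI.
MSI_THRESHOLD = 0.03


@dataclass
class SampleCounts:
    """Per-sample counts (hypothetical structure, for illustration only)."""
    sample_id: str
    mutated: int   # microsatellites with a somatic indel call
    analyzed: int  # microsatellites that could be evaluated in this sample


def mutation_rate(sample: SampleCounts) -> float:
    """Assumed definition: mutated microsatellites / analyzed microsatellites."""
    if sample.analyzed <= 0:
        raise ValueError("sample has no analyzed microsatellites")
    return sample.mutated / sample.analyzed


def is_msi(sample: SampleCounts, threshold: float = MSI_THRESHOLD) -> bool:
    """Classify a sample as MSI when its rate meets or exceeds the threshold."""
    return mutation_rate(sample) >= threshold


if __name__ == "__main__":
    # Made-up counts, not data from the study.
    for s in (SampleCounts("tumor_A", 300_000, 9_200_000),
              SampleCounts("tumor_B", 9_200, 9_200_000)):
        print(s.sample_id, round(mutation_rate(s), 4), is_msi(s))
```

The counts in the usage lines are invented for illustration; only the 0.03 threshold comes from the text.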