Assessment of the Limits of Predictability of Protein and Phosphorylation Levels in Cancer

2020 
Even though cancer is driven by genomic alterations, the functions that cause the disease are largely carried out by proteins. Proteins are also the typical targets of treatment. However, proteomes are harder and more expensive to measure than genomes and transcriptomes, so it would be very valuable to estimate protein levels accurately from other omics data. To catalyze the development of solutions to this problem, and to answer fundamental questions about transcriptional and translational control, we leveraged the power of crowdsourcing via a collaborative competition: the NCI-CPTAC DREAM Proteogenomics Challenge. The best performance for predicting protein and phosphorylation levels was achieved by an ensemble of models whose predictors included the transcript levels of the corresponding genes, interactions between genes, conservation across tumor types and, for phosphorylation prediction, phosphosite proximity. Proteins from metabolic pathways were the best predicted, whereas complex proteins were the least well predicted. However, even the best-performing model achieved only modest performance, suggesting that the levels of many proteins are strongly regulated through translational control and degradation. From the best-performing model, we identified common predictors that are predictive of survival outcome. Our results shed light on the potential application of computational models to large-scale proteogenomic characterization of cancer in order to better understand the mechanisms of signaling dysregulation in the disease.