Integrative proteo-genomic analysis to construct CNA-protein regulatory map in breast and ovarian tumors

2019 
Recent developments in high-throughput proteomic and genomic profiling enable the systematic study of how genome alterations regulate protein activities. In this article, we propose a new statistical method, ProMAP, to systematically characterize the regulatory relationships between proteins and DNA copy number alterations (CNA) in breast and ovarian tumors based on proteogenomic data from the CPTAC-TCGA studies. Because of the dynamic nature of mass spectrometry instruments, proteomic data from labeled mass spectrometry experiments usually have non-ignorable batch effects. Moreover, mass spectrometry based proteomic data often possess high percentages of missing values and non-ignorable missing-data patterns. Thus, in ProMAP we use a linear mixed effects model to account for the batch structure and explicitly incorporate the abundance-dependent missing-data mechanism of proteomic data. In addition, we employ a multivariate regression framework to characterize the multiple-to-multiple regulatory relationships between CNA and proteins. Further, we use proper statistical regularization to facilitate the detection of master genetic regulators, which affect the activities of many proteins and often play important roles in genetic regulatory networks. The improved performance of ProMAP over existing methods was illustrated through extensive simulation studies and real data examples. Applying ProMAP to the CPTAC-TCGA breast and ovarian cancer data sets, we identified many genome regions, including a few novel ones, whose CNA were associated with protein and/or phosphoprotein abundances. For example, in breast tumors, a small region in 8p11.21 was recognized as the second-largest hub in the CNA-phosphoprotein regulatory map, and further investigation of the regulatory targets suggests a potential role of 8p11.21 CNA in perturbing oxygen binding and transport activities in tumor cells. These and other findings from our analyses help to characterize the impacts of CNAs on protein activity landscapes and cast light on the genetic regulation mechanisms underlying these tumors.
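To make the two data issues named above concrete (batch effects and abundance-dependent missingness), the following is a minimal simulation sketch. It is not the ProMAP implementation: the function name `simulate_protein`, the logistic form of the missing-data probability, and all parameter values are assumptions chosen only for illustration.

```python
import numpy as np


def simulate_protein(n_batches=20, per_batch=10, slope=0.5, seed=0):
    """Simulate one protein's log-abundance across samples (illustrative only).

    Assumed model: abundance = slope * CNA + random batch intercept + noise,
    with lower abundances more likely to be missing.
    """
    rng = np.random.default_rng(seed)
    n = n_batches * per_batch
    batch = np.repeat(np.arange(n_batches), per_batch)

    cna = rng.normal(size=n)  # hypothetical CNA value per sample
    batch_effect = rng.normal(scale=1.0, size=n_batches)[batch]  # one intercept per batch
    y = slope * cna + batch_effect + rng.normal(scale=0.5, size=n)

    # Assumed logistic missingness: the lower the abundance, the higher
    # the probability that the value is not observed.
    p_missing = 1.0 / (1.0 + np.exp(2.0 * y))
    observed = rng.random(n) > p_missing
    return cna, batch, y, observed


cna, batch, y, observed = simulate_protein()

true_mean = y.mean()                # mean over all samples (unknown in practice)
observed_mean = y[observed].mean()  # mean over the observed values only

print(f"fraction missing: {1 - observed.mean():.2f}")
print(f"true mean: {true_mean:.2f}, mean of observed values: {observed_mean:.2f}")
```

Under these assumptions the mean of the observed values sits above the mean of all values, because low abundances are preferentially dropped. This is the kind of bias that motivates modeling the missing-data mechanism explicitly rather than discarding or naively imputing missing entries.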