Using Deep Learning to Extrapolate Protein Expression Measurements.

Mitra Barzine, Kārlis Freivalds, James C. Wright, Mārtiņš Opmanis, Darta Rituma, Fatemeh Zamanzad Ghavidel, A J Jarnuczak, Edgars Celms, Karlis Cerans, Inge Jonassen, Lelde Lāce, Juan Antonio Vizcaíno, Jyoti S. Choudhary, Alvis Brazma, Juris Viksna

2020

MOTIVATION: Mass spectrometry (MS) based quantitative proteomics experiments typically assay a subset of up to 60% of the ∼20,000 human protein-coding genes. Computational methods for imputing the missing values from RNA expression data usually allow imputation only for proteins that were measured in at least some of the samples. In silico methods for comprehensively estimating abundances across all proteins are still missing.

RESULTS: We propose a novel deep learning method that extrapolates the protein expression values observed in label-free MS experiments to all proteins, using gene functional annotations and RNA measurements as key predictive attributes. We tested our method on four datasets, including human cell lines and human and mouse tissues. Our method predicts protein expression values with average R² scores between 0.46 and 0.54, which is significantly better than predictions based on correlations with the RNA expression data alone. Moreover, we demonstrate that the derived models can be "transferred" across experiments and species. For instance, the model derived from human tissues gave an R² of 0.51 when applied to mouse tissue data. We conclude that protein abundances generated in label-free MS experiments can be computationally predicted using functional annotation attributes, and that the predictions can be used to highlight aberrant protein abundance values.
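The R-squared scores quoted above are coefficients of determination between predicted and observed protein abundances. The abstract does not say exactly how the authors computed them, so the following is only a minimal sketch of the standard definition (one minus the ratio of residual to total sum of squares). The function name `r2_score` and the toy abundance values are illustrative assumptions, not data or code from the paper.

```python
def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Standard textbook definition; assumed here, since the abstract
    does not specify the exact variant used by the authors.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if len(observed) < 2:
        raise ValueError("need at least two values")

    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: error of the predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical log-scale abundances for four proteins (made-up numbers).
observed = [1.0, 2.0, 3.0, 4.0]
perfect = [1.0, 2.0, 3.0, 4.0]      # predictions that match exactly
mean_only = [2.5, 2.5, 2.5, 2.5]    # always predicting the observed mean

print(r2_score(observed, perfect))
print(r2_score(observed, mean_only))
```

Under this definition a score of 1 means the predictions reproduce the measurements exactly, and a score of 0 means they do no better than predicting the mean abundance for every protein, which gives a rough scale for reading the reported 0.46 to 0.54 range.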
