Comparative Analysis of Network Embeddings for Functional Annotation in Protein Interaction Networks

2020 
One of the major problems in bioinformatics is the computational prediction of functions for the large number of sequenced proteins, which would ease the long and expensive process of wet-lab verification. Protein-protein interaction networks (PINs) are considered one of the richest sources of information for solving this problem. A PIN can be represented as a graph in which the nodes are proteins, labeled with their functions, and the edges are the physical interactions between them. In this paper, embedding vectors are created to represent the nodes of the graph and are then used as input to a classification model. This is a graph node classification problem and, because a protein can have multiple functions, it is also a multi-label problem. The classification model is a linear SVM, and the embeddings are built with four algorithms: HOPE, SDNE, GF, and node2vec. A comparative analysis is then carried out on the results. Because the problem is multi-label, Hamming loss is used as the evaluation metric. Based on the comparative evaluation, recommendations are given for which network embedding to use in specific scenarios.
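To make the evaluation metric concrete, the following is a minimal sketch of Hamming loss for a multi-label setting, written in plain Python. The helper name `hamming_loss`, the toy label matrices, and the choice of three proteins with four candidate functions are illustrative assumptions and do not come from the paper. Each row is one protein and each column is one function, with 1 meaning the protein is annotated with (or predicted to have) that function.

```python
from typing import Sequence


def hamming_loss(y_true: Sequence[Sequence[int]],
                 y_pred: Sequence[Sequence[int]]) -> float:
    """Fraction of (protein, function) label slots predicted incorrectly."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same number of rows")

    wrong = 0
    total = 0
    for true_row, pred_row in zip(y_true, y_pred):
        if len(true_row) != len(pred_row):
            raise ValueError("each row pair must have the same number of labels")
        # Count label slots where prediction and ground truth disagree.
        wrong += sum(t != p for t, p in zip(true_row, pred_row))
        total += len(true_row)

    if total == 0:
        raise ValueError("no labels to evaluate")
    return wrong / total


# Hypothetical toy data: 3 proteins, 4 candidate functions (not from the paper).
y_true = [
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 1],
]
y_pred = [
    [1, 0, 0, 0],  # misses the third function
    [0, 1, 0, 0],  # fully correct
    [1, 0, 0, 1],  # misses the second function
]

loss = hamming_loss(y_true, y_pred)
print(f"Hamming loss: {loss:.3f}")  # 2 of 12 label slots differ
```

Unlike exact-match accuracy, which would score the first and third proteins as entirely wrong, Hamming loss gives partial credit for every correctly predicted label, which is why it suits proteins with several functions. Lower values are better, and 0 means every label slot was predicted correctly.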