Enhance Link Prediction in Online Social Networks Using Similarity Metrics, Sampling, and Classification

2018 
Link prediction in an online social network aims to identify new interactions among its members that are likely to arise in the near future. Previous research approached the prediction task by first calculating similarity scores between nodes in the link graph and then predicting new links by applying a supervised method to those scores. However, real-world network data are often sparse and imbalanced, which makes new links difficult to predict, so the selection of an appropriate classification method is an important matter. First, this paper proposes several extended metrics to calculate the similarity scores between nodes. Then, we design a new sampling method to build the training and testing data from the data created by the extended metrics. Lastly, we assess some well-known classification methods, namely J48, Weighted SVM, Gboost, Naive Bayes, Random Forest, Logistic Regression, and XGBoost, in order to choose the best method and the corresponding settings for the link prediction problem. A number of open directions for the problem are also suggested.
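As a rough illustration of the similarity-scoring step mentioned in the abstract, the following minimal sketch computes three standard neighborhood-based scores (common neighbors, Jaccard, Adamic-Adar) for every unlinked node pair in a small toy graph. These are the classical baseline metrics, not the extended metrics proposed in the paper; the graph, the function names, and the feature layout are all assumptions made for illustration.

```python
import math
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def similarity_scores(adj, u, v):
    """Return classical neighborhood-based similarity scores for the pair (u, v)."""
    common = adj[u] & adj[v]
    union = adj[u] | adj[v]
    return {
        "common_neighbors": len(common),
        "jaccard": len(common) / len(union) if union else 0.0,
        # Adamic-Adar: shared neighbors with few links count more.
        # Neighbors of degree 1 are skipped to avoid dividing by log(1) = 0.
        "adamic_adar": sum(
            1.0 / math.log(len(adj[z])) for z in common if len(adj[z]) > 1
        ),
    }


def candidate_pairs(adj):
    """List all node pairs that are not yet linked (the prediction candidates)."""
    nodes = sorted(adj)
    return [(u, v) for u, v in combinations(nodes, 2) if v not in adj[u]]


# Hypothetical toy network, used only to show the shape of the output.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
adj = build_adjacency(edges)
features = {pair: similarity_scores(adj, *pair) for pair in candidate_pairs(adj)}
```

Each entry of `features` could serve as one row of a feature table for a supervised classifier, labeled by whether the link appears later. In a real network the unlinked pairs vastly outnumber the pairs that eventually link, which is the imbalance that the sampling step is meant to address.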