Multi-target QSAR modelling of chemo-genomic data analysis based on Extreme Learning Machine

2019 
Abstract This paper presents a new Quantitative Structure-Activity Relationship (QSAR) model based on the Extreme Learning Machine (ELM) to predict the biological activity of the benchmark Escape-Data sets compounds, providing an effective learning solution for regression analysis. In the pre-processing phase, the k-Nearest Neighbours (k-NN) algorithm is used to predict missing values in the chemo-genomics dataset. In the second phase, a Genetic algorithm hybridized with the Binary Whale Optimization algorithm (GBWOA) is adapted to determine the significant, optimized features in the feature selection phase. In the third phase, the min–max method is used to transform all features to binary form in order to increase the efficiency of the proposed model by smoothing the data points and reducing fluctuation among features. In the final phase, ELM is used as the regression algorithm to predict the biological activity of chemo-genomics chemical compounds. The experiments were performed on a dataset collected from the ExCAPE chemo-genomics database project, composed of 43509 compounds and 1134 targets, along with biological activity and 40 chemical descriptors. The experimental results show that the proposed model improves the level of prediction according to several statistical measurements. ELM also produced satisfactory results when the number of hidden nodes is greater than or equal to 1000 L. Moreover, the proposed model achieved a high accuracy on the R² measure (≈ 0.971), outperforming other algorithms in the literature, namely WOA, ALO, BAT and CSA, with accuracies of ≈ 0.673, ≈ 0.753, ≈ 0.680 and ≈ 0.897, respectively. In addition, the docking results validated the current QSAR model. In the current research, 41686 (95.81%) compounds are lead compounds and 36965 (84.95%) compounds are candidates for multi-target genes.
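The min–max and ELM regression phases described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the k-NN imputation and GBWOA feature selection phases are omitted, the data are synthetic rather than taken from ExCAPE, min–max scaling is shown as rescaling each feature to the range [0, 1], and the names `min_max_scale`, `ELMRegressor` and `r2_score` are hypothetical. The ELM shown is the common formulation with random, fixed hidden-layer weights and output weights solved by a pseudo-inverse.

```python
import numpy as np


def min_max_scale(X):
    """Rescale every column of X to the range [0, 1]."""
    lo = X.min(axis=0)
    hi = X.max(axis=0)
    # Guard against constant columns to avoid division by zero.
    span = np.where(hi > lo, hi - lo, 1.0)
    return (X - lo) / span


class ELMRegressor:
    """Single-hidden-layer ELM: random hidden weights, least-squares output."""

    def __init__(self, n_hidden=1000, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activation of the random projection.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        # Hidden-layer parameters are drawn once and never trained.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # Output weights from the Moore-Penrose pseudo-inverse of H.
        self.beta = np.linalg.pinv(H) @ y
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta


def r2_score(y_true, y_pred):
    """Coefficient of determination R^2."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in data: 200 "compounds" with 40 descriptors each.
rng = np.random.default_rng(42)
X_raw = rng.normal(loc=5.0, scale=3.0, size=(200, 40))
y = np.sin(X_raw[:, 0]) + 0.5 * X_raw[:, 1]

X = min_max_scale(X_raw)
model = ELMRegressor(n_hidden=300).fit(X, y)
train_r2 = r2_score(y, model.predict(X))
print(f"training R^2: {train_r2:.3f}")
```

The training R² printed here only shows that the sketch runs end to end; with more hidden nodes than samples the model can fit the training data almost exactly, so a held-out split would be needed for any meaningful comparison with the figures reported in the abstract.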