Classification of Query Graph Using Maximum Connected Component

2019 
Graphs have been successfully applied in several applications such as classification of protein molecules or chemical compounds, software bug detection, text classification etc. to model structural relationships in the data. Existing graph classification methods have focused on either graph kernels or frequent subgraph patterns for classification. Graph kernel approaches work with fixed input dimension whereas subgraph feature based methods suffer from exponential pattern search space and high complexity due to graph isomorphism. To overcome these limitations, the proposed approach of classification using maximum connected component (CMCC) builds DFS tree to store only maximum substructures based on DFS canonical code approach of gSpan. Once DFS trees are built for all the classes, the query graph is classified based on its maximum match with input graph set. Advantages of using DFS trees are faster graph matching as compared to graph isomorphism test, less storage requirements and speed-up in execution time. Experimental analysis on five real-world datasets shows that the proposed method CMCC significantly enhances the classification accuracy as compared to other established methods on most of the datasets.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    23
    References
    0
    Citations
    NaN
    KQI
    []