Regularizing conjunctive features for classification

2021 
Abstract We consider the feature-generation task wherein we are given a database with entities labeled as positive and negative examples, and we want to find feature queries that linearly separate the two sets of examples. We focus on conjunctive feature queries, and explore two problems: (a) deciding if separating feature queries exist (separability), and (b) generating such queries when they exist. To restrict the complexity of the generated classifiers, we explore various ways of regularizing them by limiting their dimension, the number of joins in feature queries, and their generalized hypertreewidth (ghw). We show that the separability problem is tractable for bounded ghw; yet, the generation problem is not because feature queries might be too large. So, we explore a third problem: classifying new entities without necessarily generating the feature queries. Interestingly, in the case of bounded ghw we can efficiently classify without explicitly generating such queries.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    49
    References
    0
    Citations
    NaN
    KQI
    []