Discrete Stochastic Search and Its Application to Feature-Selection for Deep Relational Machines
2019
We use a model for discrete stochastic search in which one or more objects (“targets”) are to be found by a search over n locations (“boxes”), where n is infinitely large. Each box has some probability that it contains a target, resulting in a distribution H over boxes. We model the search for the targets as a stochastic procedure that draws boxes using some distribution S. We derive first a general expression on the expected number of misses \(\text {E}[Z]\) made by the search procedure in terms of H and S. We then obtain an expression for an optimal distribution \(S^{*}\) to minimise \(\text {E}[Z]\). This results in a relation between: the entropy of H and the KL-divergence between H and \(S^{*}\). This result induces a 2-partitions over the boxes consisting of those boxes with H probability greater than \(\frac{1}{n}\) and the rest. We use this result to devise a stochastic search procedure for the practical situation when H is unknown. We present results from simulations that agree with theoretical predictions; and demonstrate that the expected misses by the optimal seeker decreases as the entropy of H decreases, with a maximum obtained for uniform H. Finally, we demonstrate applications of this stochastic search procedure with a coarse assumption about H. The theoretical results and the procedure are applicable to stochastic search over any aspect of machine learning that involves a discrete search-space: for example, choice over features, structures or discretized parameter-selection. In this work, the procedure is used to select features for Deep Relational Machines (DRMs) which are Deep Neural Networks (DNNs) defined in terms of domain-specific knowledge and built with features selected from large, potentially infinite-attribute space. Empirical results obtained across over 70 real-world datasets show that using the stochastic search procedure results in significantly better performances than the state-of-the-art.
Keywords:
- Correction
- Source
- Cite
- Save
- Machine Reading By IdeaReader
16
References
11
Citations
NaN
KQI