Query learning is one of the most important active learning models that have been studied in the literature. In this thesis, we study two types of queries, namely, edge-detecting queries and value-injecting queries, both of which are motivated by biological applications.
Motivated by PCR experiments used in a genome sequencing problem, edge-detecting queries allow the learner to query whether a set of vertices induces an edge or not. We present algorithms that learn general graphs and uniform hypergraphs with queries linear in the number of edges. The algorithms are optimal in terms of their dependence on the number of edges. In both problems, we present algorithms in which queries can be made in a small number of rounds. We also give matching upper and lower bounds for the problem of learning almost uniform hypergraphs.
Value-injection queries are motivated by experiments that use selective gene disruption and expression to identify gene regulatory networks. We propose a new model of exact learning of acyclic circuits with value-injection queries. We prove lower bounds and show that a large class of circuits including NC1 and AC0 are learnable in the new model.
Abstract As transformers have gained prominence in natural language processing, some researchers have investigated theoretically what problems they can and cannot solve, by treating problems as formal languages. Exploring such questions can help clarify the power of transformers relative to other models of computation, their fundamental capabilities and limits, and the impact of architectural choices. Work in this subarea has made considerable progress in recent years. Here, we undertake a comprehensive survey of this work, documenting the diverse assumptions that underlie different results and providing a unified framework for harmonizing seemingly contradictory findings.
An algorithm for learning the class of Boolean formulas that are expressible as conjunctions of Horn clauses is presented. (A Horn clause is a disjunction of literals, all but at most one of which is a negated variable). The algorithm uses equivalence queries and membership queries to produce a formula that is logically equivalent to the unknown formula to be learned. The amount of time used by the algorithm is polynomial in the number of variables and the number of clauses in the unknown formula.< >