Anirudh Goyal

Yale University

Author Statistics

Papers

Citation

H-Index

i-10 index

Research Trends

Author Order

Document Type

Co-Authors

Yoshua Bengio

Université de Montréal

104

Nan Rosemary Ke

DeepMind (United Kingdom)

Michael C. Mozer

Google (United States)

Alex Lamb

Microsoft Research (United Kingdom)

Bernhard Schölkopf

Max Planck Institute for Intelligent Systems

Aniket Didolkar

Manipal Academy of Higher Education

Sergey Levine

University of California, Berkeley

Hugo Larochelle

Google (Canada)

Stefan Bauer

Technical University of Munich

Jonathan Binas

University of Zurich

Cooperative Institutions

Google (United States)

422

DeepMind (United Kingdom)

182

Google (United Kingdom)

Stanford University

Vanderbilt University Medical Center

Carnegie Mellon University

Baystate Medical Center

Intermountain Medical Center

Université de Montréal

University of California, Berkeley

Author Statistics

Papers

Citation

H-Index

i-10 index

Research Field

Learning Neural Causal Models with Active Interventions

arXiv (Cornell University) (2021)

Nino Scherrer Olexa Bilaniuk Yashas Annadani Anirudh Goyal Patrick Schwab

Discovering causal structures from data is a challenging inference problem of fundamental importance in all areas of science. The appealing scaling properties of neural networks have recently led to a surge of interest in differentiable neural network-based methods for learning causal structures from data. So far differentiable causal discovery has focused on static datasets of observational or interventional origin. In this work, we introduce an active intervention-targeting mechanism which enables a quick identification of the underlying causal structure of the data-generating process. Our method significantly reduces the required number of interactions compared with random intervention targeting and is applicable for both discrete and continuous optimization formulations of learning the underlying directed acyclic graph (DAG) from data. We examine the proposed method across a wide range of settings and demonstrate superior performance on multiple benchmarks from simulated to real-world data.

Causal structure

Identification

Causal model

Source

Cite

Citations (1)

Leveraging Communication Topologies Between Learning Agents in Deep Reinforcement Learning

arXiv (Cornell University) (2019)

Dhaval Adjodah Dan Calacci Abhimanyu Dubey Anirudh Goyal Peter Krafft

A common technique to improve learning performance in deep reinforcement learning (DRL) and many other machine learning algorithms is to run multiple learning agents in parallel. A neglected component in the development of these algorithms has been how best to arrange the learning agents involved to improve distributed search. Here we draw upon results from the networked optimization literatures suggesting that arranging learning agents in communication networks other than fully connected topologies (the implicit way agents are commonly arranged in) can improve learning. We explore the relative performance of four popular families of graphs and observe that one such family (Erdos-Renyi random graphs) empirically outperforms the de facto fully-connected communication topology across several DRL benchmark tasks. Additionally, we observe that 1000 learning agents arranged in an Erdos-Renyi graph can perform as well as 3000 agents arranged in the standard fully-connected topology, showing the large learning improvement possible when carefully designing the topology over which agents communicate. We complement these empirical results with a theoretical investigation of why our alternate topologies perform better. Overall, our work suggests that distributed machine learning algorithms could be made more effective if the communication topology between learning agents was optimized.

Benchmark (surveying)

10.48550/arxiv.1902.06740

Cite

Citations (1)

Learning Dynamics Model in Reinforcement Learning by Incorporating the Long Term Future.

arXiv (Cornell University) (2019)

Nan Rosemary Ke Amanpreet Singh Abdelaziz Touati Anirudh Goyal Yoshua Bengio

In model-based reinforcement learning, the agent interleaves between model learning and planning. These two components are inextricably intertwined. If the model is not able to provide sensible long-term prediction, the executed planner would exploit model flaws, which can yield catastrophic failures. This paper focuses on building a model that reasons about the long-term future and demonstrates how to use this for efficient planning and exploration. To this end, we build a latent-variable autoregressive model by leveraging recent ideas in variational inference. We argue that forcing latent variables to carry future information through an auxiliary task substantially improves long-term predictions. Moreover, by planning in the latent space, the planner's solution is ensured to be within regions where the model is valid. An exploration strategy can be devised by searching for unlikely trajectories under the model. Our method achieves higher reward faster compared to baselines on a variety of tasks and environments in both the imitation learning and model-based reinforcement learning settings.

Planner

Forcing (mathematics)

Source

Cite

Citations (16)

Generalization of Equilibrium Propagation to Vector Field Dynamics

arXiv (Cornell University) (2018)

Benjamin Scellier Anirudh Goyal Jonathan Binas Thomas Mesnard Yoshua Bengio

The biological plausibility of the backpropagation algorithm has long been doubted by neuroscientists. Two major reasons are that neurons would need to send two different types of signal in the forward and backward phases, and that pairs of neurons would need to communicate through symmetric bidirectional connections. We present a simple two-phase learning procedure for fixed point recurrent networks that addresses both these issues. In our model, neurons perform leaky integration and synaptic weights are updated through a local mechanism. Our learning method generalizes Equilibrium Propagation to vector field dynamics, relaxing the requirement of an energy function. As a consequence of this generalization, the algorithm does not compute the true gradient of the objective function, but rather approximates it at a precision which is proven to be directly related to the degree of symmetry of the feedforward and feedback weights. We show experimentally that our algorithm optimizes the objective function.

Backpropagation

Learning rule

Degree (music)

10.48550/arxiv.1808.04873

Cite

Citations (14)

ACtuAL: Actor-Critic Under Adversarial Learning

arXiv (Cornell University) (2017)

Anirudh Goyal Nan Rosemary Ke Alex Lamb R Devon Hjelm Chris Pal

Generative Adversarial Networks (GANs) are a powerful framework for deep generative modeling. Posed as a two-player minimax problem, GANs are typically trained end-to-end on real-valued data and can be used to train a generator of high-dimensional and realistic images. However, a major limitation of GANs is that training relies on passing gradients from the discriminator through the generator via back-propagation. This makes it fundamentally difficult to train GANs with discrete data, as generation in this case typically involves a non-differentiable function. These difficulties extend to the reinforcement learning setting when the action space is composed of discrete decisions. We address these issues by reframing the GAN framework so that the generator is no longer trained using gradients through the discriminator, but is instead trained using a learned critic in the actor-critic framework with a Temporal Difference (TD) objective. This is a natural fit for sequence modeling and we use it to achieve improvements on language modeling tasks over the standard Teacher-Forcing methods.

Discriminator

Sequence (biology)

Forcing (mathematics)

10.48550/arxiv.1711.04755

Cite

Citations (8)

Sparse Attentive Backtracking: Temporal CreditAssignment Through Reminding

arXiv (Cornell University) (2018)

Nan Rosemary Ke Anirudh Goyal Olexa Bilaniuk Jonathan Binas Michael C. Mozer

Learning long-term dependencies in extended temporal sequences requires credit assignment to events far back in the past. The most common method for training recurrent neural networks, back-propagation through time (BPTT), requires credit information to be propagated backwards through every single step of the forward computation, potentially over thousands or millions of time steps. This becomes computationally expensive or even infeasible when used with long sequences. Importantly, biological brains are unlikely to perform such detailed reverse replay over very long sequences of internal states (consider days, months, or years.) However, humans are often reminded of past memories or mental states which are associated with the current mental state. We consider the hypothesis that such memory associations between past and present could be used for credit assignment through arbitrarily long sequences, propagating the credit assigned to the current state to the associated past state. Based on this principle, we study a novel algorithm which only back-propagates through a few of these temporal skip connections, realized by a learned attention mechanism that associates current states with relevant past states. We demonstrate in experiments that our method matches or outperforms regular BPTT and truncated BPTT in tasks involving particularly long-term dependencies, but without requiring the biologically implausible backward replay through the whole history of states. Additionally, we demonstrate that the proposed method transfers to longer sequences significantly better than LSTMs trained with BPTT and LSTMs trained with full self-attention.

Backtracking

10.48550/arxiv.1809.03702

Cite

Citations (28)

Massive right atrial thrombus

European Heart Journal - Case Reports (2019)

Willy Roque Eman Rashed Anirudh Goyal James Maher

10.1093/ehjcr/ytz148

Cite

Citations (0)

Robust Representation Learning via Perceptual Similarity Metrics

arXiv (Cornell University) (2021)

Saeid Asgari Taghanaki Kristy Choi Amir Hosein Khasahmadi Anirudh Goyal

A fundamental challenge in artificial intelligence is learning useful representations of data that yield good performance on a downstream task, without overfitting to spurious input features. Extracting such task-relevant predictive information is particularly difficult for real-world datasets. In this work, we propose Contrastive Input Morphing (CIM), a representation learning framework that learns input-space transformations of the data to mitigate the effect of irrelevant input features on downstream performance. Our method leverages a perceptual similarity metric via a triplet loss to ensure that the transformation preserves task-relevant information.Empirically, we demonstrate the efficacy of our approach on tasks which typically suffer from the presence of spurious correlations: classification with nuisance information, out-of-distribution generalization, and preservation of subgroup accuracies. We additionally show that CIM is complementary to other mutual information-based representation learning techniques, and demonstrate that it improves the performance of variational information bottleneck (VIB) when used together.

Overfitting

Spurious relationship

Representation

Feature Learning

Similarity (geometry)

Information bottleneck method

Feature (linguistics)

10.48550/arxiv.2106.06620

Cite

Citations (2)

Neural Function Modules with Sparse Arguments: A Dynamic Approach to Integrating Information across Layers

arXiv (Cornell University) (2020)

Alex Lamb Anirudh Goyal Agnieszka Słowik Michael C. Mozer Philippe Beaudoin

Feed-forward neural networks consist of a sequence of layers, in which each layer performs some processing on the information from the previous layer. A downside to this approach is that each layer (or module, as multiple modules can operate in parallel) is tasked with processing the entire hidden state, rather than a particular part of the state which is most relevant for that module. Methods which only operate on a small number of input variables are an essential part of most programming languages, and they allow for improved modularity and code re-usability. Our proposed method, Neural Function Modules (NFM), aims to introduce the same structural capability into deep learning. Most of the work in the context of feed-forward networks combining top-down and bottom-up feedback is limited to classification problems. The key contribution of our work is to combine attention, sparsity, top-down and bottom-up feedback, in a flexible algorithm which, as we show, improves the results in standard classification, out-of-domain generalization, generative modeling, and learning representations in the context of reinforcement learning.

Modularity

Code (set theory)

10.48550/arxiv.2010.08012

Cite

Citations (0)

Learning to Combine Top-Down and Bottom-Up Signals in Recurrent Neural Networks with Attention over Modules

Sarthak Mittal Alex Lamb Anirudh Goyal Vikram Voleti Murray Shanahan

Robust perception relies on both bottom-up and top-down signals. Bottom-up signals consist of what's directly observed through sensation. Top-down signals consist of beliefs and expectations based on past experience and short-term memory, such as how the phrase `peanut butter and~...' will be completed. The optimal combination of bottom-up and top-down information remains an open question, but the manner of combination must be dynamic and both context and task dependent. To effectively utilize the wealth of potential top-down information available, and to prevent the cacophony of intermixed signals in a bidirectional architecture, mechanisms are needed to restrict information flow. We explore deep recurrent neural net architectures in which bottom-up and top-down signals are dynamically combined using attention. Modularity of the architecture further restricts the sharing and communication of information. Together, attention and modularity direct information flow, which leads to reliable performance improvements in perceptual and language tasks, and in particular improves robustness to distractions and noisy data. We demonstrate on a variety of benchmarks in language modeling, sequential image classification, video prediction and reinforcement learning that the \emph{bidirectional} information flow can improve results over strong baselines.

Robustness

Modularity

Information flow

Cite

Citations (15)